Blog

Archive for the ‘Uncategorized’ Category

SMP: Solving The Lock Problem Part 4

Posted on: April 25th, 2012 by stephenbroeker No Comments

My last three blog posts have presented the SMP (Symmetric Multi Processor scaling problem, Performance Solution as presented by Pyramid Technology, and the user space SMP problem. Now, I’d like to delve into the locks themselves.  What actually is a lock?

There are essentually two types of SMP locks: mutexes and semaphores.

SMP Mutex Lock

A mutex (short for mutual exclusion), is a lock that waits for a field to be released before proceeding.  In its simplist incantation, a mutex constantly checks the field in question.  This type of lock thus uses the following alorithm:

mutex (void *addr)  {
while (test_and_set(addr) == 1) ;
}

For this algorithm to work, there must be an atomic operation: test_and_set(void *ptr).  This operation writes a 1 into the given word address.  If the value was previously 0, then test_and_set() returns 0.  Otherwise this operation returns 1.

It is important to note that a mutex does not voluntarily give up the cpu.  In the kernel, this means never.  In user space, the schedular can run another process if it has higher priority.

SMP Semaphore Lock

A semaphore is a lock in which the cpu is released if the field is not free.  In its simplist incantation, a semaphore uses a mutex to try to set the lock.  If the lock is not set, then the semaphore gives up the cpu and tries later.  This type of lock thus uses the following alorithm:

semaphore (void *addr)
{
    while (test_and_set(addr) == 1)
        sleep(0);
}

The call to sleep with a seconds value of 0, simply results in a call to sched().  And sched() will preempt the semaphore if there is another process ready to run.

So why is mutex used at all?  Why not always use a semaphore?  The reason is that the kernel has to use a mutex, it cannot use a semaphore, since there is only a single copy of the kernel that can run on a given cpu.  But in user mode, semaphores allow other processes to run, thus improving system efficiency.

Hello world!

Posted on: April 20th, 2012 by admin 1 Comment

Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!

SMP: Symmetric Multi-Processor User Processes (Part 3)

Posted on: April 18th, 2012 by stephenbroeker No Comments

Continuing on the information presented in Part 1:  The Symmetric Multi Processor (SMP) scaling problem and Part 2: The Performance Solution as presented by Pyramid Technology, my comments were addressed at the kernel.  But what about user processes?  Do they also need to use locking?

The answer is yes, if multiple processes share the same physical memory.  This can occur with Shared Memory. User space proccesses use Shared Memory so that proccesses can easily and quickly share data.  Shared Memory is the fastest Inter Process Communication (IPC) mechanism that can be employed.  But it does have a cost (alas nothing is free): lack of synchronization.

If a single process is writing to shared memory, and a single process is reading the same shared memory, then locking is not needed.

If a single process is writing to shared memory, and multiple processes are reading the shared memory, then locking may or may not be required.  If the shared memory is used as a resource allocator, then locks will be required.  On the other hand, if the shared memory is used to post events, then locks might not be required.

Multiple Processes, Locking Required

If SMP multiple processes are writing to shared memory, then locking will be required.

User space Shared Memory is thus like the kernel.  Each user process can be thought of as a seperate kernel running on different cpu.  Multiple instances of a user process can run on a single cpu.  If they use Shared Memory, they will then all use the same physical memory, just like the kernel.  It turns out that using user processes in this manner is a great way to simulate a kernel SMP environment.

A true SMP system requires hardware memory support, called cache coherency.  A cache coheriancy problem occurs when a cpu writes to physical memory and does not update or invalidate the caches on the other cpus in the system.

Next up I’ll delve more into locks.

SMP: Symmetric Multi-Processor Explained (Part 1)

Posted on: April 4th, 2012 by stephenbroeker No Comments

As a software engineer at Pyramid Technology, Symmetric Multi-Processors (SMPs) served as a technology backbone. An SMP system essentually consists of multiple cpus that all use the same physical memory.

Each cpu ran some flavor of the Unix operating system.  In Unix, user spaces processes have seperate physical address spaces, so user processes are not a problem in SMP.  But the kernel maps to hard-wired physical addresses; thus multiple cpus, running the same kernel in SMP, is a problem.

SMP Locks to Avoid Corruption

A problem occurs when data structures are manipulated or changed.  If the same code or related code (on different cpus) is changing the same data structure, at the same time, then data corruption can occur. To prevent this data corruption, we protect the data structures with locks.  That is, a process must obtain a read or write lock before accessing a data structure.  Such operations thus synchronize data access.

So how many locks do we add?

AT&T was one of the first companies to create an SMP system: the 3B5000.  This system consisted of multiple cpus running Unix (AT&T invented Unix) and a single kernel lock.  This single lock was used to make sure that one and only one process was in the kernel at the same time.

If a process was in user mode, then performance was wonderful.  On the other hand, if a process needed to be in kernel mode, then performance could be terrible.  Adding additional cpus did not improve the problem since only a single process could be in the kernel. Thus this lock did not scale.  That is, adding X number of cpus did not result in a linear increase in system performance, since all processes need the kernel at some point.

In this blog entry, I have described the SMP problem.  My next blog post will describe the solution.