BitCtrl Systems GmbH
QNX® 6 - System Architecture
Chapter 3: SMP with Neutrino
Table of Contents
Introduction
Booting an SMP system
How the SMP microkernel works
Scheduling
Hard processor affinity
Kernel locking
Critical sections
Introduction

SMP (Symmetrical Multi-Processing) is typically associated with high-end operating systems such as UNIX and NT running on high-end servers. These large monolithic systems tend to be quite complex, the result of many man-years of development. Since these large kernels contain the bulk of all OS services, the changes to support SMP are extensive, usually requiring large numbers of modifications and the use of specialized spinlocks throughout the code.

Neutrino, on the other hand, contains a very small microkernel surrounded by processes that act as resource managers, providing services such as filesystems, character I/O, and networking. By modifying the microkernel alone, all other OS services gain the full advantage of SMP without any coding changes. If these service-providing processes are multi-threaded, their threads will be scheduled among the available processors. Even a single-threaded server benefits from an SMP system, because its thread is scheduled on the available processors alongside other servers and client processes.

As a testament to this microkernel approach, the SMP version of Neutrino adds only a few kilobytes of additional code. This version, called procnto-smp, will boot on any system that conforms to the Intel MultiProcessor Specification (MP Spec) with up to 8 Pentium or Pentium Pro processors. The procnto-smp manager will also function on a single-processor (non-SMP) system. With the cost of building a dual-processor Pentium motherboard very nearly the same as that of a single-processor motherboard, it's possible to deliver cost-effective solutions that can be scaled in the field by the simple addition of a second CPU. The fact that the OS itself is only a few kilobytes larger also allows SMP to be seriously considered for small CPU-intensive embedded systems, not just high-end servers.

Booting an SMP system

The Neutrino microkernel itself contains very little hardware- or system-specific code. The code that determines the capabilities of the system is isolated in a startup program, which is responsible for initializing the system, determining available memory, etc. Information gathered is placed into a memory table available to the microkernel and to all processes (on a read-only basis).

The startup-bios program is designed to work on systems compatible with the Intel MP Spec (version 1.4 or later). This startup program is responsible for:

  • determining the number of processors
  • determining the address of the local and I/O APIC
  • initializing each additional processor

After reset, only one processor will be executing the reset code. This processor is called the boot processor (BP). For each additional processor found, the BP running the startup-bios code will:

  • initialize the processor
  • switch it to 32-bit protected mode
  • allocate the processor its own page directory
  • set the processor spinning with interrupts disabled, waiting to be released by the kernel.

How the SMP microkernel works

Once the additional processors have been released and are running, all processors are considered peers for the scheduling of threads.

Scheduling

The scheduling algorithm follows the same rules as on a uniprocessor system. That is, the highest-priority threads will be running on the available processors. If a new thread becomes ready to run, it will be dispatched to the processor running the lowest-priority thread.

If more than one processor is selected as a potential target, then the microkernel will try to dispatch the thread to the processor where it last ran. This affinity is used as an attempt to reduce thread migration, which can affect cache performance.
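
The dispatch rule described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not Neutrino's actual scheduler code: the struct cpu_state type and the pick_target_cpu() function are hypothetical names, a larger number is assumed to mean a higher priority, and a newly ready thread is assumed to stay queued when no processor is running a thread of strictly lower priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-processor snapshot; not a Neutrino data structure. */
struct cpu_state {
    int running_priority;   /* priority of the thread currently running here */
};

/*
 * Choose the processor for a newly ready thread.
 * Rule 1: target the processor running the lowest-priority thread.
 * Rule 2: if several processors tie, prefer the one where the thread
 *         last ran (last_cpu), to limit migration and cache misses.
 * Returns the processor index, or -1 if the new thread does not outrank
 * any running thread (assumption: it then waits in the ready queue).
 */
int pick_target_cpu(const struct cpu_state *cpus, size_t ncpus,
                    int new_priority, int last_cpu)
{
    assert(cpus != NULL && ncpus > 0);

    /* Find the lowest priority currently running on any processor. */
    int lowest = cpus[0].running_priority;
    for (size_t i = 1; i < ncpus; i++) {
        if (cpus[i].running_priority < lowest)
            lowest = cpus[i].running_priority;
    }

    if (new_priority <= lowest)
        return -1;                      /* nothing to preempt */

    /* Prefer the processor the thread last ran on, if it is a candidate. */
    if (last_cpu >= 0 && (size_t)last_cpu < ncpus &&
        cpus[last_cpu].running_priority == lowest)
        return last_cpu;

    /* Otherwise take the first candidate. */
    for (size_t i = 0; i < ncpus; i++) {
        if (cpus[i].running_priority == lowest)
            return (int)i;
    }
    return -1;                          /* not reached */
}
```

With four processors running threads at priorities 10, 5, 5, and 20, a new priority-15 thread that last ran on processor 2 would be sent back to processor 2; had it last run on processor 3, it would go to processor 1, the first of the two tied candidates.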

Hard processor affinity

Neutrino also supports the concept of hard processor affinity through the kernel call ThreadCtl(_NTO_TCTL_RUNMASK, runmask). Each set bit in runmask represents a processor that a thread can run on. By default, a thread's runmask is set to all ones, allowing it to run on any processor. A value of 0x01 would allow a thread to execute only on the first processor. By careful use of this primitive, a systems designer can further optimize the runtime performance of a system (e.g. by relegating non-realtime processes to a specific processor). In general, however, this shouldn't be necessary, because Neutrino's realtime scheduler will always preempt a lower-priority thread immediately when a higher-priority thread becomes ready. Processor locking will likely affect only the efficiency of the cache, since threads can be prevented from migrating.
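
As a rough illustration of how a runmask might be built and applied, the following C sketch defines a few helpers. The helper names (runmask_single, runmask_first_n, runmask_allows, apply_runmask) are hypothetical, and a 32-bit mask is assumed for simplicity. The ThreadCtl() call is compiled only when building for Neutrino; on any other system apply_runmask() simply reports failure.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>   /* ThreadCtl(), _NTO_TCTL_RUNMASK */
#endif

/* Build a runmask that allows exactly one processor (0-based index). */
uint32_t runmask_single(unsigned cpu)
{
    assert(cpu < 32);
    return (uint32_t)1 << cpu;
}

/* Build a runmask that allows the first ncpus processors. */
uint32_t runmask_first_n(unsigned ncpus)
{
    assert(ncpus >= 1 && ncpus <= 32);
    return ncpus == 32 ? UINT32_MAX : (((uint32_t)1 << ncpus) - 1);
}

/* Return 1 if the runmask permits the given processor, else 0. */
int runmask_allows(uint32_t runmask, unsigned cpu)
{
    assert(cpu < 32);
    return (int)((runmask >> cpu) & 1u);
}

/*
 * Apply a runmask to the calling thread.
 * Returns 0 on success, -1 on failure or when not built for Neutrino.
 */
int apply_runmask(uint32_t runmask)
{
#ifdef __QNXNTO__
    /* The mask is passed by value in the data argument. */
    return ThreadCtl(_NTO_TCTL_RUNMASK, (void *)(uintptr_t)runmask) == -1 ? -1 : 0;
#else
    (void)runmask;
    return -1;              /* no Neutrino kernel available */
#endif
}
```

For example, apply_runmask(runmask_single(0)) would request the 0x01 mask mentioned above, restricting the calling thread to the first processor.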

Kernel locking

In a uniprocessor system, only one thread is allowed to execute within the microkernel at a time. Most kernel operations are short in duration (typically a few microseconds on a Pentium-class processor). The microkernel is also designed to be completely preemptable and restartable for those operations that take more time. This design keeps the microkernel lean and fast without the need for large numbers of fine-grained locks. Placing many locks in the main code path through a kernel noticeably slows the kernel down, because each lock introduces at least one conditional branch, which can cause processor stalls.

In an SMP system, Neutrino maintains this philosophy of only one thread in a preemptable and restartable kernel. The microkernel may be entered on any processor, but only one processor will be granted access at a time.

For most systems, the time spent in the microkernel represents only a small fraction of the processor's workload. Therefore, while conflicts will occur, they should be more the exception than the norm. This is especially true for a microkernel where traditional OS services like filesystems are separate processes and not part of the kernel itself.

Inter-processor interrupts (IPIs)

The processors communicate with each other through IPIs (inter-processor interrupts). IPIs can effectively schedule and control threads over multiple processors. For example, an IPI to another processor is often needed when:

  • a higher-priority thread becomes ready
  • a thread running on another processor is hit with a signal
  • a thread running on another processor is canceled
  • a thread running on another processor is destroyed.

Here's the small set of IPIs used by Neutrino:

Type                 Description
IPI reschedule       Force a processor to examine the thread it's currently executing. Thread preemption, signals, cancellation, and death are acted upon as needed.
IPI kernel preempt   Preempt a kernel running on another processor.
IPI TLB flush        Flush the TLB (translation look-aside buffer) on the target processor as a result of a change to the MMU memory mapping.

Critical sections

To control access to data structures that are shared between them, threads and processes use the standard POSIX primitives of mutexes, condvars, and semaphores. These work without change in an SMP system.
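
A minimal sketch of the thread-to-thread case, using only standard POSIX calls, might look like the following. The names worker, run_workers, NUM_WORKERS, and INCREMENTS are illustrative choices rather than anything defined by Neutrino. The same code is intended to behave identically whether the threads share one processor or run in parallel on several.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 4
#define INCREMENTS  10000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;         /* protected by counter_lock */

/* Each worker increments the shared counter under the mutex. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;               /* critical section */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Start the workers, wait for them, and return the final count. */
long run_workers(void)
{
    pthread_t tids[NUM_WORKERS];

    shared_counter = 0;
    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(tids[i], NULL);

    return shared_counter;
}
```

Because every increment happens while the mutex is held, run_workers() should return NUM_WORKERS * INCREMENTS regardless of how the threads are spread across processors.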

Many realtime systems also need to protect access to shared data structures between an interrupt handler and the thread that owns the handler. The traditional POSIX primitives used between threads aren't available for use by an interrupt handler. There are two solutions here:

  • One is to remove all work from the interrupt handler and do all the work at thread time instead. Given Neutrino's fast thread scheduling, this is a very viable solution.
  • In a uniprocessor system running Neutrino, an interrupt handler may preempt a thread, but a thread will never preempt an interrupt handler. This allows the thread to protect itself from the interrupt handler by disabling and enabling interrupts for very brief periods of time.

The thread on a non-SMP system protects itself with code of the form:

InterruptDisable()
critical section
InterruptEnable()

or

InterruptMask(intr)
critical section
InterruptUnmask(intr)

Unfortunately, this code will fail on an SMP system since the thread may be running on one processor while the interrupt handler is concurrently running on another processor!

One solution would be to lock the thread to a particular processor (by setting its runmask to a single processor, e.g. 0x01, via the ThreadCtl() function).

A better solution would be to use a new exclusion lock available to both the thread and the interrupt handler. This is provided by the following primitives, which work on both uniprocessor and SMP machines:

InterruptLock(intrspin_t *spinlock)
Disable interrupts, then attempt to acquire spinlock, a variable shared between the interrupt handler and the thread. If the lock is currently held on another processor, the code spins in a tight loop until it is released. The lock must be released as soon as possible (typically after a few lines of C code without any loops).
InterruptUnlock(intrspin_t *spinlock)
Release the lock and reenable interrupts.
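
To make the two steps concrete, here is a portable sketch of the idea using C11 atomics. It is not the Neutrino implementation: sketch_spin_t, sketch_lock(), sketch_unlock(), and the interrupts_enabled flag are hypothetical stand-ins, and the flag only models the per-processor interrupt state that real code would change with privileged instructions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the spinlock variable shared by thread and handler. */
typedef struct {
    atomic_flag locked;
} sketch_spin_t;

/* Models the interrupt-enable state of the processor running this thread. */
static _Thread_local bool interrupts_enabled = true;

/* Step 1: keep the local handler out. Step 2: keep other processors out. */
void sketch_lock(sketch_spin_t *s)
{
    assert(s != NULL);
    interrupts_enabled = false;         /* handler cannot preempt us locally */
    while (atomic_flag_test_and_set_explicit(&s->locked, memory_order_acquire))
        ;                               /* spin until the other side releases */
}

/* Undo both steps in reverse order. */
void sketch_unlock(sketch_spin_t *s)
{
    assert(s != NULL);
    atomic_flag_clear_explicit(&s->locked, memory_order_release);
    interrupts_enabled = true;
}
```

Disabling interrupts alone covers the uniprocessor case; the atomic flag is what additionally excludes a handler or thread running on a different processor. On Neutrino itself, a thread and its handler would instead bracket the shared access with InterruptLock() and InterruptUnlock() on a common spinlock variable.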

On a non-SMP system, there's no need for a spinlock.

© 2011 BitCtrl Systems GmbH