To understand how a program makes use of shared
objects, let's first see the format of an executable and then examine the steps that occur when the
program starts.
ELF format
Neutrino uses the ELF (Executable and Linking
Format) binary format, which is currently used in SVR4 Unix systems. ELF not only simplifies the task
of making shared libraries, but also enhances dynamic loading of modules at runtime.
In the following diagram, we show two views of an
ELF file: the linking view and the execution view. The linking view, which is used when the program
or library is linked, deals with sections within an object file. Sections contain the bulk of the
object file information: data, instructions, relocation information, symbols, debugging information,
etc.
At link time, the program or library is built by
merging together sections with similar attributes into segments. Typically, all the executable and
read-only data sections are combined into a single "text" segment,
while the data and "BSS" sections are combined into the
"data" segment. These segments are called load segments,
because they need to be loaded into memory at process creation. Other sections, such as symbol
information and debugging sections, are merged into other, non-load segments.
Object file format: linking view and execution view.
ELF without COFF
Most implementations of ELF loaders are derived from
COFF (Common Object File Format) loaders; they use the linking view of the ELF objects at load
time. This is inefficient because the program loader must load the executable using sections. A
typical program could contain a large number of sections, each of which would have to be located in
the program and loaded into memory separately.
Neutrino, however, doesn't rely at all on the COFF
technique of loading sections. When developing our ELF implementation, we worked directly from the
ELF spec and kept efficiency paramount. The Neutrino ELF loader uses the "execution view" of the
program. By using the execution view, the task of the loader is greatly simplified: all it has to do
is copy to memory the load segments (usually two) of the program or library. As a result, process
creation and library loading operations are much faster.
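To make the execution view concrete, here is a minimal sketch of how a loader might walk the program header table and pick out the load segments. It isn't Neutrino's loader code. It assumes a Linux-style `<elf.h>` header, a 64-bit ELF image already held in memory, and a file byte order that matches the host; the function name `count_load_segments` is hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Count the PT_LOAD entries in the program header table of an ELF64 image
 * held in memory. Returns -1 if the buffer doesn't look like a usable
 * ELF64 file. Assumes the file's byte order matches the host's. */
int count_load_segments(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* The whole program header table must lie inside the buffer. */
    if (eh.e_phoff > len ||
        (len - eh.e_phoff) / sizeof(Elf64_Phdr) < eh.e_phnum)
        return -1;

    int loads = 0;
    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        memcpy(&ph, image + eh.e_phoff + i * sizeof ph, sizeof ph);
        if (ph.p_type == PT_LOAD)   /* a segment the loader must map */
            loads++;
    }
    return loads;
}
```

The loop touches only the program headers and never looks at the section table, which is why a segment-based loader has so little work to do. For a typical program, this count would be the "usually two" mentioned above.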
The process
The diagram below shows the memory layout of a
typical process. The process load segments (corresponding to "text"
and "data" in the diagram) are loaded at the process's base address.
The main stack is located just below and grows downwards. Any additional threads that are created
will have their own stacks, located below the main stack. Each of the stacks is separated by a guard
page to detect stack overflows. The heap is located above the process and grows upwards.
Process memory layout.
In the middle of the process's address space, a
large region is reserved for shared objects. Shared libraries are located at the top of the address
space and grow downwards.
When a new process is created, the process manager
first maps the two segments from the executable into memory. It then decodes the program's ELF
header. If the program header indicates that the executable was linked against a shared library, the
process manager will extract the name of the dynamic interpreter from the program header. The
dynamic interpreter name identifies a shared library that contains the runtime linker code. The
process manager will load this shared library into memory and will then pass control to the runtime
linker code in this library.
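In ELF terms, the dynamic interpreter's name is stored in a segment of type `PT_INTERP`. The following sketch shows one way to extract it from an in-memory image. It makes the same assumptions as any host-side illustration: a Linux-style `<elf.h>`, a 64-bit ELF file, and host byte order. The function name `find_interpreter` is hypothetical, and this isn't the process manager's actual code.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the interpreter path stored in the PT_INTERP segment
 * of an in-memory ELF64 image, or NULL if there is no such segment (for
 * example, a statically linked program) or the image is malformed. */
const char *find_interpreter(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return NULL;
    if (eh.e_phoff > len ||
        (len - eh.e_phoff) / sizeof(Elf64_Phdr) < eh.e_phnum)
        return NULL;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        memcpy(&ph, image + eh.e_phoff + i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;

        /* The path must lie inside the buffer and be NUL-terminated. */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            ph.p_filesz > len - ph.p_offset)
            return NULL;
        const char *path = (const char *)image + ph.p_offset;
        if (path[ph.p_filesz - 1] != '\0')
            return NULL;
        return path;
    }
    return NULL;
}
```

A NULL result corresponds to the case where the executable wasn't linked against any shared library, so no runtime linker needs to be loaded.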
Runtime linker
The runtime linker is invoked when a program that
was linked against a shared object is started or when a program requests that a shared object be
dynamically loaded. The runtime linker is contained within the C runtime library.
The runtime linker performs several tasks when
loading a shared library (.so file):
- If the requested shared library isn't already loaded in memory,
the runtime linker loads it, first locating it as follows.
- If the shared library name is fully qualified (i.e. begins with a slash),
it's loaded directly from the specified location. If it
can't be found there, no further searches are performed.
- If the name isn't a fully qualified pathname, the runtime
linker searches the directories specified
by LD_LIBRARY_PATH, but only if the program isn't
marked as setuid.
- If the shared library still isn't found, and if the executable's dynamic
section contains a DT_RPATH tag, then the path
specified by DT_RPATH is searched next.
- If the shared library still isn't found, then the runtime linker
searches the default library search path, as specified by
the LD_LIBRARY_PATH environment variable passed to
procnto. If none has been specified, then the
default library path is set to the image filesystem's path.
- Once the requested shared library is found, it's loaded into memory.
For ELF shared libraries, this is a very efficient
operation: the runtime linker simply needs to use the
mmap() call twice to map the two load
segments into memory.
- The shared library is then added to the internal list of all libraries
that the process has loaded.
The runtime linker maintains this list.
- The runtime linker then decodes the dynamic section of
the shared object.
This dynamic section provides information to the
linker about other libraries that this library was linked against. It also gives information about
the relocations that need to be applied and the external symbols that need to be resolved. The
runtime linker will first load any other required shared libraries (which may themselves reference
other shared libraries). It will then process the relocations for each library. Some of these
relocations are local to the library, while others require the runtime linker to resolve a global
symbol. In the latter case, the runtime linker will search through the list of libraries for this
symbol. In ELF files, hash tables are used for the symbol lookup, so lookups are very fast. The order in
which libraries are searched for symbols is very important, as we'll see in the section on "Symbol name resolution" below.
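The text doesn't say which hash function is used. As an illustration, the sketch below shows the classic hash function published in the System V ABI for ELF symbol hash tables; whether Neutrino's runtime linker uses exactly this function is an assumption.

```c
#include <stdint.h>

/* The classic System V ABI hash over a NUL-terminated symbol name. */
uint32_t elf_hash(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;
    uint32_t h = 0;

    while (*p != '\0') {
        h = (h << 4) + *p++;
        uint32_t g = h & 0xf0000000u;   /* top four bits */
        if (g != 0)
            h ^= g >> 24;               /* fold them back into the low bits */
        h &= ~g;                        /* and clear them */
    }
    return h;
}
```

In the System V scheme, a linker would reduce this value modulo the number of buckets in the library's hash table and then follow a short chain of candidate symbols, comparing names until one matches. That keeps each lookup close to constant time regardless of how many symbols the library exports.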
Once all relocations have been applied, any
initialization functions that have been registered in the shared library's init section are called.
This is used in some implementations of C++ to call global constructors.
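The library search order described in the list above can be sketched as follows. This is only an illustration of the ordering, not the runtime linker's code: the function names are hypothetical, every search path is passed in explicitly as a colon-separated list, and empty path entries are simply skipped.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try each directory in a colon-separated list. On success, write the full
 * path to out and return 0. Returns -1 if list is NULL or nothing matches. */
static int search_dirs(const char *list, const char *name,
                       char *out, size_t outlen)
{
    while (list != NULL && *list != '\0') {
        size_t dirlen = strcspn(list, ":");
        int n = snprintf(out, outlen, "%.*s/%s", (int)dirlen, list, name);
        if (dirlen > 0 && n > 0 && (size_t)n < outlen &&
            access(out, F_OK) == 0)
            return 0;
        list += dirlen;
        if (*list == ':')
            list++;
    }
    return -1;
}

/* Apply the search order from the text. Returns 0 and fills out on success,
 * or -1 if the library can't be found. */
int locate_library(const char *name, int is_setuid,
                   const char *ld_library_path, const char *rpath,
                   const char *default_path, char *out, size_t outlen)
{
    if (name[0] == '/') {
        /* Fully qualified: look there and nowhere else. */
        if (strlen(name) >= outlen || access(name, F_OK) != 0)
            return -1;
        strcpy(out, name);
        return 0;
    }
    /* LD_LIBRARY_PATH is ignored for setuid programs. */
    if (!is_setuid && search_dirs(ld_library_path, name, out, outlen) == 0)
        return 0;
    /* Next, the path recorded in the executable's DT_RPATH tag, if any. */
    if (search_dirs(rpath, name, out, outlen) == 0)
        return 0;
    /* Finally, the system-wide default library path. */
    return search_dirs(default_path, name, out, outlen);
}
```

Skipping LD_LIBRARY_PATH for setuid programs matters for security: otherwise an unprivileged user could point a privileged program at a substitute library simply by setting an environment variable.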
Loading a shared library at runtime
A process can load a shared library at runtime by
using the dlopen() call, which instructs the runtime linker to load this library. Once the
library is loaded, the program can call any function within that library by using the dlsym()
call to determine its address.
Remember: shared libraries are available only to processes that are
dynamically linked.
The program can also determine the symbol associated
with a given address by using the dladdr() call. Finally, when the process no longer needs the
shared library, it can call dlclose() to unload the library from memory.
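As a rough illustration of this sequence, the sketch below loads a library, looks up a function by name, calls it, and unloads the library. The helper name `call_int_function` and the assumption that the target function has the signature `int f(int)` are made up for the example. The choice of `RTLD_NOW` is also an assumption, since the text doesn't discuss binding modes. Passing NULL as the library name is standard POSIX behavior and yields a handle for the running program itself.

```c
#include <dlfcn.h>
#include <stdio.h>

typedef int (*int_fn)(int);

/* Load `library` (or, if it is NULL, obtain a handle for the running
 * program), look up `symbol`, call it as int f(int), and release the
 * handle. Returns 0 on success, -1 on failure. */
int call_int_function(const char *library, const char *symbol,
                      int arg, int *result)
{
    void *handle = dlopen(library, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror();                          /* clear any stale error state */
    void *addr = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || addr == NULL) {
        fprintf(stderr, "dlsym: %s\n", err ? err : "symbol resolves to NULL");
        dlclose(handle);
        return -1;
    }

    /* POSIX permits converting a dlsym() result to a function pointer. */
    int_fn fn = (int_fn)addr;
    *result = fn(arg);

    dlclose(handle);                    /* the library may now be unloaded */
    return 0;
}
```

A caller might write `call_int_function("libexample.so", "example_entry", 1, &r)`, where both names are placeholders. Note that the function pointer must not be used after dlclose(), because the library's code may no longer be mapped.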
Symbol name resolution
When the runtime linker loads a shared library, the
symbols within that library have to be resolved. The order and the scope of the symbol resolution are
important. If a shared library calls a function that happens to exist by the same name in several
libraries that the program has loaded, the order in which these libraries are searched for this
symbol is critical. This is why Neutrino defines several options that can be used when loading
libraries.
All the objects (executables and libraries) that
have global scope are stored on an internal list (the global list). Any global-scope object,
by default, makes available all of its symbols to any shared library that gets loaded. The global
list initially contains the executable and any libraries that are loaded at the program's
startup.
By default, when a new shared library is loaded by
using the dlopen() call, symbols within that library are resolved by searching in this order
through:
- The shared library.
- The global list.
- Any dependent objects that the shared library references (i.e. any other libraries that the
shared library was linked against).
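The default order above can be modelled with a small toy program. Nothing here reflects the runtime linker's real data structures: `struct object` and the function names are hypothetical stand-ins, and each "object" is just a name plus the list of symbols it defines.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a loaded object: a name, the symbols it defines,
 * and the objects it was linked against. */
struct object {
    const char *name;
    const char *const *symbols;         /* NULL-terminated */
    const struct object *const *deps;   /* NULL-terminated, or NULL */
};

static int defines(const struct object *obj, const char *sym)
{
    for (const char *const *s = obj->symbols; *s != NULL; s++)
        if (strcmp(*s, sym) == 0)
            return 1;
    return 0;
}

/* Return the first object in a NULL-terminated list that defines sym. */
static const struct object *first_definer(const struct object *const *list,
                                          const char *sym)
{
    for (; list != NULL && *list != NULL; list++)
        if (defines(*list, sym))
            return *list;
    return NULL;
}

/* Default order: the library itself, then the global list, then the
 * objects the library was linked against. */
const struct object *resolve_default(const struct object *lib,
                                     const struct object *const *global_list,
                                     const char *sym)
{
    const struct object *hit;

    if (defines(lib, sym))
        return lib;
    if ((hit = first_definer(global_list, sym)) != NULL)
        return hit;
    return first_definer(lib->deps, sym);
}
```

In this model, if both the executable (on the global list) and one of the library's own dependencies define a function with the same name, the executable's definition wins, because the global list is searched before the dependencies.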
The runtime linker's scoping behavior can be changed
in two ways when dlopen()'ing a shared library:
- When the program loads a new library, it may instruct the
runtime linker to place the library's symbols on the global
list by passing the RTLD_GLOBAL flag to
the dlopen() call.
This will make the library's symbols available to any libraries
that are subsequently loaded.
- The list of objects that are searched when resolving the
symbols within the shared library can be modified.
If the RTLD_GROUP flag is passed to
dlopen(), then only objects that the library
directly references will be searched for symbols.
If the RTLD_WORLD flag is passed, only
the objects on the global list will be searched.
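One way to keep these choices readable in application code is to map them onto a small enumeration, as in the sketch below. The enum and the function name `dlopen_flags` are hypothetical. `RTLD_NOW` is added as an assumed binding mode, and `RTLD_GROUP` and `RTLD_WORLD` are guarded with `#ifdef` on the assumption that they are preprocessor macros, since not every platform's `<dlfcn.h>` provides them.

```c
#include <dlfcn.h>

enum symbol_scope {
    SCOPE_DEFAULT,      /* default search order */
    SCOPE_EXPORT,       /* also put this library's symbols on the global list */
    SCOPE_GROUP_ONLY,   /* search only objects the library directly references */
    SCOPE_WORLD_ONLY    /* search only the global list */
};

/* Translate a scope choice into dlopen() flags. Returns -1 if this
 * platform's <dlfcn.h> doesn't provide the flag that's needed. */
int dlopen_flags(enum symbol_scope scope)
{
    int flags = RTLD_NOW;   /* assumption: resolve all symbols at load time */

    switch (scope) {
    case SCOPE_DEFAULT:
        return flags;
    case SCOPE_EXPORT:
        return flags | RTLD_GLOBAL;
    case SCOPE_GROUP_ONLY:
#ifdef RTLD_GROUP
        return flags | RTLD_GROUP;
#else
        return -1;
#endif
    case SCOPE_WORLD_ONLY:
#ifdef RTLD_WORLD
        return flags | RTLD_WORLD;
#else
        return -1;
#endif
    }
    return -1;
}
```

The result, when it isn't -1, would be passed as the second argument to dlopen(), for example `dlopen("libplugin.so", dlopen_flags(SCOPE_EXPORT))`, where the library name is a placeholder.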