This article discusses shared libraries, in particular a method for
redirecting function calls through shared libraries, which is useful for a
number of purposes. While writing the code, I discovered bugs in a few shared
library implementations; these are discussed as well.
First off, a short description of shared libraries is in order. Shared
libraries are designed to let you share code segments among programs. In this
way, memory usage is reduced significantly. Since code segments generally are
not modified, this sharing scheme works rather well. Obviously, for this to
work, the code segments have to be location independent, or PC independent (IP
independent for the x86 programmers in the audience).
Now, since the telnetd environment variable hole, most of you know there
are several environment variables that can be used to specify alternate shared
libraries. Among them, on most systems, are LD_LIBRARY_PATH and LD_PRELOAD;
this article strictly deals with the latter. Additionally, on Digital UNIX
and Irix, this variable is called _RLD_LIST and has a slightly different
syntax.
Sun's shared libraries came with an API to let users load and call shared
library functions; most other vendors have cloned the interface. Oddly enough,
our code will not work in SunOS, although it will in Solaris2. Anyhow, the
first function to be concerned with is called dlopen(). This function
basically loads the shared library and mmap()s it into memory if it is not
already loaded. The first argument it accepts is a pointer to the filename
to be loaded; the second argument should usually be 1 (although some platforms
seem to support other options). The manpage provides more details. A handle
is returned on success; on failure dlopen() returns NULL, and you can call
dlerror() to find out what went wrong.
Once you have dlopen()ed a library, the next goal is to get the address of one
or more of the symbols that are inside the library. You do this with the
dlsym() function. Unfortunately, this is where things can get nonportable.
On the freely available 4.4BSD machines I tested, dlsym() wants the function
name prefixed with an underscore character. This makes perfect sense to me,
since that is how C stores function names internally. The System V-ish
implementations, which make up the majority of the tested systems, do not use
such a convention. This, unfortunately, means you must use conditional
compilation in order to ensure portability.
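
One way to handle the underscore difference is sketched below. It is only an illustration under assumptions: NEED_USCORE is a hypothetical macro you would define by hand (for example with -DNEED_USCORE) on systems whose dlsym() wants the underscore, and SYM() and find_getpid() are names made up for this example.

```c
#include <dlfcn.h>

/*
 * ASSUMPTION: NEED_USCORE is a hypothetical, hand-defined macro; it is
 * not provided by any system header. Define it on platforms whose
 * dlsym() expects a leading underscore.
 */
#ifdef NEED_USCORE
#define SYM(name) "_" name   /* adjacent string literals are concatenated */
#else
#define SYM(name) name
#endif

/* Example lookup: the same source line works with either convention. */
static void *find_getpid(void *handle)
{
    return dlsym(handle, SYM("getpid"));
}
```

Because SYM() relies on string literal concatenation, it only works with literal names; a name built at run time would need the prefix added by hand.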
A simple example of opening a library, getting a function and calling it is
shown below:
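
A minimal sketch of these steps, under stated assumptions rather than as a definitive listing: call_puts_from() is a hypothetical helper name, the symbol is looked up System V style (no underscore), and RTLD_LAZY is used in place of the literal 1 (glibc, at least, defines it as 1).

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Load `libpath`, look up puts(), and call it with `msg`.
 * Returns the result of puts(), or -1 if loading or lookup fails.
 */
static int call_puts_from(const char *libpath, const char *msg)
{
    void *handle;
    int (*fn)(const char *);
    int rc;

    handle = dlopen(libpath, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* dlsym() returns void *; store it through the function pointer. */
    *(void **)(&fn) = dlsym(handle, "puts");
    if (fn == NULL) {
        fprintf(stderr, "dlsym: puts not found\n");
        dlclose(handle);
        return -1;
    }

    rc = fn(msg);          /* call the function we just looked up */
    dlclose(handle);
    return rc;
}
```

On a Linux (ELF) box you might call it as call_puts_from("libc.so.6", "hello"); the library name is an assumption and differs from platform to platform.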
Okay, now that we understand how to use the programming interface, how do we
do function call redirection? My idea is simple: you preload a library; the
preloaded library does its thing, then dlopen()s the real library, gets the
symbol, and calls it. This seems to work well on Solaris, Linux (ELF), Irix
(5.3 and 6.2), and FreeBSD (see bugs section below); it should also work on
OSF/1, which I have not tested.
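
The redirection idea can be sketched with a harmless wrapper around time(). This is an illustration under assumptions, not a definitive implementation: the choice of time(), the REAL_LIBC name (which is Linux/glibc specific) and the time_calls counter are all made up for the example.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <time.h>

/* ASSUMPTION: name of the real C library; this value is glibc specific. */
#define REAL_LIBC "libc.so.6"

static unsigned long time_calls;   /* number of calls intercepted so far */

/*
 * Same name and signature as the libc function, so when this object is
 * preloaded the dynamic linker resolves time() to this copy first.
 */
time_t time(time_t *out)
{
    static time_t (*real_time)(time_t *);

    /* The preloaded library does its thing first. */
    time_calls++;
    fprintf(stderr, "time() intercepted (call %lu)\n", time_calls);

    /* Then it opens the real library and fetches the real symbol, once. */
    if (real_time == NULL) {
        void *handle = dlopen(REAL_LIBC, RTLD_LAZY);
        if (handle == NULL)
            return (time_t)-1;
        *(void **)(&real_time) = dlsym(handle, "time");
        if (real_time == NULL)
            return (time_t)-1;
    }

    /* Finally, hand the call over to the real function. */
    return real_time(out);
}
```

Built as a shared object and named in LD_PRELOAD, a wrapper like this sits in front of the real function for any dynamically linked program that calls it.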
Compiling shared libraries is a little different on each platform. The
compilation stage is basically the same; it is the linking that actually
differs. For GCC, you make the object with something like:
    gcc -fPIC -c file.c
That will create file.o, object code which is suitable for dynamic linking.
Then you actually have to link it, which is where the fun begins :). Here is
a chart for linking in the various operating systems I have tested this stuff
on.
On IRIX 6.2, there is an additional switch you need to use; it enables
backwards ld compatibility, and the manpage for ld is your guide.
Unfortunately, all is not happy in the world of shared libs since there are
bugs present in some implementations. FreeBSD in particular has a bug: if
you dlsym() something and it is not found, it does not set the error, so
dlerror() will return NULL. OpenBSD is far, far worse (*sigh*). It
initializes the error to a value and does not clear the error when you call
dlerror(), so dlerror() returns non-NULL at all times. Of course, OpenBSD
is incompatible with our methods in other ways too, so it does not really
matter, I guess :). The FreeBSD bug is hacked around by testing return values
for NULL.
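
The return value workaround can be sketched as follows. lookup_symbol() is a hypothetical helper name; the point is only that it never consults dlerror(), so neither of the bugs described above affects it.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Look up `name` in `handle` and report failure through the return
 * value alone. Returns 0 and stores the address in *out on success,
 * -1 otherwise. dlerror() is deliberately not used.
 */
static int lookup_symbol(void *handle, const char *name, void **out)
{
    void *addr = dlsym(handle, name);

    if (addr == NULL)
        return -1;      /* treat a NULL result as "not found" */
    *out = addr;
    return 0;
}
```

This treats a symbol whose value really is NULL as missing, which is acceptable when you are only looking up functions.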
Here is a simple TTY logger shared library example. When you preload it, it
will log the keystrokes when users run any nonprivileged program that uses
shared libraries. It stores the logs in /tmp/UID_OF_USER. Pretty simple stuff.