To support the development of multithreaded applications, the Digital UNIX operating system provides DECthreads, Digital's Multithreading Run-Time Library. The DECthreads interface is Digital UNIX's implementation of IEEE Standard 1003.1c-1995 threads (also referred to as POSIX 1003.1c threads).
In addition to an actual threading interface, the operating system also provides Thread-Independent Services (TIS). The TIS routines are an aid to creating thread-safe libraries (see Section 12.4.1).
This chapter addresses the following topics:
A thread is a single, sequential flow of control within a program. Multiple threads execute concurrently and share most resources of the owning process, including the address space. By default, a process initially has one thread.
The purposes for which multiple threads are useful include:
You can also use multiple threads as an alternative approach to managing certain events. For example, you can use one thread per file descriptor in a process that otherwise might use the select() or poll() system calls to efficiently manage concurrent I/O operations on multiple file descriptors.
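The one-thread-per-descriptor approach described above can be sketched with standard POSIX 1003.1c calls. This is only a minimal illustration under stated assumptions: the names fd_job, drain_fd, and drain_all are hypothetical, and each thread simply blocks in read() on its own descriptor until end of file, where a single-threaded process would multiplex with select() or poll().

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical work item: one descriptor and the bytes read from it. */
struct fd_job {
    int    fd;
    size_t total;   /* bytes read until end of file or error */
};

/* Thread start routine: block in read() on a single descriptor. */
static void *drain_fd(void *arg)
{
    struct fd_job *job = arg;
    char buf[256];
    ssize_t n;

    while ((n = read(job->fd, buf, sizeof buf)) > 0)
        job->total += (size_t)n;
    return NULL;
}

/*
 * Start one thread per descriptor and wait for all started threads.
 * Returns 0 on success or the error number from pthread_create().
 */
int drain_all(struct fd_job *jobs, pthread_t *tids, int count)
{
    int i, started = 0, rc = 0;

    for (i = 0; i < count; i++) {
        jobs[i].total = 0;
        rc = pthread_create(&tids[i], NULL, drain_fd, &jobs[i]);
        if (rc != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return rc;
}
```

A caller fills in the fd member of each fd_job, supplies an array of pthread_t of the same length, and reads the total members after drain_all() returns. Each job is touched by exactly one thread, so no locking is needed in this sketch.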
The components of the multithreaded development environment for the Digital UNIX system include the following:
The -pthread flag on the cc or c89 command.
The libpthread.so library provides interfaces for threads control, buffers an application from lower-level threads implementation, and is selected at application link time.
libm.{a,so}, libsys5_r.a, and libmach.{a,so}.
The ladebug debugger
The prof and gprof profilers - Compile with the -p and -pthread flags for prof and with the -pg and -pthread flags for gprof to use the libprof1_r.a profiling library.
The atom utility (pixie, third, and hiprof tools)
For information on profiling multithreaded applications, see Section 8.14.
For releases of the DEC OSF/1 operating system (that is, for releases prior to Digital UNIX Version 4.0), a large number of separate reentrant routines (*_r routines) were provided to solve the problem of static data in the C run-time library (the first two problems listed in Section 12.3.1). The Digital UNIX operating system fixes the problem of static data in the non-reentrant versions of the routines by replacing the static data with thread-specific data. Except for a few routines specified by POSIX 1003.1c, all of the alternate routines are no longer required and are retained only for binary compatibility.
The following functions are the only alternate thread-safe routines that are specified by POSIX 1003.1c and need to be used when writing thread-safe code:
asctime_r *
ctime_r *
getgrgid_r *
getgrnam_r *
getpwnam_r *
getpwuid_r *
gmtime_r *
localtime_r *
rand_r *
readdir_r *
strtok_r
Starting with Digital UNIX Version 4.0, the interfaces flagged with an asterisk (*) in the preceding list have new definitions that conform to POSIX 1003.1c. The old versions of these routines can be obtained by defining the preprocessor symbol _POSIX_C_SOURCE with the value 199309L (which denotes POSIX 1003.1b conformance). The new versions of the routines are the default when compiling code under Digital UNIX Version 4.0 or later, but you must be certain to include the header files specified on the manpages for the various routines.
For more information on programming with threads, see the Guide to DECthreads and cc(1), monitor(3), prof(1), and gprof(1).
Routines within a library can be thread safe or not. A thread-safe routine is one that can be called concurrently from multiple threads without undesirable interactions between threads. A routine can be thread safe for either of the following reasons:
Reentrant routines do not share any state across concurrent invocations from multiple threads. A reentrant routine is the ideal thread-safe routine, but not all routines can be made to be reentrant.
Prior to Digital UNIX Version 4.0, many of the C run-time library (libc) routines were not thread safe, and alternate versions of these routines were provided in libc_r. Starting with Digital UNIX Version 4.0, all of the alternate versions formerly found in libc_r were merged into libc. If a thread-safe routine and its corresponding nonthread-safe routine had the same name, the nonthread-safe version was replaced. The thread-safe versions are modified to use Thread-Independent Services (TIS) (see Section 12.4.1); this enables them to work in both single- and multithreaded environments without extensive overhead in the single-threaded case.
Some common practices that can prevent code from being thread safe can be found by examining why some of the libc functions were not thread safe prior to Digital UNIX Version 4.0:
The ctime(3) interface provides an example of this problem:
char *ctime(const time_t *timer);
This function takes a single argument and returns a pointer to a statically allocated buffer containing a string that is the ASCII representation of the time specified by that argument. Because a single, statically allocated buffer is used for this purpose, any other thread that calls this function will overwrite the string returned to the previously calling thread.
To make the ctime() function thread safe, the POSIX 1003.1c standard has defined an alternate version, ctime_r(), which accepts an additional argument. The argument is a user-supplied buffer that is allocated by the caller. The ctime_r() function writes the string into that buffer instead of into static storage:
char *ctime_r(const time_t *timer, char *buf);
The users of this function must ensure that the buffer they supply as an argument to this function is not used by another thread.
The rand() function provides an example of this problem:
void srand(unsigned int seed);
int rand(void);
The rand() function is a simple pseudo-random number generator. For any given starting "seed" value that is set with the srand() function, it generates an identical sequence of pseudo-random numbers. To do this, it maintains a state value that is updated on each call. If another thread is also calling this function, the sequence of numbers returned within any one thread for a given starting seed is nondeterministic. This may be undesirable.
To avoid this problem, a second interface, rand_r(), is specified in POSIX 1003.1c. This function accepts an additional argument that is a pointer to a user-supplied integer used by rand_r() to hold the state of the random number generator:
int rand_r(unsigned int *seed);
The users of this function must ensure that the seed argument is not used by another thread. Using thread-specific data or keys is one way of doing this (see Section 12.4.2).
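The following sketch shows one simple way to keep the rand_r() state private: hold it in an automatic variable. The helper name fill_random is hypothetical, and the sketch uses a stack variable in place of the thread-specific data that the text mentions, which is enough when the state does not need to outlive the call.

```c
#include <stdlib.h>

/*
 * Fill out[0..n-1] with pseudo-random values drawn from a
 * generator state that is private to this call.
 */
void fill_random(unsigned int seed, int *out, int n)
{
    unsigned int state = seed;   /* on this thread's stack, not shared */
    int i;

    for (i = 0; i < n; i++)
        out[i] = rand_r(&state);
}
```

Two calls with the same seed produce the same sequence regardless of what other threads are doing, because no other thread can reach the state variable.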
The problem of sharing read/write data can be solved by using mutexes. In this case, the routine is not considered reentrant, but it is still thread safe. Like thread-specific data, mutex locking is transparent to the user of the routine except for the creation of a potential for blocking (where the potential may not have existed previously).
Mutexes are used in several libc routines, most notably the stdio routines, for example, printf(). Mutex locking in the stdio routines is done by stream to prevent concurrent operations on a stream from colliding, as in the case of two threads trying to fill a stream buffer at the same time. Mutex locking is also done on certain internal data tables in the C run-time library during operations such as fopen() and fclose(). Because the alternate versions of these routines do not require an application program interface (API) change, they have the same name as the original versions.
See Section 12.4.3 for an example of how to use mutexes.
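As a small illustration of a routine that is thread safe without being reentrant, the sketch below guards a shared counter with a mutex. It uses the standard POSIX 1003.1c mutex calls directly; the names record_hit, hit_many, run_workers, and MAX_WORKERS are hypothetical.

```c
#include <pthread.h>

#define MAX_WORKERS 16

static pthread_mutex_t hits_mutex = PTHREAD_MUTEX_INITIALIZER;
static long hits;   /* shared read/write data guarded by hits_mutex */

/* Thread safe but not reentrant: a caller may block on the mutex. */
long record_hit(void)
{
    long now;

    pthread_mutex_lock(&hits_mutex);
    now = ++hits;
    pthread_mutex_unlock(&hits_mutex);
    return now;
}

/* Thread start routine: call record_hit() *arg times. */
static void *hit_many(void *arg)
{
    int i, n = *(int *)arg;

    for (i = 0; i < n; i++)
        record_hit();
    return NULL;
}

/*
 * Run up to MAX_WORKERS threads that each call record_hit()
 * per_thread times, then return the value of the counter.
 */
long run_workers(int nthreads, int per_thread)
{
    pthread_t tids[MAX_WORKERS];
    int i, started = 0;
    long total;

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, hit_many, &per_thread) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_lock(&hits_mutex);
    total = hits;
    pthread_mutex_unlock(&hits_mutex);
    return total;
}
```

Without the lock, concurrent increments could be lost. With it, callers see no API difference from an unlocked version; the only visible effect is the possibility of blocking.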
When writing code that can be used by both single-threaded and multithreaded applications, it is necessary to code in a thread-safe manner. The following coding practices must be observed:
const type modifier to reduce the potential for misuse of the data.
geterrno() and seterrno(). This replacement is not necessary if the source file includes <errno.h> and one of the following conditions is true:
The file is compiled with the -pthread flag (cc or c89 command).
The <pthread.h> file is included at the top of the source file.
The _REENTRANT preprocessor symbol is explicitly set before including the <errno.h> file.
TIS is a package of routines provided by the C run-time library that can be used to write efficient code for both single-threaded and multithreaded applications. TIS routines can be used for handling mutexes, handling thread-specific data, and a variety of other purposes.
When used by a single-threaded application, these routines use simplified semantics to perform thread-safe operations for the single-threaded case. When DECthreads is present, the bodies of the routines are replaced with more complicated algorithms to optimize their behavior for the multithreaded case.
TIS is used within libc itself to allow a single version of the C run-time library to service both single-threaded and multithreaded applications. See the Guide to DECthreads and tis(3) for information on how to use this facility.
Example 12-1 shows how to use thread-specific data in a function that can be used by both single-threaded and multithreaded applications. For clarity, most error checking has been left out of the example.
#include <limits.h>     /* PATH_MAX */
#include <stdlib.h>
#include <string.h>
#include <tis.h>

static pthread_key_t key;

void __init_dirname()
{
    tis_key_create(&key, free);
}

void __fini_dirname()
{
    tis_key_delete(key);
}

char *dirname(char *path)
{
    char *dir, *lastslash;

    /*
     * Assume key was set and get thread-specific variable.
     */
    dir = tis_getspecific(key);
    if(!dir) {      /* First time this thread got here. */
        dir = malloc(PATH_MAX);
        tis_setspecific(key, dir);
    }

    /*
     * Copy dirname component of path into buffer and return.
     */
    lastslash = strrchr(path, '/');
    if(lastslash) {
        memcpy(dir, path, lastslash-path);
        dir[lastslash-path] = '\0';     /* terminate after the copied bytes */
    } else
        strcpy(dir, path);
    return dir;
}
The following TIS routines are used in the preceding example:
tis_key_create
tis_key_delete
tis_getspecific
tis_setspecific
The __init_ and __fini_ routines are used in the example to initialize and destroy the thread-specific data key. This operation is done only once, and these routines provide a convenient way of ensuring that this is the case, even if the library is loaded with dlopen(). See ld(1) for an explanation of how to use the __init_ and __fini_ routines.
Thread-specific data keys are a limited resource. A library that needs to create a large number of data keys should instead be written to create just one and to store all of the separate data items as a structure or an array of pointers pointed to by a single key.
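The single-key technique can be sketched as follows. This is an illustration under stated assumptions, not the library's own code: it uses the standard pthread_key_create(), pthread_getspecific(), and pthread_setspecific() calls in place of their tis_ counterparts, uses pthread_once() in place of an __init_ routine to create the key, and the structure lib_tsd and its members are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical bundle of per-thread items that share one key. */
struct lib_tsd {
    char         name_buf[64];
    unsigned int rand_state;
    int          last_error;
};

static pthread_key_t  lib_key;
static pthread_once_t lib_once = PTHREAD_ONCE_INIT;

/* Create the key once; free() releases a thread's bundle at thread exit. */
static void lib_key_init(void)
{
    pthread_key_create(&lib_key, free);
}

/*
 * Return the calling thread's bundle, allocating a zeroed one on
 * first use. Returns NULL if allocation or registration fails.
 */
struct lib_tsd *lib_tsd_get(void)
{
    struct lib_tsd *tsd;

    pthread_once(&lib_once, lib_key_init);
    tsd = pthread_getspecific(lib_key);
    if (tsd == NULL) {
        tsd = calloc(1, sizeof *tsd);
        if (tsd != NULL && pthread_setspecific(lib_key, tsd) != 0) {
            free(tsd);
            tsd = NULL;
        }
    }
    return tsd;
}
```

Every routine in the library then reaches its own per-thread item through lib_tsd_get(), so adding another item means adding a structure member and does not consume another key.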
In some cases, using thread-specific data is not the correct way to convert static data into thread-safe code, for example, when a data object is meant to be shareable between threads (as in stdio streams within libc). Manipulating per-process resources is another case in which thread-specific data is inadequate. The following example shows how to manipulate per-process resources in a thread-safe fashion:
#include <string.h>     /* strlen(), strncmp() */
#include <pthread.h>
#include <tis.h>

/*
 * NOTE: The putenv() function would have to set and clear the
 * same mutex lock before it accessed the environment.
 */

extern char **environ;
static pthread_mutex_t environ_mutex = PTHREAD_MUTEX_INITIALIZER;

char *getenv(const char *name)
{
    char **s, *value;
    int len;

    tis_mutex_lock(&environ_mutex);
    len = strlen(name);
    for(s=environ; (value=*s) != 0; s++)
        if(strncmp(name, value, len) == 0 && value[len] == '=') {
            tis_mutex_unlock(&environ_mutex);
            return &(value[len+1]);
        }
    tis_mutex_unlock(&environ_mutex);
    return (char *) 0L;
}
In the preceding example, note how the lock is set once (tis_mutex_lock) before accessing the environment and is unlocked exactly once (tis_mutex_unlock) before returning. In the multithreaded case, any other thread attempting to access the environment while the first thread holds the lock is blocked until the first thread performs the unlock operation. In the single-threaded case, no contention occurs unless an error exists in the coding of the locking and unlocking sequences.
If it is necessary for the lock state to remain valid across a fork() system call in multithreaded applications, it may be useful to create and register pthread_atfork() handler functions to lock the lock prior to any fork() call, and to unlock it in both the child and parent after the fork() call. This guarantees that a fork operation is not done by one thread while another thread holds the lock. If the lock were held by another thread, it would end up permanently locked in the child because the fork operation produces a child with only one thread. In the case of an independent library, the call to pthread_atfork() can be done in an __init_ routine in the library. Unlike most pthread routines, the pthread_atfork routine is available in libc and may be used by both single-threaded and multithreaded applications.
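The fork-handler arrangement described above might look like the following sketch. The lock table_mutex and all function names are hypothetical; only pthread_atfork() and the mutex, fork(), and waitpid() calls are standard. The fork_and_probe() helper is included only to show that the child finds the lock usable.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library lock that must stay usable across fork(). */
static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;

static void table_prepare(void) { pthread_mutex_lock(&table_mutex); }
static void table_parent(void)  { pthread_mutex_unlock(&table_mutex); }
static void table_child(void)   { pthread_mutex_unlock(&table_mutex); }

/*
 * Register the handlers once, for example from the library's
 * initialization routine. Returns 0 on success.
 */
int table_install_fork_handlers(void)
{
    return pthread_atfork(table_prepare, table_parent, table_child);
}

/*
 * Fork and let the child check that it can take the lock.
 * Returns the child's exit status (0 if the lock was free),
 * or -1 if fork() or waitpid() failed.
 */
int fork_and_probe(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: the child handler has already released the lock. */
        int rc = pthread_mutex_trylock(&table_mutex);
        _exit(rc == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The prepare handler runs in the forking thread before the fork, so that thread owns the lock at the moment the child is created, and the parent and child handlers each release their own copy of it.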
The compilation and linking of multithreaded applications differs from that of single-threaded applications in a few minor but important ways.
Many system include files behave differently when they are being included into the compilation of a multithreaded application. Whether the single-threaded or thread-safe include file behavior applies is determined by whether the _REENTRANT preprocessor symbol is defined. When the -pthread flag is supplied to the cc or c89 command, the _REENTRANT symbol is defined automatically; it is also defined if the pthread.h system include file is included. This include file must be the first file included in any application that uses the pthreads library, libpthread.so.
The -pthread flag has no other effect on the compilation of C programs. The reentrancy of the actual code generated by the C compiler is determined only by proper use of reentrant coding practices by the programmer and by use of only thread-safe support libraries, not by any special options.
To link a multithreaded C application, use the cc or c89 command with the -pthread flag. When linking, the -pthread flag has the effect of modifying the library search path in the following ways:
For each library named with the -l flag, an attempt is made to locate and presearch a library whose name is derived by appending an _r to the given name.
The -pthread flag does not modify the behavior of the linker in any other way. The reentrancy of the linked code is determined by use of proper programming practices in the original code, and by compiling and linking with the proper include files and libraries, respectively.
Not all compilers necessarily generate reentrant code; the definition of the language itself can make this difficult. It is also necessary for any run-time libraries linked with the application to be thread safe. For details on such matters, you should consult the manual for the compiler you are using.