This chapter contains information on the following topics:
The compiler system is responsible for converting source code into an executable program. This can involve several steps:
.i
file suffix.
.o
file suffix.
These steps can be performed by separate preprocessing, compiling, and linking commands, or they can be performed in a single operation, with the compiler system calling each tool at the appropriate time during the compilation.
Other tools in the compiler system help debug the program after it has been compiled and linked, examine the object files that are produced, create libraries of routines, or analyze the run-time performance of the program.
Table 2-1 summarizes the tools in the compiler system and points to the chapter or section where they are described in this and other documents.
Task | Tools | Where Documented |
Compile, link, and load programs, build shared libraries | Compiler drivers, link editor, dynamic loader | This chapter, Chapter 4, cc(1), c89(1), as(1), ld(1), loader(5), Assembly Language Programmer's Guide, DEC C Language Reference Manual |
Debug programs | Symbolic debuggers (dbx and ladebug) and Third Degree | Chapter 5, Chapter 6, dbx(1), third(5), ladebug(1), Ladebug Debugger Manual |
Profile programs | Profiler, call graph profiler | Chapter 8, prof(1), gprof(1), pixie(5), atom(1), hiprof(5), atomtools(5) |
Optimize programs | Optimizer, post-link optimizer | This chapter, Chapter 10, cc(1), third(5) |
Examine object files | nm, file, size, dis, odump, and stdump tools | This chapter, nm(1), file(1), size(1), dis(1), odump(1), stdump(1), Programming Support Tools |
Produce necessary libraries | Archiver (ar), linker (ld) command | This chapter, Chapter 4, ar(1), ld(1) |
Figure 2-1 shows the relationship between the major components of the compiler system and their primary inputs and outputs.
Compiler system commands, sometimes called driver programs, invoke the components of the compiler system. Each language has its own set of compiler commands and flags. In addition, your system might include layered products such as C++, or other languages such as Fortran or Pascal. The languages supported by any one system are determined by the choices made at the time the system is installed or modified. Thus, the configuration of your particular system may not support languages other than C and assembly.
The cc command invokes the C compiler. The -newc and -oldc flags invoke different compiler implementations (where the implementation invoked by -newc is upwardly compatible with that invoked by -oldc). The -newc compiler offers improved optimization, additional features, and greater compatibility with Digital compilers provided on other platforms. The -newc compiler implementation is the default.
The -newc compiler was accessible in previous versions of the Digital UNIX operating system by means of the -migrate flag. The -newc compiler has been made more compatible with the -oldc compiler.
Note
This manual uses the phrase "the C compiler" to refer to both versions of the DEC C compiler, -newc and -oldc. Features supported by only one of the compilers are so marked.
Each compiler implementation supports a slightly different set of compiler flags. See Table 2-4 for a comparison.
In the Digital UNIX programming environment, a single compiler command can perform multiple actions, including the following:
as
)
can assemble only a single file, which is assumed to contain
assembler code (any file suffix is ignored). The
as
command does not automatically link the assembled object file.
Thus, if you directly invoke the assembler, you need to link the object
in a separate step.
.o
object file for a subsequent link operation.
ld
)
to the linker. For example, you can include the
-L
flag as part of the
cc
command to specify the directory path to search for a library.
Each language requires different libraries at link time; the driver
program for a language passes the appropriate libraries to the linker.
For more information on linking with libraries, see
Chapter 4
and
Section 2.5.3.
a.out
or with a name that you specify.
Suffix | File |
.a | Archive library |
.c | C source code |
.i | The driver assumes that the source code was processed by the C preprocessor and that the source code is that of the processing driver, for example, % cc -c source.i. The file, source.i, is assumed to contain C source code. |
.o | Object file |
.s | Assembly source code |
.so | Shared object (shared library) |
.u | ucode object file (supported only under -oldc) |
.b | ucode object library (supported only under -oldc) |
The following sections describe how data is represented on the Digital UNIX system.
The Digital UNIX system is little endian; that is, the address of a multibyte integer is the address of its least significant byte; the more significant bytes are at higher addresses. The C compiler supports only little endian byte ordering. The following table gives the sizes of supported data types.
Data type | Size in bits |
char | 8 |
short | 16 |
int | 32 |
long | 64 |
long long | 64 |
float | 32 (IEEE Single) |
double | 64 (IEEE Double) |
pointer | 64 |
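The byte order and the sizes in the table can be checked on any given system with a few lines of C. The following is a minimal sketch, not part of the original manual; the function names is_little_endian, size_in_bits, and print_type_sizes are hypothetical names chosen for illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Returns 1 when the least significant byte of an integer is stored at
   the lowest address (little endian), 0 otherwise. */
int is_little_endian(void)
{
    unsigned int probe = 1;
    return *(unsigned char *)&probe == 1;
}

/* Converts a size in bytes, as returned by sizeof, to a size in bits. */
unsigned long size_in_bits(size_t bytes)
{
    return (unsigned long)bytes * CHAR_BIT;
}

/* Prints one line per data type, in the same order as the table. */
void print_type_sizes(void)
{
    printf("little endian: %s\n", is_little_endian() ? "yes" : "no");
    printf("char      %lu\n", size_in_bits(sizeof(char)));
    printf("short     %lu\n", size_in_bits(sizeof(short)));
    printf("int       %lu\n", size_in_bits(sizeof(int)));
    printf("long      %lu\n", size_in_bits(sizeof(long)));
    printf("long long %lu\n", size_in_bits(sizeof(long long)));
    printf("float     %lu\n", size_in_bits(sizeof(float)));
    printf("double    %lu\n", size_in_bits(sizeof(double)));
    printf("pointer   %lu\n", size_in_bits(sizeof(void *)));
}
```

Calling print_type_sizes() from main should reproduce the table on a system that follows the data model described here. Other platforms may report different sizes, most commonly for long and for pointers.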
The C compiler supports IEEE single-precision (32-bit float) and double-precision (64-bit double) floating-point data, as defined by the IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE Std 754-1985).
Floating-point numbers have the following ranges:
Digital UNIX provides the basic floating-point number formats, operations (add, subtract, multiply, divide, square root, remainder, and compare), and conversions defined in the standard. You can obtain full IEEE-compliant trapping behavior (including not-a-number values, or NaNs) by specifying a compilation flag, or you can select a fast mode when IEEE-style traps are not required. You can also select, at compile time, the rounding mode applied to the results of IEEE operations. See cc(1) for information on the flags that support IEEE floating-point processing.
A user program can control the delivery of floating-point traps to a thread by calling ieee_set_fp_control(), or dynamically set the IEEE rounding mode by calling write_rnd(). See ieee(3) for additional information on how to handle IEEE floating-point exceptions.
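The ranges and formats of the floating-point types can be read from the standard header float.h rather than hard-coded. The following sketch uses only standard C macros; the function names are hypothetical, and it does not use the Digital UNIX-specific ieee_set_fp_control() or write_rnd() routines.

```c
#include <float.h>
#include <stdio.h>

/* Returns 1 if float has the shape of an IEEE single-precision number:
   radix 2, 24-bit significand, maximum binary exponent 128. */
int float_is_ieee_single(void)
{
    return FLT_RADIX == 2 && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128;
}

/* Returns 1 if double has the shape of an IEEE double-precision number:
   radix 2, 53-bit significand, maximum binary exponent 1024. */
int double_is_ieee_double(void)
{
    return FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024;
}

/* Prints the smallest normalized and the largest finite positive value
   of each type. */
void print_float_ranges(void)
{
    printf("float : %e to %e\n", FLT_MIN, FLT_MAX);
    printf("double: %e to %e\n", DBL_MIN, DBL_MAX);
}
```

Note that FLT_MIN and DBL_MIN are the smallest normalized values; denormalized numbers, where supported, are smaller still.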
The C compiler aligns structure members on natural boundaries by default. That is, the components of a structure are laid out in memory in the order in which they are declared. The first component has the same address as the entire structure. Each additional component follows its predecessor on the next natural boundary for the component type.
For example, the following structure is aligned as shown in Figure 2-2:
struct {char c1; short s1; float f; char c2; }
The first component of the structure, c1, starts at offset 0 and occupies the first byte. The second component, s1, is a short; it must start on a word (2-byte) boundary. Therefore, one byte of padding is added between c1 and s1. No padding is needed to make f and c2 fall on their natural boundaries. However, because the size of the structure is rounded up to a multiple of its alignment (4 bytes, the alignment of f), three bytes of padding are added after c2.
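The layout described above can be confirmed with the standard offsetof macro. This is a minimal sketch that assumes default natural alignment; the structure tag example and the function print_layout are hypothetical names added for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* The structure from the text, with a tag so that offsetof can name it. */
struct example {
    char  c1;
    short s1;
    float f;
    char  c2;
};

/* Prints the byte offset of each member and the total structure size. */
void print_layout(void)
{
    printf("c1 at %lu\n", (unsigned long)offsetof(struct example, c1));
    printf("s1 at %lu\n", (unsigned long)offsetof(struct example, s1));
    printf("f  at %lu\n", (unsigned long)offsetof(struct example, f));
    printf("c2 at %lu\n", (unsigned long)offsetof(struct example, c2));
    printf("size  %lu\n", (unsigned long)sizeof(struct example));
}
```

With natural alignment, the offsets should come out as 0, 2, 4, and 8, and the size as 12, matching the padding described in the text.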
The following mechanisms can be used to override the default alignment of structure members:
The #pragma member_alignment and #pragma nomember_alignment directives (-newc only)
The #pragma pack directive (-newc or -oldc)
The -Zpn flag
See Section 3.5 and Section 3.7 for information on these directives.
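The effect of packing can be seen by comparing a naturally aligned structure with a packed copy of it. The sketch below is an illustration only: it uses the parenthesized #pragma pack(push, 1) and #pragma pack(pop) forms accepted by several current compilers, which is an assumption and may differ from the DEC C syntax documented in Section 3. The structure tags and the function padding_saved are hypothetical.

```c
#include <stddef.h>

/* Default layout: members sit on their natural boundaries. */
struct natural_layout {
    char  c1;
    short s1;
    float f;
    char  c2;
};

/* Packed layout: members are placed on 1-byte boundaries, so no
   padding is inserted between or after them. */
#pragma pack(push, 1)
struct packed_layout {
    char  c1;
    short s1;
    float f;
    char  c2;
};
#pragma pack(pop)

/* Returns the number of padding bytes removed by packing. */
size_t padding_saved(void)
{
    return sizeof(struct natural_layout) - sizeof(struct packed_layout);
}
```

Packing saves space at the cost of misaligned members, which can be slower to access on architectures that require or prefer natural alignment.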
In general, the alignment of a bit field is determined by the bit size and bit offset of the previous field. For example, the following structure is aligned as shown in Figure 2-3:
struct a { char f0: 1; short f1: 12; char f2: 3; } struct_a;
The first bit field, f0, starts on bit offset 0 and occupies 1 bit. The second, f1, starts at offset 1 and occupies 12 bits. The third, f2, starts at offset 13 and occupies 3 bits. The size of the structure is two bytes.
Certain conditions can cause padding to occur prior to the alignment of the bit field:
#pragma
pack
directive
(-newc
or
-oldc
)
or the
-Zpn
compiler flag.) For bit fields of size 0, the bit field's base
type is ignored. For example, consider the following structure:
struct b { char f0: 1; int : 0; char f1: 2; } struct_b;
If the source file is compiled with the -Zp1 flag or if a #pragma pack 1 directive is encountered in the compilation, f0 would start at offset 0 and occupy 1 bit, the unnamed bit field would start at offset 8 and occupy 0 bits, and f1 would start at offset 8 and occupy 2 bits.
Similarly, if the -Zp2 flag or the #pragma pack 2 directive were used, the unnamed bit field would start at offset 16. With -Zp4 or #pragma pack 4, it would start at offset 32.
char foo: 1
is a byte.)
The current unit is determined by the current offset, the bit
field's base size, and the kind of packing specified, as shown
in the following example:
struct c { char f0: 7; short f1: 11; } struct_c;
Assuming that you specify either the -Zp1 flag or the #pragma pack 1 directive, f0 starts on bit offset 0 and occupies 7 bits in the structure. Because the base size of f1 is 8 bits and the current offset is 7, f1 will not fit in the current unit. Padding is added to reach the next unit boundary or the next pack boundary, whichever comes first, in this case, bit 8. The layout of this structure is shown in Figure 2-4.
Data alignment is implied by data type. For example, the C compiler aligns an int (32 bits) on a 4-byte boundary and a long (64 bits) on an 8-byte boundary. The _align storage-class modifier, supported only by the C compiler using the -std and -newc flags (the default), aligns objects of any of the C data types on the specified storage boundary. It can be used in a data declaration or definition.
The _align modifier has the following format:
_align(keyword)
_align(n)
Where keyword is a predefined alignment constant and n is an integer power of 2. The predefined constant or power of 2 tells the compiler the number of bytes to pad in order to align the data.
For example, to align an integer on the next quadword boundary, use any of the following declarations:
int _align(QUADWORD) data;
int _align(quadword) data;
int _align(3) data;
In this example, int _align(3) specifies an alignment of 2x2x2 bytes, which is 8 bytes, or a quadword of memory.
The following table shows the predefined alignment constants, their equivalent power of 2, and equivalent number of bytes.
Constant | Power of 2 | Number of Bytes |
BYTE or byte | 0 | 1 |
WORD or word | 1 | 2 |
LONGWORD or longword | 2 | 4 |
QUADWORD or quadword | 3 | 8 |
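The relationship between the power-of-2 argument and the byte count in the table is a simple shift. The sketch below shows that arithmetic and, for comparison, the C11 _Alignas specifier, which is a different, standard mechanism that takes a byte count rather than a power of 2. It does not use the DEC C _align modifier, and the names align_bytes, quadword_data, and is_aligned are hypothetical.

```c
#include <stdint.h>

/* Converts the power-of-2 argument used by _align(n) to a byte count:
   0 -> 1, 1 -> 2, 2 -> 4, 3 -> 8. */
unsigned long align_bytes(unsigned int power)
{
    return 1UL << power;
}

/* C11 declaration of an int placed on an 8-byte (quadword) boundary. */
_Alignas(8) int quadword_data;

/* Returns 1 if the address is a multiple of the given byte boundary. */
int is_aligned(const void *address, unsigned long boundary)
{
    return (uintptr_t)address % boundary == 0;
}
```

For example, is_aligned(&quadword_data, align_bytes(3)) checks that the object really sits on a quadword boundary.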
The C preprocessor performs macro expansion, includes header files, and
executes preprocessor directives prior to compiling the source file.
The following sections describe the
Digital UNIX
-specific operations performed by the C preprocessor.
For more information on the C preprocessor, see the
cc
(1)
and
cpp
(1)
reference pages and the
DEC C Language Reference Manual.
When the compiler is invoked, it defines C preprocessor macros that
identify the language of the input files and the environments on which
the code may run. You can reference these macros in
#ifdef
statements to isolate code that applies to a particular language or
environment. The preprocessor macros are listed in
Table 2-3.
The type of source file and the type of standards you apply determine the macros that are defined. The C compiler supports several levels of standardization:
The -std flag enforces the ANSI C standard, but allows some common programming practices disallowed by the standard, and passes the macro __STDC__=0 to the preprocessor.
The -std0 flag enforces the K & R programming style, with certain ANSI extensions in areas where the K & R behavior is undefined or ambiguous. In general, -std0 compiles most pre-ANSI C programs and produces expected results. It causes the __STDC__ macro to be undefined.
The -std1 flag strictly enforces the ANSI C standard and all its prohibitions (such as those that apply to handling a void, the definition of an lvalue in expressions, the mixing of integrals and pointers, and the modification of an rvalue). It passes the macro __STDC__=1 to the preprocessor.
Macro | Source File Type | -std Flag |
__DECC (-newc only) | .c | -std0, -std, -std1 |
LANGUAGE_C | .c | -std0 |
__LANGUAGE_C__ | .c | -std0, -std, -std1 |
unix | .c, .s | -std0 |
__unix__ | .c, .s | -std0, -std, -std1 |
__osf__ | .c, .s | -std0, -std, -std1 |
__alpha | .c, .s | -std0, -std, -std1 |
SYSTYPE_BSD | .c, .s | -std0 |
_SYSTYPE_BSD | .c, .s | -std0, -std, -std1 |
LANGUAGE_ASSEMBLY | .s | -std0, -std, -std1 |
__LANGUAGE_ASSEMBLY__ | .s | -std0, -std, -std1 |
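Code can test these macros to adapt to the standards level and environment in effect. The following is a minimal sketch; the functions stdc_level, host_family, and report are hypothetical, and the value -1 is simply this example's own convention for "__STDC__ is undefined".

```c
#include <stdio.h>

/* Returns the value of __STDC__ (0 or 1 under the flags described
   above), or -1 when the macro is not defined at all. */
int stdc_level(void)
{
#ifdef __STDC__
    return __STDC__;
#else
    return -1;
#endif
}

/* Names the environment, using macros from the table. */
const char *host_family(void)
{
#if defined(__osf__)
    return "osf";
#elif defined(__unix__)
    return "unix";
#else
    return "other";
#endif
}

/* Prints a one-line summary of the compilation environment. */
void report(void)
{
    printf("__STDC__ level: %d, environment: %s\n",
           stdc_level(), host_family());
}
```

The double-underscore forms are used here because, according to the table, they are defined at every -std level, whereas names such as unix are defined only under -std0.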
When writing programs, you often use header files that are common among a program's modules. These files define constants, the parameters for system calls, and so on.
C header files, sometimes known as include files, have a .h suffix. Typically, the reference page for a library routine or system call indicates the required header files. Header files can be used in programs written in different languages.
Note
If you intend to debug your program using dbx or ladebug, do not place executable code in a header file. The debugger interprets a header file as one line of source code; none of the source lines in the file appears during the debugging session. For more information on the dbx debugger, see Chapter 5. For details on ladebug, see the Ladebug Debugger Manual.
You can include header files in a program source file in one of two ways:
#include "filename"
Searches for filename in the directory in which it found the file that contains the directive, then in the search path indicated by the -I flag, and finally in /usr/include.
#include <filename>
Searches for filename only in the search path indicated by the -I flag and in /usr/include, but not in the current directory.
You can also use the -Idir compiler flag to specify additional pathnames (directories) to be searched by the C preprocessor for #include files. The C preprocessor searches first in the directory where the source file resides, followed by the specified pathname, dir, then the default directory, /usr/include. If dir is omitted, the default directory, /usr/include, is not searched.
C, Fortran, and assembly code can reside in the same include files and then be conditionally included in programs as required. To set up a shareable include file, you must create a .h file and enter the respective code, as shown in the following example:
#ifdef __LANGUAGE_C__
.
.
(C code)
.
#endif
#ifdef __LANGUAGE_ASSEMBLY__
.
.
(assembly code)
.
#endif
When the compiler includes this file in a C source file, the __LANGUAGE_C__ macro is defined, and the C code is compiled. When the compiler includes this file in an assembly language source file, the __LANGUAGE_ASSEMBLY__ macro is defined, and the assembly language code is compiled.
The #pragma directive is a standard method of implementing features that vary from one compiler to the next. The C compiler supports the following implementation-specific pragmas:
#pragma environment
#pragma function
#pragma inline
#pragma intrinsic
#pragma linkage
#pragma member
#pragma message
#pragma pack
#pragma pointer_size
#pragma use_linkage
#pragma weak
The pragmas are described in detail in Chapter 3.
The
cc
command provides more than one compilation environment:
The
-newc
and
-oldc
flags invoke different compiler implementations (where the
implementation invoked by
-newc
is upwardly compatible with that invoked by
-oldc
).
The
-newc
compiler offers improved optimization, additional features, and
greater compatibility with Digital compilers provided on other
platforms. The
-newc
compiler implementation is the default.
The
-newc
compiler has been accessible in previous versions of the
Digital UNIX
operating system by means of the
-migrate
flag. The
-newc
compiler has been made more compatible with the
-oldc
compiler.
All compilation environments produce object files that comply with the common object file format (COFF), and their object files can be freely intermixed. The C compiler invoked by the -oldc flag employs ucode-based optimizations; the C compiler invoked by the -newc flag employs other optimizations.
The following sections describe the flags that are available in all compilation environments, the default compiler behavior, and how to compile multilanguage programs.
Compiler flags select a variety of program development functions, including debugging, optimizing, and profiling facilities, and the names assigned to output files.
Table 2-4
compares the flags that are available with the three compilation
environments. An asterisk (*) indicates that the flag is accepted,
but ignored, by the compiler. See the
cc
(1)
reference page for more information on these flags.
Flag | -newc | -oldc | -migrate |
-ansi_alias | yes | no | yes |
-[no_]ansi_args | yes | no | yes |
-assume [no]accuracy_sensitive | yes | yes | yes |
-assume [no]aligned_object | yes | no | yes |
-assume [no]trusted_short_alignment | yes | no | yes |
-B | yes | yes | yes |
-c | yes | yes | yes |
-C | yes | yes | yes |
-call_shared | yes | yes | yes |
-check | yes | no | yes |
-compress | yes | yes | yes |
-cord | yes | yes | yes |
-[no_]cpp | yes | yes | yes |
-D | yes | yes | yes |
-double | yes | yes | yes |
-edit | yes | yes | yes |
-exact_version | yes | yes | yes |
-E | yes | yes | yes |
-fast | yes | yes | yes |
-feedback | yes | yes | yes |
-float | yes | yes | yes |
-float_const | yes | yes | yes |
-[no_]fp_reorder | yes | yes | yes |
-fprm {c | d | n | m} | yes | yes | yes |
-fptm {n | su | sui | u} | yes | yes | yes |
-framepointer | yes | yes | yes |
-g | yes | yes | yes |
-G | yes* | yes | yes* |
-gen_feedback | yes | no | yes |
-h | yes | yes | yes |
-H | yes | yes | yes |
-I | yes | yes | yes |
-ieee | yes | yes | yes |
-ifo | yes | yes* | yes |
-inline | yes | no | yes |
-j | no | yes | no |
-k | yes | yes | yes |
-K | yes | yes | yes |
-ko | yes | yes | yes |
-M | yes | yes | yes |
-machine_code | yes | no | yes |
-MD | yes | yes | yes |
-[no_]misalign | yes | yes | yes |
-no_archive | yes | yes | yes |
-no_inline | yes | yes | yes |
-nomember_alignment | yes | no | yes |
-non_shared | yes | yes | yes |
-noobject | yes | no | yes |
-o | yes | yes | yes |
-O | yes | yes | yes |
-oldcomment | yes | yes | yes |
-Olimit | yes* | yes | yes* |
-p | yes | yes | yes |
-P | yes | yes | yes |
-[no_]pg | yes | yes | yes |
-portable | yes | no | yes |
-preempt_module | yes | no | yes |
-preempt_symbol | yes | no | yes |
-proto[is] | yes | yes | yes |
-pthread | yes | yes | yes |
-Q | yes | yes | yes |
-readonly_strings | yes | yes | yes |
-resumption_safe | yes | yes | yes |
-S | yes | yes | yes |
-scope_safe | yes | yes | yes |
-show | yes | no | yes |
-signed | yes | yes | yes |
-source_listing | yes | no | yes |
-speculate | yes | no | yes |
-std[n] | yes | yes | yes |
-t | yes | yes | yes |
-taso | yes | yes | yes |
-threads | yes | yes | yes |
-tune | yes | yes | yes |
-traditional | yes | yes | yes |
-trapuv | yes | yes | yes |
-U | yes | yes | yes |
-unroll | yes | no | yes |
-unsigned | yes | yes | yes |
-v | yes | yes | yes |
-V | yes | yes | yes |
-varargs | yes | yes | yes |
-vaxc | yes | no | yes |
-verbose | yes | yes | yes |
-volatile | yes | yes | yes |
-w | yes | yes | yes |
-W | yes | yes | yes |
-warnprotos | yes | yes | yes |
-writable_strings | yes | yes | yes |
-xtaso | yes | yes | yes |
-xtaso_short | yes | yes | yes |
-Zp | yes | yes | yes |
Table note:
-w0
flag is not accepted by the
-oldc
flag.
Some flags have default values that are used if the flag is not
specified on the command line. For example, the default name for
an output file is
filename.o
for object files, where
filename
is the base name of the source file. The default name
for an executable program object is
a.out
.
The following
example uses the defaults in compiling two source files named
prog1.c
and
prog2.c
:
%
cc prog1.c prog2.c
This command runs the C compiler, creating object modules
prog1.o
and
prog2.o
and the executable program
a.out
.
Whether you are new to
Digital UNIX,
porting applications from other systems, or concerned with
compatibility issues, knowing the default behavior of the compiler
is useful. When you enter the
cc
compiler command with no other flags, the following flags are in
effect:
-newc
-assume
aligned_objects
-call_shared
-double
float
to
double
.
-fprm
n
-g0
-I/usr/include
#include
files whose names do not begin with / are always sought first in the
directory
/usr/include
.
-inline
manual
#pragma
inline
directive.
-member_alignment
-no_fp_reorder
-no_misalign
-O1
-oldcomment
-p0
-no_pg
gprof
profiling.
-preempt_symbol
-signed
char
declarations to be
signed
char
.
-std0
-tune
generic
-unroll
0
-writable_strings
The following list includes miscellaneous aspects of the
default
cc
compiler behavior:
a.out
unless another name is specified by using the
-o
flag.
/tmp
directory.
When the source language of the main program differs from that of a
subprogram, compile each program separately with the appropriate
driver and link the object files in a separate step.
You can create objects suitable for linking by specifying the
-c
flag, which stops a driver immediately after the object file has been
created. For example:
%
cc -c main.c
This command produces the object file
main.o
,
not the executable file
a.out
.
Most language driver programs pass information to
cc
,
which, after processing, passes information to
ld
.
When one of the modules to be compiled is a C program, you can
usually use the driver command of the other language to compile
and link both modules.
The
cc
driver command can link object files to produce an executable program.
In some cases, you may want to use the
ld
linker directly. Depending on the nature of the application, you must
decide whether to compile and link separately or to compile and link
with one compiler command. Factors to consider include:
You can use a compiler command instead of the linker command to link
separate objects into one executable program.
Each compiler (except the assembler) recognizes the
.o
suffix as the name of a file that contains object code suitable for
linking and immediately invokes the linker.
Because the compiler driver programs pass the libraries associated
with that language to the linker, using the compiler command is
usually recommended. For example, the
cc
driver uses the C library
(libc.so
)
by default. For information about the default libraries used by each
compiler command, see the appropriate command in the reference pages,
such as
cc
(1).
You can also use the
-l
flag to specify additional libraries to be searched for unresolved
references. The following example shows how to use the
cc
driver to pass the names of two libraries to the linker with the
-l
flag:
% cc -o all main.o more.o rest.o -lm -lexc
The
-lm
flag specifies the math library; the
-lexc
flag specifies the exception library.
You should compile and link modules with a single command when you want to optimize your program. Most compilers support increasing levels of optimization with the use of certain flags. For example:
-O0
flag requests no optimization (usually for debugging purposes).
-O1
flag requests certain local (module-specific) optimizations.
-O3
flag to the C compiler using the
-oldc
flag, or with the
-ifo
flag to the C compiler using the
-newc
flag. In this case, compiling multiple files in one operation
allows the compiler to perform the maximum possible optimizations.
-c
and
-o
)
that compile multiple source files into a single object module. This
combination allows interprocedural optimizations to occur, yet retains
the object file.
Normally, users do not need to run the linker directly, but use the
cc
command to indirectly invoke the linker. Executables that need to be
built solely from assembler objects can be built with the
ld
command.
The linker
(ld
)
combines one or more object files (in the order specified) into one
executable program file, performing relocation, external symbol
resolutions, and all other processing required to make object files
ready for execution. Unless you specify otherwise, the linker names
the executable program file
a.out
.
You can execute the program file or use it as input for another linker
operation.
The
as
assembler does not automatically invoke the linker.
To link a program written in assembly language,
do either of the following:
.s
suffix of the assembly language source file automatically
causes the compiler command to invoke the assembler.
as
command and then link the resulting object file with the
ld
command.
For information about the flags and libraries that affect the linking
process, see the
ld
(1)
reference page.
When you compile your program on the
Digital UNIX
system, it is automatically linked with the C library,
libc.so
.
If you call routines that are not in
libc.so
or one of the archive libraries associated with your compiler command,
you must explicitly link your program with the library. Otherwise,
your program will not be linked correctly.
You need to explicitly specify libraries in the following situations:
If you compile multilanguage programs, be sure to explicitly request
any required run-time libraries to handle unresolved references.
Link the libraries by specifying
-lstring
,
where
string
is an abbreviation of the library name.
For example, if you write a main program in C and some procedures
in another language, you must explicitly specify the library for that
language and the math library. When you use these flags, the linker
replaces the
-l
with
lib
and appends the specified characters (for the language library and for
the math library) and the
.a
or
.so
suffix, depending upon whether it is a static (non-shared archive
library) or dynamic (call-shared
object or shared library) library.
Then, it searches the following directories for the resulting library
name:
/usr/shlib
/usr/ccs/lib
/usr/lib/cmplrs/cc
/usr/lib
/usr/local/lib
/var/shlib
For a list of the libraries that each language uses, see the reference pages for the appropriate language compiler driver.
You must include the pathname of the library on the compiler or linker
command line. For example, the following command specifies that the
libfft.a
archive library in the
/usr/jones
directory is to be linked along with the math library:
%
cc main.o more.o rest.o /usr/jones/libfft.a -lm
The linker searches libraries in the order you specify. Therefore, if any file in your archive library uses data or procedures from the math library, you must specify the archive library before you specify the math library.
To link from a ucode library, specify the
-klx
compiler flag.
Note
Only the
-oldc
flag to the C compiler can be used to produce ucode files.
The following example links a file from a ucode library:
%
cc -klucode_lib -o output main.u more.u rest.u
Because the libraries are searched as they are encountered on the command line, the order in which you specify them is important. Although a library might be made from both assembly and high-level language routines, the ucode object library contains code only for the high-level language routines.
Unlike an extended COFF object library, the ucode library does not contain code for the routines. You must specify to the ucode linker both the ucode object library and the extended COFF object library, in that order, to ensure that all modules are linked with the proper library.
If the compiler driver is to perform both a ucode link step and a final link step, the object file created after the ucode link step is placed in the position of the first ucode file specified or created on the command line in the final link step.
To run an executable program in your current working directory,
in most cases you enter its file name. For example, to run the program
a.out
located in your current directory, enter:
%
a.out
If the executable program is not in a directory in your path, enter the directory path before the file name, or enter:
%
./a.out
When the program is invoked, the
main
function in a C program can accept arguments from the command line if
the
main
function is defined with one or more of the following optional
parameters:
int main( int argc, char *argv[ ], char *envp[ ] )
[...]
The
argc
parameter is the number of arguments in the command line that
invoked the program. The
argv
parameter is an array of character strings containing the arguments.
The
envp
parameter is the environment array containing process information,
such as the user name and controlling terminal. (The
envp
parameter has no bearing on passing command-line arguments.
Its primary use is during
exec
and
getenv
function calls.)
You can
access only the parameters that you define. For example, the
following program defines the
argc
and
argv
parameters to echo the values of parameters passed to the program:
/*
 * Filename: echo-args.c
 * This program echoes command-line arguments.
 */
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int i;

    printf( "program: %s\n", argv[0] ); /* argv[0] is program name */
    for ( i=1; i < argc; i++ )
        printf( "argument %d: %s\n", i, argv[i] );
    return(0);
}
The program is compiled with the following command to produce a
program file called
a.out
:
$
cc echo-args.c
When the user invokes
a.out
and passes command-line arguments, the program echoes those
arguments on the terminal. For example:
$
a.out Long Day\'s "Journey into Night"
program: a.out
argument 1: Long
argument 2: Day's
argument 3: Journey into Night
The shell parses all arguments before passing them to a.out. For this reason, a single quote must be preceded by a backslash, alphabetic arguments are delimited by spaces or tabs, and arguments with embedded spaces or tabs are enclosed in quotation marks.
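The envp parameter mentioned earlier is an array of "NAME=value" strings terminated by a null pointer. The following sketch shows one way such an array could be walked; the helper functions count_env and find_env are hypothetical and are not library routines. In most programs, the standard getenv function is the simpler choice.

```c
#include <stddef.h>
#include <string.h>

/* Counts the entries in a null-terminated environment array. */
int count_env(char *envp[])
{
    int n = 0;

    while (envp[n] != NULL)
        n++;
    return n;
}

/* Searches a "NAME=value" array for name. Returns a pointer to the
   value part of the matching entry, or NULL if name is not present. */
const char *find_env(char *envp[], const char *name)
{
    size_t len = strlen(name);
    int i;

    for (i = 0; envp[i] != NULL; i++) {
        if (strncmp(envp[i], name, len) == 0 && envp[i][len] == '=')
            return envp[i] + len + 1;
    }
    return NULL;
}
```

A main function defined with the three-parameter form could pass its envp argument directly to either helper.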
After a source file has been compiled, you can examine the object file or executable file with the following tools:
odump
- Displays the contents of an object file, including the symbol
table and header information.
stdump
- Displays symbol table information from an object file.
nm
- Displays only symbol table information.
file
- Provides descriptive information
on the general properties of the specified file, for example, the
programming language used.
size
- Displays the size of the text, data, and bss segments.
dis
- Disassembles object files into machine instructions.
The following sections describe these tools. In addition, see the
strings
(1)
reference page for information on using the
strings
command to find the printable strings in an object file or other
binary file.
The
odump
tool displays header tables and other selected parts of an object
or archive file. For example,
odump
displays the following information about the file
echo-args.o
:
%
odump -at echo-args.o
***ARCHIVE SYMBOL TABLE***
***ARCHIVE HEADER***
Member Name Date Uid Gid Mode Size
***SYMBOL TABLE INFORMATION***
[Index] Name Value Sclass Symtype Ref
echo-args.o:
[0] main 0x0000000000000000 0x01 0x06 0xfffff
[1] printf 0x0000000000000000 0x06 0x06 0xfffff
[2] _fpdata 0x0000000000000000 0x06 0x01 0xfffff
For more information, see the
odump
(1)
reference page.
The
nm
tool displays symbol table information for object files.
For example,
nm
would display the following information about the object file
produced for the executable file
a.out
:
%
nm
nm: Warning: - using a.out
Name Value Type Size
.bss | 0000005368709568 | B | 0000000000000000
.data | 0000005368709120 | D | 0000000000000000
.lit4 | 0000005368709296 | G | 0000000000000000
.lit8 | 0000005368709296 | G | 0000000000000000
.rconst | 0000004831842144 | Q | 0000000000000000
.rdata | 0000005368709184 | R | 0000000000000000
The Name column contains the symbol or external name; the Value column
shows the address of the symbol, or debugging information; the Type
column contains a letter showing the symbol type; and the Size column
shows the symbol's size (accurate only when the source file is
compiled with a debugging flag, for example,
-g
).
Some of the symbol type letters are:
For more information, see
nm
(1).
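The column layout described above can be made concrete with a short parsing sketch. It assumes rows in exactly the `name | value | type | size` form of the sample listing, with the value printed in decimal, as the sample suggests: 0000005368709568 is 0x1400001c0. The names `nm_row` and `parse_nm_row` are hypothetical and are not part of any system library.

```c
#include <stdio.h>

/* Fields of one row of the nm listing (hypothetical helper type). */
struct nm_row {
    char               name[64];  /* symbol or external name */
    unsigned long long value;     /* address, assumed decimal as in the sample */
    char               type;      /* symbol type letter, for example B or D */
    unsigned long long size;      /* symbol size */
};

/* Parse "name | value | type | size"; return 1 on success, 0 otherwise. */
static int parse_nm_row(const char *line, struct nm_row *row)
{
    return sscanf(line, " %63[^ |] | %llu | %c | %llu",
                  row->name, &row->value, &row->type, &row->size) == 4;
}
```

Passing the `.bss` row from the listing fills in the type letter `B` and a value that, printed in hexadecimal, is easier to compare with the addresses that other tools report. The header row is rejected because it contains no `|` separators.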
The
file
command reads input files, tests each file to classify it by type, and
writes the file's type to standard output. The
file
command uses the
/etc/magic
file to identify files that contain a magic number. (A magic
number is a numeric or string constant that indicates a file's
type.)
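The idea of classifying a file by its magic number can be sketched in a few lines of C. This is only an illustration under stated assumptions: the table below is hypothetical and far smaller than a real /etc/magic file, the names `magic_entry`, `magic_table`, and `classify` are invented for the example, and the ELF entry is included only as a well-known magic string from other systems.

```c
#include <stddef.h>
#include <string.h>

/* One entry in a small, illustrative magic-number table. */
struct magic_entry {
    const char *bytes;  /* leading bytes that identify the type */
    size_t      len;    /* number of bytes to compare */
    const char *type;   /* description to report */
};

/* Hypothetical table; a real magic file holds many more entries. */
static const struct magic_entry magic_table[] = {
    { "!<arch>\n", 8, "archive" },
    { "#!",        2, "script text" },
    { "\177ELF",   4, "ELF object" },
};

/* Return a description for the buffer, or "data" if nothing matches. */
static const char *classify(const unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < sizeof magic_table / sizeof magic_table[0]; i++) {
        const struct magic_entry *m = &magic_table[i];

        if (n >= m->len && memcmp(buf, m->bytes, m->len) == 0)
            return m->type;
    }
    return "data";
}
```

A caller would read the first few bytes of a file into a buffer and pass the buffer and its length to `classify()`. The length check keeps the comparison from reading past the end of short files.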
The following example shows the output of the
file
command on a directory containing a C source file, object file, and
executable file:
%
file *.*
.: directory
..: directory
a.out: COFF format alpha dynamically linked, demand paged executable or object module not stripped - version 3.11-8
echo-args.c: c program text
echo-args.o: COFF format alpha executable or object module not stripped - version 3.12-6
For more information, see
file
(1).
The
size
tool displays information about the text, data, and bss segments of
the specified object or archive file or files in octal, hexadecimal,
or decimal format. For example, when it is called without any
arguments, the
size
command returns information on
a.out
.
You can also specify the name of an object or executable file on the
command line. For example:
%
size
text    data    bss     dec     hex
8192    8192    0       16384   4000
%
size echo-args.o
text    data    bss     dec     hex
176     96      0       272     110
For more information, see
size
(1).
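As a rough guide to what the three columns measure, the following sketch marks where each kind of definition in a C source file usually ends up. The names `retry_limit`, `scratch`, and `bump` are hypothetical, and the exact placement depends on the compiler and its flags, so treat the comments as the conventional case rather than a guarantee.

```c
/* Initialized object with static storage duration: usually counted as data. */
int retry_limit = 3;

/* Uninitialized static storage: usually counted as bss. It occupies no
   space in the object file and is zero-filled when the program loads. */
static char scratch[4096];

/* Machine instructions generated for functions: counted as text. */
int bump(int n)
{
    scratch[0] = (char)(scratch[0] + 1);
    return n + retry_limit;
}
```

Compiling a file like this to an object file and running size on it before and after enlarging `scratch` is a simple way to see which column changes.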
The
dis
tool disassembles object file modules into machine language.
For example, the
dis
command produces the following output when it disassembles the
a.out
program:
%
dis a.out
.
.
.
__start:
0x120001080: 23defff0 lda sp, -16(sp)
0x120001084: b7fe0008 stq zero, 8(sp)
0x120001088: c0200000 br t0, 0x12000108c
0x12000108c: a21e0010 ldl a0, 16(sp)
0x120001090: 223e0018 lda a1, 24(sp)
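To relate the hexadecimal words in the listing to the mnemonics beside them, the following sketch splits one instruction word into its fields. It assumes the Alpha memory instruction format, which this chapter does not describe: a 6-bit opcode, two 5-bit register numbers, and a 16-bit signed displacement. The names `mem_insn` and `decode_mem` are hypothetical, and the sketch does not handle other formats, such as the branch format used by the br instruction.

```c
#include <stdint.h>

/* Fields of a memory-format instruction (assumed layout, see above). */
struct mem_insn {
    unsigned opcode;  /* bits 31..26 */
    unsigned ra;      /* bits 25..21 */
    unsigned rb;      /* bits 20..16, base register */
    int      disp;    /* bits 15..0, sign-extended */
};

/* Split a 32-bit instruction word into its memory-format fields. */
static struct mem_insn decode_mem(uint32_t word)
{
    struct mem_insn m;

    m.opcode = (word >> 26) & 0x3f;
    m.ra     = (word >> 21) & 0x1f;
    m.rb     = (word >> 16) & 0x1f;
    m.disp   = (int)(word & 0xffff);
    if (m.disp >= 0x8000)   /* sign-extend the 16-bit displacement */
        m.disp -= 0x10000;
    return m;
}
```

For the first word in the listing, 23defff0, this yields register 30 in both register fields and a displacement of -16, which matches the operands sp, -16(sp) if register 30 is the stack pointer.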
The ANSI C standard states that users whose programs link against
libc
are guaranteed a certain range of global identifiers that can be used
in their programs without danger of conflict with, or preemption of,
any global identifiers in
libc
.
The ANSI C standard also reserves a range of global identifiers that
libc
can use in its internal implementation. These are called reserved
identifiers and consist of the following, as defined in ANSI document
number X3.159-1989:
ANSI conformant programs are not permitted to define global identifiers that either match the names of ANSI routines or fall into the reserved name space specified earlier in this section. All other global identifier names are available for use in user programs.
Historical
libc
implementations contain large numbers of non-ANSI, nonreserved global
identifiers that are both documented and supported.
These routines are often called from within
libc
by other
libc
routines, both ANSI and otherwise. A user's program that defines its
own version of one of these non-ANSI, nonreserved items would preempt
the routine of the same name in
libc
.
This could alter the behavior of supported
libc
routines, both ANSI and otherwise, even though the user's program may
be ANSI conformant. This potential conflict is known as ANSI name
space pollution.
The implementation of
libc
on
Digital UNIX Version 4.0
includes a large number of
non-ANSI, nonreserved global identifiers that are both documented
and supported. To protect against preemption of these global
identifiers within
libc
and to avoid pollution of the user's name
space, the vast majority of these identifiers have been renamed to the
reserved name space by prepending two underscores
(__)
to the
identifier names. To preserve external access to these items, weak
identifiers have been added using the original identifier names that
correspond to their renamed reserved counterparts. Weak identifiers
work much like symbolic links between files. When the weak identifier
is referenced, the strong counterpart is used instead.
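The weak/strong pairing can be illustrated with a small sketch. It relies on the `weak` and `alias` attributes, a GCC and Clang extension for ELF targets. That is an assumption made for illustration and is not necessarily the mechanism used to build libc on this system. The names `__demo_tzset` and `demo_tzset` are hypothetical stand-ins for a renamed routine and its original name.

```c
/* Strong definition under a reserved-style name. In the scheme described
   above, a definition like this would live inside the library. */
int __demo_tzset(void)
{
    return 42;
}

/* Weak alias under the original, unreserved name (GCC/Clang extension,
   assumed here). A call to demo_tzset() runs __demo_tzset() unless the
   program supplies its own strong definition of demo_tzset. */
int demo_tzset(void) __attribute__((weak, alias("__demo_tzset")));
```

Both names resolve to the same address, so existing callers of the original name keep working, while library code that calls the reserved name is unaffected by a user definition of the original name.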
User programs linked statically against
libc
may have extra symbol table entries for weak identifiers. Each of these
identifiers will have the same address as its reserved counterpart,
which will also be included in the symbol table. For example, if a
statically linked program simply called the
tzset()
function from
libc
,
the symbol table would
contain two entries for this call, as follows:
#
stdump -b a.out | grep tzset
18. (file 9) (4831850384) tzset Proc Text symref 23 (weakext)
39. (file 9) (4831850384) __tzset Proc Text symref 23
In this example,
tzset
is the weak identifier and
__tzset
is its strong counterpart. The
__tzset
identifier is the routine that will actually do the work.
User programs linked as shared should not see such additions to the symbol table because the weak/strong identifier pairs remain in the shared library.
Existing user programs that reference non-ANSI, nonreserved
identifiers from
libc
do not need to be recompiled because of these changes,
with one exception: user programs that depended on preemption of
these identifiers in
libc
will no longer be able to preempt them using the nonreserved names.
This kind of preemption is not ANSI compliant and is strongly discouraged.
However, the ability to preempt these identifiers still exists by using
the new reserved names (those preceded by two underscores).
These changes apply to the dynamic and static versions of
libc
:
/usr/shlib/libc.so
/usr/lib/libc.a
When debugging programs linked against
libc
,
references to weak symbols resolve to their strong counterparts, as in
the following example:
%
dbx a.out
dbx version 3.11.4
Type 'help' for help.
main: 4 tzset
(dbx) stop in tzset
[2] stop in __tzset
(dbx)
When the weak symbol
tzset
in
libc
is referenced, the debugger responds with the strong counterpart
__tzset
instead because the strong counterpart actually does the work.
The behavior of the
dbx
debugger is the same as if
__tzset
were referenced directly.