5 Symbol Table (V3.13)

One of the chief tasks of the compilation process is the production of a symbol table, which is a collection of data structures whose purpose is to store type, scope, and address information about program data. Compilers and assemblers create the symbol table. It is read and may be modified by linkers, profiling tools, and assorted object manipulation tools. It also contains information required for debugging.

For large applications, a single compilation can involve many program components, including source files, header files, and libraries. Data from all of these files must be described in the symbol table.

The Tru64 UNIX eCOFF symbol table, when present, comprises a large portion of the physical object file and is often considered a stand-alone entity. It is divided into numerous sections, including a header section that is used for navigation. The contents of the symbol table are shown in Figure 5-1.

Figure 5-1 Symbol Table Sections

The symbol table has a hierarchical design. The sections storing local symbols, local strings, relative file descriptors, procedure descriptors, line numbers, auxiliary symbols, and optimization symbols are divided into subtables and organized by file. Local symbols, local strings, and optimization symbols are further broken down by procedure. Figure 5-2 depicts this hierarchy.

Figure 5-2 Symbol Table Hierarchy

A particular symbol table may not contain all sections, for one of the following reasons:

Relative file descriptors are present in linked objects only.
The line number, auxiliary symbol and optimization symbol tables are produced only when debugging information is requested.
Symbol table information may be partially or entirely removed by post-processing tools.
Optimization symbols are not present in older object files (V3.12 and prior).

The function of each symbol table section is summarized below:

The symbolic header stores the sizes and locations of all other symbol table sections.
The line number table enables debuggers to map machine instructions to source code lines.
The procedure descriptor table contains call-frame information as well as pointers to a procedure's local symbols, line numbers and optimization entries.
The local symbol table describes procedures, static and local data, and user-defined types.
The external symbol table stores information about global symbols.
The relative file descriptor table contains a post-link file descriptor table index mapping for each file in the compilation.
The local and external string tables store local and external symbol names, respectively.
The file descriptor table stores the sizes and locations of each subtable produced for contributing source and include files. It also contains miscellaneous information about each file, such as the source language and the level of symbolic information.
The auxiliary symbol table contains data type information for local and external symbols.
The optimization symbols section stores procedure relative information, including extended source location information and optimized debugging information.

Several tools are available to view the contents of the symbol table. See the stdump(1), odump(1), and nm(1) man pages.

This chapter covers symbol table organization and usage, concentrating on debugging issues in particular. The version of the symbol table covered is V3.13. The dynamic symbol table built by the linker is discussed separately in Section 6.3.3.

5.1 New or Changed Symbol Table Features

Version 3.13 of the symbol table includes the following new or changed features:

64-bit auxiliary support (see Section 5.3.7.3)
Parameters with static storage and unallocated parameters (see Section 5.2.11)
New optimization symbols section (see Section 5.3.3)
Extended Source Location Information (see Section 5.3.2.2)
New representation for procedures with no text (see Section 5.3.6.1)
Modified variant record representation (see Section 5.3.8.11)
New function pointer representation (see Section 5.3.8.5)
Block symbol added for alternate entry prologue size (see Section 5.3.6.7)
Address of locally stripped FDRs set to addressNil (see Section 5.3.1.2)
Uplevel links for referencing local symbols in an outer scope (see Section 5.3.4.4)
New profile feedback information (see Section 5.3.5)
New representation for C++ namespaces (see Section 5.3.6.4)
Unnamed union or structure representation (see Section 5.3.8.3)

5.2 Structures, Fields and Values for Symbol Tables

Unless otherwise specified, all structures described in this section are declared in the header file sym.h, and all constants are defined in the header file symconst.h.

5.2.1 Symbolic Header (HDRR)

typedef struct {
        coff_ushort	magic;          
        coff_ushort	vstamp;         
        coff_int	ilineMax;       
        coff_int	idnMax;         
        coff_int	ipdMax;         
        coff_int	isymMax;        
        coff_int	ioptMax;        
        coff_int	iauxMax;        
        coff_int	issMax;         
        coff_int	issExtMax;      
        coff_int	ifdMax;         
        coff_int	crfd;           
        coff_int	iextMax;        
        coff_long	cbLine;         
        coff_off	cbLineOffset;   
        coff_off	cbDnOffset;     
        coff_off	cbPdOffset;     
        coff_off	cbSymOffset;    
        coff_off	cbOptOffset;    
        coff_off	cbAuxOffset;    
        coff_off	cbSsOffset;     
        coff_off	cbSsExtOffset;  
        coff_off	cbFdOffset;     
        coff_off	cbRfdOffset;    
        coff_off	cbExtOffset;    
} HDRR, *pHDRR;

SIZE - 144 bytes, ALIGNMENT - 8 bytes

Symbolic Header Fields

magic: To verify validity of the symbol table, this field must contain the constant magicSym, defined as 0x1992.
vstamp: Symbol table version stamp. This value consists of a major version number and a minor version number, as defined in the stamp.h header file:

Name	Value	Position in vstamp
MAJ_SYM_STAMP	3	High byte
MIN_SYM_STAMP	13	Low byte

See Section 5.1 for a list of symbol table features introduced with version V3.13.
ilineMax: Number of line number entries (if expanded).
idnMax: Obsolete.
ipdMax: Number of procedure descriptors.
isymMax: Number of local symbols.
ioptMax: Byte size of optimization symbol table.
iauxMax: Number of auxiliary symbols.
issMax: Byte size of local string table.
issExtMax: Byte size of external string table.
ifdMax: Number of file descriptors.
crfd: Number of relative file descriptors.
iextMax: Number of external symbols.
cbLine: Byte size of (packed) line number entries.
cbLineOffset: Byte offset to start of (packed) line numbers.
cbDnOffset: Obsolete.
cbPdOffset: Byte offset to start of procedure descriptors.
cbSymOffset: Byte offset to start of local symbols.
cbOptOffset: Byte offset to start of optimization entries.
cbAuxOffset: Byte offset to start of auxiliary symbols.
cbSsOffset: Byte offset to start of local strings.
cbSsExtOffset: Byte offset to start of external strings.
cbFdOffset: Byte offset to start of file descriptors.
cbRfdOffset: Byte offset to start of relative file descriptors.
cbExtOffset: Byte offset to start of external symbols.

General Notes

The size and offset fields describing symbol table sections must be set to zero if the section described is not present.

The cb*Offset fields are byte offsets from the beginning of the object file.

The i*Max fields contain the number of entries for a symbol table section. Legal index values for a symbol table section will range from 0 to the value of the associated i*Max field minus one.

For an explanation of packed and expanded line number entries, see the discussion in Section 5.3.2.2.
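
The header rules above (the magicSym check, the major/minor split of vstamp, and the legal index range) can be sketched in C as follows. This is a minimal sketch, not the sym.h interface: struct hdr_summary is a hypothetical stand-in that carries only the HDRR fields it needs, and the helper names are illustrative.

```c
#include <stdint.h>

#define MAGIC_SYM 0x1992  /* value of magicSym given in the text */

/* Hypothetical stand-in holding only the HDRR fields used here;
 * real code would use the HDRR declaration from sym.h. */
struct hdr_summary {
    uint16_t magic;
    uint16_t vstamp;
    int32_t  ipdMax;
};

/* The major version number is stored in the high byte of vstamp. */
static unsigned sym_major(uint16_t vstamp) { return (vstamp >> 8) & 0xffu; }

/* The minor version number is stored in the low byte of vstamp. */
static unsigned sym_minor(uint16_t vstamp) { return vstamp & 0xffu; }

/* Returns 1 if the header carries the expected magic number. */
static int sym_magic_ok(const struct hdr_summary *h)
{
    return h->magic == MAGIC_SYM;
}

/* Legal indices range from 0 to the associated i*Max value minus one. */
static int sym_index_ok(int32_t index, int32_t imax)
{
    return index >= 0 && index < imax;
}
```

A reader would typically call sym_magic_ok first, then compare sym_major and sym_minor against the versions it understands before trusting any size or offset field.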

5.2.2 File Descriptor Entry (FDR)

typedef struct fdr {
        coff_addr	adr;    
        coff_long	cbLineOffset;   
        coff_long	cbLine;         
        coff_long	cbSs;           
        coff_int	rss;            
        coff_int	issBase;        
        coff_int	isymBase;      
        coff_int	csym;         
        coff_int	ilineBase;    
        coff_int	cline;       
        coff_int	ioptBase;   
        coff_int	copt;      
        coff_int	ipdFirst;       
        coff_int	cpd;            
        coff_int	iauxBase;      
        coff_int	caux;        
        coff_int	rfdBase;    
        coff_int	crfd;           
        coff_uint	lang : 5;      
        coff_uint	fMerge : 1;  
        coff_uint	fReadin : 1;
        coff_uint	fBigendian : 1;
        coff_uint	glevel : 2;    
        coff_uint	fTrim : 1;    
        coff_uint	reserved: 5;  
        coff_ushort	vstamp;
        coff_uint       reserved2;
} FDR, *pFDR;

SIZE - 96 bytes, ALIGNMENT - 8 bytes

See Section 5.3.2.1 for related information.

File Descriptor Table Entry Fields

adr: Address of first instruction generated from this source file, which should be the same value as found in the PDR.adr field of the first procedure descriptor for this file. If no instructions are associated with this source file, this field should be set to 0. File descriptors that have been merged by source language in locally-stripped objects will have this field set to addressNil (-1).
cbLineOffset: Byte offset from start of packed line numbers to start of entries for this file.
cbLine: Byte size of packed line numbers for this file.
cbSs: Byte size of local string table entries for this file.
rss: Byte offset from start of file's local string table entries to source file name; set to issNil (-1) to indicate the source file name is unknown.
issBase: Start of local strings for this file.
isymBase: Starting index of local symbol entries for this file.
csym: Count of local symbol entries for this file.
ilineBase: Starting index of line number entries (if expanded) for this file.
cline: Count of line number entries (if expanded) for this file.
ioptBase: Byte offset from start of optimization symbol table to optimization symbol entries for this file.
copt: Byte size of optimization symbol entries for this file.
ipdFirst: Starting index of procedure descriptors for this file.
cpd: Count of procedure descriptors for this file.
iauxBase: Starting index of auxiliary symbol entries for this file.
caux: Count of auxiliary symbol entries for this file.
rfdBase: Starting index of relative file descriptors for this file.
crfd: Count of relative file descriptors for this file.
lang: Source language for this file (see Table 5-1).
fMerge: Informs linker whether this file can be merged.
fReadin: True if file was read in (as opposed to just created).
fBigendian: Unused.
glevel: Symbolic information level with which this file was compiled. This value is not the same as the user's idea of debugging levels. The value mapping from the user level (-g compiler switch value) to the symbol table value is:

Debug switch	-g0	-g1	-g2	-g3
glevel contents	2	1	0	3


fTrim: Unused.
vstamp: Symbol table version stamp (HDRR.vstamp) value from the original object module (.o file) that is recorded by the linker. The linker may combine objects that were compiled at different times and potentially contain different versions of the symbol table. In post-link objects, this value may or may not match the version stamp in the symbolic header. For pre-link objects, the values in this field and the symbolic header stamp should be the same.
reserved, reserved2: Must be zero.

General Notes

The i*Base fields provide the starting indices of this file's subtables within the symbol table sections. If the associated count fields are set to 0, the base fields will also be set to zero.

For an explanation of packed and expanded line number entries, see the discussion in Section 5.3.2.2.
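
Two of the file descriptor rules above lend themselves to a short C sketch: the mapping from the -g switch to the stored glevel value, and the location of the source file name. The sketch assumes that issBase is a byte offset into the local string table and that the local string table begins at HDRR.cbSsOffset; the function names are hypothetical and not part of sym.h.

```c
#include <stdint.h>

#define ISS_NIL (-1)  /* issNil, as given in the text */

/* Map the -g switch level (0..3) to the glevel value stored in the FDR.
 * Returns -1 for a level outside the documented range. */
static int glevel_from_switch(int g)
{
    static const int map[4] = { 2, 1, 0, 3 };
    return (g >= 0 && g <= 3) ? map[g] : -1;
}

/* File offset of the source file name, or -1 when the name is unknown.
 * Assumption: name = local strings start + FDR.issBase + FDR.rss. */
static int64_t source_name_offset(int64_t cbSsOffset, int32_t issBase,
                                  int32_t rss)
{
    if (rss == ISS_NIL)
        return -1;
    return cbSsOffset + issBase + rss;
}
```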

Table 5-1 Source Language (lang) Constants

Name	Value	Comment
`langC`	0
`langPascal`	1
`langFortran`	2
`langAssembler`	3
`langMachine`	4
`langNil`	5
`langAda`	6
`langPl1`	7
`langCobol`	8
`langStdc`	9
`langMIPSCxx`	10	Unused.
`langDECCxx`	11
`langCxx`	12
`langFortran90`	13	Not used by all compilers - `langFortran` might be used instead for both f77 and f90
`langBliss`	14
`langMax`	31	Number of language codes available

5.2.3 Procedure Descriptor Entry (PDR)

typedef struct pdr {
        coff_addr	adr;    
        coff_long	cbLineOffset;   
        coff_int	isym;          
        coff_int	iline;        
        coff_uint	regmask;     
        coff_int	regoffset;  
        coff_int	iopt;          
        coff_uint	fregmask;     
        coff_int	fregoffset;  
        coff_int	frameoffset;
        coff_int	lnLow;          
        coff_int	lnHigh;        
        coff_uint	gp_prologue : 8; 
        coff_uint	gp_used : 1;   
        coff_uint	reg_frame : 1;
        coff_uint	prof : 1;      
        coff_uint	reserved : 13; 
        coff_uint	localoff : 8; 
        coff_ushort	framereg;     
        coff_ushort	pcreg;         
} PDR, *pPDR;

SIZE - 64 bytes, ALIGNMENT - 8 bytes

See Section 5.3.4 for related information.

Procedure Descriptor Table Entry Fields

adr: The start address of this procedure. Set to addressNil (-1) for procedures with no text. This field may not be updated by the linker in symbol table versions prior to V3.13. To determine the procedure start address in pre-V3.13 symbol tables, use the algorithm described in Section 5.3.4.2.
cbLineOffset: Byte offset to the start of this procedure's line numbers from the start of the file descriptor entry (FDR.cbLineOffset).
isym: Start of local symbols for this procedure. This symbol is the symbol for the procedure (symbol type stProc). The name of the procedure can be obtained from the iss field of the symbol table entry. If the object is stripped of local symbol information, this field contains an external symbol table index for the procedure symbol's entry. If this procedure has no symbols associated with it, this field should be set to isymNil (-1). This situation occurs for a static procedure in an object stripped of local symbol information.
iline: Start of line number entries (if expanded) for this procedure. Set to ilineNil (-1) to indicate that this procedure does not have line numbers.
regmask: Saved general register mask.
regoffset: Offset from the virtual frame pointer to the general register save area in the stack frame.
iopt: Start of procedure's optimization symbol entries. Set to ioptNil (-1) to indicate that this procedure does not have optimization symbol entries.
fregmask: Saved floating-point register mask.
fregoffset: Offset from the virtual frame pointer to the floating-point register save area in the stack frame.
frameoffset: Size of the fixed part of the stack frame. The actual frame size can exceed this value. A routine can extend its own frame size for frame sizes larger than 2 GB or for dynamic stack allocation requests.
lnLow: Lowest source line number within this file for the procedure. This is typically the line number of the first instruction in the procedure, but not always. Code optimizations can rearrange or remove instructions making the first instruction map to a different line number.
lnHigh: Highest source line number within this file for the procedure. This field contains a value of -1 for alternate entry points, which is how an alternate entry point is identified.
gp_prologue: Byte size of gp prologue.
gp_used: Flag set if the procedure uses gp.
reg_frame: True if the procedure is a light-weight or null-weight procedure. See the General Notes section following these definitions for more details on procedure weights.
prof: True if the procedure has been compiled with -pg for gprof profiling.
reserved: Must be zero.
localoff: Bias value for accessing local symbols on the stack at run time.
framereg: Frame pointer register number.
pcreg: PC (Program Counter) register number.

General Notes:

For more information on call frames, see Section 5.3.4.1.

If the value of gp_prologue is zero and gp_used is 1, a gp prologue is present but was scheduled into the procedure prologue.

For an explanation of packed and expanded line number entries, see the discussion in Section 5.3.2.2.

A procedure may be heavy-, light-, or null-weight. The weight of a procedure can be determined from its descriptor by using the following guidelines:

Weight	Indications
Heavy	reg_frame is 0 and bit 26 of the register mask (regmask) is on
Light	reg_frame is 1 and regoffset is ra_save
Null	reg_frame is 1 and regoffset is 26

See the Calling Standard for Alpha Systems for details on the calling conventions for different weight procedures. Note that a calling routine does not need to know the weight of the routine being called.
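
The weight guidelines above, together with the lnHigh convention for alternate entry points, can be sketched as a small classifier. This is an illustrative sketch only: struct pdr_summary is a hypothetical stand-in for the PDR fields consulted, and it assumes that "regoffset is ra_save" means regoffset holds the number of a register other than 26 in which the return address is saved. Consult the Calling Standard for Alpha Systems before relying on that reading.

```c
#include <stdint.h>

#define RA_REG 26  /* register number 26, as used in the weight guidelines */

enum proc_weight { WEIGHT_HEAVY, WEIGHT_LIGHT, WEIGHT_NULL, WEIGHT_UNKNOWN };

/* Hypothetical stand-in with only the PDR fields consulted here. */
struct pdr_summary {
    uint32_t regmask;
    int32_t  regoffset;
    unsigned reg_frame;
    int32_t  lnHigh;
};

static enum proc_weight proc_weight_of(const struct pdr_summary *p)
{
    if (!p->reg_frame) {
        /* Heavy: reg_frame is 0 and bit 26 of regmask is on. */
        return (p->regmask & (1u << RA_REG)) ? WEIGHT_HEAVY : WEIGHT_UNKNOWN;
    }
    /* reg_frame is 1: a regoffset of 26 marks a null-weight procedure.
     * Assumption: any other value names the ra save register (light). */
    return (p->regoffset == RA_REG) ? WEIGHT_NULL : WEIGHT_LIGHT;
}

/* An alternate entry point is identified by lnHigh == -1. */
static int is_alternate_entry(const struct pdr_summary *p)
{
    return p->lnHigh == -1;
}
```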

5.2.4 Line Number Entry (LINER)

Line numbers are represented using two formats: packed and expanded. The packed format is a byte stream that can be interpreted as described in Section 5.3.2.2 to build an expanded table that maps instructions to source line numbers. The LINER type is used to refer to a single entry in the expanded table. It is declared as:

typedef int LINER, *pLINER;

A second, newer form of line number information is located in the optimization symbols section. See Section 5.2.10 and Section 5.3.2.2.

5.2.5 Local Symbol Entry (SYMR)

typedef struct {
        coff_long	value;       
        coff_int	iss;           
        coff_uint	st : 6;      
        coff_uint	sc  : 5;    
        coff_uint	reserved : 1; 
        coff_uint	index : 20; 
} SYMR, *pSYMR;

SIZE - 16 bytes, ALIGNMENT - 8 bytes

See Section 5.2.11, Section 5.3.4, and Section 5.3.8 for related information.

Local Symbol Table Entry Fields

value: A field that can contain an address, size, offset, or index. Its interpretation is determined by the symbol type and storage class combination, as explained in Section 5.2.11.
iss: Byte offset from the issBase field of a file descriptor table entry to the name of the symbol. If the symbol does not have a name, this field is set to issNil (-1). Generally, all user-defined symbols have names. A symbol without a name is one that has been created by the compilation system for its own use.
st: Symbol type (see Table 5-2).
sc: Storage class (see Table 5-3).
reserved: Must be zero.
index: An index into either the local symbol table or auxiliary symbol table, depending on the symbol type and class. The index is used as an offset from the isymBase field in the file descriptor entry for an entry in the local symbol table or an offset from the iauxBase field for an entry in the auxiliary symbol table. The index field may have a value of indexNil, which is defined as (long)0xfffff. This value is used to indicate that the index is not a valid reference.
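
The iss and index rules above can be sketched as two helpers that turn file-relative values into table-wide ones. This is a minimal sketch with hypothetical function names; it takes the FDR base fields as plain arguments instead of using the sym.h structures.

```c
#include <stdint.h>

#define INDEX_NIL 0xfffffu  /* indexNil, as given in the text */
#define ISS_NIL   (-1)      /* issNil, as given in the text */

/* Table-wide index for a symbol's index field, or -1 when the field
 * holds indexNil. Pass FDR.isymBase for a local symbol reference or
 * FDR.iauxBase for an auxiliary symbol reference. */
static int64_t abs_index(int32_t base, uint32_t index)
{
    if (index == INDEX_NIL)
        return -1;
    return (int64_t)base + index;
}

/* Byte offset of the symbol's name within the local string table,
 * or -1 when the symbol has no name. */
static int64_t sym_name_offset(int32_t issBase, int32_t iss)
{
    if (iss == ISS_NIL)
        return -1;
    return (int64_t)issBase + iss;
}
```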

The next two tables contain all defined values for the st and sc constants, along with short descriptions. However, these fields must be considered as pairs that have a limited number of possible pairings as explained in Section 5.2.11.

Table 5-2 Symbol Type (st) Constants

Constant	Value	Description
`stNil`	0	Dummy entry
`stGlobal`	1	Global variable
`stStatic`	2	Static variable
`stParam`	3	Procedure argument
`stLocal`	4	Local variable
`stLabel`	5	Label
`stProc`	6	Global procedure
`stBlock`	7	Start of block
`stEnd`	8	End of block, file, or procedure
`stMember`	9	Member of class, structure, union, or enumeration
`stTypedef`	10	User-defined type definition
`stFile`	11	Source file name
`stStaticProc`	14	Static procedure
`stConstant`	15	Constant data
`stBase`	17	Base class (for example, C++)
`stVirtBase`	18	Virtual base class (for example, C++)
`stTag`	19	Data structure tag value (for example, C++ class or struct)
`stInter`	20	Interlude (for example, C++)
`stModule`	22	Fortran90 module definition; not yet implemented
`stNamespace`	22	Namespace definition (for example, C++)
`stModview`	23	Modifiers for current view of given module; not yet implemented
`stUsing`	23	Namespace use (for example, C++ "using").
`stAlias`	24	Defines an alias for another symbol. Currently used only for namespace aliases.

Table 5-3 Storage Class (`sc`) Constants

Constant	Value	Description
`scNil`	0	Dummy entry
`scText`	1	Symbol allocated in the `.text` section
`scData`	2	Symbol allocated in the `.data` section
`scBss`	3	Symbol allocated in the `.bss` section
`scRegister`	4	Symbol allocated in a register
`scAbs`	5	Symbol value is absolute
`scUndefined`	6	Symbol referenced but not defined in the current module
`scUnallocated`	7	Storage not allocated for this symbol
`scTlsUndefined`	9	Undefined TLS symbol
`scInfo`	11	Symbol contains debugger information
`scSData`	13	Symbol allocated in the `.sdata` section
`scSBss`	14	Symbol allocated in the `.sbss` section
`scRData`	15	Symbol allocated in the `.rdata` section
`scVar`	16	Parameter passed by reference (for example, Fortran or Pascal)
`scCommon`	17	Common symbol
`scSCommon`	18	Small common symbol
`scVarRegister`	19	Parameter passed by reference in a register
`scVariant`	20	Variant record (for example, Pascal or Ada)
`scFileDesc`	20	File descriptor (for example, COBOL)
`scSUndefined`	21	Small undefined symbol
`scInit`	22	Symbol allocated in the `.init` section
`scReportDesc`	23	Report descriptor (for example, COBOL)
`scXData`	24	Symbol allocated in the `.xdata` section
`scPData`	25	Symbol allocated in the `.pdata` section
`scFini`	26	Symbol allocated in the `.fini` section
`scRConst`	27	Symbol allocated in the `.rconst` section
`scTlsCommon`	29	TLS unallocated data
`scTlsData`	30	Symbol allocated in the `.tlsdata` section
`scTlsBss`	31	Symbol allocated in the `.tlsbss` section
`scMax`	32	Maximum number of storage classes

5.2.6 External Symbol Entry (EXTR)

typedef struct {
        SYMR          asym;     
        coff_uint     jmptbl:1;    
        coff_uint     cobol_main:1;  
        coff_uint     weakext:1;    
        coff_uint     reserved:29; 
        coff_int      ifd;         
} EXTR, *pEXTR;

SIZE - 24 bytes, ALIGNMENT - 8 bytes

External Symbol Table Entry Fields

asym: External symbol table entry. This structure has the same format as a local symbol entry. The field interpretations differ somewhat:
value: Contains the symbol address for most defined symbols. See Section 5.2.11 for details.
iss: Byte offset in external string table to symbol name. Set to issNil (-1) if there is no name for this symbol.
st: Symbol type. See Table 5-2 for possible values.
sc: Storage class. See Table 5-3 for possible values.
reserved: Must be zero.
index: Can contain an index into the auxiliary symbol table for a type description or an index into the local symbol table pointing to a related symbol.
jmptbl: Unused.
cobol_main: Flag set to indicate that the symbol is a COBOL main procedure.
weakext: Flag set to identify the symbol as a weak external. See Section 6.3.4.2 for more details on weak symbols.
reserved: Must be zero.
ifd: Index of the file descriptor where the symbol is defined. Set to ifdNil (-1) for undefined symbols and for some compiler system symbols.

5.2.7 Relative File Descriptor Entry (RFDT)

The relative file descriptor table provides a post-link mapping of file descriptor indices. The purpose of this table is to minimize work for the linker, which does not update symbol table references to local symbols. This information is used to obtain the file offset used to bias local symbol indices. Because this table is also known as the File Indirect Table, two declarations are included in the sym.h header file, as shown here.

typedef int RFDT, *pRFDT;
typedef int FIT, *pFIT;

SIZE - 4 bytes, ALIGNMENT - 4 bytes

See Section 5.3.2.1 for related information.
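
The mapping described above can be sketched as a lookup that converts a file-relative descriptor index into a post-link file descriptor index. This sketch assumes that a relative index is applied as an offset from the referencing file's FDR.rfdBase and bounded by FDR.crfd, and that the index is used directly when no relative file descriptors exist (as the rfd field description in Section 5.2.8.2 suggests). The function name and the rfd_entry typedef are hypothetical.

```c
#include <stdint.h>

typedef int32_t rfd_entry;  /* stand-in for the RFDT typedef (int) */

/* Returns the post-link file descriptor index for relative index rfd,
 * or -1 if rfd is outside this file's relative file descriptors. */
static int32_t resolve_ifd(const rfd_entry *rfd_table, int32_t rfdBase,
                           int32_t crfd, int32_t rfd)
{
    if (rfd_table == 0 || crfd == 0)
        return rfd;  /* no mapping present: already a file descriptor index */
    if (rfd < 0 || rfd >= crfd)
        return -1;
    return rfd_table[rfdBase + rfd];
}
```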

5.2.8 Auxiliary Symbol Table Entry (AUXU)

The auxiliary symbol table entry is a 32-bit union. It is either interpreted as a TIR or RNDXR structure or as an integer value. See Section 5.3.7.3 for detailed instructions on reading the auxiliary symbols.

typedef union {
        TIR 		ti;             
        RNDXR		rndx;          
        coff_int	dnLow;        
        coff_int	dnHigh;      
        coff_int	isym;       
        coff_int	iss;       
        coff_int	width;    
        coff_int	count;   
} AUXU, *pAUXU;

SIZE - 4 bytes, ALIGNMENT - 4 bytes

See Section 5.3.7.3 for related information.

Auxiliary Symbol Table Entry Fields

ti: Type information record (TIR), as defined in Section 5.2.8.1.
rndx: Relative index into local or auxiliary symbols (RNDXR), as defined in Section 5.2.8.2.
dnLow: Lower bound of range or array dimension. For large structures, two of these fields can be used together to form one 64-bit number.
dnHigh: Upper bound of range or array dimension. For large structures, two of these fields can be used together to form one 64-bit number.
isym: For procedures (stProc or stStaticProc symbols), this field is an index into the local symbols. It is also used as an index into the relative file descriptors.
iss: Unused.
width: Width of a bit field or array stride in bits. Fortran compilers set the array stride to the array element size in bits. Two of these fields can be used together to form one 64-bit number.
count: Count of ranges for variant arm. This field name is only used within the type description of a variant block (stBlock, scVariant).

General Notes:

The fields dnLow, dnHigh, and width must all use either the 32-bit or the 64-bit representation when used together. For example, an array dimension cannot be specified with a 32-bit dnLow and a 64-bit dnHigh.

5.2.8.1 Type Information Record (TIR)

typedef struct {
        coff_uint	fBitfield : 1;
        coff_uint	continued : 1;
        coff_uint 	bt  : 6;     
        coff_uint 	tq4 : 4;
        coff_uint 	tq5 : 4;
        coff_uint 	tq0 : 4;
        coff_uint 	tq1 : 4;    
        coff_uint 	tq2 : 4;
        coff_uint 	tq3 : 4;
} TIR, *pTIR;

SIZE - 4 bytes, ALIGNMENT - 4 bytes

Type Information Record Entry Fields

fBitfield: Flag set if bit width is specified.
continued: Flag set to indicate that the type description is continued in another TIR record. This will happen if the type is represented with more than six type qualifiers.
bt: Basic type (see Table 5-4 and Section 5.3.7.1).
tq0, tq1, tq2, tq3, tq4, tq5: Type qualifiers (see Table 5-5 and Section 5.3.7.2). The lower-numbered tq fields must be used first, and all unneeded fields must be set to tqNil (0).
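
The ordering rule for type qualifiers (lower-numbered fields first, unused fields set to tqNil) can be sketched as a reader that collects the qualifiers in use. This is a minimal sketch: struct tir_fields is a hypothetical stand-in that holds already-extracted field values as plain integers, which sidesteps compiler-specific bit-field layout.

```c
#include <stddef.h>

#define TQ_NIL 0  /* tqNil, as given in the text */

/* Hypothetical stand-in for a TIR whose fields have been extracted. */
struct tir_fields {
    unsigned fBitfield, continued, bt;
    unsigned tq0, tq1, tq2, tq3, tq4, tq5;
};

/* Copies the qualifiers in use, in order tq0..tq5, into out[] and
 * returns how many were found. Scanning stops at the first tqNil. */
static size_t tir_qualifiers(const struct tir_fields *t, unsigned out[6])
{
    const unsigned tq[6] = { t->tq0, t->tq1, t->tq2,
                             t->tq3, t->tq4, t->tq5 };
    size_t n = 0;

    while (n < 6 && tq[n] != TQ_NIL) {
        out[n] = tq[n];
        n++;
    }
    return n;
}
```

When all six qualifiers are in use and the continued flag is set, the remaining qualifiers are found in a following TIR record, so a full reader would repeat this scan on that record.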

Table 5-4 Basic Type (`bt`) Constants

Constant	Value	Description
`btNil`	0	Undefined or void
`btAdr32`	1	Address
`btChar`	2	Character
`btUChar`	3	Unsigned character
`btShort`	4	Short (16 bits)
`btUShort`	5	Unsigned short (16 bits)
`btInt`	6	Integer (32 bits)
`btUInt`	7	Unsigned integer (32 bits)
`btLong32`	8	Long (32 bits)
`btULong32`	9	Unsigned long (32 bits)
`btFloat`	10	Floating point
`btDouble`	11	Double-precision floating point
`btStruct`	12	Structure or record
`btUnion`	13	Union
`btEnum`	14	Enumeration
`btTypedef`	15	Defined by means of a user-defined type definition
`btRange`	16	Range of values (for example, Pascal subrange)
`btSet`	17	Sets (for example, Pascal)
`btComplex`	18	Currently unused
`btDComplex`	19	Currently unused
`btIndirect`	20	Indirect definition; following `rndx` points to an entry in the auxiliary symbol table that contains a `TIR` (type information record)
`btFixedBin`	21	Fixed binary (for example, COBOL)
`btDecimal`	22	Packed or unpacked decimal (for example, COBOL)
`btPicture`	25	Picture (for example, COBOL)
`btVoid`	26	Void
`btPtrMem`	27	Currently unused
`btScaledBin`	27	Scaled binary (for example, COBOL)
`btVptr`	28	Virtual function table (for example, C++)
`btArrayDesc`	28	Array descriptor (for example, Fortran, Pascal)
`btClass`	29	Class (for example, C++)
`btLong64`	30	Long (64 bits)
`btLong`	30	Long (64 bits)
`btULong64`	31	Unsigned long (64 bits)
`btULong`	31	Unsigned long (64 bits)
`btLongLong`	32	Long long (64 bits)
`btULongLong`	33	Unsigned long long (64 bits)
`btAdr64`	34	Address (64 bits)
`btAdr`	34	Address (64 bits)
`btInt64`	35	Integer (64 bits)
`btUInt64`	36	Unsigned integer (64 bits)
`btLDouble`	37	Long double floating point (128 bits)
`btInt8`	38	Integer (64 bits)
`btUInt8`	39	Unsigned integer (64 bits)
`btRange_64`	41	64-bit range
`btProc`	42	Procedure or function
`btChecksum`	63	Symbol table checksum value stored in auxiliary record
`btMax`	64	Number of basic type codes

Table Notes:

btInt and btLong32 are synonymous.
btUInt and btULong32 are synonymous.
btLong, btLong64, btLongLong, btInt64, and btInt8 are synonymous.
btULong, btULong64, btULongLong, btUInt64, and btUInt8 are synonymous.

Table 5-5 Type Qualifier (`tq`) Constants

Constant	Value	Description
`tqNil`	0	No qualifier (placeholder)
`tqPtr`	1	Pointer
`tqProc`	2	Procedure or function (obsolete)
`tqArray`	3	Array
`tqFar`	4	32-bit pointer; used with the `-xtaso` emulation
`tqVol`	5	Volatile
`tqConst`	6	Constant
`tqRef`	7	Reference
`tqArray_64`	8	Large array
`tqHasLen`	9	Reserved
`tqShar`	10	Reserved
`tqSharArr_64`	11	Reserved
`tqMax`	16	Number of type qualifier codes

5.2.8.2 Relative Symbol Record (RNDXR)

typedef struct {
        coff_uint	rfd : 12;    
        coff_uint	index : 20; 
} RNDXR, *pRNDXR;

SIZE - 4 bytes, ALIGNMENT - 4 bytes

Relative Symbol Record Fields

rfd: Index into relative file descriptor table if it exists; otherwise, index into file descriptor table. This field may have a value of ST_RFDESCAPE, defined as 0xfff in the header file cmplrs/stsupport.h. This value is used to indicate that the next auxiliary entry, interpreted as an isym, contains the index.
index: Symbol index. Used as an offset from either FDR.isymBase or FDR.iauxBase, depending on context.
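
The ST_RFDESCAPE rule can be sketched as a reader that returns the file index for a relative symbol record and reports how many auxiliary entries it consumed. This is a sketch under stated assumptions: struct aux_view is a hypothetical stand-in holding one auxiliary entry already decoded both as an RNDXR and as an isym, and the function name is illustrative.

```c
#include <stdint.h>
#include <stddef.h>

#define ST_RFDESCAPE 0xfff  /* value given in the text */

/* Hypothetical decoded view of one auxiliary entry. */
struct aux_view {
    uint32_t rfd;    /* the entry read as RNDXR.rfd (12 bits) */
    uint32_t index;  /* the entry read as RNDXR.index (20 bits) */
    int32_t  isym;   /* the same entry read as AUXU.isym */
};

/* Returns the file index for the RNDXR at aux[i] and stores the number
 * of entries consumed in *consumed. Returns -1 if the table ends early. */
static int32_t rndx_file_index(const struct aux_view *aux, size_t n,
                               size_t i, size_t *consumed)
{
    if (i >= n)
        return -1;
    if (aux[i].rfd != ST_RFDESCAPE) {
        *consumed = 1;
        return (int32_t)aux[i].rfd;
    }
    if (i + 1 >= n)
        return -1;
    *consumed = 2;           /* the escape uses the next entry as well */
    return aux[i + 1].isym;  /* next entry, interpreted as an isym */
}
```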

5.2.9 String Table

The string table is composed of two parts: the local string table and the external string table. In the on-disk symbol table, the external strings follow the local strings. The local string table is present only for objects created with full debugging information; it is removed if an object is locally stripped.

The storage format for the string table is a list of null-terminated character strings. It should be treated as one long character array, not as an array of strings. Fields in the symbolic header and file descriptors represent string table sizes and offsets in bytes.
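
Because the string table is one long character array addressed by byte offset, a reader looks up names by offset and checks bounds itself. The following is a minimal sketch with hypothetical helper names; it works on any in-memory copy of a string table and its byte size.

```c
#include <stddef.h>
#include <string.h>

/* Returns the string that starts at byte offset off, or NULL if off is
 * out of range or the string is not terminated inside the table. */
static const char *strtab_get(const char *tab, size_t size, size_t off)
{
    if (off >= size)
        return NULL;
    if (memchr(tab + off, '\0', size - off) == NULL)
        return NULL;
    return tab + off;
}

/* Counts the null-terminated strings stored in the table. */
static size_t strtab_count(const char *tab, size_t size)
{
    size_t n = 0, off = 0;

    while (off < size) {
        const char *end = memchr(tab + off, '\0', size - off);
        if (end == NULL)
            break;  /* trailing bytes without a terminator */
        n++;
        off = (size_t)(end - tab) + 1;
    }
    return n;
}
```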

5.2.10 Optimization Symbol Entry (PPODHDR)

typedef struct {
        coff_uint	ppode_tag;
        coff_uint 	ppode_len;
        coff_ulong 	ppode_val;
} PPODHDR, *pPPODHDR;

SIZE - 16 bytes, ALIGNMENT - 8 bytes

See Section 5.3.3 for related information.

Optimization Symbol Entry Fields

ppode_tag: Identifies the kind of data described by this entry.
ppode_len: Indicates the size in bytes of the data that is found in the raw data area for this entry. When this field is zero, the entry's only data is the value stored in the ppode_val field.
ppode_val: This field is either a pointer to the entry's data or is itself the data. If ppode_len is nonzero, this field is a relative file offset from the beginning of the current Per-Procedure Optimization Descriptor (PPOD) to the applicable data area. If ppode_len is zero, this field contains the data for the entry.

Table 5-6 Optimization Tag Values

Name	Value	Description
PPODE_STAMP	1	Version number of the PPOD stored in `ppode_val`. The current `PPOD_VERSION` value is 1.
PPODE_END	2	End of entries for this PPOD
PPODE_EXT_SRC	3	Extended source line information
PPODE_SEM_EVENT	4	Semantic event information. (Reserved for future use.)
PPODE_SPLIT	5	Split lifetime information. (Reserved for future use.)
PPODE_DISCONTIG_SCOPE	6	Discontiguous scope information. (Reserved for future use.)
PPODE_INLINED_CALL	7	Inlined procedure call information. (Reserved for future use.)
PPODE_PROFILE_INFO	8	Profile feedback information.
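
The ppode_len and ppode_val rules can be sketched as a scan over one PPOD. This sketch assumes that the entries of a PPOD form a contiguous array starting at the beginning of the PPOD and terminated by a PPODE_END entry; struct ppod_entry is a stand-in that mirrors the PPODHDR layout, and the function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define PPODE_END 2  /* tag value from Table 5-6 */

/* Stand-in mirroring the PPODHDR layout (4 + 4 + 8 bytes). */
struct ppod_entry {
    uint32_t ppode_tag;
    uint32_t ppode_len;
    uint64_t ppode_val;
};

/* Returns the first entry with the given tag, or NULL if PPODE_END or
 * the entry limit is reached first. */
static const struct ppod_entry *ppod_find(const struct ppod_entry *ppod,
                                          size_t max, uint32_t tag)
{
    for (size_t i = 0; i < max; i++) {
        if (ppod[i].ppode_tag == tag)
            return &ppod[i];
        if (ppod[i].ppode_tag == PPODE_END)
            break;
    }
    return NULL;
}

/* Returns the entry's raw data area, located ppode_val bytes from the
 * start of the PPOD, or NULL when ppode_len is zero and ppode_val
 * itself holds the data. */
static const unsigned char *ppod_data(const unsigned char *ppod_start,
                                      const struct ppod_entry *e)
{
    return e->ppode_len ? ppod_start + e->ppode_val : NULL;
}
```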

5.2.11 Symbol Type and Class (st/sc) Combinations

Entries in the symbol table are primarily identified by the combination of their symbol type (st) and storage class (sc) values. Not all combinations are valid. Figure 5-3 indicates which combinations are currently in use.

Figure 5-3 st/sc Combination Matrix

Interpretation of storage class column labels:
    Ab. scAbs         RC. scRConst        TC. scTlsCommon
    BV. scBasedVar    RD. scRData         TD. scTlsData
    Bi. scBits        RI. scRegImage      TU. scTlsUndefined
    Bs. scBss         Re. scRegister      Ua. scUnallocated
    Co. scCommon      Rp. scReportDesc    Un. scUndefined
    Da. scData        SB. scSBss          US. scUserStruct
    FD. scFileDesc    SC. scSCommon       Va. scVar
    Fi. scFini        SD. scSData         VR. scVarRegister
    If. scInfo        SU. scSUndefined    Vt. scVariant
    In. scInit        Sy. scSymref        XD. scXData
    Ni. scNil         Te. scText
    PD. scPData       TB. scTlsBss 


              sc |ABBBC|DFFII|NPRRR|RRSSS|SSTTT|TTUUU|VVVX
    st           |bViso|aDifn|iDCDI|epBCD|UyeBC|DUanS|aRtD
    -------------+-----+-----+-----+-----+-----+-----+----
    stAlias      |     |   X |     |     |     |     |
    stBase       |     |   X |     |     |     |     |
    stBlock      |    X| X X |     | X   |  X  |     |  X
    stConstant   |X  X |X  X |  X  |  X X|     |     |
    stEnd        |    X| X X |     | X   |  X  |     |  X
    stExpr       |     |     |     |     |     |     |
    stFile       |     |     |     |     |  X  |     |
    stForward    |     |     |     |     |     |     |
    stGlobal     |   XX|X    |  XX |  XXX|X  XX|XX X |
    stInter      |     |   X |     |     |     |     |
    stLabel      |X  X |X X X| XXX |  X X|  XX |X X  |   X
    stLocal      |X  X |X X X| XXX |X X X|  XX |X X  |XX X
    stMember     |     | X X |     | X   |     |     |
    stModule     |     |     |     |     |     |     |
    stModview    |     |     |     |     |     |     |
    stNamespace  |     |   X |     |     |     |     |
    stNil        |     |     |     |     |     |     |
    stNumber     |     |     |     |     |     |     |
    stParam      |X  X |X  X |  XX |X X X|     |  X  |XX
    stProc       |     |   X |X    |     |  X  |   X |
    stRegReloc   |     |     |     |     |     |     |
    stSplit      |     |     |     |     |     |     |
    stStaParam   |     |     |     |     |     |     |
    stStatic     |   XX|X  X |  XX |  X X|   X |X    |
    stStaticProc |     |  X X|     |     |  X  |     |
    stStr        |     |     |     |     |     |     |
    stTag        |     |   X |     |     |     |     |
    stType       |     |     |     |     |     |     |
    stTypedef    |     |   X |     |     |     |     |
    stUsing      |     |   X |     |     |     |     |
    stVirtBase   |     |   X |     |     |     |     |

A symbol's type and class, taken together, determine the interpretation of the other fields in the symbol table entry. The same combination can be used for different purposes in different contexts. As a result, understanding a symbol entry may also require access to type information in the auxiliary table or to the source language information in the file descriptor.

The following list describes the contents of the value and index fields for each combination, with a brief explanation of the symbol's use. For many combinations, greater detail can be found in Section 5.3.7 and Section 5.3.8.

stGlobal, sc(S)Data/(S)Bss/RData/Rconst

The value field is the symbol's address.
The index field is an auxiliary table index or indexNil (if the auxiliary table is not present).
This symbol is a defined global variable.

stGlobal, scTlsData/TlsBss

The value field is the offset from the base of the object's TLS region.
The index field is an auxiliary table index or indexNil (if the auxiliary table is not present).
This symbol is a defined global TLS variable.

stGlobal, sc(S)Common/TlsCommon

The value field is the symbol's size in bytes.
The index field is an auxiliary table index or indexNil (if the auxiliary table is not present).
This symbol is a common.

stGlobal, sc(S)Undefined/TlsUndefined

The value field is zero in linked objects. In relocatable objects, the value field is ignored. (Some compilers store the size in bytes of the global variable in the value field.)
The index field is an auxiliary table index or indexNil (if the auxiliary table is not present).
This symbol is an undefined global variable.

stStatic, sc(S)Data/(S)Bss/RData/Rconst

The value field is the symbol's address.
The index field is an auxiliary table index.
This symbol is a defined static variable.

stStatic, scTlsData/TlsBss

The value field is an offset from the base of the object's TLS region.
The index field is an auxiliary table index.
This symbol is a defined static TLS variable.

stStatic, scCommon

The value field is zero.
The index field is an auxiliary table index.
This symbol is a Fortran common block.

stStatic, scInfo

The value field is zero.
The index field is an auxiliary table index.
This symbol is a C++ static data member.

stParam, scAbs

The value field is an offset from the virtual frame pointer.
The index field is an auxiliary table index.
This symbol is a parameter stored on the stack.

stParam, scRegister

The value field is the number of the register containing the parameter.
The index field is an auxiliary table index.
This symbol is a parameter stored in a register.

stParam, scVar

The value field is an offset from the virtual frame pointer to the parameter's address.
The index field is an auxiliary table index.
This symbol is a parameter stored on the stack. One level of indirection is required to access the parameter's value.

stParam, scVarRegister

The value field is the register number containing the address of the parameter.
The index field is an auxiliary table index.
This symbol is a parameter stored on the stack. One level of indirection is required to access the parameter's value.

stParam, scInfo

The value field is zero.
The index field is an auxiliary table index.
This symbol is a parameter of a C++ member function, function pointer definition, or procedure with no code.

stParam, sc(S)Data/(S)Bss/Rconst/Rdata

The value field is the address of the parameter.
The index field is an auxiliary table index.
This symbol is a static parameter.

stParam, scUnallocated

The value field is zero.
The index field is an auxiliary table index.
This is an unallocated parameter.

stLocal, scAbs

The value field is an offset from the virtual frame pointer.
The index field is an auxiliary table index.
This is a local variable stored on the stack.

stLocal, scRegister

The value field is the number of the register containing the variable.
The index field is an auxiliary table index.
This symbol is a local variable stored in a register.

stLocal, scVar

The value field is an offset from the virtual frame pointer to the symbol's address.
The index field is an auxiliary table index.
This symbol is a local variable stored on the stack. One level of indirection is required to access its value.

stLocal, scVarRegister

The value field is the register number containing the address of this variable.
The index field is an auxiliary table index.
This symbol is a local variable stored on the stack. One level of indirection is required to access its value.

stLocal, scUnallocated

The value field is zero.
The index field is an auxiliary table index.
This is an unallocated local variable.
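
As an illustration of the stParam and stLocal combinations above, the following sketch shows how a debugger might turn a symbol's value field into a storage location for the scAbs, scRegister, scVar, and scVarRegister classes. All type and function names here (sc_sketch, frame_sketch, locate) are hypothetical, the frame snapshot is a stand-in for real process state, and the 64-bit pointer size is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical storage-class tags; the real sc* constants and their
 * numeric values are defined by the system headers. */
typedef enum { SC_ABS, SC_REGISTER, SC_VAR, SC_VAR_REGISTER } sc_sketch;

/* Hypothetical snapshot of a stopped frame. */
typedef struct {
    const unsigned char *memory;  /* flat memory image indexed by address */
    uint64_t vfp;                 /* virtual frame pointer */
    uint64_t regs[32];            /* register contents */
} frame_sketch;

/* Where the variable's storage is: a register number or an address. */
typedef struct {
    int in_register;
    uint64_t where;
} location_sketch;

static uint64_t load64(const frame_sketch *f, uint64_t addr)
{
    uint64_t v;
    memcpy(&v, f->memory + addr, sizeof v);
    return v;
}

location_sketch locate(const frame_sketch *f, sc_sketch sc, int64_t value)
{
    location_sketch loc = { 0, 0 };
    assert(f != NULL);
    switch (sc) {
    case SC_ABS:            /* value is an offset from the frame pointer */
        loc.where = f->vfp + (uint64_t)value;
        break;
    case SC_REGISTER:       /* value is a register number */
        loc.in_register = 1;
        loc.where = (uint64_t)value;
        break;
    case SC_VAR:            /* frame slot holds the variable's address */
        loc.where = load64(f, f->vfp + (uint64_t)value);
        break;
    case SC_VAR_REGISTER:   /* register holds the variable's address */
        assert(value >= 0 && value < 32);
        loc.where = f->regs[value];
        break;
    }
    return loc;
}
```

For scVar and scVarRegister the returned address is the result of the one level of indirection mentioned above; the variable's value still has to be read from that address.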

stLocal, scText/Init/Fini/(S)Data/(S)Bss/Rconst/Rdata/TlsData/TlsBss

The value field is the address of the section indicated by the storage class.
The index field is indexNil.
These are special symbols inserted by the compilation system for shared objects. They are found in the external symbol table and their names are the section names (for example, .text or .init).

stLabel, scAbs

The value field is the symbol's value. This may be either a numeric constant or absolute address.
The index field is indexNil.
This symbol is a linker defined absolute symbol.

stLabel, scText/Init/Fini/(S|X|P|R)Data/(S)Bss/Rconst/TlsData/TlsBss

The value field is the label's value (an address).
The index field is indexNil.
This symbol is an allocated label. It can be associated with any raw data section of the object file.

stLabel, scUnallocated

The value field is zero.
The index field is indexNil.
This symbol is an unallocated label.

stProc, scNil

The value field is zero.
The index field is indexNil.
This is an external symbol.

stProc, scText

The value field is the procedure's address.
This symbol can occur in the external or local symbol table:
- In the local symbol table, the index field is an auxiliary table index.
- In the external symbol table, it is the local symbol index of the corresponding procedure symbol in the local symbol table, unless the file is stripped of local symbol information. If the file is locally stripped, the index field is indexNil.
This symbol is a defined procedure.

stProc, scUndefined

The value field is zero.
The index field is indexNil.
This symbol is an undefined procedure.

stProc, scInfo

The value field contains a value of:
- -1 (a procedure with no code)
- -2 (a function prototype or function pointer definition)
- A non-negative index into the virtual function table for this function, for a C++ virtual member function.
The index field is an auxiliary table index.
This symbol represents a procedure without code, a function prototype, or a function pointer. The value field is used to distinguish among these possibilities.

stBlock, scText

The value field depends on context:
- If this is the first stBlock,scText symbol following an stProc,scText symbol, the value is the byte offset from the procedure's address to the address of the first instruction beyond the end of the procedure's prologue.
- For a text block, it is the byte offset from the procedure's address to the starting instruction address of the block.
The index field is the local symbol index of the symbol following the matching stEnd. If this is the first stBlock,scText following an stProc,scText for an alternate entry point, the index field will be set to indexNil because the symbol will not have a matching stEnd symbol.
This symbol indicates the start of a block scope.

stBlock, scInfo

The value field depends on context:
- Size in bytes for a class, structure, or union
- Size of the underlying data type for an enumerated type
- Auxiliary table index for a variant record
- Zero for the block scope of a procedure with no code.
The index field is the local symbol index of the symbol following the matching stEnd.
This symbol indicates the start of a structure, union, or enumeration definition (in C; the C++ representation differs). It describes a variant arm if it is inside an stBlock,scVariant scope. This symbol is also used to define the block scope of a procedure with no code.

stBlock, scCommon

The value field is the size of the common block in bytes.
The index field is the local symbol index of the symbol following the matching stEnd.
This symbol is a scoping symbol for a Fortran common block. It occurs in the context of the synthesized file used to define a common block.

stBlock, scVariant

The value field is the local symbol index of the structure member whose value determines which variant range is used.
The index field is the local symbol index of the symbol following the matching stEnd.
This symbol occurs in the context of Pascal and Ada variant records. It indicates the start of the symbols for one variant.

stBlock, scFileDesc/scReportDesc

The value field is zero.
The index field is the local symbol index of the symbol following the matching stEnd.
This symbol occurs in COBOL only. It indicates the start of the file or report descriptor scope.

stEnd, scText

The value field depends on the type of scope it is ending. It is:
- The size in bytes of the procedure's text (for a procedure)
- Byte offset from a procedure's address to the start of the epilogue (for the outermost text block in a procedure)
- Byte offset from a procedure's address to the first instruction address beyond the end of the block (for a text block)
- Zero (for a file)
The index field is the local symbol index of the matching stBlock, stProc, or stFile.
This symbol ends a file, procedure, or text block scope.

stEnd, scInfo

The value field is zero.
The index field is the local symbol index of the matching stBlock or stNamespace.
If the matching symbol is an stBlock, this symbol ends a structure, union, enumeration, C++ member function definition, procedure with no code, or the block scope contained by a procedure with no code. If the matching symbol is an stNamespace, this symbol ends a namespace definition.

stEnd, scCommon

The value field is zero.
The index field is the local symbol index of the matching stBlock.
This symbol ends a Fortran common definition.

stEnd, scVariant

The value field is the same as that of the matching stBlock.
The index field is the local symbol index of the matching stBlock.
This symbol ends a variant record block.

stEnd, scFileDesc/scReportDesc

The value field is zero.
The index field is the local symbol index of the matching stBlock.
This symbol ends a file or report descriptor block.
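
The index fields of the stBlock and stEnd combinations above link the two ends of a scope, which lets a reader skip a nested scope in one step. The sketch below illustrates that relationship under simplifying assumptions: the type names and the three-value symbol type enumeration are hypothetical, indices are relative to the array passed in, and the indexNil case for alternate entry points is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced symbol type and symbol record. */
typedef enum { ST_OTHER, ST_BLOCK, ST_END } st_sketch;

typedef struct {
    st_sketch st;
    long index;   /* interpretation depends on st, as described above */
} sym_sketch;

/* Index of the next symbol at the same nesting level: an stBlock's
 * index field already names the symbol following its matching stEnd. */
size_t next_sibling(const sym_sketch *syms, size_t i)
{
    assert(syms != NULL);
    if (syms[i].st == ST_BLOCK)
        return (size_t)syms[i].index;
    return i + 1;
}

/* Check that the stBlock at index i and its stEnd point at each other. */
int block_is_paired(const sym_sketch *syms, size_t count, size_t i)
{
    assert(syms != NULL && i < count);
    if (syms[i].st != ST_BLOCK)
        return 0;
    long after = syms[i].index;
    if (after <= (long)i + 1 || (size_t)after > count)
        return 0;
    const sym_sketch *end = &syms[after - 1];
    return end->st == ST_END && (size_t)end->index == i;
}
```

Walking a file's symbols with next_sibling visits only the outermost symbols; descending into a scope is a matter of stepping to i + 1 instead.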

stMember, scInfo

The value field depends on the symbol's data type:
- The ordinal value (for an element of an enumerated type)
- Zero (for a namespace or union member)
- Bit offset from the beginning of the structure (for a C structure or C++ class member)
The index field is an auxiliary table index.
This symbol describes a data structure field or the member of a namespace. It is found inside a block defining a data structure (for example, class or struct) or a namespace definition block.

stMember, scFileDesc/scReportDesc

The value field is zero or one, depending on whether the symbol is local or external, respectively.
The index field is an auxiliary table index.
This symbol occurs in COBOL only. It is found inside a file descriptor or report descriptor block.

stTypedef, scInfo

The value field depends on the purpose of this symbol:
- Zero (for a user-defined type definition).
- The auxiliary table index of the next auxiliary entry after the start of the class definition (for a compiler inserted symbol). In effect, the value is the contents of the index field plus one.
The index field is an auxiliary table index.
This symbol is a user-chosen name for a data type. It also appears as a compiler-inserted symbol following the stTag, scInfo symbol for an empty C++ class or structure.

stFile, scText

The value field is zero.
The index field is the local symbol index of the symbol following the matching stEnd.
This symbol denotes the scoping block for a source file.

stStaticProc, scText

The value field is the procedure's address.
The index field is an auxiliary table index.
This symbol is a defined static procedure.

stStaticProc, scInit/Fini

The value field is the procedure address.
The index field is an auxiliary table index.
These combinations are used for the special symbols __istart and __fstart, which are inserted by the linker.

stConstant, scInfo

The value field is the value of the constant.
The index field is an auxiliary table index.
This symbol represents a named value (for example, Fortran PARAMETER).

stConstant, scAbs

The value field is the value of the constant.
The index field is an auxiliary table index.
This symbol represents a named value (for example, Fortran PARAMETER).

stConstant, sc(S)Data/(S)Bss/RData/Rconst

The value field is the symbol's address.
The index field is an auxiliary table index.
This symbol represents allocated constant data.

stBase, scInfo

The value field is the offset of the base class relative to a derived class.
The index field is an auxiliary table index.
This symbol is a C++ base class.

stVirtBase, scInfo

The value field is an index (starting at 1) of the base class run-time description in the virtual base class table. See Section 5.3.8.6.2.
The index field is an auxiliary table index.
This symbol is a C++ virtual base class.

stTag, scInfo

The value field is zero.
The index field is an auxiliary table index.
This symbol is a C++ class, structure, or union. Note that the representation for C structures and unions is different.

stInter, scInfo

The value field is zero.
The index field is an auxiliary table index.
This symbol is used in C++ to connect the definition of a member function with its prototype in the class definition context.

stNamespace, scInfo

The value field is zero.
The index field is the local symbol index of the symbol following the matching stEnd.
This symbol indicates the start of the symbols in a namespace definition.

stUsing, scInfo

The value field is zero.
The index field is an auxiliary table index.
This symbol specifies a C++ namespace (or portion thereof) that is being imported into another scope.

stAlias, scInfo

The value field is zero.
The index field is an auxiliary table index.
This symbol defines an alias for a C++ namespace.

Combinations may be valid in the local symbol table, the external symbol table, or both. Table 5-7 shows which combinations are valid in which table, based on the symbol type value and, where necessary, the storage class value. Where the storage class is shown as the wildcard character '*', only the combinations previously identified as valid apply.

Table 5-7 Valid Placement for st/sc Combinations

`st/sc` Combination	External Symbol Table	Local Symbol Table
`stNil, *`	`X`	`X`
`stGlobal, *`	`X`
`stStatic, *`		`X`
`stParam, *`		`X`
`stLocal, scSCN`¹	`X`
`stLocal, not scSCN`¹		`X`
`stLabel, *`	`X`	`X`
`stProc, scInfo`		`X`
`stProc, scText`	`X`	`X`
`stProc, scUndefined`	`X`
`stBlock, *`		`X`
`stEnd, *`		`X`
`stMember, *`		`X`
`stTypedef, *`		`X`
`stFile, *`		`X`
`stStaticProc, scText`		`X`
`stStaticProc, scInit/Fini`	`X`
`stConstant, *`	`X`	`X`
`stBase, *`		`X`
`stVirtBase, *`		`X`
`stTag, *`		`X`
`stInter, *`		`X`
`stNamespace, *`		`X`
`stUsing, *`		`X`
`stAlias, *`		`X`

Table Notes:

scSCN = scData, scSData, scBss, scSBss, scRConst, scRData, scInit, scFini, scText, scXData, scPData, scTlsData, scTlsBss, scTlsInit

5.3 Symbol Table Usage

5.3.1 Levels of Symbolic Information

Different levels of symbolic information can be stored with an object file. Compilers often provide options that allow the user to choose the desired level of symbolic information for their program. This choice may be influenced by size considerations and debugging needs. A trade-off exists between the benefit of saving space in the object file and the amount of information available to tools that consume symbolic information.

It is also possible to change the amount of symbolic information present in a program that has already been compiled and linked. Information can be added or deleted. Two of the most common and useful operations are locally stripping and fully stripping the symbol tables in executable files. Tools that modify linked executables, such as instrumentation tools and code optimizers, may rewrite parts of the symbol table to reflect changes that they made.

5.3.1.1 Compilation Levels

The representation of symbolic information supported by compilers can be broken down into four levels:

Minimal - Only information required for linking
Limited - Source file and line number information for profiling and limited debugging (stack-tracing)
Full - Complete debugging information for non-optimized code
Optimized - Debugging information for optimized code

These levels correspond to the system compiler switches -g0 (minimal), -g1 (limited), -g2 (full), and -g3 (optimized). Table 5-8 shows the symbol table sections that are produced by system compilers at each compilation level.

Table 5-8 Symbol Table Sections Produced at Various Compilation Levels

Symbol Table Section	Minimal	Limited	Full	Optimized
Symbolic header	Yes	Yes	Yes	Yes
File Descriptors	Yes	Yes	Yes	Yes
External Symbols	Yes	Yes	Yes	Yes
External Strings	Yes	Yes	Yes	Yes
Procedure Descriptors	Yes	Yes	Yes	Yes
Line Numbers	No	Yes	Yes	Yes
Relative File Descriptors	No	No	Yes	Yes
Optimization Symbols	No	Partial	Yes	Yes
Local Symbols	No	Partial	Yes	Yes
Local Strings	No	Partial	Yes	Yes
Auxiliary Symbols	No	Partial	Yes	Yes

The minimal level of symbolic information that may be produced during compilation includes only the symbol information required for the linker to function properly. This includes external symbol information that is needed to perform symbol resolution and relocation.

If the limited level of symbolic information is requested, line number entries are generated, but the auxiliary table will contain only external symbol entries. As at the minimal level, external symbols and procedure descriptors are available. In addition, local symbols for procedures (and the corresponding auxiliary symbols, optimization symbols, and local strings) are present. Limited symbolic information is sufficient to meet the needs of profiling tools. The information present at this level is a subset of that required for full debugger support.

If full symbolic information is included, all symbol table sections are produced in full. This level enables full debugging support with complete type descriptions for local and external symbols. Optimization is disabled.

Optimized symbolic information is designed to balance the aims of performance and debugging capabilities. This level supplies the same information as the full debugging option, but it also allows all compiler optimizations. As a result, some of the correlation is lost between the source code and the executable program.

On Tru64 UNIX systems, users can choose to compile their programs with any one of the four levels of symbolic information. The options -g0, -g1, and -g2 specify increasing levels of symbolic information. The system compiler's default is to produce the minimal level (-g0). Currently, debugging of optimized code (-g3) is not fully supported. See cc(1) for more details.

5.3.1.2 Locally Stripped Images

Objects can be produced with only global symbolic information stored in the symbol table. Selection of the -x option causes the linker to create a locally-stripped object. Reasons for stripping local symbolic information include reducing file size and limiting the amount of symbolic information available to end users of an application.

A locally-stripped object is very similar to an object produced with minimal symbolic information (see Section 5.3.1.1). The difference is the consolidation of file descriptors, which the linker does only for locally-stripped objects.

In a locally-stripped image, the file descriptors are included solely for the purpose of identifying source file languages. One file descriptor is present for each source language involved in the compilation. These file descriptors have their adr field set to addressNil, indicating that they cannot be used to identify text addresses.

The procedure descriptor table is present in full but is rearranged to group procedures by source language. All procedure descriptors for procedures written in a particular source language are thus contiguous, and they reflect the file descriptor's information.

External symbols are also present in a locally-stripped image. The file indices (ifd field) of the external symbols are updated to identify the generic file descriptor for the appropriate source language. The index fields are set to zero to indicate that no type information is available. External symbols with the storage class scNil are removed. These are debugging symbols that are not normally produced for minimal symbol tables.

Limited debugging is possible with locally-stripped objects. Because the procedure descriptors are retained, stack traces are possible. External symbol information can also be viewed, and language-dependent handling of symbols (for example, C++ name demangling) is preserved.

A linked executable file can be locally stripped at any time after its creation using the ostrip -x option. The output is the same as described above. This operation may also alter the raw data of the .comment section. See Chapter 7 for details.

5.3.1.3 (Fully) Stripped Images

Executable files may be fully stripped at any time after creation using either the strip command or the ostrip -s command. Stripping an executable will result in complete removal of the symbol table, including the symbolic header. The file header fields f_symptr and f_nsyms are set to zero to indicate that the file has been stripped.

This operation may also alter the raw data of the .comment section. See Chapter 7 for details.

5.3.2 Source Information

The final executable image for a program bears little resemblance to the source code files from which it was created. One of the principal functions of the symbol table is to track the relationship between the two so that the debugger is able to describe the resulting program in a way that the programmer can recognize.

5.3.2.1 Source Files

Much of the complication of source information stems from the "include" system. When a compilation involves several source files, there may be duplication of the header files included in each source file, or of the source files themselves. To avoid repetition of header file information in the linked object, the linker merges the input objects' included files wherever possible. Compilers mark file descriptors as mergeable or unmergeable. The linker then examines the input file descriptors and performs the merge whenever possible.

The linker considers two file descriptors to be mergeable if all of the following criteria are met:

The file descriptor fMerge bit is set in both (marked as mergeable by compiler).
Files have the same name.
Files are written in the same language.
Files contain the same number of local and auxiliary symbols.
Checksums match.
The checksums match if either:
1. Neither file's first auxiliary record is a btChecksum.
2. Both files' first auxiliary record is a btChecksum and they are identical.
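
The merge criteria above can be expressed as a single predicate. The sketch below is illustrative only: fd_sketch is a hypothetical summary of a file descriptor, not the real file descriptor layout, and the has_checksum and checksum members stand for "the first auxiliary record is a btChecksum" and that record's contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of the file descriptor data the criteria use. */
typedef struct {
    bool fMerge;             /* marked mergeable by the compiler */
    const char *name;        /* file name */
    int lang;                /* source language code */
    long csym;               /* number of local symbols */
    long caux;               /* number of auxiliary symbols */
    bool has_checksum;       /* first auxiliary record is a btChecksum */
    unsigned long checksum;  /* contents of that record, if present */
} fd_sketch;

/* Checksums match if neither file has one, or both have identical ones. */
bool checksums_match(const fd_sketch *a, const fd_sketch *b)
{
    if (!a->has_checksum && !b->has_checksum)
        return true;
    return a->has_checksum && b->has_checksum && a->checksum == b->checksum;
}

/* True only when every criterion in the list holds. */
bool fds_mergeable(const fd_sketch *a, const fd_sketch *b)
{
    assert(a != NULL && b != NULL);
    return a->fMerge && b->fMerge
        && strcmp(a->name, b->name) == 0
        && a->lang == b->lang
        && a->csym == b->csym
        && a->caux == b->caux
        && checksums_match(a, b);
}
```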

The role of the relative file descriptor (RFD) tables is to track file-relative information after merging. A relative file descriptor table entry maps the index of each file at compile time to its index after linking. After linking, local or auxiliary symbols must be accessed through the RFD table to obtain the updated file descriptor index. This mechanism is necessary because the indices in the local symbol table are not updated when files are merged.

Figure 5-4 is an example of the use of the relative file descriptor table.

Figure 5-4 Relative File Descriptor Table Example

For a symbol reference composed of a file index and symbol index (offset within file), the relative file descriptor table is used as follows:

1. Look up the given file index in the RFD table to get the updated file index.
2. Look up the updated file index in the (merged) file descriptor table to get the base of the symbols for that file.
3. Add the symbol index to the file's symbol base to locate the symbol entry.
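
The lookup steps above can be sketched in C as follows. This is an illustration under stated assumptions: merged_fdr_sketch and resolve_symbol are hypothetical names, isymBase stands for "base of symbols for that file", and rfd is taken to be the RFD entries that belong to the file in which the reference appears.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced file descriptor: only the symbol base is kept. */
typedef struct {
    long isymBase;   /* index of the file's first local symbol */
} merged_fdr_sketch;

/* Resolve a (file index, symbol index) reference to an index in the
 * merged local symbol table. Returns -1 for an out-of-range index. */
long resolve_symbol(const long *rfd, size_t rfd_count,
                    const merged_fdr_sketch *fds, size_t fd_count,
                    size_t file_index, long symbol_index)
{
    assert(rfd != NULL && fds != NULL);
    if (file_index >= rfd_count)
        return -1;
    long merged = rfd[file_index];          /* RFD gives the updated index */
    if (merged < 0 || (size_t)merged >= fd_count)
        return -1;
    /* The merged file descriptor gives the base; add the symbol index. */
    return fds[merged].isymBase + symbol_index;
}
```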

See Section 5.3.7.3 for the representation of relative indices in the auxiliary symbol table.

5.3.2.2 Line Number Information

For a debugger to be effective, a connection must be made between high-level-language statements in source files and the executable machine instructions in object files. Line number entries map executable instructions to source lines. This mapping allows a debugger to present to a programmer the line of source code that corresponds to the code being executed. The line number information is produced by the compiler and should be rewritten if an application such as an instrumentation tool or an optimizer modifies code.

In V3.13 of the Tru64 UNIX symbol table, line number information is emitted in two forms: one in the line number table and one in the optimization symbol table. (Section 5.3.3 describes the structure of the optimization symbol table.) The line number information found in the optimization symbol table is referred to as "extended source location information". This form was introduced in V3.13 symbol tables and augments the information in the line number table. If both forms are present in an object, the extended source location information is present only for procedures that cannot be described adequately by entries in the line number table.

5.3.2.2.1 The Line Number Table

Line number information is generated for each source file that contributes executable code to a program. Within each source file, line numbers are organized by procedure, in the order of appearance in the file. The line number symbol table section is produced only when a program is compiled with limited or greater symbolic information (see Section 5.3.1.1).

Figure 5-5 illustrates the organization of the line number table.

Figure 5-5 Line Number Table

The order outlined in Figure 5-5 is not guaranteed to match the ordering of file descriptors or procedure descriptors in those tables. To determine the bounds of the line number table entries for a specific procedure, fields in the associated file descriptor and procedure descriptors must be used. The starting offset for a procedure's line table entries is calculated directly from these fields. The ending offset can only be determined by finding the starting offset of the next procedure's entries in the line number table. An algorithm to identify the starting and ending line table offsets for a procedure follows.

IPD = index-of-procedure
IFD = index-of-file-containing-procedure

if (FDR[IFD].cbLine == 0 or
    (PDR[IPD].iline == ilineNil ))
    /* No line information for this procedure */

START_FILE_OFFSET = FDR[IFD].cbLineOffset
END_FILE_OFFSET = START_FILE_OFFSET + FDR[IFD].cbLine

START_PROC_OFFSET = START_FILE_OFFSET + PDR[IPD].cbLineOffset

NEXTIPD = -1
for (I = 0; I < FDR[IFD].cpd; I++)
    IPD2 = FDR[IFD].ipdFirst + I
    if (IPD2 != IPD and
        PDR[IPD2].iline != ilineNil and       /* No lines */
        PDR[IPD2].lnHigh != -1 and            /* Alt entry */
        PDR[IPD2].cbLineOffset > PDR[IPD].cbLineOffset)

        if (NEXTIPD == -1 or
            PDR[IPD2].cbLineOffset < PDR[NEXTIPD].cbLineOffset)

            NEXTIPD = IPD2

if (NEXTIPD == -1)
    /* IPD is the last procedure with line numbers in the file */
    END_PROC_OFFSET = END_FILE_OFFSET
else
    END_PROC_OFFSET = START_FILE_OFFSET + PDR[NEXTIPD].cbLineOffset

Alternate entry points have a starting line number, but they have no specific ending line number. Procedure descriptors for a procedure and each of its associated alternate entry points share a common end offset in the line number table. See Section 5.3.6.7 for more information on alternate entry points.
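
The bounds algorithm above can be sketched in C as follows. The structures are hypothetical reductions that keep only the fields the algorithm names, the value -1 for ilineNil is an assumption, and alternate entry points are recognized by lnHigh == -1 as in the pseudocode.

```c
#include <assert.h>
#include <stddef.h>

#define ILINE_NIL (-1)   /* assumed value of ilineNil */

/* Hypothetical reductions of the file and procedure descriptors. */
typedef struct {
    long cbLineOffset;   /* start of this file's line table entries */
    long cbLine;         /* size in bytes of those entries */
    size_t ipdFirst;     /* index of the file's first procedure */
    size_t cpd;          /* number of procedures in the file */
} line_fdr_sketch;

typedef struct {
    long iline;
    long lnHigh;
    long cbLineOffset;   /* offset within the file's line entries */
} line_pdr_sketch;

/* Compute [*start, *end) for procedure ipd. Returns 0 when the
 * procedure has no line information, 1 otherwise. */
int proc_line_bounds(const line_fdr_sketch *fd, const line_pdr_sketch *pdr,
                     size_t ipd, long *start, long *end)
{
    assert(fd != NULL && pdr != NULL && start != NULL && end != NULL);
    const line_pdr_sketch *p = &pdr[ipd];
    if (fd->cbLine == 0 || p->iline == ILINE_NIL)
        return 0;

    long next = -1;   /* smallest cbLineOffset greater than p's */
    for (size_t i = 0; i < fd->cpd; i++) {
        const line_pdr_sketch *q = &pdr[fd->ipdFirst + i];
        if (q == p || q->iline == ILINE_NIL || q->lnHigh == -1)
            continue;   /* self, no lines, or alternate entry point */
        if (q->cbLineOffset > p->cbLineOffset &&
            (next == -1 || q->cbLineOffset < next))
            next = q->cbLineOffset;
    }

    *start = fd->cbLineOffset + p->cbLineOffset;
    *end = (next == -1) ? fd->cbLineOffset + fd->cbLine
                        : fd->cbLineOffset + next;
    return 1;
}
```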

The line number table has two forms. The "packed" form is used in the object file. The "expanded" form is a more useful representation to programmers and can be derived algorithmically (or by API) from the packed form.

The packed line numbers are stored as bytes. Each single-byte packed entry consists of two parts: count and delta. The count is the number of instructions generated from a source line. The delta is the number of source lines between the current source line and the previous one that generated executable instructions.

Figure 5-6 shows how these two values are represented.

Figure 5-6 Line Number Byte Format

The four-bit count is interpreted as an unsigned value between 1 and 16 (0 means 1, 1 means 2, and so forth). A count of zero never needs to be represented, because a source line that generates no instructions has no line number entry.

The four-bit delta is interpreted as a signed value in the range -7 to +7. The reason for this is that code generators may produce instructions that are not in the same order as the corresponding source lines. Therefore, the offset to the "next" source line may be a forward or backward jump.

Either of these quantities may fall outside the permissible range. For a delta outside the range, an extended format exists (as shown in Figure 5-7).

Figure 5-7 Line Number 3-Byte Extended Format

For a count outside the range, one or more additional entries follow, with the delta set to zero.

If both fields are out of range, the delta is handled first. An extended-format delta representation is followed by an entry with the delta bits set to zero and the remainder of the count contained in the count value.
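
The encoding rules above can be illustrated with a small packing routine. This is a sketch, not the compiler's implementation: the function name is hypothetical, the extended delta is assumed to be a signed 16-bit value, and its byte order (high-order byte first) is inferred from the `88 00 0a` entry in Table 5-9.

```c
#include <assert.h>
#include <stddef.h>

/* Append the packed entries for one source line to out and return the
 * number of bytes written. out needs room for 3 + (count - 1) / 16
 * bytes. delta must fit in 16 signed bits (an assumption). */
size_t pack_line_entry(unsigned char *out, int delta, unsigned count)
{
    assert(out != NULL && count >= 1);
    assert(delta >= -32768 && delta <= 32767);

    size_t n = 0;
    unsigned first = count > 16 ? 16 : count;

    if (delta >= -7 && delta <= 7) {
        /* One byte: delta in the high nibble, count - 1 in the low. */
        out[n++] = (unsigned char)((((unsigned)delta & 0xFu) << 4)
                                   | (first - 1));
    } else {
        /* Extended format: delta nibble 0x8, then the 16-bit delta. */
        out[n++] = (unsigned char)(0x80u | (first - 1));
        out[n++] = (unsigned char)(((unsigned)delta >> 8) & 0xFFu);
        out[n++] = (unsigned char)((unsigned)delta & 0xFFu);
    }

    /* Remaining count goes into follow-on entries with delta zero. */
    count -= first;
    while (count > 0) {
        unsigned chunk = count > 16 ? 16 : count;
        out[n++] = (unsigned char)(chunk - 1);
        count -= chunk;
    }
    return n;
}
```

For example, a line with delta 0 that generates 20 instructions is written as two bytes, 0x0f (count 16) followed by 0x03 (delta 0, count 4).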

The packed line number format can be expanded to produce the instruction-to-source-line mapping that is needed for debugging. An algorithm to accomplish this transformation for a given procedure follows. The expanded line number array has a source line number entry for each instruction in the given procedure. The address of the first entry is the address recorded in the PDR.adr field. Subsequent entries correspond to contiguous sequential instruction addresses.

START_PROC_OFFSET = offset-of-procedure's-entries-in-line-table
END_PROC_OFFSET = offset-of-next-procedure's-line-table-entries

PACKED = HDRR.cbLineOffset + START_PROC_OFFSET
PACKED_END = HDRR.cbLineOffset + END_PROC_OFFSET
CURRENTLINE = PDR.lnLow
EXPANDED = ALLOCATE(number-of-instructions-in-procedure)

while (PACKED < PACKED_END)
    COUNT = (unsigned)(PACKED[0] & 0x0F) + 1
    DELTA = (signed 4-bit)((PACKED[0] & 0xF0) >> 4)

    if (DELTA == -8)              /* Escape value 1000: extended delta */
        DELTA = (signed)((PACKED[1] << 8) | PACKED[2])
        PACKED += 3               /* Extended entries occupy three bytes */
    else
        PACKED += 1

    if (current-offset-matches-offset-of-alternate-entry)
        CURRENTLINE = PDR.lnLow of alternate entry

    CURRENTLINE += DELTA

    while (COUNT-- > 0)
        *EXPANDED = CURRENTLINE
        EXPANDED++

The following source listing of a file named lines.c provides an example that shows how the compiler assigns line numbers:

1   #include <stdio.h>
2   main()
3   {
4       char c;
5
6       printf("this program just prints input\n");
7       for (;;) {
8          if ((c =fgetc(stdin)) != EOF) break;
9       /*   this is a greater than 7-line comment
10           * 1
11           * 2
12           * 3
13           * 4
14           * 5
15           * 6
16           * 7
17           */
18           printf("%c", c);
19      } /* end for */
20  } /* end main */

The compiler generates line numbers only for lines 2, 6, 8, 18, 19, and 20; the other lines are blank, contain only comments, or generate no instructions.

Table 5-9 shows the packed entries' interpretation for each source line.

Table 5-9 Line Number Example

Source Line	LINER contents	Interpretation
2	`03`	Delta 0, count 4
6	`44`	Delta 4, count 5
8	`28`	Delta 2, count 9
18 ¹	`87 00 0a`	Delta 10, count 8
19	`10`	Delta 1, count 1
20	`14`	Delta 1, count 5

Table Note:

¹ Extended format (delta is greater than 7 lines).

The compiler generates the following instructions for the example program:

  [lines.c:   2] 0x0:     ldah    gp, 1(t12)
  [lines.c:   2] 0x4:     lda     gp, -32592(gp)
  [lines.c:   2] 0x8:     lda     sp, -16(sp)
  [lines.c:   2] 0xc:     stq     ra, 0(sp)
  [lines.c:   6] 0x10:    ldq     a0, -32720(gp)
  [lines.c:   6] 0x14:    ldq     t12, -32728(gp)
  [lines.c:   6] 0x18:    jsr     ra, (t12), printf
  [lines.c:   6] 0x1c:    ldah    gp, 1(ra)
  [lines.c:   6] 0x20:    lda     gp, -32620(gp)
  [lines.c:   8] 0x24:    ldq     a0, -32736(gp)
  [lines.c:   8] 0x28:    ldq     t12, -32744(gp)
  [lines.c:   8] 0x2c:    jsr     ra, (t12), fgetc
  [lines.c:   8] 0x30:    ldah    gp, 1(ra)
  [lines.c:   8] 0x34:    lda     gp, -32640(gp)
  [lines.c:   8] 0x38:    and     v0, 0xff, t0
  [lines.c:   8] 0x3c:    stq     v0, 8(sp)
  [lines.c:   8] 0x40:    xor     t0, 0xff, t0
  [lines.c:   8] 0x44:    bne     t0, 0x6c
  [lines.c:  18] 0x48:    ldq     t2, 8(sp)
  [lines.c:  18] 0x4c:    sll     t2, 0x38, t2
  [lines.c:  18] 0x50:    sra     t2, 0x38, a1
  [lines.c:  18] 0x54:    ldq     a0, -32752(gp)
  [lines.c:  18] 0x58:    ldq     t12, -32728(gp)
  [lines.c:  18] 0x5c:    jsr     ra, (t12), printf
  [lines.c:  18] 0x60:    ldah    gp, 1(ra)
  [lines.c:  18] 0x64:    lda     gp, -32688(gp)
  [lines.c:  19] 0x68:    br      zero, 0x24
  [lines.c:  20] 0x6c:    bis     zero, zero, v0
  [lines.c:  20] 0x70:    ldq     ra, 0(sp)
  [lines.c:  20] 0x74:    lda     sp, 16(sp)
  [lines.c:  20] 0x78:    ret     zero, (ra), 1
  [lines.c:  20] 0x7c:    call_pal        halt

After applying the given algorithm, the following instruction-to-source mapping is obtained (each entry is formatted as instruction number, then source line number):

           0.    2         1.    2         2.    2
           3.    2         4.    6         5.    6
           6.    6         7.    6         8.    6
           9.    8        10.    8        11.    8
          12.    8        13.    8        14.    8
          15.    8        16.    8        17.    8
          18.   18        19.   18        20.   18
          21.   18        22.   18        23.   18
          24.   18        25.   18        26.   19
          27.   20        28.   20        29.   20
          30.   20        31.   20
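The expansion can be sketched in C as follows. This is a minimal illustration rather than a reference implementation: the function name is hypothetical, alternate entry points are not handled, and the extended delta is read high-order byte first, which is the order implied by the extended entry in Table 5-9.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand one procedure's packed line entries.
 * packed/len: the procedure's bytes in the line table; ln_low: PDR.lnLow.
 * Writes one line number per instruction into lines[] (capacity cap) and
 * returns the number of entries written. */
size_t expand_lines(const uint8_t *packed, size_t len, int ln_low,
                    int *lines, size_t cap)
{
    size_t i = 0, n = 0;
    int line = ln_low;

    while (i < len) {
        unsigned count = (packed[i] & 0x0Fu) + 1;      /* 1..16 */
        int delta = packed[i] >> 4;                    /* 0..15 */
        if (delta >= 8)
            delta -= 16;                               /* sign-extend 4 bits */
        i += 1;

        if (delta == -8) {                             /* extended delta */
            if (i + 2 > len)
                break;                                 /* truncated entry */
            delta = (packed[i] << 8) | packed[i + 1];  /* high byte first */
            if (delta >= 0x8000)
                delta -= 0x10000;                      /* sign-extend 16 bits */
            i += 2;
        }

        line += delta;
        while (count-- > 0 && n < cap)
            lines[n++] = line;
    }
    return n;
}
```

As a check, an illustrative byte stream constructed for this sketch with per-line instruction counts of 4, 5, 9, 8, 1, and 5 (03 44 28 87 00 0a 10 14, with PDR.lnLow equal to 2) expands to the 32-entry mapping shown above.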

Header files included in an object have no associated line numbers recorded in the symbol table. Line number information for included files containing source code is not supported.

5.3.2.2.2 Extended Source Location Information (ESLI)

The line number table does not correctly describe optimized code or programs with untraditional source files, resulting in images that are difficult to debug. Extended Source Location Information (ESLI) is intended to provide more information to enable debugging of optimized programs, including PC and line number changes, file transitions, and line and column ranges. ESLI is essentially a superset of the older line number table.

ESLI is stored in the optimization symbols section. This information is accessible on a per-procedure basis from the procedure descriptors. See Section 5.3.3 for more detail on accessing information in the optimization symbols section.

ESLI is a byte stream that can be interpreted in two modes: data mode or command mode. Currently, two formats are defined for data mode. These are designated as "Data Mode 1" and "Data Mode 2". Additional data modes may be defined as needed.

Figure 5-8 ESLI Data Mode Bytes

Data Mode 1 is the initial mode for a procedure's ESLI. Data Mode 1 is identical to the packed line number format with the exception of the interpretation of the delta escape value '1000' (which indicates a switch to command mode instead of an extended delta).

In Data Mode 2, each entry consists of two bytes. The first byte is identical to the encoding and interpretation of Data Mode 1. The second byte is an absolute column number (from 0 to 255), where column number 0 indicates that column information is missing or not meaningful for this entry. The escape from Data Mode 2 to command mode consists of a delta escape value set to '1000' and column number set to 0.

In command mode, each byte is either a command or a command parameter. For a command byte, the low-order six bits are a command code, and the two high bits are used as flags, as shown in Figure 5-9. The "mark" flag, if set, announces that a new state has been established. Several commands may be required to fully describe a new state. The "resume" flag, if set, indicates the end of command mode. The next byte following a command with "resume" set will be a data mode byte. The same data mode that was in effect prior to the escape to command mode will be resumed. See Table 5-10 for a complete list of commands.

Figure 5-9 ESLI Command Byte
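Splitting a command byte into its code and flags can be sketched as follows. The macro and type names are hypothetical, and the flag bit positions are an assumption inferred from the example bytes in Table 5-11 (0x86 is shown with "mark" set and 0x48 with "resume" set); the definitions in linenum.h are authoritative.

```c
#include <stdint.h>

/* Hypothetical names; bit positions inferred from Table 5-11. */
#define ESLI_CODE_MASK   0x3Fu   /* low-order six bits: command code */
#define ESLI_FLAG_RESUME 0x40u
#define ESLI_FLAG_MARK   0x80u

struct esli_cmd {
    unsigned code;    /* command code */
    int mark;         /* nonzero if a new state is announced */
    int resume;       /* nonzero if data mode resumes afterwards */
};

struct esli_cmd esli_decode_cmd(uint8_t byte)
{
    struct esli_cmd c;

    c.code = byte & ESLI_CODE_MASK;
    c.mark = (byte & ESLI_FLAG_MARK) != 0;
    c.resume = (byte & ESLI_FLAG_RESUME) != 0;
    return c;
}
```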

Command parameters are stored in LEB128 (Little Endian Base 128) format. See Section 1.4.6 for a description of this data representation. PC deltas are always expressed as machine instruction offsets and must be scaled by the size of a machine instruction before adding to the current PC. No other deltas need to be scaled.
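A parameter reader might look like the sketch below. It assumes the conventional LEB128 layout (seven payload bits per byte, least significant group first, high bit set on every byte except the last, and bit 6 of the final byte acting as the sign for signed values); Section 1.4.6 remains the authoritative description. The function names are hypothetical and the input is assumed to be well formed.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value starting at p and store the number of
 * bytes consumed in *used. Assumes the value fits in 64 bits. */
uint64_t leb_decode(const uint8_t *p, size_t *used)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;

    do {
        byte = p[i++];
        result |= (uint64_t)(byte & 0x7F) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 64);
    *used = i;
    return result;
}

/* Decode a signed LEB128 (SLEB) value in the same way, then sign-extend
 * from bit 6 of the final byte. */
int64_t sleb_decode(const uint8_t *p, size_t *used)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;

    do {
        byte = p[i++];
        result |= (uint64_t)(byte & 0x7F) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 64);
    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;
    *used = i;
    return (int64_t)result;
}
```

A decoded PC delta would then be multiplied by the instruction size (4 bytes on Alpha) before being added to the current PC.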

Table 5-10 shows how to interpret the bytes in command mode. These definitions can be found in the system header file linenum.h.

Table 5-10 ESLI Commands

Name	Value	Number of Parameters	Type of Parameters
`ADD_PC`	1	1	SLEB
`ADD_LINE`	2	1	SLEB
`SET_COL`	3	1	LEB
`SET_FILE`	4	1	LEB
`SET_DATA_MODE`	5	1	LEB
`ADD_LINE_PC`	6	2	SLEB, SLEB
`ADD_LINE_PC_COL`	7	3	SLEB, SLEB, LEB
`SET_LINE`	8	1	LEB
`SET_LINE_COL`	9	2	LEB, LEB

ADD_PC: Parameter is a signed value to add to the current PC value.
ADD_LINE: Parameter is a signed value to add to the current line number.
SET_COL: Parameter is an unsigned value that represents a new column number. The column number is used to associate the PC with a particular location within a source line. Column number parameters use a zero-based representation that must be adjusted by adding 1.
SET_FILE: Parameter is an unsigned value used to switch file context. This command is typically followed by a set_line command.
SET_DATA_MODE: Parameter is an unsigned value used to set current data mode. The only parameter values that are currently accepted are 1 and 2. Additional data modes may be defined in future releases.
ADD_LINE_PC: Both parameters are signed values. The first is added to the line number and the second is added to the PC.
ADD_LINE_PC_COL: The first two parameters are signed values and the third is an unsigned value. The first two are added to the line number and PC respectively. The third is used to set the column number.
SET_LINE: Parameter is an unsigned value that sets the current line number.
SET_LINE_COL: Both parameters are unsigned values. The first represents the line number and the second represents the column number.

A tool reading the ESLI must maintain the current PC value, file number, line number, and column. Taken together, these four values represent the current "state". Consumers must also keep track of the mode in effect to interpret the data properly. The following example shows the instructions for consuming ESLI for one procedure.

MODE = data mode 1
FILE = current file
LINE = PDR.lnLow
COLUMN = 0
PC = PDR.adr
STATE_TABLE++ = (FILE,LINE,COLUMN,PC)
ESLI = GET_ESLI(PDR.iopt)
for ppode_len bytes of ESLI do
    if (MODE == data mode 1 or MODE == data mode 2)
        if (ESLI.delta == escape)
            if (MODE == data mode 2)
                ESLI++                  /* skip the zero column byte */
            PUSH_MODE(MODE)
            MODE = command mode
        else
            LINE += ESLI.delta
            PC += 4 * (ESLI.count + 1)  /* count field 0 means 1 instruction */
            if (MODE == data mode 1)
                STATE_TABLE++ = (FILE,LINE,COLUMN,PC)
        ESLI++
    if (MODE == data mode 2)
        COLUMN = ESLI++
        STATE_TABLE++ = (FILE,LINE,COLUMN,PC)
    if (MODE == command mode)
        read all parameters
        update FILE, LINE, COLUMN and PC as required
        if (mark flag set)
            STATE_TABLE++ = (FILE,LINE,COLUMN,PC)
        if (resume flag set)
            MODE = POP_MODE()
        ESLI += number-of-bytes-read

Data encoded in ESLI can be represented in tabular format. The PC value and file, line and column numbers can be stored as a state table. The following example shows how to build this state table.

In this example ESLI will record line numbers for a routine that includes text from a header file.

Source listing for line1.c:

1   /* ESLI example using included source lines */
2   
3   main() {
4      char *msg;
5   
6      msg = (char *)0;
7   
8   #include "line2.h"
9   
10     printf("%s", msg);
11  }

Source listing for line2.h

1   msg = (char *)malloc(20);
2   /*
3    *
4    *
5    *
6    *
7    *
8    *
9    *
10   */
11  strcpy(msg, "Hello\n");

The compiler generates the following instructions for the example program:

      main:
[line1.c:   3] 0x1200011d0:     ldah    gp, 8192(t12)
[line1.c:   3] 0x1200011d4:     lda     gp, 28336(gp)
[line1.c:   3] 0x1200011d8:     lda     sp, -16(sp)
[line1.c:   3] 0x1200011dc:     stq     ra, 0(sp)
[line1.c:   3] 0x1200011e0:     stq     s0, 8(sp)
[line1.c:   6] 0x1200011e4:     bis     zero, zero, s0
[line2.h:   1] 0x1200011e8:     bis     zero, 0x14, a0
[line2.h:   1] 0x1200011ec:     ldq     t12, -32560(gp)
[line2.h:   1] 0x1200011f0:     jsr     ra, (t12)
[line2.h:   1] 0x1200011f4:     ldah    gp, 8192(ra)
[line2.h:   1] 0x1200011f8:     lda     gp, 28300(gp)
[line2.h:   1] 0x1200011fc:     bis     zero, v0, s0
[line2.h:  11] 0x120001200:     bis     zero, s0, a0
[line2.h:  11] 0x120001204:     lda     a1, -32768(gp)
[line2.h:  11] 0x120001208:     ldq     t12, -32600(gp)
[line2.h:  11] 0x12000120c:     jsr     ra, (t12)
[line2.h:  11] 0x120001210:     ldah    gp, 8192(ra)
[line2.h:  11] 0x120001214:     lda     gp, 28272(gp)
[line1.c:  10] 0x120001218:     ldq_u   zero, 0(sp)
[line1.c:  10] 0x12000121c:     lda     a0, -32760(gp)
[line1.c:  10] 0x120001220:     bis     zero, s0, a1
[line1.c:  10] 0x120001224:     ldq     t12, -32552(gp)
[line1.c:  10] 0x120001228:     jsr     ra, (t12)
[line1.c:  10] 0x12000122c:     ldah    gp, 8192(gp)
[line1.c:  10] 0x120001230:     lda     gp, 28244(gp)
[line1.c:  11] 0x120001234:     bis     zero, zero, v0
[line1.c:  11] 0x120001238:     ldq     ra, 0(sp)
[line1.c:  11] 0x12000123c:     ldq     s0, 8(sp)
[line1.c:  11] 0x120001240:     lda     sp, 16(sp)
[line1.c:  11] 0x120001244:     ret     zero, (ra)

The ESLI and its interpretation for the generated code are shown in the following table.

Table 5-11 ESLI Example

ESLI bytes (hex)	Mode	Command (M)ark (R)esume			State (F)ile (L)ine (C)olumn
		Code	M	R	PC (hex)	F	L	C
Initial State	Data1				`1200011d0`	`0`	`3`	`0`
`04`	Data1				`1200011e4`	`0`	`3`	`0`
`30`	Data1				`1200011e8`	`0`	`6`	`0`
`80`	Data1	Escape
`04 01`	Cmd	`set_file(1)`				`1`
`48 01`	Cmd	`set_line(1)`		X			`1`
`05`	Data1				`120001200`	`1`	`1`	`0`
`80`	Data1	Escape
`86 0a 06`	Cmd	`add_line_pc(10,6)`	X		`120001218`	`1`	`11`	`0`
`04 00`	Cmd	`set_file(0)`				`0`
`48 0a`	Cmd	`set_line(10)`		X			`10`
`06`	Data1				`120001234`	`0`	`10`	`0`
`16`	Data1				`120001250`	`0`	`11`	`0`
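The interpretation in Table 5-11 can be reproduced with a small consumer such as the following C sketch. All names are hypothetical, and several points are assumptions rather than statements of the format: the flag bits (0x80 for mark, 0x40 for resume) and the ADD_LINE_PC parameter order (line delta first) are inferred from the example rows above, SET_DATA_MODE is taken to select the mode used after the next resume, and the input is assumed to be well formed.

```c
#include <stddef.h>
#include <stdint.h>

struct esli_state { uint64_t pc; long file, line, col; };

/* Read one LEB128 parameter at s[*i]; sign-extend it when is_signed is set. */
static int64_t read_leb(const uint8_t *s, size_t *i, int is_signed)
{
    uint64_t v = 0;
    unsigned shift = 0;
    uint8_t b;

    do {
        b = s[(*i)++];
        v |= (uint64_t)(b & 0x7F) << shift;
        shift += 7;
    } while ((b & 0x80) && shift < 64);
    if (is_signed && shift < 64 && (b & 0x40))
        v |= ~(uint64_t)0 << shift;
    return (int64_t)v;
}

/* Consume one procedure's ESLI bytes, starting from state st. A state is
 * appended to out[] after each data-mode entry and after each command with
 * the mark flag set. Returns the number of states written (at most cap). */
size_t esli_consume(const uint8_t *s, size_t len, struct esli_state st,
                    struct esli_state *out, size_t cap)
{
    size_t i = 0, n = 0;
    int data_mode = 1, in_cmd = 0;

    while (i < len) {
        if (!in_cmd) {
            uint8_t b = s[i++];
            uint8_t colbyte = (data_mode == 2 && i < len) ? s[i++] : 0;

            if ((b >> 4) == 0x8) {              /* escape to command mode */
                in_cmd = 1;
                continue;
            }
            st.line += (b >> 4) < 8 ? (b >> 4) : (b >> 4) - 16;
            st.pc += 4u * ((b & 0x0Fu) + 1);    /* 4-byte instructions */
            if (data_mode == 2)
                st.col = colbyte;
            if (n < cap)
                out[n++] = st;
        } else {
            uint8_t cmd = s[i++];
            unsigned code = cmd & 0x3F;

            switch (code) {
            case 1: st.pc += (uint64_t)(4 * read_leb(s, &i, 1)); break; /* ADD_PC */
            case 2: st.line += (long)read_leb(s, &i, 1); break;     /* ADD_LINE */
            case 3: st.col = (long)read_leb(s, &i, 0) + 1; break;   /* SET_COL */
            case 4: st.file = (long)read_leb(s, &i, 0); break;      /* SET_FILE */
            case 5: data_mode = (int)read_leb(s, &i, 0); break;     /* SET_DATA_MODE */
            case 6: case 7:                     /* ADD_LINE_PC, ADD_LINE_PC_COL */
                st.line += (long)read_leb(s, &i, 1);
                st.pc += (uint64_t)(4 * read_leb(s, &i, 1));
                if (code == 7)
                    st.col = (long)read_leb(s, &i, 0) + 1;
                break;
            case 8: case 9:                     /* SET_LINE, SET_LINE_COL */
                st.line = (long)read_leb(s, &i, 0);
                if (code == 9)
                    st.col = (long)read_leb(s, &i, 0) + 1;
                break;
            default:
                return n;                       /* unknown command: stop */
            }
            if ((cmd & 0x80) && n < cap)        /* mark */
                out[n++] = st;
            if (cmd & 0x40)                     /* resume */
                in_cmd = 0;
        }
    }
    return n;
}
```

Fed the 18 bytes of the example with the initial state (PC 0x1200011d0, file 0, line 3, column 0), this sketch records six states whose PC, file, and line values correspond to the rows of Table 5-11 that list a PC.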

The handling of alternate entry points differs from the handling of main entry points. Procedure descriptors for alternate entry points are identified by a PDR.lnHigh value of -1. If the PC for an instruction maps to an alternate entry point, the following steps should be taken:

Find the procedure descriptor for the corresponding main entry. This is accomplished by searching back in the procedure descriptors until a PDR is found that is not an alternate entry (PDR.lnHigh is not -1).
Access the ESLI for the procedure.
Read the ESLI until the PC value matches the PDR.adr field of the alternate entry's procedure descriptor.
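The backward search in the first step can be sketched as follows. The structure is a reduced, hypothetical stand-in for the procedure descriptor that carries only the two fields used here; it is not the PDR layout itself.

```c
#include <stddef.h>

/* Reduced stand-in for a procedure descriptor. */
struct pdr_lite {
    unsigned long adr;   /* PDR.adr */
    int lnHigh;          /* PDR.lnHigh; -1 marks an alternate entry */
};

/* Return the index of the main-entry descriptor for pdrs[i], searching
 * backward past alternate entries. Returns i itself for a main entry. */
size_t main_entry_index(const struct pdr_lite *pdrs, size_t i)
{
    while (i > 0 && pdrs[i].lnHigh == -1)
        i--;
    return i;
}
```

The ESLI of the descriptor found this way is then read until the running PC equals the adr field of the alternate entry's descriptor.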

5.3.3 Optimization Symbols

The optimization symbols section gives individual producers and consumers the ability to communicate information about any aspect of the object file, in any form they choose. New information can be generated at any time with minimal coordination between all producers and consumers. In V3.13 of the symbol table, the optimization section may include extended source location information (see Section 5.3.2.2).

The optimization section is organized on a per-procedure basis. Each procedure descriptor has a pointer to the optimization symbols in the field PDR.iopt. If no optimization symbols are associated with the procedure, the field contains ioptNil. Otherwise, it contains the index of the first optimization symbol entry for this procedure. Consumers should access the optimization symbols through the procedure descriptors. The optimization section is not present in a locally-stripped object.

This section consists of a sequence of zero or more Per-Procedure Optimization Descriptions (PPODs), as shown in Figure 5-10. Each PPOD's internal structure consists of two parts:

A leading sequence of structured entries using a Tag-Length-Value model to describe subsequent raw data. The structure of the PPOD entry can be found in Section 5.2.10.
The raw data area.

Figure 5-10 Optimization Symbols Section

This section has the following alignment requirements:

Octaword (16-byte) alignment of the beginning of the section.
Octaword (16-byte) alignment of the beginning of the raw data area.
Octaword (16-byte) alignment of each PPOD.

Object file producers must produce either an empty optimization symbols section or a valid one. An empty one has the symbolic header fields cbOptOffset and ioptMax set to zero. If an optimization section is present, but a particular file does not contribute to it, the file descriptor field copt is set to zero. In this case, all procedure descriptors belonging to the file must have their iopt fields set to ioptNil.

Tools that both read and write object files must consume a valid optimization symbols section (if present in the input file) and produce an equivalent and valid section in their output files. If a tool does not know how to process the section contents, the section must be omitted from the output file. If a tool does know how to process portions of the optimization symbols, those portions may be modified and the rest should be removed. As usual, the linker is a special case. It concatenates input optimization symbols sections into one output section without reading or modifying any of the entries.

The format and flexible nature of this section are similar by design to the .comment section. The structures are the same size and contain the same fields (with different names), and the rules of navigation are the same. The primary difference is that the optimization section is broken down by procedure, whereas the comment section must be treated as a whole.

5.3.4 Run-Time Information

The symbol table contains information that debuggers must interpret to find symbols at run time. This section describes the information that the static symbol table structures provide. Algorithms for determining run-time symbol addresses are included.

5.3.4.1 Stack Frames

A stack frame is a run-time memory structure that is created whenever a procedure is called. The Calling Standard for Alpha Systems specifies the stack frame format and related code requirements. This section explains how to interpret procedure descriptor fields related to the stack frame.

Two types of stack frames are supported: fixed-size frames and variable-size frames. The variable frame format is used for procedures that dynamically allocate memory and for those with very large frames. Figure 5-11 shows a fixed-size frame and Figure 5-12 shows a variable-sized frame.

From the procedure descriptor, you can determine which type of stack frame the procedure has. The field PDR.framereg stores the frame pointer register number. If this field has a value of 30 ($sp), the stack frame is a fixed-size frame. If it has a value of 15 ($fp), the stack frame is a variable-size frame.

Figure 5-11 Fixed-Size Stack Frame

Figure 5-12 Variable-Size Stack Frame

For both types of stack frames, the value of PDR.frameoffset is the size of the fixed part of the stack frame. In the case of a fixed-size frame, it is the entire frame size. For a variable-sized frame, the entire frame size cannot be determined from the symbol table. The code may dynamically increase and decrease the size of the frame multiple times during procedure execution.

The virtual frame pointer represents the contents of the frame pointer register at procedure entry, prior to prologue execution. The (real) frame pointer is the contents of the frame pointer register after prologue execution. The difference between the virtual and real frame pointer values is the fixed frame size, which is subtracted from the $sp contents during the procedure prologue. Note that stack offsets recorded in the symbol table are relative to the virtual frame pointer, not the real value used at run time.
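The frame classification and the relationship between the two frame pointer values can be sketched as follows. The structure is a reduced, hypothetical stand-in that carries only PDR.framereg and PDR.frameoffset, and the function names are illustrative.

```c
#include <stdint.h>

enum { REG_FP = 15, REG_SP = 30 };   /* register numbers cited above */

/* Reduced stand-in for the PDR fields discussed in this section. */
struct pdr_frame {
    int framereg;       /* PDR.framereg */
    long frameoffset;   /* PDR.frameoffset: size of the fixed part */
};

/* Nonzero for a fixed-size frame, zero for a variable-size frame. */
int frame_is_fixed_size(const struct pdr_frame *p)
{
    return p->framereg == REG_SP;
}

/* Virtual frame pointer = real frame pointer + fixed frame size.
 * real_fp is the frame pointer register value after the prologue. */
uint64_t virtual_frame_pointer(const struct pdr_frame *p, uint64_t real_fp)
{
    return real_fp + (uint64_t)p->frameoffset;
}
```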

The contents of the frame pointer register are used at run time as the base address for accessing data, such as parameters and local variables, on the stack. See Section 5.3.4.3 for details.

5.3.4.2 Procedure Addresses

The PDR.adr is reliably updated by the linker starting with version V3.13 of the symbol table. To determine the procedure start address for a given PDR in prior versions of the symbol table, the following algorithm is recommended:

if (HDRR.vstamp >= 0x30D || PDR.isym == isymNil) 
    return(PDR.adr)
else
    foreach FDR in HDRR
        foreach PDR in FDR
            if PDR matches
                if (FDR.csym == 0)  /* Use external symbol */
                    return (EXTR[PDR.isym].asym.value)
                else                /* Use local symbol */
                    return (SYMR[FDR.isymbase + PDR.isym].value)

If local symbol information is present for the given PDR, the isym field identifies the local symbol table entry that contains the start address of the procedure. If no local symbol information is present, the isym field identifies the external symbol table entry containing the start address of the procedure. If no symbol information is present for the PDR, the isym field is set to isymNil and the adr field will contain a reliable start address.
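For a caller that already knows which file descriptor owns the PDR, the selection logic reduces to the sketch below. The structures are reduced, hypothetical stand-ins; local_values[i] stands for SYMR[i].value, ext_values[i] stands for EXTR[i].asym.value, and the value used for isymNil (-1) is an assumption of this sketch.

```c
#include <stdint.h>

#define ISYM_NIL (-1L)          /* assumed value of isymNil */
#define VSTAMP_V3_13 0x30D      /* first version with a reliable PDR.adr */

struct fdr_lite { long isymBase; long csym; };
struct pdr_lite { uint64_t adr; long isym; };

/* Return the start address of the procedure described by pdr.
 * fdr must be the file descriptor that owns pdr. */
uint64_t proc_start_address(unsigned vstamp,
                            const struct fdr_lite *fdr,
                            const struct pdr_lite *pdr,
                            const uint64_t *local_values,
                            const uint64_t *ext_values)
{
    if (vstamp >= VSTAMP_V3_13 || pdr->isym == ISYM_NIL)
        return pdr->adr;
    if (fdr->csym == 0)
        return ext_values[pdr->isym];                 /* external symbol */
    return local_values[fdr->isymBase + pdr->isym];   /* local symbol */
}
```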

5.3.4.3 Local Symbol Addresses

Local variables and parameters may be stored in registers or on the stack. Those stored in registers (identified by a storage class of scRegister) do not have addresses. For local variables and parameters with addresses, this section explains how to calculate their run-time locations from the symbol table information.

To calculate the run-time address for a local variable (stLocal) based on its symbol table value:

Frame pointer - PDR.localoff + SYMR.value

To calculate the run-time address for a parameter (stParam) based on its symbol table value:

Frame pointer - argument_home_area_size + SYMR.value

The argument home area is a portion of the stack frame designated for parameter storage. See Figure 5-11 for an illustration. For historical reasons, the size of this area is always 48 bytes.

The calculations above must be performed at run time when the actual frame pointer value is known. Note that the value becomes valid only after the procedure prologue has executed.

To calculate the locations based on static information, convert the symbol's value to an offset from the real frame pointer:

Local:

PDR.frameoffset - PDR.localoff + SYMR.value

Parameter:

PDR.frameoffset - 48 + SYMR.value

The resulting offsets are always positive values because the frame pointer contains the address of the lowest memory in the fixed part of the stack frame at run time.
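The two static conversions can be written directly as C helpers. The function names are hypothetical, and the sample values mentioned afterwards are illustrative rather than taken from a real object file.

```c
#define ARG_HOME_AREA_SIZE 48   /* fixed size of the argument home area */

/* Offset of a local variable (stLocal) from the real frame pointer. */
long local_offset_from_real_fp(long frameoffset, long localoff, long value)
{
    return frameoffset - localoff + value;
}

/* Offset of a parameter (stParam) from the real frame pointer. */
long param_offset_from_real_fp(long frameoffset, long value)
{
    return frameoffset - ARG_HOME_AREA_SIZE + value;
}
```

With an illustrative PDR.frameoffset of 96 and PDR.localoff of 16, a local whose SYMR.value is -8 would lie 72 bytes above the real frame pointer, and a parameter whose SYMR.value is 0 would lie 48 bytes above it.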

5.3.4.4 Uplevel Links

An uplevel link is the real frame pointer of an ancestor of a nested routine. The routine nesting may be a feature of the language (such as Pascal), or the nesting may occur in optimized code which has been decomposed for parallel execution into smaller routines. Uplevel links provide debuggers a method of finding all local symbols associated with the ancestor routine.

When a procedure is passed a static link, that static link will be represented within the scope of the procedure definition as a local automatic symbol with a special name beginning with "__StaticLink.". The lifetime of this symbol begins after the procedure prologue has been executed.

The static link symbol will occur between the procedure's parameter definitions and the first stBlock symbol.

The full name of the symbol will be "__StaticLink." followed by a positive decimal integer with no leading zeros. This integer value identifies the number of levels up the ancestor tree the static link points to.

For example, if the name is "__StaticLink.3" it will contain the static link of the procedure in which it is defined, and that procedure's static link points to a stack frame that is three levels up in the procedure's ancestor tree, the great-grandfather of the procedure.

Figure 5-13 Representation of Uplevel Reference

Debuggers of Tru64 UNIX object files need to use the uplevel link information to determine which symbols are visible at a location in the program and to compute the addresses of local symbols in ancestor routines. When the debugger needs the current value or address of a name that might be defined as an uplevel reference, two separate actions may be required: finding the procedure that defines the currently visible instance of that name, and finding the address of the currently visible instance of that name. If only type information is required, finding the procedure that defines the name may be sufficient.

Finding the defining procedure is accomplished by repeatedly looking up the name in the local symbol table of a chain of procedures that extends from the current procedure through its chain of ancestors until either the name is found in a procedure or the end of the chain of ancestors is reached without finding the name. If this search terminates without finding the name, the debugger should conclude that the name is not visible by uplevel reference at the current location in the program.

When searching for the desired procedure, the debugger should count how many levels in the ancestor chain were traversed before finding the name. If zero levels were traversed, the name is defined within the current procedure and is not an uplevel reference. The number of levels traversed is assumed to be in the variable LevelsToGo in the algorithm below.

Finding the address for the name involves locating static link values and dereferencing them with appropriate offsets. Basically, while the number of levels to be traversed is greater than zero, find the static link symbol for the current level and obtain its value. Finally, add the desired symbol's offset from the real frame pointer to the final static link value.

The recommended algorithm for finding the address is as follows:

LevelsToGo = <from name lookup above>
NewProc = CurrentProcedure
NewFrame = FramePointerValue(CurrentProcedure)
Failed = false
while (LevelsToGo > 0 && !Failed)
    StaticLink = FindStaticLinkSym(NewProc)
    if (StaticLink == NULL)
        Failed = true
    else
        NewFrame = *(NewFrame + StaticLink->symbol.offset)
        Levels = StaticLinkLevels(StaticLink)
        LevelsToGo = LevelsToGo - Levels
        for (; Levels > 0; Levels--)
            NewProc = NewProc->proc.parent

If Failed is true after executing this algorithm, required information about static links is missing in the symbol table, and an error has occurred. If LevelsToGo ends up less than zero, the optimizer's static link optimization has eliminated a static link level that would be needed to compute the address of the name. It is recommended that debuggers inform the user that optimization prevents the debugger from computing the address of the name.

If Failed is false and LevelsToGo is equal to zero, the address for the currently visible instance of the name is NewFrame plus the offset of the name with respect to the real frame pointer for NewProc.

The function StaticLinkLevels returns the integer at the end of the name for the indicated static link symbol.
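A possible implementation of StaticLinkLevels, operating on the symbol's name, is sketched below. The function name and the choice to return 0 for a malformed name are assumptions of this sketch; the naming rule itself (the "__StaticLink." prefix followed by a positive decimal integer with no leading zeros) is the one given above.

```c
#include <stdlib.h>
#include <string.h>

/* Return the level count encoded in a static link symbol name such as
 * "__StaticLink.3", or 0 if the name is not a well-formed static link name. */
int static_link_levels(const char *name)
{
    static const char prefix[] = "__StaticLink.";
    size_t plen = sizeof prefix - 1;
    const char *digits;
    char *end;
    long n;

    if (strncmp(name, prefix, plen) != 0)
        return 0;
    digits = name + plen;
    if (digits[0] < '1' || digits[0] > '9')   /* positive, no leading zeros */
        return 0;
    n = strtol(digits, &end, 10);
    if (*end != '\0' || n > 1000000)          /* trailing text or absurd depth */
        return 0;
    return (int)n;
}
```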

5.3.4.5 Finding Thread Local Storage (TLS) Symbols

This section explains how to interpret symbolic information for TLS symbols (identified by a storage class of scTlsdata or scTlsbss). See Section 3.3.9 or the Programmer's Guide for general information on TLS.

A TLS symbol's value contains its offset from the start of the TLS region for that object. This offset can be used at process execution time to determine the address of the TLS symbol for a particular thread.

A debugger can calculate TLS symbol addresses by looking up the address of the TLS region using run-time structures and adding the offset of the TLS symbol to that address. The following formula can be used to calculate TLS symbol addresses.

TLS sym address = *(TEB.TSD + __tlskey) + SYMR.value

A detailed description of this formula follows:

Get the address of the Thread Environment Block (TEB).
Get the address of the Thread Specific Data (TSD) array from the TEB structure.
Get the offset of the TLS pointer in the TSD array.

This offset is normally stored in a .lita or .got entry. This value should be accessed using the symbol __tlskey. Although __tlskey is a label symbol, no ampersand is used in this context because the value that the label points to is being retrieved. The address of __tlskey will need to be adjusted by the address mapping displacement in the same manner that the debugger adjusts addresses of text and data symbols.

For non-shared objects, the .lita entry contains the constant offset (2048). This offset identifies the first and only TSD slot (256) that will be allocated for the TLS pointer.

For shared objects, the .got entry labeled by __tlskey is initially 0, indicating that the TSD slot has not been allocated yet. After the object's initialization routines have run, a TSD key will be allocated and the .got entry will contain its offset.
Get the TLS pointer value. The TLS pointer is a 64-bit address set to the start of the TLS Region.
Calculate the address of the TLS symbol by adding the offset of the TLS symbol to the TLS pointer value.
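The last three steps can be sketched as follows, assuming the debugger has already located the thread's TSD array and copied it into local memory, and has read the value stored at __tlskey. The function name is hypothetical, and returning 0 when the key is still unallocated is a convention of this sketch only.

```c
#include <stdint.h>

/* tsd:       local copy of the thread's TSD array (64-bit slots)
 * tlskey:    byte offset read from the entry labeled __tlskey
 * sym_value: SYMR.value of the TLS symbol (offset within the TLS region)
 * Returns the symbol's address, or 0 if the TSD slot is not allocated. */
uint64_t tls_symbol_address(const uint64_t *tsd, uint64_t tlskey,
                            uint64_t sym_value)
{
    uint64_t tls_pointer;

    if (tlskey == 0)
        return 0;                                /* key not yet allocated */
    tls_pointer = tsd[tlskey / sizeof(uint64_t)];
    return tls_pointer + sym_value;
}
```

For a non-shared object the key is the constant 2048, which selects slot 256 of the array.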

5.3.5 Profile Feedback Data

Profile feedback data is stored in entries in the optimization symbols table with tag type PPODE_PROFILE_INFO. The data contained in this section is intended for Compaq internal use only. It contains execution profiling feedback used by compilers and the om utility.

Profile feedback data contains relative file descriptor and local symbol table indexes. If an object tool removes, adds, or rearranges relative file descriptors or local symbol table entries, it must also remove all optimization symbol table entries, including the profile feedback data.

5.3.6 Scopes

From a user-program's point of view, an identifier's scope determines its visibility in different parts of the program. Programming languages provide facilities for declaring and defining names of procedures, variables and other program components inside various scoping levels. This section briefly discusses the concept of scope and then explains how it is represented in the symbol table. References are made to structures in the auxiliary symbol table; see Section 5.3.7.3 for details.

Generally speaking, the four main scoping levels in a program are block scope, procedure scope, file scope, and program scope. Most programming languages have constructs to implement at least these scoping levels. Figure 5-14 shows the hierarchy of these scopes.

Figure 5-14 Basic Scopes

Names with block scope can only be referenced inside the declaring block. Blocks are delimited by begin and end markers, the syntax of which varies among languages.

Names with procedure scope are only recognized inside their enclosing subroutines. For instance, the names of formal parameters and local variables declared inside a procedure are accessible only to that procedure's executable statements.

Names with file scope can be referenced by any instruction within the file where they are declared. A file can be composed of procedures and data external to any procedure. Both external data names and procedure names can have file scope or program scope. Note that in a compilation involving only a single file or in a compilation for a programming language with no separate-compilation facilities, file scope and program scope are equivalent.

Names with program scope are visible everywhere in the program, even when the executable program is built from many source and header files. The linker must resolve these names or pass them to the dynamic loader to resolve. See Section 5.3.10 for more information about symbol resolution.

In the symbol table, procedure scope, file scope and program scope correspond to local, static, and global symbols, respectively. Block scope names are also local symbols. Local and static symbols appear in the local symbol table, and global symbols are in the external symbol table.

5.3.6.1 Procedure Scope

Although procedure symbols can only be global or static (with symbol types stProc and stStaticProc, respectively), procedure entries appear in the local symbol table to identify the containing scope of their local data. The set of symbols appearing in the local symbol table to describe a procedure scope and their associated auxiliary entries is shown in Figure 5-15. Global procedures also have entries in the external symbol table. As illustrated, the indices of these external entries point to the scoping entries in the local symbol table.

In this chapter, all diagrams of symbol table representations use arrows to show that one entry contains an index to another entry. For external and local symbol table entries, the index used is contained in the index field. For auxiliary symbols, the isym or RNDXR field is the index used. Any exceptions to this general rule are noted in the diagrams.

Figure 5-15 Procedure Representation

A special instance of a procedure definition occurs for a procedure with no text. This type of procedure occurs only in the local symbol table and is very similar to the representation of other procedures. It is generally used for procedures that have been optimized away that still need to be represented for debugging or profiling information.

Figure 5-16 Procedure with No Text

A procedure with no code can contain only nested procedures that also have no code associated with them. If a procedure with no code does not contain any nested procedures, the stBlock/stEnd symbol pair can be omitted from the representation.

The stProc symbol included in this representation is distinguished from similar stProc symbols by its value field, which is set to addressNil (-1).

5.3.6.2 File Scope

As in the case of procedures, file name entries appear in the local symbol table to define the file's scope. This representation is shown in Figure 5-17. Note that file symbols appear in the local symbol table only.

Figure 5-17 File Representation

5.3.6.3 Block Scope

In general, the local symbol table denotes scoping levels with stBlock and stEnd pairs, as shown in Figure 5-18.

All symbols contained between these two entries belong to the scope they describe. Nested blocks are possible, and each stEnd symbol matches the most recent unmatched stBlock (or other opening symbol entry, such as stProc or stTag).
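The matching rule can be illustrated with a simple depth counter. The following C fragment is a minimal sketch only: the enumeration values are placeholders chosen for the example (they are not the values defined in the system headers), and the names sym_type and find_matching_end are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative symbol-type codes. The numeric values are placeholders
 * for this sketch, not the values from the system headers. */
enum sym_type { ST_OTHER, ST_PROC, ST_BLOCK, ST_TAG, ST_END };

/* Returns nonzero if the symbol type opens a scope. */
static int opens_scope(enum sym_type st)
{
    return st == ST_PROC || st == ST_BLOCK || st == ST_TAG;
}

/* Finds the index of the stEnd entry that closes the scope opened at
 * syms[start]. Returns -1 if syms[start] does not open a scope or if
 * no matching end entry exists. */
long find_matching_end(const enum sym_type *syms, size_t count, size_t start)
{
    size_t depth = 0;
    size_t i;

    assert(syms != NULL);
    if (start >= count || !opens_scope(syms[start]))
        return -1;

    for (i = start; i < count; i++) {
        if (opens_scope(syms[i])) {
            depth++;                 /* entering a nested scope */
        } else if (syms[i] == ST_END) {
            depth--;                 /* leaving the innermost open scope */
            if (depth == 0)
                return (long)i;
        }
    }
    return -1;
}
```

Every entry whose position lies strictly between start and the returned index belongs to the scope opened at start, or to a scope nested inside it.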

Figure 5-18 Block Representation

Block scopes occur in many languages. In C, they take the form of lexical blocks. In C++, declarations can occur anywhere in the code. In Pascal and Ada, nested procedures are possible, with local variables at any or all levels.

5.3.6.4 Namespaces (C++)

A C++ namespace is a mechanism that allows the partitioning of the program global name space. This partitioning is intended to reduce name clashing and provide greater program manageability to C++ developers.

Figure 5-19 C++ Namespace Representation

A namespace definition may exist only at the global scope or within another namespace. The namespace representation in Figure 5-19 shows a single contribution to a namespace. This representation may be replicated many times in the symbol table for a single namespace. A namespace definition may be continued within the same file or over multiple source files.

A single namespace contribution that spans multiple source files is represented as if it were contained entirely within the source file in which it began.

Namespaces may be aliased, allowing a single namespace to be referred to by multiple names. Namespace components may also be referenced without their namespace qualification if they are included within a scope by a using directive or using declaration. The representations of namespace aliases, using directives, and using declarations are shown in Figure 5-19. Namespace definitions, namespace component declarations, namespace aliases, using directives, and using declarations occur only in the local symbol table. Namespace component definitions may occur in the local or external symbol table.

5.3.6.4.1 Namespace Components

The components of a namespace are represented in two parts: declarations and definitions. Namespace components that do not require definition must be declared in the namespace definition. Namespace components that are referenced by a using declaration must be declared in the namespace definition. All other namespace component declarations may be omitted from the namespace definition.

Namespace component names are mangled only as needed. Function and data definitions have mangled name definitions in the local or external symbol table. These entries are mangled for type-safe linkage and as a method of matching components with the namespaces to which they belong. Names of component declarations within a namespace definition may or may not be mangled. They are not required to include the namespace name in their mangled form.

Empty namespace contributions can be omitted, but at least one instance of a namespace definition must occur somewhere in the local symbol table. This definition is required because name mangling rules do not distinguish namespace component definitions from class member definitions.

5.3.6.4.2 Namespace Aliases

Namespace aliases can occur in namespace, file, procedure or block scope in the local symbol table. The index value for the stAlias entry is an auxiliary table index. The auxiliary entry is a RNDXR record containing the local symbol table index of the stNamespace symbol in the first instance of a namespace definition within a compilation unit. For an alias of an alias, the RNDXR record can also contain the index of another stAlias symbol in the local symbol table. Section 9.2.5 provides an example of a namespace alias.
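Because an alias can refer to another alias, a consumer may need to follow a chain of stAlias entries before it reaches the stNamespace symbol. The following C fragment is a minimal sketch of that traversal. It assumes that the RNDXR of each alias has already been decoded into a plain local symbol table index (the target field) and that all entries are in the same file; the structure sym_sketch, its fields, and the enumeration values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder symbol-type codes for this sketch only. */
enum sym_type { ST_NAMESPACE, ST_ALIAS, ST_OTHER };

/* Reduced stand-in for a local symbol. For an ST_ALIAS entry, target
 * is the local symbol index taken from its RNDXR auxiliary record;
 * for other entries it is unused. */
struct sym_sketch {
    enum sym_type st;
    long target;
};

/* Follows a chain of aliases starting at syms[start]. Returns the
 * index of the stNamespace entry that is finally referenced, or -1 if
 * the chain is broken, leaves the table, or loops. */
long resolve_alias(const struct sym_sketch *syms, size_t count, size_t start)
{
    size_t cur = start;
    size_t steps;

    assert(syms != NULL);
    /* A valid chain visits each entry at most once, so count steps
     * are enough; more than that indicates a cycle. */
    for (steps = 0; steps < count; steps++) {
        if (cur >= count)
            return -1;
        if (syms[cur].st == ST_NAMESPACE)
            return (long)cur;
        if (syms[cur].st != ST_ALIAS || syms[cur].target < 0)
            return -1;
        cur = (size_t)syms[cur].target;
    }
    return -1;
}
```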

The stAlias symbol type may be used in future versions of the symbol table format as a general purpose symbol alias representation. The semantic interpretation of the stAlias symbol depends on the type of the symbol it aliases.

5.3.6.4.3 Unnamed Namespace

An unnamed namespace can be declared at the global scope or within another namespace. An unnamed namespace is unique within a compilation unit. Multiple contributions to a unique unnamed namespace are not allowed. Unnamed namespace contributions are included in the non-mergeable portion of a C++ header file.

Unnamed namespace components are subject to the same rules as named namespaces for declarations and definitions.

The stNamespace symbol for an unnamed namespace has no name, and its iss field is set to issNil. A compiler-generated name is used to identify the unnamed namespace in the mangled names of unnamed namespace components. A convention for this special name is currently being investigated and will be identified in the next release of this document. The unnamed namespace example in Section 9.2.4 will use the name __unnamed until the actual naming convention has been determined.

5.3.6.4.4 Usage of Namespaces

A C++ using directive or a using declaration is represented by a symbol of type stUsing. It may occur in any scope in the local symbol table. The index value for the stUsing entry is an auxiliary table index. If the stUsing entry represents a using declaration for a single namespace component, the auxiliary entry is a RNDXR record containing the local symbol table index of a namespace component declaration. If the stUsing entry represents a using directive, its RNDXR auxiliary contains the local symbol table index of the stNamespace symbol in the first definition of that namespace in the compilation unit.

A using directive for a namespace alias is represented with a RNDXR auxiliary that directly references the aliased namespace. This representation contains no record of the alias referenced by the using directive.

Names are not required for stUsing entries, but they can be set to match the namespace or namespace component to which they refer.

Namespace components that are referenced by an stUsing symbol must be declared in the namespace definition.

Section 9.2.3 provides an example of namespace definitions and uses.

5.3.6.5 Exception Handling Blocks (C++)

In C++, a special scoping mechanism is introduced to expand user-defined exception-handling capabilities. Exception handlers are defined to "catch" exceptions that are "thrown" by other functions. The symbol table must contain sufficient information to recognize the scope of a handler. The compiler generates special symbols to identify where exception handlers are valid.

Figure 5-20 C++ Exception Handler Representation

5.3.6.6 Common Blocks (Fortran)

Fortran common blocks constitute another scoping level. Fortran uses common blocks as a way of specifying data that is global or shared between program units. A common block is global storage that can be named, allotted, accessed, and used by various subroutines. The block can be named or unnamed; unnamed blocks are known as "blank commons". Internal to the symbol table, blank commons are named "_BLNK_".

Figure 5-21 shows the symbolic representation of Fortran common blocks.

Figure 5-21 Fortran Common Block Representation

Because a Fortran common is represented as a synthesized file, it also has an entry in the file descriptor table. Furthermore, a global symbol with the same name is also present in the external symbol table.

An example of a Fortran common block can be found in Section 9.3.1.

5.3.6.7 Alternate Entry Points

Fortran also has a facility for creating alternate entry points in procedures. An alternate entry point is represented using an stProc, scText symbol. In the procedure descriptor table, an alternate entry point is identified by a lnHigh field with a value of -1. Procedure descriptors for alternate entry points follow the procedure descriptor for the primary entry point. In the local symbol table, an alternate entry point has an entry inside the scope of the procedure's main entry.
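A consumer that walks the procedure descriptor table can use the lnHigh convention to group alternate entry points with their primary entry. The following C fragment is a minimal sketch: pdr_sketch is a hypothetical stand-in that carries only the lnHigh field, not the full procedure descriptor layout.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for a procedure descriptor. Only the field used by
 * this sketch is present. */
struct pdr_sketch {
    int lnHigh;   /* -1 marks an alternate entry point */
};

/* Returns nonzero if the descriptor describes an alternate entry. */
int is_alternate_entry(const struct pdr_sketch *pd)
{
    assert(pd != NULL);
    return pd->lnHigh == -1;
}

/* Counts the alternate entry descriptors that immediately follow the
 * primary entry descriptor at index primary. */
size_t count_alternate_entries(const struct pdr_sketch *pds, size_t count,
                               size_t primary)
{
    size_t n = 0;
    size_t i;

    if (primary >= count)
        return 0;
    for (i = primary + 1; i < count && is_alternate_entry(&pds[i]); i++)
        n++;
    return n;
}
```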

The representation of a procedure with an alternate entry point is shown in Figure 5-22.

Figure 5-22 Alternate Entry Point Representation

An example of Fortran alternate entries can be found in Section 9.3.2.

5.3.7 Data Types in the Symbol Table

A data element's type dictates its size and interpretation in a programming environment. One of the symbol table's most important tasks is to represent data types in a compact and complete manner.

Type information is stored in the local and auxiliary symbol tables. This section provides guidelines for understanding the type information plus specific examples for depicting a range of types.

5.3.7.1 Basic Types

All programming languages have a set of simple types that are built into the language and from which other data types can be derived. Examples of simple types are integer, character, and floating point. Languages also provide constructs for creating user-defined types based on the simple types. For example, a C++ class can be built using any simple type or previously defined user-defined type and the language facility for declaring classes.

Similarly, a basic type in the symbol table is a building block from which each language constructs its type information. Basic type (bt) values directly represent many of the simple types for supported languages; for instance, the value btChar indicates a character. Other bt values represent language constructs for building aggregate types; a value of btStruct may be used, for example, to represent a C structure or Pascal record.

The symbol table uses approximately forty basic type values. The interpretation of some of these values is language dependent. See Table 5-4 for a list of all values.

5.3.7.2 Type Qualifiers

Type qualifiers can be applied to basic types to create other data types. Examples are "pointer to" and "array of". Generally, the number and order of type qualifiers are unrestricted.

The type qualifier "function returning" (tqProc) is not used in V3.13 of the symbol table. However, it is used in prior versions for variables declared as function pointers. This older representation uses a TIR record to store the function type in the bt value followed by as many type qualifiers as necessary. A major limitation of this representation is the inability to represent parameter types.

The symbol table currently uses eight type qualifiers. See Table 5-5 for a list of all possible values.

5.3.7.3 Interpreting Type Descriptions in the Auxiliary Table

This section explains in detail the encoding of type descriptions in the symbol table. To fully describe the type of a symbol, the auxiliary symbol table must be created and referenced. Compilation with full symbolic information (-g option on system compilers) results in the creation of this table.

To correctly decode the type information, proceed sequentially, beginning with the symbol table entry. Several fields may be required from other symbol table structures:

symbol type (st)
storage class (sc)
index (SYMR.index)
value (SYMR.value)
source language (FDR.lang)

The first step is to determine whether the symbol contains an index of an auxiliary table description.

Table 5-12 Symbol Table Entries with Associated Auxiliary Table Type Descriptions

Symbol Type	Storage Class	Conditions	`SYMR` Field Containing `AUXU` Index
`stGlobal`	Any	None	`index`
`stStatic`	Any	None	`index`
`stParam`	Any	None	`index`
`stLocal`	Any	Local symbol table	`index`
`stProc`	Any	Local symbol table only	`index`
`stBlock`	`scInfo`	Inside an `scVariant` block only	`value`
`stMember`	`scInfo`	None	`index`
`stTypedef`	`scInfo`	None	`index`
`stStaticProc`	Any	Local symbol table only	`index`
`stConstant`	Any	None	`index`
`stBase`	`scInfo`	None	`index`
`stVirtBase`	`scInfo`	None	`index`
`stTag`	`scInfo`	None	`index`
`stInter`	`scInfo`	None	`index`
`stNamespace`	`scInfo`	None	`index`
`stUsing`	`scInfo`	None	`index`
`stAlias`	`scInfo`	None	`index`

If the index does represent a record in the auxiliary symbol table, the interpretation of the first auxiliary entry (AUXU) depends on the type of the symbol:

If the symbol's type is stProc or stStaticProc and the symbol is a local symbol, the indexed AUXU is an isym and the second AUXU is a TIR. External procedure symbols do not have descriptions in the auxiliary table.
If the symbol's type is stInter, stAlias, or stUsing, the indexed AUXU is an RNDXR and the type description does not contain a TIR.
If the symbol is an stBlock symbol inside an scVariant block, the symbol entry's value field is an index into the auxiliary table. This special case is the only one where the value is used as an auxiliary symbol pointer. In all other cases, it is the index field that potentially indexes the auxiliary table type description.
Otherwise, the indexed AUXU is a TIR.
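The rules above can be sketched as a small classification routine. The following C fragment is illustrative only: the enumeration values are placeholders, ST_OTHER stands for any remaining symbol type listed in Table 5-12, and the names classify_first_aux and aux_index_is_in_value_field are hypothetical. The result for a variant arm block reflects the count record described in Section 5.3.8.11.

```c
#include <assert.h>

/* Placeholder symbol-type codes for this sketch only. ST_OTHER stands
 * for any other symbol type listed in Table 5-12. */
enum sym_type {
    ST_PROC, ST_STATIC_PROC, ST_INTER, ST_ALIAS, ST_USING, ST_BLOCK, ST_OTHER
};

/* What the first auxiliary entry of the description contains. */
enum first_aux {
    FIRST_AUX_NONE,           /* no auxiliary description */
    FIRST_AUX_ISYM_THEN_TIR,  /* isym, followed by a TIR */
    FIRST_AUX_RNDXR,          /* RNDXR, no TIR */
    FIRST_AUX_VARIANT_COUNT,  /* count record for a variant arm */
    FIRST_AUX_TIR             /* ordinary type description */
};

/* Classifies the first auxiliary entry for a symbol that has already
 * been checked against Table 5-12. */
enum first_aux classify_first_aux(enum sym_type st, int is_local,
                                  int in_variant_block)
{
    assert((int)st <= (int)ST_OTHER);
    switch (st) {
    case ST_PROC:
    case ST_STATIC_PROC:
        /* External procedure symbols have no auxiliary description. */
        return is_local ? FIRST_AUX_ISYM_THEN_TIR : FIRST_AUX_NONE;
    case ST_INTER:
    case ST_ALIAS:
    case ST_USING:
        return FIRST_AUX_RNDXR;
    case ST_BLOCK:
        return in_variant_block ? FIRST_AUX_VARIANT_COUNT : FIRST_AUX_NONE;
    default:
        return FIRST_AUX_TIR;
    }
}

/* Returns nonzero if the auxiliary index is stored in the value field
 * of the symbol rather than in its index field. */
int aux_index_is_in_value_field(enum sym_type st, int in_variant_block)
{
    return st == ST_BLOCK && in_variant_block;
}
```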

The next task is to examine the contents of the TIR. The TIR contains constants representing the basic type of the symbol and up to six type qualifiers, labeled tq0-tq5. If a type has more than one qualifier, they are ordered from lowest to highest. Lower qualifiers are applied to the basic type before higher qualifiers. All unused tq fields are set to tqNil, and no tqNil fields are present before or between other type qualifiers.

In addition to the basic type and type qualifiers, the TIR contains two flags: an fBitfield flag to mark whether the size of the type is explicitly recorded, and a continued flag to indicate that the type description is continued in another TIR. If fBitfield is set, the TIR is immediately followed by a width entry. If more than six type qualifiers are required for the current definition, the description is continued, and the continued flag is set. If exactly six type qualifiers are needed, all six fields are used and the continued flag is cleared.

To illustrate, consider the type "array of pointers to integers". The basic type is "integer" and has two qualifiers, "array of" and "pointer to". Each element of the array is a "pointer to integer". Therefore, the qualifier "pointer to" must be applied first to the basic type "integer". In this example, the qualifier "pointer to" is lower than the qualifier "array of". The contents of the TIR are as follows:

        bt: btInt
        tq0: tqPtr
        tq1: tqArray
        tq2: tqNil
        tq3: tqNil
        tq4: tqNil
        tq5: tqNil
        continued: 0
        fBitfield: 0
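The qualifier ordering can be checked by turning such a TIR back into an English description. The following C fragment is a minimal sketch: tir_sketch is a simplified stand-in for the TIR (not its real bit layout), the enumeration values are placeholders, and only two qualifiers and three basic types are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder codes for this sketch only. */
enum basic_type { BT_INT, BT_CHAR, BT_FLOAT };
enum type_qual { TQ_NIL, TQ_PTR, TQ_ARRAY };

/* Simplified stand-in for a TIR: a basic type and six qualifiers,
 * where tq[0] is the lowest qualifier (applied first). */
struct tir_sketch {
    enum basic_type bt;
    enum type_qual tq[6];
};

/* Appends text to buf without writing past size bytes. */
static void append(char *buf, size_t size, const char *text)
{
    size_t len = strlen(buf);

    if (len + 1 < size)
        strncat(buf, text, size - len - 1);
}

/* Writes an English description of the type into buf and returns buf.
 * The highest qualifier is the outermost one, so it is printed first. */
char *describe_type(const struct tir_sketch *tir, char *buf, size_t size)
{
    static const char *const bt_names[] = { "int", "char", "float" };
    size_t used = 0;
    size_t i;

    assert(tir != NULL && buf != NULL && size > 0);
    buf[0] = '\0';

    /* Unused qualifiers are tqNil and never precede a used qualifier. */
    while (used < 6 && tir->tq[used] != TQ_NIL)
        used++;

    for (i = used; i > 0; i--)
        append(buf, size, tir->tq[i - 1] == TQ_PTR ? "pointer to " : "array of ");
    append(buf, size, bt_names[tir->bt]);
    return buf;
}
```

For the TIR shown above (bt set to btInt, tq0 set to tqPtr, tq1 set to tqArray), this sketch produces the text "array of pointer to int".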

The contents of the TIR dictate how to interpret any subsequent records. The records appear in a prescribed order:

If the fBitfield flag is set, a width record follows the TIR.
If the basic type is btPicture, the next four records contain integer values: the string table index of the picture string, the length, precision and scale.
If the basic type is btScaledBin, the next three records contain integer values: a basic type, the precision and scale.
If the basic type field is btStruct, btUnion, btEnum, btClass, btIndirect, btSet, btTypedef, btRange, btRange_64, btDecimal, btFixedBin, or btProc, the next record is an RNDXR.
If the rfd field of the RNDXR contains the value ST_RFDESCAPE, the next record is an isym.
If the basic type is btRange, the next two records are dnLow and dnHigh.
If the basic type is btRange_64, the next two records are dnLow records and the two after that are dnHigh records.
If the basic type is btDecimal or btFixedBin, the next two records contain integer values: the precision and scale.
For each array type qualifier in the TIR, the following symbols occur:
An RNDXR, again possibly followed by an isym
Either one or two dnLow records (depending on whether the array is tqArray or tqArray_64)
Either one or two dnHigh records (depending on whether the array is tqArray or tqArray_64)
Either one or two width records (depending on whether the array is tqArray or tqArray_64)
If the continued flag is set, the next record is another TIR.

For a type description containing more than one TIR, the fields of all TIR records are interpreted in the same way. When a TIR is reached with the continued flag cleared and any records associated with that TIR have been decoded, the type description is complete.

As an example, consider an array of structures with the fBitfield flag set. A total of seven auxiliary records can be used to describe the type:

The TIR, with a basic type of btStruct and with tq0 set to tqArray
A width record: the size of the basic type
An RNDXR record: a pointer to the structure definition in the local symbol table
An RNDXR record: a pointer to the array index type description elsewhere in the auxiliary table
A dnLow record: the lower bound of the array's range
A dnHigh record: the upper bound of the array's range
A width record: the distance in bits between each element in the array

If the continued flag of the TIR is cleared, the width record corresponding to the array qualifier is the final AUXU for this type description.
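The record count for a single TIR can be derived mechanically from the ordering rules. The following C fragment is a minimal sketch under two stated assumptions: no RNDXR in the description uses ST_RFDESCAPE (so no extra isym records are present), and a continuation TIR, if any, is counted separately by the caller. The structure tir_sketch, the enumeration values, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder codes for this sketch only. */
enum basic_type {
    BT_INT, BT_STRUCT, BT_UNION, BT_ENUM, BT_CLASS, BT_INDIRECT, BT_SET,
    BT_TYPEDEF, BT_RANGE, BT_RANGE_64, BT_DECIMAL, BT_FIXED_BIN, BT_PROC,
    BT_PICTURE, BT_SCALED_BIN
};
enum type_qual { TQ_NIL, TQ_PTR, TQ_ARRAY, TQ_ARRAY_64 };

/* Simplified stand-in for a TIR (not its real bit layout). */
struct tir_sketch {
    enum basic_type bt;
    enum type_qual tq[6];
    int fBitfield;
    int continued;   /* not followed by this sketch */
};

/* Returns nonzero if the basic type is followed by an RNDXR. */
static int has_rndx(enum basic_type bt)
{
    switch (bt) {
    case BT_STRUCT: case BT_UNION: case BT_ENUM: case BT_CLASS:
    case BT_INDIRECT: case BT_SET: case BT_TYPEDEF: case BT_RANGE:
    case BT_RANGE_64: case BT_DECIMAL: case BT_FIXED_BIN: case BT_PROC:
        return 1;
    default:
        return 0;
    }
}

/* Counts the auxiliary records used by one TIR and its dependents,
 * including the TIR itself. */
size_t aux_record_count(const struct tir_sketch *tir)
{
    size_t n = 1;   /* the TIR */
    size_t i;

    assert(tir != NULL);
    if (tir->fBitfield)             n += 1;  /* width */
    if (tir->bt == BT_PICTURE)      n += 4;  /* string index, length, precision, scale */
    if (tir->bt == BT_SCALED_BIN)   n += 3;  /* basic type, precision, scale */
    if (has_rndx(tir->bt))          n += 1;  /* RNDXR */
    if (tir->bt == BT_RANGE)        n += 2;  /* dnLow, dnHigh */
    if (tir->bt == BT_RANGE_64)     n += 4;  /* two dnLow, two dnHigh */
    if (tir->bt == BT_DECIMAL || tir->bt == BT_FIXED_BIN)
        n += 2;                              /* precision, scale */

    for (i = 0; i < 6; i++) {
        if (tir->tq[i] == TQ_ARRAY)
            n += 4;  /* RNDXR, dnLow, dnHigh, width */
        else if (tir->tq[i] == TQ_ARRAY_64)
            n += 7;  /* RNDXR, then two records each for low, high, width */
    }
    return n;
}
```

Applied to the array of structures described above (btStruct, tq0 set to tqArray, fBitfield set), the sketch yields seven records.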

For another view of this process, see Figure 5-23. Each box represents one auxiliary entry belonging to the symbol's type description. Using the flowchart, an ordered list of entries can be assembled.

Figure 5-23 Auxiliary Table Interpretation

Figure 5-24 Auxiliary Table "ti" Interpretation

Figure 5-25 Auxiliary Table "bt vals" Interpretation

Figure 5-26 Auxiliary Table "arrays" Interpretation

Figure 5-27 Auxiliary Table "range" Interpretation

Figure 5-28 Auxiliary Table "rndx" Interpretation

The final step is to decode the RNDXR records. The basic types that are followed by RNDXR records require reference to another local or auxiliary symbol to complete the type description. Interpret the RNDXR records as follows:

If the basic type is btStruct, btUnion, btEnum, btClass, btProc, or btTypedef, the index field of the RNDXR points into the local symbol table. The specified local symbol is the start of the definition of the structure, union, enumeration, class, or user-defined type. For btProc, the referenced local symbol is the start of the set of symbols defining the procedure's signature.
If the basic type is btSet, the RNDXR points into the auxiliary symbol table. The specified record is the start of the description of the type of each element in the set.
If the basic type is btIndirect, the RNDXR points into the auxiliary symbol table. The specified auxiliary record is the start of the description of the referenced type.
If the basic type is btRange, the RNDXR points into the auxiliary symbol table. The specified auxiliary record is the start of the description of the type being subranged.
If the basic type is btFixedBin, the rfd field of the RNDXR contains a Boolean value. If rfd is true, the base is decimal; if rfd is false, the base is binary. The index field represents a type code.
If the basic type is btDecimal, the rfd field of the RNDXR contains the value 1 for 4-bit digits (packed decimal) or 2 for 8-bit digits (zoned decimal). The index field represents a type code.

Additionally, the index of every RNDXR used as a pointer must be mapped through the relative file descriptor table (see Section 5.3.2.1), if the table exists. The rfd field of the record controls this mapping. The following algorithm can be used to locate the symbol referenced by the relative index record:

if (RNDXR.rfd == ST_RFDESCAPE)
    RFD = (++AUXU).isym
else 
    RFD = RNDXR.rfd 
if (HDRR.crfd) /* RFD table exists */
    IFD = (current FDR's RFD table)[RFD]
else
    IFD = RFD

if (SYMR needed)
    SYMBASE = FDR[IFD].isymBase
    SYMR = SYMBASE[RNDXR.index]
else if (AUXU needed)
    AUXBASE = FDR[IFD].iauxBase
    AUXU = AUXBASE[RNDXR.index]
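The same algorithm can be sketched in C. The fragment below is illustrative only and makes several assumptions: the structures are reduced stand-ins rather than the real record layouts, the escape value 0xfff for ST_RFDESCAPE is assumed here and should be taken from the system headers in real code, the isym that follows an escaped RNDXR is passed in by the caller, and a null rfd_map models the case where HDRR.crfd is zero. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed escape value for the rfd field; use the definition from the
 * system headers in real code. */
#define ST_RFDESCAPE 0xfffu

/* Reduced stand-ins for the RNDXR and FDR records. */
struct rndx_sketch {
    unsigned rfd;
    unsigned index;
};
struct fdr_sketch {
    long isymBase;   /* start of this file's local symbols */
    long iauxBase;   /* start of this file's auxiliary entries */
};

/* Maps the relative file number of an RNDXR to a file descriptor
 * index. next_isym is the isym that follows the RNDXR and is used
 * only when rfd is ST_RFDESCAPE. rfd_map is the current file's
 * relative file descriptor table, or NULL if none exists. */
unsigned resolve_file(const struct rndx_sketch *rndx, unsigned next_isym,
                      const unsigned *rfd_map)
{
    unsigned rfd;

    assert(rndx != NULL);
    rfd = (rndx->rfd == ST_RFDESCAPE) ? next_isym : rndx->rfd;
    return rfd_map != NULL ? rfd_map[rfd] : rfd;
}

/* Position of the referenced entry in the local symbol table. */
long symbol_position(const struct fdr_sketch *fdrs, unsigned ifd, unsigned index)
{
    return fdrs[ifd].isymBase + (long)index;
}

/* Position of the referenced entry in the auxiliary symbol table. */
long aux_position(const struct fdr_sketch *fdrs, unsigned ifd, unsigned index)
{
    return fdrs[ifd].iauxBase + (long)index;
}
```

Whether symbol_position or aux_position applies depends on the basic type, as described in the preceding list.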

5.3.8 Individual Type Representations

This section provides sketches of type representations in the local and auxiliary symbol tables. The connections between the two tables are depicted for each type. This form of representation is only possible when full symbolic information is present.

Note that external symbols as well as local symbols reference the auxiliary table, although the examples in this chapter use local symbols only.

5.3.8.1 Pointer Type

A pointer is a variable containing the address of another variable. In the symbol table, a pointer is a single symbol whose auxiliary table entry contains a tqPtr type qualifier modifying another type, as shown in Figure 5-29.

Note that if the pointer referenced a user-defined type, such as a class or structure, the TIR would be followed by an RNDXR (and possibly an isym).

Figure 5-29 Pointer Representation

The combination of type qualifiers tqFar and tqPtr is used to represent a short (32-bit) pointer. This pointer type is used with the XTASO emulation.

5.3.8.2 Array Type

An array is a list of elements that all have the same type. Arrays may be fixed size and allocated at compile time or dynamically sized and allocated at run time. This section describes the fixed-size array symbol table representation. For information on Fortran dynamic arrays, see Section 5.3.8.9. For conformant arrays in Pascal and Ada, see Section 5.3.8.10.

An array is represented by a tqArray or tqArray_64 type qualifier applied to another type. This second type describes the type of all elements in the array. In the local or external symbol table, a single entry represents an array. Figure 5-30 shows the symbol table description for an array.

Figure 5-30 Array Representation

Note that for an array of elements of a user-defined type, such as a class or structure, another RNDXR (and possibly an isym) would be inserted between the TIR and the RNDXR describing the subscript type.

If an array has multiple dimensions, the symbols describing the dimension appear in the order of innermost to outermost. For example, the following declaration produces a TIR with the tqArray qualifier followed by the RNDXR and range description for 0-1 followed by the entries for the dimension 0-99:

float floattable[100][2];
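The ordering can be expressed as a reversal of the declared dimensions. The following C fragment is a minimal sketch that assumes C-style arrays with a lower bound of zero; the structure dim_range and the function aux_dimension_order are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Bounds of one array dimension. */
struct dim_range {
    long low;
    long high;
};

/* Fills out[] with the dimension bounds in the order in which they
 * appear in the auxiliary table: innermost dimension first. extents[]
 * holds the declared sizes in source order, outermost first. */
void aux_dimension_order(const long *extents, size_t ndims,
                         struct dim_range *out)
{
    size_t i;

    assert(extents != NULL && out != NULL);
    for (i = 0; i < ndims; i++) {
        out[i].low = 0;
        out[i].high = extents[ndims - 1 - i] - 1;
    }
}
```

For floattable, the declared extents are 100 and 2, so the first range produced is 0-1 and the second is 0-99.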

Some arrays may have dimensions too large to represent in the 32-bit format shown in Figure 5-30. Such arrays are represented using a 64-bit format in which two auxiliary entries are used for the dimension bounds and size. Figure 5-31 illustrates the 64-bit representation.

Figure 5-31 64-Bit Array Representation

5.3.8.3 Structure, Union, and Enumerated Types

This section applies to data structures in languages other than C++. For the C++ structure, union, or enumerated type representation, see Section 5.3.8.6.

Structures, unions, and enumerated types have a common representation. All three are identified using "tags" and contain zero or more fields. In the symbol table, the tag is the name associated with the starting stBlock symbol for the structure's set of local symbols. Note that it may be empty because the tag is optional. Symbols for fields follow. The definition is completed by a block-end symbol matching the block-start symbol.

Figure 5-32 contains a graphical depiction of this set of symbols.

Figure 5-32 Structure Representation

The structure members have auxiliary table indices pointing to their type descriptions.

Untagged structures and unions are represented with a NULL tag name. Unnamed structures can be embedded in other structures and are represented as a NULL-named member of the outer structure. See Section 9.1.1 for an example of an unnamed structure.

A structure can contain a field that is a pointer to itself. This field is represented by an stMember symbol with an auxiliary table entry that references the beginning of the structure's block of local symbols, as shown in Figure 5-33.

Figure 5-33 Recursive Structure Representation

When a field within a structure is itself a structure, the compiler may choose to generate the structure definitions either sequentially or embedded, as shown in Figure 5-34.

Figure 5-34 Nested Structure Representation

The following declaration might result in the nested structure representation:

struct line { 
        struct point { 
            float x, y;
        }  p1, p2;
};

5.3.8.4 Typedef Type

Most languages allow programmers to choose alternate names, or aliases, for data types. The alias created by such a facility (such as C's typedef) is represented as a single local symbol entry that has a pointer to its type description in the auxiliary table. The auxiliary entry contains a pointer to the definition of the type name, as shown in Figure 5-35.

Figure 5-35 Typedef Representation

5.3.8.5 Function Pointer Type

Languages such as C and C++, which allow pointers to functions, represent the type of the function pointer using a special stProc/scInfo block describing the parameters and return value for the function as shown in Figure 5-36.

Figure 5-36 Function Pointer Representation

The stProc/scInfo entry has its value set to -2, which distinguishes it from similar entries used to represent procedures with no text and C++ member functions. The stProc/scInfo and stEnd/scInfo entries have null names in the function pointer representation. The parameters are optional and may or may not be named.
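A consumer can use the value field to tell these entries apart. The following C fragment is a minimal sketch: the value -2 is taken from the description above and addressNil (-1) from Section 5.3.6.1, while the structure sym_sketch, the macro names, and the enumeration are hypothetical. Entries with any other value are reported as "other" because their interpretation is not covered here.

```c
#include <assert.h>
#include <stddef.h>

#define ADDRESS_NIL     (-1L)  /* procedure with no text */
#define FUNC_PTR_VALUE  (-2L)  /* function pointer signature block */

/* Reduced stand-in for a local symbol; only the value field is used.
 * The caller is assumed to have checked that the entry is an stProc. */
struct sym_sketch {
    long value;
};

enum proc_entry_kind {
    PROC_NO_TEXT,
    PROC_FUNCTION_POINTER_TYPE,
    PROC_OTHER
};

/* Classifies an stProc entry by its value field. */
enum proc_entry_kind classify_proc_entry(const struct sym_sketch *sym)
{
    assert(sym != NULL);
    if (sym->value == FUNC_PTR_VALUE)
        return PROC_FUNCTION_POINTER_TYPE;
    if (sym->value == ADDRESS_NIL)
        return PROC_NO_TEXT;
    return PROC_OTHER;
}
```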

This representation for function pointers is new in V3.13. The previous representation used the combination of type qualifiers tqPtr and tqProc in the TIR of the function pointer variable. Prior to V3.13, it was not possible to represent the parameter types for a function pointer.

5.3.8.6 Class Type (C++)

A C++ class resembles an extended C structure. One major distinction is that class fields (referred to as "members") can be functions as well as variables. The set of symbols created for a class is organized as follows:

The name of the class
A block symbol for scoping
Data members
Symbols associated with member functions. Each member function is represented by the normal set of symbols present for a function.
Corresponding end symbols that denote the completion of the block and class.

Another characteristic of classes is that symbols are defined implicitly. For example, all classes have an operator= operator-overloading function included in the class definition and a "this" pointer to its own type as a parameter to all member functions. These symbols are always included explicitly in the symbol table description.

Figure 5-37 is a graphical representation of the set of symbols for a class.

Figure 5-37 Class Representation

Class members, including member functions, have auxiliary references that point to their type descriptions. Note that member functions are represented as prototypes. The set of symbols defining the member function is elsewhere in the symbol table. To locate the definition of a member function, a name lookup can be performed using the mangled name of the member function with its class name qualifier. See Section 5.3.10.3 for information on name mangling.

C++ structures, unions, and enumerated types are represented the same way as classes. The different data structures are distinguished by basic type value.

The symbol table does not represent class member access attributes.

Examples of base and derived classes can be found in Section 9.2.1.

5.3.8.6.1 Empty Class or Structure (C++)

The representation of empty classes or structures in C++ is shown in Figure 5-38.

Figure 5-38 Empty Class or Structure (C++)

5.3.8.6.2 Base and Derived Classes (C++)

Hierarchical groups of classes can be designed in C++. A base class serves as a wider classification for its derived classes, and a derived class has all of the members and methods of the base class, plus additional members of its own. In the symbol table, the set of symbols denoting a derived class is nearly identical to that for a non-derived class. The derived class includes an additional stBase or stVirtBase symbol that identifies its corresponding base class, and it does not need to duplicate the definitions for the base class members. This representation is shown in Figure 5-39.

Figure 5-39 Base Class Representation

The representation of virtual base classes for C++ relies on the definition of a special symbol that identifies the virtual base table. The name for this symbol is derived from the name of the class to which it belongs. For example, the virtual base table symbol for class C5 would be named "_btbl_2C5". This table contains entries for base class run-time descriptions.

A class can include the special member "_bptr". This class member is a pointer to the virtual base table for that class.

The value field for a virtual base class symbol (stVirtBase/scInfo) serves as an index (starting at 1) into the virtual base class table.

5.3.8.7 Template Type (C++)

Templates are a C++-specific language construct allowing the parameterization of types. C++ class templates are represented in the symbol table for each instantiation, but not for the template itself. The set of class symbols is unchanged from the set shown in Figure 5-37.

5.3.8.8 Interlude Type (C++)

Interludes are compiler-generated functions in C++. They are represented in the local symbol table with special names starting with the "__INTER__" prefix. Their representation in the symbol table makes use of two RNDXR aux entries to identify the related member function and the actual interlude function, both of which are local symbol table entries.

Figure 5-40 Interlude Representation

5.3.8.9 Array Descriptor Type (Fortran90)

A Fortran90 array descriptor is a structure that describes an array: its location, dimensions, bounds, sizes, and other attributes. Array descriptors are described in detail in the Fortran 90 User Manual for Tru64 UNIX. Fortran90 includes several types of arrays for which the dimensions or dimension bounds are determined at run time: allocatable arrays, assumed shape arrays, and array pointers.

Two symbol table representations can be used for an array descriptor. The default representation describes the array descriptor itself. The alternate representation describes what is known of the array itself at compile time.

No matter what symbolic representation is used, symbols of this type point to a data location at which the array descriptor is allocated. One of the array descriptor fields contains a pointer to the actual array. Other fields are used to describe the attributes of the array. Fields that describe the number of dimensions and upper and lower bounds are filled in at run time.

By default, array descriptors are described by a structure tag representation. Most of the array descriptor fields are represented as structure members. (Excluded fields are not needed by debuggers.) Special tag names are used to identify array descriptor structure definitions: $f90$f90_array_desc (assumed-shape array), $f90$f90_ptr_desc (pointer to array) and $f90$f90_alloc_desc (allocatable array). Figure 5-41 shows the format of this representation.

Some compilers may emit other fields in addition to those shown in Figure 5-41. A consumer's ability to interpret additional fields depends on its knowledge of the producing compiler.

Figure 5-41 Array Descriptor Representation (I)

An example of the default Fortran array descriptor representation can be found in Section 9.3.3.

An alternate representation for array descriptors may be found in symbol tables prior to V3.13. The overloaded basic type value 28 indicates an array descriptor in the TIR, and dimension bounds are set to [1:1] indicating their true size is unknown. The alternate representation does not provide any information describing the contents of the array descriptor itself, so debuggers must assume a static representation for the descriptor and look up the fields at their expected offsets.

This representation is substantially more compact in the local symbol table, but it provides no way to distinguish between the different types of array descriptors.

Figure 5-42 shows the format of the older array descriptor representation.

Figure 5-42 Array Descriptor Representation (II)

5.3.8.10 Conformant Array Type (Pascal)

Full details are not currently available for Pascal's conformant array representation. A Pascal conformant array is very similar to a Fortran assumed-shape array. It is an array parameter with upper and lower dimension bounds that are determined by the input argument. A conformant array is represented by an array descriptor. The special names used and the format of the array descriptor differ from those used for Fortran. The DEC Pascal release notes contain additional information on conformant arrays.

5.3.8.11 Variant Record Type (Pascal and Ada)

A variant record is an extension to the record data type, which is a Pascal or Ada data structure akin to a C struct and is represented in the same manner in the symbol table. The variant part of the record consists of sets of one or more fields associated with a range of values. Only one such set is part of the record, and it is selected based on the value of another record field. Any number of variant parts can be embedded in a single record.

The local symbol table entries for the variant part of a record are contained within a block with the storage class (sc value) scVariant. The value field of the stBlock entry contains the index of the local symbol entry for the member of the record whose value determines which variant arm is used. The variant block contains multiple inner blocks, each representing a variant arm. The value field of each of these block entries is an auxiliary table index. Each auxiliary table entry starts with a count, which indicates how many range entries follow. The range entries describe the values associated with the block.

Figure 5-43 is a graphical representation of a variant record.

Figure 5-43 Variant Record Representation
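
The count-then-ranges layout described above can be sketched as follows. This is only an illustration under stated assumptions: the auxiliary table is modeled as a flat list of integers, and each range entry is ASSUMED to be a (low, high) pair held in two consecutive entries. The actual encoding is given in Figure 5-43. The function names are hypothetical.

```python
def read_arm_ranges(aux, index):
    """Decode the range list for one variant arm.

    aux[index] holds the count. The entries that follow are ASSUMED to be
    `count` (low, high) pairs stored as consecutive integers.
    """
    count = aux[index]
    ranges = []
    for i in range(count):
        low = aux[index + 1 + 2 * i]
        high = aux[index + 2 + 2 * i]
        ranges.append((low, high))
    return ranges


def select_arm(aux, arm_indexes, selector_value):
    """Return the position of the first arm whose ranges cover selector_value."""
    for position, index in enumerate(arm_indexes):
        ranges = read_arm_ranges(aux, index)
        if any(low <= selector_value <= high for low, high in ranges):
            return position
    return None


# Two arms: the first covers 1..3 and 7..7, the second covers 10..20.
aux = [2, 1, 3, 7, 7, 1, 10, 20]
arm_indexes = [0, 5]  # the value fields of the two inner block entries
print(select_arm(aux, arm_indexes, 7))   # 0
print(select_arm(aux, arm_indexes, 15))  # 1
```

Here selector_value stands for the run-time value of the record member identified by the value field of the outer stBlock entry.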

Prior to V3.13 of the symbol table, variant records were represented differently. Figure 5-44 depicts the older representation.

Figure 5-44 Variant Record Representation (pre-V3.13)

An example of a Pascal variant record can be found in Section 9.4.3.

5.3.8.12 Subrange Type (Pascal and Ada)

A subrange data type defines a subset of the values associated with a particular ordinal type (the "base type" of the subrange). Ordinal types in Pascal include integers, characters, and enumerated types. The symbol table representation of a subrange uses the btRange or btRange_64 type followed by an auxiliary index identifying the base type and entries providing the bounds of the subrange. The 32-bit representation is shown in Figure 5-45 and the 64-bit representation is shown in Figure 5-46.

Figure 5-45 Subrange Representation

Figure 5-46 64-bit Range Representation

An example of a Pascal subrange can be found in Section 9.4.2.
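
As an informal model of the information a subrange description carries (a base type reference plus two bounds), consider the following sketch. The class and field names are hypothetical; they do not correspond to symbol table structure definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subrange:
    base_type_index: int    # auxiliary index identifying the base type
    low: int                # lower bound of the subrange
    high: int               # upper bound of the subrange
    is_64bit: bool = False  # True models btRange_64, False models btRange

    def contains(self, value):
        """Report whether value lies within the subrange bounds."""
        return self.low <= value <= self.high


# A Pascal subrange such as 1..10 over some integer base type.
digits = Subrange(base_type_index=4, low=1, high=10)
print(digits.contains(10))  # True
print(digits.contains(11))  # False
```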

5.3.8.13 Set Type (Pascal)

A set is a data type that groups ordinal elements in an unordered list. The arithmetic and logical operators are overloaded in Pascal; this enables them to be used with set variables to perform classic set operations such as union and intersection. A special auxiliary type definition btSet exists to identify this type. The symbol table representation is depicted in Figure 5-47.

Figure 5-47 Set Representation

The element type for a set is typically a range or an enumeration. An example of a Pascal set can be found in Section 9.4.1.

5.3.9 Special Debug Symbols

A variety of special symbols are used throughout the symbol table to convey call frame information, special type semantics, or other language specific information. These names are reserved for use by compilers and other tools that produce Tru64 UNIX object files.

Name                        Purpose
__StaticLink.*              Uplevel link. See Section 5.3.4.4.
_BLNK__                     Fortran unnamed common block. See Section 5.3.6.6.
MAIN__                      Fortran alias for main program unit. See Section 5.3.10.4.
<ARGNAME>.len               Generated parameter for Fortran routines. It contains the length of <ARGNAME>, a parameter of character type.
.lb_<ARRAY>.<dim>           Lower and upper bounds of particular dimensions of arrays, when the array has an explicit shape yet some bounds come from non-constant specification expressions (array arguments in Pascal and Fortran routines).
.ub_<ARRAY>.<dim>
$f90$f90_array_desc         Variants of Fortran-90 described arrays (assumed shape, ALLOCATABLE, and POINTER, respectively). See Section 5.3.8.9.
$f90$f90_alloc_desc
$f90$f90_ptr_desc
cray pointee                Fortran-generated typedef describing the type of a variable pointed to by a CRAY pointer.
pointer                     Fortran-generated typedef describing the type of a scalar with the POINTER attribute.
_DECCXX_generated_name_*    DEC C++ compiler-inserted name for unnamed classes and enumerations.
this                        Hidden parameter in C++ member functions that is a pointer to the current instance of the class. See Section 5.3.8.6.
__vptr                      Hidden C++ class member containing the virtual function table. See example in Section 9.2.2.
__bptr                      Hidden C++ class member containing the virtual base class table. See example in Section 9.2.2.
__vtbl_*                    Global symbols for C++ virtual function tables. See example in Section 9.2.2.
__btbl_*                    Global symbols for C++ virtual base class tables. See example in Section 9.2.2.
__control                   Hidden argument to C++ constructors controlling descent (in the face of virtual base classes).
__t*__evdf                  Structure used to maintain a list of C++ global destructors.
t*__iviw                    C++ static procedure used for global constructors.
t*__evdw                    C++ static procedure used for global destructors.
__t*_thunk                  C++ static procedure used to provide a defaulted argument value.
__INTER__*                  C++ interlude. See example in Section 9.2.2.
__unnamed::*                C++ unnamed namespace components. See example in Section 9.2.4.
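
Several of the reserved names above contain a wildcard portion, so a consumer that wants to recognize them needs pattern matching rather than exact comparison. The following sketch uses shell-style patterns from the Python standard library. The pattern list is a partial, illustrative transcription of the table; the placeholders <ARGNAME>, <ARRAY>, and <dim> are treated as wildcards, which is an assumption. Plain names such as "this" and "pointer" are omitted because a name match alone cannot distinguish them from user identifiers.

```python
from fnmatch import fnmatchcase

# (pattern, purpose) pairs transcribed from the table; an illustrative subset.
RESERVED_PATTERNS = [
    ("__StaticLink.*", "uplevel link"),
    ("_BLNK__", "Fortran unnamed common block"),
    ("MAIN__", "Fortran main program unit"),
    ("*.len", "Fortran character argument length"),
    (".lb_*.*", "array lower bound"),
    (".ub_*.*", "array upper bound"),
    ("$f90$f90_array_desc", "Fortran 90 assumed-shape array descriptor"),
    ("$f90$f90_alloc_desc", "Fortran 90 allocatable array descriptor"),
    ("$f90$f90_ptr_desc", "Fortran 90 array pointer descriptor"),
    ("_DECCXX_generated_name_*", "C++ unnamed class or enumeration"),
    ("__vtbl_*", "C++ virtual function table"),
    ("__btbl_*", "C++ virtual base class table"),
    ("__INTER__*", "C++ interlude"),
    ("__unnamed::*", "C++ unnamed namespace component"),
]


def describe_special_symbol(name):
    """Return the purpose of the first reserved pattern matching name, or None."""
    for pattern, purpose in RESERVED_PATTERNS:
        if fnmatchcase(name, pattern):  # case-sensitive glob match
            return purpose
    return None


print(describe_special_symbol("__vtbl_Shape"))  # C++ virtual function table
print(describe_special_symbol(".lb_A.2"))       # array lower bound
print(describe_special_symbol("counter"))       # None
```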

5.3.10 Symbol Resolution

Among the linker's chief tasks is symbol resolution. Because most compilations involve multiple source files and virtually all programs rely on system libraries, a process is necessary to resolve conflicting uses of global symbol names. The linker must decide which symbol is referenced by a given name. This section highlights the major issues involved in that decision. Related information is contained in Section 6.3.4 and the Programmer's Guide.

Symbol table entries provide information relevant to performing symbol resolution. External symbols with a storage class of sc(S)Undefined, sc(S)Common, or scTlsCommon must be resolved before they are referenced. By default, the linker will not mark an object file with unresolved symbols as executable. However, linker options give programmers a fair measure of control over its symbol resolution behavior. See ld(1) for more information.

5.3.10.1 Library Search

Symbols that are referenced but not defined in the main executable of an application must be matched with definitions in linked-in libraries. The linker combines objects, archives, and shared libraries while attempting to resolve all references to undefined symbols. The Programmer's Guide covers related topics in detail, such as how to specify libraries during compilation and the search order of libraries.

In general, main executable objects and shared libraries are searched before archive libraries. If no undefined external symbols remain, archive libraries in the library list do not have to be searched, because archive members are only loaded to resolve external references. Archives are not used to find "better" common definitions (see Section 5.3.10.2), and no archive definitions preempt symbol definitions from the main object or shared libraries.
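
The archive behavior described above can be modeled roughly as follows. This is a simplified sketch, not the linker's algorithm: member names are assumed to be unique, each archive is rescanned until it resolves nothing further, and the actual search order is governed by the rules in ld(1) and the Programmer's Guide. All names are hypothetical.

```python
def select_archive_members(defined, undefined, archives):
    """Return (loaded_members, remaining_undefined).

    defined, undefined: sets of names known after the main objects and
    shared libraries have been processed.
    archives: list of archives, each a list of
    (member_name, defines, references) tuples.
    """
    defined, undefined = set(defined), set(undefined)
    loaded = []
    for archive in archives:
        progress = True
        while undefined and progress:
            progress = False
            for member, defines, references in archive:
                # A member is loaded only if it resolves an undefined symbol.
                if member in loaded or not (defines & undefined):
                    continue
                loaded.append(member)
                defined |= defines
                undefined = (undefined | references) - defined
                progress = True
    return loaded, undefined


archive = [
    ("a.o", {"sqrt_helper"}, {"log_helper"}),
    ("b.o", {"log_helper"}, set()),
    ("c.o", {"main"}, set()),  # never loaded: "main" is already defined
]
print(select_archive_members({"main"}, {"sqrt_helper"}, [archive]))
# (['a.o', 'b.o'], set())
```

When no undefined symbols remain, the loop body never runs, which mirrors the statement that archives need not be searched in that case.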

5.3.10.2 Resolution of Symbols with Common Storage Class

Symbols with common storage class are a special category of global symbols that have a size but no allocated storage. Symbols with common storage class should not be confused with Fortran common symbols, which are not represented by a single symbol table entry. (See Section 5.3.6.6 for a description of Fortran common symbols.) Common storage classes are scCommon, scSCommon, and scTlsCommon.

The symbol definition model used by Tru64 UNIX allows an unlimited number of common storage class symbols with the same name. Ultimately, the "best" of these must be selected (by the linker or the loader) during symbol resolution. The criteria used to select the best symbol definition include the symbol's allocation status and size.

The symbol table does not provide an "allocated common" storage class. Common storage class symbols adopt a new storage class when they are allocated. Typically, their new storage class is scBss, scSBss, or scTlsBss. On the other hand, the dynamic symbol table does explicitly distinguish common storage class symbols that have been allocated. See Section 6.3.4 for more information on dynamic symbol resolution.

A symbol reference is resolved according to the following precedence rules:

Find a symbol definition that does not have a common storage class and is not identified as an allocated common in the dynamic symbol table.
Find the largest allocated common identified in the dynamic symbol table.
Find the largest common storage class symbol and allocate it. This step will be skipped when the linker produces a relocatable object file.
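
The three precedence rules above can be sketched as a selection function. This is an illustrative model only: the Definition class, its fields, and the resolve function are hypothetical, and the actual allocation performed in the third step is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    origin: str                        # object or library supplying the definition
    size: int
    is_common: bool = False            # scCommon, scSCommon, or scTlsCommon
    is_allocated_common: bool = False  # flagged in the dynamic symbol table


def resolve(candidates, relocatable_output=False):
    """Pick the definition a reference resolves to, or None."""
    # Rule 1: a definition that is neither common nor an allocated common.
    for d in candidates:
        if not d.is_common and not d.is_allocated_common:
            return d
    # Rule 2: the largest allocated common.
    allocated = [d for d in candidates if d.is_allocated_common]
    if allocated:
        return max(allocated, key=lambda d: d.size)
    # Rule 3: the largest common, skipped for relocatable output.
    commons = [d for d in candidates if d.is_common]
    if commons and not relocatable_output:
        return max(commons, key=lambda d: d.size)
    return None


candidates = [
    Definition("a.o", 8, is_common=True),
    Definition("libx.so", 16, is_allocated_common=True),
    Definition("b.o", 64, is_common=True),
]
print(resolve(candidates).origin)  # libx.so
```

Note that the 64-byte common in b.o loses to the smaller allocated common, because allocation status is considered before size.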

Precedence is given to symbol definitions with storage allocation to minimize load time common allocation and redundant storage allocations in shared objects. The loader is capable of allocating space for common storage class symbols, but this should only be necessary when a program references an allocated common symbol in a shared library that is later removed from that shared library.

Note that Fortran common block representations use common storage class symbols. Another very frequent occurrence of a common storage class symbol is a C-language global variable that does not have an initializer in its declaration.

5.3.10.3 Mangling and Demangling

Another issue related to symbol resolution is the need to "mangle" user-level identifiers. For example, C++ allows function overloading, prototyping, and the use of templates, all of which can result in the occurrence of the same names for different entities. The solution employed by the symbol table is to use mangled names that derive from the symbol's type signature.

Object file consumers, such as debuggers and object dumpers, need to "demangle" the identifiers so they can be output in a form that is recognizable to the user. For linking and loading, the mangled names are used for symbol resolution.

The encoding of C++ names is described in the manual Using DEC C++ for Tru64 UNIX Systems.

Other compilers may write symbol names that are modified by prepending or appending special characters such as dollar sign ($) or underscore (_) or by prepending qualifier strings such as file names or namespace names. Uppercasing of names is also common for certain languages such as Fortran. All of these transformations fall into the general category of mangled names. Refer to the release notes for specific compilers for additional information.

5.3.10.4 Mixed Language Resolution

Compilation of a program involving multiple source languages introduces additional symbol resolution issues. One important task is resolving the main program entry point because conflicting "main" symbols may be present in the different files. For C and C++, the symbol "main" is the main program entry point, but for other languages, "main" will either be an alias for the main program or an interlude. DEC Fortran and DEC COBOL provide interludes that perform some language-specific initializations and then call the real main program entry point. For DEC Fortran, the main program is "MAIN__", and for DEC COBOL, the main program is "__cobol_main". DEC Pascal provides a "main" symbol that aliases the actual main program symbol.

The symbols "MAIN__" and "__cobol_main" can both be present in a mixed language program, and either, neither, or both can be used by the program. Debuggers can set a breakpoint in the user's main program by applying some precedence for selecting the most appropriate symbol. For a mixed language program, there is a slight chance that "MAIN__" or "__cobol_main" will be present but never called.
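
This document does not specify the precedence a debugger should apply. The following sketch therefore takes the precedence as a parameter, and its default order is purely an assumption chosen for illustration. The function name is hypothetical.

```python
# ASSUMED order: prefer the language-specific main program symbols over "main".
ASSUMED_PRECEDENCE = ("MAIN__", "__cobol_main", "main")


def pick_user_main(symbol_names, precedence=ASSUMED_PRECEDENCE):
    """Return the first name in precedence that is present, or None."""
    present = set(symbol_names)
    for name in precedence:
        if name in present:
            return name
    return None


print(pick_user_main(["main", "MAIN__", "helper"]))  # MAIN__
print(pick_user_main(["main", "helper"]))            # main
```

Because the selected symbol may be present but never called in a mixed language program, a debugger using such a rule cannot guarantee that the breakpoint will be reached.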

5.3.10.5 TLS Symbols

TLS symbols, like non-TLS symbols, can be undefined or common. Unresolved TLS symbols are identified by the storage class scTlsUndefined, and TLS commons have the storage class scTlsCommon. The symbol resolution process for TLS names is similar, but separate; TLS symbols cannot be resolved to non-TLS symbols or vice versa.

TLS common symbols are resolved in the same manner as other common storage class symbols (see Section 5.3.10.2), except that, again, only TLS symbols are candidates for resolution.

Another rule special to TLS is that symbol definitions for TLS common and undefined symbols cannot be imported from shared libraries.
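
The two TLS-specific restrictions in this section can be expressed as a candidate filter applied before the usual resolution rules. The function and parameter names below are hypothetical.

```python
def may_resolve(reference_is_tls, definition_is_tls, definition_from_shared_library):
    """Report whether a definition is a candidate for resolving a reference."""
    # TLS symbols resolve only to TLS symbols, and non-TLS only to non-TLS.
    if reference_is_tls != definition_is_tls:
        return False
    # Definitions for TLS symbols cannot be imported from shared libraries.
    if reference_is_tls and definition_from_shared_library:
        return False
    return True


print(may_resolve(True, True, False))   # True
print(may_resolve(True, False, False))  # False
print(may_resolve(True, True, True))    # False
```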

5.4 Language-Specific Symbol Table Features

Language-specific characteristics are pervasive in the symbol table, particularly in the local, external, and auxiliary symbol tables. See Section 5.2 and Section 5.3.7 for information on language-specific values.

The lang field of the file descriptor entry encodes the source language of the file. This field should be accessed prior to decoding symbolic information, especially type descriptions. This section highlights, by language, language-specific features represented in the symbol table. Additional information on certain features is available elsewhere in this chapter.

5.4.1 Fortran77 and Fortran90

In Fortran, it is possible to create multiple entry points in subroutines. A subroutine has one main entry point and zero or more alternate entry points, indicated by ENTRY statements. See Section 5.3.6.7 for their representation in the symbol table.

Fortran90 array descriptors describe allocatable arrays, assumed-shape arrays, and pointers to arrays. Their representation in the symbol table is discussed in Section 5.3.8.9.

Modules provide another scoping level in Fortran90 programs. The symbol table representation for modules has not yet been implemented.

5.4.2 C++

C++ classes encapsulate functions and data inside a single structure. Classes are represented in the symbol table using a btClass basic type and the stBlock/stEnd scoping mechanism. See Section 5.3.8.6.

Templates provide for parameterized types. At present, no special symbol table values are related to templates. The template itself is not represented; rather, entries that correspond to each instantiation are generated. Template instantiations are distinguished by mangled names based on their type signatures.

C++ namespaces, like Fortran modules, offer an additional scope for program identifiers. Again, they are not yet implemented in the symbol table.

The C++ concepts of private, protected, and public data attributes are not currently represented in the symbol table. The C++ concept of "friend" classes and functions is also not represented.

5.4.3 Pascal and Ada

Pascal conformant arrays are function parameters with array dimensions that are determined by the arguments passed to the function at run time. See Section 5.3.8.10.

Variant records are an extension of the record data structure. Variant records allow different sets of fields depending on the value of a particular record member. See Section 5.3.8.11.

Nested procedures are supported in these languages. They are represented using standard scoping mechanisms discussed in Section 5.3.6 and uplevel references described in Section 5.3.4.4.

Sets and subranges are user-defined subsets of ordinal types. Sets are unordered groups of elements, which can be manipulated with the classic set operations. Subranges are ordered and are used with the usual operators. See Section 5.3.8.12 and Section 5.3.8.13.

Ada subtypes of ordinal types are represented in the same manner as Pascal subranges.