MCLinker - Symbol Resolution
Symbol Resolution and Relocation
11/11/18 http://code.google.com/p/mclinker MCLinker
Symbol Resolution and Relocation
[Figure: overview of the linking flow. The object reader, archive reader, and bitcode reader turn object files, archives, and bitcode into MCLDFile instances, each with a symbol table, sections (a, b, c), and relocations. MCLinker applies symbol resolution and relocation to them and produces an output MCLDFile with a symbol table, sections, and relocations.]
The essentials of MCLinker
– Symbol Resolution
  ˙ Resolving references across symbols
  ˙ Merging multiple symbol tables into one
– Relocation
  ˙ Performing section merge
  ˙ Resolving all resolvable relocations
  ˙ Replacing symbolic references with actual addresses (binding)
MCLDFile
█ Instances of MCLDFile are the inputs and outputs of MCLinker
█ MCLDFile provides a consistent abstraction of object files across a variety of targets and formats. It has:
  ● symbol tables
  ● sections
  ● relocation entries
█ Linking operations on MCLDFile are efficient
  ● Looking up symbols is fast and uses limited memory
  ● With memory-mapped I/O, physical memory usage is kept as low as possible
  ● With on-demand loading, virtual memory usage is kept as low as possible
Symbol Table In MCLinker (1/2)
● MCLinker avoids copying symbols between symbol tables
  ● Symbol tables record only references to symbols, not instances
  ● A common symbol pool stores the symbol instances of all symbol tables
● MCLinker keeps the number of walks over symbol tables as low as possible
  ● MCLinker does not need a separate pass that merges symbol tables before symbol resolution
    – MCLinker resolves symbols while it reads the symbol tables of the inputs
  ● MCLinker visits only the symbols it needs
    – Dynamic and common symbols are grouped into different sets
Symbol Table In MCLinker (2/2)
[Figure: Symbol Table 1 and Symbol Table 2 point into the Common Symbol Pool. Pool groups shown: Dynamic Common, Non-Dynamic Common, Define symbols, Reference symbols.]
● Symbol tables store only references to symbols
● The common symbol pool records the real instances of symbols
● Symbols are resolved when a symbol is added to the common symbol pool
● Symbols are grouped into different categories
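The split between symbol tables (references only) and the common symbol pool (real instances) can be pictured with a small Python sketch. It is only an illustration under assumptions: `SymbolPool`, `SymbolTable`, and `intern` are hypothetical names, not MCLinker's classes.

```python
class SymbolPool:
    """Hypothetical common symbol pool: owns the one real instance per name."""

    def __init__(self):
        self._instances = {}  # name -> the single stored symbol record

    def intern(self, name):
        # Return the existing instance if the name is known, else create it.
        return self._instances.setdefault(name, {"name": name})

    def __len__(self):
        return len(self._instances)


class SymbolTable:
    """Per-file symbol table that stores references only, never copies."""

    def __init__(self, pool):
        self._pool = pool
        self.refs = []

    def add(self, name):
        sym = self._pool.intern(name)
        self.refs.append(sym)  # a reference to the pooled instance
        return sym


pool = SymbolPool()
table1, table2 = SymbolTable(pool), SymbolTable(pool)
a = table1.add("printf")
b = table2.add("printf")  # same pooled object as `a`, nothing is copied
```

Both tables end up holding the same object, so the pool contains a single `printf` no matter how many inputs mention it.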
Symbol in MCLinker
● MCLinker defines a format-independent abstraction of symbols, called LDSymbol
  ● Supports Mach-O, ELF, and COFF
  ● Supports both 32- and 64-bit
● MCLinker transforms symbols of different formats into LDSymbol, as shown in the following figure:
[Figure: symbol records of each format and the LDSymbol they are transformed into]
Mach-O Nlist: n_un, n_type, n_desc, n_sect, n_value
ELF Symbol:   st_name, st_info, st_shndx, st_value, st_size, st_other
COFF Symbol:  Name, Type, StorageClass, SectionNum, Value, NumAux
LDSymbol:     name, is_dyn : 1, type : 2, bind : 2, in_section, value : 64, size : 64, other : 8
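As a rough illustration of such a transformation, the Python sketch below turns one little-endian `Elf64_Sym` record into an LDSymbol-like object. The ELF constants and record layout come from the ELF specification. The class `LDSymbolSketch`, the function `from_elf64`, and the rule that derives "defined", "common", or "reference" from `st_shndx` are assumptions made for this example, not MCLinker's code.

```python
import struct
from dataclasses import dataclass

SHN_UNDEF, SHN_COMMON = 0, 0xFFF2                   # ELF special section indexes
ELF64_SYM = struct.Struct("<IBBHQQ")                # little-endian Elf64_Sym, 24 bytes
BIND_NAMES = {0: "local", 1: "global", 2: "weak"}   # STB_* values


@dataclass
class LDSymbolSketch:
    name_index: int   # st_name: offset into the string table
    is_dyn: bool
    type: str         # "defined", "common", or "reference"
    bind: str
    in_section: int
    value: int
    size: int
    other: int


def from_elf64(raw, is_dyn=False):
    """Map one raw Elf64_Sym record onto the format-independent fields."""
    st_name, st_info, st_other, st_shndx, st_value, st_size = ELF64_SYM.unpack(raw)
    # Assumed mapping: the section index tells whether the symbol is defined.
    if st_shndx == SHN_UNDEF:
        sym_type = "reference"
    elif st_shndx == SHN_COMMON:
        sym_type = "common"
    else:
        sym_type = "defined"
    bind = BIND_NAMES.get(st_info >> 4, "other")    # binding is the high nibble
    return LDSymbolSketch(st_name, is_dyn, sym_type, bind,
                          st_shndx, st_value, st_size, st_other)
```

A Mach-O or COFF reader would fill the same fields from `n_type`/`n_sect` or `StorageClass`/`SectionNum`, so later linking stages never need to look at the original format.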
Symbol Resolution
● Steps
  ● Get an input symbol from an input file
  ● If no symbol in the output symbol table has the same name, add the input symbol to the output symbol table
  ● Otherwise, compare the input symbol with the existing output symbol according to Table 1
  ● Depending on the result of the comparison, either discard the input symbol or override the output symbol
Table 1. The priorities of attribute values in symbol comparison
Attribute   Priority of attribute values
is_dyn      not a dynamic object > is a dynamic object
type        defined > common > reference
bind        global > weak
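The resolution steps and Table 1 can be sketched in a few lines of Python. This is a sketch under stated assumptions rather than MCLinker's implementation: the slides do not say in which order the attributes are compared, so the code assumes the order in which Table 1 lists them, and on a tie it simply keeps the existing symbol. The names `Symbol`, `priority`, and `resolve` are hypothetical.

```python
from dataclasses import dataclass

# Higher rank wins, following Table 1.
TYPE_RANK = {"defined": 2, "common": 1, "reference": 0}
BIND_RANK = {"global": 1, "weak": 0}


@dataclass
class Symbol:
    name: str
    is_dyn: bool
    type: str
    bind: str
    origin: str = ""


def priority(sym):
    # Assumption: attributes are compared in the order Table 1 lists them.
    return (0 if sym.is_dyn else 1, TYPE_RANK[sym.type], BIND_RANK[sym.bind])


def resolve(output, sym):
    """Add sym to the output table, or keep whichever symbol has priority."""
    old = output.get(sym.name)
    if old is None or priority(sym) > priority(old):
        output[sym.name] = sym   # new name, or the input overrides the output
    return output[sym.name]      # otherwise the input symbol is discarded


out = {}
resolve(out, Symbol("foo", False, "reference", "global", "a.o"))
resolve(out, Symbol("foo", False, "defined", "weak", "b.o"))      # overrides a.o
resolve(out, Symbol("foo", True, "defined", "global", "libc.so")) # discarded
```

Because each input symbol is resolved at the moment it is added, the output table is complete as soon as the last input has been read. A production linker would also have to diagnose cases such as two equally strong definitions, which this sketch ignores.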
Sections in MCLinker
● MCLDFile reuses the section definitions of the LLVM machine code (MC) layer
  ● MCSection has the attributes (name, type, …) of a section
  ● MCSectionData records the size and offset of a section
  ● MCFragment is the storage of data
● Readers in MCLinker transform into MCSection only the sections that hold information defined by the program
  ● In general, readers transform only text and data sections
  ● In ELF, readers transform only sections of type SHT_PROGBITS that have the SHF_ALLOC flag
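The ELF rule above amounts to a simple predicate over section headers. The Python sketch below uses the constant values from the ELF specification; the function name `is_program_section` and the sample header list are made up for illustration.

```python
SHT_PROGBITS = 1   # ELF section type: program-defined contents
SHF_ALLOC = 0x2    # ELF section flag: occupies memory at run time


def is_program_section(sh_type, sh_flags):
    """True for sections a reader would turn into an MCSection (per the slide)."""
    return sh_type == SHT_PROGBITS and bool(sh_flags & SHF_ALLOC)


# (name, sh_type, sh_flags) triples as they might appear in a section header table.
headers = [
    (".text", 1, 0x6),      # PROGBITS, ALLOC | EXECINSTR
    (".data", 1, 0x3),      # PROGBITS, WRITE | ALLOC
    (".comment", 1, 0x30),  # PROGBITS, MERGE | STRINGS (not allocated)
    (".symtab", 2, 0x0),    # SHT_SYMTAB
    (".bss", 8, 0x3),       # SHT_NOBITS
]
kept = [name for name, t, f in headers if is_program_section(t, f)]
```

With these sample headers only `.text` and `.data` pass the filter.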
Relocation Entries in MCLinker
● MCLinker defines a format-independent relocation called LDRelocation
  ● Supports Mach-O, ELF, and COFF
● Like BFD, MCLinker uses a target-independent relocation algorithm for all targets
  ● LDRelocation has a target-independent data structure, "howto", that describes how to apply a relocation
  ● This relieves porting effort: a new target does not have to implement its own set of relocation functions
[Figure: fields of LDRelocation and of howto]
LDRelocation: symbol, offset, addend, howto
howto:        type, right_shift, size, bit_size, pcrel, bit_position, overflow, target_callback, src_mask, dst_mask, pcrel_offset
● TargetBackend additionally provides target-dependent relocation functions to improve performance as needed
Applying Relocations by "howto"
● Steps
  1. Compute the relocation value
     – Relocation = S + A – P
     – S : the value of the symbol
     – A : the value of addend
     – P : the value derived from offset
  2. Shift the relocation value by right_shift (>>=) and bit_position (<<=)
  3. Apply the relocation value (as in the following figure)
  4. Write back the final result into the address of the symbol
[Figure: applying a relocation to the ARM instruction SUB R0, R1, #1024 (fields Rn, Rd, Rotate, Immed 8). The instruction word is ANDed with src_mask and summed with the final value of relocation (offset + addend + symbol address); the sum ANDed with dst_mask gives "result low". The instruction word ANDed with ~dst_mask gives "result high". Result high and result low are combined into the final result.]
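The four steps can be written as one short function. The Python sketch below follows the BFD-style generic pattern that the slides allude to. It is an illustration, not MCLinker's code: `Howto` keeps only the fields the arithmetic needs, `apply_relocation` is a hypothetical name, and subtracting P only when `pcrel` is set is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Howto:
    right_shift: int
    bit_position: int
    src_mask: int
    dst_mask: int
    pcrel: bool = False


def apply_relocation(word, howto, symbol_value, addend, place):
    """Return the patched instruction/data word; the caller writes it back."""
    # Step 1: S + A - P (assumption: P is subtracted only when pcrel is set).
    value = symbol_value + addend - (place if howto.pcrel else 0)
    # Step 2: drop low bits, then move the value to its field position.
    value >>= howto.right_shift
    value <<= howto.bit_position
    # Step 3: add the in-place addend selected by src_mask, keep dst_mask bits,
    # and preserve every bit of the word outside dst_mask.
    low = ((word & howto.src_mask) + value) & howto.dst_mask
    high = word & ~howto.dst_mask
    return high | low


abs8 = Howto(right_shift=0, bit_position=0, src_mask=0xFF, dst_mask=0xFF)
patched = apply_relocation(0xAABBCC01, abs8, symbol_value=0x10, addend=0x4, place=0)
```

Here the low byte already holds an in-place addend of 1, so the patched word is `0xAABBCC15`. Only the `Howto` values change from one relocation type to another, which is why a target can describe most of its relocations as data instead of code.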
Memory Allocation Policy in MCLinker
● MCLinker has its own memory allocator, called MemoryArea
● Unfortunately, we do not directly use LLVM MemoryBuffer
  ● Linkers' demands on the memory allocation policy differ from compilers'
    ● The average size of object files differs from that of source files
    ● Linkers perform more file operations than compilers, so their performance is more sensitive to the use of memory-mapped I/O
  ● LLVM MemoryBuffer is designed for compilers, not linkers
    ● The average size of the members of libc.a is less than, but close to, one page
    ● However, LLVM MemoryBuffer uses memory-mapped I/O only when the request is larger than four pages

Policy                             Advantage                             Disadvantage
Memory-mapped I/O: mmap()          ~2x faster file copy                  1. Start address must be on a page boundary
                                                                         2. Memory size must be a multiple of the page size
Dynamic memory: malloc() + read()  No constraints on either the start    Slow file copy
                                   address or the requested size
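The alignment constraint of memory-mapped I/O is easy to see in code. The Python sketch below reads the same byte range of a file in both ways. It uses the standard `mmap` module, whose `offset` argument must be a multiple of `mmap.ALLOCATIONGRANULARITY`. The helper names `read_range_mmap` and `read_range_malloc` are hypothetical, and the sketch shows only the start-address constraint.

```python
import mmap


def read_range_mmap(path, offset, size):
    """Read size bytes at offset by mapping from the enclosing aligned offset."""
    gran = mmap.ALLOCATIONGRANULARITY      # required alignment of the map offset
    aligned = (offset // gran) * gran      # round the start down to a boundary
    delta = offset - aligned               # where the request begins in the map
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + size,
                       access=mmap.ACCESS_READ, offset=aligned) as m:
            return m[delta:delta + size]


def read_range_malloc(path, offset, size):
    """Read the same bytes into an ordinary buffer: no alignment constraints."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)
```

The mapped variant has to map extra leading bytes whenever the requested offset is not aligned, which is the kind of overhead that matters when most requests are smaller than a page.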
Components of MemoryArea (1/2)
Three layers of MemoryArea
● MemoryArea
  ● Clients request virtual memory space from MemoryArea
  ● MemoryArea creates MemorySpaces and MemoryRegions to satisfy clients' requests
  ● MemoryArea decides whether to use dynamic memory or memory-mapped I/O
● MemorySpace
  ● MemorySpace is a container of a non-overlapping, contiguous range of virtual addresses
  ● Virtual memory is allocated by either "malloc" or memory-mapped I/O
  ● Clients do not access memory directly through MemorySpace. Instead, they access memory through MemoryRegions
● MemoryRegion
  ● MemoryRegion marks a range of virtual memory space in a MemorySpace
  ● Clients access memory through MemoryRegions
  ● Several MemoryRegions can map to the same MemorySpace
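The relationship between MemorySpace and MemoryRegion can be sketched with a buffer and views into it. This Python sketch borrows the class names from the slides, but the constructors and the `read`/`write` methods are assumptions, and a `bytearray` stands in for memory obtained from malloc or mmap.

```python
class MemorySpace:
    """A contiguous range of addresses backed by one buffer (sketch)."""

    def __init__(self, file_offset, size):
        self.file_offset = file_offset
        self.data = bytearray(size)   # stand-in for malloc'ed or mmap'ed memory


class MemoryRegion:
    """A window into a MemorySpace; the only way clients touch memory."""

    def __init__(self, space, start, size):
        self._view = memoryview(space.data)[start:start + size]

    def read(self, pos, n):
        return bytes(self._view[pos:pos + n])

    def write(self, pos, payload):
        self._view[pos:pos + len(payload)] = payload


space = MemorySpace(file_offset=0, size=64)
r1 = MemoryRegion(space, 0, 32)
r2 = MemoryRegion(space, 16, 32)   # overlaps r1 inside the same space
r1.write(16, b"ELF")               # visible through r2 at position 0
```

Because both regions are views of one space, a write through `r1` is visible through `r2` without any copy.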
Components of MemoryArea (2/2)
[Figure: LDObjectReader, MemoryArea, MemoryRegions, MemorySpaces (backed by mmap or dynamic memory), and a file in secondary storage]
● LDObjectReader reads or writes memory
● Every MemoryRegion maps to a MemorySpace
● The MemorySpace mapped by MemoryRegions may be overlapped
● Parts of a file are copied to a MemorySpace by memory-mapped I/O or dynamic memory
How does MemoryArea allocate memory?
[Figure: LDObjectReader, MemoryRegion, MemoryArea, MemorySpace, and 2nd storage, annotated with the steps below]
1. Requests memory space with the specified size
2. Decides memory allocation policy by size and threshold, as in Table 2
3. Map a MemorySpace to a MemoryRegion
4. Reader reads and writes memory only through MemoryRegion

Table 2.
                            Request Size >= Threshold    Request Size < Threshold
Memory Policy               Using memory mapped I/O      Allocating dynamic memory
Feature                     Fast memory read and write   Reducing memory fragments
Allocated MemorySpace Size  Page alignment               As requested size
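The policy decision of Table 2 fits in a few lines. In the Python sketch below, the page size and the threshold are assumed values chosen for the example (the slides do not state the threshold), and `Allocation` and `choose_allocation` are hypothetical names.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096          # assumed page size for the sketch
THRESHOLD = PAGE_SIZE     # hypothetical threshold; the slides do not give its value


@dataclass
class Allocation:
    policy: str   # "mmap" or "malloc"
    size: int     # size of the MemorySpace that would be allocated


def choose_allocation(request_size, threshold=THRESHOLD, page_size=PAGE_SIZE):
    """Pick the policy and MemorySpace size for one request, as in Table 2."""
    if request_size >= threshold:
        # Memory-mapped I/O: round the size up to a whole number of pages.
        pages = -(-request_size // page_size)
        return Allocation("mmap", pages * page_size)
    # Dynamic memory: allocate exactly what was requested.
    return Allocation("malloc", request_size)
```

For example, under these assumed values a 100-byte request stays on the dynamic-memory path at its requested size, while a 5000-byte request is mapped with a two-page MemorySpace.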