Symbol Resolution and Relocation


11/11/18 http://code.google.com/p/mclinker MCLinker

[Figure: overview of MCLinker. Labels: inputs (object file, archive, bitcode); object reader, archive reader, bitcode reader; MCLDFile (symbol table, sections a, b, c, relocation); MCLinker (symbol resolution, relocation); output (symbol table, sections, relocation).]

The essentials of MCLinker
– Symbol Resolution
  ˙ Resolving references across symbols
  ˙ Merging multiple symbol tables into one
– Relocation
  ˙ Performing section merge
  ˙ Resolving all resolvable relocations
  ˙ Replacing symbolic references with actual addresses (binding)


MCLDFile
█ Instances of MCLDFile are the inputs and outputs of MCLinker
█ MCLDFile provides a consistent abstraction of object files across a variety of targets and formats. It has:
  ● symbol tables
  ● sections
  ● relocation entries
█ Linking operations on MCLDFile are efficient
  ● Looking up symbols is fast and uses limited memory
  ● With memory mapped I/O, physical memory usage is kept as low as possible
  ● With loading on demand, virtual memory usage is kept as low as possible


Symbol Table In MCLinker (1/2)
● MCLinker avoids copying symbols between symbol tables
  ● Symbol tables record only references to symbols, not instances
  ● A common symbol pool stores all symbol instances of the different symbol tables
● MCLinker keeps the number of walks over symbol tables as low as possible
  ● MCLinker avoids a separate pass for merging symbol tables during symbol resolution
    – MCLinker resolves symbols at the same time as it reads the symbol tables of the inputs
  ● MCLinker only visits the symbols it needs
    – Dynamic and common symbols are grouped into different sets


Symbol Table In MCLinker (2/2)

[Figure: Symbol Table 1 and Symbol Table 2 point into the Common Symbol Pool. Pool labels: Dynamic Common, Non-Dynamic Common, Define symbols, Reference symbols.]

● Symbols are resolved when a symbol is added to the common symbol pool
● Symbols are grouped into different categories
● Symbol tables store only references to symbols
● The common symbol pool records the real instances of symbols
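The split between a pool that owns symbol instances and tables that hold only references can be sketched as follows. This is a minimal illustration, not MCLinker's code: `SymbolPool`, `SymbolTable`, and the two-field `Symbol` are hypothetical names, and the "resolve on insert" step is reduced to a toy rule (a definition replaces an undefined entry).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, heavily reduced symbol: just a name and a "defined" flag.
struct Symbol {
  std::string name;
  bool defined;
};

// The pool owns the single instance of every symbol name.
class SymbolPool {
public:
  // Adds the symbol, or merges it into the existing instance of that name.
  // Returns a pointer that stays valid for the lifetime of the pool.
  Symbol* insert(const Symbol& in) {
    auto it = index_.find(in.name);
    if (it == index_.end()) {
      storage_.push_back(in);
      it = index_.emplace(in.name, &storage_.back()).first;
    } else if (in.defined && !it->second->defined) {
      it->second->defined = true;  // toy resolution rule (assumption)
    }
    return it->second;
  }

  std::size_t size() const { return storage_.size(); }

private:
  // std::deque::push_back keeps pointers to existing elements valid.
  std::deque<Symbol> storage_;
  std::unordered_map<std::string, Symbol*> index_;
};

// A per-file symbol table stores references only, never copies.
using SymbolTable = std::vector<Symbol*>;
```

Two tables that mention the same name end up holding the same pointer, so no symbol is copied and no later merge pass is needed in this sketch.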


Symbol in MCLinker
● MCLinker defines a format-independent abstraction of symbols, called LDSymbol
  ● Supports Mach-O, ELF, and COFF
  ● Supports both 32- and 64-bit
● MCLinker transforms symbols of different formats into LDSymbol, as in the following figure:

[Figure: fields of each symbol format and of LDSymbol]
Mach-O Nlist   n_un, n_type, n_desc, n_sect, n_value
ELF Symbol     st_name, st_info, st_shndx, st_value, st_size, st_other
COFF Symbol    Name, Type, StorageClass, SectionNum, Value, NumAux
LDSymbol       name, is_dyn : 1, type : 2, bind : 2, in_section, value : 64, size : 64, other : 8
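As an illustration of the transformation for the ELF case, the sketch below maps an ELF64 symbol entry onto a structure with the LDSymbol fields listed above. The bit widths follow the slide; everything else is an assumption made for the example: the `Elf64Sym` stand-in struct, the enum encodings, the type of `in_section`, and the function name `fromElf`. The ELF constants used (SHN_UNDEF = 0, SHN_COMMON = 0xfff2, binding in the upper four bits of st_info) come from the ELF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for the ELF64 symbol table entry (hypothetical name, standard fields).
struct Elf64Sym {
  std::uint32_t st_name;   // offset of the name in the string table
  std::uint8_t  st_info;   // binding (upper 4 bits) and type (lower 4 bits)
  std::uint8_t  st_other;
  std::uint16_t st_shndx;  // index of the section the symbol is defined in
  std::uint64_t st_value;
  std::uint64_t st_size;
};

// Encodings are assumptions; the slide only gives the bit widths.
enum SymType : unsigned { kReference = 0, kCommon = 1, kDefined = 2 };
enum SymBind : unsigned { kLocal = 0, kGlobal = 1, kWeak = 2 };

struct LDSymbol {
  std::string   name;
  unsigned      is_dyn : 1;
  unsigned      type   : 2;
  unsigned      bind   : 2;
  std::uint16_t in_section;  // assumed here to be a section index
  std::uint64_t value;
  std::uint64_t size;
  std::uint8_t  other;
};

LDSymbol fromElf(const Elf64Sym& in, const char* strtab, bool from_shared_object) {
  assert(strtab != nullptr);
  LDSymbol out{};
  out.name = strtab + in.st_name;
  out.is_dyn = from_shared_object ? 1u : 0u;

  if (in.st_shndx == 0)            out.type = kReference;  // SHN_UNDEF
  else if (in.st_shndx == 0xfff2)  out.type = kCommon;     // SHN_COMMON
  else                             out.type = kDefined;

  switch (in.st_info >> 4) {       // ELF binding
    case 1:  out.bind = kGlobal; break;  // STB_GLOBAL
    case 2:  out.bind = kWeak;   break;  // STB_WEAK
    default: out.bind = kLocal;  break;  // STB_LOCAL and others
  }

  out.in_section = in.st_shndx;
  out.value = in.st_value;
  out.size  = in.st_size;
  out.other = in.st_other;
  return out;
}
```

A Mach-O or COFF reader would need its own mapping function onto the same structure; those are not sketched here.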


Symbol Resolution
● Steps
  ● Get an input symbol from an input file
  ● If no symbol in the output symbol table has the same name, add the input symbol to the output symbol table
  ● Otherwise, compare the input symbol with the existing output symbol according to Table 1
  ● Discard the input symbol or override the output symbol, depending on the result of the comparison

Table 1. The priorities of attribute values in symbol comparison

Attribute    Priority of attribute values
is_dyn       not a dynamic object > is a dynamic object
type         defined > common > reference
bind         global > weak
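The steps and Table 1 can be sketched as a single insert-or-compare function. Two things here are assumptions rather than statements from the slides: the attributes are compared in the order the table lists them (is_dyn, then type, then bind), and a tie keeps the existing output symbol. The names `Symbol`, `OutputSymbolTable`, `resolve`, and the `origin` field are hypothetical, and the sketch does not diagnose errors such as duplicate definitions.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <unordered_map>

// Enumerator values encode the priorities of Table 1 (larger wins).
enum class SymType { Reference = 0, Common = 1, Defined = 2 };
enum class SymBind { Weak = 0, Global = 1 };

struct Symbol {
  std::string name;
  bool        is_dyn;
  SymType     type;
  SymBind     bind;
  std::string origin;  // hypothetical: which input file supplied the symbol
};

// Priority key; tuples compare lexicographically (assumed attribute order).
static std::tuple<int, int, int> priority(const Symbol& s) {
  return std::make_tuple(s.is_dyn ? 0 : 1,
                         static_cast<int>(s.type),
                         static_cast<int>(s.bind));
}

using OutputSymbolTable = std::unordered_map<std::string, Symbol>;

// Returns true if the input symbol was added or overrode the output symbol,
// false if it was discarded.
bool resolve(OutputSymbolTable& out, const Symbol& in) {
  auto it = out.find(in.name);
  if (it == out.end()) {          // no symbol with the same name yet
    out.emplace(in.name, in);
    return true;
  }
  if (priority(in) > priority(it->second)) {
    it->second = in;              // input wins: override
    return true;
  }
  return false;                   // existing symbol wins or ties: discard
}
```

Calling `resolve` once per symbol while reading each input means the output table is complete as soon as the last input has been read.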


Sections in MCLinker
● MCLDFile reuses the definitions of sections in the LLVM machine code (MC) layer
  ● MCSection has the attributes (name, type, …) of a section
  ● MCSectionData records the size and offset of a section
  ● MCFragment is the storage of data
● Readers in MCLinker transform only the sections holding the information defined by the program into MCSection
  ● In general, readers transform only text and data sections
  ● In ELF, readers transform only sections with the SHT_PROGBITS and SHF_ALLOC attributes
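For the ELF rule above, the filter a reader applies could look like the following sketch. The constant values are those of the ELF specification (SHT_PROGBITS = 1, SHF_ALLOC = 0x2); the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kShtProgbits = 1;    // SHT_PROGBITS
constexpr std::uint64_t kShfAlloc    = 0x2;  // SHF_ALLOC

// Only the two section header fields the filter needs.
struct SectionHeader {
  std::uint32_t sh_type;
  std::uint64_t sh_flags;
};

// True when the section has type SHT_PROGBITS and the SHF_ALLOC flag set.
constexpr bool shouldTransform(const SectionHeader& h) {
  return h.sh_type == kShtProgbits && (h.sh_flags & kShfAlloc) != 0;
}
```

With this predicate a typical `.text` section (PROGBITS, ALLOC and EXECINSTR flags) passes, while a `.comment` section (PROGBITS without ALLOC) does not.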


Relocation Entries in MCLinker
● MCLinker defines a format-independent relocation called LDRelocation
  ● Supports Mach-O, ELF, and COFF
● Like BFD, MCLinker uses a target-independent relocation algorithm for all targets
  ● LDRelocation has a target-independent data structure, "howto", that describes how to apply a relocation
  ● This relieves the porting effort of implementing separate relocation functions for every target

[Figure: fields of LDRelocation and howto]
LDRelocation   symbol, offset, addend, howto
howto          type, right_shift, size, bit_size, pcrel, bit_position, overflow, target_callback, src_mask, dst_mask, pcrel_offset

● TargetBackend additionally provides target-dependent relocation functions to improve performance as needed


Applying Relocations by "howto"
● Steps
  1. Compute the relocation value
     – Relocation = S + A - P
     – S : the value of the symbol
     – A : the value of the addend
     – P : the value derived from the offset
  2. Shift the relocation value by right_shift (>>=) and bit_position (<<=)
  3. Apply the relocation value (as in the following figure)
  4. Write back the final result into the address of the symbol

[Figure: bit-level example of applying a relocation to the instruction SUB R0, R1, #1024 (fields: Rn, Rd, Rotate, Immed 8). Labels, in order: src_mask, and, dst_mask, sum, and, final value of relocation (offset + addend + symbol address), result low, ~dst_mask, result high, and, final result.]
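The four steps can be sketched for a 32-bit instruction word as below. This follows the general BFD-style scheme rather than MCLinker's actual source: `Howto` holds only a hypothetical subset of the fields listed earlier, the value is treated as unsigned 64-bit arithmetic, callers pass P = 0 for an absolute relocation, overflow checking is omitted, and the function returns the patched word instead of writing it to memory.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the "howto" fields used by this sketch.
struct Howto {
  unsigned      right_shift;
  unsigned      bit_position;
  std::uint32_t src_mask;  // bits of the old word that contribute to the sum
  std::uint32_t dst_mask;  // bits of the word the relocation may overwrite
};

std::uint32_t applyRelocation(std::uint32_t word,  // current contents of the patched location
                              std::uint64_t S,     // value of the symbol
                              std::int64_t  A,     // addend
                              std::uint64_t P,     // value derived from the offset
                              const Howto&  h) {
  // Step 1: Relocation = S + A - P (unsigned wrap-around is well defined).
  std::uint64_t value = S + static_cast<std::uint64_t>(A) - P;

  // Step 2: shift into position.
  value >>= h.right_shift;
  value <<= h.bit_position;

  // Step 3: sum with the bits selected by src_mask, keep only dst_mask bits,
  // and preserve every other bit of the original word.
  std::uint32_t sum  = static_cast<std::uint32_t>(value) + (word & h.src_mask);
  std::uint32_t low  = sum & h.dst_mask;
  std::uint32_t high = word & ~h.dst_mask;

  // Step 4: the combined word is what gets written back.
  return high | low;
}
```

For example, a 24-bit word-offset field would use `Howto{2, 0, 0, 0x00ffffff}`: the byte distance is divided by four and stored in the low 24 bits while the upper 8 bits of the word are preserved.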


Memory Allocation Policy in MCLinker
● MCLinker has its own memory allocator, called MemoryArea
● Unfortunately, we do not directly use LLVM MemoryBuffer
  ● Linkers' demands on the memory allocation policy are different from compilers'
    ● The average size of object files is different from that of source files
    ● Linkers perform more file operations than compilers. Linkers' performance is more sensitive to the usage of memory mapped I/O
  ● LLVM MemoryBuffer is designed for compilers, not linkers
    ● The average size of the members in libc.a is less than, but close to, one page
    ● However, LLVM MemoryBuffer uses memory mapped I/O only when the request is larger than four pages

Policy                               Advantage                              Disadvantage
Memory mapped I/O: mmap()            ~2x faster file copy                   1. Start address must be on a page boundary
                                                                            2. Memory size must be a multiple of the page size
Dynamic memory: malloc() + read()    No constraints on either the start     Slow file copy
                                     address or the requested size
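The two mmap() constraints in the table translate into a small piece of arithmetic: the file offset is rounded down to a page boundary and the length is rounded up to whole pages. The sketch below shows only that arithmetic; `MapRange` and `alignForMmap` are hypothetical names, and the page size is passed in rather than queried from the system.

```cpp
#include <cassert>
#include <cstdint>

struct MapRange {
  std::uint64_t map_offset;  // page-aligned offset to map from
  std::uint64_t map_length;  // length to map, a multiple of the page size
  std::uint64_t delta;       // where the requested bytes start inside the mapping
};

// page_size must be a power of two.
MapRange alignForMmap(std::uint64_t offset, std::uint64_t length,
                      std::uint64_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  const std::uint64_t mask  = page_size - 1;
  const std::uint64_t base  = offset & ~mask;                  // round down
  const std::uint64_t delta = offset - base;
  const std::uint64_t len   = (delta + length + mask) & ~mask; // round up
  return MapRange{base, len, delta};
}
```

A 100-byte request at offset 5000 with 4096-byte pages therefore maps a full page starting at offset 4096, which is the kind of overhead that makes mmap() less attractive for small requests.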


Components of MemoryArea (1/2)
Three layers of MemoryArea
● MemoryArea
  ● Clients request virtual memory space from MemoryArea
  ● MemoryArea creates MemorySpaces and MemoryRegions to satisfy clients' requests
  ● MemoryArea decides whether to use dynamic memory or memory mapped I/O
● MemorySpace
  ● MemorySpace is a container for a non-overlapping, contiguous range of virtual addresses
  ● Virtual memory is allocated by either "malloc" or memory mapped I/O
  ● Clients do not access memory directly through MemorySpace. Instead, they access memory through MemoryRegions
● MemoryRegion
  ● MemoryRegion marks a range of virtual memory space in a MemorySpace
  ● Clients access memory through MemoryRegions
  ● Several MemoryRegions can map to the same MemorySpace
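The three layers can be sketched as three small classes. The class names come from the slide, but every member below is an assumption made for illustration, and a `std::vector` stands in for the malloc- or mmap-backed storage so the example stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// A contiguous block of bytes. A vector stands in for malloc() or mmap().
class MemorySpace {
public:
  explicit MemorySpace(std::size_t size) : bytes_(size) {}
  std::uint8_t* data() { return bytes_.data(); }
  std::size_t size() const { return bytes_.size(); }
private:
  std::vector<std::uint8_t> bytes_;
};

// A window onto part of a MemorySpace; the only way clients touch memory.
class MemoryRegion {
public:
  MemoryRegion(std::shared_ptr<MemorySpace> space,
               std::size_t offset, std::size_t length)
      : space_(std::move(space)), offset_(offset), length_(length) {
    assert(offset_ + length_ <= space_->size());
  }
  std::uint8_t* begin() { return space_->data() + offset_; }
  std::size_t size() const { return length_; }
private:
  std::shared_ptr<MemorySpace> space_;
  std::size_t offset_;
  std::size_t length_;
};

// Creates MemorySpaces and hands out MemoryRegions over them.
class MemoryArea {
public:
  // Allocates a new space and returns a region covering all of it.
  MemoryRegion request(std::size_t size) {
    spaces_.push_back(std::make_shared<MemorySpace>(size));
    return MemoryRegion(spaces_.back(), 0, size);
  }
  // Returns another region over an existing space (regions may overlap).
  MemoryRegion view(std::size_t space_index,
                    std::size_t offset, std::size_t length) {
    return MemoryRegion(spaces_.at(space_index), offset, length);
  }
private:
  std::vector<std::shared_ptr<MemorySpace>> spaces_;
};
```

Because regions share ownership of their space, a write through one region is visible through any other region that covers the same bytes.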


Components of MemoryArea (2/2)

[Figure: LDObjectReader accesses MemoryRegions; MemoryArea contains MemoryRegions and MemorySpaces; MemorySpaces are backed by mmap or dynamic memory and filled from a file in secondary storage.]

● Parts of a file are copied to a MemorySpace by memory mapped I/O or dynamic memory
● LDObjectReader reads or writes memory
● The MemorySpaces mapped by MemoryRegions may overlap
● Every MemoryRegion maps to a MemorySpace


How does MemoryArea allocate memory?

[Figure: LDObjectReader, MemoryRegion, MemoryArea, MemorySpace, and secondary storage, annotated with the steps below.]

1. The reader requests memory space of the specified size
2. MemoryArea decides the memory allocation policy by request size and threshold, as in Table 2
3. A MemorySpace is mapped to a MemoryRegion
4. The reader reads and writes memory only through the MemoryRegion

Table 2.

                              Request Size >= Threshold     Request Size < Threshold
Memory Policy                 Using memory mapped I/O       Allocating dynamic memory
Allocated MemorySpace Size    Page alignment                As requested size
Feature                       Fast memory read and write    Reducing memory fragments
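The decision in Table 2 reduces to one comparison. In the sketch below the threshold and page size are parameters because the slides do not give their values; `Policy`, `Allocation`, and `choosePolicy` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

enum class Policy { MemoryMappedIO, DynamicMemory };

struct Allocation {
  Policy      policy;
  std::size_t space_size;  // size of the MemorySpace that would be allocated
};

// page_size must be a power of two.
Allocation choosePolicy(std::size_t request, std::size_t threshold,
                        std::size_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  if (request >= threshold) {
    // Large request: memory mapped I/O, size rounded up to whole pages.
    const std::size_t mask = page_size - 1;
    return Allocation{Policy::MemoryMappedIO, (request + mask) & ~mask};
  }
  // Small request: dynamic memory of exactly the requested size.
  return Allocation{Policy::DynamicMemory, request};
}
```

Keeping small requests on dynamic memory avoids spending a whole page on, say, a 100-byte read, which matches the "reducing memory fragments" entry in the table.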