94
Compilation 03683133 2015/16a Lecture 11 Code generation, Assemblers, and linkers Noam Rinetzky 1

Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Compilation0368-­‐3133    2015/16a

Lecture  11Code  generation,  Assemblers,  and  linkers

Noam  Rinetzky

1

Page 2: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

AST  +  Sym.  Tab.

Stages  of  compilationSource  code

(program)

LexicalAnalysis

Syntax  Analysis

Parsing

ContextAnalysis

Portable/Retargetable code  generation

Target  code

(executable)

Assembly

IRText

Token  stream

AST

CodeGeneration

Page 3: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Simple  instructions

§ Load_Mema,  R1     //  R1  =  a§ Store_Reg R1,  a //  a  =  R1§ Add  R1,  R2 //  R2  =  R2+1§ Load_Const cst,  R1 //  R1  =  cst§ …

3

Page 4: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Not  So  Simple  instructions

§ Load_Mema,  R1     //  R1  =  a§ Store_Reg R1,  a //  a  =  R1§ Add  R1,  R2 //  R2  =  R2+1§ Load_Const cst,  R1 //  R1  =  cst§ …

§ Add_Scaled_Regc,  R1,R2 //  R2=  r2  +  c  *  R1

4

Page 5: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Not  So  Simple  instructions

§ Load_Mema,  R1     //  R1  =  a§ Store_Reg R1,  a //  a  =  R1§ Add  R1,  R2 //  R2  =  R2+1§ Load_Const cst,  R1 //  R1  =  cst§ …

§ Add_Mem a,  R1 //  R1=  a  +  R1  § more  efficient  than:

§ Load_mem a,  R2§ Add  R2,R1

5

Page 6: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Not  So  Simple  instructions

§ Load_Mema,  R1     //  R1  =  a§ Store_Reg R1,  a //  a  =  R1§ Add  R1,  R2 //  R2  =  R2+1§ Load_Const cst,  R1 //  R1  =  cst§ …

§ Add_Scaled_Regc,  R1,R2 //  R2=  R2  +  c  *  R1§ more  efficient  than:                                              //  Artificial  instruction

§ Mult_Const c,  R1§ Add  R1,R2

6

Page 7: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Goal

§ Observation:  Real  machines  have  hundreds  of  instructions§ +  different  addressing  modes

§ Goal:  Generate  efficient  code§ Use  available  instructions

§ Input:  AST§ Generated   from  the  parse  tree  using  simple  techniques

§ Output:  A  “more  efficient”  AST

7

Page 8: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Code  Generation  Algorithms

§ Top  down  (maximal  munch)§ Bottom  up  (dynamic  programming)

8

Page 9: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Instruction’s  AST:  Pattern  Tree

§ Load_Const cst,  R     //  cost=1

§ Load_Mem a,  R     //  cost=3

§ Add_Mem a,  R //  cost=3  

§ Add_Scaled_Reg cst,  R1,  R2 //  cost=4  

9

R

cst

R

a

R

R a

R1

R1 *

cst R2

+

constant  operand

register  operand

memory  location  operand

result  

Page 10: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Instruction’s  AST:  Pattern  Tree

§ Load_Const cst,  R     //  cost=1

§ Load_Mem a,  R     //  cost=3

§ Add_Mem a,  R //  cost=3  

§ Add_Scaled_Reg cst,  R1,  R2 //  cost=4  

10

R

cst

R

a

R

R aR1

R1 *

cst R2

+

#1

#2

#3

#7

#7.1

+

Page 11: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Instruction  costs

§ Measures  time  (cycle)  /  space  (bytes)

§ Models:§ Fixed,  

§ e.g.,    Load_Constant costs  1  cycles

§ context-­‐defendant  § e.g.,  Add_Scaled_Reg c,  R1,R2     //  R2=  R2  +  c  *  R1  

11

Page 12: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example  – Naïve  rewrite

12

Input  tree Naïve  Rewrite

Page 13: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example  – Naïve  rewrite

13

Input  tree Naïve  Rewrite

Page 14: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Top-­‐Down  Rewrite  Algorithm

§ aka  Maximal  Munch

§ Based  on  tiling

§ Start  from  the  root§ Choose  largest  tile§ (covers  largest  number  of  nodes)

§ Break  ties  arbitrarily  

§ Continue  recursively

14

Page 15: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Top-­‐down  largest  fit  rewrite

15

Input  tree TDLF-­‐Rewrite

Page 16: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Top-­‐down  largest  fit  rewrite

16

Input  tree TDLF-­‐Rewrite

Page 17: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Resulting  code

17

Naïve  Rewrite TDLF-­‐Rewrite

Page 18: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Top-­‐Down  Rewrite  Algorithm

§ Maximal  Munch

§ If  for  every  node  there  exists  a  single  node  tile  then  the  algorithm  does  not  get  stuck  

18

Page 19: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Challenges

§ How  to  find  all  possible  rewrites?

§ How  do  we  find  most  efficient  rewrite?§ And  do  it  efficiently?

19

Page 20: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

BURS:  Bottom  Up  Rewriting  System

§ How  to  find  all  possible  rewrites?§ Solution:  Bottom-­‐Up  pattern  matching

§ How  do  we  find  most  efficient  rewrite?§ And  do  it  efficiently?

§ Solution:  Dynamic  Programming

20

Page 21: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

BURS:  Bottom  Up  Rewriting  System

§ Instruction-­‐collecting  Scan§ Bottom  up§ Identify  possible  instructions  for  every  node§ Uses  pattern-­‐matching

§ Instruction-­‐selecting  Scan§ top-­‐down§ selects  one  (previously-­‐identified)   instruction  

§ Code-­‐generation  Scan§ bottom-­‐up§ Generate  code  (sequence  of  instructions) 21

Page 22: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Bottom-­‐up  Pattern  matching

§ “Tree  version”  of  lexical  analysis

§ Record  at  AST  node  N  a  label  L  of  a  pattern  I  if  it  is  possible  to  identify  (the  part  of  pattern  I  rooted  at)  L  at  node  N    

22

#6,#5

#1 #2

Page 23: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Labels

§ Instruction  leaves  result  of  expression  in  a  location§ register§ memory§ constant

§ The  location  determines  which  instructions  can  take  the  result  as  an  operand

§ Convention:    use  labels  of    the  form  Làlocation§ e.g.,    #7àreg§ Location  is  determined   by  Label§ àcst means  node   is  a  constant

23

Page 24: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Dotted  trees

§ Labels  can  be  seen  as  dotted  trees

24

Page 25: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Bottom  up  pattern  matching  

25

Page 26: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Bottom  up  pattern  matching

26

Page 27: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Bottom  up  pattern  matching

27

Page 28: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Bottom  up  pattern  matching

28

Page 29: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example

29

Page 30: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Pattern  matching  using  Tree  Automaton§ Storing  the  set  of  patterns  explicitly  is  redundant:§ Number  of  patterns   is  fixed  § Relation  between  pattern   is  know

§ Idea:  § Create  a  state  machine  over  sets  of  patterns

§ Input:  States  of  children  +  operator  of  root  § Output:  State  of  root

30

Page 31: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Pattern  matching  using  Tree  Automaton

31

Page 32: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Pattern  matching  using  Tree  Automaton§ In  theory,  may  lead  to  large  autoamton§ Hundreds  of  operators§ Thousands  of  states§ Two  dimensions

§ In  practice,  most  entries  are  empty

§ Question:§ How  to  handle  operators  with  1  operand?§ with  many  operands?

32

Page 33: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Instruction  Selection

33

We  can  build  many  trees!

Naïve  approach:Try  them  allSelect  cheapest

Problem:  Exponential  blowup

Page 34: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Instruction  Selection  with  Dynamic  Programming§ Cost  of  sub-­‐tree  is  sum  of  § The  cost  of  the  operator  § The  costs  of  the  operands

§ Idea:  Compute  the  cost  while  detecting  the  patterns

§ Label:  Labelà Location  @  cost§ E.g.,  #5àreg  @  3

34

Page 35: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example

35

1

3

3

1

6

4

4

5

cost cost

total  cost  8

total  cost  7

Page 36: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example

36

1

3

3

1

6

4

4

5

cost cost

total  cost  12

total  cost  9

Page 37: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example

37

1

3

3

1

6

4

4

5

cost cost

total  cost  14

total  cost  13

Page 38: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Instruction  Selection

38

We  can  build  the  cheapest   tree

• Select  label  at  root  based  on  location  of  result

• Select  label  at  children  based  on  expected  location  of  operands

Page 39: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Linearize  code

§ Standard  ASTàCode procedure§ E.g.,  create  the  register-­‐heavy   code  first

39

Page 40: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Commutativity

§ We  got  the  optimal  tree  wrt.  given  patterns

40

Using  commutativity

Solutions:Add  patternsMark  commutative  operations  and  handle  them  specially

Page 41: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Automatic  code  generators

§ Combining  pattern  matching  with  instruction  selection  leads  to  great  saving  

§ Possible  if  costs  are  constants

§ Creates  a  complicated  tree  automaton

41

Page 42: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Compilation0368-­‐3133    2014/15a

Lecture  11Assembler,  Linker  and  Loader

Noam  Rinetzky

42

Page 43: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

What  is  a  compiler?

“A  compiler  is  a  computer  program  that  transforms  source  code  written  in  a  programming  language  (source  language)  into  another  language  (target  language).

The  most  common  reason  for  wanting  to  transform  source  code  is  to  create  an  executable  program.”

-­‐-­‐Wikipedia

Page 44: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

AST  +  Sym.  Tab.

Stages  of  compilationSource  code

(program)

LexicalAnalysis

Syntax  Analysis

Parsing

ContextAnalysis

Portable/Retargetable code  generation

Target  code

(executable)

Assembly

IRText

Token  stream

AST

CodeGeneration

Page 45: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Compilation  è Execution

AST  +  Sym.  Tab.

Source  code

(program)

LexicalAnalysis

Syntax  Analysis

Parsing

ContextAnalysis

Portable/Retargetable code  generation

Target  code

(executable)

IRText

Token  stream

AST

CodeGeneration

Linker

Assembler

Loader

Symbo

lic  Add

r

Object  File

Executable  File

image

Executing  program

Runtime  S

ystem

Page 46: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Program  Runtime  State

Code

Static  Data

Stack

Heap

Registers 0x11000

0x22000

0x33000

0x99000

G,  extern_G

foo,  extern_fooprintf

x

0x88000

Page 47: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Challenges

§ goto L2  è JMP  0x110FF§ G:=3  èMOV  0x2200F,  0..011§ foo()è CALL  0x130FF§ extern_G :=  1  èMOV  0x2400F,  0..01§ extern_foo()  è CALL  0x140FF§ printf()  è CALL    0x150FF

§ x:=2  èMOV  FP+32,  0…010§ goto L2  è JMP  [PC  +]  0x000FF

Code

Static  DataStack

Heap

0x11000

0x22000

0x33000

0x99000

G,  extern_G

foo,  extern_fooprintf

x

0x88000

Page 48: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Assembly  è Image

Assembler

Compiler

Linker

Loader

Source  program

Assembly   lang.  program  (.s)

Machine  lang.  Module  (.o):  program  (+library)  modules  

Executable  (“.exe”):  

Image  (in  memory):  

“compilation”  time

“execution”   timeLibraries  (.o)

(dynamic  loading)

Page 49: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Outline

§ Assembly§ Linker  /  Link  editor§ Loader

§ Static  linking§ Dynamic  linking

Page 50: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Assembly  è Image

Linker

Loader

Assembler

Compiler

Source  file  (e.g., utils)

Assembly  (.s)

Executable  (“.elf”)

Image  (in  memory):  

Assembler

Compiler

Source  file  (e.g.,  main)

Assembly  (.s)

Assembler

Compiler

library

Assembly  (.s)

Object  (.o)Object  (.o) Object  (.o)

Page 51: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Assembler§ Converts  (symbolic)  assembler  to  binary  (object)  code

§ Object  files  contain  a  combination  of  machine   instructions,  data,  and  information  needed   to  place  instructions  properly  in  memory

§ Yet  another(simple)  compiler§ One-­‐to  one  translation

§ Converts  constants  to  machine  repr.  (3è0…011)§ Resolve  internal  references§ Records  info  for  code  &  data  relocation

Page 52: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Object  File  Format

§ Header:  Admin  info  +  “file  map”§ Text  seg.:  machine  instruction§ Data  seg.:  (Initialized)  data  in  machine  format§ Relocation  info:  instructions  and  data  that  depend  on  absolute  addresses

§ Symbol  table:  “exported”  references  +  unresolved  references

Header Text  Segment

DataSegment

Relocation  Information

Symbol  Table

Debugging  Information

Page 53: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Handling  Internal  Addresses

Page 54: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Resolving  Internal  Addresses

§ Two  scans  of  the  code§ Construct  a  table  label  → address§ Replace  labels  with  values

§ One  scan  of  the  code  (Backpatching)  § Simultaneously   construct  the  table  and  resolve  symbolic  addresses§ Maintains  list  of  unresolved   labels

§ Useful  beyond   assemblers

Page 55: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Backpatching

Page 56: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Handling  External  Addresses

§ Record  symbol  table  in  “external”  table§ Exported  (defined)  symbols

§ G,  foo()

§ Imported  (required)  symbols§ Extern_G,  extern_bar(),   printf()

§ Relocation  bits§ Mark  instructions  that  depend  on  absolute  (fixed)  addresses  § Instructions  using  globals,    

Page 57: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example

External  references  resolved  by  the  Linker using  the  relocation  info.

Page 58: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example  of  External  Symbol  Table

Page 59: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Assembler  Summary

§ Converts  symbolic  machine  code  to  binary§ addl %edx,  %ecx⇒ 000  0001  11  010  001  =  01  D1  (Hex)

§ Format  conversions§ 3  è 0x0..011    or  0x000000110…0

§ Resolves  internal  addresses

§ Some  assemblers  support  overloading§ Different  opcodes based  on  types

Page 60: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Linker

§ Merges  object  files  to  an  executable§ Enables  separate  compilation

§ Combine  memory  layouts  of  object  modules§ Links  program  calls  to  library  routines

§ printf(),  malloc()

§ Relocates  instructions  by  adjusting  absolute  references§ Resolves  references  among  files

Page 61: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Linker

Code  Segment  1

Data

Segment  1

Code  Segment  2

Data

Segment  2

0

200

100

0

450

300

120

ext_bar()

380

ext_bar 150zoo 180

Data

Segment  1

Code  Segment  2

Data

Segment  2

0

400

100

500

420

580

ext_bar 250zoo 280

650

Code  Segment  1

foo  foo  

Page 62: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Relocation  information

• Information  needed  to  change  addresses

§ Positions  in  the  code  which  contains  addresses§ Data§ Code

§ Two  implementations§ Bitmap§ Linked-­‐lists

Page 63: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

External  References

§ The  code  may  include  references  to  external  names  (identifiers)§ Library  calls§ External  data

§ Stored  in  external  symbol  table

Page 64: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example  of  External  Symbol  Table

Page 65: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Example

Page 66: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Linker  (Summary)

§ Merge  several  executables§ Resolve  external  references§ Relocate  addresses

§ User  mode

§ Provided  by  the  operating  system§ But  can  be  specific  for  the  compiler

§ More  secure  code§ Better  error  diagnosis

Page 67: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Linker  Design  Issues

§ Merges§ Code  segments§ Data  segments§ Relocation  bit  maps§ External  symbol  tables

§ Retain  information  about  static  length§ Real  life  complications

§ Aggregate  initializations  § Object  file  formats§ Large  library§ Efficient  search  procedures

Page 68: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Loader

§ Brings  an  executable   file  from  disk  into  memory  and  starts  it  running  § Read  executable   file’s  header   to  determine  the  size  of  text  and  data  

segments  § Create  a  new  address   space  for  the  program§ Copies  instructions  and  data  into  memory§ Copies  arguments  passed  to  the  program  on  the  stack  

§ Initializes  the  machine  registers  including  the  stack  ptr§ Jumps  to  a  startup  routine  that  copies  the  program’s  arguments  from  the  stack  to  registers  and  calls  the  program’s  main  routine  

Page 69: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Program  Loading

Registers

Loader  Image

Code  Segment  2

Data

Segment  2

0

400

100

500

420

580

ext_bar 250zoo 280

650

Code  Segment  1

Data

Segment  1

Code  Segment

Static  Data

Stack

Heap

Program  Executable

foo  

Page 70: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Loader  (Summary)

§ Initializes  the  runtime  state

§ Part  of  the  operating  system§ Privileged  mode

§ Does  not  depend  on  the  programming  language

§ “Invisible  activation  record”

Page 71: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Static  Linking  (Recap)

§ Assembler  generates  binary  code  § Unresolved  addresses§ Relocatable  addresses

§ Linker  generates  executable  code§ Loader  generates  runtime  states  (images)

Page 72: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Dynamic  Linking

§ Why  dynamic  linking?§ Shared   libraries

§ Save  space§ Consistency

§ Dynamic  loading§ Load  on  demand

Page 73: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

What’s  the  challenge?

Assembler

Compiler

Linker

Loader

Source  program

Assembly   lang.  program  (.s)

Machine  lang.  Module  (.o):  program  (+library)  modules  

Executable  (“.exe”):  

Image  (in  memory):  

“compilation”  time

“execution”   timeLibraries  (.o)

(dynamic  linking)

Page 74: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Position-­‐Independent  Code  (PIC)

§ Code  which  does  not  need  to  be  changed    regardless  of  the  address  in  which  it  is  loaded  § Enable  loading  the  same  object  file  at  different  addresses

§ Thus,  shared  libraries  and  dynamic  loading

§ “Good”   instructions  for  PIC:  use  relative  addresses§ relative   jumps§ reference   to  activation  records

§ “Bad”  instructions  for  :  use  fixed  addresses§ Accessing  global  and  static  data§ Procedure  calls

§ Where  are  the   library  procedures  located?  

Page 75: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

How?

“All  problems  in  computer  science  can  be  solved  by  another  level  of  indirection"  

Butler  Lampson

Page 76: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

PIC:  The  Main  Idea

§ Keep  the  global  data  in  a  table§ Refer  to  all  data  relative  to  the  designated  register

Page 77: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Per-­‐Routine  Pointer  Table

§ Record  for  every  routine  in  a  table

&foo

&D.S.  1

PT  ext_bar

&ext_bar

&D.S.  2

&zoo

&D.S.  2

PT  ext_bar

&D.S.  2

foo  

Page 78: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Per-­‐Routine  Pointer  Table

§ Record  for  every  routine  in  a  table

Data

Segment  1

Code  Segment  2

Data

Segment  2 580

ext_barzoo  

Code  Segment  1

foo  

&foo

&D.S.  1

PT  ext_bar

&ext_bar

&D.S.  2

&zoo

&D.S.  2

PT  ext_bar

&D.S.  2 ext_g

foo  

Page 79: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Per-­‐Routine  Pointer  Table§ Record  for  every  routine  in  a  table§ Record  used  as  a  address  to  procedure

Caller:1. Load  Pointer  table  address  

into  RP2. Load  Code  address  from  

0(RP)  into  RC3. Call  via  RC

Callee:1. RP  points  to  pointer  table2. Table  has  addresses   of  pointer  table  

for  sub-procedures

Other  data

RP.func

Page 80: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

PIC:  The  Main  Idea

§ Keep  the  global  data  in  a  table§ Refer  to  all  data  relative  to  the  designated  register

§ Efficiency:  use  a  register  to  point  to  the  beginning  of  the  table§ Troublesome   in  CISC  machines

Page 81: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

ELF-­‐Position  Independent  Code

§ Executable  and  Linkable  code  Format§ Introduced  in  Unix  System  V

§ Observation§ Executable  consists  of  code  followed  by  data§ The  offset  of  the  data  from  the  beginning  of  the  code  is  known  at  

compile-­‐time

GOT(Global  Offset  Table)Data

Segment

CodeSegment

XX0000

call  L2L2:  

popl %ebxaddl $_GOT[.-­‐..L2],  %ebx

Page 82: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

ELF:  Accessing  global  data

Page 83: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

ELF:  Calling  Procedures  (before  1st  call)

Page 84: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

ELF:  Calling  Procedures  (after  1st  call)

Page 85: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

PIC  benefits  and costs§ Enable  loading  w/o  relocation

§ Share  memory   locations  among  processes

§ Data  segment  may  need  to  be  reloaded

§ GOT  can  be  large§ More  runtime  overhead§ More  space  overhead  

Page 86: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Shared  Libraries

§ Heavily  used  libraries§ Significant  code  space  

§ 5-­‐10  Mega  for  print§ Significant  disk  space§ Significant  memory  space

§ Can  be  saved  by  sharing  the  same  code§ Enforce  consistency§ But  introduces  some  overhead

§ Can  be  implemented   either  with  static  or  dynamic  loading

Page 87: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Shared  Libraries

§ Heavily  used  libraries§ Significant  code  space  

§ 5-­‐10  Mega  for  print§ Significant  disk  space§ Significant  memory  space

§ Can  be  saved  by  sharing  the  same  code§ Enforce  consistency§ But  introduces  some  overhead

Page 88: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Content  of  ELF  file

CallPLT

GOT

Text

Data

RoutinePLT

GOT

Text

Data

Program Libraries

Page 89: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Consistency

§ How  to  guarantee  that  the  code/library  used  the  “right” library  version

Page 90: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Loading  Dynamically  Linked  Programs§ Start  the  dynamic  linker§ Find  the  libraries§ Initialization§ Resolve  symbols  § GOT

§ Typically  small

§ Library  specific  initialization

§ Lazy  procedure  linkage

Page 91: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Microsoft  Dynamic  Libraries    (DLL)

§ Similar  to  ELF§ Somewhat  simpler§ Require  compiler  support  to  address  dynamic  libraries

§ Programs  and  DLL  are  Portable  Executable  (PE)§ Each  application  has  it  own  address§ Supports  lazy  bindings

Page 92: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Dynamic  Linking  Approaches

§ Unix/ELF  uses  a  single  name  space  space  and  MS/PE  uses  several  name  spaces

§ ELF  executable  lists  the  names  of  symbols  and  libraries  it  needs

§ PE  file  lists  the  libraries  to  import  from  other  libraries

§ ELF  is  more  flexible§ PE  is  more  efficient  

Page 93: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Costs  of  dynamic  loading

§ Load  time  relocation  of  libraries§ Load  time  resolution  of  libraries  and  executable§ Overhead  from  PIC  prolog§ Overhead  from  indirect  addressing§ Reserved  registers

Page 94: Compilation - TAU · 2015-12-29 · Compilation 0368%3133’2015/16a Lecture’11 Code’generation,’Assemblers,’and’linkers Noam+Rinetzky 1

Summary

§ Code  generation  yields  code  which  is  still  far  from  executable§ Delegate   to  existing  assembler

§ Assembler  translates  symbolic  instructions  into  binary  and  creates  relocation  bits

§ Linker  creates  executable  from  several  files  produced  by  the  assembly

§ Loader  creates  an  image  from  executable