Code Generation II


IC compiler

  [Figure: the IC compiler pipeline. IC program (ic) -> Lexical Analysis -> Syntax Analysis (Parsing) -> AST, symbol table, etc. -> Intermediate Representation (IR) -> Code Generation -> x86 executable (exe)]

x86 assembly

  AT&T syntax and Intel syntax
  We'll be using AT&T syntax
  Work with GNU Assembler (GAS)

Immediate and register operands

  Immediate
    Value specified in the instruction itself
    Preceded by $
    Example: add $4,%esp

  Register
    Register name is used
    Preceded by %
    Example: mov %esp,%ebp

Memory access

  Memory operands
    Obtain value at given address
    Example: mov (%eax),%ebx

  Base displacement
    Obtain value at computed address
    Syntax: disp(base,index,scale)
    address = base + (index * scale) + displacement
    Example: movl $42,2(%eax)
    Example: movl $42,(%eax,%ecx,4)
    (the l suffix fixes the operand size at 4 bytes, since neither operand is a register)
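
The address formula above can be checked with a small Java sketch. The class `EffectiveAddress` and the register contents used in `main` are hypothetical and only illustrate the arithmetic; they are not part of the slides.

```java
// Hypothetical helper (not part of the slides): evaluates an AT&T memory
// operand disp(base,index,scale) for given register contents.
final class EffectiveAddress {
    static int of(int disp, int base, int index, int scale) {
        // x86 only encodes the scale factors 1, 2, 4 and 8.
        if (scale != 1 && scale != 2 && scale != 4 && scale != 8) {
            throw new IllegalArgumentException("scale must be 1, 2, 4 or 8");
        }
        return base + index * scale + disp;
    }

    public static void main(String[] args) {
        int eax = 1000, ecx = 3; // assumed register contents
        System.out.println(of(2, eax, 0, 1));   // operand 2(%eax)
        System.out.println(of(0, eax, ecx, 4)); // operand (%eax,%ecx,4)
    }
}
```

With the assumed values, `2(%eax)` resolves to address 1002 and `(%eax,%ecx,4)` to address 1012.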

Accessing variables

  Use offset from frame pointer

  Examples
    %ebp + 8 = first parameter
    %eax = %ebp + 8
    (%eax) = the value 572
    8(%ebp) = the value 572

  [Figure: stack frame, from higher to lower addresses: param n … first parameter (holds 572, at %ebp+8, pointed to by %eax), return address, previous ebp (pointed to by ebp), local 1 (at ebp-4) … local n (pointed to by SP)]

LIR to assembly

  Need to know how to translate:
    Function bodies
      Translation for each kind of LIR instruction
      Calling sequences
      Correctly access parameters and variables
        Compute offsets for parameters and variables
    Dispatch tables
    String literals
    Runtime checks & error handlers

Translating LIR instructions

  Translate function bodies:
  1. Compute offsets for:
       Local variables & LIR registers (-4,-8,-12,…)
       Function parameters (+8,+12,+16,…)
         (for virtual functions, take the this parameter into account)
  2. Translate instruction list for each function
       Local translation for each LIR instruction
       Local (machine) register allocation

Memory offsets implementation

  // MethodLayout instance per function declaration
  class MethodLayout {
      // Maps variables/parameters/LIR registers to
      // offsets relative to frame pointer (EBP)
      Map<Memory,Integer> memoryToOffset;
  }

  IC source:

  void foo(int x, int y) {
      int z = x + y;
      g = z; // g is a field
      Library.printi(z);
  }

  (manual) LIR translation (IC to LIR is PA4; LIR to x86 is PA5):

  _A_foo:
      Move x,R0
      Add y,R0
      Move R0,z
      Move this,R1
      MoveField R0,R1.1
      Library __printi(R0),Rdummy

  MethodLayout for foo (a virtual function takes one extra parameter: this):

  Memory   Offset
  this     +8
  x        +12
  y        +16
  z        -4
  R0       -8
  R1       -12
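
One way the offset table for foo could be computed is sketched below in Java. This is a minimal sketch under stated assumptions, not the course's implementation: the class `MethodLayoutSketch`, its constructor arguments and `frameSize` are hypothetical names, and `String` keys stand in for the `Memory` type from the slide.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only. String keys stand in for the Memory type used on the slide.
final class MethodLayoutSketch {
    // Maps variables/parameters/LIR registers to offsets relative to EBP.
    final Map<String, Integer> memoryToOffset = new LinkedHashMap<>();
    private final int localCount;

    MethodLayoutSketch(boolean isVirtual, List<String> params, List<String> localsAndRegs) {
        // 0(%ebp) holds the previous ebp and 4(%ebp) the return address,
        // so the first parameter lives at +8.
        int offset = 8;
        if (isVirtual) {
            memoryToOffset.put("this", offset); // extra first parameter
            offset += 4;
        }
        for (String p : params) {
            memoryToOffset.put(p, offset);
            offset += 4;
        }
        // Locals and LIR registers grow downwards from -4.
        offset = -4;
        for (String v : localsAndRegs) {
            memoryToOffset.put(v, offset);
            offset -= 4;
        }
        localCount = localsAndRegs.size();
    }

    // Bytes to reserve in the prologue (the N in: sub $N,%esp).
    int frameSize() {
        return 4 * localCount;
    }

    public static void main(String[] args) {
        MethodLayoutSketch foo = new MethodLayoutSketch(
                true, Arrays.asList("x", "y"), Arrays.asList("z", "R0", "R1"));
        System.out.println(foo.memoryToOffset);
        System.out.println(foo.frameSize());
    }
}
```

For foo this reproduces the table above (this at +8, x at +12, y at +16, z at -4, R0 at -8, R1 at -12) and a frame size of 12 bytes.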

Memory offsets example

  MethodLayout for foo:

  Memory   Offset
  this     +8
  x        +12
  y        +16
  z        -4
  R0       -8
  R1       -12

  LIR translation:

  _A_foo:
      Move x,R0
      Add y,R0
      Move R0,z
      Move this,R1
      MoveField R0,R1.1
      Library __printi(R0),Rdummy

  Translation to x86 assembly:

  _A_foo:
      push %ebp              # prologue
      mov %esp,%ebp
      sub $12,%esp
      mov 12(%ebp),%eax      # Move x,R0
      mov %eax,-8(%ebp)
      mov 16(%ebp),%eax      # Add y,R0
      add -8(%ebp),%eax
      mov %eax,-8(%ebp)
      mov -8(%ebp),%eax      # Move R0,z
      mov %eax,-4(%ebp)
      mov 8(%ebp),%eax       # Move this,R1
      mov %eax,-12(%ebp)
      mov -8(%ebp),%eax      # MoveField R0,R1.1
      mov -12(%ebp),%ebx
      mov %eax,4(%ebx)
      mov -8(%ebp),%eax      # Library __printi(R0)
      push %eax
      call __printi
      add $4,%esp
  _A_foo_epilogue:
      mov %ebp,%esp          # epilogue
      pop %ebp
      ret

Translating instructions

  LIR instruction: MoveArray R1[R2],R3
  Translation:
      mov -8(%ebp),%ebx      # -8(%ebp)=R1
      mov -12(%ebp),%ecx     # -12(%ebp)=R2
      mov (%ebx,%ecx,4),%ebx
      mov %ebx,-16(%ebp)     # -16(%ebp)=R3

  LIR instruction: MoveField x,R2.3
  Translation:
      mov -12(%ebp),%ebx     # -12(%ebp)=R2
      mov -8(%ebp),%eax      # -8(%ebp)=x
      mov %eax,12(%ebx)      # 12=3*4

  LIR instruction: MoveField _DV_A,R1.0
  Translation:
      movl $_DV_A,(%ebx)     # (%ebx)=R1.0 (movl means move 4 bytes)

  LIR instruction: ArrayLength y,R1
  Translation:
      mov -8(%ebp),%ebx      # -8(%ebp)=y
      mov -4(%ebx),%ebx      # load size
      mov %ebx,-12(%ebp)     # -12(%ebp)=R1

  LIR instruction: Add R1,R2
  Translation:
      mov -16(%ebp),%eax     # -16(%ebp)=R1
      add -20(%ebp),%eax     # -20(%ebp)=R2
      mov %eax,-20(%ebp)     # store in R2

Translating instructions (continued)

  LIR instruction: Mul R1,R2
  Translation:
      mov -8(%ebp),%eax      # -8(%ebp)=R2
      imul -4(%ebp),%eax     # -4(%ebp)=R1
      mov %eax,-8(%ebp)

  LIR instruction: Div R1,R2
  (idiv divides EDX:EAX by its operand, stores quotient in EAX, stores remainder in EDX)
  Translation:
      mov -8(%ebp),%eax      # -8(%ebp)=R2
      cltd                   # sign-extend EAX into EDX:EAX (mov $0,%edx is wrong for a negative dividend)
      mov -4(%ebp),%ebx      # -4(%ebp)=R1
      idiv %ebx
      mov %eax,-8(%ebp)      # store in R2

  LIR instruction: Mod R1,R2
  Translation:
      mov -8(%ebp),%eax      # -8(%ebp)=R2
      cltd                   # sign-extend EAX into EDX:EAX
      mov -4(%ebp),%ebx      # -4(%ebp)=R1
      idiv %ebx
      mov %edx,-8(%ebp)      # store remainder in R2

  LIR instruction: Compare R1,x
  Translation:
      mov -4(%ebp),%eax      # -4(%ebp)=x
      cmp -8(%ebp),%eax      # -8(%ebp)=R1

  LIR instruction: Return R1
  (returned value stored in EAX register)
  Translation:
      mov -8(%ebp),%eax      # -8(%ebp)=R1
      jmp _A_foo_epilogue

  LIR instruction: Return Rdummy
  Translation:
      jmp _A_foo_epilogue
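
The local, per-instruction translation scheme can be sketched as a small Java emitter. This is an illustrative sketch only: `BinaryOpTranslator` and its `translate` method are hypothetical names, operands are assumed to be stack slots already listed in an offset map, and only the `Add` and `Mul` cases from the tables are handled. It follows the load, operate, store pattern shown for Mul.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical emitter for two-operand LIR arithmetic: "Op src,dst".
final class BinaryOpTranslator {
    static List<String> translate(String lirOp, String src, String dst,
                                  Map<String, Integer> offsets) {
        String x86Op;
        switch (lirOp) {
            case "Add": x86Op = "add"; break;
            case "Mul": x86Op = "imul"; break;
            default: throw new IllegalArgumentException("unsupported LIR op: " + lirOp);
        }
        String srcMem = slot(src, offsets);
        String dstMem = slot(dst, offsets);
        List<String> out = new ArrayList<>();
        out.add("mov " + dstMem + ",%eax");       // load destination
        out.add(x86Op + " " + srcMem + ",%eax");  // combine with source
        out.add("mov %eax," + dstMem);            // store back
        return out;
    }

    // Formats a stack slot as an EBP-relative operand, e.g. -8(%ebp).
    private static String slot(String name, Map<String, Integer> offsets) {
        Integer off = offsets.get(name);
        if (off == null) {
            throw new IllegalArgumentException("no offset for " + name);
        }
        return off + "(%ebp)";
    }

    public static void main(String[] args) {
        Map<String, Integer> offsets = new HashMap<>();
        offsets.put("R1", -4);
        offsets.put("R2", -8);
        for (String line : translate("Mul", "R1", "R2", offsets)) {
            System.out.println(line);
        }
    }
}
```

With R1 at -4 and R2 at -8, the emitted lines match the Mul row of the table. Because both operands go through %eax, no register allocation is needed at this stage.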

Call sequences

  call
    Caller push code (in the caller)
      Push caller-save registers
      Push actual parameters (in reverse order)
      Push return address
      Jump to call address
    Callee push code (prologue, in the callee)
      Push current base-pointer
      bp = sp
      Push local variables
      Push callee-save registers

  return
    Callee pop code (epilogue, in the callee)
      Pop callee-save registers
      Pop callee activation record
      Pop old base-pointer
      Pop return address
      Jump to address
    Caller pop code (in the caller)
      Copy returned value
      Pop parameters
      Pop caller-save registers

  [Figure: stack during a call: Param n … param1, return address, previous fp (pointed to by FP), Local 1 … Local n, Reg 1 … Reg n, with SP shown at each step of the sequence]

Translating static calls

  LIR code:
      StaticCall _A_foo(a=R1,b=5,c=x),R3

  Translation:
      # push caller-saved registers
      # (only if the value stored in these registers is needed by the caller)
      push %eax
      push %ecx
      push %edx

      # push parameters
      mov -4(%ebp),%eax      # push x
      push %eax
      push $5                # push 5
      mov -8(%ebp),%eax      # push R1
      push %eax

      call _A_foo

      # pop parameters (3 params*4 bytes = 12)
      add $12,%esp

      # store returned value in R3 (only if return register is not Rdummy)
      mov %eax,-16(%ebp)

      # pop caller-saved registers
      # (only if the value stored in these registers is needed by the caller)
      pop %edx
      pop %ecx
      pop %eax
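
The core of the static call sequence (push actuals in reverse order, call, pop the parameters, store the result) can be sketched in Java as follows. `StaticCallTranslator` and its parameters are hypothetical; the sketch assumes each actual is already an x86 operand string, and it leaves out saving and restoring caller-saved registers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical emitter for the core of a StaticCall translation.
final class StaticCallTranslator {
    // args: x86 operands of the actuals, first parameter first.
    // resultOperand: where to store the result, or null for Rdummy.
    static List<String> translate(String label, List<String> args, String resultOperand) {
        List<String> out = new ArrayList<>();
        // Push actual parameters in reverse order.
        for (int i = args.size() - 1; i >= 0; i--) {
            String arg = args.get(i);
            if (arg.startsWith("$")) {
                out.add("push " + arg);            // immediate
            } else {
                out.add("mov " + arg + ",%eax");   // memory operand via eax
                out.add("push %eax");
            }
        }
        out.add("call " + label);
        if (!args.isEmpty()) {
            // Pop parameters: 4 bytes each.
            out.add("add $" + 4 * args.size() + ",%esp");
        }
        if (resultOperand != null) {
            out.add("mov %eax," + resultOperand);  // returned value is in eax
        }
        return out;
    }

    public static void main(String[] args) {
        // StaticCall _A_foo(a=R1,b=5,c=x),R3 with R1=-8(%ebp), x=-4(%ebp), R3=-16(%ebp)
        List<String> code = translate("_A_foo",
                Arrays.asList("-8(%ebp)", "$5", "-4(%ebp)"), "-16(%ebp)");
        for (String line : code) {
            System.out.println(line);
        }
    }
}
```

For the example call, the emitted lines are the parameter pushes, the call, `add $12,%esp` and the store into R3 from the translation above.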

Virtual functions

  Indirect call: call *(Reg)
    Example: call *(%eax)
    Used for virtual function calls
      Dispatch table lookup
      Passing/receiving the this variable

Translating virtual calls

  LIR code:
      VirtualCall R1.2(b=5,c=x),R3

  [Figure: R1 points to an object holding DVPtr and the fields x and y; DVPtr points to the dispatch table _DV_A]

  _DV_A:
      0  _A_rise
      1  _A_shine
      2  _A_twinkle

  Translation:
      # push caller-saved registers
      push %eax
      push %ecx
      push %edx

      # push parameters
      mov -4(%ebp),%eax      # push x
      push %eax
      push $5                # push 5

      # find address of virtual method and call it
      mov -8(%ebp),%eax      # load this
      push %eax              # push this
      mov 0(%eax),%eax       # load dispatch table address
      call *8(%eax)          # call table entry 2 (2*4=8)

      # pop parameters (2 params+this * 4 bytes = 12)
      add $12,%esp

      mov %eax,-12(%ebp)     # store returned value in R3

      # pop caller-saved registers
      pop %edx
      pop %ecx
      pop %eax

Function prologue/epilogue

  _A_foo:
      # prologue
      push %ebp
      mov %esp,%ebp

      # push local variables of foo
      sub $12,%esp           # 3 local vars+regs * 4 = 12

      # push callee-saved registers
      # (optional: only if register allocation optimization is used (in PA5),
      #  and only if these registers will be modified by the callee)
      push %ebx
      push %esi
      push %edi

      function body

  _A_foo_epilogue:           # extra label for each function
      # pop callee-saved registers
      pop %edi
      pop %esi
      pop %ebx

      mov %ebp,%esp
      pop %ebp
      ret

Representing dispatch tables

  file.ic:

  class A {
      void sleep() {…}
      void rise() {…}
      void shine() {…}
      static void foo() {…}
  }
  class B extends A {
      void rise() {…}
      void shine() {…}
      void twinkle() {…}
  }

  file.lir (produced from file.ic in PA4):

  _DV_A: [_A_sleep,_A_rise,_A_shine]
  _DV_B: [_A_sleep,_B_rise,_B_shine,_B_twinkle]

  file.s (produced from file.lir in PA5):

  # data section
  .data
      .align 4
  _DV_A:
      .long _A_sleep
      .long _A_rise
      .long _A_shine
  _DV_B:
      .long _A_sleep
      .long _B_rise
      .long _B_shine
      .long _B_twinkle
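
How the tables for A and B come about can be sketched in Java: a subclass copies its parent's table, overriding methods keep their slot, and new methods are appended. The class `DispatchTable` and its members are hypothetical names for illustration, and static methods such as foo are assumed to be filtered out before construction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a dispatch vector; slot i of labels is the
// implementation of the method named methodNames.get(i).
final class DispatchTable {
    final String className;
    final List<String> methodNames = new ArrayList<>();
    final List<String> labels = new ArrayList<>();

    DispatchTable(String className, DispatchTable parent, List<String> virtualMethods) {
        this.className = className;
        if (parent != null) {
            // Inherit every slot from the superclass.
            methodNames.addAll(parent.methodNames);
            labels.addAll(parent.labels);
        }
        for (String m : virtualMethods) {
            String label = "_" + className + "_" + m;
            int slot = methodNames.indexOf(m);
            if (slot >= 0) {
                labels.set(slot, label);  // override keeps the parent's slot
            } else {
                methodNames.add(m);       // new method gets a new slot
                labels.add(label);
            }
        }
    }

    // Byte offset used in an indirect call such as: call *8(%eax)
    int byteOffset(String method) {
        int slot = methodNames.indexOf(method);
        if (slot < 0) {
            throw new IllegalArgumentException("no virtual method " + method);
        }
        return 4 * slot;
    }

    // Emits the table in the file.s format shown above.
    String toAssembly() {
        StringBuilder sb = new StringBuilder("_DV_" + className + ":\n");
        for (String label : labels) {
            sb.append("    .long ").append(label).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        DispatchTable a = new DispatchTable("A", null, Arrays.asList("sleep", "rise", "shine"));
        DispatchTable b = new DispatchTable("B", a, Arrays.asList("rise", "shine", "twinkle"));
        System.out.print(a.toAssembly());
        System.out.print(b.toAssembly());
    }
}
```

Keeping an overridden method in its inherited slot is what lets the same `call *offset(%eax)` work for an object of class A or of class B.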

Runtime checks

  Insert code to check for attempts to perform illegal operations
    Null pointer check
      MoveField, MoveArray, ArrayLength, VirtualCall
      Reference arguments to library functions should not be null
    Array bounds check
    Array allocation size check
    Division by zero

  If a check fails, jump to error handler code that prints a message and gracefully exits the program

Null pointer check

      # null pointer check
      cmp $0,%eax
      je labelNPE

  labelNPE:
      push $strNPE           # error message
      call __println
      push $1                # error code
      call __exit

  Single generated handler for entire program

Array bounds check

      # array bounds check
      mov -4(%eax),%ebx      # ebx = length
      mov $0,%ecx            # ecx = index
      cmp %ecx,%ebx
      jle labelABE           # ebx <= ecx ?
      cmp $0,%ecx
      jl labelABE            # ecx < 0 ?

  labelABE:
      push $strABE           # error message
      call __println
      push $1                # error code
      call __exit

  Single generated handler for entire program

Array allocation size check

      # array size check
      cmp $0,%eax            # eax == array size
      jle labelASE           # eax <= 0 ?

  labelASE:
      push $strASE           # error message
      call __println
      push $1                # error code
      call __exit

  Single generated handler for entire program

Division by zero check

      # division by zero check
      cmp $0,%eax            # eax is divisor
      je labelDBE            # eax == 0 ?

  labelDBE:
      push $strDBE           # error message
      call __println
      push $1                # error code
      call __exit

  Single generated handler for entire program
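
Since the four handlers differ only in their label and message symbol, a code generator might emit them from one template, once per program. The Java sketch below assumes this; `ErrorHandlers` and its methods are hypothetical names, while the label and string names (labelNPE, strNPE, and so on) are the ones used on the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator for the per-program error handlers.
final class ErrorHandlers {
    // The four check kinds from the slides: null pointer, array bounds,
    // array allocation size, division by zero.
    static final String[] KINDS = {"NPE", "ABE", "ASE", "DBE"};

    static List<String> handler(String kind) {
        List<String> out = new ArrayList<>();
        out.add("label" + kind + ":");
        out.add("push $str" + kind); // error message
        out.add("call __println");
        out.add("push $1");          // error code
        out.add("call __exit");
        return out;
    }

    // All handlers, each emitted exactly once for the whole program.
    static List<String> forProgram() {
        List<String> out = new ArrayList<>();
        for (String kind : KINDS) {
            out.addAll(handler(kind));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : forProgram()) {
            System.out.println(line);
        }
    }
}
```

Each check site then only needs a compare and a conditional jump to the shared label.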

Hello world example

  class Library {
      void println(string s);
  }

  class Hello {
      static void main(string[] args) {
          Library.println("Hello world!");
      }
  }

Assembly file structure

  .title "hello.ic"          # header

  # global declarations
  .global __ic_main          # symbol exported to linker

  # data section: statically-allocated data (string literals and dispatch tables)
  .data
      .align 4
      .int 13                # string length in bytes
  str1: .string "Hello world!\n"

  # text (code) section: method bodies and error handlers
  .text

  #----------------------------------------------------   (a comment)
      .align 4
  __ic_main:
      push %ebp              # prologue
      mov %esp,%ebp

      push $str1             # print(...)
      call __print
      add $4,%esp

      mov $0,%eax            # return 0

      mov %ebp,%esp          # epilogue
      pop %ebp
      ret

Assembly file structure (continued)

  Notes on the body of __ic_main in the listing above:
    push %ebp / mov %esp,%ebp: prologue, save ebp and set it to esp
    push $str1 / call __print / add $4,%esp: push print parameter, call print, pop parameter
    mov $0,%eax: store return value of main in eax
    mov %ebp,%esp / pop %ebp / ret: epilogue, restore esp and ebp (pop)

From assembly to executable

  [Figure: IC program (prog.ic) -> Lexical Analysis -> Syntax Analysis (Parsing) -> AST, symbol table, etc. -> Intermediate Representation (IR) -> Code Generation -> x86 assembly (prog.s) -> GNU assembler -> prog.o -> GNU linker, together with libic.a (libic + gc) -> prog.exe]