1 CMPSC 160 Translation of Programming Languages Fall 2002 Lecture-Modules 17 and 18 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon

1

CMPSC 160Translation of Programming Languages

Fall 2002

Lecture-Modules 17 and 18

slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon

Department of Computer ScienceUniversity of California, Santa Barbara

2

Structure of a Compiler

• Front end of a compiler is efficient and can be automated

• Back end is generally hard to automate and finding the optimum solution requires exponential time

• Intermediate code generation can effect the performance of the back end

InstructionSelection

InstructionScheduling

RegisterAllocation

Scanner ParserSemantic Analysis

Code Optimization

IntermediateCode

GenerationIR

3

Intermediate Representations

• Abstract Syntax Trees (AST)

• Directed Acyclic Graphs (DAG)

• Control Flow Graphs (CFG)

• Static Single Assignment Form (SSA)

• Stack Machine Code

• Three Address Code

• Hybrid approaches mix graphical and linear representations

– SGI and SUN compilers use three address code but provide ASTs for loops if-statements and array references

– Use three-address code in basic blocks in control flow graphs

high-level

low-level

Graphical IRs

Linear IRs

4

Abstract Syntax Trees (ASTs)

if (x < y)

x = 5*y + 5*y/3;

else

y = 5;

x = x+y;

Statements

<

AssignStmt

+

*

x

IfStmt

AssignStmt AssignStmt

x x y+ yxy

/

5 y 3*

5 y

5

5

Directed Acyclic Graphs (DAGs)

• Use directed acyclic graphs to represent expressions

– Use a unique node for each expression

if (x < y)

x = 5*y + 5*y/3;

else

y = 5;

x = x+y;

Statements

<

AssignStmt

*

IfStmt

AssignStmt AssignStmt

x +y

/

5

3

6

Control Flow Graphs (CFGs)

• Nodes in the control flow graph are basic blocks

– A basic block is a sequence of statements always entered at the beginning of the block and exited at the end

• Edges in the control flow graph represent the control flow

if (x < y)

x = 5*y + 5*y/3;

else

y = 5;

x = x+y;

if (x < y) goto B1 else goto B2

x = 5*y + 5*y/3 y = 5

x = x+y

B1 B2

B0

B3

• Each block has a sequence of statements• No jump from or to the middle of the block• Once a block starts executing, it will execute till the end

7

Stack Machine Code if (x < y)

x = 5*y + 5*y/3;

else

y = 5;

x = x+y;

load x load y iflt L1 goto L2L1: push 5 load y multiply push 5 load y multiply push 3 divide add store x goto L3L2: push 5 store yL3: load x load y add store x

pops the toptwo elements andcompares them

pops the top two elements, multipliesthem, and pushes theresult back to the stack

JVM: A stack machine• In the project we generate assembly code for JVMwhich is converted to bytecode by Jasmin assembler• JVM interpreter executes the bytecode on differentmachines• JVM has an operand stack which we use to evaluateexpressions• JVM provides 65535 local variables for each method • The local variables are like registers so we do not haveto worry about register allocation• Each local variable in JVM is denoted by a numberbetween 0 and 65535 (x and y in the example will be assigned unique numbers)

pushes the valueat the location x to the stack

stores the value at the top of the stack to the location x

8

Three-Address Code

• Each instruction can have at most three operands

• Assignments– x := y– x := y op z op: binary arithmetic or logical operators– x := op y op: unary operators (minus, negation, integer to

float conversion)• Branch

– goto L Execute the statement with labeled L next

• Conditional Branch– if x relop y goto L relop: <, =, <=, >=, ==, !=

• if the condition holds we execute statement labeled L next• if the condition does not hold we execute the statement following this

statement next

9

Three-Address Code

if (x < y)

x = 5*y + 5*y/3;

else

y = 5;

x = x+y;

if x < y goto L1goto L2

L1: t1 := 5 * yt2 := 5 * yt3 := t2 / 3x := t1 + t2goto L3

L2: y := 5L3: x := x + y

Temporaries: temporaries correspondto the internal nodes of the syntax tree

Variables can be represented with their locations in the symbol table

• Three address code instructions can be represented as an array of quadruples: operation, argument1, argument2, result triples: operation, argument1, argument2 (each triple implicitly corresponds to a temporary)

10

Three-Address Code Generation for a Simple Grammar

Productions Semantic RulesS id := E id.place lookup(id.name);

S.code E.code || gen(id.place ‘:=‘ E.place);E E1 + E2 E.place newtemp();

E.code E1.code || E2.code || gen(E.place ‘:=‘ E1.place ‘+’ E2.place);E E1 * E2 E.place newtemp();

E.code E1.code || E2.code || gen(E.place ‘:=‘ E1.place ‘*’ E2.place);E ( E1 )E.code E1.code;

E.place E1.place;E E1 E.place newtemp();

E.code E1.code || gen(E.place ‘:=‘ ‘uminus’ E1.place);E id E.place lookup(id.name);

E.code ‘’ (empty string)

Attributes: E.place: location that holds the value of expression EE.code: sequence of instructions that are generated for E

Procedures: newtemp(): Returns a new temporary each time it is calledgen(): Generates instruction (have to call it with appropriate arguments)lookup(id.name): Returns the location of id from the symbol table

11

Stack Machine Code Generation for a Simple Grammar

Productions Semantic RulesS id := E id.place lookup(id.name);

S.code E.code || gen(‘store’ id.place);E E1 + E2 E.code E1.code || E2.code || gen(‘add’);

(arguments for the add instruction are in the top of the stack)E E1 * E2 E.code E1.code || E2.code || gen(‘multiply’);E ( E1 )E.code E1.code; E E1 E.code E1.code || gen( ‘negate‘);E id E.code gen(‘load’ id.place)

Attributes: E.code: sequence of instructions that are generated for E(no place for an expression is needed since the result of an expressionis stored in the operand stack)

Procedures: newtemp(): Returns a new temporary each time it is calledgen(): Generates instruction (have to call it with appropriate arguments)lookup(id.name): Returns the location of id from the symbol table

12

Code Generation for Boolean Expressions

• Two approaches

– Numerical representation

– Implicit representation

• Numerical representation

– Use 1 to represent true, use 0 to represent false

– For three-address code store this result in a temporary

– For stack machine code store this result in the stack

• Implicit representation

– For the boolean expressions which are used in flow-of-control statements (such as if-statements, while-statements etc.) boolean expressions do not have to explicitly compute a value, they just need to branch to the right instruction

– Generate code for boolean expressions which branch to the appropriate instruction based on the result of the boolean expression

13

Boolean Expressions: Numerical Representation

Attributes : E.place: location that holds the value of expression E E.code: sequence of instructions that are generated for Eid.place: location for id

Global variable: nextstat: Returns the location of the next instruction to be generated(each call to gen() increments nextstat by 1)

Productions Semantic Rules

E id1 relop id2 E.place newtemp();E.code gen(‘if’ id1.place relop.op id2.place ‘goto’ nextstat+3);

|| gen(E.place ‘:=‘ ‘0’) || gen(‘goto’ nextstat+2)|| gen(E.place ‘:=‘ ‘1’);

E E1 and E2 E.place newtemp();E.code E1.code || E2.code

|| gen(E.place ‘:=‘ E1.place ‘and’ E2.place);

14

Boolean Expressions: Implicit Representation

Productions Semantic Rules

E id1 relop id2 E.code gen(‘if’ id1.place relop.op id2.place ‘goto’ E.true)|| gen(‘goto’ E.false);

E E1 and E2 E1.true newlabel();E1.false E. false; (short-circuiting)E2.true E. true;E2.false E. false;E.code E1.code || gen(E1.true ‘:’) || E2.code ;

Attributes : E.code: sequence of instructions that are generated for EE.false: instruction to branch to if E evaluates to falseE.true: instruction to branch to if E evaluates to true(E.code is synthesized whereas E.true and E.false are inherited)id.place: location for id

can be any relational operator:==, <=, >= !=

These places will be filled with lables later on when they become available

This generated labelwill be inserted to the place for E1.true in the code generated for E1

15

Example

100 if x < y goto 103101 t1 := 0102 goto 104103 t1 := 1104 if a = b goto 107105 t2 := 0106 goto 108107 t2 := 1108 t3 := t1 and t2

Input boolean expression:x < y and a == b

Numerical representation:

if x < y goto L1goto LFalse

L1: if a = b goto LTruegoto LFalse...

LTrue:

LFalse:

Implicit representation:

These are the locations ofthree-address code instructions, they are not labels

These labels will be generatedlater on, and will be insertedto the corresponding places

16

Flow-of-Control Statements

If-then-else

• Branch based on the result of boolean expression

Loops

• Evaluate condition before loop (if needed)

• Evaluate condition after loop

• Branch back to the top if condition holds

Merges test with last block of loop body

While, for, do, and until all fit this basic model

Pre-test

Loop body

Post-test

Next block

17

Flow-of-Control Statements: Code Structure

E.code

S1.code

goto S.next

S2.code

to E.true

to E.false

E.true:

E.false:

S.next:

S if E then S1 else S2

if E evaluates to true

if E evaluates to false

E.code

S1.code

goto S.begin

to E.true

to E.false

E.true:

E.false:

S.begin:

S while E do S1

Another approach is to place E.code after S1.code

18

Flow-of-Control Statements

Productions Semantic RulesS if E then S1 else S2 E.true newlabel();

E.false newlabel(); S1.next S. next;S2.next S. next;S.code E.code || gen(E.true ‘:’) || S1.code

|| gen(‘goto’ S.next) || gen(E.false ‘:’) || S2.code ;

S while E do S1 S.begin newlabel();E.true newlabel(); E.false S. next;S1.next S. begin;S.code gen(S.begin ‘:’) || E.code || gen(E.true ‘:’) || S1.code

|| gen(‘goto’ S.begin);

S S1 ; S2 S1.next newlabel();S2.next S.next;S.code S1.code || gen(S1.next ‘:’) || S2.code

Attributes : S.code: sequence of instructions that are generated for SS.next: label of the instruction that will be executed immediately after S(S.next is an inherited attribute)

19

Example

Input code fragment:

while (a < b) { if (c < d) x = y + z; else x = y – z}

L1: if a < b goto L2goto LNext

L2: if c < d goto L3goto L4

L3: t1 := y + zx := t1goto L1

L4: t2 := y – zx := t2goto L1

LNext: ...

20

Backpatching

• E.true, E.false, S.next may not be computed in a single pass (they are inherited attributes)

• Backpatching is a technique for generating labels for E.true, E.false, S.next and inserting them to the appropriate locations

• Basic idea

– Keep lists E.truelist, E.falselist, S.nextlist

• E.truelist: the list of instructions where the label for E.true have to be inserted when it becomes available

• S.nextlist: the list of instructions where the label for S.next have to be inserted when it becomes available

– When labels E.true, E.false, S.next are computed these labels are inserted to the instructions in these lists

21

Flow-of-Control Statements: Case Statements

Case Statements1 Evaluate the controlling expression2 Branch to the selected case3 Execute the code for that case4 Branch to the statement after the case

Part 2 is the key

Strategies

• Linear search (nested if-then-else constructs)

• Build a table of case values & binary search it

• Directly compute an address (requires dense case set)

– Use an array of labels that is addressed by the case value

22

Type Conversions

Mixed-type expressions

• Insert conversions as needed from conversion table

– i2f r1, r2 (convert the integer value in register r1 to float, and store the result in register r2)

• Most languages have symmetric conversion tables

Typical Addition

Table

23

Memory Layout

Placing run time data structures

Alignment and padding• Machines have alignment restrictions

– 32-bit floating point numbers and integers should begin on a full-word boundary (32-bit boundary)

• Place values with identical restrictions next to each other• Assign offsets from most restrictive to least• If needed, insert padding (space left unused due to alignment restrictions) to

match restrictions

Code

S Gt la & ot bi ac l

Heap

Stack

0 high

• Stack and heap share free space

• Fixed-size areas together

• For compiler, this is the entire picture

24

Memory Allocation

P DD D; DD id : TT char | int | float | array[num] of T | pointer T

Attributes: T.type, T.widthBasic types: char width 4, integer width 4, float width 8Type constructors: array(size,type) width is size * (width of type)

pointer(type) width is 4

• Enter the variables to the symbol table with their type and memory location• Set the type attribute T.type• Layout the storage for variables

• Calculate the offset for each local variable and enter it to the symbol table• Offset can be offset from a static data area or from the beginning of the local data area in the activation record

25

A Translation Scheme for Memory Allocation

P {offset 0;} DD D; DD id : T {enter(id.name,T.type, offset); offset offset + T.width; } T char { T.type char; T.width 4; } | int { T.type integer; T.width 14; } | float { T.type float; T.width 8; } | array[num] of T1 { T.type array(num.val, T1.type);

T.width num.val * T1.width; } | pointer T { T.type pointer(T1.type); T.width 8; }

• Note that if the size of the array is not a constant we cannot compute its width at compile time• In that case allocate the memory for the array in the heap and allocate the memory for the pointer to the heap at compile time

26

Memory Allocation for Nested Procedures

• Use two stacks:– Stack for symbol table: top(tblptr) points to current symbol table– Stack for offset: top(offset) gives the value of the offset in the

current symbol table• addwidth(table,width) stores the cumulative width of all the entries in

the symbol table in the header of the symbol table• mktbl(previous) creates a new symbol table linked to the previous• enter(table, name, type, offset) enters a variable to the symbol table• enterProc(table,name,newtable) enters a procedure to the symbol table

P DD D; DD id : T | proc id; D; S

27

Memory Allocation for Nested Procedures

P {t mktable(nil); push(t,tblptr); push(0, offset)} D {addwidth(top(tblptr), top(offset)); pop(tblptr); pop(offset);}D D; DD proc id ; { t mktable(top(tblptr)); push(t,tblptr); push(0,offset); } D ; S { t top(tblptr); addwidth(t, top(offset)); pop(tblptr); pop(offset);D id : T {enter(top(tblptr), id.name, T.type, top(offset)); top(offset) top(offset) + T.width; }

28

Array access: How does the compiler handle A[ i ][ j ] ?

First, must agree on a storage schemeRow-major order (most languages)

Lay out as a sequence of consecutive rowsRightmost subscript varies fastestA[1,1], A[1,2], A[1,3], A[2,1], A[2,2], A[2,3]

Column-major order (Fortran) Lay out as a sequence of columnsLeftmost subscript varies fastestA[1,1], A[2,1], A[1,2], A[2,2], A[1,3], A[2,3]

Indirection vectors (Java) Vector of pointers to pointers to … to valuesTakes more spaceTrades indirection for arithmetic (in modern processors memory access is

more costly than arithmetic)Locality may not be good if indirection vectors are used

29

Laying Out Arrays

The Concept

Row-major order

Column-major order

Indirection vectors

1,1 1,2 1,3 1,4 2,1 2,2 2,3 2,4A

1,1 2,1 1,2 2,2 1,3 2,3 1,4 2,4A

1,1 1,2 1,3 1,4

2,1 2,2 2,3 2,4A

1,1 1,2 1,3 1,4

2,1 2,2 2,3 2,4A

These have distinct & different cache behavior The order of traversal of an array can effect the performance

30

Computing an Array Address

A[ i ]

• @A + ( i - low ) sizeof(A[1])

What about A[i1][i2] ?

Row-major order, two dimensions

@A + (( i1 - low1 ) (high2 - low2 + 1) + i2 - low2) sizeof(A[1])

Column-major order, two dimensions

@A + (( i2 - low2 ) (high1 - low1 + 1) + i1 - low1) sizeof(A[1])

Indirection vectors, two dimensions

*(A[i1])[i2] — where A[i1] is, itself, a 1-d array reference

Expensive computation!

Lots of +, -, x operations

int A[1:10] low is 1Make low 0 for faster access (save a subtraction )

Almost always a power of 2, known at compile-time use a shift for speed

Base of A (starting address of the array)

31

Optimizing Address Calculation for A[ i ][ j ]

In row-major order

@A + (i-low1)(high2-low2+1) sizeof(A[1][1]) + (j - low2) sizeof(A[1][1])

Which can be factored into

@A + i (high2-low2+1) sizeof(A[1][1]) + j sizeof(A[1][1])

- (low1 (high2-low2+1) sizeof(A[1][1]) + low2 sizeof(A[1][1]))

If lowi, highi, and sizeof(A[1][1]) are known, the last term is a constant

Define @A0 as

@A - (low1 (high2-low2+1) sizeof(A[1][1]) + low2 sizeof(A[1][1]))

And len2 as (high2-low2+1)

Then, the address expression becomes

@A0 + (i len2 + j ) sizeof(A[1][1])Compile-time constants

32

Array References

What about arrays as actual parameters?

Whole arrays, as call-by-reference parameters

• Need dimension information build a dope vector

• Store the values in the calling sequence

• Pass the address of the dope vector in the parameter slot

• Generate complete address polynomial at each reference

Some improvement is possible

• Save leni and lowi rather than lowi and highi

• Pre-compute the fixed terms in prolog sequence

What about call-by-value?

• Most call-by-value languages pass arrays by reference

@Alow1

high1

low2

high2

33

Structure of a Compiler

InstructionSelection

InstructionScheduling

RegisterAllocation

Scanner ParserSemantic Analysis

Code Optimization

IntermediateCode

Generation

Documents

1 CMPSC 160 Translation of Programming Languages Fall 2002 Lecture-Modules 17 and 18 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon