Compiler phases:
● Lexical analysis
● Syntactical analysis
● Semantical analysis
● Intermediate code generation
● Optimization
● Code generation
● Target specific optimization
Runtime environments - Chapter 7
We need to translate high-level code such as “trac42” into machine code.
The instruction set of trac42VM (virtual machine):
● Can be seen as an intermediate representation
● Can at the same time be executed by the interpreter trac42i
Difference from assignment 1: we now have type checking.
● We want to generate “real code”
● We need to consider the types of data
● Comparing two strings differs from comparing two integers
● Different machine instructions are needed
Code and data
In the source language, data and executable code are put on the same “sheet of paper”, as declarations and statements in the source code file.
When executing a program we need to separate the code from the data.
The main purpose of this chapter for us:
● We need to understand the translation from a representation of a high-level language to stack machine code
● We need to understand how the stack machine code operates on the stack
Some issues:
● We can't use labels, only real addresses
● Functions can be called from different contexts
● Functions take parameters and return values
● Different data types differ in size
Memory arrangement
Memory is partitioned into different sections:
● Executable code of a program
● Stack: e.g. local variables
● Static: global and static memory
● Heap: dynamically allocated memory
Functions
A function definition:
void foo()
{
  write "Hello";
}
Means: “foo” is a label for the code between the braces, the “body”.
When “foo” is used in some other code, the body of “foo” should be executed.
We may also need to handle recursive functions, i.e. “foo” may be referred to in the body of “foo”.
Functions
Another function definition:
int twice(int x)
{
  int y;
  y = 2 * x;
  return y;
}
A definition like the ones above (in languages like C and trac42) also serves as a declaration of the interface to the function.
This declaration states:
● twice needs to be called with a single int parameter
● twice returns a value of type int
This is useful for the type analysis (as you already know).
It is also needed to generate code correctly.
Function calls
  . . .
  a = twice(b + 1) + 1;
  . . .
● Ensure that the body of twice is executed
● You must also handle parameters and the return value!
Before the execution of the body of twice:
● The assignment x_twice = b + 1 must take place
When the execution has finished:
● The call expression is replaced by the returned value in the expression containing the call
● The execution continues after the call by computing a = retval_twice + 1
To ensure that the caller's execution resumes correctly, we may also need to:
● Save the processor state
● Save the return-to address
Do this before invoking the function.
Function calls' contexts
void bar()
  ...
  a = twice(b + 1);
  baz();
  ...
void foo()
  ...
  x = twice(x + 2) + 2;
  ...
We also need to handle the function calls differently in different contexts:
  x_twice = b_bar + 1
  "execute body of twice"
  a_bar = retval;
  ...
  x_twice = x_foo + 2
  "execute body of twice"
  x_foo = retval + 2;
We use a stack to accomplish this.
The stack
A stack is used to convey the needed information:
● To called functions
● From called functions
● Necessary program state (e.g. the return-to address)
The information on the stack associated with one function call is called an activation record (or sometimes a stack frame).
Layout of an activation record
Listed from the bottom of the stack (pushed first) to the top (pushed last):
● Return value
● Actual arguments
● Return address
● Frame pointer
● Local variables
● Temporary variables
Mapping to C
int twice(int x)
{
  int y;
  y = 2 * x;
  return y;
}
. . .
a = twice(13);
x = ...
(Figure: the fields of the activation record, i.e. return value, actual arguments, return address, frame pointer, local variables and temporary variables, mapped onto the code above.)
Example with numbers
int twice(int x)
{
3   int y;
4   y = 2 * x;
5   return y;
}
7 a = twice(42);
8 x = ...
Activation record needed before executing stmt 4:
Address  Content
42       Caller's record
41       Return value
40       Actual arguments
39       Return address
38       Frame pointer
37       Local variables
(The figure also marks the positions of FP and SP.)
Example with numbers
Same code and activation record as on the previous slide, for the call a = twice(42) with stack addresses 42 down to 37.
FP and SP are global registers holding the frame pointer and the stack pointer.
Mapping to stack
int twice(int x)
{
3   int y;
4   y = 2 * x;
5   return y;
}
7 a = twice(13);
8 x = ...
The stack during the call in stmt 7, one row per slide. Address 42 holds the caller's record, and ?? marks a reserved position that has not been assigned yet.
Step                          SP  FP  41  40  39  38  37
Start here (stmt 7)           42  42
Reserve the return value      41  42  ??
Push the actual argument      40  42  ??  13
Push the return address *     39  42  ??  13  8
Save FP and set FP            38  38  ??  13  8   42
Reserve the local variable    37  38  ??  13  8   42  ??
Execute stmt 4 (y = 2 * x)    37  38  ??  13  8   42  26
Execute stmt 5 (return y)     37  38  26  13  8   42  26
Restore FP                    39  42  26  13  8
Return to the caller          40  42  26  13
Pop the actual argument       41  42  26
* does not map to high-level view
After the last step, the value 26 at address 41 is the value of twice(13) in stmt 7.
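The steps of this trace can be replayed with a few lines of Python. This is only a sketch under the assumptions visible on the slides: the stack grows towards lower addresses, SP points at the most recently pushed position, and both SP and FP start at 42. The names Stack and trace_call are made up for this illustration and are not part of trac42.

```python
class Stack:
    """A downward-growing stack with the two registers SP and FP."""

    def __init__(self, sp, fp):
        self.mem = {}   # address -> value, only used positions are stored
        self.sp = sp
        self.fp = fp

    def push(self, value):
        self.sp -= 1
        self.mem[self.sp] = value

    def pop(self):
        value = self.mem[self.sp]
        self.sp += 1
        return value

    def link(self):
        # Save the caller's FP and let FP point at the saved value.
        self.push(self.fp)
        self.fp = self.sp

    def unlink(self):
        # Drop locals and temporaries, then restore the caller's FP.
        self.sp = self.fp
        self.fp = self.pop()


def trace_call(argument, return_to=8):
    """Replay a = twice(argument); record (step, SP, FP) after each step."""
    stack = Stack(sp=42, fp=42)
    steps = []

    def record(step):
        steps.append((step, stack.sp, stack.fp))

    stack.push(None)                  # unassigned slot for the return value
    record("reserve return value")
    stack.push(argument)
    record("push argument")
    stack.push(return_to)
    record("push return address")
    stack.link()
    record("save FP")
    stack.push(None)                  # unassigned slot for the local y
    record("reserve local y")
    stack.mem[stack.fp - 1] = 2 * stack.mem[stack.fp + 2]   # y = 2 * x
    record("y = 2 * x")
    stack.mem[stack.fp + 3] = stack.mem[stack.fp - 1]       # retval = y
    record("return value = y")
    stack.unlink()
    record("restore FP")
    stack.pop()                       # the return address
    record("return to caller")
    stack.pop()                       # the actual argument
    record("pop argument")
    return steps, stack.mem[stack.sp]


steps, result = trace_call(13)
for step, sp, fp in steps:
    print(f"{step:22} SP={sp} FP={fp}")
print("value left on top of the stack:", result)
```

Each printed row corresponds to one slide of the trace, so the sketch can be used to check a paper trace against the SP and FP values shown there.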
Responsibility
Who does what in the administration of a function invocation?
Calling function (function call):
● Reserve space for the return value
● Push the actual arguments
● Push the return-to address
● Call the function
● Pop the actual arguments
Called function (function definition):
● Save and re-set FP
● Reserve space for local variables
● Assign the return value
● Restore FP
● Return to the caller
The task for your compiler is to generate code that handles this!
The task for the compiler
Calculate the actual addresses of:
● Variables
● Parameters
● Return value
Generate instructions using the proper addresses.
The FP is there to simplify the task for you: you just need to calculate the offset relative to FP!
We split the work into two tasks (two passes):
● Calculate offsets
● Generate code using the offsets
Offset calculation
int twice(int x)
{
3   int y;
4   y = 2 * x;
5   return y;
}
7 a = twice(13);
8 x = ...
Address  Content           Value
42       Caller's record
41       Return value      26
40       Actual arguments  13
39       Return address    8
38       Frame pointer     42    <- FP
37       Local variables   26    <- SP
For the moment: assume that all stack objects take 1 stack position.
What are the offsets for y, x and the return value?
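The first pass, calculating offsets, could look roughly like the Python sketch below. It assumes, as the slide does, that every stack object takes 1 stack position, and additionally that arguments are pushed from left to right, so the last argument ends up closest to FP. The function name calculate_offsets and the key "<return value>" are made up for this illustration.

```python
def calculate_offsets(parameters, local_variables, has_return_value):
    """Return a dict mapping each name to its offset relative to FP.

    Assumed layout: FP points at the saved frame pointer, the return
    address sits at FP+1, the arguments above it and the return value
    above the arguments. Local variables sit below FP.
    """
    offsets = {}

    # Parameters: the last pushed argument is at FP+2, earlier ones above.
    offset = 2
    for name in reversed(parameters):
        offsets[name] = offset
        offset += 1

    # The return value slot was reserved before the arguments were pushed.
    if has_return_value:
        offsets["<return value>"] = offset

    # Local variables: the first one is at FP-1, later ones further down.
    offset = -1
    for name in local_variables:
        offsets[name] = offset
        offset -= 1

    return offsets


print(calculate_offsets(["x"], ["y"], has_return_value=True))
```

For twice this gives the same offsets that appear in the generated code of the complete example: 2(FP) for x, 3(FP) for the return value and -1(FP) for y.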
Code to be generated
Function call
If the function returns a value you must reserve space for the return value. Note! This needs special care in a language like C, where a function's result may be ignored.
  DECL #size (or PUSH)
For each actual argument, PUSH its value on the stack.
Call the function and push the return-to address at the same time.
  BSR #address
Pop the actual arguments.
  POP #size
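The call sequence above can be sketched as a small Python helper that returns the instructions as strings. It is an illustration only: gen_call is a made-up name, every value is assumed to have size 1, and the code that evaluates each argument is passed in ready-made.

```python
def gen_call(argument_code, address, returns_value):
    """Generate the caller's part of a function call.

    argument_code is a list with one instruction list per actual argument,
    address is the address of the called function.
    """
    code = []
    if returns_value:
        code.append("DECL #1")                    # reserve the return value
    for instructions in argument_code:
        code.extend(instructions)                 # push each actual argument
    code.append(f"BSR {address}")                 # push return-to address and jump
    if argument_code:
        code.append(f"POP {len(argument_code)}")  # pop the actual arguments
    return code


print(gen_call([["PUSHINT #13"]], address=2, returns_value=True))
```

Emitting a single POP for all arguments is a design choice of this sketch; one POP per argument would leave the stack in the same state.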
Code to be generated
Function definition
Push FP and set FP to a stack address that is fixed throughout the function's execution.
  LINK
Reserve space for all the local variables.
  DECL #size
Restore FP to the value it had when entering this function.
  UNLINK
Return to the caller.
  RTS
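In the same spirit, the frame around a function body can be sketched in Python as follows. The name gen_function is made up, the body code is passed in ready-made, and the label notation [name] is taken from the listings in these slides.

```python
def gen_function(name, locals_size, body_code):
    """Wrap the code of a function body in LINK/DECL ... UNLINK/RTS."""
    code = [f"[{name}]", "LINK"]            # label, then save and set FP
    if locals_size > 0:
        code.append(f"DECL #{locals_size}")  # reserve the local variables
    code.extend(body_code)
    code.extend(["UNLINK", "RTS"])           # restore FP, return to caller
    return code


body = ["LVAL -1(FP)", "PUSHINT #2", "RVALINT 2(FP)", "MULT", "ASSINT"]
for instruction in gen_function("twice", 1, body):
    print(instruction)
```

If a return statement also emits its own UNLINK and RTS, the generated function ends with two such pairs. That may be why the listing for twice in these slides contains UNLINK and RTS twice, although the slides do not say so.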
A complete example
int twice(int x)
{
  int y;
  y = 2 * x;
  return y;
}
void trac42()
{
  int a;
  a = twice(13);
}
 2 [twice]
 3 LINK             save FP
 4 DECL #1          y
 5 LVAL -1(FP)      &y
 6 PUSHINT #2       2
 7 RVALINT 2(FP)    x
 8 MULT             *
 9 ASSINT           y =
10 LVAL 3(FP)       &retval
11 RVALINT -1(FP)   y
12 ASSINT           retval =
13 UNLINK           restore FP
14 RTS
15 UNLINK           restore FP
16 RTS
17 [trac42]
18 LINK             save FP
19 DECL #1          a
20 LVAL -1(FP)      &a
21 DECL #1          retval
22 PUSHINT #13      args
23 BSR 2            call twice
24 POP 1            pop args
25 ASSINT           a =
26 UNLINK           restore FP
27 RTS
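To see how this listing behaves, here is a minimal interpreter sketch in Python. It is not trac42i. The meaning of each instruction is inferred from the slides (downward-growing stack, SP pointing at the last pushed position), and two things are pure assumptions of the sketch: execution starts at the [trac42] label, and an RTS executed with an empty stack stops the program.

```python
# The listing above, instruction addresses 2 to 27, as (operation, operand).
LISTING = [
    ("LABEL", "twice"),    # 2
    ("LINK", None),        # 3
    ("DECL", 1),           # 4
    ("LVAL", -1),          # 5
    ("PUSHINT", 2),        # 6
    ("RVALINT", 2),        # 7
    ("MULT", None),        # 8
    ("ASSINT", None),      # 9
    ("LVAL", 3),           # 10
    ("RVALINT", -1),       # 11
    ("ASSINT", None),      # 12
    ("UNLINK", None),      # 13
    ("RTS", None),         # 14
    ("UNLINK", None),      # 15
    ("RTS", None),         # 16
    ("LABEL", "trac42"),   # 17
    ("LINK", None),        # 18
    ("DECL", 1),           # 19
    ("LVAL", -1),          # 20
    ("DECL", 1),           # 21
    ("PUSHINT", 13),       # 22
    ("BSR", 2),            # 23
    ("POP", 1),            # 24
    ("ASSINT", None),      # 25
    ("UNLINK", None),      # 26
    ("RTS", None),         # 27
]
PROGRAM = {address: instr for address, instr in enumerate(LISTING, start=2)}


def run(program, entry, sp=100, fp=0):
    """Execute program from address entry; return (memory, SP, FP)."""
    mem = {}
    initial_sp = sp
    pc = entry

    def push(value):
        nonlocal sp
        sp -= 1
        mem[sp] = value

    def pop():
        nonlocal sp
        value = mem[sp]
        sp += 1
        return value

    while True:
        op, arg = program[pc]
        pc += 1                      # pc now points at the next instruction
        if op == "LABEL":
            pass
        elif op == "LINK":
            push(fp)
            fp = sp
        elif op == "UNLINK":
            sp = fp
            fp = pop()
        elif op == "DECL":
            sp -= arg
        elif op == "POP":
            sp += arg
        elif op == "PUSHINT":
            push(arg)
        elif op == "LVAL":
            push(fp + arg)           # the address FP + offset
        elif op == "RVALINT":
            push(mem[fp + arg])      # the value stored at FP + offset
        elif op == "MULT":
            right = pop()
            left = pop()
            push(left * right)
        elif op == "ASSINT":
            value = pop()
            address = pop()
            mem[address] = value
        elif op == "BSR":
            push(pc)                 # return-to address
            pc = arg
        elif op == "RTS":
            if sp == initial_sp:     # assumption: empty stack ends the program
                return mem, sp, fp
            pc = pop()
        else:
            raise ValueError(f"unknown instruction {op}")


memory, sp, fp = run(PROGRAM, entry=17)
print("a =", memory[98], "SP =", sp)
```

With SP starting at 100, the saved FP of trac42 lands at address 99 and the variable a at address 98, so a can be read from memory[98] after the run.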
A stack trace
● When a program starts, the stack is empty.
● However, SP and FP always have some values.
● The initial value of FP is uninteresting in our context.
● The initial value of SP is usually set to leave sufficient space for the program.
● When a program has terminated, the stack should be empty: SP should have retained its initial value.
● Note that the semantics of UNLINK may hide stack errors. You need to check that SP is restored to the position of the last local variable before executing UNLINK.
● You can use trac42i to do traces.
● You will however need to practice on paper to understand how it all works ...
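Why UNLINK may hide stack errors can be shown with a small Python sketch. It assumes that UNLINK first sets SP to FP and then pops the saved FP, which matches the trace in these slides. The helper names unlink and sp_is_balanced are made up.

```python
def unlink(sp, fp, memory):
    """Return (new SP, new FP) after UNLINK. The old SP is not used at all."""
    saved_fp = memory[fp]       # FP points at the saved frame pointer
    return fp + 1, saved_fp


def sp_is_balanced(sp, fp, locals_size):
    """Check to run before UNLINK: SP must be at the last local variable."""
    return sp == fp - locals_size


memory = {38: 42}               # saved FP of the caller, as in the twice trace
correct_sp = 37                 # FP = 38 and one local variable
leaked_sp = 36                  # the body left one extra value on the stack

print(unlink(correct_sp, 38, memory), sp_is_balanced(correct_sp, 38, 1))
print(unlink(leaked_sp, 38, memory), sp_is_balanced(leaked_sp, 38, 1))
```

Both calls to unlink give the same result, so the leaked value goes unnoticed unless SP is checked before UNLINK is executed.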
News with the stack machine code
● Sizes matter, e.g. size of(INT) != size of(STRING)
● Operations that depend on the size have a type tag, e.g.
  ● PUSHINT, PUSHBOOL, PUSHSTR
  ● RVALINT, ...
  ● ASSINT, ...
  ● EQINT, ...
  ● LTINT, ...
  ● LEINT, ...
  ● READINT, ...
  ● WRITEINT, ...
● LINK and UNLINK are used to handle FP
● BSR and RTS are used to call and return. BSR needs the actual address
● Instructions referring to the stack need the offset to FP
● A string has size 100; everything else (including pointers) has size 1
● You should be prepared for any size to appear...
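One way to keep sizes and type tags out of the rest of the code generator is a pair of lookup tables, sketched here in Python. Only PUSHINT, PUSHBOOL and PUSHSTR are spelled out on the slide. Applying the same suffixes to other operations is an assumption of this sketch, and the helper names are made up.

```python
# Sizes in stack positions, as stated on the slide.
SIZES = {"int": 1, "bool": 1, "string": 100}

# Type tags appended to size-dependent operations (assumed to be uniform).
SUFFIXES = {"int": "INT", "bool": "BOOL", "string": "STR"}


def tagged(operation, type_name):
    """Build a type-tagged instruction name such as PUSHINT or PUSHSTR."""
    return operation + SUFFIXES[type_name]


def decl_for_locals(local_types):
    """Build the DECL instruction that reserves all local variables."""
    total = sum(SIZES[type_name] for type_name in local_types)
    return f"DECL #{total}"


print(tagged("PUSH", "string"))
print(decl_for_locals(["int", "string"]))
```

Keeping the sizes in one table makes it easier to be prepared for any size to appear.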
Exercise
Show the stack content as it is before instruction 13 if SP = 100 and FP = 102.
The code is the same as in “A complete example”: the functions twice and trac42 and their trac42VM instructions 2 to 27.
Reverse this to t42-code!
(invent names of identifiers when needed)
 2 [id]
 3 LINK
 4 RVALINT 2(FP)
 5 RVALINT 3(FP)
 6 ADD
 7 WRITEINT
 8 POP 1
 9 UNLINK
10 RTS
11 [trac42]
12 LINK
13 DECL #1
14 LVAL -1(FP)
15 PUSHINT #42
16 ASSINT
17 PUSHINT #27
18 RVALINT -1(FP)
19 PUSHINT #99
20 ADD
21 BSR 2
22 POP 1
23 POP 1
24 UNLINK
25 RTS
Encode this to trac42VM-code (stack machine code)
void print(string s1, int x, string s2, int y)
{
  write s1;
  write x;
  write s2;
  write y;
}
void trac42()
{
  print("Here comes one:", 1, "Here comes two:", 2);
}
Parameter passing
● In trac42 we use call by value: we get what we write.
  ● Same in C
● Another approach is call by reference: a pointer to what we write is pushed.
  ● E.g. Java objects and references in C++
  ● Usually combined with call by value
● Other:
  ● Call by name: similar to macros in C
  ● Copy restore: like the name suggests...
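The difference between three of these approaches can be made visible with a small simulation in Python, where a list stands in for memory and an index stands in for an address. This only models the mechanisms and is not how Python itself passes parameters. All function names are made up. The example passes the same variable twice to a function that increments both of its parameters.

```python
def bump_both_by_value(a, b):
    # The callee works on copies; the caller's variable is not touched.
    a += 1
    b += 1


def bump_both_by_reference(memory, address_a, address_b):
    # The callee receives addresses and updates the caller's memory directly.
    memory[address_a] += 1
    memory[address_b] += 1


def bump_both_copy_restore(memory, address_a, address_b):
    # Copy in: the callee works on local copies ...
    a = memory[address_a]
    b = memory[address_b]
    a += 1
    b += 1
    # ... restore: the copies are written back when the function returns.
    memory[address_a] = a
    memory[address_b] = b


def demo():
    results = {}
    memory = [10]
    bump_both_by_value(memory[0], memory[0])
    results["value"] = memory[0]
    memory = [10]
    bump_both_by_reference(memory, 0, 0)
    results["reference"] = memory[0]
    memory = [10]
    bump_both_copy_restore(memory, 0, 0)
    results["copy restore"] = memory[0]
    return results


print(demo())
```

Call by reference and copy restore only differ when two parameters refer to the same variable, which is why the example passes the same address twice.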
Memory allocation
● In trac42 we use stack allocation only.
● Another approach is static allocation for variables
  ● Disadvantage: does not allow for recursion
  ● Advantage: may be fast, no risk of stack overflow...
● Heap allocation
  ● Java objects
  ● Many functional languages
  ● Do not confuse with explicit memory allocation like in C/C++
Methods are usually mixed
● Globals and static locals in C are static
● Locals are on the stack
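Why static allocation does not allow for recursion can be illustrated with a factorial function in Python. In the first version the parameter n lives in one fixed slot that all invocations share, which mimics static allocation. In the second version each invocation has its own n. The names are made up, and the Python variable partial merely stands in for a temporary value.

```python
STATIC_N = [0]   # the single, statically allocated slot for the parameter n


def factorial_static(n):
    STATIC_N[0] = n                 # every call overwrites the same slot
    if STATIC_N[0] <= 1:
        return 1
    partial = factorial_static(STATIC_N[0] - 1)
    return STATIC_N[0] * partial    # n was clobbered by the recursive call


def factorial_stack(n):
    if n <= 1:                      # n belongs to this invocation only
        return 1
    return n * factorial_stack(n - 1)


print(factorial_static(4), factorial_stack(4))
```

The static version returns a wrong result because the innermost call leaves 1 in the shared slot, and every outer invocation then multiplies by that value instead of its own n.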
Local functions
Some languages allow nested (local) functions:
main() {
  int x;
  foo(int y) {
    if y > 0 foo(y-1);
    x = 42;
  }
  foo(2);
}
How do we find x on the stack?
How do we find y on the stack without confusing it with the y of the (recursive) caller?
Local functions
We add a so-called access link to the stack frame.
This can be seen as the next pointer in a linked list, where the records in the linked list are the activation records.
(Figure: an activation record with the fields return value, actual arguments, return address, frame pointer, local variables and the added access link.)
The “linked list model” is not exactly true, because there may sometimes be short-cuts in the “linked list”.
Local functions
Generate code that, for each access to a non-local variable, follows the access links to the activation record of the function to which the variable belongs.
● The relative nesting depth between the function of declaration and the function of the access determines the code to generate.
Generate code that sets the access link. That code must be generated at the call site.
● The relative nesting depth between the caller and the called function determines the code to generate.
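The two rules can be tried out on the main/foo example with a small Python model, where activation records are objects instead of stack positions. It assumes that the recursive call in foo is foo(y - 1), that main has nesting depth 1 and foo nesting depth 2, and it uses the common rule that the call site follows (caller depth - callee depth + 1) access links to find the record the new access link should point to. The class and function names are made up.

```python
class Frame:
    """An activation record, reduced to what matters for access links."""

    def __init__(self, depth, access_link):
        self.depth = depth                # nesting depth of the function
        self.access_link = access_link    # record of the enclosing function
        self.variables = {}


def follow(frame, hops):
    """Follow the access link hops times."""
    for _ in range(hops):
        frame = frame.access_link
    return frame


def enter(caller, callee_depth):
    """Create the callee's record; done at the call site."""
    hops = caller.depth - callee_depth + 1
    return Frame(callee_depth, follow(caller, hops))


def foo(caller, y, created):
    frame = enter(caller, callee_depth=2)
    frame.variables["y"] = y              # each invocation has its own y
    created.append(frame)
    if y > 0:
        foo(frame, y - 1, created)
    # x is declared at depth 1 and accessed at depth 2: follow one link.
    follow(frame, 1).variables["x"] = 42


def main():
    frame = Frame(depth=1, access_link=None)
    frame.variables["x"] = None
    created = []
    foo(frame, 2, created)
    return frame, created


main_frame, foo_frames = main()
print(main_frame.variables["x"], [f.variables["y"] for f in foo_frames])
```

All three records of foo get an access link to the record of main, no matter whether foo was called from main or from itself. This is the kind of short-cut that makes the access links differ from a plain list of callers.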