Middle and Back End
[Figure: AST -> (translation) -> IR1 -> (translation) -> IR2 -> (more IRs and translation) -> asm]
Sources and IRs
CODE and DATA:
Procedures
Control Flow
Statements
Data Access
Global Static Variables
Global Dynamic Data
Local Variables
Temporaries
Parameter Passing
Read-only Data
A code generator should…
  Translate all “CODE” to machine (or assembly) instructions
    target-dependent
  Allocate space for variables, etc. (“DATA”)
  Respect the calling conventions and other constraints
  To do all this, it must know the details of modern processors and their impact on code generation
Overview of a modern processor
[Figure: processor block diagram with four parts: ALU, Control, Memory, Registers]
Arithmetic and Logic Unit
  Most arithmetic and logic operations, e.g.:
    addl %eax, %ebx
    incl 4(%ecx)
  Operands: immediate, register, memory
Arithmetic and Logic Unit
  Operations may have constraints
    how to perform a division? cltd; idivl ...
  Operations may raise exceptions
    e.g. idivl with a zero divisor
  Operations on different types
    addb, addw, addl, addq
Control
  Executing instructions
    instructions are in memory (pointed to by PC)
    for (;;) { instruction = *PC; PC++; execute (instruction); }
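The fetch-execute loop above can be made concrete with a toy machine. The sketch below is only an illustration: the three-instruction set (`OP_INC`, `OP_ADD`, `OP_HALT`), the `struct instruction` layout and the `run` function are invented for this example and do not correspond to any real processor.

```c
/* Hypothetical three-instruction machine, invented for illustration only. */
enum opcode { OP_INC, OP_ADD, OP_HALT };

struct instruction {
    enum opcode op;
    int operand;                 /* used by OP_ADD only */
};

/* Runs the program starting at pc and returns the accumulator. */
int run(const struct instruction *pc)
{
    int acc = 0;
    for (;;) {
        struct instruction instruction = *pc;   /* fetch: instruction = *PC */
        pc++;                                   /* advance: PC++ */
        switch (instruction.op) {               /* execute (instruction) */
        case OP_INC:
            acc++;
            break;
        case OP_ADD:
            acc += instruction.operand;
            break;
        case OP_HALT:
            return acc;
        }
    }
}
```

A program is then just an array of instructions ending in `OP_HALT`, with `pc` playing the role of the program counter.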
Registers
  Limited in number but high-speed
    8 on x86, more on RISC
  Most are general-purpose, but some are of special use
Memory
  The address space is the way programs use memory
    highly architecture and OS dependent
  The typical layout on 32-bit x86/Linux (high addresses at the top):
    0xffffffff
      OS
    0xc0000000
      stack
      heap
      data
      text
    0x08048000
    0x00100000
      BIOS, VGA
    0x00000000
Read Only Data
// C code
char *s = "hello";
void f ()
{ printf(s); }
// x86 code
.text
f:
  pushl $s
  call printf
s:
  .string "hello"
Global Static Variables
// C code
int d = 1;
void f ()
{
  d++;
}
// x86 code
.text
f:
  movl d, %eax
  incl %eax
  movl %eax, d
.data
d:
  .int 1
Global Dynamic Data
// C code
void f ()
{
  malloc(4);
}
// x86 code
.text
f:
  pushl $4
  call malloc
  movl %eax, %ebx
Function, or procedure, or method, or …
  High-level abstraction of code
    logically grouped
  Good for many things:
    design and abstraction
    development, testing, maintenance and evolution
    …
  Implementation?
    we start with C-style functions, and deal with more advanced forms later
API & ABI
  Application Programming Interface
    interfaces between source programs
  Application Binary Interface
    contracts between binary programs
      even when compiled from different languages by different compilers
    conventions on low-level details:
      how to pass arguments?
      how to return values?
      how to make use of registers?
      …
    we posted the x86 ABI document on the course page
Parameter Passing
Parameter passing
  Must answer two questions:
    what to pass?
      call-by-value, call-by-reference, call-by-need, …
    how to pass?
      calling convention
  http://en.wikipedia.org/wiki/X86_calling_conventions
Call-by-reference
  In languages such as C++
  Arguments are escaped (their addresses are taken)
    so they cannot be constants?
  Actual arguments and formal parameters are aliases
// C++ style reference:
int f (int &x, int y)
{
x = 3;
y = 4;
return 0;
}
// a call
f (a, b);
Simulating call-by-reference
// original C++ code:
int f (int &x, int y)
{
x = 3;
y = 4;
return 0;
}
// a call
f (a, b);
// simulated:
int f (int *x, int y)
{
*x = 3;
y = 4;
return 0;
}
// the call becomes:
f (&a, b);
Moral
  Call-by-reference is widely considered a design mistake in C++
    the code is inherently inefficient!
    the code is ambiguous in nature
      x = 4; (does this write to a local or to the caller's variable?)
  A variant of this is the so-called call-by-value/result
    looks like call-by-value, but with an effect on the caller's arguments
Call-by-value/result
  Upon call, the actual arguments are copied
  The callee only modifies its local version
  Upon exit, the callee copies the local version back to the actual arguments
// code:
int f (int @x, int y)
{
x = 3;
y = 4;
return 0;
}
// a call
f (a, b);
Simulating call-by-value/result
// original code:
int f (int @x, int y)
{
x = 3;
y = 4;
return 0;
}
// a call
f (a, b);
// simulated:
int f (int *x, int y)
{
int temp = *x;
temp = 3;
y = 4;
*x = temp;
return 0;
}
// the call becomes:
f (&a, b);
Moral
  What’s the difference between call-by-value and call-by-value/result?
  Is call-by-value/result more efficient than call-by-reference? Why or why not?
  We will come back to a more interesting optimization called register promotion
    same idea: pull values into registers
Call-by-name
  Some languages, such as Algol 60, use call-by-name (Haskell uses the memoized variant, call-by-need)
  Arguments are not evaluated until they are really needed in the callee
  For each argument, create a function, called a thunk
// code:
int f (int name x, int y)
{
if (y)
return x;
else
return 0;
}
// a call
f (a, b);
Simulating call-by-name
// original code:
int f (int name x, int y)
{
if (y)
return x;
else
return 0;
}
// a call
f (a, b);
// simulated:
int f (fX: unit -> int, int y)
{
if (y)
return fX ();
else
return 0;
}
// the call becomes:
f (fn () => a, b);
// note: the function fn () => a is not closed, since a is a free variable in it
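Because C has no anonymous functions, the thunk simulation above can be approximated with a code pointer paired with an environment pointer. The following is a minimal sketch under that assumption; `struct thunk`, `force`, `eval_a`, `f_by_name` and `call_f` are names invented for this illustration.

```c
/* A thunk: a code pointer plus the environment it needs.
   The pair is required precisely because the thunk body is not closed. */
struct thunk {
    int (*code)(void *env);
    void *env;
};

/* Evaluate the delayed argument. */
static int force(struct thunk t)
{
    return t.code(t.env);
}

/* f (int name x, int y): x is evaluated only if y is nonzero. */
int f_by_name(struct thunk x, int y)
{
    if (y)
        return force(x);
    else
        return 0;
}

/* Thunk body for the argument expression a; env points at the caller's a. */
static int eval_a(void *env)
{
    return *(int *)env;
}

/* The call f (a, b) becomes: */
int call_f(int a, int b)
{
    struct thunk x = { eval_a, &a };
    return f_by_name(x, b);
}
```

The environment pointer is what makes the open function usable: it carries the free variable `a` from the call site into the thunk body.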
Moral
A serious problem with call-by-name is that the arguments may be evaluated many times
A better solution is to memoize the evaluation result
This method is called call-by-need, or sometimes lazy evaluation
Simulating call-by-need
// original code:
int f (int need x, int y)
{
if (y)
return x + x;
else
return 0;
}
// a call
f (a, b);
// simulated:
int f (fX: unit -> int, int y) {
if (y)
return fX() + fX();
else return 0;
}
// the call becomes:
val xMemoize = ref NONE
f (fn () =>
     case !xMemoize of
       NONE => let val v = a in xMemoize := SOME v; v end   (* evaluate once and store *)
     | SOME i => i, b);
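The memoization step can also be sketched in C by adding a flag and a cached value to the thunk. This is an illustrative sketch only: `struct lazy`, `force_lazy`, `struct counted`, `eval_counted` and `f_by_need` are hypothetical names, and the evaluation counter exists purely to make the "evaluated once" property observable.

```c
/* A memoizing thunk: evaluated at most once, then the result is cached. */
struct lazy {
    int (*code)(void *env);
    void *env;
    int evaluated;               /* 0 corresponds to NONE, 1 to SOME value */
    int value;
};

int force_lazy(struct lazy *t)
{
    if (!t->evaluated) {
        t->value = t->code(t->env);   /* first use: evaluate and store */
        t->evaluated = 1;
    }
    return t->value;                  /* later uses: return the stored result */
}

/* Environment that counts how often the argument is really evaluated. */
struct counted {
    int value;
    int evaluations;
};

int eval_counted(void *env)
{
    struct counted *c = env;
    c->evaluations++;
    return c->value;
}

/* f (int need x, int y): x is used twice but evaluated once. */
int f_by_need(struct lazy *x, int y)
{
    if (y)
        return force_lazy(x) + force_lazy(x);
    else
        return 0;
}
```

Note that the thunk is passed by pointer here, so that both uses of `x` share the same cache.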
Where to pass the parameters?
  Different calling conventions:
    pass them in registers
    pass them on the stack (typically: the call stack)
    a combination of the two
      some in registers, some on the stack
  This involves not only the ISA, but also the language
Sample Calling Conventions for C on x86 (from Wiki)
Registers
Register usage
  Must be careful about register usage
    caller-save: callee is free to destroy these registers
      eax, ecx, edx, eflags, fflags [and also all FP registers]
    callee-save: callee must restore these registers before returning to the caller
      ebp, esp, ebx, esi, edi [and also the FP register stack top]
Register usage
  Should a value reside in caller-save or callee-save registers?
    not so easy to determine, and there are no general rules
    must be very careful with language features such as longjmp, goto or exceptions
      we will come back to this later
  We will also come back to this issue in the register allocation part
The Call Stack
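Stack on x86
  Two dedicated registers: %ebp and %esp
  Stack grows down to lower addresses
  A frame is also called an activation record
[Figure: the stack, with frame 0 at the high-address end, followed by frame 1 and frame 2 toward the low-address end; %ebp and %esp point into the stack, with %esp at the low-address end]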
Stack Frame
int f (int arg0, int arg1, …)
{
  int local1;
  int local2;
  …;
}
Frame layout (high addresses at the top):
  …
  arg1
  arg0
  ret addr
  old ebp     <- %ebp
  local1
  local2
  …           <- %esp
Put these together
// C code
int main(void)
{ return f(8)+1; }
int f(int x)
{ return g(x); }
int g(int x)
{ return x+3; }
// x86 code
main:
pushl %ebp
movl %esp, %ebp
pushl $8
call f
incl %eax
leave
ret
f:
pushl %ebp
movl %esp, %ebp
pushl 8(%ebp)
call g
leave
ret
g:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
addl $3, %eax
leave
ret
Implementation
  Design a frame (activation record) data structure
    the frame size
    garbage collection info
    detailed layout, etc.
  Thus, hide the machine-related details
    good for retargeting the compiler
Interface
signature FRAME =
sig
type t
(* allocate space for a variable in frame *)
val allocVar: unit -> unit
(* create a new frame *)
val new: unit -> t
(* current size of the frame *)
val size: unit -> int
end
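One way such an interface might be realized is sketched below in C. It is not the course's implementation: the word size of 4 bytes assumes 32-bit x86, the frame is passed explicitly rather than being implicit as in the signature, and `frame_alloc_var` returns the variable's offset from %ebp (the signature above returns unit). All names are hypothetical.

```c
#define WORD_SIZE 4   /* assumption: 32-bit x86, every variable takes one word */

/* A frame only tracks how many bytes of locals have been allocated so far. */
struct frame {
    int size;
};

/* create a new, empty frame */
struct frame frame_new(void)
{
    struct frame f = { 0 };
    return f;
}

/* allocate space for a variable in the frame;
   returns its offset from %ebp (negative, since locals sit below old ebp) */
int frame_alloc_var(struct frame *f)
{
    f->size += WORD_SIZE;
    return -f->size;
}

/* current size of the frame, in bytes */
int frame_size(const struct frame *f)
{
    return f->size;
}
```

With this layout the first local lands at -4(%ebp) and the second at -8(%ebp), and the final size tells the code generator how far to move %esp in the function prologue.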
Frame on stack
  Both function arguments and locals have a LIFO lifetime, as with function calls themselves
    so one can put the stack frame on the call stack
  But later, we will have the chance to see other possibilities
    e.g.: higher-order nested functions
Nested Function
Nested Functions
  Functions declared in the body of another function
  So the inner one can refer to the variables in the outer ones
    such functions are called open
int f (int x, int y)
{
int m;
int g (int z)
{
int h ()
{
return m+z;
}
return 1;
}
return 0;
}
Nested Functions
  How to access those variables in outer functions?
  Three classical methods:
    lambda lifting
    static link
    display
Lambda lifting
In lambda lifting, the program is translated into a form such that all procedures are closed
The translation process starts with the inner-most procedures and works its way outwards
Lambda lifting example
// step 1: the free variables of h (m and z) become reference parameters
int f (int x, int y)
{
  int m;
  int g (int z)
  {
    int h (int &m, int &z)
    {
      return m+z;
    }
    return 1;
  }
  return 0;
}
Lambda lifting example
// step 2: m is now free in g, so g takes it as a reference parameter too
int f (int x, int y)
{
  int m;
  int g (int &m, int z)
  {
    int h (int &m, int &z)
    {
      return m+z;
    }
    return 1;
  }
  return 0;
}
Lambda lifting example
// step 3: flatten, since all functions are now closed
int f (int x, int y){
  int m;
  return 0;
}
int g (int &m, int z){
  return 1;
}
int h (int &m, int &z){
  return m+z;
}
Moral
  Pros:
    easy to implement, source-to-source translation
      even before code generation
  Cons:
    all variables are escaped
    extra argument passing
      on some architectures, more arguments are passed in memory, so it’s inefficient
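The flattened result of lambda lifting can be written in plain C by simulating the reference parameters with pointers, as in the call-by-reference slides. The sketch below differs from the running example in one respect, which is an assumption made for illustration: f actually calls g and g actually calls h, so the extra arguments can be seen flowing through. The `_lifted` names and the values of `m` and `z` are invented.

```c
/* h, lifted: its free variables m and z have become pointer parameters. */
static int h_lifted(int *m, int *z)
{
    return *m + *z;
}

/* g, lifted: m is free in g (it is needed by h), so g receives it as well. */
static int g_lifted(int *m, int z)
{
    return h_lifted(m, &z);       /* z escapes: its address is taken */
}

/* f is already closed; it only has to pass &m down. */
int f_lifted(int x, int y)
{
    int m = x + y;                /* m escapes: its address is taken */
    return g_lifted(&m, 10);
}
```

Both cons listed above are visible here: `m` and `z` have their addresses taken, so they cannot live in registers, and every call carries an extra argument.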
Static links
  An alternative approach is to add an additional piece of information to the activation records, called the static link
  The static link is a pointer to the activation record of the enclosing procedure
  Used in the Borland Turbo Pascal compiler
Static links example
int f (link, int x, int y)
{
  int m;
  int g (link, int z){
    int h (link){
      return link->prev->m + link->z;
    }
    return 1;
  }
  return 0;
}
Pros and cons
  Pros:
    little extra overhead on parameter passing
      just the static link
  Cons:
    there is still the overhead of climbing up a static link chain to access non-locals
Implementation details
  First, each function is annotated with its nesting depth, and hence so are its variables
  When a function at depth n accesses a variable at depth m
    emit code to climb up n-m links to reach the appropriate activation record
Implementation details
  When a procedure p at depth n calls a procedure q at depth m:
    if n<m (i.e., q is nested within p):
      note: in first-order languages, n=m-1
      q’s static link = q’s dynamic link
    if n≥m:
      q’s prelude must follow n-m static links, starting from the caller’s (p’s) static link
      the result is the static link for q
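The static link scheme can be mimicked in C by making each activation record an explicit struct whose first field points to the enclosing function's record. This is a sketch with invented names (`struct f_frame`, `struct g_frame`, the `_linked` functions) and invented values; as a further assumption, f calls g and g calls h so that the links are actually used.

```c
/* Activation record of f (depth 0): no enclosing function, so no link. */
struct f_frame {
    int x, y, m;
};

/* Activation record of g (depth 1): prev is the static link to f's record. */
struct g_frame {
    struct f_frame *prev;
    int z;
};

/* h (depth 2) receives its static link, which points to g's record.
   z is one level up, m is two levels up: climb one more link for m. */
static int h_linked(struct g_frame *link)
{
    return link->prev->m + link->z;
}

static int g_linked(struct f_frame *link, int z)
{
    struct g_frame self = { link, z };
    /* n<m case: h is nested directly in g, so its static link is g's own record */
    return h_linked(&self);
}

int f_linked(int x, int y)
{
    struct f_frame self = { x, y, x * y };
    /* n<m case again: g is nested directly in f */
    return g_linked(&self, 5);
}
```

The access `link->prev->m` is the "climb up n-m links" rule in action: h sits at depth 2 and m lives at depth 0, so two pointers are followed.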
Moral
  In theory, static links don’t seem very good
    functions may be deeply nested
  However, real programs access mainly local/global variables, or occasionally variables just one or several static links away
  Still, experimentation shows that static links are inferior to the lambda-lifting approach
  Personally, I believe static links are not amenable to optimizations
Display
  The third way to handle nested functions is to use a display
  A display is a small stack of pointers to activation records
  The display keeps track of the lexical nesting structure of the program
  Essentially, it points to the current set of activation records that contain accessible variables
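A display can be sketched in C as a global array indexed by nesting depth. This is a simplified illustration, not a production scheme: `MAX_DEPTH`, the `_rec` structs and the `_display` functions are invented, and each function saves and restores the display entry for its own depth around its body.

```c
#define MAX_DEPTH 8   /* assumption: nesting never exceeds 8 levels */

/* display[d] points to the current activation record at nesting depth d. */
static void *display[MAX_DEPTH];

struct f_rec { int x, y, m; };   /* record of f, depth 0 */
struct g_rec { int z; };         /* record of g, depth 1 */

/* h (depth 2) reaches any enclosing record with a single indexed load. */
static int h_display(void)
{
    struct f_rec *ff = display[0];
    struct g_rec *gf = display[1];
    return ff->m + gf->z;
}

static int g_display(int z)
{
    struct g_rec self = { z };
    void *saved = display[1];    /* save the old entry for depth 1 */
    display[1] = &self;
    int result = h_display();
    display[1] = saved;          /* restore on exit */
    return result;
}

int f_display(int x, int y)
{
    struct f_rec self = { x, y, x * y };
    void *saved = display[0];
    display[0] = &self;
    int result = g_display(5);
    display[0] = saved;
    return result;
}
```

Compared with static links, a non-local access costs one array lookup regardless of how many levels separate the use from the definition, at the price of maintaining the array on every call and return.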
Higher-order functions
  Functions may serve more than just being called
    can be passed as arguments
    can be returned as results
    can be stored in data structures
      objects! we will discuss these later
  If functions don’t nest, then the implementation is simple
    a simple code address
    e.g., the “function pointer” in C
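For the non-nested case, a short C sketch shows the three uses listed above with plain function pointers. The function names (`add3`, `twice`, `pick`) and the `int_fn` typedef are invented for this example.

```c
/* A function pointer is just a code address. */
typedef int (*int_fn)(int);

static int add3(int x)
{
    return x + 3;
}

/* passed as an argument */
static int twice(int_fn fn, int x)
{
    return fn(fn(x));
}

/* returned as a result */
static int_fn pick(void)
{
    return add3;
}

/* stored in a data structure */
static int_fn table[] = { add3 };
```

None of these functions has free variables, which is why a bare code address is enough.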
Higher-order functions
  But if functions do nest, it’s much trickier to compile:
    as found in Lisp, ML, Scheme
    even in recent versions of C# and Java
  Later, we will discuss more advanced techniques to handle this