CADSL
Compiler Optimization
Virendra Singh Computer Architecture and Dependable Systems Lab
Department of Electrical Engineering Indian Institute of Technology Bombay
http://www.ee.iitb.ac.in/~viren/ E-mail: [email protected]
Advanced Topics in Computing @ MNIT
Lecture 2 (09 Sep 2015)
General Structure of a Modern Compiler
Source Program → Lexical Analysis → Syntax Analysis → Semantic Analysis → Controlflow/Dataflow → Optimization → Code Generation → Assembly Code
Diagram annotations: Scanner; Parser; High-level IR to low-level IR conversion; Build high-level IR; Context, Symbol Table; CFG; Machine independent asm to machine dependent
The phases are grouped into a front end and a back end.
09 Sep 2015 virendra@MNIT 2
Multiple IRs
• Most compilers use 2 IRs:
– High-level IR (HIR): Language independent but closer to the language
– Low-level IR (LIR): Machine independent but closer to the machine
– A significant part of the compiler is both language and machine independent!
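[Figure: C++, C, and Fortran front ends produce an AST, which becomes HIR and is then lowered to LIR; "optimize" steps apply to the IRs; back ends target Pentium, Java bytecode, Itanium, TI C5x, and ARM.]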
High-Level IR
• HIR is essentially the AST
– Must be expressive for all input languages
• Preserves high-level language constructs
– Structured control flow: if, while, for, switch
– Variables, expressions, statements, functions
• Allows high-level optimizations based on properties of the source language
– Function inlining, memory dependence analysis, loop transformations
Low-Level IR
• A set of instructions which emulates an abstract machine (typically RISC)
• Has low-level constructs
– Unstructured jumps, registers, memory locations
• Types of instructions
– Arithmetic/logic (a = b OP c), unary operations, data movement (move, load, store), function call/return, branches
Alternatives for LIR
• 3 general alternatives
– Three-address code or quadruples
  • a = b OP c
  • Advantage: Makes compiler analysis/optimization easier
– Tree representation
  • Was popular for CISC architectures
  • Advantage: Easier to generate machine code
– Stack machine
  • Like Java bytecode
  • Advantage: Easier to generate from AST
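To make two of these alternatives concrete, here is a minimal sketch that lowers one small expression tree to three-address code and to stack-machine code. The tuple encoding of the tree, the temporary names `t1`, `t2`, and the mnemonics `push`, `add`, `mul` are assumptions made for this illustration and do not come from the slides.

```python
from itertools import count

# Expression trees are nested tuples (op, left, right); leaves are variable names.
def to_three_address(expr):
    """Lower an expression tree to a list of 'dst = a OP b' strings."""
    code, fresh = [], count(1)

    def walk(node):
        if isinstance(node, str):      # leaf: a variable name
            return node
        op, left, right = node
        a, b = walk(left), walk(right)
        dst = f"t{next(fresh)}"        # hypothetical temporary naming scheme
        code.append(f"{dst} = {a} {op} {b}")
        return dst

    walk(expr)
    return code

def to_stack_code(expr):
    """Lower the same tree to stack-machine instructions (post-order walk)."""
    if isinstance(expr, str):
        return [f"push {expr}"]
    op, left, right = expr
    names = {"+": "add", "*": "mul"}   # only the operators used in this sketch
    return to_stack_code(left) + to_stack_code(right) + [names[op]]

tree = ("+", "a", ("*", "b", "c"))     # a + b * c
print(to_three_address(tree))
print(to_stack_code(tree))
```

The stack version falls straight out of a post-order walk of the tree, which is the "easier to generate from AST" advantage; the three-address version names every intermediate value, which is what later analyses work with.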
IR choices
• Other hybrids exist
– combinations of graphs and linear codes
– CFG with 3-address code for basic blocks
• Many variants used in practice
– no widespread agreement
– compilers may need several different IRs!
• Advice:
– choose IR with right level of detail
– keep manipulation costs in mind
Static Single Assignment Form
• Goal: simplify procedure-global optimizations
• Definition: A program is in SSA form if every variable is assigned only once
Static Single Assignment (SSA)
• Each assignment to a temporary is given a unique name
– All uses reached by that assignment are renamed
– Compact representation
– Useful for many kinds of compiler optimization …
Ron Cytron, et al., “Efficiently computing static single assignment form and the control dependence graph,” ACM TOPLAS, 1991.

Original:
x := 3;
x := x + 1;
x := 7;
x := x*2;

SSA:
x1 := 3;
x2 := x1 + 1;
x3 := 7;
x4 := x3*2;
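The renaming for straight-line code like this can be sketched in a few lines. This is only an illustration for code without branches; the `(target, expression)` pair format and the regex-based identifier matching are assumptions of the sketch, and it presumes source variable names do not themselves end in digits.

```python
import re

def rename_straight_line(stmts):
    """Rename (target, expression) pairs so that every target is assigned once."""
    version = {}                      # variable -> latest version number
    out = []
    for target, expr in stmts:
        # Uses refer to the most recent version of each variable.
        def latest(match):
            name = match.group(0)
            return f"{name}{version[name]}" if name in version else name
        new_expr = re.sub(r"[A-Za-z_]\w*", latest, expr)
        # The assignment itself creates a fresh version of the target.
        version[target] = version.get(target, 0) + 1
        out.append((f"{target}{version[target]}", new_expr))
    return out

program = [("x", "3"), ("x", "x + 1"), ("x", "7"), ("x", "x*2")]
for target, expr in rename_straight_line(program):
    print(f"{target} := {expr};")
```

Uses are rewritten before the target's version is bumped, so `x := x + 1` reads the old version and defines the new one.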
Example: Condition
Original:
if B then
  a := b
else
  a := c
end
… a …

SSA:
if B then
  a1 := b
else
  a2 := c
end
… a? …
Conditions: what to do on control-flow merge?
Solution: Φ-Function
Original:
if B then
  a := b
else
  a := c
end
… a …

SSA:
if B then
  a1 := b
else
  a2 := c
end
a3 := Φ(a1,a2)
… a3 …
Conditions: what to do on control-flow merge?
The Φ-Function
• Φ-functions are always at the beginning of a basic block
• Select between values depending on control flow
• a_{k+1} := Φ(a_1, …, a_k): the block has k preceding blocks
Φ-functions are evaluated simultaneously within a basic block.
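Why simultaneous evaluation matters shows up when two Φ-functions in the same block read each other's targets. The sketch below is a hypothetical mini-interpreter; the dictionary layout and the block names `entry` and `loop` are assumptions for illustration.

```python
def eval_phis(phis, pred, env):
    """Evaluate all Φ-functions of a block at once.

    phis: {target: {predecessor_block: source_variable}}
    pred: the predecessor block control actually arrived from
    env:  current variable values
    """
    # Read every selected source first, then write, so no Φ sees another's result.
    selected = {target: env[sources[pred]] for target, sources in phis.items()}
    return {**env, **selected}

# Loop header that swaps two values on each trip around the back edge.
phis = {"a2": {"entry": "a1", "loop": "b2"},
        "b2": {"entry": "b1", "loop": "a2"}}
env = eval_phis(phis, "entry", {"a1": 1, "b1": 2})
env = eval_phis(phis, "loop", env)
print(env["a2"], env["b2"])   # 2 1
```

Evaluating the two Φ-functions one after the other would leave both variables holding 2 after the second step, which is not the swap the program expresses.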
SSA and CFG
• SSA is normally used for control-flow graphs (CFG)
• Basic blocks are in 3-address form
Recall: Control flow graph
• A CFG models transfer of control in a program
– nodes are basic blocks (straight-line blocks of code)
– edges represent control flow (loops, if/else, goto …)
© Marcus Denker
if x = y then
  S1
else
  S2
end
S3
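One common way to obtain such a graph from linear code is to split it at "leaders" (the first instruction, every label, and every instruction following a jump or branch) and then connect the resulting blocks. The sketch below assumes a made-up tuple instruction format (`label`, `jump`, `branch`, `op`) and encodes the if/else example with the condition inverted so that the then-part is the fall-through path.

```python
def build_cfg(instrs):
    """Split a linear instruction list into basic blocks and connect them."""
    # Leaders: the first instruction, every label, and whatever follows a jump/branch.
    leaders = {0}
    for i, ins in enumerate(instrs):
        if ins[0] == "label":
            leaders.add(i)
        elif ins[0] in ("jump", "branch") and i + 1 < len(instrs):
            leaders.add(i + 1)
    starts = sorted(leaders)
    blocks = [instrs[s:e] for s, e in zip(starts, starts[1:] + [len(instrs)])]

    # Map each label to the index of the block it opens.
    label_block = {b[0][1]: n for n, b in enumerate(blocks) if b[0][0] == "label"}

    edges = set()
    for n, block in enumerate(blocks):
        last = block[-1]
        if last[0] in ("jump", "branch"):
            edges.add((n, label_block[last[-1]]))       # taken edge
        if last[0] != "jump" and n + 1 < len(blocks):
            edges.add((n, n + 1))                       # fall-through edge
    return blocks, edges

# if x = y then S1 else S2 end; S3
program = [
    ("branch", "x != y", "ELSE"),
    ("op", "S1"),
    ("jump", "END"),
    ("label", "ELSE"),
    ("op", "S2"),
    ("label", "END"),
    ("op", "S3"),
]
blocks, edges = build_cfg(program)
print(sorted(edges))
```

For this input the result is the familiar diamond: the test block has edges to both arms, and both arms have an edge to the block containing S3.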
SSA: a Simple Example
if B then
  a1 := 1
else
  a2 := 2
end
a3 := PHI(a1,a2)
… a3 …
Recall: IR
• front end produces IR
• optimizer transforms IR into a more efficient program
• back end transforms IR into target code
SSA as IR
Transforming to SSA
• Problem: Performance / Memory
– Minimize number of inserted Φ-functions
– Do not spend too much time
• Many relatively complex algorithms
– We do not go too much into detail
– See literature!
Minimal SSA
• Two steps:
– Place Φ-functions
– Rename variables
• Where to place Φ-functions?
• We want the minimal number of Φ-functions needed
– Save memory
– Algorithms will work faster
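The slides leave the placement algorithm to the literature. As a rough sketch of the placement step only, the approach in the Cytron et al. paper puts Φ-functions on the iterated dominance frontier of the blocks that assign a variable. The code below assumes the dominance frontiers are already computed and passed in as a dictionary; computing them, and the renaming step, are not shown. The block names are hypothetical.

```python
def place_phis(def_blocks, frontier):
    """Return {variable: set of blocks needing a Φ for it}.

    def_blocks: {variable: set of blocks that assign it}
    frontier:   {block: dominance frontier of that block}, assumed precomputed
    """
    phis = {}
    for var, blocks in def_blocks.items():
        placed, work = set(), list(blocks)
        while work:
            block = work.pop()
            for join in frontier.get(block, ()):
                if join not in placed:
                    placed.add(join)
                    # A Φ is itself a new assignment, so its block is processed too.
                    if join not in blocks:
                        work.append(join)
        phis[var] = placed
    return phis

# Diamond CFG: entry -> then / else -> join
frontier = {"then": {"join"}, "else": {"join"}}
print(place_phis({"a": {"then", "else"}, "b": {"entry"}}, frontier))
```

In the diamond, `a` is assigned in both arms and gets one Φ at the join, while `b`, assigned only in the entry block, gets none.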
Optimization: The Idea
• Transform the program to improve efficiency
• Performance: faster execution
• Size: smaller executable, smaller memory footprint
Tradeoffs:
1) Performance vs. size
2) Compilation speed and memory
Optimization on many levels
• Optimizations happen both in the optimizer and in the back end
Examples for Optimizations
• Constant Folding / Propagation
• Copy Propagation
• Algebraic Simplifications
• Strength Reduction
• Dead Code Elimination
– Structure Simplifications
• Loop Optimizations
• Partial Redundancy Elimination
• Code Inlining
Constant Folding
• Evaluate constant expressions at compile time
• Only possible when freedom from side effects is guaranteed
c := 1 + 3 → c := 4
true not → false
Caveat: floating-point results could differ between machines!
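A minimal sketch of folding on a tiny expression tree follows. The tuple encoding `(op, left, right)` and the three supported operators are assumptions of this example. It folds integers only, in line with the caveat about floats, and it uses Python's unbounded integers, so it does not model overflow on a real target.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    """Fold constant subexpressions of a tuple tree (op, left, right)."""
    if not isinstance(expr, tuple):
        return expr                    # an int constant or a variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    # Integer-only: floats are deliberately left untouched in this sketch.
    if type(left) is int and type(right) is int:
        return OPS[op](left, right)
    return (op, left, right)

print(fold(("+", 1, 3)))                 # 4
print(fold(("*", "x", ("+", 1, 3))))     # ('*', 'x', 4)
```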
Constant Propagation
• Variables that have a constant value, e.g. c := 3
– Later uses of c can be replaced by the constant
– If c is not changed in between

Original:
b := 3
c := 1 + b
d := b + c

After constant propagation:
b := 3
c := 1 + 3
d := 3 + c

Analysis needed, as b can be assigned more than once!
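For a single basic block the required analysis is simple: track which variables currently hold a known constant and forget a variable as soon as it is reassigned to something else. The sketch below assumes a hypothetical statement format, `(target, operands)`, where operands is a tuple of ints, variable names and operator strings. It does not handle branches.

```python
def propagate_constants(stmts):
    """Replace uses of variables known to hold a constant (straight-line code)."""
    known = {}                              # variable -> constant value
    out = []
    for target, operands in stmts:
        # Substitute known constants for variable names; ints pass through.
        new = tuple(known.get(x, x) if isinstance(x, str) else x for x in operands)
        out.append((target, new))
        if len(new) == 1 and isinstance(new[0], int):
            known[target] = new[0]          # target now holds a constant
        else:
            known.pop(target, None)         # reassigned: no longer known
    return out

block = [("b", (3,)), ("c", (1, "+", "b")), ("d", ("b", "+", "c"))]
for target, operands in propagate_constants(block):
    print(target, ":=", *operands)
```

Here `c` is not propagated into the last statement because `1 + 3` has not been folded yet; running constant folding and propagation alternately would eventually reduce `d` to a constant as well.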
Copy Propagation
• For a statement x := y
• Replace later uses of x with y, if x and y have not been changed.

Original:
x := y
c := 1 + x
d := x + c

After copy propagation:
x := y
c := 1 + y
d := y + c

Analysis needed, as y and x can be assigned more than once!
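Within one basic block, the "have not been changed" condition can be tracked by dropping a recorded copy as soon as either side of it is reassigned. The sketch assumes a made-up `(target, operands)` statement format with operands given as a tuple of ints, variable names and operator strings.

```python
def propagate_copies(stmts):
    """Replace uses of x by y after x := y, within straight-line code."""
    copies = {}                             # x -> y for a still-valid copy x := y
    out = []
    for target, operands in stmts:
        new = tuple(copies.get(x, x) for x in operands)
        out.append((target, new))
        # Assigning target kills every copy that mentions it on either side.
        copies = {x: y for x, y in copies.items() if target not in (x, y)}
        if len(new) == 1 and isinstance(new[0], str) and new[0] != target:
            copies[target] = new[0]
    return out

block = [("x", ("y",)), ("c", (1, "+", "x")), ("d", ("x", "+", "c"))]
for target, operands in propagate_copies(block):
    print(target, ":=", *operands)
```

After this pass `x` is no longer read, so dead code elimination can remove the copy itself.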
Algebraic Simplifications
• Use algebraic properties to simplify expressions
-(-i) → i
b or: true → true
Important to simplify code for later optimizations
Strength Reduction
• Replace expensive operations with simpler ones
• Example: Multiplications replaced by additions
y := x * 2 → y := x + x
Peephole optimizations are often strength reductions
Dead Code
• Remove unnecessary code
– e.g. variables assigned but never read

Original:
b := 3
c := 1 + 3
d := 3 + c

After:
c := 1 + 3
d := 3 + c

• Remove code never reached
if (false) {a := 5} → if (false) {}
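The "assigned but never read" case can be sketched as a backward walk over a basic block that tracks which variables are still needed. The `(target, operands)` format and the `live_out` parameter (the variables needed after the block) are assumptions for this illustration, and right-hand sides are assumed to be free of side effects.

```python
def eliminate_dead(stmts, live_out):
    """Drop assignments whose target is not live afterwards (straight-line code)."""
    live = set(live_out)
    kept = []
    for target, operands in reversed(stmts):
        if target in live:
            kept.append((target, operands))
            live.discard(target)
            # Every variable read by a kept statement becomes live before it.
            live.update(x for x in operands if isinstance(x, str) and x.isidentifier())
        # else: the value is never read, so the statement is dropped
    kept.reverse()
    return kept

block = [("b", (3,)), ("c", (1, "+", 3)), ("d", (3, "+", "c"))]
for target, operands in eliminate_dead(block, live_out={"d"}):
    print(target, ":=", *operands)
```

With only `d` needed afterwards, the assignment to `b` disappears, matching the example above.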
Simplify Structure
• Similar to dead code: simplify the CFG structure
• Optimizations will degenerate the CFG
• It needs to be cleaned up to simplify further optimization!
Delete Empty Basic Blocks
Fuse Basic Blocks
Common Subexpression Elimination (CSE)
Common Subexpression:
• There is another occurrence of the expression whose evaluation always precedes this one
• Operands remain unchanged
Local (inside one basic block): when building IR
Global (complete flow graph)
Example CSE
Original (three basic blocks):
b := a + 2; c := 4 * b; b < c?
b := 1
d := a + 2

After CSE:
t1 := a + 2; b := t1; c := 4 * b; b < c?
b := 1
d := t1
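The example above is the global case, which needs dataflow analysis across blocks. The simpler local case, inside one basic block, can be sketched by remembering which variable holds each expression already computed. The `(target, expression)` format with expressions as `(a, op, b)` tuples is an assumption of this sketch; unlike the slide, it reuses the original variable instead of introducing a temporary such as t1.

```python
def local_cse(stmts):
    """Reuse earlier results of identical 'a OP b' expressions in one block."""
    available = {}                          # (a, op, b) -> variable holding it
    out = []
    for target, expr in stmts:
        if expr in available:
            out.append((target, (available[expr],)))   # becomes a plain copy
        else:
            out.append((target, expr))
        # Assigning target invalidates expressions that read it or are held in it.
        available = {e: v for e, v in available.items()
                     if target not in e and v != target}
        if len(expr) == 3 and target not in expr and expr not in available:
            available[expr] = target
    return out

block = [("b", ("a", "+", 2)), ("c", (4, "*", "b")), ("d", ("a", "+", 2))]
for target, expr in local_cse(block):
    print(target, ":=", *expr)
```

If `b` were reassigned between the two occurrences of `a + 2`, the entry would be invalidated and `d` would be recomputed, which is the situation the temporary t1 avoids in the global example.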
Loop Optimizations
• Optimizing code in loops is important
– often executed, large payoff
• All optimizations help when applied to loop bodies
• Some optimizations are loop specific
Loop Invariant Code Motion
• Move expressions that are constant over all iterations out of the loop
Induction Variable Optimizations
• Values of variables form an arithmetic progression

Original:
integer a(100)
do i = 1, 100
  a(i) = 202 - 2 * i
enddo

After (uses strength reduction):
integer a(100)
t1 := 202
do i = 1, 100
  t1 := t1 - 2
  a(i) = t1
enddo

The value assigned to a(i) decreases by 2 on each iteration.
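A quick way to convince oneself that the transformed loop is equivalent is to run both versions and compare. The following is a direct Python transcription of the two loops; the list is given 101 slots so that indices 1 to 100 match the original array, and index 0 is simply unused.

```python
def original():
    a = [0] * 101                 # index 0 unused, mirroring a(1..100)
    for i in range(1, 101):
        a[i] = 202 - 2 * i        # one multiplication per iteration
    return a

def strength_reduced():
    a = [0] * 101
    t1 = 202
    for i in range(1, 101):
        t1 = t1 - 2               # running value replaces the multiplication
        a[i] = t1
    return a

assert original() == strength_reduced()
```

The first iteration stores 200 and the last stores 2 in both versions.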
Partial Redundancy Elimination (PRE)
• Combines multiple optimizations:
– global common-subexpression elimination
– loop-invariant code motion
• Partial redundancy: computation done more than once on some path in the flow graph
• PRE: insert and delete code to minimize redundancy.
Code Inlining
• All optimizations up to now were local to one procedure
• Problem: procedures or functions are very short
– Especially in good OO code!
• Solution: Copy code of small procedures into the caller
– OO: Polymorphic calls. Which method is called?
Example: Inlining
Original:
a := power2(b)
power2(x) {
  return x*x
}

After inlining:
a := b * b
Optimizations in the Backend
• Register Allocation
• Instruction Selection
• Peephole Optimization
Register Allocation
• Processor has only a finite number of registers
– Can be very small (x86)
• Temporary variables
– non-overlapping temporaries can share one register
• Passing arguments via registers
• Optimizing register allocation is very important for good performance
– Especially on x86
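The idea that non-overlapping temporaries can share a register can be sketched with a greedy pass over live intervals, in the spirit of linear-scan allocation. The interval format `{temp: (start, end)}` is an assumption of this example, and the sketch does not spill: it simply fails when it runs out of registers.

```python
def assign_registers(intervals, num_regs):
    """Greedy register sharing for live intervals {temp: (start, end)}.

    Returns {temp: register index}; raises if more than num_regs are needed,
    since spilling is outside this sketch.
    """
    active = []                              # (end, temp) pairs currently in registers
    free = list(range(num_regs))
    assignment = {}
    for temp, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers of temporaries whose interval ended before this one starts.
        for old_end, old in list(active):
            if old_end < start:
                active.remove((old_end, old))
                free.append(assignment[old])
        if not free:
            raise RuntimeError("out of registers; a real allocator would spill")
        assignment[temp] = free.pop(0)
        active.append((end, temp))
    return assignment

print(assign_registers({"t1": (0, 2), "t2": (1, 4), "t3": (3, 5)}, num_regs=2))
```

Here t1 and t3 never overlap, so both end up in register 0 while t2 occupies register 1.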
Instruction Selection
• For every expression, there are many ways to realize it on a processor
• Example: Multiplication by 2 can be done by a bit shift
Instruction selection is a form of optimization
Peephole Optimization
• Simple local optimization
• Look at code “through a hole”
– replace sequences by known shorter ones
– table pre-computed
store R,a; load a,R → store R,a;
imul 2,R; → ashl 1,R;
Important when using simple instruction selection!
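As a rough sketch, the two rewrites above can be implemented as a single pass over a list of instruction strings. The patterns are hard-coded here rather than looked up in a precomputed table, and the operand syntax is taken from the slide's examples; everything else about the instruction format is an assumption.

```python
def peephole(code):
    """One pass over a list of instruction strings with two hard-coded patterns."""
    out = []
    for ins in code:
        op, _, args = ins.partition(" ")
        # Pattern 1: a load that re-reads what the previous store just wrote.
        if op == "load" and out:
            prev_op, _, prev_args = out[-1].partition(" ")
            src, dst = [a.strip() for a in args.split(",")]
            if prev_op == "store" and [a.strip() for a in prev_args.split(",")] == [dst, src]:
                continue                     # value is still in the register
        # Pattern 2: multiply by two becomes a left shift by one.
        if op == "imul" and args.split(",")[0].strip() == "2":
            ins = "ashl 1," + args.split(",")[1].strip()
        out.append(ins)
    return out

print(peephole(["store R,a", "load a,R", "imul 2,R"]))
```

A table-driven version would keep the same loop and replace the two `if` blocks with a lookup keyed on short instruction sequences.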
Compiler Backend Structure
Improve code quality (machine independent opti):
• Control flow analysis, control flow optimization: branching structure
• Dataflow analysis, dataflow optimization: computation instructions
Virtual to physical mapping and machine dependent opti:
• Instruction Scheduling: bind instrs to physical resources
• Register Allocation: bind virtual regs to physical regs
• Instruction Selection: bind instrs to physical realizations
• Machine Code Emission/Opti