CS 406/534 Compiler Construction
Putting It All Together
Prof. Li Xu
Dept. of Computer Science
UMass Lowell
Fall 2004
Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr. Linda Torczon’s teaching materials at Rice University.
All rights reserved.
CS406/534 Fall 2004, Prof. Li Xu 2
Administrivia
- Last lecture today
- Lab2 and lab3 graded
- Final exam now handed out, due 12/20
  - Deadline is firm: late exams will not be graded
- Extra credit lab3 presentation today
What We Did Last Time
- Program analysis and optimization
  - Overview of compiler optimization
  - Local optimization
    - DAG
    - Value numbering
  - Control flow analysis
    - CFG, DOM tree, natural loops
  - Data flow analysis
    - Generic framework
    - Typical data flow problems: AVAIL, REACH, LIVE
  - SSA
Today’s Goals
- Summary of the subjects we’ve covered
- Perspectives and final remarks
  - How will you use 91.406/534 knowledge?
High-level View
- Definitions
  - Compiler: consumes source code & produces target code
    - usually translates high-level language programs into machine code
  - Interpreter: consumes executables & produces results
    - virtual machine for the input code
[Figure: source code → compiler → machine code, with the compiler also reporting errors]
Why Study Compilers?
- Compilers are important
  - Enabling technology for languages, software development
  - Allow programmers to focus on problem solving, hiding the hardware complexity
  - Responsible for good system performance
- Compilers are useful
  - Language processing is broadly applicable
- Compilers are fun
  - Combine theory and practice
  - Overlap with other CS subjects
  - Hard problems
  - Engineering and trade-offs
  - Got a taste in the labs!
Structure of Compilers
[Figure: three-phase compiler structure, with IR passed between phases]
- Front End: Scanner, Parser, CSA (context-sensitive analysis)
- Middle End: Analysis, Optimization 1, Optimization 2, ..., Optimization n
- Back End: Instruction Selection, Instruction Scheduling, Register Allocation
- Infrastructure: symbol tables, trees, graphs, intermediate representations, sets, tuples
The Front-end
[Figure: compiler structure diagram (Front End, Middle End, Back End, shared infrastructure)]
Lexical Analysis
- Scanner
  - Maps the character stream into tokens
- Automate scanner construction
  - Define tokens using regular expressions (REs)
  - Construct an NFA (nondeterministic finite automaton) to recognize the REs
  - Transform the NFA into a DFA (deterministic finite automaton)
    - Convert NFA to DFA through subset construction
    - DFA minimization (set splitting)
  - Build scanners from the DFA
- Tools
  - ANTLR, lex
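The NFA-to-DFA step can be made concrete with a minimal sketch of subset construction. The function names and the tiny NFA for `a(b|c)*` are illustrative assumptions, not material from the course labs.

```python
from collections import deque

def epsilon_closure(states, eps):
    """Return all NFA states reachable from `states` using only epsilon moves."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(start, accepting, delta, eps, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    d_start = epsilon_closure({start}, eps)
    d_delta, seen, worklist = {}, {d_start}, deque([d_start])
    while worklist:
        S = worklist.popleft()
        for c in alphabet:
            moved = {t for s in S for t in delta.get((s, c), ())}
            if not moved:
                continue  # no transition on c: the DFA rejects here
            T = epsilon_closure(moved, eps)
            d_delta[(S, c)] = T
            if T not in seen:
                seen.add(T)
                worklist.append(T)
    # A DFA state accepts if it contains at least one accepting NFA state.
    d_accepting = {S for S in seen if S & accepting}
    return d_start, d_accepting, d_delta

def dfa_accepts(dfa, text):
    """Run the DFA over `text`; a missing transition means rejection."""
    state, accepting, delta = dfa
    for c in text:
        state = delta.get((state, c))
        if state is None:
            return False
    return state in accepting

# Hypothetical NFA for the RE a(b|c)*:
#   0 -a-> 1,  1 -eps-> 2,  2 -b-> 1,  2 -c-> 1,  accepting state 2
delta = {(0, "a"): {1}, (2, "b"): {1}, (2, "c"): {1}}
eps = {1: {2}}
dfa = subset_construction(0, {2}, delta, eps, "abc")
```

Each DFA state is a set of NFA states, so the worklist only ever visits subsets that are actually reachable from the start state.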
Syntax Analysis
- Parsing a language using a CFG (context-free grammar)
- CFG theory
  - Derivation
  - Parse tree
  - Grammar ambiguity
- Parsing
  - Top-down parsing
    - recursive descent
    - table-driven LL(1)
  - Bottom-up parsing
    - LR(1) shift-reduce parsing
Top-down Predictive Parsing
- Basic idea
  - Build the parse tree from the root. Given A → α | β, use the look-ahead symbol to choose between α & β
- Recursive descent
- Table-driven LL(1)
- Left recursion elimination
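A minimal recursive-descent sketch, assuming a small arithmetic grammar chosen only for illustration. The left-recursive rules (Expr → Expr + Term, and so on) have been rewritten with primed nonterminals, and each procedure uses one look-ahead token to pick a production.

```python
import re

def tokenize(text):
    # Sketch only: characters outside this token set are silently skipped.
    return re.findall(r"\d+|[-+*()]", text) + ["$"]

class Parser:
    """Grammar after left-recursion elimination:
         Expr   -> Term Expr'
         Expr'  -> + Term Expr' | - Term Expr' | epsilon
         Term   -> Factor Term'
         Term'  -> * Factor Term' | epsilon
         Factor -> number | ( Expr )
    """
    def __init__(self, text):
        self.tokens, self.pos = tokenize(text), 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def expr(self):
        return self.expr_prime(self.term())

    def expr_prime(self, left):
        # The accumulated value is passed down so that - stays left-associative.
        look = self.peek()
        if look == "+":
            self.match("+")
            return self.expr_prime(left + self.term())
        if look == "-":
            self.match("-")
            return self.expr_prime(left - self.term())
        return left  # epsilon production

    def term(self):
        return self.term_prime(self.factor())

    def term_prime(self, left):
        if self.peek() == "*":
            self.match("*")
            return self.term_prime(left * self.factor())
        return left  # epsilon production

    def factor(self):
        look = self.peek()
        if look == "(":
            self.match("(")
            value = self.expr()
            self.match(")")
            return value
        if look.isdigit():
            self.match(look)
            return int(look)
        raise SyntaxError(f"unexpected token {look!r}")

def parse(text):
    parser = Parser(text)
    value = parser.expr()
    parser.match("$")  # the whole input must be consumed
    return value
```

For example, `parse("8-3-2")` evaluates the subtraction left to right even though the rewritten grammar is right-recursive.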
Bottom-up Shift-Reduce Parsing
- Build the rightmost derivation in reverse
- The key is to find the handle (rhs of a production)
  - All active handles include the top of stack (TOS)
  - Shift inputs until TOS is the right end of a handle
- Language of handles is regular (recognizable by a finite automaton)
  - Build a handle-recognizing DFA
  - ACTION & GOTO tables encode the DFA
Semantic Analysis
- Analyze context and semantics
  - types and other semantic checks
- Attribute grammar
  - associate evaluation rules with grammar productions
- Ad-hoc
  - build symbol table
Intermediate Representation
[Figure: compiler structure diagram (Front End, Middle End, Back End, shared infrastructure)]
Intermediate Representation
- Front-end translates the program into IR format for further analysis and optimization
- IR encodes the compiler’s knowledge of the program
  - Largely machine-independent
  - Moves closer to a standard machine model
- AST: high-level
- Linear IR: low-level
  - ILOC 3-address code
  - Assembly-level operations
  - Exposes control flow, memory addressing
  - Unlimited virtual registers
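To illustrate the step from a high-level AST to a low-level linear IR, here is a small sketch that lowers an expression tree into three-address code over unlimited virtual registers. The tuple-based AST and the simplified, ILOC-like mnemonics (`load`, `loadI`, `add`, `mult`) are assumptions for illustration, not exact ILOC.

```python
from itertools import count

OPCODES = {"+": "add", "-": "sub", "*": "mult"}

def lower(node, code, fresh):
    """Post-order walk: emit code for the operands, then for the operator.

    Returns the virtual register that holds the value of `node`.
    """
    if isinstance(node, int):            # integer literal
        reg = f"r{next(fresh)}"
        code.append(f"loadI {node} => {reg}")
    elif isinstance(node, str):          # variable reference
        reg = f"r{next(fresh)}"
        code.append(f"load @{node} => {reg}")
    else:                                # (operator, left subtree, right subtree)
        op, left, right = node
        a = lower(left, code, fresh)
        b = lower(right, code, fresh)
        reg = f"r{next(fresh)}"
        code.append(f"{OPCODES[op]} {a}, {b} => {reg}")
    return reg

# AST for the expression a + b * 2
ast = ("+", "a", ("*", "b", 2))
code = []
result = lower(ast, code, count(1))
# code is now:
#   load @a => r1
#   load @b => r2
#   loadI 2 => r3
#   mult r2, r3 => r4
#   add r1, r4 => r5
```

Every value gets a fresh virtual register; mapping them onto a finite register set is left to the register allocator.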
Procedure Abstraction
- Procedure is a key language construct for building large systems
  - Name space
  - Caller-callee interface: linkage convention
    - Control transfer
    - Context protection
    - Parameter passing and return value
- Run-time support for nested scopes
  - Activation record, access link, display
- Inheritance and dynamic dispatch for OO
  - multiple inheritance
  - virtual method table
The Back-end
[Figure: compiler structure diagram (Front End, Middle End, Back End, shared infrastructure)]
The Back-end
- Instruction selection
  - Mapping IR into assembly code
  - Assumes a fixed storage mapping & code shape
  - Combining operations, using address modes
- Instruction scheduling
  - Reordering operations to hide latencies
  - Assumes a fixed program (set of operations)
  - Changes demand for registers
- Register allocation
  - Deciding which values will reside in registers
  - Changes the storage mapping, may add false sharing
  - Concerns about placement of data & memory operations
Code Generation
- Expressions
  - Recursive tree walk on AST
  - Direct integration with parser
- Assignment
- Array reference
- Boolean & relational values
- If-then-else
- Case
- Loop
- Procedure call
Instruction Selection
- Hand-coded tree-walk code generator
- Automatic instruction selection
  - Pattern matching
    - Peephole matching
    - Tree-pattern matching through tiling
Instruction Scheduling
- The problem
  - Given a code fragment for some target machine and the latencies for each individual operation, reorder the operations to minimize execution time
- Build precedence graph
- List scheduling
  - Scheduling is an NP-complete problem
  - Heuristics work well for basic blocks
    - forward list scheduling
    - backward list scheduling
- Scheduling for larger regions
  - EBB and cloning
  - Trace scheduling
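Forward list scheduling for one basic block can be sketched as follows. The single-issue machine, the latency-weighted critical-path priority, the alphabetical tie-break, and the six-operation block are all assumptions chosen to keep the example small.

```python
def list_schedule(latency, deps):
    """Forward list scheduling on a single-issue machine.

    latency: op -> cycles until its result is available
    deps:    op -> set of ops whose results it needs (assumed acyclic)
    Returns op -> issue cycle (cycles start at 1).
    """
    succs = {op: set() for op in latency}
    for op, preds in deps.items():
        for p in preds:
            succs[p].add(op)

    # Priority: latency-weighted longest path from the op to the end of the block.
    prio = {}
    def rank(op):
        if op not in prio:
            prio[op] = latency[op] + max((rank(s) for s in succs[op]), default=0)
        return prio[op]
    for op in latency:
        rank(op)

    start, cycle, unscheduled = {}, 1, set(latency)
    while unscheduled:
        # An op is ready once every operand has been issued and has completed.
        ready = [op for op in unscheduled
                 if all(p in start and start[p] + latency[p] <= cycle
                        for p in deps.get(op, ()))]
        if ready:
            op = min(ready, key=lambda o: (-prio[o], o))  # highest priority first
            start[op] = cycle
            unscheduled.remove(op)
        cycle += 1  # at most one issue per cycle; an empty ready list is a stall
    return start

# Hypothetical block: a, b, d are loads; c = a + b; e = c * d; f stores e.
latency = {"a": 3, "b": 3, "c": 1, "d": 3, "e": 2, "f": 3}
deps = {"c": {"a", "b"}, "e": {"c", "d"}, "f": {"e"}}
schedule = list_schedule(latency, deps)
```

In this block the third load `d` is issued in cycle 3, before the add `c`, so its latency overlaps with the wait for `a` and `b`.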
Register Allocation
- Local register allocation
  - top-down
  - bottom-up
- Global register allocation
  - Find live ranges
  - Build an interference graph GI
  - Construct a k-coloring of the interference graph
  - Map colors onto physical registers
Web-based Live Ranges
[Figure: CFG fragment with several defs and uses of x and y, with live ranges labeled l1 to l4]
- Connect common defs and uses
- Solve the reaching-definitions data-flow problem!
Interference Graph
- The interference graph, GI
  - Nodes in GI represent live ranges
  - Edges in GI represent individual interferences
    - For x, y ∈ GI, <x,y> is an edge of GI iff x and y interfere
- A k-coloring of GI can be mapped into an allocation to k registers
[Figure: example interference graph, 3-colorable]
Key Observation on Coloring
- Any vertex n that has fewer than k neighbors in the interference graph (n° < k) can always be colored!
- Remove every node with n° < k from GI to obtain GI’; any k-coloring of GI’ extends to a k-coloring of GI
Chaitin’s Algorithm
1. While ∃ vertices with < k neighbors in GI
   > Pick any vertex n such that n° < k and put it on the stack
   > Remove that vertex and all edges incident to it from GI
     • This will lower the degree of n’s neighbors
2. If GI is non-empty (all vertices have k or more neighbors) then:
   > Pick a vertex n (using some heuristic) and spill the live range associated with n
   > Remove vertex n from GI, along with all edges incident to it, and put it on the stack
   > If this causes some vertex in GI to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2
3. If no spill, successively pop vertices off the stack and color them in the lowest color not used by some neighbor; otherwise, insert spill code, recompute GI and start from step 1
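One round of the steps above can be sketched in Python. This is a simplified illustration under stated assumptions: the graph is an adjacency dictionary, k ≥ 1, the spill heuristic is cost divided by current degree, and instead of inserting spill code the function only reports which live ranges it would spill.

```python
def chaitin_color(graph, k, spill_cost):
    """One simplify/select round of Chaitin-style coloring.

    graph: node -> set of neighbors (symmetric); spill_cost: node -> number.
    Returns (coloring, spilled). If anything is spilled, the coloring is empty,
    since the real algorithm would insert spill code and rebuild the graph.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spilled = [], []
    while work:
        low = [n for n in work if len(work[n]) < k]
        if low:
            n = min(low)  # any such vertex works; min() keeps runs deterministic
            stack.append(n)
        else:
            # Every remaining vertex has k or more neighbors: pick a spill
            # candidate (assumed heuristic: lowest cost per unit of degree).
            n = min(work, key=lambda v: (spill_cost[v] / len(work[v]), v))
            spilled.append(n)
        for m in work.pop(n):  # remove n and all edges incident to it
            work[m].discard(n)

    coloring = {}
    if not spilled:
        while stack:
            n = stack.pop()
            used = {coloring[m] for m in graph[n] if m in coloring}
            # A free color exists: n had fewer than k neighbors when removed.
            coloring[n] = min(c for c in range(k) if c not in used)
    return coloring, spilled

# Hypothetical interference graph: triangle a-b-c plus d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
cost = {n: 1.0 for n in graph}
coloring3, spilled3 = chaitin_color(graph, 3, cost)  # colorable with 3 registers
coloring2, spilled2 = chaitin_color(graph, 2, cost)  # the triangle forces a spill
```

With k = 3 every vertex is eventually removed in step 1, so no spill occurs; with k = 2 the triangle leaves three vertices of degree 2 and one of them is chosen for spilling.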
Briggs’ Improvement
Nodes can still be colored even with k or more neighbors if some neighbors have the same color
1. While ∃ vertices with < k neighbors in GI
   > Pick any vertex n such that n° < k and put it on the stack
   > Remove that vertex and all edges incident to it from GI
     • This may create vertices with fewer than k neighbors
2. If GI is non-empty (all vertices have k or more neighbors) then:
   > Pick a vertex n (using some heuristic condition), push n on the stack and remove n from GI, along with all edges incident to it
   > If this causes some vertex in GI to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2
3. Successively pop vertices off the stack and color them in the lowest color not used by some neighbor
   > If some vertex cannot be colored, then pick an uncolored vertex to spill, spill it, and restart at step 1
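A sketch of the optimistic variant, again under assumptions: the adjacency-dictionary representation and the highest-degree choice in step 2 are illustrative, and the function returns the uncolored vertices rather than spilling and restarting.

```python
def briggs_color(graph, k):
    """Optimistic coloring: push every vertex, decide about spills while coloring.

    graph: node -> set of neighbors (symmetric).
    Returns (coloring, uncolored); `uncolored` lists the spill candidates.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        low = [n for n in work if len(work[n]) < k]
        if low:
            n = min(low)
        else:
            # Optimistic step: push a high-degree vertex instead of spilling it
            # (assumed heuristic: largest current degree, ties broken by name).
            n = max(work, key=lambda v: (len(work[v]), v))
        stack.append(n)
        for m in work.pop(n):
            work[m].discard(n)

    coloring, uncolored = {}, []
    while stack:
        n = stack.pop()
        used = {coloring[m] for m in graph[n] if m in coloring}
        free = [c for c in range(k) if c not in used]
        if free:
            coloring[n] = free[0]   # lowest color not used by a neighbor
        else:
            uncolored.append(n)     # only now does n become a spill candidate
    return coloring, uncolored

# Hypothetical 4-cycle a-b-c-d-a with k = 2: every vertex has exactly k neighbors.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
coloring, uncolored = briggs_color(cycle, 2)
```

On this 4-cycle no vertex has fewer than k = 2 neighbors, so the pessimistic rule would spill immediately, yet the optimistic version finds a 2-coloring because opposite vertices can share a color.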
The Middle-end: Optimizer
[Figure: compiler structure diagram (Front End, Middle End, Back End, shared infrastructure)]
Principles of Compiler Optimization
- Safety
  - Does applying the transformation change the results of executing the code?
- Profitability
  - Is there a reasonable expectation that applying the transformation will improve the code?
- Opportunity
  - Can we efficiently and frequently find places to apply the optimization?
- Optimizing compiler
  - Program analysis
  - Program transformation
Program Analysis
- Control-flow analysis
- Data-flow analysis
Control Flow Analysis
- Basic blocks
- Control flow graph
- Dominator tree
- Natural loops
- Dominance frontier
  - the join points for SSA
  - insert φ nodes
[Figure: a CFG and its DOM tree]
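Dominator sets, and from them the DOM tree, can be computed with a simple iterative sketch. The CFG below and the function names are hypothetical, and the code assumes every node is a key of `succ` and every non-entry node is reachable from the entry.

```python
def dominators(succ, entry):
    """Iterate Dom(n) = {n} ∪ (∩ Dom(p) over predecessors p) to a fixed point."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)

    dom = {n: set(nodes) for n in nodes}  # optimistic start: everything dominates
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def immediate_dominators(dom, entry):
    """The idom of n is its closest strict dominator; these edges form the DOM tree."""
    idom = {}
    for n, ds in dom.items():
        if n != entry:
            # Strict dominators form a chain, so the one with the largest
            # dominator set is the closest to n.
            idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom

# Hypothetical CFG: a diamond A -> {B, C} -> D with a back edge D -> A.
cfg = {"entry": ["A"], "A": ["B", "C"], "B": ["D"], "C": ["D"],
       "D": ["A", "exit"], "exit": []}
dom = dominators(cfg, "entry")
idom = immediate_dominators(dom, "entry")
```

Here neither B nor C dominates D, because D can be reached through either branch; its immediate dominator is A. The back edge D → A, whose target dominates its source, is what identifies the natural loop.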
Data Flow Analysis
- “compile-time reasoning about the run-time flow of values”
  - represent effects of each basic block
  - propagate facts around control flow graph
DFA (Data-Flow Analysis): The Big Picture
- Set up a set of equations that relate program properties at different program points in terms of the properties at "nearby" program points
[Figure: basic block B with IN(B) at its entry, OUT(B) at its exit, and local(B) for the block itself]
- Transfer function
  - Forward analysis: compute OUT(B) in terms of IN(B)
    - Available expressions
    - Reaching definitions
  - Backward analysis: compute IN(B) in terms of OUT(B)
    - Variable liveness
    - Very busy expressions
- Meet function for join points
  - Forward analysis: combine OUT(p) of predecessors to form IN(B)
  - Backward analysis: combine IN(s) of successors to form OUT(B)
Available Expressions
- Basic block b
  - IN(b): expressions available at b’s entry
  - OUT(b): expressions available at b’s exit
- Local sets
  - expressions defined in b and available on exit
  - expressions killed in b
    - An expression is killed in b if any of its operands is assigned in b
- Transfer function: computes OUT(b) from IN(b) and the local sets
- Meet function: combines OUT(p) of b’s predecessors to form IN(b)
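For reference, the textbook form of the available-expressions equations is sketched here. The names GEN(b) and KILL(b) for the two local sets, pred(b) for the predecessors of b, and b_0 for the entry block are assumptions made for this sketch and need not match the symbols used in the lecture.

```latex
\begin{aligned}
\text{Transfer function:}\quad
  \mathrm{OUT}(b) &= \mathrm{GEN}(b) \,\cup\, \bigl(\mathrm{IN}(b) \setminus \mathrm{KILL}(b)\bigr) \\
\text{Meet function:}\quad
  \mathrm{IN}(b) &= \bigcap_{p \,\in\, \mathrm{pred}(b)} \mathrm{OUT}(p),
  \qquad \mathrm{IN}(b_0) = \emptyset
\end{aligned}
```

The meet is an intersection because an expression is available at the entry of b only if it is available along every path that reaches b.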
More Data Flow Problems
- AVAIL equations
- More data flow problems
  - Reaching definitions
  - Liveness
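Liveness is a backward problem, so IN is computed from OUT and OUT is formed from the successors. The following round-robin solver is a minimal sketch; the block names, the `use` sets (variables read before any write in the block) and `defs` sets (variables written in the block) are hypothetical.

```python
def liveness(succ, use, defs):
    """Iterate the liveness equations to a fixed point.

    OUT(b) = union of IN(s) over successors s of b
    IN(b)  = use(b) ∪ (OUT(b) − defs(b))
    """
    live_in = {b: set() for b in succ}
    live_out = {b: set() for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            out = set().union(*(live_in[s] for s in succ[b]))
            inn = use[b] | (out - defs[b])
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out

# Hypothetical CFG:
#   B1: x = 1; y = 2
#   B2: if x < 10 goto B3 else B4
#   B3: x = x + y; goto B2
#   B4: return x
succ = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
use = {"B1": set(), "B2": {"x"}, "B3": {"x", "y"}, "B4": {"x"}}
defs = {"B1": {"x", "y"}, "B2": set(), "B3": {"x"}, "B4": set()}
live_in, live_out = liveness(succ, use, defs)
```

Both x and y are live around the loop formed by B2 and B3, while nothing is live on entry to B1 because B1 defines both variables before any use.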
Compiler Optimization
- Local optimization
  - DAG CSE
  - Value numbering
- Global optimization enabled by DFA
  - Global CSE (AVAIL)
  - Constant propagation (Def-Use)
  - Dead code elimination (Use-Def)
- Advanced topic: SSA
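Local value numbering over one basic block can be sketched as below. The four-tuple instruction format, the `copy` pseudo-operation, and the treatment of `+` and `*` as commutative are assumptions for illustration.

```python
from itertools import count

COMMUTATIVE = {"+", "*"}

def local_value_numbering(block):
    """Replace redundant computations in one basic block with copies.

    block: list of (dest, op, arg1, arg2) tuples.
    """
    fresh = count()
    var_vn = {}    # name -> value number it currently holds
    expr_vn = {}   # (op, vn1, vn2) -> value number of that expression

    def number_of(name):
        if name not in var_vn:
            var_vn[name] = next(fresh)  # first sight of an input value
        return var_vn[name]

    out = []
    for dest, op, a, b in block:
        key = (op, number_of(a), number_of(b))
        if op in COMMUTATIVE:
            key = (op, *sorted(key[1:]))  # a + b and b + a share one entry
        if key in expr_vn:
            v = expr_vn[key]
            # Reuse only if some name still holds the value (not overwritten).
            holders = [n for n, num in var_vn.items() if num == v]
            if holders:
                out.append((dest, "copy", holders[0], None))
            else:
                out.append((dest, op, a, b))
        else:
            v = expr_vn[key] = next(fresh)
            out.append((dest, op, a, b))
        var_vn[dest] = v
    return out

# Hypothetical block:  a = b + c;  b = a - d;  c = b + c;  d = a - d
block = [("a", "+", "b", "c"), ("b", "-", "a", "d"),
         ("c", "+", "b", "c"), ("d", "-", "a", "d")]
optimized = local_value_numbering(block)
```

The third instruction is not redundant even though it looks like the first, because b has been redefined in between; the fourth recomputes `a - d` with unchanged operands and becomes a copy of b.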
Perspective
[Figure: compiler structure diagram (Front End, Middle End, Back End, shared infrastructure)]
- Front end: essentially solved problem
- Middle end: domain-specific language
- Back end: new architecture
- Verifying compiler, reliability, security
Interesting Stuff We Skipped
- Interprocedural analysis
- Alias (pointer) analysis
- Garbage collection
- Check the literature references in EaC
How will you use 91.406/534 knowledge?
- As informed programmer
- As informed small language designer
- As informed hardware engineer
- As compiler writer
Informed Programmer
- “Knowledge is power”
  - Compiler is no longer a black box
  - Know how the compiler works
- Implications
  - Use of language features
    - Avoid those that can cause problems
    - Give the compiler hints
  - Code optimization
    - Don’t optimize prematurely
    - Don’t write complicated code
  - Debugging
    - Understand the compiled code
Solving Problems the Compiler Way
- Solve problems from the language/compiler perspective
  - Implement a simple language
  - Extend a language
Informed Hardware Engineer
- Compiler support for programmable hardware
  - pervasive computing
  - new back-ends for new processors
- Design new architectures
  - what the compiler can and cannot do
  - how to expose hardware resources and use the compiler to manage them
Compiler Writer
- Make a living by writing compilers!
  - Theory
  - Algorithms
  - Engineering
- We have built:
  - scanner
  - parser
  - AST builder, type checker
  - register allocator
  - instruction scheduler
- Used compiler generation tools
  - ANTLR, lex, yacc, etc.
- On track to jump into compiler development!
Final Remarks
- Compiler construction
  - Theory
  - Implementation
- How to use what you learned in 91.406/534?
  - As informed programmer
  - As informed small language designer
  - As informed hardware engineer
  - As compiler writer
- … and live happily ever after