Welcome to CSE131b: Compiler Construction
pic from: http://xkcd.com/303/
Lingjia Tang
Monday, April 1, 13
• Course Information
• What are compilers?
• Why do we learn about them?
• History of compilers
• Structure of compilers
• A bit of information on the course projects
Course Staff
• Instructor: Lingjia Tang (2108)
• TA: Xinxin Jing
• Tutors: Haronid Moncivais Miller and more
http://cseweb.ucsd.edu/classes/sp13/cse131-b
“Dragon Book”
Grading Policies
[Pie chart: Project, Midterm, Final; slices labeled 35%, 20%, 45%]
Extra credit:
• Class participation
• Piazza participation
• Project/exam extra credit
Prerequisites
• CSE 70 / CSE 110
• CSE 100
• CSE 105
• CSE 130
What is a compiler?
• Compiler
[Diagram: program in source language → Compiler → program in target language, or an error message; the target program then maps Input → Output]
• Translates a program from a source language into a target language
• Typically lowers the level of abstraction of the program
  • High-level programming language -> machine code (C/C++ -> x86, etc.)
• We expect the program produced by the compiler to be better (faster, consuming less memory, etc.) than the original program
• Compiler
[Diagram: program in source language → Compiler → program in target language, or an error message; the target program then maps Input → Output]
Static: before runtime
• Interpreter
[Diagram: Program + Input → Interpreter → output]
Dynamic: during runtime
Examples: Python, shell, JavaScript, PHP
• Hybrid
• Example: Java
[Diagram: program in source language → Compiler → intermediate program → Virtual Machine; Input → Virtual Machine → output]
Static: the compiler
Runtime: the virtual machine (interpreter, just-in-time compiler producing native machine code, garbage collection)
• Sacrifices efficiency for portability, “safety” and other dynamic features
Why Learn About Compilers?
• Build awesome, complex software
• Theory meets practice
  • Remember DFAs and CFGs from CSE 105?
• Understand how software interacts with hardware
• Understand programming language design and features
Why Learn About Compilers? (cont.)
• My code runs faster than your code...
  • consumes less power
  • consumes less memory
  • is more reliable
Why Learn About Compilers? (cont.)
• Who develops compilers?
  • Many, many companies
• Who uses compilers?
  • Everyone!
[Figure labels: program in PHP, program in C++]
Goal of this course
• For you to become a compiler NINJA!
History of Compilers
• Machine language
• Assembly
  • Programming was too time-consuming
• High-level languages
• Grace Hopper
  • Conceptualized the idea of machine-independent programming languages
  • "Nobody believed that," she said. "I had a running compiler and nobody would touch it. They told me computers could only do arithmetic."
  • COBOL (1959)
• John Backus
  • Won the ACM Turing Award "for profound, influential, and lasting contributions to the design of practical high-level programming systems, notably through his work on FORTRAN"
pics from wikipedia
• Compilers: map a high-level abstract language to assembly language
• A simple mapping of a program to assembly language produces inefficient execution
  • The higher the level of abstraction, the greater the inefficiency
• If they are not efficient, high-level abstractions are useless
• Compilers need to:
  • provide a high-level abstraction
  • with the performance of writing low-level instructions directly
  • bridge the efficiency gap between high-level programming languages and low-level processors/ISAs
Saman Amarasinghe's slide
The Structure of a Modern Compiler
Source Code
→ Lexical Analysis
→ Syntax Analysis
→ Semantic Analysis
→ IR Generation
→ IR Optimization
→ Code Generation
→ Optimization
→ Machine Code
[Diagram: the earlier phases are bracketed as the front end, the later phases as the back end]
while (y < z) {
    int x = a + b;
    y += x;
}
Lexical analysis (scanning): identify the logical pieces of the description.
Lexical analysis (scanning): group the sequence of characters into lexemes, the smallest meaningful entities in a language (keywords, identifiers, constants).
T_While T_LeftParen T_Identifier y T_Less T_Identifier z T_RightParen T_OpenBrace
T_Int T_Identifier x T_Assign T_Identifier a T_Plus T_Identifier b T_Semicolon
T_Identifier y T_PlusAssign T_Identifier x T_Semicolon
T_CloseBrace
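The scanning step can be sketched in a few lines of Python. This is a minimal illustration, not the course's scanner: the T_* names follow the slide, while the regular expressions, the `TOKEN_SPEC` table and the `tokenize` function are assumptions made for this example.

```python
import re

# Hypothetical token table: names follow the slide, patterns are assumptions.
# Order matters: keywords come before identifiers, "+=" before "+".
TOKEN_SPEC = [
    ("T_While",      r"\bwhile\b"),
    ("T_Int",        r"\bint\b"),
    ("T_Identifier", r"[A-Za-z_]\w*"),
    ("T_PlusAssign", r"\+="),
    ("T_Plus",       r"\+"),
    ("T_Less",       r"<"),
    ("T_Assign",     r"="),
    ("T_LeftParen",  r"\("),
    ("T_RightParen", r"\)"),
    ("T_OpenBrace",  r"\{"),
    ("T_CloseBrace", r"\}"),
    ("T_Semicolon",  r";"),
    ("SKIP",         r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(source):
    """Turn a string of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind = match.lastgroup
        if kind == "T_Identifier":
            tokens.append((kind, match.group()))   # identifiers keep their name
        elif kind != "SKIP":
            tokens.append((kind, None))            # whitespace is dropped
        pos = match.end()
    return tokens

SOURCE = "while (y < z) {\n    int x = a + b;\n    y += x;\n}"
TOKENS = tokenize(SOURCE)
```

Calling `tokenize(SOURCE)` on the running example yields the same sequence of token kinds as the slide, with identifier tokens carrying their names.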
Syntax analysis (Parsing): Identify how those pieces relate to each other
Syntax analysis (parsing): convert a linear structure, the sequence of tokens, into a hierarchical tree-like structure, the abstract syntax tree (AST).
While
├── <
│   ├── y
│   └── z
└── Sequence
    ├── =
    │   ├── x
    │   └── +
    │       ├── a
    │       └── b
    └── =
        ├── y
        └── +
            ├── y
            └── x
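A hand-written recursive-descent parser is one way to build such a tree. The sketch below is an illustration under assumptions: the tiny grammar (one `while`, `int` declarations with an initializer, `+=` assignments, `+` and `<` over identifiers), the `Parser` class and the tuple-based AST encoding are all invented for this example, and the token list is written out by hand so the block is self-contained.

```python
# Token stream for the running example (token names follow the slide).
TOKENS = [
    ("T_While", None), ("T_LeftParen", None), ("T_Identifier", "y"),
    ("T_Less", None), ("T_Identifier", "z"), ("T_RightParen", None),
    ("T_OpenBrace", None),
    ("T_Int", None), ("T_Identifier", "x"), ("T_Assign", None),
    ("T_Identifier", "a"), ("T_Plus", None), ("T_Identifier", "b"),
    ("T_Semicolon", None),
    ("T_Identifier", "y"), ("T_PlusAssign", None), ("T_Identifier", "x"),
    ("T_Semicolon", None),
    ("T_CloseBrace", None),
]

class Parser:
    """Recursive-descent parser for a hypothetical mini-grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        value = self.tokens[self.pos][1]
        self.pos += 1
        return value

    def statement(self):
        if self.peek() == "T_While":
            self.expect("T_While")
            self.expect("T_LeftParen")
            cond = self.comparison()
            self.expect("T_RightParen")
            self.expect("T_OpenBrace")
            body = []
            while self.peek() != "T_CloseBrace":
                body.append(self.statement())
            self.expect("T_CloseBrace")
            return ("While", cond, ("Sequence", *body))
        if self.peek() == "T_Int":
            # Declaration with initializer; the declared type is not kept
            # in the tree, mirroring the slide's AST.
            self.expect("T_Int")
            name = self.expect("T_Identifier")
            self.expect("T_Assign")
            value = self.sum()
            self.expect("T_Semicolon")
            return ("=", name, value)
        name = self.expect("T_Identifier")
        self.expect("T_PlusAssign")
        value = self.sum()
        self.expect("T_Semicolon")
        return ("=", name, ("+", name, value))      # desugar y += x

    def comparison(self):
        left = self.sum()
        if self.peek() == "T_Less":
            self.expect("T_Less")
            left = ("<", left, self.sum())
        return left

    def sum(self):
        left = self.expect("T_Identifier")
        while self.peek() == "T_Plus":
            self.expect("T_Plus")
            left = ("+", left, self.expect("T_Identifier"))
        return left

AST = Parser(TOKENS).statement()
```

Each nested tuple is one tree node: the first element is the node kind and the remaining elements are its children, so `AST` has the same shape as the tree on the slide.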
Syntax Analyzer (Parser)
int * foo(i, j, k))      <- extra parenthesis
int i;
int j;
{
for(i=0; i j) {          <- "i j" is not an expression; missing increment
fi(i>j)                  <- "fi" is not a keyword
return j;
}
Saman Amarasinghe, 6.035, ©MIT Fall 1998
Semantic Analysis: Identify the meaning of the overall structure
Semantic Analysis: Rules of the language are checked (variable declaration, type checking)
AST annotated with types:
While : void
├── < : bool
│   ├── y : int
│   └── z : int
└── Sequence : void
    ├── = : int
    │   ├── x : int
    │   └── + : int
    │       ├── a : int
    │       └── b : int
    └── = : int
        ├── y : int
        └── + : int
            ├── y : int
            └── x : int
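Type checking of this kind can be sketched as a bottom-up walk over the tree. The example below is only an illustration: the tuple encoding of the AST, the `check` function, the typing rules and the `SYMBOLS` table (which assumes every variable in the running example is an `int`, since the slides do not show the declarations of `a`, `b`, `y` and `z`) are all assumptions.

```python
# Hypothetical symbol table: all variables are assumed to be declared as int.
SYMBOLS = {name: "int" for name in ("a", "b", "x", "y", "z")}

# The running example as nested tuples: (node kind, child, child, ...).
AST = ("While",
       ("<", "y", "z"),
       ("Sequence",
        ("=", "x", ("+", "a", "b")),
        ("=", "y", ("+", "y", "x"))))

def check(node, symbols):
    """Return the type of node, raising TypeError when a rule is violated."""
    if isinstance(node, str):                       # leaf: a variable reference
        if node not in symbols:
            raise TypeError(f"undeclared variable {node!r}")
        return symbols[node]
    op, *children = node
    types = [check(child, symbols) for child in children]
    if op in ("+", "<"):
        if types != ["int", "int"]:
            raise TypeError(f"operands of {op!r} must be int, got {types}")
        return "int" if op == "+" else "bool"
    if op == "=":
        if types[0] != types[1]:
            raise TypeError(f"cannot assign {types[1]} to {types[0]}")
        return types[0]
    if op == "Sequence":
        return "void"
    if op == "While":
        if types[0] != "bool":
            raise TypeError("while condition must be bool")
        return "void"
    raise TypeError(f"unknown node kind {op!r}")

RESULT = check(AST, SYMBOLS)
```

With these rules the comparison node gets type `bool`, the assignments get `int`, and the `While` and `Sequence` nodes get `void`. An undeclared name such as `N` raises an error instead.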
Semantic Analyzer
int * foo(i, j, k)
int i;
int j;                   <- type of k not declared
{
int x;
x = x + j + N;           <- uninitialized variable x used; undeclared variable N
return j;                <- mismatched return type
}
Saman Amarasinghe, 6.035, ©MIT Fall 1998
Intermediate Representation (IR) Generation: Generate intermediate code
Loop:
    x = a + b
    y = x + y
    _t1 = y < z
    if _t1 goto Loop
IR generation:
• Makes it easy to port the compiler to other architectures (e.g., x86 to MIPS)
• Enables optimizations that are not machine specific
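Lowering a tree to three-address code can be sketched as a recursive walk that emits one instruction per operator. Everything here is an assumption made for illustration: the tuple AST encoding, the `lower` function, the textual IR syntax (including `if not ... goto`) and the fixed labels `Loop` and `End`, which only work for a single loop. This sketch tests the loop condition at the top, so its output is laid out differently from the compact listing on the slide.

```python
import itertools

# The running example as nested tuples: (node kind, child, child, ...).
AST = ("While",
       ("<", "y", "z"),
       ("Sequence",
        ("=", "x", ("+", "a", "b")),
        ("=", "y", ("+", "y", "x"))))

def lower(ast):
    """Translate the AST into a list of three-address instructions (strings)."""
    code = []
    temps = (f"_t{i}" for i in itertools.count(1))   # fresh temporary names

    def expr(node, target=None):
        # Emit code for an expression and return the name holding its value.
        if isinstance(node, str):
            return node
        op, left, right = node
        lhs, rhs = expr(left), expr(right)
        dest = target or next(temps)
        code.append(f"{dest} = {lhs} {op} {rhs}")
        return dest

    def stmt(node):
        kind = node[0]
        if kind == "=":
            _, name, value = node
            if isinstance(value, str):
                code.append(f"{name} = {value}")
            else:
                expr(value, target=name)             # compute straight into name
        elif kind == "Sequence":
            for child in node[1:]:
                stmt(child)
        elif kind == "While":
            _, cond, body = node
            code.append("Loop:")
            test = expr(cond)
            code.append(f"if not {test} goto End")
            stmt(body)
            code.append("goto Loop")
            code.append("End:")
        else:
            raise ValueError(f"unknown statement kind {kind!r}")

    stmt(ast)
    return code

IR = lower(AST)
```

Each emitted line has at most one operator on its right-hand side, which is what makes later machine-independent passes simple to write.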
IR Optimization: Optimize intermediate code (machine-independent)
    x = a + b
Loop:
    y = x + y
    _t1 = y < z
    if _t1 goto Loop
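Moving `x = a + b` out of the loop is an instance of loop-invariant code motion. The sketch below shows the idea on a toy IR and is not a production algorithm: the `(dest, op, lhs, rhs)` tuple format and the `hoist_invariants` function are assumptions, and it further assumes the loop body runs at least once and that instructions have no side effects.

```python
def hoist_invariants(body):
    """Split a loop body into (hoisted, remaining) instruction lists.

    An instruction is hoisted when neither operand is assigned inside the
    loop, its destination is assigned exactly once in the loop, and no
    earlier instruction in the body reads that destination.
    """
    hoisted, remaining = [], list(body)
    changed = True
    while changed:                       # repeat: hoisting can expose more
        changed = False
        defs = [ins[0] for ins in remaining]
        for index, (dest, op, lhs, rhs) in enumerate(remaining):
            operands_invariant = lhs not in defs and rhs not in defs
            defined_once = defs.count(dest) == 1
            read_earlier = any(dest in (i[2], i[3]) for i in remaining[:index])
            if operands_invariant and defined_once and not read_earlier:
                hoisted.append(remaining.pop(index))
                changed = True
                break
    return hoisted, remaining

# Loop body of the running example, one tuple per instruction.
LOOP_BODY = [
    ("x", "+", "a", "b"),
    ("y", "+", "x", "y"),
    ("_t1", "<", "y", "z"),
]
HOISTED, REMAINING = hoist_invariants(LOOP_BODY)
```

Here only `x = a + b` qualifies: `y = x + y` and `_t1 = y < z` both read `y`, which changes on every iteration.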
Code generation: intermediate code is translated into native code.
    add $1, $2, $3        # x = a + b
Loop:
    add $4, $1, $4        # y = x + y
    slt $6, $4, $5        # _t1 = y < z
    bne $6, $0, Loop      # if _t1 goto Loop
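For straight-line three-address code, code generation can be sketched as a table-driven translation. This is a deliberately naive illustration: the tagged-tuple IR format, the `emit` function and the fixed variable-to-register map `REGS` are assumptions (a real compiler would run register allocation), and only the three MIPS instructions `add`, `slt` and `bne` are used.

```python
# Hypothetical fixed register assignment for the running example.
REGS = {"x": "$1", "a": "$2", "b": "$3", "y": "$4", "z": "$5", "_t1": "$6"}
OPCODES = {"+": "add", "<": "slt"}

def emit(ir):
    """Translate tagged IR tuples into MIPS-style assembly lines."""
    out = []
    for ins in ir:
        tag = ins[0]
        if tag == "label":                           # ("label", name)
            out.append(f"{ins[1]}:")
        elif tag == "if":                            # ("if", cond, target)
            # Branch when the condition register is non-zero.
            out.append(f"bne {REGS[ins[1]]}, $0, {ins[2]}")
        elif tag == "op":                            # ("op", dest, op, lhs, rhs)
            _, dest, op, lhs, rhs = ins
            out.append(f"{OPCODES[op]} {REGS[dest]}, {REGS[lhs]}, {REGS[rhs]}")
        else:
            raise ValueError(f"unknown IR instruction {ins!r}")
    return out

# Optimized IR of the running example.
IR = [
    ("op", "x", "+", "a", "b"),
    ("label", "Loop"),
    ("op", "y", "+", "x", "y"),
    ("op", "_t1", "<", "y", "z"),
    ("if", "_t1", "Loop"),
]
ASM = emit(IR)
```

Each IR instruction maps to exactly one machine instruction here, which is why a separate machine-dependent optimization pass is still worthwhile afterwards.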
Code optimization: machine-dependent optimization (register allocation, instruction selection, peephole optimization, etc.)
    add $1, $2, $3        # x = a + b
Loop:
    add $4, $1, $4        # y = x + y
    blt $4, $5, Loop      # if y < z goto Loop
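A peephole optimizer slides a small window over the instruction list and rewrites known patterns. The sketch below handles a single pattern, a compare followed by a branch on its result, and fuses it into one `blt`. The `peephole` function and its regular expressions are assumptions for illustration, and the rewrite is only valid if the temporary register is not read afterwards, which this sketch does not verify. On MIPS assemblers `blt` is typically a pseudo-instruction.

```python
import re

# Patterns for the two-instruction window (assumed textual assembly format).
SLT = re.compile(r"slt (\$\w+), (\$\w+), (\$\w+)$")
BNE = re.compile(r"bne (\$\w+), \$0, (\w+)$")

def peephole(lines):
    """Fuse 'slt t, a, b' + 'bne t, $0, L' into 'blt a, b, L'."""
    out = []
    i = 0
    while i < len(lines):
        if i + 1 < len(lines):
            compare = SLT.match(lines[i])
            branch = BNE.match(lines[i + 1])
            # Both instructions must use the same temporary register.
            if compare and branch and compare.group(1) == branch.group(1):
                out.append(
                    f"blt {compare.group(2)}, {compare.group(3)}, {branch.group(2)}"
                )
                i += 2
                continue
        out.append(lines[i])
        i += 1
    return out

ASM = [
    "add $1, $2, $3",
    "Loop:",
    "add $4, $1, $4",
    "slt $6, $4, $5",
    "bne $6, $0, Loop",
]
OPTIMIZED = peephole(ASM)
```

The pass leaves every instruction outside the matched window untouched, so it can be run repeatedly or combined with other local rewrites.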
Modern Compilers
• Mature front end
• Heavy back end
  • Many optimization passes
Compiler Optimization
• Frances E. Allen
• First woman to win the Turing Award
• “Introduced many of the abstractions, algorithms, and implementations that laid the groundwork for automatic program optimization technology”
Architecture of GCC (GNU Compiler Collection)
Source Code → AST → GENERIC → High GIMPLE → SSA → Low GIMPLE → RTL → Machine Code
• IRs: many optimization passes here
• Front end supports: C (gcc), C++ (g++), Objective-C, Objective-C++, Fortran (gfortran), Java (gcj), Ada (GNAT), and Go (gccgo)
• Machine code: x86, ARM, Alpha, MIPS, SPARC, PowerPC, etc.
• Jeanne Ferrante et al.
Architecture of LLVM
• LLVM (Low Level Virtual Machine)
  • Developed at UIUC (2000)
  • “Language-agnostic” design
This course covers
Lexical Analysis
Syntax Analysis
Semantic Analysis
IR Generation
IR Optimization
Code Generation
Optimization
Midterm
The Course Project
• Compiler: Decaf -> MIPS
  • Have a working compiler by the end of the quarter
  • You will be able to run your Decaf programs!
• Source language: Decaf
  • Custom programming language, similar to Java or C++ (simplified feature set)
  • Object-oriented, with inheritance and strong typing
• Target language: MIPS
  • RISC ISA
• Implementation: C++
Programming Assignments
Source Code (Decaf) → Lexical Analysis → Syntax Analysis → Semantic Analysis → IR Generation → IR Optimization → Code Generation → Optimization → Machine Code (MIPS)
Project   Weight   Duration
P1        10%      ~1 week
P2        10%      1.5 weeks
P3        10%      3 weeks
P4        15%      4 weeks
• Groups of 2 people
  • Find your partner by the end of this week
  • Group members get the same grade
• Honor code:
  • Discussion among groups is encouraged
  • But implement the solution on your own
• Discussion session?