COMPSCI 322: Language and Compilers Class Hour: Hyer Hall 210: TThu 9:30am – 10:15am

A little bit about the instructor

• Ph.D. in Computer Science and Engineering from the University of Connecticut (class of 2005)

• Bachelor of Science from Hanoi University of Technology (86-91)

• Master of Computer Science from UW-Milwaukee (96-99)

• Assistant professor at UWW since August 2005

A little bit about the instructor

• Research Experience:
  – User Modeling, Information Retrieval, Decision Theory, Collaborative Filtering, Human Factors

• Teaching Experience:
  – MCS 220, COMPSCI 172, 181, 271, 381 at UWW
  – Introductory courses at UOP and DeVry
  – TA for Computer Architecture, OO Design, Compiler, Artificial Intelligence

Contact information

[email protected] (fastest way to contact me)

Baker Hall 324

Office Hours: 9:50am – 10:50am, 3-4pm, MWF or by appointment

262 472 5170

Course Objectives

• Understand the description of a given programming language and design a scanner, parser, semantic checker, and code generator for it

• Implement the scanner, parser, semantic checker, and code generator for that language, and test each compiler component against all of its test cases

Book Requirement

• Engineering a Compiler. 2004. Keith D. Cooper and Linda Torczon. Morgan Kaufmann Publishers (available through Textbook Rental)

• Web site: http://www.cs.rice.edu/~keith/Errata.html

Course detail - Evaluation

GRADABLE             POINTS
3 projects           650
Final Exam           150
Presentation         100
In class exercises   100
Total                1000

Projects

• 3 projects: scanner, parser and semantic checker, code generator. The preferred implementation language is Java, but C/C++ are welcome too.

• Project 3 depends on Project 2, Project 2 depends on Project 1.

• ABSOLUTELY no LATE submission for Project 3, because grading this project is time-consuming.

In class exercises

• Simple multiple choice questions and simple problems will be given in class weekly and graded.

• This requires students to complete the assigned reading (partly because this is a discussion course rather than a lecture course)
  – Not all material will be covered in class
  – Book complements the lectures

Presentation

• Each student will research a programming language of their choice. Please let the instructor know ahead of time which language you choose.

• Each student will then present the research to the class in a 15-20 minute PowerPoint presentation, followed by 10 minutes of questions.

Grade

Letter Grade Percentage

A 90 to 100%

B 80 to 89%

C 70 to 79%

D 60 to 69%

F Below 60%

Prerequisite

Prerequisite: COMPSCI 271, and Data Structures

Students are responsible for meeting these requirements.


Compilers

• What is a compiler?
  – A program that translates an executable program in one language into an executable program in another language
  – The compiler should improve the program, in some way

• What is an interpreter?
  – A program that reads an executable program and produces the results of executing that program
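The two definitions can be contrasted on a toy program. The sketch below is only an illustration under stated assumptions: the expression x + 2 - y is encoded as nested tuples (a leaf is (kind, lexeme), an interior node is (operator, left, right)), and the stack machine with its push, load, add, and sub instructions is hypothetical, invented here to have a target language. `interpret` produces the result directly, while `compile_to_stack` produces another program that `run` must then execute.

```python
# Assumed encoding (not from the slides): a leaf is (kind, lexeme) and an
# interior node is (operator, left, right).
TREE = ("-", ("+", ("id", "x"), ("number", "2")), ("id", "y"))  # x + 2 - y


def interpret(tree, env):
    """Interpreter: read the program and produce its result directly."""
    if len(tree) == 2:
        kind, lexeme = tree
        return int(lexeme) if kind == "number" else env[lexeme]
    op, left, right = tree
    a, b = interpret(left, env), interpret(right, env)
    return a + b if op == "+" else a - b


def compile_to_stack(tree):
    """Compiler: translate the program into another program (stack code)."""
    if len(tree) == 2:
        kind, lexeme = tree
        return [("push", int(lexeme))] if kind == "number" else [("load", lexeme)]
    op, left, right = tree
    operation = ("add",) if op == "+" else ("sub",)
    return compile_to_stack(left) + compile_to_stack(right) + [operation]


def run(program, env):
    """Hypothetical stack machine that executes the compiled program."""
    stack = []
    for instruction in program:
        name = instruction[0]
        if name == "push":
            stack.append(instruction[1])
        elif name == "load":
            stack.append(env[instruction[1]])
        else:
            # The right operand was pushed last, so it is popped first.
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if name == "add" else left - right)
    return stack.pop()
```

With `env = {"x": 5, "y": 3}`, both `interpret(TREE, env)` and `run(compile_to_stack(TREE), env)` evaluate 5 + 2 - 3; the difference is that the compiled program can be kept and run again without revisiting the source.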

Examples

• C is typically compiled; Basic is typically interpreted

• Java is compiled to bytecodes (code for the Java VM)
  – which are then interpreted
  – Or a hybrid strategy is used

• Just-in-time compilation

Taking a Broader View

• Compiler Technology = Off-Line Processing
  – Goals: improved performance and language usability
    • Making it practical to use the full power of the language
  – Trade-off: preprocessing time versus execution time (or space)
  – Rule: performance of both compiler and application must be acceptable to the end user

Why Study Compilation?

“So even though I'd never actually want to write a compiler myself, knowing about compiler concepts would have made me a better programmer. It's one of those gaps that I regret, which is why I think I may actually try to struggle through a few chapters from this Engineering a Compiler book during the holidays, in between all the holiday activities like eating. And shopping. And listening to "Santa Got Run Over By a Reindeer" for the billionth time …”

Why Study Compilation?

• Compilers are important system software components
  – They are intimately interconnected with architecture, systems, programming methodology, and language design

• Compilers include many applications of theory to practice
  – Scanning, parsing, static analysis, instruction selection

• Many practical applications have embedded languages
  – Commands, macros, formatting tags …

Why Study Compilation?

• Many applications have input formats that look like languages
  – Matlab, Mathematica

• Writing a compiler exposes practical algorithmic & engineering issues
  – Approximating hard problems; efficiency & scalability

Intrinsic interest

Compiler construction involves ideas from many different parts of computer science

Artificial intelligence   Greedy algorithms
                          Heuristic search techniques
Algorithms                Graph algorithms, union-find
                          Dynamic programming
Theory                    DFAs & PDAs, pattern matching
                          Fixed-point algorithms
Systems                   Allocation & naming, Synchronization, locality
Architecture              Pipeline & hierarchy management
                          Instruction set use

Intrinsic merit

Compiler construction poses challenging and interesting problems:
  – Compilers must do a lot but also run fast
  – Compilers have primary responsibility for run-time performance
  – Compilers are responsible for making it acceptable to use the full power of the programming language
  – Computer architects perpetually create new challenges for the compiler by building more complex machines
  – Compilers must hide that complexity from the programmer
  – Success requires mastery of complex interactions

Preparation for next class

• Review the materials for this class
• Read chapter 1 of the book

Overview of compilers

High-level View of a Compiler

[Figure: Source code -> Compiler -> Machine code; the compiler also reports Errors]

High-level overview of a compiler

  – Must recognize legal (and illegal) programs
  – Must generate correct code
  – Must manage storage of all variables (and code)
  – Must agree with OS & linker on format for object code

Big step up from assembly language: use higher-level notations

Traditional Two-pass Compiler

• Use an intermediate representation (IR)
• Front end maps legal source code into IR
• Back end maps IR into target machine code
• Admits multiple front ends & multiple passes

[Figure: Source code -> Front End -> IR -> Back End -> Machine code; both ends report Errors]

The Front End

• Responsibilities
  – Recognize legal (& illegal) programs
  – Report errors in a useful way
  – Produce IR & preliminary storage map
  – Shape the code for the back end
  – Much of front end construction can be automated

[Figure: Source code -> Scanner -> tokens -> Parser -> IR; Scanner and Parser report Errors]

Scanner

• Maps character stream into words
• Produces pairs (tokens): <part of speech, word>
    x = x + y ;   becomes   <id,x> = <id,x> + <id,y> ;
  – word ≈ lexeme, part of speech ≈ token type
• Typical tokens include number, identifier, +, –, new, while, if
• Scanner eliminates white space and comments
• Speed is important
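A scanner along these lines can be sketched with regular expressions. This is a minimal illustration, not the course's project design: the token classes, the keyword set, and the `//` comment syntax are assumptions made for the example. Symbols and keywords are reported with the lexeme as their own token type, mirroring how the slide writes `=`, `+`, and `;` unpaired.

```python
import re

# Token classes are illustrative assumptions; a real language needs more.
KEYWORDS = {"new", "while", "if"}
TOKEN_SPEC = [
    ("skip", r"\s+|//[^\n]*"),   # white space and (assumed) // comments
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("symbol", r"[-+=;]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Map a character stream to a list of (token type, lexeme) pairs."""
    tokens = []
    position = 0
    while position < len(text):
        match = MASTER.match(text, position)
        if match is None:
            raise SyntaxError(f"unexpected character {text[position]!r} at offset {position}")
        kind, lexeme = match.lastgroup, match.group()
        position = match.end()
        if kind == "skip":
            continue  # the scanner discards white space and comments
        if kind == "symbol" or lexeme in KEYWORDS:
            kind = lexeme  # symbols and keywords are their own part of speech
        tokens.append((kind, lexeme))
    return tokens
```

For example, `scan("x = x + y ;")` yields the pairs for `<id,x> = <id,x> + <id,y> ;` in order. A production scanner would usually be generated from the regular expressions rather than matched this way, since speed matters.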

Parser

• Recognizes context-free syntax & reports errors

• Guides context-sensitive (“semantic”) analysis (type checking)

• Builds IR for source program

Hand-coded parsers are fairly easy to build

Most books advocate using automatic parser generators

Parser

Context-free syntax is specified with a grammar

SheepNoise -> SheepNoise baa | baa

SheepNoise -> nil

This grammar defines the set of noises that a sheep makes under normal circumstances

It is written in a variant of Backus–Naur Form (BNF)

Parser

Formally, a grammar G = (S,N,T,P)
• S is the start symbol
• N is a set of non-terminal symbols
• T is a set of terminal symbols or words
• P is a set of productions or rewrite rules (P : N -> strings over N ∪ T)

Parser

1. goal -> expr
2. expr -> expr op term
3.       | term
4. term -> number
5.       | id
6. op   -> +
7.       | -

S = goal

T = { number, id, +, - }

N = { goal, expr, term, op }

P = { 1, 2, 3, 4, 5, 6, 7}
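The four components can be written down directly as data. The sketch below is one possible encoding, assuming a `Grammar` named tuple and an `is_well_formed` check that are invented here for illustration; the check only confirms that the pieces fit the formal definition (start symbol in N, N and T disjoint, every production rewriting a non-terminal into symbols from N ∪ T).

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    """G = (S, N, T, P); the field names are illustrative assumptions."""
    start: str
    nonterminals: frozenset
    terminals: frozenset
    productions: tuple  # each production is (left-hand side, right-hand side)


EXPR_GRAMMAR = Grammar(
    start="goal",
    nonterminals=frozenset({"goal", "expr", "term", "op"}),
    terminals=frozenset({"number", "id", "+", "-"}),
    productions=(
        ("goal", ("expr",)),              # 1
        ("expr", ("expr", "op", "term")), # 2
        ("expr", ("term",)),              # 3
        ("term", ("number",)),            # 4
        ("term", ("id",)),                # 5
        ("op", ("+",)),                   # 6
        ("op", ("-",)),                   # 7
    ),
)


def is_well_formed(grammar):
    """Check that the four components are consistent with each other."""
    symbols = grammar.nonterminals | grammar.terminals
    return (
        grammar.start in grammar.nonterminals
        and not (grammar.nonterminals & grammar.terminals)
        and all(
            lhs in grammar.nonterminals and all(symbol in symbols for symbol in rhs)
            for lhs, rhs in grammar.productions
        )
    )
```

Keeping the grammar as plain data is a design choice that makes it easy to feed the same definition to a checker, a derivation tool, or a parser generator.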

Parser

Context-free syntax can be put to better use

• This grammar defines simple expressions with addition & subtraction over “number” and “id”.

• This grammar, like many, falls in a class called “context-free grammars”, abbreviated CFG.

Parser

Production   Result
             goal
1            expr
2            expr op term
5            expr op y
7            expr - y
2            expr op term - y
4            expr op 2 - y
6            expr + 2 - y
3            term + 2 - y
5            x + 2 - y

Sentence derived: x + 2 - y
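The table applies each production to the rightmost non-terminal of the current sentential form. That process can be replayed mechanically, as in the sketch below. The function name `derive_rightmost` and the dictionary encoding of the productions are assumptions for illustration; the sketch works on token types, so the final form reads id + number - id rather than x + 2 - y.

```python
# Productions numbered as on the slides: number -> (left side, right side).
PRODUCTIONS = {
    1: ("goal", ["expr"]),
    2: ("expr", ["expr", "op", "term"]),
    3: ("expr", ["term"]),
    4: ("term", ["number"]),
    5: ("term", ["id"]),
    6: ("op", ["+"]),
    7: ("op", ["-"]),
}
NONTERMINALS = {"goal", "expr", "term", "op"}


def derive_rightmost(sequence, start="goal"):
    """Apply each production to the rightmost non-terminal; return every form."""
    form = [start]
    forms = [list(form)]
    for number in sequence:
        lhs, rhs = PRODUCTIONS[number]
        # Locate the rightmost non-terminal in the current sentential form.
        index = max(i for i, symbol in enumerate(form) if symbol in NONTERMINALS)
        if form[index] != lhs:
            raise ValueError(f"production {number} rewrites {lhs}, found {form[index]}")
        form[index:index + 1] = rhs
        forms.append(list(form))
    return forms


FORMS = derive_rightmost([1, 2, 5, 7, 2, 4, 6, 3, 5])
```

Each entry of `FORMS` corresponds to one row of the table, starting from goal; a parser runs this process in reverse, discovering the production sequence from the sentence.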

Parser

A parse can be represented by a tree (parse tree or syntax tree)

[Figure: parse tree for x + 2 - y, shown here by indentation]

goal
  expr
    expr
      expr
        term
          <id,x>
      op
        +
      term
        <number,2>
    op
      -
    term
      <id,y>

1. goal -> expr
2. expr -> expr op term
3.       | term
4. term -> number
5.       | id
6. op   -> +
7.       | -

Parser

Compilers often use an abstract syntax tree

[Figure: AST for x + 2 - y, shown here by indentation]

-
  +
    <id,x>
    <number,2>
  <id,y>

The AST summarizes grammatical structure, without including detail about the derivation
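A hand-coded parser for the expression grammar can build this AST directly. The following is a minimal sketch, assuming the input is a list of (token type, lexeme) pairs and that the AST is encoded as nested tuples, with leaves kept as token pairs and interior nodes as (operator, left, right). The left-recursive rule expr -> expr op term is handled with a loop, which keeps the tree left-associative.

```python
def parse_expr(tokens):
    """Parse term (op term)* and return the AST as nested tuples."""
    position = 0

    def parse_term():
        nonlocal position
        if position < len(tokens) and tokens[position][0] in ("id", "number"):
            leaf = tokens[position]
            position += 1
            return leaf
        raise SyntaxError(f"expected id or number at token {position}")

    tree = parse_term()
    # Each further "op term" becomes the new root, so earlier operators
    # end up deeper in the left subtree.
    while position < len(tokens) and tokens[position][0] in ("+", "-"):
        operator = tokens[position][0]
        position += 1
        tree = (operator, tree, parse_term())
    if position != len(tokens):
        raise SyntaxError(f"unexpected token at position {position}")
    return tree


TOKENS = [("id", "x"), ("+", "+"), ("number", "2"), ("-", "-"), ("id", "y")]
AST = parse_expr(TOKENS)
```

For x + 2 - y the root of `AST` is the subtraction and its left child is the addition, matching the figure; none of the goal, expr, term, or op nodes of the parse tree survive.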

The Back End

Responsibilities
• Translate IR into target machine code
• Choose instructions to implement each IR operation
• Decide which values to keep in registers
• Ensure conformance with system interfaces

Automation has been less successful in the back end

[Figure: IR -> Instruction Selection -> IR -> Register Allocation -> IR -> Instruction Scheduling -> Machine code; each stage reports Errors]

The Back End

Instruction Selection
• Produce fast, compact code
• Take advantage of target features such as addressing modes
• Usually viewed as a pattern matching problem
  – ad hoc methods, pattern matching, dynamic programming

[Figure: IR -> Instruction Selection -> IR -> Register Allocation -> IR -> Instruction Scheduling -> Machine code; each stage reports Errors]
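The pattern-matching view can be illustrated with a small tree walk. This is an ad hoc sketch, not a production selector: the target machine, its mnemonics (load, loadI, add, addI, sub, subI), and the unlimited supply of virtual registers are all assumptions made for the example. The one pattern it recognizes is a constant right operand, which lets it use an immediate form instead of loading the constant into a register first.

```python
import itertools


def select_instructions(tree):
    """Emit code for a tuple-encoded expression tree; return (code, result register)."""
    code = []
    numbers = itertools.count(1)

    def new_register():
        return f"r{next(numbers)}"

    def walk(node):
        if len(node) == 2:  # leaf: (kind, lexeme)
            kind, lexeme = node
            register = new_register()
            if kind == "number":
                code.append(f"loadI {lexeme} => {register}")
            else:
                code.append(f"load @{lexeme} => {register}")
            return register
        operator, left, right = node
        mnemonic = {"+": "add", "-": "sub"}[operator]
        # Pattern: a constant right operand fits the immediate form.
        if len(right) == 2 and right[0] == "number":
            source = walk(left)
            target = new_register()
            code.append(f"{mnemonic}I {source}, {right[1]} => {target}")
            return target
        first, second = walk(left), walk(right)
        target = new_register()
        code.append(f"{mnemonic} {first}, {second} => {target}")
        return target

    result = walk(tree)
    return code, result


CODE, RESULT = select_instructions(("-", ("+", ("id", "x"), ("number", "2")), ("id", "y")))
```

For x + 2 - y this emits four instructions instead of the five a pattern-free walk would produce, because the constant 2 is folded into an immediate add.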

The Back End

Register Allocation
• Have each value in a register when it is used
• Manage a limited set of resources
• Can change instruction choices & insert LOADs & STOREs
• Optimal allocation is NP-Complete (1 or k registers)
• Compilers approximate solutions to NP-Complete problems

[Figure: IR -> Instruction Selection -> IR -> Register Allocation -> IR -> Instruction Scheduling -> Machine code; each stage reports Errors]
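One common way to approximate allocation is to color an interference graph greedily. The sketch below is a simplified heuristic written for illustration, not the allocator described in the textbook: the graph format (a dictionary from each value to the set of values live at the same time), the most-constrained-first ordering, and the function name `allocate` are assumptions. Values that cannot be colored with k registers are reported as spilled, which is where the extra LOADs and STOREs would come from.

```python
def allocate(interference, k):
    """Greedily assign one of k registers per value; return (assignment, spilled)."""
    assignment = {}
    spilled = []
    # Heuristic order: handle the values with the most neighbors first.
    order = sorted(interference, key=lambda value: (-len(interference[value]), value))
    for value in order:
        taken = {assignment[n] for n in interference[value] if n in assignment}
        free = [register for register in range(k) if register not in taken]
        if free:
            assignment[value] = free[0]
        else:
            spilled.append(value)  # would be kept in memory instead
    return assignment, spilled


# Hypothetical example: a, b, c are all live together; d overlaps only c.
GRAPH = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
TWO_REGISTERS = allocate(GRAPH, 2)
THREE_REGISTERS = allocate(GRAPH, 3)
```

With two registers one of the three mutually interfering values must be spilled; with three registers every value fits. The greedy order can miss colorings that an exact search would find, which is the price of avoiding the NP-Complete problem.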

The Back End

Instruction Scheduling
• Avoid hardware stalls and interlocks
• Use all functional units productively
• Can increase lifetime of variables (changing the allocation)

Optimal scheduling is NP-Complete in nearly all cases
Heuristic techniques are well developed

[Figure: IR -> Instruction Selection -> IR -> Register Allocation -> IR -> Instruction Scheduling -> Machine code; each stage reports Errors]
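List scheduling is one of the well-developed heuristics. The sketch below assumes a single-issue machine, an acyclic dependence graph, and made-up operation names and latencies; the priority of an operation is the latency-weighted length of the longest path from it to the end of the block. It is an illustration of the idea rather than a complete scheduler.

```python
def list_schedule(latency, deps):
    """Return [(cycle, op)] for a single-issue machine; deps must be acyclic."""
    successors = {op: set() for op in latency}
    for op, predecessors in deps.items():
        for predecessor in predecessors:
            successors[predecessor].add(op)

    priority = {}

    def rank(op):
        # Longest latency-weighted path from op to the end of the block.
        if op not in priority:
            priority[op] = latency[op] + max((rank(s) for s in successors[op]), default=0)
        return priority[op]

    for op in latency:
        rank(op)

    available_at = {}  # op -> first cycle in which its result can be used
    schedule = []
    remaining = set(latency)
    cycle = 1
    while remaining:
        ready = [
            op for op in remaining
            if all(p in available_at and available_at[p] <= cycle for p in deps.get(op, ()))
        ]
        if ready:
            chosen = min(ready, key=lambda op: (-priority[op], op))
            schedule.append((cycle, chosen))
            available_at[chosen] = cycle + latency[chosen]
            remaining.remove(chosen)
        cycle += 1  # an empty ready list means the machine stalls this cycle
    return schedule


# Hypothetical block: three-cycle loads and store, one-cycle add.
LATENCY = {"load_x": 3, "load_y": 3, "load_z": 3, "add": 1, "store": 3}
DEPS = {"add": {"load_x", "load_y"}, "store": {"add"}}
SCHEDULE = list_schedule(LATENCY, DEPS)
```

In this example the independent `load_z` is placed in cycle 3, a slot that would otherwise be a stall while the add waits for its operands; only cycle 4 remains empty.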

Traditional Three-pass Compiler

Code Improvement (or Optimization)
• Analyzes IR and rewrites (or transforms) IR
• Primary goal is to reduce running time of the compiled code
  – May also improve space, power consumption, …
• Must preserve “meaning” of the code
  – Measured by values of named variables

[Figure: Source Code -> Front End -> IR -> Middle End -> IR -> Back End -> Machine code; all three stages report Errors]

The Optimizer (or Middle End)

Typical Transformations
• Discover & propagate some constant value
• Move a computation to a less frequently executed place
• Specialize some computation based on context
• Discover a redundant computation & remove it
• Remove useless or unreachable code
• Encode an idiom in some particularly efficient form

[Figure: IR -> Opt 1 -> IR -> Opt 2 -> IR -> Opt 3 -> IR -> ... -> Opt n -> IR; passes report Errors]

Modern optimizers are structured as a series of passes
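The pass structure can be sketched as a list of IR-to-IR functions applied in order. The two passes below, constant folding and removal of a useless "+ 0" or "- 0", are simple stand-ins chosen for illustration; the tuple-encoded expression tree used as the IR and the names `fold_constants`, `drop_zero`, and `optimize` are assumptions of this sketch.

```python
def fold_constants(node):
    """Replace an operation on two constants with its value."""
    if len(node) == 2:  # leaf: (kind, lexeme)
        return node
    operator, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if left[0] == "number" and right[0] == "number":
        a, b = int(left[1]), int(right[1])
        return ("number", str(a + b if operator == "+" else a - b))
    return (operator, left, right)


def drop_zero(node):
    """Remove a useless computation: e + 0 and e - 0 are both just e."""
    if len(node) == 2:
        return node
    operator, left, right = node
    left, right = drop_zero(left), drop_zero(right)
    if right == ("number", "0"):
        return left
    return (operator, left, right)


def optimize(tree, passes):
    """Run each pass in order; every pass maps IR to IR."""
    for optimization in passes:
        tree = optimization(tree)
    return tree


# x + (2 - 2): folding exposes the zero, which the second pass then removes.
BEFORE = ("+", ("id", "x"), ("-", ("number", "2"), ("number", "2")))
AFTER = optimize(BEFORE, [fold_constants, drop_zero])
```

The order of passes matters here: running `drop_zero` before `fold_constants` would leave x + 0 behind, which is one reason real optimizers repeat or carefully order their passes.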

Modern Restructuring Compiler

Typical Restructuring Transformations:
• Blocking for memory hierarchy and register reuse
• Vectorization
• Parallelization
• All based on dependence
• Also full and partial inlining

[Figure: Source Code -> Front End -> HL AST -> Restructurer -> HL AST -> IR Gen -> IR -> Opt + Back End -> Machine code; stages report Errors]
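Blocking (also called tiling) can be shown on matrix multiplication. The sketch below writes out by hand the loop structure a restructuring compiler might produce; the function names and the block size parameter are assumptions for illustration. Python will not show the cache benefit, so the point is only that the blocked loop nest visits the same iterations in a different order and computes the same result.

```python
def matmul_naive(a, b, n):
    """Reference n x n matrix multiply with the usual i, j, k loop order."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, n, block):
    """Same computation, iterating over block x block tiles."""
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Work within one tile so its data can stay in fast memory.
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        a_ik = a[i][k]  # reused across the whole j loop
                        for j in range(jj, min(jj + block, n)):
                            c[i][j] += a_ik * b[k][j]
    return c


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
```

The transformation is legal because the iterations only accumulate into `c[i][j]`, so reordering them does not change the sums; establishing that kind of fact is what the dependence analysis on the slide is for.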

Discussion

Consider a simple web browser that takes as input a textual string in HTML format and displays the specified graphics on the screen. Is the display process one of compilation or interpretation? Why?

Next class

• Lexical analysis
• Chapter 2