Compilers

Week 1, 2011/09/02

권혁철

Assignment

- It might be the biggest program you’ve ever written.
- It cannot be done the day it’s due!
- Follow the schedule laid out in the syllabus.
- This is an ABEEK design course.
- Team report: state each member's role clearly; every member must understand the overall system structure and its contents.
- Design under the given constraints.
- Creativity is required ????

Team formation / Assignment

Team formation
- Each team has 2 members.
- The two members develop jointly, but both must understand the entire program.

Assignment
- Follow the general process for building a compiler.
- Add your own ideas within the constraints.

Constraints

- The language must include type declarations, the four basic arithmetic operations, if statements, and for statements.
- Subprograms or other added features will be evaluated for extra credit.
- You may generate code for a virtual assembly language and write an interpreter for it.
- Output in Pentium assembly language or byte-code that actually runs earns extra credit.
- The choice of language, grammar, and language definition is up to each team.
- You may use Lex or Yacc.
  - However, not using them earns extra credit; you may still use the parse table produced by Yacc.

Making Languages Usable

It was our belief that if FORTRAN, during its first months, were to translate any reasonable “scientific” source program into an object program only half as fast as its hand-coded counterpart, then acceptance of our system would be in serious danger... I believe that had we failed to produce efficient programs, the widespread use of languages like FORTRAN would have been seriously delayed.

John Backus

18 person-years to complete!!!

Compiler construction

Compiler writing is perhaps the most pervasive topic in computer science, involving many fields:
- Programming languages
- Architecture
- Theory of computation
- Algorithms
- Software engineering

In this course, you will put everything you have learned together. Exciting, right??

Quick questions

Consider the grammar shown below (<S> is the start symbol). Circle the strings below that are in the language described by the grammar. There may be zero or more correct answers.

Grammar:
<S> ::= <A> a <B> b
<A> ::= b <A> | b
<B> ::= <A> a | a

Strings:
A) baab
B) bbbabb
C) bbaaaa
D) baaabb
E) bbbabab

Write a grammar for the language consisting of one or more a’s followed by the same number of b’s. For example, aaabbb is in the language, aabbb is not, and the empty string is not in the language.
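One way to check answers to both questions mechanically is sketched below. It rests on two observations that are not on the slide: the first grammar happens to describe a regular language, so a regular expression can stand in for it, and one candidate grammar for the second question is <S> ::= a <S> b | a b. The names in_language and is_anbn are hypothetical helpers chosen for this sketch.

```python
import re

# Question 1: <A> derives one or more b's, so <B> derives either b...ba or a.
# The whole language is therefore b+ a (b+ a | a) b.
S_PATTERN = re.compile(r"b+a(?:b+a|a)b")


def in_language(s: str) -> bool:
    """Return True if s can be derived from <S> in the first grammar."""
    return S_PATTERN.fullmatch(s) is not None


def is_anbn(s: str) -> bool:
    """Recognize a^n b^n (n >= 1), mirroring <S> ::= a <S> b | a b."""
    if s == "ab":
        return True
    return len(s) > 2 and s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])


candidates = {"A": "baab", "B": "bbbabb", "C": "bbaaaa", "D": "baaabb", "E": "bbbabab"}
accepted = [label for label, s in candidates.items() if in_language(s)]
print(accepted)
print(is_anbn("aaabbb"), is_anbn("aabbb"), is_anbn(""))
```

Working a derivation by hand for each accepted string is still the better exercise; the script only confirms the result.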

The NEIS (나이스) incident

- What was the cause?
- If you had to solve it, how would you do it?

Why Software Is Eating the World

What is a compiler?

[Diagram: Source Program -> Compiler -> Target Program; the compiler also emits Error Messages]

The source language might be
- General purpose, e.g. C or Pascal
- A “little language” for a specific domain, e.g. SIML

The target language might be
- Some other programming language
- The machine language of a specific machine

Related terms

What is an interpreter?
- A program that reads an executable program and produces the results of executing that program

Target Machine: the machine on which the compiled program is to be run

Cross-Compiler: a compiler that runs on a different type of machine than its target

Compiler-Compiler: a tool to simplify the construction of compilers (YACC/JCUP)

Is it hard??

In the 1950s, compiler writing took an enormous amount of effort.
- The first FORTRAN compiler took 18 person-years

Today, though, we have very good software tools
- You will write your own compiler in a team of 3 in one semester!

Intrinsic interest

Compiler construction involves ideas from many different parts of computer science:

Artificial intelligence: greedy algorithms, heuristic search techniques
Algorithms: graph algorithms, union-find, dynamic programming
Theory: DFAs & PDAs, pattern matching, fixed-point algorithms
Systems: allocation & naming, synchronization, locality
Architecture: pipeline & hierarchy management, instruction set use

Intrinsic merit

Compiler construction poses challenging and interesting problems:

- Compilers must do a lot but also run fast
- Compilers have primary responsibility for run-time performance
- Compilers are responsible for making it acceptable to use the full power of the programming language
- Computer architects perpetually create new challenges for the compiler by building more complex machines
- Compilers must hide that complexity from the programmer

Success requires mastery of complex interactions

High-level View of a Compiler

[Diagram: Source code -> Compiler -> Machine code; the compiler also reports Errors]

Implications
- Must recognize legal (and illegal) programs
- Must generate correct code
- Must manage storage of all variables (and code)
- Must agree with OS & linker on format for object code

Two Pass Compiler

We break compilation into two phases:

- ANALYSIS breaks the program into pieces and creates an intermediate representation of the source program.
- SYNTHESIS constructs the target program from the intermediate representation.

Sometimes we call the analysis part the FRONT END and the synthesis part the BACK END of the compiler. They can be written independently.

Traditional Two-pass Compiler

[Diagram: Source code -> Front End -> IR -> Back End -> Machine code; both ends can report Errors]

Implications
- Use an intermediate representation (IR)
- Front end maps legal source code into IR
- Back end maps IR into target machine code
- Admits multiple front ends & multiple passes (better code)

Typically, the front end is O(n) or O(n log n), while the back end is NP-Complete

A Common Fallacy

[Diagram: Fortran, Scheme, Java, and Smalltalk each feed their own front end; the front ends feed separate back ends for i86, SPARC, and Power PC]

Can we build n x m compilers with n+m components?
- Must encode all language specific knowledge in each front end
- Must encode all features in a single IR
- Must encode all target specific knowledge in each back end

Limited success in systems with very low-level IRs

Source code analysis

Analysis is important for many applications besides compilers:
- STRUCTURE EDITORS try to fill out syntax units as you type
- PRETTY PRINTERS highlight comments, indent your code for you, and so on
- STATIC CHECKERS try to find programming bugs without actually running the program
- INTERPRETERS don’t bother to produce target code, but just perform the requested operations (e.g. Matlab)

Source code analysis

Analysis comes in three phases:

- LINEAR ANALYSIS processes characters left-to-right and groups them into TOKENS
- HIERARCHICAL ANALYSIS groups tokens hierarchically into nested collections of tokens
- SEMANTIC ANALYSIS makes sure the program components fit together, e.g. variables should be declared before they are used

Linear (lexical) analysis

The linear analysis stage is called LEXICAL ANALYSIS or SCANNING.

Example:
position = initial + rate * 60

gets translated as:

1. The IDENTIFIER “position”
2. The ASSIGNMENT SYMBOL “=”
3. The IDENTIFIER “initial”
4. The PLUS OPERATOR “+”
5. The IDENTIFIER “rate”
6. The MULTIPLICATION OPERATOR “*”
7. The NUMERIC LITERAL 60
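The scan above can be sketched with Python's re module. This is a minimal illustration, not the course's required design: the token class names follow the slide loosely, while the regular expressions, TOKEN_SPEC, and tokenize are assumptions made for this example.

```python
import re

# Token classes loosely named after the slide; the patterns are assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Scan text left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("position = initial + rate * 60")
for kind, lexeme in tokens:
    print(kind, lexeme)
```

The loop never looks back at earlier characters, which is what makes this stage "linear".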

Hierarchical (syntax) analysis

The hierarchical stage is called SYNTAX ANALYSIS or PARSING.

The hierarchical structure of the source program can be represented by a PARSE TREE, for example:

[Parse tree for position = initial + rate * 60; children are indented under their parent]

assignment statement
    identifier: position
    =
    expression
        expression
            identifier: initial
        +
        expression
            expression
                identifier: rate
            *
            expression
                number: 60

Syntax analysis

The hierarchical structure of the syntactic units in a programming language is normally represented by a set of recursive rules. Example for expressions:

1. Any identifier is an expression
2. Any number is an expression
3. If expression1 and expression2 are expressions, so are
     expression1 + expression2
     expression1 * expression2
     ( expression1 )
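The recursive rules map naturally onto a recursive-descent parser, sketched below. Two things here are assumptions that the rules themselves do not state: "*" binds tighter than "+", and both operators associate to the left. The function name parse_expression and the nested-tuple tree shape are also choices made only for this illustration.

```python
import re


def parse_expression(text):
    """Parse an expression and return a nested-tuple tree.

    Assumed here (not stated by the rules): '*' binds tighter than '+',
    and both operators are left-associative.
    """
    # Characters outside this pattern are silently skipped in this sketch;
    # a real scanner would report them.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":                      # rule 3: ( expression1 )
            inner = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if token.isdigit():                   # rule 2: any number
            return ("number", token)
        if token in ("+", "*", ")"):
            raise SyntaxError(f"unexpected {token!r}")
        return ("identifier", token)          # rule 1: any identifier

    def term():
        node = factor()
        while peek() == "*":                  # rule 3: expression1 * expression2
            advance()
            node = ("*", node, factor())
        return node

    def expression():
        node = term()
        while peek() == "+":                  # rule 3: expression1 + expression2
            advance()
            node = ("+", node, term())
        return node

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree


tree = parse_expression("initial + rate * 60")
print(tree)
```

Splitting the single "expression" rule into expression, term, and factor is what encodes the assumed precedence; taken literally, the three rules are ambiguous for inputs such as a + b * c.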

Syntax analysis

Example for statements:

1. If identifier1 is an identifier and expression2 is an expression, then identifier1 = expression2 is a statement.
2. If expression1 is an expression and statement2 is a statement, then the following are statements:
     while ( expression1 ) statement2
     if ( expression1 ) statement2

Lexical vs. syntactic analysis

Generally if a syntactic unit can be recognized in a linear scan, we convert it into a token during lexical analysis.

More complex syntactic units, especially recursive structures, are normally processed during syntactic analysis (parsing).

Identifiers, for example, can be recognized easily in a linear scan, so identifiers are tokenized during lexical analysis.

Source code analysis

It is common to convert complex parse trees to simpler SYNTAX TREES, with a node for each operator and children for the operands of each operator.

[Syntax tree produced by analysis of position = initial + rate * 60; children are indented under their operator]

=
    position
    +
        initial
        *
            rate
            60

Semantic analysis

The semantic analysis stage:
- Checks for semantic errors, e.g. undeclared variables
- Gathers type information
- Determines the operators and operands of expressions

Example: if rate is a float, the integer literal 60 should be converted to a float before multiplying.

=
    position
    +
        initial
        *
            rate
            inttoreal
                60
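The coercion step in this example can be sketched as a small tree walk. The slide only says that rate is a float; the other declared types in SYMBOL_TYPES, the nested-tuple tree encoding, and the function name annotate are assumptions for illustration.

```python
# Assumed declarations; the slide only states that rate is a float.
SYMBOL_TYPES = {"position": "float", "initial": "float", "rate": "float"}


def annotate(node):
    """Return (typed_tree, type), inserting inttoreal where int meets float."""
    kind = node[0]
    if kind == "number":
        return node, "int"
    if kind == "identifier":
        name = node[1]
        if name not in SYMBOL_TYPES:
            raise NameError(f"undeclared variable {name!r}")
        return node, SYMBOL_TYPES[name]
    # Binary operator: type both operands, then coerce the int side if they differ.
    left, left_type = annotate(node[1])
    right, right_type = annotate(node[2])
    if left_type == right_type:
        return (kind, left, right), left_type
    if left_type == "int":
        left = ("inttoreal", left)
    else:
        right = ("inttoreal", right)
    return (kind, left, right), "float"


tree = ("+", ("identifier", "initial"),
        ("*", ("identifier", "rate"), ("number", "60")))
typed_tree, result_type = annotate(tree)
print(typed_tree, result_type)
```

The same walk also catches the undeclared-variable error mentioned in the first bullet, because every identifier is looked up before it is typed.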

The rest of the process

[Diagram: source program -> lexical analyzer -> syntax analyzer -> semantic analyzer -> intermediate code generator -> code optimizer -> code generator -> target program; the symbol-table manager and the error handler sit alongside the phases]

Symbol-table management

During analysis, we record the identifiers used in the program.

The symbol table stores each identifier with its ATTRIBUTES.

Example attributes:
- How much STORAGE is allocated for the id
- The id’s TYPE
- The id’s SCOPE
- For functions, the PARAMETER PROTOCOL

Some attributes can be determined immediately; some are delayed.
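A symbol table along these lines might look like the sketch below. The class and field names (SymbolEntry, SymbolTable, type_name, storage_bytes) and the 4-byte size are hypothetical; the sketch only illustrates that an entry is created when an identifier is first seen and that some attributes are filled in later.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolEntry:
    name: str
    type_name: Optional[str] = None       # delayed: known only after declarations are analyzed
    scope: str = "global"
    storage_bytes: Optional[int] = None   # delayed: depends on the type


class SymbolTable:
    def __init__(self):
        self._entries = {}

    def enter(self, name):
        """Record an identifier on first sight and return its 1-based index."""
        if name not in self._entries:
            self._entries[name] = (len(self._entries) + 1, SymbolEntry(name))
        return self._entries[name][0]

    def lookup(self, name):
        """Return the entry for name; raises KeyError if it was never entered."""
        return self._entries[name][1]


table = SymbolTable()
indices = [table.enter(name) for name in ("position", "initial", "rate", "rate")]
print(indices)

# A later phase fills in the delayed attributes (values assumed for illustration).
table.lookup("rate").type_name = "float"
table.lookup("rate").storage_bytes = 4
```

Entering "rate" twice returns the same index, so every use of the identifier refers to one shared entry.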

Error detection

- Each compilation phase can have errors
- Normally, we want to keep processing after an error, in order to find more errors.
- Each stage has its own characteristic errors, e.g.
  - Lexical analysis: a string of characters that do not form a legal token
  - Syntax analysis: unmatched { } or missing ;
  - Semantic: trying to add a float and a pointer
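The idea of continuing after an error can be illustrated at the lexical level. In this sketch the set of legal tokens and the function name scan_collecting_errors are assumptions; the point is only that an illegal character is reported and skipped instead of stopping the scan.

```python
import re

# Assumed token set for this illustration.
LEGAL = re.compile(r"\d+|[A-Za-z_]\w*|[=+*;{}()]")


def scan_collecting_errors(text):
    """Return (tokens, errors); illegal characters are reported and skipped."""
    tokens, errors = [], []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = LEGAL.match(text, pos)
        if match:
            tokens.append(match.group())
            pos = match.end()
        else:
            errors.append(f"offset {pos}: illegal character {text[pos]!r}")
            pos += 1  # skip the offending character and keep scanning
    return tokens, errors


tokens, errors = scan_collecting_errors("rate = 6 @ 0 $ x;")
print(tokens)
print(errors)
```

Both illegal characters are reported in a single run, which is the behaviour the second bullet asks for.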

Internal Representations

Each stage of processing transforms a representation of the source code program into a new representation.

position = initial + rate * 60

  -> lexical analyzer ->

id1 = id2 + id3 * 60

  -> syntax analyzer ->

=
    id1
    +
        id2
        *
            id3
            60

  -> semantic analyzer ->

=
    id1
    +
        id2
        *
            id3
            inttoreal
                60

symbol table
1  position  …
2  initial   …
3  rate      …
4

Intermediate code generation

Some compilers explicitly create an intermediate representation of the source code program after semantic analysis.

The representation is a program for an abstract machine.

The most common representation is “three-address code”, in which all memory locations are treated as registers, and most instructions apply an operator to two operand registers and store the result to a destination register.

Intermediate code generation

[Output of the semantic analyzer]

=
    position
    +
        initial
        *
            rate
            inttoreal
                60

[Resulting three-address code]

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
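A post-order walk over the typed tree is one simple way to arrive at code like this. The sketch below assumes a nested-tuple tree whose leaves already carry the id1/id2/id3 names; generate_tac and the tuple encoding are hypothetical choices for this example.

```python
import itertools


def generate_tac(target, tree):
    """Translate a typed expression tree into three-address code for target := tree."""
    code = []
    counter = itertools.count(1)

    def emit(node):
        kind = node[0]
        if kind in ("identifier", "number"):
            return node[1]                      # leaves are used directly as operands
        if kind == "inttoreal":
            operand = emit(node[1])
            temp = f"temp{next(counter)}"
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        left = emit(node[1])                    # operands first (post-order)
        right = emit(node[2])
        temp = f"temp{next(counter)}"
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    result = emit(tree)
    code.append(f"{target} := {result}")
    return code


tree = ("+", ("identifier", "id2"),
        ("*", ("identifier", "id3"), ("inttoreal", ("number", "60"))))
tac = generate_tac("id1", tree)
print("\n".join(tac))
```

Each interior node gets a fresh temporary, which is why the unoptimized code ends with a copy from temp3 into id1.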

The Optimizer (or Middle End)

[Diagram: IR -> Opt 1 -> IR -> Opt 2 -> IR -> Opt 3 -> IR -> ... -> Opt n -> IR; the passes can report Errors]

Modern optimizers are structured as a series of passes

Typical Transformations (for performance, code size, power consumption, etc.)
- Discover & propagate some constant value
- Move a computation to a less frequently executed place
- Specialize some computation based on context
- Discover a redundant computation & remove it
- Remove useless or unreachable code
- Encode an idiom in some particularly efficient form

Code optimization

At this stage, we improve the code to make it run faster.

Input to the code optimizer:

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3

Output of the code optimizer:

temp1 := id3 * 60.0
id1 := id2 + temp1
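Two small transformations are enough to get from the first listing to the second, as the sketch below suggests. The (dest, op, arg1, arg2) tuple encoding, the "copy" operation name, and the function optimize are assumptions for this example. The sketch does not renumber temporaries, so its result uses temp2 where the slide shows temp1.

```python
def optimize(code):
    """Apply two simple transformations to (dest, op, arg1, arg2) tuples.

    1. Fold inttoreal of an integer literal into a float constant at compile time.
    2. Remove a trailing copy 'x := t' by making t's defining instruction write x.
    """
    constants = {}
    folded = []
    for dest, op, arg1, arg2 in code:
        # Propagate constants discovered so far into the operands.
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttoreal" and arg1.isdigit():
            constants[dest] = str(float(arg1))  # remember the constant, drop the instruction
            continue
        folded.append((dest, op, arg1, arg2))

    # Safe here because the temporary is defined immediately before the copy
    # and the copy is the last instruction, so nothing else reads it.
    if len(folded) >= 2 and folded[-1][1] == "copy" and folded[-2][0] == folded[-1][2]:
        final_dest = folded[-1][0]
        _, op, arg1, arg2 = folded[-2]
        folded[-2:] = [(final_dest, op, arg1, arg2)]
    return folded


code = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
optimized = optimize(code)
for dest, op, arg1, arg2 in optimized:
    print(f"{dest} := {arg1} {op} {arg2}")
```

Real optimizers generalize both steps (constant propagation and copy elimination) to whole procedures rather than a single straight-line sequence.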

Code generation

In the final stage, we take the three-address code (3AC) or other intermediate representation, and convert to the target language.

We must pick memory locations for variables and allocate registers.

Input to the code generator:

temp1 := id3 * 60.0
id1 := id2 + temp1

Output of the code generator:

MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
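A deliberately naive generator for this instruction style is sketched below. The mnemonics are taken from the slide, but the tuple encoding, the function names, and the register policy (hand out R1, R2, ... in order, never spill) are assumptions, so the register numbers come out swapped relative to the slide.

```python
MNEMONIC = {"+": "ADDF", "*": "MULF"}


def operand(name, location):
    """Format one operand: register, immediate constant, or variable name."""
    if name in location:           # a temporary already held in a register
        return location[name]
    if name[0].isdigit():          # a numeric constant becomes an immediate
        return f"#{name}"
    return name                    # a variable is addressed by its name


def generate(code):
    """Turn (dest, op, arg1, arg2) tuples into two-operand assembly text."""
    asm = []
    location = {}
    registers = iter(["R1", "R2", "R3", "R4"])   # no spilling in this sketch
    for dest, op, arg1, arg2 in code:
        reg = next(registers)
        asm.append(f"MOVF {operand(arg1, location)}, {reg}")
        asm.append(f"{MNEMONIC[op]} {operand(arg2, location)}, {reg}")
        if dest.startswith("temp"):
            location[dest] = reg                 # keep temporaries in their register
        else:
            asm.append(f"MOVF {reg}, {dest}")    # store named variables to memory
    return asm


asm = generate([("temp1", "*", "id3", "60.0"), ("id1", "+", "id2", "temp1")])
print("\n".join(asm))
```

The fixed four-register pool is the obvious weakness: a real code generator has to decide which values stay in registers when there are more live values than registers.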

The Back End

[Diagram: IR -> Instruction Selection -> IR -> Register Allocation -> IR -> Instruction Scheduling -> Machine code; each stage can report Errors]

Responsibilities
- Translate IR into target machine code
- Choose instructions to implement each IR operation
- Decide which value to keep in registers
- Ensure conformance with system interfaces

Automation has been less successful in the back end

The Back End

Instruction Selection
- Produce fast, compact code
- Take advantage of target features such as addressing modes
- Usually viewed as a pattern matching problem
  - ad hoc methods, pattern matching, dynamic programming

The Back End

Register Allocation
- Have each value in a register when it is used
- Manage a limited set of resources
- Can change instruction choices & insert LOADs & STOREs
- Optimal allocation is NP-Complete (1 or k registers)

Compilers approximate solutions to NP-Complete problems
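One common way to approximate register allocation is greedy coloring of an interference graph, sketched here. This is a generic heuristic offered as an illustration, not a method named on the slide; the graph, the function name color_registers, and the most-constrained-first ordering are all assumptions.

```python
def color_registers(interference, k):
    """Greedily assign one of k registers to each value; spill when none is free.

    interference maps each value to the set of values live at the same time.
    """
    assignment = {}
    spilled = []
    # Heuristic order: most-constrained values first, ties broken by name.
    for value in sorted(interference, key=lambda v: (-len(interference[v]), v)):
        taken = {assignment[n] for n in interference[value] if n in assignment}
        free = [r for r in range(k) if r not in taken]
        if free:
            assignment[value] = free[0]
        else:
            spilled.append(value)   # would need a LOAD/STORE around each use
    return assignment, spilled


# Hypothetical interference graph: a, b, c are all live together; d overlaps only c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
two_regs = color_registers(graph, 2)
three_regs = color_registers(graph, 3)
print(two_regs)
print(three_regs)
```

With two registers one value has to be spilled; with three, every value fits. The greedy order gives no optimality guarantee, which is the trade-off the slide refers to.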

The Back End

Instruction Scheduling
- Avoid hardware stalls and interlocks
- Use all functional units productively
- Can increase lifetime of variables (changing the allocation)

Optimal scheduling is NP-Complete in nearly all cases

Heuristic techniques are well developed

Cousins of the compiler

PREPROCESSORS take raw source code and produce the input actually read by the compiler
- MACRO PROCESSING: macro calls need to be replaced by the correct text
  - Macros can be used to define a constant used in many places. E.g. #define BUFSIZE 100 in C
  - Also useful as shorthand for often-repeated expressions:
      #define DEG_TO_RADIANS(x) ((x)/180.0*M_PI)
      #define ARRAY(a,i,j,ncols) ((a)[(i)*(ncols)+(j)])
- FILE INCLUSION: included files (e.g. using #include in C) need to be expanded

Cousins of the compiler

ASSEMBLERS take assembly code and convert it to machine code.

Some compilers go directly to machine code; others produce assembly code then call a separate assembler.

Either way, the output machine code is usually RELOCATABLE, with memory addresses starting at location 0.

Cousins of the compiler

LOADERS take relocatable machine code and alter the addresses, putting the instructions and data in a particular location in memory.

The LINK EDITOR (part of the loader) pieces together a complete program from several independently compiled parts.

Compiler writing tools

We’ve come a long way since the 1950s.
- SCANNER GENERATORS produce lexical analyzers automatically.
  - Input: a specification of the tokens of a language (usually written as regular expressions)
  - Output: C code to break the source language into tokens.
- PARSER GENERATORS produce syntactic analyzers automatically.
  - Input: a specification of the language syntax (usually written as a context-free grammar)
  - Output: C code to build the syntax tree from the token sequence.
- There are also automated systems for code synthesis.