

Compiler1

Chapter V: Compiler

Overview: To study the design and operation of compilers for high-level programming languages.

Contents

Basic compiler (one-pass compiler) functions

Machine-dependent extensions (object-code generation and code optimization)

Compiler design alternatives: multi-pass compilers, interpreters, p-code compilers, and compiler-compilers.

Compiler2

Example

Basic compiler functions

Compiler3

Basic compiler functions (cont.)

Source program: regard each statement as a sequence of tokens.

The task of scanning the source statement, recognizing and classifying the various tokens, is known as lexical analysis. The part of the compiler that performs it is the scanner.

Each group of tokens is then recognized as some language construct described by the grammar. This process is called syntactic analysis, or parsing, and is performed by the parser.

Generation of object code.

Compiler4

Compilation process

Scanning (lexical analysis)

Parsing (syntactic analysis)

Code generation

Note: all three steps can be accomplished in a single pass over the source program.

Compiler5

Grammars

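A grammar for a programming language is a formal description of the syntax of programs and individual statements written in the language.

Syntax and semantics are different things. E.g.,

I := J + K
X := Y + I

where X, Y : Real and I, J, K : Integer.

The two statements have identical syntax. However, their semantics are quite different: the first is an integer addition, while the second is a floating-point addition in which I must first be converted to a real value.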
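As a rough sketch of how identical syntax can call for different object code, the fragment below picks an addition from the declared types. The type table mirrors the declarations in the example; the helper name and the operation descriptions are illustrative assumptions, not something taken from the slides.

```python
# Declared types from the example: X, Y are Real; I, J, K are Integer.
SYMBOL_TYPES = {"X": "real", "Y": "real",
                "I": "integer", "J": "integer", "K": "integer"}

def select_addition(left, right):
    """Describe the addition needed for `left + right` (hypothetical helper)."""
    if SYMBOL_TYPES[left] == "integer" and SYMBOL_TYPES[right] == "integer":
        return "integer add"
    # At least one operand is real: integer operands are converted first.
    converted = [name for name in (left, right)
                 if SYMBOL_TYPES[name] == "integer"]
    if converted:
        return "convert " + ", ".join(converted) + " to real, then floating-point add"
    return "floating-point add"

print(select_addition("J", "K"))  # I := J + K
print(select_addition("Y", "I"))  # X := Y + I
```

Both calls describe a statement of the form id := id + id, yet the selected operations differ.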

Compiler6

Grammars (cont.)

BNF (Backus-Naur Form) is a notation for describing syntax. It is simple and widely used, and it provides capabilities that are sufficient for most purposes.

BNF consists of a set of rules, each of which defines the syntax of some construct in the programming language. E.g.,

<read> ::= READ ( <id-list> )

Compiler7

Grammars (cont.)

<read> ::= READ ( <id-list> )
<id-list> ::= id | <id-list> , id

Character strings enclosed between < and > are called nonterminal symbols.

Character strings not enclosed between < and > are called terminal symbols (i.e., tokens).

E.g., READ(value, sum, x, y)
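A minimal sketch of checking a statement against the two rules for <read> and <id-list>. It assumes that an id is a letter followed by letters or digits, and the function names are hypothetical. The left-recursive <id-list> rule is handled iteratively here, as identifiers separated by commas, which accepts the same strings.

```python
import re

# One token: an identifier/keyword, or one of the punctuation marks ( ) ,
TOKEN_RE = re.compile(r"\s*([A-Za-z][A-Za-z0-9]*|[(),])")

def tokenize(text):
    """Split text into identifier and punctuation tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def is_read_statement(text):
    """Check text against <read> ::= READ ( <id-list> )."""
    try:
        tokens = tokenize(text)
    except ValueError:
        return False
    if (len(tokens) < 4 or tokens[0] != "READ"
            or tokens[1] != "(" or tokens[-1] != ")"):
        return False
    inner = tokens[2:-1]
    # <id-list> ::= id | <id-list> , id  means: id , id , ... , id
    if len(inner) % 2 == 0:
        return False
    for index, token in enumerate(inner):
        if index % 2 == 0:
            if not token[0].isalpha():   # expected an id here
                return False
        elif token != ",":               # expected a comma here
            return False
    return True

print(is_read_statement("READ(value, sum, x, y)"))
print(is_read_statement("READ(value, , x)"))
```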

Compiler8

Simplified Pascal grammar

Compiler9

Simplified Pascal grammar (cont.)

Compiler10

Simplified Pascal grammar (cont.)

It is often convenient to display the analysis of a source statement in terms of a grammar as a tree (a parse tree or syntax tree).

Compiler11

The parse tree for VARIANCE := SUMSQ DIV 100 - MEAN * MEAN

Compiler12

Grammars (cont.)

Draw the parse tree for ALPHA - BETA * GAMMA

If there is more than one possible parse tree for a given statement, the grammar is said to be ambiguous.

An ambiguous grammar leaves doubt about what object code should be generated.
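To make the cost of ambiguity concrete, here is a small sketch that writes two candidate parse trees for ALPHA - BETA * GAMMA as nested tuples and evaluates both. The tuple encoding, the function name, and the sample values are assumptions chosen only for illustration.

```python
def evaluate(tree, values):
    """Evaluate a tree given as a name or an (operator, left, right) tuple."""
    if isinstance(tree, str):
        return values[tree]
    operator, left, right = tree
    a, b = evaluate(left, values), evaluate(right, values)
    return a - b if operator == "-" else a * b

# Two possible groupings of ALPHA - BETA * GAMMA
multiply_first = ("-", "ALPHA", ("*", "BETA", "GAMMA"))
subtract_first = ("*", ("-", "ALPHA", "BETA"), "GAMMA")

values = {"ALPHA": 10, "BETA": 2, "GAMMA": 3}  # arbitrary sample values
print(evaluate(multiply_first, values))  # 10 - (2 * 3) = 4
print(evaluate(subtract_first, values))  # (10 - 2) * 3 = 24
```

The two trees produce different results, so a compiler could not know which code to generate. A grammar that gives multiplication higher precedence than subtraction would permit only the first tree.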


Compiler15

Lexical analysis (scanning)

Lexical analysis scans the program to be compiled and recognizes the tokens that make up the source statements.

Scanners are usually designed to recognize keywords, operators, identifiers, integers, floating-point numbers, character strings, etc.

Identifiers might be defined by the rules:

<ident> ::= <letter> | <ident> <letter> | <ident> <digit>
<letter> ::= A | B | C | D | … | Z
<digit> ::= 0 | 1 | 2 | 3 | … | 9
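The scanning step can be sketched with Python's `re` module. This is only an illustration: the keyword set and the operator list are assumed subsets, the token type names are made up for the example, and floating-point numbers and character strings are left out. The identifier pattern follows the <ident> rules above.

```python
import re

KEYWORDS = {"PROGRAM", "VAR", "BEGIN", "END", "READ", "WRITE", "DIV"}  # assumed subset

TOKEN_SPEC = [
    ("integer", r"[0-9]+"),
    ("ident", r"[A-Z][A-Z0-9]*"),    # a letter followed by letters or digits
    ("operator", r":=|[+\-*(),;]"),  # assumed operator/punctuation set
    ("skip", r"\s+"),                # blanks between tokens
]
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Return a list of (token type, text) pairs for the source string."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER_RE.match(source, pos)
        if match is None:
            raise ValueError(f"illegal character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        if kind == "ident" and text in KEYWORDS:
            kind = "keyword"         # keywords look like identifiers lexically
        if kind != "skip":
            tokens.append((kind, text))
        pos = match.end()
    return tokens

print(scan("VARIANCE := SUMSQ DIV 100 - MEAN * MEAN"))
```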

Compiler16

Token coding scheme

Compiler17

Lexical scan

Compiler18

The lexical scanning

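The scanner must deal with language-specific cases such as the following FORTRAN examples.

DO 10 I = 1, 100
DO 10 I = 1

FORTRAN ignores blanks in a statement, so the first line is a DO statement while the second is an assignment to the variable DO10I.

IF (THEN .EQ. ELSE) THEN
   IF = THEN
ELSE
   THEN = IF
ENDIF

Keywords are not reserved words in FORTRAN, so IF, THEN, and ELSE may also be used as identifiers.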
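The DO example shows why a FORTRAN scanner may need to look ahead before classifying a token. The sketch below is a simplified heuristic, not a full FORTRAN scanner: it removes blanks and then looks for a comma outside parentheses after the equals sign. The function name and the returned descriptions are hypothetical.

```python
import re

def classify_do_statement(statement):
    """Classify a FORTRAN statement that begins with DO (simplified sketch)."""
    compact = statement.replace(" ", "").upper()  # blanks are not significant
    match = re.match(r"DO(\d+)([A-Z][A-Z0-9]*)=(.+)$", compact)
    if match is None:
        return "other"
    label, variable, rest = match.groups()
    depth = 0
    for char in rest:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        elif char == "," and depth == 0:
            # A top-level comma after "=" marks a DO loop header.
            return f"DO loop: label {label}, index {variable}"
    return f"assignment to DO{label}{variable}"

print(classify_do_statement("DO 10 I = 1, 100"))
print(classify_do_statement("DO 10 I = 1"))
```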

A number of tools have been developed for automatically constructing lexical scanners from specifications stated in a special-purpose language.

Compiler19

Modeling Scanners as Finite Automata

The tokens of most programming languages can be recognized by a finite automaton.

A finite automaton has a starting state and one or more final states. If the automaton stops in a final state, we say that it recognizes (or accepts) the string being scanned; otherwise, it fails to recognize the string.

Compiler20

Modeling Scanners as Finite Automata (cont.)

Compiler21

Modeling Scanners as Finite Automata (cont.)

Compiler22

Modeling Scanners as Finite Automata (cont.)

Compiler23

Modeling Scanners as Finite Automata (cont.)

Compiler24

The implementation of finite automata

Using algorithmic code (for Fig. 5.8(b))
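The figure is not reproduced in this text, so the sketch below does not claim to be its automaton. It assumes a simpler two-state automaton for identifiers (an uppercase letter followed by letters or digits) and shows what hand-written algorithmic code for such an automaton might look like. The state numbering and the function name are assumptions.

```python
def recognize_identifier(text):
    """Hand-coded automaton: state 1 = start, state 2 = in identifier (final)."""
    state = 1
    for char in text:
        if state == 1 and "A" <= char <= "Z":
            state = 2
        elif state == 2 and ("A" <= char <= "Z" or "0" <= char <= "9"):
            state = 2
        else:
            return False   # no transition defined: the string is rejected
    return state == 2      # accept only if we stopped in the final state

print(recognize_identifier("SUMSQ"))
print(recognize_identifier("1X"))
```

In this style the transitions are written directly as program logic, so changing the automaton means changing the code.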

Compiler25

The implementation of finite automata (cont.)

Using a tabular representation
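A tabular representation might look like the following sketch. It assumes a two-state automaton for identifiers (an uppercase letter followed by letters or digits); the table layout, state numbers, and names are illustrative, not taken from the slides.

```python
def char_class(char):
    """Map an input character to a column of the transition table."""
    if "A" <= char <= "Z":
        return "letter"
    if "0" <= char <= "9":
        return "digit"
    return "other"

# Rows are states, columns are character classes.
# A missing entry means no transition is defined.
TRANSITIONS = {
    1: {"letter": 2},
    2: {"letter": 2, "digit": 2},
}
START_STATE = 1
FINAL_STATES = {2}

def accepts(text):
    """Run the table-driven automaton over text."""
    state = START_STATE
    for char in text:
        state = TRANSITIONS.get(state, {}).get(char_class(char))
        if state is None:
            return False          # no transition: reject
    return state in FINAL_STATES  # accept only in a final state

print(accepts("MEAN"))
print(accepts("9LIVES"))
```

Here the driver loop stays the same for any automaton; only the table, the start state, and the set of final states change.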