Upload
john-bautista
View
214
Download
0
Embed Size (px)
Citation preview
8/7/2019 2_SimpleOnePassCompiler
1/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 1/66
2. A SIMPLE ONE-PASS COMPILERInfix expression -> Postfix expression
Yoon-Joong KimComputer Engineering Department, Hanbat National University
8/7/2019 2_SimpleOnePassCompiler
2/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 2/66
2.1 Overview :
Language: Appearance(Vocabulay,Syntax), Semantics
2.2 Syntax definition
Alphabet, string(sentence), language, BNF, Context Free Grammar, Parse Tree, Ambiguity,
Associativity and precedence of operator, Syntax of assignment statement.
2.3 Syntax-directed translation :
SDD, SDTS
2.4 Parsing :Top-Down Parsing, Predictive Parsing(Recursive Decent Parsing), Left factoring, Elimination of left-recursion
2.5 A translator for simple expression : Design and implementation
2.6 Lexical analysis
Design and implementation of as simple lexical analyzer
2.7 Incorporating a symbol table
2.8 A abstract stack machine
Stack machine operation, Lvalue, Rvalue, Stack manipulation, Translation of expression,
Translation of statements. Emitting a translation
8/7/2019 2_SimpleOnePassCompiler
3/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 3/66
2.9 Putting the techniques together
Design and implementation of a simple one pass translator, infix expression to postfix expression
8/7/2019 2_SimpleOnePassCompiler
4/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 4/66
This chapter
is an introduction to the material in Chapter 3 through 8. concentrates on the front end of compiler, that is, on lexical analysis, parsing, and
intermediate code generation.
2.1 OVERVIEWLanguage Definition
- Appearance of programming language :
Vocabulary : Regular expressionSyntax : Backus-Naur Form(BNF) or Context Free Form(CFG)
- Semantics : Informal language or some examples
l Fig 2.1. Structure of our compiler front end
8/7/2019 2_SimpleOnePassCompiler
5/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 5/66
2.2 SYNTAX DEFINITION
lTo specify the syntax of a language : CFG and BNF Example : if-else statement in C has the form of
statement if ( expression ) statement else statementAn alphabet of a language is a set of symbols.
Examples : {0,1} for a binary number system(language)={0,1,100,101,...}
{a,b,c} for language={a,b,c, ac,abcc..}
{if ,(,),else ...} for a if statements={if(a==1)goto10, if--}A string over an alphabet
is a sequence of zero or more symbols from the alphabet.
Examples : 0,1,10,00,11,111,0202 ... strings for a alphabet {0,1}
Null string is a string which does not have any symbol of alphabet.
Language
Is a subset of all the strings over a given alphabet. Alphabets Ai Languages Li for Ai
A0={0,1} L0={0,1,100,101,...}
8/7/2019 2_SimpleOnePassCompiler
6/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 6/66
A1={a,b ,c} L1={a,b,c, ac, abcc..}
A2={all of C tokens} L2= {all sentences of C program }
Example 2.1. Grammar for expressions consisting of digits and plus and minus
signs.
Language of expressions L={9-5+2, 3-1, ...}
The productions of grammar for this language L are:
list list + digit
list list - digit
list digit
digit 0|1|2|3|4|5|6|7|8|9
list, digit : Grammar variables, Grammar symbols
{0,1,2,3,4,5,6,7,8,9,-,+} : Alphabet for language L, Tokens, Terminal symbolsConvention specifying grammar
Terminal symbols : bold face string i f , num, id Nonterminal symbol, grammar symbol : italicized names, list, digit ,A,B
8/7/2019 2_SimpleOnePassCompiler
7/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 7/66
Grammar G=(N,T,P,S)
N : a set of nonterminal symbols
T : a set of terminal symbols, tokens
P : a set of production rules
S : a start symbol, SN
Grammar G for a language L={9-5+2, 3-1, ...}
G=(N,T,P,S)
N={list,digit}T={0,1,2,3,4,5,6,7,8,9,-,+} : is a kind of alphabet for L
P : list -> list + digit list -> list - digit list -> digit
digit -> 0 |1 |2 |3 |4 |5 |6 |7 |8 |9S= list
8/7/2019 2_SimpleOnePassCompiler
8/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 8/66
Some definitions for a language L and its grammar G
Derivation :A sequence of replacements S 1 2n is a derivation of n.
Example, A derivation 1+9 from the grammar G
left most derivation list list + digit digit + digit 1 + digit 1 + 9
right most derivation list list + digit list + 9 digit + 9 1 + 9
Language of grammar L(G)L(G) is a set of sentences(strings) that can be generated from the grammar G=(N,T,P,S).
L(G)={x| S *
x} where x a sequence of terminal symbols
Example: Consider a grammar G=(N,T,P,S):
N={S} T={a,b}
S=S P ={S aSb | }
Is aabb a sentence of L(g)? (derivation of string aabb)SaSbaaSbbaabbaabb(or S* aabb) so, aabbL(G)
there is no derivation for aa, so aaL(G) note L(G)={anbn| n0} where anbn meas n as followed by n bs.
8/7/2019 2_SimpleOnePassCompiler
9/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 9/66Parse TreeA derivation can be conveniently represented by a derivation tree( parse tree).
The root is labeled by the start symbol.
Each leaf is labeled by a token or .
Each interior nodes are labeled by a nonterminal symbol.
When a production Ax 1 xn is derived, nodes labeled by x 1 xn are made as children
nodes of node labeled by A.
root : the start symbol internal nodes : nonterminal leaf nodes : terminal
8/7/2019 2_SimpleOnePassCompiler
10/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 10/66
Example G:
list -> list + digit | list - digit | digit
digit -> 0|1|2|3|4|5|6|7|8|9
left most derivation for 9-5+2,list list+digit list-digit+digit digit-digit+digit 9-digit+digit
9-5+digit 9-5+2 right most derivation for 9-5+2,
list list+digit list+2 list-digit+2 list-5+2 digit-5+2 9-5+2
parse tree for 9-5+2
8/7/2019 2_SimpleOnePassCompiler
11/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 11/66AmbiguityA grammar is said to be ambiguous if the grammar has more than one parsetree for a given string of tokens.
Example 2.5. Suppose a grammar G that can not distinguish between lists anddigits as in Example 2.1.
G : string string + string | string - string |0|1|2|3|4|5|6|7|8|9
1-5+2 has 2 parse trees => Grammar G is ambiguous.
8/7/2019 2_SimpleOnePassCompiler
12/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 12/66Associativity of operatorA operator is said to be left associative if an operand with operators on both sides of it is taken bythe operator to its left.
eg) 9+5+2(9+5)+2, a=b=ca=(b=c)
Left Associative Grammar :
list list + digit | list - digit
digit 0|1||9
Right Associative Grammar :
right letter = right | letter letter a|b||z
left associative operators :+ , - , * , /
right associative operators := , **
8/7/2019 2_SimpleOnePassCompiler
13/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 13/66Precedence of operatorsWe say that a operator(*) has higher precedence than other operator(+) if the operator(*) takesoperands before other operator(+) does.
ex. 9+5*29+(5*2), 9*5+2(9*5)+2
Syntax of full expressions
operator associative precedence+ , - left 1 low
* , / left 2 heigh
left a. left a. precedence/depth
expr expr + term | expr - term | term 1
term term * factor | term / factor | factor 2
factor digit | ( expr ) 3digit 0 | 1 | | 9 4
8/7/2019 2_SimpleOnePassCompiler
14/66
8/7/2019 2_SimpleOnePassCompiler
15/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 15/662.3 SYNTAX-DIRECTED TRANSLATION(SDT)A formalism for specifying translations for programming language constructs.
( attributes of a construct: type, string, location, etc)
Syntax directed definition(SDD) for the translation of constructs Syntax directed translation scheme(SDTS) for specifying translation
Postfix notation for an (infix) expression E If E is a variable or constant, then the postfix nation for E is E itself ( E.tE ).
if E is an expression of the form E1 op E2 where op is a binary operator
E1' is the postfix of E1, E2' is the postfix of E2 then E1' E2' op is the postfix for E1 op E2
if E is (E1), and E 1' is a postfix
then E1' is the postfix for E
eg) 9 - 5 + 2 9 5 - 2 +
9 - (5 + 2) 9 5 2 + -
8/7/2019 2_SimpleOnePassCompiler
16/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 16/66Syntax-Directed Definition(SDD) for translationSDD is a set of semantic rules predefined for each productions respectively fortranslation.
A translation is an input-output mapping procedure for translation of an input X,
construct a parse tree for X.
synthesize attributes in order of depth-first traversal over the parse tree.
Suppose a node n in parse tree is labeled by X and X.a denotes the value of attribute aof X at that node.
compute X's attributes X.a using the semantic rules associated with X.
Depth-First Traversalsprocedure visit(n: node)
beginfo r each child m of n, from left to right do visit(m);
evaluate semantic rules at node nend
8/7/2019 2_SimpleOnePassCompiler
17/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 17/66
An example of translating input X=9-5+2 to postfix expression (synthesized attributes for input X=9-5+2)
1. produce parse tree for the input X=9-5+22. evaluate attribute of node according to the dept-first order based on the SDD.
8/7/2019 2_SimpleOnePassCompiler
18/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 18/66Syntax-directed Translation Schemes(SDTS) A translation scheme is a context-free grammar in which program fragments called
translation actions are embedded within the right sides of the production.
productions(infix) SDD for postfix notation from infix SDTS for postfix notation from infixlist list + term list.t = list.t || term.t || "+" list list + term {print("+")}
{print("+");} : translation(semantic) action.
1. Design translation schemes(SDTS) for translation
2. Translate :a) parse the input string x andb) emit(execute) the action result encountered during the depth-first traversal of parse
tree.
8/7/2019 2_SimpleOnePassCompiler
19/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 19/66
Example 2.8.
SDD vs. SDTS for infix to postfix translation.
productions SDD SDTSexpr list + termexpr list + termexpr termterm 0term 1
term 9
expr.t = list.t || term.t || "+"expr.t = list.t || term.t || "-"expr.t = term.tterm.t = "0"term.t = "1"
term.t = "9"
expr list + term printf{"+")}expr list + term printf{"-")}expr termterm 0 printf{"0")}term 1 printf{"1")}
term 9 printf{"0")}
An Example of an Action translation for input 9-5+2
1) Parse.2) Translate.Do we have to maintain the whole parsetree ?No, Semantic actions are performed
during parsing, and we don't need thenodes (whose semantic actions done) anymore.
8/7/2019 2_SimpleOnePassCompiler
20/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 20/662.4 PARSINGif token string x L(G), then parse tree
else error message
Top-Down parsing1. At node n labeled with nonterminal A, select one of the productions
whose left part is A and construct children of node n with the symbols on the right side
of that production.
2. Find the next node at which a sub-tree is to be constructed.
ex. G: type simple
id array [ simple ] of type simple integer char num dotdot num
8/7/2019 2_SimpleOnePassCompiler
21/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 21/66
An example of top-down parsing for input x=array [ num dotdot num ] of integer
8/7/2019 2_SimpleOnePassCompiler
22/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 22/66
type simple
id array [ simple ] of type simple integer char num dotdot num
The selection of production for a nonterminal may involve trial-and-error.
=> backtracking
8/7/2019 2_SimpleOnePassCompiler
23/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 23/66Procedure of top-down pars inglet a pointed grammar symbol and pointed input symbol be g, a respectively. if( g N )
then select a production P whose left part equals to g next to current production.( select
expand derivation with that production P.
else if( g = a ) then make g and a be a symbol next to current symbol.else if( g a ) back tracking
let the pointed input symbol a be the symbol that moves back to steps same with thenumber of current symbols of underlying production
eliminate the right side symbols of current production and let the pointed symbol g be theleft side symbol of current production.
8/7/2019 2_SimpleOnePassCompiler
24/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 24/66
An example of top-down parsing for grammarG : { S->aSb | c | ab }, is acb , aabbL(G)?
S/acbaSb/acbaSb/acb aaSbb/acb X(SaSb) move (SaSb) backtracking
aSb/acbacb/acbacb/acbacb/acb(sc) move move
so, acb L(G)It is finished in 7 steps including one backtracking.
S/aabbaSb/aabbaSb/aabbaaSbb/aabbaaSbb/aabbaaaSbbb/aabb X(SaSb) move (SaSb) move (SaSb) backtracking
aaSbb/aabbaacbb/aabb X(Sc) backtrackingaaSbb/aabbaaabbb/aabb X
(Sab) backtrackingaaSbb/aabb X
backtrackingaSb/aabbacb/aabb
(Sc) bactracking aSb/aabbaabb/aabbaabb/aabbaabb/aabbaaba/aabb
(Sab) move move moveso, aabbL(G)but process is too difficult. It needs 18 steps including 5 backtrackings.
8/7/2019 2_SimpleOnePassCompiler
25/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 25/66Predictive parsing (Recursive Decent Parsing, RDP) A strategy for the general top-down parsing
Guess a production, see if it matches, if not, backtrack and try another.
It may fail to recognize correct string in some grammar G and is tedious in processing.
Predictive parsing
is a kind of top-down parsing that
predicts a production whose derived terminal symbol is equal to next input symbol whileexpanding in top-down paring. without backtracking. Procedure decent parser is a kind of predictive parser that is implemented by disjoint
recursive procedures one procedure for each nonterminal, the procedures are patternedafter the productions.
8/7/2019 2_SimpleOnePassCompiler
26/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 26/66Procedure of predict ive parsing(RDP)let a pointed grammar symbol and pointed input symbol be g, a respectively. if( g N )
select next production P whose left symbol equals to g and a set of first terminalsymbols of derivation from the right symbols of the production P includes a inputsymbol a.(select
expand derivation with that production P. else if( g = a ) then make g and a be a symbol next to current symbol. else if( g a ) error
=> RDP requires Left Factoring and Elimination of Left-recursion for grammar
G: A a, or
A a 1 1 | a22 | | an n |
for an d
8/7/2019 2_SimpleOnePassCompiler
27/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 27/66
G : { SaSb | c | ab }According to predictive parsing procedure, acb , aabbL(G)?
S/acb confused in { SaSb, Sab } so, a predictive parser requires some restriction in grammar, that is, there should be only
one production whose left part of productions are A and each first terminal symbol of
those productions have unique terminal symbol.
so G => G1 : { S->aS' | c S'->Sb | ab }
Requirements for a grammar to be suitable for RDP: For each nonterminal either
1. A a, or2. A a11 | a22 | | ann |
for and A may also occur
if none of ai can follow A in a derivation and if we have A
If the grammar is suitable, we can parse efficiently without backtrack.
General top-down parser with backtracking
Recursive(Recursive) Descent Parser without backtracking
8/7/2019 2_SimpleOnePassCompiler
28/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 28/66
An Example of a Recursive Predictive Parser(RDP) for grammarG : type simple | id | array [ simple ] of typesimple
integer | char | mun dotdot mun procedure match(t: token);
beginif lookahead = t then lookahead = nexttoken
else errorend
procedure type; //cf type simple | id | array [ simple ] of typebegin
if lookahead is in {integer, real, num} then simple //FIRST(simple)={integer,char,num}else if( lookahead = '' then begin match(''); match(id) end //FIRST(id) ={}else if( lookahead = array then begin //FIRST(array [ simple ] of type)={array}
match(array);match('[');simple; match(']');match(of); type endelse error
end procedure simple; //cf simple -> integer | char | mun dotdot mun
beginif lookahead = integer then match(integer)else if lookahead = char then match(char)else if lookahead = num then begin match(num); match(dotdot); match(num) endelse error end;
8/7/2019 2_SimpleOnePassCompiler
29/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 29/66Left FactoringIf a grammar contains two productions of form
S a and S a
it is not suitable for top down parsing without backtracking. Troubles of this form can
sometimes be removed from the grammar by a technique called the left factoring.
In the left factoring, we replace { S a, S a } by{ S aS', S' , S' } (Hopefully and start with different symbols)cf. S= a|a=a(|) = aS', S'=|
left factoring for G { SaSb | c | ab }SaS' | c cf. S=aSb | ab | c = a ( Sb | b) | c ) = a S' | c, S'=Sb|bS'Sb | b
A concrete example:
IF THEN |
IF THEN ELSE
is transformed into IF THEN S'
S' ELSE |
8/7/2019 2_SimpleOnePassCompiler
30/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 30/66
Example,
for G : { SaSb | c | ab }
According to predictive parsing procedure, acb , aabbL(G)?
S/aabb unable to choose { SaSb, Sab ?} According for the feft factored gtrammar G1, acb , aabbL(G)?
G1 : { SaS'|c S'Sb|b}
8/7/2019 2_SimpleOnePassCompiler
31/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 31/66Left RecursionAn Example of RDP for a left-recursive Grammar
G: SSa|b, input x=baaS/baaSa/baa Saa/baa Saaa/baa .... (SSa) (SSa) (SSa)
An Example of RDP for a right-recursive GrammarG1 : SbS ' S'aS'|, L(G)=L(G1)S/baabS'/baabS'/baabaS'/baabaS'/baabaaS'/baabaaS'/baabaa/baabaa/baa
(SbS') (move) (S'aS') (move) (S'aS') (move) (S')
A grammar is left recursive iff it contains a nonterminal A, such thatA
+A, where is any string.
Grammar {S S | c} is left recursive because of SS
Grammar {S A, A Sb | c} is also left recursive because of SA Sb
If a grammar is left recursive, you cannot build a predictive top down parser for it.
8/7/2019 2_SimpleOnePassCompiler
32/66
8/7/2019 2_SimpleOnePassCompiler
33/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 33/66
In general, the rule is that
If A A 1 | A2 | | A n and
A 1 | 2 | | m (no i's start with A),then, replace by
A 1R | 2R| | mR and
R 1R | 2R | | nR |
Exercise: Remove the left recursion in the following grammar:
expr expr + term | expr - termexpr term
solution: expr term rest rest + term rest | - term rest |
8/7/2019 2_SimpleOnePassCompiler
34/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 34/662.5 A TRANSLATOR FOR SIMPLE EXPRESSIONSConvert infix into postfix(polish notation) using SDT.Abstract syntax (annotated parse tree) tree vs. Concrete syntax tree
Concrete syntax tree : parse tree Fig 2.2. p28.
Abstract syntax tree, syntax tree : Fig 2.20. Concrete syntax : underlying grammar
8/7/2019 2_SimpleOnePassCompiler
35/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 35/66Adapting the Translation Scheme Embed the semantic action in the production
Design a translation scheme Left recursion elimination and Left factoring
Example, translate infix expression to post expression
Infix expression system(language)
L={0+1, 2-(5+1), ..}
G=({expr,term},{+,-,0,19},P,expr)
P ={expr expr+ term
expr expr- term
expr term
term 0
term 1
term 9
}
postfix expression system(language)
L'={0 1 +, 2 5 1 + -, ...}
G'=({expr,term},{+,-,0,19},P',expr)
P' = {expr expr term +
expr expr term -
expr term
term 0
term 1
term 9
}
8/7/2019 2_SimpleOnePassCompiler
36/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 36/66Example of translator design and executionA translation scheme and with left-recursion.
Fig 2.19. Initial specification for infix-to-postfix translator with left recursion eliminatedexpr expr + term {printf{"+")}expr expr - term {printf{"-")}expr termterm 0 {printf{"0")}term 1 {printf{"1")}
term 9 {printf{"0")}
expr term restrest + term {printf{"+")} restrest - term {printf{"-")} restrest term 0 {printf{"0")}term 1 {printf{"1")}
term 9 {printf{"0")}
8/7/2019 2_SimpleOnePassCompiler
37/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 37/66Procedure(RDP) for the Nonterminal expr, term, and restm a t c h ( t : t k o e n )
{
i f ( l o o k h e a d = t )
l o o k a h e a d l e x a n ( ) ;
e l s e e r r o r ( ) ;
}
expr term restrest + term {printf{"+")} rest
rest - term {printf{"-")} restrest term 0 {printf{"0")}term 1 {printf{"1")}
term 9 {printf{"0")}
8/7/2019 2_SimpleOnePassCompiler
38/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 38/66
Optimizer and Translator1. expr() {
2. term(); rest();3. }
4. rest() rest()
5. { {
6. if(lookahead == '+' ) { L: if(lookahead == '+' ) {
7. m('+'); term(); p('+'); rest(); m('+'); term(); p('+'); goto L;
8. } else if(lookahead == '-' ) { } else if(lookahead == '-' ) {
9. m('-'); term(); p('-'); rest(); m('-'); term(); p('-'); goto L;
10. } else ; } else ;11. } }
12. expr() {
13. term();
14. while(1) {
15. if(lookahead == '+' ) {
16. m('+'); term(); p('+');
17. } else if(lookahead == '-' ) {
18. m('-'); term(); p('-');19. } else break;
20. }
8/7/2019 2_SimpleOnePassCompiler
39/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 39/662.6 LEXICAL ANALYSISreads and converts the input into a stream of tokens to be analyzed by parser.
lexeme : a sequence of characters which comprises a single token.
Lexical Analyzer Lexeme / Token Parser
Removal of White Space and CommentsRemove white space(blank, tab, new line etc.) and commen ts
ContsantsConstants: consider only integers for a while.
eg) for input 31 + 28, output : ?input : 31 + 28output:
tokenattribute, value(or lexeme)
8/7/2019 2_SimpleOnePassCompiler
40/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 40/66RecognizingIdentifiers Identifiers are names of variables, arrays, functions... A grammar treats an identifier as a token.
eg) input : count = count + increment;output : ;Symbol table
tokens attributes(lexeme)0
12
3
idid
countincrement
Keywords are reserved, i.e., they cannot be used as identifiers.Then a character string forms an identifier only if it is no a keyword.
punctuation symbols
operators : + - * / := < >
8/7/2019 2_SimpleOnePassCompiler
41/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 41/66Interface to lexical analyzer
8/7/2019 2_SimpleOnePassCompiler
42/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 42/66A Lexical Analyzer
c=getchcar(); ungetc(c,stdin);
token representation
#define NUM 256
Function lexan()eg) input string 76 + ainput , output(returned value)76 token=NUM tokenval=76 (integer)
+ token=+a toekn=id, tokeval="a"
8/7/2019 2_SimpleOnePassCompiler
43/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 43/66
A way that parser handles the token NUM returned by laxan()
consider a translation scheme
factor ( expr )| num { print(num.value) }
in the RDP for the above production.
#define NUM 256
...
factor() {
if(lookahead == '(' ) {match('('); exor(); match(')');
} else if (lookahead == NUM) {
printf(" %f ",tokenval); match(NUM);
} else error();
}
8/7/2019 2_SimpleOnePassCompiler
44/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 44/66The implementation of function lexan for simple alphabet={integer numbers, +,-,*,/...}
1 ) # include 2) #include
3) in t lino = 1;4) int tokenval = NONE;5) int lexan() {6) int t;7) while(1) {8) t = getchar();9) if ( t==' ' || t=='\t' ) ;10) else if ( t=='\n' ) lineno +=1;11) else if (isdigit(t)) {12) tokenval = t -'0';
13) t = getchar();14) while ( isdigit(t)) {15 ) tokenval = tokenval*10 + t - '0';16) t =getchar();17) }18) ungetc(t,stdin);19) return NUM;20) } else {21 ) tokenval = NONE;22) return t; //+,-,...23) }24) }25) }
8/7/2019 2_SimpleOnePassCompiler
45/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 45/66
2.7 INCORPORATION A SYMBOL TABLE
The symbol table interfaces, operation, usually called by parser. insert(s,t): input s: lexeme
t: tokenoutput index of new entry
lookup(s): input s: lexemeoutput index of the entry for string s,
or 0 if s is not found in the symbol table.Handling reserved keywords
1. Inserts all keywords in the symbol table in advance.ex) insert("div", div)
token lexeme
insert("mod", mod)
2. while parsing
whenever an identifier s is encountered.
if (lookup(s)'s token is in {keywords} ) s is for a keyword;else s is for a identifier;
8/7/2019 2_SimpleOnePassCompiler
46/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 46/66
example
preset
insert("div",d iv );insert("mod",m od ); while parsing
count = i + i + div
lookup("count")=>0 insert("count",id );lookup("i") =>0 insert("i",id );lookup("i") =>4, idllokup("div")=>1,d iv
8/7/2019 2_SimpleOnePassCompiler
47/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 47/662.8 ABSTRACT STACK MACHINE An abstract machine is for intermediate code generation/execution.
Instruction classes: arithmetic / stack manipulation / control flow3 components of abstract stack machine1) Instruction memory : abstract machine code, intermediate code(instruction)2) Stack 3)Data memory
An example of stack machine operation.
for a input a=11; b=7; (5+a)*b; 1) lexical analysis and parsed(5 a + b *) =>2) intermediate codes : push 5, rvalue 2, +, rvalue 3, *
instruction memory1 push 5 pc2 rvalue 23 +
4 rvalue 3
5 *
Data memory
1 0
2 11 a
3 7 b
Stack
1 5
2 115
3 16
4 716
5 112
8/7/2019 2_SimpleOnePassCompiler
48/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 48/66L-value and r-value l-values a : address of location a
r-values a : if a is location, then content of location aif a is constant, then value a
eg) a :=(5 + a ) * b;
lvalue a lvalue 2
r value 5 push 5 r value a rvalue 2 r value of b rvalue 3
Stack ManipulationSome instructions for assignment operation
push v : push v onto the stack.
rvalue a : push the contents of data location a.
lvalue a : push the address of data location a.
pop : throw away the top element of the stack.
:= : assignmen t for the top 2 elements of the stack. copy : push a copy of the top element of the stack.
8/7/2019 2_SimpleOnePassCompiler
49/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 49/66Translation of ExpressionsInfix expression(IE) SDD/SDTS Abstact macine codes(AMC) of postfix expression
for stack machine evaluation.eg )
IE: a + b, (PE: a b + ) IC: rvalue arvalue b+
day := (1461 * y) div 4 + (153 * m + 2) div 5 + d( day 1462 y * 4 div 153 m * 2 + 5 div + d + :=)
1) lvalue day 6) div 11) push 5 16) :=2) push 1461 7) push 153 12) div3) rvalue y 8) rvalue m 13) +4) * 9) push 2 14) rvalue d5) push 4 10) + 15) +
A translation scheme for assignment-statement into abstract stack machine code e canbe expressed formally In the form as follows:stmt id := expr
{ stmt.t := 'lvalue' || id .lexeme || expr.t || ':=' }eg) day :=a+b lvalue day rvalue a rvalue b + :=
8/7/2019 2_SimpleOnePassCompiler
50/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 50/66Control Flow3 types of jump instructions :
1. Absolute target location
2. Relative target location( distance :Current Target)
3. Symbolic target location( i.e. the machine supports labels)
Control-flow instructions:
l a b e l a : t h e j u m p ' s t a r g e t a
g o t o a : t h e n e x t i n s t r u c t i o n i s t a k e n f r o m s t a t e m e n t l a b e l e d a g o f a l s e a : p o p t h e t o p & i f i t i s 0 t h e n j u m p t o a
g o t r u e a : p o p t h e t o p & i f i t i s n o n z e r o t h e n j u m p t o a h a l t : s t o p e x e c u t i o n
8/7/2019 2_SimpleOnePassCompiler
51/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 51/66Translation of Statements Translation scheme for translation if-statement into abstract machine code.
stmt if expr then stmt1{ out := newlabel1)
stmt.t := expr.t || 'gofalse' out || stmt1.t || 'label' out }
Translation scheme for while-statement ?
1) a procedure generates a unique label(eg. zzzzz001, zzzzz002, etc) whenever it is called!
8/7/2019 2_SimpleOnePassCompiler
52/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 52/66Emitting a TranslationSemantic Action(Tranaslation Scheme):
stmt if expr { out := newlabel; emit('gofalse', out) } then stmt1 { emit('label', out) }
stmt id { emit('lvalue', id.lexeme) } := expr { emit(':=') }
stmt if expr { out := newlabel; emit('gofalse', out) } then stmt1 { emit('label', out) ; out1 := newlabel; emit('goto', out1); }
else stmt2 { emit('label', out1) ; } if(expr==false) goto out
stmt1
goto out1out : stmt2out1:
8/7/2019 2_SimpleOnePassCompiler
53/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 53/66
implementation
procedure stmt()
var test,out:integer; begin // stmt id { emit('lvalue', id.lexeme) } := expr { emit(':=') } if lookahead = id then begin emit('lvalue',tokenval); match(id ); match(':='); expr(); emit(':='); end // stmt if expr { out := newlabel; emit('gofalse', out) } then stmt1 { emit('label', out) } else if lookahead = 'if' then begin match('if'); expr(); out := newlabel(); emit('gofalse', out); match('then'); stmt; emit('label', out) end
else error(); end
8/7/2019 2_SimpleOnePassCompiler
54/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 54/66Control Flow with Analysisif E1 or E2 then S vs if E1 and E2 then S
E 1 o r E 2 = i f E 1 t h e n t r u e e l s e E 2E 1 a n d E 2 = i f E 1 t h e n E 2 e l s e f a l s e
T h e c o d e fo r E 1 o r E 2 .
Codes for E1 Evaluation result: e1
copy
gotrue OUT
pop
Codes for E2 Evaluation result: e2
label OUT
8/7/2019 2_SimpleOnePassCompiler
55/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 55/66
T h e f u l l c o d e f o r if E 1 o r E 2 t h e n S ;
codes for E1
copy gotrue OUT1
pop
codes for E2
label OUT1
gofalse OUT2
code for S label OUT2
Exercise: How about if E1 and E2 then S;
if E1 and E2 then S1 else S2;
8/7/2019 2_SimpleOnePassCompiler
56/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 56/66
2 . 9 P u t t i n g t h e t e c h n i q u e s t o g e t h e r !
infix expression postfix expression
eg) id+(id-id)*num/id id id id - num * id / +
Description of the TranslatorSyntax directed translation scheme(SDTS) to translate the infix expressions
into the postfix expressions, Fig2.35
8/7/2019 2_SimpleOnePassCompiler
57/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 57/66
Structure of the translator, Fig 2.36
o global header file "header.h"
8/7/2019 2_SimpleOnePassCompiler
58/66
8/7/2019 2_SimpleOnePassCompiler
59/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 59/66The Parser Module parser.cSDTS Fig 2.35
left recursion elimination
New SDTS Fig 2.38
8/7/2019 2_SimpleOnePassCompiler
60/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 60/66
An example of parsing for the input 12 div 5 + 2 ;
8/7/2019 2_SimpleOnePassCompiler
61/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 61/66The Emitter Module emitter.cemit (t,tval)
The Symbol-Table Modules symbol.c and init.cSymbol.c
data structure of symbol table Fig 2.29 p62
insert(s,t)
lookup(s)
The Error Module error.cExample of execution
input 12 div 5 + 2
output 12
5
div
2
+
8/7/2019 2_SimpleOnePassCompiler
62/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 62/66
8/7/2019 2_SimpleOnePassCompiler
63/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 63/66
8/7/2019 2_SimpleOnePassCompiler
64/66
8/7/2019 2_SimpleOnePassCompiler
65/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 65/66
8/7/2019 2_SimpleOnePassCompiler
66/66
Compiler, Computer Eng. Dept., Hanbat National University, Yoon-Joong Kim 66/66