Compiler Lecture Note, LL Parsing Page 1
PL Lab, DongGuk University
Introduction to Compilers
Chapter 7: LL Parsing
Contents
I. Deterministic Parsing
II. Recursive-Descent Parsers
III. Predictive Parsers
IV. Constructing the Predictive Parsing Table
V. Strong LL(k) Grammars and LL(k) Grammars
I. Deterministic Parsing
▶ Deterministic Top-Down Parsing
  ::= deterministic selection of the production rules to be applied
      in top-down syntax analysis.
▶ One pass, no backup
  1. The input string is scanned once from left to right.
  2. The parsing process is deterministic.
▶ Top-down parsing with no backup
  ::= deterministic top-down parsing,
      called LL parsing:
      "Left to right scanning and Left parse"
▶ How to decide which production is to be applied:
  sentential form : a1 a2 … ai-1 X α
  input string    : a1 a2 … ai-1 ai ai+1 … an
  When X → α1 | α2 | ... | αk ∈ P,
  the X-production to apply is decided uniquely by looking at ai.
  The condition for no backtracking (= LL condition) requires FIRST and FOLLOW.
FIRST
▶ FIRST(α) ::= the set of terminals that begin the strings derived from α.
  If α ⇒* ε, then ε is also in FIRST(α).
  FIRST(A) ::= { a ∈ VT ∪ {ε} | A ⇒* aγ, γ ∈ V* }.
▶ Computation of FIRST(X), where X ∈ V.
  1) if X ∈ VT, then FIRST(X) = {X}
  2) if X ∈ VN and X → aα ∈ P, then FIRST(X) = FIRST(X) ∪ {a}
     if X → ε ∈ P, then FIRST(X) = FIRST(X) ∪ {ε}
  3) if X → Y1Y2…Yk ∈ P and Y1Y2…Yi-1 ⇒* ε,
     then FIRST(X) = FIRST(X) ∪ ⋃(j=1..i) (FIRST(Yj) − {ε}).
     if Y1Y2…Yk ⇒* ε, then FIRST(X) = FIRST(X) ∪ {ε}.
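The three computation rules can be applied repeatedly until no FIRST set changes. The following Python sketch shows one way to do that; the grammar encoding (a dict from nonterminal to lists of symbols, with an empty list standing for an ε-production) and the name compute_first are assumptions made for illustration, not part of the lecture note.

```python
EPS = "ε"  # marker for the empty string

def compute_first(grammar):
    """grammar: nonterminal -> list of right-hand sides (lists of symbols).
    An empty right-hand side stands for an ε-production.
    Returns nonterminal -> FIRST set."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # iterate until a fixed point is reached
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[lhs])
                all_nullable = True
                for sym in rhs:
                    # rule 1: FIRST of a terminal is the terminal itself
                    sym_first = first[sym] if sym in grammar else {sym}
                    first[lhs] |= sym_first - {EPS}      # rules 2 and 3
                    if EPS not in sym_first:
                        all_nullable = False
                        break
                if all_nullable:        # Y1...Yk derives ε (or rhs is ε)
                    first[lhs].add(EPS)
                if len(first[lhs]) != before:
                    changed = True
    return first

# Expression grammar: E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε, F -> (E) | id
grammar = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
first = compute_first(grammar)
```

For this grammar the sketch yields FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε} and FIRST(T') = {*, ε}.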
ex1) E → TE'
     E' → +TE' | ε
     T → FT'
     T' → *FT' | ε
     F → (E) | id
     FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
     FIRST(E') = {+, ε}
     FIRST(T') = {*, ε}
ex2) PROGRAM → begin d semi X end
     X → d semi X
     X → s Y
     Y → semi s Y | ε
     FIRST(PROGRAM) = {begin}
     FIRST(X) = {d, s}
     FIRST(Y) = {semi, ε}
Text p.268
Exercise 7.4 (1) - p.299
• Compute FIRST.
(1) S → aRTb | bRR
    R → cRd | ε
    T → RS | TaT
▶ left-dependency graph
  - the vertices are the terminal and nonterminal symbols, and the
    arcs go from X to Y if and only if X → X1...XnYα, where
    n ≥ 0, and each of X1,...,Xn can produce the empty string.
ex) S → AB
    A → aA | ε
    B → bB | ε
[Figure: left-dependency graph with arcs S → A, S → B, A → a, B → b]
FIRST(S) = {a, b, ε}   FIRST(A) = {a, ε}   FIRST(B) = {b, ε}
★ In general, for A → A1A2...An:
  if A1 is non-nullable : arc A → A1
  if A1 is nullable     : arcs A → A1, A → A2
  if A1A2 is nullable   : arcs A → A1, A → A2, A → A3
FOLLOW
▶ FOLLOW(A)
  ::= the set of terminals that can appear immediately to the right
      of A in some sentential form. If A can be the rightmost
      symbol in some sentential form, then $ is in FOLLOW(A).
      $ is the right end marker of the input.
  ::= { a ∈ VT ∪ {$} | S ⇒* αAaβ, α, β ∈ V* }.
▶ Computation of FOLLOW(A)
  1) FOLLOW(S) = {$}
  2) if A → αBβ ∈ P and β ≠ ε,
     then FOLLOW(B) = FOLLOW(B) ∪ (FIRST(β) − {ε})
  3) if A → αB ∈ P, or A → αBβ ∈ P and β ⇒* ε,
     then FOLLOW(B) = FOLLOW(B) ∪ FOLLOW(A).
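The FOLLOW rules can likewise be iterated to a fixed point. The Python sketch below is one possible arrangement; the grammar encoding and the helper names first_of, first_sets and follow_sets are illustrative assumptions, and FIRST is recomputed here so that the block is self-contained.

```python
EPS, END = "ε", "$"

def first_of(symbols, first, grammar):
    """FIRST of a symbol string, using the current FIRST table."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in grammar else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                     # every symbol was nullable
    return result

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first, grammar)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                           # rule 1
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:           # only nonterminals have FOLLOW
                        continue
                    beta_first = first_of(rhs[i + 1:], first, grammar)
                    new = beta_first - {EPS}         # rule 2
                    if EPS in beta_first:            # rule 3: β is empty or nullable
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
follow = follow_sets(GRAMMAR, "E")
```

With the expression grammar as input, the sketch gives FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $} and FOLLOW(F) = {*, +, ), $}.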
ex) E → TE'
    E' → +TE' | ε
    T → FT'
    T' → *FT' | ε
    F → (E) | id
    Nullable = { E', T' }
    FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
    FIRST(E') = {+, ε}   FIRST(T') = {*, ε}
    FOLLOW(E) = {), $}   FOLLOW(E') = {), $}
    FOLLOW(T) = {+, ), $}   FOLLOW(T') = {+, ), $}
    FOLLOW(F) = {*, +, ), $}
Text p.271
Exercise 7.4 (3) - p.299
• Compute FOLLOW.
(3) S → aAa | ε
    A → abS | c
▶ LL condition
  ::= no backup condition
  ::= the condition for deterministic parsing in the top-down method.
  input          : a1 a2 ... ai-1 ai ... an
  derived string : a1 a2 ... ai-1 X α
  X → α1 | α2 | ... | αm
  Looking at ai, the rule used to expand X is selected deterministically from the X-productions.
★ <LL condition> For A → α | β ∈ P,
  1. FIRST(α) ∩ FIRST(β) = ∅
  2. if α ⇒* ε, FOLLOW(A) ∩ FIRST(β) = ∅
ex) A → aBc | Bc | dAa
    B → bB | ε
    FIRST(A) = {a,b,c,d}   FOLLOW(A) = {$,a}
    FIRST(B) = {b, ε}      FOLLOW(B) = {c}
    1) For A → aBc | Bc | dAa,
       FIRST(aBc) = {a}, FIRST(Bc) = {b,c}, FIRST(dAa) = {d}
       are pairwise disjoint: {a} ∩ {b,c} = {a} ∩ {d} = {b,c} ∩ {d} = ∅
    2) For B → bB | ε,
       FIRST(bB) ∩ FOLLOW(B) = {b} ∩ {c} = ∅
    By 1) and 2), the grammar satisfies the LL condition.
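The check in this example can be mechanized by testing every pair of alternatives of each nonterminal. The Python sketch below takes the FIRST and FOLLOW sets from the slide as given inputs; the function names and the grammar encoding are assumptions for illustration.

```python
from itertools import combinations

EPS = "ε"

def first_of(symbols, first):
    """FIRST of a symbol string; symbols missing from `first` are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def satisfies_ll_condition(grammar, first, follow):
    for lhs, alternatives in grammar.items():
        for alpha, beta in combinations(alternatives, 2):
            fa, fb = first_of(alpha, first), first_of(beta, first)
            if fa & fb:                          # condition 1 violated
                return False
            if EPS in fa and follow[lhs] & fb:   # condition 2, α nullable
                return False
            if EPS in fb and follow[lhs] & fa:   # condition 2, β nullable
                return False
    return True

# A -> aBc | Bc | dAa,  B -> bB | ε  (an empty list encodes ε)
grammar = {
    "A": [["a", "B", "c"], ["B", "c"], ["d", "A", "a"]],
    "B": [["b", "B"], []],
}
first = {"A": {"a", "b", "c", "d"}, "B": {"b", EPS}}
follow = {"A": {"$", "a"}, "B": {"c"}}

print(satisfies_ll_condition(grammar, first, follow))  # True
```

Testing pairs rather than one three-way intersection matters: three sets can have an empty common intersection while two of them still overlap.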
II. Recursive-Descent Parsers
▶ Recursive-descent parsing
  ::= A top-down method that uses a set of recursive procedures to recognize its input with no backtracking.
▶ create a procedure for each nonterminal.
ex) G : S → aA | bB
        A → aA | c
        B → bB | d
procedure pS;
begin
  if nextsymbol = qa then begin get_nextsymbol; pA end
  else if nextsymbol = qb then begin get_nextsymbol; pB end
  else error
end;
procedure pA;
begin
  if nextsymbol = qa then begin get_nextsymbol; pA end
  else if nextsymbol = qc then get_nextsymbol
  else error
end;
procedure pB; ...
(* main *)
begin
  get_nextsymbol; pS;
  if nextsymbol = '$' then accept else error
end.
ω = aac$
Procedure call sequence ::= leftmost derivation
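The procedures for G can be transcribed into runnable form. The Python sketch below follows pS and pA from the slides; pB is elided there, so its body is filled in by analogy with pA and is an assumption, as are the class and method names and the list that records the productions applied.

```python
class ParseError(Exception):
    """Raised when the input is not a sentence of G."""

class Parser:
    def __init__(self, text):
        self.symbols = list(text) + ["$"]   # $ marks the end of the input
        self.pos = 0
        self.left_parse = []                # productions in the order applied

    @property
    def nextsymbol(self):
        return self.symbols[self.pos]

    def get_nextsymbol(self):
        self.pos += 1

    def pS(self):                           # S -> aA | bB
        if self.nextsymbol == "a":
            self.left_parse.append("S -> aA")
            self.get_nextsymbol()
            self.pA()
        elif self.nextsymbol == "b":
            self.left_parse.append("S -> bB")
            self.get_nextsymbol()
            self.pB()
        else:
            raise ParseError(f"unexpected {self.nextsymbol!r} in S")

    def pA(self):                           # A -> aA | c
        if self.nextsymbol == "a":
            self.left_parse.append("A -> aA")
            self.get_nextsymbol()
            self.pA()
        elif self.nextsymbol == "c":
            self.left_parse.append("A -> c")
            self.get_nextsymbol()
        else:
            raise ParseError(f"unexpected {self.nextsymbol!r} in A")

    def pB(self):                           # B -> bB | d (assumed, by analogy with pA)
        if self.nextsymbol == "b":
            self.left_parse.append("B -> bB")
            self.get_nextsymbol()
            self.pB()
        elif self.nextsymbol == "d":
            self.left_parse.append("B -> d")
            self.get_nextsymbol()
        else:
            raise ParseError(f"unexpected {self.nextsymbol!r} in B")

def parse(text):
    parser = Parser(text)
    parser.pS()
    if parser.nextsymbol != "$":            # main: accept only at end of input
        raise ParseError("input remains after S")
    return parser.left_parse

print(parse("aac"))  # ['S -> aA', 'A -> aA', 'A -> c']
```

The recorded call sequence for aac is exactly the leftmost derivation S ⇒ aA ⇒ aaA ⇒ aac.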
▶ The main problem in constructing a recursive-descent syntax
analyzer is the choice of productions when a procedure is first
entered. To resolve this problem, we can compute the lookahead
of each production.
▶ LOOKAHEAD of a production
  Definition: LOOKAHEAD(A → α)
      = FIRST({ω | S ⇒* μAβ ⇒ μαβ ⇒* μω ∈ VT*}).
  Meaning : the set of terminals which can be generated by α, and
      if α ⇒* ε, then FOLLOW(A) is added to the set.
  Computing formula: LOOKAHEAD(A → X1X2...Xn)
      = FIRST(X1X2...Xn) ⊕ FOLLOW(A),
      where X ⊕ Y = X if ε ∉ X, and (X − {ε}) ∪ Y otherwise (ring sum).
ex) S → aSA | ε
    A → c
    Nullable Set = {S}
    FIRST(S) = {a, ε}   FOLLOW(S) = {$,c}
    FIRST(A) = {c}      FOLLOW(A) = {$,c}
    LOOKAHEAD(S → aSA) = FIRST(aSA) ⊕ FOLLOW(S) = {a}
    LOOKAHEAD(S → ε) = FIRST(ε) ⊕ FOLLOW(S) = {$,c}
    LOOKAHEAD(A → c) = FIRST(c) ⊕ FOLLOW(A) = {c}
    (⊕ is the ring sum: FOLLOW is added only when FIRST contains ε.)
Order of computing LOOKAHEAD:
    Nullable => FIRST => FOLLOW => LOOKAHEAD
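The last step of that order can be sketched in a few lines of Python. FIRST and FOLLOW are taken from the example as given inputs; the helper names ring_sum, first_of and lookahead are assumptions for illustration.

```python
EPS = "ε"

def ring_sum(x, y):
    """X ⊕ Y: X itself if ε is not in X, otherwise (X − {ε}) ∪ Y."""
    return set(x) if EPS not in x else (x - {EPS}) | y

def first_of(symbols, first):
    """FIRST of a symbol string; symbols missing from `first` are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def lookahead(lhs, rhs, first, follow):
    return ring_sum(first_of(rhs, first), follow[lhs])

# S -> aSA | ε,  A -> c
first = {"S": {"a", EPS}, "A": {"c"}}
follow = {"S": {"$", "c"}, "A": {"$", "c"}}

print(lookahead("S", ["a", "S", "A"], first, follow))  # {'a'}
print(lookahead("A", ["c"], first, follow))            # {'c'}
```

For the ε-production, lookahead("S", [], first, follow) returns FOLLOW(S), that is, the set containing $ and c.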
▶ Strong LL condition
  Definition : for A → α | β ∈ P, LOOKAHEAD(A → α) ∩ LOOKAHEAD(A → β) = ∅.
  Meaning : for each distinct pair of productions with the same left-hand side, the parser can select the unique alternative that derives a string beginning with the input symbol.
  Definition : the grammar G is said to be strong LL(1) if it satisfies the strong LL condition.
ex) G : S → aSA | ε
        A → c
    LOOKAHEAD(S → aSA) = {a}
    LOOKAHEAD(S → ε) = FOLLOW(S) = {$, c}
    LOOKAHEAD(S → aSA) ∩ LOOKAHEAD(S → ε) = ∅
    ∴ G is strong LL(1).
▶ Implementation of Recursive-descent parser
If a grammar is strong LL(1), we can construct a parser for sentences of the
grammar using the following scheme.
For each a ∈ VT,
procedure pa; (* get_nextsymbol = scanner *)
begin
  if nextsymbol = qa then get_nextsymbol
  else error
end;
get_nextsymbol : the routine that corresponds to the scanner; it reads one token from the
input stream and assigns it to the variable nextsymbol.
For each A ∈ VN,
procedure pA;
var i: integer;
begin
case nextsymbol of
LOOKAHEAD(A → X1X2...Xm): for i := 1 to m do pXi;
LOOKAHEAD(A → Y1Y2...Yn): for i := 1 to n do pYi;
:
LOOKAHEAD(A → Z1Z2...Zr): for i := 1 to r do pZi;
LOOKAHEAD(A → ε): ;
otherwise: error
end (* case *)
end;
Text p.278
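The case-statement scheme can be collapsed into one table-driven procedure that serves every symbol. The Python sketch below does this for the strong LL(1) grammar S → aSA | ε, A → c, using the LOOKAHEAD sets computed earlier; the RULES layout and the function names are assumptions for illustration, not the textbook's code.

```python
# nonterminal -> list of (LOOKAHEAD set, right-hand side); [] encodes ε
RULES = {
    "S": [({"a"}, ["a", "S", "A"]), ({"$", "c"}, [])],
    "A": [({"c"}, ["c"])],
}

def parse(text, start="S"):
    tokens = list(text) + ["$"]
    pos = 0
    output = []                                  # productions in the order applied

    def p(symbol):
        nonlocal pos
        if symbol not in RULES:                  # terminal procedure pa
            if tokens[pos] != symbol:
                raise ValueError(f"expected {symbol!r}, found {tokens[pos]!r}")
            pos += 1                             # get_nextsymbol
            return
        for la, rhs in RULES[symbol]:            # case nextsymbol of ...
            if tokens[pos] in la:
                output.append((symbol, tuple(rhs)))
                for x in rhs:                    # for i := 1 to m do pXi
                    p(x)
                return
        raise ValueError(f"no production of {symbol} for {tokens[pos]!r}")  # otherwise

    p(start)
    if tokens[pos] != "$":
        raise ValueError("input remains after the start symbol")
    return output

print(parse("ac"))  # [('S', ('a', 'S', 'A')), ('S', ()), ('A', ('c',))]
```

Because the LOOKAHEAD sets of the alternatives are disjoint, at most one branch of the loop can match, which is what makes the choice deterministic.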
▶ Improving the efficiency and structure of recursive-descent parser
1) Eliminating terminal procedures
::= In practice it is better not to write a procedure for each terminal.
Instead the action of advancing the input marker can always be initiated
by the nonterminal procedures. In this way many redundant tests can
be eliminated.
ex) text p.279 [Example 9]
2) BNF → EBNF : reduce the number of productions and nonterminals.
① repetitive part : { }
② optional part : [ ]
③ alternation : ( | )
ex) [Example 10] --- text p.281
<IF_st> ::= 'if' <C> 'then' <S> [ 'else' <S> ]
procedure pIF;
begin if nextsymbol = qif then
begin get_nextsymbol; pC;
if nextsymbol = qthen then
begin get_nextsymbol; pS end
else error(10)
end
else error(20);
if nextsymbol = qelse then
begin get_nextsymbol; pS end
end;
ex) [Example 11] --- text p.281
<id_list> ::= 'id' { ',' 'id' }
procedure pID_LIST;
begin if nextsymbol = qid then
  begin get_nextsymbol;
    while (nextsymbol = qcomma) do
    begin get_nextsymbol;
      if nextsymbol = qid then get_nextsymbol
      else error
    end
  end
  else error
end;
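The EBNF repetition { ... } maps directly onto a while loop, as pID_LIST shows. A runnable Python sketch of the same procedure follows; the token representation (a list of strings such as "id" and ",") and the function name are assumptions for illustration.

```python
def parse_id_list(tokens):
    """Recognize <id_list> ::= 'id' { ',' 'id' } at the start of `tokens`.
    Returns the number of tokens consumed; raises ValueError on a syntax error."""
    tokens = list(tokens) + ["$"]        # $ marks the end of the input
    pos = 0
    if tokens[pos] != "id":
        raise ValueError("an identifier is expected at the start of the list")
    pos += 1
    while tokens[pos] == ",":            # repetitive part { ',' 'id' }
        pos += 1
        if tokens[pos] != "id":
            raise ValueError("an identifier is expected after ','")
        pos += 1
    return pos

print(parse_id_list(["id", ",", "id", ",", "id"]))  # 5
```

An optional part [ ... ] is handled the same way with a single if in place of the while.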
[Exercise 7.8 (2)] --- Text p.300
<Problem> Convert the following grammar to extended BNF and write the corresponding
procedure for a recursive-descent parser.
<D> ::= 'label' <L> | 'integer' <L>
<L> ::= <id> <R>
<R> ::= ';' | ',' <L>
<L> ⇒* <id> (',' <id>)* ';'
<D> ::= ( 'label' | 'integer' ) <id> { ',' <id> } ';'
procedure pD;
begin
  if nextsymbol in [qlabel,qinteger] then
  begin get_nextsymbol;
    if nextsymbol = qid then
    begin get_nextsymbol;
      while (nextsymbol = qcomma) do
      begin get_nextsymbol;
        if nextsymbol = qid then get_nextsymbol
        else error(3)
      end
    end
    else error(2);
    if nextsymbol = qsemi then get_nextsymbol
    else error(4)
  end
  else error(1)
end;
Programming Assignment #1
Implement a recursive-descent syntax analyzer for the grammar
given in exercise 5.30(text p. 224).
Problem Specifications
- input : SPL program to find a Minimum and a Maximum.
- output : left parse
- methods : (1) write the get_nextsymbol routine.
(2) compute LOOKAHEADs for each production.
(3) create a procedure for each nonterminal.
(4) assemble the procedures with main program.
[Figure: a set of productions → Computation of LOOKAHEADs → LOOKAHEADs for each nonterminal]
III. Predictive Parsing
▶ Predictive parsing
  ::= a deterministic parsing method using a stack. The stack contains a sequence of grammar symbols.
▶ Model of a predictive parser
[Figure: model of a predictive parser: input buffer ending in $, stack with $ at the bottom, driver routine, parsing table, output]
Parsing proceeds according to the relationship between the current input symbol and the stack top symbol.
The input buffer contains the string to be parsed, followed by $.
Initial configuration : STACK   INPUT
                        $S      ω$
Parsing table (LL) : determines the parsing action.
※ M[X,a] = r : when the stack top symbol is X and the current input symbol is a,
   expand by production r.
[Figure: parsing table with nonterminals as rows and terminals as columns; the entry at row X, column a is r]
▶ Parsing Actions
X : stack top symbol, a : current input symbol
1. if X = a = $, then accept.
2. if X = a, then pop X and advance input.
3. if X ∈ VN, then if M[X,a] = r (X → α),
   then replace X by α
   else error.
▶ Predictive parsing algorithm
set ip to point to the first symbol of ω$;
repeat
  let X be the top stack symbol and a the symbol pointed to by ip;
  if X is a terminal or $ then
    if X = a then pop X from the stack and advance ip
    else error(1)
  else /* X is nonterminal */
    if M[X,a] = X → Y1Y2...Yk then
    begin pop X from the stack;
      push Yk,Yk-1,...,Y1 onto the stack, with Y1 on top;
      output the production X → Y1Y2...Yk
    end
    else error(2)
until X = $ /* stack is empty */
Text p.284
ex) G : 1. S → aSb
        2. S → bA
        3. A → aA
        4. A → b
    string : aabbbb
• Parsing Table (rows: nonterminals, columns: terminals):
         a    b
    S    1    2
    A    3    4
STACK     INPUT      ACTION               OUTPUT
$S        aabbbb$    expand 1             1
$bSa      aabbbb$    pop a and advance
$bS       abbbb$     expand 1             1
$bbSa     abbbb$     pop a and advance
$bbS      bbbb$      expand 2             2
$bbAb     bbbb$      pop b and advance
$bbA      bbb$       expand 4             4
$bbb      bbb$       pop b and advance
$bb       bb$        pop b and advance
$b        b$         pop b and advance
$         $          Accept
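The driver routine and this trace can be reproduced with a short Python sketch. The parsing table and production numbers are those of the example; the dictionary layout and the function name predictive_parse are assumptions for illustration.

```python
PRODUCTIONS = {
    1: ("S", ["a", "S", "b"]),
    2: ("S", ["b", "A"]),
    3: ("A", ["a", "A"]),
    4: ("A", ["b"]),
}
TABLE = {("S", "a"): 1, ("S", "b"): 2, ("A", "a"): 3, ("A", "b"): 4}
NONTERMINALS = {"S", "A"}

def predictive_parse(text, start="S"):
    stack = ["$", start]                 # top of the stack is the end of the list
    tokens = list(text) + ["$"]
    ip = 0
    output = []                          # left parse: production numbers
    while True:
        x, a = stack[-1], tokens[ip]
        if x == "$" and a == "$":        # accept
            return output
        if x not in NONTERMINALS:        # terminal (or $) on top of the stack
            if x != a:
                raise ValueError(f"expected {x!r}, found {a!r}")
            stack.pop()                  # pop and advance
            ip += 1
        else:
            r = TABLE.get((x, a))        # a blank entry means error
            if r is None:
                raise ValueError(f"no entry M[{x},{a}]")
            stack.pop()
            stack.extend(reversed(PRODUCTIONS[r][1]))   # Y1 ends up on top
            output.append(r)

print(predictive_parse("aabbbb"))  # [1, 1, 2, 4]
```

The printed list matches the OUTPUT column of the trace.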
※ How to construct a predictive parsing table for the grammar.
IV. Constructing the Predictive Parsing Table
▶ main idea : If A → α is a production with a in FIRST(α), then the parser will expand A by α when the current input symbol is a. And if α ⇒* ε, then we should again expand A by α when the current input symbol is in FOLLOW(A).
▶ parsing table (LL):
  M[X,a] = r : expand X with the r-production
  blank : error
[Figure: parsing table with rows indexed by X ∈ VN and columns indexed by a ∈ VT]
▶ Algorithm : for each production A → α,
  1. ∀a ∈ FIRST(α), M[A,a] := <A → α>
  2. if α ⇒* ε, then ∀b ∈ FOLLOW(A), M[A,b] := <A → α>.
ex) G: 1. E → TE'   2. E' → +TE'   3. E' → ε   4. T → FT'
       5. T' → *FT'   6. T' → ε   7. F → (E)   8. F → id
    FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
    FIRST(E') = { +, ε }   FIRST(T') = { *, ε }
    FOLLOW(E) = FOLLOW(E') = { ), $ }
    FOLLOW(T) = FOLLOW(T') = { +, ), $ }
    FOLLOW(F) = { +, *, ), $ }
• Parsing Table (rows: nonterminals, columns: terminals):
          id    +     *     (     )     $
    E     1                 1
    E'          2                 3     3
    T     4                 4
    T'          6     5           6     6
    F     8                 7
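The two-step construction algorithm can be sketched in Python as follows. FIRST and FOLLOW are taken from the example as given inputs rather than recomputed; the data layout and the names first_of and build_table are assumptions for illustration. Each table entry is kept as a list so that a multiply-defined entry would remain visible.

```python
EPS = "ε"

PRODUCTIONS = {
    1: ("E", ["T", "E'"]),
    2: ("E'", ["+", "T", "E'"]),
    3: ("E'", []),
    4: ("T", ["F", "T'"]),
    5: ("T'", ["*", "F", "T'"]),
    6: ("T'", []),
    7: ("F", ["(", "E", ")"]),
    8: ("F", ["id"]),
}
FIRST = {"E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
         "E'": {"+", EPS}, "T'": {"*", EPS}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"},
          "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
          "F": {"+", "*", ")", "$"}}

def first_of(symbols):
    """FIRST of a symbol string; symbols missing from FIRST are terminals."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table():
    table = {}
    for number, (lhs, rhs) in PRODUCTIONS.items():
        rhs_first = first_of(rhs)
        columns = rhs_first - {EPS}          # step 1: every a in FIRST(α)
        if EPS in rhs_first:                 # step 2: α derives ε
            columns |= FOLLOW[lhs]
        for a in columns:
            table.setdefault((lhs, a), []).append(number)
    return table

table = build_table()
```

For the expression grammar every entry holds exactly one production number, for example table[("T'", "*")] is [5] and table[("T'", "+")] is [6].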
▶ LL(1) Grammar
  ::= a grammar whose parsing table has no multiply-defined entries.
  If an entry is multiply defined, the parser cannot decide which rule to expand by,
  so parsing cannot be deterministic.
▶ LL(1) condition: for A → α | β,
  1. FIRST(α) ∩ FIRST(β) = ∅.
  2. if α ⇒* ε, then FOLLOW(A) ∩ FIRST(β) = ∅.
ex) G : 1. S → iCtSS'   2. S → a
        3. S' → eS   4. S' → ε   5. C → b
    FIRST(S) = {i,a}     FOLLOW(S) = {$,e}
    FIRST(S') = {e, ε}   FOLLOW(S') = {$,e}
    FIRST(C) = {b}       FOLLOW(C) = {t}
Parsing Table:
          a     b     e     i     t     $
    S     2                 1
    S'                3,4               4
    C           5
M[S',e] := <3,4> is multiply defined.
Here, when the stack top is S' and the input symbol is e, the parser cannot tell whether to
expand by rule 3 or by rule 4.
Therefore G is not an LL(1) grammar.
ex) [Example 15] --- text p.291
    G : S → aA | abA      ω : abab
        A → Ab | a
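The multiply-defined entry of the if-then-else grammar (S → iCtSS' | a, S' → eS | ε, C → b) can be found mechanically by building the table and listing the entries that hold more than one production. In the Python sketch below FIRST and FOLLOW are taken from the slides as given inputs, and the function names are assumptions for illustration.

```python
EPS = "ε"

def first_of(symbols, first):
    """FIRST of a symbol string; symbols missing from `first` are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(productions, first, follow):
    table = {}
    for number, (lhs, rhs) in productions.items():
        rhs_first = first_of(rhs, first)
        columns = rhs_first - {EPS}
        if EPS in rhs_first:
            columns |= follow[lhs]
        for a in columns:
            table.setdefault((lhs, a), []).append(number)
    return table

def conflicts(table):
    """Entries with more than one production, i.e. multiply-defined entries."""
    return {key: sorted(rules) for key, rules in table.items() if len(rules) > 1}

productions = {
    1: ("S", ["i", "C", "t", "S", "S'"]),
    2: ("S", ["a"]),
    3: ("S'", ["e", "S"]),
    4: ("S'", []),
    5: ("C", ["b"]),
}
first = {"S": {"i", "a"}, "S'": {"e", EPS}, "C": {"b"}}
follow = {"S": {"$", "e"}, "S'": {"$", "e"}, "C": {"t"}}

print(conflicts(build_table(productions, first, follow)))  # {("S'", 'e'): [3, 4]}
```

A grammar is LL(1) exactly when conflicts returns an empty dictionary for its table.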
V. Strong LL(k) and LL(k) Grammars
▶ FIRSTk(α) = {ω | α ⇒* ωβ and |ω| = k, or α ⇒* ω and |ω| < k}
▶ G is said to be strong LL(k), for some fixed integer k > 0, if
  whenever there are two leftmost derivations
  1. S ⇒* ω1Aδ1 ⇒ ω1αδ1 ⇒* ω1x ∈ VT*, and
  2. S ⇒* ω2Aδ2 ⇒ ω2βδ2 ⇒* ω2y ∈ VT* such that
  3. FIRSTk(x) = FIRSTk(y), it follows that
  4. α = β.
▶ Meaning: Suppose we consider any state of the parse in which A is the nonterminal currently being parsed and FIRSTk(x) is the k-lookahead at the current point. Then, if the k-lookahead is the same, the two productions A → α and A → β are identical. Any other information provided by the closed portion and the open portion of the current state of the parse is disregarded.
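▶ S ⇒* ωAδ, ω : closed portion, δ : open portion
▶ Two states of the parse
  FIRSTk(x) = FIRSTk(y) ===> α = β.
[Figure: two parse trees rooted at S, each containing the nonterminal A, one yielding x and the other yielding y]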
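▶ Def) LL(k) grammar: whenever there are two leftmost derivations
  1. S ⇒* ωAδ ⇒ ωαδ ⇒* ωx ∈ VT*, and
  2. S ⇒* ωAδ ⇒ ωβδ ⇒* ωy ∈ VT* such that
  3. FIRSTk(x) = FIRSTk(y), it follows that
  4. α = β.
ex) S → aAaa | bAba
    A → b | ε
[Figure: two parse trees, S with children a A a a and S with children b A b a]
When the lookahead is ba, which of A → b and A → ε should be chosen? If the symbol already seen is a, select A → b; if it is b, select A → ε. Therefore the grammar is not SLL(2) but is LL(2).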
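The difference between LL(2) and strong LL(2) for S → aAaa | bAba, A → b | ε can be checked by listing the k-lookaheads of the two A-productions in each context. The Python sketch below is hand-specialized to this grammar: because A's alternatives and the open portions are terminal strings, FIRSTk reduces to taking a prefix. The tables CONTEXTS and A_ALTERNATIVES and the function names are assumptions for illustration, and only the A-productions are examined (the two S-productions are already separated by their first symbol).

```python
def first_k(terminal_string, k):
    """FIRSTk of a string of terminals is simply its prefix of length k."""
    return terminal_string[:k]

# closed portion (the symbol already read) -> open portion (what follows A)
CONTEXTS = {"a": "aa", "b": "ba"}
A_ALTERNATIVES = {"A -> b": "b", "A -> ε": ""}

def lookaheads(k):
    """context -> {production: k-lookahead seen when that production is right}."""
    return {
        ctx: {prod: first_k(rhs + rest, k) for prod, rhs in A_ALTERNATIVES.items()}
        for ctx, rest in CONTEXTS.items()
    }

def is_ll(k):
    # LL(k): within each context the two productions have different lookaheads
    return all(len(set(row.values())) == len(row) for row in lookaheads(k).values())

def is_strong_ll(k):
    # strong LL(k): the lookaheads must differ even when contexts are ignored
    table = lookaheads(k)
    per_production = [{table[ctx][prod] for ctx in table} for prod in A_ALTERNATIVES]
    return not (per_production[0] & per_production[1])

print(lookaheads(2))
print(is_ll(2), is_strong_ll(2))  # True False
```

With k = 2 the lookahead ba occurs for A → b after a and for A → ε after b, so the sets overlap once the context is dropped, which is why the grammar is LL(2) but not strong LL(2).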
▶ SLL(k) and LL(k)
[Figure: the class of SLL(k) grammars drawn inside the class of LL(k) grammars]
▶ <theorem> strong LL(1) ⇔ LL(1)
  Proof) (⇒) clear!
  (⇐) Suppose that G is not strong LL(1).
  Then, by definition, there are two distinct productions
  A → α and A → β such that
    S ⇒* ω1Aδ1 ⇒ ω1αδ1 ⇒* ω1x1δ1 ⇒* ω1x1y1
    S ⇒* ω2Aδ2 ⇒ ω2βδ2 ⇒* ω2x2δ2 ⇒* ω2x2y2
  and FIRST(x1y1) = FIRST(x2y2).
  Now we must prove that G is not LL(1).
  1) x1 = x2 = ε : G is not LL(1).
     Indeed, it is ambiguous.
  2) one (or both) of x1 and x2 is not ε; say x1 ≠ ε.
     FIRST1(x1y1) = FIRST1(x1) = FIRST1(x2y2).
     but then,
       S ⇒* ω2Aδ2 ⇒ ω2αδ2 ⇒* ω2x1δ2 ⇒* ω2x1y2
       S ⇒* ω2Aδ2 ⇒ ω2βδ2 ⇒* ω2x2δ2 ⇒* ω2x2y2
     satisfy the property
     FIRST1(x1y2) = FIRST1(x1) = FIRST1(x2y2).
  Thus, by definition, G is not LL(1).