Discussion #6 1/13
Discussion #6
Parsing Recursive Grammars
Topics
• Tail recursion
• LL(1) with ε
• Table driven LL(1) with ε
• Lexical Analyzers
Motivating Example
• Let's use integers instead of digits in our prefix language.
• What's the problem with *2+72100?
– Syntax error? Or is it simply ambiguous?
– * 2 + 7 2100 = 2 * (7 + 2100) = 4214?
– * 2 + 72 100 = 2 * (72 + 100) = 344?
• Solution?
– Let n mark the beginning of a number, e.g. * n2 + n72 n100 = 2 * (72 + 100) = 344
– Strange, but you'll soon see where we are headed and why.
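To see why the n marker removes the ambiguity, here is a minimal Python sketch of an evaluator for the prefix language. The function name `evaluate_prefix` and the decision to skip blanks are assumptions made for illustration; they are not part of the slides.

```python
def evaluate_prefix(text):
    """Evaluate a prefix expression whose integers are introduced by 'n'."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos] == " ":
            pos += 1

    def expr():
        nonlocal pos
        skip_spaces()
        if pos >= len(text):
            raise SyntaxError("unexpected end of input")
        ch = text[pos]
        if ch in "+*":
            # Operator: evaluate the two operand expressions that follow.
            pos += 1
            left = expr()
            right = expr()
            return left + right if ch == "+" else left * right
        if ch == "n":
            # 'n' starts a number; read digits until a non-digit appears.
            pos += 1
            start = pos
            while pos < len(text) and text[pos] in "0123456789":
                pos += 1
            if start == pos:
                raise SyntaxError("expected a digit after 'n'")
            return int(text[start:pos])
        raise SyntaxError(f"unexpected character {ch!r}")

    value = expr()
    skip_spaces()
    if pos != len(text):
        raise SyntaxError("trailing input")
    return value


print(evaluate_prefix("*n2+n72n100"))  # 344
print(evaluate_prefix("*n2+n7n2100"))  # 4214
```

With the marker, both readings of the original string can be written down without blanks, and each has exactly one meaning.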
E → (1) N | (2) O E E
O → (3) + | (4) *
N → (5) n I
I → (6) D | (7) I D
D → (8) 0 | (9) 1 | … | (17) 9
Consider: * n2 + n72 n100
[Figure: parse tree for * n2 + n72 n100 under this grammar.]
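[Figure: partial parse tree for * n2 + n72 n100. The operand n2 is fully expanded; below the N for the next number, the node I is followed by ??, because the rule to apply there is not yet known.]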
Question…
• Which rule do we choose: I → (6) D or I → (7) I D?
• We don't know without looking further ahead.
• Should we look further ahead, or find another way?
LL(1) with ε
• There is another way.
• Consider the following replacement:
– Replace (6) I → D by (6) I → D T
– Replace (7) I → I D by T → (7) I | (8) ε
• Now, if I is on top of the stack and we see a digit, we choose I → D T.
• If T is on top and we see a digit, we choose T → I; otherwise we choose T → ε.
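Example: the 100 in …n100+n21n…
[Figure: derivation of 100. I → D T with D → 1; T → I → D T with D → 0; T → I → D T with D → 0; finally T → ε.]
Note: The ε does not "consume" the "+", which remains the current input symbol.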
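The choice between T → I and T → ε can be sketched as a pair of mutually recursive Python functions. This is only an illustration: the names `parse_I` and `parse_T` are hypothetical, the rule numbers 6, 7, and 8 follow the replacement above, and the D rules are folded into a single "consume one digit" step.

```python
DIGITS = set("0123456789")


def parse_I(text, pos, rules):
    """I -> D T (rule 6): one digit followed by a tail."""
    if pos >= len(text) or text[pos] not in DIGITS:
        raise SyntaxError(f"expected a digit at position {pos}")
    rules.append(6)
    pos += 1  # D consumes exactly one digit
    return parse_T(text, pos, rules)


def parse_T(text, pos, rules):
    """T -> I (rule 7) if the lookahead is a digit, else T -> epsilon (rule 8)."""
    if pos < len(text) and text[pos] in DIGITS:
        rules.append(7)
        return parse_I(text, pos, rules)
    rules.append(8)  # epsilon: consume nothing, leave the lookahead in place
    return pos


rules = []
end = parse_I("100+n21", 0, rules)
print(end, rules)  # 3 [6, 7, 6, 7, 6, 8]
```

The returned position points at the "+", which shows that the ε rule ended the number without consuming the character after it.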
Tails
• We use tails for things that can go on for arbitrarily long.
– Numbers, e.g. 12, 123456, …
– Parameter lists, e.g. (parm1, parm2, …, parmn)
– Variable names, e.g. dog, doggone, …
• Note that T(ail) rules have special constructions:
– FIRST(I) = FIRST(T) = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
– FIRST(ε for T) = (V_T ∪ {#}) – FIRST(T) = {+, *, n, #}
– Note: FIRST(T) ∩ FIRST(ε for T) = ∅
– Note also: FIRST(I) ∪ FIRST(ε for T) = V_T ∪ {#}
– Thus, our tail construction simply iterates until it reaches the end. Further, it leaves the character that is one beyond the end unconsumed, as the current input symbol.
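The set identities above can be checked mechanically. The sketch below uses Python sets; the variable names are illustrative, and V_T is taken to be the terminals of the prefix grammar (+, *, n, and the ten digits).

```python
# Terminal alphabet of the prefix grammar and the end marker.
V_T = set("+*n0123456789")
END = {"#"}

FIRST_T = set("0123456789")
FIRST_I = set(FIRST_T)

# Lookaheads that select T -> epsilon: everything that cannot start a tail.
first_eps_for_T = (V_T | END) - FIRST_T

print(sorted(first_eps_for_T))                 # ['#', '*', '+', 'n']
print(FIRST_T & first_eps_for_T == set())      # True: the two choices never collide
print(FIRST_I | first_eps_for_T == V_T | END)  # True: every lookahead is covered
```

Because the two sets are disjoint and together cover every possible lookahead, one symbol is always enough to decide whether the tail continues or stops.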
E → (1) N | (2) O E E
O → (3) + | (4) *
N → (5) n I
I → (6) D T
T → (7) I | (8) ε
D → (9) 0 | (10) 1 | … | (18) 9

    +        *        0        1        2…       n        #
E   (OEE,2)  (OEE,2)                             (N,1)
O   (+,3)    (*,4)
N                                                (nI,5)
I                     (DT,6)   (DT,6)   (DT,6)
T   (ε,8)    (ε,8)    (I,7)    (I,7)    (I,7)    (ε,8)    (ε,8)
D                     (0,9)    (1,10)   (2,11…)
+   pop
*            pop
0                     pop
1                              pop
2…                                      pop
n                                                pop
#                                                         accept
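The nonterminal rows of the table can be generated from the grammar. The sketch below is one way to do it in Python, under two assumptions: the ε rule is assigned to every lookahead that cannot start T (the simplified tail construction, not a general FOLLOW-set computation), and the names `GRAMMAR`, `first`, and `build_table` are hypothetical.

```python
DIGITS = "0123456789"
TERMINALS = set("+*n" + DIGITS)
END = "#"
EPS = ""  # the empty right-hand side stands for epsilon

# rule number -> (left-hand side, right-hand side)
GRAMMAR = {
    1: ("E", "N"), 2: ("E", "OEE"),
    3: ("O", "+"), 4: ("O", "*"),
    5: ("N", "nI"),
    6: ("I", "DT"),
    7: ("T", "I"), 8: ("T", EPS),
}
for i, d in enumerate(DIGITS):
    GRAMMAR[9 + i] = ("D", d)  # rules 9..18: D -> 0 | 1 | ... | 9


def first(symbol):
    """Terminals that can begin a string derived from symbol."""
    if symbol in TERMINALS:
        return {symbol}
    result = set()
    for lhs, rhs in GRAMMAR.values():
        if lhs == symbol and rhs != EPS:
            result |= first(rhs[0])
    return result


def build_table():
    """Map (nonterminal, lookahead) to (right-hand side, rule number)."""
    table = {}
    for number, (lhs, rhs) in GRAMMAR.items():
        if rhs != EPS:
            lookaheads = first(rhs[0])
        else:
            # Tail construction: epsilon takes every lookahead not in FIRST(lhs).
            lookaheads = (TERMINALS | {END}) - first(lhs)
        for a in lookaheads:
            if (lhs, a) in table:
                raise ValueError(f"conflict at ({lhs}, {a}): not LL(1)")
            table[(lhs, a)] = (rhs, number)
    return table


TABLE = build_table()
print(TABLE[("E", "+")])  # ('OEE', 2)
print(TABLE[("T", "#")])  # ('', 8)
print(TABLE[("D", "2")])  # ('2', 11)
```

Blank cells in the table correspond to keys that are absent from `TABLE`, so a failed lookup signals a syntax error.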
Action                                Stack  Input    Output
Initialize                            E#     +n10n1#
ACTION(E,+) = Replace [E,OEE], Out 2  OEE#   +n10n1#  2
ACTION(O,+) = Replace [O,+], Out 3    +EE#   +n10n1#  2 3
ACTION(+,+) = pop(+,+)                EE#    n10n1#   2 3
ACTION(E,n) = Replace [E,N], Out 1    NE#    n10n1#   2 3 1
ACTION(N,n) = Replace [N,nI], Out 5   nIE#   n10n1#   2 3 1 5
ACTION(n,n) = pop(n,n)                IE#    10n1#    2 3 1 5
ACTION(I,1) = Replace [I,DT], Out 6   DTE#   10n1#    2 3 1 5 6
ACTION(D,1) = Replace [D,1], Out 10   1TE#   10n1#    2 3 1 5 6 10
ACTION(1,1) = pop(1,1)                TE#    0n1#     2 3 1 5 6 10
ACTION(T,0) = Replace [T,I], Out 7    IE#    0n1#     2 3 1 5 6 10 7
ACTION(I,0) = Replace [I,DT], Out 6   DTE#   0n1#     2 3 1 5 6 10 7 6
ACTION(D,0) = Replace [D,0], Out 9    0TE#   0n1#     2 3 1 5 6 10 7 6 9
ACTION(0,0) = pop(0,0)                TE#    n1#      2 3 1 5 6 10 7 6 9
ACTION(T,n) = Replace [T,ε], Out 8    εE#    n1#      2 3 1 5 6 10 7 6 9 8
ACTION(ε,n) = pop                     E#     n1#      2 3 1 5 6 10 7 6 9 8
ACTION(E,n) = Replace [E,N], Out 1    N#     n1#      2 3 1 5 6 10 7 6 9 8 1
ACTION(N,n) = Replace [N,nI], Out 5   nI#    n1#      2 3 1 5 6 10 7 6 9 8 1 5
ACTION(n,n) = pop(n,n)                I#     1#       2 3 1 5 6 10 7 6 9 8 1 5
ACTION(I,1) = Replace [I,DT], Out 6   DT#    1#       2 3 1 5 6 10 7 6 9 8 1 5 6
ACTION(D,1) = Replace [D,1], Out 10   1T#    1#       2 3 1 5 6 10 7 6 9 8 1 5 6 10
ACTION(1,1) = pop(1,1)                T#     #        2 3 1 5 6 10 7 6 9 8 1 5 6 10
ACTION(T,#) = Replace [T,ε], Out 8    ε#     #        2 3 1 5 6 10 7 6 9 8 1 5 6 10 8
ACTION(ε,#) = pop                     #      #        2 3 1 5 6 10 7 6 9 8 1 5 6 10 8
ACTION(#,#) = Accept!                 #      #        2 3 1 5 6 10 7 6 9 8 1 5 6 10 8
The Stack column shows the stack after the action; the Input column shows the input that has not yet been matched.
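The trace can be reproduced with a short table-driven parser. This is a minimal Python sketch, not the course's implementation: the names `make_table` and `parse` are hypothetical, and replacing T by ε simply pushes nothing, so the separate "pop ε" step of the trace does not appear.

```python
DIGITS = "0123456789"
NONTERMINALS = set("EONITD")


def make_table():
    """ACTION entries for nonterminals: (symbol, lookahead) -> (rhs, rule)."""
    table = {}
    for op, rule in (("+", 3), ("*", 4)):
        table[("E", op)] = ("OEE", 2)
        table[("O", op)] = (op, rule)
    table[("E", "n")] = ("N", 1)
    table[("N", "n")] = ("nI", 5)
    for i, d in enumerate(DIGITS):
        table[("I", d)] = ("DT", 6)
        table[("T", d)] = ("I", 7)
        table[("D", d)] = (d, 9 + i)
    for a in "+*n#":
        table[("T", a)] = ("", 8)  # empty string stands for epsilon
    return table


TABLE = make_table()


def parse(text):
    """Parse text (which must end in '#') and return the rule numbers output."""
    stack = ["#", "E"]  # the top of the stack is the end of the list
    pos = 0
    output = []
    while True:
        top = stack[-1]
        look = text[pos] if pos < len(text) else None
        if top == "#" and look == "#" and pos == len(text) - 1:
            return output  # accept
        if top in NONTERMINALS:
            entry = TABLE.get((top, look))
            if entry is None:
                raise SyntaxError(f"no rule for ({top}, {look}) at position {pos}")
            rhs, rule = entry
            stack.pop()
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
            output.append(rule)
        elif top == look:
            stack.pop()  # terminal on top matches the input: pop and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, saw {look!r} at position {pos}")


print(parse("+n10n1#"))
# [2, 3, 1, 5, 6, 10, 7, 6, 9, 8, 1, 5, 6, 10, 8]
```

The printed list matches the Output column of the final row of the trace.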
2 3 1 5 6 10 7 6 9 8 1 5 6 10 8 is the parse for + n 1 0 n 1
Parse tree, with each nonterminal labeled by the number of the rule applied to it:
E (2)
  O (3)
    +
  E (1)
    N (5)
      n
      I (6)
        D (10)
          1
        T (7)
          I (6)
            D (9)
              0
            T (8)
              ε
  E (1)
    N (5)
      n
      I (6)
        D (10)
          1
        T (8)
          ε
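Because the rule numbers are output in leftmost-derivation order, the parse tree can be rebuilt from the sequence alone. The following Python sketch illustrates this; the names `RULES`, `build_tree`, and `leaves`, and the tuple representation of tree nodes, are assumptions made for the example.

```python
# rule number -> (left-hand side, list of right-hand-side symbols)
RULES = {
    1: ("E", ["N"]), 2: ("E", ["O", "E", "E"]),
    3: ("O", ["+"]), 4: ("O", ["*"]),
    5: ("N", ["n", "I"]),
    6: ("I", ["D", "T"]),
    7: ("T", ["I"]), 8: ("T", []),  # rule 8 is T -> epsilon
}
for i in range(10):
    RULES[9 + i] = ("D", [str(i)])

NONTERMINALS = {"E", "O", "N", "I", "T", "D"}


def build_tree(rule_numbers):
    """Rebuild the parse tree from a leftmost-derivation rule sequence."""
    remaining = iter(rule_numbers)

    def expand(symbol):
        if symbol not in NONTERMINALS:
            return symbol  # terminals are leaves
        number = next(remaining, None)
        if number is None:
            raise ValueError("rule sequence ended too early")
        lhs, rhs = RULES[number]
        if lhs != symbol:
            raise ValueError(f"rule {number} does not expand {symbol}")
        # Children are expanded left to right, matching the output order.
        return (symbol, number, [expand(s) for s in rhs])

    tree = expand("E")
    if next(remaining, None) is not None:
        raise ValueError("unused rule numbers remain")
    return tree


def leaves(tree):
    """Concatenate the terminal leaves, i.e. the parsed string."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[2])


tree = build_tree([2, 3, 1, 5, 6, 10, 7, 6, 9, 8, 1, 5, 6, 10, 8])
print(leaves(tree))  # +n10n1
```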
Lexical Analyzer Motivation
Tokenization Simplifies Grammars
E → (1) N | (2) O E E
O → (3) + | (4) *
N → (5) <number>
2 3 1 5 1 5 becomes the parse for +n10n1, where the tokens are +, n10, and n1.
E (2)
  O (3)
    +
  E (1)
    N (5)
      n10
  E (1)
    N (5)
      n1
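A small lexical analyzer makes the simplification concrete. In the Python sketch below, the token pattern, the token kinds "op" and "number", and the function names `tokenize` and `parse` are assumptions for illustration; the rule numbers follow the simplified grammar, in which rule 5 is N → <number>.

```python
import re

# A token is either a number such as n10 or one of the operators + and *.
TOKEN_RE = re.compile(r"\s*(?:(n[0-9]+)|([+*]))")


def tokenize(text):
    """Split the input into (kind, lexeme) tokens."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"bad character at position {pos}")
        if match.group(1) is not None:
            tokens.append(("number", match.group(1)))
        else:
            tokens.append(("op", match.group(2)))
        pos = match.end()
    return tokens


def parse(tokens):
    """Return the rule numbers of the parse under the simplified grammar."""
    output = []
    pos = 0

    def expr():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        kind, lexeme = tokens[pos]
        pos += 1
        if kind == "op":
            output.append(2)                           # E -> O E E
            output.append(3 if lexeme == "+" else 4)   # O -> + | *
            expr()
            expr()
        else:
            output.append(1)                           # E -> N
            output.append(5)                           # N -> <number>

    expr()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return output


tokens = tokenize("+n10n1")
print(tokens)         # [('op', '+'), ('number', 'n10'), ('number', 'n1')]
print(parse(tokens))  # [2, 3, 1, 5, 1, 5]
```

The digit-level rules for I, T, and D disappear from the grammar because the lexical analyzer now recognizes whole numbers.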