Discussion #5

LL(1) Grammars & Table-Driven Parsing


Topics

• Approaches to Parsing
  – Full backtracking
  – Deterministic
• Simple LL(1), table-driven parsing
• Improvements to simple LL(1) grammars


Prefix Expression Grammar

• Consider the following grammar (which yields prefix expressions for binary operators):

    E → N | OEE
    O → + | - | * | /
    N → 0 | 1 | 2 | 3 | 4

• Here, prefix expressions associate an operator with the next two operands.

    * + 2 3 4   =   (* (+ 2 3) 4)   =   (2 + 3) * 4 = 20
    * 2 + 3 4   =   (* 2 (+ 3 4))   =   2 * (3 + 4) = 14
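The rule that an operator takes the next two operands can be sketched as a small recursive evaluator. This is only an illustration: the name `eval_prefix` and the whitespace-separated token format are assumptions, not part of the slides.

```python
import operator

# Binary operators of the grammar's O rule, mapped to Python functions.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_prefix(tokens):
    """Evaluate a prefix expression given as a list of tokens."""
    pos = 0

    def expr():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in OPS:              # E -> O E E: operator, then two operands
            left = expr()
            right = expr()
            return OPS[tok](left, right)
        return int(tok)             # E -> N: a single digit

    value = expr()
    if pos != len(tokens):
        raise ValueError("unused input after the expression")
    return value

print(eval_prefix("* + 2 3 4".split()))   # 20
print(eval_prefix("* 2 + 3 4".split()))   # 14
```

Each call to `expr` consumes exactly one E, which is why no parentheses are needed in prefix notation.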


Top-Down Parsing with Backtracking

    E → N | OEE
    O → + | - | * | /
    N → 0 | 1 | 2 | 3 | 4

Input: *+342

[Figure: search tree of the alternatives tried, in order, while parsing *+342 top-down with backtracking.]
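A backtracking recognizer for this grammar can be sketched as follows. It is a minimal illustration, not the procedure from the slides: the names `derive`, `parses`, and the `attempts` counter are assumptions, and every grammar symbol is taken to be a single character.

```python
GRAMMAR = {
    "E": ["N", "OEE"],
    "O": ["+", "-", "*", "/"],
    "N": ["0", "1", "2", "3", "4"],
}

attempts = 0  # number of productions tried, to make the cost visible

def derive(symbols, text, pos):
    """Yield every position at which `symbols` can finish matching text[pos:]."""
    global attempts
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each alternative in order, backing up on failure.
        for alternative in GRAMMAR[head]:
            attempts += 1
            yield from derive(alternative + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        # Terminal: it must match the current input symbol.
        yield from derive(rest, text, pos + 1)

def parses(text):
    """True if some derivation from E consumes the whole input."""
    return any(end == len(text) for end in derive("E", text, 0))

print(parses("*+342"), attempts)
```

Printing `attempts` shows how many productions were tried for a five-symbol input, most of which fail immediately.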


What are the obvious problems?

• We never know what production to try.
• It appears to be terribly inefficient, and it is.
• Are there grammars for which we can always know what rule to choose? Yes!
• Characteristics:
  – Only single-symbol look-ahead
  – Given a non-terminal and a current symbol, we always know which production rule to apply


LL(1) Parsers

• An LL parser parses the input from Left to right, and constructs a Leftmost derivation of the sentence.
• An LL(k) parser uses k tokens of look-ahead.
• LL(1) parsers, although fairly restrictive, are attractive because they only need to look at the current non-terminal and the next token to make their parsing decisions.
• LL(1) parsers require LL(1) grammars.


Simple LL(1) Grammars

For simple LL(1) grammars all rules have the form

    A → a_1 α_1 | a_2 α_2 | … | a_n α_n

where

• a_i is a terminal, 1 <= i <= n
• a_i ≠ a_j for i ≠ j, and
• α_i is a sequence of terminals and non-terminals or is empty, 1 <= i <= n


Creating Simple LL(1) Grammars

• Why is this not a simple LL(1) grammar?

    E → N | OEE
    O → + | - | * | /
    N → 0 | 1 | 2 | 3 | 4

  The alternatives for E begin with the non-terminals N and O rather than with terminals.

• How can we change it to simple LL(1)?

• By making all production rules of the form:

    A → a_1 α_1 | a_2 α_2 | … | a_n α_n

• Thus,

    E → 0 | 1 | 2 | 3 | 4 | +EE | -EE | *EE | /EE
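The substitution that turns the prefix grammar into a simple LL(1) grammar can be sketched as below. The helper name `inline_leading` is hypothetical, symbols are assumed to be single characters, and the grammar is assumed to have no left recursion (otherwise the loop would not terminate).

```python
def inline_leading(grammar, start):
    """Replace a leading non-terminal in each alternative of `start`
    by that non-terminal's own alternatives, until every alternative
    begins with a terminal."""
    result = []
    work = list(grammar[start])
    while work:
        alt = work.pop(0)
        if alt and alt[0] in grammar:
            # Substitute each alternative of the leading non-terminal.
            work = [sub + alt[1:] for sub in grammar[alt[0]]] + work
        else:
            result.append(alt)
    return result

GRAMMAR = {
    "E": ["N", "OEE"],
    "O": ["+", "-", "*", "/"],
    "N": ["0", "1", "2", "3", "4"],
}

print(inline_leading(GRAMMAR, "E"))
# ['0', '1', '2', '3', '4', '+EE', '-EE', '*EE', '/EE']
```

After the substitution, O and N are no longer reachable from E, so the single rule for E is the whole grammar.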


Example: LL(1) Parsing

    E → (1)0 | (2)1 | (3)2 | (4)3 | (5)4 | (6)+EE | (7)-EE | (8)*EE | (9)/EE

• Input * + 2 3 4: Success! Output = 8 6 3 4 5
• Input 2 * 3: Fail!

[Figure: parse trees for the two inputs, with each node labeled by the number of the rule applied.]


Simple LL(1) Parse Table

A parse table is defined as follows:

    (V ∪ {#}) × (VT ∪ {#}) → {(β, i), pop, accept, error}

where
– β is the right side of production number i
– # marks the end of the input string (# ∉ V)

If A ∈ (V ∪ {#}) is the symbol on top of the stack and a ∈ (VT ∪ {#}) is the current input symbol, then:

    ACTION(A, a) =
        pop        if A = a for a ∈ VT
        accept     if A = # and a = #
        (aα, i)    which means “pop, then push aα and output i” (A → aα is the ith production)
        error      otherwise
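The ACTION definition above can be sketched as a dictionary keyed by (stack symbol, input symbol). This is an illustration under stated assumptions: the function names `build_simple_table` and `action` are hypothetical, symbols are single characters, and rules are numbered from 1 in list order. The nine-rule grammar from the earlier example is used as input.

```python
END = "#"

def build_simple_table(rules):
    """rules: list of (lhs, rhs) pairs; rule i is the i-th entry (from 1).
    Every rhs must begin with a terminal (simple LL(1))."""
    nonterminals = {lhs for lhs, _ in rules}
    table = {}
    for number, (lhs, rhs) in enumerate(rules, start=1):
        if not rhs or rhs[0] in nonterminals:
            raise ValueError(f"rule {number} does not start with a terminal")
        key = (lhs, rhs[0])
        if key in table:
            raise ValueError(f"not simple LL(1): two rules for {key}")
        table[key] = (rhs, number)          # "pop, push rhs, output number"
    terminals = {s for _, rhs in rules for s in rhs if s not in nonterminals}
    for t in terminals:
        table[(t, t)] = "pop"               # matching terminal on the stack
    table[(END, END)] = "accept"
    return table

def action(table, top, symbol):
    """Look up ACTION(top, symbol); every missing entry is an error."""
    return table.get((top, symbol), "error")

RULES = [("E", rhs) for rhs in
         ["0", "1", "2", "3", "4", "+EE", "-EE", "*EE", "/EE"]]
TABLE = build_simple_table(RULES)

print(action(TABLE, "E", "*"))   # ('*EE', 8)
print(action(TABLE, "+", "+"))   # pop
print(action(TABLE, "E", "#"))   # error
```

Storing only the non-error entries mirrors the convention that blank table cells mean error.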


Parse Table

    E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE

Rows are the stack symbols V ∪ {#}; columns are the input symbols VT ∪ {#}.

     0       1       2       3       +         *         #
E    (0,1)   (1,2)   (2,3)   (3,4)   (+EE,5)   (*EE,6)
0    pop
1            pop
2                    pop
3                            pop
+                                    pop
*                                              pop
#                                                        accept

All blank entries are error.


              0       1       2       3       +         *         #
E             (0,1)   (1,2)   (2,3)   (3,4)   (+EE,5)   (*EE,6)
0,1,2,3,+,*   pop     pop     pop     pop     pop       pop
#                                                                 accept

The Stack, Input, and Output columns show the state after each action; Input is the part of the input not yet consumed.

Action                                  Stack    Input    Output
Initialize                              E#       *+123#
ACTION(E,*) = Replace [E,*EE], Out 6    *EE#     *+123#   6
ACTION(*,*) = pop(*,*)                  EE#      +123#    6
ACTION(E,+) = Replace [E,+EE], Out 5    +EEE#    +123#    65
ACTION(+,+) = pop(+,+)                  EEE#     123#     65
ACTION(E,1) = Replace [E,1], Out 2      1EE#     123#     652
ACTION(1,1) = pop(1,1)                  EE#      23#      652
ACTION(E,2) = Replace [E,2], Out 3      2E#      23#      6523
ACTION(2,2) = pop(2,2)                  E#       3#       6523
ACTION(E,3) = Replace [E,3], Out 4      3#       3#       65234
ACTION(3,3) = pop(3,3)                  #        #        65234
ACTION(#,#) = accept                    Done!
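The trace above follows a simple loop over the stack and the input. A minimal sketch of such a driver is given below; the function name `ll1_parse` and the dictionary encoding of the table are assumptions made for illustration, with the entries copied from the parse table for the six-rule grammar.

```python
def ll1_parse(table, start, text, end="#"):
    """Return the rule numbers output while parsing `text`.
    Raises SyntaxError when the table entry is an error."""
    stack = [end, start]            # top of the stack is the end of the list
    tokens = list(text) + [end]
    pos = 0
    output = []
    while True:
        top, symbol = stack[-1], tokens[pos]
        entry = table.get((top, symbol), "error")   # blank entries are errors
        if entry == "accept":
            return output
        if entry == "pop":          # terminal matched: pop it, advance input
            stack.pop()
            pos += 1
        elif entry == "error":
            raise SyntaxError(
                f"unexpected {symbol!r} at position {pos} with {top!r} on the stack")
        else:                       # (rhs, i): replace the top by rhs, output i
            rhs, number = entry
            stack.pop()
            stack.extend(reversed(rhs))   # leftmost symbol of rhs ends up on top
            output.append(number)

# Table for E -> (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE
TABLE = {("E", "0"): ("0", 1), ("E", "1"): ("1", 2), ("E", "2"): ("2", 3),
         ("E", "3"): ("3", 4), ("E", "+"): ("+EE", 5), ("E", "*"): ("*EE", 6),
         ("#", "#"): "accept"}
TABLE.update({(t, t): "pop" for t in "0123+*"})

print(ll1_parse(TABLE, "E", "*+123"))   # [6, 5, 2, 3, 4]
```

The driver never backs up: each step is decided by one table lookup on the stack top and the current input symbol.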


Simple LL(1): More Restrictive than Necessary

• Simple LL(1) grammars are very easy and efficient to parse but also very restrictive.

• The good news: we can achieve the same desirable results without being so restrictive.

• How? We only need to retain the restriction that single-symbol look ahead uniquely determines which rule to use.


Relaxing Simple LL(1) Restrictions

• Consider the following grammar, which is not simple LL(1):

    E → (1)N | (2)OEE
    O → (3)+ | (4)*
    N → (5)0 | (6)1 | (7)2 | (8)3

• What are the problem rules? (1) & (2)
• Observe that it is possible to distinguish between rules 1 and 2.
  – N leads to {0, 1, 2, 3}
  – O leads to {+, *}
  – {0, 1, 2, 3} ∩ {+, *} = ∅
  – Thus, if we see 0, 1, 2, or 3 we choose (1), and if we see + or *, we choose (2).


LL(1) Grammars

• FIRST(α) = {a | α ⇒* aβ and a ∈ VT}

• A grammar is LL(1) if for all rules of the form

    A → α_1 | α_2 | … | α_n

the sets

    FIRST(α_1), FIRST(α_2), …, and FIRST(α_n)

are pair-wise disjoint; that is,

    FIRST(α_i) ∩ FIRST(α_j) = ∅ for i ≠ j
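The disjointness test can be sketched in code for grammars like the ones in these slides. The sketch assumes single-character symbols and no empty right-hand sides; the names `first_sets`, `first_of`, and `is_ll1` are hypothetical.

```python
from itertools import combinations

def first_of(alt, grammar, first):
    """FIRST of one alternative (no empty right-hand sides assumed)."""
    head = alt[0]
    return first[head] if head in grammar else {head}

def first_sets(grammar):
    """FIRST set of every non-terminal, computed by fixed-point iteration."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of(alt, grammar, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def is_ll1(grammar):
    """True if the FIRST sets of each rule's alternatives are pairwise disjoint."""
    first = first_sets(grammar)
    for alternatives in grammar.values():
        for a, b in combinations(alternatives, 2):
            if first_of(a, grammar, first) & first_of(b, grammar, first):
                return False
    return True

GRAMMAR = {"E": ["N", "OEE"], "O": ["+", "*"], "N": ["0", "1", "2", "3"]}
print(sorted(first_sets(GRAMMAR)["E"]))     # ['*', '+', '0', '1', '2', '3']
print(is_ll1(GRAMMAR))                      # True
print(is_ll1({"S": ["ab", "ac"]}))          # False: both alternatives start with a
```

Grammars with empty right-hand sides need more machinery than this sketch covers.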


    E → (1)N | (2)OEE
    O → (3)+ | (4)*
    N → (5)0 | (6)1 | (7)2 | (8)3

Rows are the stack symbols V ∪ {#}; columns are the input symbols VT ∪ {#}.

     +         *         0       1       2       3       #
E    (OEE,2)   (OEE,2)   (N,1)   (N,1)   (N,1)   (N,1)
O    (+,3)     (*,4)
N                        (0,5)   (1,6)   (2,7)   (3,8)
+    pop
*              pop
0                        pop
1                                pop
2                                        pop
3                                                pop
#                                                        accept

For (A, a), we select (α, i) if a ∈ FIRST(α) and α is the right-hand side of rule i.
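The selection rule for (A, a) can be sketched as a table builder. As before, this is an illustration under assumptions: `build_ll1_table` is a hypothetical name, symbols are single characters, rules are numbered from 1 in list order, and no right-hand side is empty.

```python
def build_ll1_table(rules, end="#"):
    """rules: list of (lhs, rhs) strings; rule i is the i-th entry (from 1)."""
    nonterminals = {lhs for lhs, _ in rules}

    def first(alpha, seen=frozenset()):
        """Terminals that can begin a string derived from alpha."""
        head = alpha[0]
        if head not in nonterminals:
            return {head}
        if head in seen:                # already being expanded on this path
            return set()
        result = set()
        for lhs, rhs in rules:
            if lhs == head:
                result |= first(rhs, seen | {head})
        return result

    table = {}
    for number, (lhs, rhs) in enumerate(rules, start=1):
        for a in first(rhs):            # select (rhs, number) for each a in FIRST(rhs)
            if (lhs, a) in table:
                raise ValueError(f"not LL(1): conflict at ({lhs}, {a})")
            table[(lhs, a)] = (rhs, number)
    for _, rhs in rules:
        for s in rhs:
            if s not in nonterminals:
                table[(s, s)] = "pop"
    table[(end, end)] = "accept"
    return table

RULES = [("E", "N"), ("E", "OEE"), ("O", "+"), ("O", "*"),
         ("N", "0"), ("N", "1"), ("N", "2"), ("N", "3")]
TABLE = build_ll1_table(RULES)

print(TABLE[("E", "+")])   # ('OEE', 2)
print(TABLE[("E", "0")])   # ('N', 1)
print(TABLE[("N", "3")])   # ('3', 8)
```

A conflict raised while filling the table is exactly the case where two FIRST sets for the same non-terminal overlap.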


              +         *         0       1       2       3       #
E             (OEE,2)   (OEE,2)   (N,1)   (N,1)   (N,1)   (N,1)
O             (+,3)     (*,4)
N                                 (0,5)   (1,6)   (2,7)   (3,8)
+,*,0,1,2,3   pop       pop       pop     pop     pop     pop
#                                                                 accept

The Stack, Input, and Output columns show the state after each action; Input is the part of the input not yet consumed.

Action                                  Stack    Input    Output
Initialize                              E#       *+123#
ACTION(E,*) = Replace [E,OEE], Out 2    OEE#     *+123#   2
ACTION(O,*) = Replace [O,*], Out 4      *EE#     *+123#   24
ACTION(*,*) = pop(*,*)                  EE#      +123#    24
ACTION(E,+) = Replace [E,OEE], Out 2    OEEE#    +123#    242
ACTION(O,+) = Replace [O,+], Out 3      +EEE#    +123#    2423
ACTION(+,+) = pop(+,+)                  EEE#     123#     2423
ACTION(E,1) = Replace [E,N], Out 1      NEE#     123#     24231
ACTION(N,1) = Replace [N,1], Out 6      1EE#     123#     242316
ACTION(1,1) = pop(1,1)                  EE#      23#      242316
ACTION(E,2) = Replace [E,N], Out 1      NE#      23#      2423161
ACTION(N,2) = Replace [N,2], Out 7      2E#      23#      24231617
ACTION(2,2) = pop(2,2)                  E#       3#       24231617
ACTION(E,3) = Replace [E,N], Out 1      N#       3#       242316171
ACTION(N,3) = Replace [N,3], Out 8      3#       3#       2423161718
ACTION(3,3) = pop(3,3)                  #        #        2423161718
ACTION(#,#) = accept                    Done!


What does 2 4 2 3 1 6 1 7 1 8 mean?

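    E → (1)N | (2)OEE
    O → (3)+ | (4)*
    N → (5)0 | (6)1 | (7)2 | (8)3

Parse tree, with each node labeled by the rule applied to it:

    E → (2) OEE
        O → (4) *
        E → (2) OEE
            O → (3) +
            E → (1) N
                N → (6) 1
            E → (1) N
                N → (7) 2
        E → (1) N
            N → (8) 3

2 4 2 3 1 6 1 7 1 8 defines a parse tree via a preorder traversal.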
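Reading the output as a preorder traversal can be sketched as follows: each rule number expands the leftmost non-terminal not yet expanded. The names `build_tree` and `leaves` and the tuple representation of tree nodes are assumptions made for illustration.

```python
RULES = {1: ("E", "N"), 2: ("E", "OEE"), 3: ("O", "+"), 4: ("O", "*"),
         5: ("N", "0"), 6: ("N", "1"), 7: ("N", "2"), 8: ("N", "3")}
NONTERMINALS = {"E", "O", "N"}

def build_tree(rule_numbers, start="E"):
    """Rebuild the parse tree from a preorder list of rule numbers.
    A node is (symbol, rule number, children); a leaf is a terminal string."""
    numbers = iter(rule_numbers)

    def expand(symbol):
        if symbol not in NONTERMINALS:
            return symbol                       # terminal leaf
        number = next(numbers)                  # next rule in preorder
        lhs, rhs = RULES[number]
        if lhs != symbol:
            raise ValueError(f"rule {number} does not expand {symbol}")
        return (symbol, number, [expand(s) for s in rhs])

    tree = expand(start)
    if next(numbers, None) is not None:
        raise ValueError("unused rule numbers")
    return tree

def leaves(tree):
    """Concatenate the terminal leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[2])

tree = build_tree([2, 4, 2, 3, 1, 6, 1, 7, 1, 8])
print(leaves(tree))   # *+123
```

The leaves spell the original input, which confirms that the rule sequence alone is enough to recover the tree.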