Chuen-Liang Chen, NTUCS&IE / TOP-DOWN PARSING Chuen-Liang Chen Department of Computer Science and Information Engineering National Taiwan University Taipei, TAIWAN


TOP-DOWN PARSING

Chuen-Liang Chen

Department of Computer Science

and Information Engineering

National Taiwan University

Taipei, TAIWAN


Parser (2/5)

void system_goal(void)
{
    /* <system goal> ::= <program> SCANEOF */
    program();
    match(SCANEOF);
}

void program(void)
{
    /* <program> ::= BEGIN <statement list> END */
    match(BEGIN);
    statement_list();
    match(END);
}

void statement_list(void)
{
    /* <statement list> ::= <statement> { <statement> } */
    statement();
    while (TRUE) {
        switch (next_token()) {
        case ID:
        case READ:
        case WRITE:
            statement();
            break;
        default:
            return;
        }
    }
}

QUIZ: Why ID, READ, WRITE?
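The routines above call match(), next_token() and statement(), which this slide does not show. The following is a minimal, self-contained sketch under stated assumptions, not the course's actual code: the token stream is a plain array ending in SCANEOF, next_token() only peeks, match() counts errors instead of aborting, and <statement> is cut down to three fixed forms in which every expression or list is a single ID. The names input, pos, errors and parse are hypothetical. The switch in statement_list() tests ID, READ and WRITE because those are the tokens that can begin a <statement>.

```c
#include <assert.h>
#include <stddef.h>

#define TRUE 1

typedef enum { BEGIN, END, ID, READ, WRITE, ASSIGN, SEMICOLON,
               LPAREN, RPAREN, SCANEOF } token;

static const token *input;  /* hypothetical token stream, must end with SCANEOF */
static size_t pos;          /* index of the current lookahead token */
static int errors;          /* number of syntax errors seen so far */

/* Assumption: next_token() peeks at the lookahead without consuming it. */
static token next_token(void) { return input[pos]; }

/* Consume the lookahead if it is the expected token, else count an error. */
static void match(token expected)
{
    if (input[pos] != expected)
        errors++;
    else if (expected != SCANEOF)
        pos++;              /* never advance past SCANEOF */
}

/* Simplified <statement>: every expression or list is a single ID. */
static void statement(void)
{
    switch (next_token()) {
    case ID:     /* ID := ID ; */
        match(ID); match(ASSIGN); match(ID); match(SEMICOLON);
        break;
    case READ:   /* READ ( ID ) ; */
        match(READ); match(LPAREN); match(ID); match(RPAREN); match(SEMICOLON);
        break;
    case WRITE:  /* WRITE ( ID ) ; */
        match(WRITE); match(LPAREN); match(ID); match(RPAREN); match(SEMICOLON);
        break;
    default:
        errors++;
        break;
    }
}

static void statement_list(void)
{
    statement();
    while (TRUE) {
        switch (next_token()) {
        case ID: case READ: case WRITE:   /* tokens that can start a statement */
            statement();
            break;
        default:
            return;
        }
    }
}

static void program(void) { match(BEGIN); statement_list(); match(END); }

static void system_goal(void) { program(); match(SCANEOF); }

/* Returns 1 if the token array is a valid program, 0 otherwise. */
static int parse(const token *toks)
{
    assert(toks != NULL);
    input = toks; pos = 0; errors = 0;
    system_goal();
    return errors == 0;
}
```

With these definitions, parse() returns 1 for the token sequence BEGIN ID ASSIGN ID SEMICOLON END SCANEOF and 0 if the second ID is left out.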


Example of top-down parsing

trace of top-down parsing (left-most derivation)

orange: just derived (predicted)
blue: just read (matched)
black: derived or read
green: un-processed (parse stack)

[Figure: successive parse trees of the trace, built from the symbols E, Prefix, Tail, F, V, +, ( and ). The individual trees are not recoverable from the extracted text.]

Why this production rule?


Predict set (1/2)

top-down parsing predicts the production that is to be matched before matching actually begins

the predict set indicates which production rule to apply, given the nonterminal encountered and the lookahead symbol(s)

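$$
\mathrm{Predict}_k(A \to X_1 \cdots X_m) =
\begin{cases}
\mathrm{First}_k(X_1 \cdots X_m) & \text{if } \lambda \notin \mathrm{First}_k(X_1 \cdots X_m) \\
\bigl(\mathrm{First}_k(X_1 \cdots X_m) - \{\lambda\}\bigr) \cup \mathrm{Follow}_k(A) & \text{otherwise}
\end{cases}
$$

for LL(k) parsing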

usage -- e.g., to construct a recursive descent parser

void non_term(void)
{
    token tok = next_token();
    switch (tok) {
    case Predict_set:   /* one case label per token in the production's Predict set */
        parsing_actions();
        break;
    default:
        syntax_error(tok);
        break;
    }
}

QUIZ: how to construct, automatically?
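One common way to compute Predict sets mechanically for k = 1 is a fixed-point iteration over First and Follow. The sketch below is an illustration under stated assumptions, not the method prescribed by these slides: sets are bitmasks over terminals, one extra bit stands for λ, and the four-production grammar, the names G, first_of, compute_sets and predict, and the TOK_ constants are all hypothetical.

```c
#include <assert.h>

enum { TOK_PLUS, TOK_ID, TOK_EOF, NTERM,   /* terminals; TOK_EOF is the end marker $ */
       E = NTERM, ETAIL, T, NSYM };        /* nonterminals */

#define LAMBDA (1u << NTERM)               /* extra bit: "λ is in this First set" */

typedef struct { int lhs; int len; int rhs[3]; } production;

/* Illustrative grammar (an assumption, chosen to be small). */
static const production G[] = {
    { E,     2, { T, ETAIL } },            /* 0: E     -> T Etail   */
    { ETAIL, 3, { TOK_PLUS, T, ETAIL } },  /* 1: Etail -> + T Etail */
    { ETAIL, 0, { 0 } },                   /* 2: Etail -> λ         */
    { T,     1, { TOK_ID } },              /* 3: T     -> ID        */
};
enum { NPROD = sizeof G / sizeof G[0] };

static unsigned first[NSYM], follow[NSYM];

/* First of the string s[0..n-1]; has LAMBDA iff every symbol can derive λ. */
static unsigned first_of(const int *s, int n)
{
    unsigned set = 0;
    for (int i = 0; i < n; i++) {
        set |= first[s[i]] & ~LAMBDA;
        if (!(first[s[i]] & LAMBDA))
            return set;
    }
    return set | LAMBDA;
}

/* Grow First and Follow until a full pass changes nothing. */
static void compute_sets(void)
{
    int changed = 1;
    for (int s = 0; s < NSYM; s++) {
        first[s] = (s < NTERM) ? 1u << s : 0u;   /* First(a) = {a} for a terminal */
        follow[s] = 0u;
    }
    follow[E] = 1u << TOK_EOF;                   /* start symbol is followed by $ */
    while (changed) {
        changed = 0;
        for (int p = 0; p < NPROD; p++) {
            const production *r = &G[p];
            unsigned f = first[r->lhs] | first_of(r->rhs, r->len);
            if (f != first[r->lhs]) { first[r->lhs] = f; changed = 1; }
            for (int i = 0; i < r->len; i++) {
                int x = r->rhs[i];
                if (x < NTERM)
                    continue;                    /* Follow is kept for nonterminals only */
                unsigned rest = first_of(r->rhs + i + 1, r->len - i - 1);
                unsigned fo = follow[x] | (rest & ~LAMBDA);
                if (rest & LAMBDA)
                    fo |= follow[r->lhs];
                if (fo != follow[x]) { follow[x] = fo; changed = 1; }
            }
        }
    }
}

/* Predict set of production p, following the LL(1) case of the definition above. */
static unsigned predict(int p)
{
    assert(p >= 0 && p < NPROD);
    unsigned f = first_of(G[p].rhs, G[p].len);
    return (f & LAMBDA) ? ((f & ~LAMBDA) | follow[G[p].lhs]) : f;
}
```

After compute_sets(), predict(2) contains only the end marker, because Etail -> λ is predicted from Follow(Etail), while the other three productions are predicted from First of their right-hand sides.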


Predict set (2/2)

example for LL(1) Predict set

    production                                            Predict set
 1. <program> → begin <statement list> end                begin
 2. <statement list> → <statement> <statement tail>       First(rhs) = ID read write
 3. <statement tail> → <statement> <statement tail>       First(rhs) = ID read write
 4. <statement tail> → λ                                  Follow(lhs) = end
 5. <statement> → ID := <expression> ;                    ID
 6. <statement> → read ( <id list> ) ;                    read
 7. <statement> → write ( <expr list> ) ;                 write
 8. <id list> → ID <id tail>                              ID
 9. <id tail> → , ID <id tail>                            ,
10. <id tail> → λ                                         Follow(lhs) = )
11. <expr list> → <expression> <expr tail>                First(rhs) = ID INTLIT (
12. <expr tail> → , <expression> <expr tail>              ,
13. <expr tail> → λ                                       Follow(lhs) = )
14. <expression> → <primary> <primary tail>               First(rhs) = ID INTLIT (
15. <primary tail> → <add op> <primary> <primary tail>    First(rhs) = + -
16. <primary tail> → λ                                    Follow(lhs) = , ; )
17. <primary> → ( <expression> )                          (
18. <primary> → ID                                        ID
19. <primary> → INTLIT                                    INTLIT
20. <add op> → +                                          +
21. <add op> → -                                          -
22. <system goal> → <program> $                           First(rhs) = begin

extended from BNF to CFG


LL(k) parse table

example for LL(1)

[Table: LL(1) parse table; its entries are not recoverable from the extracted text.]

blank entry -- syntax error

QUIZ: parse table size for LL(k) parsing?

QUIZ: Is LL(k), k ≥ 2, practical?


LL(k) grammar

the grammar -- unique prediction for each combination of nonterminal and lookahead symbol(s) (entry of parse table)

unambiguous grammar

it is usually (but not always) possible to create an LL(1) grammar for a programming language

predict conflicts

common prefix

left recursion

and so on


Common prefix elimination

<stmt> → if <expr> then <stmt list> endif ;

<stmt> → if <expr> then <stmt list> else <stmt list> endif ;

is factored into

<stmt> → if <expr> then <stmt list> <if suffix>

<if suffix> → endif ;

<if suffix> → else <stmt list> endif ;

QUIZ: how, systematically?
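Once the common prefix is factored out, one token of lookahead is enough to pick the <if suffix> alternative. The sketch below shows this in recursive descent style under stated assumptions: <if suffix> carries the closing endif ; itself, <expr> is a single EXPR token, <stmt list> is one or more statements, and OTHER ; stands in for every non-if statement. The names tok, cur, ok, expect and parse_stmt are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { IF, EXPR, THEN, ELSE, ENDIF, SEMI, OTHER, EOI } tok;

static const tok *cur;  /* lookahead; the array must end with EOI */
static int ok;          /* cleared on the first syntax error */

/* Consume the lookahead if it is t, else record an error. */
static void expect(tok t)
{
    if (*cur != t)
        ok = 0;
    else if (t != EOI)
        cur++;
}

static void stmt(void);

/* Assumed shape: <stmt list> -> <stmt> { <stmt> } */
static void stmt_list(void)
{
    stmt();
    while (ok && (*cur == IF || *cur == OTHER))
        stmt();
}

/* The two alternatives now start with different tokens. */
static void if_suffix(void)
{
    switch (*cur) {
    case ENDIF:   /* <if suffix> -> endif ; */
        expect(ENDIF); expect(SEMI);
        break;
    case ELSE:    /* <if suffix> -> else <stmt list> endif ; */
        expect(ELSE); stmt_list(); expect(ENDIF); expect(SEMI);
        break;
    default:
        ok = 0;
        break;
    }
}

static void stmt(void)
{
    if (*cur == IF) {
        /* <stmt> -> if <expr> then <stmt list> <if suffix> */
        expect(IF); expect(EXPR); expect(THEN); stmt_list(); if_suffix();
    } else {
        /* stand-in for every other kind of statement */
        expect(OTHER); expect(SEMI);
    }
}

/* Returns 1 if the token array is exactly one <stmt>, 0 otherwise. */
static int parse_stmt(const tok *toks)
{
    assert(toks != NULL);
    cur = toks; ok = 1;
    stmt();
    return ok && *cur == EOI;
}
```

Before factoring, both alternatives of <stmt> began with if, so no single lookahead token could separate them. After factoring, the decision is deferred to if_suffix(), where endif and else tell the alternatives apart.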


Left recursion elimination

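left-recursive grammar    intermediate step        after elimination

E → E + T                 E → E1 Etail             E → T Etail
E → T                     E1 → T                   Etail → + T Etail
T → T * P                 Etail → + T Etail        Etail → λ
T → P                     Etail → λ                T → P Ttail
P → ID                    T → T1 Ttail             Ttail → * P Ttail
                          T1 → P                   Ttail → λ
                          Ttail → * P Ttail        P → ID
                          Ttail → λ
                          P → ID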

QUIZ: why conflict?

QUIZ: how, systematically?
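A routine for E → E + T would call itself before consuming any input and never return, which is why the left-recursive form cannot be used directly in a top-down parser. The transformed grammar has no such problem. The sketch below is a minimal recognizer for it, assuming single-character tokens where 'i' stands for ID; the names cur, ok and parse_expr are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static const char *cur; /* lookahead character; 'i' stands for the token ID */
static int ok;          /* cleared on the first syntax error */

static void P(void)     /* P -> ID */
{
    if (*cur == 'i')
        cur++;
    else
        ok = 0;
}

static void Ttail(void) /* Ttail -> * P Ttail | λ */
{
    if (*cur == '*') {
        cur++;
        P();
        Ttail();
    }
    /* any other lookahead predicts λ: consume nothing */
}

static void T(void)     /* T -> P Ttail */
{
    P();
    Ttail();
}

static void Etail(void) /* Etail -> + T Etail | λ */
{
    if (*cur == '+') {
        cur++;
        T();
        Etail();
    }
    /* any other lookahead predicts λ: consume nothing */
}

static void E(void)     /* E -> T Etail */
{
    T();
    Etail();
}

/* Returns 1 if the whole string is derivable from E, 0 otherwise. */
static int parse_expr(const char *s)
{
    assert(s != NULL);
    cur = s; ok = 1;
    E();
    return ok && *cur == '\0';
}
```

Every recursive call in Etail() and Ttail() is preceded by consuming one character, so the recursion depth is bounded by the input length. For example, parse_expr("i+i*i") accepts and parse_expr("i+") rejects.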


Other predict conflict eliminations

<stmt> → <label> <unlabeled stmt>
<label> → ID :
<label> → λ
<unlabeled stmt> → ID := <expr> ;

is rewritten as

<stmt> → ID <id suffix>
<id suffix> → : <unlabeled stmt>
<id suffix> → := <expr> ;
<unlabeled stmt> → ID := <expr> ;

<array bound> → <expr> .. <expr>
<array bound> → ID

is rewritten as

<array bound> → <expr> <bound tail>
<bound tail> → .. <expr>
<bound tail> → λ


LL(1) parser driver

using a parse stack to keep predicted but unprocessed symbols

QUIZ: time complexity?

QUIZ: space complexity?

QUIZ: recursive descent vs. LL(1)?

void lldriver(void)
{
    /* Push the Start Symbol onto an empty stack */
    push(S);
    while ( ! stack_empty() ) {
        /* Let X be the top stack symbol; */
        /* let a be the current input token */
        if (is_nonterminal(X) && T[X][a] == X → Y1 … Ym) {
            /* Expand nonterminal */
            Replace X with Y1 … Ym on the stack;
        } else if (is_terminal(X) && X == a) {
            pop(1);         /* Match of X worked */
            scanner(&a);    /* Get next token */
        } else if (is_action_symbol(X)) {
            pop(1);
            Call Semantic Routine corresponding to X;
        } else
            /* Process syntax error */
    }
}
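The driver above is pseudocode. A runnable version needs a concrete table and stack, so the sketch below fixes them under stated assumptions: a four-production illustrative grammar that is not taken from the slides, a hand-built parse table, a fixed-size array as the parse stack, and no action symbols. The names rhs, table and ll1_parse are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { PLUS, ID, EOI, NTERM,        /* terminals; EOI is the end marker $ */
       E = NTERM, ETAIL, T, NSYM }; /* nonterminals */

/* Right-hand sides, each terminated by -1 (illustrative grammar):
   0: E -> T Etail   1: Etail -> + T Etail   2: Etail -> λ   3: T -> ID */
static const int rhs[4][4] = {
    { T, ETAIL, -1 },
    { PLUS, T, ETAIL, -1 },
    { -1 },
    { ID, -1 },
};

/* table[X - NTERM][a] = production to predict; -1 is a blank (error) entry */
static const int table[3][NTERM] = {
    /*            +   ID    $  */
    /* E     */ { -1,  0,  -1 },
    /* Etail */ {  1, -1,   2 },
    /* T     */ { -1,  3,  -1 },
};

/* Returns 1 if the EOI-terminated token array is accepted, 0 on a syntax error. */
static int ll1_parse(const int *input)
{
    int stack[64];
    int top = 0;

    assert(input != NULL);
    stack[top++] = EOI;             /* bottom marker, matched against $ */
    stack[top++] = E;               /* start symbol */
    while (top > 0) {
        int x = stack[top - 1];     /* top stack symbol */
        int a = *input;             /* current input token */
        if (a < 0 || a >= NTERM)
            return 0;               /* not a known terminal */
        if (x >= NTERM) {           /* nonterminal: expand by the predicted production */
            int p = table[x - NTERM][a];
            int n = 0;
            if (p < 0)
                return 0;           /* blank entry: syntax error */
            top--;
            while (rhs[p][n] != -1)
                n++;
            if (top + n > 64)
                return 0;           /* stack would overflow */
            for (int i = n - 1; i >= 0; i--)
                stack[top++] = rhs[p][i];   /* push reversed so Y1 ends on top */
        } else if (x == a) {        /* terminal: match and advance */
            top--;
            if (a != EOI)
                input++;
        } else {
            return 0;               /* terminal mismatch */
        }
    }
    return 1;
}
```

The right-hand side is pushed in reverse order so that its leftmost symbol is processed first, which is what makes the trace a left-most derivation.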


Tracing example (1/2)

Step  Remaining Input            Parse Stack                          Action
(1)   begin A:=BB-314+A; end $   <s.g.>                               Predict 22
(2)   begin A:=BB-314+A; end $   <p> $                                Predict 1
(3)   begin A:=BB-314+A; end $   begin <s.l.> end $                   Match
(4)   A:=BB-314+A; end $         <s.l.> end $                         Predict 2
(5)   A:=BB-314+A; end $         <s> <s.t.> end $                     Predict 5
(6)   A:=BB-314+A; end $         ID := <e> ; <s.t.> end $             Match
(7)   :=BB-314+A; end $          := <e> ; <s.t.> end $                Match
(8)   BB-314+A; end $            <e> ; <s.t.> end $                   Predict 14
(9)   BB-314+A; end $            <p> <p.t.> ; <s.t.> end $            Predict 18
(10)  BB-314+A; end $            ID <p.t.> ; <s.t.> end $             Match
(11)  -314+A; end $              <p.t.> ; <s.t.> end $                Predict 15
(12)  -314+A; end $              <a.o.> <p> <p.t.> ; <s.t.> end $     Predict 21
(13)  -314+A; end $              - <p> <p.t.> ; <s.t.> end $          Match


Tracing example (2/2)

Step  Remaining Input            Parse Stack                          Action
(14)  314+A; end $               <p> <p.t.> ; <s.t.> end $            Predict 19
(15)  314+A; end $               INTLIT <p.t.> ; <s.t.> end $         Match
(16)  +A; end $                  <p.t.> ; <s.t.> end $                Predict 15
(17)  +A; end $                  <a.o.> <p> <p.t.> ; <s.t.> end $     Predict 20
(18)  +A; end $                  + <p> <p.t.> ; <s.t.> end $          Match
(19)  A; end $                   <p> <p.t.> ; <s.t.> end $            Predict 18
(20)  A; end $                   ID <p.t.> ; <s.t.> end $             Match
(21)  ; end $                    <p.t.> ; <s.t.> end $                Predict 16
(22)  ; end $                    ; <s.t.> end $                       Match
(23)  end $                      <s.t.> end $                         Predict 4
(24)  end $                      end $                                Match
(25)  $                          $                                    Match


Dangling else problem (1/2)

a notable exception: the dangling else construct is not LL(1), or even LL(k) for any k, but it can be handled by bottom-up parsing

QUIZ: why?

if <cond> if <cond> <stmt> else <stmt>

[Figure: two parse trees for this statement, built from <stmt>, if, <cond> and else. The tree shapes are not recoverable from the extracted text.]


Dangling else problem (2/2)

solution 1 -- ambiguous grammar + special handling ( else associates with nearest if )

1. G → S ;
2. S → if S E
3. S → Other
4. E → else S
5. E → λ

parse table for solution 1 (entries are production numbers)

      if    else    Other   ;
S     2             3
E           4 5             5
G     1             1

solution 2 -- change language structure

G → S ;
S → if S E
S → Other
E → else S endif
E → endif
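The special handling in solution 1 can be sketched in recursive descent style: whenever the lookahead is else, the parser always takes E → else S and never E → λ. The sketch below assumes single-character tokens ('i' for if, 'e' for else, 'o' for Other, ';' for the terminator) and records which if the most recent else was attached to. The names cur, ok, n_if, else_owner and parse_G are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static const char *cur; /* lookahead: 'i' = if, 'e' = else, 'o' = Other, ';' = ; */
static int ok;          /* cleared on the first syntax error */
static int n_if;        /* how many ifs have been seen; ifs are numbered from 1 */
static int else_owner;  /* number of the if that owns the last else, 0 if none */

static void S(void);

/* E -> else S | λ, where owner is the number of the enclosing if */
static void E(int owner)
{
    if (*cur == 'e') {
        /* conflicting table entry: always choose production 4 */
        cur++;
        else_owner = owner;
        S();
    }
    /* any other lookahead: production 5, E -> λ */
}

static void S(void)
{
    if (*cur == 'i') {          /* S -> if S E */
        int id = ++n_if;
        cur++;
        S();
        E(id);
    } else if (*cur == 'o') {   /* S -> Other */
        cur++;
    } else {
        ok = 0;
    }
}

/* G -> S ;   Returns 1 if the whole string is accepted, 0 otherwise. */
static int parse_G(const char *s)
{
    assert(s != NULL);
    cur = s; ok = 1; n_if = 0; else_owner = 0;
    S();
    if (*cur == ';')
        cur++;
    else
        ok = 0;
    return ok && *cur == '\0';
}
```

For the input "iioeo;" the else is seen while the inner if is still being parsed, so else_owner ends up as 2: the else associates with the nearest if.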