42
CS 321 Programming Languages and Compilers Bottom Up Parsing

CS 321 Programming Languages and Compilers Bottom Up Parsing

Embed Size (px)

Citation preview

CS 321Programming Languages and

Compilers

CS 321Programming Languages and

Compilers

Bottom Up Parsing

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing22

Bottom-up Parsing: Shift-reduce parsingBottom-up Parsing: Shift-reduce parsing

Grammar H: L E;L | E

E a | b

Input: a;a;b

has parse tree LL;E

L;E

baa

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing33

Data for Shift-reduce ParserData for Shift-reduce Parser

• Input string: sequence of tokens being checked for grammatical correctness

• Stack: sentential form representing the input seen so far

• Trees Constructed: the parse trees that have been constructed so far at some point in the parsing

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing44

Operations of shift-reduce parserOperations of shift-reduce parser

• Shift: move input token to the set of trees as a singleton tree.

• Reduce: coalesce one or more trees into single tree (according to some production).

• Accept: terminate and accept the token stream as grammatically correct.

• Reject: Terminate and reject the token stream.

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing55

A parse for grammar HA parse for grammar H

Input Stack Action Treesa;a;b Shift

;a;b a Reduce E a

;a;b E Shift

a;b E; Shift

a

E

a

E

a

;

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing66

A parse for grammar HA parse for grammar H

Input Stack Action Trees;b E;a Reduce E a

;b E;E Shift

b E;E; Shift

E

a

; a

E

a

; E

a

E

a

; E

a

;

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing77

A parse for grammar HA parse for grammar H

Input Stack Action Trees$ E;E;b Reduce Eb

$ E;E;E Reduce LE

$ E;E;L Reduce LE;L

E

a

; E

a

; b

E

a

; E

a

; E

b

E

a

; E

a

;

E

b

L

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing88

A parse for grammar HA parse for grammar HInput Stack Action Trees

$ E;L Reduce LE;L

$ L Acc

E

a

; E

a

;

E

b

L

L

E

a

; E

a

;

E

b

L

LL

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing99

Bottom-up ParsingBottom-up Parsing

• Characteristic automata for an “LR” grammar: tells when to shift/reduce/accept or reject

• handles

• viable prefixes

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1010

A parse for grammar HA parse for grammar H

Input Stack Action Stack+Inputa;a;b Shift a;a;b ;a;b a Reduce E a a;a;b ;a;b E Shift E;a;b a;b E; Shift E;a;b ;b E;a Reduce E a E;a;b ;b E;E Shift E;E;b b E;E; Shift E;E;b $ E;E;b Reduce Eb E;E;b $ E;E;E Reduce LE E;E;E$ E;E;L Reduce LE;L E;E;L$ E;L Reduce LE;L E;L$ L Accept L

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1111

“Bottom-up” parsing?“Bottom-up” parsing?Stack+Input

LE;LE;E;LE;E;EE;E;b E;E;b E;E;b E;a;b E;a;b E;a;b a;a;b a;a;b

E

a

; E

a

;

E

b

L

LL

E

a

; E

a

;

E

b

L

LL

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1212

Handles and viable prefixesHandles and viable prefixes

• Stack+remaining input = sentential form

• Handle = the part of the sentential form that is reduced in each step

• Viable prefix = the prefix of the sentential form in a right-most derivation that do not extend beyond the end of the handle

• E.g. viable prefixes for H: (E;)*(E | L | a | b)

• Viable prefixes form a regular set.

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1313

Characteristic Finite State Machine (CFSM)Characteristic Finite State Machine (CFSM)

Viable prefixes of H are recognized by this CFSM:

0

1

2

3

4

E

L

a

b

5 6E

a

b

L;

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1414

How a Bottom-Up Parser WorksHow a Bottom-Up Parser Works

• Run the CFSM on symbols in the stack

• If a transition possible on the incoming input symbol, then shift, else reduce.

– Still need to decide which rule to use for the reduction.

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1515

Characteristic automatonCharacteristic automaton

0

1

2

3

4

E

L

a

b

5 6E

a

b

L;

a;a;b leads to state 3 after aE;a;b leads to state 3 after E;aE;E;b leads to state 4 after E;E;bE;E;E leads to state 2 after E;E;EE;E;L leads to state 6 after E;E;LE;L leads to state 6 after E;L

start

ViablePrefixes

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1616

Characteristic automatonCharacteristic automaton

0

1

2

3

4

E

L

a

b

5 6E

a

b

L;start

State Action0,5 shift (if possible)1 accept2 reduce LE, if EOF

shift otherwise3 reduce Ea4 reduce Eb6 reduce LE;L

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1717

Example: expression grammarExample: expression grammar

E E+T | T

T T*P | P

P id

id+id+id+idhas parse tree:

ET+E

T+ET+E

T

P

id

P

id

P

id

P

id

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1818

A parse in this grammarA parse in this grammar

id+id+id+id Shift+id+id+id id Reduce+id+id+id P Reduce+id+id+id T Reduce+id+id+id E Shiftid+id+id E + Shift+id+id E+id Reduce+id+id E+P Reduce+id+id E+T Reduce+id+id E Shiftid+id E+ Shift

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing1919

A parse in this grammar (cont.)A parse in this grammar (cont.)

+id E+ id Reduce+id E+P Reduce+id E+T Reduce+id E Shiftid E+ Shift$ E+id Reduce$ E+P Reduce$ E+T Reduce$ E Reduce$ E Accept

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2020

Characteristic Finite State MachineCharacteristic Finite State Machine

The CFSM recognizes viable prefixes (strings of grammar symbols that can appear on the stack):

0

1

4

2

3

id

E

T

P

5 8

id

*

T

id

+

7 9P

*P

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2121

DefinitionsDefinitions

• Rightmost derivation

• Right-sentential form

• Handle

• Viable prefix

• Characteristic automaton

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2222

Rightmost DerivationRightmost Derivation

Definition: A rightmost derivation in G is a derivation:

1ii1 wwwS

such that for each step i, the rightmost non-terminal in iw

1iw is replaced to obtain

ba;a;a;a;a EE;E;E;E;EE;E;LE;LL

ba;a;ba;babb ;E;;L;E;L;LE;LL

1.

2.

is right-most

is not right-most

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2323

Right-sentential FormsRight-sentential Forms

Definition: A right-sentential form is any sentential form that occurs in a right-most derivation.

ba;a;ba;babb ;E;;L;E;L;LE;LL

E.g., any of these

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2424

HandlesHandles

Definition: Assume the i-th step of a rightmost derivation is:

wi=uiAvi uivi=wi+1

Then, (, |ui|) is the handle of wi+1

In an unambiguous grammar, any sentence has a unique rightmostderivation, and so we can talk about “the” handle rather than “a”handle.

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2525

The PlanThe Plan

• Construct a parser by first constructing the CFSM, then constructing GOTO and ACTION tables from it.

• Construction has two parts:– “LR(0)” construction of the CFSM

– “SLR(1)” construction of tables

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2626

Constructing the CFSM: StatesConstructing the CFSM: States

The states in the CFSM are created by taking the “closure” of “LR(0) items”

L • E ; LL E • ; LL E ; • LL E ; L •

Given a production “L E ; L”, these are all induced “LR(0) items”

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2727

What is “•”? What is “•”?

The “•” in “L E • ; L” represents the state of the parse.

L

E ; LOnly this part of thetree is fully developed

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2828

A State in the CFSM: Closure of LR(0) ItemA State in the CFSM: Closure of LR(0) Item

For set I of LR(0) items, calculate closure(I):1. if “A • B ” is in closure(I), then for every production “B ”,

“B • ” is in closure(I)2. closure(I) is the smallest set with property (1)

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing2929

Closure of LR(0) Item: ExampleClosure of LR(0) Item: Example

H: L E;L | E E a | b

closure({L E ; • L}) = {L E ; • L , L • E ; L , L • E, E • a , E • b}

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3030

LR(0) MachineLR(0) Machine

• Given grammar G with goal symbol S, augment the grammar by adding new goal symbol S’ and production S’ S.

• States = sets of LR(0) items

• Start state = closure({S’ •S})

• All states are considered to be final (set of viable prefixes closed under prefix)

• transition(I, X) = closure({A X• | A •XI}).

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3131

Example: LR(0) CFSM Construction Example: LR(0) CFSM Construction

H: L’ L L E;L | E E a | b

Augment the grammar:

Initial State is closure of this “augmenting rule”:

closure({L’ • L}) =

L’ • LL • E;LL • EE • aE •b

:I0

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3232

Example: Transitions from I0 Example: Transitions from I0

transition( , L) = {L’ L •} = 0I 1I

transition( , E) = {L E •;L , L E •} = 0I 2I

transition( , a ) = {E a •} = 0I

transition( , b) = {E b •} = 0I

3I

4I

There are no other transitions from I0. There are no transitions possible from I1.Now consider the transitions from I2.

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3333

Transitions from I2 Transitions from I2

transition( , ;) = 2I

L E;• LL • E;LL • EE • aE •b

:I5

transition( , L) = { L E;L• }= 5I

transition( , E) =5I 2Itransition( , a) =5I 3I transition( , b) =5I 4I

New state: 6I

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3434

The CFSM Transition Diagram for HThe CFSM Transition Diagram for H

L E;• LL • E;LL • EE • aE •b

:I5

L’ L •

L E •;LL E •

E a •

E b •

L’ • LL • E;LL • EE • aE •b

:I0

L E;L•

:I6:I2

:I4

:I3

:I1

L

E

a

b

E

a

b

;

L

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3535

Characteristic Finite State Machine for HCharacteristic Finite State Machine for H

0

1

2

3

4

E

L

a

b

5 6E

a

b

L;

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3636

How LR(1) parsers workHow LR(1) parsers work

• GOTO table: transition function of characteristic automaton in tabular form

• ACTION table: State Action

• Procedure:– Use the GOTO table to run the CFSM over the stack. Suppose

state reached at top of stack is .

– Take action given by ACTION(,a), where a is incoming input symbol.

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3737

Action Table for HAction Table for H

a b ; $

0 Shift Shift1 Accept2 Shift Reduce L E3 Reduce E a Reduce E a4 Reduce E b Reduce E b5 Shift Shift6 Reduce L E;L

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3838

SLR(1) Parser ConstructionSLR(1) Parser Construction

• GOTO table is the move function from the LR(0) CFSM.

• ACTION table is constructed from the CFSM as follows:

– If state i contains A •a, then ACTION(i,a) = Shift.

– If state i contains A •, then ACTION(i,a) = Reduce A(But, if A is L’, action is Accept.)

– Otherwise, reject.

• But,...

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing3939

SLR(1) Parser ConstructionSLR(1) Parser Construction

• Rules for the ACTION table can involve shift/reduce conflicts.

• So the actual rule for reduce actions is: If state i contains A •, then ACTION(i,a) = Reduce A , for

all a FOLLOW(A).

• E.g. state 2 for grammar H yields a shift/reduce conflict. Namely, should you shift the “;” or reduce by “LE”. This is resolved by looking at the “follow set” for L.

• Follow(L) = {$}

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing4040

FIRST and FOLLOW setsFIRST and FOLLOW sets

• FIRST() = {a | a} { | a }

• FOLLOW(A) = {a | S Aa}

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing4141

Calculating FIRST setsCalculating FIRST sets

• Create table Fi mapping N to {}; initially, Fi(A) = for all A.

• Repeat until Fi does not change:– For each A P,

Fi(A) := Fi(A) FIRST(, Fi)

where FIRST(, Fi) is defined as follows:– FIRST(, Fi) = {}

– FIRST(a, Fi) = {a}

– FIRST(B, Fi) = Fi(B) FIRST(b,Fi), if Fi(B) Fi(B), o.w.

Bottom-Up ParsingBottom-Up ParsingBottom-Up ParsingBottom-Up Parsing4242

Calculating FOLLOW setsCalculating FOLLOW sets

• Calculate FOLLOW sets

• Create table Fo : N {$}; initially, Fo(A) = for all A, except Fo(S) = {$}

• Repeat until Fo does not change:– For each production A B,

Fo(B) := Fo(B) FIRST() - {}

– For each production A BFo(B) := Fo(B) Fo(A)

– For each production A Bif FIRST(), Fo(B) := Fo(B) Fo(A)