58
Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

Embed Size (px)

Citation preview

Page 1: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

1

Compilation 0368-3133 (Semester A, 2013/14)

Lecture 6a: Syntax(Bottom–up parsing)

Noam RinetzkySlides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

Page 2: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

2

What is a Compiler?

“A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language).

The most common reason for wanting to transform source code is to create an executable program.”

--Wikipedia

Page 3: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

3

Conceptual Structure of a Compiler

Executable

code

exe

Source

text

txt

Semantic

Representation

Backend

CompilerFrontend

LexicalAnalysi

s

Syntax Analysi

s

Parsing

Semantic

Analysis

IntermediateRepresentati

on

(IR)

Code

Generation

words sentences

Page 4: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

4

From scanning to parsing((23 + 7) * x)

) x * ) 7 + 23 ( (

RP Id OP RP Num OP Num LP LP

Lexical Analyzer

program text

token stream

ParserGrammar: E ... | Id Id ‘a’ | ... | ‘z’

Op(*)

Id(b)

Num(23) Num(7)

Op(+)

Abstract Syntax Tree

validsyntaxerror

Page 5: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

5

Top-Down vs Bottom-Up• Top-down (predict match/scan-complete )

to be read…

already read…A

Aa b

Aa b

c

aacbb$

AaAb|c

Page 6: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

6

Top-Down vs Bottom-Up• Top-down (predict match/scan-complete )

Bottom-up (shift reduce)

to be read…

already read…

A

a bA

c

a b

A

aacbb$

AaAb|c

Page 7: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

7

Model of an LR parser

LR Parserq5

+

q2

T

q0

Stack

$ id + id

Output (AST/Error)state

symbol

ACTION GOTO

Input

Terminals and Non-terminals

Control Tables

Top

Remainder of text to be processed

State controls decisions

Initial stack contains q0

Page 8: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

8

LR(0) parser tables

State i + ( ) $ E T action

q0 q5 q7 q1 q6 shift

q1 q3 q2 shift

q2 reduce: ZE$q3 q5 q7 q4 shift

q4 reduce: EE+Tq5 reduce: Tiq6 reduce: ETq7 q5 q7 q8 q6 shift

q8 q3 q9 shift

q9 reduce: TE

GOTO Table ACTION TableEmpty cell =

error move

Page 9: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

9

LR(0) parser tables

• Shift action row – Tells which state to GOTO for current token– Blank entry indicates an error

• Reduce action row – Tells which rule to reduce with

• Independent of current token– GOTO entries are blank

State id + ( ) $ E T action

q0 q5 q7 q1 q6 shift

q5 reduce: Ti

Page 10: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

10

State id + ( ) $ E T action

q0 q5 q7 q1 q6 shift

q5 reduce: Ti

Shift Move• Remove first token from input• Push it on the stack• Compute next state based on GOTO table• Push new state on the stack• If new state is error – report error

id + id $input

q0stack

+ id $input

q0stack

shift

id q5

Page 11: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

11

Reduce Move (using Nα)• Symbols in α and their following states are removed from

stack• New state computed based on GOTO table (using top of

stack, before pushing N)• N is pushed on the stack• New state pushed on top of N

+ id $input

q0stack id q5

Reduce:T id + id $input

q0stack T

State id + ( ) $ E T action

q0 q5 q7 q1 q6 shift

q5 reduce: Ti

Page 12: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

12

Reduce Move (using Nα)• Symbols in α and their following states are removed from

stack• New state computed based on GOTO table (using top of

stack, before pushing N)• N is pushed on the stack• New state pushed on top of N

+ id $input

q0stack id q5

Reduce:T id + id $input

q0stack T q6

State id + ( ) $ E T action

q0 q5 q7 q1 q6 shift

q5 reduce: Ti

Page 13: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

13

GOTO/ACTION tableState i + ( ) $ E T

q0 s5 s7 s1 s6

q1 s3 s2

q2 r1 r1 r1 r1 r1 r1 r1

q3 s5 s7 s4

q4 r3 r3 r3 r3 r3 r3 r3

q5 r4 r4 r4 r4 r4 r4 r4

q6 r2 r2 r2 r2 r2 r2 r2

q7 s5 s7 s8 s6

q8 s3 s9

q9 r5 r5 r5 r5 r5 r5 r5

(1)Z E $(2)E T (3)E E + T(4)T i (5)T ( E )

Warning: numbers mean different things!rn = reduce using rule number nsm = shift to state m

Page 14: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

Parsing id+id$

14

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Stack Input Action

0 id + id $ s5

Initialize with state 0

Page 15: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

15

Parsing id+id$goto action S

T E $ ) ( + idg6 g1 s7 s5 0

acc s3 12

g4 s7 s5 3r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

Stack Input Action0 id + id $ s5

Initialize with state 0

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 16: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

16

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r4

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 17: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

17

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r4

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

pop id 5

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 18: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

18

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r4

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

push T 6

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 19: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

19

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r40 T 6 + id $ r2

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 20: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

20

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r40 T 6 + id $ r20 E 1 + id $ s3

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 21: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

21

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r40 T 6 + id $ r20 E 1 + id $ s30 E 1 + 3 id $ s5

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 22: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

22

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r40 T 6 + id $ r20 E 1 + id $ s30 E 1 + 3 id $ s50 E 1 + 3 id 5 $ r4

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 23: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

23

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r40 T 6 + id $ r20 E 1 + id $ s30 E 1 + 3 id $ s50 E 1 + 3 id 5 $ r40 E 1 + 3 T 4 $ r3

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 24: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

24

Parsing id+id$Stack Input Action0 id + id $ s50 id 5 + id $ r40 T 6 + id $ r20 E 1 + id $ s30 E 1 + 3 id $ s50 E 1 + 3 id 5 $ r40 E 1 + 3 T 4 $ r30 E 1 $ s2

goto action ST E $ ) ( + id

g6 g1 s7 s5 0acc s3 1

2g4 s7 s5 3

r3 r3 r3 r3 r3 4r4 r4 r4 r4 r4 5r2 r2 r2 r2 r2 6

g6 g8 s7 s5 7s9 s3 8

r5 r5 r5 r5 r5 9

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 25: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

25

Constructing an LR parsing table

• Construct a transition diagram (deterministic FSM) – States = sets of LR(0) items– Transitions = one-step derivation

• If there are conflicts – stop• Fill table entries from diagram

Page 26: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

26

Terminology: Reductions & Handles

• The opposite of derivation is called reduction– Let Aα be a production rule– Derivation: βAµ βαµ– Reduction: βαµ βAµ

• A handle is the reduced substring– α is the handles for βαµ

Page 27: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

LR(0) Items• The items of a grammar are obtained by placing a

dot at every position in every production

27

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

1: S E$2: S E $3: S E $ 4: E T5: E T 6: E E + T7: E E + T8: E E + T9: E E + T 10: T id11: T id 12: T (E)13: T ( E)14: T (E )15: T (E)

Grammar LR(0) items

Page 28: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

28

LR(0) Item - Intuition

N αβ

Already matched To be matched

Input

Hypothesis about αβ is the rule being reduced, and so far we’ve matched α and we expect to see β

Page 29: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

29

Types of LR(0) items

N αβ Shift Item

N αβ Reduce Item

Page 30: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

LR(0) automaton example

30

Z E$E TE E + TT iT (E)

T (E)E TE E + TT iT (E)

E E + T

T (E) Z E$

Z E$E E+ T E E+T

T iT (E)

T i

T (E)E E+T

E Tq0

q1

q2

q3

q4

q5

q6

q7

q8

q9

T

(

i

E

+

$

T

)

+

E

i

T

(i

(

reduce stateshift state

Page 31: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

31

Computing item sets

• Initial set– Z is in the start symbol – -closure({ Zα | Zα is in the grammar } )

• Next set from a set S and the next symbol X– step(S,X) = { NαXβ | NαXβ in the item set S}– nextSet(S,X) = -closure(step(S,X))

Page 32: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

32

Operations for transition diagram construction

• Initial = {S’S$}

• For an item set IClosure(I) = Closure(I) ∪ {Xµ is in grammar| NαXβ in I}

• Goto(I, X) = { NαXβ | NαXβ in I}

Page 33: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

33

Initial example

• Initial = {S E $}(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Grammar

Page 34: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

34

Closure example

• Initial = {S E $}• Closure({S E $}) = {

S E $ E T E E + T T id T ( E ) }

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Grammar

Page 35: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

35

Goto example

• Initial = {S E $}• Closure({S E $}) = {

S E $ E T E E + T T id T ( E ) }

• Goto({S E $ , E E + T, T id}, E) = {S E $, E E + T}

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Grammar

Page 36: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

36

Constructing the transition diagram

• Start with state 0 containing itemClosure({S E $})

• Repeat until no new states are discovered– For every state p containing item set Ip, and

symbol N, compute state q containing item setIq = Closure(goto(Ip, N))

Page 37: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

LR(0) automaton example

37

Z E$E TE E + TT iT (E)

T (E)E TE E + TT iT (E)

E E + T

T (E) Z E$

Z E$E E+ T E E+T

T iT (E)

T i

T (E)E E+T

E Tq0

q1

q2

q3

q4

q5

q6

q7

q8

q9

T

(

i

E

+

$

T

)

+

E

i

T

(i

(

reduce stateshift state

Page 38: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

38

Automaton construction example

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

S E$q0

Initialize

Page 39: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

39

S E$E TE E + TT iT (E)

q0

applyClosure

Automaton construction example(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 40: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

40

Automaton construction example

S E$E TE E + TT iT (E)

q0 E T

q6

T

T (E)E TE E + TT iT (E)

(

T i

q5

i

S E$E E+ T

q1

E

(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 41: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

41

S E$E TE E + TT iT (E)

T (E)E TE E + TT iT (E)

E E + T

T (E) S E$

Z E$E E+ T E E+T

T iT (E)

T i

T (E)E E+T

E Tq0

q1

q2

q3

q4

q5

q6

q7

q8

q9

T

(

i

E

+

$

T

)

+

E

i

T

(i

(

terminal transition corresponds to shift action in parse table

non-terminal transition corresponds to goto action in parse table

a single reduce item corresponds to reduce action

Automaton construction example(1) S E $(2) E T(3) E E + T(4) T id (5) T ( E )

Page 42: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

42

Are we done?

• Can make a transition diagram for any grammar

• Can make a GOTO table for every grammar

• Cannot make a deterministic ACTION table for every grammar

Page 43: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

LR(0) conflicts

43

Z E $E T E E + TT i T ( E )T i[E]

Z E$E T

E E + TT i

T (E)T i[E] T i

T i[E]

q0

q5

T

(

i

E Shift/reduce conflict

Page 44: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

LR(0) conflicts

44

Z E $E T E E + TT i V iT ( E )

Z E$E T

E E + TT IV i

T (E)T i[E] T i

V i

q0

q5

T

(

i

E reduce/reduce conflict

Page 45: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

45

LR(0) conflicts

• Any grammar with an -rule cannot be LR(0)• Inherent shift/reduce conflict

– A – reduce item– P αAβ – shift item– A can always be predicted from P αAβ

Page 46: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

46

Conflicts

• Can construct a diagram for every grammar but some may introduce conflicts

• shift-reduce conflict: an item set contains at least one shift item and one reduce item

• reduce-reduce conflict: an item set contains two reduce items

Page 47: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

47

LR variants

• LR(0) – what we’ve seen so far• SLR(0)

– Removes infeasible reduce actions via FOLLOW set reasoning

• LR(1)– LR(0) with one lookahead token in items

• LALR(0)– LR(1) with merging of states with same LR(0)

component

Page 48: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

48

LR (0) GOTO/ACTIONS tables

State i + ( ) $ E T action

q0 q5 q7 q1 q6 shift

q1 q3 q2 shift

q2 ZE$q3 q5 q7 q4 Shift

q4 EE+Tq5 Tiq6 ETq7 q5 q7 q8 q6 shift

q8 q3 q9 shift

q9 TE

GOTO TableACTION

Table

ACTION table determined only by state, ignores input

GOTO table is indexed by state and a grammar symbol from the stack

Page 49: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

49

SLR parsing

• A handle should not be reduced to a non-terminal N if the lookahead is a token that cannot follow N

• A reduce item N α is applicable only when the lookahead is in FOLLOW(N)– If b is not in FOLLOW(N) we just proved there is no derivation

S * βNb. – Thus, it is safe to remove the reduce item from the conflicted

state• Differs from LR(0) only on the ACTION table

– Now a row in the parsing table may contain both shift actions and reduce actions and we need to consult the current token to decide which one to take

Page 50: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

50

SLR action table

State i + ( ) [ ] $

0 shift shift

1 shift accept

2

3 shift shift

4 EE+T EE+T EE+T5 Ti Ti shift Ti6 ET ET ET7 shift shift

8 shift shift

9 T(E) T(E) T(E)

vs.

state action

q0 shift

q1 shift

q2

q3 shift

q4 EE+Tq5 Tiq6 ETq7 shift

q8 shift

q9 TE

SLR – use 1 token look-ahead LR(0) – no look-ahead

… as before…T i T i[E]

Lookahead token from the input

Page 51: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

51

LR(1) grammars

• In SLR: a reduce item N α is applicable only when the lookahead is in FOLLOW(N)

• But FOLLOW(N) merges lookahead for all alternatives for N– Insensitive to the context of a given production

• LR(1) keeps lookahead with each LR item• Idea: a more refined notion of follows

computed per item

Page 52: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

52

LR(1) items• LR(1) item is a pair

– LR(0) item– Lookahead token

• Meaning– We matched the part left of the dot, looking to match the part on

the right of the dot, followed by the lookahead token• Example

– The production L id yields the following LR(1) items

[L → ● id, *][L → ● id, =][L → ● id, id][L → ● id, $][L → id ●, *][L → id ●, =][L → id ●, id][L → id ●, $]

(0) S’ → S(1) S → L = R(2) S → R(3) L → * R(4) L → id(5) R → L

[L → ● id][L → id ●]

LR(0) items

LR(1) items

Page 53: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

53

LALR(1)

• LR(1) tables have huge number of entries• Often don’t need such refined observation (and

cost)• Idea: find states with the same LR(0) component

and merge their lookaheads component as long as there are no conflicts

• LALR(1) not as powerful as LR(1) in theory but works quite well in practice– Merging may not introduce new shift-reduce conflicts,

only reduce-reduce, which is unlikely in practice

Page 54: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

54

Summary

Page 55: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

55

Summary

• Bottom up derivation• LR(k) can decide on a reduce after seeing the entire right

side of the rule plus k look-ahead tokens. – We focused on LR(0)

• Using a table and a stack to derive• LR Items and the automaton• Creating the table from the automaton• LR parsing with pushdown automata• LR(0), SLR, LR(1) – different kinds of LR items, same basic

algorithm

Page 56: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

56

Broad kinds of parsers

• Parsers for arbitrary grammars– Earley’s method, CYK method– Usually, not used in practice (though might change)

• Top-Down parsers – Construct parse tree in a top-down matter– Find the leftmost derivation

• Bottom-Up parsers – Construct parse tree in a bottom-up manner– Find the rightmost derivation in a reverse order

Page 57: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

57

LR is More Powerful than LL

• Any LL(k) language is also in LR(k), i.e., LL(k) LR(k). ⊂– LR is more popular in automatic tools

• But less intuitive

• Also, the lookahead is counted differently in the two cases– In an LL(k) derivation the algorithm sees the left-hand side of the

rule + k input tokens and then must select the derivation rule– In LR(k), the algorithm “sees” all right-hand side of the derivation

rule + k input tokens and then reduces• LR(0) sees the entire right-side, but no input token

Page 58: Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

58

Grammar Hierarchy

Non-ambiguous CFG

LR(1)

LALR(1)

SLR(1)

LL(1)

LR(0)