52
Canonical LR(1) Parsing LALR Parsing Syntax Analysis: LR(1) and LALR Parsing Lecture 7 c Haythem O. Ismail LR(1) and LALR 1 / 29

Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

  • Upload
    others

  • View
    21

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Syntax Analysis: LR(1) and LALR Parsing

Lecture 7

c©Haythem O. Ismail LR(1) and LALR 1 / 29

Page 2: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Objectives

By the end of this lecture you should be able to:1 Identify LR(1) items.2 Construct an LR(1) automaton for a CFG.3 Construct the LR(1) parsing table for a CFG.4 Trace the operation of an LR(1) parser.5 Construct an LALR automaton for a CFG.6 Construct the LALR parsing table for a CFG.7 Trace the operation of an LALR parser.

c©Haythem O. Ismail LR(1) and LALR 2 / 29

Page 3: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Outline

1 Canonical LR(1) Parsing

2 LALR Parsing

c©Haythem O. Ismail LR(1) and LALR 3 / 29

Page 4: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Outline

1 Canonical LR(1) Parsing

2 LALR Parsing

c©Haythem O. Ismail LR(1) and LALR 4 / 29

Page 5: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammars Not SLR

ExampleConsider the following grammar G7:

S −→ L=R | RL −→ ∗R | idR −→ L

c©Haythem O. Ismail LR(1) and LALR 5 / 29

Page 6: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammars Not SLR: States

Example

c© Aho et al. (2007)c©Haythem O. Ismail LR(1) and LALR 6 / 29

Page 7: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Problems with SLR Parsing

One problem with SLR parsers is that it is always possible to reduceA→ α if the next input is in Follow(A).

ExampleWith G7 and input id = id, an SLR parser may shift id and thenreduce L→ id.

Since = ∈ Follow(R), the parser may decide to reduce R→ L.

Clearly, this is not correct; a sentential form starting with R= cannever reduce to S.

States should carry more information about when reduction isappropriate.

c©Haythem O. Ismail LR(1) and LALR 7 / 29

Page 8: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Problems with SLR Parsing

One problem with SLR parsers is that it is always possible to reduceA→ α if the next input is in Follow(A).

ExampleWith G7 and input id = id, an SLR parser may shift id and thenreduce L→ id.

Since = ∈ Follow(R), the parser may decide to reduce R→ L.

Clearly, this is not correct; a sentential form starting with R= cannever reduce to S.

States should carry more information about when reduction isappropriate.

c©Haythem O. Ismail LR(1) and LALR 7 / 29

Page 9: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Problems with SLR Parsing

One problem with SLR parsers is that it is always possible to reduceA→ α if the next input is in Follow(A).

ExampleWith G7 and input id = id, an SLR parser may shift id and thenreduce L→ id.

Since = ∈ Follow(R), the parser may decide to reduce R→ L.

Clearly, this is not correct; a sentential form starting with R= cannever reduce to S.

States should carry more information about when reduction isappropriate.

c©Haythem O. Ismail LR(1) and LALR 7 / 29

Page 10: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LR(0) Items: Reprise

DefinitionAn LR(0) item of CFG G = 〈V,Σ,R, S〉 is a pair 〈A→ α, i〉, where(A→ α) ∈ R and 0 ≤ i ≤ |α|.

Intuitively, an LR(0) item is a rule and a position in the right sideof the rule.

Thus, 〈A→ aBb, 2〉 ≡ A→ aB.b

An LR(0) item represents a state of the parser: We have alreadyfound the prefix of α before the dot and, if we find the suffixfollowing the dot, we may reduce α to A.

c©Haythem O. Ismail LR(1) and LALR 8 / 29

Page 11: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LR(1) Items

DefinitionAn LR(1) item of CFG G = 〈V,Σ,R, S〉 is a pair 〈c, a〉, where c is anLR(0) item and a ∈ Σ ∪ {$}.

c is called the core of the item.a is the lookahead of the item.

Note that a is a single symbol, hence the “1” in “LR(1) item.”

An LR(1) item [A→ α.β, a] represents a state of the parser: Wehave already found α in the input and, if we find β, we mayreduce αβ to A if the next symbol is a.

c©Haythem O. Ismail LR(1) and LALR 9 / 29

Page 12: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LR(1) Items

DefinitionAn LR(1) item of CFG G = 〈V,Σ,R, S〉 is a pair 〈c, a〉, where c is anLR(0) item and a ∈ Σ ∪ {$}.

c is called the core of the item.a is the lookahead of the item.

Note that a is a single symbol, hence the “1” in “LR(1) item.”

An LR(1) item [A→ α.β, a] represents a state of the parser: Wehave already found α in the input and, if we find β, we mayreduce αβ to A if the next symbol is a.

c©Haythem O. Ismail LR(1) and LALR 9 / 29

Page 13: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LR(1) Items

DefinitionAn LR(1) item of CFG G = 〈V,Σ,R, S〉 is a pair 〈c, a〉, where c is anLR(0) item and a ∈ Σ ∪ {$}.

c is called the core of the item.a is the lookahead of the item.

Note that a is a single symbol, hence the “1” in “LR(1) item.”

An LR(1) item [A→ α.β, a] represents a state of the parser: Wehave already found α in the input and, if we find β, we mayreduce αβ to A if the next symbol is a.

c©Haythem O. Ismail LR(1) and LALR 9 / 29

Page 14: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LR(1) Items

DefinitionAn LR(1) item of CFG G = 〈V,Σ,R, S〉 is a pair 〈c, a〉, where c is anLR(0) item and a ∈ Σ ∪ {$}.

c is called the core of the item.a is the lookahead of the item.

Note that a is a single symbol, hence the “1” in “LR(1) item.”

An LR(1) item [A→ α.β, a] represents a state of the parser: Wehave already found α in the input and, if we find β, we mayreduce αβ to A if the next symbol is a.

c©Haythem O. Ismail LR(1) and LALR 9 / 29

Page 15: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The Set of LR(1) Items

The smallest set I(G) satisfying the following

[S′ → .S, $] ∈ I.

[A→ αX.β, a] ∈ I if [A→ α.Xβ, a] ∈ I, for X ∈ Σ ∪ V .

[B→ .γ, b] ∈ I if [A→ α.Bβ, a] ∈ I where (B→ γ) ∈ R andb ∈ First(βa).

c©Haythem O. Ismail LR(1) and LALR 10 / 29

Page 16: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The LR(1) NFA

DefinitionFor the CFG G = 〈V,Σ,R, S〉, the LR(1) NFA is an NFAN1

G = 〈I(G),V ∪ Σ, δ, [S′ → .S, $], I(G)〉, where

S′ /∈ V ∪ Σ;

δ([A→ α.Xβ, a],X) = {[A→ αX.β, a]};δ([A→ α.Bβ, a], ε) = {[B→ .γ, b] | (B→ γ) ∈ R andb ∈ First(βa)}.

c©Haythem O. Ismail LR(1) and LALR 11 / 29

Page 17: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The LR(1) Automaton

Definition

The LR(1) automaton for a CFG G is the DFA M1G which is equivalent

to N1G and constructed using the standard subset construction.

c©Haythem O. Ismail LR(1) and LALR 12 / 29

Page 18: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammar G8

Example (The grammar)

1. S −→ CC2. C −→ cC3. C −→ d

c©Haythem O. Ismail LR(1) and LALR 13 / 29

Page 19: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammar G8

Example (The automaton)

c©Aho et al. (2007)c©Haythem O. Ismail LR(1) and LALR 14 / 29

Page 20: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The Canonical LR(1) Parsing Table

We are given a CFG G = 〈V,Σ,R, S〉.1 Construct M1

G.2 For all states q of M1

G1 GOTO(q,A) = δ(q,A), for every A ∈ V .2 If a ∈ Σ and [A→ α.aβ, b] ∈ q, then ACTION(q, a) = “shiftδ(q, a)”.

3 Else if A 6= S′ and [A→ α., a] ∈ q, then ACTION(q, a) =“reduce A→ α”.

4 Else if [S′ → S., $] ∈ q, then ACTION(q, $) = “accept”.5 Else ACTION(q, a) = “error”.

If any conflicting actions result from the above construction, we saythat G is not LR(1).

c©Haythem O. Ismail LR(1) and LALR 15 / 29

Page 21: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The Canonical LR(1) Parsing Table

We are given a CFG G = 〈V,Σ,R, S〉.1 Construct M1

G.2 For all states q of M1

G1 GOTO(q,A) = δ(q,A), for every A ∈ V .2 If a ∈ Σ and [A→ α.aβ, b] ∈ q, then ACTION(q, a) = “shiftδ(q, a)”.

3 Else if A 6= S′ and [A→ α., a] ∈ q, then ACTION(q, a) =“reduce A→ α”.

4 Else if [S′ → S., $] ∈ q, then ACTION(q, $) = “accept”.5 Else ACTION(q, a) = “error”.

If any conflicting actions result from the above construction, we saythat G is not LR(1).

c©Haythem O. Ismail LR(1) and LALR 15 / 29

Page 22: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammar G8

Example (Table Automaton )

c© Aho et al. (2007)c©Haythem O. Ismail LR(1) and LALR 16 / 29

Page 23: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammar G8

Example (Trace Automaton )

Stack Input0 ccdd$03 cdd$033 dd$0334 d$0338 d$038 d$02 d$027 $025 $01 $

c©Haythem O. Ismail LR(1) and LALR 17 / 29

Page 24: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Canonical LR(1) and SLR Parsing

Note that a canonical LR(1) table can include conflicts only ifthe corresponding SLR table does.

Thus, every SLR grammar is a canonical LR(1) grammar, but notvice versa.

The price, however, is the increased table size due to theincreased number of automaton states.

c©Haythem O. Ismail LR(1) and LALR 18 / 29

Page 25: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Outline

1 Canonical LR(1) Parsing

2 LALR Parsing

c©Haythem O. Ismail LR(1) and LALR 19 / 29

Page 26: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Lookahead LR Parsing

Lookahead LR parsers (LALR parsers) are often used in practice.

Most common syntactic constructs in programming languagesare representable by LALR grammars.LALR tables are considerably smaller than canonical LR(1)tables.

SLR and LALR tables always have the same number of states.Such number is typically several hundred states.A canonical LR(1) table typically has several thousand states!

c©Haythem O. Ismail LR(1) and LALR 20 / 29

Page 27: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Lookahead LR Parsing

Lookahead LR parsers (LALR parsers) are often used in practice.

Most common syntactic constructs in programming languagesare representable by LALR grammars.LALR tables are considerably smaller than canonical LR(1)tables.

SLR and LALR tables always have the same number of states.Such number is typically several hundred states.A canonical LR(1) table typically has several thousand states!

c©Haythem O. Ismail LR(1) and LALR 20 / 29

Page 28: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Lookahead LR Parsing

Lookahead LR parsers (LALR parsers) are often used in practice.

Most common syntactic constructs in programming languagesare representable by LALR grammars.LALR tables are considerably smaller than canonical LR(1)tables.

SLR and LALR tables always have the same number of states.Such number is typically several hundred states.A canonical LR(1) table typically has several thousand states!

c©Haythem O. Ismail LR(1) and LALR 20 / 29

Page 29: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Lookahead LR Parsing

Lookahead LR parsers (LALR parsers) are often used in practice.

Most common syntactic constructs in programming languagesare representable by LALR grammars.LALR tables are considerably smaller than canonical LR(1)tables.

SLR and LALR tables always have the same number of states.Such number is typically several hundred states.A canonical LR(1) table typically has several thousand states!

c©Haythem O. Ismail LR(1) and LALR 20 / 29

Page 30: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Lookahead LR Parsing

Lookahead LR parsers (LALR parsers) are often used in practice.

Most common syntactic constructs in programming languagesare representable by LALR grammars.LALR tables are considerably smaller than canonical LR(1)tables.

SLR and LALR tables always have the same number of states.Such number is typically several hundred states.A canonical LR(1) table typically has several thousand states!

c©Haythem O. Ismail LR(1) and LALR 20 / 29

Page 31: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Lookahead LR Parsing

Lookahead LR parsers (LALR parsers) are often used in practice.

Most common syntactic constructs in programming languagesare representable by LALR grammars.LALR tables are considerably smaller than canonical LR(1)tables.

SLR and LALR tables always have the same number of states.Such number is typically several hundred states.A canonical LR(1) table typically has several thousand states!

c©Haythem O. Ismail LR(1) and LALR 20 / 29

Page 32: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The Core

DefinitionThe core of an LR(1) state q is the set

core(q) = {c|[c, l] ∈ q}

Two LR(1) states q1 and q2 are core-equivalent ifcore(q1) = core(q2).

Example (Automaton of G8 )The following states are core-equivalent.

I4 and I7.

I8 and I9.

I3 and I6.

How are they different?

c©Haythem O. Ismail LR(1) and LALR 21 / 29

Page 33: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The Core

DefinitionThe core of an LR(1) state q is the set

core(q) = {c|[c, l] ∈ q}

Two LR(1) states q1 and q2 are core-equivalent ifcore(q1) = core(q2).

Example (Automaton of G8 )The following states are core-equivalent.

I4 and I7.

I8 and I9.

I3 and I6.

How are they different?

c©Haythem O. Ismail LR(1) and LALR 21 / 29

Page 34: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Core-Equivalence and the Automaton

Observation

Let M1G be the LR(1) DFA for a CFG G, and let q1, q2 be states of M1

Gand X ∈ V ∪ Σ. If q1 and q2 are core-equivalent, then δ(q1,X) andδ(q2,X) are core-equivalent.

The number of states of the automaton may be reducedconsiderably if we use core-equivalence-classes as states.

In particular, if {q1, . . . , qn} is an equivalence class ofcore-equivalence, then we may replace the states q1, . . . , qn bythe single state Q = q1 ∪ · · · ∪ qn.

Transitions are adjusted appropriately.

c©Haythem O. Ismail LR(1) and LALR 22 / 29

Page 35: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Core-Equivalence and the Automaton

Observation

Let M1G be the LR(1) DFA for a CFG G, and let q1, q2 be states of M1

Gand X ∈ V ∪ Σ. If q1 and q2 are core-equivalent, then δ(q1,X) andδ(q2,X) are core-equivalent.

The number of states of the automaton may be reducedconsiderably if we use core-equivalence-classes as states.

In particular, if {q1, . . . , qn} is an equivalence class ofcore-equivalence, then we may replace the states q1, . . . , qn bythe single state Q = q1 ∪ · · · ∪ qn.

Transitions are adjusted appropriately.

c©Haythem O. Ismail LR(1) and LALR 22 / 29

Page 36: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Core-Equivalence and the Automaton

Observation

Let M1G be the LR(1) DFA for a CFG G, and let q1, q2 be states of M1

Gand X ∈ V ∪ Σ. If q1 and q2 are core-equivalent, then δ(q1,X) andδ(q2,X) are core-equivalent.

The number of states of the automaton may be reducedconsiderably if we use core-equivalence-classes as states.

In particular, if {q1, . . . , qn} is an equivalence class ofcore-equivalence, then we may replace the states q1, . . . , qn bythe single state Q = q1 ∪ · · · ∪ qn.

Transitions are adjusted appropriately.

c©Haythem O. Ismail LR(1) and LALR 22 / 29

Page 37: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Core-Equivalence and the Automaton

Observation

Let M1G be the LR(1) DFA for a CFG G, and let q1, q2 be states of M1

Gand X ∈ V ∪ Σ. If q1 and q2 are core-equivalent, then δ(q1,X) andδ(q2,X) are core-equivalent.

The number of states of the automaton may be reducedconsiderably if we use core-equivalence-classes as states.

In particular, if {q1, . . . , qn} is an equivalence class ofcore-equivalence, then we may replace the states q1, . . . , qn bythe single state Q = q1 ∪ · · · ∪ qn.

Transitions are adjusted appropriately.

c©Haythem O. Ismail LR(1) and LALR 22 / 29

Page 38: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The LALR Automaton

DefinitionLet 〈Q(G),V ∪ Σ, δ, q0,F〉 be an LR(1) automaton. The LALRautomaton MA

G = 〈Q′(G),V ∪ Σ, δ′, q′0,F′〉 is such that

Q′(G) = {⋃

qi∈[q]

qi|[q] is an equivalence class of core-equivalence

}q0 and q′0 are core-equivalent.

q′ ∈ F′ if and only if there is some q ∈ F such that q and q′ arecore-equivalent.

δ′(q′,X) is core-equivalent to δ(q,X), where q and q′ arecore-equivalent.

c©Haythem O. Ismail LR(1) and LALR 23 / 29

Page 39: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The LALR Automaton

DefinitionLet 〈Q(G),V ∪ Σ, δ, q0,F〉 be an LR(1) automaton. The LALRautomaton MA

G = 〈Q′(G),V ∪ Σ, δ′, q′0,F′〉 is such that

Q′(G) = {⋃

qi∈[q]

qi|[q] is an equivalence class of core-equivalence

}q0 and q′0 are core-equivalent.

q′ ∈ F′ if and only if there is some q ∈ F such that q and q′ arecore-equivalent.

δ′(q′,X) is core-equivalent to δ(q,X), where q and q′ arecore-equivalent.

c©Haythem O. Ismail LR(1) and LALR 23 / 29

Page 40: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The LALR Automaton

DefinitionLet 〈Q(G),V ∪ Σ, δ, q0,F〉 be an LR(1) automaton. The LALRautomaton MA

G = 〈Q′(G),V ∪ Σ, δ′, q′0,F′〉 is such that

Q′(G) = {⋃

qi∈[q]

qi|[q] is an equivalence class of core-equivalence

}q0 and q′0 are core-equivalent.

q′ ∈ F′ if and only if there is some q ∈ F such that q and q′ arecore-equivalent.

δ′(q′,X) is core-equivalent to δ(q,X), where q and q′ arecore-equivalent.

c©Haythem O. Ismail LR(1) and LALR 23 / 29

Page 41: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The LALR Automaton

DefinitionLet 〈Q(G),V ∪ Σ, δ, q0,F〉 be an LR(1) automaton. The LALRautomaton MA

G = 〈Q′(G),V ∪ Σ, δ′, q′0,F′〉 is such that

Q′(G) = {⋃

qi∈[q]

qi|[q] is an equivalence class of core-equivalence

}q0 and q′0 are core-equivalent.

q′ ∈ F′ if and only if there is some q ∈ F such that q and q′ arecore-equivalent.

δ′(q′,X) is core-equivalent to δ(q,X), where q and q′ arecore-equivalent.

c©Haythem O. Ismail LR(1) and LALR 23 / 29

Page 42: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammar G8

Example (LALR automaton)

Draw the state diagram of the LALR automaton for G8 .

c©Haythem O. Ismail LR(1) and LALR 24 / 29

Page 43: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

The LALR Parsing Table

The LALR parsing table is constructed in exactly the same waythe canonical LR(1) table is constructed, but with MA

G rather thanM1

G.

The size of the LALR table is the same as the size of the SLRtable.

If the table has no conflicts, the grammar is said to be an LALRgrammar.

c©Haythem O. Ismail LR(1) and LALR 25 / 29

Page 44: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammar G8

Example (LALR parsing table)

c© Aho et al. (2007)c©Haythem O. Ismail LR(1) and LALR 26 / 29

Page 45: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

Grammar G8

Example (Trace)Trace the operation of both the canonical LR(1) and the LALRparsers on input ccd. What do you notice?

c©Haythem O. Ismail LR(1) and LALR 27 / 29

Page 46: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LALR Conflicts (I)

ObservationIf an LR(1) table does not have shift/reduce conflicts, then thecorresponding LALR table does not have shift/reduce conflicts.

Proof. A shift/reduce conflict occurs if an LALR state has an item[A→ α., a], calling for reducing by A→ α, and an item[B→ β.aγ, b] calling for a shift. But, then, there must be some LR(1)state with items [A→ α., a] and [B→ β.aγ, c]. This state would alsohave a shift/reduce conflict, contradicting our assumption.

c©Haythem O. Ismail LR(1) and LALR 28 / 29

Page 47: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LALR Conflicts (I)

ObservationIf an LR(1) table does not have shift/reduce conflicts, then thecorresponding LALR table does not have shift/reduce conflicts.

Proof. A shift/reduce conflict occurs if an LALR state has an item[A→ α., a], calling for reducing by A→ α, and an item[B→ β.aγ, b] calling for a shift. But, then, there must be some LR(1)state with items [A→ α., a] and [B→ β.aγ, c]. This state would alsohave a shift/reduce conflict, contradicting our assumption.

c©Haythem O. Ismail LR(1) and LALR 28 / 29

Page 48: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LALR Conflicts (II)

Observation

If an LR(1) table does not have reduce/reduce conflicts, the corresponding LALR table mayhave reduce/reduce conflicts.

Proof. Consider the grammar

S −→ aAd | aBe | bBd | bAeA −→ cB −→ c

This grammar generates the strings acd,ace,bcd,bce.

The LR(1) state {[A→ c., d], [B→ c., e]} is reachable from the start state after readingac.

Similarly, the LR(1) state {[A→ c., e], [B→ c., d]} is reachable from the start stateafter reading bc.

The LALR automaton will include the state {[A→ c., d/e], [B→ c., d/e]}, whichresults in a reduce/reduce conflict.

c©Haythem O. Ismail LR(1) and LALR 29 / 29

Page 49: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LALR Conflicts (II)

Observation

If an LR(1) table does not have reduce/reduce conflicts, the corresponding LALR table mayhave reduce/reduce conflicts.

Proof. Consider the grammar

S −→ aAd | aBe | bBd | bAeA −→ cB −→ c

This grammar generates the strings acd,ace,bcd,bce.

The LR(1) state {[A→ c., d], [B→ c., e]} is reachable from the start state after readingac.

Similarly, the LR(1) state {[A→ c., e], [B→ c., d]} is reachable from the start stateafter reading bc.

The LALR automaton will include the state {[A→ c., d/e], [B→ c., d/e]}, whichresults in a reduce/reduce conflict.

c©Haythem O. Ismail LR(1) and LALR 29 / 29

Page 50: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LALR Conflicts (II)

Observation

If an LR(1) table does not have reduce/reduce conflicts, the corresponding LALR table mayhave reduce/reduce conflicts.

Proof. Consider the grammar

S −→ aAd | aBe | bBd | bAeA −→ cB −→ c

This grammar generates the strings acd,ace,bcd,bce.

The LR(1) state {[A→ c., d], [B→ c., e]} is reachable from the start state after readingac.

Similarly, the LR(1) state {[A→ c., e], [B→ c., d]} is reachable from the start stateafter reading bc.

The LALR automaton will include the state {[A→ c., d/e], [B→ c., d/e]}, whichresults in a reduce/reduce conflict.

c©Haythem O. Ismail LR(1) and LALR 29 / 29

Page 51: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LALR Conflicts (II)

Observation

If an LR(1) table does not have reduce/reduce conflicts, the corresponding LALR table mayhave reduce/reduce conflicts.

Proof. Consider the grammar

S −→ aAd | aBe | bBd | bAeA −→ cB −→ c

This grammar generates the strings acd,ace,bcd,bce.

The LR(1) state {[A→ c., d], [B→ c., e]} is reachable from the start state after readingac.

Similarly, the LR(1) state {[A→ c., e], [B→ c., d]} is reachable from the start stateafter reading bc.

The LALR automaton will include the state {[A→ c., d/e], [B→ c., d/e]}, whichresults in a reduce/reduce conflict.

c©Haythem O. Ismail LR(1) and LALR 29 / 29

Page 52: Syntax Analysis: LR(1) and LALR Parsing Canonical LR(1) Parsing LALR Parsing Objectives By the end of this lecture you should be able to: 1 Identify LR(1) items. 2 Construct an LR(1)

Canonical LR(1) ParsingLALR Parsing

LALR Conflicts (II)

Observation

If an LR(1) table does not have reduce/reduce conflicts, the corresponding LALR table mayhave reduce/reduce conflicts.

Proof. Consider the grammar

S −→ aAd | aBe | bBd | bAeA −→ cB −→ c

This grammar generates the strings acd,ace,bcd,bce.

The LR(1) state {[A→ c., d], [B→ c., e]} is reachable from the start state after readingac.

Similarly, the LR(1) state {[A→ c., e], [B→ c., d]} is reachable from the start stateafter reading bc.

The LALR automaton will include the state {[A→ c., d/e], [B→ c., d/e]}, whichresults in a reduce/reduce conflict.

c©Haythem O. Ismail LR(1) and LALR 29 / 29