Compiler Design and Construction
Bottom-Up Parsing
Slides modified from Louden Book, Y Chung (NTHU), and Fischer, Leblanc
Outline
6.0 Introduction
6.1 Shift-Reduce Parsers
6.2 LR Parsers
6.3 LR(1) Parsing
6.4 SLR(1) Parsing
6.5 LALR(1)
Fall 2012 Bottom Up Parsing
Parsing
A top-down parser "discovers" the parse tree by starting at the root (the start symbol) and expanding (predicting) downward in a depth-first manner.
It predicts the derivation before the matching is done.
A bottom-up parser starts at the leaves (terminals) and determines which production generates them.
It then determines the rules that generate their parents, and so on, until it reaches the root (S).
Bottom-up Parsing Example
Scan the input looking for any substring that appears on the RHS of a rule.
The matched RHS that we choose to replace is called a handle.
We can do this left-to-right or right-to-left.
Let's use left-to-right.
Replace that RHS with the LHS.
Repeat until left with the start symbol or an error.
Effectively, we figure out which rules (in a rightmost derivation) generate our input, but in reverse order.
We can think of this as handle pruning.
Top-down Parsing Example
Consider the following input and CFG
Input: begin SimpleStmt; SimpleStmt; end $
How would we generate this string in a rightmost
fashion?
CFG:
<program> → begin <stmts> end $
<stmts> → SimpleStmt ; <stmts>
<stmts> → begin <stmts> end ; <stmts>
<stmts> → λ
Rightmost derivation:
<program> => begin <stmts> end $
=> begin SimpleStmt; <stmts> end $
=> begin SimpleStmt; SimpleStmt; <stmts> end $
=> begin SimpleStmt; SimpleStmt; end $
Bottom-up Parsing Example
Input: begin SimpleStmt; SimpleStmt; end $
Step 1: replace λ (the empty string before end) with <stmts>
  begin SimpleStmt; SimpleStmt; <stmts> end $
Step 2: replace SimpleStmt; <stmts> with <stmts>
  begin SimpleStmt; <stmts> end $
Step 3: replace SimpleStmt; <stmts> with <stmts>
  begin <stmts> end $
Step 4: replace begin <stmts> end $ with the start symbol
  <program>
[Figure: the parse tree grows upward from the leaves as each reduction adds a <stmts> or <program> node]
Bottom Up Parsing
Consider this grammar:
S --> a T U e
T --> T b c | b
U --> d
and the rightmost derivation of the
sentence:
a b b c d e:
S ==> a T U e
==> a T d e
==> a T b c d e
==> a b b c d e
Bottom Up Parsing
An LR parser is a bottom-up parser that reads the input from left to right and performs a rightmost derivation in reverse order.
There are four steps in the rightmost derivation of a b b c d e, so a bottom-up parser performs these steps in reverse order:
S ==> a T U e
==> a T d e
==> a T b c d e
==> a b b c d e
Bottom Up Parsing
The parser examines the sentence ( a b b c d e ) for substrings that match the right-sides of productions in the grammar.
There are three cases:
the first (b) in the sentence;
the second (b) in the sentence;
or the (d).
The parser chooses the first b and reduces it to the left-side of the T --> b production to produce the sentential form: a T b c d e .
Bottom Up Parsing
The parser examines the sentential form ( a T b c d e )
for substrings that match the right-sides of productions in
the grammar.
There are three cases:
( T b c ), (b), and (d).
The parser chooses ( T b c ) and reduces it to the left-side of the production T --> T b c to produce the sentential form: a T d e.
Bottom Up Parsing
The parser examines the sentential form ( a T d e ) for
substrings that match the right-sides of productions in the
grammar and finds only one case:
(d).
The parser reduces it to the left-side of the production:
U --> d to produce the sentential form: a T U e.
Bottom Up Parsing
The parser examines the sentential form ( a T U e ) for substrings that match the right-sides of productions in the grammar and finds that the only case is the whole string: ( a T U e ).
The parser reduces it to the left-side of the production: S --> a T U e to produce a sentential form containing only the start
symbol, S.
Note that each step applies a production in reverse, replacing the right-side with the left-side, so we use the word reduce instead of produce.
Handles
The substring of the sentential form that the parser
chooses to reduce in each step of the parse is called the
handle for that step.
In the previous example the handles are:
1. the first (b) in ( a b b c d e ).
2. the ( T b c ) substring in ( a T b c d e ).
3. the (d) in ( a T d e ).
4. the whole string, ( a T U e ).
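The four handle choices listed above can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides; the helper names `reduce_at` and `replay` and the 0-based positions in `HANDLES` are illustrative assumptions. It only checks that the chosen handles lead from the sentence to S; it does not decide which handle to choose.

```python
# Grammar from the slides: S --> a T U e, T --> T b c | b, U --> d

def reduce_at(form, start, rhs, lhs):
    """Replace the handle `rhs` found at index `start` of `form` by `lhs`."""
    assert form[start:start + len(rhs)] == rhs, "handle not found at that position"
    return form[:start] + [lhs] + form[start + len(rhs):]

# (0-based position, right side, left side) of the handle chosen at each step.
HANDLES = [
    (1, ["b"], "T"),
    (1, ["T", "b", "c"], "T"),
    (2, ["d"], "U"),
    (0, ["a", "T", "U", "e"], "S"),
]

def replay(sentence):
    """Return every sentential form produced while pruning the handles."""
    forms = [sentence]
    for start, rhs, lhs in HANDLES:
        forms.append(reduce_at(forms[-1], start, rhs, lhs))
    return forms

forms = replay("a b b c d e".split())
for form in forms:
    print(" ".join(form))
```

Read from bottom to top, the printed forms are the rightmost derivation shown earlier, which is why the reductions are said to trace that derivation in reverse.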
Handles
In step 1 and in step 2 of the example the parser has three possible handles to choose from:
if the parser chooses the wrong handle it won't be able to complete the reverse-ordered rightmost derivation.
The main task of a bottom-up parser is to choose the correct handle at each step of the parse.
There could be many choices on any step;
e.g., the empty string can be inserted into a string of n symbols in any of n + 1 different locations, so just a single ε-production in a grammar gives us many possible handles to choose from.
Shift Reduce Parsing
Most bottom-up parsers are implemented as shift-reduce
parsers.
Such a parser uses a stack to hold grammar symbols (it is
convenient to think of a horizontal stack with its bottom on
the left and its top on the right) and has four possible actions:
Shift: Move the next input symbol on to the top (right) of the stack.
Reduce: Reduce a handle on the right-most part of the stack by
popping it off the stack and pushing the left-side of the appropriate
production on to the right-end of the stack.
Accept: Announce successful completion of parsing.
Error: Signal discovery of a syntax error.
Shift Reduce Parsing
We use $ to mark the left-end (bottom) of the stack and also the end of the input string.
Initially the stack is empty.
Parsing ends successfully when the input is empty and the stack contains only the start symbol.
As an example we use the following grammar:
E --> E + E
E --> E * E
E --> ( E )
E --> id
Example (Louden)
Grammar:
E → E + n | n
Input: 2 + 3, or n + n
Parse: ($ is EOF in input, also bottom of stack)
   Parsing stack   Input      Action
1  $               n + n $    shift
2  $ n             + n $      reduce E → n
3  $ E             + n $      shift
4  $ E +           n $        shift
5  $ E + n         $          reduce E → E + n
6  $ E             $          accept
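The trace above can be reproduced with a few lines of code. This is a hedged sketch, not taken from Louden or the slides: the function name `parse` and the hand-written handle tests are assumptions that work only for the grammar E → E + n | n. A real LR parser makes the same decisions from tables instead.

```python
def parse(tokens):
    """Shift-reduce recognizer for E -> E + n | n.

    Returns the trace as a list of (stack, input, action) rows and
    raises SyntaxError when no action applies.
    """
    stack, rest, rows = ["$"], list(tokens) + ["$"], []
    while True:
        if stack[-3:] == ["E", "+", "n"]:
            action = "reduce E -> E + n"      # handle E + n is on top of the stack
        elif stack == ["$", "n"]:
            action = "reduce E -> n"          # a leading n is the handle
        elif rest[0] != "$":
            action = "shift"
        elif stack == ["$", "E"]:
            action = "accept"
        else:
            raise SyntaxError("no action for stack %r, input %r" % (stack, rest))
        rows.append((" ".join(stack), " ".join(rest), action))
        if action == "accept":
            return rows
        if action == "shift":
            stack.append(rest.pop(0))
        elif action == "reduce E -> n":
            stack[-1:] = ["E"]
        else:
            stack[-3:] = ["E"]

for number, (stack, rest, action) in enumerate(parse(["n", "+", "n"]), start=1):
    print(number, stack.ljust(10), rest.ljust(8), action)
```

Note that the test for the two reductions depends on what lies below the n on the stack, which is the "stack state" issue raised in the notes that follow.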
Notes:
Left recursion is not a problem in bottom-up
parsing. Indeed, as we shall see, lookahead is
not as serious an issue.
Keeping track of what is on the stack, however,
is an issue (note the difference in the grammar
rule reductions at lines 2 and 5 of the previous
example). See later discussion on stack state.
Right recursion is actually a bit of a problem,
because it makes the stack grow large (see next example).
Example
Grammar:
E → n + E | n
Input: 2 + 3, or n + n
Parse:
   Parsing stack   Input      Action
1  $               n + n $    shift
2  $ n             + n $      shift
3  $ n +           n $        shift
4  $ n + n         $          reduce E → n
5  $ n + E         $          reduce E → n + E
6  $ E             $          accept
Shift Reduce Parsing
The following figure shows the actions of a shift-reduce parser to parse the input string id1 * (id2 + id3) according to the grammar.
STACK                 INPUT                      ACTION
$                     id1 * ( id2 + id3 ) $      shift
$ id1                 * ( id2 + id3 ) $          E --> id
$ E                   * ( id2 + id3 ) $          shift
$ E *                 ( id2 + id3 ) $            shift
$ E * (               id2 + id3 ) $              shift
$ E * ( id2           + id3 ) $                  E --> id
$ E * ( E             + id3 ) $                  shift
$ E * ( E +           id3 ) $                    shift
$ E * ( E + id3       ) $                        E --> id
$ E * ( E + E         ) $                        E --> E + E
$ E * ( E             ) $                        shift
$ E * ( E )           $                          E --> ( E )
$ E * E               $                          E --> E * E
$ E                   $                          accept
Shift Reduce Parsing
Shift-reduce parsers can be constructed for a large class of grammars, the LR grammars, but the construction is usually so complicated that they are built only by parser-construction programs (such as YACC).
However, the next section will show that there is a small but important class of grammars where shift-reduce parsers can be easily constructed by hand.
Introduction(2)
In Chapter 6
Bottom-up parsers
A bottom-up parser, or a shift-reduce parser, begins at the leaves and works up to the top of the tree.
The reduction steps trace a rightmost derivation in reverse.
Example (explained on the following slides):
Grammar
S → aABe
A → Abc | b
B → d
The input string to parse: abbcde
Introduction(3)-(12)
Bottom-Up Parser Example
[Figure: animation of a bottom-up parsing program with three panels: INPUT (a b b c d e $), OUTPUT (the parse tree built so far), and Production (S → aABe, A → Abc, A → b, B → d)]
Steps shown in the animation:
1. Shift a
2. Shift b; reduce from b to A
3. Shift A
4. Shift b
5. Shift c; reduce from Abc to A
6. Shift A
7. Shift d; reduce from d to B
8. Shift B
9. Shift e; reduce from aABe to S
10. Shift S; hit the target $
This parser is known as an LR parser because it scans the input from Left to right, and it constructs a Rightmost derivation in reverse order.
Introduction(13)
Conclusion
The method scans the productions to match handles in the input string.
This scanning, together with backtracking, makes the method used in the previous example very inefficient.
Can we do better? We will discuss this later.
Parse Trees
Phrase – sequence of tokens descended from a
nonterminal
Simple phrase – phrase that contains no smaller
phrase at the leaves
Handle – the leftmost simple phrase
Shift-Reduce Parsers(1)
A shift-reduce (bottom-up) parser is known as an LR parser
It scans the input from Left to right
It traces a Rightmost derivation in reverse order
Kinds of LR
LR(k)
most powerful deterministic bottom-up parsing using k lookaheads
SLR(k)
LALR(k)
LR parsing: a mechanism to perform bottom-up parsing, using a finite state machine to manipulate "handles"
Components
Parse stack
Shift-reduce driver
Action table
Goto table
Shift-Reduce Parsers(2)
Parse stack
Initially empty, contains symbols already parsed
Elements in the stack are terminal or non-terminal symbols
The parse stack catenated with the remaining input always
represents a right sentential form
Shift-Reduce Parsers(3)
Shift-Reduce driver
Shift -- when top of stack doesn't contain a handle of the
sentential form
push input token (with contextual information) onto stack
Reduce -- when top of stack contains a handle
pop the handle
push reduced non-terminal (with contextual information)
Success when no input left and goal symbol on the stack
Shift-Reduce Parsers(4)
Two questions
– Have we reached the end of handles and how long is the
handle?
– Which non-terminal does the handle reduce to?
We use tables to answer the questions
ACTION table
GOTO table
Shift-Reduce Parsers(5)
LR parsers are driven by two tables:
Action table, which specifies the action to take
Shift, reduce, accept (terminate with success), or error
Goto table, which specifies the state transitions
Defines the successor state after a token or LHS is matched and shifted.
Parse stack – contains parse states (not symbols)
Each state encodes the shifted symbol and the handles that are being matched, that is, a possible sub-tree of the parse tree
Shift-Reduce Parsers(6)
grammar G0
1. <program> → begin <stmts> end $
2. <stmts> → SimpleStmt ; <stmts>
3. <stmts> → begin <stmts> end ; <stmts>
4. <stmts> → λ
[Figure: Action Table and Goto Table; a blank entry means ERROR]
Shift-Reduce Parser
S – top parse stack state
T – current input token
Push(S0) // start state
Loop forever
  case Action(S,T)
    error => ReportSyntaxError()
    accept => CleanUpAndFinish()
    shift => Push(GoTo(S,T))
             Scanner(T) // yylex()
    reduce => Assume X -> Y1...Ym
             Pop(m) // S' is new stack top
             Push(GoTo(S',X))
Shift-Reduce Parsers(7)
void shift_reduce_driver(void)
{
    /* Push the Start State, S0,
     * onto an empty parse stack. */
    push(S0);
    while (TRUE) { /* forever */
        /* Let S be the top parse stack state;
         * let T be the current input token. */
        switch (action[S][T]) {
        case ERROR:
            announce_syntax_error();
            break;
        case ACCEPT:
            /* The input has been correctly
             * parsed. */
            clean_up_and_finish();
            return;
        case SHIFT:
            push(go_to[S][T]);
            scanner(&T); /* Get next token. */
            break;
        case REDUCEi:
            /* Assume i-th production is
             * X → Y1 … Ym.
             * Remove states corresponding to
             * the RHS of the production. */
            pop(m);
            /* S' is the new stack top. */
            push(go_to[S'][X]);
            break;
        }
    }
}
Shift-Reduce Parsers(8)
grammar G0
1. <program> → begin <stmts> end $
2. <stmts> → SimpleStmt ; <stmts>
3. <stmts> → begin <stmts> end ; <stmts>
4. <stmts> → λ
tracing steps
Step  Parse Stack  Remaining Input                        Action
(1)   0            begin SimpleStmt ; SimpleStmt ; end $  Shift 1
action table (states 0 to 11; non-blank entries of each row listed left to right, blank ERROR entries not shown)
Symbol      Entries
begin       S S S S S
end         R4 S R4 R4 S R4 R2 R3
;           S S
SimpleStmt  S S S S
$           A
tracing steps
Step  Parse Stack      Remaining Input                        Action
(1)   0                begin SimpleStmt ; SimpleStmt ; end $  Shift 1
(2)   0,1              SimpleStmt ; SimpleStmt ; end $        Shift 5
(3)   0,1,5            ; SimpleStmt ; end $                   Shift 6
(4)   0,1,5,6          SimpleStmt ; end $                     Shift 5
(5)   0,1,5,6,5        ; end $                                Shift 6
(6)   0,1,5,6,5,6      end $                                  Reduce 4  /* goto(6,<stmts>) = 10 */
(7)   0,1,5,6,5,6,10   end $                                  Reduce 2  /* goto(6,<stmts>) = 10 */
(8)   0,1,5,6,10       end $                                  Reduce 2  /* goto(1,<stmts>) = 2 */
(9)   0,1,2            end $                                  Shift 3
(10)  0,1,2,3          $                                      Accept
Shift-Reduce Parsers(18)
Parse tree (the number in parentheses is the tracing step at which the node is shifted or reduced):
<program>
  begin(1)
  <stmts>  R2(8)
    SimpleStmt(2)
    ;(3)
    <stmts>  R2(7)
      SimpleStmt(4)
      ;(5)
      <stmts>  R4(6)
        λ(6)
  end(9)
  $(10)
Outline
6.0 Introduction
6.1 Shift-Reduce Parsers
6.2 LR Parsers
6.3 LR(1) Parsing
6.4 SLR(1) Parsing
6.5 LALR(1)
6.6 Calling Semantic Routines in Shift-Reduce Parsers
6.7 Using a Parser Generator (TA course)
6.8 Optimizing Parse Tables
6.9 Practical LR(1) Parsers
6.10 Properties of LR Parsing
6.11 LL(1) or LALR(1), That Is the Question
6.12 Other Shift-Reduce Techniques
LR Parsers
LR(n), n = 0 to k
Read from Left, Rightmost derivation, n look-ahead tokens
LR parsers are deterministic
No backup or retry parsing actions
LR(0):
Without prediction: read from Left, Rightmost derivation, 0 look-ahead
LR(1):
1-token look-ahead
General LR(k) parsers
Decide the next action by examining the tokens already shifted and at most k look-ahead tokens
The most powerful of the deterministic bottom-up parsers
Difficult to implement
A production has the form
A → X1 X2 … Xj
By adding a dot, we get a configuration (or an item)
A → • X1 X2 … Xj
A → X1 X2 … Xi • Xi+1 … Xj
A → X1 X2 … Xj •
The • indicates how much of the RHS has been shifted onto the stack. An item (configuration) tells you where you are in a parse!
These are LR(0) configurations since no lookahead information is used.
An item with the • at the end of the RHS,
such as A → X1 X2 … Xj •, indicates that the RHS should be reduced to the LHS; the parser has recognized that production.
An item with the • at the beginning of the RHS,
such as A → • X1 X2 … Xj, predicts that production; that is, the RHS will be shifted onto the stack.
LR(0) Table Construction(1)
LR(0) Table Construction(2)
An LR(0) state is a set of configurations
The actual state of an LR(0) parser is denoted by one of these configuration sets.
The closure0 operation:
if there is a configuration B → δ • A ρ in the set, where A is a nonterminal, then add all configurations of the form A → • γ to the set.
The initial configuration set
s0 = closure0({S → • α $})
A configuration set is the set of all possible configurations at a given point during a parse.
configuration_set closure0(configuration_set s)
{
    configuration_set s' = s;
    do {
        if (B → δ • A ρ ∈ s' for A ∈ Vn) {
            /* Predict productions with A as LHS */
            Add all configurations of the form A → • γ to s'
        }
    } while (more new configurations can be added);
    return s';
}
EX: for grammar G1:
1. S' → S $
2. S → ID | λ
closure0({S' → • S $}) =
{ S' → • S $,
  S → • ID,
  S → • λ }
special case: for the λ-production, S → • λ is the same configuration as S → λ •
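The closure0 operation can be sketched in runnable form. The representation below is an assumption made for illustration, not something the slides prescribe: an item is a tuple (LHS, RHS, dot position), and the λ-production is written with an empty RHS.

```python
# Grammar G1 from the slides: S' -> S $, S -> ID, S -> λ (empty right side).
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("ID",)),
    ("S", ()),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure0(items):
    """Add A -> • γ for every nonterminal A that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

s0 = closure0({("S'", ("S", "$"), 0)})
for lhs, rhs, dot in sorted(s0):
    shown = list(rhs[:dot]) + ["•"] + list(rhs[dot:])
    print(lhs, "→", " ".join(shown))
```

With an empty RHS, dot position 0 is both the start and the end of the production, which mirrors the special case noted for λ.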
LR(0) Table Construction(3)
Q1: Why does the grammar use S' → S $?
Ans: To check for the end of the parse.
EX: If S' does not exist, the grammar becomes
S → ID $
S → λ $
When we work bottom-up to reduce to the start symbol S, there are two paths to reach it.
Multiple paths are a problem in complex grammars such as that of C:
on many paths we would need to check for the ending symbol $.
Given a configuration set s, we can compute its successor, s', under a symbol X
Denoted go_to0(s, X) = s'
configuration_set go_to0(configuration_set s, symbol X)
{
    sb = Ø;
    for (each configuration c ∈ s)
        if (c is of the form A → β • X γ)
            Add A → β X • γ to sb;
    /*
     * That is, we advance the • past the symbol X,
     * if possible. Configurations not having a
     * dot preceding an X are not included in sb.
     */
    /* Add new predictions to sb via closure0. */
    return closure0(sb);
}
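As a hedged illustration of the successor operation, the sketch below applies it to grammar G2 (S → E $, E → E + T | T, T → id | ( E )), which is used in the tracing examples later in these slides. Items are again assumed to be (LHS, RHS, dot position) tuples; the names are illustrative.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure0(items):
    """Predict A -> • γ for every nonterminal A that follows a dot."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def go_to0(state, symbol):
    """Advance the dot past `symbol` where possible, then close the result."""
    moved = {
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in state
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure0(moved)

s0 = closure0({("S", ("E", "$"), 0)})
after_E = go_to0(s0, "E")        # expected: S -> E • $ and E -> E • + T
after_paren = go_to0(s0, "(")    # expected: T -> ( • E ) plus its predictions
print(len(after_E), len(after_paren))
```

An empty result, for example `go_to0(s0, ")")`, corresponds to the error state of the CFSM.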
LR(0) Table Construction(4)
void build_CFSM(void)
{
    S = SET_OF(s0);
    while (S is nonempty) {
        Remove a configuration set s from S;
        /* Consider both terminals and non-terminals */
        for (X in Symbols) {
            if (go_to0(s, X) does not label a CFSM state) {
                Create a new CFSM state and label it with go_to0(s, X);
                Add go_to0(s, X) to S;
            }
            Create a transition under X from the state s labels
            to the state go_to0(s, X) labels;
        }
    }
}
The grammar is finite, so the number of configurations and configuration sets is also finite.
Characteristic finite state machine (CFSM)
Built by identifying configuration sets and successor operations with CFSM states and transitions
It is a finite automaton
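The worklist construction can be tried out on grammar G2 (S → E $, E → E + T | T, T → id | ( E )) from the tracing examples later in these slides. The sketch below is one possible rendering under stated assumptions: items are (LHS, RHS, dot) tuples, states are numbered in discovery order (so the numbers need not match the slides), and empty successor sets are skipped instead of being mapped to an explicit error state.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = sorted({sym for lhs, rhs in GRAMMAR for sym in (lhs, *rhs)})

def closure0(items):
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def go_to0(state, symbol):
    return closure0({
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in state
        if dot < len(rhs) and rhs[dot] == symbol
    })

def build_cfsm():
    """Return (states, transitions); a state's number is its list index."""
    start = closure0({(GRAMMAR[0][0], GRAMMAR[0][1], 0)})
    states, work, transitions = [start], [start], {}
    while work:
        s = work.pop(0)
        for x in SYMBOLS:
            t = go_to0(s, x)
            if not t:
                continue                  # no successor: treated as error
            if t not in states:
                states.append(t)
                work.append(t)
            transitions[(states.index(s), x)] = states.index(t)
    return states, transitions

states, transitions = build_cfsm()
print(len(states), "states,", len(transitions), "transitions")
```

The loop terminates because only finitely many configuration sets exist, which is the finiteness argument made above.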
LR(0) Table Construction(5)
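EX: for grammar G1:
1. S' → S $
2. S → ID | λ
state 0: S' → • S $, S → • ID, S → • λ
state 1: S → ID •        (from state 0 under ID)
state 2: S' → S • $      (from state 0 under S)
state 3: S' → S $ •      (from state 2 under $)
state 4: error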
int **build_go_to_table(finite_automaton CFSM) {
const int N = num_states (CFSM);
int **tab;
Dynamically allocate a table of dimension
N × num_symbols (CFSM) to represent
the go_to table and assign it to tab;
Number the states of CFSM from 0 to N-1,
with the Start State labeled 0;
for( S = 0 ; S<=N-1 ; S++) {
/* Consider both terminals and non-terminals. */
for ( X in Symbols) {
if ( State S has a transition under X to some state T)
tab [S][X] = T ;
else
tab [S][X] = EMPTY;
}
}
return tab;
}
LR(0) Table Construction(6)
The CFSM is the goto table of LR(0) parsers.
goto table
State   ID   $   S
0       1    4   2
1       4    4   4
2       4    3   4
3       4    4   4
4
Because LR(0) uses no look-ahead, we must extract the action function directly from the configuration sets of the CFSM
Let Q = {Shift, Reduce1, Reduce2, …, Reducen}
There are n productions in the CFG
Let S0 be the set of CFSM states
P is a projection function that maps each CFSM state to an appropriate subset of Q
P: S0 → 2^Q, where 2^Q is the power set of Q
P(s) = {Reducei | B → ρ • ∈ s and production i is B → ρ} ∪ (if A → α • a β ∈ s for a ∈ Vt then {Shift} else Ø)
LR(0) Table Construction(7)
G is LR(0) if and only if ∀ s ∈ S0, |P(s)| = 1
If G is LR(0), the action table is trivially extracted from P
P(s) = {Shift} ⇒ action[s] = Shift
P(s) = {Reducej}, where production j is the augmenting production ⇒ action[s] = Accept
P(s) = {Reducei}, i ≠ j ⇒ action[s] = Reducei
P(s) = Ø ⇒ action[s] = Error
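The projection P and the rules for deriving action[s] can be written out directly. The following sketch uses grammar G2 from the tracing examples later in these slides (1. S → E $, 2. E → E + T, 3. E → T, 4. T → id, 5. T → ( E )) and assumes, for illustration only, that a state is given as a set of (production number, dot position) pairs.

```python
PRODUCTIONS = {
    1: ("S", ("E", "$")),       # augmenting production
    2: ("E", ("E", "+", "T")),
    3: ("E", ("T",)),
    4: ("T", ("id",)),
    5: ("T", ("(", "E", ")")),
}
TERMINALS = {"+", "id", "(", ")", "$"}

def P(state):
    """Project a configuration set onto a subset of {Shift, Reduce1, ...}."""
    actions = set()
    for number, dot in state:
        rhs = PRODUCTIONS[number][1]
        if dot == len(rhs):
            actions.add("Reduce%d" % number)   # dot at the end: B -> ρ •
        elif rhs[dot] in TERMINALS:
            actions.add("Shift")               # dot before a terminal
    return actions

def action(state, augmenting=1):
    """Derive the LR(0) action for a state, or fail if it is inadequate."""
    choices = P(state)
    if len(choices) > 1:
        raise ValueError("inadequate state: %s" % sorted(choices))
    if not choices:
        return "Error"
    (only,) = choices
    return "Accept" if only == "Reduce%d" % augmenting else only

print(action({(1, 1), (2, 1)}))   # S -> E • $,  E -> E • + T
print(action({(1, 2)}))           # S -> E $ •
print(action({(2, 3)}))           # E -> E + T •
```

A set such as {E → T •, E → E • + T}, which is a hypothetical set and not a state of the G2 machine, makes `action` raise because P returns both Shift and Reduce3.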
LR(0) Table Construction(8)
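state 0: S' → • S $, S → • ID, S → • λ
state 1: S → ID •        (from state 0 under ID)
state 2: S' → S • $      (from state 0 under S)
state 3: S' → S $ •      (from state 2 under $)
LR(0) Table Construction(9)
EX: for grammar G1:
1. S' → S $
2. S → ID | λ
state    0   1    2   3
action   S   R2   S   Accept
P(s) = {Reducei | B → ρ • ∈ s and production i is B → ρ} ∪ (if A → α • a β ∈ s for a ∈ Vt then {Shift} else Ø)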
Any state s ∈ S0 for which |P(s)| > 1 is said to be inadequate
Two kinds of parser conflicts create inadequacies in configuration sets
Shift-reduce conflicts
Reduce-reduce conflicts
It should be possible to resolve an inadequacy by using lookahead
It is easy to introduce inadequacies in CFSM states
Hence, few real grammars are LR(0). For example,
Consider λ-productions
The only possible configuration involving a λ-production is of the form A → λ •
However, if A can generate any terminal string other than λ, then a shift action must also be possible (on a symbol in First(A))
An LR(0) parser will have problems handling operator precedence properly
LR(0) Table Construction(13)
Before tracing, we need to understand the idea behind the CFSM
LR(0) Tracing Example(0)
for grammar G2:
1. S → E $
2. E → E + T
3. E → T
4. T → id
5. T → ( E )
closure0({T → ( • E )})
= { T → ( • E ),
    E → • E + T,
    E → • T,
    T → • id,
    T → • ( E ) }
After shifting (, some possible shapes of the tree:
[Figure: candidate parse trees rooted at T → ( E ), where E expands to E + T or T, and T expands to id or ( E )]
LR(0) Tracing Example(1)-(11)
[Figure: the CFSM for G2, built one state per slide]
closure0({S → • E $}) = {S → • E $, E → • E + T, E → • T, T → • id, T → • ( E )}
closure0({S → E • $, E → E • + T}) = itself
closure0({S → E $ •}) = itself
closure0({E → E + • T}) = {E → E + • T, T → • id, T → • ( E )}
closure0({E → E + T •}) = itself
closure0({T → id •}) = itself
closure0({T → ( • E )}) = {T → ( • E ), E → • E + T, E → • T, T → • id, T → • ( E )}
closure0({T → ( E • ), E → E • + T}) = itself
closure0({T → ( E ) •}) = itself
closure0({E → T •}) = itself
Resulting states:
state 0: S → • E $, E → • E + T, E → • T, T → • id, T → • ( E )
state 1: S → E • $, E → E • + T
state 2: S → E $ •
state 3: E → E + • T, T → • id, T → • ( E )
state 4: E → E + T •
state 5: T → id •
state 6: T → ( • E ), E → • E + T, E → • T, T → • id, T → • ( E )
state 7: T → ( E • ), E → E • + T
state 8: T → ( E ) •
state 9: E → T •
state 10: Error (reached under any other symbol)
Transitions:
state 0: E to 1, T to 9, id to 5, ( to 6
state 1: + to 3, $ to 2
state 3: T to 4, id to 5, ( to 6
state 6: E to 7, T to 9, id to 5, ( to 6
state 7: + to 3, ) to 8
action table (the action is the same for any input symbol)
State    0  1  2  3  4   5   6  7  8   9   10
Action   S  S  A  S  R2  R4  S  S  R5  R3  Error
P(s) = {Reducei | B → ρ • ∈ s and production i is B → ρ} ∪ (if A → α • a β ∈ s for a ∈ Vt then {Shift} else Ø)
LR(0) Tracing Example(12)
goto table
State   S   E   T   +   id   (   )   $
0           1   9       5    6
1                   3                2
2
3               4       5    6
4
5
6           7   9       5    6
7                   3            8
8
9
10
Program Example (1)-(9)
Parsing the input ( id ) $ with the goto table and action table of grammar G2:
Step  Parse stack  Remaining input  Action
1     0            ( id ) $         shift (
2     0 6          id ) $           shift id
3     0 6 5        ) $              reduce 4
4     0 6 9        ) $              reduce 3
5     0 6 7        ) $              shift )
6     0 6 7 8      $                reduce 5
7     0 9          $                reduce 3
8     0 1          $                shift $
9     0 1 2                         Accept
Tree (built bottom-up, one node per step):
S
  E
    T
      (
      E
        T
          id
      )
  $
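The nine steps of this example can be reproduced with a small table-driven driver. The sketch below is an illustration under stated assumptions: the dictionaries `ACTION` and `GOTO` are transcribed from the G2 action and goto tables in these slides, a missing goto entry is treated as a syntax error, and the function name `parse` is hypothetical. Because the machine is LR(0), the action depends only on the state on top of the stack.

```python
# Action per state for grammar G2 (S = shift, A = accept, Rn = reduce by n).
ACTION = {0: "S", 1: "S", 2: "A", 3: "S", 4: "R2",
          5: "R4", 6: "S", 7: "S", 8: "R5", 9: "R3"}
# Goto entries for both terminals and nonterminals; absent means error.
GOTO = {
    0: {"E": 1, "T": 9, "id": 5, "(": 6},
    1: {"+": 3, "$": 2},
    3: {"T": 4, "id": 5, "(": 6},
    6: {"E": 7, "T": 9, "id": 5, "(": 6},
    7: {"+": 3, ")": 8},
}
# Production number -> (left side, length of right side).
PRODUCTIONS = {2: ("E", 3), 3: ("E", 1), 4: ("T", 1), 5: ("T", 3)}

def parse(tokens):
    """Return the trace as a list of (stack, action); raise SyntaxError on error."""
    stack, pos, trace = [0], 0, []
    while True:
        state = stack[-1]
        act = ACTION[state]
        trace.append((tuple(stack), act))
        if act == "A":
            return trace
        if act == "S":
            if pos >= len(tokens):
                raise SyntaxError("unexpected end of input")
            nxt = GOTO.get(state, {}).get(tokens[pos])
            if nxt is None:
                raise SyntaxError("unexpected token %r in state %d" % (tokens[pos], state))
            stack.append(nxt)
            pos += 1
        else:
            lhs, length = PRODUCTIONS[int(act[1:])]
            del stack[-length:]                     # pop one state per RHS symbol
            stack.append(GOTO[stack[-1]][lhs])      # then take the goto on the LHS

for stack, act in parse(["(", "id", ")", "$"]):
    print(" ".join(map(str, stack)).ljust(10), act)
```

The stack contents printed by the loop correspond to the Parse stack column of steps 1 to 9 above.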
LR(1) Parsing (1)
An LR(1) configuration, or item, is of the form
A → X1 X2 … Xi • Xi+1 … Xj, l   where l ∈ Vt ∪ {λ}
The look-ahead component l represents a possible look-ahead after the entire right-hand side has been matched
The λ appears as look-ahead only for the augmenting production, because there is no look-ahead after the end-marker
We use the following notation to represent the set of LR(1) configurations that share the same dotted production
A → X1 X2 … Xi • Xi+1 … Xj, {l1 … lm}
= {A → X1 X2 … Xi • Xi+1 … Xj, l1}
∪ {A → X1 X2 … Xi • Xi+1 … Xj, l2}
∪ …
∪ {A → X1 X2 … Xi • Xi+1 … Xj, lm}
LR(1) Parsing (2)
There are many more distinct LR(1) configurations than LR(0) configurations.
In fact, the major difficulty with LR(1) parsers is not their power but rather finding storage-efficient ways to represent them.
Parsing begins with the configuration set closure1({S → • α $, {λ}})

Configuration_set closure1(configuration_set s)
{
    configuration_set s' = s;
    do {
        if (B → δ • A ρ, l ∈ s' for A ∈ Vn) {
            /*
             * Predict productions with A as the left-hand side.
             * Possible look-aheads are First(ρ l)
             */
            Add all configurations of the form A → • γ, u
            where u ∈ First(ρ l) to s'
        }
    } while (more new configurations can be added);
    return s';
}

for grammar G2 :
1. S → E $
2. E → E + T
3. E → T
4. T → id
5. T → ( E )
closure1({S → • E $, {λ}}) =
{ S → • E $, {λ}
  E → • E + T, {$ +}
  E → • T, {$ +}
  T → • id, {$ +}
  T → • ( E ), {$ +} }
LR(1) Parsing (3)
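Tracing Example for grammar G2 :
1. S → E $
2. E → E + T
3. E → T
4. T → id
5. T → ( E )
closure1({S → • E $, {λ}})
Start with:                          S → • E $, {λ}
Predict E, look-ahead First($):      E → • E + T, {$}     E → • T, {$}
Predict T from E → • T, {$}:         T → • id, {$}        T → • ( E ), {$}
Predict E from E → • E + T, look-ahead First(+ T): E → • E + T, {+}     E → • T, {+}
Predict T from E → • T, {+}:         T → • id, {+}        T → • ( E ), {+}
closure1({S → • E $, {λ}}) =
{ S → • E $, {λ}
  E → • E + T, {$ +}
  E → • T, {$ +}
  T → • id, {$ +}
  T → • ( E ), {$ +} }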
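The closure1 trace for G2 can be replayed with a short Python sketch. This is only an illustration under stated assumptions: the tuple encoding of configurations, the names `G2`, `first_sets` and `closure1`, and the use of the empty string for λ are choices made here, not part of the slides. It also relies on G2 having no λ-productions, so First of a symbol string is decided by its first symbol.

```python
# Grammar G2 from the slides. "" stands for the empty look-ahead λ (an
# encoding chosen for this sketch).
G2 = {
    "S": [("E", "$")],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}

def first_sets(grammar):
    """FIRST of every non-terminal. Assumes no λ-productions, so only the
    first right-hand-side symbol matters."""
    first = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def closure1(items, grammar):
    """items: set of (lhs, rhs, dot, lookahead) tuples."""
    first = first_sets(grammar)
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in grammar:
            continue                       # no non-terminal after the dot
        rest = rhs[dot + 1:]
        if rest:                           # First(rest l), rest non-empty
            sym = rest[0]
            lookaheads = first[sym] if sym in grammar else {sym}
        else:                              # rest is empty: First(l) = {l}
            lookaheads = {la}
        for alt in grammar[rhs[dot]]:
            for u in lookaheads:
                item = (rhs[dot], alt, 0, u)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

state0 = closure1({("S", ("E", "$"), 0, "")}, G2)
for lhs, rhs, dot, la in sorted(state0):
    print(lhs, "->", " ".join(rhs[:dot]), "•", " ".join(rhs[dot:]), ",", la or "λ")
```

Each printed line is one single-look-ahead configuration; grouping lines with the same dotted production gives the `{$ +}` sets shown in the trace.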
LR(1) Parsing (4)
Given an LR(1) configuration set s,
we compute its successor, s', under a symbol X: go_to1(s, X)

Configuration_set goto1(configuration_set s, symbol X)
{
    sb = Ø;
    /* Same loop as in goto0, but each configuration carries a look-ahead l. */
    for (each configuration c ∈ s)
        if (c is of the form A → β • X γ, l)
            Add A → β X • γ, l to sb;
    /*
     * That is, we advance the • past the symbol X,
     * if possible. Configurations not having a
     * dot preceding an X are not included in sb.
     */
    /* Add new predictions to sb via closure1. */
    return closure1(sb);
}
LR(1) Parsing (5)
We can build a finite automaton that is the analogue of the LR(0) CFSM:
the LR(1) FSM, or LR(1) machine
The relationship between the CFSM and the LR(1) machine: by merging the LR(1) machine's configuration sets that differ only in look-aheads, we obtain the CFSM

void build_LR1(void)
{
    Create the Start State of the FSM; label it with s0
    Put s0 into an initially empty set, S.
    while (S is nonempty) {
        Remove a configuration set s from S;
        /* Consider both terminals and non-terminals */
        for (X in Symbols) {
            if (go_to1(s, X) does not label an FSM state) {
                Create a new FSM state and label it with go_to1(s, X);
                Put go_to1(s, X) into S;
            }
            Create a transition under X from the state s labels
            to the state go_to1(s, X) labels;
        }
    }
}

Tracing Example:
for grammar G3 :
1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )
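As a cross-check on the construction that follows, here is a minimal Python sketch of closure1, go_to1 and the build_LR1 worklist for G3. Everything about the encoding is an assumption of this sketch (tuples for configurations, `""` for λ, the helper names `first_of`, `closure1`, `goto1`, `build_lr1`), and `first_of` is only valid for grammars without λ-productions, which holds for G3. States are numbered in discovery order, so the numbers need not match the slides.

```python
G3 = {
    "S": [("E", "$")],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "P"), ("P",)],
    "P": [("id",), ("(", "E", ")")],
}

def first_of(symbol, grammar, seen=()):
    """FIRST of one symbol; valid because G3 has no λ-productions."""
    if symbol not in grammar:
        return {symbol}
    result = set()
    for rhs in grammar[symbol]:
        if rhs[0] not in seen + (symbol,):
            result |= first_of(rhs[0], grammar, seen + (symbol,))
    return result

def closure1(items, grammar):
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in grammar:
            continue
        rest = rhs[dot + 1:]
        lookaheads = first_of(rest[0], grammar) if rest else {la}
        for alt in grammar[rhs[dot]]:
            for u in lookaheads:
                item = (rhs[dot], alt, 0, u)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

def goto1(state, x, grammar):
    """Advance the dot past x, then close the resulting set."""
    moved = {(l, r, d + 1, la) for (l, r, d, la) in state
             if d < len(r) and r[d] == x}
    return frozenset(closure1(moved, grammar)) if moved else frozenset()

def build_lr1(grammar, start_item):
    s0 = frozenset(closure1({start_item}, grammar))
    states, transitions, work = {s0: 0}, {}, [s0]
    symbols = set(grammar) | {x for alts in grammar.values()
                              for rhs in alts for x in rhs}
    while work:
        s = work.pop()
        for x in sorted(symbols):          # terminals and non-terminals
            t = goto1(s, x, grammar)
            if not t:
                continue
            if t not in states:            # a new FSM state
                states[t] = len(states)
                work.append(t)
            transitions[(states[s], x)] = states[t]
    return states, transitions

states, transitions = build_lr1(G3, ("S", ("E", "$"), 0, ""))
# Dropping the look-aheads leaves the LR(0) cores, i.e. the CFSM states.
cores = {frozenset((l, r, d) for (l, r, d, _) in s) for s in states}
print(len(states), len(cores))   # 23 13
```

The two counts agree with the slides: 23 LR(1) states (0 to 22) against 13 CFSM states (0 to 12).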
LR(1) machine for G3 (the state diagram is built up one state at a time; the completed machine is listed here, one state per block)

state 0
  S → • E $, {λ}
  E → • E + T, {$ +}
  E → • T, {$ +}
  T → • T * P, {$ + *}
  T → • P, {$ + *}
  P → • id, {$ + *}
  P → • ( E ), {$ + *}
  transitions: E → state 1, T → state 7, P → state 4, id → state 5, ( → state 6

state 1
  S → E • $, {λ}
  E → E • + T, {$ +}
  transitions: $ → state 2, + → state 3

state 2 //Accept
  S → E $ •, {λ}

state 3
  E → E + • T, {$ +}
  T → • T * P, {$ + *}
  T → • P, {$ + *}
  P → • id, {$ + *}
  P → • ( E ), {$ + *}
  transitions: T → state 11, P → state 4, id → state 5, ( → state 6

state 4
  T → P •, {$ + *}

state 5
  P → id •, {$ + *}

state 6
  P → ( • E ), {$ + *}
  E → • E + T, {) +}
  E → • T, {) +}
  T → • T * P, {) + *}
  T → • P, {) + *}
  P → • id, {) + *}
  P → • ( E ), {) + *}
  transitions: E → state 12, T → state 19, P → state 14, id → state 10, ( → state 18
  Be careful with the look-aheads: the predicted configurations carry ) instead of $,
  so the successors of state 6 are new states, not states 7, 4, 5 and 6.

state 7
  E → T •, {$ +}
  T → T • * P, {$ + *}
  transitions: * → state 8

state 8
  T → T * • P, {$ + *}
  P → • id, {$ + *}
  P → • ( E ), {$ + *}
  transitions: P → state 9, id → state 5, ( → state 6

state 9
  T → T * P •, {$ + *}

state 10
  P → id •, {) + *}

state 11
  E → E + T •, {$ +}
  T → T • * P, {$ + *}
  transitions: * → state 8

state 12
  P → ( E • ), {$ + *}
  E → E • + T, {) +}
  transitions: ) → state 13, + → state 17

state 13
  P → ( E ) •, {$ + *}

state 14
  T → P •, {) + *}

state 15
  P → ( E ) •, {) + *}

state 16
  P → ( E • ), {) + *}
  E → E • + T, {) +}
  transitions: ) → state 15, + → state 17

state 17
  E → E + • T, {) +}
  T → • T * P, {) + *}
  T → • P, {) + *}
  P → • id, {) + *}
  P → • ( E ), {) + *}
  transitions: T → state 20, P → state 14, id → state 10, ( → state 18

state 18
  P → ( • E ), {) + *}
  E → • E + T, {) +}
  E → • T, {) +}
  T → • T * P, {) + *}
  T → • P, {) + *}
  P → • id, {) + *}
  P → • ( E ), {) + *}
  transitions: E → state 16, T → state 19, P → state 14, id → state 10, ( → state 18

state 19
  E → T •, {) +}
  T → T • * P, {) + *}
  transitions: * → state 21

state 20
  E → E + T •, {) +}
  T → T • * P, {) + *}
  transitions: * → state 21

state 21
  T → T * • P, {) + *}
  P → • id, {) + *}
  P → • ( E ), {) + *}
  transitions: P → state 22, id → state 10, ( → state 18

state 22
  T → T * P •, {) + *}
LR(1) Parsing (23)
The go_to table used to drive an LR(1) parser is extracted
directly from the LR(1) machine
The algorithm that generates the go_to table is the same
one we discussed for LR(0)
LR(1) Parsing (24)
The action table is extracted directly from the configuration sets of the LR(1) machine
A projection function, P
P : S1 × Vt → 2^Q
where S1 is the set of LR(1) machine states and Q is the set of shift and reduce actions
P(s, a) = {Reduce_i | B → ρ •, a ∈ s and production i is B → ρ}
          ∪ (if A → α • a β, b ∈ s then {Shift} else ∅)
LR(1) Parsing (25)
G is LR(1) if and only if
∀ s ∈ S1, ∀ a ∈ Vt : |P(s, a)| ≤ 1
If G is LR(1), the action table is trivially extracted from P
P(s, $) = {Shift}            ⇒ action[s][$] = Accept
P(s, a) = {Shift}, a ≠ $     ⇒ action[s][a] = Shift
P(s, a) = {Reduce_i}         ⇒ action[s][a] = Reduce_i
P(s, a) = ∅                  ⇒ action[s][a] = Error
LR(1) Parsing (26)
Example: state 7 of the LR(1) machine for G3
E → T •, {$ +}        Reduce (production 3) when the look-ahead is $ or +
T → T • * P, {$ + *}  Shift when the look-ahead is *
P(s, a) = {Reduce_i | B → ρ •, a ∈ s and production i is B → ρ}
          ∪ (if A → α • a β, b ∈ s then {Shift} else ∅)
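The projection function can be tried out on state 7 alone. The sketch below is a hedged illustration: the encoding of a state as (production number, dot position, look-ahead set) triples and the names `PRODUCTIONS`, `STATE_7` and `project` are assumptions made here, not notation from the slides.

```python
# The two productions that occur in LR(1) state 7 of G3.
PRODUCTIONS = {3: ("E", ("T",)), 4: ("T", ("T", "*", "P"))}

# LR(1) state 7: (production number, dot position, look-ahead set)
STATE_7 = [
    (3, 1, {"$", "+"}),         # E → T •
    (4, 1, {"$", "+", "*"}),    # T → T • * P
]

def project(state, a):
    """P(s, a): the set of actions that state s allows on look-ahead a."""
    actions = set()
    for prod, dot, lookaheads in state:
        lhs, rhs = PRODUCTIONS[prod]
        if dot == len(rhs) and a in lookaheads:
            actions.add(f"R{prod}")        # completed item, look-ahead matches
        elif dot < len(rhs) and rhs[dot] == a:
            actions.add("S")               # terminal a follows the dot
    return actions

for a in ["+", "*", "id", "(", ")", "$"]:
    print(a, project(STATE_7, a) or "Error")
```

Every look-ahead yields at most one action, which is exactly the |P(s, a)| ≤ 1 condition for this state; with the look-ahead sets removed, as in LR(0), both Shift and Reduce 3 would apply.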
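Complete Table
Merge Action table & Go-To table (one row per state; "-" = error entry; the column for non-terminal S is empty and omitted)

State   +    *    id   (    )    $    E    T    P
0       -    -    S5   S6   -    -    S1   S7   S4
1       S3   -    -    -    -    A    -    -    -
2       -    -    -    -    -    -    -    -    -
3       -    -    S5   S6   -    -    -    S11  S4
4       R5   R5   -    -    -    R5   -    -    -
5       R6   R6   -    -    -    R6   -    -    -
6       -    -    S10  S18  -    -    S12  S19  S14
7       R3   S8   -    -    -    R3   -    -    -
8       -    -    S5   S6   -    -    -    -    S9
9       R4   R4   -    -    -    R4   -    -    -
10      R6   R6   -    -    R6   -    -    -    -
11      R2   S8   -    -    -    R2   -    -    -
12      S17  -    -    -    S13  -    -    -    -
13      R7   R7   -    -    -    R7   -    -    -
14      R5   R5   -    -    R5   -    -    -    -
15      R7   R7   -    -    R7   -    -    -    -
16      S17  -    -    -    S15  -    -    -    -
17      -    -    S10  S18  -    -    -    S20  S14
18      -    -    S10  S18  -    -    S16  S19  S14
19      R3   S21  -    -    R3   -    -    -    -
20      R2   S21  -    -    R2   -    -    -    -
21      -    -    S10  S18  -    -    -    -    S22
22      R4   R4   -    -    R4   -    -    -    -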
Compare the G3 actions in LR(0) and LR(1)
for grammar G3 :
1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )

LR(0): one action per state, whatever the next symbol is
State   0  1  2  3  4   5   6  7     8  9   10  11    12
Action  S  S  A  S  R5  R6  S  S/R3  S  R4  R7  S/R2  S
States 7 and 11 are ambiguous: each holds both a shift and a reduce.

LR(1): the action also depends on the look-ahead (see the complete table), so these conflicts disappear.

state 7 in LR(0):   E → T •             T → T • * P
state 7 in LR(1):   E → T •, {$ +}      T → T • * P, {$ + *}
Initial input: (id+id)$
Steps 7 to 15 of the LR(1) parse for G3:

Step  Parse stack      Remaining input  Action
7     0 6 12 17        id)$             shift id
8     0 6 12 17 10     )$               Reduce 6  (P → id)
9     0 6 12 17 14     )$               Reduce 5  (T → P)
10    0 6 12 17 20     )$               Reduce 2  (E → E + T)
11    0 6 12           )$               Shift 13
12    0 6 12 13        $                Reduce 7  (P → ( E ))
13    0 4              $                Reduce 5  (T → P)
14    0 7              $                Reduce 3  (E → T)
15    0 1              $                Accept

Tree: each reduction adds a parent node above the handle just matched.
After step 14 the tree is E( T( P( "(" E( E( T( P( id ) ) ) + T( P( id ) ) ) ")" ) ) ).
Outline 6.0 Introduction
6.1 Shift-Reduce Parsers
6.2 LR Parsers
6.3 LR(1) Parsing
6.4 SLR(1)Parsing
6.5 LALR(1)
6.6 Calling Semantic Routines in Shift-Reduce Parsers
SLR(1) Parsing (1)
LR(1) parsers
are the most powerful class of shift-reduce parsers that use a single look-ahead symbol
LR(1) grammars exist for virtually all programming languages
The problem with LR(1) is that the LR(1) machine contains so many states that the go_to and action tables become prohibitively large
In reaction to the space inefficiency of LR(1) tables, computer scientists have devised parsing techniques that are almost as
powerful as LR(1) but that require far smaller tables
One approach is to start with the CFSM, and then add look-ahead after the CFSM is built
– SLR(1)
The other approach to reducing LR(1)'s space inefficiency is to merge inessential LR(1) states
– LALR(1)
SLR(1) Parsing (2)
SLR(1) stands for Simple LR(1)
One-symbol look-ahead
Look-aheads are not built directly into configurations but rather are added after the LR(0) configuration sets are built
An SLR(1) parser will perform a reduce action for configuration B → ρ • if the look-ahead symbol is in the set Follow(B)
The SLR(1) projection function, defined on CFSM states:
P : S0 × Vt → 2^Q
P(s, a) = {Reduce_i | B → ρ • ∈ s, a ∈ Follow(B) and production i is B → ρ}
          ∪ (if A → α • a β ∈ s for a ∈ Vt then {Shift} else ∅)
SLR(1) Parsing (3)
G is SLR(1) if and only if
∀ s ∈ S0, ∀ a ∈ Vt : |P(s, a)| ≤ 1
If G is SLR(1), the action table is trivially extracted from P
P(s, $) = {Shift}            ⇒ action[s][$] = Accept
P(s, a) = {Shift}, a ≠ $     ⇒ action[s][a] = Shift
P(s, a) = {Reduce_i}         ⇒ action[s][a] = Reduce_i
P(s, a) = ∅                  ⇒ action[s][a] = Error
Clearly the SLR(1) grammars are a proper superset of the LR(0) grammars
SLR(1) Parsing (4)
Consider G3 :
1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )
It is LR(1) but not LR(0)
What are the Follow sets in G3?
Follow(S) = {λ}
Follow(E) = {+ ) $}
Follow(T) = {+ * ) $}
Follow(P) = {+ * ) $}
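The Follow sets of G3 can be recomputed with a small fixed-point iteration. The Python sketch below is illustrative only: the dictionary encoding of G3, the helper names `first_of` and `follow_sets`, and the use of `""` for λ are assumptions of this sketch, and the code relies on G3 having no λ-productions.

```python
G3 = {
    "S": [("E", "$")],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "P"), ("P",)],
    "P": [("id",), ("(", "E", ")")],
}

def first_of(symbol, grammar, seen=()):
    """FIRST of one symbol; valid because G3 has no λ-productions."""
    if symbol not in grammar:
        return {symbol}
    result = set()
    for rhs in grammar[symbol]:
        if rhs[0] not in seen + (symbol,):
            result |= first_of(rhs[0], grammar, seen + (symbol,))
    return result

def follow_sets(grammar, start):
    follow = {a: set() for a in grammar}
    follow[start].add("")        # λ: nothing follows the augmented start symbol
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue             # Follow is defined for non-terminals
                    if i + 1 < len(rhs):     # something follows sym in this rule
                        new = first_of(rhs[i + 1], grammar)
                    else:                    # sym ends the rule: inherit Follow(lhs)
                        new = follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FOLLOW = follow_sets(G3, "S")
for nonterminal in "SETP":
    print(nonterminal, sorted(FOLLOW[nonterminal]))
```

The result matches the sets on the slide, with the empty string standing in for λ in Follow(S).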
CFSM for G3 (the state diagram is built up one state at a time; the completed machine is listed here). SLR(1) adds the Follow sets above as look-aheads for the reduce configurations.

state 0
  S → • E $
  E → • E + T
  E → • T
  T → • T * P
  T → • P
  P → • id
  P → • ( E )
  transitions: E → state 1, T → state 7, P → state 4, id → state 5, ( → state 6

state 1
  S → E • $
  E → E • + T
  transitions: $ → state 2, + → state 3

state 2 //Accept
  S → E $ •

state 3
  E → E + • T
  T → • T * P
  T → • P
  P → • id
  P → • ( E )
  transitions: T → state 11, P → state 4, id → state 5, ( → state 6

state 4
  T → P •

state 5
  P → id •

state 6
  P → ( • E )
  E → • E + T
  E → • T
  T → • T * P
  T → • P
  P → • id
  P → • ( E )
  transitions: E → state 12, T → state 7, P → state 4, id → state 5, ( → state 6

state 7
  E → T •
  T → T • * P
  transitions: * → state 8

state 8
  T → T * • P
  P → • id
  P → • ( E )
  transitions: P → state 9, id → state 5, ( → state 6

state 9
  T → T * P •

state 10
  P → ( E ) •

state 11
  E → E + T •
  T → T • * P
  transitions: * → state 8

state 12
  P → ( E • )
  E → E • + T
  transitions: ) → state 10, + → state 3
SLR(1) Parsing (6)
Limitations of the SLR(1) Technique
The use of Follow sets to estimate the look-aheads that predict
reduce actions is less precise than using the exact look-aheads
incorporated into LR(1) configurations
See the example on the next slide
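Compare LR(1) & SLR(1): the performance of detecting errors
Consider G3 :
1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )
Consider input: id )

LR(1)
Step 1: stack 0      input id )   shift 5
Step 2: stack 0 5    input )      Error

SLR(1)
Step 1: stack 0      input id )   shift 5
Step 2: stack 0 5    input )      Reduce 6
Step 3: stack 0 4    input )      Reduce 5
Step 4: stack 0 7    input )      Reduce 3
Step 5: stack 0 1    input )      Error

LR(1) reports the error immediately. SLR(1) first performs three reductions, because ) is in Follow(P), Follow(T) and Follow(E), and only then detects the error.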
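The two traces can be replayed with a tiny table-driven parser. This Python sketch rests on stated assumptions: the dictionary layout of the tables, the names `SLR`, `LR1_FRAGMENT`, `PRODS` and `run` are invented here, the SLR(1) table is transcribed from the merged G3 table in the slides, and only rows 0 and 5 of the LR(1) table are included because this input reaches no other LR(1) state.

```python
# production number -> (left-hand side, length of right-hand side)
PRODS = {2: ("E", 3), 3: ("E", 1), 4: ("T", 3), 5: ("T", 1), 6: ("P", 1), 7: ("P", 3)}

# Merged SLR(1) action/go_to table for G3 (state 2 is the accept state).
SLR = {
    0: {"id": "S5", "(": "S6", "E": 1, "T": 7, "P": 4},
    1: {"+": "S3", "$": "A"},
    3: {"id": "S5", "(": "S6", "T": 11, "P": 4},
    4: dict.fromkeys("+*)$", "R5"),
    5: dict.fromkeys("+*)$", "R6"),
    6: {"id": "S5", "(": "S6", "E": 12, "T": 7, "P": 4},
    7: {"+": "R3", "*": "S8", ")": "R3", "$": "R3"},
    8: {"id": "S5", "(": "S6", "P": 9},
    9: dict.fromkeys("+*)$", "R4"),
    10: dict.fromkeys("+*)$", "R7"),
    11: {"+": "R2", "*": "S8", ")": "R2", "$": "R2"},
    12: {"+": "S3", ")": "S10"},
}

# Only the LR(1) rows this input can reach: state 5 has look-aheads {$ + *}.
LR1_FRAGMENT = {
    0: {"id": "S5", "(": "S6", "E": 1, "T": 7, "P": 4},
    5: dict.fromkeys("+*$", "R6"),
}

def run(table, tokens):
    """Return the list of parser moves up to Accept or Error."""
    stack, pos, log = [0], 0, []
    while True:
        entry = table.get(stack[-1], {}).get(tokens[pos])
        if entry is None:
            log.append("Error")
            return log
        if entry == "A":
            log.append("Accept")
            return log
        if entry[0] == "S":
            log.append("Shift " + entry[1:])
            stack.append(int(entry[1:]))
            pos += 1
        else:
            log.append("Reduce " + entry[1:])
            lhs, m = PRODS[int(entry[1:])]
            del stack[-m:]                       # pop the right-hand side
            stack.append(table[stack[-1]][lhs])  # go_to on the left-hand side

print(run(LR1_FRAGMENT, ["id", ")"]))
print(run(SLR, ["id", ")"]))
```

Both parsers reject the input and neither consumes the `)`; the difference is only how many reductions happen before the error is announced.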
LALR(1) (1)
LALR(1) parsers
can be built by first constructing an LR(1) parser and then
merging states
An LALR(1) parser is an LR(1) parser in which all states that differ only in the
look-ahead components of the configurations are merged
LALR is an acronym for Look Ahead LR

Example: LR(1) states 3 and 17 have the same core
state 3
  E → E + • T, {$ +}
  T → • T * P, {$ + *}
  T → • P, {$ + *}
  P → • id, {$ + *}
  P → • ( E ), {$ + *}
state 17
  E → E + • T, {) +}
  T → • T * P, {) + *}
  T → • P, {) + *}
  P → • id, {) + *}
  P → • ( E ), {) + *}
Core s'
  E → E + • T
  T → • T * P
  T → • P
  P → • id
  P → • ( E )
Cognate(s') = {c | c ∈ s, core(s) = s'}
Merged state 3 (the Cognate of s')
  E → E + • T, {) $ +}
  T → • T * P, {) $ + *}
  T → • P, {) $ + *}
  P → • id, {) $ + *}
  P → • ( E ), {) $ + *}
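Merging by core is a short operation once each LR(1) state is stored as a map from dotted production to look-ahead set. The following Python sketch is a hedged illustration: the representation and the names `STATE_3`, `STATE_17`, `core` and `merge_by_core` are assumptions of this sketch, and only the two states from the example are merged, not the whole machine.

```python
from collections import defaultdict

# LR(1) state 3 of G3 as {(lhs, rhs, dot): look-ahead set}
STATE_3 = {
    ("E", ("E", "+", "T"), 2): {"$", "+"},
    ("T", ("T", "*", "P"), 0): {"$", "+", "*"},
    ("T", ("P",), 0): {"$", "+", "*"},
    ("P", ("id",), 0): {"$", "+", "*"},
    ("P", ("(", "E", ")"), 0): {"$", "+", "*"},
}
# State 17 has the same dotted productions with ")" in place of "$".
STATE_17 = {item: {")" if a == "$" else a for a in las}
            for item, las in STATE_3.items()}

def core(state):
    """The dotted productions of a state, with look-aheads stripped."""
    return frozenset(state)

def merge_by_core(states):
    """Union the look-aheads of all states that share a core."""
    merged = defaultdict(lambda: defaultdict(set))
    for state in states:
        for item, lookaheads in state.items():
            merged[core(state)][item] |= lookaheads
    return [dict(items) for items in merged.values()]

lalr_states = merge_by_core([STATE_3, STATE_17])
for (lhs, rhs, dot), las in lalr_states[0].items():
    print(lhs, "->", " ".join(rhs[:dot]), "•", " ".join(rhs[dot:]), sorted(las))
```

Applied to all 23 LR(1) states of G3, the same grouping would leave one state per CFSM state, which is why the LALR(1) machine has the structure of the CFSM.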
Compare SLR(1) & LALR(1)
[Figures: LALR(1) diagram for G3; SLR(1) diagram (CFSM) for G3]
For G3, the action and go_to tables behave the same whether they
are built with SLR(1) or with LALR(1)
Follow(S) = {λ}
Follow(E) = {+ ) $}
Follow(T) = {+ * ) $}
Follow(P) = {+ * ) $}
Example:
Compare state 7 and state 10 in SLR(1) and LALR(1).
Are their look-aheads the same?
When would they differ?
LALR(1) (4)
The CFSM state is transformed into its LALR(1) Cognate
P : S0 × Vt → 2^Q
P(s, a) = {Reduce_i | B → ρ •, a ∈ Cognate(s) and production i is B → ρ}
          ∪ (if A → α • a β ∈ s then {Shift} else ∅)
G is LALR(1) if and only if
∀ s ∈ S0, ∀ a ∈ Vt : |P(s, a)| ≤ 1
If G is LALR(1), the action table is trivially extracted from P
P(s, $) = {Shift}            ⇒ action[s][$] = Accept
P(s, a) = {Shift}, a ≠ $     ⇒ action[s][a] = Shift
P(s, a) = {Reduce_i}         ⇒ action[s][a] = Reduce_i
P(s, a) = ∅                  ⇒ action[s][a] = Error
LALR(1) (5)
For grammar G5 :
  ……
  <prog> → <stmt> ; { <stmt> ; }
  <stmt> → ID
  <stmt> → <var> := <expr>
  <var> → ID
  <var> → ID [ <expr> ]
  <expr> → <var>
Assume statements are separated by ;'s.
The grammar is not SLR(1) because
; ∈ Follow(<stmt>) and
; ∈ Follow(<var>), since <expr> → <var>

CFSM:
state 0
  ……
  <prog> → • <stmt> ; { <stmt> ; }
  <stmt> → • ID
  <stmt> → • <var> := <expr>
  <var> → • ID
  <var> → • ID [ <expr> ]
  transition: ID → state 1
state 1
  <stmt> → ID •
  <var> → ID •
  <var> → ID • [ <expr> ]
In state 1, SLR(1) has a reduce-reduce conflict on look-ahead ; between <stmt> → ID and <var> → ID
LALR(1) (6)
However, in LALR(1),
if we reduce by <var> → ID in state 1, the next symbol must be :=
so action[1, :=] = reduce(<var> → ID)
action[1, ;] = reduce(<stmt> → ID)
action[1, [] = shift
There is no conflict.

state 0
  ……
  <prog> → • <stmt> ; { <stmt> ; }
  <stmt> → • ID, {;}
  <stmt> → • <var> := <expr>, {;}
  <var> → • ID, {:=}
  <var> → • ID [ <expr> ], {:=}
  transition: ID → state 1
state 1
  <stmt> → ID •, {;}
  <var> → ID •, {:=}
  <var> → ID • [ <expr> ], {:=}
LALR(1) (7)
A common technique
to put an LALR(1) grammar into SLR(1) form is to introduce a new non-terminal whose global (i.e. SLR) look-aheads more nearly correspond to LALR's exact look-aheads
Here <lhs> is introduced for the left-hand side of an assignment, so that Follow(<lhs>) = {:=}

grammar G5 (before):
  ……
  <prog> → <stmt> ; { <stmt> ; }
  <stmt> → ID
  <stmt> → <var> := <expr>
  <var> → ID
  <var> → ID [ <expr> ]
  <expr> → <var>

grammar G5 (after):
  ……
  <prog> → <stmt> ; { <stmt> ; }
  <stmt> → ID
  <stmt> → <lhs> := <expr>
  <lhs> → ID
  <lhs> → ID [ <expr> ]
  <var> → ID
  <var> → ID [ <expr> ]
  <expr> → <var>
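LALR(1) (8)
Both SLR(1) and LALR(1) are built on the CFSM
Does a case ever occur in which the resulting action table cannot work?
At times, it is the CFSM itself that is at fault.
In the grammar below, a different expression non-terminal is used to allow error or warning diagnostics
grammar G6 :
  S → ( <Exp1> )
  S → [ <Exp1> ]
  S → ( <Exp2> ]
  S → [ <Exp2> )
  <Exp1> → ID
  <Exp2> → ID
State 4 contains both <Exp1> → ID • and <Exp2> → ID •.
In the CFSM both configurations end up with the look-aheads ) and ],
so in state 4 we do not know which reduction to make, and hence
which state should be the next state.
In LR(1), state 4 splits into two states (one reached after (, one after [),
each with distinct look-aheads, and the conflict is resolved.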
Building LALR(1) Parsers (1)
In the definition of LALR(1)
An LR(1) machine is first built, and then its states are merged to form an
automaton identical in structure to the CFSM
This may be quite inefficient
An alternative is to build the CFSM first.
Then LALR(1) look-aheads are "propagated" from configuration to configuration
Propagate links:
Case 1: one configuration is created from another in a
previous state via a shift operation
  A → α • X γ, L1   ⇒   A → α X • γ, L2
Case 2: one configuration is created as the result of a closure
or prediction operation on another configuration
  B → β • A γ, L1   ⇒   A → • ρ, L2
  L2 = {x | x ∈ First(γ t) and t ∈ L1}
Building LALR(1) Parsers (2)
Step 1:
After the CFSM is built, we can create all the necessary propagate links to transmit look-aheads from one configuration to another (case 1)
Step 2: spontaneous look-aheads are determined (case 2)
By including in L2, for configuration A → • ρ, L2, all spontaneous look-aheads induced by configurations of the form B → β • A γ, L1
These are simply the non-λ values of First(γ)
Step 3: Then, propagate look-aheads via the propagate links

while (stack is not empty)
{
    pop the top item, assign its components to (s, c, L)
    if (configuration c in state s has any propagate links)
    {
        Try, in turn, to add L to the look-ahead set of each
        configuration so linked.
        for (each configuration c' in state s' to which L is added)
            Push (s', c', L) onto the stack
    }
}
Building LALR(1) Parsers (3) to (5)
grammar G6 :
  S → Opts $
  Opts → Opt Opt
  Opt → ID

Build the CFSM (ci is the i-th configuration of a state):
state 1
  c1: S → • Opts $
  c2: Opts → • Opt Opt
  c3: Opt → • ID
  transitions: Opt → state 2, ID → state 3
state 2
  c1: Opts → Opt • Opt
  c2: Opt → • ID
  transitions: ID → state 3
state 3
  c1: Opt → ID •

Build the initial look-aheads:
state 1
  c1: S → • Opts $, {}
  c2: Opts → • Opt Opt, {$}
  c3: Opt → • ID, {ID}
Stack (top first): (s1,c2,$) (s1,c3,ID)

Step  Pop         Effect                               Stack afterwards (top first)
1     (s1,c2,$)   Add $ to c1 in s2; push (s2,c1,$)    (s2,c1,$) (s1,c3,ID)
2     (s2,c1,$)   Add $ to c2 in s2; push (s2,c2,$)    (s2,c2,$) (s1,c3,ID)
3     (s2,c2,$)   Add $ to c1 in s3; push (s3,c1,$)    (s3,c1,$) (s1,c3,ID)
4     (s3,c1,$)   Nothing to add (no links)            (s1,c3,ID)
5     (s1,c3,ID)  Add ID to c1 in s3; push (s3,c1,ID)  (s3,c1,ID)
6     (s3,c1,ID)  Nothing to add (no links)            empty
7     Stack is empty: terminate the algorithm

Final look-aheads:
state 1
  S → • Opts $, {}
  Opts → • Opt Opt, {$}
  Opt → • ID, {ID}
state 2
  Opts → Opt • Opt, {$}
  Opt → • ID, {$}
state 3
  Opt → ID •, {$ ID}
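The propagation trace for the Opts grammar can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the `LINKS` table hard-codes only the propagate links that the slides draw (shift links and the one closure link in state 2), the state and configuration labels `s1`, `c2`, and so on follow the slides, and the variable names are invented here.

```python
# Propagate links drawn on the slides: (state, configuration) -> linked targets
LINKS = {
    ("s1", "c2"): [("s2", "c1")],   # shift Opt: Opts → • Opt Opt  to  Opts → Opt • Opt
    ("s2", "c1"): [("s2", "c2")],   # closure:   Opts → Opt • Opt  to  Opt → • ID
    ("s2", "c2"): [("s3", "c1")],   # shift ID:  Opt → • ID        to  Opt → ID •
    ("s1", "c3"): [("s3", "c1")],   # shift ID from state 1
}

# Spontaneous look-aheads found in step 2 of the algorithm.
lookaheads = {("s1", "c2"): {"$"}, ("s1", "c3"): {"ID"}}

# Work stack of (state, configuration, look-ahead); the top is the list's end.
stack = [("s1", "c3", "ID"), ("s1", "c2", "$")]

while stack:
    s, c, la = stack.pop()
    for target in LINKS.get((s, c), []):
        seen = lookaheads.setdefault(target, set())
        if la not in seen:               # only newly added look-aheads propagate
            seen.add(la)
            stack.append((*target, la))

for key in sorted(lookaheads):
    print(key, sorted(lookaheads[key]))
```

Pushing a triple only when a look-ahead is new is what guarantees termination: each (configuration, look-ahead) pair is pushed at most once.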
Building LALR(1) Parsers (6)
A number of LALR(1) parser generators use look-ahead propagation to compute the parser action table
LALR-Gen uses the propagation algorithm
YACC examines each state repeatedly
Calling Semantic Routines in Shift-Reduce Parsers (1)
Shift-reduce parsers
can normally handle larger classes of grammars than LL(1) parsers, which is a major reason for their popularity
are not predictive,
so we cannot always be sure which production is being recognized until its entire right-hand side has been matched
Semantic routines can therefore be invoked only after a production is recognized and reduced
Action symbols may appear only at the extreme right end of a right-hand side
Calling Semantic Routines in Shift-Reduce Parsers (2)
Two common tricks are known that allow more flexible placement of semantic routine calls
For example,
<stmt> → if <expr> then <stmts> else <stmts> end if
We need to call semantic routines
after the conditional expression, else, and end if are matched
Solution: create new non-terminals that generate λ
<stmt> → if <expr> <test cond>
         then <stmts> <process then part>
         else <stmts> end if
<test cond> → λ
<process then part> → λ
Calling Semantic Routines in Shift-Reduce Parsers (3)
If right-hand sides differ only in the semantic routines
that are to be called, the parser will be unable to determine which routines to invoke
The grammar becomes ambiguous: the parser cannot choose between the two λ-reductions. For example,
<stmt> → if <expr> <test cond1>
         then <stmts> <process then part>
         else <stmts> end if ;
<stmt> → if <expr> <test cond2>
         then <stmts> <process then part>
         else <stmts> end if ;
<test cond1> → λ
<test cond2> → λ
<process then part> → λ
Calling Semantic Routines in Shift-Reduce Parsers (4)
An alternative to the use of λ-generating non-terminals
is to break a production into a number of pieces,
with the breaks placed where semantic routines are required
<stmt> → <if head> <then part> <else part>
<if head> → if <expr>
<then part> → then <stmts>
<else part> → else <stmts> end if ;
This approach can make productions harder to read but has the advantage
that no λ-generating non-terminals are needed
Outline 6.0 Introduction
6.1 Shift-Reduce Parsers
6.2 LR Parsers
6.3 LR(1) Parsing
6.4 SLR(1)Parsing
6.5 LALR(1)
6.6 Calling Semantic Routines in Shift-Reduce Parsers
6.7 Using a Parser Generator (TA course)
6.8 Optimizing Parse Tables
6.9 Practical LR(1) Parsers
6.10 Properties of LR Parsing
6.11 LL(1) or LALR(1) , That is the question
6.12 Other Shift-Reduce Technique
Optimizing Parse Tables (1)
Step 1: Merge the Action table and the Go-to table
The action table holds S (shift), Ri (reduce by production i) or A (accept) for each state and look-ahead.
The go_to table holds the target state for each state and symbol.
Merging writes the go_to target next to each S, so S3 means "shift and go to state 3".

Complete table for G3 (one row per state; "-" = error entry; the column for non-terminal S is empty and omitted):

State   +    *    id   (    )    $    E    T    P
0       -    -    S5   S6   -    -    S1   S7   S4
1       S3   -    -    -    -    A    -    -    -
2       -    -    -    -    -    -    -    -    -
3       -    -    S5   S6   -    -    -    S11  S4
4       R5   R5   -    -    R5   R5   -    -    -
5       R6   R6   -    -    R6   R6   -    -    -
6       -    -    S5   S6   -    -    S12  S7   S4
7       R3   S8   -    -    R3   R3   -    -    -
8       -    -    S5   S6   -    -    -    -    S9
9       R4   R4   -    -    R4   R4   -    -    -
10      R7   R7   -    -    R7   R7   -    -    -
11      R2   S8   -    -    R2   R2   -    -    -
12      S3   -    -    -    S10  -    -    -    -
Optimizing Parse Tables (2)
Single Reduce State
A single reduce state is a state whose only action is one particular reduction.
In the complete table, states 4, 5, 9 and 10 are single reduce states: their only entries are R5, R6, R4 and R7, respectively.
Because such a state always reduces, can we simplify the table by representing it another way?
Optimizing Parse Tables (2)
Step 2: Eliminate all single reduce states.
Each shift into a single reduce state is replaced with a special marker, an L-prefix entry, that names the production to reduce.
Example: a shift to state 4 (S4) is replaced by the entry L5.
Since only one reduction is possible in such a state, the parser never needs to go to that state, so its column can be removed from the table.
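Step 2 can be sketched as a pass over the merged table. This is a minimal illustration under stated assumptions, not the slides' implementation: the Entry type, the entry kinds and both function names are invented here, the table is stored row-major with one row per state, and a single reduce state is detected simply by checking that every non-empty entry in its row is a reduction by the same production.

```c
#include <stddef.h>

/* Assumed encoding of a merged table entry (names are illustrative). */
enum { ERR, SHIFT, REDUCE, LREDUCE, ACCEPT };
typedef struct { int kind, arg; } Entry;   /* arg: target state or production */

/* Returns the production number if every non-empty entry of the row
 * reduces by one and the same production; otherwise returns -1. */
int single_reduce_production(const Entry *row, size_t nsym) {
    int prod = -1;
    for (size_t j = 0; j < nsym; j++) {
        if (row[j].kind == ERR) continue;
        if (row[j].kind != REDUCE) return -1;
        if (prod != -1 && row[j].arg != prod) return -1;
        prod = row[j].arg;
    }
    return prod;
}

/* Rewrites every shift into a single reduce state as an L entry naming
 * that state's production. Returns the number of entries rewritten. */
int eliminate_single_reduce_states(Entry *table, size_t nstate, size_t nsym) {
    int rewritten = 0;
    for (size_t i = 0; i < nstate * nsym; i++) {
        if (table[i].kind != SHIFT) continue;
        const Entry *target = table + (size_t)table[i].arg * nsym;
        int prod = single_reduce_production(target, nsym);
        if (prod != -1) {
            table[i].kind = LREDUCE;
            table[i].arg = prod;
            rewritten++;
        }
    }
    return rewritten;
}
```

After this pass the rows of the single reduce states are no longer referenced by any entry, so they can be dropped when the table is written out. Rewriting in place is safe because rows that contain shifts are never single reduce states, before or after the rewrite.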
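Optimizing Parse Tables (3)
Optimized table after removing states 4, 5, 9 and 10 (rows: symbol; columns: state; "." marks an empty entry)
     0   1   2   3   6   7   8   11  12
+    .   S3  .   .   .   R3  .   R2  S3
*    .   .   .   .   .   S8  .   S8  .
id   L6  .   .   L6  L6  .   L6  .   .
(    S6  .   .   S6  S6  .   S6  .   .
)    .   .   .   .   .   R3  .   R2  L7
$    .   A   .   .   .   R3  .   R2  .
S    .   .   .   .   .   .   .   .   .
E    S1  .   .   .   S12 .   .   .   .
T    S7  .   .   S11 S7  .   .   .   .
P    L5  .   .   L5  L5  .   L4  .   .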
Shift-Reduce Parsers
void shift_reduce_driver(void)
{
    /* Push the Start State, S0, onto an empty parse stack. */
    push(S0);
    while (TRUE) { /* forever */
        /* Let S be the top parse stack state;
         * let T be the current input token. */
        switch (action[S][T]) {
        case ERROR:
            announce_syntax_error();
            break;
        case ACCEPT:
            /* The input has been correctly parsed. */
            clean_up_and_finish();
            return;
        case SHIFT:
            push(go_to[S][T]);
            scanner(&T); /* Get next token. */
            break;
        case REDUCEi:
            /* Assume i-th production is X → Y1 ... Ym.
             * Remove states corresponding to the RHS of the production. */
            pop(m);
            /* S' is the new stack top. */
            push(go_to[S'][X]);
            break;
        case Li:
            /* Assume i-th production is X → Y1 ... Ym.
             * T is Ym; its single reduce state was never pushed,
             * so consume T and remove only the states for Y1 ... Ym-1. */
            scanner(&T);
            pop(m-1);
            /* S' is the new stack top. If the entry for X in S' is itself
             * an L entry, apply it the same way instead of pushing. */
            push(go_to[S'][X]);
            break;
        }
    }
}
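Example(1)
for grammar G3:
1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )
Optimized table (rows: symbol; columns: state; "." marks an empty entry)
     0   1   2   3   6   7   8   11  12
+    .   S3  .   .   .   R3  .   R2  S3
*    .   .   .   .   .   S8  .   S8  .
id   L6  .   .   L6  L6  .   L6  .   .
(    S6  .   .   S6  S6  .   S6  .   .
)    .   .   .   .   .   R3  .   R2  L7
$    .   A   .   .   .   R3  .   R2  .
S    .   .   .   .   .   .   .   .   .
E    S1  .   .   .   S12 .   .   .   .
T    S7  .   .   S11 S7  .   .   .   .
P    L5  .   .   L5  L5  .   L4  .   .
Input: (id+id)$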
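The grammar and optimized table above can be turned into a small runnable recognizer. The following is a sketch under stated assumptions rather than the course's driver: the symbol numbering, the Entry encoding, and the names set, build_table and parse are all invented for illustration. An L entry reached from an input token consumes that token; after any reduction the entry for the left-hand side is looked up in the uncovered state, and it may itself be another L entry.

```c
#include <stddef.h>

/* Column indices (an assumption of this sketch): terminals, then nonterminals. */
enum { PLUS, STAR, ID, LPAREN, RPAREN, END, NT_E, NT_T, NT_P, NSYM };
/* Entry kinds of the merged table; ERR is 0 so untouched cells are errors. */
enum { ERR, SHIFT, REDUCE, LREDUCE, ACCEPT };
typedef struct { int kind, arg; } Entry;

enum { NSTATE = 13, STACK_MAX = 256 };
static Entry table[NSTATE][NSYM];

/* Productions 2..7 of G3: left-hand side and right-hand-side length. */
static const int lhs[8]  = { -1, -1, NT_E, NT_E, NT_T, NT_T, NT_P, NT_P };
static const int rlen[8] = {  0,  0,    3,    1,    3,    1,    1,    3 };

static void set(int state, int sym, int kind, int arg) {
    table[state][sym].kind = kind;
    table[state][sym].arg = arg;
}

static void build_table(void) {
    static const int starts[] = { 0, 3, 6, 8 };   /* states that expect a P */
    for (size_t i = 0; i < 4; i++) {
        set(starts[i], ID, LREDUCE, 6);
        set(starts[i], LPAREN, SHIFT, 6);
        set(starts[i], NT_P, LREDUCE, starts[i] == 8 ? 4 : 5);
    }
    set(1, PLUS, SHIFT, 3);    set(12, PLUS, SHIFT, 3);
    set(7, STAR, SHIFT, 8);    set(11, STAR, SHIFT, 8);
    set(7, PLUS, REDUCE, 3);   set(7, RPAREN, REDUCE, 3);   set(7, END, REDUCE, 3);
    set(11, PLUS, REDUCE, 2);  set(11, RPAREN, REDUCE, 2);  set(11, END, REDUCE, 2);
    set(12, RPAREN, LREDUCE, 7);
    set(1, END, ACCEPT, 0);
    set(0, NT_E, SHIFT, 1);    set(6, NT_E, SHIFT, 12);
    set(0, NT_T, SHIFT, 7);    set(3, NT_T, SHIFT, 11);     set(6, NT_T, SHIFT, 7);
}

/* Returns 1 if tokens (terminated by END) form a sentence of G3, else 0. */
int parse(const int *tokens) {
    int stack[STACK_MAX], top = 0, pos = 0;
    build_table();
    stack[0] = 0;                        /* start state */
    for (;;) {
        Entry e = table[stack[top]][tokens[pos]];
        int from_input = 1;              /* e was selected by an unread token */
        for (;;) {
            if (e.kind == ERR) return 0;
            if (e.kind == ACCEPT) return 1;
            /* SHIFT and L entries consume the token; REDUCE only looks at it. */
            if (from_input && e.kind != REDUCE) pos++;
            if (e.kind == SHIFT) {
                if (top + 1 >= STACK_MAX) return 0;
                stack[++top] = e.arg;
                break;
            }
            /* REDUCE pops the whole right-hand side; an L entry pops one
             * state fewer because the last symbol's state was never pushed. */
            top -= rlen[e.arg] - (e.kind == LREDUCE);
            from_input = 0;
            e = table[stack[top]][lhs[e.arg]];   /* entry for the LHS */
        }
    }
}
```

Calling parse with the token array { LPAREN, ID, PLUS, ID, RPAREN, END } walks through the same table entries as the step-by-step example that follows (S6, L6, L5, S7, R3, S12, S3, ...), ending at the A entry in state 1.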
Example(2)
Initial: (id+id)$
step1: stack 0, remaining input (id+id)$, action S6 (shift "(" and push state 6)
Tree (node[children]): (
Example(3)
Initial: (id+id)$
step2: stack 0 6, remaining input id+id)$, action L6 (consume id and reduce by production 6, P → id, without pushing state 5)
Tree (node[children]): ( id
Example(4)
Initial: (id+id)$
step3: stack 0 6, remaining input +id)$, action L5 (entry for P in state 6: reduce by production 5, T → P, without pushing state 4)
Tree (node[children]): ( P[id]
Example(5)
Initial: (id+id)$
step4: stack 0 6, remaining input +id)$, action S7 (entry for T in state 6: push state 7)
Tree (node[children]): ( T[P[id]]
Example(6)
Initial: (id+id)$
step5: stack 0 6 7, remaining input +id)$, action R3 on lookahead + (reduce by production 3, E → T)
Tree (node[children]): ( T[P[id]]
Example(7)
Initial: (id+id)$
step6: stack 0 6 12, remaining input +id)$, action S3 (shift + and push state 3)
Tree (node[children]): ( E[T[P[id]]] +
Example(8)
Initial: (id+id)$
step7: stack 0 6 12 3, remaining input id)$, action L6 (consume id and reduce by production 6, P → id, without pushing state 5)
Tree (node[children]): ( E[T[P[id]]] + id
Example(9)
Initial: (id+id)$
step8: stack 0 6 12 3, remaining input )$, action L5 (entry for P in state 3: reduce by production 5, T → P, without pushing state 4)
Tree (node[children]): ( E[T[P[id]]] + P[id]
Example(10)
Initial: (id+id)$
step9: stack 0 6 12 3, remaining input )$, action S11 (entry for T in state 3: push state 11)
Tree (node[children]): ( E[T[P[id]]] + T[P[id]]
Example(11)
Initial: (id+id)$
step10: stack 0 6 12 3 11, remaining input )$, action R2 on lookahead ) (reduce by production 2, E → E + T)
Tree (node[children]): ( E[T[P[id]]] + T[P[id]]
Example(12)
Initial: (id+id)$
step11: stack 0 6 12, remaining input )$, action L7 (consume ")" and reduce by production 7, P → ( E ), without pushing state 10)
Tree (node[children]): ( E[E[T[P[id]]] + T[P[id]]] )
Example(13)
Initial: (id+id)$
step12: stack 0, remaining input $, action L5 (entry for P in state 0: reduce by production 5, T → P, without pushing state 4)
Tree (node[children]): P[( E[E[T[P[id]]] + T[P[id]]] )]
Example(14)
Initial: (id+id)$
step13: stack 0, remaining input $, action S7 (entry for T in state 0: push state 7)
Tree (node[children]): T[P[( E[E[T[P[id]]] + T[P[id]]] )]]
Example(15)
Initial: (id+id)$
step14: stack 0 7, remaining input $, action R3 on lookahead $ (reduce by production 3, E → T)
Tree (node[children]): T[P[( E[E[T[P[id]]] + T[P[id]]] )]]
Example(16)
Initial: (id+id)$
step15: stack 0 1, remaining input $, action A (accept)
Tree (node[children]): E[T[P[( E[E[T[P[id]]] + T[P[id]]] )]]]
LR(1) Parsers
Very powerful: most programming languages can be recognized by them
But the LR(1) machine contains so many states that the GoTo and Action tables are prohibitively large.
Alternatives to LR(1) Parsers
LR(0) Parsers
Very compact tables
But with no lookahead, not very powerful
SLR(1) – Simple LR(1) parsers
Add lookahead to LR(0) tables
Almost as powerful as LR(1) but much smaller
LALR(1) – look-ahead LR(1) parsers
Start with LR(1) states and merge states differing only in the look-ahead
Smaller and slightly weaker than LR(1)
LL(1) or LALR(1), That is the question (1)
--Modified by http://www.csie.ntu.edu.tw/~compiler/
Grammar classes, each containing the next: LR(1) grammar ⊃ LALR(1) grammar ⊃ SLR(1) grammar ⊃ LR(0) grammar
                  LR(0)      SLR(1)     LALR(1)    LR(1)
state number      n          n          n          N
action table †    n × 1      n × |VT|   n × |VT|   N × |VT|
goto table †      n × |V|    n × |V|    n × |V|    N × |V|
† before compression
power: increases from LR(0) to LR(1)
LALR(1) is the most commonly used bottom-up parsing method
LL(1) or LALR(1), That is the question (2)
--Modified by http://www.csie.ntu.edu.tw/~compiler/
Two most popular parsing methods
simplicity: LL(1) is simpler
generality: almost all LL(1) grammars are LALR(1); a grammar in LALR(1) form is more readable
placement of action symbols: LL(1): anywhere in rhs | LALR(1): essentially only at the extreme right end of rhs
error repair: LL(1): simpler, because parse stack has predicted information | LALR(1): parse stack just has matched information
table sizes: LL(1): |VN| × |VT| | LALR(1): |states| × |V|, where |states| may be exponential
parsing speed: comparable
semantic stack: easier manipulation
Shift-reduce parsers differ in their use of Follow information:
LR(0) parsers never consult the lookahead at all.
SLR(1) parsers use the Follow sets as previously
constructed.
LR(1) parsers use context to split the Follow sets
into subsets for different parsing paths (huge,
inefficient parsers).
LALR(1) parsers: like LR(1) but coarser subsets are
used (achieves most of the benefit, but much smaller
and faster).
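The SLR(1) rule can be sketched for grammar G3: reduce by a production X → α only when the lookahead is in Follow(X). The Follow sets below were worked out by hand for G3 for this illustration (Follow(E) = {+, ), $}, Follow(T) = Follow(P) = {+, *, ), $}); the enum values, the bitmask encoding and the function name are assumptions of the sketch.

```c
/* Terminals and nonterminals of G3 (numbering is illustrative). */
enum Terminal { PLUS, STAR, ID, LPAREN, RPAREN, END };
enum Nonterminal { NT_E, NT_T, NT_P };

#define BIT(t) (1u << (t))

/* Follow sets of G3 as bitmasks over the terminals. */
static const unsigned follow[] = {
    [NT_E] = BIT(PLUS) | BIT(RPAREN) | BIT(END),
    [NT_T] = BIT(PLUS) | BIT(STAR) | BIT(RPAREN) | BIT(END),
    [NT_P] = BIT(PLUS) | BIT(STAR) | BIT(RPAREN) | BIT(END),
};

/* SLR(1) test: a completed item for lhs may be reduced only if the
 * lookahead token is in Follow(lhs). */
int slr_may_reduce(enum Nonterminal lhs, enum Terminal lookahead) {
    return (follow[lhs] & BIT(lookahead)) != 0;
}
```

This matches the G3 table used earlier: state 7 reduces by E → T (R3) on +, ) and $, but shifts on *, because * is not in Follow(E). An LR(0) parser would have no such test, and an LR(1) or LALR(1) parser would replace Follow(lhs) with a per-state subset of it.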
LL(1) vs LALR(1)
LL(1) and LALR(1) are the dominant types
Although variants are used (recursive descent and SLR(1))
LL(1) is simpler
LALR(1) is more general
Most languages can be represented by an LL(1) or LALR(1) grammar, but it is easier to write the LALR(1) grammar
Actions can be easier to specify in LL(1)
Error repair is easier to do in LL(1)
LL(1) tables will be about half the size of LALR(1) tables
A Comparison of Predictive Parsers with Shift-Reduce Parsers
Both parsers read the input from left to right and maintain a stack of grammar symbols, but their parsing operations are decidedly different, as shown in the following table:
Predictive Parser | Shift-Reduce Parser
Top-down (LL) parser | Bottom-up (LR) parser
Stack predicts what is to come. | Stack shows what has been seen so far.
The stack initially contains the start symbol of the grammar. | The stack is initially empty.
The stack is empty when the accept state is reached. | The stack contains the start symbol of the grammar when the accept state is reached.
Input tokens are popped off the stack. | Input tokens are pushed on the stack.
Left sides of productions are popped off the stack. | Right sides of productions are popped off the stack.
Right sides of productions are pushed on the stack. | Left sides of productions are pushed on the stack.
Properties of LR(1) Parsers
A correct rightmost parse is guaranteed
Since LR-style parsers accept only viable prefixes,
syntax errors are detected as soon as the parser
attempts to shift a token that isn't part of a viable
prefix
Prompt error reporting
They are linear in operation
All LR(1) grammars are unambiguous
Will yacc generate a parser for an
ambiguous grammar?