Upload
eelco-visser
View
47
Download
0
Embed Size (px)
Citation preview
Challenge the future
Delft University of Technology
Course IN4303 Compiler Construction
Eduardo Souza, Guido Wachsmuth, Eelco Visser
LR ParsingTraditional Parsing Algorithms
lessons learned
LR Parsing
Recap: Traditional Parsing Algorithms
How can we parse context-free languages effectively?
• predictive parsing
Which grammar classes are supported by these algorithms?
• LL(k) grammars, LL(k) languages
How can we generate compiler tools from that?
• implement automaton
• generate parse tables
What are other techniques for implementing top-down parsers?
• Parser Combinators
• PEGs
• ALL(*)
2
today’s lecture
Lexical Analysis
Overview
3
today’s lecture
Lexical Analysis
Overview
efficient parsing algorithms
• LR parsing
• LR parse table generation
• SLR & LALR parse tables
• Generalized LR parsing
• Scannerless Generalized LR parsing
3
LR Parsing
LR Parsing
I
4
idea
LR Parsing
LR parsing
problems with LL parsing
• predicting right rule
• left recursion
LR parsing
• see whole right-hand side of a rule
• look ahead
• shift or reduce
5
example
LR Parsing
LR parsing
6
* 3 $
input
$
stack
7* 37 +
S → E $ E → E * E E → E + E E → Num
grammar
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
$ E + E * E $
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
$ E + E
$ E + E * E $
$
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
$ E + E
$ E + E * E $
$
$ E $
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
$ E + E
$ E + E * E $
$
$ E $
S → E $ E → E * E E → E + E E → Num
example
LR Parsing
LR parsing
6
* 3 $$ 7* 37 +
* 3 $7+$ E 3*
* 3 $7+$ E * E
* 3 $7+$ E
$ E + E 3* $
$ E + E
$ E + E * E $
$
$ E $
$ S
S → E $ E → E * E E → E + E E → Num
LR Parsing
Grammar classes
7
context-free grammars
LL(k)
LL(1)
LL(0)
LR Parsing
Grammar classes
7
context-free grammars
LR(0)
LL(k)
LL(1)
LL(0)
LR Parsing
Grammar classes
7
context-free grammars
LR(1)
LR(0)
LL(k)
LL(1)
LL(0)
LR Parsing
Grammar classes
7
context-free grammars
LR(k)
LR(1)
LR(0)
LL(k)
LL(1)
LL(0)
LR Parsing
Grammar classes
7
context-free grammars
LR(k)
LR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
LR Parsing
Grammar classes
7
context-free grammars
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
LR Parsing
LR Parse Tables
II
8
parse table
LR Parsing 9
rows
• states of a DFA
columns
• topmost stack symbol
• Σ, N
entries
• reduce, rule number
• shift, goto state
• goto state
• accept state
LR parsing
T1 ... N1 ...
1 s 3
2 g 5
3 r 1
4 r 2 a
5
6 g 1
7 s 1
8
...
items, closure & goto
LR Parsing
LR(0) parse tables
10
S → x S → ( L ) L → S L → L , S
S’ → . S $
items, closure & goto
LR Parsing
LR(0) parse tables
10
S → x S → ( L ) L → S L → L , S
S’ → . S $
items, closure & goto
LR Parsing
LR(0) parse tables
10
item
S → x S → ( L ) L → S L → L , S
closure
• for every item A → α . X β
• for every rule X → γ
• add item X → . γ
S’ → . S $
items, closure & goto
LR Parsing
LR(0) parse tables
10
item
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
closure
• for every item A → α . X β
• for every rule X → γ
• add item X → . γ
items, closure & goto
LR Parsing
LR(0) parse tables
10
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L )
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
)
S → ( L ) .
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
)
S → ( L ) .
,L → L , . S
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
L → L , . S S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
)
S → ( L ) .
,
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
L → L , . S S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
)
S → ( L ) .
,
x
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
L → L , . S S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
)
S → ( L ) .
,
x
(
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
L → L , . S S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
)
S → ( L ) .
,
x
(
SL → L , S .
S → x S → ( L ) L → S L → L , S
S’ → . S $ S → . x S → . ( L )
L → L , . S S → . x S → . ( L )
items, closure & goto
LR Parsing
LR(0) parse tables
10
S’ → S . $
S
x S → x .
( S → ( . L ) L → . S L → . L , S S → . x S → . ( L )
x
(
S
L → S .
L S → ( L . ) L → L . , S
)
S → ( L ) .
,
x
(
SL → L , S .
S → x S → ( L ) L → S L → L , S
12
3
4 6 7
5
8
9
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x,x(
1
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x,x
13
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x,
132
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x,
13
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x,
136
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x,
13
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x,
135
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)x
1358
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)
13582
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)
1358
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)
13589
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)
13
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$)
135
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$
1357
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$
1
( ) x , $ S L
1 s 3 s 2 g 4
2 r 1 r 1 r 1 r 1 r 1
3 s 3 s 2 g 6 g 5
4 a
5 s 7 s 8
6 r 3 r 3 r 3 r 3 r 3
7 r 2 r 2 r 2 r 2 r 2
8 s 3 s 2 g 9
9 r 4 r 4 r 4 r 4 r 4
result
LR Parsing 11
LR(0) parse tables
S → x S → ( L ) L → S L → L , S
$
14
LR Parsing
Conflict Resolution
III
12
E → T + . E E → . T + E E → . T T → . x
S → . E $ E → . T + E E → . T T → . x
shift-reduce conflicts
LR Parsing
SLR parse tables
13
T → x .
x
E S → E . $
T
T +
E
E → T . + E E → T .
E → T + E .
E → T + E E → T T → xx
E → T + . E E → . T + E E → . T T → . x
S → . E $ E → . T + E E → . T T → . x
shift-reduce conflicts
LR Parsing
SLR parse tables
13
T → x .
x
E S → E . $
T
T +
E
E → T . + E E → T .
E → T + E .
E → T + E E → T T → x
1
2
3
5 4 6x
E → T + . E E → . T + E E → . T T → . x
S → . E $ E → . T + E E → . T T → . x
shift-reduce conflicts
LR Parsing
SLR parse tables
13
T → x .
x
E S → E . $
T
T +
E
E → T . + E E → T .
E → T + E .
E → T + E E → T T → x
1
2
3
5 4 6x
x + $ E T1 s 5 g 2 g 32 a3 r 2 ? r 24 s 5 g 6 g 35 r 3 r 3 r 36 r 1 r 1 r 1
E → T + . E E → . T + E E → . T T → . x
S → . E $ E → . T + E E → . T T → . x
shift-reduce conflicts
LR Parsing
SLR parse tables
13
T → x .
x
E S → E . $
T
T +
E
E → T . + E E → T .
E → T + E .
E → T + E E → T T → x
1
2
3
5 4 6x
x + $ E T1 s 5 g 2 g 32 a3 s 4 r 24 s 5 g 6 g 35 r 3 r 36 r 1
Reduce a production S → …
on symbols k ∈ Σ, k ∈ Follow(S)
look-ahead
LR Parsing
LR(1) parse tables
14
E → T + E E → T T → x
S → . E $ E → . T + E E → . T T → . x
? $ $ + $
E S → E . $ ?
T E → T . + E E → T .
$ $
T → x .
x
+ $
T +
x
E → T + . E E → . T + E E → . T T → . x
$ $ $ + $ E E → T + E . $
closure
• for every item A → α . X β, z
• for every rule X → γ
• for every w ∈ First(βz)
• add item X → . γ, w
look-ahead
LR Parsing
LR(1) parse tables
14
E → T + E E → T T → x
S → . E $ E → . T + E E → . T T → . x
? $ $ + $
E S → E . $ ?
T E → T . + E E → T .
$ $
T → x .
x
+ $
T +
x
E → T + . E E → . T + E E → . T T → . x
$ $ $ + $ E E → T + E . $
x + $ E T1 s 5 g 2 g 32 a3 s 4 r 24 s 5 g 6 g 35 r 3 r 36 r 1
closure
• for every item A → α . X β, z
• for every rule X → γ
• for every w ∈ First(βz)
• add item X → . γ, w
state space reduction
LR Parsing
LALR(1) parse tables
unify states
• with same items
• and same outgoing transitions
• but different look-ahead sets
might introduce new conflicts
15
state space reduction
LR Parsing
LALR(1) parse tables
16
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
state space reduction
LR Parsing
LALR(1) parse tables
17
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
S → a . E c S → a . F d E → . e F → . e
$ $ c d
a
state space reduction
LR Parsing
LALR(1) parse tables
18
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
S → b . F c S → b . E d E → . e F → . e
$ $ d c
a
b
S → a . E c S → a . F d E → . e F → . e
$ $ c d
state space reduction
LR Parsing
LALR(1) parse tables
19
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
E → e . F → e .
c d
a
b e
S → b . F c S → b . E d E → . e F → . e
$ $ d c
S → a . E c S → a . F d E → . e F → . e
$ $ c d
state space reduction
LR Parsing
LALR(1) parse tables
20
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
E → e . F → e .
c d
a
b e
eE → e . F → e .
d c
S → b . F c S → b . E d E → . e F → . e
$ $ d c
S → a . E c S → a . F d E → . e F → . e
$ $ c d
state space reduction
LR Parsing
LALR(1) parse tables
21
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
E → e . F → e .
c d
a
b e
eE → e . F → e .
d c
S → b . F c S → b . E d E → . e F → . e
$ $ d c
S → a . E c S → a . F d E → . e F → . e
$ $ c d
state space reduction
LR Parsing
LALR(1) parse tables
22
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
S → a . E c S → a . F d E → . e F → . e
$ $ c d
S → b . F c S → b . E d E → . e F → . e
$ $ d c
E → e . F → e .
{c, d} {c, d}
a
b e
e
state space reduction
LR Parsing
LALR(1) parse tables
23
S’ → S $ S → a E c S → a F d S → b F c S → b E d E → e F → e
S' → . S $ S → . a E c S → . a F d S → . b F c S → . b E d
? $ $ $ $
E → e . F → e .
{c, d} {c, d}
a
b e
e Reduce/Reduce conflict!
S → a . E c S → a . F d E → . e F → . e
$ $ c d
S → b . F c S → b . E d E → . e F → . e
$ $ d c
LR Parsing
Generalized-LR Parsing
IV
24
Lexical Analysis
Generalized Parsing
• Parse all interpretations of the input, therefore it can handle ambiguous grammars.
• Parsers split whenever finding an ambiguous interpretation and act in (pseudo) parallel.
• Multiple parsers can join whenever they finish parsing an ambiguous fragment of the input.
• Some parsers may "die", if the ambiguity was caused by a lack of lookahead.
25
Lexical Analysis
Generalized LR
• Multiple parsers are synchronized on shift actions.
• Each parser has its own stack, and as they share states, the overall structure becomes a graph (GSS).
• If two parsers have the same state on top of their stack, they are joined into a single parser.
• Reduce actions affect all possible paths from the top of the stack.
26
Lexical Analysis
Generalized LR
27
S → E $ E → E + E E → E * E E → a
SLR table
Lexical Analysis
Generalized LR
28
S → E $ E → E + E E → E * E E → a
S → . E $ E → . E + E E → . E * E E → . a
0
SLR table
Lexical Analysis
Generalized LR
29
S → E $ E → E + E E → E * E E → a
S → . E $ E → . E + E E → . E * E E → . a
S → E . $ E → E . + E E → E . * E
E → a .
E
a
10
2
SLR table
Lexical Analysis
Generalized LR
30
S → E $ E → E + E E → E * E E → a
S → . E $ E → . E + E E → . E * E E → . a
S → E . $ E → E . + E E → E . * E
E → a .
E → E + . E E → . E + E E → . E * E E → . a
E → E * . E E → . E + E E → . E * E E → . a
S → E $ .E$
a
*
+
10 3
2
4
5
SLR table
Lexical Analysis
Generalized LR
31
S → E $ E → E + E E → E * E E → a
S → . E $ E → . E + E E → . E * E E → . a
S → E . $ E → E . + E E → E . * E
E → a .
E → E + . E E → . E + E E → . E * E E → . a
E → E * . E E → . E + E E → . E * E E → . a
E → E * E . E → E . + E E → E . * E
S → E $ .E$
a
*
+ E
a
10 3
2
4
5
6
SLR table
Lexical Analysis
Generalized LR
32
S → E $ E → E + E E → E * E E → a
S → . E $ E → . E + E E → . E * E E → . a
S → E . $ E → E . + E E → E . * E
E → a .
E → E + . E E → . E + E E → . E * E E → . a
E → E * . E E → . E + E E → . E * E E → . a
E → E + E . E → E . + E E → E . * E
E → E * E . E → E . + E E → E . * E
S → E $ .E$
a
*
+ E
E
a
a
10 3
2
4
5
7
6
SLR table
Lexical Analysis
Generalized LR
33
S → E $ E → E + E E → E * E E → a
S → . E $ E → . E + E E → . E * E E → . a
S → E . $ E → E . + E E → E . * E
E → a .
E → E + . E E → . E + E E → . E * E E → . a
E → E * . E E → . E + E E → . E * E E → . a
E → E + E . E → E . + E E → E . * E
E → E * E . E → E . + E E → E . * E
S → E $ .E$
a
*
+ E
*
E
+a
a
10 3
2
4
5
7
6
SLR table
Lexical Analysis
Generalized LR
34
S → E $ E → E + E E → E * E E → a
S → . E $ E → . E + E E → . E * E E → . a
S → E . $ E → E . + E E → E . * E
E → a .
E → E + . E E → . E + E E → . E * E E → . a
E → E * . E E → . E + E E → . E * E E → . a
E → E + E . E → E . + E E → E . * E
E → E * E . E → E . + E E → E . * E
S → E $ .E$
a
*
+ E
*
*
+
E
+a
a
10 3
2
4
5
7
6
SLR table
Lexical Analysis
Generalized LR
35
Nonterminal
Nullable First Follow
S
E
StateAction Goto
a + * $ S E
0
1
2
3
4
5
6
7
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
SLR table
Lexical Analysis
Generalized LR
36
Nonterminal
Nullable First Follow
S no a -
E no a +, *, $
StateAction Goto
a + * $ S E
0
1
2
3
4
5
6
7
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
SLR table
Lexical Analysis
Generalized LR
37
Nonterminal
Nullable First Follow
S no a -
E no a +, *, $
StateAction Goto
a + * $ S E
0 s2 1
1 s4 s5 s3
2 r3 r3 r3
3 acc
4 s2 7
5 s2 6
6 s4/r2 s5/r2 r2
7 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
SLR table
Lexical Analysis
Generalized LR
38
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
a + a * a
0
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
Lexical Analysis
Generalized LR
39
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
+ a * a
0
2a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
synchronize on shifts
0
2a
Lexical Analysis
Generalized LR
40
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
+ a * a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
Lexical Analysis
Generalized LR
41
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
+ a * a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
0
2a
1a : E
0
2a
1a : E 4+
Lexical Analysis
Generalized LR
42
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
a * a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
synchronize
Lexical Analysis
Generalized LR
43
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
a * a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+
Lexical Analysis
Generalized LR
44
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
* a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+2a
synchronize
01a : E 4
+2a
Lexical Analysis
Generalized LR
45
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
* a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+2a
a : E 7
Lexical Analysis
Generalized LR
46
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
* a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
Lexical Analysis
Generalized LR
47
Parsing* a
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+2a
a : E 71a + a : E
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
01a : E 4
+2a
a : E 71a + a : E
5
**
Lexical Analysis
Generalized LR
48
Parsinga
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
synchronize
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
Lexical Analysis
Generalized LR
49
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+ a : E 7
1a + a : E
5*
*
a
01a : E 4
+ a : E 7
1a + a : E
5*
*
2a
Lexical Analysis
Generalized LR
50
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
synchronize
01a : E 4
+ a : E 7
1a + a : E
5*
*
2a
Lexical Analysis
Generalized LR
51
Parsing
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
01a : E 4
+ a : E 7
1a + a : E
5*
*
6a : E
Lexical Analysis
Generalized LR
52
Parsing
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
01a : E 4
+ a : E 7
1a + a : E
5*
*
6a : E
Lexical Analysis
Generalized LR
53
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+ a : E 7
1a + a : E
5*
*
6a : E
7a * a : E
Lexical Analysis
Generalized LR
54
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+ a : E 7
1a + a : E
5*
*
6a : E
7a * a : E
Lexical Analysis
Generalized LR
55
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+ a : E 7
1a + a : E
5*
*
6a : E
7a * a : E
1[a + a] * a : E
Lexical Analysis
Generalized LR
56
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
01a : E 4
+ a : E 7
1a + a : E
5*
*
6a : E
7a * a : E
1[a + a] * a : E
Lexical Analysis
Generalized LR
57
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
[a + a] * a : Eor
a + [a * a] : E
01a : E 4
+ a : E7
1a + a : E
5*
*
6a : E
7
a * a : E
1
Lexical Analysis
Generalized LR
58
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
[a + a] * a : Eor
a + [a * a] : E
01a : E 4
+ a : E7
1a + a : E
5*
*
6a : E
7
a * a : E
1 3$
Lexical Analysis
Generalized LR
59
Parsing
State Action Gotoa + * $ S E
0 s2 11 s4 s5 s32 r3 r3 r33 acc4 s2 75 s2 66 s4/r2 s5/r2 r27 s4/r1 s5/r1 r1
(0) S → E $ (1) E → E + E (2) E → E * E (3) E → a
synchronize
accept with trees on the link to initial state
LR Parsing
Scannerless Generalized-LR Parsing
V
60
Lexical Analysis
Scannerless Generalized LR
• Integrates scanning + parsing into a single GLR algorithm.
• Normalization separates lexical and context-free symbols.
• Crucial for avoiding conflicts when composing languages.
• Introduces lexical ambiguities. Solution:
‣Follow Restrictions (implement longest match).
‣Reject Rules (implement reserved keywords).
61
Lexical Analysis
Scannerless Generalized LR
62
Normalization
context-free syntax
Exp.Add = <<Exp> + <Exp>> {left} Exp.Inc = <<Exp>++> Exp.ID = ID
lexical syntax
ID = [a-zA-Z] IDRest* IDRest = [a-zA-Z0-9] LAYOUT = [\ \t\n\r] ID = "let" {reject}
context-free syntax
Exp.Add = <<Exp> + <Exp>> {left} Exp.Inc = <<Exp>++> Exp.ID = ID
lexical syntax
ID = [a-zA-Z] IDRest* IDRest = [a-zA-Z0-9] LAYOUT = [\ \t\n\r] ID = "let" {reject}
Lexical Analysis
Scannerless Generalized LR
63
Normalization
Exp-CF.Add = Exp-CF LAYOUT?-CF "+" LAYOUT?-CF Exp-CF {left} Exp-CF.Inc = Exp-CF LAYOUT?-CF "++" Exp-CF.ID = ID-CF ID-LEX = [a-zA-Z] IDRest*-LEX IDRest*-LEX = [a-zA-Z0-9] ID-LEX = "let" {reject} LAYOUT-CF = LAYOUT-LEX
Separate lexical from context-free symbols and separate symbols in context-free syntax by optional layout.
Lexical Analysis
Scannerless Generalized LR
64
Normalization
LAYOUT-CF = LAYOUT-LEX IDRest-CF = IDRest-LEX IDRest*-CF = IDRest*-LEX ID-CF = ID-LEX
context-free syntax
Exp.Add = <<Exp> + <Exp>> {left} Exp.Inc = <<Exp>++> Exp.ID = ID
lexical syntax
ID = [a-zA-Z] IDRest* IDRest = [a-zA-Z0-9] LAYOUT = [\ \t\n\r] ID = "let" {reject}
Create injections from lexical to context-free symbols
Lexical Analysis
Scannerless Generalized LR
65
Normalization
"+" = [\43] "++" = [\43] [\43] "let" = [\108] [\101] [\116] ID-LEX = [\65-\90\97-\122] IDRest*-LEX IDRest*-LEX = [\48-\57\65-\90\97-\122] LAYOUT-LEX = [\9-\10\13\32]
context-free syntax
Exp.Add = <<Exp> + <Exp>> {left} Exp.Inc = <<Exp>++> Exp.ID = ID
lexical syntax
ID = [a-zA-Z] IDRest* IDRest = [a-zA-Z0-9] LAYOUT = [\ \t\n\r] ID = "let" {reject}
Normalize literals symbols and character classes.
Lexical Analysis
Scannerless Generalized LR
66
Normalization
LAYOUT?-CF = LAYOUT-CF LAYOUT?-CF = IDRest+-LEX = IDRest-LEX IDRest+-LEX = IDRest+-LEX IDRest-LEX IDRest*-LEX = IDRest*-LEX = IDRest+-LEX IDRest+-CF = IDRest+-LEX
context-free syntax
Exp.Add = <<Exp> + <Exp>> {left} Exp.Inc = <<Exp>++> Exp.ID = ID
lexical syntax
ID = [a-zA-Z] IDRest* IDRest = [a-zA-Z0-9] LAYOUT = [\ \t\n\r] ID = "let" {reject}
Normalize regular expressions.
Lexical Analysis
Scannerless Generalized LR
67
Normalization
context-free start-symbols
Exp
<START> = LAYOUT?-CF Exp-CF LAYOUT?-CF <Start> = <START> [\256]
Define extra rules for the start symbols.
Lexical Analysis
Scannerless Generalized LR
68
Parse Table Generation
• Generation is based on SLR(1) item-sets.
• Uses character classes instead of tokens.
• Follow sets and goto actions are calculated for productions instead of non-terminals.
Lexical Analysis
Scannerless Generalized LR
69
Lexical Disambiguation
• A reject rule is of the form: ID = "let" {reject}
• When SGLR does a reduction with a reject rule, it marks the link as rejected.
• Further action on a stack is forbidden whenever all links to it are rejected.
• Follow restrictions are implemented as filters on the follow sets of productions.
Lexical Analysis
Generalized LR
70
• Priority rules: applied at parse table generation.
E → E * . E E → . E + E E → . E * E E → . a
multiplication has higher priority than addition!
Context-free Disambiguation
Lexical Analysis
Generalized LR
71
• Priority rules: applied at parse table generation.
multiplication is left associative!
E → E * . E E → . E + E E → . E * E E → . a
Context-free Disambiguation
Lexical Analysis
Generalized LR
72
• Priority rules: applied at parse table generation.
• Disambiguation filters: applied after parsing (prefer and avoid).
multiplication is left associative!
E → E * . E E → . E + E E → . E * E E → . a
Context-free Disambiguation
LR Parsing
Summary
VI
73
lessons learned
LR Parsing
Summary
74
lessons learned
LR Parsing
Summary
How can we generate LR parse tables?
• items, closure, goto
74
lessons learned
LR Parsing
Summary
How can we generate LR parse tables?
• items, closure, goto
How can we improve LR(0) parse table generation?
• SLR: consider FOLLOW sets to avoid shift-reduce conflicts
• LR(1): consider look-ahead in states
• LALR(1): unify LR(1) states to reduce state space
74
lessons learned
LR Parsing
Summary
How can we generate LR parse tables?
• items, closure, goto
How can we improve LR(0) parse table generation?
• SLR: consider FOLLOW sets to avoid shift-reduce conflicts
• LR(1): consider look-ahead in states
• LALR(1): unify LR(1) states to reduce state space
How can we handle conflicts in the parse table?
• generalized parsing - supports all class of context free grammars
• scannerless generalized LR - allow for proper language composition.
74
learn more
LR Parsing
Literature
75
learn more
LR Parsing
Literature
LR parsing
Andrew W. Appel, Jens Palsberg: Modern Compiler Implementation in Java, 2nd edition. 2002
Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman, Monica S. Lam: Compilers: Principles, Techniques, and Tools, 2nd edition. 2006
75
learn more
LR Parsing
Literature
LR parsing
Andrew W. Appel, Jens Palsberg: Modern Compiler Implementation in Java, 2nd edition. 2002
Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman, Monica S. Lam: Compilers: Principles, Techniques, and Tools, 2nd edition. 2006
Generalised LR parsing
Eelco Visser: Syntax Definition for Language Prototyping. PhD thesis 1997
M.G.J. van den Brand, J. Scheerder, J.J. Vinju, and E. Visser: Disambiguation Filters for Scannerless Generalized LR Parsers. CC 2002
75
LR Parsing
copyrights
76
LR Parsing 77
copyrights
LR Parsing
Pictures
Slide 1: Book Scanner by Ben Woosley, some rights reserved
Slide 19: Ostsee by Mario Thiel, some rights reserved
78