Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
1
Syntax Analysis
2
Syntax Analysis
Introduction to parsers
Context-free grammars
Push-down automata
Top-down parsing
Buttom-up parsing
Bison - a parser generator
3
Introduction to parsers
Lexical
AnalyzerParser
Symbol
Table
token
next token
source Semantic
Analyzersyntax
treecode
4
Context-Free Grammars
A set of terminals: basic symbols from
which sentences are formed
A set of nonterminals: syntactic
categories denoting sets of sentences
A set of productions: rules specifying
how the terminals and nonterminals can
be combined to form sentences
The start symbol: a distinguished
nonterminal denoting the language
5
An Example
Terminals: id, ‘+’, ‘-’, ‘*’, ‘/’, ‘(’, ‘)’
Nonterminals: expr, op
Productions:
expr expr op expr
expr ‘(’ expr ‘)’
expr ‘-’ expr
expr id
op ‘+’ | ‘-’ | ‘*’ | ‘/’
The start symbol: expr
6
Derivations
A derivation step is an application of a
production as a rewriting rule
E - E
A sequence of derivation steps
E - E - ( E ) - ( id )
is called a derivation of “- ( id )” from E
The symbol * denotes “derives in zero or
more steps”; the symbol + denotes “derives
in one or more steps
E * - ( id ) E + - ( id )
7
Context-Free Languages
A context-free language L(G) is the language
defined by a context-free grammar G
A string of terminals is in L(G) if and only
if S + , is called a sentence of G
If S * , where may contain nonterminals,
then we call a sentential form of G
E - E - ( E ) - ( id )
G1 is equivalent to G2 if L(G1) = L(G2)
8
Left- & Right-most Derivations
Each derivation step needs to choose
– a nonterminal to rewrite
– a production to apply
A leftmost derivation always chooses the
leftmost nonterminal to rewrite
E lm - E lm - ( E ) lm - ( E + E )
lm - ( id + E ) lm - ( id + id )
A rightmost derivation always chooses the
rightmost nonterminal to rewrite
E rm - E rm - ( E ) rm - ( E + E )
rm - (E + id ) rm - ( id + id )
9
Parse Trees
A parse tree is a graphical representation
for a derivation that filters out the order of
choosing nonterminals for rewriting
Many derivations may correspond to the
same parse tree, but every parse tree has
associated with it a unique leftmost and a
unique rightmost derivation
10
An Example
E
-
( )
+
id id
E
E E
EE
lm - E
lm - ( E )
lm - ( E + E )
lm - ( id + E )
lm - ( id + id )
E
rm - E
rm - ( E )
rm - ( E + E )
rm - ( E + id )
rm - ( id + id )
11
Ambiguous Grammar
A grammar is ambiguous if it produces
more than one parse tree for some sentence
E
E + E
id + E
id + E * E
id + id * E
id + id * id
E
E * E
E + E * E
id + E * E
id + id * E
id + id * id
12
Ambiguous Grammar
E
+E E
id
id
*E E
id
E
*E E
id
id
+E E
id
13
Resolving Ambiguity
Use disambiguiting rules to throw away
undesirable parse trees
Rewrite grammars by incorporating
disambiguiting rules into grammars
14
An Example
The dangling-else grammar
stmt if expr then stmt
| if expr then stmt else stmt
| other
Two parse trees for
if E1 then if E2 then S1 else S2
15
An Example
S
elseE S Sif then
if E then S
elseE
S
S Sif then
if E then S
16
Disambiguiting Rules
Rule: match each else with the closest
previous unmatched then
Remove undesired state transitions in the
pushdown automaton
17
Grammar Rewriting
stmt m_stmt
| unm_stmt
m_stmt if expr then m_stmt else m_stmt
| other
unm_stmt if expr then stmt
| if expr then m_stmt else unm_stmt
18
RE vs. CFG
Every language described by a RE can also
be described by a CFG
Why use REs for lexical syntax?
– do not need a notation as powerful as CFGs
– are more concise and easier to understand than
CFGs
– More efficient lexical analyzers can be
constructed from REs than from CFGs
– Provide a way for modularizing the front end
into two manageable-sized components
19
Push-Down Automata
Finite Automaton
Input
OutputStack
$
$
20
An Example
S’ S $
S a S b
S
1 2 3start (a, $)
a
(b, a)
a
($, $)
(a, a)
a
(b, a)
a
0
($, $)
21
Nonregular Constructs
REs can denote only a fixed number of
repetitions or an unspecified number of
repetitions of one given construct:
an, a*
A nonregular construct:
– L = {anbn | n 0}
22
Non-Context-Free Constructs
CFGs can denote only a fixed number of
repetitions or an unspecified number of
repetitions of one or two given constructs
Some non-context-free constructs:
– L1 = {wcw | w is in (a | b)*}
– L2 = {anbmcndm | n 1 and m 1}
– L3 = {anbncn | n 0}
共勉
子貢曰︰貧而無諂,富而無驕,何如。
子曰︰可也,未若貧而樂,富而好禮者也。
子貢曰︰詩云︰「如切如磋,如琢如磨。」
其斯之謂與。
子曰︰賜也,始可與言詩已矣;
告諸往而知來者。
--論語
24
Top-Down Parsing
Construct a parse tree from the root to the
leaves using leftmost derivation
1. S c A B input: cad
2. A a b
3. A a
4. B d
S S
c A B
1S
c A B
a b
2S
c A B
a d
4S
c A B
a
3
backtrack
25
Predictive Parsing
A top-down parsing without backtracking
– there is only one alternative production to
choose at each derivation step
stmt if expr then stmt else stmt
| while expr do stmt
| begin stmt_list end
26
LL(k) Parsing
The first L stands for scanning the input
from left to right
The second L stands for producing a
leftmost derivation
The k stands for the number of lookahead
input symbols used to choose alternative
productions at each derivation step
27
LL(1) Parsing
Use one input symbol of lookahead
LL(1): S a b e | c d e
LL(2): S a b e | a d e
28
Bottom-Up Parsing
Construct a parse tree from the leaves to the
root using rightmost derivation in reverse
S a A B e input: abbcde
A A b c | b
B d
ca d eb
A
b
A
ca d eb
A
b
BA
ca d eb
A
b
S
BA
ca d eb
A
bca d ebb
abbcde rm aAbcde rm aAde rm aABe rm S
29
Handles
A handle of a right-sentential form consists of
– a production A
– a position of where can be replaced by A to
produce the previous right-sentential form in a
rightmost derivation of
abbcde rm aAbcde rm aAde rm aABe rm S
A b A A b c B d S a A B e
30
Handle Pruning
rm A rm S
S
A
The string to the right of the handle contains only
terminals
A is the bottommost leftmost interior node with all
its children in the tree
31
An Example
S
S
BA
ca d eb
A
b
S
BA
ca d eb
A
S
BA
a d e
S
BA
a e
32
Shift-Reduce Parsing
Parsing driver
Parsing table
Input
Output
Stack
Handle
$
$
33
Stack Operations
Shift: shift the next input symbol onto the
top of the stack
Reduce: replace the handle at the top of the
stack with the corresponding nonterminal
Accept: announce successful completion of
the parsing
Error: call an error recovery routine
34
An Example
Action Stack Input
S $ a b b c d e $
S $ a b b c d e $
R $ a b b c d e $
S $ a A b c d e $
S $ a A b c d e $
R $ a A b c d e $
S $ a A d e $
R $ a A d e $
S $ a A B e $
R $ a A B e $
A $ S $
35
Shift/Reduce Conflict
stmt if expr then stmt
| if expr then stmt else stmt
| other
Stack Input
$ - - - if expr then stmt else - - - $
Shift if expr then stmt else stmt
Reduce if expr then stmt
36
Reduce/Reduce Conflict
stmt id ( para_list )
stmt expr := expr
para_list para_list , para
para_list para
para id
expr id ( expr_list )
expr id
expr_list expr_list , expr
expr_list expr
Stack Input
$ - - - id ( id , id ) - - - $
$- - - procid ( id , id ) - - - $
37
LR(k) Parsing
The L stands for scanning the input from
left to right
The R stands for constructing a rightmost
derivation in reverse
The k stands for the number of lookahead
input symbols used to make parsing
decisions
38
LR Parsing
The LR parsing algorithm
Constructing SLR(1) parsing tables
Constructing LR(1) parsing tables
Constructing LALR(1) parsing tables
39
Model of an LR Parser
Parsing driver
Input
Output
Stack
Action Goto
Sm
Sm-1
Xm-1
Xm
S0Parsing table
$
$
40
An Example
(1) E E + T
(2) E T
(3) T T * F
(4) T F
(5) F ( E )
(6) F id
State Action Goto
id + * ( ) $ E T F
0 s5 s4 1 2 3
1 s6 acc
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5
41
An Example
Action Stack Input
s5 $0 id + id * id $
r6 $0 id5 + id * id $
r4 $0 F3 + id * id $
r2 $0 T2 + id * id $
s6 $0 E1 + id * id $
s5 $0 E1 +6 id * id $
r6 $0 E1 +6 id5 * id $
r4 $0 E1 +6 F3 * id $
s7 $0 E1 +6 T9 * id $
s5 $0 E1 +6 T9 *7 id $
r6 $0 E1 +6 T9 *7 id5 $
r3 $0 E1 +6 T9 *7 F10 $
r1 $0 E1 +6 T9 $
acc $0 E1 $
42
LR Parsing Driver
push $s0 onto the stack, where s0 is the initial state
set ip to point to the first symbol of w$;
repeat
let s be the top state on the stack and a the symbol pointed to by ip;
if action[s, a] == shift s’ then
push a and s’ onto the stack and advance ip
else if action[s, a] == reduce A then
pop 2 * | | symbols off the stack; s’ = goto[top(), A];
push A and s’ onto the stack
else if action[s, a] == accept then
return
else error
until false
43
First Sets
The first set of a string is the set of
terminals that begin the strings derived from
. If * , then is also in the first set of
.
44
First Sets
If X is terminal, then FIRST(X) is {X}
If X is nonterminal and X is a
production, then add to FIRST(X)
If X is nonterminal and X Y1 Y2 ... Yk is
a production, then add a to FIRST(X) if
for some i, a is in FIRST(Yi) and is in all
of FIRST(Y1), ..., FIRST(Yi-1). If is in
FIRST(Yj) for all j, then add to FIRST(X)
45
An Example
E T E'
E' + T E' |
T F T'
T' * F T' |
F ( E ) | id
FIRST(F) = { (, id }
FIRST(T') = { *, }, FIRST(T) = { (, id }
FIRST(E') = { +, }, FIRST(E) = { (, id }
46
Follow Sets
The follow set of a nonterminal A is the set of
terminals that can appear immediately to the
right of A in some sentential form, namely,
S * A a
a is in the follow set of A.
47
Follow Sets
Place $ in FOLLOW(S), where S is the start
symbol and $ is the input right endmarker
If there is a production A B , then
everything in FIRST() except for is placed
in FOLLOW(B)
If there is a production A B or A B
where FIRST() contains , then everything
in FOLLOW(A) is in FOLLOW(B)
48
An Example
E T E'
E' + T E' |
T F T'
T' * F T' |
F ( E ) | id
FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
FIRST(E') = { +, }, FIRST(T') = { *, }
FOLLOW(E) = { ), $ }, FOLLOW(E') = { ), $ }
FOLLOW(T) = { +, ), $ }, FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { *, +, ), $ }
49
SLR(1) Parsing Table
LR(0) Items
• An LR(0) item of a grammar in G is a production of G with a dot at some position of the right-hand side, A
• An LR(0) item represents a state in an NPDA indicating how much of a production we have seen at a given point in the parsing process
• The production A X Y Z yields the following four LR(0) items
A • X Y Z, A X • Y Z, A X Y • Z, A X Y Z •
50
From CFG to NPDA
• The state A a will go to the state A
a via an edge of terminal a (a shifting)
• The state A B will go to the state B
via an edge of the empty string
• The state A will cause a reduction on
seeing a terminal in FOLLOW(A)
• The state A B will go to the state A
B via an edge of nonterminal B (after a
reduction)
51
An Example
1. E’ E
2. E E + T
3. E T
4. T T * F
5. T F
6. F ( E )
7. F id
Augmented grammar:Easier to identify
the accepting state
52
An Example
E’•E
0
E
E’E•
7
T
ET•
9
F
TF•
11
E•E+T
1
E•T
2
T•T*F
3
T•F
4
EE•+T
8E
EE+•T
14+
17
EE+T•T
F•(E)
5
F•id
6 Fid•
13id
F(•E)
12(
TT•*F
10
TTT*•F*
15
TT*F•F
18
E F(E•)
16)
F(E)•
19
5
6
1
2
3
4
53
From NPDA to DPDA
• There are two functions performed on sets
of LR(0) items (DPDA states)
• The function closure(I) adds more items to I
when there is a dot to the left of a
nonterminal (corresponding to edges)
• The function goto(I, X) moves the dot past
the symbol X in all items in I that contain X
(corresponding to non- edges)
54
The Closure Function
function closure(I);
begin
J := I;
repeat
for each item A B in J and
each production B of G such that
B is not in J do
J = J { B }
until no more items can be added to J;
return J
end
55
An Example
1. E’ E
2. E E + T
3. E T
4. T T * F
5. T F
6. F ( E )
7. F id
s0 = E’ E,
I0 = closure({s0 }) =
{ E’ E,
E E + T,
E T,
T T * F,
T F,
F ( E ),
F id }
56
The Goto Function
function goto(I, X);
begin
set J to the empty set
for any item A X in I do
add A X to J
return closure(J)
end
57
An Example
I0 = {E’ E, E E + T, E T,
T T * F, T F, F ( E ),
F id }
goto(I0 , E)
= closure({E’ E , E E + T })
= {E’ E , E E + T }
58
Subset Construction
function items(G’);
begin
C := {closure({S’ S})}
repeat
for each set of items I in C and each symbol X do
J := goto(I, X)
if J is not empty and not in C then
C = C { J }
until no more sets of items can be added to C
return C
end
59
An Example
1. E’ E
2. E E + T
3. E T
4. T T * F
5. T F
6. F ( E )
7. F id
60
I0 : E’ E
E E + T
E T
T T * F
T F
F ( E )
F id
goto(I0, E) =
I1 : E’ E
E E + T
goto(I0, T) =
I2 : E T
T T * F
goto(I0, F) =
I3 : T F
goto(I0, ‘(’) =
I4 : F ( E )
E E + T
E T
T T * F
T F
F ( E )
F id
goto(I0, id) =
I5 : F id
goto(I1, ‘+’) =
I6 : E E + T
T T * F
T F
F ( E )
F id
goto(I2, ‘*’) =
I7 : T T * F
F ( E )
F id
goto(I4, E) =
I8 : F ( E )
E E + T
goto(I6, T) =
I9 : E E + T
T T * F
goto(I7, F) =
I10 : T T * F
goto(I8, ‘)’) =
I11 : F ( E )
61
An Example
E’ • E
E • E + T
E • T
T • T * F
T • F
F • ( E )
F • id
E’ E •
E E • + T
E T •
T T • * F
E T
T F •
F
F ( • E )
E • E + T
E • T
T • T * F
T • F
F • ( E )
F • id
F id •id
(
T T * • F
F • ( E )
F • id*
E E + • T
T • T * F
T • F
F • ( E )
F • id
+
F ( E • )
E E • + T
F T T * F •
E E + T •
T T • * FT
F ( E ) •
)
0
1
2
3
4
5
6
7
8
9
10
11
(id *
+
id
ET
F
F(
(id
62
SLR(1) Parsing Table Generation
procedure SLR(G’);
begin
for each state I in items(G’) do begin
if A a in I and goto(I, a) = J for a terminal a then
action[I, a] = “shift J”
if A in I and A S’ then
action[I, a] = “reduce A ” for all a in Follow(A)
if S’ S in I then action[I, $] = “accept”
if A X in I and goto(I, X) = J for a nonterminal X
then goto[I, X] = J
end
all other entries in action and goto are made error
end
63
An Example
+ * ( ) id $ E T F
0 s4 s5 1 2 3
1 s6 a
2 r3 s7 r3 r3
3 r5 r5 r5 r5
4 s4 s5 8 2 3
5 r7 r7 r7 r7
6 s4 s5 9 3
7 s4 s5 10
8 s6 s11
9 r2 s7 r2 r2
10 r4 r4 r4 r4
11 r6 r6 r6 r6
64
共勉
大學之道︰
在明明德,在親民,在止於至善。
--大學
65
LR(1) Parsing Table
LR(1) Items
• An LR(1) item of a grammar in G is a pair,
( A , a ), of an LR(0) item A and a lookahead symbol a
• The lookahead has no effect in an LR(1) item
of the form ( A , a ), where is not
• An LR(1) item of the form ( A , a )
calls for a reduction by A only if the next
input symbol is a
66
The Closure Function
function closure(I);
begin
J := I;
repeat
for each item (A B , a) in J and
each production B of G and
each b FIRST( a) such that
(B , b) is not in J do
J = J { (B , b) }
until no more items can be added to J;
return J
end
67
The Goto Function
function goto(I, X);
begin
set J to the empty set
for any item (A X , a) in I do
add (A X , a) to J
return closure(J)
end
68
Subset Construction
function items(G’);
begin
C := {closure({S’ S, $})}
repeat
for each set of items I in C and each symbol X do
J := goto(I, X)
if J is not empty and not in C then
C = C { J }
until no more sets of items can be added to C
return C
end
69
An Example
1. S’ S
2. S C C
3. C c C
4. C d
70
An Example
I0: closure({(S’ S, $)}) =
(S’ S, $)
(S C C, $)
(C c C, c/d)
(C d, c/d)
I1: goto(I0, S) = (S’ S , $)
I2: goto(I0, C) =
(S C C, $)
(C c C, $)
(C d, $)
I3: goto(I0, c) =
(C c C, c/d)
(C c C, c/d)
(C d, c/d)
I4: goto(I0, d) =
(C d , c/d)
I5: goto(I2, C) =
(S C C , $)
71
An Example
I6: goto(I2, c) =
(C c C, $)
(C c C, $)
(C d, $)
I7: goto(I2, d) =
(C d , $)
I8: goto(I3, C) =
(C c C , c/d)
: goto(I3, c) = I3
: goto(I3, d) = I4
I9: goto(I6, C) =
(C c C , $)
: goto(I6, c) = I6
: goto(I6, d) = I7
72
LR(1) Parsing Table Generation
procedure LR(G’);
begin
for each state I in items(G’) do begin
if (A a , b) in I and goto(I, a) = J for a terminal a then
action[I, a] = “shift J”
if (A , a) in I and A S’ then
action[I, a] = “reduce A ”
if (S’ S , $) in I then action[I, $] = “accept”
if (A X , a) in I and goto(I, X) = J for a nonterminal X
then goto[I, X] = J
end
all other entries in action and goto are made error
end
73
An Example
c d $ S C
0 s3 s4 1 2
1 a
2 s6 s7 5
3 s3 s4 8
4 r4 r4
5 r2
6 s6 s7 9
7 r4
8 r3 r3
9 r3
74
LALR(1) Parsing Table
The Core of LR(1) Items
• The core of a set of LR(1) items is the set of
their first components (i.e., LR(0) items)
• The core of the set of LR(1) items
{ (C c C, c/d),
(C c C, c/d),
(C d, c/d) }
is {C c C,
C c C,
C d }
75
Merging Cores
I3: { (C c C, c/d)
(C c C, c/d)
(C d, c/d) }
I4: { (C d , c/d) }
I8: { (C c C , c/d) }
I6: { (C c C, $)
(C c C, $)
(C d, $) }
I7: { (C d , $) }
I9: { (C c C , $) }
76
LALR(1) Parsing Table Generation
procedure LALR(G’);
begin
for each state I in mergeCore(items(G’)) do begin
if (A a , b) in I and goto(I, a) = J for a terminal a then
action[I, a] = “shift J”
if (A , a) in I and A S’ then
action[I, a] = “reduce A ”
if (S’ S , $) in I then action[I, $] = “accept”
if (A X , a) in I and goto(I, X) = J for a nonterminal X
then goto[I, X] = J
end
all other entries in action and goto are made error
end
77
An Example
c d $ S C
0 s36 s47 1 2
1 a
2 s36 s47 5
36 s36 s47 89
47 r4 r4 r4
5 r2
89 r3 r3 r3
78
LR Grammars
• A grammar is SLR(1) iff its SLR(1) parsing
table has no multiply-defined entries
• A grammar is LR(1) iff its LR(1) parsing
table has no multiply-defined entries
• A grammar is LALR(1) iff its LALR(1)
parsing table has no multiply-defined
entries
79
Hierarchy of Grammar Classes
Unambiguous Grammars Ambiguous Grammars
LL(k) LR(k)
LR(1)
LALR(1)
LL(1) SLR(1)
80
Hierarchy of Grammar Classes
• Why LL(k) LR(k)?
• Why SLR(k) LALR(k) LR(k)?
81
LL(k) vs. LR(k)
• For a grammar to be LL(k), we must be able
to recognize the use of a production by
seeing only the first k symbols of what its
right-hand side derives
• For a grammar to be LR(k), we must be able
to recognize the use of a production by
having seen all of what is derived from its
right-hand side with k more symbols of
lookahead
82
LALR(k) vs. LR(k)
• The merge of the sets of LR(1) items having the
same core does not introduce shift/reduce conflicts
• Suppose there is a shift-reduce conflict on
lookahead a in the merged set because of
1. (A , a) 2. (B a , b)
• Then some set of items has item (A , a) , and
since the cores of all sets merged are the same, it
must have an item (B a , c) for some c
• But then this set has the same shift/reduce conflict
on a
83
LALR(k) vs. LR(k)
• The merge of the sets of LR(1) items having the same core may introduce reduce/reduce conflicts
• As an example, consider the grammar1. S’ S2. S a A d | a B e | b A e | b B d3. A c4. B c
that generates acd, ace, bce, bcd
• The set {(A c , d), (B c , e)} is valid for acx
• The set {(A c , e), (B c , d)} is valid for bcx
• But the union {(A c , d/e), (B c , d/e)} generates a reduce/reduce conflict
84
SLR(k) vs. LALR(k)
1. S’ S
2. S L = R
3. S R
4. L * R
5. L id
6. R L
85
SLR(k) vs. LALR(k)
I0: closure({S’ S}) =
S’ S
S L = R
S R
L * R
L id
R L
I1: goto(I0, S) = S’ S
I2: goto(I0, L) =
S L = R
R L
I3: goto(I0, R) =
S R
I4: goto(I0, *) =
L * R
R L
L * R
L id
I5: goto(I0, id) =
L id
FOLLOW(R) = {=, $}
86
SLR(k) vs. LALR(k)
I6: goto(I2, =) =
S L = R
R L
L * R
L id
I7: goto(I4, R) =
L * R
I8: goto(I4, L) =
R L
I9: goto(I6, R) =
S L = R
87
SLR(k) vs. LALR(k)
I0: closure({(S’ S, $)}) =
(S’ S, $)
(S L = R, $)
(S R, $)
(L * R, =/$)
(L id, =/$)
(R L, $)
I1: goto(I0, S) = (S’ S , $)
I2: goto(I0, L) =
(S L = R, $)
(R L , $)
I3: goto(I0, R) =
(S R , $)
I4: goto(I0, *) =
(L * R, =/$)
(R L, =/$)
(L * R, =/$)
(L id, =/$)
I5: goto(I0, id) =
(L id , =/$)
88
SLR(k) vs. LALR(k)
I6: goto(I2, =) =
(S L = R, $)
(R L, $)
(L * R, $)
(L id, $)
I7: goto(I4, R) =
(L * R , =/$)
I8: goto(I4, L) =
(R L , =/$)
I9: goto(I6, R) =
(S L = R , $)
I10: goto(I6, L) =
(R L , $)
I11: goto(I6, *) =
(L * R, $)
(R L, $)
(L * R, $)
(L id, $)
I12: goto(I6, id) =
(L id , $)
I13: goto(I11, R) =
(L * R , $)
I4
I5
89
Bison – A Parser Generator
Bison compiler
C compiler
a.out
lang.ylang.tab.c
lang.tab.h (-d option)
lang.tab.c a.out
tokens syntax tree
A langauge for specifying parsers and semantic analyzers
90
Bison Programs
%{
C declarations
%}
Bison declarations
%%
Grammar rules
%%
Additional C code
91
An Example
line expr ‘\n’
expr expr ‘+’ term | term
term term ‘*’ factor | factor
factor ‘(’ expr ‘)’ | DIGIT
92
An Example
%token DIGIT
%start line
%%
line : expr ‘\n’ {printf(“line: expr \\n\n”);}
;
expr: expr ‘+’ term {printf(“expr: expr + term\n”);}
| term {printf(“expr: term\n”}
;
term: term ‘*’ factor {printf(“term: term * factor\n”;}
| factor {printf(“term: factor\n”);}
;
factor: ‘(’ expr ‘)’ {printf(“factor: ( expr )\n”);}
| DIGIT {printf(“factor: DIGIT\n”);}
;
93
Functions and Variables
• yyparse(): the parser function
• yylex(): the lexical analyzer function. Bison recognizes any non-positive value as indicating the end of the input
• yylval: the attribute value of a token. Its default type is int, and can be declared to be multiple types in the first section using
%union {int ival;double dval;
}
94
Conflict Resolutions
• A reduce/reduce conflict is resolved by
choosing the production listed first
• A shift/reduce conflict is resolved in favor
of shift
• A mechanism for assigning precedences and
assocoativities to terminals
95
Precedence and Associativity
• The precedence and associativity of operators are
declared simultaneously
%nonassoc ‘<’ /* lowest */
%left ‘+’ ‘-’
%right ‘^’ /* highest */
• The precedence of a rule is determined by the
precedence of its rightmost terminal
• The precedence of a rule can be modified by
adding %prec <terminal> to its right end
96
An Example
%{
#include <stdio.h>
%}
%token NUMBER
%left ‘+’ ‘-’
%left ‘*’ ‘/’
%right UMINUS
%%
97
An Example
line : expr ‘\n’
;
expr: expr ‘+’ expr
| expr ‘-’ expr
| expr ‘*’ expr
| expr ‘/’ expr
| ‘-’ expr %prec UMINUS
| ‘(’ expr ‘)’
| NUMBER
;
98
Error Recovery
• Error recovery is performed via error productions
• An error production is a production containing
the predefined terminal error
• After adding an error production,
A B | error
on encountering an error in the middle of B, the
parser pops symbols from its stack until , shifts
error, and skips input tokens until a token in
FIRST()
99
Error Recovery
• The parser can report a syntax error by calling
the user provided function yyerror(char *)
• The parser will suppress the report of another
error message for 3 tokens
• You can resume error report immediately by
using the macro yyerrok
• Error productions are used for major
nonterminals
100
An Example
line : expr ‘\n’
| error ‘\n’ {yyerror("reenter last line:");
yyerrok;}
;
expr: expr ‘+’ expr
| expr ‘*’ expr
| ‘-’ expr %prec UMINUS
| ‘(’ expr ‘)’
| NUMBER
;
101
共勉
樊遲問仁。
子曰︰「愛人。」
子曰︰「人之過也,各於其黨,
觀過,斯知仁矣。」
--論語
102
共勉
子曰︰唯仁者,能好人,能惡人。
子曰︰茍志於仁矣,無惡也。
--論語
子曰︰志士仁人,無求生以害仁,
有殺身以成仁。
103
共勉
子曰︰里仁為美。擇不處仁,焉得知?
子曰︰朝聞道,夕死可矣!
--論語
子曰︰不仁者不可以久處約,
不可以長處樂。仁者安仁,知者利仁。