8/13/2019 MELJUN CORTES Automata Theory 9
1/22
CSC 3130: Automata theory and formal languages
Normal forms and parsing
Fall 2008
MELJUN P. CORTES MBA MPA BSCS ACS
Testing membership and parsing
Given a grammar
    S → 0S1 | 1S0S1 | T
    T → S | ε
How can we know if a string x is in its language?
If so, can we reconstruct a parse tree for x?
First attempt
Maybe we can try all possible derivations:
    S → 0S1 | 1S0S1 | T
    T → S | ε
    x = 00111
    S ⇒ 0S1    ⇒ 00S11
               ⇒ 01S0S11
               ⇒ 0T1
      ⇒ 1S0S1  ⇒ 10S10S1 ...
      ⇒ T      ⇒ S
When do we stop?
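The first levels of this search can be sketched in Python. This is only an illustration under assumptions that are not in the slides: the grammar is stored as a dict from a variable to a list of right-hand-side strings, every symbol is one character, ε is the empty string, and `one_step` is a hypothetical helper name.

```python
def one_step(grammar, form):
    """Return every sentential form obtained from `form` by applying
    one production to one variable occurrence."""
    result = []
    for i, sym in enumerate(form):
        for body in grammar.get(sym, []):   # terminals have no productions
            new_form = form[:i] + body + form[i + 1:]
            if new_form not in result:
                result.append(new_form)
    return result

# Grammar from the slide; "" stands for ε.
grammar = {"S": ["0S1", "1S0S1", "T"], "T": ["S", ""]}

level1 = one_step(grammar, "S")
level2_from_0S1 = one_step(grammar, "0S1")
```

Repeating `one_step` on every new form never terminates by itself for this grammar, which is the problem the next slides address.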
Problems
How do we know when to stop?
    S → 0S1 | 1S0S1 | T
    T → S | ε
    x = 00111
    S ⇒ 0S1    ⇒ 00S11
               ⇒ 01S0S11
               ⇒ 0T1
      ⇒ 1S0S1  ⇒ 10S10S1 ...
Problems
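Idea: Stop a derivation when its length exceeds |x|
Not right because of ε-productions
We might want to eliminate ε-productions too
    S → 0S1 | 1S0S1 | T
    T → S | ε
    x = 01011
    S  ⇒  0S1  ⇒  01S0S11  ⇒  01S011  ⇒  01011
    1      3       7           6          5      (lengths)
The form 01S0S11 has length 7 > |x| = 5, yet the derivation still ends in x
(each step that erases an S abbreviates S ⇒ T ⇒ ε)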
Problems
Loops among the variables (S → T → S) might make us go forever
We might want to eliminate such loops
    S → 0S1 | 1S0S1 | T
    T → S | ε
    x = 00111
Unit productions
A unit production is a production of the form
    A1 → A2
where A1 and A2 are both variables
Example
    grammar:
        S → 0S1 | 1S0S1 | T
        T → S | R | ε
        R → 0SR
    unit productions:
        S → T
        T → S
        T → R
Removal of unit productions
If there is a cycle of unit productions
    A1 → A2 → ... → Ak → A1
delete it and replace everything with A1
Example
    before:
        S → 0S1 | 1S0S1 | T
        T → S | R | ε
        R → 0SR
    unit productions: S → T, T → S, T → R
    after:
        S → 0S1 | 1S0S1 | R | ε
        R → 0SR
T is replaced by S in the {S, T} cycle
Removal of unit productions
For other unit productions, replace every chain
    A1 → A2 → ... → Ak → α
by productions A1 → α, ..., Ak → α
Example
    S → R → 0SR is replaced by S → 0SR, R → 0SR
    before:
        S → 0S1 | 1S0S1 | R | ε
        R → 0SR
    after:
        S → 0S1 | 1S0S1 | 0SR | ε
        R → 0SR
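Unit-production removal can be sketched in Python as follows. This is a hedged illustration, not the slides' exact procedure: it uses the unit-closure formulation, in which each variable inherits the non-unit productions of every variable it reaches through unit productions, instead of first collapsing cycles and then replacing chains. The dict-of-strings grammar encoding (one character per symbol, "" for ε) and the function name are assumptions made for this example.

```python
def remove_unit_productions(grammar):
    """Return a grammar with the same language and no unit productions."""
    variables = set(grammar)

    def is_unit(rhs):
        # A unit production has a single variable as its right-hand side.
        return len(rhs) == 1 and rhs in variables

    result = {}
    for a in grammar:
        # Collect every variable reachable from `a` via unit productions.
        reachable, stack = {a}, [a]
        while stack:
            v = stack.pop()
            for rhs in grammar[v]:
                if is_unit(rhs) and rhs not in reachable:
                    reachable.add(rhs)
                    stack.append(rhs)
        # `a` inherits the non-unit right-hand sides of all of them.
        new_rhs = []
        for v in sorted(reachable):
            for rhs in grammar[v]:
                if not is_unit(rhs) and rhs not in new_rhs:
                    new_rhs.append(rhs)
        result[a] = new_rhs
    return result

# Grammar from the unit-production slides; "" stands for ε.
grammar = {"S": ["0S1", "1S0S1", "T"], "T": ["S", "R", ""], "R": ["0SR"]}
cleaned = remove_unit_productions(grammar)
```

For S this produces the same four right-hand sides as the slide (0S1, 1S0S1, 0SR, ε). Unlike the slide, the sketch keeps T as a separate, now unreachable, variable instead of merging it into S.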
Removal of ε-productions
A variable N is nullable if there is a derivation
    N ⇒* ε
How to remove ε-productions (except from S)
    Find all nullable variables N1, ..., Nk
    For i = 1 to k
        For every production of the form A → αNiβ,
            add another production A → αβ
        If Ni → ε is a production, remove it
    If S is nullable, add the special production S → ε
Example
Find the nullable variables
    grammar:
        S → ACD
        A → a
        B → ε
        C → ED | ε
        D → BC | b
        E → b
    nullable variables: B, C, D
Finding nullable variables
To find nullable variables, we work backwards
    First, mark all variables A s.t. A → ε as nullable
    Then, as long as there are productions of the form
        A → A1...Ak
    where all of A1, ..., Ak are marked as nullable, mark A as nullable
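The marking procedure can be written as a small fixed-point loop. The sketch below is illustrative only: the dict-of-strings grammar encoding (one character per symbol, "" for ε) and the name `nullable_variables` are assumptions, not part of the slides.

```python
def nullable_variables(grammar):
    """Return the set of variables N with N =>* ε."""
    nullable = set()
    changed = True
    while changed:                      # repeat until nothing new is marked
        changed = False
        for var, bodies in grammar.items():
            if var in nullable:
                continue
            # all() over an empty body is True, so A -> ε marks A directly;
            # otherwise every symbol of the body must already be marked.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(var)
                changed = True
    return nullable

# Grammar from the example slide; "" stands for ε.
grammar = {
    "S": ["ACD"], "A": ["a"], "B": [""],
    "C": ["ED", ""], "D": ["BC", "b"], "E": ["b"],
}
found = nullable_variables(grammar)
```

On the example grammar the loop marks B and C in the first pass (directly from their ε-productions) and D once both B and C are marked.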
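Eliminating ε-productions
    grammar:
        S → ACD
        A → a
        B → ε
        C → ED | ε
        D → BC | b
        E → b
    nullable variables: B, C, D
    For i = 1 to k
        For every production of the form A → αNiβ,
            add another production A → αβ
        If Ni → ε is a production, remove it
    added productions:
        D → C
        S → AD
        D → B
        D → ε
        S → AC
        S → A
        C → E
    removed productions: B → ε, C → ε, D → ε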
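A compact way to get the same result in code is to generate, for each production, every variant obtained by dropping some of its nullable symbols. The Python sketch below takes that route rather than the slide's one-variable-at-a-time loop; the grammar encoding (dict of strings, one character per symbol, "" for ε), the function names, and the `start` parameter are assumptions for illustration.

```python
from itertools import product

def nullable_variables(grammar):
    """Fixed-point marking of variables that derive ε."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var not in nullable and any(
                all(sym in nullable for sym in body) for body in bodies
            ):
                nullable.add(var)
                changed = True
    return nullable

def eliminate_epsilon(grammar, start="S"):
    """Return a grammar without ε-productions (except possibly start -> ε)."""
    nullable = nullable_variables(grammar)
    result = {}
    for var, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; others are kept.
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:           # skip ε itself
                    new_bodies.add(candidate)
        result[var] = sorted(new_bodies)
    if start in nullable:
        result[start].append("")        # the special production S -> ε
    return result

grammar = {
    "S": ["ACD"], "A": ["a"], "B": [""],
    "C": ["ED", ""], "D": ["BC", "b"], "E": ["b"],
}
cleaned = eliminate_epsilon(grammar)
```

For the example grammar this yields S → ACD | AD | AC | A, C → ED | E and D → BC | B | C | b, matching the productions listed on the slide; B is left with no productions at all.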
Recap
After eliminating ε-productions and unit productions, we know that every derivation
    S ⇒* a1...ak    where a1, ..., ak are terminals
doesn't shrink in length and doesn't go into cycles
Exception: S → ε
    We will not use this rule at all, except to check if ε ∈ L
Note
    ε-productions must be eliminated before unit productions
Algorithm 1 for testing membership
We can now use the following algorithm to check if a string x is in the language of G
    Eliminate all ε-productions and unit productions
    If x = ε and S → ε, accept; else delete S → ε
    Let X := S
    While some new production P can be applied to X
        Apply P to X
        If X = x, accept
        If |X| > |x|, backtrack
    If no more productions can be applied to X, reject
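Algorithm 1 can be sketched in Python as a depth-first search over sentential forms. The sketch makes several assumptions that are not in the slides: the grammar is a dict of strings with one character per symbol, it is already free of ε-productions and unit productions, only leftmost derivations are explored (which suffices for membership), and a `seen` set plays the role of "some new production". The example grammar is a hand-cleaned version of S → 0S1 | 1S0S1 | T, T → S | ε with the S → ε rule already deleted.

```python
def derives(grammar, x, start="S"):
    """Return True if `start` derives the string x."""
    if x == "":
        return "" in grammar.get(start, [])
    seen = set()
    stack = [start]                     # forms still to be expanded
    while stack:
        form = stack.pop()
        if form == x:
            return True                 # accept
        # Position of the leftmost variable, or None if the form is all terminals.
        pos = next((i for i, s in enumerate(form) if s in grammar), None)
        if pos is None:
            continue                    # a different terminal string: dead end
        for body in grammar[form[pos]]:
            if body == "":
                continue                # S -> ε is never used in the search
            new_form = form[:pos] + body + form[pos + 1:]
            # Forms longer than x are pruned (the "backtrack" step).
            if len(new_form) <= len(x) and new_form not in seen:
                seen.add(new_form)
                stack.append(new_form)
    return False                        # reject

# Assumed cleaned-up grammar (no ε- or unit productions).
grammar = {"S": ["0S1", "01", "1S0S1", "10S1", "1S01", "101"]}

in_language = derives(grammar, "01011")
not_in_language = derives(grammar, "00111")
```

The length check is sound only because derivations in the cleaned-up grammar never shrink; on the original grammar the same pruning would wrongly cut off the derivation of 01011 shown earlier.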
Practical limitations of Algorithm 1
The previous algorithm can be very slow if x is long
    G = CFG of the Java programming language
    x = code for a 200-line Java program
    algorithm might take about 10^200 steps!
There is a faster algorithm, but it requires that we
do some more transformations on the grammar
Chomsky Normal Form
A grammar is in Chomsky Normal Form if every production (except possibly S → ε) is of the type
    A → BC    or    A → a
Conversion to Chomsky Normal Form is easy:
    A → BcDE
  replace terminals with new variables:
    A → BCDE
    C → c
  break up sequences with new variables:
    A → BX1
    X1 → CX2
    X2 → DE
    C → c
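The two conversion steps can be sketched in Python. This is a rough illustration under stated assumptions: productions are tuples of symbols so that new variables can have multi-character names, the grammar already has no ε-productions or unit productions, and the generated names `T_c` and `X1`, `X2`, ... are hypothetical and assumed not to clash with existing variables. Unlike the slide, the sketch always introduces a fresh variable for a terminal instead of reusing an existing one such as C.

```python
def to_cnf(grammar):
    """Convert a grammar (dict: variable -> list of symbol tuples) to CNF."""
    variables = set(grammar)
    result = {var: [] for var in grammar}
    terminal_var = {}                   # terminal -> variable producing it
    counter = 0
    for var, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                result[var].append(body)        # already of the form A -> a
                continue
            # Step 1: replace terminals with new variables.
            symbols = []
            for sym in body:
                if sym not in variables:
                    if sym not in terminal_var:
                        terminal_var[sym] = "T_" + sym
                        result["T_" + sym] = [(sym,)]
                    sym = terminal_var[sym]
                symbols.append(sym)
            # Step 2: break up long sequences with new variables.
            head = var
            while len(symbols) > 2:
                counter += 1
                fresh = "X" + str(counter)
                result[head].append((symbols[0], fresh))
                result[fresh] = []
                head, symbols = fresh, symbols[1:]
            result[head].append(tuple(symbols))
    return result

# The slide's production A -> BcDE; B, D, E are variables without rules here.
grammar = {"A": [("B", "c", "D", "E")], "B": [], "D": [], "E": []}
cnf = to_cnf(grammar)
```

On the slide's example this gives A → B X1, X1 → T_c X2, X2 → D E and T_c → c, which is the slide's result with T_c in the role of C.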
Exercise
Convert this CFG into Chomsky Normal Form:
    S → ε | ADDA
    A → a
    C → c
    D → bCb
Parse tree reconstruction
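    S → AB | BC
    A → BA | a
    B → CC | b
    C → AB | a
    x = baaba
    Table of variables that derive each substring of x:
        length 1:   b: B      a: AC     a: AC    b: B     a: AC
        length 2:   ba: SA    aa: B     ab: SC   ba: SA
        length 3:   baa: ∅    aab: B    aba: B
        length 4:   baab: ∅   aaba: SAC
        length 5:   baaba: SAC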
Tracing back the derivations, we obtain the parse tree
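One way to make the tracing-back step concrete is to store, for each variable in each cell, how it was obtained. The Python sketch below does this with back-pointers; it is an illustration under assumptions that are not in the slides: productions are given as lists of tuples, positions are 0-indexed, only the first derivation found for each variable is kept, and the names `cyk_parse` and `leaves` are hypothetical.

```python
def cyk_parse(unary, binary, x, start="S"):
    """unary: (A, a) pairs for A -> a; binary: (A, B, C) triples for A -> BC.
    Returns a parse tree as nested tuples, or None if x is not derivable."""
    k = len(x)
    # back[s][t][A] says how A derives x[s..t]: a terminal, or (B, C, j).
    back = [[{} for _ in range(k)] for _ in range(k)]
    for i in range(k):
        for a, term in unary:
            if term == x[i]:
                back[i][i][a] = term
    for b in range(2, k + 1):                   # substring length
        for s in range(k - b + 1):              # start position
            t = s + b - 1                       # end position
            for j in range(s, t):               # split after position j
                for a, left, right in binary:
                    if left in back[s][j] and right in back[j + 1][t]:
                        back[s][t].setdefault(a, (left, right, j))

    def build(a, s, t):
        entry = back[s][t][a]
        if s == t:
            return (a, entry)                   # leaf: A -> terminal
        left, right, j = entry
        return (a, build(left, s, j), build(right, j + 1, t))

    if k == 0 or start not in back[0][k - 1]:
        return None
    return build(start, 0, k - 1)

def leaves(tree):
    """Concatenate the terminals at the leaves of a parse tree."""
    if len(tree) == 2:
        return tree[1]
    return leaves(tree[1]) + leaves(tree[2])

unary = [("A", "a"), ("B", "b"), ("C", "a")]
binary = [("S", "A", "B"), ("S", "B", "C"), ("A", "B", "A"),
          ("B", "C", "C"), ("C", "A", "B")]
tree = cyk_parse(unary, binary, "baaba")
```

The string may have several parse trees; this sketch returns just one of them, and reading its leaves from left to right gives back x.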
Cocke-Younger-Kasami algorithm
Input: Grammar G in CNF, string x = x1...xk
Cell ij records every variable that derives the substring xi...xj
    For i = 1 to k
        If there is a production A → xi
            Put A in table cell ii
    For b = 2 to k
        For s = 1 to k - b + 1
            Set t = s + b - 1
            For j = s to t - 1
                If there is a production A → BC
                where B is in cell sj and C is in cell (j+1)t
                    Put A in cell st
Table layout (row b holds the cells for substrings of length b):
    x1   x2   ...   xk
    11   22   ...   kk
    12   23   ...
    ...
    1k
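A minimal Python sketch of the table-filling loop follows. It rests on assumptions not in the slides: productions are passed as lists of tuples, positions are 0-indexed (so cell st becomes `table[s][t]`), x is non-empty, and the name `cyk_table` is hypothetical. The grammar and string are the ones from the parse tree reconstruction slide.

```python
def cyk_table(unary, binary, x):
    """unary: (A, a) pairs for A -> a; binary: (A, B, C) triples for A -> BC.
    table[s][t] is the set of variables deriving x[s..t] (inclusive)."""
    k = len(x)
    table = [[set() for _ in range(k)] for _ in range(k)]
    # Substrings of length 1: productions A -> xi.
    for i in range(k):
        for a, term in unary:
            if term == x[i]:
                table[i][i].add(a)
    # Longer substrings, in order of increasing length b.
    for b in range(2, k + 1):
        for s in range(k - b + 1):
            t = s + b - 1
            for j in range(s, t):       # split into x[s..j] and x[j+1..t]
                for a, left, right in binary:
                    if left in table[s][j] and right in table[j + 1][t]:
                        table[s][t].add(a)
    return table

unary = [("A", "a"), ("B", "b"), ("C", "a")]
binary = [("S", "A", "B"), ("S", "B", "C"), ("A", "B", "A"),
          ("B", "C", "C"), ("C", "A", "B")]
x = "baaba"
table = cyk_table(unary, binary, x)
accepted = "S" in table[0][len(x) - 1]
```

The three nested loops over b, s and j give on the order of k^3 cell updates, each scanning the list of binary productions, which is what makes this far faster than the exhaustive search of Algorithm 1.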