CSCI 3130: Formal languages and automata theory
Andrej Bogdanov
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
The Chinese University of Hong Kong
Undecidable problems for CFGs
Fall 2011
Decidable vs. undecidable
undecidable:
“TM M accepts w”
“TM M accepts some input”
“TM M and M’ accept same inputs”
“TM M halts on w”
more?
decidable:
“DFA M accepts w”
“DFAs M and M’ accept same inputs”
“CFG G generates w”
?
“CFG G generates all inputs”
“CFG G is ambiguous”
Representing computations
[Figure: state diagram of a Turing machine with states q0 to q7 and accepting state qa; transitions are labeled read/write and move direction, e.g. a/xR, b/xL, %/%R]
L1 = {w%w: w ∈{a, b}*}
abbaa%abbaa   state q0
xbbaa%abbaa   state q1
...
xbbaa%abbaa   state q1
xbbaa%abbaa   state q2
xbbaa%xbbaa   state q5
...
xxxxx%xxxxx   state qa
Configurations
• A configuration consists of the current state, the head position, and tape contents
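Example: tape a b a ☐ … with the head on the third cell in state q1 is the configuration abq1a
Transition: q1 to qacc on a/bR
Afterwards: tape a b b ☐ … with the head on the fourth cell in state qacc is the configuration abbqacc
Successive configurations on input ab%ab:
q0ab%ab
xq1b%ab
xbq1%ab
xb%q2ab
xbq5%xb
xq6b%xb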
Computation histories
[Figure: state diagram of the Turing machine for L1, repeated from the “Representing computations” slide]
computation history: the sequence of configurations, continuing
...
xx%xxq7
xx%xx☐qa
Computation histories as strings
If M halts on w, the computation history of (M, w) is the sequence of configurations C1, ..., Cl that M goes through on input w.
The computation history can be written as a string hist over alphabet Γ ∪ Q ∪ {#} (tape symbols, states, and the separator #):
#q0ab%ab#xq1b%ab# . . . #xx%xx☐qa#
with C1 = q0ab%ab, C2 = xq1b%ab, ..., Cl = xx%xx☐qa
accepting history: M accepts w ⇔ qacc occurs in hist
rejecting history: M rejects w ⇔ qrej occurs in hist
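A minimal sketch of how such a string could be produced by simulation. The two-transition machine `delta`, the state names `qa` and `qr`, the `_` stand-in for the blank, and the choice to send undefined transitions to `qr` are all assumptions made for illustration; they are not the machine from the slides.

```python
BLANK = "_"  # stands in for the blank symbol (assumption of this sketch)

def step(config, delta):
    """One move of the machine on a configuration given as a list of tokens."""
    i = next(k for k, t in enumerate(config) if t.startswith("q"))
    state, tape = config[i], config[:i] + config[i + 1:]
    if i == len(tape):              # head sits on an implicit blank
        tape.append(BLANK)
    # Undefined transitions go to the rejecting state qr (assumption).
    new_state, write, move = delta.get((state, tape[i]), ("qr", tape[i], "R"))
    tape[i] = write
    head = i + 1 if move == "R" else max(i - 1, 0)
    return tape[:head] + [new_state] + tape[head:]

def history(w, delta, halting=("qa", "qr"), max_steps=1000):
    """Return the computation history of the machine on w as a #-separated string."""
    config = ["q0"] + list(w)
    configs = [config]
    while not any(t in halting for t in config):
        if len(configs) > max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        config = step(config, delta)
        configs.append(config)
    return "#" + "#".join("".join(c) for c in configs) + "#"

# Hypothetical toy machine: accepts inputs that start with "ab".
delta = {("q0", "a"): ("q1", "x", "R"), ("q1", "b"): ("qa", "b", "R")}
hist = history("ab", delta)
print(hist, "accepting" if "qa" in hist else "rejecting")
```

The `max_steps` bound is only there so the sketch terminates; a history exists only when the machine halts.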
Undecidable problems for CFGs
ALLCFG = { 〈G〉 : G is a CFG that generates all strings}
The language ALLCFG is undecidable.
• We will argue that: If ALLCFG can be decided, so can ATM.
Undecidable problems for CFGs
Suppose A decides ALLCFG: on input 〈G〉, A accepts if G generates all strings and rejects if not.
On input 〈M, w〉: construct G, run A on 〈G〉, and
accept if M rejects or loops on w
reject if M accepts w
This works provided that
G generates all strings if M rejects or loops on w
G fails to generate some string if M accepts w
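The shape of this reduction can be sketched as a wrapper around two hypothetical functions: `construct_G`, which builds the grammar from (M, w), and `decide_all_cfg`, the assumed decider for ALLCFG. Neither is a real API, and the point of the argument is that `decide_all_cfg` cannot exist; the stubs below only exercise the control flow.

```python
def decide_atm(M, w, construct_G, decide_all_cfg):
    """Decide whether M accepts w, given a (hypothetical) decider for ALLCFG.

    G generates all strings exactly when M does not accept w, so the
    answer for (M, w) is the negation of the answer for G.
    """
    G = construct_G(M, w)
    return not decide_all_cfg(G)

# Stub usage: the grammar is just a label here, not a real CFG.
accepts = decide_atm("M", "w",
                     construct_G=lambda M, w: "G_missing_one_string",
                     decide_all_cfg=lambda G: G == "G_all_strings")
print(accepts)
```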
Undecidable problems for CFGs
〈M, w〉 → construct G → 〈G〉
M accepts w ⇔ G fails to generate some string
The alphabet of G will be Γ ∪ Q ∪ {#}
G will generate all strings except the computation history of (M, w), if it is accepting
First we construct a PDA P, then convert it to CFG G
Undecidability via computation histories
P takes as input a candidate computation history hist of (M, w) and accepts everything except an accepting hist
P: On input hist, // try to spot a mistake in hist
If hist is not of the form #w1#w2#...#wk#, accept.
If w1 ≠ q0w or wk does not contain qa, accept.
If two consecutive blocks wi#wi+1 do not follow from the transitions of M, accept.
Otherwise, hist is an accepting history, so reject.
Example: #q0ab%ab#xq1b%ab# . . . #xx%xx☐qa# → reject
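The three checks can be mirrored by a direct program. This is only a sketch of the conditions, not of the PDA itself: it compares whole blocks deterministically, whereas P has to spot a mistake with a stack. The toy machine `delta`, the convention that a state is `q` plus one character, and the `_` blank are assumptions for illustration.

```python
BLANK = "_"  # stands in for the blank symbol (assumption of this sketch)

def tokenize(block):
    """Split a block into tokens; a state is 'q' plus the next character."""
    tokens, i = [], 0
    while i < len(block):
        width = 2 if block[i] == "q" else 1
        tokens.append(block[i:i + width])
        i += width
    return tokens

def step(config, delta):
    """Successor configuration under delta, or None if there is none."""
    heads = [k for k, t in enumerate(config) if t.startswith("q")]
    if len(heads) != 1:
        return None
    i = heads[0]
    tape = config[:i] + config[i + 1:]
    if i == len(tape):
        tape.append(BLANK)
    if (config[i], tape[i]) not in delta:
        return None
    new_state, write, move = delta[(config[i], tape[i])]
    tape[i] = write
    head = i + 1 if move == "R" else max(i - 1, 0)
    return tape[:head] + [new_state] + tape[head:]

def is_accepting_history(hist, w, delta):
    # Check 1: hist has the form #w1#w2#...#wk#
    if len(hist) < 3 or hist[0] != "#" or hist[-1] != "#":
        return False
    blocks = [tokenize(b) for b in hist[1:-1].split("#")]
    if any(not b for b in blocks):
        return False
    # Check 2: w1 = q0w and wk contains qa
    if blocks[0] != ["q0"] + list(w) or "qa" not in blocks[-1]:
        return False
    # Check 3: every block follows from the previous one
    return all(step(c, delta) == d for c, d in zip(blocks, blocks[1:]))

def P_accepts(hist, w, delta):
    """P accepts every string except an accepting history."""
    return not is_accepting_history(hist, w, delta)

# Hypothetical toy machine: accepts inputs that start with "ab".
delta = {("q0", "a"): ("q1", "x", "R"), ("q1", "b"): ("qa", "b", "R")}
print(P_accepts("#q0ab#xq1b#xbqa#", "ab", delta))  # the accepting history
print(P_accepts("#q0ab#xq1b#xaqa#", "ab", delta))  # contains a bad step
```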
Computation is local
Changes between configurations always occur around the head
[Figure: state diagram of the Turing machine for L1, repeated from the “Representing computations” slide]
#q0ab%ab
#xq1b%ab
#xbq1%ab
#xb%q2ab
#xbq5%xb
#xq6b%xb
...
#xx%xxq7
#xx%xx☐qa
Legal and illegal transition windows
Transition used in the examples: q2 to q5 on a/xL
Each window is written as top row / bottom row.
legal windows
a b x / a b x
a b a / a b q5
a q2 a / q5 a x
a a ☐ / x a ☐
illegal windows
q2 a b / a b q2
a q2 a / q5 a b
q2 q2 a / q2 q2 x
a q2 a / a q5 x
Example of two consecutive blocks: #xb%q2ab#xbq5%xb
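A small sketch that cuts two consecutive blocks into windows of width 3 and lists where the rows differ. It uses the pair #xb%q2ab / #xbq5%xb from this slide; the token lists and helper names are chosen here for illustration.

```python
def changed_positions(top, bottom):
    """Indices at which two equally long configurations differ."""
    return [i for i, (s, t) in enumerate(zip(top, bottom)) if s != t]

def windows(top, bottom, width=3):
    """All (top row, bottom row) windows of the given width."""
    return [(tuple(top[i:i + width]), tuple(bottom[i:i + width]))
            for i in range(len(top) - width + 1)]

# Consecutive blocks for the move q2 -> q5 on a/xL.
top = ["#", "x", "b", "%", "q2", "a", "b"]
bottom = ["#", "x", "b", "q5", "%", "x", "b"]

head = top.index("q2")
changed = changed_positions(top, bottom)
print("head at", head, "changes at", changed)
for w in windows(top, bottom):
    print(w)
```

Every changed position lies within one cell of the state symbol, which is why checking all windows of a fixed width is enough to validate a step.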
Implementing P
If two consecutive blocks wi#wi+1 do not follow from the transitions of M, accept:
For every position of wi:
Remember offset from # in wi on stack
Remember first row of window in state
After reaching the next #:
Pop offset from # from stack as you consume input
Remember second row of window in state
If window is illegal, accept; otherwise reject.
The computation history method
• G generates all strings except accepting computation histories of (M, w)
• We first construct a PDA P, then convert it to CFG G
ALLCFG = { 〈G〉 : G is a CFG that generates all strings}
If ALLCFG can be decided, so can ATM.
〈M, w〉 → construct G → 〈G〉
The Post Correspondence Problem
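• Input: A set of tiles like this
[Figure: example tiles, each with a top string and a bottom string; written as top / bottom, the first three are bab / cc, c / ab, a / ab, and a fourth tile is extracted as “baaa”]
• Given an infinite supply of such tiles, can you match top and bottom?
[Figure: an example arrangement of tiles placed side by side]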
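A bounded search gives a feel for the problem. The sketch below tries tile sequences breadth-first up to `max_tiles` tiles; the bound and the example instance are assumptions for illustration. Returning `None` only means no match was found within the bound, which says nothing about longer matches.

```python
from collections import deque

def find_match(tiles, max_tiles=8):
    """Breadth-first search for a match using at most max_tiles tiles.

    tiles is a list of (top, bottom) strings. Returns a list of tile
    indices whose tops and bottoms spell the same string, or None if
    no match exists within the bound.
    """
    queue = deque([((), "", "")])
    while queue:
        seq, top, bottom = queue.popleft()
        if len(seq) == max_tiles:
            continue
        for i, (t, b) in enumerate(tiles):
            new_top, new_bottom = top + t, bottom + b
            if new_top == new_bottom:
                return list(seq) + [i]
            # Keep only partial arrangements where one row extends the other.
            if new_top.startswith(new_bottom) or new_bottom.startswith(new_top):
                queue.append((seq + (i,), new_top, new_bottom))
    return None

# A small instance chosen for illustration.
tiles = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]
match = find_match(tiles)
print(match)
print("".join(tiles[i][0] for i in match), "".join(tiles[i][1] for i in match))
```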
Undecidability of PCP
PCP = { 〈T〉 : T is a collection of tiles that contains a top-bottom match}
The language PCP is undecidable.
Ambiguity of CFGs
AMB = {G: G is an ambiguous CFG}
• We will argue that
The language AMB is undecidable.
If AMB can be decided, so can PCP.
Ambiguity of CFGs
• Step 1: Number the tiles
T (collection of tiles) → G (CFG)
If T can be matched, then G is ambiguous
If T cannot be matched, then G is unambiguous
Tiles (top / bottom): 1: bab / cc, 2: c / ab, 3: a / ab
Ambiguity of CFGs
T (collection of tiles) → G (CFG)
Tiles (top / bottom): 1: bab / cc, 2: c / ab, 3: a / ab
Terminals: a, b, c, 1, 2, 3
Variables: S, T, B
Productions:
S → T | B
T → babT1    B → ccB1
T → cT2      B → abB2
T → aT3      B → abB3
Ambiguity of CFGs
T (collection of tiles) → G (CFG)
Tiles (top / bottom): 1: bab / cc, 2: c / ab, 3: a / ab
Terminals: a, b, c, 1, 2, 3
Variables: S, T, B
Productions:
S → T | B
T → babT1    B → ccB1
T → cT2      B → abB2
T → aT3      B → abB3
T → bab1     B → cc1
T → c2       B → ab2
T → a3       B → ab3
Ambiguity of CFGs
• Each sequence of tiles gives two derivations
• If the tiles match, these two derive the same string
Example: the tile sequence 1, 2, 2, i.e. bab / cc, c / ab, c / ab
S ⇒ T ⇒ babT1 ⇒ babcT21 ⇒ babcc221
S ⇒ B ⇒ ccB1 ⇒ ccabB21 ⇒ ccabab221
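The construction of G from the tiles, and the two strings derived from a tile sequence, can be sketched as follows. The function names are made up for this example, and the sketch assumes at most nine tiles (so each number is one terminal) and tile strings that avoid the letters S, T, B.

```python
def grammar_from_tiles(tiles):
    """Productions of G as (variable, right-hand side) pairs.

    tiles is a list of (top, bottom) strings; tile k gets the number k
    (starting from 1) as an extra terminal.
    """
    prods = [("S", "T"), ("S", "B")]
    for n, (top, bottom) in enumerate(tiles, start=1):
        prods += [("T", f"{top}T{n}"), ("T", f"{top}{n}"),
                  ("B", f"{bottom}B{n}"), ("B", f"{bottom}{n}")]
    return prods

def derived_strings(tiles, seq):
    """Strings derived through T and through B for a sequence of tile numbers."""
    numbers = "".join(str(n) for n in reversed(seq))
    top = "".join(tiles[n - 1][0] for n in seq)
    bottom = "".join(tiles[n - 1][1] for n in seq)
    return top + numbers, bottom + numbers

tiles = [("bab", "cc"), ("c", "ab"), ("a", "ab")]
for lhs, rhs in grammar_from_tiles(tiles):
    print(lhs, "→", rhs)
print(derived_strings(tiles, [1, 2, 2]))
```

The two strings coincide exactly when the sequence is a match, and in that case the common string has two different parse trees.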
Ambiguity of CFGs
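• If G is ambiguous, then the ambiguity must look like this: some string has two derivations, one through T and one through B
S ⇒ T ⇒ a1 T n1 ⇒ a1 a2 T n2 n1 ⇒ … ⇒ a1 a2 … ai ni … n2 n1
S ⇒ B ⇒ b1 B m1 ⇒ b1 b2 B m2 m1 ⇒ … ⇒ b1 b2 … bj mj … m2 m1
Then n1...ni = m1…mj
So there is a match: tiles n1, n2, …, ni, with tops a1 a2 … ai and bottoms b1 b2 … bi
T (collection of tiles) → G (CFG)
If T can be matched, then G is ambiguous ✓
If T cannot be matched, then G is unambiguous ✓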
Undecidability of PCP (optional)
Undecidability of PCP
• We show that
PCP = { 〈T〉 : T is a collection of tiles that contains a top-bottom match}
The language PCP is undecidable.
If PCP can be decided, so can ATM.
Undecidability of PCP
• Idea: Matches represent accepting histories
〈M, w〉 → T (collection of tiles)
M accepts w ⇔ T contains a match
Example tiles (top / bottom):
starting tile: #q0ab%ab
#q0a / #xq1
xq1% / x%q2
a / a, b / b, % / %, # / #
…
In a match, top and bottom both spell the accepting history:
#q0ab%ab#xq1b%ab#...#xx%xx☐qa#
An assumption
• We will assume that one of the PCP tiles is marked as a starting tile
• Later we’ll see how to “simulate” the starting tile by an ordinary tile
[Figure: the example tiles, with one tile marked s as the starting tile]
Undecidability of PCP
• On input 〈 M, w 〉 we construct these tiles for PCP
〈M, w〉 → T (collection of tiles)
M accepts w ⇔ T contains a match
Tiles (top / bottom):
#q0w, marked s as the starting tile
x1qix2 / x3x4x5 for each valid window with state qi in top middle
x / x for all x in Γ ∪ {#}
xqa / qa
qax / qa
qa## / #
☐#☐#, ☐#qix1, ☐#x2x3
Undecidability of PCP
tile type: purpose
#q0w (starting tile s): represents initial configuration
x / x and x1qix2 / x3x4x5: represent valid transitions between configurations
☐#☐#, ☐#qix1, ☐#x2x3: add blank spaces before # if necessary
xqa / qa, qax / qa, qa## / #: complete match if computation accepts
Undecidability of PCP
accepting computation history
#q0ab%ab#xq1b%ab#...#xx%xxq7☐#xx%xx☐qa#
tiles: #q0w (starting tile s), x / x, x1qix2 / x3x4x5, ☐#☐#, ☐#qix1, ☐#x2x3
top:    #q0ab%ab#xq1b%ab#...#xx%xxq7☐#
bottom: #q0ab%ab#xq1b%ab#...#xx%xxq7☐#xx%xx☐qa#
Undecidability of PCP
• Once the accepting state symbol occurs, the last two tiles can “eat up” the rest of the symbols
tiles (top / bottom): x / x, xqa / qa, qax / qa, qa## / #
top and bottom both end as
#xx%xx☐qa#xx%xxqa#xx%xqa#...#qa##
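A sketch of this final phase: starting from a configuration that contains qa, each round removes one symbol next to qa until only qa is left, and the tile qa## / # closes the match. The helper names and the `_` stand-in for the blank are assumptions of this example; the sketch prefers the left neighbour, as in the string above.

```python
def eat(config, accept="qa"):
    """Configurations produced while qa removes its neighbours one at a time."""
    cur = list(config)
    configs = [list(cur)]
    while len(cur) > 1:
        i = cur.index(accept)
        if i > 0:
            del cur[i - 1]      # tile  x qa / qa  removes the symbol on the left
        else:
            del cur[i + 1]      # tile  qa x / qa  removes the symbol on the right
        configs.append(list(cur))
    return configs

def completed_tail(config):
    """The tail of the match, from the first configuration with qa to the end."""
    return "#" + "#".join("".join(c) for c in eat(config)) + "##"

tail = completed_tail(list("xx%xx_") + ["qa"])
print(tail)
```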
Undecidability of PCP
• If M rejects on input w, then qrej appears on bottom at some point, but it cannot be matched on top
• If M loops on w, then matching keeps going forever
Tiles (top / bottom):
#q0w, marked s as the starting tile
a1qia3 / b1b2b3 for each valid window of this form
x / x for all x in Γ ∪ {#}
xqa / qa
qax / qa
qa## / #
☐#☐#, ☐#qia2, ☐#b1b2
Getting rid of the starting tile
• We assumed that one tile is marked as the starting tile
• We can remove this assumption by changing the tiles a bit
[Figure: the example tiles before and after inserting * between their symbols]
“starting tile” begins with *
“middle tiles”
“final tile”
Getting rid of the starting tile
[Figure: the modified tiles next to the original ones]
The tile that begins with * can only be used as the starting tile
The final tile can only be used to complete the match
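One standard way to carry out this change is sketched below. It may differ in detail from the tiles drawn on the slide: here every top gets a * before each symbol, every bottom gets a * after each symbol, the starting tile's bottom also begins with *, and a final tile with the assumed end marker `$` closes the match. The function names, the marker `$`, and the example instance are assumptions for illustration; `*` and `$` must not occur in the original tiles.

```python
def star_before(u):
    """Insert * before every symbol of u."""
    return "".join("*" + c for c in u)

def star_after(u):
    """Insert * after every symbol of u."""
    return "".join(c + "*" for c in u)

def remove_start_tile(tiles, end="$"):
    """Turn an instance whose first tile must start the match into plain PCP."""
    t1, b1 = tiles[0]
    new = [(star_before(t1), "*" + star_after(b1))]             # starting tile
    new += [(star_before(t), star_after(b)) for t, b in tiles]  # middle tiles
    new.append(("*" + end, end))                                # final tile
    return new

# Instance with a match that starts with tile 0: sequence 0, 1, 2, 0, 3.
tiles = [("a", "ab"), ("b", "ca"), ("ca", "a"), ("abc", "c")]
new = remove_start_tile(tiles)

# Same sequence in the new instance: starting tile, shifted middle tiles, final tile.
seq = [0, 2, 3, 1, 4, 5]
top = "".join(new[i][0] for i in seq)
bottom = "".join(new[i][1] for i in seq)
print(top)
print(bottom)
```

In the new instance only the first tile has a top and a bottom that begin with the same symbol, so any match is forced to start with it.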