CSCI 3130: Formal languages and automata theory. Andrej Bogdanov, http://www.cse.cuhk.edu.hk/~andrejb/csc3130. The Chinese University of Hong Kong. Undecidable problems for CFGs. Fall 2011.


Page 1: CSCI 3130: Formal languages and automata theory

CSCI 3130: Formal languages and automata theory

Andrej Bogdanov

http://www.cse.cuhk.edu.hk/~andrejb/csc3130

The Chinese University of Hong Kong

Undecidable problems for CFGs

Fall 2011

Page 2: CSCI 3130: Formal languages and automata theory

Decidable vs. undecidable

decidable:

“DFA M accepts w”

“DFAs M and M’ accept same inputs”

“CFG G generates w”

undecidable:

“TM M accepts w”

“TM M halts on w”

“TM M accepts some input”

“TM M and M’ accept same inputs”

?

“CFG G generates all inputs”

“CFG G is ambiguous”

more?

Page 3: CSCI 3130: Formal languages and automata theory

Representing computations

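[Figure: state diagram of a Turing machine for L1, with states q0 to q7 and accepting state qa; transition labels include a/xR, b/xR, a/xL, b/xL, %/%R, x/xR, ☐/☐R, and the loops a/aR, b/bR, a/aL, b/bL, x/xL]

L1 = {w%w: w ∈ {a, b}*}

Sample computation (tape contents, then current state):

abbaa%abbaa   q0

xbbaa%abbaa   q1

...

xbbaa%abbaa   q1

xbbaa%abbaa   q2

xbbaa%xbbaa   q5

...

xxxxx%xxxxx   qa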

Page 4: CSCI 3130: Formal languages and automata theory

Configurations

• A configuration consists of the current state, the head position, and tape contents

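Example: with tape contents a b a ☐ … and the head on the third cell in state q1, the configuration is written abq1a. The state symbol is placed immediately before the cell under the head.

After the transition a/bR from q1 to qacc, the tape contents are a b b ☐ …, the head is on the fourth cell, and the configuration is abbqacc.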
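The encoding of a configuration as a string can be sketched in a few lines of Python. This is only an illustration: the function name `encode_configuration` and the 0-based `head` index are assumptions, not part of the slides.

```python
def encode_configuration(tape, head, state):
    """Write the state symbol immediately before the cell under the head.

    tape  -- the non-blank tape contents as a string (assumed representation)
    head  -- 0-based index of the scanned cell (may equal len(tape))
    state -- name of the current state, e.g. "q1"
    """
    return tape[:head] + state + tape[head:]


# The two configurations from the slide example.
print(encode_configuration("aba", 2, "q1"))    # head on the third cell
print(encode_configuration("abb", 3, "qacc"))  # head on the blank after abb
```

When the head sits on the first blank after the written part of the tape, the state symbol simply ends up at the end of the string.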

Page 5: CSCI 3130: Formal languages and automata theory

Computation histories

[Figure: state diagram of the Turing machine for L1, as on Page 3]

computation history on input ab%ab:

q0ab%ab

xq1b%ab

xbq1%ab

xb%q2ab

xbq5%xb

xq6b%xb

...

xx%xxq7

xx%xx☐qa

Page 6: CSCI 3130: Formal languages and automata theory

Computation histories as strings

If M halts on w, the computation history of (M, w) is the sequence of configurations C1, ..., Cl that M goes through on input w.

q0ab%ab

xq1b%ab

...

xx%xxq7

xx%xx☐qa

The computation history can be written as a string hist over the alphabet Γ ∪ Q ∪ {#}:

#q0ab%ab#xq1b%ab# . . . #xx%xx☐qa#

(the blocks between consecutive # symbols are C1, C2, ..., Cl)

accepting history: M accepts w ⇔ qacc occurs in hist

rejecting history: M rejects w ⇔ qrej occurs in hist
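As a small illustration of writing a history as a string, the sketch below joins configurations with # separators. Configurations are represented as lists of symbols so that multi-character state names such as "qa" stay unambiguous; this representation and the function names are assumptions made for the example. The three configurations are C1, C2 and Cl from the slide, with the middle of the history omitted.

```python
def history(configs):
    """Return the symbol list # C1 # C2 # ... # Cl #."""
    hist = ["#"]
    for config in configs:
        hist.extend(config)
        hist.append("#")
    return hist


def is_accepting(hist, accept_state="qa"):
    """A history is accepting when the accepting state symbol occurs in it."""
    return accept_state in hist


# Abridged example: only C1, C2 and the last configuration Cl are listed.
configs = [
    ["q0", "a", "b", "%", "a", "b"],
    ["x", "q1", "b", "%", "a", "b"],
    ["x", "x", "%", "x", "x", "☐", "qa"],
]
hist = history(configs)
print("".join(hist))
print(is_accepting(hist))
```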

Page 7: CSCI 3130: Formal languages and automata theory

Undecidable problems for CFGs

ALLCFG = { 〈G〉 : G is a CFG that generates all strings }

The language ALLCFG is undecidable.

• We will argue that: if ALLCFG can be decided, so can ATM.

Page 8: CSCI 3130: Formal languages and automata theory

Undecidable problems for CFGs

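Suppose a decider A for ALLCFG exists: on input 〈G〉, A accepts if G generates all strings and rejects if not.

Then on input 〈M, w〉 we construct G and run A on 〈G〉. The outcome is: accept if M rejects or loops on w; reject if M accepts w.

For this to work, the construction must guarantee:

G generates all strings if M rejects or loops on w

G fails to generate some string if M accepts w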

Page 9: CSCI 3130: Formal languages and automata theory

Undecidable problems for CFGs

〈M, w〉 → construct G → 〈G〉

M accepts w ⇔ G fails to generate some string

The alphabet of G will be Γ ∪ Q ∪ {#}

G will generate all strings except the computation history of (M, w), if it is accepting

First we construct a PDA P, then convert it to CFG G

Page 10: CSCI 3130: Formal languages and automata theory

Undecidability via computation histories

P takes a candidate computation history hist of (M, w) and accepts everything except an accepting history. For example, P rejects

#q0ab%ab#xq1b%ab# . . . #xx%xx☐qa#

P: On input hist, // try to spot a mistake in hist

• If hist is not of the form #w1#w2#...#wk#, accept.

• If w1 ≠ q0w or wk does not contain qa, accept.

• If two consecutive blocks wi#wi+1 do not follow from the transitions of M, accept.

Otherwise, hist is an accepting history, so reject.
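The three checks performed by P can be sketched as an ordinary Python function. This is not a PDA: it uses random access to compare whole blocks, and it only illustrates which strings P is meant to accept. The toy machine (`DELTA`, `STATES`), the list-of-symbols representation and all function names are assumptions made for the example.

```python
BLANK = "☐"

# Hypothetical toy machine: replace a single a by x, then accept on the blank.
STATES = {"q0", "q1", "qa"}
DELTA = {("q0", "a"): ("q1", "x", "R"), ("q1", BLANK): ("qa", BLANK, "R")}


def step(config, delta, states):
    """Successor of a configuration (list of symbols), or None if M is stuck."""
    i = next(k for k, s in enumerate(config) if s in states)
    state, tape = config[i], config[:i] + config[i + 1:]
    if i == len(tape):                 # head is past the written part
        tape = tape + [BLANK]
    if (state, tape[i]) not in delta:
        return None
    new_state, write, move = delta[(state, tape[i])]
    tape[i] = write
    head = i + 1 if move == "R" else max(i - 1, 0)
    return tape[:head] + [new_state] + tape[head:]


def split_blocks(hist):
    """Return [w1, ..., wk] if hist has the form #w1#...#wk#, else None."""
    if len(hist) < 2 or hist[0] != "#" or hist[-1] != "#":
        return None
    blocks, current = [], []
    for sym in hist[1:]:
        if sym == "#":
            blocks.append(current)
            current = []
        else:
            current.append(sym)
    return blocks


def p_accepts(hist, delta, states, w, start="q0", accept="qa"):
    """True unless hist is the accepting computation history of (M, w)."""
    blocks = split_blocks(hist)
    if blocks is None or any(sum(s in states for s in b) != 1 for b in blocks):
        return True                    # wrong form
    if blocks[0] != [start] + list(w) or accept not in blocks[-1]:
        return True                    # wrong first or last block
    for cur, nxt in zip(blocks, blocks[1:]):
        if step(cur, delta, states) != nxt:
            return True                # some step does not follow from M
    return False


good = ["#", "q0", "a", "#", "x", "q1", "#", "x", BLANK, "qa", "#"]
bad = ["#", "q0", "a", "#", "a", "q1", "#", "a", BLANK, "qa", "#"]
print(p_accepts(good, DELTA, STATES, "a"), p_accepts(bad, DELTA, STATES, "a"))
```

Only the genuine accepting history is rejected; any string with a detectable mistake is accepted, which is the behaviour the reduction needs.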

Page 11: CSCI 3130: Formal languages and automata theory

Computation is local

#q0ab%ab

#xq1b%ab

#xbq1%ab

#xb%q2ab

#xbq5%xb

#xq6b%xb

...

#xx%xxq7

#xx%xx☐qa

Changes between configurations always occur around the head

[Figure: state diagram of the Turing machine for L1, as on Page 3]
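The locality claim can be checked mechanically on the configurations listed on this slide. The sketch below compares consecutive configurations position by position and verifies that every change lies within one cell of the state symbol. The list-of-symbols representation and the names `changed_positions` and `is_local` are assumptions made for the example.

```python
from itertools import zip_longest

STATES = {"q0", "q1", "q2", "q3", "q4", "q5", "q6", "q7", "qa"}


def changed_positions(before, after):
    """Indices where two configurations (lists of symbols) differ."""
    return [i for i, (s, t) in enumerate(zip_longest(before, after)) if s != t]


def is_local(before, after, states=STATES):
    """True if every change lies within one cell of the state symbol in `before`."""
    head = next(i for i, s in enumerate(before) if s in states)
    return all(abs(i - head) <= 1 for i in changed_positions(before, after))


# The first six configurations of the history shown on the slide.
HISTORY = [
    ["q0", "a", "b", "%", "a", "b"],
    ["x", "q1", "b", "%", "a", "b"],
    ["x", "b", "q1", "%", "a", "b"],
    ["x", "b", "%", "q2", "a", "b"],
    ["x", "b", "q5", "%", "x", "b"],
    ["x", "q6", "b", "%", "x", "b"],
]

for before, after in zip(HISTORY, HISTORY[1:]):
    print("".join(before), "->", "".join(after), changed_positions(before, after))
```

Because at most three adjacent positions change in each step, comparing 3-symbol windows at matching offsets is enough to detect an invalid step.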

Page 12: CSCI 3130: Formal languages and automata theory

Legal and illegal transition windows

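A window consists of three consecutive symbols of one configuration (top row) together with the symbols at the same positions in the next configuration (bottom row).

[Figure: example windows for the transition from q2 to q5 on a/xL, grouped on the slide under the labels "legal windows" and "illegal windows". Windows shown, written top row / bottom row:]

abx / abx

aba / abq5

aq2a / q5ax

q2ab / abq2

aa☐ / xa☐

aq2a / q5ab

q2q2a / q2q2x

aq2a / aq5x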

Page 13: CSCI 3130: Formal languages and automata theory

Implementing P

If two consecutive blocks wi#wi+1 do not follow from the transitions of M, accept:

#xb%q2ab#xbq5%xb

For every position of wi:

• Remember offset from # in wi on stack

• Remember first row of window in state

After reaching the next #:

• Pop offset from # from stack as you consume input

• Remember second row of window in state

• If window is illegal, accept; otherwise reject.

Page 14: CSCI 3130: Formal languages and automata theory

The computation history method

ALLCFG = { 〈G〉 : G is a CFG that generates all strings }

If ALLCFG can be decided, so can ATM.

〈M, w〉 → construct G → 〈G〉

• G generates all strings except accepting computation histories of (M, w)

• We first construct a PDA P, then convert it to CFG G

Page 15: CSCI 3130: Formal languages and automata theory

The Post Correspondence Problem

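• Input: A set of tiles, each with a string on top and a string on the bottom

[Figure: example tiles]

• Given an infinite supply of such tiles, can you match top and bottom?

[Figure: a sequence of tiles laid side by side, with the string along the top compared to the string along the bottom]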

Page 16: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

PCP = { 〈T〉 : T is a collection of tiles that contains a top-bottom match }

The language PCP is undecidable.
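To make "top-bottom match" concrete, here is a minimal sketch of a bounded search for a match. It tries tile sequences of up to `max_tiles` tiles, so it can confirm that a match exists but can never establish that none exists in general, which is consistent with PCP being undecidable. The function name, the pair representation (top, bottom), and the first tile set are assumptions chosen for illustration; the second tile set is tiles 1, 2, 3 from the ambiguity slides later in the deck.

```python
from collections import deque


def find_match(tiles, max_tiles):
    """Breadth-first search over tile sequences of length <= max_tiles.

    tiles is a list of (top, bottom) string pairs. Returns the 1-based tile
    indices of a shortest match, or None if no match uses <= max_tiles tiles.
    """
    queue = deque([((), "", "")])
    while queue:
        seq, top, bottom = queue.popleft()
        if seq and top == bottom:
            return list(seq)
        if len(seq) == max_tiles:
            continue
        for i, (t, b) in enumerate(tiles, start=1):
            new_top, new_bottom = top + t, bottom + b
            # Keep only partial sequences where one row is a prefix of the other.
            if new_top.startswith(new_bottom) or new_bottom.startswith(new_top):
                queue.append((seq + (i,), new_top, new_bottom))
    return None


# Hypothetical tile set (not from the slides) that does have a match.
EXAMPLE_TILES = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]
# Tiles 1, 2, 3 from the ambiguity slides, written (top, bottom).
SLIDE_TILES = [("bab", "cc"), ("c", "ab"), ("a", "ab")]

print(find_match(EXAMPLE_TILES, 5))
print(find_match(SLIDE_TILES, 8))
```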

Page 17: CSCI 3130: Formal languages and automata theory

Ambiguity of CFGs

AMB = { 〈G〉 : G is an ambiguous CFG }

The language AMB is undecidable.

• We will argue that: if AMB can be decided, so can PCP.

Page 18: CSCI 3130: Formal languages and automata theory

Ambiguity of CFGs

T (collection of tiles) → G (CFG)

If T can be matched, then G is ambiguous

If T cannot be matched, then G is unambiguous

• Step 1: Number the tiles

Tiles (top / bottom): 1: bab / cc, 2: c / ab, 3: a / ab

Page 19: CSCI 3130: Formal languages and automata theory

Ambiguity of CFGs

T (collection of tiles) → G (CFG)

Tiles (top / bottom): 1: bab / cc, 2: c / ab, 3: a / ab

Terminals: a, b, c, 1, 2, 3

Variables: S, T, B

Productions:

S → T | B

T → babT1     B → ccB1

T → cT2       B → abB2

T → aT3       B → abB3

Page 20: CSCI 3130: Formal languages and automata theory

Ambiguity of CFGs

T (collection of tiles) → G (CFG)

Tiles (top / bottom): 1: bab / cc, 2: c / ab, 3: a / ab

Terminals: a, b, c, 1, 2, 3

Variables: S, T, B

Productions:

S → T | B

T → babT1     B → ccB1

T → cT2       B → abB2

T → aT3       B → abB3

T → bab1      B → cc1

T → c2        B → ab2

T → a3        B → ab3

Page 21: CSCI 3130: Formal languages and automata theory

Ambiguity of CFGs

• Each sequence of tiles gives two derivations

Example: the tile sequence 1, 2, 2 (bab / cc, c / ab, c / ab) gives

S ⇒ T ⇒ babT1 ⇒ babcT21 ⇒ babcc221

S ⇒ B ⇒ ccB1 ⇒ ccabB21 ⇒ ccabab221

• If the tiles match, these two derive the same string
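The construction of G from the tiles, and the two strings derived from a tile sequence, can be sketched as follows. The function names and the pair representation (top, bottom) are assumptions made for the example, and tile indices are written as single characters, so the sketch assumes fewer than ten tiles.

```python
def grammar_from_tiles(tiles):
    """Productions of G as (variable, right-hand side) pairs."""
    productions = [("S", "T"), ("S", "B")]
    for i, (top, bottom) in enumerate(tiles, start=1):
        productions += [
            ("T", f"{top}T{i}"), ("T", f"{top}{i}"),
            ("B", f"{bottom}B{i}"), ("B", f"{bottom}{i}"),
        ]
    return productions


def derived_strings(tiles, seq):
    """Strings derived via T and via B for a sequence of 1-based tile indices."""
    # The tile numbers come out in reverse order, as in babcc221.
    suffix = "".join(str(i) for i in reversed(seq))
    top = "".join(tiles[i - 1][0] for i in seq) + suffix
    bottom = "".join(tiles[i - 1][1] for i in seq) + suffix
    return top, bottom


TILES = [("bab", "cc"), ("c", "ab"), ("a", "ab")]  # tiles 1, 2, 3 from the slides

for variable, rhs in grammar_from_tiles(TILES):
    print(variable, "→", rhs)
print(derived_strings(TILES, [1, 2, 2]))
```

The two derived strings are equal exactly when the tops and bottoms of the chosen tiles spell the same word, which is when the sequence is a match.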

Page 22: CSCI 3130: Formal languages and automata theory

Ambiguity of CFGs

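T (collection of tiles) → G (CFG)

If T can be matched, then G is ambiguous

If T cannot be matched, then G is unambiguous

• If G is ambiguous then ambiguity must look like this: the same string has one derivation through T and one through B

S ⇒ T ⇒ a1 T n1 ⇒ a1 a2 T n2 n1 ⇒ ... ⇒ a1 a2 ... ai ni ... n2 n1

S ⇒ B ⇒ b1 B m1 ⇒ b1 b2 B m2 m1 ⇒ ... ⇒ b1 b2 ... bj mj ... m2 m1

Then n1...ni = m1…mj, since the tile numbers at the end of the string must agree. So both derivations use the same tiles n1, n2, ..., ni, and a1 a2 ... ai = b1 b2 ... bi.

So there is a match: tiles n1, n2, ..., ni have tops a1, a2, ..., ai and bottoms b1, b2, ..., bi.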

Page 23: CSCI 3130: Formal languages and automata theory

Undecidability of PCP (optional)

Page 24: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

PCP = { 〈T〉 : T is a collection of tiles that contains a top-bottom match }

The language PCP is undecidable.

• We show that: if PCP can be decided, so can ATM.

Page 25: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

• Idea: Matches represent accepting histories

〈M, w〉 → T (collection of tiles)

M accepts w ⇔ T contains a match

[Figure: sample tiles built from the symbols of the computation history; in a match, the top and the bottom both spell #q0ab%ab#xq1b%ab#...#xx%xx☐qa#]

Page 26: CSCI 3130: Formal languages and automata theory

An assumption

• We will assume that one of the PCP tiles is marked as a starting tile

• Later we’ll see how to “simulate” the starting tile by an ordinary tile

[Figure: example tiles, one of them marked s as the starting tile]

Page 27: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

• On input 〈 M, w 〉 we construct these tiles for PCP

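〈M, w〉 → T (collection of tiles)

M accepts w ⇔ T contains a match

Tiles (written top / bottom where the split is clear):

• starting tile s: #q0w

• x1qix2 / x3x4x5, for each valid window with state qi in top middle

• x / x, for all x in Γ ∪ {#}

• xqa / qa and qax / qa

• qa## / #

• ☐#☐#, ☐#qix1, ☐#x2x3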

Page 28: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

tile type: purpose

#q0w (starting tile s): represents initial configuration

x / x, x1qix2 / x3x4x5, ☐#qix1, ☐#x2x3: represent valid transitions between configurations

☐#☐#: add blank spaces before # if necessary

xqa / qa, qax / qa, qa## / #: complete match if computation accepts

Page 29: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

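accepting computation history:

#q0ab%ab#xq1b%ab#...#xx%xxq7☐#xx%xx☐qa#

[Figure: a partial match built from the starting tile s, the x / x tiles, the window tiles and the ☐# tiles; the bottom row is one configuration ahead of the top row]

top: #q0ab%ab#xq1b%ab#...#xx%xxq7☐#

bottom: #q0ab%ab#xq1b%ab#...#xx%xxq7☐#xx%xx☐qa#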

Page 30: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

• Once the accepting state symbol occurs, the last two tiles can “eat up” the rest of the symbols

Tiles used: x / x, xqa / qa, qax / qa, qa## / #

top: #xx%xx☐qa#xx%xxqa#xx%xqa#...#qa##

bottom: #xx%xx☐qa#xx%xxqa#xx%xqa#...#qa##

Page 31: CSCI 3130: Formal languages and automata theory

Undecidability of PCP

• If M rejects on input w, then qrej appears on bottom at some point, but it cannot be matched on top

• If M loops on w, then matching keeps going forever

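Tiles (written top / bottom where the split is clear):

• starting tile s: #q0w

• a1qia3 / b1b2b3, for each valid window of this form

• x / x, for all x in Γ ∪ {#}

• xqa / qa and qax / qa

• qa## / #

• ☐#☐#, ☐#qia2, ☐#b1b2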

Page 32: CSCI 3130: Formal languages and automata theory

Getting rid of the starting tile

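• We assumed that one tile is marked as the starting tile

• We can remove this assumption by changing the tiles a bit

[Figure: the original tiles, one marked s, and the modified tiles with the symbol * inserted between letters; the “starting tile” begins with *, the other modified tiles are “middle tiles”, and an extra “final tile” is added]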

Page 33: CSCI 3130: Formal languages and automata theory

Getting rid of the starting tile

[Figure: the original tiles and the modified tiles with * inserted between letters; the tile that begins with * can only be used as the starting tile, and the final tile can only be used to complete the match]