
Appendix D. Property of DCFL


Appendix D. Properties of Deterministic Context-free Languages

1. A CFL that cannot be recognized by any DPDA

2. Closure property of DCFL’s under complementation

3. Making a DPDA read the input up to the end of the input


Properties of DCFL's

1. A CFL that cannot be recognized by a DPDA

Let L_CFL and L_DCFL be, respectively, the classes of CFL’s and DCFL’s. The theorem below shows that L_DCFL ⊊ L_CFL. In other words, it says that there is a CFL that can be recognized by an NPDA but not by any DPDA. (Recall that, in contrast, every language recognized by an NFA can also be recognized by a DFA.) This theorem can be proved in two ways, both of which are interesting.

Theorem 1. There is a CFL that cannot be recognized by a DPDA.

Proof (non-constructive). The complement of every DCFL is also a DCFL. (We will show this in the proof of Theorem 2 below.) In Chapter 9, we showed a CFL whose complement is not a CFL. If this CFL were a DCFL, its complement would be a DCFL and hence a CFL, which is a contradiction. Therefore this CFL cannot be recognized by any DPDA.


We need the following lemma for the constructive proof of Theorem 1. This lemma, which simplifies the PDA model, will also be used for the proof of Theorem 2.

Lemma 1 (Normal form of PDA). Every CFL can be recognized by a PDA which satisfies the following conditions.

(1) The PDA never empties the stack (i.e., it does not pop Z0),

(2) when it pushes, the machine pushes exactly one symbol, and

(3) it never changes the stack-top symbol.

Proof. Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA, where p, q ∈ Q, A, B ∈ Γ and a ∈ Σ ∪ {ε}. Notice that conditions (2) and (3) do not allow a pushing move like δ(p, a, A) = (q, BC), where the original stack-top A is changed to C. This normal form applies to all PDA’s, either deterministic or nondeterministic. In the rumination section at the end of Chapter 4, we showed that condition (2) does not affect the language recognized by a PDA. Here we show that the lemma is true for conditions (1) and (3).

Normal Form of PDA


Suppose that a PDA M does not satisfy condition (1) and has a move which pops the bottom-of-stack symbol Z0, as shown in figure (a) below. Since the PDA cannot make any move with the stack empty, we can simply let it push a new stack symbol, say X0, on top of Z0 instead of popping Z0, as shown in figure (b). No move is defined for X0, so the modified PDA M’ stops exactly where M does and recognizes the same language.
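The conversion for condition (1) can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the dictionary encoding of the transition function, the function name avoid_popping_bottom and the two-move fragment are hypothetical and do not come from the text.

```python
def avoid_popping_bottom(delta, bottom="Z0", fresh="X0"):
    """Return a copy of delta in which no move pops the bottom symbol.

    delta maps (state, input_symbol, stack_top) to a list of
    (next_state, replacement) pairs. replacement is a tuple of stack
    symbols written in place of stack_top, topmost symbol first, so an
    empty tuple is a pop. This encoding is an assumption of the sketch.
    """
    converted = {}
    for (state, symbol, top), moves in delta.items():
        new_moves = []
        for next_state, replacement in moves:
            if top == bottom and replacement == ():
                # Instead of popping Z0, push the fresh symbol on top of it.
                replacement = (fresh, bottom)
            new_moves.append((next_state, replacement))
        converted[(state, symbol, top)] = new_moves
    return converted


# Hypothetical fragment: the first move pops Z0, the second one does not.
delta = {
    ("p", "a", "Z0"): [("q", ())],
    ("p", "b", "Z0"): [("p", ("A", "Z0"))],
}
converted = avoid_popping_bottom(delta)
print(converted[("p", "a", "Z0")])
```

Since no move is added for the fresh symbol, the converted machine is stuck after the rewritten move, just as the original machine is stuck on an empty stack.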

[Figure: (a) PDA M, with moves (·, Z0/ε) that pop Z0; (b) PDA M', in which these moves are replaced by (·, Z0/X0Z0).]

Now, suppose that PDA M satisfies conditions (1) and (2), but not condition (3). We convert M to M’ such that M’ keeps the stack-top symbol of M in its finite state control and simulates M, as illustrated in the following figure. Notice that when the stack of M holds only Z0, M’ keeps a copy of Z0 in its finite state control. PDA M’ never rewrites its stack top and recognizes the same language. (By keeping the stack top in the finite state control, we are increasing the number of states of the PDA.)

[Figure: (a) stacks of M: Z0, Z0A, ..BA, ..BC, ..BDA; (b) the corresponding stacks of M': Z0, Z0, ..B, ..B, ..BD, with the stack-top symbols Z0, A, A, C, A of M kept in the finite state control.]

Proof of Theorem 1

Proof of Theorem 1 (constructive). Now, we will show that no DPDA recognizes the palindrome language L = { ww^R | w ∈ {a, b}+ }. (This language is context-free, because we can easily construct a CFG, or an NPDA as shown in Section 5.2.)
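To see why nondeterminism makes L easy to recognize, the following Python sketch imitates an NPDA by trying every possible guess of the midpoint. The function name npda_accepts and the brute-force loop over guesses are assumptions made for illustration; they are not the NPDA of Section 5.2.

```python
def npda_accepts(s):
    """Accept s iff s = w w^R for some nonempty w over {a, b}."""
    if any(ch not in "ab" for ch in s):
        return False
    # Each value of mid plays the role of one nondeterministic guess of
    # where w ends and w^R begins.
    for mid in range(1, len(s)):
        stack = list(s[:mid])      # push phase: push the symbols of w
        matched = True
        for ch in s[mid:]:         # pop phase: match against w^R
            if not stack or stack.pop() != ch:
                matched = False
                break
        if matched and not stack:  # all pushed symbols were matched
            return True
    return False


for word in ["abba", "aa", "abab", "aba", "a", ""]:
    print(repr(word), npda_accepts(word))
```

A DPDA has no way to try several midpoints: it must commit to a single computation, which is what the proof below exploits.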

To the contrary, suppose that there is a DPDA M which recognizes L. By Lemma 1, we may assume that M is in the normal form. Let qx and Ax be, respectively, the state of M and the stack-top symbol when M has read the input up to some prefix x of the input string. M may read an additional input segment z before it pops Ax (see the figure below). We shall denote this pair by [qx, Ax].

[Figure: M in state qx with Ax on top of the stack after reading the prefix x; while M reads the following segment z, Ax stays on the stack.]

Since M is a DPDA, for a given string x, there exists a unique pair [qx, Ax]. For the proof of the theorem, we will first show that if M recognizes the palindrome language L, there are two different strings x and y for which [qx, Ax] = [qy, Ay]. For such strings x and y, we can easily find a string z such that xz ∈ L and yz ∉ L.

Let’s examine what will happen for M with the input strings xz and yz. When the machine has read x or y, it is in the same state (i.e., qx = qy) with the same stack-top (i.e., Ax = Ay), and it never pops this symbol while reading the remaining part z of the input. It follows that M must either accept both xz and yz or reject both. This is a contradiction, because M recognizes L, which contains xz but not yz.


For an arbitrary input string u ∈ {a, b}+, let γu be the content of the stack when M has read up to the last symbol of u (figure (a)). Let v ∈ {a, b}* be a string such that, given uv as input, the machine has reduced γu to its minimum (γuv in figure (b)) by the time it reads the last symbol of uv.

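[Figure: (a) the stack content γu, on top of Z0, after M has read u; (b) the shorter stack content γuv and the state quv after M has read uv.]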

In other words, v is a string which, appended to u, reduces the stack height to its minimum |γuv|, where γuv is the stack content after M has read uv. Thus, once M has processed uv, no further string v’ appended to the input can make M pop any symbol of γuv. Notice that, depending on u, there can be more than one v that minimizes |γuv|. As a special case, it is also possible to have v = ε, i.e., γu = γuv. Clearly, for every string u, there exists such a string v.


Let [quv, Auv] be the pair of the state and the stack-top symbol (of the stack content γuv) when M reads the last symbol of the input string uv, as figure (b) above illustrates, and define the following sets S and T.

S = { [quv, Auv] | u ∈ {a, b}+, and v ∈ {a, b}* gives the shortest |γuv| }

T = { uv | u ∈ {a, b}+, and v ∈ {a, b}* gives the shortest |γuv| }

Since the number of states of M and the size of the stack alphabet are finite, so is the set S. However, T is infinite, because it contains a string uv for every string u ∈ {a, b}+. Clearly, for every string x ∈ T there exists a pair [qx, Ax] ∈ S. Since infinitely many strings share finitely many pairs, there must be two distinct strings x, y ∈ T with [qx, Ax] = [qy, Ay].


Now, with the two strings x, y ∈ T for which [qx, Ax] = [qy, Ay], we find a string z such that xz ∈ L and yz ∉ L as follows.

(1) If |x| = |y|, we let z = x^R. Then clearly, xz = xx^R ∈ L and yz = yx^R ∉ L.

(2) If |x| ≠ |y|, we construct z as follows. Suppose that |x| < |y|. (The same logic applies when it is assumed the other way.) Let y1 be the prefix of y such that |y1| = |x|, and let y = y1y2. Find a string w such that |w| = |y2| and w ≠ y2, and construct the string z = ww^Rx^R. Clearly, xz = xww^Rx^R ∈ L and yz = y1y2ww^Rx^R ∉ L. (Notice that because of the three conditions |y1| = |x|, |w| = |y2| and w ≠ y2, the string yz does not have the palindrome property of L.)

Now, let’s examine what will happen with the DPDA M for the two input strings xz and yz. We know that [qx, Ax] = [qy, Ay], which implies that M must either accept both xz and yz, or neither, because the DPDA computes on the same input z starting from the same state qx (= qy) and the same stack top Ax (= Ay), which it never pops. This contradicts the supposition that M recognizes L. It follows that L is a CFL that is not recognizable by any DPDA.
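The two cases of the construction of z can be checked mechanically. The sketch below is an illustration only: the names in_L and separating_suffix are hypothetical, and flipping the first symbol of y2 is just one possible way to pick a string w with |w| = |y2| and w ≠ y2.

```python
def in_L(s):
    """Membership in L = { w w^R | w in {a, b}+ }: even-length palindromes."""
    return len(s) >= 2 and len(s) % 2 == 0 and s == s[::-1]


def separating_suffix(x, y):
    """Return z with xz in L and yz not in L, for distinct x, y, |x| <= |y|."""
    if x == y or len(x) > len(y):
        raise ValueError("expects distinct strings with |x| <= |y|")
    if len(x) == len(y):
        return x[::-1]                      # case (1): z = x^R
    y2 = y[len(x):]                         # case (2): y = y1 y2, |y1| = |x|
    # Flip the first symbol of y2 to get some w != y2 of the same length.
    w = ("b" if y2[0] == "a" else "a") + y2[1:]
    return w + w[::-1] + x[::-1]            # z = w w^R x^R


for x, y in [("ab", "ba"), ("a", "abb"), ("ab", "abab")]:
    z = separating_suffix(x, y)
    print(x, y, z, in_L(x + z), in_L(y + z))
```

For every pair the last two columns should read True and False, which is exactly the property xz ∈ L and yz ∉ L used in the contradiction.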


2. Properties of DCFL's

Theorem 2. Let L1 and L2 be arbitrary DCFL’s, and let R be a regular language.

(1) L1 ∩ R is also a DCFL.

(2) The complement of L1 is also a DCFL.

(3) L1 ∪ L2 and L1 ∩ L2 are not necessarily DCFL’s. In other words, DCFL’s are not closed under union and intersection.

Proof. We assume that every DFA and DPDA reads the input string up to the last symbol without rejecting the input in the middle. Justifying this assumption takes a long and complex proof, which we defer to the end of this appendix.

Proof (1). L1 ∩ R is also a DCFL.

Let M and A be, respectively, a DPDA and a DFA which recognize L1 and R. With the two automata M and A, we construct a DPDA M' which recognizes the language L1 ∩ R as follows. With the transition functions of M and A in its finite state control, the DPDA M' simulates both M and A, keeping track of their states. M' simulates A only when M reads the input. (Recall that PDA’s can have an ε-move, in which they do not read the input.) DPDA M' enters an accepting state if and only if both M and A are simultaneously in an accepting state. Since M and A both read the input up to its end, clearly, M' recognizes the language L1 ∩ R.

[Figure: the DPDA M', which contains A and M and runs both on the input x.]

Proof (2). The complement of L1 is also a DCFL.

Let M be a DPDA recognizing L1. Unfortunately, we cannot use the simple technique of converting the accepting states to non-accepting states, and vice versa, that we used to prove the closure property of regular languages under complementation. Let’s see why.

Suppose that M takes a sequence of ε-moves, where the machine computes only with the stack, without reading the input, as illustrated in the following figure. (In the figure, the heavy circle denotes an accepting state.) If the input symbol a, read by the machine entering state p, is the last input symbol, then the input string will be accepted, because the machine enters an accepting state after (not necessarily right after) reading the last input symbol.

[Figure: M reads a and enters state p, takes a sequence of ε-moves (ε, ./..) that passes through an accepting state, and then reads b.]

Let’s see what will happen if we convert the accepting states to non-accepting states, and vice versa, as shown below (figure (b)). The machine still accepts the input, because it again enters an accepting state after reading the last symbol a. To solve this problem, we will use a simulation technique.

[Figure: (a) the sequence of ε-moves of M between reading a and reading b; (b) the same sequence with accepting and non-accepting states exchanged.]

We construct a DPDA M' which simulates M to recognize the complement of L(M) as follows. M' keeps the transition function of M in its finite state control and uses its own stack as the stack of M. The simulation is carried out in two cases (a) and (b), depending on whether M enters an accepting state between two moves of reading the input.

(a) If M, after reading an input symbol (a in the figure), does not enter an accepting state until it reads the next input symbol (b in the figure), M' reads the input a and enters an accepting state, and then simulates M reading the next input symbol b. (The ε-transitions in between are ignored.) Notice that M' is simulating M to recognize the complement of L(M). If the symbol a that M reads is the last one of the input string x, then x is not accepted by M. So, to have this input string x accepted by M', we let M' enter an accepting state right after reading the symbol a.

(b) If M ever enters an accepting state in between two reading moves (i.e., non-ε-transitions), M' simulates the two reading moves of M without entering an accepting state.

[Figure: M reads a entering state p, takes a sequence of ε-moves, and reads b entering state q.]

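Proof (3). L1 ∪ L2 and L1 ∩ L2 are not necessarily DCFL’s.

As shown in the following page, L1 and L2 below are DCFL’s. However, in Section 12.4 we proved that the intersection L1 ∩ L2 = {a^i b^i c^i | i ≥ 1} is a CSL which is not context-free.

L1 = {a^i b^i c^k | i, k ≥ 1}

L2 = {a^i b^k c^k | i, k ≥ 1}

Suppose that DCFL’s are closed under union. By property (2), the complements of L1 and L2 are DCFL’s, so their union L would be a DCFL, and by property (2) again, so would the complement of L. Then the following language L' must also be a DCFL according to property (1), because {a^i b^j c^k | i, j, k ≥ 1} is regular (see a DFA recognizing this language in the following page).

L' = (complement of L) ∩ {a^i b^j c^k | i, j, k ≥ 1} = {a^i b^i c^i | i ≥ 1}

The last equality holds because, by De Morgan's law, the complement of L is L1 ∩ L2. However, we know that {a^i b^i c^i | i ≥ 1} is not context-free (see Section 12.5).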
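The identity L1 ∩ L2 = {a^i b^i c^i | i ≥ 1} can be spot-checked by brute force on short strings. The sketch below is only a sanity check, not a proof; the helper names and the length bound 7 are arbitrary choices made for illustration.

```python
import re
from itertools import product


def block_lengths(s):
    """Return (i, j, k) if s = a^i b^j c^k with i, j, k >= 1, else None."""
    m = re.fullmatch(r"(a+)(b+)(c+)", s)
    return tuple(len(g) for g in m.groups()) if m else None


def in_L1(s):
    p = block_lengths(s)
    return p is not None and p[0] == p[1]        # a^i b^i c^k


def in_L2(s):
    p = block_lengths(s)
    return p is not None and p[1] == p[2]        # a^i b^k c^k


def in_anbncn(s):
    p = block_lengths(s)
    return p is not None and p[0] == p[1] == p[2]


# All strings over {a, b, c} of length at most 7, shortest first.
words = ["".join(t) for n in range(8) for t in product("abc", repeat=n)]
intersection = [w for w in words if in_L1(w) and in_L2(w)]
print(intersection)
print(intersection == [w for w in words if in_anbncn(w)])
```

Within the length bound, the only strings that belong to both L1 and L2 are those with three equal blocks.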


[Figure: (a) DPDA accepting L1 = {a^i b^i c^k | i, k ≥ 1}, with transitions (a, Z0/aZ0), (a, a/aa), (b, a/ε) and (c, Z0/Z0); (b) DPDA accepting L2 = {a^i b^k c^k | i, k ≥ 1}, with transitions (a, Z0/Z0), (b, Z0/bZ0), (b, b/bb), (c, b/ε) and (ε, Z0/Z0); (c) DFA accepting {a^i b^j c^k | i, j, k ≥ 1}.]

3. Making every DPDA and DFA read up to the last input symbol

In this section we shall prove the following lemma, which we deferred while proving Theorem 2.

Lemma 2. Every DCFL and every regular language can be recognized, respectively, by a DPDA and a DFA which read up to the last input symbol.

Proof. By convention, when we define the transition function (or the transition graph) of an automaton, we may leave it undefined for some tape symbols (and stack-top symbols for a PDA). We assume that, when no transition is defined for the current state and input symbol, the automaton rejects the input immediately without reading the input further.

To make a DFA read up to the last input symbol, we explicitly introduce a dead state and let the machine enter it for every undefined transition. Then we let it read off all the remaining input symbols in the dead state, as the following example shows. (See also Section 9.1.)

[Figure: a DFA over {a, b} in which every undefined transition is redirected to a dead state d, where the machine loops on a, b.]
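The dead-state completion of a DFA is short enough to write out. In the sketch below the dictionary encoding, the function names and the partial DFA for a+b+ are hypothetical examples chosen for illustration; only the idea of sending every undefined transition to a dead state comes from the text.

```python
def complete_dfa(states, alphabet, delta, dead="d"):
    """Return (states, delta) of a total DFA: every missing move goes to dead."""
    total = dict(delta)
    for q in list(states) + [dead]:
        for symbol in alphabet:
            # Keep existing moves; the dead state loops on every symbol.
            total.setdefault((q, symbol), dead)
    return set(states) | {dead}, total


def accepts(delta, start, accepting, word):
    """Run a total DFA; it always reads the whole word."""
    q = start
    for symbol in word:
        q = delta[(q, symbol)]
    return q in accepting


# Hypothetical partial DFA for a+b+: states 0, 1, 2, with 2 accepting.
partial = {(0, "a"): 1, (1, "a"): 1, (1, "b"): 2, (2, "b"): 2}
states, total = complete_dfa({0, 1, 2}, "ab", partial)

for word in ["aabb", "aba", "b"]:
    print(word, accepts(total, 0, {2}, word))
```

The dead state is not accepting, so the completed DFA recognizes the same language while consuming every input symbol.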


Making DPDA read the last input symbol

Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a DPDA. Recall that for every p ∈ Q, a ∈ Σ and A ∈ Γ, both δ(p, a, A) and δ(p, ε, A) give at most one value, and if δ(p, a, A) is defined, δ(p, ε, A) is not defined, and vice versa.

The problem of making M read up to the last input symbol is not that simple. The automaton may hit an undefined transition, as a DFA may, or end up in a cycle of ε-moves in the middle of the computation without consuming the input. For the case of undefined transitions, we can use the same approach as for DFA’s. Here is an example. (Notice that this DPDA accepts the language {a^i b^i | i ≥ 1}.)

[Figure: a DPDA with Σ = {a, b}, Γ = {A, Z0} and states 1 to 4, with transitions (a, Z0/AZ0), (a, A/AA), (b, A/ε), (b, A/ε) and (ε, Z0/Z0). The added transitions (b, Z0/Z0), (a, A/A) and (X, Z0/Z0) lead to a dead state d, which loops on (X, Y/Y) for X ∈ {a, b} and Y ∈ {A, Z0}.]

Now, we study the problem of converting M to a DPDA M’ which consumes the input without entering a cycle of ε-moves. The figure below shows a part of the transition graph of M with such a loop. Notice that, on entering the loop, the machine does not necessarily stay in it. Depending on the stack contents, it may exit. Given a state transition graph of M, our objective is to find a state q and a stack-top symbol Z such that the machine cyclically enters q with the same stack top Z, that is, (q, ε, Z) |-* (q, ε, Zα) by one or more ε-moves, for some α ∈ Γ*.

[Figure: part of the transition graph of M over states 1 to 4, containing a loop of ε-moves. Transition labels: (a, Z0/AZ0), (b, A/BA), (ε, A/CA), (ε, A/BA), (ε, B/ε), (ε, B/B), (ε, C/ε), (ε, D/ED), (ε, D/ε), (ε, E/ε).]

Since the graph is finite and the transitions are deterministic, we can effectively identify every transition that enters a cycle of ε-moves and never leaves it. We detach such a transition from the cycle and let it enter the dead state, where the remaining input is consumed. If the cycle involves an accepting state, we let the transition enter an accepting state before sending it to the dead state.

[Figure: the same transition graph after the conversion, with the transition that stays in the ε-cycle redirected to a dead state d, which loops on (X, Y/Y) for every input symbol X and stack symbol Y.]

For a more detailed approach to the conversion, refer to J. Hopcroft and J. Ullman, “Introduction to Automata Theory, Languages and Computation,” Section 10.2, Addison Wesley, 1979, or M. Harrison, “Introduction to Formal Language Theory,” Section 5.6, Addison Wesley, 1978.
