Direct automaton construction for Parikh's Theoremdeepakd/atc-2018/seminar-slides/automa… · If w is a word over some T, we denote by T(w) the Parikh image of w over alphabet T

Direct automaton construction for Parikh’s Theorem

Abhishek Mukati, Rishi Tulsyan, Shubham Waghmare

November 23, 2018

Abhshek, Rishi, Shubham Automata Construction November 23, 2018 1 / 61

Parikh’s Theorem

Theorem

Every CFL has the same Parikh image as some regular language.

Parikh’s proof of theorem

Given a CFG G, the proof produces (implicitly) an automaton orregular expression whose language has the same Parikh image as G.

Constructed automata is of size O(nn), where n - number of variablesin the CNF of G.

Enhanced automaton construction

Given a CFG G, construct an automaton explicitly.

Instead of O(nn), only O(4n) states.


Terms and Notations

Context Free Grammar

G=(V ,T ,P,S)

V is set of variables, denoted as A1, A2, A3,......

T is a set of terminal symbols, denoted as a, b, c.....

S is the start variable.

P is a set of productions of the form Ai−→ α where α ∈(V U T)∗

If w is a word over some T, we denote by ΠT (w) the Parikh image ofw over alphabet T.

ΠT (w)maps a character in T to its number of occurrences in w.


Terms and Notations Cont.

Context Free Grammar

G=(V ,T ,P,S)

The Parikh image of a language L over T is ΠT (w)|w ∈ L. It isdenoted by ΠT (L).

Examples :-Πa,b ,c(bccba) = (1, 2, 2) where (1, 2, 2) stands for(a, 1), (b, 2), (c , 2)Πa,b ,c(cabaaabb) = (4, 3, 1)

α/V means α projected on V

α/T means α projected on T

A pair (α, β) is a step, denoted by α⇒ β, if there exists a productionA −→ γ and α1, α2 ∈ (V∪T)∗ such that: α = α1Aα2 and β = α1γα2.

Transition t(α⇒ β) = (ΠV (α),γ/T , ΠV (β)) where, α, β∈ (V ∪ T )∗


k-Parikh automaton of G

The k-Parikh automaton of G is the NFA MkG = (Q, T∗, δ, q0, qf )

n=|V|Q= (x1 ....xn) ∈ Nn |

∑ni =1 xi ≤ k

δ = t(α⇒ β) | α⇒ β is a step and ΠV (α), ΠV (β)∈ Q qf = ΠV (ε) = (0,....., 0)

q0 = ΠV (S);


3-Parikh Automaton Example

A1 −→ A1A2 | aA2 → bA2aA2 | cA1

Transition example: (0, 2)c−→ (1, 1)

A2A2 → cA1A2


Example


Theorem

If G is a context-free grammar with n variables and degree m, then L(G)and L(Mnm+1

G ) have same Parikh image.

the degree of G denoted by m := -1 + max|γ/V |(A→ γ) ∈ P ;i.e.,m+1 is the maximal number of variables on the right hand sides.

For the above grammar G, n=2 and m=1, which means L(G) =Π

L(M3G ).


Proof of L(MkG ) ⊆Π L(G )

Claim

If k ≥ 1, let q0σ−→ q be a run of Mk

G on the word σ ∈ T ∗, then there existsa step sequence S ⇒∗ α satisfying

ΠV (α) = q

ΠT (α) = ΠT (σ)

Proof of claim (by induction on length of path):

If l = 0 then σ = ε, so α = S and ΠV (S) = q0 and ΠT (S) = ΠT (ε)

If l > 0, let σ = σ′γ and q0σ′−→ q′

γ−→ q, then according to I.H.:S ⇒∗ α′ where ΠV (α′) = q′ and ΠT (α′) = ΠT (σ′)

Since q′γ−→ q is a transition there exists a step α1Aα2 ⇒ α1γ

′α2 anda production A→ γ′ where ΠV (α1Aα2) = q′, ΠV (α1γ

′α2) = q andγ′/T = γ


Claim’s proof cont.

Since ΠV (α′) = q′ = ΠV (α1Aα2), α′ = α′1Aα′2 for some α′1, α

′2

Let α = α′1γ′α′2 so that S ⇒∗ α′ ⇒ α and

ΠV (α) = ΠV (α′1γ′α′2)

= ΠV (α′1Aα′2)− ΠV (A) + ΠV (γ′)

= ΠV (α′)− ΠV (A) + ΠV (γ′)= ΠV (α1Aα2)− ΠV (A) + ΠV (γ′)= ΠV (α1γ

′α2) = q

ΠT (α) = ΠT (α′1γ′α′2)

= ΠT (α′1Aα′2) + ΠT (γ′)

= ΠT (α′) + ΠT (γ′)= ΠT (α′) + ΠT (γ)= ΠT (σ′) + ΠT (γ)= ΠT (σ)


Proof of L(MkG ) ⊆Π L(G ) from the claim

If σ ∈ L(MkG ), then there is a run q0

σ−→ ΠV (ε)

From the claim, there exists a step sequence S ⇒∗ α satisfyingΠV (α) = ΠV (ε) = (0, 0, ..., 0) and ΠT (α) = ΠT (σ)

So α ∈ T ∗ and hence α ∈ L(G )

Since ΠT (α) = ΠT (σ), we have α =Π σ


Notations and Definitions

Yield of a parse tree t is denoted by Y (t)

Set of yields of a set T of trees by Y (T )

A child of t is a subtree of t whose root is a child of the root of t.

A child is proper if its root is not a leaf.

The dimension d(t) of a parse tree is defined as follows :If t has no proper children then d(t) = 0.Otherwise let t1, t2, ...tr be the proper children of t sorted s.t.d(t1) ≥ d(t2) ≥ ... ≥ d(tr ). Then

d(t) =

d(t1) if r = 1 or d(t1) > d(t2)

d(t1) + 1 if d(t1) = d(t2)

The set of parse trees with dimension k is denoted by T (k )

Fact 2.1: The height of a tree, h(t) > d(t).


Proof idea of L(G ) ⊆Π L(Mnm+1G )

Definition

A derivation S ⇒ α0 ⇒ ...⇒ αl has index k if ∀i ∈ 0, ..., l, the word(αi )/V has length atmost k . The set of words derivable throughderivations of index k is denoted by Lk(G ).

Clearly L1(G ) ⊆ L2(G ) ⊆ L3(G ) ⊆ ... and L(G ) =⋃

k ≥1Lk(G )We can prove L(G ) ⊆Π L(Mnm+1

G ) by proving

L(G ) ⊆Π Lnm+1(G ) (Collapse Lemma) -proved usingY (T ) ⊆Π

⋃ni=0 Y (T (i))

Y (T (k )) ⊆ Lkm+1(G )

Lk(G ) ⊆Π L(MkG )


Lemma 2.1

Lemma 2.1

Y(T ) ⊆Π ∪ni=0Y(T (i))

Π (Yields of all the parse trees) ⊆Π ( Union of Yields of all the parse trees of dimension i,where (i=0 to n))

Proof Idea

For every parse tree t ∈ T , we can find a parse tree t’ ∈ T (i), suchthat:

Π (Y (t)) = Π (Y (t’))

d(t′) = i ≤ n. (n = | V |)


Lemma 2.1: Proof - Preliminaries

In this proof, we write t = t1t2 to denote that

t1 is a parse tree except that exactly one leaf is labelled by a variable,say A2, instead of a terminal.t2 is a parse tree with root A2.t is obtained from t1 and t2 by replacing the leaf A2 of t1 by the tree t2.

Figure: Tree t. Figure: Tree t1 and t2.


Lemma 2.1: Proof - Preliminaries

Ω− equivalence

Two trees t, t’ are Ω -equivalent if:

They have same number of nodes.The sets of variables in t and t ′ coincide.Y (t) = Π Y (t’)

Compact tree

A tree t is compact if d(t) ≤ K (t).

where K (t) denotes the number of variables that appear in t.

K (t) ≤ n. Why?

Proof Idea

For every parse tree t ∈ T , we can find a parse tree t’ ∈ T (i), suchthat:

Π (Y (t)) = Π (Y (t’))

d(t′) = i ≤ n. (n = | V |)Abhshek, Rishi, Shubham Automata Construction November 23, 2018 16 / 61

Compactify(t)

Compactify(t) transforms a tree t into an Ω− equivalent compact tree.Steps:

1 If t is compact then return t and terminate.

2 If t is not compact then:

2.1 Let t1, .....tr be the proper children of t, r ≥ 1.2.2 For every 1 ≤ i ≤ r : ti := Compactify(ti ).

(i.e., replace in t the subtree ti , by the result of compactifying ti .2.3 Let x be the smallest index 1 ≤ x ≤ r such that K (tx) = maxiK (ti ).2.4 Choose an index y 6= x such that d(ty ) = maxid(ti ).2.5 Choose subtrees tax , tbx of tx and subtrees tay , tby , tcy of ty such that:

i tx = tax .tbx and ty = tay . (t

by .t

cy ) ; and

ii the roots of tbx , tby and tcy are labelled by the same variable.

2.6 Remove tby from ty and insert it into tx .

tx = tax .(tby . tbx ) ; ty = tay .tcy .2.7 Goto (1).


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Example


Compactify(t)







by .t

cy ) ; and





Compactify(t) is well-defined

We first prove that the procedure Compactify is well-defined i.e. someassumptions made by the procedure about the existence of some objectsindeed hold.

(2.1) If t is not compact, then t has at least one proper child.

Proof:

Given: t is not compact.

Assume that t has no proper child.

Then, d(t) = 0 ≤ K (t). Why?

Hence, t is compact.

Contradiction.



(2.4) If t is not compact and all its proper children are compact and xis the smallest index such that K (tx) = maxiK (ti). Then, there existsan index y 6= x such that d(ty ) = maxid(ti).

Proof:

Let y be an index, such that d(ty ) = maxid(ti ).

Then d(t)≤ d(ty ) + 1. (by definition of dimension and of y)≤ K (ty ) + 1. (as ty is compact)≤ K (tx) + 1. (by definition of x)≤ K (t) + 1. (as tx is a child of t)≤ d(t). (as t is not compact)

d(t) = d(ty ) + 1.

By definition of dimension, if ty was the only child of t withmaximum dimension, then d(t) = d(ty ).

Hence, more than 1 such child exist.

If 1 of them is tx , we can choose ty from the rest of them.Abhshek, Rishi, Shubham Automata Construction November 23, 2018 46 / 61


(2.5) If t is not compact and all its proper children are compact, andit has two distinct proper children tx , ty such that K (tx) = maxiK (ti)and d(ty ) = maxid(ti). Then, there exist subtrees tax , tbx of tx andsubtrees tay , tby , tcy of ty satisfying conditions (i) and (ii).

i tx = tax .tbx and ty = tay . (tby .tcy ) ; and


Proof:

K (ty ) = d(ty ). (from proof of 2.4 )

d(ty ) < h(ty ). (from Fact 2.1 )

K (ty ) < h(ty ).

∃ path of ty from the root to a leaf that visits at least two nodeslabelled with the same variable, say A.

Hence, we can have ty = tay .(tby .tcy ) where roots of tby and tcy arelabelled A.



Also,

K (t) = K (tx). (from proof of 2.4 )

Hence, every variable that appears in t appears also in tx .

So tx contains a node labelled by A.

Hence, we can have tx = tax .tbx where root of tbx is labelled A.

Thus, the procedure Compactify is well-defined.


Compactify(t)







by .t

cy ) ; and





Compactify(t) is Correct

Correctness

If Compactify(t) terminates and returns a tree t’, then t and t’ areΩ− equivalent.

Proof by induction on the number of calls to Compactify(t).

Base Case: If Compactify is called only once, then only line (1) isexecuted, t is compact.If Compactify is called more than once, the only lines that modify tare:2.2 For every 1 ≤ i ≤ r : ti := Compactify(ti ).2.6 Remove tby from ty and insert it into tx .

Consider 2.2. By I.H., each call to Compactify(ti ) returns a compacttree t ′i that is Ω− equivalent to ti . Let t1 and t2 be the values of tbefore and after the execution of ti := Compactify(ti ). Then t2 is theresult of replacing ti by t ′i in t1.

since t ′i is Ω− equivalent to ti , we get that t2 is Ω− equivalent to t1.


Consider 2.6. By Let t1 and t2 be the values of t before and after theexecution of tx := tax (tby t

bx ) followed by the execution of ty := tay t

cy .

Since the subtree tby that is added to tx is subsequently removed fromty ,

t1 and t2 have same number of nodes.The sets of variables in t1 and t2 coincide.Y (t1) =Π Y (t2).

Thus, the procedure Compactify is Correct.


Compactify(t) terminates

Compactify(t) terminates

Assume that t has a minimal number of nodes.

The procedure does not change, the number of nodes or the set ofvariables occurring in each of t1, ...., tr .

The procedure adds nodes to tx . Value of K (tx) does not decrease.

x has the same value at every execution. Since, each executionstrictly decreases the number of nodes of some proper child tydifferent from tx , and only increases the number of nodes of tx .

Contradiction: All proper children of t have a finite number of nodes.


Lemma 2.2

Lemma 2.2

For every k ≥ 0 : Y(T k) ⊆ Lkm+1(G )

The set of Yields of all the parse trees of dimension k, is a subset ofthe set of words derivable through derivations of index km + 1,denoted by Lkm+1(G ).

Proof Idea

For every parse tree t ∈ T k , there exists a derivation of index ’km + 1’ forY (t).


Lemma 2.2: For every k ≥ 0 : Y(T k) ⊆ Lkm+1(G )

In this proof, we will use the following notation. If D is a derivationα0 ⇒ ...⇒ αl and w ,w ′ ∈ (V ∪ T )∗, then we define wDw ′ to be the stepsequence wα0w

′ ⇒ ...⇒ wαlw′.

Proof by induction on number of non-leaf nodes (nl) in t:

Let d(t) = k .Base Case:

nl = 0. t has no proper child.d(t) = k = 0. index = km + 1 = 1.Derivation: S ⇒ Y (t) of index 1.

Induction Step: Assume t has r ≥ 1 proper children t1, ..., tr wherethe root of ti is assumed to be labelled by A(i); i.e., we assumed thetopmost level of t is induced by a rule

S −→ γ0A(1)γ1...γr−1A

(r)γr for γi ∈ T ∗ (1)

r − 1 ≤ mAt most one child ti has dimension k , while others have at most k − 1.


Assume d(t1) ≤ k and d(t2), ..., d(tr ) ≤ k − 1.

I.H. for all 1 ≤ i ≤ r there is a derivation Di for Y (ti ) s.t. D1 hasindex km + 1, and D2, ...,Dr have index (k − 1)m + 1.

For each 1 ≤ i ≤ r , define the step sequence

D ′i := γ0A(1)γ1...γi−2A

(i−1)γi−1DiγiY (ti+1)γi+1...γr−1Y (tr )γr (2)

Now, D ′1 has index km + 1, and for 2 ≤ i ≤ r , the step sequence D ′ihas index, (i − 1) + (k − 1)m + 1 ≤ km + 1.

By concatenating the step sequences S −→ γ0A(1)γ1...γr−1A

(r)γr andDr ,Dr−1, ...,D1 in that order, we obtain a derivation for Y (t) ofindex km + 1.


Lemma 2.3 [Collapse Lemma]

Collapse Lemma

L(G ) ⊆Π Lnm+1(G )

L(G ) = Y (T )⊆Π ∪ni=0Y(T (i)) (Lemma 2.1)⊆ Lnm+1(G ) (Lemma 2.2)


Lemma 2.4

Lemma 2.4

For every k ≥ 1 : Lk(G ) ⊆Π L(MkG ).

We show that if S ⇒∗ α is a prefix of a derivation of index k then MkG has

a run q0w−→ ΠV (α) s.t. w ∈ T ∗ and α/V =Π w .

Proof by induction on length i of the prefix

Base Case: When i = 0, α = S , q0 = ΠV (S) and S/T = ε; this holds.

Induction Step: When i > 0, Since S ⇒i α, there exist:β1Aβ2 ∈ (V ∪ T )∗ and a production A→ γ s.t.S ⇒i−1 β1Aβ2 ⇒ α and β1γβ2 = α


Lemma 2.4: Proof of Lk(G ) ⊆Π L(MkG )

By I.H. there exists a run of MkG s.t. q0

w1−−→ ΠV (β1Aβ2) and(β1Aβ2)/T =Π w1.

By definition of MkG and the fact that S ⇒i α is of index k ; there

exists a transition (ΠV (β1Aβ2), γ/T ,ΠV (α)).

Hence q0

w1.γ/T−−−−→ ΠV (α).Also from (β1Aβ2)/T =Π w1 and α = β1γβ2; α/T =Π w1.γ/T andthis is proved.

Finally, if α ∈ T ∗ so that S ⇒∗ α is a derivation, then:q0

w−→ ΠV (α) = (0, ..., 0) where (0, ..., 0) is an accepting state andα = α/T =Π w .


Proof of Correctness: L(G ) ⊆Π L(Mnm+1G )

Proposition

L(G ) ⊆Π L(Mnm+1G )

L(G ) ⊆Π Lnm+1(G ) (Collapse Lemma)⊆Π L(Mnm+1

G ) (Lemma 2.4)


Conclusion

If G is a context-free grammar with n variables and degree m, then L(G)and L(Mnm+1

G ) have same Parikh image.

MkG has exactly n+kCk states.

For grammars in CNF, m=1. k=n+1.

Therefore, no. of states in Mn+1G is exactly 2n+1Cn ≤ 22n+1 ∈ O(4n).


Thank you!


Documents

Direct automaton construction for Parikh's Theoremdeepakd/atc-2018/seminar-slides/automa… · If w is a word over some T, we denote by T(w) the Parikh image of w over alphabet T