Kolmogorov structure functions for automatic complexity in ...math.hawaii.edu/~bjoern/COCOA-2014.pdfThe automatic complexity of a nite binary string x= x 1:::x n is the least number

History Automatic complexity Structure function

Kolmogorov structure functions for automaticcomplexity in computational statistics

Bjørn Kjos-Hanssen (University of Hawai’i at Manoa)

December 21, 2014 – Combinatorial Optimization andApplications (COCOA)


History

• 1936: Universal Turing machine / programming language /computer

• 1965: Kolmogorov complexity of a string x = 0111001, say, =the length of the shortest program printing x

• 1973: Structure function (Kolmogorov)

• 2001: Automatic complexity A(x) (Shallit and Wang)(deterministic)

• 2013: Nondeterministic automatic complexity AN (x) (Hyde,M.A. thesis, University of Hawai‘i): n/2 upper bound.

• 2014: Structure function for automatic complexity:entropy-based upper bound.


History








History








History








History








History








History








Definition

• Let M be an NFA having q states and no ε−transitions. Ifthere is exactly one path through M of length |x| leading toan accept state, and x is the string read along the path, thenwe say AN (x) ≤ q.

• Such an NFA is said to witness the complexity of x being nomore than q.


Definition

• Let M be an NFA having q states and no ε−transitions. Ifthere is exactly one path through M of length |x| leading toan accept state, and x is the string read along the path, thenwe say AN (x) ≤ q.

• Such an NFA is said to witness the complexity of x being nomore than q.


Upper Bound

Theorem 1

Let |Σ| = k ≥ 2 be fixed and suppose x ∈ Σn. ThenAN (x) ≤ n

2 + 1.


Upper Bound

Theorem 1

Let |Σ| = k ≥ 2 be fixed and suppose x ∈ Σn. ThenAN (x) ≤ n

2 + 1.


q1start q2 . . . qm qm+1

x1 x2 xm−1 xm

xm+1

xm+2xm+3xn−1xn

Figure 1 : An NFA uniquely accepting x = x1x2 . . . xn, n = 2m+ 1


q1start q2 . . . qm qm+1

x1 x2 xm−1 xm

xm+1

xm+2xm+3xn

Figure 2 : An NFA uniquely accepting x = x1x2 . . . xn, n = 2m


Examples:

q1start q2 q3 q4

x1 x2 x3

x4

x5x6

Figure 3 : An NFA uniquely accepting x = x1x2x3x4x5x6. AN (x) ≤ 4.


Examples:

q1start q2 q3 q4

x1 x2 x3

x4

x5x6x7

Figure 4 : An NFA uniquely accepting x = x1x2x3x4x5x6x7.AN (x) ≤ 4.


Examples:

q1start q2 q3 q4 q5 q6

0 1 1 1 0

0

1010

Figure 5 : An NFA uniquely accepting x = 0111001010, |x| = 10,AN (x) ≤ 6.


Examples:

q1start q2 q3 q4 q5 q6

0 1 1 1 0

0

10101

Figure 6 : An NFA uniquely accepting x = 01110010101, |x| = 11,AN (x) ≤ 6.


The Kolmogorov complexity of a finite word w is roughly speakingthe length of the shortest description w∗ of w in a fixed formallanguage. The description w∗ can be thought of as an optimallycompressed version of w. Motivated by the non-computability ofKolmogorov complexity, Shallit and Wang studied a deterministicfinite automaton analogue.

Definition 1 (Shallit and Wang [1])

The automatic complexity of a finite binary string x = x1 . . . xn isthe least number AD(x) of states of a deterministic finiteautomaton M such that x is the only string of length n in thelanguage accepted by M .


This complexity notion has two minor deficiencies:

1. Most of the relevant automata end up having a “dead state”whose sole purpose is to absorb any irrelevant or unacceptabletransitions.

2. The complexity of a string can be changed by reversing it. Forinstance,

AD(011100) = 4 < 5 = AD(001110).


Definition 2

The nondeterministic automatic complexity AN (w) of a word w isthe minimum number of states of an NFA M , having noε-transitions, accepting w such that there is only one acceptingpath in M of length |w|.


Definition 3

The complexity deficiency of a word x of length n is

Dn(x) = D(x) = b(n)−AN (x).


Length n P(Dn > 0)

0 0.0002 0.5004 0.5006 0.5318 0.617

10 0.66412 0.60014 0.68716 0.65718 0.65820 0.64122 0.63324 0.593

(a) Even lengths.

Length n P(Dn > 0)

1 0.0003 0.2505 0.2507 0.2349 0.207

11 0.31713 0.29515 0.29717 0.34219 0.33021 0.30323 0.32225 0.283

(b) Odd lengths.

Table 1 : Probability of strings of having positive complexity deficiencyDn, truncated to 3 decimal digits.


Definition 4

Let DEFICIENCY be the following decision problem.Given a binary word w and an integer d ≥ 0, is D(w) > d?

Theorem 5

DEFICIENCY is in NP.

We do not know whether DEFICIENCY is NP-complete.


Let

Sx = {(q,m) | ∃ q-state NFA M,x ∈ L(M)∩Σn, |L(M)∩Σn| ≤ bm}.

Then Sx has the upward closure property

q ≤ q′,m ≤ m′, (q,m) ∈ Sx =⇒ (q′,m′) ∈ Sx.


Definition 6 (Vereshchagin, personal communication, 2014)

In an alphabet Σ containing b symbols, we define

h∗x(m) = min{k : (k,m) ∈ Sx} and

hx(k) = min{m : (k,m) ∈ Sx}.


Definition 7

The entropy function H : [0, 1]→ [0, 1] is given by

H(p) = −p log2 p− (1− p) log2(1− p).

A useful well-known fact:

Theorem 8

For 0 ≤ k ≤ n,

log2

(n

k

)= H(k/n)n+O(log n).


Definition 9

Let

h(p) = lim supn→∞

max|x|=n

hx([p · n])

n, h : [0, 1/2]→ [0, 1]

where [x] is the nearest integer to x.

We are interested in upper bounds on h.


Figure 7 : Bounds for the automatic structure function for alphabet sizeb = 2.


Figure 8 : Bounds for the automatic structure function for alphabet sizeb = 2 with improved bound.


Complexity guessing game

http://math.hawaii.edu/play

http://math.hawaii.edu/play


References I

Jeffrey Shallit and Ming-Wei Wang.Automatic complexity of strings.J. Autom. Lang. Comb., 6(4):537–554, 2001.2nd Workshop on Descriptional Complexity of Automata,Grammars and Related Structures (London, ON, 2000).

Documents

Kolmogorov structure functions for automatic complexity in ...math.hawaii.edu/~bjoern/COCOA-2014.pdfThe automatic complexity of a nite binary string x= x 1:::x n is the least number