
fComputation


Fundamental Concepts

Languages

o A language is a subset of the set of all possible strings formed from a given set of symbols.

o There must be a membership criterion for determining whether a particular string is in the set.

Grammars

o A grammar is a formal system for accepting or rejecting strings.

o A grammar may be used as the membership criterion for a language.

Automata

o An automaton is a simplified, formalized model of a computer.

o An automaton may be used to compute the membership function for a language.

o Automata can also compute other kinds of things.

Deterministic Finite Automata

DFAs are:

Deterministic--there is no element of choice

Finite--only a finite number of states and arcs

Acceptors--produce only a yes/no answer

A DFA is drawn as a graph, with each state represented by a circle.

One designated state is the start state.

Some states (possibly including the start state) can be designated as final states.


Arcs between states represent state transitions -- each such arc is labeled with the symbol that triggers the transition.


Example DFA

Example input string: 1 0 0 1 1 1 0 0

Operation

Start with the "current state" set to the start state and a "read head" at the beginning of the input string; while there are still characters in the string:

o Read the next character and advance the read head;

o From the current state, follow the arc that is labeled with the character just read; the state that the arc points to becomes the next current state.

When all characters have been read, accept the string if the current state is a final state; otherwise reject the string.

Sample trace: q0 1 q1 0 q3 0 q1 1 q0 1 q1 1 q0 0 q2 0 q0

Since q0 is a final state, the string is accepted.
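The operation described above can be sketched in a few lines of Python. This is only an illustration: the figure of the example DFA is not reproduced in these notes, so the transition table below is rebuilt from the sample trace. The two arcs that the trace never uses (q2 on 1 and q3 on 1) and the choice of q0 as the only final state are assumptions.

```python
# Transition table rebuilt from the sample trace.
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3", ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",  # q2 on '1' is assumed, not in the trace
    ("q3", "0"): "q1", ("q3", "1"): "q2",  # q3 on '1' is assumed, not in the trace
}
START = "q0"
FINAL = {"q0"}  # assumed to be the only final state


def run_dfa(word, delta=DELTA, start=START, final=FINAL):
    """Return (accepted, trace), where trace lists every state visited."""
    state = start
    trace = [state]
    for ch in word:
        # Follow the arc labeled with the character just read.
        state = delta[(state, ch)]
        trace.append(state)
    return state in final, trace


accepted, trace = run_dfa("10011100")
print(accepted, " ".join(trace))
```

With the input string 1 0 0 1 1 1 0 0 the visited states are q0 q1 q3 q1 q0 q1 q0 q2 q0, matching the sample trace.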

Formal Definition of a DFA

A deterministic finite acceptor or dfa is a quintuple:

M = (Q, Σ, δ, q0, F)

where

Q is a finite set of states,

Σ is a finite set of symbols, the input alphabet,

δ: Q × Σ → Q is a transition function,

q0 ∈ Q is the initial state,

F ⊆ Q is a set of final states.

Note: The fact that δ is a function implies that every vertex has an outgoing arc for each member of Σ.

We can also define an extended transition function δ* as

δ*: Q × Σ* → Q.

If a DFA M = (Q, Σ, δ, q0, F) is used as a membership criterion, then the set of strings accepted by M is a language. That is,

L(M) = {w ∈ Σ* : δ*(q0, w) ∈ F}.

Languages that can be defined by dfas are called regular languages.

Nondeterministic Finite Automata

A finite-state automaton can be nondeterministic in either or both of two ways:

A state may have two or more arcs emanating from it labeled with the same symbol. When the symbol occurs in the input, either arc may be followed.

A state may have one or more arcs emanating from it labeled with λ (the empty string). These arcs may optionally be followed without looking at the input or consuming an input symbol.

Due to nondeterminism, the same string may cause an nfa to end up in one of several different states, some of which may be final while others are not. The string is accepted if any possible ending state is a final state.


Example NFAs

Formal Definition of NFAs

The extension of our notation to NFAs is somewhat strained.

A nondeterministic finite acceptor or nfa is defined by the quintuple

M = (Q, Σ, δ, q0, F)

where

Q is a finite set of states,

Σ is a finite set of symbols, the input alphabet,

δ: Q × (Σ ∪ {λ}) → 2^Q is a transition function,

q0 ∈ Q is the initial state,

F ⊆ Q is a set of final states.

These are all the same as for a dfa except for the definition of δ:

Transitions on λ are allowed in addition to transitions on elements of Σ, and

The range of δ is 2^Q rather than Q. This means that the values of δ are not elements of Q, but rather are sets of elements of Q.

The language defined by nfa M is defined as

L(M) = {w ∈ Σ* : δ*(q0, w) ∩ F ≠ ∅}


Equivalence of NFA and DFA

Two acceptors are equivalent if they accept the same language.

A DFA is just a special case of an NFA that happens not to have any null transitions or multiple transitions on the same symbol. So DFAs are not more powerful than NFAs.

For any NFA, we can construct an equivalent DFA. So NFAs are not more powerful than DFAs. DFAs and NFAs define the same class of languages -- the regular languages.

To translate an NFA into a DFA, the trick is to label each state in the DFA with a set of states from the NFA. Each state in the DFA summarizes all the states that the NFA might be in. If the NFA contains |Q| states, the resultant DFA could contain as many as 2^|Q| states. (Usually far fewer states will be needed.)
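The translation just described is usually called the subset construction. The sketch below is one possible way to write it. It assumes the NFA's transition function is a dictionary from (state, symbol) to a set of states, with None standing for λ. The example NFA (strings over {a, b} ending in ab) is hypothetical and not taken from these notes.

```python
from collections import deque


def lambda_closure(states, delta):
    """All states reachable from `states` using only λ-arcs (symbol None)."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, None), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)


def nfa_to_dfa(alphabet, delta, start, finals):
    """Subset construction: every DFA state is a frozenset of NFA states."""
    d_start = lambda_closure({start}, delta)
    d_delta, d_finals = {}, set()
    queue, seen = deque([d_start]), {d_start}
    while queue:
        subset = queue.popleft()
        if subset & finals:  # accepting if any member is an NFA final state
            d_finals.add(subset)
        for x in alphabet:
            moved = {r for q in subset for r in delta.get((q, x), ())}
            target = lambda_closure(moved, delta)
            d_delta[(subset, x)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, d_delta, d_start, d_finals


# Hypothetical NFA: strings over {a, b} that end in "ab".
NFA_DELTA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
states, d_delta, d_start, d_finals = nfa_to_dfa("ab", NFA_DELTA, 0, {2})
print(len(states), "DFA states instead of the worst case", 2 ** 3)
```

Only the subsets that are actually reachable are built, which is why the result here has 3 states rather than 8.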

Regular Expressions

A regular expression can be used to define a language. A regular expression represents a "pattern;" strings that match the pattern are in the language, strings that do not match the pattern are not in the language.

As usual, the strings are over some alphabet Σ.

The following are primitive regular expressions:

x, for each x ∈ Σ,

λ, the empty string, and

∅, indicating no strings at all.

Thus, if |Σ| = n, then there are n+2 primitive regular expressions defined over Σ.

Here are the languages defined by the primitive regular expressions:

For each x ∈ Σ, the primitive regular expression x denotes the language {x}. That is, the only string in the language is the string "x".

The primitive regular expression λ denotes the language {λ}. The only string in this language is the empty string.

The primitive regular expression ∅ denotes the language {}. There are no strings in this language.


We can compose additional regular expressions by applying the following rules a finite number of times:

If r1 is a regular expression, then so is (r1).

If r1 is a regular expression, then so is r1*.

If r1 and r2 are regular expressions, then so is r1r2.

If r1 and r2 are regular expressions, then so is r1+r2.

Here's what the above notation means:

Parentheses are just used for grouping.

The postfix star indicates zero or more repetitions of the preceding regular expression. Thus, if x ∈ Σ, then the regular expression x* denotes the language {λ, x, xx, xxx, ...}.

Juxtaposition of r1 and r2 indicates any string described by r1 immediately followed by any string described by r2. For example, if x, y ∈ Σ, then the regular expression xy describes the language {xy}.

The plus sign, read as "or," denotes the language containing strings described by either of the component regular expressions. For example, if x, y ∈ Σ, then the regular expression x+y describes the language {x, y}.

Precedence: * binds most tightly, then juxtaposition, then +. For example, a+bc* denotes the language {a, b, bc, bcc, bccc, bcccc, ...}.

Languages Defined by Regular Expressions

There is a simple correspondence between regular expressions and the languages they denote:

Regular expression       L(regular expression)
x, for each x ∈ Σ        {x}
λ                        {λ}
∅                        { }
(r1)                     L(r1)
r1*                      (L(r1))*
r1 r2                    L(r1) L(r2)
r1 + r2                  L(r1) ∪ L(r2)

Here are some hints on building regular expressions. We will assume Σ = {a, b, c}.


Zero or more. a* means "zero or more a's." To say "zero or more ab's," that is, {λ, ab, abab, ababab, ...}, you need to say (ab)*. Don't say ab*, because that denotes the language {a, ab, abb, abbb, abbbb, ...}.

One or more. Since a* means "zero or more a's", you can use aa* (or equivalently, a*a) to mean "one or more a's." Similarly, to describe "one or more ab's," that is, {ab, abab, ababab, ...}, you can use ab(ab)*.

Zero or one. You can describe an optional a with (a+λ).

Any string at all. To describe any string at all (with Σ = {a, b, c}), you can use (a+b+c)*.

Any nonempty string. This can be written as any character from Σ followed by any string at all: (a+b+c)(a+b+c)*.

Any string not containing.... To describe any string at all that doesn't contain an a (with Σ = {a, b, c}), you can use (b+c)*.

Any string containing exactly one... To describe any string that contains exactly one a, put "any string not containing an a," on either side of the a, like this: (b+c)*a(b+c)*.

Give regular expressions for the following languages on Σ = {a, b, c}.

All strings containing exactly one a.

(b+c)*a(b+c)*

All strings containing no more than three a's.

We can describe the string containing zero, one, two, or three a's (and nothing else) as

(λ+a)(λ+a)(λ+a)

Now we want to allow arbitrary strings not containing a's at the places marked by X's:

X(λ+a)X(λ+a)X(λ+a)X

so we put in (b+c)* for each X:

(b+c)*(λ+a)(b+c)*(λ+a)(b+c)*(λ+a)(b+c)*

All strings which contain at least one occurrence of each symbol in Σ.

The problem here is that we cannot assume the symbols are in any particular order. We have no way of saying "in any order", so we have to list the possible orders:

abc+acb+bac+bca+cab+cba

To make it easier to see what's happening, let's put an X in every place we want to allow an arbitrary string:

XaXbXcX + XaXcXbX + XbXaXcX + XbXcXaX + XcXaXbX + XcXbXaX

Finally, replacing the X's with (a+b+c)* gives the final (unwieldy) answer:

(a+b+c)*a(a+b+c)*b(a+b+c)*c(a+b+c)* +
(a+b+c)*a(a+b+c)*c(a+b+c)*b(a+b+c)* +
(a+b+c)*b(a+b+c)*a(a+b+c)*c(a+b+c)* +
(a+b+c)*b(a+b+c)*c(a+b+c)*a(a+b+c)* +
(a+b+c)*c(a+b+c)*a(a+b+c)*b(a+b+c)* +
(a+b+c)*c(a+b+c)*b(a+b+c)*a(a+b+c)*

All strings which contain no runs of a's of length greater than two.

We can fairly easily build an expression containing no a, one a, or one aa:

(b+c)*(λ+a+aa)(b+c)*

but if we want to repeat this, we need to be sure to have at least one non-a between repetitions:

(b+c)*(λ+a+aa)(b+c)*((b+c)(b+c)*(λ+a+aa)(b+c)*)*

All strings in which all runs of a's have lengths that are multiples of three.

(aaa+b+c)*
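One way to gain confidence in answers like these is to test them by brute force against a plain-language description of the language. The sketch below uses Python's re module as a stand-in for the notation used here. It assumes that + can be translated to | and λ to an empty alternative; it checks every string over {a, b, c} up to length 5. The helper names are illustrative only.

```python
import re
from itertools import permutations, product


def to_python(expr):
    """Translate the notes' notation: '+' means union, 'λ' is the empty string."""
    return expr.replace(" ", "").replace("+", "|").replace("λ", "")


def all_strings(max_len, alphabet="abc"):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def agrees(expr, predicate, max_len=5):
    """True if expr matches exactly the strings (up to max_len) satisfying predicate."""
    pattern = re.compile(to_python(expr))
    return all((pattern.fullmatch(s) is not None) == predicate(s)
               for s in all_strings(max_len))


ANY = "(a+b+c)*"
# The six-term "each symbol at least once" expression, built from the 6 orders.
each_symbol = " + ".join(ANY + ANY.join(order) + ANY for order in permutations("abc"))

CHECKS = [
    ("(b+c)*a(b+c)*", lambda s: s.count("a") == 1),
    ("(b+c)*(λ+a)(b+c)*(λ+a)(b+c)*(λ+a)(b+c)*", lambda s: s.count("a") <= 3),
    (each_symbol, lambda s: set(s) == set("abc")),
    ("(b+c)*(λ+a+aa)(b+c)*((b+c)(b+c)*(λ+a+aa)(b+c)*)*", lambda s: "aaa" not in s),
    ("(aaa+b+c)*", lambda s: all(len(run) % 3 == 0 for run in re.findall("a+", s))),
]

for expr, predicate in CHECKS:
    print(agrees(expr, predicate), expr[:40])
```

A bounded check like this cannot prove an expression correct, but it catches most slips, such as writing ab* where (ab)* was intended.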

Regular Language

Languages described by deterministic finite acceptors (dfas) are called regular languages.

For any nondeterministic finite acceptor (nfa) we can find an equivalent dfa. Thus nfas also describe regular languages.

Regular expressions also describe regular languages

From Regular Expressions to NFAs

We will build more complex nfas out of simpler nfas, each with a single start state and a single final state. Since we have nfas for primitive regular expressions, we need to compose them for the operations of grouping, juxtaposition, union, and Kleene star (*).

For grouping (parentheses), we don't really need to do anything. The nfa that represents the regular expression (r1) is the same as the nfa that represents r1.

For juxtaposition (strings in L(r1) followed by strings in L(r2)), we simply chain the nfas together, as shown. The initial and final states of the original nfas (boxed) stop being initial and final states; we include new initial and final states.

The + denotes "or" in a regular expression, so it makes sense that we would use an nfa with a choice of paths. (This is one of the reasons that it's easier to build an nfa than a dfa.)

The star denotes zero or more applications of the regular expression, so we need to set up a loop in the nfa. We can do this with a backward-pointing arc. Since we might want to traverse the regular expression zero times (thus matching the null string), we also need a forward-pointing arc to bypass the loop.
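The constructions for juxtaposition, +, and star can be sketched as small Python functions. This is an illustrative sketch, not the exact automata from the missing figures. An nfa is represented as a triple (start, final, transitions), states are integers drawn from a counter, and None stands for λ; all of these representation choices are assumptions.

```python
from itertools import count

_ids = count()  # source of fresh state numbers


def _new_pair():
    return next(_ids), next(_ids)


def symbol(x):
    s, f = _new_pair()
    return s, f, [(s, x, f)]


def concat(n1, n2):
    (s1, f1, t1), (s2, f2, t2) = n1, n2
    s, f = _new_pair()
    # New start -> first nfa -> second nfa -> new final, glued by λ-arcs.
    return s, f, t1 + t2 + [(s, None, s1), (f1, None, s2), (f2, None, f)]


def union(n1, n2):
    (s1, f1, t1), (s2, f2, t2) = n1, n2
    s, f = _new_pair()
    # A choice of paths out of the new start state.
    return s, f, t1 + t2 + [(s, None, s1), (s, None, s2), (f1, None, f), (f2, None, f)]


def star(n):
    s1, f1, t1 = n
    s, f = _new_pair()
    # (f1 -> s1) is the backward arc; (s -> f) is the bypass for zero repetitions.
    return s, f, t1 + [(s, None, s1), (f1, None, f), (f1, None, s1), (s, None, f)]


def accepts(nfa, word):
    start, final, trans = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for p, x, r in trans:
                if p == q and x is None and r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({r for p, x, r in trans if p in current and x == ch})
    return final in current


# a+bc* from the precedence example: {a, b, bc, bcc, ...}
nfa = union(symbol("a"), concat(symbol("b"), star(symbol("c"))))
print([w for w in ["", "a", "b", "bcc", "ab", "c"] if accepts(nfa, w)])
```

The construction uses more states and λ-arcs than strictly necessary; the point is only that every regular expression operator has a matching nfa composition.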

From NFAs to Regular Expressions

Creating a regular expression to recognize the same strings as an nfa is trickier than you might expect, because the nfa may have arbitrary loops and cycles. Here's the basic approach.

If the nfa has more than one final state, convert it to an nfa with only one final state. Make the original final states nonfinal, and add a λ transition from each to the new (single) final state.

1. Consider the nfa to be a generalized transition graph, which is just like an nfa except that the edges may be labeled with arbitrary regular expressions. Since the labels on the edges of an nfa may be either λ or members of Σ, each of these can be considered to be a regular expression.

2. Remove states one by one from the nfa, relabeling edges as you go, until only the initial and the final state remain.

3. Read the final regular expression from the two-state automaton that results.

The regular expression derived in the final step accepts the same language as the original nfa.

Since we can convert an nfa to a regular expression, and we can convert a regular expression to an nfa, the two are equivalent formalisms--that is, they both describe the same class of languages, the regular languages.


Closure properties of Regular Languages

Closure properties are theorems which show that the class of regular languages is closed under the operation mentioned. The theorems are of the form “if certain languages are regular, and a language L is formed from them by certain operations such as union, intersection etc., then L is also regular”. In general, closure properties convey the fact that when one (or several) languages are regular, then certain related languages are also regular.

The principal closure properties of regular languages are:

1. The union of two regular languages is regular.

If L and M are regular languages, then so is L ∪ M.

2. The intersection of two regular languages is regular.

If L and M are regular languages, then so is L ∩ M.

3. The complement of a regular language is regular.

If L is a regular language over alphabet Σ, then Σ* − L is also a regular language.

4. The difference of two regular languages is regular.

If L and M are regular languages, then so is L − M.

5. The reversal of a regular language is regular.

The reversal of a string means that the string is written backward, i.e. the reversal of abcde is edcba.

The reversal of a language is the language consisting of the reversals of all its strings, i.e. if L = {001, 110} then L^R = {100, 011}.

6. The closure (star) of a regular language is regular.

If L is a regular language, then so is L*.

7. The concatenation of regular languages is regular.

If L and M are regular languages, then so is LM.
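Two of these properties have especially short constructive arguments on DFAs: for the complement, swap final and nonfinal states; for the intersection, run both machines in parallel on pairs of states. The sketch below illustrates both under the assumption that a DFA is a tuple (states, alphabet, delta, start, finals) with a total transition function. The two example machines are hypothetical.

```python
from itertools import product


def complement(dfa):
    """Same machine with final and nonfinal states swapped."""
    states, alphabet, delta, start, finals = dfa
    return states, alphabet, delta, start, states - finals


def intersection(d1, d2):
    """Product construction: a pair state is final if both components are."""
    s1, alphabet, t1, q1, f1 = d1
    s2, _, t2, q2, f2 = d2
    states = set(product(s1, s2))
    delta = {((p, q), x): (t1[(p, x)], t2[(q, x)])
             for (p, q) in states for x in alphabet}
    finals = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return states, alphabet, delta, (q1, q2), finals


def accepts(dfa, word):
    _, _, delta, state, finals = dfa
    for ch in word:
        state = delta[(state, ch)]
    return state in finals


# Hypothetical examples over {0, 1}.
EVEN_ZEROS = ({0, 1}, "01",
              {(0, "0"): 1, (0, "1"): 0, (1, "0"): 0, (1, "1"): 1}, 0, {0})
ENDS_IN_ONE = ({"n", "y"}, "01",
               {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
               "n", {"y"})

both = intersection(EVEN_ZEROS, ENDS_IN_ONE)
print(accepts(both, "0101"), accepts(both, "011"), accepts(complement(EVEN_ZEROS), "0"))
```

The difference L − M then follows from the same two pieces, since L − M = L ∩ (Σ* − M).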


Minimization of Finite Automata

We now consider the following problem: for a given DFA A, find an equivalent DFA with a minimum number of states.

We start with two examples. Consider the automaton on the left:

You can see that it is not possible to ever visit state 2. States like this are called unreachable. We can simply remove them from the automaton without changing its behavior. In our case, after removing state 2, we get the automaton on the right. As it turns out, however, removing unreachable states is not sufficient.

The next example is a bit more subtle (see the automaton on the left):

Let us compare what happens if we start processing some string w from state 0 or from state 2. I claim that the result will always be the same. This can be seen as follows. If w is the empty string, we will stay in either of the two states and w will not be accepted. If w starts with 0, in both cases we will go to state 1 and both computations will now be identical. Similarly, if w starts with 1, in both cases we will go to state 0 and both computations will be identical. So no matter what w is, either we accept w in both cases or we reject w in both cases. Intuitively, from the point of view of the "future", it does not matter whether we start from state 0 or state 2. Therefore the transitions into state 0 can be redirected into state 2 and vice versa, without changing the outcome of the computation. Alternatively, we can combine states 0 and 2 into one state.
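Both ideas, dropping unreachable states and merging states with the same "future", can be sketched as follows. The figures are not reproduced in these notes, so the DFA below is a hypothetical one that is only consistent with the text: states 0 and 2 both go to state 1 on 0 and to state 0 on 1, and neither is final. The arcs out of state 1, the choice of state 1 as final, and the extra unreachable state 3 are assumptions. The merging step uses simple partition refinement, which is one of several known ways to find equivalent states.

```python
def reachable(alphabet, delta, start):
    """States that can be visited from the start state."""
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for x in alphabet:
            r = delta[(q, x)]
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def equivalence_classes(states, alphabet, delta, finals):
    """Start from {final, nonfinal} and split blocks until the states in a
    block agree, symbol by symbol, on which block they move to."""
    block = {q: (q in finals) for q in states}
    while True:
        signature = {q: (block[q],) + tuple(block[delta[(q, x)]] for x in alphabet)
                     for q in states}
        if len(set(signature.values())) == len(set(block.values())):
            break  # no block was split, so the partition is stable
        block = signature
    classes = {}
    for q in states:
        classes.setdefault(signature[q], set()).add(q)
    return sorted(sorted(c) for c in classes.values())


# Hypothetical DFA consistent with the second example; state 3 is unreachable.
ALPHABET = "01"
DELTA = {
    (0, "0"): 1, (0, "1"): 0,
    (1, "0"): 1, (1, "1"): 2,   # arcs out of state 1 are assumed
    (2, "0"): 1, (2, "1"): 0,
    (3, "0"): 0, (3, "1"): 3,   # extra unreachable state, assumed
}
live = reachable(ALPHABET, DELTA, 0)
print(sorted(live), equivalence_classes(live, ALPHABET, DELTA, {1}))
```

Each resulting class becomes one state of the smaller DFA; here states 0 and 2 end up in the same class, as argued above.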


Context Free Grammar

Regular languages are generally used to describe how letters of the alphabet form meaningful tokens in a language. However, they are not powerful enough to describe the structure of programming languages. Recall the a^n b^n example: regular languages cannot describe the nested constructs that are often found in programming languages.

The class of languages that is slightly more powerful than regular languages is called context-free languages. The corresponding notation to describe the syntax of a context-free language is called the context-free grammar.

Grammar Introduction

A grammar describes a language. A grammar describes the legal words and sentences allowed in a language. A grammar consists of rules for generating legal words and sentences in the language. The generation process is usually depicted as a tree. The term “parse tree” or “derivation tree” is often used to describe the process of using the grammar to determine if a word or a sentence is valid for a given language.

A grammar is a specification for a language that consists of the following components:

G = (V,T,P,S)

• T is set of terminals (lexicon)

• V is set of non-terminals

• S is start symbol (one of the nonterminals)

• P is the set of rules/productions of the form X -> α, where X is a nonterminal and α is a sequence of terminals and nonterminals (may be empty).

• A grammar G generates a language L.

A production rule has the following format:

Left-hand side -> right-hand side

Production rules are like substitution rules. Replace the left-hand side with the right-hand side. In terms of the tree, the left-hand side is a node in the tree and right-hand side are the children of that node. Production rules will be applied continuously until there are no more non-terminals in the tree.

For a context-free grammar, there are some restrictions as to what can be on the left-hand side and what can be on the right-hand side:

1. Left-hand side of the productions must be a non-terminal


2. There is no restriction on the right-hand side of the productions. They can be any combination of terminals and non-terminals.

Context-Free vs. Regular languages

A language is said to be context-free if and only if there is a context-free grammar for the language. Every regular grammar is context-free, so a regular language is also a context-free language.

Example 1:

L = {w w^R}, where w^R is the reversal of w

L is a language that contains all even-length words which read the same forwards and backwards: each word is some string w followed by its own reversal.

To describe L using a context-free grammar, we must identify the following:

1. Letters of the alphabet – {a, b} – also called terminals.

2. Start symbol – S

3. Non-terminals – For this particular problem, we only need 1 nonterminal - S

4. Production rules:

S -> λ

S-> aSa

S-> bSb

Notice that there are 3 production rules. The context-free grammar is inherently nondeterministic. The start symbol S has 3 choices. Notice also that context-free grammar is inherently recursive. The symbol S also appears on the right-hand side of a production rule. We can apply one of the 3 rules for S recursively and continuously until the desired word is generated or enough to determine that the word is not in the language. The latter part is more difficult. Often it’s easier to tell if something is in the language than to tell if something is not in the language.

Using the grammar specified, we can start generating some words and use it to test if a certain word is in the language.

Word = λ (empty string)

Use rule #1: S -> λ and generate a tree for the empty string

S
|
λ

λ contains no nonterminals, so there is nothing else to do. Based on this tree, we know that λ is in the language L.

Word = aa

The word “aa” is a palindrome. It should be recognized by our grammar. We would need to use 2 rules, Rule #2: S -> aSa and then Rule #1: S -> λ

    S
  / | \
 a  S  a
    |
    λ

The word “aa” is formed by concatenating all the leaves of the tree together: “a λ a”, which is just “aa”.
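For this particular grammar the choice of rule is forced by the outermost letters of the word, so a derivation can be found mechanically. The sketch below is an illustration written for the three rules S -> λ, S -> aSa, S -> bSb only; it is not a general parsing method, and the function name is made up for this example.

```python
def derive(word):
    """Return the derivation of `word` from S as a list of sentential forms,
    or None if the grammar S -> λ | aSa | bSb cannot generate it."""
    left, right, middle = "", "", word
    steps = ["S"]
    while middle:
        same_ends = len(middle) >= 2 and middle[0] == middle[-1]
        if same_ends and middle[0] in "ab":
            # Apply S -> aSa or S -> bSb, chosen by the outermost letter.
            left, right = left + middle[0], middle[-1] + right
            middle = middle[1:-1]
            steps.append(left + "S" + right)
        else:
            return None
    steps.append(left + right)  # finally apply S -> λ
    return steps


print(derive("aa"))    # ['S', 'aSa', 'aa']
print(derive("abba"))  # ['S', 'aSa', 'abSba', 'abba']
print(derive("aba"))   # None: odd-length palindromes are not generated
```

Note that "aba" reads the same in both directions but is rejected, because these rules only build words of even length.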

Example 2:

L2 = {a^n b^n : n ≥ 0}

Context-free grammar for L2:

S -> λ

S -> aSb

Example 3:

L3 = {ab(bbaa)^n bba(ba)^n : n ≥ 0}

Context-Free grammar for L3:

S -> abB

A -> aaBb

A -> λ

B -> bbAa

More on Grammars

Rules can be combined on one line using the bar (|)

For example:

S -> a

S -> bA

Can be written as:

S -> a | bA

The process of applying a production rule is also called a derivation. -> is called the derivation symbol. S -> a can be read as “S derives a”. A grammar is ambiguous if it is possible to get 2 or more parse trees for the same word.

Example: given the following grammar production rules:

S -> a S

S -> S a

S -> a

It’s easy to see that for the word “aaa”, there are several possible ways to construct the parse tree.

This grammar is therefore ambiguous.
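The number of parse trees can be counted directly. The sketch below is specific to the three rules S -> a S, S -> S a, S -> a and counts trees recursively: a word is either the single letter a, or it starts with an a followed by a word derived from S, or it ends with an a preceded by a word derived from S. The function name is illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(word):
    """Number of parse trees deriving `word` from S with S -> aS | Sa | a."""
    if word == "a":
        return 1                           # S -> a
    total = 0
    if len(word) > 1 and word[0] == "a":
        total += count_trees(word[1:])     # S -> a S
    if len(word) > 1 and word[-1] == "a":
        total += count_trees(word[:-1])    # S -> S a
    return total


print(count_trees("aaa"))  # 4
```

Any count greater than 1 for some word is enough to show that the grammar is ambiguous.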

Leftmost and rightmost derivations

Example: G = ({S, A, B}, {a, b}, S, P)

P: S → AB (1)

A → aaA (2)

A → λ (3)

B → Bb (4)

B → λ (5)

Leftmost derivation (rules 1, 2, 3, 4, 5):

S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab

Mixed derivation (rules 1, 4, 2, 5, 3):

S ⇒ AB ⇒ ABb ⇒ aaABb ⇒ aaAb ⇒ aab

Leftmost derivation:

Always replace leftmost variable. (Replace one variable per step)

Rightmost derivation:

Always replace rightmost variable

S → aAB, A → bBb, B → A | λ

Leftmost:

S ==> aAB ==> abBbB ==> abAbB ==> abbBbbB ==> abbbbB ==> abbbb

Rightmost:

S ==> aAB ==> aA ==> abBb ==> abAb ==> abbBbb ==> abbbb
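The two derivations can be replayed step by step. The sketch below assumes that nonterminals are single uppercase letters and that each step is given as a (left-hand side, right-hand side) pair, with the empty string standing for λ. It checks that each step really rewrites the leftmost (or rightmost) variable, but it does not check that the production belongs to the grammar.

```python
def derive(start, steps, leftmost=True):
    """Apply productions in order, always rewriting the leftmost (or rightmost)
    nonterminal; return the list of sentential forms."""
    form, history = start, [start]
    for lhs, rhs in steps:
        positions = [i for i, ch in enumerate(form) if ch.isupper()]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            side = "leftmost" if leftmost else "rightmost"
            raise ValueError(f"{lhs} is not the {side} variable in {form}")
        form = form[:i] + rhs + form[i + 1:]
        history.append(form)
    return history


# Grammar: S -> aAB, A -> bBb, B -> A | λ
LEFT = [("S", "aAB"), ("A", "bBb"), ("B", "A"), ("A", "bBb"), ("B", ""), ("B", "")]
RIGHT = [("S", "aAB"), ("B", ""), ("A", "bBb"), ("B", "A"), ("A", "bBb"), ("B", "")]

print(" ==> ".join(derive("S", LEFT, leftmost=True)))
print(" ==> ".join(derive("S", RIGHT, leftmost=False)))
```

Both runs end in abbbb, but they pass through different sentential forms.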

Derivation Tree

- Ordered tree

- Nodes labeled with left side of productions

- Children of a node represent corresponding production right sides.

- Derivation trees are associated with a particular word in the language defined by the given grammar.

There are two ways to use a grammar:

Use the grammar to generate strings of the language. This is easy -- start with the start symbol, and apply derivation steps until you get a string composed entirely of terminals.

Use the grammar to recognize strings; that is, test whether they belong to the language. For CFGs, this is usually much harder.

A language is a set of strings, and any well-defined set must have a membership criterion. A context-free grammar can be used as a membership criterion -- if we can find a general algorithm for using the grammar to recognize strings.

Parsing a string is finding a derivation (or a derivation tree) for that string.

Parsing a string is like recognizing a string. An algorithm to recognize a string will give us only a yes/no answer; an algorithm to parse a string will give us additional information about how the string can be formed from the grammar.

Generally speaking, the only realistic way to recognize a string of a context-free grammar is to parse it.

Ambiguous Grammar

A grammar is said to be ambiguous if there is some string that it can generate in more than one way (i.e., the string has more than one parse tree or more than one leftmost derivation). A language is inherently ambiguous if it can only be generated by ambiguous grammars.

Example

The context free grammar

A → A + A | A − A | a

is ambiguous since there are two leftmost derivations for the string a + a + a:

First derivation        Second derivation
A ⇒ A + A               A ⇒ A + A
  ⇒ a + A                 ⇒ A + A + A
  ⇒ a + A + A             ⇒ a + A + A
  ⇒ a + a + A             ⇒ a + a + A
  ⇒ a + a + a             ⇒ a + a + a

As another example, the grammar is ambiguous since there are two parse trees for the string a + a − a.

The language that it generates, however, is not inherently ambiguous; the following is a non-ambiguous grammar generating the same language:

A → A + a | A − a | a

Simplification of Context free Grammar

Elimination of ε-Productions

If some CFL contains the empty word ε, then the CFG must have an ε-production.

However, if a CFG has an ε-production, then the CFL does not necessarily contain ε; e.g.,

S → aX

X → ε

which defines the CFL {a}.

Definition: In a given CFG, a nonterminal X is nullable if

1. There is a production X → ε, or

2. There is a derivation that starts at X and leads to ε:

X ⇒ . . . ⇒ ε, i.e., X ⇒* ε.

For any language L, define the language L0 as follows:

1. if ε ∉ L, then L0 is the entire language L, i.e., L0 = L.

2. if ε ∈ L, then L0 is the language L − {ε}; i.e., if we let T = {ε}, then L0 = L ∩ T′, so L0 is all words in L except ε.

If L is a CFL generated by a CFG G1 that includes ε-productions, then there is another CFG G2 with no ε-productions that generates L0.

Procedure:

- We give a constructive algorithm to convert a CFG G1 with ε-productions into a CFG G2 with no ε-productions that generates L0:

- Delete all ε-productions.

- For each production

X → something

with at least one nullable nonterminal on the right-hand side, do the following for each possible nonempty subset of nullable nonterminals on the RHS:

(a) create a new production

X → new something

where the new RHS is the same as the old RHS except with the entire current subset of nullable nonterminals removed.

(b) do not create the production

X → ε
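The procedure can be sketched in Python. The representation is an assumption made for this example: a grammar is a dictionary from each nonterminal to a set of right-hand sides, each right-hand side is a tuple of symbols, and the empty tuple stands for ε. Subsets are taken over the positions of nullable nonterminals, so repeated occurrences are handled separately.

```python
from itertools import combinations


def nullable_symbols(grammar):
    """Nonterminals X with X =>* ε, found by repeating until nothing changes."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for lhs, bodies in grammar.items():
            if lhs not in nullable and any(all(s in nullable for s in body)
                                           for body in bodies):
                nullable.add(lhs)
                changed = True
    return nullable


def remove_epsilon(grammar):
    nullable = nullable_symbols(grammar)
    result = {lhs: set() for lhs in grammar}
    for lhs, bodies in grammar.items():
        for body in bodies:
            spots = [i for i, s in enumerate(body) if s in nullable]
            # k = 0 keeps the original body; larger k drops nullable occurrences.
            for k in range(len(spots) + 1):
                for drop in combinations(spots, k):
                    new_body = tuple(s for i, s in enumerate(body) if i not in drop)
                    if new_body:  # never create X -> ε
                        result[lhs].add(new_body)
    return result


# The example from the text: S -> aX, X -> ε
G1 = {"S": {("a", "X")}, "X": {()}}
print(remove_epsilon(G1))
```

For the example grammar the result keeps S → aX and adds S → a, while X is left with no productions; the generated language is still {a}.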

Eliminating Unit Productions

A unit production is a production of the form

one nonterminal → one nonterminal

If a language L is generated by a CFG G1 that has no ε-productions, then there is also a CFG G2 for L with no ε-productions and no unit productions.

Use the following rules to create the new CFG:

• For each pair of nonterminals A and B such that there is a production

A → B

or a chain of productions (unit derivation)

A ⇒* B,

introduce the following new productions:

– if the non-unit productions from B are

B → s1 | s2 | . . . | sn

where the si ∈ (Σ + N)* are strings of terminals and nonterminals, then create the new productions

A → s1 | s2 | . . . | sn

– Do the same for all such pairs A and B simultaneously.

– Remove all unit productions.

• Can show that G1 and G2 generate the same language.
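A compact sketch of these rules follows. It assumes the same kind of representation as a dictionary from nonterminals to sets of right-hand-side tuples, and it assumes the grammar has no ε-productions. The example grammar is hypothetical.

```python
def remove_unit(grammar):
    """grammar: nonterminal -> set of RHS tuples, with no ε-productions."""
    nonterminals = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in nonterminals

    result = {}
    for a in grammar:
        # All B with A =>* B using unit productions only (A itself included).
        reach, stack = {a}, [a]
        while stack:
            x = stack.pop()
            for body in grammar[x]:
                if is_unit(body) and body[0] not in reach:
                    reach.add(body[0])
                    stack.append(body[0])
        # A inherits every non-unit production of every such B.
        result[a] = {body for b in reach for body in grammar[b] if not is_unit(body)}
    return result


# Hypothetical grammar: S -> A | ab, A -> B | a, B -> bb
G1 = {
    "S": {("A",), ("a", "b")},
    "A": {("B",), ("a",)},
    "B": {("b", "b")},
}
print(remove_unit(G1))
```

Here S ends up with the productions ab, a, and bb, so the unit chain S ⇒ A ⇒ B is no longer needed.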

Chomsky Normal Form

Definition: A CFG is in Chomsky Normal Form (CNF) if each of its productions has one of the two forms:

1. Nonterminal → string of exactly two Nonterminals

2. Nonterminal → one terminal

For any CFL L, the non-ε words of L can be generated by a CFG in CNF.

Procedure:

• For each production of the form

Nonterminal → string of Nonterminals

we expand it into a collection of productions as follows:

Suppose we have the production

X4 → X2X5X3X2X1

Replace the production with the new productions

X4 → X2R1

R1 → X5R2

R2 → X3R3

R3 → X2X1

where the Ri are new nonterminals.

For each transformation of original productions, introduce new nonterminals Ri.

• This transformation creates a new CFG in CNF.

• Now we have to show that the language generated by the new CFG is the same as that generated by the original CFG.

• First show that any word that can be generated by the original CFG can also be generated by the new CFG:

In any derivation of a word using the original CFG, we just replace any production of the form X4 → X2X5X3X2X1 with the new productions

X4 → X2R1

R1 → X5R2

R2 → X3R3

R3 → X2X1

This gives us a derivation of the word using the new CFG.

Example: CFG

S → abSba | bX1aX2 | bb

X1 → aa | aSX1b

X2 → X1a | abb

can be transformed into the new CFG

S → ABSBA | BX1AX2 | BB

X1 → AA | ASX1B

X2 → X1A | ABB

A → a

B → b

which can then be transformed into a CFG in CNF:

S → AR1

R1 → BR2

R2 → SR3

R3 → BA

S → BR4

R4 → X1R5

R5 → AX2

S → BB

X1 → AA

X1 → AR6

R6 → SR7

R7 → X1B

X2 → X1A

X2 → AR8

R8 → BB

A → a

B → b
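The step that splits a long right-hand side into a chain of two-symbol productions is mechanical, and can be sketched as below. The sketch assumes terminals have already been replaced by nonterminals such as A and B, as in the intermediate grammar of the example above. Productions are written as (left side, tuple of symbols), and the names R1, R2, ... come from a simple counter.

```python
from itertools import count


def binarize(lhs, body, fresh):
    """Split lhs -> X1 X2 ... Xn (n >= 2, all nonterminals) into a chain of
    productions with exactly two nonterminals on each right-hand side."""
    productions = []
    while len(body) > 2:
        r = next(fresh)                        # new nonterminal Ri
        productions.append((lhs, (body[0], r)))
        lhs, body = r, body[1:]
    productions.append((lhs, tuple(body)))
    return productions


# Intermediate grammar from the example (terminals already replaced by A and B).
GRAMMAR = [
    ("S", ("A", "B", "S", "B", "A")), ("S", ("B", "X1", "A", "X2")), ("S", ("B", "B")),
    ("X1", ("A", "A")), ("X1", ("A", "S", "X1", "B")),
    ("X2", ("X1", "A")), ("X2", ("A", "B", "B")),
]
fresh = (f"R{i}" for i in count(1))
cnf = [p for lhs, body in GRAMMAR for p in binarize(lhs, body, fresh)]
for lhs, body in cnf:
    print(lhs, "→", "".join(body))
```

Processing the productions in the order listed yields the same R1 through R8 as the example; adding A → a and B → b completes the CNF grammar.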

Greibach Normal Form

Here we put a restriction not on the length of the right sides of productions, but on the positions in which terminals and variables appear.

Definition: A context-free grammar is said to be in Greibach Normal Form if all productions have the form

A → ax,

where a ∈ T and x ∈ V*.

A grammar can often be converted to GNF by substitution, as in the following examples.

Example:

CFG: S → AB

A → aA | bB | b

B → b

GNF: S → aAB | bBB | bB

A → aA | bB | b

B → b

Example

Convert the grammar S → abSb | aa to GNF.

GNF is: S → aBSB | aA

A → a

B → b

For any CFL L, the non-ε words of L can be generated by a CFG in GNF.


Post's Correspondence Problem (PCP)

Definition:

Given an alphabet Σ, one instance of Post's correspondence problem of size s is a finite set of pairs of strings (g_i, h_i) (i = 1, ..., s; s ≥ 1) over the alphabet Σ. A solution of length n ≥ 1 to this instance is a sequence i_1 i_2 ... i_n of selections such that the strings g_i1 g_i2 ... g_in and h_i1 h_i2 ... h_in formed by concatenation are identical.

The width of a PCP instance is the length of the longest string among the g_i and h_i (i = 1, 2, ..., s). Pair i is the short name for pair (g_i, h_i), where g_i and h_i are the top string and bottom string of the pair respectively. Mostly, people are interested in the optimal solution, which has the shortest length over all possible solutions to an instance. The corresponding length is called the optimal length. We use the word hard or difficult to describe instances whose optimal lengths are very large. For simplicity, we restrict the alphabet Σ to {0, 1}; it is easy to transform other alphabets to an equivalent binary format.

To describe subclasses of Post’s Correspondence Problem, we use PCP[s] to represent the set of all PCP instances of size s, and PCP[s, w] the set of all PCP instances of size s and width w.

For convenience, we use a matrix of 2 rows and s columns to represent instances of PCP[s], where string g_i is located at (i, 1) and h_i at (i, 2). The following is the matrix representation of the instance {{100, 1}, {0, 100}, {1, 00}} in PCP[3,3]:

Pair     1     2     3
Top      100   0     1
Bottom   1     100   00

Let's consider the result of selecting pairs 1, 3, 1, 1, 3, 2, 2 in that order. The selections can be shown in the following table:

Pair     1     3     1     1     3     2     2
Top      100   1     100   100   1     0     0
Bottom   1     00    1     1     00    100   100

After the elimination of blanks and concatenation of the strings in the top and bottom separately, it turns into:

Top:    1001100100100
Bottom: 1001100100100

Now, the string in the top is identical to the one in the bottom; therefore, these selections form a solution to PCP (1). When all combinations of up to 7 selections of pairs are tested, this solution is thus proven to be the unique optimal solution to this instance.
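The check described above is easy to automate for a small instance. The sketch below verifies the given selection and then tries every sequence of up to 7 selections by brute force. The function names are illustrative, and exhaustive search is only practical for tiny sizes and short lengths; it is not a general method for PCP.

```python
from itertools import product

# The instance {{100, 1}, {0, 100}, {1, 00}}: pair number -> (top, bottom)
PAIRS = {1: ("100", "1"), 2: ("0", "100"), 3: ("1", "00")}


def concatenate(selection, pairs=PAIRS):
    top = "".join(pairs[i][0] for i in selection)
    bottom = "".join(pairs[i][1] for i in selection)
    return top, bottom


def is_solution(selection, pairs=PAIRS):
    top, bottom = concatenate(selection, pairs)
    return len(selection) >= 1 and top == bottom


def shortest_solutions(pairs=PAIRS, max_len=7):
    """All solutions of the smallest length not exceeding max_len."""
    for n in range(1, max_len + 1):
        found = [sel for sel in product(pairs, repeat=n) if is_solution(sel, pairs)]
        if found:
            return found
    return []


print(concatenate([1, 3, 1, 1, 3, 2, 2]))
print(shortest_solutions())
```

The search confirms that no selection of fewer than 7 pairs works and that 1, 3, 1, 1, 3, 2, 2 is the only solution of length 7.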

NP Completeness


Time Complexity of a Deterministic Turing Machine

The time complexity of a deterministic TM M is the maximum number of moves made by M in processing any input string of length n, taken as a function of n.

P Class

A language L is said to be in Class P if there exists a deterministic TM M such that M is of time complexity P(n) for some polynomial P and M accepts L.

Time Complexity of a Non Deterministic Turing Machine

A nondeterministic Turing Machine M is of time complexity τ(n) if for every accepted input string of length n there is some sequence of at most τ(n) moves leading to an accepting condition.

Class NP

A language L is said to be in Class NP if there is a nondeterministic Turing Machine M such that M is of time complexity P(n) for some polynomial P and M accepts L.

The definitions of P and NP read similarly, but there is a vast difference between them. When L is in P, the number of moves needed to test whether any input string of length n is in L is less than or equal to P(n). When L is in NP, the number of moves is bounded by P(n) only for strings accepted by M, and only along some sequence of moves.

NP Complete

A language L1 ⊆ Σ* is NP-complete if

i) L1 is in NP

ii) For every language L in NP, there is a polynomial time transformation from L to L1

NP Completeness

S. A. Cook introduced the concept of NP-completeness as a step towards resolving the P = NP question. A number of NP-complete problems are known in several areas. The definitions of P and NP can be extended to classes of problems in various fields such as propositional calculus, graph theory, and operations research.

Some of the examples of NP Complete Problems are


1. Travelling Salesman Problem

2. Zero One Programming Problem

3. Vertex Cover Problem

4. Hamilton Circuit Problem