Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Introduction to Automata Theory

BBM401 - Automata Theory and Formal Languages 1

Automata, Computability and Complexity

• Automata, Computability and Complexity are linked by the question:– “What are the fundamental capabilities and limitations of computers?”

• The theories of computability and complexity are closely related. – In complexity theory, the objective is to classify problems as easy ones and hard

ones,– In computability theory, the classification of problems is by those that are

solvable and those that are not. – Computability theory introduces several of the concepts used in complexity

theory.• Automata theory deals with the definitions and properties of mathematical models of

computation.– The finite automaton, is used in text processing, compilers, and hardware design.– The context–free grammar, is used in programming languages and artificial

intelligence.– Turing machines represent computable functions.


Why Study Automata?

• Automata theory is the core of computer science.• Automata theory presents many useful models for software and hardware.

– In compilers we use finite automatons for lexical analyzers, and push down automatons for parsers.

– In search engines, we use finite automatons in order to determine tokens in web pages.

– Finite automata model protocols, electronic circuits.– Context-free grammars are used to describe the syntax of essentially every

programming language.– Automata theory offers many useful models for natural language processing.

• When developing solutions to real problems, we often confront the limitations of what software can do.– Undecidable things – no program whatever can do it.– Intractable things – there are programs, but no fast programs.


A Simple Finite Automaton – On/Off Switch


off onstart

push

push

• States are represented by circles.

•Accepting (final) states are representedby double circles.

• One of the states is a starting state.

•Arcs represent external influences (inputs).

A Simple Finite Automaton – Recognizing A Word


istart i l

il ilyy ilya sa ilyas

Central Concepts of Automata Theory - Alphabets

• An alphabet is a finite, non empty set of symbols.• We use the symbol for an alphabet.

• = {0,1} - binary alphabet• = {a,b,c,…,z} - lowercase letters• The set of ASCII chracters is an alphabet.


Central Concepts of Automata Theory - Strings

• A string is a sequence of symbols chosen from some alphabet.• A string sometimes is called as word.

• 01101 is a string from the alphabet = {0,1}. – Some other strings: 11, 010, 1, 0

• The empty string, denoted as , is a string of zero occurrences of symbols.

• Length of string: number of symbols in the string– |ab| = 2 |b| = 1 || = 0


Central Concepts of Automata Theory - Strings

Powers of an alphabet:• If is an alphabet, the set of all strings of a certain length from the alphabet by using

an exponential notation.• k is the set of strings of length k from .• Let = {0,1}.

0 = {} 1 = {0,1} 2 = {00,01,10,11}• The set of all strings over an alphabet is denoted by *.

* = 0 ∪ 1 ∪ 2 ∪ …+ = 1 ∪ 2 ∪ … - set of nonempty strings

Concatenation of strings• If x and y are strings xy represents their concatenations.• If x=abc and y=de then xy = abcde


Central Concepts of Automata Theory - Languages

• A set of strings that are chosen from * is called as a language.• If is an alphabet, and L ⊆ * , then L is a language over .

• A language over may not include strings with all symbols of .• Some Languages:

– The language of all strings consisting of n 0’s followed by n 1’ for some n≥0 : {, 01, 0011, 000111, …}

– * is a language– Empty set is a language. The empty language is denoted by .– The set {} is a language, {} is not equal to the empty language.– The set of all identifers in a programming language is a language.– The set of all syntactically correct C programs is a language.– Turkish, English are lanuguages.


Set-Formers to Define Languages

• A set-former is a common way to define a languageSet-former: {w | something about w}

{w | w consists of equal number of 0’s and 1’s}{w | w is a binary integer that is prime}

Sometimes we replace w with an expression{0n1n | n≥1}{0i1j | 0 ≤ i ≤ j}


Language - Problem

• In automata theory, a problem is the question of deciding whether a given string is a member of a particular language.

• If is an alphabet, and L is a language over , then the problem L is: Given a string w in * , decide whether or not w is in L.

• In order to make decision requires some computational resources.– Deciding whether a given string is a correct C identifier– Deciding whether a given string is a syntactically correct C program.

• Some decision problems are simple, some others are harder.• A decision question may require exponential resources in the size of its

input.• A decision question may be unsolvable.


Formal Proofs

• When we study automata theory, we encounter theorems that we have to prove.

• There are different forms of proofs:– Deductive Proofs– Inductive Proofs– Proof by Contradiction– Proof by a counter example (disproof)

• To create a proof may NOT be so easy.


Deductive Proofs

• A deductive proof consists of a sequence of statement whose truth leads us from some initial statement (hypothesis or given statements) to a conclusion statement.

• Each step of a deductive proof MUST follow from a given fact or previous statements (or their combinations) by an accepted logical principle.

• The theorem that is proved when we go from a hyphothesis H to a conclusion C is the statement ’’if H then C’’. We say that C is deduced from H.


Deductive Proofs

• Assume that the following theorem (initial statement) is given:

• Given Thm. (initial statement): If x ≥ 4, then 2x ≥ x2

• Theorem to be proved:If x is the sum of the squares of four positive integers, then 2x ≥ x2


Proof of Theorem

1. x = a2 + b2 + c2 + d2 Given

2. a ≥ 1 b ≥ 1 c ≥ 1 d ≥ 1 Given

3. a2 ≥ 1 b2 ≥ 1 c2 ≥ 1 d2 ≥ 1 From (1) and principle of arithmetic

4. x ≥ 4 From (1), (3), and principle of arithmetic

5. 2x ≥ x2 From (4) and principle of arithmetic


If-And-Only-If Statements

• Some times theorems contain if-and-only-if statements.• In this case we have to prove in both directions.

Theorem:• Let x be a real number. Then x = x if and only if x is an integer.


Proof of TheoremLet x be a real number. Then x = x if and only if x is an integer.

If-Part:• Given that x is an integer.• By definitions of ceiling and floor operations. x = x and x = x• Thus, x = x .

Only-If-Part:• Given that x = x• By definitions of ceiling and floor operations. x ≤ x and x ≥ x• Since given that x = x, x ≤ x and x ≥ x• By the proporties of arithmetic inequalities, x = x • Since x is always an integer, x MUST be integer too.


Inductive Proofs

• An inductive proof has three parts:– Basis case– Inductive hypothesis– Inductive step

• Base case can be one or more than one


Inductive Proofs

19

Theorem:

For all n>=1.

Proof #1: (by induction on n)

Basis:n = 1

1 = 1

2)1(

1

nnin

i

2)11(11

1

i

i

BBM401 - Automata Theory and Formal Languages

20

Inductive hypothesis:Suppose that for some k>=1.

Inductive step:We will show that

by the inductive hypothesis

It follows that for all n>=1 �.

2)1(

1

kkik

i

2)2)(1(1

1

kkik

i

)1(1

1

1

kiik

i

k

i

)1(2

)1(

kkk

2)1(2)1(

kkk

2)2)(1(

kk

2)1(

1

nnin

i BBM401 - Automata Theory and Formal Languages

21

Theorem:

For all n>=1.

Proof #2: (by induction on n)

Basis:n = 1

1 = 1

2)1(

1

nnin

i

2)11(11

1

i

i


22

Inductive hypothesis:Suppose there exists a k>=1 such that that for all m, where 1 <= m <= k.

Inductive step:We will show that

by the inductive hypothesis

It follows that for all n>=1 �.

2)1(

1

mmim

i

2)2)(1(1

1

kkik

i

)1(1

1

1

kiik

i

k

i

)1(2

)1(

kkk

2)1(2)1(

kkk

2)2)(1(

kk

2)1(

1

nnin

i


23

• What is the difference between proof #1 and proof #2?– The inductive hypothesis for proof #2 assumes more than that for proof #1.– Proof #1 is sometimes called weak induction– Proof #2 is sometimes called strong induction

• Many times weak induction is sufficient, but other times strong induction is more convenient.


24

Definition #1:Let T=(V,E) be a tree with |V|>=1. Then T is a strict binary tree if and only if each vertex in T is either a leaf or has exactly two children.

Definition #2: (recursive)1. A single vertex is a strict binary tree.2. Let T1=(V1,E1) and T2=(V2,E2) be strict binary trees with roots r1 and r2,

respectively, where V1 and V2 do not intersect. In addition, let r be a new vertex that is not in V1 or V2. Finally, let:

Then T3=(V3,E3) is a strict binary tree.3. Nothing else is a strict binary tree.

Strict Binary Trees

)}2,(),1,{(213}{213

rrrrEEErVVV


25

Theorem:Let L(T) denote the number of leaves in a strict binary tree T=(V,E). Then

2*L(T) – 2 = |E|

Proof: (by induction on L(T))

Basis:L(T) = 1

Observation: The only strict binary tree with L(T) = 1 has a single vertex and 0 edges, i.e., |E| = 0.

Lets evaluate 2*L(T)-2 and see if we can show it is equal to |E|.

2*L(T) – 2 = 2 *1 – 2 Since L(T) = 1= 0

= |E| Since the observation tell us that |E| = 0


26

Inductive hypothesis:Suppose there exists a k>=1 such that for every strict binary tree where 1<=L(T)<=k that 2*L(T) – 2 = |E|.

Inductive step:Let T=(V,E) be a strict binary tree where L(T)=k+1. We must show that 2*L(T) – 2 = |E|.

Notes about T:1. Since k>=1 and L(T)=k+1, it follows that L(T)>1. Therefore T consists of a

root r and two strict binary trees T1=(V1,E1) and T2=(V2,E2), where 1<=L(T1)<=k and 1<=L(T2)<=k.

2. L(T) = L(T1) + L(T2) by definition of T

3. |E| = |E1| + |E2| + 2 also by definition of T

Lets start with 2*L(T) - 2 and, using 1-3, see if we can show that it is equal to |E|.


27

2 * L(T) – 2 = 2 * (L(T1) + L(T2)) – 2 by (2)= 2 * L(T1) + 2 * L(T2) – 2= (2 * L(T1) – 2) + (2 * L(T2) – 2) + 2

Since 1<=L(T1)<=k and 1<=L(T2)<=k from (1), it follows from the inductive hypothesis that:

2 * L(T1) – 2 = |E1| and2 * L(T2) – 2 = |E2|

Substituting these in the preceding equation gives:

= |E1| + |E2| + 2= |E| by (3)

Hence, for all strict binary trees, 2 * L(T) – 2 = |E|. �

Finally, note the use of strong induction in the above proof.BBM401 - Automata Theory and Formal Languages

Proving Equivalances About Sets

• In order to prove S = T, we have to prove that1. If x is a member of S, then x is also a member of T (S T), and2. If x is a member of T, than x is also a member of S (T S),

Theorem: R ( S T) = (R S) (R T)

We have to show that1. If x is in R ( S T) , than x is in (R S) (R T), and2. If x is in (R S) (R T), than x is in R ( S T)


Proof of R ( S T) = (R S) (R T)


If part of the proof of the theorem

Proof of R ( S T) = (R S) (R T)


Then part of the proof of the theorem

Documents

Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software