30
Introduction to Automata Theory BBM401 - Automata Theory and Formal Languages 1

Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

  • Upload
    phamanh

  • View
    258

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Introduction to Automata Theory

BBM401 - Automata Theory and Formal Languages 1

Page 2: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Automata, Computability and Complexity

• Automata, Computability and Complexity are linked by the question:– “What are the fundamental capabilities and limitations of computers?”

• The theories of computability and complexity are closely related. – In complexity theory, the objective is to classify problems as easy ones and hard

ones,– In computability theory, the classification of problems is by those that are

solvable and those that are not. – Computability theory introduces several of the concepts used in complexity

theory.• Automata theory deals with the definitions and properties of mathematical models of

computation.– The finite automaton, is used in text processing, compilers, and hardware design.– The context–free grammar, is used in programming languages and artificial

intelligence.– Turing machines represent computable functions.

BBM401 - Automata Theory and Formal Languages 2

Page 3: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Why Study Automata?

• Automata theory is the core of computer science.• Automata theory presents many useful models for software and hardware.

– In compilers we use finite automatons for lexical analyzers, and push down automatons for parsers.

– In search engines, we use finite automatons in order to determine tokens in web pages.

– Finite automata model protocols, electronic circuits.– Context-free grammars are used to describe the syntax of essentially every

programming language.– Automata theory offers many useful models for natural language processing.

• When developing solutions to real problems, we often confront the limitations of what software can do.– Undecidable things – no program whatever can do it.– Intractable things – there are programs, but no fast programs.

BBM401 - Automata Theory and Formal Languages 3

Page 4: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

A Simple Finite Automaton – On/Off Switch

BBM401 - Automata Theory and Formal Languages 4

off onstart

push

push

• States are represented by circles.

•Accepting (final) states are representedby double circles.

• One of the states is a starting state.

•Arcs represent external influences (inputs).

Page 5: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

A Simple Finite Automaton – Recognizing A Word

BBM401 - Automata Theory and Formal Languages 5

istart i l

il ilyy ilya sa ilyas

Page 6: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Central Concepts of Automata Theory - Alphabets

• An alphabet is a finite, non empty set of symbols.• We use the symbol for an alphabet.

• = {0,1} - binary alphabet• = {a,b,c,…,z} - lowercase letters• The set of ASCII chracters is an alphabet.

BBM401 - Automata Theory and Formal Languages 6

Page 7: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Central Concepts of Automata Theory - Strings

• A string is a sequence of symbols chosen from some alphabet.• A string sometimes is called as word.

• 01101 is a string from the alphabet = {0,1}. – Some other strings: 11, 010, 1, 0

• The empty string, denoted as , is a string of zero occurrences of symbols.

• Length of string: number of symbols in the string– |ab| = 2 |b| = 1 || = 0

BBM401 - Automata Theory and Formal Languages 7

Page 8: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Central Concepts of Automata Theory - Strings

Powers of an alphabet:• If is an alphabet, the set of all strings of a certain length from the alphabet by using

an exponential notation.• k is the set of strings of length k from .• Let = {0,1}.

0 = {} 1 = {0,1} 2 = {00,01,10,11}• The set of all strings over an alphabet is denoted by *.

* = 0 ∪ 1 ∪ 2 ∪ …+ = 1 ∪ 2 ∪ … - set of nonempty strings

Concatenation of strings• If x and y are strings xy represents their concatenations.• If x=abc and y=de then xy = abcde

BBM401 - Automata Theory and Formal Languages 8

Page 9: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Central Concepts of Automata Theory - Languages

• A set of strings that are chosen from * is called as a language.• If is an alphabet, and L ⊆ * , then L is a language over .

• A language over may not include strings with all symbols of .• Some Languages:

– The language of all strings consisting of n 0’s followed by n 1’ for some n≥0 : {, 01, 0011, 000111, …}

– * is a language– Empty set is a language. The empty language is denoted by .– The set {} is a language, {} is not equal to the empty language.– The set of all identifers in a programming language is a language.– The set of all syntactically correct C programs is a language.– Turkish, English are lanuguages.

BBM401 - Automata Theory and Formal Languages 9

Page 10: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Set-Formers to Define Languages

• A set-former is a common way to define a languageSet-former: {w | something about w}

{w | w consists of equal number of 0’s and 1’s}{w | w is a binary integer that is prime}

Sometimes we replace w with an expression{0n1n | n≥1}{0i1j | 0 ≤ i ≤ j}

BBM401 - Automata Theory and Formal Languages 10

Page 11: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Language - Problem

• In automata theory, a problem is the question of deciding whether a given string is a member of a particular language.

• If is an alphabet, and L is a language over , then the problem L is: Given a string w in * , decide whether or not w is in L.

• In order to make decision requires some computational resources.– Deciding whether a given string is a correct C identifier– Deciding whether a given string is a syntactically correct C program.

• Some decision problems are simple, some others are harder.• A decision question may require exponential resources in the size of its

input.• A decision question may be unsolvable.

BBM401 - Automata Theory and Formal Languages 11

Page 12: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Formal Proofs

• When we study automata theory, we encounter theorems that we have to prove.

• There are different forms of proofs:– Deductive Proofs– Inductive Proofs– Proof by Contradiction– Proof by a counter example (disproof)

• To create a proof may NOT be so easy.

BBM401 - Automata Theory and Formal Languages 12

Page 13: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Deductive Proofs

• A deductive proof consists of a sequence of statement whose truth leads us from some initial statement (hypothesis or given statements) to a conclusion statement.

• Each step of a deductive proof MUST follow from a given fact or previous statements (or their combinations) by an accepted logical principle.

• The theorem that is proved when we go from a hyphothesis H to a conclusion C is the statement ’’if H then C’’. We say that C is deduced from H.

BBM401 - Automata Theory and Formal Languages 13

Page 14: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Deductive Proofs

• Assume that the following theorem (initial statement) is given:

• Given Thm. (initial statement): If x ≥ 4, then 2x ≥ x2

• Theorem to be proved:If x is the sum of the squares of four positive integers, then 2x ≥ x2

BBM401 - Automata Theory and Formal Languages 14

Page 15: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Proof of Theorem

1. x = a2 + b2 + c2 + d2 Given

2. a ≥ 1 b ≥ 1 c ≥ 1 d ≥ 1 Given

3. a2 ≥ 1 b2 ≥ 1 c2 ≥ 1 d2 ≥ 1 From (1) and principle of arithmetic

4. x ≥ 4 From (1), (3), and principle of arithmetic

5. 2x ≥ x2 From (4) and principle of arithmetic

BBM401 - Automata Theory and Formal Languages 15

Page 16: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

If-And-Only-If Statements

• Some times theorems contain if-and-only-if statements.• In this case we have to prove in both directions.

Theorem:• Let x be a real number. Then x = x if and only if x is an integer.

BBM401 - Automata Theory and Formal Languages 16

Page 17: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Proof of TheoremLet x be a real number. Then x = x if and only if x is an integer.

If-Part:• Given that x is an integer.• By definitions of ceiling and floor operations. x = x and x = x• Thus, x = x .

Only-If-Part:• Given that x = x• By definitions of ceiling and floor operations. x ≤ x and x ≥ x• Since given that x = x, x ≤ x and x ≥ x• By the proporties of arithmetic inequalities, x = x • Since x is always an integer, x MUST be integer too.

BBM401 - Automata Theory and Formal Languages 17

Page 18: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Inductive Proofs

• An inductive proof has three parts:– Basis case– Inductive hypothesis– Inductive step

• Base case can be one or more than one

BBM401 - Automata Theory and Formal Languages 18

Page 19: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Inductive Proofs

19

Theorem:

For all n>=1.

Proof #1: (by induction on n)

Basis:n = 1

1 = 1

2)1(

1

nnin

i

2)11(11

1

i

i

BBM401 - Automata Theory and Formal Languages

Page 20: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

20

Inductive hypothesis:Suppose that for some k>=1.

Inductive step:We will show that

by the inductive hypothesis

It follows that for all n>=1 �.

2)1(

1

kkik

i

2)2)(1(1

1

kkik

i

)1(1

1

1

kiik

i

k

i

)1(2

)1(

kkk

2)1(2)1(

kkk

2)2)(1(

kk

2)1(

1

nnin

i BBM401 - Automata Theory and Formal Languages

Page 21: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

21

Theorem:

For all n>=1.

Proof #2: (by induction on n)

Basis:n = 1

1 = 1

2)1(

1

nnin

i

2)11(11

1

i

i

BBM401 - Automata Theory and Formal Languages

Page 22: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

22

Inductive hypothesis:Suppose there exists a k>=1 such that that for all m, where 1 <= m <= k.

Inductive step:We will show that

by the inductive hypothesis

It follows that for all n>=1 �.

2)1(

1

mmim

i

2)2)(1(1

1

kkik

i

)1(1

1

1

kiik

i

k

i

)1(2

)1(

kkk

2)1(2)1(

kkk

2)2)(1(

kk

2)1(

1

nnin

i

BBM401 - Automata Theory and Formal Languages

Page 23: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

23

• What is the difference between proof #1 and proof #2?– The inductive hypothesis for proof #2 assumes more than that for proof #1.– Proof #1 is sometimes called weak induction– Proof #2 is sometimes called strong induction

• Many times weak induction is sufficient, but other times strong induction is more convenient.

BBM401 - Automata Theory and Formal Languages

Page 24: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

24

Definition #1:Let T=(V,E) be a tree with |V|>=1. Then T is a strict binary tree if and only if each vertex in T is either a leaf or has exactly two children.

Definition #2: (recursive)1. A single vertex is a strict binary tree.2. Let T1=(V1,E1) and T2=(V2,E2) be strict binary trees with roots r1 and r2,

respectively, where V1 and V2 do not intersect. In addition, let r be a new vertex that is not in V1 or V2. Finally, let:

Then T3=(V3,E3) is a strict binary tree.3. Nothing else is a strict binary tree.

Strict Binary Trees

)}2,(),1,{(213}{213

rrrrEEErVVV

BBM401 - Automata Theory and Formal Languages

Page 25: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

25

Theorem:Let L(T) denote the number of leaves in a strict binary tree T=(V,E). Then

2*L(T) – 2 = |E|

Proof: (by induction on L(T))

Basis:L(T) = 1

Observation: The only strict binary tree with L(T) = 1 has a single vertex and 0 edges, i.e., |E| = 0.

Lets evaluate 2*L(T)-2 and see if we can show it is equal to |E|.

2*L(T) – 2 = 2 *1 – 2 Since L(T) = 1= 0

= |E| Since the observation tell us that |E| = 0

BBM401 - Automata Theory and Formal Languages

Page 26: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

26

Inductive hypothesis:Suppose there exists a k>=1 such that for every strict binary tree where 1<=L(T)<=k that 2*L(T) – 2 = |E|.

Inductive step:Let T=(V,E) be a strict binary tree where L(T)=k+1. We must show that 2*L(T) – 2 = |E|.

Notes about T:1. Since k>=1 and L(T)=k+1, it follows that L(T)>1. Therefore T consists of a

root r and two strict binary trees T1=(V1,E1) and T2=(V2,E2), where 1<=L(T1)<=k and 1<=L(T2)<=k.

2. L(T) = L(T1) + L(T2) by definition of T

3. |E| = |E1| + |E2| + 2 also by definition of T

Lets start with 2*L(T) - 2 and, using 1-3, see if we can show that it is equal to |E|.

BBM401 - Automata Theory and Formal Languages

Page 27: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

27

2 * L(T) – 2 = 2 * (L(T1) + L(T2)) – 2 by (2)= 2 * L(T1) + 2 * L(T2) – 2= (2 * L(T1) – 2) + (2 * L(T2) – 2) + 2

Since 1<=L(T1)<=k and 1<=L(T2)<=k from (1), it follows from the inductive hypothesis that:

2 * L(T1) – 2 = |E1| and2 * L(T2) – 2 = |E2|

Substituting these in the preceding equation gives:

= |E1| + |E2| + 2= |E| by (3)

Hence, for all strict binary trees, 2 * L(T) – 2 = |E|. �

Finally, note the use of strong induction in the above proof.BBM401 - Automata Theory and Formal Languages

Page 28: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Proving Equivalances About Sets

• In order to prove S = T, we have to prove that1. If x is a member of S, then x is also a member of T (S T), and2. If x is a member of T, than x is also a member of S (T S),

Theorem: R ( S T) = (R S) (R T)

We have to show that1. If x is in R ( S T) , than x is in (R S) (R T), and2. If x is in (R S) (R T), than x is in R ( S T)

BBM401 - Automata Theory and Formal Languages 28

Page 29: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Proof of R ( S T) = (R S) (R T)

BBM401 - Automata Theory and Formal Languages 29

If part of the proof of the theorem

Page 30: Introduction to Automata Theory - Hacettepe · Why Study Automata? • Automata theory is the core of computer science. • Automata theory presents many useful models for software

Proof of R ( S T) = (R S) (R T)

BBM401 - Automata Theory and Formal Languages 30

Then part of the proof of the theorem