LING 388: Language and Computers Sandiway Fong 9/27 Lecture 10

LING 388: Language and Computers

Sandiway Fong

9/27

Lecture 10

Administrivia

• Reminder
– Homework 4 due Wednesday

Today’s Topic

• Finite State Automata (FSA)

– equivalent to the regular expressions we’ve been studying

Regular Expressions: Example

... from lecture 8

• example (sheeptalk)
– baa!
– baaa!
– baaaa!
– …

• regular expression
– baaa*!
– baa+!

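[Diagram: FSA for sheeptalk. > s -b-> w -a-> x -a-> y -!-> z, with a loop labeled a on state y; > marks the start state s]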

Regular Expressions: Example

• step-by-step
• regular expression
– baaa*!

[Diagram: > s]

Start state: s

Regular Expressions: Example

• step-by-step
• regular expression
– baaa*!
– b
– from s, see ‘b’, move to w

[Diagram: > s -b-> w]

Regular Expressions: Example

• step-by-step
• regular expression
– baaa*!
– ba
– from w, see an ‘a’, move to x

[Diagram: > s -b-> w -a-> x]

Regular Expressions: Example

• step-by-step
• regular expression
– baaa*!
– baa
– from x, see an ‘a’, move to y

[Diagram: > s -b-> w -a-> x -a-> y]

Regular Expressions: Example

• step-by-step
• regular expression
– baaa*!
– baaa*: baa, baaa, baaaa, baaaaa, ...
– from y, see an ‘a’, move to ?

[Diagram: > s -b-> w -a-> x -a-> y -a-> y’ -a-> y” -a-> ...]

but the machine must have a finite number of states!

Regular Expressions: Example

• step-by-step
• regular expression
– baaa*!
– baaa*: baa, baaa, baaaa, baaaaa, ...
– from y, see an ‘a’, “loop” aka return to state y

[Diagram: > s -b-> w -a-> x -a-> y, with a loop labeled a on state y]

Regular Expressions: Example

• step-by-step
• regular expression
– baaa*!
– from y, see an ‘!’, move to final state z (indicated in red)

[Diagram: > s -b-> w -a-> x -a-> y -!-> z, with a loop labeled a on state y]

Note: the machine cannot finish successfully (i.e. reach the end of the input string) in states s, w, x or y; only z is a final state

Finite State Automata (FSA)

• construction
– the step-by-step FSA construction method we just used works for any regular expression

• conclusion
– anything we can encode with a regular expression, we can build an FSA for
– an important step in showing that FSAs and REs are equivalent

Microsoft Word Wildcards

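• basic wildcards
– ? and *
• ? any single character
• e.g. p?t matches put, pit, pat, pet
• * zero or more characters

[Diagram for ?: two states x and y, with one arc from x to y for each character (a, b, c, d, e, ..., z, etc.)]

[Diagram for *: a single state y with one loop for each character (a, etc.)]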

Microsoft Word Wildcards

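• basic wildcards
– @
• one or more of the preceding character
• e.g. a@
– [ ]
• range of characters
• e.g. [aeiou]

[Diagram for a@: x -a-> y, with a loop labeled a on state y]

[Diagram for [aeiou]: two states x and y, with one arc from x to y for each of a, e, i, o, u]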
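The two wildcard machines above can be sketched in the one-predicate-per-state Prolog style used later in this lecture. This is only a sketch under assumptions: the predicate names plus_x, plus_y, set_x and set_y are made up for illustration, y is assumed to be the final state in both machines, and the whole input list has to match (Word itself searches inside running text).

```prolog
:- use_module(library(lists)).

% a@ : one or more a's
% (assumed machine: x -a-> y, loop labeled a on y, y final)
plus_x([a|L]) :- plus_y(L).
plus_y([a|L]) :- plus_y(L).
plus_y([]).

% [aeiou] : exactly one character from the listed set
% (assumed machine: one arc x -> y per listed vowel, y final)
set_x([C|L]) :- memberchk(C, [a,e,i,o,u]), set_y(L).
set_y([]).
```

For example, ?- plus_x([a,a,a]). and ?- set_x([e]). succeed, while ?- plus_x([]). and ?- set_x([b]). fail.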

Microsoft Word Wildcards

• basic wildcards
– < >
• < beginning of a word
• can think of there being a special symbol/invisible character marking the beginning of each word
• > end of a word
• suppose there is an invisible character marking the end of each word

[Diagram for <: arc labeled < from x to y, plus a loop labeled “see anything but ‘<’”]

[Diagram for >: arc labeled > from x to y, plus a loop labeled “see anything but ‘>’”]

Microsoft Word Wildcards

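• basic wildcards
– < >
• > end of a word
– Note
• the see-anything-but loop is implicit
• m>
• “word that ends in m”
• example:
– mom is...

[Diagram for >: arc labeled > from x to y, plus a loop labeled “see anything but ‘>’”]

[Diagram for m>: arc labeled m from x to y, arc labeled > from y to z, plus a loop labeled “see anything but ‘m’”]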

Finite State Automata (FSA)

• more formally
– (Q, s, f, Σ, δ)
1. set of states (Q): {s,w,x,y,z} (5 states; must be a finite set)
2. start state (s): s
3. end state(s) (f): z
4. alphabet (Σ): {a, b, !}
5. transition function δ:
signature: character × state → state
• δ(b,s)=w
• δ(a,w)=x
• δ(a,x)=y
• δ(a,y)=y
• δ(!,y)=z

[Diagram: > s -b-> w -a-> x -a-> y -!-> z, with a loop labeled a on state y]
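One way to make the five components concrete is to write them down as Prolog facts and run the machine from a table. This is a sketch, not the encoding used in the course: the predicate names state, start, final, symbol, delta, accept and run are all assumptions made for illustration.

```prolog
% Q: the finite set of states
state(s). state(w). state(x). state(y). state(z).

% s: the start state
start(s).

% f: the end state(s)
final(z).

% Sigma: the alphabet
symbol(a). symbol(b). symbol('!').

% delta(Char, State, NextState), following the signature
% character x state -> state
delta(b,   s, w).
delta(a,   w, x).
delta(a,   x, y).
delta(a,   y, y).
delta('!', y, z).

% accept(+Input): run the machine from the start state
accept(Input) :- start(S), run(S, Input).

% run(+State, +Input): succeed if the input ends in a final state
run(S, [])     :- final(S).
run(S, [C|Cs]) :- delta(C, S, Next), run(Next, Cs).
```

With this table, ?- accept([b,a,a,'!']). succeeds and ?- accept([b,a,'!']). fails, because there is no transition for ! out of state x.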

Finite State Automata (FSA)

• in Prolog
– define one predicate for each state
• taking one argument (the input list L)
• consume input character (take the head of the list)
• call next state with the tail of the list
– rule
• fsa(L) :- s(L).
i.e. call start state s

Finite State Automata (FSA)

• state s: (start state)
– s([b|L]) :- w(L).
match input string beginning with b and call state w with remainder of input
• state w:
– w([a|L]) :- x(L).
• state x:
– x([a|L]) :- y(L).
• state y:
– y([a|L]) :- y(L).
– y([!|L]) :- z(L).
• state z: (end state)
– z([]).

[Diagram: > s -b-> w -a-> x -a-> y -!-> z, with a loop labeled a on state y]
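Collected into one file, the clauses above give a small runnable program. The wrapper sheeptalk/1 is not from the slides; it is a hypothetical convenience that assumes the built-in atom_chars/2 is available to split an atom into a list of characters.

```prolog
% entry point: call the start state
fsa(L) :- s(L).

% one predicate per state; each clause consumes one character
s([b|L])   :- w(L).
w([a|L])   :- x(L).
x([a|L])   :- y(L).
y([a|L])   :- y(L).      % loop on a
y(['!'|L]) :- z(L).      % '!' and ! are the same atom; the quotes are optional
z([]).                   % end state: succeed only when the input is used up

% sheeptalk(+Atom): hypothetical wrapper, e.g. sheeptalk('baaa!')
sheeptalk(Atom) :-
    atom_chars(Atom, Chars),
    fsa(Chars).
```

For example, ?- sheeptalk('baaa!'). succeeds and ?- sheeptalk('ba!'). fails.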

Finite State Automata (FSA)

• query
– ?- s([b,a,a,a,!]).

Database
s([b|L]) :- w(L).
w([a|L]) :- x(L).
x([a|L]) :- y(L).
y([a|L]) :- y(L).
y([!|L]) :- z(L).
z([]).

Remaining input at each state:
s: [b,a,a,a,!]
w: [a,a,a,!]
x: [a,a,!]
y: [a,!]
y: [!]
z: []

Finite State Automata (FSA)

• In which state does the following query fail?
– ?- s([b,a,b,a,!]).

Database
s([b|L]) :- w(L).
w([a|L]) :- x(L).
x([a|L]) :- y(L).
y([a|L]) :- y(L).
y([!|L]) :- z(L).
z([]).

Remaining input at each state:
s: [b,a,b,a,!]
w: [a,b,a,!]
x: [b,a,!]
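To check an answer to this kind of question mechanically, the machine can be rewritten so that it reports where it stops instead of just failing. The sketch below is an assumption-laden variant, not the course's code: arc/3 and stops_at/4 are hypothetical names, and the cut relies on the machine being deterministic.

```prolog
% arc(State, Char, NextState): the five arcs of the sheeptalk machine
arc(s, b,   w).
arc(w, a,   x).
arc(x, a,   y).
arc(y, a,   y).
arc(y, '!', z).

% stops_at(+State, +Input, -Last, -Rest):
% follow arcs for as long as one applies; Last is the state the
% machine is in when it stops, Rest is the input not yet consumed.
stops_at(State, [C|Cs], Last, Rest) :-
    arc(State, C, Next), !,
    stops_at(Next, Cs, Last, Rest).
stops_at(State, Input, State, Input).

% ?- stops_at(s, [b,a,b,a,'!'], Last, Rest).
% Last = x, Rest = [b,a,!]
```

An accepted string is one where the machine stops in z with nothing left over, e.g. ?- stops_at(s, [b,a,a,a,'!'], z, []).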

FSA

• Finite State Automata (FSA) have a limited amount of expressive power

• Let’s look at a modification to FSA and its effect on its power

String Transitions

– so far...
• all machines have had just a single character label on the arc
• so if we allow strings to label arcs
– do they endow the FSA with any more power?

• Answer: No
– because we can always convert a machine with string-transitions into one without

[Diagram: a single arc labeled abb is replaced by a chain of three arcs labeled a, b, b]

Finite State Automata (FSA)

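• equivalent

[Diagram: machine with 3 states using a string transition. > s -baa-> y -!-> z, with a loop labeled a on state y]

[Diagram: machine with 5 states. > s -b-> w -a-> x -a-> y -!-> z, with a loop labeled a on state y]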

Finite State Automata (FSA)

• equivalent

Database (machine with 5 states)
s([b|L]) :- w(L).
w([a|L]) :- x(L).
x([a|L]) :- y(L).
y([a|L]) :- y(L).
y(['!'|L]) :- z(L).
z([]).

Database (machine with the string transition baa)
s([b,a,a|L]) :- y(L).
y([a|L]) :- y(L).
y(['!'|L]) :- z(L).
z([]).
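Since both databases define s/1, they cannot be loaded together as written. A rough way to compare them is to rename the predicates and check that the two machines agree on sample inputs. The suffixes 5 and 3 and the helper agree/1 are assumptions introduced here; agreeing on a few samples illustrates the equivalence but does not prove it.

```prolog
% machine with 5 states (predicates renamed with suffix 5)
s5([b|L])   :- w5(L).
w5([a|L])   :- x5(L).
x5([a|L])   :- y5(L).
y5([a|L])   :- y5(L).
y5(['!'|L]) :- z5(L).
z5([]).

% machine with the string transition baa (suffix 3)
s3([b,a,a|L]) :- y3(L).
y3([a|L])     :- y3(L).
y3(['!'|L])   :- z3(L).
z3([]).

% agree(+Input): both machines accept, or both reject
agree(Input) :-
    ( s5(Input) -> R5 = yes ; R5 = no ),
    ( s3(Input) -> R3 = yes ; R3 = no ),
    R5 == R3.
```

For example, ?- agree([b,a,a,a,'!']). (both accept) and ?- agree([b,a,'!']). (both reject) succeed.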

Empty Transitions

– so far...
• how about allowing the empty character?
– i.e. go from x to y without seeing an input character
– does this endow the FSA with any more power?

• Answer: No
– because we can always convert a machine with empty transitions into one without

[Diagram: x -ε-> y]

Empty Transitions

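• example
– (ab)|b

[Diagram: a machine for (ab)|b that uses an ε-transition, shown next to a machine without one; arcs are labeled a, b and ε]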

Empty Transitions

• example
– (ab)|(empty string)

[Diagram: a machine for (ab)|(empty string) with arcs labeled a, b and ε; the legend marks the final state]
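In the one-predicate-per-state Prolog encoding, an ε-transition is simply a clause that calls the next state without taking anything off the input list. The sketch below does this for (ab)|(empty string). The state layout is an assumption, since the slide's diagram may be arranged differently: 1 -a-> 2, 2 -b-> 3, 1 -ε-> 3, with 3 final. The predicate names e1..e3 and n1..n3 are made up for illustration.

```prolog
% machine with an ε-transition
e1([a|L]) :- e2(L).
e1(L)     :- e3(L).      % ε-transition: move to state 3, consume nothing
e2([b|L]) :- e3(L).
e3([]).                  % final state

% equivalent machine without the ε-transition:
% state 1 becomes a final state itself
n1([a|L]) :- n2(L).
n1([]).
n2([b|L]) :- n3(L).
n3([]).
```

Both versions accept [] and [a,b] and reject inputs such as [a] or [b].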

NDFSA

• Basic FSA
– deterministic
• it’s always clear which state we’re in, or
• deterministic = no choice point

• NDFSA
– ND = non-deterministic
• i.e. we could be in more than one state
• non-deterministic choice point
– example:
• initially, either in state 1 or 2

[Diagram: example machines, including one with states 1, 2 and 3 and an ε-transition, so that the machine is initially in either state 1 or state 2]

NDFSA

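• more generally
– non-determinism can be had not just with ε-transitions but with any symbol
• example:
– given a, we can proceed to either state 2 or 3

[Diagram: start state 1 with two arcs labeled a, one to state 2 and one to state 3, plus an arc labeled b leading to state 3]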
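In Prolog, a non-deterministic choice point is just two clauses for the same state and the same symbol; backtracking tries each in turn. The sketch below encodes the example under two assumptions that the diagram does not settle here: the b arc runs from state 2 to state 3, and state 3 is the final state. The predicate names nd1, nd2 and nd3 are hypothetical.

```prolog
% state 1: given a, proceed to either state 2 or state 3
nd1([a|L]) :- nd2(L).    % first choice
nd1([a|L]) :- nd3(L).    % second choice, tried on backtracking

% state 2: b leads to state 3 (assumed)
nd2([b|L]) :- nd3(L).

% state 3: final state (assumed)
nd3([]).
```

With these assumptions, ?- nd1([a,b]). succeeds through state 2, and ?- nd1([a]). succeeds only after the first choice fails and Prolog backtracks into the second clause.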

NDFSA

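• NDFSA
– are they more powerful than FSA?
– similar question asked earlier for ε-transitions
– Answer: No
– We can always convert an NDFSA into an FSA

• example
– (set of states)

[Diagram: the NDFSA with states 1, 2 and 3 next to the converted machine, whose states are 1, {2,3} and 3; in the converted machine, a leads from 1 to {2,3} and b leads from {2,3} to 3]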

NDFSA

• example
– (set of states)
– construct new machine with states = set of possible states of the old machine

• Essential trick:
– i.e. simulate the old (non-deterministic) machine with the new machine

[Diagram: the new machine is built step by step. Start with {1}; on a, add {2,3}; on b from {2,3}, add {3}]
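The set-of-states trick can be sketched in Prolog by carrying a list of the states the old machine could be in, and moving the whole set on each input symbol. This is an illustrative sketch, not the construction as presented in class: nd_arc/3, nd_final/1, step/3 and accepts/2 are hypothetical names, and as before the b arc is assumed to run from state 2 to state 3 with state 3 final.

```prolog
:- use_module(library(lists)).

% the old (non-deterministic) machine
nd_arc(1, a, 2).
nd_arc(1, a, 3).
nd_arc(2, b, 3).     % assumption: b runs from 2 to 3
nd_final(3).         % assumption: 3 is the final state

% step(+StateSet, +Symbol, -NextSet): one move of the new machine.
% NextSet collects every state reachable from any member of StateSet;
% sort/2 orders the list and removes duplicates.
step(Set, Sym, Next) :-
    findall(T, (member(S, Set), nd_arc(S, Sym, T)), Ts),
    sort(Ts, Next).

% accepts(+StateSet, +Input): the new machine accepts if, at the end
% of the input, the current set contains a final state of the old one.
accepts(Set, []) :-
    member(S, Set), nd_final(S), !.
accepts(Set, [C|Cs]) :-
    step(Set, C, Next),
    Next \== [],
    accepts(Next, Cs).

% ?- step([1], a, Next).
% Next = [2,3]
```

Starting from the set [1], the moves match the diagram: a leads to [2,3], and b from [2,3] leads to [3].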