CSCI 3130: Formal Languages and Automata Theory Tutorial 5 Hung Chun Ho Office: SHB 1026 Department of Computer Science & Engineering 1


Agenda

• Cocke-Younger-Kasami (CYK) algorithm
  – Parsing CFG in normal form

• Pushdown Automata (PDA)
  – Design


CYK Algorithm

Bottom-up Parsing for normal form


Cocke-Younger-Kasami Algorithm

• Used to parse context-free grammar in Chomsky normal form (or simply normal form)

Every production is of type

1) X → YZ
2) X → a
3) S → ε

Normal Form Example

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c
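
The three production shapes can be checked mechanically. The sketch below is only an illustration: the dictionary encoding of the grammar and the helper name `is_normal_form` are assumptions, not part of the tutorial, and it tests only the three shapes listed above.

```python
# Example grammar from the slide: variable -> list of right-hand sides.
# An empty string "" stands for ε (assumed encoding).
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}


def is_normal_form(grammar, start="S"):
    """Return True if every production has one of the three allowed shapes."""
    variables = set(grammar)
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":
                # Type 3: only the start variable may derive ε.
                if head != start:
                    return False
            elif len(body) == 1:
                # Type 2: X -> a, where a is a terminal.
                if body in variables:
                    return False
            elif len(body) == 2:
                # Type 1: X -> YZ, where Y and Z are variables.
                if not all(symbol in variables for symbol in body):
                    return False
            else:
                return False
    return True


print(is_normal_form(GRAMMAR))
```

A grammar such as S → aSb | ε would fail this check because aSb has three symbols.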


CYK Algorithm - Idea

• = Algorithm 2 in Lecture Note (10L8.pdf)
• Idea: Bottom Up Parsing
• Algorithm:

Given a string s of length N
For k = 1 to N
  For every substring of length k
    Determine what variable(s) can derive it


CYK Algorithm - Example

• CFG

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

• Parse abbc


CYK Algorithm – Idea (1)

• Idea: We parse the substrings of abbc in this order:
• Length-1 substrings: a, b, b, c
• Length-2 substrings: ab, bb, bc
• Length-3 substrings: abb, bbc
• Length-4 substring: abbc
• Done!
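
The visiting order can be reproduced with two nested loops. This is a small sketch; the function name `substrings_by_length` is an assumption chosen for illustration.

```python
def substrings_by_length(s):
    """Yield (length, start, substring), shortest substrings first.

    Start positions are 1-indexed.
    """
    n = len(s)
    for length in range(1, n + 1):
        for start in range(1, n - length + 2):
            yield length, start, s[start - 1:start - 1 + length]


order = [piece for _, _, piece in substrings_by_length("abbc")]
print(order)
# ['a', 'b', 'b', 'c', 'ab', 'bb', 'bc', 'abb', 'bbc', 'abbc']
```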


CYK Algorithm – Idea (2)

• Idea: Parsing of longer substrings depends on parsing of shorter substrings

• Example: abb may be decomposed as
  – ab + b
  – a + bb

• If we know how to parse ab and b (or, a and bb) then we know how to parse abb


CYK Algorithm – Substring

• Denote sub(i, j) := substring with start index = i and end index = j

• Example: For abbc, sub(2,4) = bbc
• This notation is not meant to complicate things; it is just for convenience in the following discussion…
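
As a rough illustration of the notation, sub(i, j) and the ways of cutting it in two can be written as follows. The helper name `splits` is an assumption, not something the tutorial defines.

```python
def sub(s, i, j):
    """Substring of s from index i to index j (1-indexed, both inclusive)."""
    return s[i - 1:j]


def splits(i, j):
    """All ways to cut sub(i, j) into sub(i, k) + sub(k+1, j)."""
    return [((i, k), (k + 1, j)) for k in range(i, j)]


s = "abbc"
print(sub(s, 2, 4))  # bbc

# The two decompositions of abb = sub(1, 3): a + bb, then ab + b.
for (i1, j1), (i2, j2) in splits(1, 3):
    print(sub(s, i1, j1), "+", sub(s, i2, j2))
```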


CYK Algorithm – Table

• Each cell corresponds to a substring
• Store variables deriving the substring
• Rows: length of substring; columns: start index of substring
• Example cell: substring of length = 3 starting with index = 2, i.e., sub(2,4) = bbc

a b b c


CYK Algorithm – Simulation

• Base Case: length = 1
  – The possible choices of variable(s) can be known by scanning through each production

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table (row = substring length, column = start index):
  length 1:  A    B    B    A, C
  string:    a    b    b    c


CYK Algorithm – Simulation

• Loop: length = 2
  – For each substring of length 2
    • Decompose into shorter substrings
    • Check cells below it
• Let’s parse the substring ab first

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 1:  A    B    B    A, C
  string:    a    b    b    c


CYK Algorithm – Simulation

• For sub(1,2) = ab, it can be decomposed:
  – ab = a + b = sub(1,1) + sub(2,2)
  – Possible choices: AB
  – Scan rules: S

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 2:  S
  length 1:  A    B    B    A, C
  string:    a    b    b    c


CYK Algorithm – Simulation

• For sub(2,3) = bb, it can be decomposed:
  – bb = b + b = sub(2,2) + sub(3,3)
  – Possible choices: BB
  – Scan rules: ∅
• No suitable rules are found
• The CFG cannot parse this substring

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 2:  S    ∅
  length 1:  A    B    B    A, C
  string:    a    b    b    c


CYK Algorithm – Simulation

• For sub(3,4) = bc, it can be decomposed:
  – bc = b + c = sub(3,3) + sub(4,4)
  – Possible choices: BA, BC
  – Scan rules: B, C

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 2:  S    ∅    B, C
  length 1:  A    B    B    A, C
  string:    a    b    b    c


CYK Algorithm – Simulation

• For sub(1,3) = abb:
  – abb = ab + b = sub(1,2) + sub(3,3)
  – Possible choices: SB
  – Scan rules: ∅
• No suitable variables found yet
• But there is another way to decompose the string

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 2:  S    ∅    B, C
  length 1:  A    B    B    A, C
  string:    a    b    b    c


CYK Algorithm – Simulation

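• For sub(1,3) = abb:
  – abb = a + bb = sub(1,1) + sub(2,3)
  – Possible choices: ∅
  – Scan rules: no need
• Can’t parse the smaller substring bb, so this decomposition can’t parse the string

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 2:  S    ∅    B, C
  length 1:  A    B    B    A, C
  string:    a    b    b    c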


CYK Algorithm – Simulation

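• For sub(1,3) = abb:
  – abb = sub(1,1) + sub(2,3) gives no valid parsing
  – abb = sub(1,2) + sub(3,3) gives no valid parsing
• Cannot parse

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 3:  ∅
  length 2:  S    ∅    B, C
  length 1:  A    B    B    A, C
  string:    a    b    b    c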


CYK Algorithm – Simulation

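• For sub(2,4) = bbc:
  – bbc = sub(2,2) + sub(3,4)
    • Possible choices: BB, BC
  – bbc = sub(2,3) + sub(4,4)
    • Possible choices: ∅
• Variable: B

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table so far (row = substring length, column = start index):
  length 3:  ∅    B
  length 2:  S    ∅    B, C
  length 1:  A    B    B    A, C
  string:    a    b    b    c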


CYK Algorithm – Simulation

• Finally, for sub(1,4) = abbc:
  – Possible choices: AB, SB, SC
  – Variables: S
• This cell represents the original string, and it contains S, so abbc is in the language

S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c

Table (row = substring length, column = start index):
  length 4:  S
  length 3:  ∅    B
  length 2:  S    ∅    B, C
  length 1:  A    B    B    A, C
  string:    a    b    b    c


CYK Algorithm – Parse Tree

• abbc is in the language!
• How to obtain the parse tree?
  – Tracing back the derivations:
    • sub(1,4) is derived using S → AB from sub(1,1) and sub(2,4)
    • sub(1,1) is derived using A → a
    • sub(2,4) is derived using B → BC from sub(2,2) and sub(3,4)
    • …
• So, also record the derivations used!
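
One possible way to record the derivations is to store, for each variable in a cell, the rule and split point that produced it. The sketch below is an assumption about how this could be coded, not the tutorial's own implementation; the names `cyk_with_backpointers` and `build_tree` are hypothetical, and only the first derivation found for each variable is kept.

```python
# Example grammar from the slides as (head, body) pairs.
RULES = [("S", "AB"), ("A", "CC"), ("A", "a"), ("A", "c"),
         ("B", "BC"), ("B", "b"), ("C", "CB"), ("C", "BA"), ("C", "c")]


def cyk_with_backpointers(w, rules=RULES):
    """Fill back[(i, j)][X] with how X derives sub(i, j) (1-indexed)."""
    n = len(w)
    back = {}
    for i in range(1, n + 1):
        # Leaf entries store the terminal itself (rule X -> a).
        back[(i, i)] = {h: w[i - 1] for h, body in rules if body == w[i - 1]}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = {}
            for k in range(i, j):
                for head, body in rules:
                    if (len(body) == 2 and body[0] in back[(i, k)]
                            and body[1] in back[(k + 1, j)]):
                        # Keep the first derivation found: (left, right, split).
                        cell.setdefault(head, (body[0], body[1], k))
            back[(i, j)] = cell
    return back


def build_tree(back, var, i, j):
    """Trace the recorded derivations back into a nested-tuple parse tree."""
    entry = back[(i, j)][var]
    if isinstance(entry, str):
        return (var, entry)
    left, right, k = entry
    return (var, build_tree(back, left, i, k),
            build_tree(back, right, k + 1, j))


back = cyk_with_backpointers("abbc")
tree = build_tree(back, "S", 1, 4)
print(tree)
# ('S', ('A', 'a'), ('B', ('B', 'b'), ('C', ('B', 'b'), ('A', 'c'))))
```

The printed tree corresponds to the derivation S ⇒ AB ⇒ aB ⇒ aBC ⇒ abC ⇒ abBA ⇒ abbA ⇒ abbc.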


CYK Algorithm – Parse Tree

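• Obtained from the table

Table (row = substring length, column = start index):
  length 4:  S
  length 3:  ∅    B
  length 2:  S    ∅    B, C
  length 1:  A    B    B    A, C
  string:    a    b    b    c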


CYK Algorithm – Conclusion

• A bottom up parsing algorithm
  – Dynamic Programming
  – Solution of a subproblem (parsing of a substring) depends on that of smaller subproblems
• Before employing CYK Algorithm, convert the grammar into normal form
  – Remove ε-productions
  – Remove unit-productions


CYK Algorithm – Detailed

D = “On input w = w1w2…wn:
  If w = ε and S → ε is a rule, accept.
  For i = 1 to n:
    For each variable A:
      Test whether A → b is a rule, where b = wi.
      If so, place A in table(i, i).
  For l = 2 to n:
    For i = 1 to n – l + 1:
      Let j = i + l – 1,
      For k = i to j – 1:
        For each rule A → BC:
          If table(i, k) contains B and table(k+1, j) contains C,
            put A in table(i, j).
  If S is in table(1, n), accept. Otherwise, reject.”
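
The pseudocode above translates almost line by line into Python. This is a minimal sketch under assumed encodings: rules are (head, body) pairs, the table is a dictionary keyed by (i, j) with 1-indexed start and end positions, and the function name `cyk` is chosen for illustration.

```python
# Example grammar from the slides as (head, body) pairs.
RULES = [("S", "AB"), ("A", "CC"), ("A", "a"), ("A", "c"),
         ("B", "BC"), ("B", "b"), ("C", "CB"), ("C", "BA"), ("C", "c")]


def cyk(w, rules=RULES, start="S"):
    """Return (accepted, table) for input string w."""
    n = len(w)
    table = {}
    if n == 0:
        # Empty input: accept only if S -> ε is a rule.
        return (start, "") in rules, table
    # Base case: substrings of length 1.
    for i in range(1, n + 1):
        table[(i, i)] = {head for head, body in rules if body == w[i - 1]}
    # Longer substrings, shortest first.
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            table[(i, j)] = set()
            for k in range(i, j):
                for head, body in rules:
                    if (len(body) == 2 and body[0] in table[(i, k)]
                            and body[1] in table[(k + 1, j)]):
                        table[(i, j)].add(head)
    return start in table[(1, n)], table


accepted, table = cyk("abbc")
print(accepted)        # True
print(table[(2, 4)])   # {'B'}
```

The three nested loops over l, i and k, together with the scan over rules, give a running time cubic in the length of w for a fixed grammar.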


Pushdown Automata

NFA with a stack: finitely many states, unbounded memory


Pushdown Automata

• PDA ~= NFA, with a stack as memory
• Transition:
  – NFA – Depends on input
  – PDA – Depends on input and top of stack
    • Push a symbol to stack (possibly ε)
    • Pop a symbol from stack (possibly ε)
    • Read a terminal from the string (possibly ε)
• Transitions are non-deterministic


Pushdown Automata and NFA

• Accept:
  – NFA – Go to an Accept state
  – PDA – Go to an Accept state


PDA – Example 1

• Given the following language:

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}

• Design a PDA for it


PDA – Example 1 - Idea

• Idea: The input has two sections
  – First half
    • All ‘0’s
  – Second half
    • All ‘1’s
    • #‘1’ depends on #‘0’
      – #‘0’ ≤ #‘1’ ≤ #‘0’ × 2


PDA – Example 1 – Solution

• Solution:

[State diagram: states q0, q1, q2, q3; transition labels ε,ε/$; 0,ε/X; ε,ε/ε; 1,X/ε; 1,X/X; 1,X/ε; ε,$/ε]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}
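
The transition labels of this solution can be tried out with a small nondeterministic simulator. Note the assumptions: the exact wiring of the diagram is not recoverable from the labels alone, so the topology below is a plausible reading of them; the helper state `q_mid` and the choice of `q3` as the accept state are hypothetical. A label such as 1,X/ε is encoded as (state, read, pop, next state, push), with "" standing for ε.

```python
from collections import deque

EPS = ""
# (state, read, pop, next_state, push); the wiring is an assumed reading.
DELTA = [
    ("q0", EPS, EPS, "q1", "$"),      # mark the bottom of the stack
    ("q1", "0", EPS, "q1", "X"),      # push one X per 0
    ("q1", EPS, EPS, "q2", EPS),      # guess that the 0s are over
    ("q2", "1", "X", "q2", EPS),      # one X pays for one 1
    ("q2", "1", "X", "q_mid", "X"),   # or: read a 1 but keep the X ...
    ("q_mid", "1", "X", "q2", EPS),   # ... and pop it on a second 1
    ("q2", EPS, "$", "q3", EPS),      # stack empty again: finish
]
ACCEPT = {"q3"}


def accepts(w, delta=DELTA, start="q0", accept=ACCEPT):
    """Breadth-first search over configurations (state, position, stack)."""
    seen = set()
    queue = deque([(start, 0, ())])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, pos, stack = config
        if state in accept and pos == len(w):
            return True
        for src, read, pop, dst, push in delta:
            if src != state:
                continue
            if read and (pos >= len(w) or w[pos] != read):
                continue
            if pop and (not stack or stack[-1] != pop):
                continue
            rest = stack[:-1] if pop else stack
            # Pushed symbols go on top (end of the tuple).
            queue.append((dst, pos + len(read), rest + tuple(push)))
    return False


print(accepts("00111"))  # True: i = 2, j = 3
print(accepts("0111"))   # False: j > 2i
```

The search terminates because this machine has no cycle of ε-moves and the stack never grows beyond the input length plus one.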


PDA – Example 1 – Explain

• Solution:

• Let’s try some string… w = 00111
  – See white board for simulation…

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 1 – Explain

• Solution:

• The ε,ε/$ transition indicates the start of parsing

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 1 – Explain

• Solution:

• The 0,ε/X loop saves information about #‘0’
• #‘X’ in stack = #‘0’

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 1 – Explain

• Solution:

• The transitions that read ‘1’ account for #‘1’
  – #‘0’ ≤ #‘1’ ≤ #‘0’ × 2

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 1 – Explain

• Solution:

• 1,X/ε: consumes one ‘X’ and eats one ‘1’

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 1 – Explain

• Solution:

• 1,X/X followed by 1,X/ε: consumes one ‘X’ and eats two ‘1’s

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 1 – Explain

• Solution:

• Consume one ‘X’, and then
  – eat one ‘1’, or
  – eat two ‘1’s

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 1 – Explain

• Solution:

• The ε,$/ε transition indicates the end of parsing

[State diagram repeated from the Solution slide]

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}


PDA – Example 2

• Given the following language:

• Design a PDA for it

L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}


PDA – Example 2 – Idea

• Idea:
  – Sequentially read (multiple) ‘a’, ‘b’, ‘c’ and ‘d’
  – Maintain:
    • #‘a’ + #‘c’
    • #‘b’ + #‘d’
  – If these numbers are equal
    • Accept


PDA – Example 2 – Solution

• Solution:

L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}

[State diagram: states q1, q2, q3, q4, q5; transition labels:
  ε,ε/$
  a,ε/X
  b,$/$Y; b,X/ε; b,Y/YY
  c,$/$X; c,X/XX; c,Y/ε
  d,$/$Y; d,X/ε; d,Y/YY
  ε,ε/ε (three times); ε,$/ε]
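
The same kind of simulation can be run on these labels. This is a sketch under stated assumptions: the wiring follows the phase order start, a, b, c, d, end; the start state name `q0` is hypothetical because it does not survive in the extracted diagram; and a label such as b,$/$Y is read as "pop $, then push $ and Y, leaving Y on top".

```python
from collections import deque

EPS = ""
# (state, read, pop, next_state, push); push is written bottom-to-top.
DELTA = [
    ("q0", EPS, EPS, "q1", "$"),
    ("q1", "a", EPS, "q1", "X"),
    ("q1", EPS, EPS, "q2", EPS),
    ("q2", "b", "$", "q2", "$Y"),
    ("q2", "b", "X", "q2", EPS),
    ("q2", "b", "Y", "q2", "YY"),
    ("q2", EPS, EPS, "q3", EPS),
    ("q3", "c", "$", "q3", "$X"),
    ("q3", "c", "X", "q3", "XX"),
    ("q3", "c", "Y", "q3", EPS),
    ("q3", EPS, EPS, "q4", EPS),
    ("q4", "d", "$", "q4", "$Y"),
    ("q4", "d", "X", "q4", EPS),
    ("q4", "d", "Y", "q4", "YY"),
    ("q4", EPS, "$", "q5", EPS),
]


def accepts(w, delta=DELTA, start="q0", accept=("q5",)):
    """Breadth-first search over configurations (state, position, stack)."""
    seen = set()
    queue = deque([(start, 0, ())])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, pos, stack = config
        if state in accept and pos == len(w):
            return True
        for src, read, pop, dst, push in delta:
            if src != state:
                continue
            if read and (pos >= len(w) or w[pos] != read):
                continue
            if pop and (not stack or stack[-1] != pop):
                continue
            rest = stack[:-1] if pop else stack
            queue.append((dst, pos + len(read), rest + tuple(push)))
    return False


print(accepts("abbccd"))  # True: i + k = 1 + 2 = 3 = 2 + 1 = j + l
print(accepts("abc"))     # False: i + k = 2 but j + l = 1
```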


PDA – Example 2 – Explain

• Solution:

L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}

[State diagram repeated from the Solution slide, with its phases labelled in order: start, a, b, c, d, end]


PDA – Example 2 – Explain

• Solution:

• Each X in stack = An extra a or c

L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}

[State diagram repeated from the Solution slide]


PDA – Example 2 – Explain

• Solution:

• Each Y in stack = An extra b or d

L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}

[State diagram repeated from the Solution slide]


PDA – Example 2 – Explain

• Solution:

• X and Y ‘cancel’ each other
• The stack contains only X’s or only Y’s

L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}

[State diagram repeated from the Solution slide]


PDA – Example 2 – Explain

• Solution:

• No X’s and no Y’s means
  – #a + #c = #b + #d, so accept

[State diagram repeated from the Solution slide]

L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}
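
The cancelling behaviour of X and Y can also be seen without simulating states. The following is a simplified, deterministic sketch rather than the PDA itself: the function names are assumptions, and the order of the letters is checked separately with a regular expression.

```python
import re


def stack_after(prefix):
    """Stack contents (bottom to top) after reading prefix.

    An 'a' or 'c' contributes an X, a 'b' or 'd' contributes a Y,
    and a new symbol cancels an opposite symbol on top of the stack.
    """
    stack = ["$"]
    for ch in prefix:
        mark, other = ("X", "Y") if ch in "ac" else ("Y", "X")
        if stack[-1] == other:
            stack.pop()          # X and Y cancel each other
        else:
            stack.append(mark)   # stack holds only X's or only Y's
    return "".join(stack)


def in_language(w):
    """True if w has the form a*b*c*d* and the stack ends with only $."""
    return re.fullmatch(r"a*b*c*d*", w) is not None and stack_after(w) == "$"


print(stack_after("aab"))     # $X  (one extra a)
print(stack_after("abb"))     # $Y  (one extra b)
print(in_language("abbccd"))  # True
```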
