61
J. Xue Tutorials Tutorials to start in week 3 (i.e., next week) Tutorial questions are already available on-line COMP3131/9102 Page 65 March 4, 2018

Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Tutorials

• Tutorials to start in week 3 (i.e., next week)

• Tutorial questions are already available on-line

COMP3131/9102 Page 65 March 4, 2018

Page 2: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Assignment 1: Scanner

• +5 =⇒ two tokens: + and 5

the scanner understands how tokens are formed but not

anything else

COMP3131/9102 Page 66 March 4, 2018

Page 3: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

COMP3131/9102: Programming Languages and Compilers

Jingling Xue

School of Computer Science and Engineering

The University of New South Wales

Sydney, NSW 2052, Australia

http://www.cse.unsw.edu.au/~cs3131

http://www.cse.unsw.edu.au/~cs9102

Copyright @2018, Jingling Xue

COMP3131/9102 Page 67 March 4, 2018

Page 4: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

The Big Picture

REs

NFA

DFA

DFAminimal-state

DFA

The two conversions in dashed arrows are not covered:

• REs → DFA (pages 135 – 141, Red Dragon/§3.7, Purple

Dragon)

• DFA → REs: Chapter 3, J. Hopcroft, R. Motwani and J. Ullman, Introduction to

Automata Theory, Languages, and Computation, Addison-Wesley, 2nd Edition, 2001. See

www-db.stanford.edu/~ullman/ullman-books.html.

• DFA → minimal-state DFA (pages 141 – 144, Red

Dragon/§3.9.6, Purple Dragon)

• Tools: http://www.jflap.org/

COMP3131/9102 Page 68 March 4, 2018

Page 5: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Week 2: Regular Expressions, DFA and NFA

1. Definitions of REs, DFA and NFA

2. REs =⇒ NFA (Thompson’s construction, Algorithm 3.3, Red

Dragon/Algorithm 3.23, Purple Dragon)

3. NFA =⇒ DFA (subset construction, Algorithm 3.2, Red

Dragon/Algorithm 3.20, Purple Dragon)

4. DFA =⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red

Dragon/Algorithm 3.39, Purple Dragon)

5. Scanner generators

• How to use them (straightforward)

• How to write them (the most techniques introduced today)

COMP3131/9102 Page 69 March 4, 2018

Page 6: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Applications of Regular Expressions

• Anywhere when patterns of text need to be specified

– Specifying restriction enzymes

– Google analytics

• Unix system, database and networking administration:

grep, fgrep, egrep, sed, awk

• HTML documents: Javascript and VBScript

• Perl:

J. Friedl, Mastering Regular Expressions, O’reilly, 1997

• Token Specs for scanner generators (lex, Jflex, etc.)

• http://www.zytrax.com/tech/web/regex.htm

COMP3131/9102 Page 70 March 4, 2018

Page 7: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Applications of Finite Automata (i.e., Finite State Machines)

• Hardware design (minimising states =⇒ minimising cost)

• Language theory

• Computational complexity

• Scanner generators (lex and Jflex)

• Automata tools:

http://research.microsoft.com/en-us/

downloads/

39c51620-548c-49a3-ac9c-40d807010c07/

COMP3131/9102 Page 71 March 4, 2018

Page 8: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Alphabet, Strings and Languages

• Alphabet denoted Σ: any finite set of symbols

– The binary alphabet {0,1} (for machine languages)

– The ASCII alphabet (for high-level languages)

• String: a finite sequence of symbols drawn from Σ:

– Length |s| of a string s: the number of symbols in s

– ǫ: the empty string (|ǫ| = 0)

• Language: any set of strings over Σ; its two special cases:

– ∅: the empty set

– {ǫ}

COMP3131/9102 Page 72 March 4, 2018

Page 9: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Examples of Languages

• Σ = {0, 1} – a string is an instruction

– The set of M68K instructions

– The set of Pentium instructions

– The set of MIPS instructions

• Σ = the ASCII set – a string is a program

– the set of Haskell programs

– the set of C programs

– the set of VC programs

COMP3131/9102 Page 73 March 4, 2018

Page 10: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Terms for Parts of a String (Figure 3.7 of Text)

TERM DEFINITION

prefix of s a string obtained by removing

0 or more trailing symbols of s

suffix of s a string obtained by removing

0 or more leading symbols of s

substring of s a string obtained by deleting

a prefix and a suffix from s

proper prefix

suffix, substring of s

Any nonempty string x that is, respectively,

a prefix, suffix

or substring of s such that s 6= x

COMP3131/9102 Page 74 March 4, 2018

Page 11: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

String Concatenation

• If x and y are strings, xy is the string formed by

appending y to x

• Examples:

x y xy

key word keyword

java script javascript

• ǫ is the identity: ǫx = xǫ = x

COMP3131/9102 Page 75 March 4, 2018

Page 12: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Operations on Languages (Figure 3.8 of Text)

OPERATION DEFINITION

union: L ∪M L ∪M = {s | s ∈ L or s ∈ M}concatenation: LM LM = {st | s ∈ L and t ∈ M}Kleene Closure: L∗ L∗ = ∪∞

i=0Li = L0 ∪ L ∪ LL ∪ LLL . . .

where L0 = {ǫ}(0 or more concatenations of L)

Positive Closure: L+ L+ = ∪∞i=1L

i = L ∪ LL ∪ LLL . . .

(1 or more concatenations of L)

COMP3131/9102 Page 76 March 4, 2018

Page 13: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Examples: Operations on Languages

• L = {a, . . . , z, A, . . . , Z, }• D = {0, . . . , 9}

EXAMPLE LANGUAGE (THE SET OF )

L ∪D

L3

LD

L∗

L(L ∪D)∗

D+

COMP3131/9102 Page 77 March 4, 2018

Page 14: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Examples: Operations on Languages

• L = {a, . . . , z, A, . . . , Z, }• D = {0, . . . , 9}

EXAMPLE LANGUAGE

L ∪D letters and digits

L3 all 3-letter strings

LD strings consisting of a letter followed by a digit

L∗ all strings of letters, including the empty string ǫ

L(L ∪D)∗ all strings of letters and digits beginning with a letter

D+ all strings of one or more digits

COMP3131/9102 Page 78 March 4, 2018

Page 15: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Regular Expressions (REs) Over Alphabet Σ

• Inductive Base:

1. ǫ is a RE, denoting the RL {ǫ}2. a ∈ Σ is a RE, denoting the RL {a}

• Inductive Step: Suppose r and s are REs, denoting the

RLs L(r) and L(s). Then (next slide):

1. (r)|(s) is a RE, denoting the RL L(r) ∪ L(s)

2. (r)(s) is a RE, denoting the RL L(r)L(s)

3. (r)∗ is a RE, denoting the RL L(r)∗

4. (r) is a RE, denoting the RL L(r)

REs define regular languages (RL) or regular sets

COMP3131/9102 Page 79 March 4, 2018

Page 16: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Precedence and Associativity of “Regular” Operators

• Precedence:

– “∗” has the highest precedence

– “Concatenation” has the second highest precedence

– “|” has the lowest precedence

• Associativity: — all are left-associative

• Example:

(a)|((b)∗(c)) ≡ a|b∗c

Unnecessary parentheses can be avoided!

COMP3131/9102 Page 80 March 4, 2018

Page 17: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

An Example (Following the Definition of REs)

• Alphabet: Σ = {0, 1}• RE: 0(0|1)∗

• Question: What is the language defined by the RE?

• Answer:

L(0(0|1)∗) = L(0)L((0|1)∗)= {0}L(0|1)∗= {0}(L(0) ∪ L(1))∗

= {0}({0} ∪ {1})∗= {0}{0, 1}∗= {0}{ǫ, 0, 1, 00, 01, 10, 11, . . . }= {0, 00, 01, 000, 001, 010, 011, . . . }

The RE describes the set of strings of 0’s and 1’s beginning with a 0.

COMP3131/9102 Page 81 March 4, 2018

Page 18: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

More Example Regular Expressions: Σ = {0, 1}

RE LANGUAGE

1 {1}0|1 {0, 1}1∗ {ǫ, 1, 11, 111, . . . }1∗1 {1, 11, 111, . . . }

0|0∗1 the set containing 0 and all strings consisting

of zero or more 0’s followed by a 1.

COMP3131/9102 Page 82 March 4, 2018

Page 19: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Notational Shorthands

• One or more instances +: r+ = rr∗

– denotes the language (L(r))+

– has the same precedence and associativity as ∗

• Zero or one instance ?: r? = r|ǫ– denotes the language L(r) ∪ {ǫ}– written as (r)? to indicate grouping (e.g., (12)?)

• Character classes:

[A− Za− z ][A− Za− z0− 9 ]∗

COMP3131/9102 Page 83 March 4, 2018

Page 20: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Regular Expressions for VC (or C)

TOKEN RE

Identifiers letter(letter|digit)∗Integers digit+

Reals A bit long but can be obtained from

the following page by substitutions

• In the VC spec, letter includes “ ”

• In Java, letters and digits may be drawn from the entire

Unicode character set. Examples of identifiers are:

abc αβγ

COMP3131/9102 Page 84 March 4, 2018

Page 21: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Regular Grammars for Integers and Reals in VC

• Integers:

digit: 0|1|2|...|9

intLiteral: digit+

• Reals:

digit: 0|1|2|...|9

fraction: .digit+

exponent: (E|e)(+|-)?digit+

floatLiteral: digit∗ fraction exponent?

| digit+.

| digit+.?exponent

Regular grammars are a special case of CFGs (Week 3).

COMP3131/9102 Page 85 March 4, 2018

Page 22: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Finite Automata (or Finite State Machines)

A finite automaton consists of a 5-tuple:

(Σ, S, T, F, I)

where

• Σ is an alphabet

• S is a finite set of states

• T is a state transition function: T : S × Σ → S

• F is a finite set of final or accepting states

• I is the start state: I ∈ S.

COMP3131/9102 Page 86 March 4, 2018

Page 23: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Representation and Acceptance

• Transition graph:

state A

transition A Ba

start state Sstart

final state A

• Acceptance: A FA accepts an input string x iff there is

some path in the transition graph from the start state to

some accepting state such that the edge labels spell out x.

COMP3131/9102 Page 87 March 4, 2018

Page 24: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

What Language does this FA accept?

S A1

1

00

start

COMP3131/9102 Page 88 March 4, 2018

Page 25: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Example 1

• The language: strings of 0 and 1 with an odd number of 1

(ǫ not included)

S A1

1

0

0 S: even number of 1’s seenA: odd number of 1’s seen

start

Σ {0, 1}S {S,A}T T (S, 0) = S, T (S, 1) = A, T (A, 0) = A, T (A, 1) = S

F {A}I S

• 01011 is recognised because

S0→ S

1→ A0→ A

1→ S1→ A

COMP3131/9102 Page 89 March 4, 2018

Page 26: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Implicit Error State

• By definition, T is a function from S × Σ to S, but ...

S A1

1

0

start

• If T (s, a) is undefined at the state s on input a, then

T (s, a) = error

S A error1

1

0

0

start

• The error state and transitions to it aren’t drawn (by convention)

COMP3131/9102 Page 90 March 4, 2018

Page 27: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Deterministic FA (DFA) and Nondeterministic FA (NFA)

A FA is a DFA if

• no state has an ǫ-transition, i.e., an transition on input ǫ,

and

• for each state s and input symbol a, there is

at most one edge labeled a leaving s

A FA is an NFA if it is not a DFA:

• Nondeterministic: can make several parallel transitions on

a given input

• Acceptance: the existence of some path as per Slide 87

COMP3131/9102 Page 91 March 4, 2018

Page 28: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

DFA or NFA? What are the Languages Recognised?

0 1 2 3 4 5

a

a a a a a

astart

0 1 2 3 4 5

a

a a ǫ ǫ a

astart

COMP3131/9102 Page 92 March 4, 2018

Page 29: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Two Examples

• NFA 1:

0 1 2 3 4 5

a

a a a a a

astart

• NFA 2:

0 1 2 3 4 5

a

a a ǫ ǫ a

astart

• The same language:

the set of all strings of a’s such that the length of each of

these strings is a multiple of 2 or 3 (ǫ included)

COMP3131/9102 Page 93 March 4, 2018

Page 30: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Real-Life DFAs

The ghosts in Pac-Man have four behaviors:

1. Randomly wander the maze

2. Chase Pac-Man, when he is within line of sight

3. Flee Pac-Man, after Pac-Man has consumed a power pellet

4. Return to the central base to regenerate

COMP3131/9102 Page 94 March 4, 2018

Page 31: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Real-Life DFAs

The behavior of a vending machine:

accepts dollars and 25 cents, and charges $1.25 per coke.

COMP3131/9102 Page 95 March 4, 2018

Page 32: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

What About this Non-Real-Life NFA?

ǫ

COMP3131/9102 Page 96 March 4, 2018

Page 33: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Week 2: Regular Expressions, DFA and NFA

1. Definitions of REs, DFA and NFA√

2. REs =⇒ NFA (Thompson’s construction, Algorithm 3.3, Red

Dragon/Algorithm 3.23, Purple Dragon)

3. NFA =⇒ DFA (subset construction, Algorithm 3.2, Red

Dragon/Algorithm 3.20, Purple Dragon)

4. DFA =⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red

Dragon/Algorithm 3.39, Purple Dragon)

5. Scanner generators

• How to use them (straightforward)

• How to write them (the most techniques introduced today)

COMP3131/9102 Page 97 March 4, 2018

Page 34: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Thompson’s Construction of NFA from REs

• Syntax-driven

• Inductive: The cases in the construction of the NFA follow

the cases in the definition of REs

• Important: if a symbol a occurs several times in a RE r, a

separate NFA is constructed for each occurrence

• Thompson’s method is one of many available

COMP3131/9102 Page 98 March 4, 2018

Page 35: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Thompson’s Construction

• Inductive Base:

1. For ǫ, construct the NFA

S Aǫstart

2. For a ∈ Σ, construct the NFA

S Aastart

• Inductive step: suppose N(r) and N(s) are NFAs for REs

r and s. Then

COMP3131/9102 Page 99 March 4, 2018

Page 36: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Thompson’s Construction (Cont’d)

N(r)

N(s)S A

ǫ

ǫ

ǫ

ǫRE r|s : start

S N(r) AN(s)start

RE rs :

N(r)S Aǫ ǫ

ǫ

ǫ

startRE r∗ :

RE (r) : N((r)) is the same as N(r)

COMP3131/9102 Page 100 March 4, 2018

Page 37: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Example: RE =⇒ NFA

Converting (0|10∗1)∗10∗ to an NFA

COMP3131/9102 Page 101 March 4, 2018

Page 38: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Example: RE =⇒ NFA

• Regular expression: (0|10∗1)∗10∗

• NFA:

0 1 2 3 4 5 6 7 8 9

startǫ ǫ 0 ǫ ǫ 1 ǫ 0 ǫ

ǫ ǫ

ǫ10 15

11 12 13 14

ǫ

ǫ

1 ǫ 0 ǫ 1

ǫ

ǫ

ǫ

COMP3131/9102 Page 102 March 4, 2018

Page 39: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Week 2: Regular Expressions, DFA and NFA

1. Definitions of REs, DFA and NFA√

2. REs =⇒ NFA (Thompson’s construction, Algorithm 3.3, Red

Dragon/Algorithm 3.23, Purple Dragon)√

3. NFA =⇒ DFA (subset construction, Algorithm 3.2, Red

Dragon/Algorithm 3.20, Purple Dragon)

4. DFA =⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red

Dragon/Algorithm 3.39, Purple Dragon)

5. Scanner generators

• How to use them (straightforward)

• How to write them (the most techniques introduced today)

COMP3131/9102 Page 103 March 4, 2018

Page 40: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Example: DFA (Cont’d)

1,2,3,4,5,10

6,7,9,11,12,147,8,9,

12,13,140,1,2,5,10

1,2,4,5,10,15

0

0

1

0

0

1 1

1 1

0

start

• The algorithm used is known as the subset construction,

because a DFA state corresponds to a subset of NFA states

• There are at most 2n DFA states, where n is the total

number of the NFA states

COMP3131/9102 Page 104 March 4, 2018

Page 41: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Subset Construction: The Operations Used

OPERATION DESCRIPTION

ǫ-closure(s) Set of NFA states readable from

NFA state s on ǫ-transitions

ǫ-closure(T ) Set of NFA states readable from

some state s in T on ǫ-transitions

move(T, a) Set of NFA states to which there is a transition

on input a from some state s in T

• s: a NFA state

• T : a set of NFA states

COMP3131/9102 Page 105 March 4, 2018

Page 42: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Subset Construction: The Algorithm

Let s0 be the start state of the NFA;

DFAstates contains the only unmarked state ǫ-closure(s0);

while there is an unmarked state T in DFAstates do begin

mark T

for each input symbol a do begin

U := ǫ-closure(move(T, a));

if U is not in DFAstates then

Add U as an unmarked state in DFAstates;

DFATrans[T, a] := U ;

end;

end;

COMP3131/9102 Page 106 March 4, 2018

Page 43: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Subset Construction: The Definition of the DFA

Let (Σ, S, T, F, s0) be the original NFA. The DFA is:

• The alphabet: Σ

• The states: all states in DFAstates

• The start state: ǫ-closure(s0)

• The accepting states: all states in DFAstates containing at

least one accepting state in F of the NFA

• The transitions: DFATrans

COMP3131/9102 Page 107 March 4, 2018

Page 44: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Week 2: Regular Expressions, DFA and NFA

1. Definitions of REs, DFA and NFA√

2. REs =⇒ NFA (Thompson’s construction, Algorithm 3.3, Red

Dragon/Algorithm 3.23, Purple Dragon)√

3. NFA =⇒ DFA (subset construction, Algorithm 3.2, Red

Dragon/Algorithm 3.20, Purple Dragon)√

4. DFA =⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red

Dragon/Algorithm 3.39, Purple Dragon)

5. Scanner generators

• How to use them (straightforward)

• How to write them (the most techniques introduced today)

COMP3131/9102 Page 108 March 4, 2018

Page 45: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

An Algorithm to Mimimise DFA Statements

Initially, let Π be the partition with the two groups:

(1) one is the set of all final states

(2) the other is the set of all non-final states

Let Πnew = Πfor (each group G in Πnew) {

partition G into subgroups such that two states s and t

are in the same subgroup iff for all input symbols

a, states s and t have transitions on a to

states in the same group of Πnew

replace G in Πnew by the set of subgroups formed

}• Begins with the most optimistic assumption

• Also used in global value numbering (COMP4133)

COMP3131/9102 Page 109 March 4, 2018

Page 46: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Example (Cont’d): States Re-Labeled

D

B CA

E

0

01

0

0

1 11 1

0

start

state minimisation

A,D,E B,C1

1

0

0

start

Theoretical Result: every regular language can be recognised

by a minimal-state DFA that is unique up to state names

COMP3131/9102 Page 110 March 4, 2018

Page 47: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Various Conversions for an Example

(0|10∗1)∗10∗

NFA in Slide 102

DFA

DFA in Slide 110minimal-state

DFA in Slide 110

However, the conversions in dashed arrows are not covered.

COMP3131/9102 Page 111 March 4, 2018

Page 48: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Week 2: Regular Expressions, DFA and NFA

1. Definitions of REs, DFA and NFA√

2. REs =⇒ NFA (Thompson’s construction, Algorithm 3.3, Red

Dragon/Algorithm 3.23, Purple Dragon)√

3. NFA =⇒ DFA (subset construction, Algorithm 3.2, Red

Dragon/Algorithm 3.20, Purple Dragon)√

4. DFA =⇒ minimal-state DFA (state minimisation, Algorithm 3.6, Red

Dragon/Algorithm 3.39, Purple Dragon)√

5. Scanner generators

• How to use them (straightforward)

• How to write them (the most techniques introduced today)

COMP3131/9102 Page 112 March 4, 2018

Page 49: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Scanner Generators

• Scanners generated in C

– lex (UNIX)

– flex – GNU’s fast lex (UNIX)

– mks lex (MS-DOS and OS/2)

• Scanners generated in Java

– Jflex

– JavaCC (SUN Microsystems)

COMP3131/9102 Page 113 March 4, 2018

Page 50: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

The Scanner Spec in Jflex

user code -- copied verbatim to the scanner file

%%

Jflex directives

%%

regular expression rules

COMP3131/9102 Page 114 March 4, 2018

Page 51: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

How a Scanner Generator Works

Token Spec using REs

NFA

DFA

minimal-state DFA

table-driven code (Jflex) – simulating a DFA on an input

or hard-wired code (Assignment 1 or §3.4 of either Dragon Book)

COMP3131/9102 Page 115 March 4, 2018

Page 52: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

An Example: Spec

some user code

%%

LETTER=[A-Za-z_]

DIGIT=[0-9]

%%

"if" { return new Token(Token.IF, "if", pos); }

"<" { return new Token(Token.LT, "<", pos); }

"<=" { return new Token(Token.LE, "<=", pos); }

{LETTER}({LETTER}|{DIGIT})*

{ return new Token(Token.ID, "itsSpelling", pos); }

Two rules:

• The first pattern used when more than one are matched – “if” as a

keyword not as an id

• The longest prefix of the input is always matched – ”<= as one token

COMP3131/9102 Page 116 March 4, 2018

Page 53: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

An Example: NFA

N(if)

N(<)

N(<=)

N(id)

S

ǫ

ǫ

ǫ

ǫ

start

A DFA can also be used for each pattern.

COMP3131/9102 Page 117 March 4, 2018

Page 54: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

An Example: NFA

1 2 3

4 5

6 7 8

9 10

i f

<

< =0

LETTER

ǫ

ǫ

ǫ

ǫ

start

LETTER, DIGIT

COMP3131/9102 Page 118 March 4, 2018

Page 55: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

An Example: DFA

2, 10 3, 10

5, 7 8

10

f

=

0,1,4,6,9

i

<

LETTER but i

start

LETTER, DIGIT

LETTER, DIGIT

DIGIT

LETTER but f

Already a minimal-state DFA!

COMP3131/9102 Page 119 March 4, 2018

Page 56: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

A DFA Represented as a Transition Table

StateCharacter

< = i f LETTER but i LETTER but f DIGIT(0,1,4,6,9) (5,7) (2,10) (10)

(2,10) (3,10) (10) (10)

(5,7) (8)

(8)

(10) (10) (10) (10) (10) (10)

(3,10) (10) (10) (10) (10) (10)

• Letter = {i}∪ “letter but i”

• Character classes reduce the table size

• The blank entries are errors

• The tables are usually sparse (pages 146 – 177 of text for compression techniques)

COMP3131/9102 Page 120 March 4, 2018

Page 57: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

The Scanner Driver for Simulating a DFA

state = initial_state

while (TRUE) {

next_state = T[state][current_char];

if (next_state == ERROR) // cannot move any further

break;

state = next_state;

if (current_char == EOF) // input exhausted

break;

current_char = getchar(); // fetch the next char

}

Backtrack to the most recent accepting state

if (such a state exists)

/* return the corresponding token

reset current_char to the first after the token

*/

else

lexical_error(state);

• There should be a column in the transition table for EOF

• Need to backtrack

COMP3131/9102 Page 121 March 4, 2018

Page 58: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

The Output of Running Jflex on a Sample Scanner Spec

• Scanner.l: the spec for the scanner generator Jflex

jflex Scanner.l

Constructing NFA : 267 states in NFA

Converting NFA to DFA :

139 states before minimization, 106 states in minimized DFA

Old file ‘‘Scanner.java’’ saved as ‘‘Scanner.java~’’

Writing code to ‘‘Scanner.java’’

• Scanner.l.java: the scanner generated

javac Scanner.l.java

• java Scanner < test.vc

COMP3131/9102 Page 122 March 4, 2018

Page 59: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Limitations of Regular Expressions (or FAs)

• Cannot “count”

• Cannot recognise palindromes (e.g., racecar & rotator)

• The language of the balanced parentheses

{(n)n | n > 1}is not a regular language

– cannot build a FA to recognise the language for any n

(can trivially build a FA for n=3, for example)

– but can be specified by a CFG (Week 3):

P → (P ) | ( )

COMP3131/9102 Page 123 March 4, 2018

Page 60: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Chomsky’s Hierarchy

Depending on the form of production

α→β

four types of grammars (and accordingly, languages) are distinguished:

GRAMMAR KNOWN AS DEFINITION LANGUAGE MACHINE

Type 0 unrestricted grammar α 6= ǫ Type 0 Turing machine

Type 1context-sensitive grammar

CSGs|α| ≤ |β| Type 1

linear bounded

automaton

Type 2context-free grammar

CFGsA→α Type 2 stack automaton

Type 3 Regular grammars A→w | Bw Type 3 finite state automaton

COMP3131/9102 Page 124 March 4, 2018

Page 61: Tutorials Tutorials to start in week 3 (i.e., next week ...cs3131/Lectures/lec2.pdf · J. Xue Tutorials • Tutorials to start in week 3 (i.e., next week) • Tutorial questions are

J. Xue✬

Reading

• Sections 3.3 – 3.7 of either Dragon Book

• Week 3 tutorial questions (available on-line)

Week 3: Context-Free Grammars

COMP3131/9102 Page 125 March 4, 2018