LEXICAL ANALYSIS using Deterministic Finite Automata & Nondeterministic Finite Automata


Page 1: using Deterministic Finite Automata  Nondeterministic Finite Automata

LEXICAL ANALYSIS

using Deterministic Finite Automata & Nondeterministic Finite Automata


Deterministic Finite Automata

• A regular expression can be represented (and recognized) by a machine called a deterministic finite automaton (dfa).

• A dfa can then be used to generate the matrix (or table) used by the scanner (or lexical analyzer).

• Deterministic finite automata are frequently also called simply finite automata (fa).


Example of a DFA for Recognizing Identifiers
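The diagram for this slide did not survive, so the following is only a minimal sketch of a table-driven DFA for identifiers, assuming the usual definition (a letter followed by any number of letters or digits). The state names START, IN_ID and DEAD and the helper char_class are hypothetical, not taken from the original figure.

```python
# Hypothetical state names; the original diagram is not available.
START, IN_ID, DEAD = 0, 1, 2

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# Transition table: (state, input class) -> next state.
# Any pair that is missing sends the machine to DEAD.
TRANSITIONS = {
    (START, "letter"): IN_ID,
    (IN_ID, "letter"): IN_ID,
    (IN_ID, "digit"): IN_ID,
}
ACCEPTING = {IN_ID}

def is_identifier(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)), DEAD)
        if state == DEAD:
            return False
    return state in ACCEPTING
```

The dictionary plays the role of the scanner's matrix: one lookup per input character, with no backtracking.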


Examples

Dfas for the sets of strings over the alphabet Σ = { a, b, c }

a. Which have exactly one b:


Examples (Cont. 1)

b. Which have 0 or 1 b's:
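Since the diagrams for examples a and b are missing, here is one possible sketch of both machines. It assumes a three-state design in which the state number counts the b's seen so far (capped at 2); the two examples then share one transition table and differ only in their accepting states. The names DELTA, final_state, exactly_one_b and at_most_one_b are illustrative.

```python
ALPHABET = {"a", "b", "c"}

# DELTA[state][symbol] -> next state.
# The state number is the count of b's read so far, capped at 2.
DELTA = {
    0: {"a": 0, "b": 1, "c": 0},
    1: {"a": 1, "b": 2, "c": 1},
    2: {"a": 2, "b": 2, "c": 2},  # trap state: two or more b's
}

def final_state(text):
    """Run the shared DFA from state 0 and return the state it stops in."""
    state = 0
    for ch in text:
        if ch not in ALPHABET:
            raise ValueError(f"symbol {ch!r} is not in the alphabet")
        state = DELTA[state][ch]
    return state

def exactly_one_b(text):
    """Example a: the only accepting state is 1."""
    return final_state(text) == 1

def at_most_one_b(text):
    """Example b: states 0 and 1 are both accepting."""
    return final_state(text) in {0, 1}
```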


Examples (Cont. 2)

A dfa for a number with an optional fractional part (assume Σ = { 0,1,2,3,4,5,6,7,8,9,+,-,. }):
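The figure for this example is missing. The sketch below assumes the number has the form: optional sign, one or more digits, then optionally a point followed by one or more digits. That reading, the five state numbers and the helper classify are assumptions, not the slide's own machine.

```python
DIGITS = set("0123456789")

def classify(ch):
    """Map a character to an input class, or None if it is outside the alphabet."""
    if ch in DIGITS:
        return "digit"
    if ch in ("+", "-"):
        return "sign"
    if ch == ".":
        return "dot"
    return None

# States: 0 start, 1 sign seen, 2 integer digits, 3 point seen, 4 fraction digits.
DELTA = {
    (0, "sign"): 1,
    (0, "digit"): 2,
    (1, "digit"): 2,
    (2, "digit"): 2,
    (2, "dot"): 3,
    (3, "digit"): 4,
    (4, "digit"): 4,
}
ACCEPTING = {2, 4}

def is_number(text):
    """Return True if the DFA ends in an accepting state after reading text."""
    state = 0
    for ch in text:
        state = DELTA.get((state, classify(ch)))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING
```

Under this assumption "3." and ".5" are rejected; a grammar that allows them would need extra accepting states or transitions.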


Constructing DFA

• Regular expressions give us rules for recognizing the symbols or tokens of a programming language.

• A lexical analyzer recognizes these symbols by using a DFA (machine) to construct a matrix, or table, that reports when a particular kind of symbol has been recognized.

• In order to recognize symbols, we need to know how to (efficiently) construct a DFA from a regular expression.


How to Construct a DFA from a Regular Expression

• Construct a nondeterministic finite automaton (nfa)

• Using the nfa, construct a dfa

• Minimize the number of states in the dfa to get a smaller dfa
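The second step above is commonly done with the subset construction, in which each dfa state stands for a set of nfa states. The following is a minimal sketch of that technique, assuming the nfa is stored as a dictionary of dictionaries and that the empty string marks an ε-transition; the function names are illustrative, and the minimization step is not shown.

```python
from collections import deque

EPS = ""  # assumed marker for an epsilon-transition

def epsilon_closure(nfa, states):
    """All states reachable from `states` using only epsilon-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def move(nfa, states, symbol):
    """States reachable from `states` by one transition on `symbol`."""
    result = set()
    for s in states:
        result |= set(nfa.get(s, {}).get(symbol, ()))
    return result

def nfa_to_dfa(nfa, start, accepting, alphabet):
    """Subset construction: every dfa state is a frozenset of nfa states."""
    start_set = epsilon_closure(nfa, {start})
    dfa, queue = {}, deque([start_set])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue
        dfa[current] = {}
        for symbol in alphabet:
            target = epsilon_closure(nfa, move(nfa, current, symbol))
            if target:  # the empty set means "no transition"
                dfa[current][symbol] = target
                if target not in dfa:
                    queue.append(target)
    dfa_accepting = {s for s in dfa if s & accepting}
    return dfa, start_set, dfa_accepting

def dfa_accepts(dfa, start, accepting, text):
    """Run the constructed dfa over text."""
    state = start
    for ch in text:
        state = dfa[state].get(ch)
        if state is None:
            return False
    return state in accepting

# Hypothetical nfa for ab* | ad, with state numbers chosen for this sketch.
NFA = {0: {EPS: {1, 4}}, 1: {"a": {2}}, 2: {"b": {2}}, 4: {"a": {5}}, 5: {"d": {6}}}
DFA, DFA_START, DFA_ACCEPTING = nfa_to_dfa(NFA, 0, {2, 6}, "abd")
```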


Nondeterministic Finite Automata

• A nondeterministic finite automaton (NFA) allows transitions on a symbol from one state to possibly more than one other state.

• It also allows ε-transitions from one state to another, whereby we can move from the first state to the second without consuming the next input character.

• In an NFA, a string is matched if there is any path from the start state to an accepting state using that string.
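The "any path" rule can be checked without trying paths one at a time: track the set of all states the NFA could be in after each character. The sketch below assumes a dictionary representation with the empty string marking an ε-transition, and the example machine (for the language { ac, abc }) is hypothetical rather than one of the slides' figures.

```python
EPS = ""  # assumed marker for an epsilon-transition

def closure(nfa, states):
    """Add every state reachable through epsilon-transitions alone."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get(EPS, set()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def nfa_accepts(nfa, start, accepting, text):
    """Accept if at least one path spelling text ends in an accepting state."""
    current = closure(nfa, {start})
    for ch in text:
        reached = set()
        for s in current:
            reached |= nfa.get(s, {}).get(ch, set())
        current = closure(nfa, reached)
        if not current:  # every path has died
            return False
    return bool(current & accepting)

# Hypothetical NFA: a, then an optional b (skipped via epsilon), then c.
EXAMPLE = {
    0: {"a": {1}},
    1: {"b": {2}, EPS: {2}},
    2: {"c": {3}},
}
```

For instance, nfa_accepts(EXAMPLE, 0, {3}, "ac") follows the ε-transition from state 1 to state 2 without reading any input.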


NFA Example

This NFA accepts strings such as: abc, abd, ad, ac


Examples

A f.a. for ab*:

A f.a. for ad:

To obtain a f.a. for ab* | ad, we could try:

But this doesn't work, as it also matches strings such as abd, which ab* | ad does not contain.


Examples (Cont. 1)

So, then we could try:

It's not always easy to construct a f.a. from a regular expression.

It is easier to construct an NFA from a regular expression.


Examples (Cont. 2)

Example of an NFA with epsilon-transitions:

This NFA accepts strings such as ac, abc, ...


How to construct an NFA for any regular expression

Basic building blocks:

(1) Any letter a of the alphabet is recognized by:

(2) The empty set is recognized by:


(3) The empty string is recognized by:

(4) Given regular expressions R and S, assume these boxes represent the finite automata for R and S:


How to construct an NFA for any regular expression - 3

(5) To construct an nfa for RS (concatenation):

(6) To construct an nfa for R | S (alternation):


(7) To construct an nfa for R* (closure):
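The diagrams for items (1) to (7) are missing, so the following sketch uses the common Thompson-style wiring, which may differ in detail from the slides (for example, in how many ε-transitions join two boxes). Each "box" is a Fragment with one start state and one accepting state; Fragment, the function names and the use of the empty string for ε are all assumptions made for illustration.

```python
import itertools

EPS = ""                    # assumed marker for an epsilon-transition
_ids = itertools.count()    # source of fresh state numbers

class Fragment:
    """A 'box': one start state, one accepting state, a list of (src, label, dst) edges."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def _pair():
    return next(_ids), next(_ids)

def symbol(a):              # (1) a single letter of the alphabet
    s, f = _pair()
    return Fragment(s, f, [(s, a, f)])

def empty_set():            # (2) no edge at all, so nothing is accepted
    s, f = _pair()
    return Fragment(s, f, [])

def empty_string():         # (3) a single epsilon edge
    s, f = _pair()
    return Fragment(s, f, [(s, EPS, f)])

def concat(r, s):           # (5) RS: link R's accepting state to S's start
    return Fragment(r.start, s.accept, r.edges + s.edges + [(r.accept, EPS, s.start)])

def alternate(r, s):        # (6) R | S: new start and accept states around both boxes
    st, f = _pair()
    extra = [(st, EPS, r.start), (st, EPS, s.start), (r.accept, EPS, f), (s.accept, EPS, f)]
    return Fragment(st, f, r.edges + s.edges + extra)

def star(r):                # (7) R*: allow skipping R and looping back into it
    st, f = _pair()
    extra = [(st, EPS, r.start), (st, EPS, f), (r.accept, EPS, r.start), (r.accept, EPS, f)]
    return Fragment(st, f, r.edges + extra)

def accepts(frag, text):
    """Simulate the nfa by tracking the set of states it could be in."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            cur = stack.pop()
            for src, label, dst in frag.edges:
                if src == cur and label == EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({frag.start})
    for ch in text:
        current = closure({dst for src, label, dst in frag.edges
                           if src in current and label == ch})
    return frag.accept in current

# letter ( letter | digit )*, using "l" and "d" as stand-ins for the two classes.
ident = concat(symbol("l"), star(alternate(symbol("l"), symbol("d"))))
```

Each sub-expression should be built with its own call (as with the two symbol("l") calls above), since reusing one Fragment object in two places would make the two copies share states.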


NOTE: In 1-3 above we supply finite automata for some basic regular expressions, and in 5-7 we supply 3 methods of composition to form finite automata for more complicated regular expressions.

These, in particular, provide methods for constructing finite automata for regular expressions such as:

R+ = RR*

R? = R | ε

[1-3ab] = 1|2|3|a|b
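These identities can be spot-checked by comparing both sides on every string up to a small length. The sketch below uses Python's re module purely as a convenient matcher (it is a backtracking engine, not a dfa), writes R | ε as (?:a|) in Python syntax, and the helper same_language is hypothetical. A bounded check like this is evidence, not a proof.

```python
import re
from itertools import product

def same_language(pattern_a, pattern_b, alphabet, max_len):
    """Compare two patterns on every string over alphabet up to max_len symbols."""
    a, b = re.compile(pattern_a), re.compile(pattern_b)
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            if bool(a.fullmatch(text)) != bool(b.fullmatch(text)):
                return False
    return True

# R+ = RR*, R? = R | epsilon, and a character class as an alternation, with R = a.
plus_ok = same_language("a+", "aa*", "ab", 4)
optional_ok = same_language("a?", "(?:a|)", "ab", 3)
class_ok = same_language("[1-3ab]", "1|2|3|a|b", "1234abc", 2)
```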


Example

Construct an NFA for an identifier using the above mechanical method for the regular expression: letter ( letter | digit )*

First, construct the nfa for the alternation: ( letter | digit )


Example (Cont.1)

Next, construct the closure: ( letter | digit )*

[Figure: NFA for ( letter | digit )* with states numbered 1 through 8 and transitions labeled letter and digit]


Example (Cont.2)

Now, finish the construction for: letter ( letter | digit )*