241-303 Discrete Maths: Grammars/8 1

Discrete Maths

• Objectives
  – to introduce grammars and show their importance for defining programming languages and writing compilers;
  – to show the connection between REs and grammars

241-303, Semester 1 2014-2015

8. Grammars


Overview

1. Why Grammars?

2. Languages

3. Using a Grammar

4. Parse Trees

5. Ambiguous Grammars

6. Top-down and Bottom-up Parsing


7. Building Recursive Descent Parsers

8. Making the Translation Easy

9. Building a Parse Tree

10. Kinds of Grammars

11. From RE to a Grammar

12. Context-free Grammars vs. REs


1. Why Grammars?

• Grammars are the standard way of defining programming languages.

• Tools exist for translating grammars into compilers (e.g. JavaCC, lex, yacc, ANTLR)
  – this saves weeks of work


2. Languages

• We use a natural language to communicate
  – its grammar rules are very complex
  – the rules don’t cover important things

• We use a formal language to define a programming language
  – its grammar rules are fairly simple
  – the rules cover almost everything


• A formal language is a set of legal strings.

• The strings are legal if they correctly use the language’s alphabet and grammar rules.

• The alphabet is often called the language’s terminal symbols (or terminals).


Example 1

• Alphabet (terminals) = {1, 2, 3}

• Using the grammar rules (not shown here; see later), the language is:
    L1 = { 11, 12, 13, 21, 22, 23, 31, 32, 33 }

• L1 is the set of strings of length 2.


Example 2

• Terminals = {1, 2, 3}

• Using different grammar rules, the language is:

L2 = { 111, 222, 333}

• L2 is the set of strings of length 3, where all the terminals are the same.


Example 3

• Terminals = {1, 2, 3}

• Using different grammar rules, the language is:

L3 = {2, 12, 22, 32, 112, 122, 132, ...}

• L3 is the set of strings whose numerical value is divisible by 2.
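The three example languages are simple enough to test for membership directly, without writing down their grammar rules. A minimal sketch in C, assuming each string is a NUL-terminated C string; the function names in_L1, in_L2 and in_L3 are made up for this illustration and do not come from the slides:

```c
#include <stdbool.h>
#include <string.h>

/* true if every character of s is one of the terminals 1, 2, 3 */
static bool over_alphabet(const char *s)
{
    return strspn(s, "123") == strlen(s);
}

/* L1: strings of length 2 */
bool in_L1(const char *s)
{
    return over_alphabet(s) && strlen(s) == 2;
}

/* L2: strings of length 3 where all the terminals are the same */
bool in_L2(const char *s)
{
    return over_alphabet(s) && strlen(s) == 3 && s[0] == s[1] && s[1] == s[2];
}

/* L3: strings whose numerical value is divisible by 2;
   with only the digits 1-3 this means a non-empty string ending in 2 */
bool in_L3(const char *s)
{
    size_t n = strlen(s);
    return over_alphabet(s) && n > 0 && s[n - 1] == '2';
}
```

For example, in_L3("132") is true and in_L3("21") is false.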


3. Using a Grammar

• A grammar is a notation for defining a language, and is made from 4 parts:
  – the terminal symbols
  – the syntactic categories (nonterminal symbols)
      • e.g. statement, expression, noun, verb
  – the grammar rules (productions)
      • e.g. A => B1 B2 ... Bn
  – the starting nonterminal
      • the top-most syntactic category for this grammar


• We define a grammar G as a 4-tuple:
    G = (T, N, P, S)
  – T = terminal symbols
  – N = nonterminal symbols
  – P = productions
  – S = starting nonterminal


3.1. Example 1

• Consider the grammar:
    T = {0, 1}
    N = {S, R}
    P = { S => 0
          S => 0 R
          R => 1 S }
    S is the starting nonterminal

• The right-hand sides of productions usually use a mix of terminals and nonterminals.


Is “01010” in the language?

• Start with an S rule:

    Rule          String Generated
    --            S
    S => 0 R      0 R
    R => 1 S      0 1 S
    S => 0 R      0 1 0 R
    R => 1 S      0 1 0 1 S
    S => 0        0 1 0 1 0

• No more rules can be applied since there are no more nonterminals left in the string.

• Yes, it is in the language.
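The same check can be done mechanically with one function per nonterminal. This is only a sketch, assuming the input is a C string; the helper name accepts_01 is hypothetical, and each function returns the position just after the text it matched, or NULL if no rule applies:

```c
#include <stdbool.h>
#include <stddef.h>

static const char *R(const char *p);

/* S => 0 | 0 R */
static const char *S(const char *p)
{
    if (*p != '0')
        return NULL;
    p++;                      /* consumed the terminal 0 */
    if (*p == '1')            /* a 1 can only be the start of an R */
        return R(p);          /* S => 0 R */
    return p;                 /* S => 0 */
}

/* R => 1 S */
static const char *R(const char *p)
{
    if (*p != '1')
        return NULL;
    return S(p + 1);
}

/* true if the whole string can be generated from S */
bool accepts_01(const char *s)
{
    const char *end = S(s);
    return end != NULL && *end == '\0';
}
```

accepts_01("01010") follows the same rule sequence as the table above, while accepts_01("0101") fails because the final S has no 0 to match.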


Example 2

• Consider the grammar:
    T = {a, b, c, d, z}
    N = {S, R, U, V}
    P = { S => R U z | z
          R => a | b R
          U => d V U | c
          V => b | c }
    S is the starting nonterminal


• The notation:
    X => Y | Z
  is shorthand for the two rules:
    X => Y
    X => Z

• Read ‘|’ as ‘or’.


Is “adbdbcz” in the language?

    Rule           String Generated
    --             S
    S => R U z     R U z
    R => a         a U z
    U => d V U     a d V U z
    V => b         a d b U z
    U => d V U     a d b d V U z
    V => b         a d b d b U z
    U => c         a d b d b c z

• Yes, it is in the language.

• This grammar has choices about how to rewrite the string.


Is “abdbcz” in the language?

    Rule           String Generated
    --             S
    S => R U z     R U z
    R => a         a U z
    which U rule?

• U must be replaced by something beginning with a ‘b’, but the only U rules are:
    U => d V U | c
  and neither alternative begins with a ‘b’.

• No, it is not in the language.
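Both membership questions for this grammar can be replayed in code. The sketch below assumes the grammar of Example 2 (S => R U z | z, R => a | b R, U => d V U | c, V => b | c) and a C string as input; the name accepts_example2 is invented for the illustration. Each function returns the position after the text it matched, or NULL when no rule applies:

```c
#include <stdbool.h>
#include <stddef.h>

/* R => a | b R */
static const char *R(const char *p)
{
    if (*p == 'a') return p + 1;
    if (*p == 'b') return R(p + 1);
    return NULL;
}

/* V => b | c */
static const char *V(const char *p)
{
    return (*p == 'b' || *p == 'c') ? p + 1 : NULL;
}

/* U => d V U | c */
static const char *U(const char *p)
{
    if (*p == 'c') return p + 1;
    if (*p == 'd') {
        p = V(p + 1);
        return p ? U(p) : NULL;
    }
    return NULL;              /* e.g. a 'b' here: no U rule applies */
}

/* S => R U z | z */
static const char *S(const char *p)
{
    if (*p == 'z') return p + 1;
    p = R(p);
    if (p) p = U(p);
    if (p && *p == 'z') return p + 1;
    return NULL;
}

bool accepts_example2(const char *s)
{
    const char *end = S(s);
    return end != NULL && *end == '\0';
}
```

accepts_example2("adbdbcz") succeeds, and accepts_example2("abdbcz") fails inside U for the reason given above: the next character is a 'b'.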


3.2. BNF

• BNF is a shorthand notation for productions
  – Backus Normal Form, or
  – Backus-Naur Form

• We have already used ‘|’:
    X => Y1 | Y2 | ... | Yn


• X => Y [Z]
  is shorthand for two rules:
    X => Y
    X => Y Z

• [Z] means 0 or 1 occurrences of Z.


• X => Y { Z }
  is shorthand for an infinite number of rules:
    X => Y
    X => Y Z
    X => Y Z Z
    X => Y Z Z Z
       :

• { Z } means 0 or more occurrences of Z.


3.3. A Grammar for Expressions

• Consider the grammar:
    T = { 0, 1, 2, ..., 9, +, -, *, /, (, ) }
    N = { Expr, Number }
    P = { Expr => Number
          Expr => ( Expr )
          Expr => Expr + Expr | Expr - Expr |
                  Expr * Expr | Expr / Expr }
    Expr is the starting nonterminal


Defining Number

• The RE definition for a number is:
    number = digit digit*
    digit = [0-9]

• The productions for Number are:
    Number => Digit { Digit }
    Digit => 0 | 1 | 2 | 3 | ... | 9
  or
    Number => Number Digit | Digit
    Digit => 0 | 1 | 2 | 3 | ... | 9
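The first form, Number => Digit { Digit }, maps naturally onto a loop: one required digit, then zero or more further digits. A small sketch in C under that reading; the function name is_number is chosen for the example and isdigit() stands in for the Digit production:

```c
#include <ctype.h>
#include <stdbool.h>

/* Number => Digit { Digit } : one digit, then zero or more digits */
bool is_number(const char *s)
{
    if (!isdigit((unsigned char)*s))    /* the required first Digit */
        return false;
    s++;
    while (isdigit((unsigned char)*s))  /* { Digit } becomes a loop */
        s++;
    return *s == '\0';                  /* nothing may follow the digits */
}
```

The loop plays the role of the RE's digit*, which is why the RE and the production describe the same strings.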


Using Productions

• Expand Expr into (125-2)*3

    Expr => Expr * Expr
         => ( Expr ) * Expr
         => ( Expr - Expr ) * Expr
         => ( Number - Number ) * Number
            :
         => ( 125 - 2 ) * 3


• Expand Number into 125

    Number => Number Digit
           => Number Digit Digit
           => Digit Digit Digit
           => 1 2 5


3.4. Grammars are not Unique

• Two grammars that do the same thing:
    Balanced => ε
    Balanced => ( Balanced ) Balanced
  and:
    Balanced => ε
    Balanced => ( Balanced )
    Balanced => Balanced Balanced
  (ε is the empty string)

• Both generate the same strings, e.g.:
    (()(()))    ()    (()())
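The first of the two grammars translates directly into a short recursive function. The sketch assumes that its first production is the empty alternative, so the rules are Balanced => (empty) | ( Balanced ) Balanced; the names balanced and is_balanced are illustrative only:

```c
#include <stdbool.h>
#include <stddef.h>

/* Balanced => (empty) | ( Balanced ) Balanced
   Returns the position after the matched text, or NULL on a missing ')'. */
static const char *balanced(const char *p)
{
    if (*p != '(')
        return p;                 /* empty alternative: match nothing */
    p = balanced(p + 1);          /* ( Balanced */
    if (p == NULL || *p != ')')
        return NULL;
    return balanced(p + 1);       /* ) Balanced */
}

/* true if the whole string is generated by Balanced */
bool is_balanced(const char *s)
{
    const char *end = balanced(s);
    return end != NULL && *end == '\0';
}
```

The second grammar generates the same strings but is harder to code this way, because the rule Balanced => Balanced Balanced gives no input character to decide on.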


3.5. Productions for parts of C

• Control structures:
    Statement => while ( Cond ) Statement
    Statement => if ( Cond ) Statement
    Statement => if ( Cond ) Statement else Statement

• Testing (conditionals):
    Cond => Expr < Expr | Expr > Expr | ...


• Statement blocks:
    Statement => '{' StatList '}'
    StatList => Statement ; StatList | Statement ;


Using the Statement Production

    Statement => while ( Cond ) Statement
              => while ( Expr < Expr ) Statement
              => while ( Expr < Expr ) { StatList }
              => while ( Expr < Expr ) { Statement ; Statement ; }
                 :
              => while (x < 10) { y++; x++; }

• This example requires an extra Expr production for variables:
    Expr => VariableName


3.6. Generating a Language

• For a given grammar, what strings can it generate?
  – the language is the set of legal strings

• Most languages contain an infinite number of strings (e.g. English)
  – but there is a process for generating them


• For each production, list the strings that can be derived immediately.

• On the 2nd round, put those strings back into the productions to generate more strings.

• On the 3rd round, put those strings back...

• Continue for as many rounds as you want.


Example

• Consider the grammar:
    T = { w, c, s, '{', '}', ';' }
    N = { S, L }
    P = { S => w c S | '{' L '}' | s ';'
          L => L S | ε }
    (ε is the empty string)
    S is the starting nonterminal


Strings in First 3 Rounds

              S                        L
    Round 1:  s;                       ε
    Round 2:  wcs;  {}                 s;
    Round 3:  wcwcs;  wc{}  {s;}       wcs;  {}  s;s;  s;wcs;  s;{}

    (each round lists only the new strings; ε is the empty string)


4. Parse Trees

• A parse tree is a graphical way of showing how productions are used to generate a string.

• Data structures representing parse trees are used inside compilers to store information about the program being compiled.


Example 1

• Consider the grammar:
    T = { a, b }
    N = { S }
    P = { S => S S | a S b | a b | b a }
    S is the starting nonterminal


Parse Tree for “aabbba”

The root of the tree is the start symbol S. At each step, expand one nonterminal leaf:

  1. Expand the root using S => S S
  2. Expand the left S using S => a S b
  3. Expand the inner S using S => a b
  4. Expand the right S using S => b a

The finished tree:

                S
              /   \
             S     S
           / | \   / \
          a  S  b b   a
            / \
           a   b

• Stop when there are no more nonterminals in leaf positions.

• Read off the string by reading the leaves left to right.
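Inside a program, a parse tree like this is usually a linked structure of nodes. The sketch below is one possible representation, not the one used later in the slides: struct node, read_leaves and aabbba_leaves are names invented here. It builds the tree for "aabbba" by hand and reads the leaves left to right:

```c
#include <string.h>

#define MAX_KIDS 3

/* A parse tree node: a nonterminal with children, or a terminal leaf. */
struct node {
    char symbol;                        /* 'S', 'a' or 'b' */
    int nkids;                          /* 0 for a leaf */
    const struct node *kids[MAX_KIDS];
};

/* Append the leaves of t, left to right, to out (a NUL-terminated
   buffer that the caller has made large enough). */
void read_leaves(const struct node *t, char *out)
{
    if (t->nkids == 0) {
        size_t n = strlen(out);
        out[n] = t->symbol;
        out[n + 1] = '\0';
        return;
    }
    for (int i = 0; i < t->nkids; i++)
        read_leaves(t->kids[i], out);
}

/* Build the tree for "aabbba" by hand and write its leaf string to out. */
void aabbba_leaves(char *out)
{
    static const struct node a = { 'a', 0, { 0 } }, b = { 'b', 0, { 0 } };
    static const struct node inner = { 'S', 2, { &a, &b } };          /* S => a b   */
    static const struct node left  = { 'S', 3, { &a, &inner, &b } };  /* S => a S b */
    static const struct node right = { 'S', 2, { &b, &a } };          /* S => b a   */
    static const struct node root  = { 'S', 2, { &left, &right } };   /* S => S S   */
    out[0] = '\0';
    read_leaves(&root, out);
}
```

Calling aabbba_leaves(buf) with a buffer of at least 7 characters leaves the string aabbba in buf.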


Example 2

• Consider the grammar:
    T = { a, +, *, (, ) }
    N = { E, T, F }
    P = { E => T | T + E
          T => F | F * T
          F => a | ( E ) }
    E is the starting nonterminal


Is “a+a*a” in the Language?

Start with the root E:

  1. Expand E using E => T + E
  2. Expand the left T using T => F
  3. Continue expansion until:

              E
            / | \
           T  +  E
           |     |
           F     T
           |   / | \
           a  F  *  T
              |     |
              a     F
                    |
                    a

The leaves read a + a * a, so the string is in the language.
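The question "is the string in the language?" can also be answered by code that mirrors the productions, one function per nonterminal. This is a hedged sketch for the E, T, F grammar above; the wrapper name in_expr_language is hypothetical, and each function returns the position after what it matched, or NULL on failure:

```c
#include <stdbool.h>
#include <stddef.h>

static const char *E(const char *p);

/* F => a | ( E ) */
static const char *F(const char *p)
{
    if (*p == 'a') return p + 1;
    if (*p == '(') {
        p = E(p + 1);
        return (p && *p == ')') ? p + 1 : NULL;
    }
    return NULL;
}

/* T => F | F * T */
static const char *T(const char *p)
{
    p = F(p);
    if (p && *p == '*') return T(p + 1);
    return p;
}

/* E => T | T + E */
static const char *E(const char *p)
{
    p = T(p);
    if (p && *p == '+') return E(p + 1);
    return p;
}

/* true if the whole string is generated by E */
bool in_expr_language(const char *s)
{
    const char *end = E(s);
    return end != NULL && *end == '\0';
}
```

The order of calls made for in_expr_language("a+a*a") is the same as the order in which the tree above was expanded.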


5. Ambiguous Grammars

• A grammar is ambiguous when a string can be represented by more than one parse tree
  – it means that the string has more than one “meaning” in the language

• e.g. a variant of the last grammar example:
    P = { E => E + E | E * E | ( E ) | a }


Parse Trees for “a+a*a”

            E                        E
          / | \                    / | \
         E  +  E        and       E  *  E
         |   / | \              / | \   |
         a  E  *  E            E  +  E  a
            |     |            |     |
            a     a            a     a


• The two parse trees allow a string like “5+5*5” to be read in two different ways:
  – 5 + 25 (the left-hand tree)
  – 10 * 5 (the right-hand tree)


Why is Ambiguity Bad?

• In a programming language, a string with more than one meaning means that the compiler and run-time system will not know how to process it.

• e.g. in C:
    x = 5 + 5 * 5;   // what is the value in x?
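The two readings give different values, which is easy to confirm by writing each one out with explicit brackets. The function names below are made up for this illustration:

```c
/* a + b * c read as a + (b * c): the left-hand parse tree */
int times_first(int a, int b, int c)
{
    return a + (b * c);
}

/* a + b * c read as (a + b) * c: the right-hand parse tree */
int plus_first(int a, int b, int c)
{
    return (a + b) * c;
}
```

With a = b = c = 5 the first reading gives 30 and the second gives 50. Real C avoids the problem because its rules give * higher precedence than +, so x = 5 + 5 * 5; stores 30.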


6. Top-down and Bottom-up Parsing

• Top-down parsing creates a parse tree starting from the start symbol and moves down towards the leaves.
  – used in most compilers
  – usually implemented as recursive-descent parsing


• Bottom-up parsing creates a parse tree starting from the leaves, and moves up towards the start symbol.
  – productions are used in ‘reverse’

• Both kinds of parsing often require “guessing” to decide which productions to use to parse a string.


Example

• Consider the grammar:
    T = { a, +, *, (, ) }
    N = { E, T, F }
    P = { E => T | T + E
          T => F | F * T
          F => a | ( E ) }
    E is the starting nonterminal


Top-down Parse of “a+a*a”

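[Figure: the parse tree for "a+a*a" from section 4, built top-down: starting at the root E and working down to the leaves a, +, a, *, a.]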


Bottom-up Parse of “a+a*a”

[Figure: the same parse tree for "a+a*a", built bottom-up: starting at the leaves a, +, a, *, a and working up to the root E.]


Guessing when Building

• Guessing occurs when there are several rules which can apply to the current nonterminal.

• Compilers are very bad at guessing, and so programming language designers try to make grammars as simple as possible.


Guessing in Bottom-up

• The compiler must backtrack to an earlier point and try a different rule.

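[Figure: a bottom-up parse of "a+a*a" in which each a has been reduced to F, then T, then E, leaving E + E * E. No production matches this, so the parse is STUCK.]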


7. Building Recursive Descent Parsers

• The parser will read a string as input, and test if it fits the grammar.

    input string (e.g. "a+a*a")  →  parser (checks input against the grammar)  →  output is: "yes" or "no"


• In section 9 we will add the ability to generate a parse tree.

    input string (e.g. "a+a*a")  →  parser (checks input against the grammar)  →  output is: "no" or a tree


• The parser will be coded in 2 steps:
  – 1) Convert the grammar into syntax graphs
  – 2) Convert the syntax graphs into code

    grammar  → (converted to) →  syntax graphs  → (converted to) →  parser

• The pay-off is that a programmer writes high-level grammar rules instead of complex code.


7.1. What is a Syntax Graph?

• A syntax graph is a graphical representation of a grammar
  – easier to manipulate than grammars

• For example:
    P = { A => x | ( B )
          B => A C
          C => { + A } }

• Valid strings:
    x    (x)    (x+x+x)

• Remember that { R } means 0 or more R’s.


Graphs for A, B, and C

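[Figure: three syntax graphs.
  A: a choice point leading either to the circle x, or to the sequence circle (, box B, circle ).
  B: box A followed by box C.
  C: a choice point that either exits or loops through circle + and box A.]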


Meanings

• The input string is processed by following the graph for the top-most symbol in the grammar.

• A circle (here containing x) means:
  – if the current input character is an "x" then continue by reading the next input character, otherwise reject the input string.


• A box (here containing B) means:
  – that the current input should be processed by the B syntax graph. It's like 'calling' the B graph to do the work.


Choice Points Make Things Hard

• The graph must decide which path to take when execution reaches a choice point.
  – "deciding" can be hard

[Figure: a choice point with two paths, one starting with the circle ( and one starting with the circle x.]


• The general solution is to look ahead in the graph:
  – e.g. if the current input character is a "(" then go along the path that looks for a "(" next


• For lookahead to be fast, it should be possible to decide which path to take by looking at only the next graph symbol.

[Figure: the same choice point, with the circle ( on the top path and the circle x on the bottom path.]

• This is possible if all the paths start with circles.
  – e.g. the top path wants a "(", the bottom path wants an "x".


7.2. From Grammar to Syntax Graph

• There are 6 translation rules to convert a grammar into a syntax graph.


7.2.1. Translate a Production

• The production:
    A => Body
  is mapped to a graph labelled A.

[Figure: graph A containing a hexagon labelled Body.]

• The text inside the hexagon still needs to be translated to a graph.


7.2.2. Translate a Terminal

• A terminal symbol x is translated to the graph:

[Figure: a circle containing x.]

• The translation is finished.


7.2.3. Translate a Nonterminal

• A nonterminal symbol B is translated to the graph:

[Figure: a box containing B.]

• The translation is finished.


7.2.4. Translate ‘|’

• A production body of the form:
    Body1 | Body2 | ... | BodyN
  is translated to the graph:

[Figure: a choice point branching into N parallel paths, one for each of the hexagons Body1, Body2, ..., BodyN.]

• The text inside the hexagons still needs to be translated to graphs.


7.2.5. Translate a Sequence

• A production body of the form:
    Body1 Body2 ... BodyN
  is translated to the graph:

[Figure: a single path through the hexagons Body1, Body2, ..., BodyN in order.]

• The text inside the hexagons still needs to be translated to graphs.


7.2.6. Translate {...} (0 or more)

• A production body of the form:
    { Body1 }
  is translated to the graph:

[Figure: a choice point with a loop path through the hexagon Body1 and a path that skips it.]

• The text inside the hexagon still needs to be translated to a graph.


7.3. Grammar to Graph Example

• Consider the grammar:
    T = { x, +, (, ) }
    N = { A, B, C }
    P = { A => x | ( B )
          B => A C
          C => { + A } }
    A is the starting nonterminal


Translate the A Rule

• A => x | ( B ) uses 7.2.1 to become:

[Figure: graph A containing the hexagon x | ( B ).]

• Use 7.2.4 to become:

[Figure: graph A with a choice between the hexagons x and ( B ).]


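• Use 7.2.2 on the top branch:

[Figure: the top branch is now the circle x; the bottom branch is still the hexagon ( B ).]

• Use 7.2.5 on the bottom branch:

[Figure: the bottom branch is now the sequence of hexagons (, B and ).]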


• Use rules 7.2.2 and 7.2.3 on the bottom branch:

[Figure: the finished graph A: a choice between the circle x and the sequence circle (, box B, circle ).]


Graphs for A, B, and C

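[Figure: three syntax graphs.
  A: a choice point leading either to the circle x, or to the sequence circle (, box B, circle ).
  B: box A followed by box C.
  C: a choice point that either exits or loops through circle + and box A.]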


Combining the Graphs

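• Combine B and C graphs with A:

[Figure: the combined graph A. The first choice point selects either the circle x, or the path circle (, box A, a second choice point that loops through circle + and box A zero or more times, then circle ).]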


Two Easy Choice Points

It is easy to decide which path to take at the two choice points.

Each path starts with a different terminal.

We can lookahead to decide which path to take.


7.4. From Syntax Graphs to Code

• Each syntax graph is transformed into a function using 6 basic transformations.

• main() does two things:
  – reads the first input character:
        ch = getchar();   // ch is a global variable
  – calls the function representing the starting nonterminal:
        A();


7.4.1. Transform a Graph

[Figure: graph G containing a pentagon labelled GBody.]

• Becomes the function:
    void G()
    { /* the code generated by
         transforming the graph GBody */
    }

• The graph inside the pentagon still needs to be translated to code.


7.4.2. Transform a Terminal

[Figure: a circle containing x.]

• Becomes the code:
    if (ch == 'x')
      ch = getchar();   // get ch for next step
    else
      error();          // reports error then exits

  – check input is x; get next input


7.4.3. Transform a Nonterminal

[Figure: a box containing G1.]

• Becomes the function call:
    G1();


7.4.4. Transform a Choice

• Becomes a switch or multiple if statement.

[Figure: a choice point branching into N parallel paths GBody1, GBody2, ..., GBodyN.]


    if (ch == firstGBody1)
      // transformation of GBody1 ;
    else if (ch == firstGBody2)
      // transformation of GBody2 ;
    else if ...
      :
    else if (ch == firstGBodyN)
      // transformation of GBodyN ;
    else
      error();


• The translation tests ch to see if it is the character firstGBody1, firstGBody2, etc.
  – ch is the current input character

• firstGBody1, firstGBody2, etc. are the first terminals (circles) of the paths GBody1, GBody2, etc.

• These terminals must be distinct (different)
  – then only one test will succeed


7.4.5. Transform a Sequence

[Figure: a single path through GBody1, GBody2, ..., GBodyN in order.]

• Becomes the block:
    {
      // transformation of GBody1 ;
      // transformation of GBody2 ;
         :
      // transformation of GBodyN ;
    }

7.4.6. Transform a Multiple

• Becomes the loop:
    while (ch == firstGBody1)
      // transformation of GBody1 ;

• firstGBody1 is the first terminal in GBody1.

[Graph: a loop from the choice point through GBody1 and back.]


Two Optimising Transformations

• There are two other transformations for a choice and a multiple.

• These are optimisations when the graph is a special shape.


7.4.7. Optimising Choice

• Becomes a switch or multiple if statement.

[Graph: a choice point that branches into parallel paths, each starting with its own terminal: x1 then GBody1, x2 then GBody2, ..., xN then GBodyN.]

    if (ch == 'x1') {
      ch = getchar();
      // transformation of GBody1 ;
    }
    else if (ch == 'x2') {
      ch = getchar();
      // transformation of GBody2 ;
    }
    else if ...
      :
    else if (ch == 'xN') {
      ch = getchar();
      // transformation of GBodyN ;
    }
    else
      error();

• Here the assumption is that the terminals x1, x2, etc. are all different
  – this means that only 1 test will succeed

7.4.8. Optimising Multiple

• Becomes the loop:
    while (ch == 'x') {
      ch = getchar();
      // transformation of GBody1 ;
    }

[Graph: a loop from the choice point through the terminal x, then GBody1, and back.]

Code Optimisations

• Sometimes the generated code can be simplified. For example:

    ch = getchar();
    foo();
    while (ch == 'x') {
      ch = getchar();
      foo();
    }

  can be rewritten as:

    do {
      ch = getchar();
      foo();
    } while (ch == 'x');
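The two loop forms can be compared directly. The following is a minimal sketch, assuming the input is held in a string rather than read with getchar(); the names p, calls, next(), withWhile() and withDoWhile() are invented for this illustration, and foo() simply counts how often it is called.

```c
#include <assert.h>
#include <stddef.h>

static const char *p;   /* next unread character (stands in for stdin) */
static int ch;          /* current input character */
static int calls;       /* how many times foo() has run */

static void next(void) { ch = (*p != '\0') ? *p++ : '\0'; }
static void foo(void)  { calls++; }

/* Generated form: one read and one foo() before the loop. */
int withWhile(const char *s) {
    assert(s != NULL);
    p = s; calls = 0;
    next(); foo();
    while (ch == 'x') { next(); foo(); }
    return calls;
}

/* Simplified form: the duplicated statements move into a do-while. */
int withDoWhile(const char *s) {
    assert(s != NULL);
    p = s; calls = 0;
    do { next(); foo(); } while (ch == 'x');
    return calls;
}
```

Both functions read the same characters and call foo() the same number of times for any input, which is why the rewrite is safe.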

error() Function

• A simple error reporting function:

    void error()
    {
      printf("Error while processing %c\n", ch);
      exit(1);
    }

7.5. Graph to Code Example

• The original grammar in section 7.3:

    T = { x, +, (, ) }

    N = { A, B, C }

    P = { A => x | ( B )
          B => A C
          C => { + A } }

    A is the starting nonterminal


Graphs for A, B, and C (again)

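[Syntax graphs: A is a choice between the terminal x and the sequence ( B ); B is the sequence A C; C is a loop that repeats + A.]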

Code

    #include <stdio.h>     // getchar(), printf()
    #include <stdlib.h>    // exit()

    void A();       // parse functions
    void B();
    void C();
    void error();

    int ch;         // holds current input char

    int main(void)
    {
      ch = getchar();
      A();
      printf("parsed successfully\n");
      return 0;
    }

    void A()
    {
      if (ch == 'x')
        ch = getchar();
      else if (ch == '(') {
        ch = getchar();
        B();
        if (ch == ')')
          ch = getchar();
        else
          error();
      }
      else
        error();
    }

This code has been optimised to reduce the number of calls to error().

    void B()
    {
      A();
      C();
    }

    void C()
    {
      while (ch == '+') {
        ch = getchar();
        A();
      }
    }
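To experiment with the parser without typing at the keyboard, the same functions can read from a string. This is only a sketch under stated assumptions: in, ok, next(), fail() and parse() are invented names, errors set a flag instead of calling exit(), and parse() additionally checks that the whole string was consumed (the code above does not make that check).

```c
#include <assert.h>
#include <stddef.h>

static const char *in;   /* next unread character */
static int ch;           /* current input character */
static int ok;           /* cleared on the first syntax error */

static void A(void);
static void B(void);
static void C(void);

static void next(void) { ch = (*in != '\0') ? *in++ : '\0'; }
static void fail(void) { ok = 0; }

/* A => x | ( B ) */
static void A(void) {
    if (!ok) return;
    if (ch == 'x')
        next();
    else if (ch == '(') {
        next();
        B();
        if (ok && ch == ')') next();
        else fail();
    }
    else
        fail();
}

/* B => A C */
static void B(void) { A(); C(); }

/* C => { + A } */
static void C(void) {
    while (ok && ch == '+') {
        next();
        A();
    }
}

/* Returns 1 if s is a complete sentence of the grammar, else 0. */
int parse(const char *s) {
    assert(s != NULL);
    in = s;
    ok = 1;
    next();
    A();
    return ok && ch == '\0';
}
```

For example, parse("(x+x+x)") returns 1, while parse("(x+)") returns 0 because A() finds ')' where it expects x or (.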


8. Making the Translation Easy

• The translation (syntax graphs to code) requires the grammar to have special properties.

• When there is a choice about which path to take through a graph, the decision should depend only on the current character and the first terminals on the paths.

[Graph: a choice point with two paths, one starting with the terminal ( and the other with the terminal x.]

Examples

• The two paths start with ( and x, and the current input is 'x'
  – the choice is easy to make

• Both paths start with a, and the current input is 'a'
  – the choice isn't easy

• It may be possible to "convert" a grammar into a suitable form by using techniques such as:
  – left recursion elimination
  – left factoring

8.1. Left Recursion Elimination

• Example of left recursion:
    L => L a d | ε

• How many times should the L production be used to parse "adad"?

• Rearrange the grammar:
    L => a d L | ε

  – such a rearrangement is not always possible

[Graphs for L when the current input is "a". BAD: the path L a d starts with the nonterminal L. GOOD: the path a d L starts with the terminal a.]
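The rearranged grammar L => a d L | ε can be parsed by looking only at the current character. Here is a minimal sketch, assuming the input is held in a string; p, L() and countL() are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

static const char *p;   /* next unread character */

/* L => a d L | ε
   Returns how many times the "a d L" alternative was used,
   or -1 on a syntax error. */
static int L(void) {
    if (*p == 'a') {              /* first terminal of "a d L" */
        p++;
        if (*p != 'd') return -1;
        p++;
        int rest = L();
        return rest < 0 ? -1 : rest + 1;
    }
    return 0;                     /* ε alternative: consume nothing */
}

/* Parses all of s; returns the count from L(), or -1 if s is rejected. */
int countL(const char *s) {
    assert(s != NULL);
    p = s;
    int n = L();
    return (n >= 0 && *p == '\0') ? n : -1;
}
```

With this version the question "how many times?" answers itself: countL("adad") returns 2, because each 'a' seen in the input triggers exactly one use of the first alternative.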

Another Example

• Left recursive grammar:
    Number => Number Digit | Digit
    Digit => 0 | 1 | 2 | 3 | ... | 9

• How many times should the Number production be used to parse "123"?

• Rearrange to:
    Number => Digit Number | Digit
    Digit => 0 | 1 | 2 | 3 | ... | 9

  – there's still a problem; see next slides

8.2. Left factoring

• When 2 (or more) productions begin with the same terminal or nonterminal, then which production should be used?

• e.g. Which X rule to use to parse "ae..."?
    X => a d S
    X => a e R

[Graph for X when the current input is 'a'. BAD: both paths, a d S and a e R, start with the terminal a.]

• Left factoring creates a new production which represents the "tails" of the left factored rules.

• e.g. left factoring the X rules:
    X => a XTail
    XTail => d S | e R

[Graphs for X and XTail when the current input is 'a'. GOOD: X has a single path, a then XTail; the paths of XTail, d S and e R, start with different terminals.]

Another Example

• Which Number rule should be used to parse "123"?

    Number => Digit Number | Digit
    Digit => 0 | 1 | 2 | 3 | ... | 9

• Left factorise Number:
    Number => Digit NumTail
    NumTail => Number | ε
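After left factoring, each choice can be made from the current character alone. The sketch below is one possible translation, assuming the input is held in a string; p and isNumber() are invented names, and isdigit() from <ctype.h> stands in for the ten Digit alternatives.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *p;   /* next unread character */

static int Number(void);

/* Digit => 0 | 1 | ... | 9 */
static int Digit(void) {
    if (isdigit((unsigned char)*p)) { p++; return 1; }
    return 0;
}

/* NumTail => Number | ε : take Number only when a digit comes next */
static int NumTail(void) {
    if (isdigit((unsigned char)*p))
        return Number();
    return 1;   /* ε */
}

/* Number => Digit NumTail */
static int Number(void) {
    return Digit() && NumTail();
}

/* Returns 1 if all of s is a Number, else 0. */
int isNumber(const char *s) {
    assert(s != NULL);
    p = s;
    return Number() && *p == '\0';
}
```

For "123", Number is used three times and the last NumTail takes the ε alternative because no digit follows.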

9. Building a Parse Tree

• Now we will augment the parser code of section 7 to generate a parse tree.

• The grammar again:

    T = { x, +, (, ) }

    N = { A, B, C }

    P = { A => x | ( B )
          B => A C
          C => { + A } }

    A is the starting nonterminal

9.1. Representing the Parse Trees

• The production:
    A => x | ( B )
  can create two possible parse trees:

[Trees: A with the single child x; or A with the three children (, tree for B, and ).]

• The production:
    B => A C
  will create the parse tree:

[Tree: B with the two children tree for A and tree for C.]

• The production:
    C => { + A }
  can generate an infinite number of parse trees:

[Trees: C with no + A pairs; or C with the children + and tree for A; or C with the children +, tree for A, +, tree for A; or ...]

A Parse Tree for "(x+x+x)"

    A
      (
      B
        A
          x
        C
          +
          A
            x
          +
          A
            x
      )

Our code will read in a string and create a parse tree data structure like this one.

9.2. The Tree Data Structure

• The nodes in a parse tree can have different numbers of children.

• The C grammar rule can generate 1 child or any even number of children!
  – 2, 4, 6, 8, ...

Tree Node Data Structure

    struct node {
      char label;
      struct node *leftChild;
      struct node *rightSib;   // sibling
    };
    typedef struct node *TREE;

[Node layout: label, leftChild, rightSib.]

This approach allows us to have a variable number of siblings.

9.3. Tree Building Functions

• A collection of building functions:
  – a function to create a node with 0 children
  – a function to create a node with 1 child
  – a function to create a node with 2 children
  – etc.

• The C production will require some fancy coding.

    TREE makeLeaf(char x)
    {
      TREE root = (TREE) malloc(sizeof(struct node));
      root->label = x;
      root->leftChild = NULL;
      root->rightSib = NULL;
      return root;
    }

[Node created by makeLeaf(): label x, with leftChild and rightSib both NULL (N).]

    TREE makeNode1(char x, TREE t)
    {                          // the subtree t is supplied
      TREE root = makeLeaf(x);
      root->leftChild = t;
      return root;
    }

[Tree created by makeNode1(): a root node x whose rightSib is NULL (N) and whose leftChild points to t. The fields of t are marked '?'.]

'?' means that makeNode1() does not care what the value is.

    TREE makeNode2(char x, TREE t1, TREE t2)
    {                          // subtrees t1 and t2 are supplied
      TREE root = makeNode1(x, t1);
      t1->rightSib = t2;
      return root;
    }

[Tree created by makeNode2(): a root node x whose rightSib is NULL (N) and whose leftChild points to t1; the rightSib of t1 points to t2. The other fields are marked '?'.]

    TREE makeNode3(char x, TREE t1, TREE t2, TREE t3)
    {                          // the subtrees are supplied
      TREE root = makeNode2(x, t1, t2);
      t2->rightSib = t3;
      return root;
    }

[Tree created by makeNode3(): a root node x whose rightSib is NULL (N) and whose leftChild points to t1; the rightSib of t1 points to t2, and the rightSib of t2 points to t3. The other fields are marked '?'.]

This approach can be used to create makeNode4(), and so on.

Dealing with the C production

• The C production can generate any even number of children:

    C => { + A }

• A C tree will be constructed in three ways:
  – a C node with 1 child
      • use makeNode1('C', makeLeaf('e'))
      • 'e' stands for ε
  – a C node with 2 children
      • use makeNode2()
  – a C node with 4, 6, 8, ... children
      • use add2Children() repeatedly, after calling makeNode2() first

    TREE add2Children(TREE t, TREE t1, TREE t2)
    {
      TREE rm = rightMostChild(t);
      rm->rightSib = t1;
      t1->rightSib = t2;
      return t;
    }

We will not define rightMostChild().

add2Children() adds t1 and t2 to the end of t's children (after rm).

[Tree: t is the root node x; rm is the rightmost of t's existing children; the rightSib of rm points to t1, and the rightSib of t1 points to t2. The other fields are marked '?'.]
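The slides leave rightMostChild() undefined. One plausible definition, offered only as a sketch, walks the rightSib links of the first child. The behaviour for a node with no children (returning NULL) is an assumption; add2Children() is only called after makeNode2(), so it never meets that case.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    char label;
    struct node *leftChild;
    struct node *rightSib;   /* sibling */
};
typedef struct node *TREE;

/* Assumed behaviour: return the last child of t, or NULL if t has none. */
TREE rightMostChild(TREE t) {
    assert(t != NULL);
    TREE c = t->leftChild;
    if (c == NULL)
        return NULL;
    while (c->rightSib != NULL)   /* follow the sibling chain to its end */
        c = c->rightSib;
    return c;
}
```

Because the children form a singly linked list, each call costs time proportional to the number of children; keeping a pointer to the last child in C() would avoid the repeated walk.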

9.4. Parse Trees as Code

• The tree A with the single child x becomes:

[Cells: A (rightSib = N, leftChild points to x); x (leftChild = N, rightSib = N).]

• The tree A with the children (, tree for B, ) becomes:

[Cells: A (rightSib = N, leftChild points to '('); '(' (leftChild = N, rightSib points to B); B (leftChild points to the cells representing the B tree, rightSib points to ')'); ')' (leftChild = N, rightSib = N).]

• The tree B with the children tree for A and tree for C becomes:

[Cells: B (rightSib = N, leftChild points to A); A (leftChild points to the cells for the A tree, rightSib points to C); C (leftChild points to the cells for the C tree, rightSib = N).]

• The tree C with the children +, tree for A, +, tree for A becomes:

[Cells: C (rightSib = N, leftChild points to the first +); + (leftChild = N, rightSib points to A); A (leftChild points to the cells for the A tree, rightSib points to the second +); + (leftChild = N, rightSib points to A); A (leftChild points to the cells for the A tree, rightSib = N).]

9.5. Code with Parse Tree Generation

    #include ...

    struct node { ... };
    typedef struct node *TREE;

    TREE A();       // parse functions
    TREE B();
    TREE C();
    void error();
    TREE makeLeaf(char x);
      :             // other TREE building prototypes

    int ch;         // holds current input char (int, since getchar() returns int)

    int main(void)
    {
      ch = getchar();
      TREE parseTree = A();
        :
      // use parseTree, print it, etc.
      return 0;
    }

    TREE A()
    {
      if (ch == 'x') {
        ch = getchar();
        return makeNode1('A', makeLeaf('x'));
      }
      else if (ch == '(') {
        ch = getchar();
        TREE BTree = B();
        if (ch == ')') {
          ch = getchar();
          return makeNode3('A', makeLeaf('('), BTree, makeLeaf(')'));
        }
        else
          error();
      }
      else
        error();
      return NULL;   // not reached: error() exits
    }

    TREE B()
    {
      TREE ATree = A();
      TREE CTree = C();
      return makeNode2('B', ATree, CTree);
    }

    TREE C()
    {
      TREE ATree, CTree;
      int numLoops = 0;       // times round the loop
      while (ch == '+') {
        numLoops++;
        ch = getchar();
        ATree = A();
        if (numLoops == 1)    // 1st time through loop
          CTree = makeNode2('C', makeLeaf('+'), ATree);
        else                  // 2nd, 3rd, etc. time
          CTree = add2Children(CTree, makeLeaf('+'), ATree);
      }
      if (numLoops == 0)      // skipped the loop
        CTree = makeNode1('C', makeLeaf('e'));
      return CTree;
    }
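Once the tree is built, main() can print it. The function below is not part of the slides; it is a sketch of one way to do it, assuming the leftChild/rightSib node layout of section 9.2. printTree() is an invented name, and returning the node count is a convenience added for checking.

```c
#include <assert.h>
#include <stdio.h>

struct node {
    char label;
    struct node *leftChild;
    struct node *rightSib;   /* sibling */
};
typedef struct node *TREE;

/* Print t, its right siblings, and all their descendants, one label
   per line, indented two spaces per level. Returns the number of
   nodes printed. */
int printTree(TREE t, int depth) {
    assert(depth >= 0);
    int count = 0;
    for (; t != NULL; t = t->rightSib) {
        printf("%*s%c\n", 2 * depth, "", t->label);
        count += 1 + printTree(t->leftChild, depth + 1);
    }
    return count;
}
```

Calling printTree(parseTree, 0) on the tree for "(x+x+x)" would list A at the left margin, with (, B and ) indented beneath it, and so on down to the x leaves.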

10. Kinds of Grammars

• There are 4 main kinds of grammar, of increasing expressive power:
  – regular (type 3) grammars
  – context-free (type 2) grammars
  – context-sensitive (type 1) grammars
  – unrestricted (type 0) grammars

• They vary in the kinds of productions they allow.

10.1. Regular Grammars

• Every production is of the form:
    A => a | a B | ε
  – A, B are nonterminals, a is a terminal

• These are sometimes called right linear rules because if a nonterminal appears in the rule body, then it must appear last.

• Regular grammars are equivalent to REs (and also to automata).

• Example:
    S => wT
    T => xT
    T => a
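The link between regular grammars and automata can be seen by turning each nonterminal of the example into a state. This is a sketch rather than a general construction; DONE and FAIL are extra states invented here for "accepted" and "rejected", and accepts() is an invented name.

```c
#include <assert.h>
#include <stddef.h>

/* One state per nonterminal, plus DONE (accept) and FAIL (reject). */
enum state { S, T, DONE, FAIL };

/* Returns 1 if s can be generated from S by
   S => wT, T => xT, T => a; otherwise 0. */
int accepts(const char *s) {
    assert(s != NULL);
    enum state st = S;
    for (; *s != '\0' && st != FAIL; s++) {
        switch (st) {
        case S:                                  /* S => wT */
            st = (*s == 'w') ? T : FAIL;
            break;
        case T:
            if (*s == 'x')      st = T;          /* T => xT */
            else if (*s == 'a') st = DONE;       /* T => a  */
            else                st = FAIL;
            break;
        default:                                 /* input after DONE */
            st = FAIL;
            break;
        }
    }
    return st == DONE;
}
```

The grammar, this automaton, and the RE w x* a all describe the same strings, for example "wa" and "wxxa".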

An Equivalence Diagram

[Diagram: Regular Grammars, REs, and Automata, linked to show that they have the same expressive power.]

Example

    Integer => + UInt | - UInt | 0 Digits | 1 Digits | ... | 9 Digits

    UInt => 0 Digits | 1 Digits | ... | 9 Digits

    Digits => 0 Digits | 1 Digits | ... | 9 Digits | ε

10.2. Context-Free Grammars

• Every production is of the form:
    A => α
  – A is a nonterminal, α can be any number of nonterminals or terminals

• Most of our examples have been context-free grammars
  – used widely to define programming languages
  – they subsume regular grammars

• Example:
    A => a
    A => aBcd
    B => ae

10.3. Context-Sensitive Grammars

• Every production is of the form:
    α => β
  – α, β can contain any number of terminals and nonterminals
  – α must contain at least 1 nonterminal
  – size(β) >= size(α)
  – β cannot be ε

• Example:
    A => a11
    A => aB2d
    B2 => ae

• Context-sensitive rules allow the grammar to specify a context for a rewrite
  – e.g. A1a0 => 1b00
  – the string 2A1a01 becomes 21b001

• Context-sensitive grammars are more powerful than context-free grammars because of this context ability.

Example

• The language:
    E = { 012, 001122, 000111222, ... }
  or, in brief,
    E = { 0^n 1^n 2^n | n >= 1 }
  cannot be expressed using a context-free grammar; it needs a context-sensitive grammar:

    S => 0 A 1 2 | 0 1 2
    A => 0 A 1 C | 0 1 C
    C 1 => 1 C
    C 2 => 2 2

Rewrite S to 001122

• S => 0 A 1 2
• 0 A 1 2 => 0 0 1 C 1 2
  – using A => 0 1 C
• 0 0 1 C 1 2 => 0 0 1 1 C 2
  – using C 1 => 1 C
• 0 0 1 1 C 2 => 0 0 1 1 2 2
  – using C 2 => 2 2
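The same rewriting can be carried out mechanically for larger n. The sketch below is an illustration, not a general parser for context-sensitive grammars: it fixes one order of rule application (all A rules first, then C 1 => 1 C, then C 2 => 2 2), and the names rewrite() and derive() are invented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace the leftmost occurrence of lhs in s by rhs (same length).
   Returns 1 if a rewrite happened, 0 otherwise. */
static int rewrite(char *s, const char *lhs, const char *rhs) {
    assert(strlen(lhs) == strlen(rhs));
    char *at = strstr(s, lhs);
    if (at == NULL)
        return 0;
    memcpy(at, rhs, strlen(rhs));
    return 1;
}

/* Derive 0^n 1^n 2^n for n >= 2 into out (room for 3n+1 chars),
   printing each sentential form. */
void derive(int n, char *out) {
    assert(n >= 2 && out != NULL);
    /* S => 0 A 1 2, then A => 0 A 1 C (n-2 times), then A => 0 1 C
       gives 0^n, then n-1 copies of "1C", then "12". */
    char *q = out;
    for (int i = 0; i < n; i++) *q++ = '0';
    for (int i = 0; i < n - 1; i++) { *q++ = '1'; *q++ = 'C'; }
    *q++ = '1'; *q++ = '2'; *q = '\0';
    puts(out);
    while (rewrite(out, "C1", "1C")) puts(out);   /* C 1 => 1 C */
    while (rewrite(out, "C2", "22")) puts(out);   /* C 2 => 2 2 */
}
```

For n = 2 the printed forms are 001C12, 0011C2 and 001122, matching the last three lines of the derivation above. Each C rule only fires when its right-hand neighbour is the required terminal, which is the "context" a context-free rule cannot check.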

10.4. Unrestricted Grammars

• Every production is of the form:
    α => β
  – α, β can contain any number of terminals and nonterminals; α must contain at least 1 nonterminal
  – no restrictions on size(β)
      • it may be smaller than size(α)
  – β can be ε

• Also called phrase-structure grammars.

• More general than context-sensitive grammars.

A => 11A => aB2 => aeA

Example

• The language:
    E = { ε, 012, 001122, 000111222, ... }
  or, in brief,
    E = { 0^n 1^n 2^n | n >= 0 }
  includes ε, so with the rule forms given above it can only be expressed using an unrestricted grammar:

    S => 0 A 1 2 | ε
    A => 0 A 1 C | ε
    C 1 => 1 C
    C 2 => 2 2

  – the ε alternatives are the new features

Rewrite S to 012

• S => 0 A 1 2
• 0 A 1 2 => 0 1 2
  – using A => ε

10.5. Why so many Grammar Kinds?

• More powerful grammars are more expressive, but also harder to implement efficiently
  – a trade-off between power and implementation

• For example, most compilers have two grammar-based components:
  – the lexical analyzer
      • uses REs (regular grammars) to parse basic nonterminals such as identifier and number
  – the syntax analyzer
      • uses (context-free) grammars to deal with complex syntactic categories such as loops and expressions

Lexical and Syntax Analyzers

[Diagram of the compiler: program text file -> chars ('i' 'n' 't' ' ' 'x' '=' '4' '3' ';' ...) -> lexical analyzer -> tokens (int, x, =, 43, ;, ...) -> syntax analyzer -> parse tree for "int x = 43 ; ..." -> code generation.]

11. From REs to Grammars

• It is easy to translate a RE into a context-free grammar.
  – each RE operand and operator can be implemented by a grammar rule

• In fact, the power of context-free grammars is not needed, since REs are equivalent to regular grammars
  – we translate to context-free because it is simple to do

Operand to Production

• Assume that R is the regular expression, and G is the new production.

•   Operand      translates to    Production
    R = x                         G => x
    R = ε                         G => ε
    R = {}                        nothing

Operator to Production

• Assume that S and T are REs; Gs and Gt are their translation to productions.

•   Operator     translates to    Production
    R = S | T                     G => Gs | Gt
    R = S T                       G => Gs Gt
    R = S*                        G => Gs G | ε   or   G => { Gs }

Example: translate a | bc*

• The RE with brackets:
    a | ( b ( c* ) )

• Translate the operands:
    A => a
    B => b
    C => c
  – the nonterminals A, B, C are invented

• Translate the operators in precedence order.

• Translate c*:
    CStar => C CStar | ε

• Translate b c*:
    BCStar => B CStar

• Translate a | b c*:
    S => A | BCStar

The CStar, BCStar, and S nonterminals are invented.

• The complete grammar:

    T = { a, b, c }

    N = { S, BCStar, CStar, A, B, C }

    P = { S => A | BCStar
          BCStar => B CStar
          CStar => C CStar | ε
          A => a
          B => b
          C => c }

    S is the starting nonterminal

These rules can be simplified.

Rules Simplification

• Substitute in the right hand sides for the A, B, and C rules:

    P = { S => a | BCStar
          BCStar => b CStar
          CStar => c CStar | ε }

• Substitute in the right hand side for BCStar:

    P = { S => a | b CStar
          CStar => c CStar | ε }
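The simplified grammar translates directly into code using the transformations of section 7.4. The following is a sketch, assuming the input is held in a string; p and matches() are invented names, and CStar is written as a loop, i.e. in its { c } form.

```c
#include <assert.h>
#include <stddef.h>

static const char *p;   /* next unread character */

/* CStar => c CStar | ε   (equivalently CStar => { c }) */
static void CStar(void) {
    while (*p == 'c')
        p++;
}

/* S => a | b CStar */
static int S(void) {
    if (*p == 'a') { p++; return 1; }
    if (*p == 'b') { p++; CStar(); return 1; }
    return 0;
}

/* Returns 1 if all of s matches the RE a | bc*, else 0. */
int matches(const char *s) {
    assert(s != NULL);
    p = s;
    return S() && *p == '\0';
}
```

The two alternatives of S start with different terminals (a and b), so the choice needs only the current character.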

12. Context-free Grammars vs. REs

• REs (and automata) are equivalent to regular grammars
  – they can be used for all the same problems

• Every production in a regular grammar is right linear:
    A => a | a B | ε
  – A, B are nonterminals, a is a terminal

• This means that a regular grammar (and also REs, automata) cannot be used to express most context-free grammars, or any context-sensitive or unrestricted grammars.

• REs are less powerful than context-free grammars.

Example

• Context-free grammar:
    S => 0 1 | 0 S 1
  – it defines the language E = { 0^n 1^n | n >= 1 }

• The S production is not right linear (S is not at the end), and section 12.1 shows that no RE can be used to model the language E.
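Although no RE exists for E, the context-free grammar is easy to turn into a parser. Both S alternatives start with 0, so this sketch first left factors the rule (section 8.2) into S => 0 STail and STail => 1 | S 1; STail, p and inE() are names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

static const char *p;   /* next unread character */

static int S(void);

/* STail => 1 | S 1 */
static int STail(void) {
    if (*p == '1') {              /* STail => 1 */
        p++;
        return 1;
    }
    if (*p == '0') {              /* STail => S 1 */
        if (!S()) return 0;
        if (*p != '1') return 0;
        p++;
        return 1;
    }
    return 0;
}

/* S => 0 STail */
static int S(void) {
    if (*p != '0') return 0;
    p++;
    return STail();
}

/* Returns 1 if s is in E = { 0^n 1^n | n >= 1 }, else 0. */
int inE(const char *s) {
    assert(s != NULL);
    p = s;
    return S() && *p == '\0';
}
```

Each 0 that is read leaves a pending call waiting for its matching 1, so the depth of recursion does the counting that a fixed set of automaton states cannot do.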

12.1. Proof Using Automata

• Proving that REs are weaker than context-free grammars is easiest if we prove that automata are weaker than context-free grammars
  – remember that REs are equivalent to automata

• Assume an automaton with 2*k states.

• How could it be used to represent E = { 0^n 1^n | n >= 1 }?

• We will consider the case when n >> k.

First Automaton

[Automaton: from the start, states 1, 2, 3, ..., k are linked by transitions labelled 0, and states k+1, ..., 2k-2, 2k-1, 2k are linked by transitions labelled 1. The run of 0's and the run of 1's must be of equal length.]

• It uses all of its allowed 2*k states.

• Problem: not enough states since n >> k

Second Try

• Add loops to reuse states.

[Automaton: the same states 1, ..., k and k+1, ..., 2k as before, with a loop labelled 0 added among the 0 states and a loop labelled 1 added among the 1 states. The run of 0's and the run of 1's must still be of equal length.]

• Question: how many 0's were matched between state 1 and k?
  – Answer: it could be any number

• Question: how can the number of matched 0's be used to fix the number of matched 1's?
  – Answer: it cannot when n can be any number

• So, no automaton can model the language E = { 0^n 1^n | n >= 1 }
  – so there is no RE for E
  – but E can be written as a context-free grammar

• This shows that REs are weaker than context-free grammars.