Challenge the future Delft University of Technology Course IN4303 Compiler Construction Guido Wachsmuth Compiler Components & Generators Traditional Parsing Algorithms

Compiler Components and their Generators - Traditional Parsing Algorithms


Presentation slides for lecture 12 of course IN4303 on Compiler Construction at TU Delft.


Page 1: Compiler Components and their Generators - Traditional Parsing Algorithms

Challenge the future

Delft University of Technology

Course IN4303 Compiler Construction

Guido Wachsmuth

Compiler Components & Generators: Traditional Parsing Algorithms

Page 2: Compiler Components and their Generators - Traditional Parsing Algorithms

lessons learned

LL Parsing

Recap: Lexical Analysis

What are the formalisms to describe regular languages?

• regular grammars

• regular expressions

• finite state automata

Why are these formalisms equivalent?

• constructive proofs

How can we generate compiler tools from that?

• implement DFAs

• generate transition tables

2

Page 5: Compiler Components and their Generators - Traditional Parsing Algorithms

today’s lecture

LL Parsing

Overview

efficient parsing algorithms

• predictive parsing

• LR parsing

grammar classes

• LL(k) grammars

• LR(k) grammars

3

Page 6: Compiler Components and their Generators - Traditional Parsing Algorithms

LL Parsing

predictive parsing

I

4

Page 10: Compiler Components and their Generators - Traditional Parsing Algorithms

Recap: A Theory of Language

formal languages

vocabulary Σ
• finite, nonempty set of elements (words, letters)
• also called alphabet

string over Σ
• finite sequence of elements chosen from Σ
• also called word, sentence, utterance

formal language λ
• set of strings over a vocabulary Σ
• λ ⊆ Σ*

Page 16: Compiler Components and their Generators - Traditional Parsing Algorithms

Recap: A Theory of Language

formal grammars

formal grammar G = (N, Σ, P, S)
• nonterminal symbols N
• terminal symbols Σ
• production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
• start symbol S∈N

In a production rule, the left-hand side (N∪Σ)* N (N∪Σ)* is a nonterminal symbol with context on both sides; the right-hand side (N∪Σ)* is the replacement.

grammar classes
• type-0, unrestricted
• type-1, context-sensitive: (a A c, a b c)
• type-2, context-free: P ⊆ N × (N∪Σ)*
• type-3, regular: (A, x) or (A, xB)
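
As a rough illustration of the grammar classes, the following Java sketch classifies a single production (p, q) by its shape. The class and method names are hypothetical and not part of the lecture material. Symbols are plain strings, every symbol outside N is treated as a terminal, and the regular case only checks the two right-linear forms (A, x) and (A, xB) named on the slide.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the lecture.
class GrammarClasses {
    // A production (p, q): left-hand side p and right-hand side q as symbol lists.
    static final class Production {
        final List<String> lhs;
        final List<String> rhs;
        Production(List<String> lhs, List<String> rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }
    }

    // type-2: the left-hand side is exactly one nonterminal, without context.
    static boolean isContextFree(Production p, Set<String> nonterminals) {
        return p.lhs.size() == 1 && nonterminals.contains(p.lhs.get(0));
    }

    // type-3 (right-linear forms only): (A, x) or (A, xB), x terminal, B nonterminal.
    static boolean isRegular(Production p, Set<String> nonterminals) {
        if (!isContextFree(p, nonterminals)) return false;
        List<String> q = p.rhs;
        if (q.isEmpty() || nonterminals.contains(q.get(0))) return false;
        return q.size() == 1 || (q.size() == 2 && nonterminals.contains(q.get(1)));
    }
}
```

With N = {A, B}, the production (A, xB) passes both checks, while (aAc, abc) fails `isContextFree` because its left-hand side carries context.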

Page 21: Compiler Components and their Generators - Traditional Parsing Algorithms

formal languages

LL Parsing

formal grammar G = (N, Σ, P, S)

derivation relation ⇒G ⊆ (N∪Σ)* × (N∪Σ)*

w ⇒G w’ ⇔ ∃(p, q)∈P: ∃u,v∈(N∪Σ)*: w = u p v ∧ w’ = u q v

formal language L(G) ⊆ Σ*

L(G) = {w∈Σ* | S ⇒G* w}

classes of formal languages

7

Recap: A Theory of Language
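
The derivation relation can be read operationally: one step picks an occurrence of p inside w and replaces it by q. The sketch below, with hypothetical names and symbols modelled as strings, enumerates all w’ reachable from w in one step with a single production (p, q).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the lecture.
class Derivation {
    // All w' with w => w' using production (p, q), i.e. w = u p v and w' = u q v.
    static List<List<String>> step(List<String> w, List<String> p, List<String> q) {
        List<List<String>> results = new ArrayList<>();
        for (int i = 0; i + p.size() <= w.size(); i++) {
            if (w.subList(i, i + p.size()).equals(p)) {
                List<String> next = new ArrayList<>(w.subList(0, i)); // u
                next.addAll(q);                                        // q replaces p
                next.addAll(w.subList(i + p.size(), w.size()));        // v
                results.add(next);
            }
        }
        return results;
    }
}
```

For w = Exp + Exp and the production (Exp, Num), `step` yields Num + Exp and Exp + Num, one result per occurrence of the left-hand side.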

Page 22: Compiler Components and their Generators - Traditional Parsing Algorithms

recursive descent

LL Parsing 8

Exp → “while” Exp “do” Exp

Predictive parsing

public void parseExp() {
    consume(WHILE);   // “while”
    parseExp();       // condition
    consume(DO);      // “do”
    parseExp();       // body
}

Page 23: Compiler Components and their Generators - Traditional Parsing Algorithms

look ahead

LL Parsing 9

Exp → “while” Exp “do” Exp
Exp → “if” Exp “then” Exp “else” Exp

Predictive parsing

public void parseExp() {
    switch (current()) {   // choose the production by looking at the current token
        case WHILE: consume(WHILE); parseExp(); ...; break;
        case IF   : consume(IF); parseExp(); ...; break;
        default   : error();
    }
}
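
To make the look-ahead idea runnable, here is a minimal, self-contained sketch of a predictive recursive-descent recognizer for the two productions above. Several parts are assumptions added for illustration: tokens are plain strings, `"$"` marks the end of input, and a base case Exp → “num” is added so that the recursion can terminate. The slides do not show how `current`, `consume` and `error` are implemented, so the versions here are only one possible reading.

```java
import java.util.List;

// Sketch of a predictive recursive-descent recognizer; all names are illustrative.
class ExpParser {
    private final List<String> tokens;
    private int pos = 0;

    ExpParser(List<String> tokens) { this.tokens = tokens; }

    // Current look-ahead token, or "$" at the end of the input.
    private String current() { return pos < tokens.size() ? tokens.get(pos) : "$"; }

    // Checks the current token against the expected one and advances.
    private void consume(String expected) {
        if (!current().equals(expected)) error();
        pos++;
    }

    private void error() {
        throw new IllegalStateException("unexpected token " + current() + " at position " + pos);
    }

    // Exp -> "while" Exp "do" Exp | "if" Exp "then" Exp "else" Exp | "num" (assumed base case)
    void parseExp() {
        switch (current()) {
            case "while": consume("while"); parseExp(); consume("do"); parseExp(); break;
            case "if":    consume("if"); parseExp(); consume("then"); parseExp();
                          consume("else"); parseExp(); break;
            case "num":   consume("num"); break;
            default:      error();
        }
    }

    // Accepts iff the whole token list is exactly one Exp.
    static boolean accepts(List<String> tokens) {
        ExpParser parser = new ExpParser(tokens);
        try {
            parser.parseExp();
            return parser.current().equals("$");
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

Each production starts with a different terminal, so one token of look ahead is enough to pick the right case.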

Page 24: Compiler Components and their Generators - Traditional Parsing Algorithms

parse table

LL Parsing 10

Predictive parsing

rows
• nonterminal symbols N
• symbol to parse

columns
• terminal symbols Σ^k
• look ahead k

entries
• production rules P
• possible conflicts

[Figure: schematic parse table with terminal symbols T1, T2, T3, ... as columns, nonterminal symbols N1 to N8 as rows, and production rules Ni → ... as entries.]

Page 25: Compiler Components and their Generators - Traditional Parsing Algorithms

automaton

Predictive parsing

[Figure: LL parser automaton. Input: t1 … tn $. Stack: S $, with the start symbol S on top of the end marker. Control: parse table.]

[Figure: match step. Terminal x is on top of the stack and x is the next input symbol: x is removed from both the stack and the input.]

[Figure: predict step. Nonterminal Xi is on top of the stack, x is the next input symbol, and the parse table holds Xi → Y1 ... Yk at row Xi and column x: Xi is replaced on the stack by Y1 ... Yk, with Y1 on top.]

Page 30: Compiler Components and their Generators - Traditional Parsing Algorithms

LL Parsing

LL parse tables

II

14

Page 31: Compiler Components and their Generators - Traditional Parsing Algorithms

filling the table

Predictive parsing

entry (X, w)∈P at row X and column T
• T∈ FIRST(w)
• nullable(w) ∧ T∈ FOLLOW(X)

where
• FIRST(w): letters that w can start with
• nullable(w): w ⇒G* ε
• FOLLOW(X): letters that can follow X

Page 35: Compiler Components and their Generators - Traditional Parsing Algorithms

nullable

LL Parsing

nullable(X)

(X, ε) ∈ P ⇒ nullable(X)

(X0, X1 … Xk)∈P ∧ nullable(X1) ∧ … ∧ nullable(Xk) ⇒ nullable(X0)

nullable(w)

nullable(ε)

nullable(X1 … Xk) = nullable(X1) ∧ … ∧ nullable(Xk)
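
The two rules for nullable(X) suggest a fixpoint iteration: keep adding nonterminals until nothing changes. The following sketch is one way to write that down. The grammar encoding (a map from each nonterminal to its right-hand sides) and all names are assumptions, and `exampleGrammar` encodes the productions p1 to p8 from the example later in the deck, with an ASCII `'` in place of ’.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the lecture.
class Nullable {
    // Adds X0 whenever some production (X0, X1 ... Xk) has only nullable symbols.
    // An empty right-hand side (k = 0) covers the rule (X, ε) ∈ P.
    static Set<String> nullable(Map<String, List<List<String>>> productions) {
        Set<String> nullable = new HashSet<>();
        boolean changed = true;
        while (changed) {                       // repeat until no nonterminal is added
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : productions.entrySet()) {
                for (List<String> rhs : e.getValue()) {
                    if (nullable.containsAll(rhs) && nullable.add(e.getKey())) {
                        changed = true;
                    }
                }
            }
        }
        return nullable;
    }

    // Productions p1..p8 of the lecture's example grammar, used as sample data.
    static Map<String, List<List<String>>> exampleGrammar() {
        return Map.of(
            "Exp",   List.of(List.of("Term", "Exp'")),
            "Exp'",  List.of(List.of("+", "Term", "Exp'"), List.of()),
            "Term",  List.of(List.of("Fact", "Term'")),
            "Term'", List.of(List.of("*", "Fact", "Term'"), List.of()),
            "Fact",  List.of(List.of("Num"), List.of("(", "Exp", ")")));
    }
}
```

Terminals never enter the set, so any right-hand side that contains a terminal can never satisfy `containsAll`.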

16

Predictive parsing

Page 36: Compiler Components and their Generators - Traditional Parsing Algorithms

first sets

LL Parsing

FIRST(X)

X∈Σ : FIRST(X) = {X}

(X0, X1 … Xi … Xk)∈P ∧ nullable(X1 … Xi) ⇒ FIRST(X0) ⊇ FIRST(Xi+1)

FIRST(w)

FIRST(ε) = {}

¬nullable(X) ⇒ FIRST(Xw) = FIRST(X)

nullable(X) ⇒ FIRST(Xw) = FIRST(X) ∪ FIRST(w)

17

Predictive parsing

Page 37: Compiler Components and their Generators - Traditional Parsing Algorithms

follow sets

LL Parsing

FOLLOW(X)

(X0, X1 … Xi … Xk)∈P ∧ nullable(Xi+1 … Xk) ⇒ FOLLOW(Xi) ⊇ FOLLOW(X0)

(X0, X1 … Xi … Xk)∈P ⇒ FOLLOW(Xi) ⊇ FIRST(Xi+1 … Xk)
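
The rules for nullable, FIRST and FOLLOW can be evaluated together by iterating until no set grows any more. The sketch below is one possible formulation under stated assumptions: the grammar is a map from nonterminals to right-hand sides, symbols without an entry count as terminals, and no end-of-input marker is added to any FOLLOW set. The class and field names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the lecture.
class FirstFollow {
    final Set<String> nullable = new HashSet<>();
    final Map<String, Set<String>> first = new HashMap<>();
    final Map<String, Set<String>> follow = new HashMap<>();

    FirstFollow(Map<String, List<List<String>>> productions) {
        for (String x : productions.keySet()) {
            first.put(x, new HashSet<>());
            follow.put(x, new HashSet<>());
        }
        boolean changed = true;
        while (changed) {                                    // fixpoint iteration
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : productions.entrySet()) {
                String x0 = e.getKey();
                for (List<String> rhs : e.getValue()) {
                    if (nullable.containsAll(rhs)) changed |= nullable.add(x0);
                    changed |= first.get(x0).addAll(firstOf(rhs));
                    for (int i = 0; i < rhs.size(); i++) {
                        Set<String> followXi = follow.get(rhs.get(i));
                        if (followXi == null) continue;      // terminals have no FOLLOW set
                        List<String> rest = rhs.subList(i + 1, rhs.size());
                        // FOLLOW(Xi) ⊇ FIRST(Xi+1 ... Xk)
                        changed |= followXi.addAll(firstOf(rest));
                        // nullable(Xi+1 ... Xk) ⇒ FOLLOW(Xi) ⊇ FOLLOW(X0)
                        if (nullable.containsAll(rest)) {
                            changed |= followXi.addAll(new HashSet<>(follow.get(x0)));
                        }
                    }
                }
            }
        }
    }

    // FIRST(w) for a symbol sequence w, following the rules for FIRST(ε) and FIRST(Xw).
    Set<String> firstOf(List<String> w) {
        Set<String> result = new HashSet<>();
        for (String x : w) {
            Set<String> f = first.get(x);
            if (f == null) {                                 // terminal: FIRST(x) = {x}
                result.add(x);
                return result;
            }
            result.addAll(f);
            if (!nullable.contains(x)) return result;
        }
        return result;
    }
}
```

The iteration order does not affect the result, because every step only adds elements and the loop stops once a full pass adds nothing.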

18

Predictive parsing

Page 38: Compiler Components and their Generators - Traditional Parsing Algorithms

Example

p1: Exp → Term Exp’
p2: Exp’ → “+” Term Exp’
p3: Exp’ →
p4: Term → Fact Term’
p5: Term’ → “*” Fact Term’
p6: Term’ →
p7: Fact → Num
p8: Fact → “(” Exp “)”

nullable
• (X, ε) ∈ P ⇒ nullable(X)
• (X0, X1 … Xk)∈P ∧ nullable(X1) ∧ … ∧ nullable(Xk) ⇒ nullable(X0)

FIRST sets
• (X0, X1 … Xi … Xk)∈P ∧ nullable(X1 … Xi) ⇒ FIRST(X0) ⊇ FIRST(Xi+1)

FOLLOW sets
• (X0, X1 … Xi … Xk)∈P ∧ nullable(Xi+1 … Xk) ⇒ FOLLOW(Xi) ⊇ FOLLOW(X0)
• (X0, X1 … Xi … Xk)∈P ⇒ FOLLOW(Xi) ⊇ FIRST(Xi+1 … Xk)

         nullable   FIRST    FOLLOW
Exp      no         Num (    )
Exp’     yes        +        )
Term     no         Num (    + )
Term’    yes        *        + )
Fact     no         Num (    * + )

Page 42: Compiler Components and their Generators - Traditional Parsing Algorithms

LL parse table

LL Parsing 23

p1: Exp → Term Exp’
p2: Exp’ → “+” Term Exp’
p3: Exp’ →
p4: Term → Fact Term’
p5: Term’ → “*” Fact Term’
p6: Term’ →
p7: Fact → Num
p8: Fact → “(” Exp “)”

Example

entry (X, w)∈P at row X and column T
• T∈ FIRST(w)
• nullable(w) ∧ T∈ FOLLOW(X)

         +     *     Num   (     )
Exp                  p1    p1
Exp’     p2                      p3
Term                 p4    p4
Term’    p6    p5                p6
Fact                 p7    p8
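
The table-filling rule can be applied mechanically. The sketch below hard-codes the productions p1 to p8 and the nullable, FIRST and FOLLOW values from the example, then enters each production at every column the rule names. The class name, the array encoding and the ASCII `'` in place of ’ are assumptions for illustration. A cell is a list so that conflicts, if there were any, would show up as lists with more than one entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration; not taken from the lecture.
class LlTable {
    // Productions p1..p8: left-hand side followed by the right-hand side symbols.
    static final String[][] P = {
        {"Exp", "Term", "Exp'"}, {"Exp'", "+", "Term", "Exp'"}, {"Exp'"},
        {"Term", "Fact", "Term'"}, {"Term'", "*", "Fact", "Term'"}, {"Term'"},
        {"Fact", "Num"}, {"Fact", "(", "Exp", ")"}};

    // nullable, FIRST and FOLLOW as given in the example.
    static final Set<String> NULLABLE = Set.of("Exp'", "Term'");
    static final Map<String, Set<String>> FIRST = Map.of(
        "Exp", Set.of("Num", "("), "Exp'", Set.of("+"), "Term", Set.of("Num", "("),
        "Term'", Set.of("*"), "Fact", Set.of("Num", "("));
    static final Map<String, Set<String>> FOLLOW = Map.of(
        "Exp", Set.of(")"), "Exp'", Set.of(")"), "Term", Set.of("+", ")"),
        "Term'", Set.of("+", ")"), "Fact", Set.of("*", "+", ")"));

    // table.get(X).get(T) lists the numbers of all productions entered at row X, column T.
    static Map<String, Map<String, List<Integer>>> build() {
        Map<String, Map<String, List<Integer>>> table = new TreeMap<>();
        for (int n = 0; n < P.length; n++) {
            String x = P[n][0];
            boolean nullableW = true;
            Set<String> columns = new TreeSet<>();
            for (int i = 1; i < P[n].length && nullableW; i++) {   // collect FIRST(w)
                String s = P[n][i];
                columns.addAll(FIRST.getOrDefault(s, Set.of(s)));  // terminal s: {s}
                nullableW = NULLABLE.contains(s);
            }
            if (nullableW) columns.addAll(FOLLOW.get(x));          // nullable(w): FOLLOW(X)
            for (String t : columns) {
                table.computeIfAbsent(x, k -> new TreeMap<>())
                     .computeIfAbsent(t, k -> new ArrayList<>())
                     .add(n + 1);                                  // productions count from p1
            }
        }
        return table;
    }
}
```

For this grammar every filled cell holds exactly one production, which is what makes the table usable with one symbol of look ahead.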

Page 43: Compiler Components and their Generators - Traditional Parsing Algorithms

parsing

LL Parsing 24

Example

         +     *     Num   (     )
Exp                  p1    p1
Exp’     p2                      p3
Term                 p4    p4
Term’    p6    p5                p6
Fact                 p7    p8

p1: Exp → Term Exp’
p2: Exp’ → “+” Term Exp’
p3: Exp’ →
p4: Term → Fact Term’
p5: Term’ → “*” Fact Term’
p6: Term’ →
p7: Fact → Num
p8: Fact → “(” Exp “)”
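
A table-driven parser for this example can be sketched as a loop over a stack, in the spirit of the automaton shown earlier: match terminals against the input, and replace nonterminals by the right-hand side the table selects. The table entries below are copied from the example, with one labeled assumption: the slides' table has no end-of-input column, so the sketch adds `"$"` entries for Exp’ and Term’ (productions p3 and p6) so that both can be erased at the end of the input. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not taken from the lecture.
class Ll1Parser {
    // Right-hand sides of p1..p8 (index 0 unused so that RHS[n] belongs to pn).
    static final String[][] RHS = {
        null, {"Term", "Exp'"}, {"+", "Term", "Exp'"}, {}, {"Fact", "Term'"},
        {"*", "Fact", "Term'"}, {}, {"Num"}, {"(", "Exp", ")"}};

    // Parse table from the example; the "$" entries are an added assumption.
    static final Map<String, Map<String, Integer>> TABLE = Map.of(
        "Exp",   Map.of("Num", 1, "(", 1),
        "Exp'",  Map.of("+", 2, ")", 3, "$", 3),
        "Term",  Map.of("Num", 4, "(", 4),
        "Term'", Map.of("+", 6, "*", 5, ")", 6, "$", 6),
        "Fact",  Map.of("Num", 7, "(", 8));

    // Returns the sequence of applied productions, or null if the input is rejected.
    static List<Integer> parse(List<String> input) {
        List<String> tokens = new ArrayList<>(input);
        tokens.add("$");                              // end-of-input marker
        Deque<String> stack = new ArrayDeque<>();
        stack.push("$");
        stack.push("Exp");                            // start symbol on top
        List<Integer> applied = new ArrayList<>();
        int pos = 0;
        while (!stack.isEmpty()) {
            String top = stack.pop();
            String look = tokens.get(pos);
            if (TABLE.containsKey(top)) {             // nonterminal: predict
                Integer p = TABLE.get(top).get(look);
                if (p == null) return null;           // empty table cell: syntax error
                applied.add(p);
                String[] rhs = RHS[p];
                for (int i = rhs.length - 1; i >= 0; i--) {
                    stack.push(rhs[i]);               // push in reverse so Y1 ends on top
                }
            } else if (top.equals(look)) {            // terminal or $: match
                pos++;
            } else {
                return null;
            }
        }
        return pos == tokens.size() ? applied : null;
    }
}
```

For the input Num + Num * Num, the productions are applied in the order p1, p4, p7, p6, p2, p4, p7, p5, p7, p6, p3, which is a leftmost derivation of the input.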

Page 47: Compiler Components and their Generators - Traditional Parsing Algorithms

LL Parsing

Grammar classes

25

[Figure: nested grammar classes, LL(0) ⊂ LL(1) ⊂ LL(k) ⊂ context-free grammars.]

Page 48: Compiler Components and their Generators - Traditional Parsing Algorithms

encoding precedence

LL Parsing 26

Exp → Num
Exp → “(” Exp “)”
Exp → Exp “*” Exp
Exp → Exp “+” Exp

Predictive parsing

Fact → Num
Fact → “(” Exp “)”
Term → Term “*” Fact
Term → Fact
Exp → Exp “+” Term
Exp → Term

Page 49: Compiler Components and their Generators - Traditional Parsing Algorithms

eliminating left recursion

LL Parsing 27

Term → Term “*” Fact
Term → Fact
Exp → Exp “+” Term
Exp → Term

Predictive parsing

Term’ → “*” Fact Term’
Term’ →
Term → Fact Term’
Exp’ → “+” Term Exp’
Exp’ →
Exp → Term Exp’
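
The rewriting shown here follows a fixed pattern: rules A → A α | β become A → β A’ and A’ → α A’ | ε. The sketch below applies that pattern to the rules of one nonterminal. It only handles immediate left recursion, assumes that the fresh name A’ (written with an ASCII `'`) is not already in use, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not taken from the lecture.
class LeftRecursion {
    // Rewrites A -> A α | β into A -> β A' and A' -> α A' | ε.
    static Map<String, List<List<String>>> eliminate(String a, List<List<String>> rules) {
        String fresh = a + "'";                          // assumed not to clash with other names
        List<List<String>> base = new ArrayList<>();     // new rules for A
        List<List<String>> tail = new ArrayList<>();     // rules for A'
        for (List<String> rhs : rules) {
            boolean leftRecursive = !rhs.isEmpty() && rhs.get(0).equals(a);
            List<String> rewritten =
                new ArrayList<>(leftRecursive ? rhs.subList(1, rhs.size()) : rhs);
            rewritten.add(fresh);                        // both kinds of rule end in A'
            (leftRecursive ? tail : base).add(rewritten);
        }
        Map<String, List<List<String>>> result = new LinkedHashMap<>();
        if (tail.isEmpty()) {                            // no left recursion: keep the rules
            result.put(a, rules);
            return result;
        }
        tail.add(new ArrayList<>());                     // A' -> ε
        result.put(a, base);
        result.put(fresh, tail);
        return result;
    }
}
```

Applied to Term → Term “*” Fact and Term → Fact, the result is Term → Fact Term’ together with Term’ → “*” Fact Term’ and Term’ → ε, matching the rewritten grammar above.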

Page 50: Compiler Components and their Generators - Traditional Parsing Algorithms

left factoring

LL Parsing 28

Exp → “if” Exp “then” Exp “else” Exp
Exp → “if” Exp “then” Exp

Predictive parsing

Exp → “if” Exp “then” Exp Else
Else → “else” Exp
Else →

Page 51: Compiler Components and their Generators - Traditional Parsing Algorithms

LL Parsing

summary

III

29

Page 56: Compiler Components and their Generators - Traditional Parsing Algorithms

lessons learned

LL Parsing

Summary

How can we parse context-free languages effectively?

• predictive parsing algorithms

Which grammar classes are supported by these algorithms?

• LL(k) grammars, LL(k) languages

How can we generate compiler tools from that?

• implement automaton

• generate parse tables

30

Page 59: Compiler Components and their Generators - Traditional Parsing Algorithms

learn more

LL Parsing

Literature

formal languages

Noam Chomsky: Three models for the description of language. 1956

J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata Theory, Languages, and Computation. 2006

syntactical analysis

Andrew W. Appel, Jens Palsberg: Modern Compiler Implementation in Java, 2nd edition. 2002

Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman: Compilers: Principles, Techniques, and Tools, 2nd edition. 2006

31

Page 60: Compiler Components and their Generators - Traditional Parsing Algorithms

coming next

LL Parsing

Outlook

lectures

• last lecture: LR parsing

Question & Answer Jan 10

• 10 questions, submit & vote

Lab Dec 15

• translate expressions & statements

• challenge: stack limits

32
