149
Introduction Regular Expressions Computer Languages Introduction & Regular Expressions 1 Introduction Administrivia Course contents 2 Regular Expressions Definitions Examples Scanner generators January 19th

Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

  • Upload
    others

  • View
    24

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer LanguagesIntroduction & Regular Expressions

1 IntroductionAdministriviaCourse contents

2 Regular ExpressionsDefinitionsExamplesScanner generators

January 19th

Page 2: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

This is an elective course for the masterprogrammes Embedded Intelligent Systemsand Computer Systems Engineering.

Computer languages are

The tools you use to build pieces ofsoftware!

you need to understand themproperly,you need to learn new ones.

The solution to some problems

you need to know how to implementthem.

Page 3: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

This is an elective course for the masterprogrammes Embedded Intelligent Systemsand Computer Systems Engineering.

Computer languages are

The tools you use to build pieces ofsoftware!

you need to understand themproperly,you need to learn new ones.

The solution to some problems

you need to know how to implementthem.

Page 4: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

This is an elective course for the masterprogrammes Embedded Intelligent Systemsand Computer Systems Engineering.

Computer languages are

The tools you use to build pieces ofsoftware!

you need to understand themproperly,you need to learn new ones.

The solution to some problems

you need to know how to implementthem.

Page 5: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

This is an elective course for the masterprogrammes Embedded Intelligent Systemsand Computer Systems Engineering.

Computer languages are

The tools you use to build pieces ofsoftware!

you need to understand themproperly,you need to learn new ones.

The solution to some problems

you need to know how to implementthem.

Page 6: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

This is an elective course for the masterprogrammes Embedded Intelligent Systemsand Computer Systems Engineering.

Computer languages are

The tools you use to build pieces ofsoftware!

you need to understand themproperly,you need to learn new ones.

The solution to some problems

you need to know how to implementthem.

Page 7: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

This is an elective course for the masterprogrammes Embedded Intelligent Systemsand Computer Systems Engineering.

Computer languages are

The tools you use to build pieces ofsoftware!

you need to understand themproperly,you need to learn new ones.

The solution to some problems

you need to know how to implementthem.

Page 8: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

This is an elective course for the masterprogrammes Embedded Intelligent Systemsand Computer Systems Engineering.

Computer languages are

The tools you use to build pieces ofsoftware!

you need to understand themproperly,you need to learn new ones.

The solution to some problems

you need to know how to implementthem.

Page 9: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Resources

Teachers

Jerker Bengtssonwww2.hh.se/staff/jebe

Veronica Gaspeswww2.hh.se/staff/vero

On Line

Web pagewww2.hh.se/staff/jebe/languages.

Lecture notes, notice board, projectinstructions.

Manuals for the tools we use.

A good book, organizedaround building a compiler.

A lot of material, helpseven if you want to followan advanced course. Welook at the first part, uptochapter 12.

Page 10: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Resources

Teachers

Jerker Bengtssonwww2.hh.se/staff/jebe

Veronica Gaspeswww2.hh.se/staff/vero

On Line

Web pagewww2.hh.se/staff/jebe/languages.

Lecture notes, notice board, projectinstructions.

Manuals for the tools we use.

A good book, organizedaround building a compiler.

A lot of material, helpseven if you want to followan advanced course. Welook at the first part, uptochapter 12.

Page 11: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Resources

Teachers

Jerker Bengtssonwww2.hh.se/staff/jebe

Veronica Gaspeswww2.hh.se/staff/vero

On Line

Web pagewww2.hh.se/staff/jebe/languages.

Lecture notes, notice board, projectinstructions.

Manuals for the tools we use.

A good book, organizedaround building a compiler.

A lot of material, helpseven if you want to followan advanced course. Welook at the first part, uptochapter 12.

Page 12: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Resources

Teachers

Jerker Bengtssonwww2.hh.se/staff/jebe

Veronica Gaspeswww2.hh.se/staff/vero

On Line

Web pagewww2.hh.se/staff/jebe/languages.

Lecture notes, notice board, projectinstructions.

Manuals for the tools we use.

A good book, organizedaround building a compiler.

A lot of material, helpseven if you want to followan advanced course. Welook at the first part, uptochapter 12.

Page 13: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Resources

Teachers

Jerker Bengtssonwww2.hh.se/staff/jebe

Veronica Gaspeswww2.hh.se/staff/vero

On Line

Web pagewww2.hh.se/staff/jebe/languages.

Lecture notes, notice board, projectinstructions.

Manuals for the tools we use.

A good book, organizedaround building a compiler.

A lot of material, helpseven if you want to followan advanced course. Welook at the first part, uptochapter 12.

Page 14: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examination

A programming project whereyou implement a compiler for asubset of the Javaprogramming language.

Small computer basedexercises about formallanguages

The instructions will be on the web, including deadlines.

The project is evaluated during the examination week. If you donot pass, it can be evaluated during the following examinationweek.

Page 15: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examination

A programming project whereyou implement a compiler for asubset of the Javaprogramming language.

Small computer basedexercises about formallanguages

The instructions will be on the web, including deadlines.

The project is evaluated during the examination week. If you donot pass, it can be evaluated during the following examinationweek.

Page 16: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examination

A programming project whereyou implement a compiler for asubset of the Javaprogramming language.

Small computer basedexercises about formallanguages

The instructions will be on the web, including deadlines.

The project is evaluated during the examination week. If you donot pass, it can be evaluated during the following examinationweek.

Page 17: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examination

A programming project whereyou implement a compiler for asubset of the Javaprogramming language.

Small computer basedexercises about formallanguages

The instructions will be on the web, including deadlines.

The project is evaluated during the examination week. If you donot pass, it can be evaluated during the following examinationweek.

Page 18: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Organization

Theory lectures

On formal languages about

regular expressions andfinite automata

context free grammarsand pushdown automata

Assignments

Short labs to confirm that youunderstand some theory andthe tools we need.

Compiler techniques lectures

abstract syntax

types and type checking

intermediaterepresentations

code generation andoptimizations

Programming project

Organized as a series oflaborations with strictdeadlines.

Page 19: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Organization

Theory lectures

On formal languages about

regular expressions andfinite automata

context free grammarsand pushdown automata

Assignments

Short labs to confirm that youunderstand some theory andthe tools we need.

Compiler techniques lectures

abstract syntax

types and type checking

intermediaterepresentations

code generation andoptimizations

Programming project

Organized as a series oflaborations with strictdeadlines.

Page 20: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Organization

Theory lectures

On formal languages about

regular expressions andfinite automata

context free grammarsand pushdown automata

Assignments

Short labs to confirm that youunderstand some theory andthe tools we need.

Compiler techniques lectures

abstract syntax

types and type checking

intermediaterepresentations

code generation andoptimizations

Programming project

Organized as a series oflaborations with strictdeadlines.

Page 21: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Organization

Theory lectures

On formal languages about

regular expressions andfinite automata

context free grammarsand pushdown automata

Assignments

Short labs to confirm that youunderstand some theory andthe tools we need.

Compiler techniques lectures

abstract syntax

types and type checking

intermediaterepresentations

code generation andoptimizations

Programming project

Organized as a series oflaborations with strictdeadlines.

Page 22: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 23: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 24: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 25: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 26: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 27: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 28: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 29: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 30: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 31: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 32: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Computer Languages

There is plenty of computer languages forplenty of purposes . . .

xml to describe the structure ofdocuments and documents themselves.

xquery to transform xml documents.

VHDL to describe circuits.

VRML to describe 3D scenes.

C, Java, Haskell for programming.

Page 33: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 34: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 35: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 36: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 37: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 38: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 39: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 40: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Languages

Different kinds of programs requiredifferent kinds of abstractions

Symbolic manipulations functionallanguages like Haskell, ML, lisp,scheme or xquery.

Concurrent languages withcommunication and synchronizationmechanisms like Occam or MPD.

Embedded languages with interface tohardware like C.

. . .

Page 41: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Common features

All these languages have a lot in common!They all have to be processed by a programin order to do what we express in them!

They are formal languages,

in most of them we can makedefinitions,

in most of them types are used toidentify meaningful expressions.

Page 42: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Common features

All these languages have a lot in common!They all have to be processed by a programin order to do what we express in them!

They are formal languages,

in most of them we can makedefinitions,

in most of them types are used toidentify meaningful expressions.

Page 43: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Common features

All these languages have a lot in common!They all have to be processed by a programin order to do what we express in them!

They are formal languages,

in most of them we can makedefinitions,

in most of them types are used toidentify meaningful expressions.

Page 44: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Common features

All these languages have a lot in common!They all have to be processed by a programin order to do what we express in them!

They are formal languages,

in most of them we can makedefinitions,

in most of them types are used toidentify meaningful expressions.

Page 45: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 46: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 47: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 48: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 49: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 50: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 51: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 52: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 53: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 54: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler?

In the course we study acompiler for an imperativeprogramming language.

Language processors

There are two kinds oflanguage processors

Compilers

Interpreters

that have much in common!

They are complex programs!

They use advanced algorithmsand data structures.

They show an application of thetheory of formal languages, welearn how to use tools thatgenerate programs.

We learn programmingtechniques.

We learn to do semanticdistinctions (for instancedifferent parameter passingmechanisms!)

Page 55: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler? Cooper & Torczon. Engineering aCompiler. (Elsevier)

A compiler is a large, complex program.

The design and implementation of a compiler is a substantialexercise in software engineering.

A good compiler contains a microcosmos of computer science.

Working inside a compiler provides practical experience insoftware engineering that is hard to obtain with smaller, lessintrincate systems.

Most software is compiled. Compiler construction has givenrise to tools for automatic programming that can be used formany purposes. In constructing a compiler you will get to usethese tools (and hopefully you will find use for them in otherareas!)

Page 56: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler? Cooper & Torczon. Engineering aCompiler. (Elsevier)

A compiler is a large, complex program.

The design and implementation of a compiler is a substantialexercise in software engineering.

A good compiler contains a microcosmos of computer science.

Working inside a compiler provides practical experience insoftware engineering that is hard to obtain with smaller, lessintrincate systems.

Most software is compiled. Compiler construction has givenrise to tools for automatic programming that can be used formany purposes. In constructing a compiler you will get to usethese tools (and hopefully you will find use for them in otherareas!)

Page 57: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler? Cooper & Torczon. Engineering aCompiler. (Elsevier)

A compiler is a large, complex program.

The design and implementation of a compiler is a substantialexercise in software engineering.

A good compiler contains a microcosmos of computer science.

Working inside a compiler provides practical experience insoftware engineering that is hard to obtain with smaller, lessintrincate systems.

Most software is compiled. Compiler construction has givenrise to tools for automatic programming that can be used formany purposes. In constructing a compiler you will get to usethese tools (and hopefully you will find use for them in otherareas!)

Page 58: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler? Cooper & Torczon. Engineering aCompiler. (Elsevier)

A compiler is a large, complex program.

The design and implementation of a compiler is a substantialexercise in software engineering.

A good compiler contains a microcosmos of computer science.

Working inside a compiler provides practical experience insoftware engineering that is hard to obtain with smaller, lessintrincate systems.

Most software is compiled. Compiler construction has givenrise to tools for automatic programming that can be used formany purposes. In constructing a compiler you will get to usethese tools (and hopefully you will find use for them in otherareas!)

Page 59: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a compiler? Cooper & Torczon. Engineering aCompiler. (Elsevier)

A compiler is a large, complex program.

The design and implementation of a compiler is a substantialexercise in software engineering.

A good compiler contains a microcosmos of computer science.

Working inside a compiler provides practical experience insoftware engineering that is hard to obtain with smaller, lessintrincate systems.

Most software is compiled. Compiler construction has givenrise to tools for automatic programming that can be used formany purposes. In constructing a compiler you will get to usethese tools (and hopefully you will find use for them in otherareas!)

Page 60: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Why a programming language?

We chose a programming language as running example of acomputer language because you are familiar with it and because itillustrates a lot of concepts.

definitions and scope

variables and arguments

types

Page 61: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Some side effects of the course.

Provides you with software tools to describe and implementcomputer languages

lexer generators

parser generators

Provides you with new programming techniques and datastructuresuseful in processing computer languages

design patterns component, visitor

front end/ back end, abstract machines

abstract syntax

environments

Page 62: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Some side effects of the course.

Provides you with software tools to describe and implementcomputer languages

lexer generators

parser generators

Provides you with new programming techniques and datastructuresuseful in processing computer languages

design patterns component, visitor

front end/ back end, abstract machines

abstract syntax

environments

Page 63: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Some side effects of the course.

Provides you with software tools to describe and implementcomputer languages

lexer generators

parser generators

Provides you with new programming techniques and datastructuresuseful in processing computer languages

design patterns component, visitor

front end/ back end, abstract machines

abstract syntax

environments

Page 64: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Some side effects of the course.

Provides you with software tools to describe and implementcomputer languages

lexer generators

parser generators

Provides you with new programming techniques and datastructuresuseful in processing computer languages

design patterns component, visitor

front end/ back end, abstract machines

abstract syntax

environments

Page 65: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Some side effects of the course.

Provides you with software tools to describe and implementcomputer languages

lexer generators

parser generators

Provides you with new programming techniques and datastructuresuseful in processing computer languages

design patterns component, visitor

front end/ back end, abstract machines

abstract syntax

environments

Page 66: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Some side effects of the course.

Provides you with software tools to describe and implementcomputer languages

lexer generators

parser generators

Provides you with new programming techniques and datastructuresuseful in processing computer languages

design patterns component, visitor

front end/ back end, abstract machines

abstract syntax

environments

Page 67: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Project

You will write a compiler for minijava, a small programminglanguage presented in the course book.

It is graded, it is the main contribution to the grade of thecourse!

It is divided in parts. You will get instructions and deadlinesfor each part.

You will get help to get started but you will have to work onyour own.

There will be time for extra questions/supervision duringconsultation hours, to be announced for each lab.

Page 68: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Project

You will write a compiler for minijava, a small programminglanguage presented in the course book.

It is graded, it is the main contribution to the grade of thecourse!

It is divided in parts. You will get instructions and deadlinesfor each part.

You will get help to get started but you will have to work onyour own.

There will be time for extra questions/supervision duringconsultation hours, to be announced for each lab.

Page 69: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Project

You will write a compiler for minijava, a small programminglanguage presented in the course book.

It is graded, it is the main contribution to the grade of thecourse!

It is divided in parts. You will get instructions and deadlinesfor each part.

You will get help to get started but you will have to work onyour own.

There will be time for extra questions/supervision duringconsultation hours, to be announced for each lab.

Page 70: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Project

You will write a compiler for minijava, a small programminglanguage presented in the course book.

It is graded, it is the main contribution to the grade of thecourse!

It is divided in parts. You will get instructions and deadlinesfor each part.

You will get help to get started but you will have to work onyour own.

There will be time for extra questions/supervision duringconsultation hours, to be announced for each lab.

Page 71: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Programming Project

You will write a compiler for minijava, a small programminglanguage presented in the course book.

It is graded, it is the main contribution to the grade of thecourse!

It is divided in parts. You will get instructions and deadlinesfor each part.

You will get help to get started but you will have to work onyour own.

There will be time for extra questions/supervision duringconsultation hours, to be announced for each lab.

Page 72: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview of a compiler

Compilersourcecode

machinecode

errors

Has to distinguish correct fromincorrect programs (has tounderstand!)

Has to generate correct machinecode!

Has to organize memory forvariables and instructions!

Has to agree with OS on theform of object code!

Page 73: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview of a compiler

Compilersourcecode

machinecode

errors

Has to distinguish correct fromincorrect programs (has tounderstand!)

Has to generate correct machinecode!

Has to organize memory forvariables and instructions!

Has to agree with OS on theform of object code!

Page 74: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview of a compiler

Compilersourcecode

machinecode

errors

Has to distinguish correct fromincorrect programs (has tounderstand!)

Has to generate correct machinecode!

Has to organize memory forvariables and instructions!

Has to agree with OS on theform of object code!

Page 75: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview of a compiler

Compilersourcecode

machinecode

errors

Has to distinguish correct fromincorrect programs (has tounderstand!)

Has to generate correct machinecode!

Has to organize memory forvariables and instructions!

Has to agree with OS on theform of object code!

Page 76: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview of a compiler

Compilersourcecode

machinecode

errors

Has to distinguish correct fromincorrect programs (has tounderstand!)

Has to generate correct machinecode!

Has to organize memory forvariables and instructions!

Has to agree with OS on theform of object code!

Page 77: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview . . .

Front End Back Endsourcecode

IR machinecode

errors

Programs are analysed and translated to an intermediaterepresentation IR, a form of abstract machine.

IR is useful for many things:

detect some errors (indexing out of range)reorganize the code (to gain efficiency)to analyze the code (to assign registers)

It also helps to think and understand the different tasks!

Page 78: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview . . .

Front End Back Endsourcecode

IR machinecode

errors

Programs are analysed and translated to an intermediaterepresentation IR, a form of abstract machine.

IR is useful for many things:

detect some errors (indexing out of range)reorganize the code (to gain efficiency)to analyze the code (to assign registers)

It also helps to think and understand the different tasks!

Page 79: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview . . .

Front End Back Endsourcecode

IR machinecode

errors

Programs are analysed and translated to an intermediaterepresentation IR, a form of abstract machine.

IR is useful for many things:

detect some errors (indexing out of range)reorganize the code (to gain efficiency)to analyze the code (to assign registers)

It also helps to think and understand the different tasks!

Page 80: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview . . .

Front End Back Endsourcecode

IR machinecode

errors

Programs are analysed and translated to an intermediaterepresentation IR, a form of abstract machine.

IR is useful for many things:

detect some errors (indexing out of range)reorganize the code (to gain efficiency)to analyze the code (to assign registers)

It also helps to think and understand the different tasks!

Page 81: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview . . .

Front End Back Endsourcecode

IR machinecode

errors

Programs are analysed and translated to an intermediaterepresentation IR, a form of abstract machine.

IR is useful for many things:

detect some errors (indexing out of range)reorganize the code (to gain efficiency)to analyze the code (to assign registers)

It also helps to think and understand the different tasks!

Page 82: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview . . .

Front End Back Endsourcecode

IR machinecode

errors

Programs are analysed and translated to an intermediaterepresentation IR, a form of abstract machine.

IR is useful for many things:

detect some errors (indexing out of range)reorganize the code (to gain efficiency)to analyze the code (to assign registers)

It also helps to think and understand the different tasks!

Page 83: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Overview . . .

Front End Back Endsourcecode

IR machinecode

errors

Programs are analysed and translated to an intermediaterepresentation IR, a form of abstract machine.

IR is useful for many things:

detect some errors (indexing out of range)reorganize the code (to gain efficiency)to analyze the code (to assign registers)

It also helps to think and understand the different tasks!

Page 84: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 85: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 86: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 87: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 88: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

The Back End

Front End Back Endsourcecode

IR machinecode

errors

The back end is also structured in phases!

Opt Instr. Sel. Reg. Alloc.IR IR Abstract

AssemblerAssembler

Page 89: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

The Back End

Front End Back Endsourcecode

IR machinecode

errors

The back end is also structured in phases!

Opt Instr. Sel. Reg. Alloc.IR IR Abstract

AssemblerAssembler

Page 90: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

The Back End

Front End Back Endsourcecode

IR machinecode

errors

The back end is also structured in phases!

Opt Instr. Sel. Reg. Alloc.IR IR Abstract

AssemblerAssembler

Page 91: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Describing a language

What are the phrases of thelanguage? (Syntax)

Example

In English some sentences have theform

<noun phrase><verb phrase>

where a <noun phrase> can be

the <noun>a <noun><name>

What do phrases mean?(Semantics)

Example

In Java the meaning of astatement like

if <exp><stm1><stm2>

is given by explaining whathappens when it is executed:

When the value of <exp> istrue <stm1> is executed,

otherwise <stm2> is executed.

Page 92: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Describing a language

What are the phrases of thelanguage? (Syntax)

Example

In English some sentences have theform

<noun phrase><verb phrase>

where a <noun phrase> can be

the <noun>a <noun><name>

What do phrases mean?(Semantics)

Example

In Java the meaning of astatement like

if <exp><stm1><stm2>

is given by explaining whathappens when it is executed:

When the value of <exp> istrue <stm1> is executed,

otherwise <stm2> is executed.

Page 93: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Describing a language

What are the phrases of thelanguage? (Syntax)

Example

In English some sentences have theform

<noun phrase><verb phrase>

where a <noun phrase> can be

the <noun>a <noun><name>

What do phrases mean?(Semantics)

Example

In Java the meaning of astatement like

if <exp><stm1><stm2>

is given by explaining whathappens when it is executed:

When the value of <exp> istrue <stm1> is executed,

otherwise <stm2> is executed.

Page 94: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Describing a language

What are the phrases of thelanguage? (Syntax)

Example

In English some sentences have theform

<noun phrase><verb phrase>

where a <noun phrase> can be

the <noun>a <noun><name>

What do phrases mean?(Semantics)

Example

In Java the meaning of astatement like

if <exp><stm1><stm2>

is given by explaining whathappens when it is executed:

When the value of <exp> istrue <stm1> is executed,

otherwise <stm2> is executed.

Page 95: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Describing a language

What are the phrases of thelanguage? (Syntax)

Example

In English some sentences have theform

<noun phrase><verb phrase>

where a <noun phrase> can be

the <noun>a <noun><name>

What do phrases mean?(Semantics)

Example

In Java the meaning of astatement like

if <exp><stm1><stm2>

is given by explaining whathappens when it is executed:

When the value of <exp> istrue <stm1> is executed,

otherwise <stm2> is executed.

Page 96: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

Alphabets

Phrases are formed using words.

Words are formed usingcharacters.

Languages

In the context of our course we willdeal with formal languages:

Sets of strings over some alphabetdescribed by certain rules

Page 97: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

Alphabets

Phrases are formed using words.

Words are formed usingcharacters.

Languages

In the context of our course we willdeal with formal languages:

Sets of strings over some alphabetdescribed by certain rules

Page 98: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

Alphabets

Phrases are formed using words.

Words are formed usingcharacters.

Languages

In the context of our course we willdeal with formal languages:

Sets of strings over some alphabetdescribed by certain rules

Page 99: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

There are different kinds of rulesto describe languages.

According to what kind of ruleswe use the languages havecertain structure and properties.

Typically languages of wordshave a simpler structure thanlanguages of phrases.

Regular expressions are used todescribe words of programminglanguages.

Easy to understand, usefull inmany contexts, software toolsupport.

Page 100: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

There are different kinds of rulesto describe languages.

According to what kind of ruleswe use the languages havecertain structure and properties.

Typically languages of wordshave a simpler structure thanlanguages of phrases.

Regular expressions are used todescribe words of programminglanguages.

Easy to understand, usefull inmany contexts, software toolsupport.

Page 101: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

There are different kinds of rulesto describe languages.

According to what kind of ruleswe use the languages havecertain structure and properties.

Typically languages of wordshave a simpler structure thanlanguages of phrases.

Regular expressions are used todescribe words of programminglanguages.

Easy to understand, usefull inmany contexts, software toolsupport.

Page 102: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

There are different kinds of rulesto describe languages.

According to what kind of ruleswe use the languages havecertain structure and properties.

Typically languages of wordshave a simpler structure thanlanguages of phrases.

Regular expressions are used todescribe words of programminglanguages.

Easy to understand, usefull inmany contexts, software toolsupport.

Page 103: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Syntax

There are different kinds of rulesto describe languages.

According to what kind of ruleswe use the languages havecertain structure and properties.

Typically languages of wordshave a simpler structure thanlanguages of phrases.

Regular expressions are used todescribe words of programminglanguages.

Easy to understand, usefull inmany contexts, software toolsupport.

Page 104: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 105: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 106: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 107: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 108: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 109: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 110: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 111: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Notation

A regular expression r over analphabet Σ describes a set of stringsL(r).

1 ε is a RE denoting the set with the empty string as onlyelement.

2 if a ∈ Σ then a is a RE denoting the set {a}3 if r and s are RE denoting L(r) and L(s) respectively, then

(r) is a RE denoting L(r)r |s is a RE denoting L(r) ∪ L(s)rs is a RE denoting {xy |x ∈ L(r) and y ∈ L(s)}r∗ is a RE denoting L(r)∗

Page 112: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Recall set operations

We used two operations on sets when introducing regularexpressions.

Union

A ∪ B = {x |x ∈ A or x ∈ B}

Example

{0, 2, 4, 6, 8, 10} ∪{1, 3, 5, 7, 9, 10} ={0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Kleene closure

A∗ = ε ∪ A ∪ {xy |x , y ∈ A}∪{xyz |x , y , z ∈ A} ∪ . . .

Example

{a}∗ ={””, a, aa, aaa, aaaa, aaaaa, . . .}

Page 113: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Recall set operations

We used two operations on sets when introducing regularexpressions.

Union

A ∪ B = {x |x ∈ A or x ∈ B}

Example

{0, 2, 4, 6, 8, 10} ∪{1, 3, 5, 7, 9, 10} ={0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Kleene closure

A∗ = ε ∪ A ∪ {xy |x , y ∈ A}∪{xyz |x , y , z ∈ A} ∪ . . .

Example

{a}∗ ={””, a, aa, aaa, aaaa, aaaaa, . . .}

Page 114: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Recall set operations

We used two operations on sets when introducing regularexpressions.

Union

A ∪ B = {x |x ∈ A or x ∈ B}

Example

{0, 2, 4, 6, 8, 10} ∪{1, 3, 5, 7, 9, 10} ={0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Kleene closure

A∗ = ε ∪ A ∪ {xy |x , y ∈ A}∪{xyz |x , y , z ∈ A} ∪ . . .

Example

{a}∗ ={””, a, aa, aaa, aaaa, aaaaa, . . .}

Page 115: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Recall set operations

We used two operations on sets when introducing regularexpressions.

Union

A ∪ B = {x |x ∈ A or x ∈ B}

Example

{0, 2, 4, 6, 8, 10} ∪{1, 3, 5, 7, 9, 10} ={0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Kleene closure

A∗ = ε ∪ A ∪ {xy |x , y ∈ A}∪{xyz |x , y , z ∈ A} ∪ . . .

Example

{a}∗ ={””, a, aa, aaa, aaaa, aaaaa, . . .}

Page 116: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Recall set operations

We used two operations on sets when introducing regularexpressions.

Union

A ∪ B = {x |x ∈ A or x ∈ B}

Example

{0, 2, 4, 6, 8, 10} ∪{1, 3, 5, 7, 9, 10} ={0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Kleene closure

A∗ = ε ∪ A ∪ {xy |x , y ∈ A}∪{xyz |x , y , z ∈ A} ∪ . . .

Example

{a}∗ ={””, a, aa, aaa, aaaa, aaaaa, . . .}

Page 117: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

The set of strings that beginand end with an a and containat least one b:

a(a|b)∗b(a|b)∗a

Example

abaaaba, abaa, abbaaabaa, abbaa, aabba, ababa . . .. . .

We omit many parenthesis byfollowing precedenceconventions:

∗ has highest precedence

then comes concatenation

and then union

Page 118: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

The set of strings that beginand end with an a and containat least one b:

a(a|b)∗b(a|b)∗a

Example

abaaaba, abaa, abbaaabaa, abbaa, aabba, ababa . . .. . .

We omit many parenthesis byfollowing precedenceconventions:

∗ has highest precedence

then comes concatenation

and then union

Page 119: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

The set of strings that beginand end with an a and containat least one b:

a(a|b)∗b(a|b)∗a

Example

abaaaba, abaa, abbaaabaa, abbaa, aabba, ababa . . .. . .

We omit many parenthesis byfollowing precedenceconventions:

∗ has highest precedence

then comes concatenation

and then union

Page 120: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

The set of integer literals

0|(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)∗

Example

0, 1, 2, 3, 4, 5, 6, 7, 8, 910, 11, 12, . . . , 20, 21, . . . , 99100, 101, 102 . . .. . .

We use [] for either:[123456789]

We use − for a range inan ordered part of thealphabet: [1− 9]

0|[1− 9][0− 9]∗

Page 121: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

The set of integer literals

0|(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)∗

Example

0, 1, 2, 3, 4, 5, 6, 7, 8, 910, 11, 12, . . . , 20, 21, . . . , 99100, 101, 102 . . .. . .

We use [] for either:[123456789]

We use − for a range inan ordered part of thealphabet: [1− 9]

0|[1− 9][0− 9]∗

Page 122: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

The set of integer literals

0|(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)∗

Example

0, 1, 2, 3, 4, 5, 6, 7, 8, 910, 11, 12, . . . , 20, 21, . . . , 99100, 101, 102 . . .. . .

We use [] for either:[123456789]

We use − for a range inan ordered part of thealphabet: [1− 9]

0|[1− 9][0− 9]∗

Page 123: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

Identifiers in a little programming language are words of any lengthformed using the characters of the latin alphabet.

[a− zA− Z ][a− zA− Z ]∗

Example

a, b, c , . . . ,A,B,C , . . .aa, ab, ac , aZ , . . .the,myX , Int, . . .. . .

We use r+ instead of rr∗

[a− zA− Z ]+

Page 124: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

Identifiers in a little programming language are words of any lengthformed using the characters of the latin alphabet.

[a− zA− Z ][a− zA− Z ]∗

Example

a, b, c , . . . ,A,B,C , . . .aa, ab, ac , aZ , . . .the,myX , Int, . . .. . .

We use r+ instead of rr∗

[a− zA− Z ]+

Page 125: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Examples & Abbreviations

Identifiers in a little programming language are words of any lengthformed using the characters of the latin alphabet.

[a− zA− Z ][a− zA− Z ]∗

Example

a, b, c , . . . ,A,B,C , . . .aa, ab, ac , aZ , . . .the,myX , Int, . . .. . .

We use r+ instead of rr∗

[a− zA− Z ]+

Page 126: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Non regular languages

Not all sets of strings are regular!

Example

Given the alphabet Σ = {a, b},the language {anbn|n ≥ 0} isnot regular.It can be provedmathematicaly, but we will notdo it

However, for any m ≥ 0, thelanguage {anbn|0 ≤ n ≤ m} isregular.

Example

Given the alphabet{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,+, ∗, (, )},the set of wellformed arithmeticalexpressions is not regular.

We need recursion in order to allowfor subexpressions and balancedpartenthesis.

Page 127: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Non regular languages

Not all sets of strings are regular!

Example

Given the alphabet Σ = {a, b},the language {anbn|n ≥ 0} isnot regular.It can be provedmathematicaly, but we will notdo it

However, for any m ≥ 0, thelanguage {anbn|0 ≤ n ≤ m} isregular.

Example

Given the alphabet{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,+, ∗, (, )},the set of wellformed arithmeticalexpressions is not regular.

We need recursion in order to allowfor subexpressions and balancedpartenthesis.

Page 128: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Non regular languages

Not all sets of strings are regular!

Example

Given the alphabet Σ = {a, b},the language {anbn|n ≥ 0} isnot regular.It can be provedmathematicaly, but we will notdo it

However, for any m ≥ 0, thelanguage {anbn|0 ≤ n ≤ m} isregular.

Example

Given the alphabet{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,+, ∗, (, )},the set of wellformed arithmeticalexpressions is not regular.

We need recursion in order to allowfor subexpressions and balancedpartenthesis.

Page 129: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Non regular languages

Not all sets of strings are regular!

Example

Given the alphabet Σ = {a, b},the language {anbn|n ≥ 0} isnot regular.It can be provedmathematicaly, but we will notdo it

However, for any m ≥ 0, thelanguage {anbn|0 ≤ n ≤ m} isregular.

Example

Given the alphabet{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,+, ∗, (, )},the set of wellformed arithmeticalexpressions is not regular.

We need recursion in order to allowfor subexpressions and balancedpartenthesis.

Page 130: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Non regular languages

Not all sets of strings are regular!

Example

Given the alphabet Σ = {a, b},the language {anbn|n ≥ 0} isnot regular.It can be provedmathematicaly, but we will notdo it

However, for any m ≥ 0, thelanguage {anbn|0 ≤ n ≤ m} isregular.

Example

Given the alphabet{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,+, ∗, (, )},the set of wellformed arithmeticalexpressions is not regular.

We need recursion in order to allowfor subexpressions and balancedpartenthesis.

Page 131: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Regular expressions on the web

There is a lot of material on the web, both in the form of lecturenotes, slides and books. I will not link from the web page, but Iwill put some links on the slides that you can check.

A book on compilers

N. Wirth. Compiler Construction.http://www.oberon.ethz.ch/WirthPubl/CBEAll.pdf

Lecture notes for a compiler course

L. Fegaras. Design and Construction of Compilers at Texas atArlington.http://lambda.uta.edu/cse5317/notes/

A similar course at Lulea

Where you can find slides.http://www.sm.luth.se/csee/courses/smd/163/

Page 132: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Scanners

A scanner is a program that inspects a sequence of characterstrying to identify words of a language. They can be used for manydifferent purposes.

Example

In a compiler a scanner (or lexical analyzer) is used to begin withthe analysis (understanding) of the source code:

Read the sequence of characters and produce a sequence oftokens

Facilitates the analysis of the phrases of the program!

Interrupt the compilation process in case of somelexicographic error, including reporting an error message.

Page 133: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Scanners

A scanner is a program that inspects a sequence of characterstrying to identify words of a language. They can be used for manydifferent purposes.

Example

In a compiler a scanner (or lexical analyzer) is used to begin withthe analysis (understanding) of the source code:

Read the sequence of characters and produce a sequence oftokens

Facilitates the analysis of the phrases of the program!

Interrupt the compilation process in case of somelexicographic error, including reporting an error message.

Page 134: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Scanners

A scanner is a program that inspects a sequence of characterstrying to identify words of a language. They can be used for manydifferent purposes.

Example

In a compiler a scanner (or lexical analyzer) is used to begin withthe analysis (understanding) of the source code:

Read the sequence of characters and produce a sequence oftokens

Facilitates the analysis of the phrases of the program!

Interrupt the compilation process in case of somelexicographic error, including reporting an error message.

Page 135: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Scanners

A scanner is a program that inspects a sequence of characterstrying to identify words of a language. They can be used for manydifferent purposes.

Example

In a compiler a scanner (or lexical analyzer) is used to begin withthe analysis (understanding) of the source code:

Read the sequence of characters and produce a sequence oftokens

Facilitates the analysis of the phrases of the program!

Interrupt the compilation process in case of somelexicographic error, including reporting an error message.

Page 136: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Scanners

A scanner is a program that inspects a sequence of characterstrying to identify words of a language. They can be used for manydifferent purposes.

Example

In a compiler a scanner (or lexical analyzer) is used to begin withthe analysis (understanding) of the source code:

Read the sequence of characters and produce a sequence oftokens

Facilitates the analysis of the phrases of the program!

Interrupt the compilation process in case of somelexicographic error, including reporting an error message.

Page 137: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Generators

Scanners are tedious programsto write, with many cases totake care of, very error prone!

Some math that we willdiscuss in the second lecturehas resulted in programs thatgenerate scanners from aregular expression.

We will use , written in Javaand generating a Javaprogram.

Page 138: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Generators

Scanners are tedious programsto write, with many cases totake care of, very error prone!

Some math that we willdiscuss in the second lecturehas resulted in programs thatgenerate scanners from aregular expression.

We will use , written in Javaand generating a Javaprogram.

Page 139: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Generators

Scanners are tedious programsto write, with many cases totake care of, very error prone!

Some math that we willdiscuss in the second lecturehas resulted in programs thatgenerate scanners from aregular expression.

We will use , written inJava and generating a Javaprogram.

Page 140: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Generators

Scanners are tedious programsto write, with many cases totake care of, very error prone!

Some math that we willdiscuss in the second lecturehas resulted in programs thatgenerate scanners from aregular expression.

We will use , written inJava and generating a Javaprogram.

Page 141: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

JFlex source

Regularexpressions,directives, javacode.

JFlex the source!

Scanner in Java,compile it!

Run the scanneron a file of text!

// code outside MyName%%

%unicode%int%class MyName%function next%{// code inside MyName%}%%

"veronica gaspes"{System.out.println(yytext());}

.|\n{}

Page 142: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

JFlex source

Regularexpressions,directives, javacode.

JFlex the source!

Scanner in Java,compile it!

Run the scanneron a file of text!

// code outside MyName%%

%unicode%int%class MyName%function next%{// code inside MyName%}%%

"veronica gaspes"{System.out.println(yytext());}

.|\n{}

Page 143: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

JFlex source

Regularexpressions,directives, javacode.

JFlex the source!

Scanner in Java,compile it!

Run the scanneron a file of text!

// code outside MyName%%

%unicode%int%class MyName%function next%{// code inside MyName%}%%

"veronica gaspes"{System.out.println(yytext());}

.|\n{}

Page 144: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Directives and conventions

%line allows you to use thevariable yyline that isautomaticaly incremented onevery line change.

%column allows you to usethe variable yycolumn thatis automaticaly incrementedand reinitialized.

%implementsInterfaceName lets thegenerated java classimplement the interface.

When scanning the inputsequence of characters, theremight be clashes between some ofthe regular expressions.

Allways go for the longestsequence that matches anexpression.

The order of the rulesindicates precedence, if aword is described by morethan one RE the first one willbe chosen.

Page 145: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Directives and conventions

%line allows you to use thevariable yyline that isautomaticaly incremented onevery line change.

%column allows you to usethe variable yycolumn thatis automaticaly incrementedand reinitialized.

%implementsInterfaceName lets thegenerated java classimplement the interface.

When scanning the inputsequence of characters, theremight be clashes between some ofthe regular expressions.

Allways go for the longestsequence that matches anexpression.

The order of the rulesindicates precedence, if aword is described by morethan one RE the first one willbe chosen.

Page 146: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Directives and conventions

%line allows you to use thevariable yyline that isautomaticaly incremented onevery line change.

%column allows you to usethe variable yycolumn thatis automaticaly incrementedand reinitialized.

%implementsInterfaceName lets thegenerated java classimplement the interface.

When scanning the inputsequence of characters, theremight be clashes between some ofthe regular expressions.

Allways go for the longestsequence that matches anexpression.

The order of the rulesindicates precedence, if aword is described by morethan one RE the first one willbe chosen.

Page 147: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Directives and conventions

%line allows you to use thevariable yyline that isautomaticaly incremented onevery line change.

%column allows you to usethe variable yycolumn thatis automaticaly incrementedand reinitialized.

%implementsInterfaceName lets thegenerated java classimplement the interface.

When scanning the inputsequence of characters, theremight be clashes between some ofthe regular expressions.

Allways go for the longestsequence that matches anexpression.

The order of the rulesindicates precedence, if aword is described by morethan one RE the first one willbe chosen.

Page 148: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Directives and conventions

%line allows you to use thevariable yyline that isautomaticaly incremented onevery line change.

%column allows you to usethe variable yycolumn thatis automaticaly incrementedand reinitialized.

%implementsInterfaceName lets thegenerated java classimplement the interface.

When scanning the inputsequence of characters, theremight be clashes between some ofthe regular expressions.

Allways go for the longestsequence that matches anexpression.

The order of the rulesindicates precedence, if aword is described by morethan one RE the first one willbe chosen.

Page 149: Computer Languages - Introduction & Regular Expressions · Computer Languages This is anelective coursefor the master programmes Embedded Intelligent Systems and Computer Systems

Introduction Regular Expressions

Directives and conventions

%line allows you to use thevariable yyline that isautomaticaly incremented onevery line change.

%column allows you to usethe variable yycolumn thatis automaticaly incrementedand reinitialized.

%implementsInterfaceName lets thegenerated java classimplement the interface.

When scanning the inputsequence of characters, theremight be clashes between some ofthe regular expressions.

Allways go for the longestsequence that matches anexpression.

The order of the rulesindicates precedence, if aword is described by morethan one RE the first one willbe chosen.