Compilers and computer architecture: introduction

Martin Berger¹

Thanks to Chad MacKinney, Alex Jeffery, Justin Crow, Jim Fielding, Shaun Ring and Vilem Liepelt for suggestions and corrections. Thanks to Benjamin Landers for the RARS simulator.

September 2019

¹ Email: [email protected], Office hours: Wed 12-13 in Chi-2R312.

users.sussex.ac.uk/~mfb21/compilers/slides/1.pdf · 2020-02-04

Administrative matters: lecturer

• Name: Martin Berger
• Email: [email protected]
• Web: http://users.sussex.ac.uk/~mfb21/compilers
• Lecture notes etc.: http://users.sussex.ac.uk/~mfb21/compilers/material.html (linked from Canvas)
• Office hour: after the Wednesday lectures, and on request (please arrange by email; see http://users.sussex.ac.uk/~mfb21/cal for available time slots)
• My room: Chichester II, 312

Administrative matters: dates, times and assessment

• Lectures: two lectures per week.
    Wednesday: 11-12, Lec PEV1-1A7
    Friday: 17-18, RICH-AS3
• Tutorials: please see your timetables. The TA is Shaun Ring [email protected]
• There will (probably) be PAL sessions; more soon.
• Assessment: coursework (50%) and unseen examination (50%). Both courseworks involve writing parts of a compiler. Due dates for courseworks: Fri, 8 Nov 2019, and Fri, 20 Dec 2019, both 18:00.

Questions welcome!

Please, ask questions ...

• during the lesson
• at the end of the lesson
• in my office hours (see http://users.sussex.ac.uk/~mfb21/cal for available time slots)
• by email [email protected]
• on Canvas
• in the tutorials
• in the course’s Discord channel (invite is on Canvas)
• any other channels (e.g. Telegram, TikTok ...)?

Please don’t wait until the end of the course to tell me about any problems you may encounter.

Prerequisites

Good Java programming skills are indispensable. This course is not about teaching you how to program. “Good” in this context means you can do most questions on e.g.

https://leetcode.com/

classified as “Easy” without problems (= without looking up the answer, and in 1 hour or less). I also recommend that you familiarise yourself with the material on “Shell Tools and Scripting” and “Command-line Environment” in:

https://missing.csail.mit.edu/

It helps if you have already seen e.g. regular expressions, FSMs etc. But we will cover all this from scratch.

It helps if you have already seen a CPU, e.g. know what a register or a stack pointer is.

Course content

I’m planning to give a fairly orthodox compilers course that shows you all parts of a compiler. At the end of this course you should be able to write a full-blown compiler yourself and implement programming languages.

We will also look at computer architecture, although more superficially.

This will take approximately 9 weeks, so we have time at the end for some advanced material. I’m happy to tailor the course to your interests, so please let me know what you want to hear about.

Coursework

Evaluation of assessed courseworks will (largely) be by automated tests. This is quite different from what you’ve seen so far. The reason for this new approach is threefold.

• Compilers are complicated algorithms and it’s beyond human capabilities to find subtle bugs.
• Realism. In industry you don’t get paid for being nice, or for having code that “almost” works.
• Fairness. Automatic testing removes the subjective element.

Note that if you make a basic error in your compiler then it is quite likely that every test fails and you will get 0 points. So it is really important that you test your code thoroughly before submission. I encourage you to share tests and testing frameworks with other students: as tests are not part of the deliverable, you may share them. Of course the compiler must be written by yourself.

Plan for today’s lecture

Whirlwind overview of the course.

• Why study compilers?
• What is a compiler?
• Compiler structure
• Lexical analysis
• Syntax analysis
• Semantic analysis, type-checking
• Code generation

Why study compilers?

To become a good programmer, you need to understand what happens ‘under the hood’ when you write programs in a high-level language.

To understand low-level languages (assembler, C/C++, Rust, Go) better. Those languages are of prime importance, e.g. for writing operating systems, embedded code and generally code that needs to be fast (e.g. computer games, or ML frameworks such as TensorFlow).

Most large programs have a tendency to embed a programming language. The skill to quickly write an interpreter or compiler for such embedded languages is invaluable.

But most of all: compilers are amazing, beautiful and one of the all-time great examples of human ingenuity. After 70 years of refinement, compilers are a paradigm case of beautiful software structure (modularisation). I hope it inspires you.

Overview: what is a compiler?

A compiler is a program that translates programs from one programming language to programs in another programming language. The translation should preserve meaning (what do “preserve” and “meaning” mean in this context?).

[Diagram: Source program → Compiler → Target program; the compiler also outputs error messages.]

Typically, the input language (called the source language) is more high-level than the output language (called the target language).

Examples

• Source: Java, target: JVM bytecode.
• Source: JVM bytecode, target: ARM/x86 machine code.
• Source: TensorFlow, target: GPU/TPU machine code.
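To make “translate” and “preserve meaning” a bit more concrete, here is a minimal sketch in Java. Everything in it is an assumption made for illustration and does not come from the course material: the source language is integer expressions with + and *, the target language is a made-up stack machine with the instructions PUSH, ADD and MUL, and the names TinyCompiler, Expr, Num, BinOp, compile and run are hypothetical. In this toy setting, “preserving meaning” is read as: evaluating the source expression directly gives the same number as running the compiled instructions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TinyCompiler {
    // Source language (hypothetical): integer expressions built from numbers, + and *.
    public interface Expr {
        int eval();                    // meaning of the source program
        void emit(List<String> code);  // append the target instructions for this expression
    }

    public static final class Num implements Expr {
        final int value;
        public Num(int value) { this.value = value; }
        public int eval() { return value; }
        public void emit(List<String> code) { code.add("PUSH " + value); }
    }

    public static final class BinOp implements Expr {
        final char op; // '+' or '*'
        final Expr left, right;
        public BinOp(char op, Expr left, Expr right) {
            if (op != '+' && op != '*') throw new IllegalArgumentException("unknown operator: " + op);
            this.op = op; this.left = left; this.right = right;
        }
        public int eval() {
            return op == '+' ? left.eval() + right.eval() : left.eval() * right.eval();
        }
        public void emit(List<String> code) {
            // Code for both operands first, then the instruction that combines their results.
            left.emit(code);
            right.emit(code);
            code.add(op == '+' ? "ADD" : "MUL");
        }
    }

    // The "compiler": source program in, target program out.
    public static List<String> compile(Expr source) {
        List<String> code = new ArrayList<>();
        source.emit(code);
        return code;
    }

    // Meaning of a target program: execute it on the (made-up) stack machine.
    public static int run(List<String> code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String instr : code) {
            if (instr.startsWith("PUSH ")) {
                stack.push(Integer.parseInt(instr.substring("PUSH ".length())));
            } else if (instr.equals("ADD")) {
                stack.push(stack.pop() + stack.pop()); // operand order is irrelevant for +
            } else if (instr.equals("MUL")) {
                stack.push(stack.pop() * stack.pop()); // operand order is irrelevant for *
            } else {
                throw new IllegalArgumentException("unknown instruction: " + instr);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        Expr source = new BinOp('+', new Num(1), new BinOp('*', new Num(2), new Num(3)));
        List<String> target = compile(source);
        System.out.println(target);                       // [PUSH 1, PUSH 2, PUSH 3, MUL, ADD]
        System.out.println(source.eval() == run(target)); // true
    }
}
```

A real compiler starts from program text rather than from a ready-made expression tree, and it reports error messages for invalid input; both are left out of this sketch.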

43 / 150

Page 44: Compilers and computer architecture: introductionusers.sussex.ac.uk/~mfb21/compilers/slides/1.pdf · 2020-02-04 · Compilers and computer architecture: introduction Martin Berger

Overview: what is a compiler?A compiler is a program that translates programs from oneprogramming language to programs in another programminglanguage. The translation should preserve meaning (what does“preserve” and “meaning” mean in this context?).

CompilerSource program Target program

Error messages

Typically, the input language (called source language) is morehigh-level than the output language (called target language)

ExamplesI Source: Java, target: JVM bytecode.I Source: JVM bytecode, target: ARM/x86 machine codeI Source: TensorFlow, target: GPU/TPU machine code.

Example translation: source program

Here is a little program. (What does it do?)

    int testfun( int n ){
        int res = 1;
        while( n > 0 ){
            n--;
            res *= 2;
        }
        return res;
    }

Using clang -S, this translates to the following x86 assembly code ...

Example translation: target program

    _testfun:                               ## @testfun
            .cfi_startproc
    ## BB#0:
            pushq   %rbp
    Ltmp0:
            .cfi_def_cfa_offset 16
    Ltmp1:
            .cfi_offset %rbp, -16
            movq    %rsp, %rbp
    Ltmp2:
            .cfi_def_cfa_register %rbp
            movl    %edi, -4(%rbp)
            movl    $1, -8(%rbp)
    LBB0_1:                                 ## =>This Inner Loop Header: Depth=1
            cmpl    $0, -4(%rbp)
            jle     LBB0_3
    ## BB#2:                                ## in Loop: Header=BB0_1 Depth=1
            movl    -4(%rbp), %eax
            addl    $4294967295, %eax       ## imm = 0xFFFFFFFF
            movl    %eax, -4(%rbp)
            movl    -8(%rbp), %eax
            shll    $1, %eax
            movl    %eax, -8(%rbp)
            jmp     LBB0_1
    LBB0_3:
            movl    -8(%rbp), %eax
            popq    %rbp
            retq
            .cfi_endproc

Compilers have a beautifully simple structure

[Figure: Source program → Analysis phase → Code generation → Generated program]

In the analysis phase two things happen:

- Analysing if the program is well-formed (e.g. checking for syntax and type errors).
- Creating a convenient (for a computer) representation of the source program structure for further processing. (Abstract syntax tree (AST), symbol table).

The executable program is then generated from the AST in the code generation phase.

Let’s refine this.

Compiler structure

Compilers have a beautifully simple structure. This structure was arrived at by breaking a hard problem (compilation) into several smaller problems and solving them separately. This has the added advantage of making it quite easy to retarget compilers (changing the source or target language).

[Figure: Source program → Lexical analysis → Syntax analysis → Semantic analysis, e.g. type checking → Intermediate code generation → Optimisation → Code generation → Translated program]

Compiler structure

Interesting question: when do these phases happen?

In the past, all happened at ... compile-time. Now some happen at run-time in just-in-time compilers (JITs). This has a profound influence on the choice of algorithms and on performance.

Compiler structure

Another interesting question: do you notice something about all these phases?

The phases are purely functional, in that they take one input and return one output. Modern programming languages like Haskell, OCaml, F#, Rust or Scala are ideal for writing compilers.
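To make the "one input, one output" observation concrete, here is a minimal sketch in Python (chosen only for brevity; it is not the language used in the slides). It assumes a toy source language consisting of sums such as `1 + 2 + 3` and a made-up stack machine as the target. The names `lex`, `parse`, `generate`, `compose` and `compile_sum` are hypothetical, and only three of the phases are modelled.

```python
from functools import reduce


def lex(source):
    """String -> list of tokens (here: just whitespace-separated words)."""
    return source.split()


def parse(tokens):
    """Token list -> AST for sums such as 1 + 2 + 3 (nested tuples)."""
    ast = ("num", int(tokens[0]))
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+":
            raise SyntaxError(f"expected '+', got {tokens[i]!r}")
        ast = ("add", ast, ("num", int(tokens[i + 1])))
    return ast


def generate(ast):
    """AST -> list of instructions for a made-up stack machine."""
    if ast[0] == "num":
        return [f"push {ast[1]}"]
    return generate(ast[1]) + generate(ast[2]) + ["add"]


def compose(*phases):
    """Chain phases left to right: each output is the next phase's input."""
    return lambda x: reduce(lambda value, phase: phase(value), phases, x)


compile_sum = compose(lex, parse, generate)
print(compile_sum("1 + 2 + 3"))
# ['push 1', 'push 2', 'add', 'push 3', 'add']
```

Because no phase touches shared state, each one can be tested on its own, and replacing `generate` with a different back end would retarget this toy compiler without changing `lex` or `parse`.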

Phases: Overview

- Lexical analysis
- Syntactic analysis (parsing)
- Semantic analysis (type-checking)
- Intermediate code generation
- Optimisation
- Code generation

Phases: Lexical analysis

What is the input to a compiler?

An (often long) string, i.e. a sequence of characters.

Strings are not an efficient data structure for a compiler to work with (= generate code from). Instead, compilers generate code from a more convenient data structure called an “abstract syntax tree” (AST). We construct the AST of a program in two phases:

- Lexical analysis, where the input string is converted into a list of tokens.
- Parsing, where the AST is constructed from a token list.

Phases: Lexical analysis

In the lexical analysis, a string is converted into a list of tokens. Example: The program

    int testfun( int n ){
        int res = 1;
        while( n > 0 ){
            n--;
            res *= 2;
        }
        return res;
    }

is (could be) represented as the list

    T_int, T_ident ( "testfun" ), T_left_brack,
    T_int, T_ident ( "n" ), T_right_brack,
    T_left_curly_brack, T_int, T_ident ( "res" ),
    T_eq, T_num ( 1 ), T_semicolon, T_while, ...
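A toy lexer producing such a list could be sketched as follows, using Python's standard `re` module. This is one possible approach, not the implementation used in the course. The token names follow the list above where it gives them (with the closing bracket spelled `T_right_brack` for symmetry with `T_left_brack`); `T_return`, `T_decrement`, `T_mult_eq`, `T_greater`, `T_right_curly_brack` and the internal `SKIP` are assumptions added so that the whole example program can be tokenised.

```python
import re

# (token name, regular expression). Earlier entries win, so keywords come
# before identifiers and "--" / "*=" before the shorter operators.
TOKEN_SPEC = [
    ("T_int", r"int\b"),
    ("T_while", r"while\b"),
    ("T_return", r"return\b"),
    ("T_ident", r"[A-Za-z_]\w*"),
    ("T_num", r"\d+"),
    ("T_decrement", r"--"),
    ("T_mult_eq", r"\*="),
    ("T_eq", r"="),
    ("T_greater", r">"),
    ("T_left_brack", r"\("),
    ("T_right_brack", r"\)"),
    ("T_left_curly_brack", r"\{"),
    ("T_right_curly_brack", r"\}"),
    ("T_semicolon", r";"),
    ("SKIP", r"\s+"),  # whitespace is recognised but produces no token
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def lex(source):
    """Convert a source string into a list of tokens (tuples)."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, text = m.lastgroup, m.group()
        if kind == "T_ident":
            tokens.append((kind, text))       # identifiers carry their name
        elif kind == "T_num":
            tokens.append((kind, int(text)))  # numbers carry their value
        elif kind != "SKIP":
            tokens.append((kind,))
        pos = m.end()
    return tokens


SOURCE = "int testfun( int n ){int res = 1;while( n > 0 ){n--;res *= 2; }return res; }"
print(lex(SOURCE)[:6])
print(lex("int x;") == lex("int\n   x  ;"))  # True
```

The last line shows that two differently laid-out versions of the same declaration produce the same token list, so later phases never see the whitespace.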

Phases: Lexical analysis

Why is this token representation interesting?

- It abstracts from irrelevant detail (e.g. syntax of keywords, whitespace, comments).
- It makes the next phase (parsing) much easier.

Phases: syntax analysis (parsing)

This phase converts the program (list of tokens) into a tree, the AST of the program (compare to the DOM of a webpage). This is a very convenient data structure because semantic analysis (e.g. type-checking) and code generation can be done by walking the AST (cf. visitor pattern). But how is a program a tree?

    while( n > 0 ){
        n--;
        res *= 2;
    }

As a tree (children are indented below their parent):

    T_while
        T_greater
            T_var ( n )
            T_num ( 0 )
        T_semicolon
            T_decrement
                T_var ( n )
            T_update
                T_var ( res )
                T_mult
                    T_var ( res )
                    T_num ( 2 )
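One way to hold such a tree in memory is sketched below in Python. The class names `Var`, `Num` and `Node` and the function `show` are hypothetical; a real compiler would more likely use one class per kind of node. The labels are taken from the tree for the loop `while( n > 0 ){ n--; res *= 2; }`.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str


@dataclass
class Num:
    value: int


@dataclass
class Node:
    """Inner node: a label such as "T_while" plus its child subtrees."""
    label: str
    children: list


# AST for: while( n > 0 ){ n--; res *= 2; }
loop = Node("T_while", [
    Node("T_greater", [Var("n"), Num(0)]),
    Node("T_semicolon", [
        Node("T_decrement", [Var("n")]),
        Node("T_update", [
            Var("res"),
            Node("T_mult", [Var("res"), Num(2)]),
        ]),
    ]),
])


def show(tree, depth=0):
    """Walk the tree and return it as indented text, one node per line."""
    pad = "  " * depth
    if isinstance(tree, Var):
        return [f"{pad}T_var ( {tree.name} )"]
    if isinstance(tree, Num):
        return [f"{pad}T_num ( {tree.value} )"]
    lines = [pad + tree.label]
    for child in tree.children:
        lines.extend(show(child, depth + 1))
    return lines


print("\n".join(show(loop)))
```

`show` is already a small tree walk: it visits a node, then recurses into the children, which is the same pattern that type-checking and code generation follow.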

Phases: syntax analysis (parsing)

- The AST is often implemented as a tree of linked objects.
- The compiler writer must design the AST data structure carefully so that it is easy to build (during syntax analysis), and easy to walk (during code generation).
- The performance of the compiler strongly depends on the AST, so a lot of optimisation goes here for industrial-strength compilers.
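What "easy to walk" means during code generation can be sketched as a recursive function over the tree. In the Python sketch below, nested tuples stand in for linked objects to keep the example short, and the target is a made-up stack machine: the instruction names (`push`, `load`, `store`, `mul`, `gt`, `sub`, `jump`, `jump_if_false`) and the node kinds are assumptions for illustration, not a real instruction set.

```python
from itertools import count


def gen(ast, labels):
    """Walk the AST and return instructions for a made-up stack machine.

    `labels` is an iterator of fresh numbers used to name jump targets.
    """
    kind = ast[0]
    if kind == "num":
        return [f"push {ast[1]}"]
    if kind == "var":
        return [f"load {ast[1]}"]
    if kind in ("mult", "greater"):
        op = {"mult": "mul", "greater": "gt"}[kind]
        return gen(ast[1], labels) + gen(ast[2], labels) + [op]
    if kind == "update":     # ("update", name, expression)
        return gen(ast[2], labels) + [f"store {ast[1]}"]
    if kind == "decrement":  # ("decrement", name)
        return [f"load {ast[1]}", "push 1", "sub", f"store {ast[1]}"]
    if kind == "seq":        # ("seq", first, second)
        return gen(ast[1], labels) + gen(ast[2], labels)
    if kind == "while":      # ("while", condition, body)
        n = next(labels)
        return ([f"loop{n}:"] + gen(ast[1], labels)
                + [f"jump_if_false end{n}"] + gen(ast[2], labels)
                + [f"jump loop{n}", f"end{n}:"])
    raise ValueError(f"unknown node kind {kind!r}")


# AST for: while( n > 0 ){ n--; res *= 2; }
loop = ("while",
        ("greater", ("var", "n"), ("num", 0)),
        ("seq",
         ("decrement", "n"),
         ("update", "res", ("mult", ("var", "res"), ("num", 2)))))

print("\n".join(gen(loop, count())))
```

Each node kind gets one case, and the code for a node is assembled from the code of its children. The output has the same shape as the x86 listing for `testfun`: a loop label, a test with a conditional jump out of the loop, the body, and a jump back.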

Phases: syntax analysis (parsing)

The construction of the AST has another important role: syntax checking, i.e. checking if the program is syntactically valid!

This dual role is because the rules for constructing the AST are essentially exactly the rules that determine the set of syntactically valid programs. Here the theory of formal languages (context-free, context-sensitive, and finite automata) is of prime importance. We will study this in detail.
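The dual role can be seen in a small hand-written parser. The Python sketch below assumes a tiny grammar invented for this example (products of numbers, identifiers and bracketed expressions) and tokens in the form of tuples such as `("T_num", 2)`; the token name `T_mult` and the function names are likewise assumptions. The same code that builds the tree also rejects token lists that the grammar does not allow.

```python
def parse(tokens):
    """Build an AST from a token list, or raise SyntaxError if invalid.

    Grammar (an assumption made for this sketch):
        expr ::= atom ( T_mult atom )*
        atom ::= T_num | T_ident | T_left_brack expr T_right_brack
    """
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "end of input"

    def atom():
        nonlocal pos
        kind = peek()
        if kind in ("T_num", "T_ident"):
            token = tokens[pos]
            pos += 1
            return token
        if kind == "T_left_brack":
            pos += 1
            inner = expr()
            if peek() != "T_right_brack":
                raise SyntaxError(f"expected T_right_brack, got {peek()}")
            pos += 1
            return inner
        raise SyntaxError(f"unexpected {kind}")

    def expr():
        nonlocal pos
        tree = atom()
        while peek() == "T_mult":
            pos += 1
            tree = ("T_mult", tree, atom())
        return tree

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()} after expression")
    return tree


def is_syntactically_valid(tokens):
    """Syntax checking for free: valid exactly when an AST can be built."""
    try:
        parse(tokens)
        return True
    except SyntaxError:
        return False


good = [("T_ident", "res"), ("T_mult",), ("T_num", 2)]  # res * 2
bad = [("T_ident", "res"), ("T_mult",)]                 # res *
print(parse(good))                  # ('T_mult', ('T_ident', 'res'), ('T_num', 2))
print(is_syntactically_valid(bad))  # False
```

Each grammar rule corresponds to one function (`expr`, `atom`), which is why the rules for building the tree and the rules for validity coincide.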

Phases: syntax analysis (parsing)

T_while

T_greater

T_var ( n ) T_num ( 0 )

T_semicolon

T_decrement

T_var ( n )

T_update

T_var ( res ) T_mult

T_var ( res ) T_num ( 2 )

Great news: the generation of lexical analysers and parsers can be automated by using parser generators (e.g. lex, yacc). Decades of research have gone into parser generators, and in practice they generate better lexers and parsers than most programmers would be able to. Alas, parser generators are quite complicated beasts, and in order to understand them, it is helpful to understand formal languages and lexing/parsing. The best way to understand this is to write a toy lexer and parser.
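As a taste of what a toy lexer might look like, here is a minimal hand-written sketch in Python using regular expressions. The token names T_while, T_var, T_num etc. follow the slides; the concrete spellings they match (e.g. `--` for T_decrement, `=` for T_update), the bracket tokens and the function name `lex` are assumptions for this illustration.

```python
import re

# Order matters: keywords must be tried before identifiers.
# The spellings on the right are assumptions made for this sketch.
TOKEN_SPEC = [
    ("T_while",     r"while\b"),
    ("T_num",       r"\d+"),
    ("T_var",       r"[A-Za-z_]\w*"),
    ("T_decrement", r"--"),
    ("T_greater",   r">"),
    ("T_update",    r"="),
    ("T_mult",      r"\*"),
    ("T_semicolon", r";"),
    ("T_lparen",    r"\("),
    ("T_rparen",    r"\)"),
    ("T_lbrace",    r"\{"),
    ("T_rbrace",    r"\}"),
    ("SKIP",        r"\s+"),   # whitespace is matched but not emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def lex(source):
    """Turn a source string into a list of (token kind, matched text) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at offset {pos}")
        kind = match.lastgroup          # name of the alternative that matched
        if kind != "SKIP":
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens
```

For example, `lex("while (n > 0) { n--; res = res * 2; }")` produces a token list beginning with `("T_while", "while")`. A lexer generator automates exactly this table-driven matching, and typically compiles the patterns into a single automaton.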

Phases: semantic analysis

[Diagram: compiler phases. Source program -> Lexical analysis -> Syntax analysis -> Semantic analysis, e.g. type checking -> Intermediate code generation -> Optimisation -> Code generation -> Translated program]

While parsing can reject syntactically invalid programs, it cannot reject semantically invalid programs: programs with more complicated 'semantic' mistakes are harder to catch. Examples:

  void main() {
    i = 7
    int i = 7
    ...

  if ( 3 + true ) > "hello" then ...

They are caught with semantic analysis. The key technology is types. Modern languages like Scala, Rust, Haskell, OCaml, F# employ type inference.
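A minimal sketch of how such mistakes can be caught, in Python. The tuple-based AST, the names `type_of`, `check_block` and `SemanticError`, and the typing rules (only ints can be added or compared) are assumptions for this illustration. Note that this is plain type checking with declared types, not type inference.

```python
class SemanticError(Exception):
    """Raised for programs that parse fine but are semantically invalid."""


def type_of(expr, env):
    """Return the type of an expression; env maps declared variables to types."""
    kind = expr[0]
    if kind == "num":
        return "int"
    if kind == "bool":
        return "bool"
    if kind == "str":
        return "string"
    if kind == "var":
        name = expr[1]
        if name not in env:
            raise SemanticError(f"variable {name!r} used before declaration")
        return env[name]
    if kind in ("add", "gt"):
        left = type_of(expr[1], env)
        right = type_of(expr[2], env)
        if left != "int" or right != "int":
            raise SemanticError(f"{kind} needs two ints, got {left} and {right}")
        return "int" if kind == "add" else "bool"
    raise ValueError(f"unknown expression kind {kind!r}")


def check_block(statements):
    """Check statements in order; return the final variable environment."""
    env = {}
    for stmt in statements:
        if stmt[0] == "decl":              # ("decl", type, name, initialiser)
            _, declared, name, init = stmt
            actual = type_of(init, env)
            if actual != declared:
                raise SemanticError(f"{name!r} declared {declared}, initialised with {actual}")
            env[name] = declared
        elif stmt[0] == "assign":          # ("assign", name, value)
            _, name, value = stmt
            if name not in env:
                raise SemanticError(f"assignment to undeclared variable {name!r}")
            if type_of(value, env) != env[name]:
                raise SemanticError(f"type mismatch in assignment to {name!r}")
        else:
            raise ValueError(f"unknown statement kind {stmt[0]!r}")
    return env
```

Both slide examples are rejected: `check_block([("assign", "i", ("num", 7)), ("decl", "int", "i", ("num", 7))])` fails because `i` is assigned before it is declared, and `type_of(("gt", ("add", ("num", 3), ("bool", True)), ("str", "hello")), {})` fails because `3 + true` is ill-typed.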

Phases: intermediate code generation

There are many different CPUs with different machine languages. Often the machine language changes subtly from CPU version to CPU version. It would be annoying if we had to rewrite large parts of the compiler. Fortunately, most machine languages are rather similar. This helps us to abstract almost the whole compiler from the details of the target language. The way we do this is by using, in essence, two compilers.

- Develop an intermediate language that captures the essence of almost all machine languages.

- Compile to this intermediate language.

- Do compiler optimisations in the intermediate language.

- Translate the intermediate representation to the target machine language. This step can be seen as a mini-compiler.

- If we want to retarget the compiler to a new machine language, only this last step needs to be rewritten. Nice data abstraction.
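The first of the two compilers can be sketched as follows: a Python function that lowers an expression AST to a flat, machine-independent list of three-address instructions. The instruction names (`const`, `load`, `add`, `mul`), the temporaries `t0`, `t1`, ... and the function name `lower` are assumptions for this illustration, not the intermediate language used later in the course.

```python
import itertools


def lower(expr):
    """Lower a nested-tuple expression to three-address code.

    Returns (instructions, name of the temporary holding the result).
    """
    code = []
    fresh = (f"t{i}" for i in itertools.count())   # endless supply of temporaries

    def go(node):
        kind = node[0]
        if kind == "num":
            dest = next(fresh)
            code.append(("const", dest, node[1]))
            return dest
        if kind == "var":
            dest = next(fresh)
            code.append(("load", dest, node[1]))
            return dest
        if kind in ("add", "mul"):
            left = go(node[1])
            right = go(node[2])
            dest = next(fresh)
            code.append((kind, dest, left, right))
            return dest
        raise ValueError(f"unknown expression kind {kind!r}")

    result = go(expr)
    return code, result
```

For `res * 2`, i.e. `lower(("mul", ("var", "res"), ("num", 2)))`, this gives the instructions `load t0 res`, `const t1 2`, `mul t2 t0 t1` with the result in `t2`. Nothing here mentions a concrete CPU; only the final translation step needs to know the target.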

Phases: optimiser

Translating a program often introduces various inefficiencies that make the program, e.g., run slowly, use a lot of memory, or use a lot of power (important for mobile phones). Optimisers try to remove these inefficiencies by replacing the inefficient program with a more efficient version (without changing the meaning of the program).

Most code optimisation problems are difficult (NP-complete or undecidable), so optimisers are expensive to run and often (but not always) lead to modest improvements only. They are also difficult algorithmically. These difficulties are exacerbated for JITs because they are executed at program run-time.

However, some optimisations are easy, e.g. inlining of functions: if a function is short (e.g. computing the sum of two numbers), replacing the call to the function with its code can lead to faster code. (What is the disadvantage of this?)
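A minimal sketch of inlining on an expression AST, in Python. The tuple representation, the function table format (name mapped to parameters and body), the size threshold `max_size` and all function names are assumptions for this illustration. It also assumes that expressions have no side effects and that function bodies refer only to their own parameters; a real inliner has to handle both.

```python
def size(expr):
    """Number of AST nodes; used to decide whether a function is 'short'."""
    kind = expr[0]
    if kind in ("num", "var"):
        return 1
    children = expr[2:] if kind == "call" else expr[1:]
    return 1 + sum(size(child) for child in children)


def substitute(expr, bindings):
    """Replace parameter variables in expr by the given argument expressions."""
    kind = expr[0]
    if kind == "var":
        return bindings.get(expr[1], expr)
    if kind == "num":
        return expr
    if kind == "call":
        return ("call", expr[1]) + tuple(substitute(a, bindings) for a in expr[2:])
    return (kind,) + tuple(substitute(a, bindings) for a in expr[1:])


def inline(expr, functions, max_size=5):
    """Replace calls to short functions by their bodies.

    functions maps a name to (parameter names, body expression).
    """
    kind = expr[0]
    if kind in ("num", "var"):
        return expr
    if kind == "call":
        name = expr[1]
        args = [inline(a, functions, max_size) for a in expr[2:]]
        params, body = functions[name]
        if size(body) <= max_size:
            return substitute(body, dict(zip(params, args)))
        return ("call", name, *args)        # too big: keep the call
    return (kind,) + tuple(inline(a, functions, max_size) for a in expr[1:])
```

With `functions = {"sum": (("a", "b"), ("add", ("var", "a"), ("var", "b")))}`, the expression `("mul", ("call", "sum", ("num", 1), ("var", "x")), ("num", 2))` becomes `("mul", ("add", ("num", 1), ("var", "x")), ("num", 2))`: the call has disappeared, and with it the call overhead.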

Phases: code generation

This straightforward phase translates the generated intermediate code to machine code. As machine code and intermediate code are much alike, this 'mini-compiler' is simple and fast.
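To illustrate how thin this last step can be, here is a sketch in Python that turns three-address instructions into RISC-V assembly text, one line per instruction. The intermediate instruction format (`const`, `add`, `mul` as tuples) and the function name `to_riscv` are assumptions for this illustration. The sketch also assumes that every temporary is already named after one of the RISC-V temporary registers t0 to t6, so it does no register allocation, and that `mul` is available (M extension).

```python
RISCV_TEMPORARIES = {"t0", "t1", "t2", "t3", "t4", "t5", "t6"}


def to_riscv(ir):
    """Translate a list of three-address instructions to RISC-V assembly lines."""
    lines = []
    for instr in ir:
        op = instr[0]
        for reg in instr[1:]:
            # Operands are either register names (strings) or integer constants.
            if isinstance(reg, str) and reg not in RISCV_TEMPORARIES:
                raise ValueError(f"{reg!r} is not a temporary register (no register allocation here)")
        if op == "const":
            _, dest, value = instr
            lines.append(f"li {dest}, {value}")           # load immediate
        elif op in ("add", "mul"):
            _, dest, left, right = instr
            lines.append(f"{op} {dest}, {left}, {right}")
        else:
            raise ValueError(f"unknown intermediate instruction {op!r}")
    return lines
```

For instance, `to_riscv([("const", "t0", 3), ("const", "t1", 2), ("mul", "t2", "t0", "t1")])` gives the lines `li t0, 3`, `li t1, 2` and `mul t2, t0, t1`. Retargeting to another CPU would mean replacing only this function.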

Compilers vs interpreters

Interpreters are a second way to run programs.

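[Diagram: with a compiler, Source program -> Compiler -> Executable, then Data -> Executable -> Output. With an interpreter, Source program and Data -> Interpreter -> Output. Annotations: "Syntax error?" and "At runtime. Syntax error?"]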

- The advantage of compilers is that generated code is faster, because a lot of work has to be done only once (e.g. lexing, parsing, type-checking, optimisation), and the results of this work are shared in every execution. The interpreter has to redo this work every time.

- The advantage of interpreters is that they are much simpler than compilers.
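The difference can be sketched on a tiny expression language in Python. Both functions below and the tuple-based AST are assumptions for this illustration, and Python closures stand in for generated machine code. The interpreter inspects the AST on every run; the "compiler" inspects it once and returns something that can be run many times.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}


def interpret(expr, env):
    """Tree-walking interpreter: examines the AST again on every evaluation."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    return OPS[kind](interpret(expr[1], env), interpret(expr[2], env))


def compile_expr(expr):
    """Examine the AST once; return a function from environments to values."""
    kind = expr[0]
    if kind == "num":
        value = expr[1]
        return lambda env: value
    if kind == "var":
        name = expr[1]
        return lambda env: env[name]
    op = OPS[kind]
    left = compile_expr(expr[1])     # analysis of subtrees happens here, once
    right = compile_expr(expr[2])
    return lambda env: op(left(env), right(env))
```

With `expr = ("add", ("mul", ("var", "x"), ("num", 2)), ("num", 1))`, both `interpret(expr, {"x": 5})` and `compile_expr(expr)({"x": 5})` compute the same value. The compiled function can be kept and applied to many environments without looking at `expr` again, which is the sense in which the work is shared between executions.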

We won’t say much more about interpreters in this course.

Literature

Compilers are among the most studied and best understood parts of informatics. Many good books exist. Here are some of my favourites, although I won't follow any of them closely.

- Modern Compiler Implementation in Java (second edition) by Andrew Appel and Jens Palsberg. Probably closest to our course. Moves quite fast.

- Compilers - Principles, Techniques and Tools (second edition) by Alfred V. Aho, Monica Lam, Ravi Sethi, and Jeffrey D. Ullman. The first edition of this book is the classic text on compilers, known as the "Dragon Book", but it is a bit obsolete. The second edition is substantially expanded and goes well beyond the scope of our course. For my liking, the book is a tad long.

Some other material:

- Engineering a Compiler, by Keith Cooper and Linda Torczon.

- Alex Aiken's Stanford University online course on compilers. This course covers similar ground to ours, but goes more in-depth. I was quite influenced by Aiken's course when I designed ours.

- Computer Architecture - A Quantitative Approach (sixth edition) by John Hennessy and David Patterson. This is the 'bible' for computer architecture. It goes way beyond what is required for our course, but is very well written by some of the world's leading experts on computer architecture. Well worth studying.

Page 140: Compilers and computer architecture: introductionusers.sussex.ac.uk/~mfb21/compilers/slides/1.pdf · 2020-02-04 · Compilers and computer architecture: introduction Martin Berger

Literature

Some other material:

I Engineering a Compiler, by Keith Cooper, Linda Torczon.I The Alex Aiken’s Stanford University online course on

compilers. This course coveres similar ground as ours,but goes more in-depth. I was quite influenced by Aiken’scourse when I designed our’s.

I Computer Architecture - A Quantitative Approach (sixthedition) by John Hennessey and David Patterson. This isthe ’bible’ for computer architecture. It goes way beyondwhat is required for our course, but very well written bysome of the world’s leading experts on computerarchitecture. Well worth studying.

140 / 150

Page 141: Compilers and computer architecture: introductionusers.sussex.ac.uk/~mfb21/compilers/slides/1.pdf · 2020-02-04 · Compilers and computer architecture: introduction Martin Berger

How to enjoy and benefit from this course

• Assessed coursework is designed to reinforce and integrate lecture material; it’s designed to help you pass the exam.

• Go look at the past papers - now.

• Use the tutorials to get feedback on your solutions.

• A substantial lab exercise should bring it all together.

• Ask questions, in the lectures, in the labs, on Canvas or in person!

• Design your own mini-languages and write compilers for them.

• Have a look at real compilers. There are many free, open-source compilers, e.g. GCC, LLVM, TCC, MiniML, OCaml, the Scala compiler, and GHC, the Haskell compiler.

Feedback

In this module, you will receive feedback through:

• The mark and comments on your assessment
• Feedback to the whole class on assessment and exams
• Feedback to the whole class on lecture understanding
• Model solutions
• Worked examples in class and lecture
• Verbal comments and discussions with tutors in class
• Discussions with your peers on problems
• Online discussion forums
• One-to-one sessions with the tutors

The more questions you ask, the more you participate in discussions, the more you engage with the course, the more feedback you get.

Questions?
