Chapter 1 Introduction Major Data Structures in Compiler Gang S. Liu College of Computer Science & Technology Harbin Engineering University


Chapter 1

Introduction

Major Data Structures in Compiler

Gang S. Liu

College of Computer Science & Technology

Harbin Engineering University

Major Data Structures in Compiler

There is a strong interaction between the algorithms used by the phases of a compiler and the data structures that support these phases.

Algorithms need to be implemented in an efficient manner.

The choice of data structures is therefore important.

Tokens

When a scanner collects characters into a token, it represents the token symbolically as a value of an enumerated data type representing the set of tokens of the source language.

Sometimes it is necessary to preserve the character string itself or other information derived from it, such as the name associated with an identifier token or the value of a number token.

In most languages the scanner needs to generate only one token at a time (single-symbol lookahead). In that case a single global variable can be used to hold the token information.
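The token representation described above might be sketched in C as follows. The token kinds, the names `Token`, `current_token`, and `next_token`, and the tiny input language (identifiers, unsigned integers, and `+`) are all assumptions made for illustration; they do not come from the slides.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a tiny expression language. */
typedef enum { TOK_ID, TOK_NUM, TOK_PLUS, TOK_EOF, TOK_ERROR } TokenKind;

#define MAX_LEXEME 63

/* Token record: the symbolic kind, the preserved character string,
   and (for number tokens) the value derived from that string. */
typedef struct {
    TokenKind kind;
    char lexeme[MAX_LEXEME + 1];
    int value;                    /* meaningful only when kind == TOK_NUM */
} Token;

/* The single global variable holding the current (lookahead) token. */
Token current_token;

/* Scan one token starting at *src, store it in current_token,
   and advance *src past it. Overlong lexemes are truncated and
   numeric overflow is not checked, to keep the sketch short. */
void next_token(const char **src)
{
    assert(src != NULL && *src != NULL);
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p)) p++;

    current_token.value = 0;
    if (*p == '\0') {
        current_token.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p)) {
        current_token.kind = TOK_ID;
        while (isalnum((unsigned char)*p)) {
            if (n < MAX_LEXEME) current_token.lexeme[n++] = *p;
            p++;
        }
    } else if (isdigit((unsigned char)*p)) {
        current_token.kind = TOK_NUM;
        while (isdigit((unsigned char)*p)) {
            if (n < MAX_LEXEME) current_token.lexeme[n++] = *p;
            current_token.value = current_token.value * 10 + (*p - '0');
            p++;
        }
    } else if (*p == '+') {
        current_token.kind = TOK_PLUS;
        current_token.lexeme[n++] = *p++;
    } else {
        current_token.kind = TOK_ERROR;
        current_token.lexeme[n++] = *p++;
    }
    current_token.lexeme[n] = '\0';
    *src = p;
}
```

A parser with single-symbol lookahead would call `next_token` once to prime `current_token`, then call it again each time it consumes a token.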

The Syntax Tree

The syntax tree is constructed as a standard pointer-based structure that is dynamically allocated as parsing proceeds.

The tree can be kept as a single variable pointing to the root node.

Each node is a record whose fields represent the information collected by the parser and the semantic analyzer. Sometimes these fields are themselves dynamically allocated.
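A pointer-based tree of this kind could look roughly like the sketch below. The node kinds and the names `TreeNode`, `syntax_tree`, `make_num`, `make_id`, `make_add`, and `free_tree` are hypothetical; a real compiler would have many more node kinds and fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { NODE_NUM, NODE_ID, NODE_ADD } NodeKind;

/* Each node is a record; its fields hold what the parser collected. */
typedef struct TreeNode {
    NodeKind kind;
    struct TreeNode *left, *right;  /* children; NULL for leaves */
    int value;                      /* NODE_NUM: the constant's value */
    char *name;                     /* NODE_ID: dynamically allocated copy */
} TreeNode;

/* The whole tree is reachable from this one root pointer. */
TreeNode *syntax_tree = NULL;

static TreeNode *new_node(NodeKind kind)
{
    TreeNode *n = calloc(1, sizeof *n);
    assert(n != NULL);  /* a real compiler would report out-of-memory */
    n->kind = kind;
    return n;
}

TreeNode *make_num(int value)
{
    TreeNode *n = new_node(NODE_NUM);
    n->value = value;
    return n;
}

TreeNode *make_id(const char *name)
{
    TreeNode *n = new_node(NODE_ID);
    size_t len = strlen(name) + 1;
    n->name = malloc(len);          /* the field itself is heap-allocated */
    assert(n->name != NULL);
    memcpy(n->name, name, len);
    return n;
}

TreeNode *make_add(TreeNode *left, TreeNode *right)
{
    TreeNode *n = new_node(NODE_ADD);
    n->left = left;
    n->right = right;
    return n;
}

/* Release a subtree, including dynamically allocated fields. */
void free_tree(TreeNode *n)
{
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    free(n->name);
    free(n);
}
```

With these helpers, the expression `x + 3` would be built as `syntax_tree = make_add(make_id("x"), make_num(3));`.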

The Symbol Table

This data structure keeps information associated with identifiers: functions, variables, constants, and data types.

The symbol table interacts with almost every phase of the compiler.

The insertion, deletion, and access operations need to be efficient.

A standard data structure for this purpose is the hash table.
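One way to get efficient insertion, deletion, and access is a hash table with separate chaining, sketched below. The bucket count, the hash function, and the names `Symbol`, `sym_insert`, `sym_lookup`, and `sym_delete` are illustrative assumptions, and the sketch ignores scopes, which a real symbol table would have to handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 211   /* a prime bucket count; the value is arbitrary */

typedef enum { SYM_VARIABLE, SYM_FUNCTION, SYM_CONSTANT, SYM_TYPE } SymbolKind;

typedef struct Symbol {
    char *name;
    SymbolKind kind;
    struct Symbol *next;  /* next entry in the same bucket */
} Symbol;

static Symbol *buckets[TABLE_SIZE];

static unsigned sym_hash(const char *s)
{
    unsigned h = 0;
    while (*s) h = h * 31u + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Access: return the entry for name, or NULL if absent. */
Symbol *sym_lookup(const char *name)
{
    for (Symbol *s = buckets[sym_hash(name)]; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0) return s;
    return NULL;
}

/* Insertion: add name if absent; return the (new or existing) entry. */
Symbol *sym_insert(const char *name, SymbolKind kind)
{
    Symbol *s = sym_lookup(name);
    if (s != NULL) return s;

    unsigned h = sym_hash(name);
    size_t len = strlen(name) + 1;
    s = malloc(sizeof *s);
    assert(s != NULL);
    s->name = malloc(len);
    assert(s->name != NULL);
    memcpy(s->name, name, len);
    s->kind = kind;
    s->next = buckets[h];  /* push onto the front of the chain */
    buckets[h] = s;
    return s;
}

/* Deletion: remove name; return 1 if it was present, 0 otherwise. */
int sym_delete(const char *name)
{
    Symbol **link = &buckets[sym_hash(name)];
    while (*link != NULL) {
        Symbol *s = *link;
        if (strcmp(s->name, name) == 0) {
            *link = s->next;
            free(s->name);
            free(s);
            return 1;
        }
        link = &s->next;
    }
    return 0;
}
```

Each operation touches only one bucket, so its cost depends on the chain length rather than on the total number of identifiers.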

The Literal Table

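The literal table stores constants and strings used in the program.

Quick insertion and lookup are essential.

The table need not allow deletions, since its data applies to the whole program and each constant or string appears in it only once.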
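Because nothing is ever deleted, a literal table can use simple open addressing with no bookkeeping for removed slots. The sketch below assumes a fixed capacity and stores literals by their source text; the names `literals`, `literal_count`, and `lit_intern` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define LIT_CAPACITY 256  /* fixed capacity, assumed to keep the sketch short */

static const char *literals[LIT_CAPACITY];  /* NULL marks an empty slot */
static int literal_count = 0;

static unsigned lit_hash(const char *s)
{
    unsigned h = 5381u;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h;
}

/* Return the slot index of text, inserting it first if it is absent.
   Returns -1 if the table is full. Only the pointer is stored, so the
   caller must keep the string alive. */
int lit_intern(const char *text)
{
    assert(text != NULL);
    unsigned i = lit_hash(text) % LIT_CAPACITY;
    for (int probes = 0; probes < LIT_CAPACITY; probes++) {
        if (literals[i] == NULL) {          /* empty slot: insert here */
            literals[i] = text;
            literal_count++;
            return (int)i;
        }
        if (strcmp(literals[i], text) == 0) /* already present */
            return (int)i;
        i = (i + 1) % LIT_CAPACITY;         /* linear probing */
    }
    return -1;
}
```

Later phases can refer to a literal by the returned index, and repeated occurrences of the same constant or string share one entry.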

Intermediate Code

Depending on the kind of intermediate code, it may be kept as an array of text strings, a temporary text file, or a linked list of structures.
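As an illustration of the linked-list option, the sketch below keeps intermediate code as a list of records with an operator, two operands, and a result. This particular instruction shape and the names `Quad`, `CodeList`, `emit`, `code_length`, and `free_code` are assumptions chosen for the example, not something the slides prescribe.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { OP_ADD, OP_MUL, OP_COPY } OpCode;

/* One intermediate-code instruction: result = arg1 op arg2. */
typedef struct Quad {
    OpCode op;
    const char *arg1, *arg2, *result;  /* operand names, not owned here */
    struct Quad *next;
} Quad;

/* The code sequence, with a tail pointer so appending is O(1). */
typedef struct {
    Quad *head, *tail;
} CodeList;

/* Append one instruction to the end of the list. */
void emit(CodeList *code, OpCode op,
          const char *arg1, const char *arg2, const char *result)
{
    Quad *q = malloc(sizeof *q);
    assert(q != NULL);  /* a real compiler would report out-of-memory */
    q->op = op;
    q->arg1 = arg1;
    q->arg2 = arg2;
    q->result = result;
    q->next = NULL;
    if (code->tail != NULL) code->tail->next = q;
    else code->head = q;
    code->tail = q;
}

/* Count the instructions in the list. */
int code_length(const CodeList *code)
{
    int n = 0;
    for (const Quad *q = code->head; q != NULL; q = q->next) n++;
    return n;
}

/* Free every instruction and reset the list to empty. */
void free_code(CodeList *code)
{
    Quad *q = code->head;
    while (q != NULL) {
        Quad *next = q->next;
        free(q);
        q = next;
    }
    code->head = code->tail = NULL;
}
```

For `a + b * c`, a code generator might call `emit(&code, OP_MUL, "b", "c", "t1")` followed by `emit(&code, OP_ADD, "a", "t1", "t2")`. A linked list makes it cheap to insert or remove instructions later, which an array of text strings does not.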

Temporary Files

Historically, computers did not possess enough memory for an entire program to be kept in memory during compilation.

This was solved by using temporary files to hold the products of intermediate steps.

Memory constraints are now a much smaller problem.

Occasionally, compilers still generate intermediate files during some of the steps.

Other Issues in Compiler Structure

Passes, Language Definition, Error Handling

Passes

A compiler often processes the entire source program several times before generating code. These repetitions are referred to as passes.

Passes may or may not correspond to phases.

Depending on the language, a compiler may be one pass. This gives efficient compilation, but less efficient target code. Pascal and C are examples of languages that permit one-pass compilation.

Most compilers with optimizations use more than one pass:

1. Scanning and parsing
2. Semantic analysis and source-level optimization
3. Code generation and target code optimization

Language Definition

The description of the lexical, syntactic, and semantic structure of a programming language is collected in a language reference manual, or language definition.

With a new language, a language definition and compiler are often developed together.

A more common situation is when a compiler is written for a well-known language that has an existing language definition.

Error Handling

Error handling is one of the most important functions of a compiler.

Errors can be detected during almost every phase of compilation.

Errors reported by a compiler are static (or compile-time) errors.

It is important to generate meaningful error messages.

The error handler contains different operations, each appropriate to a specific compiler phase and situation.
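A very small sketch of phase-aware error reporting is shown below. The phase names, the message format, and the identifiers `Phase`, `error_count`, and `report_error` are assumptions for illustration; real error handlers also attempt recovery, which this sketch does not.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { PHASE_SCAN, PHASE_PARSE, PHASE_SEMANTIC } Phase;

/* Number of static (compile-time) errors reported so far. */
static int error_count = 0;

/* Map the detecting phase to the kind of error it reports. */
static const char *phase_name(Phase phase)
{
    switch (phase) {
    case PHASE_SCAN:     return "lexical";
    case PHASE_PARSE:    return "syntax";
    case PHASE_SEMANTIC: return "semantic";
    }
    return "unknown";
}

/* Format an error message into buf, tagging it with the source line
   and the phase that detected it. Returns what snprintf returns:
   the length the full message needs, excluding the terminator. */
int report_error(char *buf, size_t size, Phase phase, int line,
                 const char *msg)
{
    assert(buf != NULL && msg != NULL);
    error_count++;
    return snprintf(buf, size, "line %d: %s error: %s",
                    line, phase_name(phase), msg);
}
```

Including the line number and the kind of error is one simple way to make a message more meaningful to the programmer; a compiler would typically refuse to generate code once `error_count` is nonzero.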