CSE450 Translation of Programming Languages
Lecture 11: Semantic Analysis: Types & Type Checking

Structure of a Compiler

Front End: Source Language → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer (Today!) → Int. Code Generator → Intermediate Code

Back End: Intermediate Code → Code Optimizer → Target Code Generator → Target Language

Project 1 - ✔
Project 2 - ✔
Project 3 - ✔

Importance of Semantic Analysis

Parsing cannot catch all possible errors. Parsing assumes that we are working with a context-free grammar.

Example language constructs that require context:
  Have variables been declared?
  Is a variable available in the current scope?
  Are the operands of an expression valid types?
  Is an assignment using legal types?
  Are the arguments to a function of the correct type?

Types

Why do we need to worry about type checking?

Consider the Tube-IC fragment:

add s12 s20 s34

What types are s12, s20, and s34?

They can be anything! Likewise, processors treat registers generically. This makes their operations flexible and reusable, but not type safe.

Types and Operations

Legal operations can vary depending on the type of a value.
  It typically does not make sense to add a function pointer to an integer in C++.
  It does make sense to add integers.

Both of these operations can potentially have the same implementation in assembly. As far as the processor is concerned, an integer and a pointer look the same.
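The claim that the hardware sees only bits can be illustrated outside of Tubular. The following is a minimal Python sketch (not part of the course projects) that uses the standard `struct` module to read one 32-bit pattern back as two different types; the variable names are chosen only for this illustration.

```python
import struct

# Pack the float 1.0 into four bytes (little-endian IEEE 754 single precision).
raw = struct.pack("<f", 1.0)

# Read the very same four bytes back as a float and as an unsigned integer.
as_float = struct.unpack("<f", raw)[0]
as_int = struct.unpack("<I", raw)[0]

print(as_float)     # 1.0
print(hex(as_int))  # 0x3f800000
```

The bytes never change; only the type chosen for interpreting them does, which is exactly the information a type system has to track.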

Type Systems

A language's type system specifies which types are available, and what operations can be used on those types.

The goal of type checking is to ensure that only "sensible" operations are allowed to be performed.

Type information also makes it possible to perform different operations depending on the types involved.

Three basic kinds of Type Checking

There are three basic kinds of type checking systems:

Statically Typed
  Almost all type checking happens at compile time
  Each variable is limited to a single type
  Language examples include C/C++, Java, Tubular

Dynamically Typed
  Almost all type checking occurs at runtime
  Variables can typically contain any type of value
  Most scripting languages do this (JavaScript, Python, Ruby, Scheme, etc.)

Untyped
  No checking is done, such as in assembly

Static vs. Dynamic Typing

Statically Typed
  Many errors can be caught at compile time
  Optimizations can be easier to perform
  Runtime environment can be faster, since type decisions have already been made

Dynamically Typed
  Less restrictive, easier to express operations, faster development
  Programs can be more modular, extensible, and adaptive
  More runtime machinery required, can be slower during execution

In practice, most languages use some statically typed and some dynamically typed elements.

Languages provide escape mechanisms (casting) to allow static elements to be used as needed.
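To make the trade-off concrete, here is a small sketch in Python, one of the dynamically typed languages listed above. The function and variable names are invented for the illustration. The type error is detected only when the offending operation actually executes, whereas a statically typed language such as Tubular would reject the equivalent program at compile time.

```python
def add_three(value):
    # No type is declared for `value`; the check happens when `+` executes.
    return value + 3

x = 1
first = add_three(x)  # works: x currently holds an int

x = "a"               # the same variable may now hold a string
try:
    add_three(x)
    failed_at_runtime = False
except TypeError:
    # str + int is rejected only when the operation is attempted
    failed_at_runtime = True

print(first, failed_at_runtime)
```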

Types in Tubular

We will have two basic types in Project 4:
  val - floating point quantity (already implemented)
  char - a single ASCII char

And one meta type will be added for Project 5:
  array - a consecutive grouping of a basic type
  array(char) can also be referred to as string

Tubular Type Checking

For Tubular, we will be using static typing, which makes the runtime environment simpler to implement.

Four basic scenarios where types will need to be checked, plus a fifth that arrives with functions:
  Variable Assignments: type of RHS must match variable
  Mathematical Operations: type must be val for + - * / % && || and !
  Comparison Operators: types must both be val or both be char
  Generic commands, like print: any type accepted
  Function calls (coming in Project 7): arguments must match
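The rules for operators and assignments can be written down as two small checking functions. This is only a sketch in Python under stated assumptions: the function names, the use of plain strings for types, and the choice that a comparison yields a `val` are all assumptions made for illustration, not the required project design. The unary `!` operator is left out for brevity.

```python
# Operator groups taken from the rules above.
MATH_OPS = {"+", "-", "*", "/", "%", "&&", "||"}
COMPARE_OPS = {"==", "!=", "<", "<=", ">", ">="}

def check_binary_op(op, left_type, right_type):
    """Return the result type of `left op right`, or raise TypeError."""
    if op in MATH_OPS:
        if left_type == "val" and right_type == "val":
            return "val"
        raise TypeError(f"'{op}' needs val operands, got {left_type} and {right_type}")
    if op in COMPARE_OPS:
        if left_type == right_type and left_type in ("val", "char"):
            return "val"  # assumption: a comparison produces a val result
        raise TypeError(f"'{op}' cannot compare {left_type} with {right_type}")
    raise ValueError(f"unknown operator '{op}'")

def check_assignment(var_type, expr_type):
    """Return the variable's type if the right-hand side matches it."""
    if var_type != expr_type:
        raise TypeError(f"cannot assign {expr_type} to a {var_type} variable")
    return var_type
```

For example, `check_binary_op("+", "val", "val")` returns `"val"`, while `check_binary_op("+", "char", "char")` raises `TypeError`.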

Variable Assignments

assignment: var_any '=' expression

val x;
char y;

x = 1;     # ✔ tree: =(x, 1)
y = 'b';   # ✔ tree: =(y, 'b')

x = 'a';   # ✖ tree: =(x, 'a')
y = 2;     # ✖ tree: =(y, 2)

Mathematical Operations

expr: expr '+' expr

val x = 1;
char y;

x = x + 2;       # ✔ tree: +(x, 2)
y = 'c';         # ✔

x = y + 3;       # ✖ tree: +(y, 3), y is a char
y = x;           # ✖ a val cannot be assigned to a char
y = 'a' + 'b';   # ✖ tree: +('a', 'b'), both operands are chars

Functions and Commands

The print command can take anything. Type information is used to determine what operation to perform.
  If the type of the argument is val, use out_val
  If the type of the argument is char, use out_char
  Starting with Project 5, if the type of the argument is an array, print out each element of that array with the internal type

Other commands and functions may have particular type requirements, depending on argument position.
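Per-position type requirements could be recorded in a small signature table that the checker consults. The Python sketch below is hypothetical: the table layout and the name `check_command` are assumptions, and the only entry shown, `random` taking a single `val`, is the one command requirement stated in this lecture.

```python
# Hypothetical signature table: one required type per argument position.
COMMAND_SIGNATURES = {
    "random": ("val",),
}

def check_command(name, arg_types):
    """Raise TypeError if the argument types do not fit the command."""
    if name == "print":
        return  # print accepts arguments of any type
    expected = COMMAND_SIGNATURES[name]
    if len(arg_types) != len(expected):
        raise TypeError(f"{name} expects {len(expected)} argument(s)")
    for position, (want, got) in enumerate(zip(expected, arg_types), start=1):
        if want != got:
            raise TypeError(f"{name}: argument {position} must be {want}, got {got}")
```

Keeping the requirements in data rather than in code means a new command only needs a new table entry.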

Type Checking
Char versus Val

The ‘char’ Type

Like ‘val’, variables can be declared with type ‘char’.

Char literals are single characters between single quotes.

The symbol table must keep track of type to ensure that no illegal operations take place.

val x = 0;
char y = 'a';

Escape Characters

The 4 escape characters are preceded by a backslash.

No other escape characters should be implemented.

char a = '\n';
char b = '\t';
char c = '\'';
char d = '\\';
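One way to recognize char literals with exactly these four escapes is a single regular expression. The sketch below uses Python's standard `re` module; the names `CHAR_LITERAL`, `ESCAPES`, and `char_value` are made up for this example, and your lexer tool may call for a different formulation.

```python
import re

# A quote, then either an allowed escape or one ordinary character, then a quote.
# Ordinary characters exclude backslash, single quote, and newline.
CHAR_LITERAL = re.compile(r"'(?:\\[nt'\\]|[^\\'\n])'")

# Maps the character after the backslash to the character it denotes.
ESCAPES = {"n": "\n", "t": "\t", "'": "'", "\\": "\\"}

def char_value(lexeme):
    """Return the single character denoted by a char literal lexeme."""
    if CHAR_LITERAL.fullmatch(lexeme) is None:
        raise ValueError(f"not a valid char literal: {lexeme}")
    body = lexeme[1:-1]  # strip the surrounding quotes
    if body.startswith("\\"):
        return ESCAPES[body[1]]
    return body
```

Any other backslash sequence, such as `'\q'`, fails to match and is rejected, which satisfies the rule that no other escape characters are implemented.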

Special Note - # is a normal character

The comment character ‘#’ is allowed between single quotes and doesn’t denote a comment.

char a = '#';

Type Checking - Assignment

You cannot assign a value of one type to a variable of another type.

With static type checking, we know the type of every variable at compile time and can ensure correctness.

char a = 'x';
val b;
b = a; # ERROR

Type Checking - Relational Operators

You can compare (==, !=, >, >=, <, <=) two values of the same type.

But you cannot compare two different types.

char a = 'x';
char b = 'y';
b > a;
val c = 0;
a != c; # ERROR

Type Checking - Mathematical Operators

The char type cannot be used by math operators (+, +=, -, -=, *, *=, /, /=), nor boolean operators (&&, ||, !).

char a = 'x';
char b = 'y';
a + b; # ERROR
a && b; # ERROR

Type Checking - Boolean Evaluation

The char type cannot be used where a boolean result is needed (conditions for if and while statements).

char a = 'x';
if (a) { # ERROR
  a = 'b';
}

Type Checking - Type Specific Commands

The random command only takes the type ‘val’; giving it anything else is an error.

The print command happily takes type ‘char’ as an argument.

char a = 'x';
random(a); # ERROR
print(a);

Hold up the colors that are legal.

#1
val x = 1;
x = 'a';

#2
char x = 'a';
char y = 'b';
char z = x + y;

#3
char x = 'a';
char y = 'b';
x != y;

#4
char x = 'a';
if (x == 'b') {
  x = 'b';
}

‘Char’ Implementation - val_copy

Tube Intermediate Code handles ‘char’s just like ‘val’s.

Escape characters are written in TubeIC the same way as in the original Tubular source.

char a = '\n';
becomes
val_copy '\n' s1
val_copy s1 s2

‘Char’ Implementation - other ops

The other TubeIC operators behave with char like val.

'a' > 'b';
becomes
val_copy 'a' s1
val_copy 'b' s2
test_gtr s1 s2 s3

‘Char’ Implementation - out_char

You’ve already been using the one char-specific TubeIC instruction.

print(1, 'a')
becomes
val_copy 1 s1
val_copy 'a' s2
out_val s1
out_char s2
out_char '\n'
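The translation of print can be sketched as a function that picks the output instruction from each argument's type. This Python sketch is an assumption-laden illustration: the name `gen_print`, the `(text, type)` pair representation, and numbering registers from `s1` are invented here, and a real compiler would take fresh registers from its own allocator.

```python
# Output instruction for each basic type, as described for the print command.
OUT_INSTRUCTION = {"val": "out_val", "char": "out_char"}

def gen_print(args):
    """Translate print(...) into TubeIC lines.

    `args` is a list of (literal_text, type_name) pairs.
    """
    lines = []
    registers = []
    # First copy every argument into its own scalar register.
    for index, (text, type_name) in enumerate(args, start=1):
        register = f"s{index}"
        lines.append(f"val_copy {text} {register}")
        registers.append((register, type_name))
    # Then emit the output instruction that matches each argument's type.
    for register, type_name in registers:
        lines.append(f"{OUT_INSTRUCTION[type_name]} {register}")
    # print ends the line with a newline character.
    lines.append("out_char '\\n'")
    return lines

for line in gen_print([("1", "val"), ("'a'", "char")]):
    print(line)
```

Run on the arguments of `print(1, 'a')`, this prints the same five instructions shown on the slide.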

How to keep track of TYPE

Every variable (temporary or named) needs to know its type.

You can use the symbol table to store this information.

For this class, there will only be a finite number of types (val, char, and a few others introduced in future projects).

Implementing ‘char’ type

1. Make the lexer include escape characters

2. Make the parser allow type ‘char’ in variable declarations

3. Make the symbol table store type of every variable used

4. Make the abstract syntax tree include a node for literal char values

5. For each node in the AST, make sure that the types of its children are legal or raise an error if not. This can be done at the creation of the node.
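Step 5 can be sketched with AST node classes whose constructors perform the check. The Python classes below are hypothetical: the class names, the `TypeCheckError` exception, and the use of a plain dict from variable name to type as the symbol table are all assumptions for illustration, not the required project structure.

```python
class TypeCheckError(Exception):
    """Raised when a node is built from children with illegal types."""

class Literal:
    def __init__(self, text, type_name):
        self.text = text
        self.type = type_name      # "val" or "char"

class Variable:
    def __init__(self, name, symbols):
        # `symbols` maps variable names to their declared types.
        if name not in symbols:
            raise TypeCheckError(f"unknown variable '{name}'")
        self.name = name
        self.type = symbols[name]

class Assign:
    def __init__(self, target, expr):
        # The check runs when the node is created, as step 5 suggests.
        if target.type != expr.type:
            raise TypeCheckError(
                f"cannot assign {expr.type} to {target.type} variable '{target.name}'"
            )
        self.target = target
        self.expr = expr
        self.type = target.type

symbols = {"x": "val", "y": "char"}
ok = Assign(Variable("y", symbols), Literal("'b'", "char"))
```

Because every node stores its own `type`, a parent node only needs to inspect its direct children, never the whole subtree.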

Scope Refresher
Symbol Tables and Decrementing Scope

Implementing Scoping

Scoping can be implemented right within your symbol table(s).

When a variable is declared:
  Check that it has not been previously defined within this scope (a declaration of the same name in a lower scope is allowed)
  Add it to the table, recording its name, type, etc., along with the scope in which it was created.

When leaving a scope, simply deactivate symbols that are no longer accessible. They can’t be used again in the source program. (But you will need to reference them when outputting your intermediate code!)

Stack of SymbolTables

Given:
val a = 123;
val b = 44;
if (a == 123) {
  char a = 'x';
  print(a);
}
print(a);

SymbolTable[0]: val a, val b
SymbolTable[1]: char a
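A stack of tables like the one pictured can be sketched in a few lines of Python. Everything here is an assumption made for illustration (the class name, method names, the `archive` list, and storing only the type per symbol); it is one possible design, not the required one.

```python
class SymbolTable:
    def __init__(self):
        self.scopes = [{}]   # scopes[0] is the outermost scope
        self.archive = []    # deactivated symbols, kept for code output

    def push_scope(self):
        self.scopes.append({})

    def pop_scope(self):
        # Deactivate the symbols of the scope being left, but keep a record.
        closed = self.scopes.pop()
        self.archive.extend(closed.items())

    def declare(self, name, type_name):
        current = self.scopes[-1]
        if name in current:
            raise NameError(f"'{name}' is already declared in this scope")
        current[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"'{name}' is not declared")

# Mirror the example program above.
table = SymbolTable()
table.declare("a", "val")
table.declare("b", "val")
table.push_scope()              # entering the if block
table.declare("a", "char")      # shadows the outer a
inner = table.lookup("a")
table.pop_scope()               # leaving the if block
outer = table.lookup("a")
print(inner, outer)
```

Inside the block the lookup finds the char `a`; after the block it finds the val `a` again, so the two print commands in the example would choose different output instructions.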