Compiler Construction Semantic Analysis II Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University


Administration

TA1 is up: LR parsing. Submission deadline 20/12/2009

PA3 is up. Submission deadline 16/12/2009

IC compiler

[Figure: compiler pipeline. IC program (ic) → Lexical Analysis → Syntax Analysis (Parsing) → AST, symbol table, etc. → Intermediate Representation (IR) → Code Generation → x86 executable (exe)]

We saw: scope, symbol tables

Today: type checking, recap

Semantic analysis motivation

Syntax analysis is not enough

int a; a = "hello";     // assigning wrong type

int a; b = 1;           // assigning undeclared variable

int a; int a; a = 1;    // variable redeclaration

Symbol table

An environment that stores information about identifiers

A data structure that captures scope information

Symbol     Kind     Type          Properties
value      field    int           …
test       method   -> int        private
setValue   method   int -> void   public

Examples of type errors

int a; a = true;
    assigned type doesn't match declared type

void foo(int x) { int x; foo(5,7); }
    argument list doesn't match formal parameters

1 < true
    relational operator applied to non-int type

class A {…}
class B extends A { void foo() { A a; B b; b = a; } }
    the type of a (A) is not a subtype of the type of b (B)

Types

Type: a set of values computed during program execution
    boolean = {true, false}
    int = {-2^31 .. 2^31-1}
    void = {}

Type safety: the use of types adheres to formally defined typing rules

Type judgments

e : T
Formal notation for type judgments: e is a well-typed expression of type T
    2 : int
    2 * (3 + 4) : int
    true : bool
    "Hello" : string

Type judgments

E ⊢ e : T
Formal notation for type judgments: in the context E, e is a well-typed expression of type T
    b:bool, x:int ⊢ b : bool
    x:int ⊢ 1 + x < 4 : bool
    foo:int->string, x:int ⊢ foo(x) : string

Type context: a set of type bindings id : T (symbol table)

Typing rules

Premise
----------  [Name]
Conclusion

Axioms (rules with no premise):

----------  [Name]
Conclusion

Typing rules for expressions

Axioms (AST leaves):

E ⊢ true : bool        E ⊢ false : bool

E ⊢ int-literal : int        E ⊢ string-literal : string

E ⊢ null : null        E ⊢ new T() : T

Rule with premises:

E ⊢ e1 : int    E ⊢ e2 : int
-----------------------------  [+]
E ⊢ e1+e2 : int

Some IC expression rules 1

E ⊢ true : bool        E ⊢ false : bool

E ⊢ int-literal : int        E ⊢ string-literal : string

E ⊢ e1 : int    E ⊢ e2 : int
-----------------------------  op ∈ { +, -, /, *, % }
E ⊢ e1 op e2 : int

E ⊢ e1 : int    E ⊢ e2 : int
-----------------------------  rop ∈ { <=, <, >, >= }
E ⊢ e1 rop e2 : bool

E ⊢ e1 : T    E ⊢ e2 : T
-------------------------  rop ∈ { ==, != }
E ⊢ e1 rop e2 : bool

Some IC expression rules 2

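E ⊢ e1 : bool    E ⊢ e2 : bool
-------------------------------  lop ∈ { &&, || }
E ⊢ e1 lop e2 : bool

E ⊢ e1 : int
----------------
E ⊢ - e1 : int

E ⊢ e1 : bool
-----------------
E ⊢ ! e1 : bool

E ⊢ e1 : T[]
----------------------
E ⊢ e1.length : int

E ⊢ e1 : T[]    E ⊢ e2 : int
-----------------------------
E ⊢ e1[e2] : T

E ⊢ e1 : int
------------------------
E ⊢ new T[e1] : T[]

E ⊢ new T() : T

E ⊢ e : C    (id : T) ∈ C
--------------------------
E ⊢ e.id : T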

Type-checking algorithm

1. Construct types
   1. Add basic types to type table
   2. Traverse AST looking for user-defined types (classes, methods, arrays) and store in table
   3. Bind all symbols to types

2. Traverse AST bottom-up (using visitor)
   1. For each AST node find corresponding rule (there is only one for each kind of node)
   2. Check if rule holds
      1. Yes: assign type to node according to consequent
      2. No: report error

Algorithm example

45 > 32 && !false

[Figure: AST of the expression, annotated bottom-up with types]

BinopExpr op=AND : bool
    BinopExpr op=GT : bool
        intLiteral val=45 : int
        intLiteral val=32 : int
    UnopExpr op=NEG : bool
        boolLiteral val=false : bool

Rules used:

E ⊢ false : bool        E ⊢ int-literal : int

E ⊢ e1 : int    E ⊢ e2 : int
-----------------------------  rop ∈ { <=, <, >, >= }
E ⊢ e1 rop e2 : bool

E ⊢ e1 : bool    E ⊢ e2 : bool
-------------------------------  lop ∈ { &&, || }
E ⊢ e1 lop e2 : bool

E ⊢ e1 : bool
----------------
E ⊢ !e1 : bool
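The bottom-up traversal can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the course's implementation: the AST classes (`Expr`, `IntLit`, `BoolLit`, `Unop`, `Binop`), the `BasicType` enum and `MiniTypeChecker` are hypothetical names, only `int` and `bool` are modeled, and `==`/`!=` are left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical mini-AST for illustration; not the course's AST classes.
enum BasicType { INT, BOOL }

abstract class Expr {}

class IntLit extends Expr {
    final int val;
    IntLit(int val) { this.val = val; }
}

class BoolLit extends Expr {
    final boolean val;
    BoolLit(boolean val) { this.val = val; }
}

class Unop extends Expr {
    final String op;
    final Expr operand;
    Unop(String op, Expr operand) { this.op = op; this.operand = operand; }
}

class Binop extends Expr {
    final String op;
    final Expr left, right;
    Binop(String op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
}

class MiniTypeChecker {
    private static final List<String> ARITH = Arrays.asList("+", "-", "*", "/", "%");
    private static final List<String> REL = Arrays.asList("<", "<=", ">", ">=");
    private static final List<String> LOGIC = Arrays.asList("&&", "||");

    // Types the children first (bottom-up), then applies the rule matching
    // the node kind; throws when the rule's premises do not hold.
    static BasicType typeOf(Expr e) {
        if (e instanceof IntLit) return BasicType.INT;    // E |- int-literal : int
        if (e instanceof BoolLit) return BasicType.BOOL;  // E |- true/false : bool
        if (e instanceof Unop) {
            Unop u = (Unop) e;
            BasicType t = typeOf(u.operand);
            if (u.op.equals("!") && t == BasicType.BOOL) return BasicType.BOOL;
            if (u.op.equals("-") && t == BasicType.INT) return BasicType.INT;
            throw new IllegalStateException("type error at unary " + u.op);
        }
        if (e instanceof Binop) {
            Binop b = (Binop) e;
            BasicType l = typeOf(b.left);
            BasicType r = typeOf(b.right);
            boolean ints = l == BasicType.INT && r == BasicType.INT;
            boolean bools = l == BasicType.BOOL && r == BasicType.BOOL;
            if (ARITH.contains(b.op) && ints) return BasicType.INT;
            if (REL.contains(b.op) && ints) return BasicType.BOOL;
            if (LOGIC.contains(b.op) && bools) return BasicType.BOOL;
            throw new IllegalStateException("type error at binary " + b.op);
        }
        throw new IllegalArgumentException("unknown node kind");
    }
}
```

Building the tree for `45 > 32 && !false` and calling `MiniTypeChecker.typeOf` on its root yields `BasicType.BOOL`, while an ill-typed tree such as `1 < true` raises an exception at the offending node.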

Statement rules

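Statements have type void

Judgments of the form E ⊢ S: in environment E, S is well-typed

E ⊢ e : bool    E ⊢ S
----------------------
E ⊢ while (e) S

E ⊢ e : bool    E ⊢ S
----------------------
E ⊢ if (e) S

E ⊢ e : bool    E ⊢ S1    E ⊢ S2
---------------------------------
E ⊢ if (e) S1 else S2

E ⊢ break        E ⊢ continue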

Checking return statements

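Special entry {ret : Tr} represents the return value
    Add it to the symbol table when entering a method
    Look up the entry when a return statement is reached

ret : void ∈ E
----------------
E ⊢ return;

ret : T' ∈ E    E ⊢ e : T    T ≤ T'
------------------------------------  (T ≤ T': T is a subtype of T')
E ⊢ return e;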
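The two return rules could be checked along the following lines. This is only a sketch: `ReturnCheckSketch`, the string encoding of types, and passing the subtype relation in as a predicate are assumptions made for the example, not part of the course code.

```java
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical helper; types are encoded as plain strings for brevity.
class ReturnCheckSketch {
    static final String RET = "ret";  // the special entry added on method entry

    // methodScope: the method's symbol table, containing the "ret" entry.
    // exprType:    null for "return;", otherwise the type T of e in "return e;".
    // subtype:     the relation T <= T', supplied by the caller.
    static boolean checkReturn(Map<String, String> methodScope,
                               String exprType,
                               BiPredicate<String, String> subtype) {
        String declared = methodScope.get(RET);
        if (declared == null) {
            throw new IllegalStateException("return statement outside a method");
        }
        if (exprType == null) {
            return declared.equals("void");            // rule for "return;"
        }
        return !declared.equals("void")
                && subtype.test(exprType, declared);   // rule for "return e;"
    }
}
```

Keeping the subtype relation as a parameter means the same check works whether the relation is plain equality or the inheritance-based one.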

Subtyping

Inheritance induces a subtyping relation

Type hierarchy is a tree

Subtyping rules:

A extends B {…}
----------------
A ≤ B

A ≤ A

A ≤ B    B ≤ C
---------------
A ≤ C

null ≤ A

Subtyping does not extend to array types: even if A is a subtype of B, A[] is not a subtype of B[]
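One way to encode these rules is a `subtypeOf` method on each type object. The sketch below is illustrative only: `SType`, `SNull`, `SClass` and `SArray` are hypothetical names, and it assumes one unique object per type so that reference equality can stand in for type equality.

```java
// Hypothetical type classes; one object per distinct type is assumed.
abstract class SType {
    // Default: A <= A only (reflexivity via reference equality).
    boolean subtypeOf(SType other) { return this == other; }
}

class SNull extends SType {
    // null <= A for every class type A (and null <= null).
    @Override
    boolean subtypeOf(SType other) { return other == this || other instanceof SClass; }
}

class SClass extends SType {
    final String name;
    final SClass superType;  // null when the class extends nothing

    SClass(String name, SClass superType) {
        this.name = name;
        this.superType = superType;
    }

    // Walking the superclass chain covers reflexivity, "A extends B"
    // and transitivity in one loop.
    @Override
    boolean subtypeOf(SType other) {
        for (SClass c = this; c != null; c = c.superType) {
            if (c == other) return true;
        }
        return false;
    }
}

class SArray extends SType {
    final SType elemType;
    SArray(SType elemType) { this.elemType = elemType; }
    // Inherits the default: arrays are invariant, so B[] <= A[] never
    // follows from B <= A.
}
```

With `A`, `B extends A` and `C extends B`, `C.subtypeOf(A)` holds through the chain, while an array of `B` is not a subtype of an array of `A`.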

Type checking with subtyping

S ≤ T: S may be used whenever T is expected

An expression e of type S also has type T

E ⊢ e : S    S ≤ T
-------------------
E ⊢ e : T

IC rules with subtyping

E ⊢ e1 : T1    E ⊢ e2 : T2    T1 ≤ T2 or T2 ≤ T1
--------------------------------------------------  op ∈ { ==, != }
E ⊢ e1 op e2 : bool

Semantic analysis flow

Parsing and AST construction
    Combine library AST with IC program AST

Construct and initialize global type table

Phase 1: Symbol table construction
    Construct class hierarchy and check that hierarchy is a tree
    Construct remaining symbol table hierarchy
    Assign enclosing scope for each AST node

Phase 2: Scope checking
    Resolve names
    Check scope rules using symbol table

Phase 3: Type checking
    Assign type for each AST node

Phase 4: Remaining semantic checks
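The hierarchy check in Phase 1 can be approximated by following each class's superclass chain and rejecting undefined superclasses and cycles. The sketch below makes that concrete under assumptions: `HierarchyCheckSketch` and the name-to-superclass map are hypothetical, and a well-formed result here means "every chain ends at a class with no superclass", which is how the tree property is interpreted for single inheritance.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper operating on class names only.
class HierarchyCheckSketch {
    // superOf maps every declared class to its superclass name,
    // or to null when the class extends nothing.
    // Returns an error message, or null when the hierarchy is well-formed.
    static String check(Map<String, String> superOf) {
        for (String start : superOf.keySet()) {
            Set<String> seen = new HashSet<>();
            String cur = start;
            while (cur != null) {
                if (!superOf.containsKey(cur)) {
                    return "undefined superclass " + cur;
                }
                if (!seen.add(cur)) {
                    return "cyclic inheritance involving " + cur;
                }
                cur = superOf.get(cur);
            }
        }
        return null;
    }
}
```

For the example program (`A`, `B extends A`, `C`) the check reports no error; a pair of classes extending each other is reported as cyclic.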

Class hierarchy for types

abstract class Type {...}

class IntType extends Type {...}

class BoolType extends Type {...}

class ArrayType extends Type {
    Type elemType;
}

class MethodType extends Type {
    Type[] paramTypes;
    Type returnType;
    ...
}

class ClassType extends Type {
    ICClass classAST;
    ...
}
...

Type comparison

Use a unique object for each distinct type
    Resolve each type expression to the same object
    Use reference equality for comparison (==)

Type table implementation

import java.util.HashMap;
import java.util.Map;

class TypeTable {
    // Maps element types to array types; static because arrayType() is static
    private static Map<Type,ArrayType> uniqueArrayTypes = new HashMap<Type,ArrayType>();
    private static Map<String,ClassType> uniqueClassTypes = new HashMap<String,ClassType>();

    public static Type boolType = new BoolType();
    public static Type intType = new IntType();
    ...

    // Returns unique array type object
    public static ArrayType arrayType(Type elemType) {
        if (uniqueArrayTypes.containsKey(elemType)) {
            // array type object already created: return it
            return uniqueArrayTypes.get(elemType);
        } else {
            // object doesn't exist: create it, record it, and return it
            ArrayType arrt = new ArrayType(elemType);
            uniqueArrayTypes.put(elemType, arrt);
            return arrt;
        }
    }
    ...
}
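To see why unique type objects make `==` a valid type comparison, here is a small self-contained variant of the same idea. It is a sketch under assumptions: the names (`TType`, `TInt`, `TBool`, `TArray`, `UniqueTypeTable`) are invented for the example, and it uses `Map.computeIfAbsent` in place of the explicit `containsKey`/`put` pair.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the Type classes.
abstract class TType {}
class TInt extends TType {}
class TBool extends TType {}
class TArray extends TType {
    final TType elemType;
    TArray(TType elemType) { this.elemType = elemType; }
}

class UniqueTypeTable {
    static final TType intType = new TInt();
    static final TType boolType = new TBool();

    // Keys use Object's identity-based equals/hashCode, which is exactly
    // what is wanted when every type is a unique object.
    private static final Map<TType, TArray> uniqueArrayTypes = new HashMap<>();

    // Returns the one array type object for the given element type,
    // creating and recording it on first request.
    static TArray arrayType(TType elemType) {
        return uniqueArrayTypes.computeIfAbsent(elemType, TArray::new);
    }
}
```

Two requests for `int[]` return the same object, so `UniqueTypeTable.arrayType(intType) == UniqueTypeTable.arrayType(intType)` is true, and the same holds for nested array types such as `int[][]`.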

Recap


Semantic analysis flow example

class A {
    int x;
    int f(int x) {
        boolean y;
        ...
    }
}

class B extends A {
    boolean y;
    int t;
}

class C {
    A o;
    int z;
}

Parsing and AST construction

parser.parse() builds the AST of the example program. The type table is populated with user-defined types during parsing (or in a special AST pass).

TypeTable: IntType, BoolType, A, B, C, f : int->int, …

[Figure: AST of the example program]

ASTProgram file = …
    classes[0]: ICClass name = A
        fields[0]: Field name = x, type = IntType
        methods[0]: Method name = f
            parameters[0]: Param name = x, type = IntType
            body: Declaration varName = y, initExpr = null, type = BoolType
    classes[1]: ICClass name = B, super = A
    …
    classes[2]: ICClass name = C

Defined types and type table

class TypeTable {
    public static Type boolType = new BoolType();
    public static Type intType = new IntType();
    ...
    public static ArrayType arrayType(Type elemType) {…}
    // parameter named superName because super is a reserved word in Java
    public static ClassType classType(String name, String superName, ICClass ast) {…}
    public static MethodType methodType(String name, Type retType, Type[] paramTypes) {…}
}

abstract class Type {
    String name;
    boolean subtypeof(Type t) {...}
}
class IntType extends Type {...}
class BoolType extends Type {...}
class ArrayType extends Type {
    Type elemType;
}
class MethodType extends Type {
    Type[] paramTypes;
    Type returnType;
}
class ClassType extends Type {
    ICClass classAST;
}

TypeTable: IntType, BoolType, A, B, C, f : int->int, …

Assigning types by declarations

All type bindings are available at parsing time

[Figure: the AST of the example program, with the type field of each Field, Param and Declaration node pointing at an entry of the TypeTable]

TypeTable:
    IntType
    BoolType
    ...
    ClassType name = A
    ClassType name = B (super: A)
    ClassType name = C
    MethodType name = f, retType, paramTypes

Symbol tables

[Figure: the AST of the example program alongside its symbol tables]

Global symtab
    A       CLASS
    B       CLASS
    C       CLASS

A symtab
    x       FIELD      IntType
    f       METHOD     int->int

B symtab
    t       FIELD      IntType
    y       FIELD      BoolType

C symtab
    o       CLASS      A
    z       FIELD      IntType

f symtab
    x       PARAM      IntType
    y       VAR        BoolType
    this    VAR        A
    ret     RET_VAR    IntType

abstract class SymbolTable {
    private SymbolTable parent;
}
class ClassSymbolTable extends SymbolTable {
    Map<String,Symbol> methodEntries;
    Map<String,Symbol> fieldEntries;
}
class MethodSymbolTable extends SymbolTable {
    Map<String,Symbol> variableEntries;
}

abstract class Symbol {
    String name;
}
class VarSymbol extends Symbol {…}
class LocalVarSymbol extends Symbol {…}
class ParamSymbol extends Symbol {…}
...

Scope nesting in IC

Scopes nest from outermost to innermost. Each scope has its own table with the columns Symbol, Kind, Type, Properties:

Global    names of all classes
Class     fields and methods
Method    formals + locals
Block     variables defined in block

class GlobalSymbolTable extends SymbolTable {}
class ClassSymbolTable extends SymbolTable {}
class MethodSymbolTable extends SymbolTable {}
class BlockSymbolTable extends SymbolTable {}
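Name lookup through nested scopes can be sketched as a chain of tables linked by a parent pointer. The class below is a simplified illustration, not the course's SymbolTable: `Scope` is a hypothetical name, entries map an identifier straight to a type string, and shadowing of outer names is allowed here although IC's scope rules may forbid some cases.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical single-class model of nested symbol tables.
class Scope {
    private final Scope parent;  // enclosing scope; null for the global scope
    private final Map<String, String> entries = new LinkedHashMap<>();

    Scope(Scope parent) { this.parent = parent; }

    // Adds id to this scope; returns false on an illegal re-definition
    // within the same scope.
    boolean define(String id, String type) {
        if (entries.containsKey(id)) return false;
        entries.put(id, type);
        return true;
    }

    // Searches this scope first, then each enclosing scope in turn;
    // returns null when the id is undefined.
    String lookup(String id) {
        for (Scope s = this; s != null; s = s.parent) {
            String type = s.entries.get(id);
            if (type != null) return type;
        }
        return null;
    }
}
```

For the example program, a lookup of `x` from the scope of method `f` finds the formal parameter before the field of class `A`, because the method scope is searched first.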

Symbol tables

[Figure: the AST of the example program and its symbol tables (Global, A, B, C and f symtabs), plus a Location node with name = x, type = ?]

this belongs to method scope

ret can be used later for type-checking return statements

Sym. tables phase 1 : construction

[Figure: the AST of the example program and its symbol tables; each AST node has an enclosingScope link to its table, and the symbol of the Location node (name = x, type = ?) is still unknown]

class TableBuildingVisitor implements Visitor {
    ...
}

Build tables, link each AST node to enclosing table

abstract class ASTNode {
    SymbolTable enclosingScope;
}

Sym. tables phase 1 : construct

[Figure: the AST of the example program and its symbol tables; the symbol of the Location node (name = x, type = ?) is still unknown]

class TableBuildingVisitor implements Visitor {
    ...
}

During this phase, add symbols from definitions, not from uses (e.g., an assignment to variable x adds nothing)

Sym. tables phase 2 : resolve

[Figure: the AST of the example program and its symbol tables; the Location node (name = x, type = ?) is resolved through its enclosingScope to a symbol]

Resolve each id to a symbol, e.g., for x=5 in f, x is the formal parameter of f

Check scope rules: illegal symbol re-definitions, illegal shadowing, illegal use of undefined symbols, ...

class SymResolvingVisitor implements Visitor {
    ...
}

Type-check AST

[Figure: the AST of the example program and the TypeTable (IntType, BoolType, ...); the Location node now has name = x, type = IntType]

class TypeCheckingVisitor implements Visitor {
    ...
}

Use type rules to infer types for all AST expression nodes

Check type rules for statements

Miscellaneous semantic checks

[Figure: the AST of the example program; the Location node has name = x, type = IntType]

class SemanticChecker {
    ...
}

Perform the remaining semantic checks: single main method, break/continue inside loops, etc.
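As an example of such a remaining check, break/continue placement can be verified with a traversal that tracks loop nesting depth. This is a sketch with invented names (`Stmt`, `WhileStmt`, `BlockStmt`, `BreakStmt`, `ContinueStmt`, `OtherStmt`, `LoopContextChecker`); if statements and other constructs are omitted to keep it short.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical statement nodes for illustration.
abstract class Stmt {}
class BreakStmt extends Stmt {}
class ContinueStmt extends Stmt {}
class OtherStmt extends Stmt {}

class WhileStmt extends Stmt {
    final Stmt body;
    WhileStmt(Stmt body) { this.body = body; }
}

class BlockStmt extends Stmt {
    final List<Stmt> stmts;
    BlockStmt(Stmt... stmts) { this.stmts = Arrays.asList(stmts); }
}

class LoopContextChecker {
    // Counts break/continue statements that are not nested inside any loop.
    static int countErrors(Stmt s) {
        return count(s, 0);
    }

    private static int count(Stmt s, int loopDepth) {
        if (s instanceof BreakStmt || s instanceof ContinueStmt) {
            return loopDepth > 0 ? 0 : 1;
        }
        if (s instanceof WhileStmt) {
            return count(((WhileStmt) s).body, loopDepth + 1);
        }
        if (s instanceof BlockStmt) {
            int errors = 0;
            for (Stmt child : ((BlockStmt) s).stmts) {
                errors += count(child, loopDepth);
            }
            return errors;
        }
        return 0;  // other statements cannot violate this rule
    }
}
```

A method body containing a `break` inside a `while` followed by a stray `continue` after the loop would produce exactly one error, for the `continue`.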