The Compiler So Far
• Lexical analysis
  – Detects inputs with illegal tokens
• Parsing
  – Detects inputs with ill-formed parse trees
• Semantic analysis (contextual analysis)
  – Catches all remaining errors
Why a Separate Semantic Analysis?
• Parsing cannot catch all errors
• Some language constructs are not context-free
  – Example: all used variables must have been declared (i.e., be in scope)
    ex: { int x; { .. { .. x .. } .. } .. }
  – Example: a method must be invoked with arguments of the proper types (i.e., typing)
    ex: int f(int, int) {…} called by f('a', 2.3, 1)
More problems require semantic analysis
1. Is x a scalar, an array, or a function?
2. Is x declared before it is used/defined?
3. Is x defined before it is used?
4. Are any names declared but not used?
5. Which declaration of x does this reference?
6. Is an expression type-consistent?
7. Does the dimension of a reference match the declaration?
8. Where can x be stored? (heap, stack, . . . )
9. Does *p reference the result of a malloc()?
10. Is an array reference in bounds?
Why is semantic analysis hard?
• Needs non-local information
  – a[x] = y + z; // type consistent?
• Answers depend on values, not on syntax
  – int a[10]; a[10] = 1;
• Answers may involve computation
  – a[10 + BASE] = 6;
How can we answer these questions?
1. Use context-sensitive grammars (CSG)
2. Use attribute grammars (AG)
  – augment the context-free grammar with rules
  – calculate attributes for grammar symbols
3. Our approach:
  – Build the AST
  – Construct visitors (TreeWalker) that traverse the AST and collect information about names in symbol tables
  – Write checking visitors that detect possible semantic errors
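The visitor-based approach can be sketched on a toy AST. The following is only an illustration under stated assumptions: the node classes `Decl`, `Use`, and `Block`, and the `DeclaredBeforeUseChecker` visitor, are hypothetical and far simpler than the MiniJava syntax tree. The checker keeps one set of names per open block and reports any use that has no visible declaration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical miniature AST: a block contains declarations, uses, and nested blocks.
interface Node { void accept(Visitor v); }

final class Decl implements Node {
    final String name;
    Decl(String name) { this.name = name; }
    public void accept(Visitor v) { v.visit(this); }
}

final class Use implements Node {
    final String name;
    Use(String name) { this.name = name; }
    public void accept(Visitor v) { v.visit(this); }
}

final class Block implements Node {
    final List<Node> body;
    Block(Node... body) { this.body = List.of(body); }
    public void accept(Visitor v) { v.visit(this); }
}

interface Visitor {
    void visit(Decl d);
    void visit(Use u);
    void visit(Block b);
}

// Checking visitor: reports every use of a name that has no visible declaration.
final class DeclaredBeforeUseChecker implements Visitor {
    private final List<Set<String>> scopes = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    // Assumes declarations only occur inside a Block, so a scope is always open.
    public void visit(Decl d) { scopes.get(scopes.size() - 1).add(d.name); }

    public void visit(Use u) {
        for (Set<String> scope : scopes) {
            if (scope.contains(u.name)) return;   // declared in some enclosing scope
        }
        errors.add("undeclared identifier: " + u.name);
    }

    public void visit(Block b) {
        scopes.add(new HashSet<>());              // enter scope
        for (Node n : b.body) n.accept(this);
        scopes.remove(scopes.size() - 1);         // leave scope
    }
}
```

For a program shaped like `{ int x; { { x } } y }`, built as `new Block(new Decl("x"), new Block(new Block(new Use("x"))), new Use("y"))`, the checker accepts the nested use of `x` and records one error for `y`.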
Symbol Tables
• Symbol tables (environments)
  – Map IDs to information about the IDs, such as types, locations, etc.
  – Definition -> insert into the table
  – Use -> look up the ID
• Scope
  – Where the IDs are “visible”
  – Ex: formal parameters and local variables in MiniJava
    -> inside the method where they are defined
  – (private) variables in a class
    -> inside the class
  – (public) methods: visible anywhere (unless overridden)
Symbol Table
• What kind of IDs should be entered?
  – class or structure names
  – variable names
  – defined constants
  – procedure and function names
  – literal constants and strings
  – source text labels
• Separate table for structure layouts (types) (field offsets and lengths)
What kind of information should be included?
• textual name
• data type
• dimension information (for aggregates)
• declaring procedure or class
• lexical level of declaration
• storage class (base address in stack; heap, global)
• size and offset in storage
• record (pointer to) structure table
• parameter by-reference or by-value?
• for functions: number and types of arguments
• …
Attributes of symbol table
• Attributes are properties of an ID given in its declaration
• The symbol table associates names with attributes
• Names may have different attributes depending on their meaning:
  – variables: type, procedure level, frame offset
  – types: type descriptor, data size/alignment
  – constants: type, value
  – procedures: formals (names/types), result type, block information (local decls.), frame size
  – classes: name, parent class, fields, methods, …
Symbol Table construction strategy
• Number of tables constructed:
  – one global symbol table, with scope-level information included in the entries
  – one symbol table per scope, with links to the parent scope
• Lifetime of a symbol table:
  – persistent once created -> multi-pass compiler
  – created on entering a scope and destroyed on leaving it -> one-pass compiler
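The "one table per scope plus a link to the parent scope" strategy can be sketched as follows. This is a minimal illustration, not the course implementation: the class name `Scope` and the use of plain strings for types are assumptions made to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-scope table: each scope keeps its own bindings and a link to its parent.
final class Scope {
    private final Scope parent;                               // null for the global scope
    private final Map<String, String> bindings = new HashMap<>();

    Scope(Scope parent) { this.parent = parent; }

    // Returns false if the name is already declared in this same scope.
    boolean declare(String name, String type) {
        return bindings.putIfAbsent(name, type) == null;
    }

    // Searches this scope first, then the enclosing scopes.
    String lookup(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            String type = s.bindings.get(name);
            if (type != null) return type;
        }
        return null;                                          // undeclared
    }
}
```

Because every `Scope` object stays alive after the traversal that built it, this layout suits the persistent, multi-pass case: a later pass can revisit a scope and still look names up through its parent chain.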
Environments
• A set of bindings (name -> attributes associations)

Initial environment: Env0

class C {
    int a; int b; int c;
    // Env1 = Env0 + {a -> int, b -> int, c -> int}
    public void m() {
        System.out.println(a + c);
        int j = a + b;
        // Env2 = Env1 + {j -> int}
        String a = "hello";
        // Env3 = Env2 + {a -> String}
        System.out.println(a);
        System.out.println(a);
        System.out.println(a);
    }
    // back to Env1
}
// back to Env0
Implementing Environments
• Functional style (non-destructive)
  – enter scope -> keep the previous environment and create a new one
  – leave scope -> discard (or save) the new one and go back to the old one
  – implementation: hash table or tree
• Imperative style (using a hash table)
  – enter scope -> mark a new scope
  – definition encountered -> put a new key-value pair
  – leave scope -> pop all key-value pairs added since the scope was entered
  – needs only one global table
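The functional style can be illustrated with an immutable linked environment. This is a sketch under assumptions: the class `Env`, its `extend` and `lookup` methods, and string-valued types are all hypothetical. Extending an environment never changes the old one, so leaving a scope just means going back to the earlier reference.

```java
// Hypothetical immutable environment: extend() returns a new environment
// and leaves the receiver untouched.
final class Env {
    static final Env EMPTY = new Env(null, null, null);

    private final String name;
    private final String type;
    private final Env rest;

    private Env(String name, String type, Env rest) {
        this.name = name;
        this.type = type;
        this.rest = rest;
    }

    Env extend(String name, String type) { return new Env(name, type, this); }

    // The newest binding wins, so an inner declaration shadows an outer one.
    String lookup(String id) {
        for (Env e = this; e != EMPTY; e = e.rest) {
            if (e.name.equals(id)) return e.type;
        }
        return null;
    }
}
```

With `env1 = Env.EMPTY.extend("a", "int")` and `env3 = env1.extend("j", "int").extend("a", "String")`, looking up `a` gives `String` in `env3` and still gives `int` in `env1`. Lookup in this list-based version is linear in the number of bindings; a hash table or tree, as the slide suggests, would be faster.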
Multiple Symbol Tables: Java-style
package M;
class E {
    static int a = 5;
}
class N {
    static int b = 10;
    static int a = E.a + b;
}
class D {
    static int d = E.a + N.a;
}

Initial environment: Env0
Env1 = {a -> int}
Env2 = {E -> Env1}
Env3 = {b -> int, a -> int}
Env4 = {N -> Env3}
Env5 = {d -> int}
Env6 = {D -> Env5}
Env7 = Env2 + Env4 + Env6 = {E -> Env1, N -> Env3, D -> Env5}
Implementation – Functional Symbol Table
• Efficient Functional Approach’a
would return [a
• If implemented with a hash table, O(n) buckets would have to be created for each scope
• Is this a good idea?
Implementation – Imperative Symbol Table (inefficient nondestructive update)
[Figure: a table with entries a, b, c, and d, annotated with the operations Update (clone & put) and Undo.]
See Appel Program 5.2 (p. 106)
Implementation - Tree
dog 3
bat 1
dog 3
camel 2
emu 42
m1
m2
m1 = { bat |-> 1 , camel |-> 2, dog |-> 3 }
m2 = {m1 + emu |-> 42 }
How could this be implemented?
Want m2 from m1 in O(n)
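One possible answer to the question above is a persistent binary search tree, where an insertion copies only the nodes on the search path and shares every other subtree with the old tree. The sketch below assumes a hypothetical `Tree` class with `String` keys and `int` values, and it does no rebalancing, so the cost of an insertion is proportional to the depth of the tree.

```java
// Hypothetical persistent (immutable) binary search tree from String keys to int values.
final class Tree {
    final String key;
    final int value;
    final Tree left, right;

    Tree(String key, int value, Tree left, Tree right) {
        this.key = key;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Returns a new tree containing the binding; the tree t itself is never modified.
    static Tree insert(Tree t, String key, int value) {
        if (t == null) return new Tree(key, value, null, null);
        int c = key.compareTo(t.key);
        if (c < 0) return new Tree(t.key, t.value, insert(t.left, key, value), t.right);
        if (c > 0) return new Tree(t.key, t.value, t.left, insert(t.right, key, value));
        return new Tree(key, value, t.left, t.right);   // same key: replace the binding
    }

    // Returns the bound value, or null if the key is absent.
    static Integer lookup(Tree t, String key) {
        while (t != null) {
            int c = key.compareTo(t.key);
            if (c == 0) return t.value;
            t = (c < 0) ? t.left : t.right;
        }
        return null;
    }
}
```

Building m1 by inserting dog, bat, and camel in that order, and then m2 = `Tree.insert(m1, "emu", 42)`, creates one new root node and one new emu node; the subtree holding bat and camel is shared between m1 and m2, and m1 remains usable on its own.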
Symbols vs. Strings as table keys
• Problem with strings as table keys:
  – comparing long names is time-consuming
  – why not compare addresses, if equal strings are identical objects?
• Symbol:
  – a wrapper for interned Strings
• Symbol representation
  – Comparing symbols for equality is fast.
  – Extracting an integer hash key is fast.
  – Comparing two symbols for “greater-than” is fast. (monotonic ?)
• Properties:
  – For Symbol s1, s2: s1 == s2 iff s1.equals(s2) iff s1.name == s2.name
• public class Symbol {
      public String toString();
      public static Symbol getSymbol(String n);
  }
symbol.Symbol
import java.util.Hashtable;
import java.util.Map;

public class Symbol {
    public String name;

    // A Symbol cannot be constructed directly; use getSymbol() instead.
    private Symbol(String n) { name = n; }

    public String toString() { return name; }

    // One Symbol per distinct (interned) string.
    private static Map<String, Symbol> map = new Hashtable<>();

    public static Symbol getSymbol(String n) {
        String u = n.intern();
        Symbol s = map.get(u);
        if (s == null) {
            s = new Symbol(u);
            map.put(u, s);
        }
        return s;
    }
}
Symbol Table Implementation (efficient destructive update)
Using a Hash Table
[Figure: a hash table whose buckets for a, b, and c hold chains of Binder entries ending in null; labels: top: Symbol, marker: Binder.]
Some sample program (I)
import java.util.Hashtable;

/**
 * The Table class is similar to java.util.Dictionary,
 * except that each key must be a Symbol and there is
 * a scope mechanism.
 */
public class Table {
    private Hashtable dict = new Hashtable();
    private Symbol top;     // most recently bound symbol in the current scope
    private Binder marks;   // stack of scope markers

    public Table() {}
Some sample program (II)
    /**
     * Gets the object associated with the specified
     * symbol in the Table.
     */
    public Object get(Symbol key) {
        Binder e = (Binder) dict.get(key);
        if (e == null) return null;
        else return e.value;
    }

    /**
     * Puts the specified value into the Table,
     * bound to the specified Symbol.
     */
    public void put(Symbol key, Object value) {
        dict.put(key, new Binder(value, top, (Binder) dict.get(key)));
        top = key;
    }
Some sample program (III)
    /**
     * Remembers the current state of the Table.
     */
    public void beginScope() {
        marks = new Binder(null, top, marks);
        top = null;
    }

    /**
     * Restores the table to what it was at the most
     * recent beginScope that has not already been ended.
     */
    public void endScope() {
        while (top != null) {
            Binder e = (Binder) dict.get(top);
            if (e.tail != null) dict.put(top, e.tail);
            else dict.remove(top);
            top = e.prevtop;
        }
        top = marks.prevtop;
        marks = marks.tail;
    }
}
Some sample program(IV)
package Symbol;
class Binder {
Object value;
Symbol prevtop;
Binder tail;
Binder(Object v, Symbol p, Binder t) {
value=v; prevtop=p; tail=t;
}
}
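The scope mechanism of the sample program can be exercised with a short demo. So that the example compiles on its own, it restates `Symbol`, `Binder`, and `Table` in compact form; the use of generics, `HashMap`, and `computeIfAbsent`, as well as the `ScopeDemo` class, are assumptions of this sketch rather than part of the original code.

```java
import java.util.HashMap;
import java.util.Map;

// Compact restatement of Symbol, Binder, and Table so the demo is self-contained.
final class Symbol {
    private static final Map<String, Symbol> map = new HashMap<>();
    final String name;
    private Symbol(String n) { name = n; }
    static Symbol getSymbol(String n) { return map.computeIfAbsent(n, Symbol::new); }
    public String toString() { return name; }
}

final class Binder {
    final Object value;
    final Symbol prevtop;   // symbol bound just before this one in the same scope
    final Binder tail;      // older binding of the same symbol (or older marker)
    Binder(Object v, Symbol p, Binder t) { value = v; prevtop = p; tail = t; }
}

final class Table {
    private final Map<Symbol, Binder> dict = new HashMap<>();
    private Symbol top;
    private Binder marks;

    Object get(Symbol key) {
        Binder e = dict.get(key);
        return e == null ? null : e.value;
    }

    void put(Symbol key, Object value) {
        dict.put(key, new Binder(value, top, dict.get(key)));
        top = key;
    }

    void beginScope() { marks = new Binder(null, top, marks); top = null; }

    void endScope() {
        while (top != null) {                 // undo every binding made in this scope
            Binder e = dict.get(top);
            if (e.tail != null) dict.put(top, e.tail); else dict.remove(top);
            top = e.prevtop;
        }
        top = marks.prevtop;
        marks = marks.tail;
    }
}

final class ScopeDemo {
    public static void main(String[] args) {
        Symbol a = Symbol.getSymbol("a");
        Symbol j = Symbol.getSymbol("j");
        Table t = new Table();
        t.put(a, "int");                      // field a
        t.beginScope();                       // enter the method body
        t.put(j, "int");
        t.put(a, "String");                   // local a shadows the field
        System.out.println(t.get(a));         // prints String
        t.endScope();                         // leave the method body
        System.out.println(t.get(a));         // prints int
        System.out.println(t.get(j));         // prints null
    }
}
```

The `prevtop` links thread together all symbols bound in the current scope, which is what lets `endScope` pop exactly those bindings without scanning the whole table.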
Type-Checking in MiniJava
• Bindings for type-checking in MiniJava
  – Variable and formal parameter
    • Var name <-> type of variable
  – Method
    • Method name <-> result type, parameters (including position information), local variables
  – Class
    • Class name <-> variables, method declarations, parent class
Symbol Table: example
See Figure 5.7 (next slide)
• Primitive types
  – int -> IntegerType()
  – boolean -> BooleanType()
• Other types
  – int[] -> IntArrayType()
  – class -> IdentifierType(String s)
A MiniJava Program and its symbol table (Figure 5.7)
class B {
    C f; int[] j; int q;
    public int start(int p, int q) {
        int ret; int a;
        /* … */
        return ret;
    }
    public boolean stop(int p) {
        /* … */
        return false;
    }
}
class C { /* … */ }

B
  FIELDS
    f    C
    j    int[]
    q    int
  METHODS
    start    int
      PARAMS
        p    int
        q    int
      LOCALS
        ret  int
        a    int
    stop     boolean
      PARAMS
        p    int
      LOCALS
        ….
C
Main Symbol tables in MiniJava
• Table class hierarchy:
  • NameTypeTable
  • GlobalTable
  • ClassTable
  • MethodTable
• Containment hierarchy:
  GlobalTable
    ClassTables
      MethodTables
        NameTypeTables (locals + params)
      NameTypeTables (for fields)
NameTypeTable
• String name;                 // name of var, class, method, etc.
• Type type;                   // type of var, local, method, field, class, etc.
• NameTypeTable parent;        // parent table
• Map<Symbol, Object> map;     // locals + formals + fields + attr …
• getName()                    // name of the class/method/var associated with this table
• getType()                    // type of the class/method/var associated with this table
• getFromHierarchy(Symbol)
• boolean put(Symbol, Object);
• boolean contains(Symbol);
• Object get(Symbol);
• dump()
Additional methods
• getVarType(String id) // in MethodTable
  – finds the type of variable id as seen from the current method
  – Precedence:
    – Locals in the method
    – Formal parameters in the parameter list
    – Fields in the containing class
    – Variables in the parent class
• getMethod(String) // in ClassTable
  – The method may be defined in a parent class
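The lookup precedence of getVarType can be sketched as below. The classes `ClassInfo` and `MethodInfo` are hypothetical, simplified stand-ins for ClassTable and MethodTable, and types are represented as plain strings; the real tables in the course code may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ClassTable: fields plus a link to the parent class.
final class ClassInfo {
    final ClassInfo parent;                               // null if there is no superclass
    final Map<String, String> fields = new HashMap<>();
    ClassInfo(ClassInfo parent) { this.parent = parent; }
}

// Hypothetical stand-in for MethodTable.
final class MethodInfo {
    final ClassInfo owner;
    final Map<String, String> params = new HashMap<>();
    final Map<String, String> locals = new HashMap<>();
    MethodInfo(ClassInfo owner) { this.owner = owner; }

    // Lookup order: locals, formal parameters, fields of the class, fields of ancestors.
    String getVarType(String id) {
        if (locals.containsKey(id)) return locals.get(id);
        if (params.containsKey(id)) return params.get(id);
        for (ClassInfo c = owner; c != null; c = c.parent) {
            String t = c.fields.get(id);
            if (t != null) return t;
        }
        return null;                                      // not declared anywhere
    }
}
```

In the class B of Figure 5.7, the name q inside start resolves to the formal parameter rather than to the field, because parameters are consulted before fields.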
Type-Checking: Two Phases
• Build the symbol table
• Type-check statements and expressions

public class Main {
    public static void main(String[] args) {
        try {
            Program prog = new MiniJavaParser(System.in).Program();
            MiniJavaSymbolTableBuilder v1 = new MiniJavaSymbolTableBuilder();
            v1.visit(prog);                                            // phase 1
            new MiniJavaTypeCheckVisitor(v1.getTable()).visit(prog);   // phase 2
        } catch (ParseException e) {
            System.out.println(e.toString());
        }
    }
}
Build Symbol Table
public class MiniJavaSymbolTableBuilder extends DepthFirstVisitor {
    private GlobalTable gtable = new …;
    private ClassTable cTable;
    private MethodTable mTable;
    boolean inMethod;
    String id;
    Type cType;
    // Type t;
    // Identifier i;
Build Symbol Table (Cont’d)
public void visit(VarDecl n) {
    super.visit(n);   // this will set id and cType
    if (inMethod) {
        if (!mTable.addLocal(id, cType)) {
            err.printf("Duplicate locals/parameters defined: %s %s !!", id, cType);
        }
    } else {
        if (!cTable.addField(id, cType)) {
            err.printf("Duplicate fields defined: %s %s !!", id, cType);
        }
    }
}
visit() methods related to Symbol Table Building: DepthFirstVisitor()
• These need to be overridden from the parent DepthFirstVisitor class.
    public Type visit(MainClass n);
    public Type visit(ClassDeclSimple n);
    public Type visit(ClassDeclExtends n);
    public Type visit(VarDecl n);
    public Type visit(MethodDecl n);
    public Type visit(Formal n);
    public Type visit(IntArrayType n);
    public Type visit(BooleanType n);
    public Type visit(IntegerType n);
    public Type visit(IdentifierType n);
MiniJavaTypeCheckVisitor(SymbolTable);
package visitor;
import syntaxtree.*;

// statement & class member visitor
public class MiniJavaTypeCheckVisitor extends DepthFirstVisitor {
    private ClassTable cTable;
    private MethodTable mTable;
    private GlobalTable gTable;
    Type cType;
    String id;
    boolean inMethod;
    // ev is an expression visitor that returns the type of each expression it scans.
    private ExpTypeEvaluator ev = new ExpTypeEvaluator();
MiniJavaTypeCheckVisitor(SymbolTable); - Cont’d
// i = e;
// Identifier i; Exp e;
public void visit(Assign n) {
    Type type1 = ev.visit(n.i);
    Type type2 = (Type) ev.visit(n.e);
    if (!type1.equals(type2)) {
        err.printf("[%s:%s] Expression [%s] of type [%s] could not be assigned to [%s] of type [%s]!\n",
                   n.bl, n.bc, n.e, type2, n.i, type1);
    }
}
ExpTypeEvaluator: an inner class of MiniJavaTypeCheckVisitor
// so it can share all fields declared in the containing class, MiniJavaTypeCheckVisitor
public class ExpTypeEvaluator extends DepthFirstVisitorR { …
    // Exp e1, e2;
    public Type visit(Plus n) {
        if (!(visit(n.e1) == IntegerType.TYPE)) {
            err.printf("Left side of Plus must be of type integer");
        }
        if (!(visit(n.e2) instanceof IntegerType)) {
            err.printf("Right side of Plus must be of type integer");
        }
        return IntegerType.TYPE;
    }
visit(IdentifierType)
/**
 * 1. make sure that n has been defined in
 *    the global table
 * 2. set cType = n;
 */
public void visit(IdentifierType n) {
    if (!gTable.contains(n.s)) {
        err.printf("%s:%s: The type: %s was not defined!!", n.bl, n.bc, n.s);
    }
    cType = n;
}
visit(Program)
// MainClass m;
// List<ClassDecl> cl;
public void visit(Program n) {
visit(n.m);
visitList(n.cl);
}
visit(While)
// Exp e;
// Statement s;
public void visit(While n) {
    Type tpe = (Type) ev.visit(n.e);
    if (!BooleanType.TYPE.equals(tpe)) {
        err.printf("%s:%s: the conditional: %s in While statement is not a boolean expression!\n",
                   n.e.bl, n.e.bc, n.e);
    }
    visit(n.s);
}
MiniJavaTypeCheckVisitor extends DepthFirstVisitor
public void visit(Program n); visit(MainClass n);
visit(ClassDeclSimple n);
visit(ClassDeclExtends n);
visit(MethodDecl n);
visit(Formal n); visit(VarDecl n);
visit(If n); visit(While n);
visit(Print n);
visit(Assign n); visit(ArrayAssign n);
visit(Identifier n); visit(IdentifierType n);
ExpTypeEvaluator extends DepthFirstVisitorR
• Note: each method must return a result type.
    public Type visit(And n);             // boolean
    public Type visit(LessThan n);        // boolean
    public Type visit(Plus n);            // int
    public Type visit(Minus n);
    public Type visit(Times n);
    public Type visit(ArrayLookup n);     // int
    public Type visit(ArrayLength n);     // int
    public Type visit(Call n);            // result type
    public Type visit(IntegerLiteral n);  // int
    public Type visit(True n);            // boolean
    public Type visit(False n);
    public Type visit(IdentifierExp n);   // symbol table lookup
    public Type visit(This n);            // current class
    public Type visit(NewArray n);        // int[]
    public Type visit(NewObject n);       // IdentifierType(n.id)
    public Type visit(Not n);             // boolean
Overloading of Operators, ….
• When operators are overloaded, the compiler must explicitly generate the code for the type conversion.
  – 2 + 2
  – 2.0 + 3.4
  – 2.4 + 4
  – "abc" + 4
  – needs built-in system functions such as int2float, float2int, int2str, etc.
• For an assignment statement, both sides have the same type. When class extension is allowed, the right-hand side may instead be a subtype of the left-hand side.
  – long x = (int) y + 3
  – Person p = new Student();
Error Handling
• For a type error or an undeclared identifier, it should print an error message.
• And must go on…..• Recovery from type errors?
– Do as if it were correct.– Not a big deal in our homework.
• Example:– int i = new C();– int j = i + 1;– still need to insert i into symbol table as an integer so
the rest can be typechecked..
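The recovery strategy in the example can be sketched as follows. `RecoveringChecker` and its two methods are hypothetical and use strings for types; the point is only that a mismatch is reported once and the declared type is still recorded, so later uses do not trigger follow-on errors.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical checker for declarations of the form "type name = init".
final class RecoveringChecker {
    final Map<String, String> table = new HashMap<>();
    final List<String> errors = new ArrayList<>();

    // Reports a mismatch, but still binds the name to its declared type and continues.
    void declare(String declaredType, String name, String initType) {
        if (!declaredType.equals(initType)) {
            errors.add("cannot assign " + initType + " to " + name + " of type " + declaredType);
        }
        table.put(name, declaredType);
    }

    // Type of "name + 1": the operand must be int. Returns int either way so checking goes on.
    String plusOne(String name) {
        if (!"int".equals(table.get(name))) {
            errors.add("operand " + name + " of + must be int");
        }
        return "int";
    }
}
```

Checking `int i = new C();` as `declare("int", "i", "C")` records one error; checking `int j = i + 1;` as `declare("int", "j", plusOne("i"))` then succeeds, because i was entered as an int despite the earlier error.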
Type Checking for MiniJava (I)
package syntaxtree;
Program(MainClass m, List<ClassDecl> cl)
// recursively type check m and cl
MainClass(Identifier i1, Identifier i2, Statement s)
// type check s
----------------------------
abstract class ClassDecl
ClassDeclSimple(Identifier i, List<VarDecl> vl, List<MethodDecl> ml)
// recursively type check vl and ml
ClassDeclExtends(Identifier i, Identifier j, List<VarDecl> vl, List<MethodDecl> ml)
// like above, but must also ensure that j (the parent class) is a declared class
-----------------------------
VarDecl(Type t, Identifier i)
Formal(Type t, Identifier i)
//2. recursively type check that t is a built-in or declared type
//will be checked in visit(IdentifierType t)
MethodDecl(Type t, Identifier i, List<Formal> fl, List<VarDecl> vl, List<Statement> sl, Exp e)
// same as (2). recursively type check fl, vl, sl and e.
// 3. type(e) == t
Type Checking for MiniJava (II)
abstract class Type
IntArrayType()
BooleanType()
IntegerType()
// do nothing
IdentifierType(String s)
// check that gTable.contains(s),
// i.e., s must be a class name.
---------------------------
abstract class Statement
Block(List<Statement> sl)
// recursively type check sl
// i.e., call visitList(sl);
// appear in DepthFirstVisitorR
If(Exp e, Statement s1, Statement s2)
//4. type(e) == boolean
// recursively type check s1 and s2
While(Exp e, Statement s)
//4 + recursively check s.
Print(Exp e)
//5. type(e) == int
Assign(Identifier i, Exp e)
// 7. check type(i) == type(e)
// i[e1] = e2 ;
ArrayAssign(Identifier i,Exp e1,Exp e2)
// 8. type(i) == int[] &&
//    type(e1) == type(e2) == int
Type checking for MiniJava (III)
abstract class Exp
// Arithmetic expressions
Plus(Exp e1, Exp e2), Minus(Exp e1, Exp e2), Times(Exp e1, Exp e2)
// 10. type(e1) == type(e2) == int
// 11. type(rlt) = int
ArrayLookup(Exp e1, Exp e2) // e1[e2]
// type(e1) == int[] && type(e2) == int
// type(rlt) = int
ArrayLength(Exp e) // e.length
// type(e) == int[]; type(rlt) = int
IntegerLiteral(int i) // 23
// type(rlt) = int
LessThan(Exp e1, Exp e2) // e1 < e2
// type(e1) == type(e2) == int
// type(rlt) = boolean
True() False()
// type(rlt) = boolean
And(Exp e1, Exp e2) // e1 && e2
// type(e1) == type(e2) == boolean
// type(rlt) = boolean
Not(Exp e) // !e
// type(e) == boolean
// return BooleanType.TYPE
IdentifierExp(String s)
// s (field, formal, or local) was declared and is in scope
// type(rlt) = the type part of the declaration that s is bound to
This()
// type(rlt) = current class type
NewArray(Exp e)
// type(e) == int
// type(rlt) = int[]
NewObject(Identifier i)
// i is a class name in scope
// type(rlt) = IdentifierType(name of i)
--------------------------------------------
Identifier(String s) // i = e; same as IdentifierExp
--------------------------------------------
Call(Exp e, Identifier i, List<Exp> el)
// ex: Set s = …; s.getMembers(20, 100)
//     Call(s, "getMembers", [20, 100])
//     getMembers : (int x int) -> Set, declared in class Set
// c1. type(e) == IdentifierType(c) for some class c
// c2. there is a method m in c named i with formal parameters of types fl
// c3. fl and el have the same size k >= 0
// c4. for j = 1 .. k: type(el(j)) == fl(j)
// type(rlt) = returnType(m)
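Rules c1 to c4 for Call can be sketched as a single check. The classes `Signature` and `CallChecker` are hypothetical, types are plain strings, and two simplifications are assumed: methods inherited from a parent class are not searched, and argument types must equal the formal types exactly (no subtyping).

```java
import java.util.List;
import java.util.Map;

// Hypothetical method signature: formal parameter types plus the result type.
final class Signature {
    final List<String> formals;
    final String result;
    Signature(List<String> formals, String result) {
        this.formals = formals;
        this.result = result;
    }
}

final class CallChecker {
    // classes: class name -> (method name -> signature)
    private final Map<String, Map<String, Signature>> classes;

    CallChecker(Map<String, Map<String, Signature>> classes) { this.classes = classes; }

    // Returns the result type of receiver.method(args), or null if any of c1..c4 fails.
    String check(String receiverType, String method, List<String> argTypes) {
        Map<String, Signature> methods = classes.get(receiverType);
        if (methods == null) return null;                         // c1: receiver is not a class
        Signature m = methods.get(method);
        if (m == null) return null;                               // c2: no such method
        if (m.formals.size() != argTypes.size()) return null;     // c3: arity mismatch
        for (int j = 0; j < argTypes.size(); j++) {
            if (!m.formals.get(j).equals(argTypes.get(j))) return null;   // c4: type mismatch
        }
        return m.result;                                          // type(rlt) = returnType(m)
    }
}
```

For the slide's example, a table containing class Set with getMembers : (int x int) -> Set makes `check("Set", "getMembers", List.of("int", "int"))` return `"Set"`, while a call with a boolean argument or with only one argument returns null.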