KAnnotator

ANNOTATION INFERENCE IN KANNOTATOR

ANNOTATION INFERENCE: AN OVERVIEW

Annotation inference includes the following steps:

1. Load external or previously inferred annotations from the specified sources (XML files, class files, annotation logs, etc.)
2. Load descriptions of the classes to analyze
3. Invoke the inference algorithm on the loaded classes and annotations to produce a set of inferred annotations
4. Update the inferred annotations with the propagation algorithm to extend them to other methods in the inheritance hierarchy
5. Process conflicts between inferred and external annotations

The inference process is parameterized with algorithms that implement inference for a specific kind of annotation (e.g. nullability):

- Infer field annotations given the field's value
- Infer method annotations given the method's bytecode

ANNOTATIONS STRUCTURE

Annotations are represented as a map from annotation positions to actual annotation values (e.g. NULLABLE/NOT_NULL).

An annotation position consists of:

- Class member (field/method)
- Declaration position, the annotated component of the class member:
    - Field type position: annotates the field type
    - Return type: annotates the return type of the method
    - Parameter position: annotates the type of a method parameter (given its index)

ANNOTATION LATTICE

Inference assumes that annotations form a lattice; hence, for any pair (a, b) of annotations of the same kind (e.g. nullability), the least upper bound LUB(a, b) and the greatest lower bound GLB(a, b) are defined:

- Nullability: NOT_NULL < NULLABLE
- Mutability: READ_ONLY < MUTABLE

Unification: given two annotations (a, b) and a declaration position p, the unified annotation unify(a, b, p) is defined as:

- LUB(a, b) if p is covariant
- GLB(a, b) if p is contravariant
- b if p is invariant (assuming a == b)

Unification is naturally extended to annotation sets.

An annotation a (“parent”) subsumes an annotation b (“child”) at position p if unify(a, b, p) == a.
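The lattice operations above can be sketched in Java for the nullability case. This is a minimal illustration, not KAnnotator's actual code: the class name, the Variance enum and the method signatures are assumptions made for the example.

```java
import java.util.Collection;
import java.util.Iterator;

final class NullabilityLattice {
    // Declared in lattice order: NOT_NULL < NULLABLE.
    enum Nullability { NOT_NULL, NULLABLE }

    // Variance of a declaration position (assumption: modelled as a plain enum).
    enum Variance { COVARIANT, CONTRAVARIANT, INVARIANT }

    // Least upper bound: the larger of the two elements.
    static Nullability lub(Nullability a, Nullability b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    // Greatest lower bound: the smaller of the two elements.
    static Nullability glb(Nullability a, Nullability b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    static Nullability unify(Nullability a, Nullability b, Variance p) {
        switch (p) {
            case COVARIANT:
                return lub(a, b);
            case CONTRAVARIANT:
                return glb(a, b);
            default:
                // Invariant positions are only defined for equal annotations.
                if (a != b) {
                    throw new IllegalArgumentException("invariant position requires a == b");
                }
                return b;
        }
    }

    // Extension of unify to a non-empty collection of annotations.
    static Nullability unify(Collection<Nullability> anns, Variance p) {
        Iterator<Nullability> it = anns.iterator();
        Nullability result = it.next();
        while (it.hasNext()) {
            result = unify(result, it.next(), p);
        }
        return result;
    }

    // a ("parent") subsumes b ("child") at p if unify(a, b, p) == a.
    static boolean subsumes(Nullability parent, Nullability child, Variance p) {
        if (p == Variance.INVARIANT) {
            return parent == child;
        }
        return unify(parent, child, p) == parent;
    }
}
```

With this ordering, a NULLABLE parent subsumes a NOT_NULL child at a covariant position, while at a contravariant position the relation is reversed.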

FIELD/METHOD DEPENDENCY

Field dependency is a map which associates field descriptions with a pair (readers, writers), where:

- readers is the set of all methods (within the given class set) which access the field value through GETFIELD or GETSTATIC instructions
- writers is the set of all methods (within the given class set) which mutate the field value through PUTFIELD or PUTSTATIC instructions

Method dependency is a graph with methods as vertices. The graph contains an edge (a, b) if one of the following conditions holds:

- Method a invokes method b through one of the INVOKE*** instructions
- There is a non-primitive field such that method a is its reader and method b is its writer
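A rough Java sketch of how the two edge rules could be combined into one adjacency map. Methods are identified by plain strings here, and the FieldAccess class and build method are hypothetical names chosen for the illustration; collecting the invocations and field accesses from bytecode is assumed to have happened already.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class MethodDependency {
    // Readers and writers of one non-primitive field (hypothetical helper type).
    static final class FieldAccess {
        final Set<String> readers = new LinkedHashSet<>();
        final Set<String> writers = new LinkedHashSet<>();
    }

    // invocations: caller -> callees reached through INVOKE*** instructions.
    // Returns the dependency graph as method -> set of methods it depends on.
    static Map<String, Set<String>> build(Map<String, Set<String>> invocations,
                                          Collection<FieldAccess> nonPrimitiveFields) {
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        // Rule 1: edge (a, b) if a invokes b.
        for (Map.Entry<String, Set<String>> e : invocations.entrySet()) {
            graph.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>()).addAll(e.getValue());
        }
        // Rule 2: edge (a, b) if a reads a non-primitive field that b writes.
        for (FieldAccess field : nonPrimitiveFields) {
            for (String reader : field.readers) {
                graph.computeIfAbsent(reader, k -> new LinkedHashSet<>()).addAll(field.writers);
            }
        }
        return graph;
    }
}
```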

ANNOTATION INFERENCE (WITHOUT PROPAGATION)

Input:
    classSource: set of classes to analyze
    externalAnn: external annotations (e.g. loaded from class files/XMLs)
    existingAnn: previously inferred annotations
Output:
    inferredAnn: inferred annotations

inferredAnn := copy of existingAnn
fieldDep := build field dependency map for all classes in classSource
methDep := build method dependency graph for all classes in classSource
depComps := list of SCCs of methDep in topological order
For each field f in fieldDep
    use the annotation-specific algorithm to infer annotations from the initial value of f and copy them to inferredAnn
For each component comp in depComps
    infer annotations for the methods within comp (see below)

INFER ANNOTATIONS ON DEPENDENCY GRAPH SCC

Input:
    methods: set of methods which form a dependency graph SCC
    ann: current annotations (to be updated by the inference)

queue := new queue containing all items from methods
While (queue is not empty)
    m := remove first method from queue
    cfg := build control-flow graph of m
    inferredAnn := invoke annotation-specific inference algorithm (e.g. nullability) for method m, graph cfg and predefined annotations ann
    Copy all changed annotations from inferredAnn to ann
    If (at least one annotation was changed/added/removed) then
        Add to the queue:
            all dependent methods of m which belong to the same SCC as m itself
            method m
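The worklist iteration over one SCC can be sketched as follows. This is an illustrative outline under assumptions: the inference step is passed in as a function, control-flow graph construction is folded into it, and only added or changed annotations are tracked (removal is left out). The names SccWorklist, run, dependents and infer are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

final class SccWorklist {
    // methods:    the methods of one SCC
    // dependents: m -> methods that depend on m
    // infer:      (method, current annotations) -> annotations inferred for that method
    // ann:        current annotations, updated in place
    // Returns the number of inference runs that were needed.
    static <M, P, A> int run(Collection<M> methods,
                             Map<M, List<M>> dependents,
                             BiFunction<M, Map<P, A>, Map<P, A>> infer,
                             Map<P, A> ann) {
        Set<M> scc = new HashSet<>(methods);
        Deque<M> queue = new ArrayDeque<>(methods);
        int runs = 0;
        while (!queue.isEmpty()) {
            M m = queue.removeFirst();
            runs++;
            Map<P, A> inferred = infer.apply(m, Collections.unmodifiableMap(ann));
            // Copy changed annotations and remember whether anything changed.
            boolean changed = false;
            for (Map.Entry<P, A> e : inferred.entrySet()) {
                A old = ann.put(e.getKey(), e.getValue());
                if (!e.getValue().equals(old)) {
                    changed = true;
                }
            }
            if (changed) {
                // Re-queue dependents within the same SCC, then m itself.
                for (M d : dependents.getOrDefault(m, Collections.<M>emptyList())) {
                    if (scc.contains(d)) {
                        queue.addLast(d);
                    }
                }
                queue.addLast(m);
            }
        }
        return runs;
    }
}
```

The loop terminates once a full pass produces no change, which relies on the inference function eventually reaching a fixed point.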

PROPAGATION: AN OVERVIEW

The propagation algorithm extends a given annotation set to methods within the same inheritance hierarchy.

Propagation proceeds in the following steps:

1. Resolve annotation conflicts
   A parent and a child method have conflicting annotations at some position if the parent annotation does not subsume the child annotation. Conflicts are fixed by updating the parent annotation to be the least upper bound of the child annotation and the previous parent annotation.
2. Unify parameter annotations
   Methods in the same inheritance hierarchy are assigned an identical annotation at corresponding parameters. The annotation is computed as the least upper bound over the annotations already present at that parameter.
3. Apply propagation overrides
   A propagation override is an exception to the unification algorithm which states that a given method and all its descendants must have some specific annotation at a given parameter.

PROPAGATION: CONFLICT RESOLUTION

Input:
    leaves: set of “leaf” methods
    lat: annotation lattice
    ann: annotations (to be updated)

For each method leaf in leaves
    propagatedAnn := copy of ann
    Perform breadth-first traversal of the method hierarchy graph starting from leaf (moving from child to parents) and for each traversed method m, each parent method pm of m and each annotation position ppos in pm
        p := declaration position of ppos
        pos := position corresponding to ppos in method m
        child := propagatedAnn[pos], parent := ann[ppos]
        If (child is defined) then
            If (parent is defined) then
                ann[ppos] := lat.unify(child, parent, p)
            else
                propagatedAnn[ppos] := child
                If (p == RETURN_TYPE) then ann[ppos] := child

PROPAGATION: CONFLICT RESOLUTION EXAMPLE (I)

    public class XHierarchyAnnotatedMiddle {
        public interface Top1 {
            @NotNull
            Object m(@Nullable Object x);
        }

        public interface Top2 {
            @NotNull
            Object m(@Nullable Object x);
        }

        public interface Middle extends Top1, Top2 {
            @Nullable
            Object m(Object x);
        }

        public interface Leaf1 extends Middle {
            Object m(@NotNull Object x);
        }

        public interface Leaf2 extends Middle {
            Object m(Object x);
        }
    }

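PROPAGATION: CONFLICT RESOLUTION EXAMPLE (II)

[Code listing: class XHierarchyAnnotatedMiddle with five nested interfaces; most declarations are missing from the source. Surviving fragments: the first, second and third interfaces carry @Nullable on the return type of m; Object m(Object x) appears without parameter annotations in the third and fifth interfaces.]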

PROPAGATION: PARAMETER UNIFICATION

Input:
    methods: set of methods
    lat: annotation lattice
    ann: annotations (to be updated)

descriptors := set of method descriptors found in methods
For each method descriptor desc in descriptors
    descMethods := subset of methods with descriptor desc
    For each parameter declaration position p in desc
        paramAnn := set of all annotations from ann defined at a position pos such that its method is in descMethods and its declaration position is p
        If (paramAnn is not empty) then
            unifiedAnnotation := lat.unify(paramAnn, p)
            For each method m in descMethods
                pos := annotation position of m corresponding to declaration position p
                ann[pos] := unifiedAnnotation
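A compact Java sketch of parameter unification for nullability. It assumes that parameter positions are contravariant, so lat.unify reduces to the greatest lower bound under NOT_NULL < NULLABLE; the Method class, the string encoding of positions and the helper names are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ParameterUnification {
    // Declared in lattice order: NOT_NULL < NULLABLE.
    enum Nullability { NOT_NULL, NULLABLE }

    // Hypothetical method description: owner class, descriptor, parameter count.
    static final class Method {
        final String owner;
        final String descriptor;
        final int paramCount;

        Method(String owner, String descriptor, int paramCount) {
            this.owner = owner;
            this.descriptor = descriptor;
            this.paramCount = paramCount;
        }
    }

    // Annotation position of parameter `index` of method m (assumed string encoding).
    static String pos(Method m, int index) {
        return m.owner + "." + m.descriptor + "@" + index;
    }

    static Nullability glb(Nullability a, Nullability b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    static void unifyParameters(List<Method> methods, Map<String, Nullability> ann) {
        // Group methods by descriptor.
        Map<String, List<Method>> byDescriptor = new LinkedHashMap<>();
        for (Method m : methods) {
            byDescriptor.computeIfAbsent(m.descriptor, k -> new ArrayList<>()).add(m);
        }
        for (List<Method> descMethods : byDescriptor.values()) {
            int paramCount = descMethods.get(0).paramCount;
            for (int i = 0; i < paramCount; i++) {
                // Unify all annotations already present at parameter i.
                Nullability unified = null;
                for (Method m : descMethods) {
                    Nullability a = ann.get(pos(m, i));
                    if (a != null) {
                        unified = (unified == null) ? a : glb(unified, a);
                    }
                }
                // Assign the unified annotation to every method with this descriptor.
                if (unified != null) {
                    for (Method m : descMethods) {
                        ann.put(pos(m, i), unified);
                    }
                }
            }
        }
    }
}
```

For example, if one method annotates its first parameter @NotNull and another @Nullable, every method with that descriptor ends up with @NotNull at that parameter under these assumptions.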

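PROPAGATION: PARAMETER UNIFICATION EXAMPLE (I)

[Code listing: class XHierarchy with five nested interfaces whose headers are missing from the source. Surviving declarations of m, in order of appearance:]

    Object m(Object x, Object y);
    Object m(@NotNull Object x, Object y);
    Object m(@Nullable Object x, @Nullable Object y);
    (declaration missing)
    Object m(Object x, @Nullable Object y);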

PROPAGATION: PARAMETER UNIFICATION EXAMPLE (II)

[Code listing with five nested interfaces; only the declaration in the first interface survives:]

    Object m(@NotNull Object x, @Nullable Object y);

PROPAGATION: OVERRIDING RULES

Input:
    graph: method hierarchy graph
    overrides: annotations specifying overriding rules
    ann: annotations (to be updated)

For each annotation position opos in overrides
    method := method corresponding to annotation position opos
    Perform breadth-first traversal of the method hierarchy graph starting from method (moving from parent to children) and for each traversed method m
        pos := position corresponding to opos in method m
        ann[pos] := overrides[opos]

PROPAGATION: OVERRIDING RULE EXAMPLE (I)

[Code listing with five nested interfaces; the declarations are missing from the source.]

Rule: Top1.m(Object, Object) at 0 is NULLABLE

PROPAGATION: OVERRIDING RULE EXAMPLE (II)

[Code listing with five nested interfaces; the declarations are missing from the source.]

Rule: Top1.m(Object, Object) at 0 is NULLABLE

CONFLICT PROCESSING

Input:
    existingAnn: predefined annotations
    inferredAnn: inferred annotations
    lat: annotation lattice
    excPositions: set of excluded annotation positions
Output:
    conflicts: list of triples (position, existing annotation, inferred annotation)

conflicts := empty list
positions := set of all positions in existingAnn
For each annotation position pos in positions
    inferred := inferredAnn[pos], existing := existingAnn[pos]
    p := declaration position corresponding to pos
    If (existing does not subsume inferred at p) then
        If (pos in excPositions) then
            inferredAnn[pos] := existing
        else
            add (pos, existing, inferred) to conflicts
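The conflict check can be sketched in Java for nullability. Several details are assumptions made for the illustration: positions are encoded as strings, a position ending in "@return" is treated as covariant and any other as a contravariant parameter position, and positions without an inferred annotation are skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ConflictProcessing {
    // Declared in lattice order: NOT_NULL < NULLABLE.
    enum Nullability { NOT_NULL, NULLABLE }

    // One reported conflict: (position, existing annotation, inferred annotation).
    static final class Conflict {
        final String pos;
        final Nullability existing;
        final Nullability inferred;

        Conflict(String pos, Nullability existing, Nullability inferred) {
            this.pos = pos;
            this.existing = existing;
            this.inferred = inferred;
        }
    }

    // Assumed encoding: "...@return" is a return type position, anything else a parameter.
    static boolean isCovariant(String pos) {
        return pos.endsWith("@return");
    }

    // unify(parent, child, p) == parent, written out for a two-element chain:
    // LUB equals parent iff parent >= child; GLB equals parent iff parent <= child.
    static boolean subsumes(Nullability parent, Nullability child, boolean covariant) {
        int cmp = parent.compareTo(child);
        return covariant ? cmp >= 0 : cmp <= 0;
    }

    static List<Conflict> process(Map<String, Nullability> existingAnn,
                                  Map<String, Nullability> inferredAnn,
                                  Set<String> excPositions) {
        List<Conflict> conflicts = new ArrayList<>();
        for (Map.Entry<String, Nullability> e : existingAnn.entrySet()) {
            String pos = e.getKey();
            Nullability existing = e.getValue();
            Nullability inferred = inferredAnn.get(pos);
            if (inferred == null) {
                continue; // assumption: nothing inferred means nothing to compare
            }
            if (!subsumes(existing, inferred, isCovariant(pos))) {
                if (excPositions.contains(pos)) {
                    inferredAnn.put(pos, existing); // excluded: existing annotation wins
                } else {
                    conflicts.add(new Conflict(pos, existing, inferred));
                }
            }
        }
        return conflicts;
    }
}
```

Under these assumptions, an existing @NotNull return type with an inferred @Nullable is reported as a conflict, and so is an existing @Nullable parameter with an inferred @NotNull.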

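CONFLICT PROCESSING: EXAMPLE

[Code listing: class XHierarchy with five nested interfaces whose headers are missing from the source. Surviving fragments, in order of appearance:]

    Object m(@Nullable @NotNull Object x, Object y);    (second interface)
    @NotNull @Nullable                                  (return type, third interface)
    Object m(Object x, @Nullable Object y);             (fifth interface)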

CONTROL-FLOW GRAPH

The method control-flow graph describes transitions between bytecode instructions.

- Each instruction and transition has a corresponding frame state which describes the content of local variables and the stack
- Interesting stack values correspond to method parameters
- Each instruction also has one outcome value which reflects the possible terminations of outgoing control-flow paths:
    - ONLY_RETURNS
    - ONLY_THROWS
    - RETURNS_AND_THROWS
- Instruction outcomes are computed on demand

COMPUTATION OF INSTRUCTION OUTCOMES

The outcome of a given instruction srcInsn is computed by depth-first traversal of all instructions reachable from srcInsn, merging the outcomes of all visited termination instructions such that:

- The outcome of any *RETURN instruction is ONLY_RETURNS
- The outcome of an ATHROW instruction is ONLY_THROWS

The traversal can be stopped early if a RETURNS_AND_THROWS outcome is produced.

Outcomes are merged according to the rules:

    a + a = a
    a + b = RETURNS_AND_THROWS if a != b
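The traversal and merge rules can be sketched in Java on a toy control-flow graph. The array-based graph encoding and the Kind classification are assumptions for the example, and returning null when no terminating instruction is reachable (e.g. an infinite loop) is a choice made here, not something the description specifies.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OutcomeAnalysis {
    enum Outcome { ONLY_RETURNS, ONLY_THROWS, RETURNS_AND_THROWS }

    // Hypothetical instruction classification: *RETURN, ATHROW, or anything else.
    enum Kind { RETURN, THROW, OTHER }

    // a + a = a; a + b = RETURNS_AND_THROWS if a != b; null means "nothing seen yet".
    static Outcome merge(Outcome a, Outcome b) {
        if (a == null) return b;
        if (b == null) return a;
        return a == b ? a : Outcome.RETURNS_AND_THROWS;
    }

    // kinds[i] is the kind of instruction i, successors[i] its transition targets.
    static Outcome outcome(int srcInsn, Kind[] kinds, int[][] successors) {
        Outcome result = null;
        boolean[] visited = new boolean[kinds.length];
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(srcInsn);
        while (!stack.isEmpty()) {
            int insn = stack.pop();
            if (visited[insn]) continue;
            visited[insn] = true;
            if (kinds[insn] == Kind.RETURN) {
                result = merge(result, Outcome.ONLY_RETURNS);
            } else if (kinds[insn] == Kind.THROW) {
                result = merge(result, Outcome.ONLY_THROWS);
            }
            // Early exit: no further instruction can change the result.
            if (result == Outcome.RETURNS_AND_THROWS) return result;
            for (int next : successors[insn]) {
                stack.push(next);
            }
        }
        return result;
    }
}
```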

MUTABILITY INFERENCE: MUTABILITY INVOCATIONS

Mutating invocations:

- Collection.{add, remove, addAll, removeAll, retainAll, clear}
- Set.{add, remove, addAll, removeAll, retainAll, clear}
- List.{add, remove, set, addAll, removeAll, retainAll, clear}
- Map.{put, remove, putAll, clear}
- Map.Entry.setValue
- Iterator.remove

Mutability propagating invocations:

- {Collection, Set, List}.iterator
- List.listIterator
- Map.{keySet, values, entrySet}

MUTABILITY INFERENCE

Input:
    method: method to be analyzed
    cfg: control-flow graph of method
    ann: predefined annotations set
Output:
    inferredAnn: inferred annotations set

mutabilityMap := empty map from values to mutability annotations
For each invocation instruction insn in cfg
    If (insn is invocation of some method m) then
        If (insn is mutating invocation of some method m) then
            Mark each possible value of m’s receiver as MUTABLE
        For each parameter param of m
            pos := annotation position corresponding to param of m
            If (ann[pos] is MUTABLE) then mark each possible value of param as MUTABLE
For each value v in mutabilityMap which is a parameter of method
    pos := annotation position corresponding to v in method
    ann[pos] := convert mutabilityMap[v] to annotation value

MUTABILITY INFERENCE: VALUE MARKING

Input:
    value: stack value
    mutabilityMap: map from values to mutability annotations (to be updated)

mutabilityMap[value] := MUTABLE
If (value is created by method invocation instruction insn and insn propagates mutability) then
    m := method invoked by insn
    Recursively mark each possible value of m’s receiver as MUTABLE

NULLABILITY INFERENCE: NULLABILITY VALUES

The inference process assigns a nullability to stack values:

- UNKNOWN: not enough information to infer nullability
- NULLABLE
- NOT_NULL
- NULL
- UNREACHABLE: contradicting nullabilities (value is not realizable); written CONFLICT in the rules that follow

Nullability merge rules:

    a + a = a
    a + CONFLICT = CONFLICT + a = a
    a + NULL = NULL + a = NULLABLE
    a + NULLABLE = NULLABLE + a = NULLABLE
    NOT_NULL + UNKNOWN = UNKNOWN + NOT_NULL = UNKNOWN
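The merge rules can be written as a small Java function. One assumption is made explicit here: where more than one rule could match (for example CONFLICT + NULL), the rules are applied in the order listed, so the earlier rule wins. The class and method names are illustrative.

```java
final class NullabilityMerge {
    enum Nullability { UNKNOWN, NULLABLE, NOT_NULL, NULL, CONFLICT }

    // Applies the merge rules in the order they are listed (assumed priority).
    static Nullability merge(Nullability a, Nullability b) {
        // a + a = a
        if (a == b) return a;
        // a + CONFLICT = CONFLICT + a = a
        if (a == Nullability.CONFLICT) return b;
        if (b == Nullability.CONFLICT) return a;
        // a + NULL = NULL + a = NULLABLE
        if (a == Nullability.NULL || b == Nullability.NULL) return Nullability.NULLABLE;
        // a + NULLABLE = NULLABLE + a = NULLABLE
        if (a == Nullability.NULLABLE || b == Nullability.NULLABLE) return Nullability.NULLABLE;
        // Only NOT_NULL + UNKNOWN (in either order) remains.
        return Nullability.UNKNOWN;
    }
}
```

For instance, merging NOT_NULL from one branch with NULL from another yields NULLABLE, while merging NOT_NULL with UNKNOWN loses the information and yields UNKNOWN.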

NULLABILITY INFERENCE: NULLABILITY MAPS

A nullability map keeps the association between stack values and nullability. In particular, a nullability map is computed for each instruction and transition in the control-flow graph of a method.

Additional structures:

- Set of method annotation positions
- Set of existing annotations (external or previously inferred)
- Declaration index (used to look up fields and methods by their descriptors in bytecode)
- Optional frame state:
    - For a transition-related map, it is the state AFTER the originating instruction
    - For an instruction-related map, it is the merged state BEFORE the instruction
    - Assuming the state is defined, if some value is present in the map but absent in its state, it is said to be unreachable
- Set of spoiled values (i.e. values which are no longer associated with parameters due to assignment)

NULLABILITY INFERENCE: NULLABILITY MAP LOOKUPS

Stored: m.getStored(v)
    Return the actual nullability previously stored in the map, or UNKNOWN if nullability is not defined

Full: m[v]
    If (v is lost) then return CONFLICT
    If (some nullability x was previously stored for value v) then return x
    If (v is created by some instruction insn) then
        If (insn is NEW, NEWARRAY, ANEWARRAY, MULTIANEWARRAY, or LDC) then return NOT_NULL
        If (insn is ACONST_NULL) then return NULL
        If (insn is AALOAD) then return UNKNOWN
        If (insn is GETFIELD or GETSTATIC) then return nullability corresponding to the field annotation (or UNKNOWN if undefined)
        If (insn is INVOKE***) then return nullability corresponding to the return value of the invoked method (or UNKNOWN if undefined)
    If (v is interesting) then return nullability corresponding to the existing annotation at the position encoded by v
    If (v is null) then return NULL
    If (v is primitive) then return CONFLICT
    Otherwise return NOT_NULL

NULLABILITY INFERENCE: MERGING NULLABILITY MAPS

Input:
    srcMaps: set of nullability maps
Output:
    mergedMap: merged nullability map
    mergedValues: set of values which have different nullability in at least two maps in srcMaps

mergedMap := new empty nullability map
mergedValues := new empty set
affectedValues := set of all stack values in srcMaps key sets
For each map m in srcMaps
    Add all values from m.spoiledValues to mergedMap.spoiledValues
    For each value v in affectedValues
        If (v is already in mergedMap keys) then
            If (mergedMap[v] != m[v]) then add v to mergedValues
            mergedMap[v] := merge m[v] with mergedMap[v]
        else
            mergedMap[v] := m[v]
        If (v is lost in m and m.getStored(v) != NOT_NULL) then
            Add v to mergedMap.spoiledValues

NULLABILITY INFERENCE: INFER FROM FIELD VALUE

Input:
    field: field description
Output:
    ann: nullability annotation

If (field is final and field type is not primitive and field initial value is not null) then
    ann := NOT_NULL
else
    ann := UNKNOWN

NULLABILITY INFERENCE: INFER FROM METHOD

Input:
    method: method to be analyzed
    cfg: control-flow graph of method
    ann: predefined annotations set
Output:
    inferredAnn: inferred annotations set

ovrMap := new empty nullability map
mergedMap := new empty nullability map
returnValueInfo := UNKNOWN
fieldInfo := new empty map from fields to nullability values
For each instruction insn in cfg
    insnMap := compute nullability map for insn
    If (insn is *RETURN) then
        Process return instruction (insn, mergedMap, returnValueInfo)
    If (insn is PUTFIELD or PUTSTATIC) then
        Process field write (insn, fieldInfo)
inferredAnn := create annotations (ovrMap, mergedMap, returnValueInfo, fieldInfo)

NULLABILITY INFERENCE: PROCESS RETURNS

Input:
    insn: instruction
    mergedMap: nullability map (to be updated)
    returnValueInfo: return value nullability (to be updated)

Merge insnMap to mergedMap
If (insn is ARETURN) then
    For each possible return value v
        retValue := if ovrMap contains v then ovrMap[v] else insnMap[v]
        Merge retValue to returnValueInfo

NULLABILITY INFERENCE: PROCESS FIELD WRITE

Input:
    insn: instruction
    fieldInfo: map from fields to nullability values (to be updated)

field := field mutated by insn
If (field has reference type and is final) then
    nullability := merge all possible nullabilities of the new field value in insn
    If (fieldInfo contains key field) then
        fieldInfo[field] := fieldInfo[field] merge nullability
    else
        fieldInfo[field] := nullability

NULLABILITY INFERENCE: CREATE ANNOTATIONS

Input:
    ovrMap: override nullability map
    mergedMap: merged instruction nullability map
    returnValueInfo: return value nullability
    fieldInfo: map from fields to nullability values
Output:
    ann: annotations set

ann := new empty annotations set
ann[return type position] := convert returnValueInfo to annotation
For each interesting value v in mergedMap.keySet
    pos := annotation position corresponding to v
    If (v in ovrMap.keySet) then
        nullability := ovrMap.getStored(v)
    else If (v in mergedMap.spoiledValues) then
        nullability := NULLABLE
    else
        nullability := mergedMap.getStored(v)
    ann[pos] := convert nullability to annotation

COMPUTE INSTRUCTION NULLABILITY MAP

Input:
    ann: existing annotations set
    insn: instruction
    cfg: control-flow graph
    ovrMap: overriding nullability map (to be updated)
Output:
    insnMap: instruction nullability map

insnMap, mergedValues := merge maps from incoming edges of insn
inheritedValues := insnMap.keySet - mergedValues
Process dereferencing (ann, insn, insnMap, cfg, ovrMap)
If (insn is null check) then
    Process null-branching (insn, insnMap, cfg, ovrMap)
Else If (insn is equality check preceded by instanceof) then
    Process instanceof-branching (insn, insnMap, cfg, ovrMap)
Else for each outgoing transition e of insn
    e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state

PROCESS DEREFERENCING INSTRUCTION

Input:
    ann: existing set of annotations
    insn: instruction
    insnMap: instruction nullability map (to be updated)
    cfg: control-flow graph
    ovrMap: overriding nullability map (to be updated)

If (insn is invocation of some method m) then
    Mark each possible value of m’s receiver as NOT_NULL
    For each parameter param of m
        pos := annotation position corresponding to param of m
        If (ann[pos] is NOT_NULL) then mark each possible value of param as NOT_NULL
If (insn is GETFIELD, ARRAYLENGTH, ATHROW, MONITORENTER, MONITOREXIT, *ALOAD, *ASTORE, or PUTFIELD) then
    Mark each possible value of insn receiver as NOT_NULL

PROCESS NULL-BRANCHING INSTRUCTION

Input:
    insn: instruction
    insnMap: instruction nullability map (to be updated)
    cfg: control-flow graph
    ovrMap: overriding nullability map (to be updated)

For nullable transition e
    e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state and nullability of condition subjects replaced according to the rule:
        If CONFLICT or NOT_NULL then CONFLICT, otherwise NULL
For non-nullable transition e
    Similar to above, but the replacement rule is:
        If CONFLICT or NULL then CONFLICT, otherwise NOT_NULL
For each remaining transition e
    e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state
If (outcome of nullable transition target is ONLY_THROWS) then
    For each possible value v of condition subject
        ovrMap[v] := NOT_NULL
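The two replacement rules for a null check can be expressed as a pair of small Java functions. The class and method names are invented for the illustration; only the rules themselves come from the description above.

```java
final class NullBranching {
    enum Nullability { UNKNOWN, NULLABLE, NOT_NULL, NULL, CONFLICT }

    // Nullable transition (the checked value is null on this edge):
    // if CONFLICT or NOT_NULL then CONFLICT, otherwise NULL.
    static Nullability onNullableTransition(Nullability n) {
        if (n == Nullability.CONFLICT || n == Nullability.NOT_NULL) {
            return Nullability.CONFLICT;
        }
        return Nullability.NULL;
    }

    // Non-nullable transition (the checked value is not null on this edge):
    // if CONFLICT or NULL then CONFLICT, otherwise NOT_NULL.
    static Nullability onNonNullableTransition(Nullability n) {
        if (n == Nullability.CONFLICT || n == Nullability.NULL) {
            return Nullability.CONFLICT;
        }
        return Nullability.NOT_NULL;
    }
}
```

A value already known to be NOT_NULL becomes CONFLICT on the nullable edge, which marks that edge as not realizable for the value. The final rule of the procedure covers source patterns such as `if (x == null) throw new IllegalArgumentException();`, where the nullable edge only throws and x is therefore recorded as NOT_NULL in ovrMap.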

PROCESS INSTANCEOF-BRANCHING INSTRUCTION

Input:
    insn: instruction (IFEQ/IFNE) preceded by INSTANCEOF
    insnMap: instruction nullability map (to be updated)
    cfg: control-flow graph
    ovrMap: overriding nullability map (to be updated)

For instance-of (non-nullable) transition e
    e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state and nullability of condition subjects replaced according to the rule:
        If CONFLICT or NULL then CONFLICT, otherwise NOT_NULL
For not-instance-of (nullable) transition e
    Similar to above, but the replacement rule is:
        If UNKNOWN then NULLABLE, otherwise do not change
For each remaining transition e
    e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state

NULLABILITY VALUE MARKING

Input:
    value: stack value
    inheritedValues: set of inherited values
    insnMap: instruction nullability map (to be updated)
    ovrMap: overriding nullability map (to be updated)

If (insnMap.getStored(value) is neither CONFLICT nor NULL) then
    insnMap[value] := NOT_NULL
If (value is interesting and inheritedValues is empty and value is not in insnMap.spoiledValues) then
    ovrMap[value] := NOT_NULL
