Upload
allison-brimley
View
217
Download
0
Embed Size (px)
Citation preview
Chair of Software Engineering
The alias calculus
Bertrand MeyerITMO Software Engineering Seminar
June 2011
Claims
Theory: Theory of aliasing Loss of precision is small New concepts, in particular inverse variables Abstract, does not mention stack & heap Simple, implementable Insights into the essence of object-oriented
programmingPractice:
Alias calculus Almost entirely automatic Implemented
2
Reference
Steps Towards a Theory and Calculus of Aliasing
International Journal of Software and Informatics
July 2011
http://se.ethz.ch/~meyer/publications/aliasing/alias-revised.pdf
3
The question under study
Given expressions e and f (of reference types) and a program location p :
At p , can e and f ever be attached to the same object?
4
(If so, we say that e and f are aliased to each other, meaning potentially aliased.)
e
f
“Given” expressions only
Considerfrom y := x loop
y := y aend
5
x
a
a
a
a
y may become aliased to:x, x a, x a a, x a a a
etc.(infinite set of expressions!)
An example of alias analysis
y
x
Consider two linked list structures known through x and y:
rightitem
Computing the alias relation shows that:
If x ≠ y, then no cell reachable from x ( or ) can be reached from y ( or ), and conversely
Without this assumption, such aliasing is possible 6
Why alias analysis is important
1. Without it, cannot apply standard proof techniques to programs involving pointers
2. Concurrent program analysis, in particular deadlock3. Program optimization
7
-- y.a = b
x.set_a (c)
? a
x
y
set_a (c)
b
c-- x.a = c-- x.a = c
-- c = c, i.e. True
Understand as x.a := c
-- y.a = b
Basic notionDefinition:
Not necessarily transitive:if c then
x := yelse
y := zend
8
A binary relation is an alias relationif it is symmetric and irreflexive
Can alias x to y
and y to z
but not x to z
Formulae of interest
The calculus defines, for any instruction p and any
alias relation a,the value of
a » p
denoting:
The aliasing relation resulting from executing
p from an initial state in which the aliasing
relation is a
For an entire program: compute » p9
The programming language
skip create x x := y forget x (p ; q) then p else q end
10
Eiffel: x := VoidJava etc.: x = null;
p, q, …: instructions
x, y, …: variables
cut x, y
pn -- for integer n
loop p end
r do p end call r
z := x y x r … Curren
t
E0: basic constructs
E1: cut
E4: O-O
E2: loops
E3: procedures
Describing an alias relation
If r is a relation in E E, the following is an alias relation:
r (r r-1 ) ― Id [E]
Example: {[x, x], [x, y], [y, z]} =
Generalized to sets:
{x, y, z} =
=D
11
Set difference
Identity on E
Set of binary relations on E; formally: P (E x E)
{[x, y], [y, x], [y, z], [z, y]}
{[x, y], [y, x], [x, z], [z, x], [y, z], [z, y]}
“Complete” alias relation
Canonical form & alias diagrams
Canonical form of an alias relation: union of complete alias relations, e.g.
, meaning
None of the sets of expressionsis a subset of another
12
x, y, y, z, x, u, v
x
u, v
y
x
y, z
Make it canonical: x, x
, yy
{x, y} { y, z} {x, u, v}
(not canonical)
An alias diagram:
yy
Alias calculus for basic operations (E0)
a » skip = a
a » (then p else q end ) = (a » p) (a » q)
a » (p ; q) = (a » p) » q
a » (forget x) = a \- {x}
a » (create x) = a \- {x}
a » (x := y) = a [x: y]
13
a deprived of all pairs involving x
e.g. x, y, y, z, x, u, v
\- {x, u}=
y, z, u, v Override, see next
The forget rule
a » (forget x) = a \- {x}
14
y
x,
y, z
x,
u, v
x,
x,
y a deprived of allpairs involving x
The assignment rule (E0)
a » (x := y) = a [x: y]
with:
a [x: y] = givenb = a \- {x}
then
b ({x} (b / y)) end
15
Symmetrize and de-reflect
a deprived of all pairs involving x
All pairs [x, u] where u is either aliased to y in b or y itself
Operations on alias relations
For an alias relation a in E E, an expression x, and a set of expressions A E, the following are alias relations:
r \– A = r — E x A
a / y = {z: E | (z = y) [y, z] a}
16
“Quotient”, similar to equivalence class in equivalence relation
“Minus” Set of all
expressions
The assignment rule (E0)
All u aliased to y in b, plus y itself
a [x: y] = givenb = a \- {x}
then
b ({x} ( b / y )) end
17
Symmetrize and de-reflect
a deprived of all pairs involving x
All pairs [x, u] where u is either aliased to y in b or y itself
Value of a » (x := y)
z := x
Assignment example 1
18
x
u, vx,
, y
Before
, z
, z
After
Assignment example 2
19
x
u, vx,
, yx, y
x := u
BeforeAfter
Assignment example 3
20
x
u, vx,
, yx, y
x, z
x,
x := z
BeforeAfter
The assignment rule (E0)
a [x: y] = givenb = a \- {x}
then
b ({x} (b / y)) end
21
Value of a » (x := y)
The cut instruction
E1 is E0 plus the instruction
cut x, y
Semantics: remove aliasing, if any, between x and y
E0: basic constructs
E1: cut
22
Cut example 1
23
x
u, vx,
, yx, y
cut x, y
BeforeAfter
Cut example 2
24
x, y
u, vx,x,
x, v
cut x, u
BeforeAfter
Cut rule
a » cut x, y = a ― x, y
25
Set difference
The role of cut
cut x, y informs the alias calculus with non-alias properties coming from other sourcesExample:
if m < n then x := u else x := y endm := m + 1if m < n then z := x end
But here x cannot be aliased to y (only to u). The alias theory does not know this property!To take advantage of it, add the instruction
This expression represents
check x /= y end (Eiffel)assert x != y ; (JML, Spec#)
26
Alias relation:
x, u, z, x, y, z
x, u, x, y
cut x, y;
Introducing repetitions
E2 is E1 plus:
pn (for integer n): n executions of p
(auxiliary notion)
loop p end : any sequence (incl. empty) of executions of p
27
E0: basic constructs
E1: cut
E2: loops
E2 alias calculus
a » p0 = a
a » pn+1 = (a » pn) » p -- For n 0
-- Also equal to (a » p)
» pn
a » (loop p end) = (a » pn)
28
n N
Loop aliasing theorem (1)
For any a and p, there exists a constant N N such that
a » (loop p end) = (a » pn)
Proof : the sequence
sn = (a » pk)
is non-decreasing (with respect to inclusion) on a finite set
More generally, for every construct p of E2, the functionl a | (a » p)
is non-decreasing 29
n: 0 N
k: 0 n
a » (loop p end) = (a » pn)
n N
Loop aliasing theorem (2)
a » (loop p end) is also the fixpoint of the sequencet0 = a
tn+1 = tn (tn » p)
Gives a practical way to compute a » (loop p end)
Proof: by induction. If sn is original sequence (a »
pn), prove separately sn tn and tn sn
30
k: 0 n
Introducing procedures: E3
A program is now a sequence of procedure definitions (one designated as main):
ri (f) do pi end
Instructions: as before, plus
call ri (a)
-- Procedure call
31
E0: basic constructs
E1: cut
E2: loops
E3: procedures
Alias calculus notations:
r denotes body of r (i.e. ri =
pi)
r denotes formals of r (here f)
Handling arguments
The calculus will treat call r (a)
as r := a ; call r
(With recursion, possible loss of precision)
32
Generalize notation a [x: y] to lists: use
a [a : b]as abbreviation for
(…((a [a1: b1])[a2: b2]) …[an: bn]
For example: a [r : a]
i.e. formal1 := actual1;… ; formaln :=
actualn
Call rule
Without arguments:
a » call r = a » r
33
Formal arguments of rWith arguments:
a » call r (v) = a [r: a] » r
Body of r
Using the call rule
34
a » call r (a) = a [r: a] » r
Because of recursion, no longer just definition but equation
For entire set of procedures P, this gives a vector equation
a » P = AL (a » P)
Interpret as fixpoint equation and solve iteratively(Fixpoint exists: increasing sequence on finite set)
Object-oriented mechanisms: E4
“General relativity”:
1. Qualified expressions: x y
Can be used as source (not target!) of assignments
x := y z
2. Qualified calls:
call x r (v)
3. Current
37
E0: basic constructs
E1: cut
E4: O-O
E2: loops
E3: procedures
Assignment (original rule)
a [x: y] = given
b = a \- {x}
then
b ({x} (b / y) )
end
38
a deprived of all pairs involving x
This includes [x, y] !
All pairs [x, u] where u is either aliased to y in b or y itself
All u aliased to y in b, plus y itself
Example:
x := y z
Value of a » (x := y)
Assigning a qualified expression
:= x y
39
x yx
x
z
x does not get aliased to x y!
(only to any z that was aliased to x
y)
x := x y
Assignment rule revisited
a [x: y] = given
b = a \– {x} then
b ({x} (b / y)) end
40
a deprived of all pairs involving xor an expression starting with x
Example:
x := y z
Value of a » (x := y)
Alias diagrams (E0 to E3)
41
x
u, v
y, z
x,
, y
Source node Value nodesValue nodesValue nodesSingle source node(represents stack)
Each value node represents a set of possible run-time values
Links: only from source tovalue nodes (will becomemore interesting with E4!)
Edge label: set ofvariables; indicates theycan all be aliased to each other
Alias diagrams (E4)
:= x y
42
x yx
x
z
Links may now exist between value nodes(now called object nodes)
Cycles possible (see next)
Source node
Value nodesValue nodesObject nodes
New laws, inverse variables
x Current = x
Current x = x
x’ x = Current
x x’ = Current
Current’ = Current
43
New form of call: qualified
In E4:
call x r (a, b, …)
44
Distribution operator:
For a list a = <u, v, w, …>:
x a = <x u, x v, x v, …>
For a relation r in E E :
x r = {[x u, x v] | [u, v] r}
Example:
x ( u, v, w, u, y ) = x u, x v, x w, x u, x
y
45
Handling arguments (unqualified call)
a » call r (a) = a [r : a] » r
46
Handling arguments: unqualified call
a » call r (a) = a [ ` r : a] » call r )
47
Was written r
Handling arguments: qualified call
a » call x r (a) = a [ x r : a] » call x r )
Treatcall x r (v)
asx formals := a ; call x r
48
x x’
Current
target
Handling arguments: an example
Eiffel: x r (a, b)
With, in a class C:
r (t: T ; u: U)
Handled as:
x t := ax u := bcall x r
x x’
Current
target
Without arguments: unqualified call rule
a » call r = a » r
50
Body of r
Qualified call rule
a » call x r = x ((x’ a) » r)
51
Example: d := c x r (d)
with r (u: U)
do
v := u end
Handled as: d := c call
with r
do
v := u end
u := x’
d
x r
xx
’
c
u, v
, d
target
Current
Inverse variable
The rule in action
a » call x r = x ((x’ a) » r)
52
Alias relation:
c, d
x’ c, x’ d
Prefix with x’
:
u, x’ c, x’
d
d := c callwith r
do
v := u end
u := x’
c
x r
v, u, x’ c, x’
d
Prefix with x :
x v, x u, c, d
xx
’
c
u,
x’ c, x’ d
v,x
x
x
x
c,d
Current
c
x’
c,x c, x’ c, x’ dx d
u,v,
x
x
, d, d
target
Current
x
’
About the qualified call rule
a » call x r = x ((x’ a) » r)
Thus we are permitted to prove that the unqualified call creates certain aliasings, on the assumption that it starts in its own alias environment but has access to the caller’s environment through the inverted variable, and then to assert categorically that the qualified call has the same aliasings transposed back to the original environment. This change of environment to prove the unqualified property, followed by a change back to the original environment to prove the qualified property, explains well the aura of magic which attends a programmer's first introduction to object-oriented programming.
54
The full qualified call rule
As two separate rules:
a » call x r = x ((x’ a) » r)
a » call x r (a) = a [x r : a] » call x r )
As a single rule:
a » call x r (a) = x (x’ a [ r : x’ a]) » r) \– x r
55
Hide internals of r
Termination?
The original termination argument does not hold any more
Considerfrom y := x loop
y := y aend
y may become aliased to:x, x a, x a a, x a a a etc.
(infinite set of expressions!)
56
x
a
a
a
a
Termination: the question under study
Given expressions e and f (of reference types) and a program location p :
At p , can e and f ever be attached to the same object?
57
The alias calculus
a » skip = a
a » (then p else q end) = (a » p) (a » q)
a » (p ; q) = (a » p) » q
a » (forget x) = a \- {x}
a » (create x) = a \- {x}
a » (x := y) = a [x: y]
a » cut x, y = a – x, y
a » p0 = a
a » pn+1 = (a » pn) » p
a » (loop p end) = (a » pn)
a » call r (a) = (a [ r : a]) » r
a » call x r (a) = x (x’ (a [x r : a]) » r) \– x r
58
n N
Plus:x Current = xCurrent x = xx’ x =
Current x x’ =
CurrentCurrent’= Current
a [x: y] = given b = a\- {x} then
b ({x} x (b/y))
end
There is no backward alias calculus
Consider
create y
create z
x := y
59
x, y
?
x := z
x, z ?
Notation
Targets of an instruction p:
p
This is the set of variables that p may modify
e.g.
(x :=y ; y := z) = {x, y}
60
Semantics of the alias calculus
Let Var be the set of variables and a an alias relation, the following assertion expresses that there is no aliasing except as implied by a:
Weak soundness of definition of » for an instruction p:
Soundness has less demanding precondition but assumes some white-box knowledge:
61
a– x ≠ y =D
[x, y] (Var Var) – Id – a
{(a )– } p {(a » p)– }
{(a)– } p {(a » p)– }
We may ignore variables modified by p
\- p
There is no backward rule!
Consider the definition of weak soundness:
62
It is possible to reconstruct a from a–, but not from a– \- p
(Same for strong soundess)
{(a)– } p {(a » p)– }
Why alias analysis is important
1. Without it, cannot apply standard proof techniques to programs involving pointers
2. Concurrent program analysis, in particular deadlock3. Program optimization
63
An application:deadlock avoidance
(sketch)
Coffman deadlock
Consider a set of processors and a set of resources. At every execution time t, for every processor p, two disjoint sets of resources are defined:
Ht (p) -- Has set: resources that p has
acquired Wt (p) -- Wait set: resources that p has
requested
A deadlock exists if for some set D of processors:
p: D | p’ : D | Wt (p) Ht (p’) ≠
(In such a case p ≠ p’)
65
Not the same thing as reverse of liveness
Absolute deadlock freedom
A system, made of a set of program elements E, is absolutely deadlock-free if for all times t
r: E | r’ : E | Wt (r) Ht (r’) =
(Works for r = r’ since Wt (r) Ht (r) = )
Can also be written:
r: E | r’ : E | a: Wt (r) | a Ht (r’)
66
Strategy for detecting deadlock
For every program element r: Compute W (r) and H (r) Determine with which other elements r’ it can run
parallel
67
r, r’: E | r // r’ (W (r) H (r’) = )
The SCOOP model
A primary characteristic of SCOOP is that the model removes the distinction between resources and processorsProperties:
Any processor p is such that p H (p) Any variable or expression e has an associated
processor, its handler <e> Any execution of a qualified call x f (a, b, …)
(separate or not) satisfies x H (<Current>) For any call r of actual arguments args including
uncontrolled arguments U, W (r) is {<a> | a U} Without lock passing, H (r) for any program
element r in a routine or formal separate arguments S is <Current> {<a> | a S}
68
Processor abstraction
In SCOOP: Any variable or expression e has an associated
processor, its handler <e>
The computation of H and W sets only involves the handlerStrategy:
Processor abstraction: identify every variable or expression e with its processor <e>
Perform alias analysis Use it to compute H and W sets as unions for all
possible aliases Check W (r) H (r’) = for any two calls r, r’,
including when they are the same call 69
Dining philosophersclass MEAL create make feature
p1, p2: separate PHILOSOPHER ; f1, f2: separate FORKmake do create f1; create f2
create p1 make (f1, f2); create p2 make (f2, f1) end
go_right (a, b: separate PHILOSOPHER)do p1 eat_right; p2 eat_right end
go_wrong (a, b: separate PHILOSOPHER)do p1 eat_wrong; p2 eat_wrong end
endclass PHILOSOPHER create make feature
left, right: separate FORKmake (u, v: separate FORK) do left:= u ; right := v endeat_right do pick_two (left, right) endpick_two (x, y: separate FORK) do x use; y use endeat_wrong do pick_in_turn (left) endpick_in_turn (z: separate FORK) do pick_two (z, right)
endend 70
H = {x, y},H = {f1, f2, …}W =
H = {z}, W = {right},H = {f1, f2, …},W = {f1, f2, …}
Claims
Theory: Theory of aliasing Loss of precision is small New concepts, in particular inverse variables Abstract, does not mention stack & heap Simple, implementable Insights into the essence of object-oriented
programmingPractice:
Alias calculus Almost entirely automatic Implemented
71
Approaches for comparison
Separation logic
Shape analysis (with abstract interpretation)
Ownership
Dynamic frames
72
73
Hoare-style reasoning
Assignment rule:
{P (e)} x := e {P (x)}
74
Hoare-style reasoning
require
do
:= whatever + 10000
y := y + 1ensure
endy < 3
-- y + 1 < 3-- y + 1 < 3
xx
y + 1 < 3
y + 1 < 3-- y + 1 < 3
Assignment rule:
{P (e)} x := e {P (x)}
require
ensure y <
3
75
The effect of pointers (references)
-- y.a = bx.set_a (c)
?
a
x
y
set_a (c)
b
c
-- x.a = c-- x.a = c
-- True
Understand as x.a := c
-- y.a = b
The question under study
Given expressions e and f (of reference types) and a program location p :
At p , can e and f ever be attached to the same object?
76
77
The effect of pointers : with alias analysis
x.set_a (c)-- y.a = b
a
x
y
set_a (c)
b
c
-- x, y
-- x.a = b-- x.a = b
Understand as x.a := c
--.c = b
-- x, y
x may be aliased to y
Alias relationsRelation of interest:
“In the computation, e might become aliased to f”Definition:
Not necessarily transitive:if c then
x := yelse
y := zend
78
A binary relation is an alias relationif it is symmetric and irreflexive
Can alias x to y
and y to z
but not x to z
Alias diagrams (E0 to E3)
79
x
u, v
y, z
x,
, y
Source node Value nodesValue nodesValue nodesSingle source node(represents stack)
Each value node represents a set of possible run-time values
Links: only from source tovalue nodes (will becomemore interesting with E4!)
Edge label: set ofexpressions; indicates theycan all be aliased to each otherIn canonical form: no label is subset ofanother; each label has at least 2 expressions
Approaches for comparison
Separation logic
Shape analysis
Ownership
Dynamic frames
80
Achievements
Theory of aliasingSimple (about a dozen rules)New concepts: inverse variables, modeling CurrentGraphical formalism (alias diagrams), canonical formImplementedAlmost entirely automatic (except for occasional cut)Small loss of precision, i.e. not too conservativeAbstract: does not mention stack and heapCovers object-oriented programmingFaithful to O-O spirit; see qualified call rule
Can cover full modern O-O languagePotential solution to “frame problem”
81
a » call x f = x ((x’ a) » call f)