PROGRAM ANALYSIS & SYNTHESIS
Lecture 08(b) – Typestate Verification
Eran Yahav
SAFE: EFFECTIVE TYPESTATE VERIFICATION IN THE PRESENCE OF ALIASING
Joint work with Nurit Dor, Stephen Fink, Satish Chandra, Marco Pistoia, Ganesan Ramalingam, Sharon Shoham, Greta Yorsh
Motivation
Application trend: increasing number of libraries and APIs
– Non-trivial restrictions on permitted sequences of operations
Typestate: temporal safety properties
– What sequences of operations are permitted on an object?
– Encoded as a DFA
e.g. “Don’t use a Socket unless it is connected”
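[Figure: typestate DFA for Socket. States: init, connected, closed, err. connect() takes init to connected; close() takes connected to closed; getInputStream() and getOutputStream() loop on connected and lead to err from init and from closed; * loops on err. The figure also carries a second close() edge.]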
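A typestate property of this kind can be written down as a small transition function. The following is a minimal sketch, not code from the lecture: the names `SocketTypestate`, `State`, `step`, and `run` are hypothetical, and it assumes that any operation without an explicit edge leaves the state unchanged (treating the DFA as a partial specification).

```java
import java.util.List;

final class SocketTypestate {
    enum State { INIT, CONNECTED, CLOSED, ERR }

    /** One DFA step. Operations without an explicit edge keep the state (sketch assumption). */
    static State step(State s, String op) {
        boolean stream = op.equals("getInputStream") || op.equals("getOutputStream");
        switch (s) {
            case INIT:
                if (op.equals("connect")) {
                    return State.CONNECTED;
                }
                return stream ? State.ERR : s;
            case CONNECTED:
                return op.equals("close") ? State.CLOSED : s;
            case CLOSED:
                return stream ? State.ERR : s;
            default:
                return State.ERR; // err is a sink (the * self-loop)
        }
    }

    /** Runs a sequence of operations on a fresh object, starting in INIT. */
    static State run(List<String> ops) {
        State s = State.INIT;
        for (String op : ops) {
            s = step(s, op);
        }
        return s;
    }
}
```

Running `connect`, `getOutputStream`, `close` ends in `CLOSED`, while calling `getOutputStream` on a fresh socket ends in `ERR`.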
Goal
Typestate verification: statically ensure that no execution of a program can transition to err
– Sound [1] (excluding concurrency)
– Precise enough [2] (reasonable number of false alarms)
– Scalable [3] (handle programs of realistic size)
[1] In the real world, some other caveats apply
[2] We’ll get back to that
[3] Relatively speaking
Challenges

    class SocketHolder { Socket s; }
    Socket makeSocket() { return new Socket(); } // A
    void open(Socket l) { l.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void main() {
        Set<SocketHolder> set = new HashSet<SocketHolder>();
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            set.add(h);
        }
        for (Iterator<SocketHolder> it = set.iterator(); …) {
            Socket g = it.next().s;
            open(g);
            talk(g);
        }
    }

– Flow-sensitivity
– Interprocedural flow
– Context-sensitivity *
– Non-trivial aliasing *
– Path sensitivity
– Full Java language: exceptions, reflection, …
– Big programs
Main Ideas
Flow-sensitive, context-sensitive interprocedural analysis
Abstract domains combine typestate and pointer information
– More precise than 2-stage approach
– Focus expensive effort where it matters
Staging: sequence of abstractions of varying cost/precision
– Inexpensive early stages reduce work for later expensive stages
Techniques for inexpensive strong updates (Uniqueness, Focus)
– Much cheaper than typical shape analysis
– More precise than usual “scalable” analyses
Results
– Flow-sensitive functional IPA with sophisticated alias analysis on ~100 KLOC in 10 mins; scales up to ~500 KLOC
– Verify ~92% of potential points of failure (PPF) as safe
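The staging idea can be illustrated with a small driver. This is a sketch under simplifying assumptions, not the actual verifier: each stage is modeled as a hypothetical predicate that returns true when it can prove a potential point of failure safe, and the names `StagedVerifier` and `unverified` are invented for the example.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

final class StagedVerifier {
    /**
     * Runs the stages in the given order (cheapest first). A later, more
     * expensive stage only sees the points that every earlier stage failed
     * to verify. Returns the points that no stage could prove safe.
     */
    static Set<String> unverified(Set<String> points, List<Predicate<String>> stages) {
        Set<String> remaining = new TreeSet<>(points);
        for (Predicate<String> stage : stages) {
            remaining.removeIf(stage); // drop everything this stage proves safe
        }
        return remaining;
    }
}
```

The points left over after the last stage are the ones that would be reported as possible failures.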
Analysis Overview
[Figure: analysis pipeline. Program → Preliminary Pointer Analysis / Call Graph Construction → Initial Verification Scope → Composite Typestate Verifier (Intraprocedural Verifier, Unique Verifier, AP Focus Verifier) → Possible failure points]
Dataflow Analysis
– Sound, abstract representation of program state
– Flow-sensitive propagation of abstract state
– Context-sensitive: Tabulation Solver [Reps-Horwitz-Sagiv 95]
(Instrumented) Concrete Semantics
[Figure: the Socket typestate DFA (init, connected, closed, err), and three concrete objects in typestates init, closed, init]
σ = { <o1, init>, <o2, closed>, <o3, init>, … }
Instrumented Concrete Semantics
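$L \subseteq \mathit{objects}$
$v \in \mathit{Val} = \mathit{objects} \cup \{\mathit{null}\}$
$\rho \in \mathit{Env} = \mathit{VarId} \to \mathit{Val}$
$h \in \mathit{Heap} = \mathit{objects} \times \mathit{FieldId} \to \mathit{Val}$
$\mathit{state} = \langle L, \rho, h \rangle \in 2^{\mathit{objects}} \times \mathit{Env} \times \mathit{Heap}$
Typestate property: DFA $\langle \Sigma, Q, \delta, \mathit{init}, Q \setminus \{\mathit{err}\} \rangle$
$\mathit{typestate} : L \to Q$
$\mathit{istate} = \langle L, \rho, h, \mathit{typestate} \rangle$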
Abstract State
[Figure: concrete objects labeled by allocation site (AS1, AS2, AS3); objects in typestates init, closed, init are summarized as { init, closed }]
σ = { <o1, init>, <o2, closed>, <o3, init>, … }
σ# = { <AS1, init>, <AS1, closed> }
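The step from σ to σ# can be sketched as a map from concrete objects to their allocation sites. This is an illustrative sketch only: `AllocationSiteAbstraction` and `abstractState` are hypothetical names, factoids are plain strings, and the example assumes o1, o2, and o3 were all allocated at AS1.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class AllocationSiteAbstraction {
    /**
     * Maps a concrete typestate assignment (object -> typestate) to the set of
     * abstract factoids <allocation site, typestate>. Objects from the same
     * site collapse into one abstract object, so identity is lost.
     */
    static Set<String> abstractState(Map<String, String> typestate,
                                     Map<String, String> allocSite) {
        Set<String> factoids = new TreeSet<>();
        for (Map.Entry<String, String> e : typestate.entrySet()) {
            factoids.add("<" + allocSite.get(e.getKey()) + ", " + e.getValue() + ">");
        }
        return factoids;
    }
}
```

With o1 and o3 in init and o2 in closed, all from AS1, the result contains exactly `<AS1, init>` and `<AS1, closed>`.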
Base Abstraction
Abstract State := { < Abstract Object, TypeState > }

    void open(Socket s) { s.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void dispose(Socket s) { s.close(); }
    void main() {
        Socket s = new Socket(); // S
        open(s);
        talk(s);
        dispose(s);
    }

After new Socket(): <S, init>
After open(s): <S, init>, <S, connected>
After talk(s): <S, init>, <S, connected>, <S, err> ×
[Figure: Socket typestate DFA (init, connected, closed, err)]
Unique Abstraction
Abstract State := { < Abstract Object, TypeState, UniqueBit > }
– “UniqueBit” ≈ “exactly one concrete instance of abstract object”
– Allows strong updates
– Works sometimes (80% in our experience)

    void open(Socket s) { s.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void dispose(Socket s) { s.close(); }
    void main() {
        Socket s = new Socket(); // S
        open(s);
        talk(s);
        dispose(s);
    }

After new Socket(): <S, init, U>
After open(s): <S, connected, U>
After talk(s): <S, connected, U>
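The difference between the base abstraction (weak updates) and the unique abstraction (strong updates) can be sketched for a single abstract object. This is a minimal illustration, not the lecture's implementation: `WeakVsStrongUpdate`, `delta`, and `apply` are hypothetical names, and the DFA assumes that operations without an explicit edge leave the state unchanged.

```java
import java.util.EnumSet;
import java.util.Set;

final class WeakVsStrongUpdate {
    enum T { INIT, CONNECTED, CLOSED, ERR }

    /** Socket DFA step; unlisted operations keep the state (sketch assumption). */
    static T delta(T t, String op) {
        boolean stream = op.equals("getInputStream") || op.equals("getOutputStream");
        switch (t) {
            case INIT:
                if (op.equals("connect")) {
                    return T.CONNECTED;
                }
                return stream ? T.ERR : t;
            case CONNECTED:
                return op.equals("close") ? T.CLOSED : t;
            case CLOSED:
                return stream ? T.ERR : t;
            default:
                return T.ERR;
        }
    }

    /**
     * Transfer for e.op() over the typestates of one abstract object.
     * strong = true replaces each state by its successor (unique bit set);
     * strong = false also keeps the old state, because the call may have
     * hit a different concrete object of the same abstract object.
     */
    static Set<T> apply(Set<T> in, String op, boolean strong) {
        Set<T> out = EnumSet.noneOf(T.class);
        for (T t : in) {
            out.add(delta(t, op));
            if (!strong) {
                out.add(t);
            }
        }
        return out;
    }
}
```

Applying `connect` then `getOutputStream` with weak updates to { INIT } yields { INIT, CONNECTED, ERR }, the false alarm of the base abstraction; with strong updates the same sequence yields { CONNECTED }.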
Unique Abstraction

    void open(Socket s) { s.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void dispose(Socket s) { s.close(); }
    void main() {
        while (…) {
            Socket s = new Socket(); // S
            open(s);
            talk(s);
            dispose(s);
        }
    }

Abstract states along the loop, in order:
<S, init, U>
<S, connected, U>
<S, connected, U>
<S, closed, U>
<S, closed, U>
<S, closed, ¬U>, <S, init, ¬U>
<S, closed, ¬U>, <S, init, ¬U>, <S, connected, ¬U>
<S, err, ¬U> × …

Object liveness analysis to the rescue
– Preliminary live analysis oracle
– On-the-fly removal of unreachable configurations using liveness info
Unique Abstraction

    class SocketHolder { Socket s; }
    Socket makeSocket() { return new Socket(); } // A
    void open(Socket s) { s.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void dispose(Socket s) { s.close(); }
    void main() {
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            Socket s = makeSocket();
            open(h.s);
            talk(h.s);
            dispose(h.s);
            open(s);
            talk(s);
        }
    }

Abstract states, in order:
<A, init, U>
<A, init, ¬U>
<A, init, ¬U>, <A, connected, ¬U>
<A, err, ¬U> × …
Access Path Based Abstractions

Access Path Must
{ < Abstract Object, TypeState, UniqueBit, MustSet, MayBit > }
– MustSet := set of symbolic access paths (x.f.g…) that must point to the object
– MayBit := “must set is incomplete; must fall back to may-alias oracle”
– Strong updates allowed for e.op() when e ∈ Must or unique logic allows

Access Path Focus
{ < Abstract Object, TypeState, UniqueBit, MustSet, MayBit, MustNotSet > }
– MustNotSet := set of symbolic access paths that must not point to the object
– Focus operation when interesting things happen: generate 2 tuples, a Must information case and a MustNot information case
– Only track access paths to “interesting” objects
– Sound flow functions to lose precision in MustSet, MustNotSet
– Allows k-limiting. Crucial for scalability.
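One way to read the k-limiting remark: a path can always be dropped from the Must set as long as the May bit is then set, since the Must set is no longer complete. The sketch below illustrates that reading; `KLimitSketch`, `Limited`, and `limit` are hypothetical names, access paths are plain dotted strings, and path length is taken to be the number of dot-separated components.

```java
import java.util.Set;
import java.util.TreeSet;

final class KLimitSketch {
    /** Result of limiting: the reduced Must set and the updated May bit. */
    static final class Limited {
        final Set<String> must;
        final boolean may;

        Limited(Set<String> must, boolean may) {
            this.must = must;
            this.may = may;
        }
    }

    /** Length of an access path, e.g. "x.f.g" has length 3. */
    static int length(String path) {
        return path.split("\\.").length;
    }

    /**
     * Drops Must paths longer than k. If anything was dropped, the Must set
     * is incomplete, so the May bit is forced to true.
     */
    static Limited limit(Set<String> must, boolean may, int k) {
        Set<String> kept = new TreeSet<>();
        for (String p : must) {
            if (length(p) <= k) {
                kept.add(p);
            }
        }
        boolean dropped = kept.size() < must.size();
        return new Limited(kept, may || dropped);
    }
}
```

Entries of a MustNot set could be dropped the same way without touching any flag, since forgetting a must-not fact only loses precision.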
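Access Path Based Abstractions
Statement: e.op()
Resulting abstract tuples for incoming $\langle o, \mathit{unique}, \mathit{typestate}, AP_M, \mathit{May}, AP_{MN} \rangle$:
– $\langle o, \mathit{unique}, \delta(\mathit{typestate}, op), AP_M, \mathit{May}, AP_{MN} \rangle$ if $e \notin AP_{MN}$ and ($e \in AP_M$ or $\mathit{May}$)
– $\langle o, \mathit{unique}, \mathit{typestate}, AP_M, \mathit{May}, AP_{MN} \rangle$ if $e \in AP_{MN}$ or ($e \notin AP_M$ and ($\lnot(\mathit{unique} \land pt(e) = \{o\})$ or $\lnot \mathit{May}$))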
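A transfer function for e.op() over such tuples could look like the sketch below. It encodes one reading of the rule and is not the authors' implementation: the updated tuple is emitted when e is not in MustNot and either e is in Must or the May bit is set; the unchanged tuple is emitted when e is in MustNot, or when e is not in Must and either the unique logic does not apply or the May bit is clear. `OpTransferSketch`, `Tuple`, and `applyOp` are hypothetical names, `next` stands for the successor typestate under op, and `ptIsOnlyObj` stands in for the query pt(e) = {o}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class OpTransferSketch {
    /** Abstract tuple <o, unique, typestate, AP_M, May, AP_MN>. */
    static final class Tuple {
        final String obj, typestate;
        final boolean unique, may;
        final Set<String> must, mustNot;

        Tuple(String obj, boolean unique, String typestate,
              Set<String> must, boolean may, Set<String> mustNot) {
            this.obj = obj;
            this.unique = unique;
            this.typestate = typestate;
            this.must = must;
            this.may = may;
            this.mustNot = mustNot;
        }

        Tuple withTypestate(String t) {
            return new Tuple(obj, unique, t, must, may, mustNot);
        }
    }

    /** Returns the tuples that result from e.op() on the incoming tuple. */
    static List<Tuple> applyOp(Tuple in, String e, String next, boolean ptIsOnlyObj) {
        boolean inMust = in.must.contains(e);
        boolean inMustNot = in.mustNot.contains(e);
        // e may (or must) point to o: the typestate advances.
        boolean updated = !inMustNot && (inMust || in.may);
        // e may (or must) not point to o: the old typestate survives.
        boolean retained = inMustNot
                || (!inMust && (!(in.unique && ptIsOnlyObj) || !in.may));

        List<Tuple> out = new ArrayList<>();
        if (updated) {
            out.add(in.withTypestate(next));
        }
        if (retained) {
            out.add(in);
        }
        return out;
    }
}
```

A tuple whose Must set contains e gets a strong update (one result), a tuple with an empty Must set and the May bit set gets a weak update (two results), and a tuple whose MustNot set contains e passes through unchanged.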
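Access Path Based Abstractions
Resulting abstract tuples for incoming $\langle o, \mathit{unique}, \mathit{typestate}, AP_M, \mathit{May}, AP_{MN} \rangle$, by statement:
v = new T()
  $\langle o, \mathit{false}, \mathit{typestate}, AP_M \setminus \{v.\gamma \mid \gamma \in \Gamma\}, \mathit{May}, AP_{MN} \cup \{v\} \rangle$
  $\langle o, \mathit{false}, \mathit{init}, \{v\}, \mathit{false}, \emptyset \rangle$
v = null
  $AP'_M := AP_M \setminus \{v.\gamma \mid \gamma \in \Gamma\}$
  $AP'_{MN} := AP_{MN} \cup \{v\}$
v.f = null
  $AP'_M := AP_M \setminus \{e'.f.\gamma \mid \mathit{mayAlias}(e', v), \gamma \in \Gamma\}$
  $AP'_{MN} := AP_{MN} \cup \{v.f\}$
v = e
  $AP'_M := AP_M \cup \{v.\gamma \mid e.\gamma \in AP_M\}$
  $AP'_{MN} := AP_{MN} \setminus \{v.\gamma \mid e.\gamma \notin AP_{MN}\}$
v.f = e
  $AP'_M := AP_M \cup \{v.f.\gamma \mid e.\gamma \in AP_M\}$
  $AP'_{MN} := AP_{MN} \setminus \{v.f \mid e \notin AP_{MN}\}$
  $\mathit{May}' := \mathit{May} \lor \exists\, v.f.\gamma \in AP'_M.\ \exists p.\ \mathit{mayAlias}(v, p) \land p.f.\gamma \notin AP_M$
$\Gamma$ = set of all possible access paths in the program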
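As an illustration of the simplest row, the rule for v = null can be read as: every path rooted at v leaves the Must set, and v itself joins the MustNot set. The sketch below follows that reading only; `AssignNullSketch` and its methods are hypothetical names, and access paths are plain dotted strings.

```java
import java.util.Set;
import java.util.TreeSet;

final class AssignNullSketch {
    /** True if path is v itself or an extension of it, such as v.f.g. */
    static boolean rootedAt(String path, String v) {
        return path.equals(v) || path.startsWith(v + ".");
    }

    /** Must set after v = null: paths rooted at v no longer reach the object. */
    static Set<String> mustAfter(Set<String> must, String v) {
        Set<String> out = new TreeSet<>(must);
        out.removeIf(p -> rootedAt(p, v));
        return out;
    }

    /** MustNot set after v = null: v is null, so it cannot point to the object. */
    static Set<String> mustNotAfter(Set<String> mustNot, String v) {
        Set<String> out = new TreeSet<>(mustNot);
        out.add(v);
        return out;
    }
}
```

The prefix test compares whole path components, so assigning null to `h` removes `h.s` but leaves a path such as `hx.s` alone.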
Access Path Abstraction

    class SocketHolder { Socket s; }
    Socket makeSocket() { return new Socket(); } // A
    void open(Socket s) { s.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void dispose(Socket s) { s.close(); }
    void main() {
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            Socket s = makeSocket();
            open(h.s);
            talk(h.s);
            dispose(h.s);
            open(s);
            talk(s);
        }
    }

Abstract states, in order:
<A, init, U, {h.s}, ¬May, {}>
<A, init, ¬U, {h.s}, ¬May, {}> …
<A, connected, ¬U, {h.s}, ¬May, {}>
What can we do when we lose access-path information?

    class SocketHolder { Socket s; }
    Socket makeSocket() { return new Socket(); } // A
    void open(Socket l) { l.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void dispose(Socket s) { s.close(); }
    void main() {
        Set<SocketHolder> set = new HashSet<SocketHolder>();
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            set.add(h);
        }
        for (Iterator<SocketHolder> it = set.iterator(); …) {
            Socket g = it.next().s;
            open(g);
            talk(g);
            dispose(g);
        }
    }

Abstract states, in order:
<A, init, U, {h.s}, ¬May>
<A, init, U, {h.s}, May>
<A, init, ¬U, {}, May>
<A, init, ¬U, {}, May>, <A, connected, ¬U, {}, May>
<A, err, ¬U, {}, May> …
Access Paths with Focus
AS := { < Abstract Object, TypeState, Unique, Must, May, MustNot > }
Access Path Must abstraction + MustNot := set of access paths that must not point to the object
Focus operation when “interesting” things happen
– “Case splitting”: e.op() on <A, T, u, Must, May, MustNot> generates 2 factoids:
  <A, δ(T), u, Must ∪ {e}, May, MustNot>
  <A, T, u, Must, May, MustNot ∪ {e}>
– Interesting operations: typestate changes
Allows k-limiting. Crucial for scalability
– Allowed to limit exponential blowup due to focus
– Current heuristic: discard MustNot before each focus operation
– TODO: more general solution (“blur”, “normalization”)
Works sometimes (95.6%)
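The case split performed by focus can be sketched directly from the two factoids above. This is a minimal illustration rather than the actual implementation: `FocusSketch`, `Factoid`, and `focus` are hypothetical names, `next` stands for the successor typestate δ(T), and the caller is assumed to invoke `focus` only when e is in neither Must nor MustNot.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class FocusSketch {
    /** Factoid <A, T, u, Must, May, MustNot>. */
    static final class Factoid {
        final String obj, typestate;
        final boolean unique, may;
        final Set<String> must, mustNot;

        Factoid(String obj, String typestate, boolean unique,
                Set<String> must, boolean may, Set<String> mustNot) {
            this.obj = obj;
            this.typestate = typestate;
            this.unique = unique;
            this.must = must;
            this.may = may;
            this.mustNot = mustNot;
        }
    }

    /** Returns a copy of s with e added. */
    static Set<String> plus(Set<String> s, String e) {
        Set<String> r = new TreeSet<>(s);
        r.add(e);
        return r;
    }

    /**
     * Splits on whether e points to the object: in the first factoid it does
     * (typestate advances, e joins Must), in the second it does not
     * (typestate unchanged, e joins MustNot).
     */
    static List<Factoid> focus(Factoid in, String e, String next) {
        Factoid hit = new Factoid(in.obj, next, in.unique,
                plus(in.must, e), in.may, in.mustNot);
        Factoid miss = new Factoid(in.obj, in.typestate, in.unique,
                in.must, in.may, plus(in.mustNot, e));
        return Arrays.asList(hit, miss);
    }
}
```

Each factoid now supports a definite answer for later operations on e, which is what makes strong updates possible after the split.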
Access Path Focus Abstraction
{ < Abstract Object, TypeState, UniqueBit, MustSet, MayBit, MustNotSet > }

    class SocketHolder { Socket s; }
    Socket makeSocket() { return new Socket(); } // A
    void open(Socket t) { t.connect(); }
    void talk(Socket s) { s.getOutputStream().write("hello"); }
    void dispose(Socket s) { s.close(); }
    void main() {
        Set<SocketHolder> set = new HashSet<SocketHolder>();
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            set.add(h);
        }
        for (Iterator<SocketHolder> it = set.iterator(); …) {
            Socket g = it.next().s;
            open(g);
            talk(g);
        }
    }

Abstract states, in order:
<A, init, U, {h.s}, ¬May, {}>
<A, init, U, {h.s}, May, {}>
<A, init, ¬U, {}, May, {}>
<A, init, ¬U, {}, May, {¬g}>, <A, connected, ¬U, {g}, May, {}>
<A, init, ¬U, {}, May, {}>
<A, init, ¬U, {}, May, {¬t}>, <A, connected, ¬U, {t}, May, {}>
<A, init, ¬U, {}, May, {¬g,¬s}>, <A, connected, ¬U, {g,s}, May, {}>
Implementation Details Matter
Sparsification
– Separation (solve for each abstract object separately)
– “Pruning”: discard branches of supergraph that cannot affect abstract semantics
– Reduces median supergraph size by 50X
Preliminary Pointer Analysis / Call Graph Construction
– Details matter a lot
– If the preliminary analysis is context-insensitive, later stages time out and precision is terrible
– Current implementation:
  – Subset-based, field-sensitive Andersen’s
  – SSA local representation
  – On-the-fly call graph construction
  – Unlimited object sensitivity for collections and for containers of typestate objects (e.g. IOStreams)
  – One-level call-string context for some library methods
  – Heuristics for reflection (e.g. Livshits et al. 2005)
Precision
11 typestate properties from Java standard libraries
17 moderate-sized benchmarks [~5K – 100K LOC]
Sources of false positives
– Limitations of analysis: aliasing, path sensitivity, return values
– Limitations of typestate abstraction: application logic bypasses DFA, still OK
[Figure: bar chart, TOTAL. Y axis: Warnings / Potential Error Statements (%), 0 to 100. Series: FI, LocalFocus, Base, Unique, APMust, APFocus]
Running time
[Figure: bar chart of running time, axis 0 to 1000. Legend: APFocus, APMust, Unique, Base, LocalFocus, Setup. Off-scale values: APMust = 1677, APFocus = 4275]
IBM Intellistation Z pro, 2x3GHz Xeon, 3.62 GB RAM / Win XP, IBM J2RE 1.4.2 / -Xmx800M
Some Related Work
ESP: Das et al., PLDI 2002
– Two-phase approach to aliasing (unsound strong updates)
– Path-sensitivity (“property simulation”)
Dor et al., ISSTA 2004
– Integrated typestate and alias analysis
– Tracks overapproximation of May aliases
Type systems
– Vault/Fugue: DeLine and Fähndrich 04: adoption and focus
– CQUAL: Foster et al. 02: linear types; Aiken et al. 03: restrict and confine
Alias analysis: Landi-Ryder 92, Choi-Burke-Carini 93, Emami-Ghiya-Hendren 95, Wilson-Lam 95, …
Shape analysis: Chase-Wegman-Zadeck 90, Hackett-Rugina 05, Sagiv-Reps-Wilhelm 99, …