25
1 PROGRAM ANALYSIS & SYNTHESIS Lecture 08(b) – Typestate Verification Eran Yahav

1 Lecture 08(b) – Typestate Verification Eran Yahav

  • View
    222

  • Download
    3

Embed Size (px)

Citation preview

Page 1: 1 Lecture 08(b) – Typestate Verification Eran Yahav

PROGRAM ANALYSIS & SYNTHESIS

Lecture 08(b) – Typestate Verification

Eran Yahav

Page 2: 1 Lecture 08(b) – Typestate Verification Eran Yahav

SAFEEFFECTIVE TYPESTATE VERIFICATION IN THE PRESENCE OF ALIASING

Joint work with Nurit Dor, Stephen Fink, Satish Chandra, Marco Pistoia, Ganesan Ramalingam, Sharon Shoham, Greta Yorsh

2

Page 3: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Motivation Application Trend: Increasing number of libraries and

APIs

– Non-trivial restrictions on permitted sequences of operations

Typestate: Temporal safety properties

– What sequence of operations are permitted on an object?

– Encoded as DFA

e.g. “Don’t use a Socket unless it is connected”

init connected closed

err

connect() close()

getInputStream()getOutputStream()

getInputStream()getOutputStream()getInputStream()

getOutputStream()

close()

* 3

Page 4: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Goal

Typestate Verification: statically ensure that no execution of a program can transition to err Sound1 (excluding concurrency)

Precise enough2 (reasonable number of false alarms)

Scalable3 (handle programs of realistic size)

1 In the real world, some other caveats apply

2 we’ll get back to that

3 relatively speaking

4

Page 5: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Challengesclass SocketHolder { Socket s; }

Socket makeSocket() { return new Socket(); // A }

open(Socket l) {

l.connect();

}talk(Socket s) { s.getOutputStream()).write(“hello”); }

main() { Set<SocketHolder> set = new HashSet<SocketHolder>(); while(…) { SocketHolder h = new SocketHolder(); h.s = makeSocket(); set.add(h) } for (Iterator<SocketHolder> it = set.iterator(); …) { Socket g = it.next().s; open(g); talk(g); }}

Flow-Sensitivity Interprocedural flow Context-Sensitivity * Non-trivial Aliasing * Path Sensitivity Full Java Language

Exceptions, Reflection, …

Big programs

5

Page 6: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Main Ideas Flow-sensitive, context-sensitive interprocedural

analysis

Abstract domains combine typestate and pointer information More precise than 2-stage approach Focus expensive effort where it matters

Staging: Sequence of abstractions of varying cost/precision Inexpensive early stages reduce work for later

expensive stages

Techniques for inexpensive strong updates (Uniqueness, Focus) Much cheaper than typical shape analysis More precise than usual “scalable” analyses

Results Flow-sensitive functional IPA with sophisticated alias

analysis on ~100KLOC in 10 mins. Scales up to ~500KLOC.

Verify ~92% of potential points of failure (PPF) as safe

6

Page 7: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Analysis Overview

Possible failure points

IntraproceduralVerifier

UniqueVerifier

AP FocusVerifier

Initial Verification

Scope

PreliminaryPointer Analysis/

Call Graph Construction

Composite TypestateVerifier

Program

Dataflow Analysis

Sound, abstract representation of program state Flow-sensitive propagation of abstract state Context-sensitive: Tabulation Solver [Reps-Horwitz-Sagiv 95]

7

Page 8: 1 Lecture 08(b) – Typestate Verification Eran Yahav

(Instrumented) Concrete Semantics

init closed

err

connect() close()

getInputStream()getOutputStream()

getInputStream()getOutputStream()getInputStream()

getOutputStream()

close()

*

connected

init closed init

σ = { <o1, init> , <o2,closed> , <o3,init> , … }

8

Page 9: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Instrumented Concrete Semantics

L objectsv Val = objects { null } Env = VarId Valh Heap = objects x FieldId Val

state = <L, , h> 2objects x Env x Heap

Typestate property DFA <,Q,,init,Q\{err}>

typestate: L Q

istate = <L, , h,typestate> 9

Page 10: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Abstract State

AS2

AS2

AS3

AS3

AS3

AS1

AS2

AS1

init closed init{ init, closed }

σ = { <o1, init> , <o2,closed> , <o3,init> , … }

σ# = { <AS1, init> ,<AS1,closed> }

10

Page 11: 1 Lecture 08(b) – Typestate Verification Eran Yahav

open(Socket s) { s.connect();}talk(Socket s) { s.getOutputStream()).write(“hello”); }dispose(Socket s) { s.close(); }main() { Socket s = new Socket(); //S open(s); talk(s); dispose(s);}

<S, init> , <S, connected>, <S, err> ×

Abstract State := { < Abstract Object, TypeState> }

Base Abstraction

init closed

err

connect() close()

getInputStream()getOutputStream()

getInputStream()getOutputStream()getInputStream()

getOutputStream()

close()

*

connected

<S, init><S, init> , <S, connected>

Base 11

Page 12: 1 Lecture 08(b) – Typestate Verification Eran Yahav

open(Socket s) { s.connect();}talk(Socket s) { s.getOutputStream()).write(“hello”); }dispose(Socket s) { s.close(); }main() { Socket s = new Socket(); //S open(s); talk(s); dispose(s);}

<S, init, U>

<S, connected, U>

<S, connected, U>

Unique

Abstract State := { < Abstract Object, TypeState, UniqueBit> }

“UniqueBit” ≈ “ exactly one concrete instance of abstract object” Allows strong updates Works sometimes (80% in our experience)

Unique Abstraction

12

Page 13: 1 Lecture 08(b) – Typestate Verification Eran Yahav

open(Socket s) { s.connect();}talk(Socket s) { s.getOutputStream()).write(“hello”); }dispose(Socket s) { s.close(); }main() { while (…) { Socket s =new Socket();//S open(s); talk(s); dispose(s); }}

<S, init, U>

<S, connected, U>

<S, connected, U>

<S, closed, U>

<S, closed, U>

<S, closed, ¬U> <S, init, ¬U >

<S, closed, ¬U> <S, init, ¬U > <S, connected, ¬U>

<S, err, ¬U> × ….

Object liveness analysis to the rescue Preliminary live analysis oracle On-the-fly removal unreachable configurations using liveness info

Unique Abstraction

13

Page 14: 1 Lecture 08(b) – Typestate Verification Eran Yahav

class SocketHolder {Socket s; }Socket makeSocket() { return new Socket(); // A }open(Socket s) { s.connect();}talk(Socket s) { s.getOutputStream()).write(“hello”); }dispose(Socket s) { h.s.close(); }main() { while(…) { SocketHolder h = new SocketHolder(); h.s = makeSocket(); Socket s = makeSocket(); open(h.s); talk(h.s); dispose(h.s); open(s); talk(s); }}

<A, init, U>

<A, init , ¬U >

<A, init, ¬U > <A, connected, ¬U >

<A, err, ¬U> × ….

Unique Abstraction

14

Page 15: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Access Path Must

MustSet := set of symbolic access paths (x.f.g….) that must point to the objectMayBit := “must set is incomplete. Must fall back to may-alias oracle” Strong Updates allowed for e.op() when e Must or unique logic allows

{ < Abstract Object, TypeState, UniqueBit, MustSet, MayBit> }

MustNotSet := set of symbolic access paths that must not point to the object

Focus operation when interesting things happen generate 2 tuples, a Must information case and a MustNot

information case Only track access paths to “interesting” objects Sound flow functions to lose precision in MustSet,

MustNotSet– Allows k-limiting. Crucial for scalability.

{ < Abstract Object, TypeState, UniqueBit, MustSet, MayBit, MustNotSet> }

Access Path Focus

Access Path Based Abstractions

15

Page 16: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Access Path Based Abstractions

16

Statement

Resulting abstract tuples for incoming o,unique, typestate,APM,May,APMN

e.op() o,unique,(typestate,op),APM,May,APMN if e APMN and (e APM or May)

o,unique,typestate,APM,May,APMN if e APMN or (e APM (unique pt(e) = o) May)

Page 17: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Access Path Based Abstractions

17

Statement

Resulting abstract tuples for incoming o,unique, typestate,APM,May,APMN

v = new T()

o,false,typestate,APM – {v. | } ,May,APMN { v } o,false,init,{v} ,false,

v = null AP’M := APM – {v. | }AP’MN := APMN { v }

v.f = null AP’M := APM – {e’.f. | mayAlias(e’,v), } AP’MN := APMN { v.f }

v = e AP’M := APM {v. | e. APM}AP’MN := APMN - { v. | e. APMN}

v.f = e AP’M := APM {v.f. | e. APM}AP’MN := APMN - {v.f | e APMN}May’ := May (v.f. AP’M and p. mayAlias(v,p) p.f. APM )

= set of all possible access paths in the program

Page 18: 1 Lecture 08(b) – Typestate Verification Eran Yahav

class SocketHolder {Socket s; }Socket makeSocket() { return new Socket(); // A }open(Socket s) { s.connect();}talk(Socket s) { s.getOutputStream()).write(“hello”); }dispose(Socket s) { h.s.close(); }main() { while(…) { SocketHolder h = new SocketHolder(); h.s = makeSocket(); Socket s = makeSocket(); open(h.s); talk(h.s); dispose(h.s); open(s); talk(s); }}

<A, init, U, {h.s}, ¬May, {} >

<A, init , ¬U, {h.s}, ¬May, {} > …

<A, connected , ¬U, {h.s}, ¬May, {} >

Access Path Abstraction

18

Page 19: 1 Lecture 08(b) – Typestate Verification Eran Yahav

class SocketHolder { Socket s; }

Socket makeSocket() { return new Socket(); // A }

open(Socket l) {

l.connect();

}talk(Socket s) { s.getOutputStream()).write(“hello”); }dispose(Socket s) { h.s.close(); }main() { Set<SocketHolder> set = new HashSet<SocketHolder>(); while(…) { SocketHolder h = new SocketHolder(); h.s = makeSocket(); set.add(h); } ; for (Iterator<SocketHolder> it = set.iterator(); …) { Socket g = it.next().s; open(g); talk(g); dispose(g); }}

What can we do when we lose access-path information?

<A, init, U, {h.s}, ¬May>

<A, init , U, {h.s}, May>

<A, init, ¬U, {}, May >

<A, init, ¬U, {}, May >, <A, connected, ¬U, {}, May >

<A, err, ¬U, {}, May > …

19

Page 20: 1 Lecture 08(b) – Typestate Verification Eran Yahav

20

Access Paths with FocusAS := { < Abstract Object, TypeState, Unique, Must, May,

MustNot> }

Access Path Must Abstraction + MustNot := set of access paths that must not point to the object

Focus operation when “interesting” things happen “case splitting” e.op() on A, T, u, Must, May, MustNot, generate 2

factoids: A, d(T), u, Must U {e}, May, MustNot A, T, u, Must, May, MustNot U {e}

Interesting Operations Typestate changes

Allows k-limiting. Crucial for scalability Allowed to limit exponential blowup due to focus

Current heuristic: discard MustNot before each focus operation TODO: More general solution (“blur”, “normalization”)

Works sometimes (95.6%)

Page 21: 1 Lecture 08(b) – Typestate Verification Eran Yahav

class SocketHolder { Socket s; }Socket makeSocket() { return new Socket(); // A }open(Socket t) { t.connect();}talk(Socket s) { s.getOutputStream().write(“hello”); }dispose(Socket s) { h.s.close(); }main() { Set<SocketHolder> set = new HashSet<SocketHolder>(); while(…) { SocketHolder h = new SocketHolder(); h.s = makeSocket(); set.add(h); } for (Iterator<SocketHolder> it = set.iterator(); …) { Socket g = it.next().s; open(g); talk(g); }}

Access Path FocusAbstraction { < Abstract Object, TypeState, UniqueBit, MustSet, MayBit, MustNotSet> }

<A, init, U, {h.s}, ¬May, {}>

<A, init , U, {h.s}, May ,{}>

<A, init, ¬U, {}, May , {}>

<A, init, ¬U, {}, May , {¬g}>, <A, connected, ¬U, {g}, May, {}>

<A, init, ¬U, {}, May, {} >

<A, init, ¬U, {}, May, {¬ t} >, <A, connected, ¬U, {t}, May, {}><A, init, ¬U, {}, May , {¬g,¬s}>, <A, connected,

¬U, {g,s}, May, {}>

21

Access Path with Focus

Page 22: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Implementation Details Matter

Sparsification

Separation (solve for each abstract object separately)

“Pruning”: discard branches of supergraph that cannot affect abstract semantics Reduces median supergraph size by 50X

PreliminaryPointer Analysis / Call Graph Construction

Details matter a lot

if context-insensitive preliminary, stages time out, terrible precision

Current implementation: Subset-based, field-sensitive Andersen’s SSA local representation On-the-fly call graph construction Unlimited object sensitivity for

– Collections– Containers of typestate objects (e.g. IOStreams)

One-level call-string context for some library methods Heuristics for reflection (e.g. Livshits et al 2005)

22

Page 23: 1 Lecture 08(b) – Typestate Verification Eran Yahav

11 typestate properties from Java standard libraries 17 moderate-sized benchmarks [~5K – 100K LOC]

Sources of False Positives Limitations of analysis

– Aliasing– Path sensitivity– Return values

Limitations of typestate abstraction – Application logic bypasses DFA, still OK

0

10

20

30

40

50

60

70

80

90

100

TOTAL

War

nin

gs/

Po

ten

tial

Err

Sta

tem

ents

(%

) FI LocalFocus Base

Unique APMust APFocus

Precision

23

Page 24: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Running time

0

200

400

600

800

1000

APFocus

APMust

Unique

Base

LocalFocus

Setup

IBM Intellistation Z pro 2x3GHz Xeon3.62 GB RAM/ Win XPIBM J2RE 1.4.2 / -Xmx800M

APMust = 1677APFocus = 4275

24

Page 25: 1 Lecture 08(b) – Typestate Verification Eran Yahav

Some Related Work

ESP Das et al. PLDI 2002

Two-phase approach to aliasing (unsound strong updates) Path-sensitivity (“property simulation”)

Dor et al. ISSTA 2004 Integrated typestate and alias analysis Tracks overapproximation of May aliases

Type Systems Vault/Fugue

Deline and Fähndrich 04:adoption and focus CQUAL

Foster et al. 02: linear types Aiken et al. 03: restrict and confine

Alias Analysis Landi-Ryder 92, Choi-Burke-Carini 93, Emami-Ghiya-Hendren 95,

Wilson-Lam 95, …. Shape Analysis: Chase-Wegman-Zadeck 90, Hackett-Rugina 05,

Sagiv-Reps-Wilhelm 99, …25