28
Ezio Bartocci Vienna University of Technology Adaptive Runtime Verification in collaboration with: Radu Grosu, Scott A. Smolka, Scott D. Stoller, Erez Zadok, Justin Seyster Vienna University of Technology, Stony Brook University

Adaptive runtime verification_last_ezio

Embed Size (px)

Citation preview

Page 1: Adaptive runtime verification_last_ezio

Ezio BartocciVienna University of Technology

Adaptive Runtime

Verification

in collaboration with:

Radu Grosu, Scott A. Smolka, Scott D. Stoller,

Erez Zadok, Justin Seyster

Vienna University of Technology, Stony Brook University

Page 2: Adaptive runtime verification_last_ezio

ARV Overview

• New approach to runtime verification in which overhead control,

state estimation, and predictive analysis are synergistically

combined;

• Every monitor instance has an associated criticality level, which is a

measure of how ``close'' the instance is to violating property

underinvestigation;

• As criticality levels of monitor instances rise, so will fraction of

monitoring resources allocated to these instances, thereby

increasing probability of violation detection;

• There is a delicate interplay between (software) state estimation

and (monitoring) overhead control.

Page 3: Adaptive runtime verification_last_ezio

Outline

• Adaptive Runtime Verification Framework

• Runtime Verification with State Estimation

• Precomputing RVSE

• Predictive Analysis of Criticality Level

• Case Study

• Results

• Conclusions

Page 4: Adaptive runtime verification_last_ezio

Adaptive Runtime Verification Framework

Monitored System

InterAspect

Instrumentation

J. Seyster, K. Dixit, X. Huang, R. Grosu, K. Havelund, S.A. Smolka, S.D. Stoller and E. Zadok.

InterAspect: Aspect-Oriented Instrumentation with GCC. Formal Methods in System Design 2012

Page 5: Adaptive runtime verification_last_ezio

Adaptive Runtime Verification Framework

Monitored System

InterAspect

Primary ControllerMonitoring Framework

Secondary

Controller

Xiaowan Huang, Justin Seyster, Sean Callanan, Ketan Dixit, Radu Grosu, Scott A. Smolka, Scott D.

Stoller, Erez Zadok. Software monitoring with controllable overhead. STTT 14(3): 327-347 (2012)

Overhead: Low

Monitor: On

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: High

Monitor: Off

Software monitoring

with controllable overhead

Page 6: Adaptive runtime verification_last_ezio

Adaptive Runtime Verification Framework

Monitored System

InterAspect

Primary ControllerMonitoring Framework

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: High

Monitor: Off

Software monitoring

with controllable overhead

Real Trace

Monitor: On Off

DISP CMD SUCC CMD CMD FAIL

Monitored Trace DISP CMD SUCC CMD

m1 m2 m3

DISP

FAIL

SUCC CMD

CMD

DISP

CMD

DISP

FAIL

SUCCFAIL

SUCC

DFSM

Page 7: Adaptive runtime verification_last_ezio

Adaptive Runtime Verification Framework

Monitored System

InterAspect

Primary ControllerMonitoring Framework

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: Low

Monitor: On

Software monitoring

with controllable overhead

Real Trace

Monitor: On Off

DISP CMD SUCC CMD CMD FAIL SUCC

Monitored Trace DISP CMD SUCC CMD

m1 m2 m3

DISP

FAIL

SUCC CMD

CMD

DISP

CMD

DISP

FAIL

SUCCFAIL

SUCC

On

SUCC

DFSM ?

Page 8: Adaptive runtime verification_last_ezio

Adaptive Runtime Verification Framework

Monitored System

InterAspect

Primary ControllerMonitoring Framework

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: Low

Monitor: On

Runtime Verification with

State Estimation (RVSE)

Real Trace

Monitor: On Off

DISP CMD SUCC DISP CMD FAIL SUCC

Monitored Trace DISP CMD SUCC CMD

m1 m2 m3

DISP

FAIL

SUCC CMD

CMD

DISP

CMD

DISP

FAIL

SUCCFAIL

SUCC

On

SUCC

DFSMprob=0.73

Scott D. Stoller, Ezio Bartocci, Justin Seyster, Radu Grosu, Klaus Havelund, Scott A. Smolka, Erez Zadok.

Runtime Verification with State Estimation. RV 2011: 193-207 (2012) Best Paper Award RV2011

Learn an

HMM

P(CMD) = 1

P(DISP) = 0

P(SUCC) = 0

P(FAIL) = 0

P(CMD) = 0

P(DISP) = 1

P(SUCC) = 0

P(FAIL) = 0

P(CMD) = 0

P(DISP) = 0

P(SUCC) = 0.97

P(FAIL) = 0.03

DISP CMD SUCC CMD CMD SUCC CMD DISP DISP ……..Real Traces:

Compose with

the monitor

Estimate the

Error Probability (EP)

11

0.07

0.93

Page 9: Adaptive runtime verification_last_ezio

Adaptive Runtime Verification Framework

Monitored System

InterAspect

Primary ControllerMonitoring Framework

Secondary

ControllerOverhead: Low

Monitor: On

Secondary

ControllerOverhead: High

Secondary

ControllerOverhead: High

Adaptive Overhead Control

EP: Med Crit: Safe Monitor: Off EP: Low Crit: Safe EP: LowCrit:

CriticalMonitor: On

EP Error Probability

Crit. Criticality

m1 m2 m3

DISP

FAIL

SUCC CMD

CMD

DISP

FAIL

SUCC

Low Critical High Critical

Page 10: Adaptive runtime verification_last_ezio

Runtime Verification with State Estimation

1) Learning (offline) an HMM from a set of traces (Baum and Welch)

pT = 0.9999578… 3.9036e -12 4.2129e - 05éë

ùû

A =

5.7498e - 07 0.9999994… 1.0838e -11

7.9406e -12 0.7999385… 0.2000614…

0.9999990… 3.5693e -11 9.4621e - 07

é

ë

êêê

ù

û

úúú

B =

0.9987651¼ 2.1899e - 41 5.7327e - 07 0.0012343¼

3.8173e - 21 2.4628e -18 0.9996049¼ 3.9504e - 04

1.3882e - 40 0.9990775¼ 9.4410e - 07 9.2147e - 04

é

ë

êêê

ù

û

úúú

LOCK

s

1 s

2

s

3

UNLOCK PROT. UNPROT.

s

1 s

2 s

3

s

1

s

2

s

3

s

1

s

2

s

3

s

1 s

2 s

3

Page 11: Adaptive runtime verification_last_ezio

Runtime Verification with State Estimation

1) Learning (offline) an HMM from a set of traces (Baum and Welch)

2) Computing (offline) the gap distribution for a particular overhead

0 1 2 3 4 5 6 7

GAP LENGTH

0.45

0.110.1 0.09

0.08 0.06 0.06 0.05

Page 12: Adaptive runtime verification_last_ezio

Runtime Verification with State Estimation

1) Learning (offline) an HMM from a set of traces (Baum and Welch)

2) Computing (offline) the gap distribution for a particular overhead

3) Computing (online) the forward algorithm for RVSE

Page 13: Adaptive runtime verification_last_ezio

Runtime Verification with State Estimation

1) Learning (offline) an HMM from a set of traces (Baum and Welch)

2) Computing (offline) the gap distribution for a particular overhead

3) Computing (online) the forward algorithm for RVSE

Very expensive to compute !!!

Page 14: Adaptive runtime verification_last_ezio

Solution: Precomputing RVSE

Page 15: Adaptive runtime verification_last_ezio

Solution: Precomputing RVSE

O

1

m1 m2 m3

DFSM

dead

state

O

1

O

2

O

3

O

2

O

3

O

1,O

2,O

3

O

1

Page 16: Adaptive runtime verification_last_ezio

Solution: Precomputing RVSE

O

1

O

2

exact edge

Page 17: Adaptive runtime verification_last_ezio

Solution: Precomputing RVSE

O

1

O

2

Page 18: Adaptive runtime verification_last_ezio

Solution: Precomputing RVSE

O

1

O

2

O

3

Page 19: Adaptive runtime verification_last_ezio

Solution: Precomputing RVSE

O

1

O

2

O

3

Page 20: Adaptive runtime verification_last_ezio

Solution: Precomputing RVSE

O

1

O

2

O

3

O

1

approximate edge

Proof of Termination in the paper !!!

Page 21: Adaptive runtime verification_last_ezio

Predictive Analysis of Criticality

1) Compose (offline) an HMM with a DFSM to get a DTMC

pLOCK

= 3.8 10 21

pUNLOCK

= 2.4 10 18

pPROT .

= 0.99

pUNPROT .

= 3.9 10 4

s

3

a)#HMM# b)#DFSMs#

2# 3#1#

4#

LOCK

LOCK PROT.

UNPROT.

UNLOCK

PROT. UNPROT.

LOCK

UNLOCK

LOCK UNLOCK

PROT.

UNPROT.

PROT.

UNLOCK UNPROT.

s

1 s

2

0.79

2# 3#1#

4#

LOCK

PROT. UNPROT. UNPROT.

PROT. UNPROT.

LOCK

LOCK UNLOCK PROT.

UNPROT.

UNLOCK UNLOCK PROT.

LOCK

UNLOCK

3.56 10 11

0.20

3.9 10 12

pLOCK

= 1.3 10 40

pUNLOCK

= 0.99

pPROT .

= 9.4 10 7

pUNPROT .

= 9.2 10 4

pLOCK

= 0.99

pUNLOCK

= 2.1 10 41

pPROT .

= 5.7 10 7

pUNPROT .

= 1.2 10 3

5.74 10 7

0.99

7.9 10 12

0.99 1.08 10 11

0.99

4.2 10 5

9.46 10 7

pLOCK

= 3.8 10 21

pUNLOCK

= 2.4 10 18

pPROT .

= 0.99

pUNPROT .

= 3.9 10 4

s

3

a)#HMM# b)#DFSMs#

2# 3#1#

4#

LOCK

LOCK PROT.

UNPROT.

UNLOCK PROT.

UNPROT.

LOCK

UNLOCK

LOCK UNLOCK

PROT.

UNPROT.

PROT.

UNLOCK UNPROT.

s

1 s

2

0.79

2# 3#1#

4#

LOCK

PROT. UNPROT. UNPROT.

PROT. UNPROT.

LOCK

LOCK UNLOCK PROT.

UNPROT.

UNLOCK UNLOCK PROT.

LOCK

UNLOCK

3.56 10 11

0.20

3.9 10 12

pLOCK

= 1.3 10 40

pUNLOCK

= 0.99

pPROT .

= 9.4 10 7

pUNPROT .

= 9.2 10 4

pLOCK

= 0.99

pUNLOCK

= 2.1 10 41

pPROT .

= 5.7 10 7

pUNPROT .

= 1.2 10 3

5.74 10 7

0.99

7.9 10 12

0.99 1.08 10 11

0.99

4.2 10 5

9.46 10 7

HMM DFSM

Page 22: Adaptive runtime verification_last_ezio

Predictive Analysis of CriticalityDTMC%

(s1,m1)%%

(s2,m1)%%

(s0)%%

(s3,m1)%%

(s1,m2)%%

(s2,m2)%%

(s3,m2)%%

(s1,m3)%%

(s2,m3)%%

(s3,m3)%%

(s1,m4)%%

(s2,m4)%%

(s3,m4)%%

3242.81%%

8.98%%

3.99%%

5.00%%

3245.81%%

3240.81%%

3.98%%

1.00%%

3242.80%%

0%%

0%%

0%%

1) Compose (offline) an HMM

with a DFSM to get a DTMC

2) Compute (offline) the

expected distance ExpDist

Expected

Distance

… (*,m4)(s1,m1)WE CHECK IN PRISM: R=? [F m=4]= 3242.81

Page 23: Adaptive runtime verification_last_ezio

Linux Kernel Case Study

• Lock discipline: btrfs_space_info

• Protected fields are always accessed with a lock

• All btrfs_space_infoobjects can be accessed from any thread

struct btrfs_space_info {

/* Unprotected */

u64 flags;

/* Protected*/

u64 total_bytes;

spinlock_t lock;

};

Page 24: Adaptive runtime verification_last_ezio

Monitor

Lock Unlock

Protected,Unprotected,Unlock

Protected,Unprotected,Lock

Unprotected,Unlock

Lock

(All events)

Protected

InitLock Held

Error

Lock Released

Page 25: Adaptive runtime verification_last_ezio

HMM

Page 26: Adaptive runtime verification_last_ezio

Instrumentation

• Access instrumented using GCC plug-ins

• Function-body duplication used to enable/disable instrumentation

Function Entrance

Distributor

Instrumented function body

Uninstrumentedfunction body

Page 27: Adaptive runtime verification_last_ezio

Hardware Supervision

• 100% monitoring for a limited number of objects

– One per thread in our prototype

• Provided by hardware debug registers

• Used to monitor highest priority objects

– Highest priority = closest to error state (criticality)

Page 28: Adaptive runtime verification_last_ezio

Results

SamplingProb.

No supervision Random supervision Adaptive supervision

FalseAlarm ErrDet FalseAlarm ErrDet FalseAlarm ErrDet

50% 30.3 23.0% 11.7 57.4% 12 50.1%

75% 47 31.2% 36 69.3% 17 79.4%

85% 5502 34.1% 5606 72.3% 5449 85.1%

FalseAlarm:For a run with no errors, how many objects had a high error probability (> 80%) at the end of the run?

These are false positives.

ErrDet:Consider an error “detected” if the object had a high error probability (> 80%) after the error occurred.

We ran this test with manually inserted errors and measured what percent were detected.

• With high enough sampling, Adaptive Supervision does better than supervising randomly

• We are still investigating why false positive rates get high for high sampling rates