Adaptive Runtime Verification
Ezio Bartocci, Vienna University of Technology
in collaboration with:
Radu Grosu, Scott A. Smolka, Scott D. Stoller, Erez Zadok, Justin Seyster
Vienna University of Technology, Stony Brook University
ARV Overview
• A new approach to runtime verification in which overhead control, state estimation, and predictive analysis are synergistically combined;
• Every monitor instance has an associated criticality level, a measure of how "close" the instance is to violating the property under investigation;
• As the criticality levels of monitor instances rise, so does the fraction of monitoring resources allocated to those instances, thereby increasing the probability of violation detection;
• There is a delicate interplay between (software) state estimation and (monitoring) overhead control.
Outline
• Adaptive Runtime Verification Framework
• Runtime Verification with State Estimation
• Precomputing RVSE
• Predictive Analysis of Criticality Level
• Case Study
• Results
• Conclusions
Adaptive Runtime Verification Framework
[Figure: the Monitored System is instrumented with InterAspect.]
J. Seyster, K. Dixit, X. Huang, R. Grosu, K. Havelund, S.A. Smolka, S.D. Stoller and E. Zadok.
InterAspect: Aspect-Oriented Instrumentation with GCC. Formal Methods in System Design 2012
Adaptive Runtime Verification Framework
[Figure: the Monitored System, instrumented via InterAspect, feeds the Monitoring Framework; a Primary Controller allocates overhead among per-monitor Secondary Controllers.]
Xiaowan Huang, Justin Seyster, Sean Callanan, Ketan Dixit, Radu Grosu, Scott A. Smolka, Scott D.
Stoller, Erez Zadok. Software monitoring with controllable overhead. STTT 14(3): 327-347 (2012)
[Each Secondary Controller holds an overhead budget and a monitor switch, e.g. Overhead: Low / Monitor: On; Overhead: High / Monitor: Off; ...]
Software monitoring
with controllable overhead
[Figure: real trace DISP CMD SUCC CMD CMD FAIL; with the monitor toggled On and Off, the monitored trace is only DISP CMD SUCC CMD. A DFSM with states m1, m2, m3 and transitions on DISP, CMD, SUCC, FAIL consumes the monitored events.]
[Figure: real trace DISP CMD SUCC CMD CMD FAIL SUCC; monitored trace DISP CMD SUCC CMD. When the monitor comes back On and observes SUCC, the preceding gap leaves the DFSM state (m1, m2, or m3) uncertain: "DFSM ?"]
Runtime Verification with
State Estimation (RVSE)
[Figure: real trace DISP CMD SUCC DISP CMD FAIL SUCC; with the monitor Off during the gap, only DISP CMD SUCC CMD is observed. RVSE estimates the monitor's DFSM state (m1, m2, m3) probabilistically, e.g. prob = 0.73.]
Scott D. Stoller, Ezio Bartocci, Justin Seyster, Radu Grosu, Klaus Havelund, Scott A. Smolka, Erez Zadok.
Runtime Verification with State Estimation. RV 2011: 193-207 (2012) Best Paper Award RV2011
[Figure: an HMM is learned from real traces such as DISP CMD SUCC CMD CMD SUCC CMD DISP DISP ... Each hidden state has emission probabilities, e.g. one state emits CMD with probability 1, another emits DISP with probability 1, and a third emits SUCC with probability 0.97 and FAIL with probability 0.03. The HMM is composed with the monitor to estimate the Error Probability (EP).]
Adaptive Runtime Verification Framework
Adaptive Overhead Control

[Figure: each Secondary Controller now tracks an error probability (EP) and a criticality level, e.g. EP: Med, Crit: Safe, Monitor: Off; EP: Low, Crit: Safe; EP: Low, Crit: Critical, Monitor: On. In the monitor DFSM (states m1, m2, m3), states are ranked from low critical to high critical by their distance to the error state. EP = Error Probability; Crit. = Criticality.]
Runtime Verification with State Estimation
1) Learning (offline) an HMM from a set of traces (Baum and Welch)
$$\pi^T = \begin{bmatrix} 0.9999578\ldots & 3.9036\times 10^{-12} & 4.2129\times 10^{-5} \end{bmatrix}$$

$$A = \begin{bmatrix}
5.7498\times 10^{-7} & 0.9999994\ldots & 1.0838\times 10^{-11} \\
7.9406\times 10^{-12} & 0.7999385\ldots & 0.2000614\ldots \\
0.9999990\ldots & 3.5693\times 10^{-11} & 9.4621\times 10^{-7}
\end{bmatrix}$$

$$B = \begin{bmatrix}
0.9987651\ldots & 2.1899\times 10^{-41} & 5.7327\times 10^{-7} & 0.0012343\ldots \\
3.8173\times 10^{-21} & 2.4628\times 10^{-18} & 0.9996049\ldots & 3.9504\times 10^{-4} \\
1.3882\times 10^{-40} & 0.9990775\ldots & 9.4410\times 10^{-7} & 9.2147\times 10^{-4}
\end{bmatrix}$$
[Figure: the learned HMM has hidden states s1, s2, s3 and observation symbols LOCK, UNLOCK, PROT., UNPROT. (the columns of B).]
Runtime Verification with State Estimation
1) Learning (offline) an HMM from a set of traces (Baum and Welch)
2) Computing (offline) the gap distribution for a particular overhead
Gap length   0     1     2     3     4     5     6     7
Probability  0.45  0.11  0.10  0.09  0.08  0.06  0.06  0.05
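The gap distribution for a given overhead level can be estimated empirically from sampled runs. A minimal sketch (`gap_distribution` is a hypothetical helper, not from the paper) counts how many events were missed between successive monitored events:

```python
from collections import Counter

def gap_distribution(monitored):
    """Estimate P(gap = k): the probability that k events were missed
    between two observed events. `monitored` is one boolean per event
    (True = the monitor was on and saw the event)."""
    gaps, run = [], 0
    for seen in monitored:
        if seen:
            gaps.append(run)   # gap (possibly 0) that preceded this observation
            run = 0
        else:
            run += 1
    counts = Counter(gaps)
    total = sum(counts.values())
    return {k: c / total for k, c in sorted(counts.items())}

# Example: a short trace where the monitor was off for some events
trace = [True, False, False, True, True, False, True]
print(gap_distribution(trace))  # {0: 0.5, 1: 0.25, 2: 0.25}
```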
Runtime Verification with State Estimation
1) Learning (offline) an HMM from a set of traces (Baum and Welch)
2) Computing (offline) the gap distribution for a particular overhead
3) Computing (online) the forward algorithm for RVSE
Runtime Verification with State Estimation
1) Learning (offline) an HMM from a set of traces (Baum and Welch)
2) Computing (offline) the gap distribution for a particular overhead
3) Computing (online) the forward algorithm for RVSE
Very expensive to compute !!!
Solution: Precomputing RVSE

[Figure: the belief states reachable under observation symbols O1, O2, O3 are precomputed into a lookup automaton over the monitor DFSM (states m1, m2, m3), including a dead state. An edge is exact when the successor belief is represented precisely, and approximate when it is rounded to a nearby precomputed belief.]
Proof of Termination in the paper !!!
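One way to realize the precomputation, as a sketch only (the paper's algorithm additionally tracks the approximation error introduced on approximate edges): quantize each reachable belief vector onto a finite grid and tabulate the forward-step successors, so that at runtime each observation is a single table lookup.

```python
from collections import deque

def forward(alpha, A, B, obs):
    """One normalized forward-algorithm step."""
    n = len(alpha)
    new = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs]
           for j in range(n)]
    z = sum(new)
    return [x / z for x in new] if z > 0 else alpha

def quantize(alpha, grid=0.05):
    """Round a belief vector onto a grid and renormalize; this rounding
    is what makes an edge 'approximate' rather than exact."""
    q = [round(x / grid) * grid for x in alpha]
    z = sum(q)
    return tuple(x / z for x in q) if z > 0 else tuple(alpha)

def precompute(A, B, symbols, start, grid=0.05):
    """Breadth-first construction of the lookup automaton: nodes are
    quantized belief vectors, edges are labeled with observation symbols.
    The grid admits only finitely many nodes, so construction terminates."""
    table = {}
    init = quantize(start, grid)
    queue, seen = deque([init]), {init}
    while queue:
        node = queue.popleft()
        for s in symbols:
            succ = quantize(forward(list(node), A, B, s), grid)
            table[(node, s)] = succ
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return table

A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.3], [0.1, 0.9]]
table = precompute(A, B, symbols=[0, 1], start=[0.5, 0.5])
# At runtime, each observation becomes one table lookup instead of a
# full forward-algorithm update.
```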
Predictive Analysis of Criticality
1) Compose (offline) an HMM with a DFSM to get a DTMC
[Figure: (a) the learned HMM with hidden states s1, s2, s3; transition probabilities include 0.99, 0.79, 0.20, 5.74e-07, 9.46e-07, 4.2e-05, 3.56e-11, 1.08e-11, 7.9e-12, and 3.9e-12. Per-state emission probabilities (the rows of B):
  s1: pLOCK = 0.99, pUNLOCK = 2.1e-41, pPROT. = 5.7e-07, pUNPROT. = 1.2e-03
  s2: pLOCK = 3.8e-21, pUNLOCK = 2.4e-18, pPROT. = 0.99, pUNPROT. = 3.9e-04
  s3: pLOCK = 1.3e-40, pUNLOCK = 0.99, pPROT. = 9.4e-07, pUNPROT. = 9.2e-04
(b) the lock-discipline DFSMs with states 1-4 over the alphabet LOCK, UNLOCK, PROT., UNPROT.]
Predictive Analysis of Criticality

1) Compose (offline) an HMM with a DFSM to get a DTMC
2) Compute (offline) the expected distance ExpDist

[Figure: the product DTMC with states (s0) and (si, mj) for i = 1..3, j = 1..4. Each state is annotated with its expected distance to the error state m4: ExpDist(s1,m1) = 3242.81, with values 8.98, 3.99, 5.00, 3245.81, 3240.81, 3.98, 1.00, and 3242.80 on the other transient states, and 0 on every (*, m4) state.]

Checked in PRISM: R=? [F m=4] = 3242.81 for the initial state (s1,m1).
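The reward query R=? [F m=4] that PRISM answers solves, for each transient state s, the equations ExpDist(s) = 1 + Σ_t P(s,t)·ExpDist(t), with ExpDist = 0 on the target states. A minimal fixed-point sketch on a toy three-state chain (numbers illustrative, not the DTMC above):

```python
def expected_distance(P, targets, iters=100000, tol=1e-12):
    """Expected number of steps to reach a target state in a DTMC.
    Solves x[s] = 1 + sum_t P[s][t] * x[t]  (with x = 0 on targets)
    by fixed-point iteration; a direct linear solve works equally well."""
    n = len(P)
    x = [0.0] * n
    for _ in range(iters):
        new = [0.0 if s in targets
               else 1.0 + sum(P[s][t] * x[t] for t in range(n))
               for s in range(n)]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x

# Toy chain: 0 -> 1 -> 2, where state 2 is the absorbing target
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]
print(expected_distance(P, {2}))  # approximately [4.0, 2.0, 0.0]
```

By hand: x1 = 1 + 0.5·x1 gives x1 = 2, and x0 = 1 + 0.5·x0 + 0.5·x1 gives x0 = 4, matching the fixed point.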
Linux Kernel Case Study
• Lock discipline for btrfs_space_info
• Protected fields are always accessed with the lock held
• All btrfs_space_info objects can be accessed from any thread
struct btrfs_space_info {
    /* Unprotected */
    u64 flags;
    /* Protected */
    u64 total_bytes;
    spinlock_t lock;
};
Monitor

[Figure: the monitor DFSM has states Init, Lock Held, Lock Released, and Error (which loops on all events). Edges are labeled with Lock, Unlock, Protected, and Unprotected; a Protected access without the lock held leads to Error.]
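Read as a transition table, one plausible reconstruction of this diagram runs as follows (the exact edges are in the slide figure; this particular reading is an assumption, marked as such in the comments):

```python
# Lock-discipline monitor, reconstructed loosely from the slide's diagram;
# the transition table below is our reading, not a verbatim copy.
ERROR = "Error"
TRANSITIONS = {
    ("Init", "Lock"): "Lock Held",
    ("Init", "Protected"): "Init",
    ("Init", "Unprotected"): "Init",
    ("Init", "Unlock"): "Init",
    ("Lock Held", "Unlock"): "Lock Released",
    ("Lock Held", "Protected"): "Lock Held",
    ("Lock Held", "Unprotected"): "Lock Held",
    ("Lock Held", "Lock"): "Lock Held",
    ("Lock Released", "Lock"): "Lock Held",
    ("Lock Released", "Unprotected"): "Lock Released",
    ("Lock Released", "Unlock"): "Lock Released",
    ("Lock Released", "Protected"): ERROR,  # protected access, lock not held
}

def run_monitor(events, state="Init"):
    for e in events:
        if state == ERROR:
            break                            # Error absorbs all events
        state = TRANSITIONS[(state, e)]
    return state

print(run_monitor(["Lock", "Protected", "Unlock", "Protected"]))  # Error
```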
Instrumentation
• Accesses are instrumented using GCC plug-ins
• Function-body duplication is used to enable/disable instrumentation

[Figure: at the function entrance, a distributor branches to either the instrumented or the uninstrumented copy of the function body.]
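The GCC plug-in performs the duplication at compile time; as a rough runtime analogue, the distributor pattern can be sketched as a dispatcher that checks a per-function flag on every call (`distributed`, `traced_access`, and the flag name are all hypothetical):

```python
import functools

def distributed(instrumented_body):
    """Mimic function-body duplication: the 'distributor' checks a flag
    on each call and dispatches to the instrumented or the plain body."""
    def decorator(func):
        @functools.wraps(func)
        def distributor(*args, **kwargs):
            if distributor.instrument:
                return instrumented_body(*args, **kwargs)
            return func(*args, **kwargs)
        distributor.instrument = False       # toggled by the controller
        return distributor
    return decorator

events = []

def traced_access(obj):
    events.append(("access", obj))           # emit an event to the monitor
    return obj

@distributed(traced_access)
def access(obj):
    return obj

access("a")             # uninstrumented: no event recorded
access.instrument = True
access("b")             # instrumented: event recorded
print(events)           # [('access', 'b')]
```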
Hardware Supervision
• 100% monitoring for a limited number of objects
– One per thread in our prototype
• Provided by hardware debug registers
• Used to monitor highest priority objects
– Highest priority = closest to error state (criticality)
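Since only a handful of debug registers exist, the controller must pick the top-priority objects. A minimal sketch of that selection, reusing the slide's ExpDist example values (the function and object names are hypothetical; smaller expected distance means higher criticality):

```python
import heapq

def assign_debug_registers(exp_dist, slots):
    """Pick the objects closest to the error state for the limited
    hardware watchpoints. `exp_dist` maps an object id to its expected
    distance to the error state; smaller distance = higher priority."""
    return heapq.nsmallest(slots, exp_dist, key=exp_dist.get)

dist = {"obj1": 3242.81, "obj2": 8.98, "obj3": 1.00, "obj4": 5.00}
print(assign_debug_registers(dist, slots=2))  # ['obj3', 'obj4']
```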
Results
Sampling   No supervision        Random supervision    Adaptive supervision
prob.      FalseAlarm  ErrDet    FalseAlarm  ErrDet    FalseAlarm  ErrDet
50%        30.3        23.0%     11.7        57.4%     12          50.1%
75%        47          31.2%     36          69.3%     17          79.4%
85%        5502        34.1%     5606        72.3%     5449        85.1%

FalseAlarm: for a run with no errors, the number of objects with a high error probability (> 80%) at the end of the run; these are false positives.
ErrDet: an error counts as "detected" if its object had a high error probability (> 80%) after the error occurred; we ran tests with manually inserted errors and measured the percentage detected.
• With a high enough sampling rate, adaptive supervision outperforms random supervision
• We are still investigating why false-alarm counts become so high at high sampling rates