Workshop on Dynamic Analysis 2009
Aditya Thakur Rathijit Sen Ben Liblit Shan Lu
University of Wisconsin–Madison
Cooperative Crug Isolation
read(x)
write(x)
Thread 1
Thread 2
Race!
read(x)
read(x)
write(x)
Thread 1
write(x)
Thread 2
Atomicity violation!
(concurrency bug)
threaded.exe
file.in
threaded.exe
file.in
developer
user
Non-determinism!
More cores More threads
More crugs
Simplified crug from PBZIP2
Thread 1: lock(mut); unlock(mut);
Thread 2: mut = NULL;
Global variables are shown in bold (here, mut).
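A minimal sketch of why the interleaving matters, assuming that Thread 1's unlock(mut) fails when Thread 2 has already set mut to NULL. The names fake_mutex and replay_crug are hypothetical and not from PBZIP2. The replay runs on a single thread so that the interleaving is fixed rather than left to the scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex: it only tracks whether it is held. */
typedef struct { bool held; } fake_mutex;

/* Replays the simplified crug deterministically.
 * If thread2_runs_between is true, Thread 2's "mut = NULL" executes
 * between Thread 1's lock and unlock.
 * Returns true if Thread 1's unlock would be handed a NULL pointer. */
bool replay_crug(bool thread2_runs_between)
{
    fake_mutex storage = { false };
    fake_mutex *mut = &storage;   /* a global in the original program */

    assert(mut != NULL);          /* valid before the race window opens */
    mut->held = true;             /* Thread 1: lock(mut);   */

    if (thread2_runs_between)
        mut = NULL;               /* Thread 2: mut = NULL;  */

    if (mut == NULL)
        return true;              /* Thread 1: unlock(mut) on NULL */

    mut->held = false;            /* Thread 1: unlock(mut); */
    return false;
}
```

Calling replay_crug(false) models a successful run and replay_crug(true) models a failing one. The program text is identical in both cases and only the interleaving differs.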
Identify root cause of crug
Current techniques
Not scalable, high overhead
Report benign crugs
Target specific types of crugs and synchronization
Cooperative Crug Isolation
Scalable, low overhead
Does not report benign crugs
Multiple types of crugs and synchronization
Cooperative Bug Isolation (CBI)
Program Source
Compiler
Sampler
Predicates
Shipping Application
Counts & success/failure outcomes
Statistical Debugging
Top bugs with likely causes
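The slides do not say how the Statistical Debugging step ranks predicates. As a hedged sketch, the code below uses the Increase score described in the CBI literature: Failure(P) minus Context(P), where Failure(P) is the fraction of runs with P true that failed, and Context(P) is the fraction of runs that merely observed P's site and failed. The struct layout and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Per-predicate run counts gathered from user reports (hypothetical layout). */
typedef struct {
    const char *name;
    unsigned true_fail;  /* failing runs in which the predicate was true    */
    unsigned true_succ;  /* successful runs in which the predicate was true */
    unsigned obs_fail;   /* failing runs in which its site was observed     */
    unsigned obs_succ;   /* successful runs in which its site was observed  */
} pred_counts;

/* Increase(P) = Failure(P) - Context(P); 0.0 if there is no data. */
double increase_score(const pred_counts *p)
{
    assert(p != NULL);
    unsigned t = p->true_fail + p->true_succ;
    unsigned o = p->obs_fail + p->obs_succ;
    if (t == 0 || o == 0)
        return 0.0;
    double failure = (double)p->true_fail / t;
    double context = (double)p->obs_fail / o;
    return failure - context;
}

/* Index of the highest-scoring predicate, or -1 if n is 0. */
int top_predictor(const pred_counts *preds, size_t n)
{
    int best = -1;
    double best_score = 0.0;
    for (size_t i = 0; i < n; i++) {
        double s = increase_score(&preds[i]);
        if (best < 0 || s > best_score) {
            best = (int)i;
            best_score = s;
        }
    }
    return best;
}
```

A predicate that is true equally often in successful and failing runs scores zero under this scheme, so it would not surface as a bug predictor.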
CBI predicates inadequate for crug isolation.
Values of predicates same for successful and failing runs.
CBI sampling inadequate for crug isolation.
Sampling thread-local, independent.
CBI was unable to diagnose crugs in any of the benchmarks used.
No bug predictors reported!
Cooperative Crug Isolation (CCI)
CCI extends the CBI framework to target crugs
New predicate capturing interleaving events
New cross-thread sampling scheme
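Predicate Design
Thread 1: lock(mut); S: unlock(mut);
Thread 2: mut = NULL;
remote_S is true when the previous access to mut came from a different thread (the failing interleaving).
local_S is true when the previous access to mut came from the same thread (the successful interleaving).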
Predicate Instrumentation
At runtime, maintain a hashtable which maps addresses to the thread id which last accessed them.
Address     Thread Id
0xb1ab1a    1
0xf00f00    2
0xb1af00    1
lock(glock);
differs = test_and_insert(&x, curTid);
record(S, differs);
access(x);
unlock(glock);
curTid is the thread id of the currently executing thread.
test_and_insert checks whether curTid was the thread which previously accessed x, sets differs to true if it was not, and updates the hashtable.
record increments the counter for the predicate at S.
lock(glock) and unlock(glock) make the block execute atomically.
Handles accesses through pointers. No need for static pointer analysis.
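The instrumentation above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the table is a fixed-size open-addressing array, a first access to an address counts as local, and TABLE_SIZE, NUM_SITES, instrumented_store and the counter arrays are hypothetical names. The thread id is passed in explicitly to keep the sketch easy to exercise.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024   /* hypothetical capacity, must be a power of two */
#define NUM_SITES  16     /* hypothetical number of instrumented sites     */

typedef struct { const void *addr; int tid; } entry;  /* addr NULL = empty */

static entry table[TABLE_SIZE];
static unsigned remote_count[NUM_SITES];
static unsigned local_count[NUM_SITES];
static pthread_mutex_t glock = PTHREAD_MUTEX_INITIALIZER;

/* Returns true if addr was last accessed by a different thread, then
 * records tid as the last accessor. Assumption: a first access, or an
 * access that cannot be tracked because the table is full, is local. */
static bool test_and_insert(const void *addr, int tid)
{
    size_t start = ((uintptr_t)addr >> 3) & (TABLE_SIZE - 1);
    for (size_t k = 0; k < TABLE_SIZE; k++) {
        entry *e = &table[(start + k) & (TABLE_SIZE - 1)];
        if (e->addr == addr) {
            bool differs = (e->tid != tid);
            e->tid = tid;
            return differs;
        }
        if (e->addr == NULL) {
            e->addr = addr;
            e->tid = tid;
            return false;
        }
    }
    return false;
}

/* Increments the remote or local counter for the predicate at a site. */
static void record(int site, bool differs)
{
    if (differs)
        remote_count[site]++;
    else
        local_count[site]++;
}

/* Instrumented version of "*x = value" at the given site. */
void instrumented_store(int site, int *x, int value, int cur_tid)
{
    assert(site >= 0 && site < NUM_SITES);
    pthread_mutex_lock(&glock);
    bool differs = test_and_insert(x, cur_tid);
    record(site, differs);
    *x = value;                      /* access(x) */
    pthread_mutex_unlock(&glock);
}
```

Because the table is keyed on the runtime address, an access through a pointer is tracked the same way as a direct access, which matches the slide's point that no static pointer analysis is needed.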
Sampling Mechanism
Instrumented access:
lock(glock);
differs = test_and_insert(&x, curTid);
record(S, differs);
access(x);
unlock(glock);
Sampling logic:
if (gsample == 0)
    access(x);
    gsample = curTid; insert(&x, curTid);
else
    if (gsample == curTid) gsample = 0; clear();
Is sampling on? If gsample == 0, sampling is not on: turn on sampling and update the hashtable.
Otherwise sampling is already on: if the current thread initiated sampling, stop sampling and clear the hashtable.
Example on the PBZIP2 crug:
Initially: gsample = 0; hashtable is empty.
Thread 1 executes lock(mut): gsample = 1; hashtable: &mut -> 1.
Thread 2 executes mut = NULL: gsample = 1; hashtable: &mut -> 2.
Thread 1 executes S: unlock(mut): hashtable holds &mut -> 2, so record remote_S is true.
Thread 1 initiated sampling, so stop sampling: gsample = 0; hashtable is cleared.
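The sampling steps above can be sketched as follows. This is a hedged reading of the slides rather than the authors' code. It assumes that predicates are recorded only while sampling is on, that sampling starts at the first access seen while gsample is 0, and that the program's own access is performed by the caller. MAX_TRACKED, NUM_SITES and sampled_access are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 64   /* hypothetical table capacity        */
#define NUM_SITES   4    /* hypothetical number of sites       */

typedef struct { const void *addr; int tid; } entry;

static entry table[MAX_TRACKED];
static size_t table_len;          /* 0 means the table is cleared */
static int gsample;               /* 0 = sampling off, else initiating tid */
static unsigned remote_count[NUM_SITES];
static unsigned local_count[NUM_SITES];
static pthread_mutex_t glock = PTHREAD_MUTEX_INITIALIZER;

/* True if addr was last accessed by another thread; updates the table. */
static bool test_and_insert(const void *addr, int tid)
{
    for (size_t i = 0; i < table_len; i++) {
        if (table[i].addr == addr) {
            bool differs = (table[i].tid != tid);
            table[i].tid = tid;
            return differs;
        }
    }
    if (table_len < MAX_TRACKED) {
        table[table_len].addr = addr;
        table[table_len].tid = tid;
        table_len++;
    }
    return false;
}

/* Called at each instrumented access; thread ids must be nonzero. */
void sampled_access(int site, const void *addr, int cur_tid)
{
    assert(cur_tid != 0 && site >= 0 && site < NUM_SITES);
    pthread_mutex_lock(&glock);
    if (gsample == 0) {
        gsample = cur_tid;                 /* turn on sampling */
        test_and_insert(addr, cur_tid);    /* update hashtable */
    } else {
        bool differs = test_and_insert(addr, cur_tid);
        if (differs)
            remote_count[site]++;
        else
            local_count[site]++;
        if (gsample == cur_tid) {          /* initiator stops sampling */
            gsample = 0;
            table_len = 0;                 /* clear hashtable */
        }
    }
    pthread_mutex_unlock(&glock);
}
```

Replaying the example with sites 0, 1 and 2 for lock(mut), mut = NULL and unlock(mut) leaves one remote observation at the unlock site and sampling switched off again.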
Experimental Evaluation
Benchmarks used: Apache HTTP server, PBZIP2; SPLASH-2: FFT, LU
Machine used: dual-core Intel P4
Questions answered: runtime overhead, accuracy of predictors
Runtime Overhead
Benchmark   No sampling   Sampling
Apache      25%           2%
PBZIP2      200%          7%
FFT         650%          25%
LU          1,300%        800%
Overhead compared to uninstrumented code
Low overheads for both real-world applications
Large difference between no sampling and sampling.
Predictor Accuracy
R: remote predicate, L: local predicate
Apache
Predictor                                    Function
R: buf->outcnt += len                        ap_buffered_log_writer()
PBZIP2
Predictor                                    Function
R: pthread_mutex_unlock(fifo->mut);          consumer_decompress()
FFT
Predictor                                    Function
R: Global->finishtime=finish                 SlaveStart()
R: Global->initdonetime=initdone             SlaveStart()
R: printf("..",Global->transtime[0]…)        main()
L: malloc(2*(rootN-1)*sizeof(double));       SlaveStart()
LU
Predictor                                    Function
R: Global->rf=rf                             OneSolve()
L: (Global->start).gsense=-lsense;           OneSolve()
Conclusion
• CCI is a low-overhead, scalable approach for root cause analysis of crugs
• Effective on two widely-deployed applications
• Simple predicates are effective because of the use of statistical models
Next time on
• What other events are useful for crug isolation?
• Scope for static analysis to help?
• Other cross-thread sampling mechanisms (e.g. bursty sampling)?
• Crug isolation to crug tolerance?
Thank you!