DTHREADS: Efficient Deterministic Multithreading. Tongping Liu, Charlie Curtsinger, and Emery D. Berger, Dept. of Computer Science, University of Massachusetts, Amherst. Presented by: Lokesh Gidra

Page 1: D THREADS : Efficient Deterministic Multithreading

DTHREADS: Efficient Deterministic Multithreading

Tongping Liu, Charlie Curtsinger, and Emery D. Berger

Dept. of Computer Science, University of Massachusetts, Amherst

Presented by: Lokesh Gidra

Page 2: D THREADS : Efficient Deterministic Multithreading

Concurrent Programming is hard!

• Prone to deadlocks and race conditions.
• Thread interleavings are non-deterministic, which makes programs hard to debug!
• A Deterministic Multithreaded System (DMT) eliminates this non-determinism: the same program with the same input gives the same result.
  – Simplifies debugging.
  – Simplifies record and replay (eliminates the need to track memory operations).
  – Allows multiple replicated executions for fault tolerance.

Page 3: D THREADS : Efficient Deterministic Multithreading

Contributions

• DTHREADS guarantees deterministic execution.

• Straightforward deployment: replaces libpthread. No recompilation required.

• Eliminates cache-line false sharing (as a side effect).

• Makes printf debugging practical!

Page 4: D THREADS : Efficient Deterministic Multithreading

Basic Idea

• Isolated memory access between different threads.

• Replace threads with processes.
  – Replace pthread_create() with the clone system call.
  – Memory mapped files are used to share memory (globals and the heap).

[Figure: Thread 1 and Thread 2 with the Heap]
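The memory-isolation idea above can be sketched with two mappings of the same file. This is only an illustration, not the Dthreads implementation: the function name `private_write_demo`, the temporary file path, and the use of a `MAP_PRIVATE` (copy-on-write) view for the isolated side are assumptions made for the example. It shows that a write through the private view stays invisible in the shared view until it is explicitly copied over.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096

/* Hypothetical helper: maps one file twice, writes through the private
 * view, and reports what the shared view holds before and after the
 * private value is copied ("committed") into it.
 * Returns 0 on success, -1 on any system call failure. */
int private_write_demo(char *seen_before_commit, char *seen_after_commit)
{
    assert(seen_before_commit && seen_after_commit);

    char path[] = "/tmp/dmt_sketch_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                             /* file lives until fd is closed */
    if (ftruncate(fd, REGION_SIZE) != 0) { close(fd); return -1; }

    /* Shared view: stands in for the globally visible heap. */
    char *shared = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    /* Private view: writes are copy-on-write and stay local. */
    char *priv = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);
    if (shared == MAP_FAILED || priv == MAP_FAILED) {
        if (shared != MAP_FAILED) munmap(shared, REGION_SIZE);
        if (priv != MAP_FAILED) munmap(priv, REGION_SIZE);
        close(fd);
        return -1;
    }

    shared[0] = 'A';                 /* globally visible state */
    priv[0] = 'B';                   /* isolated write, not yet visible */
    *seen_before_commit = shared[0];

    shared[0] = priv[0];             /* explicit commit of the local change */
    *seen_after_commit = shared[0];

    munmap(priv, REGION_SIZE);
    munmap(shared, REGION_SIZE);
    close(fd);
    return 0;
}
```

A caller would pass two `char` out-parameters and compare them: the first should still hold the old shared value, the second the committed one.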

Page 5: D THREADS : Efficient Deterministic Multithreading

Fence and Global Token

Page 6: D THREADS : Efficient Deterministic Multithreading

Commit Protocol
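
The figure for this slide is not reproduced here. The Take Away slide states that Dthreads uses twin pages and diff-ing to commit changes; the following is a minimal sketch of that idea under stated assumptions. The name `commit_page`, the byte-granularity comparison, and the fixed 4K `PAGE_SIZE` are illustrative choices, not details taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Hypothetical commit step for one page.
 *   twin:    snapshot of the page taken before the thread modified it
 *   working: the thread's private, modified copy
 *   shared:  the globally visible copy
 * Only bytes that differ from the twin are written to the shared copy,
 * so changes that other threads committed to other bytes survive.
 * Returns the number of bytes committed. */
size_t commit_page(unsigned char *shared,
                   const unsigned char *working,
                   const unsigned char *twin)
{
    assert(shared && working && twin);
    size_t committed = 0;
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        if (working[i] != twin[i]) {
            shared[i] = working[i];
            committed++;
        }
    }
    return committed;
}
```

If two threads modify different bytes of the same page and commit one after the other, the shared page ends up with both changes. When both modify the same byte, the later commit wins, so a fixed commit order is what makes the result repeatable.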

Page 7: D THREADS : Efficient Deterministic Multithreading

Deterministic Synchronization (Global token is the key!)

• Locks
  – If held by someone else, pass the token.
  – Release the token only when lock count is 0.
• Condition Variables
  – pthread_cond_wait: remove self from the token’s Q and add to the variable’s Q.
  – pthread_cond_signal: remove the first thread in the variable’s Q and add it to the token’s Q.
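
The lock and condition-variable rules above can be sketched as bookkeeping on queues of thread ids. This is a single-threaded simulation, not the Dthreads code: `tid_queue`, `sketch_lock`, and every function name below are hypothetical, and actual blocking and token hand-off are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 16

/* FIFO of thread ids; stands in for the token Q and a variable's Q. */
typedef struct { int ids[MAX_THREADS]; size_t len; } tid_queue;

static void q_push(tid_queue *q, int tid)
{
    assert(q->len < MAX_THREADS);
    q->ids[q->len++] = tid;
}

/* Removes the first occurrence of tid; returns 1 if it was present. */
static int q_remove(tid_queue *q, int tid)
{
    for (size_t i = 0; i < q->len; i++) {
        if (q->ids[i] == tid) {
            for (size_t j = i + 1; j < q->len; j++) q->ids[j - 1] = q->ids[j];
            q->len--;
            return 1;
        }
    }
    return 0;
}

/* owner == -1 means the lock is free. */
typedef struct { int owner; } sketch_lock;

/* Called by the current token holder. Returns 1 if the lock was taken;
 * 0 means it is held by someone else and the caller should pass the token. */
int sketch_lock_acquire(sketch_lock *l, int tid, int *held_locks)
{
    if (l->owner != -1) return 0;
    l->owner = tid;
    (*held_locks)++;
    return 1;
}

/* Returns 1 when the caller's lock count reached 0, i.e. the token
 * may now be released. */
int sketch_lock_release(sketch_lock *l, int *held_locks)
{
    l->owner = -1;
    (*held_locks)--;
    return *held_locks == 0;
}

/* cond_wait: move the caller from the token Q to the variable's Q. */
void sketch_cond_wait(tid_queue *token_q, tid_queue *cond_q, int tid)
{
    if (q_remove(token_q, tid)) q_push(cond_q, tid);
}

/* cond_signal: move the first waiter back to the token Q.
 * Returns the woken thread id, or -1 if nobody was waiting. */
int sketch_cond_signal(tid_queue *token_q, tid_queue *cond_q)
{
    if (cond_q->len == 0) return -1;
    int tid = cond_q->ids[0];
    q_remove(cond_q, tid);
    q_push(token_q, tid);
    return tid;
}
```

Because every queue is FIFO and only the token holder changes it, the order in which waiters are woken depends only on the order of the synchronization calls, not on timing.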

Page 8: D THREADS : Efficient Deterministic Multithreading

Contd…

• Barriers (similar to condition variables)
  – If not last to enter: move self from token Q to barrier Q.
  – Otherwise, move all from barrier Q to token Q.
• Thread Creation
  – Child: place on token Q; wait for the parallel phase.
• Thread Exit/Cancellation
  – Remove from Q, call pthread_exit()/kill().
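
The barrier rule above can be sketched in the same queue-bookkeeping style. As before, this is a single-threaded simulation with hypothetical names (`tid_queue`, `sketch_barrier`, `sketch_barrier_wait`); it only tracks which queue each thread id sits on.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 16

/* FIFO of thread ids; stands in for the token Q and the barrier Q. */
typedef struct { int ids[MAX_THREADS]; size_t len; } tid_queue;

static void q_push(tid_queue *q, int tid)
{
    assert(q->len < MAX_THREADS);
    q->ids[q->len++] = tid;
}

/* Removes the first occurrence of tid; returns 1 if it was present. */
static int q_remove(tid_queue *q, int tid)
{
    for (size_t i = 0; i < q->len; i++) {
        if (q->ids[i] == tid) {
            for (size_t j = i + 1; j < q->len; j++) q->ids[j - 1] = q->ids[j];
            q->len--;
            return 1;
        }
    }
    return 0;
}

typedef struct {
    tid_queue waiting;   /* the barrier Q */
    size_t parties;      /* number of threads the barrier waits for */
} sketch_barrier;

/* Returns 0 if the caller was parked on the barrier Q, or 1 if it was
 * the last to enter and moved every waiter back to the token Q. */
int sketch_barrier_wait(tid_queue *token_q, sketch_barrier *b, int tid)
{
    if (b->waiting.len + 1 < b->parties) {
        /* Not last: move self from token Q to barrier Q. */
        if (q_remove(token_q, tid)) q_push(&b->waiting, tid);
        return 0;
    }
    /* Last to enter: move all waiters back, in arrival order. */
    for (size_t i = 0; i < b->waiting.len; i++) q_push(token_q, b->waiting.ids[i]);
    b->waiting.len = 0;
    return 1;
}
```

With three parties arriving in the order 1, 2, 3, the token Q ends up as 3, 1, 2: the last arriver stays where it was and the waiters rejoin behind it in arrival order.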

Page 9: D THREADS : Efficient Deterministic Multithreading

Memory Allocation and OS Support

• Assign sub-heap to each thread using deterministic thread index.

• Superblocks are allocated under locks, so allocation is deterministic.

• Intercepts system calls which affect program execution (like sigwait).

• Intercepts read/write system calls: touch pages for COW, to avoid segfault.
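
The sub-heap bullet above can be illustrated with a toy bump allocator. This is a sketch under assumptions, not the Dthreads allocator: `sketch_heap`, `sketch_alloc`, the sizes, and the 16-byte rounding are all made up for the example. The point is that the returned offset depends only on the thread index and on that thread's own allocation history.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SUBHEAPS 4
#define SUBHEAP_SIZE 1024

/* One arena split into fixed-size sub-heaps, one per thread index. */
typedef struct {
    unsigned char arena[NUM_SUBHEAPS * SUBHEAP_SIZE];
    size_t used[NUM_SUBHEAPS];   /* bytes handed out per sub-heap */
} sketch_heap;

/* Returns the offset of the new block inside the arena, or (size_t)-1
 * if the request does not fit in the caller's sub-heap. */
size_t sketch_alloc(sketch_heap *h, unsigned thread_index, size_t size)
{
    assert(h != NULL);
    if (size > SUBHEAP_SIZE) return (size_t)-1;

    size_t sub = thread_index % NUM_SUBHEAPS;        /* deterministic choice */
    size_t rounded = (size + 15u) & ~(size_t)15u;    /* 16-byte granularity */
    if (rounded > SUBHEAP_SIZE - h->used[sub]) return (size_t)-1;

    size_t offset = sub * SUBHEAP_SIZE + h->used[sub];
    h->used[sub] += rounded;
    return offset;
}
```

Two runs that interleave the threads' allocation calls differently still hand each thread the same offsets, because no sub-heap is shared between thread indices.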

Page 10: D THREADS : Efficient Deterministic Multithreading

Performance

• On an 8-core machine with 16GB RAM, 4MB L2.
• Benchmarks from the PARSEC and Phoenix suites.

For 9 of 14 benchmarks, Dthreads runs nearly as fast as or faster than pthreads, while providing determinism.

Page 11: D THREADS : Efficient Deterministic Multithreading

Scalability

• Scales nearly as well as or better than pthreads.
• Almost always scales as well as or better than CoreDet.

Page 12: D THREADS : Efficient Deterministic Multithreading
Page 13: D THREADS : Efficient Deterministic Multithreading

Limitations

• Incurs substantial overhead for apps with a large number of:
  – short-lived transactions.
  – modified pages per transaction.
• No control over external non-determinism.
• Apps using ad-hoc synchronization are not supported.
• Sharing of stack variables is not supported.
• Increases the program’s memory footprint.
• Will perform poorly if #threads > #cores.

Page 14: D THREADS : Efficient Deterministic Multithreading

Personal Observations (side-effects on NUMA systems)

• Substantially reduces TLB miss cost:
  – For 64-bit apps, one TLB miss:
    • Pthreads: ~1500 cycles
    • Dthreads: ~500 cycles
• Diff-ing will be too expensive:
  – 4K as compared to just a few cache lines.

Page 15: D THREADS : Efficient Deterministic Multithreading

Take Away

• Deterministic Multithreaded Systems are good.
• Dthreads: an easy-to-deploy DMT system.
• Supports all pthread APIs.
• Replaces threads with processes for memory isolation.
• Uses twin pages and diff-ing to commit changes.
• Avoids cache-line false sharing.
• Good for apps with fewer transactions.
  – Or, can we say for scalable apps?
• Doesn’t support ad-hoc synchronization.

Page 16: D THREADS : Efficient Deterministic Multithreading

Optimizations

• Lazy commit
• Lazy twin creation and diff elimination
• Single-threaded execution
• Lock ownership
• Parallelization