Hardware-based Transactional Memory Supporting Large Transactions
Anvesh Komuravelli, Abe Othman, Kanat Tangwongsan
Concurrent Programs
Thread 1:
obj.x = 7;
find_primes();
// intrusion test
if (obj.x != 7) fireMissiles();
Thread 2:
do_stuff();
obj.x = 42;
handle with care
lock_acquire(critical_zone);
lock_release(critical_zone);
Deadlock
Starvation
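As a minimal sketch of the lock-based approach on this slide, the following Python model guards both threads' accesses with a single lock, so the intrusion test cannot observe the other thread's write. Python's `threading.Lock` stands in for `lock_acquire`/`lock_release`; the names `Obj`, `thread1`, `thread2`, `missiles_fired`, and `run_once` are assumptions made for illustration, and `find_primes` is only a placeholder for long-running work.

```python
import threading

class Obj:
    def __init__(self):
        self.x = 0

obj = Obj()
critical_zone = threading.Lock()  # stands in for lock_acquire/lock_release
missiles_fired = []               # records any failed intrusion test (hypothetical helper)

def find_primes(limit=2000):
    # Placeholder for long-running work done inside the critical section.
    return [n for n in range(2, limit)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]

def thread1():
    with critical_zone:
        obj.x = 7
        find_primes()
        # Intrusion test: fails only if another thread wrote obj.x meanwhile.
        if obj.x != 7:
            missiles_fired.append(obj.x)

def thread2():
    with critical_zone:
        obj.x = 42

def run_once():
    t1 = threading.Thread(target=thread1)
    t2 = threading.Thread(target=thread2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return obj.x

final_x = run_once()
```

With a single lock the sketch is safe. Once a program needs several locks, threads that acquire them in different orders can deadlock, which is the hazard the slide points at.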
Complex Program
Lock-based Approaches
Transactional Memory
Transactional Memory
Atomicity in the face of concurrency.
Isolation from other transactions.
Consistency across the whole system.
Programmer: enclose instructions in a transaction.
System: execute transactions concurrently and, on a conflict, do something intelligent (e.g., abort and restart).
Thread 1:
obj.x = 7;
find_primes();
// intrusion test
if (obj.x != 7) fireMissiles();
Thread 2:
do_stuff();
obj.x = 42;
x_begin();
x_finish();
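The "abort and restart on conflict" behaviour can be illustrated with a small software model. This is a sketch under stated assumptions, not the hardware mechanism the deck discusses: `Store`, `Transaction`, and `atomically` are hypothetical names, conflicts are detected by comparing per-location version numbers at commit time, and the `atomically` wrapper plays the role of the `x_begin()` / `x_finish()` pair.

```python
import threading

class Store:
    """Shared memory model: each location maps to (value, version)."""
    def __init__(self):
        self.data = {}
        self.commit_lock = threading.Lock()

    def read(self, key):
        return self.data.get(key, (0, 0))

class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version seen at first read
        self.writes = {}         # key -> buffered (not yet visible) value

    def read(self, key):
        if key in self.writes:   # a transaction sees its own writes
            return self.writes[key]
        value, version = self.store.read(key)
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        with self.store.commit_lock:
            # Validate: abort if anything we read has since been overwritten.
            for key, seen in self.read_versions.items():
                if self.store.read(key)[1] != seen:
                    return False
            # Publish buffered writes, bumping each location's version.
            for key, value in self.writes.items():
                _, version = self.store.read(key)
                self.store.data[key] = (value, version + 1)
            return True

def atomically(store, body):
    """Run body(tx) as a transaction, restarting it until it commits."""
    while True:
        tx = Transaction(store)
        result = body(tx)
        if tx.commit():
            return result

def thread1_body(tx):
    tx.write("x", 7)
    # Intrusion test: inside the transaction the read returns the buffered 7.
    return tx.read("x") != 7

store = Store()
intruded = atomically(store, thread1_body)
```

Because writes stay buffered until commit, the intrusion test inside the transaction cannot see another thread's `obj.x = 42`; a transaction that read stale data is simply re-run.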
Different strokes for different folks
[Figure: write-set size in KBytes (log scale, 0.01 to 1000) at the 50th and 80th percentile of transactions, for ANL, Java, and Pthreads programs]
[Figure: read-set size in KBytes (log scale, 0.1 to 1000) at the 50th and 80th percentile of transactions, for ANL, Java, and Pthreads programs]
Common case: 98% of transactions fit in L1 => hardware
What to do with the remaining 2%?
Fast… Easy conflict detection… Easy commit and abort
Goal: Hide platform/resource limitations from programmers
Challenges & Opportunities
VTM – Virtual Transactional Memory
• On overflow, use process’s virtual memory
• Tracking at cache-line granularity
• Per process state (tag and store virtual addresses)
• Flatten nested transactions
• Implemented in specialized hardware (dedicated cache, search logic, …)
• Drawbacks?
– Modifications to hardware. Costly?
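The overflow idea can be sketched as a bounded buffer that spills into a second table once it is full. This is a simplified model and not VTM's actual data structures or eviction policy: `OverflowingWriteSet`, `hw_lines`, and the 64-byte `LINE_BYTES` are assumptions for illustration, and the `overflow` dictionary merely stands in for state kept in the process's virtual memory.

```python
LINE_BYTES = 64  # assumed cache-line size

def line_of(addr):
    """Map a byte address to its cache-line number."""
    return addr // LINE_BYTES

class OverflowingWriteSet:
    def __init__(self, hw_lines):
        self.hw_lines = hw_lines  # capacity of the bounded hardware buffer, in lines
        self.in_hardware = {}     # line -> {addr: value}
        self.overflow = {}        # line -> {addr: value}; models a table in virtual memory

    def write(self, addr, value):
        line = line_of(addr)
        if line in self.in_hardware:
            self.in_hardware[line][addr] = value
        elif line in self.overflow:
            self.overflow[line][addr] = value
        elif len(self.in_hardware) < self.hw_lines:
            self.in_hardware[line] = {addr: value}
        else:
            # Hardware buffer is full: track the new line in the overflow table.
            self.overflow[line] = {addr: value}

    def lookup(self, addr):
        """Return the transaction's buffered value for addr, or None."""
        line = line_of(addr)
        for table in (self.in_hardware, self.overflow):
            if line in table and addr in table[line]:
                return table[line][addr]
        return None

    @property
    def overflowed(self):
        return bool(self.overflow)
```

For example, with `hw_lines=2`, writes to addresses 0, 8, and 64 touch only two lines and stay in the hardware buffer, while a further write to address 128 lands in the overflow table.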
XTM – eXtended Transactional Memory
• “Complete TM Virtualization without complex hardware”
• Page table per transaction
• Allows arbitrary nesting – no flattening
• The only hardware support – raise an exception on overflow
• Drawbacks?
– Page granularity on overflows
– Potentially higher memory usage than VTM
– Software commit is costlier than VTM’s hardware commit – can stall other transactions of the process
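The cost of page granularity can be seen in a small conflict check. The sketch below assumes 64-byte cache lines and 4 KiB pages, and the helper names `blocks` and `conflicts` are hypothetical; it only shows that two accesses to different lines of the same page count as a conflict once tracking is done per page.

```python
LINE_BYTES = 64    # assumed cache-line size
PAGE_BYTES = 4096  # assumed page size

def blocks(addresses, block_bytes):
    """Set of tracking blocks (lines or pages) touched by the addresses."""
    return {addr // block_bytes for addr in addresses}

def conflicts(writes_a, accesses_b, block_bytes):
    """True if transaction A's writes share a tracking block with B's accesses."""
    return bool(blocks(writes_a, block_bytes) & blocks(accesses_b, block_bytes))

# A writes one address, B reads a different address in the same page.
writes_a = [0x1000]
reads_b = [0x1800]

line_conflict = conflicts(writes_a, reads_b, LINE_BYTES)
page_conflict = conflicts(writes_a, reads_b, PAGE_BYTES)
```

Here `line_conflict` is false and `page_conflict` is true: the two transactions never touch the same data, yet page-level tracking reports a conflict.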
Comparing the approaches
[Figure: normalized execution time (axis 0 to 3, with one off-scale value labeled 8.3) for XTM, XTM-g, XTM-e, and VTM on tomcatv [37.7%], volrend [0.01%], radix [0.26%], micro-P10 [39.2%], micro-P20 [60.3%], and micro-P30 [60.8%]; bars are broken down into Versioning, Validation, Commit, Violations, Idle, and Useful]
An observation
• Small transactions complete entirely in hardware
• Large transactions overflow the buffers, and the TM switches to virtual mode
• What about transactions whose size varies?
– What if everything fits in the buffers again?
– Can we switch back to hardware mode?
Towards improving virtualization
• Permissions-only cache – significantly reduces the chance of overflowing the buffers
– At the cost of a little extra hardware
• Large transactions, already assumed to be infrequent, become rarer still
• Large transactions are serialized and handled one at a time
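The "one at a time" rule for large transactions can be modelled as a single system-wide token that an overflowing transaction must hold. This is only a sketch of the serialization idea: `OverflowGate`, `try_enter_overflow`, and `leave_overflow` are hypothetical names, and the permissions-only cache itself is not modelled.

```python
import threading

class OverflowGate:
    """At most one transaction may be in overflowed mode at any time."""
    def __init__(self):
        self._token = threading.Lock()

    def try_enter_overflow(self):
        # Non-blocking: returns False if another transaction has already overflowed.
        return self._token.acquire(blocking=False)

    def leave_overflow(self):
        # Called when the overflowed transaction commits or aborts.
        self._token.release()

gate = OverflowGate()
first = gate.try_enter_overflow()   # first large transaction gets the token
second = gate.try_enter_overflow()  # a second one must wait or retry
gate.leave_overflow()
third = gate.try_enter_overflow()   # token is available again
gate.leave_overflow()
```

A transaction that keeps losing the race for this token never makes progress, which is one way the serialization can turn into starvation when large transactions are common.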
Towards improving virtualization
Do we always have only a few large transactions?
• For now: yes
• In the future: maybe not
• I/O and blocking system calls might wish to be atomic
• How do the earlier discussed approaches fare?
– VTM – complex hardware
– XTM – complications with OS and page granularity
– OneTM – can lead to starvation!
TokenTM
• Uses tokens to monitor memory blocks
– To read, you get a token
– To write, you need to get every token
• Rigorous bookkeeping – blocks are tracked in caches, memory and disk
• Handles large transactions gracefully
– Except for conflicts, transaction speed is unaffected by large transactions in other threads
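The read/write token rule can be sketched as per-block bookkeeping. The model below assumes a fixed number of tokens per block (`total_tokens`, a hypothetical parameter) and uses hypothetical names (`TokenBlock`, `acquire_read`, `acquire_write`, `release`); it does not model how the tokens are tracked in caches, memory, and disk.

```python
class TokenBlock:
    """Token accounting for one memory block."""
    def __init__(self, total_tokens):
        self.total = total_tokens
        self.held = {}  # transaction id -> number of tokens held

    def free(self):
        return self.total - sum(self.held.values())

    def acquire_read(self, tx):
        """A reader needs one token."""
        if self.held.get(tx, 0) >= 1:
            return True
        if self.free() >= 1:
            self.held[tx] = 1
            return True
        return False  # conflict: a writer holds every token

    def acquire_write(self, tx):
        """A writer needs every token, counting any it already holds."""
        mine = self.held.get(tx, 0)
        if self.free() + mine == self.total:
            self.held[tx] = self.total
            return True
        return False  # conflict: another transaction holds a token

    def release(self, tx):
        """Return all of tx's tokens on commit or abort."""
        self.held.pop(tx, None)

block = TokenBlock(total_tokens=4)
a_reads = block.acquire_read("A")         # A takes one token
b_reads = block.acquire_read("B")         # B takes one token; readers coexist
a_write_early = block.acquire_write("A")  # fails: B still holds a token
block.release("B")
a_write_late = block.acquire_write("A")   # succeeds: A now holds every token
b_read_blocked = block.acquire_read("B")  # fails while A is the writer
```

Multiple readers share a block freely, while a writer excludes everyone else, so conflicts surface exactly where a needed token is unavailable.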
TokenTM Downsides
• Small transactions suffer(?)
– L1 cache sized transactions can work at hardware speed… BUT:
• Need flash-clear and flash-OR circuits in L1 cache
• Requires a very involved ad hoc representation
• …or taking a 3% overhead hit
• Optimizes the rare large case to the detriment of the frequent small case?
Conclusion
• Sun Research’s Transactional Memory Spotlight: More recent proposals for “unbounded” HTM aim to overcome these disadvantages, but Sun Labs researchers came to the conclusion that the proposals were sufficiently complex and risky that they were unlikely to be adopted in mainstream commercial processor designs in the near future.