Noninvasive Java Concurrency with Deuce STM 1.0
Guy Korland “Multi Core Tools” CMP09
Outline
• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References
Motivation
Problem I
Process 1               Process 2
a = acc.get()
a = a + 100             b = acc.get()
                        b = b + 50
                        acc.set(b)
acc.set(a)
... Lost Update! ...
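The interleaving above can be replayed deterministically on a single thread. The following is only a minimal sketch: `Account` is a hypothetical stand-in for the `acc` object on the slide, and the two "processes" are simply interleaved statements rather than real threads.

```java
/** Hypothetical stand-in for the slide's `acc`: a plain, unsynchronized cell. */
class Account {
    private int balance;
    Account(int initial) { balance = initial; }
    int get() { return balance; }
    void set(int value) { balance = value; }
}

class LostUpdateDemo {
    /** Replays the slide's interleaving step by step and returns the final balance. */
    static int replayInterleaving(int initial) {
        Account acc = new Account(initial);
        int a = acc.get();   // Process 1 reads
        a = a + 100;
        int b = acc.get();   // Process 2 reads the same, now stale, value
        b = b + 50;
        acc.set(b);          // Process 2 writes initial + 50
        acc.set(a);          // Process 1 overwrites it with initial + 100
        return acc.get();
    }

    public static void main(String[] args) {
        // Both updates together should add 150, but Process 2's update is lost.
        System.out.println("expected 150, got " + replayInterleaving(0));
    }
}
```

Starting from 0, the final balance is 100 rather than 150: the +50 update has been overwritten.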
Problem II
Process 1               Process 2
lock(A)                 lock(B)
lock(B)                 lock(A)
... Deadlock! ...
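The circular wait above can be observed without actually hanging the JVM. The sketch below is an illustration under assumptions, not part of the original deck: it uses `ReentrantLock.tryLock` with a timeout in place of the slide's blocking `lock(...)`, and a `CyclicBarrier` to force the bad interleaving. The class and method names are hypothetical.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class DeadlockDemo {
    /** Takes `first`, waits until the peer holds its own first lock, then tries `second`. */
    static boolean attempt(ReentrantLock first, ReentrantLock second, CyclicBarrier barrier) {
        first.lock();
        try {
            barrier.await();                                  // both threads now hold one lock each
            boolean got = second.tryLock(100, TimeUnit.MILLISECONDS);
            barrier.await();                                  // keep `first` until the peer has finished trying
            if (got) second.unlock();
            return got;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    /** Runs the two "processes" from the slide; element i tells whether process i+1 got its second lock. */
    static boolean[] run() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CyclicBarrier barrier = new CyclicBarrier(2);
        boolean[] acquired = new boolean[2];
        Thread p1 = new Thread(() -> acquired[0] = attempt(a, b, barrier)); // lock(A) then lock(B)
        Thread p2 = new Thread(() -> acquired[1] = attempt(b, a, barrier)); // lock(B) then lock(A)
        p1.start();
        p2.start();
        try {
            p1.join();
            p2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return acquired;
    }
}
```

Neither thread obtains its second lock. With plain blocking `lock()` calls in place of `tryLock`, both would wait forever.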
The Problem
• Cannot exploit cheap threads
• Today’s Software
  o Non-scalable methodologies
• Today’s Hardware
  o Poor support for scalable synchronization.
  o Low-level support only: CAS, TAS, MemBar…
Why Doesn’t Locking Scale?
• Not Robust
• Relies on conventions
• Hard to Use
  o Conservative
  o Deadlocks
  o Lost wake-ups
• Not Composable
Solutions I – Domain specific
• MATLAB – Concurrency behind the scenes.
• SQL/XQuery/XPath – DB will handle it…
• HTML, ASP, PHP, JSP … – (almost) stateless.
• Fortress [Sun], X10 [IBM], Chapel [Cray] … – implicit concurrency. Remember Cobol!
Domain too specific
Solutions II – Actor Model (Share nothing model)
• Carl Hewitt, Peter Bishop and Richard Steiger, A Universal Modular Actor Formalism for Artificial Intelligence [IJCAI 1973].
• An actor, on message:
  – no shared data
  – send messages to other actors
  – create new actors
• Where can we find it?
  – Simula, Smalltalk, Scala, Haskell, F#, Erlang...
Functional languages
Solutions II – Actor Model (Share nothing model)
Actors in Erlang
-module(counter).
-export([run/0, counter/1]).

run() ->
    S = spawn(counter, counter, [0]),
    send_msgs(S, 100000),
    S.

counter(Sum) ->
    receive
        {inc, Amount} -> counter(Sum+Amount)
    end.

send_msgs(_, 0) -> true;
send_msgs(S, Count) ->
    S ! {inc, 1},
    send_msgs(S, Count-1).
• Is it really easier?
• What about performance?
• Will functional languages ever be functional?
• Java/.NET/C++ rules!!! (maybe Ruby)
Solutions III – STM
Nir Shavit and Dan Touitou, Software Transactional Memory [PODC95]
synchronized {
    <instructions>
}
atomic {
    <instructions>
}
l.lock();
<instructions>
l.unlock();
What is a transaction?
• Atomicity – all or nothing
• Consistency – consistent state (before & after)
• Isolation – others can’t see intermediate state
• Durability – persistent
Or maybe we do want it?
The Brief History of STM
• 1993 – STM (Shavit, Touitou)
• 1997 – Trans Support TM (Moir)
• 2003 – DSTM (Herlihy et al)
• 2003 – WSTM (Fraser, Harris)
• 2003 – OSTM (Fraser, Harris)
• 2004 – ASTM (Marathe et al)
• 2004 – T-Monitor (Jagannathan…)
• 2004 – HybridTM (Moir)
• 2004 – Meta Trans (Herlihy, Shavit)
• 2004 – Soft Trans (Ananian, Rinard)
• 2005 – Lock-OSTM (Ennals)
• 2005 – McTM (Saha et al)
• 2005 – TL (Dice, Shavit)
• 2006 – AtomJava (Hindman…)
• 2006 – LSA (Riegel et al)
• 2006 – TL2 (Dice, Shavit, Shalev)
• 2006 – DSTM2 (Herlihy, Luchangco)
• 2007 – Tanger
• 2008 – Rock (Sun)
• 2009 – Deuce (Korland et al)
DSTM2
Maurice Herlihy et al, A flexible framework … [OOPSLA06]
@atomic
public interface INode {
    int getValue();
    void setValue(int value);
    INode getNext();
    void setNext(INode value);
}

Factory<INode> factory = Thread.makeFactory(INode.class);

result = Thread.doIt(new Callable<Boolean>() {
    public Boolean call() {
        return intSet.insert(value);
    }
});
• Limited to Objects.
• Very intrusive.
• Doesn’t support libraries.
• Bad performance (fork).
JVSTM
João Cachopo and António Rito-Silva, Versioned boxes as the basis for memory transactions [SCOOL05]
public class Account {
    private VBox<Long> balance = new VBox<Long>();

    public @Atomic void withdraw(long amount) {
        balance.put(balance.get() - amount);
    }
}
• Doesn’t support libraries.
• Less intrusive.
• Need to “Announce” shared fields.
Atom-Java
B. Hindman and D. Grossman, Atomicity via source-to-source translation [MSPC06]
public void update(double value) {
    atomic {
        commission += value;
    }
}
• Add a reserved word.
• Need precompilation.
• Doesn’t support libraries.
• Even less intrusive.
Multiverse
Peter Veentjer, 2009
@TmEntity
public class Stack<E> {
    private Node<E> head;

    public void push(E item) {
        head = new Node<E>(item, head);
    }

    @TmEntity
    public static class Node<E> {
        final E value;
        final Node<E> parent;

        Node(E value, Node<E> prev) {
            this.value = value;
            this.parent = prev;
        }
    }
}
• Doesn’t support libraries.
• Limited to Objects.
DATM-J
Hany E. Ramadan et al., Dependence-aware transactional memory [MICRO08]
Transaction tx = new Transaction(id);
boolean done = false;
while (!done) {
    try {
        tx.BeginTransaction();
        // transactional code
        done = tx.CommitTransaction();
    } catch (AbortException e) {
        tx.AbortTransaction();
        done = false;
    }
}
• Explicit transaction.
• Explicit retry.
Deuce STM
• Java STM framework
  o @Atomic methods
  o Field-based access
     - More scalable than object-based.
     - More efficient than word-based.
  o Supports external libraries
     - Can be part of a transaction
  o No reserved words
     - No need for new compilers (existing IDEs can be used)
• Research tool
  o API for developing and testing new algorithms.
Deuce - API
public class Bank {
    final private static double MAXIMUM_TRANSACTION = 1000;
    private double commission = 0;

    @Atomic(retries=64)
    public void transaction(Account ac1, Account ac2, double amount) {
        ac1.balance -= (amount + commission);
        ac2.balance += amount;
    }

    @Atomic
    public void update(double value) {
        commission += value;
    }
}
Deuce - Overview
Deuce - Running
• -javaagent:deuceAgent.jar
  o Dynamic bytecode manipulation.
• -Xbootclasspath/p:rt.jar
  o Offline instrumentation to support the boot classloader.
• java -javaagent:deuceAgent.jar -cp "myjar.jar" MyMain
Implementation
• ASM – Bytecode manipulation
  o Online & Offline
• Fields
  o private double commission;
  o final static public long commission__ADDRESS...
     - Relative address (-1 if final).
  o final static public Object __CLASS_BASE__ ...
     - Marks the class base for static field access.
Implementation
• Method
  o @Atomic methods
     - Replace the method body with a transaction retry loop.
     - Add another instrumented method.
  o Non-Atomic methods
     - Duplicate each with an instrumented version.
Implementation
@Atomic
public void update(double value) {
    commission += value;
}
In byte code
@Atomic
public void update(double value) {
    double tmp = commission;
    commission = tmp + value;
}
public void update(double value, Context c) {
    double tmp;
    if (commission__ADDRESS < 0) { // final field
        tmp = commission;
    } else {
        c.beforeRead(this, commission__ADDRESS);
        tmp = c.onRead(this, commission, commission__ADDRESS);
    }
    c.onWrite(this, tmp + value, commission__ADDRESS);
}
Implementation
The JIT removes the final-field check, since commission__ADDRESS is a constant:
public void update(double value, Context c) {
    c.beforeRead(this, commission__ADDRESS);
    double tmp = c.onRead(this, commission, commission__ADDRESS);
    c.onWrite(this, tmp + value, commission__ADDRESS);
}
Implementation
public void update(double value) {
    Context context = ContextDelegator.getContext();
    for (int i = retries; i > 0; --i) {
        context.init();
        try {
            update(value, context);       // run the instrumented version
            if (context.commit())
                return;
        } catch (TransactionException e) {
            context.rollback();           // conflict: roll back and retry
            continue;
        } catch (Throwable t) {
            if (context.commit())         // user exception: propagate only if the transaction is valid
                throw t;
        }
    }
    throw new TransactionException();     // retries exhausted
}
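The retry loop above can be tried in isolation. The sketch below is a simplified illustration, not Deuce's own code: `TxContext`, `TxAbortException`, `RetryLoop` and `FlakyContext` are hypothetical names, the context is cut down to three methods, and the slide's `Throwable` branch is omitted for brevity.

```java
import java.util.function.Consumer;

/** Hypothetical, cut-down stand-in for the Context used on the slides. */
interface TxContext {
    void init();
    boolean commit();
    void rollback();
}

/** Hypothetical unchecked exception signalling a conflict inside the transaction body. */
class TxAbortException extends RuntimeException {}

class RetryLoop {
    /** Runs `body` until a commit succeeds or the retry budget is exhausted. */
    static void atomically(TxContext ctx, int retries, Consumer<TxContext> body) {
        for (int i = retries; i > 0; --i) {
            ctx.init();
            try {
                body.accept(ctx);
                if (ctx.commit()) return;   // success: leave the loop
            } catch (TxAbortException e) {
                ctx.rollback();             // conflict: discard changes and retry
            }
        }
        throw new TxAbortException();       // retries exhausted
    }
}

/** Toy context whose commit fails a fixed number of times before succeeding. */
class FlakyContext implements TxContext {
    int failuresLeft;
    int attempts;
    int rollbacks;
    FlakyContext(int failures) { failuresLeft = failures; }
    public void init() { attempts++; }
    public boolean commit() { return failuresLeft-- <= 0; }
    public void rollback() { rollbacks++; }
}
```

For example, `RetryLoop.atomically(new FlakyContext(2), 64, ctx -> { /* transactional work */ })` runs the body three times: two failed commits followed by a successful one.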
Implementation
public interface Context {
    void init(int atomicBlockId);
    boolean commit();
    void rollback();

    void beforeReadAccess(Object obj, long field);
    Object onReadAccess(Object obj, Object value, long field);
    int onReadAccess(Object obj, int value, long field);
    long onReadAccess(Object obj, long value, long field);
    …
    void onWriteAccess(Object obj, Object value, long field);
    void onWriteAccess(Object obj, int value, long field);
    void onWriteAccess(Object obj, long value, long field);
    …
}
TL2 (Transactional Locking II)
Dave Dice, Ori Shalev and Nir Shavit [DISC06]
CTL - Commit-time locking
• Start
  o Sample global version-clock
• Run through a speculative execution
  o Collect write-set & read-set
• End
  o Lock the write-set
  o Increment global version-clock
  o Validate the read-set
  o Commit and release the locks
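The steps above can be sketched in a few dozen lines. This is a simplified illustration in the spirit of TL2, not Deuce's implementation: `TVar`, `Tl2Tx`, `Tl2Abort` and `Tl2Demo` are hypothetical names, values are plain `int`s, and a failed lock attempt aborts immediately instead of spinning.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical transactional cell: a value guarded by a versioned lock word. */
class TVar {
    final AtomicLong lock = new AtomicLong(0);   // (version << 1) | lockedBit
    volatile int value;
    TVar(int initial) { value = initial; }
}

class Tl2Abort extends RuntimeException {}

class Tl2Tx {
    static final AtomicLong CLOCK = new AtomicLong(0);          // global version-clock

    private final long readVersion = CLOCK.get();                // Start: sample the clock
    private final List<TVar> readSet = new ArrayList<>();
    private final Map<TVar, Integer> writeSet = new LinkedHashMap<>();

    int read(TVar v) {
        Integer pending = writeSet.get(v);
        if (pending != null) return pending;                      // read our own buffered write
        long before = v.lock.get();
        int val = v.value;
        long after = v.lock.get();
        if ((before & 1) != 0 || before != after || (before >>> 1) > readVersion)
            throw new Tl2Abort();                                 // locked, changed, or too new
        readSet.add(v);
        return val;
    }

    void write(TVar v, int val) { writeSet.put(v, val); }        // buffered until commit

    void commit() {
        List<TVar> locked = new ArrayList<>();
        try {
            for (TVar v : writeSet.keySet()) {                    // 1. lock the write-set
                long w = v.lock.get();
                if ((w & 1) != 0 || !v.lock.compareAndSet(w, w | 1)) throw new Tl2Abort();
                locked.add(v);
            }
            long writeVersion = CLOCK.incrementAndGet();          // 2. increment the clock
            for (TVar v : readSet) {                              // 3. validate the read-set
                long w = v.lock.get();
                boolean lockedByOther = (w & 1) != 0 && !writeSet.containsKey(v);
                if (lockedByOther || (w >>> 1) > readVersion) throw new Tl2Abort();
            }
            for (Map.Entry<TVar, Integer> e : writeSet.entrySet()) {
                e.getKey().value = e.getValue();                  // 4. commit ...
                e.getKey().lock.set(writeVersion << 1);           //    ... and release with the new version
            }
            locked.clear();
        } finally {
            for (TVar v : locked) v.lock.set(v.lock.get() & ~1L); // abort path: unlock, keep old version
        }
    }
}

class Tl2Demo {
    /** Moves `amount` between two cells, retrying on conflict. */
    static void transfer(TVar from, TVar to, int amount) {
        while (true) {
            try {
                Tl2Tx tx = new Tl2Tx();
                tx.write(from, tx.read(from) - amount);
                tx.write(to, tx.read(to) + amount);
                tx.commit();
                return;
            } catch (Tl2Abort retry) {
                // conflict: run again with a fresh clock sample
            }
        }
    }
}
```

Because writes are buffered and locks are taken only inside `commit()`, a transaction that aborts before commit never has anything to undo. A transaction that read a cell which another transaction has since committed fails validation and throws `Tl2Abort`.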
LSA (Lazy Snapshot Algorithm)
Torvald Riegel, Pascal Felber and Christof Fetzer [DISC06]
ETL - Encounter-time locking
• Start
  o Sample global version-clock
• Run through a speculative execution
  o Lock on write access
  o Collect read-set & write-set
• On validation error try to extend snapshot
• End
  o Increment global version-clock
  o Validate the read-set
  o Commit and release the locks
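The two distinguishing points above, locking on write access and extending the snapshot instead of aborting, can be sketched as follows. This is a heavily simplified illustration in the spirit of LSA, not the published algorithm or Deuce's implementation: `Cell`, `LsaTx` and `LsaAbort` are hypothetical names, and the snapshot is a single timestamp rather than a validity range.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical transactional cell: a value guarded by a versioned lock word. */
class Cell {
    final AtomicLong lock = new AtomicLong(0);   // (version << 1) | lockedBit
    volatile int value;
    Cell(int v) { value = v; }
}

class LsaAbort extends RuntimeException {}

class LsaTx {
    static final AtomicLong CLOCK = new AtomicLong(0);             // global version-clock

    private long snapshot = CLOCK.get();                            // Start: sample the clock
    private final Map<Cell, Long> readSet = new HashMap<>();        // cell -> version observed
    private final Map<Cell, Integer> writeSet = new HashMap<>();    // cells this transaction holds locked
    int extensions = 0;                                             // successful snapshot extensions

    int read(Cell c) {
        if (writeSet.containsKey(c)) return writeSet.get(c);
        long before = c.lock.get();
        int val = c.value;
        long after = c.lock.get();
        if ((before & 1) != 0 || before != after) throw abort();
        long version = before >>> 1;
        if (version > snapshot) extend();                           // too new: try to extend the snapshot
        readSet.put(c, version);
        return val;
    }

    void write(Cell c, int val) {
        if (!writeSet.containsKey(c)) {                             // lock on first write access
            long w = c.lock.get();
            if ((w & 1) != 0 || (w >>> 1) > snapshot || !c.lock.compareAndSet(w, w | 1))
                throw abort();
        }
        writeSet.put(c, val);
    }

    void commit() {
        if (writeSet.isEmpty()) return;                             // read-only: nothing to publish
        long writeVersion = CLOCK.incrementAndGet();                // increment the clock
        validate();                                                 // validate the read-set
        for (Map.Entry<Cell, Integer> e : writeSet.entrySet()) {
            e.getKey().value = e.getValue();                        // commit ...
            e.getKey().lock.set(writeVersion << 1);                 // ... and release with the new version
        }
        writeSet.clear();
    }

    /** Moves the snapshot to the current clock value if everything read so far is unchanged. */
    private void extend() {
        long now = CLOCK.get();
        validate();
        snapshot = now;
        extensions++;
    }

    private void validate() {
        for (Map.Entry<Cell, Long> e : readSet.entrySet()) {
            long w = e.getKey().lock.get();
            boolean lockedByOther = (w & 1) != 0 && !writeSet.containsKey(e.getKey());
            if (lockedByOther || (w >>> 1) != e.getValue()) throw abort();
        }
    }

    /** Releases any locks held (keeping the old versions) and builds the exception to throw. */
    private LsaAbort abort() {
        for (Cell c : writeSet.keySet()) c.lock.set(c.lock.get() & ~1L);
        writeSet.clear();
        return new LsaAbort();
    }
}
```

In this sketch, a reader that encounters a cell committed after its start can still succeed as long as nothing it has already read was changed. Under the commit-time scheme of the previous section, the same read would abort.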
Benchmarks (Azul – Vega2 – 2 x 46)
Benchmarks (SuperMicro – 2 x Quad Intel)
Benchmarks (Sun UltraSPARC T2 Plus – 2 x Quad x 8HT)
Summary
• Simple API
  o @Atomic
• No changes to Java
  o No reserved words
• Open source
  o On Google Code
• Shows nice scalability
  o Field-based
References
• Homepage - http://www.deucestm.org
• Project - http://code.google.com/p/deuce/
• Wikipedia - http://en.wikipedia.org/wiki/Software_transactional_memory
• TL2 - http://research.sun.com/scalable
• LSA-STM - http://tmware.org/lsastm