Unit Testing in Java with an Emphasis on Concurrency
Corky Cartwright
Rice and Halmstad Universities
Summer 2013
Software Engineering Culture
Three Guiding Visions
● Data-driven design
● Test-driven development
● Mostly functional coding (no gratuitous mutation)
Codified in the Design Recipe taught in How to Design Programs by Felleisen et al. (available for free online: www.htdp.org [first edition], www.ccs.neu.edu/home/matthias/HtDP2e/Draft [second edition]) and Elements of Object-Oriented Design (available online). The target languages are Scheme and Java.
Timeliness
● CPU clock frequencies stagnate
● Multi-core CPUs provide additional processing power, but multiple threads are needed to use multiple cores.
● Writing concurrent programs is difficult!
Tutorial Outline
● Introduce unit testing in a single-threaded (deterministic) setting using lists
● Demonstrate problems introduced by concurrency and their impact on unit testing
● Show how some of the most basic problems can be overcome by using the right policies and tools.
(Sequential) Unit Testing
Unit tests …
● Test parts of the program (including the whole!)
● Integrate with program development; commits to the repository must pass all unit tests
● Automate testing during the maintenance phase
● Serve as documentation
● Prevent bugs from reoccurring
● Help keep the code repository clean
Effective with a single thread of control
Universal Test-Driven Design Recipe
Analyze the problem: define the data and determine top level operations. Give sample data values.
Define type signatures, contracts, and headers for all top level operations. In Java, the type signature is part of the header.
Give input-output examples including critical boundary cases for each operation.
Write a template for each operation, typically based on structural decomposition of the primary argument (the receiver in OO methods).
Code each method by filling in the templates.
Test every method (using the I/O examples!) and ascertain that every method is tested on a sufficient set of examples. White-box testing matters!
Sequential Case Studies: Functional Lists and Bi-Lists
A List<E> is either
● Empty<E>(), or
● Cons<E>(e, l) where e is an E and l is a List<E>
A BiList<E> is a mutable data structure containing a possibly empty sequence of objects of type E that can be traversed in either direction using a BiListIterator<E>.
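The functional list definition above can be written down directly by following the design recipe: one class per variant, and each method coded by structural decomposition of the receiver. The following is only a minimal sketch. The name FList (used instead of List to avoid clashing with java.util.List), the methods length and append, and the class ListDemo are illustrative assumptions, not taken from the course materials.

```java
// A functional (immutable) list: either Empty or Cons(first, rest).
abstract class FList<E> {
    /** Returns the number of elements in this list. */
    abstract int length();

    /** Returns a new list with the elements of this list followed by those of other. */
    abstract FList<E> append(FList<E> other);
}

final class Empty<E> extends FList<E> {
    int length() { return 0; }                          // base case
    FList<E> append(FList<E> other) { return other; }   // base case
    @Override public String toString() { return "Empty"; }
}

final class Cons<E> extends FList<E> {
    private final E first;
    private final FList<E> rest;

    Cons(E first, FList<E> rest) {
        this.first = first;
        this.rest = rest;
    }

    int length() { return 1 + rest.length(); }          // recur on the rest
    FList<E> append(FList<E> other) {
        return new Cons<E>(first, rest.append(other));  // no mutation: build a new list
    }
    @Override public String toString() { return "Cons(" + first + ", " + rest + ")"; }
}

class ListDemo {
    public static void main(String[] args) {
        FList<Integer> xs = new Cons<Integer>(1, new Cons<Integer>(2, new Empty<Integer>()));
        System.out.println(xs.length());     // prints 2
        System.out.println(xs.append(xs));   // prints Cons(1, Cons(2, Cons(1, Cons(2, Empty))))
    }
}
```

Because the list is immutable, the input-output examples from the recipe (empty list, one-element list, longer list) translate into deterministic unit tests without any setup or teardown.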
Review Elements of Sequential Unit Testing
● Unit tests depend on deterministic behavior
● Known input, expected output …
  Success → correct behavior
  Failure → flawed code
● Outcome of test is meaningful if test is deterministic
Problems Due to Concurrency
Thread scheduling is nondeterministic and machine-dependent
● Code may be executed under different schedules
● Different schedules may produce different results
Known input, expected output(s?) …
● Success → correct behavior in this schedule, may be flawed in another schedule
● Failure → flawed code
Success of unit test is meaningless
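A small experiment makes the schedule dependence concrete. In the sketch below, two threads increment a plain int field without synchronization, so updates can be lost under some schedules, while the AtomicInteger version gives the same answer under every schedule. The class and method names (ScheduleDemo, unsafeCount, safeCount) are illustrative assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ScheduleDemo {
    static int unsafeCounter = 0;                                   // unsynchronized shared field
    static final AtomicInteger safeCounter = new AtomicInteger();   // atomic increments

    /** Runs two threads that each call increment n times, and waits for both to finish. */
    static void runInTwoThreads(Runnable increment, int n) {
        Runnable loop = () -> { for (int i = 0; i < n; i++) increment.run(); };
        Thread a = new Thread(loop);
        Thread b = new Thread(loop);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** The result depends on the schedule: anything up to 2 * n is possible. */
    static int unsafeCount(int n) {
        unsafeCounter = 0;
        runInTwoThreads(() -> unsafeCounter++, n);
        return unsafeCounter;
    }

    /** The result is always 2 * n. */
    static int safeCount(int n) {
        safeCounter.set(0);
        runInTwoThreads(safeCounter::incrementAndGet, n);
        return safeCounter.get();
    }

    public static void main(String[] args) {
        System.out.println("unsafe: " + unsafeCount(100000));   // varies from run to run
        System.out.println("safe:   " + safeCount(100000));     // prints safe:   200000
    }
}
```

A unit test that asserts unsafeCount(n) == 2 * n may pass on one machine and fail on another, which is exactly why a single passing run says little about the code.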
Recommended Resources on Concurrent Programming in Java
Explicit Concurrency:
● Comp 402 web site from 2009
● Brian Goetz, Java Concurrency in Practice (available online at this website)
Coping with Multicore:
● Emerging parallel extensions of Java/Scala that guarantee determinism (in a designated subset), do not require explicit synchronization, and avoid JMM issues
● Habanero Java
● Habanero Scala
Success of non-deterministic unit test is not very meaningful
Problems Due to Java Memory Model
JMM is MUCH weaker than sequential consistency
● Writes to shared data may be held pending indefinitely unless the target is declared volatile or is shielded by the same lock as subsequent reads.
Why not always use locking (synchronized)?
● Significant overhead
● Increases likelihood of deadlock
Extremely difficult to reason about program execution for specific inputs because so many schedules are allowed.
A model that accommodates compiler writers rather than software developers.
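The point about pending writes can be illustrated with a stop flag. In this sketch the flag is declared volatile, so the worker is guaranteed to see the write and terminate; if the volatile modifier were removed, the JMM would permit the worker to spin forever. The names VisibilityDemo and workerStops are illustrative assumptions.

```java
class VisibilityDemo {
    // volatile: the write in workerStops() must become visible to the worker's reads.
    static volatile boolean stop = false;

    /** Starts a spinning worker, requests it to stop, and reports whether it ended within the timeout. */
    static boolean workerStops(long timeoutMillis) {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until the flag is observed
            }
        });
        worker.setDaemon(true);   // never keep the JVM alive because of the demo thread
        worker.start();
        stop = true;              // volatile write
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(workerStops(5000));   // prints true
    }
}
```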
Hidden Pitfalls in Using JUnit to Test Concurrent Java
JUnit Is Completely Broken for Concurrent Code Units:
● Fails to detect exceptions and failed assertions in threads other than the main thread (!)
● Fails to detect if an auxiliary thread is still running when the main thread terminates; all execution is aborted when the main thread terminates.
● Fails to ensure that all auxiliary threads were joined by the main thread before termination. (In Habanero Java, all programs are implicitly enclosed in a comprehensive join called finish(), but not in Java.)
Possible Solutions to Concurrent Testing Problems
Programming Language Features
● Ensure that bad things cannot happen; perhaps ensure determinism (reducing testing to sequential semantics!)
● May restrict programmers
Comprehensive Testing
● Testing if bad things happen in any schedule
● All schedules may be too stringent for programs involving GUIs
● Does not limit space of solutions but testing burden is greatly increased.
● Good testing tools are essential.
Coping with the Java Memory Model
● Avoid using synchronized and minimize the size of synchronized blocks to reduce likelihood of deadlock.
● Identify all classes that can be shared and make all fields in such classes either final or volatile. Ensures sequential consistency (almost).
● Array elements are still technically a problem because they cannot be marked as volatile. The ConcurrentUtilities library includes a special form of array with volatile elements.
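As a rough illustration of this policy, the sketch below declares every field of a shared class either final or volatile. The API of the ConcurrentUtilities array is not shown in these slides, so the example substitutes the JDK's java.util.concurrent.atomic.AtomicIntegerArray, whose get and set have volatile read and write semantics per element. The class name SharedState and its methods are illustrative assumptions.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class SharedState {
    private final String name;                // final: fixed at construction
    private volatile int version;             // volatile: writes are visible to later reads
    private final AtomicIntegerArray slots;   // stand-in for an array with volatile elements

    SharedState(String name, int size) {
        this.name = name;
        this.slots = new AtomicIntegerArray(size);   // all elements start at 0
    }

    String name() { return name; }
    int version() { return version; }
    void setVersion(int v) { version = v; }

    int slot(int i) { return slots.get(i); }                  // volatile read of element i
    void setSlot(int i, int value) { slots.set(i, value); }   // volatile write of element i

    public static void main(String[] args) {
        SharedState s = new SharedState("demo", 4);
        s.setSlot(2, 7);
        s.setVersion(3);
        System.out.println(s.name() + " " + s.version() + " " + s.slot(2));   // prints demo 3 7
    }
}
```

For element types other than int, AtomicReferenceArray and AtomicLongArray in the same package play the same role.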
Improvements to JUnit
Uncaught exceptions and failed assertions
– Not caught in child threads
ConcJUnit, developed by my former graduate student Mathias Ricken, fixes all of the problems with JUnit.
● Developed for Java 6; Java 7 not yet supported.
● Mathias developed some other tools to help test concurrent programs, but none of them has yet reached production quality (e.g., random delays/yields). Research idea: JVM from Hell.
● Presumably easy to use the ConcJUnit jar instead of JUnit in Eclipse. Designed for drop-in compatibility with JUnit 4.7.
Sample JUnit Tests
import junit.framework.TestCase;

public class Test extends TestCase {
    public void testException() {
        throw new RuntimeException("booh!");
    }
    public void testAssertion() {
        // In effect: if (0 != 1) throw new AssertionFailedError();
        assertEquals(0, 1);
    }
}
Both tests fail.
Problematic JUnit Tests
public class Test extends TestCase {
    public void testException() {
        new Thread(new Runnable() {
            public void run() {
                throw new RuntimeException("booh!");
            }
        }).start();
    }
}
Main thread: spawns the child thread → end of test → success!
Child thread: uncaught!
Uncaught exception in the child thread: the test should fail but does not!
Problematic JUnit Tests
public class Test extends TestCase {
    public void testFailure() {
        new Thread(new Runnable() {
            public void run() {
                fail("This thread fails!");
            }
        }).start();
    }
}
Uncaught exception in the child thread: the test should fail but does not!
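The effect can be reproduced without JUnit. The sketch below uses a deliberately naive stand-in for a test runner (it is not JUnit's actual implementation): it reports failure only when an exception propagates out of the test body on the calling thread. All names in it are illustrative assumptions.

```java
class NaiveRunnerDemo {
    /** Mimics a runner that only sees exceptions thrown on its own thread. */
    static boolean passes(Runnable test) {
        try {
            test.run();
            return true;
        } catch (Throwable t) {
            return false;
        }
    }

    /** The exception propagates to the runner, so the test fails as expected. */
    static void throwInCallingThread() {
        throw new RuntimeException("booh!");
    }

    /** The exception ends the child thread only; the runner never sees it. */
    static void throwInChildThread() {
        Thread child = new Thread(() -> { throw new RuntimeException("booh!"); });
        // Replace the default handler only to keep the stack trace out of the demo output.
        child.setUncaughtExceptionHandler((thread, error) -> { });
        child.start();
        try {
            child.join();   // even after waiting for the child, nothing reaches this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(passes(NaiveRunnerDemo::throwInCallingThread));   // prints false
        System.out.println(passes(NaiveRunnerDemo::throwInChildThread));     // prints true
    }
}
```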
Thread Group for JUnit Tests
Same testException() as in Problematic JUnit Tests: the child thread throws RuntimeException("booh!").
Main thread: spawns the test thread and waits; resumes at end of test; checks the group's handler → failure!
Test thread: spawns the child thread → end of test
Child thread: uncaught! → invokes the TestGroup's uncaught exception handler
Improvements to JUnit
Uncaught exceptions and failed assertions
– Not caught in child threads
Thread group with exception handler
– JUnit test runs in a separate thread, not main thread
– Child threads are created in same thread group
– When test ends, check if handler was invoked
Detection of uncaught exceptions and failed assertions in child threads that occurred before test's end
Past tense: occurred!
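The thread group idea can be sketched with the standard java.lang.ThreadGroup API, whose uncaughtException method is called for any thread in the group that dies from an uncaught exception. This is a minimal sketch of the approach described above, not ConcJUnit's actual code. The names GroupRunnerDemo, RecordingGroup, and run are illustrative assumptions.

```java
import java.util.concurrent.atomic.AtomicReference;

class GroupRunnerDemo {
    /** Thread group that records the first uncaught exception thrown by any of its threads. */
    static final class RecordingGroup extends ThreadGroup {
        final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();

        RecordingGroup() { super("test-group"); }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            failure.compareAndSet(null, e);
        }
    }

    /** Runs the test in its own thread inside a fresh group; returns the recorded failure, or null. */
    static Throwable run(Runnable test) {
        RecordingGroup group = new RecordingGroup();
        Thread testThread = new Thread(group, test, "test-thread");
        testThread.start();
        try {
            testThread.join();   // the calling thread waits for the end of the test
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return group.failure.get();   // sees only exceptions that occurred before this check
    }

    /** Test body: the child is created without an explicit group, so it joins the test thread's group. */
    static void spawnAndJoinFailingChild() {
        Thread child = new Thread(() -> { throw new RuntimeException("booh!"); });
        child.start();
        try {
            child.join();   // ensures the child has failed before the test ends
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Throwable failure = run(GroupRunnerDemo::spawnAndJoinFailingChild);
        System.out.println(failure == null ? "success" : "failure: " + failure.getMessage());
    }
}
```

The test body here joins its child on purpose. Without that join, the check in run can happen before the child throws, and the failure is missed.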
Child Thread Outlives Parent
Same testException() as before: the child thread throws RuntimeException("booh!").
If the child thread fails before the end of the test:
Main thread: spawns the test thread and waits; resumes at end of test; checks the group's handler → failure!
Child thread: uncaught! → invokes the group's handler
If the child thread outlives the test thread:
Main thread: spawns the test thread and waits; resumes at end of test; checks the group's handler → success!
Child thread: uncaught! → invokes the group's handler. Too late!
Enforced Join
public class Test extends TestCase {
    public void testException() throws InterruptedException {
        Thread t = new Thread(new Runnable() {
            public void run() {
                throw new RuntimeException("booh!");
            }
        });
        t.start(); … t.join(); …
    }
}
The test thread joins the child thread before the end of the test.
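One way to notice a missing join is to ask the test's thread group which threads are still alive once the test thread has ended. The sketch below uses ThreadGroup.activeCount() and ThreadGroup.enumerate(Thread[]) for that purpose. It is an illustration of the idea only, not ConcJUnit's implementation, and the names LingeringThreadDemo, liveThreadsAfter, withoutJoin, and withJoin are assumptions.

```java
import java.util.concurrent.CountDownLatch;

class LingeringThreadDemo {
    /** Runs the test in its own thread and group; returns how many group threads are still alive afterwards. */
    static int liveThreadsAfter(Runnable test) {
        ThreadGroup group = new ThreadGroup("test-group");
        Thread testThread = new Thread(group, test, "test-thread");
        testThread.start();
        join(testThread);
        // activeCount() is only an estimate, so leave some slack in the array.
        Thread[] alive = new Thread[group.activeCount() + 8];
        return group.enumerate(alive);
    }

    /** Test body that starts a child blocked on the latch and returns without joining it. */
    static Runnable withoutJoin(CountDownLatch release) {
        return () -> {
            Thread child = new Thread(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    // interrupted: just let the child end
                }
            });
            child.setDaemon(true);   // do not keep the JVM alive if the latch is never released
            child.start();
        };
    }

    /** Test body that joins its child before returning. */
    static Runnable withJoin() {
        return () -> {
            Thread child = new Thread(() -> { });
            child.start();
            join(child);
        };
    }

    static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        CountDownLatch release = new CountDownLatch(1);
        System.out.println("without join: " + liveThreadsAfter(withoutJoin(release)));
        release.countDown();   // let the lingering child finish
        System.out.println("with join:    " + liveThreadsAfter(withJoin()));
    }
}
```

A runner built this way could report a non-zero count as a test failure, which forces test authors to join every thread they start.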
Testing Using ConcJUnit
Replacement for junit.jar or as plugin JAR for JUnit 4.7; compatible with Java 6 (not 7 or 8)
● Available as binary and source at http://www.concutest.org/
Results from DrJava's unit tests
● Child thread for communication with slave VM still alive in test
● Several reader and writer threads still alive in low level test (calls to join() missing)
DrJava currently does not use ConcJUnit
● Tests based on a custom-made class extending junit.framework.TestCase
● Does not check if join() calls are missing
Conclusion
Improved JUnit now detects problems in other threads
– Only in chosen schedule
– Needs schedule-based execution
Annotations ease documentation and checking of concurrency invariants
– Open-source library of Java API invariants
Support programs for schedule-based execution
Future Work
Adversary scheduling using delays/yields (JVM from Hell)
Schedule-Based Execution (Impractical?)
– Replay stored schedules
– Generate representative schedules
– Dynamic race detection (which races are bugs?)
– Randomized schedules (JVM from Hell)
Support annotations from Floyd-Hoare logic
– Declare and check contracts (preconditions & postconditions for methods)
– Declare and check class invariants
Tractability of Comprehensive Testing
Test all possible schedules
– Concurrent unit tests meaningful again
Number of schedules (N)
– t: # of threads, s: # of slices per thread
Extra: Number of Schedules
Product of s-combinations
– For thread 1: choose s out of ts time slices
– For thread 2: choose s out of ts-s time slices
– …
– For thread t-1: choose s out of 2s time slices
– For thread t: choose s out of s time slices
Writing s-combinations using factorial
Cancel out terms in denominator and next numerator
Left with (ts)! in the numerator and t factors of s! in the denominator
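The steps above can be written out as follows, with t the number of threads and s the number of slices per thread as defined on the slide. The factor (ts-s)! in the first denominator cancels against the next numerator, and so on down the product.

```latex
\begin{align*}
N &= \binom{ts}{s}\binom{ts-s}{s}\cdots\binom{2s}{s}\binom{s}{s} \\
  &= \frac{(ts)!}{s!\,(ts-s)!}\cdot\frac{(ts-s)!}{s!\,(ts-2s)!}\cdots
     \frac{(2s)!}{s!\,s!}\cdot\frac{s!}{s!\,0!} \\
  &= \frac{(ts)!}{(s!)^{t}}
\end{align*}
```

For example, t = 2 threads with s = 2 slices each give N = 4!/(2!)^2 = 6 schedules.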
Tractability of Comprehensive Testing
If program is race-free, we do not have to simulate all thread switches
– Threads interfere only at "critical points": lock operations, shared or volatile variables, etc.
– Code between critical points cannot affect outcome
– Simulate all possible arrangements of blocks delimited by critical points
Run dynamic race detection in parallel
– Lockset algorithm (e.g. Eraser by Savage et al)
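The core of a lockset check is small: for each shared variable, keep the set of locks that were held on every access seen so far, and warn when that set becomes empty. The sketch below shows only this basic intersection step. The published Eraser tool adds refinements (for example for initialization and read-only sharing) that are omitted here. Representing variables and locks as strings, and the class name LocksetChecker, are simplifying assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LocksetChecker {
    // Candidate locks per variable: the locks held on every access recorded so far.
    private final Map<String, Set<String>> candidates = new HashMap<String, Set<String>>();

    /**
     * Records an access to the variable by a thread holding locksHeld.
     * Returns false if no lock has been held consistently, which signals a potential race.
     */
    boolean access(String variable, Set<String> locksHeld) {
        Set<String> current = candidates.get(variable);
        if (current == null) {
            current = new HashSet<String>(locksHeld);   // first access: start from the held locks
            candidates.put(variable, current);
        } else {
            current.retainAll(locksHeld);               // intersect with the locks held now
        }
        return !current.isEmpty();
    }

    public static void main(String[] args) {
        LocksetChecker checker = new LocksetChecker();
        Set<String> lock = Collections.singleton("lock");
        Set<String> none = Collections.<String>emptySet();
        System.out.println(checker.access("sharedVar", lock));   // prints true
        System.out.println(checker.access("sharedVar", lock));   // prints true
        System.out.println(checker.access("sharedVar", none));   // prints false
    }
}
```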
Critical Points Example
[Diagram: Thread 1 and Thread 2 each use a local variable (Local Var 1) and access a shared variable (Shared Var) through a lock (Lock); each access is a lock, access, unlock sequence.]
– All accesses to the shared variable are protected by the lock
– Local variables don't need locking
Fewer Schedules
Fewer critical points than thread switches
– Reduces number of schedules
– Example: Two threads, but no communication → N = 1
Unit tests are small
– Reduces number of schedules
Hopefully comprehensive simulation is tractable
– If not, heuristics are still better than nothing
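To get a feeling for how much the block count matters, the count of interleavings of t threads with s blocks each, (ts)!/(s!)^t, can be computed with BigInteger. This is a small illustrative calculation. The class name ScheduleCount and the sample values are assumptions chosen for the example.

```java
import java.math.BigInteger;

class ScheduleCount {
    /** Number of interleavings of t threads with s blocks each: (ts)! / (s!)^t. */
    static BigInteger schedules(int t, int s) {
        return factorial(t * s).divide(factorial(s).pow(t));
    }

    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two threads, ten possible thread switches each.
        System.out.println(schedules(2, 10));   // prints 184756
        // Two threads, but only three blocks delimited by critical points each.
        System.out.println(schedules(2, 3));    // prints 20
    }
}
```

Cutting each thread from ten slices to three blocks reduces the count from 184756 to 20, which is why restricting attention to critical points makes small unit tests far more tractable.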
Limitations
Improvements only check chosen schedule
– A different schedule may still fail
– Requires comprehensive testing to be meaningful
May still miss uncaught exceptions
– Specify absolute parent thread group, not relative
– Cannot detect uncaught exceptions in a program's uncaught exception handler (JLS limitation)
Extra: Limitations
May still miss uncaught exceptions
– Specify absolute parent thread group, not relative (rare)
  Koders.com: 913 matches for ThreadGroup vs. 49,329 matches for Thread
– Cannot detect uncaught exceptions in a program's uncaught exception handler (JLS limitation)
  Koders.com: 32 method definitions for uncaughtException method
Extra: DrJava Statistics
                      2004      2006
Unit tests             736       881
  passed               610       881
  failed                36         0
  not run               90         0
Invariants            5116     34412
  met                 4161     30616
  failed               965      3796
  % failed          18.83%    11.03%
KLOC                   107       129
"event thread"           1        99