Unit Testing in Java with an Emphasis on Concurrency Corky Cartwright Rice and Halmstad Universities Summer 2013

Software Engineering Culture

● Three Guiding Visions
– Data-driven design
– Test-driven development
– Mostly functional coding (no gratuitous mutation)

● Codified in the Design Recipe taught in How to Design Programs by Felleisen et al. (available for free online: www.htdp.org [first edition], www.ccs.neu.edu/home/matthias/HtDP2e/Draft [second edition]) and Elements of Object-Oriented Design (available online). The target languages are Scheme and Java.

Moore’s Law

Extrapolate the Future

Timeliness

● CPU clock frequencies stagnate

● Multi-Core CPUs provide additional processing power, but multiple threads needed to use multiple cores.

● Writing concurrent programs is difficult!

Tutorial Outline

● Introduce unit testing in single-threaded (deterministic) setting using lists

● Demonstrate problems introduced by concurrency and their impact on unit testing

● Show how some of the most basic problems can be overcome by using the right policies and tools.

(Sequential) Unit Testing

Unit tests …
● Test parts of the program (including whole!)
● Integrate with program development; commits to repository must pass all unit tests
● Automate testing during maintenance phase
● Serve as documentation
● Prevent bugs from reoccurring
● Help keep the code repository clean

Effective with a single thread of control

Universal Test-Driven Design Recipe

Analyze the problem: define the data and determine top level operations. Give sample data values.

Define type signatures, contracts, and headers for all top level operations. In Java, the type signature is part of the header.

Give input-output examples including critical boundary cases for each operation.

Write a template for each operation, typically based on structural decomposition of primary argument (the receiver in OO methods).

Code each method by filling in templates

Test every method (using I/O examples!) and ascertain that every method is tested on a sufficient set of examples. White-box testing matters!

Sequential Case Studies: Functional Lists and Bi-Lists

A List<E> is either
● Empty<E>(), or
● Cons<E>(e, l) where e is an E and l is a List<E>

A BiList<E> is a mutable data structure containing a possibly empty sequence of objects of type E that can be traversed in either direction using a BiListIterator<E>.
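The List<E> definition above maps directly onto a small class hierarchy. The following is a minimal sketch, not the tutorial's actual code: the interface is named FList here (an assumption, chosen to avoid clashing with java.util.List), and the methods length and append are illustrative choices that follow the structural template of the data definition.

```java
// Hypothetical sketch of the functional list definition (composite pattern).
interface FList<E> {
    int length();
    FList<E> append(FList<E> other);
}

// The empty list: base case of every structurally recursive method.
final class Empty<E> implements FList<E> {
    public int length() { return 0; }
    public FList<E> append(FList<E> other) { return other; }
    @Override public String toString() { return "()"; }
}

// A non-empty list: first element e plus the rest of the list l.
final class Cons<E> implements FList<E> {
    private final E first;
    private final FList<E> rest;

    Cons(E first, FList<E> rest) {
        this.first = first;
        this.rest = rest;
    }

    public int length() { return 1 + rest.length(); }

    // No mutation: append builds a new list that shares 'other'.
    public FList<E> append(FList<E> other) {
        return new Cons<E>(first, rest.append(other));
    }

    @Override public String toString() { return "(" + first + " " + rest + ")"; }
}

class FListExamples {
    public static void main(String[] args) {
        FList<Integer> empty = new Empty<Integer>();
        FList<Integer> oneTwo = new Cons<Integer>(1, new Cons<Integer>(2, empty));
        // Input-output examples, including the boundary case of the empty list.
        System.out.println(empty.length());
        System.out.println(oneTwo.append(oneTwo).length());
    }
}
```

Because every field is final and no method mutates its receiver, the same input always yields the same output, which is what makes such examples usable as deterministic unit tests.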

Review Elements of Sequential Unit Testing

● Unit tests depend on deterministic behavior

● Known input, expected output…
– Success → correct behavior
– Failure → flawed code

● Outcome of test is meaningful if test is deterministic

Problems Due to Concurrency

● Thread scheduling is nondeterministic and machine-dependent
– Code may be executed under different schedules
– Different schedules may produce different results

● Known input, expected output(s?)…
– Success → correct behavior in this schedule, may be flawed in another schedule
– Failure → flawed code

● Success of unit test is meaningless
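A minimal sketch of why a passing run proves little under concurrency. The class CounterRace and its fields are hypothetical names, not part of the tutorial. Several threads increment a plain int and an AtomicInteger side by side; the atomic total is always exact, while the plain total depends on the schedule and may come up short on some runs and not on others.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterRace {
    static int unsafeCount;                                     // plain field: updates can be lost
    static final AtomicInteger safeCount = new AtomicInteger(); // atomic read-modify-write

    static void runOnce(int threads, final int perThread) throws InterruptedException {
        unsafeCount = 0;
        safeCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        unsafeCount++;               // not atomic: load, add, store
                        safeCount.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();                                // wait for every worker
        }
    }

    public static void main(String[] args) throws InterruptedException {
        runOnce(4, 100000);
        System.out.println("atomic: " + safeCount.get() + ", plain: " + unsafeCount);
    }
}
```

A unit test asserting that the plain counter equals threads * perThread could pass on one machine and fail on another, so its success says nothing about other schedules.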

Recommended Resources on Concurrent Programming in Java

Explicit Concurrency:
● Comp 402 web site from 2009
● Brian Goetz, Java Concurrency in Practice
(available online at this website)

Coping with Multicore:
● Emerging parallel extensions of Java/Scala that guarantee determinism (in designated subset), do not require explicit synchronization, and avoid JMM issues
– Habanero Java
– Habanero Scala

Success of non-deterministic unit test is not very meaningful

Problems Due to Java Memory Model

● JMM is MUCH weaker than sequential consistency

● Writes to shared data may be held pending indefinitely unless target is declared volatile or is shielded by the same lock as subsequent reads.

● Why not always use locking (synchronized)?
– Significant overhead
– Increases likelihood of deadlock

Extremely difficult to reason about program execution for specific inputs because so many schedules are allowed.

A model that accommodates compiler writers rather than software developers.
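The visibility rule above can be illustrated with a stop flag. This is a hedged sketch with hypothetical names (StopFlag, workerStops): because the flag is declared volatile, the worker is guaranteed to see the write eventually. If the volatile modifier were removed, the JMM would permit the worker to spin forever on a stale value, although whether that actually happens depends on the JVM and the machine.

```java
class StopFlag {
    // Hypothetical flag. Without 'volatile' the JMM would allow the worker
    // to keep reading a stale 'false' indefinitely.
    static volatile boolean stop;

    /** Starts a spinning worker, sets the flag, and reports whether the worker ended in time. */
    static boolean workerStops(long timeoutMillis) throws InterruptedException {
        stop = false;
        Thread worker = new Thread(new Runnable() {
            public void run() {
                while (!stop) {
                    // busy-wait until the write to 'stop' becomes visible
                }
            }
        });
        worker.setDaemon(true);       // never keep the JVM alive because of this thread
        worker.start();
        Thread.sleep(20);             // let the worker start spinning
        stop = true;                  // volatile write
        worker.join(timeoutMillis);   // wait, but not forever
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + workerStops(5000));
    }
}
```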

Hidden Pitfalls in Using JUnit to Test Concurrent Java

JUnit Is Completely Broken for Concurrent Code Units:

● Fails to detect exceptions and failed assertions in threads other than the main thread (!)
● Fails to detect if auxiliary thread is still running when main thread terminates; all execution is aborted when main thread terminates.
● Fails to ensure that all auxiliary threads were joined by main thread before termination. (In Habanero Java, all programs are implicitly enclosed in a comprehensive join called finish(), but not in Java.)

Possible Solutions to Concurrent Testing Problems

● Programming Language Features
– Ensure that bad things cannot happen; perhaps ensure determinism (reducing testing to sequential semantics!)
– May restrict programmers

● Comprehensive Testing
– Testing if bad things happen in any schedule
– All schedules may be too stringent for programs involving GUIs
– Does not limit space of solutions but testing burden is greatly increased.
– Good testing tools are essential.

Coping with the Java Memory Model

● Avoid using synchronized and minimize the size of synchronized blocks to reduce likelihood of deadlock.
● Identify all classes that can be shared and make all fields in such classes either final or volatile. Ensures sequential consistency (almost).
● Array elements are still technically a problem because they cannot be marked as volatile. The ConcurrentUtilities library includes a special form of array with volatile elements.
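A minimal sketch of the "final or volatile" policy for a shared class. It does not use the ConcurrentUtilities array mentioned above; as a stand-in it uses the standard java.util.concurrent.atomic.AtomicIntegerArray, whose get and set have the memory effects of volatile reads and writes. The class names SharedSlots and SharedSlotsDemo are hypothetical.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// A shareable class: its only field is final, and element access
// goes through an array type with volatile-style get/set.
final class SharedSlots {
    private final AtomicIntegerArray slots;

    SharedSlots(int size) {
        slots = new AtomicIntegerArray(size);
    }

    void put(int index, int value) { slots.set(index, value); } // like a volatile write
    int get(int index) { return slots.get(index); }              // like a volatile read
    int size() { return slots.length(); }
}

class SharedSlotsDemo {
    /** A writer thread fills slots with 1..n; the caller then sums them. */
    static int sumAfterWriter(final int n) throws InterruptedException {
        final SharedSlots shared = new SharedSlots(n);
        Thread writer = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < n; i++) {
                    shared.put(i, i + 1);
                }
            }
        });
        writer.start();
        writer.join();
        int sum = 0;
        for (int i = 0; i < shared.size(); i++) {
            sum += shared.get(i);
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumAfterWriter(4));
    }
}
```

A plain int[] field declared volatile would only make the array reference volatile, not its elements, which is the gap this kind of array type closes.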

Improvements to JUnit

● Uncaught exceptions and failed assertions
– Not caught in child threads

● ConcJUnit, developed by my former graduate student Mathias Ricken, fixes all of the problems with JUnit.
– Developed for Java 6; Java 7 not yet supported.

● Mathias developed some other tools to help test concurrent programs, but none of them has yet reached production quality (e.g., random delays/yields). Research idea: JVM from Hell.

● Presumably easy to use ConcJUnit jar instead of JUnit in Eclipse. Designed for drop-in compatibility with JUnit 4.7.

Sample JUnit Tests

    import junit.framework.TestCase;

    public class Test extends TestCase {
        public void testException() {
            throw new RuntimeException("booh!");
        }
        public void testAssertion() {
            assertEquals(0, 1);  // in effect: if (0 != 1) throw new AssertionFailedError();
        }
    }

Both tests fail.

Problematic JUnit Tests

    public class Test extends TestCase {
        public void testException() {
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }

Diagram:
● Main thread spawns child thread
● Main thread reaches end of test: success!
● Child thread throws: uncaught!

Uncaught exception, test should fail but does not!

Problematic JUnit Tests

    public class Test extends TestCase {
        public void testFailure() {
            new Thread(new Runnable() {
                public void run() {
                    fail("This thread fails!");
                }
            }).start();
        }
    }

Diagram: main thread, child thread.

Uncaught exception, test should fail but does not!
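The effect shown on these slides can be reproduced without JUnit. The following sketch (SilentChildFailure is a hypothetical name) plays the role of a test method: it spawns a child that throws, and checks whether the exception ever reaches the spawning thread. It does not, even when the spawner joins the child. The default handler merely prints the child's stack trace to standard error.

```java
class SilentChildFailure {
    /** Returns true only if the child's exception propagates to the spawning thread. */
    static boolean spawnerSeesFailure() throws InterruptedException {
        Thread child = new Thread(new Runnable() {
            public void run() {
                throw new RuntimeException("booh!");
            }
        });
        try {
            child.start();
            child.join();   // even after joining, the exception stays in the child
            return false;   // reached: nothing was thrown in this thread
        } catch (RuntimeException e) {
            return true;    // never reached for exceptions thrown by the child
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("spawner saw failure: " + spawnerSeesFailure());
    }
}
```

A test runner that only watches the thread executing the test method therefore has no way to notice the failure.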

Thread Group for JUnit Tests

    public class Test extends TestCase {
        public void testException() {
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }

Diagram:
● Main thread spawns test thread and waits
● Test thread spawns child thread
● Child thread throws: uncaught! Invokes TestGroup’s Uncaught Exception Handler
● End of test: main thread resumes and checks group’s handler: failure!

Improvements to JUnit

● Uncaught exceptions and failed assertions
– Not caught in child threads

● Thread group with exception handler
– JUnit test runs in a separate thread, not main thread
– Child threads are created in same thread group
– When test ends, check if handler was invoked

● Detection of uncaught exceptions and failed assertions in child threads that occurred before test’s end

Past tense: occurred!
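The thread-group idea can be sketched with the standard ThreadGroup.uncaughtException hook. This is an illustration under assumptions, not ConcJUnit's actual implementation: RecordingGroup and GroupRunner are hypothetical names. The test body runs in its own thread inside the group, any thread it creates inherits that group by default, and the caller inspects the recorded failure once the test thread has ended.

```java
import java.util.concurrent.atomic.AtomicReference;

// A thread group that remembers the first uncaught throwable of any member thread.
class RecordingGroup extends ThreadGroup {
    final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();

    RecordingGroup() {
        super("test-group");
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        failure.compareAndSet(null, e);   // keep only the first failure
    }
}

class GroupRunner {
    /**
     * Runs 'body' as the test thread inside a fresh RecordingGroup and returns the
     * first uncaught throwable recorded by the time the test thread has ended, or null.
     */
    static Throwable run(Runnable body) throws InterruptedException {
        RecordingGroup group = new RecordingGroup();
        Thread testThread = new Thread(group, body, "test-thread");
        testThread.start();
        testThread.join();            // the caller waits, then checks the handler
        return group.failure.get();
    }
}
```

The check happens at the moment the test thread ends, so a child that fails later is missed, which is the point of the "past tense" remark.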

Child Thread Outlives Parent

    public class Test extends TestCase {
        public void testException() {
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }

Diagram, schedule 1:
● Main thread spawns test thread and waits
● Test thread spawns child thread
● Child thread throws: uncaught! Invokes group’s handler
● End of test: main thread resumes and checks group’s handler: failure!

Diagram, schedule 2:
● Main thread spawns test thread and waits
● Test thread spawns child thread
● End of test: main thread resumes and checks group’s handler: success!
● Child thread throws: uncaught! Invokes group’s handler

Too late!
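One way to notice a child that outlives the test is to ask the test's thread group which threads are still alive once the test thread has ended. The sketch below uses only the standard ThreadGroup.enumerate call; LingeringThreads, aliveAfterTest, and leakyBody are hypothetical names, and this is not claimed to be how ConcJUnit performs the check.

```java
import java.util.concurrent.CountDownLatch;

class LingeringThreads {
    /** Runs 'body' as a test thread in its own group; returns how many group threads are still alive afterwards. */
    static int aliveAfterTest(Runnable body) throws InterruptedException {
        ThreadGroup group = new ThreadGroup("test-group");
        Thread testThread = new Thread(group, body, "test-thread");
        testThread.start();
        testThread.join();                        // the "test" has ended here
        // activeCount() is only an estimate, so leave some slack in the array.
        Thread[] survivors = new Thread[group.activeCount() + 8];
        return group.enumerate(survivors);        // number of live threads copied
    }

    /** A test body that forgets to join: its child blocks until 'release' is counted down. */
    static Runnable leakyBody(final CountDownLatch release) {
        return new Runnable() {
            public void run() {
                Thread child = new Thread(new Runnable() {
                    public void run() {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            // interrupted: just exit
                        }
                    }
                });
                child.setDaemon(true);            // do not keep the JVM alive
                child.start();                    // no join(): child outlives the test
            }
        };
    }
}
```

A runner could treat a nonzero result as a test failure or a warning, since anything such a thread does from then on escapes the handler check.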

Enforced Join

    public class Test extends TestCase {
        public void testException() throws InterruptedException {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            });
            t.start(); … t.join(); …
        }
    }

Diagram: the test thread starts the child thread and joins it before the end of the test.
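With plain JUnit, a join alone is not enough, because the exception still stays in the child. A common workaround, sketched here with hypothetical names (JoinedChild, runAndJoin), is to record the child's failure with a per-thread Thread.UncaughtExceptionHandler, join, and rethrow in the calling thread. This is an alternative to the thread-group mechanism, not a description of ConcJUnit.

```java
import java.util.concurrent.atomic.AtomicReference;

class JoinedChild {
    /** Runs 'task' in a child thread, joins it, and rethrows any failure in the caller. */
    static void runAndJoin(Runnable task) throws InterruptedException {
        final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();
        Thread t = new Thread(task);
        t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread thread, Throwable e) {
                failure.set(e);                   // runs in the dying child thread
            }
        });
        t.start();
        t.join();                                 // enforced join: child is done after this
        Throwable e = failure.get();
        if (e != null) {
            throw new RuntimeException("child thread failed", e);
        }
    }
}
```

Called from a test method, runAndJoin makes the child's failure surface in the thread that the test runner is watching, so the test fails as it should.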

Testing Using ConcJUnit

● Replacement for junit.jar or as plugin JAR for JUnit 4.7; compatible with Java 6 (not 7 or 8)

● Available as binary and source at http://www.concutest.org/

● Results from DrJava’s unit tests
– Child thread for communication with slave VM still alive in test
– Several reader and writer threads still alive in low level test (calls to join() missing)

● DrJava currently does not use ConcJUnit
– Tests based on a custom-made class extending junit.framework.TestCase
– Does not check if join() calls are missing

Conclusion

● Improved JUnit now detects problems in other threads
– Only in chosen schedule
– Needs schedule-based execution

● Annotations ease documentation and checking of concurrency invariants
– Open-source library of Java API invariants

● Support programs for schedule-based execution

Future Work

● Adversary scheduling using delays/yields (JVM from Hell)

● Schedule-Based Execution (Impractical?)
– Replay stored schedules
– Generate representative schedules
– Dynamic race detection (what races bugs?)
– Randomized schedules (JVM from Hell)

● Support annotations from Floyd-Hoare logic
– Declare and check contracts (preconditions & postconditions for methods)
– Declare and check class invariants

Extra Slides

Tractability of Comprehensive Testing

● Test all possible schedules
– Concurrent unit tests meaningful again

● Number of schedules (N)
– t: # of threads, s: # of slices per thread

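Extra: Number of Schedules

Product of s-combinations:
● For thread 1: choose s out of ts time slices
● For thread 2: choose s out of ts-s time slices
● …
● For thread t-1: choose s out of 2s time slices
● For thread t: choose s out of s time slices

$$N = \binom{ts}{s}\binom{ts-s}{s}\cdots\binom{2s}{s}\binom{s}{s}$$

Writing s-combinations using factorials:

$$N = \frac{(ts)!}{s!\,(ts-s)!}\cdot\frac{(ts-s)!}{s!\,(ts-2s)!}\cdots\frac{(2s)!}{s!\,s!}\cdot\frac{s!}{s!\,0!}$$

Cancel out terms in each denominator and the next numerator.

Left with (ts)! in the numerator and t factors of s! in the denominator:

$$N = \frac{(ts)!}{(s!)^{t}}$$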
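To get a feel for how fast the number of schedules grows, the closed form that this derivation arrives at, (ts)! divided by t factors of s!, can be evaluated directly. The sketch below uses BigInteger because the values overflow long very quickly; ScheduleCount and its method names are hypothetical.

```java
import java.math.BigInteger;

class ScheduleCount {
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Number of schedules for 'threads' threads with 'slices' time slices each: (t*s)! / (s!)^t. */
    static BigInteger schedules(int threads, int slices) {
        return factorial(threads * slices).divide(factorial(slices).pow(threads));
    }

    public static void main(String[] args) {
        System.out.println(schedules(2, 2));   // 4! / (2! * 2!) = 6
        System.out.println(schedules(3, 2));   // 6! / (2!)^3 = 90
        System.out.println(schedules(4, 10));  // already astronomically large
    }
}
```

Two threads with two slices each already allow 6 interleavings, and three threads allow 90, which is why reducing the number of points at which threads can switch matters so much.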

Tractability of Comprehensive Testing

● If program is race-free, we do not have to simulate all thread switches
– Threads interfere only at “critical points”: lock operations, shared or volatile variables, etc.
– Code between critical points cannot affect outcome
– Simulate all possible arrangements of blocks delimited by critical points

● Run dynamic race detection in parallel
– Lockset algorithm (e.g. Eraser by Savage et al.)

Critical Points Example

[Figure: Thread 1 and Thread 2 each have a local variable and access a shared variable through lock / access / unlock sequences on a common lock. All accesses to the shared variable are protected by the lock; local variables don’t need locking.]

Fewer Schedules

● Fewer critical points than thread switches
– Reduces number of schedules
– Example: Two threads, but no communication: N = 1

● Unit tests are small
– Reduces number of schedules

● Hopefully comprehensive simulation is tractable
– If not, heuristics are still better than nothing

Limitations

● Improvements only check chosen schedule
– A different schedule may still fail
– Requires comprehensive testing to be meaningful

● May still miss uncaught exceptions
– Specify absolute parent thread group, not relative
– Cannot detect uncaught exceptions in a program’s uncaught exception handler (JLS limitation)

Extra: Limitations

● May still miss uncaught exceptions
– Specify absolute parent thread group, not relative (rare)
  Koders.com: 913 matches for ThreadGroup vs. 49,329 matches for Thread
– Cannot detect uncaught exceptions in a program’s uncaught exception handler (JLS limitation)
  Koders.com: 32 method definitions for uncaughtException method

Extra: DrJava Statistics

                      2004      2006
Unit tests             736       881
  passed               610       881
  failed                36         0
  not run               90         0
Invariants            5116     34412
  met                 4161     30616
  failed               965      3796
  % failed          18.83%    11.03%
KLOC                   107       129
“event thread”           1        99