Unit Testing Concurrent Software


<p>William Pugh, Dept. of Computer Science, Univ. of Maryland, College Park, MD, pugh@cs.umd.edu</p>
<p>Nathaniel Ayewah, Dept. of Computer Science, Univ. of Maryland, College Park, MD, ayewah@cs.umd.edu</p>

<p>ABSTRACT</p>
<p>There are many difficulties associated with developing correct multithreaded software, and many of the activities that are simple for single threaded software are exceptionally hard for multithreaded software. One such example is constructing unit tests involving multiple threads. Given, for example, a blocking queue implementation, writing a test case to show that it blocks and unblocks appropriately using existing testing frameworks is exceptionally hard. In this paper, we describe the MultithreadedTC framework which allows the construction of deterministic and repeatable unit tests for concurrent abstractions. This framework is not designed to test for synchronization errors that lead to rare probabilistic faults under concurrent stress. Rather, this framework allows us to demonstrate that code does provide specific concurrent functionality (e.g., a thread attempting to acquire a lock is blocked if another thread has the lock). We describe the framework and provide empirical comparisons against hand-coded tests designed for Sun's Java concurrency utilities library and against previous frameworks that addressed this same issue. The source code for this framework is available under an open source license.</p>

<p>Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging - testing tools</p>
<p>General Terms: Experimentation, Reliability, Verification</p>
<p>Keywords: Java, testing framework, concurrent abstraction, JUnit test cases, MultithreadedTC</p>

<p>Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ASE'07, November 4-9, 2007, Atlanta, Georgia, USA. Copyright 2007 ACM 978-1-59593-882-4/07/0011 ...$5.00.</p>

<p>1. INTRODUCTION</p>
<p>Concurrent applications are often hard to write and even harder to verify. The application developer has to adhere to various locking protocols, avoid race conditions, and coordinate multiple threads. Many developers manage this complexity by separating the concurrent logic from the business logic, limiting concurrency to small abstractions like latches, semaphores and bounded buffers. These abstractions can then be tested independently, adequately validating the correctness of most of the concurrent aspects of the application.</p>
<p>One strategy for testing concurrent components is to write large test cases that contain many possible interleavings and run these test cases many times, hopefully inducing some rare but faulty interleavings. Many concurrent test frameworks are based on this paradigm and provide facilities to increase the diversity of interleavings [5, 9]. But this strategy may miss some interleavings that lead to failures, and does not yield consistent results.</p>
<p>A different strategy is to test specific interleavings separately. But some critical interleavings are hard to exercise because of the presence of blocking and/or timing. This paper describes MultithreadedTC, a framework that allows a test designer to exercise a specific interleaving of threads in an application. It features a metronome (or clock) that allows test designers to coordinate threads even in the presence of blocking and timing issues. The clock advances when all threads are blocked; test designers can delay operations within a thread until the clock has reached a desired tick. MultithreadedTC also features a concise syntax for specifying threads and eliminates much of the scaffolding code required to make multithreaded tests work well with JUnit. It can also detect deadlocks and livelocks.</p>

<p>2. A SIMPLE EXAMPLE</p>
<p>Figure 1 shows four operations on a bounded buffer, distributed over two threads. A bounded buffer allows users to take elements that have been put into the container, in the order in which they were put. It has a fixed capacity and both put and take cause the calling thread to block if the container is full or empty respectively.</p>
<p>Figure 1: (a) Operations in multiple threads on a bounded buffer of capacity 1 must occur in a particular order, (b) MultithreadedTC guarantees the correct order.</p>
<p>In this example, the call to put 17 should block thread 1 until the assertion take = 42 in thread 2 frees up space in the bounded buffer. But how does the test designer guarantee that thread 1 blocks before thread 2's assertion, and does not unblock until after the assertion?</p>
<p>One option is to put thread 2 to sleep for a fixed time period long enough to guarantee that thread 1 has blocked. But this introduces unnecessary timing dependence (the resulting code does not play well in a debugger, for instance) and the presence of multiple timing dependent units would make a large example harder to understand.</p>
<p>Another approach is to make thread 2 wait on a latch that is released at the appropriate time. But this will not work here because the only other thread available to release the latch (thread 1) is blocked.</p>
<p>MultithreadedTC's solution is to superimpose an external clock on both threads. The clock runs in a separate (daemon) thread which periodically checks the status of all test threads. If all the threads are blocked, and at least one thread is waiting for a tick, the clock advances to the next desired tick. In Figure 1, thread 2 blocks immediately, and thread 1 blocks on the call to put 17. At this point, all threads are blocked with thread 2 waiting for tick 1. So the clock thread advances the clock to tick 1 and thread 2 is free to go. Thread 2 then takes from the buffer with the assertion take = 42, and this releases thread 1. Hence with the test, we are confident that (1) thread 1 blocked (because the clock advanced) and (2) thread 1 was not released until after thread 2's first assertion (because of the final assertion in thread 1).</p>
<p>MultithreadedTC is partly inspired by ConAn [11, 12, 13], a script-based concurrency testing framework which also uses a clock to coordinate the activities of multiple threads. We provide a comparison of ConAn and MultithreadedTC in Section 4.2.</p>

<p>3. IMPLEMENTATION DETAILS</p>
<p>Figure 2 shows a Java implementation of the test in Figure 1 and illustrates some of the basic components of a MultithreadedTC test case. Each test is a subclass of MultithreadedTestCase. Threads are specified using thread methods which return void, have no arguments and have names prefixed with the word thread. Each test may also override the initialize() or finish() methods provided by the base class. When a test is run, initialize() is invoked first, then all thread methods are invoked simultaneously in separate threads, and finally when all the thread methods have completed, finish() is invoked. This is analogous to the organization of setup, test and teardown methods in JUnit [2].</p>
<pre>
class BoundedBufferTest extends MultithreadedTestCase {
    ArrayBlockingQueue&lt;Integer&gt; buf;

    @Override public void initialize() {
        buf = new ArrayBlockingQueue&lt;Integer&gt;(1);   // capacity 1
    }

    public void thread1() throws InterruptedException {
        buf.put(42);
        buf.put(17);       // blocks until thread2 takes 42
        assertTick(1);     // so the clock must have advanced to tick 1
    }

    public void thread2() throws InterruptedException {
        waitForTick(1);    // released once all threads are blocked
        assertTrue(buf.take() == 42);
        assertTrue(buf.take() == 17);
    }

    @Override public void finish() {
        assertTrue(buf.isEmpty());
    }
}

// JUnit test
public void testBoundedBuffer() throws Throwable {
    TestFramework.runOnce( new BoundedBufferTest() );
}
</pre>
<p>Figure 2: Java version of bounded buffer test.</p>
<p>Each MultithreadedTestCase maintains a clock that starts at time 0. The clock advances when all threads are blocked and at least one thread is waiting for a tick. The method waitForTick(i) blocks the calling thread until the clock reaches tick i. A test designer can also temporarily prevent the clock from advancing when all threads are blocked by using freezeClock().</p>
<p>The test sequence is run by invoking one of the run methods in TestFramework. (This can be done in a JUnit test, as in Figure 2.) TestFramework is also responsible for regulating the clock. It creates a thread that periodically checks the status of the clock and test threads and decides when to release threads that are waiting for a tick. It can also detect deadlock: when all threads are blocked and none of them is waiting for a tick or a timeout. More details on creating and running tests are available on the project website at [3].</p>

<p>4. TEST FRAMEWORK EVALUATION</p>
<p>To evaluate MultithreadedTC, we performed some qualitative and quantitative comparisons with other test frameworks, focusing on how easy it is to write and understand test cases, and how expressive each test framework is. Section 4.1 compares MultithreadedTC with TCK tests created to validate Java's concurrency utilities library [1]. We take this test suite to represent a typical JUnit-based implementation of multithreaded tests. Section 4.2 compares MultithreadedTC with ConAn.</p>

<p>4.1 JSR 166 TCK comparison</p>
<p>JSR (Java Specification Request) 166 is a set of proposals for adding a concurrency utilities library to Java [1]. This includes components like locks, latches, thread locals and bounded buffers. Its TCK (Technology Compatibility Kit) includes a suite of JUnit tests that validate the correctness of the library's implementation against the specification provided in the JSR. We examined 258 of these tests which attempt to exercise specific interleavings in 33 classes, and implemented them using MultithreadedTC. These tests do not measure performance, nor do they look for rare errors that may occur when a concurrent abstraction is under stress.</p>
<p>In general MultithreadedTC allows the test designer to use simpler constructs that require fewer lines of code. Our evaluation also demonstrates that MultithreadedTC can express all the strategies used in the TCK tests and is often more precise and expressive, especially for tests that do not require a timing dependency. Table 1 gives an overview of the comparison, including the observation that MultithreadedTC requires fewer local variables and anonymous inner classes, in part because it does not make the test designer construct threads manually. The TCK tests generally use anonymous inner classes as illustrated in Figure 3. This syntax is quite verbose, especially since all the useful functionality is in the method run().</p>
<p>Table 1: Overall comparison of TCK tests and MTC (MultithreadedTC) implementation</p>
<table>
<tr><th>Measure</th><th>TCK</th><th>MTC</th></tr>
<tr><td>Lines of Code</td><td>8003</td><td>7070</td></tr>
<tr><td>Bytecode Size</td><td>1017K</td><td>980K</td></tr>
<tr><td>*Local variables per method</td><td>1.12</td><td>0.12</td></tr>
<tr><td>*Av. anon. inner classes/method</td><td>0.38</td><td>0.01</td></tr>
</table>
<p>* Metrics measured by the software quality tool Swat4j [4]</p>
<pre>
threadFailed = false;
...
Thread t = new Thread(new Runnable() {
    public void run() {
        try {
            // statement(s) that may cause exception
        } catch(InterruptedException ie) {
            threadFailed = true;
            fail("Unexpected exception");
        }
    }
});
t.start();
...
t.join();
assertFalse(threadFailed);
</pre>
<p>Figure 3: Handling threads in JUnit tests</p>
<p>Another source of verbosity in Figure 3 is the scaffolding required to handle exceptions. While exceptions in MultithreadedTC cause the test to fail immediately, exceptions in a JUnit test thread will kill the thread but will not fail the test. To force failure, the TCK tests use additional try-catch blocks to catch thread exceptions and set a flag (such as the threadFailed flag in Figure 3). MultithreadedTC also eliminates the last join statement in Figure 3 because it checks to make sure all threads complete and uses finish() to execute code that must run after all threads complete.</p>
<p>Some of the differences we have just described are quantified in Table 2. This table includes simple counts of the number of TCK tests that use the constructs described above (anonymous inner classes, try-catch blocks and joins) in a way that is eliminated by the corresponding MultithreadedTC test. It also counts the total number of times these constructs are removed.</p>
<p>Table 2: The number of TCK tests with constructs that were removed in the MultithreadedTC version and the total number of constructs removed.</p>
<table>
<tr><th>Construct Removed</th><th>Tests</th><th>Removed</th></tr>
<tr><td>Anonymous inner classes</td><td>216</td><td>257</td></tr>
<tr><td>Thread's join() method</td><td>193</td><td>239</td></tr>
</table>

<p>4.2 Comparison to ConAn</p>
<p>ConAn (Concurrency Analyzer) is a script-based test framework that, like MultithreadedTC, uses a clock to synchronize the actions in multiple threads [12]. ConAn also aims to make it possible to write succinct test cases by introducing a script-based syntax illustrated in Figure 4. To run the test, the script is parsed and translated into a Java test driver containing methods which interact with some predefined ConAn library classes and implement the tests specified in the script.</p>
<pre>
#ticktime 200
#monitor m WriterPreferenceReadWriteLock
...
...
#begin
  #test C1 C13
    #tick
      #thread
        #excMonitor m.readLock().attempt(1000); #end
        #valueCheck time() # 1 #end
      #end
    #end
    #tick
      #thread
        #excMonitor m.readLock().release(); #end
        #valueCheck time() # 2 #end
      #end
    #end
  #end
</pre>
<p>Figure 4: The script for a ConAn test to validate a Writer Preference Read Write Lock</p>
<p>Each test is broken into tick blocks that contain one or more thread blocks (which may have optional labels). A thread block may span multiple tick blocks by using the same label. Thread blocks contain valid Java code or use custom script tags to perform common operations like handling exceptions (excMonitor) and asserting values and equalities (valueCheck). Any blocking statements in a given tick may unblock at a later tick, which is confirmed by checking the time.</p>
<p>The major difference between the two frameworks is that ConAn relies on a timer-based clock which ticks at regular intervals, while MultithreadedTC advances the clock when all threads are blocked. ConAn's strategy introduces a timing dependence even in tests that do not involve any timing, like the bounded buffer example in Figure 1.</p>
<p>The two frameworks also use different paradigms to organize tests. ConAn organizes tests by ticks, while MultithreadedTC organizes tests by threads. The two paradigms are difficult to compare quantitatively as each involves tradeoffs for the test designer. MultithreadedTC allows designers to inadvertently introduce indeterminism into tests by placing a waitForTick at the wrong place, while ConAn forces designers to deterministically put code segments into the tick in which they are to run. On the other hand, ConAn's paradigm can be confusing when writing tests because the code in one thread is spread out spatially into many ticks, while MultithreadedTC places all the code for a thread into one method, consistent with Java's paradigm. Also, ConAn's organization hardcodes the content of ticks and does not allow for relative time, which is useful if blocking occurs in a loop.</p>
<p>Another significant difference between the two frameworks is that ConAn relies on a custom script-based syn...</p>
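The clock rule the paper describes (advance only when every test thread is blocked and at least one is waiting for a tick) can be reproduced in a few lines; the following is an added Java sketch, not MultithreadedTC's own code, that runs the bounded buffer scenario against a hand-written clock. The class `TickClockSketch` and the helpers `spawn`, `check`, `blocked` and `runClock` are hypothetical names, `waitForTick` and `assertTick` are reimplemented here only to mirror the paper's API, and the sketch assumes thread states sampled with `Thread.getState()` are a good enough signal, with timed waits not counted as blocked and no deadlock detection.

```java
import java.util.*;
import java.util.concurrent.ArrayBlockingQueue;

// Added sketch with hypothetical names; not MultithreadedTC's implementation.
public class TickClockSketch {
    interface Body { void run() throws Exception; }
    static final Object lock = new Object();
    static int tick = 0;                                         // guarded by lock
    static final Map<Thread, Integer> wanted = new HashMap<>();  // thread -> tick it awaits
    static final List<Throwable> failures = Collections.synchronizedList(new ArrayList<>());

    static void waitForTick(int t) throws InterruptedException {
        synchronized (lock) {
            wanted.put(Thread.currentThread(), t);
            while (tick < t) lock.wait();                        // woken by the clock thread
            wanted.remove(Thread.currentThread());
        }
    }
    static void assertTick(int t) {
        synchronized (lock) { check(tick == t); }
    }
    static void check(boolean ok) { if (!ok) throw new AssertionError("check failed"); }
    // Finished threads count as blocked; timed waits do not (assumption of this sketch).
    static boolean blocked(Thread th) {
        Thread.State s = th.getState();
        return s == Thread.State.BLOCKED || s == Thread.State.WAITING || s == Thread.State.TERMINATED;
    }
    static Thread spawn(Body b) {
        Thread th = new Thread(() -> { try { b.run(); } catch (Throwable e) { failures.add(e); } });
        th.start();
        return th;
    }
    // Clock: once every test thread is blocked and one awaits a later tick, jump to that tick.
    static void runClock(List<Thread> tests) throws InterruptedException {
        while (tests.stream().anyMatch(Thread::isAlive)) {
            Thread.sleep(10);
            boolean allBlocked = tests.stream().allMatch(TickClockSketch::blocked); // sampled without the lock
            synchronized (lock) {
                OptionalInt next = wanted.values().stream().mapToInt(Integer::intValue).filter(w -> w > tick).min();
                if (allBlocked && next.isPresent()) { tick = next.getAsInt(); lock.notifyAll(); }
            }
        }
    }
    public static void main(String[] args) throws Exception {
        ArrayBlockingQueue<Integer> buf = new ArrayBlockingQueue<>(1);
        Thread t1 = spawn(() -> { buf.put(42); buf.put(17); assertTick(1); });
        Thread t2 = spawn(() -> { waitForTick(1); check(buf.take() == 42); check(buf.take() == 17); });
        runClock(Arrays.asList(t1, t2));
        System.out.println(failures.isEmpty() && buf.isEmpty() ? "PASS" : "FAIL " + failures);
    }
}
```

Failures raised inside a test thread are collected in `failures` rather than lost, which is the same problem the `threadFailed` flag in Figure 3 works around by hand.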

