
96 Concurrent/Distributed Computing Paradigm

Andrew P. Bernat, Computer Research Association
Patricia J. Teller, University of Texas at El Paso

96.1 Introduction
96.2 Hardware Architectures
96.3 Software Architectures
    Busy-Wait: Concurrency without Abstractions • Semaphores • Monitors • Message Passing
96.4 Distributed Systems
96.5 Formal Approaches
96.6 Existing Languages with Concurrency Features
96.7 Research Issues
96.8 Summary

    96.1 Introduction

    Concurrent computing is the use of multiple, simultaneously executing processes or tasks to compute

    an answer or solve a problem. The original motivation for the development of concurrent computing

    techniques was for timesharing multiple users or jobs on a single computer. Modern workstations use

    this approach in a substantial manner. Another advantage of concurrent computing, and the reason

    for much of the current attention to the subject, is that it seems obvious that solving a problem using

    multiple computers is faster than using just one. Similarly, there is a powerful economic argument for using

    multiple inexpensive computers to solve a problem that normally requires an expensive supercomputer.

    Additionally, the use of multiple computers can provide fault tolerance.

Moreover, there is an additional powerful argument for concurrent computing: the world is inherently

    concurrent. Just as each of us engages in a large number of concurrent tasks (hearing while seeing while

    reading, etc.), operating systems need to handle multiple, simultaneously executing tasks; robots need

    to engage in a multiplicity of actions; database systems must simultaneously handle large numbers of

    users accessing and updating information; etc. Often, breaking a problem into concurrent tasks provides

    a simpler, more straightforward solution.

As an example, consider Conway's problem: input is in the form of 80-character records (card images
in the original problem, which gives an idea of how long it has been around); output is to be in the form of
120-character records; each pair of dollar signs, '$$', is to be replaced by a single dollar sign, '$'; and a space,
' ', is to be added at the end of each input record. In principle, a sequential solution may be developed,

    but the complications introduced require complex and non-obvious buffer manipulations. Moreover, a


concurrent solution consisting of three processes is both simpler and more elegant. The three processes

    execute within infinite loops performing the following actions:

    1. Process1 reads 80-character records into an 81-character buffer, places a space character in location

    81, and then outputs single characters from the buffer sequentially.

    2. Process2 reads single characters and copies them to output, but uses a simple state machine to

    substitute a single $ for two consecutive $$.

    3. Process3 reads single characters, saves them in a buffer, and outputs 120-character records.

    To develop an implementable solution, we need to decide how the independently executing processes

    communicate. A simple, widely used approach is to add two buffers: Buffer1 stores output characters from

    Process1 to be input to Process2; Buffer2 stores output characters from Process2 to be input to Process3.

    For simplicity, assume that Buffer1 and Buffer2 each hold a single character. Thus:

    1. Process1 reads 80-character records into an 81-character internal buffer, places a space character

    in location 81, and sequentially places in Buffer1 single characters from the internal buffer.

    2. Process2 reads single characters from Buffer1 and places them into Buffer2, but uses a simple state

    machine to substitute a single $ for two consecutive $$.

    3. Process3 reads single characters from Buffer2, saves them in an internal 120-character buffer, and

    outputs 120-character records.

    This solution demonstrates the essence of the concurrent paradigm: individual sequential processes that

    cooperate to solve a problem. The exemplified concurrency is pipelined concurrency, where the input of

    all processes but the first is provided by another process. Cooperation, in this and all other cases, requires

    that the processes:

    1. Share information and resources

    2. Not interfere during access to shared information or resources

    In the Conway solution, information is readily shared via the buffers. The chief problem is to ensure

    that concurrent accesses to the two buffers do not conflict; for example, Process2 does not attempt to

    retrieve a character from Buffer1 before it has been placed there by Process1 (which would lead to garbage

    characters), and Process1 does not attempt to place a character into Buffer1 before the previous character

    has been retrieved by Process2 (which would lead to lost characters).

    A simpler example of interference is provided by the following simple program (where the statements

within the cobegin-coend pair are to be executed simultaneously):

    x := 0

    cobegin

    x := x + 1

    x := x + 2

    coend

    Consider the value of x at the end of execution. Because each assignment statement is actually a sequence

    of machine-level instructions, various interleavings of the execution of these instructions result in different

    final values for x (i.e., 1, 2, or 3). Clearly, this is unacceptable!
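
To make the interference concrete, the following Java sketch (an illustration only; the class name and iteration counts are invented) runs two threads that repeatedly increment a shared counter with no entry or exit protocol. Because count++ is a read-modify-write sequence, interleavings can lose updates, and the final value varies from run to run:

public class RaceDemo {
    static int count = 0;                      // shared variable, no protection

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100000; i++) {
                count++;                       // read-modify-write: interleavings can lose updates
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Expected 200000; a smaller, run-dependent value is usually printed.
        System.out.println("count = " + count);
    }
}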

    In each of these examples, it is clear that there are critical regions in which two (or more) processes

    have sections of code that may not be executed concurrently; we must have mutual exclusion between the

    critical regions. In the Conway example, critical regions include:

• Process1 placing a value into Buffer1
• Process2 retrieving a value from Buffer1
• Process2 placing a value into Buffer2
• Process3 retrieving a value from Buffer2


In the simple example above, each of the two assignment statements is a critical region. The essence of

    avoiding interference is to discover the critical regions and isolate them. This isolation takes the form of

an entry protocol to announce entry into a critical region and an exit protocol to announce that the
execution of the critical region has completed (below, the # introduces a comment and the ... represents
the appropriate program code):

    # entry protocol

    ...

    # critical region code

    ...

    # exit protocol

    ...

    This is the basic model used by the busy-wait and semaphore approaches (discussed below). It is a low-level

    model in the sense that careful attention must be paid to the placement of the entry and exit protocols to

    ensure that critical regions are properly protected.

    There are other implementation approaches to concurrency that solve the critical region problem by

    prohibiting any direct interference between concurrent processes. This is done by not allowing any sharing

    of variables. The monitor approach places all shared variables and other resources under the control of a

    single monitor module, which is accessed by only a single process at a time. The message-passing approach

    is to share information only through messages passed from process to process. Both of these approaches

    are discussed in this chapter.

    As well as avoiding interference in data access, we must avoid interference in the sharing of resources

    (e.g., keyboard input for multiple processes). Also, we must ensure that any physical actions of concurrent

    processes, such as movement of robotic arms, are appropriately synchronized.

    Thus, to develop concurrent solutions, we require notations to:

    1. Specify which portions of our processes can run concurrently

    2. Specify which information and resources are to be shared

    3. Prevent interference by concurrent processes by ensuring mutual exclusion

    4. Synchronize concurrent processes at appropriate points

    Further, any proposed solution to a concurrent problem must have certain properties (see, for example,

    [Ben-Ari, 1990]):

    1. Safety: this property must always be true; examples include:

    a. Non-interference

    b. No deadlock, which occurs when no process can continue because all processes are waiting upon

    conditions that can never occur

    c. Partial correctness: whenever the program terminates, it has the correct answer

    2. Liveness: this property must be true eventually; examples include:

    a. Program terminates (if it also has the correct answer, this is total correctness)

    b. No race: nondeterministic behavior caused by concurrently executing processes

    c. Fairness: each process has an opportunity to execute (this is affected by implementation and

    process/thread scheduling)

    The verification or proof that solutions satisfy these properties is vastly complicated by the concurrent

    execution of code: particular orderings of code execution may exhibit interference or deadlock while others

proceed nicely to termination. Returning to Conway's problem, suppose the execution of Process1 and

    Process2 are matched evenly so that each character placed by Process1 into Buffer1 is retrieved by Process2

    before Process1 is ready to output another character. In this case, when tested, the program exhibits the

    desired correctness properties, lack of deadlock, etc. But if, due to a variation in processor workload or

    type, Process1 runs faster, then characters will be overwritten and lost; on the other hand, if Process2

    runs faster, characters will be repeated. The fact that we tested the program under one particular set


of circumstances (even for all possible inputs) is irrelevant to this issue. Sufficient testing is impossible

    because of the exponential explosion in the number of possible interleavings of instruction execution

    that can occur. The only fully satisfactory approach is to use formal methods (techniques that are still

    predominantly under development), which are touched on later in this chapter.

    This chapter focuses on the software architectures used for concurrency, using a set of archetypical

    problems and their solutions for illustration. These problems are chosen because of the frequency with

    which they arise in computing; careful study of actual problems frequently leads to the realization that

    a seemingly complicated problem is, at heart, one of these archetypes. First, we briefly explore hardware

    architectures and their impact on software.

    96.2 Hardware Architectures

    Hardware can influence synchronization and communication primarily with respect to efficiency. Mul-

    tiprogramming is the interleaving of the execution of multiple programs on a processor; on a uniproces-

    sor, a time-sharing operating system implements multiprogramming. Although such an approach on a

    uniprocessor does not provide the execution speedup discussed in the introduction, it does provide the

    possibility of elegance and simplicity in problem solution, which is the second argument for the concurrent

    paradigm.

    By employing multiple computers, we have multiprocessing, or parallel processing. Multiprocessing can

    involve multiple computers working on the same program or on different programs at the same time.

    If a multiprocessor system is built so that processors share memory, then processes can communicate

    via global variables stored in shared memory; otherwise, they communicate via messages passed from

    process to process. In contrast to a multiprocessor system, a distributed system is comprised of multiple

    computers that are remote from each other. This chapter focuses on multiprogramming and multipro-

    cessing systems with a short introduction to the additional problems associated with distributed systems.

    In addition (but outside the scope of this chapter), a wide variety of hybrid hardware/software approaches

    exist.

    96.3 Software Architectures

    To specify a software architecture for implementing concurrency, we must provide the syntax and semantics

    to:

    1. Specify which information and resources are to be shared

    2. Specify which portions of processes can run concurrently

    3. Prevent interference by concurrent processes by ensuring mutual exclusion

    4. Synchronize concurrent processes at appropriate points

    The first feature requires no special notation (shared variables are simply global), and the third and fourth

    are usually merged into one. A large number of software mechanisms have been proposed to support these

    features; in this chapter we explore the most widely used among them:

    1. Busy-wait: implementable on virtually any processor without operating system support; this is

    concurrency without abstractions

    2. Semaphores: historically the oldest satisfactory mechanism

    3. Monitors: modules that encapsulate concurrent access to shared data

    4. Message passing: a higher-level abstraction widely used in distributed systems

    The references at the end of the chapter provide pointers to a number of other mechanisms, such as Unix

    fork/join, conditional critical regions, etc.


96.3.1 Busy-Wait: Concurrency without Abstractions

To illustrate the busy-wait mechanism, we use (following [Ben-Ari, 1982]) a very simple example consisting

    of two concurrent processes, each with a single critical region. The only assumption made is that each

    memory access is atomic; that is, it proceeds without interruption. Our task is to ensure mutual exclusion;

    the purpose of the exercise is to demonstrate the care with which a solution must be crafted to ensure the

    safety and liveness properties discussed above.

    Our first approach, which follows, is to ensure that the processes, p1 and p2, simply take turns in their

    critical regions.

    global var turn := 1

    process p1

    while true do ->

    # non-critical region

    ...

    # entry protocol

    while turn = 2 do ->

    # wait for turn

    # critical region

    ...

    # exit protocol

    turn := 2

    # rest of computation

    ...

    end p1

    process p2

    while true do ->

    # non-critical region

    ...

    # entry protocol

    while turn = 1 do ->

    # wait for turn

    # critical region

    ...

    # exit protocol

    turn := 1

    # rest of computation

    ...

    end p2

    This approach meets the desired properties but has a fundamental flaw: processes must take turns entering

    their critical regions. If p1 is ready and needs to execute its critical region at a higher frequency than p2, it

    cannot. The processes are an example of co-routines, historically one of the first approaches to concurrency.

    If we modify the solution to allow each process to proceed into its critical region if the other process is

    not in its critical region, and to then notify the other process, we obtain the following (where ci is used

    to signify that pi is in its critical region):

    global var c1 := false, c2 := false

    process p1

    while true do ->


# non-critical region

    ...

    # entry protocol

    while c2 do ->

    # wait for turn

    c1 := true # p1 in critical region

    # critical region

    ...

    # exit protocol

    c1 := false # p1 out of critical region

    # non-critical region

    end p1

    process p2

    while true do ->

    # non-critical region

    ...

    # entry protocol

    while c1 do ->

    # wait for turn

    c2 := true # p2 in critical region

    # critical region

    ...

    # exit protocol

    c2 := false # p2 out of critical region

    # non-critical region

    end p2

    Now, however, we have the possibility that the mutual exclusion requirement of the critical region can be

    violated; that is, both processes can be in their critical regions at the same time. (For example, suppose

    both c1 and c2 are false; p1 checks c2 via the loop and decides that it may enter its critical region; before

    it sets c1 to true, p2 checks c1 via its loop and decides that it may enter its critical region).

    As shown next, this disastrous possibility can be eliminated by having a process announce its intent to

    enter into its critical region before checking whether it can enter:

    global var c1 := false, c2 := false

    process p1

    while true do ->

    # non-critical region

    ...

    # entry protocol

    c1 := true # signal intent to enter

    while c2 do ->

    # wait for turn

    # critical region

    ...

    # exit protocol

    c1 := false # p1 out of critical region

    # non-critical region

    end p1

    process p2


while true do ->

    # non-critical region

    ...

    # entry protocol

    c2 := true # signal intent to enter

    while c1 do ->

    # wait for turn

    # critical region

    ...

    # exit protocol

c2 := false # p2 out of critical region

    # non-critical region

    end p2

But now we have raised the possibility of deadlock (when p1 sets c1 to true and p2 sets c2 to true at about the same time, both processes wait forever in their loops).

    A possible solution to this difficulty, which appears below, moves the announcement statement into the

    loop, together with a random delay:

    global var c1 := false, c2 := false

    process p1

    while true do ->

    # non-critical region

    ...

    # entry protocol

    c1 := true # signal intent to enter

    while c2 do ->

    c1 := false # give up intent if p2 already

    # in critical region

    c1 := true # try again

    # critical region

    ...

    # exit protocol

    c1 := false # p1 out of critical region

    # non-critical region

    ...

    end p1

    process p2

    while true do ->

    # non-critical region

    ...

    # entry protocol

    c2 := true # signal intent to enter

    while c1 do ->

    c2 := false # give up intent if p1 already

    # in critical region

    c2 := true # try again

    # critical region

    ...

    # exit protocol


c2 := false # p2 out of critical region

    # non-critical region

    ...

    end p2

But this is not a satisfactory solution because it can livelock in the (unlikely) situation that the two loops
proceed in perfect synchronization: each process repeatedly retracts and reasserts its intent without ever entering its critical region.

    A valid solution, such as that which appears below, can be developed by returning to the concept of

    taking turns when applicable, which ensures mutual exclusion while not requiring alternating turns (thus

    allowing true concurrency):

    global var c1 := false, c2 := false, turn := 1

    process p1

    while true do ->

    # non-critical region

    ...

    # entry protocol

    c1 := true # signal intent to enter

    turn := 2 # give p2 priority

    while c2 and turn = 2 do ->

    # wait if p2 in critical region

    # critical region

    ...

    # exit protocol

    c1 := false # p1 out of critical region

    # non-critical region

    ...

    end p1

    process p2

    while true do ->

    # non-critical region

    ...

    # entry protocol

    c2 := true # signal intent to enter

    turn := 1 # give p1 priority

    while c1 and turn = 1 do ->

# wait if p1 in critical region

    # critical region

    ...

    # exit protocol

c2 := false # p2 out of critical region

    # non-critical region

    ...

    end p2

    This solution is due to Peterson [1983]; the first valid solution was presented by Dekker.

    The importance of the busy-wait approach is threefold:

    1. It provides a nice introduction to the problems inherent in designing concurrent solutions.

2. It is executable on virtually every machine architecture without additional software support

    and is, thus, suitable for micro-controllers, etc.

    3. Variants are frequently used in hardware implementations.


However, this approach also suffers from two difficulties:

    1. It is very inefficient: machine cycles are expended when executing busy-wait loops.

    2. Programming at such a low level is highly prone to error.
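
As a point of comparison, the Peterson entry and exit protocols above can be transliterated into Java roughly as follows (a sketch only; the class and method names are invented, and the volatile qualifier is needed so that accesses to the protocol variables are not reordered by the compiler or hardware):

public class Peterson {
    static volatile boolean c1 = false, c2 = false;   // intent flags
    static volatile int turn = 1;                      // tie-breaker
    static int shared = 0;                             // protected by the protocol

    static void p1() {
        c1 = true; turn = 2;                           // entry protocol: signal intent, give p2 priority
        while (c2 && turn == 2) { /* busy-wait */ }
        shared++;                                      // critical region
        c1 = false;                                    // exit protocol
    }

    static void p2() {
        c2 = true; turn = 1;
        while (c1 && turn == 1) { /* busy-wait */ }
        shared++;
        c2 = false;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> { for (int i = 0; i < 100000; i++) p1(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 100000; i++) p2(); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("shared = " + shared);      // 200000 if mutual exclusion held
    }
}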

    96.3.2 Semaphores

    Dijkstra [1968] presented the first abstract mechanism for synchronization in concurrent programs. The

    semaphore, so named in direct relation to the semaphores used on railroad lines to control traffic over a

    single track, is a non-negative integer-valued abstract data type with two operations:

    P(s) : delay until s > 0, then s := s - 1

    V(s) : s := s + 1

    When a process delays on a semaphore, it is awakened only when another process executes a V operation

    on that semaphore. Thus, it uses no machine cycles to check if it can proceed. If more than one process

    is delaying on a semaphore, only one (which one is implementation dependent) can be awakened by a V

    operation.

    Additionally, the value of s can be set at instance creation via the semaphore declaration; if set to

0, then some process must execute a V(s) operation before any process that executes P(s) can
continue. With this abstract data type, we have a mechanism that handles both interference

    and synchronization.

    Additional notes:

    1. These are the only two synchronization operations defined; in particular, the value of s is not

    determinable.

    2. Implementation of these operations must be either in the hardware or in the (non-interruptible)

    system kernel.

    3. By sleeping while waiting for a semaphore (the delay in P(s)), a process does not waste machine

    cycles by repeated checking.

4. The operation names (P and V) come from the Dutch words passeren (to pass) and vrijgeven (to
release); sometimes, wait and signal are used in place of P and V, respectively.

    5. Each of the P and V operations proceeds atomically; that is, it may not be interrupted by another

    process.

    The use of the semaphore in concurrent programming relates directly to the railroad analogy. Each

    critical section looks like the following:

    global var s : semaphore := 1

    # entry protocol

    P(s)

    # critical region

    ...

    # exit protocol

    V(s)

The initialization of s to 1 ensures that the first process executing P(s) will continue. (Deadlock would
arise if s were initialized to 0.) Only the first process to reach its P(s) statement is allowed to proceed, as
subsequent processes find s = 0 and delay. When the first process finishes its critical region, it executes
V(s), which sets s to 1. One of the waiting processes is awakened, finds s > 0, decrements s, and proceeds.
Note that it is important that these operations are atomic; this ensures that two processes cannot wake up and
each find s > 0.
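
For illustration, the same critical-region pattern can be sketched in Java using java.util.concurrent.Semaphore, whose acquire and release methods play the roles of P and V (the worker code and iteration counts are invented for the example):

import java.util.concurrent.Semaphore;

public class CriticalRegion {
    static final Semaphore s = new Semaphore(1);   // s initialized to 1
    static int shared = 0;

    static void worker() throws InterruptedException {
        for (int i = 0; i < 100000; i++) {
            s.acquire();                           // entry protocol: P(s)
            shared++;                              // critical region
            s.release();                           // exit protocol: V(s)
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable r = () -> { try { worker(); } catch (InterruptedException e) { } };
        Thread t1 = new Thread(r), t2 = new Thread(r);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("shared = " + shared);  // always 200000
    }
}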


96.3.2.1 Semaphores and Producer-Consumer

    The Producer-Consumer problem arises whenever one process is creating values to be used by another

process. Examples include Conway's problem, buffers of various kinds, etc. Here we first look at the

    multi-element buffer version of this problem and then add multiple producers and consumers as a

    refinement.

    # define the buffer

    const N := ... # size

    var buf[N] : int # buffer

    front := 1 # pointers

    rear := 1

semaphore empty := N # counts the number of empty slots in the buffer
full := 0 # counts the number of items in the buffer

    process producer

    var x : int

    while true do ->

    # produce x

    ...

    P(empty) # delay until there is space in the buffer

    buf[rear] := x # place value in the buffer

    V(full) # signal that the buffer is non-empty

    rear := rear mod N + 1 # update buffer pointer

    end producer

    process consumer

    var x : int

    while true do ->

    P(full) # delay until a value is in the buffer

    x := buf[front] # obtain value

    V(empty) # signal that the buffer is not full

    front := front mod N + 1 # update buffer pointer

    # consume x

    ...

    end consumer

    The buffer processing is conventional; only the actual buffer access must be placed into a critical region

    because there is no possibility of interference between the assignments to rear and front. Note also the

    use of two semaphores: empty to signal that the producer can proceed because there is at least one empty

    slot in the buffer and full to signal that the consumer can proceed because there is at least one item in

    the buffer. Although it is possible to solve this problem with one semaphore, less concurrency results. Note

    that the empty semaphore is initialized to N, the size of the buffer. The producer process can run up to N

    steps ahead of the consumer process.

    To allow multiple producers and/or consumers, we must protect the actual buffer operations with

    additional semaphores to prevent, for example, two producers from accessing rear simultaneously with

    read and assignment operations. These semaphores, mutexR and mutexF, guarantee mutual exclusion

    of access to the rear and front pointers, respectively. It is not sufficient to use empty here because up

    to N producers will be able to continue through the P(empty) statement.

    # define the buffer as previously

    semaphore empty := N, full := 0


semaphore mutexR := 1 # mutual exclusion on rear pointer

    mutexF := 1 # mutual exclusion on front pointer

    process pi # one for each producer

    var x : int

    while true do ->

    # produce x

    ...

    P(empty) # delay until there is space in the buffer

    P(mutexR) # delay until rear pointer is not in use

    # place value in the buffer and modify pointer

    buf[rear] := x; rear := rear mod N + 1

    V(mutexR) # release rear pointer

    V(full) # signal that the buffer is non-empty

    end pi

    process ci # one for each consumer

    var x : int

    while true do ->

    P(full) # delay until a value is in the buffer

    P(mutexF) # delay until front pointer is not in use

    # access the value in the buffer and modify pointer

    x := buf[front]; front := front mod N + 1

V(mutexF) # release front pointer

    V(empty) # signal that there is space in the buffer

    # consume x

    ...

    end ci
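
A Java sketch of this multiple-producer/multiple-consumer solution follows (the class and method names are invented, and the pointers are 0-based here): empty and full count free and filled slots, while mutexR and mutexF protect the rear and front pointers.

import java.util.concurrent.Semaphore;

public class BoundedBuffer {
    private final int N;
    private final int[] buf;
    private int front = 0, rear = 0;                   // 0-based buffer pointers
    private final Semaphore empty, full;               // slot counters
    private final Semaphore mutexR = new Semaphore(1); // protects rear
    private final Semaphore mutexF = new Semaphore(1); // protects front

    public BoundedBuffer(int n) {
        N = n;
        buf = new int[n];
        empty = new Semaphore(n);                      // N empty slots initially
        full = new Semaphore(0);                       // no items initially
    }

    public void deposit(int x) throws InterruptedException {
        empty.acquire();                               // P(empty): wait for a free slot
        mutexR.acquire();                              // P(mutexR)
        buf[rear] = x;
        rear = (rear + 1) % N;
        mutexR.release();                              // V(mutexR)
        full.release();                                // V(full): one more item available
    }

    public int fetch() throws InterruptedException {
        full.acquire();                                // P(full): wait for an item
        mutexF.acquire();                              // P(mutexF)
        int x = buf[front];
        front = (front + 1) % N;
        mutexF.release();                              // V(mutexF)
        empty.release();                               // V(empty): one more free slot
        return x;
    }
}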

    96.3.2.2 Semaphores and Readers-Writers

    The Readers-Writers model captures the fundamental actions of a database; i.e.,

• No exclusion between readers
• Exclusion between readers and a writer
• Exclusion between writers

    In other words, the software must guarantee only one update of a database record at a time, and no reading

    of that record while it is being updated.

    The simplest semaphore solution is to wait only for the first reader; subsequent readers need not check

    because no writer can be writing if there is already a reader reading (here, nr and nw are the numbers of

    active readers and writers, respectively):

    ...

    nr := nr + 1

    if nr = 1 -> P(rw) # if no one is presently reading,

    # then ensure no one is writing

    # before proceeding

    # access database

    ...

    nr := nr - 1

    if nr = 0 -> V(rw) # if no more are reading, possibly wake up

    # writer, or prepare for next reader


...

    P(rw) # delay until no readers or writers

    # access database

    ...

    V(rw) # wake up delayed reader or writer, or prepare

    # for next reader or writer

This solution gives readers preference over writers: new readers can continually freeze out waiting writers.
Extending this solution to other kinds of preferences, such as writer preference or first-come-first-served
preference, is cumbersome.
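
A Java sketch of the readers-preference protocol follows; note that it protects the reader count nr with an additional mutual-exclusion semaphore, a detail the fragment above leaves implicit (the class and method names are invented):

import java.util.concurrent.Semaphore;

public class Database {
    private final Semaphore rw = new Semaphore(1);     // excludes readers from writers, writers from writers
    private final Semaphore mutex = new Semaphore(1);  // protects nr
    private int nr = 0;                                // number of active readers

    public void read() throws InterruptedException {
        mutex.acquire();
        nr = nr + 1;
        if (nr == 1) rw.acquire();                     // first reader locks out writers
        mutex.release();

        // ... read the database ...

        mutex.acquire();
        nr = nr - 1;
        if (nr == 0) rw.release();                     // last reader lets a writer in
        mutex.release();
    }

    public void write() throws InterruptedException {
        rw.acquire();                                  // exclusive access
        // ... modify the database ...
        rw.release();
    }
}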

    A more general approach is known as passing the baton; it is easily extended to other kinds of prefer-

    ences because control is explicitly handed from process to process. Although a careful explanation of the

    approach is not given here, the concept is easily summarized. A process must check to ensure that it can

    legally proceed before doing so; if it cannot proceed, the process waits upon a semaphore assigned to it.

    For example, a writer process checks to see if no readers or writers are executing on the database before it

    proceeds; if they are executing on the database, then the writer process sleeps, waiting upon the semaphore

    assigned to it. When a process is finished accessing the database, it checks the conditions and wakes up (via

    signaling on the appropriate semaphore) one of the processes waiting upon the condition. This last opera-

    tion essentially passes the baton from one process to another. The key is that first a check is made to ensure

    that it is legal for the other process to wake up. The strength of the passing the baton approach emerges

    when its flexibility is used to develop more general solutions. Details may be found in Andrews [1991].

    96.3.2.3 Difficulties with Semaphores in Software Design

    While the use of semaphores does provide a complete solution to the interference problem, the correctness

    of the solution directly depends on the correct usage of the semaphore operations, which are fairly low-level

    and unstructured. Semaphores and shared variables are global to all processes and, like any global data

    structure, their correct usage requires considerable discipline by the programmer. Additionally, if a large

    system is to be built, any one implementor is likely responsible for only a portion of the semaphore usage

    so that correct pairing of Ps and Vs may be difficult. Despite this difficulty, semaphores are a widely used

    construct for concurrency.

    96.3.3 Monitors

    A more structured approach is to encapsulate the shared data/resources and their operations into a single

    module called a monitor. A monitor can contain non-externally accessible data and procedures that handle

    the state of resources. External access is strictly controlled through procedure calls to the monitor; mutual

    exclusion is ensured because procedure execution within the monitor is not concurrent.

Monitors have the traditional advantages of abstract data types, but they must also deal with two
issues arising from their use by concurrently executing processes: avoiding interference and providing

    synchronization. This section illustrates some sample applications of monitors and how they internally

    handle concurrency.

    Returning to the Producer-Consumer problem, we implement a monitor for handling shared access to

    the buffer. The monitor requires a synchronization mechanism to ensure that the Producer cannot overfill

    the buffer and that the Consumer cannot retrieve from an empty buffer. Monitors implement condition

    variables, the values of which are queues of processes delayed upon the corresponding conditions. Two

standard operations defined on condition variable cv are:

1. wait(cv): causes the executing process to delay and to be placed at the end of cv's queue; in

    order to allow eventual awakening of the process, the process must relinquish exclusive access to the

    monitor when it executes a wait.

2. signal(cv): causes the process at the head of cv's queue to be awakened; if the queue is empty,

    there is no effect.


Although these operations mirror those of semaphores, there is a key difference: the signal operation

    has no memory.

    96.3.3.1 Monitors and Producer-Consumer

    The buffer monitor can be defined as follows:

    monitor Buffer

    # define the buffer

const N := ... # size of the buffer
var buf[N] : int # buffer
front := 1 # buffer pointers
rear := 1
count := 0 # number of items in the buffer

    # define the condition variables

    var not_full, # signaled when count < N

    not_empty : cv # signaled when count > 0

    procedure deposit(data : int)

    if count = N # check for space

    then wait(not_full) # delay if no space

    buf[rear] := data

    rear := (rear mod N) + 1

count := count + 1

    signal(not_empty) # signal non-empty

    end

    procedure fetch(var data : int)

    if count = 0 # check for not empty

    then wait(not_empty) # delay if empty

    data := buf[front]

    front := (front mod N) + 1

count := count - 1

    signal(not_full) # signal not full

    end Buffer

    Using this monitor, the producer and consumer tasks can be redone as follows:

    process Producer

    var x : int

    while true do ->

    # produce x

    ...

    deposit(x)

    end Producer

    process Consumer

    var x : int

    while true do ->

    fetch(x)

    # consume x

    ...

    end Consumer

It is clear that programming (outside of the monitor) can now be done at a more abstract level, which
will lead to more reliable software.


96.3.3.2 Difficulties with Monitors

There are difficulties with monitors as well. Consider the case where we have two consumers, C1 and C2.
If the buffer is empty when C1 executes fetch, then C1 will delay on not_empty. If the producer then
executes deposit (note that deposit and fetch cannot be executed concurrently), it will eventually
signal(not_empty), which will awaken C1. But if C2 executes fetch before C1 continues execution
and its call to fetch proceeds, then C1 will access an empty buffer. Hence, the signal operation must be
considered to be a hint that proceeding with execution is possible, but not that it is correct. The following
two approaches are used to solve this problem:

    1. Replace the check on the condition variable with a check inside a loop to ensure that the condition

    is true before execution proceeds. For example:

    procedure deposit(data : int)

while count = N do -> # check for space
wait(not_full) # delay if no space

    buf[rear] := data

    rear := (rear mod N) + 1

count := count + 1

    signal(not_empty) # signal non-empty

    end

2. Give the highest priority to awakened processes so that intervening access to the monitor is not pos-
sible; this also requires that the signal operation be the last operation executed in any procedure

    in which it occurs (to ensure that two processes will not be executing within the monitor).

    Monitors form the basis for concurrent programming in a number of systems and provide an efficient,

    high-level synchronization mechanism. They have the further advantage, as do other abstract data types

    or objects, of allowing for local modification and tuning without affecting the remainder of the system.
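
As an illustration, Java's built-in monitor mechanism (synchronized methods plus wait and notifyAll) can express the Buffer monitor roughly as follows (a sketch; the class name is invented). Because Java provides only a single implicit condition queue per object, notifyAll replaces the separate not_full and not_empty condition variables, and each condition is re-checked in a while loop, following approach 1 above:

public class MonitorBuffer {
    private final int[] buf;
    private int front = 0, rear = 0, count = 0;

    public MonitorBuffer(int n) { buf = new int[n]; }

    public synchronized void deposit(int data) throws InterruptedException {
        while (count == buf.length) wait();            // delay while full (re-check on wake-up)
        buf[rear] = data;
        rear = (rear + 1) % buf.length;
        count = count + 1;
        notifyAll();                                   // plays the role of signal(not_empty)
    }

    public synchronized int fetch() throws InterruptedException {
        while (count == 0) wait();                     // delay while empty (re-check on wake-up)
        int data = buf[front];
        front = (front + 1) % buf.length;
        count = count - 1;
        notifyAll();                                   // plays the role of signal(not_full)
        return data;
    }
}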

    96.3.4 Message Passing

    Consider a hardware architecture with multiple independent computers. Creating a semaphore to be

    efficiently accessed by processes running on separate computers is a difficult problem. We need a new

    abstraction for this case: message passing in which a sending process outputs a message to a channel and

    a receiving process inputs the message from this same channel. There are a large number of variations of

    this basic concept, depending on the semantics of the operations and the channels.

    The basic primitives are:

    1. Channel declaration

    2. send

    3. receive

    If both sending and receiving processes block upon reaching their corresponding message-passing

    operation, we have synchronous communication; if the sending process can send a message and continue

    without waiting for receipt, the system is asynchronous. Analogies are telephone communication and the

    postal system. The synchronous approach allows for ready synchronization of processes (at the instant

    of message passing we know where both are in their execution). This was the approach chosen by Hoare

    [1985] for his communicating sequential processes model and its subsequent implementation in the

    occam language [Jones and Goldsmith, 1988]. If we desire asynchronicity, we can add intermediate buffer

    processes to the synchronous approach. An advantage of synchronous message passing is that it often

    simplifies analysis of an algorithm because it is known where the sending and receiving processes are in

    their execution at the moment the message is passed.
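
In Java, a synchronous channel can be sketched with java.util.concurrent.SynchronousQueue, whose put blocks until another thread is ready to take, so sender and receiver meet at the moment the message is passed (an illustrative sketch only; the names and message are invented):

import java.util.concurrent.SynchronousQueue;

public class SyncChannelDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> channel = new SynchronousQueue<>();

        Thread receiver = new Thread(() -> {
            try {
                String msg = channel.take();           // blocks until a sender arrives
                System.out.println("received: " + msg);
            } catch (InterruptedException e) { }
        });
        receiver.start();

        channel.put("hello");                          // blocks until the receiver takes it
        receiver.join();
    }
}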

    Further variations arise, depending on whether channels are one process-to-one process or one-to-

    many, statically instantiated at load time or dynamically created during execution, bi-directional or


uni-directional, whether the receiving process must be named by the sending process, etc. However,

    the basic concept is the same in all cases; ease of use and efficiency of implementation vary.

    Further variations include remote procedure call (RPC), which is the core of many distributed systems,

    and rendezvous, the approach used in Ada. We further explore these approaches after looking more closely

    at simple message passing.

    Note that, in the message-passing approach, there are no shared variables so interference is not an

    issue. The critical section issue does not arise because there is no way for concurrent processes to interfere

    with each other. This is one of the major motivating factors for the use of message-passing software

    architectures.

    96.3.4.1 Message Passing and Producer-Consumer

    If the message-passing system is asynchronous, as demonstrated below, we can rely on the system itself to

    buffer values:

    channel P2C

    process Producer

    int x

    while true do ->

    # produce x

    send P2C x

    end Producer

    process Consumer

    int x

    while true do ->

    receive P2C x

    # consume x

    end Consumer

Using this approach, the Producer sends a message over channel P2C and continues producing and
sending (up to the channel capacity, at which point the send blocks), while the Consumer blocks at the
receive statement if no messages are available.
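
A Java sketch of this asynchronous solution uses a bounded BlockingQueue as the channel P2C; put blocks only when the channel is full and take blocks only when it is empty (the capacity, item count, and names are invented for the example):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AsyncProducerConsumer {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> p2c = new ArrayBlockingQueue<>(16);   // channel P2C, capacity 16

        Thread producer = new Thread(() -> {
            try {
                for (int x = 0; x < 10; x++) {
                    p2c.put(x);                        // send P2C x (blocks only if the channel is full)
                }
            } catch (InterruptedException e) { }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    int x = p2c.take();                // receive P2C x (blocks if the channel is empty)
                    System.out.println("consumed " + x);
                }
            } catch (InterruptedException e) { }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}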

    If our system is synchronous, then as shown below, we create a separate buffer process:

    channel P2B, B2C

    process Buffer

    # create the buffer

    const N := ..

    var buffer[N] : int

    front := 1

    rear := 1

    count := 0 # number of items in the buffer

    while true do ->

    if

    # there is room and the producer is sending

count < N and receive P2B buffer[rear] ->
count++; rear := rear mod N + 1
else
# there are items and the consumer is receiving
count > 0 and send B2C buffer[front] ->
count--; front := front mod N + 1

    end Buffer


process Producer

    var x : int

    while true do ->

    # produce x

    ...

    send P2B x

    end Producer

    process Consumer

    var x : int

    while true do ->

    receive B2C x

    # consume x

    end Consumer

Above, the if statement is nondeterministic; that is, any true clause can be selected. The Boolean

    conditions in the clauses are called guards. The clauses are:

• If there is room and the producer wishes to send a character
• If there are items to retrieve and the consumer wishes to receive a character

    For implementation efficiency reasons, actual programming languages do not allow guards for both

    input and output statements, so we must modify our solution; for example, as shown below, we can modify

    the buffer and consumer processes to eliminate the output guard:

    channel P2B, B2C, C2B

    process Buffer

# define the buffer
const N := ... # size of the buffer
var buffer[N] : int
var front := 1
rear := 1
count := 0
while true do ->

    if

    # there is room and the producer is sending

count < N and receive P2B buffer[rear] ->
count++; rear := rear mod N + 1
else
# there are items and the consumer is requesting
count > 0 and receive C2B NIL ->
send B2C buffer[front]
count--; front := front mod N + 1

    end Buffer

    process Producer

    var x : int

    while true do ->

    # produce x


...

    send P2B x

    end Producer

    process Consumer

var x : int

    while true do ->

    send C2B NIL # announce ready for input

    receive B2C x

    # consume x

    ...

    end Consumer

    Above, the Consumer process first announces its intention to receive a value from the Buffer process (send

    C2B NIL; the NIL signifying that no message need be actually exchanged) and then actually receives the

    value (receive B2C x).

    This program is an example of client/server programming. The Consumer process is a client of the Buffer

    process; that is, it requests service from the buffer, which provides it. Client/server programming is widely

    used to provide services across a network and is based on the message-passing paradigm.

    96.3.4.2 Message Passing and Readers-Writers

    The message-passing approach to Readers-Writers is straightforward: do not accept a message from a

    reader or writer if a writer is writing; do not accept a message from a writer if a reader is reading. The

    solution, shown below, is simple if we adopt synchronous message passing and the notion of the database

    as a server:

    channel Rrequests, Rreceives, Wsends

    Reader

    send Rrequests

    receive Rreceives

    Writer

    send Wsends

    Server

    if

    # there are no writers, accept reader requests

    nw = 0 ->

    receive Rrequests

    # access the database

    ...

    send Rreceives

# there are no readers or writers, accept writer requests

    nr = 0 and nw = 0 ->

    receive Wsends

    # modify the database

    ...


96.3.4.3 Message Passing and Semaphore Simulation

    Of course, as we show next, message passing can simulate a semaphore (and vice versa if need be):

    channels P, V, initSemaphore

    process Semaphore

    var s : int

    receive initSemaphore i

    s := i

    while true do ->

    if

    # semaphore is non-zero accept P operation

s > 0 and receive P NIL ->

    s--

    # always accept V operation

    receive V NIL ->

    s++

    end Semaphore
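
Java has no guarded receive, so a sketch of the same idea must be structured a little differently: clients send P and V requests to a server thread over a channel, and a P request carries a private reply channel on which the server grants permission only when the count is positive (an illustration only; all class and method names are invented):

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

public class MessageSemaphore {
    // A request is either a V (reply == null) or a P carrying a reply channel.
    private static final class Request {
        final SynchronousQueue<Boolean> reply;
        Request(SynchronousQueue<Boolean> reply) { this.reply = reply; }
    }

    private final BlockingQueue<Request> requests = new LinkedBlockingQueue<>();

    public MessageSemaphore(int initial) {
        Thread server = new Thread(() -> {
            int s = initial;                                     // the semaphore count
            Queue<SynchronousQueue<Boolean>> waiting = new ArrayDeque<>();
            try {
                while (true) {
                    Request r = requests.take();
                    if (r.reply == null) {                       // V operation
                        if (waiting.isEmpty()) s++;
                        else waiting.remove().put(true);         // wake one delayed P
                    } else if (s > 0) {                          // P operation, count positive
                        s--;
                        r.reply.put(true);
                    } else {                                     // P operation, must delay
                        waiting.add(r.reply);
                    }
                }
            } catch (InterruptedException e) { }
        });
        server.setDaemon(true);
        server.start();
    }

    public void P() throws InterruptedException {
        SynchronousQueue<Boolean> reply = new SynchronousQueue<>();
        requests.put(new Request(reply));
        reply.take();                                            // delay until the server grants
    }

    public void V() throws InterruptedException {
        requests.put(new Request(null));
    }
}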

    96.3.4.4 The Remote Procedure Call and Rendezvous Abstractions

    The remote procedure call, or RPC, abstraction is widely used to provide client/server services in a dis-

    tributed system. Revisiting the client/server examples above, it is clear that the client executes a send-

    receive pair while the server executes a receive-send pair. Using the standard procedure model to

capture the server's actions, a call statement to capture the client's actions, and parameters to capture the

    messages being sent, we have:

    Client

    ...

    call Server(args)

    ...

    Server(formal args)

    ...

    return

    which mirrors traditional procedure calls. The difference is that the Server procedure can be on a

    machine remote to the Client process. Indeed, the Server is implemented as a process that is always

    delayed until a Client executes a call. If multiple Clients concurrently execute calls to a Server, the

Server must be re-entrant or must provide protection for shared information. The RPC approach forms

    the basis for distributed systems programs on a wide variety of platforms; its relationship to monitors

    should be clear.

    The calling process and procedure are not truly concurrent in the sense used throughout this chapter,

    in that the calling process delays once the call is made, the procedure does not execute until called, the

    procedure delays when the return is executed, and the calling process resumes execution only upon the

    return from the procedure. The model is similar to that of synchronous message passing if the execution of

    the procedure is viewed as a component of the message-passing process (essentially, the procedure creates

    the return message).
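
A minimal Java sketch of this call/return pattern built from message passing follows: the client's call sends the argument together with a private reply channel and then blocks on that channel, while the server loops, receiving calls and sending back results (the squaring service and all names are invented):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

public class RpcSketch {
    // A call message: the argument plus a private channel for the result.
    static final class Call {
        final int arg;
        final SynchronousQueue<Integer> reply;
        Call(int arg, SynchronousQueue<Integer> reply) { this.arg = arg; this.reply = reply; }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Call> serverChannel = new LinkedBlockingQueue<>();

        // The "remote" procedure: loop forever, receive a call, send back the result.
        Thread server = new Thread(() -> {
            try {
                while (true) {
                    Call c = serverChannel.take();
                    c.reply.put(c.arg * c.arg);          // the body of the procedure
                }
            } catch (InterruptedException e) { }
        });
        server.setDaemon(true);
        server.start();

        // The client's "call": send the arguments, then block for the return message.
        SynchronousQueue<Integer> reply = new SynchronousQueue<>();
        serverChannel.put(new Call(7, reply));
        System.out.println("result = " + reply.take()); // prints 49
    }
}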

    We can increase the power of this approach if we modify the procedure into a process and have both

    processes executing concurrently. When a call is made, execution of the calling process delays while

    execution of the called process continues until it is ready to accept the call (via a special statement). The

    called process continues execution, performing actions or calculating values for the return message. The

    return message is sent back to the caller, the called process continues executing, and the calling process


resumes execution once the message is received. Because there is an extended time period during which

    the two processes are synchronized (from called accept through called return), this model of concurrency

    is termed rendezvous. It is the basis for the model of concurrency used in the Ada language. The Ada model

    is not symmetric: the calling process must know the name of the process it is calling, but the called process

    need not know its caller. Accept statements may have guards, as discussed above for message passing, in

    order to control acceptance of calls. The complexity of these guards, and their priority, must be carefully

    followed during program implementation.

    There are several advantages to this approach, all based on the possibility of the called routine using

    multiple accept statements:

    1. The called routine can provide different responses to the calling process at different stages of its

    execution.

    2. The called routine can respond differently to different calling processes.

    3. The called routine chooses when it will receive a call.

    4. Different accept statements can be used to provide different services in a clear fashion (rather than

    through parameter values).

    96.3.4.5 Difficulties with Message Passing

    Message-passing systems are frequently inefficient during execution unless the algorithm is developed

    carefully. This is because messages take time to propagate, and this time is essentially overhead. For

example, a single-element buffer version of Conway's problem spends significantly more time exchanging
messages than performing any other operation.

    96.4 Distributed Systems

    In addition to the difficulties inherent in developing and understanding concurrent solutions, distributed

    systems contain the fundamental problem of identifying global state. For example, how do we determine if

a program has terminated? In the sequential case, this is obvious: we execute the exit or end statement.

    In the concurrent case, we must ensure that all processes are ready to terminate. In the multiprogramming

    case, we can do this by checking the ready queue; if it is empty, then there are no processes waiting to run,

which ensures that no process will ever be added to the ready queue (if no process can run, then there

    can be no changes to create another ready process). But if we are in a distributed system, there is no single

    ready queue to examine. If a process is in the suspended queue on its processor, it may be made ready by

    a message from a process on a different processor.

Similarly, we may still require mutual exclusion on a system resource; how do we ensure access across

    processors? The solution is to develop a method of determining global state; see, for example, Ben-Ari

    [1990].

    While a true distributed paradigm has not yet emerged in the programming paradigms domain, it

    will most likely evolve in the area of operating systems; for more information on distributed computing,

    readers are encouraged to look at Chapter 108 in this Handbook.

    96.5 Formal Approaches

    We argued above that software verification in concurrent programming must take into account the enor-

    mous number of possible interactions between concurrent processes. Obviously, traditional testing only

demonstrates the presence of good execution histories and is not a mechanism to verify any solution,

    sequential or concurrent. The use of a trace routine to generate execution histories is a standard sequential

technique that becomes infeasible in the concurrent domain. Consider, for example, that n processes, each
executing m atomic actions, generate (n × m)!/(m!)^n possible histories. For three processes, each executing only
two actions, this is a total of 6!/(2!)^3 = 720/8 = 90 possible histories!


The alternative is to use a formal, mathematically rigorous method to develop a solution and/or to verify

    a complete solution. Two approaches have been applied to verifying concurrent software:

    1. Axiomatic or assertional

    2. Process algebraic

    The axiomatic approach develops assertions in the predicate logic that characterize the possible states of a

    computation. The actions of a program are viewed as predicate transformers that move the computation

    from one state to another. The beginning state is specified by the pre-condition of the computation, and

    the final state is characterized by the post-condition. This approach has been exploited for some time in

    the sequential paradigm; see Schneider [1997] for a comprehensive introduction to the field in the context

    of concurrency.

    The process algebraic approach was pioneered by Hoare [1985], who also pioneered the coarse-grained

    model of concurrency. The concept is that the interactions between a system and its environment (which

    are all that is ultimately observable) can be modeled via a mathematical abstraction called a process (this is

    the abstraction of the computing process as used above). Processes can be combined via algebraic laws to

    form systems. Communication between processes is an example of this interaction. By building up a system

    through these mathematical laws and then transforming the abstract mathematics into an implementable

    language, one arrives at a correct solution. The occam language was designed to match the algebraic laws

    devised by Hoare; transformations exist between these laws and occam programming constructs (but

    the transformations are not perfect due to practicalities of implementation) [Hinchey and Jarvis, 1995].

    A number of subsequent efforts developed process algebras with varying properties [Milner, 1989]; see

    Magee and Kramer [1999] for the use of a process algebra in the development of Java programs.

    Although both approaches are in active use, they are not typically applied in the concurrent paradigm

    with any greater frequency than they are in the sequential paradigm, and they remain primarily research

    tools. The fundamental difficulty is that theoreticians search for the fundamental particles of computing

    to develop mathematical laws enabling formal reasoning. Practical languages are (inherently) extremely

    complex mixtures of these fundamental particles and laws in order to have sufficient power to solve

    real-world problems. Theoretical tools do not yet scale to these large, complex problems.

    96.6 Existing Languages with Concurrency Features

    A large number of languages have been developed to use the concurrency paradigm; most have remained

    in the laboratory environment. If the underlying operating system provides the requisite support, then

    semaphores can be implemented in any language via system calls. Higher-level concurrency control struc-

    tures require modification of the underlying sequential language; for example, Concurrent Pascal [Brinch

    Hansen, 1975] uses monitors while Concurrent C [Gehani and Roome, 1986] is based on the rendezvous.

    By beginning with a widely used sequential programming language, a designer has a large community

    from which to draw users to the new language. The Ada (concurrency based upon the rendezvous) and SR

    (which includes structures for all of the approaches discussed in this chapter and is therefore particularly

    useful for exploring concurrent programming) [Andrews and Olsson, 1993; see Hartley, 1995, for extensive

    examples] languages are examples of sequential languages with concurrent structures included from the

    initial stages of development.

    Object-oriented languages have similarly had concurrency features added. For example, Smalltalk has

    the Process and Semaphore classes to provide for the dynamic creation of independent processes and their

    interaction using the semaphore approach [Goldberg and Robson, 1989].

    Languages based on an inherently concurrent model include Linda (more a language-independent

    philosophy than a language) [Ahuja et al., 1986] and occam (synchronous message passing) [Jones and

    Goldsmith, 1988].

    A different approach is to provide a standardized interface (an application program interface or API)

    that is language independent. A language implementation then provides a set of library routines to im-

    plement this API. Thus, programmers can use a language of their choice while being assured that their


program will function correctly. Currently, the two main paradigms that are the basis for writing paral-

    lel programs are message passing and shared memory. A hybrid paradigm is used in systems comprised

    of shared-memory multiprocessor nodes that communicate via message passing. For writing message-

    passing programs, MPI (Message Passing Interface) [http://www-unix.mcs.anl.gov/mpi/index.html] is

a widely used standard; many variants of MPI exist, including MPICH (CH for Chameleon), which

    is a complete, freely-available implementation of the MPI specification, targeted at high performance

    [http://www-unix.mcs.anl.gov/mpi/mpich/].

MPI's interface includes features from a number of message-passing systems and attempts to provide portability and ease of use. The MPI programming model is an MPMD (multiple program, multiple data)

    model, in which every MPI process can execute a different program. A computation is envisioned as one

or more processes that communicate by calling library routines to send messages to and receive messages from other

    processes. In general, a fixed set of processes, one for each processor, is created at program initialization

    (versions of MPI that will support dynamic creation and termination of processes are anticipated). Local

    and global communication (e.g., broadcast and summation) is provided by point-to-point and collective

    communication operations, respectively. The former is used to send messages from one named process

    to another, while the latter is used to provide message passing among a group of processes. Most parallel

    algorithms are readily implemented using MPI. If an algorithm creates just one task per processor, it

    can be implemented directly with point-to-point or collective communication routines that meet its

    communication requirements. In contrast, if tasks are created dynamically or if several tasks are executed

    concurrently on a processor, the algorithm must be refined to permit an MPI implementation.
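As an illustration of this model, the following sketch (our own example, not drawn from the text; the token value and variable names are arbitrary) runs the usual one process per processor: rank 0 sends a token to rank 1 with a point-to-point operation, and all processes then take part in a collective reduction that sums their ranks onto rank 0.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, token = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's identity   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    /* Point-to-point communication between two named processes. */
    if (rank == 0 && size > 1) {
        token = 42;
        MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
    }

    /* Collective communication: a global summation delivered to rank 0. */
    int local = rank, sum = 0;
    MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum of ranks 0..%d is %d\n", size - 1, sum);

    MPI_Finalize();
    return 0;
}

Such a program is typically compiled with an MPI wrapper (e.g., mpicc) and started with one process per processor by the local job launcher.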

    The OpenMP API is becoming a standard that supports multi-platform shared-memory parallel pro-

    gramming in C/C++ and Fortran on all architectures, including Unix and Windows NT platforms.

    OpenMP is a portable, scalable model that gives shared-memory parallel programmers a simple and

    flexible interface for developing parallel applications for platforms ranging from the desktop to the super-

    computer [http://www.openmp.org/]. This API is jointly defined by a group of major computer hardware

    and software vendors. OpenMP can be used to explicitly direct multi-threaded, shared memory paral-

    lelism. It is comprised of three primary API components: compiler directives, runtime library routines,

    and environment variables. Using the fork/join model of parallel execution, an OpenMP program begins

    as a single master thread. The master thread creates or forks a set of parallel threads, which concurrently

execute a parallel region construct. On completion, the parallel threads join (i.e., synchronize and

    terminate), leaving only the master thread. The API supports nested parallelism and dynamic threads,

that is, dynamic alteration of the number of active threads. Variable scoping (e.g., declaration of private and shared data), parallelism, and synchronization are specified through the use of compiler

    directives. By itself, OpenMP is not meant for distributed memory parallel systems. For example, for high-

    performance cluster architectures such as the IBM SP, where intranode communication is accomplished

    via shared memory and internode communication is performed via message passing, OpenMP is used

    within a node while MPI is used between nodes.
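A brief sketch of the fork/join model follows (again our own illustration, not taken from the text; the array size and values are arbitrary): the master thread forks a team at the parallel directive, the loop iterations are divided among the team, each thread keeps a private loop index, and the reduction clause combines the partial sums when the threads join.

#include <omp.h>
#include <stdio.h>

int main(void)
{
    enum { N = 1000 };
    double a[N], sum = 0.0;

    /* Fork: a team of threads executes the parallel region; join at its end.
       The loop index is private to each thread; a and sum are shared, with
       the partial sums combined through the reduction clause. */
    #pragma omp parallel for shared(a) reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        sum += a[i];
    }

    printf("sum = %f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}

If a non-OpenMP compiler ignores the directive, the loop simply runs sequentially; only the call to omp_get_max_threads ties this sketch to the OpenMP runtime library.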

Many parallel programming tools are available that help the user parallelize an application and then port it easily to a parallel machine, whether a shared-memory machine or a network of workstations.

    96.7 Research Issues

    While it is clear that concurrency is a necessary technique for the solution of many problems, it also is clear

    that progress must be made in order to ensure its effective application. That this is still a research issue

    is clear whenever an operating system crashes due to system processes that interfere with each other or

we discover someone else in our airplane seat due to concurrent access to the airline's database. This required

    progress falls into three categories:

    1. Theoretical advances must be made to develop formal techniques that scale to real-world appli-

    cations. For example, process interference checkers exist, but operate essentially by checking all


possible interactions between processes to detect deadlock and related errors; this approach quickly leads to combinatorial explosion.

    2. Design tools that provide development support for concurrent solutions. For example, debuggers

    that capture the concurrent computation without overwhelming the user with information.

    3. Languages with powerful structures to support the correct application of concurrency. For example,

the development of concurrent object-oriented languages appears straightforward: simply allow

    each object to run concurrently because each object is logically autonomous. However, there are a

    number of issues that need resolution, including:

a. Not all objects need to run concurrently: the majority of the computation will still be sequential, and keeping it sequential avoids unnecessary scheduler overhead.

    b. If we consider multiple concurrent objects attempting to communicate with the same object:

    i. Acceptance of a message must delay all other messages in order to correctly preserve the

    internal state of the object.

    ii. Ordering of message acceptance must be synchronized to ensure computations are correct.

iii. Acceptance of messages must occur only at appropriate points in the object's execution.

    c. Inheritance through the class hierarchy creates problems because it will mix this synchronization

    with object behavior.

    96.8 Summary

    The single outstanding problem with concurrency is the development of correct solutions (as it is in all

    software systems): the state of development of both formal methods and software engineering tools for

    concurrent solutions lags behind the sequential world in this regard and well behind hardware advances.

    Defining Terms

Asynchronous message passing: Message passing in which messages may be buffered, so the sending process may continue after the send is initiated; the receiving process blocks if the message

    queue is empty.

    Channel: The data structure, which may be realized in hardware, over which processes send messages.

    Client/server: The software architecture in which clients are able to request services of processes executing

    on remote machines.

Condition variable: A variable used within a monitor to delay an executing process.

Critical region: A section of code that must appear to be executed indivisibly.

    Deadlock: The state in which processes are waiting for events that can never occur; that is, the processes

    cannot progress.

    Distributed processing: The use of multiple processors that are remote from each other.

Fairness: The property that every process will eventually be able to progress, that is, enter its critical region.

    Message passing: A technique for providing mutual exclusion, communication, and synchronization

    among concurrent processes via sending messages between processes.

    Monitor: An encapsulation of a resource and the operations on that resource that serve to ensure mutual

    exclusion.

    Multiprocessing: The use of multiple processors.

    Multiprogramming: Simulating concurrency by interleaving instruction execution from multiple pro-

    grams; time sharing or time slicing.

    Mutual exclusion: The property ensuring that a critical region is executed indivisibly by one process or

    thread at a time.

    Race: Nondeterministic behavior caused by incorrectly synchronized concurrent processes.

    Remote procedure call: The message-passing architecture in which processes request services of processes

    executing procedures on remote machines.

    Rendezvous: The message-passing construct used in the Ada language.


Semaphore: A nonnegative integer-valued variable on which two operations are defined: P and V, which signal intent to enter and exit, respectively, a critical region.

Synchronous message passing: Message passing in which both sender and receiver must synchronize at the moment of message transmission.

    References

    Journals

Ahuja, S., Carriero, N., and Gelernter, D. 1986. Linda and Friends. Computer, 19(8):26-34.

    Andrews, G. R. and Schneider, F. B. 1983. Concepts and notations for concurrent programming. Comp.

Surv., 15(1):3-43; reprinted in Gehani, N. and McGettrick, A. D. 1988. Concurrent Programming.

    Addison-Wesley, New York.

    Brinch Hansen, P. 1975. The Programming Language Concurrent Pascal. IEEE Trans. on Software Engineer-

ing, 1(2):199-207; reprinted in Gehani, N. and McGettrick, A. D. 1988. Concurrent Programming.

    Addison-Wesley, New York.

Dijkstra, E. W. 1968. The structure of the T. H. E. multiprogramming system. CACM, 11:341-346.

Gehani, N. H. and Roome, W. D. 1986. Concurrent C. Software: Practice and Experience, 16(9):821-844;

    reprinted in Gehani, N. and McGettrick, A. D. 1988. Concurrent Programming. Addison-Wesley,

    New York.

Peterson, G. L. 1983. A new solution to Lamport's concurrent programming problem using small shared variables. ACM Trans. Prog. Lang. and Syst., 5(1):56-65.

    Books

    Andrews, G. R. 2000. Foundations of Multithreaded, Parallel, and Distributed Programming. Benjamin-

    Cummings, New York.

    Andrews, G. R. and Olsson, R. A. 1993. The SR Programming Language. Benjamin-Cummings, New York.

    Ben-Ari, M. 1982. Principles of Concurrent Programming. Prentice Hall, London.

    Ben-Ari, M. 1990. Principles of Concurrent and Distributed Programming. Prentice Hall, London.

    Bernstein, A. J. and Lewis, P. M. 1993. Concurrency in Programming and Database Systems. Jones and

    Bartlett, Boston.

    Filman, R. E. and Friedman, D. P. 1984. Coordinated Computing. McGraw-Hill, New York.

    Gehani, N. and McGettrick, A. D. 1988. Concurrent Programming. Addison-Wesley, New York.

Goldberg, A. and Robson, D. 1989. Smalltalk-80: The Language. Addison-Wesley, New York.

    Hartley, S. J. 1995. Operating Systems Programming. Oxford, New York.

    Hinchey, M. G. and Jarvis, S. A. 1995. The CSP Reference Book. McGraw-Hill, New York.

    Hoare, C. A. R. 1985. Communicating Sequential Processes. Prentice Hall, London.

    Jones, G. and Goldsmith, M. 1988. Programming occam 2. Prentice Hall, New York.

    Lester, B. P. 1993. The Art of Parallel Programming. Prentice Hall, New Jersey.

    Magee, J. and Kramer, J. 1999. Concurrency: State Models and Java Programs. Wiley, West Sussex.

    Milner, R. 1989. Communication and Concurrency. Addison-Wesley, New York.

Schneider, F. B. 1997. On Concurrent Programming. Springer-Verlag, New York.

    Wilkinson, B. and Allen, M. 1999. Parallel Programming: Techniques and Applications Using Networked

    Workstations and Parallel Computers. Prentice Hall, New Jersey.

    Further Information

    Further information can be gleaned from a number of sources; particularly recommended are Andrews

    [2000] for a comprehensive view of the field with an axiomatic flair, including a fascinating bibliography

    with historical notes and extensive problem sets; Schneider [1997] for a graduate level treatise on axiomatic

    semantics in the context of concurrency; Ben-Ari [1982] for a nice introduction including problem sets;

    Ben-Ari [1990], which adds Ada code examples, correctness arguments, and distributed computing; the


process algebra approach is developed in Hoare [1985] and Milner [1989] and demonstrated in Magee and

    Kramer [1999]; Filman and Friedman [1984] emphasize the various models of concurrent computation;

    Lester [1993] provides a comprehensive introduction including efficiency considerations, but without

    correctness arguments; Bernstein and Lewis [1993] use the axiomatic approach to develop concurrent

    solutions to a variety of problems with an emphasis on databases; Gehani and McGettrick [1988] reprint a

    number of the classic papers in the field. Wilkinson and Allen [1999] demonstrate parallel programming

    for a wide range of problems.

    The journal Concurrency: Practice and Experience focuses on practical experience with concurrent ma-

    chines and concurrent solutions to problems; concurrency is also frequently dealt with in a large number

    of society journals.

In addition, a large number of resources are available on the Web and can be found using standard search techniques.
