CS2403 Programming Languages: Concurrency
Chung-Ta King, Department of Computer Science, National Tsing Hua University
(Slides are adopted from Concepts of Programming Languages, R.W. Sebesta)


Page 1: CS2403 Programming Languages Concurrency

CS2403 Programming Languages

Concurrency

Chung-Ta King, Department of Computer Science, National Tsing Hua University

(Slides are adopted from Concepts of Programming Languages, R.W. Sebesta)

Page 2: CS2403 Programming Languages Concurrency

Outline

- Parallel architecture and programming
- Language support for concurrency
  - Controlling concurrent tasks
  - Sharing data
  - Synchronizing tasks

Page 3: CS2403 Programming Languages Concurrency

Sequential Computing

- von Neumann architecture with a Program Counter (PC) dictates sequential execution
- Traditional programming thus follows a single thread of control
  - The sequence of program points reached as control flows through the program

[Figure: program counter]
(Introduction to Parallel Computing, Blaise Barney)

Page 4: CS2403 Programming Languages Concurrency

Sequential Programming Dominates

- Sequential programming has dominated throughout computing history
- Why? Why is there no need to change programming style?

Page 5: CS2403 Programming Languages Concurrency

2 Factors Help to Maintain Perf.

- IC technology: ever-shrinking feature size
  - Moore's law, faster switching, more functionalities
- Architectural innovations to remove bottlenecks in the von Neumann architecture
  - Memory hierarchy for reducing memory latency: registers, caches, scratchpad memory
  - Hide or tolerate memory latency: multithreading, prefetching, predication, speculation
  - Executing multiple instructions in parallel: pipelining, multiple issue (in-/out-of-order, VLIW), SIMD multimedia extensions (instruction-level parallelism, ILP)

(Prof. Mary Hall, Univ. of Utah)

Page 6: CS2403 Programming Languages Concurrency

End of Sequential Programming?

- Infeasible to keep improving the performance of uniprocessors: power, clocking, ...
- Multicore architecture prevails (homogeneous or heterogeneous)
  - Achieve performance gains with simpler processors
- Sequential programming is still alive! Why?
  - Throughput versus execution time
- Can we live with sequential programming forever?

Page 7: CS2403 Programming Languages Concurrency

Parallel Programming

- A programming style that specifies concurrency (control structure) and interaction (communication structure) between concurrent subtasks
  - Still in an imperative language style
- Concurrency can be expressed at various levels of granularity: machine instruction level, high-level language statement level, unit level, program level
- Different models assume different architectural support
  - Look at parallel architectures first

(Ananth Grama, Purdue Univ.)

Page 8: CS2403 Programming Languages Concurrency


An Abstract Parallel Architecture

How is parallelism managed? Where is the memory physically located? What is the connectivity of the network?

(Prof. Mary Hall, Univ. of Utah)

Page 9: CS2403 Programming Languages Concurrency

Flynn's Taxonomy of Parallel Arch.

- Distinguishes parallel architectures by instruction and data streams
  - SISD: classical uniprocessor architecture
- SISD: Single Instruction, Single Data
- SIMD: Single Instruction, Multiple Data
- MISD: Multiple Instruction, Single Data
- MIMD: Multiple Instruction, Multiple Data

(Introduction to Parallel Computing, Blaise Barney)

Page 10: CS2403 Programming Languages Concurrency


Parallel Control Mechanisms

(Prof. Mary Hall, Univ. of Utah)

Page 11: CS2403 Programming Languages Concurrency

2 Classes of Parallel Architecture

- Shared memory multiprocessor architectures
  - Multiple processors can operate independently but share the same memory system
  - Share a global address space where each processor can access every memory location
  - Changes in a memory location effected by one processor are visible to all other processors, like a bulletin board

(Introduction to Parallel Computing, Blaise Barney; Prof. Mary Hall, Univ. of Utah)

Page 12: CS2403 Programming Languages Concurrency

2 Classes of Parallel Architecture

- Distributed memory architectures
  - Processing units (PEs) connected by an interconnect
  - Each PE has its own distinct address space; there is no global address space, and PEs explicitly communicate to exchange data
  - Ex.: PC clusters connected by commodity Ethernet

(Introduction to Parallel Computing, Blaise Barney; Prof. Mary Hall, Univ. of Utah)

Page 13: CS2403 Programming Languages Concurrency

Shared Memory Programming

- Often organized as a collection of threads of control
  - Each thread has private data, e.g., a local stack, and a set of shared variables, e.g., the global heap
  - Threads communicate implicitly by writing and reading shared variables (see the sketch below)
  - Threads coordinate through locks and barriers implemented using shared variables

(Prof. Mary Hall, Univ. of Utah)
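To make the model concrete, here is a minimal Pthreads sketch (added for illustration, not from the original slides): a global variable sits in the shared address space, each thread keeps private data on its own stack, and the threads communicate simply by reading and writing the global.

    #include <pthread.h>
    #include <stdio.h>

    int shared_flag = 0;              /* shared: visible to every thread */

    void *worker(void *arg) {
        int local = 42;               /* private: lives on this thread's own stack */
        shared_flag = local;          /* implicit communication through shared memory */
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);
        pthread_join(tid, NULL);      /* coordination: wait until worker has written */
        printf("main sees shared_flag = %d\n", shared_flag);
        return 0;
    }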

Page 14: CS2403 Programming Languages Concurrency

Distributed Memory Programming

- Organized as named processes
  - A process is a thread of control plus a local address space -- NO shared data
  - A process cannot see the memory contents of other processes, nor can it address and access them
- Logically shared data is partitioned over the processes
- Processes communicate by explicit send/receive, i.e., asking the destination process to access its local data on behalf of the requesting process
- Coordination is implicit in communication events: blocking/non-blocking send and receive

(Prof. Mary Hall, Univ. of Utah)

Page 15: CS2403 Programming Languages Concurrency

Distributed Memory Programming

- Private memory looks like a mailbox

(Prof. Mary Hall, Univ. of Utah)

Page 16: CS2403 Programming Languages Concurrency

Specifying Concurrency

- What language support is needed for parallel programming?
- Specifying (parallel) control flows
  - How to create, start, suspend, resume, and stop processes/threads?
  - How to let one process/thread explicitly wait for events or for another process/thread?
- Specifying data flows among parallel flows
  - How to pass data generated by one process/thread to another process/thread?
  - How to let multiple processes/threads access common resources, e.g., a counter, without conflicts?

Page 17: CS2403 Programming Languages Concurrency

Specifying Concurrency

- Many parallel programming systems provide libraries and perhaps compiler pre-processors to extend a traditional imperative language, such as C, for parallel programming
  - Examples: Pthreads, OpenMP, MPI, ...
- Some languages have parallel constructs built directly into the language, e.g., Java, C#
- So far, the library approach works fine

Page 18: CS2403 Programming Languages Concurrency

Shared Memory Prog. with Threads

- Several thread libraries:
- Pthreads: the POSIX threading interface
  - POSIX: Portable Operating System Interface for UNIX
  - Interface to OS utilities; system calls to create and synchronize threads
- OpenMP is a newer standard
  - Allows a programmer to separate a program into serial regions and parallel regions
  - Provides synchronization constructs
  - Compiler generates the thread program and synchronization
  - Extensions to Fortran, C, C++, mainly by directives

(Prof. Mary Hall, Univ. of Utah)

Page 19: CS2403 Programming Languages Concurrency

Thread Basics

- A thread is a program unit that can be in concurrent execution with other program units
- Threads differ from ordinary subprograms:
  - When a program unit starts the execution of a thread, it is not necessarily suspended
  - When a thread's execution is completed, control may not return to the caller
  - All threads run in the same address space but have their own runtime stacks

Page 20: CS2403 Programming Languages Concurrency

Message Passing Prog. with MPI

- MPI defines a standard library for message passing that can be used to develop portable message-passing programs using C or Fortran
- Based on Single Program, Multiple Data (SPMD)
- All communication and synchronization require subroutine calls; no shared variables
- Program runs on a single processor just like any uniprocessor program, except for calls to the message passing library
- It is possible to write fully functional message-passing programs using only six routines (see the sketch below)

(Prof. Mary Hall, Univ. of Utah; Prof. Ananth Grama, Purdue Univ. )
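The six routines referred to here are commonly listed as MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_Send, and MPI_Recv. The following minimal sketch (added for illustration, not part of the slides) touches all six; run it with at least two processes, e.g., mpiexec -n 2.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, size, token = 0;
        MPI_Init(&argc, &argv);                      /* 1: start up MPI */
        MPI_Comm_size(MPI_COMM_WORLD, &size);        /* 2: how many processes */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);        /* 3: which one am I */
        if (rank == 0 && size > 1) {
            token = 99;
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);       /* 4 */
        } else if (rank == 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);                              /* 5 */
            printf("rank 1 received %d\n", token);
        }
        MPI_Finalize();                              /* 6: shut down MPI */
        return 0;
    }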

Page 21: CS2403 Programming Languages Concurrency

Message Passing Basics

- The computing system consists of p processes, each with its own exclusive address space
  - Each data element must belong to one of the partitions of the space; hence, data must be explicitly partitioned and placed
- All interactions (read-only or read/write) require the cooperation of two processes: the process that has the data and the one that wants to access the data
- All processes execute asynchronously unless they interact through send/receive synchronizations

(Prof. Ananth Grama, Purdue Univ. )

Page 22: CS2403 Programming Languages Concurrency

Controlling Concurrent Tasks

- Pthreads:
  - Program starts with a single master thread, from which other threads are created

      errcode = pthread_create(&thread_id, &thread_attribute,
                               &thread_fun, &fun_arg);

  - Each thread executes a specific function, thread_fun(), representing the thread's computation
  - All threads execute in parallel
  - Function pthread_join() suspends execution of the calling thread until the target thread terminates

(Prof. Mary Hall, Univ. of Utah)

Page 23: CS2403 Programming Languages Concurrency

Pthreads "Hello World!"

#include <pthread.h>
#include <stdio.h>

void *thread(void *vargp);

int main() {
    pthread_t tid;
    pthread_create(&tid, NULL, thread, NULL);
    pthread_join(tid, NULL);
    pthread_exit((void *)NULL);
}

void *thread(void *vargp) {
    printf("Hello World from thread!\n");
    pthread_exit((void *)NULL);
}

(http://www.cs.binghamton.edu/~guydosh/cs350/hello.c)

Page 24: CS2403 Programming Languages Concurrency

Controlling Concurrent Tasks (cont.)

- OpenMP:
  - Begin execution as a single process and fork multiple threads to work on parallel blocks of code: single program, multiple data
  - Parallel constructs are specified using pragmas

(Prof. Mary Hall, Univ. of Utah)

Page 25: CS2403 Programming Languages Concurrency

OpenMP Pragma

- All pragmas begin with #pragma
- Compiler calculates loop bounds for each thread and manages data partitioning (see the sketch below)
- Synchronization is also automatic (barrier)

(Prof. Mary Hall, Univ. of Utah)
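As an illustration (not from the original slides), a single parallel-for pragma is enough to split a loop across threads: the compiler and runtime divide the iterations, and the implicit barrier at the end of the construct provides the synchronization mentioned above.

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        int i, a[1000];
        /* The iterations 0..999 are divided among the threads; an implicit
           barrier at the end of the for construct synchronizes them. */
        #pragma omp parallel for
        for (i = 0; i < 1000; i++) {
            a[i] = 2 * i;
        }
        printf("a[999] = %d\n", a[999]);   /* safe: all threads have finished */
        return 0;
    }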

Page 26: CS2403 Programming Languages Concurrency

OpenMP "Hello World!"

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int th_id, nthreads;
    #pragma omp parallel private(th_id)
    {
        th_id = omp_get_thread_num();
        printf("Hello World: %d\n", th_id);
        #pragma omp barrier
        if (th_id == 0) {
            nthreads = omp_get_num_threads();
            printf("%d threads\n", nthreads);
        }
    }
    return EXIT_SUCCESS;
}

(http://en.wikipedia.org/wiki/OpenMP#Hello_World)

Page 27: CS2403 Programming Languages Concurrency

Controlling Concurrent Tasks (cont.)

- Java:
  - The concurrent units in Java are methods named run
  - A run method's code can be in concurrent execution with other such methods
  - The process in which the run methods execute is called a thread

      class MyThread extends Thread {
          public void run() { ... }
      }
      ...
      Thread myTh = new MyThread();
      myTh.start();

Page 28: CS2403 Programming Languages Concurrency

Controlling Concurrent Tasks (cont.)

- The Java Thread class has several methods to control the execution of threads
  - The yield method is a request from the running thread to voluntarily surrender the processor
  - The sleep method can be used by the caller of the method to block the thread
  - The join method is used to force a method to delay its execution until the run method of another thread has completed its execution

Page 29: CS2403 Programming Languages Concurrency

Controlling Concurrent Tasks (cont.)

- Java thread priority:
  - A thread's default priority is the same as that of the thread that created it
    - If main creates a thread, its default priority is NORM_PRIORITY
  - The Thread class defines two other priority constants, MAX_PRIORITY and MIN_PRIORITY
  - The priority of a thread can be changed with the setPriority method

Page 30: CS2403 Programming Languages Concurrency

Controlling Concurrent Tasks (cont.)

- MPI:
  - Programmer writes the code for a single process and the compiler includes the necessary libraries

      mpicc -g -Wall -o mpi_hello mpi_hello.c

  - The execution environment starts the parallel processes

      mpiexec -n 4 ./mpi_hello

(Prof. Mary Hall, Univ. of Utah)

Page 31: CS2403 Programming Languages Concurrency

MPI "Hello World!"

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello World from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

(Prof. Mary Hall, Univ. of Utah)

Page 32: CS2403 Programming Languages Concurrency

Sharing Data

- Pthreads:
  - Variables declared outside of main are shared
  - Objects allocated on the heap may be shared (if a pointer is passed; see the sketch below)
  - Variables on the stack are private: passing pointers to these around to other threads can cause problems
  - Shared variables can be read and written directly by all threads, so synchronization is needed to prevent races
  - Synchronization primitives, e.g., semaphores, locks, mutexes, barriers, are used to sequence the executions of the threads and thereby indirectly sequence the data passed through shared variables

(Prof. Mary Hall, Univ. of Utah)
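A minimal sketch of the heap-sharing point above (added for illustration, not from the slides): the worker thread can update heap data only because main passes a pointer to it through pthread_create's argument, while main's stack variable stays private.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Heap data is shared only because a pointer to it is passed explicitly. */
    void *worker(void *arg) {
        int *heap_counter = (int *)arg;   /* points into the shared heap */
        *heap_counter += 1;               /* would need a lock if several threads did this */
        return NULL;
    }

    int main(void) {
        int *heap_counter = malloc(sizeof(int));   /* heap: shareable */
        int stack_local = 0;                       /* stack: private to main */
        *heap_counter = 0;

        pthread_t tid;
        pthread_create(&tid, NULL, worker, heap_counter);
        pthread_join(tid, NULL);

        printf("heap_counter = %d, stack_local = %d\n", *heap_counter, stack_local);
        free(heap_counter);
        return 0;
    }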

Page 33: CS2403 Programming Languages Concurrency

Sharing Data (cont.)

- OpenMP:
  - shared variables are shared; the default is shared
  - private variables are private
  - Loop index is private

      int bigdata[1024];

      void* foo(void* bar) {
          int tid;
          #pragma omp parallel \
              shared(bigdata) private(tid)
          {
              /* Calculation here */
          }
      }

(Prof. Mary Hall, Univ. of Utah)

Page 34: CS2403 Programming Languages Concurrency

Sharing Data (cont.)

- MPI:

      #include <mpi.h>

      int main(int argc, char *argv[]) {
          int rank, buf;
          MPI_Status status;
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          if (rank == 0) {
              buf = 123456;
              MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
          } else if (rank == 1) {
              MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
          }
          MPI_Finalize();
          return 0;
      }

(Prof. Mary Hall, Univ. of Utah)

Page 35: CS2403 Programming Languages Concurrency

Synchronizing Tasks

- A mechanism that controls the order in which tasks execute
- Two kinds of synchronization
  - Cooperation: one task waits for another, e.g., for passing data

        task 1          task 2
        a = ...         ... = ... a ...

  - Competition: tasks compete for exclusive use of a resource without a specific order

        task 1                task 2
        sum += local_sum      sum += local_sum

Page 36: CS2403 Programming Languages Concurrency

Synchronizing Tasks (cont.)

- Pthreads:
  - Provides various synchronization primitives, e.g., mutex, semaphore, barrier
  - Mutex: protects critical sections -- segments of code that must be executed by only one thread at a time
    - Protect code to indirectly protect shared data
  - Semaphore: synchronizes between two threads using sem_post() and sem_wait() (see the sketch after the mutex example below)
  - Barrier: synchronizes threads to reach the same point in code before going any further

Page 37: CS2403 Programming Languages Concurrency

Pthreads Mutex Example

#include <pthread.h>

pthread_mutex_t sum_lock;
int sum;

int main() {
    ...
    pthread_mutex_init(&sum_lock, NULL);
    ...
}

void *find_min(void *list_ptr) {
    int my_sum;
    /* ... compute my_sum ... */
    pthread_mutex_lock(&sum_lock);     /* enter critical section */
    sum += my_sum;
    pthread_mutex_unlock(&sum_lock);   /* leave critical section */
}
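The previous slide also mentions semaphores; a minimal sketch of cooperation synchronization with sem_post()/sem_wait() (added for illustration, not part of the original slides) might look like this:

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    sem_t data_ready;                /* counting semaphore, initialized to 0 */
    int shared_value;

    void *producer(void *arg) {
        shared_value = 7;            /* produce the data ... */
        sem_post(&data_ready);       /* ... then signal the consumer */
        return NULL;
    }

    void *consumer(void *arg) {
        sem_wait(&data_ready);       /* block until the producer has posted */
        printf("consumer got %d\n", shared_value);
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        sem_init(&data_ready, 0, 0); /* shared between threads, initial value 0 */
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        sem_destroy(&data_ready);
        return 0;
    }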

Page 38: CS2403 Programming Languages Concurrency

Synchronizing Tasks (cont.)

- OpenMP:
  - OpenMP has a reduction operation

      sum = 0;
      #pragma omp parallel for reduction(+:sum)
      for (i = 0; i < 100; i++) {
          sum += array[i];
      }

  - OpenMP also has a critical directive, executed by all threads but restricted to only one thread at a time

      #pragma omp critical [(name)] new-line
      sum = sum + 1;

(Prof. Mary Hall, Univ. of Utah)

Page 39: CS2403 Programming Languages Concurrency

Synchronizing Tasks (cont.)

- Java:
  - A method that includes the synchronized modifier disallows any other synchronized method from running on the object while it is in execution

      public synchronized void deposit(int i) {...}
      public synchronized int fetch() {...}

  - The above two methods are synchronized, which prevents them from interfering with each other

Page 40: CS2403 Programming Languages Concurrency

Synchronizing Tasks (cont.)

- Java:
  - Cooperation synchronization is achieved via the wait, notify, and notifyAll methods
  - All of these methods are defined in Object, which is the root class in Java, so all objects inherit them
  - The wait method must be called in a loop
  - The notify method is called to tell one waiting thread that the event it was waiting for has happened
  - The notifyAll method awakens all of the threads on the object's wait list

Page 41: CS2403 Programming Languages Concurrency

Synchronizing Tasks (cont.)

- MPI:
  - Use send/receive to complete task synchronizations, but the semantics of send/receive have to be specialized
  - Non-blocking send/receive: send() and receive() calls return no matter whether the data has arrived (see the sketch below)
  - Blocking send/receive:
    - Unbuffered blocking send() does not return until a matching receive() is encountered at the receiving process
    - Buffered blocking send() returns after the sender has copied the data into the designated buffer
    - Blocking receive() forces the receiving process to wait

(Prof. Ananth Grama, Purdue Univ. )
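In MPI the non-blocking variants are MPI_Isend() and MPI_Irecv(), paired with MPI_Wait() (or MPI_Test()) to find out when the operation has completed. A minimal sketch, added here for illustration and meant to be run with at least two processes:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, buf;
        MPI_Request req;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            buf = 123;
            MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            /* the call returns immediately; other work could overlap the transfer here */
            MPI_Wait(&req, MPI_STATUS_IGNORE);   /* now it is safe to reuse buf */
        } else if (rank == 1) {
            MPI_Irecv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);   /* block until the data has arrived */
            printf("rank 1 received %d\n", buf);
        }
        MPI_Finalize();
        return 0;
    }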

Page 42: CS2403 Programming Languages Concurrency


Unbuffered Blocking

(Prof. Ananth Grama, Purdue Univ. )

Page 43: CS2403 Programming Languages Concurrency


Buffered Blocking

(Prof. Ananth Grama, Purdue Univ. )

Page 44: CS2403 Programming Languages Concurrency

Summary

- Concurrent execution can be at the instruction, statement, subprogram, or program level
- Two fundamental programming styles: shared variables and message passing
- Programming languages must provide support for specifying control and data flows