Topic 4: Current Software Methodologies and Languages
Seyed Hosein Attarzadeh Niaki
KTH
February 23, 2010
Seyed Hosein Attarzadeh Niaki (KTH) Topic 4: Current Software Methodologies and Languages
February 23, 2010 1 / 61
Outline
1 Parallel Programming
    OpenMP
    Message-Passing Interface
    Erlang
    Haskell
2 Real-Time Programming
    RTOS
    Introduction to Real-Time Java
    Ada
Solve a Problem in Parallel
Problems with Parallelization
no program can run more quickly than the longest chain of dependent calculations
Bernstein's conditions state that program fragments $P_i$ and $P_j$ can run in parallel if ($I_i$ and $O_i$ are the inputs and outputs of fragment $P_i$):
    $I_j \cap O_i = \emptyset$ (no flow dependency)
    $I_i \cap O_j = \emptyset$ (no anti-dependency)
    $O_i \cap O_j = \emptyset$ (no output dependency)
race conditions happen when multiple threads need to update a shared variable
    locks are used to provide mutual exclusion
    locks can greatly slow down a program
    locking multiple variables without atomic locks can produce deadlock
barriers are used when subtasks of a program need to act in synchrony
overhead of communication between threads may dominate the time spent on solving the problem
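Bernstein's conditions can be checked mechanically once the input and output sets of two fragments are known. The sketch below is only an illustration: it assumes each variable is encoded as one bit of a mask, and the names fragment and bernstein_parallel are hypothetical, not part of any library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: each bit of a mask stands for one program variable. */
typedef struct {
    uint32_t in;   /* I: variables the fragment reads  */
    uint32_t out;  /* O: variables the fragment writes */
} fragment;

/* Returns true when fragment pi (earlier) and fragment pj (later) satisfy
   all three Bernstein conditions, i.e. the three intersections are empty. */
bool bernstein_parallel(fragment pi, fragment pj)
{
    bool no_flow   = (pj.in  & pi.out) == 0;  /* pj does not read what pi writes  */
    bool no_anti   = (pi.in  & pj.out) == 0;  /* pj does not overwrite pi's input */
    bool no_output = (pi.out & pj.out) == 0;  /* both do not write the same data  */
    return no_flow && no_anti && no_output;
}
```

With variables a, b, c, d mapped to bits 1, 2, 4, 8, the fragments "b = f(a)" and "c = g(b)" fail the flow condition, while "b = f(a)" and "d = h(a)" pass all three: sharing an input is harmless.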
Classifications
Flynn's taxonomy distinguishes parallel computer architectures along two independent dimensions, Instruction and Data
    SISD: an entirely sequential computer
    SIMD: processor arrays, vector pipelines, GPUs, etc.
    MISD: few application examples exist (multiple parallel filters)
    MIMD: most common type of modern computers
applications are classified according to how often their subtasks need to synchronize:
    Fine-grained: subtasks communicate many times per second
    Coarse-grained: subtasks do not communicate many times per second
    Embarrassingly parallel: subtasks rarely need to communicate
parallelism can occur at different levels
    Bit-level: greater bit width
    Instruction-level: pipelines, superscalar processors
    Data: inherent in program loops
    Task: entirely different calculations on the same or different data
Memory Architectures
Shared Memory: all processors can access all memory as a global address space; does not scale well
    Uniform Memory Access (UMA)
    Non-Uniform Memory Access (NUMA)
Distributed Memory: requires a communication network to connect inter-processor memory; the programmer needs to do explicit communication
Hybrid: the shared memory component is usually a cache-coherent SMP machine; the distributed memory component is the networking of multiple SMPs
Parallel Programming Models
The best-known and most widely used models for programming parallel systems, classified by the assumptions they make about the underlying architecture, are:
    Shared memory: threads communicate using shared variables by means of synchronization facilities; implemented in POSIX threads and OpenMP
    Message passing: a set of tasks with their own local memory exchange information and synchronize by sending and receiving messages; implemented in MPI
    Hybrid: combinations of the two approaches above are commonly used
Some approaches need explicit declaration of parallelism, but there are parallelizing compilers that can generate parallel code:
    Fully automatic: candidates are loops and independent sections; limited success
    Programmer directed: the programmer provides directives or compiler flags to assist the compiler
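Whether a loop is a candidate for parallelization depends on whether its iterations are independent. The following sketch contrasts the two cases; the function names scale and prefix_sum are illustrative only.

```c
/* Independent iterations: iteration i touches only a[i], so the
   iterations may run in any order or at the same time. */
void scale(double *a, int n, double k)
{
    for (int i = 0; i < n; i++)
        a[i] = a[i] * k;
}

/* Loop-carried flow dependency: iteration i reads a[i - 1], which
   iteration i - 1 has just written, so the iterations cannot simply
   be handed to different threads. */
void prefix_sum(int *a, int n)
{
    for (int i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```

The first loop is the kind a parallelizing compiler or a programmer directive can split safely. The second needs a different algorithm (for example a parallel scan) before it can be parallelized.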
Designing Parallel Programs
Understand the problem and the program
    can it be parallelized?
    where are the hotspots and bottlenecks?
    identify inhibitors to parallelism
    investigate other algorithms
Partitioning
    domain decomposition
    functional decomposition
Designing Parallel Programs (contd.)
Communication
    cost
    latency vs. bandwidth
    synchronous vs. asynchronous (blocking vs. non-blocking)
    scope of communications (point-to-point, collective)
Synchronization
    barrier
    lock/semaphore
    synchronous communication operations
Introduction to OpenMP
API defined jointly by a group of hardware and software vendors
consists of compiler directives, runtime library routines, and environment variables
provides a portable, scalable model for explicit multi-threaded, shared memory parallelism
provides the capability to incrementally parallelize a program
is based on a fork/join model
supports nested parallelism
supports dynamic threads
it is NOT:
    for distributed memory systems
    guaranteeing I/O synchronization
    required to check for deadlocks, races, dependencies, and conflicts
OpenMP Directives
Directive format:
    #pragma omp directive-name [clause, ...] newline
    #pragma omp: required for all OpenMP C/C++ directives
    directive-name: a valid OpenMP directive; must appear after the pragma and before any clauses
    [clause, ...]: optional; clauses can be in any order, and repeated as necessary unless otherwise restricted
    newline: required; precedes the structured block which is enclosed by this directive
each directive applies to at most one succeeding statement, which must be a structured block
a PARALLEL region is a block of code that will be executed by multiple threads
    #pragma omp parallel [clause ...] newline
        if (scalar_expression)
        private (list)
        shared (list)
        default (shared | none)
        firstprivate (list)
        reduction (operator: list)
        copyin (list)
        num_threads (integer-expression)
    structured_block
a team of threads will be created
implied barrier at the end
nested regions supported
illegal to branch in or out
region shouldn't span multiple routines or files
Parallel Region Example
every thread executes all the code enclosed in the parallel region
OpenMP library routines are used to obtain thread identifiers and the total number of threads
    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        int nthreads, tid;

        /* Fork a team of threads with each thread having a private tid variable */
        #pragma omp parallel private(tid)
        {
            /* Obtain and print thread id */
            tid = omp_get_thread_num();
            printf("Hello World from thread = %d\n", tid);

            /* Only master thread does this */
            if (tid == 0) {
                nthreads = omp_get_num_threads();
                printf("Number of threads = %d\n", nthreads);
            }
        } /* All threads join master thread and terminate */
        return 0;
    }
Work Sharing Constructs
a work-sharing construct divides the execution of the enclosed region among the members of a team of threads
no implied barrier at the beginning, an implied barrier at the end
    DO/for: shares iterations of a loop (data parallelism)
    SECTIONS: breaks work into separate, discrete sections
    SINGLE: serializes a section of code (just one thread runs)
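As a minimal sketch of two of these constructs, the functions below use the combined "parallel for" and "parallel sections" forms. The function names square_all and min_and_max are illustrative; if the compiler has no OpenMP support enabled, the pragmas are ignored and the code simply runs serially with the same result.

```c
/* for construct: the loop iterations are divided among the team. */
void square_all(const double *in, double *out, int n)
{
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        out[i] = in[i] * in[i];
    /* implied barrier: every out[i] is written once the loop ends */
}

/* sections construct: each section is executed once, by one thread.
   Assumes n >= 1. lo and hi are shared, but each is written by only
   one section, so no further synchronization is needed. */
void min_and_max(const double *v, int n, double *min_out, double *max_out)
{
    double lo = v[0], hi = v[0];
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            for (int k = 1; k < n; k++)
                if (v[k] < lo) lo = v[k];
        }
        #pragma omp section
        {
            for (int k = 1; k < n; k++)
                if (v[k] > hi) hi = v[k];
        }
    }
    *min_out = lo;
    *max_out = hi;
}
```

The loop is data parallelism (the same operation on different elements), while the two sections are a small case of functional decomposition (different operations on the same data).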
Synchronization Constructs
the master directive specifies a region of code that is to be executed only by the master thread of the team
the critical directive specifies a region that is to be executed by only one thread at a time
the barrier directive synchronizes all the threads in a team
the flush directive identifies a synchronization point at which the implementation must provide a consistent view of memory
the ordered directive specifies that the iterations of the enclosed loop will be executed in the same order as the serial loop (used within a DO/for loop with an ordered clause)
the threadprivate directive makes global file-scope variables local to each executing thread (by making multiple copies)
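The critical directive is the one most often needed to avoid a race on a shared variable. A small sketch, with the hypothetical function name count_even, might look like this; without OpenMP support the pragmas are ignored and the loop runs serially.

```c
/* Counts the even numbers in v. The counter is shared by all threads,
   so each increment is wrapped in a critical region: only one thread
   at a time may execute it, which prevents lost updates. */
int count_even(const int *v, int n)
{
    int count = 0;
    int i;
    #pragma omp parallel for shared(count)
    for (i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            #pragma omp critical
            {
                count++;
            }
        }
    }
    return count;
}
```

Because threads queue up at the critical region, this form illustrates the remark that locks can slow a program down; for a simple sum like this one, a reduction clause is usually the cheaper choice.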
Data Scope Attribute Clauses
Since shared memory programming is all about shared variables, data scoping is a very important concept
the private clause declares variables to be private to each thread
the shared clause declares variables to be shared among all threads in the team
the default clause allows specifying a default scope for all variables in a parallel region
the firstprivate clause combines the behavior of the private clause with automatic initialization of the variables in its list
the lastprivate clause combines the behavior of the private clause with a copy from the last loop iteration or section to the original variable object
the copyin clause provides a means for assigning the same value to threadprivate variables for all threads in the team
the copyprivate clause can be used to broadcast values acquired by a single thread directly to all instances of the private variables in the other threads
the reduction clause performs a reduction on the variables that appear in its list
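Three of these clauses can be seen working together in one loop. The sketch below is an illustration under stated assumptions: the function name shifted_sum and its parameters are hypothetical, and n is assumed to be at least 1 so that a last iteration exists.

```c
/* Adds offset to every element of v and returns the sum of the results.
   offset: firstprivate, so each thread's private copy starts with the
           caller's value.
   last:   lastprivate, so after the loop it holds the value computed by
           the sequentially last iteration (i == n - 1).
   sum:    reduction, so each thread accumulates a private partial sum
           and the partial sums are added together at the end.
   Assumes n >= 1. */
long shifted_sum(const int *v, int n, int offset, int *last_out)
{
    long sum = 0;
    int last = 0;
    int i;
    #pragma omp parallel for firstprivate(offset) \
        lastprivate(last) reduction(+:sum)
    for (i = 0; i < n; i++) {
        last = v[i] + offset;
        sum += last;
    }
    *last_out = last;
    return sum;
}
```

For v = {1, 2, 3, 4} and offset = 10 the shifted values are 11, 12, 13, 14, so the function returns 50 and stores 14 through last_out, whether or not OpenMP is enabled.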
Example: Vector Dot Product
iterations of the parallel loop will be distributed in equal-sized blocks to each thread in the team
at the end of the parallel loop construct, all threads will add their values of result to update the master thread's global copy
    #include <stdio.h>

    int main(void) {
        int i, n, chunk;
        float a[100], b[100], result;

        /* Some initializations */
        n = 100;
        chunk = 10;
        result = 0.0;
        for (i = 0; i < n; i++) {
            a[i] = i * 1.0;
            b[i] = i * 2.0;
        }

        /* Each thread gets blocks of chunk iterations and a private partial result */
        #pragma omp parallel for \
            default(shared) private(i) \
            schedule(static,chunk) \
            reduction(+:result)
        for (i = 0; i < n; i++)
            result = result + (a[i] * b[i]);

        printf("Final result= %f\n", result);
        return 0;
    }
Introduction to MPI
Background
    a standard for a message passing library, jointly developed by vendors, researchers, library developers, and users, which claims to be portable, efficient, and flexible
    by itself, it is a specification, not a library
Programming Model
    lends itself to virtually any distributed memory parallel programming model
    it is also used on shared memory architectures (SMP/NUMA) behind the scenes
    all parallelism is explicit
    the number of tasks dedicated to run a parallel program is static (relaxed in MPI-2)
Program Environment
MPI uses objects called communicators and groups to define which collection of processes may communicate with each other
within a communicator, every process has its own unique integer identifier called its rank, used to
    specify the source and destination of messages
    control program execution
MPI_Init and MPI_Finalize initialize and terminate the execution environment
MPI_Comm_size and MPI_Comm_rank determine the number of processes in the group and the rank of the calling process
Point-to-Point Communication
point-to-point operations occur between only two different MPI tasks and can be
    synchronous send
    blocking send / blocking receive
    non-blocking send / non-blocking receive
    buffered send
    combined send/receive
any type of send can be combined with any type of receive
buffering in system buffer space deals with storing data when two tasks are out of sync; its behavior is implementation defined
a blocking send/receive only returns when it is safe to modify/use the application buffer
non-blocking operations return immediately, and it is the user's duty to check/wait for completion of the operation before manipulating the buffers (introduces the possibility to overlap communication and computation)
MPI guarantees the correct ordering of messages but not fairness
Point-to-Point Communication Routines
MPI_Send and MPI_Recv are blocking routines
MPI_Ssend is a synchronous blocking send, which waits for the receiving task to start receiving the message
MPI_Bsend is a buffered blocking send, where the user can allocate the required space for the message before it is delivered
MPI_Sendrecv sends a message and posts a receive before blocking
MPI_Isend, MPI_Irecv, MPI_Issend, and MPI_Ibsend are non-blocking versions of the above routines
MPI_Wait blocks until a specified non-blocking operation completes
MPI_Test checks the status of a non-blocking operation
Example: Nearest Neighbor Exchange in Ring Topology
    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int numtasks, rank, next, prev, buf[2], tag1 = 1, tag2 = 2;
        MPI_Request reqs[4];
        MPI_Status stats[4];

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Neighbors in the ring, wrapping around at both ends */
        prev = rank - 1;
        next = rank + 1;
        if (rank == 0) prev = numtasks - 1;
        if (rank == (numtasks - 1)) next = 0;

        /* Post non-blocking receives, then non-blocking sends */
        MPI_Irecv(&buf[0], 1, MPI_INT, prev, tag1, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(&buf[1], 1, MPI_INT, next, tag2, MPI_COMM_WORLD, &reqs[1]);

        MPI_Isend(&rank, 1, MPI_INT, prev, tag2, MPI_COMM_WORLD, &reqs[2]);
        MPI_Isend(&rank, 1, MPI_INT, next, tag1, MPI_COMM_WORLD, &reqs[3]);

        /* do some work while the messages are in flight */

        /* Wait for all four operations before using buf */
        MPI_Waitall(4, reqs, stats);

        MPI_Finalize();
        return 0;
    }
Collective Communication
collective communication involves all processes in the scope of the communicator, in the form of
    synchronization
    data movement
    collective computation
collective operations are blocking
can only be used with MPI predefined data types
More in MPI
using derived data types the user can define customized data types (contiguous, vector, indexed, and struct)
dynamically manage groups and communicator objects to organize tasks and enable collective operations on subsets of tasks
virtual topologies describe a mapping/ordering of MPI processes into a geometric shape (Cartesian, graph, etc.)
Added in MPI-2:
    dynamic processes supported
    one-sided communication for shared memory operations (put/get) and remote accumulate operations
    extended collective operations (non-blocking supported)
    parallel I/O
    external interfaces such as for debuggers and profilers
Erlang
Developed at Ericsson in the late 1980s as a platform for developing soft real-time software for managing phone switches
They needed a high-level symbolic language to achieve a productivity gain, which
    contains primitives for concurrency
    supports error recovery
    has an execution model without back-tracking
    has a granularity of concurrency such that one asynchronous telephony process is represented by one process in the language
Sequential Erlang
Concurrent Erlang
Abstracting Protocols
Standard Behaviours
Erlang's abstraction of a protocol pattern is called a behaviour.
large Erlang applications make heavy use of behaviours
direct use of message sending or receiving is uncommon
Erlang's OTP standard library provides three main behaviours
    generic server: the most common behaviour, where responses can be delayed or delegated, calls have optional timeouts, etc.
    generic finite state machine
    generic event handler: an event manager receives events as incoming messages and dispatches them to an arbitrary number of event handlers, each with its own module of callbacks and state
behaviour libraries provide functionality for dynamic debugging, inspecting state, producing traces of messages, and statistics
Worker Processes
many applications need to create concurrent activities on the fly
suppose a client needs to send calls to multiple servers; hacking OTP's generic server is a pain
using worker processes, clients can use receive expressions without worrying about being blocked
Some notes on Erlang
All errors in concurrent programming have their equivalents in Erlang: races, deadlock, livelock, starvation, etc.
Erlang is a safe language: all run-time faults result in clearly defined behavior, usually an exception
Erlang provides two primitives for one process to notice the failure of another
    monitoring another process creates a one-way notification of failure
    linking two processes establishes mutual notification
when a fault notification is delivered to a linked process, it causes that process to fail as well; but it can be configured to be sent as a message receivable by a receive statement
robust server deployments include an external nanny that monitors the running operating system process and restarts it if it fails. In Erlang this is done with the supervisor behaviour.
Parallel and Concurrent Programming in Haskell
purity (inherent parallelism), laziness (no specific order), and types (faster parallelism) mean we can find more parallelism in the code
Haskell distinguishes between:
    Parallelism: exploit parallel computing hardware to improve performance for a single task
    Concurrency: logically independent tasks as a structuring technique
different approaches are available in Haskell
    sparks and parallel strategies
    threads, messages, and shared memory
    transactional memory
    data parallelism
The GHC Runtime
GHC runtime supports millions of lightweight threads
they are multiplexed onto real OS threads (approximately one for each CPU)
automatic thread migration and load balancing (work-stealing)
parallel garbage collector in 6.12
runtime settings
    Compile with
        -threaded -O2
    Run with
        +RTS -N2
        +RTS -N4
        ...
        +RTS -N64
        ...
Semi-Explicit Parallelism with Sparks
lack of side effects makes parallelism easy
    f x y = (x * y) + (y ^ 2)
almost everything could be done in parallel: too much parallelism
the idea is to let the user annotate the code for potential parallelism
it is a deterministic approach
par :: a -> b -> b
    a `par` b creates a spark for a
    the runtime sees a potential to convert the spark into a thread
    is semantically equal to b
    no restrictions in usage
pseq :: a -> b -> b
    a `pseq` b evaluates a in the current thread
    ensures work is run in the right thread
    is semantically equal to b
Putting it Together
    f `par` e `pseq` f + e
one spark created for f
f spark converted to a thread and executed
e evaluated in the current thread in parallel with f
ThreadScope helps in thinking about and evaluating spark code
Explicit Parallelism with Threads and Shared Memory
For stateful or imperative programs, we need explicit threads, not speculative sparks.
forkIO :: IO () -> IO ThreadId
Takes a block of code to run, and executes it in a new Haskell thread
    import Control.Concurrent
    import System.Directory

    main = do
        forkIO (writeFile "xyz" "thread was here")
        v <- doesFileExist "xyz"
        print v
Non-Determinism!
    threads are scheduled preemptively
    non-deterministic scheduling: random interleaving
    threads may be preempted when they allocate memory
    communicate via messages or shared memory
Shared Memory Communication: MVars and Chans
MVars are boxes. They are either full or empty
    put on a full MVar causes the thread to sleep until the MVar is empty
    take on an empty MVar blocks until it is full
The runtime will wake you up when you're needed
do box
An optimistic model:
    transactions run inside atomic blocks assuming no conflicts
    system checks consistency at the end of the transaction
    retry if conflicts
    requires control of side effects (handled in the type system)
each atomic block appears to work in complete isolation
    data STM a
    atomically :: STM a -> IO a
    retry :: STM a
    orElse :: STM a -> STM a -> STM a

    data TVar a
    newTVar :: a -> STM (TVar a)
    readTVar :: TVar a -> STM a
    writeTVar :: TVar a -> a -> STM ()
STM a is used to build up atomic blocks
transaction code can only run inside atomic blocks
orElse lets us compose atomic blocks into larger pieces
TVars are the variables the runtime watches for contention
Atomic Bank Transfers
transfer :: TVar Int -> TVar Int -> Int -> IO ()
transfer from to amount =atomically $ do
balance
Simple idea: do the same thing in parallel to every element of a large collection
If a program can be expressed this way, then:
    no explicit threads or communication (simplicity)
    clear cost model (unlike `par`)
    good locality, easy partitioning
Adds parallel array syntax:
    [: e :]
along with many parallel combinators (mapP, filterP, zipP, ...)
Flat vs. Nested Data Parallelism
Flat data parallelism
    sumsq :: [: Float :] -> Float
    sumsq a = sumP [: x*x | x <- a :]

    dotp :: [:Float:] -> [:Float:] -> Float
    dotp v w = sumP (zipWithP (*) v w)
break array into N chunks (for N cores)
run a sequential loop to apply f to each chunk element
run that loop on each core
combine the results
Nested data parallelism
    type Vector = [: Float :]
    type Matrix = [: Vector :]

    matMul :: Matrix -> Vector -> Vector
    matMul m v = [: vecMul r v | r <- m :]
each element of a parallel computation may in turn be a nested parallel computation
GHC implements a vectorizer
    flattens nested data, changing representations, automatically
Outline
1 Parallel Programming
2 Real-Time Programming
    RTOS
    Introduction to Real-Time Java
    Ada
Real-Time Computing
Definition
Real-time computing (RTC) is the study of hardware and software systems that are subject to a real-time constraint, i.e., operational deadlines from event to system response.
Often addressed in the context of real-time operating systems and synchronous programming languages
Definition
A system is said to be real-time if the total correctness of an operation depends not only upon its logical correctness, but also upon the time in which it is performed.
in a hard real-time system, the completion of an operation after its deadline is useless
a soft real-time system, on the other hand, will tolerate such lateness
Real-Time Operating System (RTOS)
real-time operating systems offer programmers more control over process priorities
the variability in the amount of time it takes to accept and complete an application's task is called jitter; it should be near zero for hard real-time systems
two approaches:
Event-driven: switches tasks only when an event of higher priority needs service (called priority scheduling)
Time-sharing: switches tasks on a regular clock interrupt and on events (called round robin)
some algorithms used in RTOS scheduling are:
cooperative multitasking
preemptive scheduling
earliest deadline first (EDF)
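As an illustration of earliest deadline first, the following Python sketch simulates preemptive EDF on a single processor in unit time steps. It is a teaching model under stated assumptions, not an RTOS implementation: the function name edf_schedule and the job tuple layout are hypothetical, times are integers, and every job has a positive execution time.

```python
import heapq


def edf_schedule(jobs):
    """Simulate preemptive EDF on one processor.

    jobs: list of (name, release_time, execution_time, absolute_deadline).
    Returns (timeline, missed): the job run in each time slot (None when
    idle) and the names of jobs that completed after their deadline.
    """
    pending = sorted(jobs, key=lambda j: j[1])  # ordered by release time
    ready = []                                  # heap of [deadline, name, remaining]
    timeline, missed, t, i = [], [], 0, 0
    while i < len(pending) or ready:
        # Admit every job released at or before the current time.
        while i < len(pending) and pending[i][1] <= t:
            name, _, exec_time, deadline = pending[i]
            heapq.heappush(ready, [deadline, name, exec_time])
            i += 1
        if not ready:
            timeline.append(None)
        else:
            job = ready[0]           # the earliest deadline runs this slot
            job[2] -= 1
            timeline.append(job[1])
            if job[2] <= 0:
                heapq.heappop(ready)
                if t + 1 > job[0]:   # finished at time t + 1
                    missed.append(job[1])
        t += 1
    return timeline, missed
```

With jobs A (released at 0, 2 units of work, deadline 5) and B (released at 1, 1 unit, deadline 3), B preempts A at time 1 because its deadline is earlier, and A resumes afterwards.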
Real-Time Operating System (RTOS) contd.
interprocess communication should be enabled among tasks
shared memory using locks and semaphores (danger of priority inversion, deadlock!)
message passing using queues (danger of priority inversion!)
memory allocation speed is important; allocation usually needs to take a fixed amount of time
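One common way to get fixed-time allocation is a pool of equally sized blocks reserved up front. The Python sketch below shows the idea; the class FixedBlockPool and its methods are hypothetical, and a real RTOS would typically use an intrusive free list in C, whereas Python's list gives only amortised constant-time push and pop.

```python
class FixedBlockPool:
    """Pool of equally sized blocks with search-free allocate and release."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        self.memory = bytearray(block_size * block_count)  # reserved up front
        self.free = list(range(block_count))               # stack of free block indices

    def allocate(self):
        # Pop from the free stack: no search, so the cost does not depend
        # on how fragmented or full the pool is.
        if not self.free:
            raise MemoryError("pool exhausted")
        return self.free.pop()

    def release(self, index):
        self.free.append(index)

    def view(self, index):
        # Writable window onto the bytes of one block.
        start = index * self.block_size
        return memoryview(self.memory)[start:start + self.block_size]
```

The trade-off is internal fragmentation: every request consumes a whole block, in exchange for allocation time that does not vary with the state of the heap.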
Real-Time Java
Enabling real-time programming in Java requires addressing:
the behavior of the garbage collector, which may introduce unpredictable delays
the lack of a strict priority-based threading model
without strict priorities there is no way to apply priority-inversion avoidance protocols
high-resolution timing management
As a result, the Java community defined a Real-Time Specification for Java (RTSJ)
JVM enhancements and a new API set
intended only for a suitable underlying OS (e.g., QNX)
existing J2SE applications can still run under Java RTS
is already being used by the U.S. Navy, Boeing, and others
Real-Time additions to Java
Direct memory access: similar to J2ME, more secure than in C
Asynchronous communication: comes in two forms
Asynchronous event handling: can schedule responses to events coming from outside the JVM
Asynchronous transfer of control: a controlled way of safely interrupting another thread
High-resolution timing
Memory areas that help prevent unpredictable delays caused by GC
Immortal memory: no GC; freed at the end of the program
Scoped memory: used only while a process works within a particular section of the program (e.g., a method)
Real-time threads: cannot be interrupted by GC; 28 levels of strictly enforced priorities
Real-time threads are synchronized (no priority inversion)
No-heap real-time threads: may immediately preempt any GC; no reference to or allocation in the heap is allowed
Ada
Ada is an imperative, strongly typed, block-structured language designed for the US DoD (1983) for the construction of large, complex, mission-critical software.
the language is designed to avoid expensive implicit storage manipulation operations; heap storage allocation is explicit
the definition of Ada includes a precise description of compilation issues and of the interaction between applications and libraries (not left as a part of the environment/OS)
Ada 95 provided a concurrent programming environment for real-time systems using fixed-priority, preemptive scheduling
Ada 2005 includes new dispatching policies (e.g., non-preemptive, round-robin, and EDF), timing events, the Ravenscar profile, and more
Ada Syntax
Ada is strongly typed
subtypes can be used to set constraints on types
scalar types include integer, enumerated, floating-point, and fixed-point
composite types are arrays and records
access types are strongly typed pointers in Ada
expressions are based on standard arithmetic and boolean operations
Ada's statements are assignments, case statements, loops, exit statements, blocks, and gotos
subprograms have three modes for parameter passing, designated in, in out, and out; subprograms can be overloaded
packages separate the definition of interfaces from their implementation, support abstract data types and information hiding, and are the basic structuring mechanism for large systems
Ada Tasks
Ada tasks are threads which describe concurrent computations
the task specification provides the public interface in the form of task entries
an entry names an action that a task will perform on behalf of the caller; entries also act as a synchronization mechanism (rendezvous)
communication is asymmetric; the caller names the server explicitly in the call, while the server accepts calls freely
task mailbox is
   entry put(m: message);
   entry get(m: out message);
end mailbox;
accept put(m: message) do
   buffer_store(m);
end;
a task can request that a rendezvous take place immediately, or not at all
delay statements can be used to program timed entry calls
selective wait can be used to accept multiple entries depending on the internal state of the server; put a time limit on the arrival of a call; or shut down in case no callers are active
Ada Concurrency Model
Ada has a core and several annexes
The Core is required by all implementations and contains the definition of all language constructs
The Annexes define additional facilities in the form of packages and pragmas but never new syntax
The definition of the concurrency model is included in the core in the form of:
Tasks: represent threads of control
Protected Objects: provide mutual exclusion and condition synchronization; they are passive and do not have a separate thread of control
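The behaviour of a protected object (mutual exclusion plus condition synchronization, with no thread of its own) resembles a monitor. The Python sketch below is an analogy under that assumption, not Ada semantics: the class BoundedBuffer is a hypothetical name, the lock plays the role of the protected object's exclusion, and the while loops play the role of entry barriers.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Passive shared object: callers' threads execute put and get."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.cond = threading.Condition()  # one lock shared by all operations

    def put(self, item):
        with self.cond:                                # mutual exclusion
            while len(self.items) >= self.capacity:    # barrier: buffer not full
                self.cond.wait()
            self.items.append(item)
            self.cond.notify_all()                     # barriers are re-evaluated

    def get(self):
        with self.cond:
            while not self.items:                      # barrier: buffer not empty
                self.cond.wait()
            item = self.items.popleft()
            self.cond.notify_all()
            return item
```

A producer thread calling put and a consumer thread calling get never touch the deque at the same time, and each blocks only while its barrier condition is false.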
A Generic Bounded Buffer
A Typical Producer/Consumer
Ada 95 real-time foundation
The core does not define a notion of priority, nor of priority-based queuing or scheduling. Thus, the Real-Time Systems Annex defines additional semantics and facilities that support deterministic tasking via fixed-priority, preemptive scheduling:
integrated priority-based interrupt handling
run-time library behavior
priority inheritance and the immediate ceiling priority protocol (ICPP) are included to limit blocking
a high-resolution monotonic clock providing both absolute and relative delays
These facilities make off-line schedulability analysis possible
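One standard form of off-line schedulability analysis for fixed-priority, preemptive scheduling is response-time analysis, which iterates R = C_i + sum over higher-priority tasks j of ceil(R / T_j) * C_j until R stops changing. The Python sketch below assumes independent periodic tasks on one processor, integer times, deadlines equal to periods, and no blocking; the function names are hypothetical and the slides do not prescribe this particular test.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def response_times(tasks):
    """tasks: list of (C, T) pairs, highest priority first.

    C is the worst-case execution time and T the period (also the deadline).
    Returns the worst-case response time of each task, or None for a task
    whose response time would exceed its period.
    """
    results = []
    for i, (c_i, t_i) in enumerate(tasks):
        higher = tasks[:i]
        r = c_i
        while True:
            # Own execution plus preemption by every higher-priority release.
            r_next = c_i + sum(ceil_div(r, t_j) * c_j for c_j, t_j in higher)
            if r_next > t_i:
                results.append(None)
                break
            if r_next == r:
                results.append(r)
                break
            r = r_next
    return results


def schedulable(tasks):
    return all(r is not None for r in response_times(tasks))
```

For the task set (C, T) = (1, 4), (2, 6), (3, 12), the iteration for the lowest-priority task goes 3, 6, 9, 10, 10, so its worst-case response time is 10, which is within its period of 12.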
Priorities
Priorities are assigned to tasks using a pragma directive
assigning priority to tasks
task Producer is
   pragma Priority (10);
end Producer;
task Consumer is
   pragma Priority (10);
end Consumer;
configuring behavior of the run-time library
pragma Task_Dispatching_Policy (FIFO_Within_Priorities);
pragma Locking_Policy (Ceiling_Locking);
pragma Queuing_Policy (Priority_Queuing);
priorities for protected objects would be assigned in accordance with the ceiling priority protocol
low-level tasking control and synchronization (semaphore-like objects, asynchronous resuming/suspension of tasks, etc.) are available for extreme needs
2005 real-time enhancements
multiple inheritance support added
new dispatching policies (specified with the pragma Task_Dispatching_Policy)
assigning priority to tasks
Non_Preemptive_FIFO_Within_Priorities
Round_Robin_Within_Priorities
EDF_Across_Priorities
combining dispatching policies based on priority bands
pragma Priority_Specific_Dispatching (FIFO_Within_Priorities, 9, 20);
pragma Priority_Specific_Dispatching (Round_Robin_Within_Priorities, 1, 8);
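The effect of the two pragmas above can be pictured as a lookup from a task's priority to the policy that governs its band. The Python sketch below is only a model of that mapping: BANDS, policy_for, and pick_next are hypothetical names, and the sketch returns None for priorities outside every band rather than modelling what the language defines for that case.

```python
# (policy, first priority, last priority), mirroring the two pragmas above.
BANDS = [
    ("FIFO_Within_Priorities", 9, 20),
    ("Round_Robin_Within_Priorities", 1, 8),
]


def policy_for(priority, bands=BANDS):
    """Return the dispatching policy whose band contains the priority."""
    for policy, low, high in bands:
        if low <= priority <= high:
            return policy
    return None


def pick_next(ready):
    """ready: dict mapping priority -> list of task names in queue order.

    Returns (task, policy) for the head of the highest non-empty priority
    queue, or None when nothing is ready.
    """
    nonempty = [p for p, queue in ready.items() if queue]
    if not nonempty:
        return None
    top = max(nonempty)
    return ready[top][0], policy_for(top)
```

Tasks in the upper band always win over tasks in the lower band; the band's policy only decides how tasks of equal priority within it share the processor.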
2005 real-time enhancements (contd.)
timing events are conceptually lightweight interrupts generated by the arrival of points in time
execution-time monitoring control is used to monitor the execution time (CPU time) of tasks
execution time events are similar in concept and interface to timing events except that they use execution time instead of wall-clock time
facilities to allocate and monitor a budgeted execution time for a group of tasks as a whole
The Ravenscar Profile
an analyzable subset of Ada tasking, suitable for hard real-time and high-integrity applications
the tasking model is suitable for one processor using fixed-priority, preemptive dispatching
there is a fixed number of tasks, which never terminate
there are two kinds of tasks: time-triggered (periodic) and event-triggered (sporadic)
tasks do not communicate directly (via rendezvous) and do not interact with the control flow of other tasks
communication is done indirectly using shared variables encapsulated within protected objects