MPI Introduction

MPI

Rohit Banga

Prakher Anand

K Swagat

Manoj Gupta

Advanced Computer Architecture

Spring, 2010

ORGANIZATION

• Basics of MPI
• Point-to-Point Communication
• Collective Communication
• Demo

GOALS

• Explain basics of MPI
• Start coding today!
• Keep It Short and Simple

MESSAGE PASSING INTERFACE

• A message-passing library specification
• Extended message-passing model
• Not a language or compiler specification
• Not a specific implementation – several implementations exist (as with pthreads)
• A standard for distributed-memory, message-passing, parallel computing
• Distributed memory – a shared-nothing approach!
• Needs some interconnection technology – TCP, InfiniBand (on our cluster)

GOALS OF MPI SPECIFICATION

• Provide source-code portability
• Allow efficient implementations
• Flexible enough to port different algorithms to different hardware environments
• Support for heterogeneous architectures – processors need not be identical

REASONS FOR USING MPI

• Standardization – supported on virtually all HPC platforms
• Portability – the same code runs on another platform
• Performance – vendor implementations should exploit native hardware features
• Functionality – 115 routines
• Availability – a variety of implementations available

BASIC MODEL

Communicators and Groups

Group
• An ordered set of processes
• Each process is associated with a unique integer rank
• Ranks run from 0 to (N-1) for N processes
• An object in system memory, accessed by a handle
• MPI_GROUP_EMPTY (a valid handle to an empty group), MPI_GROUP_NULL (the value used for invalid group handles)

BASIC MODEL (CONTD.)

Communicator
• A group of processes that may communicate with each other
• Every MPI message must specify a communicator
• An object in memory, accessed through a handle
• There is a default communicator, defined automatically: MPI_COMM_WORLD, which identifies the group of all processes

COMMUNICATORS

• Intra-communicator – all processes from the same group
• Inter-communicator – processes drawn from two different groups

COMMUNICATOR AND GROUPS

• For the programmer, a group and a communicator are one
• Allow you to organize tasks, based upon function, into task groups
• Enable collective communication (later) operations across a subset of related tasks
• Provide safe communication
• Many communicators can exist at the same time
• Dynamic – can be created and destroyed at run time
• A process may be in more than one group/communicator – it has a unique rank in every group/communicator it belongs to (see the sketch below)
• Used for implementing user-defined virtual topologies
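The slides do not show the call, but one common way to create such sub-communicators is MPI_Comm_split. A minimal sketch (the even/odd split and names such as sub_comm are illustrative assumptions, not part of the slides):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Split MPI_COMM_WORLD into two groups: even ranks and odd ranks.
       The second argument ("color") selects the group, the third ("key")
       orders the ranks inside the new communicator. */
    int color = world_rank % 2;
    MPI_Comm sub_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &sub_comm);

    int sub_rank;
    MPI_Comm_rank(sub_comm, &sub_rank);   /* unique rank in the new communicator */

    printf("world rank %d -> color %d, sub rank %d\n", world_rank, color, sub_rank);

    MPI_Comm_free(&sub_comm);
    MPI_Finalize();
    return 0;
}

Run with, say, mpirun -np 4: each process keeps its MPI_COMM_WORLD rank and also gets a new rank inside its even or odd sub-communicator.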

VIRTUAL TOPOLOGIES

• Attach graph (or Cartesian) topology information to an existing communicator
• Example – a 2x2 grid:
  coord (0,0): rank 0
  coord (0,1): rank 1
  coord (1,0): rank 2
  coord (1,1): rank 3
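A minimal sketch of how the 2x2 grid above might be attached to MPI_COMM_WORLD with MPI_Cart_create, assuming the job is started with 4 processes (names such as grid_comm are illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Describe a 2x2 grid: 2 rows, 2 columns, no wrap-around. */
    int dims[2]    = {2, 2};
    int periods[2] = {0, 0};

    /* New communicator carrying the Cartesian topology.
       reorder = 1 allows MPI to renumber ranks to match the hardware. */
    MPI_Comm grid_comm;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &grid_comm);

    if (grid_comm != MPI_COMM_NULL) {   /* processes outside the grid get MPI_COMM_NULL */
        int grid_rank, coords[2];
        MPI_Comm_rank(grid_comm, &grid_rank);
        MPI_Cart_coords(grid_comm, grid_rank, 2, coords);
        printf("rank %d has grid coordinates (%d,%d)\n", grid_rank, coords[0], coords[1]);
        MPI_Comm_free(&grid_comm);
    }

    MPI_Finalize();
    return 0;
}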

SEMANTICS

• Header file: #include <mpi.h> (C), include 'mpif.h' (Fortran); bindings also exist for Java, Python, etc.
• Format: rc = MPI_Xxxxx(parameter, ... )
• Example: rc = MPI_Bsend(&buf, count, type, dest, tag, comm)
• Error code: returned as "rc"; equals MPI_SUCCESS if the call was successful

MPI PROGRAM STRUCTURE

[Figure: general structure of an MPI program]

MPI FUNCTIONS – MINIMAL SUBSET

• MPI_Init – initialize MPI
• MPI_Comm_size – size of the group associated with the communicator
• MPI_Comm_rank – identify the process
• MPI_Send
• MPI_Recv
• MPI_Finalize

We will discuss the simple ones first.

CLASSIFICATION OF MPI ROUTINES

• Environment Management – MPI_Init, MPI_Finalize
• Point-to-Point Communication – MPI_Send, MPI_Recv
• Collective Communication – MPI_Reduce, MPI_Bcast
• Information on the Processes – MPI_Comm_rank, MPI_Get_processor_name

MPI_INIT

• All MPI programs call this before using other MPI functions:
  int MPI_Init(int *pargc, char ***pargv);
• Must be called in every MPI program
• Must be called only once, and before any other MPI functions are called
• Passes the command line arguments to all processes

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    …
}

MPI_COMM_SIZE

• Number of processes in the group associated with a communicator:
  int MPI_Comm_size(MPI_Comm comm, int *psize);
• Find out the number of processes being used by your application

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int p;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    …
}

MPI_COMM_RANK

• Rank of the calling process within the communicator:
  int MPI_Comm_rank(MPI_Comm comm, int *rank);
• A unique rank between 0 and (p-1); can be thought of as a task ID
• A process has a unique rank in each communicator it belongs to
• Used to identify the work for each processor

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int p;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    …
}

MPI_FINALIZE

• Terminates the MPI execution environment:
  int MPI_Finalize(void);
• Last MPI routine to be called in any MPI program

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int p;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("no. of processors: %d\n rank: %d", p, rank);
    MPI_Finalize();
    return 0;
}

HOW TO COMPILE THIS

• Open MPI is the implementation on our cluster:
  mpicc -o test_1 test_1.c
• Like gcc – mpicc is not a special compiler:
  $ mpicc
  gcc: no input files
• MPI is implemented just as any other library
• mpicc is just a wrapper around gcc that adds the required command line parameters

HOW TO RUN THIS

• mpirun -np X test_1
• Will run X copies of the program in your current run-time environment
• The -np option specifies the number of copies of the program

MPIRUN

• Only the rank 0 process can receive standard input; mpirun redirects the standard input of all others to /dev/null
• Open MPI redirects the standard input of mpirun to the standard input of the rank 0 process
• The node that invoked mpirun need not be the same as the node running the MPI_COMM_WORLD rank 0 process
• mpirun directs the standard output and error of remote nodes to the node that invoked mpirun
• SIGTERM, SIGKILL: kill all processes in the communicator
• SIGUSR1, SIGUSR2: propagated to all processes
• All other signals are ignored

A NOTE ON IMPLEMENTATION

• I want to implement my own version of MPI
• Evidence:

[Diagram: MPI_Init → MPI thread, shown for two processes]

SOME MORE FUNCTIONS

• int MPI_Initialized(int *flag)
  Checks whether MPI_Init has been called. Why? Because MPI_Init must be called exactly once.
• double MPI_Wtime()
  Returns the elapsed wall-clock time in seconds (double precision) on the calling processor
• double MPI_Wtick()
  Returns the resolution in seconds (double precision) of MPI_Wtime()

Message-passing functionality – that is what MPI is meant for!
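A minimal timing sketch using MPI_Wtime and MPI_Wtick (the sleep call just stands in for real work):

#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double t0 = MPI_Wtime();          /* wall-clock time in seconds */
    sleep(1);                         /* stand-in for the work being timed */
    double elapsed = MPI_Wtime() - t0;

    printf("elapsed: %f s (timer resolution: %g s)\n", elapsed, MPI_Wtick());

    MPI_Finalize();
    return 0;
}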


POINT TO POINT COMMUNICATION

POINT-TO-POINT COMMUNICATION

• Communication between two, and only two, processes
• One sending and one receiving
• Types:
  • Synchronous send
  • Blocking send / blocking receive
  • Non-blocking send / non-blocking receive
  • Buffered send
  • Combined send/receive
  • "Ready" send

POINT-TO-POINT COMMUNICATION

• Processes can be collected into groups
• Each message is sent in a context, and must be received in the same context
• A group and a context together form a communicator
• A process is identified by its rank in the group associated with a communicator
• Messages are sent with an accompanying user-defined integer tag, to assist the receiving process in identifying the message
• MPI_ANY_TAG matches any tag on a receive

POINT-TO-POINT COMMUNICATION

• How is "data" described?
• How are processes identified?
• How does the receiver recognize messages?
• What does it mean for these operations to complete?

BLOCKING SEND/RECEIVE

int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm communicator)

• buf: pointer to the data to send
• count: number of elements in the buffer
• datatype: the kind of data in the buffer
• dest: rank of the receiver
• tag: the label of the message
• communicator: set of processes involved (e.g. MPI_COMM_WORLD)

BLOCKING SEND/RECEIVE (CONTD.)

[Figure: data travels from the application buffer of Process 1 (Processor 1), possibly via system buffers, to the application buffer of Process 2 (Processor 2)]

A WORD ABOUT SPECIFICATION

The user does not know whether the MPI implementation:
• copies the buffer into an internal buffer, starts the communication, and returns control before all the data has been transferred (buffering),
• creates links between processors, sends the data, and returns control when all the data has been sent (but NOT received), or
• uses a combination of the above methods.

BLOCKING SEND/RECEIVE (CONTD.)

• A blocking send "returns" after it is safe to modify the application buffer
• Safe means that modifications will not affect the data intended for the receive task; it does not imply that the data was actually received
• A blocking send can be synchronous, which means there is handshaking with the receive task to confirm a safe send
• A blocking send can be asynchronous if a system buffer is used to hold the data for eventual delivery to the receiver
• A blocking receive only "returns" after the data has arrived and is ready for use by the program
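Putting the pieces together, a minimal blocking exchange between rank 0 and rank 1 might look like this (run with at least 2 processes; the buffer and tag values are illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int tag = 0;
    if (rank == 0) {
        double msg = 3.14;
        /* Blocking send: returns once msg may safely be reused. */
        MPI_Send(&msg, 1, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double msg;
        MPI_Status status;
        /* Blocking receive: returns only after the data has arrived. */
        MPI_Recv(&msg, 1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD, &status);
        printf("rank 1 received %f from rank %d\n", msg, status.MPI_SOURCE);
    }

    MPI_Finalize();
    return 0;
}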

NON-BLOCKING SEND/RECEIVE

• Return almost immediately
• Simply "request" that the MPI library perform the operation when it is able
• You cannot predict when that will happen
• Request a send/receive and start doing other work!
• It is unsafe to modify the application buffer (your variable space) until you know that the non-blocking operation has completed

MPI_Isend(&buf, count, datatype, dest, tag, comm, &request)
MPI_Irecv(&buf, count, datatype, source, tag, comm, &request)

NON-BLOCKING SEND/RECEIVE (CONTD.)

[Figure: same data path as in the blocking case – application buffers and system buffers on both processors – but the calls return before the transfer completes]

NON-BLOCKING SEND/RECEIVE (CONTD.)

To check whether the send/receive operations have completed:

int MPI_Irecv(void *buf, int count, MPI_Datatype type, int source, int tag, MPI_Comm comm, MPI_Request *req);

int MPI_Wait(MPI_Request *req, MPI_Status *status);

• A call to MPI_Wait causes the code to wait until the communication pointed to by req is complete
• req is an input/output argument: the identifier associated with a communication event (initiated by MPI_ISEND or MPI_IRECV)

NON-BLOCKING SEND/RECEIVE (CONTD.)

int MPI_Test(MPI_Request *req, int *flag, MPI_Status *status);

• A call to MPI_Test sets flag to true if the communication pointed to by req is complete, and to false otherwise
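A minimal non-blocking sketch combining MPI_Isend, MPI_Irecv, MPI_Wait and MPI_Test (run with 2 processes; the polling loop only marks where computation could overlap communication):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int tag = 0;
    double buf = 0.0;
    MPI_Request req;

    if (rank == 0) {
        buf = 2.718;
        MPI_Isend(&buf, 1, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD, &req);
        /* ... do useful work here; buf must not be modified yet ... */
        MPI_Wait(&req, MPI_STATUS_IGNORE);      /* now buf is safe to reuse */
    } else if (rank == 1) {
        int done = 0;
        MPI_Irecv(&buf, 1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD, &req);
        while (!done) {
            /* Poll instead of blocking; computation could go here. */
            MPI_Test(&req, &done, MPI_STATUS_IGNORE);
        }
        printf("rank 1 received %f\n", buf);
    }

    MPI_Finalize();
    return 0;
}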

STANDARD MODE

• Returns when the sender is free to access and overwrite the send buffer
• The message might be copied directly into the matching receive buffer, or might be copied into a temporary system buffer
• Message buffering decouples the send and receive operations
• Message buffering can be expensive
• It is up to MPI to decide whether outgoing messages will be buffered
• The standard-mode send is non-local

SYNCHRONOUS MODE

• The send can be started whether or not a matching receive was posted
• The send completes successfully only if a corresponding receive was already posted and has already started to receive the message
• Blocking send & blocking receive in synchronous mode simulate a synchronous communication
• Synchronous send is non-local

BUFFERED MODE

• The send operation can be started whether or not a matching receive has been posted
• It may complete before a matching receive is posted
• The operation is local
• MPI must buffer the outgoing message
• An error occurs if there is insufficient buffer space
• The amount of available buffer space is controlled by the user

BUFFER MANAGEMENT

int MPI_Buffer_attach(void *buffer, int size)
• Provides to MPI a buffer in the user's memory to be used for buffering outgoing messages

int MPI_Buffer_detach(void *buffer_addr, int *size)
• Detaches the buffer currently associated with MPI

MPI_Buffer_attach(malloc(BUFFSIZE), BUFFSIZE);
/* a buffer of BUFFSIZE bytes can now be used by MPI_Bsend */
MPI_Buffer_detach(&buff, &size);
/* Buffer size reduced to zero */
MPI_Buffer_attach(buff, size);
/* Buffer of BUFFSIZE bytes available again */
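A minimal buffered-send sketch, assuming the message is COUNT doubles; MPI_BSEND_OVERHEAD accounts for MPI's bookkeeping inside the attached buffer (run with 2 processes):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define COUNT 100

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double data[COUNT];
        int i;
        for (i = 0; i < COUNT; i++) data[i] = i;

        /* The buffer must hold the message plus MPI's overhead. */
        int bufsize = COUNT * sizeof(double) + MPI_BSEND_OVERHEAD;
        char *buffer = malloc(bufsize);
        MPI_Buffer_attach(buffer, bufsize);

        /* Completes locally: the message is copied into the attached buffer. */
        MPI_Bsend(data, COUNT, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);

        /* Detach blocks until all buffered messages have been delivered. */
        MPI_Buffer_detach(&buffer, &bufsize);
        free(buffer);
    } else if (rank == 1) {
        double data[COUNT];
        MPI_Recv(data, COUNT, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d doubles\n", COUNT);
    }

    MPI_Finalize();
    return 0;
}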

READY MODE

• A send may be started only if the matching receive has already been posted
• The user must be sure of this; if the receive is not already posted, the operation is erroneous and its outcome is undefined
• Completion of the send operation does not depend on the status of a matching receive; it merely indicates that the send buffer can be reused
• A ready-mode send could be replaced by a standard send with no effect on the behavior of the program other than performance
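Because the programmer must guarantee that the receive is posted first, ready mode is usually paired with an explicit handshake. A minimal sketch, assuming 2 processes (the zero-length "ready" notification and the tag numbers are illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double data = 0.0;

    if (rank == 1) {
        MPI_Request req;
        /* Post the receive first ... */
        MPI_Irecv(&data, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
        /* ... then tell the sender it is now safe to use ready mode. */
        MPI_Send(NULL, 0, MPI_INT, 0, 1, MPI_COMM_WORLD);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("rank 1 received %f\n", data);
    } else if (rank == 0) {
        /* Wait for the "receive is posted" notification ... */
        MPI_Recv(NULL, 0, MPI_INT, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        data = 1.0;
        /* ... the matching receive is now guaranteed to exist. */
        MPI_Rsend(&data, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}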

ORDER AND FAIRNESS

• Order:
  • MPI messages are non-overtaking
  • If a receive matches two messages from the same sender, the message sent first is received first
  • If a sent message matches two receive statements, the receive posted first gets it
  • Message-passing code is deterministic, unless the processes are multi-threaded or the wild-card MPI_ANY_SOURCE is used in a receive statement
• Fairness:
  • MPI does not guarantee fairness
  • Example: task 0 sends a message to task 2, but task 1 sends a competing message that matches task 2's receive; only one of the sends will complete

EXAMPLE OF NON-OVERTAKING MESSAGES

CALL MPI_COMM_RANK(comm, rank, ierr)

IF (rank.EQ.0) THEN

CALL MPI_BSEND(buf1, count, MPI_REAL, 1, tag, comm, ierr)

CALL MPI_BSEND(buf2, count, MPI_REAL, 1, tag, comm, ierr)

ELSE ! rank.EQ.1

CALL MPI_RECV(buf1, count, MPI_REAL, 0, MPI_ANY_TAG, comm, status, ierr)

CALL MPI_RECV(buf2, count, MPI_REAL, 0, tag, comm, status, ierr)

END IF

EXAMPLE OF INTERTWINED MESSAGES

CALL MPI_COMM_RANK(comm, rank, ierr)

IF (rank.EQ.0) THEN

CALL MPI_BSEND(buf1, count, MPI_REAL, 1, tag1, comm, ierr)

CALL MPI_SSEND(buf2, count, MPI_REAL, 1, tag2, comm, ierr)

ELSE ! rank.EQ.1

CALL MPI_RECV(buf1, count, MPI_REAL, 0, tag2, comm, status, ierr)

CALL MPI_RECV(buf2, count, MPI_REAL, 0, tag1, comm, status, ierr)

END IF


DEADLOCK EXAMPLE

CALL MPI_COMM_RANK(comm, rank, ierr)

IF (rank.EQ.0) THEN

CALL MPI_RECV(recvbuf, count, MPI_REAL, 1, tag, comm, status, ierr) CALL MPI_SEND(sendbuf, count, MPI_REAL, 1, tag, comm, ierr)

ELSE ! rank.EQ.1

CALL MPI_RECV(recvbuf, count, MPI_REAL, 0, tag, comm, status, ierr) CALL MPI_SEND(sendbuf, count, MPI_REAL, 0, tag, comm, ierr)

END IF
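Both ranks block in MPI_RECV waiting for a message that the other rank has not yet sent, so neither ever reaches MPI_SEND. One standard remedy is the combined MPI_Sendrecv call; a minimal C sketch of such an exchange, assuming exactly 2 processes:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank < 2) {
        int other = (rank == 0) ? 1 : 0;
        double sendbuf = rank, recvbuf;
        int tag = 0;

        /* The send and the receive are posted together, so the library
           can pair them up and neither process blocks forever. */
        MPI_Sendrecv(&sendbuf, 1, MPI_DOUBLE, other, tag,
                     &recvbuf, 1, MPI_DOUBLE, other, tag,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        printf("rank %d received %f\n", rank, recvbuf);
    }

    MPI_Finalize();
    return 0;
}

Reordering the calls so that one rank sends first while the other receives first, or using non-blocking calls, avoids the deadlock as well.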


EXAMPLE OF BUFFERING

CALL MPI_COMM_RANK(comm, rank, ierr)

IF (rank.EQ.0) THEN

CALL MPI_SEND(buf1, count, MPI_REAL, 1, tag, comm, ierr)

CALL MPI_RECV (recvbuf, count, MPI_REAL, 1, tag, comm, status, ierr)

ELSE ! rank.EQ.1

CALL MPI_SEND(sendbuf, count, MPI_REAL, 0, tag, comm, ierr)

CALL MPI_RECV(buf2, count, MPI_REAL, 0, tag, comm, status, ierr)

END IF

Both processes send before they receive, so this exchange completes only if the MPI implementation buffers at least one of the sends; the standard therefore regards such code as unsafe.


COLLECTIVE COMMUNICATIONS

COLLECTIVE ROUTINES

• Collective routines provide a higher-level way to organize a parallel program
• Each process executes the same communication operations
• Communications involve a group of processes in a communicator
• Groups and communicators can be constructed "by hand" or using topology routines
• Tags are not used; different communicators deliver similar functionality
• No non-blocking collective operations
• Three classes of operations: synchronization, data movement, collective computation

COLLECTIVE ROUTINES (CONTD.)

int MPI_Barrier(MPI_Comm comm)
• Stops processes until all processes within the communicator reach the barrier
• Occasionally useful in measuring performance

COLLECTIVE ROUTINES (CONTD.)

int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)

• Broadcast: one-to-all communication – the same data is sent from the root process to all the others in the communicator
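A minimal broadcast sketch (the value of n is illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int n = 0;
    if (rank == 0)
        n = 1024;                 /* only the root has the value initially */

    /* Every process calls MPI_Bcast; after the call all of them hold n. */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d: n = %d\n", rank, n);

    MPI_Finalize();
    return 0;
}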

COLLECTIVE ROUTINES (CONTD.)

Reduction
• The reduction operation allows you to:
  • collect data from each process
  • reduce the data to a single value
  • store the result on the root process (MPI_Reduce) or on all processes (MPI_Allreduce)
• The reduction function also works with arrays
• Other operations: product, min, max, logical and, …
• Internally it is usually implemented with a binary tree

COLLECTIVE ROUTINES (CONTD.)

int MPI_Reduce / MPI_Allreduce(void *snd_buf, void *rcv_buf, int count, MPI_Datatype type, MPI_Op op, int root, MPI_Comm comm)

• snd_buf: input array
• rcv_buf: output array
• count: number of elements in snd_buf and rcv_buf
• type: MPI type of snd_buf and rcv_buf
• op: parallel operation to be performed
• root: MPI rank of the process storing the result (MPI_Reduce only; MPI_Allreduce takes no root argument)
• comm: communicator of the processes involved in the operation
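A minimal reduction sketch: every process contributes one value and rank 0 receives the sum (swap in MPI_Allreduce, which drops the root argument, to leave the result on every process):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    double local = rank + 1.0;    /* each process contributes one value */
    double sum   = 0.0;

    /* Sum the local values; only the root (rank 0) receives the result. */
    MPI_Reduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d processes = %f\n", p, sum);

    MPI_Finalize();
    return 0;
}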

MPI OPERATIONS

MPI_Op        Operator
MPI_MIN       minimum
MPI_SUM       sum
MPI_PROD      product
MPI_MAX       maximum
MPI_LAND      logical and
MPI_BAND      bitwise and
MPI_LOR       logical or
MPI_BOR       bitwise or
MPI_LXOR      logical xor
MPI_BXOR      bitwise xor
MPI_MAXLOC    max value and location
MPI_MINLOC    min value and location

COLLECTIVE ROUTINES (CONTD.)

[Figure: illustration of the collective operations]


Learn by Examples

Parallel Trapezoidal Rule

Output: estimate of the integral from a to b of f(x), using the trapezoidal rule and n trapezoids.

Algorithm:
1. Each process calculates "its" interval of integration.
2. Each process estimates the integral of f(x) over its interval using the trapezoidal rule.
3a. Each process != 0 sends its integral to 0.
3b. Process 0 sums the calculations received from the individual processes and prints the result.

Notes:
1. f(x), a, b, and n are all hardwired.
2. The number of processes (p) should evenly divide the number of trapezoids (n = 1024).

Parallelizing the Trapezoidal Rule

#include <stdio.h>
#include "mpi.h"

main(int argc, char** argv) {
    int    my_rank;      /* My process rank           */
    int    p;            /* The number of processes   */
    double a = 0.0;      /* Left endpoint             */
    double b = 1.0;      /* Right endpoint            */
    int    n = 1024;     /* Number of trapezoids      */
    double h;            /* Trapezoid base length     */
    double local_a;      /* Left endpoint my process  */
    double local_b;      /* Right endpoint my process */
    int    local_n;      /* Number of trapezoids for my calculation */
    double integral;     /* Integral over my interval */
    double total;        /* Total integral            */
    int    source;       /* Process sending integral  */
    int    dest = 0;     /* All messages go to 0      */
    int    tag = 0;
    MPI_Status status;

Continued…

    double Trap(double local_a, double local_b, int local_n, double h);

    /* Calculate local integral */
    MPI_Init(&argc, &argv);
    MPI_Barrier(MPI_COMM_WORLD);
    double elapsed_time = -MPI_Wtime();

    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    h = (b-a)/n;        /* h is the same for all processes */
    local_n = n/p;      /* So is the number of trapezoids  */

    /* Length of each process' interval of integration = local_n*h.
       So my interval starts at: */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);

Continued…

    /* Add up the integrals calculated by each process */
    if (my_rank == 0) {
        total = integral;
        for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_DOUBLE, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        } /* End for */
    } else {
        MPI_Send(&integral, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    elapsed_time += MPI_Wtime();

    /* Print the result */
    if (my_rank == 0) {
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %lf to %lf = %lf\n", a, b, total);
        printf("time taken: %lf\n", elapsed_time);
    }

Continued…

    /* Shut down MPI */
    MPI_Finalize();
} /* main */

double Trap(double local_a, double local_b, int local_n, double h) {
    double integral;      /* Store result in integral   */
    double x;
    int    i;
    double f(double x);   /* function we're integrating */

    integral = (f(local_a) + f(local_b))/2.0;
    x = local_a;
    for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
} /* Trap */

Continued…

double f(double x) {
    double return_val;
    /* Calculate f(x) and store it in return_val. */
    return_val = 4 / (1 + x*x);
    return return_val;
} /* f */

Program 2

Each process other than the root generates a random value less than 1 and sends it to the root. The root sums the values and displays the sum.

#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(int argc, char **argv)
{
    int myrank, p;
    int tag = 0, dest = 0;
    int i;
    double randIn, randOut;
    int source;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    MPI_Comm_size(MPI_COMM_WORLD, &p);

    if (myrank == 0) { /* I am the root */
        double total = 0, average = 0;
        for (source = 1; source < p; source++) {
            MPI_Recv(&randIn, 1, MPI_DOUBLE, source, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            printf("Message from root: From %d received number %f\n",
                   source, randIn);
            total += randIn;
        } /* End for */
        average = total/(p-1);
    } /* End if */

    else { /* I am other than root */
        srand48((long int) myrank);
        randOut = drand48();
        printf("randout=%f, myrank=%d\n", randOut, myrank);
        MPI_Send(&randOut, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);
    } /* End if-else */

    MPI_Finalize();
    return 0;
}

MPI References

The Standard itself:
• http://www.mpi-forum.org – all MPI official releases, in both PostScript and HTML

Books:
• Using MPI: Portable Parallel Programming with the Message-Passing Interface, 2nd Edition, by Gropp, Lusk, and Skjellum, MIT Press, 1999. Also Using MPI-2, with R. Thakur.
• MPI: The Complete Reference, 2 vols., MIT Press, 1999.
• Designing and Building Parallel Programs, by Ian Foster, Addison-Wesley, 1995.
• Parallel Programming with MPI, by Peter Pacheco, Morgan Kaufmann, 1997.

Other information on the Web:
• http://www.mcs.anl.gov/mpi
• Man pages for Open MPI on the web: http://www.open-mpi.org/doc/v1.4/
• apropos mpi


THANK YOU