
Page 1: OpenMPI Majdi Baddourah

OpenMPI

Majdi Baddourah

https://computing.llnl.gov/tutorials/mpi/

Page 2

What is OpenMP?

• An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared-memory parallelism
• Comprised of three primary API components:
  – Compiler directives
  – Runtime library routines
  – Environment variables
• Portable:
  – The API is specified for C/C++ and Fortran
• Standardized:
  – Jointly defined and endorsed by a group of major computer hardware and software vendors
• What does OpenMP stand for?
  – Open Multi-Processing

Page 3

C / C++ - General Code Structure

#include <omp.h>

main () {
  int var1, var2, var3;
  /* Serial code */
  ...
  /* Beginning of parallel section. Fork a team of threads.
     Specify variable scoping */
  #pragma omp parallel private(var1, var2) shared(var3)
  {
    /* Parallel section executed by all threads */
    ...
    /* All threads join master thread and disband */
  }
  /* Resume serial code */
  ...
}

Page 4

Fork - Join Model

An OpenMP program begins as a single master thread that executes serially until it reaches a parallel region, where it forks a team of threads; when the team finishes the region, the threads synchronize and join, leaving only the master thread to continue.

Page 5

C / C++ Directives Format

#pragma omp directive-name [clause, ...] newline

#pragma omp
  Required for all OpenMP C/C++ directives.
directive-name
  A valid OpenMP directive. Must appear after the pragma and before any clauses.
[clause, ...]
  Optional. Clauses can be in any order, and repeated as necessary unless otherwise restricted.
newline
  Required. Precedes the structured block which is enclosed by this directive.

Example:

#pragma omp parallel default(shared) private(beta,pi)

Page 6

Parallel Region Example

#include <omp.h>
#include <stdio.h>

main () {
  int nthreads, tid;
  /* Fork a team of threads with each thread having a private tid variable */
  #pragma omp parallel private(tid)
  {
    /* Obtain and print thread id */
    tid = omp_get_thread_num();
    printf("Hello World from thread = %d\n", tid);
    /* Only master thread does this */
    if (tid == 0) {
      nthreads = omp_get_num_threads();
      printf("Number of threads = %d\n", nthreads);
    }
  } /* All threads join master thread and terminate */
}

Page 7

Work-Sharing Construct

DO / for - shares iterations of a loop across the team. Represents a type of "data parallelism".

SECTIONS - breaks work into separate, discrete sections. Each section is executed by a thread. Can be used to implement a type of "functional parallelism".

SINGLE - serializes a section of code so that it is executed by only one thread in the team
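The following slides give examples for DO/for and SECTIONS but none for SINGLE, so here is a minimal sketch (the names init_once and shared_init are illustrative): one thread performs an initialization, and the implicit barrier at the end of the SINGLE block makes the result visible to the whole team.

```c
#include <omp.h>

/* One thread initializes the shared value; the implicit barrier at the
   end of the single construct guarantees all threads see it afterwards. */
int init_once(void)
{
    int shared_init = 0;
    #pragma omp parallel shared(shared_init)
    {
        #pragma omp single
        shared_init = 42;  /* executed by exactly one thread */
        /* every thread reads 42 past this point */
    }
    return shared_init;
}
```

Adding a nowait clause to the single directive would remove the barrier, so the other threads could race past the initialization.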

Page 8

Work-Sharing Constructs DO / for Directive

#pragma omp for [clause ...] newline
  schedule (type [,chunk])
  ordered
  private (list)
  firstprivate (list)
  lastprivate (list)
  shared (list)
  reduction (operator: list)
  collapse (n)
  nowait

  for_loop

Page 9

For Directive Example

#include <omp.h>

#define CHUNKSIZE 100
#define N 1000

main () {
  int i, chunk;
  float a[N], b[N], c[N];
  /* Some initializations */
  for (i=0; i < N; i++)
    a[i] = b[i] = i * 1.0;
  chunk = CHUNKSIZE;
  #pragma omp parallel shared(a,b,c,chunk) private(i)
  {
    #pragma omp for schedule(dynamic,chunk) nowait
    for (i=0; i < N; i++)
      c[i] = a[i] + b[i];
  } /* end of parallel section */
}

Page 10

SECTIONS Directive

•The SECTIONS directive is a non-iterative work-sharing construct. It specifies that the enclosed section(s) of code are to be divided among the threads in the team.

•Independent SECTION directives are nested within a SECTIONS directive. Each SECTION is executed once by a thread in the team. Different sections may be executed by different threads. It is possible for a thread to execute more than one section if it is quick enough and the implementation permits it.

Page 11

SECTIONS Directive

#pragma omp sections [clause ...] newline
  private (list)
  firstprivate (list)
  lastprivate (list)
  reduction (operator: list)
  nowait
{
  #pragma omp section newline
    structured_block
  #pragma omp section newline
    structured_block
}

Page 12

SECTIONS Directive

#include <omp.h>

#define N 1000

main () {
  int i;
  float a[N], b[N], c[N], d[N];
  /* Some initializations */
  for (i=0; i < N; i++) {
    a[i] = i * 1.5;
    b[i] = i + 22.35;
  }
  #pragma omp parallel shared(a,b,c,d) private(i)
  {
    #pragma omp sections nowait
    {
      #pragma omp section
      for (i=0; i < N; i++)
        c[i] = a[i] + b[i];
      #pragma omp section
      for (i=0; i < N; i++)
        d[i] = a[i] * b[i];
    } /* end of sections */
  } /* end of parallel section */
}

Page 13

CRITICAL Construct

#include <omp.h>

main () {
  int x;
  x = 0;
  #pragma omp parallel shared(x)
  {
    #pragma omp critical
    x = x + 1;
  } /* end of parallel section */
}
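For a simple scalar update like the one above, #pragma omp atomic is a lighter-weight alternative to CRITICAL: it protects only the single read-modify-write rather than an arbitrary block of code. A sketch; the function name atomic_count is illustrative:

```c
#include <omp.h>

/* Each thread increments x once; atomic makes the update indivisible,
   so the final value equals the number of threads in the team. */
int atomic_count(void)
{
    int x = 0;
    #pragma omp parallel shared(x)
    {
        #pragma omp atomic
        x = x + 1;
    }
    return x;
}
```

CRITICAL remains the right tool when the protected region is more than one statement or calls other functions.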

Page 14

THREADPRIVATE Directive

• The THREADPRIVATE directive is used to make global file-scope variables (C/C++) or common blocks (Fortran) local and persistent to a thread across the execution of multiple parallel regions.

Page 15

Threadprivate Directive Example

#include <omp.h>
#include <stdio.h>

int a, b, i, tid;
float x;

#pragma omp threadprivate(a, x)

main () {
  /* Explicitly turn off dynamic threads */
  omp_set_dynamic(0);
  printf("1st Parallel Region:\n");
  #pragma omp parallel private(b,tid)
  {
    tid = omp_get_thread_num();
    a = tid;
    b = tid;
    x = 1.1 * tid + 1.0;
    printf("Thread %d: a,b,x= %d %d %f\n",tid,a,b,x);
  } /* end of parallel section */
  printf("************************************\n");
  printf("Master thread doing serial work here\n");
  printf("************************************\n");
  printf("2nd Parallel Region:\n");
  #pragma omp parallel private(tid)
  {
    tid = omp_get_thread_num();
    printf("Thread %d: a,b,x= %d %d %f\n",tid,a,b,x);
  } /* end of parallel section */
}

Page 16

Threadprivate Directive Example - Results

1st Parallel Region:
Thread 0: a,b,x= 0 0 1.000000
Thread 2: a,b,x= 2 2 3.200000
Thread 3: a,b,x= 3 3 4.300000
Thread 1: a,b,x= 1 1 2.100000
************************************
Master thread doing serial work here
************************************
2nd Parallel Region:
Thread 0: a,b,x= 0 0 1.000000
Thread 3: a,b,x= 3 0 4.300000
Thread 1: a,b,x= 1 0 2.100000
Thread 2: a,b,x= 2 0 3.200000

The threadprivate variables a and x keep their per-thread values from the first region into the second. In contrast, b was private in the first region, so each thread wrote to its own temporary copy; the global b was never modified and still holds 0 in the second region.

Page 17

Reduction Clause Example

#include <omp.h>
#include <stdio.h>

main () {
  int i, n, chunk;
  float a[100], b[100], result;
  /* Some initializations */
  n = 100;
  chunk = 10;
  result = 0.0;
  for (i=0; i < n; i++) {
    a[i] = i * 1.0;
    b[i] = i * 2.0;
  }
  #pragma omp parallel for \
    default(shared) private(i) \
    schedule(static,chunk) \
    reduction(+:result)
  for (i=0; i < n; i++)
    result = result + (a[i] * b[i]);
  printf("Final result= %f\n",result);
}

Page 18

Clauses / Directives Summary

Page 19

Environment Variables

• OMP_SCHEDULE
• OMP_NUM_THREADS
  – setenv OMP_NUM_THREADS 5
• OMP_DYNAMIC
  – setenv OMP_DYNAMIC TRUE
• OMP_NESTED
• OMP_STACKSIZE
• OMP_WAIT_POLICY
• OMP_MAX_ACTIVE_LEVELS
• OMP_THREAD_LIMIT

Page 20

Compiler Flags

Compiler    Flag
IBM         -qsmp=omp
Intel       -openmp
PathScale   -mp
PGI         -mp
GNU         -fopenmp

Page 22

/* Count primes in [2, n] by trial division; n is assumed to be set earlier. */
int i;
int j;
int prime;
int total = 0;

# pragma omp parallel \
  shared ( n ) \
  private ( i, j, prime ) \
  reduction ( + : total )
# pragma omp for
for ( i = 2; i <= n; i++ )
{
  prime = 1;
  for ( j = 2; j < i; j++ )
  {
    if ( i % j == 0 )
    {
      prime = 0;
      break;
    }
  }
  total = total + prime;
}
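The fragment above assumes n and a surrounding main; wrapped as a self-contained function (the name count_primes is illustrative) it can be compiled and checked directly:

```c
#include <omp.h>

/* Trial-division prime count over [2, n], parallelized with the same
   clauses as the slide fragment. */
int count_primes(int n)
{
    int i, j, prime;
    int total = 0;
    #pragma omp parallel shared(n) private(i, j, prime) reduction(+:total)
    #pragma omp for
    for (i = 2; i <= n; i++) {
        prime = 1;
        for (j = 2; j < i; j++) {
            if (i % j == 0) { prime = 0; break; }
        }
        total = total + prime;
    }
    return total;
}
```

Because the iterations get more expensive as i grows, adding schedule(dynamic) to the for directive would typically balance the load better than the default static schedule.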

Page 23

total = 0.0;

# pragma omp parallel shared ( a, b, n ) private ( i, x ) reduction ( + : total )
# pragma omp for
for ( i = 0; i < n; i++ )
{
  x = ( ( n - i - 1 ) * a + ( i ) * b ) / ( n - 1 );
  total = total + f ( x );
}
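This last fragment sums f over n evenly spaced sample points in [a, b]; dividing by n gives the average value of f, and multiplying that average by (b - a) estimates the integral. A self-contained version, with an illustrative integrand f(x) = x*x standing in for the f the slide leaves undefined:

```c
#include <omp.h>

/* Hypothetical integrand; the slide does not define f. */
static double f(double x) { return x * x; }

/* Average of f over n evenly spaced points in [a, b], parallelized
   with the same clauses as the slide fragment. */
double sample_average(double a, double b, int n)
{
    double total = 0.0;
    double x;
    int i;
    #pragma omp parallel shared(a, b, n) private(i, x) reduction(+:total)
    #pragma omp for
    for (i = 0; i < n; i++) {
        x = ((n - i - 1) * a + i * b) / (n - 1);
        total = total + f(x);
    }
    return total / n;
}
```

For f(x) = x*x on [0, 1] the average tends to 1/3 as n grows, so (b - a) times the average recovers the familiar integral value.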