INTEL CONFIDENTIAL Confronting Race Conditions Introduction to Parallel Programming – Part 6


Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners.


Review & Objectives

Previously:
• Described how to add OpenMP pragmas to programs that have suitable blocks of code or for loops
• Demonstrated how to use private and reduction clauses

At the end of this part you should be able to:
• Give practical examples of ways that threads may contend for shared resources
• Describe what race conditions are and explain how to eliminate them in OpenMP code


Motivating Example

    double area, pi, x;
    int i, n;
    ...
    area = 0.0;
    for (i = 0; i < n; i++) {
        x = (i + 0.5)/n;
        area += 4.0/(1.0 + x*x);
    }
    pi = area / n;

What happens when we make the for loop parallel?



Race Condition

A race condition is nondeterministic behavior caused by the order in which two or more threads access a shared variable

For example, suppose both Thread 1 and Thread 2 are executing the statement

area += 4.0 / (1.0 + x*x);



One Timing: Correct Sum

Value of area    Thread 1          Thread 2
11.667           reads 11.667
11.667           + 3.765
15.432           writes 15.432
15.432                             reads 15.432
15.432                             + 3.563
18.995                             writes 18.995


Another Timing: Incorrect Sum

Value of area    Thread 1          Thread 2
11.667           reads 11.667
11.667           + 3.765           reads 11.667
15.432           writes 15.432
15.432                             + 3.563
15.230                             writes 15.230
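The two timings can be replayed deterministically on a single thread by splitting area += term into its read, add, and write steps. The sketch below is illustrative only: the function names replay_serialized and replay_interleaved are hypothetical, and the step order is an assumed reading of the two timing tables.

```c
/* Thread 1 completes its read-add-write before Thread 2 starts. */
double replay_serialized(double area, double term1, double term2)
{
    double t1 = area;   /* Thread 1 reads area        */
    t1 += term1;        /* Thread 1 adds its term     */
    area = t1;          /* Thread 1 writes the result */

    double t2 = area;   /* Thread 2 reads the updated area */
    t2 += term2;
    area = t2;
    return area;
}

/* Both threads read area before either one writes it back. */
double replay_interleaved(double area, double term1, double term2)
{
    double t1 = area;   /* Thread 1 reads area                  */
    double t2 = area;   /* Thread 2 reads the same, stale value */
    t1 += term1;
    area = t1;          /* Thread 1 writes                      */
    t2 += term2;
    area = t2;          /* Thread 2 overwrites Thread 1's update */
    return area;
}
```

Called with 11.667, 3.765, and 3.563, the first function yields 18.995 and the second yields 15.230: Thread 1's contribution is lost because Thread 2 worked from a value it read too early.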


Another Race Condition Example

    struct Node { int data; struct Node *next; };

    struct List { struct Node *head; };

    void AddHead (struct List *list, struct Node *node)
    {
        node->next = list->head;   /* Stmt. 1 */
        list->head = node;         /* Stmt. 2 */
    }

[Figure: a Node with data and next fields; a List with a head pointer]


Original Singly-Linked List

[Figure: list.head points to node_a]


Thread 1 after Stmt. 1 of AddHead

[Figure: Thread 1 has executed node->next = list->head for node_b; node_b.next points to node_a, and list.head still points to node_a]


Thread 2 Executes AddHead

[Figure: Thread 2 has executed both statements of AddHead for node_c; list.head points to node_c, and both node_b.next and node_c.next point to node_a]


Thread 1 After Stmt. 2 of AddHead

[Figure: Thread 1 has executed list->head = node for node_b; list.head points to node_b, whose next is node_a, so node_c is no longer reachable from the list]
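The interleaving on these slides can be replayed on a single thread by writing out each thread's two statements in the order shown. This is a sketch under assumptions: ListLength and ReplayLostInsert are hypothetical helpers, and the data values are arbitrary.

```c
#include <stddef.h>

struct Node { int data; struct Node *next; };
struct List { struct Node *head; };

/* Count the nodes reachable from list->head. */
int ListLength(const struct List *list)
{
    int count = 0;
    for (const struct Node *p = list->head; p != NULL; p = p->next)
        count++;
    return count;
}

/* Replay the statements of AddHead for Thread 1 (node_b) and
   Thread 2 (node_c) in the order shown on the slides. */
int ReplayLostInsert(void)
{
    struct Node node_a = { 1, NULL };
    struct Node node_b = { 2, NULL };
    struct Node node_c = { 3, NULL };
    struct List list = { &node_a };

    node_b.next = list.head;   /* Thread 1, Stmt. 1: node_b -> node_a */
    node_c.next = list.head;   /* Thread 2, Stmt. 1: node_c -> node_a */
    list.head = &node_c;       /* Thread 2, Stmt. 2: head -> node_c   */
    list.head = &node_b;       /* Thread 1, Stmt. 2: head -> node_b   */

    return ListLength(&list);  /* node_c is no longer reachable */
}
```

ReplayLostInsert() returns 2 rather than 3: the list holds node_b and node_a, and node_c has been dropped even though Thread 2's call completed normally.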


Why Race Conditions Are Nasty

Programs with race conditions exhibit nondeterministic behavior
• Sometimes give correct result
• Sometimes give erroneous result

Programs often work correctly on trivial data sets and small numbers of threads

Errors are more likely to occur as the number of threads or the execution time increases

Hence debugging race conditions can be difficult



How to Avoid Race Conditions

Scope variables to be private to threads
• Use OpenMP private clause
• Variables declared within threaded functions
• Allocate on thread’s stack (pass as parameter)

Control shared access with critical region
• Mutual exclusion and synchronization
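As a sketch of the first approach, the pi loop from the motivating example can be parallelized by scoping x as private and letting the reduction clause (covered in the previous part) give each thread its own partial sum. The function name EstimatePi is hypothetical.

```c
/* Midpoint-rule estimate of pi, following the motivating example. */
double EstimatePi(int n)
{
    double area = 0.0;
    double x;
    int i;

    /* x is per-thread scratch storage, so it is scoped private; the
       loop index i is made private automatically. reduction(+:area)
       gives each thread a private partial sum and combines the
       partial sums when the loop ends. */
#pragma omp parallel for private(x) reduction(+:area)
    for (i = 0; i < n; i++) {
        x = (i + 0.5) / n;
        area += 4.0 / (1.0 + x * x);
    }
    return area / n;
}
```

Without the private(x) clause, x would be shared and the threads would race on it as well as on area. Built without OpenMP support, the pragma is ignored and the loop runs sequentially with the same result.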


Mutual Exclusion

We can prevent the race conditions described earlier by ensuring that only one thread at a time references and updates a shared variable or data structure

Mutual exclusion refers to a kind of synchronization that allows only a single thread or process at a time to have access to a shared resource

Mutual exclusion is implemented using some form of locking



Critical Regions

A critical region is a portion of code that threads execute in a mutually exclusive fashion

The critical pragma in OpenMP immediately precedes a statement or block representing a critical section

Good news: critical regions eliminate race conditions
Bad news: critical regions are executed sequentially
More bad news: you have to identify critical regions yourself
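Because critical regions run sequentially, it pays to enter them rarely. The sketch below applies that idea to the pi loop from the motivating example: each thread accumulates into a private variable and enters the critical region once, instead of once per iteration. The function name EstimatePiCritical and the variable local are hypothetical.

```c
double EstimatePiCritical(int n)
{
    double area = 0.0;   /* shared total */

#pragma omp parallel
    {
        double local = 0.0;   /* declared inside the region: private */
        int i;

#pragma omp for
        for (i = 0; i < n; i++) {
            double x = (i + 0.5) / n;
            local += 4.0 / (1.0 + x * x);
        }

        /* Only one thread at a time may update the shared total. */
#pragma omp critical
        area += local;
    }
    return area / n;
}
```

Placing the critical pragma directly on the area update inside the loop would also be correct, but every iteration would then be serialized at that statement.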



Motivating Example

    void AddHead (struct List *list, struct Node *node)
    {
        node->next = list->head;
        list->head = node;
    }


Motivating Example

    void AddHead (struct List *list, struct Node *node)
    {
        node->next = list->head;
    #pragma omp critical
        list->head = node;
    }

[Figure sequence: Thread 1 and Thread 2 each add their node to list while only the assignment to list->head is inside the critical region]


Protect All References to Shared Data

You must protect both read and write accesses to any shared data

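For the AddHead() function, both lines need to be protected: the first reads list->head and the second writes it, so no other thread may update list->head between them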


Corrected Example

    void AddHead (struct List *list, struct Node *node)
    {
    #pragma omp critical
        {
            node->next = list->head;
            list->head = node;
        }
    }
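One way to exercise the corrected function is to insert many nodes from a parallel loop and then count what is reachable from the head. This is a minimal sketch: the driver BuildAndCount is hypothetical, and it assumes n is greater than zero.

```c
#include <stdlib.h>

struct Node { int data; struct Node *next; };
struct List { struct Node *head; };

void AddHead(struct List *list, struct Node *node)
{
    /* Both the read and the write of list->head are protected. */
#pragma omp critical
    {
        node->next = list->head;
        list->head = node;
    }
}

/* Insert n nodes from a parallel loop, then count the nodes reachable
   from the head. Returns -1 if the allocation fails. */
int BuildAndCount(int n)
{
    struct List list = { NULL };
    struct Node *nodes = malloc((size_t)n * sizeof *nodes);
    int i, count = 0;

    if (nodes == NULL)
        return -1;

#pragma omp parallel for
    for (i = 0; i < n; i++) {
        nodes[i].data = i;
        AddHead(&list, &nodes[i]);
    }

    for (struct Node *p = list.head; p != NULL; p = p->next)
        count++;

    free(nodes);
    return count;
}
```

With the critical region around both statements, the count should equal n on every run. The order of the nodes in the list still depends on thread timing, which is acceptable here because only membership matters.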

[Figure sequence: Thread 1 and Thread 2 each add their node to list; with both statements inside the critical region, one thread completes AddHead before the other begins]


Important: Lock Data, Not Code

Locks should be associated with data objects
Different data objects should have different locks
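An unnamed critical pragma uses one lock for every unnamed critical region in the program, so inserts into unrelated lists would block each other. One possible way to tie the lock to the data is to store an OpenMP lock in each list. The sketch below assumes that design: LockedList, InitList, DestroyList, AddHeadLocked, and SplitAndCount are hypothetical names, and the serial fallback exists only so the sketch builds without OpenMP.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (an assumption of this sketch): no-op locks. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *lock)    { *lock = 0; }
static void omp_set_lock(omp_lock_t *lock)     { (void)lock; }
static void omp_unset_lock(omp_lock_t *lock)   { (void)lock; }
static void omp_destroy_lock(omp_lock_t *lock) { (void)lock; }
#endif

struct Node { int data; struct Node *next; };

/* A list that carries its own lock. */
struct LockedList {
    struct Node *head;
    omp_lock_t lock;
};

void InitList(struct LockedList *list)
{
    list->head = NULL;
    omp_init_lock(&list->lock);
}

void DestroyList(struct LockedList *list)
{
    omp_destroy_lock(&list->lock);
}

void AddHeadLocked(struct LockedList *list, struct Node *node)
{
    omp_set_lock(&list->lock);     /* blocks only users of this list */
    node->next = list->head;
    list->head = node;
    omp_unset_lock(&list->lock);
}

/* Put even-numbered nodes on one list and odd-numbered nodes on
   another, then return the total number of nodes on both lists. */
int SplitAndCount(struct Node *nodes, int n)
{
    struct LockedList even, odd;
    int i, count = 0;

    InitList(&even);
    InitList(&odd);

#pragma omp parallel for
    for (i = 0; i < n; i++) {
        nodes[i].data = i;
        AddHeadLocked((i % 2 == 0) ? &even : &odd, &nodes[i]);
    }

    for (struct Node *p = even.head; p != NULL; p = p->next)
        count++;
    for (struct Node *p = odd.head; p != NULL; p = p->next)
        count++;

    DestroyList(&even);
    DestroyList(&odd);
    return count;
}
```

A thread adding to the even list never waits for a thread adding to the odd list, because the two lists hold different locks.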



OpenMP atomic Construct

Special case of a critical section to ensure atomic update to memory location

Applies only to simple operations:
• pre- or post-increment (++)
• pre- or post-decrement (--)
• assignment with binary operator (of scalar types)

Works on a single statement

    #pragma omp atomic
    counter += 5;
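In context, the atomic update might sit inside a parallel loop as in the sketch below. The function name CountByFives is hypothetical.

```c
/* Add 5 to a shared counter once per iteration. */
long CountByFives(int n)
{
    long counter = 0;
    int i;

#pragma omp parallel for
    for (i = 0; i < n; i++) {
        /* The read-modify-write of counter is performed atomically. */
#pragma omp atomic
        counter += 5;
    }
    return counter;
}
```

For example, CountByFives(1000) returns 5000 regardless of how many threads run the loop. Without the atomic pragma, concurrent updates could be lost in the same way as the updates to area earlier.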



Critical vs. Atomic

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
    #pragma omp critical
        x[index[i]] += WorkOne(i);
        y[i] += WorkTwo(i);
    }

critical protects:
• Call to WorkOne()
• Finding value of index[i]
• Addition of x[index[i]] and results of WorkOne()
• Assignment to x array element

Essentially, updates to elements in the x array are serialized

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
    #pragma omp atomic
        x[index[i]] += WorkOne(i);
        y[i] += WorkTwo(i);
    }

atomic protects:
• Addition and assignment to x array element

Non-conflicting updates will be done in parallel

Protection needed only if there are two threads where the index[i] values match
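A compilable version of the atomic variant might look like the sketch below. WorkOne and WorkTwo here are hypothetical stand-ins for the functions named on the slide, and Update is a hypothetical wrapper around the loop.

```c
/* Stand-ins for the slide's WorkOne() and WorkTwo(). */
static double WorkOne(int i) { (void)i; return 1.0; }
static double WorkTwo(int i) { return 2.0 * i; }

void Update(double *x, double *y, const int *index, int n)
{
    int i;

#pragma omp parallel for
    for (i = 0; i < n; i++) {
        /* WorkOne(i) is evaluated outside the protected update; only
           the addition and assignment to x[index[i]] are atomic. */
#pragma omp atomic
        x[index[i]] += WorkOne(i);

        /* Each iteration owns y[i], so no protection is needed. */
        y[i] += WorkTwo(i);
    }
}
```

If index maps eight iterations alternately onto x[0] and x[1], each element receives four updates. Updates aimed at different elements can proceed in parallel, while updates to the same element cannot be lost.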



