2/17/2010

CS 515: Parallel Algorithms

Chandrima Sarkar

Atanu Roy

Agenda

• Architecture
• Parallel Programming Languages
• Precedence Graph
• Elementary Parallel Algorithms
• Sorting
• Matrix Multiplication

Download: http://www.cs.montana.edu/~atanu.roy/Classes/CS515.html

Architecture

Flynn’s Classification
S = single, M = multiple, I = instruction (stream), D = data (stream)

• SISD: single instruction stream, single data stream
• SIMD: single instruction stream, multiple data streams
• MISD: multiple instruction streams, single data stream
• MIMD: multiple instruction streams, multiple data streams

Static Interconnection Network

• Linear array
• Ring
• Ring arranged to use short wires
• Fully connected topology
• Chordal ring
• Multidimensional meshes and torus
• Tree

Tree Cont.

• Fat tree
• Star

Hypercube

[Figure: hypercubes of dimension 0-D through 5-D; the nodes of the 3-D cube are labeled 000 through 111]

Parallel Programming Languages

Languages classified by control mechanism and by communication mechanism (shared memory or message-passing):

• Control driven (von Neumann language extensions)
    Shared memory: Fortran 90/HPF, C++, HEP PL/I, Ada, Concurrent Pascal, Modula-2, MultiLisp (MIMD), Lisp Connection Machine (SIMD)
    Message-passing: CSP, Ada, OCCAM
• Data driven (data-flow languages): VAL, ID, LAU, SISAL
• Pattern driven
    Shared memory: Concurrent Prolog (Shapiro)
    Message-passing: Actors
• Demand driven (reduction language): FP

Dijkstra’s High-Level Language Construct

• Degree of parallelism is static (Algol-68, CSP)

A
parbegin
    C
    begin
        B
        parbegin
            D
            E
        parend
        G
    end
parend
H

Precedence Graph

[Figure: precedence graph for the parbegin/parend construct above]
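The parbegin/parend construct can be imitated with threads. The following is only a sketch, assuming the construct reads: A, then in parallel C and the block (B; D and E in parallel; G), then H. The helper names `stmt`, `parallel`, and `run_precedence_example` are invented for this illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_precedence_example():
    """Run statements A..H under the assumed parbegin/parend structure."""
    log = []
    lock = threading.Lock()

    def stmt(name):
        # Stand-in for a real statement: record when it ran.
        with lock:
            log.append(name)

    def parallel(*branches):
        # parbegin ... parend: start every branch, then wait for all of them.
        with ThreadPoolExecutor(max_workers=len(branches)) as pool:
            futures = [pool.submit(branch) for branch in branches]
            for future in futures:
                future.result()

    def inner_block():
        # begin B; parbegin D; E parend; G end
        stmt("B")
        parallel(lambda: stmt("D"), lambda: stmt("E"))
        stmt("G")

    stmt("A")
    parallel(lambda: stmt("C"), inner_block)
    stmt("H")
    return log


if __name__ == "__main__":
    print(run_precedence_example())
```

The order of C relative to B, D, E, and G may differ from run to run; only the precedence constraints (A first, B before D and E, D and E before G, H last) are guaranteed.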

Elementary Parallel Algorithms

Finding sum using a 2D mesh architecture
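One common way to sum on a 2D mesh is to fold each row toward its first column and then fold that column toward the top-left cell. The sketch below simulates this sequentially; it assumes this row-then-column scheme (the slide's figure may order the steps differently), and `mesh_sum` is a name made up for the example.

```python
def mesh_sum(grid):
    """Simulate summation on a rows x cols mesh.

    Returns (total, steps), where steps counts the parallel
    communication-and-add steps.
    """
    rows, cols = len(grid), len(grid[0])
    acc = [row[:] for row in grid]
    steps = 0

    # Phase 1: in every row at once, column j sends its partial sum to column j-1.
    for j in range(cols - 1, 0, -1):
        for i in range(rows):
            acc[i][j - 1] += acc[i][j]
        steps += 1

    # Phase 2: column 0 folds upward into cell (0, 0).
    for i in range(rows - 1, 0, -1):
        acc[i - 1][0] += acc[i][0]
        steps += 1

    return acc[0][0], steps


if __name__ == "__main__":
    grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(mesh_sum(grid))  # (136, 6)
```

For a 4 x 4 mesh this takes 3 row steps plus 3 column steps.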

Finding sum of 16 values in a Shuffle Exchange SIMD Model
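A sequential simulation of the usual shuffle-exchange summation is sketched below: in each of lg n rounds, every processor routes its partial sum over the shuffle link and then adds the value held by its exchange partner. The function names are invented for this example, and the addressing (shuffle as a left cyclic rotation of the address bits, exchange as a flip of the lowest bit) is the standard convention assumed here.

```python
def shuffle(i, k):
    """Left cyclic rotation of the k-bit address i."""
    mask = (1 << k) - 1
    return ((i << 1) | (i >> (k - 1))) & mask


def shuffle_exchange_sum(values):
    """Return the per-processor values after lg n shuffle-exchange rounds."""
    n = len(values)
    k = n.bit_length() - 1
    if n < 2 or (1 << k) != n:
        raise ValueError("number of values must be a power of two, at least 2")

    local = list(values)
    for _ in range(k):
        # Shuffle: processor i sends its partial sum to processor shuffle(i).
        shuffled = [0] * n
        for i in range(n):
            shuffled[shuffle(i, k)] = local[i]
        # Exchange: each processor adds the value of the partner differing in bit 0.
        local = [shuffled[i] + shuffled[i ^ 1] for i in range(n)]
    return local


if __name__ == "__main__":
    print(shuffle_exchange_sum(list(range(1, 17))))  # sixteen copies of 136
```

With 16 values, four rounds leave the full sum in every processor.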

Parallel summation in a Hypercube SIMD Model
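As a rough illustration, hypercube summation can be simulated by halving the set of active processors one dimension at a time: processor p adds the partial sum of its neighbor across the current dimension. This is the textbook scheme and is assumed here; `hypercube_sum` is a hypothetical name.

```python
def hypercube_sum(values):
    """Simulate summation on a d-dimensional hypercube with 2**d processors."""
    n = len(values)
    d = n.bit_length() - 1
    if n < 1 or (1 << d) != n:
        raise ValueError("number of values must be a power of two")

    local = list(values)
    for i in reversed(range(d)):
        stride = 1 << i
        # All processors p < 2**i act at once: each adds the value held by
        # its neighbor across dimension i, whose address is p + 2**i.
        for p in range(stride):
            local[p] += local[p + stride]
    return local[0]  # processor 0 ends up holding the total


if __name__ == "__main__":
    print(hypercube_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # 36
```

The loop runs d = lg n times, one parallel step per dimension.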

Broadcast in a Hypercube

Algorithm 1

Algorithm 2
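The two algorithms appear only as figures in the slides. For orientation, here is a sketch of the standard dimension-by-dimension (binomial tree) broadcast from processor 0; it is not necessarily either of the slide's algorithms, and `hypercube_broadcast` is a name chosen for this example.

```python
def hypercube_broadcast(value, d):
    """Broadcast value from processor 0 of a d-dimensional hypercube.

    Returns (held, steps): the value held by each processor at the end,
    and the list of (source, destination) sends made in each step.
    """
    n = 1 << d
    held = [None] * n
    held[0] = value
    steps = []
    for i in range(d):
        # Every processor that already holds the value (addresses below 2**i)
        # sends it across dimension i in the same parallel step.
        sends = [(p, p | (1 << i)) for p in range(1 << i)]
        for src, dst in sends:
            held[dst] = held[src]
        steps.append(sends)
    return held, steps


if __name__ == "__main__":
    held, steps = hypercube_broadcast("msg", 3)
    print(len(steps), steps[-1])  # 3 [(0, 4), (1, 5), (2, 6), (3, 7)]
```

The number of informed processors doubles in each step, so d steps reach all 2^d processors.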

Odd Even Transposition Sort

• (1) p = n
• Input: 14 – 5 – 15 – 8 – 4 – 11 – 13 – 12

Each line below shows the list after the named phase; dashes separate the pairs compared in that phase.

• odd-even: 5 14 – 8 15 – 4 11 – 12 13
• even-odd: 5 – 8 14 – 4 15 – 11 12 – 13
• odd-even: 5 8 – 4 14 – 11 15 – 12 13
• even-odd: 5 – 4 8 – 11 14 – 12 15 – 13
• odd-even: 4 5 – 8 11 – 12 14 – 13 15
• even-odd: 4 – 5 8 – 11 12 – 13 14 – 15
• odd-even: 4 5 – 8 11 – 12 13 – 14 15
• even-odd: 4 – 5 8 – 11 12 – 13 14 – 15
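The p = n case can be simulated sequentially as below. This is a minimal sketch: the inner loop stands in for the compare-exchanges that all processor pairs perform at once, and the function name is invented for the example.

```python
def odd_even_transposition_sort(values):
    """Sort by n alternating odd-even and even-odd compare-exchange phases."""
    a = list(values)
    n = len(a)
    for phase in range(n):
        # Odd-even phases pair positions (1,2), (3,4), ... in 1-based terms,
        # which is index 0 here; even-odd phases pair (2,3), (4,5), ...
        start = 0 if phase % 2 == 0 else 1
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


if __name__ == "__main__":
    print(odd_even_transposition_sort([14, 5, 15, 8, 4, 11, 13, 12]))
    # [4, 5, 8, 11, 12, 13, 14, 15]
```

Running n phases is enough for any input of length n, which is why the loop does not test for early termination.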

Odd Even Transposition Sort (contd…)

• (2) p << n
• S = {12, 7, 2, 4, 1, 11, 9, 5, 6, 3, 10, 8}, p = 4

P1            P2            P3            P4
{12, 7, 2}    {4, 1, 11}    {9, 5, 6}     {3, 10, 8}
{2, 7, 12}    {1, 4, 11}    {5, 6, 9}     {3, 8, 10}
{1, 2, 4}     {7, 11, 12}   {3, 5, 6}     {8, 9, 10}
{1, 2, 4}     {3, 5, 6}     {7, 11, 12}   {8, 9, 10}
{1, 2, 3}     {4, 5, 6}     {7, 8, 9}     {10, 11, 12}
{1, 2, 3}     {4, 5, 6}     {7, 8, 9}     {10, 11, 12}

Pseudocode

• Proc MERGE-SPLIT(S)
    for i := 1 to p do in parallel
        QUICKSORT(S_i)
    end for
    for k := 1 to ceil(p/2) do
        for each odd-numbered processor P_i do in parallel
            MERGE(S_i, S_{i+1})
            SPLIT
        end for
        for each even-numbered processor P_i do in parallel
            MERGE(S_i, S_{i+1})
            SPLIT
        end for
    end for
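A sequential simulation of the merge-split procedure is sketched here. It assumes that the second inner loop runs over even-numbered processors, that n is divisible by p, and that SPLIT gives the lower half of the merged list to P_i and the upper half to P_{i+1}. The function names are made up for the example, and Python's built-in sort stands in for QUICKSORT.

```python
import heapq


def merge_split_sort(s, p):
    """Sort s on p simulated processors; returns the p sorted blocks."""
    if p < 1 or len(s) % p != 0:
        raise ValueError("len(s) must be a positive multiple of p")
    size = len(s) // p

    # Each processor sorts its own block (QUICKSORT in the pseudocode).
    blocks = [sorted(s[i * size:(i + 1) * size]) for i in range(p)]

    def merge_split(i):
        # MERGE the blocks of neighbors i and i+1, then SPLIT: the left
        # neighbor keeps the smaller half, the right one the larger half.
        merged = list(heapq.merge(blocks[i], blocks[i + 1]))
        blocks[i], blocks[i + 1] = merged[:size], merged[size:]

    for _ in range((p + 1) // 2):      # ceil(p / 2) rounds
        for i in range(0, p - 1, 2):   # odd-numbered processors P1, P3, ...
            merge_split(i)
        for i in range(1, p - 1, 2):   # even-numbered processors P2, P4, ...
            merge_split(i)
    return blocks


if __name__ == "__main__":
    s = [12, 7, 2, 4, 1, 11, 9, 5, 6, 3, 10, 8]
    print(merge_split_sort(s, 4))
    # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
```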

2-D Mesh with Snake Order
Thompson and Kung (1977)

Input: {23, 6, 1, 5, 11, 13, 55, 19, -3, 12, -5, -7, 9, 55, 28, -2}
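To make the target arrangement concrete, the sketch below lays the sorted values out in snake order on a 4 x 4 mesh: even-indexed rows run left to right and odd-indexed rows run right to left. It shows only the final layout, not the Thompson and Kung sorting steps, and `snake_order_layout` is a hypothetical helper.

```python
def snake_order_layout(values, side):
    """Return the side x side mesh holding the sorted values in snake order."""
    if len(values) != side * side:
        raise ValueError("need exactly side * side values")
    ordered = sorted(values)
    mesh = []
    for r in range(side):
        row = ordered[r * side:(r + 1) * side]
        # Reverse every second row so consecutive values stay mesh neighbors.
        mesh.append(row if r % 2 == 0 else row[::-1])
    return mesh


if __name__ == "__main__":
    data = [23, 6, 1, 5, 11, 13, 55, 19, -3, 12, -5, -7, 9, 55, 28, -2]
    for row in snake_order_layout(data, 4):
        print(row)
    # [-7, -5, -3, -2]
    # [9, 6, 5, 1]
    # [11, 12, 13, 19]
    # [55, 55, 28, 23]
```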

Snake Order (contd.)

Bitonic Merge Sort

• Bitonic Sequence: 1, 3, 7, 8, 6, 5, 4, 2

• Comparator

• Note: Batcher’s Bitonic Merge Sort compares elements whose indices differ by a single bit.
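The single-bit property can be seen in a small simulation of the bitonic merge step. The sketch below assumes an input that is already bitonic with a power-of-two length; the function name is invented for the example.

```python
def bitonic_merge(seq):
    """Sort a bitonic sequence of power-of-two length into ascending order."""
    a = list(seq)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    bit = n >> 1
    while bit:
        # One parallel stage: each comparator joins indices i and i | bit,
        # which differ in exactly one bit.
        for i in range(n):
            if (i & bit) == 0:
                j = i | bit
                if a[i] > a[j]:
                    a[i], a[j] = a[j], a[i]
        bit >>= 1
    return a


if __name__ == "__main__":
    print(bitonic_merge([1, 3, 7, 8, 6, 5, 4, 2]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

For 8 elements the stages compare indices that differ in bit 2, then bit 1, then bit 0.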

Bitonic Merge Sort

Shuffle‐Exchange Network

Bitonic Mergesort on Shuffle‐Exchange Network

• A list of $n = 2^k$ unsorted elements can be sorted in time $\Theta(\lg^2 n)$ with a network of $2^{k-1}[k(k-1)+1]$ comparators using the shuffle-exchange network.
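As a quick check of the comparator count $2^{k-1}[k(k-1)+1]$, here is the formula evaluated for $k = 4$ (a value chosen only for illustration):

```latex
n = 2^{4} = 16, \qquad
2^{k-1}\,[\,k(k-1)+1\,] = 2^{3}\,[\,4 \cdot 3 + 1\,] = 8 \cdot 13 = 104
```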

Sorting Network

Odd Even Merging Network

Systolic Matrix Multiplication

1. Multiply $a_{i,k}$ by $b_{k,j}$
2. Add the result to $r_{i,j}$
3. Send $a_{i,k}$ to cell $c_{i,j+1}$
4. Send $b_{k,j}$ to cell $c_{i+1,j}$
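The four steps can be simulated with a small time-stepped model. The sketch below assumes square n x n matrices, that cell (i, j) accumulates r[i][j], that values of a travel along row i and values of b travel down column j, and that row i and column j are fed with a delay of i and j steps respectively. These conventions and the name `systolic_matmul` are assumptions made for this illustration.

```python
def systolic_matmul(a, b):
    """Multiply square matrices a and b on a simulated n x n systolic array."""
    n = len(a)
    r = [[0] * n for _ in range(n)]
    a_reg = [[None] * n for _ in range(n)]  # a value currently in each cell
    b_reg = [[None] * n for _ in range(n)]  # b value currently in each cell

    for t in range(3 * n - 2):
        # Steps 3 and 4: pass a one cell to the right and b one cell down,
        # then feed the next skewed inputs at the left and top edges.
        for i in range(n):
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i
            a_reg[i][0] = a[i][k] if 0 <= k < n else None
        for j in range(n):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j
            b_reg[0][j] = b[k][j] if 0 <= k < n else None

        # Steps 1 and 2: every cell multiplies its pair and accumulates.
        for i in range(n):
            for j in range(n):
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    r[i][j] += a_reg[i][j] * b_reg[i][j]
    return r


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    # [[19, 22], [43, 50]]
```

With this skew, cell (i, j) sees the pair a[i][k], b[k][j] at time step i + j + k, so the last product is formed at step 3n - 3.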

Homework

• Show how the following 16 values would be sorted by Batcher’s Bitonic sort.

16, 7, 4, 12, 2, 10, 13, 9, 1, 8, 11, 3, 15, 6, 5, 14