Basic Types of Communication
P.Y. Wang
Department of Computer Science 4A5
George Mason University
Fairfax VA 22030-4444 U.S.A.
© 2004 P.Y. Wang, George Mason University
Outline
• Simplifying assumptions about routing
• Node to node communications
• Broadcast communications
– One-to-All
– All-to-All
• All-to-All Reduction and Prefix Sum
• Personalized communications
– One-to-All
– All-to-All
• Circular Shifts
• Faster Communications
Simplifying Assumptions
• Let $t_s$ be the start-up time and let $t_w$ be the per-word transfer time
– Two directly connected nodes send messages of size $m$ to each other in $t_s + m t_w$ time
• Only two routing schemes will be studied here:
– Store-and-forward
– Cut-through routing
• Assumptions which make our analyses simpler:
– A processor (i.e. node) can send a message on only one link at a
time
– A node can receive a message on only one link at a time
– A node can send and receive a message at the same time on the
same or different link. (Communication links are bidirectional.)
• Focus on the types of communication here; parallel programs usually invoke them by calling communication routines
Basic Node-to-Node Communications
• One node, out of all possible nodes, sends a message to another node
• For store-and-forward (SF) routing, the time is
$t_s + l\,m\,t_w$,
where $t_s$ is the startup time, $m$ is the message size, $t_w$ is the per-word transfer time, and $l$ is the number of links traversed:
$l \le \lfloor p/2 \rfloor$ (ring)
$l \le 2\lfloor \sqrt{p}/2 \rfloor$ (torus)
$l \le \lg p$ (hypercube)
• For cut-through (CT) routing, the time is
$t_s + m t_w + l t_h$,
where $t_h$ is the “per-hop” time ($l t_h$ is the time for a header to reach the destination node)
• If $m$ is small, then SF routing time is almost the same as CT routing time
• If $m \gg l$, then the $l t_h$ term is negligible and the CT routing time is approximately $t_s + m t_w$, which is the SF routing time for two nodes directly connected by a link
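To make the two cost models concrete, here is a minimal sketch that evaluates both formulas. The function names and the parameter values are illustrative assumptions, not part of the original slides.

```python
def sf_time(ts, tw, m, l):
    """Store-and-forward: the whole message is retransmitted on each of l links."""
    return ts + l * m * tw


def ct_time(ts, tw, th, m, l):
    """Cut-through: the message is pipelined; only the header pays the per-hop cost."""
    return ts + m * tw + l * th


if __name__ == "__main__":
    ts, tw, th = 50, 1, 2  # hypothetical costs in arbitrary time units
    for m in (1, 1000):
        for l in (1, 4):
            print(f"m={m} l={l}: SF={sf_time(ts, tw, m, l)} CT={ct_time(ts, tw, th, m, l)}")
```

For a long message over several links, the SF cost grows with the product of $l$ and $m$, while the CT cost grows with their sum.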
One-to-All Broadcast
• One node sends a value to all (or a subset of all) nodes
• When finished, there are $p$ copies of the original message, one copy per node
One-to-All Broadcast: Store-and-Forward on Ring
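$T_{\text{one-to-all}} = (t_s + m t_w)\lceil p/2 \rceil$
because $\lceil p/2 \rceil$ steps are needed.
[Figure: one-to-all broadcast on an 8-node ring (nodes 0 to 7); each transfer is labeled with the step in which it occurs.]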
One-to-All Broadcast: Store-and-Forward on Torus
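• Mesh of size $\sqrt{p} \times \sqrt{p}$
Procedure: do one row, then down all columns at the same time
$T_{\text{one-to-all}} = 2(t_s + m t_w)\lceil \sqrt{p}/2 \rceil$
• Mesh of size $\sqrt[3]{p} \times \sqrt[3]{p} \times \sqrt[3]{p}$
$T_{\text{one-to-all}} = 3(t_s + m t_w)\lceil \sqrt[3]{p}/2 \rceil$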
Do one row, then each column
[Figure: 2-D torus with 16 nodes, numbered 0 to 15 in column-major order; each transfer is labeled with its step number.]
One-to-All Broadcast: Store-and-Forward on Hypercube
Hypercube SF routing is faster than ring/mesh SF routing
$T_{\text{one-to-all}} = (t_s + m t_w)\lg p$
See next example!
Example: $p = 2^3$ Processor Hypercube
• Use the highest order bit:
000→ 100
• Use the middle bit:
000→ 010
100→ 110
• Use the lowest-order bit:
000→ 001
010→ 011
100→ 101
110→ 111
[Figure: three-dimensional hypercube with nodes 0 to 7; each edge used by the broadcast is labeled with its step number (1, 2, or 3).]
• This can be generalized if the source node is not node 000.
• That is, relabel each node with a virtual self-address (ID) by XOR’ing each node’s label with the source ID
XOR(0, 0) = 0
XOR(0, 1) = 1
XOR(1, 0) = 1
XOR(1, 1) = 0
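The bit-by-bit schedule and the XOR relabeling can be sketched as follows. This is a small simulation under the assumption that dimensions are used from the highest-order bit down, as in the example; the function name `hypercube_broadcast` is hypothetical.

```python
def hypercube_broadcast(d, source=0):
    """Return, per step, the (sender, receiver) pairs of a one-to-all
    broadcast on a d-dimensional hypercube rooted at `source`."""
    schedule = []
    have = {0}  # virtual IDs (real ID XOR source) that already hold the message
    for bit in reversed(range(d)):  # highest-order bit first
        mask = 1 << bit
        # Each holder sends across dimension `bit`; XOR maps virtual IDs back to real IDs.
        step = [(v ^ source, (v | mask) ^ source) for v in sorted(have)]
        have |= {v | mask for v in have}
        schedule.append(step)
    return schedule


if __name__ == "__main__":
    for number, step in enumerate(hypercube_broadcast(3, source=5), start=1):
        print("step", number, [(format(s, "03b"), format(r, "03b")) for s, r in step])
```

With `source=0` the schedule reproduces the three steps listed in the example; any other source only changes the labels, not the number of steps.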
• The dual problem is called single-node accumulation:
– Each node has a message of size $m$ words;
– all values are accumulated into one $m$-word message for a single destination
• It is accomplished by using the same One-to-All SF algorithms in reverse
One-to-All: CT Routing on a Ring
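[Figure: one-to-all broadcast with CT routing on an 8-node ring (nodes 0 to 7); each transfer is labeled with its step number, 1 to 3.]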
Note the binary (PRAM?) aspect of this approach:
First step distance $= p/2$
Second step distance $= p/2^2$
Third step distance $= p/2^3$
$i$th step distance $= p/2^i$
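$T_{\text{one-to-all}} = \sum_{i=1}^{\lg p} \left(t_s + t_w m + t_h\,\frac{p}{2^i}\right)$
$= (t_s + t_w m)\lg p + t_h \sum_{i=1}^{\lg p} \frac{p}{2^i}$
$= (t_s + t_w m)\lg p + t_h (p - 1)$
For large $m$, the $t_h$ term is insignificant, so the time is reduced by a factor of about $\lceil p/2 \rceil / \lg p$, i.e. on the order of $p/\lg p$, over SF routing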
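As a quick numerical check of this derivation, the sketch below sums the per-step costs and compares the result with the closed form. It assumes $p$ is a power of two; the function names are illustrative.

```python
def ct_ring_broadcast_time(p, m, ts, tw, th):
    """Sum the per-step costs; step i covers a distance of p / 2**i links."""
    steps = p.bit_length() - 1  # lg p, assuming p is a power of two
    return sum(ts + tw * m + th * (p // 2**i) for i in range(1, steps + 1))


def ct_ring_closed_form(p, m, ts, tw, th):
    """(ts + tw*m) * lg p + th * (p - 1)."""
    steps = p.bit_length() - 1
    return (ts + tw * m) * steps + th * (p - 1)


if __name__ == "__main__":
    for p in (2, 4, 8, 64):
        print(p, ct_ring_broadcast_time(p, 100, 50, 1, 2), ct_ring_closed_form(p, 100, 50, 1, 2))
```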
One-to-All: CT Routing on Mesh
Do one row, then columns at same time
[Figure: 16-node mesh (nodes 0 to 15, numbered in column-major order); each transfer is labeled with its step number, 1 to 4.]
Total CT routing time on a mesh is
$T_{\text{one-to-all}} = 2(t_s + t_w m)\lg\sqrt{p} + 2 t_h(\sqrt{p} - 1)$
$= (t_s + t_w m)\lg p + 2 t_h(\sqrt{p} - 1)$
All-to-All Broadcast, Reduction, and Prefix Sums
• Every node sends its (single) message of size $m$ to all the other nodes
• At the end, each node has the messages from the other nodes
• This is not the same as all-to-all personalized communications, as we will see
All-to-All: SF Routing on a Ring
[Figure: all-to-all broadcast on an 8-node ring; successive rows show the contents of each node after each step (node 0 holds 0, then 0, 7, then 0, 7, 6, and so on).]
This takes $p - 1$ steps and so
$T_{\text{all-to-all}} = (t_s + t_w m)(p - 1)$
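The ring algorithm can be simulated in a few lines. This is a minimal sketch, assuming every node always forwards to its right neighbor $(i + 1) \bmod p$; the function name is hypothetical.

```python
def ring_all_to_all(p):
    """Simulate all-to-all broadcast on a p-node ring with SF routing.

    Node i starts with message i and, in every step, forwards to node
    (i + 1) mod p the message it received in the previous step.
    """
    held = [[i] for i in range(p)]  # messages collected so far at each node
    outgoing = list(range(p))       # message each node sends in the next step
    steps = 0
    for _ in range(p - 1):
        # All nodes send simultaneously; node i receives from node (i - 1) mod p.
        incoming = [outgoing[(i - 1) % p] for i in range(p)]
        for i in range(p):
            held[i].append(incoming[i])
        outgoing = incoming
        steps += 1
    return held, steps


if __name__ == "__main__":
    held, steps = ring_all_to_all(8)
    print(steps, held[0])
```

Every step moves one message of size $m$ per node, which is where the factor $(t_s + t_w m)$ per step comes from.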
All-to-All: SF Routing on a Mesh
Use the ring algorithm in two phases
• All nodes in a row behave like a ring:
$(t_s + t_w m)(\sqrt{p} - 1)$ time
collect the information in one message of size $m\sqrt{p}$
• Now do columns as rings, with message size $m\sqrt{p}$:
$(t_s + t_w m \sqrt{p})(\sqrt{p} - 1)$ time
$T_{\text{all-to-all}} = 2 t_s(\sqrt{p} - 1) + t_w m (p - 1)$
All-to-All: SF Routing on a Hypercube
[Figure: all-to-all broadcast on an 8-node hypercube, shown as three cubes. After the first exchange the node pairs hold {0, 1}, {2, 3}, {4, 5}, {6, 7}; after the second exchange the nodes hold {0, 1, 2, 3} or {4, 5, 6, 7}.]
• Use the cube functions to exchange data: $\lg p$ steps
• In step $i$, the size of the messages exchanged is $2^{i-1} m$, so the time is $t_s + 2^{i-1} m t_w$.
$T_{\text{all-to-all}} = \sum_{i=1}^{\lg p} (t_s + 2^{i-1} m t_w) = t_s \lg p + t_w m (p - 1)$
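The doubling of the message size can be seen in a short simulation. This sketch assumes the dimensions are used from the lowest-order bit up; `hypercube_all_to_all` is a hypothetical name.

```python
def hypercube_all_to_all(d):
    """Simulate all-to-all broadcast on a d-dimensional hypercube.

    Returns the final contents of every node and the number of messages
    each node sends in each step.
    """
    p = 1 << d
    held = [{i} for i in range(p)]
    sizes = []
    for k in range(d):
        sizes.append(len(held[0]))  # every node sends everything it holds
        # Partners across dimension k swap and merge their collections.
        held = [held[i] | held[i ^ (1 << k)] for i in range(p)]
    return held, sizes


if __name__ == "__main__":
    held, sizes = hypercube_all_to_all(3)
    print(sizes, sum(sizes))
```

The per-step sizes add up to $p - 1$ messages, matching the $t_w m (p - 1)$ term.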
Reduction on a Hypercube
The same steps are used, but the incoming value is reduced with the resident value, resulting in a message of size $m$ for each step.
$T_{\text{reduction}} = (t_s + t_w m)\lg p$
Prefix Sums on a Hypercube
• Each node maintains a buffer
• The incoming message is added to the result only if the message came from a node with a smaller ID.
• The contents of the outgoing message are updated with every incoming message
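These rules can be sketched as a sequential simulation of all nodes. It assumes the number of values is a power of two and that the operation is addition; the names `result` and `msg` stand for the result buffer and the outgoing message and are illustrative.

```python
def hypercube_prefix_sums(values):
    """Simulate inclusive prefix sums on a hypercube with len(values) nodes."""
    p = len(values)
    d = p.bit_length() - 1
    if p == 0 or (1 << d) != p:
        raise ValueError("number of nodes must be a power of two")
    result = list(values)  # running prefix sum at each node
    msg = list(values)     # outgoing message at each node
    for k in range(d):
        # Snapshot what every node receives from its partner in dimension k.
        incoming = [msg[i ^ (1 << k)] for i in range(p)]
        for i in range(p):
            partner = i ^ (1 << k)
            if partner < i:
                result[i] += incoming[i]  # only data from smaller IDs counts
            msg[i] += incoming[i]         # the outgoing message always grows
    return result


if __name__ == "__main__":
    print(hypercube_prefix_sums(list(range(8))))
```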
Example
[Figure: prefix sums on an 8-node hypercube (nodes 000 to 111, initial values 0 to 7), showing each node's result buffer (Data) and outgoing buffer (Data_Br) as the exchange steps proceed.]
One-to-All Personalized Communication
• A single node sends a unique message of size $m$ to every node
• This is a scatter operation
• The dual is called a gather operation
Example for a Hypercube
Node 000 has eight messages, one for each node
[Figure: scatter on an 8-node hypercube from node 000's perspective. The messages 0..7 are split into 0..3 and 4..7 in step 1, into 0..1, 2..3, 4..5, 6..7 in step 2, and into single messages for nodes 000 to 111 in step 3.]
Communication Times
• Hypercube: $t_s \lg p + t_w m (p - 1)$
• Ring: $(t_s + t_w m)(p - 1)$
• Mesh: $2 t_s \sqrt{p} + t_w m (p - 1)$
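The hypercube figure can be reproduced with a small simulation of the scatter from node 0. This is a sketch under the assumption that the highest-order bit is used first; `hypercube_scatter` is a hypothetical name.

```python
def hypercube_scatter(d):
    """Simulate one-to-all personalized communication from node 0.

    Returns the final message held by each node and the number of messages
    node 0 ships in each step.
    """
    p = 1 << d
    held = {0: list(range(p))}  # node -> destinations of the messages it holds
    sent_sizes = []
    for bit in reversed(range(d)):
        mask = 1 << bit
        nxt = {}
        for node, msgs in held.items():
            # Keep messages for this half, hand the rest across dimension `bit`.
            nxt[node] = [x for x in msgs if not x & mask]
            nxt[node | mask] = [x for x in msgs if x & mask]
        sent_sizes.append(len(nxt[mask]))  # what node 0 sent in this step
        held = nxt
    return held, sent_sizes


if __name__ == "__main__":
    print(hypercube_scatter(3))
```

The messages sent per step halve each time and sum to $p - 1$, which gives the $t_w m (p - 1)$ term of the hypercube time.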
All-to-All-Personalized Communication
• Every node has a distinct message of size $m$ for every other node
• Let $(a, b)$ denote the message from node $a$ to node $b$
• Thus every node $i$ has $p - 1$ messages $(i, j)$, $j = 0, 1, \ldots, p - 1$, $j \ne i$
• We use a shorthand notation
$i : j..k = (i, j), (i, j + 1), \ldots, (i, k)$
in the following diagrams (destination indices wrap around modulo $p$)
All-to-All-Personalized: SF Routing on a Ring
• Each node has $p - 1$ messages to send as one message
• Use a pipeline approach
• Each receiving node keeps the message targeted for it and sends the
remainder onward
Example: p=4
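[Figure: all-to-all personalized communication on a 4-node ring (nodes 0 to 3); each link is labeled with the bundles it carries in successive steps, e.g. 0 : 1..3, then 0 : 2..3, then 0 : 3..3 for the messages of node 0.]
$T_{\text{all-to-all personalized}} = \sum_{i=1}^{p-1} \left(t_s + t_w m (p - i)\right) = \left(t_s + \tfrac{1}{2} t_w m p\right)(p - 1)$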
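The pipeline can be sketched as a simulation in which each node keeps the message addressed to it and forwards the rest. This assumes all bundles travel in the same direction around the ring; the function name is hypothetical.

```python
def ring_personalized(p):
    """Simulate all-to-all personalized communication on a p-node ring.

    Messages are (source, destination) pairs. Returns the messages received
    by each node and the bundle size (in messages) sent in each step.
    """
    # bundle[i]: what node i is about to send to node (i + 1) mod p
    bundle = [[(i, (i + k) % p) for k in range(1, p)] for i in range(p)]
    received = [[] for _ in range(p)]
    sizes = []
    for _ in range(p - 1):
        sizes.append(len(bundle[0]))
        arriving = [bundle[(i - 1) % p] for i in range(p)]
        bundle = []
        for i in range(p):
            received[i] += [msg for msg in arriving[i] if msg[1] == i]  # keep own
            bundle.append([msg for msg in arriving[i] if msg[1] != i])  # forward rest
    return received, sizes


if __name__ == "__main__":
    print(ring_personalized(4))
```

The bundle shrinks by one message per step, so the sizes sum to $p(p - 1)/2$ messages, which is the source of the $\tfrac{1}{2} t_w m p (p - 1)$ term.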
All-to-All-Personalized: SF Routing on a Mesh
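• Each node first groups its $p$ messages according to column destinations
• There will be $\sqrt{p}$ groups of $\sqrt{p}$ messages
• The rows independently perform communication with clustered messages of size $m\sqrt{p}$
$\sum_{i=1}^{\sqrt{p}-1} \left(t_s + t_w m \sqrt{p}\,(\sqrt{p} - i)\right) = \left(t_s + \tfrac{1}{2} t_w m \sqrt{p}\,\sqrt{p}\right)(\sqrt{p} - 1)$
$= \left(t_s + \tfrac{1}{2} t_w m p\right)(\sqrt{p} - 1)$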
Example: $p = 9$
[Figure: 3 × 3 mesh (nodes 0 to 8), shown twice: the data distribution before the first phase and the data distribution before the second phase. Each entry "i : ..." lists the destinations of the messages from source i held at a node, grouped by destination column {0, 3, 6}, {1, 4, 7}, {2, 5, 8}.]
• Each node rearranges its messages by the rows of its destination nodes in
$T_{\text{local rearrange}} = t_r m p$
time, where $t_r$ is the time to read/write into local memory
• The columns independently perform communication
$T_{\text{all-to-all personalized}} = (2t_s + t_w m p)(\sqrt{p} - 1) + T_{\text{local rearrange}}$
$\approx (2t_s + t_w m p)(\sqrt{p} - 1)$
because the last term is much smaller than the communication time
All-to-All-Personalized: SF Routing on a Hypercube
• Use $\lg p$ steps; the nodes exchange data in a different dimension on each step
• In each step, a node holds $p$ packets of size $m$ each and sends $p/2$ packets as one message to its neighbor
• $T_{\text{all-to-all personalized}} = \left(t_s + t_w m \tfrac{p}{2}\right)\lg p$
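The exchange can be simulated directly. This sketch assumes the dimensions are used from the lowest-order bit up and, to match the count of $p$ packets per node, includes each node's packet to itself; the function name is hypothetical.

```python
def hypercube_personalized(d):
    """Simulate all-to-all personalized communication on a d-dimensional hypercube.

    Packets are (source, destination) pairs. Returns the final packets at each
    node and, per step, the number of packets each node sent.
    """
    p = 1 << d
    held = [[(i, j) for j in range(p)] for i in range(p)]
    sent_sizes = []
    for k in range(d):
        bit = 1 << k
        # A packet crosses dimension k if its destination differs from its holder in bit k.
        outgoing = [[pkt for pkt in held[i] if (pkt[1] ^ i) & bit] for i in range(p)]
        staying = [[pkt for pkt in held[i] if not (pkt[1] ^ i) & bit] for i in range(p)]
        sent_sizes.append([len(o) for o in outgoing])
        held = [staying[i] + outgoing[i ^ bit] for i in range(p)]
    return held, sent_sizes


if __name__ == "__main__":
    held, sent = hypercube_personalized(3)
    print(sorted(held[5]), sent[0])
```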
Example: $p = 2^3$
[Figure: 8-node hypercube shown at four stages: the distribution before the first step (node i holds i : 0..7), before the second step (each node holds the even or the odd destinations of two sources), before the third step (each node holds messages from sources 0 to 3 or 4 to 7), and the final distribution (node j holds 0..7 : j).]
All-to-All-Personalized: CT Routing
• The ring and mesh approaches cannot be improved by using CT routing (with messages traveling clockwise or counterclockwise)
• But improvements can be made on a hypercube
All-to-All-Personalized: Using CT on a Hypercube
• Use CT routing $p - 1$ times: each node sends successively to $p - 1$ other nodes
• Need to minimize congestion:
for i ← 1 to p − 1 do
    send my data to node (self-ID XOR i)
    receive data from node (self-ID XOR i)
endfor
• Each send/receive between partners is carried out using E-cube
routing which keeps congestion down
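The pairing loop and the route taken between partners can be sketched as follows. The E-cube route shown corrects differing address bits from the lowest-order bit upward, which is one common convention and an assumption here; both function names are hypothetical.

```python
def exchange_partners(p):
    """For each step i = 1..p-1, list the disjoint pairs (node, node XOR i)."""
    return [[(j, j ^ i) for j in range(p) if j < (j ^ i)] for i in range(1, p)]


def ecube_path(src, dst, d):
    """Nodes visited when routing from src to dst on a d-dimensional hypercube,
    correcting differing bits from the lowest-order bit upward."""
    path = [src]
    cur = src
    for k in range(d):
        if (cur ^ dst) & (1 << k):
            cur ^= 1 << k
            path.append(cur)
    return path


if __name__ == "__main__":
    for i, pairs in enumerate(exchange_partners(8), start=1):
        print("step", i, pairs)
    print(ecube_path(0, 5, 3))
```

Because XOR with a fixed $i$ is its own inverse, every step pairs the nodes off exactly, and over the $p - 1$ steps each node meets every other node once.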
[Figure: seven panels (a) to (g), one per step $i = 1, \ldots, 7$, showing which pairs of nodes exchange messages on the 8-node hypercube; in step $i$, node $j$ exchanges with node $j$ XOR $i$, e.g. with $j$ XOR 111 in the last step.]
• The accumulated time of this approach is:
$T_{\text{all-to-all personalized}} = (t_s + t_w m)(p - 1) + \tfrac{1}{2} t_h p \lg p$
This form of communication can be used in a parallel bucket sort
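The hop term of the accumulated time can be checked numerically. In step $i$ a message crosses as many links as there are 1 bits in $i$, so the sketch below sums those costs and compares them with the closed form; it assumes $p$ is a power of two and the function names are illustrative.

```python
def ct_personalized_time(p, m, ts, tw, th):
    """Sum over steps i = 1..p-1 of ts + tw*m + th * (number of 1 bits in i)."""
    return sum(ts + tw * m + th * bin(i).count("1") for i in range(1, p))


def ct_personalized_closed_form(p, m, ts, tw, th):
    """(ts + tw*m) * (p - 1) + th * p * lg(p) / 2."""
    lg_p = p.bit_length() - 1
    return (ts + tw * m) * (p - 1) + th * p * lg_p // 2


if __name__ == "__main__":
    for p in (2, 4, 8, 16):
        print(p, ct_personalized_time(p, 10, 50, 1, 2), ct_personalized_closed_form(p, 10, 50, 1, 2))
```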
Circular q–Shifting
• Simplest form of permutation routing
• In a circular $q$-shift, each node $i$ sends to node $(i + q) \bmod p$
• For a ring, this is straightforward using SF routing
• For SF routing on a mesh, use wrap-around:
– Perform a $(q \bmod \sqrt{p})$-shift along each row, then a $\lfloor q/\sqrt{p} \rfloor$-shift along columns
– May need some extra shifts of some columns (in between these two shifts)
$T_{\text{circular shift}} = (t_s + t_w m)\left(2\lfloor \sqrt{p}/2 \rfloor + 1\right)$
[Figure: circular shift right by 10 on a 16-node mesh (nodes 0 to 15), in three panels: shift 2 along each row, make a correction, shift 2 along each column.]
Circular Shifting: SF on a Hypercube
• Map a ring onto the hypercube using binary reflected Gray coding
• Then convert $q$ into binary: $q = 5 = 101_2$
• Perform a 4-shift plus a 1-shift
• Each $2^i$-shift takes two steps when $i > 0$
• Each 1-shift takes one step
• Gray coding ensures that nodes a distance of $2^i$ apart are in disjoint subrings of the hypercube so that congestion-free routing occurs
• Upper bound on communication time is:
$T_{\text{circular shift}} = (t_s + t_w m)(2\lg p - 1)$
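The Gray-code mapping and the step count can be checked with a short sketch. It uses the standard binary reflected Gray code $g(i) = i \oplus \lfloor i/2 \rfloor$; the helper names are hypothetical.

```python
def gray(i):
    """Binary reflected Gray code of i."""
    return i ^ (i >> 1)


def hamming(a, b):
    """Number of hypercube links between nodes a and b."""
    return bin(a ^ b).count("1")


def shift_steps(q):
    """Steps used when q is decomposed into power-of-two shifts:
    one step for a 1-shift, two steps for every larger 2**i-shift."""
    steps, k = 0, 0
    while q:
        if q & 1:
            steps += 1 if k == 0 else 2
        q >>= 1
        k += 1
    return steps


if __name__ == "__main__":
    p = 8
    print([format(gray(i), "03b") for i in range(p)])
    print([hamming(gray(i), gray((i + 2) % p)) for i in range(p)])
    print(shift_steps(5))
```

Ring neighbors land on hypercube neighbors, and ring nodes $2^i$ apart ($i > 0$) land two links apart, which is why those shifts take two steps.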
Circular Shifting: CT on a Hypercube
• We can directly route messages using standard hypercube addresses
and E-cube routing
• This will be congestion free
• Communication time will be
$T_{\text{circular shift}} = t_s + t_w m + t_h L$
where $L$ is the longest path of the $q$-shift,
$L = \lg p - \gamma(q)$
and $\gamma(q) = \max\{\, j : 2^j \text{ divides } q \,\}$
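The longest-path expression can be compared against a brute-force count of differing address bits. This is a sketch assuming $p$ is a power of two and $0 < q < p$; the function names are illustrative.

```python
def gamma(q):
    """Largest j such that 2**j divides q (for q > 0)."""
    return (q & -q).bit_length() - 1


def longest_path(p, q):
    """lg p minus gamma(q): the longest route in a q-shift."""
    return (p.bit_length() - 1) - gamma(q)


def brute_force_longest(p, q):
    """Maximum number of address bits that differ between i and (i + q) mod p."""
    return max(bin(i ^ ((i + q) % p)).count("1") for i in range(p))


if __name__ == "__main__":
    p = 8
    for q in range(1, p):
        print(q, gamma(q), longest_path(p, q), brute_force_longest(p, q))
```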
Faster Methods for Some Communication Operations
• Route messages in parts, e.g. on a hypercube there are $\lg p$ distinct paths between two nodes
• Use multiple spanning trees for One-To-All Broadcast
• Use All-port communications (multiple channels)
• Use Special Hardware for Global operations