7/17/2019 parellel computing
http://slidepdf.com/reader/full/parellel-computing-568ec5efce93e 1/41
EENG-630 Chapter 3 1
Chapter 3: Principles of Scalable
Performance
• Performance measures
• Speedup laws
• Scalability principles
• Scaling up vs. scaling down
Performance metrics and
measures
• Parallelism profiles
• Asymptotic speedup factor
• System efficiency, utilization and quality
• Standard performance measures
Degree of parallelism
• Reflects the matching of software and hardware parallelism
• Discrete time function – measure, for each time period, the number of processors used
• Parallelism profile is a plot of the DOP as a function of time
• Ideally have unlimited resources
Factors affecting parallelism profiles
• Algorithm structure
• Program optimization
• Resource utilization
• Run-time conditions
• Realistically limited by the number of available processors, memory, and other nonprocessor resources
Average parallelism variables
• n – homogeneous processors
• m – maximum parallelism in a profile
• Δ – computing capacity of a single processor (execution rate only, no overhead)
• DOP = i – number of processors busy during an observation period
Average parallelism
• Total amount of work performed is proportional to the area under the profile curve
$$W = \Delta \int_{t_1}^{t_2} \mathrm{DOP}(t)\,dt$$
$$W = \Delta \sum_{i=1}^{m} i \cdot t_i$$
Average parallelism
$$A = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \mathrm{DOP}(t)\,dt$$
$$A = \frac{\sum_{i=1}^{m} i \cdot t_i}{\sum_{i=1}^{m} t_i}$$
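The discrete forms of W and A can be checked numerically. The following is a minimal Python sketch, assuming that t_i means the total time spent at DOP = i; the function name and the sample profile are hypothetical and chosen only for illustration.

```python
def work_and_average_parallelism(profile, delta=1.0):
    """Return (W, A) for a discrete parallelism profile.

    profile maps a DOP value i to t_i, the total time spent at that DOP
    (an assumed reading of t_i). delta is the per-processor capacity.
    """
    weighted_time = sum(i * t for i, t in profile.items())  # sum of i * t_i
    total_time = sum(profile.values())                      # sum of t_i
    work = delta * weighted_time                            # W = delta * sum(i * t_i)
    average = weighted_time / total_time                    # A = sum(i * t_i) / sum(t_i)
    return work, average


# Hypothetical profile: 2 time units at DOP 1, 3 at DOP 2, 4 at DOP 4, 1 at DOP 8.
profile = {1: 2.0, 2: 3.0, 4: 4.0, 8: 1.0}
W, A = work_and_average_parallelism(profile)
print(W, A)
```

Note that A does not depend on delta, since the capacity cancels between numerator and denominator.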
Example: parallelism profile and average parallelism
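Asymptotic speedup
• Response time:
$$T(1) = \sum_{i=1}^{m} t_i(1) = \sum_{i=1}^{m} \frac{W_i}{\Delta}$$
$$T(\infty) = \sum_{i=1}^{m} t_i(\infty) = \sum_{i=1}^{m} \frac{W_i}{i\Delta}$$
$$S_\infty = \frac{T(1)}{T(\infty)} = \frac{\sum_{i=1}^{m} W_i}{\sum_{i=1}^{m} W_i / i}$$
• $S_\infty = A$ in the ideal case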
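As a rough check of the asymptotic speedup formula, the sketch below computes the ratio of total work to the sum of W_i / i. The function name and the workload values are hypothetical; Δ is omitted because it cancels in the ratio.

```python
def asymptotic_speedup(workloads):
    """S_inf = sum(W_i) / sum(W_i / i).

    workloads maps a DOP value i to W_i, the work done at that DOP.
    """
    t_one = sum(workloads.values())                     # proportional to T(1)
    t_inf = sum(w / i for i, w in workloads.items())    # proportional to T(inf)
    return t_one / t_inf


# Hypothetical workloads W_i = i * t_i for times t = {1: 2, 2: 3, 4: 4, 8: 1}.
workloads = {1: 2.0, 2: 6.0, 4: 16.0, 8: 8.0}
print(asymptotic_speedup(workloads))
```

With these values the result equals the average parallelism of the same profile, 32 / 10 = 3.2, which illustrates the ideal-case identity between the asymptotic speedup and A.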
Performance measures
• Consider n processors executing m programs in various modes
• Want to define the mean performance of these multimode computers:
– Arithmetic mean performance
– Geometric mean performance
– Harmonic mean performance
Arithmetic mean performance
• Arithmetic mean execution rate (assumes equal weighting):
$$R_a = \sum_{i=1}^{m} \frac{R_i}{m}$$
• Weighted arithmetic mean execution rate:
$$R_a^* = \sum_{i=1}^{m} f_i R_i$$
– Proportional to the sum of the inverses of execution times
Geometric mean performance
• Geometric mean execution rate:
$$R_g = \prod_{i=1}^{m} R_i^{1/m}$$
• Weighted geometric mean execution rate:
$$R_g^* = \prod_{i=1}^{m} R_i^{f_i}$$
– Does not summarize the real performance, since it does not have the inverse relation with the total time
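To make the contrast between the means concrete, here is a small Python sketch of the weighted forms. The harmonic form 1 / Σ(f_i / R_i) is the standard definition and is included only for comparison; the function names and the two-program example are hypothetical.

```python
import math


def arithmetic_mean_rate(rates, weights):
    """Weighted arithmetic mean: sum of f_i * R_i."""
    return sum(f * r for f, r in zip(weights, rates))


def geometric_mean_rate(rates, weights):
    """Weighted geometric mean: product of R_i ** f_i."""
    return math.prod(r ** f for f, r in zip(weights, rates))


def harmonic_mean_rate(rates, weights):
    """Weighted harmonic mean: 1 / sum of f_i / R_i."""
    return 1.0 / sum(f / r for f, r in zip(weights, rates))


# Hypothetical case: two programs of one work unit each, run at rates 1 and 4.
rates = [1.0, 4.0]
weights = [0.5, 0.5]
total_time = sum(1.0 / r for r in rates)   # 1 + 0.25 time units
true_rate = len(rates) / total_time        # work done per unit of total time

print(arithmetic_mean_rate(rates, weights))
print(geometric_mean_rate(rates, weights))
print(harmonic_mean_rate(rates, weights), true_rate)
```

In this example only the harmonic mean matches the rate derived from total execution time; the arithmetic and geometric means both overstate it.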
Harmonic Mean Speedup
• Ties the various modes of a program to the number of processors used
• Program is in mode i if i processors used
• Sequential execution time $T_1 = 1/R_1 = 1$
$$S = \frac{T_1}{T^*} = \frac{1}{\sum_{i=1}^{n} f_i / R_i}$$
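The harmonic mean speedup can be sketched in a few lines of Python. The function name, the validation step, and the example with R_i = i and equal weights are assumptions made for illustration.

```python
def harmonic_mean_speedup(weights, rates):
    """S = T_1 / T* = 1 / sum(f_i / R_i), with T_1 = 1 / R_1 = 1.

    weights[k] is f_i and rates[k] is R_i for mode i = k + 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return 1.0 / sum(f / r for f, r in zip(weights, rates))


# Hypothetical 4-processor case: mode i runs at rate R_i = i, equal weights.
rates = [1, 2, 3, 4]
weights = [0.25, 0.25, 0.25, 0.25]
print(harmonic_mean_speedup(weights, rates))
```

Even with a quarter of the time spent in the 4-processor mode, the slower modes dominate the sum, so the speedup stays below 2 in this example.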
Harmonic Mean Speedup Performance
Amdahl's Law
• Assume $R_i = i$, $w = (\alpha, 0, 0, \ldots, 1-\alpha)$
• System is either sequential, with probability α, or fully parallel with probability 1 − α
• Implies $S \to 1/\alpha$ as $n \to \infty$
$$S_n = \frac{n}{1 + (n-1)\alpha}$$
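The closed form follows from the harmonic mean speedup once the weight vector w is used as the f_i. A short derivation, with a worked number for the hypothetical values α = 0.1 and n = 10:

```latex
\begin{align*}
S_n &= \frac{1}{\sum_{i=1}^{n} f_i / R_i}
     = \frac{1}{\dfrac{\alpha}{1} + \dfrac{1-\alpha}{n}}
     = \frac{n}{n\alpha + 1 - \alpha}
     = \frac{n}{1 + (n-1)\alpha}, \\
\lim_{n \to \infty} S_n &= \lim_{n \to \infty} \frac{1}{\alpha + (1-\alpha)/n} = \frac{1}{\alpha}, \\
S_{10}\big|_{\alpha = 0.1} &= \frac{10}{1 + 9 \times 0.1} = \frac{10}{1.9} \approx 5.26
  \quad \text{(upper bound } 1/\alpha = 10\text{)}.
\end{align*}
```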
Speedup Performance
System Efficiency
• O(n) is the total number of unit operations
• T(n) is execution time in unit time steps
• T(n) < O(n) and T(1) = O(1)
$$S(n) = \frac{T(1)}{T(n)}$$
$$E(n) = \frac{S(n)}{n} = \frac{T(1)}{n\,T(n)}$$
Redundancy and Utilization
• Redundancy signifies the extent of matching software and hardware parallelism
• Utilization indicates the percentage of resources kept busy during execution
$$R(n) = \frac{O(n)}{O(1)}$$
$$U(n) = R(n)\,E(n) = \frac{O(n)}{n\,T(n)}$$
Quality of Parallelism
• Directly proportional to the speedup and efficiency and inversely related to the redundancy
• Upper-bounded by the speedup S(n)
$$Q(n) = \frac{S(n)\,E(n)}{R(n)} = \frac{T^3(1)}{n\,T^2(n)\,O(n)}$$
Example of Performance
• Given $O(1) = T(1) = n^3$, $O(n) = n^3 + n^2 \log n$, and $T(n) = 4n^3/(n+3)$
$$S(n) = \frac{n+3}{4}$$
$$E(n) = \frac{n+3}{4n}$$
$$R(n) = \frac{n + \log n}{n}$$
$$U(n) = \frac{(n+3)(n + \log n)}{4n^2}$$
$$Q(n) = \frac{(n+3)^2}{16\,(n + \log n)}$$
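The closed forms in this example can be cross-checked against the defining ratios. The sketch below is one way to do so in Python; the function name is hypothetical, and the logarithm is taken to base 2, which is an assumption since the slide does not state the base.

```python
import math


def performance_metrics(n, t1, o1, tn, on):
    """Return (S, E, R, U, Q) from T(1), O(1), T(n), O(n)."""
    s = t1 / tn       # speedup S(n) = T(1) / T(n)
    e = s / n         # efficiency E(n) = S(n) / n
    r = on / o1       # redundancy R(n) = O(n) / O(1)
    u = r * e         # utilization U(n) = R(n) * E(n)
    q = s * e / r     # quality Q(n) = S(n) * E(n) / R(n)
    return s, e, r, u, q


n = 16
t1 = o1 = n ** 3
on = n ** 3 + n ** 2 * math.log2(n)   # base-2 logarithm is an assumption
tn = 4 * n ** 3 / (n + 3)

S, E, R, U, Q = performance_metrics(n, t1, o1, tn, on)
print(S, E, R, U, Q)
```

For n = 16 the defining ratios agree with the closed forms, for example S = (16 + 3) / 4 = 4.75 and R = (16 + 4) / 16 = 1.25.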
Standard Performance Measures
• MIPS and Mflops
– Depends on instruction set and program used
• Dhrystone results
– Measure of integer performance
• Whetstone results
– Measure of floating-point performance
• TPS and KLIPS ratings
– Transaction performance and reasoning power
Parallel Processing Applications
• Drug design
• High-speed civil transport
• Ocean modeling
• Ozone depletion research
• Air pollution
• Digital anatomy
Application Models for Parallel Computers
• Fixed-load model
– Constant workload
• Fixed-time model
– Demands constant program execution time
• Fixed-memory model
– Limited by the memory bound
Algorithm Characteristics
• Deterministic vs. nondeterministic
• Computational granularity
• Parallelism profile
• Communication patterns and synchronization requirements
• Uniformity of operations
• Memory requirement and data structures
Isoefficiency Concept
• Relates workload to machine size n needed to maintain a fixed efficiency
• The smaller the power of n, the more scalable the system
$$E = \frac{w(s)}{w(s) + h(s,n)}$$
– w(s): workload
– h(s,n): overhead
Isoefficiency Function
• To maintain a constant E, w(s) should grow in proportion to h(s,n)
• C = E/(1 − E) is constant for fixed E
$$w(s) = \frac{E}{1-E} \times h(s,n)$$
$$f_E(n) = C \times h(s,n)$$
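A small Python sketch shows how the isoefficiency relation can be used to size the workload. The overhead model h = n·log2(n) and the target efficiency 0.8 are purely hypothetical, as are the function names.

```python
import math


def required_workload(target_efficiency, overhead):
    """w = C * h with C = E / (1 - E), for a target efficiency 0 < E < 1."""
    if not 0.0 < target_efficiency < 1.0:
        raise ValueError("target efficiency must lie strictly between 0 and 1")
    c = target_efficiency / (1.0 - target_efficiency)
    return c * overhead


def efficiency(workload, overhead):
    """E = w / (w + h)."""
    return workload / (workload + overhead)


def sample_overhead(n):
    """Hypothetical overhead model h(s, n) = n * log2(n)."""
    return n * math.log2(n)


for n in (2, 4, 8, 16):
    h = sample_overhead(n)
    w = required_workload(0.8, h)
    print(n, round(w, 2), round(efficiency(w, h), 3))
```

Under this assumed overhead model the workload must grow as n·log n to hold the efficiency at 0.8; a slower-growing overhead would give a more scalable system.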
Speedup Performance Laws
• Amdahl's law – for fixed workload or fixed problem size
• Gustafson's law – for scaled problems (problem size increases with increased machine size)
• Speedup model – for scaled problems bounded by memory capacity
Amdahl's Law
• As the number of processors increases, the fixed load is distributed to more processors
• Minimal turnaround time is the primary goal
• Speedup factor is upper-bounded by a sequential bottleneck
• Two cases:
– DOP < n
– DOP ≥ n
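Fixed Load Speedup Factor
• Case 1: DOP > n
$$t_i(n) = \frac{W_i}{i\Delta} \left\lceil \frac{i}{n} \right\rceil$$
$$T(n) = \sum_{i=1}^{m} \frac{W_i}{i\Delta} \left\lceil \frac{i}{n} \right\rceil$$
• Case 2: DOP < n
$$t_i(n) = t_i(\infty) = \frac{W_i}{i\Delta}$$
$$S_n = \frac{T(1)}{T(n)} = \frac{\sum_{i=1}^{m} W_i}{\sum_{i=1}^{m} \frac{W_i}{i} \left\lceil \frac{i}{n} \right\rceil}$$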
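The ceiling term is what separates the two cases, and a short Python sketch makes this visible. The function name and the sample workloads are hypothetical; Δ is dropped because it cancels in the ratio.

```python
import math


def fixed_load_speedup(workloads, n):
    """S_n = sum(W_i) / sum((W_i / i) * ceil(i / n)).

    workloads maps a DOP value i to W_i. When i <= n the ceiling is 1,
    so the term reduces to W_i / i as in the unlimited-processor case.
    """
    t_one = sum(workloads.values())
    t_n = sum((w / i) * math.ceil(i / n) for i, w in workloads.items())
    return t_one / t_n


# Hypothetical workloads with maximum parallelism m = 8.
workloads = {1: 2.0, 2: 6.0, 4: 16.0, 8: 8.0}
for n in (1, 2, 4, 8, 16):
    print(n, fixed_load_speedup(workloads, n))
```

Once n reaches the maximum DOP (8 in this example), every ceiling equals 1 and adding processors no longer changes the speedup.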
Gustafson's Law
• With Amdahl's Law, the workload cannot scale to match the available computing power as n increases
• Gustafson's Law fixes the time, allowing the problem size to increase with higher n
• Not saving time, but increasing accuracy
Fixed-time Speedup
• As the machine size increases, have increased workload and new profile
• In general, $W_i' > W_i$ for $2 \le i \le m'$ and $W_1' = W_1$
• Assume $T(1) = T'(n)$
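Gustafson's Scaled Speedup
$$\sum_{i=1}^{m} W_i = \sum_{i=1}^{m'} \frac{W_i'}{i} \left\lceil \frac{i}{n} \right\rceil + Q(n)$$
$$S_n' = \frac{\sum_{i=1}^{m'} W_i'}{\sum_{i=1}^{m} W_i} = \frac{W_1' + W_n'}{W_1 + W_n} = \frac{W_1 + n W_n}{W_1 + W_n}$$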
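The difference between the fixed-load and fixed-time views shows up clearly for a two-mode workload (sequential part W_1, fully parallel part W_n). The following is a minimal Python sketch; the function names and the 10% sequential split are hypothetical, and the Amdahl form used is the two-mode equivalent of n / (1 + (n − 1)α) with α = W_1 / (W_1 + W_n).

```python
def amdahl_speedup(w1, wn, n):
    """Fixed-load speedup for a two-mode workload: (W1 + Wn) / (W1 + Wn / n)."""
    return (w1 + wn) / (w1 + wn / n)


def gustafson_speedup(w1, wn, n):
    """Fixed-time scaled speedup: (W1 + n * Wn) / (W1 + Wn)."""
    return (w1 + n * wn) / (w1 + wn)


# Hypothetical workload: 10% sequential, 90% perfectly parallel.
w1, wn = 1.0, 9.0
for n in (10, 100, 1000):
    print(n, round(amdahl_speedup(w1, wn, n), 2), round(gustafson_speedup(w1, wn, n), 2))
```

The fixed-load speedup saturates near 1/α = 10, while the scaled speedup keeps growing roughly linearly in n because the parallel part of the workload grows with the machine.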
Memory Bounded Speedup Model
• Idea is to solve largest problem, limited by memory space
• Results in a scaled workload and higher accuracy
• Each node can handle only a small subproblem for distributed memory
• Using a large number of nodes collectively increases the memory capacity proportionally
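Fixed-Memory Speedup
• Let M be the memory requirement and W the computational workload: $W = g(M)$
• $g^*(nM) = G(n)\,g(M) = G(n)\,W_n$
$$S_n^* = \frac{\sum_{i=1}^{m^*} W_i^*}{\sum_{i=1}^{m^*} \frac{W_i^*}{i} \left\lceil \frac{i}{n} \right\rceil + Q(n)} = \frac{W_1^* + W_n^*}{W_1^* + W_n^*/n} = \frac{W_1 + G(n)\,W_n}{W_1 + G(n)\,W_n / n}$$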
Relating Speedup Models
• G(n) reflects the increase in workload as memory increases n times
• G(n) = 1: fixed problem size (Amdahl)
• G(n) = n: workload increases n times when memory is increased n times (Gustafson)
• G(n) > n: workload increases faster than the memory requirement
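The three cases can be explored with one function that takes G as a parameter. This is a sketch for the two-mode workload only (sequential W_1, parallel W_n), with overhead ignored; the function name, the workload split, and the choice G(n) = n^1.5 for the third case are hypothetical.

```python
def memory_bounded_speedup(w1, wn, n, growth):
    """S*_n = (W1 + G(n) * Wn) / (W1 + G(n) * Wn / n) for a two-mode workload.

    growth is the function G; overhead Q(n) is ignored in this sketch.
    """
    scaled = growth(n) * wn
    return (w1 + scaled) / (w1 + scaled / n)


w1, wn, n = 1.0, 9.0, 100

fixed_load = memory_bounded_speedup(w1, wn, n, lambda k: 1)          # G(n) = 1
fixed_time = memory_bounded_speedup(w1, wn, n, lambda k: k)          # G(n) = n
faster = memory_bounded_speedup(w1, wn, n, lambda k: k ** 1.5)       # G(n) > n

print(fixed_load, fixed_time, faster)
```

With G(n) = 1 the expression reduces to the fixed-load form, with G(n) = n to the fixed-time form, and with G(n) > n the speedup moves closer to n than either of the other two.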
Scalability Metrics
• Machine size (n): number of processors
• Clock rate (f): determines basic machine cycle
• Problem size (s): amount of computational workload; directly proportional to T(s,1)
• CPU time (T(s,n)): actual CPU time for execution
• I/O demand (d): demand in moving the program, data, and results for a given run
Scalability Metrics
• Memory capacity (m): maximum number of memory words demanded
• Communication overhead (h(s,n)): amount of time for interprocessor communication, synchronization, etc.
• Computer cost (c): total cost of hardware and software resources required
• Programming overhead (p): development overhead associated with an application program
Speedup and Efficiency
• The problem size is the independent parameter
$$S(s,n) = \frac{T(s,1)}{T(s,n) + h(s,n)}$$
$$E(s,n) = \frac{S(s,n)}{n}$$
Scalable Systems
• Ideally, if E(s,n) = 1 for all algorithms and any s and n, the system is scalable
• Practically, consider the scalability of a machine
$$\Phi(s,n) = \frac{S(s,n)}{S_I(s,n)} = \frac{T_I(s,n)}{T(s,n)}$$