
Parallel computing


7/17/2019 parellel computing

http://slidepdf.com/reader/full/parellel-computing-568ec5efce93e 1/41

EENG-630 Chapter 3

Chapter 3: Principles of Scalable Performance
• Performance measures
• Speedup laws
• Scalability principles
• Scaling up vs. scaling down


Performance metrics and measures
• Parallelism profiles
• Asymptotic speedup factor
• System efficiency, utilization and quality
• Standard performance measures


Degree of parallelism
• Reflects the matching of software and hardware parallelism
• Discrete time function – measure, for each time period, the # of processors used
• Parallelism profile is a plot of the DOP as a function of time
• Ideally have unlimited resources


Factors affecting parallelism profiles
• Algorithm structure
• Program optimization
• Resource utilization
• Run-time conditions
• Realistically limited by # of available processors, memory, and other nonprocessor resources


Average parallelism variables
• n – homogeneous processors
• m – maximum parallelism in a profile
• Δ – computing capacity of a single processor (execution rate only, no overhead)
• DOP = i – # processors busy during an observation period


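Average parallelism
• Total amount of work performed is proportional to the area under the profile curve

$$W = \Delta \int_{t_1}^{t_2} DOP(t)\,dt$$

$$W = \Delta \sum_{i=1}^{m} i \cdot t_i$$

where $t_i$ is the total time during which DOP = i, so that $\sum_{i=1}^{m} t_i = t_2 - t_1$.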


Average parallelism

$$A = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} DOP(t)\,dt$$

$$A = \left( \sum_{i=1}^{m} i \cdot t_i \right) \Big/ \left( \sum_{i=1}^{m} t_i \right)$$
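As a rough illustration of the discrete forms of W and A, the sketch below evaluates both from a parallelism profile. The function name, the dictionary representation (each DOP value i mapped to the total time t_i spent at that DOP), and the sample numbers are assumptions made for this example and do not come from the slides.

```python
def work_and_average_parallelism(profile, delta=1.0):
    """Return (W, A) for a discrete parallelism profile.

    profile maps a DOP value i to the total time t_i spent at that DOP;
    delta is the computing capacity of a single processor.
    """
    total_time = sum(profile.values())                    # sum of t_i
    weighted = sum(i * t for i, t in profile.items())     # sum of i * t_i
    work = delta * weighted                               # W = delta * sum(i * t_i)
    average = weighted / total_time                       # A = sum(i * t_i) / sum(t_i)
    return work, average


# Hypothetical profile: 4 time units at DOP 1, 3 at DOP 2, 2 at DOP 4, 1 at DOP 8.
profile = {1: 4.0, 2: 3.0, 4: 2.0, 8: 1.0}
W, A = work_and_average_parallelism(profile)
print(f"W = {W}, A = {A:.2f}")
```

With this made-up profile the machine is observed for 10 time units and performs 26 units of work, so the average parallelism is about 2.6.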


Example: parallelism profile and average parallelism


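Asymptotic speedup

Response time on one processor and on an unlimited number of processors:

$$T(1) = \sum_{i=1}^{m} t_i(1) = \sum_{i=1}^{m} \frac{W_i}{\Delta}$$

$$T(\infty) = \sum_{i=1}^{m} t_i(\infty) = \sum_{i=1}^{m} \frac{W_i}{i\Delta}$$

$$S_\infty = \frac{T(1)}{T(\infty)} = \frac{\sum_{i=1}^{m} W_i}{\sum_{i=1}^{m} W_i / i}$$

$S_\infty = A$ in the ideal case.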
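A minimal sketch of the asymptotic speedup formula follows. The function name and the sample workload are assumptions for illustration; the workload values are chosen as W_i = i · Δ · t_i for a hypothetical profile with t_1 = 4, t_2 = 3, t_4 = 2, t_8 = 1, so the result can be compared with the average parallelism of that profile.

```python
def asymptotic_speedup(workloads, delta=1.0):
    """S_inf = T(1) / T(inf) for workloads given as {DOP i: W_i}."""
    t1 = sum(w / delta for w in workloads.values())               # T(1)
    t_inf = sum(w / (i * delta) for i, w in workloads.items())    # T(inf)
    return t1 / t_inf


# Hypothetical workload: W_i = i * t_i with t = {1: 4, 2: 3, 4: 2, 8: 1}.
workloads = {1: 4.0, 2: 6.0, 4: 8.0, 8: 8.0}
S_inf = asymptotic_speedup(workloads)
print(f"S_inf = {S_inf:.2f}")
```

Here T(1) = 26 and T(∞) = 10, so the speedup is 2.6, which matches A = 26/10 for the same profile. Note that Δ cancels out of the ratio.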


Performance measures
• Consider n processors executing m programs in various modes
• Want to define the mean performance of these multimode computers:
 – Arithmetic mean performance
 – Geometric mean performance
 – Harmonic mean performance


Arithmetic mean performance

Arithmetic mean execution rate (assumes equal weighting):

$$R_a = \sum_{i=1}^{m} R_i / m$$

Weighted arithmetic mean execution rate:

$$R_a^* = \sum_{i=1}^{m} f_i R_i$$

- proportional to the sum of the inverses of execution times


Geometric mean performance

Geometric mean execution rate:

$$R_g = \prod_{i=1}^{m} R_i^{1/m}$$

Weighted geometric mean execution rate:

$$R_g^* = \prod_{i=1}^{m} R_i^{f_i}$$

- does not summarize the real performance since it does not have the inverse relation with the total time
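The arithmetic and geometric mean execution rates can be sketched as below. The function names and the sample rates are assumptions for illustration; when no weights f_i are passed, equal weights 1/m are used, which reduces the weighted forms to the unweighted ones.

```python
import math


def arithmetic_mean_rate(rates, weights=None):
    """Weighted arithmetic mean: sum(f_i * R_i); equal weights by default."""
    if weights is None:
        weights = [1.0 / len(rates)] * len(rates)
    return sum(f * r for f, r in zip(weights, rates))


def geometric_mean_rate(rates, weights=None):
    """Weighted geometric mean: prod(R_i ** f_i); equal weights by default."""
    if weights is None:
        weights = [1.0 / len(rates)] * len(rates)
    return math.prod(r ** f for f, r in zip(weights, rates))


# Hypothetical execution rates of two programs (for example, in MIPS).
rates = [2.0, 8.0]
R_a = arithmetic_mean_rate(rates)
R_g = geometric_mean_rate(rates)
print(f"R_a = {R_a:.3f}, R_g = {R_g:.3f}")
```

For these two rates the arithmetic mean is 5 while the geometric mean is about 4, which shows how the arithmetic mean is pulled toward the faster program.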


Harmonic Mean Speedup
• Ties the various modes of a program to the number of processors used
• Program is in mode i if i processors used
• Sequential execution time $T_1 = 1/R_1 = 1$

$$S = T_1 / T^* = \frac{1}{\sum_{i=1}^{n} f_i / R_i}$$
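A small sketch of the harmonic mean speedup is given below. The function name, the choice R_i = i, and the uniform weight distribution are assumptions made for this example only.

```python
def harmonic_mean_speedup(weights, rates):
    """S = 1 / sum(f_i / R_i), with T_1 = 1 / R_1 = 1."""
    return 1.0 / sum(f / r for f, r in zip(weights, rates))


n = 4
rates = [float(i) for i in range(1, n + 1)]   # assumed: mode i runs at rate R_i = i
uniform = [1.0 / n] * n                       # assumed: each mode equally likely
S = harmonic_mean_speedup(uniform, rates)
print(f"S = {S:.2f}")
```

With four equally weighted modes the speedup is 48/25 = 1.92, well below n = 4, because the slow modes dominate a harmonic mean.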


Harmonic Mean Speedup Performance


Amdahl's Law
• Assume $R_i = i$, $w = (\alpha, 0, 0, \ldots, 0, 1-\alpha)$
• System is either sequential, with probability $\alpha$, or fully parallel with prob. $1-\alpha$
• Implies $S \to 1/\alpha$ as $n \to \infty$

$$S_n = \frac{n}{1 + (n-1)\alpha}$$
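The sketch below evaluates Amdahl's formula and checks it against the harmonic mean form 1 / Σ(f_i / R_i) with R_i = i and weights (α, 0, …, 0, 1 − α). The function names and the sample value α = 0.1 are assumptions for illustration; the second helper needs n ≥ 2.

```python
def amdahl_speedup(n, alpha):
    """S_n = n / (1 + (n - 1) * alpha) for sequential fraction alpha."""
    return n / (1.0 + (n - 1) * alpha)


def harmonic_form(n, alpha):
    """Same quantity as 1 / sum(f_i / R_i) with R_i = i; requires n >= 2."""
    weights = [alpha] + [0.0] * (n - 2) + [1.0 - alpha]
    return 1.0 / sum(f / i for i, f in enumerate(weights, start=1))


alpha = 0.1  # hypothetical sequential fraction
for n in (2, 16, 1024):
    print(n, round(amdahl_speedup(n, alpha), 3), round(harmonic_form(n, alpha), 3))
```

As n grows the printed speedup approaches 1/α = 10 but never reaches it, which is the sequential bottleneck the bullet points describe.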


Speedup Performance

 


System Efficiency
• $O(n)$ is the total # of unit operations
• $T(n)$ is execution time in unit time steps
• $T(n) < O(n)$ and $T(1) = O(1)$

$$S(n) = T(1) / T(n)$$

$$E(n) = \frac{S(n)}{n} = \frac{T(1)}{n\,T(n)}$$


Redundancy and Utilization
• Redundancy signifies the extent of matching software and hardware parallelism
• Utilization indicates the percentage of resources kept busy during execution

$$R(n) = O(n) / O(1)$$

$$U(n) = R(n)\,E(n) = \frac{O(n)}{n\,T(n)}$$


Quality of Parallelism
• Directly proportional to the speedup and efficiency and inversely related to the redundancy
• Upper-bounded by the speedup $S(n)$

$$Q(n) = \frac{S(n)\,E(n)}{R(n)} = \frac{T^3(1)}{n\,T^2(n)\,O(n)}$$


Example of Performance
• Given $O(1) = T(1) = n^3$, $O(n) = n^3 + n^2 \log n$, and $T(n) = 4n^3/(n+3)$

$$S(n) = (n+3)/4$$

$$E(n) = (n+3)/(4n)$$

$$R(n) = (n + \log n)/n$$

$$U(n) = (n+3)(n + \log n)/(4n^2)$$

$$Q(n) = (n+3)^2 / (16(n + \log n))$$
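The closed forms in this example can be checked numerically from the definitions of S, E, R, U and Q. The sketch below does that for one machine size; the function name, the choice n = 8, and reading "log" as the base-2 logarithm are assumptions made for this illustration.

```python
import math


def performance_metrics(n):
    """Evaluate S, E, R, U, Q from their definitions for the example workload."""
    log_n = math.log2(n)              # assumption: "log" means log base 2
    o1 = t1 = float(n ** 3)           # O(1) = T(1) = n^3
    on = n ** 3 + n ** 2 * log_n      # O(n) = n^3 + n^2 log n
    tn = 4 * n ** 3 / (n + 3)         # T(n) = 4 n^3 / (n + 3)

    s = t1 / tn                       # speedup S(n) = T(1) / T(n)
    e = s / n                         # efficiency E(n) = S(n) / n
    r = on / o1                       # redundancy R(n) = O(n) / O(1)
    u = r * e                         # utilization U(n) = R(n) E(n)
    q = s * e / r                     # quality Q(n) = S(n) E(n) / R(n)
    return {"S": s, "E": e, "R": r, "U": u, "Q": q}


metrics = performance_metrics(8)
for name, value in metrics.items():
    print(f"{name}(8) = {value:.4f}")
```

For n = 8 the values agree with the closed forms, for instance S(8) = 11/4 and Q(8) = 121/176, and Q stays below S as the upper bound requires.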


Standard Performance Measures
• MIPS and Mflops
 – Depends on instruction set and program used
• Dhrystone results
 – Measure of integer performance
• Whetstone results
 – Measure of floating-point performance
• TPS and KLIPS ratings
 – Transaction performance and reasoning power


Parallel Processing Applications
• Drug design
• High-speed civil transport
• Ocean modeling
• Ozone depletion research
• Air pollution
• Digital anatomy


Application Models for Parallel Computers
• Fixed-load model
 – Constant workload
• Fixed-time model
 – Demands constant program execution time
• Fixed-memory model
 – Limited by the memory bound


Algorithm Characteristics
• Deterministic vs. nondeterministic
• Computational granularity
• Parallelism profile
• Communication patterns and synchronization requirements
• Uniformity of operations
• Memory requirement and data structures


Isoefficiency Concept
• Relates workload to machine size n needed to maintain a fixed efficiency
• The smaller the power of n, the more scalable the system

$$E = \frac{w(s)}{w(s) + h(s,n)}$$

where $w(s)$ is the workload and $h(s,n)$ is the overhead.


Isoefficiency Function
• To maintain a constant $E$, $w(s)$ should grow in proportion to $h(s,n)$
• $C = E/(1-E)$ is constant for fixed $E$

$$w(s) = \frac{E}{1-E} \times h(s,n)$$

$$f_E(n) = C \times h(s,n)$$
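To make the isoefficiency relation concrete, the sketch below computes the workload needed to hold a target efficiency as the machine grows. The overhead model h(s, n) = n log2 n is purely hypothetical (real overhead functions depend on the algorithm and architecture), and all function names are assumptions for this example.

```python
import math


def efficiency(workload, overhead):
    """E = w(s) / (w(s) + h(s, n))."""
    return workload / (workload + overhead)


def required_workload(target_e, overhead):
    """w(s) = C * h(s, n) with C = E / (1 - E)."""
    c = target_e / (1.0 - target_e)
    return c * overhead


def example_overhead(n):
    """Hypothetical overhead model h(s, n) = n * log2(n), independent of s."""
    return n * math.log2(n)


target = 0.8
for n in (2, 4, 8, 16):
    h = example_overhead(n)
    w = required_workload(target, h)
    print(f"n={n:2d}  h={h:6.1f}  w={w:7.1f}  E={efficiency(w, h):.2f}")
```

Under this assumed overhead the workload has to grow as n log2 n to keep E at 0.8; an overhead that grows as a higher power of n would demand a faster-growing workload, which corresponds to a less scalable system.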


Speedup Performance Laws
• Amdahl's law
 – for fixed workload or fixed problem size
• Gustafson's law
 – for scaled problems (problem size increases with increased machine size)
• Speedup model
 – for scaled problems bounded by memory capacity


Amdahl's Law
• As # of processors increase, the fixed load is distributed to more processors
• Minimal turnaround time is primary goal
• Speedup factor is upper-bounded by a sequential bottleneck
• Two cases:
 – DOP < n
 – DOP ≥ n


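Fixed Load Speedup Factor
• Case 1: DOP > n

$$t_i(n) = \frac{W_i}{i\Delta} \left\lceil \frac{i}{n} \right\rceil$$

$$T(n) = \sum_{i=1}^{m} \frac{W_i}{i\Delta} \left\lceil \frac{i}{n} \right\rceil$$

• Case 2: DOP < n

$$t_i(n) = t_i(\infty) = \frac{W_i}{i\Delta}$$

Speedup factor:

$$S_n = \frac{T(1)}{T(n)} = \frac{\sum_{i=1}^{m} W_i}{\sum_{i=1}^{m} \frac{W_i}{i} \left\lceil \frac{i}{n} \right\rceil}$$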
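The fixed-load speedup with the ceiling term can be sketched as follows. The function name and the workload values are assumptions for illustration; the ceiling ⌈i/n⌉ covers both cases at once, since it equals 1 whenever i ≤ n.

```python
import math


def fixed_load_speedup(workloads, n):
    """S_n = sum(W_i) / sum((W_i / i) * ceil(i / n)); delta cancels out."""
    t1 = sum(workloads.values())
    tn = sum((w / i) * math.ceil(i / n) for i, w in workloads.items())
    return t1 / tn


# Hypothetical workload {DOP i: W_i}.
workloads = {1: 4.0, 2: 6.0, 4: 8.0, 8: 8.0}
for n in (1, 2, 4, 8, 16):
    print(f"n={n:2d}  S_n={fixed_load_speedup(workloads, n):.3f}")
```

For this workload the speedup rises from 1 at n = 1 to 26/15 at n = 2, and stops improving at 2.6 once n reaches the maximum DOP of 8, since extra processors then sit idle.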


Gustafson's Law
• With Amdahl's Law, the workload cannot scale to match the available computing power as n increases
• Gustafson's Law fixes the time, allowing the problem size to increase with higher n
• Not saving time, but increasing accuracy


Fixed-time Speedup
• As the machine size increases, have increased workload and new profile
• In general, $W_i' > W_i$ for $2 \le i \le m'$ and $W_1' = W_1$
• Assume $T(1) = T'(n)$


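Gustafson's Scaled Speedup

$$\sum_{i=1}^{m} W_i = \sum_{i=1}^{m'} \frac{W_i'}{i} \left\lceil \frac{i}{n} \right\rceil + Q(n)$$

where $Q(n)$ here denotes the lumped system overhead, not the quality of parallelism.

$$S_n' = \frac{\sum_{i=1}^{m'} W_i'}{\sum_{i=1}^{m} W_i} = \frac{W_1' + W_n'}{W_1 + W_n} = \frac{W_1 + nW_n}{W_1 + W_n}$$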
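One common way to read the special-case formula is to normalize the workload so that $W_1 + W_n = 1$ and write $\alpha = W_1$ for the sequential fraction. This normalization is an assumption added here for illustration and is not stated on the slide. Under it:

$$S_n' = \frac{W_1 + nW_n}{W_1 + W_n} = \alpha + n(1 - \alpha) = n - \alpha(n - 1)$$

For example, with $\alpha = 0.1$ and $n = 10$ this gives $S_n' = 10 - 0.1 \times 9 = 9.1$. The scaled speedup therefore grows linearly in $n$ with slope $1 - \alpha$, instead of saturating at $1/\alpha$ as in the fixed-load case.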


Memory Bounded Speedup Model
• Idea is to solve largest problem, limited by memory space
• Results in a scaled workload and higher accuracy
• Each node can handle only a small subproblem for distributed memory
• Using a large # of nodes collectively increases the memory capacity proportionally


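Fixed-Memory Speedup
• Let $M$ be the memory requirement and $W$ the computational workload: $W = g(M)$
• $g^*(nM) = G(n)\,g(M) = G(n)\,W_n$

$$S_n^* = \frac{\sum_{i=1}^{m^*} W_i^*}{\sum_{i=1}^{m^*} \frac{W_i^*}{i} \left\lceil \frac{i}{n} \right\rceil + Q(n)} = \frac{W_1 + G(n)\,W_n}{W_1 + G(n)\,W_n / n}$$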


Relating Speedup Models
• $G(n)$ reflects the increase in workload as memory increases n times
• $G(n) = 1$ : Fixed problem size (Amdahl)
• $G(n) = n$ : Workload increases n times when memory increased n times (Gustafson)
• $G(n) > n$ : Workload increases faster than the memory requirement
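The three cases can be compared with a small sketch of the special-case memory-bounded formula S = (W_1 + G(n) W_n) / (W_1 + G(n) W_n / n). The function name, the workload split W_1 = 0.1 and W_n = 0.9, and the growth function G(n) = n^1.5 used for the third case are assumptions chosen for illustration.

```python
def memory_bounded_speedup(w1, wn, n, growth):
    """S = (W_1 + G(n) W_n) / (W_1 + G(n) W_n / n), overhead ignored."""
    g = growth(n)
    return (w1 + g * wn) / (w1 + g * wn / n)


w1, wn, n = 0.1, 0.9, 10   # hypothetical sequential and parallel workloads

fixed_load = memory_bounded_speedup(w1, wn, n, lambda k: 1.0)         # G(n) = 1
fixed_time = memory_bounded_speedup(w1, wn, n, lambda k: float(k))    # G(n) = n
fixed_memory = memory_bounded_speedup(w1, wn, n, lambda k: k ** 1.5)  # G(n) > n

print(f"G(n)=1     -> {fixed_load:.3f}")
print(f"G(n)=n     -> {fixed_time:.3f}")
print(f"G(n)=n^1.5 -> {fixed_memory:.3f}")
```

With these numbers the fixed-load case reproduces Amdahl's value 10/1.9, the fixed-time case gives 9.1, and the faster-growing workload gives a slightly higher speedup still, since the parallel part dominates more as G(n) grows.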


Scalability Metrics
• Machine size ($n$) : # of processors
• Clock rate ($f$) : determines basic m/c cycle
• Problem size ($s$) : amount of computational workload. Directly proportional to $T(s,1)$.
• CPU time ($T(s,n)$) : actual CPU time for execution
• I/O demand ($d$) : demand in moving the program, data, and results for a given run


Scalability Metrics
• Memory capacity ($m$) : max # of memory words demanded
• Communication overhead ($h(s,n)$) : amount of time for interprocessor communication, synchronization, etc.
• Computer cost ($c$) : total cost of h/w and s/w resources required
• Programming overhead ($p$) : development overhead associated with an application program


Speedup and Efficiency
• The problem size is the independent parameter

$$S(s,n) = \frac{T(s,1)}{T(s,n) + h(s,n)}$$

$$E(s,n) = \frac{S(s,n)}{n}$$
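As a quick numerical illustration of these two definitions, the sketch below plugs in made-up timings. The function names and all numbers are hypothetical; they only show how communication overhead pulls the speedup below the ratio T(s,1)/T(s,n).

```python
def speedup(t_seq, t_par, overhead):
    """S(s, n) = T(s, 1) / (T(s, n) + h(s, n))."""
    return t_seq / (t_par + overhead)


def efficiency(t_seq, t_par, overhead, n):
    """E(s, n) = S(s, n) / n."""
    return speedup(t_seq, t_par, overhead) / n


# Hypothetical timings for one problem size s on n = 8 processors.
n = 8
t_seq, t_par, overhead = 100.0, 12.5, 7.5
S = speedup(t_seq, t_par, overhead)
E = efficiency(t_seq, t_par, overhead, n)
print(f"S = {S:.3f}, E = {E:.3f}")
```

Without overhead these timings would give a perfect speedup of 8; with 7.5 time units of overhead the speedup drops to 5 and the efficiency to 0.625.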


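Scalable Systems
• Ideally, if $E(s,n) = 1$ for all algorithms and any $s$ and $n$, system is scalable
• Practically, consider the scalability of a m/c

$$\Phi(s,n) = \frac{S(s,n)}{S_I(s,n)} = \frac{T_I(s,n)}{T(s,n)}$$

where the subscript $I$ denotes the corresponding quantity on an ideal machine.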