Lecture 17: Analytical Modeling of Parallel Programs: Scalability

CSCE 569 Parallel Computing
Department of Computer Science and Engineering
Yonghong Yan
[email protected]
http://cse.sc.edu/~yanyh


Topic Overview

• Introduction
• Performance Metrics for Parallel Systems
  – Execution Time, Overhead, Speedup, Efficiency, Cost
• Amdahl’s Law
• Scalability of Parallel Systems
  – Isoefficiency Metric of Scalability
• Minimum Execution Time and Minimum Cost-Optimal Execution Time
• Asymptotic Analysis of Parallel Programs
• Other Scalability Metrics
  – Scaled speedup, Serial fraction

Speedup and Efficiency

Amdahl’s Law Speedup

Scalability of Parallel Systems

• Scalability: the patterns of speedup
  – How the performance of a parallel application changes as the number of processors is increased
    • Scaling: performance improves steadily
    • Not scaling: performance does not improve or becomes worse

Scalability of Parallel Systems

• Two different types of scaling with regard to the problem size
  – Strong Scaling
    • Total problem size stays the same as the number of processors increases
  – Weak Scaling
    • The problem size increases at the same rate as the number of processors, keeping the amount of work per processor the same
• Strong scaling is generally more useful and more difficult to achieve than weak scaling

• http://www.mcs.anl.gov/~itf/dbpp/text/node30.html
• https://www.sharcnet.ca/help/index.php/Measuring_Parallel_Scaling_Performance
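The two definitions can be turned into a small calculation. The following is a minimal sketch, assuming the commonly used efficiency formulas (strong scaling: T1 / (p × Tp); weak scaling: T1 / Tp). The function names and all timing values are hypothetical and only illustrate the arithmetic.

```python
def strong_scaling_efficiency(t1, tp, p):
    """Fixed total problem size: the ideal time on p processors is t1 / p."""
    return t1 / (p * tp)


def weak_scaling_efficiency(t1, tp):
    """Fixed work per processor: the ideal time on p processors stays at t1."""
    return t1 / tp


# Hypothetical wall-clock times in seconds, keyed by processor count.
strong_times = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}    # same total problem
weak_times = {1: 100.0, 2: 102.0, 4: 105.0, 8: 110.0}   # problem grows with p

for p in sorted(strong_times):
    strong = strong_scaling_efficiency(strong_times[1], strong_times[p], p)
    weak = weak_scaling_efficiency(weak_times[1], weak_times[p])
    print(f"p={p}: strong efficiency={strong:.2f}, weak efficiency={weak:.2f}")
```

With these made-up timings the strong-scaling efficiency falls faster than the weak-scaling efficiency, which is the usual reason strong scaling is considered harder to achieve.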

Strong Scaling

Weak Scaling of Parallel Systems

Extrapolate performance
• From small problems and small systems → larger problems on larger configurations

[Figure: 3 parallel algorithms for computing an n-point FFT on 64 PEs]

Inferences from small datasets or small machines can be misleading

Scaling Characteristics of Parallel Programs: Increase problem size

• Efficiency: $E = \frac{S}{p} = \frac{T_S}{p\,T_P}$
• Parallel overhead: $T_o = pT_P - T_S \Rightarrow E = \frac{T_S}{T_S + T_o}$
  – Overhead increases as p increases
• Problem size:
  – For a given problem size, $T_S$ remains constant
• Efficiency increases if
  – The problem size ($T_S$) increases and
  – The number of PEs is kept constant

Example: Adding n Numbers on p PEs

• Addition = 1 time unit; communication = 1 time unit

Speedup tends to saturate and efficiency drops
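The saturation can be reproduced numerically. This is a minimal sketch, assuming the parallel time T_P = n/p + 2 log2(p), which follows from the overhead T_o = 2p log p used in the isoefficiency example of this lecture, and taking T_S ≈ n. The function names are illustrative only.

```python
import math


def parallel_time(n, p):
    """Assumed model: n/p local additions plus 2*log2(p) units for the reduction."""
    return n / p + 2 * math.log2(p)


def speedup(n, p):
    """Speedup S = T_S / T_P, with the serial time approximated by n."""
    return n / parallel_time(n, p)


def efficiency(n, p):
    """Efficiency E = S / p."""
    return speedup(n, p) / p


for n in (64, 192, 512):
    row = ", ".join(
        f"p={p}: S={speedup(n, p):.2f} E={efficiency(n, p):.2f}"
        for p in (1, 4, 16, 32)
    )
    print(f"n={n}: {row}")
```

For a fixed n the speedup flattens as p grows, while a larger n on the same p gives a higher efficiency.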

Scaling Characteristics of Parallel Programs: Increase problem size and increase # PEs

• Overhead $T_o = f(T_S, p)$, i.e. a function of problem size and p
  – In many cases, $T_o$ grows sublinearly with respect to $T_S$
• Efficiency:
  – Decreases as we increase p (which increases $T_o$)
  – Increases as we increase the problem size ($T_S$)
• Keep efficiency constant
  – Increase the problem size and
  – Proportionally increase the number of PEs
• Scalable parallel systems

Isoefficiency Metric of Scalability

Rate at which the problem size ($T_S$) must increase per additional PE (which raises the overhead $T_o$) to keep the efficiency fixed

• The scalability of the system
  – The slower this rate, the better the scalability
  – Rate == 0: strong scaling
    • The same problem (same size) scales when increasing the number of PEs
• To formalize this rate, we define
  – The problem size W = the asymptotic number of operations associated with the best serial algorithm to solve the problem
    • i.e. the serial execution time, $T_S$

Isoefficiency Metric of Scalability

• Parallel overhead: $T_o(W, p)$; again, $W \approx T_S$
• Parallel execution time: $T_P = \frac{W + T_o(W, p)}{p}$
• Speedup: $S = \frac{W}{T_P} = \frac{W p}{W + T_o(W, p)}$
• Efficiency: $E = \frac{S}{p} = \frac{W}{W + T_o(W, p)} = \frac{1}{1 + T_o(W, p)/W}$

Isoefficiency Metric of Scalability

• To maintain constant efficiency (between 0 and 1): $\frac{T_o(W, p)}{W} = \frac{1 - E}{E}$, i.e. $W = \frac{E}{1 - E}\,T_o(W, p) = K\,T_o(W, p)$
• $K = E/(1 - E)$ is a constant related to the desired efficiency

Ratio $T_o/W$ should be maintained at a constant value.

Isoefficiency Metric of Scalability

$W = \Phi(p)$ such that efficiency is constant

• $W = \Phi(p)$ is called the isoefficiency function
  – Read as: what is the problem size when we have p PEs to maintain constant efficiency?
  – $W_{p+1} - W_p = \Phi(p+1) - \Phi(p)$
    • To maintain constant efficiency, how much must the problem size increase when adding one more PE?
• The isoefficiency function determines the ease
  – With which a parallel system maintains a constant efficiency
  – And hence achieves speedups increasing in proportion to the number of PEs

Isoefficiency Example 1

Adding n numbers using p PEs
• Parallel overhead: $T_o = 2p \log p$
• $W = K\,T_o(W, p)$; substitute $T_o$
  – $W = K \cdot 2p \log p$
• $K \cdot 2p \log p$ is the isoefficiency function
• The asymptotic isoefficiency function for this parallel system is $\Theta(p \log p)$
• To have the same efficiency on p’ processors as on p
  – The problem size n must increase by a factor of $(p' \log p')/(p \log p)$ when increasing PEs from p to p’

Examples

• Growth factor for the problem size: $(p' \log p')/(p \log p)$
• If p = 8, p’ = 16
  – $16 \log 16/(8 \log 8) = (16 \cdot 4)/(8 \cdot 3) = 8/3 \approx 2.67$
• 10M on 8 cores
• 10 × 2.67M ≈ 26.7M on 16 cores
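The isoefficiency relation for this example can be checked with a few lines of code. The sketch below assumes base-2 logarithms and the overhead T_o = 2p log p from the adding-n-numbers example; the function names and the target efficiency of 0.8 are illustrative choices, not part of the original slides.

```python
import math


def overhead(p):
    """Overhead for adding n numbers on p PEs: T_o = 2 * p * log2(p)."""
    return 2 * p * math.log2(p)


def isoefficiency_size(p, target_efficiency):
    """Problem size W = K * T_o with K = E / (1 - E)."""
    k = target_efficiency / (1 - target_efficiency)
    return k * overhead(p)


def efficiency(w, p):
    """E = W / (W + T_o)."""
    return w / (w + overhead(p))


def growth_factor(p_old, p_new):
    """Factor by which W must grow to keep E fixed when going from p_old to p_new."""
    return (p_new * math.log2(p_new)) / (p_old * math.log2(p_old))


for p in (8, 16, 32):
    w = isoefficiency_size(p, 0.8)
    print(f"p={p}: W={w:.0f}, efficiency={efficiency(w, p):.2f}")

print(f"growth factor 8 -> 16: {growth_factor(8, 16):.2f}")
```

Sizing the problem with the isoefficiency function keeps the computed efficiency at the target for every p, and the 8 to 16 growth factor matches the 8/3 worked out by hand.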

Cost-Optimality and Isoefficiency

• A parallel system is cost-optimal if and only if
  – Parallel cost matches total work asymptotically: $pT_P = \Theta(W)$
    • Efficiency = $\Theta(1)$
• From this, we have: $W + T_o(W, p) = \Theta(W)$, so $T_o(W, p) = O(W)$ and $W = \Omega(T_o(W, p))$
  – i.e. work dominates overhead
• If we have an isoefficiency function f(p)
  – The relation $W = \Omega(f(p))$ must be satisfied to ensure the cost-optimality of a parallel system as it is scaled up
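A quick numerical check of this condition, as a minimal sketch: it reuses the adding-n-numbers model (assumed T_P = n/p + 2 log2(p), so the cost is n + 2p log2(p)) and compares a fixed problem size against one that grows as p log p. The function names and the sample sizes are hypothetical.

```python
import math


def cost(n, p):
    """Parallel cost p * T_P under the assumed model T_P = n/p + 2*log2(p)."""
    return p * (n / p + 2 * math.log2(p))


def cost_ratio(n, p):
    """Cost divided by serial work W = n; bounded ratio means cost-optimal."""
    return cost(n, p) / n


for p in (4, 16, 64, 256, 1024):
    fixed = cost_ratio(1024, p)                  # problem size held constant
    scaled = cost_ratio(p * math.log2(p), p)     # problem size grows as p log p
    print(f"p={p}: fixed-n ratio={fixed:.2f}, scaled-n ratio={scaled:.2f}")
```

With n held fixed the ratio keeps growing with p, so the system stops being cost-optimal; with n growing as p log p the ratio stays constant.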


Minimum Execution Time

• Often, we are interested in the minimum time to solution
• To determine the minimum execution time $T_P^{min}$ for a given W
  – Differentiate the expression for $T_P$ w.r.t. p and equate it to 0: $\frac{d}{dp} T_P = 0$
• If $p_0$ is the value of p determined by this equation
  – $T_P(p_0)$ is the minimum parallel time

Minimum Execution Time: Example

Adding n numbers
• Parallel execution time: $T_P = \frac{n}{p} + 2 \log p$
• Compute the derivative: $\frac{d}{dp} T_P = -\frac{n}{p^2} + \frac{2}{p}$
• Set the derivative = 0, solve for p: $p = \frac{n}{2}$
• The corresponding execution time: $T_P^{min} = 2 \log n$

Note that at this point, the formulation is not cost-optimal: the cost $pT_P = \Theta(n \log n)$ exceeds the serial work $\Theta(n)$.
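The result can be sanity-checked numerically. This is a minimal sketch, assuming T_P = n/p + 2 log2(p) with base-2 logarithms; because the calculus step ignores constant factors from the logarithm base, the code only scans power-of-two values of p rather than claiming an exact continuous minimizer. The function name and n = 1024 are illustrative.

```python
import math


def parallel_time(n, p):
    """Assumed model for adding n numbers on p PEs: n/p + 2*log2(p)."""
    return n / p + 2 * math.log2(p)


n = 1024
candidates = [2 ** k for k in range(0, 11)]   # p = 1, 2, 4, ..., 1024
best_time = min(parallel_time(n, p) for p in candidates)

time_at_half = parallel_time(n, n // 2)       # p = n/2, as derived above
cost_at_half = (n // 2) * time_at_half        # parallel cost p * T_P

print(f"best time over powers of two: {best_time}")
print(f"T_P at p = n/2: {time_at_half}  (2*log2(n) = {2 * math.log2(n)})")
print(f"cost at p = n/2: {cost_at_half}  (n*log2(n) = {n * math.log2(n)})")
```

At p = n/2 the time equals 2 log2(n), and the cost equals n log2(n), which grows faster than the serial work n.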


Asymptotic Analysis of Parallel Programs

Sorting a list of n numbers.
• The fastest serial programs: $\Theta(n \log n)$.
• Four parallel algorithms, A1, A2, A3, and A4

Asymptotic Analysis of Parallel Programs

Sorting a list of n numbers.

• If the metric is speed ($T_P$), algorithm A1 is the best, followed by A3, A4, and A2.
• In terms of efficiency (E), A2 and A4 are the best, followed by A3 and A1.
• In terms of cost ($pT_P$), algorithms A2 and A4 are cost-optimal; A1 and A3 are not.
• It is important to identify the analysis objectives and to use appropriate metrics!
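The comparison table for A1 to A4 did not survive in this copy of the slides. The sketch below assumes the processor counts and parallel times usually quoted for this example in Chapter 5 of the Grama et al. textbook cited in the references (A1: p = n², T_P = 1; A2: p = log n, T_P = n; A3: p = n, T_P = √n; A4: p = √n, T_P = √n log n). Treat those values as an assumption; the code only shows how the three rankings follow from them.

```python
import math


def metrics(n):
    """Speedup, efficiency and cost for four assumed sorting algorithms."""
    log_n = math.log2(n)
    serial_time = n * log_n                      # best serial sort, n log n
    algorithms = {                               # name: (p, T_P), assumed values
        "A1": (float(n) ** 2, 1.0),
        "A2": (log_n, float(n)),
        "A3": (float(n), math.sqrt(n)),
        "A4": (math.sqrt(n), math.sqrt(n) * log_n),
    }
    table = {}
    for name, (p, t_p) in algorithms.items():
        s = serial_time / t_p
        table[name] = {"T_P": t_p, "S": s, "E": s / p, "cost": p * t_p}
    return table


table = metrics(2 ** 16)
by_speed = sorted(table, key=lambda name: table[name]["T_P"])
print("fastest to slowest:", by_speed)
for name, row in table.items():
    print(f"{name}: E={row['E']:.4f}, cost={row['cost']:.0f}")
```

Under these assumed values the ordering by $T_P$ is A1, A3, A4, A2, while only A2 and A4 reach efficiency 1 and a cost equal to the serial n log n.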


Scaled Speedup: Example

n × n matrix multiplication

• The serial execution time: $t_c n^3$.
• The parallel execution time: $T_P = t_c \frac{n^3}{p} + t_s \log p + 2 t_w \frac{n^2}{\sqrt{p}}$
• Speedup: $S = \frac{t_c n^3}{t_c \frac{n^3}{p} + t_s \log p + 2 t_w \frac{n^2}{\sqrt{p}}}$

Scaled Speedup: Example (continued)

Consider memory-constrained scaled speedup.
• We have memory complexity $m = \Theta(n^2) = \Theta(p)$, or $n^2 = c \times p$.
• At this growth rate, scaled speedup S’ is given by: $S' = \frac{t_c (c \times p)^{1.5}}{t_c \frac{(c \times p)^{1.5}}{p} + t_s \log p + 2 t_w \frac{c \times p}{\sqrt{p}}} = O(p)$
• Note that this is scalable.

Scaled Speedup: Example (continued)

Consider time-constrained scaled speedup.
• We have $T_P = O(1) = O(n^3/p)$, or $n^3 = c \times p$.
• Time-constrained speedup S’’ is given by: $S'' = \frac{t_c (c \times p)}{t_c \frac{c \times p}{p} + t_s \log p + 2 t_w \frac{(c \times p)^{2/3}}{\sqrt{p}}} = O(p^{5/6})$
• Memory-constrained scaling yields better performance.
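The two scaling rules can be compared numerically. The sketch below assumes the textbook model for this example, T_P = t_c n³/p + t_s log2(p) + 2 t_w n²/√p, and uses made-up machine constants (t_c, t_s, t_w) and growth constants chosen so that both rules start from n = 64 at p = 1. All names and values are hypothetical.

```python
import math

T_C, T_S, T_W = 1.0, 25.0, 4.0   # hypothetical compute, startup, per-word costs


def speedup(n, p):
    """S = serial time / parallel time under the assumed matrix-multiply model."""
    serial = T_C * n ** 3
    parallel = (T_C * n ** 3 / p
                + T_S * math.log2(p)
                + 2 * T_W * n ** 2 / math.sqrt(p))
    return serial / parallel


def memory_constrained(p, c=64.0 ** 2):
    """Grow the problem so that n^2 = c * p (memory per PE stays constant)."""
    return speedup(math.sqrt(c * p), p)


def time_constrained(p, c=64.0 ** 3):
    """Grow the problem so that n^3 = c * p (compute time per PE stays constant)."""
    return speedup((c * p) ** (1.0 / 3.0), p)


for p in (1, 16, 256, 1024, 4096):
    s_mem = memory_constrained(p)
    s_time = time_constrained(p)
    print(f"p={p}: S'={s_mem:.1f} (S'/p={s_mem / p:.3f}), "
          f"S''={s_time:.1f} (S''/p={s_time / p:.3f})")
```

With these constants S’/p settles near a constant, consistent with O(p), while S’’/p keeps shrinking, consistent with a sublinear O(p^{5/6}) growth.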

References

• Adapted from slides “Principles of Parallel Algorithm Design” by Ananth Grama
• “Analytical Modeling of Parallel Systems”, Chapter 5 in Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar, “Introduction to Parallel Computing”, Addison Wesley, 2003.
• Grama, Ananth Y.; Gupta, A.; Kumar, V., “Isoefficiency: measuring the scalability of parallel algorithms and architectures,” in Parallel & Distributed Technology: Systems & Applications, IEEE, vol. 1, no. 3, pp. 12-21, Aug. 1993, doi: 10.1109/88.242438, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=242438&isnumber=6234