Neil Palmer, Partner, SunGard Consulting Services
Michael Heydt, Senior Manager, SunGard Consulting Services
http://thecloudarchitect.com

Using Azure for Computationally Intensive Workloads


Agenda
• Background
• Cloud Architecture Overview
• Functional Languages & F#
• Problem Statement
• Demo
• Solution Architectures & Outcomes
• Lessons Learned
• Economics
• Conclusions

Background
• This is a period of intense change with many disruptive technologies:
  • Cloud computing
  • Functional languages
  • Large data set processing
• These have a large impact on applications, since storage and compute power are practically unlimited
• This leads to questions:
  • What types of applications are technically feasible?
  • And economically feasible?

Cloud Computing Stack
• Applications (Salesforce)
• Services (Google Maps)
• Platforms (Azure)
• Infrastructure (EC2) and Storage (S3)

Cloud Computing Overview
Common components for cloud computing:
• CPU and data managed by another provider
• Redundant storage for availability
• Dynamically allocated CPU based upon demand
• Pay-for-use, pay-as-you-go model (no capital investment)
• Additional application services provided to systems in the cloud (e.g. SQL services, payment processing…)
• Operational support for failover

Cloud Benefits

                               Internal IT   Managed services   The cloud
Capital investment             Significant   Moderate           Negligible
On-going costs                 Moderate      Significant        Based on usage
Provisioning time              Significant   Moderate           None
Scalability                    Limited       Moderate           Flexible
Staff expertise requirements   Significant   Limited            Moderate
Reliability                    Varies        High               Moderate to high

Applications suitable for the cloud
• Web applications
• Load testing
• Monte Carlo simulation
• Financial portfolio analysis
• Energy demand forecasting
• Algorithmic trading
• Parallel/concurrent processing
• Processing pipelines

Functional Programming
What is functional programming?
• Facilitates parallelism
  • Treats computation as the evaluation of functions
  • Avoids state and mutable data, which greatly simplifies parallel execution
• The runtime handles parallelism, rather than explicit coding
• Takes better advantage of the growing # of cores available to you
• Functional decomposition may be a better answer to leveraging the architecture of tomorrow
  • Beginning to think this way needs to start now
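To illustrate why avoiding state simplifies parallel execution, here is a minimal sketch in Python (the talk itself used F# and .NET, so this is a stand-in rather than the presenters' code). The names `total_return` and `parallel_map` are hypothetical. Because the function reads only its argument and mutates nothing, the same call can be handed to a sequential `map` or to an executor without any locking.

```python
from concurrent.futures import ThreadPoolExecutor


def total_return(prices):
    """Pure function: the result depends only on the argument."""
    return prices[-1] / prices[0] - 1.0


def parallel_map(fn, items, max_workers=4):
    """Apply fn to every item; the executor decides how work is scheduled."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, like the built-in map.
        return list(pool.map(fn, items))


if __name__ == "__main__":
    series = [[100.0, 110.0], [50.0, 45.0], [20.0, 20.0]]
    print(parallel_map(total_return, series))
```

A thread pool keeps the sketch easy to run anywhere; for CPU-bound work in CPython a process pool would be the usual substitute, and the pure function would not need to change.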

Problem Statement
• Compute historic price volatility correlations for stocks over many years.
• Goal is to test the usefulness of cloud computing for:
  • Manipulating a large data set
  • Handling a solution that would otherwise be slow & expensive with dedicated servers
• To determine:
  • What is the scalability of cloud architectures?
  • Cost effectiveness of scale out (many roles in the cloud) vs. scale up (high CPU, threading)
  • How does the solution differ from a non-cloud solution?
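The slides do not say how volatility or correlation was defined. As a rough sketch, assume volatility is the sample standard deviation of log returns over a rolling window, and correlation is the Pearson coefficient between two stocks' volatility series. All function names below are hypothetical.

```python
import math
from statistics import mean, stdev


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def volatility(prices):
    """Sample standard deviation of log returns (not annualized)."""
    return stdev(log_returns(prices))


def rolling_volatility(prices, window):
    """Volatility of every run of `window` consecutive prices (window >= 3)."""
    return [volatility(prices[i:i + window])
            for i in range(len(prices) - window + 1)]


def correlation(xs, ys):
    """Pearson correlation of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Each stock's volatility series can be computed independently, and each pair of stocks can be correlated independently, which is what makes this kind of problem a candidate for scale out.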

Cloud Application Architecture

Cloud Application Architecture
Horsepower
• In Azure, these are the web and worker roles

Cloud Application Architecture
Ingress: how to talk to the cloud
• In Azure:
  • HTTP/S to web roles
  • .NET Service Bus and queues to worker roles

Cloud Application Architecture
Cloud Storage Services
• Fundamentally three types:
  • Tables
  • Blobs
  • Volumes (EC2)

Cloud Application Architecture
Intra-cloud communications
• Queues
• .NET Service Bus

Cloud Application Architecture
Cloud provided services
• Value add from your cloud provider
• You can also use these from other providers
  • Azure using EC2 nodes, AWS payment services
  • Google API using REST interfaces into Azure

Solution Architecture
Key points:
• Scalability is goal #1
• Partitioned data set
• Multiple workers
• Work item based
• Competitive processing
• All asynchronous
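The key points above can be sketched as a partitioned data set feeding a shared queue that several workers compete to drain. This is an in-process Python illustration only, with threads standing in for worker roles and `queue.Queue` standing in for a cloud queue; the names `partition`, `worker` and `run` are hypothetical.

```python
import queue
import threading


def partition(tickers, block_size):
    """Split the data set into independent work items."""
    return [tickers[i:i + block_size]
            for i in range(0, len(tickers), block_size)]


def worker(name, work_queue, results, lock):
    """Keep taking whichever work item is next until none are left."""
    while True:
        try:
            block = work_queue.get_nowait()
        except queue.Empty:
            return
        # Placeholder for the real computation on this block.
        with lock:
            results.append((name, block))


def run(tickers, block_size, worker_count):
    work_queue = queue.Queue()
    for block in partition(tickers, block_size):
        work_queue.put(block)
    results, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker,
                                args=(f"worker-{i}", work_queue, results, lock))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

No worker is assigned a fixed share of the data; a faster worker simply takes more items, so adding workers shortens the run without changing the code.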

Demo
• Show the portal deployments
• Explain web and worker roles
• Show client doing volatilities
  • Explain parts of the application and interop with Azure
  • Show/delete existing blobs
  • Calculate some volatilities
  • Show messages received, blobs being created
  • Show data in one of the blobs

Solution Architecture
• Architecture was evolutionary
  • Started with EC2, evolved into a hybrid Azure/EC2, and then full Azure
  • This was valuable for seeing the differences between cloud platforms

Solution Architecture Version 1.0
100% EC2

Version 1.0 Outcome
• Linear scalability: double the nodes, half the time to complete
• Took time to image and manage AMIs
• Feels like you manage the servers in their entirety
• No automatic failover or restarts provided by EC2
• Bandwidth costs: must watch them
• Security is your own responsibility
• Table storage was considered too limited in max size to use
• Cost effective for the problem, but minimum billing is per hour, so that can burn you

Solution Architecture Version 2.0
100% Azure with Table Data

Version 2.0 Outcome
• Table data did not perform well
• Gave up before even getting historical data into the cloud
  • REST performance for table data was ~1000 puts per 50 seconds
  • 4.7 million historical price points to load
  • Multi-threading did not help
• Scrapped and went for a blob model with ticker data blobs (version 3.0)
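The two figures on this slide imply a load time that explains why the table approach was dropped. A back-of-the-envelope sketch, assuming one put per price point and a steady rate (both assumptions, not stated in the slides):

```python
PRICE_POINTS = 4_700_000   # historical price points to load (from the slide)
PUTS_PER_BATCH = 1000      # observed REST throughput: ~1000 puts ...
SECONDS_PER_BATCH = 50     # ... per 50 seconds (from the slide)


def estimated_load_seconds(points, puts, seconds):
    """Time to load `points` records at a steady `puts` per `seconds`."""
    return points * seconds / puts


load_seconds = estimated_load_seconds(PRICE_POINTS, PUTS_PER_BATCH,
                                      SECONDS_PER_BATCH)
print(f"{load_seconds:,.0f} s = {load_seconds / 3600:.1f} h")
```

At 20 puts per second the load works out to 235,000 seconds, roughly 65 hours, before any computation starts.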

Solution Architecture Version 2.5
Azure and EC2 Hybrid

Version 2.5 Outcome
• Easy port of .NET code to Azure from EC2
  • Just pointed data layer to existing EC2 database image
• Execution time about the same as with EC2, even with the database across the Internet
• Not as much headache since you are not managing as many virtual servers
• But is having an RDBMS in the cloud “cloudy”?

Solution Architecture Version 3.0
100% Azure

Version 3.0 Outcome
• Same benefits as 2.5
• No need for SQL Server
• Blobs solved REST problems
• Migration of data to Azure blobs did take some work and redesign
  • Physical partitioning of data into blobs
  • Indexes to data also stored in blobs
  • Binary serialization of objects into blobs
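The three redesign steps can be sketched as follows, using a plain Python dict as a stand-in for blob storage and `pickle` as a stand-in for .NET binary serialization. The blob naming scheme, the index layout and the sample rows are all made up for illustration; the slides do not describe the actual format.

```python
import pickle


def build_blobs(prices_by_ticker):
    """One serialized blob per ticker, plus one blob holding the index."""
    store = {}   # blob name -> bytes (hypothetical stand-in for blob storage)
    index = {}
    for ticker, rows in prices_by_ticker.items():
        name = f"prices/{ticker}"
        store[name] = pickle.dumps(rows)   # binary serialization
        index[ticker] = {"blob": name, "count": len(rows)}
    store["index"] = pickle.dumps(index)   # the index is itself a blob
    return store


def read_index(store):
    return pickle.loads(store["index"])


def load_prices(store, ticker):
    """Look the ticker up in the index, then fetch and decode its blob."""
    entry = read_index(store)[ticker]
    return pickle.loads(store[entry["blob"]])
```

The practical effect is one upload per ticker instead of one REST call per price point, which is how partitioning into blobs sidesteps the per-record cost.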

Code Review
• Let's look at some code
  • WCF service API
  • Silverlight service bridge
  • Web role: looks just like ASP.NET and Silverlight
  • Silverlight: show what happens when you press “start”
  • WCF service: show hooks to Azure (storage accounts)
  • Worker role: show how processing is done

Potential Enhancements
• Table data storage for statuses
• .NET Service Bus integration for monitoring and communications to non-cloud systems
• Multicast .NET Service Bus for broadcasting to all worker roles
• Eventual use of SQL data services in cloud
• Build F# libraries for quantitative analysis
  • Build locally, test against local data
  • Deploy to cloud
  • Connect as required

Lessons Learned
Queues and Asynchronicity
• Queues work on a different model than MSMQ
  • Retrieve with a delete window and then explicitly delete
  • Polling model (no blocking)
• Get used to asynchronous processing
  • Scalability is obtained through asynchronous model
  • Queue based communication between web and worker roles
  • Asynchronous communications from Silverlight to Azure
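The "retrieve with a delete window, then explicitly delete" model can be shown with a small in-memory simulation. This is not the Azure queue API; `VisibilityQueue` and its methods are hypothetical and exist only to show the behaviour: a retrieved message is hidden for a while, and it reappears for another worker unless it is deleted in time.

```python
import itertools
import time


class VisibilityQueue:
    """In-memory model of a queue with a delete window (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._messages = {}            # id -> [body, visible_at]
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages[next(self._ids)] = [body, float("-inf")]

    def get(self, visibility_timeout):
        """Return (id, body) of a visible message and hide it, or None."""
        now = self._clock()
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + visibility_timeout
                return msg_id, entry[0]
        return None                    # polling model: never blocks

    def delete(self, msg_id):
        """Explicit delete once the work item has been fully processed."""
        self._messages.pop(msg_id, None)
```

A worker that crashes mid-item never calls `delete`, so the item becomes visible again after the window and another worker picks it up. The cost is that processing has to tolerate an item being handled twice.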

Lessons Learned
EC2 vs Azure
• Dynamic allocation in Azure is not as good as with EC2
• EC2 billing is by the hour, so not too good for quick needs
• .NET code was very portable between EC2 and Azure
• Watch the bandwidth between storage zones
• Management is difficult in both
  • But Azure management is easier than EC2
  • Azure monitors your roles and restarts them (EC2 doesn’t)
• EC2 feels a lot heavier than Azure
  • Seems great for appliances
  • But if you are doing .NET, best to go Azure

Lessons Learned
Data
• Getting data into the cloud can be a lot of work
• REST is not performant for large #’s of small records
• Designing data for non-relational storage is cumbersome and requires a change of mindset

Lessons Learned
Programming
• URLs for WCF services must be rewritten in the various environments
• .NET code for web and workers is very similar to normal .NET code
• Lack of full trust can be a pain; many libraries caused failures
  • F# needs to be linked into the solution due to not being available in the Azure GAC/full trust
• Can’t talk directly to Azure easily from Silverlight
• Debugging is difficult: logs, writing to queues, or to SQL

Things to look for in the future
• Concern about Azure pricing for unutilized workers
• Dynamic/API based allocation of roles
• Management APIs and user interfaces
• Caps on # of instances/roles available

Economics
EC2
• Ran a subset of the overall task
• Used five instances as baseline
• 100 units of work
  • 50 volatility blocks of work
  • 50 correlation blocks of work
  • Each instance handled 10 blocks of work for both volatility and correlation
• Volatilities took 6.9 minutes per block
• Correlations took 4.6 minutes per block

Economics: Subset of Solution

Calculation            Time (mins/block)   # Blocks   Total Time (mins)
Volatility             6.9                 10         69
Correlation            4.6                 10         46
Total Time                                            115

Cost/Billable Hour     $0.125
Cost/Node              $0.25
Total Cost (5 Nodes)   $1.25

Economics: Full Solution

Calculation            Time (mins/block)   # Blocks   Total Time (mins)
Volatility             6.9                 10         69
Correlation            4.6                 500        2300
Total Time                                            2369

Cost/Billable Hour     $0.125
Cost/Node              $5.00
Total Cost (5 Nodes)   $25.00

Economics: Single System

Calculation            Time (mins/block)   # Blocks   Total Time (mins)
Volatility             6.9                 50         345
Correlation            4.6                 2500       11,500
Total Time (mins)                                     11,845
Total Time (days)                                     8.23

Economics: 2500 Nodes

Calculation               Time (mins/block)   # Blocks   Total Time (mins)
Volatility                6.9                 1          6.9
Correlation               4.6                 1          4.6
Total Time                                               11.5

Cost/Billable Hour        $0.125
Cost/Node                 $0.125
Total Cost (2500 Nodes)   $312.50
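The cost figures in these tables all follow from one rule: each node is billed for whole hours, rounded up. The sketch below reproduces them, assuming blocks are split evenly across nodes and that a node runs its volatility blocks and then its correlation blocks. The per-block times and hourly rate come from the slides; the function names are hypothetical. Times are held in whole seconds to avoid floating-point rounding.

```python
import math

VOL_SECONDS = 414       # 6.9 minutes per volatility block
CORR_SECONDS = 276      # 4.6 minutes per correlation block
RATE_PER_HOUR = 0.125   # dollars per billable instance-hour


def run_seconds(nodes, vol_blocks, corr_blocks):
    """Wall-clock seconds when blocks are split evenly across nodes."""
    per_node_vol = math.ceil(vol_blocks / nodes)
    per_node_corr = math.ceil(corr_blocks / nodes)
    return per_node_vol * VOL_SECONDS + per_node_corr * CORR_SECONDS


def total_cost(nodes, vol_blocks, corr_blocks):
    """Every node is billed for whole hours, rounded up."""
    seconds = run_seconds(nodes, vol_blocks, corr_blocks)
    return nodes * math.ceil(seconds / 3600) * RATE_PER_HOUR


if __name__ == "__main__":
    for label, nodes, vol, corr in [("subset, 5 nodes", 5, 50, 50),
                                    ("full, 5 nodes", 5, 50, 2500),
                                    ("full, 2500 nodes", 2500, 50, 2500)]:
        minutes = run_seconds(nodes, vol, corr) / 60
        print(f"{label}: {minutes} min, ${total_cost(nodes, vol, corr):.2f}")
```

With 2500 nodes each node works for 11.5 minutes but pays for a full hour, which is where the fixed cost component of hourly billing comes from.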

Economics
• There is a crossover of speed vs. cost:
  • $312.50 (quickest) versus $25.00 (most cost effective)
• Minimum billing hour granularity of EC2 introduces a fixed cost component
• Isn’t necessarily cheaper compared to the cost of ‘fixed’ hardware over a long period of time
• Compute time is not the only cost to take into account
  • Data transfer in & out of the cloud is equally costly

Conclusions
• What is the scalability of cloud architecture?
  • This problem was linearly scalable; double the nodes, roughly half the time
• Cost structure: is it economical?
  • The numbers look good compared to investing in capital and humans
  • Make sure you don’t get billed for non-utilized time
  • Bandwidth still costs you, and could be significant
  • Watch for minimum billing times
• How does the solution differ from a non-cloud solution?
  • With Azure, it’s very similar coding (more in lessons learned)
  • But you must learn to partition the problem set for scale out

Q&A

The conversation doesn’t stop here!
• Sign in on www.entdevcon.com not only to watch the sessions, but also to discuss the content!
• Create your own blogs, wikis, conversations and special interest groups, and watch content, all for FREE!
• Make industry connections with your peers and Microsoft experts online!
