Towards autonomic e-science ecosystems
Cécile Germain-Renaud
Laboratoire de Recherche en Informatique, Université Paris-Sud, CNRS, INRIA
Outline
• Computational ecosystems
• The Clouds
• Challenges
• Autonomics
30/01/11 ASSYST meeting: Opening the Cloud
The requirements of e-science
“Cyberinfrastructure integrates hardware for computing, data and networks, digitally-enabled sensors, observatories and experimental facilities, and an interoperable suite of software and middleware services and tools…”
NSF’s Cyberinfrastructure Vision for 21st Century Discovery
An old dream
« A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high computational capabilities. » I. Foster, C. Kesselman, The Grid, 1998
UCLA press release on the creation of ARPANET, 1969
Grids are a reality
• Several large deployments in routine production
  • UK National Grid Service (NGS)
  • European Grid Infrastructure (EGI-EGEE)
  • TeraGrid
  • Open Science Grid (OSG)
  • DEISA
  • …
The EGEE/EGI grid
LHC is the
• Largest (26 km),
• Fastest (14 TeV),
• Coldest (1.9 K),
• Emptiest (10⁻¹³ atm)
machine.
EGEE/EGI is the
• Largest (40K CPUs),
• Most distributed (250 sites),
• Most used (300K jobs/day)
computer system.
Atlas Collaboration (one in four)
• 3000 scientists
• 38 countries
• 174 universities and labs
Cyberinfrastructure => Cyber-Ecosystems
21st century Science and Engineering: New Paradigms & Practices
• Fundamentally data-driven/data intensive
• Fundamentally collaborative
Source: M. Parashar eSI visitor Seminar /www.nesc.ac.uk/action/esi/
Unprecedented Opportunities
For science and engineering
• Knowledge-based, information/data-driven, context/content-aware, computationally intensive, pervasive, ..
• Holistic applications: integrate on-demand computations, experiments, observations, data, …
• To manage, control, predict, adapt, optimize, …
• New paradigms and practices for existing goals or new thinking
e-science ecosystems
• A major requirement is pervasiveness: on-demand, integrated, transparent
• Continuity, not revolution: we must learn from the experience
Experience with the EGEE/EGI grid
[Figure: EGEE CPU usage (%, logarithmic axis from 0.10% to 100.00%) by discipline (AA, CC, ES, F, HEP, INF, LS, MV, OTH, UNK), for years Y0, Y1, Y2]
Source: Report on Utilization of EGEE support services and infrastructure, May 2010
• Organized scientific communities are committed to globalized homogeneous systems. Individualized science is not (yet?). Heterogeneous high-level systems are still in the design state.
Outline
• Computational ecosystems
• The Clouds
• Challenges
• Autonomics
A more pervasive technology
Source: William Vambenepe's Keynote at Cloud Connect 2010 http://stage.vambenepe.com/archives/1355
SaaS: Software as a Service
How to deliver/consume/manage such services
• Cloud provides increased infrastructure flexibility, excellent but not the bottleneck
• Application or user-oriented flexibility
  • Control and orchestration of the holistic applications across specialized and heterogeneous components, whether local, in a grid or in a cloud
  • Agility as the capacity to reconfigure, reorganize the internal processes
« The bottom line is that any distinction between SaaS and POWA (Plain Old Web Applications) is at worst arbitrary and at best concerned with the business relationship between the provider and the consumer rather than technical aspects of the application. » Same source
The Grid experience
« Grids are defined by coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The sharing is necessarily highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs » Ian Foster, 2000
« A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high computational capabilities. » I. Foster, C. Kesselman, The Grid, 1998
Consumers
Different users and requirements across and within the collaborations
Providers
What about GPUs?
• A new digital divide: HPC and personal computers embarking on GPUs, business and e-science on clouds?
• Grids might be amenable to GPUs; virtualized GPUs are a nascent research area/technology
The message
• DEFINITELY NOT “Cloud is a buzzword”
• A technology, not a silver bullet
• Both e-science and business require
  • Efficient integration of large datasets with computing
  • Pervasiveness
• e-science has specific requirements
  • Organized sharing: data and funding – technical and political issues
  • Performance: not always, but a strong cultural bias/feature.
Outline
• Computational ecosystems
• The Clouds
• Challenges
• Autonomics
The complexity crisis
Source: IDC 2008, retrieved from http://www.vmware.com/files/pdf/Virtualization-application-based-cost-model-WP-EN.pdf
The complexity crisis in action
Source: http://www.teach-ict.com/news/news_stories/news_computer_failures.htm
Implementing Pervasiveness and Sharing
Multi-scale feedbacks
Configuring the middleware
Source: James Casey’s talk at EGEE’09
Running the middleware
gLite prediction error for queuing time
Users' behavior
Users / file groups / hosts with AVIZ GraphDice
Complexity AND uncertainty
• As a distributed system
  • Components and communications come and go
    • For dynamic (P2P), but for managed systems as well
  • CAP (Brewer’s) theorem: at most two of Consistency, Availability, Partition tolerance can be guaranteed
• As a dynamic(al) system
  • Entities change behavior as an effect of unexpected feedbacks, emergent behavior
  • Self-organized criticality, minority games, ...
• Lack of complete and common knowledge – Information uncertainty
  • Monitoring is distributed too
  • Resolution and calibration
  • Semantics and ontologies
Complexity AND uncertainty
For applications too
• Opportunistic behaviors
  • Space-time, accuracy, and more generally objective adaptivity
  • Context-awareness as required by a CAP-prone environment
• Dynamic and complex coupling and interactions
  • multi-physics, multi-model, multi-resolution, …
• Trust in data and software
  • Not only for P2P systems
Challenges Summary
• Current levels of scale, complexity and dynamism make it infeasible for humans to effectively manage and control systems and applications
• Computing ecosystems, with their very large numbers of hardware and software components interacting with very large data, are complex systems that are currently very difficult to program
• Computing ecosystems are difficult to manage because of the heterogeneity of workflows, datasets and operating environment.
• The ability of an application to self-adapt by incorporating dynamic inputs along its execution needs to be formulated through a general and principled programming model
Outline
• Computational ecosystems
• The Clouds
• Challenges
• Autonomics
What is Autonomic Computing?
“Computing systems that manage themselves in accordance with high-level objectives from humans” Kephart and Chess, The Vision of Autonomic Computing, IEEE Computer, 2003
Milestones
• IBM Vision and Manifesto, 2001
• J. O. Kephart and D. M. Chess. The vision of autonomic computing. IEEE Computer, 36(1), 2003
• IEEE International Conference on Autonomic Computing series, since 2004
• IEEE Task Force on Autonomous and Autonomic Systems, 2006
• ECML PKDD 2006 Tutorial/Workshop: Autonomic Computing: A New Challenge for Machine Learning, I. Rish and G. Tesauro
• ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2006
• Autonomic Computing: Concepts, Infrastructure and Applications, M. Parashar and S. Hariri (Ed.), CRC Press, 2006
• The NSF Center for Autonomic Computing, 2008
• International Journal of Autonomic Computing (IJAC), Inderscience Publishers, 2009
• Panel at the 1st GMAC workshop: The convergence of Grids, Clouds and Autonomics, 2009
Self-management
• Self-Configuration: Automated configuration of components, systems according to high-level policies; rest of system adjusts seamlessly.
• Self-Healing: Automated detection, diagnosis, and repair of localized software/hardware problems.
• Self-Optimization: Automatic and continual adaptive tuning of hundreds of parameters (database params, server params, …) affecting performance & efficiency.
• Self-Protection: Automated defense against malicious attacks or cascading failures; use early warning to anticipate and prevent system-wide failures.
The Autonomic Nervous System
• The most sophisticated example of autonomic behavior.
• Regulates and maintains homeostasis: maintains structure and functions by means of a multiplicity of dynamic equilibriums that are rigorously controlled by interdependent regulation mechanisms.
• Not all parameters have the same urgency; essential parameters are monitored more closely.
Ashby’s Ultrastable System
Source: “Autonomic Computing: An Overview,” M. Parashar and S. Hariri, UPP 2004, Mont Saint-Michel, France, Editors: J.-P. Banâtre et al. LNCS, Springer Verlag, Vol. 3566, pp. 247-259, 2005.
A control theory vision
And/or Self-awareness
The MAPE-K loop
[Diagram: Managed Element (ES) and Autonomic Manager (ES); the manager cycles through Monitor, Analyze, Plan, Execute around shared Knowledge]
Monitor inputs:
• Environment sensors
• Network instrumentation
• Users context
• Application requirements
Result: high-dimensional, high-volume ‘raw’ data
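The Monitor, Analyze, Plan, Execute cycle around shared Knowledge can be sketched in a few lines. This is only an illustration under stated assumptions: the managed element is a hypothetical pool of workers serving a fixed demand, the high-level objective is a hypothetical target load per worker, and all class and parameter names are invented for the example rather than taken from any autonomic framework.

```python
from dataclasses import dataclass, field


@dataclass
class Knowledge:
    """Shared knowledge: the high-level objective and the observation history."""
    target_load: float = 0.7              # hypothetical objective: load per worker
    history: list = field(default_factory=list)


class ManagedElement:
    """Toy managed element (assumption): a pool of workers serving a fixed demand."""

    def __init__(self, workers: int, demand: float):
        self.workers = workers
        self.demand = demand

    def sense(self) -> float:
        # Sensor: current load per worker.
        return self.demand / self.workers

    def actuate(self, delta: int) -> None:
        # Effector: add or remove workers, never dropping below one.
        self.workers = max(1, self.workers + delta)


class AutonomicManager:
    def __init__(self, element: ManagedElement, knowledge: Knowledge):
        self.element = element
        self.k = knowledge

    def monitor(self) -> float:
        load = self.element.sense()
        self.k.history.append(load)
        return load

    def analyze(self, load: float) -> float:
        # Positive error means the element is overloaded.
        return load - self.k.target_load

    def plan(self, error: float, tolerance: float = 0.05) -> int:
        if error > tolerance:
            return +1
        if error < -tolerance:
            return -1
        return 0

    def execute(self, delta: int) -> None:
        if delta:
            self.element.actuate(delta)

    def step(self) -> None:
        self.execute(self.plan(self.analyze(self.monitor())))


if __name__ == "__main__":
    element = ManagedElement(workers=2, demand=7.0)
    manager = AutonomicManager(element, Knowledge(target_load=0.7))
    for _ in range(20):
        manager.step()
    print(element.workers, manager.k.history[-1])
```

The planner here is a fixed threshold rule; the following slides discuss replacing the Analyze and Plan stages with learned models.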
The MAPE-K loop
State-Space and Data Abstraction
Streaming:
• On-line data mining, clustering, ..
• Dimensionality reduction
• Active learning
• Ontological inference
From high-dimensional, high-volume ‘raw’ data to compressed, ‘informative’ data
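As one possible illustration of streaming abstraction, the sketch below clusters monitoring records on-line with sequential k-means: each record is seen once and only the centroids and their counts are kept. The choice of sequential k-means, the class name, and the sample points are assumptions made for the example; the slides do not prescribe a specific algorithm.

```python
class OnlineKMeans:
    """Sequential (one-pass) k-means: keeps k centroids and per-centroid counts.

    Assumption for this sketch: the first k records seed the centroids.
    """

    def __init__(self, k: int):
        self.k = k
        self.centroids = []   # list of coordinate lists
        self.counts = []      # number of records assigned to each centroid

    def update(self, x) -> int:
        """Consume one record and return the index of the cluster it joined."""
        x = [float(v) for v in x]
        if len(self.centroids) < self.k:
            self.centroids.append(x)
            self.counts.append(1)
            return len(self.centroids) - 1
        # Nearest centroid by squared Euclidean distance.
        j = min(
            range(self.k),
            key=lambda i: sum((c - v) ** 2 for c, v in zip(self.centroids[i], x)),
        )
        self.counts[j] += 1
        step = 1.0 / self.counts[j]
        # Incremental mean: move the centroid towards the new record.
        self.centroids[j] = [c + step * (v - c) for c, v in zip(self.centroids[j], x)]
        return j


if __name__ == "__main__":
    model = OnlineKMeans(k=2)
    # Hypothetical two-dimensional monitoring records.
    for record in [(0, 0), (10, 10), (1, 1), (9, 9), (0, 2), (10, 8)]:
        model.update(record)
    print(model.centroids, model.counts)
```

The raw stream is thus reduced to k centroids and counts, which is the kind of compressed, informative summary the Analyze stage can consume.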
The MAPE-K loop
Learn predictive models
• Classification, regression, time series, MCMC
Decision-making
• Exploration vs Exploitation
• Game theory, Risk analysis
• Reinforcement Learning, bandits
Input: compressed, ‘informative’ data
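The exploration versus exploitation trade-off mentioned above can be made concrete with a multi-armed bandit. The sketch below uses the UCB1 rule, picked here only as a well-known example; the arms stand for hypothetical candidate configurations, and the `pull` callback and reward probabilities are assumptions of the example.

```python
import math
import random


def ucb1(pull, n_arms: int, horizon: int):
    """Run UCB1 for `horizon` rounds.

    `pull(arm)` must return a reward in [0, 1]. Returns (counts, means):
    how often each arm was played and its empirical mean reward.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once to initialise the estimates
        else:
            # Exploit the empirical mean, explore through the confidence bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    success_prob = [0.2, 0.5, 0.8]  # hypothetical quality of three configurations
    counts, means = ucb1(
        lambda a: 1.0 if rng.random() < success_prob[a] else 0.0,
        n_arms=3,
        horizon=2000,
    )
    print(counts, [round(m, 2) for m in means])
```

The confidence bonus shrinks as an arm is played more often, so poorly performing configurations are tried less and less without being abandoned entirely.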
The MAPE-K loop
Knowledge-based, e.g. ontologies, a-priori models, intelligent initialisation
Or
Tabula-rasa Knowledge
- Avoids knowledge-intensive model building
Criteria
- Independent Knowledge and learning
- Theoretical guarantees of improvement
Technical issues: example for RL
Need enhancement to vanilla Reinforcement Learning
• Observation uncertainty
• Historical dependencies may exist: MDP might not be an exact model
• Convergence not guaranteed
• Lack of stationarity
• Continuous state-action space requires approximations
• Local vs global learning, because of curse of dimensionality
• Exploration penalties might be excessive
An in-depth exploration of these issues: Gerald Tesauro et al. On the Use of Hybrid Reinforcement Learning for Autonomic Resource Allocation. Cluster Computing, 10(3):287-99, 2007.
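For reference, the "vanilla" baseline that these issues refer to can be taken to be tabular Q-learning; that reading, and everything in the sketch below (function name, toy one-state problem, parameter values), is an assumption of this example rather than something stated on the slide. The update presumes a finite state-action table, exact observations and a stationary MDP, which are precisely the hypotheses questioned above.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.5):
    """One tabular Q-learning step; returns the temporal-difference error.

    Q maps (state, action) pairs to values and is updated in place.
    """
    best_next = max(Q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error
    return td_error


if __name__ == "__main__":
    # Hypothetical one-state problem: action 0 pays 1, action 1 pays 0.
    Q = defaultdict(float)
    actions = [0, 1]
    for _ in range(200):
        q_update(Q, "s", 0, 1.0, "s", actions)
        q_update(Q, "s", 1, 0.0, "s", actions)
    print(Q[("s", 0)], Q[("s", 1)])
```

On this stationary toy problem the table settles at the fixed point of the Bellman equation; with noisy observations, drifting rewards or continuous states, none of that is guaranteed, which is the point of the list above.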
Transversal issues
• Limits
  • Biological self-* (awareness, healing…) may/will ultimately fail, plus unforeseen threats
• Overheads
  • Designing, programming, executing, provisioning
• Validation
  • Extreme events: revisit traditional criteria, e.g. RMSE
  • Benchmarking under uncertainty
• Availability of reference datasets
www.grid-observatory.org
C. Germain-Renaud et al. The Grid Observatory, to appear in IEEE/ACM CCGRID'11
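A small numerical illustration of why extreme events call for revisiting RMSE: with one extreme observation among otherwise well-predicted values, RMSE is dominated by that single point while a rank-based criterion is not. The data below are hypothetical queuing times in seconds, and the median absolute error is just one possible alternative, chosen for this example.

```python
import math
import statistics


def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def median_abs_error(pred, obs):
    """Median of the absolute errors; insensitive to a few extreme values."""
    return statistics.median(abs(p - o) for p, o in zip(pred, obs))


if __name__ == "__main__":
    observed = [10.0] * 9 + [1000.0]   # hypothetical: one job waits far longer
    predicted = [12.0] * 10            # the predictor is off by 2 s on typical jobs
    print(rmse(predicted, observed), median_abs_error(predicted, observed))
```

Neither number alone describes the predictor: one hides the typical accuracy, the other hides the extreme event, which is why the validation criterion has to be chosen with the intended use in mind.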