Energy Sciences Network (ESnet), Lawrence Berkeley National Lab
Oct 2017
Presented at Internet2 TechEx 2017
Outline
• ML in general
• ML in network research
– Literature review of research from 2010 to Sept 2017 on ML algorithms in WANs
– Common areas, data involved, what problems were solved
• Road ahead (unexplored areas)
AI, ML, DL – What's the Difference?
• Turing, "Can Machines Think?" – Turing Test: exhibit human-like intelligence
• Machine learning is a collection of algorithms that can help achieve AI
• e.g. spam filters, HR hiring, etc.
• Deep learning is one of these ML techniques
• Recent advances due to GPU and HPC processing (previously very slow, too much data, needs training to work)
• Mainly for image and speech recognition – commercial apps
Courtesy Nvidia Blog
AI Tree (example techniques): only a subset are ML algorithms
AI
Optimization techniques
Many more…
Expert systems
Fuzzy systems
Neural networks
Evolutionary algorithms (genetic algorithms, evolutionary strategies, etc.)
Swarm intelligence (ant colony, etc.)
Deep belief networks
Deep Boltzmann networks
Convolutional networks
Stacked autoencoders
Networks: graph algorithms (routing – shortest path)
ML: wherever there is training or 'learning' on statistical data
Random Forest, clustering, etc.
Algorithms are chosen depending on:
- data available
- problem being solved
- combining multiple techniques (some reach 50% accuracy, others 80% accuracy)
Example: Choosing Algorithms for Problems (e.g. DNNs)
Deep neural network | Input data | Applied for | Variants
Feedforward neural network | Hierarchical data representations | General classification; clustering; anomaly finding; feature extraction | Deep belief networks (use restricted Boltzmann machines as building blocks); convolutional neural networks
Recurrent neural network | Sequential data representation (i.e. time series data) | Sequential learning (when a time relationship exists) | Long short-term memory (LSTM), used for speech translation
• There are many variants of DNNs, each with its own papers and researchers.
• DeepMind used deep Q-learning for Atari games and deep reinforcement learning for Go
• State-action pairs are chosen based on learned data.
Multiple Tools Available (DL Libraries)
• Google's DNN platform TensorFlow is used to tag unlabeled videos, recognize images with 70% accuracy, and predict Gmail replies
• Scikit-learn: Python library, good for learning
• Mostly used in image analysis
• HPC innovation: analyze massive data sets, quick training
• Model and data parallelism to reduce the training time
Toolkit | Language | Use | Processing capability
Caffe | C++ | Images and video | Distributed (HPC, GPU)
TensorFlow | Python | Images, regression, video, text, speech | Distributed (HPC, GPU)
Theano | Python | Images | Distributed (HPC, GPU)
Torch | Lua | Images and speech | Distributed (HPC, GPU)
Bringing it back to Networks… (Reviewing papers since 2010)
Machine Learning Use Cases (IETF forums)
• Network security
– Normal and outlier behaviors in traffic
• Change or predict possible behavior
– This <QoS value> will cause this <event Y> with probability <P>
• Bug detection
– Software or hardware faults
• WAN path optimization
– Anticipate congestion
– Divert traffic to alternate paths
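The "this <QoS value> will cause this <event Y> with probability <P>" use case can be illustrated with a plain empirical estimate. This is only a sketch: the observations, the packet-loss threshold, and the function name `event_probability` are assumptions made up for illustration, not data or methods from the reviewed papers.

```python
# Hypothetical observations: (packet_loss_percent, congestion_event_seen)
observations = [
    (0.1, False), (0.2, False), (1.5, True), (2.0, True),
    (1.8, False), (0.3, False), (2.5, True), (0.4, True),
]

def event_probability(samples, threshold):
    """Empirical P(event | qos_value >= threshold); None if no sample qualifies."""
    matching = [event for value, event in samples if value >= threshold]
    if not matching:
        return None
    # True counts as 1 and False as 0, so the sum is the number of events.
    return sum(matching) / len(matching)

# Probability of a congestion event when packet loss is at least 1%.
p = event_probability(observations, threshold=1.0)
print(p)
```

A trained classifier or regression model would replace the counting step once more features than a single QoS value are involved.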
Conducted a Systematic Literature Review
• Step 1: Identify research questions
• Step 2: Identify a search string
– "Wide area networks" AND (estimate OR predict) AND (learning OR "data mining" OR "artificial intelligence" OR "pattern recognition" OR regression OR classification OR optimization)
• Step 3: Identify relevant libraries, journals, papers
– IEEE Xplore, ACM Digital Library, ScienceDirect, Web of Science, EI Compendex, and Google Scholar
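As a rough illustration of Step 2, the search string can be assembled from term groups so that it is easy to adapt per library. The terms come from the slide; the helper names (`quote`, `any_of`, `build_query`) are hypothetical, and each library's actual query syntax may differ.

```python
# Term groups taken from the slide's search string.
SCOPE = ["Wide area networks"]
TASK = ["estimate", "predict"]
METHOD = [
    "learning", "data mining", "artificial intelligence",
    "pattern recognition", "regression", "classification", "optimization",
]

def quote(term):
    """Quote multi-word terms so they are searched as exact phrases."""
    return f'"{term}"' if " " in term else term

def any_of(terms):
    """Join terms with OR; parenthesize when there is more than one term."""
    joined = " OR ".join(quote(t) for t in terms)
    return f"({joined})" if len(terms) > 1 else joined

def build_query(*groups):
    """Every group must match (AND); any term inside a group may match (OR)."""
    return " AND ".join(any_of(group) for group in groups)

query = build_query(SCOPE, TASK, METHOD)
print(query)
```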
[Figure: review workflow. Step 1: Research questions → Step 2: Search strategy → Step 3: Study selection criteria → Step 3: Quality assessment → Relevant papers]
But… too many papers found
• Space was too large:
– WANs are complete systems
– Have multiple layers (e.g. see picture)
– Multiple WAN problems
• Solution: "Let's organize the results based on":
– Create categories of similar problems
– Explore ML and non-ML solutions
– Which data sets were used
Categorizing Similar Problems
[Figure: WAN problem categories and the data they involve]
• User traffic data: user traffic (directed flows)
• WAN topology (traffic engineering): flow-level, traffic prediction, adaptation, path optimization, link failure
• Infrastructure traffic data: packet-level, queues, TCP, UDP
• Infrastructure-level modifications: switches, deployment, etc.
[Figure: mind map of machine learning approaches in WAN networks. Branches: 1) User traffic optimization; 2) Topology engineering; 3) Packet-level optimizations; 4) Infrastructure optimization. Topics shown: traffic prediction, traffic adaptation, path optimization, fault finding, multiple data center connectivity, TCP-specific problems, controller placements, scheduling and congestion, switch configurations.]
Note: SDN related in (2, 3, 4)
Actual 'actions' on the WAN
Results
Relevant Papers: Statistics
Sources: IEEE Xplore, ACM pub, ScienceDirect, Web of Science
Filtering steps:
• Remove duplications
• Apply selection criteria
• Search additional relevance through references
• Remove surveys
• Apply quality assessment
[Figure: paper counts per source and filtering stage; values shown: #188, #3, #10, #532, #25, #223]
Note: Google Scholar gave many irrelevant results and is not regarded as a good publication search tool.
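The filtering steps above can be sketched as a small pipeline. Everything in this example is an assumption for illustration: the records, field names (`is_survey`, `quality`), and the quality threshold are invented, and the manual step of following references is left out.

```python
# Hypothetical search results; none of these are real papers from the review.
papers = [
    {"title": "Traffic prediction in WANs", "year": 2015, "is_survey": False, "quality": 0.8},
    {"title": "traffic prediction in wans ", "year": 2015, "is_survey": False, "quality": 0.8},
    {"title": "A survey of SDN", "year": 2016, "is_survey": True, "quality": 0.9},
    {"title": "Path optimization with SVR", "year": 2009, "is_survey": False, "quality": 0.9},
    {"title": "Flow classification with SVM", "year": 2017, "is_survey": False, "quality": 0.3},
    {"title": "Controller placement heuristics", "year": 2014, "is_survey": False, "quality": 0.7},
]

def deduplicate(records):
    """Keep the first record for each normalized title."""
    seen, unique = set(), []
    for record in records:
        key = record["title"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

def select(records, first_year=2010, last_year=2017, min_quality=0.5):
    kept = deduplicate(records)                                         # remove duplications
    kept = [r for r in kept if first_year <= r["year"] <= last_year]    # selection criteria
    kept = [r for r in kept if not r["is_survey"]]                      # remove surveys
    kept = [r for r in kept if r["quality"] >= min_quality]             # quality assessment
    return kept

relevant = select(papers)
print([r["title"] for r in relevant])
```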
Results – per year (1)
• Rise of ML techniques in 2017 (workshops at SIGCOMM, HotNets, etc.)
[Figure: bar chart of number of papers per year, 2010 to 2017, ML vs. non-ML]
Results – per category (2)
• Non-ML still largely favored – problem solving
• Most ML techniques are used for classification (of traffic) and prediction (failures)
– Techniques coupled with OpenFlow: perform classification and configure packets
• Some tools are enhanced by ML embedding for decision making:
– Traffic awareness and security problems
– Forming topologies, optimum path finding
– Improve path utilization depending on arriving traffic
[Figure: bar chart of number of papers per category (User Traffic, Traffic Engineering, Packet-level improvements, Optimizing infrastructure), ML vs. non-ML]
Techniques Used
 | Cat 1: User traffic analysis | Cat 2: Traffic engineering | Cat 3: Packet optimization | Cat 4: Optimize infrastructure
ML | Naïve Bayes, decision trees, SVM, Random Forest, ANN | Regression and classification techniques | SVR, decision trees, naïve Bayes | Regression and classification techniques
Non-ML | Rule-based learning, statistical analysis techniques | Graph optimization – min cost, greedy search, SPF | Fairness computations, path finding, game theory, Markov models, simulations | Simulation, greedy algorithms for resource allocation
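For readers less familiar with the non-ML graph techniques listed for traffic engineering, shortest path first (SPF) can be sketched with Dijkstra's algorithm. The topology and link costs below are hypothetical, and `shortest_path_costs` is an illustrative name rather than code from any reviewed paper.

```python
import heapq

def shortest_path_costs(graph, source):
    """Dijkstra's algorithm: lowest total link cost from source to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

# Hypothetical four-site WAN; values are non-negative link costs.
wan = {
    "A": {"B": 10, "C": 3},
    "B": {"D": 2},
    "C": {"B": 4, "D": 8},
    "D": {},
}
costs = shortest_path_costs(wan, "A")
print(costs)
```

The ML approaches in the review typically feed such an optimizer (for example with predicted link loads as costs) rather than replace it.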
Classification, Regression
 | Cat 1: User traffic analysis | Cat 2: Traffic engineering | Cat 3: Packet optimization | Cat 4: Optimize infrastructure
Use cases | Intrusion detection; traffic profiling | Classify flows to form optimum topologies | Path performance | Optimum connections between data centers
Technique | Categories marked (X)
Classification | X X X X
Regression | X X X
Clustering | X
Dimension reduction |
Anomaly detection | X X
Feature learning |
Coupling with devices | X (demo using simulations)
Data Involved
• Ranges from packet data, path properties, IP addresses, QoS, TCP/UDP traces, etc.
– E.g. Google's B4 optimizes topology to SDWAN (based on demand, packet loss, utilization)
Use cases | Focus | Dataset used
Category 3: Packet-level optimization | VM resources | Fairness schemes, MTTF, MTTR, NetFlow
Category 4: Infrastructure optimization | Flow tables, controller placements | No. of jobs running, VM data, CPU usage, application data
Road Ahead…
Lots of Areas Still Under-developed
• Mostly graph optimization problems (ML is less applied)
• Identify what we want to achieve along the pipeline:
Understanding (classification) → Prediction → Action
Link with devices (SDN, NFV, etc.), but what are the 'knobs' we can alter?
Short-term goals | Long-term goals
Which ML algorithm to use? No 'one-pill solution'; fast versus OK response, data used | Too many data sets; cost of training models; dynamic environments
Dimension reduction and feature learning | Not only deep learning but other 'distributed' ML approaches
And the list goes on…
ML for Distributed Networks
[Figure: reinforcement learning agent (DQN). State s is fed to a deep neural network with parameters θ, which gives the policy πθ(s, a); the agent takes action a.]
• What if we don't have a central cloud or HPC to train?
• Localized learning versus global learning
• Learning in dynamic environments (e.g. changing traffic demands)
• ML research focuses on game strategies. We don't have similar "strategies" in networks
• Learn, Try, Fail, Learn, Try, Succeed!
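To make the state, action, and policy loop concrete, here is a minimal sketch that swaps the deep neural network of a DQN for a plain Q-table (tabular Q-learning). The two load states, the two paths, and the reward values are hypothetical, chosen only so that the short path A is best under low load and the detour B is best under high load.

```python
import random

STATES = ["low", "high"]   # hypothetical traffic-load levels
ACTIONS = ["A", "B"]       # hypothetical paths: A is short, B is a detour
REWARD = {("low", "A"): 1.0, ("low", "B"): 0.5,
          ("high", "A"): 0.0, ("high", "B"): 0.5}

def train(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Mostly exploit the current estimate, sometimes explore (try and fail).
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = REWARD[(state, action)]
        # Assumption: demand changes at random, independently of the action.
        next_state = rng.choice(STATES)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward reward plus discounted future value.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in STATES}
print(policy)
```

In a DQN the table lookup `q[(state, action)]` would be replaced by a neural network with parameters θ, which is what allows the approach to scale to large state spaces but also what makes training expensive without central compute.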
Conclusions and Contact
• AI shows some promise
• Networks + ML + HPC + (complex workflows)
• Open data sets for research
• Combining techniques (and algorithms) to advance research in unexplored areas:
– New areas in networking and perhaps even more
Thank you!
Funded under DOE Panorama Project (2017-2019), DOE ASCR (2017-2022)