© Big Switch Networks
Case Study: Industry’s Largest NFV Deployment
Collaboration between Red Hat, Dell and Big Switch for a tier-1 US Service Provider embracing large scale NFV deployments on OpenStack with SDN

NFV deployments represent some of the most demanding workloads in OpenStack clouds, yet the economic and operational promise of NFV makes this a high-value technical challenge for service providers worldwide. This paper will discuss the collaboration between Dell, Red Hat and Big Switch for a tier-1 US service provider in the industry’s largest deployment of NFV infrastructure to date. Four key areas of collaboration were needed to bring the deployment from lab to production:
§ Resiliency & Performance at Scale
§ Design & Deployment Flexibility
§ Reducing Operational Complexity
§ Integrating Security & Analytics
Software developers from Red Hat and Big Switch worked together on a daily basis over months, leveraging over $1m of test hardware from Dell, to accelerate the open community engineering process and deliver a high quality, validated NFV Pod architecture. As a result of the collaboration, multiple improvements were made to upstream open source code to align the final Pod design with key design and operational considerations of a large service provider network infrastructure.
This SDN/NFV collaboration highlights the open source leadership of Red Hat, the SDN expertise of Big Switch and the proven service and support at scale from Dell.

Figure 1: Pod Design At A Glance. Components shown: Big Switch SDN Controllers (physical appliance pair); Switch Light OS on Spine (40G Dell ON switches); Switch Light OS on Leaf (10G/40G Dell ON switches); Red Hat OpenStack 7.1 (with Neutron); Red Hat Enterprise Linux with Switch Light VX (on Dell R630 Compute Nodes).
Key Network Design Challenges
The key network challenges for the OpenStack NFV pod deployment fell in five major categories:
• Resiliency At Scale: To achieve scale, the design followed a hyperscale-inspired “core and pod” approach [1], with a 12 rack pod design replicated across a number of data centers across the US. The 12 rack pod, a multi-million dollar investment, was replicated at both Dell and Big Switch labs to test the system under stress. Resiliency was required at every level: in the vSwitch, the leaf, the spine, the network services and the ingress/egress to the data center core and other pods.
• No Bandwidth Bottlenecks: NFV workloads put extreme stress on the network in many dimensions: east/west bandwidth, north/south bandwidth, intra-vSwitch bandwidth and logical L2/L3 bandwidth. Neither bandwidth limitations from legacy protocols like spanning tree nor packet hair-pinning across the fabric for overlay gateway purposes were acceptable, yet VNF instances needed to be provisioned in any rack at any time. The system as a whole required optimized bandwidth characteristics from vSwitch to leaf to spine in both normal running operations and in partial failure scenarios.
• Logical Network Design Flexibility: The pod design needed to accommodate NFV workloads that each had unique logical network requirements, yet needed to share the same physical leaf/spine fabric and vSwitches. Rather than a one-size-fits-all L2/L3 approach, this design needed to accommodate NFV-specific public L2 networks, public L3 networks, private L2 networks, tenant-managed service chains with FWaaS and LBaaS, provider-managed service chains transparent to the tenants, virtual tenant network functions, physical provider network functions with capacity for high bandwidth broadcast and a range of connectivity options to numerous external networks. All of these options needed to be mixed-and-matched in peaceful co-existence in the same physical pod at the same time, with relevant provisioning workflows automated by OpenStack.
• Reduced Operational Complexity: Operational complexity for the NFV deployment for this engagement came in two forms: a) lifecycle management of the network control systems relative to the OpenStack control systems, and b) training for design/install/troubleshooting of the network control system itself. The first required tight integration between Big Switch and Red Hat. The end result, a leaf-spine CLOS fabric that can be upgraded in less time than an iPhone without impacting production workloads or OpenStack control systems, is unique in the industry. The second leveraged Big Switch’s “One Big Switch” metaphor, detailed below.
• Integrated Security & Visibility: To ensure that the NFV Pod is compliant and secure against intrusions and other threats, it was important to design an out-of-band monitoring capability for E-W traffic as well as an inline protection mechanism for N-S traffic as a part of the overall Pod design. Key requirements for this visibility infrastructure were: a scale-out design that grew with the Pod scale; support for multi-tenant/multi-tool environments; and ease of deployment and operation.
Pod Design At A Glance
The general pod design includes one services/connectivity/control rack and 12 compute racks (Figure 1).
§ Services/connectivity/control rack holds the SDN controllers, OpenStack controllers, various physical provider-side network services and the ingress/egress gateways to networks connecting to the pod. While this rack represents only 10% of the physical space, it represents 90% of the engineering effort involved in the design.
§ Compute racks are intended as a scale-out design, with 12 per pod in the initial deployment. This was designed to evolve over time as more capacity per location is required, and some locations have power/cooling constraints and require flexibility in server density. The networking for each compute rack features the Big Switch Switch Light OS running at each top of rack, running on Dell ON switch hardware. The first generation pod design used Open vSwitch, while the second generation uses Big Switch Switch Light VX (a “P+V” Fabric Design) running on Dell compute nodes.
[1] See this article co-authored by Petr Lapukhov, Architect at Facebook, and Kyle Forster, Founder of Big Switch: http://www.infoworld.com/article/2608992/data-center/data-center-rethinking-the-data-center-network.html
For network visibility and monitoring, SPAN ports from each top of rack switch were intended to integrate with Big Switch’s Big Monitoring Fabric. This enabled on-demand and granular E-W traffic monitoring (including intra-host traffic using RSPAN). In phase 1, DDoS mitigation tools were connected inline to protect all N-S traffic and managed from the Big Monitoring Fabric controller.
Resiliency at Scale
To validate the resiliency of the NFV pod design at scale, large scale test beds (>$1.5m each) were constructed in both Dell and Big Switch facilities. The cross-vendor team used a “Chaos Monkey” methodology pioneered by Netflix, culminating in a test with 640 forced network failures in under 30 minutes with no impact to workload performance. [2]
In a ‘chaos monkey’ style test, random network failures were injected into the pod while running ‘worst case’ workloads, including the Hadoop TeraSort benchmark. Within the testing window, Big Cloud Fabric SDN controllers were forced to fail-over every 30 seconds, a random switch was forced to fail every 8 seconds and a random link was forced to fail every 4 seconds.
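The failure cadence described above can be sketched as a simple schedule generator. This is an illustrative sketch only, not the tooling used in the test: it assumes strictly periodic injection with the first event of each kind one interval after the start, and the target names in `TARGETS` are hypothetical placeholders.

```python
import random

# Failure intervals in seconds, taken from the test description above.
INTERVALS = {"controller_failover": 30, "switch_failure": 8, "link_failure": 4}

# Hypothetical target names, for illustration only.
TARGETS = {
    "controller_failover": ["active-controller"],
    "switch_failure": ["leaf-1", "leaf-2", "spine-1", "spine-2"],
    "link_failure": ["link-1", "link-2", "link-3"],
}


def failure_schedule(duration_s, targets=TARGETS, seed=0):
    """Return a time-ordered list of (time_s, kind, target) failure events.

    Assumes one event per interval for each failure kind, with the target
    picked at random from the candidates for that kind.
    """
    rng = random.Random(seed)
    events = []
    for kind, interval in INTERVALS.items():
        for t in range(interval, duration_s + 1, interval):
            events.append((t, kind, rng.choice(targets[kind])))
    events.sort(key=lambda event: event[0])
    return events


def seconds_to_reach(n_failures):
    """Smallest elapsed time (s) at which at least n_failures were injected."""
    t = 0
    while sum(t // interval for interval in INTERVALS.values()) < n_failures:
        t += 1
    return t


if __name__ == "__main__":
    print(len(failure_schedule(30 * 60)))  # events in a full 30-minute window
    print(seconds_to_reach(640))           # seconds needed for 640 failures
```

Under these assumptions the three rates together reach 640 injected failures after 1568 seconds (about 26 minutes), which is consistent with the "640 forced network failures in under 30 minutes" figure quoted above.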
No Bandwidth Bottlenecks
NFV workloads put extreme stress on the network in many dimensions: east/west bandwidth, north/south bandwidth, intra-vSwitch bandwidth and logical L2/L3 bandwidth. A leaf-spine CLOS design, popularized by Google [3], has become the common approach for extreme east/west/north/south bandwidth requirements. However, the traditional alphabet soup of protocols used to replicate the Google design with legacy networking products often leaves data center designs that are extremely fragile in the face of partial failures, particularly at the host, or that significantly constrain workload placement. For VNF deployments, these downsides make these approaches a non-starter. A modern leaf-spine CLOS design, using centralized SDN control designed to see the network from spine to leaf to vSwitch, was the optimal answer for this design.
Figure 3: Leaf-Spine Clos Fabric Architecture
[2] For more details on Big Switch’s Chaos Monkey testing for OpenStack networking, see http://go.bigswitch.com/rs/bigswitchnetworks/images/Chaos%20Monkey%20and%20Big%20Cloud%20Fabric.pdf
[3] For a history of leaf-spine CLOS designs at Google, see http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183.pdf
Leaf-spine CLOS extended all the way down to the vSwitch
• Maximized bandwidth use across all active links
• Designed-in coverage of all partial failure cases from vSwitch to leaf to spine to controllers to OpenStack orchestration (compared to ‘alphabet soup’ of protocols)
• Fully distributed L3 and Floating IP functions (no packet hair-pins)
• End-to-end analytics and troubleshooting tools from vSwitch to leaf to spine
Figure 2: Data Center Scale Test Setup. Elements shown: Big Cloud Fabric SDN controllers (centralized control plane); scale-out ingress/egress; virtual machine racks with vSwitches; bare metal servers & storage; services & connectivity racks.
Logical Network Flexibility
The pod design needed to accommodate NFV workloads that each had unique logical network requirements, yet needed to share the same physical leaf/spine fabric and vSwitches. Rather than a one-size-fits-all L2/L3 approach, this design needed to accommodate numerous NFV-specific L2/L3/service designs. These included:
• Public L2 networks with workload-specific routers for ingress/egress
• Public (routable) L3 networks connected via BGP and static routes to the various service provider networks
• Private L2 networks for workloads requiring inter-VNF broadcast and L2 multicast connectivity
• Tenant-managed service chains with FWaaS, LBaaS and other services managed by workload-specific teams on their operational schedules
• Provider-managed service chains, transparent to the tenants, to serve as corporate standards across a wide variety of (but not all) NFV workloads loaded onto the pod
• A mix of both virtual network functions and physical network functions inserted into the service chains mentioned above to service NFV workloads
• A mix of both virtual network functions and part-virtual/part-physical network functions making up an NFV workload (i.e. specialized physical equipment and high rate storage)
Where applicable, workflows required for provisioning these networks needed to be orchestrated through OpenStack APIs and User Interfaces.
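As an illustration of what provisioning through OpenStack APIs can look like, the sketch below builds request bodies for the OpenStack Networking (Neutron) v2.0 API for one of the simpler cases listed above, a private L2 network. It is a minimal sketch and not the deployment's actual workflow: the network name, placeholder network ID and CIDR are hypothetical, and the bodies are only constructed and serialized, not sent to a cloud.

```python
import json


def network_body(name, shared=False, external=False):
    """Request body for POST /v2.0/networks."""
    return {
        "network": {
            "name": name,
            "admin_state_up": True,
            "shared": shared,
            "router:external": external,
        }
    }


def subnet_body(network_id, cidr, gateway_ip=None, enable_dhcp=True):
    """Request body for POST /v2.0/subnets.

    gateway_ip=None serializes to JSON null, which asks Neutron to create
    the subnet without a gateway (suitable for an isolated L2 segment).
    """
    return {
        "subnet": {
            "network_id": network_id,
            "ip_version": 4,
            "cidr": cidr,
            "gateway_ip": gateway_ip,
            "enable_dhcp": enable_dhcp,
        }
    }


if __name__ == "__main__":
    # Hypothetical private L2 network for inter-VNF traffic.
    print(json.dumps(network_body("vnf-private-l2"), indent=2))
    # "NETWORK_ID" stands in for the ID returned by the network-create call.
    print(json.dumps(subnet_body("NETWORK_ID", "192.0.2.0/24"), indent=2))
```

The public L3 and service-chain cases would involve further resources (routers, floating IPs, service insertion) that this sketch does not attempt to cover.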
Reduced Operational Complexity
NFV designs in the lab can be incredibly complex, representing unbounded operational risk. To address those risks, ease of deployment and management of day-to-day operations were critical elements for this design.
§ OpenStack Deployment: This was addressed with a powerful, simplified and automated cloud installation tool from Red Hat, the RHEL OSP 7 director, which also provides system-wide health checking and complete lifecycle management. The integration of the BCF networking installer with RHEL OSP 7 director provides a completely integrated workflow that not only makes the system installation process seamless and predictable, but also ensures the stability and rapid convergence of the system upon subsequent upgrade of the system components.
§ Pod Operations: In order to make this system intuitive for networking professionals, the pod design used Big Cloud Fabric’s “One Big Switch” operational metaphor (Figure 5). From an operations perspective, the SDN controllers feel/act just like chassis supervisors, while the spine switches feel just like a chassis backplane and the leaf and vSwitches feel just like chassis line cards. This metaphor dramatically reduced the training required when integrating this new pod into existing operational processes.
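The “One Big Switch” metaphor amounts to a fixed mapping from fabric elements to chassis roles. The following is a minimal sketch of that mapping; the element kinds and inventory names are hypothetical labels chosen for illustration and do not come from any Big Switch API.

```python
from enum import Enum


class ChassisRole(Enum):
    SUPERVISOR = "supervisor"
    BACKPLANE = "backplane"
    LINE_CARD = "line card"


# Mapping described in the text: controllers act like supervisors, spines
# like the backplane, and leaf switches and vSwitches like line cards.
CHASSIS_ANALOGUE = {
    "sdn_controller": ChassisRole.SUPERVISOR,
    "spine_switch": ChassisRole.BACKPLANE,
    "leaf_switch": ChassisRole.LINE_CARD,
    "vswitch": ChassisRole.LINE_CARD,
}


def group_by_role(inventory):
    """Group (name, kind) fabric elements by their chassis analogue."""
    grouped = {role: [] for role in ChassisRole}
    for name, kind in inventory:
        grouped[CHASSIS_ANALOGUE[kind]].append(name)
    return grouped


if __name__ == "__main__":
    # Hypothetical inventory for a very small fabric.
    inventory = [
        ("controller-1", "sdn_controller"),
        ("spine-1", "spine_switch"),
        ("leaf-1", "leaf_switch"),
        ("vswitch-host-1", "vswitch"),
    ]
    for role, names in group_by_role(inventory).items():
        print(role.value, names)
```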
Figure 4: RHEL OpenStack Platform Director
Figure 5: One "Big Switch"
With complex NFV workloads riding on top of a layer of OpenStack automation which itself is riding on top of an SDN fabric, network health, history and troubleshooting tools were a key challenge for the deployment. With integration from vSwitch to leaf to spine, the visibility of the Big Cloud Fabric “P+V” design dramatically reduced operational concerns with this kind of deployment. According to a recent ACG research study, these tools allow for troubleshooting 12x faster than traditional network designs for these types of pods [4].
Integrated Security & Visibility
To ensure that the NFV Pod is compliant and secure against intrusions and other threats, Big Monitoring Fabric was used to monitor East-West traffic (intra-pod) and North-South traffic (inline). Big Monitoring Fabric is provisioned and managed through a centralized, single pane of glass: the Big Monitoring Fabric controller CLI, GUI or REST APIs. In addition to delivering relevant traffic to dedicated tools (e.g. DDoS appliance in inline deployment), Big Monitoring Fabric also supports built-in analytics and troubleshooting as shown in Figure 6.
[4] The entire ACG study, showing 12x faster troubleshooting times, 20x faster software upgrade times and 12x faster pod expansion times, is available at http://go.bigswitch.com/rs/974-WXR-561/images/Economic%20Advantages%20of%20Open%20SDN%20Fabrics%20-%20ACG%20Research.pdf
[Diagram labels: Traditional Chassis Pair (supervisors, backplane, line cards); Big Cloud Fabric (controller, spine switches, leaf switches; compute workload and services & connectivity racks). Additional labels: Health; Machine-assisted troubleshooting; History.]
Figure 6: Integrated Visibility & Analytics
To Learn More
§ Big Cloud Fabric Overview: More details available at: http://bigswitch.com/sdn-products/big-cloud-fabric
§ Red Hat OpenStack Platform Overview: More details available at: https://access.redhat.com/products/red-hat-openstack-platform
§ Big Monitoring Fabric Overview: More details available at: http://bigswitch.com/products/big-monitoring-fabric
§ Big Switch Labs: Get hands-on experience with the seamless integration of OpenStack and Big Cloud Fabric (P+V Edition) using Big Switch’s Neutron plugin. Available online, for free: http://labs.bigswitch.com
§ BCF Starter Kits: Big Switch offers this fully tested, scalable OpenStack networking solution in several Big Cloud Fabric starter kits, pre-configured with hardware, cables, support and physical + virtual Big Cloud Fabric software starting at $49k. For more details, download the brochure at: http://bigswitch.com/starter-kits
§ Test Setup Details: Details of the scale testing architecture and chaos monkey testing installation and methodology available on request. Email info@bigswitch.com.
ABOUT BIG SWITCH
Big Switch Networks is the market leader in bringing hyperscale data center networking technologies to a mainstream data center audience. The company is taking three key hyperscale technologies -- OEM/ODM bare metal and open Ethernet switch hardware, sophisticated SDN control software, and core-and-pod data center designs -- and leveraging them in fit-for-purpose products designed for use in enterprises, cloud providers, and service providers. For additional information, email info@bigswitch.com, follow @bigswitch, or visit www.bigswitch.com.
Big Switch Networks, Big Cloud Fabric, Big Monitoring Fabric, Switch Light OS, and Switch Light VX are trademarks or registered trademarks of Big Switch Networks, Inc. All other trademarks, service marks, registered marks, or registered service marks are the property of their respective owners.