
© Big Switch Networks

Case Study: Industry’s Largest NFV Deployment

Collaboration between Red Hat, Dell and Big Switch for a tier-1 US Service Provider embracing large scale NFV deployments on OpenStack with SDN

NFV deployments represent some of the most demanding workloads in OpenStack clouds, yet the economic and operational promise of NFV makes this a high-value technical challenge for service providers worldwide. This paper will discuss the collaboration between Dell, Red Hat and Big Switch for a tier-1 US service provider in the industry’s largest deployment of NFV infrastructure to date. Four key areas of collaboration were needed to bring the deployment from lab to production:

§ Resiliency & Performance at Scale
§ Design & Deployment Flexibility
§ Reducing Operational Complexity
§ Integrating Security & Analytics

Software developers from Red Hat and Big Switch worked together on a daily basis over months, leveraging over $1m of test hardware from Dell, to accelerate the open community engineering process and deliver a high quality, validated NFV Pod architecture. As a result of the collaboration, multiple improvements were made to upstream open source code to align the final Pod design with key design and operational considerations of a large service provider network infrastructure.

This SDN/NFV collaboration highlights the open source leadership of Red Hat, the SDN expertise of Big Switch and the proven service and support at scale from Dell.

Figure 1: Pod Design At A Glance. Components shown: Big Switch SDN Controllers (physical appliance pair); Switch Light OS on Spine (40G Dell ON switches); Red Hat OpenStack 7.1 (with Neutron); Red Hat Enterprise Linux with Switch Light VX (on Dell R630 compute nodes); Switch Light OS on Leaf (10G/40G Dell ON switches).


Key Network Design Challenges

The key network challenges for the OpenStack NFV pod deployment fell in five major categories:

• Resiliency At Scale: To achieve scale, the design followed a hyperscale-inspired “core and pod” approach [1], with a 12 rack pod design replicated across a number of data centers across the US. The 12 rack pod, a multi-million dollar investment, was replicated at both Dell and Big Switch labs to test the system under stress. Resiliency was required at every level: in the vSwitch, the leaf, the spine, the network services and the ingress/egress to the data center core and other pods.

• No Bandwidth Bottlenecks: NFV workloads put extreme stress on the network in many dimensions: east/west bandwidth, north/south bandwidth, intra-vSwitch bandwidth and logical L2/L3 bandwidth. Neither bandwidth limitations from legacy protocols like spanning tree nor packet hair-pinning across the fabric for overlay gateway purposes were acceptable, yet VNF instances needed to be provisioned in any rack at any time. The system as a whole required optimized bandwidth characteristics from vSwitch to leaf to spine in both normal running operations and in partial failure scenarios.

• Logical Network Design Flexibility: The pod design needed to accommodate NFV workloads that each had unique logical network requirements, yet needed to share the same physical leaf/spine fabric and vSwitches. Rather than a one-size-fits-all L2/L3 approach, this design needed to accommodate NFV-specific public L2 networks, public L3 networks, private L2 networks, tenant-managed service chains with FWaaS and LBaaS, provider-managed service chains transparent to the tenants, virtual tenant network functions, physical provider network functions with capacity for high bandwidth broadcast and a range of connectivity options to numerous external networks. All of these options needed to be mixed-and-matched in peaceful co-existence in the same physical pod at the same time, with relevant provisioning workflows automated by OpenStack.

• Reduced Operational Complexity: Operational complexity for the NFV deployment in this engagement came in two forms: a) lifecycle management of the network control systems relative to the OpenStack control systems, and b) training for design/install/troubleshooting of the network control system itself. The first required tight integration between Big Switch and Red Hat. The end result, a leaf-spine CLOS fabric that can be upgraded in less time than it takes to upgrade an iPhone without impacting production workloads or OpenStack control systems, is unique in the industry. The second leveraged Big Switch’s “One Big Switch” metaphor, detailed below.

• Integrated Security & Visibility: To ensure that the NFV Pod is compliant and secure against intrusions and other threats, it was important to design an out-of-band monitoring capability for E-W traffic as well as an inline protection mechanism for N-S traffic as a part of the overall Pod design. Key requirements for this visibility infrastructure were: a scale-out design that grew with the Pod scale; support for multi-tenant/multi-tool environments; and ease of deployment and operation.

Pod Design At A Glance

The general pod design includes one services/connectivity/control rack and 12 compute racks (Figure 1).

§ Services/connectivity/control rack holds the SDN controllers, OpenStack controllers, various physical provider-side network services and the ingress/egress gateways to networks connecting to the pod. While this rack represents only 10% of the physical space, it represents 90% of the engineering effort involved in the design.

§ Compute racks are intended as a scale-out design, with 12 per pod in the initial deployment. This was designed to evolve over time as more capacity per location is required; some locations also have power/cooling constraints and require flexibility in server density. The networking for each compute rack features the Big Switch Switch Light OS running at the top of each rack on Dell ON switch hardware. The first generation pod design used Open vSwitch, while the second generation uses Big Switch Switch Light VX (a “P+V” Fabric Design) running on Dell compute nodes.

[1] See this article co-authored by Petr Lapukhov, Architect at Facebook, and Kyle Forster, Founder of Big Switch: http://www.infoworld.com/article/2608992/data-center/data-center-rethinking-the-data-center-network.html


For network visibility and monitoring, SPAN ports from each top of rack switch were intended to integrate with Big Switch’s Big Monitoring Fabric. This enabled on-demand and granular E-W traffic monitoring (including intra-host traffic using RSPAN). In phase 1, DDoS mitigation tools were connected inline to protect all N-S traffic and managed from the Big Monitoring Fabric controller.

Resiliency at Scale

To validate the resiliency of the NFV pod design at scale, large scale test beds (>$1.5m each) were constructed in both Dell and Big Switch facilities. The cross-vendor team used a “Chaos Monkey” methodology pioneered by Netflix, culminating in a test with 640 forced network failures in under 30 minutes with no impact to workload performance. [2]

In a ‘chaos monkey’ style test, random network failures were injected into the pod while running ‘worst case’ workloads, including the Hadoop TeraSort benchmark. Within the testing window, Big Cloud Fabric SDN controllers were forced to fail over every 30 seconds, a random switch was forced to fail every 8 seconds and a random link was forced to fail every 4 seconds.
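As a rough cross-check of these figures, the sketch below counts the failures that the three stated cadences would produce if all of them ran concurrently for the whole test window. That concurrency, and the function names, are assumptions made for illustration; the source does not describe the exact failure schedule.

```python
# Seconds between forced failures, taken from the test description above.
FAILURE_INTERVALS_S = {
    "controller_failover": 30,
    "switch_failure": 8,
    "link_failure": 4,
}


def failures_in_window(window_s: int) -> int:
    """Count forced failures, assuming every cadence runs for the full window."""
    return sum(window_s // interval for interval in FAILURE_INTERVALS_S.values())


def shortest_window_for(target_failures: int) -> int:
    """Smallest whole-second window that reaches the target failure count."""
    window_s = 0
    while failures_in_window(window_s) < target_failures:
        window_s += 1
    return window_s


if __name__ == "__main__":
    window_s = shortest_window_for(640)
    print(f"{window_s} s = {window_s / 60:.1f} min")
```

Under these assumptions, 640 failures take roughly 26 minutes, which is consistent with the "under 30 minutes" figure quoted for the test.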

No Bandwidth Bottlenecks

NFV workloads put extreme stress on the network in many dimensions: east/west bandwidth, north/south bandwidth, intra-vSwitch bandwidth and logical L2/L3 bandwidth. A leaf-spine CLOS design, popularized by Google [3], has become the common approach for extreme east/west/north/south bandwidth requirements. However, the traditional alphabet soup of protocols used to replicate the Google design with legacy networking products often leaves data center designs that are extremely fragile in the face of partial failures, particularly at the host, or that significantly constrain workload placement. For VNF deployments, these downsides make these approaches a non-starter. A modern leaf-spine CLOS design, using centralized SDN control designed to see the network from spine to leaf to vSwitch, was the optimal answer for this design.
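To make the partial-failure point concrete, the following sketch estimates a leaf switch's oversubscription ratio and the uplink bandwidth that remains when some leaf-to-spine links fail, assuming traffic is spread evenly over all surviving uplinks. The port counts are hypothetical and are not taken from the deployment. The 10G and 40G speeds echo the switch types named in Figure 1, but assigning them to server-facing and spine-facing ports is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafSwitch:
    """Hypothetical leaf switch; all defaults are illustrative assumptions."""
    server_ports: int = 48        # assumed number of server-facing ports
    server_port_gbps: int = 10    # assumed server-facing speed
    uplinks: int = 6              # assumed number of leaf-to-spine links
    uplink_gbps: int = 40         # assumed spine-facing speed

    def uplink_capacity_gbps(self, failed_uplinks: int = 0) -> int:
        """Uplink bandwidth left after failures, with all surviving links active."""
        if not 0 <= failed_uplinks <= self.uplinks:
            raise ValueError("failed_uplinks must be between 0 and the uplink count")
        return (self.uplinks - failed_uplinks) * self.uplink_gbps

    def oversubscription(self, failed_uplinks: int = 0) -> float:
        """Ratio of server-facing bandwidth to surviving uplink bandwidth."""
        capacity = self.uplink_capacity_gbps(failed_uplinks)
        if capacity == 0:
            return float("inf")
        return self.server_ports * self.server_port_gbps / capacity


if __name__ == "__main__":
    leaf = LeafSwitch()
    for failed in range(3):
        print(failed, leaf.uplink_capacity_gbps(failed), leaf.oversubscription(failed))
```

In this sketch, losing one of six uplinks reduces uplink capacity by a sixth and raises the oversubscription ratio from 2.0 to 2.4, rather than removing the leaf's connectivity.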

Figure 3: Leaf-Spine Clos Fabric Architecture

[2] For more details on Big Switch’s Chaos Monkey testing for OpenStack networking, see http://go.bigswitch.com/rs/bigswitchnetworks/images/Chaos%20Monkey%20and%20Big%20Cloud%20Fabric.pdf

[3] For a history of leaf-spine CLOS designs at Google, see http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183.pdf

Leaf-spine CLOS extended all the way down to the vSwitch:

• Maximized bandwidth use across all active links
• Designed-in coverage of all partial failure cases from vSwitch to leaf to spine to controllers to OpenStack orchestration (compared to ‘alphabet soup’ of protocols)
• Fully distributed L3 and Floating IP functions (no packet hair-pins)
• End-to-end analytics and troubleshooting tools from vSwitch to leaf to spine

[Diagram labels: A, B; vSwitch; Scale out; Ingress/Egress; Bare metal servers & storage; Virtual machine racks; Services & connectivity racks; Big Cloud Fabric SDN controllers (centralized control plane)]

Figure 2: Data Center Scale Test Setup


Logical Network Flexibility

The pod design needed to accommodate NFV workloads that each had unique logical network requirements, yet needed to share the same physical leaf/spine fabric and vSwitches. Rather than a one-size-fits-all L2/L3 approach, this design needed to accommodate numerous NFV-specific L2/L3/service designs. These included:

• Public L2 networks with workload-specific routers for ingress/egress
• Public (routable) L3 networks connected via BGP and static routes to the various service provider networks
• Private L2 networks for workloads requiring inter-VNF broadcast and L2 multicast connectivity
• Tenant-managed service chains with FWaaS, LBaaS and other services managed by workload-specific teams on their operational schedules
• Provider-managed service chains, transparent to the tenants, to serve as corporate standards across a wide variety of (but not all) NFV workloads loaded onto the pod
• A mix of both virtual network functions and physical network functions inserted into the service chains mentioned above to service NFV workloads
• A mix of both virtual network functions and part-virtual/part-physical network functions making up an NFV workload (i.e. specialized physical equipment and high rate storage)

Where applicable, workflows required for provisioning these networks needed to be orchestrated through OpenStack APIs and User Interfaces.
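As an illustration of what provisioning through OpenStack APIs can look like, the sketch below builds request bodies for the Neutron networking API (`POST /v2.0/networks` and `POST /v2.0/subnets`) for two of the network types listed above. The network names, CIDRs and helper functions are hypothetical, and mapping the "private L2" and "public L3" types onto the `router:external` and `gateway_ip` attributes is an assumption. The payloads are only constructed, not sent, and the workflows actually used in this deployment are not described in the source.

```python
import ipaddress
import json
from typing import Any, Dict, Optional


def network_body(name: str, external: bool = False, shared: bool = False) -> Dict[str, Any]:
    """Request body for POST /v2.0/networks."""
    return {
        "network": {
            "name": name,
            "admin_state_up": True,
            "shared": shared,
            "router:external": external,
        }
    }


def subnet_body(network_id: str, cidr: str, gateway_ip: Optional[str] = None,
                enable_dhcp: bool = True) -> Dict[str, Any]:
    """Request body for POST /v2.0/subnets.

    A gateway_ip of None is serialized as null, which asks Neutron for a
    subnet with no gateway (an isolated L2 segment in this sketch).
    """
    net = ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
    return {
        "subnet": {
            "network_id": network_id,
            "ip_version": net.version,
            "cidr": str(net),
            "gateway_ip": gateway_ip,
            "enable_dhcp": enable_dhcp,
        }
    }


if __name__ == "__main__":
    # Hypothetical names; "NETWORK_ID" stands in for the id returned by network creation.
    private_l2 = network_body("vnf-private-l2")
    public_l3 = network_body("vnf-public-l3", external=True)
    isolated = subnet_body("NETWORK_ID", "192.168.10.0/24")
    routed = subnet_body("NETWORK_ID", "203.0.113.0/24", gateway_ip="203.0.113.1")
    print(json.dumps([private_l2, public_l3, isolated, routed], indent=2))
```

Keeping payload construction separate from the HTTP call makes it straightforward to review or test the intended network definitions before anything is sent to a controller.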

Reduced Operational Complexity

NFV designs in the lab can be incredibly complex, representing unbounded operational risk. To address those risks, ease of deployment and management of day-to-day operations were critical elements for this design.

§ OpenStack Deployment: This was addressed with a powerful, simplified and automated cloud installation tool from Red Hat, the RHEL OSP 7 director, which also provides system-wide health checking and complete lifecycle management. The integration of the BCF networking installer with RHEL OSP 7 director provides a completely integrated workflow that not only makes the system installation process seamless and predictable, but also ensures the stability and rapid convergence of the system upon subsequent upgrade of the system components.

§ Pod Operations: In order to make this system intuitive for networking professionals, the pod design used Big Cloud Fabric’s “One Big Switch” operational metaphor (Figure 5). From an operations perspective, the SDN controllers feel/act just like chassis supervisors, while the spine switches feel just like a chassis backplane and the leaf and vSwitches feel just like chassis line cards. This metaphor dramatically reduced the training required when integrating this new pod into existing operational processes.

Figure 4: RHEL OpenStack Platform Director


Figure 5: One "Big Switch"

With complex NFV workloads riding on top of a layer of OpenStack automation which itself is riding on top of an SDN fabric, network health, history and troubleshooting tools were a key challenge for the deployment. With integration from vSwitch to leaf to spine, the visibility of the Big Cloud Fabric “P+V” design dramatically reduced operational concerns with this kind of deployment. According to a recent ACG research study, these tools allow for troubleshooting 12x faster than traditional network designs for these types of pods [4].

Integrated Security & Visibility

To ensure that the NFV Pod is compliant and secure against intrusions and other threats, Big Monitoring Fabric was used to monitor East-West traffic (intra-pod) and North-South traffic (inline). Big Monitoring Fabric is provisioned and managed through a centralized, single pane of glass: the Big Monitoring Fabric controller CLI, GUI or REST APIs. In addition to delivering relevant traffic to dedicated tools (e.g. a DDoS appliance in an inline deployment), Big Monitoring Fabric also supports built-in analytics and troubleshooting, as shown in Figure 6.

[4] The entire ACG study, showing 12x faster troubleshooting times, 20x faster software upgrade times and 12x faster pod expansion times, is available at http://go.bigswitch.com/rs/974-WXR-561/images/Economic%20Advantages%20of%20Open%20SDN%20Fabrics%20-%20ACG%20Research.pdf

[Figure 5 diagram labels: Traditional Chassis Pair (supervisors, backplane, line cards) shown alongside Big Cloud Fabric (controller, spine switches, leaf switches; compute workload racks and a services & connectivity rack)]

Figure 6: Integrated Visibility & Analytics (panels: Health, Machine-assisted troubleshooting, History)

To Learn More

§ Big Cloud Fabric Overview: More details available at: http://bigswitch.com/sdn-products/big-cloud-fabric

§ Red Hat OpenStack Platform Overview: More details available at: https://access.redhat.com/products/red-hat-openstack-platform

§ Big Monitoring Fabric Overview: More details available at: http://bigswitch.com/products/big-monitoring-fabric

§ Big Switch Labs: Get hands-on experience with the seamless integration of OpenStack and Big Cloud Fabric (P+V Edition) using Big Switch’s Neutron plugin. Available online, for free: http://labs.bigswitch.com

§ BCF Starter Kits: Big Switch offers this fully tested, scalable OpenStack networking solution in several Big Cloud Fabric starter kits, pre-configured with hardware, cables, support and physical + virtual Big Cloud Fabric software starting at $49k. For more details, download the brochure at: http://bigswitch.com/starter-kits

§ Test Setup Details: Details of the scale testing architecture and chaos monkey testing installation and methodology available on request. Email info@bigswitch.com.

ABOUT BIG SWITCH

Big Switch Networks is the market leader in bringing hyperscale data center networking technologies to a mainstream data center audience. The company is taking three key hyperscale technologies (OEM/ODM bare metal and open Ethernet switch hardware, sophisticated SDN control software, and core-and-pod data center designs) and leveraging them in fit-for-purpose products designed for use in enterprises, cloud providers, and service providers. For additional information, email info@bigswitch.com, follow @bigswitch, or visit www.bigswitch.com.

Big Switch Networks, Big Cloud Fabric, Big Monitoring Fabric, Switch Light OS, and Switch Light VX are trademarks or registered trademarks of Big Switch Networks, Inc. All other trademarks, service marks, registered marks, or registered service marks are the property of their respective owners.
