In-band Network Telemetry (INT)
Mukesh Hira, VMware
Naga Katta, Princeton University
Datacenter Network Topologies
[Figure: layers of a datacenter network. Labels: Virtual L2 and L3 topologies, Firewalls, Load-balancers, …; Policies, Service-chaining; End-points; Container; Spine; Leaf; Physical Transport]
Current monitoring methods are inadequate
§ Not fast enough
  § Involve CPU and control planes
  § Network state changes rapidly
§ Do not provide end-to-end state
  § Difficult to correlate per-element state with the actual path of a flow
INT: In-band Network Telemetry
§ Mechanism for collecting network state in the data plane
  § As close to real time as possible
  § At current and future line rates
  § With a framework that can adapt over time
§ Examples of network state
  § Switch ID, Ingress Port ID, Egress Port ID
  § Egress Link Utilization
  § Hop Latency
  § Egress Queue Occupancy
  § Egress Queue Congestion Status
  § ….
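INT Example
[Figure: a packet travels from the source vSwitch through switches SW L1, SW S2, and SW L2 to the destination vSwitch, entering each switch on Port P1 and leaving on Port P2.]
§ The packet leaves the source vSwitch as Encap + INT instructions + Payload
§ INT instructions: Switch ID, Ingress Port ID, Egress Port ID, Egress Link Utilization
§ Each switch adds its own metadata, most recent first
  § After SW L1: (L1, P1, P2, Eg-Util)
  § After SW S2: (S2, P1, P2, Eg-Util), (L1, P1, P2, Eg-Util)
  § After SW L2: (L2, P1, P2, Eg-Util), (S2, P1, P2, Eg-Util), (L1, P1, P2, Eg-Util)
§ The Payload is unchanged end to end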
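The hop-by-hop behaviour in the example can be sketched in a few lines of Python. This is a toy model only: the `Packet` and `Switch` classes, the helper functions, and the utilization values are hypothetical and are not taken from the INT specification. It shows each switch pushing the state requested by the INT instructions on top of the metadata stack, and the destination vSwitch recovering the path in order.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Instruction names follow the example slide; everything else is hypothetical.
INSTRUCTIONS = ("switch_id", "ingress_port", "egress_port", "egress_util")


@dataclass
class Packet:
    payload: bytes
    instructions: Tuple[str, ...] = ()
    # Metadata stack, most recent hop first (as in the figure).
    metadata: List[dict] = field(default_factory=list)


@dataclass
class Switch:
    switch_id: str
    egress_util: float  # illustrative value, assumed to be measured locally

    def forward(self, pkt: Packet, ingress_port: str, egress_port: str) -> Packet:
        state = {
            "switch_id": self.switch_id,
            "ingress_port": ingress_port,
            "egress_port": egress_port,
            "egress_util": self.egress_util,
        }
        # Report only what the instructions ask for, then push on top.
        pkt.metadata.insert(0, {name: state[name] for name in pkt.instructions})
        return pkt


def source_vswitch(payload: bytes) -> Packet:
    """Attach the INT instructions in front of the payload."""
    return Packet(payload=payload, instructions=INSTRUCTIONS)


def sink_vswitch(pkt: Packet):
    """Strip INT data; return the payload and the per-hop report in path order."""
    return pkt.payload, list(reversed(pkt.metadata))


pkt = source_vswitch(b"data")
for sw in (Switch("L1", 0.20), Switch("S2", 0.80), Switch("L2", 0.40)):
    pkt = sw.forward(pkt, "P1", "P2")

payload, report = sink_vswitch(pkt)
path = [hop["switch_id"] for hop in report]
print(path, payload)
```

Keeping the newest entry at the top of the stack means a transit switch never has to look past the INT header to add its own state.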
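INT Header Format
[Figure: header layout, one 4-byte row per line]
Ver | Flags | Instruction Count | Max Hop Count | Total Hop Count
Instruction Bitmap | Reserved
0 | Most Recent INT Metadata
0 | INT Metadata ..
1 | First INT Metadata
Each INT metadata row is 4 bytes: a Metadata Header (the leading 0 or 1 above) followed by the Metadata.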
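As a rough illustration of the layout, the sketch below packs and unpacks the fixed part of the header in Python. The field widths (2, 9, 5, 8, 8, and 16 bits) are the ones used in the P4 header definition in this deck; the Reserved field is left out because its width is not stated, and the example values are arbitrary. `IntHeader`, `pack`, and `unpack` are hypothetical names.

```python
from typing import NamedTuple


class IntHeader(NamedTuple):
    ver: int               # 2 bits
    flags: int             # 9 bits
    ins_cnt: int           # 5 bits
    max_hop_cnt: int       # 8 bits
    total_hop_cnt: int     # 8 bits
    instruction_mask: int  # 16 bits


# (field name, width in bits) in wire order; widths assumed from the P4 snippet.
LAYOUT = (("ver", 2), ("flags", 9), ("ins_cnt", 5),
          ("max_hop_cnt", 8), ("total_hop_cnt", 8), ("instruction_mask", 16))
TOTAL_BITS = sum(width for _, width in LAYOUT)  # 48 bits = 6 bytes


def pack(hdr: IntHeader) -> bytes:
    """Concatenate the fields most-significant first into a byte string."""
    value = 0
    for name, width in LAYOUT:
        item = getattr(hdr, name)
        if not 0 <= item < (1 << width):
            raise ValueError(f"{name}={item} does not fit in {width} bits")
        value = (value << width) | item
    return value.to_bytes(TOTAL_BITS // 8, "big")


def unpack(data: bytes) -> IntHeader:
    """Inverse of pack(): slice each field back out of the leading 6 bytes."""
    value = int.from_bytes(data[:TOTAL_BITS // 8], "big")
    fields = {}
    shift = TOTAL_BITS
    for name, width in LAYOUT:
        shift -= width
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return IntHeader(**fields)


# Arbitrary example values, chosen only to exercise the code.
hdr = IntHeader(ver=0, flags=0, ins_cnt=4, max_hop_cnt=8,
                total_hop_cnt=0, instruction_mask=0xF000)
wire = pack(hdr)
print(wire.hex(), unpack(wire) == hdr)
```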
INT Header: Potential Locations
GENEVE (INT carried in variable-length GENEVE options)
  Outer Eth, IP, UDP Headers
  Geneve Header
  Option Class, Type, Length
  INT Metadata Headers and Metadata
  Inner Payload
VXLAN-GPE (INT as VXLAN Next-Protocol)
  Outer Eth, IP, UDP Headers
  VXLAN GPE Header (Next_Protocol = INT)
  INT Metadata Headers and Metadata
  Inner Payload
INT metadata may also be carried as
• Network Service Header Metadata
• TCP options/payload
• UDP payload
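For the GENEVE case, the sketch below wraps an opaque INT blob in a Geneve option. The 4-byte option header (16-bit Option Class, 8-bit Type, 3 reserved bits, 5-bit Length counted in 4-byte words) follows the Geneve option format; the class and type values are placeholders, since the slides do not say which values INT uses, and the function names are hypothetical.

```python
import struct

# Placeholder identifiers only: the option class and type that INT would use
# are not given in the slides.
INT_OPTION_CLASS = 0xFFFF
INT_OPTION_TYPE = 0x01


def geneve_option(option_class: int, option_type: int, body: bytes) -> bytes:
    """Prefix body with a Geneve option header (class, type, length)."""
    if len(body) % 4:
        raise ValueError("option body must be a multiple of 4 bytes")
    length_words = len(body) // 4
    if length_words > 0x1F:
        raise ValueError("option body exceeds the 5-bit length field")
    # 16-bit class, 8-bit type, then 3 reserved bits (zero) + 5-bit length.
    return struct.pack("!HBB", option_class, option_type, length_words) + body


def parse_geneve_option(data: bytes):
    """Return (class, type, body) for the option at the start of data."""
    option_class, option_type, res_len = struct.unpack("!HBB", data[:4])
    body_len = (res_len & 0x1F) * 4
    return option_class, option_type, data[4:4 + body_len]


int_blob = bytes(8)  # stand-in for INT metadata headers and metadata
option = geneve_option(INT_OPTION_CLASS, INT_OPTION_TYPE, int_blob)
print(option.hex())
```

Because the length field is 5 bits wide, a single option of this form can carry at most 124 bytes, which bounds how much INT data fits in one option.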
INT using P4
§ P4 enables flexible packet parsing and modification for INT
§ P4 allows INT to adapt to
  § Any encapsulation format
  § Any state required to be collected
  § Any feature, protocol (current and future)
INT: P4 Code Snippet

Header Definitions

// Fixed part of the INT header
header_type int_header_t {
    fields {
        ver : 2;
        flags : 9;
        ins_cnt : 5;
        max_hop_cnt : 8;
        total_hop_cnt : 8;
        instruction_mask : 16;
    }
}

// Shim header between the VXLAN-GPE header and the INT header
header_type vxlan_gpe_int_header_t {
    fields {
        int_type : 8;
        rsvd : 8;
        len : 8;
        next_proto : 8;
    }
}

header_type vxlan_gpe_t {
    fields {
        flags : 8;
        reserved : 16;
        next_proto : 8;
        vni : 24;
        reserved2 : 8;
    }
}

Parser Definitions

parser parse_gpe_int_header {
    extract(vxlan_gpe_int_header);
    // Remember the shim length for later processing
    set_metadata(int_metadata.gpe_int_hdr_len, latest.len);
    return parse_int_header;
}

parser parse_int_header {
    extract(int_header);
    ….
}
INT: P4 Code Snippet

Exact-match Table Definition

// One entry per value of the instruction mask
table int_inst {
    reads {
        int_header.instruction_mask : exact;
    }
    actions {
        int_set_header_i0;
        int_set_header_i1;
        int_set_header_i2;
        int_set_header_i3;
        …..
    }
}

Action Definitions

// Each action calls one per-instruction action for every bit set in the mask
action int_set_header_i0() {
}
action int_set_header_i1() {
    int_set_header_3();
}
action int_set_header_i2() {
    int_set_header_2();
}
action int_set_header_i3() {
    int_set_header_3();
    int_set_header_2();
}
…..
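The pattern behind the table and its actions can be modelled in Python as follows. This is a sketch under assumptions: it treats the mask as 4 bits wide with the least significant bit selecting `int_set_header_3`, which matches the three actions shown (i1, i2, i3) but extrapolates to `int_set_header_0` and `int_set_header_1`, which the snippet elides. The actions here only record their own name; what they write into the packet is not modelled.

```python
from typing import Callable, Dict, List

MASK_BITS = 4  # assumed width of the portion of the mask being matched


def make_action(n: int) -> Callable[[List[str]], None]:
    """Stand-in for int_set_header_<n>: it just records that it ran."""
    def action(trace: List[str]) -> None:
        trace.append(f"int_set_header_{n}")
    return action


PRIMITIVES = [make_action(n) for n in range(MASK_BITS)]


def build_table() -> Dict[int, List[Callable[[List[str]], None]]]:
    """Exact-match table: mask value i -> actions run by int_set_header_i<i>."""
    table = {}
    for mask in range(1 << MASK_BITS):
        actions = []
        for bit in range(MASK_BITS):  # bit 0 is the least significant bit
            if mask & (1 << bit):
                # Assumed mapping: bit 0 -> header_3, bit 1 -> header_2, ...
                actions.append(PRIMITIVES[MASK_BITS - 1 - bit])
        table[mask] = actions
    return table


def apply_int_inst(table, instruction_mask: int) -> List[str]:
    """Look up the mask exactly; a miss runs nothing."""
    trace: List[str] = []
    for action in table.get(instruction_mask, []):
        action(trace)
    return trace


table = build_table()
print(apply_int_inst(table, 3))
```

Enumerating every mask value as its own entry trades table size for a single lookup per packet, with no per-bit branching in the data plane.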
INT Application: Real-time monitoring and troubleshooting
Overlay Network Monitoring today
Real-time Network Monitoring
[Figure: demo display with percentage readings that change over time]
Next: Pick a flow on the source logical port and view the path it takes and the exact network state it experiences
Real-time troubleshooting demo
[Figure: topology with Leaf1, Leaf2, Spine1, Spine2]
INT Application: Hop-by-Hop Utilization-Aware Load-balancing Architecture
HULA: INT + Flowlet routing
1. Periodic INT probes
   § disseminate path utilization to switches
2. Flowlet detection and path selection
   § happens at all switches
   § hop-by-hop adaptive routing
INT probes traverse multiple paths
[Figure: topology with ToR, Aggregate, and Spine tiers. Callouts: "Probe originates", "Probe replicates"]
Probes carry path utilization
[Figure: switches S1, S2, S3, S4 between ToR 1 and ToR 10. Three probes are shown, each carrying ToR ID = 10 and a different Max_util: 50%, 80%, 60%]
Probes update switch state
[Figure: switches S1, S2, S3, S4 between ToR 1 and ToR 10. A probe carrying ToR ID = 10, Max_util = 50% updates the PathUtil table]
PathUtil table
Dst      Best hop   Path util
ToR 10   S4         50%
ToR 1    S2         10%
…        …
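A minimal Python sketch of how a switch might fold incoming probes into its PathUtil table. Several things here are assumptions rather than content from the slides: the class and method names, the pairing of the three Max_util values with neighbours S2, S3, and S4, the zero local link utilization, and the rule that a probe from the current best hop always refreshes the entry.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PathEntry:
    best_hop: str
    path_util: float


class HulaSwitch:
    def __init__(self, name: str, link_util: Dict[str, float]):
        self.name = name
        self.link_util = link_util  # local utilization of the link to each neighbour
        self.path_util: Dict[str, PathEntry] = {}  # the PathUtil table

    def on_probe(self, dst_tor: str, max_util: float, from_hop: str) -> bool:
        """Process one probe; return True if the table entry changed."""
        # Bottleneck utilization of the path through from_hop.
        util = max(max_util, self.link_util.get(from_hop, 0.0))
        entry = self.path_util.get(dst_tor)
        # Assumed update rule: take a strictly better path, or refresh the
        # entry when the probe arrives from the current best hop.
        if entry is None or util < entry.path_util or entry.best_hop == from_hop:
            self.path_util[dst_tor] = PathEntry(from_hop, util)
            return True
        return False


# Hypothetical scenario: S1 hears probes for ToR 10 from three neighbours.
s1 = HulaSwitch("S1", link_util={"S2": 0.0, "S3": 0.0, "S4": 0.0})
for hop, util in (("S2", 0.80), ("S3", 0.60), ("S4", 0.50)):
    s1.on_probe("ToR 10", util, hop)

best = s1.path_util["ToR 10"]
print(best.best_hop, best.path_util)
```

Each switch keeps only one best hop per destination ToR, so the table grows with the number of ToRs and not with the number of paths.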
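Switches load balance flowlets
[Figure: switches S1, S2, S3, S4 between ToR 1 and ToR 10. Data packets are forwarded using the Flowlet table and the PathUtil table]
PathUtil table
Dst      Best hop   Path util
ToR 10   S4         50%
ToR 1    S2         10%
…        …
Flowlet table
Dst      Flowlet #   Next hop
Tor_10   1           S4
…        …
…        …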
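The interplay of the two tables can be sketched in Python as below. The names, the use of a precomputed flow hash as the flowlet number, and the 500 µs inactivity gap are assumptions for illustration; the slides do not give a threshold. A packet that follows closely on the previous one stays on its flowlet's next hop, while a packet arriving after a long enough gap starts a new flowlet on the current best hop.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

FLOWLET_GAP = 0.0005  # seconds; assumed threshold, not given in the slides


@dataclass
class FlowletEntry:
    next_hop: str
    last_seen: float


class FlowletRouter:
    def __init__(self, best_hop: Dict[str, str], gap: float = FLOWLET_GAP):
        self.best_hop = best_hop  # dst ToR -> best hop, taken from the PathUtil table
        self.gap = gap
        self.flowlets: Dict[Tuple[str, int], FlowletEntry] = {}  # Flowlet table

    def route(self, dst_tor: str, flow_hash: int, now: float) -> str:
        key = (dst_tor, flow_hash)
        entry = self.flowlets.get(key)
        if entry is None or now - entry.last_seen > self.gap:
            # New flowlet: pin it to the best hop known right now.
            entry = FlowletEntry(self.best_hop[dst_tor], now)
            self.flowlets[key] = entry
        else:
            # Same flowlet: keep the next hop so packets are not reordered.
            entry.last_seen = now
        return entry.next_hop


router = FlowletRouter({"ToR 10": "S4"})
first = router.route("ToR 10", flow_hash=1, now=0.0)

router.best_hop["ToR 10"] = "S2"  # a probe has since changed the best hop
same_flowlet = router.route("ToR 10", flow_hash=1, now=0.0001)
new_flowlet = router.route("ToR 10", flow_hash=1, now=0.01)
print(first, same_flowlet, new_flowlet)
```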
Simulation: Topology Asymmetry
[Figure: simulated topology with 8 servers per leaf. Link labels: 40 Gbps, 40 Gbps, 10 Gbps. One link is marked "Link Failure"]
HULA vs. ECMP
HULA: Advantages
• Topology oblivious
• Adaptive to network dynamics
• Scalable to large topologies
• No separate source routing required
• Programmable in P4!
  § Processing probes
  § Flowlet routing
Summary
§ INT provides real-time network state directly in the data plane
  § Scales to arbitrarily large networks
  § Scales to current and future link speeds
  § Can adapt to any network, any encap, any application
§ Knowledge of real-time network state opens up new possibilities
  § Enhanced monitoring and troubleshooting
  § Network-state aware routing
  § …
More information
http://p4.org/p4/inband-network-telemetry/
Blog post with links to
§ INT demo video
§ INT specification
§ P4 source code repository
More information on utilization-aware routing will be posted on p4.org in the near future
INT Specification: Collaborative Effort
http://p4.org/wp-content/uploads/fixed/INT/INT-current-spec.pdf
Thank You