IntOpt: In-Band Network Telemetry Optimization for NFV Service Chain Monitoring

http://www.diva-portal.org

Postprint

This is the accepted version of a paper presented at IEEE ICC 2019: IEEE International Conference on Communications 2019, Shanghai, China, 20-24 May.

Citation for the original published paper:

Bhamare, D., Kassler, A., Vestin, J., Khoshkholghi, M. A., Taheri, J. (2019)
IntOpt: In-Band Network Telemetry Optimization for NFV Service Chain Monitoring
In: 2019 IEEE International Conference on Communications (ICC) Proceedings
https://doi.org/10.1109/ICC.2019.8761722

N.B. When citing this work, cite the original published paper.

© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Permanent link to this version:
http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-74631

in the substrate network on which SFCs are routed. The proposed framework uses active telemetry probing [12], that is, it periodically inserts separate monitoring probes for the MFs into the network to gather telemetry information from the data-plane using programmable data plane elements and P4. The sink nodes in the physical network send the probes along with the collected telemetry information back to the controller for further analysis. The IntOpt controller executes its commands by communicating them through the SDN controller, which in turn communicates with the underlying physical switches through the control plane [13]. As input to our meta-heuristic, we benchmark the P4 INT framework using P4FPGA [14] to approximate the delay induced by INT operations.

The remainder of the article is organized as follows. In Section II we briefly discuss related work. In Section III we discuss the IntOpt architecture with a motivational example. In Section IV we discuss the implemented heuristics. Section V presents the numerical results, while Section VI concludes the paper and addresses future directions.

II. RELATED WORK

Many solutions have been proposed in academia as well as industry for network monitoring. However, existing solutions mainly focus on the trade-off between expressiveness, accuracy, speed and scalability [1], [5]. For example, systems such as NetQRE [7] and others can support a wide range of queries using stream processors running on general-purpose CPUs, but they incur substantial bandwidth and processing costs to do so. Telemetry systems such as Chimera [6] and Gigascope [8] are expressive in that they cover a wide range of telemetry items; however, they can only support lower packet rates. This is because these systems process all packets at the stream processor, which can become a bottleneck.

In contrast, telemetry systems that rely on programmable switches alone can scale to high traffic rates; however, they can accommodate only a limited set of telemetry items in order to achieve this scalability. For example, Sketchvisor [10], UnivMon [15] and OpenSketch [11] can perform telemetry tasks by executing queries solely in the data-plane at line rate, but the queries that they can support are limited by the computational capabilities and memory in the data-plane, sacrificing expressiveness and accuracy.

Systems such as ElasticSketch [16] and Marple [17] obtain a good balance between expressiveness and scalability; however, they incur substantial processing overheads, delays and traffic overheads. To overcome this problem, in this work we propose an approach to minimize the overhead associated with monitoring so as to make the underlying monitoring framework scalable as well as expressive.

The authors in [18] propose the use of a piggyback technique to monitor network statistics. However, we argue that the traffic in NFV and SFC architectures is unpredictable and hence might come in bursts. Due to this, piggybacking may fail to deliver accurate per-flow statistics. In this work, we advocate active telemetry probing [12], since it is an effective way to perform network monitoring. It is especially effective in the dynamic service chaining architecture, since each service chain may have been allocated different network slices with different QoS requirements undergoing different treatment in the data-plane. Inserting active probes, however, can be expensive and may lead to queue buildups, buffer-bloat, packet drops and network congestion as well as delays, especially if it is performed in an unplanned ad-hoc manner. To minimize the overhead associated with active probing, we develop a simulated annealing based random greedy meta-heuristic (SARG) that determines the optimal set of monitoring flows (MFs) in order to fulfill all the monitoring requirements of service flows (SFs). In the subsequent sections we discuss our proposed IntOpt architecture along with the proposed SARG meta-heuristic in more detail.

III. ACTIVE PROBING OPTIMIZATION

A. INT - In-Band Network Telemetry

An important goal of any network monitoring framework is to enable fine-grained monitoring at scale. Typically, such monitoring frameworks involve expensive interaction with the control plane and are either not scalable or lack expressiveness. With recent advances in programmable data-plane devices based on P4 [8], together with compiler and run-time support, adding monitoring support in the data plane becomes possible at line rate. The main idea of the INT framework [9] is that programmable data plane elements (e.g. a P4-capable router that runs the INT framework) can add telemetry instructions to individual monitoring probes, which the switches parse and interpret in order to collect the desired telemetry items. Such monitoring information may include packet queuing time, port utilization, etc., and is added as custom headers to each probe using the INT framework.

Programmable data plane devices complying with the INT standard are divided into three categories: INT sources, INT forwarders and INT sinks. INT sources add telemetry instruction headers, thus instructing forwarding nodes which telemetry items to collect. Using the P4 framework, INT forwarders parse instruction headers, collect the relevant telemetry information and add it to the packet headers. Finally, INT sinks remove telemetry headers and forward the collected information to a data sink (e.g. a stream processor) where further processing can be done (e.g. detecting per-switch micro-bursts, applying machine learning on collected state information, etc.). Using the INT framework, packets collect observed network state from INT-enabled devices and aggregate such state in flexible headers while traversing the network [9], [12].
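The division of labour between INT sources, forwarders and sinks can be illustrated with a minimal Python sketch. It is a toy model only: the `Probe` class, the function names, the telemetry item names and the switch state values are all hypothetical, and nothing here reflects the INT wire format or a P4 implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Probe:
    """Toy monitoring probe; field names are illustrative, not the INT header layout."""
    instructions: tuple = ()                        # telemetry items to collect
    telemetry: list = field(default_factory=list)   # (switch, data) pairs, one per hop


def int_source(probe, items):
    """INT source: attach the telemetry instruction header."""
    probe.instructions = tuple(items)
    return probe


def int_forwarder(probe, switch_id, state):
    """INT forwarder: parse the instructions and append the requested items."""
    if probe.instructions:
        data = {item: state[item] for item in probe.instructions}
        probe.telemetry.append((switch_id, data))
    return probe


def int_sink(probe):
    """INT sink: strip the telemetry headers and return the collected data."""
    report = list(probe.telemetry)
    probe.instructions, probe.telemetry = (), []
    return report


probe = int_source(Probe(), ["queue_time_us", "port_util"])
for switch in ("SW2", "SW3"):
    # Placeholder switch state; only the instructed items are copied into the probe.
    int_forwarder(probe, switch, {"queue_time_us": 12, "port_util": 0.4, "drops": 0})
report = int_sink(probe)
```

After the sink step, `report` holds one entry per traversed forwarder and the probe itself no longer carries telemetry headers, which mirrors the role split described above.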

B. IntOpt Architecture

The proposed IntOpt architecture is shown in Fig. 1. The IntOpt controller communicates with the SDN controller using East-West interfaces. It retrieves information such as the underlying physical topology, the physical links, the SFCs, their actual deployment over the physical nodes and their monitoring demands from the SDN controller (black double-dashed line from SDN controller). The figure shows a network topology with six physical switches (SW1 to SW6). Two service flows, flow A-Z and flow B-Y, are deployed through these switches as shown in Fig. 1.

The IntOpt controller maps service flow (SF) telemetry items and frequency demands to the respective physical links. It then finds the optimal probing frequency as well as the total telemetry items which need to be monitored for each physical link in order to cover all SFs with minimal overhead at the data plane as well as at the controller. The controller then prepares an optimal set of monitoring flows (MFs) so that all the given SFs are monitored along with their monitoring demands in terms of telemetry frequency and telemetry items. The IntOpt controller performs these tasks by executing the SARG meta-heuristic proposed in this work. The details of the SARG meta-heuristic are explained in Section IV.

Figure 1. IntOpt Architecture

The IntOpt controller then identifies telemetry sources (SW1 in Fig. 1), forwarders (SW2-SW5) and sinks (SW6), and commands the SDN controller to populate flow tables accordingly (black dashed line). Since we use active telemetry probing to monitor the network, IntOpt also commands the SDN controller to send the periodic monitoring probes for each monitoring flow as per the telemetry frequency determined in the previous step (black double-dashed line to SDN controller). P4 programmable switches are responsible for parsing the monitoring probes, inserting the telemetry items and forwarding the probes to the correct output port (red dotted line). The controller acts as the data sink by instructing the sink switch (SW6) of each monitoring flow to forward the collected information to itself through the SDN controller (red double-dashed line). It then maps the collected telemetry information back to individual SF requirements to check for any SLA violation, such as exceeding the total delay or the buffer queue size at an intermediate switch.

C. Optimized Active Telemetry Probes Generation

In this sub-section we illustrate our concept of preparing optimal monitoring flows (MFs) with a toy example. We consider six different service function chains (SFCs) as service flows (SFs) as shown in Fig. 2, with 15 virtual functions (VFs), numbered from 1 to 15. The numbers on the SFC blocks (inside circles and rectangles) indicate the VFs that a particular SFC is comprised of. Please note that SFCs may share VFs. SFCs may have different sizes and shapes as shown in Fig. 2. This may be due to back and forth traffic flows among the VFs and their execution order. We consider different complex SFC shapes to make the solution generic [2], [19].

While deploying the monitoring flows, however, we only consider linear flows. That is, we do not allow forking or loop formation in MFs. The probes can simply be forwarded on to the port which is inserted as the next hop in the telemetry header. Since we propose to perform mapping and extraction of telemetry data at the controller, similar to the approach proposed in [5], implementing linear MFs reduces the controller overhead while preparing the MFs as well as while gathering the collected information from the probes and mapping it back to the SFs. This also keeps the probe forwarding logic at the switches simple and fast.

Figure 2. Service Flows and deployment over Atlanta network.

Let us consider a 15-node Atlanta topology from SNDLib [20] for deployment of the six SFs. A possible deployment of SFs over the given substrate network is given in Fig. 2. To illustrate our approach, we focus on a specific SF, SF1, with blue rectangular VFs. The deployment of SF1 over the substrate network is shown in Fig. 2 with double-dashed blue lines. Let us assume for simplicity that we implement a separate monitoring flow for each SF, following its exact shape on the substrate network. As a result, monitoring flow MF1 for SF1 will also follow the same path as SF1. As we notice, at node 15, MF1 has to split and the probes get forwarded to two nodes, node 8 and node 9. If we aim to implement simple “next-hop look-up and forward” functionality at the intermediate switches, then it becomes complex to keep track of the next forwarding port due to the split-up of the MF. One way to achieve this is to let the controller keep track of each switch's unique ID (such as its MAC address) and the forwarding port on that switch for that particular MF, and embed the whole information in the probe. Each switch then performs the match and forwards the monitoring probe accordingly.

Figure 3. 5 monitoring flows to cover all the service flows.

The major drawback of such a scheme is the delay incurred due to the processing overhead at intermediate switches. Also, the probe size increases as more data needs to be embedded (here the MAC of the switch) at every hop, adding to the overhead. Alternatively, we can implement two linear MFs, that is, MF1 with path (15-9-10-7-14) and MF2 with path (15-8-1-7-14). In this case, forwarding is linear and simple. A simple next-hop port number can be inserted in the probe to guide the intermediate switch to forward the probe to the next hop. However, the solution is still non-optimal as the link E7,14 is covered twice unnecessarily by the probes. Such overhead due to non-optimal deployment of monitoring flows may increase significantly with an increase in the number of service flows. In this simple case with a single SF, the optimal solution would be to deploy two linear MFs with paths (15-9-10-7-14) and (15-8-1-7). With this motivational example, we demonstrate that it is sufficient to deploy 5 monitoring flows as shown in Fig. 3, so that all the given SFs can be monitored. Note that all the used physical links in the topology of Fig. 2 are covered by at least one MF in Fig. 3. We cast this as an optimization problem and develop a simulated annealing based random greedy meta-heuristic (SARG) approach to prepare the optimal set of MFs, as explained in the next section.
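The redundancy in the SF1 example can be checked with a small Python sketch that counts how many monitoring flows traverse each link. The node paths are the ones quoted in the text; treating links as undirected and the helper names `path_links` and `coverage` are assumptions made for illustration.

```python
from collections import Counter


def path_links(path):
    """Return the links traversed by a node path (undirected, by assumption)."""
    return [tuple(sorted(hop)) for hop in zip(path, path[1:])]


def coverage(paths):
    """Count how many monitoring flows traverse each link."""
    return Counter(link for path in paths for link in path_links(path))


two_full_mfs = [(15, 9, 10, 7, 14), (15, 8, 1, 7, 14)]
trimmed_mfs = [(15, 9, 10, 7, 14), (15, 8, 1, 7)]

# Links probed by more than one monitoring flow in each variant.
duplicates_full = {l: c for l, c in coverage(two_full_mfs).items() if c > 1}
duplicates_trimmed = {l: c for l, c in coverage(trimmed_mfs).items() if c > 1}
```

Both variants cover the same set of links, but only the first one probes link (7, 14) twice; the trimmed variant saves one forwarding hop per probing period.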

IV. ALGORITHMS FOR ACTIVE PROBING OPTIMIZATION

In this section we develop a simulated annealing based random greedy heuristic that determines the optimal set of monitoring flows (MFs) in order to fulfill all the monitoring requirements of service flows (SFs) while minimizing the overhead. We also benchmark the INT framework using P4FPGA, as explained in Section V. We now explain our SARG approach to prepare the optimal set of MFs. As a first step towards preparing the optimal set of MFs, we prepare the set E of links covered by the SFs and map the SF telemetry items and frequency demands to the respective physical links. Then we calculate the minimum number of MFs needed to monitor all the links in the set E. To achieve this, the heuristic prepares two sets, µij and δij. µij denotes a strict bound for the telemetry frequency demands and δij denotes a strict bound for the telemetry item demands for all SFs passing through the link Eij. We implement a pre-processing step at the controller and maintain separate data structures to keep track of µij and δij.

We now demonstrate the above concept with an example that shows a simple substrate network and three service flows (SFs) in Fig. 4. Let us denote the sets of telemetry items demanded by SF1, SF2 and SF3 as S1, S2 and S3 respectively. The frequency and telemetry item demands for each SF are given in Table I, with values selected randomly. The relationship between the telemetry item demands of the SFs is shown by the Venn diagram in Fig. 5. That is, S2, a set of 10 telemetry items demanded by SF2, is a superset of S1, the 5 items demanded by SF1. However, S3 for SF3 intersects S1 and S2 as shown in Fig. 5, with 2 items in common with SF1 and 3 more items in common with SF2 (in total 5 items in common with SF2).

Figure 4. Service Flow demands to link demands mapping.

Figure 5. Telemetry item demands for service flows.

Table I
SERVICE FLOW FREQUENCY AND TELEMETRY DEMANDS

Service Flow    Frequency (ms)    Telemetry Items
SF1             5                 5
SF2             1                 10
SF3             10                10

The pre-processing step maps SF monitoring demands to link demands and fills the sets µij and δij as shown in Table II. As we observe, if only one service flow is passing through the link, then the telemetry frequency demand mappings are straightforward (such as links E12, E35, E56 and E46). However, for links accommodating more than one SF, we need to determine the appropriate mappings. For example, the monitoring frequency demand for link E23 should be 1 ms since, from Table I, it is the strictest telemetry frequency demand among all the SFs passing through link E23. It also covers the telemetry frequency demands of the other SFs, which are greater than 1 ms. Similarly, for the link E34 the monitoring frequency demand should be 5 ms as it is stricter (5 ms) compared to the other (10 ms), and so on.

Table II
FREQUENCY DEMANDS MAPPINGS

Link    Frequency (µij)    Telemetry Items (δij)
E12     5                  S1
E23     1                  S4 = S1 ∪ S2 ∪ S3
E34     1                  S2
E35     5                  S5 = S1 ∪ S3
E46     1                  S2
E56     10                 S3
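The pre-processing step can be sketched in Python as taking, for every link, the smallest probing period and the union of the item sets of the SFs that cross it. The SF paths below are an assumption: Fig. 4 is not reproduced here, so the paths were chosen only because they reproduce the rows of Table II. The item identifiers are placeholders as well; only the set sizes and overlaps come from the text.

```python
def link_demands(flows):
    """Map per-SF demands onto links.

    flows: {name: (node_path, period_ms, item_set)}
    Returns {(i, j): (mu, delta)}, where mu is the strictest (smallest)
    period and delta the union of the item sets of all SFs on that link.
    """
    demands = {}
    for path, period, items in flows.values():
        for link in zip(path, path[1:]):
            mu, delta = demands.get(link, (period, frozenset()))
            demands[link] = (min(mu, period), delta | items)
    return demands


# Placeholder item identifiers with the overlaps described for Fig. 5.
S1 = frozenset(range(0, 5))                              # 5 items
S2 = frozenset(range(0, 10))                             # 10 items, superset of S1
S3 = frozenset(range(3, 8)) | frozenset(range(10, 15))   # 10 items, 5 shared with S2

# Hypothetical paths, picked to be consistent with Table II.
flows = {
    "SF1": ((1, 2, 3, 5), 5, S1),
    "SF2": ((2, 3, 4, 6), 1, S2),
    "SF3": ((2, 3, 5, 6), 10, S3),
}
table = link_demands(flows)
```

With these inputs, `table[(2, 3)]` carries a period of 1 ms together with the union of all three item sets, and `table[(3, 5)]` carries 5 ms together with S1 ∪ S3.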

Similarly, we map the telemetry item demands of the SFs to the link demands (the Telemetry Items column in Table II). For example, we observe that on the link E23, the set S2 with 10 items for SF2 also covers S1 with 5 items for SF1 (as S1 ⊂ S2, as mentioned earlier). However, every 10th ms, we need to insert a new set S4 of telemetry items in the monitoring flow over E23 such that S4 = S1 ∪ S2 ∪ S3. This is because SF3 has some telemetry items in S3 which are not covered by S2. As we can see, the size of S4 is 15 telemetry items.
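The size of S4 follows from inclusion-exclusion on the set sizes given above, using S1 ⊂ S2 and the 5 items that S3 shares with S2:

```latex
\begin{align*}
|S_4| &= |S_1 \cup S_2 \cup S_3| = |S_2 \cup S_3| && \text{since } S_1 \subset S_2 \\
      &= |S_2| + |S_3| - |S_2 \cap S_3| \\
      &= 10 + 10 - 5 = 15 .
\end{align*}
```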

Once the mappings are done, the Random Greedy procedure maps link monitoring demands to MFs so that all SFs with the specified SLAs are covered while the total number of MFs is minimized. Algorithm 1 illustrates the steps of our Random Greedy policy. We begin by initializing a monitoring flow and adding a random link to it. We keep adding more links to the MF by selecting them sequentially thereafter, provided the link being added has similar or less strict monitoring demands than the existing set of links in the given MF. This is repeated until the size of the MF grows beyond the threshold. At that instance, the heuristic terminates the monitoring flow and starts a new one. The process is repeated until all links are covered.

Algorithm 1 Random Greedy Approach integrated with Simulated Annealing
procedure RANDOM GREEDY
    E ⇒ set of the edges covered by all service flows
    λf ⇒ monitoring flow size threshold
    Eij ← Random Select(E)
    Initialize a monitoring flow mf
    Monitoring Frequency(mf) ← µij
    Telemetry Items(mf) ← δij
    Set Eij as a start link of mf
    while E ≠ ∅ do
        Ejk ← Sequential Select(E)
        if µij = µjk and δij = δjk then
            mf ← mf + Ejk
            E ← E − Ejk
        else
            terminate mf and initiate new flow mf+1
            break the loop
        end if
        if Size(mf) > λf then
            terminate mf and initiate new flow
            break the loop
        end if
    end while
end procedure
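One possible reading of Algorithm 1 is sketched below in Python. It is an interpretation under stated assumptions, not the authors' implementation: links are directed `(i, j)` pairs, a link extends the current MF only if it starts at the MF's tail node, does not revisit a node, and has exactly the same demand (the equality test of the pseudocode rather than the "similar or less strict" rule of the prose). The names `random_greedy` and `max_links` are hypothetical.

```python
import random


def random_greedy(demands, max_links, rng=random):
    """Group links into linear monitoring flows.

    demands: {(i, j): (mu, delta)}; returns a list of (links, demand) pairs.
    """
    remaining = dict(demands)
    flows = []
    while remaining:
        # Random start link, as in "Random Select(E)".
        start = rng.choice(sorted(remaining))
        demand = remaining.pop(start)
        mf, visited = [start], set(start)
        while len(mf) < max_links:          # size threshold (lambda_f)
            tail = mf[-1][1]
            nxt = next(
                (l for l in sorted(remaining)
                 if l[0] == tail and l[1] not in visited
                 and remaining[l] == demand),
                None,
            )
            if nxt is None:                 # no compatible link: terminate this MF
                break
            mf.append(nxt)
            visited.add(nxt[1])
            del remaining[nxt]
        flows.append((mf, demand))
    return flows


A, B = frozenset({"queue"}), frozenset({"queue", "util"})
demands = {(1, 2): (5, A), (2, 3): (5, A), (3, 4): (1, B)}
flows = random_greedy(demands, max_links=3, rng=random.Random(7))
```

Because an MF only grows forward from its random start link, different starts can yield different numbers of MFs for the same input, which is what the simulated annealing loop around this procedure exploits.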

We develop a simulated annealing based meta-heuristic, shown in Algorithm 2, which prevents the random greedy approach from getting stuck in local minima. The quality attribute of the solution returns the total number of encap/decap plus forwarding instances at the data plane due to the proposed MF deployment scheme in the given solution. It is used as the fitness function (explained in more depth in Section V) for comparison of the solutions.

We also implement an ad-hoc approach, which we denote as the naïve algorithm, which is unaware of the optimization policies and just tries to avoid forking or looping of the MFs, which is the basic requirement for an MF to be a valid flow. In the naïve implementation, we just start an MF for each SF and follow it linearly. If there is any forking or loop formation in the SF, we break the existing MF and form a new one. This is the typical approach which is generally followed in the absence of any sophisticated algorithm for MF formation. The steps of the naïve approach are given in Algorithm 3. In the next section, we analyze the benchmarking results as well as the numerical results obtained through simulation of the two algorithms discussed.

Algorithm 2 Simulated Annealing meta-heuristic steps
procedure SIMULATED ANNEALING
    temperature ← λ
    cooling_rate ← α
    prev_sol ← new_sol ← best_sol ← NULL
    while temperature > 1 do
        new_sol ← Random Greedy Procedure()
        if e^((new_sol.quality − prev_sol.quality)/λ) > Random(0, 1) then
            prev_sol ← new_sol
            if new_sol.quality > best_sol.quality then
                best_sol ← new_sol
            end if
        end if
        temperature ← temperature × (1 − cooling_rate)
    end while
end procedure
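The annealing loop of Algorithm 2 can be sketched in Python as follows. This is a generic sketch under assumptions, not a transcription: it treats the fitness as a cost to be minimized and uses the conventional Metropolis acceptance rule exp(-Δ/temperature), which differs in sign convention from the expression printed in Algorithm 2. The callables `generate` (standing in for the Random Greedy procedure) and `cost` (standing in for the encap/decap plus forwarding count) are hypothetical parameters.

```python
import math
import random


def simulated_annealing(generate, cost, temperature=100.0,
                        cooling_rate=0.05, rng=random):
    """Return the cheapest candidate seen while the temperature cools."""
    prev = best = generate()
    while temperature > 1:
        new = generate()
        delta = cost(new) - cost(prev)
        # Improvements are always accepted; worse candidates are accepted
        # with probability exp(-delta / temperature).
        if delta <= 0 or math.exp(-delta / temperature) > rng.random():
            prev = new
            if cost(new) < cost(best):
                best = new
        temperature *= 1 - cooling_rate
    return best


# Toy usage: candidates are plain integers and the cost is the value itself.
r = random.Random(0)
seen = []


def toy_generate():
    seen.append(r.randint(1, 50))
    return seen[-1]


best = simulated_annealing(toy_generate, lambda x: x, rng=r)
```

In this variant every candidate that beats the best solution so far is necessarily an improvement over the previous one as well, so the returned value is the minimum-cost candidate among all those generated.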

Algorithm 3 Naïve ad-hoc Approach
procedure NAIVE APPROACH
    s ← Sequential Select(R)
    E ⇒ set of the edges covered by service flow s
    λf ⇒ monitoring flow size threshold
    Initialize a monitoring flow mf
    Eij ← Sequential Select(E)
    Set Eij as a start link of mf
    while E ≠ ∅ do
        Ejk ← Sequential Select(E)
        if (no loop or fork) and µij = µjk and δij = δjk then
            mf ← mf + Ejk
            E ← E − Ejk
        else
            terminate mf and initiate new flow mf+1
            break the loop
        end if
        if Size(mf) > λf then
            terminate mf and initiate new flow
            break the loop
        end if
    end while
end procedure
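For a single service flow, the naïve splitting rule can be sketched in Python as below. The sketch assumes the SF is given as an ordered list of directed links and omits the demand comparison, since all links of one SF share that SF's demand; the function name `naive_flows` and the link ordering used for SF1 are assumptions.

```python
def naive_flows(sf_links, max_links):
    """Cut one service flow's ordered link list into linear monitoring flows."""
    flows, mf, visited = [], [], set()
    for i, j in sf_links:
        fork = bool(mf) and mf[-1][1] != i      # next link does not continue the MF
        loop = bool(mf) and j in visited        # next link returns to a visited node
        if fork or loop or len(mf) >= max_links:
            flows.append(mf)                    # terminate the current MF
            mf, visited = [], set()
        mf.append((i, j))
        visited.update((i, j))
    if mf:
        flows.append(mf)
    return flows


# SF1 from the Atlanta example, with an assumed traversal order of its links.
sf1 = [(15, 9), (9, 10), (10, 7), (7, 14), (15, 8), (8, 1), (1, 7)]
mfs = naive_flows(sf1, max_links=10)
```

The fork at node 15 forces a second MF, so SF1 is split into the two linear flows (15-9-10-7-14) and (15-8-1-7).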

V. NUMERICAL RESULTS

In this section we first evaluate the monitoring overhead of INT operations by benchmarking the P4 implementation with P4FPGA. We then use the benchmarking results to compare our proposed SARG scheme with the ad-hoc naïve approach in terms of monitoring overheads and present our numerical evaluations.

In order to approximate the delay induced by INT operations, which is mainly due to packet parsing and pushing telemetry headers, we performed a series of experiments using the NetFPGA-SUME hardware platform. We implemented a P4 program using the P4FPGA toolkit [14], based on the INT specification, which parses incoming packets and pushes INT headers (encapsulation) accordingly [9]. The headers are divided into two types, Telemetry Instruction and Telemetry Data headers. Instruction headers contain a set of instructions that determine which telemetry data items should be pushed by each switch, along with various metadata. Telemetry data headers contain actual INT data such as queue occupation, switch traversal latency, etc. The P4 INT program checks incoming packets for instruction headers and, if one is found, pushes the telemetry data specified in the instruction header. Should a packet arrive without an instruction header, the switch can be configured through the control plane to either forward it as a normal packet or insert a telemetry instruction header. This is configured through a match key, such as source port or destination address.

We ran experiments with both edge and core switch P4 programs, pushing from 0 to 8 INT data headers. Edge switches were configured to push both the instruction header and the configured number of data headers, while the core switches only pushed data headers. Fig. 6 shows the experienced delays against the number of header fields inserted for core as well as edge switches. As we observe, there is a linear relationship between the telemetry data pushed in the packet and the delays observed. Also, the edge switch has higher latency due to the additional push operation for instruction headers.

The tasks performed at the switches while monitoring probes pass through them, and which cause significant delays, are (1) encap/decap of the probes with telemetry headers at the source and the sink and (2) parsing the telemetry header and inserting the telemetry items accordingly (which we call Forwarding or FW for simplicity). We now evaluate the performance of our proposed SARG scheme and the naïve approach in terms of total encap/decap plus FW instances as well as the total delays incurred due to such operations while all the probes pass through the network. We apply a linear fit to the delay results obtained using P4-NetFPGA switches (Fig. 6) to extrapolate delay values for larger topologies.
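A linear fit of delay against the number of pushed headers, followed by extrapolation, can be sketched in Python with ordinary least squares. The sample values below are synthetic placeholders in arbitrary units, not the measurements behind Fig. 6, and the helper name `linear_fit` is hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


headers = list(range(9))                     # 0 to 8 INT data headers
delays = [2.0 + 0.5 * h for h in headers]    # synthetic placeholder values
a, b = linear_fit(headers, delays)

# Extrapolate to a header count outside the benchmarked range.
predicted_12 = a + b * 12
```

The intercept `a` plays the role of the fixed per-packet cost and the slope `b` the cost per additional telemetry header; separate fits would be needed for edge and core switches, since edge switches also push the instruction header.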

We use the Europe topology from SNDLib, which has 37 nodes in total and 178 links [20]. We generate the service chains randomly. We also select specific parameters for the SFs, such as average length in terms of hops, telemetry item demands and telemetry frequency demands, randomly from specific ranges given as an input to the heuristic. We also vary the total number of SFs deployed in the network to observe the effect on the final results.

We consider two different cases for the average hop-length and telemetry item demands of the SFs. In the first case, the average hop-length is 5, the number of telemetry items needed is 10 and the telemetry frequency demand is 5 ms. In the second case, the hop-length is 10, the number of telemetry items needed is 10 and the telemetry frequency demand is 10 ms. Fig. 7 shows the total encap/decap plus forwarding instances in the given network along the Y-axis against different numbers of service flows along the X-axis. In Fig. 8 we plot the total delays incurred due to the monitoring flows. For this purpose, we use the overhead values obtained from the experimental benchmarking of the P4 INT framework on the NetFPGA-SUME hardware as discussed above.

Figure 6. NetFPGA results for Core and edge switches.

Figure 7. Total encap-decap and FW instances.

As we observe in Fig. 7, the total encap/decap plus FW overhead is lowest for the proposed SARG scheme. For example, for 25 SFs with case 1 (10H/5L), the number is 100 (green squared solid line), whereas for the naïve approach the number is 163 (red circled dotted line). For case 2 (10H/10L) the numbers are 122 and 210 for SARG and the naïve approach respectively. The corresponding delays for case 1, as shown in Fig. 8, are 300 and 700 microseconds for SARG and the naïve approach respectively. For case 2 the delays are observed to be 400 and 800 microseconds. We also observe that such delays increase with the increase in average hop length of the SFs (green squared solid line vs. blue triangle solid line). In Fig. 9 we keep the total number of SFs constant at 50 and vary the telemetry item requirements along the X-axis. We observe linear growth, which is due to the linearly growing overhead of the P4 header pushing operations. The results show that the proposed SARG approach can reduce the monitoring overhead by 39% and the total delays by 57%. The numerical evaluation demonstrates that, with a systematic approach such as SARG, the monitoring overheads can be reduced significantly.
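As a consistency check, the reported reductions agree with the case 1 figures quoted above for 25 SFs (this is our own arithmetic on those quoted values, not an additional result):

```latex
\begin{align*}
\text{overhead reduction} &= \frac{163 - 100}{163} \approx 0.387 \approx 39\% \\
\text{delay reduction}    &= \frac{700 - 300}{700} \approx 0.571 \approx 57\%
\end{align*}
```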

VI. CONCLUSIONS

In this work, we propose IntOpt, a scalable and expressive telemetry framework, and develop a simulated annealing based random greedy (SARG) meta-heuristic for optimally deploying the MFs for flexible service function chain monitoring. The IntOpt controller prepares the optimal set of MFs by executing the proposed SARG approach and calculates the optimal probing frequency as well as the total telemetry items to be monitored for each link in order to cover all service flows with minimal overhead at the data plane as well as at the controller. The controller then identifies proper telemetry sources, forwarders and sinks, and populates flow tables accordingly through the SDN controller. In addition to our proposed SARG meta-heuristic, we also implement an ad-hoc naïve approach, which is generally followed in the absence of a systematic flow generation strategy. We benchmark the actual incurred overheads and latency due to telemetry operations using the P4 INT framework, P4FPGA and the SUME hardware platform for a variety of telemetry items. Our evaluation shows that using our heuristic significantly reduces the total monitoring overhead and the delays introduced due to the telemetry operations. We argue that such a systematic approach can be incorporated into existing monitoring frameworks to obtain scalability without losing the generality and expressiveness of the systems.

Figure 8. Total delays against total SFs.

Figure 9. Total delays against total telemetry items.

As future work, we aim to develop an optimization model to obtain the optimal number of monitoring flows, compare the heuristics against the optimal solution, and consider different substrate network topologies. We also aim to provide an architecture to integrate the proposed scheme with existing networking architectures such as ONAP.

VII. ACKNOWLEDGMENT

The authors received partial funding from the Knowledge Foundation of Sweden through the Profile HITS (Grant Number 20140037).

REFERENCES

[1] T. Yang, J. Jiang, P. Liu, Q. Huang, J. Gong, Y. Zhou, R. Miao, X. Li, and S. Uhlig, “Elastic sketch: Adaptive and fast network-wide measurements,” in Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, pp. 561–575, ACM, 2018.

[2] D. Bhamare, M. Samaka, A. Erbad, R. Jain, L. Gupta, and H. A. Chan, “Optimal virtual network function placement in multi-cloud service function chaining architecture,” Computer Communications, vol. 102, pp. 1–16, 2017.

[3] D. Bhamare, R. Jain, M. Samaka, and A. Erbad, “A survey on service function chaining,” Journal of Network and Computer Applications, vol. 75, pp. 138–155, 2016.

[4] D. Bhamare, A. Erbad, R. Jain, M. Zolanvari, and M. Samaka, “Efficient virtual network function placement strategies for cloud radio access networks,” Computer Communications, 2018.

[5] A. Gupta, R. Harrison, A. Pawar, R. Birkner, M. Canini, N. Feamster, J. Rexford, and W. Willinger, “Sonata: Query-driven network telemetry,” arXiv preprint arXiv:1705.01049, 2017.

[6] K. Borders, J. Springer, and M. Burnside, “Chimera: A declarative language for streaming network traffic analysis,” in USENIX Security Symposium, pp. 365–379, 2012.

[7] Y. Yuan, D. Lin, A. Mishra, S. Marwaha, R. Alur, and B. T. Loo, “Quantitative network monitoring with netqre,” in Proceedings of the Conference of the ACM Special Interest Group on Data Communication, pp. 99–112, ACM, 2017.

[8] P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D. Talayco, A. Vahdat, G. Varghese, et al., “P4: Programming protocol-independent packet processors,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 3, pp. 87–95, 2014.

[9] C. Kim, A. Sivaraman, N. Katta, A. Bas, A. Dixit, and L. J. Wobker, “In-band network telemetry via programmable dataplanes,” in ACM SIGCOMM, 2015.

[10] Q. Huang, X. Jin, P. P. Lee, R. Li, L. Tang, Y.-C. Chen, and G. Zhang, “Sketchvisor: Robust network measurement for software packet processing,” in Proceedings of the Conference of the ACM Special Interest Group on Data Communication, pp. 113–126, ACM, 2017.

[11] M. Yu, L. Jose, and R. Miao, “Software defined traffic measurement with opensketch,” in NSDI, vol. 13, pp. 29–42, 2013.

[12] P. Lapukhov and R. Chang, “Data-plane probe for in-band telemetry collection,” Internet-Draft draft-lapukhov-dataplane-probe-01, Internet Engineering Task Force, June 2016. Work in Progress.

[13] R. Jain and S. Paul, “Network virtualization and software defined networking for cloud computing: a survey,” IEEE Communications Magazine, vol. 51, no. 11, pp. 24–31, 2013.

[14] H. Wang, R. Soule, H. T. Dang, K. S. Lee, V. Shrivastav, N. Foster, and H. Weatherspoon, “P4fpga: A rapid prototyping framework for p4,” in Proceedings of the Symposium on SDN Research, SOSR ’17, (New York, NY, USA), pp. 122–135, ACM, 2017.

[15] Z. Liu, A. Manousis, G. Vorsanger, V. Sekar, and V. Braverman, “One sketch to rule them all: Rethinking network flow monitoring with univmon,” in Proceedings of the 2016 ACM SIGCOMM Conference, pp. 101–114, ACM, 2016.

[16] A. Kumar, M. Sung, J. J. Xu, and J. Wang, “Data streaming algorithms for efficient and accurate estimation of flow size distribution,” in ACM SIGMETRICS Performance Evaluation Review, vol. 32, pp. 177–188, ACM, 2004.

[17] S. Narayana, A. Sivaraman, V. Nathan, P. Goyal, V. Arun, M. Alizadeh, V. Jeyakumar, and C. Kim, “Language-directed hardware design for network performance monitoring,” in Proceedings of the Conference of the ACM Special Interest Group on Data Communication, pp. 85–98, ACM, 2017.

[18] C. Yu, C. Lumezanu, Y. Zhang, V. Singh, G. Jiang, and H. V. Madhyastha, “Flowsense: Monitoring network utilization with zero measurement cost,” in International Conference on Passive and Active Network Measurement, pp. 31–41, Springer, 2013.

[19] D. Bhamare, M. Samaka, A. Erbad, R. Jain, L. Gupta, and H. A. Chan, “Multi-objective scheduling of micro-services for optimal service function chains,” in 2017 IEEE International Conference on Communications (ICC), pp. 1–6, May 2017.

[20] S. Orlowski, M. Pioro, A. Tomaszewski, and R. Wessaly, “SNDlib 1.0–Survivable Network Design Library,” in Proceedings of the 3rd International Network Optimization Conference (INOC 2007), Spa, Belgium, April 2007. http://sndlib.zib.de, extended version accepted in Networks, 2009.
