
IEEE INTERNET OF THINGS JOURNAL, VOL. 6, NO. 1, FEBRUARY 2019

End-to-End Delay Modeling for Embedded VNF Chains in 5G Core Networks

Qiang Ye, Member, IEEE, Weihua Zhuang, Fellow, IEEE, Xu Li, and Jaya Rao, Member, IEEE

Abstract—In this paper, an analytical end-to-end (E2E) packet delay modeling is established for multiple traffic flows traversing an embedded virtual network function (VNF) chain in fifth generation communication networks. The dominant-resource generalized processor sharing is employed to allocate both computing and transmission resources among flows at each network function virtualization (NFV) node to achieve dominant-resource fair allocation and high resource utilization. A tandem queueing model is developed to characterize packets of multiple flows passing through an NFV node and its outgoing transmission link. For analysis tractability, we decouple packet processing (and transmission) of different flows in the modeling and determine average packet processing and transmission rates of each flow as approximated service rates. An M/D/1 queueing model is developed to calculate packet delay for each flow at the first NFV node. Based on the analysis of packet interarrival time at the subsequent NFV node, we adopt an M/D/1 queueing model as an approximation to evaluate the average packet delay for each flow at each subsequent NFV node. The queueing model is proved to achieve more accurate delay evaluation than that using a G/D/1 queueing model. Packet transmission delay on each embedded virtual link between consecutive NFV nodes is also derived for E2E delay calculation. Extensive simulation results demonstrate the accuracy of our proposed E2E packet delay modeling, upon which delay-aware VNF chain embedding can be achieved.

Index Terms—Bi-resource allocation, CPU and bandwidth resources, dominant-resource generalized processor sharing (DR-GPS), embedded virtual network function (VNF) chains, end-to-end (E2E) delay modeling, network function virtualization (NFV), rate decoupling, software-defined networking (SDN), tandem queueing model.

I. INTRODUCTION

FUTURE communication networks are expected to provide customized delay-sensitive end-to-end (E2E) service deliveries [1] (e.g., video streaming and machine-to-machine communications) with fine-grained quality-of-service (QoS) for Internet-of-Things (IoT) [2]–[6]. Typical IoT application scenarios include remote control for smart homing, smart sensing [7], large-scale mobile social networking [8], industrial automation [9], high-definition video conferencing, and intelligent transportation systems [10], [11].

To support diversified applications and use cases, network servers providing different functions (e.g., classifiers, firewalls, and proxies) are required to be augmented on a large scale to fulfill customized service requirements in both wireless and core network domains. However, the increasingly densified network deployment significantly expands installation and operational cost of the infrastructure. Network function virtualization (NFV) [12]–[15] provides a promising solution to reduce the deployment cost and realize flexible function placement and service customization. With NFV, network functions are decoupled from function-specific servers, softwarized as virtual network functions (VNFs), and placed on general-purpose programmable servers (also called NFV nodes [13], [16]). Specifically, through a resource virtualization platform [17], computing resources (i.e., CPU cores) for task processing on each NFV node are virtualized as virtual CPU cores, upon which virtual machines (VMs) are installed. Then, VNFs are programmed on VMs of different NFV nodes at different network locations to achieve high resource utilization.

Manuscript received April 10, 2018; revised June 13, 2018; accepted June 30, 2018. Date of publication July 9, 2018; date of current version February 25, 2019. This work was supported by research grants from Huawei Technologies Canada and from the Natural Sciences and Engineering Research Council (NSERC) of Canada. (Corresponding author: Qiang Ye.)

Q. Ye and W. Zhuang are with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada (e-mail: [email protected]; [email protected]).

X. Li and J. Rao are with Huawei Ottawa Research & Development Centre, Huawei Technologies Canada Inc., Ottawa, ON K2K 3J1, Canada (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/JIOT.2018.2853708

For the evolving network paradigm, the backbone core network consists of a combination of network switches and NFV nodes interconnected via high-speed wired transmission links. NFV nodes hosting and operating VNFs are introduced to improve service provisioning and resource utilization. A set of VNFs and the virtual links connecting them constitute a logic VNF chain, referred to as a service function chain [18], representing a specific sequence of network functions that a traffic flow1 requires to traverse for E2E service provisioning. All VNF chains are managed by a virtualization controller and are placed onto the physical substrate network, with each VNF embedded on an NFV node and virtual links represented by transmission links and network switches. This process is known as VNF chain embedding [13], [14], [19], [20]. Note that the virtualization controller is software-defined networking (SDN) enabled [21], with all control functions decoupled from the underlying physical network [22], [23]. Therefore, the controller has direct control (programmability) on all VNFs to enhance resource utilization via traffic balancing and VM migration [24], [25].

1A traffic (service) flow refers to an aggregation of packets belonging to the same service type for the same source and destination node pair in the backbone network.

2327-4662 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

E2E packet delay of a delay-sensitive service flow traversing an embedded VNF chain is a main metric indicating the embedding performance. Existing research works investigate how to achieve optimal embedding of VNF chains on the core network to minimize the deployment, operational, and delay violation cost under physical resource constraints and flow conservation constraints. E2E packet delay of each traffic flow is calculated as a summation of packet transmission delays on each physical link, without considering packet processing delay due to CPU processing on NFV nodes [13], [14], [26]. However, each packet from different traffic flows passing through an NFV node usually requires different amounts of CPU processing time on the NFV node and packet transmission time on the outgoing link in sequence [27], [28]. Depending on the type of VNF each flow traverses, some flows' packets with large headers bottleneck on CPU processing, whereas packets of other flows having large packet payload sizes demand more transmission time. In addition, the packet arrival process of a flow at an NFV node correlates with packet processing and transmission at its preceding NFV nodes, which makes E2E delay analysis difficult. Therefore, how to develop an analytical model to evaluate the delay that each packet of a flow experiences when passing through an embedded VNF chain, including packet queueing delay, packet processing delay on NFV nodes, and packet transmission delay on links, is a challenging research issue. For VNF chain embedding, there is a tradeoff between E2E delay satisfaction and reducing the network cost. Different VNF chains are often embedded on a common network path with multiple VNF instances operated on an NFV node to improve resource utilization and reduce the deployment and operational cost of network functions and links. On the other hand, sharing a set of physical resources with other flows on NFV nodes and links may degrade the delay of one individual flow. Thus, modeling E2E packet delay of each flow is essential to achieve delay-aware VNF chain embedding. Since traffic flows traversing each NFV node demonstrate discrepant "dominant" resource consumption on either CPU or link bandwidth, how to allocate the two resources among traffic flows to guarantee allocation fairness and achieve high resource utilization, referred to as bi-resource allocation, needs investigation and is required for E2E delay modeling.

In this paper, we employ dominant-resource generalized processor sharing (DR-GPS) [28] as the bi-resource allocation scheme among traffic flows sharing resources at each NFV node. The DR-GPS balances the tradeoff between service isolation and high resource utilization, and maintains dominant-resource fairness (DRF) [29] among flows. Then, we model packet delay of a flow passing through NFV nodes and physical links (and switches) of an embedded VNF chain. The contributions of this paper are twofold.

1) With DR-GPS, we establish a tandem queueing model to extract the process of each flow going through CPU processing and link transmission at the first NFV node. To remove the rate coupling effect among flows, we derive the average processing rate as an approximation of the instantaneous processing rate for each flow with the consideration of resource multiplexing. An M/D/1 queueing model is then developed for each decoupled processing queue to determine the average packet processing delay. Based on the analysis of the packet departure process from each decoupled processing queue, the decoupled transmission rate is derived for each flow.

2) We analyze the packet arrival process and then remove the rate coupling effect of each flow traversing the subsequent NFV node. To eliminate the dependence of packet processing and packet transmission between consecutive NFV nodes, the arrival process of each flow at the subsequent NFV node is approximated as a Poisson process and an M/D/1 queueing model is employed to calculate the average delay for packet processing. It is proved that the average packet queueing delay for CPU processing based on the approximated M/D/1 queueing model is an improved upper bound over that calculated upon a G/D/1 queueing model.

The rest of this paper is organized as follows. Existing studies on E2E packet delay modeling for embedded VNF chains are reviewed in Section II. The system model under consideration is described in Section III. In Section IV, we present a bi-resource allocation scheme employed for flows traversing an NFV node. The E2E delay modeling for packet flows going through an embedded VNF chain is established in Section V. In Section VI, numerical results are presented to demonstrate the accuracy of the proposed delay analytical framework and its effectiveness for achieving delay-aware VNF chain embedding. Lastly, we draw the conclusions in Section VII. Main parameters and symbols throughout this paper are summarized in Table I.

II. RELATED WORK

In most existing studies, E2E packet delay for an embedded VNF chain is modeled as a summation of transmission delays when every packet of a flow traverses each embedded physical link, without considering packet processing delay associated with VNFs. In [13] and [14], delay violation penalty is expressed as a function of E2E packet transmission delay for each traffic flow, which indicates the cost if the delay requirement for packets traversing an embedded VNF chain is violated. A delay violation cost minimization problem is then formulated to determine an optimal embedded physical network path that achieves minimum E2E packet transmission delay for each flow. In [26], the total delay when a packet is routed between consecutive virtual nodes consists of packet processing delay, packet queueing delay, and packet transmission delay, and is determined using network traffic measuring tools instead of analytical modeling. The E2E delay for packets going through each source–destination node pair in an embedded virtual network is then calculated to achieve QoS-aware multicast virtual network embedding. For embedded VNF chains, how CPU and bandwidth resources are shared among multiple flows traversing each NFV node determines the fractions of processing and transmission rates allocated to each flow, and thus affects E2E packet delay calculation. Multiresource sharing is studied in a data center (DC) environment [27]–[29], where dominant-resource fair allocation is adopted to equalize the dominant resource shares of multiple flows, while achieving high resource utilization. However, how to model E2E packet delay for each flow traversing a set of DC nodes needs further investigation [28]. An analytical E2E delay model for packets passing through each embedded VNF chain is established in [30], where an independent M/M/1/K queueing model is employed to evaluate packet delay at each VNF. However, packet interarrival time at a subsequent VNF correlates with the processing and transmission rates at its preceding VNF, making packet delay calculation unique at each VNF. In addition, the assumption of exponential distribution on packet processing time needs justification. Existing studies establish analytical delay models for packets of a traffic flow going through a set of OpenFlow network switches in SDN [31], [32]. Since control functions are migrated from each network switch to the SDN controller, where processing resources are consumed for making routing decisions, the network switches are simplified with only packet forwarding functions. E2E packet delay for a flow passing through a sequence of network switches is determined based on an M/M/1 queueing network modeling.

TABLE I: MAIN PARAMETERS AND SYMBOLS

Overall, developing an accurate analytical E2E delay model for each traffic flow traversing an embedded VNF chain is challenging and of importance for achieving delay-aware VNF chain embedding.

III. SYSTEM MODEL

A. Embedded VNF Chains

An aggregated data traffic flow from the wireless network domain, belonging to an identical service type, is required to traverse a sequence of VNFs in the core network to fulfill certain service requirements. Each VNF is embedded and operated on an NFV node, and each virtual link represents a set of transmission links and network switches. Multiple VNF chains can be embedded on a common network path to improve resource utilization and reduce network deployment and operational cost. We consider a set, I, of traffic flows traversing different logic VNF chains over a common embedded physical path. In Fig. 1, two flows i and j (∈ I), representing two logic VNF chains f1 → f3 and f1 → f2, respectively, traverse one embedded network path and share the same physical resources. At the service level, we have flow i traversing a firewall function and a domain name system (DNS) function sequentially to fulfill a secured DNS service request, and flow j traversing a firewall function and an intrusion detection system (IDS) for secured E2E data streaming. At the network level, flow i goes through the first NFV node N1 operating VNF f1 and transmission link L0, and is then forwarded by n1 network switches {R1, . . . , Rk, . . . , Rn1} and n1 transmission links {L1, . . . , Lk, . . . , Ln1} in between before reaching the second NFV node, N2, operating VNF f3; flow j traverses the same physical path but passes through VNF f2 at the second NFV node N2. After passing through N2, traffic flows i and j are forwarded by a sequence of n2 switches and n2 links (not depicted in Fig. 1 for brevity) to reach the destination node in the core network. For a general case, we have the set I of flows traversing and sharing an embedded physical path, with m NFV nodes denoted by Nz (z = 1, 2, . . . , m) and nz pairs of network switches and physical links forwarding traffic between NFV nodes Nz and Nz+1 before reaching the destination node. With NFV, different VNFs can be flexibly orchestrated and installed at appropriate NFV nodes to enhance traffic balancing and reduce deployment cost of network infrastructure. When a traffic flow passes through an NFV node, each packet of the flow first requires a CPU time for packet processing, after which the processed packet is allocated link bandwidth resources for transmission [28]. The total amount of CPU time is assumed infinitely divisible [27], [28] on each NFV node and needs to be properly shared among traffic flows passing through, and the bandwidth resources on transmission links are also shared among traffic flows.

Fig. 1. Two embedded VNF chains sharing a common physical network path.

B. Traffic Model

Packet arrivals of flow i at the first NFV node, N1, are modeled as a Poisson process with arrival rate λi. For the resource requirements of a flow, we define the time profile for flow i passing through N1 as a 2-D time vector [τi,1, τi,2], indicating that every packet of flow i requires τi,1 time for CPU processing and τi,2 time for transmission if all CPU time on N1 and link bandwidth resources on L0 are allocated to flow i [28]. Correspondingly, the rate vector [Ci,1, Ci,2] (in the unit of packet per second) for flow i is the reciprocal of the time profile, Ci,1 = 1/τi,1 and Ci,2 = 1/τi,2. Service flows with different packet structures often have discrepant time profiles when passing through an NFV node. For example, when going through a firewall function, service flows carrying small packets with a large header size, such as DNS request packets, are more CPU time demanding, whereas other data traffic flows, e.g., video traffic, with a large packet size will require more time for packet transmissions but less time on packet processing. Therefore, service flows always have a more critical resource consumption on either CPU time or link bandwidth, which is referred to as the dominant resource. With a given set of maximum available CPU resources and link bandwidth resources on an NFV node and its outgoing link, the time profiles of different traffic flows sharing one NFV node can be very different. The maximum available CPU time on N1 and maximum link bandwidth on L0 are shared among flows. Suppose flow i is allocated packet processing rate ci,1 out of the maximum processing rate Ci,1, and packet transmission rate ci,2 out of the maximum transmission rate Ci,2. Note that when the CPU time is shared among multiple flows, there can be some overhead time as the total CPU resources switch among flows for different processing tasks. Existing studies show that this CPU switching overhead only becomes obvious when the traffic of each flow is saturated with a high percentage of CPU utilization (i.e., CPU cores are frequently interrupted for switching tasks) [18], [33]. Considering only nonsaturation traffic, we assume that the allocated processing rate ci,1 for flow i (∈ I) varies linearly with its occupied fraction of CPU time (i.e., the useful fraction of CPU usage). Thus, we denote the fraction of CPU resources allocated to flow i by hi,1 = ci,1/Ci,1 and the fraction of link bandwidth resources by hi,2 = ci,2/Ci,2.
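As a concrete illustration of the time profile, rate vector, and resource fractions defined above, the following short Python sketch uses made-up numbers; they are not measurements from this paper.

```python
# Hypothetical illustration of the time-profile / rate-vector relationship.
# All numbers are invented for illustration only.

tau_i = (2.0e-4, 1.0e-4)                 # time profile [tau_i1, tau_i2] in seconds per packet
C_i = (1.0 / tau_i[0], 1.0 / tau_i[1])   # rate vector [C_i1, C_i2] in packet/s

# Suppose DR-GPS allocates flow i the rates c_i1 = c_i2 = 2000 packet/s.
c_i = (2000.0, 2000.0)
h_i1 = c_i[0] / C_i[0]                   # fraction of CPU time occupied by flow i
h_i2 = c_i[1] / C_i[1]                   # fraction of link bandwidth occupied by flow i

print(h_i1, h_i2)                        # 0.4 and 0.2: CPU is the dominant resource here
```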

IV. BI-RESOURCE ALLOCATION DISCIPLINE

A. Dominant-Resource Generalized Processor Sharing

When different service flows multiplex at a common NFV node, we want to determine how CPU and link bandwidth resources should be shared among the traffic flows to achieve high utilization of each resource type and, at the same time, maintain a (weighted) fair allocation among the services. Since service flows can have different dominant resources, the bi-resource sharing becomes more challenging than single resource allocation, to balance between high resource utilization and fair allocation for both resource types.

The generalized processor sharing (GPS) discipline is a benchmark fluid-flow (i.e., with resources being infinitely divisible) based single resource allocation model for integrated service networks in traditional communication networks [34], [35]. Each service flow, say flow i, at a common GPS server (e.g., a network switch) is assigned a positive value, ϕi, indicating its priority in bandwidth allocation. The GPS server guarantees that the allocated transmission rate gi for flow i satisfies

$$g_i \ge \frac{\varphi_i}{\sum_{i \in I} \varphi_i}\, G \tag{1}$$

where G is the maximum service rate of the GPS server. Note that the inequality sign in (1) holds when some flows in I do not have packets to transmit, and thus more resources can be allocated among any backlogged flows. Therefore, GPS has the properties of achieving both service isolation and a high resource multiplexing gain among flows.

However, if GPS is directly applied in the bi-resource context (i.e., bi-resource GPS), it is difficult to simultaneously maintain fair allocation for both CPU time and link bandwidth and achieve high system performance. Consider flow i and flow j, with time profiles [τi,1, τi,2] and [τj,1, τj,2], respectively, traversing NFV node N1, with the same service priority for fair resource sharing. Assume τi,1 > τi,2 and τj,1 < τj,2. If we apply the bi-resource GPS, both the maximum processing and transmission rates are equally divided between the two flows. Consequently, the performance of both flows traversing N1 is not maximized. For flow i, due to unbalanced time profiles, the allocated link transmission rate ci,2 is larger than the processing rate ci,1, leading to resource wastage on link transmission; the situation reverses for flow j, where packets are accumulated for transmissions, causing an increase of queueing delay. Therefore, to improve the system performance, a basic principle [28] is that the fractions, hi,1 and hi,2, of CPU and bandwidth resources allocated to any flow i (∈ I) should be in the same proportion as its time profile, i.e., hi,1/hi,2 = τi,1/τi,2, to guarantee that the allocated processing rate is equalized with the transmission rate, i.e., ci,1 = ci,2. In such a way, queueing delay before packet transmissions of each flow can be eliminated. However, with this basic principle, if we apply GPS on one of the two resources (i.e., single-resource GPS with equalized processing and transmission rates), the allocation of the other type of resources among traffic flows continues to be unbalanced due to the discrepancy of time profiles among different flows.

To maximize the tradeoff between high performance and fair resource allocation for each traffic flow, we employ a DR-GPS scheme [28] for the bi-resource allocation. The DR-GPS combines the concepts of DRF [29] and GPS, in which the fractions of allocated dominant resources among different backlogged flows are equalized based on service priority, and the other type of resources is allocated to ensure the processing rate equals the transmission rate for each flow (the basic principle applies). When some flows do not have backlogged packets for processing, their allocated resources are redistributed among other backlogged flows if any. Since there are a finite number of flow combinations to form a backlogged flow set out of I, we denote B (⊆ I) as one of the flow combinations for a backlogged flow set. Each backlogged flow has a dominant resource consumption in either packet processing or packet transmission. We mathematically formulate DR-GPS in (P1) when the set, I (|I| ≥ 1), of traffic flows traverses N1, where | · | is the set cardinality, as follows:

$$
\text{(P1)}: \quad \max \ \left\{h_{1,1}, \ldots, h_{i,1}, \ldots, h_{j,1}, \ldots, h_{|B|,1}\right\}
$$
$$
\text{s.t.} \quad
\begin{cases}
\displaystyle\sum_{i \in B} h_{i,1} \le 1 & \text{(2a)}\\[1mm]
\displaystyle\sum_{i \in B} h_{i,2} \le 1 & \text{(2b)}\\[1mm]
h_{i,1} = \dfrac{\tau_{i,1}}{\tau_{i,2}}\, h_{i,2} & \text{(2c)}\\[1mm]
\dfrac{h_{i,d}}{w_i} = \dfrac{h_{j,d}}{w_j}, \quad \forall i, j \in B & \text{(2d)}\\[1mm]
h_{i,1},\, h_{i,2},\, w_i,\, w_j \in [0, 1]. & \text{(2e)}
\end{cases}
$$

In (P1), wi and wj are weights of resource allocation to represent service priority for flow i and flow j, respectively, and hi,d is the fraction of occupied dominant resources of flow i, which is either hi,1 or hi,2. Constraint (2c) guarantees ci,1 = ci,2; constraint (2d) equalizes the fractions of allocated dominant resources among the backlogged flows. Problem (P1) is a linear programming problem and can be solved efficiently to obtain the optimal solutions of hi,1 and hi,2 for any flow i.
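For intuition, the Python sketch below solves (P1) directly for the case where all flows in B are backlogged, by equalizing the weighted dominant shares under constraint (2c) and scaling them until constraint (2a) or (2b) becomes tight. This is a simplified reading of DR-GPS under those assumptions, not the authors' implementation; a generic LP solver would work equally well.

```python
def dr_gps_allocation(time_profiles, weights):
    """DR-GPS allocation sketch for one NFV node (all flows assumed backlogged).

    time_profiles: list of (tau_1, tau_2) per flow, in seconds per packet.
    weights: list of positive priorities w_i.
    Returns (h1, h2): per-flow fractions of CPU and link bandwidth.
    """
    # With equalized rates c_{i,1} = c_{i,2} = r_i (constraint 2c), the fractions are
    # h_{i,1} = r_i * tau_1 and h_{i,2} = r_i * tau_2, so the dominant share is
    # r_i * max(tau_1, tau_2).  Setting h_{i,d} = s * w_i (constraint 2d) gives
    # r_i = s * w_i / max(tau_1, tau_2); s is then pushed up until (2a) or (2b) is tight.
    usage_cpu = sum(w * t1 / max(t1, t2) for (t1, t2), w in zip(time_profiles, weights))
    usage_bw = sum(w * t2 / max(t1, t2) for (t1, t2), w in zip(time_profiles, weights))
    s = 1.0 / max(usage_cpu, usage_bw)       # largest feasible common dominant share
    h1, h2 = [], []
    for (t1, t2), w in zip(time_profiles, weights):
        r = s * w / max(t1, t2)              # equalized processing = transmission rate
        h1.append(r * t1)
        h2.append(r * t2)
    return h1, h2


# Example: two equally weighted flows, one CPU-dominant, one bandwidth-dominant.
h1, h2 = dr_gps_allocation([(2.0e-4, 1.0e-4), (1.0e-4, 4.0e-4)], [1.0, 1.0])
print(h1, h2)   # dominant shares are equal; the bandwidth constraint (2b) is tight
```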

The DR-GPS has the properties of 1) service isolation, by guaranteeing a service rate in (1) to each flow, and 2) work conservation, by fully utilizing at least one of the two types of resources in serving the backlogged flows [28], [36], [37]. Although the queueing delay for packet transmissions is reduced by employing the DR-GPS scheme, the total packet delay2 for each flow traversing the same NFV node should be evaluated. With GPS [34], [35], the process of multiple traffic flows passing through common NFV node N1 is extracted as a tandem queueing model, shown in Fig. 2. The total packet delay is the summation of packet queueing delay before processing, packet processing delay, and packet transmission delay. In the following, we develop an analytical model to evaluate the total packet delay for each traffic flow traversing N1. Based on the delay modeling for flows traversing the first NFV node, the E2E packet delay for traffic flows passing through the embedded VNF chains can be analyzed.

2Total packet delay refers to the duration from the instant that a packet of one traffic flow reaches the processing queue of the NFV node to the instant when it is transmitted out of the NFV node over a physical link.

Fig. 2. Queueing model for multiple traffic flows traversing N1.

V. END-TO-END DELAY ANALYSIS

In this section, we first analyze the total packet delay for each traffic flow traversing the first NFV node, and then extend the delay model to evaluate the E2E delay for traffic flows passing through a sequence of NFV nodes of an embedded VNF chain.

A. Rate Decoupling

The main difficulty in analyzing the packet delay for each flow is that both the processing and transmission rates of each flow depend on the backlog status of other flows at the same NFV node. For the case of two flows, when one of the flows has an empty processing queue, its processing and transmission resources are given to the other backlogged flow to exploit the traffic multiplexing gain. Therefore, the processing (transmission) rate of each flow switches between two deterministic rate values, depending on the status of the other flow. For a general case where we have the set, I, of multiplexing flows which includes a set, B, of backlogged flows excluding the tagged flow i, the processing rate ci,1 of flow i can be determined by solving (P1), based on set B of backlogged flows. We further denote B as Br, where r = 1, 2, . . . , $\binom{|I|}{|B|}$, representing one of the $\binom{|I|}{|B|}$ combinations of |B| backlogged flows. Thus, ci,1 changes with Br. This correlation of queue status among flows leads to the processing rate of each flow jumping over discrete deterministic values, making the total packet delay analysis complex.

To remove the coupling effect of the instantaneous processing rate of one flow fluctuating with the backlog status of other flows, we first determine the average processing rate μi,1 for flow i, with the consideration of resource multiplexing among different flows, i.e., nonempty probabilities of processing queues of all other flows, through a set of high-order nonlinear equations given by

$$
\begin{cases}
\mu_{i,1} = \displaystyle\sum_{|B|=0}^{|I|-1} \sum_{r=1}^{M} \Big(\prod_{l \in B_r} \rho_{l,1}\Big) \Big(\prod_{k \in \bar{B}_r} \big(1 - \rho_{k,1}\big)\Big)\, c_{i,1} \\[2mm]
\rho_{i,1} = \dfrac{\lambda_i}{\mu_{i,1}}
\end{cases}
\qquad \forall i \in I. \tag{3}
$$


Fig. 3. Queueing model for (a) decoupled packet processing and (b) decoupled packet processing and transmission.

In (3), M = $\binom{|I|-1}{|B|}$, $\bar{B}_r = I \setminus \{\{i\} \cup B_r\}$, and ρi,1 is the nonempty probability of the processing queue for flow i at N1. Given the packet arrival rate of each flow in I, (3) has 2|I| equations with 2|I| variables and can be solved numerically for the set of average processing rates of all flows. For analysis tractability, we use the average processing rate, μi,1, as an approximation for the instantaneous processing rate ci,1 to transform the original correlated processing system into |I| uncorrelated processing queues. A case of two traffic flows at an NFV node is demonstrated in Fig. 3. With the decoupled deterministic processing rates, packet processing for each flow can be modeled as an M/D/1 queueing process, upon which we can further calculate both the decoupled packet processing delay and the total packet delay for processing. The accuracy of processing rate decoupling is verified through extensive simulations presented in Section VI.
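As an illustration of how (3) can be solved numerically, the sketch below writes the 2|I| fixed-point equations for a hypothetical two-flow example and solves them with scipy.optimize.fsolve. The per-state rates ci,1 (one value per backlogged-set combination), which in the paper come from solving (P1), are supplied here as assumed inputs.

```python
from scipy.optimize import fsolve

# Hypothetical per-state processing rates (packet/s) obtained from (P1):
# 'both'  -> rate of a flow when the other flow is also backlogged,
# 'alone' -> rate of a flow when the other flow's processing queue is empty.
lam = {'i': 150.0, 'j': 200.0}
c = {'i': {'both': 1500.0, 'alone': 2500.0},
     'j': {'both': 1800.0, 'alone': 3000.0}}

def fixed_point(x):
    """Residuals of (3) for two flows; unknowns are (mu_i, mu_j)."""
    mu_i, mu_j = x
    rho_i, rho_j = lam['i'] / mu_i, lam['j'] / mu_j      # queue nonempty probabilities
    # Average rate of each flow = mixture of its per-state rates, weighted by the
    # probability that the other flow's processing queue is (non)empty.
    eq_i = rho_j * c['i']['both'] + (1.0 - rho_j) * c['i']['alone'] - mu_i
    eq_j = rho_i * c['j']['both'] + (1.0 - rho_i) * c['j']['alone'] - mu_j
    return [eq_i, eq_j]

mu_i, mu_j = fsolve(fixed_point, x0=[2000.0, 2500.0])
print(mu_i, mu_j)   # decoupled average processing rates used in the M/D/1 model
```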

In Fig. 3(a), the instantaneous link transmission rate for each flow is also correlated with the backlog status of other flows. To remove the transmission rate correlation for packet delay analysis at the first NFV node, we analyze the packet arrival process of each flow at link transmission (i.e., the departure process from the preceding packet processing) in the following.

B. Queue Modeling at the First NFV Node

We study the packet departure process from each decoupled processing at N1. Taking flow i in Fig. 3 as an example, we focus on the interdeparture time between two successively departed packets from the decoupled processing. As indicated in [38] and [39], for an M/D/1 queueing system, the queue occupancy distribution in a steady state seen by a departing packet is the same as that seen by an arriving packet due to the Poisson characteristic of the arriving process. Therefore, a departing packet from the decoupled processing sees the same empty probability of the processing queue as an arriving packet. Let random variable Yi be the interdeparture time of successive packets of flow i departing from the decoupled processing at N1. If the lth departing packet sees a nonempty queue, then Yi = Ti, where Ti = 1/μi,1 is the decoupled processing time for a packet of flow i; if the departing packet sees an empty queue upon its departure, Yi = Xi + Ti, where random variable Xi denotes the duration from the time of the lth packet departure of flow i to the time of the (l + 1)th packet arrival. Because of the memoryless property of a Poisson arrival process, Xi has the same exponential distribution as the packet interarrival time with parameter λi. Therefore, the probability density function (PDF) of Yi, ξYi(t), can be calculated as

$$\xi_{Y_i}(t) = \left(1 - \rho_{i,1}\right)\xi_{(X_i+T_i)}(t) + \rho_{i,1}\,\xi_{T_i}(t) \tag{4}$$

where ρi,1 = λi/μi,1. Since Xi and Ti are independent random variables, the PDF of Xi + Ti is the convolution of the PDFs of Xi and Ti. Thus, (4) is further derived as

$$
\begin{aligned}
\xi_{Y_i}(t) &= \left(1 - \frac{\lambda_i}{\mu_{i,1}}\right)\left[\xi_{X_i}(t) \ast \xi_{T_i}(t)\right] + \frac{\lambda_i}{\mu_{i,1}}\,\xi_{T_i}(t) \\
&= \left(1 - \frac{\lambda_i}{\mu_{i,1}}\right)\left[\lambda_i e^{-\lambda_i t} u(t) \ast \delta(t - T_i)\right] + \frac{\lambda_i}{\mu_{i,1}}\,\delta(t - T_i) \\
&= \frac{\lambda_i\left(\mu_{i,1} - \lambda_i\right)}{\mu_{i,1}}\, e^{-\lambda_i (t - T_i)} u(t - T_i) + \frac{\lambda_i}{\mu_{i,1}}\,\delta(t - T_i)
\end{aligned}
\tag{5}
$$

where u(t) is the unit step function, δ(t) is the Dirac delta function, and ∗ is the convolution operator. From (5), the cumulative distribution function (CDF) of Yi is given by

$$F_{Y_i}(t) = \left[1 - \left(1 - \frac{\lambda_i}{\mu_{i,1}}\right) e^{-\lambda_i(t - T_i)}\right] u(t - T_i). \tag{6}$$

Based on (5) and (6), both the mean and the variance of Yi can be calculated as

$$\mathrm{E}[Y_i] = \frac{1}{\lambda_i} \quad \text{and} \quad \mathrm{D}[Y_i] = \frac{1}{\lambda_i^2} - \frac{1}{\mu_{i,1}^2}. \tag{7}$$

From (6) and (7), we observe that, when λi is small, the departure process, delayed by the service time Ti, approaches the Poisson arrival process with parameter λi; when λi increases to approach μi,1, the departure process approaches the deterministic process with rate μi,1.
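The statistics in (7) can be checked with a short Monte Carlo experiment: simulate an M/D/1 queue, collect the interdeparture times, and compare their sample mean and variance with 1/λi and 1/λi² − 1/μi,1². The parameters below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, mu, n = 150.0, 400.0, 200_000        # arbitrary rates (packet/s) and sample size
T = 1.0 / mu                               # deterministic service time

arrivals = np.cumsum(rng.exponential(1.0 / lam, n))   # Poisson arrival instants
departures = np.empty(n)
t = 0.0
for k, a in enumerate(arrivals):
    t = max(t, a) + T                      # FIFO single server, deterministic service
    departures[k] = t

Y = np.diff(departures)                    # interdeparture times
print(Y.mean(), 1.0 / lam)                           # ~ 1/lambda
print(Y.var(), 1.0 / lam**2 - 1.0 / mu**2)           # ~ 1/lambda^2 - 1/mu^2
```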

Since the packet departure rate from each decoupled processing is the same as the packet arrival rate at each processing queue, the decoupled transmission rate μi,2 for flow i is the same as μi,1, obtained by solving the set of equations (3). At this point, we have a complete decoupled queueing model of both packet processing and packet transmission for flow i traversing the first NFV node N1, shown in Fig. 3(b). The average total packet delay for flow i is determined by

$$D_{i,1} = \frac{1}{\mu_{i,1}} + \frac{\lambda_i}{2\mu_{i,1}^2\left(1 - \rho_{i,1}\right)} + \frac{1}{\mu_{i,2}}. \tag{8}$$
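Equation (8) is the M/D/1 mean system time (the deterministic processing time plus the Pollaczek–Khinchine waiting term) plus the decoupled transmission time. A direct transcription, with hypothetical decoupled rates:

```python
def total_delay_first_node(lam, mu1, mu2):
    """Average total packet delay D_{i,1} at the first NFV node, per (8)."""
    rho1 = lam / mu1                              # processing queue utilization
    assert rho1 < 1.0, "processing queue must be stable"
    wait = lam / (2.0 * mu1**2 * (1.0 - rho1))    # M/D/1 mean queueing delay
    return 1.0 / mu1 + wait + 1.0 / mu2           # processing + queueing + transmission

# Hypothetical decoupled rates (packet/s); not the paper's measured values.
print(total_delay_first_node(lam=150.0, mu1=2400.0, mu2=2400.0))
```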

Before modeling the delay for flows going through the second NFV node, N2, we analyze the departure process of packet transmissions of each flow at N1. Similar to the analysis of the packet departure process from the decoupled processing for flow i at N1, we set time 0 as the instant when the lth packet departs from the processing queue and reaches the transmitting queue for immediate packet transmission. Since we have μi,1 = μi,2, Ti also indicates the packet transmission time, i.e., Ti = 1/μi,2. Let Zi denote the packet interdeparture time for flow i passing through link transmission. If the lth departing packet from the processing queue sees a nonempty queue, we have Zi = Ti; otherwise, the following two cases apply, as illustrated in Fig. 4.

Fig. 4. Composition of Zi under different cases. (a) Xi > Ti. (b) Xi ≤ Ti.

Case 1: If the (l + 1)th packet's arrival time at the processing queue is greater than the lth packet's transmission time at the transmitting queue, i.e., Xi > Ti, we have

$$Z_i = \zeta_1 + 2\zeta_2 = (X_i - T_i) + 2T_i = X_i + T_i \tag{9}$$

where ζ1 indicates the duration from the instant that the lth packet departs from the transmission queue till the instant that the (l + 1)th packet arrives at the processing queue, and ζ2 = Ti.

Case 2: If the (l + 1)th packet arrives at the processing queue while the lth packet is still at the transmission queue, i.e., Xi ≤ Ti, we have

$$Z_i = \zeta'_1 + \zeta'_2 = \left[T_i - (T_i - X_i)\right] + T_i = X_i + T_i \tag{10}$$

where ζ′1 denotes the remaining processing time of the (l + 1)th packet at the processing queue after the lth packet departs from the transmission queue, and ζ′2 = Ti. As a result, the PDF of Zi is given by

$$
\begin{aligned}
\xi_{Z_i}(t) &= \left(1 - \rho_{i,1}\right)\left[P\{X_i \le T_i\}\,\xi_{(X_i+T_i)}(t) + P\{X_i > T_i\}\,\xi_{(X_i+T_i)}(t)\right] + \rho_{i,1}\,\xi_{T_i}(t) \\
&= \left(1 - \rho_{i,1}\right)\xi_{(X_i+T_i)}(t) + \rho_{i,1}\,\xi_{T_i}(t).
\end{aligned}
\tag{11}
$$

Comparing (4) and (11), we conclude that Zi and Yi have exactly the same probability distribution, and thus the same statistics of any order (e.g., expectation and variance).

C. Delay Over the Virtual Link Between NFV Nodes N1 and N2

So far, we have derived the packet departure process of each flow from the link transmission at the first NFV node N1. Before reaching the second NFV node N2, the flows may go through a sequence of network switches and physical links forwarding the traffic. The transmission rates allocated to flow i from these switches and links are the same as the transmission rate μi,2 from N1 to maximize the bandwidth utilization [40]. Therefore, queueing delays on switches and links are not considered for each flow. The total packet transmission delay for flow i traversing n1 switches and n1 links before reaching N2 is given by

$$D^{(f)}_{i,1} = \frac{2 n_1}{\mu_{i,2}}. \tag{12}$$

D. Delay at the Second NFV Node

Proposition 1: A Poisson packet flow traverses a tandem of k (k = 1, 2, 3, . . .) servers, each with deterministic service capacity Y(k). If we have Y(q) ≥ Y(q−1), ∀q ∈ [2, k], the departure process of the traffic flow coming out of the kth (k ≥ 2) server remains the same as the departure process from the first server.

The proof of Proposition 1 is given in Appendix A.

According to Proposition 1, the arrival process at the second NFV node N2 is the same as the traffic departure process from N1. Based on (7) and (11), the arrival rate for flow i at N2 is λi. Thus, the same method as in (3) can be used to get a set of decoupled processing and transmission rates μ′i,1 and μ′i,2 for flow i at N2, as shown in Fig. 5, by taking into consideration the time profiles of traffic flows going through the new VNF(s) at N2 and the instantaneous processing and transmission rates c′i,1 and c′i,2 allocated to flow i. The main difference in packet delay modeling for flow i at N2 from that at N1 is that the packet arrival process for flow i is a general process with average arrival rate λi. The process has the interarrival time Zi with the same CDF, expectation, and variance as those of Yi in (6) and (7). Thus, we can model packet processing at N2 as a G/D/1 queueing process,3 where the average packet queueing delay before processing for flow i at N2 is given by [38]

$$
W_{i,2} = \frac{\lambda_i\left(\dfrac{1}{\lambda_i^2} - \dfrac{1}{\mu_{i,1}^2} - \sigma_i^2\right)}{2\left(1 - \rho'_{i,1}\right)} \;\le\; \frac{\lambda_i\left(\dfrac{1}{\lambda_i^2} - \dfrac{1}{\mu_{i,1}^2}\right)}{2\left(1 - \rho'_{i,1}\right)}. \tag{13}
$$

In (13), ρ′i,1 = λi/μ′i,1, and Te is the idle duration within the interarrival time of successive packets of flow i at N2, with variance σ²i.

Since the arrival process at N2 for each flow correlates with the preceding decoupled processing rates at N1, as indicated in (6), the G/D/1 queueing model is not accurate, especially when λi becomes large [38]. Also, it is difficult to obtain the distribution of Te to calculate σ²i in (13). Using the upper bound in (13) to approximate Wi,2 is not accurate when λi is small (the queueing system is lightly loaded), since the probability that an arriving packet at the processing queue of N2 sees an empty queue increases, and σ²i becomes large.

From (6) and (7), in the case of μ′i,1 < μi,2, Zi is more likely to approach an exponentially distributed random variable than a deterministic value with varying λi, under the condition of ρ′i,1 < 1. Therefore, to make the arrival process of each flow at the processing queue of N2 independent of the processing and transmission rates at N1, we approximate the packet arrival process of flow i at N2 as a Poisson process with rate parameter λi, and establish an M/D/1 queueing model to represent the packet processing for flow i. Proposition 2 indicates that the average packet queueing delay Qi,2 in the M/D/1 queueing model is an improved upper bound over that in the G/D/1 system in (13), especially when the input traffic is lightly loaded.

3Note that we consider the case where μ′i,1 < μi,2, ∀i ∈ I; for the case of μ′i,1 ≥ μi,2, there is no queueing delay for processing at N2.

TABLE II: SIMULATION PARAMETERS

Fig. 5. Decoupled queueing model for traffic flows traversing the first and second NFV nodes in sequence.

Proposition 2: Given μ′i,1 < μi,2, Qi,2 is an upper bound of Wi,2 when the processing queue for flow i at N2 is either lightly loaded or heavily loaded.

The proof of Proposition 2 is given in Appendix B.

According to the approximation of the packet arrival process for flow i at N2, the average total packet delay at N2 is calculated, independently of the processing and transmission rates at N1, as

$$
D_{i,2} =
\begin{cases}
\dfrac{1}{\mu'_{i,1}} + \dfrac{\lambda_i}{2\mu_{i,1}^{\prime\,2}\left(1 - \rho'_{i,1}\right)} + \dfrac{1}{\mu'_{i,2}}, & \mu'_{i,1} < \mu_{i,2} \\[3mm]
\dfrac{1}{\mu'_{i,1}} + \dfrac{1}{\mu'_{i,2}}, & \mu'_{i,1} \ge \mu_{i,2}.
\end{cases}
\tag{14}
$$
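The sketch below is a direct transcription of (14), with the queueing term included only when μ′i,1 < μi,2; the rates used in the example are hypothetical.

```python
def total_delay_second_node(lam, mu1_prime, mu2_prime, mu2_prev):
    """Average total packet delay D_{i,2} at the second NFV node, per (14).

    mu2_prev is the decoupled transmission rate mu_{i,2} at the preceding node N1.
    """
    if mu1_prime < mu2_prev:                      # packets can queue up for processing
        rho1p = lam / mu1_prime
        assert rho1p < 1.0, "processing queue must be stable"
        wait = lam / (2.0 * mu1_prime**2 * (1.0 - rho1p))   # approximated M/D/1 waiting
        return 1.0 / mu1_prime + wait + 1.0 / mu2_prime
    return 1.0 / mu1_prime + 1.0 / mu2_prime      # no queueing before processing

# Hypothetical decoupled rates (packet/s).
print(total_delay_second_node(lam=150.0, mu1_prime=2000.0,
                              mu2_prime=2000.0, mu2_prev=2400.0))
```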

E. Average E2E Delay

Based on the same methodology of delay modeling for packets traversing N2, the average total packet delay for flow i traversing each subsequent NFV node (if any) can be derived independently. Under the condition that the decoupled packet processing rate of flow i at one subsequent NFV node Nz (z > 2) is smaller than the decoupled packet transmission rate at its preceding NFV node Nz−1, using an approximated M/D/1 queueing model to represent packet processing at Nz is valid, since the packet arrival process of flow i at Nz is more likely to approach a Poisson process with varying λi. In general, the average E2E delay for a packet of flow i passing through an embedded VNF chain, consisting of m NFV nodes, is the summation of the average total delay for a packet passing through all NFV nodes and the total transmission delay on switches and links along the path forwarding the packet, given by

$$D_i = \sum_{z=1}^{m} D_{i,z} + \sum_{z=1}^{m} D^{(f)}_{i,z}. \tag{15}$$

In (15), Di,z is the average packet delay for flow i passing through the zth NFV node of the embedded VNF chain, and D(f)i,z is the total packet transmission delay for flow i traversing nz switches and nz links before reaching NFV node Nz+1, determined in the same way as in (12).
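Putting the pieces together, (15) sums the per-node delays of (8) and (14) with the per-segment forwarding delays of (12). The self-contained sketch below assembles the E2E delay of one flow; the rates and hop counts are hypothetical.

```python
def node_delay(lam, mu_proc, mu_tx, queueing=True):
    """Average packet delay at one NFV node: (8) at N1, (14) at later nodes."""
    d = 1.0 / mu_proc + 1.0 / mu_tx
    if queueing:                                   # add the M/D/1 queueing term
        rho = lam / mu_proc
        d += lam / (2.0 * mu_proc**2 * (1.0 - rho))
    return d

def e2e_delay(lam, nodes, hops):
    """E2E delay per (15) for one flow.

    nodes: list of (mu_proc, mu_tx, queueing) per NFV node (decoupled rates, packet/s).
    hops:  list of n_z, the number of switch/link pairs after each NFV node;
           each segment adds 2*n_z/mu_tx as in (12).
    """
    total = 0.0
    for (mu_proc, mu_tx, queueing), n_z in zip(nodes, hops):
        total += node_delay(lam, mu_proc, mu_tx, queueing)
        total += 2.0 * n_z / mu_tx                 # forwarding delay of (12)
    return total

# Hypothetical two-node chain: queueing at N1, and at N2 because mu'_proc < mu_tx of N1.
print(e2e_delay(lam=150.0,
                nodes=[(2400.0, 2400.0, True), (2000.0, 2000.0, True)],
                hops=[3, 2]))
```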

VI. SIMULATION RESULTS

In this section, simulation results are presented to verify the accuracy of the proposed packet delay modeling of each flow passing through an embedded VNF chain. All simulations are carried out using the network simulator OMNeT++ [41]. We consider the network scenario where two equally weighted flows, i and j, traversing logic VNF chains firewall (f1) → DNS (f3) and firewall (f1) → IDS (f2), respectively, are embedded on a common physical path and share a set of processing and transmission resources, as shown in Fig. 1. Flow i represents DNS request traffic, whereas flow j indicates a video-conferencing data stream. The packet arrival rate λi for flow i is set to 150 packet/s with a packet size of 4000 bits [42]. The packet size for flow j is set to 16 000 bits, and we vary its arrival rate, λj, from 75 packet/s to 350 packet/s to reflect different traffic load conditions. The rate vector for each flow traversing an NFV node is tested over OpenStack [43], which is a resource virtualization platform installed on each NFV node. By injecting traffic flows with different packet sizes into different VNFs programmed on OpenStack, we test the maximum available packet processing and transmission rates for different flows. With DR-GPS, each flow is allocated a fraction of the maximum available processing and transmission rates, upon which packet-level simulation is conducted to evaluate the packet delay of each flow traversing each NFV node. Table II summarizes the rate vectors for flows i and j traversing different VNFs. Other important simulation settings are also included.

A. Packet Delay at the First NFV Node (N1)

We first compare the packet processing delay and packet queueing delay for each flow traversing N1. In Fig. 6, it is demonstrated that both the packet processing delay and the packet queueing delay derived using rate decoupling between flows i and j are close to the simulation results with rate coupling. We can see from Fig. 6(a) that the decoupled processing rate for flow i decreases with λj, since the processing queue nonempty probability for flow j increases statistically, shrinking the decoupled processing rates for the other flow. Packet transmission delay and queueing delay before transmission are evaluated for both flows at N1 in Fig. 7. For packet transmission delay, the analytical results match the simulation results well, and there is almost no queueing delay before packet transmissions, which indicates the accuracy of the proposed transmission rate decoupling. In Fig. 8, we compare queueing delays before packet transmissions for flows i and j at N1 by employing the DR-GPS and bi-resource GPS schemes. Although the bi-resource GPS achieves fair allocation of both CPU and bandwidth resources between the two flows, the amount of allocated bandwidth resources is overly provisioned for flow i and is underestimated for flow j, due to the discrepancy of time profiles for different flows. Thus, the packet queueing delay before link transmission for flow j becomes much larger than that with the DR-GPS, where packet queueing delays for both flows at the transmission link of N1 are minimized.

Fig. 6. Packet delay for processing at N1. (a) Average packet processing delay. (b) Average packet queueing delay.

Fig. 7. Delay for packet transmissions at N1.

Fig. 8. Queueing delay for packet transmissions under different resource allocation disciplines.

B. Packet Delay at the Second NFV Node (N2)

After traversing the first NFV node, N1, the packet processing delay for flow i and flow j at the second NFV node, N2, is evaluated in Fig. 9. With a close match between analytical and simulation results, it is verified that the processing rate decoupling at N2 is accurate. Packet queueing delays before processing for both flows at N2 are demonstrated in Figs. 10 and 11. As λj increases, the packet queueing delay for flow i increases slightly since the resource multiplexing gain obtained by flow i from flow j becomes small. We can see from Fig. 10 that the M/D/1 queueing model for packet processing at N2 provides a tighter upper bound than the G/D/1 queueing delay upper bound for flow i. In comparison with the results for flow j in Fig. 11, we observe that the G/D/1 upper bound of packet queueing delay is looser for flow i, since the processing queue for flow i is lightly loaded with a low queue nonempty probability, as shown in Fig. 12, whereas the G/D/1 queueing delay upper bound becomes tighter for flow j with the increase of λj but less accurate in a heavy traffic load condition. For both flows, the proposed M/D/1 queueing model is an improved upper bound to approximate the packet queueing delay before processing at N2.

Packet transmission delay for both flows at N2 is demonstrated in Fig. 13. Similar to Fig. 7, we can see that the decoupled transmission rates for each flow are close to the simulation results, and the queueing delay before packet transmissions is negligible. Lastly, we evaluate in Fig. 14 the E2E delay for packets of each flow going through the whole embedded physical network path, which is the summation of the total packet delay for traversing all NFV nodes and the packet transmission delay over all physical links and network switches. It is demonstrated that the proposed analytical modeling provides an accurate delay evaluation of E2E packet processing and transmission for multiple flows traversing VNF chains embedded on a common physical network path. The proposed analytical modeling can help to support E2E delay-aware VNF chain embedding.

Fig. 9. Packet processing delay at N2.

Fig. 10. Packet queueing delay for flow i at N2.

Fig. 11. Packet queueing delay for flow j at N2.

Fig. 12. Nonempty probability for processing queues at N2.

Fig. 13. Delay for packet transmissions at N2.


Fig. 14. E2E packet delay for both flows.

VII. CONCLUSION

In this paper, we present an analytical model to evaluate E2E packet delay for multiple traffic flows traversing a common embedded VNF chain. With DR-GPS, both CPU and bandwidth resources are allocated among different flows at each NFV node to achieve dominant-resource fair allocation with high resource utilization. A tandem queueing model is established to describe packets of each flow passing through an NFV node and its outgoing link. By removing the coupling effect on instantaneous packet processing rates among multiple flows, an M/D/1 queueing model is used to determine the average packet delay at each decoupled processing queue of the first NFV node. The correlation of packet transmission rates is also removed in the modeling based on the analysis of the packet departure process from each decoupled processing queue. We further analyze the packet arrival process of each flow at the subsequent NFV node, and establish an approximated M/D/1 queueing model to determine the average packet delay of a flow at a decoupled processing queue of the NFV node, which is proved to be a more accurate upper bound than that using a G/D/1 queueing model in both lightly loaded and heavily loaded traffic conditions. Packet transmission delay over each embedded virtual link between consecutive NFV nodes is also derived for E2E delay calculation. Simulation results demonstrate the accuracy and effectiveness of the proposed analytical E2E packet delay modeling for achieving delay-aware VNF chain embedding.

APPENDIX A
PROOF OF PROPOSITION 1

Define Z(k) as the packet interdeparture time at the kth server, and define ρ(1) as the probability that the lth departing packet sees a nonempty queue at the first server. If the lth departing packet sees an empty queue at the first server, we let X(1) be the duration from time 0 till the instant that the (l + 1)th packet arrives at the server. With Y(q) ≥ Y(q−1), there is no queueing delay at each of the servers following the first server. Similar to the description in Fig. 4, if the lth departing packet sees a nonempty queue, we have Z(k) = Y(1); otherwise, two cases are considered.

Case 1: If $X^{(1)} > \sum_{q=2}^{k} Y^{(q)}$,

$$Z^{(k)} = \left(X^{(1)} - \sum_{q=2}^{k} Y^{(q)}\right) + \sum_{q=1}^{k} Y^{(q)} = X^{(1)} + Y^{(1)}. \tag{16}$$

Case 2: If $X^{(1)} \le \sum_{q=2}^{k} Y^{(q)}$,

$$Z^{(k)} = \sum_{q=1}^{k} Y^{(q)} - \left(\sum_{q=2}^{k} Y^{(q)} - X^{(1)}\right) = Y^{(1)} + X^{(1)}. \tag{17}$$

Hence, Z(k) has the same PDF as Yi derived in (4), which ends the proof.

APPENDIX B
PROOF OF PROPOSITION 2

When λi is small, σ²i in (13) is close to that for an M/D/1 queueing system, given by [38]

$$\sigma_i^2 \approx \frac{1}{\lambda_i^2} - \frac{1}{\mu_{i,1}^{\prime\,2}}. \tag{18}$$

Hence, Wi,2 is further derived as

$$
W_{i,2} \approx \frac{\lambda_i\left[\dfrac{1}{\lambda_i^2} - \dfrac{1}{\mu_{i,1}^2} - \left(\dfrac{1}{\lambda_i^2} - \dfrac{1}{\mu_{i,1}^{\prime\,2}}\right)\right]}{2\left(1 - \rho'_{i,1}\right)} = \frac{\lambda_i\left(\dfrac{1}{\mu_{i,1}^{\prime\,2}} - \dfrac{1}{\mu_{i,1}^2}\right)}{2\left(1 - \rho'_{i,1}\right)}. \tag{19}
$$

When $\lambda_i$ becomes large, the idle duration within the interarrival time of successive packets of flow $i$ at N2 is small, making $\sigma_i^2$ negligible. Thus, we have
$$
W_{i,2} \approx \frac{\lambda_i \left(\frac{1}{\lambda_i^2} - \frac{1}{\mu_{i,1}^2}\right)}{2\left(1 - \rho'_{i,1}\right)}
\approx \frac{\lambda_i \left(\frac{1}{\big(\mu'_{i,1}\big)^2} - \frac{1}{\mu_{i,1}^2}\right)}{2\left(1 - \rho'_{i,1}\right)}. \qquad (20)
$$

On the other hand, under both traffic load cases, $Q_{i,2}$ in the approximated M/D/1 queueing system is derived as
$$
Q_{i,2} = \frac{\lambda_i}{2\big(\mu'_{i,1}\big)^2\left(1 - \rho'_{i,1}\right)} \ge W_{i,2}. \qquad (21)
$$

Thus, we prove that $Q_{i,2}$ is an upper bound of $W_{i,2}$ in both lightly loaded and heavily loaded traffic conditions, and becomes a tighter upper bound than that in the G/D/1 system in (13) when $\lambda_i$ is small.
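For a concrete sense of the bound in (21), the short sketch below evaluates the light-load approximation of $W_{i,2}$ in (19) and $Q_{i,2}$ in (21) for assumed values of $\lambda_i$, $\mu_{i,1}$, and $\mu'_{i,1}$ (with $\mu'_{i,1} \le \mu_{i,1}$); the rates are illustrative and not taken from the paper's evaluation.

```python
# Numerical sketch of Q_{i,2} >= W_{i,2} from (19) and (21).
# All rates below are assumed for illustration; rho' = lam / mu_p.

def w_i2_light_load(lam: float, mu: float, mu_p: float) -> float:
    """Approximate W_{i,2} from (19) under light traffic load."""
    rho_p = lam / mu_p
    return lam * (1.0 / mu_p ** 2 - 1.0 / mu ** 2) / (2.0 * (1.0 - rho_p))

def q_i2(lam: float, mu_p: float) -> float:
    """Q_{i,2} of the approximated M/D/1 queueing system from (21)."""
    rho_p = lam / mu_p
    return lam / (2.0 * mu_p ** 2 * (1.0 - rho_p))

lam, mu, mu_p = 300.0, 2000.0, 1500.0   # assumed rates (packets/s), mu_p <= mu
print("W_i2 (approx.):", w_i2_light_load(lam, mu, mu_p))
print("Q_i2 (bound):  ", q_i2(lam, mu_p))
```

With these expressions, the gap is $Q_{i,2} - W_{i,2} = \lambda_i / \big(2\mu_{i,1}^2(1-\rho'_{i,1})\big) \ge 0$, which is consistent with (21).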

ACKNOWLEDGMENT

The authors would like to thank Y. Cao (an undergraduate research assistant) for his help in testing over OpenStack the time profiles for traffic flows traversing an NFV node.



REFERENCES

[1] Q. Ye et al., "End-to-end quality of service in 5G networks—Examining the effectiveness of a network slicing framework," IEEE Veh. Technol. Mag., vol. 13, no. 2, pp. 65–74, Jun. 2018.

[2] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, "Internet of Things: A survey on enabling technologies, protocols, and applications," IEEE Commun. Surveys Tuts., vol. 17, no. 4, pp. 2347–2376, 4th Quart., 2015.

[3] Q. Ye and W. Zhuang, "Distributed and adaptive medium access control for Internet-of-Things-enabled mobile networks," IEEE Internet Things J., vol. 4, no. 2, pp. 446–460, Apr. 2017.

[4] Q. Ye and W. Zhuang, "Token-based adaptive MAC for a two-hop Internet-of-Things enabled MANET," IEEE Internet Things J., vol. 4, no. 5, pp. 1739–1753, Oct. 2017.

[5] H. Li, M. Dong, and K. Ota, "Radio access network virtualization for the social Internet of Things," IEEE Cloud Comput., vol. 2, no. 6, pp. 42–50, Nov./Dec. 2015.

[6] W. Wu et al., "Enhance the edge with beamforming: Performance analysis of beamforming-enabled WLAN," in Proc. IEEE WiOpt, May 2018, pp. 1–6.

[7] X. Duan, C. Zhao, S. He, P. Cheng, and J. Zhang, "Distributed algorithms to compute Walrasian equilibrium in mobile crowdsensing," IEEE Trans. Ind. Electron., vol. 64, no. 5, pp. 4048–4057, May 2017.

[8] G. Yang, S. He, and Z. Shi, "Leveraging crowdsourcing for efficient malicious users detection in large-scale social networks," IEEE Internet Things J., vol. 4, no. 2, pp. 330–339, Apr. 2017.

[9] L. Lyu, C. Chen, S. Zhu, and X. Guan, "5G enabled codesign of energy-efficient transmission and estimation for industrial IoT systems," IEEE Trans. Ind. Informat., vol. 14, no. 6, pp. 2690–2704, Jun. 2018, doi: 10.1109/TII.2018.2799685.

[10] F. Lyu et al., "SS-MAC: A novel time slot-sharing MAC for safety messages broadcasting in VANETs," IEEE Trans. Veh. Technol., vol. 67, no. 4, pp. 3586–3597, Apr. 2018, doi: 10.1109/TVT.2017.2780829.

[11] W. Xu, H. A. Omar, W. Zhuang, and X. S. Shen, "Delay analysis of in-vehicle Internet access via on-road WiFi access points," IEEE Access, vol. 5, pp. 2736–2746, 2017.

[12] R. Mijumbi et al., "Network function virtualization: State-of-the-art and research challenges," IEEE Commun. Surveys Tuts., vol. 18, no. 1, pp. 236–262, 1st Quart., 2016.

[13] L. Wang, Z. Lu, X. Wen, R. Knopp, and R. Gupta, "Joint optimization of service function chaining and resource allocation in network function virtualization," IEEE Access, vol. 4, pp. 8084–8094, 2016.

[14] F. Bari, S. R. Chowdhury, R. Ahmed, R. Boutaba, and O. C. M. B. Duarte, "Orchestrating virtualized network functions," IEEE Trans. Netw. Service Manag., vol. 13, no. 4, pp. 725–739, Dec. 2016.

[15] N. Zhang et al., "Software defined networking enabled wireless network virtualization: Challenges and solutions," IEEE Netw., vol. 31, no. 5, pp. 42–49, May 2017.

[16] S. Q. Zhang, Q. Zhang, H. Bannazadeh, and A. Leon-Garcia, "Routing algorithms for network function virtualization enabled multicast topology on SDN," IEEE Trans. Netw. Service Manag., vol. 12, no. 4, pp. 580–594, Dec. 2015.

[17] J. Ordonez-Lucena et al., "Network slicing for 5G with SDN/NFV: Concepts, architectures, and challenges," IEEE Commun. Mag., vol. 55, no. 5, pp. 80–87, May 2017.

[18] X. Li, J. Rao, H. Zhang, and A. Callard, "Network slicing with elastic SFC," in Proc. IEEE VTC, Toronto, ON, Canada, Sep. 2017, pp. 1–5.

[19] J. Li et al., "Joint resource allocation and online virtual network embedding for 5G networks," in Proc. IEEE GLOBECOM, Singapore, Dec. 2017, pp. 1–6.

[20] M. Mechtri, C. Ghribi, and D. Zeghlache, "A scalable algorithm for the placement of service function chains," IEEE Trans. Netw. Service Manag., vol. 13, no. 3, pp. 533–546, Sep. 2016.

[21] B. A. A. Nunes, M. Mendonca, X.-N. Nguyen, K. Obraczka, and T. Turletti, "A survey of software-defined networking: Past, present, and future of programmable networks," IEEE Commun. Surveys Tuts., vol. 16, no. 3, pp. 1617–1634, 3rd Quart., 2014.

[22] H. Li, K. Ota, M. Dong, and M. Guo, "Mobile crowdsensing in software defined opportunistic networks," IEEE Commun. Mag., vol. 55, no. 6, pp. 140–145, Jun. 2017.

[23] H. Li, M. Dong, and K. Ota, "Control plane optimization in software-defined vehicular ad hoc networks," IEEE Trans. Veh. Technol., vol. 65, no. 10, pp. 7895–7904, Oct. 2016.

[24] C. Clark et al., "Live migration of virtual machines," in Proc. ACM USENIX, May 2005, pp. 273–286.

[25] S. Akoush, R. Sohan, A. Rice, A. W. Moore, and A. Hopper, "Predicting the performance of virtual machine migration," in Proc. IEEE MASCOTS, Aug. 2010, pp. 37–46.

[26] S. Ayoubi, C. Assi, K. Shaban, and L. Narayanan, "MINTED: Multicast virtual network embedding in cloud data centers with delay constraints," IEEE Trans. Commun., vol. 63, no. 4, pp. 1291–1305, Apr. 2015.

[27] A. Ghodsi, V. Sekar, M. Zaharia, and I. Stoica, "Multi-resource fair queueing for packet processing," ACM SIGCOMM Comput. Commun. Rev., vol. 42, no. 4, pp. 1–12, Oct. 2012.

[28] W. Wang, B. Liang, and B. Li, "Multi-resource generalized processor sharing for packet processing," in Proc. ACM IWQoS, Montreal, QC, Canada, Jun. 2013, pp. 1–10.

[29] A. Ghodsi et al., "Dominant resource fairness: Fair allocation of multiple resource types," in Proc. ACM NSDI, Apr. 2011, pp. 323–336.

[30] F. C. Chua, J. Ward, Y. Zhang, P. Sharma, and B. A. Huberman, "Stringer: Balancing latency and resource usage in service function chain provisioning," IEEE Internet Comput., vol. 20, no. 6, pp. 22–31, Nov./Dec. 2016.

[31] B. Xiong, K. Yang, J. Zhao, W. Li, and K. Li, "Performance evaluation of OpenFlow-based software-defined networks based on queueing model," Comput. Netw., vol. 102, pp. 172–185, Jun. 2016.

[32] M. Jarschel et al., "Modeling and performance evaluation of an OpenFlow architecture," in Proc. 23rd ITC, San Francisco, CA, USA, Sep. 2011, pp. 1–7.

[33] N. Egi et al., "Understanding the packet processing capability of multi-core servers," Intel, Santa Clara, CA, USA, Rep. LABOS-REPORT-2009-001, 2009.

[34] A. K. Parekh and R. G. Gallager, "A generalized processor sharing approach to flow control in integrated services networks: The single-node case," IEEE/ACM Trans. Netw., vol. 1, no. 3, pp. 344–357, Jun. 1993.

[35] Z.-L. Zhang, D. Towsley, and J. Kurose, "Statistical analysis of the generalized processor sharing scheduling discipline," IEEE J. Sel. Areas Commun., vol. 13, no. 6, pp. 1071–1080, Aug. 1995.

[36] D. C. Parkes, A. D. Procaccia, and N. Shah, "Beyond dominant resource fairness: Extensions, limitations, and indivisibilities," ACM Trans. Econ. Comput., vol. 3, no. 1, p. 3, Mar. 2015.

[37] A. Gutman and N. Nisan, "Fair allocation without trade," in Proc. ACM AAMAS, Jun. 2012, pp. 719–728.

[38] D. P. Bertsekas, R. G. Gallager, and P. Humblet, Data Networks, vol. 2. Englewood Cliffs, NJ, USA: Prentice-Hall, 1987.

[39] L. Kleinrock, Queueing Systems, Volume 1: Theory. New York, NY, USA: Wiley, 1976.

[40] S. Larsen, P. Sarangam, R. Huggahalli, and S. Kulkarni, "Architectural breakdown of end-to-end latency in a TCP/IP network," Int. J. Parallel Program., vol. 37, no. 6, pp. 556–571, Dec. 2009.

[41] OMNeT++ 5.0. Accessed: Dec. 2017. [Online]. Available: http://www.omnetpp.org/omnetpp

[42] A. Liska and G. Stowe, DNS Security: Defending the Domain Name System. Syngress, Maryland Heights, MO, USA, 2016.

[43] Openstack (Release Pike). Accessed: Dec. 2017. [Online]. Available: https://www.openstack.org

Qiang Ye (S’16–M’17) received the B.S. degree in network engineering and the M.S. degree in communication and information system from the Nanjing University of Posts and Telecommunications, Nanjing, China, in 2009 and 2012, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 2016.

He has been a Post-Doctoral Fellow with the Department of Electrical and Computer Engineering, University of Waterloo, since 2016. His current research interests include SDN and NFV, network slicing for 5G networks, VNF chain embedding and end-to-end performance analysis, medium access control and performance optimization for mobile ad hoc networks, and Internet of Things.



Weihua Zhuang (M’93–SM’01–F’08) has been with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada, since 1993, where she is a Professor and a Tier I Canada Research Chair in wireless communication networks.

Dr. Zhuang was a recipient of the 2017 Technical Recognition Award from the IEEE Communications Society Ad Hoc & Sensor Networks Technical Committee, one of 2017 ten N2Women (Stars in Computer Networking and Communications), and a co-recipient of several Best Paper Awards from IEEE conferences. She was the Editor-in-Chief of the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY from 2007 to 2013, the Technical Program Chair/Co-Chair of IEEE VTC Fall 2017 and Fall 2016, and the Technical Program Symposia Chair of IEEE GLOBECOM 2011. She is a Fellow of the Royal Society of Canada, the Canadian Academy of Engineering, and the Engineering Institute of Canada. She is an elected member of the Board of Governors and VP Publications of the IEEE Vehicular Technology Society. She was an IEEE Communications Society Distinguished Lecturer from 2008 to 2011.

Xu Li received the B.Sc. degree in computer science from Jilin University, Changchun, China, in 1998, the M.Sc. degree in computer science from the University of Ottawa, Ottawa, ON, Canada, in 2005, and the Ph.D. degree in computer science from Carleton University, Ottawa, ON, in 2008.

He was a Research Scientist (Tenure) with Inria, France. He is a Staff Researcher with Huawei Technologies Inc., Ottawa, ON. His current research interests include 5G system design and standardization; he has over 90 refereed scientific publications, over 40 3GPP standard proposals, and over 50 patents and patent filings.

Dr. Li was a recipient of NSERC PDF Awards, the IEEE ICNC 2015 Best Paper Award, and a number of other awards. He was on the Editorial Boards of the IEEE Communications Magazine and the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS. He was a TPC Co-Chair of the IEEE VTC 2017 (Fall) LTE, 5G and Wireless Networks Track, and the IEEE Globecom 2013 Ad Hoc and Sensor Networking Symposium.

Jaya Rao (M’14) received the B.S. and M.S. degrees in electrical engineering from the University of Buffalo, Buffalo, NY, USA, in 2001 and 2004, respectively, and the Ph.D. degree from the University of Calgary, Calgary, AB, Canada, in 2014.

In 2014, he joined Huawei Technologies Canada, Ottawa, ON, Canada, where he has been involved with research and design on CIoT, URLLC, and V2X-based solutions in 5G new radio, and is currently a Senior Research Engineer. He has contributed with Huawei at 3GPP RAN WG2, RAN WG3, and SA2 meetings on topics related to URLLC, network slicing, mobility management, and session management. From 2004 to 2010, he was a Research Engineer with Motorola Inc., Chicago, IL, USA.

Dr. Rao was a recipient of the Best Paper Award of IEEE WCNC 2014.