
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, VOL. 12, NO. 1, MARCH 2015

Adaptive Resource Management and Control in Software Defined Networks

Daphne Tuncer, Marinos Charalambides, Stuart Clayman, and George Pavlou

Abstract—The heterogeneous nature of the applications, technologies and equipment that today's networks have to support has made the management of such infrastructures a complex task. The Software-Defined Networking (SDN) paradigm has emerged as a promising solution to reduce this complexity through the creation of a unified control plane independent of specific vendor equipment. However, designing an SDN-based solution for network resource management raises several challenges, as it should exhibit flexibility, scalability and adaptability. In this paper, we present a new SDN-based management and control framework for fixed backbone networks, which provides support for both static and dynamic resource management applications. The framework consists of three layers which interact with each other through a set of interfaces. We develop a placement algorithm to determine the allocation of managers and controllers in the proposed distributed management and control layer. We then show how this layer can satisfy the requirements of two specific applications for adaptive load-balancing and energy management purposes.

Index Terms—Software defined networking, adaptive resource management, decentralized network configuration.

I. INTRODUCTION

THE evolution of information and communication technology (ICT) over the past thirty years has heavily influenced the life of the modern consumer. The crucial role played by ICT today has catered for a persistent demand for new services and applications with strict requirements in terms of availability, service quality, dependability, resilience and protection. This has resulted in increasingly complex networks and software systems that need to support heterogeneous applications, technologies and multi-vendor equipment, making the management of network infrastructures a key challenge.

The vision of Software-Defined Networking (SDN) as a key enabler for simplifying management processes has led to keen interest from both the industry and the research community, who have been investing significant efforts in the development of SDN-based solutions. SDN enables the control of networks via a unified plane which is agnostic to vendor equipment and operates on an abstract view of the resources. Among its advantages, flexibility and programmability are usually highlighted,

Manuscript received November 15, 2014; revised February 2, 2015; accepted February 5, 2015. Date of publication February 11, 2015; date of current version March 17, 2015. This research was funded by the EPSRC KCN project (EP/L026120/1) and by the Flamingo Network of Excellence project (318488) of the EU Seventh Framework Programme. The associate editor coordinating the review of this paper and approving it for publication was F. De Turck.

The authors are with the Department of Electronic and Electrical Engineering, University College London, London WC1E 7JE, U.K.

Digital Object Identifier 10.1109/TNSM.2015.2402752

in addition to simplification of management tasks and application deployment through a centralized network view [1]–[3].

Centralized management and control solutions have, however, limitations. In addition to resilience, scalability is an important issue, especially when dealing with operations that require dynamic reconfiguration of network resources. Centralized approaches are generally well-suited for implementing the logic of applications for which the time between each execution is significantly greater than the time to collect, compute and disseminate results. As a consequence, adaptive management operations with short timescales call for both distributed management and control approaches, which are essential enablers of online resource reconfigurations.

In this paper, we present a novel SDN-based management and control framework for fixed backbone networks. The proposed framework, based on SDN principles, follows a layered architecture where the communication between layers is achieved through a set of interfaces. Although different approaches have been proposed in the literature, these have either mainly focused on the control plane (e.g., [4], [5]) or considered centralized management solutions (e.g., [6], [7]). In contrast, the framework presented in this paper relies on a distributed management and control layer, which consists of a set of local managers (LMs) and controllers (LCs) forming separate management and control planes. The modular structure of this layer is a salient feature of our approach. This not only simplifies the integration of management applications, but also offers significant deployment benefits, allowing control and management functionality to evolve independently. For example, management and control components from different vendors can be used together, and the functionality of management applications can be updated without disrupting active network services. The degree of distribution in each plane (number of elements) depends both on the physical infrastructure and on the type of management applications to consider. The exchange of information between distributed elements in each plane is supported by the management substrate developed in our previous work [8], [9].

We investigate how the proposed framework can be used to support adaptive resource management operations, which involve short timescale reconfiguration of network resources, and we discuss the main research issues and challenges associated with the deployment of such a distributed solution. In particular, we show how the requirements of the adaptive load-balancing and energy management applications proposed in our previous work [10]–[12] can be satisfied by the functionality and interfaces of the framework. In addition, we develop a placement algorithm to determine the allocation of LMs and LCs (both

1932-4537 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


TABLE I: MAIN ACRONYMS

in number and mapping to network equipment) according to topological characteristics of the physical infrastructure. Based on real network topologies, we show how the parameters of the algorithm can be tuned to control the allocation. We also evaluate the performance of the load-balancing and energy management applications in terms of resource utilization based on real traffic traces, and compare them against alternative schemes. The results demonstrate that a significant reduction in terms of link utilization and energy consumption can be achieved in a scalable manner.

The remainder of this paper is organized as follows. Section II provides background information. Section III describes the main components and interfaces of the proposed framework and highlights the operations performed by each component. Section IV presents the placement algorithm. An overview of the adaptive resource management applications is provided in Section V. Section VI describes in detail how the requirements of these applications are supported by the proposed framework. The results of the evaluation are presented in Section VII and Section VIII discusses related work. Finally, conclusions and future directions are provided in Section IX.

II. DEFINITIONS AND BACKGROUND

In this section, we first define some of the basic terms and notation used in this paper and give background information on SDN. We also present the network management substrate, which forms the communication basis of the proposed architecture. In the last sub-section, we provide information about Multi-Topology Routing, which serves as the routing mechanism for achieving path diversity in the context of the two management applications considered in this work. For clarification purposes, the main acronyms used in this paper are summarized in Table I.

A. Definitions

We consider network topologies represented by the sets of network links $L$ and nodes $N$. We refer to network edge nodes as the set of nodes generating and absorbing traffic, and we represent this set by $N_E$. We refer to all other nodes as core nodes. For any pair of edge nodes $(i, j) \in N_E$, $sd_{ij}$ represents the source-destination pair of source node $i$ and destination node $j$. Each $sd_{ij}$ is associated with a volume of traffic $v(sd_{ij})$ that represents the traffic demand between source node $i$ and destination node $j$. We define the traffic flow $F(sd_{ij})$ as the 2-tuple $(sd_{ij}, v(sd_{ij}))$ and the set of traffic flows $\phi_i$ locally originating at edge node $i \in N_E$ as follows:

$$\forall\, i \in N_E,\quad \phi_i = \{F(sd_{ij}),\ j \in N_E\} \qquad (1)$$
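To make the notation concrete, the following minimal Python sketch encodes traffic flows and the set $\phi_i$ of Eq. (1); the class names and the toy demand matrix are purely illustrative and not part of the paper.

```python
from dataclasses import dataclass

# Hypothetical encoding of the paper's notation: a source-destination
# pair sd_ij with demand v(sd_ij) forms the traffic flow F(sd_ij).
@dataclass(frozen=True)
class TrafficFlow:
    src: str        # edge node i
    dst: str        # edge node j
    volume: float   # v(sd_ij), e.g. in Mb/s

def local_flows(i, edge_nodes, demand):
    """Set phi_i of flows locally originating at edge node i (Eq. 1)."""
    return {TrafficFlow(i, j, demand[(i, j)]) for j in edge_nodes if j != i}

# Toy example: three edge nodes and an illustrative demand matrix.
edge_nodes = {"A", "B", "C"}
demand = {("A", "B"): 10.0, ("A", "C"): 5.0, ("B", "A"): 8.0,
          ("B", "C"): 2.0, ("C", "A"): 1.0, ("C", "B"): 4.0}
phi_A = local_flows("A", edge_nodes, demand)
```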

B. Software-Defined Networking

The main principle of SDN lies in the decoupling of network control from forwarding hardware [13]. In the SDN architecture, physical network devices are represented as basic forwarding elements (usually referred to as switches), forming a data plane, and are supervised by a network-wide control platform consisting of a set of software components (the controllers) [5]. The control platform can be seen as a logically centralized control plane which operates on a global network view and which implements a range of control functions. The controllers interact with the switches via a standardized interface which is used to collect network state information and distribute control commands to be enforced in the network. The existence of a standard interface enables the separation of the control and forwarding logic and, as such, supports the independent evolution of both planes.

Although there is no formal requirement for the choice of the interface to use, OpenFlow [14] has progressively imposed itself as the de facto standard, given the massive support from both academia and industry [3]. As stated in its specifications, it provides an open protocol to define the basic primitives for programming the forwarding plane of network devices [13]. In the OpenFlow model, the network traffic is identified as a set of flows which are defined according to a set of pre-defined match rules instantiated by the controller in the flow tables of the switches. Each flow is associated with a set of instructions used to control how traffic should be routed and treated in the network.
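As an illustration of this match-rule model, the sketch below models a simplified flow-table entry and lookup. The field names and exact-match semantics are simplifications for illustration and do not reflect the actual OpenFlow wire format.

```python
from dataclasses import dataclass

# Simplified model of an OpenFlow-style flow entry: match fields plus
# instructions, as instantiated by a controller in a switch flow table.
@dataclass
class FlowEntry:
    match: dict          # exact-match fields, e.g. {"in_port": 1}
    instructions: list   # e.g. ["set_mt_id:2", "output:3"]
    priority: int = 0

def lookup(flow_table, packet):
    """Return the highest-priority entry whose match fields all agree
    with the packet, or None on a table miss."""
    for entry in sorted(flow_table, key=lambda e: -e.priority):
        if all(packet.get(k) == v for k, v in entry.match.items()):
            return entry
    return None

table = [FlowEntry(match={"in_port": 1, "eth_type": 0x0800},
                   instructions=["set_mt_id:2", "output:3"], priority=10)]
pkt = {"in_port": 1, "eth_type": 0x0800, "ip_dst": "10.0.2.7"}
entry = lookup(table, pkt)   # matches; None would signal a table miss
```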

Recently, the Open Networking Foundation (ONF) presented an architecture for SDN in a technical report [15]. The proposed architecture consists of the following three planes: (i) a data plane, comprising the set of network elements, (ii) a controller plane, with a set of SDN controllers which have exclusive control over a set of network resources, and (iii) an application plane, implementing a set of network/management applications which are executed through the control plane. The communication between the planes is realized in a hierarchical manner through a set of interfaces.

C. Network Management Substrate

In our previous work [8], [9], we developed an in-network management approach for fixed backbone networks, in which an intelligent substrate is used to enable the dynamic reconfiguration of network resources. Compared to traditional management solutions, where reconfigurations are decided offline by a centralized management system that has a global view of the network, reconfiguration decisions are directly taken in a decentralized and adaptive fashion by decision-making entities distributed across network edge nodes, based on periodic feedback from the network. The decision-making entities are organized into a management substrate (MS), which is a logical structure used to facilitate the exchange of information. In particular, it is used for coordination purposes since it provides a means through which decision-making points can communicate.

Any node in the substrate can directly communicate only with its neighbors, which are defined by the topological structure used. The choice of this structure can be driven by different parameters related to the physical network, such as its topology and the number of edge nodes, but also by the constraints of the coordination mechanism between the nodes and the associated communication protocol. The overhead incurred by the communication protocol in terms of delay and number of messages exchanged, for example, is a key factor that can influence the choice of the structure [9].

D. Multi-Topology Routing

To achieve their objectives, most resource management approaches employ routing protocols that can support path diversity. Multi-Topology Routing (MTR) [16], [17] is a standardized extension to the common Interior Gateway routing protocols, i.e., OSPF and IS-IS, which can provide a set of multiple routes between any source-destination (S-D) pair in the network by enabling the virtualization of a single physical network topology into several independent virtual IP planes.

To determine the topology over which packets need to be routed, packets are marked at network ingresses with the Multi-Topology Identifier (MT-ID) of the routing topology to which the corresponding traffic flows have been assigned. A separate routing table needs to be implemented for each topology (i.e., routing scheme) in each router, so that upon receiving a traffic flow, the router analyzes the MT-ID marked in the packets and forwards the packets to the next hop according to the relevant routing table [18]. The configuration of the different virtual planes is part of an offline process which computes a set of desired IP virtual topologies given the physical network topology.

Splitting ratios, enforced at network ingresses, can subsequently control the portion of an incoming traffic flow to be routed over each of the virtual planes.
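The sketch below illustrates how an ingress could map traffic to MTR planes in proportion to configured splitting ratios. The hash-based assignment (which keeps all packets of a transport-level flow on the same plane, cf. Section VI-D) and all names are assumptions for illustration, not part of the MTR specifications.

```python
import hashlib
from bisect import bisect

# Hypothetical splitting-ratio table: for each S-D pair, an
# MT-ID -> ratio mapping over the virtual planes (ratios sum to 1).
splitting_ratios = {("A", "B"): {1: 0.6, 2: 0.3, 3: 0.1}}

def assign_plane(src, dst, flow_key):
    """Pick the MTR plane whose MT-ID is marked on the packets of this
    transport-level flow, in proportion to the splitting ratios.
    Hashing the flow key keeps one TCP flow on a single plane."""
    ratios = sorted(splitting_ratios[(src, dst)].items())
    ids = [mt_id for mt_id, _ in ratios]
    cumulative, acc = [], 0.0
    for _, r in ratios:
        acc += r
        cumulative.append(acc)
    # Map the flow key to a point in [0, 1) and find its ratio bucket.
    h = int(hashlib.sha1(flow_key.encode()).hexdigest(), 16) % 10**6 / 10**6
    return ids[min(bisect(cumulative, h), len(ids) - 1)]

mt_id = assign_plane("A", "B", "10.0.0.1:5000->10.0.1.9:80/tcp")
```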

III. MANAGEMENT AND CONTROL FRAMEWORK

Resource management in fixed networks is usually performed by external offline centralized systems, which optimize network performance over long timescales. Typically, the central manager operates on a global view of the network, which facilitates the implementation of management applications. However, while centralized/offline solutions are adequate for network operations that do not require frequent reconfigurations (e.g., computation of MTR planes), they are not appropriate for applications that adapt to traffic and network dynamics (e.g., online traffic engineering). In addition to the single point of failure, these approaches have limitations especially in terms of scalability (i.e., communication overhead between the central manager and devices at runtime) and the lag in the central manager's reactions, which may result in sub-optimal performance. To overcome these limitations, dynamic management applications and control should rely on a distributed framework. In this section, we present a hierarchical resource management and control framework for fixed backbone infrastructures in the context of SDN environments.

Fig. 1. Proposed framework.

A. Architecture

In the proposed framework, the network environment is conceptually divided into three layers, as shown in Fig. 1. The bottom layer represents the physical infrastructure, which consists of network switches¹ and links, and can be defined as the data or forwarding plane. The second layer consists of a set of local controllers (LCs) and managers (LMs), forming the distributed control and management planes, respectively. Finally, the third layer represents the centralized management system.

A key characteristic of the proposed framework is its modular nature, which enables the separation between the management application logic (represented by LMs) and the control logic (represented by LCs). As a result, the two can evolve independently, offering increased design choices and flexibility for system vendors, as well as simplified integration of network applications, while maintaining interoperability.

More specifically, the LCs and LMs are software components which are in charge of controlling and managing the network resources (i.e., switches and links), respectively. Each LC is responsible for a set of network switches, which define its scope of control, so that a network switch is controlled by one LC only. In addition, each LC is logically associated with one or more LMs. The LMs implement the logic of management applications (e.g., traffic engineering) and are responsible for making decisions regarding the settings of network parameters (for example, to compute new configurations that optimize resource utilization) to be applied in the switches under their responsibility. To take management decisions, the LMs communicate through the management substrate, as described in Section II-C and shown in Fig. 1. The substrate, which was proposed in our previous work [8], is implemented as an integral part of the framework and is used by LMs to exchange information about the network state and the configurations to apply.

¹ In this paper, we assume that each network node is represented by an OpenFlow switch.


Configuration decisions taken by LMs are provided to peering LCs, which define and plan the sequence of actions to be enforced for updating the network parameters. These actions are then mapped to OpenFlow instructions sent to and executed by the switches. It is not the intention of this paper to propose a specific planning mechanism and mapping functions; these are implementation choices that depend on the type of inputs from the management applications. The separation of concerns between LMs and LCs provides significant deployment benefits since changes can be applied to LMs in an operational environment independently of the LCs and vice versa. In particular, replacing or updating the management logic can be achieved without stopping the network, as LCs can rely on existing rules.

The number of LMs and LCs to deploy, as well as the association between the two, can depend on different factors such as the size and topology of the physical network, or the type of management applications to support. In this paper, we adopt a configuration similar to the one depicted in Fig. 1. We consider an equal number of LCs and LMs and a one-to-one mapping between them. In addition, LCs and LMs interact with the same set of switches, i.e., there is a perfect overlap between their zones of responsibility. As shown in Section VI, such a model is well suited to the resource management application scenarios investigated in this paper.

The centralized management system consists of two components, namely the Local Controller Orchestrator (LCO), which supervises all LCs, and the Local Manager Orchestrator (LMO), which supervises all LMs. These are responsible for longer term operations, for example those that pertain to the life cycle of LMs and LCs. In particular, they are used to determine the number of LMs and LCs to deploy, their location, as well as their zones of responsibility.

It should be noted that the proposed architecture is compatible with the generic ONF SDN model [15]. However, while the report does not elaborate on the specifics of each layer, we go a step further in this paper by investigating the issues that arise from the requirements and realization of the functionality of such an architecture. In particular, we investigate how the distributed management and control planes can support dynamic operations.

B. System Design and Interfaces

The three layers of the proposed architecture are realized with a set of functional components and interfaces that facilitate the interaction and communication between the various components. These are depicted in Fig. 2 and elaborated below.

1) Functional Components: From an architectural viewpoint, the system can be decomposed into four main components.

Central Management System: The functionality of the central management system pertains to long term management operations. The LM Substrate Orchestrator module implements methods to compute the structure of the management substrate. The Application Orchestrator module is responsible for performing high-level control of the management applications which are instantiated in the network. In particular, this module decides how the logic of each application should be distributed across the different LMs (i.e., it selects the LMs which need to be involved in the decision-making process of a given application). The LC Substrate Orchestrator module is concerned with the supervision of LCs. The decisions taken by the central management system rely on information obtained from the Global Network View, which maintains global knowledge about the environment, such as the physical network topology. It should be noted that, in this work, we do not elaborate on the issues associated with gathering such information and generating a global network view. This component is included in Fig. 2 for completeness, to illustrate how this view can be used by other components of the framework.

Fig. 2. Overview of system components and interfaces.

Local Manager: From a functional viewpoint, a LM can be represented by three main modules. The Monitoring Module is concerned with functions related to network monitoring, such as data collection, filtering and aggregation; in particular, it enables each LM to create its own local network view. Furthermore, it also allows the LM to maintain consistency with the network view of other LMs. Network information collected and generated by the Monitoring Module can be stored in the local memory of the LM. The logic to perform management operations is realized by Management Application Modules, which maintain information tables and implement algorithms to decide on the configurations to apply. A module is defined for each management application and each LM can implement a different number of applications. The decision of whether a module should be instantiated on a given LM is made by the Application Orchestrator and depends on the application type. Finally, the Routing Module implements basic methods related to the routing functionality (e.g., shortest path computation).

Local Controller: An instance of a LC is represented by three main modules. The Storage Module consists of a set of local storage structures, which are used to maintain information received from the LMs regarding the configuration output of management applications. Based on this information, the Planning Module determines the actions to take to (re)configure the switches, for example, according to mapping functions. This also encompasses scheduling methods to decide on how and when actions should be applied. The Execution Module is responsible for translating the planned actions into a set of configuration commands (e.g., OpenFlow) to be enforced in the switches.

Switch: The basic functionality of the network switches is forwarding. In the context of OpenFlow, switches perform packet lookups and forwarding. They are represented by one or more Flow Tables and an OpenFlow channel to an external controller [14]. Each table entry is mapped to a flow and is associated with a set of instructions to apply to matching packets. The number of tables to configure, as well as their structure, depends on the nature of the management applications supported by the system. It should be noted that different applications may have different requirements in terms of the structures and capabilities to be embedded in the switches (e.g., support for hashing functions).

2) Interfaces: Some previous research initiatives on SDN (e.g., [1], [15]) have used the notions of northbound and southbound to refer to the different interfaces. The relevance of such terminology presupposes, however, that the controller(s) can be regarded as the focal element(s) of an SDN architecture. In the proposed framework, both LMs and LCs act as focal points and, as such, we name the various interfaces based on the identity of the interacting components instead.

The interaction between the different components of the proposed architecture is supported by the interfaces shown in Fig. 2. The communication between the orchestrator components (LMO and LCO) and the LMs and LCs is supported by the O-M and O-C interfaces, respectively. The exchange of messages between the LMs and the LCs, for example regarding new configurations computed by the management application modules, is supported by the M-C interface. The M-S interface is defined between the LMs and the switches. It serves monitoring purposes, so that each LM can build its own local view of the network by directly collecting information from the set of switches under its responsibility. Since the network information is primarily needed by the LM, the M-S interface bypasses the LC to avoid additional processing and delay. Finally, the interaction between the LCs and the switches is supported by the C-S interface. Switches can report network events to the LCs, which, in turn, instruct them about the configuration updates to apply (e.g., modification of table entries). This interface can be realized by the OpenFlow protocol; however, extensions will be required to enable the configuration of more than one table type.
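The paper names these interfaces but does not prescribe an API. Purely as an illustration, the sketch below gives hypothetical method signatures for three of them; every method name here is an assumption.

```python
from abc import ABC, abstractmethod

class MCInterface(ABC):
    """M-C: a LM pushes computed configurations to its peering LC."""
    @abstractmethod
    def push_config(self, config: dict) -> None: ...

class MSInterface(ABC):
    """M-S: a LM collects statistics directly from a switch,
    bypassing the LC."""
    @abstractmethod
    def collect_stats(self, switch_id: str) -> dict: ...

class CSInterface(ABC):
    """C-S: a LC configures switches and receives their events
    (realizable, with extensions, by OpenFlow)."""
    @abstractmethod
    def install_entry(self, switch_id: str, entry: dict) -> None: ...
    @abstractmethod
    def report_event(self, switch_id: str, event: dict) -> None: ...
```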

C. Operations

This subsection provides a detailed description of the main operations performed by each component of the architecture.

Management Operations: The main difference between the management operations performed by the central management system and the LMs concerns the timescale at which they are executed. The central management system performs long term operations, which concern the computation of static configurations based on a global view of the network (e.g., connectivity/topology). These rely on a set of algorithms, which are usually invoked at long timescales (e.g., in the order of days/weeks) and are executed in an offline manner. The placement of LMs and LCs, the organization of the management substrate and the computation of virtual MTR planes are examples of such operations. To take management decisions, the central manager uses network-wide information maintained by the Global Network View component. This stores information related to the network topology, as well as to the structure of the distributed management and control planes. Any changes in the environment (e.g., node failure) are reported to the central system since these can affect the current settings and should subsequently trigger appropriate reconfigurations.

Short to medium term management operations are performed by the LMs in the distributed management plane. Short term operations are concerned with management decisions taken in the order of seconds/minutes, with failure recovery mechanisms and adaptive splitting ratio reconfiguration algorithms being representative examples. In contrast, medium term operations deal with configurations which need to be updated less often (e.g., every few hours), such as the route computation between two nodes. The decisions can be taken independently by each LM based on local knowledge of the network, which is acquired from the switches under its responsibility. However, to avoid configuration inconsistencies, the LMs may also coordinate their decisions through the management substrate. In particular, they can exchange information available locally about network statistics. The characteristics of the coordination process are application-specific.

Control Operations: Control operations are performed by the LCs on switches under their scope of control, based on directives received from the LMs. More specifically, the LCs are responsible for configuring the entries of the tables implemented in the switches by deciding which entries should be installed, removed or updated, and also when possible changes should be applied. They act as intermediate entities between LMs and switches, capable of translating the decisions of the management modules into commands to be executed to modify table entries in the switches. In addition, LCs can also control which configurations should be applied and when. For instance, a LC may decide to instantiate entries for a subset of the flows to satisfy the memory capacity constraint defined for a table. Configurations that are not directly enforced are stored in the local Storage Module.

At the network level, each incoming packet is matched against entries in successive tables implemented in the switches based on rules. These define whether the packet satisfies some characteristics (i.e., belongs to a given traffic flow). In case of a positive match, the actions defined for the matching entry are added to the action set associated with the packet. In case the packet does not match any entry, it is sent to the relevant LC through the C-S interface to determine how it should be processed. Upon receiving a packet request, the LC defines the set of actions to be applied based on the configurations stored in the Storage Module and instantiates the corresponding new table entries in all the switches under its zone of control.
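A minimal sketch of this table-miss path is given below; the Storage Module lookup, the entry format and all class and method names are assumptions made for illustration.

```python
class Switch:
    """Stub switch that just records installed entries."""
    def __init__(self):
        self.flow_table = []
    def install(self, entry):
        self.flow_table.append(entry)

class LocalController:
    def __init__(self, switches, storage):
        self.switches = switches   # zone of control
        self.storage = storage     # Storage Module: flow key -> config

    def on_packet_in(self, packet):
        """Handle a packet reported on a table miss via C-S."""
        config = self.storage.get((packet["src"], packet["dst"]))
        if config is None:
            return                 # no directive stored by the LM yet
        entry = {"match": {"src": packet["src"], "dst": packet["dst"]},
                 "instructions": config["instructions"]}
        # Instantiate the entry in every switch of the zone so that
        # subsequent packets of this flow match locally.
        for sw in self.switches:
            sw.install(entry)

lc = LocalController([Switch(), Switch()],
                     {("A", "B"): {"instructions": ["set_mt_id:2", "output:1"]}})
lc.on_packet_in({"src": "A", "dst": "B"})
```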

IV. LOCAL MANAGER/CONTROLLER PLACEMENT

A key deployment aspect of the decentralized management and control planes is the distribution of LCs and LMs. It was recently argued by Heller et al. in [19] that one of the key parameters to take into account when designing an SDN-based architecture for large networks (i.e., WANs) is the propagation delay between the controller(s) and the network devices. In the case of the architecture proposed in this paper, a "good" configuration can be thought of, from a qualitative point of view, as one that reduces the communication delay between the LMs/LCs and the network components without significantly increasing the management overhead (e.g., due to the coordination between LMs). For instance, while assigning a LM/LC to each network switch can optimize the communication delay, this may also significantly affect the complexity of the coordination mechanism required to harmonize management decisions (i.e., volume of messages and delay).

In this section, we present an approach to compute the placement of LCs and LMs in the distributed management and control planes for the specific case depicted in Fig. 1, where the mapping of LCs to LMs is one-to-one. Given a network topology, the approach aims at determining the number of LMs/LCs to deploy, their location, as well as the switches these are connected to, with the objective of minimizing the distance (in terms of hop count) between the switches and the LMs/LCs. To avoid overloading the text, we use the term LC to refer to the LM-LC pair in the rest of this section.

A. Placement Algorithm

The placement problem can be formulated as an uncapacitated facility location problem, which is known to be NP-hard. It has been addressed in several application domains, ranging from the selection of network service gateways (e.g., [20]) to the deployment of sensor networks (e.g., [21], [22]). In this work, we develop an approach based on a modified version of the leader node selection algorithm proposed by Clegg et al. in [23], which most closely relates to our application scenario. The algorithm, called Pressure, aims at determining, given a dynamic network environment, the subset of nodes on which to install monitoring points so as to minimize the average distance (in terms of hop count) between the monitoring entities and the network nodes. While Pressure has a similar objective to the one considered here, it was originally designed for dynamic network topologies and cannot be directly applied to the LC placement problem. To account for the requirements of a static topology, we modify the logic of Pressure and extend it to incorporate an initialization step and a terminating condition, which are essential in this case. The output of the new algorithm, which we refer to as PressureStatic, provides the number of LCs to deploy, their location, as well as the configuration of their mapping to network switches.

More specifically, PressureStatic follows a greedy approach, where LCs are iteratively added to the network one-by-one. The main principle is to select, at each step, the location which, if a LC is installed there, will lead to the largest reduction in average distance. To decide on the location, the algorithm maintains a list of locations at which it is still possible to install a LC and, at each step, it calculates the Pressure score of the node $i$ associated with each of these locations as follows [23]:

$$P(i) = \sum_{j \in N} \max(0,\ l_j - d_{i,j}) \qquad (2)$$

TABLE II: NETWORK CHARACTERISTICS

where, for all $j$ in $N$, $l_j$ represents the distance between node $j$ and the LC to which it is currently connected, and for all $i$ and $j$ in $N$, $d_{i,j}$ is the distance from node $i$ to node $j$. The node with the highest score is then selected to host the next LC and, based on the updated list of selected locations, the algorithm finally determines to which LC each network node should be logically connected, so that a node is connected to its closest available LC.²

B. Algorithm Initialization and Terminating Condition

1) Initialization: Due to the greedy nature of the PressureStatic algorithm, the order in which LC locations are selected can affect the resulting configuration. Given that this is directly driven by the choice of the first placement location, the objective of the initialization step is to determine how to best select this location. In practice, different strategies can be implemented, ranging from simple random selection methods to more sophisticated approaches which can take into account some topological characteristics of the locations. We analyze and discuss in detail the effects of the initialization step in Section VII.

2) Terminating Condition: Given that the objective of the proposed placement algorithm is to minimize the average distance of LCs to switches, the optimal configuration would be the direct one-to-one mapping of LCs to network switches. However, as previously explained, a fully distributed solution has inherent limitations in terms of scalability. The purpose of the terminating criterion is to provide a condition under which no more LCs should be added to the network. To derive such a condition, we build upon the observations formulated by Heller et al. in [19], who investigated, for a wide range of network topologies, the "ideal" number of controllers to deploy to minimize the controller-to-switch communication latency. It was shown that, in most cases, the gain in terms of latency reduction tends to decrease as the number of controllers increases.

To evaluate the performance of PressureStatic with respect to the distance reduction achieved when introducing a new LC, we first define the terminating condition as a constraint on the maximum number of LCs to deploy. As such, the algorithm stops when the number of selected LCs is equal to the maximum authorized value. We then apply the algorithm to the four networks presented in Table II.

For each network, we vary the maximum authorized value from 1 to the total number of nodes in the topology and determine, for each case, the resulting average distance. We denote by $\bar{D}_k$ the average switch-LC distance when the total number of LCs is equal to $k$.

² It should be noted that a network node is connected to one LC only.


Fig. 3. Evolution of the value of $\tau_{k+1}$ with the number of LCs.

To measure the benefit of having multiple LCs, we then define, for all $k$ in $[2; N]$, the parameter $\omega_k$ as follows:

$$\forall\, k \in [2; N],\quad \omega_k = \frac{\bar{D}_k}{\bar{D}_1} \qquad (3)$$

$\omega_k$ represents the gain, in terms of distance reduction, when using $k \geq 2$ LCs compared to the case where only one LC is used. Based on the $\omega_k$ values, we then define, for all $k$ in $[2; N-1]$, the parameter $\tau_{k+1}$ as follows:

$$\forall\, k \in [2; N-1],\quad \tau_{k+1} = \frac{\omega_{k+1} - \omega_k}{\omega_k} \qquad (4)$$

The value $\tau_{k+1}$ represents the improvement, in terms of distance gain, when an extra LC is added to the network. Fig. 3 shows the evolution of the $\tau_{k+1}$ values for the four considered topologies. As observed, all networks follow the same trend: the improvement rapidly decreases until reaching a stage where it slowly converges to zero. These results corroborate the observations formulated in [19] and show that, beyond a certain number of LCs, the addition of an extra LC does not yield significant benefits in terms of average LC-to-switch distance reduction.

Based on these results, we define the terminating condition according to a threshold imposed on the gain improvement $\tau_{k+1}$: the algorithm terminates if the value of $\tau_{k+1}$ is smaller than the input threshold. We further investigate the influence of the threshold value in Section VII. It should be noted that, with the proposed approach, the minimum number of selected LCs is always equal to 2 and that the introduction of a new LC always leads to a positive improvement (i.e., reduces the average distance). The pseudo-code of the PressureStatic algorithm is presented in Fig. 4. Its time complexity is dominated by the number of nodes in the network, i.e., $O(N^2)$.

Fig. 4. Pseudo-code of the PressureStatic algorithm.
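Since the pseudo-code of Fig. 4 is not reproduced in this extraction, the following Python sketch reconstructs the described greedy loop, the Pressure score of Eq. (2) and the threshold-based stopping rule of Eqs. (3)-(4). It assumes symmetric hop-count distances and is a reconstruction under those assumptions, not the authors' exact code.

```python
def pressure_static(dist, tau_threshold, first):
    """dist[i][j]: hop count between nodes i and j (assumed symmetric);
    `first`: initial LC location (initialization strategies vary);
    stops once the relative gain |tau| drops below tau_threshold."""
    nodes = list(dist)
    placed = [first]

    def avg_distance(lcs):
        return sum(min(dist[j][c] for c in lcs) for j in nodes) / len(nodes)

    def pressure(i):
        # Eq. (2): P(i) = sum_j max(0, l_j - d_{i,j}), where l_j is the
        # distance from j to its currently closest LC.
        return sum(max(0, min(dist[j][c] for c in placed) - dist[i][j])
                   for j in nodes)

    d1 = avg_distance(placed)                # average distance with one LC
    omega_prev = None
    while len(placed) < len(nodes):
        best = max((n for n in nodes if n not in placed), key=pressure)
        placed.append(best)
        omega = avg_distance(placed) / d1            # Eq. (3)
        if omega_prev is not None:
            tau = (omega - omega_prev) / omega_prev  # Eq. (4), negative = gain
            if abs(tau) < tau_threshold:
                placed.pop()                         # gain too small: stop
                break
        omega_prev = omega
    # Attach each switch to its closest selected LC (footnote 2).
    mapping = {j: min(placed, key=lambda c: dist[j][c]) for j in nodes}
    return placed, mapping

# Toy 4-node line topology A-B-C-D (hop counts).
D = {"A": {"A": 0, "B": 1, "C": 2, "D": 3},
     "B": {"A": 1, "B": 0, "C": 1, "D": 2},
     "C": {"A": 2, "B": 1, "C": 0, "D": 1},
     "D": {"A": 3, "B": 2, "C": 1, "D": 0}}
lcs, attach = pressure_static(D, tau_threshold=0.2, first="B")
```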

V. ADAPTIVE RESOURCE MANAGEMENT

Initial demonstrations of management applications in SDN environments relied on centralized solutions (e.g., [7], [28]). While these are well-suited for computing long term network configurations, they have limitations which make them unable to efficiently deal with dynamic resource reconfigurations. Adaptive resource management approaches call for the development of distributed solutions. In our previous work, we investigated a new approach to support dynamic resource reconfiguration functionality in the context of load-balancing (LB) [10], [11] and energy efficiency management (EM) [12] applications. This section describes the main characteristics of the proposed approach.

A. Management Functionality

The resource management decision process is distributed across the network edge nodes, which are organized into a management substrate (Section II-C) and embedded with dedicated management logic that enables them to perform reconfigurations based on feedback regarding the state of the network. More specifically, based on the path diversity provided by MTR, the reconfiguration decisions of both applications concern the traffic splitting ratios applied at network ingresses so that individual objectives are met.

In the case of the LB functionality, the objective is to balance the load in the network by moving some traffic away from highly utilized links towards less utilized ones to disperse traffic from hot spots. In order to minimize the maximum utilization in the network, the load-balancing algorithm iteratively adjusts the splitting ratios of the traffic flows, so that traffic can be moved away from the link with the maximum utilization, $l_{max}$.

Exploiting the fact that many links in core networks are bundles of multiple physical cables [29], the objective of the EM approach is to offload as many router line cards (RLCs) as possible, which can subsequently enter sleep mode. RLCs are full if their load is equal to their capacity, utilized if their load is non-zero and less than their capacity, and non-utilized if they have zero load. One of the key decisions when adjusting the splitting ratios concerns the bundled link to consider for (a) removing traffic from, and (b) assigning that traffic to. This decision is based on a ranked list of all utilized RLCs in the network according to their load. Traffic load is iteratively moved from the least utilized RLC to more utilized ones that can accommodate this load and thus potentially fill up their remaining capacity, without activating new RLCs in the process.

The adaptation of the splitting ratios for both applications is performed over short timescales, for instance, every 15 minutes.

B. Adaptation Process

To adjust the splitting ratios of network traffic flows, both applications rely on an adaptation process, which is an iterative process triggered periodically by the nodes in the substrate (i.e., the managers). It consists of a sequence of reconfiguration actions decided in a coordinated fashion.

To prevent inconsistencies between concurrent traffic splitting adjustments, these are made in a collaborative manner between the nodes involved in the process, so that only one node is permitted to take reconfiguration decisions at a time. The node selected at each iteration, which we refer to as the decision-making point, is responsible for locally executing a reconfiguration algorithm, which tries to adapt the splitting ratios of local traffic flows (i.e., those originating at the corresponding edge node) so that the resource optimization objective can be met.

To select a unique decision-making point, each node in the substrate is associated with a unique identifier. This can be defined based on the actual network identifier of the node (e.g., its address) or determined according to some of the characteristics of the node, for example with respect to its local traffic flows. The identifiers are used to compute a ranked list of nodes, which is made available at each node in the substrate using the communication protocol defined in [9]. The node with the highest identifier is initially chosen to be the decision-making point. Upon completing the reconfiguration locally, it sends a message to the next node in the list (i.e., the node with the second highest identifier), which then becomes the decision-making point for the next reconfiguration interval, and so on. The adaptation process terminates if no further adjustments can be performed or if the algorithm reaches the maximum number of permitted iterations, which is a parameter of the system.
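A minimal sketch of this rotation scheme follows, under the assumption that identifiers are totally ordered and that the local algorithm reports whether it changed anything; all names are illustrative.

```python
import itertools

def run_adaptation(nodes, reconfigure, max_iterations):
    """Nodes take turns as the decision-making point in decreasing
    order of substrate identifier; `reconfigure(node)` runs the local
    algorithm of Section V-C and returns True if it adjusted a ratio."""
    ranked = sorted(nodes, key=lambda n: n["id"], reverse=True)
    iterations, changed_in_pass = 0, False
    for node in itertools.cycle(ranked):      # token passed down the list
        if iterations >= max_iterations:
            break                             # permitted-iteration cap
        changed_in_pass |= reconfigure(node)
        iterations += 1
        if iterations % len(ranked) == 0:     # completed a full pass
            if not changed_in_pass:
                break                         # no further adjustment possible
            changed_in_pass = False
    return iterations

nodes = [{"id": 3}, {"id": 7}, {"id": 5}]
run_adaptation(nodes, reconfigure=lambda n: False, max_iterations=30)  # -> 3
```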

C. Reconfiguration Algorithm

The reconfiguration algorithm is executed by the decision-making point at each iteration of the adaptation process. It follows a greedy process that successively recomputes the splitting ratios of the local traffic flows of the decision-making point. Based on information available locally and received from other nodes in the substrate, the algorithm determines the link $l_{rem}$ from which traffic should be removed. In the case of the LB approach, this is the link with the maximum utilization, while for the EM approach, it is the link with the least utilized RLC. The algorithm then tries to adjust the splitting ratios of the flows which contribute to the load on $l_{rem}$ in order to move traffic away from the link. The outcome of the algorithm at each iteration is either positive, which means that part of a local flow can be diverted from $l_{rem}$, or negative if this is not possible.

The algorithm first identifies the local traffic flows that can be diverted from $l_{rem}$. In particular, a flow qualifies if it is routed over $l_{rem}$ in at least one virtual topology but not all topologies, i.e., there exists at least one alternative topology in which the traffic flow is not routed over $l_{rem}$. Those flows are then considered iteratively until the first one that can lead to an acceptable configuration is determined. In this case, the splitting ratios of the set of topologies in which the flow is routed over $l_{rem}$ are decreased by a factor $\delta^-$ while the others are increased by a factor $\delta^+$. The process terminates when no further local adjustments can be performed. The pseudo-code of the reconfiguration algorithm is presented in Fig. 5.

Fig. 5. Pseudo-code of the reconfiguration algorithm.

Its time complexity is theoretically determined by the number of local traffic flows to reconfigure and is in the order of $O(N^2)$ in the case of a PoP-level topology with $N$ nodes.
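As the pseudo-code of Fig. 5 is likewise not reproduced here, the sketch below reconstructs the described adjustment step under stated assumptions: the data structures are hypothetical, and the acceptability/capacity checks on the receiving topologies are omitted for brevity.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    ratios: dict    # MTR plane id -> splitting ratio (sums to 1)
    routes: dict    # plane id -> list of links on the flow's path

    def planes_over(self, link):
        return [p for p, path in self.routes.items() if link in path]

def reconfigure_once(local_flows, links, select_lrem, delta=0.05):
    """One iteration of the greedy adjustment of Fig. 5 (a
    reconstruction, not the authors' exact code). `select_lrem`
    encodes the objective: the most utilized link for LB, the link
    with the least utilized RLC for EM. Returns True on success."""
    lrem = select_lrem(links)
    for flow in local_flows:
        on = flow.planes_over(lrem)
        off = [p for p in flow.ratios if p not in on]
        if not on or not off:
            continue            # flow must avoid l_rem in >= 1 topology
        shift = min(delta, *(flow.ratios[p] for p in on))
        for p in on:            # delta-: remove traffic from l_rem
            flow.ratios[p] -= shift / len(on)
        for p in off:           # delta+: reassign it elsewhere
            flow.ratios[p] += shift / len(off)
        return True             # first acceptable adjustment applied
    return False                # negative outcome: nothing divertible

# LB-style selector over a link -> utilization map (illustrative).
lb_selector = lambda links: max(links, key=links.get)
```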

VI. MANAGEMENT APPLICATION REQUIREMENTS

In this section, we show how the requirements of the two adaptive resource management applications described in the previous section can be satisfied by the functionalities and interfaces of the proposed SDN architecture.

A. Local Manager Substrate

As described in Section III-C, short to mid term management decisions are taken by the LMs organized into a management substrate. In practice, the range of operations that a LM can perform depends on the Management Application modules which the LM implements. Whether a LM should be involved in the decision-making process of a given application is driven by the characteristics of the switches (e.g., edge/core switch) under its responsibility.

To enable the proposed adaptive resource management approach, two functions need to be supported by a LM. The first concerns routing decisions, which are taken by a Route Management Application (RMA) module, and the second concerns the reconfiguration of splitting ratios, which is implemented by an Adaptive Resource Management Application (ARMA) module (one instance of RMA and ARMA per application). The configuration used in this paper assumes that the allocation of LMs is driven by the placement of LCs, in which case each LM is responsible for the set of switches attached to its peer LC. As such, two scenarios can be considered:

• the switches under the responsibility of a LM are core switches only. In this case, the LM implements the RMA module only.

• there is at least one edge switch under the LM's responsibility. In this case, the LM implements both the RMA and ARMA modules and is responsible for configuring the splitting ratios of the local traffic flows of all the edge switches to which it is connected.

Fig. 6. Adaptive Resource Management Application module.

To realize the functionality of a given application, the LMs involved in the decision-making process may need to communicate through the management substrate. In the case of multiple applications, a separate management substrate needs to be computed for each application and implemented in the relevant ARMA module. Each substrate defines the set of neighbors of the LMs and their substrate identifiers. These identifiers are used by the ARMA module to compute the ordered list of nodes in the substrate and to select the decision-making point at each iteration of the adaptation process (see Section V-B).

B. Management Application Functionality Requirements

Long Term Configurations: Long term configurations are computed by the centralized management system. In the context of the resource management scenario considered here, these concern the computation of the MTR planes and the structure of the substrate associated with each management application. The MTR plane computation algorithm is executed in an offline fashion by the Application Orchestrator. Based on information retrieved from the Global Network View component about the physical network topology, the algorithm determines the number and structure of the virtual topologies needed to satisfy the path diversity requirements. The configuration of each plane is then passed to the RMA module of all LMs through the O-M interface. The structure of each substrate is computed by an offline algorithm implemented in the LM Substrate Orchestrator. The resulting structure information is passed to the RMA and ARMA modules of the associated LMs.

Adaptive Resource Management: The logic to execute the load-balancing and energy management functionality is implemented by the ARMA module of each involved LM. As depicted in Fig. 6, the ARMA implements four components. The Adaptation Process component controls the execution of the reconfiguration algorithm (see Section V-C), which adjusts the splitting ratios of the local traffic flows controlled by the LM. The Management Substrate Information component maintains information about the structure of the management substrate, such as the set of neighbor LMs and their ordered list. The Link & Flow Information component consists of multiple tables containing information about the network links (e.g., capacity, utilization) and the characteristics of the local flows (e.g., splitting ratio configuration, demand).

Fig. 7. Example of possible inconsistencies between routing decisions.

Network statistics are updated based on information received from other LMs through the substrate or retrieved from the local LM Monitoring Module. Based on the ARMA module, the LMs associated with at least one edge switch periodically (e.g., every 15 minutes) compute the vectors of splitting ratios for every source-destination pair in the network. These are then sent to the LCs to which the LMs are connected and stored for future enforcement.

C. Routing Functionality Requirements

The RMA module interacts with the Routing module, which implements methods to compute routing parameters (e.g., Dijkstra shortest path). In the scenario considered here, two routing configurations need to be computed, which we refer to as traffic engineering (TE) and forwarding (FW). The TE configuration is computed at every LM involved in the splitting ratio reconfiguration process and represents the full network path from any source switch controlled by the LM to any destination switch in every MTR plane. The FW configuration is computed at every LM and is used to determine the interface on which packets (received at any network switch) need to be sent to reach their destination via the shortest path in each MTR plane. The results are sent by each LM to its attached LC, which then configures the forwarding policies in all the switches under its responsibility.

In most network topologies, the switches along the path between a S-D pair may not all be under the responsibility of a single LM. As a result, inconsistencies between the TE and FW configurations may occur. In particular, this can happen when multiple equal-cost shortest paths exist. To illustrate this issue, we consider the simple example depicted in Fig. 7. All links have a weight equal to 1. The full path from S1 to S6 is computed at LM_1. There are two equal-cost shortest paths between S1 and S6: path p11 : {1; 2; 3; 4; 6} and path p12 : {1; 2; 3; 5; 6}. Assume that path p12 is selected by LM_1. To route packets, forwarding policies need to be implemented in each of the switches. Due to its scope of responsibility, LM_1 can only decide how to forward packets from S1 and S2; it does not have any control on how the packets for S6 are routed from S3 onwards. Switches S3, S4, S5 and S6 are controlled by LM_2. There are two equal-cost shortest paths from S3 to S6: path p21 : {3; 4; 6} and path p22 : {3; 5; 6}.


To ensure consistency with the TE path considered by LM_1, LM_2 should choose path p22 when deriving the forwarding policies to apply for packets from S1 to S6. To avoid inconsistent decisions, all LMs should therefore apply a common path selection strategy (for example, one based on the identifiers of the interfaces). This ensures that the FW decisions are in accordance with the TE decisions. A minimal sketch of such a strategy is given below.
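The following sketch illustrates one possible common tie-breaking rule; the paper only suggests interface identifiers as an example, so comparing the sequences of node identifiers lexicographically is an assumption made here for illustration:

```python
# Hypothetical common tie-breaking rule for equal-cost shortest
# paths: every LM orders candidate paths lexicographically by their
# node-identifier sequences and picks the smallest, so all LMs make
# the same choice independently.

def select_path(equal_cost_paths):
    """Deterministically select one path among equal-cost candidates."""
    return min(equal_cost_paths)  # lexicographic order on node lists

# Fig. 7 example: the two equal-cost paths from S3 to S6.
p21 = [3, 4, 6]
p22 = [3, 5, 6]
print(select_path([p21, p22]))  # -> [3, 4, 6] at every LM
```

Since the rule is deterministic and shared, the FW path derived by LM_2 always matches the TE path selected by LM_1 under the same rule.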

D. Switch Configuration

To enforce TE decisions at the network level, incoming packets need to be marked at the network edges with the identifier of the MTR plane to which they have been assigned based on the computed splitting ratios. Although OpenFlow does not currently support MTR, the latest version of the specification introduces support for MPLS labeling and VLAN tagging [14]. In addition, a proposal for multipath routing [30], which relies on the group option defined in the OpenFlow protocol, has been released by the ONF. As such, we believe that the current protocol could be easily extended to support MTR.

The TE application requires that traffic splitting happens at the network edges only. Packets that belong to the same TCP flow are always assigned to the same topology and no further adjustment is permitted along the route. This ensures that all packets in one TCP flow follow a single path to the destination, thus avoiding the out-of-order delivery issues that deteriorate TCP performance [31]. As a result, this has implications on the way incoming packets need to be processed by the different switches along the path between a S-D pair. More specifically, switches can act as source or transit depending on how they process packets. A switch acts as a source switch for incoming packets if a) it is an edge switch, and b) the packets belong to one of the switch's local traffic flows. In this case, the switch needs to assign packets to the relevant MTR plane and execute the following steps:

1) Determine to which local traffic flow the packet belongs.
2) Enforce the relevant splitting ratios to determine the MTR plane on which to route the packet.
3) Mark the header with the identifier of the selected plane.
4) Forward the packet according to the configuration of the selected MTR plane.

A switch acts as a transit for incoming packets if a) it is an edge switch but the incoming packets do not belong to one of the switch's local traffic flows, or b) it is a core switch. In this case, packet processing mainly consists in forwarding the packets according to the configuration of the MTR plane to which they have been assigned (a sketch combining both processing modes follows the list below), i.e.,

1) Determine to which local traffic flow the packet belongs and to which MTR plane it is assigned.
2) Forward the packet according to the configuration of the relevant MTR plane.
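The two processing modes can be summarized in the following sketch. The Switch structure, the header field names and the placeholder pick_plane function are illustrative assumptions, not the authors' implementation; a possible hash-based realization of pick_plane is sketched later in this subsection.

```python
# Minimal sketch of packet processing at MTR-enabled switches,
# under the assumptions stated above.

class Switch:
    def __init__(self, is_edge, local_flows, split_ratios, fw_table):
        self.is_edge = is_edge            # edge or core switch
        self.local_flows = local_flows    # set of local (src, dst) flows
        self.split_ratios = split_ratios  # (src, dst) -> ratios per plane
        self.fw_table = fw_table          # plane -> dst -> output port

    def process(self, packet):
        flow = (packet["src"], packet["dst"])
        if self.is_edge and flow in self.local_flows:
            # Source switch: steps 1)-4) above.
            packet["MTR_ID"] = pick_plane(self.split_ratios[flow], packet)
        # Source and transit alike then forward on the assigned plane.
        return self.fw_table[packet["MTR_ID"]][packet["dst"]]

def pick_plane(ratios, packet):
    # Placeholder: always use the plane with the largest ratio.
    return max(range(len(ratios)), key=lambda k: ratios[k])

# Example: an edge switch with one local flow split over two planes.
s1 = Switch(True, {("A", "B")}, {("A", "B"): [0.3, 0.7]},
            {0: {"B": 1}, 1: {"B": 2}})
print(s1.process({"src": "A", "dst": "B"}))  # -> port 2 (plane 1)
```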

Each packet is processed according to the information retrieved from the packet header (e.g., source IP, destination IP, MTR_ID, etc.), which is used to match the packet against flow entries in the different Flow Tables implemented in each switch. Table entries are pro-actively configured by the LC to which each switch is connected, based on the routing and splitting ratio configuration information maintained by the Storage module.

In case of a table miss (i.e., no entry for the traffic flow to which the packet belongs), switches should be configured to send the packet to their LC, which then decides on the processing to apply (i.e., whether to create new table entries).

To balance the traffic across the different MTR planes, a hashing scheme, such as the one proposed by Cao et al. in [32], could be implemented in each edge switch and parametrized for each traffic flow (i.e., S-D pair) according to the values of the splitting ratios. This, however, presupposes the availability of a mechanism to enable the programmability of the hashing function, which is currently not possible with OpenFlow. An alternative solution could use a simple hack based on the available options (e.g., by configuring the bitmasks associated with some of the flow entry matching fields). Although this may have the advantage of not requiring any extension to the protocol, it may not provide the same level of control as the hashing method. A hedged sketch of such a hashing scheme follows.
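The sketch below refines the pick_plane placeholder from the earlier sketch in the spirit of [32]: the hash space is partitioned per flow in proportion to the splitting ratios, so every packet of one TCP flow maps to the same MTR plane. The header fields used as hash key and the hash granularity are assumptions.

```python
# Hypothetical hash-based traffic splitting: hash the 4-tuple of a
# flow into [0, 1) and map it onto the cumulative splitting ratios.

import hashlib

def pick_plane(ratios, packet):
    """Map a packet to an MTR plane according to the splitting ratios."""
    key = f"{packet['src']}-{packet['dst']}-{packet.get('sport')}-" \
          f"{packet.get('dport')}"
    h = int(hashlib.md5(key.encode()).hexdigest(), 16) % 1000 / 1000.0
    cumulative = 0.0
    for plane, ratio in enumerate(ratios):
        cumulative += ratio
        if h < cumulative:
            return plane
    return len(ratios) - 1  # guard against rounding

pkt = {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 4242, "dport": 80}
print(pick_plane([0.3, 0.7], pkt))  # same plane for every packet of the flow
```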

VII. EVALUATION

The logic of the LM and LC Orchestrator Modules and the Adaptive Resource Management Application Module has been implemented in a Java-based simulated environment. This section presents the results of the performance evaluation of the LC placement algorithm and of the load-balancing and energy management approaches, based on real network topologies and traffic traces.

A. Placement Algorithm Performance

We evaluate the performance of PressureStatic based on the four network topologies presented in Table II. Given that its output is affected by the location at which the first LC is added, we consider, for all topologies, all possible initial locations, thus covering the totality of the input space.

1) Influence of the Initial and Terminating Criteria: As described in Section IV-B2, the terminating condition is defined according to the threshold imposed on the distance reduction gain improvement. From Fig. 3, it can be inferred that setting the threshold above 10% will result, for all topologies, in the selection of 2 LCs only. To investigate how the threshold influences the number of selected LCs, we apply the algorithm using improvement values of 2.5%, 5% and 10%. The results are presented as boxplots in Fig. 8. A simplified sketch of the terminating condition is given below.
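The following sketch illustrates the terminating condition only; the gain function stands in for the algorithm's internal distance reduction computation, and the greedy candidate selection is an assumption based on the description in Section IV-B2:

```python
# Hedged sketch of the PressureStatic termination logic: LCs are
# added greedily while the relative improvement in distance
# reduction gain stays above the threshold.

def place_lcs(nodes, gain, threshold=0.05, first_lc=None):
    """Greedily add LCs until the gain improvement drops below threshold."""
    selected = [first_lc if first_lc is not None else nodes[0]]
    prev_gain = gain(selected)
    while True:
        # Candidate maximizing the distance reduction gain (assumption).
        candidate = max((n for n in nodes if n not in selected),
                        key=lambda n: gain(selected + [n]), default=None)
        if candidate is None:
            break
        new_gain = gain(selected + [candidate])
        if prev_gain > 0 and (new_gain - prev_gain) / prev_gain < threshold:
            break  # improvement below threshold: terminate
        selected.append(candidate)
        prev_gain = new_gain
    return selected

# Toy gain with diminishing returns: more LCs, smaller improvements.
toy_gain = lambda s: 1.0 - 0.5 ** len(s)
print(place_lcs(list(range(6)), toy_gain, threshold=0.05))  # -> [0, 1, 2, 3]
```

Consistently with the results above, raising the threshold makes the loop exit earlier, so fewer LCs are selected.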

Several observations can be made from the results. It can first be noted that, for all topologies, the number of selected LCs decreases as the value of the threshold increases, which is consistent with the results depicted in Fig. 3: larger threshold values force the algorithm to terminate earlier. In addition, it can be observed that the size of the network (i.e., number of nodes) is not the main factor affecting the number of selected LCs. On average, a similar number of LCs is selected in the Geant, Germany50 and Deltacom networks in all cases. This can be explained by the definition of the terminating threshold, which considers the distance reduction gain in absolute values. Finally, the boxplots depict a variation in the number of selected LCs, which shows that the choice of the first LC location influences the output of the algorithm.


Fig. 8. Number of selected local controllers. (a) Threshold 2.5%. (b) Threshold 5%. (c) Threshold 10%.

Fig. 9. Number of selected LCs vs. average distance factor.

To determine how to choose the initial location, we investigate the existence of a correlation between the number of selected LCs and the actual topological location of the node to which the first LC is connected. Given that the objective of the placement algorithm is to reduce the average LC-switch distance, we characterize the node location according to its average distance factor, which is defined as follows:

$$\forall\, i \in N, \quad \Delta(i) = \frac{\sum_{j \in N \setminus \{i\}} d_{i,j}}{|N|^2} \tag{5}$$

The value of Δ represents the proximity of a node in terms of average distance to the rest of the network nodes. The smaller the value, the closer the node is, on average, to the other network nodes.³ For each network node i, we calculate its value Δ(i) and record the number of selected LCs when the first LC is connected to node i; a small example computing Δ is given below.

³By definition, the Δ values depend on the size of the network.
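A small example computing the average distance factor of (5), assuming d_{i,j} is the shortest-path hop count (the paper does not restate the distance metric at this point):

```python
# Sketch: Delta(i) from (5) on an unweighted topology, with BFS
# providing the shortest-path hop counts.

from collections import deque

def hop_distances(adj, src):
    """BFS shortest-path hop counts from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def avg_distance_factor(adj, i):
    """Delta(i): sum of distances from i to all other nodes over |N|^2."""
    dist = hop_distances(adj, i)
    n = len(adj)
    return sum(d for v, d in dist.items() if v != i) / n ** 2

# Toy 4-node path topology: 0-1-2-3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(avg_distance_factor(adj, 0))  # 6/16 = 0.375 (off-centered)
print(avg_distance_factor(adj, 1))  # 4/16 = 0.25  (more central)
```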

TABLE III: PLACEMENT ALGORITHM PARAMETER SETTINGS

The correlation between the average distance factor of the initial LC location and the number of selected LCs is depicted in Fig. 9 for the four considered networks, with a threshold value of 5%.

In all cases, the number of LCs tends to decrease as the value of Δ increases. In other words, when the initial location is on average close to every other node, a large number of LCs tends to be selected by the algorithm. In contrast, if the initial location is off-centered, a small number of LCs tends to be selected. As indicated by (4), the value of the improvement factor τ_{k+1} depends on the previous gain ω_k. When the first LC is attached to an off-centered node, the algorithm tends to select a more "central" location for the second LC in order to maximize the distance reduction. As a result, this leads to a substantial gain improvement. In comparison, the subsequent additions of LCs produce a lower reduction in terms of distance and, as such, do not satisfy the threshold constraint. In contrast, when the initial location is more "central", the rates of distance reduction associated with each additional LC tend to be on average lower but comparable between themselves, which leads to the selection of a larger number of LCs. Similar results were obtained with the other threshold values but are not reported here due to space limitations.

The results demonstrate that the number of LCs selected for a given network can be tuned by controlling the settings of the terminating condition and the initial LC placement. In practice, it is expected that the number of LCs should increase with the size of the network topology in order to minimize both the LC-switch and LC-LC distances. This translates into the following parameter settings: a low threshold value and a central initial LC position for large-scale networks, and a high threshold value and an off-centered initial position for smaller-scale networks. The settings for the topologies considered in this paper are presented in Table III.

2) Heuristic Performance: Clegg et al. showed in [23] that the placement computed by the Pressure algorithm significantly outperforms any random configuration in terms of LC-switch distance. Another factor to consider when evaluating the performance of the proposed placement heuristic is to compare its output to the optimal placement of the same number of LCs (i.e., the optimum average switch-LC distance). In order to derive the optimum value, we formulate the placement of n LCs as an Integer Linear Programming (ILP) problem with the following parameters. Let N_LC be the total number of LCs to deploy. For all i and j in N, let d_{i,j} be the distance between i and j. For all i in N, let x_i be the binary variable equal to 1 if a LC is attached to i, and 0 otherwise. In addition, for all i and j in N, let y_{i,j} be the binary variable equal to 1 if node i is connected to the LC attached to node j, and 0 otherwise. Finally, for all i in N, let l_i be the distance between node i and the LC to which it is connected.


Fig. 10. Performance of PressureStatic with threshold 5% vs. optimum average LC-switch distance. (a) Abilene. (b) Geant. (c) Germany50. (d) Deltacom.

The objective of the ILP is to determine the values of x_i and y_{i,j} which minimize the average LC-switch distance, i.e., formally:

$$\text{minimize} \quad \frac{1}{|N|} \sum_{i \in N} l_i \tag{6}$$

subject to the following constraints:

$$\forall\, i \in N, \quad \sum_{j \in N} d_{i,j} \cdot y_{i,j} = l_i \tag{7}$$

$$\forall\, i \in N, \quad \sum_{j \in N} y_{i,j} = 1 \tag{8}$$

$$\forall\, i \in N,\ \forall\, j \in N, \quad y_{i,j} \leq x_j \tag{9}$$

$$\forall\, j \in N, \quad x_j \leq \sum_{i \in N} y_{i,j} \tag{10}$$

$$\sum_{j \in N} x_j \leq N_{LC} \tag{11}$$

Constraint (7) defines the LC-switch distance. Constraint (8) ensures that each switch is connected to one LC only. Constraint (9) guarantees that a switch is associated with a location only if a LC is attached there, and constraint (10) forces the ILP to position a LC at a location only if at least one switch is associated with this location. Finally, constraint (11) ensures that the total number of LCs is at most equal to N_LC. A compact sketch of this formulation is given below.
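The following sketch expresses (6)-(11) using the PuLP modeling toolkit; the paper solves its models with glpsol/GLPK, so PuLP is substituted here for brevity only. The l_i variables are eliminated by substituting constraint (7) directly into the objective.

```python
# Hedged sketch of the LC placement ILP (6)-(11).

from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum

def optimal_placement(nodes, d, n_lc):
    nodes = list(nodes)
    prob = LpProblem("lc_placement", LpMinimize)
    x = LpVariable.dicts("x", nodes, cat=LpBinary)
    y = LpVariable.dicts("y", [(i, j) for i in nodes for j in nodes],
                         cat=LpBinary)
    # Objective (6), with l_i expanded via (7).
    prob += (1.0 / len(nodes)) * lpSum(d[i][j] * y[(i, j)]
                                       for i in nodes for j in nodes)
    for i in nodes:
        prob += lpSum(y[(i, j)] for j in nodes) == 1        # (8)
        for j in nodes:
            prob += y[(i, j)] <= x[j]                       # (9)
    for j in nodes:
        prob += x[j] <= lpSum(y[(i, j)] for i in nodes)     # (10)
    prob += lpSum(x[j] for j in nodes) <= n_lc              # (11)
    prob.solve()
    return [j for j in nodes if x[j].value() == 1]

# Toy 4-node path 0-1-2-3 with hop distances, at most two LCs.
d = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
print(optimal_placement(range(4), d, 2))  # e.g., one LC per half
```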

For each network topology in Table II, we apply the ILP for all values of N_LC relevant to that network (derived from Fig. 9). Based on the algorithm output, we then determine the optimal LC-switch distance and compare it to the one obtained with PressureStatic for the same number of selected LCs. The results are shown in Fig. 10. In the case of PressureStatic, the best, average and worst distance values recorded for the considered configuration are reported. It is worth noting that since switches and LCs can be physically attached to the same location (distance equal to 0), the average LC-switch distance can be lower than 1.

As observed, the placement computed by PressureStatic gives close to optimal performance in terms of average LC-switch distance. In the case of the Abilene network, the proposed algorithm is able to compute the optimal placement.

Fig. 11. Standard deviation of the cluster size with a threshold of 5%.

TABLE IV: NETWORK BUNDLED LINKS (BLs) CONFIGURATION

In all other cases, the deviation from the optimum decreases as the number of LCs increases. The highest deviation (16%) is obtained with Deltacom and 4 LCs.

3) Placement Optimization Objective: The algorithm aims at minimizing the average LC-switch distance. The number of LM/LCs, however, may also be driven by other parameters. In the proposed architecture, each LC is logically connected to a set of switches forming a cluster. In practice, the number of switches attached to an LC can affect the volume of information which needs to be maintained and processed by the LC. To investigate the effect of PressureStatic on the cluster size distribution, we compute the standard deviation of the size of the clusters attached to each LC for each network and configuration (a minimal sketch of this metric is given below). To account for the variation incurred by the choice of the first LC location, we plot the results as boxplots in Fig. 11. The closer the value of the standard deviation is to 0, the more uniformly distributed in terms of size the clusters are.
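A minimal sketch of this balance metric, with illustrative cluster assignments:

```python
# Cluster-size balance metric: standard deviation of the number of
# switches attached to each LC (0 means perfectly uniform clusters).

from statistics import pstdev

def cluster_size_stdev(assignment):
    """assignment: switch -> LC identifier."""
    sizes = {}
    for lc in assignment.values():
        sizes[lc] = sizes.get(lc, 0) + 1
    return pstdev(sizes.values())

# Example: 6 switches, 2 LCs; a 4/2 split is less balanced than 3/3.
print(cluster_size_stdev({1: "A", 2: "A", 3: "A", 4: "A",
                          5: "B", 6: "B"}))  # -> 1.0
print(cluster_size_stdev({1: "A", 2: "A", 3: "A", 4: "B",
                          5: "B", 6: "B"}))  # -> 0.0
```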

The results show that PressureStatic can lead to unbalanced clusters. The heterogeneity in terms of cluster sizes tends to increase as the number of nodes in the network increases. The trade-off between latency reduction and homogeneity of the volume of information to maintain at each LC could therefore be taken into account in the placement objective to control the clustering of switches. Increasing the homogeneity of the cluster size may, however, lead to the selection of a larger number of LM/LCs, which raises challenges regarding the choice of the management substrate structure used to organize the LMs (Section II-C). In [11], we showed that while simple structures such as the full-mesh or ring models suit well the case where a small number of nodes is involved in the substrate, these have limitations when this number increases. In this case, the use of a more sophisticated structure, such as the one presented in [9], should be considered.


Fig. 12. Evolution of max-u. (a) Abilene. (b) Geant. (c) Germany50.

The structure proposed in [9], which is a hybrid model, offers a trade-off between the full-mesh and the ring models in terms of communication cost and volume of information that needs to be maintained by each substrate node.

B. Adaptive Resource Management Scheme Performance

We evaluate the performance of the load-balancing (LB) and energy management (EM) approaches, described in Section V, under the LM plane configuration determined by the placement algorithm for the Abilene, Geant and Germany50 networks, for which real data traces are available (this is not the case for Deltacom). In order to take into account a wide range of traffic conditions, we consider a period of 7 days for both Abilene [33] and Geant [25]. In the case of Germany50, a period of one day is considered given the available data [34]. In all cases, adaptation is performed at a frequency of 15 minutes. The configurations used in each topology are summarized in Table IV.

We compare the performance in terms of maximum link utilization (max-u) and number of active line cards (nbRLCs) obtained for each traffic matrix (TM) with the following five schemes:

• Original: the original link weight settings are used in the original topology and no adaptation is performed.

• Static MTR (MTR-S): static splitting ratios (which do not adaptively change) are set equal to the inverse of the capacity of the bottleneck bundled link in each virtual topology.

• Load-Balancing (LB): the considered LB approach.

• Energy Management (EM): the considered EM approach.

• Optimum Load-Balancing (Opt-LB): the routing problem is defined as a MultiCommodity Flow problem [35] and the glpsol GLPK (GNU Linear Programming Kit) linear programming solver [36] is used to compute the optimal max-u for each traffic matrix.

The evolution of the max-u at 15-minute intervals obtained with the five schemes over the time period considered in each network is shown in Fig. 12. As can be observed, LB outperforms the Original, MTR-S and EM schemes in all cases and achieves close to optimal performance. The deviation from the optimum max-u is equal to 8.78%, 5.6%, and 10.1% in the Abilene, Geant and Germany50 networks, respectively. In addition, a deviation of less than 10% is obtained for 96.42% of the TMs in the case of Abilene, and for 98.25% and 92.8% of the TMs for Geant and Germany50, respectively, which indicates that the proposed scheme performs uniformly well.

To analyze the performance of the EM approach in terms of energy gain, we determine the deviation between the number of active line cards used by EM and that obtained with the other schemes.

Fig. 13. Cumulative frequency graphs of the gain in terms of active line cards obtained by EM compared to the other schemes. (a) Abilene. (b) Geant. (c) Germany50.

The results are shown as cumulative frequency graphs in Fig. 13. It can first be observed that a positive gain is obtained in all cases, which shows that EM always uses the lowest number of RLCs to route the traffic. The best performance is achieved when compared to the schemes that balance the traffic (Opt-LB, LB and MTR-S), which can be explained by the antagonistic nature of the two objectives. The gain compared to LB is on average equal to 20.90%, 44.82%, and 30.47% for the Abilene, Geant and Germany50 networks, respectively. In addition, EM performs better than the Original scheme by concentrating the traffic on a smaller number of links. In this case, the gain is equal to 19.21%, 21.05%, and 10.08% for the three networks, respectively.

The results demonstrate that a significant reduction in terms of resource utilization can be achieved by the proposed schemes under the configuration of the distributed management plane computed by the placement algorithm.


VIII. RELATED WORK

In SDN, in contrast to traditional network architectures, local control functions are moved away from network elements to remote controllers. This can lead to the creation of new bottlenecks and potentially significant overhead, depending on the type of management applications considered [2]. While using a centralized controller with a network-wide view has the benefit of facilitating the implementation of the control logic, it also presents limitations, especially in terms of scalability as the size and dynamics of the network increase. Different approaches have been proposed in the literature to overcome the limitations of the single centralized controller model, e.g., [4]-[6], [37].

The approach presented in [4] by Yeganeh et al. is based on two levels of controllers. Distributed controllers in the lower level operate on locally-scoped information, while decisions which require network-wide knowledge are taken by a logically centralized root controller. A hierarchical solution for Wide Area Networks (WANs) has also been proposed by Ahmed et al. in [6]. In their architecture, the network is divided into multiple zones of control, on top of which a centralized management layer implements management operation functionality and services. In contrast to hierarchical solutions, fully distributed designs have been proposed in [5] and [37]. The platform presented in [5] aims at facilitating the implementation of distributed control planes by abstracting network resources as data objects stored in a Network Information Base. In [37], the authors introduce HyperFlow, a physically distributed but logically centralized event-based OpenFlow control plane. Due to its centralized control logic, HyperFlow requires state synchronization mechanisms and targets pro-active management operations. The impact of control state consistency on the performance of a load-balancing application has been investigated by Levin et al. in [38]. While most of the previous approaches have focused on the interface between the data and control planes, the policy-based framework developed by Kim et al. targets the interface between the control platform and the network management logic [1]. Jain et al. report their experience in deploying a SDN-based WAN to connect Google datacenters [7]. Each datacenter site is controlled by a set of OpenFlow-based network control servers connected to a centralized SDN gateway which implements a logically centralized traffic engineering (TE) application.

The approaches described above realize distributed control planes. Most of them, however, consider a centralized solution to implement network applications, which is not adequate for reactive and adaptive functionality. The framework proposed in this paper advances the state of the art by enabling adaptive resource management through a distributed management plane (LMs), while relying on the support of a centralized management system for long-term operations. In addition, it is interesting to note that most of the SDN approaches emanating from outside the network management community do not make a clear separation between control and management functionality (e.g., [4], [5]) and usually disregard the implications of management operations (especially dynamic ones). In this paper, we advocate a model that separates control and management logic.

This distinction was also taken into account in the approach presented in [6]; however, in contrast to our framework, that approach relies on a centralized management plane.

A key issue when deploying a distributed control plane concerns the controller placement problem. In [39], Bari et al. proposed an approach to dynamically determine the number and location of controllers based on the network conditions. In practice, the allocation of controllers should also be driven by the requirements of the network applications to implement; in our algorithm, this is taken into account through the initial and terminating conditions. A dynamic approach is thus more geared towards applications sensitive to traffic fluctuations. A different objective has been considered by Hu et al. in [40], where the goal is to maximize the reliability in terms of control paths. Their approach assumes that the number of controllers to deploy is given, which may not be easy to determine a priori and which is treated as a variable in our work.

Another line of research related to the work presented in this paper concerns network resource management for load-balancing purposes and energy savings. To overcome the limitations of current offline TE functionality, online approaches that are able to react to current conditions in short timescales have been developed [41]. The objective is to dynamically adapt the settings based on real-time information received from the network in order to better utilize network resources. Most of the previous approaches rely on a centralized manager to compute new configurations (e.g., [42], [43]). While distributed solutions have been proposed in [44]-[46], these target MPLS-based networks. In contrast, this paper presents an adaptive and decentralized approach for IP networks. A decentralized IP-based TE mechanism has also been proposed in [47]. Compared to our approach, however, decisions are made by each node in the network and, as such, may be associated with a non-negligible signalling overhead. Finally, in the context of SDN, Agarwal et al. proposed an approach to perform TE in SDN environments [28] in the case where not all switches are embedded with SDN capabilities. In this case, SDN-compliant switches are controlled by a centralized SDN controller while the others implement traditional hop-by-hop routing functions.

IX. CONCLUSION

This paper presents a new SDN-based management and control framework for fixed backbone networks, which provides support for both static and dynamic resource management applications. Its architecture is compatible with the generic ONF SDN model and consists of three layers which interact with each other through a set of interfaces. Based on its modular structure, the framework makes a clear distinction between the management and control logic, which are implemented by different planes, thus offering improved deployment advantages. To demonstrate the benefits of the proposed framework, we show how its functionality and interfaces can be used to support the requirements of two distributed adaptive resource management applications, whose performance is evaluated in terms of resource utilization reduction. We also present a new placement algorithm to compute the configuration of the distributed management and control planes, and investigate how the degree of distribution can be controlled based on different parameters.



In future extensions of this research, we plan to enhance the proposed placement algorithm by considering other costs and constraints (e.g., maintaining the homogeneity of the cluster size, applying constraints on the volume of information stored at each LC, etc.). We also plan to develop a mechanism to enable unequal-cost splitting in OpenFlow. Finally, future work will investigate how the proposed framework can be used to realize other types of management applications (e.g., cache management).

REFERENCES

[1] H. Kim and N. Feamster, "Improving network management with software defined networking," IEEE Commun. Mag., vol. 51, no. 2, pp. 114–119, Feb. 2013.

[2] S. Yeganeh, A. Tootoonchian, and Y. Ganjali, "On scalability of software-defined networking," IEEE Commun. Mag., vol. 51, no. 2, pp. 136–141, Feb. 2013.

[3] B. Nunes, M. Mendonca, X.-N. Nguyen, K. Obraczka, and T. Turletti, "A survey of software-defined networking: Past, present, future of programmable networks," IEEE Commun. Surveys Tuts., vol. 16, no. 3, pp. 1617–1634, 2014.

[4] S. Hassas Yeganeh and Y. Ganjali, "Kandoo: A framework for efficient and scalable offloading of control applications," in Proc. HotSDN, Helsinki, Finland, 2012, pp. 19–24.

[5] T. Koponen et al., "Onix: A distributed control platform for large-scale production networks," in Proc. USENIX, Vancouver, BC, Canada, 2010, pp. 1–6.

[6] R. Ahmed and R. Boutaba, "Design considerations for managing wide area software defined networks," IEEE Commun. Mag., vol. 52, no. 7, pp. 116–123, Jul. 2014.

[7] S. Jain et al., "B4: Experience with a globally-deployed software defined WAN," SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 3–14, Oct. 2013.

[8] M. Charalambides, G. Pavlou, P. Flegkas, N. Wang, and D. Tuncer, "Managing the future Internet through intelligent in-network substrates," IEEE Netw., vol. 25, no. 6, pp. 34–40, Nov. 2011.

[9] D. Tuncer, M. Charalambides, H. El-Ezhabi, and G. Pavlou, "A hybrid management substrate structure for adaptive network resource management," in Proc. ManFI, Krakow, Poland, May 2014, pp. 1–7.

[10] D. Tuncer, M. Charalambides, G. Pavlou, and N. Wang, "Towards decentralized and adaptive network resource management," in Proc. CNSM Mini-Conference, Paris, France, Oct. 2011, pp. 1–6.

[11] D. Tuncer, M. Charalambides, G. Pavlou, and N. Wang, "DACoRM: A coordinated, decentralized, and adaptive network resource management scheme," in Proc. NOMS, Maui, HI, USA, Apr. 2012, pp. 417–425.

[12] M. Charalambides, D. Tuncer, L. Mamatas, and G. Pavlou, "Energy-aware adaptive network resource management," in Proc. IM, Ghent, Belgium, May 2013, pp. 369–377.

[13] "Software-defined networking: The new norm for networks," Open Networking Found., Palo Alto, CA, USA, Apr. 2012. [Online]. Available: https://www.opennetworking.org/images/stories/downloads/sdn-resources/white-papers/wp-sdn-newnorm.pdf

[14] "OpenFlow Specifications v.1.4.0," Open Networking Found., Palo Alto, CA, USA, Oct. 2013. [Online]. Available: https://www.opennetworking.org/sdn-resources/onf-specifications/openflow

[15] "SDN architecture," Open Networking Found., Palo Alto, CA, USA, Jun. 2014. [Online]. Available: https://www.opennetworking.org/images/stories/downloads/sdn-resources/technical-reports/TR_SDN_ARCH_1.0_06062014.pdf

[16] P. Psenak, S. Mirtorabi, A. Roy, L. Nguyen, and P. Pillay-Esnault, Multi-Topology (MT) Routing in OSPF, IETF RFC 4915, Jun. 2007. [Online]. Available: http://www.ietf.org/rfc/rfc4915.txt

[17] T. Przygienda, N. Shen, and N. Sheth, M-ISIS: Multi Topology (MT) Routing in Intermediate System to Intermediate Systems (IS-ISs), IETF RFC 5120, Feb. 2008. [Online]. Available: http://www.ietf.org/rfc/rfc5120.txt

[18] S. Gjessing, "Implementation of two resilience mechanisms using multi topology routing and stub routers," in Proc. AICT-ICIW, Guadeloupe, French Caribbean, Feb. 2006, pp. 29–34.

[19] B. Heller, R. Sherwood, and N. McKeown, "The controller placement problem," in Proc. HotSDN, Helsinki, Finland, 2012, pp. 7–12.

[20] R. Cohen and G. Nakibly, "A traffic engineering approach for placement and selection of network services," IEEE/ACM Trans. Netw., vol. 17, no. 2, pp. 487–500, Apr. 2009.

[21] S. Ray, R. Ungrangsi, D. Pellegrini, A. Trachtenberg, and D. Starobinski, "Robust location detection in emergency sensor networks," in Proc. INFOCOM, San Francisco, CA, USA, Mar. 2003, vol. 2, pp. 1044–1053.

[22] D. F. Pellegrini and R. Riggio, "Leakage detection in water pipes networks using acoustic sensors and identifying codes," in Proc. IEEE PerSense, San Diego, CA, USA, Mar. 2013, vol. 2, pp. 1044–1053.

[23] R. Clegg, S. Clayman, G. Pavlou, L. Mamatas, and A. Galis, "On the selection of management/monitoring nodes in highly dynamic networks," IEEE Trans. Comput., vol. 62, no. 6, pp. 1207–1220, Jun. 2013.

[24] The Abilene Internet 2 Topology. [Online]. Available: http://www.Internet2.edu/pubs/200502-IS-AN.pdf

[25] The GEANT Topology, 2004. [Online]. Available: http://www.dante.net/server/show/nav.007009007

[26] The Germany50 Topology, 2004. [Online]. Available: http://sndlib.zib.de/

[27] The Deltacom Topology, 2010. [Online]. Available: http://www.topology-zoo.org/maps/Deltacom.jpg/

[28] S. Agarwal, M. Kodialam, and T. Lakshman, "Traffic engineering in software defined networks," in Proc. INFOCOM, Turin, Italy, Apr. 2013, pp. 2211–2219.

[29] IEEE Standard for Local and Metropolitan Area Networks: Link Aggregation, IEEE Std. 802.1AX, Nov. 2008.

[30] OpenFlow Multipath Proposal. [Online]. Available: http://archive.openflow.org/wk/index.php/Multipath_Proposal

[31] M. Laor and L. Gendel, "The effect of packet reordering in a backbone link on application throughput," IEEE Netw., vol. 16, no. 5, pp. 28–36, Sep./Oct. 2002.

[32] Z. Cao, Z. Wang, and E. Zegura, "Performance of hashing-based schemes for Internet load balancing," in Proc. INFOCOM, Tel Aviv, Israel, 2000, vol. 1, pp. 332–341.

[33] The Abilene Topology and Traffic Matrices Dataset, 2004. [Online]. Available: http://www.cs.utexas.edu/~yzhang/research/AbileneTM/

[34] S. Orlowski, M. Pióro, A. Tomaszewski, and R. Wessäly, "SNDlib 1.0: Survivable network design library," in Proc. INOC, Spa, Belgium, Apr. 2007, pp. 1–6.

[35] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. Upper Saddle River, NJ, USA: Prentice-Hall, 1982.

[36] GNU Linear Programming Kit (GLPK). [Online]. Available: http://www.gnu.org/software/glpk/

[37] A. Tootoonchian and Y. Ganjali, "HyperFlow: A distributed control plane for OpenFlow," in Proc. INM/WREN, San Jose, CA, USA, 2010, pp. 3–9.

[38] D. Levin, A. Wundsam, B. Heller, N. Handigol, and A. Feldmann, "Logically centralized?: State distribution trade-offs in software defined networks," in Proc. HotSDN, Helsinki, Finland, 2012, pp. 1–6.

[39] M. Bari et al., "Dynamic controller provisioning in software defined networks," in Proc. CNSM, Zurich, Switzerland, Oct. 2013, pp. 18–25.

[40] Y.-N. Hu, W.-D. Wang, X.-Y. Gong, X.-R. Que, and S.-D. Cheng, "On the placement of controllers in software-defined networks," J. China Univ. Posts Telecommun., vol. 19, no. S2, pp. 92–97, Oct. 2012.

[41] N. Wang, K. Ho, G. Pavlou, and M. Howarth, "An overview of routing optimization for Internet traffic engineering," IEEE Commun. Surveys Tuts., vol. 10, no. 1, pp. 36–56, 2008.

[42] N. Wang, K. Ho, and G. Pavlou, "Adaptive multi-topology IGP based traffic engineering with near-optimal network performance," in Proc. NETWORKING, Lecture Notes in Computer Science, vol. 4982, A. Das, H. Pung, F. Lee, and L. Wong, Eds. Berlin/Heidelberg, Germany: Springer, 2008, pp. 654–666.

[43] M. Zhang, C. Yi, B. Liu, and B. Zhang, "GreenTE: Power-aware traffic engineering," in Proc. ICNP, Kyoto, Japan, 2010, pp. 21–30.

[44] S. Kandula, D. Katabi, B. Davie, and A. Charny, "Walking the tightrope: Responsive yet stable traffic engineering," in Proc. SIGCOMM, Philadelphia, PA, USA, Aug. 2005, vol. 35, pp. 253–264.

[45] F. Cuomo, A. Cianfrani, M. Polverini, and D. Mangione, "Network pruning for energy saving in the Internet," Comput. Netw., vol. 56, no. 10, pp. 2355–2367, Jul. 2012.

[46] V. Foteinos, K. Tsagkaris, P. Peloso, L. Ciavaglia, and P. Demestichas, "Operator-friendly traffic engineering in IP/MPLS core networks," IEEE Trans. Netw. Serv. Manag., vol. 11, no. 3, pp. 333–349, Sep. 2014.

[47] S. Fischer, N. Kammenhuber, and A. Feldmann, "REPLEX: Dynamic traffic engineering based on Wardrop routing policies," in Proc. CoNEXT, Lisboa, Portugal, 2006, pp. 1–12.


Daphne Tuncer received the Diplôme d'ingénieur de Télécom SudParis in 2009 and the Ph.D. degree in electronic and electrical engineering from University College London, U.K., in November 2013. She is a postdoctoral researcher in the Department of Electronic and Electrical Engineering at University College London, U.K. Her research interests are in the areas of software-defined networking, network self-management, adaptive network resource management, energy efficiency and cache/content management.

Marinos Charalambides received the B.Eng. degree (First Class Hons.) in electronic and electrical engineering, the M.Sc. degree (Distinction) in communications networks and software, and the Ph.D. degree in policy-based management, all from the University of Surrey, U.K., in 2001, 2002 and 2009, respectively. He is a senior researcher at University College London. He has been working in a number of European and U.K. national projects since 2005 and his current research interests include software-defined networking, in-network caching, energy-aware networking and online traffic engineering. He is on the technical program committees of the main network and service management conferences.

Stuart Clayman received the Ph.D. degree in computer science from University College London in 1994. He has worked as a Research Lecturer at Kingston University and at UCL. He is currently a Senior Research Fellow in the UCL EEE department and has co-authored over 30 conference and journal papers. His research interests and expertise lie in the areas of software engineering and programming paradigms; distributed systems; virtualised compute and network systems; network and systems management; networked media; and knowledge-based systems. He has been involved in several European research projects since 1994. He also has extensive experience in the commercial arena, undertaking architecture and development for software engineering, distributed systems and networking systems, and has run his own technology start-up in the area of NoSQL databases, sensors and digital media.

George Pavlou received the Diploma in engineering from the National Technical University of Athens, Greece, and the M.Sc. and Ph.D. degrees in computer science from University College London, U.K. He is a Professor of Communication Networks in the Department of Electronic and Electrical Engineering, University College London, U.K., where he coordinates research activities in networking and network management. His research interests focus on networking and network management, including aspects such as traffic engineering, quality of service management, autonomic networking, information-centric networking, grid networking and software-defined networks. He has been instrumental in a number of European and U.K. research projects that produced significant results with real-world uptake and has contributed to standardization activities in ISO, ITU-T and IETF. He has been on the editorial board of a number of key journals in these areas, he is the chief editor of the bi-annual IEEE Communications network and service management series, and in 2011 he received the Daniel Stokesbury award for "distinguished technical contribution to the growth of the network management field".