

Available online at www.sciencedirect.com

www.elsevier.com/locate/comcom

Computer Communications 30 (2007) 3085–3095

Spare capacity allocation and optimisation in a distributed GMPLS-based IP/WDM mesh network

D. Harle, S. Albarrak *

Department of Electrical and Electronic Engineering, University of Strathclyde, 204 George Street, Glasgow G1 1XW, UK

Available online 27 June 2007

Abstract

IP over wavelength-division multiplexing (WDM) networks have been identified as promising candidates to underpin future telecommunication networks and thus industrial and academic researchers have been actively investigating key issues that affect the integration of IP and optical layers. An area of prime concern is network survivability. The main challenges within this domain are: to provide survivability schemes under different failure scenarios and to reduce the amount of required spare capacity while providing the required quality of service (QoS). Due to its flexibility and simplicity, one candidate technique is pre-allocated restoration.

In this paper, the performance of pre-allocated restoration mechanisms is investigated considering a distributed GMPLS-based IP/WDM mesh network under single and dual-link failure scenarios. Several issues are considered. Firstly, two proposed spare capacity allocation schemes are investigated: link partitioning and lightpath partitioning. Secondly, the paper evaluates a retrial scheme that supports the utilisation of spare capacity. Thirdly, a proposed class prioritisation scheme applied to the differentiated survivability concept with pre-allocated restoration is investigated. Finally, two load-based spare capacity optimisation schemes are proposed and evaluated: local spare capacity optimisation (LSCO) and global spare capacity optimisation (GSCO).

© 2007 Published by Elsevier B.V.

Keywords: GMPLS; Survivability; Differentiated survivability; Pre-allocated restoration; Spare capacity allocation; Spare capacity optimisation; Dual-link failures

1. Introduction

In recent years, the research and industrial communities have increased their efforts into the development of optical network technologies, at both the physical and management layers, in order to progress beyond point-to-point transmission systems and proceed to all-optical networks. Consequently, the optical transport network (OTN) structure with its own data plane and control plane has been introduced. Generalised multi-protocol label switching (GMPLS) has been developed by the Internet Engineering Task Force (IETF) to form an intelligent control plane for the OTN. This control plane is able to establish and tear down lightpaths on demand and to provide quality of service (QoS) [1,2]. As a result, a two-layer approach, consisting of the IP layer built directly over the optical layer, has been identified as a promising candidate for the structure of future telecommunication networks. Such a structure has its own control and data plane. The control plane can be an overlay or peer model based on the GMPLS protocols, while the data plane consists of three overlaid topologies: link, lightpath, and label switching path (LSP) topology. The lightpath and LSP topologies exhibit a degree of similarity in terms of the provisioning procedure and providing an end-to-end path. They are, however, diverse in terms of granularity; lightpath granularity is coarser than LSP granularity. Moreover, an LSP may traverse more than one lightpath.

0140-3664/$ - see front matter © 2007 Published by Elsevier B.V.
doi:10.1016/j.comcom.2007.05.057

* Corresponding author. Tel.: +44 141 5482081; fax: +44 151 5524968. E-mail addresses: [email protected] (D. Harle), sbarrak@eee.strath.ac.uk (S. Albarrak).

One of the critical issues in the proposed two-layer approach is network survivability. There are many reasons for this. Firstly, OTNs support enormous bandwidth and, therefore, a single failure may have a significant impact upon user services. Secondly, the introduction of a two-layer approach eliminates the SONET/SDH layer, traditionally responsible for providing survivability in optical networks. In the two-layer structure, survivability can be provided at either one or both layers. Providing survivability at a lower layer is scalable and faster compared to the higher layer. By contrast, providing survivability at the higher layer achieves better resource utilisation and finer recovery granularity. Therefore, in order to provide more efficient survivability schemes in terms of resource utilisation and coping with different failure scenarios, it is necessary to move towards multilayer survivability [3–5]. The coordination between the IP and optical layer can be implemented using a hold-off timer or recovery token. The former is based upon a pre-configured delay to trigger the survivability schemes in the IP layer, while the latter triggers the survivability schemes in the IP layer the moment that the optical layer is unable to perform any additional recovery actions.

Survivability mechanisms can be broadly classified into protection, restoration, and pre-allocated restoration techniques. The term protection is used to describe schemes that are pre-planned for both spare capacity and backup routes. Restoration schemes plan both spare capacity and backup routes after failure occurrences. The pre-allocated restoration schemes use pre-planned spare capacity only.

Due to their ability to provide a guaranteed recovery connection, protection-based approaches are the dominant approaches used to provide network survivability. However, they are not scalable when different failure scenarios are considered. For instance, providing a protection scheme for dual-link failure requires almost triple the amount of spare capacity as compared to the single-link failure. By contrast, restoration techniques scale well under different failure scenarios, but connection recovery cannot be guaranteed in all circumstances. Therefore, it is essential to investigate alternative approaches that can efficiently withstand different failure scenarios while providing the required QoS. A candidate to meet this challenge is the pre-allocated restoration technique.

The pre-allocated restoration scheme, considered in this work, bridges the gap between the protection and restoration techniques. It is very simple in terms of implementation and operation. In this scheme, additional capacity is embedded in the network specifically for survivability purposes. This additional capacity is not visible to the routing algorithms under normal operation conditions (i.e. no-failure), meaning that there is no need for survivable routing calculations to be involved at that point [3]. Moreover, a pre-allocated restoration technique is more flexible in terms of resource utilisation and coping with various failure scenarios, whereby routing computation and resource allocation need only be involved once a failure has occurred.

With the pre-allocated restoration technique, there are several notable issues worth investigating; specifically spare capacity allocation and implementation issues. From the spare capacity perspective, it is important to consider the spare capacity allocation, optimisation or reconfiguration, and utilisation under failure and no-failure conditions. From the implementation perspective, it is essential to investigate the technique's effectiveness in terms of coping with various failure scenarios, providing QoS, and scalability. In order to implement the spare capacity allocation methods and optimisation mechanisms, several related issues must be addressed. With respect to the signalling protocol, it is essential to take into consideration the increased messaging complexity required and compatibility with the GMPLS signalling protocols. From a routing perspective, scalability can be examined by considering the routing information required to perform routing calculations and determine the shared spare capacity associated with IP and optical layers.

The paper is organized as follows. Section 2 presents the problem overview and related work. Section 3 presents a comprehensive overview of the pre-allocated restoration technique, focuses upon the spare capacity embedded in the network for survivability purposes, and outlines key requirements including capacity allocation and optimisation. Model implementation and system assumptions are considered in Section 4. Section 5 presents model performance and simulation results. Finally, the paper is concluded in Section 6.

2. Problem discussion and the related work

Providing appropriate survivability schemes for different failure scenarios is rapidly becoming a critical concern for service providers of backbone networks [4]. Currently, the most dominant scenario is a single-link failure. However, the dual-link failure scenario has become a key concern for both designers of survivable networks and service providers. The reasons are manifold; firstly, due to the physical topology constraints and long-distance fibre link installations, the occurrence of dual-link failures is now highly probable in large-scale networks. The second motivation arises from the improvement of optical layer functionality facilitated by a GMPLS-based distributed control plane rather than any centralised management unit. This improvement makes it possible, by adopting restoration mechanisms and differentiated survivability concepts, to provide survivable schemes that can significantly reduce the cost of dual-link failure recovery.

Previous studies have conducted work on pre-allocated restoration, dual-link failures, and multilayer survivability. Grover [3] introduces the pre-allocated restoration proposal known as the ‘protected work capacity envelope concept’. Further algorithms for computing the pre-allocated capacity in the MPLS layer are presented by Kodialam and Alicherry, respectively [5,6]. Dual-link failures have been investigated using the following three techniques: reconfiguration, two pre-planned backup paths, and re-routing [7]. Such investigations generally adopt a single-layer survivability approach to achieve a full dual-failure recovery guarantee without taking into account per-path dual-failure survivability. Various schemes of multilayer survivability have been proposed for inter-working functionality based on either an escalation or integrated model in terms of sharing the spare capacity and routing information [8–10].

In this paper, multilayer pre-allocated restoration performance is investigated considering a smart-edge simple-core GMPLS-based IP/WDM network based on the path-level recovery (end-to-end recovery) which provides better resource utilisation than link-level. Several issues are investigated including the spare capacity allocation problem, the retrial method to support effective use of available spare capacity and a differentiated survivability concept that supports quality of service and spare capacity optimisation.

3. Spare capacity overview

3.1. Spare capacity allocation

The two-layer approach data plane is an overlay model with three topologies: link, lightpath, and label switching path (LSP) topology. Therefore, it is important to provide efficient methods to allocate the spare capacity within each topology. A well-known method used within the LSP topology is called bypass or backup tunnels [3,5,6]. Here, a span tunnel is reserved to reroute the traffic under failure conditions. Adapting such an approach, this work proposes two methods to allocate spare capacity: lightpath partitioning and link partitioning.

3.1.1. Lightpath partitioning

In the ‘lightpath partitioning’ scheme, spare capacity is allocated within all active lightpaths whereby the total lightpath capacity is partitioned into two elements: working capacity and restoration capacity. From a routing calculation perspective, each lightpath is advertised by four parameters: total lightpath capacity (Lc), maximum restoration capacity (Rc), current working capacity (wc), and current restoration capacity (rc). The first two parameters are generally preset values while the last two must be updated by a GMPLS routing protocol such as open shortest path first (OSPF). Based on these parameters, the IP link weight (Lw), which is equal to (1/available lightpath capacity), can be calculated using formulas (1)–(3).

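(a) For the normal LSP requests:

$$
L_w =
\begin{cases}
\dfrac{1}{L_c - w_c - R_c} & \text{if } r_c \ge R_c \qquad (1) \\[2ex]
\dfrac{1}{L_c - (w_c + r_c)} & \text{if } r_c < R_c \qquad (2)
\end{cases}
$$

(b) For the restoration LSP requests:

$$
L_w = \frac{1}{L_c - (w_c + r_c)} \qquad (3)
$$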
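As an illustration of how a routing unit might evaluate formulas (1)–(3), the following minimal sketch transcribes them as printed. The function name, the `restoration` flag and the treatment of an exhausted lightpath as having infinite weight are assumptions made for the example and are not taken from the paper.

```python
import math


def lightpath_weight(Lc, Rc, wc, rc, restoration=False):
    """IP link weight Lw = 1 / (available lightpath capacity).

    Lc: total lightpath capacity, Rc: maximum restoration capacity,
    wc: current working capacity, rc: current restoration capacity.
    """
    if restoration:
        available = Lc - (wc + rc)   # formula (3), restoration LSP requests
    elif rc >= Rc:
        available = Lc - wc - Rc     # formula (1), normal LSP requests
    else:
        available = Lc - (wc + rc)   # formula (2), normal LSP requests
    # Assumption: a lightpath with no available capacity is given an
    # infinite weight so that a shortest-path computation avoids it.
    return math.inf if available <= 0 else 1.0 / available


# Hypothetical capacities in arbitrary units.
normal = lightpath_weight(Lc=10, Rc=2, wc=3, rc=2)
restore = lightpath_weight(Lc=10, Rc=2, wc=3, rc=1, restoration=True)
```

Because the weight is the reciprocal of the available capacity, a shortest-path computation over these weights tends to prefer lightly loaded lightpaths.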

3.1.2. Link partitioning

Using a ‘link partitioning’ scheme, spare capacity is allocated within all links such that the link wavelengths are partitioned into two components: working wavelengths and restoration wavelengths. From the routing calculation perspective, each link is advertised in terms of four parameters: maximum number of wavelengths (W), maximum restoration wavelengths (R), current working wavelengths (w), and current restoration wavelengths (r). Similar to the lightpath partitioning scheme, the first two parameters are generally preset values while the last two must be updated by a GMPLS routing protocol such as open shortest path first (OSPF). Based on these parameters, the link weight (Lw), which is equal to (1/available wavelengths), can be calculated using formulas (4)–(6).

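(a) For the normal LSP requests:

$$
L_w =
\begin{cases}
\dfrac{1}{W - (w + r)} & \text{if } r \ge R \qquad (4) \\[2ex]
\dfrac{1}{W - w - R} & \text{if } r < R \qquad (5)
\end{cases}
$$

(b) For the restoration LSP requests:

$$
L_w = \frac{1}{W - (w + r)} \qquad (6)
$$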
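A short worked example may help to show how the two request types see the same link differently. The values W = 8, R = 2, w = 3 and r = 1 are hypothetical, and the denominator of formula (5) is read here as W − w − R, which is an assumption made by analogy with formula (1). Since r < R, a normal request uses formula (5), while a restoration request uses formula (6):

$$
L_w^{\text{normal}} = \frac{1}{W - w - R} = \frac{1}{8 - 3 - 2} = \frac{1}{3},
\qquad
L_w^{\text{restoration}} = \frac{1}{W - (w + r)} = \frac{1}{8 - (3 + 1)} = \frac{1}{4}
$$

Under these assumptions the restoration request sees four available wavelengths and the normal request only three, because the unused restoration wavelength remains hidden from normal routing.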

3.2. Spare capacity optimisation

Regardless of the allocation method, providing a flexible reconfiguration mechanism to optimise the spare capacity is essential. Such a reconfiguration mechanism can be classified as either static or dynamic. In the former, static spare capacity is embedded in the network using an off-line calculation algorithm based on static traffic demand. In the latter, such capacity is calculated and adjusted on-line based on the current network conditions. The static technique is not suitable for the next generation network (NGN) case where traffic changes in a dynamic fashion. Hence, this work describes dynamic spare capacity reconfiguration. Two load-based spare capacity optimisation schemes are proposed in this paper: local spare capacity optimisation (LSCO) and global spare capacity optimisation (GSCO).

3.2.1. Local spare capacity optimisation

The term ‘‘local’’ refers to the fact that only local port capacity information is considered in the optimisation process. Each node is autonomously responsible for adjusting its port capacity based on the amount of generated traffic within each port. The key motivation is to ensure that traffic generated within any port can be rerouted through other ports. This constraint is presented in Eq. (7) and illustrated in Fig. 1.

$$
WC_i \le \sum_{j=1}^{n} SC_j, \qquad i \ne j \qquad (7)
$$

• WCi represents the generated traffic in port i.
• SCj represents the spare capacity in port j.
• n indicates the number of ports.

[Fig. 1. An example of the local spare capacity optimisation (CL: local generated working capacity. WCi: the portion of the generated working capacity which passes through port (i). SCi: the reserved capacity in the port (i)). Values shown for the local node: CL = 8; WC1 = 0, SC1 = 2; WC2 = 2, SC2 = 1; WC3 = 5, SC3 = 0; WC4 = 1, SC4 = 2.]
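The constraint in Eq. (7) can be checked port by port. The sketch below is a minimal illustration, assuming the port values that appear in Fig. 1 (WC = 0, 2, 5, 1 and SC = 2, 1, 0, 2); the function name `lsco_violations` is hypothetical.

```python
def lsco_violations(wc, sc):
    """Return the 1-based ports i for which Eq. (7) does not hold,
    i.e. WC_i exceeds the total spare capacity of all other ports."""
    total_spare = sum(sc)
    return [
        i
        for i, working in enumerate(wc, start=1)
        if working > total_spare - sc[i - 1]  # sum of SC_j over j != i
    ]


# Port values as they appear in Fig. 1.
wc = [0, 2, 5, 1]
sc = [2, 1, 0, 2]
violations = lsco_violations(wc, sc)
```

With these values every port satisfies the constraint; port 3 is the tight case, since its working capacity of 5 equals the spare capacity available in the other three ports.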

The information required to meet the constraint is updated locally when any newly generated connection is established or rejected. Hence, there is no need to extend the routing and signalling protocols. The LSCO scheme can be triggered locally either per-connection or when generated traffic changes significantly. The obtained values indicate the level of spare capacity within node ports that will force the node to converge to the new values. Such convergence is possible since some connections are rejected. On the other hand, it is possible to configure the LSCO scheme to monitor specific classes of service rather than monitoring all the traffic generated.

3.2.2. Global spare capacity optimisation

In the proposed global scheme, it is assumed that the optimisation is achieved by a centralised link-failure agent. The agent works on the principle that the spare capacity optimisation need only occur at a lower rate than per-connection reconfiguration. Therefore, the optimisation agent can be triggered periodically or by a node whose link capacities experience a prescribed and significant change.

The key principle is that the optimisation agent maintains a small database which describes the existing spare capacity and the total load for each pair passing through the corresponding link. Based on this information, the agent emulates some link-failure scenarios and investigates the level of spare capacity in each link. Consequently, the spare capacity in the network can be reconfigured. It is assumed that the agent operates in the background and therefore no service interruption occurs.

[Fig. 2. Global spare capacity optimisation operation. The figure shows an example network with numbered nodes and links, the GSCO agent database (columns: link id, node 1, node 2, total load; maximum number of records = (number of nodes − 1) × number of links), and two annotations: each node periodically sends an updated message, and the agent periodically sends an updated message to each node.]
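The paper does not specify how the agent emulates link failures, so the following is only a sketch under stated assumptions: single-link failures are emulated one at a time, the database is a mapping `pair_load[link][pair]`, and a hypothetical table `backup_route[(pair, failed_link)]` lists the links that the rerouted load would traverse. The spare capacity suggested for a link is then the largest load it would have to absorb under any single emulated failure.

```python
from collections import defaultdict


def required_spare(pair_load, backup_route):
    """Emulate each single-link failure and return, per link, the largest
    rerouted load that the link would need to carry.

    pair_load[link][pair]             -> load of that node pair on the link
    backup_route[(pair, failed_link)] -> links used after the failure
    (both structures are assumptions made for this illustration)
    """
    required = defaultdict(float)
    for failed_link, loads in pair_load.items():
        rerouted = defaultdict(float)
        for pair, load in loads.items():
            for link in backup_route[(pair, failed_link)]:
                rerouted[link] += load
        # Failures are not simultaneous here, so keep the worst case per link.
        for link, load in rerouted.items():
            required[link] = max(required[link], load)
    return dict(required)


# Hypothetical three-link example with loads in Gb/s.
pair_load = {
    1: {("A", "B"): 2.0, ("A", "C"): 1.0},
    2: {("A", "C"): 1.5},
}
backup_route = {
    (("A", "B"), 1): [2, 3],
    (("A", "C"), 1): [3],
    (("A", "C"), 2): [1, 3],
}
spare = required_spare(pair_load, backup_route)
```

Taking the maximum over failure scenarios, rather than the sum, reflects the sharing of spare capacity between failures that do not occur at the same time; dual-link scenarios would require emulating pairs of failed links.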

Fig. 2 shows how such a scheme would operate. The signalling protocol is responsible for exchanging the information between the centralised link-failure agent and network nodes. When a node receives an updating message, it forces its routing protocol to converge to the new values. Such convergence is possible since connections are rejected. On the other hand, similar to the LSCO scheme, GSCO can be configured to monitor specific classes of service rather than monitoring all the traffic generated.

3.3. Spare capacity utilisation

There is no doubt that network capacity is at a premium in any network infrastructure. Thus, any additional capacity reserved for survivability purposes could be considered to be wasted should no failure occur. Therefore, network service providers can capitalise on this circumstance by accepting low priority traffic to utilise the additional capacity. However, in a QoS-enabled IP/WDM network, it is important to provide available and reliable services to the end users, especially for the high priority class users since a large portion of revenue comes from users of this class. Preemption techniques can be viewed as one way to achieve such an objective. Any preemption scheme must have two key elements: an admission control policy which decides which connections are to be dropped when resource scarcity is experienced, and a recovery process that is activated as a result of failure occurrences in the network. The subject of preemption and spare capacity is an ongoing issue but is not the subject of this work; the discussion is simply included here for completeness.

4. Model implementation

The experimental work is based upon the OMNeT++ (Objective Modular Network Testbed in C++) discrete-event simulation platform. OMNeT++ supports hierarchically nested modules with flexible module parameters and, therefore, provides an ideal platform to support modelling of distributed mesh topologies.

[Fig. 3. Network topology and node structure. Labels in the node structure panel: GMPLS-based OXC control plane, GMPLS-based router control plane, LSPs switching, port switching.]

4.1. Model structure

The network topology adopted in this work is the NSFnet network topology as shown in Fig. 3a. It consists of a set of nodes connected by a set of paired fibre links. Dedicated channels in each link comprise the control plane topology over which control messages are exchanged, independently of any data plane topology. Internally, each node consists of an edge router connected to an optical cross-connect (OXC) as illustrated in Fig. 3b. The router and OXC could be placed in separate locations or they could be combined to form a single node with a common control plane.

The network structure can therefore be viewed from two perspectives: a control plane and a data plane. The data plane must be an overlay model with three topologies: link, lightpath, and label switching path (LSP) topology. The control planes, in both the edge routers and OXCs, consist of three components: the signalling, the routing, and the recovery units. The functionalities of the signalling and routing units are implemented using standard GMPLS protocols as described in the Internet drafts [11,12]. The work in this paper focuses on the implementation of the recovery unit functions paying due reference to the other protocols. In order to efficiently implement their functionality, the node units require specific information and three data tables are maintained in each node:

• Wavelength routing table: contains the information that describes the status of wavelengths at each port.

• Lightpaths information table: maintains the information about all lightpaths generated/terminated at the corresponding node.

• Network physical topology table: contains the information about the entire network link connectivity.

Similar to the OXCs, routers also require specific information to underpin their functionality. Three tables are described:

• Forwarding table: contains the forwarding information including in_port, in_label, out_port, and out_label.

• Logical topology: contains the information regarding the existing lightpaths including the source, destination, bandwidth information, and type of lightpaths. This information supports the routing unit by calculating the appropriate route for a new request.

• LSP information table: maintains the information about all LSPs generated from, or terminated on, the corresponding router.

The proposed network model considers three delay components: the propagation delay, the transmission delay, and the nodal process delay. The propagation delay represents the delay for the first bit to propagate from a source to a destination and is a function of the link propagation speed and the link length. Transmission delay represents the time needed to pump data onto a link and is calculated as a function of the link capacity and the message size. The nodal process delay describes the interval between the node receiving a message through the input port and that message being sent to the output port, and includes the times taken to examine a message, to calculate a new route and to perform wavelength switching.
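The three components combine additively for a message crossing one link and one node. The sketch below illustrates this under the usual definitions (propagation delay as length divided by speed, transmission delay as message size divided by capacity); the function name, parameter names and numerical values are assumptions chosen for the example, not values from the paper.

```python
def message_delay(link_length_m, propagation_speed_mps,
                  message_bits, link_capacity_bps, nodal_delay_s):
    """Per-hop delay = propagation + transmission + nodal process delay."""
    propagation = link_length_m / propagation_speed_mps
    transmission = message_bits / link_capacity_bps
    return propagation + transmission + nodal_delay_s


# Hypothetical hop: 200 km of fibre at 2e8 m/s, a 1000-byte control
# message on a 1 Gb/s control channel, and 1 ms of nodal processing.
delay = message_delay(200_000.0, 2.0e8, 8_000, 1.0e9, 1.0e-3)
```

For control messages of this size the propagation and nodal terms dominate, which is why signalling delay in wide-area meshes is largely a function of route length and hop count.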

4.2. No-failure condition assumptions

At the IP layer, LSP connections are requested and terminated randomly with requests arriving according to a Poisson process. The LSP parameters include the source, the destination, the capacity, and the service class, selected randomly based on a uniform distribution. Two classes of services are provided: protected and unprotected connections. From the routing calculation perspective, the proposed model adopts the source explicit routing concept and the n-step constraint-based shortest-path-first algorithm. The former considers the provision of an explicit route at the source nodes; therefore, this route cannot be modified during the signalling phase. The latter algorithm provides an efficient means to compute the working and backup paths for any connection. During the first stage, the working path is computed. The second stage computes the first backup path considering the shared risk link group (SRLG) condition of the working path. Next, stage 2 is repeated n−2 times in order to compute the remaining n−2 backup paths. For each iteration of stage 2, the SRLG associated with each previously computed backup path is considered in the computation of the new backup path. It is assumed that shared capacity protection is implemented using the partial information strategy explained in detail previously [13].

[Fig. 4. Notify messages delivery from the optical to IP layer. Labels in the figure: LSP1 carried over lightpaths LP1, LP2 and LP3; LSP source, LSP upstream, LSP downstream, LSP destination; LP2 source, LP2 upstream, LP2 downstream, LP2 destination; edge router, OXC.]
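The staged path computation described above can be sketched as repeated shortest-path searches in which links sharing an SRLG with any previously computed path are excluded. This is an illustrative reading of the "SRLG condition" rather than the authors' implementation; the graph representation, the function names and the omission of capacity constraints are assumptions made for brevity.

```python
import heapq


def shortest_path(graph, src, dst, banned):
    """Dijkstra over graph[u] = {v: (weight, srlg_id)}, skipping every
    link whose SRLG identifier is in the banned set."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, (weight, srlg) in graph[u].items():
            if srlg in banned:
                continue
            nd = d + weight
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return None


def n_step_paths(graph, src, dst, n):
    """Stage 1 computes the working path; each later stage computes a
    backup path that avoids the SRLGs of all earlier paths."""
    paths, banned = [], set()
    for _ in range(n):
        path = shortest_path(graph, src, dst, banned)
        if path is None:
            break  # no further SRLG-disjoint path exists
        paths.append(path)
        banned |= {graph[u][v][1] for u, v in zip(path, path[1:])}
    return paths


def make_graph(links):
    """Build an undirected graph from (u, v, weight, srlg_id) tuples."""
    graph = {}
    for u, v, weight, srlg in links:
        graph.setdefault(u, {})[v] = (weight, srlg)
        graph.setdefault(v, {})[u] = (weight, srlg)
    return graph


# Hypothetical four-node ring in which every link is its own SRLG.
ring = make_graph([("A", "B", 1, "g1"), ("B", "D", 1, "g2"),
                   ("A", "C", 2, "g3"), ("C", "D", 2, "g4")])
paths = n_step_paths(ring, "A", "D", 3)
```

In this ring only two SRLG-disjoint paths exist between A and D, so a request for three paths returns the working path and a single backup.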

The IP routing unit determines the explicit route based on the amount of available capacity in each lightpath. Routers request a new lightpath based on the lightpath-create-first policy [14]. Based on such a policy, edge routers search for a direct lightpath within the existing lightpath topology. If there is no lightpath available, the edge router requests a new lightpath from its associated OXC to accommodate the new LSP requests. At each lightpath setup failure, the routers attempt to find a route within the existing lightpaths. Therefore, an LSP could traverse multiple lightpaths between source and destination. The request will be blocked if there are no available resources along its route. It is assumed that no repeat behaviour is considered. Lightpaths are terminated if they are not being traversed by LSPs.
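The decision order of the lightpath-create-first policy, as described above, can be summarised in a few lines. In this sketch the lightpath records and the two callbacks (`request_lightpath`, standing for the request to the associated OXC, and `route_over_existing`, standing for the multi-hop search over existing lightpaths) are hypothetical placeholders; capacity reservation itself is not shown.

```python
def admit_lsp(src, dst, demand, lightpaths, request_lightpath, route_over_existing):
    """Return the list of lightpaths chosen for the LSP, or None if blocked."""
    # 1. Search for a direct lightpath with enough free capacity.
    for lp in lightpaths:
        if lp["src"] == src and lp["dst"] == dst and lp["free"] >= demand:
            return [lp]
    # 2. Otherwise ask the associated OXC for a new direct lightpath.
    new_lp = request_lightpath(src, dst)
    if new_lp is not None:
        lightpaths.append(new_lp)
        return [new_lp]
    # 3. On setup failure, fall back to a route over existing lightpaths;
    #    a result of None means the request is blocked.
    return route_over_existing(src, dst, demand)


# Hypothetical logical topology with a single lightpath A -> B.
topology = [{"src": "A", "dst": "B", "free": 5.0}]
route = admit_lsp("A", "B", 3.0, topology,
                  lambda s, d: None, lambda s, d, c: None)
```

Step 3 is what allows an LSP to traverse several lightpaths between its source and destination.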

At the optical layer, the routing units, based on the number of free wavelengths in each link, determine the shortest path. The first-fit wavelength assignment strategy is considered and full wavelength conversion is assumed. Lightpath provisioning employs the destination-initiated reservation (DIR) method using the GMPLS signalling protocol. Based on the DIR method, a lightpath request is forwarded from the source to the destination, collecting resource information as it progresses. The destination then selects the appropriate label (wavelength) and returns a reservation request back to the source; all intermediate nodes, including the source, attempt to find and reserve the required resources. The lightpath request will be blocked if there are insufficient resources along its route.
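A simplified view of DIR with first-fit assignment is sketched below. Because full wavelength conversion is assumed, the wavelength can be chosen independently on each link of the route. The data structure `free_wavelengths` and the function name are assumptions, and contention between concurrent requests (a wavelength being taken between the forward and backward passes) is deliberately not modelled.

```python
def dir_reserve(route_links, free_wavelengths):
    """Reserve one wavelength per link along the route, or return None.

    free_wavelengths[link] is the set of free wavelength indices on that link.
    """
    # Forward pass: the request collects the free wavelengths on each link.
    collected = [(link, sorted(free_wavelengths[link])) for link in route_links]
    if any(not free for _, free in collected):
        return None  # insufficient resources along the route: blocked
    # Destination: first-fit selects the lowest-indexed free wavelength.
    choice = {link: free[0] for link, free in collected}
    # Backward pass: reserve from the destination back towards the source.
    for link in reversed(route_links):
        free_wavelengths[link].discard(choice[link])
    return choice


# Hypothetical two-link route.
free = {"l1": {2, 0, 3}, "l2": {1, 2}}
assignment = dir_reserve(["l1", "l2"], free)
```

Without wavelength conversion the destination would instead have to pick a single wavelength that is free on every link of the route.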

4.3. Failure condition assumptions

Failures are generated randomly. The inter-arrival time and holding time of failures are generated based on an exponential distribution. Link failures are determined from a uniform distribution such that all links are equally likely to fail. The dual-link failure scenario considered is the random simultaneous failure of two links. The path-level recovery (end-to-end recovery) is applied as this provides better resource utilisation when compared to link-level recovery. Additional capacity is reserved for survivability purposes.
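The failure model can be reproduced with a few lines of standard-library code. The sketch below is an assumption-laden illustration: the function name, the parameter names, the mean values and the link identifiers are hypothetical, and a seeded generator is used only to make runs repeatable.

```python
import random


def generate_failures(links, mean_interarrival, mean_holding, horizon,
                      dual=False, seed=0):
    """Return a list of (start_time, holding_time, failed_links) events.

    Inter-arrival and holding times are exponentially distributed; the
    failed link (or the two simultaneously failed links when dual=True)
    is drawn uniformly so that all links are equally likely to fail.
    """
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        t += rng.expovariate(1.0 / mean_interarrival)
        if t >= horizon:
            return events
        failed = rng.sample(links, 2 if dual else 1)  # distinct links
        events.append((t, rng.expovariate(1.0 / mean_holding), failed))


# Hypothetical link identifiers and time units.
events = generate_failures(list(range(1, 22)), 100.0, 10.0, 10_000.0,
                           dual=True, seed=1)
```

Sampling without replacement guarantees that a dual-link event always involves two different links.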

The recovery procedure can be classified into three key processes: fault notification, failed connection teardown, and recovery.

• Notification process: the notification process starts at the upstream node, which is required to send a notify message to the source node. Multiple failed connections that belong to the same source node can be aggregated in a single notify message. The optical layer is responsible for notifying the IP layer about any unrecovered or unprotected lightpaths. The cooperation between layers is shown in Fig. 4.

• Teardown process: the teardown process starts at both the upstream and downstream nodes. The upstream node is responsible for tearing down the upstream segment, while the downstream node is in charge of the downstream segment teardown.

• Recovery process: the recovery process starts at the source node of any failed connection and depends on the recovery scheme associated with each class of service.

  - Protected connection: the recovery process requires only switchover synchronisation signalling between the source and destination of the failed connection.

  - Unprotected connection: the recovery process requires the provisioning of an alternative connection. The admission control at the source node therefore uses the existing lightpaths, including the pre-allocated spare capacity. The pre-allocated spare capacity is reserved using either the link partitioning or the lightpath partitioning method.
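The aggregation step of the notification process listed above can be sketched as a simple grouping by source node. The function name and the `(connection id, source node)` input format are illustrative assumptions; the actual notify message format is defined by the GMPLS signalling protocol and is not modelled here.

```python
from collections import defaultdict

def aggregate_notifications(failed_connections):
    """Group failed connections so each source node gets one notify message.

    failed_connections: iterable of (connection_id, source_node) pairs seen
    by the node upstream of the failure.
    Returns dict source_node -> list of connection ids, in arrival order.
    """
    by_source = defaultdict(list)
    for connection_id, source in failed_connections:
        by_source[source].append(connection_id)
    return dict(by_source)
```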

One of the critical problems in path-level pre-allocated restoration is contention between messages, which arises when multiple provisioning processes begin simultaneously. The suggested solution is to apply a retrial method: when the connection recovery process fails, the restoration process is repeated. The performance of any retrial method is affected by the number of retrials (allowed or required) and the time delay between retrial events.
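A minimal sketch of the retrial loop is given below, assuming a fixed delay between attempts. The name `restore_with_retrials`, the `attempt` callback and the millisecond unit are hypothetical choices made for illustration; the sketch only accumulates the waiting time added by retrials and does not model signalling delay.

```python
def restore_with_retrials(attempt, max_retrials, retrial_delay_ms):
    """Try to restore a connection, repeating the attempt after a failure.

    attempt: callable returning True if the restoration attempt succeeds.
    Returns (restored, added_delay_ms, attempts_made).
    """
    added_delay = 0.0
    for n in range(max_retrials + 1):   # first attempt plus the retrials
        if attempt():
            return True, added_delay, n + 1
        if n < max_retrials:
            added_delay += retrial_delay_ms  # wait before the next retrial
    return False, added_delay, max_retrials + 1
```

The two parameters correspond to the two factors named above: the number of retrials bounds how often contention can be overcome, while the delay between retrials adds directly to the restoration time.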

The second critical issue with path-level pre-allocated restoration is the provision of quality of recovery (QoR). The proposal in this paper is to apply a recovery class prioritisation method. With this method, each class invokes its appropriate recovery method during a preset time interval. It is assumed that the method operates independently at each node. The retrial method can be applied within any time interval to enhance the performance of the corresponding class. The length of the discrete interval between classes directly affects the restoration time for each class.
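One way to picture the class prioritisation method is as a per-node schedule in which each class starts its recovery one interval after the class above it. The sketch below assumes exactly that (slot k starts at k times the interval) and uses the EF, AF and BE class names from the third experiment; the function name and priority table are illustrative.

```python
# Assumed priority order: lower number recovers first.
CLASS_PRIORITY = {"EF": 0, "AF": 1, "BE": 2}

def recovery_schedule(failed, interval_ms):
    """Return (start_ms, connection_id) pairs sorted by recovery start time.

    failed: list of (connection_id, service_class) pairs at one source node.
    interval_ms: preset time interval between consecutive classes.
    """
    schedule = [(CLASS_PRIORITY[cls] * interval_ms, conn)
                for conn, cls in failed]
    return sorted(schedule)
```

Under this assumption a lower-priority class waits at least one interval per class above it before its recovery begins, which is why the interval length feeds directly into the restoration time of each class.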

5. Simulation results

This section presents results for a range of simulation-based experiments. The performance metrics of interest are the restoration rate, average restoration time, spare capacity ratio, and blocking probability. The restoration rate is the ratio of the number of restored connections to the number of failed connections in the network. The average restoration time is the ratio of the total restoration time of restored connections to the number of restored connections. The spare capacity ratio is the ratio of the reserved capacity to the working capacity across all active lightpaths. The blocking probability is the ratio of the number of rejected connections to the number of requested connections in the network. The offered load is the traffic load expressed in Erlangs. It is assumed that all links have the same number of wavelengths (8 wavelengths), each with 10 Gb/s capacity. The link length is defined as the distance between nodes, as indicated in previous work [10]. The nodal processing delay is 1 ms. It is assumed that the LSP capacity varies over a continuous range between 1 Mb/s and 2.5 Gb/s.

The mean failure inter-arrival time is five time units (a time unit is equal to 50 s) and the mean repair time is a single time unit. These particular values were adopted primarily for experimental expediency, in order to set limits for the experimental work. It is recognised that the numerical values are lower than, or otherwise different from, those experienced in practical networks; however, the results attained represent reasonable limits given the need to generate multiple failure events.
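The four performance metrics defined in this section are simple ratios of counters collected during a run. The following sketch computes them from a dictionary of counters; the key names and the `summarise` helper are assumptions for illustration.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is zero."""
    return numerator / denominator if denominator else None

def summarise(stats):
    """Compute the four metrics from raw simulation counters."""
    return {
        # restored connections over failed connections
        "restoration_rate": ratio(stats["restored"], stats["failed"]),
        # total restoration time over restored connections
        "avg_restoration_time": ratio(stats["total_restoration_time"],
                                      stats["restored"]),
        # reserved capacity over working capacity on active lightpaths
        "spare_capacity_ratio": ratio(stats["reserved_capacity"],
                                      stats["working_capacity"]),
        # rejected connections over requested connections
        "blocking_probability": ratio(stats["rejected"], stats["requested"]),
    }
```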

Fig. 5. Performance comparison between different survivability techniques. (a) Blocking probability and (b) spare capacity. (Techniques shown: dedicated 1:2, shared 1:2, dedicated 1:1, shared 1:1, and restoration.)

The first experiment investigates the performance of the restoration and protection schemes under single- and dual-link failures. The aim of this experiment is to evaluate the required spare capacity and the blocking probability. Fig. 5 presents a performance comparison between different survivability techniques: dedicated protection, shared protection, and restoration. The techniques are investigated one by one at the same network load (400 Erlangs). Each performance value is an average over 10 different simulation runs. As expected, restoration provides the best resource utilisation in terms of blocking probability. Moreover, the figure shows clearly that dual-link failure protection is inefficient in terms of required spare capacity and blocking ratio. The high blocking ratio arises for a number of reasons: lack of resources; routing combinations in which it is not simple to find three disjoint paths between a particular node pair; and message contention, in particular within a distributed network. Therefore, it is suggested that the protection technique be applied to specific connections that require a certain QoS, rather than adopted wholesale to provide survivability for the entire network.

5.1. Spare capacity allocation performance

The second experiment considers the pre-allocated restoration technique and, in particular, the spare capacity allocation methods. Figs. 6 and 7 present the restoration rate for the lightpath partitioning and link partitioning methods under single- and dual-link failures, with and without the retrial method (two retrials). All values are recorded at the same network load (400 Erlangs). The experimental results show that the restoration rate improves as the amount of pre-reserved capacity increases, an outcome that one would expect but which does need stating. Additionally, this experiment demonstrates clearly the effect of contention in the GMPLS-based distributed network model: for both methods, the restoration rate for dual-link failures with retrials exceeds the restoration rate for single-link failures without retrials. Moreover, both methods achieve full single-failure recovery with a relatively lower amount of spare capacity when the retrial method is applied. For example, using the retrial method with the lightpath partitioning method reduces the amount of spare capacity from 3.5 Gb/s to 2.5 Gb/s. On the other hand, the results show that, in order to achieve full dual-link failure recovery, the amount of spare capacity embedded in the network should be in the range of 40–50% of the total network capacity (3.5 Gb/s with the lightpath partitioning method and 4 wavelengths with the link partitioning method). Therefore, in order to reduce the amount of spare capacity, it is strongly recommended that the differentiated survivability concept be considered to provide per-path dual-link failure recovery.

Fig. 6. Lightpath partitioning method performance with various pre-allocated capacity values for different failure scenarios with and without retrial. (Axes: restoration rate against pre-allocated capacity in Gb/s.)

Fig. 7. Link partitioning method performance with various pre-allocated wavelengths for different failure scenarios with and without retrial. (Axes: restoration rate against number of pre-allocated wavelengths.)

Fig. 8a and b present a performance comparison between multilayer restoration and multilayer pre-allocated restoration when the two methods for allocating the spare capacity are applied. All performance values are recorded for the same degree of pre-reserved capacity: 25% of the working capacity in the network (2.5 Gb/s for the lightpath partitioning method and 2 wavelengths for the link partitioning method). Moreover, this experiment considers the dual-link failure scenario. The experimental results show that, by comparison with multilayer restoration, both the restoration rate and the restoration time improve when the two methods are applied. Additionally, the figures show a trade-off between the performance parameters of the two schemes: while lightpath partitioning achieves a better restoration rate, link partitioning achieves a better restoration time. The reason for this trade-off is that link partitioning facilitates the recovery process in the optical domain, while lightpath partitioning supports IP layer restoration. This paper investigates the performance of the two schemes when each is applied on its own; the combination of the two will be addressed in future work.

Fig. 8. Restoration rate and restoration time comparison for multilayer restoration and the two spare capacity allocation methods. (Axes: restoration rate and restoration time in ms against offered load in Erlangs.)

5.2. Differentiated survivability performance

In the third experiment, the differentiated survivability concept is investigated; in order to provide QoR in the pre-allocated restoration technique, the recovery class prioritisation method is applied. The traffic routed in the network comprises 20% class EF (Expedited Forwarding), 30% class AF (Assured Forwarding), and 50% class BE (Best Effort). The differentiation between classes follows the same categories used for differentiated services (DiffServ) traffic, with each class having a different priority [15]. Fig. 9a and b illustrate the restoration rate and restoration time for the three preset connection classes, each generated following a simple uniform distribution. All values are recorded at the same network load (400 Erlangs) under the dual-link failure scenario. The preset time interval between classes is 50 ms. The experimental results show that both the EF and AF classes achieve better performance in terms of restoration rate than the case where classes of service are not supported. Additionally, the performance of class AF is improved in comparison with class BE. The results clearly demonstrate the importance of applying differentiated survivability, in particular with dual-link failures. For example, class EF requires 1.5 Gb/s of spare capacity (15%) to support full dual-link failure recovery. Moreover, this method is efficient in terms of scalability and simplicity; it operates independently at each node and can be implemented simply under the GMPLS protocols.

Fig. 9. Performance comparisons for different classes of service using the lightpath partitioning method. (Axes: restoration time in ms and restoration rate against pre-allocated capacity in Gb/s; series: without class of service, class EF, class AF, class BE.)
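As a quick consistency check on the percentages quoted in Sections 5.1 and 5.2, and assuming that each wavelength (and hence each lightpath) carries 10 Gb/s and that each link has 8 wavelengths, the absolute and relative spare capacity values line up as follows:

```latex
\[
\frac{1.5\ \text{Gb/s}}{10\ \text{Gb/s}} = 15\%, \qquad
\frac{2.5\ \text{Gb/s}}{10\ \text{Gb/s}} = 25\%, \qquad
\frac{2\ \text{wavelengths}}{8\ \text{wavelengths}} = 25\%.
\]
```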

This paper considers the lightpath partitioning method to provide QoR. In order to provide QoR based on the link partitioning method, it is necessary to provide an appropriate class of service at both layers and to design efficient service mapping mechanisms. The reason is that link partitioning carries out most of its recovery process at the optical layer, while the class of service is assigned at the IP layer. The authors will consider this issue in future work.

5.3. Spare capacity optimisation schemes performance

The final experiment investigates the performance of the proposed optimisation schemes. It is assumed that the spare capacity is allocated based on the lightpath partitioning method with the retrial method applied. Figs. 10 and 11 present the restoration rate and spare capacity ratio of the restoration technique and the pre-allocated restoration technique under single-link failure. The optimisation schemes considered in this experiment are static optimisation (with two preset values, 1 and 2.5 Gb/s), LSCO and GSCO.

Fig. 10. Restoration ratio comparison for restoration and pre-allocated restoration with the optimisation methods. (Axes: restoration rate against offered load in Erlangs; series: restoration, static optimisation with 1 and 2.5 Gb/s, LSCO, GSCO, GSCO with 4 retrials.)

The experimental results show that the restoration rate improves significantly when the optimisation schemes are applied. The improvement is particularly evident at medium and high load values. Additionally, the figures show a trade-off between the amount of spare capacity embedded in the network and the restoration ratio: while the restoration ratio improves under the optimisation schemes, the amount of spare capacity required increases significantly. Moreover, the experimental results demonstrate clearly that the GSCO scheme achieves better performance than LSCO. The reason is that GSCO uses not only the local traffic but also the traffic generated between all node pairs in the network.

The GSCO performance is investigated under different numbers of retrials (2 and 4). The experimental results show that the restoration rate improves significantly when the retrial method is applied. For example, GSCO with four retrials achieves a restoration rate similar to that of static optimisation with 2.5 Gb/s. Moreover, GSCO requires less spare capacity than static optimisation with 2.5 Gb/s. Note that the application of a retrial method has no effect on the spare capacity ratio.

Fig. 11. Spare capacity ratio comparison for pre-allocated restoration with the optimisation methods. (Axes: spare capacity ratio against offered load in Erlangs; series: static optimisation with 1 and 2.5 Gb/s, LSCO, GSCO.)

Fig. 12. Blocking probability comparison for restoration and pre-allocated restoration with the two optimisation methods. (Axes: blocking probability against offered load in Erlangs; series: restoration, LSCO, GSCO.)

In Fig. 11, which shows the relative spare capacity ratio, it can be seen that the spare capacity ratio decreases as the load increases, in particular for low and medium load values. The reason is that the spare capacity ratio depends on the number of existing lightpaths and, as the load increases, the number of existing lightpaths also increases.

In addition to the performance under failure conditions, this work also considers the effect that deploying an optimisation scheme has on normal network operation. Fig. 12 presents the blocking probability when the optimisation schemes are applied under single-link failures. The blocking probability considers only normal LSP requests and does not include restoration LSP requests. The results show that the blocking probability increases when the two schemes are applied. This result is expected, because a specific amount of the network capacity is hidden from the routing unit under no-failure conditions. The figure also shows that the LSCO scheme achieves a slightly lower blocking probability than GSCO, essentially because the amount of reserved capacity required by the LSCO scheme is lower.
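The effect of hiding reserved capacity from the routing unit can be illustrated with a small admission check on a single lightpath. This is a sketch under stated assumptions, not the paper's admission control: the function name `admit` and the dictionary fields are hypothetical, and the reserved amount is treated as a fixed per-lightpath value that only restoration LSPs may use.

```python
def admit(lightpath, demand, is_restoration):
    """Admit an LSP on one lightpath if enough capacity is visible to it.

    lightpath: dict with "capacity", "used" and "reserved" in Gb/s.
    Normal requests see only the capacity outside the reserved share;
    restoration requests may also use the reserved (spare) share.
    """
    free = lightpath["capacity"] - lightpath["used"]
    visible = free if is_restoration else free - lightpath["reserved"]
    if demand <= visible:
        lightpath["used"] += demand
        return True
    return False
```

With a larger reserved share, fewer normal requests pass the check, which is consistent with the higher blocking probability observed for the scheme that reserves more capacity.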

6. Conclusions

The paper presents a comprehensive overview of a proposed pre-allocated restoration technique and its performance. By outlining key requirements such as capacity allocation and optimisation, the paper has focused upon the use of spare capacity embedded within the network to meet survivability needs. To facilitate the investigation, a distributed model of a GMPLS-based IP/WDM network was implemented.

Two methods to allocate the spare capacity at the optical and IP layers were proposed: lightpath partitioning and link partitioning. The simulation results show that the model performance, in terms of restoration rate and restoration time, improves significantly when the proposed methods are applied. Moreover, this paper investigated the contention problem that arises when multiple provisioning processes begin simultaneously. The suggested solution was to apply a retrial method. The experimental results show that the restoration rate improves when the retrial method is applied.

Furthermore, this paper proposed the class prioritisation method to apply the differentiated survivability concept with pre-allocated restoration. The experiments show that this method is efficient in terms of scalability and simplicity; it operates independently at each node and is simple to implement under the GMPLS protocols.

Two load-based spare capacity optimisation schemes were proposed in this paper: local spare capacity optimisation (LSCO) and global spare capacity optimisation (GSCO). The simulation results show that the model performance, in terms of the restoration ratio, improves significantly when the proposed schemes are applied. Additionally, the results show that there is a trade-off between the amount of spare capacity embedded in the network and the restoration ratio.

References

[1] Greg Bernstein, Bala Rajagopalan, Debanjan Saha, Optical Network Control: Architecture, Protocols, and Standards, Addison Wesley, 2004.

[2] Peter Tomsu, Christian Schmutzer, Next Generation Optical Networks, Prentice Hall PTR, 2002.

[3] W.D. Grover, The protected working capacity envelope concept: an alternate paradigm for automated service provisioning, IEEE Communications Magazine 42 (1) (2004) 62–69.

[4] Jing Zhang, Biswanath Mukherjee, A review of fault management in WDM mesh networks: basic concepts and research challenges, IEEE Networks Magazine (2004).

[5] Murali Kodialam, T. Lakshman, Sudipta Sengupta, A simple traffic independent scheme for enabling restoration oblivious routing of resilient connections, in: IEEE INFOCOM 2004.

[6] Mansoor Alicherry, Randeep Bhatia, Pre-provisioning networks to support fast restoration with minimum over-build, in: IEEE INFOCOM 2004, Hong Kong, March 9, 2004.

[7] D.A. Schupke, R.G. Prinz, Capacity efficiency and restorability of path protection and rerouting in WDM networks subject to dual failures, Photonic Network Communications 8 (2) (2004) 191–207.

[8] Lei Lei, Aibo Liu, Yuefeng Ji, A joint resilience scheme with interlayer backup resource sharing in IP over WDM networks, IEEE Communications Magazine 42 (1) (2004) 78–84.

[9] Y. Qin, L. Mason, K. Jia, Studying of a joint multiple-layer restoration scheme for IP over WDM network, IEEE Networks Magazine (2003).

[10] Jian Wang, L. Sahasrabuddhe, B. Mukherjee, Path vs. subpath vs. link restoration for fault management in IP-over-WDM networks: performance comparisons using GMPLS control signalling, IEEE Communications Magazine 40 (11) (2002) 80–87.

[11] L. Berger, Generalized Multi-Protocol Label Switching (GMPLS) Signalling Functional Description, http://www.faqs.org/ftp/rfc/pdf/rfc3471.txt.pdf, January 2003.

[12] J. Lang, Generalized Multi-Protocol Label Switching (GMPLS) Signaling Resource ReserVation Protocol-Traffic Engineering (RSVP-TE) Extensions, January 2003.

[13] M. Kodialam, T.V. Lakshman, Restorable dynamic quality of service routing, IEEE Communications Magazine 40 (6) (2002) 72–81.

[14] E. Oki, K. Shiomoto, D. Shimazaki, Dynamic multilayer routing schemes in GMPLS-based IP+optical networks, IEEE Communications Magazine (2005) 108–113.

[15] F.L. Faucheur, W. Lai, RFC 3564: requirements for support of differentiated services-aware MPLS traffic engineering, July 2003.

David Harle is currently a senior lecturer within the Broadband and Optical Networks Research Group within the Department of Electronic and Electrical Engineering at the University of Strathclyde in Glasgow. He received his Ph.D. in Integrated Telecommunications from the University of Strathclyde in 1990, having previously been a Research Assistant in the same department. David Harle's current research interests within the Broadband & Optical Networks Group focus upon performance evaluation, design and management issues associated with current and future broadband and optical communication systems. The author of over 50 research papers and undergraduate texts, David is also a member of IEEE.

Saud Albarrak received his Bachelor in Computer Engineering in 1986 and his Master of Science in Electrical Engineering in 1897, both from King Saud University. He is now a Ph.D. research student within the Broadband and Optical Networks Research Group within the Department of Electronic and Electrical Engineering at the University of Strathclyde in Glasgow. His research interests include IP-over-optical networks, including traffic engineering and multilayer survivability.