Fast and versatile scheduler design for optical packet/burst switching

Optical Switching and Networking 8 (2011) 93–102


Franco Callegati, Aldo Campi, Walter Cerroni ∗

DEIS - University of Bologna, via Venezia 52, 47521 Cesena (FC), Italy

Article info

Article history:
Received 14 April 2010
Received in revised form 12 November 2010
Accepted 20 November 2010
Available online 27 November 2010

Keywords:
Optical networks
Optical packet switching
Optical burst switching
Scheduling
Scheduler implementation

Abstract

The most promising solutions to increase bandwidth efficiency in IP over DWDM networks are represented by Optical Packet and Burst Switching. Due to the statistical multiplexing approach adopted, these paradigms must deal with packet/burst contentions at the node output channels and require the use of scheduling algorithms to optimize resource assignment and bandwidth utilization. However, very strict time constraints on the scheduling operations are imposed by the extremely high bit rates used, making the actual feasibility of the scheduler control logic a major implementation issue common to both technologies. This paper provides a specific formulation of the scheduling problem that can be used as the basis for the practical design of a fast and versatile scheduler, capable of implementing many algorithms found in the literature. A possible implementation of the scheduler sub-functions, mainly based on the use of combinatorial operations, is also discussed and supported by the simulation of the hardware implementation time response, with results that demonstrate the feasibility of the proposed solution.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Optical Packet Switching (OPS) and Optical Burst Switching (OBS) have been considered as the most promising paradigms to increase bandwidth efficiency in IP over DWDM networks with respect to coarse-grained switching techniques operating at the lightpath level [1]. Since both OPS and OBS are based on the concept of statistical multiplexing, performance of network nodes is clearly affected by the contention resolution mechanism adopted. Due to these similar contention aspects and to the packet-based transfer mode common to both technologies, this paper will refer to either an OBS or an OPS data unit with the generic term packet, specifying the difference only when needed.

Optical packet contentions can be solved by exploiting the wavelength, space and time domains. The first two approaches try to balance the load over multiple output channels, wavelengths and fibers respectively. In particular, the multiple fibers used in the space domain can be connected either to the same next hop (multi-fiber output interface) or to different nodes (deflection routing). The time domain can be exploited by delaying packets using Fiber Delay Lines (FDLs) [2] or similar devices [3]. Typically, a combined approach in channel and time domains proves more effective [4].

∗ Corresponding author. Tel.: +39 0547 339209; fax: +39 0547 339208. E-mail address: [email protected] (W. Cerroni).

doi:10.1016/j.osn.2010.11.002

Contention resolution algorithms act proactively and schedule the transmission of the packets trying to avoid contentions as much as possible. Such scheduling algorithms show similar characteristics when designed for OPS and OBS nodes, especially when OPS nodes must deal with asynchronous, variable-length packets. In both cases it is necessary to compute the exact transmission time of the packet just after the forwarding decision, while planning the resource usage in advance in order to minimize the chance of contention [5].

Because of the huge bandwidth available on the optical channels, the very high packet arrival rates impose strict time constraints on the scheduling decision at each node. More specifically, a scheduler should be able to complete its tasks in a predictable time frame, as independent as possible of the number of incoming requests and resources available for contention resolution. If this is not the case, the scheduler can become the real bottleneck of the whole switching fabric, as discussed in an early work on this subject [6].


Fig. 1. Generic block diagram of an OPS/OBS node with a parallel scheduler. In the OBS case, headers are extracted from and inserted into the control channels.

The OPS/OBS scheduling problem formulation presented in this paper requires the computation of a simple minimum/maximum function over a limited set of values. With the approach proposed here, it is possible to implement a scheduler such that the scheduling time is not only limited to a range of acceptable values, but is also very predictable and not significantly dependent on the complexity of the scheduling problem.

The paper is organized as follows. Section 2 provides a brief overview of the existing solutions for OPS/OBS scheduling. Section 3 describes the proposed formulation of the scheduling problem, which could be the basis for a feasible implementation as discussed in Section 4, along with some results obtained from a simulation of the FPGA implementation of the proposed scheduler. Finally, Section 5 concludes the work.

2. Overview of OPS/OBS scheduling

When a packet header or a burst control packet arrives at a given node, it must be processed to determine the correct forwarding path. Then the corresponding payload must be scheduled for transmission. Both tasks could be performed by a central control unit, but faster processing can be achieved by simultaneously executing the forwarding operations at each input port and the scheduling operations at each output port. In this way, forwarding and scheduling operations deal with decoupled subsets of requests, so their tasks are less complex and can be executed in parallel instead of sequentially, as shown by the logical block diagram in Fig. 1.

The assumption of this work is that any output interface forwarding data to a given destination is equipped with $F$ fibers, each carrying $W$ wavelengths. Therefore $C = F \cdot W$ is the total number of transmission channels available at the interface. In the remainder of this work the index $c \in [1 : C]$ will be used to address the generic output channel. In addition, $B$ FDLs (or other generic delay devices) are associated to each channel.¹ A packet sent through the $i$-th delay line will experience a delay equal to $d_i$ with $i = 0, 1, \dots, B - 1$.

¹ This does not necessarily mean that a set of devices per channel is needed, since sharing policies can be adopted depending on the specific switching matrix architecture.

Despite the recent advances in integrated photonics, optical switching is still far from being a mature technology, resulting in a number of constraints on the scheduling problem. The most obvious and well known is the constraint in the time domain due to the discrete nature of the delay lines, which are able to delay a packet only to a finite subset of points in time. This behavior causes the presence of voids or gaps between scheduled packets that may significantly reduce the channel utilization and affect node performance [7,8]. Wavelength conversion is another case: depending on the availability and nature of wavelength converters, a packet may be transmitted on any of the available channels or just on a subset of them. For instance, the case of limited-range wavelength converters has been recently studied [9,10]: for any incoming packet, $F$ out of the $C$ available channels are potentially available without wavelength conversion, whereas the availability of the other $C - F = (W - 1) \cdot F$ depends on the wavelength conversion capabilities and resources available in the switching matrix. Because of these constraints, optical packet scheduling cannot be implemented by simply re-using conventional scheduling algorithms; instead, it needs an ad-hoc design.

Many scheduling algorithms for OPS/OBS have been proposed in the past. One of the earliest used in OBS defines a time horizon for each output channel as the instant after which no burst has been scheduled on the channel [11]. Any channel with a horizon smaller than the arrival time of the burst payload (or of one of its copies delayed by the FDL buffer, if present) is available to accommodate the incoming burst and the algorithm selects the channel with the latest horizon in order to minimize the voids between bursts scheduled on the same channel. This choice, called Latest Available Unscheduled Channel (LAUC) in OBS networks [12], proved to be the most efficient one also for OPS schedulers using the same horizon approach [13].
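The horizon rule just described can be illustrated with a minimal Python sketch. It assumes a bufferless node (no FDL-delayed copies) and a plain list holding one horizon per channel; the function name `lauc_select` and the data layout are hypothetical and do not come from the cited works. A horizon exactly equal to the arrival time is treated as available, which is also an assumption.

```python
def lauc_select(horizons, arrival):
    """Pick the available channel with the latest horizon (LAUC-style rule).

    horizons: list of horizon times, one per output channel (hypothetical layout).
    arrival:  arrival time of the burst payload.
    Returns the 0-based channel index, or None if no channel is available.
    """
    # A channel is available when nothing is scheduled on it after `arrival`.
    available = [(h, c) for c, h in enumerate(horizons) if h <= arrival]
    if not available:
        return None
    # The latest horizon leaves the smallest void in front of the new burst.
    _, channel = max(available)
    return channel
```

Taking the maximum over `(horizon, channel)` pairs resolves ties in favour of the higher channel index; any other tie-breaking rule would serve equally well in this sketch.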

A major improvement in terms of packet loss rate is achieved by adding void filling capabilities to the scheduler [7]. Already scheduled channels are now included in the search, given that they are unused for a period (i.e., a void) starting before the (possibly delayed) payload arrival time and large enough to accommodate the entire incoming packet. The unscheduled channels are considered as a particular case of voids with infinite length. The algorithm selects the suitable void with the latest starting time, minimizing the gap left in front of the incoming packet. In OBS this technique is known as Latest Available Unused Channel with Void Filling (LAUC-VF) [12]. However, schedulers based on the void filling scheme must keep track of all the gaps in the output channels, resulting in more complex and time-consuming algorithms than horizon-based ones.

Depending on the performance parameter to be optimized (delay, loss, computational complexity) several scheduling choices are available, leading to a possible classification of the algorithms as:

• delay-oriented (D-type), aimed at transmitting the packet with the smallest possible delay;

• gap-oriented (G-type), aimed at minimizing the gaps between packets and increasing channel efficiency;


or as:

• without void filling (noVF);

• with void filling (VF).

Examples of the combination of the different scheduling choices will be discussed in Section 3.2.

Some variants of VF scheduling proposed for OBS are able to achieve the same loss ratios as LAUC-VF with a significantly reduced complexity, thanks to the use of efficient data structures and smart search techniques [14–16]. However, the complexity of these solutions is a function of the amount of resources available to the scheduler and of the number of already scheduled packets, which in turn depends on the traffic load. Therefore, the time required to complete a scheduling task is not deterministic and not easily predictable and may turn out to be large enough to represent a system bottleneck. As a consequence, alternative solutions that are able to provide a suitable upper bound to the scheduling time must be investigated.

A first step in this direction is represented by an OBS scheduler implementation that uses multiple associative processors in order to perform parallel searches of suitable voids and available horizons and delays [17]. However, the look-up operations must still be performed over the whole set of voids stored in memory. The scheduling algorithms considered are LAUC-VF and LAUC.

Another approach is based on burst resequencing [18], whose hardware implementation is capable of achieving optimal scheduling in O(1) complexity. This scheduler does not process burst headers immediately as they arrive. Instead, it delays and reorders them according to the respective payload arrival time. Then, by applying a simple horizon policy, it is possible to schedule bursts that would have required a void filling algorithm with sequential header processing. However, this scheduler is applicable to the bufferless OBS case only, since it is able to exploit the voids created by different offset times but not those caused by packets delayed through FDLs. Furthermore, due to the delayed header processing, data bursts are always subject to an additional latency at each node in order to maintain the required offset after the header.

Recently, a parallel iterative implementation of an OBS scheduler has been proposed [19], where a number of burst headers received within a given time window are processed together to find the optimal channel assignment with void filling. This approach is based on iterative procedures derived from existing solutions adopted in virtual output queuing architectures. The very high level of parallelism used by this kind of scheduler may result in a very complex hardware design of the switch control unit.

An alternative approach to packet scheduling aimed at reducing the processing time is presented in this paper. Building on ideas already sketched in previous works [20,21], this paper provides a more general formulation of the scheduling problem as well as a detailed description of the implementation proposed for the main sub-functions of the scheduler. This approach allows us to reduce the scheduling computation to the simple task of finding the minimum or maximum over a limited set of values, which are obtained by executing a properly defined mathematical function. The advantage of this approach is that the computational effort can be reduced and the required processing time can be made almost independent of the current traffic conditions.

Moreover, this paper overcomes a weakness of many of the previous works on the subject. In general, these works focus on the scheduling problem, showing the effectiveness of their proposed implementation, but do not discuss in detail the problem of managing the information the scheduler has to deal with. Indeed, voids must be stored in a memory, new voids must be added when a new packet is scheduled, expired voids must be deleted, etc. Doing so is not costless when dealing with hardware implementation and may account for a significant share of the overall processing time required by the scheduler to work properly. This work not only proposes a new formulation of the scheduling problem, but also discusses in detail, in the implementation section, how the voids could be managed in the system memory so as to keep the time required to do so in the order of the time required by the pure scheduling function.

Therefore, the formalization of the scheduling procedures proposed here improves on what has been investigated in previous papers [20,21] by adding generality and providing further details about a feasible hardware implementation of a fast and versatile OPS/OBS scheduler.

3. Scheduling problem formulation

This section presents an original mathematical formulation of the OPS/OBS scheduling problem that is functional to the efficient implementation presented in the next section. The formulation refers to a single scheduler working on a given forwarding destination. In terms of problem complexity the worst case is when the switching matrix is equipped with full-range wavelength converters and therefore any incoming packet can in principle be forwarded to any of the $C$ output channels. This is the case considered here, although the concepts presented can be easily applied to limited or no wavelength conversion conditions by simply reducing the number of channels available for the scheduling.

As already outlined, this paper assumes asynchronous arrivals of variable-sized packets. The generic packet duration $L$, i.e. its size expressed in time units, is lower- and upper-bounded (as usually happens with network protocols) so that $L_m \le L \le L_M$. The average packet size is $\bar{L}$.

In the discussion that follows it is important to consider the event time framework according to the following definitions:

• $t_h$: the time when a packet header or a burst control packet arrives at the switch;

• $t_0$: the time when the scheduler operations begin, i.e. when the required output interface has been determined and the forwarding information is passed to the corresponding port scheduler;

• $t_{\mathrm{sched}}$: the amount of processing time required by the packet scheduling;

• $t_{\mathrm{sw}}$: the optical switch reconfiguration time;

• $t_{\mathrm{offset}}$: the offset time between the burst control packet and the payload in the OBS case;²

• $t_{\mathrm{offset\_min}}$: the minimum value of $t_{\mathrm{offset}}$ assumed in the OBS case.

In conclusion, the instant when actual data will arrive at the configured switching matrix is either

$$t_a = t_0 + t_{\mathrm{sched}} + t_{\mathrm{sw}}$$

in the OPS case or

$$t_a = t_h + t_{\mathrm{offset}}$$

in the OBS case.

Let us define the buffer delay times, i.e. the instants when the packet could be available at the output channel after being delayed, as

$$d_i(t_a) = t_a + d_i, \quad i = 0, \dots, B - 1.$$

These values include the zero-delay cut-through path with $d_0 = 0$ and consider the switching matrix propagation time negligible. Finally, the end of the scheduling time window is defined as

$$d_B(t_a) = d_{B-1}(t_a) + L_M.$$
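The timing definitions above can be summarised in a short Python sketch. The function names and the numeric values are illustrative assumptions (five delays that are consecutive multiples of one time unit and a maximum packet duration of three units); the list of delays is assumed to be sorted so that its last entry is $d_{B-1}$.

```python
def arrival_time_ops(t0, t_sched, t_sw):
    """OPS case: data arrives after scheduling and switch reconfiguration."""
    return t0 + t_sched + t_sw


def arrival_time_obs(t_h, t_offset):
    """OBS case: data arrives one offset time after the control packet."""
    return t_h + t_offset


def delay_instants(t_a, delays):
    """Instants d_i(t_a) = t_a + d_i at which the packet can leave the buffer."""
    return [t_a + d for d in delays]


def window_end(t_a, delays, l_max):
    """End of the scheduling window: last delay instant plus the maximum packet duration."""
    return delay_instants(t_a, delays)[-1] + l_max


# Illustrative values (assumptions, not taken from a measured system).
delays = [0, 1, 2, 3, 4]          # d_0 = 0 is the cut-through path
instants = delay_instants(0, delays)
end = window_end(0, delays, l_max=3)
```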

The formulation proposed here requires the voids on each output channel to be arranged in a logical list, for a total of $C$ lists per output interface. Each void in the list is represented by an element including the void starting time $b$ and ending time $e$. Lists are ordered based on the starting time. In case the channel is completely free (no packet scheduled on that channel so far) the list will include just one void spanning over the whole scheduling time window. In general, when previously scheduled packets are present, voids appear between them. A final void is also included, spanning from the end of the transmission of the last scheduled packet to the end of the scheduling time window.

Therefore, assuming $K$ voids are present on a given channel $c \in [1 : C]$, the corresponding list looks like

$$V_c = \{(b_{c1}, e_{c1}), \dots, (b_{ck}, e_{ck}), \dots, (b_{cK}, e_{cK})\}.$$

It is intuitive that the gaps on the same channel will be consecutive and non-overlapping, i.e.

$$b_{c(k-1)} < e_{c(k-1)} < b_{ck} < e_{ck} < b_{c(k+1)} < e_{c(k+1)} \quad \forall k.$$
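As an illustration of the list structure described above, the following Python sketch derives the void list of one channel from the intervals occupied by already scheduled packets and checks the ordering property. The helper names and the integer time values are assumptions made for this example only; the paper does not prescribe how the list is built.

```python
def voids_from_busy(busy, window_start, window_end):
    """Build the ordered void list of one channel.

    busy: (start, end) intervals of already scheduled packets, sorted by
          start time and non-overlapping (assumed, not checked).
    """
    voids = []
    cursor = window_start
    for start, end in busy:
        if start > cursor:              # free time before this packet
            voids.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < window_end:             # final void up to the end of the window
        voids.append((cursor, window_end))
    return voids


def is_well_formed(voids):
    """Check b_1 < e_1 < b_2 < e_2 < ... for a void list."""
    flat = [t for void in voids for t in void]
    return all(a < b for a, b in zip(flat, flat[1:]))


# A completely free channel yields a single void spanning the whole window.
free_channel = voids_from_busy([], 0, 70)
# Hypothetical busy intervals, in arbitrary integer time units.
busy_channel = voids_from_busy([(2, 6), (8, 18), (24, 33), (39, 68)], 0, 70)
```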

Now let $x$ be the incoming packet duration, with $L_m \le x \le L_M$. The scheduler must run through all the lists $V_c$ and perform the following actions (in the most general case of a void filling algorithm):

1. check whether the channels are free at time $d_i(t_a)$ $\forall i$;
2. check whether the packet fits into suitable voids;
3. if suitable voids are found in the previous steps, choose the best one and schedule the packet accordingly.

² Following the OBS paradigm, the offset time is computed at the ingress node as the sum of the processing, scheduling and switching times for all the nodes to be crossed along the path toward the egress node. Things are different in the OPS case, where there is no significant offset between header and packet and the optical payload is delayed at each node to allow for header processing, scheduling and switch configuration.

Fig. 2. Schematic of the scheduling logical sub-blocks.

This exhaustive search, if done sequentially, may turn out to be very time-consuming, especially in case of a large number of channels (i.e., large $C$) or when many voids are present (i.e., many short packets are scheduled). In this work the search problem is decomposed into a series of tasks that can then be executed in parallel and effectively implemented and linked in hardware. The basic principle is as follows:

• search all $C$ channels and $B$ delays at once;

• for each channel–delay pair $(c, i)$, defined as a potential scheduling point, apply an intermediate function $F(V_c, d_i, t_a, x)$, called the search function, that evaluates whether the scheduling point is suitable to accommodate the incoming packet;

• collect all the outputs from $F$ that correspond to valid scheduling points and store them in a temporary vector $A(t_a, x)$;

• choose the best scheduling point out of $A(t_a, x)$ using a so-called select function $S(A(t_a, x))$ that depends on the specific algorithm adopted.

The input to the search function $F(V_c, d_i, t_a, x)$ includes the void list of a given channel, a given delay and the packet arrival time and length. It returns a weight $w(c, i)$ that will be used to determine whether the corresponding scheduling point is valid, in which case the triplet $(c, i, w(c, i))$ is stored in $A(t_a, x)$ and used by $S(A(t_a, x))$ to select the optimum. While $c$ and $i$ are obviously trivial to get, the difficult task is to find the correct weight in order to make the optimal selection process as simple as possible.

Fig. 2 shows the logical structure of the scheduling formulation, which is made of two main phases: search (executed in parallel for each potential scheduling point) and select. The introduction of vector $A(t_a, x)$ allows both the decoupling of the two phases and the parallel computation of the search function on the whole set of delays and channels. Each sub-block of the search phase is responsible for checking whether a void exists for a given delay $d_i$ on a given channel $c$. Obviously, $C \cdot B$ such sub-blocks must work in parallel to feed the required values to $A(t_a, x)$. How these sub-blocks should work and the related complexity, as well as an example of implementation, will be discussed next.

Picking the best value out of $A(t_a, x)$ is the second phase of the operation, which could become complicated if the vector size is large and the related processing complex. If that were the case, this approach could turn out to be as bad as or even worse than other search strategies proposed in the literature. The remainder of this section is devoted to demonstrating that, when the weight $w(c, i)$ and the search function $F$ are properly chosen, it is possible to make the selection of the best element of vector $A(t_a, x)$ almost straightforward, allowing a very effective implementation of the whole scheduler. It is shown, with a series of examples referring to the most common scheduling algorithms found in the literature, that the select function $S$ can be defined as a simple min/max function on the whole set of values of $A(t_a, x)$.

Fig. 3. Example of buffer status at time $t_a$.

As a reference case study, let us consider the scheduling problem displayed in Fig. 3, where $B = 5$, $C = 4$ and $L_M = 3D$. Without loss of generality, the case of a degenerate delay buffer is considered [22], where delays are consecutive multiples of a given time unit $D$. Nonetheless, the concepts presented here are applicable to different buffer arrangements as well. In addition, for the sake of a simpler notation, in the following it is assumed $D = 1$ and $t_a = 0$, leading to $d_i(t_a) = i$. The status of the resulting lists $V_c$ at time $t_a$ is as follows:

$$\begin{aligned}
V_1 &= \{(0, 0.2), (0.6, 0.8), (1.8, 2.4), (3.3, 3.9), (6.8, 7.0)\} \\
V_2 &= \{(0.3, 0.5), (0.9, 1.6), (2.2, 7.0)\} \\
V_3 &= \{(0.4, 1.5), (2.5, 7.0)\} \\
V_4 &= \{(0.4, 0.9), (2.2, 2.7), (3.8, 7.0)\}.
\end{aligned}$$

3.1. Definition of the search function

For a given channel $c$ and delay $i$ and for a given packet to be scheduled, the search function is defined as follows:

$$F(V_c, d_i, t_a, x) = \left(F_H(V_c, d_i, t_a, x),\; F_T(V_c, d_i, t_a, x)\right)$$

where the two components are

$$F_H(V_c, d_i, t_a, x) = d_i(t_a) - b_{ck} \tag{1}$$

$$F_T(V_c, d_i, t_a, x) = e_{ck} - d_i(t_a) - x \tag{2}$$

if an element $(b_{ck}, e_{ck}) \in V_c$ exists such that

$$e_{ck} > d_i(t_a) \quad \text{and} \quad b_{ck} < d_i(t_a) + x \tag{3}$$

otherwise

$$F_H(V_c, d_i, t_a, x) = F_T(V_c, d_i, t_a, x) = -\infty. \tag{4}$$

This definition of the search function provides a measure of the size of the gap that would be left before (1) and after (2) the packet if it is assigned to scheduling point $(c, i)$, in case a suitable gap is present. Note that, due to conditions (3), $F_H$ and $F_T$ may assume finite negative values greater than $-x$, corresponding to partial overlapping of the head and tail of the incoming packet with those already scheduled. This adds generality to the scheduling formulation, as will become clear later. Condition (4) holds when there is complete overlapping with previously scheduled packets.

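Table 1
Values assumed by $w(c, i)$ for the reference example of Fig. 3. The symbol $-\infty$ represents $w(c, i) = (-\infty, -\infty)$.

| $w(c, i)$ | $i = 0$ | $i = 1$ | $i = 2$ | $i = 3$ | $i = 4$ |
|---|---|---|---|---|---|
| $c = 1$ | $(0, -0.1)$ | $-\infty$ | $(0.2, 0.1)$ | $-\infty$ | $-\infty$ |
| $c = 2$ | $-\infty$ | $(0.1, 0.3)$ | $(-0.2, 4.7)$ | $(0.8, 3.7)$ | $(1.8, 2.7)$ |
| $c = 3$ | $-\infty$ | $(0.6, 0.2)$ | $-\infty$ | $(0.5, 3.7)$ | $(1.5, 2.7)$ |
| $c = 4$ | $-\infty$ | $-\infty$ | $(-0.2, 0.4)$ | $-\infty$ | $(0.2, 2.7)$ |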

The weight $w(c, i)$ assigned to the potential scheduling point $(c, i)$ for the incoming packet is defined as³

$$w(c, i) = F(V_c, d_i, t_a, x) = (H(c, i), T(c, i)).$$

Table 1 shows the weights computed for the case study of Fig. 3 for an incoming packet with $x = 0.3$ and $t_a = 0$.
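The search function of Eqs. (1)-(4) can be sketched in Python as follows, using the void lists of the reference example with $D = 1$ and $t_a = 0$. This is only an illustrative software rendering, not the hardware design discussed in the paper: the names `dec`, `VOIDS` and `search` are hypothetical, `None` stands in for the weight $(-\infty, -\infty)$, and, as an assumption, the first void in list order that satisfies condition (3) is the one used. `Decimal` arithmetic keeps the strict comparisons exact.

```python
from decimal import Decimal


def dec(pairs):
    """Convert (start, end) string pairs to exact Decimal pairs."""
    return [(Decimal(b), Decimal(e)) for b, e in pairs]


# Void lists V_c of the reference example (time unit D = 1, t_a = 0).
VOIDS = {
    1: dec([("0", "0.2"), ("0.6", "0.8"), ("1.8", "2.4"), ("3.3", "3.9"), ("6.8", "7.0")]),
    2: dec([("0.3", "0.5"), ("0.9", "1.6"), ("2.2", "7.0")]),
    3: dec([("0.4", "1.5"), ("2.5", "7.0")]),
    4: dec([("0.4", "0.9"), ("2.2", "2.7"), ("3.8", "7.0")]),
}


def search(voids, d, x):
    """Return (H, T) as in Eqs. (1)-(2), or None when condition (3) fails.

    None plays the role of the (-inf, -inf) weight of Eq. (4).
    Assumption: if several voids satisfy (3), the first one is used.
    """
    for b, e in voids:
        if e > d and b < d + x:
            return (d - b, e - d - x)
    return None


x = Decimal("0.3")
# One weight per potential scheduling point (c, i); d_i(t_a) = i here.
weights = {(c, i): search(VOIDS[c], Decimal(i), x) for c in VOIDS for i in range(5)}
```

In hardware the $C \cdot B$ evaluations would run in parallel; the dictionary comprehension above simply enumerates them sequentially.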

Each scheduling point is subject to a validity test for inclusion of the corresponding weight in $A(t_a, x)$. Such a test depends on the scheduling algorithm adopted. For instance, algorithms that do not allow any overlapping between incoming and scheduled packets will use the following validity test

$$(c, i, w(c, i)) \in A(t_a, x) \iff H(c, i) \ge 0 \text{ and } T(c, i) \ge 0 \tag{5}$$

which, applied to the previous example, results in a vector $A(t_a, x)$ of 8 elements, corresponding to all the weights in Table 1 with non-negative components. Other scheduling approaches, such as those based on burst segmentation [23], allow a less strict validity test

$$(c, i, w(c, i)) \in A(t_a, x) \iff H(c, i) > -\infty \text{ and } T(c, i) > -\infty \tag{6}$$

which, in the example above, gives a vector of 11 elements, including those related to scheduling points with partial overlapping.

Finally, algorithms that do not use void filling need only to maintain the horizon information for each channel. This means that the lists $V_c$ at time $t_a$ for the example above will be

$$\begin{aligned}
V_1 &= \{(6.8, 7.0)\} & V_3 &= \{(2.5, 7.0)\} \\
V_2 &= \{(2.2, 7.0)\} & V_4 &= \{(3.8, 7.0)\}
\end{aligned}$$

and the weights shown in Table 1 corresponding to voids between scheduled packets will all be $-\infty$. In this case the validity test not allowing any overlapping is

$$(c, i, w(c, i)) \in A(t_a, x) \iff H(c, i) \ge 0 \tag{7}$$

which, for the example above, leads to a vector $A(t_a, x)$ that includes only the elements corresponding to the following scheduling points

$$(c, i) \in \{(2, 3), (2, 4), (3, 3), (3, 4), (4, 4)\}. \tag{8}$$
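The three validity tests can be checked against the reference example with a small Python sketch. The finite weights are copied from Table 1, and scheduling points with weight $(-\infty, -\infty)$ are simply left out of the dictionary. The names `TABLE1`, `HORIZON_POINTS` and `build_A` are illustrative assumptions; `HORIZON_POINTS` lists the points whose weight comes from the final (horizon) void of a channel, which are the only finite weights left without void filling.

```python
# Finite weights (H, T) from Table 1, keyed by scheduling point (c, i).
TABLE1 = {
    (1, 0): (0.0, -0.1), (1, 2): (0.2, 0.1),
    (2, 1): (0.1, 0.3), (2, 2): (-0.2, 4.7), (2, 3): (0.8, 3.7), (2, 4): (1.8, 2.7),
    (3, 1): (0.6, 0.2), (3, 3): (0.5, 3.7), (3, 4): (1.5, 2.7),
    (4, 2): (-0.2, 0.4), (4, 4): (0.2, 2.7),
}

# Points served by the last void of each channel (horizon-only lists).
HORIZON_POINTS = {(2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)}


def valid_no_overlap(H, T):
    """Test (5): no overlapping on either side."""
    return H >= 0 and T >= 0


def valid_segmentation(H, T):
    """Test (6): any finite weight is accepted (infinite ones are absent here)."""
    return True


def valid_horizon(H, T):
    """Test (7): only the gap in front of the packet matters."""
    return H >= 0


def build_A(weights, is_valid):
    """Collect the scheduling points that pass the given validity test."""
    return {p: w for p, w in weights.items() if is_valid(*w)}


A_vf = build_A(TABLE1, valid_no_overlap)
A_seg = build_A(TABLE1, valid_segmentation)
A_novf = build_A({p: TABLE1[p] for p in HORIZON_POINTS}, valid_horizon)
```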

3.2. Definition of the select function

Depending on the particular scheduling algorithm adopted, a specific select function may be defined as

$$S(A(t_a, x)) = (c_0, i_0)$$

where $(c_0, i_0)$ is the optimal scheduling choice according to the algorithm. In the following, a few definitions of the select function are provided for the most common scheduling algorithms available in the literature and their application to the reference example of Fig. 3 is illustrated, resulting in the allocation alternatives shown in Fig. 4.

³ The two components of $F(V_c, d_i, t_a, x)$ have been renamed as $H(c, i)$ and $T(c, i)$ to simplify the notation in the remainder of the paper.

Fig. 4. Possible scheduling solutions for the example of Fig. 3 assuming an incoming packet with $x = 0.3$: ❶ G-type VF with minimum ending gap or with best fit; ❷ G-Type VF with minimum starting gap, D-Type VF with minimum starting gap (LAUC-VF, Min-SV) or with best fit, NP-MOC-VF; ❸ D-Type VF with minimum ending gap (Min-EV); ❹ D-Type noVF (LAUC); ❺ G-Type noVF.

3.2.1. G-type VF

A G-type VF policy could be easily implemented by finding the minimum residual gap that would be created between the incoming packet and the preceding one. Assuming the validity test (5), $(c_0, i_0)$ is the scheduling point such that

$$H(c_0, i_0) = \min_{(c, i, w(c, i)) \in A(t_a, x)} H(c, i) \tag{9}$$

and

$$i_0 = \min \{i = 0, 1, \dots, B - 1 \mid H(c, i) = H(c_0, i_0)\}. \tag{10}$$

In case more than one scheduling point provides the same minimum residual gap (9), condition (10) selects the earliest one. If there are still multiple alternatives, a random choice can be made. For the reference example, Table 1 gives $(c_0, i_0) = (2, 1)$.

Alternatively, if the objective is to find the minimum residual gap with the following scheduled packet, then $(c_0, i_0)$ is the point that minimizes $T(c, i)$ and the solution in the reference example is $(c_0, i_0) = (1, 2)$. Similarly, if the void that best fits the incoming packet is the one that leaves the minimum residual gap on both head and tail, then $(c_0, i_0)$ is the point that minimizes $H(c, i) + T(c, i)$. In the example it is again $(c_0, i_0) = (1, 2)$.
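The three G-type variants differ only in the quantity being minimized, which the following Python sketch makes explicit. The dictionary `A` holds the eight points of the reference example that pass test (5), with weights copied from Table 1; the function name `select_min` is hypothetical. Ties on the metric are resolved by the smallest delay index, as in (10); remaining ties go to the lowest channel index, which is a deterministic stand-in (an assumption) for the random choice allowed by the text.

```python
# Valid scheduling points under test (5): (c, i) -> (H, T), from Table 1.
A = {
    (1, 2): (0.2, 0.1), (2, 1): (0.1, 0.3), (2, 3): (0.8, 3.7), (2, 4): (1.8, 2.7),
    (3, 1): (0.6, 0.2), (3, 3): (0.5, 3.7), (3, 4): (1.5, 2.7), (4, 4): (0.2, 2.7),
}


def select_min(A, metric):
    """Return the point (c, i) that minimizes metric(H, T).

    Tie-breaking: smallest delay index i first, then lowest channel index
    (the latter is an assumption replacing a random choice).
    """
    return min(A, key=lambda p: (metric(*A[p]), p[1], p[0]))


min_starting_gap = select_min(A, lambda H, T: H)      # Eqs. (9)-(10)
min_ending_gap = select_min(A, lambda H, T: T)        # minimize T(c, i)
best_fit = select_min(A, lambda H, T: H + T)          # minimize H + T
```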

3.2.2. D-type VF and LAUC-VF

The select function for a D-type policy with void filling, implementing the OBS LAUC-VF algorithm as described in [12], can be defined by measuring the distance between each potential scheduling point and the end of the scheduling time window $d_B(t_a)$, reduced by the residual gap measured by $H(c, i)$. Therefore, assuming

$$D(c, i) = d_B(t_a) - d_i(t_a) - H(c, i) \tag{11}$$

the optimal scheduling point is $(c_0, i_0)$ such that

$$D(c_0, i_0) = \max_{(c, i, w(c, i)) \in A(t_a, x)} D(c, i). \tag{12}$$

In fact, $D(c, i)$ is maximized when the smallest delay available is chosen. In case such a minimum delay is available on more than one channel, the further decrease of the residual gap causes the highest value of $D(c, i)$ to correspond to the smallest gap with the preceding scheduled packet. In case condition (12) gives multiple solutions, corresponding to multiple scheduling points with the smallest delay and the same residual gap, a random choice can be made.

The values of $D(c, i)$ for the reference example are shown in Table 2. In this case, the best scheduling point is $(c_0, i_0) = (2, 1)$.

Table 2
Values assumed by $D(c, i)$ for the reference example of Fig. 3.

| $D(c, i)$ | $i = 0$ | $i = 1$ | $i = 2$ | $i = 3$ | $i = 4$ |
|---|---|---|---|---|---|
| $c = 1$ | | | 4.8 | | |
| $c = 2$ | | 5.9 | | 3.2 | 1.2 |
| $c = 3$ | | 5.4 | | 3.5 | 1.5 |
| $c = 4$ | | | | | 2.8 |
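The entries of Table 2 and the resulting choice can be reproduced with a few lines of Python. The sketch assumes the reference example ($d_i(t_a) = i$ and $d_B(t_a) = 4 + 3 = 7$) and the eight valid points under test (5); the names `D_B`, `distance` and `table2` are illustrative only. Values are rounded to one decimal for display, since floating-point subtraction is not exact.

```python
D_B = 7  # end of the scheduling window in the reference example

# Valid scheduling points under test (5): (c, i) -> (H, T), from Table 1.
A = {
    (1, 2): (0.2, 0.1), (2, 1): (0.1, 0.3), (2, 3): (0.8, 3.7), (2, 4): (1.8, 2.7),
    (3, 1): (0.6, 0.2), (3, 3): (0.5, 3.7), (3, 4): (1.5, 2.7), (4, 4): (0.2, 2.7),
}


def distance(point, weight, d_b=D_B):
    """D(c, i) of Eq. (11), with d_i(t_a) = i in the reference example."""
    _, i = point
    H, _ = weight
    return d_b - i - H


# Reproduce Table 2 (rounded for readability).
table2 = {p: round(distance(p, w), 1) for p, w in A.items()}

# Eq. (12): the optimal point maximizes D(c, i).
best = max(A, key=lambda p: distance(p, A[p]))
```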

3.2.3. Additional OBS algorithms

Other OBS scheduling approaches, such as those discussed in [14], can be implemented using the methodology proposed here, with the advantage that the select function is able to find the optimal solution also in case of multiple delay lines ($B > 1$), without the need to adopt a sub-optimal approach as in the Batching FDL algorithm proposed in [14]. For instance, the Minimum Starting Void (Min-SV) fit, which is identical to LAUC-VF in case no delay lines are used, can be obtained by applying (11) and (12) again.

Alternatively, when the objective is to minimize the residual ending gap between the incoming burst and the following scheduled one given the smallest delay available, the Minimum Ending Void (Min-EV) fit can be implemented applying (12) with

$$D(c, i) = d_B(t_a) - d_i(t_a) - T(c, i) \tag{13}$$

resulting in $(c_0, i_0) = (3, 1)$ for the reference example.

Finally, when the objective is to minimize the whole residual gap (head and tail) for a given minimum delay (Best Fit), (12) can still be used with

$$D(c, i) = d_B(t_a) - d_i(t_a) - (H(c, i) + T(c, i)) \tag{14}$$

resulting in $(c_0, i_0) = (2, 1)$ for the reference example.
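Since (11), (13) and (14) share the same structure and differ only in the gap term subtracted, they can be expressed through a single parametrized select function. The Python sketch below assumes the reference example ($d_i(t_a) = i$, $d_B(t_a) = 7$) and the valid points under test (5); `select_max_distance` and the `penalty` argument are hypothetical names introduced for this illustration.

```python
# Valid scheduling points under test (5): (c, i) -> (H, T), from Table 1.
A = {
    (1, 2): (0.2, 0.1), (2, 1): (0.1, 0.3), (2, 3): (0.8, 3.7), (2, 4): (1.8, 2.7),
    (3, 1): (0.6, 0.2), (3, 3): (0.5, 3.7), (3, 4): (1.5, 2.7), (4, 4): (0.2, 2.7),
}


def select_max_distance(A, d_b, penalty):
    """Apply Eq. (12) with D(c, i) = d_B - d_i - penalty(H, T).

    d_i(t_a) = i is assumed, as in the reference example.
    """
    return max(A, key=lambda p: d_b - p[1] - penalty(*A[p]))


min_sv = select_max_distance(A, 7, lambda H, T: H)        # Eq. (11)
min_ev = select_max_distance(A, 7, lambda H, T: T)        # Eq. (13)
best_fit = select_max_distance(A, 7, lambda H, T: H + T)  # Eq. (14)
```

Keeping the penalty as an argument mirrors the idea that a single min/max stage can serve several algorithms once the weights are available.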

3.2.4. Segmentation-based algorithmsThe use of validity test (6) extends vector A(ta, x) to

also include scheduling points with finite negative valuesof H(c, i) and T (c, i), corresponding to partial overlappingbetween the incoming packet and those already scheduled.This extension allows us to implement scheduling algo-rithms based on the burst segmentation scheme, such asthe Non-Preemptive Minimum Overlapping Channel withVoid Filling (NP-MOC-VF) [23]. In this case, whenever it isnot possible to find a scheduling point with a gap largeenough to accommodate the incoming packet, instead ofdiscarding it the objective is to find the point with themin-imum partial overlapping. Of course, when voids withoutoverlapping are available, one of the previous algorithmscan be applied, e.g. G-Type VFwithminimum starting void.

This specification translates into a select function forwhich the optimal scheduling point (c0, i0) is such that

O(c0, i0) = min(c,i,w(c,i))∈A(ta,x)

O(c, i) (15)


Table 3Values assumed by O(c, i) under different conditions.

Overlapping type H(c, i) T (c, i) O(c, i)

Head overlapping <0 >0 |H(c, i)|Tail overlapping >0 <0 |T (c, i)|Head and tail overlapping <0 <0 |H(c, i)|+|T (c, i)|No overlapping >0 >0 − (dB(ta) − H(c, i))

whereO(c, i) = u (−H(c, i)) |H(c, i)| + u (−T (c, i)) |T (c, i)|

− u (H(c, i)) u (T (c, i)) (dB(ta) − H(c, i)) (16)and

u(y) =

0 if y < 01 if y ≥ 0 (17)

is the unit step function, which is used in (16) to select the proper overlapping or gap to be minimized, as shown in Table 3. Note that in case a scheduling point provides a suitable void without overlapping, O(c, i) = −(dB(ta) − H(c, i)) < 0 for that point, which guarantees that by applying (15) the point without overlapping and with minimum starting gap is always selected. In the reference example, the smallest overlapping is found for O(1, 0) = 0.1, although the minimum value is found for O(c0, i0) = O(2, 1) = −6.9, as this point provides the smallest starting void without overlapping.
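As an illustration, the selection rule of (15)-(17) can be sketched in a few lines of Python. This is only a software analogue under stated assumptions: the function names, the value of dB(ta) and the candidate points are hypothetical and are not taken from the reference example. The sketch also assumes H(c, i) < dB(ta), so that points without overlapping yield a negative metric.

```python
def u(y):
    """Unit step function of (17): 0 for y < 0, 1 for y >= 0."""
    return 1 if y >= 0 else 0


def overlap_metric(H, T, d_B):
    """O(c, i) of (16) for one scheduling point with head gap H and tail gap T."""
    return (u(-H) * abs(H)
            + u(-T) * abs(T)
            - u(H) * u(T) * (d_B - H))


def select_min_overlap(points, d_B):
    """Return the (c, i) key that minimizes O(c, i), as in (15)."""
    return min(points, key=lambda ci: overlap_metric(*points[ci], d_B))


# Hypothetical candidates: (c, i) -> (H, T). Values are illustrative only.
D_B = 10.0
points = {
    (1, 0): (-0.1, 3.0),  # head overlapping
    (2, 1): (3.0, 1.0),   # no overlapping, starting void 3.0
    (3, 2): (1.0, 2.0),   # no overlapping, starting void 1.0
}
best = select_min_overlap(points, D_B)
```

With these illustrative inputs, every point without overlapping has a negative metric, so the point with the smallest starting void is preferred over any overlapping point.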

3.2.5. noVF algorithms

The scheduling problem formulation presented here can also be used to implement algorithms that do not use void filling. In fact, after the validity test (7) has been applied, the select function for a G-Type noVF algorithm [13] can be implemented using (9) and (10). Similarly, applying (12) and (11) results in a D-Type noVF algorithm, also named LAUC in OBS [12]. In the reference example, looking at Tables 1 and 2 and considering only the valid points shown in (8), it is easy to see that the optimal scheduling points are (c0, i0) = (4, 4) for the G-Type-noVF and (c0, i0) = (3, 3) for the D-Type-noVF.

4. Implementing the scheduler

It is important at this point to understand the effectiveness of the proposed scheduling formulation from the implementation point of view. In this section we provide simulations of the implementation of the most relevant functional sub-blocks using FPGA design software. In particular, we used the Altera Quartus II 8.0 Web Edition software with Stratix II and III reference boards. We wish to point out that an optimized, fully functional hardware implementation of the scheduler would be a typical task of prototyping and pre-production engineering activities, which require resources and skills that go well beyond the scope of this research. Nonetheless we believe it is reasonable to assume that a full implementation with optimized hardware design will provide performance for the sub-blocks equal to or better than what we observed. Moreover, the logic to interconnect these blocks does not introduce any complex computation or significant additional delay. For these reasons we believe it is credible to claim that the results of the simulations provided in the following give a rough estimate of the processing time of the full implementation.

Fig. 5. Example of hardware implementation of a minimum/maximum function.

The focus of the simulations is placed on the search and select functions as well as on the gap list management. All the simulations have been made using 8-bit registers, input and output variables and real timing simulation parameters without any fitter optimization. Performance evaluation of the proposed scheduler in terms of traditional packet loss probability is not considered here. In fact, this paper does not introduce any new scheduling algorithm and is focused on the scheduling formulation and implementation. It has been shown in Section 3.2 that a large number of the algorithms presented in literature can be implemented with the proposed formulation. For details on the performance of the various algorithms, please refer to e.g. [12–14].

4.1. The select function

The main advantage of the proposed formulation is that the search and select functions, as defined above, consist of simple mathematical operations, number comparisons and minimum/maximum value extractions. All these operations are easy to implement using combinatorial circuits. In practice, both functions require a given numeric value to be compared to a set of other values.

Here we will focus at first on the implementation of the select function, which requires comparisons and computation of the minimum or maximum value over a given set, as discussed in Section 3.2. These tasks can be implemented using dedicated hardware as shown in Fig. 5, where a pool of comparators combined with flip-flops and AND filters is shown. Each comparator takes two input values I1, I2 and returns, for instance, a binary output as follows:
• Cmax(I1, I2) = 1, if I1 ≥ I2
• Cmax(I1, I2) = 0, if I1 < I2.

For each input Ii the hardware structure evaluates whether Ii is a maximum or minimum of the set. Such operations can be performed in parallel by arrays of comparators, so that the time needed to compare all these values is approximately constant and in the order of a single computational step. The resulting set is composed of the optimal value (or values, if there is more than one) according to the algorithm chosen.

Table 4
Processing time (ns) obtained with the minimum/maximum function implemented as in Fig. 5. The ∗ symbol denotes unavailable values due to limitations in the simulation software.

Inputs   8 bit    16 bit   24 bit   32 bit
2        4.772    7.390    8.099    8.986
4        6.01     8.872    9.610    10.959
6        6.727    9.812    10.829   12.079
8        7.667    9.882    11.981   12.319
10       8.057    10.497   11.979   13.192
15       8.677    11.822   13.226   14.597
20       9.266    12.258   14.228   ∗
25       9.54     12.886   ∗        ∗
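The comparator-based extraction described above can be mimicked in software as follows. This is a sequential sketch of what the hardware would evaluate in parallel, not a model of the circuit of Fig. 5: the helper names `c_min` and `extreme_flags` are hypothetical, `c_min` is assumed to mirror Cmax, and the input values are illustrative.

```python
def c_max(i1, i2):
    """Comparator output Cmax(I1, I2): 1 if I1 >= I2, else 0."""
    return 1 if i1 >= i2 else 0


def c_min(i1, i2):
    """Assumed mirror comparator for the minimum: 1 if I1 <= I2, else 0."""
    return 1 if i1 <= i2 else 0


def extreme_flags(values, comparator):
    """For each input, AND together its comparisons against all inputs.

    A flag equal to 1 marks an input that is a maximum (or minimum) of the set.
    """
    flags = []
    for a in values:
        flag = 1
        for b in values:
            flag &= comparator(a, b)  # AND filter over comparator outputs
        flags.append(flag)
    return flags


inputs = [12, 7, 30, 7, 19]              # illustrative 8-bit values
max_flags = extreme_flags(inputs, c_max)
min_flags = extreme_flags(inputs, c_min)
```

Note that when the extreme value occurs more than once, all of its occurrences are flagged, which matches the remark that the resulting set may contain more than one optimal value.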

The hardware structure of Fig. 5 has been designed and its behavior simulated assuming an FPGA implementation. Table 4 reports the processing time (in nanoseconds) required to perform the minimum/maximum value extraction as a function of the number of inputs to be compared (rows) and the number of bits used to encode the input values (columns). The timing obtained is always under 15 ns, demonstrating the scalability of the proposed implementation in the case of tens of input values and up to four-byte encoding.

4.2. The search function

As defined in Section 3.1, for each potential scheduling point (c, i) the search function must find the void where the arriving packet may fit and compute the H(c, i) and T(c, i) overlap functions. While the actual function computation is a straightforward task, the preliminary void search operation is more critical and can be efficiently implemented with hardware similar to the one in Fig. 5. In this case the buffer delay time di(ta) is compared with the starting and ending times of the voids included in the i-th delay time span. The comparisons can be made in parallel for any delay using an array of structures similar to the one in Fig. 5. Nonetheless the complexity and performance of such hardware depend on the number of inputs (see Table 4), i.e. on the number of voids to be searched. Therefore it would be desirable to have a limited set of voids to choose from.

We propose to implement a suitable multi-pointer structure to limit the number of voids to be considered on each delay. The key issue is to be able to access the list of voids rather close to the time stamp of interest and limit the number of voids to be included in the comparison to be made by the search function. The set of channel lists Vc is managed through a number of pointers equal to the number of potential scheduling points (c, i). Pointer p(c, i) refers to the earliest void, with starting time bp(c,i) and ending time ep(c,i), to be considered by the search function on channel c for delay i.

The pointers are periodically updated by a trigger function that is executed "off-line" with respect to the packet arrival time. Basically the trigger periodically shifts the pointers to follow the normal flow of time. The time interval between consecutive executions of the trigger is a parameter available to the system engineer to trade complexity for performance. If the trigger is executed more frequently, the pointers will typically point to voids very close to the instant of interest, whereas executing the trigger less frequently will cause the pointers to point to voids that may be placed rather earlier than the time of interest. In the former case the search can be made on a set of voids significantly smaller than the one obtained in the latter case.

The goal is to search for a void which overlaps withdelay i according to one of the following conditions on thestarting and ending times:

bp(c,i) ≤ di(ta) ≤ ep(c,i) ≤ di+1(ta) (18)

di(ta) ≤ bp(c,i) ≤ ep(c,i) ≤ di+1(ta) (19)

di(ta) ≤ bp(c,i) ≤ di+1(ta) ≤ ep(c,i) (20)

bp(c,i) ≤ di(ta) ≤ di+1(ta) ≤ ep(c,i). (21)
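To make the four cases concrete, the following Python sketch checks which of conditions (18)-(21) a void satisfies for a given delay span. The function name and the numeric values are hypothetical; since the inequalities are not strict, more than one condition can hold at the boundaries, so the sketch returns all matching equation numbers.

```python
def overlap_conditions(b, e, d_i, d_next):
    """Return the equation numbers among (18)-(21) satisfied by void [b, e].

    d_i and d_next stand for di(ta) and di+1(ta); an empty list means that
    the void does not overlap with the delay span.
    """
    checks = {
        18: b <= d_i <= e <= d_next,    # void starts before the span, ends inside
        19: d_i <= b <= e <= d_next,    # void entirely inside the span
        20: d_i <= b <= d_next <= e,    # void starts inside, ends after the span
        21: b <= d_i <= d_next <= e,    # void covers the whole span
    }
    return [eq for eq, holds in checks.items() if holds]


# Illustrative delay span [5, 10] and a few hypothetical voids.
cases = {
    (3, 8): overlap_conditions(3, 8, 5, 10),
    (6, 9): overlap_conditions(6, 9, 5, 10),
    (7, 12): overlap_conditions(7, 12, 5, 10),
    (2, 15): overlap_conditions(2, 15, 5, 10),
    (0, 4): overlap_conditions(0, 4, 5, 10),
}
```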

Once a suitable void is found on each channel c for each delay i, the search function can be logically implemented by executing the following procedure in parallel:

 1: Search(c, i, ta, x)
 2: if bp(c,i) ≥ di(ta) + x or ep(c,i) ≤ di(ta) then
 3:     H(c, i) := −∞
 4:     T(c, i) := −∞
 5: else
 6:     H(c, i) := di(ta) − bp(c,i)
 7:     T(c, i) := ep(c,i) − di(ta) − x
 8:     w(c, i) := (H(c, i), T(c, i))
 9: end if
10: if H(c, i) ≥ 0 and T(c, i) ≥ 0 then
11:     Insert((c, i, w(c, i)), A(ta, x))
12: end if

This procedure implements the computation of the search function as defined in Section 3.1 as well as the validity test given by (5).

The computational burden of the Search(c, i, ta, x) procedure, with particular reference to lines 6 and 7, is assessed here by means of FPGA simulations. Results are shown in Fig. 6, where d = 5 represents the delayed arrival time di(ta), x = 8 is the packet length, c_b = 4 and c_e = 15 represent bp(c,i) and ep(c,i) respectively. The computation of H(c, i) and T(c, i) is started at the beginning of the simulation and out_en_H and out_en_T are two enable signals that rise after the computation is complete. At this point the resulting values are available as OUT_H = 1 and OUT_T = 2. In particular, the out_en_H signal takes two clock cycles to rise, while out_en_T takes three clock cycles, as expected due to the additional operation required. Since the two operations are executed in parallel, the head and tail gap measure requires three clock cycles. This simulation has been performed assuming a 5 ns clock period, although with a Stratix II board a maximum clock frequency of 241.31 MHz (i.e. a period of 4.144 ns) has been reached.
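The Search procedure and the simulated values can be cross-checked with a short Python sketch. It is a plain software rendering of the pseudocode for a single scheduling point, with hypothetical function names; the inputs are the ones used in the simulation (d = 5, x = 8, c_b = 4, c_e = 15).

```python
import math


def search(b, e, d_i, x):
    """Head and tail gaps (H, T) for a packet of length x starting at d_i
    in the void [b, e]; both are -inf when packet and void do not overlap."""
    if b >= d_i + x or e <= d_i:
        return (-math.inf, -math.inf)
    return (d_i - b, e - d_i - x)


def is_valid(H, T):
    """Validity test: the packet fits in the void without overlapping."""
    return H >= 0 and T >= 0


# Values used in the simulation of the search function.
H, T = search(b=4, e=15, d_i=5, x=8)

# Timing arithmetic quoted in the text.
latency_ns = 3 * 5                # three clock cycles at a 5 ns clock period
min_period_ns = 1e3 / 241.31      # clock period at 241.31 MHz, in ns
```

The computed gaps agree with the OUT_H = 1 and OUT_T = 2 values reported for the simulation, and the clock period at 241.31 MHz evaluates to about 4.144 ns.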

Fig. 6. Simulation waveforms of H(c, i) and T(c, i) computation.

4.3. Void list management

Last but not least, it is important to comment on the implementation of the void list management. The trigger may help in this task as well. It periodically sweeps the lists and shifts the pointers to keep them aligned with the current time reference.

During this operation, a void list element is to be considered expired as soon as it is not able to satisfy any possible future request, i.e. when the corresponding gap is not large enough to accommodate the earliest possible packet. Given the channel status at the generic time t, this happens when the void ending time for delay i = 0 is

ep(c,0) < t + toffset_min (22)

in the OBS case and

ep(c,0) < t + tproc + tsched + tsw (23)

in the OPS case. All elements marked as expired are removed from the system memory by a periodic garbage collection function.
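The expiry tests (22) and (23) and the subsequent garbage collection can be sketched as follows. The function names, the representation of a void as a (start, end) tuple and the numeric values are assumptions made for illustration only.

```python
def is_expired_obs(e_void, t, t_offset_min):
    """Expiry test (22): the void ends before the earliest possible burst."""
    return e_void < t + t_offset_min


def is_expired_ops(e_void, t, t_proc, t_sched, t_sw):
    """Expiry test (23): the void ends before the earliest possible packet."""
    return e_void < t + t_proc + t_sched + t_sw


def garbage_collect(voids, horizon):
    """Keep only the voids (b, e) whose ending time is not below the horizon,
    i.e. the right-hand side of (22) or (23)."""
    return [(b, e) for (b, e) in voids if not e < horizon]


# Illustrative channel list and OBS horizon t + t_offset_min = 5 + 4.
voids = [(2, 6), (8, 12), (15, 20)]
remaining = garbage_collect(voids, horizon=5 + 4)
```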

In the OPS case the trigger function is sufficient to allow the Search(c, i, ta, x) procedure to process new arrivals on the fly. The OBS case is different due to the advanced reservation approach adopted. In fact, when a burst header arrives at th the pointers must be shifted forward by an amount of time that depends on toffset. However, this does not mean that the skipped voids are tagged as expired, as future burst headers may refer to bursts with a smaller offset time. Therefore, these gaps must be kept in the lists for possible future use.

The additional processing burden due to the burst offset time alignment can be reduced by placing intermediate pointers at key elements in each list between the minimum and the maximum offset time, e.g. following the expected offset time distribution. According to the value of toffset carried by the burst header, the scheduler selects the closest intermediate pointer preceding the burst arrival time ta and then it is able to quickly reach the gap corresponding to the correct time reference.
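One possible software reading of the intermediate-pointer idea is sketched below. It assumes that each channel list is a time-ordered Python list of (start, end) voids and that intermediate pointers are indices into that list; the function name `locate_void`, the pointer placement and the numeric values are all hypothetical, and a hardware realization would differ.

```python
from bisect import bisect_right


def locate_void(voids, pointers, t_a):
    """Find the first void ending after t_a, starting from the closest
    intermediate pointer whose void starts at or before t_a.

    Returns (index of the void or None, number of list elements skipped).
    """
    starts = [voids[k][0] for k in pointers]   # start times of pointed voids
    j = bisect_right(starts, t_a) - 1          # closest preceding pointer
    k = pointers[j] if j >= 0 else 0
    steps = 0
    while k < len(voids) and voids[k][1] <= t_a:
        k += 1
        steps += 1
    return (k if k < len(voids) else None), steps


# Illustrative void list with pointers placed at elements 0, 3 and 5.
voids = [(0, 2), (3, 5), (6, 9), (10, 12), (14, 18), (20, 25), (27, 30)]
with_pointers = locate_void(voids, [0, 3, 5], t_a=15)
single_pointer = locate_void(voids, [0], t_a=15)
```

In this example both calls reach the same void, but the walk from the closest intermediate pointer skips one element instead of four, which is the kind of saving the intermediate pointers are meant to provide.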

After the void list updates mentioned above and in case the execution of the Search(c, i, ta, x) procedure results in a successful channel and delay assignment, the channel status must be updated since existing gaps are modified and new ones may be created. This task can be implemented with the following procedure, assuming that the packet is inserted in the gap pointed to by p(c, i):

 1: ModifyVoid(c, i)
 2: if T(c, i) ≥ Lm then
 3:     g := CreateNewVoid
 4:     eg := ep(c,i)
 5:     bg := ep(c,i) − T(c, i)
 6:     InsertVoid(g after p(c, i))
 7: end if
 8: if H(c, i) ≥ Lm then
 9:     ep(c,i) := bp(c,i) + H(c, i)
10: else
11:     RemoveVoid(p(c, i))
12: end if

A new void is created if the scheduled packet leaves enough room after its tail (lines 2–7). Similarly, the existing gap pointed to by p(c, i) is updated if the scheduled packet leaves enough room before its head (lines 8–9), otherwise it is removed (lines 10–11).
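The effect of ModifyVoid on a channel list can be illustrated with the following Python sketch. It assumes that the list is a time-ordered Python list of (start, end) tuples and that index k plays the role of pointer p(c, i); the function name and the numeric values are hypothetical, and Lm is read here as the minimum void length worth keeping.

```python
def modify_void(voids, k, H, T, L_m):
    """Return the channel list after scheduling a packet in void k.

    H and T are the head and tail gaps left by the packet; residual gaps
    shorter than L_m are dropped.
    """
    b, e = voids[k]
    updated = []
    if H >= L_m:
        updated.append((b, b + H))   # shrink the existing void (line 9)
    if T >= L_m:
        updated.append((e - T, e))   # new void after the packet tail (lines 3-6)
    return voids[:k] + updated + voids[k + 1:]


# Illustrative case: void [4, 15], packet in [5, 13], so H = 1 and T = 2.
channel = [(0, 2), (4, 15), (20, 25)]
keep_both = modify_void(channel, 1, H=1, T=2, L_m=1)
drop_head = modify_void(channel, 1, H=1, T=2, L_m=2)
```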

Fig. 7 shows the waveforms related to the simulation of the pointer shifting operation. The input parameter Time represents the right-hand side of either (22) or (23) for a generic delayed time reference di(t). The void currently pointed to by p(c, i) is represented by three output registers: Out_Gap_B stores the starting time bp(c,i), Out_Gap_E the ending time ep(c,i) and Out_Gap_N the address of the following void in the list. Values at the beginning of the simulation are Out_Gap_B = 7, Out_Gap_E = 14 and Out_Gap_N = 0x02.

When Time reaches the same value as Out_Gap_E, the gap is to be considered as expired and the scheduler must shift the pointer to the next void pointed to by Out_Gap_N. This action is triggered by the Next_EN enable signal, whose rising edge tells the scheduler to load the values addressed by Out_Gap_N into the registers, which change value into Out_Gap_B = 15, Out_Gap_E = 30 and Out_Gap_N = 0x03.

As shown in Fig. 7, the scheduler takes two clock cycles to update the pointer after a gap expires. The waveforms are shown with a clock period of 20 ns, although a maximum clock frequency of 155.09 MHz, corresponding to a clock period of 6.448 ns, has been reached with a Stratix III board. These results have been obtained without any fitter optimization, leaving space for further improvements to the scheduler performance.
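The pointer shifting behavior can be reproduced with a small linked-list sketch in Python. The first two entries use the register values quoted for the simulation; the address 0x01 of the first void, the third entry and the function name are assumptions added for illustration, and the expiry condition follows the waveform description (Time reaching Out_Gap_E).

```python
# Void memory: address -> (Out_Gap_B, Out_Gap_E, Out_Gap_N).
memory = {
    0x01: (7, 14, 0x02),     # initial register values (address assumed)
    0x02: (15, 30, 0x03),    # values loaded after the first shift
    0x03: (34, 40, None),    # hypothetical last element of the list
}


def shift_pointer(memory, addr, time):
    """Advance the pointer while the pointed void has expired, i.e. while
    Time has reached its ending time and a following void exists."""
    _, end, nxt = memory[addr]
    while nxt is not None and time >= end:
        addr = nxt
        _, end, nxt = memory[addr]
    return addr


before_expiry = shift_pointer(memory, 0x01, time=13)
at_expiry = shift_pointer(memory, 0x01, time=14)
```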

Fig. 7. Simulation waveforms of the pointer shifting operation.

5. Conclusion

Contention resolution through optimal channel scheduling is a common problem in both OPS and OBS nodes, due to the similar statistical multiplexing approaches. This paper presented a formulation of the packet or burst scheduling problem that is flexible enough to be capable of implementing many scheduling choices found in literature, while requiring simple minimum or maximum computations over a limited set of values and other simple operations.

Moreover, the paper discussed a possible implementation of the scheduler functions, mainly based on the use of combinatorial operations, supported by simulation of the hardware implementation. The results suggest a very limited computational complexity and a fast response time. This is an important result toward the actual implementation of a feasible and fast scheduler control logic.

The simulations presented above show that the search and select phases can be executed in the order of a few tens of nanoseconds. Of course, the whole scheduling is made of several such operations. It has been shown that the search tasks can be parallelized and the management of the void lists can be performed off-line. Therefore, it is possible to conclude that the time to complete the scheduling of each packet will be approximately the sum of a search and a select operation, ranging in the order of 50 ns. This may be considered a fairly reasonable performance, assuming an average length of the optical packet in the order of one microsecond.

Acknowledgement

The work described in this paper was carried out with the support of the BONE-project ("Building the Future Optical Network in Europe"), a Network of Excellence funded by the European Commission through the 7th ICT-Framework Programme.

References

[1] S.J. Ben Yoo, Optical packet and burst switching technologies for the future photonic Internet, IEEE/OSA Journal of Lightwave Technology 24 (12) (2006) 4468–4492.

[2] D.K. Hunter, M.C. Chia, I. Andonovic, Buffering in optical packet switches, IEEE/OSA Journal of Lightwave Technology 16 (12) (1998) 2081–2094.

[3] R.S. Tucker, P. Ku, C.J. Chang-Hasnain, Slow-light optical buffers: capabilities and fundamental limitations, IEEE/OSA Journal of Lightwave Technology 23 (12) (2005) 4046–4066.

[4] S. Yao, B. Mukherjee, S.B. Yoo, S. Dixit, A unified study of contention-resolution schemes in optical packet-switched networks, IEEE/OSA Journal of Lightwave Technology 21 (3) (2003) 672–683.

[5] F. Callegati, W. Cerroni, G.S. Pavani, Key parameters for contention resolution in multi-fiber optical burst/packet switching nodes (invited paper), in: Proc. of IEEE Broadnets 2007, Raleigh, NC, September, 2007.

[6] F. Callegati, H.C. Cankaya, Y. Xiong, M. Vandenhoute, Design issues for optical IP routers, IEEE Communications Magazine 37 (2) (1999).

[7] L. Tancevski, S. Yegnanarayanan, G. Castanon, L. Tamil, F. Masetti, T. McDermott, Optical routing of asynchronous, variable length packets, IEEE Journal on Selected Areas in Communications 18 (10) (2000) 2084–2093.

[8] F. Callegati, Optical buffers for variable length packets, IEEE Communications Letters 4 (9) (2000) 292–294.

[9] V. Eramo, et al., Cost evaluation of optical packet switches equipped with limited-range and full-range converters for contention resolution, IEEE/OSA Journal of Lightwave Technology 26 (4) (2008) 390–407.

[10] F. Callegati, et al., Congestion resolution in optical burst/packet switching with limited wavelength conversion, in: Proc. of IEEE Globecom 2006, San Francisco, USA, November, 2006.

[11] J.S. Turner, Terabit burst switching, Journal of High Speed Networks 8 (1) (1999) 3–16.

[12] Y. Xiong, M. Vandenhoute, H.C. Cankaya, Control architecture in optical burst-switched WDM networks, IEEE Journal on Selected Areas in Communications 18 (10) (2000) 1838–1851.

[13] F. Callegati, W. Cerroni, G. Corazza, Optimization of wavelength allocation in WDM optical buffers, Optical Networks Magazine 2 (6) (2001) 66–72.

[14] J. Xu, C. Qiao, J. Li, G. Xu, Efficient burst scheduling algorithms in optical burst-switched networks using geometric techniques, IEEE Journal on Selected Areas in Communications 22 (9) (2004) 1796–1811.

[15] G. Muretto, C. Raffaelli, P. Zaffoni, Effective implementation of void filling in OBS networks with service differentiation, in: Proc. of WOBS 2004, San José, CA, USA, October, 2004.

[16] Y. Chen, C. Qiao, X. Yu, Optical burst switching: a new area in optical networking research, IEEE Network 18 (3) (2004).

[17] S.Q. Zheng, Y. Xiong, M. Vandenhoute, Hardware implementation of channel scheduling algorithms for optical routers with FDL buffers, US Patent No. 6804255, October, 2004.

[18] Y. Chen, J. Turner, P.F. Mo, Optimal burst scheduling in optical burst switched networks, IEEE/OSA Journal of Lightwave Technology 25 (8) (2007) 1883–1894.

[19] P. Pavon-Marino, J. Veiga-Gontan, A. Ortuno-Manzanera, W. Cerroni, J. Garcia-Haro, PI-OBS: a parallel iterative optical burst scheduler for OBS networks, in: Proc. of HPSR 2009, Paris, France, June, 2009.

[20] F. Callegati, A. Campi, W. Cerroni, A cost-effective approach to optical packet/burst scheduling, in: Proc. of IEEE ICC 2007, Glasgow, UK, June, 2007.

[21] F. Callegati, A. Campi, W. Cerroni, A practical approach to scheduler implementation for optical burst/packet switching, in: Proc. of ONDM 2010, Kyoto, Japan, February, 2010.

[22] L. Tancevski, L. Tamil, F. Callegati, Non-degenerate buffers: an approach for building large optical memories, IEEE Photonics Technology Letters 11 (8) (1999) 1072–1074.

[23] V.M. Vokkarane, J.P. Jue, Segmentation-based nonpreemptive channel scheduling algorithms for optical burst-switched networks, IEEE/OSA Journal of Lightwave Technology 23 (10) (2005) 3125–3137.
