
A Approaches to Task Scheduling and Resource Allocation in Optical Gridsrichard.myweb.cs.uwindsor.ca/cs510/survey_shaabana.pdf · 2013-12-04 · Approaches to Task Scheduling and


Approaches to Task Scheduling and Resource Allocation in Optical Grids

ALA SHAABANA, University of Windsor

Using optical network technology in grid networks, we are able to achieve much higher bandwidth, essentially eliminating the bottleneck that throttled the previous generation of grids. However, we cannot directly apply the same algorithms to optical grid networks as we would to traditional grid networks, since optical grids have additional physical constraints and conditions that need to be satisfied. Research on this topic is relatively scarce, since optical networks themselves are not yet mature and researchers are still experimenting with the technology and have yet to fully utilize it. We present a comprehensive survey of key solutions that have been proposed to optimize task scheduling and resource allocation in optical grid networks. Note that although traditional algorithms can be adapted to optical grids, many other factors and constraints need to be taken into account in order to provide an optimal or near-optimal solution.

Contents

1 Introduction
2 Survey of Research
  2.1 Task scheduling and Resource Allocation Strategies on optical grids
    2.1.1 Adaptive task scheduling on optical grid
    2.1.2 Resource allocation strategies for data-intensive workflow-based applications in optical grids
    2.1.3 Coordinated resource scheduling in high-performance optical grids
    2.1.4 On Dimensioning optical grids and the impact of scheduling
  2.2 Approaches to Accuracy in Task Scheduling in Optical Grids
    2.2.1 On accurate task scheduling in Optical Grid
    2.2.2 Task Scheduling Accuracy Analysis in Optical Grid Environments
  2.3 Other Approaches to Resource Allocations and Task Scheduling
    2.3.1 Dynamic rescheduling of network resources with advance reservations in optical grids
    2.3.2 Task scheduling and lightpath establishment in optical grids
    2.3.3 Communication contention reduction in joint scheduling for optical grid computing
    2.3.4 Multi-cost job routing and scheduling in grid networks
3 Concluding Comments
4 Annotations
  4.1 Liang et al. 2006
  4.2 Guo et al. 2006
  4.3 Kim et al. 2007
  4.4 Develder et al. 2009
  4.5 Wang et al. 2007
  4.6 Guo et al. 2009
  4.7 Tanwir et al. 2008
  4.8 Liu et al. 2009
  4.9 Jin et al. 2009
  4.10 Stevens et al. 2009
5 References

ACM Transactions on Embedded Computing Systems, Vol. V, No. N, Article A, Publication date: January YYYY.

1. INTRODUCTION

Recently, there has been a move towards grids implemented with optical networks, as traditional networks are becoming a bottleneck with the emergence of newer technologies that require more bandwidth. This survey presents a comprehensive analysis of the research done so far on task scheduling and resource allocation for applications on optical grids. The research papers for this survey were identified using Google Scholar, ACM and IEEE sources. There were 4 conference papers and 6 journal articles which deal with aspects of task scheduling and/or resource allocation strategies specifically in optical grid networks.

Liang et al.’s proposal [2006], which considered the allocation of network resources for data transfer in optical grids during scheduling, was one of the first papers in which the authors treated the network as a resource on the same level as the computation and storage resources. In the same year, Guo et al. [2006] released their work, which proposes algorithms for task scheduling and resource allocation that take network resource contention into account. We further present the work of Kim et al. [2007], which investigates coordinated resource scheduling algorithms, introduces a simple scheduling algorithm, and then evaluates its impact on grid performance. In the following years, research focused more on accurate scheduling. We present the work of Wang et al. [2007], which investigates accurate task scheduling in optical grids, models practical data transfer and task execution, and finally implements a theoretical scheduling algorithm to show significant improvement over conventional methods. Later, Guo et al. [2009] proposed improving the accuracy of scheduling algorithms by addressing the accuracy deviations observed in a real optical grid environment. Develder et al. [2009] studied the impact of task and resource scheduling on optical grids, and concluded that different locations of clusters yield different results. The remaining 4 papers are considered important: although they do not deal with the problem of task scheduling and resource allocation directly, they do deal with different aspects that may affect the processes of task scheduling and/or resource allocation. For instance, Liu et al. [2009] argue that lightpath establishment must be considered jointly with task scheduling in order to achieve the best performance. Interestingly, only one of these papers cites another. However, all of these papers cite the same research on task scheduling and resource allocation in either traditional grid networks or theoretical models; this may be a good indication that this area of research is still relatively new.

We have divided the surveyed papers into 3 sections. Section I pertains to research which directly deals with the allocation of network resources when scheduling. Section II explores research which aims to improve the accuracy of task scheduling algorithms on optical grids. Finally, Section III presents papers which suggest that other factors and solutions (such as modeling the problem as a directed acyclic graph) must be taken into consideration for the optimization of task scheduling and resource allocation in optical networks.

2. SURVEY OF RESEARCH

2.1. Task scheduling and Resource Allocation Strategies on optical grids

The research papers in this section deal directly with approaches to task scheduling and resource allocation in optical grids. The earliest research on task scheduling is the work by Liang et al. [2006], which is also one of the earliest papers to consider the co-allocation or co-scheduling of both optical network and computing resources. There are 3 conference papers in this section, along with one journal paper.

2.1.1. Adaptive task scheduling on optical grid. As grid users create more diverse and complex applications, the bandwidth of the electrical network becomes the bottleneck of grid applications. This makes optical grids, with their massive bandwidth and low cost, an attractive alternative. A key aspect of grid environments is the scheduling of tasks and resources. However, Liang et al. note that few proposed methods consider communication contention (such as resource conflicts) on optical grids, and those that do rarely consider co-allocating or co-scheduling both the optical network and the computing resources. Liang et al. refer to previous work by Banerjee et al. [2005]. They also cite the work by Sinnen et al. [2005] and compare its algorithm with their proposed algorithm. They note that Banerjee et al. [2005] do not consider co-allocating or co-scheduling the optical network resources and the computing resources, and state that such a method is bound to hit a performance ceiling, since sequential allocation can only go so far.

The authors present an optical grid model based on optical network characteristics. In this model, the network is treated as a resource on the same level as the computation and storage resources, so that the network itself is offered as a service to the grid like any other resource. The authors also take into consideration the allocation of network resources for data transfers, to make job scheduling as close to reality as possible. Furthermore, they present a communication contention-aware algorithm based on list scheduling to minimize the total execution time for a given set of tasks on an optical grid, using a modified version of Dijkstra’s algorithm for routing.
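The idea of contention-aware list scheduling with adaptive routing can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of Liang et al.: the function names, the data layout, the rule that a transfer holds every link on its path until the data arrives, and the particular Dijkstra modification (minimizing the time at which all links on a path are idle instead of the hop count) are all hypothetical choices made here. The network is assumed to be connected.

```python
import heapq

def earliest_start_route(links, free_at, src, dst, ready):
    """Dijkstra variant (an assumption of this sketch): find the path whose
    busiest link becomes idle soonest, so the transfer can start earliest.

    links:   dict node -> list of neighbouring nodes
    free_at: dict frozenset({u, v}) -> time at which that link becomes idle
    ready:   time at which the data is available at src
    Returns (start_time, path) or (None, []) if dst is unreachable.
    """
    best = {src: ready}
    prev = {}
    heap = [(ready, src)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v in links.get(u, []):
            cand = max(t, free_at.get(frozenset((u, v)), 0.0))
            if cand < best.get(v, float("inf")):
                best[v] = cand
                prev[v] = u
                heapq.heappush(heap, (cand, v))
    if dst not in best:
        return None, []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]

def schedule(order, preds, work, volume, nodes, links, bandwidth=1.0):
    """List scheduling: take tasks in priority order (assumed to respect
    precedence) and place each on the node giving the earliest finish time,
    reserving network links for every incoming data transfer."""
    free_at = {}                         # link -> time it becomes idle
    node_free = {n: 0.0 for n in nodes}  # node -> time it becomes idle
    placed = {}                          # task -> (node, finish_time)
    for task in order:
        best = None
        for n in nodes:
            trial = dict(free_at)        # tentative link reservations
            ready = 0.0
            for p in preds.get(task, []):
                p_node, p_finish = placed[p]
                if p_node == n:
                    arrive = p_finish    # no transfer needed
                else:
                    start, path = earliest_start_route(links, trial, p_node, n, p_finish)
                    arrive = start + volume[(p, task)] / bandwidth
                    for u, v in zip(path, path[1:]):
                        trial[frozenset((u, v))] = arrive
                ready = max(ready, arrive)
            finish = max(ready, node_free[n]) + work[task]
            if best is None or finish < best[0]:
                best = (finish, n, trial)
        finish, n, free_at = best        # commit the winning reservations
        node_free[n] = finish
        placed[task] = (n, finish)
    return placed
```

In this sketch a fixed shortest-route scheduler would correspond to replacing `earliest_start_route` with a plain hop-count search; the adaptive variant instead detours around links that are still busy.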

The authors used the national 863 high-performance broadband information network in their simulation. Further, they defined the resource type and capacity of each device connected to the optical network, using three main resource types: storage resources, I/O resources and computing resources. Using simulated workloads of 15, 20, 30, 40, 50 and 60 jobs, the authors claim that their simulations show that the proposed adaptive scheduling algorithm performs better than the fixed shortest-route algorithm proposed by Sinnen et al. [2006]. Liang et al. [2006] claim that the proposed method demonstrates improved accuracy and scheduling efficiency over conventional algorithms. In the future they plan to consider fiber allocation granularity as well as multiple resource types within one independent system, which would reduce communication.

2.1.2. Resource allocation strategies for data-intensive workflow-based applications in optical grids. Scientific simulations and remote visualizations are data-intensive applications that are often run on a grid architecture. Since these applications are really a set of tasks that are essentially executed sequentially, each task must perform various operations such as computing, transporting and displaying the data. This creates the problem of scheduling these tasks and allocating their resources. Guo et al. refer to the work of Braun et al. [2001], Binato et al. [2000], He et al. [2003], Li et al. [2005], and Topcuoglu et al. [2002]. The authors also cite the works of Blythe et al. [2005], Jia et al. [2005], and Mandal et al. [2005] and compare those solutions with the proposed solution. Guo et al. [2006] note that the research of Braun et al. [2001], Binato et al. [2000], He et al. [2003], Li et al. [2005], and Topcuoglu et al. [2002] concentrates on matching individual tasks with their respective resources and does not seek an efficient overall allocation. Hence these approaches do not consider tasks that come later in the workflow, and may result in poor overall assignments. Furthermore, they note that the solutions of Blythe et al. [2005], Jia et al. [2005], and Mandal et al. [2005] search for an efficient allocation for the entire workflow, which depends on predictions of future task performance. However, they note that this research also assumes that computational resources are fully connected and that the network bandwidth is sufficient for unlimited parallel data transfers, which is not the case in real networks.

In order to minimize the execution time required for a given workflow, the authors propose two algorithms for resource allocation and task scheduling in an optical grid for data-intensive workflow-based applications: a task-based approach and a workflow-based approach. The authors also claim to have considered network resource contention in implementing these algorithms. They used an optical grid simulator to investigate the performance of the resource allocation approaches. The workflows in the experiments consisted of randomly generated directed acyclic graphs. To generate such graphs, the authors randomly chose values for the following 3 parameters:

(1) Number of nodes.
(2) Average number of edges per node.
(3) Communication to computation ratio.
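A generator driven by these three parameters could look like the sketch below. It is only an illustration: the function name `random_dag`, the uniform cost ranges, and the way edges are sampled are assumptions made here, and the communication to computation ratio is taken to mean total communication cost divided by total computation cost.

```python
import random

def random_dag(num_nodes, avg_edges_per_node, ccr, seed=None):
    """Return (work, comm) for a random directed acyclic graph.

    work: dict task -> computation cost
    comm: dict (u, v) -> communication cost of edge u -> v
    """
    rng = random.Random(seed)
    # Hypothetical computation costs, drawn uniformly.
    work = {v: rng.uniform(1.0, 10.0) for v in range(num_nodes)}
    # Only allow edges u -> v with u < v, so the graph is acyclic by construction.
    candidates = [(u, v) for u in range(num_nodes) for v in range(u + 1, num_nodes)]
    num_edges = min(len(candidates), round(avg_edges_per_node * num_nodes))
    edges = rng.sample(candidates, num_edges)
    # Draw raw communication costs, then rescale them so that
    # sum(comm) / sum(work) equals the requested ratio.
    raw = {e: rng.uniform(1.0, 10.0) for e in edges}
    scale = ccr * sum(work.values()) / sum(raw.values()) if raw else 0.0
    comm = {e: c * scale for e, c in raw.items()}
    return work, comm

work, comm = random_dag(num_nodes=20, avg_edges_per_node=2, ccr=0.5, seed=1)
```

A high ratio yields data-intensive workflows and a low ratio yields computationally intensive ones, which is the distinction the reported results depend on.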

Furthermore, the resource allocation algorithms were run with a time limit of 200 seconds. The authors claim that while there is no significant difference for computationally intensive cases, the workflow-based algorithm outperforms the task-based approach for data-intensive cases, producing schedules with significantly lower execution times. However, they note that the workflow-based algorithm takes more time to run than the task-based approach, making it less suitable for workflows containing several thousand tasks.

In summary, the authors propose task-based and workflow-based allocation approaches for workflow-based applications, and claim that the two have similar performance for computationally intensive cases, while the workflow-based approach performs better for data-intensive cases because it can begin transferring large data sets earlier and make decisions based on global measures of performance.

2.1.3. Coordinated resource scheduling in high-performance optical grids. In grid computing, optical transport provides the means to transmit large amounts of data at low cost and with high reliability. However, the complexity of large-scale resource sharing and coordination is an important issue that must be resolved in order to fully utilize optical grids. Kim et al. cite no related work, but present results from different coordinated resource scheduling algorithms, namely First-Come-First-Serve (FCFS), Proportional-backoff (P-backoff) and Weighted-backoff (W-backoff). Specifically, Kim et al. focus on high-performance optical networks as the underlying architecture for all communications [Kim et al. 2007]. The authors then present performance measures for CPU selection methods. Finally, the authors evaluate the effect of network path length limits on overall grid scheduler performance.

The authors simulate a network which has full wavelength conversion. They randomly designate network nodes as CPU, storage or user nodes. The authors also assume that each link in the network has 32 wavelengths and each CPU node consists of 64 CPUs. The authors note that for each connection on the grid only the shortest paths are used, from storage to CPU and from CPU to user. They report a significant improvement in performance when resource scheduling and/or allocation schemes are given enough freedom to choose any CPU. However, the authors note that this option may not always be feasible, as the size of the grid and the nature of the application also play a role in grid utilization. The authors further found that, of the three scheduling algorithms considered, FCFS performed the worst, P-backoff performed slightly better, and W-backoff performed slightly better than P-backoff. The authors also found that increasing the path length limit improves overall system performance. However, as the number of hops grows further, or when the grid scheduler is given the flexibility of choosing any CPU in the system, longer paths no longer have a significant effect. Finally, they found that since network congestion occurs around local CPU resources, allowing longer paths makes it possible to route around the congested network resources and reach non-local resources [Kim et al. 2007].

2.1.4. On Dimensioning optical grids and the impact of scheduling. Network dimensioning determines how much capacity is needed for the network to be able to transport a given amount of traffic. In grids, jobs are conventionally routed using anycast, which implies the absence of a clearly defined source-destination traffic (or job) matrix, since only the origin of a grid job is known, not its destination. Develder et al. acknowledge that the anycast routing principle directly affects scheduling and routing decisions, such as where to execute a job in the grid system and how to get it to that destination. Although this added freedom provides an edge over traditional grids, it also incurs multi-cost routing problems that incorporate the state of both the network and the computational/storage grid resources. Develder et al. surmise that network dimensioning alone is not enough, and that one must also dimension the computational and storage resources, making the problem close to, if not actually, NP-hard.

The authors acknowledge a variety of solutions to the network dimensioning problem, in the form of heuristics and Integer Linear Programs (ILPs). They state that these algorithms vary in their approach depending on several factors, such as the topologies and technologies implemented, the design criteria, single or multi-period planning, and single-domain or hierarchical networks. However, they note that applying any of these approaches to dimensioning grids raises the problem of accurately estimating the traffic matrix. Moreover, the authors mention that in addition to the network, one must also dimension the computational and/or storage resources, and speculate that jointly determining both network and server dimensions may be NP-hard.

The authors also claim that related work on dimensioning grids is uncommon. Specifically, they cite Thysevaert et al. [2005] and De Leenheer et al. [2007] as the only attempts, at the time of writing, to solve the problem of dimensioning network grids.

The authors address the grid dimensioning problem and present a solution for deciding where to provide server capacity and resources, where to process the submitted jobs, and finally how to calculate the required network dimensions. They propose a phased solution, first dimensioning the servers and then the network [Develder et al. 2009]. In order to obtain a more realistic case study, the authors performed measurements on a real-world grid, deployed in the frame of the Large Hadron Collider (LHC) experiments and the Enabling Grids for E-sciencE (EGEE) project (EGEE/LCG). The authors considered two network topology and job demand cases. The first uses a fairly densely meshed European backbone network, with artificially generated job arrival rates at each site [Develder et al. 2009]. The second is based on measurement data from EGEE/LCG, with the average job duration derived from real-life trace files. The authors applied the proposed dimensioning strategies to both cases.

They claim that the two case studies on European topologies show that placing server capacity where many jobs arrive is important for minimizing network bandwidth requirements. With regard to grid scheduling, a simple shortest-path strategy, preferring closer server sites, led to the lowest bandwidth demands [Develder et al. 2009]. With respect to choosing an appropriate number of server sites, the authors speculate that with a larger number of server sites the total server capacity becomes fragmented, reducing opportunities for statistical multiplexing, whereas with fewer server sites the average distance that jobs need to travel becomes too large. Hence, they claim that the optimal number of server sites depends on the scheduling algorithm and the server site dimensioning strategy, and they acknowledge that their case studies did not cover the full range of server counts. Moreover, they claim that research on dimensioning grids is uncommon and that it is still an unexplored area. Finally, they claim that their experiments show that the grid scheduling algorithm has a substantial effect on the required network capacity.
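The fragmentation argument can be illustrated with the classical Erlang B loss formula. This is an assumption made purely for illustration: the survey does not state which queueing model Develder et al. use, and the load figures and blocking target below are hypothetical. The sketch compares the number of servers needed when one site absorbs the whole load with the number needed when the same load is split evenly over four independent sites.

```python
def erlang_b(servers, load):
    """Blocking probability when `load` Erlangs are offered to `servers`
    servers, using the standard numerically stable recursion."""
    b = 1.0  # with zero servers every job is blocked
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)
    return b

def servers_needed(load, target_blocking):
    """Smallest number of servers keeping blocking at or below the target."""
    n = 0
    while erlang_b(n, load) > target_blocking:
        n += 1
    return n

# Hypothetical numbers: 40 Erlangs of job load and a 1% blocking target.
pooled = servers_needed(40.0, 0.01)          # one large server site
fragmented = 4 * servers_needed(10.0, 0.01)  # four sites, 10 Erlangs each
print(pooled, fragmented)
```

Under this model the single pooled site needs fewer servers in total than the four separate sites, which is the statistical multiplexing gain that is lost as capacity is fragmented. The competing effect, longer average job travel distance with fewer sites, is not captured by this sketch.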

2.2. Approaches to Accuracy in Task Scheduling in Optical Grids

This section presents papers which deal with the accuracy of task scheduling in optical grids. Incidentally, the same authors worked on both of these papers. They first presented the work in Wang et al. [2007b], in which they proposed a theoretical scheduling model. Then, in Guo et al. [2009], the same authors extended their work with a modified, more realistic model, which proved to be more effective than the former. The work presented by Wang et al. [2007] is a journal article, while the work presented by Guo et al. [2009] is a conference paper.

2.2.1. On accurate task scheduling in Optical Grid. Task scheduling is an important aspect of the efficient utilization of grid resources: it allocates computational resources for tasks and optical network resources for communications in optical grids. The predicted eventual finish time of a task, known as the scheduling span, is used to evaluate scheduling efficiency. Moreover, Wang et al. claim that one must also consider whether the actual execution proceeds as the schedule predicts. They conclude that the problem thus lies in how to guarantee that the schedule conforms to practice. Further, they cite past work which simulates task scheduling scenarios well, but note that it targets packet-switched electrical networks as opposed to optical networks, which have special scenarios and constraints to consider. Therefore they conclude that while traditional research helps with optical network research, research on accurate task scheduling in optical grids is more valuable in this context. Primarily, Wang et al. [2007b] compare the work described in this paper to their previously developed theoretical Optical Grid Earliest Finish Time (OGEFT) algorithm presented in Wang et al. [2007a]. The authors also note the work done by Sinnen et al. [2006], a practical and realistically modeled running scenario for accurate task scheduling. With regard to their own previous work, the authors state that the original OGEFT algorithm does not consider an optical grid’s practical running scenarios, such as light-path establishment (which is why they call it theoretical), and claim that the newly proposed algorithm uses a more realistic model and hence delivers better results. Further, the authors claim that the work done by Sinnen et al. [2006] is only accurate for traditional packet-switched networks, and is not realistically applicable to optical networks, which are subject to more constraints and scenarios.

Along with investigating and modeling the practical data transfer and task execution scenarios, the authors present a modified version of their Optical Grid Earliest Finish Time (OGEFT) algorithm to schedule tasks. The authors state that this algorithm is a list scheduling algorithm, in that it sorts tasks into a list according to their execution priorities and allocates grid resources accordingly. The authors also state that this algorithm uses optical network routing algorithms to allocate a light-path for transferring grid data. To evaluate the scheduling accuracy of the new, realistic OGEFT scheduling scheme, the authors set up the same testbed used in their previous work to evaluate the theoretical OGEFT. The testbed consists of three computational resources located at Shanghai Jiao Tong University, 20 km apart and networked by a GMPLS-controlled Automatically Switched Optical Network (ASON), with each switching node capable of switching at 360 Gbps with VC-4 granularity and each link containing two STM-64 fibers. The execution of a task v is implemented as |v| seconds of matrix computation on a computational resource, where |v| is the computing amount of the task. A task set contains 10, 20 or 50 tasks.

After incorporating the new, more realistic model into OGEFT, the authors claim to have found that the accuracy deviations are much lower than those previously obtained with the theoretical model. Hence they conclude that their realistic model can improve task scheduling accuracy significantly. The authors claim that this paper [Wang et al. 2007b] presents a significantly improved and more realistic model of their OGEFT algorithm.
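Why ignoring light-path establishment makes predictions drift from reality can be seen in a small worked example. The sketch below is hypothetical: the chain of tasks, the 10 Gbit/s bandwidth, the 0.5 s setup delay and the function name `predicted_finish` are illustrative assumptions and are not taken from the OGEFT papers.

```python
def predicted_finish(chain, bandwidth_bps, setup_delay=0.0):
    """Predicted finish time of a chain of tasks, each running on a different
    resource so that every intermediate result crosses a light-path.

    chain: list of (compute_seconds, output_bits), in execution order
    setup_delay: seconds needed to establish a light-path before each transfer
    """
    t = 0.0
    for i, (compute_seconds, output_bits) in enumerate(chain):
        t += compute_seconds
        if i < len(chain) - 1:  # the last task sends nothing onward
            t += setup_delay + output_bits / bandwidth_bps
    return t

chain = [(2.0, 20e9), (3.0, 10e9), (1.0, 0.0)]  # hypothetical workload
theoretical = predicted_finish(chain, 10e9)                 # ignores setup
realistic = predicted_finish(chain, 10e9, setup_delay=0.5)  # includes setup
print(theoretical, realistic)  # 9.0 10.0
```

If the testbed really spends time on each light-path setup, a schedule computed with the first call underestimates the finish time, and the gap grows with the number of transfers. Modeling such overheads in the scheduler is the kind of refinement the realistic model is described as adding.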

2.2.2. Task Scheduling Accuracy Analysis in Optical Grid Environments. Task scheduling is a prominent issue in optimizing the performance of grid applications in optical grids and in improving the utilization of both grid resources and optical network resources. In order to obtain the shortest possible execution time (also called the scheduling length), task scheduling finds spatial and temporal assignments of applications onto the optical grid environment. Guo et al. claim that the accuracy of a task scheduling algorithm is important for utilizing resources efficiently and for guaranteeing quality of service (QoS) to grid applications. The authors cite the work by Sinnen et al. [2004 & 2005], and also cite work by various other authors who have attempted to solve the NP-hard task scheduling problem with near-optimal solutions. Further, the authors state that the solution by Sinnen et al. [2006] achieved significantly improved accuracy.

The authors cite multiple heuristic algorithms that achieve near-optimal solutions to the problem of task scheduling in conventional grid environments. However, they claim that most of these algorithms are based on a simplistic model which assumes that all processors are fully connected. The authors acknowledge that other, more recent algorithms consider the topology of the communication network and are aware of contention for network resources, such as the one proposed by Sinnen et al. [2005]. However, the authors claim that while considering contention in task scheduling improved accuracy significantly, the experimental results of Sinnen et al. [2004] show that the scheduling accuracy achieved is still unsatisfactory. The authors also state that these solutions are not directly transferable to optical grid environments, as optical grids have many other factors and constraints that need to be accounted for. They claim that in order for these algorithms to function in an optical grid environment, they must be designed to co-allocate grid resources and optical network resources, and a new data model must be defined to obtain accurate task scheduling in real optical grid environments. Finally, the authors cite Banerjee et al. [2008], who propose scheduling algorithms for large file transfers in an optical grid system, but claim that the accuracy of the schedules generated by those algorithms is yet to be investigated.

The authors propose a theoretical task scheduling algorithm, followed by a demonstration showing the deviation of the scheduling length from the actual finish time in a real optical grid environment. Moreover, the authors claim to have analyzed several factors that are likely to affect scheduling in optical grid networks, and to have incorporated the revised parameters into the theoretical scheduling model, proposing a more realistic scheduling model as a result.

By varying two key parameters, the communication-computation ratio (CCR, defined as the sum of the communication costs divided by the sum of the task computation costs) and the Directed Acyclic Graph (DAG) size, the authors randomly generated 20 DAGs. To evaluate the proposed algorithms, the scheduler employs the theoretical and realistic task scheduling algorithms to obtain the scheduling results and scheduling length. The scheduler then asks the grid resources to execute the tasks of the DAG and directs the optical network to set up or release the lightpaths according to the scheduling results in the optical grid testbed [Guo et al. 2009]. Finally, the authors claim to have measured and compared the actual finish time for the DAG with the scheduling length to calculate the accuracy deviations. The authors claim that the results show that their realistic task scheduling algorithm improves the overall scheduling accuracy. They also note that the improvement in accuracy increased as the CCR increased. Finally, the authors claim that these results prove that the realistic task scheduling algorithm can achieve better performance for data-intensive applications in optical grid environments [Guo et al. 2009]. The authors claim that there had been no previous research on analyzing the accuracy of task scheduling in optical grid networks at the time of writing their paper. In regard to the scheduling algorithm results, the authors claim the realistic task scheduling algorithm could significantly improve overall task scheduling accuracy. This paper supersedes the work by Wang et al. [2007], wherein the same contributors presented a new model to augment their previously proposed theoretical model for accurate task scheduling into a more realistic one.
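To make the CCR definition above concrete, the following is a minimal sketch that computes it for a small task graph. The function name, the dictionary layout, and the three-task example are illustrative assumptions and are not taken from Guo et al. [2009].

```python
def communication_computation_ratio(comm_costs, comp_costs):
    """Return CCR = (sum of communication costs) / (sum of computation costs)."""
    total_comp = sum(comp_costs.values())
    if total_comp <= 0:
        raise ValueError("total computation cost must be positive")
    return sum(comm_costs.values()) / total_comp


# Hypothetical three-task DAG with edges a -> b and a -> c.
comp_costs = {"a": 10.0, "b": 10.0, "c": 20.0}      # cost per task (DAG node)
comm_costs = {("a", "b"): 30.0, ("a", "c"): 50.0}   # cost per dependency (DAG edge)

ccr = communication_computation_ratio(comm_costs, comp_costs)
```

A CCR above 1 indicates a workload dominated by data transfer, which is the regime in which the authors report the largest accuracy gains.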

2.3. Other Approaches to Resource Allocations and Task Scheduling

The papers presented in this section suggest that other factors and solutions must be taken into consideration (such as modeling the problem as a directed acyclic graph) for the optimization of task scheduling and resource allocation in optical networks.

2.3.1. Dynamic rescheduling of network resources with advance reservations in optical grids. Optical grids help make possible many eScience and eBusiness applications by supporting their large bandwidth requirements. Tanwir et al. note that these emerging applications have led to a rapid advancement in optical network technologies; however, the area that is still lacking is the link between the grid applications and
the underlying network technologies which make grids effective. In order to meet the complex demand patterns of grid applications and to optimize overall network utilization, Tanwir et al. claim it is necessary to abstract and encapsulate optical network resources into manageable and dynamically divided grid entities. They cite multiple online projects and note that they consider the problem of building reconfigurable, dynamic, adaptable optical grids. The authors note that these projects have developed grid middleware that makes optimized use of an optical network as a virtual coordinated resource [Tanwir et al. 2008]. The authors also cite work by Zheng et al. [2002], which discusses different designs for effective Routing and Wavelength Assignment algorithms for different types of advance reservations and presents algorithms for requests having specific start times and specific durations (STSD), specific start times and unspecified durations (STUD), and unspecified start times and specific durations (UTSD). Moreover, the authors cite the work by Burchard [2005], which discusses the properties of advance reservations and proposes an architecture for a bandwidth distribution system to improve the performance of a network based on the acquired knowledge of these reservations. The authors also cite Foster et al. [2009], who presented the Globus Architecture for Reservation and Allocation (GARA), which supports advance reservations for various types of resources for grid optical networks. Further, the authors cite Curti et al. [2005], who discussed advance reservation of heterogeneous network paths in grid computing and, in order to integrate path management with grid information and authentication services, proposed a network resource hierarchy. Furthermore, the authors cite multiple previous works which tried to improve advance reservations. They cite the work of Wang et al. [2005], who proposed a sliding scheduled traffic model and a demand time conflict resolution scheme to maximize resource usage in a network. Finally, the authors cite the work of He et al. [2006], which proposes a Flexible Advance Reservation Model (FARM) and describes how to implement this model in the meta-scheduling problem [Tanwir et al. 2008]. However, they do not identify any shortcomings of the previous work mentioned. The authors focus on evaluating and comparing various algorithms for advance lightpath scheduling that can be implemented in a Domain Network Resource Manager (DNRM). In the end the authors hope to find the best scheduling policy for a grid network resource manager that improves network utilization and minimizes blocking probability simultaneously.

The authors conducted simulation experiments on a 14-node National Science Foundation Network (NSFNET) topology with 42 unidirectional links and a 33-node Gigabit European Advanced Network Technology (GEANT) topology with 94 unidirectional links. The authors assumed 10 wavelengths on each link and full wavelength conversion on each link, with 30-minute time slots. Further, requests are assumed to arrive according to a Poisson process, and all requests need to reserve a lightpath with bandwidth equal to one wavelength. The duration of a reservation is uniformly distributed. In order to simulate a more realistic environment, the authors generated the intermediate period between the arrival of the request and the start of the reservation using a discrete probability distribution. Moreover, the source and destination nodes for the requested connection are selected randomly using a uniform distribution, and simulations were run for long durations with a large number of arrivals such that a sufficiently small confidence interval, within 1% of the mean with 95% confidence, is reached. Finally, the authors used a link failure model where a link is randomly selected from the network as a failed link, and the time to failure was exponentially distributed with a mean of 80 slots (recovery time is also exponentially distributed with a mean of 48 slots). The authors assumed a link fails
when all wavelengths on that link fail.
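The link failure model described above can be sketched as follows. This is only an illustration under stated assumptions: the means of 80 and 48 slots come from the text, while the function name, the event tuple layout, and the choice to start the next time-to-failure clock only after the previous repair (so that at most one link is down at a time) are assumptions that the survey does not confirm.

```python
import random

MEAN_TIME_TO_FAILURE = 80.0   # slots, as reported for the simulation setup
MEAN_RECOVERY_TIME = 48.0     # slots, as reported for the simulation setup


def sample_link_failures(links, horizon_slots, rng):
    """Return a list of (fail_at, repaired_at, link) events up to the horizon.

    Times to failure and recovery times are exponentially distributed, and the
    failed link is chosen uniformly at random from `links`.
    """
    events = []
    now = 0.0
    while True:
        fail_at = now + rng.expovariate(1.0 / MEAN_TIME_TO_FAILURE)
        if fail_at >= horizon_slots:
            return events
        repaired_at = fail_at + rng.expovariate(1.0 / MEAN_RECOVERY_TIME)
        events.append((fail_at, repaired_at, rng.choice(links)))
        now = repaired_at  # assumption: the next failure clock starts after repair


# Hypothetical three-link network, simulated for 10,000 slots.
events = sample_link_failures(
    [("A", "B"), ("B", "C"), ("C", "A")], 10_000, random.Random(7)
)
```

Passing an explicitly seeded `random.Random` instance keeps runs repeatable, which matters when averaging over many simulations to reach a target confidence interval.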

The authors claim that their simulation results show that minimum-cost adaptive routing based on link usage provides the lowest blocking probability. Moreover, searching for k alternate paths within the scheduling window significantly improved performance. In the case of wavelength assignment, the authors claim that a scheme that minimizes unused leading or trailing gaps in wavelengths gave the best results. Finally, in the case of failure recovery, the authors claim to have found that a short re-routing interval gives a large number of terminated connections while a long interval reroutes unaffected connections. Hence the authors conclude that there is a tradeoff between the two and that the best results are obtained when the re-routing interval is based on the moving average of the historical failure times and is updated continuously.
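As a rough illustration of a re-routing interval driven by a moving average, the sketch below keeps a fixed-size window of observed failure durations and reports their mean. The class name, the window size, the fallback value, and the reading of "historical failure times" as failure durations are all assumptions; the survey does not give the authors' exact update rule.

```python
from collections import deque


class ReroutingInterval:
    """Moving average over the most recent observed failure durations (in slots)."""

    def __init__(self, window, initial):
        self._durations = deque(maxlen=window)  # oldest samples drop out automatically
        self._initial = initial                 # used until the first failure is seen

    def record_failure(self, duration_slots):
        self._durations.append(duration_slots)

    @property
    def value(self):
        if not self._durations:
            return self._initial
        return sum(self._durations) / len(self._durations)


interval = ReroutingInterval(window=3, initial=48.0)
before_any_failure = interval.value
for duration in (10.0, 20.0, 30.0):
    interval.record_failure(duration)
after_three = interval.value
interval.record_failure(40.0)      # the 10.0 sample leaves the window
after_four = interval.value
```

Updating the value after every recorded failure corresponds to the "updated continuously" behaviour described above, while the window size controls how quickly old failures are forgotten.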

The authors claim they have implemented periodic reconfiguration of reserved lightpaths in their simulations, which improved performance, though not very significantly. Hence, more work needs to be done to improve performance further.

2.3.2. Task scheduling and lightpath establishment in optical grids. Grid applications are slowly becoming more data-intensive, and require huge data transfers between multiple geographically separated computing nodes. Liu et al. claim that in order for WDM networks to efficiently support this type of emerging application in the future, the traditional approaches to establishing lightpaths between given source-destination nodes will not be sufficient, nor will existing application-level approaches that consider computing resources but ignore optical layer connectivity. The authors claim that instead, one must jointly consider lightpath establishment and task scheduling to achieve the best performance. They cite the control software and technologies proposed in the works by Simeonidou et al. [2005], De Leenheer et al. [2006], and Zervas et al. [2007] to efficiently control and support grid services. The authors further cite the work done by Wang et al. [2007], which considered the problem of jointly scheduling computational and networking resources in one Directed Acyclic Graph (DAG). Further, the authors cite the book “Task Scheduling in Parallel and Distributed Systems” by El Rewini et al. [1994], and argue that in such work the underlying physical network’s connectivity is taken for granted, and that at best it is assigned a known “communication cost” between two nodes [Liu et al. 2009]. Further, they claim that this cost is assumed not to depend on where other tasks are assigned. This is not the case in the real world, where in a WDM network supporting dynamic jobs, link usage is dynamic and the “communication cost” between two nodes depends on the assignment of other tasks.

The authors consider a similar problem where lightpaths are required for communicating nodes and multiple jobs may arrive one after another; accordingly, the objective is to minimize the resource usage, subject to the job’s deadline constraint (if any). Additionally, when the objective is to minimize the completion time of a job, the authors claim to obtain an optimal solution for a pipelined DAG. Based on this result, the authors further devise an efficient algorithm for a general DAG. Moreover, the authors claim that this work differs from previous work on traditional lightpath establishment, in which the source and destination pairs are given. Also, different Asynchronous Array of Simple Processors (ASAP) networks need to be formed for different jobs, so the problem of forming an ASAP network for each job differs from virtual topology design. The authors implemented the proposed
algorithms over a WDM network with 100 wavelengths per link and 24 nodes (USNET topology), with each node connected to one computing node. Further, they assigned random values to the initial resource availability information. The experiments were repeated on different setups (such as different task graphs) using different seeds for the random generators. The authors used fixed routing for simplicity. They claim that the simulation results show that the proposed algorithm performs better than a traditional list scheduling algorithm. Further, the authors claim that, to the best of their knowledge, the proposed heuristics are the first attempt in optical grid research to address the optimization problem of minimizing the resource usage subject to the job’s deadline constraint while obtaining an optimal solution for a pipelined DAG.

2.3.3. Communication contention reduction in joint scheduling for optical grid computing. Optical networks are used to provide guaranteed quality of service connections, and are being widely implemented for scientific applications and simulations. In this infrastructure, the optical links are regarded as resources and are jointly scheduled with other grid resources; hence communication contention must be taken into account for efficient task scheduling. Jin et al. start by citing previous works that deal with the testbeds and/or architectures for optical grid applications. They note that these works mainly aim to integrate optical networks as grid services, or to make optical circuit-switched (OCS) networks more suitable to meet the requirements of a typical grid, such as user-controlled capabilities, fast lightpath provisioning, and flexible dynamic control. Furthermore, the authors claim that there are few works that focus on the scheduling problem for optical grids in theoretical detail. With regard to DAG scheduling, the same method used by the authors of this paper, they cite works by Topcuoglu et al. [2002], Wu et al. [1990], Yang et al. [1994] and Kwok et al. [2000]. Further, the authors cite a few attempts to incorporate communication contention awareness into DAG scheduling, such as the work done by Kwok et al. [2000], Beaumont et al. [2002], Sinnen et al. [2005], and Agarwal et al. [2006]. Finally, the authors acknowledge the work by Wang et al. [2007], which proposes a joint scheduling model of computing and networking resources for optical grid applications by incorporating the link communication contention of the optical networks into DAG scheduling.

With regard to the proposed algorithms for DAG scheduling by Topcuoglu et al. [2002], Wu et al. [1990], Yang et al. [1994] and Kwok et al. [2000], the authors note that these algorithms cannot be applied to grid computing applications. The authors further note that most of them assume an ideal communication system in which the grid resources are fully connected and the communication between any two grid resources can be provisioned whenever the need arises. This is not the case in a real-world OCS network, in which a lightpath must first be set up before each communication and torn down after the communication ends [Jin et al. 2009]. In other words, they do not account for the optical constraints: if a wavelength on a link is already occupied by one lightpath, it cannot be used by another lightpath or signal, which creates communication contention problems.
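The contention constraint described above can be illustrated with a small sketch of a single optical link that tracks, per wavelength, the time intervals during which a lightpath holds it. The class name, the first-fit wavelength choice, and the half-open interval convention are assumptions made for illustration; they are not taken from Jin et al. [2009].

```python
class WavelengthLink:
    """One optical link whose wavelengths can each carry one lightpath at a time."""

    def __init__(self, num_wavelengths):
        # busy[w] holds the (start, end) intervals already reserved on wavelength w.
        self.busy = [[] for _ in range(num_wavelengths)]

    def reserve(self, start, end):
        """Reserve the first wavelength free during [start, end), or return None."""
        for wavelength, intervals in enumerate(self.busy):
            if all(end <= s or start >= e for s, e in intervals):
                intervals.append((start, end))
                return wavelength
        return None  # every wavelength is occupied: communication contention


link = WavelengthLink(num_wavelengths=1)
first = link.reserve(0, 10)     # lightpath set up for slots [0, 10)
clash = link.reserve(5, 15)     # overlaps the first lightpath on the only wavelength
later = link.reserve(10, 20)    # starts after the first lightpath is torn down
```

A scheduler that ignores this check would treat the second transfer as starting at slot 5, whereas in an OCS network it has to wait or be routed elsewhere.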

The authors model this problem as a communication-aware Directed Acyclic Graph (DAG) scheduling problem. They claim to have found that there are two ways to reduce communication contention: the first is to use adaptive routing schemes to detour around heavy traffic, while the second is to map task objects to nearby grid resources to avoid long-hop communications. The authors focus on the second
method in this paper, and propose to use the hop-bytes metric (HBM) heuristic to select computing resources, in order to reduce communication contention. The authors employed two conventional network topologies, one being a 64-node mesh-torus and the other a 46-node USNET. In order to minimize communication contention, the algorithm aims to minimize the link capacity consumed. The authors further employ two routing schemes and three resource selection schemes, while using the same random DAG generator as in the work by Wang et al. [2005]. The DAG node weight is drawn randomly from a uniform distribution with a mean of approximately 10, and the communication-computation ratio (CCR) is set to 2 to simulate a more communication-intensive application. Furthermore, the authors assumed that all DAG nodes and grid resources are of the same type, and that grid resources are homogeneous, with performance results being the average of 100 simulations. The authors demonstrate that the HBM-based resource selection scheme contributes to lower resource utilization, while the adaptive routing scheme reduces the schedule length. This leads the authors to claim that simulation results show that the HBM approach, combined with the adaptive routing scheme, can achieve better performance in terms of normalized schedule length and link utilization, and that most of the communication contention can be avoided.
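The hop-bytes metric is commonly defined as the sum, over all communicating task pairs, of the data volume multiplied by the number of network hops between the resources the two tasks are placed on. The sketch below computes it for a given placement; the function names, the breadth-first hop count, and the four-node ring example are illustrative assumptions, and the resource selection heuristic that Jin et al. build on top of the metric is not reproduced here.

```python
from collections import deque


def hop_counts(adjacency, source):
    """Breadth-first search: minimum hop count from `source` to every reachable node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def hop_bytes(task_edges, placement, adjacency):
    """Sum of (bytes sent) x (hops travelled) over all task-graph edges."""
    total = 0
    for (src_task, dst_task), num_bytes in task_edges.items():
        hops = hop_counts(adjacency, placement[src_task])
        total += num_bytes * hops[placement[dst_task]]
    return total


# Hypothetical 4-node ring network and a task graph with edges a -> b and a -> c.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
task_edges = {("a", "b"): 100, ("a", "c"): 50}

spread = hop_bytes(task_edges, {"a": 0, "b": 2, "c": 1}, ring)   # b placed two hops away
packed = hop_bytes(task_edges, {"a": 0, "b": 1, "c": 1}, ring)   # b and c one hop away
```

The lower value for the second placement reflects the intuition in the text: mapping communicating tasks to nearby resources consumes less link capacity and so leaves less room for contention.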

The authors claim that there are few works that focus on the scheduling problem for optical grids in theoretical detail. They further claim that when they employ the HBM and routing schemes together, both of their merits can be achieved and most of the communication contention can be avoided, leading to the smallest schedule length with relatively lower link utilization [Jin et al. 2009].

2.3.4. Multi-cost job routing and scheduling in grid networks. Efficient management of the available infrastructure of grid networks, in order to satisfy user requirements and maximize resource utilization, depends on the algorithms responsible for the routing of data and the scheduling of tasks. Grid applications usually pose challenging demands on networks, since data transfers demand high-bandwidth and low-latency connections. This makes optical grids the most suitable technology to implement today’s grid systems. However, Stevens et al. argue that, irrespective of the transport technology used, several problems arise. The authors address two: the efficient routing of data between grid sites, and accounting for temporal information in the scheduling and routing decisions. With regard to multi-cost algorithms, the authors cite works by Wang et al. [1996], Van Mieghem et al. [2001, 2004], Kuipers et al. [2004, 2005], and Chen et al. [1998]. However, they note that these algorithms are mainly used for QoS routing problems, and are hence not optimal for the case at hand. The authors present several multi-cost algorithms for the joint scheduling of the communication and computational resources that will be used by a grid task [Stevens et al. 2009]. Specifically, the authors claim the proposed immediate reservation algorithm selects the computation resource to execute the task and determines the path to route the input data [Stevens et al. 2009]. Further, they propose multi-cost schemes of polynomial complexity which perform advance reservations, and therefore also find the start times for the transmission of data and task execution.

The authors used the Phosphorus network from the Phosphorus project, with 9 nodes and 16 bidirectional links. Further, the authors assumed that the network is composed of 1 Gbps optical links, each offering a single wavelength. Moreover, the authors selected three nodes at random to function as computing elements, with each element containing 60 CPUs, and each CPU offering 25000 million instructions per second. They claim that the quantitative results show that, due to the inclusion
of temporal information in their multi-cost formulation, the time complexity of the advance reservation algorithms increased. Consequently, they claim that they have demonstrated the inherent trade-off between immediate and advance reservations. Specifically, they claim to have found that the lower blocking probability achieved by advance reservations comes at a small or moderate increase in the end-to-end delay.

3. CONCLUDING COMMENTS

In this survey we have summarized 4 conference papers and 6 journal articles which deal with aspects of task scheduling and/or resource allocation strategies specifically in optical grid networks. Four of these directly tackle the problem of task scheduling and resource allocation in optical grid networks. The earliest one, Liang et al. [2006], was among the first to co-schedule and co-allocate tasks and resources simultaneously. Since then, research has expanded to consider other problems pertaining to task scheduling and resource allocation, such as accuracy. Wang et al. [2007b] and Guo et al. [2009] are both works by the same authors, and address the issue of task scheduling and resource allocation accuracy. The last four papers consist of more recent research which has focused on other factors, either affecting task scheduling and/or resource allocation, or addressing the symptoms caused by them.

Although these papers rarely reference each other, they have many other works in common which they all reference. Such works include Banerjee et al. [2005], Sinnen et al. [2005], Topcuoglu et al. [2002] and numerous others. All of these commonly cited works address the problem of task scheduling and resource allocation on electrical grids, and as such are not directly transferable, as optical grids are subject to all of the constraints of optical networks. This shows that although the topic of task scheduling and resource allocation is not new, research on adapting it to optical grids remains largely unexplored.

Some of the authors of the surveyed papers have expressed interest in further optimizing their models in their future work. Liang et al. [2006] wish to consider different hardware in the optical network itself in the future. That is, they wish to consider the different fiber allocation granularities. Further, they wish to consider the possibility of having more than one type of resource in one independent system. Meanwhile, Kim et al. [2007] wish to evaluate resource scheduling performance under different scopes, such as the use of all-optical networks, advance reservation models and different categories of high-performance applications. Tanwir et al. [2008] wish to further improve network utilization by using offline optimization of the reserved connections which are not yet in service, as opposed to the periodic reconfiguration of reserved lightpaths done in the present work, which does not significantly improve performance. Finally, Liu et al. [2009] took a more mathematical approach, and intend to investigate the performance of other approaches to their research, such as duplication and clustering heuristics applied in optical grids, as well as new scheduling algorithms under various QoS constraints. This shows that there are multiple approaches to optimizing task scheduling and resource allocation in optical grid networks, each with its advantages and disadvantages. Consequently, we can expect to see a surge of research activity on this topic as traditional grid networks get closer to becoming a bottleneck due to their comparatively low capacity, while optical networks become a more viable option as optical network technology progresses further.


| Year | Authors | Title | Papers referred to | Major contribution |
|------|---------|-------|--------------------|--------------------|
| 2006 | Liang et al. | Adaptive task scheduling on optical grid | None | Considers allocation of network resources for data transfer in optical grids when scheduling, a concept which has not been thoroughly researched at the time of writing this paper. |
| 2006 | Guo et al. | Resource allocation strategies for data-intensive workflow-based applications in optical grids | None | Proposes algorithms for task scheduling and resource allocation which consider network resources contention simultaneously. |
| 2007 | Kim et al. | Coordinated Resource Scheduling in High-performance Optical Grids | None | Investigates coordinated resource scheduling algorithms for optical grids, introduces a simple scheduling algorithm then evaluates its impact on grid performance. |
| 2007 | Wang et al. | On Accurate Task Scheduling in Optical Grid | None | Investigates accurate task scheduling in optical grids, models practical data transfer and task execution model in optical grids closely to real time data, implements scheduling algorithm to show significant improvement over conventional methods. |
| 2008 | Tanwir et al. | Dynamic Scheduling of Network Resources with Advance Reservations in Optical Grids | None | Evaluates and compares several algorithms for scheduling lightpaths, which consequently affects task scheduling and resource allocations. |
| 2009 | Guo et al. | Task Scheduling accuracy analysis in optical grid environments | Guo et al. 2006 | Proposes an improved accuracy of task scheduling algorithms by solving the problem of accuracy deviations in a real-world optical-grid environment. |
| 2009 | Develder et al. | On dimensioning optical grids and the impact of scheduling | None | Proposes a phased solution to understand and show how to decide where to provide server capacity and resources, where to process the submitted jobs, and finally how to calculate the network dimensions required. |
| 2009 | Liu et al. | Task scheduling and lightpath establishment in optical grids | None | Suggests and claims to prove that lightpath establishment must be considered jointly with task scheduling to achieve the best performance. |
| 2009 | Jin et al. | Communication contention reduction in joint scheduling for optical grid computing | None | Attempts to tackle the side effects of task scheduling and resource allocation by modelling an optical grid as a communication-aware directed acyclic graph scheduling algorithm, and claims to solve communication contention using a heuristic algorithm. |
| 2009 | Stevens et al. | Multi-cost job routing and scheduling in grid networks | None | Proposes a multi-cost resource allocation and task and job scheduling scheme of polynomial time complexity. |


4. ANNOTATIONS

4.1. Liang et al. 2006

Citation: LIANG, X., LIN, X., AND LI, M. 2006. Adaptive task scheduling on optical grid. In 2006 IEEE Asia-Pacific Conference on Services Computing (APSCC’06), 486-491.

Problem. As grid users create more diverse and complex applications, the bandwidth of the electrical network becomes the bottleneck of grid applications. This makes optical grids, with their massive bandwidth and low cost, an attractive alternative. A key aspect of grid environments is the scheduling of tasks and resources. However, the authors note that although a few proposed methods consider communication contention (such as resource conflicts) on optical grids, they rarely consider co-allocating or co-scheduling both the optical network and computing resources.

Previous Work. The authors refer to previous work by Banerjee et al. [2005]. They also cite the work by Sinnen et al. [2005] and compare its algorithm with their proposed algorithm.

Shortcomings of Previous Work. The authors note that Banerjee et al. [2005] do not consider co-allocating or co-scheduling both the optical network resources and the computing resources. They state that this method is destined to hit a limit, since sequential allocations can only go so far.

New Idea/Algorithm/Architecture. The authors present an optical grid model based on optical network characteristics. In this model, the authors treat the network resource at the same level as the computation and storage resources. In this way the network itself is a resource to be offered as a service to the grid like any other resource. The authors also took into consideration the allocation of network resources for data transfers to make job scheduling as close to reality as possible. Furthermore, the authors present a communication contention algorithm based on list scheduling to minimize the total execution time for given tasks on an optical grid, where they used a modified version of Dijkstra’s algorithm for routing.
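For reference, the sketch below shows the textbook form of Dijkstra's shortest-path algorithm on which such routing is based. It is the unmodified baseline only: the survey does not describe the modifications made by Liang et al., and the graph layout and node names here are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Return the minimum path cost from `source` to every reachable node.

    `graph` maps each node to a dict of {neighbour: non-negative link cost}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    return dist


# Hypothetical four-node network with directed link costs.
network = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
costs_from_a = dijkstra(network, "A")
```

In a contention-aware scheduler, the link costs would presumably reflect current wavelength availability rather than static distances, which is one plausible direction for the kind of modification mentioned above.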

Experiments Conducted. The authors used the national 863 high-performance broadband information network in their simulation. Further, they defined the resource type and capacity of the device connected to the optical network. They mainly defined three resource types: storage resources, I/O resources and computing resources.

Results. Using simulated job sizes of 15, 20, 30, 40, 50 and 60 jobs, the authors claim that their simulations show that the proposed adaptive scheduling algorithm performs better than the fixed shortest-route algorithm proposed by Sinnen et al. [2006].

Claims Made. Liang et al. [2006] claim that the proposed method demonstrates improved accuracy and scheduling efficiency over conventional algorithms. In the future they plan to consider fiber allocation granularity as well as multiple resource types in one independent system, which they expect to reduce communication.


4.2. Guo et al. 2006

Citation: GUO, W., WEIQIANG, S., HU, W., AND JIN, Y. 2006. Resource allocation strategies for data-intensive workflow-based applications in optical grids. In IEEE Singapore International Conference on Communications Systems, 2006. ICCS 2006 IEEE, 1-5.

Problem. Scientific simulations and remote visualizations are data-intensive applications that are often run on a grid architecture. Since these applications are really a set of tasks that are essentially executed sequentially, each task must perform various operations such as computing, transporting and displaying the data. This creates the problem of scheduling and running these tasks and allocating their resources.

Previous Work. The authors refer to the work of Braun et al. [2001], Binato et al. [2000], He et al. [2003], Li et al. [2005], and Topcuoglu et al. [2002]. Further, the authors also cite the works of Blythe et al. [2005], Jia et al. [2005], and Mandal et al. [2005] and compare those solutions with their proposed solution.

Shortcomings of Previous Work. Guo et al. [2006] note that the research of Braun et al. [2001], Binato et al. [2000], He et al. [2003], Li et al. [2005], and Topcuoglu et al. [2002] concentrates on matching individual tasks with their respective resources and does not find an efficient overall allocation. Hence they do not consider tasks that come later in the workflow, and their overall allocation of resources may result in poor overall assignments. Furthermore, they note that the proposed solutions of Blythe et al. [2005], Jia et al. [2005], and Mandal et al. [2005] search for an efficient allocation for the entire workflow, which depends on predictions of future task performance. However, they note that this research also assumes that computational resources are fully connected and that the network bandwidth is enough for unlimited data transfers in parallel, which is not the case in real networks.

New Idea/Algorithm/Architecture. In order to minimize the execution time required for a given workflow, the authors propose two algorithms for resource allocation and task scheduling in an optical grid: a task-based approach and a workflow-based approach for data-intensive workflow-based applications. The authors also claim to have considered network resource contention in implementing these algorithms.

Experiments Conducted. The authors used an optical grid simulator to investigate the performance of the resource allocation approaches. The workflows used in the experiments consisted of randomly generated directed acyclic graphs. In order to generate such graphs, the authors randomly varied the following three parameters:

(1) Number of nodes.
(2) Average number of edges per node.
(3) Communication to computation ratio.

Furthermore, resource allocation algorithms were run with a time limit of 200 seconds.
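
To make the three parameters above concrete, the following is a minimal sketch of how such a random workflow graph might be generated. It is not the generator used by Guo et al. [2006]: the function name random_dag, the cost range of 1 to 10, the reading of "average number of edges per node" as total edges divided by node count, and the definition of the communication to computation ratio as total communication cost over total computation cost are all assumptions made for illustration.

```python
import random

def random_dag(num_nodes, avg_edges_per_node, ccr, seed=None):
    """Generate a random DAG with computation and communication costs.

    Edges only go from a lower-numbered node to a higher-numbered one,
    which guarantees that the graph is acyclic.
    """
    rng = random.Random(seed)
    # Hypothetical cost range; the surveyed paper does not state one here.
    comp = {v: rng.uniform(1.0, 10.0) for v in range(num_nodes)}

    # Every forward pair (u, v) with u < v is a candidate edge.
    candidates = [(u, v) for u in range(num_nodes)
                  for v in range(u + 1, num_nodes)]
    num_edges = min(len(candidates), round(avg_edges_per_node * num_nodes))
    edges = rng.sample(candidates, num_edges)

    # Draw raw communication weights, then rescale them so that
    # sum(comm) / sum(comp) equals the requested ratio.
    raw = {e: rng.uniform(1.0, 10.0) for e in edges}
    scale = ccr * sum(comp.values()) / sum(raw.values()) if raw else 0.0
    comm = {e: w * scale for e, w in raw.items()}
    return comp, comm
```

Calling random_dag(20, 2, 3.0, seed=1) under these assumptions yields 20 tasks, 40 dependency edges, and communication costs that sum to three times the computation costs.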

Results. The authors claim that while there is no significant difference for computationally intensive cases, the workflow-based algorithm outperforms the task-based approach by producing schedules with significantly lower execution times for data-intensive cases. However, the authors note that the workflow-based algorithms take more time to run than the task-based approaches, making them less suitable for workflows containing several thousand tasks.

Claims Made. The authors present task-based and workflow-based allocation approaches for workflow-based applications, and claim that the two have similar performance for computationally intensive cases but that the workflow-based approach performs better for data-intensive cases, where it takes advantage of the ability to begin transferring large data sets earlier and makes decisions based on global measures of performance.

4.3. Kim et al. 2007

Citation: KIM, S-I., JUKAN, A., AND LUMETTA, S. 2007. Coordinated resource scheduling in high-performance optical grids. In Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference, 2007. OFC/NFOEC 2007.

Problem. In grid computing, optical transport provides the means to transmit large amounts of data reliably and at low cost. However, the complexity of large-scale resource sharing and coordination is an important issue that must be resolved in order to fully utilize optical grids.

Previous Work. The authors cite no related work.

New Idea/Algorithm/Architecture. The authors present results from different coordinated resource scheduling algorithms, namely First-Come-First-Serve (FCFS), Proportional-backoff (P-backoff) and Weighted-backoff (W-backoff). Specifically, the authors focus on high-performance optical networks as the underlying architecture for all communications [Kim et al. 2007]. The authors then present the performance measures for CPU selection methods. Finally, the authors evaluate and present the effect of network path length limits on overall grid scheduler performance.

Experiments. The authors simulate a network which has full wavelength conversion. They then randomly select network nodes and designate them with either CPU, storage or user roles. The authors also assume that each link in the network has 32 wavelengths and that each CPU node consists of 64 CPUs.

Results. The authors note that for each connection on the grid only the shortest paths are utilized, from storage to CPU and from CPU to user. Furthermore, there is a significant improvement in performance when resource scheduling and/or allocation schemes are given enough freedom to choose any CPU. However, the authors note that this option may not always be feasible, as the size of the grid and the nature of the application also play a role in grid utilization. The authors further found that, of the three scheduling algorithms considered, FCFS performed the worst, P-backoff performed slightly better, and W-backoff performed slightly better than P-backoff. The authors also found that increasing the permitted path length improves overall system performance. However, as the number of hops increases, or when the grid scheduler is given the flexibility of choosing any CPU in the system, longer paths no longer have a significant effect. Finally, they found that since network congestion occurs around local CPU resources, allowing longer paths makes it possible to route around the congested (or non-local) network resources [Kim et al. 2007].

Claims Made. The authors have made no claims for their research.


4.4. Develder et al. 2009

Citation: DEVELDER, C., DHOEDT, B., MUKHERJEE, B., AND DEMEESTER, P. 2009. On Dimensioning optical grids and the impact of scheduling. Photonic Network Communications 17, 255-265. 10.1007/s11107-008-0160-z.

Problem. Anycast routing of jobs is conventionally adopted in network dimensioning (determining how much capacity is needed for the network to be able to transport a given amount of traffic). This implies the absence of a clearly defined (i.e. source-destination based) traffic or job matrix, since only the origin of grid jobs is known, but not their destination. Further, the authors acknowledge that the anycast routing principle directly affects the scheduling and routing decisions, such as where to execute a job in the grid system and how to get it to the destination. Although this provides an edge over traditional grids since it adds more freedom, it also incurs multi-cost routing problems, incorporating the state of both the network and computational/storage grid resources. The authors surmise that grid dimensioning alone is not enough, and one must also dimension the computational and storage resources, making the problem close to NP-hard, if not NP-hard already.

Previous Work. The authors acknowledge a variety of solutions presented to solve the problem of network dimensioning (understanding how much network capacity is needed for the network to be able to transport a given amount of traffic). Further, the authors note that solutions came in the form of heuristics and Integer Linear Programs (ILPs). They further state that these algorithms varied in their approach depending on several factors, such as the topologies and technologies implemented, design criteria, single or multi-period planning, and single domain or hierarchical networks. However, they note that if one wanted to apply any of the approaches listed to dimensioning grids, the problem of accurately estimating the traffic matrix would arise. Moreover, the authors mention that in addition to grid dimensioning, one must also dimension the computational and/or storage resources, and speculate that jointly determining both network and server dimensions is possibly an NP-hard problem.

However, the authors claim that related work on dimensioning grids is uncommon, and cite some papers dealing with this topic. They cite Thysebaert et al. [2005] and De Leenheer et al. [2007] as the only attempts, at the time of writing the paper, to solve the problem of dimensioning network grids.

Shortcomings of Previous Work. The authors cite only two papers which address the problem of dimensioning grids. They do not identify any shortcomings of these works.

New Idea/Algorithm/Architecture. The authors address the grid dimensioning problem, and present a solution for deciding where to provide server capacity and resources, where to process the submitted jobs, and finally how to calculate the network dimensions required. Hence, the authors propose a phased solution, first dimensioning the servers and then the network [Develder et al. 2009].

Experiments. In order to obtain a more realistic case study, the authors performed measurements on a real-world grid, deployed in the frame of the Large Hadron Collider (LHC) experiments and the Enabling Grids for E-sciencE (EGEE) project (EGEE/LCG). The authors considered two network topology and job demand cases, with the first using a fairly densely meshed European backbone network, with artificially generated job arrival rates at each site [Develder et al. 2009]. The second case is based on measurement data from EGEE/LCG, and the average job duration was derived from real-life trace files. The authors applied the proposed dimensioning strategies to both cases.

Results. The authors claim that the two case studies on European topologies show that placing server capacity where a lot of jobs arrive is important to minimize network bandwidth requirements. With regard to grid scheduling, a simple shortest path strategy, preferring closer server sites, led to the lowest bandwidth demands [Develder et al. 2009]. Finally, with respect to choosing an appropriate number of server sites, the authors speculate that for a larger number of server sites, the total server capacity becomes fragmented, thereby reducing opportunities for statistical multiplexing, whereas for smaller server site counts the average distance that jobs need to travel is too large. Hence they claim that the optimal number of server sites depends on the scheduling algorithm and the server site dimensioning strategies, and that their case studies did not cover the full range of server counts.
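
The shortest path strategy mentioned in these results can be pictured with a small sketch. The code below is only an illustration of the general idea of sending a job to the closest server site that still has capacity; it is not the scheduler of Develder et al. [2009]. The function names, the adjacency-list topology format, the use of hop count as the distance measure, and the free_capacity bookkeeping are all assumptions.

```python
from collections import deque

def hop_distances(adjacency, origin):
    """Breadth-first search: hop count from origin to every reachable node."""
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def nearest_server(adjacency, origin, free_capacity):
    """Pick the closest server site (in hops) that still has free capacity."""
    dist = hop_distances(adjacency, origin)
    candidates = [s for s, cap in free_capacity.items()
                  if cap > 0 and s in dist]
    if not candidates:
        return None  # no reachable site can accept the job
    # Ties are broken by site name to keep the choice deterministic.
    return min(candidates, key=lambda s: (dist[s], s))
```

On a hypothetical four-node line A-B-C-D with server sites at B and D, a job arriving at A would go to B while B has capacity left and would fall back to D once B is full.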

Claims Made. The authors claim that research on dimensioning grids is uncommon, and that it is still an unexplored area. They further claim that their experiments have shown that the grid scheduling algorithm has a substantial effect on the required network capacity.

4.5. Wang et al. 2007

Citation: WANG, Z., GUO, W., SUN, Z., JIN, Y., SUN, W., HU, W., AND QIAO, C. 2007b. On accurate task scheduling in Optical Grid. In 2007 First International Symposium on Advanced Networks and Telecommunication Systems. December 2007, 1-2.

Problem. An important aspect of efficient utilization of grid resources, task scheduling locates computational resources for tasks and optical network resources for communications in optical grids. The predicted eventual finish time of a task, known as the scheduling span, is employed to evaluate scheduling efficiency. Moreover, the authors claim one must also consider whether the scheduling prediction is executed accordingly. The authors conclude that the problem thus lies in how to guarantee that the schedule conforms to what happens in practice. Further, they cite past work which simulates task scheduling scenarios well but note that it is done for packet-switched electrical networks as opposed to optical networks, which have special scenarios and constraints to consider. Therefore they conclude that while traditional research helps with optical network research, research on accurate task scheduling in optical grids is considered more valuable in this context.

Previous Work. Primarily, Wang et al. [2007b] cite and compare the work described in this paper to their previously developed theoretical Optical Grid Earliest Finish Time (OGEFT) algorithm presented in Wang et al. [2007a]. Further, the authors note the work done by Sinnen et al. [2006], a practical and realistically modeled running scenario for accurate task scheduling.

Shortcomings of Previous Work. With regard to their own previous work, the authors state that the OGEFT algorithm proposed does not consider an optical grid’s practical running scenarios, such as light-path establishment (which is why they called it theoretical), and claim that the newly proposed algorithm uses a more realistic model and hence delivers better results. Further, the authors claim that the work done by Sinnen et al. [2006] is only accurate for traditional packet-switched networks, and is not realistically applicable to optical networks, as they are subject to more constraints and scenarios.

New Idea/Algorithm/Architecture. Along with investigating and modeling the practical data transfer and task execution scenarios, the authors present a modified version of their Optical Grid Earliest Finish Time (OGEFT) algorithm to schedule tasks. The authors state that this algorithm is a list scheduling algorithm in that it sorts tasks into a list according to their execution priorities and allocates grid resources accordingly. The authors also claim that this algorithm uses optical network routing algorithms to allocate a light-path to transfer grid data.
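
For readers unfamiliar with list scheduling, the following is a minimal generic sketch of the idea of sorting tasks by priority and then placing each one on the resource that gives the earliest finish time. It is not OGEFT: the priority used here (the longest path to an exit task), the fixed per-edge transfer times, and the absence of any light-path routing or contention model are simplifying assumptions, and all names are hypothetical.

```python
def list_schedule(comp, edges, num_resources):
    """Generic earliest-finish-time list scheduling.

    comp:  {task: computing time}, assumed strictly positive
    edges: {(pred, succ): transfer time if the two run on different resources}
    Returns {task: (resource, start, finish)}.
    """
    preds = {t: [] for t in comp}
    succs = {t: [] for t in comp}
    for (u, v) in edges:
        preds[v].append(u)
        succs[u].append(v)

    # Priority: length of the longest path from the task to an exit task.
    rank = {}
    def upward_rank(t):
        if t not in rank:
            rank[t] = comp[t] + max(
                (edges[(t, s)] + upward_rank(s) for s in succs[t]),
                default=0.0)
        return rank[t]

    # With positive computing times, decreasing rank is a valid
    # topological order, so predecessors are always placed first.
    order = sorted(comp, key=upward_rank, reverse=True)

    ready_at = [0.0] * num_resources  # time each resource becomes free
    placed = {}
    for t in order:
        best = None
        for r in range(num_resources):
            # Data from a predecessor on another resource arrives after
            # its transfer time; local data is available immediately.
            data_ready = max(
                (placed[p][2] + (0.0 if placed[p][0] == r else edges[(p, t)])
                 for p in preds[t]),
                default=0.0)
            start = max(ready_at[r], data_ready)
            finish = start + comp[t]
            if best is None or finish < best[2]:
                best = (r, start, finish)
        placed[t] = best
        ready_at[best[0]] = best[2]
    return placed
```

The largest finish time in the returned mapping plays the role of the predicted scheduling length; a realistic model in the spirit of the paper would additionally have to account for light-path setup and link contention, which this sketch ignores.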

Experiments. To evaluate the scheduling accuracy of the new realistic task scheduling scheme of OGEFT, the authors set up the same test bed used in their previous work to evaluate the theoretical OGEFT. The test bed consists of three computational resources located at Shanghai Jiao Tong University, 20 km apart and networked by a GMPLS-controlled Automatically Switched Optical Network (ASON), with each switching node capable of switching at 360 Gbps with VC-4 granularity and each link containing two STM-64 fibers. Task execution is conducted as |v| (computing amount) seconds of matrix computation on a computational resource. A task set can contain 10, 20 or 50 tasks.

Results. After incorporating the new, more realistic model into OGEFT, the authors claim to have found that the accuracy deviations have become much lower than the previously obtained results with the theoretical model. Hence they conclude that their realistic model can improve task scheduling accuracy significantly.

Claims Made. The authors claim this paper presents a significantly improved and more realistic model of their OGEFT algorithm proposed in [Wang et al. 2007a].

4.6. Guo et al. 2009

Citation: GUO, W., WANG, Z., SUN, Z., SUN, W., JIN, Y., HU, W., AND QIAO, C. 2009. Task Scheduling Accuracy Analysis in Optical Grid Environments. Photonic Network Communications 17, 209-217.

Problem. Task scheduling is a prominent issue in optimizing the performance of grid applications in optical grids and in improving the utilization of both grid resources and optical network resources. In order to get the shortest possible execution time (also called scheduling length), task scheduling finds spatial and temporal assignments of applications onto the optical grid environment. The authors claim that the accuracy of a task scheduling algorithm is an important issue for efficiently utilizing resources and guaranteeing quality of service (QoS) to grid applications.

Previous Work. The authors cite the work by Sinnen et al. [2004 & 2005], and also cite work by various other authors who have attempted to solve the NP-hard task scheduling problem with near-optimal solutions. Further, the authors have stated that the solution by Sinnen et al. [2006] has achieved significantly improved accuracy.

Shortcomings of Previous Work. The authors cite multiple heuristic algorithms that achieve near-optimal solutions to the problem of task scheduling on normal grid environments. However, they claim that most of the algorithms are based on a simplistic model which assumes all processors are fully connected. Moreover, the authors acknowledge that other, more recent algorithms have been proposed that consider the topology of the communication network and are aware of the contention for network resources, such as the one proposed by Sinnen et al. [2005]. However, the authors claim that while accuracy was significantly improved through the consideration of contention in task scheduling, the experimental results show that the scheduling accuracy achieved is still unsatisfactory, as Sinnen et al. [2004] have shown. The authors further state that these solutions are not directly transferable to optical grid environments, as optical grids have many other factors and constraints that need to be accounted for. The authors also claim that in order for these algorithms to function in an optical grid environment, they must be designed so that they co-allocate grid resources and optical network resources. Further, they must also define a new data model to obtain accurate task scheduling in real optical grid environments. Finally, the authors cite Banerjee et al. [2008], who propose scheduling algorithms for large file transfers in an optical grid system; however, they claim that an accuracy analysis of the resulting schedules has yet to be carried out.

New Idea/Algorithm/Architecture. The authors propose a theoretical task scheduling algorithm, followed by a demonstration showing the deviation of the scheduling length from the actual finish time in a real optical grid environment. Moreover, the authors claim to have analyzed several factors that are likely to affect optical grid networks, and to have incorporated the revised parameters into the theoretical scheduling model, proposing a more realistic scheduling model as a result.

Experiments. By varying two key parameters, the communication-computation ratio (CCR, defined as the sum of the communication costs divided by the sum of the task computation costs) and the Directed Acyclic Graph (DAG) size, the authors randomly generated 20 DAGs. To evaluate the proposed algorithms, the scheduler employs the theoretical and realistic task scheduling algorithms to get the scheduling results and scheduling length. The scheduler then asks the grid resources to execute the tasks of the DAG and directs the optical network to set up or release the light paths according to the scheduling results in the optical grid testbed [Guo et al. 2009]. Finally, the authors claim to have measured and compared the actual finish time for the DAG with the scheduling length to calculate the accuracy deviations.
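
The two quantities used in these experiments can be written down in a few lines. The CCR function below follows the definition given above. The accuracy deviation, however, is only an assumed form: the survey does not give the exact formula used by Guo et al. [2009], so a relative gap between the predicted scheduling length and the measured finish time is used here purely for illustration, and both function names are hypothetical.

```python
def ccr(comp_costs, comm_costs):
    """Communication-computation ratio: total communication cost
    divided by total task computation cost."""
    return sum(comm_costs) / sum(comp_costs)

def accuracy_deviation(scheduling_length, actual_finish_time):
    """Relative gap between the predicted scheduling length and the
    measured finish time (assumed definition, not taken from the paper)."""
    return abs(actual_finish_time - scheduling_length) / scheduling_length
```

Under this assumed definition, a schedule predicted to finish in 100 seconds that actually finishes in 112 seconds has a deviation of 0.12, and a DAG whose communication costs sum to three times its computation costs has a CCR of 3.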

Results. The authors claim that the results show that their realistic task scheduling algorithm improves the overall scheduling accuracy. They also note that the improvement in accuracy grew as the CCR increased. Finally, the authors claim that these results prove that the realistic task scheduling algorithm can achieve better performance for data-intensive applications in optical grid environments [Guo et al. 2009].

Claims Made. The authors claim that there had been no previous research on analyzing the accuracy of task scheduling in optical grid networks at the time of writing their paper. In regard to the scheduling algorithm results, the authors claim the realistic task scheduling algorithm could significantly improve overall task scheduling accuracy. This paper supersedes the work by Wang et al. [2007b], wherein the same contributors presented a new model to turn their previously proposed theoretical model for accurate task scheduling into a more realistic one.


4.7. Tanwir et al. 2008

Citation: TANWIR, S., BATTESTILLI, L., PERROS, H., AND KARMOUS-EDWARDS, G. 2008. Dynamic rescheduling of network resources with advance reservations in optical grids. In International Journal of Network Management. 79-105.

Problem. Optical grids help make possible many eScience and eBusiness applications by supporting their large bandwidth requirements. The authors note that these emerging applications have led to a rapid advancement in optical network technologies; however, the area that is still lacking is the link between the grid applications and the underlying network technologies which make grids effective. In order to meet the complex demand patterns of grid applications and to optimize overall network utilization, the authors claim it is necessary to abstract and encapsulate optical network resources into manageable and dynamically divided grid entities.

Previous Work. The authors cite multiple online projects and note that they consider the problem of building reconfigurable, dynamic, adaptable optical grids. The authors note that these projects have developed grid middleware that makes optimized use of an optical network as a virtual coordinated resource [Tanwir et al. 2008]. The authors also cite work by Zheng et al. [2002], which discusses different designs for effective Routing and Wavelength Assignment algorithms for different types of advance reservations and presents algorithms for requests having specific start times and specific durations (STSD), specific start times and unspecific durations (STUD), and unspecific start times and specific durations (UTSD). Moreover, the authors cite the work by Burchard [2005], in which he discusses the properties of advance reservations and also proposes an architecture for a bandwidth distribution system to improve the performance of a network based on the acquired knowledge of these reservations. The authors also cite Foster et al. [2009], who presented the Globus Architecture for Reservation and Allocation (GARA), which supports advance reservations for various types of resources for grid optical networks. Further, the authors cite Curti et al. [2005], who discussed advance reservation of heterogeneous network paths in grid computing and, in order to integrate the path management with grid information and authentication services, proposed a network resource hierarchy.

Furthermore, the authors cite multiple previous works which tried to improve advance reservations. They cite the work of Wang et al. [2005], who proposed a sliding scheduled traffic model and a demand time conflict resolution scheme to maximize the resource usage in a network. Finally, the authors cite the work of He et al. [2006], which proposes a Flexible Advance Reservation Model (FARM) and describes how to implement this model in the meta-scheduling problem [Tanwir et al. 2008].

Shortcomings of Previous Work. The authors do not identify any shortcomings of the previous work mentioned.

New Idea/Algorithm/Architecture. The authors focus on evaluating and comparing various algorithms for advanced light path scheduling that can be implemented in a Domain Network Resource Manager (DNRM). In the end the authors hope to find the best scheduling policy for a grid network resource manager that improves network utilization and minimizes blocking probability simultaneously.

Experiments. The authors conducted simulation experiments on a 14-node National Science Foundation Network (NSFNET) topology with 42 unidirectional links and a 33-node Gigabit European Advanced Network Technology (GEANT) topology with 94 unidirectional links. The authors assumed 10 wavelengths and full wavelength conversion on each link, with 30-minute time slots. Further, requests are assumed to arrive according to a Poisson process, and all requests need to reserve a light path with bandwidth equal to one wavelength. The duration of a reservation is uniformly distributed. In order to simulate a more realistic environment, the authors generated the intermediate period between the arrival of the request and the start of the reservation using a discrete probability distribution. Moreover, the source and destination nodes for the requested connection are selected randomly using a uniform distribution, and simulations were run for long durations with a large number of arrivals, such that a confidence interval within 1% of the mean at 95% confidence is reached. Finally, the authors used a link failure model in which a link is randomly selected from the network as a failed link. The time to failure was exponentially distributed with a mean of 80 slots, and the recovery time was also exponentially distributed with a mean of 48 slots. The authors assumed a link fails when all wavelengths on that link fail.
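
The arrival and failure processes described in this setup can be sampled with the standard library alone. The sketch below is an assumption-laden illustration, not the simulator of Tanwir et al. [2008]: the function names, the arrival rate passed by the caller, and the simplification that only one link is down at any time are all hypothetical. Only the mean failure and recovery times of 80 and 48 slots come from the text.

```python
import random

MEAN_TIME_TO_FAILURE = 80.0  # slots, as reported in the text
MEAN_RECOVERY_TIME = 48.0    # slots, as reported in the text

def poisson_arrivals(rate_per_slot, horizon_slots, rng):
    """Arrival times of a Poisson process, built from exponentially
    distributed gaps with mean 1 / rate_per_slot."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_per_slot)
        if t >= horizon_slots:
            return times
        times.append(t)

def link_outages(links, horizon_slots, rng):
    """Sequence of (link, fail_time, repair_time) tuples.

    Assumes one failure at a time: the next time to failure is drawn
    only after the previous failed link has recovered.
    """
    outages, t = [], 0.0
    while True:
        t += rng.expovariate(1.0 / MEAN_TIME_TO_FAILURE)
        if t >= horizon_slots:
            return outages
        down = rng.expovariate(1.0 / MEAN_RECOVERY_TIME)
        # The failed link is picked uniformly at random from the network.
        outages.append((rng.choice(links), t, t + down))
        t += down
```

Passing a seeded random.Random instance as rng keeps a run reproducible, which helps when comparing scheduling policies on identical request and failure traces.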

Results. The authors claim that their simulation results show that minimum-cost adaptive routing based on link usage provides the lowest blocking probability. Moreover, searching for k alternate paths within the scheduling window significantly improved performance. In the case of wavelength assignment, the authors claim that a scheme that minimizes unused leading or trailing gaps in wavelengths gave the best results. Finally, in the case of failure recovery, the authors claim to have found that a short re-routing interval gives a large number of terminated connections while a long interval reroutes unaffected connections. Hence the authors conclude there is a tradeoff between the two intervals, and that the best results are obtained when the re-routing interval is based on the moving average of the historical failure times and is updated continuously.

Claims Made. The authors claim they have implemented periodic reconfiguration of reserved lightpaths in their simulations, which has improved performance, though not very significantly. Hence, more work needs to be done to improve performance further.

4.8. Liu et al. 2009

Citation: LIU, X., QIAO, C., WEI, W., YU, X., WANG, T., HU, W., GUO, W., AND WU, M. 2009. Task scheduling and lightpath establishment in optical grids. Journal of Lightwave Technology, 27, 12. 1827-1836.

Problem. Grid applications are slowly becoming more data-intensive, and require huge data transfers between multiple geographically separated computing nodes. The authors claim that in order for WDM networks to efficiently support this type of emerging application in the future, the traditional approaches to establishing lightpaths between given source-destination nodes will not be sufficient, nor are those existing application level approaches that consider computing resources but ignore the optical layer connectivity. The authors claim that instead, one must jointly consider lightpath establishment and task scheduling to achieve the best performance.

Previous Work. The authors cite the control software and technologies proposed in the works by Simeonidou et al. [2005], De Leenheer et al. [2006], and Zervas et al. [2007] to efficiently control and support grid services. The authors further cite the work done by Wang et al. [2007], which considered the problem of jointly scheduling computational and networking resources in one Directed Acyclic Graph (DAG).

Shortcomings of Previous Work. The authors cite the book “Task Scheduling in Parallel and Distributed Systems” by El Rewini et al. [1994], and speculate that the underlying physical network’s connectivity is taken for granted, and that at best it is assigned a known “communication cost” between two nodes [Liu et al. 2009]. Further, they claim that it is assumed that this cost does not depend on where other tasks are assigned. This is not the case in the real world, where in a WDM network supporting dynamic jobs, link usage is dynamic and the “communication cost” between two nodes depends on the assignment of other tasks.

New Idea/Algorithm/Architecture. The authors consider a similar problem where lightpaths are required for communicating nodes and multiple jobs may arrive one after another; accordingly, the objective is to minimize the resource usage, subject to the job’s deadline constraint (if any). Additionally, when the objective is to minimize the completion time of a job, the authors claim to obtain an optimal solution for a pipelined DAG. Based on this, the authors further devise an efficient algorithm for a general DAG. Moreover, the authors claim that this work differs from previous works on traditional lightpath establishment, in which the source and destination pairs are given. Also, different Asynchronous Array of Simple Processors (ASAP) networks need to be formed for different jobs, so the problem of forming an ASAP network for each job differs from virtual topology design.

Experiments. The authors implemented the proposed algorithms over a WDM network with 100 wavelengths per link and 24 nodes (USNET topology), with each node connected to one computing node. Further, they assigned random values to the initial resource availability information. The experiments were repeated on different setups (such as different task graphs) using different seeds for the random generators. The authors used fixed routing for simplicity.

Results. The authors claim that the simulation results show that the proposed algorithm performs better than a traditional list scheduling algorithm. Further, they claim that the proposed heuristics are the first attempt to minimize the cost with a deadline constraint in optical grids.

Claims Made. The authors claim that, to the best of their knowledge, the proposed heuristics are the first attempt in optical grid research to address the optimization problem of minimizing the resource usage subject to the job’s deadline constraint while obtaining an optimal solution for a pipelined DAG.

4.9. Jin et al. 2009

Citation: JIN, Y., WANG, Y., GUO, W., SUN, W., AND HU, W. 2009. Communication contention reduction in joint scheduling for optical grid computing. In Networks for Grid Applications. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering Series, vol. 2. Springer Berlin Heidelberg, 206-213.

Problem. Optical networks are used to provide guaranteed quality of service connections, and are being widely implemented for scientific applications and simulations. In this infrastructure, the optical links are regarded as resources and are jointly scheduled with other grid resources. Hence, communication contention must be taken into account for efficient task scheduling.

Previous Work. The authors start by citing previous works that deal with the testbeds and/or architectures for optical grid applications. They note that these works mainly aim to integrate optical networks as grid services, or to make optical circuit-switched networks (OCS) more suitable to meet the requirements of a typical grid, such as user-controlled capabilities, fast lightpath provisioning, and flexible dynamic control. Furthermore, the authors claim that there are few works that focus on the scheduling problem for optical grids in theoretical detail. With regard to DAG scheduling, the same method used by the authors of this paper, they cite works by Topcuoglu et al. [2002], Wu et al. [1990], Yang et al. [1994] and Kwok et al. [2000]. Further, the authors cite a few attempts to incorporate communication contention awareness into DAG scheduling, such as the work done by Kwok et al. [2000], Beaumont et al. [2002], Sinnen et al. [2005], and Agarwal et al. [2006]. Finally, the authors acknowledge the work by Wang et al. [2007], which proposes a joint scheduling model of computing and networking resources for optical grid applications by incorporating the link communication contention of the optical networks into DAG scheduling.

Shortcomings of Previous Work. With regard to the proposed algorithms for DAG scheduling by Topcuoglu et al. [2002], Wu et al. [1990], Yang et al. [1994] and Kwok et al. [2000], the authors note that these algorithms cannot be applied to grid computing applications. The authors further note that most of them assume an ideal communication system in which the grid resources are fully connected and the communication between any two grid resources can be provisioned whenever the need arises. This is not the case in a real-world OCS network, in which a lightpath must first be set up before each communication and torn down after the communication ends [Jin et al. 2009]. In other words, they do not account for the optical constraints: if a wavelength on a link is already occupied by one lightpath, it cannot be used by another lightpath at the same time, which creates communication contention problems.

New Idea/Algorithm/Architecture. The authors model this problem as a communication-aware Directed Acyclic Graph (DAG) scheduling problem. They claim to have found that there are two ways to reduce the communication contention. The first is to use adaptive routing schemes to detour around heavy traffic. The second is to map task objects to nearby grid resources to avoid long-hop communications. The authors focus on the second method in this paper, and propose to use the hop-bytes metric (HBM) heuristic to select computing resources in order to reduce the communication contention.
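To make the resource-selection idea concrete, the following is a minimal sketch of a hop-bytes based choice. It assumes the usual definition of hop-bytes (data volume multiplied by the number of hops it travels, summed over all transfers) and a connected topology. The function names, the adjacency-list input and the small ring topology are hypothetical illustrations, and the sketch is not the algorithm of Jin et al. [2009].

```python
from collections import deque


def hop_counts(adjacency, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def select_resource_hbm(adjacency, candidates, predecessors):
    """Return (resource, hop_bytes) for the candidate with the smallest hop-bytes.

    predecessors is a list of (resource_hosting_predecessor, bytes_to_transfer)
    pairs describing the input data the new task must receive.
    Assumes every predecessor host is reachable from every candidate.
    """
    best, best_cost = None, float("inf")
    for candidate in candidates:
        dist = hop_counts(adjacency, candidate)
        # Hop-bytes: each transfer costs its volume times the hops it crosses.
        cost = sum(volume * dist[host] for host, volume in predecessors)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Hypothetical 4-node ring 0-1-2-3-0; predecessors sit on nodes 0 and 1.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
choice, cost = select_resource_hbm(ring, [0, 1, 2, 3], [(0, 100), (1, 300)])
print(choice, cost)
```

In this toy case the task is placed on the node that already holds the larger input, so only the smaller transfer crosses a link. A real scheduler would also have to weigh resource availability and finish times, which this sketch ignores.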

Experiments. The authors employed two conventional network topologies: a 64-node mesh-torus and a 46-node USNET. In order to minimize communication contention, the algorithm aims to minimize the link capacity. The authors further employ two routing schemes and three resource selection schemes, while using the same random DAG generator as in the work by Wang et al. [2005]. The DAG node weight is drawn randomly from a uniform distribution centered on approximately 10, making the average node weight 10, and the communication-computation ratio (CCR) is set to 2 to simulate a more communication-intensive application. Furthermore, the authors assumed that all DAG nodes and grid resources are of the same type and that grid resources are homogeneous. Performance results are the average of 100 simulations.
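The relation between node weights and CCR can be illustrated with a short sketch. It assumes the common definition of CCR as the average communication cost of a DAG divided by its average computation cost. The uniform range of 5 to 15, the seed, the tiny edge list and the function name are assumptions chosen for illustration; the source does not give the generator's actual parameters.

```python
import random


def random_weights(num_nodes, edges, mean_node_weight=10.0, ccr=2.0,
                   spread=0.5, seed=0):
    """Draw node and edge weights so that mean(edge) / mean(node) equals ccr.

    The uniform range mean * (1 +/- spread) is an assumed parameter.
    edges must be a non-empty list of (parent, child) pairs.
    """
    rng = random.Random(seed)
    low = mean_node_weight * (1 - spread)
    high = mean_node_weight * (1 + spread)
    node_w = [rng.uniform(low, high) for _ in range(num_nodes)]
    raw_edge = [rng.uniform(low, high) for _ in edges]
    # Rescale the edge weights so the realized ratio matches the target CCR.
    target_mean = ccr * (sum(node_w) / num_nodes)
    scale = target_mean / (sum(raw_edge) / len(raw_edge))
    edge_w = {edge: w * scale for edge, w in zip(edges, raw_edge)}
    return node_w, edge_w


# Hypothetical 5-task DAG.
node_w, edge_w = random_weights(5, [(0, 1), (0, 2), (1, 3), (2, 4)])
realized = (sum(edge_w.values()) / len(edge_w)) / (sum(node_w) / len(node_w))
print(round(realized, 6))
```

With an average node weight near 10 and a CCR of 2, the average edge weight comes out near 20, meaning a typical transfer costs about twice as much as a typical computation.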

Results. The authors demonstrate that the HBM-based resource selection scheme contributes to lower resource utilization, while the adaptive routing scheme reduces the schedule length. This leads the authors to claim that simulation results show that the HBM approach, combined with the adaptive routing scheme, can achieve better performance in terms of normalized schedule length and link utilization, and that most of the communication contention can be avoided.

Claims Made. The authors claim that there are few works that focus on the scheduling problem for optical grids in theoretical detail. They further claim that when they employ the HBM and routing schemes together, both of their merits can be achieved and most of the communication contention can be avoided, leading to the smallest schedule length with relatively lower link utilization [Jin et al. 2009].

4.10. Stevens et al. 2009

Citation: STEVENS T., DE LEENHEER M., DEVELDER C., DHOEDT B., CHRISTODOULOPOULOS K., KOKKINOS P., AND VARVARIGOS E. 2009. Multi-cost job routing and scheduling in grid networks. Future Generation Computer Systems 25, 8, 912-925.

Problem. The algorithms responsible for routing data and scheduling tasks determine how efficiently the available infrastructure of a grid network is managed, which is vital for satisfying user requirements and maximizing resource utilization. Grid applications usually pose challenging demands on networks, since data transfers demand high-bandwidth and low-latency connections. This makes optical grids the most suitable technology to implement today’s grid systems. However, the authors speculate that, irrespective of the transport technology used, several problems arise. The authors address two: the efficient routing of data between grid sites, and accounting for temporal information in the scheduling and routing decisions.

Previous Work. With regard to multi-cost algorithms, the authors cite works by Wang et al. [1996], Van Mieghem et al. [2001, 2004], Kuipers et al. [2004, 2005], and Chen et al. [1998]. However, they note that these algorithms are mainly used for QoS routing problems.

Shortcomings of Previous Work. The authors do not specify any shortcomings of the previous work mentioned.

New Idea/Algorithm/Architecture. The authors present several multi-cost algorithms for the joint scheduling of the communication and computational resources that will be used by a grid task [Stevens et al. 2009]. Specifically, the authors claim the proposed immediate reservation algorithm selects the computation resource to execute the task and determines the path to route the input data [Stevens et al. 2009]. Further, they propose multi-cost schemes of polynomial complexity which perform advance reservations, and therefore also find the start times for the transmission of data and task execution.
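As an illustration of what finding start times under advance reservations involves, the sketch below searches existing reservations for the earliest slot in which the input data can be sent, and then the earliest slot in which the task can run once the data has arrived. It is a simplified single-link, single-CPU illustration under assumed inputs, not the multi-cost algorithm of Stevens et al. [2009]; the function names and the interval representation are hypothetical.

```python
def earliest_start(busy, ready, duration):
    """Earliest t >= ready such that [t, t + duration) avoids all busy intervals.

    busy is a list of (start, end) half-open reservations on one resource.
    """
    t = ready
    for start, end in sorted(busy):
        if t + duration <= start:
            break      # the request fits in the gap before this reservation
        if end > t:
            t = end    # pushed past this reservation
    return t


def reserve(link_busy, cpu_busy, ready, transfer_time, exec_time):
    """Return (transmission_start, execution_start) for one task."""
    tx_start = earliest_start(link_busy, ready, transfer_time)
    # Execution cannot begin before the input data has fully arrived.
    data_arrival = tx_start + transfer_time
    exec_start = earliest_start(cpu_busy, data_arrival, exec_time)
    return tx_start, exec_start


# Hypothetical reservations, in arbitrary time units.
print(reserve(link_busy=[(0, 4), (6, 9)], cpu_busy=[(5, 12)],
              ready=1, transfer_time=2, exec_time=3))
```

Here the transfer is slotted into the gap between the two link reservations and the task waits for the CPU to become free, so the request is accommodated at the price of a later start.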

Experiments. The authors used the Phosphorus network from the Phosphorus project, with 9 nodes and 16 bidirectional links. Further, the authors assumed that the network is composed of 1 Gbps optical links, with each offering a single wavelength. Moreover, the authors selected three nodes at random to function as computing elements, with each element containing 60 CPUs, and each CPU offering 25000 million instructions per second (MIPS).
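To give a feel for the scale of this setup, the following sketch converts the stated link rate and CPU speed into transfer and execution times. The task size (1 GB of input, 5 x 10^12 instructions) is a hypothetical example rather than a workload from the paper, and the calculation ignores propagation delay and protocol overhead.

```python
LINK_RATE_BPS = 1e9            # 1 Gbps optical link, as stated in the setup
CPU_RATE_IPS = 25_000 * 1e6    # 25000 MIPS per CPU, as stated in the setup


def transfer_seconds(data_bytes):
    """Time to push data_bytes over one link at full rate."""
    return data_bytes * 8 / LINK_RATE_BPS


def execution_seconds(instructions):
    """Time for one CPU to execute the given number of instructions."""
    return instructions / CPU_RATE_IPS


# Hypothetical task: 1 GB of input data and 5e12 instructions.
print(transfer_seconds(1e9), execution_seconds(5e12))
```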

Results. The authors claim that the quantitative results show that, due to the inclusion of temporal information in their multi-cost formulation, the time complexity of the advance reservation algorithms increased.

Claims Made. The authors claim that they have demonstrated the inherent trade-off between immediate and advance reservations. Specifically, they claim to have found that the lower blocking probability achieved by advance reservations comes at the cost of a small or moderate increase in the end-to-end delay.

5. REFERENCES

BANERJEE, A., CHUN FENG, W., GHOSAL, D., AND MUKHERJEE, B. 2008. Algorithms for integrated routing and scheduling for aggregating data from distributed resources on a lambda grid. IEEE Transactions on Parallel and Distributed Systems 19, 1, 24–34.

BANERJEE, A., CHUN FENG, W., MUKHERJEE, B., AND GHOSAL, D. 2005. Routing and scheduling large file transfers over lambda grids. In Proceedings of the 3rd International Workshop on Protocols for Fast Long-Distance Networks (PFLDnet05), February 2005.

BINATO, S., HERY, W., LOEWENSTERN, D. M., AND RESENDE, M. G. C. 2000. A grasp for job shop scheduling. In Essays and Surveys on Metaheuristics. Kluwer Academic Publishers, 59–79.

BLYTHE, J., JAIN, S., DEELMAN, E., GIL, Y., VAHI, K., MANDAL, A., AND KENNEDY, K. 2005. Task scheduling strategies for workflow-based applications in grids. In IEEE International Symposium on Cluster Computing and the Grid, 2005. CCGrid 2005. Vol. 2. 759–767.

BRAUN, T. D., SIEGEL, H. J., BECK, N., BOLONI, L. L., MAHESWARAN, M., REUTHER, A. I., ROBERTSON, J. P., THEYS, M. D., YAO, B., HENSGEN, D., AND FREUND, R. F. 2001. A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. J. Parallel and Distributed Computing 61, 810–837.

BURCHARD, L.-O. 2005. Networks with advance reservations: Applications, architecture, and performance. Journal of Networks and System Management 13, 429–449.

DE LEENHEER, M., DEVELDER, C., STEVENS, T., DHOEDT, B., PICKAVET, M., AND DEMEESTER, P. 2007. Design and control of optical grid networks. In Fourth International Conference on Broadband Communications, Networks and Systems, 2007. BROADNETS 2007. IEEE, 107–115.

DE LEENHEER, M., THYSEBAERT, P., VOLCKAERT, B., DE TURCK, F., DHOEDT, B., DEMEESTER, P., SIMEONIDOU, D., NEJABATI, R., ZERVAS, G., KLONIDIS, D., AND O’MAHONY, M. 2006. A view on enabling-consumer oriented grids through optical burst switching. IEEE Communications Magazine 44, 3, 124–131.

DEVELDER, C., DHOEDT, B., MUKHERJEE, B., AND DEMEESTER, P. 2009. On dimensioning optical grids and the impact of scheduling. Photonic Network Communications 17, 255–265. 10.1007/s11107-008-0160-z.

FOSTER, I., KESSELMAN, C., LEE, C., LINDELL, B., NAHRSTEDT, K., AND ROY, A. 1999. A distributed resource management architecture that supports advance reservations and co-allocation. In 1999 Seventh International Workshop on Quality of Service, 1999. IWQoS ’99. 27–36.

GUO, W., WANG, Z., SUN, Z., SUN, W., JIN, Y., HU, W., AND QIAO, C. 2009. Task scheduling accuracy analysis in optical grid environments. Photonic Network Communications 17, 209–217.

GUO, W., WEIQIANG, S., HU, W., AND JIN, Y. 2006. Resource allocation strategies for data-intensive workflow-based applications in optical grids. In 10th IEEE Singapore International Conference on Communication systems, 2006. ICCS 2006. IEEE, 1–5.

HE, X., SUN, X., AND VON LASZEWSKI, G. 2003. Qos guided min-min heuristic for grid task scheduling. J. Comput. Sci. Technol. 18, 442–451.

JIN, Y., WANG, Y., GUO, W., SUN, W., AND HU, W. 2009. Communication contention reduction in joint scheduling for optical grid computing. In Networks for Grid Applications. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering Series, vol. 2. Springer Berlin Heidelberg, 206–214.

KIM, S.-I., JUKAN, A., AND LUMETTA, S. S. 2007. Coordinated resource scheduling in high-performance optical grids. Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference, 2007. OFC/NFOEC 2007., 1–3.

LI, K. 2005. Job scheduling for grid computing on metacomputers. In Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE International.

LIANG, X., LIN, X., AND LI, M. 2006. Adaptive task scheduling on optical grid. 2006 IEEE Asia-Pacific Conference on Services Computing (APSCC’06), 486–491.

LIU, X., QIAO, C., WEI, W., YU, X., WANG, T., HU, W., GUO, W., AND WU, M. 2009. Task scheduling and lightpath establishment in optical grids. Journal of Lightwave Technology 27, 12, 1796–1805.

MANDAL, A., KENNEDY, K., KOELBEL, C., MARIN, G., MELLOR-CRUMMEY, J., LIU, B., AND JOHNSSON, L. 2005. Scheduling strategies for mapping application workflows onto the grid. In 14th IEEE International Symposium on High Performance Distributed Computing, 2005. HPDC-14. Proceedings. 125–134.

SIMEONIDOU, D., NEJABATI, R., ZERVAS, G., KLONIDIS, D., TZANAKAKI, A., AND O’MAHONY, M. 2005. Dynamic optical-network architectures and technologies for existing and emerging grid services. Journal of Lightwave Technology 23, 10, 3347–3357.

SINNEN, O. AND SOUSA, L. 2005. Communication contention in task scheduling. IEEE Transactions on Parallel and Distributed Systems 16, 6, 503–515.

SINNEN, O., SOUSA, L., AND SANDNES, F. 2006. Toward a realistic task scheduling model. Parallel and Distributed Systems, IEEE Transactions on 17, 3, 263–275.

SINNEN, O. AND SOUSA, L. 2004. On task scheduling accuracy: Evaluation methodology and results. Journal of Supercomputing 27, 177–194.

STEVENS, T., DE LEENHEER, M., DEVELDER, C., DHOEDT, B., CHRISTODOULOPOULOS, K., KOKKINOS, P., AND VARVARIGOS, E. 2009. Multi-cost job routing and scheduling in grid networks. Future Generation Computer Systems 25, 8, 912–925.

TANWIR, S., BATTESTILLI, L., PERROS, H., AND KARMOUS-EDWARDS, G. 2008. Dynamic scheduling of network resources with advance reservations in optical grids. International Journal of Network Management 18, 2, 79–105.

THYSEBAERT, P., DE LEENHEER, M., VOLCKAERT, B., DE TURCK, F., DHOEDT, B., AND DEMEESTER, P. 2005. Using divisible load theory to dimension optical transport networks for grid excess load handling. In Joint International Conference on Autonomic and Autonomous Systems and International Conference on Networking and Services, 2005. ICAS-ICNS 2005. 89.

TOPCUOGLU, H., HARIRI, S., AND WU, M.-Y. 2002. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems 13, 3, 260–274.

WANG, B., LI, T., LUO, X., FAN, Y., AND XIN, C. 2005. On service provisioning under a scheduled traffic model in reconfigurable wdm optical networks. In BroadNets 2005. 2nd International Conference on Broadband Networks, 2005. Vol. 1. 13–22.

WANG, Y., JIN, Y., GUO, W., SUN, W., HU, W., AND WU, M. 2007a. Joint scheduling for optical grid applications. Journal of Optical Networking 6, 3, 304–318.

WANG, Z., GUO, W., SUN, Z., JIN, Y., SUN, W., HU, W., LIN, X., WU, M.-Y., LIU, H., FU, S., YUAN, J., AND QIAO, C. 2007b. Demonstration of a task-flow based aircraft collaborative design application in optical grid. 2007 33rd European Conference and Exhibition of Optical Communication (ECOC), 1–2.

WANG, Z., GUO, W., SUN, Z., JIN, Y., SUN, W., HU, W., AND QIAO, C. 2007c. On accurate task scheduling in Optical Grid. 2007 First International Symposium on Advanced Networks and Telecommunication Systems 1, 1–2.

YU, J., BUYYA, R., AND THAM, C. K. 2005. Cost-based scheduling of scientific workflow applications on utility grids. In First International Conference on e-Science and Grid Computing, 2005.

ZERVAS, G., NEJABATI, R., WANG, Z., SIMEONIDOU, D., YU, S., AND O’MAHONY, M. 2007. A fully functional application-aware optical burst switched network test-bed. In Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference, 2007. OFC/NFOEC 2007. 1–3.

ZHENG, J. AND MOUFTAH, H. 2002. Routing and wavelength assignment for advance reservation in wavelength-routed wdm optical networks. In IEEE International Conference on Communications, 2002. ICC 2002. Vol. 5. 2722–2726.
