Tai, Yu-Chang
4/29/2013
Saeid Abrishami, Mahmoud Naghibzadeh, Dick H.J. Epema
Future Generation Computer Systems (FGCS), journal homepage: www.elsevier.com/locate/fgcs
Deadline-constrained workflow scheduling algorithms for Infrastructure as a Service Clouds
*Outline
* Introduction
* Scheduling system model
* IaaS cloud partial critical paths algorithms
* An illustrative example
* Time complexity
* Performance evaluation
* Conclusions
*Introduction
*Clouds differ from utility Grids in:
- on-demand resource provisioning
- homogeneous networks
- the pay-as-you-go pricing model
*We consider the benefits of using Cloud computing for executing scientific workflows; there already exist several commercial Clouds, such as Amazon
*Introduction
*Infrastructure as a Service (IaaS) Clouds have some potential benefits for executing scientific workflows:
1. users can dynamically obtain and release resources on demand, and are charged on a pay-as-you-go basis
2. resource provisioning
3. the illusion of unlimited resources
*An important parameter is the economic cost: faster resources are more expensive than slower ones
- there is a time-cost tradeoff in selecting appropriate services
- this belongs to the class of multi-criteria optimization problems
*Goal: minimize the execution cost of the workflow, while completing the workflow before the user-specified deadline
*Proposed algorithms:
- IaaS Cloud Partial Critical Paths (IC-PCP)
- IaaS Cloud Partial Critical Paths with Deadline Distribution (IC-PCPD2)
*Scheduling system model
*An application is modeled by a directed acyclic graph G(T, E)
*T is a set of n tasks {t1, t2, . . . , tn}
*E is a set of dependencies ei,j = (ti, tj)
*Two dummy tasks, tentry and texit, are added to the beginning and the end of the workflow; they have zero execution time and are connected with zero-weight dependencies to the actual entry and exit tasks
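A minimal sketch (not the paper's code) of this model, with task names and edge weights that are illustrative only:

```python
# Minimal sketch of the workflow model G(T, E): tasks, weighted
# dependency edges, and the two dummy tasks t_entry/t_exit connected
# with zero-weight edges. Task names and weights are illustrative.

def add_dummy_tasks(tasks, edges):
    """Add t_entry/t_exit with zero-weight edges to the actual entry
    tasks (no predecessors) and exit tasks (no successors)."""
    has_pred = {t for (_, t) in edges}
    has_succ = {s for (s, _) in edges}
    new_edges = dict(edges)
    for t in tasks:
        if t not in has_pred:
            new_edges[("t_entry", t)] = 0  # zero-weight dependency
        if t not in has_succ:
            new_edges[(t, "t_exit")] = 0
    return ["t_entry"] + list(tasks) + ["t_exit"], new_edges

tasks = ["t1", "t2", "t3"]
edges = {("t1", "t3"): 2, ("t2", "t3"): 3}  # e_{i,j} -> TT(e_{i,j})
all_tasks, all_edges = add_dummy_tasks(tasks, edges)
```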
*Scheduling system model
*Computation services S = {s1, s2, …, sm} with different QoS parameters, such as CPU type and memory size, and different prices
*The pricing model is pay-as-you-go, similar to the current commercial Clouds: users are charged for the number of time intervals they have used a resource, even if they have not completely used the last time interval
*Example prices: c1 = 5, c2 = 2, c3 = 1
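A sketch of this charging rule; the interval length of 10 below is an assumption for illustration, not a value from the paper:

```python
import math

# Sketch of the pricing model above: users pay for every started time
# interval of an instance, including a partly used last interval.
# The interval length of 10 is an illustrative assumption.

def charged_cost(used_time, price_per_interval, interval_length=10):
    """Cost of one instance that was used for `used_time` time units."""
    intervals = math.ceil(used_time / interval_length)  # last interval counts in full
    return intervals * price_per_interval

# With the example prices c1 = 5, c2 = 2, c3 = 1:
cost = charged_cost(12, 5)  # 12 time units on s1 -> 2 intervals -> 10
```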
*ET(ti, sj) : execution time of task ti on computation service sj
*The average bandwidth between the computation services is roughly equal
*TT(ei,j) : data transfer time of a dependency ei,j
*MET(ti) : Minimum Execution Time of task ti, i.e., the execution time of ti on the service sj ∈ S with the minimum ET(ti, sj) among all available services
*SS(ti) = sj,k : Selected Service for each scheduled task ti
*sj,k : the kth instance of service sj
*AST(ti) : Actual Start Time of ti
*Assigned node : a node that has already been assigned to (scheduled on) a service
*Critical Parent of a node ti : the unassigned parent of ti with the latest data arrival time at ti, that is, the parent tp of ti for which EFT(tp) + TT(ep,i) is maximal
*PCP : the Partial Critical Path of a node ti is
- empty, if ti does not have any unassigned parents
- otherwise, the Critical Parent tp of ti, together with the Partial Critical Path of tp if tp has any unassigned parents
*Algorithm 1: IC-PCP
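The path construction at the heart of IC-PCP can be sketched as follows. This is a simplified illustration, not the paper's pseudocode; the `parents`, `eft`, and `tt` maps (graph structure, EFT, and TT values) are assumed precomputed:

```python
# Simplified sketch of IC-PCP's path construction: follow critical
# parents backwards from a node until all parents are assigned.

def critical_parent(t, parents, assigned, eft, tt):
    """Unassigned parent of t with the latest data arrival time at t."""
    unassigned = [p for p in parents[t] if p not in assigned]
    if not unassigned:
        return None
    return max(unassigned, key=lambda p: eft[p] + tt[(p, t)])

def partial_critical_path(t, parents, assigned, eft, tt):
    """PCP of t: its critical parent tp, preceded by the PCP of tp."""
    path = []
    p = critical_parent(t, parents, assigned, eft, tt)
    while p is not None:
        path.insert(0, p)  # keep the path in execution order
        p = critical_parent(p, parents, assigned, eft, tt)
    return path

# Toy chain reproducing the example's first PCP {t2, t6, t9}:
parents = {"texit": ["t9"], "t9": ["t6"], "t6": ["t2"], "t2": []}
eft = {"t9": 15, "t6": 10, "t2": 5}
tt = {("t9", "texit"): 0, ("t6", "t9"): 1, ("t2", "t6"): 1}
pcp = partial_critical_path("texit", parents, set(), eft, tt)
```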
*An illustrative example (IC-PCP)
Execution times ET(ti, sj) of the nine tasks on the three services (prices c1 = 5, c2 = 2, c3 = 1), with deadline D = 30:

Task | s1 | s2 | s3
t1   |  2 |  5 |  8
t2   |  4 |  6 | 10
t3   |  4 |  8 | 11
t4   |  5 |  8 | 11
t5   |  5 | 12 | 16
t6   |  3 |  8 | 11
t7   |  3 |  6 |  8
t8   |  3 |  5 |  9
t9   |  5 |  8 | 14

(Figures: the workflow DAG with each node annotated with its time window, e.g. 0~3__16, and a Gantt chart of the growing schedule over 0 to 30.)
IC-PCP schedules one partial critical path at a time:
- Path {t2, t6, t9} -> instance S2,1 of service s2
- Path {t3} -> instance S3,1 of service s3
- Path {t5, t8} -> instance S2,2 of service s2
- Path {t1, t4} -> instance S3,2 of service s3
- Path {t7} -> instance S3,3 of service s3
Final cost: COST = 2*5 + 1*4 = 14
*Applicable instances
*An existing instance is applicable for a path if it satisfies two conditions:
- The path can be scheduled on the instance such that each task of the path finishes before its latest finish time
- The new schedule uses (a part of) the extra time of the instance, which is the remaining time of the last time interval of that instance
*In that case the path is scheduled at zero cost, since the extra time has already been paid for
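Under simplifying assumptions (the path's tasks run back-to-back from the moment the instance is free, transfer times omitted), the two conditions can be sketched as:

```python
# Sketch of the applicability test: (1) every task of the path finishes
# on the instance before its latest finish time (LFT), and (2) the new
# schedule starts inside the instance's already-paid extra time. The
# back-to-back execution and omitted transfer times are simplifications.

def is_applicable(free_from, paid_until, path, et, lft):
    """free_from: earliest time the instance is free;
    paid_until: end of its last already-charged time interval."""
    t = free_from
    for task in path:
        t += et[task]
        if t > lft[task]:
            return False  # condition 1 violated
    return free_from < paid_until  # condition 2: reuse paid extra time

et = {"a": 3, "b": 2}     # illustrative execution times on the instance
lft = {"a": 10, "b": 12}  # illustrative latest finish times
ok = is_applicable(2, 5, ["a", "b"], et, lft)    # applicable
bad = is_applicable(8, 10, ["a", "b"], et, lft)  # "a" would end at 11 > 10
```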
*Algorithm 2: IC-PCPD2
- Call PLANNING(G(T, E))
- Assign a subdeadline to each PCP node (which then counts as an assigned node)
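A simplified sketch of the per-task scheduling step that follows the deadline distribution: each task takes the cheapest service whose execution time still meets the task's subdeadline (instance reuse and data-transfer times are omitted; the values mirror the slides' example):

```python
# Sketch of IC-PCPD2's scheduling phase after deadline distribution.
# Instance reuse and transfer times are omitted for brevity.

def cheapest_feasible_service(ready_time, subdeadline, exec_times, prices):
    """exec_times: ET(t, s_j) per service; prices: c_j per service."""
    feasible = [s for s, et in exec_times.items()
                if ready_time + et <= subdeadline]
    if not feasible:
        return None  # the subdeadline cannot be met by any service
    return min(feasible, key=lambda s: prices[s])

# A task ready at time 0 with subdeadline 7; ET on s1/s2/s3 = 2/5/8,
# prices c1 = 5, c2 = 2, c3 = 1:
choice = cheapest_feasible_service(0, 7, {"s1": 2, "s2": 5, "s3": 8},
                                   {"s1": 5, "s2": 2, "s3": 1})  # "s2"
```

Here s3 is too slow to meet the subdeadline, so the cheaper of the two remaining services (s2) is selected.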
*An illustrative example (IC-PCPD2)
Same workflow, services, prices, and deadline D = 30 as before. PLANNING first distributes the deadline over the partial critical paths, giving every task a subdeadline (sb); the tasks are then scheduled one by one, each on the cheapest service instance that still meets its subdeadline.
(Figures: the DAG annotated with subdeadlines, and a Gantt chart built up as t1 through t9 are scheduled.)
Resulting assignment: t4 -> S3,2, t6 -> S1,2, t8 -> S3,3, t9 -> S2,2, and the remaining tasks on instances S1,1, S2,1, and S3,1.
Final cost: COST = 5*2 + 2*2 + 1*4 = 18, higher than the cost of 14 achieved by IC-PCP on the same workflow
*Time complexity
- Computing EST, EFT, and LFT for all tasks: O(n + e) ~ O(n^2)
- Finding the critical parent of a node: O(n - 1)
- Scheduling all paths on the service instances: O(m*n) = O(n^2)
- IC-PCP overall: O(n^2)
- IC-PCPD2 (Call PLANNING(G(T,E)) and assign subdeadlines to the PCP nodes, then schedule the tasks): O(n) for the deadline distribution plus O(n^2) for the scheduling, so O(n^2) overall
*Performance evaluation
Algorithms compared: (1) IC-PCP, (2) IC-PCPD2, (3) IC-Loss
*The total cost of each workflow execution is normalized
*Cheapest schedule : schedule all workflow tasks on a single instance of the cheapest computation service
*Fastest schedule : schedule each workflow task on a distinct instance of the fastest computation service, with all data transmission times considered to be zero; MF = makespan of the Fastest schedule
*Deadline factor α : set the deadline to α·MF. Since the problem has no solution for α = 1, we let α range from 1.5 to 5 in our experiments, with a step length of 0.5
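The deadline setup above can be sketched as follows (the MF value in the example is illustrative):

```python
# Sketch of the experimental deadline setup: deadline = alpha * MF for
# alpha from 1.5 to 5.0 in steps of 0.5, where MF is the makespan of
# the Fastest schedule. The MF value below is illustrative.

def deadlines(mf, start=1.5, stop=5.0, step=0.5):
    out = []
    a = start
    while a <= stop + 1e-9:  # guard against float drift
        out.append((a, a * mf))
        a += step
    return out

ds = deadlines(10)  # [(1.5, 15.0), (2.0, 20.0), ..., (5.0, 50.0)]
```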
*Evaluation results
Ranking by normalized cost across the ten test cases, where X > Y means X outperforms Y (1 = IC-PCP, 2 = IC-PCPD2, 3 = IC-Loss):
- 1 > 2 > 3 in six cases
- 1 ≈ 2 > 3 in two cases
- 2 > 1 > 3 in two cases
*Conclusions
*The new algorithms consider the main features of the current commercial Clouds, such as on-demand resource provisioning, homogeneous networks, and the pay-as-you-go pricing model
*The time complexity of both algorithms is O(n^2); this polynomial time complexity makes them suitable options for large workflows
*IC-PCP outperforms both IC-PCPD2 and IC-Loss in most cases
*Experiments show that the computation times of the algorithms are very low: less than 500 ms for the large workflows
*Future work: improve the algorithms for real Cloud environments
*Thanks~
*The end