Task Clustering Problem with Large Communication Delays: Convex Clustering
Johnatan PECERO SANCHEZ
Informatique et Distribution Laboratory
MOAIS Team (13/02/2006)
Johnatan Pecero Convex Clustering Algorithm
Outline
2 Context and Motivation
3 Task Clustering Problem
4 Convex Clustering: Definition
5 Genetic Convex Clustering Algorithm: Genetic Algorithm
6 Experimental Results
7 Conclusions: Conclusions and future works
Context and Motivation
For most of the parallel and distributed systems available today, communication issues are a crucial factor of performance.
The times for exchanging data are usually much larger than the times for computing elementary operations.
This is even more important for cluster computing systems or grid computing systems.
The efficient execution of an application on a parallel and distributed system highly depends on the decisions taken for scheduling the tasks that constitute the program.
Scheduling problem
One of the most challenging problems in parallel and distributed computing.
Goal
To determine an assignment of tasks to processing elements, together with an execution order, that optimizes given performance indices.
Focus
Taking large communication costs into account in the scheduling decision is key to reaching high performance.
Communication times are usually high relative to basic execution times.
Preliminaries
Classical application model
An application is represented as a precedence task graph G = (V, E), where
V is the set of task nodes, which are in one-to-one correspondence with the computational tasks in the application
E is the set of communication edges and represents the set of precedence relations between the tasks.
Target architecture
A generic parallel system composed of an unbounded number of processors linked by an interconnection network.
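The application model above can be sketched as a small data structure. This is only an illustrative sketch: the class name `TaskGraph`, its method names, and the four-task example graph are assumptions made here and are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class TaskGraph:
    """Precedence task graph G = (V, E) with node and edge weights."""
    weight: dict = field(default_factory=dict)  # task -> execution time
    succ: dict = field(default_factory=dict)    # task -> {successor: communication cost}

    def add_task(self, name, exec_time):
        self.weight[name] = exec_time
        self.succ.setdefault(name, {})

    def add_edge(self, u, v, comm_cost):
        # Edge (u, v): task v needs data produced by task u.
        self.succ[u][v] = comm_cost

    def predecessors(self, v):
        return [u for u, out in self.succ.items() if v in out]

    def topological_order(self):
        # Kahn's algorithm; raises if the graph is not acyclic.
        indeg = {v: 0 for v in self.weight}
        for out in self.succ.values():
            for v in out:
                indeg[v] += 1
        ready = [v for v, d in indeg.items() if d == 0]
        order = []
        while ready:
            u = ready.pop()
            order.append(u)
            for v in self.succ[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
        if len(order) != len(self.weight):
            raise ValueError("graph has a cycle")
        return order


# Hypothetical diamond-shaped application: a -> {b, c} -> d
g = TaskGraph()
for name, w in [("a", 2), ("b", 3), ("c", 1), ("d", 2)]:
    g.add_task(name, w)
for u, v in [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]:
    g.add_edge(u, v, 5)
order = g.topological_order()
```

The unbounded number of processors does not appear in the structure: each cluster built later can simply be given its own processor.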
Computational model
The delay model.
Introduced by Rayward-Smith.
The communication time between two communicating tasks is neglected if they are executed on the same processor; otherwise a communication cost is incurred.
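A minimal sketch of the delay model, assuming each task is mapped to a cluster and each cluster runs on its own processor. The function names and the example values are hypothetical, and processor availability is deliberately ignored here.

```python
def comm_delay(u, v, cluster_of, cost):
    """Delay paid on edge (u, v): zero inside a cluster, `cost` across clusters."""
    return 0 if cluster_of[u] == cluster_of[v] else cost


def earliest_start(v, preds, finish, cluster_of, cost):
    """Earliest time at which all data needed by v has arrived."""
    return max(
        (finish[u] + comm_delay(u, v, cluster_of, cost[(u, v)]) for u in preds),
        default=0,
    )


# Hypothetical example: a finishes at time 2; b shares a's cluster, c does not.
cluster_of = {"a": 0, "b": 0, "c": 1}
finish = {"a": 2}
cost = {("a", "b"): 5, ("a", "c"): 5}
start_b = earliest_start("b", ["a"], finish, cluster_of, cost)
start_c = earliest_start("c", ["a"], finish, cluster_of, cost)
```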
Figure: Gaussian Elimination (precedence task graph with 18 task nodes, n1 to n18, with weighted nodes and edges)
Precedence Task Graph
Vertices: computation tasks
Edges: data dependencies between tasks.
Task clustering
Clustering
Often used as a compile-time preprocessing step in mapping parallel applications onto multiprocessor architectures.
Considered as a scheduling problem on an unbounded number of processors.
NP-hard problem even for simple cases.
It is a two-step method:
1 Use heuristics to group tasks into clusters
2 Allocate the clusters to the available processors and compute the final ordering of the tasks.
Heuristics
Practical solutions have been proposed based mainly on three different heuristics:
1 Critical path analysis: DSC (Dominant Sequence Clustering)
2 Priority-based list scheduling: ETF (Earliest Task First)
3 Graph decomposition: CLAN, Convex Clustering
Our objective
Objective
To solve the task clustering problem on an unbounded number of processors, taking into account arbitrary execution times and large communication delays.
Proposed solution
An extension of the Convex Clustering Algorithm.
Convex Clustering
Idea
Assign tasks to locations in convex groups.
Convexity
Definition 1: A group of tasks $T \subset V$ is said to be convex if and only if, for all pairs of tasks $(x, z) \in T \times T$, every intermediate task also belongs to $T$: $\forall y,\ x \prec y \wedge y \prec z \implies y \in T$
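Definition 1 can be checked directly on a small graph. The sketch below assumes the graph is given as a successor dictionary in which every task appears as a key; the names `reachable_from` and `is_convex` and the diamond example are illustrative choices, not part of the original slides.

```python
def reachable_from(succ, start):
    """All tasks y with start ≺ y (reachable by a non-empty path)."""
    seen, stack = set(), list(succ.get(start, ()))
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(succ.get(u, ()))
    return seen


def is_convex(succ, group):
    """True if no task outside `group` lies on a path between two tasks of `group`."""
    group = set(group)
    for x in group:
        # y is outside the group and x ≺ y ...
        for y in reachable_from(succ, x) - group:
            # ... so the group is not convex if some z in the group has y ≺ z.
            if reachable_from(succ, y) & group:
                return False
    return True


# Hypothetical diamond graph: a -> {b, c} -> d
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
```

In this example `{a, b}` is convex, whereas `{a, b, d}` is not, because `c` lies between `a` and `d` but is left out of the group.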
Example
Figure: example task graph with tasks A to G and weighted nodes and edges, drawn twice.
Definition 2: A clustering $R = \{V_i, \prec_i\}_i$ is a partition of the graph into groups of tasks, each associated with a total order $\prec_i$ that is a linear extension of the original partial order $\prec$.
Induced Graph
Definition 3: From each clustering, we can build a graph whose nodes are the clusters, adding all arcs corresponding to the sequential execution inside clusters. This graph is called the induced graph, and $\omega_R$ denotes the length of the longest path in it.
Induced Graph
Figure: On the left, a task graph G (tasks T1 to T7) representing a parallel application. On the right, the induced graph associated with the clustering.
Cluster Graph
Definition 4: The nodes of $G^{cluster}_R$ are the clusters of the clustering $R$, that is, the sets of tasks $\{V_i\}$. There exists an arc between two distinct nodes $V_i$ and $V_j \neq V_i$ if and only if there exist two tasks $x \in V_i$ and $y \in V_j$ such that $(x, y) \in E$. Moreover, each arc of the graph is weighted by $\rho$, and each node $V_i$ is weighted by $|V_i| = \sum_{x \in V_i} p_x$ (where $p_x$ is the execution time of task $x$). In this case, we denote the length of the longest path in this graph by $\omega^{cluster}_R$.
Lemma 1: For every convex clustering $R$, $\omega^{cluster}_R$ is an upper bound on $\omega_R$.
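The cluster graph of Definition 4 and its longest path can be computed as follows. This is a sketch under stated assumptions: the function name, the argument layout, and the example clustering are hypothetical, and the cluster graph is assumed to be acyclic (which holds for convex clusterings), since the recursion would not terminate otherwise.

```python
def cluster_graph_length(weight, edges, cluster_of, rho):
    """Longest path in the cluster graph: node weight = sum of p_x, arc weight = rho."""
    # Node weights: total execution time of the tasks in each cluster.
    node_w = {}
    for task, c in cluster_of.items():
        node_w[c] = node_w.get(c, 0) + weight[task]
    # One arc between two distinct clusters as soon as one task edge crosses them.
    arcs = {c: set() for c in node_w}
    for x, y in edges:
        cx, cy = cluster_of[x], cluster_of[y]
        if cx != cy:
            arcs[cx].add(cy)
    memo = {}

    def longest_from(c):
        if c not in memo:
            memo[c] = node_w[c] + max(
                (rho + longest_from(d) for d in arcs[c]), default=0
            )
        return memo[c]

    return max(longest_from(c) for c in node_w)


# Hypothetical diamond graph a -> {b, c} -> d, clustered as {a, b}, {c}, {d}
weight = {"a": 2, "b": 3, "c": 1, "d": 2}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 2}
omega_cluster = cluster_graph_length(weight, edges, cluster_of, rho=5)
```

Here the longest path visits the three clusters in turn: 5 + 5 + 1 + 5 + 2 = 18.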
Recall of the task clustering problem
Formal definition
Instance: a precedence task graph G and a delay ρ
Problem: Find a clustering $R = \{V_i, \prec_i\}_i$ of G.
Objective: Minimize the execution time $\omega_R$
3-fields notation
$P_\infty \mid prec,\ c_{i,j} = c,\ p_i \mid C_{\max}$
Bad News
No constant guarantee for large communication delays!
Characteristics
Properties
1 Convex clustering is a particular class of Processor Ordered Schedule
2 The clustered graph resulting from a convex clustering is a DAG
3 Schedules based on convex clusters are 2-dominant
Bad news
The task clustering problem in convex sets is NP-complete.
The algorithm
1 Algorithm CC Gatherρ(G(V, E))
2 A = Partition(G);
3 IF Clustering Condition THEN
    RA = Gatherρ(Induit(G, A));
    RA = Gatherρ(Induit(G, A));
    RA> = Gatherρ(Induit(G, A>));
    RA< = Gatherρ(Induit(G, A<));
    return RA< ⊎ RA ⊎ RA ⊎ RA>;
4 ELSE
    return (V, ≺);
5 END IF
Partitioning DAG problem
Formal definition
Instance: a directed acyclic graph G(V, E)
Solution: two disjoint groups of independent tasks, A1 and A2, such that for every task x in A1 and every task y in A2 there is no path from x to y and no path from y to x
Objective: Maximize the size of the smallest group (maximize min(|A1|, |A2|))
Complexity
NP-complete!
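Although finding the best partition is hard, checking a candidate solution is easy. The sketch below verifies the independence condition between two groups and evaluates the objective min(|A1|, |A2|); the function names and the example graph are assumptions made for illustration only.

```python
def descendants(succ, start):
    """All tasks reachable from `start` by a non-empty path."""
    seen, stack = set(), list(succ.get(start, ()))
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(succ.get(u, ()))
    return seen


def is_valid_partition(succ, a1, a2):
    """True if a1 and a2 are disjoint and no path links a task of one to a task of the other."""
    a1, a2 = set(a1), set(a2)
    if a1 & a2:
        return False
    if any(descendants(succ, x) & a2 for x in a1):
        return False
    if any(descendants(succ, y) & a1 for y in a2):
        return False
    return True


def partition_score(a1, a2):
    """Objective of the partitioning problem: size of the smallest group."""
    return min(len(a1), len(a2))


# Hypothetical diamond graph: a -> {b, c} -> d
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
ok = is_valid_partition(succ, {"b"}, {"c"})
bad = is_valid_partition(succ, {"a"}, {"d"})
```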
Definition
A genetic algorithm is a guided random search algorithm based on the principles of evolution and natural genetics.
It combines the exploitation of past results with the exploration of new areas of the search space.
It is randomized, but it is not a simple random walk.
It exploits historical information efficiently to speculate on new search points with expected improvement.
The algorithm
1 Compute initial population
2 WHILE stopping condition not fulfilled DO
BEGIN
    select individuals for reproduction
    create offspring by crossing individuals
    possibly mutate some individuals
    compute new generation
END
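The loop above can be sketched generically as follows. This is not the GCCA implementation: binary tournament selection, full generational replacement, minimization of the fitness, and the toy bit-string problem are all assumptions chosen to keep the example short. The default parameter values follow the benchmark settings reported in the experimental results of this talk.

```python
import random


def genetic_search(init, fitness, crossover, mutate,
                   pop_size=100, p_cross=0.8, p_mut=0.01,
                   generations=100, seed=0):
    """Generic genetic loop that minimizes `fitness` (lower is better)."""
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]
    best = min(population, key=fitness)

    def select():
        # Binary tournament (an assumed selection scheme).
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):  # stopping condition: generation limit
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(), select()
            child = crossover(p1, p2, rng) if rng.random() < p_cross else list(p1)
            if rng.random() < p_mut:
                child = mutate(child, rng)
            offspring.append(child)
        population = offspring  # compute new generation
        best = min(population + [best], key=fitness)
    return best


# Toy problem (hypothetical): minimize the number of ones in an 8-bit string.
def init(rng):
    return [rng.randint(0, 1) for _ in range(8)]


def one_point(p1, p2, rng):
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def flip_one(child, rng):
    child = list(child)
    child[rng.randrange(len(child))] ^= 1
    return child


best = genetic_search(init, sum, one_point, flip_one)
```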
Coding
A solution representation of the task clustering problem based on convex sets is feasible if it satisfies the following conditions:
1 Each task of the precedence task graph belongs to exactly one cluster
2 The convexity constraint must be fulfilled
3 The fitness function of GCCA depends only on the clustering of the tasks, rather than on the numbering of the clusters.
0 1 2 3 4 5 60 3 2 2 2 2 3 3 2 0
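One way to satisfy conditions 1 and 3 is sketched below, assuming a chromosome is a list with one gene per task whose value is the cluster label of that task. This encoding and the helper names are assumptions for illustration; the slides do not spell out the exact representation. Condition 1 then holds by construction, and relabeling clusters in order of first appearance makes two chromosomes describing the same clustering identical.

```python
def canonical(chromosome):
    """Relabel clusters 0, 1, 2, ... in order of first appearance."""
    relabel = {}
    return [relabel.setdefault(c, len(relabel)) for c in chromosome]


def clusters(chromosome):
    """Group task indices by cluster label."""
    groups = {}
    for task, label in enumerate(chromosome):
        groups.setdefault(label, set()).add(task)
    return list(groups.values())


# Two hypothetical chromosomes that differ only in cluster numbering.
c1 = [3, 3, 1, 3, 0]
c2 = [1, 1, 0, 1, 2]
same = canonical(c1) == canonical(c2)
```

Convexity (condition 2) still has to be checked against the task graph for each cluster.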
Local Policy
There remains a problem to solve: the final ordering of tasks inside a given cluster.
Two strategies
1 Top Level or earliest execution time
2 Bottom Level or remaining execution time
Top Level
$TL(v, C) = \max(max_{in}, max_{out})$
where
$max_{in} = \max\{TL(u) + w(u) : u \in PRED(v),\ C(u) = C(v)\}$
$max_{out} = \max\{TL(u) + w(u) + c(u, v) : u \in PRED(v),\ C(u) \neq C(v)\}$
Bottom Level
$BL(v, C) = w(v) + \max(max_{in}, max_{out})$
where
$max_{in} = \max\{BL(u) : u \in SUCC(v),\ C(v) = C(u)\}$
$max_{out} = \max\{BL(u) + c(v, u) : u \in SUCC(v),\ C(v) \neq C(u)\}$
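Both levels can be computed in one pass over a topological order. In the sketch below, taking the maximum over all predecessors (or successors) with the communication cost set to zero inside a cluster is equivalent to max(max_in, max_out). The function names, the convention that an empty maximum is 0, and the example data are assumptions.

```python
def top_levels(order, pred, w, c, cluster):
    """TL(v): earliest start of v given the clustering; `order` is topological."""
    tl = {}
    for v in order:
        tl[v] = max(
            (tl[u] + w[u] + (0 if cluster[u] == cluster[v] else c[(u, v)])
             for u in pred[v]),
            default=0,  # entry tasks start at time 0
        )
    return tl


def bottom_levels(order, succ, w, c, cluster):
    """BL(v): remaining execution time from the start of v to the end."""
    bl = {}
    for v in reversed(order):
        bl[v] = w[v] + max(
            (bl[u] + (0 if cluster[v] == cluster[u] else c[(v, u)])
             for u in succ[v]),
            default=0,  # exit tasks only count their own weight
        )
    return bl


# Hypothetical diamond graph a -> {b, c} -> d with clusters {a, b} and {c, d}
order = ["a", "b", "c", "d"]
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
pred = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
w = {"a": 2, "b": 3, "c": 1, "d": 2}
c = {e: 5 for e in [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]}
cluster = {"a": 0, "b": 0, "c": 1, "d": 1}
tl = top_levels(order, pred, w, c, cluster)
bl = bottom_levels(order, succ, w, c, cluster)
```

Sorting the tasks of a cluster by increasing TL, or by decreasing BL, gives the two local ordering policies.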
Results
In producing the results given in this section, the following benchmark values were adopted:
Population size of 100
Crossover probability of 0.8
Mutation probability of 0.01
A generation limit of 100
We consider an unlimited number of processors
Reference Graphs
Task graphs that have been previously used by different researchers and addressed in the literature.
This set consists of about 9 graphs (7 to 18 task nodes).
The graphs are relatively small, but they have no trivial solution and expose the complexity of clustering well.
References   No. nodes   Optimal schedule   Optimal #clusters   GCCA schedule   GCCA #clusters
[10]         7           9                  2                   10              3
[22]         7           4                  2                   4               4
[10]         9           5                  2                   5               5
[24]         9           149                2                   160             3
[10]         10          26                 2                   26              3
[23]         10          11                 2                   10              6
[21]         13          190                2                   190             5
[20]         18          440                3                   480             5
[20]         18          330                3                   340             6
Figure: Results for the referenced graphs
Randomly generated task graphs
These task graphs range up to 1500 tasks in size.
We compare GCCA with the well-known DSC algorithm and another general genetic algorithm (GAAZ) applied to solve the task clustering problem.
The main performance measure: Normalized Schedule Length (NSL)
NSL
$NSL = SL / CP$
where
SL: Schedule length
CP: Critical path. It represents a lower bound on the schedule
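A small sketch of the NSL computation. The slides do not state whether the critical path counts communication costs along its edges, so the `include_comm` flag below is an assumption introduced to show both conventions; the function names and the example graph are likewise hypothetical.

```python
def critical_path(order, succ, w, c, include_comm=True):
    """Length of the longest path in the task graph; `order` is topological."""
    length = {}
    for v in reversed(order):
        length[v] = w[v] + max(
            (length[u] + (c[(v, u)] if include_comm else 0) for u in succ[v]),
            default=0,
        )
    return max(length.values())


def nsl(schedule_length, cp):
    """Normalized Schedule Length: NSL = SL / CP."""
    return schedule_length / cp


# Hypothetical diamond graph a -> {b, c} -> d
order = ["a", "b", "c", "d"]
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
w = {"a": 2, "b": 3, "c": 1, "d": 2}
c = {e: 5 for e in [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]}
cp_with_comm = critical_path(order, succ, w, c)
cp_without_comm = critical_path(order, succ, w, c, include_comm=False)
ratio = nsl(12, cp_with_comm)  # a schedule of length 12 is assumed here
```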
Performance evaluation
Figure: Normalized Schedule Length versus Communication to Computation Ratio for GenAL, DSC, GCAB and GCAT.
Figure: Performance evaluation of GCCA and DSC. Normalized Schedule Length versus number of tasks for DSC, GCAB and GCAT.
Conclusions
The clustering algorithm retains a guarantee with respect to an optimal clustering.
A new genetic algorithm to solve the task clustering problem was proposed.
The experimental results obtained by GCCA show the feasibility of using a genetic algorithm with convex cluster properties to solve the task clustering problem.
The convex clustering approach seems particularly well suited for new parallel systems such as clusters of PCs with hierarchical communication.
Future works
To parallelize GCCA.
To implement GCCA in a real parallel environment.
Extension of CC for parallel systems with hierarchical communication levels (clusters, grid computing).
THANK YOU!!