HPCS 2005
Universidade Federal de Pelotas (BRA) LUPS – Laboratory of Ubiquitous and Parallel Systems
C. A. S. Camargo, G. G. H. Cavalheiro, M. L. Pilla, S. A. C. Cavalheiro, L. Foss
Applying List Scheduling Algorithms in a Multithreaded Execution Environment
Overview
• Introduction
• List Scheduling Algorithms
• Anahy Multithreaded Execution Model – Programming interface – Scheduling strategy
• Analysis of the scheduling strategy – Transforming parallel program representations
• Concluding Remarks
HPCLatam’12
Introduction
[Figure: one program deployed unchanged on sequential, SMP, cluster, and NOW platforms]
• Performance portability
The concurrency of an application can be described regardless of hardware resources
Concurrency >> Parallelism
Introduction
Performance portability
• Our approach: dissociate programming from execution
• Our proposal: –
• Our mechanisms: list scheduling and dataflow control at run time
Introduction
• Many multithreaded runtime environments use list scheduling strategies with good practical results – Cilk, OpenMP, Anahy
• We want to evaluate the theoretical efficiency of greedy list scheduling on multithreaded environments
We built an algorithm that derives DCGs from DAGs, so we could compare the results of dynamic multithreaded schedules with static, task-based ones for the same programs
List Scheduling
• Program described as a DAG
• Task is the scheduling unit
• A task defines a sequence of instructions and two sets of data: input and output
• Tasks are assigned priorities and ordered in a list that is consulted at each scheduling event
[Figure: example DAG with task costs T1/4, T2/1, T3/2, T4/4, T5/5, T6/10, T7/10]
Knowing the critical path is paramount
Environment
Anahy's layered architecture:
• API: programming interface
• Applicative scheduling: performance portability, multithreading
• Execution pool: active messages, communication
• Operating system / hardware: generic architecture, HW/OS-dependent modules
Anahy
void foo(In x) {
    res = computes(x);
}

void bar(In p) {
    t1 = create(foo, a);   /* Task A */
    t2 = create(foo, b);   /* Task B */
    ...
    join(t1, r1);          /* Task C */
    join(t2, r2);          /* Task D */
}
[Figure: detailed DCG of the program]
Programming Interface
Anahy
Just the thread level (what the scheduler sees)
[Figure: the detailed DCG reduced to thread level — two foo() threads and one bar() thread]
Anahy
Scheduling strategy
– A list of ready threads, ordered by priority
– Prioritize threads on the critical path
• In the default strategy, the closer a thread is to the root of the DCG, the higher its priority
• If more than one thread is at the same level in the DCG, the oldest ready thread has higher priority (for multiple creates, ties are broken randomly)
– No migration, no task preemption
Analysis of the scheduling strategy
• DAGs from nine case studies of Graham (1976) were transformed into DCGs
• The resulting DCGs were scheduled according to Anahy's strategy
• Schedule lengths were compared with those shown by Graham (optimal and non-optimal schedules)
Analysis of the scheduling strategy
Two-step transformation
• Pre-processing
– Identify input and output tasks in the DAG and insert them into threads appropriately
– Group sequences of tasks that do not correspond to create or join calls in a multithreaded program
• Iterative processing
– Breadth-first analysis of the DAG
– Edges visited from left to right
– Heuristics to resolve conflicts
Iterative Processing
[Figure: transforming DAGs into DCGs — step-by-step example]
Analysis of the scheduling strategy
Schedule lengths
• 4 optimal schedules
• 2 good schedules: same length as Graham's critical-path heuristic
• 3 bad schedules: but when we added some dependencies to the graph, we obtained optimal schedules
Concluding remarks
• Conclusions
– We developed a graph grammar that successfully maps DAG programs to multithreaded applications
– Anahy's scheduling strategy can provide a dynamic schedule as efficient as static list strategies, preserving the time bounds
– However, the programmer has to be aware of the scheduling policy to take advantage of the runtime environment
• Future work
– Analyse the scheduling assuming NUMA architecture models
– Improve thread priority strategies using attributes derived from DAG levels
C. A. S. Camargo, G. G. H. Cavalheiro, M. L. Pilla, S. A. C. Cavalheiro, L. Foss
Applying List Scheduling Algorithms in a Multithreaded Execution Environment
pilla@inf.ufpel.edu.br