Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
1© Bull, 2012
23/09/14 Yiannis GeorgiouDavid GlesserMatthieu HautreuxDenis Trystram
Adaptive Resource and Job Managementfor limited power consumption
2© Bull, 2012
Introduction DVFS & Switch-of
The model Algorithm and implementation Experimentations
Conclusion and future works
3© Bull, 2012
Introduction DVFS & Switch-of
The model Algorithm and implementation Experimentations
Conclusion and future works
4© Bull, 2014
Introduction - Powercap
Powercap: limit the power consumption during a certain amount of time
5© Bull, 2014
Introduction - Energy
Why control? Power peak = O(power of a city) Power installations lifetime Electricity providers limitations Controling energy consumption = Controling cost
How control?– DVFS– Switch-of
• (or shutdown, or sleep mode, or hibernation...)
6© Bull, 2012
Introduction DVFS & Switch-of
The model Algorithm and implementation Experimentations
Conclusion and future works
7© Bull, 2014
The RJMS level – Switch-of
Switch-of Switch-of some resources switched-of has a cost Not possible on all clusters Jobs can not run on switched-of nodes!
8© Bull, 2014
The RJMS level – Switch-of
« Power Bonuses » If all components of a level are switched-of, the component of
the upper level can be switched-of and provide an additional gain
Exemples : Nodes are made of processors Chassis are made of nodes Rack are made of Chassis
...
9© Bull, 2014
The RJMS level – Switch-of
« Power Bonuses » on CURIE cluster: 18 nodes per chassis, 5 chassis per rack Power gained by switching of a Chassis
~= Power(computing node) Power gained by switching of a Rack
~= 10 * Power(computing node)
...
Computing node
Switched-off node
Chassis
Rack
- 344 W
- 500 W
- 3400 W
...
...
10© Bull, 2014
The RJMS level - DVFS
DVFS It's a trade-of between performance and power
consumption What about performance / energy trade-of ?
∫POWER .dt=Energy
11© Bull, 2014
The RJMS level - DVFS
DVFS It's a trade-of between performance and power
consumption What about performance / energy trade-of ?
12© Bull, 2014
The RJMS level - DVFS
DVFS It's a trade-of between performance and power
consumption What about performance / energy trade-of?
13© Bull, 2014
The RJMS level - DVFS
DVFS is a trade-of between completion time and power
No obvious performance / energy trade-of– Minimizing power != minimizing energy– The impact of DVFS is highly dependant on the job
14© Bull, 2012
Introduction DVFS & Switch-of
The model Algorithm and implementation Experimentations
Conclusion and future works
15© Bull, 2014
Our Model
We work with maximum power consumptions W is the maximal computational work possible
Powercap limitation
W=T .(N−Noff−N dvfsσMax +N dvfsσMin )
N off . Poff +N dvfs .PMin+(N−Noff−N dvfs) .PMax⩽P
N X=numberof node in state XΣZ=speed degradationat state ZPY=power consumptionat YP= powercap
16© Bull, 2014
Our model
● In the space 3D (Ndvfs, Nof, W) is a plane
is an half space
⇒ The intersection is a straight line● Within the bound of the total number of
nodes, W is maximized when:
17© Bull, 2014
Our Model
3 cases:
– DVFS is better ⇒ we only use DVFS
– Switch-of is better ⇒ we only use Switch-of
– The powercap is so low that we should use both
18© Bull, 2014
Our model – switch-of or DVFS?
How to choose ?
When ρ
19© Bull, 2014
Our Model – DVFS or switch-of ?
On CURIE cluster:
20© Bull, 2012
Introduction DVFS & Switch-of
The model Algorithm and implementation Experimentations
Conclusion and future works
21© Bull, 2014
The algorithm
When a powercap limit is set Choose between DVFS and switch-of
If DVFS When a job is being launched, Try to schedule it at the highest frequency
If switch-of switch-of nodes at runtime, mark these nodes as « reserved » for the scheduler
22© Bull, 2014
The algorithm
$ scontrol create res Watts=123151 ...
23© Bull, 2012
Introduction DVFS & Switch-of
The model Algorithm and implementation Experimentations
Conclusion and future works
24© Bull, 2014
Experimental validation
Replay interesting parts of the CURIE workload– 5 hours, high utilization, jobs representative of the
whole workload
Slurm can emulate his environement– 336 Slurm nodes on 1 physical node– Sleep instead of real computational jobs
Add a powercap Case study: 1 hour, in the middle of the trace, at diferent
powers
25© Bull, 2014
Experimental validation
26© Bull, 2014
Experimental validation
Idle Switch-off
27© Bull, 2014
Experimental validation
Switch-off DVFS
28© Bull, 2012
Introduction DVFS & Switch-of
The model Algorithm and implementation Experimentations
Conclusion and future works
29© Bull, 2014
Current and future works
Powercap on live power values– Implemented using Layouts
Powercap on nodes
DVFS– What about reproducibility of jobs runs?– To do DVFS right, we need to know the job
Switch-of New scheduling algorithms Switch-of (with bonuses) whithout powercaps Switch-of particular components (cpus, gpus, network...)
30© Bull, 2012
31© Bull, 2014
References
[11] V. W. Freeh, D. K. Lowenthal, F. Pan, N. Kappiah, R. Springer, B. L. Rountree, and M. E. Femal, “Analyzing the energy-time trade-of in high-performance computing applications,” IEEE Transactions on Parallel and Distributed Systems, 2007.
[22] M. Etinski, J. Corbalan, J. Labarta, and M. Valero, “BSLD threshold driven power management policy for HPC centers,” in 2010 IEEE International Symposium on Parallel Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010.
32© Bull, 2014
Experimental validation
Diapo 1Diapo 2Diapo 3Diapo 4Diapo 5Diapo 6Diapo 7Diapo 8Diapo 9Diapo 10Diapo 11Diapo 12Diapo 13Diapo 14Diapo 15Diapo 16Diapo 17Diapo 18Diapo 19Diapo 20Diapo 21Diapo 22Diapo 23Diapo 24Diapo 25Diapo 26Diapo 27Diapo 28Diapo 29Diapo 30Diapo 31Diapo 32