ATAC: Ambient Temperature-Aware Capping
for Power Efficient Datacenters
Georgia Tech MARS Lab
Sungkap Yeo
Mohammad M. Hossain
Jen-cheng Huang
Hsien-Hsin S. Lee
Executive Summary
• Observation: Server locations are not created equal. A ‘minority’ of servers ‘rarely’ experience thermal emergencies.
• Goal: Reduce cooling power & avoid thermal overshooting on CPUs
• Solution: Inlet temperature-aware technique
• Results: 38% savings in cooling power, <1% performance degradation
Datacenters
• Cloud computing
• 2 ~ 10%
Datacenters
• Servers, Networking, Storage, Power delivery
• Cooling
Datacenters: Traditional
• Servers, Networking, Storage, Power delivery: ~50%
• Cooling: ~50%
Datacenters: State-of-the-art
• Servers, Networking, Storage, Power delivery: >90%
• Cooling: <10%
Datacenters: State-of-the-art
“Parasol and Greenswitch: Managing datacenters powered by renewable energy”
Datacenters: State-of-the-art
Google datacenter in Finland
Datacenters: State-of-the-art
Yahoo datacenter
Datacenters: Majority
• Small to medium datacenters
– Responsible for more than 70% of the entire electrical power used by datacenters
– Still labor under heavy cooling overhead: about 50%
– More demand: private cloud
1. Datacenter Cooling Essentials
2. Motivation
3. Our Approach: ATAC
4. Evaluation
Datacenter Cooling Essentials
• Two things
– Control algorithms for cooling units
– Cool air delivery time
Datacenter Cooling Essentials (1)
Static control algorithm
– Always supplies cool air based on the worst-case scenario
– Not efficient
Dynamic control algorithm
(1) Starts from the static control algorithm
(2) While all servers are below the emergency temperature, raises the room temperature
(3) When any server reaches the emergency temperature, lowers the room temperature
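The three steps of the dynamic control algorithm can be sketched as a single control step. This is a minimal illustration, not the controller used in the talk: the function name `next_room_setpoint`, the 0.5 °C step, and the setpoint bounds are assumptions made for the example.

```python
def next_room_setpoint(setpoint_c, inlet_temps_c, t_emergency_c,
                       static_setpoint_c, max_setpoint_c, step_c=0.5):
    """Return the room temperature setpoint for the next control step.

    step_c and the two bounds are hypothetical; the slides give no values.
    """
    if any(t >= t_emergency_c for t in inlet_temps_c):
        # (3) Some server reached the emergency temperature: lower the room
        # temperature, but never below the static worst-case setpoint.
        return max(static_setpoint_c, setpoint_c - step_c)
    # (2) All servers are below the emergency temperature: raise the room
    # temperature to save cooling power, up to an assumed upper bound.
    return min(max_setpoint_c, setpoint_c + step_c)


# (1) Start from the static (worst-case) setpoint, here assumed to be 18 °C.
setpoint = 18.0
readings = ([35.0, 37.0, 38.0], [36.0, 38.0, 39.5], [37.0, 39.0, 40.5])
for inlets in readings:
    setpoint = next_room_setpoint(setpoint, inlets, t_emergency_c=40.0,
                                  static_setpoint_c=18.0, max_setpoint_c=30.0)
```

In this run the setpoint rises twice and is lowered once the third reading crosses 40 °C. The lowering only takes effect after the cool air has actually arrived, which is the delivery-time problem the following slides discuss.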
• Cool air delivery time
– Why is it important?
Datacenter Cooling Essentials (2)
Feeling hungry
Order a pizza
Remain hungry!
Hottest server
Datacenter Cooling Essentials
[Figure: cooling unit and hot server]
Inlet air temperature < Emergency temperature
Datacenter Cooling Essentials
[Figure: hot server with alert (!)]
Inlet air temperature > Emergency temperature
Datacenter Cooling Essentials
Temperature margin is required for all dynamic control algorithms!
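One way to see why a margin is needed: if inlet air keeps warming while the requested cool air is still on its way, the temperature overshoots by roughly the warming rate multiplied by the delivery time. The sketch below assumes a constant (linear) warming rate; the function names and the numbers are illustrative and do not come from the talk.

```python
def required_margin_c(rise_rate_c_per_s, delivery_time_s):
    """Margin (°C) that covers the warming that happens while cool air is
    in transit, assuming a constant warming rate."""
    if rise_rate_c_per_s < 0 or delivery_time_s < 0:
        raise ValueError("rate and delivery time must be non-negative")
    return rise_rate_c_per_s * delivery_time_s


def trigger_temperature_c(t_emergency_c, rise_rate_c_per_s, delivery_time_s):
    """Inlet temperature at which cooling has to be requested so that the
    inlet air does not exceed t_emergency_c before the cool air arrives."""
    return t_emergency_c - required_margin_c(rise_rate_c_per_s, delivery_time_s)


# Hypothetical numbers: air warms by 0.125 °C/s and cool air takes 24 s to arrive.
margin = required_margin_c(0.125, 24)
trigger = trigger_temperature_c(40.0, 0.125, 24)
```

With an instant delivery time the margin is zero, which matches the idealized case; any non-zero delivery time forces the controller to react before the emergency temperature is reached.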
ATAC Motivation I
• Thermal overshooting
– Non-zero delivery time of cool air
– Inlet air temperature > Temergency about 1% of the time
[Figure: time (seconds, log scale, 1E+00 to 1E+10) vs. inlet temperature (18 to 35 °C), with two series: "Instant delivery of cool air is assumed" and "Non-zero delivery of cool air is assumed"]
ATAC Motivation II
• Thermal overshooting
– Only for a small number of servers
[Figure: inlet air temperature (°C, 0 to 40) by height of server chassis in racks (Lowest, Lower, Middle, Higher, Highest)]
ATAC Motivation
• Non-zero delivery time of cool air
• Only for a small number of servers
• Potential solutions should
- perform locally AND
- take inlet air temperature into account
• Goal
- Keep CPU temperature under the target temperature even when Tinlet air > Temergency
Our approach: ATAC
• Experiments: Server power vs. Inlet air temperature

Inlet air temperature (°F):   80   82   84   86   88   90   92   94   96   98  100
Other parts (W):             203  204  203  204  203  203  204  204  204  205  205
Fans (W):                     11   13   16   19   22   25   29   29   29   29   29

System power (W) = Fans + Other parts. Fans = Max: fan power levels off at 29 W from 92 °F.
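The data labels from this plot are enough to recompute total system power and to find the point where the fans saturate. The sketch below assumes that the 203 to 205 W series is "Other Parts" and the 11 to 29 W series is "Fans", which is an inference from the legend and the magnitudes; the variable names are made up for the example.

```python
# Data labels read off the slide (inlet temperature in °F, power in W).
inlet_f = [80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100]
other_w = [203, 204, 203, 204, 203, 203, 204, 204, 204, 205, 205]
fans_w = [11, 13, 16, 19, 22, 25, 29, 29, 29, 29, 29]

# Total system power is the stacked sum of the two series.
total_w = [o + f for o, f in zip(other_w, fans_w)]

# First inlet temperature at which fan power reaches its maximum.
max_fan_w = max(fans_w)
saturation_f = inlet_f[fans_w.index(max_fan_w)]
saturation_c = (saturation_f - 32) * 5 / 9  # Fahrenheit to Celsius

# Share of system power spent on fans at the coolest and hottest points.
fan_share_cool = fans_w[0] / total_w[0]
fan_share_hot = fans_w[-1] / total_w[-1]
```

Fan power nearly triples across the range while the rest of the system stays within about 2 W, and the fans stop ramping at 92 °F (about 33 °C). Beyond that point airflow is constant, which is the condition the linear model later in the deck relies on.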
Our approach: ATAC
• Repeating experiments with different configurations
[Figure: core temperature (°C, 60 to 80) vs. inlet air temperature (27 to 39 °C) at 3.1 GHz, 2.9 GHz, and 2.7 GHz]
[Figure: power draw (W, 180 to 240) vs. inlet air temperature (27 to 39 °C) at 3.1 GHz and 2.9 GHz]
Core temperature can be kept under control even when Tinlet air exceeds Temergency
Linear model
Our approach: ATAC
• ATAC is a system-level technique
– Throttles performance when Tinlet air > Temergency
– Any theory that explains our experiments?
Fan affinity laws [22]
Watts (heat transfer) ∝ ΔTemperature × Amount of air, where the amount of air is constant
CPU power ∝ (Tcore − Tinlet air)
Old CPU Power / Old ΔTemperature = New CPU Power / New ΔTemperature
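The ratio on this slide (old CPU power over old ΔTemperature equals new CPU power over new ΔTemperature) gives a direct way to estimate a CPU power budget for a hotter inlet. A minimal sketch, assuming constant airflow (fans at maximum) and the linear model; the function name and the sample operating point are hypothetical.

```python
def scaled_cpu_power(old_power_w, old_core_c, old_inlet_c, new_core_c, new_inlet_c):
    """Scale CPU power so that power / (Tcore - Tinlet) stays constant."""
    old_delta = old_core_c - old_inlet_c
    new_delta = new_core_c - new_inlet_c
    if old_delta <= 0 or new_delta <= 0:
        raise ValueError("core temperature must exceed inlet temperature")
    return old_power_w * new_delta / old_delta


# Hypothetical operating point: 120 W keeps the core at 80 °C with 40 °C inlet air.
# Holding the core at 80 °C with 44 °C inlet air shrinks ΔTemperature from 40 to 36,
# so the power budget shrinks by the same ratio.
budget_w = scaled_cpu_power(120.0, 80.0, 40.0, 80.0, 44.0)
```

Because the relationship is linear, every extra degree of inlet temperature costs the same amount of CPU power at a fixed core target, which is what makes a simple capping rule feasible.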
ATAC: Algorithm
ATAC Configurables
• Aggressive ATAC
– ATAC compromises performance to reduce CPU power consumption.
– How aggressively?
• ATAC - #
– ATAC - 0
• Lower CPU performance when Tinlet air = Temergency
– ATAC - X
• Lower CPU performance when Tinlet air = Temergency - X
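The ATAC-X setting amounts to shifting the throttling threshold down by X degrees. The sketch below is one reading of that rule, not the authors' implementation: the function name is invented, and it treats "Tinlet air = Temergency - X" as "at or above that threshold".

```python
def atac_should_throttle(inlet_c, t_emergency_c, x_c=0.0):
    """ATAC-X: lower CPU performance once Tinlet reaches Temergency - X.

    x_c = 0 corresponds to ATAC-0; larger values throttle earlier.
    """
    if x_c < 0:
        raise ValueError("X must be non-negative")
    return inlet_c >= t_emergency_c - x_c


# With Temergency = 40 °C and 39 °C inlet air:
atac0 = atac_should_throttle(39.0, 40.0, x_c=0.0)  # below the ATAC-0 threshold
atac2 = atac_should_throttle(39.0, 40.0, x_c=2.0)  # above the ATAC-2 threshold (38 °C)
```

A larger X trades more performance for a wider thermal cushion, which is the aggressiveness knob the slide describes.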
ATAC Simulation Setup
[Figure: simulated datacenter layout with CRAC, racks, raised floor, cold aisle, and hot aisles; (a) bird's-eye view, (b) top view]
ATAC Simulation Setup
• Google cluster data (GCD)
– select 5 days
• 12,800 processing cores
– 50 blade chassis
– 16 servers per blade chassis
– 16 processing cores per server
• AMD Opteron 6386 SE, 140W TDP
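The core count on this slide follows from the chassis and server counts. The short check below also estimates aggregate CPU TDP under the assumption of one 16-core CPU per server; that per-server CPU count is not stated on the slide, and the constant names are invented for the example.

```python
CHASSIS = 50
SERVERS_PER_CHASSIS = 16
CORES_PER_SERVER = 16
CPU_TDP_W = 140           # AMD Opteron 6386 SE, from the slide
CPUS_PER_SERVER = 1       # assumption: one 16-core CPU per server

total_servers = CHASSIS * SERVERS_PER_CHASSIS
total_cores = total_servers * CORES_PER_SERVER
total_cpu_tdp_w = total_servers * CPUS_PER_SERVER * CPU_TDP_W
```

The product reproduces the 12,800 cores quoted on the slide; the TDP total is only an upper-bound estimate of CPU heat under the stated assumption.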
ATAC Evaluation
• Baseline
– Dynamic cooling control with Temergency = 40 °C that targets Tcore ≤ 80 °C
– No safety margin
– Failed because of non-zero cool air delivery time
[Figure: Max T(core) in °C (left axis, 60 to 90) and latency normalized to Baseline (right axis, 97% to 105%) for Baseline and ATAC-0 through ATAC-9]
ATAC Evaluation
• ATAC, DTM, Power capping, and PowerNap
[Figure: max core temperature (°C, 70 to 86) and normalized latency (80% to 140%) for Baseline; ATAC-0 through ATAC-5; DTM (1°C, 10%; 1°C, 5%; 2°C, 10%; 2°C, 5%); PowerCapping (310W, 300W, 290W); and PowerNap]
ATAC Key Contributions
• Ambient temperature-aware thermal control
• Negligible performance degradation: <1%
• No need for the safety margin: 38% saving in cooling power
Georgia Tech MARS Lab
http://arch.ece.gatech.edu
ATAC: Ambient Temperature-Aware Capping
for Power Efficient Datacenters
• Sungkap Yeo
• Mohammad M. Hossain
• Jen-cheng Huang
• Hsien-Hsin Sean Lee