Computer Science Department, University of Pittsburgh
Evaluating a DVS Scheme for Real-Time Embedded Systems
Ruibin Xu, Daniel Mossé and Rami Melhem
Introduction
Energy conservation is important for real-time embedded systems
Dynamic Voltage Scaling (DVS) is effective in power management
A popular problem: minimizing energy consumption while meeting the deadlines
Focus
Frame-based systems that execute variable workloads
The problem becomes minimizing the expected energy consumption while meeting the deadlines
[Figure: timeline of consecutive frames; labels: time, frame length]
A New DVS Scheme (MEEC)
[Diagram: original problem → relax → simplified problem → efficient algorithm → optimal solution → fix → practical solution → evaluations; stages are annotated emsoft’05 and parc’05]
Task and System Model
N periodic tasks T1, T2, …, TN to be executed consecutively in each frame
The power function is $p(f) = c_0 + c_1 f^{\alpha}$
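The power function translates into energy once execution time is taken into account: running a fixed number of cycles at constant frequency f takes cycles / f time units. The sketch below is a minimal illustration under that assumption. The function names are hypothetical, and the default values (c0 = 0, c1 = 1, α = 3) are picked only for illustration.

```python
def power(f, c0=0.0, c1=1.0, alpha=3.0):
    """Power drawn at frequency f under p(f) = c0 + c1 * f**alpha."""
    return c0 + c1 * f ** alpha


def energy(cycles, f, c0=0.0, c1=1.0, alpha=3.0):
    """Energy to execute `cycles` cycles at constant frequency f.

    Execution takes cycles / f time units, so energy = p(f) * cycles / f.
    """
    if f <= 0:
        raise ValueError("frequency must be positive")
    return power(f, c0, c1, alpha) * cycles / f
```

With these defaults, halving the frequency from 1.0 to 0.5 cuts the energy for 8 cycles from 8.0 to 2.0, which is why slack is worth converting into lower speed.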
Review of Existing Schemes
[Figure: how slack is used under the Proportional Scheme, the Greedy Scheme, and the Statistical Scheme]
The MEEC Scheme
Incorporates the variability of the tasks into the speed schedule
The variability of the tasks is captured by the probability density function of each task's workload
Aims to minimize the expected energy consumption in the system
[Figure: probability density of a task's workload; axes: workload, probability]
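The MEEC Scheme
[Figure: the available time d is split into β1·d and (1-β1)·d; fractions β1, β2, β3, β4 are shown for four tasks, with the slack marked]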
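One way to read the figure is that βi is the fraction of the time still left in the frame that task Ti may use in the worst case. The sketch below assumes that reading. The function name, the β values, and the workloads are hypothetical, and speed limits are ignored.

```python
def run_frame(wcec, actual, betas, frame_length):
    """Simulate one frame under a beta-based time allocation.

    wcec[i]   worst-case cycles of task i
    actual[i] cycles task i really needs in this frame (<= wcec[i])
    betas[i]  fraction of the remaining time allotted to task i
              (the last entry should be 1.0 so no time is left unused)
    Returns a list of (speed, finish_time) pairs, one per task.
    """
    t = 0.0
    trace = []
    for w, a, b in zip(wcec, actual, betas):
        remaining = frame_length - t
        allotted = b * remaining
        speed = w / allotted      # fast enough to cover the worst case
        t += a / speed            # finishing early leaves slack for later tasks
        trace.append((speed, t))
    return trace
```

With `betas=[0.5, 1.0]`, worst-case cycles `[4, 4]`, and a frame of length 8, both tasks run at speed 1.0 when the first task uses its full budget; if the first task needs only 2 cycles, the second task slows down to 4/6.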
An Important Property
[Figure: two cases, each with d time units available]
The optimal expected energy consumptions for the two cases are both proportional to $1/d^2$
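The $1/d^2$ dependence can be checked on the simplest possible case. The derivation below is only an illustration under stated assumptions, not the expressions from the original slide: a single task with a known workload of $w$ cycles (a symbol introduced here), a constant speed, and the power function with $c_0 = 0$ and $\alpha = 3$.

```latex
f = \frac{w}{d}, \qquad
E = p(f)\, d = c_1 \left(\frac{w}{d}\right)^{3} d = \frac{c_1 w^{3}}{d^{2}} \;\propto\; \frac{1}{d^{2}}
```

Scaling the available time d therefore rescales the energy but does not change which split of the time is best, which is consistent with the β values being expressed as fractions.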
Computing βi
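[Figure: for tasks T1, T2, T3, T4 the β values are obtained one at a time, starting from the last task: β4=100%, then β3=xx%, β2=xx%, β1=xx%, each step labeled "vs."]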
Applying PACE
PACE is a technique in which the execution speed is gradually increased as the task progresses
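To make "gradually increased" concrete: for a cubic power function and unrestricted continuous frequency, the commonly cited PACE result is that speed should grow in proportion to the inverse cube root of the probability that the task is still running. The sketch below assumes that result and a workload discretized into bins; the function name and the binning are hypothetical, and every bin is assumed to have a positive probability of being reached.

```python
def pace_schedule(bin_cycles, probs, deadline):
    """Per-bin speeds for one task, assuming p(f) = f**3.

    bin_cycles[j]  cycles in the j-th consecutive bin of the workload
    probs[j]       probability that the task finishes inside bin j
    deadline       time available if every bin has to be executed
    """
    tail = 1.0                    # probability that the current bin is reached
    weights = []
    for p in probs:
        weights.append(tail ** (-1.0 / 3.0))   # speed is proportional to tail**(-1/3)
        tail -= p
    # Scale so that the worst case (all bins executed) exactly fills the deadline.
    k = sum(c / w for c, w in zip(bin_cycles, weights)) / deadline
    return [k * w for w in weights]
```

Early bins, which are almost always executed, get low speeds; late bins, which are rarely reached, get high speeds.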
The MEEC Scheme
The β values (optimal) are computed based on the assumption of unrestricted continuous frequency
We need to deal with minimum and maximum speed restrictions and with discrete speeds
We have solutions and will use simulation to test them
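The slides do not spell out the solutions. A common conservative way to handle both restrictions, shown here only as an assumption and not as the authors' method, is to round each ideal speed up to the nearest available level, which keeps the deadline guarantee at the cost of some energy. The function name is hypothetical.

```python
import bisect


def to_available_speed(ideal, levels):
    """Map an ideal continuous speed to the lowest available level >= ideal.

    `levels` must be sorted in ascending order. Speeds below the minimum
    level are raised to the minimum; speeds above the maximum cannot be met.
    """
    i = bisect.bisect_left(levels, ideal)
    if i == len(levels):
        raise ValueError("required speed exceeds the maximum level")
    return levels[i]
```

For example, with levels 100, 200, ..., 1000, an ideal speed of 250 maps to 300 and an ideal speed of 50 maps to 100.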
Evaluations – Power models
Synthetic processor
Strictly conforms to $p(f) = f^3$
10 frequencies: 100 MHz, 200 MHz, …, 1000 MHz
Intel XScale
Power numbers from Intel datasheets
$p(f) = 80 + 1520(f/1000)^3$
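The two power models can be written down directly. In the sketch below the function names are hypothetical, and f is assumed to be in MHz for the XScale fit (suggested by the f/1000 term); the slide gives no unit for the power values, so none is assumed.

```python
SYNTHETIC_FREQS_MHZ = list(range(100, 1001, 100))   # 100, 200, ..., 1000


def p_synthetic(f):
    """Synthetic processor: p(f) = f**3."""
    return f ** 3


def p_xscale(f_mhz):
    """XScale fit: p(f) = 80 + 1520 * (f / 1000)**3."""
    return 80 + 1520 * (f_mhz / 1000) ** 3


def energy_per_cycle(p, f):
    """Energy spent per cycle at frequency f under power model p."""
    return p(f) / f
```

One consequence of the constant term 80 is that energy per cycle on the XScale model is not monotone in f: under this fit, 300 MHz costs less per cycle than 100 MHz, whereas on the synthetic model a lower frequency is always cheaper per cycle.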
Evaluation – Synthetic Workload
We simulated systems that have 5,10,15,20 tasks
The WCEC of each task is randomly generated from 10M to 1G cycles
The probability distribution of each task is randomly chosen from 6 representative distributions
Frame length: [expression lost in extraction]
Evaluation – Synthetic Workload
We evaluated 8 schemes:
Proportional with and without PACE
Greedy with and without PACE
Statistical with and without PACE
MEEC with and without PACE
We simulated 100,000 frames and computed the average energy consumption per frame for each scheme
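The evaluation loop can be sketched as a small Monte Carlo simulation. Everything specific here is an assumption made for illustration: the scheme is one common reading of Proportional (speed = remaining worst-case cycles / remaining time), the power model is p(f) = f**3 with cycles in units of 10^9 and speed in GHz, actual workloads are drawn uniformly as a stand-in for the six distributions the slides do not list, and the frame length is set so the worst case just fits at 1 GHz.

```python
import random


def proportional_frame_energy(wcec, actual, frame_length):
    """Energy of one frame when each task runs at
    (remaining worst-case cycles) / (remaining time).

    With p(f) = f**3, executing a cycles at speed s costs a * s**2.
    """
    t, energy = 0.0, 0.0
    remaining_wcec = sum(wcec)
    for w, a in zip(wcec, actual):
        speed = remaining_wcec / (frame_length - t)
        energy += a * speed ** 2
        t += a / speed
        remaining_wcec -= w
    return energy


def average_energy(n_tasks, n_frames, seed=0):
    """Average energy per frame over n_frames simulated frames."""
    rng = random.Random(seed)
    # Worst-case cycles between 10M and 1G, expressed in units of 1e9 cycles.
    wcec = [rng.uniform(0.01, 1.0) for _ in range(n_tasks)]
    frame_length = sum(wcec)          # assumed: worst case fits at 1 GHz
    total = 0.0
    for _ in range(n_frames):
        actual = [rng.uniform(0.0, w) for w in wcec]   # stand-in distribution
        total += proportional_frame_energy(wcec, actual, frame_length)
    return total / n_frames
```

Comparing schemes then amounts to swapping the per-frame function and reusing the same seed so every scheme sees the same workloads.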
Results – Synthetic Workload
For the synthetic CPU, the best scheme is always MEEC (with or without PACE), but MEEC with PACE beats MEEC without PACE only 13.6% of the time, with an average saving of 1.2%
For the Intel XScale, the best scheme is always MEEC without PACE
Conclusion: PACE is not recommended in the MEEC scheme
Why Is PACE Not Good in the MEEC Scheme?
[Diagram: PACE (under the assumption of unrestricted continuous frequency) → compute → β values; PACE (continuous frequency) → fix → PACE (discrete frequency); the two PACE schedules can differ a lot]
Results – Synthetic Workload
[Figure: bar chart of energy saving (%), y-axis 0 to 90; x-axis: power model (Synthetic CPU max, Synthetic CPU average, XScale max, XScale average); series: vs. proportional, vs. greedy, vs. statistical]
Evaluation – Automatic Target Recognition (ATR)
The ATR application does pattern matching of targets in images
The regions of interest (ROI) in the image are detected and each ROI is compared with all the templates
Image processing time is proportional to the number of ROIs
Evaluation – Automatic Target Recognition (ATR)
A front-end is responsible for collecting images and sending them to the back-end periodically for target recognition
This application can be modeled as a frame-based real-time system in which all the tasks have the same workload distribution
[Figure: the front-end sending images to the back-end]
Evaluation – Automatic Target Recognition (ATR)
Simulation setup:
Use Intel XScale
The period is 100 ms
The front-end sends 1 to 6 images to the back-end
The number of ROIs in an image varies from 1 to 8
The back-end precomputes 6 speed schedules
Results - Automatic Target Recognition (ATR)
[Figure: bar chart of energy saving (%), y-axis 0 to 25; x-axis: number of images (1 to 6, plus the average); series: vs. proportional, vs. greedy, vs. statistical]
Summary
In this paper, we demonstrate and evaluate a new DVS scheme that aims to minimize the expected energy consumption in the system
Conclusions
The MEEC scheme achieves significant energy savings over the existing schemes
Using only static information or aggregating dynamic information, even with probabilistic techniques, will not produce results as good as those obtained when the dynamic information for each task is considered separately
Thank you
A Simple Example
3 tasks; the frame length is 14 time units
For the CPU, $c_0 = 0$, $c_1 = 1$, $f_{min} = 0$, and $f_{max} = 1$