
Energy-Aware Wireless Systems With Adaptive Power–Fidelity Tradeoffs

Vijay Raghunathan, Student Member, IEEE, Cristiano L. Pereira, Mani B. Srivastava, Senior Member, IEEE, and Rajesh K. Gupta, Fellow, IEEE

Abstract—Wireless networked embedded systems, such as multimedia terminals, sensor nodes, etc., present a rich domain for making energy/performance/quality tradeoffs based on application needs, network conditions, etc. Energy awareness in these systems is the ability to perform tradeoffs between available battery energy and application quality requirements. In this paper, we show how operating system directed dynamic voltage scaling and dynamic power management can provide for such a capability. We propose a real-time scheduling algorithm that uses runtime feedback about application behavior to provide adaptive power–fidelity tradeoffs. We demonstrate our approach in the context of a static priority-based preemptive task scheduler. Simulation results show that the proposed algorithm results in significant energy savings compared to state-of-the-art dynamic voltage scaling schemes with minimal loss in system fidelity. We have implemented our scheduling algorithm into the eCos real-time operating system running on an Intel XScale-based variable voltage platform. Experimental results obtained using this platform confirm the effectiveness of our technique.

Index Terms—Dynamic power management, dynamic voltage scaling, real-time systems, wireless embedded systems.

I. INTRODUCTION

WIRELESS embedded systems such as multimedia terminals, 3G cell phones, ad hoc networks of wireless sensors, wireless toys and robots, etc., are increasingly computationally capable and often sport high bandwidth network connections. As a result, they run sophisticated algorithms and execute multiple large multimedia/data intensive applications. Since several of these applications have real-time constraints, a real-time operating system (RTOS) is often used in the implementation of these systems.

The battery operated nature of these systems requires them to be highly energy efficient to maximize device lifetimes. The drive to prolong system lifetime has resulted in several low power hardware design techniques (see [1] and [2] for an overview of these techniques). In addition to using low power design techniques, a commonly used approach for energy-aware system operation is Dynamic Power Management (DPM), in which unused system components are shut down or sent into low power states [3]. An alternative, and more effective when applicable, technique used to increase energy efficiency is Dynamic Voltage Scaling (DVS), in which the supply voltage and clock frequency of the processor are changed dynamically to just meet the instantaneous performance requirement [4]. The RTOS is uniquely poised to efficiently implement DPM and DVS policies for the following reasons. 1) Since the RTOS coordinates the execution of the various application tasks, it has global information about the performance requirements and timing constraints of all the applications. 2) The RTOS can directly control the underlying hardware platform, fine-tuning it to meet specific system requirements.

Manuscript received October 29, 2002; revised June 7, 2003. This paper is based in part on research funded through the Defense Advanced Research Projects Agency (DARPA) PAC/C Program under AFRL Contract F30602-00-C-0154, and through Semiconductor Research Corporation System Task Thrust ID 899.001.

V. Raghunathan and M. B. Srivastava are with the Department of Electrical Engineering, University of California, Los Angeles, CA 90095 USA (e-mail: [email protected]).

C. L. Pereira and R. K. Gupta are with the Department of Computer Science and Engineering, University of California, San Diego, CA 92093 USA.

Digital Object Identifier 10.1109/TVLSI.2004.840773

In this paper, we use an RTOS directed approach to enable energy-aware system operation by exploiting the following characteristics of wireless embedded systems:

• Most wireless systems are resilient to packet losses and errors. The operating scenarios of these systems invariably involve data losses and errors (e.g., due to noisy wireless channel conditions), and the application layer is designed to be tolerant to these impairments. Most wireless systems, therefore, are “soft” real-time systems where a few deadline misses only lead to a small degradation in the user-perceived application quality.

• Wireless embedded systems offer a power–fidelity tradeoff that can be tuned to suit application needs. For example, the precision of the computation (e.g., quality of the audio or video) can be traded off against the power consumed for the computation. As another example, wireless channel induced errors can be reduced by increasing the transmission power of the radio.

• Wireless embedded systems have time varying computational loads. Performance analysis studies have shown that for typical wireless applications, the instance-to-instance task execution time varies significantly, and is often far lower than the worst case execution time (WCET). Table I gives the WCET and best case execution time (BCET) for a few benchmarks and clearly illustrates the large workload variability [5]. Although the execution times vary significantly, they depend on data values that are obtained from physical real-world signals (e.g., audio or video streams), and are likely to have some temporal correlation between them. This enables us to predict task instance execution times with reasonable accuracy.


TABLE I: VARIATION IN EXECUTION TIMES (IN CLOCK CYCLES) FOR A FEW MULTIMEDIA BENCHMARKS [5]

A. Paper Contributions

The contributions of this paper are the following.

1) We present an enhanced fixed priority schedulability analysis for variable voltage systems that incorporates the timing overheads incurred due to voltage/frequency changes and processor shutdown/wakeup. This analysis can be used to accurately analyze the schedulability of nonconcrete periodic task sets, scheduled using the Rate Monotonic (RM) or the Deadline Monotonic (DM) priority assignment schemes.

2) We present a proactive DVS technique that exploits the above described characteristics of wireless embedded systems to achieve significant energy savings. A key feature of our technique, not addressed in prior work, is that it yields an adaptive tradeoff between energy consumption and system fidelity/quality.

3) Our DVS technique has been implemented into the kernel of the eCos [6] real-time operating system running on an Intel XScale [7] processor. We present measured results to demonstrate the effectiveness of our technique. As part of our implementation, we have also developed a complete software architecture that facilitates application-RTOS interaction for efficient DPM/DVS.

The proposed DVS technique exploits task runtime variation by predicting task instance execution times and performing DVS accordingly. Most previously proposed predictive DVS algorithms [4], [8] are processor utilization based, and hence perform poorly in the presence of latency constraints. In contrast, our technique is tightly coupled to the schedulability analysis of the underlying real-time scheduling scheme, thereby yielding better results than utilization-based approaches. The few deadline misses that result from occasional misprediction are not catastrophic to performance due to the inherent error-tolerant nature of wireless systems. In addition, our algorithm uses an adaptive feedback mechanism to control the number of task deadline misses. This is done by monitoring recent deadline miss history, and accordingly adapting the prediction to be more aggressive/conservative. Finally, since our technique is based on the enhanced schedulability analysis mentioned above, it can be used to schedule nonconcrete task sets, where the initial phase offsets of tasks are unknown. This is unlike several existing variable voltage real-time scheduling schemes, which require the initial phase offsets of the tasks to be known in order to perform hyper-period scheduling.

B. Related Work

Energy awareness in wireless embedded systems builds upon advances in DPM, DVS, and real-time scheduling techniques. We briefly discuss related work in these areas.

Shutdown-based DPM techniques have been proposed for several system components including processors, hard disks, and wireless network interfaces. The goal of system-level DPM techniques is to obtain an optimized power-state transition policy [9]–[12].

Early work on DVS was in the context of a workstation-like environment [13], [14] where average throughput is the metric of performance, and latency is not an issue since there are no real-time constraints. In these techniques, the task scheduler adjusts clock frequency and supply voltage in each discrete time interval based on the processor utilization in the preceding interval. Similar utilization-based predictive techniques were studied in [4] and [8]. Numerous other DVS techniques have been proposed for low power real-time task scheduling, both for uniprocessor [15]–[25] as well as multiprocessor [26]–[28] systems. A detailed survey of these techniques is presented in [29]. Most uniprocessor DVS techniques target systems with hard deadline constraints, and several of them assume that each task instance executes for its WCET. Even the few schemes that allow deadline misses [4], [8] involve utilization-based speed setting decisions, and hence exhibit poor real-time behavior. Moreover, they only consider systems with a single application executing (i.e., a unitasking environment). Pering et al. [20] propose a thread-based DVS technique for uniprocessor multitasking systems, based on execution time prediction. Similar to Pering's work, our work considers a uniprocessor, multitasking, real-time environment, and shows that permitting a few deadline misses leads to a significant increase in energy savings and improves the energy scalability of the system. Our work differs from Pering's work in two ways: 1) While Pering's algorithm permits missed deadlines, it does not provide any control on the number of deadline misses. Through appropriate parameter selection, the proposed algorithm permits the user to control the number of deadline misses, and trade them off for increased energy savings. 2) Pering's algorithm is a purely online one, whose complexity scales linearly with the number of tasks present in the system. In contrast, since the proposed algorithm is based on an offline schedulability analysis, the scheduler overhead is independent of the number of tasks in the system, resulting in better scalability. This can be significant for systems with a large number of tasks since online DVS algorithms are invoked during every context switch.

The rest of this paper is organized as follows. Section II presents examples to illustrate the power management opportunities that exist during real-time task scheduling. Section III describes our system model. Section IV presents the enhanced RM schedulability analysis for variable voltage systems. Section V discusses our adaptive power–fidelity tradeoff technique. Section VI details our simulation-based performance analysis. Section VII describes the implementation of our technique into eCos. Section VIII presents the conclusions.


II. BACKGROUND AND EXAMPLES

Next, we describe the basic scheduling approach adopted in our work, and present examples to illustrate the DPM/DVS opportunities that arise during real-time task scheduling.

A. Task Scheduling in RTOS

The task scheduler of an RTOS is responsible for scheduling a given set of tasks such that real-time constraints are satisfied. Schedulers differ in the type of scheduling policy they implement. A commonly used scheduling policy is Rate Monotonic (RM) scheduling [30]. This is a fixed-priority-based preemptive scheduling scheme where tasks are assigned priorities in the inverse ratio of their time periods. Fixed-priority-based preemptive scheduling is commonly used in operating systems such as eCos, WinCE, VxWorks, QNX, uC/OS, etc., that run on wireless embedded devices such as handheld multimedia nodes. An alternative scheduling scheme is Earliest Deadline First (EDF) scheduling [30], where task priorities are assigned dynamically such that task instances with closer deadlines are given higher priorities. EDF can schedule task sets with a higher processor utilization than can be scheduled by a static priority-based scheduling scheme such as RM. However, dynamic priority-based scheduling is also more complex to implement since task priorities keep changing during runtime. No matter which scheduling policy is used, the RTOS ensures that at any point of time, the currently active task is the one with the highest priority among the ready to run tasks. The problem of task scheduling on a uniprocessor system in the presence of deadlines is known to be solvable in polynomial time if the system is preemptive in nature [31]. However, the low energy scheduling problem was shown to be NP-Complete in [32], by a reduction from the sequencing with deadlines and set-up times problem, described in [31].

B. Power Management Opportunities During Task Scheduling

1) Static Slack: It has been observed in many systems that, during runtime, even if all task instances run for their WCET, the processor utilization is often far lower than 100%, resulting in idle intervals. This slack that inherently exists in the system due to low processor utilization is henceforth referred to as static slack. It can be exploited to reduce energy consumption by statically slowing down the processor and operating at a lower voltage. While processor slowdown improves utilization, excessive slowdown may lead to deadline violations. Hence, the extent of slowdown is limited by the schedulability of the task set at the reduced speed, under the scheduling policy used. The following example illustrates the use of static slowdown to reduce energy consumption.

Fig. 1. Mobile multimedia terminal used in Examples 1 and 2.

TABLE II: TASK TIMING PARAMETERS FOR EXAMPLES 1 AND 2

Footnote 1: The system shown in Fig. 1 could, in general, be any multitasking, uniprocessor, battery powered system, such as a 3G cell phone or a PDA.

Example 1: Consider a simple mobile multimedia terminal shown in Fig. 1 (see footnote 1). The system receives real-time audio and video streams over a wireless link, and plays them out. Thus, the three main tasks that run on a processor embedded in this system are protocol processing, audio decoding, and video decoding. The timing parameters for these tasks are listed in Table II. For simplicity, the initial phase offsets for the tasks are set to zero (however, our algorithm, presented in Section V, makes no such assumptions). The resulting schedule for the time interval [0, 120] when this task set is scheduled on a single processor using the RM priority assignment scheme is shown in Fig. 2(a). It can be seen from the figure that the system is idle during the time interval [90, 120]. This slack can be utilized to lower the operating frequency and supply voltage, thereby reducing the energy consumption. Fig. 2(b) shows the schedule for the same task set with the processor slowed down (see footnote 2) by a factor of 4/3. As seen from the figure, processor slowdown leads to a reduction in slack (see footnote 3). Note that any further reduction in processor speed will result in the video decoding task missing its deadline. When the supply voltage is also scaled, this leads to a decrease in power consumption from 420 mW to 184 mW, using the power versus frequency curve for the StrongARM processor, shown in Fig. 4. The energy consumption over the interval [0, 120], therefore, decreases by 41%, compared to a shutdown-based policy.
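As a rough check of these numbers (a back-of-the-envelope sketch using only the values quoted above: under the shutdown-based policy the processor is busy for 90 of the 120 time units at 420 mW, while the slowed-down schedule runs for all 120 units at 184 mW):

$$ \frac{E_{\text{DVS}}}{E_{\text{shutdown}}} \;=\; \frac{184\ \text{mW} \times 120}{420\ \text{mW} \times 90} \;\approx\; 0.58 , $$

i.e., a reduction of roughly 41%, and the uniform slowdown factor itself follows from stretching the 90 busy units to fill the whole interval: 120/90 = 4/3.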

2) Dynamic Slack: Static slack is not the only kind of slack present in the system. During runtime, due to the variation in the task instance execution times, there is additional slack created when a task instance finishes executing before its WCET. This slack that arises due to execution time variation is henceforth referred to as dynamic slack. Dynamic slack can be exploited for DVS by dynamically varying the operating frequency and supply voltage of the processor to extend the task instance's execution time to its WCET. Thus, the lower a task instance's actual execution time, the more its energy reduction potential. However, to realize this potential, we need to know/estimate the execution time of each task instance. The following example illustrates how dynamic slack can be exploited for DVS.

Footnote 2: For simplicity, we assume that such an exact slowdown is possible. Our algorithm handles the case where only a finite set of frequencies is available.

Footnote 3: While, in this example, uniform slowdown of all tasks resulted in a complete elimination of static slack, in general different tasks might require different slowdown factors. Our algorithm does this, if necessary.


Fig. 2. Task execution schedule for Example 1. (a) Original. (b) Statically optimized.

Fig. 3. Task execution schedule for Example 2. (a) Statically optimized. (b) After dynamic slowdown.

Example 2: Consider the statically optimized schedule for the task set of Example 1 shown in Fig. 2(b). Fig. 3(a) shows the resulting schedule when the video decoding task instance requires only 20 time units to complete execution at maximum processor speed. As a result, the video decoding task now completes execution earlier, and does not get preempted. As seen from the figure, this execution time variation creates some dynamic slack. To utilize this slack, the processor frequency and supply voltage are dynamically reduced further during the execution of the video decoding task. If the frequency were to be dropped by a factor of 2 over the already statically optimized value, the slack gets filled up again as shown in Fig. 3(b) (see footnote 4). Accompanied by appropriate voltage scaling, the energy consumption for the interval [0, 120] now reduces by a further 14% compared to the statically optimized schedule of Fig. 3(a).

The above examples illustrated the energy reduction opportunities present during real-time task scheduling. We next describe our system model, and present an enhanced RM schedulability analysis for variable voltage systems. Using this analysis, we then present our power-aware scheduling algorithm that exploits the above illustrated DVS/DPM opportunities to yield large energy savings.

Footnote 4: Note that in Fig. 3(b), the video decoding task is preempted at time t = 60 units, and resumes execution later to complete at time t = 120 units.

III. SYSTEM MODEL

A. Task Timing Model

A set of N independent periodic tasks is to be scheduled on a uniprocessor system. Associated with each task i are the following parameters: 1) T_i, its time period; 2) C_i, its worst case execution time (WCET); 3) BC_i, its best case execution time (BCET); and 4) D_i, its deadline. The BCET and WCET can be obtained through execution profiling or other performance analysis techniques [5]. Our techniques are applicable to the general case where D_i and T_i are unrelated. We discuss this further in Section IV.
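For concreteness, the per-task parameters can be captured in a small structure (a sketch in C; the field names are ours and are not prescribed by the paper or by eCos):

/* Per-task timing parameters used by the analysis of Sections IV and V.
 * All times are expressed at the maximum processor frequency. */
typedef struct {
    double period;     /* T_i : task period               */
    double wcet;       /* C_i : worst case execution time */
    double bcet;       /* BC_i: best case execution time  */
    double deadline;   /* D_i : relative deadline         */
} task_timing_t;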

B. Power Model

Fig. 4. Power consumption as a function of clock frequency for the StrongARM processor.

We use a power model of the StrongARM processor, which is based on actual measurements, for computing the power consumption. Fig. 4 shows the power consumption of the StrongARM, plotted as a function of the operating frequency. This plot is obtained from actual current and voltage measurements reported in [33] for the StrongARM SA-1100 processor. Using this curve, the power consumption of the processor for a given clock frequency can be calculated. By varying the supply voltage and clock frequency, the variable voltage system can be made to operate at different points along this curve. For a given clock frequency (i.e., speed setting), the energy consumption of a task instance can be computed as the product of the power consumption (obtained from the power-frequency curve shown above) and the execution time of the task instance. Further, the energy-speed curve is convex in nature [17]. Thus, if two task instances have to be completed by a deadline, due to Jensen's inequality [34], it is more energy efficient to run the two tasks at a constant speed than to change the speed from one task to the other. As will be seen in the next section, our algorithm utilizes this fact by attempting to slow down tasks in a uniform manner, to the extent possible.
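To make the convexity argument concrete, consider a small worked example (a sketch assuming the idealized model P(f) proportional to f^3, i.e., supply voltage scaled in proportion to frequency; the paper itself uses the measured StrongARM curve of Fig. 4). Suppose two task instances of 10 Mcycles each must complete within 20 ms. Since energy is power multiplied by execution time,

$$ E(f) \;=\; P(f)\,\frac{W}{f} \;\propto\; W f^{2} , $$

running both instances at a constant 1 GHz gives E proportional to (10 + 10) x 1^2 = 20 (arbitrary units), whereas running the first at 2 GHz and the second at the 2/3 GHz that then just meets the deadline gives E proportional to 10 x 2^2 + 10 x (2/3)^2, which is approximately 44.4. The constant-speed schedule meets the same deadline with less than half the energy.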

C. Overheads Due to DVS/DPM

The variable supply voltage is generated using a dc–dc converter. During a voltage transition, it takes a finite amount of time for the output voltage of the dc–dc converter to transition from the present level to the desired level. Efficient dc–dc regulators with fast transition times were reported in [35], [36]. Complementary to the supply voltage variation is the accompanying variation of the clock frequency. The phase locked loop present in the clock generation circuitry also takes a finite amount of time to settle at its steady state value after a frequency change. Further, shutting down and waking up the processor involve a nonzero time and power overhead. These overheads have to be explicitly accounted for during task scheduling. For variable voltage real-time systems, the timing overheads have to be incorporated into the task-set schedulability analysis to guarantee the schedulability of tasks. The energy overheads have to be considered while computing the energy savings obtained through the use of DVS and DPM. Commercial processors till recently had large frequency transition overheads (for example, the StrongARM [37] takes 150 µs to change its frequency). However, these transition times are considerably lower in more recent processors as designers realize the effectiveness of DVS. The transition overhead is around 30 µs in the XScale processor [7]. We use a stall duration of 150 µs in our simulations, since the power model used is that of a StrongARM processor. However, in our implementation (Section VII), the stall duration is only 30 µs, since our experimental testbed is based on the XScale processor. Finally, although some processors [20] developed at academic institutions allow the computation to continue while the voltage and frequency are being changed, commercially available processors such as the XScale do not permit this. Therefore, in our simulations, the processor is stalled for the duration of the frequency change, resulting in some wasted energy (although this is insignificant since the processor clock is frozen). We account for all the above mentioned overheads in our energy calculations.

IV. RM SCHEDULABILITY ANALYSIS

Several schedulability analysis results exist for RM scheduling [38]. However, none of them consider variable voltage/frequency processors. Since processor shutdown, processor wakeup, and voltage and frequency changes take a finite amount of time, they affect the timing behavior of the system. In this section, we present an enhanced schedulability analysis for RM scheduling, which takes into account the overheads introduced due to DPM/DVS. We denote the time taken to shutdown/wakeup the processor by T_sd, and the time taken for a voltage/frequency change by T_tr (see footnote 5).

A. The Case D_i = T_i

The schedulability analysis for RM scheduling for the case D_i = T_i is presented in [38], in which necessary and sufficient conditions are derived for the schedulability of a nonconcrete (i.e., unknown initial phases for the tasks) periodic task set. The response time of a task instance is defined as the amount of time needed for the task instance to finish execution, from the instant at which it arrived in the system. The worst case response time (WCRT), as the name indicates, is the maximum possible response time that a task instance can have. For a conventional fixed speed system, the WCRT R_i of a task i under the RM scheduling scheme is given by the smallest value of t that satisfies the equation [38]

$$ t \;=\; C_i + \sum_{j \in hp(i)} \left\lceil \frac{t}{T_j} \right\rceil C_j \qquad (1) $$

where hp(i) denotes the set of tasks with priority greater than that of task i, C_j denotes the worst case execution time of task j, the ceiling brackets denote the ceiling operator, and t is simply an intermediate variable used in computing the WCRT. Equation (1) can be solved using an iterative technique. The summation term on the right-hand-side of the equation represents the total interference that an instance of task i sees from higher priority tasks during its execution. Task i is schedulable iff the WCRT of the task is less than or equal to its deadline D_i. Our enhanced schedulability analysis is based upon three observations.

1) Slowing down a task by a factor increases its computation time by the same factor. Therefore, given a set of slowdown factors η_1, η_2, ..., η_N (each η_i >= 1), the computation time for task i now becomes η_i C_i.

2) Each higher priority task instance that arrives during the lifetime of a lower priority task instance adds either η_j C_j + 2 T_tr or η_j C_j + T_tr to the response time of the lower-priority task instance, where T_tr is the time taken to change the processor frequency/voltage. This is the preemption related overhead of DVS. It takes T_tr time units to change the voltage to execute the higher priority task instance, and another T_tr units to change back to resume execution of the preempted task instance, and the worst case for a task instance occurs when this procedure repeats for every higher priority task instance that arrives during its lifetime. So, the worst case interference that a higher priority task instance contributes is equal to η_j C_j + 2 T_tr. However, when the higher priority task is being executed, suppose a task with priority in between that of the currently running task and the preempted task arrives. It is easy to see that such a task instance only causes an interference of its scaled computation time plus a single transition overhead T_tr.

3) The maximum amount of blocking that can be faced by a task instance, and which is not preemption related, is max(T_tr, T_sd), where T_sd is the time taken to shutdown/wakeup the processor. Since the processes of voltage/frequency change or shutdown cannot be preempted, each task instance may be blocked for this amount of time if it arrives just after a voltage/frequency change or a shutdown has been initiated.

Footnote 5: The time taken for a voltage change could be different from the time taken for a frequency change. In that case, T_tr is the maximum of the two values. Further, the voltage/frequency change will be accompanied by a context switch, whose overhead can also be added to T_tr.

Based on the above observations, the following theorem can be used as a sufficient condition for the schedulability of a task set under RM scheduling on a DVS enabled system.

Theorem 1: A nonconcrete periodic task set is guaranteed to be schedulable on a variable voltage processor, under the RM scheduling policy if, for every task i, R_i <= D_i, where R_i is the smallest value of t that satisfies the equation

$$ t \;=\; \eta_i C_i + \max(T_{tr}, T_{sd}) + \sum_{j \in hp(i)} \left\lceil \frac{t}{T_j} \right\rceil \left( \eta_j C_j + 2\,T_{tr} \right) \qquad (2) $$

where η_i is the slowdown factor for task i.
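As an illustration of how (2) can be evaluated, the following is a minimal, self-contained sketch in C (the type and function names are ours, not taken from the paper's pseudo-code; tasks are assumed to be sorted in decreasing RM priority, and all times are expressed in a common unit at the maximum processor frequency):

#include <math.h>
#include <stddef.h>

typedef struct {
    double T;    /* period T_i */
    double C;    /* worst case execution time C_i at maximum frequency */
    double D;    /* relative deadline D_i (equal to T_i in Section IV-A) */
    double eta;  /* slowdown factor eta_i >= 1 */
} task_t;

/* Worst case response time of task i according to (2). tasks[0..i-1] are the
 * higher priority tasks (RM order). T_tr is the voltage/frequency transition
 * time, T_sd the shutdown/wakeup time. Returns -1.0 if the iteration exceeds
 * the deadline, i.e., task i is not guaranteed schedulable. */
static double wcrt_dvs(const task_t *tasks, size_t i, double T_tr, double T_sd)
{
    double blocking = (T_tr > T_sd) ? T_tr : T_sd;    /* observation 3 */
    double t = tasks[i].eta * tasks[i].C + blocking;   /* initial guess  */

    for (;;) {
        double next = tasks[i].eta * tasks[i].C + blocking;
        for (size_t j = 0; j < i; j++)                 /* higher priority interference */
            next += ceil(t / tasks[j].T) *
                    (tasks[j].eta * tasks[j].C + 2.0 * T_tr);   /* observation 2 */
        if (next > tasks[i].D)
            return -1.0;         /* response time exceeds the deadline */
        if (next == t)
            return t;            /* fixed point reached: this is the WCRT */
        t = next;
    }
}

Setting every eta to 1 and T_tr = T_sd = 0 reduces this test to the classical analysis of (1).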

B. The Case D_i ≠ T_i

The schedulability analysis for the case when D_i ≠ T_i is more complex. For fixed speed systems, when D_i and T_i are unrelated, the WCRT of task i is given by [38]

$$ R_i \;=\; \max_{q = 0, 1, \ldots, Q_i} \left\{ W_i(q) - q\,T_i \right\} \qquad (3) $$

where Q_i is the smallest nonnegative integer value of q that satisfies W_i(q) <= (q + 1) T_i, and W_i(q) is the smallest solution to the equation

$$ W_i(q) \;=\; (q + 1)\,C_i + \sum_{j \in hp(i)} \left\lceil \frac{W_i(q)}{T_j} \right\rceil C_j \qquad (4) $$

The task set is schedulable if, and only if, R_i <= D_i for every task i. Intuitively, the above analysis implies that the WCRT may not occur for the first instance of the task, and we may need to check the schedulability for multiple instances of the task. In the case of variable voltage systems, using the three observations listed in Section IV-A, the schedulability test reduces to a sufficient condition, and (4) is changed to

$$ W_i(q) \;=\; (q + 1)\,\eta_i C_i + \max(T_{tr}, T_{sd}) + \sum_{j \in hp(i)} \left\lceil \frac{W_i(q)}{T_j} \right\rceil \left( \eta_j C_j + 2\,T_{tr} \right) \qquad (5) $$

where η_i is the slowdown factor for task i. Note that the case D_i = T_i is just a special case of this enhanced equation. When D_i = T_i, (3) and (4) are satisfied for q = 0. The offline component of our algorithm uses this enhanced sufficient schedulability test to compute static slowdown factors for each task.

V. ALGORITHM

We consider a priority-based, preemptive scheduling model where the priority assignment is done statically, using the RM priority assignment policy (see footnote 6). Task instances that do not finish executing by their deadline are killed. Our algorithm adjusts the processor's supply voltage and clock frequency whenever a new task instance first starts executing, as well as every time it resumes execution after being preempted by a higher priority task. Also, our algorithm uses a constant speed setting between two points of preemption in order to maximally exploit the convexity of the energy-speed curve. The pseudo-code of our algorithm is shown in Figs. 5 and 6. The crucial steps of our algorithm are described next.

A. Static Slowdown Factor Computation

This component of our algorithm involves a schedulability analysis of the task set. It is executed whenever a new task (e.g., audio/video decoding) enters the system and registers itself with the scheduler. Given the task set to be scheduled, the procedure COMPUTE STATIC SLOWDOWN FACTORS performs a schedulability analysis (as described in Section IV), and computes the minimum operating frequency for each task, at which the entire task set is still schedulable. Every task is slowed down uniformly till one or more tasks reach their critical point (i.e., they are "just" schedulable). This is done by the function Scale WCET(). At this juncture, further slowdown of any task with higher priority than any of the critical tasks will render at least one critical task unschedulable. Therefore, only the tasks, if any, with lower priority than any of the critical tasks can be slowed down further without affecting the schedulability of the task set. The procedure continues till there is no further scaling possible while satisfying all task deadlines. In summary, this iterative procedure gives us a new reduced frequency for each task such that the entire task set is just schedulable. The online component of our algorithm augments this static slowdown with a dynamic slowdown factor, computed at runtime, to increase energy savings.
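The following is a minimal sketch of the offline iteration just described (the paper's actual pseudo-code is in Fig. 5; the structure, the names, and the use of a small discrete slowdown step are our choices, and a condensed copy of the response-time test of (2) is repeated so the listing stays self-contained; tasks are sorted in decreasing RM priority and start with eta = 1):

#include <math.h>
#include <stddef.h>

typedef struct { double T, C, D, eta; } task_t;  /* period, WCET, deadline, slowdown */

/* Response time of task i under (2); returns a value larger than D if task i
 * cannot meet its deadline at the current slowdown factors. */
static double wcrt_dvs(const task_t *ts, size_t i, double T_tr, double T_sd)
{
    double blk = (T_tr > T_sd) ? T_tr : T_sd;
    double t = ts[i].eta * ts[i].C + blk;
    for (;;) {
        double next = ts[i].eta * ts[i].C + blk;
        for (size_t j = 0; j < i; j++)
            next += ceil(t / ts[j].T) * (ts[j].eta * ts[j].C + 2.0 * T_tr);
        if (next == t || next > ts[i].D)
            return next;
        t = next;
    }
}

/* Offline computation of static slowdown factors (cf. COMPUTE STATIC SLOWDOWN
 * FACTORS, Fig. 5). 'step' is the granularity of the uniform slowdown search;
 * a real implementation would also cap eta by the ratio of the maximum to the
 * minimum available processor frequency. */
void compute_static_slowdown(task_t *ts, size_t n, double T_tr, double T_sd, double step)
{
    size_t first = 0;                        /* ts[first..n-1] may still be slowed */
    while (first < n) {
        for (size_t i = first; i < n; i++)   /* tentative uniform slowdown step */
            ts[i].eta += step;

        size_t crit = n;                     /* highest-priority task that now misses */
        for (size_t i = first; i < n; i++)
            if (wcrt_dvs(ts, i, T_tr, T_sd) > ts[i].D) { crit = i; break; }

        if (crit == n)
            continue;                        /* still schedulable: keep slowing uniformly */

        for (size_t i = first; i < n; i++)   /* roll back the failing step */
            ts[i].eta -= step;
        first = crit + 1;   /* freeze the critical task and all higher priority tasks */
    }
}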

B. Dynamic Slowdown Factor Computation

Footnote 6: Although not analyzed in this paper, our algorithm is also applicable to dynamic priority scheduling, such as Earliest Deadline First scheduling, using an appropriately modified schedulability analysis.

Fig. 5. Pseudo-code for the offline component of the proposed algorithm.

Fig. 6. Pseudo-code for the online component of the proposed algorithm.

The online component of our algorithm invokes the COMPUTE DYNAMIC SLOWDOWN FACTOR procedure, listed in Fig. 6, to dynamically alter the supply voltage and operating frequency of the system in accordance with recent task execution statistics. Thus, while the offline component of our algorithm computes task specific static slowdown factors, the online component augments this with task instance specific dynamic slowdown factors. The COMPUTE DYNAMIC SLOWDOWN FACTOR procedure is called each time a new task instance starts, or resumes execution after being preempted by a higher priority task. The algorithm sets the processor speed and voltage based on an estimate of the task instance execution time.

1) Runtime Prediction Strategy: A novel feature of our algorithm is that it involves a proactive DVS scheme. In contrast, a majority of existing DVS techniques are reactive in nature. Therefore, in conventional schemes, any slack that arises due to a task instance completing execution early is distributed to task instances that follow, using (possibly complex) dynamic slack reclamation algorithms. Our technique, on the other hand, attempts to avoid the creation of slack altogether through the use of a predictive technique, eliminating the need for slack reclamation. In our scheme, the execution time of a task instance is predicted as some function of the execution times of a fixed number of previous task instances. An execution history database is maintained for each task, which is updated every time a task instance finishes executing or is killed due to a deadline miss. The function used to compute the execution time determines the type of predictor. We have evaluated our techniques using a simple average predictor model, as well as an exponentially weighted average one. Both these are simple prediction schemes that place a light computational load on the scheduler. Since the results were very similar in the two cases, we only report the results obtained using the weighted average model. The use of a more complex predictor will improve prediction accuracy. However, this will increase the computational burden on the scheduler since the predictor is executed every time a task instance starts or restarts after preemption.
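A minimal sketch of the weighted average predictor, in C (the smoothing weight and the choice to seed the estimate with the WCET are ours; the paper does not fix them):

/* Exponentially weighted moving average predictor of a task's execution time.
 * A weight closer to 1 emphasizes the most recent instances. */
typedef struct {
    double estimate;   /* current predicted execution time (at maximum frequency) */
    double alpha;      /* smoothing weight, 0 < alpha <= 1 (e.g., 0.5)             */
} ewma_predictor_t;

/* Start conservatively: predict the WCET until real history accumulates. */
static void predictor_init(ewma_predictor_t *p, double wcet, double alpha)
{
    p->estimate = wcet;
    p->alpha = alpha;
}

/* Called when an instance finishes executing, or is killed at its deadline,
 * with the execution time observed for that instance. */
static void predictor_update(ewma_predictor_t *p, double observed)
{
    p->estimate = p->alpha * observed + (1.0 - p->alpha) * p->estimate;
}

/* Prediction used when the next instance of the task is released. */
static double predictor_predict(const ewma_predictor_t *p)
{
    return p->estimate;
}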

2) Improving Prediction Accuracy: In order to improve prediction accuracy while retaining simplicity, we extend the prediction strategy to use conditional prediction as described below. Every time a task instance is preempted, the predicted remaining time for that task instance is recalculated as the expected value of the execution history distribution from the already elapsed time to the WCET of the task. This gives the predicted remaining time of the task, given that it has already run for the elapsed time. This improves the prediction accuracy, and reduces the probability of under-prediction, in turn reducing the probability of missed deadlines. As a side effect, it also reduces the impact of the original prediction model. This explains why we obtained similar results using the simple average and weighted average predictors.
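A sketch of this conditional re-prediction (the histogram representation of the execution history and the bin handling are our own; the paper only states that the expected value of the distribution is taken over the range from the elapsed time to the WCET):

#include <stddef.h>

/* Execution history kept as a histogram of observed instance run times,
 * with equal-width bins covering [0, WCET] (times at maximum frequency). */
typedef struct {
    double   bin_width;
    unsigned count[64];   /* count[k]: instances whose run time fell in bin k */
    size_t   nbins;       /* number of bins actually used (<= 64)             */
} exec_hist_t;

/* Expected remaining execution time of a preempted instance that has already
 * run for 'elapsed' time units, conditioned on the total run time lying
 * between 'elapsed' and 'wcet'. */
static double expected_remaining(const exec_hist_t *h, double elapsed, double wcet)
{
    double num = 0.0, den = 0.0;
    for (size_t k = 0; k < h->nbins; k++) {
        double mid = (k + 0.5) * h->bin_width;   /* representative value of bin k */
        if (mid <= elapsed || mid > wcet)
            continue;                             /* outside the conditional range */
        num += h->count[k] * (mid - elapsed);
        den += h->count[k];
    }
    if (den == 0.0)
        return wcet - elapsed;   /* no history beyond 'elapsed': fall back to the WCET */
    return num / den;
}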

Fig. 7. System simulation model in PARSEC.

3) Adaptive Power–Fidelity Tradeoff: Some applications may not be able to tolerate the number of deadlines that are missed using the above described predictive scheme. Therefore, we introduce an adaptive feedback mechanism into the prediction process. The predicted execution time of a task instance is altered using an adaptive multiplicative factor, as shown in the pseudo-code of Fig. 6. Deadline miss history is monitored using a moving window mechanism, and if the number of deadline misses is found to be increasing, the prediction is made more conservative, reducing the probability of further deadline misses. Similarly, a low/decreasing deadline miss history results in more aggressive prediction in order to increase energy savings. This is implemented as follows. If there are more than a threshold number of deadline misses in the last WINDOW task instances, the algorithm becomes more conservative and increases the adaptive factor by a fixed step. Similarly, if the number of deadline misses in the last WINDOW task instances falls below a lower threshold, then the algorithm becomes more aggressive and decreases the adaptive factor by the same step. A lower bound, ADAPTIVE LB, dictates the maximum aggressiveness of the algorithm. The prediction scheme is now adaptive to a recent history of missed deadlines, becoming more conservative in response to deadline misses and becoming more aggressive if no deadline misses occur over the past few instances. This adaptive control mechanism ensures that the number of deadline misses (which is representative of system fidelity) is kept under tight check at all times. The choice of the various parameters depends on how aggressive/conservative the user wants to be in the energy-fidelity tradeoff, a detailed analysis of which is beyond the scope of this paper. Note that in order to keep the algorithm and implementation simple, we have made the deadline miss history and all the parameters of the adaptive algorithm global parameters. An alternative, although more complicated, approach would be to perform the adaptation on a per-task basis. Such an approach would permit better control on the fidelity of individual applications. Finally, note that this adaptive scheme is also useful when an application can tolerate more deadline misses than are incurred by the baseline predictive scheme. In such a situation, the adaptive scheme can be used to obtain an increase in energy savings at the cost of more deadline misses. Thus, the application can adjust the operating point along an energy-fidelity curve, which enhances its energy scalability. For example, as the system runs out of energy, it can scale down its fidelity gracefully. To the best of our knowledge, existing DVS schemes do not provide this capability.
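A sketch of this feedback loop in C (the threshold names, step size, window bookkeeping, and the upper bound of 1.0 on the factor are our own illustrative choices; the paper treats these as global tuning parameters):

#define WINDOW        10     /* number of recent task instances tracked (illustrative)  */
#define MISS_HI        2     /* misses in the window above which we become conservative */
#define MISS_LO        1     /* misses in the window below which we become aggressive   */
#define ADAPT_STEP  0.05     /* change applied to the adaptive factor per adjustment    */
#define ADAPTIVE_LB 0.80     /* lower bound: maximum aggressiveness of the prediction   */

static unsigned char miss_window[WINDOW];   /* 1 = deadline missed (circular buffer)    */
static unsigned      miss_pos;
static double        adaptive_factor = 1.0; /* multiplies the predicted execution time  */

/* Called once per completed (or killed) task instance. */
void adapt_prediction(int deadline_missed)
{
    miss_window[miss_pos] = deadline_missed ? 1 : 0;
    miss_pos = (miss_pos + 1) % WINDOW;

    unsigned misses = 0;
    for (unsigned i = 0; i < WINDOW; i++)
        misses += miss_window[i];

    if (misses > MISS_HI)
        adaptive_factor += ADAPT_STEP;   /* too many misses: predict more conservatively */
    else if (misses < MISS_LO)
        adaptive_factor -= ADAPT_STEP;   /* few misses: predict more aggressively        */

    if (adaptive_factor < ADAPTIVE_LB) adaptive_factor = ADAPTIVE_LB;  /* max aggressiveness */
    if (adaptive_factor > 1.0)         adaptive_factor = 1.0;          /* our simplification */
}

A smaller factor shrinks the predicted execution time, which enlarges the dynamic slowdown and hence the energy savings, at the cost of a higher risk of deadline misses.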

4) Voltage Variation Policy: Our algorithm recomputes the voltage setting every time a task instance first starts executing, or restarts after being preempted. The pre-computed static slowdown factor for the task is augmented with a dynamic slowdown factor for the specific task instance that is about to start executing. The dynamic slowdown factor is computed by stretching the task instance's predicted execution time to reach its WCET. The product of the static and dynamic factors is the final slowdown factor. The processor's operating frequency is then decreased by this factor, the corresponding supply voltage is set, and execution of the task instance begins. This dynamic slowdown spreads the execution to fill otherwise idle time intervals, enabling operation at a lower supply voltage and clock frequency.
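A sketch of this speed-setting step (the frequency table, names, and remaining-time bookkeeping are illustrative; a port to a real platform would apply the chosen setting through the PA-HAL described in Section VII):

#include <stddef.h>

/* Discrete operating points, highest frequency first (illustrative values,
 * loosely modeled on the usable XScale settings of Table III). */
static const double freq_mhz[] = { 733.0, 666.0, 600.0, 533.0, 466.0, 400.0, 333.0 };
#define NFREQ (sizeof(freq_mhz) / sizeof(freq_mhz[0]))

/* Frequency to use for a task instance that is starting or resuming.
 *   remaining_wcet      : worst case execution time still left (at max frequency)
 *   predicted_remaining : predicted remaining execution time, adaptive factor applied
 *   eta_static          : static slowdown factor from the offline analysis (>= 1)
 * Returns the selected frequency in MHz; the matching voltage is set alongside. */
double select_frequency(double remaining_wcet, double predicted_remaining,
                        double eta_static)
{
    /* Dynamic slowdown: stretch the predicted remaining time to the WCET. */
    double eta_dynamic = (predicted_remaining > 0.0)
                             ? remaining_wcet / predicted_remaining
                             : 1.0;
    double eta_total = eta_static * eta_dynamic;    /* final slowdown factor     */
    double target = freq_mhz[0] / eta_total;        /* ideal operating frequency */

    /* Pick the lowest available frequency that is still >= the target, so the
     * guarantees of the schedulability analysis are preserved. */
    size_t pick = 0;
    for (size_t k = 0; k < NFREQ; k++)
        if (freq_mhz[k] >= target)
            pick = k;
    return freq_mhz[pick];
}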

VI. SIMULATION-BASED PERFORMANCE ANALYSIS

We next describe our simulation framework, and present simulation results comparing the proposed adaptive power–fidelity DVS algorithm to several existing DVS/DPM schemes to demonstrate its effectiveness.

A. Simulation Model

To analyze the performance of the proposed adaptive power–fidelity DVS scheme, a discrete event simulator was built using PARSEC [39], a C-based parallel simulation language. The simulation structure, shown in Fig. 7, consisted of two parallel communicating entities. The first one represented the RTOS, and implemented the task scheduler (enhanced with the DVS scheme). In addition to performing task scheduling, the RTOS entity also maintained the statistics of task instance execution times and deadline misses. The second entity, i.e., the task generator, periodically generated task instances with run times according to a trace or a distribution, and passed them to the RTOS entity.

Fig. 8. Energy savings of various DVS/DPM schemes using RM scheduling for (a) Task Set 1 and (b) Task Set 2.

Fig. 9. Deadline miss percentage for the proposed scheme and the PAST/PEG scheme using RM scheduling for (a) Task Set 1, and (b) Task Set 2.

B. Simulation Results

We performed simulations on two task sets (henceforth called Task Set 1 and Task Set 2) that are based on standard task sets used in embedded real-time applications [40], [41]. The task instance execution times were generated using a Gaussian distribution (see footnote 7). To demonstrate that our technique works even for nonconcrete task sets, every task was given a random initial phase offset. We performed four experiments on each task set. In the first experiment, only shutdown-based DPM was employed, and the supply voltage and frequency were fixed at their maximum values. Next, we implemented the low-power scheduling technique of [18]. This is a WCET-based scheme, where DVS is done only when there is a single task remaining to execute. In the third experiment, we implemented the PAST/PEG DVS scheme [4], [8]. Similar to our algorithm, this is a predictive DVS scheme. However, all DVS decisions in the PAST/PEG scheme are purely utilization-based. Finally, the proposed adaptive power–fidelity DVS scheme was implemented. The experiments were repeated while varying the BCET from 10% to 100% of the WCET in steps of 10%. For the adaptive scheme, fixed values were used for WINDOW, the deadline-miss thresholds, the adaptation step, and ADAPTIVE LB; these parameters are described in Section V-B3.

Footnote 7: We obtained similar results using a uniform distribution. Therefore, only the results for the Gaussian distribution case are presented.

Fig. 8 shows the energy savings obtained for the two task sets for each of the above mentioned DVS/DPM schemes. The results are normalized to the case when no DVS/DPM is used. As can be seen in the figure, the proposed adaptive power–fidelity DVS algorithm results in significantly higher energy savings compared to the shutdown, WCET-based, and PAST/PEG algorithms. Fig. 9 shows the percentage of deadlines missed by the proposed algorithm, and the PAST/PEG algorithm. The PAST/PEG scheme results in a large number of deadline misses because it takes DVS decisions purely based on processor utilization. As mentioned before, in real-time multi-tasking systems, the schedulability of the task set is only weakly related to the processor utilization. Therefore, the PAST/PEG algorithm results in a significant loss in real-time behavior. The algorithm proposed in this work is tightly coupled to the schedulability analysis of the underlying real-time scheduling scheme used, resulting in far fewer deadline misses, as is evident from Fig. 9.

Fig. 10. Impact of the parameter ADAPTIVE LB on (a) percentage of deadline misses and (b) energy savings, for Task Set 1.

Fig. 11. Variation of (a) energy savings and (b) deadline miss percentage with parameters WINDOW and ADAPTIVE LB for Task Set 1.

We performed additional experiments with Task Set 1 to illustrate the adaptive power–fidelity tradeoff that our scheme provides. We ran our algorithm for three different values of the parameter ADAPTIVE LB. Fig. 10(a) and (b) show the energy savings and deadline miss percentages, respectively, for these three values of ADAPTIVE LB, and for a nonadaptive version of our algorithm. From the figures, it is clear that by increasing ADAPTIVE LB, one can trade off energy savings for fewer deadline misses. Such an adaptive scheme enhances the system's energy-scalability. For example, as the battery drains out, ADAPTIVE LB can be decreased, thereby gradually degrading the system's fidelity. Fig. 11(a) and (b) plot the energy savings and deadline misses as a function of the parameters ADAPTIVE LB and WINDOW, with the remaining parameters held fixed.

Fig. 12. Energy savings versus deadline misses for WINDOW = 10 and various values of ADAPTIVE LB for Task Set 1.


Fig. 13. (a) Power-aware software architecture, and (b) our experimental setup. The XScale board is on the top left, the Maxim board on the top right, and the FPGA board, which generates interrupts, is on the bottom right side. On the bottom left is a breadboard with interconnections to trigger the DAQ board.

It is evident that as ADAPTIVE LB increases, the energy savings decrease, and the deadline misses decrease. As the window size increases, the energy savings decrease and the deadline misses decrease. Also, note that choosing a very low value for ADAPTIVE LB is not very energy efficient. This is because the large number of missed deadlines causes the voltage to be increased very frequently in response, lowering the energy savings obtained. In order to clearly show the energy-fidelity tradeoff, we have plotted the energy savings versus the deadline miss percentage in Fig. 12 for WINDOW = 10 and different values of ADAPTIVE LB. As can be seen in Fig. 12, a value of 0.8 for ADAPTIVE LB seems to yield the maximum energy savings, for this particular task set.

VII. IMPLEMENTATION

In addition to validating the proposed adaptive power–fidelity technique through simulations, we also evaluated its performance by implementing it in an RTOS running on a variable voltage hardware platform. In this section, we first give a brief overview of a structured software architecture that we have developed to facilitate application-RTOS interaction for effective energy management. Then, we describe our implementation in detail, and present measured results that demonstrate the effectiveness of our DVS algorithms.

A. Software Architecture

We view the notion of power awareness in the application and OS as a capability that enables a continuous dialog between the application, the OS, and the underlying hardware. This dialog establishes the functionality and performance expectations (or even contracts, as in the real-time sense) within the available energy constraints. A well structured software architecture is necessary for realizing this notion of power awareness. The power-aware software architecture (PASA) [42], [43] that we have developed is composed of two software layers and the RTOS kernel. One layer is an API that interfaces applications with the OS, and the second layer makes power related hardware knobs available to the OS. Both layers interface to various OS services as shown in Fig. 13(a). The API layer is separated into two sub-layers. The Power Aware Application Programmer Interface (PA-API) sub-layer provides power management functions to the applications, while the other sub-layer, the Power Aware Operating System Layer (PA-OSL), provides access to existing and modified OS services. Active entities that are not implemented within the RTOS kernel should be implemented at this layer (e.g., threads created to assist the OS in DVS/DPM, such as a thread responsible for killing other threads whose deadlines were missed). The modified RTOS and the underlying hardware are interfaced using a Power Aware Hardware Abstraction Layer (PA-HAL). The PA-HAL gives the OS access to the power related hardware knobs while abstracting out the details of the underlying hardware platform.

B. Experimental Setup

Using the PASA presented above, the proposed adaptive power–fidelity DVS algorithm has been incorporated into the eCos operating system, an open-source RTOS from Red Hat Inc. eCos was ported to an 80200 Intel Evaluation Board [44], which is an evaluation platform based on the XScale processor. The processor supports nine frequency levels ranging from 200 MHz to 733 MHz. However, two of them (200 MHz and 266 MHz) cannot be used in the 80200 board due to limitations in the clock generation circuitry [7]. In addition, the processor supports three different low power modes: IDLE, DROWSY, and SLEEP. The SLEEP mode results in maximal power savings, but requires a processor reset in order to return to the ACTIVE mode. The IDLE mode, on the other hand, offers the least power savings but only requires a simple external interrupt to wake the processor up. We use the IDLE mode in our experiments due to its simple implementation.

TABLE III: FREQUENCY–VOLTAGE PAIRS FOR THE INTEL XSCALE PROCESSOR USED IN OUR IMPLEMENTATION

Like most RTOSs, eCos requires a periodic interrupt to keep track of the internal OS tick, responsible for the notion of timing within the system. In the 80200 board, the only source of such an interrupt is the internal XScale performance counter interrupt. However, the interrupt is internal to the processor, and therefore cannot wake it up from the IDLE mode. To overcome this problem, we used a source of external interrupts to wake up the processor. The interrupt pin of the processor is connected to an Altera FPGA board, which generates periodic interrupts to wake up the processor. The period of the external interrupt is made equal to that of the smallest time period task to ensure that the processor is not woken up unnecessarily. The experimental setup, consisting of the XScale board, the MAXIM dc–dc converter board, the FPGA, and some interface circuitry, is shown in Fig. 13(b).

1) Variable Voltage Supply: The variable voltage supply consists of a MAXIM 1855 dc–dc converter board, and some interface circuitry implemented using a PLD. When a voltage change is required, the processor sends a byte through the peripheral bus of the 80200 board to the interface circuitry, which acts as an addressable latch. The outputs of this latch are connected to the digital inputs of the Maxim variable supply board. These inputs determine the output supply voltage of the Maxim board, which is fed back to the processor core. For the experiments, the system was configured to run at supply voltages from 1.0 to 1.5 V and corresponding frequencies from 333 to 733 MHz. The frequency-voltage pairs are listed in Table III. The frequency is changed using the XScale's internal registers.

2) Power Measurement Setup: We used a National Instruments Data Acquisition (DAQ) board to measure the power consumption. We isolated the power supply to the processor from the rest of the board, and measured the power consumed by the processor alone. For this purpose, we used a current-sense resistor (0.02 Ω) and sampled the voltage drop across it at a high sampling rate to obtain the instantaneous current drawn by the processor. We computed the energy consumption by integrating the product of the instantaneous supply voltage and current consumption over the interval of interest. The DAQ board was triggered by pulling signals out of the peripheral bus of the 80200 board to synchronize the power measurements with the task execution.
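In equation form (a sketch of the computation just described, with R_sense = 0.02 Ω and Δt the DAQ sampling interval):

$$ I(t_k) = \frac{V_{\text{sense}}(t_k)}{R_{\text{sense}}}, \qquad E \;\approx\; \sum_{k} V_{\text{supply}}(t_k)\, I(t_k)\, \Delta t , $$

where the sum runs over the samples in the interval of interest.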

C. Experimental Procedure and Results

We implemented and compared three different DVS/DPM schemes: 1) shutdown-based DPM; 2) a nonadaptive version of the proposed algorithm; and 3) the complete adaptive power–fidelity tradeoff DVS algorithm. As a baseline for comparison, we also conducted an experiment without any DVS/DPM. For the adaptive scheme, fixed settings were used for WINDOW, the deadline-miss thresholds, and the adaptation step, and we used two values (0.95 and 0.85) for ADAPTIVE LB. We did not explore optimal choices for these parameters. Rather, our goal was to demonstrate the effectiveness of our algorithm by implementing it on a real system.

We used three different applications in our experiments: 1) an MPEG2 decoder; 2) an ADPCM speech encoder; and 3) a floating-point fast Fourier transform (FFT) algorithm. Note that these applications run concurrently in the system. Each application executes as an eCos thread, and is scheduled using the RM priority assignment scheme. We profiled the execution time of each application for different input data to collect WCET statistics. For the MPEG2 decoder, different files were decompressed, and the WCET was measured separately for each one of them. The original FFT algorithm computes a fixed number (1024) of FFTs in one execution. In order to increase the variability in execution time, we also implemented a version of the FFT algorithm that computes a random number (obtained using a Gaussian distribution) of FFTs in each execution. Table IV shows the characteristics of the applications used. The WCET of each application instance was measured with the processor running at maximum frequency. The standard deviation is also given in Table IV to show the variability in the execution times for each application. We built three different task sets using these applications. The task sets, their component tasks, and the individual task characteristics are listed in Table V. The static slowdown factor is also listed for each task.

All the DVS/DPM algorithms shut down the processor as soon as it becomes idle. The processor is woken up when the next external interrupt arrives. Because of a limitation of the processor wake-up strategy in our current testbed, all task periods must be made a multiple of the period of the highest-rate task. Thus, whenever the processor is woken up, there is useful work to be done. This limitation could be eliminated through the use of a programmable interrupt generator. Further, note that this is not a limitation of the DVS algorithm or the software architecture itself. The static slowdown factors used by the proposed DVS algorithm are maintained in a table that is internal to eCos. Each task type is associated with a static factor, which is computed during system initialization.


TABLE IV. APPLICATIONS USED IN THE EXPERIMENTS. T1 AND T2 BOTH PERFORM MPEG DECODING, BUT ON DIFFERENT FILES

TABLE V. TASK SETS USED IN THE EXPERIMENTS. EACH TASK SET IS COMPRISED OF THREE TASKS

TABLE VI. EXPERIMENTAL RESULTS FOR TASK SET A. THE TOTAL NUMBER OF TASK INSTANCES IS 415, 207, AND 138 FOR TASKS T2, T3, AND T4, RESPECTIVELY

TABLE VII. EXPERIMENTAL RESULTS FOR TASK SET B. THE TOTAL NUMBER OF TASK INSTANCES IS 130, 65, AND 43 FOR TASKS T1, T3, AND T4, RESPECTIVELY

TABLE VIII. EXPERIMENTAL RESULTS FOR TASK SET C. THE TOTAL NUMBER OF TASK INSTANCES IS 130, 65, AND 43 FOR TASKS T1, T3, AND T5, RESPECTIVELY

The dynamic and adaptive factors are maintained in a table of task instances. The task type table also contains a specified number (10 in our case) of execution times of previous task instances. The eCos kernel has full access to these tables. When a context switch occurs, the various tables are updated, and the voltage and frequency of the processor are adjusted.
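The bookkeeping described above can be summarized with the following data-structure sketch. The field names, the history length, and the way the three factors are combined into a target frequency are illustrative assumptions for exposition, not the exact kernel implementation.

#define HISTORY_LEN 10           /* number of stored previous execution times */

/* Per-task-type record, populated at system initialization. */
struct task_type {
    double   static_factor;              /* static slowdown factor from offline analysis */
    double   adaptive_factor;            /* adapted at run time from observed behavior */
    unsigned history[HISTORY_LEN];       /* execution times of previous instances (cycles) */
    unsigned history_idx;
};

/* Per-instance record, updated as the instance executes. */
struct task_instance {
    struct task_type *type;
    double dynamic_factor;               /* per-instance slack-based slowdown factor */
};

/* Called from the context-switch path: pick and apply an operating point
 * for the task that is about to run. */
void apply_slowdown(struct task_instance *inst, unsigned f_max_mhz)
{
    struct task_type *t = inst->type;
    /* Illustrative combination of the three factors into a single slowdown value. */
    double slowdown = t->static_factor * inst->dynamic_factor * t->adaptive_factor;
    if (slowdown > 1.0)
        slowdown = 1.0;
    unsigned f_target = (unsigned)(slowdown * f_max_mhz);
    /* Quantize to the nearest supported frequency-voltage pair (Table III)
     * and program the supply and clock, e.g., via set_operating_point(). */
    extern void select_and_set_operating_point(unsigned f_target_mhz);
    select_and_set_operating_point(f_target);
}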

The energy consumption, average power consumption, and number of deadline misses for the three task sets, under the various DPM/DVS schemes, are shown in Tables VI, VII, and VIII. The column titled Ratio gives the energy consumption normalized to the case when no DVS/DPM is used. For the adaptive algorithm, the number in parentheses in the column DVS/DPM scheme used represents the value of ADAPTIVE LB. As expected, the energy savings increase as this parameter decreases, at the cost of more deadline misses. These results indicate that the proposed adaptive power–fidelity DVS algorithm does indeed result in considerable energy savings at the cost of a small number of missed deadlines.

VIII. SUMMARY AND FUTURE WORK

This paper presented an RTOS-directed DVS scheme to reduce energy consumption in wireless embedded systems. A key feature of the proposed technique is that it yields an adaptive tradeoff between energy consumption and system fidelity. The proposed algorithm exploits the low processor utilization, instance-to-instance variation in task execution times, and tolerance to missed deadlines of wireless systems to achieve this tradeoff.


The technique has been incorporated into the eCos RTOS, and an energy-efficient software architecture has been developed that facilitates application-aware power management by enabling a dialog between the application and the RTOS. Involving the application in DVS/DPM can yield significant benefits. In many cases, the execution time is a superposition of several distinct distributions corresponding to different operating modes or distinct values of data. For example, a multiplier takes less time to compute its output if the operands are powers of two. Another example is an MPEG decoder, whose histogram of run times has three distinct peaks corresponding to P, I, and B frames. In such cases, an intelligent task can provide feedback to the OS in the form of a hint on the distribution of run times after the data values are known. As part of future work, we plan to investigate the energy reduction potential of such interactions in detail.
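One possible form for such a hint interface is sketched below; the function name, the hint structure, and the decision to index hints by frame type are hypothetical and are shown only to make the application-OS dialog concrete.

/* Hypothetical hint passed from an application task to the OS-level DVS layer. */
struct runtime_hint {
    unsigned expected_cycles;   /* expected execution time of the upcoming instance */
    unsigned worst_case_cycles; /* upper bound the task is willing to commit to */
};

/* Hypothetical OS entry point; the DVS layer can use the hint when
 * selecting the slowdown factor for the next instance of this task. */
extern void os_set_runtime_hint(int task_id, const struct runtime_hint *hint);

/* Example: an MPEG decoder issues a hint once the frame type is known. */
void mpeg_hint_for_frame(int task_id, char frame_type)
{
    struct runtime_hint hint;
    switch (frame_type) {
    case 'I': hint.expected_cycles = 900000; break;   /* illustrative cycle counts */
    case 'P': hint.expected_cycles = 600000; break;
    case 'B': hint.expected_cycles = 400000; break;
    default:  hint.expected_cycles = 900000; break;
    }
    hint.worst_case_cycles = 1200000;
    os_set_runtime_hint(task_id, &hint);
}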

REFERENCES

[1] A. P. Chandrakasan and R. W. Brodersen, Low Power CMOS Digital Design. Norwell, MA: Kluwer, 1996.

[2] A. Raghunathan, N. K. Jha, and S. Dey, High-Level Power Analysis and Optimization. Norwell, MA: Kluwer, 1998.

[3] L. Benini and G. De Micheli, Dynamic Power Management: Design Techniques and CAD Tools. Norwell, MA: Kluwer, 1997.

[4] T. A. Pering, T. D. Burd, and R. W. Brodersen, "The simulation and evaluation of dynamic voltage scaling algorithms," in Proc. ACM ISLPED, 1998, pp. 76–81.

[5] Y.-T. S. Li and S. Malik, "Performance analysis of embedded software using implicit path enumeration," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 16, no. 12, pp. 1477–1487, Dec. 1997.

[6] eCos Real-Time OS [Online]. Available: http://www.redhat.com/embedded/technologies/ecos

[7] Intel XScale Microarchitecture [Online]. Available: http://developer.intel.com/design/xscale/

[8] D. Grunwald et al., "Policies for dynamic clock scheduling," in Proc. ACM Symp. OSDI, 2000, pp. 73–86.

[9] M. B. Srivastava, A. P. Chandrakasan, and R. W. Brodersen, "Predictive system shutdown and other architectural techniques for energy efficient programmable computation," IEEE Trans. VLSI Syst., vol. 4, no. 1, pp. 42–55, Mar. 1996.

[10] C. Hwang and A. Hu, "A predictive system shutdown method for energy saving of event driven computation," in Proc. IEEE ICCAD, 1997, pp. 28–32.

[11] T. Simunic, L. Benini, P. Glynn, and G. De Micheli, "Dynamic power management for portable systems," in Proc. ACM MOBICOM, 2000, pp. 11–19.

[12] L. Benini, A. Bogliolo, and G. De Micheli, "A survey of design techniques for system-level dynamic power management," IEEE Trans. VLSI Syst., vol. 8, no. 3, pp. 299–316, Jun. 2000.

[13] M. Weiser, B. Welch, A. Demers, and S. Shenker, "Scheduling for reduced CPU energy," in Proc. ACM Symp. OSDI, 1994, pp. 13–23.

[14] K. Govil, E. Chan, and H. Wasserman, "Comparing algorithms for dynamic speed-setting of a low-power CPU," in Proc. ACM MOBICOM, 1995, pp. 13–25.

[15] F. Yao, A. Demers, and S. Shenker, "A scheduling model for reduced CPU energy," in Proc. Annu. Symp. Foundations of Computer Science, 1995, pp. 374–382.

[16] I. Hong, M. Potkonjak, and M. B. Srivastava, "On-line scheduling of hard real-time tasks on variable voltage processors," in Proc. IEEE ICCAD, 1998, pp. 653–656.

[17] T. Ishihara and H. Yasuura, "Voltage scheduling problem for dynamically variable voltage processors," in Proc. ACM ISLPED, 1998, pp. 197–202.

[18] Y. Shin and K. Choi, "Power conscious fixed priority scheduling for hard real-time systems," in Proc. IEEE/ACM DAC, 1999, pp. 134–139.

[19] T. D. Burd, T. A. Pering, A. J. Stratakos, and R. W. Brodersen, "A dynamic voltage scaled microprocessor system," IEEE J. Solid-State Circuits, vol. 35, no. 11, pp. 1571–1580, Nov. 2000.

[20] T. A. Pering, T. D. Burd, and R. W. Brodersen, "Voltage scheduling in the lpARM microprocessor system," in Proc. ACM ISLPED, 2000, pp. 96–101.

[21] A. Manzak and C. Chakrabarty, "Variable voltage task scheduling for minimizing energy or minimizing power," in Proc. IEEE ICASSP, 2000, pp. 3239–3242.

[22] C. M. Krishna and Y. H. Lee, "Voltage-clock-scaling adaptive scheduling techniques for low power in hard real-time systems," in Proc. IEEE RTAS, 2000, pp. 156–165.

[23] P. Pillai and K. G. Shin, "Real-time dynamic voltage scaling for low power embedded operating systems," in Proc. ACM Symp. Operating Systems Principles, 2001, pp. 89–102.

[24] F. Gruian, "Hard real-time scheduling for low energy using stochastic data and DVS processor," in Proc. ACM ISLPED, 2001, pp. 46–51.

[25] J. Pouwelse, K. Langendoen, and H. Sips, "Energy priority scheduling for variable voltage processors," in Proc. ACM ISLPED, 2001, pp. 28–33.

[26] J. Luo and N. K. Jha, "Power conscious joint scheduling of periodic task graphs and aperiodic tasks in distributed real-time embedded systems," in Proc. IEEE ICCAD, 2000, pp. 357–364.

[27] J. Luo and N. K. Jha, "Battery aware static scheduling for distributed real-time embedded systems," in Proc. ACM/IEEE DAC, 2001, pp. 444–449.

[28] D. Zhu, R. Melhem, and B. Childers, "Scheduling with dynamic voltage/speed adjustment using slack reclamation in multi-processor real-time systems," in Proc. IEEE RTSS, 2001, pp. 95–105.

[29] N. K. Jha, "Low power system scheduling and synthesis," in Proc. IEEE ICCAD, 2001, pp. 259–263.

[30] C. L. Liu and J. W. Layland, "Scheduling algorithms for multiprogramming in a hard real time environment," J. ACM, vol. 20, pp. 46–61, Jan. 1973.

[31] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco, CA: Freeman and Co., 1979.

[32] I. Hong, G. Qu, M. Potkonjak, and M. B. Srivastava, "Synthesis techniques for low-power hard real-time systems on variable voltage processors," in Proc. IEEE RTSS, 1998, pp. 178–187.

[33] A. Sinha and A. P. Chandrakasan, "Jouletrack: A web based tool for software energy profiling," in Proc. IEEE/ACM DAC, 2001, pp. 220–225.

[34] J. L. W. V. Jensen, "Sur les fonctions convexes et les inegalites entre les valeurs moyennes," Acta Math., vol. 30, pp. 175–193, 1906.

[35] W. Namgoong and T. H. Meng, "A high-efficiency variable-voltage CMOS dynamic dc-dc switching regulator," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 1997, pp. 380–381.

[36] V. Gutnik and A. P. Chandrakasan, "Embedded power supply for low-power DSP," IEEE Trans. VLSI Syst., vol. 5, no. 4, pp. 425–435, Dec. 1997.

[37] Intel StrongARM Processors [Online]. Available: http://developer.intel.com/design/strong/

[38] A. Burns and A. Wellings, Real-Time Systems and Their Programming Languages. Reading, MA: Addison-Wesley, 1989.

[39] PARSEC Parallel Simulation Language [Online]. Available: http://pcl.cs.ucla.edu/projects/parsec/

[40] A. Burns, K. Tindell, and A. Wellings, "Effective analysis for engineering real-time fixed priority schedulers," IEEE Trans. Software Eng., vol. 21, no. 5, pp. 475–480, May 1995.

[41] N. Kim, M. Ryu, S. Hong, M. Saksena, C. Choi, and H. Shin, "Visual assessment of a real time system design: A case study on a CNC controller," in Proc. IEEE RTSS, 1996, pp. 300–310.

[42] C. Pereira, V. Raghunathan, S. Gupta, R. Gupta, and M. Srivastava, "A software architecture for building power aware real time operating systems," Univ. California, Irvine, CA, Tech. Rep. 02-07, Mar. 2002.

[43] Power Aware Distributed Systems Project [Online]. Available: http://www.ics.uci.edu/cpereira/pads

[44] Intel 80200 Evaluation Board [Online]. Available: http://developer.intel.com/design/xscale/

Vijay Raghunathan (S'00) received the B.Tech. degree in electrical engineering from the Indian Institute of Technology, Madras, in 2000, and the M.S. degree in electrical engineering from the University of California at Los Angeles (UCLA) in 2002, where he is currently pursuing the Ph.D. degree.

His research interests include system architectures and design methodologies for wireless embedded systems with emphasis on low power design, and energy efficient protocols for wireless communications.

Mr. Raghunathan received the UCLA Electrical Engineering Department's Outstanding Masters Student Award for the year 2001–2002 and the UC Regents graduate fellowship during the year 2000–2001. He also received the Best Student Paper Award at the IEEE International Conference on VLSI Design in 2000.


Cristiano L. Pereira received the Bachelors degree in computer science from the Catholic University of Minas Gerais, Brazil, in 1997, and the M.Sc. degree in computer science from the Federal University of Minas Gerais, Brazil, in 2000. He is currently pursuing the Ph.D. degree in the Computer Science and Engineering Department, University of California at San Diego.

His research interests are high-level power management, operating systems, and embedded software.

Mani B. Srivastava (SM'02) received the B.Tech. degree in electrical engineering from IIT Kanpur in 1985, and the M.S. and Ph.D. degrees from the University of California at Berkeley in 1987 and 1992, respectively.

He is currently a Professor in the Electrical Engineering Department at the University of California at Los Angeles (UCLA). Prior to joining UCLA, he was in the Networked Computing Research Department at Bell Laboratories from 1992 to 1996. His research interests at UCLA are in mobile and wireless networked embedded systems, focusing particularly on energy-aware computing and communications, low-power design, pervasive computing systems, and wireless sensor and actuator networks. He holds five U.S. patents, and has published over 100 papers in the areas of wireless systems, sensor networks, and low-power embedded systems.

Dr. Srivastava serves on the editorial boards of IEEE TRANSACTIONS ON MOBILE COMPUTING and Wiley's Wireless Communications and Mobile Computing. He received the NSF CAREER Award in 1997, the Okawa Foundation grant in 1997, and the President of India Gold Medal in 1985. He also received the Best Paper Award at the IEEE International Conference on Distributed Computing Systems in 1997.

Rajesh K. Gupta (F'04) received the B.Tech. degree in electrical engineering from IIT Kanpur, India, the M.S. degree in electrical engineering and computer science from the University of California at Berkeley, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA.

He is a Professor and holder of the Qualcomm endowed chair in embedded microsystems in the Department of Computer Science and Engineering at the University of California at San Diego. His current research interests are in embedded systems, VLSI design, and adaptive system architectures. Earlier, he was on the faculty of the Computer Science Departments at UC Irvine and the University of Illinois, Urbana-Champaign. Prior to that, he worked as a circuit designer at Intel Corporation, Santa Clara, CA, on a number of processor design teams. He is author or coauthor of over 150 articles on various aspects of embedded systems and design automation, and holds three patents on PLL design, data-path synthesis, and system-on-chip modeling.

Dr. Gupta is a recipient of the Chancellor's Fellow award at UC Irvine, the UCI Chancellor's Award for excellence in undergraduate research, the National Science Foundation CAREER Award, two Departmental Achievement Awards, and a Components Research Team Award at Intel. He is Editor-in-Chief of IEEE Design and Test of Computers and serves on the editorial boards of IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN and IEEE TRANSACTIONS ON MOBILE COMPUTING. He is a distinguished lecturer for the ACM/SIGDA and the IEEE CAS Society.