University of California, Irvine and San Diego
Energy-Aware System Design for Wireless Multimedia
Nikil Dutt
Shivajit Mohapatra
Rajesh Gupta
Nalini Venkatasubramanian
Sumit Gupta
Cristiano Pereira
Hans Van Antwerpen
Ralph Von Vignau
Philips Semiconductors
2
Talk Outline
• Overview: Distributed Wireless Multimedia
• Case Study: The FORGE Framework
• IP Reuse at Philips
3
• Power optimization in battery-operated mobile devices is a crucial research challenge.
• Devices operate in dynamic distributed environments.
• Future power management strategies need to be aware of global system changes.
4
Infrastructure for Mobile Multimedia Environments
[Figure: services (online gaming, video streaming, distance learning, online banking, chat) reach low-power devices (laptop, palmtop, iPAQ, low-power mobile device) across a wide area network and a wireless network through access points (AP), using request/response exchanges.]
• Best-Effort Service
5
Enhanced Infrastructure
[Figure: the same services (online gaming, video streaming, distance learning, online banking, chat), wide area network, wireless network, access points (AP) and low-power devices (laptop, palmtop, iPAQ, low-power mobile device), extended with a directory service, a broker, and proxies PROXY-1 to PROXY-N. A proxy takes data from the server and data from the mobile host, applies services (execute remote tasks, caching, compression, encryption/decryption, compositing, transcoding), and produces the output.]
6
Challenges in Wireless Multimedia Processing
• Proliferation of devices
  • System support for a multitude of smart devices that
    • attach to and detach from a distributed infrastructure
    • produce a large volume of information at a high rate
    • are limited by communication and power constraints
• Need a customizable networking backbone
  • QoS-driven resource provisioning algorithms for highly dynamic environments
  • Need to deal adaptively with incoming requests
  • Dynamically reconfigure the system to service requests
7
High Data Volume of Multimedia Information

Medium             Parameters                                Data volume
Speech             8000 samples/s                            8 Kbytes/s
CD Audio           44,100 samples/s, 2 bytes/sample          176 Kbytes/s
Satellite Imagery  180 x 180 km^2, 30 m^2 resolution         600 MB/image (60 MB compressed)
NTSC Video         30 fps, 640 x 480 pixels, 3 bytes/pixel   30 Mbytes/s (2-8 Mbits/s compressed)
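The raw rates in this table follow from multiplying the sampling parameters. A minimal sketch of that arithmetic is given here; the 1 byte/sample for speech and the two (stereo) channels for CD audio are assumptions chosen because they reproduce the slide's figures, and the function name is illustrative only.

```python
def raw_rate_bytes_per_s(samples_per_s, bytes_per_sample, channels=1):
    """Uncompressed data rate in bytes per second."""
    return samples_per_s * bytes_per_sample * channels

# Assumption: speech uses 1 byte per sample (not stated on the slide).
speech = raw_rate_bytes_per_s(8000, 1)
# Assumption: CD audio is stereo, i.e. 2 channels.
cd_audio = raw_rate_bytes_per_s(44_100, 2, channels=2)
# Video: pixels per second (30 frames of 640 x 480) times 3 bytes per pixel.
ntsc_video = raw_rate_bytes_per_s(30 * 640 * 480, 3)

for name, rate in [("Speech", speech), ("CD audio", cd_audio), ("NTSC video", ntsc_video)]:
    print(f"{name}: {rate / 1000:.1f} Kbytes/s")
```

Under these assumptions the video rate comes to about 27.6 Mbytes/s, which the slide rounds to 30 Mbytes/s.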
8
Challenges in Wireless Multimedia Processing
• Dealing with device mobility
  • Need a high degree of “network awareness”: congestion rates, mobility patterns, etc.
  • Global system state is constantly changing
• Service brokering for QoS-aware resource provisioning
  • Admission control, load balancing, etc.
• Multimedia processing challenges
  • Soft real-time constraints
  • Synchronization (e.g. lip sync, floor control)
  • Support for traditional media (text, images) and continuous media (audio/video)
• Other considerations: availability, reliability, cost-effectiveness and security
9
Distributed Wireless Multimedia
• Different forms of information accessible anytime
  • Multiple sessions with varying characteristics
• Services, networks and systems
  • Heterogeneous, evolve dynamically
• Quality of Service
  • Constraints: timing, resource availability, network constraints (e.g. bandwidth), security, reliability …
  • Example: multimedia streaming to handheld devices
    • QoS parameters: jitter, frame rate, resolution, bit rate, etc.
    • All these QoS parameters affect user perception
• Power is a new QoS dimension in distributed multimedia
  • The user must be able to watch the requested video without running out of battery
10
Multimedia Streaming Example
• We use this framework for examining the design challenges
• A proxy node between servers and clients allows dynamic stream transformations (transcoding, adaptation, annotation, etc.)
[Figure: media servers, wired network, proxy, wired network, access point, wireless network, and clients (handheld PC, PDA, phone).]
11
Opportunities in Wireless Multimedia System Design
• The dynamic nature of multimedia tasks leaves some computational slack
  • Slack = difference between computational capability and the computational requirements due to deadlines
• QoS trade-offs are possible for reducing energy consumption
  • Example: lower-quality video needs less computation and bandwidth
• Multimedia application characteristics
  • Kernels of computation-dominated operations (e.g. MPEG: IDCT, motion compensation, VLD)
  • Predictable, regular behavior most of the time (e.g. VLD, followed by IQ, IDCT)
  • Clear, cyclic computation and/or data access patterns (e.g. video frames are traversed in a known order)
• Exploit multimedia-specific characteristics to enable a range of optimization techniques
12
Implications on Wireless Multimedia System Design
• Devise strategies that reduce energy
• These strategies must adapt to and optimize for changes in
  • Application data (video stream)
  • OS/hardware (CPU, memory, reconfigurable logic)
  • Network (congestion, noise, node mobility)
  • Residual energy (battery)
  • Environment (ambient light, sound)
• Strategies can
  • Change application behavior (compression ratio)
  • Reduce backlight
  • Buffer data (and switch off the network card)
13
Abstraction Layers in Distributed Multimedia Systems
[Figure: Abstraction Layers, shown for a server and clients 1, i, n. Labels: hardware (network card, display, cache, memory, register files, CPU); operating system; DVS; scheduler; network management; transcoding; admission control; applications (video player, other tasks); middleware.]
Challenges
• Enable high quality of service (particularly multimedia services) at the mobile device: high computational capability
• Do so within strict peak power and energy budgets
• E.g.: play a video stream at the highest quality (requires computation) while ensuring the entire video plays back (requires energy)
14
Energy Aware System Design Techniques
• Several approaches optimize energy for each component and each abstraction level
• Solutions at each abstraction level
  • Architecture: cache/memory optimizations, processor architectural optimizations
  • Operating system
    • Dynamic voltage scaling (DVS)
    • Dynamic power management (DPM) of system components: disks, network interfaces
  • Middleware solutions: adaptive streaming, mobility-based adaptations
  • Application adaptations: profiling applications for low-power execution
Related Work
Layers:
• Architecture (CPU, memory)
• Operating System (DVS, DPM, driver interfaces, system calls)
• Distributed Middleware (distributed adaptation, cross-layer adaptation, application-specific adaptation)
• User/Application (quality of service, application/user feedback)

• Soderquist (ACM Multimedia 97)
• Azevedo (AWIA 2001)
• Hughes, Adve (MICRO 01, ISCA 01)
• Brooks (ISCA 2000), Choi (ISLPED 02)
• Lebeck (ASPLOS 2000), Microsoft’s ACPI

• Ellis, Vahdat (EcoSystem, Currentcy, ASPLOS 02)
• Hao, Nahrstedt (ICMCS 99, HPDC 99, Globecom)
• DVS (Shin, Gupta, Weiser, Srivastava, Govil et al.)
• DPM (Douglis, Helmbold, Delaluz, Kumpf et al.)
• Chandra (MMCN 02), Katz (IEICE 97), Chou (02)
• Feeney, Nilson (Infocom 2001)

• Nahrstedt (Grace, UIUC - MMCN 2002, 2003)
• Shenoy (MMCN 2002), RajKumar (ICDCS 2003)
• Mohapatra (ICDCS, MWCN 2003), Xu (DCS 03)
• Efstratiou, Friday (Middleware 2000)
• Forge Project UCI (ACM MM, RTAS, CIPC 03)

• Flinn (ICDSP 2001), Yau (ICME 2002)
• Krintz, Wolski (UCSD)
• Noble (SOSP 97, MCSA 1999)
• Li (CASES 2002), Othman (1998)
• Abeni (RTSS 98)
• Rudenko (ACM SAC 99), Satyanarayanan (2001)

PROXY-BASED ADAPTATION for POWER AWARENESS
• Shenoy (transcoding), Chandra (network), Mohapatra (OS, arch, network + transcoding)
CROSS-LAYER ADAPTATION
• GRACE (Illinois), FORGE/DYNAMO (UCI)
16
Talk Outline
• Overview: Distributed Wireless Multimedia
• Case Study: The FORGE Framework
• IP Reuse at Philips
17
Traditional Approach: A Closer Look
[Figure: Wireless Distributed Infrastructure (traditional). A server reaches a low-power mobile device through the network infrastructure (wide area network, wireless network) using request/response exchanges. On the low-power device, power management is shown alongside the application, operating system and architecture layers.]
18
Drawbacks
• Limited coordination between the different computation layers (architecture, OS, application)
  • Lack of a generalized framework
  • Example: DVS in the presence of architectural optimizations
• Do not exploit global system knowledge
  • Network congestion levels
  • Device mobility information
  • Data characteristics
Cross-layer coordination directed by a distributed middleware framework can effectively address the above limitations.
19
A Global Coordinated Approach in FORGE
[Figure: LOCAL CROSS LAYER ADAPTATION on the device (architecture, operating system, distributed middleware, user/application) connected over the network to GLOBAL PROXY BASED ADAPTATION (proxy and directory service, with the same four layers).]
Build a power-aware distributed embedded system framework that can
• Exploit global changes (network congestion, system loads, mobility patterns) to improve local adaptations
• Distribute local information (e.g. device mobility, residual power) for improved global adaptations
• Coordinate power management strategies at different levels (application, middleware, OS, architecture)
• Maximize the utility (application QoS, power savings) of a mobile device
20
FORGE: Layers and Interactions
[Figure: layers on the local device and their interactions with the proxy.]
CROSS LAYER ADAPTATION (local device):
• User applications (utility)
• Middleware device runtime (API interface): network optimization, task partitioning, user info, collect/update local data, middleware strategies
• Operating system: DVS, scheduler, device drivers
• Hardware: network card, display, cache, memory, register files, CPU
PROXY-BASED ADAPTATION (proxy middleware):
• Admission control
• Task partitioning
• Adaptive network transmission
• …
Information exchanged between layers and with the proxy:
• Mobility information, current residual power, utility levels supported, user requirements for admission control
• Transcoded payload/data, settings for transmitted data, control information (network transmission)
• Application-specific info: NIC idle periods, video encoding info, display settings
• Residual power info, power API
• Architecture-specific settings and knobs (register file sizes, cache configuration)
21
Outline for the rest of the talk
• Examine energy optimization knobs at each abstraction level
• Examine how cross-layer coordination can reduce energy further
• Specifically, we will talk about:
  • Using reconfigurable caches
  • Adaptive DVS techniques
  • Network card shutdown by buffering video data
  • Reducing backlight by video enhancement
22
Hardware/Architectural Level Knobs
• Major sources of power consumption
  • Display (backlight)
  • Network interface
  • CPU, particularly the memory sub-system
• We will discuss two middleware/HW optimizations:
  • Quality-driven cache reconfiguration
  • Dynamic backlight adjustment
23
Quality-Driven Cache Reconfiguration (Hardware-Level Optimization)
• Why caches?
  • High relative power consumption (above 50%)
  • Influences external memory power
• Idea: reconfigure the data cache for the requirements of a specific video stream format
  • Cache power knobs used: size, associativity
  • Goal: find the best configuration for each quality level
  • Plus: combine with dynamic voltage scaling (DVS)
• Application: MPEG decoding
  • Frame decoding may take less than the frame delay
  • Slack time: θ = F_d − D (between the deadline and the end of computation)
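The slack definition can be turned into a simple frequency choice. A minimal sketch follows, assuming that F_d is the frame delay, D the decode time at full speed, and that decode time scales inversely with clock frequency; the function names and the list of frequency levels are hypothetical and not taken from the talk.

```python
def frame_slack(frame_delay_s, decode_time_s):
    """Slack theta = F_d - D; a negative value means the deadline was missed."""
    return frame_delay_s - decode_time_s

def slowest_feasible_frequency(decode_time_s, frame_delay_s, f_max_mhz, levels_mhz):
    """Lowest available frequency that still finishes the frame within F_d.

    Assumes decode time is inversely proportional to clock frequency,
    which ignores memory stalls that do not scale with the CPU clock.
    """
    needed_mhz = f_max_mhz * decode_time_s / frame_delay_s
    feasible = [f for f in sorted(levels_mhz) if f >= needed_mhz]
    return feasible[0] if feasible else f_max_mhz

# Hypothetical example: 30 fps stream, 10 ms decode time at 400 MHz.
levels = [33, 66, 100, 200, 400]
chosen = slowest_feasible_frequency(0.010, 1 / 30, 400, levels)
print(frame_slack(1 / 30, 0.010), chosen)
```

The larger the slack, the lower the frequency (and voltage) that can be selected, which is why cache configurations that shorten decode time leave more room for DVS.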
24
Impact of Cache Parameters on Energy
• Profiled short (10 sec) video clips (quality: low to high) for all cache configurations; parameters varied:
  • Size: 4 KB to 64 KB
  • Associativity: 1 to 32
• Energy savings: 10-15% (CPU + memory) over the 32x32 baseline
• Experimental setup:
  • Wattch/SimpleScalar
  • Berkeley MPEG tools
[Figure: “Action” clip, high quality]
Observations:
• Associativity has the largest impact on energy
• The best cache configuration reflects
  • internal storage requirements for different frame sizes
  • the internal organization (data sets) of the decoding algorithm
25
Cache Configuration + DVS
• Interaction of DVS with cache configurations
  • Cache configurations with the largest frame decoding slack enable the largest DVS savings
• Results: up to 60% energy savings over the base configuration

Middleware Rule Base for Best Config at each Quality Level (quality: high, Q1, to low, Q8)

Video    Cache  Cache          Clock      Voltage  Original  Optimized  Savings
Quality  Size   Associativity  Frequency           Energy    Energy
Q1       8      8              100        1        1.29      0.76       47.50%
Q2       8      8              100        1        1.09      0.64       47.80%
Q3       8      8              100        1        0.95      0.56       48.00%
Q4       32     2              66         0.9      0.54      0.26       57.60%
Q5       32     2              66         0.9      0.48      0.23       57.80%
Q6       32     2              33         0.9      0.42      0.2        58.00%
Q7       8      8              33         0.9      0.29      0.14       57.30%
Q8       8      8              33         0.9      0.24      0.11       57.50%

Base configuration: 400 MHz, 1.3 V, 32 KB, 32-way set associative
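One way to picture the middleware rule base is as a lookup keyed by quality level. The sketch below copies the cache and DVS settings from the table; the units (KB, MHz, V), the type name and the function name are assumptions for illustration and do not describe the FORGE implementation.

```python
from typing import NamedTuple

class CacheDvsConfig(NamedTuple):
    cache_size: int      # as listed in the table (KB assumed)
    associativity: int   # number of ways
    clock: int           # as listed in the table (MHz assumed)
    voltage: float       # as listed in the table (V assumed)

# Values copied from the table, Q1 (highest quality) to Q8 (lowest).
RULE_BASE = {
    "Q1": CacheDvsConfig(8, 8, 100, 1.0),
    "Q2": CacheDvsConfig(8, 8, 100, 1.0),
    "Q3": CacheDvsConfig(8, 8, 100, 1.0),
    "Q4": CacheDvsConfig(32, 2, 66, 0.9),
    "Q5": CacheDvsConfig(32, 2, 66, 0.9),
    "Q6": CacheDvsConfig(32, 2, 33, 0.9),
    "Q7": CacheDvsConfig(8, 8, 33, 0.9),
    "Q8": CacheDvsConfig(8, 8, 33, 0.9),
}

def best_config(quality_level):
    """Return the profiled best configuration for a video quality level."""
    return RULE_BASE[quality_level]

print(best_config("Q4"))
```

When the proxy changes the stream quality, the device-side middleware would look up the new level and apply the matching cache and DVS settings.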
26
OS Directed Power Management
• The OS has a global view of what is going on in the whole system
• Applications should communicate:
  • Quality of service, timing restrictions
• The OS decides how to configure the available knobs
  • Ex: processor frequency and voltage scaling
27
Power Aware Software Architecture (PASA)
• PA-API (Power Aware API)
  • Application/OS interface
  • Makes power-aware OS services available to the application writer
• PA-OSL (Power Aware Operating System Layer)
  • Implements modified OS services and active components such as a DPM manager
• PA-HAL (Power Aware Hardware Abstraction Layer)
  • OS/hardware interface
  • Makes power control knobs available to the OS programmer
[Figure: software stack, top to bottom: adaptable applications; Power Aware API; OS with PA OS services and local PM; Power Aware HAL next to the OS HAL; hardware. Middleware is shown alongside the stack.]
28
Operating System Driven DVS
• Slow down the CPU based on workload and timing restrictions (slowdown factors f < 1)
• We model real-time task sets with periods = deadlines using rate-monotonic scheduling (RMS)
• We implemented 4 variations of DVS with CPU shutdown:
  • Shutdown when idle: as soon as the CPU becomes idle, shut down the processor
  • Static slowdown factors: calculated offline, based on RM schedulability analysis (using the WCETs)
  • Dynamic slowdown: run-time slowdown factors are predicted based on a history of execution times
  • Adaptive slowdown: a third slowdown factor, adapted according to the number of deadlines missed in a previous window of executions
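A static slowdown factor can be derived from a rate-monotonic schedulability test. The sketch below uses the classic Liu and Layland utilization bound as that test, which is an assumption: the talk does not say which RM analysis was used, and the bound is sufficient but not necessary. The WCETs are those listed for task set A (T1, T3, T4) in this talk; the periods are hypothetical.

```python
def rm_utilization_bound(n):
    """Liu and Layland bound: n tasks are RM-schedulable if U <= n(2^(1/n) - 1)."""
    return n * (2 ** (1 / n) - 1)

def static_slowdown_factor(tasks):
    """Smallest speed fraction f (0 < f <= 1) that keeps U / f within the bound.

    tasks: list of (wcet, period) pairs measured at full speed, same time unit.
    Assumes execution time scales as wcet / f when the CPU runs at fraction f.
    """
    utilization = sum(wcet / period for wcet, period in tasks)
    factor = utilization / rm_utilization_bound(len(tasks))
    if factor > 1:
        raise ValueError("task set fails the utilization test even at full speed")
    return factor

# WCETs in microseconds from the talk (T1, T3, T4); periods are hypothetical.
taskset_a = [(30700, 100000), (9300, 50000), (15900, 80000)]
factor = static_slowdown_factor(taskset_a)
print(f"static slowdown factor: {factor:.3f}")
```

Because the factor is computed from worst-case execution times, it is conservative; the dynamic and adaptive schemes try to reclaim the slack left when tasks finish early.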
29
Implementation
• We modified the eCos real-time operating system running on an XScale platform (80200EVB) with dynamic frequency and voltage scaling hardware.
• For the DVS techniques, we implemented real task sets to validate the software implementation:
  • MPEG decoding, ADPCM and FFT
30
Task  Application  WCET (us)  Std Dev (us)
T1    MPEG2        30700      3100
T2    MPEG2        26300      2100
T3    ADPCM        9300       3300
T4    FFT          15900      0
T5    FFT          13600      800
[Figure: Energy Consumption for each scheme. x-axis: scheme; y-axis: ratio of energy consumption between normal and scheme (0 to 1); series: Taskset A, Taskset B, Taskset C.]
Task sets:
• A: T1, T3, T4
• B: T2, T3, T4
• C: T1, T3, T5
Observations
• Adaptive slowdown achieves about 30-40% savings
• However, deadline misses increase (not shown here)
• OS/middleware have to trade off deadline misses against energy savings/slowdown factors
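The trade-off between deadline misses and slowdown can be illustrated with a simple feedback rule. This is a sketch under assumptions, not the scheme implemented in the talk: the factor is treated as a fraction of full speed, and the step size, floor and miss threshold are hypothetical parameters.

```python
def adapt_slowdown(factor, misses_in_window, max_misses, step=0.05, floor=0.1):
    """Adjust the speed fraction after each window of executions.

    Too many deadline misses in the last window: speed up (factor rises, max 1.0).
    Otherwise: slow down further to save energy (factor falls, min `floor`).
    """
    if misses_in_window > max_misses:
        return min(1.0, factor + step)
    return max(floor, factor - step)

# Hypothetical run: tolerate at most 1 miss per window.
factor = 0.8
for misses in [0, 0, 3, 0]:
    factor = adapt_slowdown(factor, misses, max_misses=1)
print(f"factor after four windows: {factor:.2f}")
```

A lower miss threshold keeps quality high at the cost of energy; a higher one saves more energy but lets more frames miss their deadlines.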
31
Middleware Controlled Network Data Buffering
• Wireless NICs (network interface cards) consume significantly less energy in sleep mode
  • Avg. power consumption in sleep mode = 0.184 W, whereas idle and receive modes consume 1.34 W and 1.435 W respectively
• Transmitting video data in bursts can help save power
  • The NIC on the device can be transitioned into sleep mode between bursts
• The middleware on the proxy is used to buffer video data and transmit it in bursts to the device.
• Additionally, based on the residual energy feedback from the device, the middleware can transcode the video stream based on a Quality/Power Matrix.
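The benefit of bursty transmission can be estimated from the three power figures quoted on this slide. The sketch below is an idealized estimate: it assumes the NIC is either receiving or in its low-power state, ignores the time and energy of mode transitions, and uses a hypothetical receive fraction.

```python
P_SLEEP_W = 0.184    # average power in sleep mode (from the slide)
P_IDLE_W = 1.34      # average power in idle mode (from the slide)
P_RECEIVE_W = 1.435  # average power in receive mode (from the slide)

def avg_power_always_on(receive_fraction):
    """NIC stays idle whenever it is not receiving."""
    return receive_fraction * P_RECEIVE_W + (1 - receive_fraction) * P_IDLE_W

def avg_power_bursty(receive_fraction):
    """NIC sleeps between bursts; transition overhead is ignored."""
    return receive_fraction * P_RECEIVE_W + (1 - receive_fraction) * P_SLEEP_W

# Hypothetical duty cycle: the NIC receives 25% of the time.
on = avg_power_always_on(0.25)
bursty = avg_power_bursty(0.25)
print(f"always on: {on:.3f} W, bursty: {bursty:.3f} W, saved: {on - bursty:.3f} W")
```

The saving shrinks as the receive fraction grows, which matches the later observation that less buffering is possible at higher video quality.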
32
[Figure: Power Gains using Buffering for various noise levels. Axes: video quality (Q1 to Q8), N = 1, 3, 5, and avg. power saved (%); arrows labeled “Decreasing” and “Increasing”.]
• Power savings decrease as video quality increases
  • The amount of data buffering possible is less at higher quality
• This is an ideal model: in practice, network noise will mean that the network interface has to be left on for longer periods of time
33
Reducing Backlight for Lower Power
• Identify “Groups of Scenes” with little variance in luminosity
• Increase pixel luminance and reduce the backlight level
• To avoid loss of contrast (due to pixel luminance saturation)
  • Perform spatial convolution using a high-pass filter
  • This sharpens objects in the image

Power consumed at various backlight levels during streaming multimedia playback on the Compaq iPAQ:

Backlight Mode  Power Consumed (W)
Super Bright    2.80
High Bright     2.51
Medium Bright   2.32
Low Bright      2.16
Power Save      1.72
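The luminance compensation step and the saturation problem it creates can be sketched as follows. The model is a simplifying assumption, namely that perceived brightness is proportional to pixel luminance times backlight level; the function names are illustrative and the high-pass filtering step is not shown.

```python
def compensate_luminance(pixels, backlight_ratio):
    """Scale 8-bit luminance values by 1 / backlight_ratio, clipping at 255.

    backlight_ratio is the new backlight level as a fraction of the original
    (0 < ratio <= 1). Clipped pixels lose contrast, which is why the slide
    pairs this step with high-pass filtering.
    """
    if not 0 < backlight_ratio <= 1:
        raise ValueError("backlight_ratio must be in (0, 1]")
    return [min(255, round(p / backlight_ratio)) for p in pixels]

def saturated_fraction(pixels, backlight_ratio):
    """Fraction of pixels whose compensated value would exceed 255."""
    clipped = sum(1 for p in pixels if p / backlight_ratio > 255)
    return clipped / len(pixels)

# Hypothetical scene luminance samples, backlight reduced to 80%.
scene = [40, 100, 200, 250]
print(compensate_luminance(scene, 0.8), saturated_fraction(scene, 0.8))
```

Dark scenes tolerate a large backlight reduction because few pixels saturate, while bright scenes clip quickly and leave less room for savings.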
34
Video Streams used for Experiments

Characteristics of video streams used in experiment:

MPEG Video    Resolution  FPS  Duration (sec)  Luminosity Variation  Video Type
bipolar.mpg   320 x 240   30   41              Little                Dark, 3D animation
iceegg.mpg    240 x 136   30   59              Moderate              Bright, 3D animation
intro.mpg     160 x 120   30   59              Very High             Flashy, TV show clip
simpsons.mpg  192 x 144   30   27              High                  Colorful, 2D animation

[Figure: Snapshots of MPEG-1 video streams used in experiments (bipolar.mpg, iceegg.mpg, intro.mpg, simpsons.mpg)]
35
Three Backlight Compensation Approaches
• SBC: Simple Backlight Compensation
  • Only identify GOS, reduce backlight on handheld
  • No video stream contrast enhancement
• CBVLC: Constant Backlight with Video Luminosity Compensation
  • Backlight level set once at start of video stream
  • Video stream is enhanced (dynamically at the proxy)
• DCA: Dual Compensation Approach
  • Backlight level is dynamically changed based on GOS
  • Video stream is enhanced based on the backlight level decision
36
Results for Backlight Compensation
[Figure: three bar-chart panels (Super Bright, High Bright, Medium Bright). x-axis: video clip (iceegg, simpsons, intro, bipolar); y-axis: power saving (mW); series: CBVLC, SBC, DCA.]
37
Summary
• We explored ways to reduce power by integrating power optimization techniques across abstraction layers
  • HW/OS/Middleware: cache reconfiguration, DVS, backlight reduction
  • OS/Application: Power Aware API for DVS
  • Middleware/Network: NIC shutdown using data buffering
• Conclusion: a cross-layer coordinated strategy is required for maximum energy savings
  • Information available at different abstraction levels can be used by either the OS or the middleware to make global decisions
38
Ongoing Work
• Exploit repetitive and cyclic characteristics of MPEG-2, MPEG-4/H.263
  • Application and data profiling possible for reducing energy consumption
• Energy characterization of security and digital media protection algorithms
  • Security and IP protection of multimedia content have spawned a range of security measures
  • First step: we analyzed the effects of watermarking on energy and computation time on PDAs
• Task partitioning between proxy and handheld for reducing total energy (= computation + communication)
  • For video streaming, video conferencing, watermarking
39
Talk Outline
• Overview: Distributed Wireless Multimedia
• Case Study: The FORGE Framework
• IP Reuse at Philips
Blowing away the Barriers to Large Scale IP Reuse
Ralph von Vignau
5 January 2004
DATE Conference 2004
Paris, La Defense
41
Philips and IP Reuse
• Philips Semiconductors is a leading SoC developer
• A reuse structure and policy for IP has been systematically introduced into the development environment.
• There are rules and tools to support the reuse
  • CoReUse for HW
  • MoReUse for SW
42
Philips and IP Reuse
Background - 1
• Philips Semiconductors has a strategy of developing products based on System Silicon Platforms (SSPs).
  • Chameleon (MIPS subsystem generator)
  • ChipBuilder (ARM-based system generator)
• Demonstrates the value of automatic methods of integrating IP blocks into a subsystem along with its verification environment.
43
Philips and IP ReusePhilips and IP Reuse Background - 2Background - 2
““Need a generic framework that Need a generic framework that enables platform developers to enables platform developers to
implement their system in a implement their system in a consistent, flexible and easy-to-consistent, flexible and easy-to-
use wayuse way””
Combining automatic methods of Combining automatic methods of integrating configurable IP blocks integrating configurable IP blocks
together with their verification together with their verification environmentenvironment
44
Lessons Learned
Factors that enable successful IP reuse:
• A centrally driven and supported company policy
• Wide deployment with consultancy
• Central repository
• High quality that can be trusted
• Ease of use
• Good documentation
• Central support
• Distributed champions
• Visible improvements and successes
45
The Limits of the Current Policies
• A standard set of views is provided for each IP block
  - Ensures compatibility with the development flows
  - Supports easier integration
  - Ensures a minimum of documentation is available
  - Is supported by checking tools
• However:
  - Verification reuse is not yet included
  - The checking is done by in-house tools
  - The rules only apply to in-house IP
• A far more radical change is required to move to the next level of reuse methodology.
  - Higher automation
  - Faster integration and verification
  - Higher quality
  - Flexibility in design flows
46
Requirements for the next Level of Reuse
• Extend reuse both within Philips as well as to the IP bought for use within Philips
  - The use of IP from multiple vendors must be made easier and less costly
  - Tools from various EDA vendors must be easier to integrate into a design flow
• The verification of IP must be more:
  - Comprehensive, stretching from unit tests to system verification
  - Reusable in all stages of the SoC development
• A higher automation in the development flow must be supported
  - Automated IP integration
  - Verification suite compilation
47
Supportive Standardization
• Although there are several activities and working groups throughout the industry and standardization groups, none have the industry focus or time drive set by the SPIRIT Consortium
• Only if there is an industry drive to common standards for the Reuse of IP can major improvements be achieved
WWW.spiritconsortium.com
48
The SPIRIT Consortium
• SPIRIT: Structure for Packaging, Integrating and Re-using IP within Tool-flows
• A consortium of leading companies in the EDA, IP, system and semiconductor industries
• Aim
  - To develop industry standards
  - Ease integration of semiconductor IP into Systems
  - Enable the interoperability of tools for IP integration
49
Reason for the SPIRIT consortium
• Industry demands
  - Complex System-on-Chip and Programmable Platforms require IP re-use
  - Device manufacturers need to be able to select IP from multiple sources
  - Unifying IP descriptions and access to this information permits best-in-class choices for both IP and tools
50
SPIRIT Consortium Background
• The founding companies decided mutually to establish a unified set of standards to increase efficiency of IP based SoC design
• Combining technological strengths of SPIRIT members to
  - Create standards that will help express complex IP
  - Deliver greater flexibility and efficiency to the SoC design process
51
Consortium Goals
• Develop standards to facilitate IP re-use
  - Structure for configurable IP design: separating core functionality from associated parameters
  - Defining standard interfaces for tools
  - Enable more efficient and cost-effective integration of IP from multiple sources
• Test the proposed standards within multiple live projects
  - Providing proof-of-concept
• Transfer proven standards to an international standards body
52
Future-world for designers
• SPIRIT-enabled IP will facilitate new levels of design integration & automation across a wide range of IP, tools and vendors:
• IP providers will ship IP with a machine-readable XML 'data-book'
  - Designers do not have to study data books to use IP in a System design.
  - IP will be automatically configured and integrated into designs.
• The same design information will be used to generate varied system information
  - Simulation models, Documentation, System APIs, Tool Configurations, SW applications
• New specialist design applications will emerge to process IP information
  - FPGA prototype generators
  - HW/SW optimizers/re-mapping tools
  - Automatic OS porting
  - Bus Generators optimized for power/bandwidth etc.
• For the first time, it will become realistic to reuse IP directly in a System Design
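To make the idea of a machine-readable 'data-book' concrete, the sketch below parses a small XML description of one IP block and applies designer overrides to its default parameters. The element names, attribute names and the `uart0` block are hypothetical and chosen only for illustration; they do not follow the actual SPIRIT schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-book for one IP block. Element and attribute names are
# invented for this sketch and are NOT taken from the SPIRIT schema.
DATA_BOOK = """
<component name="uart0" vendor="example.com" version="1.0">
  <busInterface name="apb_slave" busType="APB" mode="slave"/>
  <parameter name="FIFO_DEPTH" default="16"/>
  <parameter name="BAUD_DIVISOR" default="27"/>
  <addressBlock base="0x0" range="0x1000"/>
</component>
"""


def load_component(xml_text, overrides=None):
    """Parse the description and apply designer overrides to the defaults."""
    root = ET.fromstring(xml_text)
    # Start from the defaults declared by the IP provider.
    params = {p.get("name"): p.get("default") for p in root.findall("parameter")}
    # Reject overrides for parameters the IP block does not declare.
    for key, value in (overrides or {}).items():
        if key not in params:
            raise KeyError(f"unknown parameter: {key}")
        params[key] = str(value)
    block = root.find("addressBlock")
    return {
        "name": root.get("name"),
        "interfaces": [b.get("name") for b in root.findall("busInterface")],
        "parameters": params,
        "address_range": int(block.get("range"), 16),
    }


component = load_component(DATA_BOOK, overrides={"FIFO_DEPTH": 32})
print(component)
```

A tool that consumes such a description can configure and connect the block without a designer reading a printed data book, which is the scenario the slide describes.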
54
The Philips Utilization of the SPIRIT Standards
• Philips has been gaining experience with automated IP integration:
  - A Philips in-house tool, Chip Builder, is an excellent example of the technology
    - Uses architecture templates
    - IP generators
    - Interconnect generators
    - Automated clock insertion and DfT
    - Automated pad and ring insertion
• Using the SPIRIT standards, Philips intends to use third party tools to realize an optimized new generation
55
The Nx-Builder development environment
• Philips will integrate a selected number of tools together to form a highly automated SoC design flow
  - The Nx-Builder development environment will support the 3 main phases in the development of SoCs:
    - The architecture exploration and definition of templates
    - The integration & verification of IP
    - The synthesis and chip design steps
• Nx-Builder will provide a highly flexible platform and product development environment
56
Nx-Builder Goals
• Aim is to move to next level of abstraction in SoC development
  - HW & SW IP, Subsystems and platforms, SoC
• Encapsulates architectural rules and IP in an abstract form
• Provides basis for derivatives
  - Encapsulated system can be deployed to derivative development teams
[Figure: System Silicon Platform. Labels: Microcontroller Subsystem, CPU, Application testbench, Standard IP, Standard Cell, Subsystems, SoC]
57
Nx-Builder, its place in the IC design flow: Upstream
[Figure: upstream flow, IP & System development using SystemC. Stages: Architecture exploration, System Definition, SW Environment, Co-Simulation, SystemC Simulations]
Annotations on the figure:
- Identification of new IP
- Decision on IP reuse
- Specifications of new IP
- Optimization of: Performance, IP Reuse, Development of new IP
- SystemC, Verilog, VHDL
- Verification Software, Test Suites, Drivers
- Access to database of SystemC models
58
Nx-Builder, its place in the IC design flow: IP Integration
[Figure: IP integration flow, GUI using generators for automation. Stages: IP Select, Chip Configuration, Extract, Build SW, Compile, Simulate]
Annotations on the figure:
- Search in IPYP
- GUI entry or configuration file; configurable blocks; I/Os
- Feedback on: Gates, Address Maps, Block Diagram
- Database generation; extract all IP from databases
- Chip Model, System Model, Test Bench, Simulations
- Build Verification Software
59
Nx-Builder, its place in the IC design flow: Downstream
[Figure: downstream flow. Stages: Simulate, Synthesis, Timing, Place & Route, Verification, Product]
Annotations on the figure:
- Chip Model, System Model, Test Bench, Simulations, Prototyping
- Scripts for Industry Standard Tools
- Make/Automation scripts
60
The Major Focus
• Nx-Builder will focus on
  - Reuse of verification suites at all stages of the development flow
  - Support of verification for SystemC simulations, in prototyping systems and on the integrated IP
• All IP will have several standard views:
  - A SystemC model
  - An FPGA view for prototyping
  - A verification suite view
  - RTL, Verilog and/or VHDL
  - A metadata description package
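The standard views listed above lend themselves to an automated completeness check. The sketch below is one possible shape for such a check; the view identifiers and the dictionary layout of an IP package are assumptions made for illustration, not a description of the Philips checking tools.

```python
# View names follow the slide; the identifiers and the package layout
# (a dict of view name -> list of files) are hypothetical.
REQUIRED_VIEWS = ("systemc_model", "fpga", "verification_suite", "rtl", "metadata")


def missing_views(package):
    """Return the required views that are absent or empty in an IP package."""
    return [view for view in REQUIRED_VIEWS if not package.get(view)]


# Example package with no FPGA view and an empty verification suite.
uart_package = {
    "systemc_model": ["uart.cpp", "uart.h"],
    "rtl": ["uart.v"],
    "metadata": ["uart.xml"],
    "verification_suite": [],
}
print(missing_views(uart_package))
```

A repository could run a check of this kind before accepting a block, so that every delivered IP carries the full set of views.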
61
Platform Example
[Figure: example platform block diagram. Labels: Multi-layer AHB, ARM9, ARM7TDMI, MPEG-4, DMA, SDRAM controller, Camera intf., PCMCIA, ISROM, ISRAM, AHB bridge (x2), VPB1, VPB2, VDD always, CGU, PLL (x2), Clocks, External int. ctl, vectored interrupt ctl, UART, JTAG TAP, CTAG TCB, IO conf, sys_creg, New IP 1, New IP 2]
62
Summary
• Philips is committed to the planned Reuse of IP and Verification Suites
• Philips will exploit the SPIRIT standards to achieve the next step in Reuse technology
• Philips believes the changes induced in the EDA and IP provider scene through SPIRIT will have positive effects on the electronic industry as a whole
64
Layered Model for QoS
[Figure: layered QoS model. Layers and their QoS: User (Perceptual QoS), Application (Application QoS), System (System QoS; Operating and Communication System), Network (Network QoS), MM devices (Device QoS)]
[Figure: Application QoS Parameter Examples. Labels: Application QoS; Media Quality, Media Relations; Media Characteristics (Intraframe, Interframe); Component Spec (Name, Size, Rate, Importance); Loss Rate; Transmission Characteristics; Sample Size, Sample Rate, Compression; End-to-end Delay, Sample Loss Rate, Importance, Cost; Synchronization Skew; Integration, Communication, Conversion]
65
QoS Classes
• QoS classes can determine
  - Reliability of offered services
  - Utilization of resources
• Guaranteed Service Class
  - Deterministic/Statistical guarantees.
• Predictive Service Class
  - QoS parameters based on past behavior
• Best-Effort Service Class
  - Only partial guarantees based on resource availability
  - QoS parameters are specified with only minimal/no bound
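One way to read the three classes is as different reservation policies. The sketch below illustrates that reading with a hypothetical bandwidth reservation function: the rule of reserving the peak rate for guaranteed service, the historical mean for predictive service and nothing for best-effort service is an assumption made for illustration, not a policy stated in the talk.

```python
from enum import Enum
from statistics import mean


class ServiceClass(Enum):
    GUARANTEED = "guaranteed"
    PREDICTIVE = "predictive"
    BEST_EFFORT = "best-effort"


def bandwidth_to_reserve(service_class, peak_kbps, history_kbps):
    """Return the bandwidth (Kbps) to set aside for a stream.

    Assumed policy for this sketch:
    - guaranteed: reserve the declared peak rate
    - predictive: reserve the mean of past observed rates
    - best-effort: reserve nothing, use whatever is available
    """
    if service_class is ServiceClass.GUARANTEED:
        return float(peak_kbps)
    if service_class is ServiceClass.PREDICTIVE:
        # Fall back to the peak when there is no past behavior to learn from.
        return float(mean(history_kbps)) if history_kbps else float(peak_kbps)
    return 0.0


for cls in ServiceClass:
    print(cls.value, bandwidth_to_reserve(cls, 650, [300, 400]))
```

The trade-off named on the slide shows up directly: the guaranteed class is the most reliable but reserves the most resources, while best-effort reserves none.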
66
What’s New in the Context of Wireless Systems
• Earlier optimization metric was Bandwidth
  - MPEG is a video compression standard
• For mobile, wireless devices: Energy is a severely limited resource
  - How can we optimize MPEG encoding/decoding to reduce energy?
• Traditionally, DSPs and ASICs have been used to execute Multimedia applications
  - Mobile handhelds, laptops etc. use general purpose processors
67
Proxy Based Middleware Approach
[Figure: proxy-based middleware architecture. Power management spans the layers of the low-power device: User/Application, Distributed Middleware, Operating System, Architecture. Network path: Wide Area Network, Proxy, Wireless Network, low-power mobile device. Proxy-Based Optimization in the Network Infrastructure: Execute Remote Tasks, Caching, Compress, Encryption, Decryption, Compositing, Transcode]
68
Energy-Sensitive Video Transcoding
► We conducted a survey to subjectively assess human perception of video quality on handhelds.
► Hard to programmatically identify video quality parameters
► We identified 8 perceptible video quality levels that produced noticeable difference in power consumption (Compaq iPaq 3600)

Quality/Power Matrix for COMPAQ IPAQ 3600 (Grand Theft Auto Action Video Sequence)

QUALITY              Video transformation parameters   Avg. Power (Windows CE)   Avg. Power (Linux)
Q1 (Like original)   SIF, 30fps, 650Kbps               4.42 W                    6.07 W
Q2 (Excellent)       SIF, 25fps, 450Kbps               4.37 W                    5.99 W
Q3 (Very Good)       SIF, 25fps, 350Kbps               4.31 W                    5.86 W
Q4 (Good)            HSIF, 24fps, 350Kbps              4.24 W                    5.81 W
Q5 (Fair)            HSIF, 24fps, 200Kbps              4.15 W                    5.73 W
Q6 (Poor)            HSIF, 24fps, 150Kbps              4.06 W                    5.63 W
Q7 (Bad)             QSIF, 20fps, 150Kbps              3.95 W                    5.5 W
Q8 (Terrible)        QSIF, 20fps, 100Kbps              3.88 W                    5.38 W
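The quality/power matrix can drive a simple selection rule: stream at the best quality level whose average power fits a given budget. The sketch below is a minimal illustration of that idea, not the selection logic used in the talk. The function name and the power budget are hypothetical, and the assignment of the two power columns to Windows CE and Linux follows the order of the column headers on the slide.

```python
# Rows: (level, label, format, fps, bitrate_kbps, watts_wince, watts_linux).
# Values are from the slide; the Windows CE / Linux column assignment is
# assumed from the header order. Rows are ordered from best to worst quality.
QUALITY_LEVELS = [
    ("Q1", "Like original", "SIF", 30, 650, 4.42, 6.07),
    ("Q2", "Excellent", "SIF", 25, 450, 4.37, 5.99),
    ("Q3", "Very Good", "SIF", 25, 350, 4.31, 5.86),
    ("Q4", "Good", "HSIF", 24, 350, 4.24, 5.81),
    ("Q5", "Fair", "HSIF", 24, 200, 4.15, 5.73),
    ("Q6", "Poor", "HSIF", 24, 150, 4.06, 5.63),
    ("Q7", "Bad", "QSIF", 20, 150, 3.95, 5.50),
    ("Q8", "Terrible", "QSIF", 20, 100, 3.88, 5.38),
]


def best_quality(power_budget_w, os_name="wince"):
    """Return the best quality level whose average power fits the budget."""
    column = {"wince": 5, "linux": 6}[os_name]
    for row in QUALITY_LEVELS:
        if row[column] <= power_budget_w:
            return row[0]
    return None  # even the lowest quality level exceeds the budget


print(best_quality(4.2))           # Windows CE column
print(best_quality(6.0, "linux"))  # Linux column
```

A proxy could run a rule like this per client and transcode the stream to the returned level, falling back to refusing the stream when `None` is returned.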
69
Experimental Setup
• Power measurements:
  - IPAQ 3650 + Cisco 350 Aironet wireless card
  - 206 MHz Intel StrongARM, 16 MB ROM, 32 MB RAM
• Simulation
  - Wattch / SimpleScalar for ARM
• MPEG decoder: Berkeley MPEG tools
• Transcoder: TMPGEnc
• Video clips
  - High action (e.g. GTA)
  - Medium action (sport)
  - Low action (news)
[Figure: power measurement setup. Labels: External Voltage Supply (5 V), resistor R = 0.22 ohm, BNC-2110 connector, DAQ cable, power measurement system (Windows XP, 650 MHz), Proxy, AP, 802.11b wireless link, serial connection, iPAQ, VR, Vi]
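The setup figure shows a 5 V external supply and a 0.22 ohm resistor, which suggests the usual sense-resistor method of measuring power. The sketch below assumes the resistor sits in series between the supply and the device and that the DAQ samples the voltage drop across it; both are assumptions, since the slide does not spell out the wiring. The function names are illustrative.

```python
SUPPLY_VOLTS = 5.0  # external supply voltage from the slide
SENSE_OHMS = 0.22   # resistor value from the slide


def device_power(v_resistor):
    """Power (W) drawn by the device for one sampled drop across the resistor.

    Assumes a series sense resistor: the current is V_R / R and the device
    sees the supply voltage minus the drop across the resistor.
    """
    current = v_resistor / SENSE_OHMS
    v_device = SUPPLY_VOLTS - v_resistor
    return v_device * current


def average_power(samples):
    """Mean device power (W) over a sequence of sampled resistor voltages."""
    return sum(device_power(v) for v in samples) / len(samples)


# Example: a steady 0.22 V drop corresponds to 1 A through the device.
print(average_power([0.22, 0.22, 0.22]))
```

Averaging the per-sample power over a whole playback run gives one average figure per quality level, in the same units as the quality/power matrix.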