51
1 CSE 237A Platforms Ayse Coskun Department of Computer Science and Engineering University of California, San Diego

CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

1

CSE 237A Platforms

Ayse CoskunDepartment of Computer Science and EngineeringUniversity of California, San Diego

Page 2: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

2

Embedded System Hardware

actuators

Page 3: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

3

Hardware Platform Architecture

Page 4: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

4

The PC as a PlatformAdvantages:

Cheap and easy to getRich and familiar software environment

Disadvantages:Requires a lot of hardware resourcesNot well-adapted to real-time

Page 5: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

5

Host / Target DesignUse a host system to prepare software for target system:

targetsystem

host systemserial line

Page 6: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

6

Host-Based ToolsCross compiler:

Compiles code on host for target system

Cross debugger:Displays target state, allows target system to be controlled

Page 7: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

7

Evaluation BoardsDesigned by CPU manufacturer or others

Includes CPU, memory, some I/O devices

May include prototyping section

CPU manufacturer often gives out evaluation board netlist---can be used as starting point for your custom board design

Page 8: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

8

Adding Logic to a BoardProgrammable logic devices (PLDs) provide low/medium density logic

Field-programmable gate arrays (FPGAs) provide more logic and multi-level logic

Application-specific integrated circuits(ASICs) are manufactured for a specific purpose

Page 9: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

9

How To Exercise CodeRun on:

Host systemTarget systemInstruction-level simulatorCycle-Accurate simulatorHardware/Software co-simulation

environment

Page 10: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

10

Debugging Embedded Systems

Challenges:Target system may be hard to observe

Target may be hard to control

May be hard to generate realistic inputs

Setup sequence may be complex

Page 11: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

11

Testing and DebuggingImplementation

PhaseImplementation

Phase

Verification Phase

VerificationPhase

Emulator

Debugger/ ISS

Programmer

Development processor

(a) (b)

External tools

ISS Gives us control over time – set breakpoints, look at register values, set values, step-by-step execution, ...But, doesn’t interact with real environment

Download to boardUse device programmerRuns in real environment, but not controllable

Compromise: EmulatorRuns in real environment, at speed or nearAllows you to stop execution, examine CPU state, modify registers.

Page 12: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

12

DebuggersA monitor program residing on the target provides basic debugger functionsDebugger should have a minimal footprint in memoryUser program must be careful not to destroy debugger program, but should be able to recover from some damage Breakpoints are very useful

Replace the break-pointed instruction with a subroutine call to the monitor program

Page 13: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

13

Breakpoint Handler ActionsSave registersAllow user to examine machineBefore returning, restore system state

Safest way to execute the instruction is to replace it and execute in placePut another breakpoint after the replaced breakpoint to allow restoring the original breakpoint

Page 14: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

14

Platforms Example: MoteSW & HW kits

Page 15: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

15

Mote AppsTwo prepackaged apps:

Environmental monitoringTemperature, humidity, radiation, photosynthetic, barometric pressure

SecurityMagnetometerPassive IR detector

Page 16: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

16

Mote Specs

Page 17: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

17

Mote Specs (cont’d)

Page 18: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

18

ATmega128L4 or 8 MHz8 bit128KB Flash4KB EEPROM4KB SRAM133 instructions

most single cycle32 gen. regs2 cycle multiplier8 channel, 10 bit ADC; 15 kSamps/s max resolution

Page 19: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

19

Platforms: XScale

It’s got it all!Everything one would need to develop a next generation cell phone or PDA

Main board block diagram:

Page 20: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

20

Platforms: XScale

Daughter board block diagram

Page 21: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

21

XScale Processor7 stage pipeline32bit32 KB instr/data cache; 2kb mini data cache256 Kb SRAMSupport for various peripheralsCPU freq: 100-600 MHz; voltage down to 0.85 VFound in Blackberry, Treo, IPAQ, etc.

Page 22: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

22

XScale Architecture

Page 23: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

23

Compare Power / PerformanceXScale

Active: 13 MHz => 44.2 mW; 624 MHz => 925 mW

Idle: 13 MHz => 8.2 mW; 624 MHz => 260 mW

Deep sleep: 0.1 mWATmega

Active: 4 MHz => 17 mWIdle: 4 MHz => 12mWSleep: 45 uW

Page 24: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

24

Power Management

Page 25: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

25

Energy EfficiencyEnergy-efficient computing is needed by:

portable systemsincrease battery lifetimeoptimize thermal design -> form factor

non-portable systemsminimize costenvironmental concerns

Electronic system designHardware: processing, storage, communicationSoftware: operating systems, applications

Electronic system utilizationRuntime control and managemente.g. Dynamic Power Management (sleep states) & Dynamic Voltage Scaling (lower CPU frequency/voltage)

Page 26: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

26

Portable ComputersHard Disk

Active power: 0.95 -2.5 WIdle power: 0.95 WSleep: 0.13 WSleep time: 0.67 s Wake-up time: 1.6 s

WLANTransmission: 1.65 WReceiving: 1.4 WDoze: 0.045 WDown time: 62msWake-up time: 34 ms

Most energy consumed in: display, hard disks, WLAN

Page 27: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

27

Power and Energy Relationship

∫= dtPE

t

P

E

Page 28: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

28

Low Power vs. Low Energy

Minimizing the power consumption is important forthe design of the power supplythe design of voltage regulatorsthe dimensioning of interconnectshort term cooling

Minimizing the energy consumption is important due torestricted availability of energy (mobile systems)

limited battery capacities (only slowly improving)very high costs of energy (solar panels, in space)

coolinghigh costslimited space

dependability long lifetimes, low temperatures

Minimizing the power consumption is important forthe design of the power supplythe design of voltage regulatorsthe dimensioning of interconnectshort term cooling

Minimizing the energy consumption is important due torestricted availability of energy (mobile systems)

limited battery capacities (only slowly improving)very high costs of energy (solar panels, in space)

coolinghigh costslimited space

dependability long lifetimes, low temperatures

Page 29: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

29

Nuclear reactor

Page 30: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

30

Consider CPU & System PowerMobile PC

Thermal Design (TDP) System Power

Note: Based on Actual Measurements

600/500 MHz uP37%

LCD 10"19%

HDD9%

Memory+Graphics12%

Power Supply10%

Other13%

Mobile PCAverage System Power

600/500 MHz uP13%

LCD 10"30%

HDD19%

Memory+Graphics15%

Power Supply10%

Other13%

CPU Dominates Thermal Design Power

Multiple Platform Components Comprise

Average Power[Courtesy: N. Dutt; Source: V. Tiwari]

Page 31: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

31

Power Manageable ComponentsComponents with several internal states

Corresponding to power and service levelsAbstracted as power state machines

State diagram with:Power and service annotation on statesPower and delay annotation on edges

Example: SA-1100RUN: operational

IDLE: a sw routine may stop the CPU when not in use, while monitoring interrupts

SLEEP: Shutdown of on-chip activity

RUN

SLEEPIDLE

400mW

160uW50mW

90us

90us10us

10us160ms

Page 32: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

32

Example: Hard Disk Drive

Working: 2.2 W(spinning + IO)

Sleeping: 0.13 W(stop spinning)

Idle: 0.95 W(spinning)

read / write

IO completeshut down0.36 J, 0.67 sec

spin up4.4J, 1.6 sec

Fujitsu MHF 2043 AT

Page 33: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

33

The Applicability of DPM

State transition power (Ptr) and delay (Ttr)

If Τtr = 0, Ptr = 0 the policy is trivialStop a component when it is not needed

If Τtr != 0 or Ptr != 0 (always…)Shutdown only when idleness is long enough to amortize the costWhat if Τ and P fluctuate?

Ptr Ttr

OFF

Ptr Ttr

ON

Page 34: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

34

Workload and System Representation

sleeping

shut down wake up

working working

requests requests

busy busytime

workload

system idle

shutdown delay wakeup delaypower

time

power state

Page 35: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

35

Waking Up Hard Disk

0

0.5

1

1.5

2

2.5

0 1 2 3 4 5 6

time (sec)

pow

er (W

)

sleeping working

waking up

Measurements done on Fujitsu MHF 2043 AT hard disk

Page 36: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

36

System Break-Even Time: TBE

Transition delay (Ttr)Transition power (Ptr)

Sleep power (Poff)

Minimum idle time for amortizing Minimum idle time for amortizing

the cost of component shutdownthe cost of component shutdown

offon

ontrtrtrBE PP

PPTTT−−

+=

Page 37: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

37

Fixed Time-OutSimple policy

If Tidle > TTO go to SLEEPStay in sleep until workload != 0

Rationale

When Tidle > TTO it is likely that: Tidle > TTO + TBE

Choice of TTO is critical

Large is safe, but it could be useless

Too small is highly undesirable

Limitations

Performance penalty for wake-up is paid after every shutdown

Power is wasted during TTO

Page 38: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

38

OS Based Power ManagementIn systems with an operating system (OS)

The OS knows of tasks running and waitingThe OS should perform the DPM decisions

Advanced Configuration and Power Interface(ACPI)

[Intel, Microsoft, Toshiba]Open standard for design of OS-based power management

Page 39: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

39

DPM and Operating SystemsApplication

should not directly control hardware powerno power management in legacy programs

Schedulerselects processes and affects idle periods

Process managerknows multiple requesterscan estimate idle periods more accurately

Driverdetects busy and idle periods

Deviceconsumes powershould provide mechanism, not policy

hardware devices

device driver

application programs

operating system

process manager

scheduler

Page 40: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

40

Low-Power Scheduling

tasks specify• device requirements• timing requirements

operating system1. groups tasks with same

device requirements2. execute tasks in groups3. wake up devices in advance to

meet timing constraints

p2

1

3

31 2

p1

p3 1 2

23

p2

1

3

31 2

p1

p3 1 2

23

idle

Page 41: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

41

Dynamic Voltage Scaling (DVS)

Power consumption of CMOScircuits (ignoring leakage):

frequency clockvoltagesupply

ecapacitanc loadactivity switchingwith2

:::

:

fVC

fVCP

dd

L

ddL

α

α=

( )

) than lly substancia(voltage threshhold:

with2

ddt

t

tdd

ddL

VVV

VVVCk

<

−=τ

Delay for CMOS circuits:

Page 42: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

42

Voltage scaling: Example

Vdd[Courtesy, Yasuura, 2000]

Page 43: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

43

Variable-voltage/frequency

From

Inte

l’s W

eb S

ite

Page 44: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

44

Mini Project:Portable Multimedia

Player on Intel XScale

Page 45: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

45

Project GoalDesign an efficient & low power media player for an XScale platform

Real time decoding of audioEnergy efficiency – use DVS!

Steps:Install OS onto Intel’s XScale DVK (Linux)Set up cross compiling & debugging environmentsGet media player to run in real timeMinimize CPU power with real time playback

Page 46: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

46

Power Management on XScale

XScale 27x150 us for f/V change0.5 ms to sleep state

StateActive (mW)

Idle (mW)

Freq (MHz)

P1 925 260 624

P2 747 222 520

P3 279 129 208

P4 116 64 104

Psleep - 0.163 0

Page 47: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

47

Reducing Power ConsumptionImplement a dynamic voltage frequency scalingpolicy

Check system statusActive / Idle, cpu utilization, etc.

Adjust V / f setting accordingly

Compiling the applications with different optimizations and libraries can reduce power consumption further

Try for mplayer

Page 48: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

48

DVS PoliciesExample:

The application will not fully utilize the CPUCheck utilization

e.g. with “top” commandReduce V / f setting to take advantage of low CPU utilizationHow can you estimate the utilization in the next interval?

Page 49: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

49

How To Test Your Policies

Run the provided real time applications (without DVS) on the PC and record the execution timeRun the apps on the platform If they’re slower not real time!

Page 50: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

50

Sources and References

Peter Marwedel, “Embedded Systems Design,” 2004.Frank Vahid, Tony Givargis, “Embedded System Design,” Wiley, 2002. Wayne Wolf, “Computers as Components,” Morgan Kaufmann, 2001. Nikil Dutt @ UCI

Page 51: CSE 237A Platforms - University of California, San Diegocseweb.ucsd.edu/classes/wi08/cse237a/handouts/L5-platformdev.pdfCSE 237A Platforms Ayse Coskun Department of Computer Science

51