Mapping Flow for Car-Entertainment Applications on Embedded Multiprocessor Systems

Marco Bekooij and Maarten Wiggers



Outline

Multi-stream car-entertainment system

Objectives & Approach

Bounding interference with budget schedulers
– Unsafe worst-case execution time estimates of tasks

Expressivity analysis model
– Data dependent execution rates of tasks

Mapping flow
– Limitations

FPGA demonstrator

Summary


Car entertainment application domain


Multi-stream car-entertainment system


Application model

[Figure: a use-case containing an SRT video job (tasks between an input data stream and an output stream to the display) and an FRT audio job (tasks between an input data stream and an output stream to the speakers).]

Jobs are composed of tasks

Simultaneously running jobs together form use-cases

Jobs often have real-time requirements
– Firm (FRT) if deadline misses are highly undesirable (steep quality degradation)
– Soft (SRT) if occasional deadline misses are tolerable
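The terminology on this slide can be captured in a small data structure. The following is only an illustrative sketch: the class names, the task names and the `firm_jobs` helper are assumptions made for this example and do not come from the authors' tools.

```python
from dataclasses import dataclass, field
from enum import Enum


class Requirement(Enum):
    FIRM = "FRT"  # deadline misses are highly undesirable
    SOFT = "SRT"  # occasional deadline misses are tolerable


@dataclass
class Task:
    name: str


@dataclass
class Job:
    name: str
    requirement: Requirement
    tasks: list[Task] = field(default_factory=list)  # a job is composed of tasks


@dataclass
class UseCase:
    jobs: list[Job] = field(default_factory=list)  # jobs that run simultaneously

    def firm_jobs(self) -> list[str]:
        """Names of the jobs with a firm real-time requirement."""
        return [j.name for j in self.jobs if j.requirement is Requirement.FIRM]


# Hypothetical use-case: a soft real-time video job next to a firm real-time audio job.
video = Job("video", Requirement.SOFT, [Task("decode"), Task("render")])
audio = Job("audio", Requirement.FIRM, [Task("decode"), Task("mix"), Task("play")])
use_case = UseCase([video, audio])
```

Keeping the real-time requirement on the job rather than on the task mirrors the slide, where FRT and SRT are properties of jobs.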


Car entertainment use-case

[Figure: use-case with two jobs. FRT digital radio job A: tasks ADC, PDC, CFE, VIT, CBE, SRC, APP, DAC, with labels f1 and f2. FRT source decoding job B: tasks BR, MP3, SRC, APP, DAC, with label f3.]

Observations:
– Reactive system because stream from transmitter cannot be slowed down
– Firm real-time jobs because deadline misses are highly undesirable but not catastrophic
– Both streams are equally important


Our objectives:

► Enable independent development of jobs

► Maintain robustness despite increased resource sharing

► Reduce design and verification effort


Strategy: Divide and conquer

[Figure: use-case with real-time digital radio job A (tasks ADC, PDC, CFE, VIT, CBE, SRC, APP, DAC; labels f1, f2) and real-time source decoding job B (tasks BR, MP3, SRC, APP, DAC; label f3).]

Bound interference from other jobs (guaranteed by measures in hardware)
– Each job can be designed and characterized independently of other jobs
– Erroneous behavior of a job has a bounded effect on the behavior of other jobs

Objectives 1 & 2


Strategy: Abstraction with guarantees (conservative arrival time data)

Fast (>100 Mcc/s) simulation of communicating processes with time (CP-T)

[Figure: graph with nodes v0, v1, v2]

Instead of:

Low-level simulation (e.g. PV-T)

[Figure: multiprocessor architecture with blocks labeled DSP, µP, $, mem, Arb, NI, I/O, CActrl, External SDRAM and Network]

Objective 3


Strategy: Synthesize settings (prevent iterative design space exploration)

Dataflow synthesis

Inputs:
– (cyclic) task graph of a job
– WCRTs of the tasks
– multiprocessor instance
– throughput and latency constraint

Outputs:
– scheduler settings and communication buffer capacities

Objective 3
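To illustrate what "synthesizing" a buffer capacity can mean, consider a deliberately simplified case, which is not the authors' algorithm: one producer and one consumer that exchange one data item per execution, constant worst-case response times, no self-overlap of a task, and buffer space that is claimed when the producer starts and released when the consumer finishes. Under these assumptions the steady-state period is max(rp, rc, (rp + rc) / d), so the smallest capacity d follows directly from the throughput constraint. The function names and the numbers below are hypothetical.

```python
import math


def min_buffer_capacity(rp: float, rc: float, period: float) -> int:
    """Smallest capacity d such that max(rp, rc, (rp + rc) / d) <= period."""
    if period < max(rp, rc):
        raise ValueError("a task is slower than the required period")
    return max(1, math.ceil((rp + rc) / period))


def steady_period(rp: float, rc: float, d: int, firings: int = 200) -> float:
    """Self-timed simulation with back-pressure; returns the measured period."""
    prod_end: list[float] = []  # finish times of producer executions
    cons_end: list[float] = []  # finish times of consumer executions
    for k in range(firings):
        # Producer execution k waits for its previous execution and for
        # the space released by consumer execution k - d.
        start = prod_end[k - 1] if k else 0.0
        if k >= d:
            start = max(start, cons_end[k - d])
        prod_end.append(start + rp)
        # Consumer execution k waits for data item k and its previous execution.
        c_start = max(prod_end[k], cons_end[k - 1] if k else 0.0)
        cons_end.append(c_start + rc)
    half = firings // 2
    return (cons_end[-1] - cons_end[half]) / (firings - 1 - half)


# Hypothetical numbers: response times 2 and 3, one item required every 3 time units.
capacity = min_buffer_capacity(2, 3, 3)
```

The simulation is only there to cross-check the closed form: with capacity 1 the measured period is 5, with the computed capacity of 2 it drops to the required 3, so no iterative exploration is needed in this simple case.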


Multiprocessor system under development

Æthereal network-on-chip (NoC)

• Instructions and data can be stored in external SDRAM
  • lower cost per bit than on-chip SRAM (the Digital Radio Mondiale channel decoder requires 940 kByte of storage)

• Caches hide SDRAM memory access latency
  • large difference between WCET and average execution time

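[Figure: processor tiles (Proc) with cache ($) and network interfaces (NI), an arbiter (Arb), a memory controller (memctrl) and external SDRAM; additional labels LL and LT.]

[M. Bekooij et al., Bits & Chips 2007]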


Execution times of tasks

Why use optimistically estimated WCETs?
– Distance ∆ can be large due to caches and external SDRAM
– A tight conservatively estimated WCET requires knowledge of the input data

Our approach
– Guarantee no deadline misses for a set of input stimuli
– Fall-back mechanism to cope with deadline misses
– Worst-case temporal behavior of other jobs should not be affected

[Figure: distribution of execution times, plotted against execution time. Marked in the figure: the observed execution times, all execution times, the optimistic estimate, the WCET and the conservative estimate.]
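The approach on this slide can be pictured with a small sketch. It assumes, purely for illustration, that the optimistic estimate is the largest execution time observed for the chosen input stimuli and that an overrunning task is cut off at its budget, after which a fall-back is taken. The names `optimistic_wcet`, `execute` and `Outcome` and the measurements are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    consumed: float  # processor time actually granted to the task
    overrun: bool    # True if the fall-back mechanism had to be used


def optimistic_wcet(observed: list[float]) -> float:
    """Optimistic estimate: the largest execution time seen for the stimuli."""
    if not observed:
        raise ValueError("at least one measurement is required")
    return max(observed)


def execute(actual_time: float, budget: float) -> Outcome:
    """Grant at most `budget`; an overrun is cut off so other jobs are unaffected."""
    if actual_time <= budget:
        return Outcome(actual_time, False)
    return Outcome(budget, True)


observed = [3.1, 2.7, 4.0, 3.6]  # hypothetical measurements in ms
budget = optimistic_wcet(observed)
```

For every stimulus in the measured set the task finishes within its budget; an unseen input that needs more time triggers the fall-back but never consumes more than the budget.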


Response time

Execution Time (ET): from start to finish

Response Time (RT): from enabled, via started, to finished


Scheduler/Arbiter characteristics (e.g. memory port arbiter, task scheduler)

Budget Scheduler (e.g. TDM)

WCRT independent of WCET of tasks of other jobs (unsafe WCET!)

Data arrival/traffic characterization

[Figure labels: Budget scheduler, All schedulers]
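The independence property can be made concrete with a commonly used response-time bound for a preemptive TDM scheduler. The sketch assumes a task that owns a slice of length `slice_len` in every TDM period of length `period` and that becomes enabled at the worst possible moment, just as its slice ends. The function and parameter names are chosen for this example.

```python
import math


def tdm_wcrt(wcet: float, slice_len: float, period: float) -> float:
    """Worst-case response time of a task under preemptive TDM.

    The task needs ceil(wcet / slice_len) slices and, in the worst case,
    waits (period - slice_len) before each of them.
    """
    if not 0 < slice_len <= period:
        raise ValueError("slice must be positive and fit in the period")
    return wcet + math.ceil(wcet / slice_len) * (period - slice_len)


# Hypothetical numbers: WCET 5, slice 2, TDM period 10 (all in the same time unit).
bound = tdm_wcrt(5, 2, 10)
```

The bound contains only the task's own WCET, its slice and the TDM period. The execution times of tasks in other jobs do not appear, so an unsafe WCET estimate elsewhere cannot invalidate it.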


Variable consumption rate

[Figure: task chain BR, MP3, SRC, DAC, with the DAC running at 44.1 kHz; rate labels in the figure: 480, 441, 576, 3, n=[1..100], 1.]

Digital to analog converter (DAC) determines throughput constraint

MP3 decoder task consumes variable amount of data

Block-reader (BR) task must “know” the consumption speed of the MP3 task
– Implies a cyclic dependency that affects the temporal behavior!
– Block reader task does not execute periodically

[M. Wiggers et al., DATE 2008]
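How the DAC fixes the rates of the upstream tasks can be worked out by hand. The sketch below rests on one plausible reading of the figure labels, which is an assumption: the DAC consumes 1 sample per execution at 44.1 kHz, the SRC consumes 480 and produces 441 samples per execution, and the MP3 decoder produces 576 samples and consumes n in [1..100] data units per execution.

```python
from fractions import Fraction

DAC_RATE_HZ = 44_100       # the DAC consumes one sample per execution
SRC_IN, SRC_OUT = 480, 441  # assumed SRC consumption/production per execution
MP3_OUT = 576              # assumed MP3 production per execution
N_MIN, N_MAX = 1, 100      # assumed range of the MP3 consumption n

# Work backwards from the throughput constraint imposed by the DAC.
src_firings = Fraction(DAC_RATE_HZ, SRC_OUT)  # SRC executions per second
src_in_rate = src_firings * SRC_IN            # samples per second into the SRC
mp3_firings = src_in_rate / MP3_OUT           # MP3 executions per second

# The MP3 input rate varies with n, so the block reader cannot be periodic.
mp3_in_min = mp3_firings * N_MIN
mp3_in_max = mp3_firings * N_MAX
```

Under these assumptions the SRC runs 100 times per second and the MP3 task 250/3 times per second, while the amount of data the MP3 task pulls from the block reader varies by a factor of 100.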


Data dependent execution rate

[Figure: task graph with tasks BR, VLD, DQ, IDCT, MC and DAC; rate labels include 2048, m, n, 1 and 1[n].]

Task graph of an H.263 video decoder
• The number of executions of the DQ and IDCT tasks depends on the input data

[M. Wiggers et al., RTAS 2008]


Expressivity of the model

[Figure: the deadlock-free applications form a subset of the set of all applications.]

Models for data-dependent behavior should be provably deadlock-free


Tool flow Heracles/Minos (online/offline)

[Figure: tool flow. Inputs: task graph + WCETs (CSDF), multiprocessor instance, throughput/latency constraint. Steps: cluster/static order scheduling, processor/buffer assignment, buffer capacities, data routing in NoC, scheduler settings. Output: system configuration. The flow is split into an offline and an online part, handled by the tools Heracles and Minos.]

Unsolved issues: phase coupling, maximization of online mapping options?


FPGA demonstrator

[Figure: Æthereal NoC connecting VLIW tiles (labeled VLIW-1, VLIW-2 and VLIW-4, each with MEM), Audio I/O, a frame buffer and a host PC through DTL ports (DTLi, DTLt, DTLs).]

• Only one network in system
  • for control, data, debug

• 2x2 mesh router network
  • guaranteed throughput

• SiHive VLIW core
  • ANSI C compiler

• PC used as control processor

• High-level reconfiguration API
  • setting up and tearing down connections
  • starting and stopping tasks

Differentiators:
• Predictable = an upper bound on data arrival times can be computed at design time
• Budget scheduler for every shared resource


Summary

Car entertainment application characteristics
– Multi job
– Tasks with data dependent execution rates
– Large memory footprint of digital radio applications

Architecture characteristics
– Multiprocessor
– Scratch-pad memories as well as caches together with an external SDRAM

Approach
– Bound interference from other tasks by making use of budget schedulers

Tool flow
– Split into an offline and an online part

Challenges
– Phase coupling and maximization of online mapping options
– Expressive enough analysis model to determine throughput
– … and many more.


Questions?


References

B. Akesson, K. Goossens, and M. Ringhofer. Predator: A Predictable SDRAM Memory Controller. In Proc. Int’l Conference on Hardware-Software Codesign and System Synthesis (CODES+ISSS), 2007.

M. Bekooij, A. Moonen, and J. van Meerbergen. Predictable and Composable Multiprocessor System Design: A Constructive Approach. In Proc. Bits&Chips Symposium on Embedded Systems and Software, October 2007, Eindhoven, The Netherlands.

M. Bekooij, M. Wiggers, and J. van Meerbergen. Efficient Buffer Capacity and Scheduler Setting Computation for Soft Real-Time Stream Processing Applications. In Proc. Int’l Workshop on Software and Compilers for Embedded Systems (SCOPES), April 2007.

A. Kumar, A. Hansson, J. Huisken, and H. Corporaal. An FPGA Design Flow for Reconfigurable Network-Based Multi-Processor Systems on Chip. In Proc. Design, Automation and Test in Europe Conference and Exhibition (DATE), April 2007.

A. Moonen et al. A Multi-Core Architecture for In-Car Digital Entertainment. In Proc. Int’l Conference on Global Signal Processing (GSPx), 24-27 October 2005.

O. Moreira and M. Bekooij. Self-Timed Scheduling Analysis for Real-Time Applications. In EURASIP Journal on Advances in Signal Processing, 2007.

O. Moreira and M. Bekooij. Scheduling Multiple Independent Hard Real-Time Jobs on a Heterogeneous Multiprocessor. In Proc. Int’l Conference on Embedded Software (EMSOFT), September 2007.

S. Sriram and S. S. Bhattacharyya. Embedded Multiprocessors: Scheduling and Synchronization. Marcel Dekker Inc., 2000.

D. Stiliadis and A. Varma. Latency-Rate Servers: A General Model for Analysis of Traffic Scheduling Algorithms. In IEEE/ACM Transactions on Networking, 6(5):611–624, October 1998.

S. Stuijk, T. Basten, M.C.W. Geilen, and H. Corporaal. Multiprocessor Resource Allocation for Throughput-Constrained Synchronous Dataflow Graphs. In Proc. Design Automation Conference (DAC), June 2007.

M. Wiggers, M. Bekooij, and G. Smit. Modelling Run-Time Arbitration by Latency-Rate Servers in Data Flow Graphs. In Proc. Int’l Workshop on Software and Compilers for Embedded Systems (SCOPES), April 2007.

M. Wiggers, M. Bekooij, P. Jansen, and G. Smit. Efficient Computation of Buffer Capacities for Cyclo-Static Real-Time Systems with Back-Pressure. In Proc. IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), April 2007.

M. Wiggers, M. Bekooij, and G. Smit. Computation of Buffer Capacities for Throughput Constrained and Data Dependent Inter-Task Communication. In Proc. Design, Automation and Test in Europe Conference and Exhibition (DATE), April 2008.

M. Wiggers, M. Bekooij, and G. Smit. Buffer Capacity Computation for Throughput Constrained Streaming Applications with Data-Dependent Inter-Task Communication. In Proc. IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), April 2008.