Deployment of Data-flow Applications
in Many-core Architectures
UltrasoundToGo RTD 2013
Stefanos Skalistis and Alena Simalatsar
Rigorous System Design Laboratory (RiSD), EPFL
Task A
Task C
ω(eAC)
Task B
ω(eAB)
Cluster 1 Cluster 2b(eAC)
IA TA0 ω(eAB)
FA
0
b(eAB)
IB
TB
0
0
b(eBA)
Application Partitioning
Application Placement
Mapping, Scheduling, Buffer Allocation
Power Management
Real-Time Scheduling
SMT
Solver
Offline
Online
Image credit: Marcio Castro et al.
Many-core architectures:
+ Efficient for high-parallelizable applications
+ Support for power management
- Complex communication mechanisms
- Safe code-generation is not straightforward
Kalray MPPA-256:
• 256 cores (400 MHz) in 16 clusters with 2MB shared scratchpad memory, on 2D-torus NoC
• Scheduling constraints
• Resource constraints
• Power constraints
Providing real-time guarantees is challenging
Existing methods:
• Based on WCET → largely overestimated
• Do not focus on optimizing performance
Real-time performance and power efficiency:
• Real-time scheduling respecting time constraints
• Operating within power constraints
• Optimal use of resources
Motivation
Multi-criteria optimization decisions:
• Right degree of parallelism?
• Application partitioning and mapping to PE?
• Size of buffers? Task scheduling with real-time constraints?
Deploying on many-core: Bridging safety-critical and best-effort
Hybrid approach:
Offline: Guarantees based on worst-case execution time
Online: Optimizations based on actual execution times
[1] Tendulkar, Pranav, et al. "Many-Core Scheduling of Data Parallel Applications using SMT Solvers." Digital System Design (DSD), 2014 17th Euromicro Conference on. IEEE, 2014.
[2] De Dinechin, Benoît Dupont, et al. "Time-critical computing on a single-chip massively parallel processor." Design, Automation and Test in Europe Conference and Exhibition (DATE). IEEE, 2014.
[3] Ibrahim, Aya, et al. "Assessment of Image Quality vs. Computation Cost for Different Parameterizations of Ultrasound Imaging Pipelines.“ Medical Cyber-Physical Systems (MCPS) Workshop, 2015
Case study: 3D Ultrasound Imaging
• Data-flow application
• High degree of parallelism
• Medical nature requires guarantees
• Portability imposes power limitations
Finding solutions using SMT solvers:
Safe deployment without
undermining performance