Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
1
Laboratoire des Systèmes Intégrés(Integrated Systems Laboratory)
Research Activities
Dr. David AtienzaProf. Giovanni De Micheli
RWTH Aachen – EPFL Joint Research Meeting
David Atienza (LSI/EPFL) 230/08/2007
Research Projects (1/2)
System-on-Chip Consumer Electronics Networks-on-Chip (NoC) interconnects Thermal modeling SoCs
Reliable Nano-Scale Circuits Efficient addressing schemes (micro/nano) Circuit-level defect modeling and reliable
CAD toolsMOSFET
today 22 nmMOSFET in 2008
4.2 nmMOSFETin 2023
David Atienza (LSI/EPFL) 330/08/2007
Research Projects (2/2)
Wireless Sensor Networks (WSNs) Power-efficient communication protocols Modeling of working conditions of nodes Self-maintained (scavenging mechanisms)
Lab-on-Chip Integration of bio-electro- mechano- opto-
systems Point of care medical devices Implanted health monitoring devices
David Atienza (LSI/EPFL) 430/08/2007
World-Wide Partners and Influence
HolstThe Netherlands
IMECBelgium
Stanford,UC San Diego
CA, USA
Complutense U.Spain
Joint work
World-wide impact
Bologna,Cagliari, Verona
Italy,
David Atienza (LSI/EPFL) 530/08/2007
Research Projects (1/2)
System-on-Chip Consumer Electronics Networks-on-Chip (NoC) interconnects Thermal modeling SoCs
Reliable Nano-Scale Circuits Efficient addressing schemes (micro/nano) Circuit-level defect modeling and reliable
CAD toolsMOSFET
today 22 nmMOSFET in 2008
4.2 nmMOSFETin 2023
David Atienza (LSI/EPFL) 630/08/2007
Roadmap continues: 90→65→45 nm “Traditional” Bus-based SoCs fit in one tile !! Communication demands: unevenly distributed,
because of architectural heterogeneity
I/0
I/0
PE
PE PE PE
SRAM SRAM
DRAM
I/O
I/OPERIPHERALS
3D stacked m
ain mem
ory
PE
LocalMemory
hierarchy
CPU
i/o
Consumer SoCs: Architecture Evolution
David Atienza (LSI/EPFL) 730/08/2007
DSPNI
NIDRAM
switch
DMANI
CPU NI
NIAccelNI MPEG
switch
switch
switch
NoC
switch
switch
NetworkInterface Clean separation at
session layer Cores issue end-to-end
transactions NI deals with transport,
network, link, physical
Modularity at HW level:only 2 building blocks Network interface Switch (router)
Physical design aware(floorplan global routing)
Networks-on-Chip (NoC) Interconnects
David Atienza (LSI/EPFL) 830/08/2007
SunFloor
TopologySynthesis
includes:FloorplannerNoC Router
RTLArchitecturalSimulation
Constraint & Comm graph
Area models
SystemCcode
NoClibrary
FPGAEmul.
To fab
PlatformGeneration
(xpipes-Compiler)
Synth.
User:power, area
Power models
Constraints:area, power,
IP Coremodels
Placem.&
Routing
Codes,Simul
Applic.
Area, power characterizationFloorplanning specifications
Complete NoC-Based EDA Flow
Participants: F. Angiolini, A. Pulini, Dr. S. Murali, Dr. D. Atienza,
Prof. L. Benini, Prof. G. de Micheli Partners: Bologna, Stanford, Cagliari.
Backend flow 130, 90, 65nm
Sunfloor
David Atienza (LSI/EPFL) 930/08/2007
M5 T2M4
P4 P5
AHB Layer 2
M7 T3M6
AHB Layer 3
M9 T4M8
P8 P9
AHB Layer 4
M1 T0M0
P0 P1
AHB Layer 0
M3 T1M2
P2 P3
AHB Layer 1A
MB
A A
HB
cro
ssb
ar
S10
S11
S12
S13
S14
P6 P7
Realistic 130nm SoC (10 cores,5 accelerators, 15 mem. slaves)
Shared bus fully saturated! Multilayer solution required for
performance: Intra-cluster traffic to private
slaves (P0-P9) is bound withineach layer, reducing congestion
Shared slaves (S10-S14) can beaccessed in parallel
Representative 5x5 Multilayerconfiguration (up to 8x8)
Synthesized for max frequency
Case study 130nm: AMBA AHB Multilayer
1030/08/2007
M0
P0
M1
P1
M2
P2
M3
P3
M5
P5
M4
P4
M6
P6
M7
P7
M8
P8
M9
P9
T0
S10
T1
S11
T2
S12
T3
S13
T4
S14
Uni- Bi-directional links: balanced, no bottlenecks Custom, but almost regular topology: easy to floorplan Optimally placed private slaves, scattered shared slaves Only 4 hours specification to layout thanks to EDA flow
NoC solution: Quasi-Mesh
1130/08/2007
IPs: 1mm² obstructions (ARM cores, 32kB SRAM) Wire routing over the cores was forbidden Minimum delay synthesis
1mm²
AMBA
Slaves
AHB Layer
1mm²
Mesh Row
2 NIs + 1 switch
Summary of results:6.4% vs. 22.9% post P&R timing degradation → Improved predictabilityClock frequency 793 vs. 370 MHz post P&R → Much faster15% avg application speedup (longer latency, more effective bandwidth)
Application specific NoC competitive with state-of-the-artinterconnect even in 130nm using proposed EDA flow
AMBA vs NoC: Results in 130 nm
David Atienza (LSI/EPFL) 1230/08/2007
90 nm, 1 mm2 cores
65 nm, 1 mm2 cores
65 nm, 0.4 mm2 cores
Three designs for max frequency:
NoC Technology Scaling
David Atienza (LSI/EPFL) 1330/08/2007
NoC Results in 90 and 65nm
0.63 mm2
0.64 mm2
1.31 mm2
NoC CellArea
495.3 mW520.1 mW784.6 mW
Power BandwidthMax Freq.Max Frequency
(38-bit links)
285 GB/s1.25 GHz65 nm, .4 mm2
1.25 GHz1 GHz
285 GB/s65 nm, 1 mm2
228 GB/s90 nm, 1 mm2
Power shifting from switches/NIs to links (buffering) Links always short (<1.2 mm) → non-pipelined Power figures:
− 90 nm 1 mm2: 3.1 mW− 65 nm 1 mm2: 3.6 mW (tightest fit → more buffering)− 65 nm 0.4 mm2: 2.2 mW
David Atienza (LSI/EPFL) 1430/08/2007
Thermal Modeling Multi-Processor SoC
[Sun,1.8 GHz
Sparc v9Microproc]
Complex SoCs, increase power density Non-uniform hot-spots Spatial gradients: large part of chip
with slower performance & leaking
Participants: F. Mulas, M. Buttu, M. Pittau, P. Valle,
Prof. S. Carta, Prof. A. Acquaviva,Dr. D. Atienza, Prof. L. Benini,Prof. G. de Micheli
Partners: UCM, Cagliari, Verona, Bologna.
[Sun,Niagara
Broadband Processor]
David Atienza (LSI/EPFL) 1530/08/2007
FPGA
Xilinx
MPSoC HW architec. emulation
HW statistics extraction system
HW/SW Emulation Framework for MPSoC
FPGA FPGA Host communication ?? Host communication ??mustmust be be efficientefficient andand flexible flexible toto avoidavoid
bottlenecksbottlenecks
Integration FPGA SoC-NoC emul with SW thermal libraries
SW Thermal Model
Host PC
standardstandard EthernetEthernet connectionconnection & &dedicateddedicated HW monitor HW monitor
ConnectionConnection Monitor Monitor
Energy
Temp. (T)
si sisi
sisi
sisi
sisi
cu cucu cu cu
David Atienza (LSI/EPFL) 1630/08/2007
Comparisons with State-of-the-art Simulators
HW emulation does not lag with large numberof processors (100 MHz) or signal management (NoC)
0.17 sec(1147x)
3’ 15 secMultimedia pipeline(4 cores-NoC)
0.18 sec(861x)
2’ 35 secMultimedia pipeline(4 cores-bus)
1.2 sec(664x)
13’ 17 secMatrix (8 cores)
1.2 sec(269x)
5’ 23 secMatrix (4 cores)
1.2 sec(88x)
106 secMatrix (one core)
HW/SWEmulator
MPARMSW sim.
Performance Improvement
0200400600800
100012001400
Matrix
(one core)
Matrix (4
cores)
Matrix (8
cores)
Dithering
(4 cores-
bus)
Dithering
(4 cores-
NoC)
Speed-ups of 3 orders of magnitude withcycle-accurate MPSoC simulators
David Atienza (LSI/EPFL) 1730/08/2007
Long Thermal Studies
SW sim. very slow with “high” accuracy (<12 cells) Long simulations are unaffordable: 2 days for 0.18 real sec
Average temperature of emulated 4-core MPSoC
300
320
340
360
380
400
420
0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
Time (seconds)
Tem
pera
ture
(K
elv
in)
Simulation in MPARM Emulation Emulation with DFS
Emulation time5.02s (26 cells)!DFS OFF:
500 MHz.
DFS ON:100 MHz. 100 MHz.
500 MHz.
Validation run-time thermal management in real-life systems
David Atienza (LSI/EPFL) 1830/08/2007
FPGA
Xilinx
MPSoC HW architec. emulation
HW statistics extraction system
OS-Based Thermal Management Framework
uCLinux extended with task migration support
SW Thermal Model
Host PC
ConnectionConnection Monitor Monitor
Energy
Temp. (T)
si sisi
sisi
sisi
sisi
cu cucu cu cu
MPOS layer + task migration
David Atienza (LSI/EPFL) 1930/08/2007
Case Study: OS-Based Thermal Policies
Working conditions− 3 cores start processing− 1 core remains idle for cooling
Task migration policy− Treshold-Based Policy:
– Upper bound: 360º K− Migration to coolest processor
DFS Policy− 10 Emulated freq: 100-512 MHz− Linear decrease applied
if deadline met
Task2Task1 Task3
Task2
Task1Task4
David Atienza (LSI/EPFL) 2030/08/2007
Results: Thermal Evaluation
David Atienza (LSI/EPFL) 2130/08/2007
Research Projects (1/2)
System-on-Chip Consumer Electronics Networks-on-Chip (NoC) interconnects Thermal modeling SoCs
Reliable Nano-Scale Circuits Efficient addressing schemes (micro/nano) Circuit-level defect modeling and reliable
CAD toolsMOSFET
today 22 nmMOSFET in 2008
4.2 nmMOSFETin 2023
David Atienza (LSI/EPFL) 2230/08/2007
Nanoscale Hybrid Crossbar Memory
p-doped SiNW or CNT
n doped SiNW
molecularswitch
molecularswitch
SiO2-coated n-doped SiNW
P-doped SiNW or CNT
Lower layer of NW
Upper layer of NW
Participants: H. Ben Jamaa, F. Angiolini,
S. K. Bobba, Dr. David Atienza, Prof. Yusuf Leblebici,
Prof. Giovanni de Micheli, Prof. Luca Benini
Partners: STMicroelectronics, Stanford, Bologna
RegularArchitecture
Logic Circuit
EncoderRecognizer
Recognizer
CrossbarMemory
CrossbarMemory
Add.
David Atienza (LSI/EPFL) 2330/08/2007
Project Major Research Questions
Conformal layer
2nd spacer
Multispacer
1st spacer
Sacrificial layerConformal layer
Integration Issue:micro-to-nano interface
within hybridizedstandard CMOS
Reliability Issues: Switch defects Nanowire breakage Nanowire shorts
G C
Nanowiredifferentiation
Defect-awareaddressing
Defect-awareinformationencoding
Redundancy atthe device levelAchieved reliable decoding schemes for multi-Vt SiNW
memory architectures
David Atienza (LSI/EPFL) 2430/08/2007
Research Projects (2/2)
Wireless Sensor Networks (WSNs) Power-efficient communication protocols Modeling of working conditions of nodes Self-maintained (scavenging mechanisms)
Lab-on-Chip Integration of bio-electro- mechano- opto-
systems Point of care medical devices Implanted health monitoring devices
David Atienza (LSI/EPFL) 2530/08/2007
Wireless Sensor Networks (WSN) Projects
Participants: A. Susu, F. Rincon, Dr. N. Khaled,
Prof. Andrea Acquaviva, Dr. David Atienza,Prof. Giovanni de Micheli
Partner: IMEC-NL/Holst, ComplutenseUniv. Madrid
Optimization of communicationprotocols for WSN nodes Nodes: SensorCubes (IMEC/Holst) Basestation: USB Stick-MSP 430 (Holst) Tiny OS
Modeling of scavenging devices Model checking-based simulator
David Atienza (LSI/EPFL) 2630/08/2007
MAC Protocols Exploration
Contention based/random access Mesh network. Dynamic topology, no
rigid structure Allows multihoping, energy
optimization, mesh topology Applications: spread networks,
alarms o events transmission B-MAC, S-MAC, Aloha protocol.
TDMA Based dynamic/static Star topology. Optimal under known/static
conditions Appls: real time or continuos
monitoring (e.g. EMGBAN)Working 5/10-node WSN for biomedical applicationsmonitoring (ECG, EEG, EMG, etc.)
David Atienza (LSI/EPFL) 2730/08/2007
Research Projects (2/2)
Wireless Sensor Networks (WSNs) Power-efficient communication protocols Modeling of working conditions of nodes Self-maintained (scavenging mechanisms)
Lab-on-Chip Integration of bio-electro- mechano- opto-
systems Point of care medical devices Implanted health monitoring devices
David Atienza (LSI/EPFL) 2830/08/2007
Lab-on-Chip Projects
Participants: Dr. Frank K. Guerkaynak, Dr. Carlotta Guiducci, Prof. Y. Leblebici Dr. Diego Barrettino, Prof. Martin Gijs Prof. Luca Benini, Prof. Giovanni de Micheli
Partner: Institut Suisse de Recherche Expérimentale sur le Cancer (ISREC), Lab NanoPhysics Paris, Bologna
Cancer research Why incidence of cancer rises exponentially with age? Deterioration in genomic stability due to age creates
higher incidence of cancer
Capacitive DNA sensor chip Aims for cheap label-free detection of DNA Passive sensors realized on glass substrates 4th generation of chips underway
David Atienza (LSI/EPFL) 2930/08/2007
Moving Forward
Design better circuits with current technology NoC based design and EDA tools with new tech nodes Thermal studies in multi-FPGA context Relation thermal vs long-term reliability
Design working circuits with future technologies Interconnect nano wires with micro wires Reliable decoding models of underlying faults and defects Scavenger-based wireless sensor networks
Extend design and data analysis techniques tointegrated non-conventional systems Medical diagnosis and studies of cancer-age relations Capacitive DNA sensor
David Atienza (LSI/EPFL) 3030/08/2007
QUESTIONS ?