The FP7 FlexTiles project uses DSP accelerators. They are connected to each other and to the general-purpose processors (GPPs) through a Network-on-Chip (NoC). These slides give the details of the DSP accelerator.
www.flextiles.eu
FlexTiles
Workshop at AHS’2014 conference: FlexTiles FP7 project
Low-Power DSP Accelerator Embedded in a Heterogeneous Many-Core Architecture
Marc MORGAN
CSEM – Swiss Center for Electronics and Microtechnology
The information contained in this document and any attachments are the property of FlexTiles consortium. You are hereby notified that any review, dissemination, distribution, copying or otherwise use of this document must be done in accordance with the CA of the project (TRT/DJ/624412785.2011). Template version 1.0
CSEM overview on a single slide
• private company, founded in the 1980s, not for profit
• approx. 400 employees on 5 sites in Switzerland (HQ in Neuchâtel) and one site in Brazil
• 5 research programs:
1. ultra-low power integrated systems (SoC, Vision, Wireless)
2. systems engineering (med tech, instrumentation, automation)
3. MEMS
4. surface engineering (nano, bio, printable electronics)
5. photovoltaic
• approx. 70 MCHF annual budget
• over 20 start-ups and spin-offs since 1995
Many-core architecture: GPPs + accelerators
• An array of general purpose processors (GPP)
• Connected via a Network-on-Chip (NoC)
• Complemented with accelerators to optimize speed and power:
DSP processors or specialized logic implemented in embedded-FPGA
• Plus memory nodes and I/O
Many-core architecture: GPPs + accelerators (cont’d)
Several IPs are available for the building blocks
• both in the consortium and on the market
• architectural choices attempt to retain genericity of the platform
CSEM provides an ultra-low power DSP processor for the DSP accelerator
It plugs into a generic accelerator interface (AI)
Accelerator interface (AI)
Interfaces the NoC’s NI to the accelerator by providing services:
programming, control/status, data in, data out, debug
DMA access, word FIFOs, notification
DSP accelerator architecture
Choices for the DSP accelerator avoid DSP-specific features:
• the DSP will not run an OS or kernel
• the DSP will not use (or at least not require) interrupts
Note: CSEM’s icyflex4 ULP DSP could support both of the above
Implement a FIFO manager to handle input and output tokens from/to the accelerator interface (AI)
Implement debug and tracing facilities:
• Debug: JTAG 1149.1 TAP
• Tracing: programmable tracing unit
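The FIFO manager on this slide can be pictured with a small behavioral model. The Python sketch below is an illustration only, not CSEM's implementation: `WordFifo`, `fifo_manager_step`, the FIFO depth and the token size are hypothetical names and values. It polls the FIFOs rather than waiting for an interrupt, in line with the choice not to require interrupts.

```python
from collections import deque


class WordFifo:
    """Bounded word FIFO (hypothetical stand-in for an AI word FIFO)."""

    def __init__(self, depth):
        self.depth = depth
        self.words = deque()

    def free_slots(self):
        return self.depth - len(self.words)

    def push(self, word):
        if self.free_slots() == 0:
            raise OverflowError("FIFO full")
        self.words.append(word)

    def pop(self):
        if not self.words:
            raise IndexError("FIFO empty")
        return self.words.popleft()


def fifo_manager_step(fifo_in, fifo_out, kernel, token_words):
    """Poll once; process one token if input data and output space exist.

    Assumes the kernel returns exactly token_words words per token.
    Returns True when a token was processed, False otherwise.
    """
    if len(fifo_in.words) < token_words:
        return False  # no complete input token yet
    if fifo_out.free_slots() < token_words:
        return False  # no room for the output token
    token = [fifo_in.pop() for _ in range(token_words)]
    for word in kernel(token):
        fifo_out.push(word)
    return True


# Usage: two 2-word tokens are doubled by a toy kernel.
fifo_in, fifo_out = WordFifo(depth=8), WordFifo(depth=8)
for w in [1, 2, 3, 4]:
    fifo_in.push(w)

processed = 0
while fifo_manager_step(fifo_in, fifo_out, lambda t: [2 * w for w in t], 2):
    processed += 1
```

Checking both the input level and the output space before consuming a token keeps the step non-blocking, so the DSP program can call it from a plain loop.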
DSP accelerator architecture (cont’d)
Management of the DSP accelerator
Each accelerator is managed by software running on GPPs:
• virtualization manager: allocation of the accelerator
• resource manager: control of the accelerator
These managers are in charge of:
• transfer of the application (ELF) to the accelerator
• signaling the accelerator when to start and when to stop
• recovering statistics on usage of the accelerator to optimize the execution of the application on the many-core platform
The tracing unit can be managed from the processor or from the JTAG interface
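The division of work between the two managers can be sketched as a toy model. Everything named here (`AcceleratorModel`, `VirtualizationManager`, `run_on_accelerator`, the statistics fields) is hypothetical and chosen for illustration; the slides do not specify the FlexTiles software API.

```python
class AcceleratorModel:
    """Hypothetical behavioral model of one accelerator seen from a GPP."""

    def __init__(self):
        self.image = None
        self.running = False
        self.busy_cycles = 0
        self.total_cycles = 0

    def load(self, elf_image):
        if self.running:
            raise RuntimeError("stop the accelerator before loading")
        self.image = elf_image

    def start(self):
        if self.image is None:
            raise RuntimeError("no application loaded")
        self.running = True

    def stop(self):
        self.running = False

    def tick(self, cycles, busy):
        """Advance time; cycles only count while the accelerator runs."""
        if self.running:
            self.total_cycles += cycles
            if busy:
                self.busy_cycles += cycles


class VirtualizationManager:
    """Hands a free accelerator to a requesting task (allocation only)."""

    def __init__(self, accelerators):
        self.free = list(accelerators)
        self.owner = {}

    def allocate(self, task):
        accelerator = self.free.pop()
        self.owner[task] = accelerator
        return accelerator

    def release(self, task):
        self.free.append(self.owner.pop(task))


def run_on_accelerator(accelerator, elf_image, activity):
    """Resource-manager role: load, start, stop, then collect statistics."""
    accelerator.load(elf_image)
    accelerator.start()
    for cycles, busy in activity:
        accelerator.tick(cycles, busy)
    accelerator.stop()
    total = accelerator.total_cycles
    return {
        "busy_cycles": accelerator.busy_cycles,
        "total_cycles": total,
        "utilization": accelerator.busy_cycles / total if total else 0.0,
    }


vm = VirtualizationManager([AcceleratorModel()])
dsp = vm.allocate("task0")
# The bytes below are only the ELF magic number, used as a placeholder image.
stats = run_on_accelerator(dsp, b"\x7fELF", [(80, True), (20, False)])
vm.release("task0")
```

A utilization figure like the one returned here is the kind of usage statistic a scheduler could use when deciding where to place the next task.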
Ultra low-power (ULP) processors at CSEM
CSEM was founded in the 1980s to promote innovation
• 1980s, initially for the Swiss watch industry
  - ULP 4-bit processors: PUNCH, µPUS, Combo, ...
• 1990s, development of a general purpose ULP 8-bit processor
  - CoolRISC: licensed to Swatch group, TI, Semtech, ...
• 2000s, powerful new ULP processors with DSP features
  - 2006: icyflex1, a flexible processor for DSP/control apps
  - 2009: icyflex2, a smaller processor for control applications
  - 2009: icyflex4, a scalable processor for DSP/control apps
icyflex is a registered trademark of CSEM
icyflex family of ultra-low power processors
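[Figure: the icyflex family plotted by computing power against application, from Control to DSP. Processors shown: icyflex2, icyflex1, icyflex4. Computing-power labels: 1 MUL, 2 MAC, 4 MAC … 36 MAC, with a 12 MAC point marked. Power labels: 6 µW/MHz, 25 µW/MHz, 10-150 µW/MHz.]
power indicated for TSMC 65 nm CMOS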
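A figure in µW/MHz scales linearly with clock frequency, so the power at a given operating point is the rating multiplied by the clock in MHz. The short sketch below works this out for two of the figures on the slide; the 50 MHz clock and the function name `core_power_uw` are assumptions made for the example only.

```python
def core_power_uw(uw_per_mhz, clock_mhz):
    """Power in µW for a core rated in µW/MHz at a given clock in MHz."""
    return uw_per_mhz * clock_mhz


CLOCK_MHZ = 50  # assumed clock frequency, not taken from the slides

low = core_power_uw(6, CLOCK_MHZ)     # the 6 µW/MHz figure
high = core_power_uw(150, CLOCK_MHZ)  # upper end of the 10-150 µW/MHz range
```

At the assumed clock, these evaluate to 300 µW and 7500 µW (7.5 mW) respectively.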
icyflex software development kit
• GNU C compiler (gcc), v 4.6.3
  - icyflex instruction parallelism supported by latest releases of gcc
  - libc and libm from RedHat’s NewLib
  - software implementation of IEEE floating-point standard
• GNU assembler / linker (binutils), v 2.20
  - BFD / ELF32 object file format
  - Binary, SREC, IHEX memory image file formats
• GNU debugger (gdb), v 6.7.1
  - Mode 1: instruction set simulator of the icyflex core
  - Mode 2: On-Chip Debug (OCD) through a JTAG interface
• icyflex instruction set simulator (ISS), written in C++
  - Phase-accurate, pipelined
  - Wrappers to SystemC, VHDL (Modelsim), Matlab/Simulink
• Eclipse integrated development environment, v Helios
  - CDT C/C++ IDE plug-in
  - icyflex plug-in
[Figure: tool flow with the files .c, .o, .exe and .log and the tools gcc, ld and gdb]
icyflex2 vs icyflex4
Feature                                      | icyflex2         | icyflex4 (VPS=2)
Optimized for                                | Control          | DSP: P, X, Y memory buses, ISA, HW loops, saturation, …
Instruction word [bits]                      | 32 (1 or 2 sub)  | 64 (1, 2 or 3 sub)
Memory access [bits]                         | 8, 16 or 32      | 2x (8, 16, 32, 64, 128)
Data processing [bits]                       | 16 or 32, trunc  | 2x (16 or 32 or 64), full
Single Instr. Multiple Data (SIMD)           | No               | Yes, up to 8 MAC
Instruction set is reconfigurable on the fly | No               | Yes
Software Development Kit (SDK)               | both: GNU-based tool suite (gcc, gdb) + cycle-accurate instruction set simulator (ISS)
Hardware Devt Kit (HDK)                      | both: FPGA-based, customizable
VPS = Vector Processing Slices in the Vector Processing Unit of the DSP
icyflex: reconfigurable instructions and addressing modes
[Figure: the instruction set (ADD, MUL, SHR, MAC, JUMP, plus blank "configurable" instructions configured at run-time) passes through instruction decoding to the datapath blocks SHIFT, MUX, ALU and ACC, which are drawn twice.]
cycle N: config MOP
cycle N+1: use MOP
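The two-step pattern above (configure a blank instruction in one cycle, use it in the next) can be illustrated with a toy model. This is a loose sketch, not the icyflex instruction set: `ToyCore`, `config_mop`, `use_mop`, the slot name and the set of ALU operations are all invented for the example, and the MUX stage of the figure is left out for brevity.

```python
import operator

# ALU operations a blank instruction may select (illustrative subset).
ALU_OPS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}


class ToyCore:
    """Toy core with run-time configurable instruction slots."""

    def __init__(self):
        self.acc = 0
        self.mops = {}  # slot name -> (shift amount, ALU function, accumulate?)

    def config_mop(self, slot, shift, alu_op, accumulate):
        """Cycle N: bind a blank instruction slot to datapath settings."""
        self.mops[slot] = (shift, ALU_OPS[alu_op], accumulate)

    def use_mop(self, slot, a, b):
        """Cycle N+1: run operands through SHIFT -> ALU -> (optional) ACC."""
        shift, alu, accumulate = self.mops[slot]
        result = alu(a >> shift, b)
        if accumulate:
            self.acc += result
            return self.acc
        return result


core = ToyCore()
core.config_mop("MOP0", shift=1, alu_op="add", accumulate=True)
first = core.use_mop("MOP0", 8, 3)   # (8 >> 1) + 3 = 7, accumulator becomes 7
second = core.use_mop("MOP0", 4, 1)  # (4 >> 1) + 1 = 3, accumulator becomes 10
```

Once configured, the slot behaves like a fixed instruction until it is configured again, which is what lets a program tailor the instruction set to an inner loop.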
DSP in FlexTiles emulators
Emulator 1 (software):
• Using Open Virtual Platform (OVP)
• Not cycle accurate
• The icyflex4 DSP is emulated by a GPP running at a higher frequency
Emulator 2 (hardware):
• Using an FPGA board with two Xilinx Virtex6 FPGAs
• Uses a DFF version of the DSP accelerator
Exploitation of FlexTiles results at CSEM
CSEM specializes in low-power solutions
A well-balanced multi-processor design can optimize energy consumption by reducing voltage and frequency
• For multi-core: we offer CSEM solutions
• For many-core: CSEM collaborates with one or more of its partners, including e.g. a follow-up project to produce FlexTiles chips
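The energy argument for multi-processor designs follows from the usual first-order estimate of CMOS dynamic power, P = a · C · V² · f. The sketch below compares one core at full speed with two cores at half speed and reduced voltage. All numbers are assumptions chosen for illustration (1.0 V and 0.8 V supplies, 100 MHz clock, 1 nF switched capacitance), and the model ignores leakage, communication overhead and imperfect parallelization.

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """First-order CMOS dynamic power estimate: activity * C * V^2 * f."""
    return activity * capacitance * voltage ** 2 * frequency


C = 1e-9   # switched capacitance per core in farads (assumed)
F = 100e6  # single-core clock in hertz (assumed)

# One core at nominal voltage and full frequency.
one_core = dynamic_power(C, 1.0, F)

# Two cores sharing the work: each runs at half the frequency, which is
# assumed here to allow the supply to drop from 1.0 V to 0.8 V.
two_cores = 2 * dynamic_power(C, 0.8, F / 2)

ratio = two_cores / one_core  # equals (0.8 / 1.0) ** 2
```

Under these assumptions the two-core configuration delivers the same throughput for about 64% of the dynamic power, because the frequency terms cancel and only the squared voltage ratio remains.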
FlexTiles FP7 project
For more information regarding the FlexTiles project, visit:
http://www.flextiles.eu
Please take 5 minutes to fill out the survey on the project web site under the Contact menu
The FlexTiles project is funded in part by FP7, the seventh framework programme of the European Commission.
Thank you for your attention!
For more information: http://www.csem.ch
Questions? mailto:[email protected]