22
Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea Marongiu Igor Loi Davide Rossi Prof. Luca Benini www.pulp.ethz.ch 30 June 2015

Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Evaluating RISC-V Cores for PULP

An Open Parallel Ultra-Low-Power Platform

Sven Stucki Antonio Pullini

Michael Gautschi Frank K. Gürkaynak

Andrea Marongiu Igor Loi

Davide Rossi Prof. Luca Benini

www.pulp.ethz.ch 30 June 2015

Page 2: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Summary

Background

The group

PULP platform

Goals

Our approach

RISC-V on PULP

Integrated Systems Laboratory 2

Page 3: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

The Group of Prof. Luca Benini

Approximately 40 people ETH Zürich Integrated Systems Laboratory (IIS)

University of Bologna EEES

Many involved in PULP

Great experience in IC design More than 400 ICs, in-house ASIC tester

Close Collaborations POLIMI (compiler support)

CEA/LETI

EPFL

Integrated Systems Laboratory 3

Page 4: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Our Goal: Reach 1 GOPS/mW efficiency

Integrated Systems Laboratory 4 4

[RuchIBM11]

1012ops/J ↓

1pJ/op ↓

1GOPS/mW

Page 5: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Energy Proportionality

Integrated Systems Laboratory 5

log(mW)

1GOPS/mW

log(GOPS)

0,003GOPS/mW – 30KW

0,03GOPS/mW

Page 6: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

PULP

An open research platform

Goals:

Reach 1 GOPS/mW efficiency

Scalable hardware: Achieve energy proportionality

Research on:

Efficient cores, platform innovations

Technology options

Software support

Integrated Systems Laboratory 6

http://pulp.ethz.ch

Page 7: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Our Approach: PULP

Exploit parallelism

Multiple small cores organized in a cluster

Share memory within the cluster

Simple but efficient processor cores

Currently: OpenRISC with ISA extensions

RISC-V Minion core in the work

Hardware optimizations

Near-threshold operation

Dedicated accelerators

Integrated Systems Laboratory 7

Page 8: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

PULP Family of Chips

Silicon proven in 28nm Several tape-outs in different technologies

180nm (IcySoC project, approximate computing) 130nm (Mixed-signal PULP: Vivo-SOC) 65nm (Student projects, demonstrators) 28nm (Flagship designs, technology options)

Timeline:

Integrated Systems Laboratory 8

Page 9: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

RISC-V on PULP

Replace OpenRISC with RISC-V core

Motivation More active community

Compressed instruction set

Current status: Simple 4-stage (IF, ID, EX, WB) design

Support for RV32IC

‘mul’ instruction from M extension

UMC65: 22 kGE for tpd = 2.2ns @ 1.08V

Privileged features: M-mode, Mbare memory

Integrated Systems Laboratory 9

Page 10: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

RVC: Big impact on code size

Integrated Systems Laboratory 10

Page 11: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

PULP Architecture

Integrated Systems Laboratory 12

Page 12: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Tightly Coupled Data Memory (TCDM)

Integrated Systems Laboratory 13

Cluster-local data storage, explicitly managed

Single-cycle access without contention

Page 13: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Instruction Cache

Integrated Systems Laboratory 14

Exploit parallelism: shared between cores

Page 14: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

System on Chip

Integrated Systems Laboratory 15

Frequency-locked loop (FLL)

Memory, (I/O) Peripherals, ROM, …

Page 15: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Improved OpenRISC: OR10N Core

OpenRISC core developed at IIS

ISA extensions to improve efficiency

Hardware Loops

Pre-/Postincrement memory access

Vectorial (packed SIMD) instructions

Custom llvm compiler

No changes to C code needed to use new instructions

Debugging support

Integrated Systems Laboratory 16

Page 16: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

RISC-V on PULP: Future Work

Tapeout in 4Q2015

GlobalFoundries 28nm

Evaluate OR10N extensions for RV core

Hardware loops: Smaller impact on RV

Vectorial instructions

We are open for suggestions / collaborations

Integrated Systems Laboratory 17

Page 17: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Integrated Systems Laboratory 18

QUESTIONS?

http://pulp.ethz.ch

http://asic.ethz.ch

Page 18: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

OR10N Extensions Performance Gain

Integrated Systems Laboratory 19

Page 19: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

OR10N Extensions Power Improvement

Integrated Systems Laboratory 20

Page 20: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

Near- Super-Threshold Sub-

Near Threshold Operation More Efficient

Integrated Systems Laboratory 21

Higher energy efficiency

Actual PULP Measurement Results

Page 21: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

PULP v2: Best in Class

Integrated Systems Laboratory 22

PULP

commercial ARM processors

commercial low-power MCUs

multicore ULP platforms

sub-threshold MCUs

near-threshold MCUs

211 GOPS/W

power scalable by ~2000x many operating points provided by VBB and

VDD knobs

per

form

ance

sca

lab

le b

y ~3

00

x

Page 22: Evaluating RISC-V Cores for PULP · Evaluating RISC-V Cores for PULP An Open Parallel Ultra-Low-Power Platform Sven Stucki Antonio Pullini Michael Gautschi Frank K. Gürkaynak Andrea

PULP v2: Best in Class

Integrated Systems Laboratory 23

PULP

commercial ARM processors

commercial low-power MCUs

multicore ULP platforms

sub-threshold MCUs

near-threshold MCUs

211 GOPS/W

power scalable by ~2000x

per

form

ance

sca

lab

le b

y ~3

00

x