

Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs

MICRO Tutorial

October 12, 2019

Website: http://accelergy.mit.edu/tutorial.html


Accelergy Overview

• An architecture-level energy estimator that can be easily integrated into other design exploration tools

• Accurately and systematically captures the energy consumption of basic building blocks in the design

• Succinctly models diverse and complicated designs

• Achieves 95% accuracy in evaluating a well-known deep neural network accelerator design, Eyeriss [2016 ISSCC]


Importance of Energy Estimation

Energy efficiency is a general concern for accelerator designs.

[Figure: an energy-efficient DNN accelerator serving applications such as scene recognition and object detection]

We must quickly evaluate the energy efficiency of arbitrary potential designs in a large design space.


Energy Estimation and Design Exploration

• Physical-level energy estimators

Flow: Arch. Description → RTL Model → Physical Layout → Physical-Level Energy Estimator → Energy

– Arch. Description to RTL Model: develop the register transfer level (RTL) details
– RTL Model to Physical Layout: synthesize the design, place standard cells, and route the wires

[Figure: physical-level simulations. A table lists each wire (wire0, wire1) with its capacitance and switching activity (01000111 and 00100111), alongside the gates NOR3, OR4, and OR2; a bar chart shows energy (pJ) for wire0 and wire1.]
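As a rough illustration of what a physical-level estimator works from, the sketch below counts the toggles in each wire's switching-activity string and charges 0.5 * C * V^2 per toggle. The switching strings come from the slide; the capacitance and supply-voltage values are placeholders chosen for illustration, and real physical-level tools model far more than this.

```python
def count_toggles(activity: str) -> int:
    """Count the bit transitions in a switching-activity string."""
    return sum(1 for prev, cur in zip(activity, activity[1:]) if prev != cur)


def wire_energy_pj(activity: str, cap_ff: float, vdd: float) -> float:
    """Dynamic energy in pJ, assuming 0.5 * C * V^2 is dissipated per toggle."""
    energy_fj = count_toggles(activity) * 0.5 * cap_ff * vdd ** 2  # fF * V^2 = fJ
    return energy_fj / 1000.0  # fJ -> pJ


# Switching activity is taken from the slide; capacitance (fF) and Vdd (V)
# are hypothetical placeholders.
wires = {
    "wire0": {"activity": "01000111", "cap_ff": 2.0},
    "wire1": {"activity": "00100111", "cap_ff": 4.0},
}
VDD = 1.0

energy_by_wire = {
    name: wire_energy_pj(w["activity"], w["cap_ff"], VDD)
    for name, w in wires.items()
}

for name, energy in energy_by_wire.items():
    print(f"{name}: {energy:.4f} pJ")
```

Every wire in the layout needs its own capacitance and activity trace, which is why this level of detail is only available after place and route.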


Energy Estimation and Design Exploration

• Physical-level energy estimators

[Figure: design flow from Arch. Description through RTL Model and Physical Layout to Fabricated Chip, with Energy obtained at the physical level]

– Requires the physical layout of the design
– Slow design space exploration


Energy Estimation and Design Exploration

• Architecture-Level Energy Estimators

Flow: Arch Description → Architecture-Level Energy Estimator → Energy

Architecture Description: a GLB and a PE that contains a Buffer and a MAC

Action Counts:

Comp. name    Action name    Action counts
Buffer        access         800
MAC           compute        400

[Figure: bar chart of energy (pJ) for MAC and Buffer]


Energy Estimation and Design Exploration

• Architecture-Level Energy Estimators

[Figure: the Architecture-Level Energy Estimator maps the Arch Description directly to Energy, without the RTL Model, Physical Layout, or Fabricated Chip stages]

– Only requires an architecture-level design
– Faster design space exploration
– Exactly what Timeloop uses


Importance of Energy Estimations

Timeloop requires energy reference tables (ERTs) to evaluate the energy efficiency of a design.

[Figure: Timeloop takes an Architecture Description, a Workload, and mapping Constraints; its Mapper and Model report Energy, Performance, and Area. The Energy Estimator, Accelergy, derives the ERTs from the Architecture Description: a global buffer (GLB) and a PE that contains a Buffer and a MAC.]

ERTs:

Component name    Action name    Energy/action
GLB               random read    100 pJ
Buffer            random read    10 pJ
MAC               compute        5 pJ


Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators: Aladdin [ISCA2014], fixed-cost [Asilomar2017]


Architecture Description (a description with primitive components, the lowest-level building blocks of the architecture): a 1024B GLB and a PE that contains a 128B Buffer and a 16b MAC

The Energy Estimator holds a Design-Specific ERT with coarse-grained actions:

Comp. name     Action name    Energy/action
1024B GLB      access         100 pJ
128B Buffer    access         10 pJ
16b MAC        compute        5 pJ


Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators

Design-Specific ERT (coarse-grained actions):

Comp. name     Action name    Energy/action
1024B GLB      access         100 pJ
128B Buffer    access         10 pJ
16b MAC        compute        5 pJ

Energy per action of a register file (normalized to idle), by fine-grained action:

Action                 Energy
Random read            1.8
Repeated read          1.0
Random write           4.7
Repeated write         2.1
Constant data write    2.4

– Repeated read: repetitive access to the same memory location
– Constant data write: write the same data to different addresses
– The most expensive action costs ~5x the cheapest

Coarse-grained estimations reduce estimation accuracy.
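To make the accuracy loss concrete, here is a small sketch comparing a fine-grained estimate with a coarse-grained one. The per-action energies are the normalized register-file values from the chart, read in the order shown. The access mix is hypothetical, and so is the assumption that a coarse-grained table charges every read and every write at the random-access cost.

```python
# Normalized energy per action for the register file, as read off the chart.
FINE_GRAINED = {
    "random_read": 1.8,
    "repeated_read": 1.0,
    "random_write": 4.7,
    "repeated_write": 2.1,
    "constant_data_write": 2.4,
}

# Assumption: a coarse-grained table knows only "read" and "write" and
# charges each at the random-access cost.
COARSE_GRAINED = {
    "read": FINE_GRAINED["random_read"],
    "write": FINE_GRAINED["random_write"],
}

# Hypothetical access mix for one workload.
counts = {
    "random_read": 100,
    "repeated_read": 900,
    "random_write": 50,
    "constant_data_write": 450,
}


def fine_estimate(action_counts):
    """Charge each action at its own fine-grained cost."""
    return sum(FINE_GRAINED[action] * n for action, n in action_counts.items())


def coarse_estimate(action_counts):
    """Collapse every action into plain 'read' or 'write' before charging it."""
    total = 0.0
    for action, n in action_counts.items():
        kind = "read" if action.endswith("read") else "write"
        total += COARSE_GRAINED[kind] * n
    return total


fine = fine_estimate(counts)
coarse = coarse_estimate(counts)
print(f"fine-grained: {fine:.1f}, coarse-grained: {coarse:.1f}")
```

With this particular mix, which is dominated by repeated reads and constant-data writes, the coarse-grained total comes out well above the fine-grained one.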


Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators

Architecture Description: a 1024B GLB and a PE that contains a 128B Buffer and a 16b MAC

Inside the Energy Estimator, the Energy Calculator combines the Design-Specific ERT with the Action Counts to produce the Energy Estimations.

Design-Specific ERT:

Comp. name     Action name    Energy/action
1024B GLB      access         100 pJ
128B Buffer    access         10 pJ
16b MAC        compute        5 pJ

Action Counts:

Comp. name     Action name    Action counts
1024B GLB      access         10
128B Buffer    access         800
16b MAC        compute        400

Energy Estimations (energy/action × action counts):

Name           Energy
1024B GLB      1000 pJ
128B Buffer    8000 pJ
16b MAC        2000 pJ
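The calculation on this slide is a per-component multiply-and-sum. The sketch below reproduces it using the ERT and action-count values shown above; the function and variable names are illustrative and are not part of any tool's interface.

```python
# Energy per action in pJ, keyed by (component, action), from the slide's ERT.
ert_pj = {
    ("1024B GLB", "access"): 100,
    ("128B Buffer", "access"): 10,
    ("16b MAC", "compute"): 5,
}

# Action counts from the slide, keyed the same way.
action_counts = {
    ("1024B GLB", "access"): 10,
    ("128B Buffer", "access"): 800,
    ("16b MAC", "compute"): 400,
}


def estimate_energy(ert, counts):
    """Return {component: energy in pJ} as the sum of energy/action * count."""
    energy = {}
    for (component, action), n in counts.items():
        if (component, action) not in ert:
            raise KeyError(f"no ERT entry for {component!r} action {action!r}")
        energy[component] = energy.get(component, 0) + ert[(component, action)] * n
    return energy


energy = estimate_energy(ert_pj, action_counts)
for component, pj in energy.items():
    print(f"{component}: {pj} pJ")
print(f"total: {sum(energy.values())} pJ")
```

A component or action that is missing from the table raises an error instead of silently contributing zero energy.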


Existing Architecture-Level Energy Estimators

• Design-Specific Accelerator Estimators

Architecture Description (a different design): a 1024B GLB and a PE that contains a 64B Buffer and an RRAM MAC

Action Counts:

Comp. name    Action name    Action counts
1024B GLB     access         10
64B Buffer    access         800
RRAM MAC      compute        400

Design-Specific ERT (unchanged):

Comp. name     Action name    Energy/action
1024B GLB      access         100 pJ
128B Buffer    access         10 pJ
16b MAC        compute        5 pJ

Not generalizable to other designs: the design-specific ERT has no entries for the 64B Buffer or the RRAM MAC.


Accelergy: Primitive Component Energy Estimation

• Accurate estimation for primitive components with a primitive component library* that defines

– Necessary hardware attributes to accurately characterize each component

– Necessary fine-grained actions that improve energy estimation accuracy

Fine-grained memory action types: Random Read 1.8, Repeated Read 1.0, Random Write 4.7, Repeated Write 2.1, Constant Data Write 2.4

Fine-grained multiplier action types: Random Mult 23.0, Reused Mult 16.8, Gated Mult 1.3

*open-source at http://accelergy.mit.edu


Accelergy: Primitive Component Energy Estimation

Systematically generates ERTs for primitive components in various designs

[Figure: within Accelergy, the ERT Generator takes the Architecture Description (a 1024B GLB and a PE that contains a 128B Buffer and a 16b MAC) and the Primitive Component Library, and queries estimation plug-ins such as the CACTI and 40nm plug-ins.]

– 40nm Estimation Plug-in: borrows and improves the 40nm public energy table from Aladdin [ISCA2014]
– CACTI Estimation Plug-in: uses and improves on open-source CACTI


Accelergy: Primitive Component Energy Estimation

Use energy estimation plug-ins to characterize primitive components

– Traditional open-source plug-ins*: CACTI Estimation Plug-in, 40nm Estimation Plug-in
– Emerging technology plug-ins: NVSIM [TCAD 2012]
– Proprietary plug-ins

*available at http://accelergy.mit.edu
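One way to picture the plug-in idea is a list of estimators that are asked, in turn, whether they can characterize a given component action. The sketch below is a hypothetical illustration of that pattern only: the class, method, and plug-in names are invented here and do not describe Accelergy's actual plug-in interface, and the energy values are placeholders.

```python
class TablePlugin:
    """Hypothetical plug-in that answers queries from a fixed lookup table."""

    def __init__(self, name, table):
        self.name = name
        self._table = table  # {(class_name, action_name): energy in pJ}

    def estimate(self, class_name, attributes, action):
        # Return None when this plug-in cannot characterize the request.
        # A real plug-in would also use `attributes` (size, technology, ...).
        return self._table.get((class_name, action))


def characterize(plugins, class_name, attributes, action):
    """Return (plug-in name, energy in pJ) from the first plug-in that answers."""
    for plugin in plugins:
        energy = plugin.estimate(class_name, attributes, action)
        if energy is not None:
            return plugin.name, energy
    raise LookupError(f"no plug-in supports {class_name}.{action}")


# Placeholder energy values; they are not outputs of CACTI or of the 40nm table.
plugins = [
    TablePlugin("cacti_like", {("SRAM", "read"): 100.0}),
    TablePlugin("table_40nm_like", {("intmac", "mac_random"): 5.0}),
]

print(characterize(plugins, "SRAM", {"technology": "40nm"}, "read"))
print(characterize(plugins, "intmac", {"datawidth": 16}, "mac_random"))
```

Supporting a new technology then means appending another estimator to the list rather than rewriting the table generator.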


Accelergy: Primitive Component Energy Estimation

Systematically generates ERTs for primitive components in various designs

[Figure: within Accelergy, the ERT Generator takes the Architecture Description (a 1024B GLB and a PE that contains a 128B Buffer and a 16b MAC) and the Primitive Component Library, queries the CACTI and 40nm estimation plug-ins, and outputs an ERT with fine-grained actions.]

ERT:

Comp. name     Action name       Energy/action
1024B GLB      random read, …    100 pJ, …
128B Buffer    random read, …    10 pJ, …
16b MAC        random MAC, …     5 pJ, …


Accelergy: Primitive Component Energy Estimation

Systematically generates ERTs for primitive components in various designs

[Figure: within Accelergy, the ERT Generator uses the Architecture Description (a 1024B GLB and a PE that contains a 128B Buffer and a 16b MAC), the Primitive Component Library, and the estimation plug-ins (CACTI, 40nm, …) to produce the ERT. The Energy Calculator combines the ERT with the Action Counts to produce the Energy Estimations.]

Action Counts (generated by performance models, e.g., cycle-level simulator):

Comp. name     Action name    Action counts
1024B GLB      random read    10
128B Buffer    random read    800
16b MAC        random MAC     400

Energy Estimations:

Name           Energy
1024B GLB      1000 pJ
128B Buffer    8000 pJ
16b MAC        2000 pJ


Accelergy: Primitive Component Energy Estimation

Systematically generates ERTs for primitive components in various designs

[Figure: the same flow applied to a different design, a 1024B GLB and a PE that contains a 64B Buffer and an RRAM MAC. The ERT Generator uses the Primitive Component Library and the estimation plug-ins, now including NVSIM [TCAD 2012] alongside the CACTI and 40nm plug-ins, to produce the ERT for the Energy Calculator.]

Action Counts (generated by performance models, e.g., cycle-level simulator):

Comp. name    Action name    Action counts
1024B GLB     random read    10
64B Buffer    random read    800
RRAM MAC      random MAC     400

Energy Estimations:

Name          Energy (pJ)
1024B GLB     1000
64B Buffer    4000
RRAM MAC      400


How to use Accelergy?

1. Estimate architectures with primitive components


Estimate Architectures with Primitive Components

Accepts inputs and generates outputs in YAML format

[Figure: Accelergy flow. Inputs: Architecture Description (YAML), Primitive Component Library (YAML), and Action Counts (YAML; generated by performance models, e.g., cycle-level simulator). The ERT Generator, backed by the CACTI and 40nm estimation plug-ins, produces the ERT (YAML); the Energy Calculator combines the ERT with the Action Counts to produce the Energy Estimations (YAML).]


Estimate Architectures with Primitive Components

Step 1: Generate Energy Reference Tables

[Figure: the ERT Generator inside Accelergy takes the Architecture Description (YAML) and the Primitive Component Library (YAML), queries the estimation plug-ins (CACTI, 40nm, …), and outputs the ERT (YAML).]


Accelergy: Primitive Component Library

Object-oriented approach: components are instances of component classes. As in a class definition, attributes play the role of data members and actions the role of member functions.

YAML Syntax

    version: 0.2 # vcs
    classes:
      - name: bitwise
        attributes:
          technology: 40nm
          datawidth: 1 …
        actions:
          - name: process …
      - name: intadder
        attributes:
          technology: 40nm
          datawidth: 16 …
        actions:
          - name: add …

– Contains various types of primitive component classes

– Each primitive component class contains attribute names and action names that can be shared by its instances

– Default attributes are defined
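The class/instance relationship can be sketched in a few lines: an instance starts from its class's default attributes and overrides only what it needs. The dictionary below mirrors just the attributes and actions visible on the slide (the elided ones are left out), and the `instantiate` helper is a hypothetical illustration rather than part of Accelergy.

```python
# Primitive component classes, limited to what the slide shows.
LIBRARY = {
    "bitwise": {
        "attributes": {"technology": "40nm", "datawidth": 1},
        "actions": ["process"],
    },
    "intadder": {
        "attributes": {"technology": "40nm", "datawidth": 16},
        "actions": ["add"],
    },
}


def instantiate(name, class_name, overrides=None):
    """Create a component instance: class defaults plus per-instance overrides."""
    overrides = overrides or {}
    cls = LIBRARY[class_name]
    unknown = set(overrides) - set(cls["attributes"])
    if unknown:
        raise ValueError(f"{class_name} has no attributes {sorted(unknown)}")
    return {
        "name": name,
        "class": class_name,
        "attributes": {**cls["attributes"], **overrides},
        "actions": list(cls["actions"]),
    }


adder = instantiate("adder0", "intadder", {"datawidth": 8})
print(adder["attributes"])
```

Here `adder0` keeps the class default for `technology` and overrides only `datawidth`.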


Accelergy: Primitive Component Library

• Representation of fine-grained action types

Action types: repeated read, random read, repeated write, random write, repeated data write

Need a new name for each fine-grained action? That is tedious. For a NoC in which a sender can multicast to Receiver 0 through Receiver 99, we need 100 names for the multicast actions.


Accelergy: Primitive Component Library

• Fine-grained actions with runtime arguments

YAML Syntax

    - name: regfile
      attributes: … # defaults
      actions:
        - name: read
          arguments:
            data_delta: 0..1
            address_delta: 0..1
        - name: write
          arguments:
            data_delta: 0..1
            address_delta: 0..1

Each argument takes a value in its declared range, e.g., 0 ≤ data_delta ≤ 1.

Original action type    New action type    data∆    address∆
Repeated read           read               0        0
Random read             read               1        1
Repeated write          write              0        0
Random write            write              1        1
Repeated data write     write              0        1
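The table can be read as a mapping from each original action name to an action plus argument values. The sketch below encodes that mapping and shows one way a performance model might derive the arguments from two consecutive accesses. The helper names and the (address, data) access representation are assumptions made for illustration.

```python
# Original fine-grained names mapped to (action, arguments), per the table.
ORIGINAL_TO_ARGS = {
    "repeated read": ("read", {"data_delta": 0, "address_delta": 0}),
    "random read": ("read", {"data_delta": 1, "address_delta": 1}),
    "repeated write": ("write", {"data_delta": 0, "address_delta": 0}),
    "random write": ("write", {"data_delta": 1, "address_delta": 1}),
    "repeated data write": ("write", {"data_delta": 0, "address_delta": 1}),
}


def access_arguments(prev, cur):
    """Derive the runtime arguments from two consecutive (address, data) accesses."""
    prev_addr, prev_data = prev
    cur_addr, cur_data = cur
    return {
        "data_delta": int(prev_data != cur_data),
        "address_delta": int(prev_addr != cur_addr),
    }


def original_name(action, arguments):
    """Map an action and its arguments back to the original name, if one exists."""
    for name, (act, args) in ORIGINAL_TO_ARGS.items():
        if act == action and args == arguments:
            return name
    return None


# Writing the same data (7) to a different address:
args = access_arguments(prev=(0, 7), cur=(1, 7))
print(args, "->", original_name("write", args))
```

Two action names with two small arguments replace five separate names, and the same scheme extends to the multicast case by passing the number of receivers as an argument.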


Accelergy: Architecture Description

• Accelergy shares similar architecture description syntax with Timeloop

Design: a 1024B GLB and a PE that contains a 128B buffer and a 16b MAC

YAML Syntax

    subtree:
      - name: Design
        attributes:
          technology: 40nm
        local:
          - name: GLB
            class: SRAM
            attributes:
              width: 32 …
        subtree:
          - name: PE
            local:
              - name: buffer
                class: regfile
              - name: MAC
                class: intmac

– Design and PE form the abstract hierarchy; attributes set at one level (here, technology) are shared by the components beneath it

– Primitive components are instances of classes: GLB is an instance of the SRAM class, buffer of regfile, and MAC of intmac
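A hierarchical description like this can be flattened into fully qualified component names, with attributes passed down from each level to the components beneath it. The sketch below does that on a plain dictionary that mirrors the YAML, using `Design` as the top-level name as in the diagram. The `flatten` helper and the rule that a component's own attributes override inherited ones are assumptions for illustration, not Accelergy's implementation.

```python
# Dictionary mirror of the architecture description (elided fields left out).
ARCH = {
    "name": "Design",
    "attributes": {"technology": "40nm"},
    "local": [
        {"name": "GLB", "class": "SRAM", "attributes": {"width": 32}},
    ],
    "subtree": [
        {
            "name": "PE",
            "local": [
                {"name": "buffer", "class": "regfile"},
                {"name": "MAC", "class": "intmac"},
            ],
        }
    ],
}


def flatten(node, prefix="", inherited=None):
    """Return {dotted component name: {"class": ..., "attributes": ...}}."""
    path = f"{prefix}.{node['name']}" if prefix else node["name"]
    shared = {**(inherited or {}), **node.get("attributes", {})}
    flat = {}
    for comp in node.get("local", []):
        flat[f"{path}.{comp['name']}"] = {
            "class": comp["class"],
            # The component's own attributes take precedence over shared ones.
            "attributes": {**shared, **comp.get("attributes", {})},
        }
    for child in node.get("subtree", []):
        flat.update(flatten(child, path, shared))
    return flat


flat = flatten(ARCH)
for name, info in flat.items():
    print(name, info["class"], info["attributes"])
```

The dotted names produced here have the same shape as the component names used in the energy reference table.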


Accelergy: Energy Reference Table (ERT)

• Contains energy/action values for all the components in the design

Design: a 1024B GLB and a PE that contains a 128B buffer and a 16b MAC

YAML Syntax

    tables:
      - name: Design.GLB
        read:
          - arguments:
              data_delta: 1
              address_delta: 1
            energy: 100
      - name: Design.PE.buffer
        read: …
      - name: Design.PE.MAC
        mac_random:
          energy: 5
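Looking up an energy value in such a table means matching the component, the action, and the argument values. The sketch below shows that lookup on a dictionary holding only the entries visible on the slide; the elided `Design.PE.buffer` entry is left out. For uniformity, the MAC entry is stored as a one-element list with no arguments, which is a simplification made here and not the tool's file layout.

```python
# ERT entries visible on the slide, energies in the slide's units.
ERT = {
    "Design.GLB": {
        "read": [
            {"arguments": {"data_delta": 1, "address_delta": 1}, "energy": 100},
        ],
    },
    "Design.PE.MAC": {
        "mac_random": [
            {"arguments": None, "energy": 5},
        ],
    },
}


def lookup_energy(ert, component, action, **arguments):
    """Return the energy of the entry whose arguments match exactly."""
    for entry in ert[component][action]:
        if (entry["arguments"] or {}) == arguments:
            return entry["energy"]
    raise KeyError(f"no entry for {component}.{action} with {arguments}")


glb_read = lookup_energy(ERT, "Design.GLB", "read", data_delta=1, address_delta=1)
mac = lookup_energy(ERT, "Design.PE.MAC", "mac_random")
print(glb_read, mac)
```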


Exercise 1: Generate ERT for Simple Architecture

• Instructions:

– cd exercises/accelergy/01_primitive_architecute_ERT

– accelergy input/*.yaml -o output/ --enable_flattened_arch 1

– ERT.yaml, ERT_summary.yaml, and flattened_architecture will be generated in the current directory

• Questions:

– Each YAML file contains a top-level key that indicates its content. What top-level keys are used to identify the input file content?

– What information in the ERT is used by Timeloop?

– Examine ERT_summary.yaml. Why is it bad to have coarse-grained action types?


Estimate Architectures with Primitive Components

Step 2: Generate Energy Estimations

[Diagram: Step 1: Architecture Description (YAML) and Primitive Component Library (YAML) feed the ERT Generator, which uses the CACTI and 40nm estimation plug-ins to produce the ERT (YAML). Step 2: the Energy Calculator combines the ERT (YAML) with Action Counts (YAML, generated by performance models, e.g., a cycle-level simulator) to produce Energy Estimations (YAML)]


Accelergy: Action Counts

• Action counts are generated using external performance models, such as a cycle-level simulator or Timeloop.

[Figure: Design with a 1024B GLB and a PE containing a 128B buffer and a 16b MAC (primitive components), annotated with action counts: GLB read(1,1): 10; Buffer read(1,1): 800; MAC random mac: 400]

YAML Syntax

subtree:
  - name: Design
    local:
      - name: GLB
        action_counts:
          - name: read
            arguments:
              data_delta: 1
              address_delta: 1
            counts: 10
    subtree:
      - name: PE
        local:
          - name: Buffer
            action_counts: …


Estimate Architectures with Primitive Components

Accepts inputs and generates outputs in YAML format

[Diagram: the ERT Generator turns the Architecture Description (YAML) and Primitive Component Library (YAML) into an ERT (YAML) using the CACTI and 40nm estimation plug-ins; the Energy Calculator combines the ERT with Action Counts (YAML, generated by performance models, e.g., a cycle-level simulator) into Energy Estimations (YAML)]

components:
  Design.GLB: 1000
  Design.PE.Buffer: 8000
  Design.PE.MAC: 2000

YAML Syntax
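The estimates above are consistent with multiplying each action count by the matching energy-per-action entry in the ERT and summing per component. The Python sketch below illustrates that arithmetic only; it is not Accelergy's code. The GLB and MAC energies and all counts come from the slides; the buffer read energy is elided ("…") on the ERT slide, so the value 10 used here is an assumption chosen for illustration. Action arguments (data_delta, address_delta) are omitted for brevity, and `estimate_energy` is a hypothetical helper.

```python
# Energy per action, keyed by component instance name.
ert = {
    "Design.GLB": {"read": 100},         # from the ERT slide
    "Design.PE.Buffer": {"read": 10},    # ASSUMPTION: elided on the slide
    "Design.PE.MAC": {"mac_random": 5},  # from the ERT slide
}

# Action counts, as annotated on the action-counts slide.
action_counts = {
    "Design.GLB": {"read": 10},
    "Design.PE.Buffer": {"read": 800},
    "Design.PE.MAC": {"mac_random": 400},
}


def estimate_energy(ert, action_counts):
    """Sum count * energy-per-action over every action of each component."""
    return {
        component: sum(ert[component][action] * count
                       for action, count in actions.items())
        for component, actions in action_counts.items()
    }


estimates = estimate_energy(ert, action_counts)
for component, energy in estimates.items():
    print(f"{component}: {energy}")
```

Because the ERT depends only on the architecture and the counts depend only on the workload, either side can be regenerated without touching the other.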


Exercise 2: Generate Energy Estimation for Simple Architecture

• Instructions:

– cd ../02_primitive_architecute_estimation

– accelergy *.yaml --enable_flattened_arch 1 -o output/

– estimation.yaml will be generated in the current directory

• Questions:

– Did Accelergy run the ERT generator? Why?

– Is energy calculation faster than ERT generation?

– Would adding the architecture description to this directory cause Accelergy to rerun the ERT generator?

– Why is separating the two parts good?


Modeling Complicated Designs

• Practical architecture designs involve many more details

– Example: storage units with local address generators (AG)*

o AG_buffer_SRAM is an abstract hierarchy

o Buffer is an instance of the SRAM primitive class

o AGs are instances of the counter primitive class

[Figure: AG_buffer_SRAM: AG[0] (counter) supplies the read address and AG[1] (counter) supplies the write address to the buffer (SRAM), which has data in and data out]

*There are much more complicated storage idioms for explicit data orchestration, e.g., buffet [ASPLOS 2019]


Modeling Complicated Designs

• Practical architecture designs involve many more details

– Example: storage units with local address generators (AG)

o AG_buffer_RF is an abstract hierarchy

o AGs are instances of the counter primitive class

o Buffer is an instance of the regfile primitive class

[Figure: AG_buffer_RF: AG[0] (counter) supplies the read address and AG[1] (counter) supplies the write address to the buffer (regfile), which has data in and data out]


Modeling Complicated Designs

• Practical architecture designs involve many more details

– Example: storage units with local address generators (AG)

[Figure: Design with a 1024B GLB and a PE containing a 128B buffer and a 16b MAC; the GLB is expanded into AG_buffer_SRAM and the PE buffer into AG_buffer_RF]


Modeling Complicated Designs

• Practical architecture designs involve many more details

– Example: storage units with local address generators (AG)

[Figure: Design with a 1024B GLB and a PE containing a 128B buffer and a 16b MAC; the MAC is expanded into a MAC followed by a FIFO]

Add an output FIFO to improve MAC performance


Modeling Complicated Designs

[Diagram: the Accelergy flow: the ERT Generator (with the Primitive Component Library and the CACTI and 40nm estimation plug-ins) produces the ERT (YAML); the Energy Calculator combines it with Action Counts (YAML) into Energy Estimations (YAML)]

• Architecture description is tedious

• Modifications to the architecture are hard to make

• Action counts are even more tedious

• A small modification to the architecture requires new action counts


Modeling Complicated Designs

• Existing work that aims to succinctly model complicated architectures

– Wattch, McPAT, PowerTrain

[Figure: CPU-Centric Architecture Model built from ALU, L1 $, L2 $, ROB, …]

Use a fixed set of compound components to model the architecture

Compound components: components that can be decomposed into lower-level components


The fixed set of compound components is not sufficient to describe arbitrary accelerator designs


Accelergy: Succinctly Model Arbitrary Designs

[Figure: AG_buffer_SRAM (buffer: SRAM; AG[0], AG[1]: counter) and AG_buffer_RF (buffer: regfile; AG[0], AG[1]: counter)]

Previous: Abstract Hierarchies


Accelergy: Succinctly Model Arbitrary Designs

• Allows users to describe design-specific compound components

[Figure: AG_buffer_SRAM (buffer: SRAM; AG[0], AG[1]: counter) and AG_buffer_RF (buffer: regfile; AG[0], AG[1]: counter)]

Now: User-Defined Compound Components


Accelergy: Succinctly Model Arbitrary Designs

• Allows the architecture to be described with compound components

[Figure: Previous: architecture described with primitive components (Design with a 1024B GLB and a PE containing a 128B buffer and a 16b MAC). Now: architecture described with compound components (the GLB as AG_buffer_SRAM and the PE buffer as AG_buffer_RF)]


Accelergy: Succinctly Model Arbitrary Designs

[Diagram: the Accelergy flow, with an Architecture Description and a Compound Component Description as inputs to the ERT Generator (with the Primitive Component Library and the CACTI and 40nm estimation plug-ins); the Energy Calculator combines the ERT (YAML) with the action counts below (generated by performance models, e.g., a cycle-level simulator) into Energy Estimations (YAML)]

Comp. Name     Action Name    Action Counts
1024B GLB      read_c(1,1)    10
128B Buffer    read_c(1,1)    800
16b MAC        random MAC     400

Compound actions defined by the user


How to use Accelergy?

1. Estimate architectures with primitive components

2. Estimate architectures with compound components


Accelergy: Succinctly Model Arbitrary Designs

[Diagram: Architecture Description (YAML), Compound Component Description (YAML), and Primitive Component Library (YAML) feed the ERT Generator, which uses the CACTI and 40nm estimation plug-ins to produce the ERT (YAML); the Energy Calculator combines the ERT with Action Counts (YAML, generated by performance models, e.g., a cycle-level simulator) into Energy Estimations (YAML)]


Exercise 3a

• Instructions:

– cd ../03_compound_architecute_estimation

• Questions:

– Does the architecture description with compound components follow the previous general format?

– Do the action counts follow the previous general format?

The architecture description and the action counts follow the same general format


Accelergy: Succinctly Model Arbitrary Designs

• How to define a compound component?

[Figure: AG_buffer_SRAM contains buffer (SRAM), AG[0] (counter), and AG[1] (counter); drawn as a two-level tree, AG_buffer_SRAM has children AGs[0..1] and buffer]

Define a set of subcomponents

YAML Syntax

name: AG_buffer_SRAM
attributes:
  size: 1024
  …
subcomponents:
  - name: buffer
    class: SRAM
    attributes:
      size: size
      …
  - name: AGs[0..1]
    class: counter
    attributes:
      width: log(size)


Accelergy: Succinctly Model Arbitrary Designs

• How to define a compound component?

[Figure: AG_buffer_SRAM contains buffer (SRAM), AG[0] (counter), and AG[1] (counter); drawn as a two-level tree, AG_buffer_SRAM (size: 1024) has children AGs[0..1] (width: log(size)) and buffer (size: size)]

YAML Syntax

name: AG_buffer_SRAM
attributes:
  size: 1024
  …
subcomponents:
  - name: buffer
    class: SRAM
    attributes:
      size: size
      …
  - name: AGs[0..1]
    class: counter
    attributes:
      width: log(size)

Define a set of attributes
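One way to read the attribute bindings above: each subcomponent attribute is either a literal or an expression over the compound component's own attributes, and a name such as AGs[0..1] stands for a range of instances. The Python sketch below illustrates that reading under stated assumptions and is not Accelergy's parser: the helpers `expand_names`, `resolve`, and `instantiate` are hypothetical, `log` is assumed to mean log base 2, and the attributes elided ("…") on the slide are simply left out.

```python
import math
import re

# The compound component from the slide, minus the elided attributes.
compound = {
    "name": "AG_buffer_SRAM",
    "attributes": {"size": 1024},
    "subcomponents": [
        {"name": "buffer", "class": "SRAM",
         "attributes": {"size": "size"}},
        {"name": "AGs[0..1]", "class": "counter",
         "attributes": {"width": "log(size)"}},
    ],
}


def expand_names(name):
    """Expand 'AGs[0..1]' into ['AGs[0]', 'AGs[1]']; other names pass through."""
    match = re.fullmatch(r"(\w+)\[(\d+)\.\.(\d+)\]", name)
    if not match:
        return [name]
    base, lo, hi = match.group(1), int(match.group(2)), int(match.group(3))
    return [f"{base}[{i}]" for i in range(lo, hi + 1)]


def resolve(value, env):
    """Resolve a literal, a bare attribute name, or log(<attribute name>)."""
    if not isinstance(value, str):
        return value
    call = re.fullmatch(r"log\((\w+)\)", value)
    if call:
        # ASSUMPTION: log is base 2, giving the address width in bits.
        return math.ceil(math.log2(env[call.group(1)]))
    return env[value]


def instantiate(compound):
    """Return {instance_name: {class, attributes}} with attributes resolved."""
    env = compound["attributes"]
    instances = {}
    for sub in compound["subcomponents"]:
        attrs = {k: resolve(v, env) for k, v in sub["attributes"].items()}
        for name in expand_names(sub["name"]):
            instances[name] = {"class": sub["class"], "attributes": attrs}
    return instances


instances = instantiate(compound)
for name, info in instances.items():
    print(name, info)
```

Binding subcomponent attributes to the top-level `size` means a single change at the compound level propagates to both the buffer and the address generators.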


Accelergy: Compound Component Description

• How to define a compound component?

[Figure: two-level tree: the compound action AG_buffer_SRAM: read(addr_delta, data_delta) decomposes into buffer: read(addr_delta, data_delta) and AGs[0]: count]

YAML Syntax

name: AG_buffer_SRAM
actions:
  - name: read
    arguments:
      addr_delta: 0..1
      data_delta: 0..1
    subcomponents:
      - name: buffer
        actions:
          - name: read
            arguments:
              addr_delta: addr_delta
              data_delta: data_delta
      - name: AGs[0]
        actions:
          - name: count

Define a set of compound actions
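The compound action definition can be read as a recipe: one compound read expands into a buffer read with the same arguments plus one count on AGs[0], and its energy is the sum of those primitive energies. The Python sketch below illustrates this under stated assumptions; it is not Accelergy's implementation. The helpers `expand_action` and `compound_energy` are hypothetical, and both energy values in `primitive_ert` are made up for the example.

```python
# The compound "read" action from the slide, in a hypothetical dict form.
compound_read = {
    "name": "read",
    "subcomponents": [
        {"name": "buffer",
         "actions": [{"name": "read",
                      "arguments": {"addr_delta": "addr_delta",
                                    "data_delta": "data_delta"}}]},
        {"name": "AGs[0]", "actions": [{"name": "count"}]},
    ],
}

# ASSUMPTION: illustrative primitive energies, not values from the tutorial.
primitive_ert = {
    ("buffer", "read", (("addr_delta", 1), ("data_delta", 1))): 100,
    ("AGs[0]", "count", ()): 2,
}


def expand_action(compound_action, call_args):
    """List the (subcomponent, action, bound arguments) a compound action triggers."""
    primitive = []
    for sub in compound_action["subcomponents"]:
        for action in sub["actions"]:
            # Bind each primitive argument to the compound argument it names.
            bound = {arg: call_args[source]
                     for arg, source in action.get("arguments", {}).items()}
            primitive.append((sub["name"], action["name"], bound))
    return primitive


def compound_energy(compound_action, call_args, ert):
    """Energy of one compound action = sum of its primitive action energies."""
    total = 0
    for component, action, args in expand_action(compound_action, call_args):
        total += ert[(component, action, tuple(sorted(args.items())))]
    return total


call = {"addr_delta": 1, "data_delta": 1}
expanded = expand_action(compound_read, call)
energy = compound_energy(compound_read, call, primitive_ert)
print(expanded)
print(energy)
```

With this decomposition, the performance model only needs to count compound actions such as read(1,1); the expansion into primitive actions happens once, when the ERT is generated.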


Exercise 3b

• Instruction:

– accelergy *.yaml components/*.yaml -o output/

• Questions:

– Are the ERTs in terms of compound components or primitive components? Why?

– Compare the current ERT with the previous ERT. What is different?

– If we change the hardware implementation of a compound component, e.g., add a FIFO to AG_buffer_SRAM, what needs to be changed?


Energy Evaluations on Eyeriss

• Experimental Setup:

– Workload: AlexNet weights & ImageNet input feature maps

– Ground Truth: Energy obtained from post-layout simulations

[Figure: Eyeriss Architecture: GLBs (Weights GLB, Shared GLB), a 12x14 PE array, and four NoCs (WeightsNoC, IfmapNoC, PsumWrNoC, PsumRdNoC). Each PE instance contains weights_spad, ifmap_spad, psum_spad, and a MAC]

Ifmap = input feature map; Psum = partial sum; PE = processing element; *_spad = *_scratchpad

Zero-gating optimization: if there is a 0 ifmap data

• Gate on reading the weights data => gated-read

• Gate on computing the MAC => gated-MAC
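The effect of zero-gating on an energy estimate can be sketched with a toy model: when an ifmap value is zero, the weight read and the MAC are charged at their gated cost instead of their full cost. The Python sketch below is only an illustration. Every energy value is made up (arbitrary integer units), the action names `gated_read` and `mac_gated` are assumptions, and `pe_energy` is a hypothetical helper that ignores the ifmap and psum scratchpads.

```python
# ASSUMPTION: all energy-per-action values are invented for illustration.
PE_ERT = {
    "weights_spad": {"read": 40, "gated_read": 5},
    "MAC": {"mac_random": 50, "mac_gated": 3},
}


def pe_energy(num_ifmaps, zero_fraction, fine_grained=True):
    """Energy of weight reads plus MACs for num_ifmaps ifmap values.

    With fine_grained=False, every access is charged the full cost,
    which mimics an estimator that has no gated action types.
    """
    zeros = round(num_ifmaps * zero_fraction)
    nonzeros = num_ifmaps - zeros
    full = PE_ERT["weights_spad"]["read"] + PE_ERT["MAC"]["mac_random"]
    gated = PE_ERT["weights_spad"]["gated_read"] + PE_ERT["MAC"]["mac_gated"]
    if not fine_grained:
        return num_ifmaps * full
    return nonzeros * full + zeros * gated


dense = pe_energy(1000, 0.0)
sparse = pe_energy(1000, 0.5)
coarse = pe_energy(1000, 0.5, fine_grained=False)
print(dense, sparse, coarse)
```

In this toy model the fine-grained estimate drops as sparsity rises, while the coarse-grained one stays flat regardless of the data.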


Experimental Results

• The total energy estimate is 95% accurate relative to the post-layout energy.

• The estimated relative breakdown of the important units in the design is within 8% of the post-layout energy.


Experimental Results

• Comparisons with existing work: Aladdin and fixed-cost

Design-specific energy estimators:

Aladdin [Shao et al., ISCA 2014]

• Lacks fine-grained action types, e.g., gated-read

fixed-cost [Yang et al., Asilomar 2017]

• Designed specifically for Eyeriss analysis

• Lacks fine-grained component characterizations, e.g., all local buffers are characterized as identical components


Experimental Results

• Comparisons with existing work: Aladdin and fixed-cost

Total Energy of PE Array (normalized to ground truth): Accelergy 95%, Aladdin 88%, fixed-cost 78%


Experimental Results

• Comparisons with existing work: Aladdin and fixed-cost

[Figure: Energy Breakdown of PEs across the Array. Y-axis: Energy Consumption, 0.10 to 0.26; x-axis: PEs that process data of different sparsity. Series: ground truth, Accelergy, Aladdin, fixed-cost]

Total Energy of PE Array (normalized to ground truth): Accelergy 95%, Aladdin 88%, fixed-cost 78%


Experimental Results

• Comparisons with existing work: Aladdin and fixed-cost

[Figure: Energy Breakdown of PEs across the Array. Y-axis: Energy Consumption, 0.10 to 0.26; x-axis: PEs that process data of different sparsity. Series: ground truth, Accelergy, Aladdin, fixed-cost]

Not sensitive to sparsity, as the zero-gating optimization is not recognized


Experimental Results

• Comparisons with existing work: Aladdin and fixed-cost

[Figure: Energy Breakdown of PEs across the Array. Y-axis: Energy Consumption, 0.10 to 0.26; x-axis: PEs that process data of different sparsity. Series: ground truth, Accelergy, Aladdin, fixed-cost]

Missing component characterizations


Experimental Results

• Comparisons with existing work: Aladdin and fixed-cost

[Figure: Energy Breakdown of components inside a PE. Y-axis: Energy Consumption (nJ), 0 to 100; x-axis: ifmap_spad, psum_spad, weights_spad, MAC. Series: ground truth, Accelergy, Aladdin, fixed-cost]

Zero-gating action type not reflected


Experimental Results

• Comparisons with existing work: Aladdin and fixed-cost

[Figure: Energy Breakdown of components inside a PE. Y-axis: Energy Consumption (nJ), 0 to 100; x-axis: ifmap_spad, psum_spad, weights_spad, MAC. Series: ground truth, Accelergy, Aladdin, fixed-cost]

All local scratchpads share the same ERT


Exercise 4

• Instructions

– cd ../04_eyeriss_like

– Run accelergy *.yaml components/*.yaml -o output/

• Questions:

– What compound components are involved in the architecture?

– What are the subcomponents of each compound component?

– What actions in the ERT can help better characterize data-gating?

– Which components in the PE are sensitive to data sparsity? How do you know?


Free Lab Session Timeloop + Accelergy


Have fun exploring both tools

• Go to the timeloop + accelergy exercise folder

– The README contains instructions for various experiments!

– Run a mapping exploration with an Eyeriss-like architecture that is described by compound components

– Run a mapping exploration with a floating-point Eyeriss-like architecture. Which component's energy increases the most?

– Try different DRAM types. Modify the architecture description to explore the energy impact of DRAM technologies.