PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning

Presented by Nils Weller

Hardware Acceleration for Data Processing Seminar, Fall 2017

Purpose:

- Processing-in-Memory (PIM) architecture to accelerate Convolutional Neural Networks (CNNs)

- Based on novel resistive memory (ReRAM) technology

- Incremental improvement on prior works

Background: CNNs

Goal: Classify image contents

Image: http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/

Not shown: nonlinear activation function after convolution

Main layer type: Convolution

Convolution operation

Figure: an input image is combined with a filter matrix; each value of the output feature map is a dot product. Image: Burger, W. (2016): Digital Image Processing. An Algorithmic Introduction Using Java.

Traditional image processing: fixed filter weights, e.g. a vertical Sobel filter

CNNs: learned weights for the kernel
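The convolution operation described above can be sketched in a few lines. This is only an illustration: the function name `filter2d`, the toy image, and the choice of the Sobel variant that responds to vertical edges are assumptions, not taken from the slides. As is common in CNNs, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def filter2d(image, kernel):
    """Slide `kernel` over `image`; each output value is one dot product."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * kernel)  # dot product of patch and filter
    return out

# Assumed fixed filter: Sobel kernel that responds to vertical edges.
sobel = np.array([[-1, 0, 1],
                  [-2, 0, 2],
                  [-1, 0, 1]], dtype=float)

# Toy 5x5 image: dark on the left, bright on the right (one vertical edge).
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

feature_map = filter2d(image, sobel)
print(feature_map)
```

Each row of the resulting 3x3 feature map is `[0, 4, 4]`: the filter responds only where the edge lies inside its window. In a CNN the nine entries of `sobel` would be learned weights instead of fixed constants.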

Two phases:

1. Training
2. Testing (= first half of training: the forward pass only)

Phase 1: Training

Image: http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/

Label: boat

Process image

True value (label): dog (0), cat (0), boat (1), bird (0)

E(output)

Backpropagate error (gradient descent method):
- Calculate error contribution for layers
- Update weights to reduce error
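A minimal sketch of one such training step, assuming a single fully connected layer with softmax output and cross-entropy error as a stand-in for the whole network. The feature vector, learning rate, and all function names are hypothetical; only the four classes and the one-hot label come from the slide.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(probs, label):
    return -np.sum(label * np.log(probs))

classes = ["dog", "cat", "boat", "bird"]
label = np.array([0.0, 0.0, 1.0, 0.0])        # true value: boat

rng = np.random.default_rng(0)
features = rng.normal(size=8)                  # stand-in for the processed image
features /= np.linalg.norm(features)
weights = rng.normal(scale=0.1, size=(4, 8))   # assumed initial weights

def training_step(weights, features, label, lr=0.5):
    probs = softmax(weights @ features)        # forward pass ("process image")
    error = cross_entropy(probs, label)        # E(output)
    grad = np.outer(probs - label, features)   # error contribution per weight
    return weights - lr * grad, error          # gradient descent update

errors = []
for _ in range(5):
    weights, error = training_step(weights, features, label)
    errors.append(error)
print(errors)
```

The recorded error shrinks with every step, which is the intended effect of updating the weights against the gradient.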

Summary:

- Large amounts of data
- Acceleration desirable
- Particularly for training

- Simple core operations (matrix/dot product)
- Opportunities for parallelization (single- or multi-image)
- Non-trivial training process
  - Error computations
  - Dependencies on intermediate results

Background: Resistive RAM (ReRAM)

1971: Theory of “Fourth Fundamental Circuit Element” (Leon Chua)

Figure: the four fundamental circuit elements of electrical network theory (resistor, capacitor, inductor, memristor). Image: Wikipedia

Memristor = Memory + Resistance:

- Passive element
- Resistance depends on charge passed through it
- Enabling inherent computational capabilities

→ No separate processing units

2008: Strukov et al. (HP Labs): The missing memristor found. In: Nature

Discovery in molecular electronics:
- Memristor-like behavior through metal-oxide structures
- Enabled through flow of oxygen atoms

Since then:
- Resistive memory designs and prototypes
- Research in Processing-in-Memory with resistive memories

Hu et al. (2016): Dot-Product Engine for Neuromorphic Computing: Programming 1T1M Crossbar to Accelerate Matrix-Vector Multiplication

- Input voltages are accumulated as summed currents (Kirchhoff’s law)
- Resistance (more precisely, conductance) of memristors acts as weight
- Parallel processing!

Figure labels: feedback resistance, conductance matrix
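The crossbar principle can be illustrated with an idealized numerical model. This sketch assumes perfectly linear memristors, no wire resistance, and an ideal current-to-voltage conversion through the feedback resistance (sign ignored); the function name and all values are made up for illustration.

```python
import numpy as np

def crossbar_output(v_in, conductance, r_feedback):
    """Idealized crossbar: rows carry input voltages, columns sum currents."""
    # Ohm's law per cell (I = V * G), Kirchhoff's current law per column.
    column_currents = v_in @ conductance
    # The feedback resistance turns each column current into an output voltage.
    return r_feedback * column_currents

v_in = np.array([0.1, 0.2, 0.3])                  # input voltages (V)
conductance = np.array([[1e-4, 2e-4],
                        [3e-4, 4e-4],
                        [5e-4, 6e-4]])            # weights as conductances (S)
v_out = crossbar_output(v_in, conductance, r_feedback=1e4)
print(v_out)
```

All columns are evaluated in the same step, which is where the parallelism comes from: the whole matrix-vector product takes one analog read.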

Naive

- Assumes linear memristor conductance
- Ignores circuit parasitics

→ More things to consider, but the basic idea is sound

ReRAM-based PIM architecture

Building a complete ReRAM system from building blocks:

- HW structures for real CNN processing
- Programmable for different CNNs
- Process real benchmarks

No training support

- Doesn’t do CNNs
- Claim: pipeline design not suitable for training due to stalls
- Claim: ADC/DAC overhead could be improved

Side note

Full CNN processing introduces further practical issues:

1. Computations are analog – errors will occur
2. Some CNN layers cannot be computed with ReRAM

AlexNet, 2012:

2015: CNNs without LCN (contrast normalization layer) shown to work just as well

Empirical results: NNs are resilient to errors

PipeLayer: Architecture

Main considerations:

1. Training support
2. Intra-layer parallelism
3. Inter-layer parallelism

1. Training support

Figure 3: PipeLayer configured for training

Figure annotations:

- Intermediate memory (memory subarray)
- Computation and weight storage (morphable subarray)
- Training label
- Partial derivative for weight (averaged)

Concept of batching:
- Process batch of images with fixed weights
- Update weights after batch

→ Reduce update overhead
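The batching idea can be sketched as follows, assuming a single linear neuron with squared error in place of a CNN. The error function, learning rate, and all names are hypothetical; the point is only that the partial derivatives are averaged over the batch while the weights stay fixed, and a single update follows.

```python
import numpy as np

def gradient(weights, x, target):
    """Partial derivatives of the squared error 0.5 * (w.x - target)**2."""
    return (weights @ x - target) * x

def train_batch(weights, batch, lr=0.1):
    grad_sum = np.zeros_like(weights)
    for x, target in batch:
        # Weights are not modified while the batch is being processed.
        grad_sum += gradient(weights, x, target)
    averaged = grad_sum / len(batch)
    return weights - lr * averaged  # one weight update per batch

batch = [(np.array([1.0, 0.0]), 1.0),   # "image 1" of a batch of size 2
         (np.array([0.0, 1.0]), 3.0)]   # "image 2"
new_weights = train_batch(np.zeros(2), batch)
print(new_weights)
```

With a batch of size B the weights are written once instead of B times, which is the reduced update overhead the slide refers to.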

Process image 1 of a batch of size 2 (ignoring parallelism)

Process image 2 of a batch of size 2 (ignoring parallelism)

Batch complete - Weight update

Image unclear:
- Weight update path not shown
- Text references nonexistent “b” derivatives

2. Intra-layer parallelism

Basic crossbar array matrix-vector computation scheme

Added complexity:
- Process batch of images in one go
- Use multiple kernels

Figure: computation scheme without parallelism

- Duplicate processing structure for parallelism
- Break up computation arrays due to HW size constraints

Figure: computation scheme without parallelism vs. with parallelism
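Both ideas, splitting a large weight matrix into array-sized tiles and duplicating the structure so several inputs are handled at once, can be sketched in software. The tile size, the number of copies, and the function names are assumptions for illustration; the inner loop over copies runs sequentially here but stands for structures that would work concurrently in hardware.

```python
import numpy as np

def tiled_matvec(weights, x, tile=4):
    """Matrix-vector product computed tile by tile (one tile = one array)."""
    rows, cols = weights.shape
    out = np.zeros(cols)
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            block = weights[r0:r0 + tile, c0:c0 + tile]
            out[c0:c0 + tile] += x[r0:r0 + tile] @ block  # partial sums add up
    return out

def parallel_layer(weights, inputs, copies=2, tile=4):
    """Process `copies` input vectors per step using duplicated structures."""
    results = [None] * len(inputs)
    steps = 0
    for start in range(0, len(inputs), copies):
        for k, x in enumerate(inputs[start:start + copies]):
            results[start + k] = tiled_matvec(weights, x, tile)
        steps += 1
    return np.array(results), steps

rng = np.random.default_rng(1)
weights = rng.normal(size=(10, 6))
inputs = rng.normal(size=(5, 10))
outputs, steps = parallel_layer(weights, inputs, copies=2)
print(steps)
```

The tiled result matches the plain product `inputs @ weights`, and five input vectors need three steps with two copies instead of five steps with one.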

3. Inter-layer parallelism

Conceptually:

Figure: images img1 to img4 enter the layer pipeline one after another; each new image enters while the earlier ones advance to later layers.

Implications:
- Need to buffer multiple intermediate results for later use
- Weight update requires pipeline flush (does it really?)

Paper seems to agree on flush/stall:

- Last image before update (gap of 2L+1 cycles)
- Update looks larger, but is only 1 cycle

… but:

How is this pipeline design superior to ISAAC’s?
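The cost of the flush can be estimated with a small cycle count. This is a back-of-the-envelope sketch under stated assumptions: every image needs 2L+1 cycles for its forward and backward pass through L layers (the gap quoted above), one new image can enter the pipeline per cycle, and the weight update takes 1 cycle. The function names are hypothetical.

```python
def cycles_per_batch_pipelined(num_layers, batch_size):
    per_image = 2 * num_layers + 1
    # Image k (0-based) enters in cycle k; the last one enters in cycle B-1
    # and leaves the pipeline per_image cycles later.
    last_image_done = (batch_size - 1) + per_image
    return last_image_done + 1          # plus 1 cycle for the weight update

def cycles_per_batch_sequential(num_layers, batch_size):
    # No overlap: each image occupies the hardware for its full 2L+1 cycles.
    return batch_size * (2 * num_layers + 1) + 1

L, B = 3, 4
print(cycles_per_batch_pipelined(L, B), cycles_per_batch_sequential(L, B))
```

Under these assumptions a batch costs 2L+B+1 cycles with pipelining, so the flush penalty of roughly 2L cycles is paid once per batch and matters less the larger the batch is.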

PipeLayer: Implementation

Activation function component

Typical division into memory-only + memory/computation areas

Spike coding driver (for energy/area reduction): input to weighted spikes conversion

Spike coding: analog input to “digital” spike sequence without ADC. Output spike count = accumulated input * weight

… details like error propagation not visualized
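A toy model of the spike-coding idea, under simplifying assumptions that are not taken from the slides: inputs and weights are non-negative integers, each input bit becomes one spike slot whose significance is its bit position, and an integrate-and-fire stage emits one output spike per unit of accumulated charge. All names and the threshold are hypothetical.

```python
def to_weighted_spikes(value, bits):
    """Bit t of `value` becomes a spike (1) or no spike (0) in time slot t."""
    return [(value >> t) & 1 for t in range(bits)]

def integrate_and_fire(inputs, weights, bits=4, threshold=1):
    """Count output spikes produced by accumulating input * weight."""
    spike_trains = [to_weighted_spikes(x, bits) for x in inputs]
    charge = 0
    spike_count = 0
    for t in range(bits):
        # The crossbar column sums spike * weight over all inputs in slot t.
        current = sum(train[t] * w for train, w in zip(spike_trains, weights))
        charge += current * (2 ** t)     # slot t carries significance 2**t
        while charge >= threshold:       # fire and reset by one threshold unit
            charge -= threshold
            spike_count += 1
    return spike_count

inputs = [3, 5, 2]     # each must fit in `bits` bits
weights = [2, 1, 4]
print(integrate_and_fire(inputs, weights))
```

With a threshold of 1 the spike count equals the dot product of inputs and weights (19 here), so a simple counter can replace an ADC in this idealized setting.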

PipeLayer: Discussion

- Limited ReRAM precision
- Previous works showed NNs to tolerate errors well

PipeLayer: Evaluation

- Large improvements vs. reference GPU
- Architecture is simulated (could results be impaired?)

Summary

The work:

- Successful design of ReRAM-based memory architecture for PIM
- Good improvements in test setup
- Support for training is new (but not a groundbreaking idea)

The paper:
- Sensibly structured
- Appropriate drawings
- Many implicit assumptions; reasoning for claims often missing
- Many grammatical errors

Take-aways

1. The work is made possible by progress in an interesting combination of fields

- 1971: Memristor
- 1990s: Initial PIM concepts
- 2008: Molecular electronics
- 2012: AlexNet CNN
- 2015: Good CNNs without contrast normalization layer

→ ReRAM-based CNN accelerators

2. Various optimization techniques mentioned in this seminar are used
- Hardware acceleration / PIM
- Various layers of parallelism
- Precision-speed trade-offs

Thanks for your time!

Questions?