Emerging NVM Memory Technologies
Yuan Xie, Associate Professor
The Pennsylvania State University, Department of Computer Science & Engineering
www.cse.psu.edu/~yuanxie




Position Statement

Emerging NVMs are very attractive: combining the speed of SRAM, the density of DRAM, and the non-volatility of Flash memory.

Attractive features: high density, low leakage, non-volatile

Undesirable features:
Write-related: long write latency, high write energy, low endurance (e.g. PCRAM)
Cost (needs large-volume production)

Solution: Hybrid cache/mem/storage + 3D?

Enabling unique applications


Outline

Introduction
Modeling
  MRAM/PCRAM modeling
Architecture
  MRAM stacking
  HCA: Hybrid Cache Architecture
  Hybrid storage system
Application
  Exascale computing

Conclusion


Traditional Memory Hierarchies

Latency (cycles):
On-chip memory (SRAM): 1~30
Off-chip memory (DRAM): 100~300
Solid State Disk (Flash Memory): 25000~2000000
Secondary Storage (HDD): >5000000

Large Latency Gap


Emerging Memory Technologies

FeRAM (Ferroelectric RAM)

MRAM (Magnetic RAM)

Memristor (Resistive RAM)

PCRAM (Phase-Change RAM)

Toshiba FeRAM (2009)
HP Labs Memristor (2009)
Samsung PCRAM (2008)
EverSpin MRAM (2008)




NVRAM Comparison

Courtesy: Motoyuki Ooishi

FeRAM, MRAM, and PCRAM each combine the advantages of SRAM, DRAM, and flash.

Good opportunity to rethink the memory hierarchy design.


Traditional Memory Hierarchies

Latency (cycles):
On-chip memory (SRAM): ~10
Off-chip memory (DRAM): ~100
Solid State Disk (SSD): 25000~2000000
Secondary Storage (HDD): >5000000

Emerging Non-volatile Memory (NVM):
Phase-change RAM (PCRAM)
Magnetic RAM (MRAM)

What is the impact of emerging NVM technologies on computer memory hierarchies?


PCRAMsim Model

Developed on the basis of CACTI
CACTI models SRAM and DRAM caches
CACTI does NOT support PCRAM.

[Figure: CACTI-modeled memory subarray. Memory cells: 2D array of memory cells. Peripheral circuitry: Row Decoders, Wordline Drivers, Precharge & Equalization, Bitline Mux, Sense Amplifiers, Sense Amplifier Mux, Output/Write Drivers]

PCRAMsim made 3 modifications at the subarray level.


SRAM vs. MRAM

                 SRAM       MRAM
Area (65nm)      3.66mm^2   3.30mm^2
Capacity         128KB      512KB
Read latency     2.25ns     2.32ns
Write latency    2.26ns     11.02ns
Read energy      0.90nJ     0.86nJ
Write energy     0.80nJ     5.00nJ

MRAM: High Density, Low Leakage; Fast Read, Slow Write; Low Read Energy, High Write Energy

Pros: Low leakage power, high density. Cons: Long write latency and large write energy.

Cache configurations          Leakage power
2MB (16x128KB) SRAM cache     2.09W
8MB (16x512KB) MRAM cache     0.26W

Replace SRAM caches with MRAM?


Direct Replacement

Replace SRAM with MRAM of the same area.
The number of banks is kept the same.
The capacity of the L2 cache increases by 4X.

L2 cache miss rate reduced. How is the performance?

L2 Cache Miss Rate


IPC Comparison (Direct Replacement)

The last four benchmarks have high write intensities (see Observation 1).

IPC (SRAM vs. MRAM)


Observation 1

Replacing SRAM L2 caches directly with MRAM can reduce the access miss rate of L2 caches.

However, the long access latency to MRAM cache has a negative impact on the performance.

When the write intensity is high, it even results in performance degradation.

Direct MRAM replacement may harm performance. How is power consumption?


Power Analysis (Direct Replacement)

For some workloads, MRAM dynamic power dominates! (see Observation 2)

[Figure: Total Power (SRAM vs. MRAM), normalized to 2M-SRAM-SNUCA; callouts: MRAM leakage power, MRAM dynamic power]


Observation 2

Replacing SRAM L2 caches directly with MRAM can greatly reduce the leakage power.

When the write intensity is high, the dynamic power increases significantly because of the high write energy of MRAM cache.
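Observation 2 can be illustrated with the per-access energies from the SRAM vs. MRAM table. The sketch below is only a rough model, not the evaluation behind the slides: it assumes dynamic energy is a weighted average of read and write energy, ignores that the two banks differ in capacity (128KB vs. 512KB), and ignores any change in miss rate. The function names and the `write_fraction` parameter are hypothetical.

```python
# Per-access energies in nJ, copied from the SRAM vs. MRAM table.
ENERGY_NJ = {
    "SRAM": {"read": 0.90, "write": 0.80},
    "MRAM": {"read": 0.86, "write": 5.00},
}


def energy_per_access_nj(tech: str, write_fraction: float) -> float:
    """Average dynamic energy per access for a given fraction of writes."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    e = ENERGY_NJ[tech]
    return (1.0 - write_fraction) * e["read"] + write_fraction * e["write"]


def break_even_write_fraction() -> float:
    """Write fraction at which MRAM and SRAM cost the same energy per access."""
    sram, mram = ENERGY_NJ["SRAM"], ENERGY_NJ["MRAM"]
    read_saving = sram["read"] - mram["read"]      # MRAM saves this on each read
    write_penalty = mram["write"] - sram["write"]  # MRAM costs this extra on each write
    return read_saving / (read_saving + write_penalty)


if __name__ == "__main__":
    for w in (0.0, 0.1, 0.5):
        print(w, energy_per_access_nj("SRAM", w), energy_per_access_nj("MRAM", w))
    print("break-even write fraction:", break_even_write_fraction())
```

Under these assumptions the break-even point is below 1% writes, which is consistent with dynamic power dominating for write-intensive workloads.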

Question: How to improve the performance and further reduce power of MRAM?


SRAM-MRAM Hybrid L2 Cache

Using a hybrid L2 cache, MRAM write intensities are reduced.

[Figure: Write Intensity (Pure vs. Hybrid)]


IPC Result

The performance degradation is eliminated. The average IPC is increased by 15%.

[Figure: IPC Comparison; legend labels: direct replacement, with read-preemptive]


Power Result

The dynamic power is reduced. The average total power is further reduced by 17%.

[Figure: Total Power Comparison; legend labels: 8M-MRAM-DNUCA, direct replacement, with read-preemptive]


Comparisons

                  SRAM        eDRAM      MRAM                            PRAM
Density (ratio)   Low (1)     High (4)   High (4)                        High (16)
Dynamic Power     Low         Medium     Low for read; High for write    Medium for read; High for write
Leakage Power     High        Medium     Low                             Low
Speed             Very Fast   Fast       Fast for read; Slow for write   Slow for read; Very slow for write
Non-volatility    No          No         Yes                             Yes
Scalability       Yes         Yes        Yes                             Yes
Endurance         10^16       10^16      >10^15                          10^8

Callouts: Reduce cache miss rate, increase hit latency. Low leakage power, high dynamic power.


No such “Ideal” (One-size-fits-all) Memory

[Figure: Normalized IPC per benchmark, with geometric mean, for 1M-SRAM, 4M-DRAM, 4M-MRAM, and 16M-PRAM]

[Figure: Normalized Power per benchmark, with geometric mean, split into Static and Dynamic; labeled values: 1.88, 1.89]

No single memory technology has the best power-performance.

Hybrid Cache may outperform its counterpart of single technology.


HCA: Hybrid Cache Architecture

Baseline: a 2D 8-core CMP (3-level SRAM Caches)

[Figure: hybrid cache design scenarios; panels labeled LHCA, RHCA, and 3DHCA]

Panel titles:
2D design scenario
3D design scenario (a cache design scenario with 3D chip integration; 3D Layer 1, 3D Layer 2)
Flattening L2 and L3 with hybrid cache
Flattening L3 and L4 with hybrid cache
Flattening L2, L3 and L4 with hybrid cache

Cache stacks shown (each under Core w/ L1s):
L2 (SRAM), L3 (eDRAM/MRAM/PRAM)
L2 Fast (SRAM), L2 Slow (eDRAM/MRAM/PRAM)
L2 (SRAM), L3 (eDRAM/MRAM), L4 (PRAM)
L2 Fast (SRAM), L2 Slow (eDRAM/MRAM), L3 (PRAM)
L2 Fast (SRAM), L2 Middle (eDRAM/MRAM), L2 Slow (PRAM)


Hybrid Storage (HPCA 2010)

[Figure: Hybrid Architecture, Physical View and Structural View; labels: NAND flash, PRAM, Data Region, Log Region, Data Buffer in Memory, Erase Unit, Sector (512 Bytes), In-place updating]

How to manage the Log-region efficiently?




Fault Resiliency for Exascale System

Microprocessors become unreliable: process scaling, voltage scaling, soft errors, NBTI, ...

Even assuming socket MTTF remains constant:
system MTTF = socket MTTF / number of sockets

1 socket: socket MTTF = 5 years
Exascale (~100,000 sockets): system MTTF = 26 minutes
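The 26-minute figure follows directly from the formula above. A minimal sketch of the arithmetic, assuming a 365-day year and independent, identical sockets (the function name is hypothetical):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: leap years ignored


def system_mttf_minutes(socket_mttf_years: float, num_sockets: int) -> float:
    """system MTTF = socket MTTF / number of sockets, returned in minutes."""
    if num_sockets <= 0:
        raise ValueError("num_sockets must be positive")
    return socket_mttf_years * MINUTES_PER_YEAR / num_sockets


if __name__ == "__main__":
    # One socket with a 5-year MTTF, then ~100,000 such sockets.
    print(system_mttf_minutes(5, 1) / MINUTES_PER_YEAR, "years")
    print(round(system_mttf_minutes(5, 100_000), 2), "minutes")
```

With 5 years per socket and 100,000 sockets this gives roughly 26.3 minutes, matching the slide.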


Checkpoint / Restart

Checkpoint / Restart is the state of the art.

Hard disk drive (HDD) as the checkpoint storage
HDD peak bandwidth: ~100MB/s
BlueGene/L: 12 mins to take a checkpoint
Equivalent to 8% performance loss: Tolerable

Scale to exascale ...: Unacceptable!
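One simple way to read these numbers, offered as an assumption rather than the derivation used in the talk: if a checkpoint of fixed duration is taken after every interval of useful work, the fraction of time lost is checkpoint / (checkpoint + interval). The function names below are hypothetical.

```python
def overhead_fraction(checkpoint_min: float, work_interval_min: float) -> float:
    """Fraction of wall-clock time spent checkpointing in this simple model."""
    return checkpoint_min / (checkpoint_min + work_interval_min)


def interval_for_overhead(checkpoint_min: float, target_fraction: float) -> float:
    """Useful-work interval that yields the target overhead fraction."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    return checkpoint_min * (1.0 - target_fraction) / target_fraction


if __name__ == "__main__":
    # 12-minute checkpoints at 8% loss imply about 138 minutes of work in between.
    print(interval_for_overhead(12, 0.08))
    # If the whole checkpoint cycle had to fit in a 26-minute MTTF,
    # only 14 minutes of work would remain per 12-minute checkpoint.
    print(overhead_fraction(12, 26 - 12))
```

Under this model a 12-minute checkpoint only stays near 8% overhead when failures are hours apart. Against the 26-minute system MTTF quoted on the Fault Resiliency slide, nearly half of the time would go to checkpointing.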


PCRAM – A Good Candidate

Courtesy: Motoyuki Ooishi

                 HDD      NAND Flash   PCRAM
Cell size        -        4-6F^2       4-6F^2
Read time        ~4ms     5us-50us     10ns-100ns
Write time       ~4ms     2ms-3ms      100-1000ns
Standby power    ~1W      ~0W          ~0W
Endurance        10^15    10^5         10^8

PCRAM is 2 orders of magnitude faster than flash.

PCRAM has 3 orders of magnitude higher endurance than flash.

Good candidate for local checkpoint
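The orders-of-magnitude claims can be checked against the table with a base-10 logarithm of the ratios. This is only a sketch of the arithmetic; the dictionary layout and function name are hypothetical, and each range is compared best case to best case and worst case to worst case.

```python
import math

# Values copied from the table; time ranges are (best case, worst case) in seconds.
READ_TIME_S = {"NAND flash": (5e-6, 50e-6), "PCRAM": (10e-9, 100e-9)}
WRITE_TIME_S = {"NAND flash": (2e-3, 3e-3), "PCRAM": (100e-9, 1000e-9)}
ENDURANCE_CYCLES = {"NAND flash": 1e5, "PCRAM": 1e8}


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """log10 of the ratio between two positive quantities."""
    return math.log10(larger / smaller)


if __name__ == "__main__":
    for name, table in (("read", READ_TIME_S), ("write", WRITE_TIME_S)):
        for i, case in enumerate(("best", "worst")):
            gap = orders_of_magnitude(table["NAND flash"][i], table["PCRAM"][i])
            print(f"{name} time, {case} case: {gap:.1f} orders")
    endurance_gap = orders_of_magnitude(
        ENDURANCE_CYCLES["PCRAM"], ENDURANCE_CYCLES["NAND flash"]
    )
    print(f"endurance: {endurance_gap:.1f} orders")
```

On these numbers the read-time gap is between 2 and 3 orders of magnitude, the write-time gap is larger still, and the endurance gap is 3 orders.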


How to Integrate PCRAM

3D PCRAM: deploy PCRAM directly on top of DRAM
Possible local bandwidth: ~2.5TB/s (DIMM bandwidth: ~10GB/s)

[Figure: PCRAM layer stacked on DRAM]

Parameters                            Values
Bank size                             32MB
Mat count                             16
Required TSV pitch                    < 74um
ITRS TSV pitch projection for 2012    3.8um
3D-PCRAM delay                        0.8ms
Equivalent bandwidth                  2500GB/s

(Collaboration with HP Labs, Exascale Computing Lab, Dr. Norm Jouppi, SC 2009)
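To put the bandwidth figures side by side, the sketch below computes how long one checkpoint copy would take at each quoted peak bandwidth. The 2 GB checkpoint size is a hypothetical value chosen for illustration (the slides do not state one), units are decimal (1 GB = 10^9 bytes), and the function name is an assumption.

```python
# Peak bandwidths quoted in the slides, in bytes per second (decimal units).
BANDWIDTH_BYTES_PER_S = {
    "HDD (~100MB/s)": 100e6,
    "DIMM (~10GB/s)": 10e9,
    "3D PCRAM (~2.5TB/s)": 2.5e12,
}

# Hypothetical checkpoint size per node; not taken from the slides.
CHECKPOINT_BYTES = 2e9


def transfer_time_s(size_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to move size_bytes at a sustained bandwidth, ignoring latency."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    for name, bandwidth in BANDWIDTH_BYTES_PER_S.items():
        seconds = transfer_time_s(CHECKPOINT_BYTES, bandwidth)
        print(f"{name}: {seconds:g} s")
```

With the assumed 2 GB size, the 3D PCRAM case works out to 0.8 ms, which happens to equal the 3D-PCRAM delay row in the table, though the slide does not say how that delay was obtained.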


Our Projection

[Figure: projection chart; x-axis: years 2008 to 2017]

(Collaboration with HP Labs, Exascale Computing Lab, Dr. Norm Jouppi, SC 2009)


More Details

Xiangyu Dong, X. Wu, Guangyu Sun, Yuan Xie, H. Li, Y. Chen. Circuit and Microarchitecture Evaluation of 3D MRAM. DAC 2008.

Xiangyu Dong, Norm Jouppi, Yuan Xie. PCRAMsim: System-Level Performance, Energy, and Area Modeling for Phase-Change RAM. ICCAD 2009.

G. Sun, X. Dong, Y. Xie, J. Li, Y. Chen. Novel MRAM-Stacking Architecture for CMP. HPCA 2009.

Xiaoxia Wu, J. Li, L. Zhang, E. Speight, Yuan Xie. Hybrid Cache Architecture with Disparate Memory Technologies. ISCA 2009.

Guangyu Sun, Y. Joo, Y. Chen, Yuan Xie, Y. Chen, H. Li. A Hybrid Solid-State Storage Architecture for Performance, Energy Consumption and Lifetime Improvement. HPCA 2010.

Y. Joo, D. Niu, Guangyu Sun, Xiangyu Dong, Y. Xie. Energy- and Endurance-Aware Design of PCRAM Caches. DATE 2010.

Xiangyu Dong, N. Muralimanohar, Norm Jouppi, Richard Kaufmann, Yuan Xie. Leveraging 3D PCRAM Technologies to Reduce Checkpoint Overhead for Future Exascale Systems. SC 2009.

http://www.cse.psu.edu/~yuanxie/3d.html


Conclusion

Emerging NVMs are very attractive: combining the speed of SRAM, the density of DRAM, and the non-volatility of Flash memory.

Attractive features: high density, low leakage, non-volatile

Undesirable features:
Write-related: long write latency, high write energy, low endurance (e.g. PCRAM)
Cost (needs large-volume production)

Solution: Hybrid cache/mem/storage + 3D?

Enabling unique applications