Phase Change Memory as Main Memory

CS 839 - Persistence

Learning outcomes

• Understand the basic characteristics of phase-change memory

• Understand evaluation techniques for new memory technologies

• Understand the optimization process for new memory technologies

Notes from reviews

• How do line-level writes solve the endurance problem?

• Why characterize PCM by only two parameters?

• Why use 4 KB pages? Good for disk, but why for memory?

• How do results hold up?

Background story

• Phase change memory becomes known to computer architects

  • Generally seen as a slower, higher-capacity DRAM, with especially slow writes

• Key question: can it compete with DRAM?

  • Why is this the question?

• Why not “is it useful for persistence?”

DRAM background

• Stores data in a capacitor

• Address split into row address, column address

• Row address connects row buffer to DRAM cell

• Column address selects 64 bytes within row buffer

• Row buffer & cells are electrically connected

  • Writing to row buffer modifies cells

• Reads erase capacitor contents (destructive reads), so must re-write

Phase change materials

• Amorphous state (reached by Reset): high electrical resistivity, low optical reflectance

• Crystalline state (reached by Set): low electrical resistivity, high optical reflectance

• Reprogram by applying a shaped current to heat up and cool device

• Read by applying a current, measuring resistance

(C) Juejun Hu, MIT

Experimental results with PCM cells

Issues with memories

• Retention: how long does the device retain data?

  • DRAM: 64 ms

  • PCM: years

• Endurance: how many times can you write to it?

  • DRAM: zillions

  • PCM: 100,000 to millions

• Why?

  • For persistent data, get better retention using more energy during write

  • Lower energy less precise, more likely write fails

  • But: can cause wear out of device

    • Thermal expansion & contraction degrades electrode-storage contacts
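To see why an endurance of 100,000 to millions of writes matters for main memory, a back-of-the-envelope lifetime bound helps. The sketch below assumes writes are spread perfectly evenly over the device (ideal wear leveling); the function name, the 16 GiB capacity, and the 1 GiB/s write rate are illustrative assumptions, not figures from the paper.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def ideal_lifetime_years(capacity_bytes, endurance_writes, write_bytes_per_sec):
    """Upper bound on device lifetime, assuming perfectly even wear."""
    # Every byte can be written `endurance_writes` times before wearing out.
    total_writable_bytes = capacity_bytes * endurance_writes
    return total_writable_bytes / write_bytes_per_sec / SECONDS_PER_YEAR

# Hypothetical system: 16 GiB of PCM, 10^6 writes per cell, 1 GiB/s of writes.
years = ideal_lifetime_years(16 * 2**30, 10**6, 2**30)
print(f"ideal lifetime: {years:.2f} years")
```

Under these assumed numbers the device lasts only about half a year even with perfect leveling, which is why reducing write traffic and spreading wear both appear later as optimizations.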

Other memory/storage technologies

What system design should we consider?

• Replace all DRAM with PCM

  • Only regular processor caches

• Hybrid system

  • DRAM cache in front of PCM

• With or without swapping

  • Flash and/or disk for pages

How do we evaluate the system?

• Extend existing systems with PCM

• Add PCM but reduce DRAM

• Look at same cost system

• Look at same area system (# of memory chips, PCM is denser)

• Look at system with same performance, see how much cheaper or smaller it is

• What does this paper do?

Motivation: Capacity vs. Performance

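[Figure: execution time T versus memory size M, with curves for disk-backed VM and PCM, regions labeled Low Locality and Unused Memory, and a point marked W]

Annotations on the figure:

• Reduced DRAM, same performance, lower system price

• Faster execution, no additional DRAM, small price increase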

Pure PCM system

Uses 2048 byte PCM pages

• High delay

• Higher energy usage

• Why?

Where PCM is bad

• Time to reprogram is 12x higher

• Energy to write full array is 43x higher

How do we optimize?

• High write energy comes from writing full array

  • Write just data that changed

  • How do we know?

• Delay comes from longer access times

  • Cache hot data in DRAM for reads

  • Add write queue

  • What policy, what granularity?

• Data fetched from disk likely to be accessed soon

  • Can fetch right to DRAM instead of PCM

• Streaming data not re-referenced

  • Can evict from DRAM to disk, not PCM

Partial writes

• Easy solution: write only cache lines that change (64B)

  • Record in PCM controller what portion has changed, only rewrite that

• Harder solution: track what portion of cache line has changed (4B)

  • Requires tracking portion through cache hierarchy

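The "easy solution" can be sketched as a per-page dirty bitmap kept next to the buffered copy of a page. This is a minimal illustration, not the paper's controller design: the class and method names are hypothetical, and the 4 KB page size is an assumption chosen to match the DRAM caching granularity.

```python
PAGE_BYTES = 4096          # assumed page size
LINE_BYTES = 64            # cache line size from the slide
LINES_PER_PAGE = PAGE_BYTES // LINE_BYTES

class PageBuffer:
    """Buffered copy of one page plus one dirty flag per 64 B line."""

    def __init__(self, data):
        assert len(data) == PAGE_BYTES
        self.data = bytearray(data)            # private copy of the page
        self.dirty = [False] * LINES_PER_PAGE

    def store(self, offset, payload):
        """Apply a store and mark every line it touches as dirty."""
        assert len(payload) > 0 and offset + len(payload) <= PAGE_BYTES
        self.data[offset:offset + len(payload)] = payload
        first = offset // LINE_BYTES
        last = (offset + len(payload) - 1) // LINE_BYTES
        for line in range(first, last + 1):
            self.dirty[line] = True

    def write_back(self, pcm_page):
        """Copy only dirty lines to PCM; return how many lines were written."""
        written = 0
        for line, is_dirty in enumerate(self.dirty):
            if is_dirty:
                start = line * LINE_BYTES
                pcm_page[start:start + LINE_BYTES] = self.data[start:start + LINE_BYTES]
                written += 1
        self.dirty = [False] * LINES_PER_PAGE
        return written

pcm = bytearray(PAGE_BYTES)
buf = PageBuffer(pcm)
buf.store(60, b"abcdefgh")        # straddles lines 0 and 1
lines_written = buf.write_back(pcm)
```

The "harder solution" would replace the per-line flags with per-word flags, which is why it needs the dirty information carried through the whole cache hierarchy.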

Don’t access: DRAM caching

• Use some DRAM to hold hot pages

• How much?

• Run programs and measure

• Proposal is about 10% of PCM

• What granularity?

  • This paper uses 4KB

[Figure: execution time T versus memory size M]
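"Run programs and measure" can be illustrated by replaying a page-access trace against a small model of the DRAM cache and counting how many accesses still reach PCM. The sketch below assumes an LRU replacement policy, which is an illustrative choice only; the class name and the toy trace are hypothetical.

```python
from collections import OrderedDict

class DramPageCache:
    """Page-granularity DRAM cache in front of PCM, with assumed LRU eviction."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()   # page number -> present, oldest first
        self.hits = 0
        self.misses = 0              # each miss is an access that goes to PCM

    def access(self, page_number):
        if page_number in self.pages:
            self.pages.move_to_end(page_number)   # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)        # evict least recently used
        self.pages[page_number] = True
        return False

cache = DramPageCache(capacity_pages=2)
for page in [1, 2, 1, 3, 2]:
    cache.access(page)
print(cache.hits, cache.misses)
```

Sweeping `capacity_pages` over a real trace would give the kind of execution-time-versus-size curve used to pick the DRAM fraction.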

Don’t wait: Lazy writes

• Send writes to a write-pending queue, not PCM directly
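A write-pending queue can be sketched as a small buffer that absorbs writes and only pushes the oldest entry to PCM when it fills up. The version below is a simplified model under stated assumptions: the names are hypothetical, PCM is modeled as a dictionary from line address to data, and coalescing of repeated writes to the same line is an added illustration rather than something the slide specifies.

```python
from collections import OrderedDict

class WritePendingQueue:
    """Buffers writes so the requester does not wait on slow PCM writes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = OrderedDict()   # line address -> latest data, oldest first

    def write(self, addr, data, pcm):
        """Queue a write; return the number of PCM writes it triggered (0 or 1)."""
        if addr in self.pending:
            self.pending[addr] = data            # coalesce with the queued write
            return 0
        drained = 0
        if len(self.pending) >= self.capacity:   # full: drain the oldest entry
            old_addr, old_data = self.pending.popitem(last=False)
            pcm[old_addr] = old_data
            drained = 1
        self.pending[addr] = data
        return drained

    def read(self, addr, pcm):
        """Reads must check the queue first so they see the newest data."""
        if addr in self.pending:
            return self.pending[addr]
        return pcm.get(addr)

    def flush(self, pcm):
        """Write everything still pending to PCM; return how many lines."""
        count = len(self.pending)
        pcm.update(self.pending)
        self.pending.clear()
        return count

pcm = {}
queue = WritePendingQueue(capacity=2)
pcm_writes = [queue.write(a, d, pcm)
              for a, d in [(0, b"a"), (0, b"b"), (64, b"c"), (128, b"d")]]
```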

Write less: line-level writes

• Problem:

  • High energy of writes

  • Limited endurance

• Solution:

  • Only write dirty data

  • Out of 2048 bytes, mostly 1-3 dirty cachelines
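The numbers on this slide translate into a simple ratio. The sketch below assumes that both write energy and wear scale linearly with the number of lines written, which is a simplification for illustration; the function name is hypothetical.

```python
PAGE_BYTES = 2048                           # page size from the slide
LINE_BYTES = 64                             # cache line size
lines_per_page = PAGE_BYTES // LINE_BYTES   # 32 lines per page

def fraction_written(dirty_lines):
    """Share of the page actually written when only dirty lines are written."""
    return dirty_lines / lines_per_page

for dirty in (1, 2, 3):
    print(f"{dirty} dirty line(s): {fraction_written(dirty):.1%} of a full-page write")
```

With 1 to 3 dirty lines, only about 3% to 9% of the page is written, so under the linear assumption both the energy spent and the wear incurred per write-back drop by roughly an order of magnitude.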

Wear leveling

• Problem:

  • Uneven access of pages

  • Uneven access of lines within a page

• Solution:

  • VM swapping for uneven use of pages (not evaluated)

  • Store a shift value for each page

    • How far lines are shifted on that page

    • E.g. each time we reallocate a page, randomly re-shift lines
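The per-page shift can be sketched as a rotation applied when mapping a logical line to a physical line. This is a minimal model under assumptions: the class name is hypothetical, 32 lines per page follows from 2048-byte pages with 64-byte lines, and the data movement needed when the shift changes is not modeled.

```python
import random

LINES_PER_PAGE = 32   # 2048-byte page / 64-byte lines

class ShiftedPage:
    """Tracks per-line wear for a page whose lines are rotated by a shift."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.shift = 0
        self.wear = [0] * LINES_PER_PAGE   # writes seen by each physical line

    def physical_line(self, logical_line):
        # Rotation is a bijection, so no two logical lines collide.
        return (logical_line + self.shift) % LINES_PER_PAGE

    def write_line(self, logical_line):
        self.wear[self.physical_line(logical_line)] += 1

    def reallocate(self):
        """On page reallocation, pick a fresh random shift."""
        self.shift = self.rng.randrange(LINES_PER_PAGE)

page = ShiftedPage(random.Random(0))
for _ in range(1000):
    page.write_line(0)     # a workload that always hits logical line 0
    page.reallocate()
print(max(page.wear), sum(page.wear))
```

Without the shift, all 1000 writes would land on one physical line; with it, the same hot logical line is spread across the page over successive reallocations.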

Relevance to persistent memory

• Managing slow reads

• Managing slow writes

• Granularity of writes

• Wear leveling

Sensitivity

• How do you do research before the technology is known?

  • Parameterize: assume a certain density and performance range

Optane Memory Mode

• Memory controller manages DRAM as direct-mapped cache in front of Optane

  • Why direct mapped?

• For NUMA: remote DRAM caches remote Optane
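One way to approach "why direct mapped?" is to look at how little work the lookup needs: each address maps to exactly one DRAM slot, so the controller checks a single tag. The sketch below is a generic direct-mapped index calculation, not Intel's implementation; the 64-byte block size, the tiny cache size, and the function name are illustrative assumptions.

```python
LINE_BYTES = 64   # assumed cache block size

def direct_mapped_slot(addr, cache_bytes):
    """Return (slot, tag) for an address in a direct-mapped cache."""
    num_slots = cache_bytes // LINE_BYTES
    block = addr // LINE_BYTES
    slot = block % num_slots     # the only place this block may live
    tag = block // num_slots     # distinguishes blocks sharing the slot
    return slot, tag

cache_bytes = 1024               # toy cache with 16 slots
a = direct_mapped_slot(0, cache_bytes)
b = direct_mapped_slot(1024, cache_bytes)
c = direct_mapped_slot(64, cache_bytes)
print(a, b, c)
```

Addresses 0 and 1024 map to the same slot with different tags, so they evict each other. That conflict behavior is the cost paid for a single-probe lookup with no replacement state.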

Memory mode performance

• Sequential

Memory mode performance

• Random access

• Bandwidth

Application performance - graph

Other Persistent Memory Technologies

• Spin-Transfer Torque MRAM

  • Uses Magnetic Tunnel Junction to store data

  • Each cell has 2 ferromagnetic layers

    • Reference layer stays magnetized in same direction

    • Free layer can be programmed

  • When polarity aligned → low resistance

  • When polarity opposite → high resistance

  • Program with high current in one direction

Properties of STT-MRAM

• Density close to DRAM

  • Higher density → smaller MTJ → lower retention

  • 1 day vs 10 years

• High write power

• Reads non-destructive: data is copied into row-buffer, can change in row buffer

• Close to DRAM speed, but high-energy writes

Optimizations for STT-MRAM

• Like PCM: selective/partial writes: only update data modified

  • Saves energy, endurance

• Row buffer bypass:

  • Higher locality for reads going back to same row buffer than writes. Why?

  • Let writes go directly to media & not be buffered in row buffer

Summary

• NVM can be attached like memory, accessed via same protocols

• Characteristics require new optimizations

• Endurance: partial writes

• Endurance: wear leveling

• Energy of writes: partial writes

• Performance: DRAM caching, write bypassing
