Phase Change Memory as Main Memory
CS 839 - Persistence
Learning outcomes
• Understand the basic characteristics of phase-change memory
• Understand evaluation techniques for new memory technologies
• Understand the optimization process for new memory technologies
Notes from reviews
• How do line-level writes solve endurance problem?
• Why characterize PCM by only two parameters?
• Why use 4 KB pages? Good for disk, but why for memory?
• How do results hold up?
Background story
• Phase change memory becomes known to computer architects
  • Generally seen as slower, bigger DRAM, with slower writes
• Key question: can it compete with DRAM?
  • Why is this the question?
  • Why not “is it useful for persistence?”
DRAM background
• Stores data in a capacitor
• Address split into row address, column address
  • Row address connects row buffer to DRAM cell
  • Column address selects 64 bytes within row buffer
• Row buffer & cells are electrically connected
  • Writing to row buffer modifies cells
• Reads erase capacitor contents (destructive reads), so must re-write
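The row/column split above can be sketched as simple integer arithmetic. This is only an illustration: the 64-byte column width comes from the slide, while the 8 KB row size and the function name `decode` are assumptions, and real DRAM address maps also carry channel, rank, and bank bits.

```python
LINE_BYTES = 64     # bytes selected by one column address (from the slide)
ROW_BYTES = 8192    # assumed row-buffer size, for illustration only

def decode(addr):
    """Split a physical byte address into (row, column, byte offset)."""
    row = addr // ROW_BYTES                     # which row is opened into the row buffer
    column = (addr % ROW_BYTES) // LINE_BYTES   # which 64-byte chunk of the row buffer
    offset = addr % LINE_BYTES                  # byte within that chunk
    return row, column, offset

# Example: an address in the second row, fourth column, sixth byte.
row, column, offset = decode(8192 + 3 * 64 + 5)
```

Two addresses that decode to the same row can be served from the row buffer without reopening the row, which is why row-buffer locality matters later in the deck.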
Phase change materials
• Amorphous (after Reset): high electrical resistivity, low optical reflectance
• Crystalline (after Set): low electrical resistivity, high optical reflectance
(C) Juejun Hu, MIT
Reprogram by applying a shaped current to heat up and cool device
Read by applying a current, measuring resistance
Experimental results with PCM cells
Issues with memories
• Retention: how long does the device retain data?
  • DRAM – 64 ms
  • PCM – years
• Endurance: how many times can you write to it?
  • DRAM – effectively unlimited
  • PCM – 100,000 to millions of writes
• Why?
  • For persistent data, get better retention using more energy during write
  • Lower energy is less precise, so the write is more likely to fail
  • But: higher energy can cause wear out of the device
  • Thermal expansion & contraction degrades electrode-storage contacts
Other memory/storage technologies
What system design should we consider?
• Replace all DRAM with PCM
  • Only regular processor caches
• Hybrid system
  • DRAM cache in front of PCM
• With or without swapping
  • Flash and/or disk for pages
How do we evaluate the system?
• Extend existing systems with PCM
• Add PCM but reduce DRAM
  • Look at same cost system
  • Look at same area system (# of memory chips, PCM is denser)
  • Look at system with same performance, see how much cheaper or smaller it is
• What does this paper do?
Motivation: Capacity vs. Performance
[Figure: execution time T versus memory size M. Labels: Disk VM, PCM, low locality, unused memory, W.]
➢ Reduced DRAM ➢ Same performance ➢ Lower system price
➢ Faster execution ➢ No additional DRAM ➢ Small price increase
Pure PCM system
Uses 2048 byte PCM pages
• High delay
• Higher energy usage
• Why?
Where PCM is bad
• Time to reprogram is 12x higher
• Energy to write full array is 43x higher
How do we optimize?
• High write energy comes from writing full array
  • Write just data that changed
  • How do we know?
• Delay comes from longer access times
  • Cache hot data in DRAM for reads
  • Add write queue
  • What policy, what granularity?
• Data fetched from disk likely to be accessed soon
  • Can fetch right to DRAM instead of PCM
• Streaming data not re-referenced
  • Can evict from DRAM to disk, not PCM
Partial writes
• Easy solution: write only cache lines that change (64B)
  • Record in PCM controller what portion has changed, only rewrite that
• Harder solution: track what portion of cache line has changed (4B)
  • Requires tracking portion through cache hierarchy
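The "easy solution" can be sketched as a per-page dirty bitmask kept by the controller. This is a minimal illustration, not the paper's hardware design: the class name `PageBuffer` and its methods are hypothetical, and the 2048-byte page with 64-byte lines follows the sizes quoted elsewhere in these slides.

```python
LINE_BYTES = 64
PAGE_BYTES = 2048
LINES_PER_PAGE = PAGE_BYTES // LINE_BYTES  # 32 lines per page

class PageBuffer:
    """Buffered copy of one PCM page plus a bitmask of modified lines."""

    def __init__(self):
        self.data = bytearray(PAGE_BYTES)
        self.dirty = 0  # bit i is set when line i has been modified

    def write(self, offset, payload):
        if not payload or offset < 0 or offset + len(payload) > PAGE_BYTES:
            raise ValueError("write must be non-empty and fit inside the page")
        self.data[offset:offset + len(payload)] = payload
        first = offset // LINE_BYTES
        last = (offset + len(payload) - 1) // LINE_BYTES
        for line in range(first, last + 1):
            self.dirty |= 1 << line  # mark every line the write touches

    def lines_to_write_back(self):
        """Only these lines need to be reprogrammed in the PCM array."""
        return [i for i in range(LINES_PER_PAGE) if (self.dirty >> i) & 1]

buf = PageBuffer()
buf.write(60, b"12345678")   # straddles lines 0 and 1
buf.write(2047, b"x")        # touches only line 31
```

On eviction, the controller would rewrite 3 of the 32 lines here instead of the whole page. The 4-byte variant would need a finer bitmask that travels with the line through the cache hierarchy.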
Don’t access: DRAM caching
• Use some DRAM to hold hot pages
  • How much?
  • Run programs and measure
  • Proposal is about 10% of PCM
• What granularity?
  • This paper uses 4KB
[Figure: execution time T versus memory size M]
Don’t wait: Lazy writes
• Send writes to a write-pending queue, not PCM directly
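A write-pending queue can be sketched as a small buffer that absorbs writes and drains to PCM only when it fills. The slide does not specify a policy, so everything here is an assumption for illustration: the class name `LazyWriteQueue`, the queue depth, coalescing of repeated writes to the same line, and oldest-first draining.

```python
class LazyWriteQueue:
    """Buffers line writes so the requester does not wait on slow PCM writes."""

    def __init__(self, capacity=4):
        self.capacity = capacity   # hypothetical queue depth
        self.pending = {}          # line address -> latest data (insertion ordered)
        self.pcm = {}              # stands in for the slow PCM array
        self.pcm_writes = 0        # number of writes that actually reached PCM

    def write(self, line_addr, data):
        # A new entry in a full queue forces the oldest entry out to PCM first.
        if line_addr not in self.pending and len(self.pending) >= self.capacity:
            self.drain_one()
        self.pending[line_addr] = data  # repeated writes to a line coalesce

    def read(self, line_addr):
        # Reads must check the queue first to see the most recent data.
        if line_addr in self.pending:
            return self.pending[line_addr]
        return self.pcm.get(line_addr)

    def drain_one(self):
        line_addr = next(iter(self.pending))  # oldest pending line
        self.pcm[line_addr] = self.pending.pop(line_addr)
        self.pcm_writes += 1

q = LazyWriteQueue(capacity=2)
q.write(0, b"a")
q.write(0, b"b")   # coalesces with the previous write to line 0
q.write(1, b"c")
```

At this point no write has reached PCM; a third distinct line would push line 0 out. Coalescing also reduces the number of PCM writes, which helps endurance as well as latency.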
Write less: line-level writes
• Problem:
  • High energy of writes
  • Limited endurance
• Solution:
  • Only write dirty data
  • Out of 2048 bytes, mostly 1-3 dirty cachelines
Wear leveling
• Problem:
  • Uneven access of pages
  • Uneven access of lines within a page
• Solution:
  • VM swapping for uneven use of pages (not evaluated)
  • Store a Shift value for each page
    • How far lines are shifted on that page
    • E.g. each time we reallocate a page, randomly re-shift lines
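The per-page Shift value can be sketched as a rotation applied when mapping a logical line to a physical line. This is a minimal sketch under assumptions: the class name `ShiftedPage`, the 32 lines per page (2048 bytes / 64 bytes), and the use of a software random number generator are illustrative, not the paper's hardware mechanism.

```python
import random

LINES_PER_PAGE = 32  # 2048-byte page / 64-byte lines

class ShiftedPage:
    """One PCM page whose lines are rotated by a per-page shift value."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.shift = 0
        self.lines = [None] * LINES_PER_PAGE  # physical line storage

    def physical_index(self, logical_line):
        # Rotate so a hot logical line lands on different cells over time.
        return (logical_line + self.shift) % LINES_PER_PAGE

    def write_line(self, logical_line, data):
        self.lines[self.physical_index(logical_line)] = data

    def read_line(self, logical_line):
        return self.lines[self.physical_index(logical_line)]

    def reallocate(self):
        # The page is reused for new contents, so pick a fresh random shift.
        self.shift = self.rng.randrange(LINES_PER_PAGE)
        self.lines = [None] * LINES_PER_PAGE

page = ShiftedPage(random.Random(0))
page.reallocate()
page.write_line(5, b"hot line")
```

Because the shift changes only when the page is reallocated, no data has to be moved: the old contents are gone anyway, and reads and writes of the new contents use the same shift consistently.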
Relevance to persistent memory
• Managing slow reads
• Managing slow writes
• Granularity of writes
• Wear leveling
Sensitivity
• How to do research before the technology is known?
  • Parameterize: assume certain density, performance range
Optane Memory Mode
• Memory controller manages DRAM as direct-mapped cache in front of Optane
  • Why direct mapped?
• For NUMA: remote DRAM caches remote Optane
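One way to approach the "why direct mapped?" question is to look at how little work a lookup takes. The sketch below is a generic direct-mapped cache, not Intel's implementation: the class name, slot count, and block size are hypothetical.

```python
class DirectMappedCache:
    """Each block address maps to exactly one slot; one tag compare per access."""

    def __init__(self, num_slots=4, block_bytes=64):
        self.num_slots = num_slots
        self.block_bytes = block_bytes
        self.tags = [None] * num_slots  # tag currently held in each slot

    def access(self, addr):
        """Return True on a hit; on a miss, install the block and return False."""
        block = addr // self.block_bytes
        slot = block % self.num_slots    # the only place this block may live
        tag = block // self.num_slots    # distinguishes blocks sharing a slot
        hit = self.tags[slot] == tag
        self.tags[slot] = tag            # a miss evicts whatever was there
        return hit

cache = DirectMappedCache(num_slots=4, block_bytes=64)
first = cache.access(0)          # cold miss
second = cache.access(0)         # hit
conflict = cache.access(4 * 64)  # maps to slot 0 too, evicts block 0
```

There is no replacement policy and no search across ways, so the tag check is cheap. The cost is conflict misses: two hot blocks that share a slot keep evicting each other.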
Memory mode performance
• Sequential
Memory mode performance
• Random access
• Bandwidth
Application performance - graph
Other Persistent Memory Technologies
• Spin-Transfer Torque MRAM
  • Uses Magnetic Tunnel Junction to store data
  • Each cell has 2 ferromagnetic layers
    • Reference layer stays magnetized in same direction
    • Free layer can be programmed
  • When polarity aligned → low resistance
  • When polarity opposite → high resistance
  • Program with high current in one direction
Properties of STT-MRAM
• Density close to DRAM
  • Higher density → smaller MTJ → lower retention
  • 1 day vs 10 years
• High write power
• Reads non-destructive: data is copied into row-buffer, can change in row buffer
• Close to DRAM speed, but high-energy writes
Optimizations for STT-MRAM
• Like PCM: selective/partial writes: only update data modified
  • Saves energy, endurance
• Row buffer bypass:
  • Higher locality for reads going back to same row buffer than writes. Why?
  • Let writes go directly to media & not be buffered in row buffer
Summary
• NVM can be attached like memory, accessed via same protocols
• Characteristics require new optimizations
  • Endurance: partial writes
  • Endurance: wear leveling
  • Energy of writes: partial writes
  • Performance: DRAM caching, write bypassing