Lecture 5: Memory Performance

Types of Memory

Registers

L1 cache

L2 cache

L3 cache

Main Memory

Local Secondary Storage (local disks)

Remote Secondary Storage (distributed file system, web servers)

Memory Hierarchy

Random Access Memory (RAM)

DRAM (Dynamic RAM)
• Must be refreshed periodically
• 1 transistor (plus a capacitor) per bit
• Unavailable while it is being refreshed
• Slower
• Less expensive

SRAM (Static RAM)
• Does not require periodic refreshes
• 5-6 transistors per bit
• Faster and more complex
• More expensive

Processor-Memory Problem

Processors issue instructions roughly every nanosecond.

DRAM can be accessed roughly every 100 nanoseconds.

The gap is growing:
• Processors are getting faster by 60% per year
• DRAM is getting faster by 7% per year
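To make the growth rates above concrete, the short sketch below compounds them over several years. The function name `relative_gap` and its parameters are illustrative, and the calculation assumes both rates stay constant, which is a simplification.

```python
def relative_gap(years, cpu_rate=0.60, dram_rate=0.07):
    """Factor by which the processor/DRAM speed ratio grows after `years`.

    Assumes constant yearly improvement rates (60% and 7% by default,
    the figures quoted on the slide).
    """
    return ((1 + cpu_rate) / (1 + dram_rate)) ** years


for years in (1, 5, 10):
    print(f"after {years:2d} years the gap is {relative_gap(years):.1f}x larger")
```

Under these assumptions the ratio grows by roughly 50% each year, since 1.60 / 1.07 is about 1.5.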

Processor-Memory Problem

Locality of Reference

The principle of locality is the tendency of a program to reference data items that are near other recently referenced data items, or that were recently referenced themselves.

Programs with good locality run faster.

Locality of Reference

Locality has two distinct forms:

Temporal Locality: A memory location that is referenced once is likely to be referenced again multiple times in the near future.

Spatial Locality: If a memory location is referenced once, the program is likely to reference a nearby location in the near future.
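As an illustration of spatial locality, the sketch below models the byte addresses that a C-style, row-major 2D array would produce under two traversal orders. It is a model of addresses, not a measurement: `ELEM_SIZE`, the base address of 0, and all function names are assumptions made for the example.

```python
ELEM_SIZE = 4  # assumed bytes per element (e.g. a 32-bit int)


def address(i, j, cols, base=0):
    """Byte address of a[i][j] in a row-major array with `cols` columns."""
    return base + (i * cols + j) * ELEM_SIZE


def row_wise(rows, cols):
    """Addresses visited when the inner loop runs over columns."""
    return [address(i, j, cols) for i in range(rows) for j in range(cols)]


def column_wise(rows, cols):
    """Addresses visited when the inner loop runs over rows."""
    return [address(i, j, cols) for j in range(cols) for i in range(rows)]


def strides(trace):
    """Distance in bytes between consecutive accesses."""
    return [b - a for a, b in zip(trace, trace[1:])]


print(strides(row_wise(2, 3)))     # [4, 4, 4, 4, 4]
print(strides(column_wise(2, 3)))  # [12, -8, 12, -8, 12]
```

The row-wise scan always moves to the adjacent element, which is good spatial locality. The column-wise scan jumps by a whole row each time. A running total that both loops would add to on every iteration is an example of temporal locality.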

Cache Performance

Thrashing: The cache repeatedly loads and evicts the same cache blocks.

Padding: Extra bytes added at the end of an array so that the data that follows it maps to different cache sets, which can eliminate thrashing.
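The effect of thrashing and padding can be sketched with a tiny direct-mapped cache simulator. Everything here is an assumption chosen for illustration: a cache with 2 sets of 16-byte blocks, 4-byte elements, and a dot-product loop that alternates reads of `x[i]` and `y[i]`, with `y` placed directly after `x` in memory.

```python
def count_misses(addresses, block_size=16, num_sets=2):
    """Count misses in a direct-mapped cache for a sequence of byte addresses."""
    resident = [None] * num_sets      # block number currently held by each set
    misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_sets
        if resident[index] != block:  # cold miss or conflict miss
            misses += 1
            resident[index] = block   # evict whatever was there
    return misses


def dot_product_trace(n, x_len, elem_size=4):
    """Addresses read by sum += x[i] * y[i]; y starts right after x[x_len - 1]."""
    x_base = 0
    y_base = x_base + x_len * elem_size
    trace = []
    for i in range(n):
        trace.append(x_base + i * elem_size)
        trace.append(y_base + i * elem_size)
    return trace


unpadded = count_misses(dot_product_trace(n=8, x_len=8))   # y starts at byte 32
padded = count_misses(dot_product_trace(n=8, x_len=12))    # 4 pad elements, y at byte 48
print(unpadded, padded)  # 16 4
```

Without padding, `x[i]` and `y[i]` always map to the same set, so every one of the 16 reads misses. Declaring `x` with 4 extra elements shifts `y` by one block, and only the first read of each block misses.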

Cache Performance

Intel Core i7

Cache Performance

Read throughput (read bandwidth): The rate at which a program reads data from memory, measured in MB/s.
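The definition above amounts to bytes read divided by elapsed time. The sketch below shows the arithmetic and a rough way to time one pass over a buffer. The function names are hypothetical, 1 MB is taken as 10**6 bytes (an assumed convention), and in Python the result mostly reflects interpreter overhead rather than the memory system, so treat it as a demonstration of the calculation only.

```python
import time


def read_throughput_mb_s(bytes_read, seconds):
    """Read throughput in MB/s, taking 1 MB = 10**6 bytes (assumed convention)."""
    return bytes_read / seconds / 1e6


def measure(data, stride=1):
    """Time one pass over `data`, reading every `stride`-th byte."""
    start = time.perf_counter()
    total = 0
    for i in range(0, len(data), stride):
        total += data[i]              # the read being timed
    elapsed = time.perf_counter() - start
    bytes_read = len(range(0, len(data), stride))
    return read_throughput_mb_s(bytes_read, elapsed)


print(f"{measure(bytearray(1 << 20)):.1f} MB/s")
```

A faithful measurement would use a compiled language, warm the cache first, and vary both the working-set size and the stride.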

Cache Performance

Writing cache-friendly code:

1. Focus on the inner loops, where most of the computation and memory accesses occur.

2. Maximize spatial locality by reading data sequentially, with a stride-1 reference pattern.

• A stride-1 reference pattern is good because data is stored in caches as contiguous blocks.

3. Maximize temporal locality by using data as often as possible once it has been read from memory.

• Repeated references to local variables are good because the compiler can keep them in the register file.
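The benefit of a stride-1 pattern can be estimated with a small model. The sketch below counts how often a single forward scan moves into a new cache block. The 4-byte element size and 64-byte block size are assumed values, and the model assumes a cold cache in which a block stays resident until the scan leaves it.

```python
def miss_rate(n_elements, stride, elem_size=4, block_size=64):
    """Fraction of reads that miss when visiting every `stride`-th element once.

    Models a single forward scan starting with a cold cache: a read misses
    exactly when it falls in a different block from the previous read.
    """
    last_block = None
    reads = misses = 0
    for i in range(0, n_elements, stride):
        block = (i * elem_size) // block_size
        reads += 1
        if block != last_block:
            misses += 1
            last_block = block
    return misses / reads


for stride in (1, 4, 16):
    print(f"stride {stride:2d}: miss rate {miss_rate(1024, stride):.4f}")
```

With these parameters a stride-1 scan misses once per 16 reads, a stride-4 scan once per 4 reads, and a stride-16 scan on every read, because each access then lands in a new block.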

Cache Performance

Matrix Multiply Performance

Loop orderings compared, grouped in pairs that share the same inner loop:
• jki, kji
• ijk, jik
• kij, ikj
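The orderings above name the nesting of the three loops in an n x n matrix multiply. A minimal sketch of one ordering from each pair follows. The function names are illustrative, and the comments on strides describe what a row-major C array would see. Python lists of lists are not laid out that way, so the code shows the loop structure rather than the performance.

```python
def zeros(n):
    return [[0.0] * n for _ in range(n)]


def matmul_ijk(a, b, n):
    """Inner loop over k: walks a row of a (stride 1) and a column of b (stride n)."""
    c = zeros(n)
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_jki(a, b, n):
    """Inner loop over i: walks columns of a and c (both stride n)."""
    c = zeros(n)
    for j in range(n):
        for k in range(n):
            r = b[k][j]
            for i in range(n):
                c[i][j] += a[i][k] * r
    return c


def matmul_kij(a, b, n):
    """Inner loop over j: walks rows of b and c (both stride 1)."""
    c = zeros(n)
    for k in range(n):
        for i in range(n):
            r = a[i][k]
            for j in range(n):
                c[i][j] += r * b[k][j]
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(matmul_ijk(a, b, 2))  # [[19.0, 22.0], [43.0, 50.0]]
```

All three compute the same product. Under the usual row-major analysis, the kij and ikj orderings touch memory with stride 1 in the inner loop and incur the fewest misses, while jki and kji incur the most. Actual rankings depend on the machine and array sizes.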

Memory Interleaving

Memory Interleaving

Virtual Memory