53
7-1 Chapter 7 - Memory Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring Computer Architecture and Organization Miles Murdocca and Vincent Heuring Chapter 7 – Memory

Chapter 7 – Memory - iiusatech.comiiusatech.com/murdocca/CAO/SlidesPDF/Ch07CAO.pdf · 7-1 Chapter 7- Memory ... Functional Behavior of a RAM Cell Static RAM cell (a) ... Cache Read

Embed Size (px)

Citation preview

7-1 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Computer Architecture andOrganization

Miles Murdocca and Vincent Heuring

Chapter 7 – Memory

7-2 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Chapter Contents7.1 The Memory Hierarchy7.2 Random-Access Memory7.3 Memory Chip Organization7.4 Case Study: Rambus Memory7.5 Cache Memory7.6 Virtual Memory7.7 Advanced Topics7.8 Case Study: Associative Memory in Routers7.9 Case Study: The Intel Pentium 4 Memory System

7-3 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

The MemoryHierarchy

7-4 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Functional Behavior of a RAM Cell

Static RAM cell (a) and dynamic RAM cell (b).

7-5 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Simplified RAM Chip Pinout

7-6 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

A Four-WordMemory withFour Bits perWord in a 2DOrganization

7-7 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

A Simplified Representation of theFour-Word by Four-Bit RAM

7-8 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

2-1/2D Organization of a 64-Word byOne-Bit RAM

7-9 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Two Four-Word by Four-Bit RAMs areUsed in Creating a Four-Word by

Eight-Bit RAM

7-10 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Two Four-Word by Four-Bit RAMs Makeup an Eight-Word by Four-Bit RAM

7-11 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Single-In-LineMemoryModule

• 256 MB dual in-linememory module organizedfor a 64-bit word with 1616M × 8-bit RAM chips(eight chips on each sideof the DIMM).

7-12 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Single-In-Line Memory

Module

• Schematic diagram of256 MB dual in-linememory module.(Source: adapted fromhttp://www-s.ti.com/sc/ds/tm4en64kpu.pdf.)

7-13 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

A ROM Stores Four Four-Bit Words

7-14 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

A Lookup Table (LUT) Implements anEight-Bit ALU

7-15 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Flash Memory

• (a) External view of flash memory module and (b) flash moduleinternals. (Source: adapted from HowStuffWorks.com.)

7-16 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Cell Structure for Flash Memory

• Current flows from source to drain when a sufficient negative charge isplaced on the dielectric material, preventing current flow through theword line. This is the logical 0 state. When the dielectric material is notcharged, current flows between the bit and word lines, which is thelogical 1 state.

7-17 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Rambus Memory• Comparison of DRAM and RDRAM configurations.

7-18 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Rambus Memory• Rambus technology on the Nintendo 64 motherboard (left)

enables cost savings over the conventional Sega Saturnmotherboard design (right).

• Nintendo 64 game console:

7-19 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Placement of Cache Memory in aComputer System

• The locality principle: a recently referenced memory location is likely tobe referenced again (temporal locality); a neighbor of a recentlyreferenced memory location is likely to be referenced (spatial locality).

7-20 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

An Associative Mapping Scheme for aCache Memory

7-21 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Associative Mapping Example• Consider how an access to memory location (A035F014)16 is mapped to

the cache for a 232 word memory. The memory is divided into 227 blocksof 25 = 32 words per block, and the cache consists of 214 slots:

• If the addressed word is in the cache, it will be found in word (14)16 of aslot that has tag (501AF80)16, which is made up of the 27 mostsignificant bits of the address. If the addressed word is not in thecache, then the block corresponding to tag field (501AF80)16 is broughtinto an available slot in the cache from the main memory, and thememory reference is then satisfied from the cache.

7-22 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Associative Mapping Area Allocation• Area allocation for associative mapping scheme based on bits stored:

7-23 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Replacement Policies• When there are no available slots in which to place a block, a

replacement policy is implemented. The replacement policy governsthe choice of which slot is freed up for the new block.

• Replacement policies are used for associative and set-associativemapping schemes, and also for virtual memory.

• Least recently used (LRU)

• First-in/first-out (FIFO)

• Least frequently used (LFU)

• Random

• Optimal (used for analysis only – look backward in time and reverse-engineer the best possible strategy for a particular sequence ofmemory references.)

7-24 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

A Direct Mapping Scheme for CacheMemory

7-25 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Direct Mapping Example• For a direct mapped cache, each main memory block can be mapped to

only one slot, but each slot can receive more than one block. Considerhow an access to memory location (A035F014)16 is mapped to thecache for a 232 word memory. The memory is divided into 227 blocks of25 = 32 words per block, and the cache consists of 214 slots:

• If the addressed word is in the cache, it will be found in word (14)16 of slot(2F80)16, which will have a tag of (1406)16.

7-26 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Direct Mapping Area Allocation• Area allocation for direct mapping scheme based on bits stored:

7-27 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

A Set Associative Mapping Schemefor a Cache Memory

7-28 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Set-Associative Mapping Example• Consider how an access to memory location (A035F014)16 is mapped to

the cache for a 232 word memory. The memory is divided into 227 blocksof 25 = 32 words per block, there are two blocks per set, and the cacheconsists of 214 slots:

• The leftmost 14 bits form the tag field, followed by 13 bits for the set field,followed by five bits for the word field:

7-29 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Set Associative Mapping AreaAllocation

• Area allocation for set associative mapping scheme based on bits stored:

7-30 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Cache Read and Write Policies

7-31 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Hit Ratios and Effective Access Times• Hit ratio and effective access time for single level cache:

• Hit ratios and effective access time for multi-level cache:

7-32 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Direct Mapped Cache Example• Compute hit ratio and

effective access time fora program that executesfrom memory locations48 to 95, and then loops10 times from 15 to 31.

• The direct mappedcache has four 16-wordslots, a hit time of 80 ns,and a miss time of 2500ns. Load-through isused. The cache isinitially empty.

7-33 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Table of Events for Example Program

7-34 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Calculation of Hit Ratio and EffectiveAccess Time for Example Program

7-35 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Multi-level Cache MemoryAs an example, consider a two-level cache in which the L1 hit time is 5 ns,the L2 hit time is 20 ns, and the L2 miss time is 100 ns. There are 10,000memory references of which 10 cause L2 misses and 90 cause L1 misses.Compute the hit ratios of the L1 and L2 caches and the overall effectiveaccess time.

H1 is the ratio of the number of times the accessed word is in the L1 cacheto the total number of memory accesses. There are a total of 85 (L1) and15 (L2) misses, and so:

(Continued on next slide.)

7-36 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Multi-level Cache Memory (Cont’)H2 is the ratio of the number of times the accessed word is in the L2 cacheto the number of times the L2 cache is accessed, and so:

The effective access time is then:

= 5.23 ns per access

7-37 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Neat Little LRU Algorithm• A sequence is shown for the Neat Little LRU Algorithm for a cache with

four slots. Main memory blocks are accessed in the sequence: 0, 2, 3,1, 5, 4.

7-38 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Cache Coherency• The goal of cache coherence is to ensure that every cache sees the

same value for a referenced location, which means making sure thatany shared operand that is changed is updated throughout the system.

• This brings us to the issue of false sharing, which reduces cacheperformance when two operands that are not shared betweenprocesses share the same cache line. The situation is shown below.The problem is that each process will invalidate the other’s cache linewhen writing data without a real need, unless the compiler prevents this.

7-39 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Overlays• A partition graph for a program with a main routine and three subroutines:

7-40 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Virtual Memory• Virtual memory is stored in a hard disk image. The physical memory

holds a small number of virtual pages in physical page frames.

• A mapping between a virtual and a physical memory:

7-41 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Page Table• The page table maps between virtual memory and physical memory.

7-42 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Using the Page Table

• A virtual address istranslated into a physicaladdress:

Typical page table entry

7-43 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Using the Page Table (cont’)• The

configuration ofa page tablechanges as aprogramexecutes.

• Initially, the pagetable is empty.In the finalconfiguration,four pages are inphysicalmemory.

7-44 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Segmentation• A segmented memory allows two users to share the same word

processor code, with different data spaces:

7-45 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Fragmentation

• (a) Free areaof memoryafter initial-ization; (b)afterfragment-ation; (c)aftercoalescing.

7-46 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Translation Lookaside Buffer• An example TLB holds 8 entries for a system with 32 virtual pages and

16 page frames.

7-47 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Putting it All Together• An example TLB holds 8 entries for a system with 32 virtual pages and

16 page frames.

7-48 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Content Addressable Memory –Addressing

• Relationships between random access memory and contentaddressable memory:

7-49 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Overview of CAM

• Source: (Foster,C. C., ContentAddressableParallelProcessors, VanNostrandReinholdCompany, 1976.)

7-50 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Addressing Subtrees for a CAM

7-51 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Associative Memory in Routers

• A simple network withthree routers.

• The use of associativememories in high-end routersreduces the lookup time byallowing a search to be performed in a single operation.

• The search is based on the destination address, rather than thephysical memory address.

• Access methods for this memory have been standardized into aninterface interoperability agreement by the Network Processing Forum.

7-52 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

Block Diagram of Dual-Read RAM• A dual-read or dual-port

RAM allows any twowords to besimultaneously readfrom the same memory.

7-53 Chapter 7 - Memory

Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

The Intel 4 Pentium Memory System