Chap4 Caching Testing


  • 7/27/2019 Chap4 Caching Testing


    Chapter 4 (continued):

    Caching;

    Testing Memory Modules


    fig_04_30

    Memory organization:

    Typical Memory map

    For power loss


    fig_04_31

    Memory hierarchy


    fig_04_32

    Paging / Caching

    Why it typically works:

    locality of reference (spatial/temporal)

    working set

    Note: in real-time embedded systems, behavior may be atypical, but caching may still be a useful technique

    Here we consider caching external to the CPU; the CPU may have one or more levels of caching built in


    fig_04_33

    Typical memory system with cache: hit rate (miss rate) important

    Remember!

    Registers here
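
    Why the hit rate matters can be seen from the usual average-access-time estimate. The sketch below is an illustration only: the function name and the 10 ns / 100 ns timings are assumptions, not values from the slides, and a single cache level is assumed.

```python
def effective_access_time(hit_rate, t_cache_ns, t_main_ns):
    """Average access time for a single-level cache.

    Assumes a hit costs t_cache_ns and a miss costs t_main_ns
    (the full miss penalty, including the failed cache check).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * t_cache_ns + miss_rate * t_main_ns


# Hypothetical timings: 10 ns cache, 100 ns main memory.
for h in (0.80, 0.90, 0.99):
    print(f"hit rate {h:.2f}: {effective_access_time(h, 10, 100):.1f} ns")
```

    With these assumed timings, even a small drop in hit rate noticeably raises the average access time, because each miss pays the much larger main-memory cost.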


    Basic caching strategies:

    Direct-mapped

    Associative

    Block-set associative

    Questions:

    what is associative memory?

    what is overhead?

    what is efficiency (hit rate)?

    is bigger cache better?


    Associative memory: storage location related to data stored

    Example (hashing):

    --When a software program is compiled or assembled, a symbol table must be created to link addresses with symbolic names

    --table may be large; even binary search of names may be too slow

    --convert each name to a number associated with the name; this number will be the symbol table index

    For example, let a = 1, b = 2, c = 3, ...

    Then cab has value 1 + 2 + 3 = 6

    ababab has value 3 * (1 + 2) = 9

    And vvvvv has value 5 * 22 = 110

    Address will be modulo a prime p; if we expect about 50 unique identifiers, can take p = 101 (make storage about twice as large as number of items to be stored, to reduce collisions)

    Now array of names in symbol table will look like:

    0--->

    1--->

    2--->

    . . .

    6--->cab

    . . .

    9--->ababab--->vvvvv

    Here there is one collision, at address 9 (110 mod 101 = 9); the two items are stored in a linked list

    Access time for an identifier
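
    The hashing example can be sketched in a few lines of Python. This is a minimal illustration, assuming lowercase identifiers only; the names name_hash and SymbolTable and the stored addresses are made up for the example, and Python lists stand in for the linked lists.

```python
P = 101  # prime table size from the example (about 2x the ~50 expected names)


def name_hash(name, p=P):
    """Sum of letter values (a=1, b=2, ..., z=26), reduced modulo p."""
    return sum(ord(ch) - ord("a") + 1 for ch in name) % p


class SymbolTable:
    """Hash table with chaining; each bucket is a list of (name, address)."""

    def __init__(self, p=P):
        self.p = p
        self.buckets = [[] for _ in range(p)]

    def insert(self, name, address):
        bucket = self.buckets[name_hash(name, self.p)]
        for i, (existing, _) in enumerate(bucket):
            if existing == name:
                bucket[i] = (name, address)  # name already present: update it
                return
        bucket.append((name, address))       # new name (or collision): append

    def lookup(self, name):
        for existing, address in self.buckets[name_hash(name, self.p)]:
            if existing == name:
                return address
        return None


table = SymbolTable()
for n, ident in enumerate(["cab", "ababab", "vvvvv"]):
    table.insert(ident, 0x1000 + 4 * n)      # addresses are hypothetical

print(name_hash("cab"), name_hash("ababab"), name_hash("vvvvv"))
```

    A lookup costs one hash computation plus a walk along one bucket, so the cost grows with the length of the longest chain rather than with the table size.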


    Caching: the basic process (note OVERHEAD for each task)

    --program needs information M that is not in the CPU

    --cache is checked for M

    how do we know if M is in the cache?

    --hit: M is in cache and can be retrieved and used by CPU

    --miss: M is not in cache (M in RAM or in secondary memory)

    where is M?

    * M must be brought into cache

    * if there is room, M is copied into cache

    how do we know if there is room?

    * if there is no room, must overwrite some info M'

    how do we select M'?

    ++ if M' has not been modified, overwrite it

    how do we know if M' has been modified?

    ++ if M' has been modified, must save changes

    how do we save changes to M'?


    fig_04_34

    Example: direct mapping

    32-bit words, cache holds 64K words, in 128 0.5K blocks

    Memory addresses 32 bits

    Main memory 128M words; 2K pages, each holds 128 blocks (~ cache)

    fig_04_35

    fig_04_36

    2 bits--byte; 9 bits--word address;

    7 bits--block address (index);

    11 (of 15)--tag (page block is from)
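
    The field split can be illustrated with shifts and masks. This is a sketch under the field widths listed above (2, 9, 7 and 11 bits, from least to most significant); the function name is hypothetical, and any address bits above the 11-bit tag are simply ignored here.

```python
BYTE_BITS, WORD_BITS, INDEX_BITS, TAG_BITS = 2, 9, 7, 11


def split_address(addr):
    """Return (tag, index, word, byte) fields of a byte address."""
    byte = addr & ((1 << BYTE_BITS) - 1)     # byte within the 32-bit word
    addr >>= BYTE_BITS
    word = addr & ((1 << WORD_BITS) - 1)     # word within the 0.5K-word block
    addr >>= WORD_BITS
    index = addr & ((1 << INDEX_BITS) - 1)   # which of the 128 cache blocks
    addr >>= INDEX_BITS
    tag = addr & ((1 << TAG_BITS) - 1)       # which of the 2K pages
    return tag, index, word, byte


# Two addresses in different pages that map to the same cache block (index 3).
a = (((5 << INDEX_BITS | 3) << WORD_BITS | 17) << BYTE_BITS) | 2
b = (((9 << INDEX_BITS | 3) << WORD_BITS | 17) << BYTE_BITS) | 2
print(split_address(a), split_address(b))
```

    Addresses a and b differ only in the tag field, so they compete for the same cache block.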

    Tag table: 128 entries (one for each block in the cache). Contains:

    Tag: page block came from

    Valid bit: does this block contain data

    write-through: any change propagated immediately to main memory

    delayed write: since this data may change again soon, do not propagate change to main memory immediately; this saves overhead; instead, set the dirty bit

    Intermediate: use queue, update periodically

    When a new block is brought in, if the valid bit is true and the dirty bit is true, the old block must first be copied into main memory

    Replacement algorithm: none; each block only has one valid cache location
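
    A direct-mapped tag table with valid and dirty bits can be modeled as below. This is a simplified sketch, not a hardware description: it tracks tags only (no data), assumes the delayed-write policy, and the class and counter names are invented for the example.

```python
class DirectMappedCache:
    """Tracks tags only (no data) to count hits, misses and write-backs."""

    def __init__(self, num_blocks=128):
        self.num_blocks = num_blocks
        self.tags = [0] * num_blocks
        self.valid = [False] * num_blocks
        self.dirty = [False] * num_blocks
        self.hits = self.misses = self.write_backs = 0

    def access(self, block_number, write=False):
        """block_number is the main-memory block (page * num_blocks + index)."""
        index = block_number % self.num_blocks   # the only legal cache slot
        tag = block_number // self.num_blocks    # page the block comes from
        if self.valid[index] and self.tags[index] == tag:
            self.hits += 1
        else:
            self.misses += 1
            if self.valid[index] and self.dirty[index]:
                self.write_backs += 1            # old block saved to memory first
            self.tags[index] = tag
            self.valid[index] = True
            self.dirty[index] = False
        if write:
            self.dirty[index] = True             # delayed write: only mark it
        return index


cache = DirectMappedCache()
cache.access(0, write=True)   # miss; block 0 of page 0 is now dirty
cache.access(0)               # hit
cache.access(128)             # miss; page 1 maps to index 0 too, forcing a write-back
print(cache.hits, cache.misses, cache.write_backs)
```

    No replacement decision appears anywhere in the code: the index alone determines which block is overwritten.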

    fig_04_37

    Problem with direct mapping: two frequently used parts of code can be in different Block0s, so repeated swapping would be necessary; this can degrade performance unacceptably, especially in real-time systems (similar to thrashing in an operating system's virtual memory system)

    Another method: associative mapping: put new block anywhere in the cache; now we need an algorithm to decide which block should be removed, if cache is full


    fig_04_38

    Step 1: locate the desired block within the cache; must search tag table; linear search may be too slow, so search all entries in parallel or use hashing

    Step 2: if miss, decide which block to replace.

    a. Add time accessed to tag table info, use temporal locality:

    Least recently used (LRU): a FIFO-type algorithm

    Most recently used (MRU): a LIFO-type algorithm

    b. Choose a block at random

    Drawbacks: long search times; complexity and cost of supporting logic

    Advantages: more flexibility in managing cache contents
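
    The two steps can be sketched for the LRU case using Python's collections.OrderedDict. This is an illustrative model only: the dictionary lookup stands in for the parallel or hashed tag search, insertion order stands in for the stored access times, and the class name is an assumption.

```python
from collections import OrderedDict


class AssociativeCacheLRU:
    """Fully associative cache model: any block may occupy any slot."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block number -> None, least recent first
        self.hits = self.misses = 0

    def access(self, block):
        """Return the evicted block number on a replacing miss, else None."""
        if block in self.blocks:               # step 1: search the tag table
            self.hits += 1
            self.blocks.move_to_end(block)     # now the most recently used
            return None
        self.misses += 1
        evicted = None
        if len(self.blocks) >= self.capacity:  # step 2: cache full, pick victim
            evicted, _ = self.blocks.popitem(last=False)  # least recently used
        self.blocks[block] = None
        return evicted


cache = AssociativeCacheLRU(capacity=2)
for blk in [0, 128] * 5:      # two blocks that would share one direct-mapped slot
    cache.access(blk)
print(cache.hits, cache.misses)
```

    With room for both blocks anywhere in the cache, only the first access to each one misses.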

    fig_04_39

    Intermediate method: block-set associative cache

    Each index now specifies a set of blocks

    Main memory: divided into m blocks organized into n groups

    Group number = m mod n (here m is the block number)

    Cache set number ~ main memory group number

    Block from main memory group j can go into cache set j

    Search time is less, since search space is smaller

    How many blocks: simulation answer (one rule of thumb: doubling associativity ~ doubling cache size; > 4-way probably not efficient)

    Two-way set-associative scheme


    Example: 256K memory, 64 groups, 512 blocks

    Block                                  Group (m mod 64)

    0    64   128  . . .  384  448         0

    1    65   129  . . .  385  449         1

    2    66   130  . . .  386  450         2

    . . .

    63   127  191  . . .  447  511         63
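
    The group mapping and a two-way set lookup can be sketched as follows. The 64 groups and 512 blocks come from the example; the function and class names, and the choice of LRU replacement within each set, are assumptions made for illustration.

```python
from collections import OrderedDict


def group_of(block, n_groups=64):
    """Main-memory group (and cache set) for a block number."""
    return block % n_groups


def blocks_in_group(group, n_blocks=512, n_groups=64):
    """All main-memory blocks that map to the given group."""
    return list(range(group, n_blocks, n_groups))


class SetAssociativeCache:
    """n_sets sets of `ways` blocks each; LRU replacement inside a set."""

    def __init__(self, n_sets=64, ways=2):
        self.n_sets, self.ways = n_sets, ways
        self.sets = [OrderedDict() for _ in range(n_sets)]

    def access(self, block):
        """Return True on a hit, False on a miss."""
        s = self.sets[block % self.n_sets]   # only this set is searched
        if block in s:
            s.move_to_end(block)
            return True
        if len(s) >= self.ways:
            s.popitem(last=False)            # evict least recently used in set
        s[block] = None
        return False


cache = SetAssociativeCache()
results = [cache.access(b) for b in (0, 64, 0, 128, 0, 64)]
print(blocks_in_group(1)[:3], results)
```

    Blocks 0, 64 and 128 all belong to group 0, so at most two of them can be cached at once in the two-way scheme; the search, however, never has to look beyond those two entries.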

    fig_04_40

    Dynamic memory allocation (virtual storage):

    --for programs larger than main memory

    --for multiple processes in main memory

    --for multiple programs in main memory

    General strategies may not work well because of hard deadlines for real-time systems in embedded applications; general strategies are nondeterministic

    Simple setup:

    Can swap processes/programs and their contexts

    --Need storage (may be in firmware)

    --Need small swap time compared to run time

    --Need determinism

    Ex: chemical processing, thermal control


    fig_04_41

    Overlays (pre-virtual storage):

    Segment program into one main section and a set of overlays (kept in ROM?)

    Swap overlays

    Choose segmentation carefully to prevent thrashing


    fig_04_42

    Multiprogramming: similar to paging

    Fixed partition size: can get memory fragmentation

    Example: if each partition is 2K and we have 3 jobs: J1 = 1.5K, J2 = 0.5K, J3 = 2.1K

    Allocate to successive partitions (4)

    J2 is using only 0.5K

    J3 is using 2 partitions, one of which holds only 0.1K

    If a new job of size 1K enters the system, there is no place for it, even though there is actually enough unused memory for it
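
    The fragmentation in this example can be checked with a short calculation. The sketch assumes the system has exactly the 4 partitions mentioned above and works in tenths of a K to avoid floating-point rounding; the variable names are illustrative.

```python
PARTITION = 20                          # 2.0K partition, in tenths of a K
JOBS = {"J1": 15, "J2": 5, "J3": 21}    # 1.5K, 0.5K, 2.1K
TOTAL_PARTITIONS = 4                    # assumed total, per the example


def partitions_needed(size, partition=PARTITION):
    """Whole partitions a job occupies (ceiling division)."""
    return -(-size // partition)


used = {name: partitions_needed(size) for name, size in JOBS.items()}
free_partitions = TOTAL_PARTITIONS - sum(used.values())
wasted = sum(n * PARTITION - JOBS[name] for name, n in used.items())

new_job = 10                            # the 1K job from the example
print(used, free_partitions, wasted / 10)
print("fits in unused memory:", new_job <= wasted)
print("has a free partition:", free_partitions >= partitions_needed(new_job))
```

    The unused space is spread across partitions that are already assigned, so it cannot be handed to the new job without changing the partitioning scheme.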

    Variable size:

    Use a scheme like paging

    Include compaction

    Choose parameters carefully to prevent thrashing

    fig_04_43

    Memory testing:

    Components and basic architecture

    fig_04_45

    Faults to test: data and address lines; stuck-at and bridging

    (if we assume no internal manufacturing defects)

    fig_04_49

    ROM testing:

    stuck-at faults, bridging faults, correct data stored

    Method: CRC (cyclic redundancy check) or signature analysis

    Use LFSR to compress a data stream into a K-bit pattern, similar to error checking

    (Q: how is error checking done?)

    ROM contents modeled as N*M-bit data stream, N = address size (number of locations), M = word size
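
    A software model of the LFSR compression step is sketched below. The slides do not name a polynomial, so the widely used CRC-16-CCITT polynomial (0x1021) with an all-ones starting value is assumed here, giving K = 16; the function name and the sample ROM image are also made up for the example.

```python
def crc16(data, poly=0x1021, init=0xFFFF):
    """Bit-serial CRC: a 16-bit shift register with XOR feedback (an LFSR)."""
    reg = init
    for byte in data:
        reg ^= byte << 8                 # feed the next 8 bits into the register
        for _ in range(8):
            if reg & 0x8000:             # bit shifted out is 1: apply feedback
                reg = ((reg << 1) ^ poly) & 0xFFFF
            else:
                reg = (reg << 1) & 0xFFFF
    return reg


# Hypothetical ROM image: 256 locations of 8 bits each (N = 256, M = 8).
rom = bytes(range(256))
expected_signature = crc16(rom)          # computed once from known-good contents

# Simulate a single stuck bit at location 40 and re-read the ROM.
faulty = bytearray(rom)
faulty[40] ^= 0x04
print(hex(expected_signature), crc16(bytes(faulty)) == expected_signature)
```

    In a ROM self-test, the expected signature would be stored alongside the image and compared with the value recomputed at startup; a mismatch indicates a fault somewhere in the data stream.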


    Error checking: simple examples

    1. Detect a one-bit error: add a parity bit

    2. Correct a one-bit error: Hamming code

    Example: send m message bits + r parity bits

    The number of possible error positions is m + r + 1 (the m + r bit positions, plus the no-error case), so we need 2^r >= m + r + 1

    If m = 8, need r = 4; r_i checks the parity of the bit positions whose binary representation contains 2^i

    Pattern:

    Bit #:  1    2    3    4    5    6    7    8    9    10   11   12
    Info:   r0   r1   m1   r2   m2   m3   m4   r3   m5   m6   m7   m8
            ---  ---  1    ---  1    0    0    ---  0    1    1    1

    Set parity = 0 for each group:

    r0: bits 1 + 3 + 5 + 7 + 9 + 11 = r0 + 1 + 1 + 0 + 0 + 1, so r0 = 1

    r1: bits 2 + 3 + 6 + 7 + 10 + 11 = r1 + 1 + 0 + 0 + 1 + 1, so r1 = 1

    r2: bits 4 + 5 + 6 + 7 + 12 = r2 + 1 + 0 + 0 + 1, so r2 = 0

    r3: bits 8 + 9 + 10 + 11 + 12 = r3 + 0 + 1 + 1 + 1, so r3 = 1

    Exercise: suppose the message is sent and 1 bit is flipped in the received message. Compute the parity bits to see which bit is incorrect

    Addition: add an overall parity bit to end of message to also detect two errors

    Note:

    a. this is just one example; a more general formulation of Hamming codes using finite field arithmetic can also be given

    b. this is one example of how error correcting codes can be obtained; there are many more complex examples, e.g., Reed-Solomon codes used in CD players
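
    The Hamming example and the exercise can be worked through in code. This is a minimal sketch assuming even parity and at most one flipped bit; the function names are illustrative, and the extra overall parity bit for double-error detection is not included.

```python
def hamming_encode(data_bits):
    """Place data bits at non-power-of-two positions, then fill parity bits."""
    m = len(data_bits)
    r = 0
    while 2 ** r < m + r + 1:            # smallest r with 2^r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)                 # index 0 unused; positions are 1..n
    data = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):              # not a power of two: message bit
            code[pos] = next(data)
    for i in range(r):
        p = 1 << i                       # parity bit r_i sits at position 2^i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p and pos != p:     # positions with 2^i in their binary form
                parity ^= code[pos]
        code[p] = parity                 # makes the group's parity even
    return code[1:]


def hamming_syndrome(code):
    """XOR of the positions holding a 1; 0 means all parity checks pass."""
    syndrome = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            syndrome ^= pos
    return syndrome


def hamming_correct(code):
    """Flip the bit named by the syndrome (assumes at most one error)."""
    s = hamming_syndrome(code)
    fixed = list(code)
    if 0 < s <= len(fixed):
        fixed[s - 1] ^= 1
    return fixed, s


sent = hamming_encode([1, 1, 0, 0, 0, 1, 1, 1])   # m1..m8 from the pattern
received = list(sent)
received[5] ^= 1                                  # flip bit 6 in transit
fixed, position = hamming_correct(received)
print(sent, position, fixed == sent)
```

    The syndrome is the set of failed parity checks read as a binary number, which is why it names the position of the flipped bit directly.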