Memory Management. Background Memory consists of a large array of words or bytes, each with its own address. The CPU fetches instructions from memory

Memory Management

Background

• Memory consists of a large array of words or bytes, each with its own address. The CPU fetches instructions from memory according to the value of the program counter. These instructions may cause additional loading from and storing to specific memory addresses.

• Memory unit sees only a stream of memory addresses. It does not know how they are generated.

• Program must be brought into memory and placed within a process for it to be run.

• Input queue – collection of processes on the disk that are waiting to be brought into memory for execution.

• User programs go through several steps before being run.

Multistep Processing of a User Program

Binding of Instructions and Data to Memory

Address binding of instructions and data to memory addresses can happen at three different stages.

• Compile time: If the memory location is known a priori, absolute code can be generated; the code must be recompiled if the starting location changes. Example: .COM-format programs in MS-DOS.

• Load time: Relocatable code must be generated if the memory location is not known at compile time.

• Execution time: Binding is delayed until run time if the process can be moved during its execution from one memory segment to another. This needs hardware support for address maps (e.g., relocation registers).

Logical vs. Physical Address Space

• The concept of a logical address space that is bound to a separate physical address space is central to proper memory management.

– Logical address – address generated by the CPU; also referred to as virtual address.

– Physical address – address seen by the memory unit.

• The set of all logical addresses generated by a program is a logical address space; the set of all physical addresses corresponding to these logical addresses is a physical address space.

• Logical and physical addresses are the same in the compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in the execution-time address-binding scheme.

Memory-Management Unit (MMU)

• Hardware device that maps virtual address to physical address.

• In a simple MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory.

• The user program deals with logical addresses; it never sees the real physical addresses.

Dynamic relocation using a relocation register

Dynamic Loading

• A routine is not loaded until it is called.

• Better memory-space utilization; an unused routine is never loaded.

• Useful when large amounts of code are needed to handle infrequently occurring cases.

• No special support from the operating system is required.

• Implemented through program design.

Dynamic Linking

• Linking is postponed until execution time.

• A small piece of code, the stub, is used to locate the appropriate memory-resident library routine, or to load the library if the routine is not already present.

• The stub replaces itself with the address of the routine, and executes the routine.

• The operating system is needed to check whether the routine is in the process's memory address space.

• Dynamic linking is particularly useful for libraries.

Swapping

• A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution.

• Backing store – fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images.

• Roll out, roll in – swapping variant used for priority-based scheduling algorithms; lower-priority process is swapped out so higher-priority process can be loaded and executed.

• Major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped.

• Modified versions of swapping are found on many systems (e.g., UNIX, Linux, and Windows).

Schematic View of Swapping

Contiguous Allocation

• Main memory is usually divided into two partitions:

– Resident operating system, usually held in low memory with the interrupt vector

– User processes, held in high memory

• Single-partition allocation

– A relocation-register scheme is used to protect user processes from each other, and from changing operating-system code and data

– The relocation register contains the value of the smallest physical address; the limit register contains the range of logical addresses – each logical address must be less than the limit register

HW support for relocation and limit registers
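
The relocation-and-limit check described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any particular hardware: the names `translate` and `AddressingError` and the register values used in the example are assumptions chosen for the sketch.

```python
class AddressingError(Exception):
    """Hypothetical trap raised when a logical address is outside the limit."""


def translate(logical: int, relocation: int, limit: int) -> int:
    """Check the logical address against the limit register, then relocate it."""
    if not 0 <= logical < limit:
        # The hardware would trap to the operating system here.
        raise AddressingError(f"logical address {logical} is outside limit {limit}")
    return relocation + logical


# Assumed example values: relocation register = 14000, limit register = 1000.
print(translate(346, relocation=14000, limit=1000))
```

Every logical address is checked first and relocated second, so a process can never reach memory below its relocation base or beyond its partition.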

Memory Allocation

How to satisfy a request of size n from a list of free blocks:

• First-fit: Allocate the first block that is big enough.

• Best-fit: Allocate the smallest block that is big enough; must search the entire list, unless it is ordered by size. Produces the smallest leftover block.

• Worst-fit: Allocate the largest block; must also search the entire list. Produces the largest leftover block.

First-fit and best-fit are better than worst-fit in terms of speed and storage utilization.
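
To make the three placement strategies concrete, here is a small Python sketch. It assumes the free list is simply a list of hole sizes and returns the index of the chosen hole; the function names and the sample hole sizes are illustrative assumptions.

```python
from typing import List, Optional


def first_fit(holes: List[int], n: int) -> Optional[int]:
    """Return the index of the first hole that is big enough, or None."""
    for i, size in enumerate(holes):
        if size >= n:
            return i
    return None


def best_fit(holes: List[int], n: int) -> Optional[int]:
    """Return the index of the smallest hole that is big enough, or None."""
    fits = [(size, i) for i, size in enumerate(holes) if size >= n]
    return min(fits)[1] if fits else None


def worst_fit(holes: List[int], n: int) -> Optional[int]:
    """Return the index of the largest hole, if it is big enough, or None."""
    fits = [(size, i) for i, size in enumerate(holes) if size >= n]
    return max(fits)[1] if fits else None


holes = [100, 500, 200, 300, 600]  # assumed free-block sizes
request = 212
print(first_fit(holes, request), best_fit(holes, request), worst_fit(holes, request))
```

For this free list, first-fit picks the 500 block, best-fit the 300 block, and worst-fit the 600 block. Only first-fit can stop before scanning the whole list.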

Fragmentation

• External Fragmentation – total memory space exists to satisfy a request, but it is not contiguous.

• Internal Fragmentation – allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used.

• Reduce external fragmentation by compaction

– Shuffle memory contents to place all free memory together in one large block.

– Compaction is possible only if relocation is dynamic, and is done at execution time.

Paging

• Physical address space of a process can be noncontiguous; the process is allocated physical memory whenever the latter is available.

• Divide physical memory into fixed-sized blocks called frames (size is a power of 2, for example 512 bytes).

• Divide logical memory into blocks of the same size called pages.

• Keep track of all free frames.

• To run a program of size n pages, need to find n free frames and load the program.

• Set up a page table to translate logical to physical addresses.

• Internal fragmentation may occur.

Address Translation Scheme

• Address generated by the CPU is divided into:

– Page number (p) – used as an index into a page table, which contains the base address of each page in physical memory.

– Page offset (d) – combined with the base address to define the physical memory address that is sent to the memory unit.
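
The split into page number and offset can be illustrated with a short Python sketch. It assumes a 4-byte page size and a hypothetical four-entry page table `[5, 6, 1, 2]`; both are illustrative values chosen for the sketch.

```python
PAGE_SIZE = 4                            # bytes; assumed, must be a power of 2
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # number of low-order bits holding d


def translate(logical: int, page_table: list) -> int:
    """Translate a logical address using a page table of frame numbers."""
    p = logical >> OFFSET_BITS       # page number: high-order bits
    d = logical & (PAGE_SIZE - 1)    # page offset: low-order bits
    frame = page_table[p]            # index the page table with p
    return (frame << OFFSET_BITS) | d  # frame base address combined with d


page_table = [5, 6, 1, 2]  # hypothetical mapping: page i lives in frame page_table[i]
for logical in (0, 3, 4, 13):
    print(logical, "->", translate(logical, page_table))
```

Because the page size is a power of 2, the split needs only a shift and a mask, with no division.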

Address Translation Architecture

Paging Example (page size: 4 bytes)

Free Frames: before allocation and after allocation

Hardware Support

• Most operating systems allocate a page table for each process. A pointer to the page table is stored with the other register values in the PCB.

• When the dispatcher starts a process, it must reload the user registers and define the correct hardware page-table values from the stored user page table.

• The hardware implementation can be done in these ways:

– Set of dedicated registers, built with high-speed logic to make page-address translation efficient

– Page table kept in main memory, with a page-table base register (PTBR) pointing to the page table. In this scheme every data or instruction access requires two memory accesses: one for the page-table entry and one for the byte.

Hardware Support

• The two-memory-access problem can be solved by the use of a special fast-lookup hardware cache called associative registers, associative memory, or a translation look-aside buffer (TLB).

• Each TLB entry consists of two parts: a key and a value. An item to be searched for is compared with all keys simultaneously. If the item is found, the corresponding value is returned.

• Fast but expensive.

• Typically, the number of entries in a TLB is between 64 and 1024.

Associative Memory

• Associative memory – parallel search over (page #, frame #) pairs

• Address translation (P, F)

– If P is in an associative register, get the frame # out.

– Otherwise, get the frame # from the page table in memory.

Paging Hardware With TLB

• Some TLBs store an address-space identifier (ASID) in each TLB entry. An ASID uniquely identifies a process and is used to provide address-space protection for that process.

• When the TLB attempts to resolve a virtual page number, it ensures that the ASID of the currently running process matches the ASID associated with the virtual page.

• If the ASIDs do not match, the attempt is treated as a TLB miss.

• ASIDs allow the TLB to contain entries for several processes simultaneously.
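
The following Python sketch models a TLB lookup keyed by (ASID, page number). It is a software approximation only: real TLBs compare all keys in parallel in hardware, and the FIFO eviction used here when the TLB is full is an assumption of the sketch, as are the class and method names.

```python
from collections import OrderedDict


class TLB:
    """Toy TLB: maps (asid, page) keys to frame numbers."""

    def __init__(self, capacity: int = 64):
        self.capacity = capacity
        self.entries = OrderedDict()  # (asid, page) -> frame, oldest entry first
        self.hits = 0
        self.misses = 0

    def lookup(self, asid: int, page: int, page_table: dict) -> int:
        key = (asid, page)
        if key in self.entries:          # TLB hit: ASID and page both match
            self.hits += 1
            return self.entries[key]
        self.misses += 1                 # TLB miss: consult the page table in memory
        frame = page_table[page]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # assumed policy: evict the oldest entry
        self.entries[key] = frame
        return frame


tlb = TLB(capacity=2)
table_a, table_b = {0: 5}, {0: 9}   # hypothetical page tables of two processes
tlb.lookup(1, 0, table_a)           # miss: first reference by process 1
tlb.lookup(1, 0, table_a)           # hit
tlb.lookup(2, 0, table_b)           # miss: same page number, different ASID
print(tlb.hits, tlb.misses)
```

Including the ASID in the key is what lets entries of several processes coexist without the TLB being flushed on every context switch.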

Segmentation

• Memory-management scheme that supports user view of memory.

• A program is a collection of segments. Each segment has a name and a length. An address specifies both the segment name and the offset within the segment.

• A segment is a logical unit such as: main program, procedure, function, method, object, local variables, global variables, common block, stack, symbol table, arrays.

User’s View of a Program

Logical View of Segmentation (figure: segments 1 to 4 in user space mapped into physical memory space)

Segmentation Architecture

• Logical address consists of a two-tuple: <segment-number, offset>

• Segment table – maps two-dimensional user-defined addresses into one-dimensional physical addresses; each table entry has:

– base – contains the starting physical address where the segment resides in memory.

– limit – specifies the length of the segment.

• Segment-table base register (STBR) points to the segment table’s location in memory.

• Segment-table length register (STLR) indicates the number of segments used by a program; segment number s is legal if s < STLR.
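
A minimal Python sketch of segment-table translation follows. The segment table is assumed to be a list of (base, limit) pairs whose length plays the role of the STLR; the table contents and the name `SegmentationTrap` are illustrative assumptions.

```python
class SegmentationTrap(Exception):
    """Hypothetical trap for an illegal segment number or offset."""


def translate(segment_table: list, s: int, d: int) -> int:
    """Translate the logical address <s, d> to a physical address."""
    if not 0 <= s < len(segment_table):   # s must be less than STLR
        raise SegmentationTrap(f"illegal segment number {s}")
    base, limit = segment_table[s]
    if not 0 <= d < limit:                # the offset must lie within the segment
        raise SegmentationTrap(f"offset {d} is outside segment {s}")
    return base + d


# Hypothetical segment table: (base, limit) for segments 0 to 3.
segment_table = [(1400, 1000), (6300, 400), (4300, 400), (3200, 1100)]
print(translate(segment_table, 2, 53))
print(translate(segment_table, 3, 852))
```

Unlike paging, the offset is added to the base rather than concatenated with it, because segments can start at any address and have different lengths.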

Segmentation Hardware

Example of Segmentation

Sharing of Segments

Segmentation with Paging

• Both paging and segmentation have their advantages and disadvantages.

• Problems of external fragmentation and lengthy search times can be solved by paging the segments.

• Solution differs from pure segmentation in that the segment-table entry contains not the base address of the segment, but rather the base address of a page table for this segment.

Virtual-Memory Management

Background

• Virtual memory – separation of user logical memory from physical memory.

– Allows an extremely large virtual memory to be provided for programmers when only a smaller physical memory is available.

– Only part of the program needs to be in memory for execution.

– Logical address space can therefore be much larger than physical address space.

– Allows address spaces to be shared by several processes.

– Allows for more efficient process creation.

• Virtual memory can be implemented via:

– Demand paging

– Demand segmentation

Virtual Memory That is Larger Than Physical Memory

Virtual-address Space

Shared Library Using Virtual Memory

Demand Paging

• Demand paging is the technique of bringing a page into memory only when it is needed; it is used in virtual memory systems.

• The pager brings only the required pages, rather than the whole process, into main memory.

• Benefits:

– Less I/O needed

– Less memory needed

– Faster response

– More users

• A page is needed ⇒ reference to it:

– invalid reference ⇒ abort

– not in memory ⇒ bring to memory

Transfer of a Paged Memory to Contiguous Disk Space

• To distinguish between the pages that are in memory and the pages that are on the disk, a valid–invalid bit scheme is used.

– The bit is set to ‘valid’ if the page is both legal and in memory.

– The bit is set to ‘invalid’ if the page is either not valid (not in the logical address space of the process) or is valid but not in main memory.

• While the process executes and accesses pages that are memory resident, execution proceeds normally.

• If the process tries to access a page that is not in memory (a page marked invalid), the access causes a page-fault trap to the OS, because the OS has not yet brought the desired page into memory.

Page Table When Some Pages Are Not in Main Memory

Procedure for handling page fault

1. Check the page table (kept with the PCB) for this process to determine whether the reference was a valid or an invalid memory access.

2. If the reference was invalid, the process is terminated. If it was valid but the page has not yet been brought in, it is paged in.

3. A free frame is located.

4. A disk operation is initiated to read the desired page into the newly allocated frame.

5. On completion of the disk read, the page table of the process is modified to indicate that the page is now in memory.

6. The instruction that was trapped is restarted. The process can now access the page as though it had always been there.
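
The six steps above can be sketched as a toy Python model. It assumes that a free frame is always available (no page replacement) and replaces the disk read with a comment; the class, method, and exception names are hypothetical.

```python
class InvalidReference(Exception):
    """Hypothetical error: the page is outside the process's logical address space."""


class Process:
    def __init__(self, num_pages: int, free_frames: list):
        self.num_pages = num_pages      # size of the logical address space in pages
        self.page_table = {}            # page -> frame; holds only resident (valid) pages
        self.free_frames = free_frames  # assumed to be non-empty whenever a fault occurs
        self.faults = 0

    def access(self, page: int) -> int:
        # Steps 1-2: an illegal reference terminates the process (here: raises).
        if not 0 <= page < self.num_pages:
            raise InvalidReference(page)
        if page not in self.page_table:      # legal but not in memory: page fault
            self.faults += 1
            frame = self.free_frames.pop()   # step 3: locate a free frame
            # Step 4: the disk read into `frame` would be issued here.
            self.page_table[page] = frame    # step 5: mark the page as in memory
        # Step 6: the access is retried and now succeeds.
        return self.page_table[page]


proc = Process(num_pages=4, free_frames=[0, 1, 2])
for page in (0, 0, 1):
    proc.access(page)
print(proc.faults)
```

Only the first touch of each page faults; repeated accesses to a resident page go straight through the page table.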

Steps in Handling a Page Fault

• In the extreme case, a process starts executing with no pages in memory.

• When the OS sets the instruction pointer to the first instruction of the process, which is on a non-memory-resident page, the process immediately faults for that page.

• After this page is brought into memory, the process continues to execute, faulting as necessary until every page it needs is in memory.

• When all the required pages are in memory, the process executes with no more faults. This scheme is called pure demand paging: never bring a page into memory until it is needed.

• Hardware support:

– Page table

– Secondary memory, to hold the pages that are not in main memory

Performance of Demand Paging

• Page-fault rate p: 0 ≤ p ≤ 1.0

– if p = 0, there are no page faults

– if p = 1, every reference is a fault

• Effective Access Time (EAT):

EAT = (1 – p) × memory access time + p × page-fault time

where page-fault time = page-fault overhead + [swap page out] + swap page in + restart overhead

A page fault causes the following sequence to occur:

• Trap to the OS.

• Save the user registers and process state.

• Determine that the interrupt was a page fault.

• Check that the page reference was legal and determine the location of the page on the disk.

• Issue a read from the disk to a free frame:

– Wait in a queue for this device until the read request is serviced.

– Wait for the device seek and/or latency time.

– Begin the transfer of the page to a free frame.

• While waiting, allocate the CPU to another process.

• Receive an interrupt from the disk I/O subsystem.

• Save the registers and process state of the other user.

• Determine that the interrupt was from the disk.

• Correct the page table to show that the page is now in memory.

• Wait for the CPU to be allocated to this process again.

• Restore the user registers, process state, and new page table, and then resume the interrupted instruction.

Example to calculate EAT

Average page-fault service time = 8 milliseconds

Memory access time = 200 nanoseconds

EAT = (1 – p) × 200 + p × 8,000,000 = 200 + 7,999,800 × p (in nanoseconds)

EAT is directly proportional to the page-fault rate. If one access out of 1,000 causes a page fault (p = 0.001), then

EAT = 200 + 7,999,800 × 0.001 = 8,199.8 nanoseconds ≈ 8.2 microseconds

If we want performance to degrade by less than 10%, we need

220 > 200 + 7,999,800 × p

20 > 7,999,800 × p

p < 0.0000025

It is important to keep the page-fault rate low in order to keep the effective access time low.
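
The arithmetic of this example can be checked with a few lines of Python. The function name and its default arguments are illustrative; the numbers are the ones used in the example (200 ns memory access, 8 ms page-fault service time).

```python
def effective_access_time(p: float,
                          memory_access_ns: float = 200,
                          fault_service_ns: float = 8_000_000) -> float:
    """EAT = (1 - p) * memory access time + p * page-fault time, in nanoseconds."""
    return (1 - p) * memory_access_ns + p * fault_service_ns


# One fault per 1,000 accesses.
print(effective_access_time(0.001))

# Largest p that keeps the EAT below 220 ns (a 10% slowdown):
# 220 > 200 + 7,999,800 * p  =>  p < 20 / 7,999,800
max_p = (220 - 200) / (8_000_000 - 200)
print(max_p)
```

The bound works out to roughly one page fault per 400,000 memory accesses.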

Page Replacement

• Prevent over-allocation of memory by modifying the page-fault service routine to include page replacement.

• Use a modify (dirty) bit to reduce the overhead of page transfers – only modified pages are written to disk.

• Page replacement completes the separation between logical memory and physical memory – a large virtual memory can be provided on a smaller physical memory.

Need For Page Replacement

Basic Page Replacement

1. Find the location of the desired page on disk.

2. Find a free frame:

– If there is a free frame, use it.

– If there is no free frame, use a page-replacement algorithm to select a victim frame.

– Write the victim frame to the disk, and change the page and frame tables accordingly.

3. Read the desired page into the (newly) free frame. Update the page and frame tables.

4. Restart the process.

To evaluate a page-replacement algorithm, a reference string (a string of memory references) is used.

Page Replacement

Graph of Page Faults Versus The Number of Frames

FIFO Page Replacement

• This algorithm associates with each page the time when that page was brought into memory.

• When a page has to be replaced, the oldest page is chosen. This can be implemented with a FIFO queue: the new page that is brought in is inserted at the end of the queue, and the page at the head is replaced.

FIFO Page Replacement

Easy to understand and program but performance is not always good.

Even if the page selected for replacement is in active use, everything still works correctly: after an active page is replaced with a new one, a fault occurs almost immediately to retrieve the active page.

A bad replacement choice increases the page fault rate and slows down process execution.

FIFO Illustrating Belady’s Anomaly
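
A short Python simulation of FIFO replacement makes Belady's anomaly visible. The reference string below is a commonly used textbook example and is an assumption of this sketch, as is the function name.

```python
from collections import deque


def fifo_faults(refs: list, num_frames: int) -> int:
    """Count page faults for FIFO replacement with `num_frames` frames."""
    queue = deque()    # resident pages, oldest at the left
    resident = set()   # same pages, kept for fast membership tests
    faults = 0
    for page in refs:
        if page in resident:
            continue                              # hit: FIFO order is not updated
        faults += 1
        if len(queue) == num_frames:
            resident.discard(queue.popleft())     # evict the oldest page
        queue.append(page)
        resident.add(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]   # assumed example reference string
print(fifo_faults(refs, 3), fifo_faults(refs, 4))
```

With this string, four frames produce more faults (10) than three frames (9): adding memory makes FIFO worse here, which is exactly the anomaly.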

Optimal Page Replacement

• Replace the page that will not be used for the longest period of time.

• Guarantees the lowest possible page-fault rate for a fixed number of frames.

• Better than FIFO page replacement.

• Difficult to implement, as it requires future knowledge of the reference string.
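
Because the whole reference string is known in a simulation, the optimal policy can be sketched offline in Python. The reference string and the function name are assumptions of this sketch; a real system cannot run this algorithm, since it needs future references.

```python
def optimal_faults(refs: list, num_frames: int) -> int:
    """Count page faults when evicting the page whose next use is farthest away."""
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)
            continue
        future = refs[i + 1:]

        def next_use(p):
            # Distance to the next reference; pages never used again sort last.
            return future.index(p) if p in future else float("inf")

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]  # assumed example
print(optimal_faults(refs, 3))
```

The result (9 faults with three frames) is a lower bound that other algorithms can be compared against on the same string.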

Optimal Page Replacement

Least Recently Used (LRU) Page Replacement

This algorithm associates with each page the time of that page’s last use. When a page must be replaced, LRU chooses the page that has not been used for the longest period of time.

LRU Page Replacement

• Good performance, but difficult to implement; requires substantial hardware assistance.

• Two implementations are feasible:

– Counters: each page-table entry has a time-of-use field, and the CPU has a logical clock or counter. Whenever a reference to a page is made, the contents of the clock register are copied to the time-of-use field in the page-table entry of that page. The page with the smallest time value is replaced.

– Stack: keep a stack of page numbers in doubly linked form (to record the most recent page references). When a page is referenced, move it to the top; with a doubly linked list this requires at most 6 pointers to be changed. No search is needed for replacement, because the least recently used page is always at the bottom.
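
The stack implementation can be approximated in Python with `collections.OrderedDict`, which keeps keys in a doubly linked order. This is a software sketch of the idea, not the hardware mechanism; the function name and the reference string are assumptions.

```python
from collections import OrderedDict


def lru_faults(refs: list, num_frames: int) -> int:
    """Count page faults for LRU replacement with `num_frames` frames."""
    stack = OrderedDict()   # least recently used page first, most recent last
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)      # referenced: move the page to the top
            continue
        faults += 1
        if len(stack) == num_frames:
            stack.popitem(last=False)    # evict from the bottom, no search needed
        stack[page] = True
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]  # assumed example
print(lru_faults(refs, 3))
```

Every reference updates the order, which is why a pure-software LRU on every memory access would be too slow and hardware support is needed.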

Use Of A Stack to Record The Most Recent Page References

LRU Approximation Algorithms

• Reference bit

– With each page associate a bit, initially = 0.

– When the page is referenced, the bit is set to 1.

– Replace a page whose bit is 0 (if one exists). We do not know the order of use, however.

• Second chance

– Needs a reference bit.

– Clock replacement.

– If the page to be replaced (in clock order) has reference bit = 1, then:

  • set the reference bit to 0

  • leave the page in memory

  • replace the next page (in clock order), subject to the same rules
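
The second-chance rules can be sketched as a clock simulation in Python. One simplification is assumed: a newly loaded page starts with reference bit 0, and the bit is set to 1 only on a later reference. The function name and the sample reference string are also assumptions.

```python
def clock_replacement(refs: list, num_frames: int):
    """Simulate second-chance (clock) replacement; return (faults, final frames)."""
    frames = [None] * num_frames
    ref_bit = [0] * num_frames
    hand = 0                                    # position of the clock hand
    faults = 0
    for page in refs:
        if page in frames:
            ref_bit[frames.index(page)] = 1     # referenced again: set the bit
            continue
        faults += 1
        while ref_bit[hand] == 1:               # give the page a second chance
            ref_bit[hand] = 0
            hand = (hand + 1) % num_frames
        frames[hand] = page                     # replace the page with bit 0
        ref_bit[hand] = 0                       # assumed: bit starts at 0 on load
        hand = (hand + 1) % num_frames
    return faults, frames


faults, frames = clock_replacement([1, 2, 3, 1, 4, 2, 1], 3)
print(faults, frames)
```

In this run page 1 is the oldest page when 4 arrives, but its reference bit is set, so it is spared and page 2 is replaced instead. If all bits are 1, the hand clears them all and the algorithm degenerates to FIFO.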

Second-Chance (clock) Page-Replacement Algorithm

Counting Algorithms

Keep a counter of the number of references that have been made to each page

LFU Algorithm: replaces page with smallest count

MFU Algorithm: based on the argument that the page with the smallest count was probably just brought in and has yet to be used

Thrashing

• If a process does not have “enough” pages, the page-fault rate is very high. This leads to:

– low CPU utilization

– the operating system thinking that it needs to increase the degree of multiprogramming

– another process being added to the system

• Thrashing ≡ a process is busy swapping pages in and out.

Thrashing (Cont.)

Locality In A Memory-Reference Pattern

Working-Set Model

• Δ ≡ working-set window ≡ a fixed number of page references. Example: 10,000 instructions

• WSSi (working-set size of process Pi) = total number of pages referenced in the most recent Δ (varies in time)

– if Δ is too small, it will not encompass the entire locality

– if Δ is too large, it will encompass several localities

– if Δ = ∞, it will encompass the entire program

• D = Σ WSSi ≡ total demand for frames

• if D > m (the total number of available frames) ⇒ thrashing

• Policy: if D > m, then suspend one of the processes
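
A small Python sketch of the working-set computation follows. The window is counted in page references, the reference string is an assumed example, and the helper names `working_set` and `is_thrashing` are hypothetical.

```python
def working_set(refs: list, t: int, delta: int) -> set:
    """Pages referenced in the `delta` most recent references ending at index t."""
    start = max(0, t - delta + 1)
    return set(refs[start:t + 1])


def is_thrashing(working_set_sizes: list, m: int) -> bool:
    """True if the total demand D = sum of WSSi exceeds the m available frames."""
    return sum(working_set_sizes) > m


refs = [2, 6, 1, 5, 7, 7, 7, 7, 5, 1, 6, 2, 3, 4, 1, 2, 3, 4, 4, 4]  # assumed example
ws = working_set(refs, t=9, delta=10)
print(sorted(ws), len(ws))
print(is_thrashing([len(ws), 4, 6], m=12))
```

When the check returns True, the policy above says to suspend one process so that the remaining working sets fit in memory.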

Working-set model

Page-Fault Frequency Scheme

• Establish an “acceptable” page-fault rate.

– If the actual rate is too low, the process loses a frame.

– If the actual rate is too high, the process gains a frame.
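
The page-fault frequency rule can be written as a one-step controller in Python. The lower and upper bounds, the one-frame step, and the function name are all assumptions of this sketch; the scheme itself only says that frames are taken away below the acceptable range and granted above it.

```python
def adjust_frames(frames: int, fault_rate: float,
                  lower: float = 0.01, upper: float = 0.05) -> int:
    """Return the new frame allocation for a process given its measured fault rate."""
    if fault_rate > upper:
        return frames + 1          # too many faults: the process gains a frame
    if fault_rate < lower and frames > 1:
        return frames - 1          # very few faults: the process loses a frame
    return frames                  # within the acceptable band: no change


print(adjust_frames(8, 0.10), adjust_frames(8, 0.001), adjust_frames(8, 0.03))
```

Compared with the working-set model, this controls thrashing directly from the measured fault rate and does not need to track which pages were referenced.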
