Memory Management and Virtual Memory

Outline: Background; Overlays versus Swapping; Contiguous Allocation.
Memory-Management Unit (MMU)
The run-time mapping from virtual to physical addresses is done by the memory-management unit (MMU), which is a hardware device. The user program deals with logical addresses; it never sees the real physical addresses. There are two different types of addresses: logical addresses, in the range 0 to max, and physical addresses, in the range R+0 to R+max, where R is a base value. The user generates only logical addresses and thinks that the process runs in locations 0 to max.
Memory-Management Unit (MMU) (Cont.)
In the MMU scheme, the value in the relocation register is added to every address generated by a user process at the time the address is sent to memory. Example: dynamic relocation using a relocation register. With the relocation register holding 1400, a logical address of 345 generated by the CPU is mapped by the MMU to the physical address 1745.
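The following minimal C sketch mimics this dynamic-relocation example in software; the relocation-register value and the sample logical address are the ones from the example above.

```c
#include <stdio.h>

/* Relocation register from the example above (base value R = 1400). */
static const unsigned relocation_register = 1400;

/* The MMU adds the base value to every logical address. */
unsigned mmu_translate(unsigned logical)
{
    return logical + relocation_register;
}

int main(void)
{
    unsigned logical = 345;
    printf("logical %u -> physical %u\n", logical, mmu_translate(logical));
    return 0;   /* prints: logical 345 -> physical 1745 */
}
```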
Overlays
The entire program and data of a process must be in physical memory for the process to execute, so the size of a process is limited to the size of physical memory. If a process is larger than the available memory, a technique called overlays can be used. The idea of overlays is to keep in memory only those instructions and data that are needed at any given time. When other instructions are needed, they are loaded into the space previously occupied by instructions that are no longer needed. Overlays are implemented by the user; no special support is needed from the operating system, but the programming design of an overlay structure is complex.
Overlays Example
Example: consider a two-pass assembler. Pass 1 constructs a symbol table; Pass 2 generates machine-language code. Assume the following sizes (K = 1024 bytes):

Pass 1             70K
Pass 2             80K
Symbol table       20K
Common routines    30K
Total size        200K

To load everything at once, we would need 200K of memory.
Overlays Example (Cont.)
If only 150K is available, we cannot run the whole process at once. Notice, however, that Pass 1 and Pass 2 do not need to be in memory at the same time. So we define two overlays: Overlay A contains the symbol table, the common routines, and Pass 1; Overlay B contains the symbol table, the common routines, and Pass 2. We add an overlay driver of 10K and start with Overlay A in memory. When Pass 1 finishes, we jump to the overlay driver, which reads Overlay B into memory, overwriting Overlay A, and then transfers control to Pass 2.
Overlays Example (Cont.)
With the 10K overlay driver resident, Overlay A needs 130K (10K driver + 20K symbol table + 30K common routines + 70K Pass 1) and Overlay B needs 140K (10K + 20K + 30K + 80K Pass 2), so either overlay fits in the available 150K.
Swapping
A process can be swapped temporarily out of memory to a backing store and then brought back into memory for continued execution. The backing store is a fast disk, large enough to accommodate copies of all memory images for all users; it must provide direct access to these memory images. Roll out, roll in is a swapping variant used with priority-based scheduling algorithms: a lower-priority process is swapped out so that a higher-priority process can be loaded and executed. Normally, a process that is swapped out will be swapped back into the same memory space it occupied previously. Example: in a multiprogramming environment with round-robin CPU scheduling, when a time slice expires, the memory manager swaps out the process that just finished and swaps another process into the memory space that has been freed.
Schematic View of Swapping (diagram: a process is swapped out of main memory to the backing store while another process is swapped in).
The major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped. The context-switch time in a swapping system is therefore high. Modified versions of swapping are found on many systems, e.g., UNIX and Microsoft Windows.
Example 1: Swapping
Let process P1 be 100KB and the transfer rate of the hard disk be 1MB per second. To transfer P1 (100KB) to or from memory takes 100K / 1000K per second = 1/10 second = 100 milliseconds. Assume the average latency is 8 milliseconds; then a swap in or a swap out takes 108 milliseconds. The total swap time (out and in) is 216 milliseconds for a 100KB process. For efficient CPU utilization, we want the execution time of each process to be long relative to the swap time. With a round-robin scheduling algorithm, the time slice should therefore be larger than 216 milliseconds (from the above example).
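The arithmetic in this example can be checked with a short C program; the size, transfer rate, and latency are the values assumed above.

```c
#include <stdio.h>

/* Swap-time arithmetic from Example 1: transfer time plus average disk
 * latency, for a swap out followed by a swap in. */
int main(void)
{
    double process_kb = 100.0;   /* process size in KB             */
    double rate_kb_ms = 1.0;     /* 1 MB/s = 1000 KB/s = 1 KB/ms   */
    double latency_ms = 8.0;     /* average disk latency           */

    double transfer_ms = process_kb / rate_kb_ms;   /* 100 ms */
    double one_way_ms  = transfer_ms + latency_ms;  /* 108 ms */
    double total_ms    = 2.0 * one_way_ms;          /* 216 ms */

    printf("one-way swap: %.0f ms, total swap: %.0f ms\n",
           one_way_ms, total_ms);
    return 0;
}
```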
Contiguous Allocation
Main memory is usually divided into two partitions: one for the resident operating system and one for the user processes. (Figure: a 1MB memory with the O.S. in the first 100K and the user area above it.)
Example 2: Swapping
A computer system has 1MB of main memory and the O.S. takes 100KB, so the maximum size of a user process is 900KB (a user process may, of course, be smaller than 900KB). From the previous example, if process P1 is 100K the swap time is 108 milliseconds, but if P1 is 900K the swap time is 908 milliseconds. As the size of a process increases, the swap time increases too.
Single Partition Allocation
Single-partition allocation needs protection: we must protect O.S. code and data from changes by the user processes, and protect user processes from one another. We can provide this protection by using a relocation register together with a limit register. The relocation register contains the value of the smallest physical address; the limit register contains the range of logical addresses. Each logical address must be less than the value in the limit register.
Hardware Support for Relocation & Limit Registers
(Diagram: the CPU generates a logical address, which is compared with the limit register. If it is less than the limit, the relocation register is added to it to form the physical address sent to memory; otherwise a trap (addressing error) is raised.)
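A small C sketch of the check this hardware performs; the register values here are arbitrary illustrative numbers, and the call to exit() merely stands in for the addressing-error trap.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative relocation and limit registers for one process. */
static const unsigned limit_register      = 1000;   /* size of logical space  */
static const unsigned relocation_register = 14000;  /* smallest physical addr */

/* Translate a logical address, trapping on an out-of-range reference. */
unsigned translate(unsigned logical)
{
    if (logical >= limit_register) {
        fprintf(stderr, "trap: addressing error (logical %u)\n", logical);
        exit(EXIT_FAILURE);        /* stand-in for the hardware trap */
    }
    return logical + relocation_register;
}

int main(void)
{
    printf("logical 345 -> physical %u\n", translate(345));  /* ok               */
    printf("never printed: %u\n", translate(2000));          /* traps in translate */
    return 0;
}
```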
Multiple-Partition Allocation
Several user processes reside in memory at the same time. Memory can be divided into a number of fixed-sized partitions, where each partition may contain exactly one process; the degree of multiprogramming is then bound by the number of partitions. Alternatively, all memory is available for user processes as one large block (hole). When a process arrives and needs memory, we search for a hole large enough for it. If we find one, we allocate only as much memory as is needed, keeping the rest available to satisfy future requests.
Multiple-Partition Allocation (Cont.)
A hole is a block of available memory; holes of various sizes are scattered throughout memory. The operating system maintains information about (a) the allocated partitions and (b) the free partitions (holes). (Figure: the OS and processes 5, 8, 2, 9, and 10 resident in memory.)
Multiple-Partition Allocation (Cont.)
When no available block of memory (hole) is large enough to hold a process, the O.S. waits until a large enough block is available. In general, there is at any time a set of holes of various sizes scattered throughout memory. When a process arrives and needs memory, we search this set for a hole that is large enough. If the hole is too large, it is split into two parts: one part is allocated to the arriving process, and the other is returned to the set of holes. When a process terminates, it releases its block of memory, which is then placed back in the set of holes. If the new hole is adjacent to other holes, these adjacent holes are merged to form one larger hole.
Dynamic Storage-Allocation Problem
First-fit, best-fit, and worst-fit are the most common strategies used to select a free hole from the set of available holes.
First-fit: allocate the first hole that is big enough. Searching starts at the beginning of the set of holes, and we can stop as soon as we find a free hole that is large enough.
Best-fit: allocate the smallest hole that is big enough; we must search the entire list, unless it is ordered by size. This strategy produces the smallest leftover hole.
Worst-fit: allocate the largest hole; we must also search the entire list, unless it is ordered by size. This strategy produces the largest leftover hole, which may be more useful than the smaller leftover hole from the best-fit approach.
First-fit and best-fit are better than worst-fit in terms of speed and storage utilization.
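A compact C sketch of the three selection strategies over a simple array of hole sizes; the hole sizes and the request size are made-up illustrative values, and a real allocator would keep a proper free list and split the chosen hole.

```c
#include <stdio.h>
#include <stddef.h>

/* Pick a hole index for a request of `need` KB.
 * strategy: 0 = first-fit, 1 = best-fit, 2 = worst-fit.
 * Returns -1 if no hole is large enough. */
int pick_hole(const int holes[], size_t n, int need, int strategy)
{
    int chosen = -1;
    for (size_t i = 0; i < n; i++) {
        if (holes[i] < need)
            continue;
        if (strategy == 0)                                    /* first-fit */
            return (int)i;
        if (chosen == -1 ||
            (strategy == 1 && holes[i] < holes[chosen]) ||    /* best-fit  */
            (strategy == 2 && holes[i] > holes[chosen]))      /* worst-fit */
            chosen = (int)i;
    }
    return chosen;
}

int main(void)
{
    int holes[] = { 100, 500, 200, 300, 600 };   /* hole sizes in KB */
    size_t n = sizeof holes / sizeof holes[0];
    printf("first-fit: %d  best-fit: %d  worst-fit: %d\n",
           pick_hole(holes, n, 212, 0),
           pick_hole(holes, n, 212, 1),
           pick_hole(holes, n, 212, 2));
    return 0;   /* first-fit picks the 500K hole, best-fit 300K, worst-fit 600K */
}
```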
Fragmentation
External fragmentation: total memory space exists to satisfy a request, but it is not contiguous; storage is fragmented into a large number of small holes. Example: suppose the free space scattered across memory totals 560KB. If P5 is 500KB, this space would be large enough to run P5, but it is not contiguous. The selection of first-fit versus best-fit can affect the amount of fragmentation. (Figure: a 2560K memory holding the O.S. and processes P1, P4, and P3, with free holes scattered between the allocated regions.)
Fragmentation (Cont.)
Internal fragmentation: memory that is internal to a partition but is not being used, because the leftover space is too small. Example: assume the next request is only slightly smaller than an available hole. If we allocate exactly the requested block, we are left with a hole of 2 bytes. The overhead of keeping track of this hole would be larger than the hole itself, so we ignore this small leftover and allocate the whole hole; the unused part is internal fragmentation.
Virtual Memory
Outline: Background; Demand Paging; Performance of Demand Paging; Page Replacement; Page-Replacement Algorithms.
Background (Cont.)
Virtual memory is the separation of user logical memory from physical memory. This separation allows an extremely large virtual memory to be provided for programmers even when only a smaller physical memory is available. Only part of the program needs to be in memory for execution, so the logical address space can be much larger than the physical address space; we need to allow pages to be swapped in and out. Virtual memory can be implemented via demand paging or demand segmentation.
Diagram: virtual memory larger than physical memory. Pages 0 through n of virtual memory are mapped by the memory map onto the smaller physical memory, with the remaining pages kept in disk space.
Diagram: transfer of a paged memory to contiguous disk space. Main-memory frames are shown with Program A being swapped out to disk while Program B is swapped in.
Valid-Invalid Bit
We need hardware support to distinguish between the pages that are in memory and the pages that are on disk. The valid-invalid bit scheme can be used: valid indicates that the associated page is both legal and in memory; invalid indicates that the page either is not valid (not in the logical address space of the process) or is valid but is currently on the disk. What happens if the process tries to use a page that was not brought into memory? Access to a page marked invalid causes a page-fault trap.
Valid-Invalid Bit (Cont.)
With each page-table entry a valid-invalid bit is associated (1 = in memory, 0 = not in memory). Initially, the valid-invalid bit is set to 0 on all entries. During address translation, if the valid-invalid bit in a page-table entry is 0, a page fault occurs. (Figure: a page-table snapshot listing a frame number and the valid-invalid bit for each entry.)
Page Table: when some pages are not in main memory
(Figure: a logical memory of eight pages A through H. Pages A, C, and F are in physical-memory frames and marked valid in the page table; the remaining pages are marked invalid and reside on disk.)
Steps in Handling a Page Fault (Cont.)
1. We check an internal table for this process to determine whether the reference was a valid or an invalid memory access.
2. If the reference was invalid, we terminate the process. If it was valid but we have not yet brought in that page, we now page it in.
3. We find a free frame.
4. We schedule a disk operation to read the desired page into the newly allocated frame.
5. When the disk read is complete, we modify the internal table kept with the process and the page table to indicate that the page is now in memory.
6. We restart the instruction that was interrupted by the illegal-address trap. The process can now access the page as though it had always been in memory.
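A minimal C sketch of the valid-invalid bit and the fault path described above, with the disk read and instruction restart reduced to comments; the table sizes and names are illustrative only and assume free frames are still available.

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_PAGES 8

struct pte { int frame; bool valid; };          /* one page-table entry */

static struct pte page_table[NUM_PAGES];        /* all invalid at start */
static int next_free_frame = 0;

/* Return the frame holding `page`, faulting it in on first use. */
int access_page(int page)
{
    if (!page_table[page].valid) {               /* page-fault trap      */
        printf("page fault on page %d\n", page);
        int frame = next_free_frame++;           /* find a free frame    */
        /* a real system would now schedule the disk read */
        page_table[page].frame = frame;          /* update the page table */
        page_table[page].valid = true;
        /* the faulting instruction would now be restarted */
    }
    return page_table[page].frame;
}

int main(void)
{
    int refs[] = { 0, 2, 0, 5, 2 };
    for (int i = 0; i < 5; i++)
        printf("page %d -> frame %d\n", refs[i], access_page(refs[i]));
    return 0;
}
```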
What happens if there is no free frame?
Page replacement: find some page in memory that is not really in use and swap it out. We need a page-replacement algorithm, and for good performance we want one that results in the minimum number of page faults. Note that the same page may be brought into memory several times.
Performance of Demand Paging
Effective Access Time (EAT) for a demand-paged memory. The memory access time (ma) for most computers now ranges from 10 to 200 nanoseconds. If there is no page fault, then EAT = ma. If page faults can occur, then
EAT = (1 - p) x ma + p x (page-fault time)
where p is the probability of a page fault (0 <= p <= 1); we expect p to be close to zero (only a few page faults). If p = 0, there are no page faults; if p = 1, every reference is a fault. If a page fault occurs, we must first read the relevant page from disk and then access the desired word.
Performance of Demand Paging (Cont.)
We are faced with three major components of the page-fault service time: servicing the page-fault interrupt, reading in the page, and restarting the process. A typical hard disk has an average latency of 8 milliseconds, a seek time of 15 milliseconds, and a transfer time of 1 millisecond, so the total paging time is (8 + 15 + 1) = 24 milliseconds, including hardware and software time but no queuing (wait) time.
Demand Paging Example 1
Assume an average page-fault service time of 25 milliseconds (25 x 10^-3 seconds = 25,000,000 nanoseconds) and a memory access time of 100 nanoseconds (100 x 10^-9 seconds). Find the Effective Access Time.
Solution:
EAT = (1 - p) x ma + p x (page-fault time)
    = (1 - p) x 100 + p x 25,000,000
    = 100 - 100 x p + 25,000,000 x p
    = 100 + 24,999,900 x p   (nanoseconds)
Note: the Effective Access Time is directly proportional to the page-fault rate.
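The same calculation written as a small C function, so the effect of different page-fault rates p can be seen; the p values in main are arbitrary examples.

```c
#include <stdio.h>

/* Effective access time for demand paging, using the numbers from
 * Example 1: ma = 100 ns, page-fault service time = 25,000,000 ns. */
double eat_ns(double p)
{
    const double ma = 100.0;           /* memory access time (ns)      */
    const double fault = 25000000.0;   /* page-fault service time (ns) */
    return (1.0 - p) * ma + p * fault; /* EAT = (1-p)*ma + p*fault     */
}

int main(void)
{
    double probs[] = { 0.0, 0.001, 0.0000004 };
    for (int i = 0; i < 3; i++)
        printf("p = %.7f -> EAT = %.1f ns\n", probs[i], eat_ns(probs[i]));
    return 0;
}
```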
Page Replacement
Example: assume each process contains 10 pages but uses only 5 of them. If we have 40 frames in physical memory, we can then run 8 processes instead of 4, increasing the degree of multiprogramming. If we run 6 such processes (each 10 pages in size but using only 5 pages), we get higher CPU utilization and throughput and still have 10 frames to spare (6 x 5 = 30 frames needed out of 40). However, it is possible that each process tries to use all 10 of its pages, resulting in a need for 60 frames when only 40 are available.
Need for Page Replacement
(Figure: the logical memories and page tables of two users share physical memory. Every frame is in use, so when user 1 executes the instruction "Load M" and page M is not yet in memory (its page-table entry is marked invalid), the operating system must replace some page to bring M in.)
The Operating System has Several Options
It could terminate the user process, or swap out a process, freeing all its frames and reducing the level of multiprogramming. Page replacement takes the following approach instead: if no frame is free, find one that is not currently being used and free it. We can free a frame by writing its contents to swap space and changing the page table to indicate that the page is no longer in memory.
Page Replacement
(Figure: to replace a page, (1) swap the victim page out to disk, (2) change its page-table entry to invalid, (3) swap the desired page into the freed frame, and (4) reset the page table for the new page, marking its frame valid.)
Page Replacement (Cont.)
The page-fault service routine is now modified to include page replacement:
1. Find the location of the desired page on the disk.
2. Find a free frame: if there is a free frame, use it; otherwise, use a page-replacement algorithm to select a victim frame, write the victim page to the disk, and change the page and frame tables accordingly.
3. Read the desired page into the newly freed frame; change the page and frame tables.
4. Restart the user process.
Page Replacement (Cont.)
Note: if no frames are free, two page transfers (one out and one in) are required. This doubles the page-fault service time and increases the effective access time accordingly. The overhead can be reduced by the use of a modify (dirty) bit: each page or frame may have a modify bit associated with it in the hardware, so that only pages that have been modified need to be written back to disk. To implement demand paging, we must develop a frame-allocation algorithm, to decide how many frames to allocate to each process, and a page-replacement algorithm, to select the frames that are to be replaced.
Page-Replacement Algorithms
We want the page-replacement algorithm with the lowest page-fault rate. We evaluate an algorithm by running it on a particular string of memory references, called a reference string, and computing the number of page faults on that string. We can generate reference strings by tracing a given system and recording the address of each memory reference. In the examples that follow, the reference string is 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.
Page-Replacement Algorithms (Cont.)
Example: if we trace a particular process, we might record the following address sequence: 0100, 0432, 0101, 0102, 0609, 0601, 0612. Assuming 100 bytes per page and collapsing consecutive references to the same page, this reduces to the reference string 1, 4, 1, 6. As the number of available frames increases, the number of page faults decreases. From the above example: with 3 or more frames we would have only 3 faults, one for the first reference to each page; with only one frame we would have a replacement with every reference, resulting in 4 faults.
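A short C sketch of this reduction, assuming 100 bytes per page and collapsing consecutive references to the same page, as in the example.

```c
#include <stdio.h>

int main(void)
{
    int trace[] = { 100, 432, 101, 102, 609, 601, 612 };  /* recorded addresses */
    int n = sizeof trace / sizeof trace[0];
    int prev = -1;

    printf("reference string:");
    for (int i = 0; i < n; i++) {
        int page = trace[i] / 100;   /* page number = address / page size */
        if (page != prev)            /* drop repeated references          */
            printf(" %d", page);
        prev = page;
    }
    printf("\n");                    /* prints: reference string: 1 4 1 6 */
    return 0;
}
```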
First-In-First-Out (FIFO) Algorithm
A FIFO algorithm associates with each page the time when that page was brought into memory. When a page must be replaced, the oldest page is chosen. Alternatively, we can use a FIFO queue to hold all pages in memory: we replace the page at the head of the queue, and when a page is brought into memory we insert it at the tail of the queue.
Example: FIFO Algorithm
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5. Assume 3 frames, initially empty (3 pages can be in memory at a time per process). The first 3 references (1, 2, 3) cause page faults and are brought into the empty frames; in total the string causes 9 page faults. With 4 frames the same string causes 10 page faults, even though more frames are available (Belady's anomaly).
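A small C simulation of FIFO replacement on this reference string reproduces the 9-fault and 10-fault counts for 3 and 4 frames.

```c
#include <stdio.h>
#include <stdbool.h>

/* Count FIFO page faults.  Frames are kept in a small array; `hand`
 * points at the slot holding the oldest page once all frames are full. */
int fifo_faults(const int refs[], int n, int nframes)
{
    int frames[16];
    int used = 0, hand = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        bool hit = false;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) { hit = true; break; }
        if (hit)
            continue;
        faults++;
        if (used < nframes) {
            frames[used++] = refs[i];        /* free frame available    */
        } else {
            frames[hand] = refs[i];          /* replace the oldest page */
            hand = (hand + 1) % nframes;
        }
    }
    return faults;
}

int main(void)
{
    int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
    int n = sizeof refs / sizeof refs[0];
    printf("FIFO, 3 frames: %d faults\n", fifo_faults(refs, n, 3));  /* 9  */
    printf("FIFO, 4 frames: %d faults\n", fifo_faults(refs, n, 4));  /* 10 */
    return 0;
}
```

Running it shows the anomaly directly: adding a frame increases the number of faults on this particular string.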
Optimal (OPT) Algorithm
An optimal algorithm has the lowest page-fault rate of all algorithms and will never suffer from Belady's anomaly. The rule is: replace the page that will not be used for the longest period of time. This guarantees the lowest possible page-fault rate for a fixed number of frames. The optimal algorithm is difficult to implement, because it requires future knowledge of the reference string; the situation is similar to Shortest-Job-First in CPU scheduling.
Example: OPT Algorithm
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5, with 4 frames, initially empty. The optimal algorithm produces 6 page faults on this string. How do you know which page will not be used for the longest time? In general you don't; the optimal algorithm is used as a yardstick for measuring how well a practical algorithm performs.
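A corresponding C simulation of the optimal policy; note that it needs the whole reference string in advance, which is exactly why OPT is not implementable in practice.

```c
#include <stdio.h>

/* Count OPT page faults: on a fault with no free frame, evict the resident
 * page whose next use lies farthest in the future (or is never used again). */
int opt_faults(const int refs[], int n, int nframes)
{
    int frames[16], used = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int j, victim = 0, farthest = -1;
        for (j = 0; j < used; j++)
            if (frames[j] == refs[i]) break;
        if (j < used)
            continue;                            /* hit                  */
        faults++;
        if (used < nframes) { frames[used++] = refs[i]; continue; }
        for (j = 0; j < used; j++) {             /* choose the victim    */
            int next = n;                        /* n means "never used" */
            for (int k = i + 1; k < n; k++)
                if (refs[k] == frames[j]) { next = k; break; }
            if (next > farthest) { farthest = next; victim = j; }
        }
        frames[victim] = refs[i];
    }
    return faults;
}

int main(void)
{
    int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
    int n = sizeof refs / sizeof refs[0];
    printf("OPT, 4 frames: %d faults\n", opt_faults(refs, n, 4));  /* 6 */
    return 0;
}
```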
Least Recently Used (LRU) Algorithm
The key distinction between the FIFO and OPT algorithms is that FIFO uses the time when a page was brought into memory, whereas OPT uses the time when a page is next to be used (the future). The LRU algorithm instead replaces the page that has not been used for the longest period of time (the past): LRU replacement associates with each page the time of that page's last use.
Example: LRU Algorithm
Looking backward in time. Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5, with 4 frames. LRU produces 8 page faults. Note the number of page faults on the same reference string with 4 frames: LRU 8, FIFO 10, OPT 6.
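Finally, a C simulation of LRU on the same string, tracking the time of each page's last use.

```c
#include <stdio.h>

/* Count LRU page faults: on a fault with no free frame, evict the resident
 * page whose last use lies farthest in the past. */
int lru_faults(const int refs[], int n, int nframes)
{
    int frames[16], last_use[16], used = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int j, victim = 0;
        for (j = 0; j < used; j++)
            if (frames[j] == refs[i]) break;
        if (j < used) {                      /* hit: refresh recency */
            last_use[j] = i;
            continue;
        }
        faults++;
        if (used < nframes) {                /* free frame available */
            frames[used] = refs[i];
            last_use[used] = i;
            used++;
            continue;
        }
        for (j = 1; j < used; j++)           /* least recently used  */
            if (last_use[j] < last_use[victim]) victim = j;
        frames[victim] = refs[i];
        last_use[victim] = i;
    }
    return faults;
}

int main(void)
{
    int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
    int n = sizeof refs / sizeof refs[0];
    printf("LRU, 4 frames: %d faults\n", lru_faults(refs, n, 4));  /* 8 */
    return 0;
}
```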