Lecture 2: CS623
2/3/2004
© Joel Wein 2003, modified by T. Suel
Roadmap
Brief Trail through the LINUX Kernel
– http://www.win.tue.nl/~aeb/linux/vfs/trail.html
Memory Management in LINUX
– Page Frame Management
– Memory Area Management
Virtual Memory
Memory Management
Issues/Requirements
Techniques:
– Fixed Partitioning
– Dynamic Partitioning
– Simple Paging
– Simple Segmentation
– Important Concept: Physical vs. Logical Addresses
Memory
Although the cost of memory has dropped substantially, there is never enough main memory to hold all of the programs and data structures needed by active processes and by the operating system.
We therefore need to bring in and swap out blocks of data from secondary memory.
– Chapter 7: Memory Management. Basic techniques.
– Chapter 8: Virtual Memory. Allows each process to behave as if it had unlimited main memory at its disposal.
Memory Management Requirements
Relocation
– Need the ability to relocate a process to different areas of memory.
– Need to handle memory references correctly.
Protection
– Programs in other processes should not be able to reference memory locations in a process without permission.
– Complication: relocation implies dynamic calculation of addresses at run time.
– Note: memory protection must be satisfied by the processor rather than the OS. A program can't be pre-screened; it is only possible to assess the permissibility of a memory reference at the time of execution of the instruction making the reference. The processor hardware must have that capability.
More Requirements
Sharing
– Several processes access the same main memory areas.
Logical Organization
– If OS/hardware can effectively deal with user programs and data in the form of modules of some sort, then:
Modules can be written and compiled independently.
Different degrees of protection can be given to different modules.
Sharing modules is easier.
Physical Organization
– Handle the two levels of main and secondary memory.
Memory Management Techniques
The principal operation of memory management is to bring programs into main memory for execution by the processor.
Simple Techniques:
– Fixed Partitioning
– Dynamic Partitioning
– Simple Paging
– Simple Segmentation
Built upon these: virtual memory.
Fixed Partitioning
Main memory is divided into a number of static partitions at system generation time. A process may be loaded into a partition of equal or greater size.
– Equal-size partitions: if a program is too big, it needs overlays. Internal fragmentation: the block of data loaded is smaller than the partition.
– Unequal-size partitions help somewhat.
Placement:
– Smallest partition into which it will fit (one process queue per partition).
– Smallest available partition into which it will fit (one queue).
Dynamic Partitioning
Partitions are of variable length and number. Partitions are created dynamically, so that each process is loaded into a partition of exactly the same size as the process. Holes are created when processes are pulled out.
– External fragmentation: memory that is external to all partitions becomes increasingly fragmented.
Compaction: Shift processes so they are contiguous.
Pro: No internal fragmentation! More efficient use of main memory.
Con: Compaction is CPU intensive.
Con: Complex to maintain.
Placement Algorithm for Dynamic Partitioning
Best-fit
– Choose the block that is closest in size to the request.
– Leaves behind small fragments.
First-fit
– Scan from the beginning of memory.
– Pretty good!
Next-fit
– Scan from the location of the last placement.
– Quickly chews up the end of memory, which otherwise would usually be the largest block.
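The first two scans can be sketched as follows. The free list of partition sizes and the helper names are made up for illustration; next-fit differs from first-fit only in remembering where the previous scan stopped.

```python
# Sketch of first-fit vs. best-fit over a list of free partitions
# (sizes in KB). Free list and function names are illustrative.

def first_fit(free_blocks, request):
    """Scan from the beginning of memory; take the first block that fits."""
    for i, size in enumerate(free_blocks):
        if size >= request:
            return i
    return None  # no partition large enough

def best_fit(free_blocks, request):
    """Take the block closest in size to the request."""
    candidates = [(size, i) for i, size in enumerate(free_blocks)
                  if size >= request]
    return min(candidates)[1] if candidates else None

free = [200, 120, 60, 300]       # free partition sizes in KB
print(first_fit(free, 100))      # 0: the 200KB block is the first that fits
print(best_fit(free, 100))       # 1: the 120KB block is closest in size
```

Note how best-fit leaves behind a 20KB fragment here, illustrating the drawback above.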
Relocation Issues
Fixed Partitioning:
– Could expect that a process is always assigned to the same partition. (One process queue per partition.)
– In this case all relative memory references in code could be replaced by absolute main memory addresses, determined by the base address of the loaded process.
Relocation II
If a process can be swapped back into different memory locations, or if we use compaction, the locations of data and instructions referenced by the process are not fixed.
We need to distinguish between:
– Logical address: reference to a memory location independent of the current assignment of data to memory. A translation is needed to actually use it.
Relative address: address expressed as a location relative to some point, like the start point of the program.
– Physical address/Absolute address: actual location in the main memory chips.
Relocation III
Programs that use relative addresses are loaded using dynamic run-time loading.
– All memory references in the loaded process are relative to the origin of the program.
– Need a hardware mechanism to translate relative addresses to physical main memory addresses at the time of execution.
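A minimal sketch of that hardware mechanism in the classic base/bounds form: add the base register after a limit check. The base and limit values are made up.

```python
# Base/bounds relocation sketch: relative addresses are translated to
# physical addresses at execution time. Values are illustrative.

BASE = 0x40000      # where the process image was loaded
LIMIT = 0x08000     # size of the process image

def translate(relative_addr):
    # Protection check happens on every reference, in hardware.
    if relative_addr >= LIMIT:
        raise MemoryError("protection fault: address outside process")
    return BASE + relative_addr

print(hex(translate(0x1234)))   # 0x41234
```

Because translation happens per reference, the OS can move the process and simply reload BASE, which is exactly what compaction relies on.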
Paging
Combat internal and external fragmentation.
Main memory is divided into a number of equal-size frames.
Each process is divided into a number of equal-size pages of the same length as the frames.
A process is loaded by loading all of its pages into available, not necessarily contiguous, frames.
OS maintains a page table for each process. It shows the frame location for each page of the process.
Use page sizes that are powers of 2.
How Does it Work?
Within the program, each logical address consists of page number and offset within the page.
Processor hardware still does logical-to-physical translation.
– Now processor must know how to access page table of the current process.
– Presented with logical address (page number, offset) it uses page table to produce (frame number, offset).
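Because the page size is a power of 2, splitting the logical address into (page number, offset) is just a shift and a mask. A minimal sketch, with a made-up page table:

```python
# Paging translation sketch with a 4KB page size (power of 2, so the
# offset is the low 12 bits). The page table contents are made up.

PAGE_SIZE = 4096
OFFSET_BITS = 12

page_table = {0: 5, 1: 2, 2: 7}   # page number -> frame number

def logical_to_physical(logical):
    page = logical >> OFFSET_BITS          # high bits: page number
    offset = logical & (PAGE_SIZE - 1)     # low bits: offset (cheap mask)
    frame = page_table[page]               # page table lookup
    return (frame << OFFSET_BITS) | offset # (frame number, offset)

# Page 1, offset 0x1A -> frame 2, same offset.
print(hex(logical_to_physical(0x101A)))   # 0x201a
```

This is why power-of-2 page sizes matter: the split costs two bit operations instead of a division.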
Simple Paging
No external fragmentation! A small amount of internal fragmentation!
Simple Segmentation
Each process is divided into a number of segments of potentially different sizes. A process is loaded by loading all of its segments into dynamic partitions that need not be contiguous.
– Logical address is now (segment number, offset).
No internal fragmentation, like dynamic partitioning.
Comparison with dynamic partitioning:
– Program may occupy more than one (non-contiguous) partition.
– Suffers from external fragmentation, but not as much, because the process is broken up into a number of smaller pieces.
LINUX Memory Management
LINUX takes advantage of the 80x86's segmentation and paging circuits to translate logical addresses into physical ones.
Some portion of RAM is permanently assigned to the kernel. The remaining part of RAM is dynamic memory.
– Need a robust and efficient strategy for allocating groups of contiguous page frames.
80x86 supports two page sizes: 4KB and 4MB.
Three memory regions: DMA, NORMAL, HIGHMEM.
LINUX: Buddy System
Goal: combat (external) fragmentation.
– Just use the paging circuitry to make noncontiguous frames look contiguous, or
– Have a clever strategy to keep things contiguous.
– The second approach is better because:
Sometimes we really need contiguous page frames, e.g. buffers for a DMA processor; DMA ignores the paging circuitry.
It leaves the kernel page tables unchanged (TLB performance issues).
It can also use 4MB pages of contiguous memory, which makes things faster.
LINUX: Buddy System
Compromise between:
– Fixed: may use space inefficiently; limits the number of active processes.
– Dynamic: complex, compaction overhead.
All free page frames are grouped into 10 lists of blocks that contain groups of 1, 2, 4, ..., 512 contiguous 4KB page frames, respectively.
Buddy System, continued
Let's say you need 128.
– If it's there, grab it.
– If not, look on the 256-list.
If it's there, take 128 and put the other 128 on the 128-list.
If not, look on the 512-list:
– Take 128.
– Put 256 on the 256-list.
– Put the other 128 on the 128-list.
Buddy System: Releasing Blocks
Attempt to merge pairs of free buddy blocks of size b into a single block of size 2b.
Two blocks are considered buddies if:
– Both blocks have the same size b.
– They are located at contiguous physical addresses.
– The physical address of the first page frame of the first block is a multiple of 2*b*(4K).
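The allocation walk-through and the merging rule can be sketched together. This toy version tracks blocks by page-frame index over the 10 free lists; the buddy of an order-k block is found by flipping bit k of its address, which encodes the alignment rule above. All names are illustrative, not the kernel's API.

```python
# Toy buddy allocator over page-frame indices. free_lists[k] holds the
# start frames of free blocks of 2**k frames, for k = 0..9, mirroring
# the 10 lists in the slides. Illustrative sketch only.

MAX_ORDER = 9
free_lists = {k: set() for k in range(MAX_ORDER + 1)}
free_lists[MAX_ORDER].add(0)        # start with one free 512-frame block

def alloc(order):
    """Find a block of at least the requested order, splitting larger ones."""
    for k in range(order, MAX_ORDER + 1):
        if free_lists[k]:
            addr = free_lists[k].pop()
            while k > order:                 # split: free the upper half
                k -= 1
                free_lists[k].add(addr + (1 << k))
            return addr
    raise MemoryError("no contiguous block of that size")

def free(addr, order):
    """Release a block, merging with its buddy as long as possible."""
    while order < MAX_ORDER:
        buddy = addr ^ (1 << order)          # buddy differs only in bit `order`
        if buddy not in free_lists[order]:
            break                            # buddy busy; stop merging
        free_lists[order].remove(buddy)      # coalesce into a double block
        addr = min(addr, buddy)
        order += 1
    free_lists[order].add(addr)

a = alloc(7)    # need 128 frames: split the 512 block, as in the slide
# now the 256-list holds one block and the 128-list holds the other half
free(a, 7)      # releasing merges everything back into one 512-frame block
print(free_lists[MAX_ORDER])   # {0}
```

The XOR trick works precisely because every block of size 2^k starts at a multiple of 2^k frames, so buddies differ in exactly one address bit.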
LINUX Memory Area Management
How to deal with requests for small memory areas and avoid internal fragmentation?
Slab Allocator, based on Solaris 2.4:
– To avoid initializing objects repeatedly, the slab allocator does not discard objects that have been allocated and then released, but instead saves them in memory.
– Kernel functions tend to request memory of the same type repeatedly (e.g. new process creation). Page frames allocating the same memory areas are saved in a cache and reused quickly.
– Reusing/caching.
Slab Allocator cont.
The slab allocator groups objects into caches; a cache is a store of objects of the same type.
E.g., when a file is opened, the memory area needed to store the corresponding "open file" object is taken from a slab allocator cache named "filp."
The area of main memory that contains a cache is divided into slabs.
– Each slab consists of one or more contiguous page frames that contain both allocated and free objects.
The slab allocator never releases the page frames of an empty slab on its own; it would not know when the free memory is needed.
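The core idea, a per-type cache that keeps released objects instead of destroying them, can be sketched in a few lines. The `Cache` class and the dict standing in for an "open file" object are hypothetical, not the kernel's interface.

```python
# Minimal sketch of the slab idea: allocation of a given type reuses a
# previously released object, skipping re-initialization. Illustrative.

class Cache:
    def __init__(self, ctor):
        self.ctor = ctor          # expensive constructor, run rarely
        self.free_objects = []    # released-but-kept objects

    def alloc(self):
        # Reuse a cached object if one is available; construct otherwise.
        return self.free_objects.pop() if self.free_objects else self.ctor()

    def release(self, obj):
        self.free_objects.append(obj)   # keep it around; don't destroy

# A cache per object type, like the kernel's "filp" cache for open files.
filp_cache = Cache(ctor=lambda: {"flags": 0, "pos": 0})
f = filp_cache.alloc()       # constructed fresh
filp_cache.release(f)
g = filp_cache.alloc()       # the same object, reused without re-init
print(g is f)                # True
```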
Virtual Memory Chapter 8
Outline
Basic Premise
Locality and Virtual Memory
Hardware and Control Structures
– Paging: Page Table Structure, Translation Lookaside Buffer, Page Size
– Segmentation
Operating System Software
– Fetch, Placement, Replacement Policies
– Resident Set Management
– Cleaning Policy
– Load Control
Basic Premise
Paging and Segmentation give:
– All memory references are logical addresses that are dynamically translated to physical addresses at run time. (A process can occupy different parts of main memory at different times.)
– A process may be broken up into a number of pieces (pages or segments) that need not be contiguously located in main memory during execution.
Both use dynamic run-time address translation and a page/segment table.
And so…
If the previous characteristics are present, it is NOT necessary that all of the pages or segments of a process be in main memory during execution.
– If the piece (segment or page) that holds the next instruction to be fetched and the piece that holds the next data location to be accessed are in main memory, then at least for a time execution may proceed.
How Does it Work?
Resident set: the portion of the process in main memory.
If the processor encounters a logical address that is not in main memory:
– It generates an interrupt indicating a memory access fault.
– The OS puts the interrupted process in a blocked state and takes control.
– To resume this process, the OS needs to bring into main memory the piece of the process that contains the logical address that caused the access fault.
– The OS issues a disk I/O request.
– When the I/O interrupt is issued, control goes back to the OS, which places the affected process in the Ready state.
Questions, Implications
Efficient?
Implications:
– More processes may be maintained in main memory.
– A process may be larger than all of main memory.
The programmer perceives a much larger, virtual, memory.
Locality and Virtual Memory
It works because processes typically use only a small part of a program at any time.
– Principle of locality: program and data references within a process tend to cluster.
– It should therefore be possible to make intelligent guesses about which pieces of a process are needed in the near future, to avoid thrashing.
Hardware & Software
For VM to work we need:
– Hardware support for a paging and/or segmentation scheme.
– OS software to manage the movement of pages and/or segments between secondary memory and main memory.
Hardware Support: Paging
The page table becomes more complex.
– A present bit P indicates whether the page is in main memory or not.
– If present, the entry also includes the frame number of that page.
– Modify bit M: have the contents been altered since the page was last loaded into main memory?
If not modified, there is no need to write the page out when it is replaced in the frame it occupies.
Page Table Structure
The basic required mechanism is translation from (page #, offset) to (frame #, offset) using the page table.
– The page table is of variable length, so it can't be held in registers; it must be in main memory.
– When a process is running, a register holds the start address of the page table for the process. The page number is used to index it and look up the frame number.
– Note that if the VM is large (2^32 addresses), the page table could be large (2^20 entries) and need to be stored in virtual memory as well.
– Huh? (Yes: the page table itself can be paged.)
– Some processors (Pentium) make use of a two-level scheme: a page directory in which each entry points to a page table.
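A sketch of the two-level lookup in the Pentium style just mentioned: the top 10 bits of the virtual address index the page directory, the next 10 bits index a page table, and the low 12 bits are the page offset. The table contents below are made up.

```python
# Two-level page table lookup sketch (10 + 10 + 12 bit split, as on
# the Pentium). Directory and table contents are illustrative.

page_table_0 = {1: 42}                 # page-table index -> frame number
page_directory = {0: page_table_0}     # directory index -> page table

def two_level_translate(vaddr):
    dir_idx = (vaddr >> 22) & 0x3FF    # top 10 bits: directory index
    pt_idx = (vaddr >> 12) & 0x3FF     # middle 10 bits: page-table index
    offset = vaddr & 0xFFF             # bottom 12 bits: offset in page
    frame = page_directory[dir_idx][pt_idx]
    return (frame << 12) | offset

# vaddr with directory entry 0, page 1, offset 0x34 -> frame 42
print(hex(two_level_translate(0x00001034)))   # 0x2a034
```

The win is that page tables for unused 4MB regions of the address space need not exist at all; only the directory entry is required.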
Translation Lookaside Buffer
Problem: each VM reference can cause two physical memory accesses: one for the page table entry and one for the desired data.
Solution: a high-speed cache for page table entries, the translation lookaside buffer (TLB).
– Given a virtual address, the processor first examines the TLB. If it hits, great. If not, and the page is present, the entry is retrieved from the page table and the TLB is updated. If the page is not present, a memory access fault (page fault) happens.
TLB misses are expensive: random accesses to large data structures cause many of them. (Sparc IIe: its small TLB caused problems.)
TLB: Additional Details
The TLB contains only some of the page table entries; we cannot index into it based on page number alone.
Therefore each entry must contain the page number as well as the complete page table entry.
The processor has hardware that allows it to simultaneously check a number of TLB entries to see if there is a match on the page number. This technique is referred to as associative mapping.
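The lookup order can be sketched with a dict standing in for the TLB. A real TLB checks all its entries in parallel against the page number (the associative mapping above); a hash lookup only models the outcome, not the hardware. Table contents are made up.

```python
# TLB-then-page-table lookup sketch. Numbers are illustrative.

page_table = {0: 9, 1: 4, 2: 7}   # full table, lives in main memory
tlb = {}                          # small cache of page -> frame entries

def lookup(page):
    if page in tlb:
        return tlb[page], "tlb hit"        # no extra access for the entry
    frame = page_table[page]               # extra main-memory access
    tlb[page] = frame                      # update the TLB for next time
    return frame, "tlb miss"

print(lookup(2))   # (7, 'tlb miss')
print(lookup(2))   # (7, 'tlb hit')
```

A real TLB also needs an eviction policy of its own, since it holds far fewer entries than the page table; that detail is omitted here.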
Page Size
Considerations:
– Internal fragmentation:
Smaller page size means less internal fragmentation. (Good)
The smaller the page, the greater the number of pages required per process, hence larger page tables, hence some portion of the page tables for active processes is not in main memory. (Bad)
– Page size <-> fault rate:
Small page size: lots of pages in memory, so not too many faults.
Middle: each page contains references further afield.
Large: page size approaches process size.
– Contemporary programming techniques used in large programs reduce locality:
OO techniques: many small programs and modules.
Multithreaded applications.
Operating System Software
Fetch Policy Replacement Policy Resident Set Management
Issues
Want to minimize page faults, to minimize software overhead:
– Deciding which pages to replace.
– I/O of exchanging pages.
– Scheduling another process to run during page I/O.
Relevant issues in the choice of a policy:
– Main memory size.
– Relative speed of main vs. secondary memory.
– Size and number of processes competing for resources.
– Execution behavior of individual programs.
Fetch Policy
When should a page be brought into main memory?
– Demand paging: a flurry of faults at the start; eventually locality kicks in.
– Pre-paging (prefetching).
Placement Policy
Where in real memory a process piece is to reside.
– Pure segmentation: placement policy is an important design issue (remember the discussion of best-fit, first-fit, etc.).
– Paging or paging/segmentation: placement is not a big deal, as the address translation hardware can handle any page-frame combination with equal efficiency.
Replacement Policy
Deals with the selection of a page in memory to be replaced when a new page needs to be brought in.
Three issues that get lumped together:
– How many page frames per process?
– Whether the page frames considered for replacement should be limited to those of the process that caused the page fault or encompass all the page frames in main memory.
– Among the set of pages considered, which particular page should be selected for replacement?
Call the first two Resident Set Management; the third is Replacement Policy.
Replacement Policy
Frame locking: some frames might be locked (kernel, OS).
Basic algorithms:
– Optimal
– Least Recently Used (LRU)
– FIFO
– Clock
Optimal Algorithm
Select for replacement the page for which the time to next access is longest.
This results in the fewest page faults.
Not implementable (clairvoyance issue).
Running example: 3 frames, reference sequence
2 3 2 1 5 2 4 5 3 2 5 2
LRU
Replace the page not used for the longest time.
Principle of locality: the least recently used page is least likely to be used in the future.
Does pretty well.
Hard to implement! (sort of)
Not always good ...
FIFO
Simple to implement.
Gets rid of the page that has been in memory the longest.
The reasoning will often be wrong. Exception: repeated scans!
Clock Policy
Tries to emulate LRU.
Associate a "use bit" with each frame.
– When a page is first loaded into a frame in memory, the use bit for that frame is set to 1.
– When the page is subsequently referenced, the bit is set to 1 again.
– The set of frames that are candidates for replacement is treated as a circular buffer with a pointer.
– When a page is replaced, the pointer is placed on the next frame.
– When it is time to replace a frame, scan for a frame with use bit 0. Whenever you encounter a use bit of 1, set it to 0 and keep scanning.
– Like FIFO, except it skips frames with use bit 1.
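FIFO, LRU, and the clock policy can be compared directly on the running example (3 frames, reference string 2 3 2 1 5 2 4 5 3 2 5 2). This sketch counts all faults, including the three compulsory faults that fill the empty frames.

```python
# Fault counts for FIFO, LRU, and clock replacement on a reference string.

def fifo_faults(refs, nframes):
    frames, faults = [], 0
    for p in refs:
        if p not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.pop(0)           # evict the oldest arrival
            frames.append(p)
    return faults

def lru_faults(refs, nframes):
    frames, faults = [], 0              # list ordered LRU -> MRU
    for p in refs:
        if p in frames:
            frames.remove(p)            # refresh recency on a hit
        else:
            faults += 1
            if len(frames) == nframes:
                frames.pop(0)           # evict the least recently used
        frames.append(p)
    return faults

def clock_faults(refs, nframes):
    frames = [None] * nframes           # circular buffer of frames
    use = [0] * nframes                 # one use bit per frame
    ptr, faults = 0, 0
    for p in refs:
        if p in frames:
            use[frames.index(p)] = 1    # referenced again: second chance
            continue
        faults += 1
        while use[ptr]:                 # skip use-bit-1 frames, clearing them
            use[ptr] = 0
            ptr = (ptr + 1) % nframes
        frames[ptr], use[ptr] = p, 1    # replace the use-bit-0 victim
        ptr = (ptr + 1) % nframes       # pointer moves past the new page
    return faults

refs = [2, 3, 2, 1, 5, 2, 4, 5, 3, 2, 5, 2]
print(fifo_faults(refs, 3), lru_faults(refs, 3), clock_faults(refs, 3))
# 9 7 8
```

As expected, the clock policy lands between LRU and FIFO: cheaper than true LRU to implement, but a better approximation of recency than FIFO.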
Resident Set Management
Resident Set Size: How much main memory to give to a particular process?
Replacement Scope: What set of potential replacements do you choose from?
Resident Set Size
Factors:
– The smaller the memory assigned per process, the more processes fit in main memory. This increases the probability that the OS will find at least one ready process and avoid swapping.
– If a relatively small number of pages of a process are in main memory, then the rate of page faults will be high.
– Beyond a certain size, adding more memory is not that useful.
Resident Set Size II
Two approaches:
– Fixed allocation, determined at initial load time.
– Variable allocation: varies over the lifetime of a process. Give more frames to processes that are faulting a lot.
Replacement Scope
Local replacement policy: choose among resident pages of process that generated the fault.
Global replacement policy: consider all unlocked pages in main memory as candidates to replace.
Possible Combinations
Fixed Allocation, Local Scope
– Drawback: if allocations are too large or too small, there is no good way to recover.
Too small: lots of page faults.
Too large: processor idle time or lots of swapping.
Variable Allocation, Global Scope
– Easiest to implement; widely adopted.
– Processes that fault a lot get helped out.
– Hard to get a good replacement policy; it is not easy to figure out which process is best to choose from.
Combinations
Variable Allocation, Local Scope
– Tries to overcome the problems of a global-scope strategy.
– From time to time, reevaluate the allocation to the process.
– The question: how do you determine the resident set size for each process, and how do you time the changes?