W4118 Operating Systems
Instructor: Junfeng Yang

LRU page replacement


Page 1: LRU page replacement


W4118 Operating Systems

Instructor: Junfeng Yang

Page 2: LRU page replacement

Last lecture: VM Implementation
• Operations OS + hardware must provide to support virtual memory
  – Page fault
    – Locate page
    – Bring page into memory
  – Continue process
• OS decisions
  – When to bring an on-disk page into memory?
    – Demand paging
    – Request paging
    – Prepaging
  – What page to throw out to disk?
    – OPT, RANDOM, FIFO, LRU, MRU

Page 3: LRU page replacement

Today
• Virtual Memory Implementation
  – Implementing LRU
  – How to approximate LRU efficiently?
• Linux Memory Management

Page 4: LRU page replacement

Implementing LRU: hardware
• A counter for each page
• Every time a page is referenced, save the system clock into the page's counter
• Page replacement: scan through pages to find the one with the oldest clock
• Problem: have to search all pages/counters!
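The counter scheme can be sketched in a few lines. This is only an illustrative simulation: the class name `CounterLRU` is hypothetical, and a logical counter stands in for the system clock.

```python
from itertools import count


class CounterLRU:
    """Toy model of counter-based LRU (hypothetical helper, not kernel code)."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.stamp = {}        # resident page -> time of last reference
        self.clock = count()   # logical clock standing in for the system clock
        self.faults = 0

    def reference(self, page):
        if page not in self.stamp:
            self.faults += 1
            if len(self.stamp) == self.num_frames:
                # Replacement: linear scan over all counters for the oldest one.
                victim = min(self.stamp, key=self.stamp.get)
                del self.stamp[victim]
        # Every reference saves the current clock value into the page's counter.
        self.stamp[page] = next(self.clock)


lru = CounterLRU(3)
for p in [1, 2, 3, 1, 4]:
    lru.reference(p)
resident = sorted(lru.stamp)   # page 2 was the least recently used
```

The `min` call is the full scan the slide complains about: its cost grows with the number of frames.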

Page 5: LRU page replacement

Implementing LRU: software
• A doubly linked list of pages
• Every time a page is referenced, move it to the front of the list
• Page replacement: remove the page from the back of the list
  – Avoids scanning all pages
• Problem: too expensive
  – Requires 6 pointer updates for each page reference
  – High contention on multiprocessors
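A minimal sketch of the list-based scheme, assuming a circular list with a sentinel node. The names `Node` and `ListLRU` are hypothetical. Moving a page to the front takes two pointer writes to unlink and four to relink, which is where the count of six comes from.

```python
class Node:
    def __init__(self, page=None):
        self.page = page
        self.prev = self.next = self   # a lone node points at itself


class ListLRU:
    """Toy doubly linked LRU list (hypothetical helper, not kernel code)."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.head = Node()   # sentinel: head.next is the front, head.prev the back
        self.nodes = {}      # page -> node, so finding a page needs no scan

    def _unlink(self, node):          # 2 pointer updates
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):      # 4 pointer updates
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def reference(self, page):
        node = self.nodes.get(page)
        if node is not None:
            self._unlink(node)
        else:
            if len(self.nodes) == self.num_frames:
                victim = self.head.prev        # back of list = least recently used
                self._unlink(victim)
                del self.nodes[victim.page]
            node = self.nodes[page] = Node(page)
        self._push_front(node)

    def order(self):
        """Pages from most to least recently used."""
        out, n = [], self.head.next
        while n is not self.head:
            out.append(n.page)
            n = n.next
        return out


lst = ListLRU(3)
for p in [1, 2, 3, 1, 4]:
    lst.reference(p)
```

On a multiprocessor every reference would have to take a lock around these pointer updates, which is the contention problem noted above.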

Page 6: LRU page replacement

Example software LRU implementation (figure)

Page 7: LRU page replacement

LRU: Concept vs. Reality
• LRU is considered to be a reasonably good algorithm
• Problem is in implementing it efficiently
  – Hardware implementation: counter per page, copied per memory reference; have to search pages on page replacement to find the oldest
  – Software implementation: no search, but pointer swap on each memory reference; high contention
• In practice, settle for efficient approximate LRU
  – Find an old page, but not necessarily the oldest
  – LRU is an approximation anyway, so approximate more

Page 8: LRU page replacement

Clock (second-chance) Algorithm
• Goal: remove a page that has not been referenced recently
  – Good LRU-approximate algorithm
• Idea: combine FIFO and LRU
  – A reference bit per page
  – Memory reference: hardware sets bit to 1
  – Page replacement: OS finds a page with reference bit cleared
  – OS traverses pages, clearing bits over time

Page 9: LRU page replacement

Clock Algorithm Implementation
• OS circulates through pages, clearing reference bits and finding a page with reference bit set to 0
• Keep pages in a circular list = clock
• Pointer to next victim = clock hand
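The circular list and clock hand can be simulated with two arrays and an index. This is a sketch under simple assumptions: the function name `clock_replace` is hypothetical, frames start empty, and the "hardware" bit-setting is done in software on each hit.

```python
def clock_replace(refs, num_frames):
    """Simulate the clock algorithm; return (faults, frames, reference bits)."""
    frames = [None] * num_frames
    ref_bit = [0] * num_frames
    hand = 0
    faults = 0
    for page in refs:
        if page in frames:
            ref_bit[frames.index(page)] = 1   # hit: reference bit is set
            continue
        faults += 1
        # Advance the hand, clearing bits, until a frame with bit 0 is found.
        while ref_bit[hand] == 1:
            ref_bit[hand] = 0
            hand = (hand + 1) % num_frames
        frames[hand] = page                   # replace the victim
        ref_bit[hand] = 1
        hand = (hand + 1) % num_frames
    return faults, frames, ref_bit


faults, frames, bits = clock_replace([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 4)
```

With four frames and the reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5, the run ends with pages 4, 5, 2, 3 resident and only the first two reference bits set.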

Page 10: LRU page replacement

Second-Chance (clock) Page-Replacement Algorithm (figure)

Page 11: LRU page replacement

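Clock Algorithm
• Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
• Frame contents shown as page(reference bit), 4 frames:

  After reference(s)   Frame 0  Frame 1  Frame 2  Frame 3
  1 2 3 4 1 2          1(1)     2(1)     3(1)     4(1)
  5                    5(1)     2(0)     3(0)     4(0)
  1                    5(1)     1(1)     3(0)     4(0)
  2                    5(1)     1(1)     2(1)     4(0)
  3                    5(1)     1(1)     2(1)     3(1)
  4                    4(1)     1(0)     2(0)     3(0)
  5                    4(1)     5(1)     2(0)     3(0)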

Page 12: LRU page replacement

Clock Algorithm Extension
• Problem of clock algorithm: does not differentiate dirty vs. clean pages
• Dirty page: a page that has been modified and needs to be written back to disk
  – More expensive to replace dirty pages than clean pages
  – One extra disk write (5 ms)

Page 13: LRU page replacement

Clock Algorithm Extension
• Use dirty bit to give preference to dirty pages (they stay in memory longer)
• On page reference
  – Read: hardware sets reference bit
  – Write: hardware sets dirty bit
• Page replacement
  – reference = 0, dirty = 0: victim page
  – reference = 0, dirty = 1: skip (don't change)
  – reference = 1, dirty = 0: becomes reference = 0, dirty = 0
  – reference = 1, dirty = 1: becomes reference = 0, dirty = 1
  – Advance hand, repeat
  – If no victim page found, run swap daemon to flush unreferenced dirty pages to the disk, repeat
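The four cases can be sketched as a victim search over (reference, dirty) pairs. The names `pick_victim` and `flush_dirty` are hypothetical; `flush_dirty` is only a stand-in for the swap daemon and does no real I/O. The sketch assumes two sweeps are enough: the first can clear every reference bit, the second then finds any clean page.

```python
def pick_victim(pages, hand):
    """pages: list of dicts with 'ref' and 'dirty' bits.

    Returns (victim index or None, new hand position).
    """
    n = len(pages)
    for _ in range(2 * n):          # two sweeps: the first may only clear bits
        p = pages[hand]
        if p["ref"] == 0 and p["dirty"] == 0:
            return hand, (hand + 1) % n      # clean and unreferenced: victim
        if p["ref"] == 1:
            p["ref"] = 0                     # (1,0) -> (0,0) and (1,1) -> (0,1)
        # (0,1) is skipped unchanged
        hand = (hand + 1) % n
    return None, hand               # only unreferenced dirty pages remain


def flush_dirty(pages):
    """Stand-in for the swap daemon: write back unreferenced dirty pages."""
    for p in pages:
        if p["ref"] == 0 and p["dirty"] == 1:
            p["dirty"] = 0


pages = [{"ref": 1, "dirty": 1}, {"ref": 1, "dirty": 0}, {"ref": 0, "dirty": 1}]
victim, hand = pick_victim(pages, 0)

all_dirty = [{"ref": 1, "dirty": 1}, {"ref": 0, "dirty": 1}]
no_victim, hand2 = pick_victim(all_dirty, 0)
flush_dirty(all_dirty)
after_flush, _ = pick_victim(all_dirty, hand2)
```

In the first example the clean page at index 1 is chosen even though the dirty page at index 2 was already unreferenced, which is the preference the extension is after.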

Page 14: LRU page replacement

Problem with LRU-based Algorithms
• When memory is too small to hold the past references
  – LRU does not handle repeated scans well when the data set is bigger than memory
    – 5-frame memory with 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
• Solution: Most Recently Used (MRU) algorithm
  – Replace most recently used pages
  – Best for repeated scans
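A small simulation makes the difference concrete. It is a sketch with assumed parameters: the helper `count_faults` is hypothetical, and the memory is set to 4 frames so that the 5-page scan is one page larger than memory.

```python
def count_faults(refs, num_frames, evict_most_recent):
    """Count page faults under LRU (evict_most_recent=False) or MRU (True)."""
    recency = []   # resident pages, least recently used first
    faults = 0
    for page in refs:
        if page in recency:
            recency.remove(page)
        else:
            faults += 1
            if len(recency) == num_frames:
                recency.pop(-1 if evict_most_recent else 0)
        recency.append(page)   # page is now the most recently used
    return faults


scan = [1, 2, 3, 4, 5] * 3     # repeated scan over 5 pages
lru_faults = count_faults(scan, 4, evict_most_recent=False)
mru_faults = count_faults(scan, 4, evict_most_recent=True)
```

Under these assumptions LRU faults on every one of the 15 references, because it always evicts the page that the scan needs next, while MRU faults 7 times.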

Page 15: LRU page replacement

Problem with LRU-based Algorithms (cont.)
• LRU ignores frequency
  – Intuition: a frequently accessed page is more likely to be accessed in the future than a page accessed just once
  – Problematic workload: scanning a large data set
    – 1 2 3 1 2 3 1 2 3 1 2 3 … (pages frequently used)
    – 4 5 6 7 8 9 10 11 12 … (pages used just once)
• Solution: track access frequency
  – Least Frequently Used (LFU) algorithm
    – Expensive
  – Approximate LFU:
    – LRU-k: throw out the page with the oldest timestamp of the k'th most recent reference
    – LRU-2Q: don't move pages based on a single reference; try to differentiate hot and cold pages

Page 16: LRU page replacement

Linux Page Replacement Algorithm
• Similar to LRU 2Q algorithm
• Two LRU lists
  – Active list: hot pages, recently referenced
  – Inactive list: cold pages, not recently referenced
• Page replacement: select from inactive list
• Transition of page between active and inactive requires two references or "missing references"
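The two-reference rule can be modelled with two ordered maps. This is a simplified sketch, not the kernel's code: the class `TwoListLRU` and its method names are hypothetical, and demotion from active to inactive is left out.

```python
from collections import OrderedDict


class TwoListLRU:
    """Simplified active/inactive model (hypothetical, not the kernel's code)."""

    def __init__(self):
        self.active = OrderedDict()     # hot pages: page -> referenced flag
        self.inactive = OrderedDict()   # cold pages: page -> referenced flag

    def add(self, page):
        self.inactive[page] = False     # new pages start cold and unreferenced

    def mark_accessed(self, page):
        if page in self.active:
            self.active[page] = True
        elif self.inactive[page]:
            del self.inactive[page]     # second reference: promote to active
            self.active[page] = False
        else:
            self.inactive[page] = True  # first reference is only recorded

    def evict(self):
        """Reclaim the oldest inactive page; None if the list is empty."""
        if not self.inactive:
            return None
        page, _ = self.inactive.popitem(last=False)
        return page


lists = TwoListLRU()
for page in (1, 2, 3):
    lists.add(page)
lists.mark_accessed(2)
lists.mark_accessed(2)      # two references: page 2 becomes active
lists.mark_accessed(3)      # one reference: page 3 stays inactive
first_victim = lists.evict()
```

Because a single reference never promotes a page, a one-pass scan over a large data set cannot push the hot pages out of the active list.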

Page 17: LRU page replacement

Allocating Memory to Processes
• Split pages among processes
• Split pages among users
• Global allocation

Page 18: LRU page replacement

Thrashing Example
• Set of processes frequently referencing 5 pages
• Only 4 frames in physical memory
• System repeats
  – Reference page not in memory
  – Replace a page in memory with newly referenced page
• Thrashing
  – System busy reading and writing instead of executing useful instructions
    – CPU utilization low
  – Average memory access time equals disk access time
    – Illusion breaks: memory appears as slow as disk rather than disk appearing as fast as memory
  – Add more processes, thrashing gets worse

Page 19: LRU page replacement

Working Set
• Informal definition
  – Collection of pages the process is referencing frequently
  – Collection of pages that must be resident to avoid thrashing
• Methods exist to estimate working set of process to avoid thrashing
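One common way to estimate the working set is a sliding window over the reference string: the set of distinct pages touched in the last few references. The slides do not say which method is meant, so the window-based definition, the function name `working_set`, and the window size below are all assumptions.

```python
def working_set(refs, t, window):
    """Distinct pages among the last `window` references ending at index t."""
    start = max(0, t - window + 1)
    return set(refs[start:t + 1])


refs = [1, 2, 1, 3, 1, 2, 7, 7, 7, 8]
ws_mid = working_set(refs, 5, 4)   # looks at references 1, 3, 1, 2
ws_end = working_set(refs, 9, 4)   # looks at references 7, 7, 7, 8
```

If the working-set sizes of all running processes add up to more than the number of frames, the system is likely to thrash.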

Page 20: LRU page replacement

Memory Management Summary
• "All problems in computer science can be solved by another level of indirection" (David Wheeler)
• Different memory management techniques
  – Contiguous allocation
  – Paging
  – Segmentation
  – Paging + segmentation
  – In practice, hierarchical paging is most widely used; segmentation has lost its popularity
    – Some RISC architectures do not even support segmentation
• Virtual memory
  – OS and hardware exploit locality to provide illusion of fast memory as large as disk
    – Similar technique used throughout entire memory hierarchy
• Page replacement algorithms
  – LRU: past predicts future
  – No silver bullet: choose algorithm based on workload

Page 21: LRU page replacement

Current Trends
• Memory management: less critical now
  – Personal computers vs. time-sharing machines
  – Memory is cheap: larger physical memory
• Segmentation becomes less popular
  – Some RISC chips don't even support segmentation
• Larger page sizes (even multiple page sizes)
  – Better TLB coverage
  – Smaller page tables, fewer pages to manage
  – Internal fragmentation
• Larger virtual address space
  – 64-bit address space
  – Sparse address spaces
• File I/O using the virtual memory system
  – Memory mapped I/O: mmap()

Page 22: LRU page replacement

Today
• Virtual Memory Implementation
  – Implementing LRU
  – How to approximate LRU efficiently?
• Linux Memory Management
  – Page replacement
  – Segmentation and Paging
  – Dynamic memory allocation

Page 23: LRU page replacement

Page descriptor
• Keep track of the status of each page frame
  – struct page, include/linux/mm.h
• Each descriptor has two bits relevant to page replacement policy
  – PG_active: is page on active_list?
  – PG_referenced: was page referenced recently?

Page 24: LRU page replacement

Memory Zone
• Keep track of pages in different zones
  – struct zone, include/linux/mmzone.h
  – ZONE_DMA: <16MB
  – ZONE_NORMAL: 16MB-896MB
  – ZONE_HIGHMEM: >896MB
• Two LRU lists of pages
  – active_list
  – inactive_list

Page 25: LRU page replacement

Functions
• lru_cache_add*(): add to page cache
• mark_page_accessed(): move pages from inactive to active
• page_referenced(): test if a page is referenced
• refill_inactive_zone(): move pages from active to inactive
• When to replace page
  – Usually free_more_memory()

Page 26: LRU page replacement

Today
• Virtual Memory Implementation
  – Implementing LRU
  – How to approximate LRU efficiently?
• Linux Memory Management
  – Page replacement
  – Segmentation and Paging
  – Dynamic memory allocation

Page 27: LRU page replacement

Recall: x86 segmentation and paging hardware
• CPU generates logical address
  – Given to segmentation unit
    – Which produces linear addresses
  – Linear address given to paging unit
    – Which generates physical address in main memory
    – Paging units form equivalent of MMU

Page 28: LRU page replacement

Recall: Linux Process Address Space
[Figure: address space from 0 to 4G. User space (0 to 3G, user mode): task's code and data, shared runtime libraries, user-mode stack area. Kernel space (3G to 4G, kernel mode): process descriptor and kernel-mode stack.]
• Kernel space is also mapped into user space: from user mode to kernel mode, no need to switch address spaces

Page 29: LRU page replacement

Linux Segmentation
• Linux does not use segmentation
  – More portable since some RISC architectures don't support segmentation
  – Hierarchical paging is flexible enough
• x86 segmentation hardware cannot be disabled, so Linux just hacks the segmentation table
  – arch/i386/kernel/head.S
  – Set base to 0x00000000, limit to 0xffffffff
  – Logical addresses == linear addresses
• Protection
  – Descriptor Privilege Level indicates if we are in privileged mode or user mode
    – User code segment: DPL = 3
    – Kernel code segment: DPL = 0

Page 30: LRU page replacement

Linux Paging
• Linux uses paging to translate logical addresses to physical addresses
• Page model splits a linear address into five parts
  – Global dir | Upper dir | Middle dir | Table | Offset
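The five-part split is plain bit slicing. The field widths are not given on the slide; the sketch below assumes the x86-64 layout with 4 KB pages (9 bits per directory or table index and a 12-bit offset). The names `FIELD_BITS` and `split_linear_address` are hypothetical.

```python
# Assumed widths: x86-64 four-level paging with 4 KB pages (48-bit addresses).
FIELD_BITS = [
    ("global_dir", 9),
    ("upper_dir", 9),
    ("middle_dir", 9),
    ("table", 9),
    ("offset", 12),
]


def split_linear_address(addr):
    """Return a dict mapping each field name to its index within `addr`."""
    parts = {}
    shift = sum(bits for _, bits in FIELD_BITS)
    for name, bits in FIELD_BITS:
        shift -= bits                              # move down to this field
        parts[name] = (addr >> shift) & ((1 << bits) - 1)
    return parts


# Build an address with known indices, then take it apart again.
addr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
parts = split_linear_address(addr)
```

Each directory index selects an entry in one level of the page table tree, and the offset selects a byte within the final page.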

Page 31: LRU page replacement

Kernel Address Space Layout
[Figure: kernel linear addresses from 0xC0000000 to 0xFFFFFFFF, in order: physical memory mapping, vmalloc areas, persistent high memory mappings, fix-mapped linear addresses.]

Page 32: LRU page replacement

Linux Page Table Operations
• include/asm-i386/pgtable.h
• arch/i386/mm/hugetlbpage.c
• Examples
  – mk_pte

Page 33: LRU page replacement

TLB Flush Operations
• include/asm-i386/tlbflush.h
• Flushing TLB on x86
  – load cr3: flush all TLB entries
  – invlpg addr: flush a single TLB entry

Page 34: LRU page replacement

Today
• Virtual Memory Implementation
  – Implementing LRU
  – How to approximate LRU efficiently?
• Linux Memory Management
  – Page replacement
  – Segmentation and Paging
  – Dynamic memory allocation

Page 35: LRU page replacement

Linux Page Allocation
• Linux uses a buddy allocator for page allocation
  – Buddy allocator: fast, simple allocation for blocks that are 2^n bytes [Knuth 1968]
• Allocation restrictions: 2^n pages
• Allocation of k pages:
  – Round up to nearest 2^n
  – Search free lists for appropriate size
    – Recursively divide larger blocks until reaching a block of the correct size
    – "Buddy" blocks
• Free
  – Recursively coalesce block with buddy if buddy is free
• Example: allocate a 256-page block
• mm/page_alloc.c
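The split and coalesce steps can be sketched with one free list per block order. This is a toy model, not the code in mm/page_alloc.c: the class `BuddyAllocator` is hypothetical, blocks are identified by their starting page number, and requests are assumed to be at least one page.

```python
class BuddyAllocator:
    """Toy buddy allocator over 2**max_order pages (hypothetical sketch)."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free_lists[o] holds start page numbers of free blocks of 2**o pages.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)

    def alloc(self, num_pages):
        order = (num_pages - 1).bit_length()    # round up to 2**order pages
        o = order
        while o <= self.max_order and not self.free_lists[o]:
            o += 1                              # search larger free lists
        if o > self.max_order:
            return None
        start = self.free_lists[o].pop()
        while o > order:                        # split; upper half stays free
            o -= 1
            self.free_lists[o].add(start + (1 << o))
        return start, order

    def free(self, start, order):
        while order < self.max_order:
            buddy = start ^ (1 << order)        # buddy differs in exactly one bit
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)   # coalesce with the free buddy
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)


buddy = BuddyAllocator(10)          # 1024 pages in total
a = buddy.alloc(256)                # exactly 2**8 pages
b = buddy.alloc(200)                # rounded up to 256 pages
buddy.free(*a)
buddy.free(*b)
```

Allocating the 256-page block splits the 1024-page block twice; freeing both blocks coalesces everything back into a single block of order 10.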

Page 36: LRU page replacement

Advantages and Disadvantages of Buddy Allocation
• Advantages
  – Fast and simple compared to general dynamic memory allocation
  – Avoid external fragmentation by keeping free pages contiguous
    – Can use paging, but three problems:
      – DMA bypasses paging
      – Modifying page table leads to TLB flush
      – Cannot use "super page" to increase TLB coverage
• Disadvantages
  – Internal fragmentation
    – Allocation of block of k pages when k != 2^n
    – Allocation of small objects (smaller than a page)

Page 37: LRU page replacement

The Linux Slab Allocator
• For objects smaller than a page
• Implemented on top of page allocator
• Each memory region is called a cache
• Two types of slab allocator
  – Fixed-size slab allocator: cache contains objects of same size
    – For frequently allocated objects
  – General-purpose slab allocator: caches contain objects of size 2^n
    – For less frequently allocated objects
    – For allocation of object with size k, round up to nearest 2^n
• mm/slab.c
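Both flavours can be sketched briefly. This is an illustration, not the design of mm/slab.c: `SlabCache`, `size_class`, the 4096-byte page size, and the 32-byte minimum size class are all assumptions, and objects are represented as (slab, slot) pairs rather than real memory.

```python
PAGE_SIZE = 4096   # assumed page size


class SlabCache:
    """Fixed-size object cache carved out of whole pages (hypothetical sketch)."""

    def __init__(self, obj_size):
        self.obj_size = obj_size
        self.per_slab = PAGE_SIZE // obj_size
        self.free = []     # (slab number, slot) pairs ready for reuse
        self.slabs = 0     # pages taken from the page allocator so far

    def alloc(self):
        if not self.free:  # grow by one page only when the cache runs dry
            slab = self.slabs
            self.slabs += 1
            self.free.extend((slab, slot) for slot in reversed(range(self.per_slab)))
        return self.free.pop()     # no size search: every slot fits

    def release(self, obj):
        self.free.append(obj)      # no merging with adjacent free blocks


def size_class(k, min_size=32):
    """General-purpose caches: round a request of k bytes up to a power of two."""
    return max(min_size, 1 << (k - 1).bit_length())


cache = SlabCache(512)                     # 8 objects fit in one page
objs = [cache.alloc() for _ in range(9)]   # the ninth object needs a second page
cache.release(objs[0])
reused = cache.alloc()
```

A 100-byte request maps to the 128-byte size class, so 28 bytes are lost to internal fragmentation.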

Page 38: LRU page replacement

Advantages and Disadvantages of Slab Allocation
• Advantages
  – Fast: no need to allocate and free page frames
    – Allocation: no search of objects with the right size for fixed-size allocator; simple search for general-purpose allocator
    – Free: no merge with adjacent free blocks
  – Reduce internal fragmentation: many objects in one page
• Disadvantages
  – Memory overhead for bookkeeping
  – Internal fragmentation