Memory Management
• Basic memory management: mono- and multiprogramming, fixed and variable memory partitioning
• Swapping
• Paging
• Segmentation
• Virtual memory (paging)
• Page replacement algorithms
• Design issues for paging systems
• Implementation issues
Vera Goebel Department of Informatics
University of Oslo
2
Motivation
• In the project assignments so far
– Program code is linked to the kernel
– Physical addresses are well-known
– Not realistic
• In the real world
– Programs are loaded dynamically
– The physical addresses a program will get are not known to the program
– Program size at run-time is not known to the kernel
3
Memory Management
• Ideally programmers want memory that is
– large
– fast
– non-volatile
• Memory hierarchy
– small amount of fast, expensive memory: cache
– some medium-speed, medium-price main memory
– gigabytes of slow, cheap disk storage
• Memory manager handles the memory hierarchy
4
Computer Hardware Review
• Typical memory hierarchy – numbers shown are rough approximations
5
Memory Management for Monoprogramming
• Only one user program loaded – Program is entirely in memory – No swapping or paging
• Three simple ways of organizing memory
[Figure: three memory layouts, each spanning addresses 0x0 to 0x…fff: (a) OS in RAM below the user program (old mainframes and minicomputers); (b) user program below the OS in ROM (C64, ZX80, …; some PDAs, embedded systems); (c) OS in RAM at the bottom, user program in the middle, device drivers in ROM at the top (MS-DOS, …)]
6
Multiprogramming
• Processes have to wait for I/O
• Goal
– Do other work while a process waits
– Give the CPU to another process
• Several processes may be ready concurrently
• So
– If the I/O waiting probability of each process is p
– the probable CPU utilization can be estimated as
CPU utilization = 1 - p^n
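The estimate above can be checked with a few lines of Python (a sketch, not from the slides); with 80% I/O wait it reproduces the utilization figures used in the example that follows.

```python
# Estimated CPU utilization with n processes, each waiting
# for I/O with independent probability p: 1 - p^n.
def cpu_utilization(p, n):
    return 1 - p ** n

# With 80% I/O wait:
for n in range(1, 5):
    print(n, round(cpu_utilization(0.8, n) * 100))  # 20, 36, 49, 59 (%)
```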
7
Multiprogramming
• Arrival and work requirements of 4 jobs; CPU utilization for 1-4 jobs with 80% I/O wait

Job# | Arrival time | CPU use time | I/O wait time
1    | 10:00        | 4            | 16
2    | 10:10        | 3            | 12
3    | 10:15        | 2            | 8
4    | 10:20        | 2            | 8

# processes         | 1  | 2  | 3  | 4
CPU idle (%)        | 80 | 64 | 51 | 41
CPU busy (%)        | 20 | 36 | 49 | 59
CPU per process (%) | 20 | 18 | 16 | 15

• Sequence of events as jobs arrive and finish
– Note: the numbers show the amount of CPU time each job gets in each interval
[Figure: timeline from 10:00 showing the remaining CPU time of jobs 1-4 as each arrives and finishes; the jobs complete at roughly 22.0, 27.6, 28.2 and 31.7 minutes]
8
Multiprogramming
• CPU utilization as a function of the number of processes in memory
[Figure: CPU utilization versus the degree of multiprogramming]
9
Multiprogramming • Several programs
– Concurrently loaded into memory – OS must arrange memory sharing – Memory partitioning
• Memory – Needed for different tasks within a process – Shared among processes – Process memory demand may change over time
• Use of secondary storage – Move (parts of) blocking processes from memory – Higher degree of multiprogramming possible – Makes sense if processes block for long times
10
Memory Management for Multiprogramming
• Process may not be entirely in memory • Reasons
– Other processes use memory • Their turn • Higher priority • Process is waiting for I/O
– Too big • For its share • For entire available memory
• Approaches – Swapping – Paging – Overlays
[Figure: memory hierarchy of registers, cache(s), DRAM and disk, with rough relative access times of 2x, 100x and 10^9x; paging, swapping and overlays move data between DRAM and disk]
11
Memory Management for Multiprogramming
• Swapping – Remove a process from memory
• With all of its state and data • Store it on a secondary medium
– Disk, Flash RAM, other slow RAM, historically also Tape
• Paging
– Remove part of a process from memory
• Store it on a secondary medium
• The sizes of such parts are fixed: the page size
• Overlays – Manually replace parts of code and data
• Programmer’s rather than OS’s work • Only for very old and memory-scarce systems
(How to use these with virtual memory is covered later)
12
Memory Management Techniques • Before details about moving processes out
– Assign memory to processes
• Memory partitioning – Fixed partitioning – Dynamic partitioning – Simple paging – Simple segmentation – Virtual memory paging – Virtual memory segmentation
13
Multiprogramming with Fixed Partitions
• Fixed memory partitions – separate input queues for each partition – single input queue
14
Fixed Partitioning
• Divide memory – Into static partitions – At system initialization time (boot or earlier)
• Advantages – Very easy to implement – Can support swapping process in and out
15
Fixed Partitioning
• Two fixed partitioning schemes – Equal-size partitions – Unequal-size partitions
• Equal-size partitions
– Big programs cannot be executed
• Unless program parts are loaded from disk
– Small programs use an entire partition
• A problem called “internal fragmentation”
[Figure: memory from 0x0 to 0x…fff divided into an 8 MB operating-system area and seven equal 8 MB partitions]
16
Fixed Partitioning • Two fixed partitioning
schemes – Equal-size partitions – Unequal-size partitions
• Unequal-size partitions
– Bigger programs can be loaded at once
– Smaller programs can lead to less internal fragmentation
– The advantages require assignment of jobs to partitions
[Figure: the equal-size scheme (OS plus seven 8 MB partitions) next to an unequal-size scheme (OS 8 MB; partitions of 2, 4, 6, 8, 8, 12 and 16 MB)]
17
Fixed Partitioning • Approach
– Has been used in mainframes
– Uses the term “job” for a running program
– Jobs run as batch jobs
– Jobs are taken from a queue of pending jobs
• Problem with unequal partitions
– Choosing a job for a partition
18
Fixed Partitioning
• One queue per partition
– Internal fragmentation is minimal
– Jobs wait although sufficiently large partitions are available
19
Fixed Partitioning
• Single queue
– Jobs are put into the next sufficiently large partition
– Waiting time is reduced
– Internal fragmentation is bigger
– A swapping mechanism can reduce internal fragmentation
• Move a job to another partition
20
Problems: Relocation and Protection
• Cannot be sure where a program will be loaded in memory
– address locations of variables and code routines cannot be absolute
– must keep a program out of other processes’ partitions
• Use base and limit values
– address locations are added to the base value to map to a physical address
– address locations larger than the limit value are an error
2 Registers: Base and Bound
• Built into the Cray-1
• A program can only access physical memory in [base, base+bound]
• On a context switch: save/restore the base and bound registers
• Pros: simple
• Cons: fragmentation, hard to share, and difficult to use disks
[Figure: the virtual address is compared against the bound register (error if larger) and added to the base register to form the physical address]
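The base-and-bound check can be sketched in a few lines of Python (illustrative, not from the slides; the bound register is assumed to hold the size of the program's region):

```python
# Base-and-bound translation: check the limit first, then relocate.
def translate(virtual_addr, base, bound):
    if virtual_addr >= bound:       # out of the program's region
        raise MemoryError("address out of bounds")
    return base + virtual_addr      # relocation by the base register

print(hex(translate(0x0042, base=0x20000, bound=0x1000)))  # 0x20042
```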
22
Swapping (1)
Memory allocation changes as processes come into and leave memory
Shaded regions are unused memory
23
Dynamic Partitioning
• Divide memory
– Partitions are created dynamically for jobs
– Removed after jobs are finished
• External fragmentation
– The problem increases with system running time
– Occurs with swapping as well
– Addresses of process 2 changed after swap-in
[Figure: a 64 MB memory over time; the OS takes 8 MB, leaving 56 MB free. Processes 1 (20 MB), 2 (14 MB) and 3 (18 MB) are loaded; process 2 is swapped out and later swapped back into a different hole, and processes 4 (8 MB) and 5 (14 MB) leave scattered 4-6 MB holes: external fragmentation]
(Solutions to the address-change problem: address translation)
24
Dynamic Partitioning
• Reduce external fragmentation
– Compaction
• Compaction
– Takes time
– Consumes processing resources
• Reduce the need for compaction
– Placement algorithms
[Figure: after compaction, the scattered 4-6 MB holes around processes 2, 3 and 4 are collected into one 16 MB free block]
25
Dynamic Partitioning: Placement Algorithms
• Use the most suitable partition for a process
• Typical algorithms
– First fit
– Next fit
– Best fit
[Figure: a 128 MB memory with free blocks of 16, 4, 8, 6, 8, 16 and 32 MB, showing where First fit, Next fit and Best fit place the same sequence of requests]
26
Dynamic Partitioning: Placement Algorithms
• Use the most suitable partition for a process
• Typical algorithms
– First fit
– Next fit
– Best fit
[Figure: continuation of the example; after a 12 MB and a 10 MB request, First fit and Best fit leave different patterns of free blocks]
27
Dynamic Partitioning: Placement Algorithms
• Comparison of First fit, Next fit and Best fit • Example is naturally artificial
– First fit • Simplest, fastest of the three • Typically the best of the three
– Next fit • Typically slightly worse than first fit • Problems with large segments
– Best fit • Slowest • Creates lots of small free blocks • Therefore typically worst
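The three placement algorithms can be sketched over a simple list of free holes (a Python sketch, not from the slides; the `(start, size)` representation is illustrative):

```python
# Each hole is (start, size); the functions return the index of the
# chosen hole, or None if nothing fits.
def first_fit(holes, size):
    for i, (_, hsize) in enumerate(holes):
        if hsize >= size:
            return i                     # first hole large enough
    return None

def next_fit(holes, size, last):
    n = len(holes)
    for k in range(n):                   # resume scanning after the last hit
        i = (last + 1 + k) % n
        if holes[i][1] >= size:
            return i
    return None

def best_fit(holes, size):
    fits = [i for i, (_, hsize) in enumerate(holes) if hsize >= size]
    return min(fits, key=lambda i: holes[i][1]) if fits else None

holes = [(0, 16), (40, 4), (60, 8), (100, 6)]
print(first_fit(holes, 6))   # 0: the first hole large enough
print(best_fit(holes, 6))    # 3: the tightest fit, leaves no fragment
```

Note how best fit picks the 6 MB hole exactly, which is why it tends to leave many tiny, unusable free blocks elsewhere.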
28
Memory Management with Bit Maps
• Part of memory with 5 processes, 3 holes – tick marks show allocation units – shaded regions are free
• Corresponding bit map • Same information as a list
29
Memory Management with Linked Lists
Four neighbor combinations for the terminating process X
30
Buddy System
• Mix of fixed and dynamic partitioning
– Partitions have sizes 2^k, L ≤ k ≤ U
• Maintain a list of holes for each size
• Assign a process
– Find the smallest k so that the process fits into 2^k
– Find a hole of size 2^k
– If not available, split the smallest hole larger than 2^k
• Split recursively into halves until two holes have size 2^k
[Figure: a 1 MB block is split into 512 kB, 256 kB and 128 kB buddies as processes of 128 kB, 256 kB and 32 kB are allocated and freed]
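The allocation side of the buddy system can be sketched as follows (a Python sketch under the slide's rules; the free-list-per-size dictionary is an illustrative representation):

```python
# free_lists maps block size -> list of free block start addresses.
def buddy_alloc(free_lists, request):
    k = 1
    while 2 ** k < request:          # smallest k with 2^k >= request
        k += 1
    size = 2 ** k
    s = size
    while s not in free_lists or not free_lists[s]:
        s *= 2                       # look for a larger block to split
        if s > max(free_lists):
            return None              # nothing big enough
    block = free_lists[s].pop()
    while s > size:                  # split into halves ("buddies")
        s //= 2
        free_lists.setdefault(s, []).append(block + s)  # upper buddy stays free
    return block

free = {1024 * 1024: [0]}            # one free 1 MB block at address 0
print(buddy_alloc(free, 100 * 1024)) # 0: a 128 kB block split out of the 1 MB
print(free[128 * 1024])              # [131072]: its free buddy
```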
31
Swapping (2)
• Allocating space for growing data segment • Allocating space for growing stack & data segment
32
Memory use within a process
• Memory needs of known size
– Program code
– Global variables
• Memory needs of unknown size
– Dynamically allocated memory
– Stack
• Several stacks in multithreaded programs
[Figure: a program with initialized global variables (data) and uninitialized globals, and the resulting process layout of PCB, program, data and stack, possibly with one stack per thread]
33
Memory Addressing
• Addressing in memory
– Addressing needs are determined during programming
– Must work independently of position in memory
– Actual physical addresses are not known
35
Memory Management
• Addressing
– Covered in the address translation and virtual memory lectures
• Important now
– Translation is necessary
– It therefore becomes possible to have several parts
• Pages
• Segments
36
Paging
• Pages
– Equal lengths
– Determined by the processor
– One page is moved into one memory frame
• A process is loaded into several frames
– Not necessarily consecutive
• No external fragmentation
• Little internal fragmentation
– Depends on frame size
[Figure: physical memory frames holding pages of processes 1-5, not necessarily consecutively]
Paging
• Use a page table to translate
• Various bits in each entry
• Context switch: similar to the segmentation scheme
• What should the page size be?
• Pros: simple allocation, easy to share
• Cons: big page table, and cannot deal with holes easily
[Figure: the virtual address (VPage #, offset) indexes the page table; the VPage # is checked against the page table size (error if larger) and replaced by the PPage # to form the physical address]
38
Segmentation
• Segments
– Different lengths
– Determined by the programmer
– Mapped onto memory frames
• The programmer (or compiler toolchain) organizes the program in parts
– More control
– Needs awareness of possible segment size limits
• Pros and cons
– Principle as in dynamic partitioning
– No internal fragmentation
– Less external fragmentation, because segments are on average smaller than whole processes
Segmentation
• Have a table of (seg, size)
• Protection: each entry has access bits (nil, read, write)
• On a context switch: save/restore the table, or a pointer to the table in kernel memory
• Pros: efficient, easy to share
• Cons: complex management, and fragmentation within a segment
[Figure: the virtual address (segment, offset) indexes the (seg, size) table; the offset is checked against the segment size (error if larger) and added to the segment base to form the physical address]
40
Paging and Segmentation
• Typical for paging and segmentation
– Address translation
– At execution time
– With processor support
• Simple paging and segmentation
– Without virtual memory and protection
– Can be implemented
• by address rewriting at load time
• by jump tables set up at load time
[Figure: simplified address translation; an address (“part 2”, offset in part 2) is resolved through a lookup table to the base of code part 2 plus the offset]
Segmentation with Paging
[Figure: the virtual address (Vseg #, VPage #, offset) first indexes a (seg, size) table (error if the segment part is too large); the segment’s page table then maps VPage # to PPage #, which is combined with the offset to form the physical address]
42
Other needs (protection)
• Protection of a process from itself
– (the stack grows into the heap)
• Protection of processes from each other
– (writes into another process)
(Solutions to protection: address translation)
43
Summary: Memory Management
• Algorithms
– Paging and segmentation
• Extended in the address translation and virtual memory lectures
– Placement algorithms for partitioning strategies
• Mostly obsolete for system memory management, since hardware address translation is available
• But still necessary for managing
– kernel memory
– memory within a process
– memory of specialized systems (esp. database systems)
• Address translation
– Solves addressing in a loaded program
• Hardware address translation
– Supports protection from data access
– Supports a new physical memory position after swapping in
• Virtual memory
– Provides a larger logical (virtual) than physical memory
– Selects a process, page or segment for removal from physical memory
44
Why Virtual Memory?
• Use secondary storage
– Extend expensive DRAM with reasonable performance
• Protection
– Programs do not step over each other; communication between them requires explicit IPC operations
• Convenience
– Flat address space; programs have the same view of the world
45
Virtual Memory Paging (1)
The position and function of the MMU
Translation Overview
• The actual translation is done in hardware (MMU)
• Controlled in software
• CPU view
– what the program sees: virtual memory
• Memory view
– physical memory
[Figure: the CPU issues virtual addresses; the MMU translates them into physical addresses for physical memory and I/O devices]
47
Goals of Translation
• Implicit translation for each memory reference
• A hit should be very fast
• Trigger an exception on a miss
• Protected from user’s faults
[Figure: memory hierarchy of registers, cache(s) (10x), DRAM (100x) and disk (10Mx), with paging between DRAM and disk]
48
Paging (2)
The relation between virtual addresses and physical memory addresses, given by the page table
49
Page Tables (1)
Internal operation of the MMU with 16 pages of 4 KB
50
Memory Lookup
[Figure: incoming 16-bit virtual address 0x2004 (8196); the 4-bit index (virtual page 0010 = 2) selects page table entry 2, which holds frame 110 (6) with present bit 1; the 12-bit offset (0x004) is copied unchanged, giving outgoing physical address 0x6004 (24580)]
51
Memory Lookup
[Figure: the same lookup for virtual address 0x2004, but now page table entry 2 has present bit 0, so the reference causes a PAGE FAULT]
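The lookup in the two figures can be simulated directly (a Python sketch; the page table contents are taken from the figure, everything else is illustrative):

```python
# 16-bit virtual addresses, 16 pages of 4 KB: top 4 bits index the
# page table, low 12 bits are the offset. vpage -> (frame, present).
page_table = {0: (2, 1), 1: (1, 1), 2: (6, 1), 3: (0, 1),
              4: (4, 1), 5: (3, 1), 9: (5, 1), 11: (7, 1)}

def mmu(vaddr):
    vpage, offset = vaddr >> 12, vaddr & 0xFFF
    frame, present = page_table.get(vpage, (0, 0))
    if not present:
        raise RuntimeError("PAGE FAULT")   # as in the second figure
    return (frame << 12) | offset

print(hex(mmu(0x2004)))  # 0x6004 (24580): virtual page 2 maps to frame 6
```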
52
Page Fault Handling
1. Hardware traps to the kernel, saving the program counter and process state information
2. Save general registers and other volatile information
3. OS discovers the page fault and tries to determine which virtual page is requested
4. OS checks if the virtual page is valid and if the protection is consistent with the access
5. Select a page to be replaced
6. Check if the selected page frame is ”dirty”, i.e., updated
7. When the selected page frame is ready, the OS finds the disk address where the needed data is located and schedules a disk operation to bring it into memory
8. A disk interrupt is executed indicating that the disk I/O operation is finished; the page tables are updated, and the page frame is marked ”normal state”
9. The faulting instruction is backed up and the program counter is reset
10. The faulting process is scheduled, and the OS returns to the routine that trapped to the kernel
11. The registers and other volatile information are restored, and control is returned to user space to continue execution as if no page fault had occurred
53
Instruction Backup
An instruction causing a page fault
54
Page Tables (2)
• 32-bit address with 2 page table fields
• Two-level page tables
[Figure: a top-level page table whose entries point to second-level page tables]
55
Page Tables (3)
Typical page table entry
Multiple-Level Page Tables
[Figure: the virtual address is split into (dir, table, offset); the dir field indexes a directory, whose entry points to a page table, whose entry (pte) gives the page frame]
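Splitting the virtual address into the (dir, table, offset) fields is just bit shifting and masking (a Python sketch; the common 10+10+12 split for 32-bit addresses with 4 KB pages is assumed):

```python
# 10-bit directory index, 10-bit table index, 12-bit offset.
def split(vaddr):
    directory = (vaddr >> 22) & 0x3FF
    table     = (vaddr >> 12) & 0x3FF
    offset    = vaddr & 0xFFF
    return directory, table, offset

print(split(0x00403004))  # (1, 3, 4)
```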
57
TLBs – Translation Lookaside Buffers
A TLB to speed up paging
Translation Look-aside Buffer (TLB)
[Figure: the virtual address (VPage #, offset) is matched against the VPage # fields of all TLB entries in parallel; on a hit the stored PPage # forms the physical address, on a miss the real page table is consulted]
Bits in a TLB Entry
• Common (necessary) bits – Virtual page number: match with the virtual address – Physical page number: translated address – Valid – Access bits: kernel and user (nil, read, write)
• Optional (useful) bits – Process tag – Reference – Modify – Cacheable
Hardware-Controlled TLB
• On a TLB miss
– Hardware loads the PTE into the TLB
• Needs to write an entry back if there is no free entry
– Generates a fault if the page containing the PTE is invalid
– VM software performs fault handling
– Restart the CPU
• On a TLB hit, hardware checks the valid bit
– If valid, pointer to the page frame in memory
– If invalid, the hardware generates a page fault
• Perform page fault handling
• Restart the faulting instruction
Software-Controlled TLB • On a miss in TLB
– Write back if there is no free entry – Check if the page containing the PTE is in memory – If no, perform page fault handling – Load the PTE into the TLB – Restart the faulting instruction
• On a hit in TLB, the hardware checks valid bit – If valid, pointer to page frame in memory – If invalid, the hardware generates a page fault
• Perform page fault handling • Restart the faulting instruction
62
Hardware vs. Software Controlled
• Hardware approach – Efficient – Inflexible – Need more space for page table
• Software approach – Flexible – Software can do mappings by hashing
• PP# → (Pid, VP#) • (Pid, VP#) → PP#
– Can deal with large virtual address space
63
How Many PTEs Do We Need?
• Worst case for a 32-bit address machine
– # of processes × 2^20 (if the page size is 4096 bytes)
• What about a 64-bit address machine?
– # of processes × 2^52
Inverted Page Tables
• Main idea
– One PTE for each physical page frame
– Hash (Vpage, pid) to PPage#
• Pros
– Small page table for a large address space
• Cons
– Lookup is difficult
– Overhead of managing hash chains, etc.
[Figure: the virtual address (pid, vpage, offset) is hashed into the inverted page table of n entries; the index k of the matching entry, combined with the offset, forms the physical address]
65
Inverted Page Tables
Comparison of a traditional page table with an inverted page table
66
Page Replacement Algorithms
• Page fault → the OS has to select a page for replacement
– Modified page → write back to disk
– Unmodified page → just overwrite with new data
• How do we decide which page to replace?
→ determined by the page replacement algorithm
→ several algorithms exist:
• Random
• Other algorithms take into account usage, age, etc. (e.g., FIFO, not recently used, least recently used, second chance, clock, …)
• Which is best?
67
Optimal
• Best possible page replacement algorithm:
• When a page fault occurs, all pages in memory are labeled with the number of instructions that will be executed before each page is used again
• The page with the most instructions before reuse is replaced
• Easy to describe, but impossible to implement (the OS cannot look into the future)
• Estimate by logging page usage on previous runs of the process
• Useful to evaluate other page replacement algorithms
68
Not Recently Used (NRU)
• Two status bits are associated with each page:
R → page referenced (read or written)
M → page modified (written)
• Pages belong to one of four classes according to the status bits:
• Class 0: not referenced, not modified (R=0, M=0)
• Class 1: not referenced, modified (R=0, M=1)
• Class 2: referenced, not modified (R=1, M=0)
• Class 3: referenced, modified (R=1, M=1)
• NRU removes a page at random from the lowest-numbered, non-empty class
• Low overhead
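The class number is simply 2·R + M, so NRU can be sketched in a few lines (Python sketch; the page dictionary is an illustrative representation):

```python
import random

# pages: {name: (R, M)}. Evict a random page from the
# lowest-numbered non-empty class (class = 2*R + M).
def nru_victim(pages):
    classes = {0: [], 1: [], 2: [], 3: []}
    for name, (r, m) in pages.items():
        classes[2 * r + m].append(name)
    for c in range(4):
        if classes[c]:
            return random.choice(classes[c])

pages = {"A": (1, 1), "B": (0, 1), "C": (1, 0)}
print(nru_victim(pages))  # B: class 1 beats classes 2 and 3
```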
69
First In First Out (FIFO)
• All pages in memory are maintained in a list sorted by age
• FIFO replaces the oldest page, i.e., the first in the list
• Low overhead
• FIFO is rarely used in its pure form
[Figure: the FIFO chain for reference string A B C D A E F G H I A J; pages enter at the “most recently loaded” end, and the reference to A while it is in memory leaves the chain unchanged. Once the buffer holds 8 pages (H G F E D C B A), the next page fault results in a replacement of the page loaded first]
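FIFO replacement over the reference string from the figure can be simulated in a few lines (a Python sketch; the 8-frame buffer matches the figure):

```python
from collections import deque

def fifo(refs, nframes):
    frames, faults = deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.popleft()        # evict the page loaded first
            frames.append(page)
    return faults

print(fifo("ABCDAEFGHIAJ", 8))  # 11 faults: only the second reference to A is a hit
```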
70
Second Chance
• Modification of FIFO
• R bit: when a page is referenced again, the R bit is set, and the page will be treated as a newly loaded page
Reference string: A B C D A E F G H I
[Figure: the FIFO chain with an R-bit per page. The second reference to A sets A’s R-bit. When the buffer is full (H G F E D C B A with A’s R-bit = 1) and page I must be inserted, the first-loaded page is examined: if its R-bit = 0 it is replaced; if its R-bit = 1 the bit is cleared, the page is moved last, and the new first page is examined. Here A’s R-bit = 1, so A is moved last with its R-bit cleared; B’s R-bit = 0, so B is paged out, the chain is shifted, and I is inserted last]
• Second chance is a reasonable algorithm, but inefficient because it is moving pages around the list
71
Clock
• A more efficient way to implement Second Chance
• Circular list in the form of a clock
• Pointer to the oldest page:
– R-bit = 0 → replace and advance the pointer
– R-bit = 1 → set the R-bit to 0, advance the pointer until a page with R-bit = 0 is found, replace it and advance the pointer
Reference string: A B C D A E F G H I
[Figure: eight pages arranged in a circle with their R-bits; A’s R-bit is 1, so when page I must be loaded, A gets a second chance (its R-bit is cleared) and B, with R-bit 0, is replaced]
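A single clock replacement step can be sketched as follows (Python sketch; the circular buffer of `(page, R)` pairs is an illustrative representation):

```python
# frames: list of (page, R) treated as a circle; hand: index of the
# oldest page. Returns the new hand position.
def clock_replace(frames, hand, new_page):
    while True:
        page, r = frames[hand]
        if r == 0:
            frames[hand] = (new_page, 0)     # replace here
            return (hand + 1) % len(frames)  # advance past it
        frames[hand] = (page, 0)             # clear R: second chance in place
        hand = (hand + 1) % len(frames)

frames = [("A", 1), ("B", 0), ("C", 0), ("D", 0)]
hand = clock_replace(frames, 0, "I")
print(frames)  # [('A', 0), ('I', 0), ('C', 0), ('D', 0)]
print(hand)    # 2
```

Note that unlike Second Chance, no pages are moved around; only the hand advances.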
72
Least Recently Used (LRU)
• Replace the page that has gone the longest time since its last reference
• Based on the observation that pages that were heavily used in the last few instructions will probably be used again in the next few instructions
• There are several ways to implement this algorithm
73
Least Recently Used (LRU)
• LRU as a linked list:
[Figure: for reference string A B C D A E F G H A C I, the chain is kept ordered from most recently used to least recently used. Each reference to a page already in memory (A, then A again, then C) moves it to the “most recently used” end; when the buffer is full (H G F E A D C B) the next page fault replaces the LRU page B with I]
• Expensive - maintaining an ordered list of all pages in memory:
• most recently used at the front, least at the rear
• the list must be updated on every memory reference!
74
Least Recently Used (LRU)
• LRU by using aging:
– a ”reference counter” for each page
– after a clock tick:
• shift the bits in the reference counter to the right (the rightmost bit is deleted)
• add the page’s reference bit at the front (left) of the reference counter
– the page with the lowest counter is replaced

R-bits per tick: tick 0: 1 0 1 0 1 1; tick 1: 1 1 0 1 0 0; tick 2: 1 1 0 1 0 1; tick 3: 1 0 0 0 1 0; tick 4: 0 1 1 0 0 0

Page | tick 0   | tick 1   | tick 2   | tick 3   | tick 4
1    | 10000000 | 11000000 | 11100000 | 11110000 | 01111000
2    | 00000000 | 10000000 | 11000000 | 01100000 | 10110000
3    | 10000000 | 01000000 | 00100000 | 00010000 | 10001000
4    | 00000000 | 10000000 | 11000000 | 01100000 | 00110000
5    | 10000000 | 01000000 | 00100000 | 10010000 | 01001000
6    | 10000000 | 01000000 | 10100000 | 01010000 | 00101000
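One aging tick is a shift-and-insert per page; the sketch below (Python, not from the slides) reproduces the counter values after tick 2 in the table:

```python
# Shift each 8-bit counter right and put the page's R bit leftmost.
def tick(counters, r_bits):
    return [(c >> 1) | (r << 7) for c, r in zip(counters, r_bits)]

counters = [0] * 6
for r_bits in ([1, 0, 1, 0, 1, 1], [1, 1, 0, 1, 0, 0], [1, 1, 0, 1, 0, 1]):
    counters = tick(counters, r_bits)

print([format(c, "08b") for c in counters])
# ['11100000', '11000000', '00100000', '11000000', '00100000', '10100000']
```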
75
Least Recently Used (LRU)
• LRU as a matrix:
– N pages → N × N matrix
– Page k is referenced → row k is set (1), then column k is cleared (0)
– Replace the page with the lowest row value
[Figure: the 4 × 4 matrix after each reference in the page-frame string 1 2 3 4 3 2 1 4; e.g., after 1 2 3 4 the rows read 0000, 1000, 1100, 1110, so page 1 is least recently used]
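The matrix update rule is two lines of code (Python sketch; pages are 0-indexed here, unlike the figure):

```python
# On a reference to page k: set row k to ones, then clear column k.
def reference(matrix, k):
    n = len(matrix)
    matrix[k] = [1] * n
    for row in matrix:
        row[k] = 0

n = 4
matrix = [[0] * n for _ in range(n)]
for page in [0, 1, 2, 3]:          # the figure's references 1 2 3 4
    reference(matrix, page)

rows = [int("".join(map(str, row)), 2) for row in matrix]
print(rows)                        # [0, 8, 12, 14]
print(rows.index(min(rows)))       # 0: page 1 of the figure is LRU
```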
76
Counting Algorithms
• LRU by using a reference counter
– clear the counter when the page is referenced (counter = 0)
– increase all counters each clock tick
– replace the page with the highest counter
• Not/Least Frequently Used (N/LFU)
– counter initially 0
– increase a page’s counter only if it has been referenced during this clock tick
– replace the page with the lowest counter
• Most Frequently Used (MFU)
– counter as in LFU
– replace the page with the highest counter (assuming low counters mean new, fresh pages)
77
LRU-K & 2Q
• LRU-K: bases page replacement on the last K references to a page [O’Neil et al. 93]
• 2Q: uses 3 queues to hold much-referenced and popular pages in memory [Johnson et al. 94]
• 2 FIFO queues for seldom-referenced pages
• 1 LRU queue for much-referenced pages
[Figure: pages retrieved from disk enter the first FIFO queue; if reused they move to the LRU queue, otherwise they move to the second FIFO queue and are eventually paged out. Pages reused while in the LRU queue are re-arranged there; pages reused from the second FIFO queue move back to the LRU queue]
78
Working Set Model
• Working set: the set of pages that a process is currently using
• Working set model: the paging system tries to keep track of each process’ working set and makes sure that these pages are in memory before letting the process run → reduces the page fault rate (prepaging)
• Defining the working set:
– the set of pages used in the last k memory references (must count backwards)
– an approximation is to use all references in the last XX instructions
79
The Working Set Page Replacement Algorithm (1)
• The working set is the set of pages used by the k most recent memory references
• w(k, t) is the size of the working set at time t
[Figure: w(k, t) plotted against k]
80
Working Set Page Replacement Algorithm
• τ: the time period over which to calculate the working set
• age = virtual time - time of last reference
if all pages have R == 1
    select one page randomly
• Expensive - must search the whole page table
81
WSClock Page Replacement Algorithm
• Organize the page table entries as a clock
• As with clock, the page pointed to is examined first
– R = 1: clear the bit, set the virtual time, continue (b)
– R = 0: (c)
• age < τ: continue to the next page
• age > τ:
– if the page is clean, replace it (d)
– otherwise, schedule a write to disk and continue to the next page
• If the pointer comes all the way back to the start
– writes are scheduled: go around again and replace the first clean page found
– no writes scheduled (all pages in the working set): several options
• remove the first clean page
• remove the oldest page
• ...
82
Review of Page Replacement Algorithms
83
Demand Paging Versus Prepaging
• Demand paging: pages are loaded on demand, i.e., after a process needs them
• Should be used if we have no knowledge about future references
• Each page is loaded separately from disk, i.e., results in many disk accesses
• Prepaging: prefetching data in advance, i.e., before use
• Should be used if we have knowledge about future references
• # page faults is reduced, i.e., the page is in memory when needed by a process
• # disk accesses can be reduced by loading several pages in one I/O operation
84
Allocation Policies
• How should memory be allocated among the competing runnable processes?
• Equal allocation: all processes get the same number of page frames
• Proportional allocation: the number of page frames depends on process size
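The two policies can be sketched for a pool of m page frames (Python sketch; the dictionaries and integer rounding are illustrative choices):

```python
def equal_allocation(m, processes):
    return {p: m // len(processes) for p in processes}

def proportional_allocation(m, sizes):   # sizes: {process: virtual size}
    total = sum(sizes.values())
    return {p: m * s // total for p, s in sizes.items()}

sizes = {"A": 10, "B": 30, "C": 60}
print(equal_allocation(60, sizes))         # {'A': 20, 'B': 20, 'C': 20}
print(proportional_allocation(60, sizes))  # {'A': 6, 'B': 18, 'C': 36}
```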
85
Allocation Policies
• Local page replacement: consider only the process’s own pages when replacing a page
• corresponds to equal allocation
• can cause thrashing
• multiple, identical pages in memory
• Global page replacement: consider all pages in memory when replacing a page
• corresponds to proportional allocation
• better performance in general
• monitoring of working set size and aging bits
• data sharing
86
Allocation Policies
• Example: local versus global replacement - insert page A5 using age-based replacement

Original configuration (page: age): A1 10, A2 7, A3 4, A4 11, B1 6, B2 12, B3 1, B4 3, C1 8, C2 2, C3 9, C4 5

Local replacement: replace the oldest of A’s pages → A3 (lowest age value, 4) is replaced by A5 (age 13)
Global replacement: replace the oldest page in memory → B3 (lowest age value, 1) is replaced by A5 (age 13)
87
Allocation Policies
• Page fault frequency (PFF):
Usually, more page frames → fewer page faults
[Figure: PFF (page faults/sec) falling as the number of page frames assigned grows]
• PFF unacceptably high → the process needs more memory
• PFF very low → the process may have too much memory
• Possible solution: reduce the number of processes competing for memory
• reassign a page frame
• swap one or more processes to disk, divide up the pages they held
• reconsider the degree of multiprogramming
88
Page Size
• Determining the optimum page size requires balancing several competing factors:
• Data segment size ≠ n × page size → internal fragmentation (favors small pages)
• Keep in memory only data that is (currently) used (favors small pages)
• Fewer disk operations (favors large pages)
• Page table size: access/load time and space requirements (favors large pages)
• Page replacement algorithm: operations per page (favors large pages)
• Usual page sizes are 4 KB - 8 KB, but up to 64 KB has been suggested for systems supporting ”new” applications managing high-data-rate streams like video and audio
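A classic back-of-the-envelope model (not from these slides) makes the trade-off concrete: with process size s, page size p and e bytes per page table entry, the overhead is roughly s·e/p (page table) plus p/2 (internal fragmentation in the last page), minimized at p = sqrt(2·s·e).

```python
from math import sqrt

# overhead = page table space + expected internal fragmentation
def overhead(s, p, e=8):
    return s * e / p + p / 2

s = 1 << 20                           # a 1 MB process, 8-byte PTEs
print(round(sqrt(2 * s * 8)))         # 4096: optimum near a 4 KB page
for p in (1024, 4096, 16384):
    print(p, round(overhead(s, p)))   # 8704, 4096, 8704 bytes
```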
89
Locking & Sharing
• Locking pages in memory:
– I/O and context switches
– Much-used pages
– …
• Shared pages: users running the same program at the same time, e.g., an editor or compiler
– Problem 1: not all pages are shareable
– Problem 2: process swapping or termination
– …
90
Separate Instruction and Data Spaces
• One address space • Separate I and D
spaces
91
Cleaning Policy
• Need for a background process, the paging daemon
– periodically inspects the state of memory
• When too few frames are free
– selects pages to evict using a replacement algorithm
• It can use the same circular list (clock) as the regular page replacement algorithm, but with a different pointer
92
Implementation Issues
Operating System Involvement with Paging
Four times when the OS is involved with paging:
1. Process creation
- determine program size
- create page table
2. Process execution
- MMU reset for the new process
- TLB flushed
3. Page fault time
- determine the virtual address causing the fault
- swap the target page out, the needed page in
4. Process termination time
- release page table, pages
93
Page Fault Handling (1)
1. Hardware traps to the kernel
2. General registers saved
3. OS determines which virtual page is needed
4. OS checks the validity of the address, seeks a page frame
5. If the selected frame is dirty, write it to disk
Page Fault Handling (2)
6. OS schedules the new page to be brought in from disk
7. Page tables updated
8. Faulting instruction backed up to where it began
9. Faulting process scheduled
10. Registers restored
11. Program continues
48!
95
Instruction Backup
An instruction causing a page fault
96
Locking Pages in Memory
• Virtual memory and I/O occasionally interact • Proc issues call for read from device into buffer
– while waiting for I/O, another process starts up and has a page fault
– the buffer for the first process may be chosen to be paged out
• Need to specify some pages locked – exempted from being target pages
49!
97
Backing Store
(a) Paging to static swap area (b) Backing up pages dynamically
98
Segmentation (1)
• One-dimensional address space with growing tables • One table may bump into another
50!
99
Segmentation (2)
Allows each table to grow or shrink, independently
100
Segmentation (3)
Comparison of paging and segmentation
51!
101
Implementation of Pure Segmentation
(a)-(d) Development of checkerboarding (e) Removal of the checkerboarding by compaction
102
Segmentation with Paging: MULTICS (1)
• Descriptor segment points to page tables • Segment descriptor – numbers are field lengths
52!
103
Segmentation with Paging: MULTICS (2)
A 34-bit MULTICS virtual address
104
Segmentation with Paging: MULTICS (3)
Conversion of a 2-part MULTICS address into a main memory address
53!
105
Segmentation with Paging: MULTICS (4)
• Simplified version of the MULTICS TLB • Existence of 2 page sizes makes actual TLB more complicated
106
Segmentation with Paging: Pentium (1)
A Pentium selector
54!
107
Segmentation with Paging: Pentium (2)
• Pentium code segment descriptor • Data segments differ slightly
108
Segmentation with Paging: Pentium (3)
Conversion of a (selector, offset) pair to a linear address
55!
109
Segmentation with Paging: Pentium (4)
Mapping of a linear address onto a physical address
110
Segmentation with Paging: Pentium (5)
Protection on the Pentium
56!
111
Paging on Pentium • In protected mode, the currently executing process
has a 4 GB (2³²) address space – viewed as 1 M 4 KB pages – The 4 GB address space is divided into 1 K page groups
(1st level – page directory) – Each page group has 1 K 4 KB pages
(2nd level – page table)
• Mass storage space is also divided into 4 KB blocks of information
• Uses control registers for paging information
112
Control Registers used for Paging on Pentium
• Control register 0 (CR0) – bit layout: PG (31), CD (30), NW (29), WP (16), PE (0)
– Protected Mode Enable: if CR0[PE] = 1, the processor is in protected mode
– Paging Enable: the OS enables paging by setting CR0[PG] = 1
– Write-Protect: if CR0[WP] = 1, only the OS may write to read-only pages
– Not-Write-Through and Cache Disable: used to control the internal cache
• Control register 1 (CR1)
– does not exist, returns only zero
• Control register 2 (CR2) – holds the page fault linear address (bits 31–0)
– only used if CR0[PG]=1 & CR0[PE]=1
57!
113
Control Registers used for Paging on Pentium
• Control register 3 (CR3) – page directory base address
– only used if CR0[PG]=1 & CR0[PE]=1
– bits 31–12: the 4 KB-aligned physical base address of the page directory
– PCD (bit 4), Page Cache Disable: if CR3[PCD] = 1, caching is turned off
– PWT (bit 3), Page Write-Through: if CR3[PWT] = 1, use write-through updating
• Control register 4 (CR4)
– PSE (bit 4), Page Size Extension: if CR4[PSE] = 1, the OS designer may designate some pages as 4 MB
114
Pentium Memory Lookup
• Incoming virtual address 0x1802038 (25174072), binary 0000 0001 1000 0000 0010 0000 0011 1000
– bits 31–22 index the page directory, bits 21–12 index the page table, bits 11–0 are the page offset
• Page directory entry fields:
– bits 31–12: physical base address of the page table
– PS: page size
– A: accessed
– U: user access allowed
– W: allowed to write
– P: present
58!
115
Pentium Memory Lookup
• Incoming virtual address 0x1802038; bits 31–22 give the index into the page directory (0x6)
• CR3 supplies the page directory base address; the indexed directory entry has P = 0, so the page table itself is not in memory
• Page table page fault:
1. Save pointer to faulting instruction
2. Move linear address to CR2
3. Generate a PF exception – jump to handler
4. Fault handler reads the CR2 address
5. Upper 10 CR2 bits identify the needed page table
6. The page directory entry really holds a mass storage address
7. Allocate a new page – write back a victim if dirty
8. Read the page table from the storage device
9. Insert the new page table base address into the page directory entry
10. Return and restore the faulting instruction
11. Resume operation, reading the same page directory entry again – now P = 1
116
Pentium Memory Lookup
• Incoming virtual address 0x1802038; page directory index 0x6, page table index 0x2
• The page directory entry now has P = 1 and points to the page table, but the indexed page table entry has P = 0
• Page frame page fault:
1. Save pointer to faulting instruction
2. Move linear address to CR2
3. Generate a PF exception – jump to handler
4. Fault handler reads the CR2 address
5. Upper 10 CR2 bits identify the needed page table
6. Use the middle 10 CR2 bits to determine the entry in the page table – it holds a mass storage address
7. Allocate a new page – write back a victim if dirty
8. Read the page from the storage device
9. Insert the new page frame base address into the page table entry
10. Return and restore the faulting instruction
11. Resume operation, reading the same page directory entry and page table entry again – both now P = 1
59!
117
Pentium Memory Lookup
• Incoming virtual address 0x1802038; page directory index 0x6, page table index 0x2, page offset 0x38 (56)
• Both the page directory entry and the page table entry now have P = 1
• The page frame base address from the page table entry, combined with the page offset, addresses the requested data
118
Page Fault Causes
• Page directory entry’s P-bit = 0: the page group’s page table is not in memory
• Page table entry’s P-bit = 0: the requested page is not in memory
• Attempt to write to a read-only page
• Insufficient page-level privilege to access a page table or frame
• One of the reserved bits is set in the page directory or page table entry