EE206: Software Engineering IV

Section I: Real Time Systems. Processes

1.7 Memory Management

(Textbook: A. S. Tanenbaum, “Modern Operating Systems”, ch. 3)

Memory Management Introduction

• Memory Overview

• Memory operations allow a process to store and later retrieve information.

[Figure: a process stores information to memory and retrieves it later]

• Memory Manager

• Memory Manager = Part of an operating system (OS) that manages memory

• Keeps track of which parts of memory are in use


• Allocates/reallocates/de-allocates memory to/from processes when needed

• Swaps between main memory and disk when there is not enough memory to hold all the processes

• Need for a Memory Manager

• Programs (and consequently processes) expand to fill all the memory available to them

• There may not be enough memory resources left for other programs (processes)

• There is a need for an authority to control memory management: the Memory Manager

• Main Memory Management Classes

• Due to lack of sufficient memory to hold all processes in memory at once, some processes are moved back and forth between main memory and disk during execution (more complex; uses swapping/paging).

• Alternatively, due to lack of memory, launching new processes is simply not permitted at some point (simpler).


Memory Management Schemes

Without Swapping or Paging

• Mono-programming - No Memory Management

• One process is in memory at a time
• The running application has complete control over all of the memory in the machine
• No separate OS present
• There is no protection involved, since the application process can write to any memory location
• Every process must contain device drivers for each I/O device it uses
• No longer commonly used

[Figure: memory map - the application occupies all of memory, from address 0 to N-1]


Mono-programming – Resident Monitor

• The usual technique in simple microcomputers
• The OS can be resident in RAM or ROM
• In memory there is only a Monitor (or OS) and the User program (or application)
• The Monitor is generally placed at the top or the bottom of the memory

[Figure: two memory layouts. General approach: User Program above, Monitor (OS) in RAM at the bottom, separated by a fence. E.g. a PC with MS-DOS: Device Drivers in ROM at the top, User Program below them]

• There is a program in ROM called the BIOS (Basic Input Output System) that allows basic I/O


Problem:

• How can the OS be protected from corruption?
• E.g. an application may write into the OS region of memory

Solution:

• Hardware can provide a fence (a memory location) against which all (user) memory accesses are compared

• If the address of a user's memory access is on the OS side of the fence, the hardware generates a trap, preventing access to that memory location

• This is similar to what happens when a “segmentation fault” (an access violation) occurs under a modern OS

• This can slow down execution speed, as each memory reference must be checked => a problem for OS operations that MUST be fast
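A minimal sketch of the comparison logic in C (illustrative names; in reality the check is done by hardware on every memory reference, and this layout assumes the OS sits below the fence):

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical fence value: first address belonging to user space;
       the OS occupies everything below it. */
    static const unsigned long fence = 0x4000;

    /* Model of the hardware check performed on each memory reference. */
    void check_access(unsigned long addr, int user_mode)
    {
        if (user_mode && addr < fence) {
            /* real hardware would raise a trap to the OS here */
            fprintf(stderr, "trap: user access to OS address 0x%lx\n", addr);
            exit(EXIT_FAILURE);
        }
        /* otherwise the access proceeds normally */
    }

    int main(void)
    {
        check_access(0x5000, 1);   /* user access above the fence: allowed */
        check_access(0x1000, 0);   /* supervisor mode: never checked       */
        check_access(0x1000, 1);   /* user access below the fence: trap    */
        return 0;
    }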


Consequence: Two modes of execution:

• User mode (application running): a fence comparison is made with every memory access

• Supervisor mode (OS running): no comparison is done, as the operating system must have access to the entire memory space

Further Problem:

• The fence had to be built into the hardware in order to achieve speed efficiency
• The fence value was originally constant
• If the OS did not fit:
  - some memory was wasted, or
  - part of the OS was unprotected

Solution:

• The fence value is placed in a register and can be updated to accommodate the size of the OS


• Multiprogramming:

• Large memory can improve CPU utilisation!

Approximate CPU utilisation = 1 - p^n

p = fraction of time a process spends in the I/O wait state
n = number of processes in memory (p^n is the probability that all n processes are waiting for I/O at once, i.e. the CPU is idle)

E.g. Let's assume that:

• there is an 80% average I/O wait,
• there is 1M of memory,
• the OS occupies 200K, and
• each user program occupies 200K.

This allows 4 user programs in memory at once:

(1M - 200K) / 200K = 4

The CPU utilisation (ignoring OS overhead) is:

1 - 0.8^4 = 1 - 0.41 = 0.59; approx. 60%

Adding another 1M of memory allows 9 user programs:

1 - 0.8^9 = 1 - 0.13 = 0.87; approx. 87% CPU utilisation

Adding a third 1M of memory allows 14 user programs:

1 - 0.8^14 = 1 - 0.04 = 0.96; approx. 96% CPU utilisation
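The figures above follow directly from the formula; a few lines of C reproduce them (values taken from the example, OS overhead ignored as above):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double p = 0.8;               /* average fraction of time in I/O wait */
        int n[] = { 4, 9, 14 };       /* jobs in memory with 1M, 2M, 3M       */

        for (int i = 0; i < 3; i++)
            printf("n = %2d: CPU utilisation = %.2f\n",
                   n[i], 1.0 - pow(p, n[i]));
        return 0;
    }
    /* Output: 0.59, 0.87, 0.96 - the 60%, 87% and 96% quoted above. */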


• Multiprogramming - Fixed Number of Tasks (MFT)

Idea:

• Main memory is divided into a set of n fixed partitions

[Figure: memory divided into the OS (at the bottom) and fixed Partitions 1, 2 and 3 above it]

• Each process can use one of these partitions only

• A process cannot cross partition boundaries

• A partition cannot be shared between processes

• When a process/job is done, another (appropriately sized) process/job will take its place


• There could be:

• separate input queues for each partition (each partition can be accessed separately), or
• a single input queue

Using separate queues

Principle:

• Large jobs queue for large partitions, whereas small jobs queue for small partitions

Problem:

• Queues for large partitions may be empty, while queues for smaller partitions may be full

Using a single queue

Principle:

• As a partition becomes free, the job closest to the front of the queue could be loaded into the empty partition and run

Problem:

• This could waste large partitions on small jobs


Principle:

• A different strategy would be to search the queue for the largest job that fits the empty partition

Problem:

• This discriminates against small jobs
• Usually small jobs should be given the best service, not the worst

Solutions:

• Have one small partition dedicated to running small jobs, or
• Allow a job to be skipped at most k times

Memory Fragmentation

Internal Fragmentation:

• Memory space within a partition is wasted because smaller jobs often run in larger partitions

E.g. Assume that a 200K job is running in a 500K partition => 300K of memory space is wasted


External fragmentation:

• Enough total space in empty partitions exists to run a process, but this space cannot be used, as no single partition can hold the entire process

• Partitions are wasted due to the way they are divided up

• The fixed partitions in MFT systems are generally set up in advance (e.g. every day) and not changed thereafter

Key Issues:

Job-Partition Allocation

• Very difficult
• A good selection for one job mix might be a bad selection for another job mix

Partition Size Selection

• Very difficult
• Involves choosing the partition sizes to match the greatest number of jobs
• Both internal and external fragmentation should be avoided


Swapping

Idea:

• Employed when there is not enough physical memory available to hold the data of all processes simultaneously

• The memory content of one process is written to a backing store (disk), and the memory content of another process is read in

• Swapping time is long (it consists of one large write and one large read, the speed of which depends mainly on the disk's performance)

Background Swapping:

• Refers to disk transfers performed during process execution
• The system does not have to wait for the slow disk-memory copy, but the disk must be a DMA device
• This makes for a much faster switch and better overall execution speed

• Swapping could be based on fixed partitions, but too much memory is wasted by programs that are smaller than their partitions

• A different approach is needed


• Multiprogramming with Variable Partitions

• The location, size and number of processes in memory may vary dynamically in time

• Memory allocation changes as processes come into memory and leave it

[Figure: seven memory snapshots over time - the OS stays at the bottom; processes A, B and C are loaded in turn; A exits and D is loaded in its place; B exits; E is loaded into the hole B left]

• Originally all the memory is free and is kept in one block

• The OS keeps track of which memory is in use, and which is not in use


• When a job arrives, the OS searches for a free memory block to put the job into

• If a big enough block is found, it is allocated to the job. The remainder of that block (if any) is marked as a new, smaller, free block

• When a job ends, its memory is returned to the system. If the returned memory is next to a free block, the two are joined into one contiguous block

Fragmentation

Internal Fragmentation:
• Hardly any internal fragmentation

External fragmentation:
• A significant issue

Performance Issues
• Generally, free blocks are kept in some sort of list. When memory is allocated for a job, the list must be searched for a memory chunk that is big enough
• This improves memory utilisation, but complicates allocation and de-allocation of memory


General Dynamic Storage Allocation Problem (GDSAP)

Goals:

• Efficiency of memory allocation/de-allocation, and
• Minimisation of fragmentation.

A. Memory Compaction

Idea:

• A technique that combines all empty regions (“holes”) into one large one by moving all processes as far as possible towards one end of memory

Notes:

• Not usually done, as it requires a lot of CPU time. However, CPU speeds are increasing much faster than RAM speeds
• This eliminates external fragmentation
• Swapping and compaction can be used together: when the jobs are swapped back in, they can be placed together to leave the largest hole possible
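A sketch of the mechanics, assuming a simple array of segment descriptors sorted by address (illustrative types; a real OS would also have to fix up each process's base register or mapping, and here processes are slid towards address 0 - sliding them towards the other end is symmetric):

    #include <string.h>

    struct segment {
        unsigned long start, len;   /* position and size within memory */
        int in_use;                 /* 1 = process, 0 = hole           */
    };

    /* Slide every in-use segment towards address 0, leaving one large
       hole at the top. `seg` must be sorted by start address.
       Returns the start of the single remaining hole. */
    unsigned long compact(unsigned char *mem, struct segment *seg, int n)
    {
        unsigned long next = 0;               /* next free target address */
        for (int i = 0; i < n; i++) {
            if (!seg[i].in_use)
                continue;
            if (seg[i].start != next)
                memmove(mem + next, mem + seg[i].start, seg[i].len);
            seg[i].start = next;
            next += seg[i].len;
        }
        return next;
    }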


Observation 1:

• In some programming languages, memory allocated to processes can grow (e.g. by dynamically allocating memory - in C, using “malloc()”)

• If there are “holes” on either side of the memory allocated to a process, it can “grow” into them

• If there are no adjacent “holes”:
  - the growing process can be moved to a hole large enough, or
  - one or more processes can be swapped out to make a large enough hole for this process

• Extra memory can be allocated during swapping and/or moving if the process is expected to grow

Observation 2:

• Sometimes requests are made for amounts of memory just slightly smaller than an available free block (e.g. a 19995-byte request when there is a 20000-byte block free)

• The request can be satisfied exactly, but the overhead of keeping track of the tiny left-over space costs more than allocating the whole 20000-byte block would. Allocating the whole block in such cases produces only very small internal fragmentation


B. How the OS Keeps Track of Memory Usage:

1. Memory Management with Bit Maps

• Memory is divided into allocation units (from words up to KBs) - the unit size is a key design issue

• Corresponding to each allocation unit is a bit in a bit map, with value 0 if the unit is free and 1 if it is occupied

• E.g.:

  - Memory with 4 processes and two holes: A (units 0-4), a hole (units 5-7), B (units 8-13), C (units 14-17), a hole (units 18-19), D (units 20-23)

  - Corresponding bit map: 11111000 11111111 11001111

• Notes:

  - Searching the bit map for a run of a given length is a slow operation (see the sketch below)
  - Not generally used
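The slow search is easy to see in code; a sketch in C, assuming one bit per allocation unit with the least-significant bit of each byte holding the lowest-numbered unit (an assumed convention):

    #include <stddef.h>

    /* Scan a bit map for a run of k consecutive free allocation units
       (bit = 1 means occupied, 0 means free). Returns the index of the
       first unit of the run, or -1 if no hole is big enough. */
    int find_run(const unsigned char *map, size_t nbits, size_t k)
    {
        size_t run = 0;                        /* current run of free units */
        for (size_t i = 0; i < nbits; i++) {
            int used = (map[i / 8] >> (i % 8)) & 1;
            run = used ? 0 : run + 1;
            if (run == k)
                return (int)(i - k + 1);
        }
        return -1;
    }

Every allocation has to walk the map bit by bit, which is why this scheme is rarely used in practice.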


2. Memory Management with Linked Lists

• Maintain a linked list of allocated/free memory segments

• E.g.:

  - The same memory with 4 processes (A, B, C, D) and two holes

  - Corresponding linked list (sorted by address), each entry holding (type, start unit, length):

    p 0 5 ⇒ h 5 3 ⇒ p 8 6 ⇒ p 14 4 ⇒ h 18 2 ⇒ p 20 4 ⇒ …

  - Legend: h - hole, p - process
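One possible C representation of such a list (names are illustrative):

    struct seg {
        char kind;          /* 'p' = process, 'h' = hole            */
        unsigned start;     /* first allocation unit of the segment */
        unsigned len;       /* length in allocation units           */
        struct seg *next;   /* next segment, kept sorted by address */
    };

The list above would then be six nodes: {'p',0,5}, {'h',5,3}, {'p',8,6}, {'p',14,4}, {'h',18,2}, {'p',20,4}.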

C. Allocating Memory to a Process:

• Several algorithms can be used to allocate memory
• All assume the Memory Manager knows how much memory to allocate


First-Fit:

• Scan the list until a hole big enough is found

• The hole is broken into two pieces (one for the process, one for unused memory), except in the very unlikely case of an exact fit

• It is a fast algorithm, as it searches as little as possible

Next-Fit:

• A minor variation of the above algorithm
• Same as First-Fit, except it remembers where it found the last suitable hole
• The next time, it starts searching from where it left off
• Achieves slightly worse performance than First-Fit

Best-Fit:

• Searches the whole list and takes the smallest hole that is big enough

• This method tries to reduce fragmentation by minimising left-over space

• Slower than First-Fit, as it searches the entire list every time

• Results in more wasted memory than First-Fit, as it tends to fill up memory with tiny, useless holes
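Using the struct seg list sketched earlier, First-Fit and Best-Fit come out as a few lines each (illustrative; splitting the chosen hole into an allocated piece and a remainder is omitted):

    #include <stddef.h>

    struct seg {                    /* as sketched earlier */
        char kind;                  /* 'p' = process, 'h' = hole */
        unsigned start, len;
        struct seg *next;
    };

    /* First-Fit: take the first hole that is big enough. */
    struct seg *first_fit(struct seg *list, unsigned want)
    {
        for (struct seg *s = list; s != NULL; s = s->next)
            if (s->kind == 'h' && s->len >= want)
                return s;
        return NULL;                /* no hole big enough */
    }

    /* Best-Fit: scan the whole list for the smallest adequate hole. */
    struct seg *best_fit(struct seg *list, unsigned want)
    {
        struct seg *best = NULL;
        for (struct seg *s = list; s != NULL; s = s->next)
            if (s->kind == 'h' && s->len >= want &&
                (best == NULL || s->len < best->len))
                best = s;
        return best;
    }

The two loops make the trade-off visible: first_fit can stop early, while best_fit must always walk the entire list.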


Worst-Fit:

• Takes the largest available hole

• This method tries to reduce fragmentation by leaving the most usable left-over piece

• Not as good as First-Fit, Next-Fit or Best-Fit

Quick-Fit:

• Keeps separate lists for some of the more common sizes requested
• Finding an appropriately sized hole is extremely fast

Disadvantage (of all these schemes, mostly due to the usual sorting by hole size):

• When a process terminates or is swapped out, finding its neighbours to see if a merge is possible is expensive in terms of CPU time


3. Memory Management with the Buddy System

• Takes advantage of the fact that computers use binary addressing, to speed up the merging of adjacent holes

• The Memory Manager keeps a list of free blocks of sizes 1, 2, 4, 8, … bytes, up to the size of memory

• i.e. 1M of memory requires 21 lists

• Initially all memory is free, and the 1M list contains a single entry describing a single 1M hole

• Holes are listed only for sizes that are powers of 2

• E.g. (sizes in K; letters are allocated blocks, plain numbers are holes):

  Action       Memory map (1M total)                    Holes
  Initially    | 1024                                 |   1
  Request 70   | A(128) | 128 | 256 | 512             |   3
  Request 35   | A | B(64) | 64 | 256 | 512           |   3
  Request 80   | A | B | 64 | C(128) | 128 | 512      |   3
  Return A     | 128 | B | 64 | C | 128 | 512         |   4
  Request 60   | 128 | B | D(64) | C | 128 | 512      |   4
  Return B     | 128 | 64 | D | C | 128 | 512         |   4
  Return D     | 256 | C | 128 | 512                  |   3
  Return C     | 1024                                 |   1


Details:

• A 70K process is swapped into the empty 1M memory: 128K is requested, the smallest power-of-2 block that fits

• No 128K block is available, so the 1M block is split into two 512K blocks (called buddies)

• Splitting is repeated until a 128K block is obtained

• Next, 35K is swapped in; the request is rounded up to a power of 2 (64K)

• There is no 64K block - so a 128K block is split

• If we free the 128K block (i.e. process A), it goes on the free list for 128K blocks

• A check is made to see if merging with its buddy is possible

Advantages:

• Free blocks are kept in per-size lists, and a block always sits at an address that is a multiple of its own size, so locating a block's buddy is trivial

• Very fast

Disadvantages:

• Inefficient in the use of memory (i.e. internal fragmentation)
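The arithmetic behind the table above is cheap precisely because every size is a power of two; a sketch of the two key computations (block addresses and sizes in bytes, names illustrative):

    /* Round a request up to the smallest power-of-2 block that holds it. */
    unsigned long round_to_block(unsigned long request)
    {
        unsigned long size = 1;
        while (size < request)
            size <<= 1;
        return size;
    }

    /* A block of size `size` at address `addr` (always a multiple of
       `size`) has its buddy at the address with one bit flipped. */
    unsigned long buddy_of(unsigned long addr, unsigned long size)
    {
        return addr ^ size;
    }

E.g. the 70K request rounds up to 128K, and when the 128K block at address 0 is returned, its buddy is found instantly at 0 ^ 128K = 128K; the two merge if that block is also free.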


Virtual Memory

Idea:

• Addresses a different problem: programs too big to fit into memory

The Old Way:

• Split programs into overlays; as an overlay finishes, it calls another one
• The overlays are kept on disk and are swapped in/out of memory by the OS
• The program has to be split by the programmer

The New Way:

• Virtual memory turned the whole job over to the computer
• The OS keeps the parts of any program currently in use in main memory, and the rest on disk
• Works very well with multiprogramming
• Most virtual memory systems use a technique called paging


Paging

Idea:

• Allows any available section of memory to be used, even if it is not large enough to hold an entire object

• It makes use of program-generated addresses, called virtual addresses

• A virtual address is not placed directly on the memory bus; it is first mapped to a physical address by the Memory Management Unit (MMU)


Principles:

• The virtual address space is divided into units called pages

• The corresponding units in physical memory are called page frames, or frames

• Mapping pages onto frames requires hardware support

• The size of the pages is determined by the hardware; pages and frames always have the same size

• 512B to 8K are common sizes. All of a program's pages are the same size, but its last page is generally only partially used

• Transfers between memory and disk are always in units of a page

• E.g.: for 4K pages, a 64K virtual address space and 32K of physical memory, we have 16 pages and 8 frames


• Each virtual address consists of two parts:

  - a page number (pageno):
    pageno = (offset from top of module) / pagesize

  - a displacement within the page (offset):
    offset = (offset from top of module) % pagesize

  (/ = integer division operator, % = modulo operator)

• Each process has an associated data structure called a page table. Entries in this page table contain the base (starting) address of each page in physical memory. This table is in fact a map showing where each part of the process is loaded in physical memory

• The pageno part of the virtual address is used as an index into the process's page table

• In order to get a physical address from a virtual address:

  - the page number is used to index the page table; this yields the base address of the frame (the address at which the page is loaded in memory)

  - the displacement is added to the frame base address to derive the physical address
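The whole translation, as a sketch in C (a bare mapping with 4K pages; a real MMU would also check a present/absent bit, protection bits, etc.):

    #include <stdint.h>

    #define PAGE_SIZE 4096u            /* 4K pages, as in the example */

    /* page_table[pageno] holds the base physical address of the frame
       that page is loaded into (validity checks omitted). */
    uint32_t translate(uint32_t vaddr, const uint32_t *page_table)
    {
        uint32_t pageno = vaddr / PAGE_SIZE;   /* index into the page table */
        uint32_t offset = vaddr % PAGE_SIZE;   /* displacement within page  */
        return page_table[pageno] + offset;    /* frame base + displacement */
    }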


• The page (and frame) size is generally a power of two. In this way, the div and modulo operators are implemented by simple bit operations

• This allows the page number and the offset within the page to be computed very efficiently in hardware
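For 4K pages, for instance, the division and modulo in the sketch above reduce to a shift and a mask:

    #include <stdint.h>

    /* 4K = 2^12, so: */
    static inline uint32_t page_no(uint32_t vaddr)  { return vaddr >> 12; }    /* vaddr / 4096 */
    static inline uint32_t page_off(uint32_t vaddr) { return vaddr & 0xFFFu; } /* vaddr % 4096 */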

• When a job is to be loaded, the number of frames required for the job is computed

• If enough frames are available, the job is loaded into the free frames, and the page table of this process is updated accordingly

Notes:

• There is no external fragmentation, as any free frame can be allocated

• There is some internal fragmentation

• E.g. assume the frame size (i.e. page size) = 4K. If the job is 17K, 5 frames are needed, one of them being only partially full. In this way, 3K is wasted in the frame that holds the last page of the object
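The frame count in the example is just a rounding-up division:

    unsigned frames_needed(unsigned job_size, unsigned page_size)
    {
        return (job_size + page_size - 1) / page_size;   /* ceiling division */
    }
    /* frames_needed(17 * 1024, 4 * 1024) == 5; the fifth frame holds only
       1K of the job, so 3K of it is wasted (internal fragmentation). */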
