
Linux Memory Management

Changwoo Min

[email protected]

Today: Memory Management

• Pages and zones

• Page allocation

• kmalloc, vmalloc

• Slab allocator

• Stack, high memory, per-CPU data structures

Background of Memory Management

• Virtual memory vs. Physical memory

• Address space

• Details will be covered in the following classes:

• The Process Address Space

• The Page Cache and Page Fault

What is virtual memory?

What is physical memory?

Physical Memory

• Actual RAM of the system

• Volatile memory

• Byte-addressable medium

Virtual Memory

• Virtual memory is a memory management technique

• It gives users the illusion of a very large memory

• OS (Linux) maps memory addresses used by a program, called virtual addresses, into physical addresses

Virtual Address with a simple user program

• Memory addresses used by a program

#include <stdio.h>

char buf[100];                               /* global variable -> bss */

int main(int argc, char *argv[])
{
    int n = 0;                               /* local variable -> stack */

    while (n < 10)
        n = n + 1;

    printf("[virt_addr] main():%p buf:%p n:%p\n",
           (void *)&main, (void *)buf, (void *)&n);
    return 0;
}

[Figure: x86 (32-bit) virtual address space, 0 to 4GB. User space (0–3GB) holds the text segment (read-only instruction code), data/bss (global variables such as buf), heap, and stack (local variables such as argc, argv, n); kernel space occupies 3GB–4GB]

Virtual memory (VM) is a layer of indirection (map)

[Figure: a 4GB program address space mapped onto 1GB of physical RAM and disk]

• Without virtual memory: program address = RAM address (no indirection); we crash if we try to access more RAM than we have

• With virtual memory: a program address maps to a RAM address; the mapping gives us flexibility in how we use the RAM

Challenges: #1 not enough physical memory

• Map some of the program’s address space to the disk

• When we need it, we bring it into memory

[Figure: a 32-bit program address space (4GB) mapped onto 1GB of physical RAM plus disk; VM moves the oldest data to disk to make room]

• Mapping lets us use our disk to give the illusion of unlimited memory

Challenges: #2 holes in the address space

• How do we use the holes left when programs quit?

• We can map a program’s addresses to RAM addresses however we like

[Figure: two 2GB programs mapped into 4GB of physical RAM through separate maps]

• Each program has its own mapping

• Mapping lets us put program data wherever we want in the RAM

Challenges: #3 keeping programs secure

• Program 1’s and Program 2’s addresses map to different RAM addresses

• Because each program has its own address space, they cannot access each other’s data

[Figure: Program 1 and Program 2 mapped into 4GB of physical RAM through separate maps]

• Program 1 stores a bank balance at address 4096; VM maps it to RAM address 1

• Program 2 stores a video game score at address 4096; VM maps it to RAM address 4

• Neither can touch the other’s data

Page frame

• Physical memory is divided into page frames

• 4KB-sized page frames

[Figure: physical memory divided into 4KB page frames]

• Each page frame is represented by struct page {…} to keep track of its status

Page frame

• Each page frame is represented by struct page

• Page size is machine-dependent

• 4KB in general

• Defined in include/linux/mm_types.h

struct page {
    unsigned long flags;            /* page status (permission, dirty, etc.) */
    unsigned long counters;         /* usage count */
    struct address_space *mapping;  /* address space mapping */
    pgoff_t index;                  /* offset within mapping */
    struct list_head lru;           /* LRU list */
    void *virtual;                  /* virtual address */
};
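
To make the page frame / struct page relationship concrete, here is a minimal sketch (not from the original slides) that allocates one page frame with alloc_page() (introduced later in these slides) and inspects it through the standard helpers page_to_pfn(), page_address(), and __free_page(); the function name is made up for illustration.

#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/printk.h>

/* Minimal sketch (kernel context assumed): allocate one page frame and
 * look at the struct page that describes it. */
static void page_frame_demo(void)
{
        struct page *page = alloc_page(GFP_KERNEL);   /* one 4KB page frame */

        if (!page)
                return;

        pr_info("pfn=%lu kernel vaddr=%p\n",
                page_to_pfn(page),       /* physical frame number */
                page_address(page));     /* kernel virtual address (NULL for unmapped highmem) */

        __free_page(page);               /* return the frame to the allocator */
}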

Zones

• Physical memory is divided into a number of blocks called zones

• ZONE_DMA: lower physical memory range for old (ISA) DMA devices (up to 16MB)

• ZONE_DMA32: physical memory range for DMA devices supporting only 32-bit physical addresses (up to 4GB)

• ZONE_NORMAL: directly mapped into the upper region of the virtual address space

• ZONE_HIGHMEM: not mapped directly by the kernel; page frames should be mapped prior to access

Zones

[Figure: zone layout]

• x86 (32-bit): ZONE_DMA (0–16MB), ZONE_NORMAL (16MB–896MB), ZONE_HIGHMEM (above 896MB)

• x86_64 (64-bit): ZONE_DMA (0–16MB), ZONE_DMA32 (16MB–4GB), ZONE_NORMAL (above 4GB)

Why split into Zones

• Certain contexts require certain physical pages due to hardware limitations

• Industry Standard Architecture (ISA) cards provide only 24-bit addressing

• DMA can therefore reach only the lowest 16MB (2^24) of physical memory because of the ISA bus limitation

• Physical memory size exceeds the kernel’s virtual address space

[Figure: a device limited to 24-bit addresses (16MB), the 3GB/1GB user/kernel virtual address split, and RAM exceeding 4GB]

Memory Layout (x86_32)

• Relationship between virtual and physical memory

• The kernel’s 1GB of virtual address space (3GB–4GB) direct-maps (1:1, always) physical memory up to 896MB, i.e. ZONE_DMA and ZONE_NORMAL

• ZONE_HIGHMEM (above 896MB) is mapped dynamically, on demand

[Figure: within the kernel’s 1GB region sit the kernel image, the mem_map array of struct page {…}, the rest of the direct-mapped memory used for allocations (the kmalloc area), the vmalloc area, etc.; user space occupies the lower 3GB]

Memory Fragmentation

• External: various free-space holes => Buddy System

• Internal: wasted space within each allocated page due to allocation granularity => Slab Allocator

Hierarchy of memory allocators

Buddy System

• Default memory allocator for the Linux kernel

• Reduces external fragmentation

• Tries to keep page frames physically contiguous as much as possible

• Runs on each zone

• Granularity of page allocation: allocations are done in power-of-2 numbers of page frames (the order)


Buddy System

• Basic concepts

• Try to gather physically consecutive pages into groups

• Allocate contiguous ranges of pages

• Example: 16 page frames (2^4)

1) Initial status: one free block of 16 page frames

2) Request 8 pages: the block is split into two buddies of 8 page frames; one is handed out

3) Request 2 pages: a free block of 8 page frames is split down (8 -> 4 -> 2); a 2-page block is handed out, leaving free buddies of 4 and 2 page frames

Buddy System

• With data structure

Status of Buddy System

* current status of the buddy system

$ cat /proc/buddyinfo

Low-level memory allocator (Buddy system)

• Low-level mechanisms to allocate memory at the page granularity

• interfaces in include/linux/gfp.h

• APIs for allocating pages:

• alloc_pages(gfp_t gfp_mask, unsigned int order);

• alloc_page(gfp_t gfp_mask);

• __get_free_pages(gfp_t gfp_mask, unsigned int order);

• __get_free_page(gfp_t gfp_mask);

• get_zeroed_page(gfp_t gfp_mask);

Page allocation / deallocation
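
The example code on this slide did not survive extraction. As a substitute, here is a minimal sketch of allocating and freeing pages with the buddy-system APIs listed above; the function name and the order value are chosen just for illustration.

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/errno.h>

/* Minimal sketch: allocate 2^2 = 4 physically contiguous page frames,
 * then return them to the buddy system. */
static int page_alloc_demo(void)
{
        unsigned int order = 2;
        struct page *pages;
        unsigned long addr;

        /* alloc_pages() returns a struct page * for the first frame */
        pages = alloc_pages(GFP_KERNEL, order);
        if (!pages)
                return -ENOMEM;
        __free_pages(pages, order);

        /* __get_free_pages() returns a kernel virtual address instead */
        addr = __get_free_pages(GFP_KERNEL, order);
        if (!addr)
                return -ENOMEM;
        free_pages(addr, order);

        return 0;
}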

Zeroed page allocation

• By default, the page data is not cleared

• May leak information through the page allocation

• To prevent information leakage, allocate a zeroed-out page for user-space requests

unsigned long get_zeroed_page(gfp_t gfp_mask);
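
A minimal usage sketch (the wrapper function is hypothetical):

#include <linux/gfp.h>

/* The page returned by get_zeroed_page() is already filled with zeros,
 * so no stale kernel data can leak to the caller. */
static unsigned long grab_clean_page(void)
{
        unsigned long addr = get_zeroed_page(GFP_KERNEL);

        /* ... hand the page to user space ... */
        return addr;    /* free later with free_page(addr) */
}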

Relationships among APIs

• Eventually, all allocation functions come down to alloc_pages( ) and all deallocation functions to __free_pages( )

[Figure: call relationships among the allocation and deallocation functions]

gfp_t: get free page flags

• Specify options for memory allocation

• Action modifiers: how the memory should be allocated

• Zone modifiers: from which zone the memory should be allocated

• Type flags: combinations of action and zone modifiers

• Generally preferred over the direct use of action/zone modifiers

• Defined in include/linux/gfp.h

gfp_t: action modifiers

gfp_t: zone modifiers

• If not specified, memory is allocated from ZONE_NORMAL or ZONE_DMA (ZONE_NORMAL is strongly preferred)

gfp_t: type flags (preferred)

gfp_t: Cheat sheets for type flags

• Which flags to use when
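
The cheat-sheet table itself was lost in extraction; the comments below restate the standard guidance for the common type flags, plus one illustrative call (the helper function is made up).

#include <linux/gfp.h>
#include <linux/mm.h>

/*
 * Which flag to use when (standard kernel guidance):
 *   GFP_KERNEL   - process context, may sleep; the usual default
 *   GFP_ATOMIC   - interrupt/atomic context, must not sleep
 *   GFP_NOIO     - may block, but must not start disk I/O
 *   GFP_NOFS     - may block and do I/O, but must not call into the filesystem
 *   GFP_USER     - allocation on behalf of user space
 *   GFP_HIGHUSER - user-space allocation that may come from ZONE_HIGHMEM
 *   GFP_DMA      - must come from ZONE_DMA (e.g. for old ISA devices)
 */
static struct page *alloc_from_irq_context(void)
{
        /* In an interrupt handler we cannot sleep, so GFP_ATOMIC is used */
        return alloc_page(GFP_ATOMIC);
}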

Low-level memory allocation example

kmalloc( ) / kfree( ) : byte-sized

vmalloc( ) / vfree( ) : byte-sized

kmalloc( ) vs. vmalloc( )

kmalloc( ) : example
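
The example code from these slides was lost in extraction. The sketch below shows typical byte-sized allocation with kmalloc()/kfree() and, for contrast, vmalloc()/vfree(); the structure and sizes are invented for illustration. kmalloc() memory is physically contiguous, while vmalloc() memory is only virtually contiguous.

#include <linux/slab.h>      /* kmalloc(), kfree() */
#include <linux/vmalloc.h>   /* vmalloc(), vfree() */
#include <linux/errno.h>

struct my_record {           /* hypothetical structure */
        int id;
        char name[32];
};

static int byte_alloc_demo(void)
{
        struct my_record *rec;
        char *big_buf;

        /* Small, physically contiguous allocation */
        rec = kmalloc(sizeof(*rec), GFP_KERNEL);
        if (!rec)
                return -ENOMEM;

        /* Large, virtually contiguous allocation; pages may be scattered */
        big_buf = vmalloc(1024 * 1024);
        if (!big_buf) {
                kfree(rec);
                return -ENOMEM;
        }

        vfree(big_buf);
        kfree(rec);
        return 0;
}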

Slab allocator

• Basic idea: caching commonly used objects (such as task_struct, inode, etc.) rather than repeatedly allocating and freeing memory

• Reducing internal fragmentation by caching objects smaller than the page size

• It’s wasteful to allocate a whole page to store only a few bytes

Slab allocator

• A cache has one or more slabs

• A slab is one or several physically contiguous pages

• Slabs contain objects

• A slab may be empty, partially full, or full

• Objects are allocated from partially full slabs to prevent memory fragmentation

[Figure: slab allocator components for a certain type of object (struct my_struct)]

Slab allocator coloring

• Prevents objects in different slabs from mapping to (and evicting each other from) the same CPU cache lines by adjusting the starting offset of the objects in each slab

[Figure: components of the slab allocator and an example of slab coloring with starting offsets of cache line size * 1, cache line size * 2, …]

Status of Slab allocator

* current status of slab allocator

$ cat /proc/slabinfo

APIs of Slab allocator

Slab allocator example
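
The API listing and example code from these slides were lost in extraction. The sketch below shows the usual kmem_cache_* calls; the cache name, object type, and flags are chosen for illustration.

#include <linux/slab.h>
#include <linux/list.h>
#include <linux/errno.h>

struct my_struct {                    /* hypothetical object type */
        int id;
        struct list_head list;
};

static struct kmem_cache *my_cache;

static int slab_demo_init(void)
{
        struct my_struct *obj;

        /* Create a cache dedicated to struct my_struct objects */
        my_cache = kmem_cache_create("my_struct_cache",
                                     sizeof(struct my_struct),
                                     0,                  /* alignment */
                                     SLAB_HWCACHE_ALIGN,
                                     NULL);              /* no constructor */
        if (!my_cache)
                return -ENOMEM;

        /* Allocate and free one object from the cache */
        obj = kmem_cache_alloc(my_cache, GFP_KERNEL);
        if (obj)
                kmem_cache_free(my_cache, obj);

        kmem_cache_destroy(my_cache);
        return 0;
}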

Slab allocator variants

• SLOB (Simple List Of Blocks)

• Used in early Linux versions (from 1991)

• Low memory footprint, suitable for embedded systems

• SLAB

• Integrated in 1999

• Cache-friendly

• SLUB

• Integrated in 2008

• Improved scalability over SLAB on many cores

Stack

• Each process has:

• A user-space stack for execution

• A kernel stack for in-kernel execution

• The user-space stack is large and grows dynamically

• The kernel stack is small and has a fixed size: two pages (8KB)

• The interrupt stack is for interrupt handlers: one page per CPU

• Keep kernel stack usage to a minimum (local variables and function parameters live there)

High memory

• On x86_32, physical memory above 896 MB is not permanently mapped within the kernel address space

• Due to the limited size of the address space and the 1GB/3GB kernel/user-space split

• Before use, pages from highmem should be mapped to the address space
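
As a brief illustration (not from the original slides), a highmem page can be mapped temporarily with kmap()/kunmap() before the kernel touches it; the function below is hypothetical.

#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/string.h>

/* Minimal sketch: a page allocated from ZONE_HIGHMEM has no permanent
 * kernel mapping, so it must be mapped before the kernel uses it. */
static void highmem_demo(void)
{
        struct page *page = alloc_page(GFP_HIGHUSER);   /* may come from highmem */
        void *vaddr;

        if (!page)
                return;

        vaddr = kmap(page);           /* create a kernel mapping */
        memset(vaddr, 0, PAGE_SIZE);  /* now the kernel can access the page */
        kunmap(page);                 /* release the mapping */

        __free_page(page);
}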

Per-CPU data structure

• Allow each core to have its own copy of a value

• No locking required

• Reduces cache thrashing

• Implemented through arrays in which each index corresponds to a CPU

Per-CPU data structure
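
The example code from these slides was lost in extraction. Below is a minimal sketch using the standard per-CPU interfaces; the counter and function names are made up.

#include <linux/percpu.h>
#include <linux/cpumask.h>

/* Hypothetical per-CPU counter: each CPU gets its own instance,
 * so no lock is needed to update it. */
static DEFINE_PER_CPU(unsigned long, my_event_count);

static void count_event(void)
{
        /* get_cpu_var() disables preemption so we stay on the same CPU
         * while touching this CPU's copy; put_cpu_var() re-enables it. */
        unsigned long *cnt = &get_cpu_var(my_event_count);

        (*cnt)++;
        put_cpu_var(my_event_count);
}

static unsigned long total_events(void)
{
        unsigned long sum = 0;
        int cpu;

        /* Walk every possible CPU's copy to compute a total */
        for_each_possible_cpu(cpu)
                sum += per_cpu(my_event_count, cpu);
        return sum;
}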

References

• Virtual Memory: https://www.youtube.com/watch?v=qlH4-oHnBb8&list=PLiwt1iVUib9s2Uo5BeYmwkDFUh70fJPxX&index=3

• Professional Linux Kernel Architecture, Wolfgang Mauerer (2.6), Wiley Publishing, Inc.

• Understanding the Linux Virtual Memory Manager, Mel Gorman, Prentice Hall (2.6)

• LKP class slides by Changwoo Min

• Linux kernel internal lecture slides from BIT ACADEMY by Sungjae Baek and Namhyung Kim