CS162: Operating Systems and Systems Programming
Lecture 10: Address Translation: Paging, Protection
7 July 2015
Charles Reiss
http://cs162.eecs.berkeley.edu/
2
Recall: Simple Base and Bounds
[Figure: the CPU issues a virtual address; it is compared against Bound (No: Error!) and Base is added to form the physical address sent to DRAM.]
Actually what the Cray-1 did
Add base to every address
Error if any address is greater than bound
3
Recall: Base and Bound Problems
External fragmentation:
― free space between allocations is not all together
― expensive moving
Internal fragmentation
― unused gaps within allocated chunks
― can't fix without relocating program
No sharing – waste space with extra copies
"Swapping" is very expensive
4
Recall: Segmentation
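Segment map in processor
― Segment extracted from program address
― Base added to generate physical address
― Bound checked → error if out of range
[Figure: the virtual address is split into Seg # and Offset. Seg # selects one of eight segment map entries: Base0/Bound0 V, Base1/Bound1 V, Base2/Bound2 V, Base3/Bound3 N, Base4/Bound4 V, Base5/Bound5 N, Base6/Bound6 N, Base7/Bound7 V. For the selected entry (Base2/Bound2, V): Offset > Bound raises an error, Check Valid raises an access error for an invalid entry, otherwise Base + Offset is the physical address.]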
5
Recall: Segmentation Problems
External fragmentation (still)
Internal fragmentation (but less bad)
Swapping requires moving whole segments
6
Recall: Paging overview
Page table: one per process (address space)
― Resides in physical memory
― Physical page #, permissions for each virtual page #
― Index into table is virtual page #
― Trigger fault (like bounds failure) if permissions wrong
[Figure: the virtual address is split into Virtual Page # and Offset. PageTablePtr points to the page table (page #0 V,R; page #1 V,R; page #2 V,R,W; page #3 V,R,W; page #4 N; page #5 V,R,W). Virtual Page # > PageTableSize raises an access error; a failed Check Perm on the selected entry (page #1 V,R) raises an access error; otherwise Physical Page # plus the unchanged Offset forms the physical address.]
7
Recall: Paging: Address mapping
Offset: copied from virtual addr to physical addr
― Example: 10-bit offset → 1024-byte pages
Virtual page # is all remaining bits
― 32-bit, 1024-byte pages: 22 bits → 4 M page table entries
Physical page # from table replaces virtual page # to form physical addr
― May be different length
[Figure: same page table lookup diagram as in "Recall: Paging overview".]
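The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (32-bit virtual addresses, 10-bit offset); the names `split` and `join` are made up for the example.

```python
# Sketch: splitting a virtual address into page number and offset.
# Assumes 32-bit virtual addresses and a 10-bit offset, as in the slide's example.
ADDRESS_BITS = 32
OFFSET_BITS = 10

PAGE_SIZE = 1 << OFFSET_BITS                            # 1024-byte pages
NUM_VIRTUAL_PAGES = 1 << (ADDRESS_BITS - OFFSET_BITS)   # 2**22 = 4 M entries


def split(vaddr):
    """Return (virtual page #, offset) for a virtual address."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)


def join(ppn, offset):
    """Form a physical address: the physical page # replaces the virtual page #."""
    return (ppn << OFFSET_BITS) | offset


vpn, offset = split(0x12345)
print(vpn, hex(offset))
```

The offset passes through untouched; only the upper bits are looked up and replaced.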
8
Recall: Paging and processes
Special registers to set on context switch
― pointer to page table
Page table per process
― Kept in memory, not loaded each time
[Figure: same page table lookup diagram as in "Recall: Paging overview".]
9
Trivial Page Table
Example (4 byte pages)

Virtual Memory      Page Table      Physical Memory
0x00  abcd          0 → 4           0x04  ijkl
0x04  efgh          1 → 3           0x0C  efgh
0x08  ijkl          2 → 1           0x10  abcd

Virtual page 0: 0000 0000 → 0001 0000
Virtual page 1: 0000 0100 → 0000 1100
Virtual page 2: 0000 1000 → 0000 0100

0x06? 0000 0110 → 0000 1110 = 0x0E!
0x09? 0000 1001 → 0000 0101 = 0x05!
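The two lookups in this example can be checked mechanically. A minimal Python sketch, using the page table entries 4, 3, 1 from the slide; the names `PAGE_TABLE` and `translate` are illustrative, not from the lecture.

```python
# Sketch of the trivial page table example (4-byte pages, so a 2-bit offset).
PAGE_SIZE = 4
PAGE_TABLE = [4, 3, 1]   # index = virtual page #, value = physical page #


def translate(vaddr):
    """Translate a virtual address, or raise if it lies beyond the page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn >= len(PAGE_TABLE):
        raise MemoryError(f"access error: no entry for virtual page {vpn}")
    return PAGE_TABLE[vpn] * PAGE_SIZE + offset


print(hex(translate(0x06)))   # virtual page 1, offset 2
print(hex(translate(0x09)))   # virtual page 2, offset 1
```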
10
Paging Example: Allocation
Addresses are 8 bits: page # (5 bits) | offset (3 bits)
[Figure: virtual memory view with regions code, data, heap, stack and address marks 0000 0000, 0100 0000, 1000 0000, 1100 0000, 1111 0000, 1111 1111; physical memory view with the same four regions and address marks 0000 0000, 0001 0000, 0101 0000, 0111 0000, 1110 0000, 1110 1111.]
Page Table (virtual page # → physical page #; every entry not listed is null)
11111 → 11101
11110 → 11100
10010 → 10000
10001 → 01111
10000 → 01110
01011 → 01101
01010 → 01100
01001 → 01011
01000 → 01010
00011 → 00101
00010 → 00100
00001 → 00011
00000 → 00010
11
Paging Example: Allocation
[Figure: same memory views and page table as the previous slide, with the virtual address mark 1111 0000 replaced by 1110 0000.]
What happens if stack grows to 1110 0000?
12
Paging Example: Allocation
[Figure: same memory views as the previous slide, with an additional stack region drawn in each view.]
Page Table: as on the previous slide, plus two new entries
11101 → 10111
11100 → 10110
No worrying about external fragmentation!
Allocate new pages where room!
13
Paging: Hardware support
Special register PageTablePtr
― Like base + bound register – kernel mode only
― Change on context switch
Page table lookup
― Two physical memory accesses per virtual access
― Later: making this faster
14
Problems with Paging So Far
How big is the page table?
E.g. MIPS, 4k pages, 2GB of virtual address space
― 512K entries
― megabytes
64-bit x86: ~256 TB of virtual address space
― 64 billion entries
― hundreds of gigabytes
Even though most address space is unused
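The entry counts above follow from dividing the virtual address space by the page size. A rough sketch of that calculation; the entry sizes (4 bytes for the 32-bit case, 8 bytes for the 64-bit case) are assumptions added here, not figures from the slide.

```python
# Sketch: size of a flat (single-level) page table.
KB, GB, TB = 2**10, 2**30, 2**40


def page_table_entries(virtual_bytes, page_bytes):
    """One entry per virtual page."""
    return virtual_bytes // page_bytes


mips_entries = page_table_entries(2 * GB, 4 * KB)       # 512K entries
x86_64_entries = page_table_entries(256 * TB, 4 * KB)   # 2**36, about 64 billion

# Assumed entry sizes: 4 bytes (32-bit), 8 bytes (64-bit).
mips_bytes = mips_entries * 4
x86_64_bytes = x86_64_entries * 8

print(mips_entries // KB, "K entries,", mips_bytes // 2**20, "MB")
print(x86_64_entries // GB, "G entries,", x86_64_bytes // GB, "GB")
```

Under these assumptions the flat table costs megabytes per process in the first case and hundreds of gigabytes in the second, which is what motivates the tree structures that follow.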
15
Two-level Page Table
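Tree of page tables
Each has fixed size
― x86: 1024 four-byte entries
Still one pointer to change on switch
[Figure: Virtual Address = Virtual P1 index (10 bits) | Virtual P2 index (10 bits) | Offset (12 bits). PageTablePtr points to the top-level table (4-byte entries); the P1 index selects a second-level table (4KB, 4-byte entries); the P2 index selects the Physical Page #, which is joined with Offset to form the Physical Address.]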
16
Two-level Page Table
Tree of page tables
Each has fixed size
― x86: 1024 four-byte entries
Still one pointer to change on switch
Big gaps don't need second-level page tables
[Figure: the two-level lookup diagram (10-bit Virtual P1 index, 10-bit Virtual P2 index, 12-bit Offset), with second-level tables absent for unused ranges.]
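A short sketch of the 10/10/12 split, to make the field boundaries concrete. The function name `split_two_level` and the sample address are arbitrary choices for illustration.

```python
# Sketch: 10/10/12-bit split of a 32-bit virtual address.
ENTRIES_PER_TABLE = 1 << 10   # 1024 entries
ENTRY_BYTES = 4
PAGE_BYTES = 1 << 12          # 4KB

# Each table is exactly one page: 1024 entries * 4 bytes.
TABLE_BYTES = ENTRIES_PER_TABLE * ENTRY_BYTES


def split_two_level(vaddr):
    """Return (P1 index, P2 index, offset)."""
    p1 = (vaddr >> 22) & 0x3FF
    p2 = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    return p1, p2, offset


print(split_two_level(0x08049123))
```

Because every table occupies a single page, tables can be allocated with the same page allocator as everything else.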
17
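Two-Level Paging Example
Addresses are 8 bits: page1 # (3 bits) | page2 # (2 bits) | offset (3 bits)
[Figure: virtual memory view (marks 0000 0000, 0100 0000, 1000 0000, 1100 0000, 1111 0000, 1111 1111) and physical memory view (marks 0000 0000, 0001 0000, 0101 0000, 0111 0000, 1110 0000) with regions code, data, heap, stack.]
Page Table (level 1): page1 # → level-2 table
111 → level-2 table
110 null
101 null
100 → level-2 table
011 null
010 → level-2 table
001 null
000 → level-2 table
Page Tables (level 2): page2 # → physical page #
Under 111: 11 → 11101, 10 → 11100, 01 → 10111, 00 → 10110
Under 100: 11 null, 10 → 10000, 01 → 01111, 00 → 01110
Under 010: 11 → 01101, 10 → 01100, 01 → 01011, 00 → 01010
Under 000: 11 → 00101, 10 → 00100, 01 → 00011, 00 → 00010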
18
Two-Level Paging Example
[Figure: same memory views and page tables as the previous slide, tracing virtual address 1001 0000 (0x90) to physical address 1000 0000 (0x80).]
19
Two-Level Paging Example
[Figure: same memory views and page tables, again tracing virtual address 1001 0000 (0x90) to physical address 1000 0000 (0x80).]
In best case, total size of page tables ≈ number of pages used by program virtual memory. Requires two additional memory accesses!
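The walk for this example can be written out as a small Python sketch. The table contents are taken from the slides; pairing each level-2 table with its level-1 entry is inferred from the single-level table shown earlier, and names such as `LEVEL1` and `PageFault` are illustrative.

```python
# Sketch: two-level translation for the 8-bit example (3-bit page1 #, 2-bit page2 #, 3-bit offset).
# Level-2 tables map page2 # → physical page #. Missing keys stand for null entries.
L2_UNDER_111 = {0b11: 0b11101, 0b10: 0b11100, 0b01: 0b10111, 0b00: 0b10110}
L2_UNDER_100 = {0b10: 0b10000, 0b01: 0b01111, 0b00: 0b01110}
L2_UNDER_010 = {0b11: 0b01101, 0b10: 0b01100, 0b01: 0b01011, 0b00: 0b01010}
L2_UNDER_000 = {0b11: 0b00101, 0b10: 0b00100, 0b01: 0b00011, 0b00: 0b00010}

# Level-1 table maps page1 # → level-2 table. Missing keys mean no level-2 table exists.
LEVEL1 = {0b111: L2_UNDER_111, 0b100: L2_UNDER_100,
          0b010: L2_UNDER_010, 0b000: L2_UNDER_000}


class PageFault(Exception):
    """Raised when either level has a null entry."""


def translate(vaddr):
    p1 = (vaddr >> 5) & 0b111
    p2 = (vaddr >> 3) & 0b11
    offset = vaddr & 0b111
    level2 = LEVEL1.get(p1)      # first additional memory access
    if level2 is None:
        raise PageFault(f"no level-2 table for page1 # {p1:03b}")
    ppn = level2.get(p2)         # second additional memory access
    if ppn is None:
        raise PageFault(f"null entry for page2 # {p2:02b}")
    return (ppn << 3) | offset


print(hex(translate(0x90)))
```

Only four of the eight possible level-2 tables exist, which is the space saving the slide points to.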
20
Page Tables
Good:
― Easy memory allocation
― Easy sharing
― No memory allocated for "holes"
Bad:
― Overhead: at least 1 pointer per ~4K page (~0.1%)
― Page tables need to be contiguous (allocation?)
● Clever design (x86) can limit them to one page
― Several lookups per reference
● Later: making this fast
21
Page tables for huge address spaces
Want 64-bits of address space?
― Note: 64-bit x86 does not do this
How many levels of indirection?
[Figure: 64-bit virtual address split into a 2-bit field, five 10-bit indexes (Virtual P2 index through Virtual P6 index), and a 12-bit Offset.]
How much extra space? Memory accesses?
Will tables fit in one page?
Real example: 64-bit x86: 4-level page tables
22
Alternate structure: Inverted Page Table (1)
Previously: size of page table ~ # virtual addrs
New structure: size ~ # physical addrs
― Better if most of address space not in use
― Like 64-bit address space (example: Itanium)
Basic idea: Hash table
[Figure: Virtual Page # goes through the Hash Table to give Physical Page #; Offset is copied unchanged.]
23
Alternate structure: Inverted Page Table (2)
Good:
― Overhead proportional to virtual addresses in use
― One level of indirection
Bad:
― More complex
[Figure: Virtual Page # goes through the Hash Table to give Physical Page #; Offset is copied unchanged.]
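One way to picture the idea is a table with one slot per physical frame plus a hash lookup from virtual page to frame. The sketch below uses a Python dict as the hash table; keying on a process id, the 12-bit offset, and the first-free-frame allocation are assumptions made for illustration, and real designs differ in how they hash and resolve collisions.

```python
# Sketch: inverted page table, sized by physical frames rather than virtual pages.
OFFSET_BITS = 12   # assumed 4KB pages


class InvertedPageTable:
    def __init__(self, num_frames):
        self.owner = [None] * num_frames   # frame # → (pid, vpn), or None if free
        self.lookup = {}                   # hash table: (pid, vpn) → frame #

    def map(self, pid, vpn):
        """Place a virtual page in the first free frame (ValueError if none is free)."""
        ppn = self.owner.index(None)
        self.owner[ppn] = (pid, vpn)
        self.lookup[(pid, vpn)] = ppn
        return ppn

    def translate(self, pid, vaddr):
        vpn = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        ppn = self.lookup.get((pid, vpn))
        if ppn is None:
            raise LookupError(f"page fault: pid {pid}, virtual page {vpn:#x}")
        return (ppn << OFFSET_BITS) | offset


ipt = InvertedPageTable(num_frames=4)
ipt.map(pid=1, vpn=0x7FFFF)
print(hex(ipt.translate(1, 0x7FFFF800)))
```

The table holds four slots no matter how large the virtual address space is.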
26
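Mixing Segments and Pages
One idea: Each segment has its own page table
Fragmentation/allocation advantage of pages
[Figure: Virtual Address = Virtual Seg # | Virtual Page # | Offset. Seg # selects a segment map entry (Base0/Limit0 V, Base1/Limit1 V, Base2/Limit2 V, Base3/Limit3 N, Base4/Limit4 V, Base5/Limit5 N, Base6/Limit6 N, Base7/Limit7 V; here Base2/Limit2 V). Base points to that segment's page table (page #0 V,R; page #1 V,R; page #2 V,R,W; page #3 V,R,W; page #4 N; page #5 V,R,W), and Virtual Page # > Limit raises an access error. The selected entry (page #2 V,R,W) goes through Check Perm, which may raise an access error, and supplies the Physical Page #, which is joined with Offset to form the Physical Address.]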
27
Sharing with Segments + Paging
[Figure: Process A and Process B each split their virtual address into Virtual Seg # | Virtual Page # | Offset and look it up in their own segment map (Base0/Limit0 through Base7/Limit7). In both maps the entry Base2/Limit2 V points to the same page table (page #0 V,R; page #1 V,R; page #2 V,R,W; page #3 V,R,W; page #4 N; page #5 V,R,W), labeled Shared Segment.]
28
Multi-Level Translation
Good:
― No extra copies of page tables for shared data
― Easy memory allocation (no external fragmentation)
― Easy sharing – change segment entry in common case
― Still have option to share at page granularity
Bad:
― Overhead: 1 pointer per page
― At least two extra lookups per memory access
● (But see later)
29
Address Translation Comparison

Scheme                              | Advantages                                                        | Disadvantages
Segmentation                        | Fast context switching: segment mapping maintained by CPU         | External fragmentation
Paging (single-level page)          | No external fragmentation, fast easy allocation                   | Large table size ~ virtual memory; internal fragmentation
Paged segmentation, Two-level pages | Table size ~ # of pages in virtual memory, fast + easy allocation | Multiple memory references per page access
Inverted Table                      | Table size ~ # of pages in physical memory                        | Hash function more complex
30
Reality: 32-bit x86
Has both segmentation and paging
Segmentation is different than what we described
Segment identified by instruction not address
Note: x86 has multiple modes
― We will talk about 32-bit protected mode
― Old x86 (DOS) is different
31
Reality: x86 Special Registers
[Figures: Typical Segment Register; 80386 Special Registers]
"Segment Map Entry"
Stored in table in memory
Special registers point to active segment map entries
There's also paging (later)
32
Reality: Segmentation on x86 (1)
6 "segment registers" point to segment table entries
― cs (code), ds (data), ss (stack), es, fs, gs (extras)
Instructions identify segment to use:
― mov [es:bx], ax
― not specified, default to cs, ds, or ss depending on instruction
― instead of in address (like all discussion later)
Mostly unused by modern OSs
― Only fs, gs support base + limit in 64-bit mode
33
Reality: Segmentation on x86 (2)
Segment registers store a pointer like this:
Segment selector [13 bits] | G/L | RPL
Point to segment descriptor in global descriptor table or local descriptor table (depending on G/L).
Tables are in memory, pointed to by kernel-mode-only registers GDTR and LDTR
― Registers also include table length
RPL determines privilege level (user/kernel mode) of code segments
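The selector layout can be decoded with a few shifts and masks. A minimal sketch, assuming the standard 16-bit x86 selector layout (index in bits 15 to 3, table indicator in bit 2, RPL in bits 1 to 0); the function name and the sample values are illustrative.

```python
# Sketch: decoding a 16-bit x86 segment selector.
def decode_selector(selector):
    return {
        "index": selector >> 3,                         # 13-bit index into the descriptor table
        "table": "LDT" if selector & 0b100 else "GDT",  # the G/L bit
        "rpl": selector & 0b11,                         # 0 = kernel, 3 = user
    }


print(decode_selector(0x23))
print(decode_selector(0x08))
```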
34
Reality: Segmentation on x86 (3)
Segment descriptor format (64-bits):
Includes:
― Base and limit
― Valid bit (P for present)
― Protection flags (DPL (kernel/user), parts of type (R/W/X))
― Accessed bit (A)
― Bits related to interpretation of base and limit
― Bits related to mode switching
35
Reality: x86 Segments + Paging
Only one active page table
Segment + Offset is called logical address
Result of segmentation is called linear address
Linear address looked up in (single) page table
36
Reality: x86 Segments + Paging
37
Page Table Entries
We just showed them containing physical page #
― Two-level page tables: maybe PP# of another page table
Actually want some extra information:
― Present/valid bit – represent holes
― Protection bits – read-only memory (good for sharing)
38
Example: x86-32 Page Table Entry

Bits:  31-12                                    | 11-9      | 8 | 7 | 6 | 5 | 4   | 3   | 2 | 1 | 0
Field: Page Frame Number (Physical Page Number) | Free (OS) | 0 | L | D | A | PCD | PWT | U | W | P

― PFN: physical page number of page or next page table
― P: Present bit (= Valid)
― W: Writable
― U: User-accessible
― A: Accessed – set when page is accessed
― D: Dirty – set when page is modified
― L: If 1, points to 4MB "hugepage" instead of next page table
― PWT: Write-through caching behavior (for memory-mapped IO)
― PCD: Disable caching (for memory-mapped IO)
Roles: PFN is the pointer; W and U are protection bits; A and D are feedback to the OS.
10/10/12-bit split of virtual address
top-level page tables called directories
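The bit positions in the layout above translate directly into masks. A small decoding sketch; the helper names and the sample entry are made up for illustration, and the frame number 0x4887 is simply borrowed from the demand paging example later in the lecture.

```python
# Sketch: decoding an x86-32 page table entry using the bit positions listed above.
FLAG_BITS = {"P": 0, "W": 1, "U": 2, "PWT": 3, "PCD": 4, "A": 5, "D": 6, "L": 7}


def decode_pte(pte):
    """Return the page frame number and the set of flag names that are set."""
    flags = {name for name, bit in FLAG_BITS.items() if pte & (1 << bit)}
    return {"pfn": pte >> 12, "flags": flags}


def user_can_write(pte):
    """A user-mode write needs the page present, writable, and user-accessible."""
    return {"P", "W", "U"} <= decode_pte(pte)["flags"]


example = (0x4887 << 12) | 0b111   # PFN 0x4887 with P, W, U set
print(decode_pte(example), user_can_write(example))
```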
39
Logistics
Project 1 Checkpoint 1 due tomorrow 11:59PM
― Priority scheduler, Alarm clock
Homework 1 due tomorrow 11:59PM
Midterm next week (more later this week)
40
Break
41
Paging Tricks
What does invalid Page Table Entry (PTE) mean?
― Region of address space is invalid or
― Page is just somewhere else/not ready yet
When program accesses invalid PTE, OS gets an exception (a page fault or protection fault)
Options:
― Crash program (it's actually invalid)
― Get page ready and restart instruction
42
Paging Tricks: Examples
Demand Paging
― Swapping for pages
― Keep only active pages in memory
― When exception occurs for page not in memory, load from disk and retry
Copy on Write
― Remember fork() – copy of address space
― Instead of real copy, mark pages read-only
― Allocate new pages on protection fault
Zero-Fill On Demand
― New pages should be zeroed out – slow!
― Instead, pages start invalid – create new zero page when accessed
43
Demand Paging: Page Faults
0x0000 1000: sw $t0, 0($sp)
$sp → 0x7fff f800

VPN#      PPN#    Protection Bits        Logical state of page (not in page table)
0x0000 0  --      ...                    unmapped
0x0000 1  --      INVALID                on disk
...       ...     ...                    ...
0x7FFF F  --      INVALID                on disk
44
Demand Paging: Page Faults
0x0000 1000: sw $t0, 0($sp)
$sp → 0x7fff f800

VPN#      PPN#    Protection Bits        Logical state of page (not in page table)
0x0000 0  --      ...                    unmapped
0x0000 1  --      INVALID                on disk
...       ...     ...                    ...
0x7FFF F  --      INVALID                on disk

Running instruction will trigger page fault in instruction fetch
45
Demand Paging: Page Faults
0x0000 1000: sw $t0, 0($sp)
$sp → 0x7fff f800

VPN#      PPN#    Protection Bits        Logical state of page (not in page table)
0x0000 0  --      ...                    unmapped
0x0000 1  0x4887  valid, read, execute   in memory
...       ...     ...                    ...
0x7FFF F  --      INVALID                on disk

Page fault handler will run, load page, update page table
46
Demand Paging: Page Faults
0x0000 1000: sw $t0, 0($sp)
$sp → 0x7fff f800

VPN#      PPN#    Protection Bits        Logical state of page (not in page table)
0x0000 0  --      ...                    unmapped
0x0000 1  0x4887  valid, read, execute   in memory
...       ...     ...                    ...
0x7FFF F  --      INVALID                on disk

Now instruction will fault again
47
Demand Paging: Page Faults
0x0000 1000: sw $t0, 0($sp)
$sp → 0x7fff f800

VPN#      PPN#    Protection Bits        Logical state of page (not in page table)
0x0000 0  --      ...                    unmapped
0x0000 1  0x4887  valid, read, execute   in memory
...       ...     ...                    ...
0x7FFF F  0x4888  valid, read, write     in memory

And page fault handler will update page table again and rerun the instruction again
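The sequence on these slides (fault on the instruction fetch, fix, retry, fault on the store, fix, retry) can be simulated in a few lines. This is a sketch only: protection bits are left out, and the names `access`, `handle_fault`, and `run_store` are invented for the example. The frame numbers 0x4887 and 0x4888 are the ones shown in the tables.

```python
# Sketch: demand paging for "sw $t0, 0($sp)" at 0x00001000 with $sp = 0x7ffff800.
PAGE_BITS = 12


class PageFault(Exception):
    def __init__(self, vpn):
        super().__init__(hex(vpn))
        self.vpn = vpn


page_table = {}                  # VPN → PPN for pages currently in memory
on_disk = {0x00001, 0x7FFFF}     # valid pages that have not been loaded yet
free_frames = [0x4887, 0x4888]   # physical frames handed out in order


def access(vaddr):
    """Translate one address, faulting if its page is not in memory."""
    vpn = vaddr >> PAGE_BITS
    if vpn not in page_table:
        raise PageFault(vpn)
    return (page_table[vpn] << PAGE_BITS) | (vaddr & 0xFFF)


def handle_fault(vpn):
    """Load the page from disk and update the page table, or crash the program."""
    if vpn not in on_disk:
        raise MemoryError(f"unmapped virtual page {vpn:#x}")
    on_disk.remove(vpn)
    page_table[vpn] = free_frames.pop(0)


def run_store(pc, sp):
    """Rerun the whole instruction after every fault; return (faulting VPNs, store address)."""
    faults = []
    while True:
        try:
            access(pc)              # instruction fetch
            return faults, access(sp)   # the store itself
        except PageFault as fault:
            faults.append(fault.vpn)
            handle_fault(fault.vpn)


faults, paddr = run_store(0x00001000, 0x7FFFF800)
print([hex(v) for v in faults], hex(paddr))
```

The instruction is restarted from the fetch each time, which is why it takes three attempts to complete.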
Transparent Exceptions (1)
Process calls OS to fix up TLB or load pages
Can we return like system call?
― Not quite! Need to redo load/store, not skip it.
Hardware help:
― Faulting instruction address
― Address of memory access?
― Side effects?
[Figure: timeline of TLB faults alternating between User and OS. Faulting Inst 1 traps to the OS (Software Load TLB) and is rerun; Faulting Inst 2 traps to the OS (Fetch page/Load TLB) and is rerun.]
Transparent Exceptions: Side Effects
push eax (x86)
― Semantics: ESP ← ESP - 4; MEM[ESP] ← EAX
― If this faults, what happened to SP?
strcpy (r1), (r2) (VAX)
― Semantics: copy null terminated string in memory
― If this faults, how much was copied?
bne somewhere
ld r1, (sp) (MIPS)
― If this faults, how do we resume?
Precise Exceptions
A precise exception means the machine state is as if the program executed up to the offending instruction
― Hardware responsible for completing previous instructions, undoing other effects
― MIPS position
● solution for branch-delay – up to branch instruction
Some systems have imprecise exceptions
― Much harder for OS
― No current generation system does this
51
What happens in the MMU?
Reads the page table, triggers interrupts
Always: translations are cached
― Typical case – no memory accesses
What about when nothing cached yet?
[Figure: CPU sends Virtual Addresses to the MMU, which emits Physical Addresses.]
52
What happens in the MMU?
Option 1: Hardware traversal (example: x86)
― Hardware reads page tables
― Invoke page fault handler if invalid/non-present PTE
― Most common option for new architectures
Option 2: Software traversal (example: MIPS)
― Invoke software handler – an interrupt
― MIPS: some virtual addresses mapped to physical addresses without TLB (in kernel mode only)
― Handler invokes page fault handler if invalid/not-present PTE
[Figure: CPU sends Virtual Addresses to the MMU, which emits Physical Addresses.]
53
Recall: Dual-Mode Operation
Process cannot modify its own translation tables
― Otherwise, it can access all physical memory
― Can modify kernel
Hardware provides user/kernel mode controlled by special register
― Protects page table pointer
― Kernel needs to ensure translation tables are not accessible through process's page table
― x86: code segment selector (CS) RPL bits
● 0 = kernel, 3 = user; most OSs don't use other modes
54
Recall: 61C Program Layout
Higher addresses unused?
55
32-bit x86 Linux Memory Layout
56
32-bit x86 Linux Memory Layout
Shared libraries
Extra stacks for multithreaded processes
Some memory allocations
57
32-bit x86 Linux Memory Layout
Kernel is mapped into every process's address space
Protection bits: can't be accessed from user mode
Faster than changing address space on every kernel mode switch
58
32-bit x86 Linux Memory Layout
Address Space Layout Randomization: Make turning bugs into security problems harder
Effective? See a security class?
59
Starting a Program: Steps
Allocate Process Control Block
Read (some of) program off disk and store in memory
Allocate page table
― Point at code so program can execute
Set up machine registers
― Includes pointer to translation table
Set user mode in special register + jump
60
Switching Between Processes
Save/restore registers like for thread switch()
― Typically on corresponding kernel stack
But also save/restore page table pointer
61
Recall: System Calls
Access to operating system services:
― IO (open, read, …), Files (mkdir, …), Process (fork, …)
Voluntary call into kernel
Specifies index of system call
― Not memory address
― Only known, carefully written entry points
62
Recall: Kernel entry
Lookup entry point in interrupt vector
Hardware switches to kernel mode
[Figure: interrupt number (i) indexes the interrupt vector, which holds the address and properties of each interrupt handler (or syscall handler or exception handler, etc.), for example intrpHandler_i () { … }]
63
System call argument passing
Arguments in registers or pointed to by registers
Read from saved registers
― Remember: saving registers is the first task on kernel entry
― (Pintos/x86: mostly done by hardware)
What about pointers?
― read(file-descriptor, POINTER, SIZE)
64
Syscall argument passing: Pointers
In same address space
Is this safe?
― read() into kernel memory?
Solution:
― Check bounds of pointers
― Access normally
What if page swapped out?
― Same handler as userspace, no mode switch
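The bounds check can be as simple as confirming the whole buffer lies below the kernel's part of the address space. A hedged sketch: the boundary 0xC0000000 is an assumption (a common 3 GB/1 GB split on 32-bit x86), and `user_range_ok` is a hypothetical helper, not a real kernel API. A real kernel must also cope with unmapped or swapped-out pages inside the range.

```python
# Sketch: checking a user-supplied pointer before the kernel touches it.
KERNEL_BASE = 0xC0000000   # assumed start of kernel memory in every address space


def user_range_ok(ptr, size):
    """True if [ptr, ptr + size) lies entirely in user space."""
    if ptr < 0 or size < 0:
        return False
    return ptr + size <= KERNEL_BASE


# read(fd, POINTER, SIZE) must reject a buffer that reaches into kernel memory.
print(user_range_ok(0x08048000, 4096))   # ordinary user buffer
print(user_range_ok(0xBFFFFFFF, 2))      # straddles the boundary
print(user_range_ok(0xC0100000, 16))     # kernel address
```

Checking the end of the buffer, not only the start, is what catches a range that begins in user space and runs into the kernel.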
65
Synchronous Traps versus Syscalls
Handling essentially the same – different argument passing
Special hardware registers for arguments
― Example: virtual address of page fault
Often rerun the triggering instruction
― after fixing something – example: page table entry
― same as return from syscall but different user program counter
66
Summary (1)
Segment mapping
― Segment registers within processor
― Segment ID with each access (address or instruction)
― Base + Limit for each segment
Page Tables
― Fixed-sized chunks (kilobytes) called pages
― Virtual page # (top bits of virtual address) mapped through page table to physical page #
― Rest of address (offset) unchanged
67
Summary (2)
Multi-Level Page Tables
― Tree of tables
― Some may be missing
Inverted Page Tables
― Hashtable
― Advantage: less overhead for huge address space
Page Table Entries:
― Physical Page # + Valid + Permissions
68
Summary (3)
Dual mode operation – Kernel/User
― Only kernel can change PageTablePtr or write to page tables
Exceptions – controlled entry into kernel
― Synchronous exceptions ("traps") – System calls, page faults, etc.
― Asynchronous exceptions: Interrupts