
Page 1: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

1

CSCI 350Ch. 8 – Address Translation

Mark Redekopp

Michael Shindler & Ramesh Govindan

Page 2: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

2

Abstracting Memory

• Thread = Abstraction of the processor

• What about abstracting memory?

– "All problems in computer science can be solved by another level of indirection"

• Address translation => Abstraction of memory

(Diagram: processor connected to memory and input/output devices; software consists of program and data.)

Page 3: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

3

Benefits of Address Translation

• What is enabled through address translation?

– Illusion of more or less memory than physically present

– Isolation

– Controlled sharing of code or data

– Efficient I/O (memory-mapped files)

– Dynamic allocation (Heap / Stack growth)

– Process migration

Page 4: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

4

Virtual vs. Physical Addresses

• Translation between:

– Virtual address: Address used by the process (programmer)

– Physical address: Physical memory location of the desired data

(Diagram: the processor core issues virtual addresses; the Translation Unit / MMU (Mem. Mgmt. Unit) converts them to physical addresses sent to memory. Physical memory, spanning 0x00000000 to 0xffffffff, holds page frames plus an I/O and unused area; each process' fictitious virtual address space is a contiguous sequence of pages (Pg. 0, Pg. 1, ...) whose frames may sit anywhere in physical memory or out on secondary storage.)

Page 5: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

5

Translation Goals & Roadmap

• Functional Goals

– Isolation and transparency to programmer

– Controlled sharing

– Support for sparse address spaces and dynamic resizing

• Performance Goals

– Flexible physical placement

– Fast translation

– Low memory overhead (compact tables)

• We'll first focus on function and then discuss efficiency

Page 6: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

6

Evolution of Translation

• Base + bounds check

• Segmentation

• Paging

Page 7: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

7

SEGMENTATION

Page 8: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

8

Base and Bounds

• Each running process has a:

– Virtual address space from VA 0 to N-1

– Physical address space from PA BASE to BASE+N-1

• The program is written (compiled) using VAs, which determines the necessary BOUND (i.e., N)

• When the program is loaded, the OS assigns BASE (the physical address space) dynamically

• Hardware converts VAs to PAs and performs the bounds check

(Diagram: the CPU issues VA 0x02000; the Translation Unit / MMU compares it against bound 0x05000 and adds base 0x14000, producing PA 0x16000 or raising an exception. P1's physical address space runs from base 0x14000 to base + bound = 0x19000; P2 occupies a separate region.)

• The "BASE" and "BOUNDS" registers and checking hardware are termed the MMU (Mem. Mgmt. Unit) or Translation Unit

• Base and Bound are loaded/restored on process switches.
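A minimal sketch of the base-and-bounds check the MMU performs in hardware, written as C only for illustration (the struct and function names below are hypothetical, not from Pintos or any real MMU):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-process relocation state (loaded into the MMU's
 * BASE and BOUND registers on every context switch). */
struct mmu_regs {
    uint32_t base;   /* start of the process' physical region      */
    uint32_t bound;  /* size of the region (N), i.e. max VA + 1    */
};

/* Translate a virtual address; returns false to model the hardware
 * raising an exception on an out-of-bounds access. */
static bool translate(const struct mmu_regs *r, uint32_t va, uint32_t *pa)
{
    if (va >= r->bound)          /* bounds check first              */
        return false;            /* -> exception / fault            */
    *pa = r->base + va;          /* relocation: PA = BASE + VA      */
    return true;
}
```

With base = 0x14000 and bound = 0x05000 as on the slide, VA 0x02000 passes the check and maps to PA 0x16000, while VA 0x05000 would fault.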

Page 9: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

9

Base and Bounds Pros & Cons

• Pros:

– Simple

– Provides isolation among processes

• Cons:

– No easy way to share data

– Can't enforce access rights within the process (e.g., code = read only)

• Processes can "rewrite" their own address space

(Same base-and-bounds translation diagram and MMU notes as the previous slide.)

Page 10: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

10

Segmentation

• Allow a process to be broken into multiple base + bounds "segments"

– Code, data, stack, heap

• Multiple base + bounds registers stored in a table in the MMU

• Updating the table is a "privileged" (kernel-only) operation

(Diagram: VA 0x102000 is split into a segment number (1) and an offset (0x02000). The segment number indexes a descriptor table — entry 0: base 0x08000, bound 0x0400, R; entry 1: base 0x14000, bound 0x05000, R/W; entry 2: base 0x2a000, bound 0x03200, R/W — and the selected base is added to the offset after a bounds check, yielding PA 0x16000 or an exception. Segment register format: 13-bit segment index, 1 bit selecting LDT (1) or GDT (0), and a 2-bit RPL (0-3).)

http://ece-research.unm.edu/jimp/310/slides/micro_arch2.html
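The per-segment version of the same check, again as a hedged C sketch; the three-entry descriptor table only mirrors the example values on the slide, and the field, function names, and VA split are made up for illustration:

```c
#include <stdint.h>
#include <stdbool.h>

struct seg_desc {
    uint32_t base;    /* physical start of the segment */
    uint32_t bound;   /* segment length in bytes       */
    bool     writable;
};

/* Example descriptor table from the slide: code, stack, data. */
static const struct seg_desc table[3] = {
    { 0x08000, 0x00400, false },  /* 0: code, read-only */
    { 0x14000, 0x05000, true  },  /* 1: stack, R/W      */
    { 0x2a000, 0x03200, true  },  /* 2: data, R/W       */
};

/* Assumed VA layout: upper bits select the segment, lower 20 bits are
 * the offset (e.g. VA 0x102000 -> segment 1, offset 0x02000). */
static bool seg_translate(uint32_t va, bool is_write, uint32_t *pa)
{
    uint32_t seg = va >> 20;
    uint32_t off = va & 0xFFFFF;
    if (seg >= 3 || off >= table[seg].bound)  return false; /* seg fault  */
    if (is_write && !table[seg].writable)     return false; /* protection */
    *pa = table[seg].base + off;
    return true;
}
```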

Page 11: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

11

Segmentation Pros

• Enforce access rights to each segment using the access bits in its segment descriptor

• Detect out-of-bounds accesses (segmentation fault)

• Memory-mapped files (segment = file on disk)

• Sharing code and data (see example in a few slides)

– Shared code / libraries and/or data

– One code segment for multiple instances of a running program

• Key idea: what is behind a segment can be "anything"

– Translation gives us a chance to intervene

(Same segmentation translation diagram as the previous slide.)

Page 12: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

12

Segmentation Cons

• External fragmentation

– As a system runs and segments are created and deleted, physical memory may accumulate many small, unusable gaps, and we effectively lose that memory

• Growing a segment may be hard since segments must be contiguous (other segments may be bracketing our growth)

(Same segmentation translation diagram as the previous slides.)

http://ece-research.unm.edu/jimp/310/slides/micro_arch2.html

Page 13: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

13

x86 Segmentation

• Segment descriptors are stored in tables in the kernel's memory region

• Local descriptor table (LDT) [Mem]: Up to 8192 segment descriptors for each process

• Global descriptor table (GDT) [Mem]: Kernel, LDT, and a few other types of descriptors

– Pointed to by GDTR / Interrupt descriptor table pointed to by IDTR

• MMU (in CPU) stores a cache of descriptors from the process' LDT for each segment register (CS, DS, SS, ES, FS, GS)

(Diagram: the GDTR points to the GDT at 0xc4000 and the IDTR to the interrupt descriptor table; a GDT entry locates the per-process LDT (up to 8192 descriptors) at 0xc0000. The MMU's descriptor cache holds the descriptors currently selected by CS, DS, and SS, e.g. ss = 1:1:3, cs = 0:1:3, ds = 2:1:3.)

Page 14: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

14

Multiple Processes + Sharing

• A physical segment can be shared by 2 processes by setting up a descriptor for it in each process' table

(Diagram: Process 1's LDT maps its code segment at base 0x08000, its stack at 0x14000, and its data at 0x7a000; Process 2's LDT maps its code at the same base 0x08000 — the shared P1/P2 code segment — its stack at 0x7e000, and its data at 0x6e000.)

Page 15: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

15

Forking a Process & Copy-On-Write

• Recall that forking a process makes a "copy" of the address space

– The child is generally going to "exec" soon afterward

• In reality we can just make a copy of the page directory but mark all entries read-only

• On any write, an exception is generated and we can make a copy of the segment for the child (aka copy-on-write)

• If neither process writes, the two processes can share the memory and we save time by avoiding the copy

• This is an example of a general concept called lazy evaluation

– Start by doing the minimal required work; do more only when required

(Diagram: the parent's and the forked child's segment tables contain identical entries — code 0x08000/0x0400, stack 0x14000/0x05000, data 0x2a000/0x03200 — all marked read-only and pointing at the same physical segments.)
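A sketch of the copy-on-write logic in the protection-fault handler, written as self-contained C for illustration only; the struct, its fields, and the use of malloc/memcpy stand in for a real kernel's segment table and frame allocator:

```c
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative segment-table entry; fields are assumptions for this sketch. */
struct seg_entry {
    void  *phys;       /* the backing memory                   */
    size_t len;        /* segment length                       */
    bool   writable;   /* current hardware protection          */
    int   *refcount;   /* how many tables share 'phys'         */
};

/* Fault handler logic for a write to a read-only, copy-on-write segment. */
static void cow_fault(struct seg_entry *e)
{
    if (*e->refcount > 1) {
        /* Still shared with the other process: lazily make the copy now. */
        void *copy = malloc(e->len);
        memcpy(copy, e->phys, e->len);
        (*e->refcount)--;
        e->phys = copy;
        e->refcount = malloc(sizeof(int));
        *e->refcount = 1;
    }
    /* The faulting process now owns the memory exclusively, so restore
     * write permission and (in a real kernel) restart the instruction. */
    e->writable = true;
}
```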

Page 16: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

16

Shared Library Tangent

• Programs shouldn't each need to link in their own copy of the same library functions (think printf, strcpy, memset, etc.)

• Place the library functions in a shared library

– Problem: compiled programs won't know where the library code will be at compile time

– Solution: the compiler generates code that looks up any call to a shared function in a table it generates; the loader fills the table in with the actual library addresses

– Map a segment to describe that shared code area

(Diagram: Process 1 and Process 2 each map a segment for the shared library at base 0x08000; their lookup tables list the library entry points, e.g. printf at 0x400 and strcpy at 0x640, filled in by the loader.)

Page 17: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

17

PAGING

Page 18: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

18

Paging

• Helps to avoid external fragmentation

– No unused "gaps" in memory

• Virtual memory (the process' viewpoint) is divided into equal-size "pages" (4KB)

• Physical memory is broken into page frames, each of which can hold any page of virtual memory and then be swapped for another page

• Virtual address spaces are contiguous while the physical layout is not

A physical frame of memory can hold data from any virtual page. Since all pages are the same size, any page can go in any frame (and be swapped at our desire).

(Diagram: the physical address space from 0x00000000 to 0xffffffff holds page frames plus an I/O and unused area; the pages of Process 1's and Process 2's virtual address spaces are scattered among the frames.)

Page 19: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

19

Page Size and Address Translation

• Since pages are usually retrieved from disk, we size them to be fairly large (several KB) to amortize the large access time

• Virtual page number to physical page frame translation performed by HW unit = MMU (Mem. Management Unit)

• Page table is an in-memory data structure that the HW MMU will use to look up translations from VPN to PPFN

(Diagram: the virtual address splits into a 20-bit virtual page number (bits 31-12) and a 12-bit offset within the page (bits 11-0). The translation process (MMU + page table) converts the VPN into a physical page frame number; the offset is copied unchanged into the physical address.)

Page 20: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

20

Analogy for Page Tables

• Suppose we want to build a caller-ID mechanism for the contacts on your cell phone

– Assume 1000 contacts, each represented by a 3-digit integer (0-999) that the phone can use to look up the name

– We want to use a simple look-up table (LUT) to translate phone numbers to contact IDs. How should we organize/index our LUT?

(Diagram: three candidate LUT organizations.)

1. LUT indexed by contact ID, holding phone numbers: O(n) — doesn't work. We are given a phone number and need to translate it to an ID, so we would scan all 1000 entries.

2. Sorted LUT indexed by the phone numbers actually in use: O(log n) — could work. Since it is in sorted order we can binary search (about log2(1000) accesses).

3. LUT indexed by all possible phone numbers, holding contact IDs (mostly null): O(1) — could work. Easy to index and find (1 access) but LARGE.

Page 21: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

21

Page Tables

• The VA is broken into:

– VPN (upper bits)

– Page offset: based on the page size (i.e., 0 to 4K-1 for a 4KB page)

• The VPN is an index into the "page table"

• The MMU uses the VPN & PTBR to access the page table in memory and look up the physical frame

• The physical frame number is combined with the offset to form the physical address

• For a 20-bit VPN, how big is the page table?

(Diagram: the 20-bit VPN (bits 31-12 of the VA) indexes the page table, whose base is given by the PTBR (Page Table Base Register); each entry holds a valid/present bit and a page frame number, which is concatenated with the 12-bit offset to form the PA.)

Page table size = 2^20 entries x ~19 bits per entry ≈ 2^20 x 4 bytes = 4 MB
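A sketch of the single-level lookup the MMU performs, assuming a 32-bit VA, 4KB pages, and a flat 2^20-entry table of 32-bit PTEs; the PTE layout (present bit in bit 0, frame number in the upper 20 bits) is an assumption for illustration, not a specific architecture's encoding:

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT   12
#define PAGE_MASK    0xFFFu          /* low 12 bits = offset within page */
#define PTE_PRESENT  0x1u            /* assumed valid/present bit        */

/* ptbr points at a 2^20-entry page table; each entry's upper 20 bits
 * hold the physical page frame number (an assumed layout). */
static bool pt_lookup(const uint32_t *ptbr, uint32_t va, uint32_t *pa)
{
    uint32_t vpn    = va >> PAGE_SHIFT;        /* bits 31..12       */
    uint32_t offset = va & PAGE_MASK;          /* bits 11..0        */
    uint32_t pte    = ptbr[vpn];               /* one memory access */

    if (!(pte & PTE_PRESENT))
        return false;                          /* page fault        */
    *pa = (pte & ~PAGE_MASK) | offset;         /* PPFN || offset    */
    return true;
}
```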

Page 22: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

22

Paging

• Each process has its own virtual address space and thus needs its own page table

• On a context switch to a new process, reload the PTBR

(Diagram: each process' page table, pointed to by PTBR/CR3 — e.g. 0xc4000 for one process and 0xd0000 for the other — maps VPNs to physical frames with R/W bits. In the example, VA 0x001040 translates through VPN 0x001 to frame 0x6e000, giving PA 0x6e040, and VA 0x002eac translates through VPN 0x002 to frame 0x08000, giving PA 0x08eac.)

Page 23: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

23

Reducing the Size of Page Tables

• Can we use the table indexed by all possible phone numbers (because it only requires 1 access) but somehow reduce its size, especially since much of it is unused?

• Do you have friends from every area code? Likely your contacts are clustered in only a few area codes.

• Use a 2-level organization

– The 1st-level LUT is indexed by area code and contains pointers to 2nd-level tables

– The 2nd-level LUTs are indexed by local phone number and contain contact-ID entries

(Diagram: a 1000-entry 1st-level table indexed by area code; used entries (e.g., 213 and 323) point to 10^7-entry 2nd-level tables indexed by the local number; unused entries are null.)

If only 2 area codes are used, then only 1000 + 2(10^7) entries are needed rather than 10^10 entries.

Page 24: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

24

Analogy for Multi-level Page Tables

• Could extend to 3 levels if desired

– 1st level = area code, with pointers to 2nd-level tables

– 2nd level = first 3 digits of the local number, with pointers to 3rd-level tables

– 3rd level = contact IDs

(Diagram: the area-code table points to the 213 and 323 second-level tables; their 745 and 823 entries point to third-level tables whose entries hold the contact IDs for 213-745-9823 and 323-823-7104; all other entries are null.)

Page 25: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

25

Analogy for Multi-level Page Tables

• If we add a friend from area code 408 we would have to add a second and third level table for just this entry

(Same three-level lookup diagram as the previous slide.)

Page 26: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

26

Multi-level Page Tables

(Diagram: the VA splits into a page-directory index (PDIdx), a page-table index (PTIdx), and an offset. PDBR/CR3 (0xd0000) points to the 1024-entry page directory; the PDIdx entry holds a physical pointer to a 1024-entry second-level page table; the PTIdx entry there holds the physical frame number and access bits, which combine with the offset to give, e.g., PA 0x6e040 for VA 0x001040.)

Page 27: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

27

Another View

(Diagram: a 1024-entry first-level table, indexed by level index 1 (bits 31-22), whose entries point to 1024-entry second-level tables, indexed by level index 2 (bits 21-12), holding the PPFNs; the 12-bit offset selects a location within the frame.)

Unused entries can store a NULL pointer.

"Swapping" can be performed if no physical frames are available when a new page needs to be allocated: we simply swap a page of data out to disk to free up a physical frame and retrieve it when necessary.

A secondary data structure (not shown) can be maintained for pages to record where each page's data resides on disk.

Page 28: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

28

Page Table Entries (PTEs)

• Valid bit (1 = desired page in memory / 0 = page not present / page fault)

• Referenced = To implement replacement algorithms (e.g. pseudo-LRU)

• Protection: Read/Write/eXecute

Typical PTE fields: page frame number, valid/present, modified/dirty, referenced, protection, cacheable.
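One way to picture such an entry is as a packed 32-bit word; the bit positions below are assumptions made for this sketch, not any particular architecture's encoding:

```c
#include <stdint.h>

/* Illustrative 32-bit PTE layout (bit positions are assumptions). */
#define PTE_PRESENT    (1u << 0)   /* valid / present               */
#define PTE_WRITABLE   (1u << 1)   /* protection: write allowed     */
#define PTE_EXEC       (1u << 2)   /* protection: execute allowed   */
#define PTE_REFERENCED (1u << 3)   /* set by HW on any access       */
#define PTE_DIRTY      (1u << 4)   /* set by HW on a write          */
#define PTE_NOCACHE    (1u << 5)   /* cacheable when clear          */
#define PTE_FRAME_MASK 0xFFFFF000u /* page frame number, bits 31-12 */

static inline uint32_t pte_frame(uint32_t pte) { return pte & PTE_FRAME_MASK; }
```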

Page 29: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

29

Page Fault Steps

• What happens when you reference a page that is not present?

• HW will…

– Record the offending address and generate a page fault exception

• SW will…

– Pick an empty frame or select a page to evict

– Write back the evicted page if it has been modified (may block the process while waiting and yield the processor)

– Bring in the desired page and update the page table (may block the process while waiting and yield the processor)

– Restart the offending instruction

• Key idea: the handler can bring in the page or do anything appropriate to handle the page fault

– Allocate a new page, zero it out, perform copy-on-write, etc.
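The software side of those steps, sketched in C. Every helper below (find_free_frame, pick_victim, swap_out, swap_in, etc.) is hypothetical; the sketch only shows the order of operations, not a real kernel's handler:

```c
#include <stdint.h>
#include <stdbool.h>

/* Minimal illustrative state; a real kernel's structures differ. */
struct page { uint32_t vpn; int frame; bool dirty; };

/* Hypothetical helpers, assumed to exist elsewhere for this sketch. */
extern int  find_free_frame(void);
extern struct page *pick_victim(void);          /* e.g. pseudo-LRU    */
extern void swap_out(struct page *victim);      /* may block / yield  */
extern void swap_in(uint32_t vpn, int frame);   /* may block / yield  */
extern void invalidate_pte(struct page *victim);
extern void set_pte(uint32_t vpn, int frame);   /* mark present       */

/* Software side of a page fault; 'fault_va' was recorded by the HW. */
void handle_page_fault(uint32_t fault_va)
{
    uint32_t vpn   = fault_va >> 12;
    int      frame = find_free_frame();

    if (frame < 0) {                      /* no empty frame: evict     */
        struct page *victim = pick_victim();
        if (victim->dirty)
            swap_out(victim);             /* write back if modified    */
        invalidate_pte(victim);
        frame = victim->frame;
    }
    swap_in(vpn, frame);                  /* bring in the desired page */
    set_pte(vpn, frame);                  /* update the page table     */
    /* returning from the exception restarts the faulting instruction */
}
```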

Page 30: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

30

Multi-level Page Tables

• Think of multi-level page tables as a tree

– Internal nodes contain pointers to other page tables

– Leaves hold the actual translations

(Diagram: a three-level walk. The VPN splits into Idx1 (7 bits), Idx2 (7 bits), and Idx3 (6 bits), followed by a 12-bit offset. PDBR/CR3 (0xd0000) points at the level-1 table; its indexed entry points at a level-2 table, whose indexed entry points at a level-3 table, whose indexed entry holds the physical frame address. Translations live only in the last level.)

Page 31: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

31

Sparse Address Spaces

(Diagram: the same three-level walk applied to a sparse virtual address space; only the used regions have second- and third-level tables, and physical pages can be anywhere in the physical address space.)

The virtual address space may be "sparse". In that case any page-table entry can be null, indicating no translation, and no lower-level page tables are needed for those address ranges:

1. No 2nd-level table means…

2. …no 3rd-level tables for that range

3. No 3rd-level tables for NULL 2nd-level entries

Page 32: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

32

x86 Segments + Paging

• x86 allows use of segments + paging

• Most modern OS's do NOT use x86 segmentation for newer (protected-mode) apps

• All segments use Base = 0, Bounds = 0xffffffff and thus little use of LDTs

(Diagram: the processor core's virtual address first goes through segment translation (bounds checks), then through paging, producing the physical address sent to memory.)

Page 33: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

33

32-bit x86 (Pintos) Translation

(Diagram: the 32-bit VA splits into a 10-bit page-directory index, a 10-bit page-table index, and a 12-bit offset. PDBR/CR3 (0xd0000) points to the page directory (level 1); the PDIdx entry points to a page table (level 2); the PTIdx entry gives PPFN 0x6e000, which is combined with offset 0x040 to form PA 0x6e040.)

• 2-level page table
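A sketch of the two-level walk for this 10/10/12 split, in C. CR3/PDBR is modeled as a plain pointer, an identity mapping is assumed when following physical pointers, and the entry layout (present bit in bit 0, frame number in the upper 20 bits) follows the common x86 convention; treat it as illustrative rather than Pintos' actual pagedir code:

```c
#include <stdint.h>
#include <stdbool.h>

#define PG_PRESENT 0x1u

/* Assumed helper: convert a physical frame address into a pointer the
 * kernel can dereference (identity mapping assumed for this sketch). */
static inline uint32_t *frame_to_ptr(uint32_t frame) {
    return (uint32_t *)(uintptr_t)frame;
}

static bool walk_two_level(uint32_t *page_dir /* from PDBR/CR3 */,
                           uint32_t va, uint32_t *pa)
{
    uint32_t pd_idx = (va >> 22) & 0x3FF;      /* bits 31..22        */
    uint32_t pt_idx = (va >> 12) & 0x3FF;      /* bits 21..12        */
    uint32_t offset =  va        & 0xFFF;      /* bits 11..0         */

    uint32_t pde = page_dir[pd_idx];           /* 1st memory access  */
    if (!(pde & PG_PRESENT)) return false;     /* no 2nd-level table */

    uint32_t *pt = frame_to_ptr(pde & ~0xFFFu);
    uint32_t pte = pt[pt_idx];                 /* 2nd memory access  */
    if (!(pte & PG_PRESENT)) return false;     /* page fault         */

    *pa = (pte & ~0xFFFu) | offset;            /* PPFN || offset     */
    return true;
}
```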

Page 34: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

34

SPARC VM Implementation

The translation uses the process/context ID plus a virtual address split into Index 1 (8 bits), Index 2 (6 bits), Index 3 (6 bits), and a 12-bit offset within the 4K page.

The MMU holds a 4096-entry context table (one entry per context/process) — essentially a PTBR for each process. The first-, second-, and third-level tables are 2^8 x 4 bytes, 2^6 x 4 bytes, and 2^6 x 4 bytes respectively; the third-level entry supplies the PPFN of the page containing the desired word.

Page 35: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

35

Shared Memory

• In current system, all memory is private to each process

• To share memory between two processes, the OS can allocate an entry in each process’ page table to point to the same physical page

• Can use different protection bits for each page table entry (e.g. P1 can be R/W while P2 can be read only)

(Diagram: P1's and P2's page tables each contain an entry pointing to the same frame of physical memory.)
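A sketch of how the OS might set up such a shared mapping, reusing the illustrative single-level page-table layout and PTE bits from the earlier sketches; the function names, flag values, and the example virtual address are all hypothetical:

```c
#include <stdint.h>

#define PTE_PRESENT  0x1u
#define PTE_WRITABLE 0x2u

/* Map page-aligned physical frame 'frame_pa' at 'va' in a page table,
 * with the given protection (layout assumptions as in earlier sketches). */
static void map_shared_frame(uint32_t *page_table, uint32_t va,
                             uint32_t frame_pa, uint32_t prot)
{
    page_table[va >> 12] = (frame_pa & ~0xFFFu) | prot | PTE_PRESENT;
}

/* Usage sketch: the same frame appears in both processes' tables,
 * read/write for P1 but read-only for P2. */
void share_example(uint32_t *p1_pt, uint32_t *p2_pt, uint32_t shared_frame)
{
    map_shared_frame(p1_pt, 0x00400000u, shared_frame, PTE_WRITABLE);
    map_shared_frame(p2_pt, 0x00400000u, shared_frame, 0);
}
```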

Page 36: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

36

Paged Segmentation (Not covered)

• Paging & segmentation can be combined

• Segment could point to a page table

– PT for Code, Data, Stack, etc.

• VA is broken into segment, VPN, and offset

• Used when VA < PA

(Diagram: VA 0x102040 splits into a segment number, VPN, and offset. The segment number indexes a descriptor table whose entries hold a page-table base, a bound in pages, and access bits — e.g., base 0xc4000 with a bound of 3 pages, or base 0xcc000 with a bound of 0x18 pages. After the bounds check, the VPN indexes that segment's page table (Proc. 1 CS and SS page tables shown) to get the PPFN, which combines with the offset to form the PA, here 0x6e040.)

Page 37: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

37

Inverted Page Tables

• Page tables may seem expensive in terms of memory overhead

– Though they really aren't that big

• One option to consider is an "inverted" page table

– One entry per physical frame

– Hash the virtual address; the resulting index is where that page's translation must reside

• What about collisions?

– They become hard to handle in hardware, but inverted tables can be used by secondary software structures

(Diagram: in the phone-book analogy, a hash of the phone number 626-454-9985 indexes directly into the 1000-entry contact table.)
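A sketch of an inverted-table lookup with linear probing to resolve collisions; the hash function, table size, and entry layout are all made up for this example (one entry per physical frame, so the index of the matching entry is the frame number):

```c
#include <stdint.h>
#include <stdbool.h>

#define NFRAMES 4096            /* one entry per physical frame */

struct ipt_entry {
    uint32_t vpn;               /* which virtual page occupies this frame */
    uint16_t asid;              /* owning address space                   */
    bool     valid;
};

static struct ipt_entry ipt[NFRAMES];

static uint32_t hash(uint16_t asid, uint32_t vpn) {
    return (vpn * 2654435761u ^ asid) % NFRAMES;   /* arbitrary hash */
}

/* Returns the frame number holding (asid, vpn), or -1 if not resident. */
static int ipt_lookup(uint16_t asid, uint32_t vpn)
{
    uint32_t i = hash(asid, vpn);
    for (int probes = 0; probes < NFRAMES; probes++) {
        if (!ipt[i].valid) return -1;              /* empty slot: miss */
        if (ipt[i].asid == asid && ipt[i].vpn == vpn)
            return (int)i;                         /* frame == index   */
        i = (i + 1) % NFRAMES;                     /* collision: probe */
    }
    return -1;
}
```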

Page 38: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

38

A Note on Portability

• You can see that VM implementations leverage specific HW registers and capabilities

• OSs may provide some VM abstraction for easier portability

• Also may maintain secondary data structures about virtual and physical pages to help this process

Page 39: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

39

EFFICIENCY

Page 40: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

40

VM Design Implications

• SLOW secondary storage access on page faults (10 ms)

– Implies page size should be fairly large (i.e. once we’ve taken the time to find data on disk, make it worthwhile by accessing a reasonably large amount of data)

– Implies a “page fault” is going to take so much time to even access the data that we can handle them in software (via an exception) rather than using HW like typical cache misses

[Other implications you might not understand yet without caching knowledge]

– Implies eviction algorithms like LRU can be used since reducing page miss rates will pay off greatly

– Implies the placement of pages in main memory should be fully associative to reduce conflicts and maximize page hit rates

– Implies write-back (write-through would be too expensive)

Page 41: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

41

Page Table Performance

• How many accesses to memory does it take to get the desired word that corresponds to the given virtual address?

• So for each needed memory access, we need 3 additional?

– That sounds BAD!

• Would that change for a 1- or 2- level table?

• Walking the page table can take a long time

(Same SPARC three-level translation diagram as the earlier SPARC VM slide: context table in the MMU, then first-, second-, and third-level tables before reaching the desired word.)

Page 42: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

42


Translation Lookaside Buffer (TLB)

• Solution: Let’s create a cache for translations = Translation Lookaside Buffer (TLB)

• Needs to be small (64-128 entries) so it can be fast, with high degree of associativity (at least 4-way and many times fully associative) to avoid conflicts

– On hit, the PPFN is produced and concatenated with the offset

– On miss, a page table walk is needed

(Diagram: the CPU's VPN is looked up in the on-chip TLB (~10 ns). On a hit, the PPFN is concatenated with the page offset to form the PA sent to the cache/memory; on a miss, the page table in memory must be walked.)

Cost of translation = Cost of TLB lookup + (1 - P(TLB hit)) x Cost of page-table walk
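For example (made-up numbers, only to illustrate the formula): with a 10 ns TLB lookup, a 98% TLB hit rate, and a 3-level page-table walk costing 3 x 100 ns = 300 ns, the average translation cost is 10 + (1 - 0.98) x 300 = 16 ns.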

Page 43: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

43

Translation Lookaside Buffer (TLB)

(Diagram: a fully associative TLB. The 20-bit VPN is compared against every entry's tag simultaneously — an entry can be anywhere, so all locations must be checked for a hit. Each entry holds valid and dirty bits, a tag equal to the VPN, and a page frame number, e.g. VPN 0x7ffe4 -> frame 0x308ac; the 12-bit offset passes through unchanged into the physical address.)

Page 44: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

44

A 4-Way Set Associative TLB

• 64-entry, 4-way set-associative TLB (the set field indexes each "way")

– On a hit, the page frame # is supplied quickly without a page table access

(Diagram: the VPN splits into a 16-bit tag and a 4-bit set index; the set index selects one entry in each of the 4 ways, and the tag is compared against all 4 in parallel to produce the physical frame number.)

Page 45: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

45

TLB Miss Process

• On a TLB miss, there is some division of work between the hardware (MMU) and OS

• Option 1

– MMU can perform page table walk if needed

– If page fault occurs, OS takes over to bring in the page

• Option 2

– If TLB miss, OS can perform both the page table walk and bring in page if necessary

• When we want to remove a page from memory

– First flush out blocks belonging to that page from the data cache (writing them back if necessary)

– Invalidate the tags of those blocks

– Invalidate the TLB entry (if any) corresponding to that page; if D=1, set the dirty bit in the page table

– If the page is dirty, copy the page back to disk

– A simple way to remember this…

• If the parents (the page) leave a party then the children (cache blocks & TLB entries) must leave too

Page 46: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

46

Multiple Processes

• On a process context switch, can the TLB maintain its mappings?

– Can the TLB hold mappings from multiple processes?

• Recall that each process has its own virtual address space, page table, and translations

– Virtual addresses are not unique between processes

• How does the TLB handle a context switch?

– It can hold translations only for the current process and thus invalidate all entries on a context switch

– It can hold translations for multiple processes concurrently by concatenating a process or address-space ID (PID or ASID) to the VPN tag

(Diagram: each TLB entry's tag is the ASID — a unique ID for each process — concatenated with the VPN; the current ASID and the VPN of the VA are compared against every entry's tag.)
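A software model of an ASID-tagged, fully associative TLB lookup; the entry count, field widths, and names are illustrative rather than any real design:

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 64

struct tlb_entry {
    bool     valid;
    uint16_t asid;      /* address-space ID of the owning process */
    uint32_t vpn;       /* tag                                    */
    uint32_t ppfn;      /* physical page frame number             */
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Compare (asid, vpn) against every entry; on a hit return the PA.
 * No entries need to be flushed on a context switch, because entries
 * belonging to other processes simply fail the ASID comparison. */
static bool tlb_lookup(uint16_t cur_asid, uint32_t va, uint32_t *pa)
{
    uint32_t vpn = va >> 12, offset = va & 0xFFFu;
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].asid == cur_asid && tlb[i].vpn == vpn) {
            *pa = (tlb[i].ppfn << 12) | offset;
            return true;
        }
    }
    return false;   /* miss: walk the page table */
}
```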

Page 47: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

47

Page Sizes & Superpages

• For large contiguous chunks of virtual address space we can attempt to allocate a large contiguous physical chunk

– All the lower (2nd) level entries of the page table would point at a contiguous range

– Instead of the level-1 entry pointing at a level-2 page table, it holds the physical address translation itself

– A bit in the level-1 entry indicates whether it contains this physical translation or a pointer to the 2nd-level table

(Diagram: when a level-1 entry maps a superpage directly, the low 22 bits of the VA serve as the offset within it; e.g., VA 0x01001040 translates to PA 0x3a001040 within a superpage at 0x3a000000. Other level-1 entries still point at level-2 page tables for ordinary 4KB pages.)

x86 allows 2MB or 1GB superpages by skipping the last 1 or 2 levels of its 4-level page table scheme.

Page 48: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

48


Modern Processor Organizations

• Often times, separate TLB’s for instruction and data address translation (and data/instruction caches as well)

(Diagram: the same TLB + cache flow as before, but with separate ITLB/DTLB and separate L1 instruction and data caches on the processor chip.)

Page 49: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

49

Multiprocessor TLB Issues & Shootdown

• Assume 2 threads from each of 2 processes executing on 4 cores

• TLBs store copies of PTEs, but there is no HW support for "coherency"

• What if one thread needs to reduce the access privileges of a page?

– Conservatively, we need to invalidate the TLB entry in the other processors

• What if accesses in P2 cause eviction of pages whose translations are cached in the TLBs of P1?

– SHOOTDOWN: one processor interrupts another and indicates a TLB entry that should be invalidated

(Diagram: four cores, each with its own MMU and TLB cache — cores 1 and 2 running P1, cores 3 and 4 running P2 — all sharing the in-memory page directories and page tables.)

Page 50: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

50

IMPLICATIONS FOR DATA CACHES

Page 51: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

51

Cache Addressing with VM

• Review of caches

– Store copies of data, indexed by the address the data came from in main memory

– Simplified view: 2 steps to determine a hit

• Index: hash a portion of the address to find the "set" to look in

• Tag match: compare the remaining address bits to all entries in the set to determine a hit

– There is a sequential dependence between these two steps (index, then tag match)

• Rather than waiting for address translation and then performing this two-step hit process, can we overlap the translation and portions of the hit sequence?

– Yes, if we choose page size, block size, and set/direct mapping carefully

(Diagram: the address splits into tag, index/hash, and offset; the index selects a set of (addr, data) entries whose tags are then compared.)

Page 52: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

52

Virtual vs. Physical Addressed Cache

• Physically indexed, physically tagged (PIPT)

– Wait for full address translation

– Then use physical address for both indexing and tag comparison

• Virtually indexed, physically tagged (VIPT)

– Use a portion of the virtual address for indexing, then wait for address translation and use the physical address for tag comparison

– Easiest when the index portion of the virtual address lies within the page-offset bits; otherwise aliasing may occur

• Virtually indexed, virtually tagged (VIVT)

– Use virtual address for both indexing and tagging…No TLB access unless cache miss

– Requires invalidation of cache lines on context switch or use of process ID as part of tags

(Diagram: for PIPT both the set/block index and the tag come from the physical address; for VIPT the index comes from the virtual address and the tag from the physical address; for VIVT both come from the virtual address.)

Page 53: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

53

Virtual vs. Physical Addressed Cache

• Another view:

(Diagram: a virtually addressed cache vs. a physically addressed cache.)

In a modern system the L1 caches may be virtually addressed while L2 may be physically addressed.

Page 54: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

54

SOFTWARE PROTECTION

Page 55: CSCI 350 Ch. 8 Address Translation - USC Viterbi | Ming

55

Protection Without (With Less) HW

• Interpreted languages can perform checking before dereferencing

– Most such languages don't provide raw pointer features

• Intermediate code

– MS .NET: many languages (C#, VB, etc.) are compiled to intermediate byte-code and then run by an interpreter

– Java

• Program analysis

– Insert code to perform checks that prove the program cannot violate certain properties

– If a property is proven, we can remove the checks