CSE 451 Section
February 24, 2000
Project 3 – VM
Outline
• Project 2 questions?
• Project 3 overview
• File systems & disks
Project 2 questions?
• I hope to have them graded by next Friday
Project 3 – VM
• Core of the project – adding VM to nachos
• Currently:
  – Linear page table
  – No demand paging
    • If a page is not in memory, it never will be
    • All files are loaded off disk in advance
  – No TLB: all memory accesses hit the page table
VM in Nachos
• For the project:
  – Implement a TLB
    • Compile with –DUSE_TLB
    • This makes Machine::Translate consult the TLB, not the page table, to translate
    • Raises a page fault (not a TLB fault) on a TLB miss
    • The TLB ignores invalid entries
    • The TLB must handle context switches
      – Can invalidate the TLB
      – Can add an address-space ID to the TLB
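The two context-switch strategies above can be sketched as follows. The struct layout and function names here are illustrative assumptions, not Nachos's actual interface (the real TLB lives in the machine emulation code):

```cpp
#include <cassert>

const int TLB_SIZE = 4;

// Hypothetical TLB entry; Nachos's real one differs.
struct TlbEntry {
    bool valid;
    int  asid;   // address-space ID, so entries survive context switches
    int  vpn;    // virtual page number
    int  ppn;    // physical page number
};

TlbEntry tlb[TLB_SIZE];

// Returns the physical page, or -1 on a miss (the machine would then
// raise a page fault, which the OS handles by refilling the TLB).
int TlbLookup(int asid, int vpn) {
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].vpn == vpn)
            return tlb[i].ppn;
    return -1;
}

// Alternative to ASIDs: flush every entry on a context switch.
void TlbFlush() {
    for (int i = 0; i < TLB_SIZE; i++)
        tlb[i].valid = false;
}
```

With ASIDs, two address spaces can keep entries in the TLB at once; with flushing, every switch costs a round of refill faults.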
For the project:
• Implement inverted page tables
  – One PTE for each physical page, referring to the virtual page it contains
  – Use a hash on the virtual page number to find the PTE
  – Need a separate data structure for out-of-memory virtual pages
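A minimal sketch of the hash-based lookup, assuming linear probing on collisions; every name here is hypothetical, not part of Nachos:

```cpp
#include <cassert>

const int NUM_FRAMES = 8;

// One entry per physical frame, recording which virtual page it holds.
struct FrameEntry {
    bool inUse;
    int  asid;   // which address space
    int  vpn;    // which virtual page
};

FrameEntry frames[NUM_FRAMES];

int Hash(int asid, int vpn) {
    return (asid * 31 + vpn) % NUM_FRAMES;
}

// Probe linearly from the hash slot; returns the frame holding
// (asid, vpn), or -1 if the page is not resident.
int FindFrame(int asid, int vpn) {
    int start = Hash(asid, vpn);
    for (int i = 0; i < NUM_FRAMES; i++) {
        int f = (start + i) % NUM_FRAMES;
        if (frames[f].inUse && frames[f].asid == asid && frames[f].vpn == vpn)
            return f;
    }
    return -1;  // not in memory: consult the per-space disk map
}
```

A miss here is exactly why the separate out-of-memory structure is needed: the inverted table can only describe pages that currently occupy a frame.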
For the project (2):
• Implement VM allocation routines
  – Dynamically extend the virtual address space
  – Reallocate memory without copying data
    • Just update the page table so the new virtual pages point to the same physical pages
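The copy-free reallocation can be sketched as a pure page-table remap; this assumes a simple linear table, and the names are invented for illustration:

```cpp
#include <cassert>
#include <vector>

// pageTable[vpn] = physical page number, or -1 if unmapped
std::vector<int> pageTable(16, -1);

// "Move" nPages of virtual memory from oldVpn to newVpn by copying
// PTEs, not data: the physical frames stay where they are.
void RemapRegion(int oldVpn, int newVpn, int nPages) {
    for (int i = 0; i < nPages; i++) {
        pageTable[newVpn + i] = pageTable[oldVpn + i];
        pageTable[oldVpn + i] = -1;   // unmap the old range
    }
}
```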
For the project (3)
• Implement demand-paged VM
  – Allow programs to use more VM than physical memory
  – Swap pages out to disk, reload when necessary
  – Optional: swap executable pages back to their original files
  – You will need:
    • A data structure that records, for each virtual page of an address space that is not in physical memory, where it is stored on disk
    • For pages in memory, where they came from on disk
  – For page replacement, you can remove pages from other address spaces
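The fault path ties those two data structures together. A toy sketch, with eviction and the actual disk I/O elided and all names invented:

```cpp
#include <cassert>
#include <map>

// vpn -> sector on the backing store, for pages NOT in memory
std::map<int, int> swapMap;
// vpn -> physical frame, for pages that ARE in memory
std::map<int, int> resident;

int nextFreeFrame = 0;

// On a fault: pick a frame (a real handler may have to evict a page,
// possibly from another address space), read the page in from
// swapMap[vpn], and mark it resident.
int HandlePageFault(int vpn) {
    int frame = nextFreeFrame++;   // simplification: never runs out
    // (a real handler would read the page from disk sector swapMap[vpn])
    resident[vpn] = frame;
    swapMap.erase(vpn);            // the page is now in memory
    return frame;
}
```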
For the project (4)
• Optionally: swap pages from executable code back to their source files
  – Must make executable pages read-only for this
  – Must load data onto separate pages from code, or else not swap pages that mix code and data back to the executable file
  – How?
    • Implement segments: for a region of VM, record what file it came from
The project
• New this time: design review
  – Next week, I want to meet with each group to discuss the design
    • Page table data structures
    • TLB filling algorithm
    • Page replacement algorithm
    • Backing store management
    • Schedule for completion
Project questions?
File Systems – advanced topics
• Journaling file systems
• Log-structured file systems
• RAID
Journaling File Systems
• Ordering of writes is a problem
  – Consider adding a file to a directory
  – Do you update the directory first, or the file first?
  – During a crash, what happens?
  – What happens if the disk controller re-orders the updates, so that the file is created before the directory?
Journaling File Systems (2)
• Standard approach:
  – Enforce a strict ordering that can be recovered
    • E.g. update directories / inode map, then create the file
    • Can scan during reboot to see if directories contain unused inodes – use fsck to check the file system
  – Journaling
    • Treat updates to metadata as transactions
    • Write a log containing the changes before they happen
    • On reboot, just restore from the log
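The write-ahead discipline can be illustrated with a toy redo log. Here the in-memory containers stand in for on-disk structures, and every name is invented:

```cpp
#include <cassert>
#include <map>
#include <vector>

struct LogRecord { int inode; int value; };

std::vector<LogRecord> journal;   // stands in for the on-disk log
std::map<int, int> metadata;      // stands in for in-place metadata

// Rule: the log record is durable BEFORE the in-place update.
void Update(int inode, int value) {
    journal.push_back({inode, value});  // 1. append to the log first
    metadata[inode] = value;            // 2. then update in place
}

// On reboot, redo everything in the log; replaying is idempotent,
// so re-applying an update that already made it to disk is harmless.
void Recover() {
    for (const LogRecord &r : journal)
        metadata[r.inode] = r.value;
}
```

If a crash wipes out the in-place update, the log still has it, so recovery restores it without a full fsck scan.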
Journaling File Systems (3)
• How is the journal stored?
  – Use a separate file, or a separate portion of the disk
  – Circular log
  – Need to remember where the log begins – the restart area
Log-Structured File Systems
• Normal file systems try to optimize how data is laid out
  – Blocks in a single file are close together
  – Rewriting a file updates existing disk blocks
• What does this help?
  – Reads?
  – Writes?
    • Reads are faster, but a cache also helps reads, so this may not matter much
    • Caches don’t help writes as much
LFS (2)
• Log-Structured File System
  – Structure the disk into large segments
    • Large: seek time << read time
  – Always write to the current segment
  – Rewritten files are not written in place
    • New blocks are written to a new segment
  – Inode map written to the latest segment & cached in memory
    • Contains the latest location of each file block
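A toy sketch of the append-only write path and inode map, assuming fixed-size segments; the names and structure are invented for illustration:

```cpp
#include <cassert>
#include <map>
#include <utility>

const int SEGMENT_SIZE = 4;   // blocks per segment (tiny, for illustration)

int curSegment = 0;
int curOffset  = 0;

// Inode map: block id -> (segment, offset) of its LATEST copy.
std::map<int, std::pair<int, int>> inodeMap;

// Every write, including a rewrite, appends to the current segment;
// the old copy of a rewritten block simply becomes garbage for the
// cleaner to reclaim later.
void WriteBlock(int blockId) {
    if (curOffset == SEGMENT_SIZE) {   // segment full: start a new one
        curSegment++;
        curOffset = 0;
    }
    inodeMap[blockId] = {curSegment, curOffset++};
}
```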
LFS (3)
• What happens when the disk fills up?
  – Need to garbage collect: clean up segments where some blocks have been overwritten
  – Traditionally: read in a number of segments, write just the useful blocks into new segments
  – Discovery:
    • Better to clean cold segments (few writes) than hot segments
      – Cold segments are less likely to be rewritten soon, so the cleaning lasts longer
    • Can do hole-filling
      – If some space is available in one segment, and another segment is mostly empty, move blocks between segments
RAID – Redundant Arrays of Inexpensive Disks
• Disks are slow:
  – 10 ms seek time, 5 ms rotational delay
  – Bigger disks aren’t any faster than small disks
  – Disk bandwidth is limited by rotational delay – can only read one track per rotation
• Solution:
  – Have multiple disks
RAID (2)
• Problems with multiple disks
  – They are smaller
  – They are slower
  – They are less reliable
  – The chances of a failure rise linearly: for 10 disks, failures are 10 times more frequent
RAID (3)
• Solution:
  – Break the set of disks into reliability groups
  – Each group has extra “check” disks containing redundant information
  – Failed disks are replaced quickly; lost information is restored
RAID – Organization of data
• Different apps have different needs
  – Databases do a lot of read/modify/write – small updates
    • Can do read/modify/write on a smaller number of disks, increasing parallelism
  – Supercomputers read or write huge quantities of data:
    • Can do reads or writes in parallel on all disks to increase throughput
RAID levels
• RAID has six standard layouts (levels 0-5) plus newer ones, 6 and 7
• Each level has different performance, cost, and reliability characteristics
• Level 0: no redundancy
  – Data is striped across disks
  – Consecutive sectors are on different disks, so they can be read in parallel
  – Small reads/writes hit one disk
  – Large reads/writes hit all disks – increased throughput
• Drawback: decreased reliability
RAID levels (1)
• Level 1: mirrored disks
  – Duplicate each disk
  – Writes go to both disks
  – Reads can go to either disk
  – Expensive, but predictable: recovery time is the time to copy one disk
RAID levels (2)
• Level 2: Hamming codes for ECC
  – Have some number of data disks (10?), and enough parity disks to locate an error (log # data disks)
  – Store ECC per sector, so small writes require reads/writes of all disks
  – Reads and writes hit all disks – decreases response time
  – Reads happen from all disks in parallel – increases throughput
RAID levels (3)
• Level 3: single check disk per group
  – Observation: most of the Hamming code is used to determine which disk failed
  – Disk controllers already know which disk failed
  – Instead, store a single parity disk
  – Benefit – lower overhead, increased reliability (fewer total disks)
RAID levels (4)
• Level 4: independent reads/writes
  – Don’t have to read/write from all disks
  – Interleave data by sector, not by bit
  – The check disk contains parity across several sectors, not bits of one sector
  – Parity is updated with XOR
    • Writes update two disks, reads go to one disk
    • All parity is stored on one disk – it becomes a bottleneck
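The XOR update is why a small write touches only two disks: the new parity can be computed from the old data and old parity alone, with no need to read the rest of the stripe. A minimal sketch:

```cpp
#include <cassert>
#include <cstdint>

// RAID-4/5 small-write parity update: read the old data block and old
// parity block, XOR both with the new data, write data and parity back.
uint8_t UpdateParity(uint8_t oldParity, uint8_t oldData, uint8_t newData) {
    return oldParity ^ oldData ^ newData;
}
```

XORing out the old data cancels its contribution to the parity, and XORing in the new data adds the new contribution, so the result equals the parity of the full updated stripe.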
RAID levels (5)
• Level 5: remove the bottleneck of the check disk
  – Interleave parity sectors onto the data disks
    • Sector one – parity disk = 5
    • Sector two – parity disk = 4
    • Etc.
  – No single bottleneck
  – Reads hit one disk
  – Writes hit two disks
  – Drawback: recovery of a lost disk requires scanning all disks
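The rotating placement on the slide (first stripe's parity on disk 5, the next on disk 4, and so on, wrapping around) can be written as a one-line function; the name is illustrative, and stripes are numbered from 0 here:

```cpp
#include <cassert>

// Parity disk for a given stripe on numDisks disks, rotating downward
// from the last disk: stripe 0 -> disk numDisks-1, stripe 1 -> numDisks-2,
// ..., then wrapping. Every disk holds an equal share of the parity.
int ParityDisk(int stripe, int numDisks) {
    return numDisks - 1 - (stripe % numDisks);
}
```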
What next?
• Petal – block storage separate from the file system
• Active Disks – put a CPU on each disk
  – Current bottleneck is bus bandwidth
  – Can do computation at the disk:
    • Searching for data
    • Sorting data
    • Processing (e.g. filtering)