Web Caching File System

Jonathan Ledlie, Matt McCormick

Outline

• Motivation: why design a new file system?
• Current state of affairs
• Design of web caching file system
• Performance comparison: WCFS to Unix
• Future work
• Conclusions

Two points

• Optimizing on invariants
• Impending I/O bottleneck

Motivation

• Disks are slow
• Communication rates increasing rapidly
• Web cache anomalies
  – only write to files when they are created
  – permissions stay constant for files
  – all files have copies at the original server

Web Caching File System vs. Unix File System

[Figure: bar chart comparing cfs and unix for create, read, and delete. X-axis: Operation; y-axis: Time, scale 0 to 1.2.]

50,000 of each operation

CFS is using one thread.

Unix file I/O is synchronous

500 MHz PIII, 8.5 GB disk

Current State of Affairs: Internet Topology

[Diagram: two client-side caches, each serving three clients, connect through the network to a server-side cache in front of the server.]

Current State of Affairs: Unix File System

• Life of file in web cache
  – create, write, close
  – open, read, close (multiple times)
  – delete
• Using i-nodes
  – lots of flexibility that is not needed
  – extra access to disk for each file reference
• Directory structure and name lookup

Design of WCFS: Specializations

• Life of file
  – create
  – read (multiple times)
  – delete
• No i-nodes or permanent file status data
  – faster create and file access
• In-memory hash table stores file locations
  – faster file lookup and delete
• All file data written to consecutive blocks
  – faster reads and writes
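
The in-memory hash table can be pictured with a short sketch. Everything named here is an assumption made for illustration: `FileTable`, `Extent`, `insert`, `lookup`, and `erase` do not appear in the slides, and `std::unordered_map` hashes the URL string with `std::hash` rather than the MD5 conversion the deck mentions for the real FileTable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Where a cached file lives on disk: a run of consecutive blocks.
// (Hypothetical record; the slides do not give the real layout.)
struct Extent {
    std::uint64_t start_block;  // first block of the file
    std::uint64_t length;       // file length in bytes
};

// Hypothetical in-memory file table: URL -> location on disk.
// No i-node is fetched from disk; a lookup is one hash probe in memory.
class FileTable {
public:
    // Record a newly created file. Returns false if the URL is already cached.
    bool insert(const std::string& url, Extent where) {
        assert(!url.empty());
        return table_.emplace(url, where).second;
    }

    // Find a file's location, or nullptr if it is not cached.
    // The pointer stays valid until that entry is erased.
    const Extent* lookup(const std::string& url) const {
        auto it = table_.find(url);
        return it == table_.end() ? nullptr : &it->second;
    }

    // Forget a file. Returns true if an entry was removed.
    bool erase(const std::string& url) { return table_.erase(url) > 0; }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_map<std::string, Extent> table_;
};
```

Because the table lives only in memory, a delete in this sketch touches no on-disk metadata at all, which is one plausible reading of the "faster file lookup and delete" bullet.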

Design of WCFS: Object Diagram

[Diagram: a CacheDisk hands out Cache objects (getNewCacheObject) and sits on top of a Disk; the Disk contains a FileTable, a BitMap, and a RequestQueue holding Request objects.]

Design of WCFS: Disk Initialization

• First create cache disk object
  – creates disk object to represent physical disk
  – starts a disk thread running
• Disk object and physical disk
  – utilize an SGI raw I/O patch for Linux
  – bypass kernel and kernel buffers

[Diagram: CacheDisk and Disk objects.]

Design of WCFS: Disk Object

• FileTable
  – stores names and locations of files on disk
  – MD5 conversion of URL
• RequestQueue
  – stores read and write requests from process threads
  – whenever anything is in the queue, the disk thread runs
• BitMap
  – keeps status of each block on disk
  – locates and marks spot on disk for files to be placed

[Diagram: Disk object containing FileTable, RequestQueue, and BitMap.]
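
One way a block bitmap might locate and mark a spot for a file is a first-fit scan for a run of free blocks. The sketch below is only an illustration under that assumption: the slides do not say which search policy WCFS uses, and the names `BitMap`, `allocate`, `release`, `in_use`, and `kNoSpace` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returned by allocate() when no run of free blocks is long enough.
const std::size_t kNoSpace = static_cast<std::size_t>(-1);

// Hypothetical block bitmap: one flag per disk block, true = in use.
class BitMap {
public:
    explicit BitMap(std::size_t num_blocks) : used_(num_blocks, false) {}

    // First-fit search for `count` consecutive free blocks. On success the
    // blocks are marked used and the index of the first one is returned;
    // otherwise kNoSpace is returned and nothing changes.
    std::size_t allocate(std::size_t count) {
        assert(count > 0);
        std::size_t run = 0;  // length of the current run of free blocks
        for (std::size_t i = 0; i < used_.size(); ++i) {
            run = used_[i] ? 0 : run + 1;
            if (run == count) {
                std::size_t start = i + 1 - count;
                for (std::size_t b = start; b <= i; ++b) used_[b] = true;
                return start;
            }
        }
        return kNoSpace;
    }

    // Mark `count` blocks starting at `start` as free again (file deleted).
    void release(std::size_t start, std::size_t count) {
        assert(start + count <= used_.size());
        for (std::size_t b = start; b < start + count; ++b) used_[b] = false;
    }

    bool in_use(std::size_t block) const { return used_.at(block); }

private:
    std::vector<bool> used_;
};
```

Allocating whole runs keeps each file on consecutive blocks, at the cost of fragmentation once files are deleted: total free space can exceed a request that still fails because no single run is long enough.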

Design of WCFS: Request Objects

• Request
  – write: starting block, length, buffer to write from
  – read: starting block, length, buffer to write to
  – (implies files must be smaller than virtual memory)
• Currently queued by FIFO (soon to be one-way elevator)

[Diagram: RequestQueue holding three Request objects.]
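
As a rough sketch of the planned scheduling change, the code below defines a request record with the fields listed on the slide and reorders a pending queue for a one-way elevator sweep: requests at or past the head position are served in ascending block order, then the head returns to the lowest block and sweeps upward again. The names `Op`, `Request`, `one_way_elevator`, and `head_block` are assumptions; the slides do not show the actual WCFS types or scheduler.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum class Op { Read, Write };

// Hypothetical request record following the slide: an operation, a starting
// block, a length, and the caller's buffer (source for writes, destination
// for reads).
struct Request {
    Op op;
    std::size_t start_block;
    std::size_t length;  // bytes to transfer
    char* buffer;        // the whole file must fit in this in-memory buffer
};

// Reorder pending requests for a one-way elevator sweep starting at
// `head_block`. Takes the queue by value and returns the new service order.
std::vector<Request> one_way_elevator(std::vector<Request> pending,
                                      std::size_t head_block) {
    // Split into "at or ahead of the head" and "behind the head".
    auto behind = std::stable_partition(
        pending.begin(), pending.end(),
        [head_block](const Request& r) { return r.start_block >= head_block; });

    auto by_block = [](const Request& a, const Request& b) {
        return a.start_block < b.start_block;
    };
    // Stable sorts keep arrival (FIFO) order among requests for the same block.
    std::stable_sort(pending.begin(), behind, by_block);
    std::stable_sort(behind, pending.end(), by_block);
    return pending;
}
```

With the head at block 40 and requests for blocks 50, 10, 90, 30, 70, this ordering serves 50, 70, 90 and then wraps to 10, 30, so the head only ever seeks in one direction during a sweep.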

Design of WCFS: Cache Objects for Threading

• Multiple threads for handling clients
• Each thread gets a single Cache object
• Cache object
  – create, read, remove, length, sync
• A thread's create and read calls are asynchronous
  – turned into request objects
  – placed in request queue for disk
• Thread calls sync to guarantee its operations are done

[Diagram: CacheDisk with three Cache objects.]
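
The asynchronous hand-off described above can be sketched with a mutex, two condition variables, and one worker thread. This is a simplified illustration, not the WCFS implementation: `DiskQueue`, `submit`, and `sync` are hypothetical names, each request is modeled as a plain callable instead of a read or write against a raw disk, and `sync` here waits for every outstanding request rather than only the calling thread's own.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the disk object: a queue of pending work and one
// disk thread that runs whenever the queue is non-empty.
class DiskQueue {
public:
    DiskQueue() : worker_([this] { run(); }) {}

    ~DiskQueue() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        wake_.notify_all();
        worker_.join();  // the worker drains the queue before exiting
    }

    // Asynchronous: enqueue the work and return to the caller immediately.
    void submit(std::function<void()> work) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            queue_.push_back(std::move(work));
            ++outstanding_;
        }
        wake_.notify_all();
    }

    // Block until every request submitted so far has completed.
    void sync() {
        std::unique_lock<std::mutex> lock(mu_);
        idle_.wait(lock, [this] { return outstanding_ == 0; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mu_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping and nothing left to do
            std::function<void()> work = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();
            work();  // the slow "disk" operation runs outside the lock
            lock.lock();
            if (--outstanding_ == 0) idle_.notify_all();
        }
    }

    std::mutex mu_;
    std::condition_variable wake_;  // wakes the disk thread
    std::condition_variable idle_;  // wakes callers blocked in sync()
    std::deque<std::function<void()>> queue_;
    std::size_t outstanding_ = 0;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};
```

A client thread would call `submit` for each create or read and continue serving its connection, calling `sync` only at the point where it needs the results to be on disk or in its buffer.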

Design of WCFS: Code Snippet

• Common web caching operations:
    create("url", buffer, size);
    read("url", buffer);
    remove("url");
    sync();

• Equivalent operations in Unix:
    fd = creat("url", permissions);
    write(fd, buffer, size);
    close(fd);

    fd = open("url", mode);
    read(fd, buffer, size);
    close(fd);

    unlink("url");
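
To make the system-call count on the Unix side concrete, the calls listed above can be wrapped into three helpers. This is a minimal sketch using standard POSIX calls; the helper names `unix_create`, `unix_read`, and `unix_remove` and the `0644` permission value are assumptions for illustration, and retrying on `EINTR` is left out for brevity.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Create a file and write its whole contents: creat + write + close.
// Returns true only if every byte was written and the close succeeded.
bool unix_create(const std::string& path, const char* buffer, std::size_t size) {
    int fd = creat(path.c_str(), 0644);
    if (fd < 0) return false;
    std::size_t done = 0;
    while (done < size) {
        ssize_t n = write(fd, buffer + done, size - done);
        if (n <= 0) break;  // error or no progress; give up
        done += static_cast<std::size_t>(n);
    }
    bool closed = (close(fd) == 0);
    return closed && done == size;
}

// Read up to `capacity` bytes: open + read + close.
// Returns the number of bytes read, or -1 on error.
ssize_t unix_read(const std::string& path, char* buffer, std::size_t capacity) {
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) return -1;
    std::size_t done = 0;
    while (done < capacity) {
        ssize_t n = read(fd, buffer + done, capacity - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0) break;  // end of file
        done += static_cast<std::size_t>(n);
    }
    close(fd);
    return static_cast<ssize_t>(done);
}

// Delete the file: unlink.
bool unix_remove(const std::string& path) {
    return unlink(path.c_str()) == 0;
}
```

Each helper makes at least three system calls for create and read, compared with the single call per operation in the web caching interface shown on the slide.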

Design of WCFS: Basic File System Layout

[Diagram: a CacheDisk hands out Cache objects (getNewCacheObject) and sits on top of a Disk; the Disk contains a FileTable, a BitMap, and a RequestQueue holding Request objects.]

Design of WCFS: Feature Recap

• Raw I/O
• Multi-threading
• Asynchronous I/O
• Quick name lookup
• File data on consecutive blocks

Performance Comparisons: Trace

[Figure: trace run, series cfs and ext2. X-axis: Operations, 0 to 14400; y-axis: Time (usec), 0 to 4500.]

Performance Comparisons: Create

[Figure: create performance, series ext2 - 1, ext2 - 2, cfs - 1, cfs - 2. X-axis: operations (x 100), 0 to 450; y-axis: milliseconds, 0 to 4500.]

Performance Comparisons: Read

[Figure: read performance, series ext2 - 1, ext2 - 2, ext2 - 3, ext2 - 4, cfs - 1, cfs - 2. X-axis: operations (x 100), 0 to 475; y-axis: milliseconds, 0 to 18000.]

Performance Comparisons: Delete

[Figure: delete performance, series ext2 - 1, ext2 - 2, cfs - 1, cfs - 2. X-axis: operations (x 100), 0 to 450; y-axis: milliseconds, 0 to 90000.]

Two points, revisited

• Optimizing on invariants
• Impending I/O bottleneck

What’s coming...

• Real raw I/O and proper memory alignment
• Testing with more threads
• Trace testing
• Determining optimal fragmentation and cleaning
• Is MD5 a bottleneck?
• Elevator algorithm
• Adding save on clean shutdown
• Examine memory requirements for FileTable

Conclusions

• Unix file system induces unnecessary overhead

• Possible to take advantage of application-specific traits

• Specialization works