37
MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

Embed Size (px)

Citation preview

Page 1: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Consistent Hashing:Load Balancing in a Changing World

David Karger, Eric Lehman,

Tom Leighton, Matt Levine,

Daniel Lewin, Rina Panigrahy

Page 2: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Load Balancing

Task: distribute items into buckets» Data to memory locations» Files to disks» Tasks to processors» Web pages to caches (our motivation)

Goal: even distribution

Page 3: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

World Wide Web

Servers

Browsers(clients)

LCS

ATTCMU

IBMX

Page 4: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Hot Spots

LCS

ATTCMU

IBMXTHORCILK

W3C

PDOS

Page 5: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Temporary Loads

For permanent loads, use bigger server Must also deal with “flash crowds”

» IBM chess match» NASA

Inefficient to design for max load» rarely attained» much capacity wasted

Better to offload peak load elsewhere

Page 6: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

ATT

LCS

Proxy Caches Balance Load

CMU

IBMXTHOR

CILK

W3C

PDOS

Page 7: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Proxy Caching

Old: server hit once for each browser New: server hit once for each page Adapts to changing access patterns

Page 8: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Proxy Caching

Every server can also be a cache Provides a social good Reduces load at sites you want to contact Costs you little, if done right

» few accesses» small amount of storage (times many servers)

Page 9: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Who Caches What?

Each cache should hold few items» else cache gets swamped by clients

Each item should be in few caches» else server gets swamped by caches» and cache invalidations/updates expensive

Browser must know right cache» could ask server for redirect» server gets swamped by redirect requests

Page 10: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Hashing

Simple and powerful load balancing Constant time to find bucket for item Example map to n buckets. Pick a,b:

y=ax+b (mod n) Intuition: hash maps each item to one

random bucket» no bucket gets many items

Page 11: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Adding Caches

Suppose a new cache arrives. How work it into hash function? Natural change:

y=ax+b (mod n+1) Problem: changes bucket for every item

» every cache will be flushed» servers get swamped with new requests

Goal: when add bucket, few items move

Page 12: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Inconsistent Views

Each client knows about a different set of caches: its view

View affects choice of cache for item With many views, each cache will be

asked for item Item in all caches---swamps server

Goal: item in few caches despite views

Page 13: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Inconsistent Views

LCS

caches0 321

ax+b (mod 4) = 2

My view

Page 14: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Inconsistent Views

LCS

caches0 321

ax+b (mod 4) = 2

John’s view

Page 15: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Inconsistent Views

LCS

caches0 321

ax+b (mod 4) = 2

Dave’s view

Page 16: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Inconsistent Views

LCS

caches0 321

ax+b (mod 4) = 2

Al’s view

Page 17: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Inconsistent Views

LCS

caches0 321

ax+b (mod 4) = 2

Mike’s view

Page 18: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Problem: Inconsistent Views

LCS

caches22222

Page 19: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Consistent Hashing

A new kind of hash function Maps any item to a bucket in my view Computable in constant time, locally

» 1 standard hash function Adding bucket to view takes logarithmic time

» logarithmic number of standard hash functions Handles incremental and inconsistent views

Page 20: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Single View Properties

Balance: all buckets get roughly same number of items (like standard hashing)

Smooth: when an kth bucket is added, only a 1/k fraction of the items move» and only from O(log n) servers» minimum needed to preserve balance

Page 21: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Multiple View Properties

Consider n views, each of an arbitrary constant fraction of the buckets

Load: number of items a bucket gets from all views is O(log n) times average» Despite views, load balanced

Spread: over all views, each item appears in O(log n) buckets» Despite views, few caches for each item

Page 22: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Implementation

Use standard hash function H to map buckets, items to unit interval» “random” points spread uniformly

BucketItem

Page 23: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Implementation

Use standard hash function H to map buckets, items to unit interval» “random” points spread uniformly

Item assigned to nearest bucket

BucketItem

Page 24: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Computation Cost

Bucket positions precomputed To hash item

» compute H» find nearest bucket point

O(log n) time using binary search Constant time with auxiliary hash table

Page 25: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Balance

Bucket points uniformly distributed by H

Each bucket “owns” equal portion of line Item position “random” by H So item equally likely to be at any bucket So all buckets get about same # items

Bucket

Page 26: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Smoothness

To add a kth bucket, hash it to line

Old bucket

Page 27: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Smoothness

To add a kth bucket, hash it to line

New bucketOld bucket

Page 28: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Smoothness

To add a kth bucket, hash it to line

Captures items nearest to it» only 1/k fraction of total items

» only from 2 other buckets (on each side)

New bucketOld bucket

Page 29: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Low Spread

Some views might not see nearest bucket to an item, hash it elsewhere

But every view will have a bucket near the item (by “random” placement)

So only buckets near the item will ever have to hold it

But only a few buckets are near the item (by “random” placement)

Page 30: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Low Load

Bucket only gets item I if no other bucket is closer to I

Under any view, some bucket is close to I» by “random” placement of buckets

So a bucket only gets items close to it But an item is unlikely to be close So bucket doesn’t get many items

Page 31: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Summary: Consistent Hashing

Trivial to implement (20 lines code)» don’t try this at home!

Fast to compute (no hidden constants) Uniformly distributes items Can cheaply add/remove buckets Even with multiple views

» no bucket gets many items» each item in only a few buckets

Page 32: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Caching

Consistent hashing good for caching» client maps known caches to unit interval» when item arrives, hash to cache» server gets O(log n) requests for its own pages

Each server can also be a cache» gets small number of requests for others’ pages

Consistent hashing is robust» caches can come and go» different browsers can know different caches

Page 33: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Refinements

Browser bandwidth to machines may vary If bandwidth to server high, unwilling to use

low bandwidth cache Consistently hash item only to caches with

bandwidth as good as server Theorem: all previous properties still hold

» uniform cache loads» low server loads (few caches per item)

Page 34: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Fault Tolerance

Suppose contacted cache is down Delete from bucket set, find next closest

bucket in consistent hashing interval Just a small change in view… Even with many failures, previous

properties (uniform low load, etc) still hold.

Page 35: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Hot pages

What if one page gets popular?» cache responsible for it gets swamped

Use tree of caches (e.g. harvest, DNS)?» cache at root gets swamped

Use a different tree for each page» build using consistent hashing

Balances load for hot pages and servers

Page 36: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Main Result

Using cache trees of logarithmic depth, for any set of page accesses, can adaptively balance load such that every server gets at most log times average load of system (browser/server ratio).

(assorted theory caveats)

Page 37: MIT Consistent Hashing: Load Balancing in a Changing World David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, Rina Panigrahy

MIT

Conclusion

Consistent hashing balances load Simple enough to be practical (?) We’d like to build distributed web cache

» need help of good systems student Looking for other applications

» DNS lookup? » PICS? URN resolution?» NOWs? Distributed file systems?