A Case Study in Building Layered DHT Applications

Yatin Chawathe

Sriram Ramabhadran, Sylvia Ratnasamy, Anthony LaMarca, Scott Shenker, Joseph Hellerstein

Building distributed applications

  • Distributed systems are designed to be scalable, available, and robust
  • What about simplicity of implementation and deployment?
  • DHTs proposed as a simplifying building block
      • Simple hash-table API: put, get, remove
      • Scalable content-based routing, fault tolerance, and replication
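
The hash-table API named on this slide can be sketched as a single-process stand-in. This is only an illustration of the put/get/remove interface, not the interface of OpenDHT or any other real DHT: the class name InMemoryDHT and the choice to keep a list of values per key are assumptions made for the example.

```python
from collections import defaultdict


class InMemoryDHT:
    """Hypothetical single-process stand-in for a DHT's put/get/remove API."""

    def __init__(self):
        # Assumption: a key may hold several values, because independent
        # clients can put different values under the same key.
        self._store = defaultdict(list)

    def put(self, key, value):
        # Store the value under the key unless it is already present.
        if value not in self._store[key]:
            self._store[key].append(value)

    def get(self, key):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._store.get(key, []))

    def remove(self, key, value):
        # Remove one value; drop the key once it has no values left.
        values = self._store.get(key, [])
        if value in values:
            values.remove(value)
        if not values:
            self._store.pop(key, None)


dht = InMemoryDHT()
dht.put("ap:00:11:22:33:44:55", (47.66, -122.31))
print(dht.get("ap:00:11:22:33:44:55"))
dht.remove("ap:00:11:22:33:44:55", (47.66, -122.31))
print(dht.get("ap:00:11:22:33:44:55"))
```

A real DHT would hash each key to choose the responsible node and would replicate the stored values; the sketch leaves both out.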

Can DHTs help?

  • Can we layer complex functionality on top of unmodified DHTs?
  • Can we outsource the entire DHT operation to a third-party DHT service, e.g., OpenDHT?
  • Existing DHT applications fall into two classes
      • Simple unmodified DHT for rendezvous or storage, e.g., i3, CFS, FOOD
      • Complex apps that modify the DHT for enhanced functionality, e.g., Mercury, CoralCDN

Outline

  • Motivation
  • A case study: Place Lab
  • Range queries with Prefix Hash Trees
  • Evaluation
  • Conclusion

A Case Study: Place Lab

  • Positioning service for location-enhanced apps
      • Clients locate themselves by listening for known radio beacons (e.g., WiFi APs)
      • Database of APs and their known locations

[Figure: the Place Lab service computes maps of AP MAC address ↔ lat, lon. "War-drivers" submit neighborhood logs { lat, lon → list of APs }; clients download local WiFi maps { AP → lat, lon }.]

Why Place Lab

  • Developed by a group of ubicomp researchers
      • Not experts in system design and management
  • Centralized deployment since March 2004
      • Software downloaded by over 6000 sites
  • Concerns over organizational control, so decentralize the service
      • But want to avoid the implementation and deployment overhead of a distributed service

How DHTs can help Place Lab

  • Automatic content-based routing
      • Route logs by AP MAC address to appropriate Place Lab server
  • Robustness and availability
      • DHT managed entirely by third party
      • Provides automatic replication and failure recovery of database content

[Figure: "War-drivers" submit neighborhood logs; Place Lab servers compute AP location; the DHT provides storage and routing; clients download local WiFi maps.]
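
Content-based routing by AP MAC address can be sketched as follows. The sketch makes several assumptions that are not in the slides: the server names are hypothetical, the key is a SHA-1 hash of the MAC string, a key is mapped to a server by a simple modulo (a real DHT assigns key ranges to nodes and repairs the mapping when nodes fail), and the AP location is estimated as the mean of the positions at which the AP was heard.

```python
import hashlib
from collections import defaultdict

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names


def dht_key(mac):
    """Hash an AP MAC address into a 160-bit integer key."""
    digest = hashlib.sha1(mac.lower().encode()).digest()
    return int.from_bytes(digest, "big")


def responsible_server(mac):
    # Simplification: key modulo server count stands in for DHT routing.
    return SERVERS[dht_key(mac) % len(SERVERS)]


def route_logs(readings):
    """Group (mac, lat, lon) readings by the server responsible for each AP."""
    inbox = defaultdict(lambda: defaultdict(list))
    for mac, lat, lon in readings:
        inbox[responsible_server(mac)][mac.lower()].append((lat, lon))
    return inbox


def estimate_location(positions):
    # Assumption: the AP sits at the mean of the positions where it was heard.
    lats, lons = zip(*positions)
    return sum(lats) / len(lats), sum(lons) / len(lons)


readings = [
    ("00:11:22:33:44:55", 1.0, 2.0),
    ("00:11:22:33:44:55", 3.0, 4.0),
    ("66:77:88:99:AA:BB", 5.0, 6.0),
]
for server, aps in route_logs(readings).items():
    for mac, positions in aps.items():
        print(server, mac, estimate_location(positions))
```

Because the server is a function of the MAC address alone, every reading for a given AP reaches the same server, which can then compute that AP's location without coordinating with the others.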

Downloading WiFi Maps

  • Clients perform geographic range queries
      • Download segments of the database, e.g., all access points in Philadelphia
  • Can we perform this entirely on top of an unmodified third-party DHT?
      • DHTs provide exact-match queries, not range queries

[Figure: same architecture as the previous slide, with a question mark on the DHT storage and routing layer.]

Supporting range queries

  • Prefix Hash Trees
      • Index built entirely with put, get, remove primitives
      • No changes to DHT topology or routing
  • Binary tree structure
      • Node label is a binary prefix of values stored under it
      • Nodes split when they get too big
  • Stored in DHT with node label as key
      • Allows for direct access to interior and leaf nodes

[Figure: example PHT over 4-bit keys. Nodes: R; R0, R1; R00, R01, R10, R11; R010, R011, R110, R111. Stored keys: 0 (0000), 3 (0011), 4 (0100), 5 (0101), 6 (0110), 8 (1000), 12 (1100), 13 (1101), 14 (1110), 15 (1111).]
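
A minimal sketch of the structure described above, under stated assumptions: a plain dict stands in for the DHT (a dict lookup plays the role of get, an assignment the role of put), node labels are bit strings without the leading "R", and the leaf capacity of 2 is chosen only to match the example keys. The node layout and helper names are hypothetical. The leaf is found here by walking down from the root for brevity.

```python
BLOCK_SIZE = 2  # assumed leaf capacity before a split
KEY_BITS = 4    # key length in bits, as in the example tree

# Stand-in for the DHT: node label -> {"leaf": bool, "keys": list}.
dht = {"": {"leaf": True, "keys": []}}


def bits(key):
    """Render an integer key as a fixed-width bit string."""
    return format(key, f"0{KEY_BITS}b")


def find_leaf(key_bits):
    # Walk down from the root until a leaf is reached.
    label = ""
    while not dht[label]["leaf"]:
        label = key_bits[: len(label) + 1]
    return label


def insert(key):
    label = find_leaf(bits(key))
    if key in dht[label]["keys"]:
        return
    dht[label]["keys"].append(key)
    # Split while the leaf is over capacity and can still be subdivided.
    while len(dht[label]["keys"]) > BLOCK_SIZE and len(label) < KEY_BITS:
        keys = dht[label]["keys"]
        for bit in "01":
            child = label + bit
            members = [k for k in keys if bits(k).startswith(child)]
            dht[child] = {"leaf": True, "keys": members}
        dht[label] = {"leaf": False, "keys": []}
        # Continue with the fuller child in case it is still too big.
        label = max((label + "0", label + "1"),
                    key=lambda c: len(dht[c]["keys"]))


for k in [0, 3, 4, 5, 6, 8, 12, 13, 14, 15]:
    insert(k)

leaves = {label: node["keys"] for label, node in dht.items() if node["leaf"]}
print(sorted(leaves.items()))
```

With these ten keys and a capacity of 2, the leaves end up at labels 00, 010, 011, 10, 110, and 111, which is the leaf set of the example tree. Because every node is stored under its own label, any node can be fetched with a single get without traversing from the root.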

PHT operations

  • Lookup(K)
      • Find leaf node whose label is prefix of K
      • Binary search across K's bits
      • O(log log D) where D = size of key space
  • Insert(K, V)
      • Lookup leaf node for K
      • If full, split node into two
      • Put value V into leaf node
  • Query(K1, K2)
      • Lookup node for P, where P = longest common prefix of K1, K2
      • Traverse subtree rooted at node for P

[Figure: the example PHT from the previous slide, followed by two further panels. One lists the labels R, R1, R11, R110, R1101 for key 13 (1101); the other shows the subtree R01 with children R010 and R011 holding keys 4 (0100), 5 (0101), and 6 (0110).]
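
The lookup and query steps above can be sketched against the example tree. Assumptions: a dict stands in for the DHT (each TREE.get call represents one DHT get), labels omit the leading "R", and the node encoding and helper names are hypothetical. When no node exists for the common prefix P, the sketch falls back to the leaf that covers K1.

```python
KEY_BITS = 4

# Stand-in for the DHT holding the example tree: label -> (is_leaf, keys).
TREE = {
    "": (False, []), "0": (False, []), "1": (False, []),
    "01": (False, []), "11": (False, []),
    "00": (True, [0, 3]), "010": (True, [4, 5]), "011": (True, [6]),
    "10": (True, [8]), "110": (True, [12, 13]), "111": (True, [14, 15]),
}


def bits(key):
    return format(key, f"0{KEY_BITS}b")


def lookup(key):
    """Binary search over prefix lengths for the leaf whose label prefixes key."""
    kb, lo, hi = bits(key), 0, KEY_BITS
    while lo <= hi:
        mid = (lo + hi) // 2
        node = TREE.get(kb[:mid])   # one DHT get per probe
        if node is None:
            hi = mid - 1            # no such node: the leaf is shallower
        elif node[0]:
            return kb[:mid]         # found the leaf
        else:
            lo = mid + 1            # interior node: the leaf is deeper
    return None


def covers(label, k1, k2):
    """True if the key range below label overlaps [k1, k2]."""
    low = int(label.ljust(KEY_BITS, "0"), 2)
    high = int(label.ljust(KEY_BITS, "1"), 2)
    return low <= k2 and high >= k1


def query(k1, k2):
    b1, b2 = bits(k1), bits(k2)
    n = 0
    while n < KEY_BITS and b1[n] == b2[n]:
        n += 1
    start = b1[:n]                  # longest common prefix P
    if start not in TREE:
        start = lookup(k1)          # P lies below a leaf; use that leaf
    results, stack = [], [start]
    while stack:
        label = stack.pop()
        is_leaf, keys = TREE[label]
        if is_leaf:
            results.extend(k for k in keys if k1 <= k <= k2)
        else:
            stack.extend(c for c in (label + "0", label + "1")
                         if covers(c, k1, k2))
    return sorted(results)


print(lookup(13))    # 110
print(query(4, 6))   # [4, 5, 6]
```

Looking up 13 (1101) probes the prefix of length 2 (an interior node) and then length 3 (the leaf), so it needs two gets instead of a walk from the root. The query for 4 to 6 starts at node 01, matching the subtree highlighted in the figure.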

2-D geographic queries

  • Convert lat/lon into 1-D key
      • Use z-curve linearization
      • Interleave lat/lon bits to create z-curve key
  • Linearized query results may not be contiguous
      • Start at longest prefix subtree
      • Visit child nodes only if they can contribute to query result

[Figure: 8×8 latitude/longitude grid (axes 0–7) with the points (2,4), (2,5), (3,5), (3,6), (3,7), (0,4), (1,5), (1,0), (0,7), (1,6), (1,7), and the corresponding PHT rooted at P (= R000…00) with nodes P0, P1; P00, P01, P10, P11; P010, P011, P100, P101, P110, P111; P0100, P0101, P0110, P0111, P1100, P1101. Example: (5, 6) = (0101, 0110) has z-curve key 00110110 (54).]
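
The bit interleaving can be sketched as below. The function names are hypothetical, and the bit order (the first coordinate's bit ahead of the second's at each position) is chosen so that the slide's example (5, 6) yields 00110110. The last two helpers illustrate the pruning rule: a node label, read as a z-curve prefix, fixes the leading bits of both coordinates and therefore bounds a rectangle, and a child needs visiting only if that rectangle overlaps the query rectangle.

```python
def z_encode(x, y, bits=4):
    """Interleave the bits of x and y, most significant first, x before y."""
    key = 0
    for i in range(bits - 1, -1, -1):
        key = (key << 1) | ((x >> i) & 1)
        key = (key << 1) | ((y >> i) & 1)
    return key


def z_decode(key, bits=4):
    """Invert z_encode: split the interleaved bits back into (x, y)."""
    x = y = 0
    for i in range(bits - 1, -1, -1):
        x = (x << 1) | ((key >> (2 * i + 1)) & 1)
        y = (y << 1) | ((key >> (2 * i)) & 1)
    return x, y


def prefix_rect(prefix, bits=4):
    """Bounding ranges ((x_lo, x_hi), (y_lo, y_hi)) of keys under a prefix."""
    xb, yb = prefix[0::2], prefix[1::2]
    x_range = (int(xb.ljust(bits, "0"), 2), int(xb.ljust(bits, "1"), 2))
    y_range = (int(yb.ljust(bits, "0"), 2), int(yb.ljust(bits, "1"), 2))
    return x_range, y_range


def may_contribute(prefix, x_lo, x_hi, y_lo, y_hi, bits=4):
    """True if the prefix's rectangle overlaps the query rectangle."""
    (px_lo, px_hi), (py_lo, py_hi) = prefix_rect(prefix, bits)
    return px_lo <= x_hi and px_hi >= x_lo and py_lo <= y_hi and py_hi >= y_lo


key = z_encode(5, 6)
print(format(key, "08b"), key)   # 00110110 54
print(z_decode(key))             # (5, 6)
print(prefix_rect("0011"))       # ((4, 7), (4, 7))
```

Points that are close in two dimensions usually share a long key prefix, but a rectangular query can still map to several disjoint key ranges, which is why the traversal checks each child rather than scanning one contiguous range.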

PHT Visualization

Ease of implementation and deployment

  • 2,100 lines of code to hook Place Lab into underlying DHT service
      • Compare with 14,000 lines for the DHT
  • Runs entirely on top of deployed OpenDHT service
      • DHT handles fault tolerance and robustness, and masks failures of Place Lab servers

Flexibility of DHT APIs

  • Range queries use only the get operation
  • Updates use combination of put, get, remove
  • But…
      • Concurrent updates can cause inefficiencies
      • No support for concurrency in existing DHT APIs
  • A test-and-set extension can be beneficial to PHTs and a range of other applications
      • put_conditional: perform the put only if value has not changed since previous get
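
One way to picture the proposed put_conditional is as an optimistic, versioned write. The slide gives only its semantics (the put succeeds only if the value has not changed since the caller's previous get); the version counter, the class name VersionedDHT, and the retry helper below are assumptions made for illustration and do not describe an existing DHT API.

```python
class VersionedDHT:
    """Hypothetical in-memory store with a test-and-set style put."""

    def __init__(self):
        self._store = {}  # key -> (version, value)

    def get(self, key):
        # Return (version, value); version 0 means the key is unset.
        return self._store.get(key, (0, None))

    def put_conditional(self, key, value, expected_version):
        # Perform the put only if nobody wrote since the caller's get.
        version, _ = self._store.get(key, (0, None))
        if version != expected_version:
            return False
        self._store[key] = (version + 1, value)
        return True


def append_with_retry(dht, key, item, max_tries=5):
    """Read-modify-write loop that retries when a concurrent write wins."""
    for _ in range(max_tries):
        version, value = dht.get(key)
        if dht.put_conditional(key, (value or []) + [item], version):
            return True
    return False


dht = VersionedDHT()
v_a, _ = dht.get("leaf:110")   # client A reads
v_b, _ = dht.get("leaf:110")   # client B reads the same version
print(dht.put_conditional("leaf:110", [12], v_a))   # True: A wins
print(dht.put_conditional("leaf:110", [13], v_b))   # False: B is stale
print(append_with_retry(dht, "leaf:110", 13))       # True after re-reading
print(dht.get("leaf:110"))                          # (2, [12, 13])
```

For a PHT, such a primitive would let two clients that try to split or update the same leaf detect the conflict and retry, rather than silently overwriting each other.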

PHT insert performance

  • Median insert latency is 1.45 sec
      • Without caching = 3.25 sec; with caching = 0.76 sec

PHT query performance

  Data size    Latency (sec)
  5k           2.13
  10k          2.76
  50k          3.18
  100k         3.75

  • Queries on average take 2–4 seconds
  • Varies with block size
      • Smaller (or very large) block size implies longer query time

Conclusion

  • Concrete example of building complex applications on top of vanilla DHT service
  • DHT provides ease of implementation and deployment
  • Layering allows inheriting of robustness, availability, and scalable routing from DHT
      • Sacrifices performance in return