CloudantGeo: Scaling geospatial for the masses Mike Miller CoFounder, Chief Scientist

Seattle Scalability Meetup



Presentation from the Seattle Scalability meet up on Cloudant Geospatial


Page 1: Seattle Scalability Meetup

CloudantGeo: Scaling geospatial for the masses

Mike Miller, Co-Founder, Chief Scientist

Page 2: Seattle Scalability Meetup

Mike Miller, 2013/02/27 2

{Problems: [‘Mobile’, ‘Data’]}

Mobile Big Data

These break our existing models for computing

Page 3: Seattle Scalability Meetup


{Introductions: ‘Cloudant’}

Database that ships with a mobile strategy

Page 4: Seattle Scalability Meetup


{Introductions: ‘Cloudant’}

• Distributed Database as a Service (DBaaS)
• For developers of high-velocity web and mobile apps
• Venture Funded, YCombinator
• Ends drudgery of SQL and scale-it-yourself NoSQL
• 13,000+ users
• Founded by big data scientists
• Speaks CouchDB API


Page 5: Seattle Scalability Meetup


{Introductions: ‘Cloudant’}

Page 6: Seattle Scalability Meetup


Schemas & protocols can be restrictive and inhibit data integration

{Schemas: ‘Optional’}

JSON over HTTP

Page 7: Seattle Scalability Meetup


You do this:

We give you:

{Install: ‘Cloudant’}

That’s It

Page 8: Seattle Scalability Meetup


{API: ‘REST’}


Write a doc...from the browser

No client install necessary

JSON, REST, HTTP
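A minimal sketch of what "JSON, REST, HTTP" can look like in practice. In the CouchDB API that Cloudant speaks, a document is written by sending its JSON body in an HTTP PUT to /{db}/{doc_id}. The account name, database name, document ID, and fields below are made up for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_put_doc_request(base_url, db, doc_id, doc):
    """Build (but do not send) a CouchDB-style PUT for one JSON document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical account, database, and document (not from the talk).
req = build_put_doc_request(
    "https://example.cloudant.com",
    "meetups",
    "seattle-2013-02-27",
    {"city": "Seattle", "topic": "geospatial"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send it; that step needs a real account and credentials, so it is left out of the sketch.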

Page 9: Seattle Scalability Meetup


Create Secondary Indexes

Query Those indexes

Rinse and repeat.... billions of times per day

{API: ‘Search’}
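The two steps above can be sketched roughly as follows. This follows the shape of Cloudant's documented search API (a design document with an `indexes` field, queried through `_search`), which may differ from what the slide showed in 2013. The database, design document, index name, and field are hypothetical, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

# JavaScript index function, stored as a string inside the design document.
# The field name "city" is an assumption for illustration.
INDEX_FN = "function (doc) { if (doc.city) { index('city', doc.city); } }"


def search_design_doc(index_name, index_fn):
    """Step 1: the design document body that defines a search index."""
    return {"indexes": {index_name: {"index": index_fn}}}


def search_url(base_url, db, ddoc, index_name, query):
    """Step 2: the URL that queries that index with a Lucene-style query string."""
    params = urlencode({"q": query})
    return f"{base_url}/{db}/_design/{ddoc}/_search/{index_name}?{params}"


ddoc = search_design_doc("by_city", INDEX_FN)
url = search_url(
    "https://example.cloudant.com", "meetups", "lookup", "by_city", "city:Seattle"
)
print(json.dumps(ddoc, indent=2))
print(url)
```

The design document would be written with a PUT to /meetups/_design/lookup, and the query is a plain GET on the printed URL.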

Page 10: Seattle Scalability Meetup


“Scaling would be a good problem to have. We’ll deal with that later”

{Challenge: ‘Dogma’}

False: Scaling Mandatory

Page 11: Seattle Scalability Meetup


{Have: ‘Success’}

Page 12: Seattle Scalability Meetup


{Minimize: ‘Latency’}


Public or Private Cloud

Page 13: Seattle Scalability Meetup


{Write: ‘Local’, Sync: ‘Later’}


Embedded, Edge, Satellites

Desktop, Browser

Cloud

Page 14: Seattle Scalability Meetup


Page 15: Seattle Scalability Meetup


{Example: ‘HotheadGames’}

Page 16: Seattle Scalability Meetup


~100x Scaling in last 9 months

{Example: ‘HotheadGames’}

This Database is bigger than many (most?) Hadoop clusters

Page 17: Seattle Scalability Meetup


How do you scale a database from MB to PB?

How do you add new features that scale?

Page 18: Seattle Scalability Meetup


What if my application cares about geospatial location?

What if I need to combine FTI/search, SELECT, and geo?

Cloudant: MapReduce + Lucene + Geospatial

Page 19: Seattle Scalability Meetup


hash(blah) = E

Load Balancer

PUT http://rnewson.cloudant.com/dbname/blah?w=2

N=3, W=2, R=2

Node 1: A B C D
Node 2: B C D E
Node 3: C D E F
Node 4: D E F G
...
Node 24: X Y Z A

• Clustering in a ring (a la Dynamo)
• Any node can handle a request
• O(1) lookup
• Quorum system (N, R, W)
• Views distributed like documents
• Distributed Erlang
• Masterless

{Sharding: ‘Automatic’}
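The ring and quorum ideas on this slide can be illustrated with a toy model. This is a sketch under assumptions, not Cloudant's actual placement logic: the hash function (SHA-256), the modulo placement, and the "next N nodes on the ring" replica rule are all chosen for illustration. Only the 24 nodes and N=3, W=2, R=2 come from the slide.

```python
import hashlib

NODES = [f"node{i}" for i in range(1, 25)]  # 24 nodes, as in the diagram
N, W, R = 3, 2, 2                           # copies, write quorum, read quorum


def replicas_for(doc_id, nodes=NODES, n=N):
    """Hash the document ID onto the ring and return the n nodes that hold it.

    The lookup is a hash plus a modulo, so it needs no directory service
    and any node can compute it (assumed placement rule, for illustration).
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(n)]


def quorum_met(acks, needed):
    """A request succeeds once at least `needed` replicas have answered."""
    return acks >= needed


owners = replicas_for("blah")
print(owners)                        # the N nodes responsible for doc "blah"
print(quorum_met(acks=2, needed=W))  # a write with ?w=2 needs two acks
print(R + W > N)                     # read and write quorums overlap
```

With R + W > N, every read quorum shares at least one node with every write quorum, which is why N=3, W=2, R=2 is a common choice in Dynamo-style systems.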

Page 20: Seattle Scalability Meetup


{Manual_Sharding: ‘Sucks’}


http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/spanner-osdi2012.pdf

Page 21: Seattle Scalability Meetup


{Format: ‘geojson’}


http://geojson.org/geojson-spec.html
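For readers who have not seen the format, here is a small sketch of a GeoJSON Feature as it could be stored in a JSON document database. The helper name, the property, and the approximate Seattle coordinates are illustrative and not from the talk. GeoJSON orders coordinates as [longitude, latitude].

```python
import json


def point_feature(lon, lat, properties):
    """Return a GeoJSON Feature holding a Point; coordinates are [lon, lat]."""
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("longitude/latitude out of range")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Approximate location, for illustration only.
doc = point_feature(-122.33, 47.61, {"name": "Seattle Scalability Meetup"})
print(json.dumps(doc))
```

The range check catches the most common mistake, swapping latitude and longitude, whenever the swapped latitude falls outside -90..90.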

Page 22: Seattle Scalability Meetup


{Format: ‘topojson’}


https://github.com/mbostock/topojson/wiki

Page 23: Seattle Scalability Meetup


{Index: ‘R-Tree’}

Wikipedia

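The core idea of an R-tree is that nearby objects are grouped under minimum bounding rectangles (MBRs), so a search can skip every subtree whose MBR misses the query. The sketch below builds a tiny two-level tree by hand and omits insertion and node splitting entirely. The tuple layout and the sample boxes are assumptions for illustration, not anything shown in the talk.

```python
def mbr(boxes):
    """Minimum bounding rectangle of (xmin, ymin, xmax, ymax) boxes."""
    xmins, ymins, xmaxs, ymaxs = zip(*boxes)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


def intersects(a, b):
    """True when two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, query):
    """Return leaf entries intersecting `query`, pruning subtrees by MBR."""
    box, children, entries = node
    if not intersects(box, query):
        return []  # nothing under this node can match
    hits = [e for e in entries if intersects(e, query)]
    for child in children:
        hits.extend(search(child, query))
    return hits


def leaf(entries):
    """A node is (mbr, child_nodes, leaf_entries); leaves have no children."""
    return (mbr(entries), [], entries)


west = leaf([(0, 0, 1, 1), (1, 2, 2, 3)])
east = leaf([(8, 8, 9, 9), (9, 5, 10, 6)])
root = (mbr([west[0], east[0]]), [west, east], [])

print(search(root, (0.5, 0.5, 1.5, 2.5)))
```

For this query the `east` subtree is rejected on its MBR alone, so none of its entries are examined.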

Page 24: Seattle Scalability Meetup


Wikipedia


{Index: ‘R*-Tree’}

Page 25: Seattle Scalability Meetup


• A predictive spatio-temporal query retrieves the set of moving objects that will intersect a query window during a future time interval


{Index: ‘TPR*-Tree’}

http://www.cs.ust.hk/~dimitris/PAPERS/VLDB03-TPR.pdf
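To make the query definition concrete, here is a brute-force sketch of a predictive window query. It assumes each object moves in a straight line at constant velocity from its position at t = 0, and it scans every object. A TPR*-tree exists precisely to avoid that scan, so this shows only the query semantics, not the index. Object names and numbers are made up.

```python
def axis_interval(p, v, lo, hi):
    """Times t at which p + v*t lies in [lo, hi], as (start, end) or None."""
    if v == 0:
        return (float("-inf"), float("inf")) if lo <= p <= hi else None
    t_a, t_b = (lo - p) / v, (hi - p) / v
    return (min(t_a, t_b), max(t_a, t_b))


def will_intersect(obj, window, t1, t2):
    """True if the moving point is inside `window` at some time in [t1, t2]."""
    x, y, vx, vy = obj
    xmin, ymin, xmax, ymax = window
    start, end = t1, t2
    for interval in (
        axis_interval(x, vx, xmin, xmax),
        axis_interval(y, vy, ymin, ymax),
    ):
        if interval is None:
            return False
        start, end = max(start, interval[0]), min(end, interval[1])
    return start <= end


def predictive_query(objects, window, t1, t2):
    """Names of all objects that will intersect `window` during [t1, t2]."""
    return [name for name, obj in objects.items()
            if will_intersect(obj, window, t1, t2)]


# (x, y, vx, vy) at t = 0; hypothetical objects.
objects = {
    "car": (0.0, 0.0, 1.0, 1.0),
    "parked": (0.0, 0.0, 0.0, 0.0),
    "slow": (0.0, 0.0, 0.5, 0.5),
}
print(predictive_query(objects, window=(4, 4, 6, 6), t1=3, t2=5))
```

Here "car" is inside the window from t = 4 to t = 6, which overlaps the interval [3, 5]; "slow" only arrives at t = 8, and "parked" never moves.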

Page 26: Seattle Scalability Meetup


{API: ‘geo’}

Create Secondary Indexes

Query Those indexes

Page 27: Seattle Scalability Meetup


{Operators: ‘more’ }


• Disjoint
• Equals
• DWithin
• Beyond
• Intersect
• Touches
• Crosses
• Within
• Contains
• Overlaps
• BBX
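A rough illustration of what a few of these operators mean, restricted to axis-aligned boxes and points in planar coordinates. These are simplified approximations for intuition only: they ignore geodesic distance and the finer boundary rules of the standard spatial predicates, and they say nothing about how Cloudant implements or names the operators.

```python
import math

# Boxes are (xmin, ymin, xmax, ymax); points are (x, y).


def intersects(a, b):
    """The boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def disjoint(a, b):
    """The boxes share no point at all."""
    return not intersects(a, b)


def contains(a, b):
    """Box b lies entirely inside box a."""
    return a[0] <= b[0] and a[1] <= b[1] and b[2] <= a[2] and b[3] <= a[3]


def within(a, b):
    """Box a lies entirely inside box b (the mirror of contains)."""
    return contains(b, a)


def dwithin(p, q, distance):
    """Points p and q are at most `distance` apart."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= distance


def beyond(p, q, distance):
    """Points p and q are more than `distance` apart."""
    return not dwithin(p, q, distance)


city = (0, 0, 10, 10)
park = (2, 2, 4, 4)
island = (20, 20, 22, 22)
print(contains(city, park), within(park, city), disjoint(city, island))
print(dwithin((0, 0), (3, 4), 5), beyond((0, 0), (3, 4), 4.9))
```

Note the pairings: Disjoint is the negation of Intersect, Within is Contains with the arguments swapped, and Beyond is the negation of DWithin.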

Page 28: Seattle Scalability Meetup


{Want: ‘Beta Testers’ }


For a demo or a beta invite, pm me @mlmilleratmit

Page 29: Seattle Scalability Meetup


{Status: [‘Now’, ‘Future’] }


• Big and Getting Bigger
  • Databases -- 100M+
  • Daily Transactions -- 10s of Billions
  • Indexed Data -- 100s TBs
  • Map Reduces/day -- Billions

• Product Roadmap
  • Security enhancements
  • Advanced geospatial
  • Graph engine