Building Expedia’s Travel Graph using MongoDB


Jeff Miller, Principal Program Manager

Jaeque (JQ) Yoon, Software Development Manager


What you imagine…

…what you get.

https://flic.kr/p/6h4u72

How many combinations?

Actual existing travel planning tool.

Mental image…

https://flic.kr/p/5nduiR

…reality.

https://flic.kr/p/8bxbnE

Competitive marketplace

Changing all the time

Travel Graph Vision: Remember and connect everything.

Scratchpad Vision: No more paper notes.

Automatic notetaking

Live updates

Multiple devices

Functional requirements

Photo © 2012 Jonny Ross (https://flic.kr/p/cQamT9)

20-40M writes/day

<1 sec latency

40-60M reads/day

Non-Functional Requirements

Six months to market
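To put the daily volumes above in per-second terms, the following minimal sketch converts each range into an average rate. The function and variable names are illustrative, and the figures are averages only; peak traffic will be higher than these numbers suggest.

```javascript
// Convert a daily operation count into an average per-second rate.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

function averagePerSecond(opsPerDay) {
  return opsPerDay / SECONDS_PER_DAY;
}

// Ranges taken from the requirements slide.
const requirements = {
  writes: { low: 20e6, high: 40e6 },
  reads: { low: 40e6, high: 60e6 },
};

// Round each end of each range to whole operations per second.
function summarize(reqs) {
  const out = {};
  for (const [name, range] of Object.entries(reqs)) {
    out[name] = {
      low: Math.round(averagePerSecond(range.low)),
      high: Math.round(averagePerSecond(range.high)),
    };
  }
  return out;
}

console.log(summarize(requirements));
```

On average, this works out to roughly 230 to 460 writes per second and 460 to 690 reads per second.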

Scratchpad stack

Listen + Distribute

Buffer

Store

Retrieve

Amazon SQS

MongoDB

History Service

Writer

Activity Broker

Record

• Three-node replica set
  – Multiple availability zones
  – Reads from secondaries

• TTL Index to enforce data retention limit

• Working set not in memory

• Single shard

Scratchpad: target launch configuration

Diagram components: History Service; two mongos routers; three config servers; Shard 1 (one primary, two secondaries).
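The launch configuration relies on a TTL index to enforce the retention limit. The sketch below models that behaviour in plain JavaScript, assuming the index is defined on a date field named `ttl` with `expireAfterSeconds: 0` (so each document carries its own expiry time). The collection name `history` in the comment is hypothetical, and the simulation ignores details such as array-valued date fields.

```javascript
// Index definition as it could be passed to createIndex in the mongo shell:
//   db.history.createIndex({ ttl: 1 }, { expireAfterSeconds: 0 })
// "history" is a hypothetical collection name.
const ttlIndex = { keys: { ttl: 1 }, options: { expireAfterSeconds: 0 } };

// Simplified model: a document becomes eligible for deletion once the
// indexed date plus expireAfterSeconds has passed.
function isExpired(doc, index, now) {
  const field = Object.keys(index.keys)[0];
  const value = doc[field];
  if (!(value instanceof Date)) return false; // missing or non-date: never expires
  const expiresAt = value.getTime() + index.options.expireAfterSeconds * 1000;
  return now.getTime() >= expiresAt;
}

const docs = [
  { _id: "a", ttl: new Date("2014-05-13T15:14:35.916Z") },
  { _id: "b", ttl: new Date("2014-06-13T15:14:35.916Z") },
  { _id: "c" }, // no ttl field
];

const now = new Date("2014-05-13T15:15:00.000Z");
const expiredIds = docs
  .filter((doc) => isExpired(doc, ttlIndex, now))
  .map((doc) => doc._id);

console.log(expiredIds);
```

In a real deployment the server removes expired documents with a background task that runs about once a minute, so deletion is not immediate, and each removal is an ordinary delete that consumes disk IO.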

Projected delivery timeline

Timeline, April to August: Complete Design, Cluster Build out, Enable Write Traffic, Release Target.

Scratchpad data structures: v1

{
  "_id" : "110c71e5-8476-4325-bec7-91659720faa6",
  "key" : "110c71e5-8476-4325-bec7-91659720faa6-1",
  "uid" : "110c71e5-8476-4325-bec7-91659720faa6",
  "timestamp" : 12332323,
  "hotel" : {
    "details" : [ {
      "rooms" : [ {
        "roomtype" : null,
        "numadults" : 1,
        "numchildren" : null
      } ],
      "hotelid" : 87,
      "checkindate" : "07-24-2014",
      "checkoutdate" : "07-26-2014",
      "price" : 255.00,
      "timestamp" : 12332323
    } ]
  },
  "flight" : {
    "searches" : [ FlightSearchInteraction ]
  }
}

Scratchpad data structures: v2

{
  "_id" : "110c71e5-8476-4325-bec7-91659720faa6-hotels-detail-87-2807186400000",
  "key" : "110c71e5-8476-4325-bec7-91659720faa6-1",
  "rooms" : [ {
    "roomtype" : null,
    "numadults" : 1,
    "children" : null
  } ],
  "checkindate" : "07-24-2014",
  "checkoutdate" : "07-26-2014",
  "price" : 255.00,
  "hotelid" : 87,
  "regionId" : 800077,
  "product" : "HOTEL",
  "uid" : "110c71e5-8476-4325-bec7-91659720faa6",
  "timestamp" : 12332323,
  "type" : "DETAIL",
  "ttl" : ISODate("2014-05-13T15:14:35.916Z")
}
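The two data structures differ mainly in granularity: v1 nests every interaction inside one document per user, while v2 stores one document per interaction, each with its own `ttl` value (which is what a TTL index needs to expire interactions individually). The following sketch shows one way a v1 document could be flattened into v2-style documents. The function name, the `retentionMs` parameter, and the `_id` suffix are assumptions: the deck does not explain how the numeric suffix of the v2 `_id` is derived, so the interaction timestamp stands in for it here, and `regionId` is omitted because v1 has no source for it.

```javascript
// Flatten the hotel detail interactions of a v1 per-user document into
// v2-style per-interaction documents.
function flattenHotelDetails(v1Doc, now, retentionMs) {
  const details = (v1Doc.hotel && v1Doc.hotel.details) || [];
  return details.map((detail) => ({
    // Hypothetical _id scheme: uid, product, type, hotel id, then timestamp.
    _id: `${v1Doc.uid}-hotels-detail-${detail.hotelid}-${detail.timestamp}`,
    key: v1Doc.key,
    rooms: detail.rooms,
    checkindate: detail.checkindate,
    checkoutdate: detail.checkoutdate,
    price: detail.price,
    hotelid: detail.hotelid,
    product: "HOTEL",
    uid: v1Doc.uid,
    timestamp: detail.timestamp,
    type: "DETAIL",
    // Each interaction carries its own expiry time.
    ttl: new Date(now.getTime() + retentionMs),
  }));
}

const v1Sample = {
  _id: "110c71e5-8476-4325-bec7-91659720faa6",
  key: "110c71e5-8476-4325-bec7-91659720faa6-1",
  uid: "110c71e5-8476-4325-bec7-91659720faa6",
  timestamp: 12332323,
  hotel: {
    details: [
      {
        rooms: [{ roomtype: null, numadults: 1, numchildren: null }],
        hotelid: 87,
        checkindate: "07-24-2014",
        checkoutdate: "07-26-2014",
        price: 255.0,
        timestamp: 12332323,
      },
    ],
  },
};

// Hypothetical 30-day retention window.
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;
const v2Docs = flattenHotelDetails(
  v1Sample,
  new Date("2014-04-13T15:14:35.916Z"),
  THIRTY_DAYS_MS
);

console.log(v2Docs);
```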

Post-launch timeline

Timeline, April to August: Complete Design, Cluster Build out, Perf Testing, Live Traffic, System Falls Over.

IOPS

Increased IOPS

IO Saturation

TTL deletes

Increased Delete Volume

Response Time Variation

History Average Response Time

TTL Deletes | Trim Job | Diff | % Change
112         | 69       | -43  | -38.4%
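The Diff and % Change figures follow directly from the two response-time values reported for TTL deletes and the trim job. A minimal sketch of that arithmetic, with an illustrative function name (the deck does not state the unit of the measurements, so none is assumed here):

```javascript
// Compare two average response times and report the absolute and
// relative change, rounded to one decimal place.
function compare(before, after) {
  const diff = after - before;
  const pctChange = (diff / before) * 100;
  return { diff, pctChange: Number(pctChange.toFixed(1)) };
}

// 112 with TTL deletes, 69 with the trim job.
const result = compare(112, 69);
console.log(result); // { diff: -43, pctChange: -38.4 }
```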

Optimized for production

Timeline, April to October: Complete Design, Live Write Traffic, Performance Optimized, Official Release: Live Read Traffic.

• Provisioned IOPS
• Additional shard
• Reduced Write Load

Scratchpad in the wild

My notes, everywhere.


No, wait, they were right here…

Photo © 2012 Vic De Leon (https://flic.kr/p/ebH68m)

Wait, they were here just a minute ago…

Easy come, easy go!

Diagram components: History Service; two mongos routers; three config servers; Shard 1 and Shard 2, each with one primary and two secondaries.

Uh… now what?

EBS volumes, but no servers

Diagram components: config server; mongos; Shard 1 and Shard 2, each reduced to three EBS volumes.

New Cluster

Diagram components: config server; mongos; Shard 1 and Shard 2 (three EBS volumes each); Shard 1B and Shard 2B (three EBS volumes each).

sh.setBalancerState(false)

Swap Shards

Diagram components: config server; mongos; Shard 1 and Shard 2 (three EBS volumes each); Shard 1B and Shard 2B (three EBS volumes each).

db.shards.update({ _id: "travelgraphuhs" }, { $set: { "host": "travelgraphuhs/10.0.21.107:27017,10.0.21.225:27017,10.0.21.189:27017" } })
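The `host` value in a shard entry has the form `<replicaSetName>/<host:port>,<host:port>,...`, so a stray space or mismatched quote makes it unusable. The sketch below builds that string and the matching filter and update documents from a list of members. The helper names are illustrative, and the result is meant to be passed to an update on the `shards` collection of the `config` database, as in the command above, with the balancer already stopped.

```javascript
// Build the "<replicaSetName>/<host:port>,..." string stored in the
// host field of a shard entry.
function shardHostString(replicaSetName, members) {
  if (!replicaSetName) throw new Error("replica set name is required");
  if (members.length === 0) throw new Error("at least one member is required");
  const hosts = members.map(({ address, port = 27017 }) => `${address}:${port}`);
  return `${replicaSetName}/${hosts.join(",")}`;
}

// Build the filter and update documents for repointing one shard.
function shardUpdate(shardId, replicaSetName, members) {
  return {
    filter: { _id: shardId },
    update: { $set: { host: shardHostString(replicaSetName, members) } },
  };
}

// Addresses taken from the slide.
const swap = shardUpdate("travelgraphuhs", "travelgraphuhs", [
  { address: "10.0.21.107" },
  { address: "10.0.21.225" },
  { address: "10.0.21.189" },
]);

console.log(JSON.stringify(swap, null, 2));
```

Generating the string from structured input, rather than typing it by hand, avoids the kind of whitespace and quoting mistakes that are easy to make during a recovery.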

Unmount Volumes

Diagram components: config server; mongos; Shard 1 and Shard 2 (three EBS volumes each); Shard 1B and Shard 2B (three EBS volumes each).

Swap Volumes

Diagram components: config server; mongos; Shard 1 and Shard 2 (no volumes attached); Shard 1B and Shard 2B (three EBS volumes each).

Restored

Diagram components: config server; mongos; Shard 1B and Shard 2B (three EBS volumes each).

Scratchpad Today/Tomorrow

Diagram components: History Service; two mongos routers; three config servers; Shards 1 to 3, each with one primary, two secondaries, and one analytics node.

• 3 shards serving ~80 million reads/writes daily
• 4-node replica set
• Increased data retention
• Geo distribution of secondary reads
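The deck does not say how secondary reads are distributed geographically or how the analytics node is isolated. One mechanism MongoDB offers for this is tagging replica set members and supplying tag sets with a secondary read preference. The sketch below is a simplified model of that selection logic in plain JavaScript; the host names, regions, and tags are hypothetical, and the model ignores latency windows and staleness limits.

```javascript
// Hypothetical four-node replica set; names and tags are illustrative only.
const members = [
  { host: "shard1-a", state: "PRIMARY", tags: { region: "us-west" } },
  { host: "shard1-b", state: "SECONDARY", tags: { region: "us-west" } },
  { host: "shard1-c", state: "SECONDARY", tags: { region: "us-east" } },
  { host: "shard1-d", state: "SECONDARY", tags: { region: "us-east", use: "analytics" } },
];

// A member matches a tag set when it carries every key/value pair in it.
// An empty tag set matches any member.
function matches(memberTags, tagSet) {
  return Object.entries(tagSet).every(([key, value]) => memberTags[key] === value);
}

// Simplified model of read preference mode "secondary" with tag sets:
// tag sets are tried in order, and the first one that matches at least
// one secondary decides the eligible members.
function eligibleSecondaries(allMembers, tagSets = [{}]) {
  const secondaries = allMembers.filter((m) => m.state === "SECONDARY");
  for (const tagSet of tagSets) {
    const found = secondaries.filter((m) => matches(m.tags, tagSet));
    if (found.length > 0) return found.map((m) => m.host);
  }
  return [];
}

console.log(eligibleSecondaries(members, [{ region: "us-west" }]));
console.log(eligibleSecondaries(members, [{ use: "analytics" }]));
console.log(eligibleSecondaries(members, [{ region: "eu-west" }, {}]));
```

Listing an empty tag set last, as in the third call, acts as a fallback so that reads still succeed when no member in the preferred region is available.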

Special-Purpose Tools Joined Together

Travel Graph Ecosystem

Jeff Miller, Principal Program Manager

Jaeque (JQ) Yoon, Software Development Manager

THANK YOU!
