USING COUCHBASE FOR SOCIAL GAME SCALING AND SPEED
Chiyoung Seo, Couchbase Inc.
Matt Ingenthron, Couchbase Inc.

Agenda
• Introduction
• What is Couchbase Server?
  – Simple, Fast, Elastic
  – Technology Overview (Architecture, data flow, rebalancing)
• Tribal Crossing Inc: Animal Party
  – Challenges before Couchbase
    • Original Architecture
  – Why Couchbase?
    • Simplicity
    • Performance
    • Flexibility
  – Deploying Couchbase
    • New Architecture
    • EC2
    • Data Model
    • Accessing data in Couchbase
• Product Roadmap
• Q&A

Couchbase Inc.
• Membase and CouchOne have merged to form Couchbase Inc. (headquartered in Silicon Valley)
• Team
  – Brings together the creators and core contributors of Memcached, Membase and CouchDB technologies
  – Doubles technical team size, accelerates roadmaps by over a year
• Products
  – Couchbase Server (formerly Membase)
  – Couchbase Single Server
  – Mobile Couchbase (iPhone and Android)
• Technology
  – Most mature, reliable and widely deployed NoSQL technologies
  – Fully featured, open source document datastore
  – First complete, end-to-end NoSQL database product

Modern Interactive Web Application Architecture
Application Scales Out: Just add more commodity web servers
Database Scales Up: Get a bigger, more complex server
- Expensive and disruptive sharding
- Doesn’t perform at Web Scale
[Diagram labels: www.facebook.com/animalparty, Load Balancer, Web Servers, Relational Database]

Couchbase Server is a distributed database
[Diagram labels: Application user, Web application server, Couchbase Servers, Couchbase Web Console]

Couchbase data layer scales like application logic tier
Data layer now scales with linear cost and constant performance.
Application Scales Out: Just add more commodity web servers
Database Scales Out: Just add more commodity data servers
Scaling out flattens the cost and performance curves.
Horizontally scalable, schema-less, auto-sharding, high-performance at Web Scale
[Diagram labels: www.facebook.com/animalparty, Load Balancer, Web Servers, Couchbase Servers]

Couchbase Server is Simple, Fast, Elastic
• Five minutes or less to a working cluster
  – Downloads for Windows, Linux and OSX
  – Start with a single node
  – One button press joins nodes to a cluster
• Easy to develop against
  – Just SET and GET – no schema required
  – Drop it in. 10,000+ existing applications already “speak Couchbase” (via memcached)
  – Practically every language and application framework is supported, out of the box
• Easy to manage
  – One-click failover and cluster rebalancing
  – Graphical and programmatic interfaces
  – Configurable alerting

Couchbase Server is Simple, Fast, Elastic
• Predictable
  – “Never keep an application waiting”
  – Quasi-deterministic latency and throughput
• Low latency
  – Built-in Memcached technology
  – Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)
  – Selectable write behavior – asynchronous, synchronous (on replication, persistence)
• High throughput
  – Multi-threaded
  – Low lock contention
  – Asynchronous wherever possible
  – Automatic write de-duplication

Couchbase Server is Simple, Fast, Elastic
• Zero-downtime elasticity
  – Spread I/O and data across commodity servers (or VMs)
  – Consistent performance with linear cost
  – Dynamic rebalancing of a live cluster
• All nodes are created equal
  – No special case nodes
  – Clone to grow
• Extensible
  – Change feeds
  – Real-time map-reduce
  – RESTful interface for management
[Image: Couchbase Web Console]

Proven at Small, and Extra Large Scale
• Leading cloud service (PaaS) provider
• Over 150,000 hosted applications
• Couchbase Server serving over 6,200 Heroku customers
• Social game leader – FarmVille, Mafia Wars, Empires and Allies, Café World, FishVille
• Over 230 million monthly users
• Couchbase Server is the primary database behind key Zynga properties

Couchbase Server Architecture
[Diagram with two halves: Data Manager and Cluster Manager]
• Data Manager
  – moxi
  – memcached protocol listener/sender
  – engine interface
  – Couchbase Storage Engine
  – Ports 11211 and 11210; labels “memcapable 1.0” and “memcapable 2.0”
• Cluster Manager (Erlang/OTP)
  – On each node: REST management API / Web UI, Heartbeat, Process monitor, Global singleton supervisor, Configuration manager
  – One per cluster: Rebalance orchestrator, Node health monitor
  – vBucket state and replication manager
  – Ports 8091 (HTTP), 21100 – 21199 (distributed Erlang), 4369 (Erlang port mapper)

Couchbase “write” Data Flow – application view
1. User action results in the need to change the VALUE of KEY
2. Application updates key’s VALUE, performs SET operation
3. Couchbase client hashes KEY, identifies KEY’s master server
4. SET request sent over network to master server
5. Couchbase replicates KEY-VALUE pair, caches it in memory and stores it to disk

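The client-side step of hashing KEY and finding its master server can be sketched as follows. This is a simplified illustration, not the real client algorithm: the vBucket count of 1024, the plain CRC32-modulo hash, and the node names are all assumptions, and a real client obtains the vBucket map from the cluster rather than building it locally.

```python
import zlib
from typing import List

NUM_VBUCKETS = 1024  # assumed vBucket count, for illustration only


def vbucket_for_key(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Hash a key to a vBucket id (simplified: CRC32 modulo the vBucket count)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def master_for_key(key: str, vbucket_map: List[str]) -> str:
    """Look up the server that currently owns the key's vBucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


# Hypothetical two-node map: even vBuckets on node-a, odd vBuckets on node-b.
vbucket_map = ["node-a" if i % 2 == 0 else "node-b" for i in range(NUM_VBUCKETS)]
master = master_for_key("Player1", vbucket_map)
```

Because the key maps to a vBucket rather than directly to a server, moving data only requires updating the map; the hash of a key never changes.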
Couchbase Data Flow – under the hood
[Diagram: Master server for KEY, Replica Server 1 for KEY and Replica Server 2 for KEY; each has a Listener-Sender, RAM*, Couchbase storage engine, SSD and Disk]
1. SET request arrives at KEY’s master server
3. SET acknowledgement returned to application
[Markers 2 (on all three servers) and 4 appear in the diagram without captions]

Elasticity - Rebalancing
Before
• Adding Node 3
• Node 3 is in pending state
• Clients talk to Node 1,2 only
During
• Rebalancing orchestrator recalculates the vBucket map (including replicas)
• Migrate vBuckets to the new server
• Finalize migration
After
• Node 3 is balanced
• Clients are reconfigured to talk to Node 3
[Diagram: vBuckets 1 to 12 spread over Node 1 and Node 2, then over Node 1, Node 2 and Node 3; vBucket migrators move vBuckets to Node 3 (pending state, then rebalancing); Client]

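The "recalculate the vBucket map" step can be illustrated with a small sketch. It is not the orchestrator's actual algorithm: it ignores replicas and simply moves the minimum number of vBuckets from over-loaded nodes to the new one. The function name and node names are hypothetical.

```python
from collections import Counter
from typing import Dict


def rebalance(vbucket_map: Dict[int, str], new_node: str) -> Dict[int, str]:
    """Return a new map that moves just enough vBuckets onto new_node."""
    nodes = set(vbucket_map.values()) | {new_node}
    target = len(vbucket_map) // len(nodes)  # vBuckets the new node should own
    counts = Counter(vbucket_map.values())
    new_map = dict(vbucket_map)
    moved = 0
    for vb in sorted(vbucket_map):
        if moved == target:
            break
        owner = new_map[vb]
        if counts[owner] > target:  # only take from nodes above the target
            new_map[vb] = new_node
            counts[owner] -= 1
            moved += 1
    return new_map


# As in the slide: vBuckets 1-6 on Node 1 and 7-12 on Node 2, then add Node 3.
before = {vb: ("node1" if vb <= 6 else "node2") for vb in range(1, 13)}
after = rebalance(before, "node3")
moved_vbuckets = [vb for vb in before if before[vb] != after[vb]]
```

Only the vBuckets listed in `moved_vbuckets` need to be migrated; every other key stays where it was, which is what lets the cluster keep serving requests during a rebalance.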
Data buckets are secure Couchbase “slices”
[Diagram: In the data center: Application user, Web application server, Couchbase data servers. On the administrator console: Bucket 1 and Bucket 2 within the Aggregate Cluster Memory and Disk Capacity]

Couchbase and Hadoop Integration
• Support large-scale analytics on application data by streaming data from Couchbase to Hadoop
  – Real-time integration using Flume
  – Batch integration using Sqoop
• Examples
  – Various game statistics (e.g., monthly / daily / hourly rankings)
  – Analyze game patterns from users to enhance various game metrics
[Diagram labels: memcached protocol listener/sender, engine interface, Couchbase Storage Engine, TAP, Flume, Sqoop]

Tribal Crossing: Challenges
Common steps for scaling up a database:
● Tune queries (indexing, explain query)
● Denormalization
● Cache data (APC / Memcache)
● Tune MySQL configuration
● Replication (read slaves)
Where do we go from here to prepare for the scale of a successful social game?

Tribal Crossing: Challenges
● Write-heavy requests
  – Caching does not help
  – MySQL / InnoDB limitation (Percona)
● Need to scale drastically overnight
  – My Polls – 100 to 1 million users over a weekend
● Small team, no dedicated sysadmin
  – Focus on what we do best – making games
● Keeping cost down

Tribal Crossing: “Old” Architecture and Options
● MySQL with master-to-master replication and sharding
  – Complex to set up, high administration cost
  – Requires application-level changes
● Cassandra
  – High write, but low read throughput
  – Live cluster reconfiguration and rebalance is quite complicated
  – Eventual consistency puts too much burden on application developers
● MongoDB
  – High read/write, but unpredictable latency
  – Live cluster rebalance for existing nodes only
  – Eventual consistency with slave nodes

Tribal Crossing: Why Couchbase Server?
● SPEED, SPEED, SPEED
● Immediate consistency
● Interface is dead simple to use
  – We are already using Memcache
● Low sysadmin overhead
● Schema-less data store
● Used and Proven by big guys like Zynga
● … and lastly, because Tribal CAN
  – Bigger firms with legacy code base = hard to adapt
  – Small team = ability to get on the cutting edge

Tribal Crossing: New Challenges With Couchbase
● But, there are some different challenges in using Couchbase (currently 1.7) to handle the game data:
  – No easy way to query data
  – No transaction / rollback
  ➔ Couchbase Server 2.0 resolves them by using CouchDB as the underlying database engine
● Can this work for an online game?
  – Break out of the old ORM / relational paradigm!
  – We are not handling bank transactions

Tribal Crossing: Deploying Couchbase in EC2
● Basic production environment setup
● Dev/Stage environment – feel free to install Couchbase on your web server
[Diagram labels: Web Server (Apache, Client-side Moxi), DNS Entry, Cluster Mgmt. Requests, Couchbase Cluster (Couchbase, Couchbase, …)]

Tribal Crossing: Deploying Couchbase in EC2
● Amazon Linux AMI, 64-bit, EBS backed instance
● Set up swap space
● Install Couchbase’s Membase Server 1.7
● Access the web console at http://<hostname>:8091
● Start the new cluster with a single node
● Add the other nodes to the cluster and rebalance
[Diagram labels: Web Server (Apache, Client-side Moxi), DNS Entry, Cluster Mgmt. Requests, Couchbase Cluster (Couchbase, …, Couchbase)]

Tribal Crossing: Deploying Couchbase in EC2
Moxi figures out which node in the cluster holds data for a given key.
● On each web server, install Moxi proxy
● Start Moxi by pointing it to the DNS entry you created
● Web apps connect to Moxi that is running locally
$memcache->addServer('localhost', 11211);
[Diagram labels: Web Server (Apache, Client-side Moxi), DNS Entry, Cluster Mgmt. Requests, Couchbase Cluster (Couchbase, Couchbase, …)]

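Because the web app talks to the local Moxi exactly as it would to memcached, any memcached client works unchanged. As a rough illustration of what travels to localhost:11211, the sketch below frames `set` and `get` commands in the memcached text protocol. The helper names are hypothetical, and a real application would use a memcached client library rather than building frames by hand.

```python
def frame_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol 'set' command for one key."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def frame_get(key: str) -> bytes:
    """Build a memcached text-protocol 'get' command for one key."""
    return f"get {key}\r\n".encode("ascii")


set_frame = frame_set("Player1", b'{"Id": 1, "Name": "Shawn"}')
get_frame = frame_get("Player1")
```

These bytes could be written to a TCP socket connected to the local Moxi, which then forwards the request to whichever node holds the key.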
Tribal Crossing: Representing Game Data in Couchbase
Use case - simple farming game:
● A player can have a variety of plants on their farm.
● A player can add or remove plants from their farm.
● A player can see what plants are on another player's farm.

Tribal Crossing: Representing Game Data in Couchbase
Representing Objects
● Simply treat an object as an associative array
● Determine the key for an object using the class name (or type) of the object and a unique ID
Representing Object Lists
● Denormalization
● Save a comma separated list or an array of object IDs

Tribal Crossing: Representing Game Data in Couchbase
Player Object
Key: 'Player1'
Array
(
    [Id] => 1
    [Name] => Shawn
)
Plant Object
Key: 'Plant201'
Array
(
    [Id] => 201
    [Player_Id] => 1
    [Name] => Starflower
)
PlayerPlant List
Key: 'Player1_PlantList'
Array
(
    [0] => 201
    [1] => 202
    [2] => 204
)

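The key convention shown above (type name plus ID, and a separate list key per owner) can be sketched in a few lines. The dictionary here is only an in-memory stand-in for a bucket, and the helper names are hypothetical.

```python
import json
from typing import Any, Dict

store: Dict[str, str] = {}  # in-memory stand-in for a Couchbase bucket


def object_key(type_name: str, object_id: int) -> str:
    """Key for a single object: type name followed by its unique ID."""
    return f"{type_name}{object_id}"


def list_key(owner_type: str, owner_id: int, list_name: str) -> str:
    """Key for a denormalized list of IDs owned by an object."""
    return f"{owner_type}{owner_id}_{list_name}"


def save(key: str, value: Any) -> None:
    store[key] = json.dumps(value)


def load(key: str) -> Any:
    return json.loads(store[key])


save(object_key("Player", 1), {"Id": 1, "Name": "Shawn"})
save(object_key("Plant", 201), {"Id": 201, "Player_Id": 1, "Name": "Starflower"})
save(list_key("Player", 1, "PlantList"), [201, 202, 204])
```

Since keys are derived from data the application already has, any object can be fetched directly without a query.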
Tribal Crossing: Schema-less Game Data
● No need to “ALTER TABLE”
● Add new “fields” to all objects at any time
  – Specify default value for missing fields
  – Increased development speed
● Use JSON for data objects, though, to take advantage of the ability to query on arbitrary fields in Couchbase 2.0

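One way to "specify default value for missing fields" is to merge each stored object over a table of defaults when it is loaded. This is a minimal sketch; the `Growth_Stage` field and the default values are hypothetical examples of a field added after older objects were already stored.

```python
from typing import Any, Dict

# Defaults for fields that older stored objects may not have yet.
PLANT_DEFAULTS: Dict[str, Any] = {"Name": "Unknown", "Growth_Stage": 0}


def with_defaults(doc: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc in which missing fields take their default value."""
    merged = dict(defaults)
    merged.update(doc)  # stored values always win over defaults
    return merged


old_plant = {"Id": 201, "Player_Id": 1, "Name": "Starflower"}  # saved before the new field existed
plant = with_defaults(old_plant, PLANT_DEFAULTS)
```

Old objects are never rewritten in bulk; they pick up the new field the next time they are loaded and saved.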
Tribal Crossing: Accessing Game Data in Couchbase
Get all plants belonging to a given player
Request: GET /player/1/farm

// Load the list of plant IDs owned by player 1
$plant_ids = $couchbase->get('Player1_PlantList');
$response = array();
// Fetch each plant object by its key
foreach ($plant_ids as $plant_id) {
    $plant = $couchbase->get('Plant' . $plant_id);
    $response[] = $plant;
}
echo json_encode($response);

Tribal Crossing: Modifying Game Data in Couchbase
Give a player a new plant

// Create the new plant
$new_plant = array(
    'id' => 100,
    'name' => 'Mushroom'
);
$couchbase->set('Plant100', $new_plant);

// Update the player plant list
$plant_ids = $couchbase->get('Player1_PlantList');
$plant_ids[] = $new_plant['id'];
$couchbase->set('Player1_PlantList', $plant_ids);

Tribal Crossing: Concurrency
Concurrency issues can occur when multiple requests are working with the same piece of data.
Solution:
● CAS (check-and-set)
  – Client can tell if someone else has modified the data while it is trying to update
  – Implements optimistic concurrency control
● Locking (try/wait cycle)
  – GETL (get with lock + timeout) operations
  – Pessimistic concurrency control

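The CAS approach amounts to a read, modify, conditional-write loop that retries when another request got there first. The sketch below uses a small in-memory class as a stand-in for the server; `InMemoryStore`, `gets`, `cas` and `update_with_cas` are illustrative names, not a real client API.

```python
import itertools
from typing import Any, Callable, Dict, Tuple


class InMemoryStore:
    """Stand-in for a bucket that tracks a CAS token per key."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, int]] = {}
        self._next_cas = itertools.count(1)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (value, next(self._next_cas))

    def gets(self, key: str) -> Tuple[Any, int]:
        """Return the value together with its current CAS token."""
        return self._data[key]

    def cas(self, key: str, value: Any, cas_token: int) -> bool:
        """Write only if nobody has modified the key since cas_token was read."""
        _, current = self._data[key]
        if current != cas_token:
            return False
        self._data[key] = (value, next(self._next_cas))
        return True


def update_with_cas(store: InMemoryStore, key: str,
                    mutate: Callable[[Any], Any], max_retries: int = 5) -> bool:
    """Optimistic update: re-read and retry whenever the CAS check fails."""
    for _ in range(max_retries):
        value, token = store.gets(key)
        if store.cas(key, mutate(value), token):
            return True
    return False


store = InMemoryStore()
store.set("Player1_PlantList", [201, 202, 204])
ok = update_with_cas(store, "Player1_PlantList", lambda ids: ids + [100])
```

Applied to the plant-list update from the previous slide, this prevents two simultaneous "add plant" requests from silently overwriting each other's change.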
Tribal Crossing: Data Relationship
● Record object relationships both ways
  – Example: Plots and Plants
    ● Plot object stores id of the plant that it hosts
    ● Plant object stores id of the plot that it grows on
  – Resolution in case of mismatch
● Don't sweat the extra calls to load data in a one-to-many relationship
  – Use multiGet
  – We can still cache aggregated results in a Memcache bucket if needed

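Recording the Plot and Plant relationship on both objects, and repairing a mismatch, might look like the sketch below. The slides do not say how mismatches are resolved, so the policy here (the plot record is treated as authoritative) is an assumption, as are the `Plant_Id` and `Plot_Id` field names and the dictionary used in place of a bucket.

```python
from typing import Any, Dict

Store = Dict[str, Dict[str, Any]]  # in-memory stand-in for a bucket


def link(store: Store, plot_id: int, plant_id: int) -> None:
    """Record the relationship on both objects."""
    store[f"Plot{plot_id}"]["Plant_Id"] = plant_id
    store[f"Plant{plant_id}"]["Plot_Id"] = plot_id


def repair_plant(store: Store, plant_id: int) -> bool:
    """Check a plant's back-reference; clear it if the plot disagrees.

    Assumed policy: the plot record wins. Returns True if the link was consistent.
    """
    plant = store[f"Plant{plant_id}"]
    plot_id = plant.get("Plot_Id")
    plot = store.get(f"Plot{plot_id}") if plot_id is not None else None
    if plot is None or plot.get("Plant_Id") != plant_id:
        plant["Plot_Id"] = None
        return False
    return True


store: Store = {
    "Plot7": {"Id": 7, "Plant_Id": None},
    "Plant201": {"Id": 201, "Plot_Id": None},
}
link(store, 7, 201)
consistent = repair_plant(store, 201)
```

Without transactions the two writes in `link` can be interrupted between them, which is why a repair step on read is useful.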
Tribal Crossing: Migrating to Couchbase Servers
First migrated large or slow performing tables and frequently updated fields from MySQL to Couchbase
[Diagram labels: Web Server (Apache + PHP, Client-side Moxi); memcached protocol listener/sender, engine interface, Couchbase Storage Engine, TAP; TAP Client; Reporting Applications; MySQL]

Tribal Crossing: Deployment
Tribal Crossing: Conclusion
• Significantly reduced the cost incurred by scaling up database servers and managing them.
• Achieved significant improvements in various performance metrics (e.g., read, write, latency, etc.)
• Allowed them to focus more on game development and optimizing key metrics
• Plan to use real-time MapReduce, querying, and indexing abilities provided by the upcoming Elastic Couchbase 2.0

Product Roadmap: Couchbase Server 2.0
• Mobile to cloud data synchronization
• Cross data center replication
[Diagram: US West Coast Data Center and US East Coast Data Center, each with Couchbase Server nodes; Couchbase Single Server instances; links labeled CouchSync]

Product Roadmap: Couchbase Server 2.0
• Replace SQLite-based storage engine with CouchDB
• Support indexing and querying on values
• Integrate real-time MapReduce into Couchbase server
• SDK for Couchbase server
[Diagram: three products with taglines]
• Membase Server 1.7: The world’s leading caching and clustering technology
• CouchDB 1.1: The most reliable and full-featured document database
• Couchbase Server 2.0: The fastest, most complete and most reliable database on the planet

Couchbase Product Download
• Community Edition
  – Open source build
  – Free forum support
• Enterprise Edition
  – Free for non-production use
  – Certified, QA tested version of open source
  – Case tracking and guaranteed SLA for production environments
• Partner in Korea
  – N2M Inc. (http://www.n2m.co.kr)

Q&A
Matt Ingenthron, Couchbase Inc. (@ingenthr)
Chiyoung Seo, Couchbase Inc. (@chiyoungseo)