Upload
acunu
View
1.988
Download
2
Embed Size (px)
DESCRIPTION
Slides from Matt Aslett's presentation at Cassandra Europe 2012.
Citation preview
© 2012 by The 451 Group. All rights reserved
1
NoSQL, NewSQL, and BeyondThe answer to SPRAINed relational databases
Matthew AslettResearch Manager, Data Management and Analytics
© 2012 by The 451 Group. All rights reserved
451 Research
Information Management Operational databasesData warehousingData cachingEvent processing
Matthew Aslett• Research manager, data management and analytics• With The 451 Group since 2007• www.twitter.com/maslett
Commercial Adoption of Open Source (CAOS)Open source projectsAdoption of open source softwareVendor strategies
© 2012 by The 451 Group. All rights reserved
3
The 451 Group
© 2012 by The 451 Group. All rights reserved
Relevant reports
NoSQL, NewSQL and Beyond• Assessing the drivers behind the development and
adoption of NoSQL and NewSQL databases, as well as data grid/caching technologies
• Released April 2011
• Role of open source in driving innovation
© 2012 by The 451 Group. All rights reserved
NoSQL, NewSQL and Beyond
NoSQL New breed of non-relational
database products Rejection of fixed table
schema and join operations Designed to meet scalability
requirements of distributed architectures
And/or schema-less data management requirements
NewSQL New breed of relational
database products Retain SQL and ACID Designed to meet scalability
requirements of distributed architectures
Or improve performance so horizontal scalability is no longer a necessity
… and Beyond In-memory data grid/cache products Potential primary platform for distributed data management
© 2012 by The 451 Group. All rights reserved
The NoSQL landscape
Key Value Store• Citrusleaf• HandlerSocket*• Redis• Voldemort• Membrain• Oracle NoSQL• Castle• RethinkDB• LevelDB• Cassandra• DataStax EE• Acunu
• HBase• Hypertable
• DynamoDB• Redis-to-go• SimpleDB
• Riak• CouchBase• Mongo Labs• Mongo HQ
• App Engine Datastore -as-a-Service
Graph• InfiniteGraph• Neo4j• DEX• OrientDB• NuvolaBase
Document• RavenDB• MongoDB• CouchDB• Cloudant• Iris Couch
Big Tables
© 2012 by The 451 Group. All rights reserved
The NewSQL ecosystem
• ParElastic• Continuent• Galera
• NuoDB• SQLFire
• Translattice• Clustrix
• Schooner SQL• ScaleBase• ScaleArc
• CodeFutures
• GenieDB
• ScaleDB• MySQL Cluster • Zimory Scale
New databases
• Xeround• Tokutek
Storage engines
Clustering/sharding
-as-a-Service
• Drizzle• Akiban
• Drizzle• VoltDB• JustOne DB
© 2012 by The 451 Group. All rights reserved
SPRAINED RELATIONAL DATABASES
Photo credit: Foxtongue on Flickrhttp://www.flickr.com/photos/foxtongue/4844016087/
© 2012 by The 451 Group. All rights reserved
SPRAIN
Scalability - Hardware economics Example project/service/vendor:• BigTable, HBase, Riak, MongoDB, Couchbase, Hadoop• Xeround, NuoDB• Data grid/cache
Associated use case:• Large-scale distributed data storage• Analysis of continuously updated data• Multi-tenant PaaS data layer
© 2012 by The 451 Group. All rights reserved
SPRAIN
Scalability Netflix:• 37X growth in requests Jan 2010-Jan 2011• “We had to get out of the datacenter business” Dachis Group:• Specifically wanted a horizontally scalable data store Spotify:• Importance (and challenges) of cross-datacenter replication Tellybug:• Peak scalability – mission critical for two hours per week. Elastic
scalability still not solved.
© 2012 by The 451 Group. All rights reserved
SPRAIN
Performance – RDBMS limitations Example project/service/vendor:• Hypertable, Couchbase, Riak, Membrain, MongoDB, Redis• Data grid/cache• VoltDB, Clustrix
Associated use case:• Real time data processing of mixed read/write workloads• Data caching• Large-scale data ingestion
© 2012 by The 451 Group. All rights reserved
SPRAIN
Performance Tellybug:• Having won the contract to deliver app for Britain’s Got Talent
realised that its MySQL/Django/Python stack couldn’t deliver the anticipated load
Spotify:• Major upgrades without service interruptions, not possible with
sharded SQL databases
Rackspace:• Ability to monitor a million different things, ability to withstand the
failure of 1/3 data centers
© 2012 by The 451 Group. All rights reserved
SPRAIN
Relaxed consistency - CAP Theorem Example project/service/vendor:• Dynamo, Voldemort, Cassandra, Riak• Amazon DynamoDB
Associated use case:• Multi-data center replication • Service availability• Non-transactional data off-load
© 2012 by The 451 Group. All rights reserved
SPRAIN
Relaxed consistency Netflix:• “We value availability over consistency. We don’t need full
consistency.”
Tellybug:• Soft production data – the ability to ignore failures
• Spotify:• Flexibility to combine different consistency levels for different
column families in a single application
© 2012 by The 451 Group. All rights reserved
SPRAIN
Agility - polyglot persistence, schema-less Example project/service/vendor:• MongoDB, CouchDB, Cassandra, Riak• Google App Engine, SimpleDB,
Associated use case:• Mobile/remote device synchronization• Agile development• Data caching
© 2012 by The 451 Group. All rights reserved
SPRAIN
Agility Tellybug:• Had 4-5 weeks to ship a production system to meet contract• Ability to re-build entire infrastructure from scratch in 10-15 mins
Netflix:• Single SQL database had to be brought down to change the schema.• Put the logic into the Web services and employ distributed key
stores to enable agile development and schema changes
Spotify:• Cassandra is a platform for quickly developing new applications
© 2012 by The 451 Group. All rights reserved
SPRAIN
Intricacy – volume, velocity, variety Example project/service/vendor:• Neo4j, GraphDB, InfiniteGraph• Apache Cassandra, Hadoop, Riak• VoltDB, Clustrix
Associated use case:• Social networking applications• Geo-locational applications• Configuration management database
© 2012 by The 451 Group. All rights reserved
SPRAIN
Intricacy Dachis Group:• Combining social media data for 2,000 brands with sentiment,
relationship, conversations, social graph Spotify:• More than half a billion playlists, about 10 thousand requests per
second at peak Rackspace: One row per customer with thousands of columns – potentially
millions of columns Tellybug:• Ability to cope with 10,000 interactions/sec and integrate that with
the social graph for thousands of concurrent users
© 2012 by The 451 Group. All rights reserved
SPRAIN
Necessity - open source The failure of existing suppliers to address emerging
requirements
Example projects:• BigTable: Google• Dynamo: Amazon• Cassandra: Facebook• HBase: Powerset• Voldemort: LinkedIn• Hypertable: Zvents• Neo4j: Windh Technologies
© 2012 by The 451 Group. All rights reserved
SPRAIN
Necessity Spotify: • Created own backup and restore technology Rackspace: • Distributed secondary indexing, blob storage, many other projects Tellybug: • Created sharded counter in memcached, wrote several tools to
figure out closeness to the ‘truth’ Netflix: • Developed multiple tools including multi-datacenter replication,
automated configuration and backup, provisioning interface etc
© 2012 by The 451 Group. All rights reserved
Relevant reports
MySQL, NoSQL and NewSQL• Assessing the competitive dynamic between the MySQL
ecosystem, NoSQL and NewSQL technologies
• Due May 2012
• Including market sizing of the threedatabase segments
• Survey of 200+ database users
COMINGSOON