22
© 2012 by The 451 Group. All rights reserved NoSQL, NewSQL, and Beyond The answer to SPRAINed relational databases Matthew Aslett Research Manager, Data Management and Analytics 1

Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

  • Upload
    acunu

  • View
    1.988

  • Download
    2

Embed Size (px)

DESCRIPTION

Slides from Matt Aslett's presentation at Cassandra Europe 2012.

Citation preview

Page 1: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

1

NoSQL, NewSQL, and BeyondThe answer to SPRAINed relational databases

Matthew AslettResearch Manager, Data Management and Analytics

Page 2: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

451 Research

Information Management Operational databasesData warehousingData cachingEvent processing

Matthew Aslett• Research manager, data management and analytics• With The 451 Group since 2007• www.twitter.com/maslett

Commercial Adoption of Open Source (CAOS)Open source projectsAdoption of open source softwareVendor strategies

Page 3: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

3

The 451 Group

Page 4: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

Relevant reports

NoSQL, NewSQL and Beyond• Assessing the drivers behind the development and

adoption of NoSQL and NewSQL databases, as well as data grid/caching technologies

• Released April 2011

• Role of open source in driving innovation

[email protected]

Page 5: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

NoSQL, NewSQL and Beyond

NoSQL New breed of non-relational

database products Rejection of fixed table

schema and join operations Designed to meet scalability

requirements of distributed architectures

And/or schema-less data management requirements

NewSQL New breed of relational

database products Retain SQL and ACID Designed to meet scalability

requirements of distributed architectures

Or improve performance so horizontal scalability is no longer a necessity

… and Beyond In-memory data grid/cache products Potential primary platform for distributed data management

Page 6: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

The NoSQL landscape

Key Value Store• Citrusleaf• HandlerSocket*• Redis• Voldemort• Membrain• Oracle NoSQL• Castle• RethinkDB• LevelDB• Cassandra• DataStax EE• Acunu

• HBase• Hypertable

• DynamoDB• Redis-to-go• SimpleDB

• Riak• CouchBase• Mongo Labs• Mongo HQ

• App Engine Datastore -as-a-Service

Graph• InfiniteGraph• Neo4j• DEX• OrientDB• NuvolaBase

Document• RavenDB• MongoDB• CouchDB• Cloudant• Iris Couch

Big Tables

Page 7: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

The NewSQL ecosystem

• ParElastic• Continuent• Galera

• NuoDB• SQLFire

• Translattice• Clustrix

• Schooner SQL• ScaleBase• ScaleArc

• CodeFutures

• GenieDB

• ScaleDB• MySQL Cluster • Zimory Scale

New databases

• Xeround• Tokutek

Storage engines

Clustering/sharding

-as-a-Service

• Drizzle• Akiban

• Drizzle• VoltDB• JustOne DB

Page 8: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAINED RELATIONAL DATABASES

Photo credit: Foxtongue on Flickrhttp://www.flickr.com/photos/foxtongue/4844016087/

Page 9: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Scalability - Hardware economics Example project/service/vendor:• BigTable, HBase, Riak, MongoDB, Couchbase, Hadoop• Xeround, NuoDB• Data grid/cache

Associated use case:• Large-scale distributed data storage• Analysis of continuously updated data• Multi-tenant PaaS data layer

Page 10: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Scalability Netflix:• 37X growth in requests Jan 2010-Jan 2011• “We had to get out of the datacenter business” Dachis Group:• Specifically wanted a horizontally scalable data store Spotify:• Importance (and challenges) of cross-datacenter replication Tellybug:• Peak scalability – mission critical for two hours per week. Elastic

scalability still not solved.

Page 11: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Performance – RDBMS limitations Example project/service/vendor:• Hypertable, Couchbase, Riak, Membrain, MongoDB, Redis• Data grid/cache• VoltDB, Clustrix

Associated use case:• Real time data processing of mixed read/write workloads• Data caching• Large-scale data ingestion

Page 12: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Performance Tellybug:• Having won the contract to deliver app for Britain’s Got Talent

realised that its MySQL/Django/Python stack couldn’t deliver the anticipated load

Spotify:• Major upgrades without service interruptions, not possible with

sharded SQL databases

Rackspace:• Ability to monitor a million different things, ability to withstand the

failure of 1/3 data centers

Page 13: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Relaxed consistency - CAP Theorem Example project/service/vendor:• Dynamo, Voldemort, Cassandra, Riak• Amazon DynamoDB

Associated use case:• Multi-data center replication • Service availability• Non-transactional data off-load

Page 14: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Relaxed consistency Netflix:• “We value availability over consistency. We don’t need full

consistency.”

Tellybug:• Soft production data – the ability to ignore failures

• Spotify:• Flexibility to combine different consistency levels for different

column families in a single application

Page 15: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Agility - polyglot persistence, schema-less Example project/service/vendor:• MongoDB, CouchDB, Cassandra, Riak• Google App Engine, SimpleDB,

Associated use case:• Mobile/remote device synchronization• Agile development• Data caching

Page 16: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Agility Tellybug:• Had 4-5 weeks to ship a production system to meet contract• Ability to re-build entire infrastructure from scratch in 10-15 mins

Netflix:• Single SQL database had to be brought down to change the schema.• Put the logic into the Web services and employ distributed key

stores to enable agile development and schema changes

Spotify:• Cassandra is a platform for quickly developing new applications

Page 17: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Intricacy – volume, velocity, variety Example project/service/vendor:• Neo4j, GraphDB, InfiniteGraph• Apache Cassandra, Hadoop, Riak• VoltDB, Clustrix

Associated use case:• Social networking applications• Geo-locational applications• Configuration management database

Page 18: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Intricacy Dachis Group:• Combining social media data for 2,000 brands with sentiment,

relationship, conversations, social graph Spotify:• More than half a billion playlists, about 10 thousand requests per

second at peak Rackspace: One row per customer with thousands of columns – potentially

millions of columns Tellybug:• Ability to cope with 10,000 interactions/sec and integrate that with

the social graph for thousands of concurrent users

Page 19: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Necessity - open source The failure of existing suppliers to address emerging

requirements

Example projects:• BigTable: Google• Dynamo: Amazon• Cassandra: Facebook• HBase: Powerset• Voldemort: LinkedIn• Hypertable: Zvents• Neo4j: Windh Technologies

Page 20: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

SPRAIN

Necessity Spotify: • Created own backup and restore technology Rackspace: • Distributed secondary indexing, blob storage, many other projects Tellybug: • Created sharded counter in memcached, wrote several tools to

figure out closeness to the ‘truth’ Netflix: • Developed multiple tools including multi-datacenter replication,

automated configuration and backup, provisioning interface etc

Page 21: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

Relevant reports

MySQL, NoSQL and NewSQL• Assessing the competitive dynamic between the MySQL

ecosystem, NoSQL and NewSQL technologies

• Due May 2012

• Including market sizing of the threedatabase segments

• Survey of 200+ database users

[email protected]

COMINGSOON

Page 22: Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 Research

© 2012 by The 451 Group. All rights reserved

Thank you. Questions? [email protected]

@maslett