Cassandra 1.0 The Future Of Big Data Matthew F. Dennis // @mdennis 7th Advanced Computing Conference Seoul, South Korea February 15th, 2012

The Future Of Big Data

DESCRIPTION

A high-level overview of common Cassandra use cases, adoption reasons, Big Data trends, DataStax Enterprise, and the future of Big Data, given at the 7th Advanced Computing Conference in Seoul, South Korea.

Page 1: The Future Of Big Data

Cassandra 1.0

The Future Of Big Data

Matthew F. Dennis // @mdennis

7th Advanced Computing Conference

Seoul, South Korea

February 15th, 2012

Page 2: The Future Of Big Data

Cassandra Job Trends (indeed.com)

Page 3: The Future Of Big Data

Cassandra Job Trends (indeed.com)

Page 4: The Future Of Big Data

“Big Data” Job Trends (indeed.com)

Page 5: The Future Of Big Data

Big Data

Page 6: The Future Of Big Data

Why People Choose Cassandra

True Multi-DC Support

Linearly scalable

Larger-than-memory datasets

Best-in-class performance (not just for writes!)

Fully durable

Integrated caching

Tuneable consistency

No single point of failure (SPOF)
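The "tuneable consistency" item above is often explained with simple replica arithmetic: with N replicas per row, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > N. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and are not part of any Cassandra client API.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority quorum for the given number of replicas."""
    return replication_factor // 2 + 1


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when any R read replicas must include one of the W written replicas."""
    return r + w > n


if __name__ == "__main__":
    n = 3                # replicas per row (assumed for illustration)
    q = quorum(n)        # majority of 3 replicas
    # Quorum writes plus quorum reads always overlap.
    print(read_overlaps_write(n, w=q, r=q))
    # Single-replica writes and reads trade that guarantee for lower latency.
    print(read_overlaps_write(n, w=1, r=1))
```

Choosing smaller R and W per request favors latency and availability; choosing values that satisfy R + W > N favors reading the most recent write.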

Page 7: The Future Of Big Data

Common Cassandra Use Cases

Time Series 

Sensor Data

Messaging

Ad Tracking

Financial Market Data

User Activity Streams

Fraud Detection / Risk Analysis

Anything Requiring: linear scale + high performance + global availability

Page 8: The Future Of Big Data

“With Cassandra, we get better business agility, and we don’t have to plan capacity in advance, we don’t need to ask permission of other people to build things for us, and we don’t worry about running out of space or power.”   

Adrian Cockcroft, Cloud Architect

Page 9: The Future Of Big Data

Netflix’s problems

Could not build datacenters fast enough

Made decision to go to cloud (AWS)

Cassandra on AWS is a key infrastructure component of its globally distributed streaming product.

Applications include Netflix’s subscriber system, AB testing, and viewing history service (including pause/resume).

Page 10: The Future Of Big Data

Netflix on Cassandra

Fast

Cheap

Scalable

Flexible

No SPOF

Page 11: The Future Of Big Data

Scale Horizontally

http://www.datastax.com/1-million-writes

[Chart: Client Writes Per Second vs. Number Of Nodes]

Page 12: The Future Of Big Data

“Without Cassandra, our engineers would’ve had to create something that could scale to our needs, that would’ve prevented us from focusing on building product and solving problems for Backupify’s users, which are far more important tasks.”

Matt Conway, VP Engineering

Page 13: The Future Of Big Data

Backupify’s problem

Cloud-based utility that enables businesses and consumers to back up, search, and restore the content of popular online applications such as Google Apps, Gmail, Facebook, Twitter, and Blogger

Page 14: The Future Of Big Data

Backupify on Cassandra

Ease of scale enabled engineers to focus on building great applications

DataStax OpsCenter made it easy to monitor the health and performance of their cluster

Reliable, redundant, scalable and cheap data storage helped eliminate downtime

Ability to offer not only backup and storage, but also analysis of data in the future

Page 15: The Future Of Big Data

“You can seamlessly add new nodes and expand your total capacity without deteriorating the performance of the data store. Cassandra has allowed us to scale very effectively.”

Harry Robertson, Tech Lead

Page 16: The Future Of Big Data

Ooyala’s problem

Ooyala provides a suite of technologies and services that support content owners in managing, analyzing and monetizing the digital video they publish online

Page 17: The Future Of Big Data

Ooyala on Cassandra

Classic “Big Data” problem did not require re-architecting

Enabled application agility: developers spend time building cool apps, not figuring out how to scale

Enabled more powerful and granular analytics for their customers

Page 18: The Future Of Big Data

Financial

Social Media

Advertising

Entertainment

Energy

E-Tail

Health Care

Infrastructure

Government

Some More Cassandra Users http://www.datastax.com/cassandrausers

Page 19: The Future Of Big Data

Big Data

Page 20: The Future Of Big Data

The evolution of Analytics

Analytics + Realtime

Page 21: The Future Of Big Data

The evolution of Analytics

Analytics Realtime

replication

Page 22: The Future Of Big Data

The evolution of Analytics

ETL

Realtime Analytics

Page 23: The Future Of Big Data

DataStax Enterprise re-unifies realtime and analytics

Page 24: The Future Of Big Data

realtime and analytics

Page 25: The Future Of Big Data

Portfolio Demo dataflow

Portfolios

Historical Prices

Intermediate Results

Largest loss

Portfolios

Live Prices for today

Largest loss
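The dataflow labels above suggest a pipeline from portfolios and prices, through intermediate results, to a largest-loss figure per portfolio. The slides do not show how the demo computes this, so the following is only an illustrative sketch under stated assumptions: a portfolio is a mapping of ticker to share count, prices are per-ticker daily lists, the intermediate result is the portfolio's daily value, and "largest loss" is taken to mean the biggest one-day drop in that value. All names and data are hypothetical.

```python
from typing import Dict, List


def portfolio_values(holdings: Dict[str, int],
                     prices: Dict[str, List[float]]) -> List[float]:
    """Intermediate result: total portfolio value for each day with full price data."""
    days = min(len(prices[ticker]) for ticker in holdings)
    return [
        sum(shares * prices[ticker][day] for ticker, shares in holdings.items())
        for day in range(days)
    ]


def largest_loss(values: List[float]) -> float:
    """Largest day-over-day drop in value; 0.0 if the value never falls."""
    drops = [prev - cur for prev, cur in zip(values, values[1:])]
    return max([0.0] + drops)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    holdings = {"AAA": 10, "BBB": 5}
    prices = {"AAA": [10.0, 9.0, 9.5], "BBB": [20.0, 22.0, 18.0]}
    values = portfolio_values(holdings, prices)
    print(values)
    print(largest_loss(values))
```

The same two steps could run as a batch job over historical prices and as a low-latency lookup over today's live prices, which is the split the two halves of the slide appear to depict.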

Page 26: The Future Of Big Data

Operations

“Vanilla” Hadoop

Many pieces to set up, monitor, back up, and maintain (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker, Zookeeper, Region Server, ...)

Single points of failure

DataStax Enterprise

Single simplified system

Self-organizes based on workload

Peer to peer

JobTracker failover

No additional Cassandra config

Page 27: The Future Of Big Data

Monitoring Cassandra (OpsCenter)

Page 28: The Future Of Big Data

Q?

Matthew F. Dennis // @mdennis

http://slideshare.net/mattdennis