Reporting from the Trenches: Intuit & Cassandra

Reporting from the Trenches – How Intuit Uses Cassandra Effectively to Improve Customer Experiences

Rekha Joshi, Staff Engineer, Intuit, Inc.

Thank you for joining. We will begin shortly.

Webinar Housekeeping

© 2015 DataStax, All Rights Reserved. 2

All attendees placed on mute

Input questions at any time using the online interface

Speaker Bio

Rekha Joshi, Staff Engineer at Intuit, Inc.

O’Reilly Certified Apache Cassandra Architect

1 About Intuit

2 Use Case: Personalized A/B Testing

3 Database Requirements

4 Cassandra: Intuit NoSQL Standard

5 Using Cassandra Effectively

Intuit On Mission

Intuit Data Platforms

50M+ customers to handle

6+ petabytes of data

Manage all of the data

Complex compliance

Public and private cloud

45M+ Customers

Use Case: Personalized A/B Testing

Opinion-vs-Opinion Wars

Huge Investment

Angry Customer

Experiment, experiment, experiment!

Let Data Be The Decision Maker!

No Personalized A/B Testing?

With Personalized A/B Testing!!

To Continuously Improve User Experience, Data Is Better Than Guesswork!

Personalized A/B Testing Platform

User Assignment

Personalization Service

Segmentation Filters and Sampling

Personalization Engine

Analytics

Set up and administration

Profile Store

User Actions

A/B Testing Service

Deployment: Amazon Cloud, Jenkins, Coopr, Chef, CloudFormation, ECS/Docker

Monitoring: CloudWatch, Splunk, Graphite, Grafana, Logstash, Prometheus, New Relic

Alerting: Sensu, New Relic Alerts, HipChat, PagerDuty

Database Requirements

• High Data Security
• No Data Loss
• No Downtime
• Linear Scalability
• Tunable Consistency
• Performance Under Workloads

All This Data!!!!!

Can I Lift This Alone?

Need for Speed

Cassandra, Who?

Cassandra is a Java based NoSQL, linearly scalable, best in class tunable performance, fault tolerant, distributed, masterless, time series database.

Cassandra: The Hybrid Kid has the Edge!

Dynamo (Amazon), influencing Cassandra: inherits data distribution

Bigtable (Google), influencing Cassandra: inherits data model

Cassandra: Masterless Architecture, Linear Scalability, Tunable Consistency/Performance

Application Query Access Patterns

Cassandra and DataStax Enterprise

Advanced Security

Integrated Analytics (Spark)

Advanced Tools

24/7 Support

A Truly Successful Software

• Solves A Real Need
• Is A Building Block for Platforms
• Becomes Open Source
• Gets Commercial Backing
• Tools Ecosystem Builds Around It
• Establishes Strong User Base
• Companies in Critical Domains Use It!!

Database Options

Intuit and Cassandra

Cassandra = Intuit Technology Standard of Choice for NoSQL Distributed Database

• High Data Security
• No Data Loss
• No Downtime
• Linear Scalability
• Tunable Consistency
• Performance Under Workloads

Other NoSQL variants

Did You Use Cassandra Effectively?

Garbage Collection Issue

New objects are created at a faster rate than they are GC’ed. This can cause STOP-THE-WORLD GC pauses!
• Configure heap size, MAX_HEAP_SIZE
• Set up GC logging, CASSANDRA_HEAP_DIR
• Configure CMS GC/G1GC
• Automated Heap Dump
• Upgrade System

Cassandra is a Java based NoSQL linearly scalable, fault tolerant, distributed time series database.
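To make the heap-size bullet concrete, here is a minimal sketch of the default sizing rule that the stock cassandra-env.sh script of that era is understood to apply when MAX_HEAP_SIZE is left unset: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)), with the young generation capped at min(100 MB per core, 1/4 of the heap). The function names are hypothetical and the numbers are only a starting point; check the script shipped with your version before relying on them.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Default max heap: max(min(RAM/2, 1024 MB), min(RAM/4, 8192 MB))."""
    half_capped = min(system_memory_mb // 2, 1024)
    quarter_capped = min(system_memory_mb // 4, 8192)
    return max(half_capped, quarter_capped)


def default_new_gen_mb(max_heap_mb: int, cpu_cores: int) -> int:
    """Default young generation: min(100 MB per core, heap/4)."""
    return min(100 * cpu_cores, max_heap_mb // 4)


if __name__ == "__main__":
    # Hypothetical machine sizes, chosen only to exercise each branch.
    for ram_mb, cores in [(2048, 2), (16384, 8), (65536, 16)]:
        heap = default_max_heap_mb(ram_mb)
        new_gen = default_new_gen_mb(heap, cores)
        print(f"RAM={ram_mb} MB cores={cores} -> heap={heap} MB, new gen={new_gen} MB")
```

The 8 GB cap reflects the CMS-era concern that larger heaps lengthen pauses; with G1GC, larger heaps are commonly used.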

Clock Issue

When you move setups or do upgrades, ensure the NTP server is set correctly.

Cassandra is a NoSQL linearly scalable, fault tolerant, distributed time series database.
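Why clocks matter: Cassandra reconciles conflicting versions of a cell by write timestamp (last write wins). The sketch below is a simplified illustration of that rule, not Cassandra's code; the Cell class, the resolve function, and the timestamps are hypothetical, and tie-breaking on equal timestamps is simplified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    write_timestamp_us: int  # microseconds, taken from the writer's clock


def resolve(a: Cell, b: Cell) -> Cell:
    """Last write wins: keep the cell with the higher write timestamp."""
    return a if a.write_timestamp_us >= b.write_timestamp_us else b


# The first write comes from a machine with a correct clock.
first = Cell("first", write_timestamp_us=10_000_000)

# The second write happens 2 s later in real time, but the writer's clock
# lags by 5 s, so it stamps the write with an earlier timestamp.
clock_skew_us = -5_000_000
second = Cell("second", write_timestamp_us=12_000_000 + clock_skew_us)

winner = resolve(first, second)
print(winner.value)
```

The later write silently loses because its timestamp is older, which is why a wrong NTP setup after a move or upgrade can look like data loss.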

Understand the Node Ring

Repeat after me: Cassandra is a Java based NoSQL linearly scalable, best in class tunable performance, fault tolerant, distributed, masterless, time series database.

nodetool status
nodetool ring
nodetool info
nodetool cfstats
nodetool tpstats
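A minimal sketch of the ring idea, under stated assumptions: one token per node (no vnodes), replicas placed on the next nodes clockwise (as SimpleStrategy does), and MD5 from hashlib standing in for Cassandra's default Murmur3 partitioner. The TokenRing class, node names, and tokens are hypothetical.

```python
import bisect
import hashlib
from typing import Dict, List


class TokenRing:
    def __init__(self, node_tokens: Dict[str, int]) -> None:
        # Sort nodes by token so the list models the ring in clockwise order.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def token_for(key: str, ring_size: int = 2**32) -> int:
        """Hash a partition key onto the ring (MD5 is an illustrative stand-in)."""
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % ring_size

    def replicas(self, key_token: int, replication_factor: int) -> List[str]:
        """Owner is the first node whose token >= key token, wrapping around."""
        start = bisect.bisect_left(self._tokens, key_token) % len(self._ring)
        count = min(replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


if __name__ == "__main__":
    ring = TokenRing({"node-a": 100, "node-b": 200, "node-c": 300})
    print(ring.replicas(150, 2))  # token 150 falls in node-b's range
    print(ring.replicas(301, 2))  # past the last token, wraps to node-a
```

On a real cluster, nodetool ring and nodetool status report the actual token ownership that this sketch only approximates.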

What If A Node Goes Down?

Replication
Consistency
nodetool repair
nodetool decommission
nodetool snapshot

Cassandra is a NoSQL linearly scalable, fault tolerant, distributed, masterless time series database.
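How replication and consistency interact when a node is down can be sketched with simple arithmetic: QUORUM needs floor(RF / 2) + 1 replicas, and reads see the latest write when read replicas plus write replicas exceed the replication factor. The helper names below are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def is_strongly_consistent(replication_factor: int, write_replicas: int, read_replicas: int) -> bool:
    """Read and write sets overlap in at least one replica when R + W > RF."""
    return write_replicas + read_replicas > replication_factor


def tolerated_failures(replication_factor: int, required_replicas: int) -> int:
    """Replicas that can be down while the consistency level is still met."""
    return replication_factor - required_replicas


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(f"RF={rf}: QUORUM={q}, nodes that may be down={tolerated_failures(rf, q)}")
    print("QUORUM/QUORUM strongly consistent:", is_strongly_consistent(rf, q, q))
    print("ONE/ONE strongly consistent:", is_strongly_consistent(rf, 1, 1))
```

With RF=3, QUORUM reads and writes keep working with one replica down; the node that was down catches up afterwards through mechanisms such as nodetool repair.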

Tuning The Application

Cassandra is a Java based NoSQL linearly scalable, best in class tunable performance, fault tolerant, distributed, masterless, time series database.

Refactor data model
Revisit the usage access patterns
Paranoid Monitoring

Tuning For Reads

• Caching Layer – Key Cache/Row Cache
• SSTable Compaction Frequency
• Reading from multiple SSTables is inefficient

Cassandra is a Java based NoSQL linearly scalable, best in class tunable performance, fault tolerant, distributed time series database.
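Why multiple SSTables hurt reads, as a toy model: a read may have to consult every SSTable that could hold the key and keep the newest version, while compaction merges them so later reads touch fewer files. This sketch ignores bloom filters, caches, and tombstones, and all names in it are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Toy SSTable: partition key -> (write timestamp, value).
SSTable = Dict[str, Tuple[int, str]]


def read(key: str, sstables: List[SSTable]) -> Tuple[Optional[str], int]:
    """Return the newest value for key and how many SSTables were consulted."""
    newest: Optional[Tuple[int, str]] = None
    consulted = 0
    for table in sstables:
        consulted += 1
        cell = table.get(key)
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    return (newest[1] if newest is not None else None, consulted)


def compact(sstables: List[SSTable]) -> SSTable:
    """Merge SSTables into one, keeping only the newest cell per key."""
    merged: SSTable = {}
    for table in sstables:
        for key, cell in table.items():
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell
    return merged


if __name__ == "__main__":
    tables: List[SSTable] = [{"u1": (1, "A")}, {"u1": (3, "B")}, {"u2": (2, "X")}]
    print(read("u1", tables))             # value and SSTables consulted before compaction
    print(read("u1", [compact(tables)]))  # same value after compaction
```

The key cache and row cache from the first bullet attack the same cost from the other side, by avoiding disk lookups for hot keys.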

Tuning For Writes

Cassandra is a Java based NoSQL linearly scalable, best in class tunable performance, fault tolerant, distributed time series database.

• Memtable – Fast Writes
• CommitLog – Separate Dedicated Disk
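A toy sketch of the write path behind these two bullets: each write is appended sequentially to a commit log for durability, then applied to an in-memory memtable, which is flushed to an immutable sorted table once it fills. The TinyWritePath class and its threshold are hypothetical, and syncing on every write is a simplification (Cassandra's commit log sync is configurable).

```python
import os
import tempfile
from typing import Dict, List


class TinyWritePath:
    def __init__(self, commit_log_path: str, flush_threshold: int = 3) -> None:
        self._log = open(commit_log_path, "a", encoding="utf-8")
        self.memtable: Dict[str, str] = {}
        self.sstables: List[Dict[str, str]] = []
        self._flush_threshold = flush_threshold

    def write(self, key: str, value: str) -> None:
        # 1. Sequential append to the commit log (durability).
        self._log.write(f"{key}\t{value}\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # 2. In-memory update (this is what makes writes fast).
        self.memtable[key] = value
        if len(self.memtable) >= self._flush_threshold:
            self.flush()

    def flush(self) -> None:
        # 3. Persist the memtable as an immutable table sorted by key.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def close(self) -> None:
        self._log.close()


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "commitlog.txt")
    store = TinyWritePath(log_path, flush_threshold=2)
    for k, v in [("a", "1"), ("b", "2"), ("c", "3")]:
        store.write(k, v)
    store.close()
    print(store.sstables, store.memtable)
```

Because the commit log is append-only, giving it its own disk keeps those sequential writes from competing with SSTable reads and compaction I/O.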

Tuning the System

EXT4 Filesystem
System Memory, CPU, Disk
Paranoid Monitoring

Cassandra is a NoSQL linearly scalable, fault tolerant, distributed, masterless time series database.

Little Talked Aspect Of The Pareto Principle!

Heavy Lifting? Easy!

Thank you!

Input questions at any time using the online interface

Q & A

https://www.linkedin.com/in/rekhajoshm
https://twitter.com/rekhajoshm