Transcript
Page 1: Cassandra & puppet, scaling data at $15 per month

Copyright © 2011 Constant Contact Inc. 1

Constant Contact

March 2011

Dave Connors – VP Operations
Jim Ancona – Systems Architect
Mark Schena – Manager Systems Automation

Cassandra & Puppet: Scaling data at $15/month

Page 2: Cassandra & puppet, scaling data at $15 per month

Constant Contact

2000 – 2010

Market leader for Small Businesses
• Email, Event & Survey
• Over 400k paying customers
• No. 134 on the Deloitte Technology Fast 500 listing

Business model
• Many customers pay as little as $15 a month
• ~2 million database transactions per minute

Page 3: Cassandra & puppet, scaling data at $15 per month

Constant Contact

The business problem

Page 4: Cassandra & puppet, scaling data at $15 per month

Constant Contact

Small Businesses are looking to us for help with Social Media marketing

• Social Media 10-100 times more data

• Challenge with our business model

Page 5: Cassandra & puppet, scaling data at $15 per month

The Key Challenge

Integrate social media data

• Solution = NoSQL

• Cost = Low

• Time to market = ?

Page 6: Cassandra & puppet, scaling data at $15 per month

Implementation

Ops and Dev both face issues

• Data model
• Monitoring
• Authentication
• Logging
• Risk profile
• Roles & Responsibilities

Implementing NoSQL

Page 7: Cassandra & puppet, scaling data at $15 per month

Dev

Ops

Page 8: Cassandra & puppet, scaling data at $15 per month

Apache Cassandra

• Developed at Facebook
• Open sourced in 2008
• Incubated at Apache
• Became an Apache top-level project in 2010

• http://cassandra.apache.org

• In use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, …

• Largest production cluster has over 100 TB of data in over 150 machines

Page 9: Cassandra & puppet, scaling data at $15 per month

What is Cassandra?

• Implemented in Java
• Fault Tolerant
• Elastic
• Durable
• Rich data model
• Replicated data
• Consistency options

Page 10: Cassandra & puppet, scaling data at $15 per month

Replication

[Figure: three replicas of X]

How many copies of each piece of data do we want?

N=3

Page 11: Cassandra & puppet, scaling data at $15 per month

Consistency Level ONE

[Figure: a Writer and a Reader working against replicas that hold a mix of X and Y values]

Page 12: Cassandra & puppet, scaling data at $15 per month

Consistency Level Quorum

[Figure: a Writer and a Reader working against replicas that hold a mix of X and Y values]
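
The two consistency-level slides can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the talk: with N replicas, a quorum is a simple majority, and a read is guaranteed to overlap the latest acknowledged write whenever the number of replicas read plus the number written exceeds N. The function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 out of 3."""
    return replication_factor // 2 + 1


def read_sees_latest_write(n: int, write_replicas: int, read_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replicas + write_replicas > n


N = 3  # replication factor from the Replication slide

# Consistency level ONE for both reads and writes: 1 + 1 is not > 3,
# so a reader may contact a replica the write has not reached yet.
one_is_safe = read_sees_latest_write(N, write_replicas=1, read_replicas=1)

# Consistency level QUORUM for both: 2 + 2 > 3, so the read set always
# includes at least one replica that acknowledged the write.
q = quorum(N)
quorum_is_safe = read_sees_latest_write(N, write_replicas=q, read_replicas=q)

print(q, one_is_safe, quorum_is_safe)
```

The trade-off is latency and availability: ONE needs a single replica to respond, while QUORUM needs a majority.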

Page 13: Cassandra & puppet, scaling data at $15 per month

Risks and Mitigation

Risks
• Moving target
• Developer unfamiliarity
• Operational procedures
• Reliability concerns

Mitigation
• Deployment automation
• Community involvement
• Training/Consulting
• Application selection
• Lots of monitoring
• Phased rollout

Page 14: Cassandra & puppet, scaling data at $15 per month

Development Challenges

• Understanding the data model
• Choosing a client
■ Clients available for Java, Python, .NET, Ruby, PHP
■ Don’t use Thrift
• Moving target

Page 15: Cassandra & puppet, scaling data at $15 per month

Open Source

• Not “one neck to wring”
• Paid support and training are available: http://datastax.com
• Community
■ Mailing lists
■ IRC #cassandra at freenode
• Contribute

Page 16: Cassandra & puppet, scaling data at $15 per month

Phased Rollout

• Switchable modes
• Mirroring
• Dial-able traffic
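
One way to picture switchable modes, mirroring, and dial-able traffic is a thin data-access wrapper. The sketch below is an assumption about how such a rollout could be structured, not Constant Contact's implementation: the `PhasedStore` class, its mode names, and the dict-backed stores are all hypothetical stand-ins for the existing database and Cassandra.

```python
import zlib


class PhasedStore:
    """Route traffic between an old and a new store during a rollout.

    Hypothetical modes:
      "old"    - all reads and writes go to the old store
      "mirror" - writes go to both stores, reads still come from the old one
      "dial"   - writes go to both, `dial_percent` of keys are read from new
      "new"    - all reads and writes go to the new store
    """

    def __init__(self, old, new, mode="old", dial_percent=0):
        self.old, self.new = old, new
        self.mode = mode
        self.dial_percent = dial_percent

    def _bucket(self, key: str) -> int:
        # Stable 0-99 bucket so a given key always takes the same path.
        return zlib.crc32(key.encode("utf-8")) % 100

    def write(self, key, value):
        if self.mode != "new":
            self.old[key] = value
        if self.mode != "old":
            self.new[key] = value

    def read(self, key):
        if self.mode == "new":
            return self.new.get(key)
        if self.mode == "dial" and self._bucket(key) < self.dial_percent:
            return self.new.get(key)
        return self.old.get(key)


legacy, cassandra = {}, {}  # plain dicts stand in for real clients
store = PhasedStore(legacy, cassandra, mode="mirror")
store.write("contact:1", "first")   # mirrored: lands in both stores
store.mode, store.dial_percent = "dial", 100
print(store.read("contact:1"))      # read path now goes to the new store
```

Because the mode and dial are plain attributes, they can be changed at runtime, which is what makes it possible to back out quickly if the new store misbehaves.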

Page 17: Cassandra & puppet, scaling data at $15 per month

Collaboration

• Big, complex project
• Close collaboration
• Flexible roles
• Ability to iterate

Page 18: Cassandra & puppet, scaling data at $15 per month

Dev

Ops

Page 19: Cassandra & puppet, scaling data at $15 per month

“Are you sure you really want that?”

• 3 x 500G disks
• 1 x 250G disk
• No SWAP
• RAID Zero Root Partition and Data Storage
• 32G Memory

Page 20: Cassandra & puppet, scaling data at $15 per month

We will need how many servers?

Page 21: Cassandra & puppet, scaling data at $15 per month

How many nodes?

• Quorum = 3
• Multiple Datacenters = 2
• Use only half the available disk = 2
• 12 Servers = ~1 TB Of Data Storage
• ~6 TB of Data Storage

3 x 2 = 6
6 x 2 = 12
12 x 6 = 72
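
The node-count arithmetic on this slide can be written out as a short calculation. This is a sketch that follows the slide's own multipliers; the function and parameter names are hypothetical.

```python
def nodes_needed(data_tb, replication_factor=3, datacenters=2, disk_headroom=2):
    """Servers needed for `data_tb` of data, using the slide's multipliers."""
    servers_per_tb = replication_factor * datacenters * disk_headroom
    return servers_per_tb * data_tb


servers_per_tb = nodes_needed(1)  # 3 x 2 x 2 servers for ~1 TB
total_nodes = nodes_needed(6)     # scaled up to ~6 TB of data
print(servers_per_tb, total_nodes)
```

Each multiplier is a separate design decision, so relaxing any one of them (for example a single datacenter) changes the server count proportionally.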

Page 22: Cassandra & puppet, scaling data at $15 per month

Random Partitioner
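
Cassandra's RandomPartitioner places each row by the MD5 hash of its key on a token ring that runs from 0 to 2**127. A common way to balance a new cluster is to space the nodes' initial tokens evenly around that ring. The sketch below shows that calculation only; it is not taken from the talk, and the function name is hypothetical.

```python
RING_SIZE = 2 ** 127  # size of the RandomPartitioner token space


def initial_tokens(node_count):
    """Evenly spaced starting tokens, one per node, in ring order."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


tokens = initial_tokens(72)  # 72 nodes, as in the sizing slide
print(tokens[0], tokens[1])
```

Generating these values programmatically is what makes it practical to hand each node its token through configuration management rather than by hand.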

Page 23: Cassandra & puppet, scaling data at $15 per month

Tool Chain

Page 24: Cassandra & puppet, scaling data at $15 per month

DevOps with Puppet

• Puppet is the shared framework between Operations and Development

• Versioning of Puppet code allows for adoption of development best practices

• Leverage domain-specific knowledge and skill

Page 25: Cassandra & puppet, scaling data at $15 per month

Always Move Forward

Page 26: Cassandra & puppet, scaling data at $15 per month

Operational Efficiencies

• Remote logging is a requirement
• Cassandra uses log4j natively
• Resources not available for remote log4j development
• Scribed with Puppet provides the solution

Page 27: Cassandra & puppet, scaling data at $15 per month

Development takes the Operational Lead

• Munin
• JMX trending
• Identify critical data points
• Rapid development of graphs
• Puppet Definitions are used for rapid deployment

Page 28: Cassandra & puppet, scaling data at $15 per month

Sample Munin Graph

Page 29: Cassandra & puppet, scaling data at $15 per month

Puppet Code

define munin::cassandracolumnfamily ( ) {
  include cassandravirtual
  # Realize the virtual "jmxbin" file resource
  File <| title == "jmxbin" |>

  $confdir   = "/opt/cassandra-munin-plugins"
  $plugindir = "/etc/munin/plugins"
  $target    = "/opt/cassandra-munin-plugins/jmx_"
  # Match 3 strings separated by periods
  $pattern      = '^([^.]*)[.]([^.]*)[.]([^.]*)$'
  $keyspace     = regsubst($name, $pattern, '\1')
  $columnfamily = regsubst($name, $pattern, '\2')
  $file         = regsubst($name, $pattern, '\3')

  # Plugin configuration file, rendered from a per-attribute template
  file { "${keyspace}_${columnfamily}_${file}.conf":
    ensure  => 'file',
    owner   => 'root',
    group   => 'root',
    mode    => '644',
    path    => "${confdir}/${keyspace}_${columnfamily}_${file}.conf",
    content => template("munin/attribute_${file}.conf.erb"),
    require => [
      Package['munin-node'],
      File['/opt/cassandra-munin-plugins'],
      File['jmxquery'],
    ],
  }

  # Symlink in the Munin plugin directory that points at the shared jmx_ plugin
  file { "${plugindir}/${keyspace}_${columnfamily}_${file}":
    ensure  => 'link',
    owner   => 'root',
    group   => 'root',
    mode    => '511',
    target  => $target,
    require => [
      File['/opt/cassandra-munin-plugins'],
      File["${keyspace}_${columnfamily}_${file}.conf"],
      File['jmxquery'],
      Package['munin-node'],
    ],
  }
}

Example: Munin Puppet Code
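
The define derives everything from its resource title, which it expects to be three period-separated parts. The Python sketch below mirrors the three `regsubst` calls to show how one title is split and which configuration path results. The sample title `Keyspace1.Standard1.read_latency` is a made-up value, and the function name is hypothetical.

```python
import re

# Same pattern as the Puppet define: 3 strings separated by periods.
PATTERN = r'^([^.]*)[.]([^.]*)[.]([^.]*)$'


def split_resource_name(name):
    """Mirror the three regsubst() calls: one backreference per part."""
    keyspace = re.sub(PATTERN, r'\1', name)
    columnfamily = re.sub(PATTERN, r'\2', name)
    file = re.sub(PATTERN, r'\3', name)
    return keyspace, columnfamily, file


name = "Keyspace1.Standard1.read_latency"  # hypothetical resource title
keyspace, columnfamily, file = split_resource_name(name)
conf_path = f"/opt/cassandra-munin-plugins/{keyspace}_{columnfamily}_{file}.conf"
print(conf_path)
```

As with `regsubst`, a title that does not match the pattern comes back unchanged for all three parts, so titles need to follow the three-part form for the generated names to make sense.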

Page 30: Cassandra & puppet, scaling data at $15 per month

Conclusion

• Cassandra as an appliance
• Development Best Practices with Life Cycle Management
• Traditional vs. Today
■ Infrastructure: 4 weeks (traditional) vs. 4 hours (today) to build 72 nodes
■ Development to Deployment: 9 months (traditional) vs. 3 months (today)
■ Cost: Millions (traditional) vs. 150k (today)

Page 31: Cassandra & puppet, scaling data at $15 per month

Q&A

Thank You!

