Big Data Little Tests John Heintz Founder, Gist Labs Technical Consultant, Cutter Consortium john@gistlabs.com @jheintz http://gistlabs.com


Big Data Little Tests

John Heintz

Founder, Gist Labs Technical Consultant, Cutter Consortium

john@gistlabs.com @jheintz

http://gistlabs.com


© 2012 Gist Labs, LLC

About John Heintz

•  Developer since 1995

•  Agilist since 1999

•  Founded Gist Labs in 2008

•  Developer, Mentor, Consultant

•  Intuitive, Abstract, Precise


Kool-Aids I’ve drunk: Agile/Lean/Kanban, OO, TDD, REST, Mentoring, Craftsmanship, Emergent/Progressive Design, InnovationGames®, Systems and Complexity Theory


My Goals for You

•  Demystify test automation for Big Data

•  Provide executable examples


What you shouldn’t expect…

•  More than a bare introduction to Big Data concepts

• No performance tuning


Simple Code, Config

•  I went as simple and clear as possible

•  Java, JUnit4

• Maven… okay maybe not simple :-\


Mostly Code

•  Remember the Law of Two Feet

•  If code isn’t what you were looking for, I totally respect you finding something better for your time :-)


•  Everything available from http://gistlabs.com/2012/08/big-data-little-tests/

•  The entire command script is there, so take notes on the assumption that it’s available


My Soapboxes…

These are topics I’ll repeat myself on

•  Fast test execution

• One-click build


Big Data

•  Too much

•  Too fast

• Not trivially structured


Map Reduce

• Map from one input to one output

•  Reduce from many inputs to one output

•  Can be run in parallel

•  Crude, but massive
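To make the bullets above concrete, here is a minimal sketch in plain Java, with no Hadoop classes involved. The class and method names (`MapReduceSketch`, `map`, `reduce`, `totalWords`) are made up for illustration and are not taken from the talk's code. It only shows the shape of the idea: each input maps to one output independently, so the map step can run in parallel, and the reduce step folds many outputs into one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch in plain Java; none of these names come from Hadoop or the talk.
class MapReduceSketch {

    // Map: one input (a line of text) to one output (its word count).
    static int map(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Reduce: many inputs (per-line counts) to one output (the total).
    static int reduce(List<Integer> counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    // Each line is mapped independently of the others, so the map step
    // can be spread across threads (or, at scale, across machines).
    static int totalWords(List<String> lines) {
        List<Integer> counts = lines.parallelStream()
                .map(MapReduceSketch::map)
                .collect(Collectors.toList());
        return reduce(counts);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data", "little tests", "");
        System.out.println(totalWords(lines)); // prints 4
    }
}
```

Because `map` and `reduce` are ordinary functions here, each can be tested on a handful of crafted inputs without starting a cluster.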


CAP Theorem

•  Consistency

•  Availability

•  Partition Tolerance


Big Data Ecosystem

•  Hadoop: A giant among giants

(Tons of projects on this platform!!)

•  Cassandra: Feels like a weird RDBMS

•  Riak: An elegant key/value/search store

• MongoDB: Document store


Let’s Run Some Code


Hadoop Tests


Riak tests


Other Frameworks

•  CassandraUnit

https://github.com/jsevellec/cassandra-unit

•  PigUnit, for Pig (a Hadoop query language)

http://pig.apache.org/docs/r0.8.1/pigunit.html


Code Questions?

•  Fast test execution?

• One-click build?


What about Big Tests?

•  Real test data

•  Realistic cluster


Real Test Data

My favorite strategy is to:

•  Develop with small, crafted data

•  Build/test the same way

•  Run another test on top of real prod data
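One way this strategy might look in code is sketched below, in plain Java so it runs without JUnit or a cluster. Everything in it is an assumption made for illustration: the job (`countByKey`), the comma-separated record layout, and the `realdata.path` system property are hypothetical and not from the talk. The crafted-data check asserts an exact answer. The real-data check reuses the same job but can only assert an invariant, because the exact answer is not known in advance.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch; the job, the record layout, and the "realdata.path"
// property are all hypothetical.
class RealDataSketch {

    // Job under test: count records by their first comma-separated field.
    static Map<String, Long> countByKey(List<String> records) {
        Map<String, Long> counts = new TreeMap<>();
        for (String record : records) {
            String key = record.split(",", 2)[0];
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // Little test: small crafted input, exact expected output.
    static void checkCrafted() {
        Map<String, Long> counts = countByKey(Arrays.asList("a,1", "b,2", "a,3"));
        if (counts.size() != 2 || counts.get("a") != 2L || counts.get("b") != 1L) {
            throw new AssertionError("unexpected counts: " + counts);
        }
    }

    // Bigger test: same job, real records. The exact answer is unknown,
    // so check an invariant: every record is counted exactly once.
    static void checkInvariant(List<String> records) {
        long total = 0;
        for (long c : countByKey(records).values()) {
            total += c;
        }
        if (total != records.size()) {
            throw new AssertionError("counted " + total + " of " + records.size());
        }
    }

    public static void main(String[] args) throws IOException {
        checkCrafted();
        // Typically unset on a developer machine, set on a server that can see real data.
        String path = System.getProperty("realdata.path");
        if (path != null) {
            checkInvariant(Files.readAllLines(Paths.get(path)));
        }
        System.out.println("ok");
    }
}
```

In a JUnit4 suite the same split could be expressed as two test methods, with the real-data one skipped via `org.junit.Assume` when no data path is configured.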


[Diagram. Pipeline labels: Developers, Version Control, Continuous Integration Servers, Build Cluster, Test1 Cluster, Test2 Cluster, Continuous Deployment Servers, Staging, Production. Infrastructure labels: Virtual vs Physical Servers, Network Infrastructure, Storage Infrastructure, Developer Sandboxes, Self-service Provisioning, Private vs Public Cloud.]


Realistic Cluster

•  Use a CI/DevOps environment

•  Virtualize, “X as a Service”

•  Virtual Machines

•  Virtual Infrastructure (Network, Storage)


Jenkins CI Server

•  Master/slave clusters

•  Plugins for Hadoop and VMware

•  http://jenkins-ci.org/


Big Questions?


Thank you!

•  Everything available from:

http://gistlabs.com/2012/08/big-data-little-tests/

•  John Heintz, @jheintz, http://gistlabs.com
