
Cloud Camp Chicago Dec 2012 Slides

DESCRIPTION

The slides from the December 2012 Cloud Camp Chicago, including the lightning talks from our speakers: Dave Falck, Model Metrics: node.js on AWS; Paul Mantz, CohesiveFT: Working with APIs; Bob Chojnacki, Jellyvision Labs: Hadoop on AWS; Karl Zimmerman, Steadfast: Keep control with the Private Cloud

Page 1: Cloud Camp Chicago Dec 2012 Slides

Welcome to Cloud Chicago

Live Tweet on the second screen by using: #cloudcamp @cloudcamp_chi

Sponsored by

Hosted by

Thursday, December 13, 12

Page 2: Cloud Camp Chicago Dec 2012 Slides

Agenda

6:00pm Registration, Food, Drinks and Networking
6:30 Opening Remarks, Patrick Kerpan, CohesiveFT

6:45 Lightning Talks
Dave Falck, Model Metrics: node.js on AWS
Paul Mantz, CohesiveFT: Working with APIs
Bob Chojnacki, Jellyvision Labs: Hadoop on AWS
Karl Zimmerman, Steadfast: Keep control with the Private Cloud

7:45 Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”

Emceed by Mike Dorosh, IBM & Patrick Kerpan, CohesiveFT

8:30 Breakout Sessions
9:00 Wrap Up - Drinks, anyone?

Page 3: Cloud Camp Chicago Dec 2012 Slides

Dave Falck, Customer Solutions Engineer

Page 4: Cloud Camp Chicago Dec 2012 Slides

Node.js + AWS @davidfalck

Page 5: Cloud Camp Chicago Dec 2012 Slides

Why the Node.js Buzz?

* LinkedIn’s entire mobile software stack is completely built in Node
* Why? Scale.
* Huge performance gains compared to what they were using before (Ruby on Rails)
* Went from running 15 servers with 15 instances (virtual servers) on each physical machine, to just four instances that can handle double the traffic.

Page 6: Cloud Camp Chicago Dec 2012 Slides

What is Node.js?

* JavaScript platform based on the Google Chrome V8 JS engine
* Ryan Dahl (Joyent)
* Event-driven, non-blocking I/O model to allow your applications to scale while keeping you from having to deal with threads, polling, timeouts, and event loops
* FAST
* Used for real-time, data-intensive apps (mobile!)
* POPULAR

Page 7: Cloud Camp Chicago Dec 2012 Slides

Node.js on GitHub

Page 8: Cloud Camp Chicago Dec 2012 Slides

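Hello World

// Load Node's built-in HTTP module
var http = require('http');

// Answer every request with a plain-text greeting
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, '127.0.0.1'); // listen on localhost, port 1337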

Page 9: Cloud Camp Chicago Dec 2012 Slides

What makes Node.js so fast?

* Thread-based networking is inefficient and difficult
* Node shows much better memory efficiency under high loads than systems which allocate 2 MB thread stacks for each connection.
* Users of Node are free from worries of dead-locking the process (*there are no locks*)
* Almost no function in Node directly performs I/O, so the process never blocks.
* Because nothing blocks, less-than-expert programmers are able to develop fast systems
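The non-blocking claim on this slide can be illustrated with a small sketch. It is not from the talk: the function name `listCurrentDirectory` and the `order` array are made up for illustration, and the only Node API used is the built-in `fs.readdir`. The call returns right away, the code after it keeps running, and the callback fires later once the directory listing is ready.

```javascript
const fs = require('fs');

// Records the order in which things happen (illustrative helper, not a Node API).
const order = [];

function listCurrentDirectory(done) {
  order.push('requested');
  // fs.readdir returns immediately; the callback runs once the listing is ready.
  fs.readdir('.', function (err, names) {
    order.push('finished');
    done(err, names);
  });
  // This line runs before the callback above, because nothing blocked.
  order.push('still running');
}

listCurrentDirectory(function (err, names) {
  if (err) { throw err; }
  console.log(order.join(' -> ')); // requested -> still running -> finished
});
```

The process was free to do other work between "requested" and "finished", which is the property the slide credits for Node's behavior under load.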

Page 10: Cloud Camp Chicago Dec 2012 Slides

Under the Node.js hood

JavaScript?

Page 11: Cloud Camp Chicago Dec 2012 Slides

Under the Node.js hood

* JavaScript!
* Platform independent
* Easy to use
* Ubiquitous
* Google Chrome’s V8 JavaScript Engine
* Translates JS into machine code (not interpreted)

Page 12: Cloud Camp Chicago Dec 2012 Slides

When not to use Node.js

* Node.js is not ideal for CPU-intensive jobs like sorting, transformations, number crunching, analytics…
* Traditional CRUD web apps that need to be highly concurrent: performance degradation will occur when the data needs to be transformed…
* You can offload processing to another language that is better at making use of the CPU
* Cultural fit? Too new? You decide…

Page 13: Cloud Camp Chicago Dec 2012 Slides

Node.js + AWS

* Dec 6th: AWS released a developer preview of Node.js libraries to access AWS:
* DynamoDB
* S3
* EC2
* SWF (Simple Workflow Service)
* Allows you to manage parallel calls to several AWS web services
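As a rough sketch of what "parallel calls to several AWS web services" means in practice, the example below starts three requests at once and waits for all of them. It deliberately does not use the AWS SDK: `fakeServiceCall` is a hypothetical stand-in that simulates a network round trip with a timer, and the delays are arbitrary. The preview SDK's own call style is not shown here.

```javascript
// Hypothetical stand-in for a service request; a real client would make an HTTP call.
function fakeServiceCall(name, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(name + ' done'); }, delayMs);
  });
}

function callAllInParallel() {
  // All three requests are started before any of them completes,
  // so the total wait is roughly the slowest call, not the sum.
  return Promise.all([
    fakeServiceCall('DynamoDB', 30),
    fakeServiceCall('S3', 20),
    fakeServiceCall('EC2', 10)
  ]);
}

callAllInParallel().then(function (results) {
  // Promise.all keeps results in request order, regardless of finish order.
  console.log(results);
});
```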

Page 14: Cloud Camp Chicago Dec 2012 Slides

Node.js + Other Clouds

* Azure
* Joyent
* EngineYard
* Heroku

Page 15: Cloud Camp Chicago Dec 2012 Slides

More info

* http://nodejs.org
* http://en.wikipedia.org/wiki/Nodejs
* http://aws.typepad.com/aws/2012/12/aws-sdk-for-nodejs-now-available-in-preview-form.html
* http://www.jamesward.com/2011/06/21/getting-started-with-node-js-on-the-cloud/
* http://venturebeat.com/2011/08/16/linkedin-node/

Page 16: Cloud Camp Chicago Dec 2012 Slides

Paul Mantz, Software Engineer

Page 17: Cloud Camp Chicago Dec 2012 Slides

Copyright CohesiveFT - Dec 13, 2012

APIs in Cloud Environments Paul Mantz

Page 18: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

• Benefits to Creating API Command-Line Clients

• Lowers barrier to entry

• Familiar to technical consumers

• Advanced use cases

• Integrates into existing toolsets

Page 19: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

Excellent Internal Developer Tool

• Excellent for testing and rapid development

• Useful operations tool

Page 20: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

Reference Implementation

• Gives developers an example to integrate the API

• Helps users model workflows

• DSL

Page 21: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

Excellent Demo Tool

• Quick installation, often one file
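To make the "often one file" point concrete, here is a minimal sketch of the core of a single-file command-line API client. Everything specific in it is an assumption for illustration: the tool name `cloudctl`, the action names, and the `/api/<resource>` URL layout are hypothetical and not taken from any CohesiveFT product. The function only translates command-line words into an HTTP method and path; a real client would then send that request.

```javascript
// Hypothetical usage: cloudctl <resource> <list|show|create|delete> [id]
function buildRequest(argv) {
  const [resource, action, id] = argv;
  // Assumed mapping from CLI verbs to HTTP methods.
  const methods = { list: 'GET', show: 'GET', create: 'POST', delete: 'DELETE' };
  if (!resource || !Object.prototype.hasOwnProperty.call(methods, action)) {
    throw new Error('usage: cloudctl <resource> <list|show|create|delete> [id]');
  }
  const needsId = action === 'show' || action === 'delete';
  if (needsId && !id) {
    throw new Error(action + ' requires an id');
  }
  // Assumed URL layout: /api/<resource>[/<id>]
  const path = '/api/' + resource + (needsId ? '/' + encodeURIComponent(id) : '');
  return { method: methods[action], path: path };
}

// Example: the request that "cloudctl servers show 42" would produce.
console.log(buildRequest(['servers', 'show', '42']));
```

Keeping the argument-to-request mapping in one pure function is also what makes such a client usable as a reference implementation: integrators can read the mapping without tracing network code.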

Page 22: Cloud Camp Chicago Dec 2012 Slides

Bob Chojnacki, Programmer

Page 23: Cloud Camp Chicago Dec 2012 Slides

Big Data in the Cloud

A Journey into the unknown

Page 24: Cloud Camp Chicago Dec 2012 Slides

Who Jellyvision is and why analytics are important to us

• We create interactive experiences
  – Desktop
  – Mobile
• … which ask questions, inform people, generate leads
• “Virtual Advisors”
• We also collect analytics in real time to generate reports about:
  – How people answered a question
  – Where they dropped out
  – Lots of impressive stats!

Page 25: Cloud Camp Chicago Dec 2012 Slides

The Problem

• Longer-term projects and high-volume projects causing MySQL to burst at the seams
• Some types of reports taking too long, or causing MySQL to crash if we include too much data
• In all fairness, we could probably tune MySQL, throw it on bigger servers, more memory
• Diminishing returns
• MySQL is fine for collecting the data…

Page 26: Cloud Camp Chicago Dec 2012 Slides

The Solution

• Hadoop!
• Why Hadoop? Lots of possibilities out there, but which one to use? Cassandra, CouchDB, Hadoop, Membase, MongoDB, Neo4j, …
• Big Data meetups tended to have lots of people using Hadoop
• And I knew others using it.
• And Hortonworks had a fancy point and click solution I could use to get started quickly

Page 27: Cloud Camp Chicago Dec 2012 Slides

Options with options

• Now that I picked Hadoop, I had several options, and options within options to use to analyze my data:
  – Hive, Pig, MapReduce, Java, R
• I knew Java
• MapReduce seemed to make sense
• I’ll probably play with Hive and Pig next
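For readers new to MapReduce, the shape of such a job can be sketched in a few lines. This is an in-memory illustration only, not the Java job from the talk and not Hadoop code: the record fields (`visit`, `question`, `answer`) and the function names are assumptions. The map step emits a key and a count for each event, the grouping loop plays the role of Hadoop's shuffle, and the reduce step sums the counts per key.

```javascript
// Hypothetical event records; the field names are assumptions for illustration.
const events = [
  { visit: 'v1', question: 'q1', answer: 'yes' },
  { visit: 'v2', question: 'q1', answer: 'no' },
  { visit: 'v3', question: 'q1', answer: 'yes' },
  { visit: 'v1', question: 'q2', answer: 'no' }
];

// Map: emit one (key, 1) pair per event, keyed by question and answer.
function mapEvent(event) {
  return [[event.question + ':' + event.answer, 1]];
}

// Reduce: sum all counts emitted for one key.
function reduceCounts(key, values) {
  return values.reduce(function (sum, n) { return sum + n; }, 0);
}

function runMapReduce(records) {
  // Shuffle: group emitted values by key (Hadoop does this between the phases).
  const groups = new Map();
  for (const record of records) {
    for (const [key, value] of mapEvent(record)) {
      if (!groups.has(key)) { groups.set(key, []); }
      groups.get(key).push(value);
    }
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceCounts(key, values);
  }
  return result;
}

console.log(runMapReduce(events)); // counts of each answer per question
```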

Page 28: Cloud Camp Chicago Dec 2012 Slides

It’s All About The Data

• Visit data
• Event data
• Denormalization of data
• Generated a ton of fake data:
  – Started with 600K visits, 3M events
  – Moved up to 1.8M visits, 60M events
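The slide does not say how the data was denormalized, so the following is only a sketch of the general idea under assumed field names (`id`, `device`, `visitId`, `type`): each event row gets the visit attributes copied into it, so a later map step can work on one self-contained record without a join.

```javascript
// Hypothetical normalized inputs; field names are assumptions for illustration.
const visits = [
  { id: 'v1', device: 'desktop' },
  { id: 'v2', device: 'mobile' }
];
const visitEvents = [
  { visitId: 'v1', type: 'start' },
  { visitId: 'v2', type: 'answer' },
  { visitId: 'v1', type: 'answer' }
];

function denormalize(visitRows, eventRows) {
  // Index visits by id once, then copy visit fields into every event.
  const byId = new Map(visitRows.map(function (v) { return [v.id, v]; }));
  return eventRows.map(function (e) {
    const visit = byId.get(e.visitId);
    return {
      visitId: e.visitId,
      device: visit ? visit.device : null, // null when the visit is unknown
      type: e.type
    };
  });
}

console.log(denormalize(visits, visitEvents));
```

The trade-off is the usual one: the flat records are larger and repeat visit data, but each line can be processed independently.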

Page 29: Cloud Camp Chicago Dec 2012 Slides

Make it so

• First experience: Hortonworks Virtual Sandbox
  – Single node AMI at Amazon
  – Hadoop 1.0
  – 600K visits, 3M events
• On our existing platform we needed to break reports up into smaller chunks for some data because MySQL could not handle it.
• Results! What would have taken hours took only 5 minutes on a single node Hadoop “cluster”
• In reality, some of the queries I could also run with command-line tools (wc, grep, awk) on the data considerably faster than even Hadoop.
• Important lessons learned so far:
  – Think outside the RDBMS: they are great, but it may not make sense for all types of data

Page 30: Cloud Camp Chicago Dec 2012 Slides

Looking at more real data

• Now, let’s generate data that is much closer to some of our products
• Instead of one question and answer, how about 15 questions? Adding in some other events gives a total of 34 events.
• Throw in some people returning, some of them multiple times
• Throw in some people who don't start the conversation, etc.
• Run my little auto-data-generator and BOOM! 20 million events and 4.4GB later I have my data…
• … which took up too much disk space to run on the demo system I was using. Might as well turbo-charge this puppy...

Page 31: Cloud Camp Chicago Dec 2012 Slides

More disk space!

• Full install of Hadoop (Hortonworks HDP)
• Single node
• 600K visits, 20M events
  – 6m 29s, ~30s after map phase completed
• 1.8M visits, 60M events
  – 18m 3s, ~90s after map phase completed

Page 32: Cloud Camp Chicago Dec 2012 Slides

More nodes

• 3 nodes: 11m
• 4 nodes: 9m 16s
• Yay! Nodes!
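The timings on the last two slides can be turned into rough throughput and speedup figures. The single-node numbers (20M events in 6m 29s, 60M events in 18m 3s) come straight from the slides. The slides do not say which data set the 3-node and 4-node runs used, so the speedup part below assumes, and this is only an assumption, that they used the 60M-event set.

```javascript
function toSeconds(minutes, seconds) {
  return minutes * 60 + seconds;
}

// Single-node runs, as reported on the "More disk space!" slide.
const singleNodeRuns = [
  { events: 20e6, seconds: toSeconds(6, 29) },
  { events: 60e6, seconds: toSeconds(18, 3) }
];

function eventsPerSecond(run) {
  return Math.round(run.events / run.seconds);
}

// ASSUMPTION: the multi-node runs processed the 60M-event data set.
const baselineSeconds = toSeconds(18, 3);
function speedup(seconds) {
  return baselineSeconds / seconds;
}

console.log(singleNodeRuns.map(eventsPerSecond));   // about 51K and 55K events/s
console.log(speedup(toSeconds(11, 0)).toFixed(2));  // 3 nodes
console.log(speedup(toSeconds(9, 16)).toFixed(2));  // 4 nodes
```

Under that assumption the speedup is well below linear (about 1.6x on 3 nodes and about 1.9x on 4), which is consistent with the speaker's caveat that the job was not tuned.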

Page 33: Cloud Camp Chicago Dec 2012 Slides

Caveats

• Not using Hadoop to its fullest / basically a weekend job
• Algorithms employed in this example probably won't end up in a book alongside Knuth’s

Page 34: Cloud Camp Chicago Dec 2012 Slides

Next steps

• Make sure results on real data line up
• Integrate with team to generate reports they need

Page 35: Cloud Camp Chicago Dec 2012 Slides

End stuff

• Thanks to the folks at Hortonworks who answered my frantic and spastic questions.

Page 36: Cloud Camp Chicago Dec 2012 Slides

Karl Zimmerman, President

Page 37: Cloud Camp Chicago Dec 2012 Slides

Keep Your Control. Private Cloud with Karl Zimmerman, CEO of Steadfast.

Page 38: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: What do we mean?

Private cloud is a form of cloud computing where the customer has some control/ownership of the service implementation. It is a scalable, elastic IaaS solution based on cloud computing but with more control over resources.

Page 39: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: What are the advantages?

Security

Availability

No vendor lock-in

Ease of management

Page 40: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Security

Dedicated & segregated resources

More options to integrate with existing security

Page 41: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Availability

Understanding and control of the infrastructure

Get the resources you need, when you need them

You're not subject to the whims of other users

Page 42: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Vendor Lock-In

No "secret sauce."

Utilize true open source

Page 43: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Management

Easier to find employees with general IT knowledge

Utilize a broader array of tools and software

Get support/assistance from multiple levels

Page 44: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: To Summarize

Private cloud can deliver what you need out of a public cloud while giving you more control. Worries about losing control over security and availability, along with issues like vendor lock-in and management, vanish into thin air like, well, a cloud. And the fact that it doesn’t have to cost you more is a plus, too.

Page 45: Cloud Camp Chicago Dec 2012 Slides

Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”

Emceed by: Mike Dorosh, Program Manager – Cloud Technical Partnerships, IBM

& Patrick Kerpan, CEO, CohesiveFT

Page 46: Cloud Camp Chicago Dec 2012 Slides

#cloudcamp@cloudcamp_chi
