Docker-Based Hadoop Provisioning on Cisco InterCloud
Dmitri Chtchourov, Innovation Architect, CIS CTO Group, Cisco
Rakesh Saha, Product Management, Hortonworks
© Hortonworks Inc. 2011 – 2015. All Rights Reserved
Cautionary Statement Regarding Forward-Looking Statements
This presentation contains forward-looking statements involving risks and uncertainties. Such forward-looking statements in this presentation generally relate to future events, our ability to increase the number of support subscription customers, the growth in usage of the Hadoop framework, our ability to innovate and develop the various open source projects that will enhance the capabilities of the Hortonworks Data Platform, anticipated customer benefits and general business outlook. In some cases, you can identify forward-looking statements because they contain words such as “may,” “will,” “should,” “expects,” “plans,” “anticipates,” “could,” “intends,” “target,” “projects,” “contemplates,” “believes,” “estimates,” “predicts,” “potential” or “continue” or similar terms or expressions that concern our expectations, strategy, plans or intentions. You should not rely upon forward-looking statements as predictions of future events. We have based the forward-looking statements contained in this presentation primarily on our current expectations and projections about future events and trends that we believe may affect our business, financial condition and prospects. We cannot assure you that the results, events and circumstances reflected in the forward-looking statements will be achieved or occur, and actual results, events, or circumstances could differ materially from those described in the forward-looking statements.
The forward-looking statements made in this prospectus relate only to events as of the date on which the statements are made and we undertake no obligation to update any of the information in this presentation.
Trademarks
Hortonworks is a trademark of Hortonworks, Inc. in the United States and other jurisdictions. Other names used herein may be trademarks of their respective owners.
© 2014 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
Speakers
• Rakesh Saha, Product Management, Hortonworks
• Dmitri Chtchourov, Innovation Architect, CIS CTO Group, Cisco
Agenda
• About Hortonworks
• Cloudbreak – Docker-based Hadoop provisioning tool
• Introduction to Docker
• Hadoop Provisioning using Docker
• Cisco and Hortonworks Collaboration
About Hortonworks
• ONLY 100% open source Apache Hadoop data platform
• Founded in 2011
• 1ST Hadoop distribution to go public: IPO Fall 2014 (NASDAQ: HDP)
• 322 subscription customers
• 600+ employees across 17 countries
• 1000+ technology partners
Hortonworks Mission:
Power your Modern Data Architecture with HDP and Enterprise Apache Hadoop
Customer Momentum
• 300+ customers in seven quarters, growing at 75+/quarter
• Two thirds of customers come from F1000
Hortonworks and Hadoop at Scale
• HDP in production on largest clusters on planet
• Multiple 1,000+ node clusters, including 35,000 nodes at Yahoo!, 800 nodes at Spotify
• Founded in 2011
• Original 24 architects, developers, operators of Hadoop from Yahoo!
• We are leaders in Hadoop community
• 500+ employees
HDP is deeply integrated in the data center
[Diagram: HDP (Governance & Integration, Security, Operations, Data Access, Data Management, YARN) shown alongside SOURCES (existing systems, clickstream, web & social, geolocation, sensor & machine, server logs, unstructured), DATA SYSTEM (RDBMS, EDW, MPP), APPLICATIONS, OPERATIONAL TOOLS, DEV & DATA TOOLS, and INFRASTRUCTURE.]
Deep Partnerships
Hortonworks engages in deep engineered relationships with the leaders in the data center, such as Cisco, Microsoft, EMC, Pivotal, Teradata, Red Hat, SAS & SAP.
Broad Partnerships
Over 1,000 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users.
Agenda
Cloudbreak Docker Provisioning Collaboration
Cloudbreak
• Developed by SequenceIQ
• Open source with Apache 2.0 license [ Apache project soon ]
• Deploys selected services to public and private cloud via Ambari Blueprints
• Elastic – can spin up any number of nodes, add/remove on the fly
• Provides full cloud lifecycle management post-deployment
Launch HDP on Any Cloud for Any Application
Cloudbreak
1. Pick a Blueprint
2. Choose a Cloud
3. Launch HDP!
Example Ambari Blueprints:
• BI / Analytics (Hive)
• IoT Apps (Storm, HBase, Hive)
• Dev / Test (all HDP services)
• Data Science (Spark)
Hadoop in Cloud Provisioning with Cloudbreak
1. Create Templates
2. Provide Blueprint
3. Associate Credentials
4. Launch Cluster
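The four steps above can be modelled as plain data that must be complete before a launch. The following is a minimal sketch in Python, not Cloudbreak's API: the class names (`Template`, `Credential`, `ClusterRequest`), their fields, and the sample values are all hypothetical and chosen only to illustrate the ordering of the steps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Template:
    """Hypothetical infrastructure template: what kind of nodes to start."""
    name: str
    instance_type: str
    node_count: int


@dataclass
class Credential:
    """Hypothetical reference to a cloud-provider credential."""
    name: str
    provider: str


@dataclass
class ClusterRequest:
    """Collects the inputs of steps 1-3 before the launch in step 4."""
    template: Optional[Template] = None
    blueprint_name: Optional[str] = None
    credential: Optional[Credential] = None

    def missing_steps(self) -> list:
        """Return the names of the steps that have not been completed yet."""
        missing = []
        if self.template is None:
            missing.append("Create Template")
        if self.blueprint_name is None:
            missing.append("Provide Blueprint")
        if self.credential is None:
            missing.append("Associate Credentials")
        return missing

    def launch(self) -> str:
        """Refuse to launch until all three inputs are present."""
        missing = self.missing_steps()
        if missing:
            raise ValueError("cannot launch, missing: " + ", ".join(missing))
        return (f"launching {self.template.node_count} nodes "
                f"with blueprint {self.blueprint_name!r} "
                f"on {self.credential.provider}")
```

Calling `launch()` on an incomplete request raises an error naming the missing steps, which mirrors the way the workflow makes the launch depend on the three earlier steps.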
Provisioning: Template
Provisioning: Blueprint
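An Ambari Blueprint is a JSON document that names a stack and assigns components to host groups. The sketch below builds a minimal one in Python. The blueprint name, host-group names, component selection, and stack version are illustrative assumptions, not the blueprints shown in the talk.

```python
import json


def make_blueprint(name, stack_version, host_groups):
    """Build an Ambari-Blueprint-shaped dict.

    host_groups maps a group name to (cardinality, [component names]).
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": str(cardinality),
                "components": [{"name": c} for c in components],
            }
            for group, (cardinality, components) in host_groups.items()
        ],
    }


# Illustrative values only: one master group and three worker nodes.
blueprint = make_blueprint(
    "example-dev-test",
    "2.2",
    {
        "master": (1, ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"]),
        "worker": (3, ["DATANODE", "NODEMANAGER"]),
    },
)
blueprint_json = json.dumps(blueprint, indent=2)
```

Keeping the cluster layout in one declarative document is what lets the same blueprint be reused on any cloud: only the template and credentials change.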
Provisioning: Provider Credentials
Provisioning: Launch
Specialized Blueprints
Quick productivity with pre-configured cluster blueprints
Lambda Architecture
Machine Learning
Batch ETL
…
Optimize cloud usage via Elastic Clusters
[Diagram: clusters for BI / Analytics (Hive), IoT Apps (Storm, HBase, Hive), Dev / Test (all HDP services), and Data Science (Spark).]
Autoscaling Policy
• Policies based on any Ambari metrics
• Coordinates with YARN
• Policies are based on Metrics or Time
• Scaling can be service or component type specific
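The split between metric-based and time-based policies can be illustrated with a small evaluator. This is a sketch under stated assumptions, not Cloudbreak's implementation: the `MetricPolicy` and `TimePolicy` classes, the `pending_containers` metric name used in the usage note, and the clamping to a minimum and maximum node count are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetricPolicy:
    """Scale when a metric exceeds a threshold (hypothetical structure)."""
    metric: str
    threshold: float
    adjustment: int  # nodes to add (positive) or remove (negative)

    def decide(self, metrics, hour):
        value = metrics.get(self.metric)
        if value is not None and value > self.threshold:
            return self.adjustment
        return 0


@dataclass
class TimePolicy:
    """Scale during a fixed daily window (hypothetical structure)."""
    start_hour: int
    end_hour: int
    adjustment: int

    def decide(self, metrics, hour):
        if self.start_hour <= hour < self.end_hour:
            return self.adjustment
        return 0


def desired_node_count(current, policies, metrics, hour, min_nodes, max_nodes):
    """Sum every policy's adjustment, then clamp to the allowed range."""
    delta = sum(p.decide(metrics, hour) for p in policies)
    return max(min_nodes, min(max_nodes, current + delta))
```

For example, with `MetricPolicy("pending_containers", 100, 2)` and `TimePolicy(20, 24, -1)`, a four-node cluster reporting 150 pending containers at 10:00 would be scaled to six nodes, provided the maximum allows it.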
Scaling for Static and Dynamic Clusters
[Diagram: Cloudbreak provisioning static and dynamic clusters; each of three clusters has its own Ambari and Auto-scale Policy; other labels are YARN, Ambari Alerts, and Ambari Metrics.]
• Metrics and Alerts Feed Cloudbreak
• Enforces Policies
• Scales Cluster/YARN Apps
Provisioning – How it works
1. Start VMs with a running Docker daemon
2. Cloudbreak Bootstrap
• Start Consul Cluster
• Start Swarm Cluster (Consul for discovery)
3. Start Ambari servers/agents via the Swarm API
4. Ambari services registered in Consul (Registrator)
5. Post Blueprint
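Once the Ambari containers are registered in Consul, other components can locate them through Consul's service catalog. The sketch below parses the JSON body returned by Consul's `/v1/catalog/service/<name>` HTTP endpoint. The service name `ambari-8080`, the node names, and the addresses in the sample are assumptions made for the example, not values from the talk.

```python
import json


def service_endpoints(catalog_json):
    """Return (host, port) pairs from a Consul catalog-service response.

    Consul leaves ServiceAddress empty when the service uses the node's
    own address, so fall back to Address in that case.
    """
    endpoints = []
    for entry in json.loads(catalog_json):
        host = entry.get("ServiceAddress") or entry["Address"]
        endpoints.append((host, entry["ServicePort"]))
    return endpoints


# Sample response body; node names and addresses are made up.
sample = json.dumps([
    {"Node": "node1", "Address": "10.0.0.11",
     "ServiceName": "ambari-8080", "ServiceAddress": "", "ServicePort": 8080},
    {"Node": "node2", "Address": "10.0.0.12",
     "ServiceName": "ambari-8080", "ServiceAddress": "172.17.0.5",
     "ServicePort": 8080},
])
```

Because discovery goes through the catalog rather than fixed hostnames, the Ambari server can be found on whichever VM Swarm placed its container.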
Agenda
Cloudbreak Docker Provisioning Collaboration
Docker is a “Shipping Container” System for Code
An engine that enables any payload to be encapsulated as a lightweight, portable, self-sufficient container
[Diagram: multiplicity of stacks (static website, web frontend, user DB, queue, analytics DB) and multiplicity of hardware environments (development VM, QA server, public cloud, contributor’s laptop, production cluster, customer data center).]
Docker
• Lightweight, portable
• Build once, run anywhere
• VM – without the overhead of a VM
• Isolated containers
• Automated and scripted
Why Is Docker So Exciting?
For Developers:
Build once…run anywhere
• A clean, safe, and portable runtime environment for your app.
• No missing dependencies, packages etc.
• Run each app in its own isolated container
• Automate testing, integration, packaging
• Reduce/eliminate concerns about compatibility on different platforms
• Cheap, zero-penalty containers to deploy services
For DevOps:
Configure once…run anything
• Make the entire lifecycle more efficient, consistent, and repeatable
• Eliminate inconsistencies between SDLC stages
• Support segregation of duties
• Significantly improves the speed and reliability of CI/CD
• Significantly more lightweight than VMs
Docker: Containers vs. VMs
[Diagram: VM stack (Server, Host OS, Hypervisor (Type 2), then a Guest OS with Bins/Libs for each of App A, App A’, and App B) compared with the container stack (Server, Host OS kernel, Docker, then App A and App B containers with their own bins/libs).]
Containers are isolated and share only the kernel
…result is significantly faster deployment, much less overhead, easier migration, faster restart
Agenda
Cloudbreak Docker Provisioning Collaboration
Run Hadoop as Docker Containers
HDP as Docker Containers via Cloudbreak
• Running Ambari Cluster in Containers
• Use Blueprint to define services
• All HDP services share a single container
[Diagram: Cloudbreak provisions VMs from cloud providers, installs Ambari on the VMs, and instructs Ambari to build the HDP cluster; layers shown are Cloud Provider/Bare Metal, Linux VMs, Docker, and Ambari HDP.]
Swarm + Consul for Placement and Discovery
Run Hadoop as Docker containers
[Diagram: Cloudbreak and six Docker hosts.]
Run Hadoop as Docker containers
[Diagram: Cloudbreak, a Blueprint, and six Docker hosts; one container is labeled amb-ser and five are labeled amb-agn.]
Run Hadoop as Docker containers
[Diagram: Cloudbreak and six Docker hosts; containers are labeled amb-ser, amb-agn-hdfs-hbase, amb-agn-hdfs-hive, amb-agn-hdfs-yarn, amb-agn-hdfs-zookpr, and amb-agn-nmnode-hdfs.]
Benefits of running Hadoop on Docker
• Quick installation with pre-pulled RPMs
• Same process/images for dev/qa/prod
• Same process for single/multi-node
Demo
Agenda
Cloudbreak Docker Provisioning Collaboration
Cisco and Hortonworks’ Partnership
100% open source Hadoop Distribution, Support and Training
Integrated Infrastructures for Big Data
Cisco and Hortonworks are partnering to help you build your big data solution and reach massive scalability, superior efficiency and dramatically lower total cost of ownership thanks to a validated joint architecture.
Results of the collaboration
• Efficient Hadoop as a service
• Adoption of Docker for enterprise Hadoop deployment
Tasks | Cisco InterCloud | Public Cloud Provider
HDP installation | 15:04 mins | 11:55 mins
Teragen (avg of 3 executions) | 7:08 mins | 22:15 mins
Terasort (avg of 3 executions) | 32:09 mins | 60:12 mins
Teravalidate (avg of 3 executions) | 2:31 mins | 10:40 mins
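To make the comparison easier to read, the timings above can be converted to seconds and expressed as ratios. The sketch below assumes each value is in minutes:seconds form and that the first timing in each row belongs to Cisco InterCloud and the second to the public cloud provider, following the order of the column headers.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' duration string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (task, Cisco InterCloud, public cloud provider), copied from the table.
results = [
    ("HDP installation", "15:04", "11:55"),
    ("Teragen", "7:08", "22:15"),
    ("Terasort", "32:09", "60:12"),
    ("Teravalidate", "2:31", "10:40"),
]

# A ratio above 1 means the task ran faster on Cisco InterCloud.
ratios = {
    task: round(to_seconds(public) / to_seconds(intercloud), 2)
    for task, intercloud, public in results
}

for task, ratio in ratios.items():
    print(f"{task}: public cloud time / InterCloud time = {ratio}")
```

Under these assumptions the three Tera benchmarks ran roughly 1.9 to 4.2 times faster on InterCloud, while HDP installation took about 26% longer there.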
Observations
• Docker is maturing inside enterprises
• Interest in running Docker on top of bare metal
• Big data app developers are leaning towards containerization of apps
• YARN is becoming an application deployment platform beyond big data apps
• Demand for native, containerized, fully managed apps on YARN
Future Collaboration
• Run Docker natively on OpenStack
• Run Docker on YARN
• OpenStack bare metal
Conclusion
HDP + Cisco InterCloud - Efficient Hadoop-as-a-service
[Diagram labels: Blueprints (Data Science, IoT, BI / Analytics, Dev / Test); HDP.]
Learn More
Download the Hortonworks Sandbox
Learn Hadoop
Build Your Analytic App
Try Hadoop 2
More about Cisco & Hortonworks: http://hortonworks.com/partner/cisco/
More about Hortonworks’ Acquisition of SequenceIQ: http://bit.ly/1R1ktxO