Page 1 © Hortonworks Inc. 2014

Discover HDP 2.2: Using Apache Ambari to Manage Hadoop Clusters

Hortonworks. We do Hadoop.

Page 2 © Hortonworks Inc. 2014

Speakers

Justin Sears

Hortonworks Product Marketing Manager

Jeff Sposetti

Hortonworks Senior Director of Product Management and Committer for Apache Ambari

Mahadev Konar

Hortonworks Co-Founder, Committer and PMC Member for Apache Hadoop, Apache Ambari & Apache ZooKeeper

Page 3 © Hortonworks Inc. 2014

Agenda

•  Introduction to Apache Ambari

•  New Ambari Innovation in HDP 2.2

–  Configuration Enhancements, including Versioning & History

–  Ambari Administration, including Views Framework

–  Ambari Stacks “Stack Advisor”

•  Demo

•  Q & A

We’ll move quickly:

•  Attendee phone lines are muted

•  Text any questions to Mahadev Konar using Webex chat

•  Questions answered at the end

•  Unanswered questions and answers in upcoming blog post

Page 4 © Hortonworks Inc. 2014

Big Data, Hadoop & Data Center Re-platforming

Business Drivers

•  From reactive analytics to proactive interactions

•  Insights that drive competitive advantage & optimal returns

Financial Drivers

•  Cost of data systems, as % of IT spend, continues to grow

•  Cost advantages of commodity hardware & open source software

Technical Drivers

•  Data is growing exponentially & existing systems are overwhelmed

•  Predominantly driven by NEW types of data that can inform analytics

There is an inequitable balance between vendor and customer in the market

Page 5 © Hortonworks Inc. 2014

Hadoop Value: New Types of Data

•  Clickstream: Capture and analyze website visitors’ data trails and optimize your website

•  Sensors: Discover patterns in data streaming automatically from remote sensors and machines

•  Server Logs: Research logs to diagnose process failures and prevent security breaches

•  Sentiment: Understand how your customers feel about your brand and products, right now

•  Geographic: Analyze location-based data to manage operations where they occur

•  Unstructured: Understand patterns in files across millions of web pages, emails, and documents

Page 6 © Hortonworks Inc. 2014

A Shift from Reactive to Proactive Interactions

HDP and Hadoop allow organizations to use data to shift interactions from Reactive (Post Transaction) to Proactive (Pre Decision):

•  A shift in Advertising: from mass branding to 1x1 Targeting

•  A shift in Financial Services: from Educated Investing to Automated Algorithms

•  A shift in Healthcare: from mass treatment to Designer Medicine

•  A shift in Retail: from static branding to Real-time Personalization

•  A shift in Telco: from break then fix to repair before break

Page 7 © Hortonworks Inc. 2014

Enterprise Goals for the Modern Data Architecture

•  Consolidate siloed data sets, structured and unstructured

•  Central data set on a single cluster

•  Multiple workloads across batch, interactive and real time

•  Central services for security, governance and operation

•  Preserve existing investment in current tools and platforms

•  Single view of the customer, product, supply chain

[Diagram: Modern Data Architecture. Applications (Business Analytics, Custom Applications, Packaged Applications) sit on top of the Data System (RDBMS, EDW and MPP alongside Hadoop, with YARN: Data Operating System running Interactive, Real-Time and Batch workloads over HDFS, the Hadoop Distributed File System). Sources feed the Data System: Existing Systems (CRM, ERP, Other), Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured.]

Page 8 © Hortonworks Inc. 2014

YARN Transformed Hadoop & Opened a New Era

YARN The Architectural Center of Hadoop

•  Common data platform, many applications

•  Support multi-tenant access & processing

•  Batch, interactive & real-time use cases

[Diagram: YARN: Data Operating System (Cluster Resource Management) on HDFS (Hadoop Distributed File System), providing batch, interactive & real-time data access through Script (Pig), SQL (Hive), Java/Scala (Cascading), Stream (Storm), Search (Solr), NoSQL (HBase, Accumulo), In-Memory (Spark) and Others (ISV Engines), with Tez and Slider shown as supporting layers.]

Page 9 © Hortonworks Inc. 2014

YARN Extends Hadoop to Other Data Center Leaders

YARN The Architectural Center of Hadoop

•  Common data platform, many applications

•  Support multi-tenant access & processing

•  Batch, interactive & real-time use cases

•  Supports 3rd-party ISV tools

(ex. SAS, Syncsort, Actian, etc.)

YARN Ready Applications Facilitates ongoing innovation and enterprise adoption via ecosystem of new and existing “YARN Ready” solutions

[Diagram: the same YARN data access stack as on the previous slide.]

Page 10 © Hortonworks Inc. 2014

Enterprise Hadoop: Central Set of Services


Enables Apache Hadoop to be an Enterprise Data Platform with centralized services for:

•  Governance

•  Operations

•  Security

Everything that plugs into Hadoop inherits these services

Governance: Load data and manage according to policy

Operations: Deploy and effectively manage the platform

–  Provision, Manage & Monitor: Ambari, ZooKeeper

–  Scheduling: Oozie

Security: Provide layered approach to security through Authentication, Authorization, Accounting, and Data Protection

[Diagram: the Security, Governance and Operations services sit alongside batch, interactive & real-time data access (Pig, Hive, Cascading, Storm, Solr, HBase, Accumulo, Spark and ISV Engines, with Tez and Slider) on YARN: Data Operating System (Cluster Resource Management) and HDFS (Hadoop Distributed File System).]

Page 11 © Hortonworks Inc. 2014

Hortonworks Data Platform 2.2

HDP Delivers Enterprise Hadoop

[Diagram: Hortonworks Data Platform 2.2 components]

Data Access (batch, interactive & real-time): Script (Pig), SQL (Hive), Java/Scala (Cascading), Stream (Storm), Search (Solr), NoSQL (HBase, Accumulo), In-Memory (Spark) and Others (ISV Engines), with Tez and Slider, on YARN: Data Operating System (Cluster Resource Management) and HDFS (Hadoop Distributed File System)

Governance (Data Workflow, Lifecycle & Governance): Falcon, Sqoop, Flume, Kafka, NFS, WebHDFS

Security (Authentication, Authorization, Audit, Data Protection): Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Ranger; Cluster: Knox

Operations: Provision, Manage & Monitor (Ambari, ZooKeeper); Scheduling (Oozie)

Deployment Choice: Linux, Windows, Cloud, On-Premises

YARN is the architectural center of HDP

•  Common data set across all applications

•  Batch, interactive & real-time workloads

•  Multi-tenant access & processing

Provides comprehensive enterprise capabilities

•  Governance

•  Security

•  Operations

Enables broad ecosystem adoption

•  ISVs can plug directly into Hadoop

The widest range of deployment options

•  Linux & Windows

•  On premises & cloud

Page 12 © Hortonworks Inc. 2014

Hortonworks Data Platform 2.2

HDP Delivers Enterprise Hadoop

[Diagram and bullets repeated from the previous slide, with Operations called out: Provision, Manage & Monitor (Ambari, ZooKeeper).]

Page 13 © Hortonworks Inc. 2014

Introduction to Apache Ambari

Page 14 © Hortonworks Inc. 2014

How do you Operate a Hadoop Cluster?

Apache Ambari is a framework to provision, manage and monitor Hadoop clusters

Page 15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved

Apache Ambari Themes

Operate Hadoop at Scale

Deliver the core operational capabilities to provision, manage and monitor Hadoop clusters at scale.

Integrate with the Enterprise

Robust API for integration with existing enterprise systems, such as Teradata Viewpoint and Microsoft SCOM.

Extend for the Ecosystem

Provide an extensible platform for Enterprises, Partners and the Community, via Stacks and Views.
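As a rough illustration of the API-based integration mentioned above, the sketch below builds (but does not send) an authenticated request to an Ambari Server. The host name, port 8080, the admin/admin credentials and the `/api/v1/clusters` path are assumptions for illustration; check them against your own installation.

```python
import base64
import urllib.request


def build_ambari_request(base_url, path, user, password, method="GET"):
    """Build an HTTP request for an Ambari Server REST endpoint (not sent)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(base_url.rstrip("/") + path, method=method)
    request.add_header("Authorization", "Basic " + token)
    # Assumption: Ambari expects this header on modifying calls; it is harmless on reads.
    request.add_header("X-Requested-By", "ambari")
    return request


# Hypothetical server and credentials, for illustration only.
req = build_ambari_request(
    "http://ambari.example.com:8080", "/api/v1/clusters", "admin", "admin"
)
print(req.get_method(), req.full_url)
```

An external monitoring or management tool could pass such a request to `urllib.request.urlopen` and parse the JSON response; that step is left out here so the sketch runs without a live server.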

Page 16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved

What’s New in Ambari 1.7.0

Core Services

•  ResourceManager HA

•  Capacity Scheduler Refresh Queues

•  HDFS Rebalance

•  Service Config Versioning + History

•  Manage -env.sh Files

•  Set <final> Config Properties

•  Download Client Configs

Ambari Platform

•  Ambari Administration

•  Ambari Views Framework

•  Ambari Blueprints Export Configs

•  Ubuntu 12 Platform Support

Stacks

•  Support for HDP 2.2

•  Stack Advisor

For a complete list of enhancements:

http://www.slideshare.net/hortonworks/apache-ambari-whats-new-in-170

Page 17 © Hortonworks Inc. 2014

New in HDP 2.2: Configuration Enhancements

Page 18 © Hortonworks Inc. 2014

Configuration Versioning and History

•  Service Config Versions (saved per service)

•  List of Config History

•  Compare Versions

•  Filter by “Changed Properties”

•  Revert Changes (i.e. “Make Current”)

•  Audit Log of Changes
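The versioning behaviour listed above can be pictured with a small in-memory model. This is only a sketch, not Ambari's implementation: the class and method names are hypothetical, and it assumes that a revert ("Make Current") is stored as a new version copying the old one, so the audit trail stays intact.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceConfigHistory:
    """In-memory model of per-service config versions (illustrative only)."""
    service: str
    versions: list = field(default_factory=list)  # (author, note, properties)
    current: int = 0                               # 1-based version; 0 = none yet

    def save(self, properties, author, note=""):
        """Store a new version, make it current, and return its number."""
        self.versions.append((author, note, dict(properties)))
        self.current = len(self.versions)
        return self.current

    def properties(self, version):
        """Return the properties saved under a 1-based version number."""
        return self.versions[version - 1][2]

    def compare(self, old, new):
        """Return {property: (old_value, new_value)} for changed properties only."""
        a, b = self.properties(old), self.properties(new)
        return {k: (a.get(k), b.get(k))
                for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}

    def make_current(self, version, author):
        """Revert by saving a copy of an older version as a new version."""
        return self.save(self.properties(version), author,
                         note=f"Reverted to V{version}")


history = ServiceConfigHistory("HDFS")
v1 = history.save({"dfs.replication": "3"}, author="admin")
v2 = history.save({"dfs.replication": "2"}, author="admin", note="testing")
changed = history.compare(v1, v2)          # only the changed properties
v3 = history.make_current(v1, author="admin")
```

Keeping every saved version, including reverts, is what makes the history usable as an audit log of who changed what.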

Page 19 © Hortonworks Inc. 2014

Configuration History

•  History of Changes

•  Filter, Sort, Search

Page 20 © Hortonworks Inc. 2014

Service Configuration Controls

Most Recent Versions (view, compare, revert)

Compare Versions

Revert Version

Filter by “Changed”

Page 21 © Hortonworks Inc. 2014

New in HDP 2.2: Views Framework

Page 22 © Hortonworks Inc. 2014

Ambari Extension Points

[Diagram: Ambari Web (js), Ambari Server (java) and Ambari Agents (python), with Stacks. Extension points: Ambari Views and Ambari Stacks.]

Page 23 © Hortonworks Inc. 2014

Ambari Extension Points

[Diagram repeated from the previous slide.]

Page 24 © Hortonworks Inc. 2014

Ambari Views Framework

Goal: enable the delivery of custom UI experiences in Ambari Web

Developers can extend the Ambari Web interface

•  Views expose custom UI features for Hadoop Services

Ambari Admins can entitle Views to Ambari Web users

•  Entitlements framework for controlling access to Views

Page 25 © Hortonworks Inc. 2014

Example Views

“Queue Manager” View

“Jobs” View

Page 26 © Hortonworks Inc. 2014

View Components

•  Serve client-side assets (such as HTML + JavaScript)

•  Expose server-side resources (such as REST endpoints)

[Diagram: a View's client-side assets (.js, html) are served in Ambari Web; its server-side resources (java) run in Ambari Server, are exposed over REST, and reach Hadoop and other systems.]

Page 27 © Hortonworks Inc. 2014

Versions and Instances

•  Deploy multiple versions and create multiple instances of a view

•  Manage accessibility and usage
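The relationship between views, versions and instances can be sketched as a small registry. The class below is hypothetical, as are the view name `FILES`, the version numbers and the instance names; the returned resource path follows the `/api/v1/views/<name>/versions/<version>/instances/<instance>` layout as an assumption, so verify it against the Views documentation before relying on it.

```python
class ViewRegistry:
    """Tracks deployed view versions and the instances created from each."""

    def __init__(self):
        # {view_name: {version: set(instance_names)}}
        self._views = {}

    def deploy(self, view, version):
        """Register a view version; several versions may coexist."""
        self._views.setdefault(view, {}).setdefault(version, set())

    def create_instance(self, view, version, instance):
        """Create a named instance of a deployed version and return its path."""
        versions = self._views.get(view, {})
        if version not in versions:
            raise KeyError(f"{view} {version} is not deployed")
        versions[version].add(instance)
        return f"/api/v1/views/{view}/versions/{version}/instances/{instance}"

    def instances(self, view):
        """List (version, instance) pairs for a view, sorted."""
        return sorted((version, name)
                      for version, names in self._views.get(view, {}).items()
                      for name in names)


registry = ViewRegistry()
registry.deploy("FILES", "0.1.0")
registry.deploy("FILES", "0.2.0")
path = registry.create_instance("FILES", "0.1.0", "team_a")
registry.create_instance("FILES", "0.2.0", "team_b")
```

Separate instances per team are one way an administrator could manage accessibility and usage, with each instance entitled to a different group of users.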

Page 28 © Hortonworks Inc. 2014

Choice of Deployment Model

•  For Hadoop Operators: Deploy Views in an Ambari Server that is managing a Hadoop cluster

•  For Data Workers: Run Views in a “standalone” Ambari Server

[Diagram: one Ambari Server manages the Hadoop cluster (Store & Process); a second, “standalone” Ambari Server hosts Views.]

Operators manage the cluster, may have Views deployed

Data Workers use the cluster and use a “standalone” Ambari Server for Views

Page 29 © Hortonworks Inc. 2014

Learn More About Views Framework

https://github.com/apache/ambari/blob/trunk/ambari-views/docs/index.md

https://github.com/apache/ambari/tree/trunk/ambari-views/examples

https://cwiki.apache.org/confluence/display/AMBARI/Views

https://github.com/apache/ambari/tree/trunk/contrib/views

Page 30 © Hortonworks Inc. 2014

New in HDP 2.2: Stack Advisor

Page 31 © Hortonworks Inc. 2014

Ambari Extension Points

[Diagram: Ambari Web (js), Ambari Server (java) and Ambari Agents (python), with Stacks. Extension points: Ambari Views and Ambari Stacks.]

Page 32 © Hortonworks Inc. 2014

Ambari Extension Points

[Diagram repeated from the previous slide.]

Page 33 © Hortonworks Inc. 2014

Ambari Stacks

•  Defines a consistent Stack lifecycle interface that can be extended

•  Encapsulates Stack Versions, Services, Components, Dependencies, Cardinality, Configurations, Commands

•  Dynamically add Stack + Service definitions
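To make the idea of encapsulated components and cardinality concrete, here is a minimal sketch of checking a host layout against per-component cardinality. The `Component` class and both functions are hypothetical, and the cardinality notation ("1", "1+", "1-2") is an assumption about how such rules might be written, not a copy of the Stack definition format.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    cardinality: str  # assumed notation, e.g. "1", "1+", "0-1", "1-2"


def cardinality_bounds(spec):
    """Translate a cardinality string into (minimum, maximum); None = unbounded."""
    if re.fullmatch(r"\d+", spec):
        return int(spec), int(spec)
    if re.fullmatch(r"\d+\+", spec):
        return int(spec[:-1]), None
    match = re.fullmatch(r"(\d+)-(\d+)", spec)
    if match:
        return int(match.group(1)), int(match.group(2))
    raise ValueError(f"unsupported cardinality: {spec!r}")


def layout_errors(components, layout):
    """Return a message for each component whose host count breaks its cardinality."""
    errors = []
    for component in components:
        low, high = cardinality_bounds(component.cardinality)
        count = len(layout.get(component.name, []))
        if count < low or (high is not None and count > high):
            errors.append(
                f"{component.name}: {count} host(s), expected {component.cardinality}"
            )
    return errors


hdfs = [Component("NAMENODE", "1-2"), Component("DATANODE", "1+")]
errors = layout_errors(hdfs, {"NAMENODE": ["host1"], "DATANODE": []})
```

Declaring such rules as data alongside each service is what lets new Stack and Service definitions be added without changing the code that checks them.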

[Diagram: Ambari (REST API and ambari-web) with Stacks that define services such as HDFS, YARN, MR2, Hive, Pig, Oozie, HBase, Storm and Falcon.]

Page 34 © Hortonworks Inc. 2014

Stacks In Action

http://hortonworks.com/partners/certified/ops-ready/

Page 35 © Hortonworks Inc. 2014

Stack Advisor

•  Extends Ambari Stacks to include a “Stack Advisor”

•  Provides recommendations for and performs validation on component layout & configuration

•  Improves Stack pluggability

•  Exposes new REST endpoints: /recommendations and /validations

•  REST endpoints used during Cluster Install Wizard and Configs UI
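The recommend-then-validate pattern behind the two endpoints can be sketched as a pair of functions. This is a simplified, hypothetical illustration rather than the actual Stack Advisor code: the function names, the host dictionary shape and the "keep a quarter of RAM for the OS" rule are all assumptions made for the example.

```python
def recommend_configurations(hosts):
    """Recommend a NodeManager memory setting from the smallest host."""
    smallest_mb = min(host["total_mem_mb"] for host in hosts)
    # Assumed heuristic: leave a quarter of RAM to the OS and other daemons.
    advised = smallest_mb * 3 // 4
    return {"yarn-site": {"yarn.nodemanager.resource.memory-mb": str(advised)}}


def validate_configurations(configs, recommended):
    """Compare user-supplied values with the recommendations; return findings."""
    items = []
    for config_type, props in recommended.items():
        for name, advised in props.items():
            actual = configs.get(config_type, {}).get(name)
            if actual is None:
                items.append({"level": "ERROR", "property": name,
                              "message": "value is missing"})
            elif int(actual) > int(advised):
                items.append({"level": "WARN", "property": name,
                              "message": f"{actual} exceeds recommended {advised}"})
    return items


hosts = [{"name": "host1", "total_mem_mb": 16384},
         {"name": "host2", "total_mem_mb": 8192}]
recommended = recommend_configurations(hosts)
issues = validate_configurations(
    {"yarn-site": {"yarn.nodemanager.resource.memory-mb": "8192"}}, recommended
)
```

An install wizard could call the recommendation step to pre-fill defaults, then run validation whenever a user edits a value, which mirrors how the slide describes the endpoints being used by the Cluster Install Wizard and Configs UI.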

Page 36 © Hortonworks Inc. 2014

DEMO

Page 37 © Hortonworks Inc. 2014

Q & A

Page 38 © Hortonworks Inc. 2014

Thank you! Learn more at: hortonworks.com/hadoop/ambari/