How To Get Hadoop App Intelligence with Driven

How to Get Hadoop Application Intelligence with Driven

Confidential

WHY NOW

As Hadoop applications become the engine of your data management strategy, they must meet higher standards of quality, reliability, and manageability.

WE’RE THE FORCE BEHIND CASCADING…

Cascading is a proven platform for building and deploying big data applications on Hadoop, with 10,000+ production deployments.

Java, Scala (Scalding), SQL

SIMPLE

Ensure best practices at any scale thanks to easy-to-learn design principles

FLEXIBLE

Leverage existing Java, Scala, and SQL skills and easily adapt to new systems

RELIABLE

Always get optimal performance and reliability for big data applications

… POWERING BIG DATA APPS ACROSS INDUSTRIES

Social Media

Consumer & Retail

Business Services

Ad & Marketing

Financial

Telecom

What people are saying…

WHO ARE WE

TRUSTED

by over 10,000 companies as their big data app platform

BACKED

by top Silicon Valley investors True Ventures, Rembrandt VP, Bain Capital

FOUNDED

in 2008, with headquarters in San Francisco

HADOOP APP INTELLIGENCE

DEVELOPERS, OPS TEAMS, AND CIOS ASKED US

Can you help us improve the quality, reliability, and manageability of all our big data applications?

By visualizing our entire data pipeline

By tracking exactly how our big data apps behave at runtime and pinpointing bottlenecks

By helping us understand how our departments, teams, and other segments consume big data resources and deliver value

PERFORMANCE MANAGEMENT FOR HADOOP APPLICATIONS

BUILD higher quality Hadoop apps

RUN Hadoop apps more reliably

MANAGE Hadoop apps more effectively

BUILD HIGHER QUALITY APPS

BUILD HIGHER QUALITY HADOOP APPS

[Pipeline diagram: SOURCES, OPERATIONS (functions, filters, joins, and aggregators), RESULTS]

Fully visualize your entire data pipeline

Quickly and easily identify execution errors

RUN APPS MORE RELIABLY

RUN HADOOP APPS MORE RELIABLY

[Screenshot label: CURRENTLY EXECUTING]

Watch your apps execute in real time

Easily detect apps that violate SLAs and policies

Pinpoint bottlenecks and identify causes

[Screenshot labels: EXECUTING, WAITING, DETAILED MAPPER/REDUCER STATS]

View metrics for all apps on the production cluster that failed to execute in under 5 minutes…

…or all applications that use more than their allotment of mappers
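The two views above amount to simple filters over per-run metrics. A minimal Python sketch of that idea, assuming a hypothetical `AppRun` record with made-up field names and sample values; it is not Driven's data model or query syntax:

```python
from dataclasses import dataclass


@dataclass
class AppRun:
    # Hypothetical record of one application run; all field names are assumptions.
    name: str
    cluster: str
    duration_seconds: float
    mappers_used: int
    mapper_allotment: int


def slow_runs(runs, cluster, limit_seconds=300):
    """Runs on `cluster` that did not finish in under `limit_seconds`."""
    return [r for r in runs
            if r.cluster == cluster and r.duration_seconds >= limit_seconds]


def over_allotment(runs):
    """Runs that used more mappers than they were allotted."""
    return [r for r in runs if r.mappers_used > r.mapper_allotment]


# Illustrative sample data only.
runs = [
    AppRun("clickstream-etl", "production", 420.0, 40, 50),
    AppRun("daily-report", "production", 95.0, 80, 60),
    AppRun("model-training", "qa", 900.0, 10, 20),
]

print([r.name for r in slow_runs(runs, "production")])
print([r.name for r in over_allotment(runs)])
```

The 5-minute threshold is passed as a parameter so the same filter can express other SLA limits.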

MANAGE APPS MORE EFFECTIVELY

MANAGE BIG DATA APPS MORE EFFECTIVELY

See how all apps consume resources as they run

Segment performance by team, by department, or by custom tags for role-based views, chargeback models, and capacity planning

View the performance of all apps owned by the DevOps team

Marketing

Sales

Compliance

Data science team

QA cluster

Production cluster
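Segmenting by team or tag for chargeback can be pictured as grouping resource usage by a tag and computing each group's share. A rough Python sketch, assuming hypothetical run records with a `tags` dictionary and a `mapper_hours` figure (both invented for illustration, not Driven's schema):

```python
from collections import defaultdict


def usage_by_tag(runs, tag_key):
    """Sum mapper-hours per value of `tag_key` (for example, per team)."""
    totals = defaultdict(float)
    for run in runs:
        segment = run["tags"].get(tag_key, "untagged")
        totals[segment] += run["mapper_hours"]
    return dict(totals)


def chargeback_shares(totals):
    """Convert absolute usage into each segment's fraction of the total."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {segment: 0.0 for segment in totals}
    return {segment: used / grand_total for segment, used in totals.items()}


# Illustrative sample data only.
runs = [
    {"name": "log-rollup", "tags": {"team": "DevOps"}, "mapper_hours": 6.0},
    {"name": "campaign-stats", "tags": {"team": "Marketing"}, "mapper_hours": 3.0},
    {"name": "disk-audit", "tags": {"team": "DevOps"}, "mapper_hours": 1.0},
]

totals = usage_by_tag(runs, "team")
shares = chargeback_shares(totals)
print(totals)
print(shares)
```

Grouping on a different `tag_key`, such as a department or cluster tag, gives the other segment views listed above.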

MANAGE HADOOP APPS FOR COMPLIANCE

Visualize Lineage – See exactly how each app ingests, manipulates and outputs data

Further inspect lineage by detecting apps that write to, or read from, a given dataset

[Lineage diagram: SOURCES, OPERATIONS (functions, filters, joins, and aggregators), RESULTS]

For example, show all apps that interact with the dataset in “rain.txt”
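A lineage lookup of this kind can be sketched as a search over each app's sources and sinks. The Python below is illustrative only: the app records, their `sources` and `sinks` fields, and the app names are assumptions, and only the dataset name “rain.txt” comes from the slide:

```python
def apps_touching(apps, dataset):
    """Return (app name, reads?, writes?) for every app that uses `dataset`."""
    hits = []
    for app in apps:
        reads = dataset in app["sources"]
        writes = dataset in app["sinks"]
        if reads or writes:
            hits.append((app["name"], reads, writes))
    return hits


# Illustrative sample data only.
apps = [
    {"name": "weather-ingest", "sources": ["stations.csv"], "sinks": ["rain.txt"]},
    {"name": "rain-summary", "sources": ["rain.txt"], "sinks": ["summary.tsv"]},
    {"name": "billing", "sources": ["invoices.csv"], "sinks": ["ledger.tsv"]},
]

for name, reads, writes in apps_touching(apps, "rain.txt"):
    print(name, "reads" if reads else "", "writes" if writes else "")
```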

MANAGE HADOOP APPS WITH COLLABORATION

Create JIRA issues with views and data for quickly collaborating to resolve performance problems

Integrate alerts with popular notification platforms like HipChat, PagerDuty, & Nagios

With one click, create a JIRA issue with a link to this view

MANAGE HADOOP APPS WITH INTEGRATION

Automatically send app status notifications via webhooks or JMX
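On the receiving side, a webhook notification is typically an HTTP POST carrying a small JSON body. The sketch below builds such a request with Python's standard library. The payload fields, the URL, and the app name are hypothetical, and the slides do not describe the actual format Driven sends:

```python
import json
import urllib.request


def build_status_request(webhook_url, app_name, status):
    """Build (but do not send) a JSON POST describing an app's status."""
    # Hypothetical payload shape; chosen for illustration only.
    payload = {"app": app_name, "status": status}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder endpoint; nothing is sent over the network here.
request = build_status_request(
    "https://hooks.example.com/app-status", "clickstream-etl", "FAILED"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would perform the POST; it is left out here so the example runs without a network connection.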

ARCHITECTURE / DEMO

DRIVEN ARCHITECTURE

End-to-end operational telemetry metadata for big data applications

Accessible via Web browser, command-line interface (CLI), or simple search queries

Easy integrations through JMX and upcoming Driven SDK

[Architecture diagram labels: HADOOP APPS AND INFRASTRUCTURE; HADOOP CLUSTERS; YARN; APPLICATIONS; Plugin; Telemetry metadata (SSL); Server; WAR files; Web App Server; Web, CLI, JMX]

DELIVERING OPERATIONAL EXCELLENCE

“The coolest part about Driven is being able to visualize data pipelines and inspect components in real time for easy troubleshooting and optimization. I don't know of any other tool that's close in functionality.” - Neville Li, Software Engineer, Spotify

“With Driven, it’s easy to see how our apps use the data. When there’s an exception, Driven shows the history, so we can learn exactly what went wrong. That’s a huge time saver.” - Niels Boldt, Lead Software Engineer, Mojn

QUESTIONS
