© 2015 Enterprise Integration News, Inc.
Introduction
Agenda
Making Hadoop just work better for varied workloads
Details top challenges in adopting Hadoop
How Pepperdata automatically improves performance, visibility, controls
Bio
A pioneer in production-ready Hadoop
15+ years web search and big data; focus on huge scale, huge impact products
Started the Silicon Valley branch of Microsoft’s Bing engineering & product team
Visibility & Optimization for Hadoop
Sean Suchter, Co-Founder and CEO
©2015 Pepperdata
VISIBILITY & OPTIMIZATION FOR HADOOP
©2014 Pepperdata
AGENDA
CHALLENGES USING HADOOP
HOW PEPPERDATA ADDRESSES THESE CHALLENGES
Q&A and NEXT STEPS
HADOOP CHALLENGES YOU FACE DAILY
Impact levels (slide axis): Tactical, Strategic, Very Strategic
Call from your CEO asking “WTF is happening?!?”
Can’t make SEC filing
EOQ and can’t send the invoice
Critical feature on your website is broken!
Online ad impression data unavailable
External customer reports unavailable
Users complaining
Customer churn metrics unavailable
Revenue report doesn’t complete
Have to buy more servers
End user SLAs compromised
Finding root-cause of problems is manual
Low priority jobs taking over the cluster
Ad hoc jobs interfere with production jobs
HBase & MapReduce contention
Rogue jobs hammer cluster performance
Cluster seems near maximum capacity
Developers can’t submit new jobs
HADOOP WASTES VALUABLE CAPACITY
[Chart: physical hardware resource over time, comparing theoretical maximum usage (reservation) with actual physical capacity used]
1. Production clusters are sized for peak SLA with lots of headroom, so capacity is wasted
2. Ad-hoc jobs consume capacity from high-priority jobs, so companies run them on a separate cluster
3. Hadoop’s allocations are predefined and static, resulting in wasted capacity
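To make the gap between reserved and actually used capacity concrete, here is a minimal sketch of the arithmetic. The function name and the sample numbers are hypothetical and are not taken from the slides or from any Pepperdata tooling; they only illustrate how a static reservation sized for peak load leaves capacity idle the rest of the time.

```python
def wasted_fraction(reserved, used):
    """Return the share of reserved capacity that was never actually used.

    `reserved` and `used` are equal-length sequences of per-interval
    capacity figures in the same (arbitrary) unit.
    """
    if len(reserved) != len(used):
        raise ValueError("reserved and used must have the same length")
    total_reserved = sum(reserved)
    if total_reserved == 0:
        return 0.0
    # Count only the unused part of each interval's reservation.
    total_unused = sum(max(r - u, 0) for r, u in zip(reserved, used))
    return total_unused / total_reserved


# Hypothetical samples: a static reservation of 100 units per interval,
# sized for the peak, against fluctuating real usage.
reserved = [100, 100, 100, 100]
used = [40, 90, 55, 35]
print(f"Wasted capacity: {wasted_fraction(reserved, used):.0%}")  # Wasted capacity: 45%
```

With these made-up figures, 180 of the 400 reserved units go unused, which is the kind of headroom the slide describes as wasted.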
MORE AND MORE WASTED CAPACITY
Over time, more and more clusters are built to isolate the different workloads
Production Cluster, Ad Hoc Cluster, Priority Job Cluster, HBase Cluster, Bulk Load Cluster
But they are full of “holes”!
PEPPERDATA MAKES HADOOP WORK BETTER
FINE-GRAINED VISIBILITY
Monitor CPU, RAM, I/O, network per task, job, user, group
Identify bottlenecks in real-time or at any moment historically
TOTAL PREDICTABILITY
SLA enforcement for true multi-tenancy: dynamically adjusts resource usage
Set policies to protect high-priority jobs
30-50% GREATER THROUGHPUT ON ALREADY HIGHLY TUNED CLUSTERS
Reclaims wasted capacity: use all true hardware capacity
Run more jobs with our Dynamic Capacity Creation
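As an illustration of what monitoring "per task, job, user, group" involves, the sketch below rolls per-task resource samples up to a coarser level. It is not Pepperdata's implementation or API: the record fields (`task`, `job`, `user`, `cpu_seconds`, `ram_mb`, `io_mb`, `net_mb`) and the `rollup` helper are assumptions made up for this example.

```python
from collections import defaultdict

# Metric names are hypothetical; they stand in for CPU, RAM, I/O, and network.
METRICS = ("cpu_seconds", "ram_mb", "io_mb", "net_mb")


def rollup(samples, key):
    """Sum per-task metric samples by the given key (e.g. "job" or "user")."""
    totals = defaultdict(lambda: defaultdict(float))
    for sample in samples:
        for metric in METRICS:
            totals[sample[key]][metric] += sample[metric]
    # Convert nested defaultdicts to plain dicts for easy printing/comparison.
    return {group: dict(metrics) for group, metrics in totals.items()}


# Hypothetical per-task samples.
samples = [
    {"task": "t1", "job": "etl", "user": "alice",
     "cpu_seconds": 10, "ram_mb": 512, "io_mb": 100, "net_mb": 20},
    {"task": "t2", "job": "etl", "user": "alice",
     "cpu_seconds": 20, "ram_mb": 256, "io_mb": 50, "net_mb": 5},
    {"task": "t3", "job": "adhoc", "user": "bob",
     "cpu_seconds": 5, "ram_mb": 128, "io_mb": 10, "net_mb": 1},
]

print(rollup(samples, "job")["etl"]["cpu_seconds"])  # 30.0
print(rollup(samples, "user")["bob"]["io_mb"])       # 10.0
```

The same per-task records answer questions at every level, which is why collecting at the finest granularity is what makes the job, user, and group views possible.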
PEPPERDATA REAL-TIME ARCHITECTURE
VISIBILITY: Delivers real-time, granular visibility into resource consumption by user, job, and task
CONTROL: Allows user-defined prioritization of Hadoop jobs and automatically allocates resources to ensure jobs run safely
CAPACITY: Reclaims wasted capacity and allows mixed workloads to be shared on a single cluster
[Diagram labels: Developer, Analyst, Financial Report, Product, ETL; Pepperdata Dashboard; Policies; Hadoop Configuration; Your existing Hadoop (MapReduce, HBase, etc.); Job Tracker / Resource Manager (Scheduler & YARN)]
FINE-GRAINED VISIBILITY INTO THE CLUSTER
FINE-GRAINED VISIBILITY INTO YOUR CLUSTER
EASILY PINPOINT BOTTLENECKS IN THE CLUSTER
NEXT STEPS
Like what you saw? Want to learn more?
Visit pepperdata.com for more product information.
Visit pepperdata.com/demo to request a free demo from one of our technical experts!
THANK YOU
Questions & Answers
What is the form factor of Pepperdata, and how long does it take to install?
How do we make sure Pepperdata ‘agents’ are where they need to be -- and working?
Sean Suchter, Co-Founder and CEO
We have mixed workloads that often force us to overprovision Hadoop resources.
Does Pepperdata help us deal with this by allowing Hadoop to adjust dynamically?
Sean Suchter, Co-Founder and CEO
Given Pepperdata’s intelligent and dynamic environment,
how does that impact the way we do Hadoop prep or set-up?
Sean Suchter, Co-Founder and CEO
How much Hadoop cluster resource does Pepperdata use?
Sean Suchter, Co-Founder and CEO
How do customers use the Pepperdata dashboard?
Where is it hosted?
Sean Suchter, Co-Founder and CEO
How is the Pepperdata approach different from YARN?
Sean Suchter, Co-Founder and CEO
Please detail some customer successes from using Pepperdata with Hadoop.
Sean Suchter, Co-Founder and CEO
For More Information
Pepperdata – Rely on Hadoop http://pepperdata.com/
Visibility Capacity Control Technology
Learn More About Pepperdata
Product http://pepperdata.com/products/
Real-Time Architecture http://pepperdata.com/products/#pd-technology
Benefits http://pepperdata.com/benefits/
Blog http://pepperdata.com/blog/
Other Pepperdata Resources (Whitepapers & Case Studies) http://pepperdata.com/resources/
Request a Demo http://pepperdata.com/demo/