tyler-carson
**
MapReduce Debugging with Jumbune
*
Agenda
*
Debugging Challenges
Debugging MapReduce
Jumbune’s Debugger
Zero Tolerance in Production
*
Typically, when working with Big Data, we ingest and analyze multi-terabyte to petabyte data sets on multi-node clusters to produce actionable analytical outcomes for:
• Discovering opportunities and solutions
• Deriving operational intelligence
• Driving sales
Production errors and failures result in a significant loss of revenue and time.
Enterprise analytical solutions running in production have zero tolerance for bugs and errors.
Zero Tolerance in Production
*
Huge applications: the symptom and its cause may lie in remote parts of the program.
Multiple components working in tandem may trigger rare or difficult-to-reproduce input sequences or program timing.
Complex systems: flaws caused by human mistakes or misunderstandings are difficult to trace.
Debugging must be frugal and scalable, yet detailed.
Debugging challenges
*
MapReduce Debugging
● Handle billions of <Key, Value> pairs
● Through multiple phases and components
● On thousands of machines
● Customized logic in Mappers, Reducers, UDFs
● Frugal enough that it does not escalate the execution time
● Detailed enough to let the developer understand
● Scale to terabytes of data
● Scale to thousand-node clusters
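One way to reconcile "frugal" with "detailed" is to count how often each branch of the user logic runs, rather than logging every record. The sketch below is an illustration in plain Python, not Jumbune's implementation; `BranchCounters`, `map_record`, and the branch labels are hypothetical names.

```python
from collections import Counter


class BranchCounters:
    """Accumulates per-branch hit counts in memory (hypothetical helper)."""

    def __init__(self):
        self.hits = Counter()

    def hit(self, label):
        self.hits[label] += 1

    def summary(self):
        # Sorted so the summary is stable across runs.
        return sorted(self.hits.items())


def map_record(line, counters):
    """Hypothetical mapper: emits (word, 1) for purely alphabetic words."""
    out = []
    for word in line.split():
        if word.isalpha():
            counters.hit("map.if_alpha")
            out.append((word.lower(), 1))
        else:
            counters.hit("map.else_filtered")
    return out


counters = BranchCounters()
pairs = []
for line in ["Hello big data", "42 nodes", "petabytes of logs"]:
    pairs.extend(map_record(line, counters))

# One small summary per task instead of one log line per record.
print(counters.summary())
```

The cost per record is a single counter increment, so the overhead stays roughly constant however large the input grows, while the summary still shows which branch each record took.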
*
Jumbune’s Debugger
*
The developer develops a chained and complex MapReduce application
Submits the job to Jumbune for flow analysis
Dynamic instrumentation
Job executed on the cluster
Logs collected from the executed cluster nodes
Log parsing and analysis
MapReduce execution flow debug results
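The log collection and parsing steps above can be sketched as a merge of per-node trace lines into one hierarchical view. This is a minimal sketch under stated assumptions: the trace-line format, the sample data, and the names `parse_line` and `build_flow` are invented for illustration and do not reflect Jumbune's actual log format or API.

```python
from collections import defaultdict

# Hypothetical trace-line format (NOT Jumbune's real log format):
#   <node> <job> <phase> <method> <branch> <count>
SAMPLE_LOGS = {
    "node1": [
        "node1 wordcount map tokenize IF1 120",
        "node1 wordcount map tokenize IF2 30",
    ],
    "node2": [
        "node2 wordcount map tokenize IF1 80",
        "node2 wordcount reduce sum IF1 50",
    ],
}


def parse_line(line):
    """Split one trace line into typed fields."""
    node, job, phase, method, branch, count = line.split()
    return node, job, phase, method, branch, int(count)


def build_flow(logs_by_node):
    """Merge per-node lines into job -> phase -> method -> branch -> total."""
    flow = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    )
    for lines in logs_by_node.values():
        for line in lines:
            _node, job, phase, method, branch, count = parse_line(line)
            flow[job][phase][method][branch] += count
    return flow


flow = build_flow(SAMPLE_LOGS)
print(flow["wordcount"]["map"]["tokenize"]["IF1"])
```

Summing counts across nodes gives the cluster-wide picture; keeping the node field in the key instead would give the per-node verification view.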
*
Asymmetric Advantages
MapReduce DevLogic Test
Presents easy-to-understand hierarchical execution flow details of a MapReduce job
Cuts hours of execution-logic debugging down to minutes by identifying the root cause
Verify execution on all participating nodes of the cluster
Works with all major Hadoop distributions
*
Hierarchical Flow Analysis
Trace <Key,Value> pairs into each control structure in every phase of MapReduce
Regular expressions and Custom Java validations on every phase
Job, phase and instance level details
Method, counter and control structure details for deeper analysis
Input keys, output records and filtered in/out details for advanced debugging.
Chained job support
[Figure: hierarchical flow of a Map phase, showing Method() nodes with nested IF1, IF2, and IF3 control-structure blocks]
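Regular-expression validation at each phase can be pictured as a check applied to the <Key,Value> pairs leaving that phase. The sketch below is plain Python written for illustration only; `validate_phase`, the sample pairs, and the key pattern are hypothetical and are not Jumbune's validation API.

```python
import re


def validate_phase(phase, pairs, key_pattern):
    """Return (phase, key, value) for every pair whose key fails the regex."""
    regex = re.compile(key_pattern)
    return [(phase, k, v) for k, v in pairs if not regex.fullmatch(str(k))]


# Hypothetical outputs of two phases of the same job.
map_output = [("alice", 1), ("bob", 1), ("b0b!", 1), ("", 1)]
reduce_output = [("alice", 1), ("bob", 1)]

# Assumed rule: keys must be non-empty lowercase words.
violations = (
    validate_phase("map", map_output, r"[a-z]+")
    + validate_phase("reduce", reduce_output, r"[a-z]+")
)
print(violations)
```

Tagging each violation with its phase shows where a bad key first appears, which narrows the search to the logic of that phase.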
*
Let’s debug your Jobs together!
Website
• http://jumbune.org
Contribute
• http://github.com/impetus-opensource/jumbune
• http://jumbune.org/jira/JUM
Social
• Follow @jumbune, use #jumbune
• Jumbune Group: http://linkd.in/1mUmcYm
Forums
• Users: [email protected]
• Dev: [email protected]
• Issues: [email protected]
Downloads
• http://jumbune.org
• https://bintray.com/jumbune/downloads/jumbune