Java Production Debugging 101: A Reversim Summit Lab, February 2013

Lab: JVM Production Debugging 101

A lab given at the Reversim Summit on 19 February 2013:
http://summit2013.reversim.com/#/sessions/Lab:%20Java%20Production%20Debugging%20101

The code for the sample scenarios can be found on GitHub:
https://github.com/holograph/examples/tree/master/reversim-proddbg-lab

Page 1: Lab: JVM Production Debugging 101

Java Production Debugging 101: A Reversim Summit Lab, February 2013

Page 2: Lab: JVM Production Debugging 101

PRODUCTION DEBUGGING

= FORENSICS

Page 3: Lab: JVM Production Debugging 101

Business Requirements

Requirements       Prod. Debugging     Forensics
Timeframe          Severely limited    Hours, days, weeks…
Chain of Custody   Meaningless         Sacred
Documentation      Useful              Sacred

Page 4: Lab: JVM Production Debugging 101

Endgame

Production Debugging        Forensics
1. Gather evidence          1. Identify crime in progress
2. Restore functionality    2. Gather evidence

3. Figure out what happened

Page 5: Lab: JVM Production Debugging 101

Our Forensic Process

Gather Evidence

Restore Production

Analyze Findings

Implement Solution

Post-Mortem

Page 6: Lab: JVM Production Debugging 101

Evidence toolchain

Page 7: Lab: JVM Production Debugging 101

WHAT SHALL WE COLLECT?

Page 8: Lab: JVM Production Debugging 101

Our focus points for today

• Thread dump
• Heap dump
• VM (especially GC) metrics
• System metrics
• Logs

Page 9: Lab: JVM Production Debugging 101

jstack

• Minimalistic tool
• Against a running process: jstack <pid>
• Outputs to stdout
• Identifies deadlocks
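
The thread dump and deadlock check that jstack performs from the outside can be approximated from inside a running JVM through the standard `java.lang.management` API. The following is a minimal sketch, not a description of how jstack itself works; the class name `DeadlockProbe` is hypothetical, and a HotSpot-style JVM that supports monitor and synchronizer tracking is assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockProbe {

    /** Returns a textual dump of all live threads (abbreviated stack traces). */
    public static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // Only request lock details the running JVM says it can provide.
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            out.append(info); // ThreadInfo.toString() includes name, state and top frames
        }
        return out.toString();
    }

    /** Returns the number of threads currently involved in a deadlock (0 if none). */
    public static int deadlockedThreadCount() {
        // Assumption: the JVM supports synchronizer usage monitoring (HotSpot does).
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println(threadDump());
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
    }
}
```

In a real incident the jstack output is still the evidence to keep; a probe like this is mostly useful for health checks that need to flag a deadlock before anyone is called.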

Page 10: Lab: JVM Production Debugging 101

jmap

• Heap dump from a running process
– Lengthy process
– Freezes VM
• Some extras
• Command: jmap -dump:format=b,file=<output> <pid>
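
When attaching jmap is not practical, a HotSpot JVM can also write the same kind of binary heap dump from inside the process. This is a sketch under assumptions: it relies on the HotSpot-specific `com.sun.management.HotSpotDiagnosticMXBean`, the class name `HeapDumper` and the file name `app.hprof` are made up for illustration, and the same caveats apply (the dump is slow and pauses the VM).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumper {

    /** Dumps live objects to a fresh temp directory and returns the dump size in bytes. */
    public static long dumpToTemp() {
        try {
            // The target file must not already exist; a new temp directory guarantees that.
            Path target = Files.createTempDirectory("heapdump").resolve("app.hprof");
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.toString(), true); // true = only reachable (live) objects
            return Files.size(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Heap dump size in bytes: " + dumpToTemp());
    }
}
```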

Page 11: Lab: JVM Production Debugging 101

jstat

• JVM metrics: classloader, JIT, GC
• Tracking over time
• Console-based
• jstat -gcutil <pid> 5s
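
The idea of tracking GC activity over time can also be sketched in-process with the standard `GarbageCollectorMXBean`. This is only a rough analogue: it samples cumulative collection counts and times, not the per-space utilization percentages that `jstat -gcutil` prints. The class name `GcSampler` and the one-second interval are assumptions for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcSampler {

    /** Sum of collections over all collectors; collectors reporting -1 (undefined) are skipped. */
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds over all collectors. */
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        // Print three samples, one second apart; a steep rise between samples
        // is the in-process equivalent of watching jstat columns climb.
        for (int i = 0; i < 3; i++) {
            System.out.printf("collections=%d time=%dms%n",
                    totalCollections(), totalCollectionMillis());
            Thread.sleep(1000);
        }
    }
}
```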

Page 12: Lab: JVM Production Debugging 101

The JVM GC

Page 13: Lab: JVM Production Debugging 101

jvisualvm

• Combines most of the above, with GUI

• Remote via X11 forwarding (dreadful!)

Page 14: Lab: JVM Production Debugging 101

So… SHALL WE DANCE?

Page 15: Lab: JVM Production Debugging 101

Scenario 1

• Phone call in the middle of the night
– “The application is stuck!”
• What do you do?

Page 16: Lab: JVM Production Debugging 101

Scenario 2

• Looks familiar?
– “The application is crawling to a halt!”
– “So restart it.”
– “OK, it’s good now.”
• This is a lie.
– You will get another call.

Page 17: Lab: JVM Production Debugging 101

Scenario 3

• 1st tier support engineer (maybe you?) calls:
– “I get OutOfMemoryErrors on this service.”
– “Restart it.”
– “Already have. Happened again.”
– “Well, shit.”

Page 18: Lab: JVM Production Debugging 101

BREAK TIME!

Page 19: Lab: JVM Production Debugging 101

FORENSIC TOOLCHAIN

Without further ado…

Page 20: Lab: JVM Production Debugging 101

GNU toolchain is your friend

• bash, ps, grep, less, awk
– ‘nuff said
• … or:
– http://gnuwin32.sourceforge.net/

Page 21: Lab: JVM Production Debugging 101

MAT

• Eclipse plugin/standalone

• Reads heap dumps

• Easy drill-down

Page 22: Lab: JVM Production Debugging 101

And most important…

Page 23: Lab: JVM Production Debugging 101

RESOLUTION TIME!

Page 24: Lab: JVM Production Debugging 101

Back to: Scenario 1

• What did we gather?
– CPU: 100% single-core utilization
– GC metrics: no useful data
– Heap dump: no useful data
– Thread dump
  • java.util.regex * gazillion
• Where the problem is implies… what the problem is
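
One common way for a single thread to burn a full core inside `java.util.regex` frames is catastrophic backtracking. The sketch below is an illustration only, not the lab's actual sample code: the pattern `(a+)+b`, the class name `BacktrackingDemo` and the input sizes are assumptions. On many JVM versions the time to reject the input grows very quickly with its length; some newer releases mitigate certain cases, so no timings are claimed here.

```java
import java.util.regex.Pattern;

public class BacktrackingDemo {

    // Nested quantifiers: a classic shape that is prone to catastrophic backtracking.
    private static final Pattern NESTED = Pattern.compile("(a+)+b");

    /** Builds "aaa...a!" of the given length of 'a's and times how long the non-match takes. */
    public static long millisToReject(int length) {
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < length; i++) {
            input.append('a');
        }
        input.append('!'); // no 'b' anywhere, so the match must eventually fail

        long start = System.nanoTime();
        boolean matched = NESTED.matcher(input).matches();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;

        if (matched) {
            throw new IllegalStateException("input unexpectedly matched");
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Keep sizes small: each extra character can roughly double the work.
        for (int n = 10; n <= 26; n += 4) {
            System.out.printf("length=%d took %d ms%n", n, millisToReject(n));
        }
    }
}
```

A thread dump taken while such a match is running shows the busy thread deep inside `java.util.regex.Pattern` internals, which is how the location of the problem points at its cause.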

Page 25: Lab: JVM Production Debugging 101

Back to: Scenario 2

• What did we gather?
– CPU: 100% single-core utilization
– Heap dump: no useful data
– Thread dump
– GC metrics
  • Frequent, long GCs (GC, FGC, FGCT)
• Rapid HashMap insertions: recipe for disaster

Page 26: Lab: JVM Production Debugging 101

Back to: Scenario 3

• What did we gather?
– CPU: low utilization
– Thread dump: no useful data
– GC metrics: high heap utilization, low GC
– Heap dump
  • Predictably high number of strings
  • Strings are abnormally large
  • Strings contain entire HTML subset!
• Substring/regex can be dangerous!
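
Why substrings can be dangerous: on JVMs of this lab's era (before Java 7 update 6), `String.substring` returned a view that shared the parent string's character array, so a short token cut from an HTML page could keep the whole page alive in the heap. A minimal sketch of the usual defensive copy follows; the class name `TokenExtractor` and the `<title>` example are hypothetical. On current JVMs `substring` already copies, so the extra `new String(...)` is redundant but harmless.

```java
public class TokenExtractor {

    private static final String OPEN = "<title>";
    private static final String CLOSE = "</title>";

    /** Returns the text between <title> and </title>, or "" if no title is present. */
    public static String extractTitle(String html) {
        int start = html.indexOf(OPEN);
        if (start < 0) {
            return "";
        }
        int end = html.indexOf(CLOSE, start);
        if (end < 0) {
            return "";
        }
        // new String(...) forces a copy sized to the token, so on old JVMs the
        // result no longer references the full page's backing array.
        return new String(html.substring(start + OPEN.length(), end));
    }

    public static void main(String[] args) {
        String page = "<html><head><title>Hi</title></head><body>...</body></html>";
        System.out.println(extractTitle(page)); // prints "Hi"
    }
}
```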

Page 27: Lab: JVM Production Debugging 101

AFTERWORD

Headache? Take two of these!

Page 28: Lab: JVM Production Debugging 101

Adieu

• Thank you for attending!

• Presentation and demos:

http://git.io/7LK4fw

• Tomer Gabel
– [email protected]
– http://www.tomergabel.com/
– @tomerg

Page 29: Lab: JVM Production Debugging 101

Thank you, our sponsors