Transcript
Page 1: Pimp my gc - Supersonic Scala

@pingtimeout Scala.IO – 24&25 oct 13

Pimp my GC - Supersonic Scala!

Page 2: Pimp my gc - Supersonic Scala

/me

● Pierre Laporte

● “Java Performance Tuning” Trainer

● Performance issues, GC-log-compliant eyes

http://www.pingtimeout.fr

@pingtimeout

[email protected]

Page 3: Pimp my gc - Supersonic Scala

Agenda

● 42 minutes of

– Fun (Theory)

– Fun (Practice)

– Fun (Feedback)

– Fun (Questions/Answers)

– Fun (Trolls)

● Because performance is fun!

Page 4: Pimp my gc - Supersonic Scala

Disclaimer

● Be critical of the information contained in this talk

● JVM tuning is always done on a case-by-case basis. There is no magic, no special set of flags that produces good results on every project.

● The resemblance of any opinion, recommendation or comment made during this presentation to performance tuning advice is merely coincidental.

Page 5: Pimp my gc - Supersonic Scala

Weak Generational Hypothesis 101

Page 6: Pimp my gc - Supersonic Scala

Theory – Weak Generational Hypothesis

Page 7: Pimp my gc - Supersonic Scala

Theory – Weak Generational Hypothesis

Page 8: Pimp my gc - Supersonic Scala

Theory – Weak Generational Hypothesis

● “Most objects die young”

● Possible scales:

– MB, GB, TB

– Minutes, hours, days

Page 9: Pimp my gc - Supersonic Scala

Examples – Weak Generational Hypothesis

[Chart: volume in GB over 3 days]

Total: 145 GB

Avg: 48 GB/day

Page 10: Pimp my gc - Supersonic Scala

Examples – Weak Generational Hypothesis

[Chart: volume in TB over 10 days]

Total: 30 TB

Avg: 3 TB/day

Page 11: Pimp my gc - Supersonic Scala

Examples – Weak Generational Hypothesis

● 35 GB/day

– Scala

– Play 2

– Akka

● 3 TB/day

– Java

– Tomcat

– Jax-RS / Spring / Hibernate...

Page 12: Pimp my gc - Supersonic Scala

Examples – Weak Generational Hypothesis

Don't forget!

– Be critical

– Case-by-case analysis

– Please don't do that

---->

Page 13: Pimp my gc - Supersonic Scala

JVM Heap 101

Page 14: Pimp my gc - Supersonic Scala

Theory – Memory pools

● Java Heap – 2 memory pools

(Except for G1 GC)

● Young Generation for... young objects

● Old Generation for... old objects!!!

Amazing, right?

Page 15: Pimp my gc - Supersonic Scala

Theory – Memory pools

Page 16: Pimp my gc - Supersonic Scala

Theory – Memory pools

● Young Generation = Eden + Survivors

● Every object is created in Eden*

* : except when it is too big to fit in Eden

* : except in special cases for G1 GC

Page 17: Pimp my gc - Supersonic Scala

Theory – Memory pools

Page 18: Pimp my gc - Supersonic Scala

Why memory pools ?!

● Always 2 GCs per JVM*

* Except for G1 GC

● Young GC

– Cheap

– Duration mostly ≈ O(Live data in YG)

● Old GC

– Expensive

– Duration mostly ≈ O(Live data in OG)
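One way to see why the pause tracks live data rather than garbage: a tracing collector only visits objects that are reachable from the roots. The following is a toy sketch of that reachability walk, not HotSpot code; `Node` and `TracingSketch` are hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy heap object (hypothetical): holds references to other objects.
final class Node {
    final List<Node> refs = new ArrayList<>();
}

final class TracingSketch {
    // Counts the objects reachable from the roots. Unreachable objects
    // (garbage) are never visited, so they add nothing to the work done.
    static int countLive(List<Node> roots) {
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (visited.add(node)) {
                pending.addAll(node.refs);
            }
        }
        return visited.size();
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.refs.add(new Node());       // two live objects in total
        for (int i = 0; i < 1_000_000; i++) {
            new Node();                  // garbage: never reachable from root
        }
        // The walk touches only the two live objects, whatever the garbage volume.
        System.out.println(countLive(Collections.singletonList(root)));
    }
}
```

The number of loop iterations in `countLive` depends only on the size of the reachable graph, which is the intuition behind the O(Live data) estimates above.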

Page 19: Pimp my gc - Supersonic Scala

Why memory pools ?!

Common GC name   Young Gen GC   Old Gen GC
“Parallel GC”    PSYoungGen     ParOldGen
“CMS”            ParNew         CMS
“G1 GC”          G1 (handles both generations)

Page 20: Pimp my gc - Supersonic Scala

GC Duration ?!

Prove it!

Page 21: Pimp my gc - Supersonic Scala

App with small live set

Page 22: Pimp my gc - Supersonic Scala

App with big live set

Page 23: Pimp my gc - Supersonic Scala

Experiment 1

● 1st run (SmallLiveSet)

– 50 GB heap (-ms50g -mx50g)

– 49.9GB Young Gen (-Xmn49900m)

– GC logs

Page 24: Pimp my gc - Supersonic Scala

Experiment 1

● 1st run (SmallLiveSet)

– 50 GB heap

● -ms50g -mx50g

– 49.9GB Young Gen

● -Xmn49900m

– GC logs

● Result:

– 6ms YGC pauses to free 38GB of memory

Page 25: Pimp my gc - Supersonic Scala

Experiment 1 : Result

[PSYoungGen: 38329728K->6496K(44710400K)] 38329744K->6512K(46041600K), 0.0067050 secs] //...

● 38,329,728K of data in YG before GC, 6,496K after

● YG size is 44,710,400K

● 38,329,744K of data in heap before GC, 6,512K after

● Heap size is 46,041,600K

● Total pause time: 6.7 ms
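The reading of that log line can be automated. Below is a minimal sketch that extracts the same fields with a regular expression. It assumes the exact line layout shown on this slide (other JVM versions and flags produce different layouts), and `YoungGcLine` is a hypothetical helper name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class YoungGcLine {
    // Layout assumed from the slide:
    // [PSYoungGen: <before>K-><after>K(<size>K)] <before>K-><after>K(<size>K), <secs> secs]
    private static final Pattern LINE = Pattern.compile(
        "\\[PSYoungGen: (\\d+)K->(\\d+)K\\((\\d+)K\\)\\] "
        + "(\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final long youngBeforeKb;
    final long youngAfterKb;
    final long youngCapacityKb;
    final long heapBeforeKb;
    final long heapAfterKb;
    final long heapCapacityKb;
    final double pauseMillis;

    private YoungGcLine(Matcher m) {
        youngBeforeKb = Long.parseLong(m.group(1));
        youngAfterKb = Long.parseLong(m.group(2));
        youngCapacityKb = Long.parseLong(m.group(3));
        heapBeforeKb = Long.parseLong(m.group(4));
        heapAfterKb = Long.parseLong(m.group(5));
        heapCapacityKb = Long.parseLong(m.group(6));
        pauseMillis = Double.parseDouble(m.group(7)) * 1000.0; // seconds to ms
    }

    // Parses one log line; rejects anything that does not match the layout.
    static YoungGcLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a PSYoungGen line: " + line);
        }
        return new YoungGcLine(m);
    }

    public static void main(String[] args) {
        YoungGcLine gc = parse("[PSYoungGen: 38329728K->6496K(44710400K)] "
            + "38329744K->6512K(46041600K), 0.0067050 secs]");
        System.out.println("Freed in young gen: "
            + (gc.youngBeforeKb - gc.youngAfterKb) + "K");
        System.out.println("Pause: " + gc.pauseMillis + " ms");
    }
}
```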

Page 26: Pimp my gc - Supersonic Scala

Experiment 2

● 2nd run (SmallLiveSet)

– 50 GB heap (-ms50g -mx50g)

– 10MB Young Gen (-Xmn10m)

– GC logs

Page 27: Pimp my gc - Supersonic Scala

Experiment 2

● 2nd run (SmallLiveSet)

– 50 GB heap

● -ms50g -mx50g

– 10MB Young Gen

● -Xmn10m

– GC logs

● Result:

– 322ms Full GC pauses to free 52GB of memory

Page 28: Pimp my gc - Supersonic Scala

Experiment 2 : Result

[Full GC [PSYoungGen: 3072K->0K(7168K)] [ParOldGen: 52418151K->30287K(52418560K)] 52421223K->30287K(52425728K)//... 0.3229410 secs]

● 52,418,151K of data in OG before GC, 30,287K after

● OG size is 52,418,560K

● 52,421,223K of data in heap before GC, 30,287K after

● Heap size is 52,425,728K

● Total pause time: 322.9 ms

Page 29: Pimp my gc - Supersonic Scala

Experiments 1->4, Wrap up

● 1st and 2nd runs with BigLiveSet

– Ran out of time* :-(

*: Stopped measuring at heap occupancy ≈ 22 GB

● GC pauses:

Live set   1st run (49.9 GB Young Gen)   2nd run (10 MB Young Gen)
Small      6 ms                          322 ms (Full GC)
Big        55 s (Full GC)*               250 s (Full GC)*

Page 30: Pimp my gc - Supersonic Scala

Immutability

Is immutability a problem ?

Page 31: Pimp my gc - Supersonic Scala

Immutability

● What does this code do ?

(GC point of view ?)

Page 32: Pimp my gc - Supersonic Scala

Immutability

Page 33: Pimp my gc - Supersonic Scala

Immutability

● What does this code do?

– Creates more temporary objects that die young

– Respects the Weak Generational Hypothesis
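The code shown on the slide is not part of this transcript. As a stand-in, here is a hypothetical example of the pattern being described: an immutable value whose every "update" allocates a new instance and leaves the previous one unreachable. `Account` and `ImmutabilityDemo` are illustrative names, not taken from the talk.

```java
// Hypothetical immutable value: fields are final, updates return a new object.
final class Account {
    private final String owner;
    private final long balance;

    Account(String owner, long balance) {
        this.owner = owner;
        this.balance = balance;
    }

    // Returns a fresh Account instead of mutating this one.
    Account deposit(long amount) {
        return new Account(owner, balance + amount);
    }

    long balance() {
        return balance;
    }
}

final class ImmutabilityDemo {
    static long sumDeposits(long[] amounts) {
        Account account = new Account("demo", 0);
        for (long amount : amounts) {
            // The previous instance becomes garbage right away: it dies young.
            account = account.deposit(amount);
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(sumDeposits(new long[] {10, 20, 30}));
    }
}
```

Only one `Account` is live at any point in the loop, so the live set stays small even though the allocation count grows with the input.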

Page 34: Pimp my gc - Supersonic Scala

Immutability

● Consequences compared to mutable state

– GC will run more frequently

– GC time will be short

O(Live data in YG)

Page 35: Pimp my gc - Supersonic Scala

Tuning for immutability

● Reduce YGC frequency (for ParallelGC and CMS)

– Identify the allocation rate (MB/second)

– Define the GC interval (seconds between GCs)

=> Set Eden = Allocation rate * GC interval

Page 36: Pimp my gc - Supersonic Scala

Tuning for immutability

● Reduce YGC frequency (for ParallelGC and CMS)

– Allocation rate (AR) = 200 MB/s

– Desired interval = 1 YGC every 4 seconds

=> Set Eden to 800 MB (Young to 1 GB)

-Xmn1g
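The arithmetic behind that sizing can be sketched as follows. The Eden formula is the one from the slide. The step from Eden to Young Generation is an assumption: it supposes Eden plus two survivor spaces with a survivor ratio of 8 (the usual HotSpot value for -XX:SurvivorRatio), whereas adaptive sizing may choose differently at runtime. `EdenSizing` is a hypothetical helper name.

```java
final class EdenSizing {
    // Eden (MB) = allocation rate (MB/s) * desired interval between young GCs (s).
    static long edenMb(long allocationRateMbPerSec, long gcIntervalSec) {
        return allocationRateMbPerSec * gcIntervalSec;
    }

    // Young generation (MB) needed for the given Eden, ASSUMING the layout
    // Eden + 2 survivor spaces, with Eden = survivorRatio * one survivor space.
    static long youngMb(long edenMb, int survivorRatio) {
        return edenMb * (survivorRatio + 2) / survivorRatio;
    }

    public static void main(String[] args) {
        long eden = edenMb(200, 4);     // 800 MB, as on the slide
        long young = youngMb(eden, 8);  // 1000 MB, close to -Xmn1g
        System.out.println("Eden = " + eden + " MB, Young = " + young + " MB");
    }
}
```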

Page 37: Pimp my gc - Supersonic Scala

Pony Pause

Page 38: Pimp my gc - Supersonic Scala

G1 GC time !

Page 39: Pimp my gc - Supersonic Scala

G1 GC

● Idea

– Split the heap into 2048 regions

– Assign each region to a memory pool on the fly

– Grow/shrink memory pools at runtime

http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All

Page 40: Pimp my gc - Supersonic Scala

G1 GC

● Memory pools:

– Young (Eden, Survivors)

– Old

– Humongous

● Humongous:

– Objects >= 50% of a region
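To make the 50% rule concrete, here is a rough sketch of how a region size and the humongous check could be computed. The region-size heuristic is an approximation (about 2048 regions, power-of-two sizes between 1 MB and 32 MB); the real JVM derives it from the initial and maximum heap sizes and lets -XX:G1HeapRegionSize override it. `G1Regions` is a hypothetical name.

```java
final class G1Regions {
    static final long MB = 1024L * 1024L;

    // Approximation of the default: aim for about 2048 regions, round the
    // region size down to a power of two, and clamp it to [1 MB, 32 MB].
    static long regionSizeBytes(long heapBytes) {
        long target = Math.max(heapBytes / 2048, 1);
        long powerOfTwo = Long.highestOneBit(target);
        return Math.min(Math.max(powerOfTwo, MB), 32 * MB);
    }

    // Rule from the slide: humongous means at least half a region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = regionSizeBytes(4096 * MB);   // 4 GB heap
        System.out.println("Region size: " + (region / MB) + " MB");
        System.out.println("1 MB array humongous? " + isHumongous(MB, region));
    }
}
```

With a 4 GB heap this sketch yields 2 MB regions, so any single allocation of 1 MB or more would count as humongous under the slide's rule.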

http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All

Page 41: Pimp my gc - Supersonic Scala

G1 GC

● 1 ½ GC algorithm:

– Always collect Young Gen

– Collect Old Gen if possible

● Best regions only

● Time budget large enough

● Preconditions

● “mixed” collection

● G1 is self-tuning

http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All

Page 42: Pimp my gc - Supersonic Scala

G1 GC Tuning

● Define GC time budget

-XX:MaxGCPauseMillis=<N>

-XX:GCPauseIntervalMillis=<M>

● Set Xms == Xmx

● Drop all other GC-related flags

-Xmn, -XX:MaxTenuringThreshold, -XX:NewRatio

-XX:InitiatingHeapOccupancyPercent, …

● Don't try to outsmart the GC

Page 43: Pimp my gc - Supersonic Scala

G1 GC Tuning

● Enable GC logs

-Xloggc:gc.log

-XX:+PrintGCDetails

-XX:+PrintTenuringDistribution

-XX:+PrintGCCause

-XX:+PrintAdaptiveSizePolicy

● Wait and see

Page 44: Pimp my gc - Supersonic Scala

G1 GC Tuning – Low hanging fruits

● Eliminate Humongous allocations

– Humongous regions collected only at Full GC

– Or when empty

[G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 0 bytes, allocation request: 79012360 bytes, threshold: 47185920 bytes (45.00 %), source: concurrent humongous allocation]

[G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: requested by GC cause, GC cause: G1 Humongous Allocation]

Page 45: Pimp my gc - Supersonic Scala

G1 GC Tuning – Low hanging fruits

● Eliminate Humongous allocations

– Humongous regions collected only at Full GC

– Or when empty

2013-10-21T19:23:48.758+0200: [GC pause (G1 Humongous Allocation) (young) (initial-mark) Desired survivor size 1572864 bytes, new threshold 15 (max 15) , 0.0015120 secs]

Page 46: Pimp my gc - Supersonic Scala

G1 GC Tuning – Low hanging fruits

● Eliminate Humongous allocations

– Track your big allocations

– Kill 'em!

● Why?

– They fragment the heap

– They can cause evacuation failures

Page 47: Pimp my gc - Supersonic Scala

G1 GC Tuning – Low hanging fruits

● Get rid of “mixed collections”

– Increase heap size

– Set a higher threshold for mixed collections

-XX:InitiatingHeapOccupancyPercent=<N>

● Why?

– Some phases of G1 are STW (like “baaaaad”)

– G1 goal : find the best candidates among all old regions
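The occupancy threshold that G1 prints in its ergonomics log can be related to -XX:InitiatingHeapOccupancyPercent with simple arithmetic: heap capacity times the percentage. A small sketch follows, assuming a 100 MB heap; that size is inferred from the 47185920-byte (45.00 %) threshold in the humongous-allocation log excerpt and is not stated on the slides. `IhopThreshold` is a hypothetical name.

```java
final class IhopThreshold {
    // Occupancy (bytes) above which a concurrent cycle is requested,
    // modelled here as a plain percentage of the heap capacity.
    static long thresholdBytes(long heapBytes, int percent) {
        return heapBytes * percent / 100;
    }

    public static void main(String[] args) {
        long heap = 100L * 1024 * 1024;  // assumed 100 MB heap
        System.out.println("45% -> " + thresholdBytes(heap, 45) + " bytes");
        System.out.println("70% -> " + thresholdBytes(heap, 70) + " bytes");
    }
}
```

Raising the percentage or enlarging the heap both push this threshold up, which delays the concurrent cycle and the mixed collections that follow it.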

Page 48: Pimp my gc - Supersonic Scala

G1 GC Tuning – Low hanging fruits

● Eliminate “Evacuation/Allocation failures”

– They are our good old Full GCs

[GC pause (G1 Evacuation Pause) (young)

//...

[Full GC (Allocation Failure)

5860M->2690M(7000M), 0.9824032 secs]

Page 49: Pimp my gc - Supersonic Scala

Summary

● Performance is fun!

● Understand what you do

● Immutability is not an issue (by itself)

– Bad code is.

● GC Duration ≈ O(Live data)

● G1 is self-tuning

– Try it :-)

Page 50: Pimp my gc - Supersonic Scala

Thank you for listening!

For more information:

http://www.pingtimeout.fr

@pingtimeout

[email protected]