Tales About Scala Performance

DESCRIPTION

A session given at the Scalapeño 2013 conference.

© Copyright Performize-IT LTD.

About Me

My name: Haim Yadid
Hard to pronounce
Luckily it is meaningful

Haim => Life
Yadid => Friend

hybrid nick: lifey

:: ::::::this :: Nil::

Performize-IT

Optimizing Software since 2007

Performance Bottlenecks

Crashes

GC Tuning

Training & Mentoring

OutOfMemory

Concurrency

Contact Me

http://il.linkedin.com/in/haimyadid

lifey@performize-it.com

www.performize-it.com

blog.performize-it.com

https://github.com/lifey

@lifeyx

Once Upon A Time

Benchmarks by Google

So we are done

So what is this talk about?

Best practices? Micro benchmarks?

Understand:
  How to find performance problems
  How to solve them
  How to reach a well-performing production system

Prerequisites:
  Familiarity with the JVM
  Basic knowledge of Scala

Performance is all about

Methodology
  Monitoring
  Hotspots
  Isolation
  Analysis
  Solution

Tools are your Best Friends for this task

Scala Runs on the JVM

All JVM capabilities and tools still apply
Take your best friends with you

Premature Optimization

I shall not optimize prematurely

Monitoring the JVM

Java Management Extensions (JMX)
  On the same machine (attach)
  Remotely, via command-line params
Tools
  JConsole
  JVisualVM
  Mission Control
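The tools listed here read the platform MXBeans that JMX exposes. As a minimal sketch (the object name `JmxSnapshot` and its method are illustrative and not part of the talk), the same beans can be read from inside the process through the standard `java.lang.management` API:

```scala
import java.lang.management.ManagementFactory

object JmxSnapshot {
  // Returns (used heap bytes, committed heap bytes, live thread count)
  def snapshot(): (Long, Long, Int) = {
    val heap = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage
    val threads = ManagementFactory.getThreadMXBean.getThreadCount
    (heap.getUsed, heap.getCommitted, threads)
  }

  def main(args: Array[String]): Unit = {
    val (used, committed, threads) = snapshot()
    println(s"heap used=${used / 1024}k committed=${committed / 1024}k threads=$threads")
  }
}
```

JConsole and VisualVM show these same values over time; an in-process read like this is mainly useful for logging or health checks.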

Remote Monitoring - JMX

Add params to the command line of the profiled app:
  -Dcom.sun.management.jmxremote
  -Dcom.sun.management.jmxremote.port=<port>
  -Dcom.sun.management.jmxremote.authenticate=false
  -Dcom.sun.management.jmxremote.ssl=false

Authentication and SSL are recommended; refer to
http://java.sun.com/j2se/1.5.0/docs/guide/management/agent.html

Production

A Tale about a Stack

Your First Scala Function

Functional programming: recursion
Easy to understand
Probably your first program in Scala will look like:

def sumOfSquares(st: Int, end: Int): Int = {  // a recursive method needs an explicit result type
  if (st > end) 0 else st*st + sumOfSquares(st+1, end)
}

And your first exception will be:

java.lang.StackOverflowError
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:8)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
  at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)

Tail Recursion

The recursive call to the function must be the value returned

if (number == 1) 1 else number * factorial(number - 1)

(Not tail recursive: the multiplication still runs after the recursive call returns.)

Favor tail recursion

The JVM does not optimize recursion
  Meaning an extra call for every iteration
  Limit on recursion depth
The Scala compiler can optimize tail recursion!!

import scala.annotation.tailrec

@tailrec def sumOfSquares(st: Int, end: Int, sum: Int = 0): Int = {
  if (st > end) sum else sumOfSquares(st+1, end, sum + st*st)
}

@tailrec Annotation

A compile-time directive
Fails compilation if tail recursion optimization cannot be applied
Use it whenever tail recursion is mandatory for performance and functionality
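To make the difference concrete, here is a small sketch that puts both shapes side by side. The object name `TailRecSketch` and the use of `Long` accumulators (to avoid `Int` overflow on large ranges) are choices made for this illustration, not taken from the slides:

```scala
import scala.annotation.tailrec

object TailRecSketch {
  // Naive version: the addition happens after the recursive call returns,
  // so every step keeps its own stack frame alive.
  def sumOfSquaresNaive(st: Int, end: Int): Long =
    if (st > end) 0L else st.toLong * st + sumOfSquaresNaive(st + 1, end)

  // Accumulator version: the recursive call is the last action, so scalac
  // rewrites it into a loop. @tailrec makes compilation fail if it cannot.
  @tailrec
  def sumOfSquares(st: Int, end: Int, sum: Long = 0L): Long =
    if (st > end) sum else sumOfSquares(st + 1, end, sum + st.toLong * st)
}
```

The accumulator version handles a range of a million elements in constant stack space, while the naive version is likely to hit a StackOverflowError at that depth with default stack sizes.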

Stack Size

Ranges from 256k to 1024k
Depending on platform and JVM version
What is it on your system?

java -XX:+PrintFlagsFinal -version |& grep ThreadStackSize

Tune the thread stack size to your needs
Example: -Xss1312k
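Besides the global -Xss setting, a single thread can request its own stack size through the four-argument `java.lang.Thread` constructor. The sketch below (object and method names are hypothetical) measures how deep a non-tail-recursive call chain gets before the stack overflows. The JVM is allowed to treat the requested size as a hint, so the numbers vary by platform:

```scala
object StackDepthProbe {
  // Runs a deliberately non-tail-recursive call chain on a thread created with
  // the requested stack size and reports the depth reached before overflow.
  def maxDepth(stackSizeBytes: Long): Int = {
    var depth = 0
    def dive(): Int = { depth += 1; 1 + dive() }
    val probe = new Runnable {
      def run(): Unit =
        try { dive(); () }
        catch { case _: StackOverflowError => () }
    }
    // Thread(group, target, name, stackSize): stackSize may be only a hint
    val t = new Thread(null, probe, "stack-probe", stackSizeBytes)
    t.start()
    t.join() // join makes the final value of depth visible to this thread
    depth
  }

  def main(args: Array[String]): Unit = {
    println(s"256k stack: depth ${maxDepth(256L * 1024)}")
    println(s"4m stack:   depth ${maxDepth(4L * 1024 * 1024)}")
  }
}
```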

Production

Stacks in Scala

The Scala stack is just like the Java stack
jstack is your best friend
Scala terminology may be obscured
E.g. List will look like $colon$colon
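The obscured names are simply the JVM encoding of Scala's symbolic identifiers. As a small sketch (the object `EncodedNames` is illustrative), `scala.reflect.NameTransformer` from the standard library can translate them back:

```scala
import scala.reflect.NameTransformer

object EncodedNames {
  // Runtime class name of a non-empty List cell, as it appears in a stack dump
  val consClassName: String = List(1).getClass.getName

  // Translates a JVM-level name back to its Scala source form
  def decode(jvmName: String): String = NameTransformer.decode(jvmName)

  def main(args: Array[String]): Unit = {
    println(consClassName)
    println(decode("$colon$colon"))
  }
}
```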

JStack

Part of the JDK
Dumps stack traces of all live threads
Synopsis: jstack -l <pid>
Use when you need to:
  Get a snapshot of program activity
  Detect deadlocks
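When attaching jstack is not possible, similar information is available in-process from the standard `ThreadMXBean`. This is only a sketch of that alternative, with the hypothetical name `DeadlockCheck`; it is not what jstack itself runs:

```scala
import java.lang.management.ManagementFactory

object DeadlockCheck {
  private val threadBean = ManagementFactory.getThreadMXBean

  // Names of threads currently deadlocked on monitors or ownable synchronizers
  def deadlockedThreadNames(): Seq[String] = {
    val ids = threadBean.findDeadlockedThreads() // null when there is no deadlock
    if (ids == null) Seq.empty
    else threadBean.getThreadInfo(ids).toSeq.filter(_ != null).map(_.getThreadName)
  }

  // One line per live thread: name, state and innermost frame
  def snapshot(): Seq[String] =
    threadBean.dumpAllThreads(false, false).toSeq.map { info =>
      val top = info.getStackTrace.headOption.map(_.toString).getOrElse("<no frames>")
      s"${info.getThreadName} [${info.getThreadState}] $top"
    }
}
```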

Takipi’s Stackifier

www.stackifier.com

Humpty Dumpty sat on a heap,
Humpty Dumpty had an OutOfMemory flip.
All the king’s horses and all the king’s men
Couldn’t put Humpty together again

Heap

Max Used

In a Perfect World.....

Heap (or Perm Gen) is depleted
-XX:+HeapDumpOnOutOfMemoryError
Scala code does not have a larger memory footprint
Scala code may have a larger permgen footprint

Production

MAT

MAT: Memory Analyzer Tool
A very powerful tool for analyzing heap dumps

Use to investigate:
  Memory leaks
  OutOfMemory errors
  Memory footprint

Alternatives
  Yourkit / JProbe / JProfiler (Commercial)
  VisualVM (JDK)
  JHat (JDK)

MAT-name-resolver

Add-on for MAT
Helps MAT understand Scala
Developed by Iulian Dragos from Typesafe
GitHub project: https://github.com/dragos/MAT-name-resolver

List[Int] ?

OutOfMemory Perm Space

Class bytecode resides in PermGen
Scala will use more perm space
You can write a small piece of code which will create a lot of bytecode

@ScalaSignature

@ScalaSignature(bytes="...
Metadata needed for:
  Reflection
  Compilation
Larger class files

More classes

Each closure is actually a JVM class
Implicit conversions are classes
Companion objects are also classes

Well

object ClosureExample extends App {
  val f = (x: Int) => x*x
  println(s"closure ${f(5)}")
}

ClosureExample$.class

package com.performizeit.scalapeno.demos;

import scala.Function0;
import scala.Function1;
import scala.LowPriorityImplicits;
import scala.Predef.;
import scala.StringContext;
import scala.reflect.ScalaSignature;
import scala.runtime.AbstractFunction0;
import scala.runtime.BoxedUnit;
import scala.runtime.BoxesRunTime;

@ScalaSignature(bytes="\006\001\035:Q!\001\002\t\002-\tab\0217pgV\024X-\022=b[BdWM\003\002\004\t\005)A-Z7pg*\021QAB\001\ng\016\fG.\0319f]>T!a\002\005\002\031A,'OZ8s[&TX-\033;\013\003%\t1aY8n\007\001\001\"\001D\007\016\003\t1QA\004\002\t\002=\021ab\0217pgV\024X-\022=b[BdWmE\002\016!Y\001\"!\005\013\016\003IQ\021aE\001\006g\016\fG.Y\005\003+I\021a!\0218z%\0264\007CA\t\030\023\tA\"CA\002BaBDQAG\007\005\002m\ta\001P5oSRtD#A\006\t\017ui!\031!C\001=\005\ta-F\001 !\021\t\002E\t\022\n\005\005\022\"!\003$v]\016$\030n\03482!\t\t2%\003\002%%\t\031\021J\034;\t\r\031j\001\025!\003 \003\t1\007\005")
public final class ClosureExample
{
  public static void main(String[] paramArrayOfString) {
    ClosureExample..MODULE$.main(paramArrayOfString);
  }

  public static void delayedInit(Function0<BoxedUnit> paramFunction0) {
    ClosureExample..MODULE$.delayedInit(paramFunction0);
  }

  public static String[] args() {
    return ClosureExample..MODULE$.args();
  }

  public static void scala$App$_setter_$executionStart_$eq(long paramLong) {
    ClosureExample..MODULE$.scala$App$_setter_$executionStart_$eq(paramLong);
  }

  public static long executionStart() {
    return ClosureExample..MODULE$.executionStart();
  }

  public static Function1<Object, Object> f() {
    return ClosureExample..MODULE$.f();
  }

  public static class delayedInit$body extends AbstractFunction0 {
    private final ClosureExample. $outer;

    public final Object apply() {
      this.$outer.f_$eq(new ClosureExample..anonfun.1());
      Predef..MODULE$.println(new StringContext(Predef..MODULE$.wrapRefArray((Object[])new String[] { "closure ", "" })).s(Predef..MODULE$.genericWrapArray(new Object[] { BoxesRunTime.boxToInteger(this.$outer.f().apply$mcII$sp(5)) })));

      return BoxedUnit.UNIT;
    }

    public delayedInit$body(ClosureExample. $outer) { }
  }
}

ClosureExample.class

package com.performizeit.scalapeno.demos;

import scala.App;
import scala.App.class;
import scala.DelayedInit;
import scala.Function0;
import scala.Function1;
import scala.Serializable;
import scala.collection.mutable.ListBuffer;
import scala.runtime.AbstractFunction1.mcII.sp;
import scala.runtime.BoxedUnit;

public final class ClosureExample$ implements App
{
  public static final MODULE$;
  private Function1<Object, Object> f;
  private final long executionStart;
  private String[] scala$App$$_args;
  private final ListBuffer<Function0<BoxedUnit>> scala$App$$initCode;

  static { new (); }

  public long executionStart() { return this.executionStart; }
  public String[] scala$App$$_args() { return this.scala$App$$_args; }
  public void scala$App$$_args_$eq(String[] x$1) { this.scala$App$$_args = x$1; }
  public ListBuffer<Function0<BoxedUnit>> scala$App$$initCode() { return this.scala$App$$initCode; }
  public void scala$App$_setter_$executionStart_$eq(long x$1) { this.executionStart = x$1; }
  public void scala$App$_setter_$scala$App$$initCode_$eq(ListBuffer x$1) { this.scala$App$$initCode = x$1; }
  public String[] args() { return App.class.args(this); }
  public void delayedInit(Function0<BoxedUnit> body) { App.class.delayedInit(this, body); }
  public void main(String[] args) { App.class.main(this, args); }
  public Function1<Object, Object> f() { return this.f; }
  public void f_$eq(Function1 x$1) { this.f = x$1; }

ClosureExample$$anonfun$1.class

package com.performizeit.scalapeno.demos;

import scala.Serializable;
import scala.runtime.AbstractFunction1.mcII.sp;

public final class ClosureExample$$anonfun$1 extends AbstractFunction1.mcII.sp implements Serializable
{
  public static final long serialVersionUID = 0L;

  public final int apply(int x) { return apply$mcII$sp(x); }
  public int apply$mcII$sp(int x) { return x * x; }
}

ClosureExample$delayedInit$body.class

package com.performizeit.scalapeno.demos;

import scala.Function1;
import scala.LowPriorityImplicits;
import scala.Predef.;
import scala.StringContext;
import scala.runtime.AbstractFunction0;
import scala.runtime.BoxedUnit;
import scala.runtime.BoxesRunTime;

public final class ClosureExample$delayedInit$body extends AbstractFunction0
{
  private final ClosureExample. $outer;

  public final Object apply() {
    this.$outer.f_$eq(new ClosureExample..anonfun.1());
    Predef..MODULE$.println(new StringContext(Predef..MODULE$.wrapRefArray((Object[])new String[] { "closure ", "" })).s(Predef..MODULE$.genericWrapArray(new Object[] { BoxesRunTime.boxToInteger(this.$outer.f().apply$mcII$sp(5)) })));

    return BoxedUnit.UNIT;
  }

  public ClosureExample$delayedInit$body(ClosureExample. $outer) {

@specialized

Generics are implemented by type erasure
For primitive types this means:
  Boxing/unboxing
  Performance hit
  Large memory footprint

The @specialized annotation enables specialized implementations
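A minimal sketch of the annotation in use follows. The class `Cell` and the chosen type list are illustrative. It assumes Scala 2.x; to my knowledge the Scala 3 compiler accepts the annotation but does not generate specialized variants:

```scala
// Assumes Scala 2.x. Restricting the type list keeps the number of generated classes small.
class Cell[@specialized(Int, Double) T](val value: T) {
  def get: T = value
}

object SpecializedSketch {
  def main(args: Array[String]): Unit = {
    val intCell = new Cell(42)     // under Scala 2 this picks the Int variant, no boxing
    val strCell = new Cell("text") // reference types use the generic, erased class
    println(intCell.getClass.getName)
    println(strCell.getClass.getName)
  }
}
```

Printing the runtime class names is a quick way to check which variant was actually instantiated.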

What about code cache?

The code cache holds optimized (JIT-compiled) assembly code
It should be large enough to hold it
If you need more perm gen, you may need more code cache
-XX:ReservedCodeCacheSize=<size>
Monitor it via JMX
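Monitoring via JMX can be sketched with the standard memory pool MXBeans. The object name `CodeCachePools` is hypothetical, and pool names differ between JVM versions (for example "Code Cache" on older HotSpot builds and several "CodeHeap ..." pools on newer ones), so the filter below matches loosely:

```scala
import java.lang.management.ManagementFactory

object CodeCachePools {
  // (pool name, used bytes, max bytes); -1 where the JVM reports no value
  def allPools(): Seq[(String, Long, Long)] = {
    val beans = ManagementFactory.getMemoryPoolMXBeans
    (0 until beans.size).map { i =>
      val pool = beans.get(i)
      val usage = Option(pool.getUsage) // null if the pool is no longer valid
      (pool.getName, usage.map(_.getUsed).getOrElse(-1L), usage.map(_.getMax).getOrElse(-1L))
    }
  }

  // Only the pools whose name mentions "code"
  def codePools(): Seq[(String, Long, Long)] =
    allPools().filter(_._1.toLowerCase.contains("code"))

  def main(args: Array[String]): Unit =
    codePools().foreach { case (name, used, max) => println(s"$name used=$used max=$max") }
}
```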

Production

@specialized Nightmare

class SpecializeNightmare {
  trait S1[@specialized A, @specialized B] {
    def f(p1: A): Unit
  }
}

Generates 165 classes

Don’t try with 3,4,5

OutOfMemory Perm Gen Space

Congrats, you have a perm gen OOM
-XX:MaxPermSize=1024m
(or -J-XX:MaxPermSize=1024m if you use the Scala command line)

Production

Oh dear! Oh dear! I shall be too late!

-optimise

A scalac command-line parameter
Performs optimizations of bytecode
  Inlining
  Boxing/unboxing elimination
  etc.
Improves performance
Slower compilation

Production

Inlining

Scala uses the information it has at compile time to know which methods can be inlined
It can do a better job than the JVM
Automatic when you use -optimise

Production

Inlining Visibility

At the Scala compiler level
Add -Ylog:inline to see what was inlined

scalac -optimise -Ylog:inline -d ../bin com/performizeit/scalapeno/demos/ClosureExampleInline.scala |& grep inlined

[log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline.<init> // 1 inlined: com.performizeit.scalapeno.demos.ClosureExampleInline.delayedInit
[log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline$$anonfun$f$1.apply // 1 inlined: com.performizeit.scalapeno.demos.anonfun$f$1.apply$mcII$sp

Inlining Visibility JVM

JIT compiler options
Not recommended for production

  -XX:+PrintCompilation
  -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining

! Prod

@inline

You may direct the compiler to inline a method
Usually you will not need it: the compiler will do it anyway
Or the JVM will do it anyway
No real need to clutter the code....

@inline final def f = (x: Int) => x*x

Member accessors

Get/Set
  Getters for val fields
  Getters and setters for var fields
Will you pay for this?

Nope!
The JVM inlines accessor methods (by default)
If you insist on the penalty: -XX:-UseFastAccessorMethods
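The generated accessors are easy to see with plain Java reflection. In this sketch the class `Point` and the object `AccessorSketch` are made up for illustration:

```scala
class Point(val x: Int, var y: Int)

object AccessorSketch {
  // Names of the methods scalac generated for Point
  def methodNames: Set[String] =
    classOf[Point].getDeclaredMethods.map(_.getName).toSet

  def main(args: Array[String]): Unit =
    methodNames.toSeq.sorted.foreach(println)
}
```

The val gets a getter named `x`; the var gets a getter `y` plus a setter whose JVM name is `y_$eq`, the encoded form of `y_=`.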

Parallel Collections

ParArray
ParVector
mutable.ParHashMap
mutable.ParHashSet
immutable.ParHashMap
immutable.ParHashSet
ParRange
ParTrieMap

Parallel Collections

Apply only where a location is a proven hotspot
Very easy to use; behind the scenes: the Fork/Join framework (Java 6)
Dangerous when the code:
  has side effects
  is non-associative
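Why non-associative operations are dangerous can be shown without any parallel machinery. The sketch below (all names are illustrative) imitates what a parallel reduction does: each chunk is reduced on its own and the partial results are then combined:

```scala
object AssociativitySketch {
  val xs: Vector[Int] = (1 to 8).toVector

  // Strict left-to-right reduction, as a sequential collection would do it
  def sequential(op: (Int, Int) => Int): Int = xs.reduceLeft(op)

  // Imitates a parallel run: reduce each chunk independently, then combine the partials
  def chunked(op: (Int, Int) => Int, chunkSize: Int): Int =
    xs.grouped(chunkSize).map(_.reduceLeft(op)).reduceLeft(op)

  def main(args: Array[String]): Unit = {
    println(s"+ sequential=${sequential(_ + _)} chunked=${chunked(_ + _, 4)}")
    println(s"- sequential=${sequential(_ - _)} chunked=${chunked(_ - _, 4)}")
  }
}
```

Addition gives the same result either way, while subtraction depends on how the elements were grouped, which is exactly what a parallel collection does not guarantee.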

Easy to use

val v = Vector(Range(0,10000000)).flatten
v.par.map(_ + 1)

Only when proven to improve

Profiler - JVisualVM

Part of the JDK
A profiler
Use when you want to:
  Identify hotspots
  Analyze memory allocation bottlenecks

Alternatives
  Yourkit (Commercial)
  JProbe (Commercial)
  JProfiler (Commercial)

Sampling vs Instrumentation

Sampling: sample application threads and stack traces to get statistics
Instrumentation: modify bytecode to record times and invocation counts
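The sampling idea fits in a few lines. This is a toy sketch only (the object `TinySampler` and its parameters are hypothetical, and real profilers are far more careful about overhead and bias): it periodically grabs all stack traces and counts which method is on top:

```scala
import scala.collection.mutable

object TinySampler {
  // Takes `rounds` samples, `intervalMs` apart, counting the innermost frame of every live thread
  def sample(rounds: Int, intervalMs: Long): Map[String, Int] = {
    val hits = mutable.Map.empty[String, Int].withDefaultValue(0)
    for (_ <- 1 to rounds) {
      val traces = Thread.getAllStackTraces.values.iterator
      while (traces.hasNext) {
        val frames = traces.next()
        if (frames.nonEmpty) {
          val top = s"${frames(0).getClassName}.${frames(0).getMethodName}"
          hits(top) += 1
        }
      }
      Thread.sleep(intervalMs)
    }
    hits.toMap
  }

  def main(args: Array[String]): Unit =
    sample(20, 10).toSeq.sortBy(-_._2).take(5).foreach(println)
}
```

Methods that show up most often are the candidate hotspots. Instrumentation instead gives exact invocation counts, at the cost of modifying the bytecode being measured.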

Scala Stacks revisited

while (true) {
  var a = List(Range(0,1000)).flatten
  // println(a)
  for (i <- 1 to 10) {
    a = a :+ i
    println(a.last)
  }
}

Garbage Collection

Immutability

Immutability may cause more object allocation
Not necessarily a performance hit
  Short-lived objects
  GC handles them efficiently
  Escape analysis

Parallelization!!!

VisualVM (allocation hotspots)

Find locations where:
  large amounts of bytes are being allocated
  large numbers of objects are being allocated

Large (im)mutable state

You have a huge graph which changes gradually
Eventually it ends up in the Old Generation
A small change may have a huge impact on the state
That may screw up GC

GC Visibility

GC can be visualized partially through JMX
The best way to get the whole picture is through GC logs

  -Xloggc:<log file name>
  -XX:+PrintGCDetails -XX:+PrintGCDateStamps

Java 7 supports a “rolling appender”:
  -XX:+UseGCLogFileRotation
  -XX:NumberOfGCLogFiles=<#files>
  -XX:GCLogFileSize=<number>M
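The partial JMX view mentioned here amounts to a pair of counters per collector. As a sketch (the object name `GcCounters` is illustrative), they can be read through the standard `GarbageCollectorMXBean`:

```scala
import java.lang.management.ManagementFactory

object GcCounters {
  // (collector name, collections so far, accumulated collection time in ms)
  def read(): Seq[(String, Long, Long)] = {
    val beans = ManagementFactory.getGarbageCollectorMXBeans
    (0 until beans.size).map { i =>
      val gc = beans.get(i)
      (gc.getName, gc.getCollectionCount, gc.getCollectionTime)
    }
  }

  def main(args: Array[String]): Unit =
    read().foreach { case (name, count, millis) => println(s"$name: $count collections, $millis ms") }
}
```

These are cumulative totals only; individual pause times and heap occupancy before and after each collection come from the GC logs.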

Prod

GCViewer

Analyzes GC logs
Use when:
  You experience GC problems
  Is GC efficient? (throughput)
  Does GC stop the application? (pause time)

Alternatives
  Censum (Commercial)

And They Lived Happily Ever After

slides /: (_ + _)

Don’t be afraid of Scala
You will be able to optimize large-scale apps
Optimize where needed
You need to (Java =>) Scala yourself
ATM: know Java to optimize Scala

Q&A

The End
