JVM Memory Model
The examples on GitHub: https://github.com/yoavaa/jvm-memory-model


Page 2: Jvm memory model

JIT

Page 3: Jvm memory model

Anomalies
• How long does it take to count to 100?
• How long does it take to append to a list? To sort a list?
• How long does it take to append to a vector? To sort a vector?
Code: Com.wix.JIT

Page 4: Jvm memory model

Dynamic vs Static Compilation
• Static Compilation
– “ahead-of-time” (AOT) compilation
– Source code -> Native executable
– Compiles before executing
• Dynamic compiler (JIT)
– “just-in-time” (JIT) compilation
– Source -> bytecode -> interpreter -> JITed
– Most of the compilation happens during execution

Page 5: Jvm memory model

JIT Compilation
• Aggressive optimistic optimizations
– Through extensive usage of profiling info
– Limited budget (CPU, Memory)
– Startup speed may suffer
• The JIT
– Compiles bytecode when needed
– Maybe immediately before execution?
– Maybe never?

Page 6: Jvm memory model

JVM JIT Compilation
• Eventually JITs bytecode
– Based on profiling
– After 10,000 cycles, again after 20,000 cycles
• Profiling allows focused code-gen
• Profiling allows better code-gen
– Inline what’s hot
– Loop unrolling, range-check elimination, etc.
– Branch prediction, spill-code-gen, scheduling

Page 7: Jvm memory model

JVM JIT Compilation
• JVM applications operate in mixed mode
• Interpreted
– Bytecode-walking
– Artificial stack machine
• Compiled
– Direct native operations
– Native register machine

Page 8: Jvm memory model

JVM application utilization

Page 9: Jvm memory model

Optimizations in the HotSpot JVM

Page 10: Jvm memory model

Inlining
    // before inlining
    int addAll(int max) {
        int accum = 0;
        for (int i = 0; i < max; i++) {
            accum = add(accum, i);
        }
        return accum;
    }

    int add(int a, int b) {
        return a + b;
    }

    // after inlining add() into the loop
    int addAll(int max) {
        int accum = 0;
        for (int i = 0; i < max; i++) {
            accum = accum + i;
        }
        return accum;
    }

Page 11: Jvm memory model

Loop unrolling
    // original loop
    public void foo(int[] arr, int a) {
        for (int i = 0; i < arr.length; i++) {
            arr[i] += a;
        }
    }

    // unrolled four times, with a second loop for the remaining elements
    public void foo(int[] arr, int a) {
        int limit = arr.length / 4;
        for (int i = 0; i < limit; i++) {
            arr[4*i] += a;
            arr[4*i+1] += a;
            arr[4*i+2] += a;
            arr[4*i+3] += a;
        }
        for (int i = limit*4; i < arr.length; i++) {
            arr[i] += a;
        }
    }

Page 12: Jvm memory model

Escape Analysis
    public int m1() {
        Pair p = new Pair(1, 2);
        return m2(p);
    }
    public int m2(Pair p) {
        return p.first + m3(p);
    }
    public int m3(Pair p) {
        return p.second;
    }

    // after deep inlining
    public int m1() {
        Pair p = new Pair(1, 2);
        return p.first + p.second;
    }

    // optimized version
    public int m1() {
        return 3;
    }

Page 13: Jvm memory model

Monitoring JIT
• Info about compiled methods
– -XX:+PrintCompilation
• Info about inlining
– -XX:+PrintInlining
– Also requires -XX:+UnlockDiagnosticVMOptions
• Print the assembly code
– -XX:+PrintAssembly
– Also requires -XX:+UnlockDiagnosticVMOptions
– On Mac OS requires adding hsdis-amd64.dylib to the LD_LIBRARY_PATH environment variable.

Page 14: Jvm memory model

Challenge

• Rerun the benchmarks, this time using
1. -XX:+PrintCompilation
2. -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
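For readers without the repository at hand, a small stand-in benchmark is sketched below. The class name `JitWarmup` and the batch sizes are assumptions made for illustration and are not taken from the slides' code. It repeatedly times the `addAll` loop from the inlining slide, so later rounds may run faster once the method has been compiled.

```java
// Hypothetical stand-in benchmark; JitWarmup is not part of the slides' repository.
final class JitWarmup {
    static int add(int a, int b) {
        return a + b;
    }

    // Same shape as the addAll example on the inlining slide.
    static int addAll(int max) {
        int accum = 0;
        for (int i = 0; i < max; i++) {
            accum = add(accum, i);
        }
        return accum;
    }

    // Runs `calls` invocations of addAll(max) and returns the elapsed nanoseconds.
    static long timeBatch(int calls, int max) {
        long start = System.nanoTime();
        long sink = 0;
        for (int c = 0; c < calls; c++) {
            sink += addAll(max);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.println(sink); // uses the result so the loop is not dead code
        }
        return elapsed;
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 10; round++) {
            System.out.printf("round %d: %d ns%n", round, timeBatch(20_000, 100));
        }
    }
}
```

Running it as `java -XX:+PrintCompilation JitWarmup` should list `addAll` and `add` among the compiled methods, though the exact output depends on the JVM version.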

Page 15: Jvm memory model

JVM Memory
The Java Memory Model

Page 16: Jvm memory model

Java Memory Model
• The Java Memory Model (JMM) describes how threads in the Java (and Scala) programming language interact through memory.
• Provides sequential consistency for data-race-free programs.

Page 17: Jvm memory model

Instruction Reordering
• Program Order
    int a = 1;
    int b = 2;
    int c = 3;
    int d = 4;
    int e = a + b;
    int f = c - d;
• Execution Order
    int d = 4;
    int c = 3;
    int f = c - d;
    int b = 2;
    int a = 1;
    int e = a + b;

Page 18: Jvm memory model

Anomaly
• Two threads running, starting from x=y=0
    Thread 1    Thread 2
    x=1         y=1
    j=y         i=x
• What will be the result?
    i=1, j=1
    i=0, j=1
    i=1, j=0
    i=0, j=0

Page 19: Jvm memory model

Let’s Check
• Let’s build the scenario
    val t1 = new Thread(new Runnable {
      def run(): Unit = {
        // sleep a little to add some uncertainty
        Thread.sleep(1)
        x = 1
        j = y
      }
    })
• Then run it a few times
• Do we see the anomaly?
Code: Com.wix.MemoryModelOrdering
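A Java sketch of the same experiment follows. The class name `ReorderingCheck` and the iteration count are assumptions for illustration, and the repository's Scala code may be organized differently. It runs the two-thread scenario many times and tallies each (i, j) outcome. Whether the surprising `i=0, j=0` result shows up depends on the hardware, the JIT, and timing, so no particular distribution is promised.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical harness; the fields are deliberately NOT volatile.
final class ReorderingCheck {
    static int x, y, i, j;

    // Runs the two-thread scenario `iterations` times and counts each outcome.
    static Map<String, Integer> run(int iterations) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int n = 0; n < iterations; n++) {
            x = 0; y = 0; i = 0; j = 0;
            Thread t1 = new Thread(() -> { x = 1; j = y; });
            Thread t2 = new Thread(() -> { y = 1; i = x; });
            t1.start();
            t2.start();
            try {
                // join() makes the threads' writes visible to this thread
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            counts.merge("i=" + i + ", j=" + j, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        run(20_000).forEach((outcome, count) -> System.out.println(outcome + " -> " + count));
    }
}
```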

Page 20: Jvm memory model

Happens Before Ordering
• Defines constraints on instruction reordering
• Program order: within a single thread, each action happens-before the actions that follow it
• A write to a volatile field happens-before every subsequent read of that field
– For a non-volatile field, this is not necessarily the case!
• A monitor release happens-before a matching (later) monitor acquire
• Happens Before ordering is transitive
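The volatile rule combined with transitivity can be sketched in Java as follows. The class and field names are hypothetical. The writer sets a plain field and then a volatile flag. Because the plain write precedes the volatile write in program order, and the volatile write happens-before the read that observes it, the reader is guaranteed to see the plain value as well.

```java
// Hypothetical example of publishing a plain field through a volatile flag.
final class SafePublication {
    static int data;               // plain (non-volatile) field
    static volatile boolean ready; // volatile flag that publishes `data`

    static int publishAndRead() {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile write becomes visible
            }
            seen[0] = data; // guaranteed to be 42 by happens-before transitivity
        });
        Thread writer = new Thread(() -> {
            data = 42;    // ordinary write
            ready = true; // volatile write, ordered after the write to data
        });
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());
    }
}
```

If `ready` were not volatile, the reader could spin forever or observe a stale `data`.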

Page 21: Jvm memory model

Anomaly
• Let’s see how far we can count in 100 milliseconds
    var running = true
• Let thread 1 count
    var count = 0
    while (running) count = count + 1
    println(count)
• Let thread 2 signal thread 1 to stop
    Thread.sleep(100)
    running = false
    println("thread 2 set running to false")
Code: Com.wix.Visability
Tools: jps, jstack

Page 22: Jvm memory model

Volatile
• Compilers can reorder instructions
• Compilers can keep values in registers
• Processors can reorder instructions
• Values may be in different caching levels and not synced to main memory
• JMM is designed for aggressive optimizations

Page 23: Jvm memory model

Volatile
• Modern processor caches (four cores, approximate access latency)
    Core 1 to Core 4 (registers)    < 1 ns
    L1 (one per core)               ~1 ns (3-4 cycles)
    L2 (one per core)               ~3 ns (10-12 cycles)
    L3 (shared between cores)       ~15 ns (40-45 cycles)
    Main Memory (DRAM)              ~65 ns

Page 24: Jvm memory model

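Volatile
• Volatile instructs the compiler and processor to make every write visible to other threads and every read see the latest write
– Behaves as if each access went to main memory: the value is not kept in a register, and the hardware keeps the caches coherent
• Volatile reads / writes cannot be reordered with respect to each other
• Volatile longs and doubles are atomic
– long and double are 64 bits wide; the JVM is allowed to split a non-volatile 64-bit read or write into two 32-bit operations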

Page 25: Jvm memory model

Resolve the Anomaly
• Let’s see how far we can count in 100 milliseconds
    @volatile var running = true
• Let thread 1 count
    var count = 0
    while (running) count = count + 1
    println(count)
• Let thread 2 signal thread 1 to stop
    Thread.sleep(100)
    running = false
    println("thread 2 set running to false")
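The Java counterpart of `@volatile` is the `volatile` keyword. The sketch below uses a hypothetical class `StopFlag` to wrap the same experiment in a method, so the counting thread is guaranteed to notice the flag change and terminate.

```java
// Hypothetical Java version of the counting experiment with a volatile stop flag.
final class StopFlag {
    static volatile boolean running;

    // Lets a worker thread count for roughly `millis` milliseconds and returns the count.
    static long countFor(long millis) {
        running = true;
        long[] count = new long[1];
        Thread counter = new Thread(() -> {
            long c = 0;
            while (running) { // volatile read on every iteration
                c++;
            }
            count[0] = c;
        });
        counter.start();
        try {
            try {
                Thread.sleep(millis);
            } finally {
                running = false; // volatile write: the counter thread will see it
            }
            counter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countFor(100));
    }
}
```

Removing `volatile` from `running` may reproduce the anomaly: the JIT is then allowed to hoist the read out of the loop, and the counter thread may never stop.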

Page 26: Jvm memory model

Anomaly
• Let’s count to 10,000
• But let’s use 10 threads, each adding 1,000 to our count
    var count = 0
• Each of the 10 threads does
    for (i <- 1 to 1000) count = count + 1
• What did we get?
Code: Com.wix.Sync101, counter, volatile

Page 27: Jvm memory model

Synchronization
• Let’s have another look at the assignment
    count = count + 1
• Is this a single instruction?
• javap
– javap <class> - Print the class signature
– javap -c <class> - Print the class bytecode

Page 28: Jvm memory model

Synchronization
• The bytecode for count = count + 1
    14: getfield #38  // Field scala/runtime/IntRef.elem:I
    17: iconst_1
    18: iadd
    19: putfield #38  // Field scala/runtime/IntRef.elem:I

Page 29: Jvm memory model

Synchronization
• The bytecode for count = count + 1
    // Read the current counter value from the field referenced by
    // constant-pool entry #38 and push it onto the stack
    14: getfield #38  // Field scala/runtime/IntRef.elem:I
    // Push the constant 1 onto the stack
    17: iconst_1
    // Pop the top two stack elements, add them as integers,
    // and push the result onto the stack
    18: iadd
    // Set the field referenced by #38 to the top element of the stack,
    // assuming it is an integer
    19: putfield #38  // Field scala/runtime/IntRef.elem:I

Page 30: Jvm memory model

Synchronization Tools
[Diagram: Actions by thread 1 -> Thread 1 “release” monitor -> (happens-before) -> Thread 2 “acquire” monitor -> Actions by thread 2]

Page 31: Jvm memory model

Synchronization Tools
• Synchronization tools allow grouping instructions as if they were “one atomic instruction”
– Only one thread can execute the code at a time
• Some tools
– synchronized
– ReentrantLock
– CountDownLatch
– Semaphore
– ReentrantReadWriteLock

Page 32: Jvm memory model

Synchronization Tools
• Simplest tool – synchronized
    // for each thread
    for (i <- 1 to 1000)
      synchronized {
        count = count + 1
      }
• Locks on the monitor of ‘this’
Code: Com.wix.Sync101, lock counter - synchronized

Page 33: Jvm memory model

Synchronization Tools
• Using ReentrantLock
    // before the threads
    val lock = new ReentrantLock()
    // for each thread
    for (i <- 1 to 1000) {
      lock.lock()
      try {
        count = count + 1
      } finally {
        lock.unlock()
      }
    }
Code: Com.wix.Sync101, lock counter – re-entrant lock
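For comparison, here is a Java sketch of both locked counters. The class and helper names are hypothetical, and the repository's Scala code may differ. With either lock, 10 threads each adding 1,000 always arrive at 10,000.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical Java rendering of the two locked counters.
final class LockedCounters {
    static final int THREADS = 10;
    static final int INCREMENTS = 1000;

    // Runs `body` INCREMENTS times on each of THREADS threads and waits for all of them.
    static void runOnThreads(Runnable body) {
        Thread[] threads = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            threads[t] = new Thread(() -> {
                for (int n = 0; n < INCREMENTS; n++) {
                    body.run();
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    static int withSynchronized() {
        int[] count = {0};
        Object monitor = new Object();
        runOnThreads(() -> {
            synchronized (monitor) { // acquire and release the monitor per increment
                count[0]++;
            }
        });
        return count[0];
    }

    static int withReentrantLock() {
        int[] count = {0};
        ReentrantLock lock = new ReentrantLock();
        runOnThreads(() -> {
            lock.lock();
            try {
                count[0]++;
            } finally {
                lock.unlock(); // always release, even if the body throws
            }
        });
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(withSynchronized());
        System.out.println(withReentrantLock());
    }
}
```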

Page 34: Jvm memory model

Atomic Operations
• Containers for simple values or references with atomic operations
• getAndIncrement
• getAndDecrement
• getAndAdd

Page 35: Jvm memory model

Atomic Operations

• All are based on compareAndSwap
– From the sun.misc.Unsafe class
– Used to implement spin-locks

Page 36: Jvm memory model

Atomic Operations
• Spin Lock
    public final int getAndIncrement() {
        for (;;) {
            int current = get();
            int next = current + 1;
            // retry until no other thread changed the value in between
            if (compareAndSet(current, next))
                return current;
        }
    }
    public final boolean compareAndSet(int expect, int update) {
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }
Code: Com.wix.Sync101, atomic counter
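The same retry loop can be written against the public `java.util.concurrent.atomic.AtomicInteger` API, without touching `Unsafe`. The sketch below uses a hypothetical class `AtomicCounter` and applies it to the 10-thread counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter built on AtomicInteger and a compare-and-set retry loop.
final class AtomicCounter {
    // Increments `value` atomically and returns the previous value.
    static int casIncrement(AtomicInteger value) {
        for (;;) {
            int current = value.get();
            if (value.compareAndSet(current, current + 1)) {
                return current; // no other thread interfered between get and set
            }
            // another thread won the race: loop and try again with the fresh value
        }
    }

    static int countWithThreads(int threads, int incrementsPerThread) {
        AtomicInteger count = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    casIncrement(count);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(10, 1000));
    }
}
```

In application code, `count.getAndIncrement()` does the same job. The explicit loop is spelled out here only to mirror the slide.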

Page 37: Jvm memory model

References
• The examples on GitHub: https://github.com/yoavaa/jvm-memory-model

Page 38: Jvm memory model

JVM Memory
Memory allocation of a JVM process

Page 39: Jvm memory model

Java Memory
• Java runs as a single process
• Each process allocates memory
– Process Heap
• JVM creates a Java Heap
– Part of the process Heap
[Diagram: OS Memory (RAM) contains the Process Heap, which contains the Java Object Heap and “Everything else…”]

Page 40: Jvm memory model

Java Process Heap
• On a 32bit Java
– Process heap limited to ~2GB
• If 2GB is the max for a process
– Setting the Java heap to 1800MB (using -Xmx1800m -Xms1800m) is not a good idea
– Leaves little room for anything else
• On a 64bit Java, this is not an issue

Page 41: Jvm memory model

Java Object Heap
• Stores Java Objects
– Instances of classes, primitives and references
• Pre-allocated large blocks of memory
– No fragmentation
– Allocation of small blocks of memory is very fast
• NullPointerException vs. General Access Fault
– An NPE is a runtime exception
– A GAF crashes the process

Page 42: Jvm memory model

Java Object Heap
• Tuning the Java Heap
– Only controls the Object Heap, not the Process Heap
• -Xmx – specifies the maximum size of the heap
• -Xms – specifies the initial size of the heap
• -XX:MinHeapFreeRatio – when to grow the heap
– Defaults to 40%: if less than 40% of the heap is free after a GC, the heap is expanded
• -XX:MaxHeapFreeRatio – when to shrink the heap
– Defaults to 70%: if more than 70% of the heap is free after a GC, the heap is shrunk and memory can be released to the OS
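The effect of these flags can be observed from inside the process with the standard `Runtime` API. The sketch below uses a hypothetical class `HeapSizes` to print the maximum heap (roughly what `-Xmx` sets), the currently committed heap, and the free part of the committed heap.

```java
// Hypothetical helper that reports the object-heap sizes seen by the running JVM.
final class HeapSizes {
    // Returns { max heap, committed heap, free part of the committed heap } in bytes.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mb = 1024 * 1024;
        System.out.println("max heap       : " + s[0] / mb + " MB");
        System.out.println("committed heap : " + s[1] / mb + " MB");
        System.out.println("free (of committed): " + s[2] / mb + " MB");
    }
}
```

Running it once with `-Xms64m -Xmx64m` and once with `-Xms16m -Xmx512m` should show different committed and maximum sizes. The exact numbers depend on the collector in use.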

Page 43: Jvm memory model

Classic Memory Leak in C
• User does the memory management
    #include <stdlib.h>
    #include <string.h>

    void service(int n, char** names) {
        for (int i = 0; i < n; i++) {
            size_t len = strlen(names[i]) + 1;  /* include the terminating NUL */
            char* buf = (char*) malloc(len);
            if (buf != NULL)
                memcpy(buf, names[i], len);
        }
        // memory leaked here: buf is never freed
    }
• User is responsible for calling free()
• User is vulnerable to
– Dangling pointers
– Double frees

Page 44: Jvm memory model

Garbage Collection
• Find and reclaim unreachable objects
• Not reachable from the application roots
– thread stacks, static fields, registers
• Traces the heap starting at the roots. Anything not visited is unreachable and garbage collected
• 80-98% of newly allocated objects are extremely short lived. With Scala, the ratio of short lived objects is even larger

Page 45: Jvm memory model

Garbage Collection
Available Collectors (algorithms)
• Serial Collector
• Parallel Collector
• Parallel Compacting Collector
• Concurrent Mark Sweep Collector
• G1 Collector
• Which one is the default on your machine?
    java -XX:+PrintCommandLineFlags -version
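The active collectors can also be queried from inside a running JVM through the standard `java.lang.management` API. The sketch below uses a hypothetical class `ActiveCollectors`. The names it prints differ between JVM versions and collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the garbage collectors of the running JVM.
final class ActiveCollectors {
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```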

Page 46: Jvm memory model

Memory Generations

• Applies to all collectors except G1, which divides the heap into regions instead
• All new objects are created in the Young Generation, in the Eden space
• Moved to the Old Generation if they survive one or more minor GCs
• Survivor Spaces – 2 of them, used during the GC algorithm
• PermGen holds the class files (the bytecode)
[Diagram: Java Object Heap containing the Young Generation (Eden Space, Survivor Spaces), the Tenured (Old) Generation, and PermGen]
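The memory areas of a running JVM can be listed with the standard `MemoryPoolMXBean` API. The sketch below uses a hypothetical class `MemoryPools`. Pool names depend on the collector and JVM version, so the output will not necessarily use the names Eden, Survivor, Tenured and PermGen.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that describes the memory pools of the running JVM.
final class MemoryPools {
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKb = usage == null ? 0 : usage.getUsed() / 1024;
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(pool.getName() + " (" + kind + "): " + usedKb + " KB used");
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```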

Page 47: Jvm memory model

Types of Collectors
• The G1 collector does not use contiguous generations
– Heap divided into ~2000 regions
– Each region is young, old, or unused
– Objects move between regions during collection
[Diagram: the contiguous Young Generation and Tenured (Old) Generation layout next to the G1 heap, drawn as a grid of regions labeled young, old, or unused]

Page 48: Jvm memory model

Everything else
• Code Generation
• Socket Buffers
• Thread Stacks
• Direct Memory Space
• JNI Code
• Garbage Collection
• JNI Allocated Memory
[Diagram: OS Memory (RAM) contains the Process Heap, which contains the Java Object Heap and “Everything else…”]

Page 49: Jvm memory model

Thread Stack
• Each thread has a separate memory space called “thread stack”
• Configured by -Xss
• Default value depends on OS / JVM
– Defaults around 1M - 2M
• As the number of threads increases, the memory usage increases
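Besides `-Xss`, a stack size can be requested per thread through the four-argument `Thread` constructor. The JVM treats this value as a hint and may round or ignore it. The sketch below uses a hypothetical class `StackDepth` to measure how many recursive calls fit before a `StackOverflowError`.

```java
// Hypothetical probe that measures recursion depth for a requested stack size.
final class StackDepth {
    private static void recurse(int[] depth) {
        depth[0]++;
        recurse(depth); // unbounded recursion, ends with StackOverflowError
    }

    // Returns the number of frames reached on a thread created with the given stack size hint.
    static int maxDepth(long stackSizeBytes) {
        int[] depth = {0};
        Thread probe = new Thread(null, () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError expected) {
                // depth[0] now holds the number of calls that fit on this stack
            }
        }, "stack-probe", stackSizeBytes);
        probe.start();
        try {
            probe.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return depth[0];
    }

    public static void main(String[] args) {
        System.out.println("256 KB stack: " + maxDepth(256 * 1024) + " frames");
        System.out.println("2 MB stack  : " + maxDepth(2 * 1024 * 1024) + " frames");
    }
}
```

On most platforms the larger stack allows a deeper recursion, but the measured depths vary with the JVM, the platform, and JIT compilation.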

Page 50: Jvm memory model

Monitoring Memory Usage

Using Java command line args
• -verbose:gc – report each GC event
• -Xloggc:file – report each GC event to file
• -XX:+PrintGCDetails – print GC output
• -XX:+PrintGCTimeStamps – print GC with timestamps
• -XX:+HeapDumpOnOutOfMemoryError – create a dump file on out of memory
– The process is suspended while writing the dump file

Page 51: Jvm memory model

Monitoring Memory Usage

Using JDK command line tools
• jps to get the pid of java processes
• jinfo to get information about a running java process – VM flags and system properties
• jmap to take a memory dump
• jhat to view a memory dump
• jstat to view different stats about the JVM

Page 52: Jvm memory model

Monitoring Memory Usage

Using JDK GUI tools
• jconsole
– Monitor a live process
– JMX console
• jvisualvm
– Monitor a live process (more detailed compared to jconsole)
– Take a memory dump
– View a memory dump file
– Profile a process
– Lots of other great stuff

Page 53: Jvm memory model

Questions?