Java performance tuning
A project by Ali Gholami and Saeedeh Davoudi
Dr. Nosratali Ashrafi Payaman
University of Kharazmi
JVM, history, and comparison with C and C++
Overview
Foundation of Java
• The Java language project was started in June 1991 by James Gosling and Mike Sheridan. It was originally designed for interactive television.
• Names the language has had:
1. Oak 2. Green 3. Java
• Sun Microsystems released the first public implementation of Java in 1995; Java 1.0 followed in January 1996.
• Java slogan:
Write Once, Run Anywhere.
• Java 2 was first released as J2SE 1.2 in December 1998. From then on, new versions came in multiple configurations for different platforms, including APIs for applications typically run in server environments and APIs for mobile applications.
• In 2006, for marketing purposes, Sun renamed the new J2 versions Java EE, Java ME, and Java SE.
Five primary goals of Java's design:
1. simple, object-oriented, and familiar
2. robust and secure
3. architecture-neutral and portable
4. high performance
5. interpreted, threaded, and dynamic
Java Virtual Machine
• A Java virtual machine (JVM) is an abstract computing machine that enables a
computer to run a Java program.
What is the Java Virtual Machine, and how do applications run on it?
• An instance of a JVM is an implementation running in a process that executes a
computer program compiled into Java bytecode.
• [Figure: the place of the JVM in the compilation process]
Java Virtual Machine Class Loader
[Figure: structure of the Java Virtual Machine]
• The class loader subsystem is the part of the Java Runtime Environment that dynamically loads Java classes into the Java Virtual Machine.
• Usually classes are only loaded on demand. Because of class loaders, the Java runtime system does not need to know about files and file systems.
• In the Java language, libraries (each consisting of multiple classes) are typically packaged in JAR files.
• In short, the class loader loads the libraries and classes that the JVM needs.
When the JVM is started, three class loaders are used (the locations and class names below apply to Java 8 and earlier):
• Bootstrap class loader: loads the core Java libraries located in the <JAVA_HOME>/jre/lib directory.
• Extensions class loader: loads the code in the extensions directories (<JAVA_HOME>/jre/lib/ext, or any other directory specified by the java.ext.dirs system property). It is implemented by the sun.misc.Launcher$ExtClassLoader class.
• System class loader: loads code found on java.class.path, which maps to the CLASSPATH environment variable. It is implemented by the sun.misc.Launcher$AppClassLoader class.
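The loader hierarchy described above can be inspected from a running program. The following is a minimal sketch, not part of the original slides; the class name ClassLoaderChain and the helper chainOf are hypothetical. It walks from a class's own loader up through its parents. The loader class names it prints depend on the Java version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration: lists a class's loader and all of its parents.
class ClassLoaderChain {

    /** Returns the loader of the given class followed by each parent loader. */
    static List<ClassLoader> chainOf(Class<?> type) {
        List<ClassLoader> chain = new ArrayList<>();
        // getClassLoader() returns null for classes loaded by the bootstrap loader,
        // and getParent() returns null once the bootstrap loader is reached.
        for (ClassLoader cl = type.getClassLoader(); cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        return chain;
    }

    public static void main(String[] args) {
        for (ClassLoader cl : chainOf(ClassLoaderChain.class)) {
            System.out.println(cl.getClass().getName());
        }
        // Core classes such as String come from the bootstrap loader, reported as null.
        System.out.println("String loaded by: " + String.class.getClassLoader());
    }
}
```

Application classes normally report a non-empty chain, while core library classes report an empty one because the bootstrap loader is represented by null.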
Java Virtual Machine Method Area
• Information about loaded types is stored in a logical area of memory called the method area.
• The data in the method area stays in memory as long as the class loader that loaded it is alive.
• The method area stores:
class information (number of fields and methods, superclass name, interface names, version).
the bytecode of methods and constructors.
a runtime constant pool for each loaded class.
• The constant pool is the part of a .class file (and of its in-memory representation) that contains the constants needed to run the code of that class.
Java Virtual Machine Heap
• The heap is a memory area shared among all Java Virtual Machine Threads. It is
created on virtual machine start-up. All class instances and arrays are allocated in
the heap (with the new operator).
• This zone must be managed by a garbage collector to
remove the instances allocated by the developer when they
are not used anymore.
• The heap can be dynamically expanded or contracted and
can have a fixed minimum and maximum size.
• Note: There is a maximum size that the heap can’t exceed. If an allocation would push the heap past this limit, the JVM throws an OutOfMemoryError.
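The heap limits described above can be read at run time through java.lang.Runtime. This is a minimal sketch under the assumption of a standard JVM; the class name HeapInfo and the method snapshot are hypothetical. On HotSpot the minimum and maximum heap sizes are usually set with the -Xms and -Xmx options.

```java
// Hypothetical helper for illustration: reports the current heap figures.
class HeapInfo {

    /** Returns { max heap, currently reserved heap, free part of the reserved heap } in bytes. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mb = 1024L * 1024L;
        System.out.println("max heap:      " + s[0] / mb + " MB");
        System.out.println("reserved heap: " + s[1] / mb + " MB");
        System.out.println("free in heap:  " + s[2] / mb + " MB");
    }
}
```

Running it with different -Xmx values, for example java -Xmx256m HeapInfo, should change the first figure accordingly.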
Java Virtual Machine Stack
• Java stack memory is used for the execution of a thread. It contains method-specific values that are short-lived, and references to objects in the heap that the method refers to.
• [Figure: the JVM stack before and after a function call]
• Each method call, including a constructor call, creates a new stack frame that holds the call's arguments, its local variables, and its references to heap objects. The frame is discarded when the method returns.
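Stack and Heap Example
[Figure: Java runtime memory. Stack memory holds a frame for main() (int i = 1, plus the memory and object references) and a frame for func() (a String argument). Heap memory holds the Memory and Object instances and the String pool.]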
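The split between stack and heap in this example can be observed from ordinary code. The sketch below is an illustration only; the names StackHeapDemo, Counter, and bump are hypothetical. A primitive local lives in the caller's frame and is copied into the callee's frame, whereas an object conceptually lives on the heap and only its reference is copied.

```java
// Hypothetical demo for illustration: stack-held values versus heap-held objects.
class StackHeapDemo {

    /** A small heap-allocated type used only for this illustration. */
    static final class Counter {
        int value;
    }

    // 'n' is a copy stored in bump's own stack frame; the change vanishes on return.
    static void bump(int n) {
        n++;
    }

    // 'c' is a copy of the reference, but it points at the same heap object.
    static void bump(Counter c) {
        c.value++;
    }

    public static void main(String[] args) {
        int i = 1;                        // primitive local in main's frame
        Counter counter = new Counter();  // reference in the frame, object on the heap
        bump(i);
        bump(counter);
        System.out.println(i);             // prints 1 (unchanged)
        System.out.println(counter.value); // prints 1 (was 0 before the call)
    }
}
```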
Java Virtual Machine PC Register / Native Stack
• The PC (Program Counter) register contains the address of the instruction currently being executed in its associated thread. The PC register is a very small data area with a fixed size. Java applications have no influence on its content or size.
• The native method stack stores data elements similar to those of a JVM stack and is used to execute native (non-Java) methods. It only comes into play when native code is integrated into a Java application.
Java Virtual Machine Execution Engine
• At the core of any Java virtual machine implementation is its execution engine.
• The behavior of the execution engine is defined in terms of an instruction set. For each instruction, the specification describes in detail what an implementation should do when it encounters the instruction as it executes bytecodes.
• Each thread of a running Java application is a distinct instance of the virtual machine's execution engine. From the beginning of its lifetime to the end, a thread is either executing bytecodes or native methods. A thread may execute bytecodes directly, by interpreting them or executing them natively in silicon, or indirectly, by just-in-time compiling them and executing the resulting native code.
Java Virtual Machine JNI
• The Java Native Interface (JNI) is a native programming interface that is part of the
Java Software Development Kit (SDK).
• JNI is a programming framework that enables
Java code running in a Java Virtual Machine
(JVM) to call and be called by native
applications (programs specific to a hardware
and operating system platform) and libraries
written in other languages such as C, C++ and
assembly.
Introduction to Java performance
• Java has historically been slower than C and C++ because of how its programs execute: after compilation, a Java application runs on the JVM instead of directly on the processor.
• Java performance has improved since 1997 through the following developments:
1. The introduction of JIT (just-in-time) compilation.
2. Optimizations in the JVM (such as HotSpot becoming the default for Sun's JVM in 2000).
3. Hardware execution of Java bytecode, such as that offered by ARM's Jazelle.
• The performance of Java bytecode (a compiled Java program) depends on:
how well the JVM manages the tasks given to it.
how well the JVM exploits the hardware and features of the computer.
Tuning of Java performance = JVM optimization
Virtual Machine Optimization methods
Split-time bytecode verification
I/O tuning
Memory tuning
Thread contention tuning
CPU usage tuning
Register allocation improvement
Class data sharing
Java Memory Tuning
Java memory performance tuning areas:
Memory footprint
Allocation rate
Garbage collection
Java Memory Tuning: memory footprint tuning
Reasons for getting an OutOfMemoryError:
1. Too much data
2. Fat data representation
Java Memory Tuning: memory footprint tuning (too much data)
• Verbose GC logging is intended as the first tool to use when attempting to diagnose garbage collector problems.
With verbose GC enabled, watch the numbers in the “Full GC” log lines:
[Full GC $before->$after($total), $time secs]
• If you don’t need all of that data in memory at once, you can use an LRU cache or soft references.
One simple but effective caching algorithm is Least Recently Used (LRU). An LRU cache always throws out the data that was least recently used. For example, a cache that can hold up to five pieces of data evicts the least recently used piece when a sixth is added.
The garbage collector will always collect weakly referenced objects, but will only collect softly referenced objects when its algorithms decide that memory is low enough to warrant it.
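The five-entry LRU cache described above can be sketched on top of java.util.LinkedHashMap, which supports access ordering and an eviction hook. This is one possible approach rather than the slides' own implementation; the class name LruCache is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache for illustration, built on LinkedHashMap's access order.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration runs from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(5);
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            cache.put(key, key.length());
        }
        cache.get("a");     // touch "a", so "b" becomes the least recently used entry
        cache.put("f", 1);  // a sixth entry exceeds the capacity: "b" is evicted
        System.out.println(cache.keySet()); // prints [c, d, e, a, f]
    }
}
```

This class is not thread-safe; a shared cache would need external synchronization or a concurrent design.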
Java Memory Tuning: memory footprint tuning (fat data problem)
• This condition occurs while a massive amount of data is being loaded.
• For example, encoding a 100 GB text file with the Huffman algorithm when the whole file is loaded into a string.
• Compressed object pointers can be used to shrink references and headers:
                 Uncompressed (bytes)   Compressed (bytes)
Pointer          8                      4
Object header    16                     12
Array header     24                     16
Java Memory Tuning: allocation rate tuning
• Allocation rate is the amount of memory allocated per unit of time. It is often expressed in MB/sec.
High allocation rate = performance issues
These issues mostly occur when garbage collection becomes a bottleneck.
Garbage collection is covered later in this document.
Java application CPU usage tuning
• A real problem seen on stackoverflow.com:
There are two Java processes (A, B) on a Linux machine (CentOS 6.5 64bit). A sends lots of binary data to B using sockets. B writes data to disk. 50-100 MB of data are written to disk per second. On a quad core processor, the CPU is nearly 100% used. Previously we ran a similar application written in C, and only 25% of the CPU was used.
• The following methods can reduce an application's CPU usage:
restricting the JVM's memory use through JVM settings.
refactoring the application code.
reducing memory allocation (reuse of objects and …).
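Reusing objects to reduce allocation can be illustrated with a copy loop like the one process B would need. The sketch below is an assumption about how such code might look, not the code from the question; the class name ReusedBufferCopy and the method copy are hypothetical. The caller allocates one buffer and passes it to every copy instead of creating a new array per call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration: copies streams through a caller-supplied buffer.
class ReusedBufferCopy {

    /** Copies everything from in to out using the given non-empty buffer; returns the byte count. */
    static long copy(InputStream in, OutputStream out, byte[] buffer) {
        try {
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[64 * 1024]; // allocated once, reused for every copy
        byte[] payload = new byte[1_000_000];
        for (int i = 0; i < 3; i++) {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            long copied = copy(new ByteArrayInputStream(payload), sink, buffer);
            System.out.println("copied " + copied + " bytes"); // prints copied 1000000 bytes
        }
    }
}
```

In-memory streams stand in for the socket and the file here so that the example runs on its own.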
Thread contention tuning
• Thread contention is a condition where one thread is waiting for a lock or object that is currently held by another thread. The waiting thread cannot use that object until the other thread has released it.
• Ways of reducing thread contention:
Do no expensive calculations while holding a lock.
Employ interlocked/atomic operations.
Use synchronized data structures.
Use read-only data whenever possible.
Avoid object pooling.
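As an illustration of the atomic-operations advice, the sketch below has several threads increment a shared counter through java.util.concurrent.atomic.AtomicLong, so no thread ever waits on a lock. The class name AtomicCounterDemo and the method countWithThreads are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo for illustration: a shared counter updated without locks.
class AtomicCounterDemo {

    /** Starts the given number of threads, each incrementing a shared counter, and returns the total. */
    static long countWithThreads(int threads, int incrementsPerThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet(); // atomic update, no monitor is held
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 100_000)); // prints 400000
    }
}
```

A plain long field incremented with ++ from several threads would lose updates, and guarding it with a synchronized block would reintroduce the contention that the atomic class avoids.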
Java I/O tuning
• Java IO is an API that comes with Java which is targeted at reading and writing data
(input and output). Most applications need to process some input and produce some
output based on that input.
• One thing that hurts Java IO performance is character-by-character IO: calling InputStream.read() or Reader.read() to read one character at a time on a stream that is not buffered.
• It is recommended to use the standard BufferedReader and BufferedInputStream classes, or to use the block-read methods to read larger blocks of data at a time.
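Both recommendations can be sketched briefly. In the example below, countLines wraps the source in a BufferedReader, and countChars uses the block-read method read(char[]) instead of reading single characters. The class and method names are hypothetical, and a StringReader stands in for a file so that the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo for illustration: buffered and block-wise reading.
class BufferedReadDemo {

    /** Counts lines by reading through a BufferedReader. */
    static int countLines(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts characters by reading blocks of up to 8192 characters at a time. */
    static long countChars(Reader source) {
        char[] block = new char[8192];
        try (Reader reader = source) {
            long total = 0;
            int n;
            while ((n = reader.read(block)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "first\nsecond\nthird\n";
        System.out.println(countLines(new StringReader(text))); // prints 3
        System.out.println(countChars(new StringReader(text))); // prints 19
    }
}
```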
Split-time bytecode verification
What is bytecode verification?
• When a class loader presents the bytecodes of a newly loaded Java platform class to
the virtual machine, these bytecodes are first inspected by a verifier. The verifier
checks that the instructions cannot perform actions that are obviously damaging.
• A method named split-time verification, first introduced in the Java Platform, Micro Edition (J2ME), has been used in the JVM since Java version 6. It splits the verification of Java bytecode into two phases:
• Design-time – when compiling a class from source to bytecode
• Runtime – when loading a class.
Garbage Collection tuning
Garbage collection = biggest threat to JVM responsiveness
What are memory pools and GC?
• Memory pools, also called fixed-size block allocation, are pools used for memory management that allow dynamic memory allocation comparable to malloc or C++'s operator new.
• Objects in memory have an important property of temporal persistence: most objects die young, while objects that have already survived for some time tend to stay alive.
• To exploit this principle, we can build what is known as a generational garbage collector. Objects are initially allocated in a chunk of memory called the first generation (Eden), or G1. When G1 becomes full, we copy the live objects into another block of memory called the second generation, or G2, and free up the entire G1.
What is Generational GC?
• All a copying collector does is start from a set of roots (in our case, the operand stack) and traverse all of the reachable memory-allocated objects, copying them from one half of memory into the other half. The area of memory that we copy from is called old space and the area of memory that we copy to is called new space.
• Eden: all new allocation happens in Eden.
• Survivor: when Eden fills up, a stop-the-world copy collection moves the live objects into survivor space.
• After several collections, survivors get tenured into the old generation.
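The Eden, survivor, and old-generation areas can be listed from inside a running JVM through the java.lang.management API. This is a minimal sketch; the class name MemoryPoolList and the method heapPoolNames are hypothetical. The pool names it prints depend on the JVM and on the garbage collector selected, so no fixed output is shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration: lists the heap memory pools of the running JVM.
class MemoryPoolList {

    /** Returns the names of all memory pools that belong to the heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // With a generational collector, the list typically includes
        // an Eden pool, a survivor pool, and an old-generation pool.
        for (String name : heapPoolNames()) {
            System.out.println(name);
        }
    }
}
```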
History of performance tuning
Java version    Performance improvement(s)
JDK 1.1.6       First just-in-time compilation (Symantec's JIT compiler).
J2SE 1.2        Use of a generational collector.
J2SE 1.3        Just-in-time compiling by HotSpot.
Java SE 5.0     Class data sharing.
Java SE 6       Split bytecode verification. Escape analysis and lock coarsening. Register allocation improvements.
Java 7          JVM support for dynamic programming languages. Enhancement of the existing concurrency library to manage parallel computing on multi-core processors. Use of both the client and server JIT compilers in the same session through a method called tiered compiling.
Performance comparison
Implementation of the same app in C++ and in Java: C++ produces faster results.
What makes C/C++ faster than java?
C++: In touch with the processor directly. Write once, compile anywhere (WOCA).
Java: Java apps run on the JVM. Write once, run anywhere/everywhere (WORA/WORE).
C++: Fast and direct pointer access.
Java: Slower pointer access and a different structure.
C++: Objects "on the stack" cost nothing at allocation and destruction, and need no GC working in an independent thread to do the cleaning.
Java: The GC, and the methods it calls for each object to be cleaned, cost time.
C++: No strict relationship between class names and filenames. A header file and an implementation file are used for each class.
Java: A strict relationship is enforced (for example, the source code for class test has to be in test.java); such overheads make it slower in small apps.
Benchmark comparison between Java and C++
[Figures: benchmark charts comparing Java and C++]
Multi-Core performance comparison between Java and C++
• The scalability and performance of Java applications on multi-core systems is limited by the object allocation rate. This effect is sometimes called an "allocation wall".
• Modern garbage collector algorithms use multiple cores to perform garbage collection, which to some degree alleviates this problem.
Memory use performance comparison between Java and C++
• Java memory use is much higher than that of C++ because:
There is an overhead of 8 bytes for each object and 12 bytes for each array in Java.
Parts of the Java Class Library must load before program execution.
The virtual machine uses substantial memory.
Ali Gholami 932171021
Saeedeh Davoudi 932171010