Apache Direct Memory, JUG Lausanne, 08.03.2012

JUG Lausanne, 8 March 2012

Apache Direct Memory

Reducing Heap Memory Stress

The next battle horse for JVM performance tuning

About me

• Benoit Perroud

• Apache Direct Memory Committer

[email protected]

• @killerwhile

• Software craftsman

• BigData Engineer @

Today's Agenda

• Off-Heap Caching

– Java Memory

– Garbage Collector (GC)

– On-heap vs. off-heap caching

• Apache Direct Memory

– Design and principles

– Use cases

• Multi layered cache

• Standalone server “à la memcache”

– Next steps

• Questions

Before starting

• Sorry for my bad English and my poor French

• Interrupt me anytime

• I have nothing to sell. It's just worth sharing

• Please do ask questions

Java Memory

• Automatic memory allocation

• Garbage collector (GC)

Garbage Collector

• Several types of GC

– Serial GC

– Parallel GC (throughput collector)

– Concurrent Mark & Sweep GC (concurrent low pause collector)

– G1 GC (low latency concurrent M&S)

• But all GCs have stop-the-world phases

• Pause duration grows with the size of the heap

• Resulting in application unresponsiveness

– A pain when dealing with tight SLAs
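
To see how much time a JVM has spent collecting, the standard management beans can be queried. This is a minimal sketch, not part of the original slides: the class name GcStats is made up, and getCollectionTime() reports approximate accumulated collection time, which is only a rough proxy for stop-the-world pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not from the talk): reads the JVM's own GC counters.
class GcStats {

    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector: its name, number of collections and time spent.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("Total: " + totalCollectionTimeMillis() + " ms");
    }
}
```

Sampling these counters before and after filling a large on-heap cache gives a first impression of the cost, although GC logs remain the more precise source for individual pauses.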

Cache On-Heap vs. Off-Heap

• On-heap

– Objects tend to be promoted into tenured memory

– GC storm effect when using a refreshing cache

– No overhead (for caching by reference)

• Off-heap

– Object payload no longer affects GC

– Serialization/deserialization overhead

• Fortunately, a lot of work has been done on serialization (Protobuf, Avro, Thrift, msgpack, BSON, ...)
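
The trade-off can be sketched as a round trip through a direct buffer. This is an illustration under assumptions, not code from the talk: OffHeapRoundTrip is a made-up class, and plain java.io serialization stands in for the faster libraries listed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical illustration: the payload lives off-heap, at the price of
// one serialization on store and one deserialization on load.
class OffHeapRoundTrip {

    /** Serializes the value and copies the bytes into a direct (off-heap) buffer. */
    static ByteBuffer store(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] payload = bytes.toByteArray();
        ByteBuffer buffer = ByteBuffer.allocateDirect(payload.length);
        buffer.put(payload);
        buffer.flip(); // make the written bytes readable
        return buffer;
    }

    /** Copies the bytes back on-heap and deserializes them. */
    static Object load(ByteBuffer buffer) {
        byte[] payload = new byte[buffer.remaining()];
        buffer.duplicate().get(payload); // duplicate keeps the stored buffer's position intact
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only the small ByteBuffer handle stays on the heap; the serialized bytes are outside the reach of the collector.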

Apache Direct Memory

Apache Direct Memory is a multi-layered cache implementation featuring off-heap memory storage to enable caching of Java objects without degrading JVM performance.

→ Open source alternative to Terracotta BigMemory.

• Apache Software Foundation Incubator project

• Entered the Incubator in fall 2011

• 12 developers at the moment, 10+ contributors

• I joined on 1 January 2012

– A nice outcome of my hacky Christmas holiday :)

• Disclaimer: under heavy development

– I rewrote most of the memory allocation service

– APIs are subject to change, and bugs are still to be found

Design & Principles

• ByteBuffer.allocateDirect is the foundation of the cache

• ByteBuffers are allocated in big chunks and then split for internal use
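
Splitting a big direct buffer can be sketched with the standard ByteBuffer API alone. The class ChunkSlicer and its parameters are assumptions made for this example and do not reflect the actual Direct Memory code.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: one big off-heap allocation, carved into views.
class ChunkSlicer {

    /** Allocates one direct chunk and returns non-overlapping slices of sliceSize bytes. */
    static List<ByteBuffer> slice(int chunkSize, int sliceSize) {
        ByteBuffer chunk = ByteBuffer.allocateDirect(chunkSize);
        List<ByteBuffer> slices = new ArrayList<>();
        for (int offset = 0; offset + sliceSize <= chunkSize; offset += sliceSize) {
            // Narrow the window to [offset, offset + sliceSize) ...
            chunk.limit(offset + sliceSize);
            chunk.position(offset);
            // ... and take a view on it; slices share the chunk's memory.
            slices.add(chunk.slice());
        }
        return slices;
    }
}
```

Each slice is an independent view with its own position and limit, so writing into one slice never touches its neighbours.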

• Built on layers:

– CachingService

• Serializes objects (pluggable)

– MemoryManagerService

• Computes access statistics

– ByteBufferAllocatorService

• Ultimately deals with the ByteBuffers

ByteBuffers Allocation

Two different allocation strategies

• Merging ByteBuffers allocation

– No memory wasted

– No cost at creation

– Suffers from fragmentation

– Needs synchronization at allocation and deallocation
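
The properties listed above can be illustrated with a small first-fit allocator that merges adjacent free ranges on release. This is a sketch under assumptions: MergingAllocator and its methods are invented for the example and are not the Direct Memory implementation.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a merging allocator over one direct chunk.
class MergingAllocator {
    private final ByteBuffer chunk;
    // Free ranges, keyed by offset, value is the length of the range.
    private final TreeMap<Integer, Integer> free = new TreeMap<>();

    MergingAllocator(int capacity) {
        chunk = ByteBuffer.allocateDirect(capacity);
        free.put(0, capacity); // creation is cheap: a single free range
    }

    /** First fit: returns the offset of a region of exactly size bytes, or -1. */
    synchronized int allocate(int size) {
        for (Map.Entry<Integer, Integer> range : free.entrySet()) {
            int offset = range.getKey();
            int length = range.getValue();
            if (length >= size) {
                free.remove(offset);
                if (length > size) {
                    free.put(offset + size, length - size); // keep the remainder free
                }
                return offset;
            }
        }
        return -1; // possibly enough bytes in total, but no single range fits
    }

    /** Returns a region to the free list and merges it with adjacent free ranges. */
    synchronized void release(int offset, int size) {
        Map.Entry<Integer, Integer> next = free.ceilingEntry(offset);
        if (next != null && offset + size == next.getKey()) {
            size += next.getValue();
            free.remove(next.getKey());
        }
        Map.Entry<Integer, Integer> prev = free.floorEntry(offset);
        if (prev != null && prev.getKey() + prev.getValue() == offset) {
            offset = prev.getKey();
            size += prev.getValue();
        }
        free.put(offset, size);
    }

    /** Returns a view on an allocated region of the chunk. */
    synchronized ByteBuffer view(int offset, int size) {
        ByteBuffer window = chunk.duplicate();
        window.limit(offset + size);
        window.position(offset);
        return window.slice();
    }
}
```

Every region has exactly the requested size, so nothing is wasted, but both methods need the lock and a request can fail on a fragmented chunk even when enough bytes are free in total.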

• Fixed-size ByteBuffers allocation

– Linux kernel SLAB-style allocation

• Select a set of fixed sizes

• Split bigger buffers (1MB+) into slabs of those sizes

– Allocation is really fast and concurrency is good

• All structures are pre-instantiated

– Creation (or increasing the buffer's size) has a cost

• 1GB split into 128-byte slabs means 8M+ buffers created

– Does not suffer from fragmentation

– Wastes memory if the selected sizes do not match the payloads

• Works really well with HDFS, where all blocks are of the same size
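
A single-size version of this strategy can be sketched as follows. SlabAllocator is a made-up class for illustration, not the Direct Memory code; a real allocator would keep one such pool per selected size.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration of slab-style allocation with one slab size.
class SlabAllocator {
    private final ConcurrentLinkedQueue<ByteBuffer> freeSlabs = new ConcurrentLinkedQueue<>();
    private final int slabSize;

    /** Pre-splits one direct chunk into slabs; this is where the creation cost is paid. */
    SlabAllocator(int chunkSize, int slabSize) {
        this.slabSize = slabSize;
        ByteBuffer chunk = ByteBuffer.allocateDirect(chunkSize);
        for (int offset = 0; offset + slabSize <= chunkSize; offset += slabSize) {
            chunk.limit(offset + slabSize);
            chunk.position(offset);
            freeSlabs.add(chunk.slice());
        }
    }

    /** Lock-free allocation: returns a slab limited to payloadSize, or null if none fits. */
    ByteBuffer allocate(int payloadSize) {
        if (payloadSize > slabSize) {
            return null;
        }
        ByteBuffer slab = freeSlabs.poll();
        if (slab != null) {
            slab.clear();
            slab.limit(payloadSize); // the rest of the slab is wasted until release
        }
        return slab;
    }

    void release(ByteBuffer slab) {
        freeSlabs.add(slab);
    }

    int freeCount() {
        return freeSlabs.size();
    }

    /** Number of slabs obtained when splitting totalBytes into slabs of slabBytes. */
    static long slabCount(long totalBytes, long slabBytes) {
        return totalBytes / slabBytes;
    }
}
```

For the figure quoted above, slabCount(1L << 30, 128) gives 8,388,608 slabs, which is why creating or growing such a pool is expensive.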

Use case 1: Multi-layer cache

• Idea: the most used objects are cached on-heap, the rest off-heap, possibly overflowing to disk.

• Sounds like BigMemory.

• See net.sf.ehcache.store.offheap.OffHeapStore

• Currently we inject DM into Ehcache the same way BigMemory does. Ouch ;)

• A comparison still needs to be done
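
The idea of the first two layers can be sketched without Ehcache or Direct Memory. Everything here is an assumption made for illustration: TwoTierCache is a made-up class, values are plain strings, and the disk layer is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a small on-heap LRU layer in front of an off-heap layer.
class TwoTierCache {
    private final Map<String, ByteBuffer> offHeap = new HashMap<>();
    private final LinkedHashMap<String, String> onHeap;

    TwoTierCache(final int onHeapCapacity) {
        // Access-ordered map: the eldest entry is the least recently used one.
        onHeap = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() <= onHeapCapacity) {
                    return false;
                }
                // Demote instead of dropping: move the cold entry off-heap.
                offHeap.put(eldest.getKey(), toDirect(eldest.getValue()));
                return true;
            }
        };
    }

    synchronized void put(String key, String value) {
        offHeap.remove(key);
        onHeap.put(key, value);
    }

    synchronized String get(String key) {
        String hot = onHeap.get(key);
        if (hot != null) {
            return hot;
        }
        ByteBuffer cold = offHeap.remove(key);
        if (cold == null) {
            return null;
        }
        String value = fromDirect(cold);
        onHeap.put(key, value); // promote; may demote another entry
        return value;
    }

    synchronized boolean isOnHeap(String key) {
        return onHeap.containsKey(key);
    }

    synchronized boolean isOffHeap(String key) {
        return offHeap.containsKey(key);
    }

    private static ByteBuffer toDirect(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes.length);
        buffer.put(bytes);
        buffer.flip();
        return buffer;
    }

    private static String fromDirect(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Hot entries are served by reference with no overhead, while cold entries pay for a copy and a decode on their way back to the heap.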

Use case 2: Off-heap output stream

• Idea: read the Twitter firehose stream without filling the precious heap memory

– An OOM will lead to unpredictable behavior elsewhere in the application

• Write from your socket directly off-heap, OutputStream style

– Allocate a fixed-size temporary buffer of your choice

• Read from this stream

– InputAndOutputStream parent class that holds both OutputStream and InputStream instances
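
The pattern can be sketched with standard java.io and java.nio classes. OffHeapStream below is a hypothetical stand-in for the InputAndOutputStream class mentioned above, not its actual implementation, and the 16-byte temporary buffer is an arbitrary choice.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical illustration: one direct buffer exposed as OutputStream and InputStream.
class OffHeapStream {
    private final ByteBuffer buffer;

    OffHeapStream(int capacity) {
        buffer = ByteBuffer.allocateDirect(capacity);
    }

    /** Bytes written here land off-heap; a full buffer is reported as an IOException. */
    OutputStream output() {
        return new OutputStream() {
            @Override
            public void write(int b) throws IOException {
                if (!buffer.hasRemaining()) {
                    throw new IOException("off-heap buffer full");
                }
                buffer.put((byte) b);
            }

            @Override
            public void write(byte[] src, int off, int len) throws IOException {
                if (len > buffer.remaining()) {
                    throw new IOException("off-heap buffer full");
                }
                buffer.put(src, off, len);
            }
        };
    }

    /** Reads what has been written so far, through an independent view. */
    InputStream input() {
        final ByteBuffer view = buffer.duplicate();
        view.flip(); // limit = bytes written, position = 0
        return new InputStream() {
            @Override
            public int read() {
                return view.hasRemaining() ? (view.get() & 0xFF) : -1;
            }

            @Override
            public int read(byte[] dst, int off, int len) {
                if (len == 0) {
                    return 0;
                }
                if (!view.hasRemaining()) {
                    return -1;
                }
                int n = Math.min(len, view.remaining());
                view.get(dst, off, n);
                return n;
            }
        };
    }

    /** Demo: writes data off-heap, then reads it back through a small temporary buffer. */
    static byte[] copyThrough(byte[] data, int capacity) {
        OffHeapStream stream = new OffHeapStream(capacity);
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        try (OutputStream out = stream.output()) {
            out.write(data);
            try (InputStream in = stream.input()) {
                byte[] tmp = new byte[16]; // fixed-size temporary on-heap buffer
                for (int n; (n = in.read(tmp)) != -1; ) {
                    copy.write(tmp, 0, n);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return copy.toByteArray();
    }
}
```

Only the small temporary buffer lives on the heap at any time; the bulk of the stream stays in the direct buffer.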

Use case 3: Standalone cache server

• Idea: replace Memcached :)

– But with a native plain REST API

• DM has all the building blocks to implement such a server, worth trying

• See the server submodule

Next Steps

• JSR 107

• Real Benchmarks

• Builder patterns

• Integration with more libs (Spring, Guice, …)

• Implementations with DM lib (Cassandra (wip), Lucene, Tomcat, …)

• Cache Resizing

• Management and monitoring

• ...

• https://issues.apache.org/jira/browse/DIRECTMEMORY

Questions?

• Thanks for your attention