Apache Direct Memory, JUG Lausanne, 08.03.2012
JUG Lausanne, 8 March 2012
Apache Direct Memory
Reducing Heap Memory Stress
The next battle horse for JVM performance tuning
About me
• Benoit Perroud
• Apache Direct Memory Committer
• @killerwhile
• Software craftsman
• BigData Engineer @
Today's Agenda
• Off-Heap Caching
– Java Memory
– Garbage Collector (GC)
– On-heap vs. off-heap caching
• Apache Direct Memory
– Design and principles
– Use cases
• Multi-layered cache
• Standalone server “à la Memcached”
– Next steps
• Questions
Before starting
• Sorry for my bad English and my poor French
• Interrupt me anytime
• I have nothing to sell. It's just worth sharing
• Please do ask questions
Java Memory
• Automatic memory allocation
• Garbage collector (GC)
Garbage Collector
• Several types of GC
– Serial GC
– Parallel GC (throughput collector)
– Concurrent Mark & Sweep GC (concurrent low pause collector)
– G1 GC (low latency concurrent M&S)
Garbage Collector
• But all GCs have stop-the-world behavior
• Pause time is proportional to the memory size
• Resulting in application unresponsiveness
– A pain when dealing with tight SLAs
Cache On-Heap vs. Off-Heap
• On-heap
– Objects tend to be promoted into tenured memory
– GC storm effect when using a refreshing cache
– No overhead (for caching by reference)
On-Heap vs. Off-Heap
• Off-heap
– Object payload no longer affects the GC
– Serialization/deserialization overhead
• Fortunately, lots of work has been done on serialization (Protobuf, Avro, Thrift, msgpack, BSON, ...)
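To make the off-heap trade-off concrete, here is a minimal sketch that stores one object outside the heap and reads it back. It assumes plain Java serialization as a stand-in for Protobuf, Avro and friends; the class `OffHeapValue` and its methods are illustrative names, not part of Direct Memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only, not a Direct Memory class.
class OffHeapValue {
    private final ByteBuffer buffer; // the payload lives outside the Java heap

    private OffHeapValue(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    // Serialize the object (this is the overhead) and copy the bytes off-heap.
    static OffHeapValue store(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] payload = bytes.toByteArray();
        ByteBuffer direct = ByteBuffer.allocateDirect(payload.length);
        direct.put(payload);
        direct.flip(); // make the written bytes readable
        return new OffHeapValue(direct);
    }

    // Copy the bytes back on-heap and deserialize them.
    Object load() {
        byte[] payload = new byte[buffer.remaining()];
        buffer.duplicate().get(payload); // duplicate() leaves the stored position untouched
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }
}
```

Only the small `OffHeapValue` wrapper stays on-heap; every `load()` pays the deserialization cost that caching by reference avoids.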
Apache Direct Memory
Apache Direct Memory is a multi-layered cache implementation featuring off-heap memory storage to enable caching of Java objects without degrading JVM performance.
→ An open source alternative to Terracotta BigMemory.
Apache Direct Memory
• Apache Software Foundation Incubator project
• Entered the Incubator in fall 2011
• 12 developers at the moment, 10+ contributors
• I joined on 1 January 2012
– The nice outcome of my hacky Christmas holiday :)
• Disclaimer: under heavy development
– I rewrote most of the memory allocation service
– APIs are subject to change, and bugs remain to be found
Design & Principles
• ByteBuffer.allocateDirect is the foundation of the cache
• ByteBuffers are allocated in big chunks and then split for internal use
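The chunk-then-split idea can be sketched with the standard `java.nio` API alone. `ChunkSplitter` is a hypothetical name and the sizes are arbitrary; this is not how Direct Memory itself is organized, only an illustration of slicing one direct buffer into views.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only, not the Direct Memory implementation.
class ChunkSplitter {
    // Allocates one direct buffer of chunkSize bytes and cuts it into sliceSize-byte views.
    static List<ByteBuffer> split(int chunkSize, int sliceSize) {
        ByteBuffer chunk = ByteBuffer.allocateDirect(chunkSize);
        List<ByteBuffer> slices = new ArrayList<>();
        for (int offset = 0; offset + sliceSize <= chunkSize; offset += sliceSize) {
            ByteBuffer view = chunk.duplicate();
            view.position(offset);
            view.limit(offset + sliceSize);
            slices.add(view.slice()); // shares the chunk's memory, nothing is copied
        }
        return slices;
    }
}
```

For example, `ChunkSplitter.split(1024, 128)` yields eight independent 128-byte views backed by a single native allocation.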
Design & Principles
• Built on layers:
– CachingService
• Serializes objects (pluggable)
– MemoryManagerService
• Computes access statistics
– ByteBufferAllocatorService
• Ultimately deals with the ByteBuffers
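The layering can be pictured with a toy version of the three levels. All type and method names below are hypothetical and only echo the layer names on the slide; the real Direct Memory interfaces differ, and values are limited to strings to keep the sketch short.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// All names are hypothetical; they only mirror the layer names on the slide.
interface Serializer {
    byte[] serialize(String value);
    String deserialize(byte[] bytes);
}

class Utf8Serializer implements Serializer {
    public byte[] serialize(String value) { return value.getBytes(StandardCharsets.UTF_8); }
    public String deserialize(byte[] bytes) { return new String(bytes, StandardCharsets.UTF_8); }
}

// Bottom layer: the only place that touches ByteBuffer allocation.
class AllocatorLayer {
    ByteBuffer allocate(int size) { return ByteBuffer.allocateDirect(size); }
}

// Middle layer: stores payloads and keeps access statistics.
class MemoryManagerLayer {
    private final AllocatorLayer allocator = new AllocatorLayer();
    private final Map<String, ByteBuffer> entries = new HashMap<>();
    private long hits;
    private long misses;

    void store(String key, byte[] payload) {
        ByteBuffer buffer = allocator.allocate(payload.length);
        buffer.put(payload);
        buffer.flip();
        entries.put(key, buffer);
    }

    byte[] retrieve(String key) {
        ByteBuffer buffer = entries.get(key);
        if (buffer == null) { misses++; return null; }
        hits++;
        byte[] payload = new byte[buffer.remaining()];
        buffer.duplicate().get(payload);
        return payload;
    }

    long hits() { return hits; }
    long misses() { return misses; }
}

// Top layer: turns objects into bytes with a pluggable serializer.
class CachingLayer {
    private final Serializer serializer;
    private final MemoryManagerLayer memory = new MemoryManagerLayer();

    CachingLayer(Serializer serializer) { this.serializer = serializer; }

    void put(String key, String value) { memory.store(key, serializer.serialize(value)); }

    String get(String key) {
        byte[] payload = memory.retrieve(key);
        return payload == null ? null : serializer.deserialize(payload);
    }

    MemoryManagerLayer memory() { return memory; }
}
```

Swapping `Utf8Serializer` for another `Serializer` changes the top layer only; the two layers below never see anything but bytes.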
ByteBuffers Allocation
Two different allocation strategies
• Merging ByteBuffers allocation
– No memory wasted
– Free at creation (no upfront cost)
– Suffers from fragmentation
– Needs synchronization at allocation and deallocation
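A merging allocator can be approximated with a sorted map of free regions: allocation carves exactly the requested size out of the first region that fits, and deallocation merges the freed region with its free neighbours. The sketch below is a simplified first-fit version under that assumption; `MergingAllocator` is a hypothetical name, not the Direct Memory class.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical first-fit sketch for illustration, not the Direct Memory allocator.
class MergingAllocator {
    private final ByteBuffer chunk;
    // Free regions: key is the start offset, value is the region length.
    private final TreeMap<Integer, Integer> free = new TreeMap<>();

    MergingAllocator(int capacity) {
        chunk = ByteBuffer.allocateDirect(capacity);
        free.put(0, capacity); // creation is cheap: a single region covers everything
    }

    // Returns the offset of a region of exactly `size` bytes, or -1 if no region fits.
    synchronized int allocate(int size) {
        for (Map.Entry<Integer, Integer> region : free.entrySet()) {
            int offset = region.getKey();
            int length = region.getValue();
            if (length >= size) {
                free.remove(offset);
                if (length > size) {
                    free.put(offset + size, length - size); // the remainder stays free
                }
                return offset;
            }
        }
        return -1; // enough bytes may be free in total, but not contiguously
    }

    // Gives a region back and merges it with adjacent free regions.
    synchronized void release(int offset, int size) {
        Integer nextLength = free.remove(offset + size);
        if (nextLength != null) {
            size += nextLength; // merge with the free region right after
        }
        Map.Entry<Integer, Integer> previous = free.lowerEntry(offset);
        if (previous != null && previous.getKey() + previous.getValue() == offset) {
            offset = previous.getKey(); // merge with the free region right before
            size += previous.getValue();
        }
        free.put(offset, size);
    }

    // View over an allocated region of the underlying direct buffer.
    synchronized ByteBuffer view(int offset, int size) {
        ByteBuffer duplicate = chunk.duplicate();
        duplicate.position(offset);
        duplicate.limit(offset + size);
        return duplicate.slice();
    }

    synchronized int freeRegions() {
        return free.size();
    }
}
```

Every method is `synchronized` because allocation and release both rewrite the shared free map, which is the synchronization cost listed above. Fragmentation shows up when `allocate` returns -1 although the sum of the free regions would be large enough.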
ByteBuffers Allocation
• Fixed size ByteBuffers allocation
– Linux kernel SLAB-style allocation
• Select a set of fixed sizes
• Split bigger buffers (1MB+) into slabs of those sizes
– Allocation is really fast, with good concurrency
• All structures are pre-instantiated
– Creation (or increasing the buffer size) has a cost
• 1GB split into 128-byte slabs is 8M+ buffers created
– Does not suffer from fragmentation
– Wastes memory if the selected sizes do not match the data
• Works really well in HDFS, where all blocks are of the same size
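A single-size version of the slab strategy fits in a few lines: all slabs are created up front and handed out from a lock-free queue. `SlabAllocator` is a hypothetical name, and a real allocator would keep one such pool per selected size; this sketch keeps only one.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical single-size sketch for illustration, not the Direct Memory allocator.
class SlabAllocator {
    private final int slabSize;
    private final ConcurrentLinkedQueue<ByteBuffer> freeSlabs = new ConcurrentLinkedQueue<>();

    // Pre-instantiates every slab: this loop is the creation cost.
    SlabAllocator(int totalSize, int slabSize) {
        this.slabSize = slabSize;
        ByteBuffer chunk = ByteBuffer.allocateDirect(totalSize);
        for (int offset = 0; offset + slabSize <= totalSize; offset += slabSize) {
            ByteBuffer view = chunk.duplicate();
            view.position(offset);
            view.limit(offset + slabSize);
            freeSlabs.add(view.slice());
        }
    }

    // Returns a free slab limited to `size` bytes, or null if the payload
    // does not fit in a slab or the pool is empty.
    ByteBuffer allocate(int size) {
        if (size > slabSize) {
            return null;
        }
        ByteBuffer slab = freeSlabs.poll(); // no lock, no search
        if (slab != null) {
            slab.clear();
            slab.limit(size); // the rest of the slab is wasted until release
        }
        return slab;
    }

    void release(ByteBuffer slab) {
        freeSlabs.add(slab);
    }

    int available() {
        return freeSlabs.size();
    }

    // Number of slabs needed to cover totalBytes.
    static long slabCount(long totalBytes, long slabBytes) {
        return totalBytes / slabBytes;
    }
}
```

`SlabAllocator.slabCount(1L << 30, 128)` gives 8,388,608, the "8M+" figure above. Storing a 100-byte payload in a 128-byte slab wastes 28 bytes, which is why the choice of sizes matters.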
Use case 1: Multi-layer cache
• Idea: the most used objects are cached on-heap, the rest off-heap, and may overflow to disk.
• Sounds like BigMemory.
• See net.sf.ehcache.store.offheap.OffHeapStore
• Currently we inject DM into Ehcache the same way BigMemory does. Ouch ;)
• A comparison still needs to be done
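The on-heap/off-heap split can be illustrated with a small LRU map whose evicted entries are demoted to direct buffers instead of being dropped. This is a sketch under simplifying assumptions (string values, no disk layer, no off-heap size limit); `TwoLayerCache` is a hypothetical class and has nothing to do with the Ehcache or Direct Memory APIs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch for illustration; not the Ehcache or Direct Memory API.
class TwoLayerCache {
    private final Map<String, ByteBuffer> offHeap = new HashMap<>();
    private final LinkedHashMap<String, String> onHeap;

    TwoLayerCache(final int onHeapCapacity) {
        // accessOrder = true makes the map iterate from least to most recently used.
        onHeap = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > onHeapCapacity) {
                    demote(eldest.getKey(), eldest.getValue());
                    return true; // drop it from the on-heap layer
                }
                return false;
            }
        };
    }

    // Moves a value to the off-heap layer as UTF-8 bytes in a direct buffer.
    private void demote(String key, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes);
        direct.flip();
        offHeap.put(key, direct);
    }

    synchronized void put(String key, String value) {
        offHeap.remove(key); // avoid keeping a stale off-heap copy
        onHeap.put(key, value);
    }

    synchronized String get(String key) {
        String value = onHeap.get(key);
        if (value != null) {
            return value;
        }
        ByteBuffer direct = offHeap.remove(key);
        if (direct == null) {
            return null;
        }
        byte[] bytes = new byte[direct.remaining()];
        direct.get(bytes);
        value = new String(bytes, StandardCharsets.UTF_8);
        onHeap.put(key, value); // promote; this may demote the least recently used entry
        return value;
    }

    synchronized boolean isOnHeap(String key) { return onHeap.containsKey(key); }
    synchronized boolean isOffHeap(String key) { return offHeap.containsKey(key); }
}
```

With an on-heap capacity of 2, inserting `a`, `b`, `c` pushes `a` off-heap; reading `a` again promotes it and demotes `b`.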
Use case 2: Off-heap output stream
• Idea: read the Twitter firehose stream without filling the precious heap memory
– An OOM would lead to unpredictable behavior elsewhere in the application
• Write from your socket directly off-heap, OutputStream style
– Allocate a fixed-size temporary buffer of your choice
• Read from this stream
– InputAndOutputStream parent class that holds both OutputStream and InputStream instances
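As a rough illustration of the pattern, the sketch below pairs an `OutputStream` and an `InputStream` over one direct buffer and drains a source through a small temporary array. `OffHeapStream`, `Out`, `In` and `copy` are hypothetical names; this is not the `InputAndOutputStream` class mentioned above, and the capacity is fixed here, so writing past it throws `BufferOverflowException`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical sketch for illustration, not Direct Memory's stream classes.
class OffHeapStream {
    private final ByteBuffer buffer; // fixed-capacity off-heap storage

    OffHeapStream(int capacity) {
        buffer = ByteBuffer.allocateDirect(capacity);
    }

    // Appends bytes to the off-heap buffer.
    final class Out extends OutputStream {
        @Override
        public void write(int b) {
            buffer.put((byte) b);
        }

        @Override
        public void write(byte[] src, int off, int len) {
            buffer.put(src, off, len);
        }
    }

    // Reads a snapshot of everything written before this stream was created.
    final class In extends InputStream {
        private final ByteBuffer view;

        In() {
            view = buffer.duplicate();
            view.flip();
        }

        @Override
        public int read() {
            return view.hasRemaining() ? (view.get() & 0xFF) : -1;
        }
    }

    Out output() { return new Out(); }
    In input() { return new In(); }

    // Drains `source` into `target` through a small on-heap temporary buffer,
    // so only tempSize bytes of heap are used at any time.
    static long copy(InputStream source, OutputStream target, int tempSize) {
        byte[] temp = new byte[tempSize];
        long total = 0;
        try {
            int n;
            while ((n = source.read(temp)) != -1) {
                target.write(temp, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

In a real setup `source` would be the socket's input stream; any `InputStream` works for trying it out.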
Use case 3: Standalone cache server
• Idea: replace Memcached :)
– But with a native plain REST API
• DM has all the building blocks to implement such a server, worth trying
• See the server submodule
Next Steps
• JSR 107
• Real Benchmarks
• Builder patterns
• Integration with more libs (Spring, Guice, …)
• Implementations with DM lib (Cassandra (wip), Lucene, Tomcat, …)
• Cache Resizing
• Management and monitoring
• ...
• https://issues.apache.org/jira/browse/DIRECTMEMORY
Questions?
• Thanks for your attention