Solving Multi-tenancy and G1GC in Apache HBase

HBase @ HubSpot

Multitenancy

What does Multitenancy Mean?

Benefits of Multitenancy

Identifying Bad Actors

Our Tool: HBasetracing

Ad-hoc Querying with HBasetracing

HBasetracing Roll-up

Example Incident

Dealing with Bad Actors

Quotas

HADOOP_USER_NAME

Deploying Quotas

When Quotas Help

Remember this?

When Quotas Fall Flat

Detention Queues

The Dream

Handling Failure & Managing Risk

Read Replicas

● Consistency.TIMELINE

● isStale()

Read Replica Usage

Read Replica Timeout Settings

Read Replica Limitations

Cluster Replication

Failover Client

@StaleReadOnly Annotation

Monitoring

● next()

G1GC: Making it work with HBase

Why G1GC?

● Designed for large heaps.
○ Divides heap into many smaller G1 regions.
○ G1 regions scanned and collected independently.
● Instead of occasional very long pauses, G1GC has more frequent, shorter pauses.

If tuned properly, G1GC can provide performant GC that scales well for large RegionServer heaps.

The Need for Tuning

Out of the box, G1GC hurt our HBase clusters’ performance:

● Too much time spent in GC pauses.
● Occasional very long GC pauses.
● “To-space Exhaustion”, leading to Full GCs, which led to slow RegionServer deaths.

Recommended Defaults

Important Metrics for Tuning

● G1GC Eden & Tenured size.
○ GC logs: “[Eden: … Survivors: … Heap: …]”
● HBase memory used by Memstore.
○ RegionServer JMX: “memStoreSize”
● HBase memory used by Block Cache.
○ RegionServer JMX: “blockCacheSize”
● HBase memory used by “static index”.
○ RegionServer JMX: “staticIndexSize”
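
As a rough illustration of reading the first metric out of GC logs, the sketch below parses one “[Eden: … Survivors: … Heap: …]” summary line. It assumes the JDK 8 style layout produced by -XX:+PrintGCDetails (used(capacity)->used(capacity) for Eden and Heap, used->used for Survivors); the function name, the sample line, and the dictionary keys are hypothetical, and “tenured” here is simply heap minus Eden minus Survivors, so it also counts humongous regions.

```python
import re

# Multipliers for the size suffixes printed in the log line.
_UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
_SIZE = re.compile(r"(\d+(?:\.\d+)?)([BKMG])")
_SUMMARY = re.compile(r"\[Eden: .*? Survivors: .*? Heap: .*?\]")


def parse_g1_summary(line):
    """Return sizes in bytes from one G1 summary line, or None if absent."""
    match = _SUMMARY.search(line)
    if match is None:
        return None
    sizes = [float(num) * _UNITS[unit]
             for num, unit in _SIZE.findall(match.group(0))]
    if len(sizes) != 10:  # layout differs from the one assumed above
        return None
    (_eden_before, _eden_cap_before, eden_after, eden_cap,
     _surv_before, surv_after,
     _heap_before, _heap_cap_before, heap_after, heap_cap) = sizes
    return {
        "eden_capacity": eden_cap,        # Eden size chosen after the pause
        "survivors_after": surv_after,
        "heap_used_after": heap_after,
        "heap_capacity": heap_cap,
        # Old generation plus humongous regions left after the pause.
        "tenured_after": heap_after - eden_after - surv_after,
    }


# Hypothetical sample line, not taken from a real cluster.
sample = ("   [Eden: 3072.0M(3072.0M)->0.0B(3072.0M) "
          "Survivors: 128.0M->128.0M Heap: 10.0G(32.0G)->7.0G(32.0G)]")
stats = parse_g1_summary(sample)
print(stats["tenured_after"] / 1024 ** 3, "GiB tenured after the pause")
```

Tracking the maximum of tenured_after over time gives a GC-side cross-check of the JMX memory figures listed above.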

Necessary Tuning Params

● JVM args:
○ -Xms, -Xmx
○ -XX:G1NewSizePercent
○ -XX:InitiatingHeapOccupancyPercent (aka “IHOP”)
● HBase configs (hbase-site.xml):
○ hfile.block.cache.size
○ hbase.regionserver.global.memstore.size

Necessary Tuning: Method

A. Find max block cache size, memstore size, and static index size from the past month.
B. Sum 110% of (A) maxes, add heap waste.
C. Set IHOP and heap size such that Initiating Heap Occupancy > (B) by at least 10% heap.
D. Ensure IHOP + G1NewSizePercent < 90%.
– 90% = 100% - G1ReservePercent (default 10)
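
Steps A through D can be sketched as a small calculation. This is one reading of the method, not the presenters' tooling: the function name and every input figure are hypothetical, “heap waste” is taken as a number you supply, and step C is interpreted as IHOP × heap − (B) ≥ 10% of heap, which rearranges to heap ≥ (B) / (IHOP − 0.10).

```python
def size_heap(max_block_cache_gb, max_memstore_gb, max_static_index_gb,
              heap_waste_gb, ihop_percent, new_size_percent,
              reserve_percent=10):
    """Return (required_gb, min_heap_gb) for the given monthly maxes."""
    # Step B: 110% of the summed monthly maxes from step A, plus heap waste.
    required_gb = 1.10 * (max_block_cache_gb + max_memstore_gb
                          + max_static_index_gb) + heap_waste_gb

    # Step D: IHOP + G1NewSizePercent must stay below 100% - G1ReservePercent.
    if ihop_percent + new_size_percent >= 100 - reserve_percent:
        raise ValueError("IHOP + G1NewSizePercent leaves no room for the reserve")

    # Step C: the initiating occupancy must exceed (B) by at least 10% of heap,
    # i.e. (ihop - 0.10) * heap >= required.
    margin = ihop_percent / 100 - 0.10
    if margin <= 0:
        raise ValueError("IHOP must be above 10% for step C to be satisfiable")
    min_heap_gb = required_gb / margin
    return required_gb, min_heap_gb


# Hypothetical inputs: 8 GB block cache, 6 GB memstore, 1 GB static index,
# 1.5 GB heap waste, IHOP = 70, G1NewSizePercent = 10.
required, min_heap = size_heap(8, 6, 1, 1.5, ihop_percent=70, new_size_percent=10)
print(f"required ~{required:.1f} GB, minimum heap ~{min_heap:.1f} GB")
```

With these made-up inputs, (B) comes to about 18 GB and the smallest heap that satisfies step C is about 30 GB, which would then be the value for -Xms and -Xmx.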

Necessary Tuning: cont.

In hbase-site.xml:

● Set hfile.block.cache.size ratio value to 110% max block cache size from the past month.
● Set hbase.regionserver.global.memstore.size ratio value to 110% max Memstore size from the past month.
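
Since both settings are expressed as a fraction of the heap, the two values can be derived as below. This is a minimal sketch with hypothetical inputs and a hypothetical function name; the 0.8 ceiling on the combined fractions is an assumption based on HBase's commonly documented startup check and should be verified against your HBase version.

```python
def hbase_memory_ratios(max_block_cache_gb, max_memstore_gb, heap_gb,
                        max_combined=0.8):
    """Return hbase-site.xml ratio values sized at 110% of the monthly maxes."""
    block_cache_ratio = 1.10 * max_block_cache_gb / heap_gb
    memstore_ratio = 1.10 * max_memstore_gb / heap_gb
    # Assumed limit: block cache + memstore should not exceed max_combined.
    if block_cache_ratio + memstore_ratio > max_combined:
        raise ValueError("block cache + memstore fractions exceed the assumed limit")
    return {
        "hfile.block.cache.size": round(block_cache_ratio, 3),
        "hbase.regionserver.global.memstore.size": round(memstore_ratio, 3),
    }


# Hypothetical inputs: 8 GB max block cache, 6 GB max memstore, 30 GB heap.
for name, value in hbase_memory_ratios(8, 6, 30).items():
    print(f"{name} = {value}")
```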

Further Tuning & Considerations

● -XX:G1ReservePercent
○ Accommodating for burst-y usage.
● -XX:G1HeapRegionSize
○ Reducing occurrence of humongous objects.
○ Reducing long tail of slow GCs in some cases.
● -XX:G1NewSizePercent
○ Tuning individual pause time vs. % time in GC.

HBase Usage & Tuning Limits

A Full GC isn’t necessarily G1GC’s fault. There’s a level of “bad usage” that’s unreasonable to tune around:

● Unexpected, excessively burst-y traffic.
● Too many/enormous Humongous objects.

In either of these cases, the real solution is to fix the client code.

Usage Note: Caching isn’t Free!

Yellow: % time spent in Mixed GC (left axis) | Blue: block cache churn, MB/sec (right axis)

...to Summarize:

● Tune heap size, IHOP, & HBase memory caps based on HBase memory usage.
● Tune Eden size based on % time in GC & average Young GC pause times.
● Make adjustments as needed, based on cluster usage.
● Look for suboptimal usage in your HBase clients to further improve HBase GC.

Links & Reference

Blog Post —  http://bit.ly/hbasegc

G1GC CollectD Plugin — http://bit.ly/collectdgc

G1GC Log Visualizer — http://bit.ly/gclogviz