HBase @ HubSpot

Solving Multi-tenancy and G1GC in Apache HBase

Multitenancy

What does Multitenancy Mean?

Benefits of Multitenancy

Identifying Bad Actors

Our Tool: HBasetracing

Ad-hoc Querying with HBasetracing

HBasetracing Roll-up

Example Incident

Dealing with Bad Actors

Quotas

Deploying Quotas

● HADOOP_USER_NAME

When Quotas Help

Remember this?

When Quotas Fall Flat

Detention Queues

The Dream

Handling Failure & Managing Risk

Read Replicas

Read Replica Usage

● Consistency.TIMELINE
● isStale()

Read Replica Timeout Settings

Read Replica Limitations

Cluster Replication

Failover Client

@StaleReadOnly Annotation

Monitoring

● next()

G1GC: Making it work with HBase

Why G1GC?

● Designed for large heaps.
    ○ Divides heap into many smaller G1 regions.
    ○ G1 regions scanned and collected independently.
● Instead of occasional very long pauses, G1GC has more frequent, shorter pauses.

If tuned properly, G1GC can provide performant GC that scales well for large RegionServer heaps.

The Need for Tuning

Out of the box, G1GC hurt our HBase clusters’ performance:

● Too much time spent in GC pauses.
● Occasional very long GC pauses.
● “To-space Exhaustion”, leading to Full GCs, which led to slow RegionServer deaths.

Recommended Defaults

Important Metrics for Tuning

● G1GC Eden & Tenured size.
    ○ GC logs: “[Eden: … Survivors: … Heap: …]”
● HBase memory used by Memstore.
    ○ RegionServer JMX: “memStoreSize”
● HBase memory used by Block Cache.
    ○ RegionServer JMX: “blockCacheSize”
● HBase memory used by “static index”.
    ○ RegionServer JMX: “staticIndexSize”
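The Eden and Tenured sizes can be pulled out of that GC log summary with a regular expression. The following is a minimal sketch, assuming the JDK 7/8 `-XX:+PrintGCDetails` layout for the line; the class name, method names, and the sample line are hypothetical, and the Tenured figure is only an estimate (heap used after the pause, minus Eden and Survivors).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class G1LogSizes {
    // One size token such as "0.0B", "3072.0K", "1024.0M" or "11.5G".
    private static final String SIZE = "([0-9.]+[BKMG])";

    // Assumed layout: [Eden: used(cap)->used(cap) Survivors: a->b Heap: used(cap)->used(cap)]
    private static final Pattern SUMMARY = Pattern.compile(
        "\\[Eden: " + SIZE + "\\(" + SIZE + "\\)->" + SIZE + "\\(" + SIZE + "\\)"
        + " Survivors: " + SIZE + "->" + SIZE
        + " Heap: " + SIZE + "\\(" + SIZE + "\\)->" + SIZE + "\\(" + SIZE + "\\)\\]");

    /** Converts a size token to megabytes. */
    static double toMb(String size) {
        double value = Double.parseDouble(size.substring(0, size.length() - 1));
        switch (size.charAt(size.length() - 1)) {
            case 'B': return value / (1024.0 * 1024.0);
            case 'K': return value / 1024.0;
            case 'G': return value * 1024.0;
            default:  return value; // 'M'
        }
    }

    /**
     * Returns {Eden capacity after the pause, estimated Tenured size}, both in MB,
     * or null when the line does not contain the summary.
     */
    static double[] edenAndTenuredMb(String line) {
        Matcher m = SUMMARY.matcher(line);
        if (!m.find()) {
            return null;
        }
        double edenCapacityAfter = toMb(m.group(4));
        // Estimate: heap used after - Eden used after - Survivors after.
        double tenured = toMb(m.group(9)) - toMb(m.group(3)) - toMb(m.group(6));
        return new double[] { edenCapacityAfter, tenured };
    }

    public static void main(String[] args) {
        // Hypothetical sample line, not taken from a real cluster.
        String line = "   [Eden: 1024.0M(1024.0M)->0.0B(1024.0M) Survivors: 64.0M->64.0M "
            + "Heap: 12.5G(32.0G)->11.5G(32.0G)]";
        double[] sizes = edenAndTenuredMb(line);
        System.out.println("Eden MB: " + sizes[0] + ", Tenured MB (estimate): " + sizes[1]);
    }
}
```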

Necessary Tuning Params

● JVM args:
    -Xms, -Xmx
    -XX:G1NewSizePercent
    -XX:InitiatingHeapOccupancyPercent (aka “IHOP”)
● HBase configs (hbase-site.xml):
    hfile.block.cache.size
    hbase.regionserver.global.memstore.size

Necessary Tuning: Method

A. Find max block cache size, memstore size, and static index size from the past month.
B. Sum 110% of (A) maxes, add heap waste.
C. Set IHOP and heap size such that Initiating Heap Occupancy > (B) by at least 10% heap.
D. Ensure IHOP + G1NewSizePercent < 90%.
    – 90% = 100% - G1ReservePercent (default 10)
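The arithmetic in steps A through D can be sketched as a small Java check. This is an illustration under stated assumptions rather than the presenters' tooling: the class name, method names, and sample numbers are hypothetical, all sizes are in MB, and heap waste is passed in as a measured value instead of being derived.

```java
final class G1HeapPlan {
    /** Step B: 110% of the summed monthly maxes from step A, plus heap waste. */
    static double requiredMb(double maxBlockCacheMb, double maxMemstoreMb,
                             double maxStaticIndexMb, double heapWasteMb) {
        return 1.10 * (maxBlockCacheMb + maxMemstoreMb + maxStaticIndexMb) + heapWasteMb;
    }

    /** Step C: initiating heap occupancy must exceed (B) by at least 10% of the heap. */
    static boolean ihopLeavesHeadroom(double heapMb, int ihopPercent, double requiredMb) {
        double initiatingOccupancyMb = heapMb * ihopPercent / 100.0;
        return initiatingOccupancyMb >= requiredMb + 0.10 * heapMb;
    }

    /** Step D: IHOP + G1NewSizePercent must stay below 100% - G1ReservePercent. */
    static boolean fitsBelowReserve(int ihopPercent, int newSizePercent, int reservePercent) {
        return ihopPercent + newSizePercent < 100 - reservePercent;
    }

    public static void main(String[] args) {
        // Hypothetical monthly maxes and heap waste, in MB.
        double required = requiredMb(8000, 6000, 1000, 1500);
        double heapMb = 32768;  // hypothetical -Xms/-Xmx of 32 GB
        int ihop = 65;          // hypothetical -XX:InitiatingHeapOccupancyPercent
        int newSize = 20;       // hypothetical -XX:G1NewSizePercent
        int reserve = 10;       // G1ReservePercent default, per the slide

        System.out.println("Required MB (step B): " + required);
        System.out.println("Step C satisfied: " + ihopLeavesHeadroom(heapMb, ihop, required));
        System.out.println("Step D satisfied: " + fitsBelowReserve(ihop, newSize, reserve));
    }
}
```

If step C fails, either the heap or IHOP has to go up; step D then bounds how far IHOP can rise before it collides with the space set aside for Eden and the reserve.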

Necessary Tuning: cont.

In hbase-site.xml:

● Set hfile.block.cache.size ratio value to 110% max block cache size from the past month.
● Set hbase.regionserver.global.memstore.size ratio value to 110% max Memstore size from the past month.
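Both settings are fractions of the heap, so the observed maxes have to be converted before they go into hbase-site.xml. A minimal sketch of that conversion, assuming sizes in MB; the class name, method name, and sample numbers are hypothetical.

```java
final class HBaseMemoryCaps {
    /** Returns 110% of the observed monthly max, expressed as a fraction of the heap. */
    static double capRatio(double observedMaxMb, double heapMb) {
        return 1.10 * observedMaxMb / heapMb;
    }

    public static void main(String[] args) {
        double heapMb = 32768;         // hypothetical heap size (-Xmx)
        double maxBlockCacheMb = 8000; // hypothetical max "blockCacheSize" from JMX
        double maxMemstoreMb = 6000;   // hypothetical max "memStoreSize" from JMX

        System.out.printf("hfile.block.cache.size = %.2f%n",
            capRatio(maxBlockCacheMb, heapMb));
        System.out.printf("hbase.regionserver.global.memstore.size = %.2f%n",
            capRatio(maxMemstoreMb, heapMb));
    }
}
```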

Further Tuning & Considerations

● -XX:G1ReservePercent
    ○ Accommodating for burst-y usage.
● -XX:G1HeapRegionSize
    ○ Reducing occurrence of humongous objects.
    ○ Reducing long tail of slow GCs in some cases.
● -XX:G1NewSizePercent
    ○ Tuning individual pause time vs. % time in GC.
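To see why region size affects humongous objects: G1 treats an allocation larger than about half a region as humongous, so the same allocation can stop being humongous once the regions are larger. A minimal sketch of that rule; the class and method names and the 6 MB allocation are hypothetical, and behavior at exactly half a region is left out on purpose.

```java
final class HumongousCheck {
    private static final long MB = 1024L * 1024L;

    /** True when the allocation is larger than half of one G1 region. */
    static boolean isHumongous(long allocationBytes, long regionSizeBytes) {
        return allocationBytes > regionSizeBytes / 2;
    }

    public static void main(String[] args) {
        long allocation = 6 * MB; // hypothetical large allocation, e.g. one big cell
        for (long regionMb : new long[] { 8, 16, 32 }) {
            // Region sizes as they would be set with -XX:G1HeapRegionSize.
            System.out.println(regionMb + "M regions: humongous = "
                + isHumongous(allocation, regionMb * MB));
        }
    }
}
```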

HBase Usage & Tuning Limits

A Full GC isn’t necessarily G1GC’s fault. There’s a level of “bad usage” that’s unreasonable to tune around:

● Unexpected, excessively burst-y traffic.
● Too many/enormous Humongous objects.

In either of these cases, the real solution is to fix the client code.

Usage Note: Caching isn’t Free!

Yellow: % time spent in Mixed GC (left axis) | Blue: block cache churn, MB/sec (right axis)

...to Summarize:

● Tune heap size, IHOP, & HBase memory caps based on HBase memory usage.
● Tune Eden size based on % time in GC & average Young GC pause times.
● Make adjustments as needed, based on cluster usage.
● Look for suboptimal usage in your HBase clients to further improve HBase GC.

Links & Reference

Blog Post —  http://bit.ly/hbasegc

G1GC CollectD Plugin — http://bit.ly/collectdgc

G1GC Log Visualizer — http://bit.ly/gclogviz