DESCRIPTION
Overview of the upcoming snapshot feature in HBase.
HBase Snapshots
HBase User Group Meetup, 10/29/12
Jesse Yates
So you wanna….
• Prevent data loss
• Recover to a point in time
• Backup your data
• Sandbox copy of data
Problem!
a BIG Problem…
• Petabytes of data
• 100’s of servers
• At a single point in time
• Millions of writes per second
Solution!
Solutions!
(Obvious) Solutions!
Built-in
• Export
  – MapReduce job against HBase API
  – Output to a single sequence file
• Copy Table
  – MapReduce job against HBase API
  – Output to another table
Yay
• Simple
• Heavily tested
• Can do point-in-time
Boo
• Slow
• High impact on running cluster
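For reference, both built-in tools run as MapReduce jobs launched from the command line; a sketch with placeholder table and path names:

```
# Export a table to sequence files on HDFS
hbase org.apache.hadoop.hbase.mapreduce.Export mytable /backups/mytable

# Copy a table into another table (the copy may live on a remote cluster)
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=mytable_copy mytable
```

Both jobs read through the HBase API region by region, which is where the "slow, high impact" trade-off comes from.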
(Less Obvious) Solution!
Replication
• Export all changes by tailing WAL
Yay
• Simple
• Gets all edits
• Minimal impact on running cluster
Boo
• Turn on from beginning
• Can't turn it off and catch up
• No built-in point-in-time
• Still need ETL process to get multiple copies
(Facebook) Solution! [1]
Mozilla did something similar [2]
1. issues.apache.org/jira/browse/HBASE-5509
2. github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/hadoop/Backup.java
Facebook Backup
• Copy existing HFiles, HLogs
Yay
• Through HDFS
  – Doesn't impact running cluster
• Fast
  – distcp is 100% faster than M/R through HBase
Boo
• Not widely used
• Requires hardlinks
• Recovery requires WAL replay
• Point-in-time needs filter
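At its core this approach is a distcp of the HBase data directories, bypassing the HBase API entirely; a hypothetical invocation (hosts and paths are placeholders):

```
hadoop distcp hdfs://source-nn:8020/hbase hdfs://backup-nn:8020/hbase-backup
```

Because distcp copies raw files, it avoids region-server load, but it also explains the "Boo" list: files copied mid-write need WAL replay, and nothing in HDFS gives you a point-in-time cut for free.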
Backup through the ages
• Export
• Copy Table
• Replication
• HBASE-50
Maybe this is harder than we thought…
We did some work…
Hardlink workarounds
• HBASE-5547
  – Move deleted HFiles to the .archive directory
• HBASE-6610
  – FileLink: equivalent to Windows link files
Enough to get started….
Difficulties
• Coordinating many servers
• Minimizing unavailability
• Minimizing time to restore
• Gotta be fast
HBASE-50, HBASE-6055
Snapshots
• Fast – zero-copy of files
• Point-in-time semantics
  – Part of how it's built
• Built-in recovery
  – Make a table from a snapshot
• SLA enforcement
  – Guaranteed max unavailability
Coming in HBase-0.96!
Snapshots?
We’ve got a couple of those…
Snapshot Types
• Offline– Table is already disabled
• Globally consistent– Consistent across all servers
• Timestamp consistent– Point-in-time according to each server
Offline Snapshots
• Table is already disabled
• Requires minimal log replay
  – Especially if table is cleanly disabled
• State of the table when disabled
• Don't need to worry about changing state
Yay
• Fast!
• Simple!
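In the shell, an offline snapshot is simply a snapshot taken while the table is disabled; a sketch with placeholder names:

```
hbase> disable 'mytable'                  # take the table offline
hbase> snapshot 'mytable', 'mysnapshot'   # zero-copy, point-in-time snapshot
hbase> enable 'mytable'                   # bring the table back online
```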
But I can’t take my table offline!
Globally Consistent Snapshots
• All regions block writes until everyone agrees to snapshot
  – Two-phase commit-ish
• Time-bound to prevent infinite blocking
  – Unavailability SLA maintained per region
• No flushing – it's fast!
What could possibly go wrong?
Cross-Server Consistency Problems
• General distributed coordination problems
  – Block writes while waiting for all regions
  – Limited by slowest region
  – More servers = higher P(failure)
• Stronger guarantees than currently in HBase
• Requires WAL replay to restore table
I don’t need all that, what else do you have?
Timestamp Consistent Snapshots
• All writes up to a TS are in the snapshot
• Leverages existing flush functionality
• Doesn’t block writes
• No WAL replay on recovery
Timestamp Consistent?
(diagram) Each incoming Put/Delete/Mutate is routed by its timestamp: edits at or before the snapshot timestamp go to the snapshot store, later edits go to the future store – both backed by the MemStore.
I’ve got a snapshot,now what?
Recovery
• Export snapshot– Send snapshot to another cluster
• Clone snapshot– Create new table from snapshot
• Restore table– Rollback table to specific state
Export Snapshot
• Copy a full snapshot to another cluster
  – All required HFiles/HLogs
  – Lots of options
• Fancy distcp
  – Fast!
  – Minimal impact on running cluster
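A hypothetical export invocation (snapshot name, cluster URL, and mapper count are placeholders):

```
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot mysnapshot \
    -copy-to hdfs://backup-cluster:8020/hbase \
    -mappers 16
```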
Clone Table
• New table from snapshot
• Create multiple tables from same snapshot
• Exact replica at the point-in-time
• Full Read/Write on new table
Restore
• Replace existing table with snapshot
• Snapshots current table, just in case
• Minimal overhead
  – Handles creating/deleting regions
  – Fixes META for you
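Both recovery paths are one-liners in the shell; a sketch (names are placeholders; restoring requires disabling the table first):

```
hbase> clone_snapshot 'mysnapshot', 'mytable_clone'   # new read/write table from the snapshot

hbase> disable 'mytable'
hbase> restore_snapshot 'mysnapshot'                  # roll the table back to the snapshot state
hbase> enable 'mytable'
```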
Whew, that’s a lot!
Even more awesome!
Goodies
• Full support in shell
• Distributed Coordination Framework
• ‘Ragged Backup’ added along the way
• Coming in next CDH
• Backport to 0.94?
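A sketch of the basic shell commands that ship with the feature:

```
hbase> snapshot 'mytable', 'mysnapshot'   # take a snapshot of an online table
hbase> list_snapshots                     # show existing snapshots
hbase> delete_snapshot 'mysnapshot'       # drop a snapshot you no longer need
```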
Special thanks!
• Matteo Bertozzi
  – All the recovery code
  – Shell support
• Jon Hsieh
  – Distributed two-phase commit refactor
• All our reviewers
  – Stack, Ted Yu, Jon Hsieh, Matteo
Thanks! Questions?
Jesse Yates
@jesse_yates
jesse.k.yates@gmail.com