BDTC 2013 Beijing China
The State of Apache HBase
Michael Stack
● PMC* Chair of Apache HBase Project
● Caretaker/Janitor
● Member of the Hadoop PMC
● Engineer at Cloudera in San Francisco
* Project Management Committee
Table of Contents:
● What is HBase?
● Who uses it?
● Who runs the project?
● HBase Today
● Tomorrow
● Ecosystem
HBase is... “...an open source, distributed, scalable, consistent, low latency, non-relational, random access database”
Built on Apache Hadoop
● Hadoop core is:
  – Distributed file system (HDFS)
  – MapReduce
● HBase persists all data to HDFS
● Uses Apache ZooKeeper
  – Cluster coordination
Project Goal:
“Billions of rows X millions of columns on clusters of 'commodity hardware'”
http://www.flickr.com/photos/ag_gilmore/8170021483/in/photostream/
Inspiration
A Google technology described in the 2006 paper “Bigtable: A Distributed Storage System for Structured Data” by Chang et al.
First commit...
commit 454a9dbe046194f8eef3dddc3e5942910dd5b7a1
Author: Douglass Cutting <[email protected]>
Date: Tue Apr 3 20:34:28 2007 +0000

    HADOOP-1045. Add contrib/hbase, a BigTable-like online database.
HADOOP DISTRIBUTIONS
When to use it?
BIG data
scalE!
Low-latency, online, random reads/writes
+ “Simple” access patterns
Datamodel*
* Like the Google Bigtable model, only with different nomenclature
DataModel: A Bigtable!
● 0-N Bigtable(s)
● Bigtable has:
  ● Rows x Column Families
  ● Rows have primary key
● Column Families have:
  ● Any number of Columns
  ● By access/attributes
  ● CF prefix and qualifier
    ● e.g. attribute:mimetype
[Figure: Bigtable 'A' drawn as a grid with a Row Key column, Column Family A, and Column Family B. The highlighted cell is the cell at bigtable 'A', row key 'p', CF 'B:red'.]
Datamodel: Regions
● Bigtable splits into “regions”
  ● Automatically as table grows
● Region has contiguous rows
  ● Known by [startRow, endRow)
● Distributed over cluster
  ● 0-100s per server
[Figure: a table's sorted rows divided into Region a-e, Region e-j, Region k-o, etc.]
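The [startRow, endRow) convention above can be sketched as a sorted-map lookup. This is a minimal illustration, not HBase's actual region-location code: the class `RegionLookupSketch`, its method names, and the region labels are hypothetical, and it assumes the first region starts at the empty key. Each region is indexed by its inclusive start row, so the region holding a row key is the one with the greatest start row that is less than or equal to that key.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: locate the region whose [startRow, endRow) range holds a row key.
class RegionLookupSketch {
    // Inclusive start row -> region label, kept in sorted order.
    private final TreeMap<String, String> regionsByStartRow = new TreeMap<>();

    void addRegion(String startRow, String label) {
        regionsByStartRow.put(startRow, label);
    }

    // floorEntry finds the greatest start row <= rowKey, i.e. the owning region.
    String regionFor(String rowKey) {
        Map.Entry<String, String> entry = regionsByStartRow.floorEntry(rowKey);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "region-1");   // rows before 'e'
        table.addRegion("e", "region-2");  // rows from 'e' up to, not including, 'k'
        table.addRegion("k", "region-3");  // rows from 'k' onward
        System.out.println(table.regionFor("d"));
        System.out.println(table.regionFor("e"));
        System.out.println(table.regionFor("k"));
    }
}
```

Because the end row is exclusive, a key equal to a boundary belongs to the region that starts there, not the one that ends there.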
DataModel: Sorted & Versioned
● All is byte[]
  ● No native 'types'
  ● Minor schema or schema-less (NoSQL)
● All is SORTED
  ● Rows in byte-lexicographical order
  ● Columns sorted along row
● VERSIONED
  ● Cells are “versioned”
  ● 3D (timestamp)
[Figure: Region a-e in 3D, with layers of cells stacked along the timestamp (version) axis.]
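Byte-lexicographical ordering has a practical consequence for row key design, which a small sketch can show. The class `ByteOrderSketch` and its helpers are hypothetical names for illustration; the comparison uses the JDK's `Arrays.compareUnsigned` (Java 9 or later), not an HBase class. Numbers written as text do not sort numerically, while a fixed-width big-endian encoding of non-negative values does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of byte-lexicographical key ordering.
class ByteOrderSketch {
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Fixed-width big-endian bytes (ByteBuffer's default byte order).
    static byte[] bigEndian(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Compares byte by byte, treating each byte as unsigned (0..255).
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        // As text, "10" sorts before "9" because '1' < '9'.
        System.out.println(compare(utf8("10"), utf8("9")) < 0);
        // As fixed-width big-endian longs, 9 sorts before 10.
        System.out.println(compare(bigEndian(9L), bigEndian(10L)) < 0);
    }
}
```

Zero-padding text keys to a fixed width is another common way to get the same effect.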
Datamodel: Strongly consistent
● Favors consistency over availability
  “Designing applications to cope with concurrency anomalies in their data is very error-prone, time-consuming, and ultimately not worth the performance gains” -- F1: A Distributed SQL Database That Scales
● Row modifications are atomic
  ● Even if thousands of columns on a row
Datamodel: in short
“...a sparse, distributed, persistent multidimensional sorted map” – Bigtable Paper (2006)
(Table, Row, ColumnFamily, Qualifier, Timestamp) → Value
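The mapping above can be pictured as nested sorted maps. The following is a toy in-memory model for one table, not the HBase client API: the class `SortedCellMapSketch` and its methods are hypothetical, keys and values are plain strings rather than byte[], and the column family and qualifier are folded into one "family:qualifier" string. Versions are kept newest-first, so a read returns the latest value unless an older timestamp is requested.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical toy model: row -> "family:qualifier" -> timestamp -> value.
class SortedCellMapSketch {
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Stores one version of a cell; versions are ordered newest-first.
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, NavigableMap<Long, String>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(timestamp, value);
    }

    // Returns the newest version with timestamp <= asOf, or null if there is none.
    String getAsOf(String row, String column, long asOf) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        NavigableMap<Long, String> versions = (columns == null) ? null : columns.get(column);
        if (versions == null) {
            return null;
        }
        // In a descending map, ceilingEntry yields the largest timestamp <= asOf.
        Map.Entry<Long, String> entry = versions.ceilingEntry(asOf);
        return entry == null ? null : entry.getValue();
    }

    String getLatest(String row, String column) {
        return getAsOf(row, column, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        SortedCellMapSketch table = new SortedCellMapSketch();
        table.put("p", "B:red", 1L, "first");
        table.put("p", "B:red", 2L, "second");
        System.out.println(table.getLatest("p", "B:red"));
        System.out.println(table.getAsOf("p", "B:red", 1L));
    }
}
```

The map is sparse in the same sense as the quote: a row stores only the cells that were actually written.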
Architecture: Bird's-eye view
Application, MapReduce, Impala
Thrift/REST Gateway
HBase Java Client
ZooKeeper, HBase Master
HBase RegionServer
HDFS
Features
• Classes to MapReduce HBase tables
  – Hive, Pig, etc.
• Query predicate push down via server side filters
• Coprocessors (stored procedures/triggers)
  – e.g. security, secondary indices
• Java clients
  – REST and Thrift too
• Extensible JRuby-based (JIRB) shell
• Replication
• Security
  – Table/Column Family
  – Kerberos Authentication, ACLs
API
● get
● put
● delete
● multi
● scan
● increment
● append
● checkAnd*
● MapReduce
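The checkAnd* operations are conditional writes: the write is applied only if the current value matches an expected one, and the check and the write happen as one atomic step. The sketch below only illustrates that idea with a JDK `ConcurrentHashMap`; the class `CheckAndPutSketch` and its method signatures are hypothetical and are not the HBase client API, and the single string key per cell is a simplification.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of compare-and-set semantics on a single cell.
class CheckAndPutSketch {
    private final ConcurrentHashMap<String, String> cells = new ConcurrentHashMap<>();

    // Writes newValue (must be non-null) only if the current value equals expected.
    // Passing expected == null means "the cell must not exist yet".
    boolean checkAndPut(String cellKey, String expected, String newValue) {
        boolean[] applied = {false};
        // compute() runs the check and the write atomically for this key.
        cells.compute(cellKey, (key, current) -> {
            if (Objects.equals(current, expected)) {
                applied[0] = true;
                return newValue;
            }
            return current; // leave the cell unchanged
        });
        return applied[0];
    }

    String get(String cellKey) {
        return cells.get(cellKey);
    }

    public static void main(String[] args) {
        CheckAndPutSketch store = new CheckAndPutSketch();
        System.out.println(store.checkAndPut("row1/cf:q", null, "a")); // cell absent, so applied
        System.out.println(store.checkAndPut("row1/cf:q", null, "b")); // cell now exists, so rejected
        System.out.println(store.get("row1/cf:q"));
    }
}
```

A caller that loses the race sees `false` and can re-read and retry, which is how such operations are typically used to avoid lost updates.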
What to expect
• Writes:
  – 1-3ms, 1k-20k writes/sec per node
• Reads:
  – 0-3ms cached, 10-30ms disk
  – 10-40k reads / second / node from cache
  – More if SSD
• Cell size
  • 0-3MB preferred
• Column-oriented so wide tables are OK
• Sparsely populated rows OK
Who uses it?
In Production
● OLTP & Batch
● Messages
  ○ 1B+ users
  ○ Tens of PBs (compressed)
  ○ Thousands of machines, Pods of ~200
● ODS/Real-time monitoring/Timeseries
  ○ Metrics from every server @ FB
  ○ 2.5B writes/16k reads per minute
● Post Search Store
  ○ MapReduce to build index
  ○ 1 Trillion posts
● All on AWS
● 5 production clusters and growing
● Mix of SSD and SATA
● Billions of page views per month
● Long time HBase user
● Two clusters of 1k nodes each
  ○ Master-Master replicating
● Separate low-latency cluster
  ○ Up to 1M reads a second
Cassini
● eBay item search indexing
● 600M active items in HBase tables
● 1.4TB of data processed each day
● 400M puts to HBase each day
● 250M search metrics per day
● Two datacenters
● Growing clusters...
  – 500->1k
Deploy types
• Multitenant multifarious feature store
  o a.k.a. dumping ground
  o StumbleUpon, Y!, Salesforce
• Reconciliation store
  o eBay
• Timeseries
  o Salesforce, FB ODS
• Lots-o-entities store
  o Flurry, genome
  o Lots-o-entities BLOBs, FB Messages
Who runs the project?
Diverse team*
* http://hbase.apache.org/team-list.html
COMMITTERS!
Preferably ALIVE!
Dev Rate
# of commits
Total Files: 2021
Total Lines of Code: 832122
Total Commits: 6615 (~ 3/day)
Authors: 39
(https://www.ohloh.net/p/hbase)
JIRA: 2008-2013
Commits/Month Over Time (0.94/trunk)
HBase Today
• Release every month
• Each more stable
• & more performant
• Some features…
• Wire compatible between releases
• Currently at 0.94.13
http://www.flickr.com/photos/sysli/3026288256/sizes/o/in/photostream/
● hbase-0.96.0
  – Released October 19th, 2013
  – 18 months in the making
● >2000 fixes
Big Themes
● Stability
● Operability
  – Insight, tools
● Scalability
● Evolvability
● Pluggable Compaction
  – Smarter triggers
● Hadoop1 AND Hadoop2
● Smarter Region Balancer
● Region Assignment & Replication
  – Hardened
● Coprocessors
  – More hooks
Sampler
http://www.flickr.com/photos/allspaw/5815258929/sizes/o/in/photostream/
http://www.flickr.com/photos/38595542@N02/3690830720/sizes/o/in/photostream/
• System tables
• Filesystem
• Up in ZooKeeper
• Over the wire
Snapshots
• By Table
  o Snapshot, clone, restore, export
• Inexpensive
  o Just metadata
• Good for...
  o Backups
  o Replication
  o Offline processing
Namespaces
• Grouping of tables
  – Like database in MySQL
• System/User
  – hbase:meta
• Quota
• Coming
  – Security by namespace
  – Grouping on cluster by namespace
And more...
• X-row (in-region) Transactions
• Query tracing
• New UI
• Online Region Merge
• Client-side types
• Metrics2
  o Radical revamp
• Windows!
• Branched, released soon
• Rolling upgrade from 0.96.0
• In-line Cell-tags
  – Security++
    ● ACL down to the Cell-level
    ● Cell-level visibility labels
    ● Encryption
• Reverse Scan
HBase 2014
● HBase 1.0.0
● Reining in the 99th percentiles
● Multi-WAL
● Speculative replica reads
● More support for multi-tenancy
● Off-heap
Ecosystem
OpenTSDB
● Timeseries
● Store, index and serve metrics at large scale
● Make data easily accessible and graphable
Haeinsa
What is Haeinsa?
Haeinsa is a linearly scalable multi-row, multi-table transaction library for HBase. It uses two-phase locking and optimistic concurrency control to implement transactions. The isolation level of its transactions is serializable.
● Inspired by Google Percolator
● VCNC
Chasm
How to make it easier writing applications against HBase?
Frameworks: Kiji.org
• Entity-centric, simple model
  o Types, complex, compound types.
• Each cell is schema versioned
• Works across MR & REST, etc.
• Machine-learning libs
• Examples, tutorials
• Production users
• Open-source
Frameworks: CDK
• APIs providing Dataset abstraction
  – get/put/delete API in Avro objects
• Highlights:
  – Supports multiple components
    ● Flume, Morphlines, Hive, Crunch, HCat
  – Types using Avro and Parquet formats
  – Manages schema evolution
• Open source by Cloudera
  – http://cloudera.github.io/cdk/docs/current
● Client-embedded JDBC driver
  ○ Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
● Alternate HBase Client API (SQL)
● Fast!
  ○ Exploits HBase Coprocessors/Filters
  ○ Types
  ○ Aggregations
  ○ Skip scans
  ○ Secondary indices
Datastores
+ + etc.
Thank you
[email protected]
End
TODO
● DBA: R (read), W (write), C (create), X (execute), A (admin).
● Cell-level security. Every cell in an Accumulo store can have a label, stored effectively as part of the key, which is used to determine whether a value is visible to a given subject or not. The label is not an ACL; it is a different way of expressing security policy.
● A label instead turns this on its head and describes the sensitivity of the information to a decision engine that then figures out if the subject is authorized to view data of that sensitivity based on (potentially, many) factors.
● Then, as of HBASE-7662, HBase can store into and apply ACLs from cell tags, extending the current HBase ACL model down to the cell.
● Finally, we have also contributed transparent server side encryption, as HBASE-7544, for additional assurance against accidental leakage of data at rest, which is at this time an HBase-only feature.
● Auto-manages partitioning
● Storage machinery in the RS
● I like the Latency/Throughput/Read/Write axis in Nick