Apache HBase Overview (2016-04-27)

민형기

About Me

• Samsung SDS: lecturer for the SSA BigData Analysis expert course
  • Sessions 1 to 3 (2013.02 ~ 2014.02)
  • http://www.slideshare.net/hypermin/presentations

• SNDB: 2012 ~ 2013.02 (Cassandra)
  • Samsung NoSQL DB: Cassandra with an AWS DynamoDB-compatible API

• M-Project: 2014.02 ~ 2016.03 (HBase)
  • Purpose: mail store, queue store
  • Work applied:
    • C++ HBase client
    • Secondary index / region colocation
    • HBase Thrift server customization
    • HTrace applied to HBase

Contents

• Overview

• Data Model

• Architecture

• Use Case

• Access API

What is HBase

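• NoSQL store built on Hadoop
  • Implemented from Google's Bigtable paper
  • Column families, no fixed schema
  • No joins and no secondary indexes
  • Atomic only for single-row operations
    • No transactions across multiple rows

• Distributed data store
  • Data is spread across many servers
  • Frequently accessed data is cached in memory

• Data durability
  • Uses HDFS as the physical storage layer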

Why Use HBase?

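• HBase is a proven, widely adopted tool
  • Tightly integrated with the Hadoop ecosystem
  • Automatic failover
  • In use at most major data-driven companies

• Scales linearly
  • Can store TB/PB-scale data
  • High throughput for large numbers of requests
  • Strong random read/write performance
  • Can store unstructured or variable-column data
  • Highly distributed

• CAP theorem: CP
  • Guarantees read/write consistency
  • Competitor: Cassandra (AP)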

HBase History

Milestones
• 2006.11: BigTable paper
• 2007.02, 2007.10: Hadoop's contrib
• 2008.01: Hadoop's sub project
• 2010.05: Apache top-level project

Releases
• 0.18.1: 2008.10
• 0.19.0: 2009.01
• 0.20.0: 2009.09
• 0.90.0: 2011.01
• 0.92.0: 2012.01
• 0.94.0: 2012.05
• 0.95.0: 2013.01
• 0.96.0: 2013.10
• 0.98.0: 2014.02
• 0.99.0: 2014.09
• 1.0.0: 2015.02
• 1.1.0: 2015.05
• 1.2.0: 2016.02
• 1.1.4: 2016.03

References:
• http://www.tutorialspoint.com/hbase/hbase_overview.htm
• https://blogs.apache.org/hbase/entry/start_of_a_new_era
• https://issues.apache.org/jira/browse/HBASE

Who Uses HBase

• Facebook
• Adobe
• Netflix
• Twitter
• Yahoo
• Kakao
• Line
• eBay
• Bloomberg
• Huawei

NoSQL Classification Chart

https://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/

http://nosql.mypopescu.com/post/2335288905/nosql-databases-genealogy

Data Model - Key Terms

Term: Description

• Table
  - Made up of rows and columns; has n column families
• Row
  - A row key (identifier) plus any number of columns
  - Rows are sorted lexicographically by row key
• Column Family
  - A group of columns; all of its columns share the same prefix (e.g. "cf:q")
  - The unit of storage in HFiles and the unit declared at table creation
• Column Qualifier
  - The actual column name
  - Column names can be dynamic
  - Holds n versions (the timestamp is used as the version value)
• Cell
  - <rowkey, column family, column qualifier, version>
  - Sorted lexicographically
• Value
  - The value of a cell; a byte array

https://www.safaribooksonline.com/library/view/hadoop-essentials/9781784396688/ch05s04.html

Row Key | Timestamp | person:name | person:phone | home:phone | home:addr1 | home:addr2
A0001 | t1 | 홍길동 | 010-1234-5678 | 02-3345-0001 | 서울시 강서구 | 가양동
A0002 | t2 | 설현 | 010-3333-5555 | 02-3345-0002 | 서울시 양천구 | 목동
A0003 | t2 | 수지 | 010-3333-5556 | 02-3345-0003 | 서울시 강남구 | 청담동
A0003 | t5 | - | 010-3333-5678 | - | - | -
D0001 | t3 | 유시진 | - | 02-3345-0004 | 서울시 광진구 | 구의동
D0002 | t3 | 서대영 | - | - | - | -
D0003 | t4 | 강모연 | - | - | - | -
D0004 | t4 | 윤명주 | - | - | - | -

Data Model - Conceptual View

Cells: every cell can hold multiple versions

Row Key | Timestamp | person:name | person:phone | home:phone | home:addr1 | home:addr2
A0001 | t1 | 홍길동 | 010-1234-5678 | 02-3345-0001 | 서울시 강서구 | 가양동
A0002 | t2 | 설현 | 010-3333-5555 | 02-3345-0002 | 서울시 양천구 | 목동
A0003 | t2 | 수지 | 010-3333-5556 | 02-3345-0003 | 서울시 강남구 | 청담동
A0003 | t5 | - | 010-3333-5678 | - | - | -
D0101 | t3 | 유시진 | - | 02-3345-0004 | 서울시 광진구 | 구의동
D0102 | t3 | 서대영 | - | - | - | -
D0103 | t4 | 강모연 | - | - | - | -
D0104 | t4 | 윤명주 | - | - | - | -
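
The conceptual view can be read as a sorted, nested map: row key, then "family:qualifier", then timestamp, then value. The sketch below is a toy Python model of that idea, not HBase code. The names `ToyTable`, `put`, `get`, and `row_keys` are invented for illustration, integers stand in for the timestamps t1 to t5, and plain strings stand in for the byte arrays HBase actually stores.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of the data model: rowkey -> "cf:q" -> {timestamp: value}."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # rows[rowkey][column][timestamp] = value
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, rowkey, column, timestamp, value):
        versions = self.rows[rowkey][column]
        versions[timestamp] = value
        # Keep only the newest max_versions timestamps for this cell.
        for old in sorted(versions, reverse=True)[self.max_versions:]:
            del versions[old]

    def get(self, rowkey, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self.rows.get(rowkey, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def row_keys(self):
        # Rows are exposed in lexicographic row-key order.
        return sorted(self.rows)


table = ToyTable()
table.put("A0003", "person:name", 2, "수지")
table.put("A0003", "person:phone", 2, "010-3333-5556")
table.put("A0003", "person:phone", 5, "010-3333-5678")  # newer version of the same cell
table.put("A0001", "person:name", 1, "홍길동")

print(table.get("A0003", "person:phone"))               # newest version
print(table.get("A0003", "person:phone", timestamp=2))  # version as of t2
print(table.row_keys())
```

Row A0003 shows why one row key can appear twice in the table: the second line is only a newer version of `person:phone`, and the older value stays readable by timestamp.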

Data Model - Physical View

[Figure: physical layout of the table; labels: Region 1, Region 3, "Separate directories"]

Data Model - Distribution

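HBase Architecture

Term: Description

• Table
  - Made up of regions
• Region
  - A subset of rows, defined by a row-key range
  - The unit of load balancing
• Region Server
  - Where regions are hosted in order to be served
  - Serves one or more regions
• Master Server
  - Responsible for managing the HBase cluster (the region servers)
• HDFS
  - HBase data is stored through HDFS
  - Relies on the high availability and fault tolerance features of HDFS
• ZooKeeper
  - Used for distributed coordination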

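HBase Components

• Region server
  • The server that hosts regions
  • Responsible for reads and writes

• Region
  • The basic unit of scaling in HBase
  • A subset of a table
  • Handles (and stores) all data within its row range

• Master
  • Coordinates the HBase cluster
  • Assigns and rebalances regions
  • Handles admin requests

[Diagram: client, ZooKeeper, master server, and three region servers (three regions each) on top of HDFS. Arrow labels: region location lookup; table create, delete, and alter requests]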

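Master (HBase Master)

• Manages regions and their locations
  • Brings up new instances

• Assigns regions to region servers

• Balances workload
  • Moves regions to spread the load after regions split

• Recovers from a region server failure (moves its regions to other region servers)

• Uses ZooKeeper as the distributed coordination service

• Clients communicate directly with region servers

• Manages and changes the schema (DDL)
  • Adding/dropping tables, adding/dropping column families, and so on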

Region Server

• Manages regions

• Handles client read/write requests

• Records client write requests in the WAL (write-ahead log)

• Carries out region splits and reports them to the master

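ZooKeeper

• HBase uses ZooKeeper for region assignment

• Handles master election and tracks server availability

• Provides region key ranges and region server addresses
  • Each region server creates an ephemeral node

• The master detects the deletion of a region server's node and starts failure recovery

• Lets clients identify the region and the region server location for a request

• Guarantees a consistent view of this data to connecting clients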

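Regions

• A region manages a subset of a table
  • Each region handles the data in one range of the sorted row keys
  • Row key range: [start key, stop key)
  • Example: region 1: [, 1000), region 2: [1000, 2000), region 3: [2000, )

• Each region server serves n regions

• A region splits once it grows past a certain size
  • Initially a table has only one region
  • When the data exceeds the maximum size configured for the region, the region is split at the midpoint of its range

[Diagram: regions distributed across region server #1 and region server #2; labels: Table A region 1, Table B region 2, Table C region 1, Table A region 2, Table B region 1, Table B region 3]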
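
Because every region covers a half-open range [start key, stop key) of the sorted row keys, finding the region for a row key is a binary search over the start keys. The sketch below illustrates this with the three example ranges; it is a toy model, and `REGIONS` and `locate_region` are hypothetical names, not part of any HBase API. It also assumes fixed-width keys, since keys compare lexicographically and "999" would sort after "1000".

```python
import bisect

# Example regions, each covering [start_key, stop_key).
# "" as a start key means "from the beginning"; None as a stop key means "no upper bound".
REGIONS = [
    ("", "1000", "region 1"),
    ("1000", "2000", "region 2"),
    ("2000", None, "region 3"),
]
START_KEYS = [start for start, _, _ in REGIONS]


def locate_region(rowkey):
    """Return the name of the region whose [start, stop) range holds rowkey."""
    # Pick the rightmost region whose start key is <= rowkey.
    index = bisect.bisect_right(START_KEYS, rowkey) - 1
    start, stop, name = REGIONS[index]
    assert start <= rowkey and (stop is None or rowkey < stop)
    return name


print(locate_region("0500"))
print(locate_region("1000"))  # the start key is inclusive
print(locate_region("1999"))  # the stop key is exclusive
print(locate_region("2500"))
```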

Regions

• Region server
  • HLog (WAL): log of changes
  • BlockCache: LRU cache that caches HFile blocks
  • Region (a row-key range of a specific table)
    • Store (one per column family)
      • MemStore: a Put is first held in the MemStore; when the MemStore fills up, it is flushed to a file in HFile format and the MemStore is emptied
      • StoreFile (HFile), one or more per store
      • Data is kept as a list of <rowkey, cf, q, ver, value> entries

• File system (HDFS): files

HBase I/O

[Diagram: HBase I/O paths. Components: WAL, MemStore (memory), HFiles (HDFS). Legend: write operation, read operation, flush operation, compaction operation (compaction / major compaction)]
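
The write, flush, and read paths in the diagram can be sketched as a small in-memory model. This is a simplified illustration under stated assumptions, not how HBase is implemented: `ToyStore` and its methods are invented names, the flush threshold counts entries where HBase flushes by size, and the BlockCache and WAL cleanup are left out.

```python
class ToyStore:
    """Toy model of one store: WAL and MemStore in memory, immutable sorted files."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # max entries held in the MemStore
        self.wal = []        # append-only change log (stands in for the WAL)
        self.memstore = {}   # in-memory write buffer
        self.hfiles = []     # flushed files, oldest first; each is a sorted list of pairs

    def put(self, key, value):
        self.wal.append((key, value))   # 1. record the change in the log first
        self.memstore[key] = value      # 2. then apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the MemStore out as one sorted, immutable file and empty it.
        if self.memstore:
            self.hfiles.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, key):
        # Newest data wins: check the MemStore, then files from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            for k, v in hfile:
                if k == key:
                    return v
        return None


store = ToyStore(flush_threshold=2)
store.put("a", 1)
store.put("b", 2)   # MemStore is full, so it is flushed to the first file
store.put("a", 3)   # newer value for "a" stays in the MemStore
print(store.get("a"), store.get("b"), len(store.hfiles), len(store.wal))
```

A read may have to consult several files, which is why the number of files per store matters and why compaction exists.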

Compaction

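• Minor compaction
  • Merges smaller HFiles into a larger HFile according to the given criteria
  • Criteria: the files to merge are chosen by considering the store's maximum/minimum number of HFiles, the minimum size, the maximum size, and so on
  • Deleted rows and old versions are copied over as-is

• Major compaction
  • Merges all HFiles of a store into a single HFile
  • Deleted rows and old versions are removed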
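
The difference between the two compaction types can be shown with a toy model in which each HFile is a list of (key, timestamp, value) entries and a delete is stored as a marker. The function names, the `TOMBSTONE` marker, and the `max_versions` parameter are assumptions made for this sketch only; file selection for minor compaction is not modeled.

```python
TOMBSTONE = None  # hypothetical delete marker used only in this sketch


def minor_compact(hfiles):
    """Merge several files into one sorted file, copying every entry as-is."""
    merged = [entry for hfile in hfiles for entry in hfile]
    # Sort by key, then newest timestamp first; on a tie the delete marker comes first.
    return sorted(merged, key=lambda e: (e[0], -e[1], e[2] is not TOMBSTONE))


def major_compact(hfiles, max_versions=1):
    """Merge all files into one, dropping deleted data and old versions."""
    result = []
    kept = {}        # key -> number of versions kept so far
    deleted_at = {}  # key -> timestamp of the newest delete marker seen
    for key, ts, value in minor_compact(hfiles):
        if value is TOMBSTONE:
            deleted_at.setdefault(key, ts)
            continue
        if key in deleted_at and ts <= deleted_at[key]:
            continue  # masked by a newer delete marker
        if kept.get(key, 0) < max_versions:
            result.append((key, ts, value))
            kept[key] = kept.get(key, 0) + 1
    return result


older = [("row1", 1, "a"), ("row2", 1, "b")]
newer = [("row1", 2, "a2"), ("row2", 3, TOMBSTONE)]

print(minor_compact([older, newer]))  # four entries, delete marker included
print(major_compact([older, newer]))  # only the live, newest entry remains
```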

HBase Architecture - Overview

http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html

HBase Architecture - HFile and KeyValue

http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html

HBase Architecture - Write-Ahead-Log

HBase Architecture - WAL (cont.)


Use Cases: Kinds of Apps Built with HBase

• Write Heavy
• Low-Latency
• Search / Indexing
• Messaging
• Audit / Log Archive
• Advertising
• Data Cubes
• Time Series
• Sensor / Device

Hortonworks Technical Workshop: HBase For Mission Critical Applications (2015-05-05)
http://www.slideshare.net/hortonworks/h-base-mission-critical-apps-final

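Use Cases: Facebook Messaging System

• New Message Service (2010-11-16)
  • A messaging system that combines chat, SMS, and email for real-time conversation
  • Data patterns
    • Small, temporary pieces of data (tends to be volatile)
    • A continuously growing data set that is rarely accessed
  • Data scale: 300 million users must be able to send 120B messages per month

• Why HBase was chosen
  • Ease of operation: offers a simpler consistency model than Cassandra
  • Fits the messaging workload and offers very good scalability and performance
  • Rich feature set (automatic load balancing, failover, compression support, multiple shards per server, and so on)
  • Stable HDFS
    • HDFS (the file system underneath HBase) supports replication, end-to-end checksums, and automatic rebalancing
    • Facebook's operations team already had HDFS operational know-how

The Underlying Technology of Messages (2010-11-16)
https://www.facebook.com/notes/facebook-engineering/the-underlying-technology-of-messages/454991608919/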

Use Cases: LINE Storage

• Requirements
  • Initially built on Redis, which could not keep up with the explosive growth
  • O(n):
    • Services: User Profile, Contacts / Groups
    • Workload: fast random reads
  • O(n*t):
    • Service: Messages in Inbox
    • Scalability, massive volume (billions of small rows per day, but mostly cold data)
    • Workload: fast sequential writes (append-only) and fast reads of the latest data

• Why HBase was chosen
  • Selected after comparing three NoSQL stores (HBase, Cassandra, MongoDB)
  • Best fit for LINE's workload requirements
  • Ease of operation

LINE Storage: http://developers.linecorp.com/blog/?p=1420 (2012.04.26)

Use Cases: Xiaomi

• Mi Cloud: personal cloud storage for smartphones
  • 90+ million users, 3X increase in 2014
  • 500 billion rows, 6X increase in 2014
  • 1000+ regions in the largest table

• Mi Push: push service on Android
  • Data stored in HBase
    • Pub-sub relations of topics and devices
    • Messages to each device
  • Numbers
    • 200+ million users
    • 2 billion+ messages pushed every day
    • 200,000+ requests per second at peak

HBaseCon 2015: HBase Operations at Xiaomi - http://www.slideshare.net/HBaseCon/operations-session-6-49043532

HBase Access

• HBase Shell

• Native Java API (hbase-client)
  • The fastest option, with the richest set of features

• REST Server
  • Requires a running REST server
  • Supports multiple formats: text, XML, JSON, protocol buffers, raw binary

• Thrift Server
  • Requires a running Thrift server
  • Enables clients in any language that Thrift supports (C++, Python, and so on)

• AsyncHBase
  • Asynchronous, non-blocking, thread-safe HBase client
  • Third-party API
  • https://github.com/stumbleupon/asynchbase

HBase Access - API

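• HBase Admin
  • create, drop, truncate, alter, list, describe, disable table, and so on

• HTable
  • Get: reads the row for a given row key
  • Put: adds or updates data for a given row key
  • Append: appends data to column values of an existing row
  • Delete: deletes the row for a given row key
  • Scan: reads the rows whose row keys fall within a given range
  • Increment: increments the value of a given cell
  • Batch: bundles several operations into one request
  • checkAndPut: compares the value of a given cell, then writes if it matches
  • checkAndDelete: compares the value of a given cell, then deletes if it matches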
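
Increment and checkAndPut differ from a plain Put in that the read and the write happen as one atomic step on the row. The toy class below illustrates only those semantics. It is not the HBase client API: `ToyHTable` and its snake_case methods are invented for this sketch, and a single lock stands in for per-row locking.

```python
import threading


class ToyHTable:
    """Toy stand-in for a table handle; illustrates semantics only."""

    def __init__(self):
        self._cells = {}               # (rowkey, column) -> value
        self._lock = threading.Lock()  # one lock for brevity instead of one per row

    def put(self, rowkey, column, value):
        with self._lock:
            self._cells[(rowkey, column)] = value

    def get(self, rowkey, column):
        with self._lock:
            return self._cells.get((rowkey, column))

    def increment(self, rowkey, column, amount=1):
        # Read-modify-write as one atomic step; returns the new value.
        with self._lock:
            new_value = self._cells.get((rowkey, column), 0) + amount
            self._cells[(rowkey, column)] = new_value
            return new_value

    def check_and_put(self, rowkey, column, expected, new_value):
        # Write only if the current value equals `expected`; report whether it did.
        with self._lock:
            if self._cells.get((rowkey, column)) != expected:
                return False
            self._cells[(rowkey, column)] = new_value
            return True


htable = ToyHTable()
htable.put("user1", "info:state", "new")
first = htable.check_and_put("user1", "info:state", "new", "active")    # value matches
second = htable.check_and_put("user1", "info:state", "new", "closed")   # value changed, so refused
htable.increment("user1", "stats:logins")
logins = htable.increment("user1", "stats:logins")
print(first, second, htable.get("user1", "info:state"), logins)
```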

Appendix.

HBase

• An open-source, distributed, versioned, non-relational database modeled after Google's Bigtable
  • BigTable paper from Google, 2006
  • "Bigtable is a sparse, distributed, persistent multi-dimensional sorted map"

• Key Features
  • Distributed data store spread over many machines in a cluster
  • Random, online read and write access to data
  • Schemaless data model ("NoSQL")
  • Self-managing data partitions

Apache Hadoop in Review

• Apache Hadoop Distributed Filesystem (HDFS)
  • Distributed, fault-tolerant, throughput-optimized data storage
  • Uses a filesystem analogy, not structured tables
  • The Google File System, 2003, Ghemawat et al.
  • http://research.google.com/archive/gfs.html

• Apache Hadoop MapReduce (MR)
  • Distributed, fault-tolerant, batch-oriented data processing
  • Line- or record-oriented processing of the entire dataset
  • "[Application] schema on read"
  • MapReduce: Simplified Data Processing on Large Clusters, 2004, Dean and Ghemawat
  • http://research.google.com/archive/mapreduce.html

HBase vs. BigTable Terminology

HBase | BigTable
Table | Table
Region | Tablet
RegionServer | Tablet Server
MemStore | Memtable
HFile | SSTable
WAL | Commit Log
Flush | Minor compaction
Minor Compaction | Merging compaction
Major Compaction | Major compaction
HDFS | GFS
MapReduce | MapReduce
ZooKeeper | Chubby

Tall Narrow vs. Flat-Wide Tables

<userId> : <colfam> : <messageId> : <timestamp> : <email-message>

12345 : data : 5fc38314-e290-ae5da5fc375d : 1307097848 : "Hi Lars, ..."

12345 : data : 725aae5f-d72e-f90f3f070419 : 1307099848 : "Welcome, and ..."

12345 : data : cc6775b3-f249-c6dd2b1a7467 : 1307101848 : "To Whom It ..."

12345 : data : dcbee495-6d5e-6ed48124632c : 1307103848 : "Hi, how are ..."

or

<userId>-<messageId> : <colfam> : <qualifier> : <timestamp> : <email-message>

12345-5fc38314-e290-ae5da5fc375d : data : : 1307097848 : "Hi Lars, ..."

12345-725aae5f-d72e-f90f3f070419 : data : : 1307099848 : "Welcome, and ..."

12345-cc6775b3-f249-c6dd2b1a7467 : data : : 1307101848 : "To Whom It ..."

12345-dcbee495-6d5e-6ed48124632c : data : : 1307103848 : "Hi, how are ..."

Same Storage Requirements
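
The two layouts hold the same cells; only the position of the message ID changes, from the column qualifier (flat-wide) to the row key (tall-narrow). A minimal sketch of that conversion follows, using nested dicts as a stand-in for a table. The function name `to_tall_narrow` and the dict layout are assumptions of this example, and timestamps are left out for brevity.

```python
# Flat-wide: one row per user, one column per message.
flat_wide = {
    "12345": {
        "data:5fc38314-e290-ae5da5fc375d": "Hi Lars, ...",
        "data:725aae5f-d72e-f90f3f070419": "Welcome, and ...",
    }
}


def to_tall_narrow(table):
    """Move the message ID from the column qualifier into the row key."""
    tall = {}
    for user_id, columns in table.items():
        for column, message in columns.items():
            family, message_id = column.split(":", 1)
            # The qualifier is now empty, as in "<colfam> : <qualifier>" with no qualifier.
            tall[f"{user_id}-{message_id}"] = {f"{family}:": message}
    return tall


tall_narrow = to_tall_narrow(flat_wide)
for rowkey in sorted(tall_narrow):
    print(rowkey, tall_narrow[rowkey])
```

One practical difference: single-row atomicity covers all of a user's messages in the flat-wide layout but only one message in the tall-narrow layout, while the tall-narrow rows can be distributed across regions.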

Partial Key Scans

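Key | Description
<userId> | Scan all messages for the given userId
<userId>-<date> | Scan all messages the given userId received on a specific date
<userId>-<date>-<messageId> | Scan all parts of a specific message the given userId received on a specific date
<userId>-<date>-<messageId>-<attachmentId> | Scan all attachments of a specific message the given userId received on a specific date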
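
A partial key scan works because row keys are sorted: every key that shares a prefix sits in one contiguous range, so the scan only needs a start key and a stop key. The sketch below shows the idea on a sorted Python list. The sample keys, the `prefix_scan` helper, and the way the stop key is derived (incrementing the last character of the prefix) are illustrative assumptions, not HBase client code.

```python
import bisect

# Sorted row keys of the form <userId>-<date>-<messageId>-<attachmentId>.
# The IDs and dates are made-up sample values.
ROW_KEYS = sorted([
    "12345-20160426-m001-a1",
    "12345-20160427-m002-a1",
    "12345-20160427-m002-a2",
    "12345-20160427-m003-a1",
    "12346-20160427-m009-a1",
])


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with a non-empty `prefix` via a [start, stop) range."""
    # Stop key: the prefix with its last character incremented by one.
    stop_key = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    start = bisect.bisect_left(sorted_keys, prefix)
    stop = bisect.bisect_left(sorted_keys, stop_key)
    return sorted_keys[start:stop]


print(prefix_scan(ROW_KEYS, "12345-"))                # all messages of user 12345
print(prefix_scan(ROW_KEYS, "12345-20160427-"))       # that user's messages on one date
print(prefix_scan(ROW_KEYS, "12345-20160427-m002-"))  # attachments of one message
```

Ending each prefix with the "-" delimiter keeps a scan for user 12345 from also matching a longer ID such as 123456.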

Fold, Store, and Shift

Reference.

• HBase: Just the Basics
  http://www.slideshare.net/HBaseCon/hbase-just-the-basics
• HBASE Overview
  http://www.slideshare.net/sampathrachakonda7/hbase-overview-33312898
• Introduction to Apache HBase
  http://www.slideshare.net/gokuul/introduction-to-apache-hbase
• Hadoop Tutorial: HBase Part 1 - Overview
  http://www.slideshare.net/martyhall/hadoop-tutorial-hbase-part-1-overview
• HBase 훑어보기 (A Quick Look at HBase)
  http://www.slideshare.net/madvirus/hbase-29278429
• HBase for Architects
  http://www.slideshare.net/xefyr/h-base-for-architectspptx
• HBase Storage Internals
  http://www.slideshare.net/Hadoop_Summit/hbase-storage-internals
• HBase Reference Guide
  http://hbase.apache.org/book.html
• HBase Architecture 101 - Storage
  http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
• HBase Architecture 101 - Write-ahead-Log
  http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
• Hortonworks Technical Workshop: HBase For Mission Critical Applications
  http://www.slideshare.net/hortonworks/h-base-mission-critical-apps-final