HBase on Flash, Matt Kennedy, HBaseCon, June 13, 2013. Fusion-io Confidential. Copyright © 2013 Fusion-io, Inc. All rights reserved.

HBaseCon 2013: Apache HBase on Flash

Presented by: Matt Kennedy, Fusion-io

Page 1: HBaseCon 2013: Apache HBase on Flash

HBase on Flash

Matt Kennedy

HBaseCon June 13, 2013

Page 2: HBaseCon 2013: Apache HBase on Flash

Switch your database to flash now. Or you’re doing it wrong.

Brian Bulkowski, Aerospike CTO and co-founder

http://highscalability.com/blog/2012/12/10/switch-your-databases-to-flash-storage-now-or-youre-doing-it.html

Page 3: HBaseCon 2013: Apache HBase on Flash

NAND Flash

+

Page 4: HBaseCon 2013: Apache HBase on Flash

NAND Flash Memory

Page 5: HBaseCon 2013: Apache HBase on Flash

NAND Flash Memory

Flash is a persistent memory technology invented by Dr. Fujio Masuoka at Toshiba in 1980.

[Figure: flash memory cell. Labels: Bit Line, Source Line, Word Line, Control Gate, Float Gate, NPN.]

Page 6: HBaseCon 2013: Apache HBase on Flash

Flash in Servers

Page 7: HBaseCon 2013: Apache HBase on Flash

Direct Cut Through Architecture

[Figure: legacy approach vs. Fusion direct approach. Labels: App, OS, Host CPU, DRAM, PCIe, SAS, RAID Controller, Data path Controller, Super Capacitors (SC), NAND.]

The goal of every I/O operation is to move data between DRAM and flash.

Page 8: HBaseCon 2013: Apache HBase on Flash

NAND Flash

+

Page 9: HBaseCon 2013: Apache HBase on Flash

HBase Options

1. What do we do today?

2. What does HBase look like on flash?

3. What if we cannot go all flash?

Page 10: HBaseCon 2013: Apache HBase on Flash

Conventional HBase Node

▸Key Design Principle:

▸Working Set < DRAM

Page 11: HBaseCon 2013: Apache HBase on Flash

Working Set Getting Bigger

▸Key Design Principle:

▸Working Set < DRAM

Page 12: HBaseCon 2013: Apache HBase on Flash

EEP

▸Key Design Principle:

▸Working Set < DRAM

Page 13: HBaseCon 2013: Apache HBase on Flash

Cost of DRAM Modules

[Figure: chart of DRAM module cost in dollars (y-axis 0 to 1600) for 4 GB, 8 GB, 16 GB, and 32 GB modules.]

Page 14: HBaseCon 2013: Apache HBase on Flash

HBase Server

▸A typical server…

CPU Cores: 32 with HT

Memory: 128 GB

Is your working set larger than 128GB?

Page 15: HBaseCon 2013: Apache HBase on Flash

HBase Cluster

▸With NoSQL Databases, we tend to scale out for DRAM

Combined Resources

CPU Cores: 96

Memory: 384 GB

More cores than needed to serve reads and writes.
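
The scale-out arithmetic behind this slide can be sketched as follows. This is a minimal illustration, not a sizing tool: it assumes every node matches the typical server described earlier (32 cores, 128 GB) and that all of a node's DRAM counts toward the working set. The function name and the 300 GB working set are hypothetical.

```python
import math

CORES_PER_NODE = 32       # from the "typical server" slide
DRAM_PER_NODE_GB = 128    # from the "typical server" slide


def cluster_for_working_set(working_set_gb):
    """Size a cluster by DRAM alone and report the CPU cores that come along.

    Simplifying assumption: all DRAM on a node is available for the working set.
    """
    nodes = math.ceil(working_set_gb / DRAM_PER_NODE_GB)
    return {
        "nodes": nodes,
        "memory_gb": nodes * DRAM_PER_NODE_GB,
        "cpu_cores": nodes * CORES_PER_NODE,
    }


# Hypothetical 300 GB working set: three nodes, which matches the combined
# resources on the slide (96 cores, 384 GB).
sizing = cluster_for_working_set(300)
print(sizing)
```

Because the node count is driven only by memory, the core count grows with it whether or not the read and write load needs those cores.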

Page 16: HBaseCon 2013: Apache HBase on Flash

The All Flash Option

[Figure: HBase node diagrams. Labels: Region Server Daemon, Memory Store, Data Node Daemon, Storage Directories, Disk, Flash.]

Page 17: HBaseCon 2013: Apache HBase on Flash

Short-Circuit Reads Matter on Flash!

[Figure: read ops/sec, DataNode reads vs. short-circuit reads; y-axis 0 to 7000, x-axis 0 to 800.]

Page 18: HBaseCon 2013: Apache HBase on Flash

Short-Circuit Reads (Datanode Bypass)

▸ In hdfs-site.xml

<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>

▸ In hbase-site.xml

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>

<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>true</value>
</property>
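
One way to check that a configuration file actually carries the properties above is sketched below. It is a minimal illustration using only the Python standard library; the helper names and the sample file content are hypothetical, and the `<configuration>` root element is the usual wrapper for Hadoop-style config files rather than something shown on the slide.

```python
import xml.etree.ElementTree as ET

# Settings taken from the slide; values are strings, as in the XML.
EXPECTED_HDFS = {"dfs.block.local-path-access.user": "hbase"}
EXPECTED_HBASE = {
    "dfs.client.read.shortcircuit": "true",
    "hbase.regionserver.checksum.verify": "true",
}


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_settings(props, expected):
    """Return the expected settings that are absent or set differently."""
    return {k: v for k, v in expected.items() if props.get(k) != v}


# Hypothetical hbase-site.xml content that sets only one of the two properties.
sample_hbase_site = """
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
</configuration>
"""

print(missing_settings(read_properties(sample_hbase_site), EXPECTED_HBASE))
```

The same two helpers can be pointed at hdfs-site.xml with `EXPECTED_HDFS` to confirm the local-path-access user.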

Page 19: HBaseCon 2013: Apache HBase on Flash

YCSB Suite – Uniform Distribution

[Figure: mixed workload operations for the 50/50 R/W, 95/5 R/W, and read-only workloads; y-axis 0 to 140000, x-axis 10 to 880.]

Page 20: HBaseCon 2013: Apache HBase on Flash

YCSB Suite Latency – Uniform Distribution

| Workload | Average Latency | 95th Percentile Latency | 99th Percentile Latency |
|---|---|---|---|
| 50/50 Read/Write, Update Latency | 81 µs | 0 ms | 0 ms |
| 50/50 Read/Write, Read Latency | 13.5 ms | 34 ms | 128 ms |
| 95/5 Read/Write, Update Latency | 69.3 µs | 0 ms | 0 ms |
| 95/5 Read/Write, Read Latency | 8.5 ms | 26 ms | 39 ms |
| Read-only | 9.2 ms | 26 ms | 38 ms |

Page 21: HBaseCon 2013: Apache HBase on Flash

Write Amplification

| Workload Type | Amplification Factor |
|---|---|
| Bulk Load | 14.8 |
| Normal Operations (80/20 update/insert split) | 4.2 |

Amplification Factor = Physical Bytes Written / Workload Bytes Written
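
The ratio above can be worked through in a few lines. This is a minimal sketch: the function names and the 100 GiB workload are hypothetical, and only the two factors (14.8 and 4.2) come from the slide.

```python
GIB = 1024 ** 3


def amplification_factor(physical_bytes_written, workload_bytes_written):
    """Physical bytes written divided by workload bytes written."""
    if workload_bytes_written <= 0:
        raise ValueError("workload_bytes_written must be positive")
    return physical_bytes_written / workload_bytes_written


def physical_bytes(workload_bytes_written, factor):
    """Invert the ratio: bytes that reach the device for a given workload."""
    return workload_bytes_written * factor


# Hypothetical 100 GiB of application writes at the factors from the slide.
for label, factor in (("Bulk Load", 14.8), ("Normal Operations", 4.2)):
    written = physical_bytes(100 * GIB, factor)
    print(f"{label}: {written / GIB:.0f} GiB physically written")
```

Reading the factor this way makes the endurance cost visible: the device absorbs a multiple of what the application believes it wrote.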

Page 22: HBaseCon 2013: Apache HBase on Flash

The HBase BucketCache (HBASE-7404)

Committed to HBase trunk. Will be in the 0.96 release; a backport patch for 0.94 is available.

+

https://issues.apache.org/jira/browse/HBASE-7404

Page 23: HBaseCon 2013: Apache HBase on Flash

BucketCache Configuration

▸ In hbase-site.xml

<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>file:/path/to/bucketcache.dat</value>
</property>

<property>
  <name>hbase.bucketcache.size</name>
  <!-- 2TB: unit is MB -->
  <value>2097152</value>
</property>
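
Since the size value is given in MB, the number on the slide can be derived rather than typed by hand. The sketch below is a minimal illustration with hypothetical helper names; it assumes binary units (1 TB = 1024 × 1024 MB), which is what the slide's 2 TB value implies.

```python
MB_PER_GB = 1024
MB_PER_TB = 1024 * 1024


def bucketcache_size_mb(terabytes=0, gigabytes=0):
    """Convert a desired cache capacity to the MB value used on the slide."""
    return int(terabytes * MB_PER_TB + gigabytes * MB_PER_GB)


def size_property_xml(size_mb):
    """Render the hbase.bucketcache.size property for hbase-site.xml."""
    return (
        "<property>\n"
        "  <name>hbase.bucketcache.size</name>\n"
        f"  <value>{size_mb}</value>\n"
        "</property>"
    )


# 2 TB, as on the slide: 2 * 1024 * 1024 = 2097152 MB.
print(size_property_xml(bucketcache_size_mb(terabytes=2)))
```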

Page 24: HBaseCon 2013: Apache HBase on Flash

BucketCache Warm-up

[Figure: read ops/sec during cache warm-up; y-axis 0 to 50000.]

Page 25: HBaseCon 2013: Apache HBase on Flash

BucketCache Steady-State

[Figure: read ops/sec at steady state; y-axis 0 to 60000.]

Avg Latency: 5.2 ms

95th Percentile: 21 ms

99th Percentile: 24 ms

Page 26: HBaseCon 2013: Apache HBase on Flash

BucketCache 50% Read, 50% Update

[Figure: ops/sec for the mixed workload; y-axis 0 to 120000; series label: read ops/sec.]

Update Latency

Average: 11.7 µs

95th Pctl: 0 ms

99th Pctl: 0 ms

Read Latency

Average: 7.9 ms

95th Pctl: 34 ms

99th Pctl: 65 ms

Page 27: HBaseCon 2013: Apache HBase on Flash

BucketCache during Compaction

[Figure: read ops/sec under compaction; y-axis 0 to 60000.]

Avg Latency: 7.8 ms

95th Percentile: 37 ms

99th Percentile: 61 ms

Page 28: HBaseCon 2013: Apache HBase on Flash

What Next?

1. Can we do something about that write amplification?

2. There is minimal penalty to in-place updates in flash on modern FTLs; can we devise a way to do this that is compatible with HDFS?

3. Does HDFS need to be more aware of different storage technologies? (DRAM, Flash, PCM)

Page 29: HBaseCon 2013: Apache HBase on Flash

fusionio.com | REDEFINE WHAT’S POSSIBLE

THANK YOU

Page 30: HBaseCon 2013: Apache HBase on Flash

Performance

Page 31: HBaseCon 2013: Apache HBase on Flash

Performance