Optimizing HBase scanner performance

Mikhail Bautin, Software Engineer
01/19/2012

Page 3: Optimizing HBase scanner performance

HBase Scanners
What happens on a Get

[Diagram: a RegionScanner holds one StoreScanner per column family (ColumnFamily1, ColumnFamily2, . . .), where Store = (Region, CF). Each StoreScanner holds several StoreFileScanners. Example KeyValues shown in the store files:]

(R1,C1,T3) (R1,C2,T2) (R1,C2,T1)
(R1,C1,T1) (R1,C2,T3) (R2,C1,T2)
(R2,C2,T1) . . .

Page 4: Optimizing HBase scanner performance

HBase Scanner State
What happens on a next()

[Diagram: a RegionScanner holds one StoreScanner per column family (ColumnFamily1, ColumnFamily2, . . .), where Store = (Region, CF). Each StoreFileScanner holds its current KeyValue. Each StoreScanner keeps its StoreFileScanners in a priority queue, and the RegionScanner keeps its StoreScanners in a priority queue.]
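The priority-queue arrangement can be sketched as a k-way merge over scanners that each expose a current KeyValue. This is a minimal illustration, not HBase's implementation: `ListScanner`, `sort_key`, and `merged_scan` are hypothetical names, KeyValues are plain `(row, column, timestamp)` tuples, and timestamps are integers.

```python
import heapq

def sort_key(kv):
    """Order by row, then column, then newest timestamp first."""
    row, col, ts = kv
    return (row, col, -ts)

class ListScanner:
    """Hypothetical stand-in for a StoreFileScanner over an in-memory list."""
    def __init__(self, kvs):
        self._kvs = sorted(kvs, key=sort_key)
        self._pos = 0

    def peek(self):
        """Return the current KeyValue, or None when exhausted."""
        return self._kvs[self._pos] if self._pos < len(self._kvs) else None

    def next(self):
        """Return the current KeyValue and advance past it."""
        kv = self.peek()
        self._pos += 1
        return kv

def merged_scan(scanners):
    """Yield KeyValues from all scanners in global sorted order."""
    heap = []
    for i, s in enumerate(scanners):
        if s.peek() is not None:
            heapq.heappush(heap, (sort_key(s.peek()), i))
    while heap:
        # The scanner whose current KeyValue sorts first is at the top.
        _, i = heapq.heappop(heap)
        yield scanners[i].next()
        if scanners[i].peek() is not None:
            heapq.heappush(heap, (sort_key(scanners[i].peek()), i))
```

In this sketch every `next()` on the merged scan advances exactly one underlying scanner, which is why the cost of a single `next()` on a store file matters so much.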

Page 5: Optimizing HBase scanner performance

Avoiding next() on StoreFileScanner
Every next() call may result in disk I/O

▪ HBASE-4433: avoid extra next if done with row/column (Kannan)
  ▪ An optimization for queries specifying a column set
  ▪ INCLUDE_AND_SEEK_NEXT_COL
  ▪ INCLUDE_AND_SEEK_NEXT_ROW
▪ HBASE-4434: Don't do HFile Scanner next() unless the next KV is needed (Kannan)
  ▪ Avoid aggressive pre-fetching
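A rough sketch of the idea behind the two combined match codes, for a query with an explicit column set. The function `match`, its parameters, and the `SKIP` / `INCLUDE` strings are illustrative assumptions; only the two `INCLUDE_AND_SEEK_*` names come from the slide.

```python
def match(col, wanted, versions_seen, max_versions=1):
    """Return a match code for one KeyValue of column `col`.

    wanted: sorted list of requested columns.
    versions_seen: versions of `col` already included for the current row.
    """
    if col not in wanted:
        return "SKIP"
    if versions_seen + 1 < max_versions:
        return "INCLUDE"
    # This KeyValue completes the column, so the caller can seek right away
    # instead of calling next() just to discover that it is done.
    if col == wanted[-1]:
        return "INCLUDE_AND_SEEK_NEXT_ROW"
    return "INCLUDE_AND_SEEK_NEXT_COL"
```

With a single requested version, the matcher already knows after the first KeyValue of a column that nothing else in that column is needed, so the include and the seek hint are returned together.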

Page 6: Optimizing HBase scanner performance

Simple ROWCOL Bloom Filters
Do we have to read all of these files?

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

Query: (R1, C3)

Page 7: Optimizing HBase scanner performance

Simple ROWCOL Bloom Filters
In some cases, we only have to read one file

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

Query: (R1, C3)
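As a minimal sketch of how a per-file ROWCOL Bloom filter narrows the set of files to read, the example below builds one filter per file over `row/column` keys. Everything here is an assumption for illustration: the `BloomFilter` class, its size and hash count, the key encoding, and the integer timestamps standing in for T1..T4. It is not HBase's Bloom filter format.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter; sizes are illustrative, not HBase defaults."""
    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        """False means definitely absent; True means possibly present."""
        return all((self.bits >> p) & 1 for p in self._positions(key))

def rowcol_bloom(kvs):
    """Build a filter keyed on (row, column), ignoring timestamps."""
    bf = BloomFilter()
    for row, col, _ts in kvs:
        bf.add(f"{row}/{col}")
    return bf

def files_to_read(blooms, row, col):
    """Indices of files whose filter cannot rule out (row, column)."""
    return [i for i, bf in enumerate(blooms) if bf.might_contain(f"{row}/{col}")]

# The three example files from the slides, timestamps as integers.
FILES = [
    [("R1", "C1", 2), ("R1", "C1", 1), ("R1", "C2", 1), ("R2", "C1", 1),
     ("R2", "C2", 2), ("R2", "C2", 1), ("R2", "C3", 1)],
    [("R1", "C1", 3), ("R1", "C2", 3), ("R1", "C3", 2), ("R2", "C1", 2),
     ("R2", "C2", 3)],
    [("R1", "C1", 4), ("R1", "C2", 2), ("R2", "C1", 1)],
]
```

For the query (R1, C3), the second file is always among the candidates because a Bloom filter has no false negatives. The other two files are skipped unless the filter reports a false positive.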

Page 8: Optimizing HBase scanner performance

Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

Query: C1 and C3 in all rows

Page 9: Optimizing HBase scanner performance

Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

Query: C1 and C3 in all rows; seek to (R1, C1)

Page 10: Optimizing HBase scanner performance

Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

Query: C1 and C3 in all rows; seek to (R1, C3)

File 1: Fake key: (R1, end of C3)
File 3: Fake key: (R1, end of C3)

Page 11: Optimizing HBase scanner performance

Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

Query: C1 and C3 in all rows; seek to (R2, C1)

File 1: (R2, C1, T1)
File 2: (R2, C1, T2) wins by timestamp
File 3: (R2, C1, T1)

Page 12: Optimizing HBase scanner performance

Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

Query: C1 and C3 in all rows; seek to (R2, C3)

File 1: (R2, C3, T1)
File 2: Fake key: (R2, end of C3)
File 3: Fake key: (R2, end of C3)
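The "fake key" step can be sketched as follows: when the filter rules out a (row, column) pair, the scanner is parked on a key that sorts after every real version of that column, without touching the file. `FileScanner`, `reads`, and the use of an exact set in place of a real Bloom filter are assumptions made to keep the example short. A real Bloom filter can return false positives, in which case an actual seek would still happen.

```python
import bisect
import math

class FileScanner:
    """Hypothetical store file scanner; positions are (row, col, -ts) sort keys."""
    def __init__(self, kvs):
        self.keys = sorted((r, c, -ts) for r, c, ts in kvs)
        # Stand-in for the ROWCOL Bloom filter (exact, so no false positives).
        self.rowcols = {(r, c) for r, c, _ in kvs}
        self.current = None
        self.reads = 0  # number of simulated seeks into the file

    def seek(self, row, col):
        if (row, col) not in self.rowcols:
            # Column is absent: park on a fake key at the end of (row, col).
            # It sorts after any version of the column and before the next one.
            self.current = (row, col, math.inf)
            return
        self.reads += 1
        i = bisect.bisect_left(self.keys, (row, col, -math.inf))
        self.current = self.keys[i]
```

Because the fake key still orders correctly against the other scanners' current keys, the merge can proceed as if the file had been read and found to contain nothing for that column.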

Page 13: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R1, C1, T2)
File 2 (T2 – T3): Fake key: (R1, C1, T3)
File 3 (T1 – T4): Fake key: (R1, C1, T4)

Page 14: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R1, C1, T2)
File 2 (T2 – T3): Fake key: (R1, C1, T3)
File 3 (T1 – T4): (R1, C1, T4)

Page 15: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R1, C3, T2)
File 2 (T2 – T3): Fake key: (R1, C3, T3)
File 3 (T1 – T4): Fake key: (R1, C3, T4)

Page 16: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R1, C3, T2)
File 2 (T2 – T3): Fake key: (R1, C3, T3)
File 3 (T1 – T4): (R2, C1, T1)

Page 17: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R1, C3, T2)
File 2 (T2 – T3): (R1, C3, T2) is next
File 3 (T1 – T4): (R2, C1, T1)

Page 18: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R2, C1, T2)
File 2 (T2 – T3): Fake key: (R2, C1, T3). To be selected next.
File 3 (T1 – T4): (R2, C1, T1)

Page 19: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R2, C1, T2)
File 2 (T2 – T3): (R2, C1, T2) wins by timestamp
File 3 (T1 – T4): (R2, C1, T1)

Page 20: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R2, C3, T2)
File 2 (T2 – T3): Fake key: (R2, C3, T3)
File 3 (T1 – T4): Fake key: (R2, C3, T4)

Page 21: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): Fake key: (R2, C3, T2)
File 2 (T2 – T3): Real seek to (R2, C3, T3)
File 3 (T1 – T4): EOF

Page 22: Optimizing HBase scanner performance

Lazy Seek (HBASE-4465)
Optimizing for reading recent data

File 1        File 2        File 3
Row Col TS    Row Col TS    Row Col TS
R1  C1  T2    R1  C1  T3    R1  C1  T4
R1  C1  T1    R1  C2  T3    R1  C2  T2
R1  C2  T1    R1  C3  T2    R2  C1  T1
R2  C1  T1    R2  C1  T2
R2  C2  T2    R2  C2  T3
R2  C2  T1
R2  C3  T1

File 1 (T1 – T2): (R2, C3, T1)
File 2 (T2 – T3): EOF
File 3 (T1 – T4): EOF
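The lazy-seek walk-through can be approximated with a heap that mixes fake and real keys. In this sketch each file starts on a fake key built from its own maximum timestamp, and a real (simulated) seek happens only when a fake key reaches the top of the heap. `LazyFileScanner`, `lazy_get_newest`, and the tie-breaking rule (a real key is preferred over an equal fake key) are assumptions for illustration, and timestamps T1..T4 are modelled as the integers 1..4.

```python
import bisect
import heapq

class LazyFileScanner:
    """Hypothetical store file scanner; keys are (row, col, -ts) sort keys."""
    def __init__(self, kvs):
        self.keys = sorted((r, c, -ts) for r, c, ts in kvs)
        self.max_ts = max(ts for _, _, ts in kvs)
        self.real_seeks = 0

    def fake_key(self, row, col):
        # No KeyValue of (row, col) in this file can be newer than max_ts.
        return (row, col, -self.max_ts)

    def real_seek(self, key):
        """Simulated disk seek: first real key >= key, or None at EOF."""
        self.real_seeks += 1
        i = bisect.bisect_left(self.keys, key)
        return self.keys[i] if i < len(self.keys) else None

def lazy_get_newest(scanners, row, col):
    """Return the newest (row, col, ts) across all files, or None."""
    # Heap entries are (key, is_fake, scanner index); real keys win ties.
    heap = [(s.fake_key(row, col), True, i) for i, s in enumerate(scanners)]
    heapq.heapify(heap)
    while heap:
        key, is_fake, i = heapq.heappop(heap)
        if not is_fake:
            r, c, neg_ts = key
            return (r, c, -neg_ts) if (r, c) == (row, col) else None
        # A fake key reached the top: only now pay for the real seek.
        real = scanners[i].real_seek(key)
        if real is not None:
            heapq.heappush(heap, (real, False, i))
    return None

# The three example files from the slides, timestamps as integers.
FILES = [
    [("R1", "C1", 2), ("R1", "C1", 1), ("R1", "C2", 1), ("R2", "C1", 1),
     ("R2", "C2", 2), ("R2", "C2", 1), ("R2", "C3", 1)],
    [("R1", "C1", 3), ("R1", "C2", 3), ("R1", "C3", 2), ("R2", "C1", 2),
     ("R2", "C2", 3)],
    [("R1", "C1", 4), ("R1", "C2", 2), ("R2", "C1", 1)],
]
```

For (R1, C1), only the third file is really sought in this sketch, since its (R1, C1, T4) is newer than anything the other two files could hold. For (R1, C3), the third and second files are sought and the first stays on its fake key.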

Page 23: Optimizing HBase scanner performance

Top-of-the-row seek
Some applications do not use DeleteFamily

▪ We always seek to the top of the row first
  ▪ DeleteFamily comes before all columns, i.e. at (R1, empty column)
  ▪ Even if we only need (R1, C1), there might be a DeleteFamily for R1
▪ Some applications do not even use DeleteFamily
▪ Two fixes by Liyin Tang:
  ▪ Utilize existing ROWCOL Bloom filter (HBASE-4469)
  ▪ Added a separate ROW-only Bloom filter for DeleteFamily (HBASE-4532)
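The decision the DeleteFamily filter enables can be sketched in a few lines. `seek_target` and `delete_family_rows` are hypothetical names, and an exact set stands in for the ROW-only Bloom filter, so this shows only the control flow, not HBase's code.

```python
def seek_target(row, col, delete_family_rows):
    """Pick where to start reading a store file for (row, col).

    delete_family_rows: stand-in for the ROW-only DeleteFamily Bloom filter,
    here an exact set of rows that have a DeleteFamily marker in this file.
    """
    if row in delete_family_rows:
        # The marker sorts at the empty column, so start at the top of the row.
        return (row, "")
    # No marker possible for this row: go straight to the requested column.
    return (row, col)
```

With a real Bloom filter, a false positive would only cause an unnecessary top-of-the-row seek, never a missed delete marker.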

Page 24: Optimizing HBase scanner performance

Seek on deleted KV (HBASE-4585)
What if the requested column has been deleted?

▪ We are requesting C1, C2, ..., Cn
▪ What if we see a delete marker for Ci?
▪ Previously, we would keep calling next()
▪ Now, we seek to the (i + 1)'th requested column (also a fix by Liyin)
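A simplified model of the difference, counting how many KeyValues are touched for one row. The tuple layout `(column, -timestamp, kind, value)`, the function `read_row`, and the `seek_on_delete` flag are assumptions for illustration. The delete marker is modelled as the first entry of its column.

```python
import bisect

def read_row(kvs, wanted, seek_on_delete=True):
    """Return (newest live value per wanted column, KeyValues touched).

    kvs: sorted list of (col, -ts, kind, value) for one row, where kind is
         "Put" or "DeleteColumn".
    wanted: sorted list of requested columns.
    """
    result, touched, pos = {}, 0, 0
    for col in wanted:
        # Seek to the first KeyValue of the requested column.
        pos = bisect.bisect_left(kvs, (col,), pos)
        if pos == len(kvs) or kvs[pos][0] != col:
            continue
        touched += 1
        _, _, kind, value = kvs[pos]
        if kind == "Put":
            result[col] = value
        elif not seek_on_delete:
            # Old behaviour: next() over every remaining version of the column.
            while pos + 1 < len(kvs) and kvs[pos + 1][0] == col:
                pos += 1
                touched += 1
        # New behaviour: fall through, so the next iteration seeks directly
        # to the next requested column.
    return result, touched

ROW = [
    ("C1", -5, "DeleteColumn", ""),
    ("C1", -4, "Put", "a4"),
    ("C1", -3, "Put", "a3"),
    ("C1", -2, "Put", "a2"),
    ("C2", -1, "Put", "b1"),
    ("C3", -2, "Put", "c2"),
]
```

For `wanted = ["C1", "C3"]` on `ROW`, both modes return the same result, but the seeking variant touches 2 KeyValues while the `next()`-based one touches 5.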

Page 25: Optimizing HBase scanner performance

Data block read requests (dark launch)
Thu, Sep 15 – Sun, Sep 25 2011

Pushed on Tue Sep 20th:
• No extra next when done with column/row (HBASE-4433)
• No KV prefetch (HBASE-4434)
• Lazy Seek (HBASE-4465)

Fri Sep 16th vs. Sep 23rd: 45% savings in logical block read requests (cache hits + misses)

Page 26: Optimizing HBase scanner performance

Data block read requests (dark launch)
Sun, Sep 25 – Mon, Oct 3 2011

Pushed on Fri Sep 30th:
• Avoid top-of-the-row seek (HBASE-4469, Liyin)
• Off-peak compactions (HBASE-4463, Karthik)

Sun Sep 25th vs. Oct 2nd: 33% savings in logical block read requests (cache hits + misses)

Page 27: Optimizing HBase scanner performance

Data block cache misses (dark launch)

▪ 20.6 K (Mon Sep 19th) -> 11.8 K (Mon Sep 26th) -> 9.8 K (Mon Oct 3rd)
▪ 52% savings (42% and then 17% more)

• No next KV prefetch
• No next() when done with row/column
• Lazy Seek
• No top-of-the-row seek
• Off-peak compactions
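The percentages follow from the three cache-miss figures. A quick check, using only the rounded values printed on the slide (the helper name `savings` is illustrative):

```python
def savings(before, after):
    """Fractional reduction from `before` to `after`."""
    return (before - after) / before

first = savings(20.6, 11.8)   # Sep 19th -> Sep 26th
second = savings(11.8, 9.8)   # Sep 26th -> Oct 3rd
total = savings(20.6, 9.8)    # Sep 19th -> Oct 3rd

# The two steps compound multiplicatively into the total.
compounded = 1 - (1 - first) * (1 - second)
```

From the rounded inputs, `first` is about 0.427, `second` about 0.169, and `total` about 0.524. That is consistent with the slide's 52% overall and 17% for the second step; the slide reports the first step as 42%.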

Page 28: Optimizing HBase scanner performance

Avoid loading previous block (HBASE-4443)
We sometimes go to previous block on exact match

▪ Future work
▪ Suppose the first key of a block matches (Row, Column)
▪ But maybe there is an earlier key that would also match?
▪ We load the previous block to find out
▪ Possible fixes:
  ▪ Track deletes and optimize the MAX_VERSIONS=1 case
  ▪ Add last key in block to index (increases index size)
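A toy model of the problem and of the "last key in index" fix. Blocks are sorted lists of `(row, col, -ts)` sort keys, and a seek to the top of a column uses `-inf` in the timestamp slot. The function `seek` and the `use_last_keys` flag are hypothetical. The last key is read from the block itself here, whereas the proposed fix would store it in the block index.

```python
import bisect
import math

def seek(blocks, key, use_last_keys=False):
    """Return (first key >= `key` or None, indices of blocks loaded)."""
    first_keys = [b[0] for b in blocks]
    # Last block whose first key is <= the seek key (or block 0).
    j = max(bisect.bisect_right(first_keys, key) - 1, 0)
    loaded = []
    while j < len(blocks):
        if use_last_keys and blocks[j][-1] < key:
            j += 1  # the index alone proves this block ends before the key
            continue
        loaded.append(j)
        i = bisect.bisect_left(blocks[j], key)
        if i < len(blocks[j]):
            return blocks[j][i], loaded
        j += 1
    return None, loaded

BLOCKS = [
    [("R1", "C1", -2), ("R1", "C2", -1)],
    [("R2", "C1", -3), ("R2", "C1", -1)],  # first key matches (R2, C1)
]
```

Seeking to the top of (R2, C1) in `BLOCKS` loads both blocks without last keys, because the seek key sorts before the second block's first key. With last keys available, only the second block is loaded.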

Page 29: Optimizing HBase scanner performance

Top-of-the-column seek (HBASE-4962)
Some applications do not use DeleteColumn

▪ Future work
▪ DeleteColumn deletes all versions of a particular column
▪ Comes before all Puts for a (Row, Column)
▪ Slows down timestamp range queries
▪ Proposed solution:
  ▪ Add a (Row, Column) Bloom filter for DeleteColumn only
  ▪ Seek to (Row, Column, T2) for a [T1, T2] range query
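The benefit of the proposed seek can be illustrated by counting how many versions of one (Row, Column) are touched by a [T1, T2] range query. This is a sketch of the proposal only: `scan_range` and `may_have_delete_column` are hypothetical names, and the model ignores the delete marker itself and just contrasts the two starting points.

```python
import bisect

def scan_range(timestamps, t1, t2, may_have_delete_column):
    """Return (timestamps within [t1, t2], number of versions touched).

    timestamps: versions of one (row, column), newest first.
    may_have_delete_column: answer of the (assumed) DeleteColumn-only filter.
    """
    if may_have_delete_column:
        start = 0  # must start at the top of the column to see the marker
    else:
        # Seek straight to (Row, Column, T2): skip versions newer than T2.
        start = bisect.bisect_left([-ts for ts in timestamps], -t2)
    hits, touched = [], 0
    for ts in timestamps[start:]:
        touched += 1
        if ts < t1:
            break  # older than the range: nothing further can match
        if ts <= t2:
            hits.append(ts)
    return hits, touched
```

For versions `[9, 8, 7, 6, 5, 4, 3]` and the range [4, 5], starting at the top of the column touches 7 versions, while seeking to T2 touches 3. Both return `[5, 4]`.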

Page 30: Optimizing HBase scanner performance

(c) 2009 Facebook, Inc. or its licensors.  "Facebook" is a registered trademark of Facebook, Inc.. All rights reserved. 1.0