Toku DB by Aswin

DESCRIPTION

Presentation by Lee Marvin, an Agate Studio crew member, at Agate Studio's Talent Development Saturday event. http://agatestudio.com

Page 1: Toku DB by Aswin

@agatestudio

TokuDB Introduction

Aswin

Knight

Agate Studio

Page 2: Toku DB by Aswin

TokuDB Introduction

Aswin Juari

Page 3: Toku DB by Aswin

TokuDB Overview

• Storage engine for MySQL-compatible servers (MySQL, MariaDB, Percona Server)
• A database engine that aims to be friendlier than InnoDB
• A Tokutek product
• Open source
• No 32-bit build
• Linux servers only

Page 4: Toku DB by Aswin

Traditional MySQL Engines

• MyISAM
– Supports non-transactional queries only
– No foreign key support
– Table-level locking
– Simple
• InnoDB
– Supports transactions and MVCC
– Foreign key support
– Row-level locking
– Complex

Page 5: Toku DB by Aswin

Why TokuDB Was Introduced

Big Data Challenges

• Volume

• Velocity

• Variety

Page 6: Toku DB by Aswin

TokuDB vs InnoDB

• Similarities:
– Both support transactions and MVCC

Page 7: Toku DB by Aswin

TokuDB vs InnoDB

| Feature | TokuDB | InnoDB |
|---|---|---|
| Indexing | Fractal Tree index | B-tree index |
| Hot indexing | Yes | No |
| Hot column changes | Yes | No |
| Data compression | Up to 25x | 2x |
| Fragmentation immunity | Yes | No |
| Eliminates slave lag | Yes | No |
| Fast loader | Yes | No |
| Fast recovery time | Yes | No |
| Foreign key support | No | Yes |

Page 8: Toku DB by Aswin

Benefit of TokuDB

• Schema changes without downtime
• Performance (fewer servers, faster queries)
• Better data compression (smaller data volume)
• More efficient use of each server
• TokuDB may extend storage life
• On small data sets TokuDB may take about as long as InnoDB to complete. On very large data sets, TokuDB returns results faster

Page 9: Toku DB by Aswin

TokuDB Technology (Why TokuDB is Better)

Page 10: Toku DB by Aswin

Indexing

• TokuDB uses a Fractal Tree index
– Much faster than B-trees for workloads larger than RAM
– InnoDB and MyISAM use B-trees
– The key idea is to minimize I/O operations

Page 11: Toku DB by Aswin

B-Tree

• B-Tree Structure

Page 12: Toku DB by Aswin

B-Tree

• B-trees are fast at sequential inserts
– Require only about 1 disk I/O, because the nodes being updated fit in memory

Page 13: Toku DB by Aswin

B-Tree

• B-trees are slow for high-entropy inserts
– Require many disk I/Os (because the data does not fit in memory)
• Insert cost: $O\left(\frac{\log N}{\log B}\right)$

Page 14: Toku DB by Aswin

B-Tree

• Newly created B-trees run range queries fast
– Because the data is laid out sequentially on disk
• Aged B-trees run range queries slowly

Page 15: Toku DB by Aswin

(Simplified) Fractal Index

Page 16: Toku DB by Aswin

Simplified Fractal Tree

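• Keep a series of arrays whose sizes are powers of two: the N-th array holds 2^(N-1) elements
• Each array is either completely full or empty
• Each array is sorted
• For example, with 10 elements the size-2 and size-8 arrays are full, and the size-1 and size-4 arrays are empty:

5 10
3 6 8 12 17 23 26 30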
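The "full or empty" rule can be read off the binary representation of the element count. The sketch below assumes arrays are numbered from k = 0 with sizes 2^k; the helper name `full_array_sizes` is made up for illustration and is not part of TokuDB.

```python
def full_array_sizes(n):
    """Return the sizes of the arrays that are full when n elements are stored.

    Assumption: array k has capacity 2**k and is either full or empty,
    so the full arrays correspond to the 1-bits of n.
    """
    return [1 << k for k in range(n.bit_length()) if (n >> k) & 1]


# 10 is 1010 in binary, so the size-2 and size-8 arrays are full.
print(full_array_sizes(10))  # [2, 8]
```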

Page 17: Toku DB by Aswin

Fractal Tree Insertion

• Insert 7: the size-1 array is empty, so 7 goes straight into it

7
5 10
3 6 8 12 17 23 26 30

Page 18: Toku DB by Aswin

Fractal Tree Insertion

• Insert 15: the size-1 array already holds 7, so 15 waits in a temporary array

15 (temp)
7
5 10
3 6 8 12 17 23 26 30

Page 19: Toku DB by Aswin

Fractal Tree Insertion

• 7 and 15 are merged into a temporary array of size 2

7 15 (temp)
5 10
3 6 8 12 17 23 26 30

Page 20: Toku DB by Aswin

Fractal Tree Insertion

• Insert 15 (final step): the size-2 array is full, so 5 10 and 7 15 are merged into the empty size-4 array

5 7 10 15
3 6 8 12 17 23 26 30
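The insertion walk-through can be replayed with a short sketch. It models only the simplified structure from these slides (a list of sorted arrays where array k is empty or holds exactly 2^k elements), not TokuDB's real implementation; the function name `insert` and the list layout are illustrative assumptions.

```python
from heapq import merge


def insert(levels, value):
    """Insert value into the simplified structure.

    levels[k] is either [] or a sorted list of exactly 2**k elements.
    A temporary array is merged downward until an empty level is found.
    """
    carry = [value]  # temporary array holding the elements on the move
    k = 0
    while True:
        if k == len(levels):
            levels.append([])  # grow the structure when every level was full
        if not levels[k]:
            levels[k] = carry  # empty level found: store the merged array
            return levels
        # Level is full: merge it with the temporary array and empty it.
        carry = list(merge(levels[k], carry))
        levels[k] = []
        k += 1


# Starting state from the slides: 10 elements in the size-2 and size-8 arrays.
levels = [[], [5, 10], [], [3, 6, 8, 12, 17, 23, 26, 30]]
insert(levels, 7)
print(levels)  # 7 lands in the size-1 array
insert(levels, 15)
print(levels)  # size-1 and size-2 arrays are now empty, size-4 holds 5 7 10 15
```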

Page 21: Toku DB by Aswin

Fractal Tree Insertion

• Cost to merge 2 arrays of size X is O(X/B) block I/Os
• Cost per element merged is O(1/B), since O(X) elements were merged
• Max number of times each element is merged is O(log N)
• Average insert cost (I/O) is $O\left(\frac{\log N}{B}\right)$
• It may take more CPU cycles because of the merge process
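The O(log N) merge bound can be checked by counting element writes in the simplified structure. This sketch tracks only which arrays are full (like a binary counter) and charges 2^(k+1) element writes for each merge that produces an array of that size; `merge_writes` is a hypothetical helper, and dividing its result by B gives the block I/O estimate.

```python
def merge_writes(n_inserts):
    """Count elements written by merges while inserting n_inserts elements.

    full[k] is True when the size-2**k array is full.
    """
    full = []
    writes = 0
    for _ in range(n_inserts):
        k = 0
        # Every full array on the way down is merged into the temporary array.
        while k < len(full) and full[k]:
            full[k] = False
            writes += 2 ** (k + 1)  # size of the merged result
            k += 1
        if k == len(full):
            full.append(True)
        else:
            full[k] = True
    return writes


n = 1024
# For n = 2**10 the average is exactly log2(n) = 10 writes per element.
print(merge_writes(n) / n)  # 10.0
```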

Page 22: Toku DB by Aswin

Fractal Tree vs B-Tree

• Example:

– 10^9 rows of 128 bytes inserted
• N ≈ 2^30, log N = 30
– 1 MB block size
• 1 block holds 8192 rows (2^20 / 2^7), so B = 8192
• log B = 13

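– B-Tree: $O\left(\frac{\log N}{\log B}\right) = \frac{30}{13} \approx 2.3$, i.e. about 3 disk I/Os per insert
– Fractal Tree: $O\left(\frac{\log N}{B}\right) = \frac{30}{8192} \approx 0.004$ disk I/Os per insert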
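The arithmetic for this example can be checked in a few lines. The sketch uses base-2 logarithms and the slide's values (N = 2^30 rows, 1 MB blocks, 128-byte rows); the variable names are illustrative, and the results are order-of-magnitude estimates from the big-O formulas rather than measured I/O counts.

```python
import math

N = 2 ** 30               # about 10^9 rows
B = 2 ** 20 // 2 ** 7     # rows per 1 MB block with 128-byte rows

btree_ios = math.log2(N) / math.log2(B)   # O(log N / log B)
fractal_ios = math.log2(N) / B            # O(log N / B)

print(B)                      # 8192
print(f"{btree_ios:.2f}")     # 2.31
print(f"{fractal_ios:.4f}")   # 0.0037
```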

Page 23: Toku DB by Aswin

Fractal Tree vs B-Tree

| | InnoDB | TokuDB |
|---|---|---|
| Performance | Fast until the index no longer fits in memory | Starts fast, stays fast |
| Deletion | Does not free disk space | Frees disk space |
| Disk block | 64 KB, random I/O | 4 MB, sequential I/O |
| Bottleneck | Disk (I/O) | CPU |

Page 24: Toku DB by Aswin

Fractal vs B-Tree

Page 25: Toku DB by Aswin

InnoDB Compression

| | InnoDB |
|---|---|
| Cache block | 16 KB |
| Disk block | 8 KB, 4 KB, 2 KB, 1 KB |
| Algorithm | zlib |

Page 26: Toku DB by Aswin

TokuDB Compression

| | TokuDB |
|---|---|
| Cache block | 64 KB (default) |
| Disk block | 4 MB |
| Algorithm | QuickLZ / zlib / LZMA |

Page 27: Toku DB by Aswin

Installing TokuDB

• Prerequisite: install MySQL first
• TokuDB procedure:
1. Download TokuDB (check the MySQL version)
2. Upload it to the server
3. Extract TokuDB
4. Make a symbolic link from /path-to-tokudb-folder/mysql-5.5.38-tokudb-7.1.0-linux-x86_64 to /path-to-tokudb-folder/mysql
5. Make a symbolic link from /path-to-tokudb-folder/mysql/support-files/mysql.server to /etc/init.d/mysql_toku
6. Edit the configuration in /etc/init.d/mysql_toku
a. basedir=/opt/tokutek/mysql
b. datadir=/var/lib/mysql
7. Copy 'ha_tokudb.so' from /path-to-tokudb-folder/mysql/lib/plugin to /usr/lib64/mysql/plugin/

Page 28: Toku DB by Aswin

Installing TokuDB

• TokuDB procedure (cont.)
8. Log in to the MySQL server as root
9. Run the queries below:

a) INSTALL PLUGIN tokudb_trx SONAME 'ha_tokudb.so';

b) INSTALL PLUGIN tokudb_locks SONAME 'ha_tokudb.so';

c) INSTALL PLUGIN tokudb_lock_waits SONAME 'ha_tokudb.so';

d) DELETE FROM mysql.plugin WHERE NAME LIKE 'tokudb_user_data%';

e) INSTALL PLUGIN TokuDB SONAME 'ha_tokudb.so';

f) INSTALL PLUGIN tokudb_file_map SONAME 'ha_tokudb.so';

g) INSTALL PLUGIN tokudb_fractal_tree_info SONAME 'ha_tokudb.so';

h) INSTALL PLUGIN tokudb_fractal_tree_block_map SONAME 'ha_tokudb.so';

i) SET GLOBAL default_storage_engine=TokuDB;

10. Verify the installation with the queries SHOW PLUGINS; and SHOW ENGINES;

Page 29: Toku DB by Aswin

Result

• Conversion Result

– Compression

• Input data: InnoDB, 1.0 GB
• Output: TokuDB, 200 MB (about 5x smaller), spread across more files

Page 30: Toku DB by Aswin

Advanced Setting

• Edit /etc/my.cnf
– Reduce innodb_buffer_pool_size (InnoDB) and key_buffer_size (MyISAM), especially if converting tables
– tokudb_cache_size=?G: defaults to 50% of RAM, I recommend 80%
– tokudb_directio=1: tells the OS not to cache pages requested by TokuDB
– The data is cached by the TokuDB cache instead
– In my experiment I set innodb_buffer_pool_size to about 40-50% of memory.
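As a rough aid for filling in the `?G` value, the sketch below turns a machine's RAM size into the two TokuDB lines for /etc/my.cnf. The helper name `tokudb_cache_settings` and the 16 GB example machine are assumptions for illustration; the 80% figure is the recommendation from this slide, not an official default.

```python
def tokudb_cache_settings(total_ram_gb, percent=80):
    """Return my.cnf lines for TokuDB given total RAM in whole gigabytes.

    Assumption: the cache size is rounded down to a whole number of GB.
    """
    cache_gb = total_ram_gb * percent // 100
    return [f"tokudb_cache_size={cache_gb}G", "tokudb_directio=1"]


# Hypothetical server with 16 GB of RAM.
for line in tokudb_cache_settings(16):
    print(line)
```

Remember to lower innodb_buffer_pool_size accordingly so that the two caches together do not exceed available memory.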

Page 31: Toku DB by Aswin

Advanced Setting

• TokuDB compression methods
– TOKUDB_ZLIB: uses the zlib library and provides mid-range compression with medium CPU utilization.
– TOKUDB_QUICKLZ: uses the quicklz library and provides light compression with low CPU utilization.
– TOKUDB_LZMA: uses the lzma library and provides the highest compression with high CPU utilization.
– TOKUDB_UNCOMPRESSED: disables compression.
• Default in TokuDB 7.1: TOKUDB_ZLIB
• Query:

create table t1 (
`ID` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`ID`)
) engine=tokudb, row_format=
[tokudb_lzma | tokudb_zlib | tokudb_quicklz];

ALTER TABLE City ROW_FORMAT=TOKUDB_ZLIB;