TokuDB by Aswin

Presentation by Lee Marvin, an Agate Studio crew member, at the Agate Studio Talent Development Saturday event. http://agatestudio.com

@agatestudio

TokuDB Introduction

Aswin Juari

Knight

Agate Studio

TokuDB Overview

• MySQL-compatible storage engine (MySQL, MariaDB, Percona Server)

• A database engine that is friendlier than InnoDB

• Tokutek product

• Open source

• No 32-bit build

• Linux servers only

Traditional MySQL Engines

• MyISAM

– Supports non-transactional queries only

– No foreign key support

– Table-level locking

– Simple

• InnoDB

– Supports transactions and MVCC

– Foreign key support

– Row-level locking

– Complex

Why TokuDB Was Introduced

Big Data Challenges

• Volume

• Velocity

• Variety

TokuDB vs InnoDB

• Similarity:

– Support Transaction and MVCC

TokuDB vs InnoDB

Feature                  TokuDB               InnoDB
Indexing                 Fractal Tree index   B-tree index
Hot indexing             Yes                  No
Hot column changes       Yes                  No
Data compression         Up to 25x            2x
Fragmentation immunity   Yes                  No
Eliminates slave lag     Yes                  No
Fast loader              Yes                  No
Fast recovery time       Yes                  No
Foreign key support      No                   Yes

Benefit of TokuDB

• Schema changes without downtime

• Performance (fewer servers, and faster)

• Higher data compression (data volume)

• Server efficiency

• TokuDB may extend storage life

• On small data sets TokuDB may take about as long as InnoDB to complete. On very large data sets, TokuDB can return results faster

TokuDB Technology (Why TokuDB is Better)

Indexing

• TokuDB uses Fractal Tree indexes

– Much faster than B-trees when the workload is larger than RAM

– InnoDB and MyISAM use B-trees

– The key idea is to minimize I/O operations

B-Tree

• B-Tree Structure

B-Tree

• B-trees are fast at sequential inserts

– Require only 1 disk I/O, because the data fits in memory

B-Tree

• B-trees are slow for high-entropy inserts

– Require many disk I/Os (because the data does not fit in memory)

• Insert cost (I/O): O(log N / log B)

B-Tree

• Newly created B-trees run range queries fast

– Because the data is laid out sequentially on disk

• Aged B-trees run range queries slowly

(Simplified) Fractal Index

Simplified Fractal Tree

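• Keep a series of arrays, where the k-th array holds 2^(k-1) elements (sizes 1, 2, 4, 8, ...)

• Each array is either full or empty

• Each array is sorted

• Example with 10 elements (the arrays of size 2 and 8 are full, those of size 1 and 4 are empty):

Size 1: (empty)
Size 2: 5 10
Size 4: (empty)
Size 8: 3 6 8 12 17 23 26 30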
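Which arrays are full follows directly from the binary representation of the element count. The sketch below illustrates that rule. The function name `occupied_array_sizes` is a hypothetical helper introduced here, not part of TokuDB.

```python
from typing import List


def occupied_array_sizes(n: int) -> List[int]:
    """Return the sizes of the arrays that are full when n elements are stored.

    Each set bit of n corresponds to one full array of that power-of-two size.
    """
    sizes = []
    size = 1
    while n > 0:
        if n & 1:          # this bit is set, so the array of this size is full
            sizes.append(size)
        n >>= 1
        size <<= 1
    return sizes


sizes_for_ten = occupied_array_sizes(10)  # 10 is 1010 in binary
```

For 10 elements this yields the arrays of size 2 and 8, matching the example on the slide.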

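Fractal Tree Insertion

• Insert 7: the size-1 array is empty, so 7 is placed there

Size 1: 7
Size 2: 5 10
Size 4: (empty)
Size 8: 3 6 8 12 17 23 26 30

Fractal Tree Insertion

• Insert 15: the size-1 array is already full, so 7 moves to a temp array to be merged with 15

Size 1: 15
Size 2: 5 10
Size 4: (empty)
Size 8: 3 6 8 12 17 23 26 30
Temp array: 7

Fractal Tree Insertion

• 7 and 15 are merged into a sorted temp array of size 2

Size 1: (empty)
Size 2: 5 10
Size 4: (empty)
Size 8: 3 6 8 12 17 23 26 30
Temp array: 7 15

Fractal Tree Insertion

• The size-2 array is also full, so the temp array is merged with it and the result goes into the empty size-4 array

Size 1: (empty)
Size 2: (empty)
Size 4: 5 7 10 15
Size 8: 3 6 8 12 17 23 26 30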
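The insertion walk-through on these slides can be sketched in a few lines of Python. This is a minimal in-memory illustration of the simplified structure only, not TokuDB's actual implementation. The name `insert` and the `levels` list (where `levels[i]` is either empty or holds 2^i sorted keys) are assumptions made for the example.

```python
from heapq import merge


def insert(levels, key):
    """Insert key into a list of sorted arrays; levels[i] is empty or holds 2**i keys."""
    carried = [key]                      # plays the role of the temp array
    i = 0
    # While the target array is full, merge it into the carried run and empty it.
    while i < len(levels) and levels[i]:
        carried = list(merge(levels[i], carried))
        levels[i] = []
        i += 1
    if i == len(levels):                 # every array was full: grow by one level
        levels.append([])
    levels[i] = carried                  # the first empty array receives the result


# State from the slides: 10 elements in the arrays of size 2 and size 8.
levels = [[], [5, 10], [], [3, 6, 8, 12, 17, 23, 26, 30]]
insert(levels, 7)
after_7 = [list(array) for array in levels]
insert(levels, 15)
```

After inserting 7 and then 15, the arrays of size 1 and 2 are empty and the size-4 array holds 5 7 10 15, as on the last insertion slide.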

Fractal Tree Insertion

• The cost to merge 2 arrays of size X is O(X/B) block I/Os

• The cost per element of a merge is O(1/B), since O(X) elements are merged

• Each element is merged at most O(log N) times

• Average insert cost (I/O) is O(log N / B)

• It may take more CPU cycles because of the merge process
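The O(log N) bound on merges per element can be checked with a small counting simulation. The sketch below tracks only array sizes and assumes each merged run is written once into its destination array. Intermediate temp-array writes are ignored, which is a simplification. `total_element_writes` is a hypothetical name.

```python
import math


def total_element_writes(n_inserts: int) -> int:
    """Count element writes when n_inserts keys are inserted one at a time.

    Only sizes are tracked: level i is either empty or holds 2**i keys.
    """
    full = []            # full[i] is True when the array of size 2**i is occupied
    writes = 0
    for _ in range(n_inserts):
        carried = 1      # the newly inserted key
        level = 0
        # Cascade through full levels, absorbing each into the carried run.
        while level < len(full) and full[level]:
            full[level] = False
            carried += 2 ** level
            level += 1
        if level == len(full):
            full.append(False)
        full[level] = True
        writes += carried    # the merged run is written once into its new level
    return writes


n = 1024
writes_per_insert = total_element_writes(n) / n
bound = math.log2(n) + 1
```

Dividing the writes per insert by the number of rows per block B gives the average block I/O per insert, which is where the O(log N / B) figure comes from.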

Fractal Tree vs B-Tree

• Example:

– 10^9 rows of 128 bytes inserted

• N ≈ 2^30, log(N) = 30

– 1 MB block size

• 1 block holds 8192 rows (2^20 / 2^7), so B = 8192

• log(B) = 13

– B-Tree: O(log N / log B) = 30/13 ≈ 2.3 I/Os per insert

– Fractal Tree: O(log N / B) = 30/8192 ≈ 0.0037 I/Os per insert
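The arithmetic in this example can be reproduced as below. This is only a sketch of the asymptotic estimates. The constant factors hidden by the O-notation are ignored, and the variable names are chosen for illustration.

```python
import math

ROW_BYTES = 128            # row size from the example
BLOCK_BYTES = 2 ** 20      # 1 MB block
N = 2 ** 30                # about 10^9 rows

B = BLOCK_BYTES // ROW_BYTES       # rows per block
log_n = math.log2(N)
log_b = math.log2(B)

btree_ios = log_n / log_b          # O(log N / log B) estimate per insert
fractal_ios = log_n / B            # O(log N / B) estimate per insert
ratio = btree_ios / fractal_ios    # how many times fewer I/Os the fractal tree needs

print(f"B = {B}, B-tree = {btree_ios:.2f}, fractal tree = {fractal_ios:.4f}, ratio = {ratio:.0f}")
```

Under these assumptions the fractal tree needs several hundred times fewer I/Os per insert than the B-tree.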

Fractal Tree vs B-Tree

              InnoDB                            TokuDB
Performance   Fast until index exceeds memory   Starts fast, stays fast
Deletion      Does not free disk space          Frees disk space
Disk block    64 KB, random I/O                 4 MB, sequential I/O
Bottleneck    Disk (I/O)                        CPU

Fractal vs B-Tree

InnoDB Compression

InnoDB
Cache block   16 KB
Disk block    8 KB, 4 KB, 2 KB, 1 KB
Algorithm     zlib

TokuDB Compression

TokuDB
Cache block   64 KB (default)
Disk block    4 MB
Algorithm     quicklz / zlib / lzma

Installing TokuDB

• Prerequisite: install MySQL first

• TokuDB procedure:

1. Download TokuDB (check that it matches your MySQL version)

2. Upload it to the server

3. Extract TokuDB

4. Make a symbolic link from /path-to-tokudb-folder/mysql-5.5.38-tokudb-7.1.0-linux-x86_64 to /path-to-tokudb-folder/mysql

5. Make a symbolic link from /path-to-tokudb-folder/mysql/support-files/mysql.server to /etc/init.d/mysql_toku

6. Edit the configuration in /etc/init.d/mysql_toku

a. basedir=/opt/tokutek/mysql

b. datadir=/var/lib/mysql

7. Copy 'ha_tokudb.so' from /path-to-tokudb-folder/mysql/lib/plugin to /usr/lib64/mysql/plugin/

Installing TokuDB

• TokuDB procedure (cont.)

8. Log in to the MySQL server as root

9. Run the queries below:

a) INSTALL PLUGIN tokudb_trx SONAME 'ha_tokudb.so';

b) INSTALL PLUGIN tokudb_locks SONAME 'ha_tokudb.so';

c) INSTALL PLUGIN tokudb_lock_waits SONAME 'ha_tokudb.so';

d) DELETE FROM mysql.plugin WHERE NAME LIKE 'tokudb_user_data%';

e) INSTALL PLUGIN TokuDB SONAME 'ha_tokudb.so';

f) INSTALL PLUGIN tokudb_file_map SONAME 'ha_tokudb.so';

g) INSTALL PLUGIN tokudb_fractal_tree_info SONAME 'ha_tokudb.so';

h) INSTALL PLUGIN tokudb_fractal_tree_block_map SONAME 'ha_tokudb.so';

i) SET GLOBAL default_storage_engine=TokuDB;

10. Verify the installation with the queries SHOW PLUGINS; and SHOW ENGINES;

Result

• Conversion Result

– Compression

• Input data: InnoDB, 1.0 GB

• Output: TokuDB, 200 MB (and more files)

Advanced Setting

• Edit /etc/my.cnf

– Reduce innodb_buffer_pool_size (InnoDB) and key_buffer_size (MyISAM), especially if converting tables

– tokudb_cache_size=?G: defaults to 50% of RAM, I recommend 80%

– tokudb_directio=1: tells the OS not to cache pages requested by TokuDB, since the data is already cached by the TokuDB cache

– In my experiment I set innodb_buffer_pool_size to about 40-50% of memory.
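As an illustration of these settings, a my.cnf fragment might look like the sketch below. The 16 GB host is a hypothetical example, and the cache size is simply 80% of that, rounded down. Pick a value that suits your own server.

```ini
[mysqld]
# Hypothetical host with 16 GB of RAM; 80% of that is roughly 12 GB.
tokudb_cache_size = 12G
# Bypass the OS page cache so data is cached only once, in the TokuDB cache.
tokudb_directio = 1
```

If InnoDB or MyISAM tables remain on the same server, their buffers have to fit in the remaining memory.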

Advanced Setting

• TokuDB compression methods

– TOKUDB_ZLIB: uses the zlib library and provides mid-range compression with medium CPU utilization.

– TOKUDB_QUICKLZ: uses the quicklz library and provides light compression with low CPU utilization.

– TOKUDB_LZMA: uses the lzma library and provides the highest compression with high CPU utilization.

– TOKUDB_UNCOMPRESSED: disables compression.

• Default in TokuDB 7.1: TOKUDB_ZLIB

• Query:

create table t1 (
`ID` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY
) engine=tokudb, row_format=[tokudb_lzma | tokudb_zlib | tokudb_quicklz];

ALTER TABLE City ROW_FORMAT=TOKUDB_ZLIB;

Recommended