Efficient Indexes in MySQL
https://twindb.com Ovais Tariq & Aleksandr Kuzminsky
Who we are
Aleks:
• TwinDB co-founder
• Dropbox MySQL SRE
• ex-Percona consultant
Ovais:
• TwinDB co-founder
• Lithium Lead SRE
• ex-Percona consultant
Agenda
1. How data is organized
2. How data is accessed
How data is organized
B+ Tree
Lookup cost: O(log n)
B+ Tree Characteristics
• Leaf nodes contain the data
• Leaf nodes form a doubly linked list
• Keys are stored in sorted order
• All leaf nodes are at the same height
A Few Advantages
• Reduced I/O
• Reduced Rebalancing
• Extremely efficient range scans
• Implicit sorting
Index Height
$h = \left\lceil \frac{\log n}{\log p} \right\rceil$
where
• h is the height of the tree
• n is the number of rows in the table
• p is the branching factor of the tree: p = page size in bytes / key length in bytes
Index Height
# records            Height with INT Primary Key   Height with CHAR(36) Primary Key   Increase in I/O Operations
1,000                1                             2                                  +100%
1,000,000            2                             3                                  +50%
1,000,000,000        3                             4                                  +33%
1,000,000,000,000    4                             5                                  +25%
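The heights in the table can be reproduced from the formula with a short Python sketch. It assumes the InnoDB default page size of 16 KB, a 4-byte INT key, and a 36-byte CHAR(36) key (single-byte character set). It also ignores page headers and child pointers, so it is an approximation and not an exact model of InnoDB page layout. The name `index_height` is illustrative only.

```python
import math

PAGE_SIZE = 16 * 1024  # assumption: InnoDB default page size (16 KB)


def index_height(n_rows: int, key_len: int, page_size: int = PAGE_SIZE) -> int:
    """Estimate tree height as h = ceil(log n / log p), p = page_size / key_len."""
    p = page_size / key_len  # branching factor
    return math.ceil(math.log(n_rows) / math.log(p))


for n in (10**3, 10**6, 10**9, 10**12):
    h_int = index_height(n, 4)    # INT primary key: 4 bytes
    h_char = index_height(n, 36)  # CHAR(36) primary key: assumed 36 bytes
    increase = (h_char - h_int) / h_int
    print(f"{n:>17,}  INT: {h_int}  CHAR(36): {h_char}  +{increase:.0%}")
```

Under these assumptions, the wider key adds one level to the tree at every table size shown, which is where the extra I/O per lookup comes from.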
Table in MySQL (InnoDB)
CREATE TABLE `actor` (
  `actor_id` SMALLINT(5) UNSIGNED NOT NULL,
  `first_name` VARCHAR(45) NOT NULL,
  `last_name` VARCHAR(45) NOT NULL,
  `last_update` TIMESTAMP NOT NULL,
  PRIMARY KEY (`actor_id`),
  KEY `idx_actor_last_name` (`last_name`)
) ENGINE=InnoDB;
sakila.actor
PRIMARY
actor_id first_name last_name last_update
1 PENELOPE GUINESS 2006-02-15 04:34:33
2 NICK WAHLBERG 2006-02-15 04:34:33
3 ED CHASE 2006-02-15 04:34:33
4 JENNIFER DAVIS 2006-02-15 04:34:33
5 JOHNNY WOOD 2006-02-15 04:34:33
... ... ... ...
idx_actor_last_name
last_name actor_id
AKROYD 58
AKROYD 92
AKROYD 182
ALLEN 118
ALLEN 145
... ...
How Data is Accessed
A query is fast if accessing the table and producing the result happen simultaneously.
Point SELECT
SELECT * FROM actor WHERE actor_id = 3;
actor_id first_name last_name last_update
1 PENELOPE GUINESS 2006-02-15 04:34:33
2 NICK WAHLBERG 2006-02-15 04:34:33
3 ED CHASE 2006-02-15 04:34:33
4 JENNIFER DAVIS 2006-02-15 04:34:33
5 JOHNNY WOOD 2006-02-15 04:34:33
... ... ... ...
SELECT by range of keys
SELECT * FROM actor WHERE actor_id > 3;
actor_id first_name last_name last_update
1 PENELOPE GUINESS 2006-02-15 04:34:33
2 NICK WAHLBERG 2006-02-15 04:34:33
3 ED CHASE 2006-02-15 04:34:33
4 JENNIFER DAVIS 2006-02-15 04:34:33
5 JOHNNY WOOD 2006-02-15 04:34:33
... ... ... ...
Lookup by secondary key
SELECT * FROM actor WHERE last_name = 'ALLEN';
Step 1: find the matching entries in the secondary index idx_actor_last_name
last_name actor_id
AKROYD 58
AKROYD 92
AKROYD 182
ALLEN 118
ALLEN 145
... ...
Step 2: fetch each full row from the PRIMARY index by actor_id
actor_id first_name last_name last_update
117 RENEE TRACY 2006-02-15 04:34:33
118 CUBA ALLEN 2006-02-15 04:34:33
119 WARREN JACKMAN 2006-02-15 04:34:33
... ... ... ...
145 KIM ALLEN 2006-02-15 04:34:33
... ... ... ...
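The two steps can be sketched in Python. This is a toy model and not InnoDB code: a sorted list stands in for the secondary index B+ tree, a dict stands in for the clustered PRIMARY index, and the function name `select_by_last_name` is hypothetical. Only rows shown on the slide are included.

```python
from bisect import bisect_left

# Stand-in for the clustered PRIMARY index: actor_id -> rest of the row.
primary = {
    117: ("RENEE", "TRACY"),
    118: ("CUBA", "ALLEN"),
    119: ("WARREN", "JACKMAN"),
    145: ("KIM", "ALLEN"),
}

# Stand-in for idx_actor_last_name: (last_name, actor_id) kept in sorted order.
idx_actor_last_name = sorted(
    (last_name, actor_id) for actor_id, (_, last_name) in primary.items()
)


def select_by_last_name(last_name):
    """Return (actor_id, first_name, last_name) for every matching actor."""
    rows = []
    # Step 1: seek to the first secondary index entry for last_name.
    i = bisect_left(idx_actor_last_name, (last_name,))
    while i < len(idx_actor_last_name) and idx_actor_last_name[i][0] == last_name:
        actor_id = idx_actor_last_name[i][1]
        # Step 2: one extra lookup in the PRIMARY index per matching entry.
        first_name, name = primary[actor_id]
        rows.append((actor_id, first_name, name))
        i += 1
    return rows


print(select_by_last_name("ALLEN"))
```

Each matching index entry costs a second lookup by primary key, which is the overhead a covering index avoids.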
Using index for data access
last_name actor_id
AKROYD 182
ALLEN 118
ALLEN 145
ALLEN 194
ASTAIRE 76
... ...
SELECT COUNT(*) FROM actor WHERE last_name = 'ALLEN';
Using index for data access
EXPLAIN SELECT COUNT(*) FROM actor WHERE last_name = 'ALLEN';
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: actor
         type: ref
possible_keys: idx_actor_last_name
          key: idx_actor_last_name
      key_len: 137
          ref: const
         rows: 3
        Extra: Using where; Using index
Covering indexes
ALTER TABLE actor ADD INDEX idx_last_first (last_name, first_name);
SELECT first_name FROM actor WHERE last_name = 'ALLEN';
last_name first_name actor_id
AKROYD KIRSTEN 182
ALLEN CUBA 118
ALLEN KIM 145
ALLEN MERYL 194
ASTAIRE ANGELINA 76
... ...
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: actor
         type: ref
possible_keys: idx_actor_last_name,idx_last_first
          key: idx_last_first
      key_len: 137
          ref: const
         rows: 3
        Extra: Using where; Using index
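"Using index" in Extra means the query is answered from the index alone. A minimal Python sketch of that idea follows, assuming a sorted list as a stand-in for idx_last_first with the entries shown on the slide. The function name `first_names` is hypothetical.

```python
from bisect import bisect_left

# Stand-in for idx_last_first: (last_name, first_name, actor_id) in sorted order.
idx_last_first = [
    ("AKROYD", "KIRSTEN", 182),
    ("ALLEN", "CUBA", 118),
    ("ALLEN", "KIM", 145),
    ("ALLEN", "MERYL", 194),
    ("ASTAIRE", "ANGELINA", 76),
]


def first_names(last_name):
    """SELECT first_name FROM actor WHERE last_name = ?, served by the index only."""
    result = []
    # Seek to the first entry with this last_name.
    i = bisect_left(idx_last_first, (last_name,))
    while i < len(idx_last_first) and idx_last_first[i][0] == last_name:
        # first_name is stored in the index entry itself,
        # so no lookup in the PRIMARY index is needed.
        result.append(idx_last_first[i][1])
        i += 1
    return result


print(first_names("ALLEN"))
```

Because every column the query needs is part of the index entry, the second step of a secondary key lookup disappears entirely.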
DISTINCT
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: actor
         type: index
possible_keys: idx_actor_last_name,idx_last_first
          key: idx_actor_last_name
      key_len: 137
          ref: NULL
         rows: 200
        Extra: Using index
last_name actor_id
AKROYD 182
ALLEN 118
ALLEN 145
ALLEN 194
ASTAIRE 76
... ...
SELECT DISTINCT last_name FROM actor;
GROUP BY
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: actor
         type: index
possible_keys: idx_actor_last_name,idx_last_first
          key: idx_actor_last_name
      key_len: 137
          ref: NULL
         rows: 200
        Extra: Using index
last_name actor_id
AKROYD 182
ALLEN 118
ALLEN 145
ALLEN 194
ASTAIRE 76
... ...
SELECT last_name, COUNT(*) FROM actor GROUP BY last_name;
Loose index scan
ALTER TABLE actor ADD COLUMN `rank` INT;
UPDATE actor SET `rank` = ROUND(100 * RAND());
ALTER TABLE actor ADD INDEX idx_last_rank (last_name, `rank`);
last_name rank actor_id
AKROYD 40 58
AKROYD 42 92
AKROYD 95 182
ALLEN 19 194
ALLEN 35 118
... ...
Loose index scan
SELECT last_name, MIN(`rank`) FROM actor GROUP BY last_name;
last_name rank actor_id
AKROYD 40 58
AKROYD 42 92
AKROYD 95 182
ALLEN 19 194
ALLEN 35 118
... ...
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: actor
         type: range
possible_keys: …,idx_last_rank
          key: idx_last_rank
      key_len: 137
          ref: NULL
         rows: 247
        Extra: Using index for group-by
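"Using index for group-by" means the server reads only the first entry of each group and then jumps to the next group. The Python sketch below illustrates the idea under simplifying assumptions: a sorted list stands in for idx_last_rank, and a binary search stands in for the B+ tree seek. The function name `min_rank_per_last_name` is hypothetical.

```python
import math
from bisect import bisect_right

# Stand-in for idx_last_rank: (last_name, rank, actor_id) in sorted order.
idx_last_rank = [
    ("AKROYD", 40, 58),
    ("AKROYD", 42, 92),
    ("AKROYD", 95, 182),
    ("ALLEN", 19, 194),
    ("ALLEN", 35, 118),
]


def min_rank_per_last_name(index):
    """SELECT last_name, MIN(rank) ... GROUP BY last_name via group skipping."""
    result = []
    entries_read = 0
    i = 0
    while i < len(index):
        last_name, rank, _ = index[i]
        entries_read += 1
        # The first entry of a group holds its minimum rank (index is sorted).
        result.append((last_name, rank))
        # Seek past every remaining entry of this group instead of scanning them.
        i = bisect_right(index, (last_name, math.inf))
    return result, entries_read


groups, entries_read = min_rank_per_last_name(idx_last_rank)
print(groups, entries_read)
```

The number of entries read equals the number of distinct last names, not the number of rows, which is why this access path pays off when groups are large.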
Sorting
SELECT * FROM actor WHERE last_name = 'AKROYD' ORDER BY `rank`;
last_name rank actor_id
AKROYD 40 58
AKROYD 42 92
AKROYD 95 182
ALLEN 19 194
ALLEN 35 118
... ...
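*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: actor
         type: ref
possible_keys: …,idx_last_rank
          key: idx_last_rank
      key_len: 137
          ref: const
         rows: 3
        Extra: Using where
Extra shows no "Using filesort": for a fixed last_name, the index already returns rows ordered by rank.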
Joining tables
SELECT title, first_name, last_name
FROM film
JOIN film_actor ON film_actor.film_id = film.film_id
JOIN actor ON actor.actor_id = film_actor.actor_id
ORDER BY title;
Response time: 25ms
Joining tables
SELECT title, first_name, last_name
FROM film FORCE INDEX (`idx_title`)
JOIN film_actor ON film_actor.film_id = film.film_id
JOIN actor ON actor.actor_id = film_actor.actor_id
ORDER BY title;
Response time: 5ms (5 times faster!)
How to compare efficiency
Q&A
Thank you!