Galera in MariaDB 10.4: State of the Art and Plans
Seppo Jaakola, Codership
➢ Seppo Jaakola
➢ One of the Founders of Codership
➢ Codership: Galera Replication developers
➢ Partner of MariaDB for developing and supporting MariaDB Galera Cluster
➢ Galera releases since 2009
Agenda
● Galera in 10.4 Status
● Galera Cluster Upgrading
● Streaming Replication
● XA Transaction Support
● Spider Cluster
Galera in 10.4 and Beyond
● Group Commit Support <refactor for MariaDB>
● Non Blocking DDL <testing>
● Huge transactions by streaming replication <testing>
● Inconsistency Voting Protocol <testing>
Galera 4.0
MariaDB 10.4
● Gcache Encryption <implementation>
● MariaDB GTID Compatibility <requirement>
● XA transaction Support <implementation>
● Spider Cluster <design>
Galera 4.1
Galera Upgrade: wsrep API Change
Galera Rolling Upgrades
[Diagram: three MariaDB nodes, each on API 25, serve reads and writes over Galera Replication running at API 25. Caption: Upgrade with API #26.]
Galera Rolling Upgrades
[Diagram: one node upgraded to API #26. The nodes run API 25, API 26 and API 25; Galera Replication still runs at API 25. Caption: Upgrade with API #26.]
Galera Rolling Upgrades
[Diagram: another node upgraded to API #26. The nodes run API 26, API 26 and API 25; Galera Replication still runs at API 25. Caption: Upgrade with API #26.]
Galera Rolling Upgrades
[Diagram: all nodes upgraded to API #26. The three MariaDB nodes run API 26 and serve reads and writes; Galera Replication runs at API 26.]
API #26 features are now enabled in replication.
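The upgrade sequence in the diagrams can be sketched as a small model. This is a minimal sketch under one assumption: replication runs at the lowest API level present in the cluster, which matches what the slides show but is not taken from Galera's actual negotiation code. The function names are hypothetical.

```python
def replication_api(node_apis):
    """Assumed rule: the cluster replicates at the lowest API level any node runs."""
    return min(node_apis)


def rolling_upgrade(node_apis, new_api):
    """Upgrade one node at a time; yield (node levels, replication level) per step."""
    nodes = list(node_apis)
    for i in range(len(nodes)):
        nodes[i] = new_api
        yield tuple(nodes), replication_api(nodes)


steps = list(rolling_upgrade([25, 25, 25], 26))
for nodes, level in steps:
    print(nodes, "-> replication at API", level)
```

In this model the replication level only moves to 26 after the last node is upgraded, which is the point at which the slides say the API #26 features become enabled.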
Streaming Replication: Huge Transaction Support
Huge Transaction Demo Setup
1. Two nodes
2. Steady load of pure autocommit updates to measure trx throughput
3. A huge table with ~1.5M rows
4. Run update on huge table to modify all rows
● → monitor trx/sec rate in the cluster when the huge transaction kicks in
Impact of Huge Transaction
[Chart: Huge Transaction Slave Lag, vertical axis 0 to 4500. Annotations: trx in master 24 secs, trx in slave 9 secs.]
Streaming Replication
● A transaction is replicated gradually, in small fragments, while it is still being processed
● i.e. a number of small fragments are replicated before the actual commit
● The size threshold for fragment replication is configurable
● Replicated fragments are applied in slave transactions on all cluster nodes
➔ Fragments hold locks on all nodes, so the transaction cannot be conflicted later
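The fragmenting idea in the bullets above can be illustrated with a toy generator. This is only a sketch of the concept: the `Fragment` type, the `stream_transaction` function and the byte-counting rule are hypothetical names and assumptions, not the server's implementation.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    trx_id: int
    rows: list
    commit: bool = False  # True only on the final fragment of the transaction


def stream_transaction(trx_id, row_changes, fragment_size):
    """Yield fragments of one transaction as it progresses.

    fragment_size is a threshold in bytes; 0 disables streaming, so the
    whole transaction goes out as a single fragment at commit time.
    """
    pending, pending_bytes = [], 0
    for change in row_changes:
        pending.append(change)
        pending_bytes += len(change)
        if fragment_size and pending_bytes >= fragment_size:
            yield Fragment(trx_id, pending)  # replicated before the commit
            pending, pending_bytes = [], 0
    # Whatever is left travels with the commit.
    yield Fragment(trx_id, pending, commit=True)


changes = [b"x" * 40 for _ in range(10)]
for frag in stream_transaction(1, changes, fragment_size=100):
    print(len(frag.rows), "rows, commit =", frag.commit)
```

With ten 40-byte changes and a 100-byte threshold, the transaction leaves the node as several small fragments instead of one large write set at commit.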
[Diagram sequence: Node A and Node B connected by Galera Replication. A huge transaction on Node A runs update after update. While it runs, write sets (WS) are replicated towards Node B. The last step shows a WS together with the commit.]
Fragment Transaction
[Diagram sequence: an SR transaction pool with SR#1 THD, SR#2 THD, ... SR#n THD; incoming write sets pass certification and are handed to applier threads.]
1. A WS arrives for SR trx 2 with CF: 0 and passes certification.
2. An applier runs ev->apply_event() … ev->apply_event() and calls wsrep_SR_store->append_frag_apply().
3. A WS arrives for SR trx 2 with CF: 1 and passes certification.
4. An applier runs trans_commit() and calls wsrep_SR_store->append_frag_commit().
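The applier side of the sequence above can be modelled in a few lines. This sketch assumes that CF is a commit flag (the slides do not expand the abbreviation) and uses a hypothetical `StreamingApplier` class; it does not reproduce the server's wsrep_SR_store interface.

```python
class StreamingApplier:
    """Toy model of a slave node: one open transaction per streaming trx id."""

    def __init__(self):
        self.pool = {}       # SR transaction pool: trx id -> rows applied so far
        self.committed = []  # rows visible to readers after commit

    def apply(self, trx_id, rows, commit_flag):
        # Find (or open) the transaction this fragment belongs to and
        # apply the fragment's rows inside it.
        open_trx = self.pool.setdefault(trx_id, [])
        open_trx.extend(rows)
        if commit_flag:
            # Assumed meaning of CF: 1, the fragment that ends the transaction.
            self.committed.extend(self.pool.pop(trx_id))


applier = StreamingApplier()
applier.apply(2, ["r1", "r2"], commit_flag=0)
print(applier.pool, applier.committed)
applier.apply(2, ["r3"], commit_flag=1)
print(applier.pool, applier.committed)
```

After the first fragment the rows sit in the pool and nothing is visible; after the fragment with the flag set, the pool entry is gone and all three rows are committed.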
Configuring Streaming Replication
wsrep_trx_fragment_unit: unit metrics for fragmenting, options are:
● bytes: WS size in bytes
● events: # of binlog events
● rows: # of rows modified
● statements: # of SQL statements issued
wsrep_trx_fragment_size
● Threshold size (in units), when fragment will be replicated
● 0 = no streaming
Streaming Replication Demo Setup
1. Same scenario as before
2. Configure node1 to fragment huge transaction in 10K batches
● wsrep_trx_fragment_unit = bytes
● wsrep_trx_fragment_size = 10000
→ monitor trx/sec rate in the cluster when streaming replication progresses
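For reference, the demo settings could be turned into SQL statements with a small helper like the one below. The helper name `streaming_settings_sql` is hypothetical, and the sketch assumes both variables can be set per session with SET SESSION, which should be checked against the server version in use. The list of units is the one given on the configuration slide.

```python
VALID_UNITS = ("bytes", "events", "rows", "statements")  # units listed on the slide


def streaming_settings_sql(unit, size):
    """Build the statements that would enable streaming replication for a session."""
    if unit not in VALID_UNITS:
        raise ValueError(f"unknown fragment unit: {unit!r}")
    if not isinstance(size, int) or size < 0:
        raise ValueError("fragment size must be a non-negative integer")
    return [
        f"SET SESSION wsrep_trx_fragment_unit = '{unit}';",
        f"SET SESSION wsrep_trx_fragment_size = {size};",  # 0 = no streaming
    ]


for statement in streaming_settings_sql("bytes", 10000):
    print(statement)
```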
Streaming Replication
[Chart: Streaming Replication, trx/sec over time, vertical axis 0 to 4500. Annotation: Streaming Replication 70 secs.]
[Chart: trx/sec over time, vertical axis 0 to 4500.]
XA Transactions with Galera 3
XA Transaction Support
[Diagram sequence: Node A and Node B, both holding the row smith]
1. XA Start: an XA transaction opens on Node A.
2. XA Insert into persons 'jones': the insert is held in the XA transaction on Node A.
3. XA Prepare: a write set (WS) is replicated to Node B.
4. Node B applies the insert: Node A holds smith, Node B holds smith and jones.
5. XA Rollback: the XA transaction on Node A is rolled back.
6. End state: Node A holds smith, Node B holds smith and jones.
XA by Streaming Replication
XA Transaction Support
[Diagram sequence: Node A and Node B, both holding the row smith]
1. XA Start: an XA transaction opens on Node A and a WS is replicated.
2. XA Insert into persons 'jones': the insert is held in the XA transaction on Node A; a WS carries it to Node B, where it is held in an SR transaction.
3. XA Prepare: the insert stays in the XA transaction on Node A and in the SR transaction on Node B.
4. XA Rollback: a rollback WS is replicated.
5. End state: both nodes hold only smith.
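The difference between the two XA walkthroughs can be reproduced with a toy model. This is a sketch of the behaviour as drawn on the slides only; the `Node` class and both functions are hypothetical and say nothing about how the server implements XA.

```python
class Node:
    def __init__(self, rows):
        self.rows = list(rows)  # committed rows
        self.pending = []       # rows held in an open XA or SR transaction


def xa_replicate_at_prepare(rollback):
    """Galera 3 walkthrough: the write set is applied on Node B at XA PREPARE."""
    a, b = Node(["smith"]), Node(["smith"])
    a.pending.append("jones")   # XA insert on Node A
    b.rows.append("jones")      # XA PREPARE: WS replicated, Node B applies it
    if not rollback:
        a.rows += a.pending
    a.pending.clear()           # commit or rollback is seen by Node A only
    return a.rows, b.rows


def xa_by_streaming(rollback):
    """Streaming walkthrough: Node B holds the insert in an SR transaction."""
    a, b = Node(["smith"]), Node(["smith"])
    a.pending.append("jones")   # XA insert on Node A
    b.pending.append("jones")   # fragment held open on Node B
    for node in (a, b):         # the commit or rollback WS reaches both nodes
        if not rollback:
            node.rows += node.pending
        node.pending.clear()
    return a.rows, b.rows


print(xa_replicate_at_prepare(rollback=True))
print(xa_by_streaming(rollback=True))
```

In the first model a rollback leaves the two nodes with different contents; in the second both nodes end up the same whether the transaction commits or rolls back.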
Spider Cluster
[Diagram: "Insert into t values…." goes to a Spider SE node. Table t is split into parts A and B, and the Spider SE node writes to Shard A and Shard B over XA.]
[Diagram: each shard is a Galera cluster, one with Node 1, Node 2 and Node 3, the other with Node 1, Node 2 and garbd. The Spider SE node still reaches both over XA.]
[Diagram: the Spider SE layer is also drawn under a Galera label, with several Spider SE nodes in front of the two sharded Galera clusters.]
Spider ACID
Insert into t values (1),(2)
1. XA start on shard A and on shard B
2. Insert (1) on shard A, Insert (2) on shard B
3. XA prepare on shard A and on shard B
4. XA Commit on shard A and on shard B
Spider ACID
Insert into t values (1),(2) runs the same XA sequence while Select * from t runs concurrently. Depending on timing, the select can see:
● shard A (), shard B () = ()
● shard A (1), shard B () = (1)
● shard A (1), shard B (2) = (1),(2)
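The XA sequence and the three possible select results can be modelled with a toy two-phase commit. This is a sketch only: `Shard`, `select_all` and `distributed_insert` are hypothetical names, every shard always votes yes at prepare, and nothing here reflects Spider's actual code.

```python
class Shard:
    def __init__(self, name):
        self.name = name
        self.committed = []   # rows a reader can see
        self.prepared = None  # rows inside the open XA transaction

    def xa_start(self):
        self.prepared = []

    def insert(self, value):
        self.prepared.append(value)

    def xa_prepare(self):
        return True           # toy shard: always ready to commit

    def xa_commit(self):
        self.committed += self.prepared
        self.prepared = None


def select_all(shards):
    """What 'Select * from t' would return across all shards right now."""
    return [value for shard in shards for value in shard.committed]


def distributed_insert(shard_a, shard_b, value_a, value_b):
    """Run the XA sequence; yield the reader's view around each commit."""
    shards = (shard_a, shard_b)
    for shard in shards:
        shard.xa_start()
    shard_a.insert(value_a)
    shard_b.insert(value_b)
    if all(shard.xa_prepare() for shard in shards):
        yield select_all(shards)      # before any commit
        for shard in shards:
            shard.xa_commit()
            yield select_all(shards)  # after each shard commits


views = list(distributed_insert(Shard("A"), Shard("B"), 1, 2))
print(views)
```

The middle view, where only shard A has committed, corresponds to the (1)() = (1) case: a reader that arrives between the two XA commits sees half of the insert.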