High Availability with MariaDB Enterprise by Stéphane Varoqui. Presented 26.6.2014 at the MariaDB Roadshow in Paris, France.
© SkySQL Corporation Ab. Company Confidential. 09/06/2014
High Availability with MariaDB Enterprise
Stéphane Varoqui, Professional Services, SkySQL
Agenda
❏ Introduction to High Availability
❏ Different services that need HA
❏ Different components of High Availability
❏ Different MariaDB HA Solutions
  ❏ HA using MariaDB Replication
  ❏ HA using MariaDB Galera Cluster
Introduction to High Availability
High availability is a system design protocol and associated
implementation that ensures a certain degree of operational continuity during
a given measurement period
Introduction to High Availability
❏ High Availability != Long Uptime
  A system that is “up” might still not be accessible
  A system might go “down” just once, but for a long time
❏ High Availability rather means
  Long Mean Time Between Failures (MTBF)
  Short Mean Time To Recover (MTTR)
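The link between MTBF, MTTR and availability can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The sketch below is only an illustration: the function name and the hour figures are assumptions, not values from the presentation.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: the same failure rate, two different recovery times.
slow_recovery = availability(mtbf_hours=1000, mttr_hours=10)
fast_recovery = availability(mtbf_hours=1000, mttr_hours=1)
print(f"slow recovery: {slow_recovery:.4%}, fast recovery: {fast_recovery:.4%}")
```

With the failure rate held constant, shortening recovery alone raises availability, which is why a short MTTR matters as much as a long MTBF.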
Introduction to High Availability
❏ Availability level is measured as the ratio of time the system is available over a year, expressed as a percentage
❏ 99.9% availability means that the system is available at least 8751 of 8760 hours in a year, or that it is unavailable at most 9 hours per year
❏ 99.999% availability means that the system is available at least 525595 of 525600 minutes in a year, or that it is unavailable at most about 5 minutes per year
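The arithmetic behind these figures can be checked with a few lines of Python. This is a minimal sketch that assumes a 365-day year, as the figures above do; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Maximum minutes of unavailability per year for a given availability level."""
    unavailable_fraction = 1 - availability_percent / 100
    return HOURS_PER_YEAR * 60 * unavailable_fraction


for level in (99.9, 99.99, 99.999):
    minutes = downtime_budget_minutes(level)
    print(f"{level}% -> {minutes:.2f} minutes/year ({minutes / 60:.2f} hours)")
```

Each additional "nine" divides the yearly downtime budget by ten.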
Maintaining High Availability
❏ There are two common situations that we try to protect ourselves from using an HA solution
  ❏ Datacenter failure – A whole datacenter becomes unavailable for some reason, like power failure, network failure, a virus or similar situations
  ❏ Server failure – An individual server fails because of a hardware failure or something similar
Layers in an HA Solution
❏ All services that make up the application stack need HA for the system to achieve HA
  Web servers
  Application servers
  Applications
  Database servers
  Storage
  Network
[Diagram annotation: TCO for Unbreakable Hardware]
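Why every layer needs HA can be sketched numerically. Assuming that each layer is required for the system to work and that layers fail independently (both simplifications), the availability of the whole stack is the product of the per-layer availabilities. The layer values below are hypothetical.

```python
from math import prod

# Hypothetical per-layer availabilities, for illustration only.
layers = {
    "web servers": 0.999,
    "application servers": 0.999,
    "database servers": 0.999,
    "storage": 0.9995,
    "network": 0.9995,
}


def stack_availability(layer_availabilities) -> float:
    """Availability of a chain in which every layer must be up."""
    return prod(layer_availabilities)


total = stack_availability(layers.values())
print(f"whole stack: {total:.4%}")
```

Under these assumptions the stack is always less available than its weakest layer, so a single layer without HA limits the whole system.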
Services in an HA Solution
❏ Services fall into two types
  Stateless services
  These servers have no state beyond the current operation. If such a server fails, another server of the same type can replace it without having to transfer any set of data. Web servers and application servers are typical stateless services
  Stateful services
  These services maintain a state, and that state needs to be preserved if a server fails, and has to be made available to any other server that takes its place. A database service such as a MariaDB server is a typical stateful service
Components of High Availability
❏ Monitoring and Management
  Availability of the services needs to be monitored, to be able to take action when there is a failure. A failover can be manual or automatic, but it has to be managed
❏ Failover / Load Balancing mechanism
  Some mechanism to redirect traffic away from the failed server or datacenter to a working one
❏ Data redundancy
  For stateful services, we need to make sure that data is somehow made redundant
Monitoring and Management
❏ There are many different solutions here, some focused on specific services (in particular databases), some as part of a load balancing software solution, of an appliance, or of a purely software-based solution such as Linux-HA, HAProxy, MaxScale and MHA
Failover mechanism
❏ There is a wide range of options here too, in particular when it comes to datacenter failover, which can be more complicated
❏ Common mechanisms range from application based failover and DNS failover to Load Balancing and Network Failover
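Application based failover, the simplest of these mechanisms, can be sketched as a loop over an ordered host list. Everything named here is an assumption for illustration: `connect_with_failover`, the `fake_connect` stand-in for a real database driver call, and the host names.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(hosts: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each host in order and return the first successful connection."""
    errors = []
    for host in hosts:
        try:
            return connect(host)
        except OSError as exc:  # refused, timed out, unreachable, ...
            errors.append(f"{host}: {exc}")
    raise ConnectionError("all hosts failed: " + "; ".join(errors))


def fake_connect(host: str) -> str:
    # Stand-in for a real driver call; pretends the first server is down.
    if host == "db1.example.internal":
        raise ConnectionRefusedError("connection refused")
    return f"connected to {host}"


hosts = ["db1.example.internal", "db2.example.internal"]
print(connect_with_failover(hosts, fake_connect))
```

This keeps failover logic inside the application, which is easy to deploy but means every application must carry the host list; DNS or load balancer based failover moves that knowledge out of the application.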
Data Redundancy
❏ Providing Data Redundancy is complex, error prone and takes a toll on performance
❏ It also has to be mentioned that a SAN does not provide redundancy just because a disk set can be failed over from one server to another. A SAN might be a SPOF, but in some cases that is a risk that some customers are willing to take
Data Redundancy with MariaDB
❏ MariaDB Internal Replication
  ❏ Async
  ❏ Semi-sync
  ❏ Sync per table with Spider
❏ MariaDB Galera Cluster
❏ Storage based redundancy (Active / Passive setups)
  ❏ DRBD (Distributed Replicated Block Device)
  ❏ SAN (Storage Area Network)
  ❏ VMware replication
HA using MariaDB Replication
❏ In many cases the best solution
❏ Well known and easy to use technology
❏ The cluster is very loosely coupled: for example, not all nodes are aware of all the other nodes. It is a self-healing solution, insensitive to network latency and to differences in replication node speed (hardware variability)
❏ Offers the lowest master performance impact so far
❏ Multi-node read scalability with the best possible query latency; network data distribution is still slow compared with in-memory scale-up (photonic bus and memory are for 2020)
HA using MariaDB Replication
❏ In many cases the best solution
❏ Logical replication: no corruption is propagated
❏ Various topologies using multi-source, storage engine switching, multi-master, peer-to-peer
❏ Row vs statement format for strong network compression
❏ MariaDB is fixing the gotchas
❏ Parallel replication, group commit, checksum, heartbeat
❏ Can be extended with per-session consistency by waiting on a GTID position with MASTER_GTID_WAIT()
HA MariaDB Replication GTID
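-- Initial setup: convert old-style binlog coordinates to a GTID position
SET GLOBAL gtid_slave_pos = BINLOG_GTID_POS("master-bin.00024", 1600);
CHANGE MASTER TO master_host="10.2.3.4", master_use_gtid=slave_pos;
START SLAVE;
-- Failover: repoint the slave to the new master; the GTID position carries over
STOP SLAVE; CHANGE MASTER TO master_host="10.2.3.5"; START SLAVE;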
HA using MariaDB Replication
❏ Replication gotchas
❏ Can lose data on a master crash without semi-sync; semi-sync performance should be addressed in the next release
❏ Adding a slave stays manual and complex (MyISAM)
❏ Failover is more complex: it requires finding the most up-to-date slave, and depends on external products like MHA and on scripting for LB or HA solutions (MaxScale can fix)
❏ No automatic strong consistency (MaxScale can fix)
❏ No map reduce queries (Spider temporary table can fix)
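Finding the most up-to-date slave during failover can be sketched with GTID positions. A MariaDB GTID has the form domain-server-sequence, and a slave position may list one GTID per replication domain. The sketch below assumes a single active master per domain, so that a higher sequence number means more applied events; the helper names, the second host and the positions are hypothetical.

```python
def gtid_seq(gtid_pos: str, domain_id: int = 0) -> int:
    """Return the sequence number for one replication domain, or -1 if absent."""
    for gtid in gtid_pos.split(","):
        gtid = gtid.strip()
        if not gtid:
            continue
        domain, _server, seq = (int(part) for part in gtid.split("-"))
        if domain == domain_id:
            return seq
    return -1


def most_up_to_date(slave_positions: dict, domain_id: int = 0) -> str:
    """Pick the slave whose GTID position is furthest ahead in the given domain."""
    return max(slave_positions, key=lambda host: gtid_seq(slave_positions[host], domain_id))


# Hypothetical positions, as they might be read from @@gtid_slave_pos on each slave.
positions = {"10.2.3.5": "0-1-120", "10.2.3.6": "0-1-118"}
print(most_up_to_date(positions))
```

Tools such as MHA or MaxScale automate this comparison together with the promotion and repointing steps that follow it.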
HA using MariaDB Galera Cluster
❏ This is a software-only solution
[Diagram: Application / Application server; Load Balancer / Failover; three MariaDB Server nodes, each with a Galera wsrep Library, linked by Synchronous Replication]
Installing MariaDB
❏ MariaDB Galera Cluster consists of a separate MariaDB binary that in turn talks to the Galera wsrep library
❏ Once set up, Galera is configured using the usual my.cnf file, and is monitored using the SHOW GLOBAL STATUS command
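Monitoring through SHOW GLOBAL STATUS typically means reading the wsrep_ status variables. The sketch below assumes the rows of SHOW GLOBAL STATUS LIKE 'wsrep_%' have already been fetched into a dictionary by some database driver; the function names and the sample values are illustrative.

```python
def node_accepts_queries(status: dict) -> bool:
    """True if this node is ready, in the primary component, and fully synced."""
    return (
        status.get("wsrep_ready") == "ON"
        and status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state_comment") == "Synced"
    )


def cluster_is_complete(status: dict, expected_size: int) -> bool:
    """True if the node sees at least the expected number of cluster members."""
    return int(status.get("wsrep_cluster_size", 0)) >= expected_size


# Sample values, as they might be returned by SHOW GLOBAL STATUS LIKE 'wsrep_%'.
sample = {
    "wsrep_ready": "ON",
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_size": "3",
}
print(node_accepts_queries(sample), cluster_is_complete(sample, expected_size=3))
```

Keeping the two checks separate lets a monitor distinguish a node that must be taken out of rotation from a cluster that is merely running with fewer members than planned.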
HA using MariaDB Galera Cluster
Galera best points
❏ Scales reads
❏ Limited impact on concurrent writes, despite long-distance network latency
❏ Proper cluster with easy failover, optimistic locking, split-brain protection, no possible data loss on failover
❏ Synchronous replication with low overhead due to optimistic locking, Paxos queue, parallel event replication
❏ Transparent provisioning
HA using MariaDB Galera Cluster
Galera dark side
❏ InnoDB only solution
❏ Network architecture reliability
❏ Local Committed Read only (select for update)
❏ Prone to deadlock errors
For transactional scenarios it may still need read/write splitting!
Failover with MariaDB
❏ MariaDB Galera Cluster handles failed servers internally and provides a status that reports cluster membership
❏ MariaDB Galera Cluster also handles split-brain protection, and this requires at least 3 servers
❏ MariaDB Galera Cluster can also be configured without split-brain protection, for example when failover is manual or is handled in some other way
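The requirement of at least 3 servers follows from majority voting. The sketch below uses a simplified rule, a strict majority with equal node weights; Galera's actual quorum calculation also supports node weights, so treat this as an illustration of the principle only.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """A partition may keep serving only if it holds a strict majority."""
    return reachable_nodes > cluster_size / 2


# After a network split, each side counts the nodes it can still reach.
for size, reachable in ((2, 1), (3, 2), (3, 1)):
    print(f"{reachable} of {size} nodes reachable -> quorum: {has_quorum(reachable, size)}")
```

With 2 servers, a split leaves each side with exactly half of the cluster and neither has a majority; with 3 servers, one side always does.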
Setting up MariaDB
❏ To allow for protection against split-brain scenarios while using only 2 database servers, a third server can be set up with a Galera-specific arbitration agent (garbd)
❏ Galera can also be used with its own simple load balancer (glb), which auto-detects new nodes, although it is much more common to use other technologies, such as HA-aware connectors or a load balancer; in this case, use a backup policy and no load balancing for writes
❏ For advanced setups without application control, automatically balancing reads and directing writes to a single node, we advise you to try out our MaxScale proxy
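The routing policy described above, a backup policy for writes and load balancing for reads, can be sketched as follows. The `ClusterRouter` class and the node names are hypothetical and do not reflect how glb or MaxScale are implemented.

```python
from itertools import count


class ClusterRouter:
    """Writes go to the first healthy node; reads rotate over all healthy nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)  # list order defines write priority
        self.healthy = set(self.nodes)
        self._read_counter = count()

    def mark_down(self, node):
        self.healthy.discard(node)

    def mark_up(self, node):
        if node in self.nodes:
            self.healthy.add(node)

    def _alive(self):
        alive = [n for n in self.nodes if n in self.healthy]
        if not alive:
            raise RuntimeError("no healthy node available")
        return alive

    def write_node(self):
        # Backup policy: always the highest-priority healthy node, never balanced.
        return self._alive()[0]

    def read_node(self):
        # Round-robin over whichever nodes are currently healthy.
        alive = self._alive()
        return alive[next(self._read_counter) % len(alive)]


router = ClusterRouter(["node1", "node2", "node3"])
print(router.write_node(), router.read_node())
```

Sending all writes to one node at a time avoids the cross-node write conflicts that surface as deadlock errors in Galera, while reads can still use every node.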
Questions?