Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Backups in the real word: Challenges and Solutions
Ryan HuddlestonDirector Remote DBA
Wednesday, March 27, 13
www.percona.com
Why do we need backups?• Anything can and will fail• Applications have bugs• Everyone makes mistakes• Plan for the worst
Wednesday, March 27, 13
www.percona.com
What happens when there are no backups?
Wednesday, March 27, 13
www.percona.com
Wednesday, March 27, 13
www.percona.com
What happens when there are no backups?
Wednesday, March 27, 13
www.percona.com
Types of outages to prepare for• A customer removes data through your app then realizes
it was a mistake• An admin runs UPDATE/DELETE on the wrong data• Your application has a bug that overrides or removes
data• A table or entire schema was dropped• InnoDB corruption bug that propagates via replication • Your database or application was hacked
• All of the above changes replicate to your replication slaves!
Wednesday, March 27, 13
www.percona.com
Hardware and Bugs• InnoDB table is corrupt and mysql shuts down• Your server, RAID controller or other attached hardware
crashes and all data is lost or is unavailable• A disk failed, and RAID array does not recover
• Redundant hardware with no shared components is the answer
Wednesday, March 27, 13
www.percona.com
Disaster Scenarios• You lose your entire SAN and all your DB servers were
located there• You lose a PSU or network switch in your datacenter and
some or all of your servers go down in that location• Your entire datacenter loses power and the generators
do not start• Hurricane Sandy put’s a few feet of water in your data
center
Wednesday, March 27, 13
www.percona.com
Tools used by Percona Remote DBA
• Percona XtraBackup for MySQL for binary backups
• mydumper for logical backups• mysqlbinlog 5.6• Amazon S3
• monitoring for all the above
Wednesday, March 27, 13
www.percona.com
Percona XtraBackup • Performs Binary backups• Can restore an entire server very fast. • Can compress on the fly• Can backup a server quickly, typically only limited by
available IO
Wednesday, March 27, 13
www.percona.com
When do we restore from Xtrabackup?
• Setting up new slaves• When we lose an entire server due to hardware failure,
corruption, etc• When the majority of data on the server was lost.
• Basically when restoring may take less time that trying to load a logical backup
Wednesday, March 27, 13
www.percona.com
XtraBackup Tips/Tricks• Upgrade to 2.0.5+, less locking, less IO when using
compression• Use --rsync option. Syncs MyISAM and other non-
transactional tables. Syncs frm files. Results it almost no locking
• enable --slave-info when backing up from a slave• --compress option available
Wednesday, March 27, 13
www.percona.com
Mydumper• Very fast for logical backups (compared to mysqldump)• Consistent backups between myisam and innodb tables.• Almost no locking, if not using myisam tables• Each table is dumped to a separate file.• Built in compression
• Compressed mydumper typically 3x-5x smaller vs compressed xtrabackup
• Perfect for uploading to S3 given bandwidth constraints
Wednesday, March 27, 13
www.percona.com
When do we restore from Mydumper?
• Restoring a single table• Restoring a single schema or rolling forward a single
schema to a point in time• Restoring data while automatically replicating out to all
slaves
Wednesday, March 27, 13
www.percona.com
Mydumper Challenges• You can’t rely on mydumper to dump schema‘s. It does
not handle views/triggers/procedures• Run with --no-schemas and use mysqldump for the schemas
and rely on mydumper for data only.• You will have to compile it yourself, as binary packages
aren’t distributed• Be careful with importing a dump from a server running
in a different timezone.
https://bugs.launchpad.net/mydumper/+bug/1125997
Wednesday, March 27, 13
www.percona.com
Schema backup• loop through each DB• write out ALTER DATABASE DEFAULT CHARACTER
SET <charset> > schema.sql• mysqldump -d -R skip-triggers >> schema.sql• mysqldump -d -t > schema-post.sql
• has all triggers
Wednesday, March 27, 13
www.percona.com
Mydumper Tips/Tricks• run with --kill-long-queries to avoid “FLUSH TABLES
WITH READ LOCK ” problems• enable --compress, compresses tables per file
• The time needed to uncompress is not a limiting factor on restore time when done inline.
Wednesday, March 27, 13
www.percona.com
mysqlbinlog 5.6• mysqlbinlog --read-from-remote-server --raw --stop-
never• Mirror the binlogs on the master to a second server• Works with 5.1/5.5 servers also• Roll forward backups even after losing the master• Very useful for disaster recovery
• Backups in S3• mysqlbinlog --stop-never running on a small ec2 instance
• Takes very little resources to run. Can run about anywhere with disk space and writes out binlog files sequentially.
Wednesday, March 27, 13
www.percona.com
mysqlbinlog 5.6• Ensure it stays running, restart it if it appears to be
hanging• Verify the file size is the same on master and slave• Re-transfer files that are partially transferred• Compress the files after successful transfer
• http://tinyurl.com/73jo3cm
Wednesday, March 27, 13
www.percona.com
Amazon S3• s3cmd we have been using the version from github,
Mostly for multi-part upload support. This prevents us from having to split files up before uploading to S3.
• There is released alpha version of this version at http://s3tools.org
• With S3 you can now set bucket lifecycle properties so data over X days is archived to Glacier and data over Y days is removed
Wednesday, March 27, 13
www.percona.com
Amazon S3cmd Tips• --add-header=x-amz-server-side-encryption:AES256 to
use the server side encryption feature• use_https = True, especially if your data is not encrypted
before transfer
Wednesday, March 27, 13
www.percona.com
Monitoring• We employ nsca nagios alerts for all of the backup
processes.• freshness_threshold should be set so if your nsca hasn’t
checked in within a certain period it will alert you• For our mysqlbinlog processes, we have it sending nsca
alerts every 30 seconds and have it alert when nothing has been received for 15 minutes -> 1 hour
• If backups throw an error and are aborted, we send a critical alert immediately to be investigated
• The number one cause of backup alerts are due to problems with FLUSH TABLE WITH READ LOCK
Wednesday, March 27, 13
www.percona.com
Tips for Backups• Use both logical and binary backups• Store backups on more than one server• Store copied of your backups offsite• Pull binlogs offsite• Test your backups• Run pt-table-checksum
Wednesday, March 27, 13
Percona Remote [email protected]
http://www.percona.com/products/mysql-remote-dba
We’re Hiring! http://www.percona.com/about-us/careers/
Wednesday, March 27, 13