Scripted MySQL Replication Consistency Checks

I’ve been fixing and breaking MySQL replication clusters for quite some time now, and I realized one of the biggest problems is the lack of consistency provided through MySQL replication.

Sure your data will be consistent most of the time, but how do you check if it really IS consistent across all your slaves? How do you make sure your slaves don’t have missing or invalid entries?

I’m sure you’ve all run:

mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;

Not always a good idea…

Well today I present you with a little bash script I’ve written which performs all these verifications. I haven’t invented anything. On the contrary, I’m just using the methods and tools provided by Percona in their fantastic toolset called Maatkit.

Usage:

  • Change the default “User Defined Variables” in the script, to reflect your MASTER mysql server.

  • Configure slave reporting on each slave so that “SHOW SLAVE HOSTS\G” works from the MASTER

  • Make sure your slaves are running properly: Slave_IO_Running: Yes, Slave_SQL_Running: Yes, Seconds_Behind_Master: 0

  • Make sure you have Maatkit installed (apt-get install maatkit)

  • Run the script on the MASTER like this: ./mysql_consistency.sh -c

I have tested this script on Debian Lenny (5.0) with maatkit version 4334-1 and MySQL 5.0.

How does it work?

When you run the script, after performing some necessary sanity checks, the MASTER will create a checksum of every database and every table. It will store those results in the default database called test in the table called checksum. It will then replicate the data to the SLAVES who will create their own checksums on the same databases and tables. Afterwards it will tell you which slaves are consistent and which ones are not.

[root@db01 /opt (353)]#: ./mysql_consistency.sh -c
Checking consistency
Replication Slave ID 3 on 172.16.0.63:3306 is consistent.
Replication Slave ID 4 on 172.16.0.64:3306 is consistent.
Replication Slave ID 5 on 172.16.0.65:3306 is inconsistent. Requires rebuild

You might get some error messages too.

Download the script here: mysql_consistency.sh

Please notify me in the comments of any errors or adjustements as I’ve only used this in a small test-environment.

Comments (2)

  • Jean-Francois

    October 4, 2009

    11:58 pm

    Very nice info.
    If the scripts finds that a slave in inconsistent, does the maatkit toolset have something to resync the slave without having to re-create the whole thing from scratch? (ex: stop sync, copy files over, yadda-yadda)

  • Alex

    October 9, 2009

    6:20 pm

    Actually maatkit does have a command: mk-table-sync which would apparently sync the differences. I don’t think it’s necessarily a bad thing to resync everything from scratch, assuming your DB doesn’t hold 500GB of data.

    If your MySQL databases are running off an LVM partition, you can rebuild a slave using the latest snapshot of a master or a “consistent” slave.