Fast-forwarding a Delayed MySQL Replica
Last week I needed to fast-forward a delayed MySQL replica in order to recover some data. I use pt-slave-delay to keep this replica X days behind the master, and I needed to fast-forward it to be about X-2 days behind the master instead.
I wanted to catch the replica up to a precise moment in time, and it was important that it not go past that point until I was done recovering the data I needed.
I knew there was a command to do this, but I couldn’t remember the syntax. Since I figured
pt-slave-delay was implemented using this command, I just opened the perl script and searched around until I found the syntax I was looking for. Here it is:
Once I got the right syntax I needed to know which
log_pos values to use.
I got the
log_name by SSHing to the master DB, listing the binary log directory, and finding the oldest log with an mtime greated than the time I was looking for (If there is a way to get a listing of binary logs based on time range in the MySQL CLI, please let me know in the comments!).
Once I had
log_name, I used the
mysqlbinlog tool to find the
log_pos I was looking for. In my experience
start-datetime is inclusive, and
stop-datetime is exclusive, so to get all of the binary logging from
2014-02-04 16:18:30 I use that as
start-datetime and add 1 second to
stop-datetime. For example:
1 2 3
Then I sorted through the output of that until I found the most recent SQL statement I wanted to execute on my delayed replica. One of the comments preceding that statement has an
end_log_pos value, which I used as my
In order to fast-forward the replica I needed to stop running
pt-slave-delay. Otherwise it would not let replication proceed past X days behind the master.
Once I had those values, here’s the process I followed to fast-forward the replica:
- Disable monit for
pt-slave-delayso it wouldn’t restart automatically
STOP SLAVEon the delayed replica
START SLAVE UNTIL MASTER_LOG_FILE = 'bin-log.000386', MASTER_LOG_POS = 950592556on the delayed replica
- Wait for replication to catch up to that point
- Recover the data I need
STOP SLAVEon the delayed replica (just to be safe)
- Enable monit for
pt-slave-delay(or just let monit start it)
At that point the replica will stay in that state for about 2 days, at which point
pt-slave-delay will again enforce the pre-defined replication delay of X days.