Flite Culture

One Advantage of MySQL Statement-based Replication

MySQL introduced row-based replication in version 5.1, but I still use statement-based replication most of the time. One thing I like about statement-based replication is that I can delete a row from the master that doesn’t exist on the replica, without an error. I’m sure many people don’t care about this feature because their master(s) and replica(s) contain exactly the same data set (maybe they even keep them in sync use pt-table-checksum and pt-table-sync). That being said, personally I have several use cases for storing different rows in the master than the replica. Here’s one of them.

Recently I’ve been working on a project that requires deleting a lot of row data from a legacy application in a production database. The legacy app uses many of the same tables as my current app, so I can’t just truncate a bunch of old tables. Instead I need to selectively delete certain rows. I’m starting this data pruning on the replicas before doing it on the masters, so for some period of time there will be rows on the masters that do not exist on the replicas. If I delete one of the rows from the master using row-based replication the delete would cause a replication error. Here’s a quick example to illustrate my point.

I’ll create a table with two rows on a master DB with row-based replication enabled:

1
2
3
create table pruning(id int not null primary key, str varchar(50));
insert into pruning (id,str) values(1,'old');
insert into pruning (id,str) values(2,'new');

First I delete the “old” row on the replica:

1
delete from pruning where id = 1;

Then I delete the “old” row on the master:

1
delete from pruning where id = 1;

With row-based replication here’s the error I see on the replica when running SHOW SLAVE STATUS:

Last_SQL_Error: Could not execute Delete_rows event on table test.pruning; Can’t find record in ‘pruning’, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event’s master log mysql-bin.000002, end_log_pos 1448

With statement-based replication, rather than trying to find the row to delete on the replica and failing, replication would execute the DELETE statement on the replica successfully despite the fact that it affects 0 rows.

The same thing happens if I update a row on the master that does not exist on the replica:

1
update pruning set str = 'updated on master' where id = 1;

Last_SQL_Error: Could not execute Update_rows event on table test.pruning; Can’t find record in ‘pruning’, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event’s master log mysql-bin.000002, end_log_pos 1252

The MySQL reference manual dedicates a page to various pros and cons of row-based and statement-based replication, but it doesn’t mention this case so I thought it was worth pointing out.

Comments

Comments