There are two reasons I chose to use common_schema for this task:
Chunking breaks a single update statement into multiple statements, each updating at most a certain specified number of rows. The default chunk size is 1000, and I changed it to 2000 by setting the
Throttling has two benefits: it minimizes the amount of load on the master host, and it minimizes the replication lag introduced when the update statements are executed on the replicas. It achieves this by introducing sleep statements after each chunk is updated. The duration of the sleep statement is proportional to the runtime of the update statement. After experimenting with several different
throttle values in a test environment, I decided on a value of 4, which means it will sleep for 4 times as long as each update takes. Thus if an update takes 100 milliseconds, it will sleep for 400 milliseconds before executing the next update. Using
throttle does not guarantee that you will completely avoid replication lag, but if you do some testing you should be able to find an appropriate value that will strike a balance between the total run time of your updates and the level of replication lag that it introduces.