Executing Complex Deletes With Common_schema
Recently I started a project that required deleting millions of rows of data from various MySQL tables. Many of the queries which identifed the rows to delete required several joins. I decided to use common_schema’s
split() function to break the deletes into chunks, but when I started running the queries in common_schema some of them failed to delete any data. I assume the failures were related to the complexity of the multi-table delete statements.
Thiking of this as a two step problem a natural workaround emerged. The two steps are:
- Identify the rows to delete
- Delete the rows
I want to use
split() for step 2, but I don’t really need it for step 1.
I decided to create a new table to store the ids which define the rows to delete, then populate that table, and use a simpler multi-table delete statement to execute the deletes using
split(). So far this workaround has allowed me to execute several separate delete statements, each deleting millions of rows in chunks.
Here’s an example. Say I want to execute a delete like this:
1 2 3 4 5 6 7
I can delete the same rows using the two step approach discussed above as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24