Adding a Unique Constraint With More Than 16 Columns in MySQL
A while ago I blogged about a potential workaround for MySQL error 1070 that could be used to add a unique constraint with more than 16 columns.
As a reminder here’s the error you get when you try to create a unique constraint with more than 16 columns in MySQL:
The solution I proposed should work, but when I started to implement it I made a couple of changes that merit this follow-up post. One change is fairly generic, and the other is specific to my use case.
The generic change was to only concatenate some of the columns, rather than all of them. Specifically I only need to concatenate N-15 columns. So if I want to have a unique constraint on 20 columns, I can include 15 of those actual columns and then concatenate the remaining 5 columns into a new column that is included as the 16th column in the unique constraint. When choosing which columns to concatenate, it makes sense to choose columns with short values, since that limits the amount of redundant data you are storing and helps prevent the table from taking up too much disk space.
As in my previous post, here’s a table with 20 columns:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
I can include 15 of those columns in my unique constraint, and concatenate the other 5 together to make the 16th column in the index. Looking at the schema, it’s not immediately obvious which columns are the smallest. The
INT columns can have values ranging from 1 to 11 characters long, while the
CHAR columns can have values ranging from 1 to 8 characters long. Since the range of lengths is fairly small I could choose the 5 columns to concatenate arbitrarily, or if I have a representative data set I could run some queries using
AVG(CHAR_LENGTH())) to empircally determine which columns are shortest. For the sake of simplicity in this example I’ll use columns c16 - c20.
Now I add a new column to store those 5 columns concatenated together. The column needs to be long enough to store the concatenated values. Since each
CHAR column value can be up to 8 characters long, and the hyphens will account for 4 characters, the new column needs to store up to (5 * 8) + 4 = 44 characters.
Here’s the DDL to add the new column and the unique constraint:
1 2 3 4
If the table has data in it I can populate the new concatenated column for all existing rows with an update statement:
1 2 3 4 5 6 7
The update will fail if any existing data violates the new unique constraint.
Now that the existing data is updated, I need to make sure the concatenated column is populated properly going forward. In my previous post I used triggers to populate the concatenated column. In some ways that is the safest implementation, because it practically guarantees that the concatenated column will always be set regardless of how the table is written, but it adds some overhead to inserts and updates. In my specific use case I am writing tens of thousands of rows at a time to the table using
LOAD DATA INFILE. Once written the rows are never updated, and there are no other code paths writing to the table, so for me it makes more sense to set the values in the
LOAD DATA INFILE statement rather than using triggers.
Here’s an example to load the contents of a tab-delimited text file into my database table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
An alternative implementation could be to use virtual columns, as pointed out by some of the commenters on my previous post. I have not tried that approach yet – mostly because I don’t use MariaDB or MySQL 5.7 – but I may set up a sandbox instance to try that out soon.