Flite Culture

Using Innodb_large_prefix to Avoid ERROR 1071

If you’ve ever tried to add an index that includes a long varchar column to an InnoDB table in MySQL, you may have seen this error:

ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes

The corresponding character limit depends on the character set you use. For example, with latin1 (1 byte per character) the largest column you can index is varchar(767), but with utf8 (3 bytes per character) the limit is varchar(255). There is also a separate 3072 byte limit on the total length of an index, so you can include multiple columns of up to 767 bytes each, as long as the combined index stays within 3072 bytes. (MyISAM is a little different: it has a 1000 byte limit on total index length, but no separate per-column limit within that.)
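
You can see the utf8 arithmetic in action with a quick test on a server that does not have innodb_large_prefix enabled (hypothetical table names):

-- 255 characters * 3 bytes = 765 bytes, which fits under the 767 byte limit
CREATE TABLE prefix_ok (c VARCHAR(255), KEY (c))
  ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- 256 characters * 3 bytes = 768 bytes, which fails with ERROR 1071
CREATE TABLE prefix_too_long (c VARCHAR(256), KEY (c))
  ENGINE=InnoDB DEFAULT CHARSET=utf8;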

One workaround for these limits is to only index a prefix of the longer columns, but what if you want to index more than 767 bytes of a column in InnoDB?

In that case you should consider using innodb_large_prefix, which was introduced in MySQL 5.5.14 and allows you to include columns up to 3072 bytes long in InnoDB indexes. It does not affect the index limit, which is still 3072 bytes as quoted in the manual:

The InnoDB internal maximum key length is 3500 bytes, but MySQL itself restricts this to 3072 bytes. This limit applies to the length of the combined index key in a multi-column index.

Read on for details and examples about innodb_large_prefix.
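
As a quick preview, here's a minimal sketch of enabling it. In 5.5 and 5.6 the setting only applies to tables using the Barracuda file format with a DYNAMIC or COMPRESSED row format, which in turn requires innodb_file_per_table (the table and column names here are hypothetical):

SET GLOBAL innodb_file_format = Barracuda;
SET GLOBAL innodb_file_per_table = ON;
SET GLOBAL innodb_large_prefix = ON;

-- With latin1 this index is 1024 bytes: over 767, but under the 3072 limit
CREATE TABLE long_index_test (
  long_col VARCHAR(1024) CHARACTER SET latin1,
  KEY (long_col)
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;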


Replacing Pt-slave-delay With MASTER_DELAY in MySQL 5.6

In the past I have used pt-slave-delay when I want to maintain an intentionally delayed replica. Now that I have upgraded to MySQL 5.6 I am switching over to use MASTER_DELAY, which is a built-in feature that does the same thing.

For example I can replace this:

pt-slave-delay --delay 7d --interval 1m --daemonize

With this:

STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 604800;
START SLAVE;

The implementation is similar: the IO thread copies the events to the relay log as fast as normal, but the SQL thread only executes events older than the defined lag. The process to fast-forward the replica should be similar as well.

So far I see a couple of advantages of using MASTER_DELAY:

  • Running stop slave and start slave manually can’t cause the replica to catch up beyond the defined lag, which could happen with pt-slave-delay
  • No need to monitor the daemon (I used monit for this when running pt-slave-delay)

Since MASTER_DELAY is part of CHANGE MASTER, it is persisted with the rest of the replication configuration, so it doesn't need to (and can't) be defined in my.cnf, and it survives a restart.
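
One way to verify the delay is in effect is SHOW SLAVE STATUS, which in 5.6 reports both the configured delay and the remaining delay for the next event (output truncated, and the remaining-delay value below is made up):

mysql> SHOW SLAVE STATUS\G
...
          SQL_Delay: 604800
SQL_Remaining_Delay: 312244
...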


Fine-Tuning MySQL Full-Text Search With InnoDB

If you are using FULLTEXT indexes in MySQL and plan to switch from MyISAM to InnoDB then you should review the reference manual section on Fine-Tuning MySQL Full-Text Search to see what configuration changes may be required. As I mentioned in yesterday's post, when I compared query results on my database with FULLTEXT indexes in MyISAM versus InnoDB I got different results. Specifically, the InnoDB tables were returning fewer results for certain queries with short FULLTEXT search terms. Here's an example of a query that returned fewer results on InnoDB:

select id
from flite.ad_index
where match(name,description,keywords) against('+v1*' IN BOOLEAN MODE);

The issue was that all of the fine tuning I had done before was limited to MyISAM, so it didn't affect InnoDB. In the past I configured MySQL FULLTEXT search to index words as short as 1 character (the default is 3), and to index common words (by not using any stopword list). These are the relevant variables I set in my.cnf:

ft_min_word_len = 1
ft_stopword_file = ''

InnoDB has its own variables to control stopwords and minimum word length, so I needed to set these variables when I changed the tables from MyISAM to InnoDB:

innodb_ft_min_token_size = 1
innodb_ft_enable_stopword = OFF

Since those variables are not dynamic, I had to restart MySQL for them to take effect. Furthermore, I needed to rebuild the FULLTEXT indexes on the relevant tables. This is how the manual instructs you to rebuild the indexes:

To rebuild the FULLTEXT indexes for an InnoDB table, use ALTER TABLE with the DROP INDEX and ADD INDEX options to drop and re-create each index.
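
Following that instruction literally would look something like this (the index name here is hypothetical, though the columns match my FULLTEXT index):

alter table flite.ad_index
  drop index ft_ad_index,
  add fulltext index ft_ad_index (name, description, keywords);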

Rather than drop and recreate the indexes, I just used ALTER TABLE ... FORCE to rebuild the table (and indexes), like this:

alter table flite.ad_index force;

After making those changes I re-ran pt-upgrade, and now I am getting the same set of rows back from MyISAM and InnoDB. The order of the rows is slightly different in some cases, but as I mentioned yesterday that is expected behavior.


Testing MySQL FULLTEXT Indexes in InnoDB Using Pt-upgrade

As I prepare to convert some MySQL tables with FULLTEXT indexes from MyISAM to InnoDB I want to verify that running a standard production query set against the tables will return the same results with InnoDB that it did with MyISAM. Since I read Matt Lord’s blog post about the document relevancy rankings used for InnoDB full-text searches I knew to expect some differences when sorting by relevancy, so I want to focus on getting the same set of rows back, mostly ignoring the order in which the rows are returned.

Percona Toolkit has a tool called pt-upgrade that works well for this purpose. I used 2 test servers with a copy of my production database. On one of the servers I left the tables in MyISAM, and on the other I converted the tables to InnoDB. I copied a slow query log from a production host running with long_query_time=0 to get the query set for testing. Since I was only interested in queries on a few tables, rather than running the entire slow query log against the servers I extracted just the queries I was interested in and ran them as a raw log.
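
The extraction can be as simple as a grep, assuming each statement in the slow log fits on a single line (the pattern below is just an illustration):

grep -i "from flite.ad_index" /tmp/proddb-slow.log > /tmp/proddb-slow.log.raw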

Here's the pt-upgrade command I used:

pt-upgrade --read-only \
  --database flite \
  --type rawlog /tmp/proddb-slow.log.raw \
  h=testdb33.flite.com \
  h=testdb47.flite.com

I used the --read-only flag so pt-upgrade would only execute SELECT statements, and not any statements that modify data.

Since I extracted the SQL queries from the slow query log instead of using the full slow query log, I used --type rawlog instead of the default of --type slowlog.

For the two hosts I compared, testdb33 is using FULLTEXT on InnoDB, and testdb47 is using FULLTEXT on MyISAM.

When I ran pt-upgrade it exposed several significant discrepancies. I will document those discrepancies and how I fixed them in a future post.


MyISAM Crash Recovery and Triggers

I’ve recently run into a scenario where MyISAM auto-repair does not appear to work in MySQL 5.6. Most of my tables are InnoDB, but having recently upgraded from MySQL 5.5 to 5.6 I still have a handful of MyISAM tables in order to use FULLTEXT indexes since FULLTEXT on InnoDB was not supported in MySQL 5.5. These MyISAM tables are essentially copies of InnoDB tables, and are kept in sync via triggers.
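
For context, the sync triggers look roughly like this (a simplified sketch; the source table and column list are hypothetical):

create trigger ad_index_sync_update
after update on flite.ads
for each row
  update flite.ad_index
  set name = new.name,
      description = new.description,
      keywords = new.keywords
  where id = new.id;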

I’m using myisam_recover_options=FORCE,QUICK so I expect any crashed MyISAM tables to be repaired automatically when mysqld tries to open them, but when one of these MyISAM tables crashes the auto-repair does not happen unless I explicitly interact with the table. For example I’ll see an error like this if replication updates a table that triggers an update on a crashed MyISAM table:

2014-06-02 15:01:01 8617 [ERROR] /usr/sbin/mysqld: Table './flite/ad_index' is marked as crashed and should be repaired
2014-06-02 15:01:01 8617 [ERROR] Slave SQL: Error 'Table './flite/ad_index' is marked as crashed and should be repaired' on query. Default database: 'flite'. Query: 'update AD_INDEX set ...

That error stops replication, so I need to repair the relevant MyISAM table before I can resume replication. If I execute an explicit query on that table then mysqld will repair it automatically. For example a query as simple as this leads to auto-repair:

select count(*) from flite.ad_index;

I'm hoping to get rid of these MyISAM tables soon, since I can now add FULLTEXT indexes to the InnoDB tables directly. Once I do that, this issue will no longer affect me. Nonetheless, it seems to me that auto-repair should handle this scenario, so I entered a bug report.


Re-installing Common_schema After Upgrading to MySQL 5.6

After upgrading from MySQL 5.5 to 5.6, I started getting some warnings when running queries in common_schema. For example:

+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                          |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1109 | Unknown table 'innodb_index_stats' in information_schema                                                                                         |
| Warning | 1109 | Unknown table 'innodb_index_stats' in information_schema                                                                                         |
| Warning | 1356 | View 'common_schema.processlist_repl' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them |
| Warning | 1356 | View 'common_schema.processlist_top' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them  |
| Warning | 1356 | View 'common_schema.slave_hosts' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them      |
| Warning | 1356 | View 'common_schema.slave_status' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them     |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------+

The common_schema installer is good at accommodating whichever version of MySQL it is being installed on, but if you change that version it makes sense that common_schema needs to be re-installed. As stated in the documentation:

If you should upgrade your MySQL server, or enable features which were turned off during install of common_schema, the new feature set are not automatically available by common_schema, and a re-install of common_schema is required.

A re-install is easy. Here’s what I did:

wget https://common-schema.googlecode.com/files/common_schema-2.2.sql
mysql --init-command="set sql_log_bin = 0;" < common_schema-2.2.sql

Most of the warnings went away, but I’m still getting this:

+---------+------+----------------------------------------------------------+
| Level   | Code | Message                                                  |
+---------+------+----------------------------------------------------------+
| Warning | 1109 | Unknown table 'innodb_index_stats' in information_schema |
| Warning | 1109 | Unknown table 'innodb_index_stats' in information_schema |
+---------+------+----------------------------------------------------------+

Apparently in MySQL 5.6 the innodb_index_stats table has moved from information_schema to the mysql schema, so I submitted a bug report. So far this isn’t affecting the way I use common_schema, but if it did the common_schema source code is very readable so it would probably be a simple patch.
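
For what it's worth, the table is easy to find in its new home; a quick query like this shows the persistent statistics:

select database_name, table_name, index_name, stat_name, stat_value
from mysql.innodb_index_stats
limit 5;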


Upgrading Temporal Columns From MySQL 5.5 to MySQL 5.6 Format

Before upgrading from MySQL 5.5 to 5.6, I read the relevant page in the reference manual and found this section about the storage format change for the DATETIME, TIME, and TIMESTAMP data types to support microseconds:

Incompatible change: For TIME, DATETIME, and TIMESTAMP columns, the storage required for tables created before MySQL 5.6.4 differs from storage required for tables created in 5.6.4 and later. This is due to a change in 5.6.4 that permits these temporal types to have a fractional part. After upgrading from MySQL 5.5 to MySQL 5.6.4 or later, it is recommended that you also upgrade from MySQL 5.5 to MySQL 5.6 TIME, DATETIME, and TIMESTAMP types…

The problem is that the manual doesn’t tell you how to “upgrade from MySQL 5.5 to MySQL 5.6 TIME, DATETIME, and TIMESTAMP types”. I figured simply rebuilding the relevant tables would probably do the trick, and I found some confirmation in a blog post from the MySQL server team. Quoting that post:

The ALTER TABLE requests ADD/CHANGE/MODIFY COLUMN, ADD INDEX or FORCE operation upon detecting old temporal data types upgrades them to the new format. Also a ‘NOTE’ is reported to indicate the user that an upgrade of the old temporal columns to the new format has been performed as well.

So I figured I would use ALTER TABLE ... FORCE to rebuild my tables. But which tables need to be rebuilt? I could simply rebuild every table that has DATETIME, TIME, and/or TIMESTAMP columns, but I'd rather be able to tell which storage format a table is using so I don't rebuild it unnecessarily. For InnoDB tables I can identify the columns using the old temporal types by checking the MTYPE value in the INFORMATION_SCHEMA.INNODB_SYS_COLUMNS table: that value will be 6 (INT) for the old storage format and 3 (FIXBINARY) for the new one. Since almost all of my tables are InnoDB, that approach works well for me. For other storage engines I'll just rebuild all tables with DATETIME, TIME, and/or TIMESTAMP columns.

Here’s a query to list all of the relevant columns suspected of using the old format:
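
This sketch joins INNODB_SYS_COLUMNS (for the MTYPE value) to INFORMATION_SCHEMA.COLUMNS to restrict the check to temporal columns. Treat it as an approximation rather than the exact query; for one thing, partitioned tables would need extra handling:

select t.name as table_name, c.name as column_name
from information_schema.innodb_sys_tables t
join information_schema.innodb_sys_columns c using (table_id)
join information_schema.columns ic
  on  ic.table_schema = substring_index(t.name, '/', 1)
  and ic.table_name   = substring_index(t.name, '/', -1)
  and ic.column_name  = c.name
where c.mtype = 6
  and ic.data_type in ('datetime', 'time', 'timestamp');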


Disabling Binlog_checksum for MySQL 5.5/5.6 Master-master Replication

Replicating from a newer major version to an older major version in MySQL (for example a 5.6 master and a 5.5 replica) is generally not recommended, but when upgrading a master-master replication topology it's hard to avoid this scenario entirely. We ended up in this situation last week when upgrading the passive master of an active-passive master-master pair from 5.5 to 5.6. The primary replication flow, from the active master (5.5) to the passive master (5.6), ran with no errors, but pt-heartbeat was still writing to the passive master, so the active master had to replicate those events back from the 5.6 binary log. That failed with this error on the active master:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘Slave can not handle replication events with the checksum that master is configured to log; the first event ‘bin-log.002648’ at 4, the last event read from ‘/var/lib/mysqllogs/bin-log.002648’ at 120, the last byte read from ‘/var/lib/mysqllogs/bin-log.002648’ at 120.’

Why did this happen? Starting in MySQL 5.6.6, the new binlog_checksum option defaults to CRC32. Since that option did not exist in MySQL 5.5, the replica can't handle the checksums coming from the master. Therefore I recommend setting binlog_checksum=NONE in my.cnf on the upgraded server as part of the upgrade process for a master-master setup to avoid this error.

My fix was to run this on the passive master:

set global binlog_checksum='NONE';

Then I added this to my.cnf so it would survive a restart:

binlog_checksum=NONE

After that change was made I examined the binary log to confirm that it did not include anything other than pt-heartbeat activity, and then executed CHANGE MASTER on the active master to skip the checksummed events.
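
The skip itself was just a matter of pointing the active master past the offending events, roughly like this (the log file name and position are illustrative, not the actual values):

STOP SLAVE;
CHANGE MASTER TO
  MASTER_LOG_FILE = 'bin-log.002649',
  MASTER_LOG_POS = 4;
START SLAVE;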

Once the other master is upgraded I can go back and consider changing binlog_checksum to CRC32.


Using Explode and Lateral View in Hive

At Flite we store a lot of JSON formatted log data in Amazon S3, and we use Amazon Elastic MapReduce to run Hive queries on that log data using Amazon’s standard JSON SerDe. Sometimes we store embedded JSON objects, which can be extracted pretty easily using the get_json_object function.
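
For example, pulling a scalar field out of an embedded object is a one-liner (the JSON path here is hypothetical):

select get_json_object(embedded_data, '$.campaign.id')
from ad_impression
where dt = '2014-04-15-12';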

We also store embedded JSON arrays. Recently I wanted to do a rollup query in Hive using the exploded results of an embedded array, combining the explode UDTF with the LATERAL VIEW syntax. Converting the JSON array (which is stored as a string) to a Hive array was a bit messy, but I got it to work. Below I will show the query and break down the steps used to create it.

The query

Here’s the final query to count impressions by placement, id, and version for a given partition:

select d.placement_id, d.id, d.version, count(1) as impressions
from
(
  select c.placement_id,
    get_json_object(c.info_object,'$.id') as id,
    get_json_object(c.info_object,'$.v') as version
  from
  (
    select b.placement_id, info_object
    from
    (
      select placement_id,
        split(regexp_replace(regexp_replace(a.info_array_string,'\\}\\,\\{','\\}\\;\\{'),'\\[|\\]',''),'\\;') as info_array
      from
      (
        select placement_id,
          get_json_object(embedded_data, '$.info') as info_array_string
        from ad_impression
        where dt = '2014-04-15-12'
      ) a
    ) b lateral view explode(b.info_array) info_array_exploded as info_object
  ) c
) d
group by d.placement_id, d.id, d.version;
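
To see what the inner string manipulation does before the explode, here it is in isolation with a made-up two-element array:

select split(
  regexp_replace(
    regexp_replace('[{"id":"a","v":"1"},{"id":"b","v":"2"}]',
      '\\}\\,\\{','\\}\\;\\{'),  -- turn },{ into };{
    '\\[|\\]',''),               -- strip the outer [ and ]
  '\\;');                        -- split on ; into a Hive array
-- returns an array of two JSON object strings, ready for explode()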

Read on for a detailed explanation of how the query is built.


Recovering MySQL Data From the BLACKHOLE Storage Engine

I used to think it was impossible to recover data from the MySQL BLACKHOLE storage engine. As stated in the manual:

The BLACKHOLE storage engine acts as a “black hole” that accepts data but throws it away and does not store it. Retrievals always return an empty result

I’ve always trusted the manual on this, but a recently published article about information and black holes proposes that the information may be recoverable. As stated in the abstract:

…probability conservation is guaranteed because black holes stimulate the emission of radiation in response to infalling matter. This stimulated radiation is non-thermal and contains all of the information about the infalling matter…

The nearest black hole is about 1600 light years away from earth, so I decided to apply this research closer to home by creating a new storage engine to capture the data that I insert into my BLACKHOLE table using stimulated radiation. I should be ready to open source this project soon, but for now here’s a quick demonstration.

First, here’s the classical example of data loss with the BLACKHOLE storage engine:

mysql> CREATE TABLE test(i INT, c CHAR(10)) ENGINE = BLACKHOLE;
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO test VALUES(1,'record one'),(2,'record two');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM test;
Empty set (0.00 sec)

Now I will use my new storage engine (I named it “Adami” after the author of the research paper) to recover the data:

mysql> CREATE TABLE test_recovery as select * from test;
Query OK, 0 rows affected (0.18 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE test_recovery ENGINE = Adami;
Query OK, 2 rows affected (0.03 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM test_recovery;
+------+------------+
| i    | c          |
+------+------------+
|    1 | record one |
|    2 | record two |
+------+------------+
2 rows in set (0.00 sec)

I recovered the data! If you’ve ever lost important data in the BLACKHOLE storage engine, this could be the solution you’ve been looking for.
