MySQL Binary Log Parsing With Pt-query-digest
I frequently use pt-query-digest to parse the MySQL slow query log. This morning I wanted to parse the MySQL binary log to find out how frequently various tables in my database are updated. A quick check of the pt-query-digest documentation confirmed that pt-query-digest does support parsing the binary log:
type: Array; default: slowlog
The type of input to parse. The permitted types are
Parse a binary log file.
However, the docs don’t have any further details or examples. Given the lack of documentation my first instinct was to run pt-query-digest on my binary log using defaults for all of the other settings, like this:
That won’t work. Why not? Apparently when the documentaiton says “Parse a binary log file” it doesn’t mean a MySQL binary log in binary format. It means a binary log in text format. So first I need to convert my binary log to text using
mysqlbinlog like this:
Then I can parse it using pt-query-digest. The defaults work pretty well:
That gives me the top 95% of queries based on total response time, grouped by their fingerprint, over the entire file. So that gives me a pretty good idea of which tables are being updated the most frequently.
I can also look at a specific time window. Say I only want to look at one hour of data:
1 2 3 4
In my specific case I want to see all of the queries, not just the top 95%. I want to sort them by occurences instead of response time. I only care about the profile, not the other sections. I also like to explicitly name some of the defaults for the sake of readability. This is the final command I came up with:
1 2 3 4 5 6 7
I hope you find these examples useful. There are lots of other options in the docs, so you should experiment to see what works best for your use case.