<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[Mechanics of Flite]]></title>
  <link href="http://mechanics.flite.com/atom.xml" rel="self"/>
  <link href="http://mechanics.flite.com/"/>
  <updated>2013-05-21T19:13:22-04:00</updated>
  <id>http://mechanics.flite.com/</id>
  <author>
    <name><![CDATA[Flite Inc.]]></name>
    
  </author>
  <generator uri="http://mechanics.flite.com">Flite</generator>

  
  <entry>
    <title type="html"><![CDATA[Implementing asynchronous cascade delete in MySQL]]></title>
    <link href="http://mechanics.flite.com/blog/2013/05/21/implementing-asynchronous-cascade-delete-in-mysql/"/>
    <updated>2013-05-21T19:12:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/05/21/implementing-asynchronous-cascade-delete-in-mysql</id>
    <content type="html"><![CDATA[<p>A while back one of my foreign keys started causing trouble. The problem was that some parent rows had tens of thousands of child rows, and the foreign key was defined with <code>ON DELETE CASCADE</code>. When we deleted one of those parent rows on a master database, the delete took several seconds to execute because of the cascade. This led to latency for the end user, and also led to replication delays.</p>

<p>The immediate solution was to make the application tolerant of orphaned rows in the child table and to drop the explicit foreign key constraint.</p>

<p>I didn&#8217;t really want to leave those orphaned rows hanging around in the child table, so I decided to implement an asynchronous process to delete the orphaned rows on a scheduled basis. Read on for a description of that process.</p>

<!-- more -->


<p>Using the <a href="http://dev.mysql.com/doc/sakila/en/sakila-installation.html">sakila database</a> as an example, imagine I drop the foreign key between <code>film_category</code> and <code>category</code>, like so:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>alter table sakila.film_category 
</span><span class='line'>  drop foreign key fk_film_category_category;</span></code></pre></td></tr></table></div></figure>


<p>Without the foreign key in place, deletes on the <code>category</code> table lead to orphaned rows in <code>film_category</code>. For example, I will delete the &#8220;New&#8221; category:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; delete from sakila.category
</span><span class='line'>    -&gt; where name = 'New';
</span><span class='line'>Query OK, 1 row affected (0.01 sec)</span></code></pre></td></tr></table></div></figure>


<p>There are several ways to count the orphaned rows. Here are two different naive implementations using <code>OUTER JOIN</code> or <code>NOT EXISTS</code>:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select count(*)
</span><span class='line'>    -&gt; from sakila.film_category c 
</span><span class='line'>    -&gt; left outer join sakila.category p on p.category_id = c.category_id 
</span><span class='line'>    -&gt; where p.category_id is null;
</span><span class='line'>+----------+
</span><span class='line'>| count(*) |
</span><span class='line'>+----------+
</span><span class='line'>|       63 |
</span><span class='line'>+----------+
</span><span class='line'>1 row in set (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; select count(*)
</span><span class='line'>    -&gt; from sakila.film_category c 
</span><span class='line'>    -&gt; where not exists 
</span><span class='line'>    -&gt; (
</span><span class='line'>    -&gt;   select NULL from sakila.category p where p.category_id = c.category_id
</span><span class='line'>    -&gt; );
</span><span class='line'>
</span><span class='line'>+----------+
</span><span class='line'>| count(*) |
</span><span class='line'>+----------+
</span><span class='line'>|       63 |
</span><span class='line'>+----------+
</span><span class='line'>1 row in set (0.01 sec)</span></code></pre></td></tr></table></div></figure>


<p>I can also delete the orphaned rows using the same query approaches. I&#8217;ll roll the first delete back so I can demonstrate the second query in the same session:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; set autocommit = 0;
</span><span class='line'>Query OK, 0 rows affected (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; delete c.*
</span><span class='line'>    -&gt; from sakila.film_category c 
</span><span class='line'>    -&gt; left outer join sakila.category p on p.category_id = c.category_id 
</span><span class='line'>    -&gt; where p.category_id is null;
</span><span class='line'>Query OK, 63 rows affected (0.02 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; rollback;
</span><span class='line'>Query OK, 0 rows affected (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; delete c.*
</span><span class='line'>    -&gt; from sakila.film_category c 
</span><span class='line'>    -&gt; where not exists (select NULL from sakila.category p where p.category_id = c.category_id);
</span><span class='line'>Query OK, 63 rows affected (0.01 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; commit;
</span><span class='line'>Query OK, 0 rows affected (0.00 sec)</span></code></pre></td></tr></table></div></figure>


<p>Both queries have to examine every row in the child table to find the orphans, so this approach will be very slow for tables containing millions of rows, and in my real-world case I didn&#8217;t use it. Instead I decided it would be a lot easier and faster to delete the orphaned rows if I knew who their parent was. To this end I created a new table to track the deleted parent rows, and populated it using a trigger. Continuing the example in the sakila database:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>CREATE TABLE sakila.category_deleted (
</span><span class='line'>  category_id tinyint(3) unsigned NOT NULL,
</span><span class='line'>  name varchar(25) NOT NULL,
</span><span class='line'>  last_update timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
</span><span class='line'>  delete_time timestamp NOT NULL,
</span><span class='line'>  PRIMARY KEY (category_id)
</span><span class='line'>) ENGINE=InnoDB DEFAULT CHARSET=utf8;
</span><span class='line'>
</span><span class='line'>DELIMITER $$
</span><span class='line'>
</span><span class='line'>DROP TRIGGER IF EXISTS sakila.TR_A_DEL_CATEGORY $$
</span><span class='line'>
</span><span class='line'>CREATE TRIGGER sakila.TR_A_DEL_CATEGORY AFTER DELETE ON sakila.category FOR EACH ROW BEGIN
</span><span class='line'>  
</span><span class='line'>  INSERT IGNORE INTO sakila.category_deleted (category_id, name, last_update, delete_time)
</span><span class='line'>  VALUES (old.category_id, old.name, old.last_update,now());
</span><span class='line'>
</span><span class='line'>END $$
</span><span class='line'>
</span><span class='line'>DELIMITER ;</span></code></pre></td></tr></table></div></figure>


<p>Now I can delete another category to test the trigger:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; -- delete a single category
</span><span class='line'>mysql&gt; delete from sakila.category
</span><span class='line'>    -&gt;     where name = 'Classics';
</span><span class='line'>Query OK, 1 row affected (0.01 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; -- verify the trigger worked
</span><span class='line'>mysql&gt; select * from sakila.category_deleted;
</span><span class='line'>+-------------+----------+---------------------+---------------------+
</span><span class='line'>| category_id | name     | last_update         | delete_time         |
</span><span class='line'>+-------------+----------+---------------------+---------------------+
</span><span class='line'>|           4 | Classics | 2006-02-15 04:46:27 | 2013-05-21 18:21:53 |
</span><span class='line'>+-------------+----------+---------------------+---------------------+
</span><span class='line'>1 row in set (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; -- count the orphaned rows using the _deleted table
</span><span class='line'>mysql&gt; select count(*)
</span><span class='line'>    -&gt; from sakila.film_category c 
</span><span class='line'>    -&gt;     inner join sakila.category_deleted p on p.category_id = c.category_id;   
</span><span class='line'>+----------+
</span><span class='line'>| count(*) |
</span><span class='line'>+----------+
</span><span class='line'>|       57 |
</span><span class='line'>+----------+
</span><span class='line'>1 row in set (0.00 sec)</span></code></pre></td></tr></table></div></figure>


<p>I also wanted to execute the deletes on the child table in chunks, so I implemented a stored procedure to delete the orphaned rows by iterating through the rows in the <code>_deleted</code> table, deleting the child rows, and then deleting from the <code>_deleted</code> table. If I had to implement it again, I would probably use <a href="http://code.google.com/p/common-schema/">common_schema</a> to chunk the deletes so I wouldn&#8217;t need the stored procedure.</p>
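<p>For reference, a stored procedure along those lines might look something like the following. This is a minimal sketch, not the original procedure: the name <code>purge_deleted_categories</code>, the <code>p_chunk_size</code> parameter and the local variable names are made up for illustration. It assumes MySQL 5.5 or later (where <code>LIMIT</code> accepts a routine parameter), a positive chunk size, and autocommit enabled so that each chunk commits on its own.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>DELIMITER $$
</span><span class='line'>
</span><span class='line'>DROP PROCEDURE IF EXISTS sakila.purge_deleted_categories $$
</span><span class='line'>
</span><span class='line'>-- hypothetical procedure; p_chunk_size is assumed to be a positive integer
</span><span class='line'>CREATE PROCEDURE sakila.purge_deleted_categories(IN p_chunk_size INT)
</span><span class='line'>BEGIN
</span><span class='line'>  DECLARE v_category_id TINYINT UNSIGNED;
</span><span class='line'>  DECLARE v_rows INT;
</span><span class='line'>
</span><span class='line'>  parent_loop: LOOP
</span><span class='line'>    -- pick the next deleted parent; MIN() yields NULL when the table is empty
</span><span class='line'>    SELECT MIN(category_id) INTO v_category_id FROM sakila.category_deleted;
</span><span class='line'>    IF v_category_id IS NULL THEN
</span><span class='line'>      LEAVE parent_loop;
</span><span class='line'>    END IF;
</span><span class='line'>
</span><span class='line'>    -- delete that parent's child rows one chunk at a time
</span><span class='line'>    REPEAT
</span><span class='line'>      DELETE FROM sakila.film_category
</span><span class='line'>      WHERE category_id = v_category_id
</span><span class='line'>      ORDER BY film_id
</span><span class='line'>      LIMIT p_chunk_size;
</span><span class='line'>      SET v_rows = ROW_COUNT();
</span><span class='line'>    UNTIL v_rows = 0 END REPEAT;
</span><span class='line'>
</span><span class='line'>    -- no children are left, so stop tracking this parent
</span><span class='line'>    DELETE FROM sakila.category_deleted WHERE category_id = v_category_id;
</span><span class='line'>  END LOOP;
</span><span class='line'>END $$
</span><span class='line'>
</span><span class='line'>DELIMITER ;
</span><span class='line'>
</span><span class='line'>-- example call: delete orphans 1000 rows at a time
</span><span class='line'>call sakila.purge_deleted_categories(1000);</span></code></pre></td></tr></table></div></figure>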

<p>Here&#8217;s an implementation using common_schema:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>-- first delete the orphaned rows from the child table
</span><span class='line'>set @script := "
</span><span class='line'>  split(sakila.film_category: delete sakila.film_category.*
</span><span class='line'>    from sakila.film_category 
</span><span class='line'>    inner join sakila.category_deleted on sakila.category_deleted.category_id = sakila.film_category.category_id 
</span><span class='line'>  ) 
</span><span class='line'>  SELECT $split_total_rowcount AS 'rows deleted so far';
</span><span class='line'>";
</span><span class='line'>call common_schema.run(@script);
</span><span class='line'>
</span><span class='line'>-- then delete the rows from the _deleted table (assuming they have no children)
</span><span class='line'>delete p.*
</span><span class='line'>from sakila.category_deleted p 
</span><span class='line'>left outer join sakila.film_category c on c.category_id = p.category_id 
</span><span class='line'>where c.category_id is null;</span></code></pre></td></tr></table></div></figure>
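
<p>To run the cleanup on a schedule, one option is the MySQL event scheduler. The following is only a sketch: the event name and the hourly interval are placeholders, it assumes the event scheduler is enabled (<code>event_scheduler = ON</code>), and for brevity it deletes the child rows in a single statement instead of in chunks. A cron job that runs the statements above would work just as well.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>DELIMITER $$
</span><span class='line'>
</span><span class='line'>DROP EVENT IF EXISTS sakila.ev_purge_orphaned_film_category $$
</span><span class='line'>
</span><span class='line'>-- hypothetical event name and interval
</span><span class='line'>CREATE EVENT sakila.ev_purge_orphaned_film_category
</span><span class='line'>ON SCHEDULE EVERY 1 HOUR
</span><span class='line'>DO BEGIN
</span><span class='line'>  -- delete the orphaned child rows
</span><span class='line'>  delete c.*
</span><span class='line'>  from sakila.film_category c
</span><span class='line'>  inner join sakila.category_deleted p on p.category_id = c.category_id;
</span><span class='line'>
</span><span class='line'>  -- then delete the tracking rows that no longer have children
</span><span class='line'>  delete p.*
</span><span class='line'>  from sakila.category_deleted p
</span><span class='line'>  left outer join sakila.film_category c on c.category_id = p.category_id
</span><span class='line'>  where c.category_id is null;
</span><span class='line'>END $$
</span><span class='line'>
</span><span class='line'>DELIMITER ;</span></code></pre></td></tr></table></div></figure>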

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Using local public data sets]]></title>
    <link href="http://mechanics.flite.com/blog/2013/05/17/using-local-public-data-sets/"/>
    <updated>2013-05-17T17:35:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/05/17/using-local-public-data-sets</id>
    <content type="html"><![CDATA[<p>After learning the secret ingredient for Flite’s Q2 hackday was big data, we spent a lot of time looking at the many public data sets available, and deciding what to incorporate in our project. One notable departure from the typical hackday format for Q2 was that we were not required to develop something specifically for the Flite platform.</p>

<p>This interested us, and looking at the plethora of public data sets available on the web, we kept coming back to the sets that hit closest to home: San Francisco City data.</p>

<p>The San Francisco <a href="https://data.sfgov.org/">Open Data Portal</a> provides a wide range of data sets containing information related to campaign finance contributions, public transportation (MUNI), and even listings of locations for movies shot in the city.</p>

<p>Being fairly typical modern city dwellers, one thing we love about San Francisco is its abundance of tasty food and drink. Given this preoccupation with eats and drinks, one of the most interesting data sets available to us was the Health Department’s repository of <a href="https://data.sfgov.org/Public-Health/Restaurant-Scores/stya-26eb">Restaurant Health Inspection Scores</a>.</p>

<p>We set out to build a visualization of these scores by plotting this data in an application using Google Maps.</p>

<p><img src="http://mechanics.flite.com/images/hackday_asparagus/sf_restaurant_score_heat_map.png" title="SF Restaurant Health Inspection Score Heat Map" alt="SF Restaurant Health Inspection Score Heat Map" /></p>

<p>Red sections indicate lower (high risk) health scores. Blue sections represent higher (low risk) health scores.</p>

<p>Our hope was to provide a heat map of the city indicating areas that have more dubious or more positive health scores. Ultimately the visualization provides more of a heat map of restaurants inspected over a period of time. Looking at the results, we came to the conclusion that health score filtering options would be necessary to accomplish what we were after.</p>

<p>For a fully functional visualization of health scores on a map, visit <a href="http://sfscores.com/">sfscores.com</a>.</p>

<p>Bon Appétit!</p>

<p>Team Asparagus
(Steve Rowe, Saami Siddiqui, Omar Megdadi, Eugene Feingold, John Skinner)</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Interaction path analysis in an ad]]></title>
    <link href="http://mechanics.flite.com/blog/2013/05/16/interaction-path-analysis-in-an-ad/"/>
    <updated>2013-05-16T16:35:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/05/16/interaction-path-analysis-in-an-ad</id>
    <content type="html"><![CDATA[<p>During <a href="http://www.flite.com/">Flite’s</a> Q2 hackday, the secret ingredient was &#8220;Big data sets&#8221; and every group could work on any hack that deals with the secret ingredient.
During the brainstorming process, our group &#8220;Beets&#8221; thought of utilizing the large amount of data generated from our ad metrics system when a user interacts with an ad.</p>

<p>When a user interacts with a Flite ad, we collect all actions made by the user within the ad. The common interactions we collect include clicks, scrolls, hovers, clickthroughs and many more. Each interaction contains a series of information, including the sequence number, the spot where the interaction happened (x,y position relative to the ad), component name, etc. With this information we can reconstruct the whole user session.</p>

<p>As we have this huge amount of interaction data, we thought of building a feature which shows the most common user path for a specific ad. This can help data analysts see how our current users are interacting with the ad, when they are leaving the ad, and find patterns to improve the interaction rate.</p>

<p>For any given ad:</p>

<ol>
<li>Grab all interaction data, grouped by session.</li>
<li>For each session, remove interactions that are not important like hovers, rollovers, rollouts etc., and generate new sequence numbers.</li>
<li>Group the interactions from all sessions by sequence number and component name.</li>
</ol>
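
<p>The three steps above could be expressed in SQL roughly as follows. This is an illustrative sketch, not the code we ran at hackday: the <code>interactions</code> table and its columns (<code>session_id</code>, <code>seq</code>, <code>component_name</code>, <code>interaction_type</code>) are hypothetical, as is the list of ignored interaction types, and the table is assumed to hold the interactions for a single ad.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>-- hypothetical schema: interactions(session_id, seq, component_name, interaction_type)
</span><span class='line'>select f.new_seq, f.component_name, count(*) as sessions
</span><span class='line'>from (
</span><span class='line'>  -- steps 1 and 2: drop unimportant interactions and renumber the rest per session
</span><span class='line'>  select i.session_id, i.component_name,
</span><span class='line'>    (select count(*)
</span><span class='line'>     from interactions j
</span><span class='line'>     where j.session_id = i.session_id
</span><span class='line'>       and j.interaction_type not in ('hover', 'rollover', 'rollout')
</span><span class='line'>       and j.seq &lt;= i.seq) as new_seq
</span><span class='line'>  from interactions i
</span><span class='line'>  where i.interaction_type not in ('hover', 'rollover', 'rollout')
</span><span class='line'>) f
</span><span class='line'>-- step 3: group by position in the session and by component
</span><span class='line'>group by f.new_seq, f.component_name
</span><span class='line'>order by f.new_seq, sessions desc;</span></code></pre></td></tr></table></div></figure>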


<!-- more -->


<p>After that, we used the data to construct a weighted tree whose root node is always the number of impressions served so far. The child nodes of the root are the first interactions users performed on the ad. You can click on each node to find out what other interactions users performed after that interaction.</p>

<p><img src="http://mechanics.flite.com/images/hackday_beets/UserInteractionsFlow.png" title="User Interactions Flow" alt="User Interactions Flow" /></p>

<p>Session Ended events show how many users didn&#8217;t interact with the ad after the parent interaction.</p>

<p>To implement the UI, we used the flow plugin from an open source visualization tool - <a href="https://github.com/skydb/d3.flow.js">D3.flow.js</a>. On each node click we make an ajax call with the id to get the child nodes and render them. The time consuming part was grabbing session data for an ad and constructing a data model out of the huge data set within 24 hours.</p>

<p>Cheers</p>

<p>Team Beets - Sam Chan, Toshi, Jiangyue, Chakri</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Looking for State Level Trends]]></title>
    <link href="http://mechanics.flite.com/blog/2013/05/16/looking-for-state-level-trends/"/>
    <updated>2013-05-16T15:05:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/05/16/looking-for-state-level-trends</id>
    <content type="html"><![CDATA[<p>Have you ever wondered who clicks on digital ads? Is it people who live in the most tech savvy, modern cities? Is it someone with too much time on their hands in a remote corner of North Dakota? Do idle hands lead to higher interactions in ads? Team Garlic wanted to know and their Hack Day goal was to find out more about who clicks on <a href="http://www.flite.com">Flite</a> ads by using our own data combined with open data on <a href="http://census.gov">census.gov</a>.</p>

<p>The plan was to gather the smallest, but most relevant data set as quickly as possible, see what it told us, and show it off. Ike and Angel mined our own data in <a href="http://hive.apache.org/">Hive</a> for the top impressions, interactions, and browser by city and state. Meanwhile, Andy and Matt decided to see what kind of census data we could get from <a href="http://census.gov">census.gov</a> regarding population, income, gender, and employment rate. Once we had all of that info, Ike was going to merge the data in MySQL, Angel would then analyze it for trends, and Andy would build a page to display the data.</p>

<p>Before we went out and started gathering data, we decided to form a hypothesis about correlations between census data and Flite usage data. The best idea we could come up with based on state-level data was that a higher unemployment rate may correlate with a higher time on unit, engagement rate, and/or interaction rate. Read on to find out if our hypothesis was valid.</p>

<!-- more -->


<p>During the data gathering process we discovered a few interesting things and ran into a few hiccups. For example, the team discovered that West Virginia has the highest usage of Internet Explorer on Flite ads. The lowest: Utah. (Utah, you&#8217;re awesome. Thank you.)</p>

<p>We also discovered that the data on <a href="http://census.gov">census.gov</a> wasn&#8217;t always easy to procure the way we wanted it. On one hand, making an API call to the 2010 census data was straightforward and fast. Matt was able to get a unique key to make the calls within minutes and was running our first queries in less than 15 minutes. On the other hand, the API calls to the American Community Survey (ACS) <a href="http://www.census.gov/acs/www/guidance_for_data_users/guidance_main/">5 Year Data Set</a> were much more difficult and we never made a successful custom query. For the most part this was due to the lack of clear documentation on the <a href="http://census.gov">census.gov</a> site.</p>

<p>Since we needed some additional information and were unable to get it from ACS we went looking elsewhere. When in doubt, look it up on &#8220;The Google,&#8221; right? Andy found a couple of pre-populated pages on <a href="http://bls.gov">bls.gov</a> and <a href="http://census.gov">census.gov</a> for unemployment numbers and median state income. We weren&#8217;t able to easily export the data so Matt just hacked together his own CSV files and sent them to Angel.</p>

<p>We had a little more time and thought that gathering population information by city in the US would be fun to look at as well. Fortunately, Ike found this population data fairly quickly from <a href="http://biggestuscities.com">biggestuscities.com</a> which linked directly to the relevant <a href="http://www.census.gov/popest/cities/">data set</a> on the census site. Unfortunately the city/town names in that data set did not match up well with our city-level Flite usage data, and we didn&#8217;t have enough time to clean the data set before demos.</p>

<p>So what did we find? First, although humorous, we tossed out the browser data because we didn&#8217;t have time to integrate it. Matt really dislikes IE so it was best not to go down that rat hole.</p>

<p>Looking at the maps comparing Flite usage data to population data, the first thing we calculated was daily Flite ad impressions per 10,000 residents. That map didn&#8217;t look very interesting because we didn&#8217;t see much variation between the states. When we looked more closely at the data set we found that Washington, DC (which we did not highlight on the map) had by far the highest per capita impression rate, about 300% of the second place state. Ike&#8217;s theory is that more people work in DC than actually live there. We found a <a href="http://www.welovedc.com/2010/02/16/dc-mythbusting-daytime-population/">blog post</a> that supported our theory about DC&#8217;s high employment/residence ratio, higher than any other US city. And unlike other US cities where the effect is diluted over an entire state&#8217;s population, Washington, DC is both a city and a state for our purposes.</p>
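
<p>Once the usage and census data are merged in MySQL, the per capita figure is a simple join. The query below is a sketch under assumptions: the table names <code>flite_usage_by_state</code> and <code>census_by_state</code> and their columns are hypothetical stand-ins, not our actual schema.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>-- hypothetical tables: flite_usage_by_state(state, daily_impressions)
</span><span class='line'>--                      census_by_state(state, population)
</span><span class='line'>select u.state,
</span><span class='line'>       10000 * u.daily_impressions / c.population as impressions_per_10k
</span><span class='line'>from flite_usage_by_state u
</span><span class='line'>inner join census_by_state c on c.state = u.state
</span><span class='line'>order by impressions_per_10k desc;</span></code></pre></td></tr></table></div></figure>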

<p>Keeping it political, Ohio was a true battleground state in the 2012 election, but it&#8217;s the clear winner when it comes to Clickthrough Rate.
<img src="http://mechanics.flite.com/images/hackday_garlic/ctr.jpg" title="Clickthrough Rate by state" alt="Clickthrough Rate by state" />
We reached out to our in-office Ohio expert, John Skinner, but received no comment.</p>

<p>Engagement Rate and Interaction Rate by state also had some interesting variations, but we couldn&#8217;t come up with any useful theories about the cause:
<img src="http://mechanics.flite.com/images/hackday_garlic/er.jpg" title="Engagement Rate by state" alt="Engagement Rate by state" />
<img src="http://mechanics.flite.com/images/hackday_garlic/ir.jpg" title="Interaction Rate by state" alt="Interaction Rate by state" /></p>

<p>Finally, the most interesting part of this project: Angel discovered that the higher a state&#8217;s median income, the lower its ad interaction rate tended to be. We compared all of the different factors, and found two strong correlations.</p>

<p>The first was that interaction rate and median income were negatively correlated:</p>

<p><img src="http://mechanics.flite.com/images/hackday_garlic/plot_income_interaction_correlation.jpg" title="Interaction rate and median income" alt="Interaction rate and median income" /></p>

<p>Secondly, following up on our initial hypothesis we found a positive correlation between engagement rate and unemployment rate:</p>

<p><img src="http://mechanics.flite.com/images/hackday_garlic/plot_unemployment_engagement.jpg" title="Engagement rate and unemployment rate" alt="Engagement rate and unemployment rate" /></p>

<p>This data supported our initial hypothesis to some extent. Of course there are many caveats working with state level data. For example, many ad campaigns are regional, so some of the data trends may be based more on the ads themselves rather than the people interacting with them. Also, the &#8220;unemployment rate&#8221; we used was not the true unemployment rate, but rather we divided the census figure for unemployed persons by the census population figure. By the time we realized that error it was too late to re-run the data with official unemployment rate figures. Nonetheless, it was nice to find some relationship within our data set!</p>
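
<p>The difference between the figure we used and the official rate comes down to the denominator: the official unemployment rate divides unemployed persons by the civilian labor force, not by total population. A sketch in SQL, assuming a hypothetical <code>census_by_state</code> table that has all three columns:</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>-- hypothetical table: census_by_state(state, unemployed, population, labor_force)
</span><span class='line'>select state,
</span><span class='line'>       unemployed / population  as unemployed_share_of_population, -- what we used
</span><span class='line'>       unemployed / labor_force as unemployment_rate               -- the official definition
</span><span class='line'>from census_by_state
</span><span class='line'>order by unemployment_rate desc;</span></code></pre></td></tr></table></div></figure>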
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Integration of Public-Use Metadata and Flickr-hosted Image Data]]></title>
    <link href="http://mechanics.flite.com/blog/2013/05/14/integration-of-public-use-metadata-and-flickr-hosted-image-data/"/>
    <updated>2013-05-14T18:48:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/05/14/integration-of-public-use-metadata-and-flickr-hosted-image-data</id>
    <content type="html"><![CDATA[<p>Hackday here at <a href="http://www.flite.com">Flite</a> is an integral component of our company culture. As Flite employees we pride ourselves on being agile, hard-working, and results-oriented.</p>

<p>Keeping this in mind, Hackday 2013 was kicked off with an <strong>Iron Chef</strong> theme, culinary-based team names (go Trout!!), and a Hacker draft.</p>

<p>The “Chairman” unveiled the special ingredient for this Hackday challenge: Big Data.  Additionally, data resources were distributed to all teams and a firm deadline of 4PM was established.</p>

<p>Following a quick Trout brainstorm session, we were able to establish certain goals for our Hackday. Team goals were to track interactions as a means of gaining insight to platform usage as well as to utilize both Flickr’s large <a href="http://www.flickr.com/services/feeds/docs/photos_public/">photo repository</a> and public-use metadata.</p>

<!-- more -->


<p>With this challenge rooted in the use of data, the initial steps were deciphering where to source data from and how to mesh the information into a useful format.</p>

<p>Interaction tracking data was sourced from within the platform. The random sample consisted of interactions aggregated across both weekends and weekdays (20,000 sessions per day) over a month-long interval. After statistical analysis and visual inspection, the expected results were not realized. We initially expected a large difference between weekday and weekend interactions to present itself in the data. Our assumptions were that employees at work during the week would fuel interactions, and that weekends would be spent enjoying leisure activities, meaning more time away from the “point” of interaction. However, the data tells a different story. The graphics show similar interaction data and usage habits across both weekends and weekdays, with a horizontal shift (2 hours to the right) on weekends. This horizontal shift could reflect the later start to people’s days during the weekend as well as their tendency to stay up later on weekend nights into the AM, fueling the higher weekend interactions after midnight when compared to weekdays.</p>

<p>Additionally, the confidence interval is much tighter for weekday interactions when compared to the wider CIs associated with weekend interactions. This can be attributed to the predictability of one’s workday hours and interactions when compared to the less predictable, higher-variance habits of less regimented weekend days.</p>

<p><img src="http://mechanics.flite.com/images/hackday_trout/interactions_plot.svg" title="Session length based on hour of day and weekday vs. weekend" alt="Session length based on hour of day and weekday vs. weekend" /></p>

<h2>UFO Data &amp; Flickr Photos</h2>

<p>Within the group there was an affinity to make the most of the <a href="http://www.infochimps.com/datasets/60000-documented-ufo-sightings-with-text-descriptions-and-metada">UFO data</a> available from the National UFO Reporting Center. The UFO data was refined and filtered, leaving the desired cell-based variables including:</p>

<ul>
<li>City/State</li>
<li>color associated with UFO sighting</li>
<li>gender of claimant</li>
<li>shape of UFO</li>
<li>use of term “alien” in the sighting description</li>
</ul>


<p>Team focus shifted towards the use of this observational UFO data as a means of forming queries within the Flickr platform. The Flickr platform is rich with tags, locations, and personal descriptions that should match well with our workable dataset. Matching text data with image tag search results from Flickr should yield photos of these UFOs and in turn lead to visual verification of these sightings.</p>

<p>Initial obstacles included data formatting issues (text versus numerical). There were also problems with Flickr search producing different responses for manual versus automated queries. Strangely, manual searches yielded more results than automated ones, which makes our service less effective and undermines its usefulness. Limited auto query results were due to geographic tagging variation between Flickr (numerical) and UFO (text) data as well as the ability to include all search terms jointly using an “or” syntax.</p>

<p>Less than stellar query results continued, so as part of the iterative process we loosened our search parameters and tags. Exclusion of gender and color associated with UFO sightings seemed to increase query results and increase the effectiveness of our search. However, with this minimization of specified variables there is a decrease in test strength and results significance.</p>

<p>Following the discard of several search variables and adjustments of geographic tags into numerical data we were able to produce query results that exceeded the threshold we established as our benchmark. With the desired results in our query the JPEG photos were “in-filed” to JSON and uploaded to a user interface for the 4 PM demo.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[How do External Factors Affect Ad Performance]]></title>
    <link href="http://mechanics.flite.com/blog/2013/05/13/how-do-external-factors-affect-ad-performance/"/>
    <updated>2013-05-13T12:00:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/05/13/how-do-external-factors-affect-ad-performance</id>
    <content type="html"><![CDATA[<p><em>Editor&#8217;s Note: This is the first in a series of posts about Flite&#8217;s most recent hackday. The hackday theme was &#8220;Big Open Data Sets&#8221;, and several different teams had one day to create a demo incorporating that theme.</em></p>

<p>At Flite, we have no idea what type of content is displayed alongside our ads, which run millions of impressions all over the web. The engagement and interaction rates are mostly based on the creative and messaging. But this may not always be the complete picture. The context of the real world in which the ad runs also may have a great amount of influence on these rates.</p>

<p>For our hackday project, we wanted to pull in data sets from many sources and measure the numbers against the engagement and interaction rates to see if there are any correlations within our <a href="http://www.flite.com/platform/">Report Studio</a>.</p>

<!-- more -->


<p>We started the process by looking for appropriate data sets that would be applicable to our project.
Some of our challenges included the following:</p>

<ul>
<li>Free or public data is hard to find, especially daily or historical data</li>
<li>Data in various formats made it hard to parse</li>
<li>Google APIs are inaccessible, or not available/rate-limited</li>
</ul>


<p>The stock index (S&amp;P 500) ended up providing us with the cleanest data set. We added a UI in our Report Studio that would enable users to input a stock symbol and see whether we could correlate that data to our metrics on the ad impression.</p>

<p><img src="http://mechanics.flite.com/images/hackday_scallop/hackday-stockmarket.png" title="Interaction Rate vs. S&amp;P 500" alt="Interaction Rate vs. S&amp;P 500" /></p>

<p><img src="http://mechanics.flite.com/images/hackday_scallop/hackday-stockandhashtags.png" alt="Interaction Rate vs. S&amp;P 500 &amp; social trends" /></p>
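<p>One simple way to check whether two daily series move together is the Pearson correlation coefficient. The sketch below is an assumption about how such a check could be done, not the code we shipped in Report Studio, and the numbers in it are invented.</p>

```javascript
// Pearson correlation coefficient between two equally long numeric series.
// Returns a value between -1 and 1, or NaN if either series is constant.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("series must have the same length (at least 2)");
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, v) => sum + v, 0) / n;
  const meanY = ys.reduce((sum, v) => sum + v, 0) / n;
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  if (varianceX === 0 || varianceY === 0) {
    return NaN;
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}

// Invented daily values: index closes and interaction rates (percent).
const indexCloses = [1614, 1626, 1633, 1625, 1634, 1650];
const interactionRates = [2.1, 2.3, 2.2, 2.0, 2.4, 2.5];
console.log("correlation:", pearson(indexCloses, interactionRates));
```

<p>A coefficient near 0 suggests no linear relationship. Even a high value only shows that the two series move together, not that one causes the other.</p>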

<p>In the Report Studio, we use a JavaScript library called <a href="http://www.highcharts.com">Highcharts</a> to give our customers a visual representation of the data. During the hackday, we found out that we were on an older version of Highcharts, and upgrading was not straightforward. Part of the project was to dynamically add new series to the chart, each with its own y-axis and scale. This feature was only available in version 3.0 of Highcharts, and we had 2.4. After upgrading, there were some small compatibility quirks with our code.</p>
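<p>As a rough sketch of the dynamic-axis feature described above, Highcharts 3.0 exposes <code>chart.addAxis()</code> and <code>chart.addSeries()</code>. The helper name, axis id, and data below are made up for illustration, and the stub chart only stands in for a real Highcharts chart so the snippet runs outside a browser.</p>

```javascript
// Adds a new series to an existing chart on its own y-axis.
// `chart` is assumed to be a Highcharts 3.0+ chart object; the axis id,
// series name, and data passed in are illustrative.
function addSeriesWithOwnAxis(chart, axisId, name, data) {
  // Create the extra y-axis first; `false` means it is not an x-axis.
  chart.addAxis({ id: axisId, title: { text: name }, opposite: true }, false);
  // Attach the new series to that axis by referencing its id.
  chart.addSeries({ name: name, yAxis: axisId, data: data });
}

// Stand-in for a real chart that simply records the calls it receives.
const recordedCalls = [];
const stubChart = {
  addAxis: function (options, isX) {
    recordedCalls.push({ method: "addAxis", options: options, isX: isX });
  },
  addSeries: function (options) {
    recordedCalls.push({ method: "addSeries", options: options });
  }
};

addSeriesWithOwnAxis(stubChart, "stock-axis", "Stock index", [1600, 1610, 1595]);
console.log(recordedCalls.map(function (call) { return call.method; }));
```

<p>Placing the new axis on the opposite side keeps its scale visually separate from the existing metric axis.</p>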

<p><img src="http://mechanics.flite.com/images/hackday_scallop/hackday-screentip.png" alt="Screentips providing context information" /></p>

<p>We also explored how we could correlate open data if we had more granular geolocation data available in the Studio. We collect geolocation data at the ad level and that can currently be found under the Measure tab in the Ad Studio; however, that data is not available in Report Studio (yet). If it were available, we could tell the user the geographic location where the most impressions were served and information about that location, like headlines, events or weather that happened there on that day.</p>

<p><img src="http://mechanics.flite.com/images/hackday_scallop/hackday-geolocation.png" alt="Interaction Rate displayed on Google Maps" /></p>

<p>Impression data could also be represented on a map using geolocation to see if local news events could possibly have a major influence on the numbers.</p>

<p>Other data sets we considered included:</p>

<ul>
    <li>Trends (Google, Twitter)</li>
    <li>News articles (page views for news articles)</li>
    <li>Weather (by region)</li>
    <li>Geolocation info, which could also correlate with the above</li>
    <li>Custom data to map against (like your own KPI or internal data sets)</li>
</ul>


<p>This was a very informative project, and you can expect more soon.</p>

<p>Cheers</p>

<p>Team Scallop: Grant Lee, Joel Antipuesto, Matt Thomas, Mekuria Getinet, Paul Krohn</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The downside of MySQL auto-reconnect]]></title>
    <link href="http://mechanics.flite.com/blog/2013/05/03/the-downside-of-mysql-auto-reconnect/"/>
    <updated>2013-05-03T17:30:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/05/03/the-downside-of-mysql-auto-reconnect</id>
    <content type="html"><![CDATA[<p>A few days ago I was doing some cleanup on a passive master database using the MySQL client. I didn&#8217;t want my commands to be replicated so I executed <code>set sql_log_bin=0</code> in my session.</p>

<p>One of my queries dropped an unused schema that I knew was corrupt, so I wasn&#8217;t too surprised when the <code>drop database</code> command crashed the MySQL server. After the crash, the server came back up quickly, and my client automatically reconnected, so it was safe to keep running queries, right?</p>

<p>Wrong.</p>

<p>When the client reconnected I lost my session state, so <code>sql_log_bin</code> reverted to 1, and any commands I ran from that point forward would be replicated, which I did <em>not</em> want.</p>

<p>This behavior makes sense, and it&#8217;s documented in the <a href="http://dev.mysql.com/doc/refman/5.5/en/auto-reconnect.html">manual</a>:</p>

<blockquote><p>Automatic reconnection can be convenient because you need not implement your own reconnect code, but <strong>if a reconnection does occur, several aspects of the connection state are reset on the server side and your application will not know about it</strong>. The connection-related state is affected as follows:</p>

<ul>
<li>Any active transactions are rolled back and autocommit mode is reset.</li>
<li>All table locks are released.</li>
<li>All TEMPORARY tables are closed (and dropped).</li>
<li>Session variables are reinitialized to the values of the corresponding variables. This also affects variables that are set implicitly by statements such as SET NAMES.</li>
<li>User variable settings are lost.</li>
<li>Prepared statements are released.</li>
<li>HANDLER variables are closed.</li>
<li>The value of LAST_INSERT_ID() is reset to 0.</li>
<li>Locks acquired with GET_LOCK() are released.</li>
</ul>
</blockquote>

<p>But it&#8217;s easy to overlook such details when working with automatic features like MySQL client auto-reconnect. In this specific case I didn&#8217;t execute any other commands in the reconnected session, so I didn&#8217;t inadvertently replicate anything, but this incident served as a good reminder to be vigilant about my session state when using the MySQL client.</p>

<!-- more -->


<p>Here&#8217;s a snippet from my session to show the value of <code>sql_log_bin</code> before and after the crash:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
<span class='line-number'>36</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; set sql_log_bin = 0;
</span><span class='line'>Query OK, 0 rows affected (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; select @@sql_log_bin;
</span><span class='line'>+---------------+
</span><span class='line'>| @@sql_log_bin |
</span><span class='line'>+---------------+
</span><span class='line'>|             0 |
</span><span class='line'>+---------------+
</span><span class='line'>1 row in set (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; drop database test;
</span><span class='line'>Query OK, 1 row affected (0.19 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; select @@sql_log_bin;
</span><span class='line'>+---------------+
</span><span class='line'>| @@sql_log_bin |
</span><span class='line'>+---------------+
</span><span class='line'>|             0 |
</span><span class='line'>+---------------+
</span><span class='line'>1 row in set (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; drop database reports;
</span><span class='line'>ERROR 2013 (HY000): Lost connection to MySQL server during query
</span><span class='line'>mysql&gt; select @@sql_log_bin;
</span><span class='line'>ERROR 2006 (HY000): MySQL server has gone away
</span><span class='line'>No connection. Trying to reconnect...
</span><span class='line'>Connection id:    505
</span><span class='line'>Current database: *** NONE ***
</span><span class='line'>
</span><span class='line'>+---------------+
</span><span class='line'>| @@sql_log_bin |
</span><span class='line'>+---------------+
</span><span class='line'>|             1 |
</span><span class='line'>+---------------+
</span><span class='line'>1 row in set (0.00 sec)</span></code></pre></td></tr></table></div></figure>
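<p>For scripted cleanup, one defensive option is to re-check the session variable immediately before each statement. The sketch below assumes a hypothetical <code>query(sql)</code> function that returns a promise of result rows; it is not tied to any particular MySQL driver, and the function names are made up. In the interactive <code>mysql</code> client, starting it with <code>--skip-reconnect</code> avoids the silent reconnect altogether.</p>

```javascript
// Runs `statement` only if binary logging is disabled for this session.
// `query` is a hypothetical function: (sql) => Promise of an array of rows.
async function runIfBinlogDisabled(query, statement) {
  const rows = await query("SELECT @@session.sql_log_bin AS sql_log_bin");
  if (Number(rows[0].sql_log_bin) !== 0) {
    throw new Error("sql_log_bin is enabled; refusing to run: " + statement);
  }
  return query(statement);
}

// Fake `query` that simulates a session whose state was reset by a reconnect.
const fakeReconnectedSession = async function (sql) {
  if (sql.indexOf("@@session.sql_log_bin") !== -1) {
    return [{ sql_log_bin: 1 }];
  }
  return [];
};

runIfBinlogDisabled(fakeReconnectedSession, "drop database reports")
  .catch(function (err) { console.log(err.message); });
```

<p>This only narrows the window: a reconnect between the check and the statement would still slip through, so disabling auto-reconnect remains the safer choice.</p>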

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[JavaScript's document.domain and how to detect when it changes]]></title>
    <link href="http://mechanics.flite.com/blog/2013/04/29/javascripts-document-domain-property-and-how-to-detect-when-it-changes/"/>
    <updated>2013-04-29T12:00:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/04/29/javascripts-document-domain-property-and-how-to-detect-when-it-changes</id>
    <content type="html"><![CDATA[<p>Whether you are a JavaScript developer,  in Ad Ops, or in QA you&#8217;ve probably heard of this thing called <code>document.domain</code>. In JavaScript, the window&#8217;s <code>document</code> object has a property <code>domain</code> that when accessed typically returns the full host name of the site you&#8217;re running in. Go ahead, try it. Go to <a href="http://www.flite.com">http://www.flite.com</a> in your browser and in the developer tools, execute <code>document.domain</code>, and you&#8217;ll see that it returns <code>www.flite.com</code>. This property is used by the browser to determine if Site A and Site B can communicate with each other.</p>

<p>All browsers respect a security specification known as the <a href="http://en.wikipedia.org/wiki/Same_origin_policy">Same-origin Policy</a>, which means that sites and even iFrames can only communicate with each other if they share the same origin or host. The <code>document.domain</code> property must match exactly in order for two frames to communicate with each other directly. I say &#8220;directly&#8221; because methods like <code>window.postMessage</code> allow for <a href="http://en.wikipedia.org/wiki/Cross-document_messaging">Cross-document messaging</a>, which I&#8217;ll cover later in the post.</p>

<!-- more -->


<h2>What can you do with document.domain?</h2>

<p>Aside from reading the property, you can also <em>set</em> the property using JavaScript, but you may only set it to be less specific. For example, you can change <code>www.flite.com</code> to <code>flite.com</code>, but you cannot change <code>flite.com</code> to <code>www.flite.com</code>. There are some legitimate reasons why a site might make use of changing its <code>document.domain</code> property. One typical use case is for supporting a plugin across a site that has many subdomains. Here&#8217;s a fictional example: let&#8217;s use <strong>flite.com</strong>. We have several subdomains where different information is kept for different audiences. There&#8217;s the main marketing site <code>www.flite.com</code>, a site for developers at <code>developer.flite.com</code>, and a site to get help at <code>support.flite.com</code>. We also have a widget for serving videos that serves out of <code>video.flite.com</code> which happens to require access to the top frame in order to expand a sharing menu. In order for the video player to work on all three sites, we will either need to inline the video player, enable <a href="http://en.wikipedia.org/wiki/Cross-origin_resource_sharing">Cross-Origin Resource Sharing (CORS)</a> on our servers, or we could set the <code>document.domain</code> property in all four of our sites using <code>document.domain = "flite.com"</code>. Magically, everything works and we can move along with our lives. This method is all well and good if you are the one controlling each of the sites and can orchestrate this sort of change, but it can wreak havoc on ads and plugins running on your site if they aren&#8217;t prepared to handle it.</p>
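<p>The &#8220;less specific only&#8221; rule can be approximated with a small check. This is a simplified sketch with a made-up function name: real browsers also consult a public suffix list, so a value such as <code>co.uk</code> would pass this check but still be rejected by the browser.</p>

```javascript
// Returns true when `target` is `current` itself or a parent domain of it.
// Simplified: requires at least one dot in `target` to rule out bare
// top-level domains, but does not consult a public suffix list.
function isAllowedDomainChange(current, target) {
  if (current === target) {
    return true;
  }
  return target.indexOf(".") !== -1 && current.endsWith("." + target);
}

console.log(isAllowedDomainChange("www.flite.com", "flite.com"));
console.log(isAllowedDomainChange("flite.com", "www.flite.com"));
```

<p>The first call is the allowed direction (dropping a subdomain); the second is the direction the browser refuses.</p>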

<h2>So, what&#8217;s the problem?</h2>

<p>Flite Ads are 3rd party and we don&#8217;t have control over how sites are implemented. In some implementations, our ads are served in iFrames that share the same domain as the top page where we need to expand into. Under normal circumstances the <code>document.domain</code> property is not changing. In this ideal scenario, we continue to have the ability to expand onto the top page. Recently, we have been seeing a trend among sites where their JavaScript, or some plugin the site is loading, changes the <code>document.domain</code> property of the page. Sometimes it changes before the ad is loaded, sometimes shortly after the ad is loaded, or sometimes not at all. This makes it impossible to always change the <code>document.domain</code> property by default. Instead, our code running inside of the iFrame must try and detect when the parent&#8217;s <code>document.domain</code> property changes and update its own in lock-step so that during any future attempts to expand, our code will get the access it may need.</p>

<h2>Flite&#8217;s secret sauce&#8230;</h2>

<p>I mentioned 2 scenarios, the first being where <code>document.domain</code> is changed before our ad is loaded. To overcome this, we simply try to determine if we can access the top window&#8217;s document object and if not, we change our local <code>document.domain</code> property and try again.</p>

<h3>Shifting document.domain using trial and error:</h3>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>//Function to check if current window and the top window are Cross-Origin
</span><span class='line'>var isCrossOrigin = function() {
</span><span class='line'>    try{
</span><span class='line'>        //try to access the document object
</span><span class='line'>        if (top.document || top.document.domain){
</span><span class='line'>          //we have the same document.domain value!
</span><span class='line'>        }
</span><span class='line'>    }catch(e) {
</span><span class='line'>      //We don't have access, it's cross-origin!
</span><span class='line'>        return true;
</span><span class='line'>    }
</span><span class='line'>    return false;
</span><span class='line'>};
</span><span class='line'>
</span><span class='line'>//Function to shift off the first part of the host name
</span><span class='line'>var shiftDomain = function() {
</span><span class='line'>  var currentDomain = document.domain;
</span><span class='line'>  document.domain = currentDomain.substring(currentDomain.indexOf('.') + 1);
</span><span class='line'>};
</span><span class='line'>
</span><span class='line'>//Reduces 2 subdomains down to just the domain only if necessary 
</span><span class='line'>//(e.g. dev.support.flite.com to flite.com)
</span><span class='line'>if (isCrossOrigin()) {
</span><span class='line'>     shiftDomain();   
</span><span class='line'>     
</span><span class='line'>     if (isCrossOrigin()) {
</span><span class='line'>      shiftDomain();
</span><span class='line'>     }
</span><span class='line'>}</span></code></pre></td></tr></table></div></figure>


<p>The problem with this method is that while it will work and allow you to communicate between frames, all browsers will complain about the first attempt to access the document by reporting an error in the JavaScript console. Even though the code is catching the error, the browser will still report it. It won&#8217;t halt execution of subsequent code, but it will cause an error message stating: <code>Unsafe JavaScript attempt to access frame with URL 'blah' from frame with URL 'blah'. Domains, protocols and ports must match.</code></p>

<p>The second case is what we&#8217;ve been encountering more often: after our ad has loaded, the site loads a widget that requires changing the <code>document.domain</code> property. This bypasses our initial check, leaving us with a mismatched <code>document.domain</code> property after we&#8217;ve initialized. As soon as this property changes, direct communication between frames in either direction is cut off. As I mentioned earlier, we are able to overcome this. On initialization we create a polling loop using <code>setInterval</code> in the top window that waits until the <code>document.domain</code> property changes from its original value. Once the change is detected, it clears the interval and calls <code>window.postMessage</code> to send a message to our frame. We send along the new value in the message, and our frame is ready and waiting for it. Upon receiving the message, the frame changes its local <code>document.domain</code> property, which restores the ability to communicate with the parent frame and vice versa!</p>
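<p>A minimal sketch of that polling-and-messaging flow follows, using the standard <code>setInterval</code> timer. The function names and the message shape (<code>type</code>, <code>domain</code>) are assumptions for illustration, not Flite&#8217;s production code. The window and document objects are passed in as parameters so the logic can be exercised outside a browser.</p>

```javascript
// Runs in the top window. Polls `win.document.domain` and, once it changes,
// stops polling and notifies the ad frame through postMessage.
// `frameWindow` is assumed to be the ad iFrame's contentWindow.
function watchDomainChange(win, frameWindow, intervalMs) {
  const initialDomain = win.document.domain;
  const timer = win.setInterval(function () {
    const currentDomain = win.document.domain;
    if (currentDomain !== initialDomain) {
      win.clearInterval(timer);
      // "*" keeps the sketch short; real code should pass a specific origin.
      frameWindow.postMessage(
        JSON.stringify({ type: "domainChanged", domain: currentDomain }),
        "*"
      );
    }
  }, intervalMs);
  return timer;
}

// Runs inside the iFrame. Applies the new domain carried by the message.
// Returns true if the message was recognized and applied.
function handleDomainMessage(doc, event) {
  let message;
  try {
    message = JSON.parse(event.data);
  } catch (e) {
    return false; // Not one of our messages.
  }
  if (!message || message.type !== "domainChanged") {
    return false;
  }
  doc.domain = message.domain;
  return true;
}

// Browser wiring (hypothetical element lookup), shown as comments:
//   top window: watchDomainChange(window, adIframe.contentWindow, 250);
//   iFrame:     window.addEventListener("message", function (event) {
//                 handleDomainMessage(document, event);
//               });
```

<p>In real use the handler should also validate <code>event.origin</code> before trusting the message, since any frame on the page can call <code>postMessage</code>.</p>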

<h3>Tracking when document.domain changes gracefully:</h3>

<ul>
<li>JSFiddle showing how when <code>document.domain</code> changes, a child page can no longer access it. The top box is the current frame, the blue box is the child iFrame, both are loaded from the same subdomain:</li>
</ul>


<iframe width="100%" height="300" src="http://jsfiddle.net/97vS8/3/embedded/result,js,html,css" allowfullscreen="allowfullscreen" frameborder="0"></iframe>


<ul>
<li>The child frame used in the previous example. It displays the current frame&#8217;s and the parent frame&#8217;s <code>document.domain</code> values:</li>
</ul>


<iframe width="100%" height="400" src="http://jsfiddle.net/X8pne/11/embedded/js,html,css,result" allowfullscreen="allowfullscreen" frameborder="0"></iframe>


<ul>
<li>JSFiddle showing how a child frame can detect the change and handle it gracefully:</li>
</ul>


<iframe width="100%" height="300" src="http://jsfiddle.net/Jv7eD/4/embedded/result,js,html,css" allowfullscreen="allowfullscreen" frameborder="0"></iframe>


<ul>
<li>The child frame with code that tracks <code>document.domain</code> changes in the parent frame:</li>
</ul>


<iframe width="100%" height="400" src="http://jsfiddle.net/mBBaQ/4/embedded/js,html,css,result" allowfullscreen="allowfullscreen" frameborder="0"></iframe>


<h2>Final thoughts</h2>

<p>If I could have my way, I would push everyone towards abandoning using this property. The reality is that all modern browsers support CORS and we as an industry should be embracing it. It&#8217;s one of those things that your developers want to use, but your Operations team likely needs to enable. The Web as a whole moves at glacial speeds when adopting new standards, so while it would be ideal to have everyone move to CORS, it&#8217;s not practical. I&#8217;ve shown that there are ways to work around it that make your plugin a little more resilient out in The Wild. However, if you develop a plugin or use a plugin on your site that requires changing <code>document.domain</code>, please consider enabling CORS and please stop changing <code>document.domain</code>! As a final note, on <strong>Flite.com</strong> and all of its subdomains, we do <strong>not</strong> change the <code>document.domain</code> property, I just needed an example :-).</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[JSON Parsing in MySQL Using common_schema]]></title>
    <link href="http://mechanics.flite.com/blog/2013/04/08/json-parsing-in-mysql-using-common-schema/"/>
    <updated>2013-04-08T11:43:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/04/08/json-parsing-in-mysql-using-common-schema</id>
    <content type="html"><![CDATA[<p>Last week I was implementing a new report using MySQL, and some of the data was stored in JSON format. MySQL has lots of built-in <a href="http://dev.mysql.com/doc/refman/5.5/en/string-functions.html">string functions</a>, but none of them work for JSON. My first idea was to use the <a href="http://blog.kazuhooku.com/2011/09/mysqljson-mysql-udf-for-parsing-json.html">mysql_json UDF</a>, but then I remembered that <a href="http://code.google.com/p/common-schema/">common_schema</a> recently <a href="http://code.openark.org/blog/mysql/common_schema-1-3-security-goodies-parameterized-split-json-to-xml-query-checksum">added JSON parsing</a>. Since I have common_schema 1.3 installed on all of my databases already, I tried that first.</p>

<p>In this particular case the JSON is pretty simple. It contains two fields: age and gender. Here&#8217;s an example of the data format:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>{"age":"Over 30","gender":"female"}
</span><span class='line'>{"age":"Under 30","gender":"female"}
</span><span class='line'>{"age":"Over 30","gender":"male"}
</span><span class='line'>{"age":"Under 30","gender":"male"}</span></code></pre></td></tr></table></div></figure>


<p>Parsing that into two separate columns with common_schema is pretty easy; just use the <a href="http://common-schema.googlecode.com/svn/trunk/common_schema/doc/html/extract_json_value.html">extract_json_value()</a> function like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select common_schema.extract_json_value(f.event_data,'/age') as age,
</span><span class='line'>    -&gt;   common_schema.extract_json_value(f.event_data,'/gender') as gender,
</span><span class='line'>    -&gt;   sum(f.event_count) as event_count 
</span><span class='line'>    -&gt; from json_event_fact f
</span><span class='line'>    -&gt; group by age, gender;
</span><span class='line'>
</span><span class='line'>+----------+---------+-------------+
</span><span class='line'>| age      | gender  | event_count |
</span><span class='line'>+----------+---------+-------------+
</span><span class='line'>| Over 30  | female  |     3710983 |
</span><span class='line'>| Over 30  | male    |     2869302 |
</span><span class='line'>| Under 30 | female  |     5027591 |
</span><span class='line'>| Under 30 | male    |     4918382 |
</span><span class='line'>| unknown  | female  |       42039 |
</span><span class='line'>| unknown  | male    |       50173 |
</span><span class='line'>| unknown  | unknown |        8372 |
</span><span class='line'>+----------+---------+-------------+
</span></code></pre></td></tr></table></div></figure>


<p>That fits my reporting use case perfectly, and I&#8217;m now looking into other ways to use common_schema to parse JSON stored elsewhere in my database.</p>

<p>If you haven&#8217;t tried <a href="http://code.google.com/p/common-schema/">common_schema</a>, I recommend you check it out. It&#8217;s easy to install, saves a lot of time and effort, and gets better with each release. In fact, I&#8217;m now using 4 of the 7 &#8220;New and Noteworthy&#8221; features from the 1.3 release.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Hidden Gems in Percona Toolkit]]></title>
    <link href="http://mechanics.flite.com/blog/2013/04/01/hidden-gems-in-percona-toolkit/"/>
    <updated>2013-04-01T11:36:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/04/01/hidden-gems-in-percona-toolkit</id>
    <content type="html"><![CDATA[<p>I use <a href="http://www.percona.com/software/percona-toolkit">Percona Toolkit</a> a lot. We all know about the most popular tools: pt-heartbeat, pt-kill, pt-online-schema-change, pt-query-digest, pt-stalk, pt-table-checksum, etc. They can help a lot working with MySQL.</p>

<p>But there are also some lesser known tools that don&#8217;t have anything to do with MySQL.</p>

<p>Here are a few of my favorites:</p>

<h2>pt-price</h2>

<p>Percona acknowledges that &#8220;PT&#8221; doesn&#8217;t only stand for &#8220;Percona Toolkit&#8221;, it is also the chemical symbol for <a href="http://en.wikipedia.org/wiki/Platinum">Platinum</a>, so they recently added this handy little tool to give you the current spot price for Platinum. The price of Platinum is changing all the time, and it&#8217;s hard to keep up with it. Also, currency exchange rates are constantly fluctuating, so figuring out the value of platinum in various currencies is almost impossible. That&#8217;s where pt-price comes in. For example, say you want to know the price of Platinum in Euros per kilo or US dollars per ounce right now. Just run one of these commands:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ pt-price --currency=euro --unit=kilogram
</span><span class='line'>39586.26
</span><span class='line'>
</span><span class='line'>$ pt-price --currency=usd --unit=ounce
</span><span class='line'>1586</span></code></pre></td></tr></table></div></figure>


<h2>pt-height</h2>

<p>We all know that the <a href="http://en.wikipedia.org/wiki/Petronas_Towers">Petronas Towers</a> in Kuala Lumpur are 1483 feet tall, but how many meters is that? How about furlongs? pt-height can tell you:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ pt-height --unit=meter
</span><span class='line'>451.9
</span><span class='line'>
</span><span class='line'>$ pt-height --unit=furlong
</span><span class='line'>2.247</span></code></pre></td></tr></table></div></figure>


<p>The Petronas Towers haven&#8217;t been the tallest buildings in the world since 2004, but where do they rank right now? pt-height can tell you:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ pt-height --rank
</span><span class='line'>7</span></code></pre></td></tr></table></div></figure>


<h2>pt-barnum</h2>

<p>I haven&#8217;t used this one much yet, so I&#8217;ll leave this as an exercise for the reader.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Helping common_schema help me]]></title>
    <link href="http://mechanics.flite.com/blog/2013/03/18/helping-common-schema-help-me/"/>
    <updated>2013-03-18T12:02:00-04:00</updated>
    <id>http://mechanics.flite.com/blog/2013/03/18/helping-common-schema-help-me</id>
    <content type="html"><![CDATA[<p>I&#8217;m a big fan of <a href="http://code.google.com/p/common-schema/">common_schema</a>. It&#8217;s a really powerful and flexible tool, and I&#8217;m always looking for new ways to use it.</p>

<p>Last week I had to update millions of rows across many databases to tokenize some persisted URL values, and I remembered reading Baron Schwartz&#8217;s recent <a href="http://www.xaprb.com/blog/2013/01/28/deleting-millions-of-rows-in-small-chunks-with-common_schema/">blog post</a> about using the common_schema <a href="http://common-schema.googlecode.com/svn/trunk/common_schema/doc/html/query_script_split.html">split</a> feature. Baron&#8217;s use case was deleting data, but I figured this could work well to break my large updates into chunks too. I had already written the update statements I wanted to execute, and after five minutes reading the common_schema documentation I was ready to try it out on a dev database.</p>

<p>One of the queries I tried didn&#8217;t work in common_schema, but with a creative workaround I was able to trick common_schema into accepting it. Read on for the details.</p>

<p>The first query I wanted to run involved two tables, so I needed to specify which table to split on. In this case I&#8217;m updating the <code>parameter</code> table, so I specify that as my split table first, then specify the full update statement, and after that I select <code>$split_total_rowcount</code> so I can see the progress of the update after each chunk is done.</p>

<p>Here&#8217;s the code:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>set @script := "
</span><span class='line'>  split(flite.parameter: update flite.registration
</span><span class='line'>    inner join flite.parameter on parameter.registration_id = registration.id
</span><span class='line'>    set parameter.value = replace(parameter.value,'http://flite.com','${urls.siteUrl}')
</span><span class='line'>    where registration.type in (1,9)
</span><span class='line'>    and parameter.name = 'asset_url'
</span><span class='line'>  ) 
</span><span class='line'>  SELECT $split_total_rowcount AS 'rows updated so far';
</span><span class='line'>";
</span><span class='line'>call common_schema.run(@script);</span></code></pre></td></tr></table></div></figure>


<p>It worked on dev and QA, so I tried it on my staging database, where it also worked perfectly, updating over a million rows in about an hour. Here&#8217;s the end of the output:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>...
</span><span class='line'>+---------------------+
</span><span class='line'>| rows updated so far |
</span><span class='line'>+---------------------+
</span><span class='line'>|             1195047 |
</span><span class='line'>+---------------------+
</span><span class='line'>1 row in set (1 hour 3 min 40.12 sec)
</span><span class='line'>
</span><span class='line'>Query OK, 0 rows affected (1 hour 3 min 40.16 sec)</span></code></pre></td></tr></table></div></figure>


<!-- more -->


<p>The second query was even simpler, just a single table update. But when I ran this one with common_schema no rows were updated:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; set @script := "
</span><span class='line'>    "&gt;   split(update flite.registration
</span><span class='line'>    "&gt; set url = replace(url,'http://flite.com','${urls.siteUrl}')
</span><span class='line'>    "&gt; where type in (1,9)
</span><span class='line'>    "&gt; )     
</span><span class='line'>    "&gt;   SELECT $split_total_rowcount AS 'rows updated so far';
</span><span class='line'>    "&gt; ";
</span><span class='line'>Query OK, 0 rows affected (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; call common_schema.run(@script);
</span><span class='line'>Query OK, 0 rows affected (1.23 sec)
</span></code></pre></td></tr></table></div></figure>


<p>I guessed that common_schema had failed while trying to split the table, so I tried a no-op call to inspect its splitting approach. That call returned nothing, which supported my hypothesis:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; call common_schema.run("
</span><span class='line'>    "&gt;   split(flite.registration) { 
</span><span class='line'>    "&gt;     select 
</span><span class='line'>    "&gt;       $split_step as step, $split_columns as columns, 
</span><span class='line'>    "&gt;       $split_min as min_value, $split_max as max_value, 
</span><span class='line'>    "&gt;       $split_range_start as range_start, $split_range_end as range_end
</span><span class='line'>    "&gt;   }
</span><span class='line'>    "&gt; ");
</span><span class='line'>Query OK, 0 rows affected (0.33 sec)</span></code></pre></td></tr></table></div></figure>


<p>The <code>registration</code> table has over 50 columns, and in addition to having an auto-increment primary key it also has other unique indexes, so of all my tables I&#8217;m not too surprised this one had a problem.</p>
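
<p>To see which unique keys are candidates for splitting, a query along these lines lists them. This is only a sketch that reads <code>information_schema.statistics</code>; it says nothing about how common_schema actually chooses its split key.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class='sql'><span class='line'>-- List every unique index on flite.registration with its columns in key order.
</span><span class='line'>select index_name,
</span><span class='line'>       group_concat(column_name order by seq_in_index) as index_columns
</span><span class='line'>from information_schema.statistics
</span><span class='line'>where table_schema = 'flite'
</span><span class='line'>  and table_name = 'registration'
</span><span class='line'>  and non_unique = 0
</span><span class='line'>group by index_name;</span></code></pre></td></tr></table></div></figure>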

<p>So now what do I do? I could submit a <a href="http://code.google.com/p/common-schema/issues/entry">bug report</a> for common_schema, or open up the code and try to fix it myself (and I&#8217;ll eventually do at least one of those two things), but I really wanted to make this work, so I decided to trick common_schema into doing it anyway. Since common_schema supports multi-table updates, I just rewrote my single table update as a gratuitous multi-table update so I could split on the second table.</p>

<p>First I tried joining to a parent table:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>set @script := "
</span><span class='line'>  split(flite.organization: update flite.registration
</span><span class='line'>inner join flite.organization on organization.id = registration.organization_id
</span><span class='line'>set registration.url = replace(registration.url,'http://flite.com','${urls.siteUrl}')
</span><span class='line'>)
</span><span class='line'>  SELECT $split_total_rowcount AS 'rows updated so far';
</span><span class='line'>";
</span><span class='line'>call common_schema.run(@script);</span></code></pre></td></tr></table></div></figure>


<p>But when I ran that it didn&#8217;t return anything for a few minutes, which was not a good sign. Rather than troubleshoot that query, I decided to try another join table.</p>

<p>Next I tried joining to a child table, using a where clause that I knew would give me a 1-1 relationship with the table I was updating:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>set @script := "
</span><span class='line'>  split(flite.instance: update flite.registration
</span><span class='line'>inner join flite.instance on instance.registration_id = registration.id
</span><span class='line'>set registration.url = replace(registration.url,'http://flite.com','${urls.siteUrl}')
</span><span class='line'>where registration.type in (1,9)
</span><span class='line'>and instance.is_preview = true
</span><span class='line'>)
</span><span class='line'>  SELECT $split_total_rowcount AS 'rows updated so far';
</span><span class='line'>";
</span><span class='line'>call common_schema.run(@script);</span></code></pre></td></tr></table></div></figure>


<p>This time it worked! Here&#8217;s the end of the output:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>...
</span><span class='line'>+---------------------+
</span><span class='line'>| rows updated so far |
</span><span class='line'>+---------------------+
</span><span class='line'>|             1195725 |
</span><span class='line'>+---------------------+
</span><span class='line'>1 row in set (42 min 33.48 sec)
</span><span class='line'>
</span><span class='line'>Query OK, 0 rows affected (42 min 33.52 sec)</span></code></pre></td></tr></table></div></figure>
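
<p>Because this workaround depends on the join being 1-1, it may be worth checking that assumption before running the update. The two queries below are a sketch using only the tables and columns from the update statement above.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class='sql'><span class='line'>-- Registrations of type 1 or 9 that have more than one preview instance.
</span><span class='line'>-- An empty result means each registration is matched by at most one instance row.
</span><span class='line'>select i.registration_id, count(*) as preview_instances
</span><span class='line'>from flite.instance i
</span><span class='line'>inner join flite.registration r on r.id = i.registration_id
</span><span class='line'>where r.type in (1,9)
</span><span class='line'>  and i.is_preview = true
</span><span class='line'>group by i.registration_id
</span><span class='line'>having count(*) &gt; 1;
</span><span class='line'>
</span><span class='line'>-- Registrations of type 1 or 9 with no preview instance at all.
</span><span class='line'>-- These rows would be skipped by the joined update.
</span><span class='line'>select count(*) as registrations_missed
</span><span class='line'>from flite.registration r
</span><span class='line'>left outer join flite.instance i
</span><span class='line'>  on i.registration_id = r.id and i.is_preview = true
</span><span class='line'>where r.type in (1,9)
</span><span class='line'>  and i.registration_id is null;</span></code></pre></td></tr></table></div></figure>

<p>If the first query returns rows or the second returns a non-zero count, the joined update is not equivalent to the original single-table update.</p>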


<p>Now I can go back and look at the code and/or submit a bug report, but I was pleased that with a little help common_schema solved my use case. This was a reminder to me of the power of a good tool, and the value of being creative in the way you use that tool.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Enabling InnoDB compression on tables with heavy rows]]></title>
    <link href="http://mechanics.flite.com/blog/2013/03/01/enabling-innodb-compression-on-tables-with-heavy-rows/"/>
    <updated>2013-03-01T16:09:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/03/01/enabling-innodb-compression-on-tables-with-heavy-rows</id>
    <content type="html"><![CDATA[<p>I&#8217;ve been interested in <a href="http://dev.mysql.com/doc/refman/5.5/en/innodb-compression-background.html">InnoDB compression</a> for a while, and have been looking for a good use case at <a href="http://www.flite.com">Flite</a> where I could start using it.</p>

<p>Recently I found a perfect use case to start using InnoDB compression on a single table. This table gets a few hundred inserts per day, and each row stores up to 350 KB of serialized JSON in a LONGTEXT column.</p>

<p>Since I use <code>innodb_file_per_table</code> on MySQL 5.5, compressing the table was easy. I just executed two statements:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>SET GLOBAL innodb_file_format=BARRACUDA;
</span><span class='line'>
</span><span class='line'>ALTER TABLE reg_version 
</span><span class='line'>  ENGINE=InnoDB 
</span><span class='line'>  ROW_FORMAT=COMPRESSED 
</span><span class='line'>  KEY_BLOCK_SIZE=16;</span></code></pre></td></tr></table></div></figure>


<p>Of course I also added <code>innodb_file_format = BARRACUDA</code> to <code>/etc/my.cnf</code> to make the change permanent.</p>

<p>From my understanding, this change means that any newly created InnoDB tables will use the <code>BARRACUDA</code> format, and all pre-existing tables will still be in <code>ANTELOPE</code> format. I don&#8217;t think the format will matter to me unless I want to compress another table, but I&#8217;m not sure if rebuilding an existing table via <code>OPTIMIZE TABLE</code> or <code>ALTER TABLE</code> would cause it to be rebuilt in <code>BARRACUDA</code> format or not.</p>
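
<p>One way to find out rather than guess is to compare the row format that MySQL reports for a table before and after a rebuild. As I read the manual, only the <code>COMPRESSED</code> and <code>DYNAMIC</code> row formats depend on <code>BARRACUDA</code>, but that is an assumption worth verifying. The schema name below is also an assumption; substitute the schema that holds the table.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class='sql'><span class='line'>-- Show the row format and create options recorded for the compressed table.
</span><span class='line'>-- 'flite' is an assumed schema name; replace it with the real one.
</span><span class='line'>select table_schema, table_name, row_format, create_options
</span><span class='line'>from information_schema.tables
</span><span class='line'>where table_schema = 'flite'
</span><span class='line'>  and table_name = 'reg_version';</span></code></pre></td></tr></table></div></figure>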

<p>The table is about 80% smaller on disk now that it is compressed, so this change is working as expected for me.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A fail-fast first test of MySQL backup integrity]]></title>
    <link href="http://mechanics.flite.com/blog/2013/02/21/a-fail-fast-first-test-of-mysql-backup-integrity/"/>
    <updated>2013-02-21T15:41:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/02/21/a-fail-fast-first-test-of-mysql-backup-integrity</id>
    <content type="html"><![CDATA[<p>This morning I was planning to check the integrity of a snapshot of a MySQL database that runs on EC2 using EBS, and I accidentally stumbled upon a quick way to find certain types of problems with a backup.</p>

<p>My plan was to use <code>CHECK TABLE</code> on all of the tables to verify the integrity of the backup. The simplest way to run <code>CHECK TABLE</code> on every table in every database is like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysqlcheck -c --all-databases</span></code></pre></td></tr></table></div></figure>


<p>Rather than using that command, I decided to generate a bunch of individual <code>CHECK TABLE</code> statements dynamically by running a query on the <code>information_schema.tables</code> table. Why do it that way? It gives me the flexibility to check a specific schema first, or check the smallest or most frequently updated tables first, etc.</p>

<!-- more -->


<p>So I ran this query to generate my <code>CHECK TABLE</code> script:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>select concat('check table ', table_schema, '.', table_name, ';') as check_stmt
</span><span class='line'>into outfile '/tmp/check_tables.sql'
</span><span class='line'>from information_schema.tables
</span><span class='line'>where table_type = 'BASE TABLE'
</span><span class='line'>order by (data_length + index_length) asc;</span></code></pre></td></tr></table></div></figure>


<p>And I planned to execute the script like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql -f &lt; /tmp/check_tables.sql</span></code></pre></td></tr></table></div></figure>


<p>However, when I ran the first query it crashed the MySQL server!</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select concat('check table ', table_schema, '.', table_name, ';') as check_stmt
</span><span class='line'>    -&gt; into outfile '/tmp/check_tables.sql'
</span><span class='line'>    -&gt; from information_schema.tables
</span><span class='line'>    -&gt; where table_type = 'BASE TABLE'
</span><span class='line'>    -&gt; order by (data_length + index_length) asc;
</span><span class='line'>ERROR 2013 (HY000): Lost connection to MySQL server during query</span></code></pre></td></tr></table></div></figure>


<p>Of course that error message doesn&#8217;t <em>always</em> mean that mysqld crashed. Sometimes it just means your session was killed. But in this case I checked, and mysqld did indeed crash.</p>
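
<p>One quick way to tell the two cases apart, assuming mysqld is restarted automatically after a crash (for example by <code>mysqld_safe</code>), is to reconnect and look at the server uptime. The error log remains the authoritative source.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class='sql'><span class='line'>-- Uptime is reported in seconds since the server started.
</span><span class='line'>-- A very small value right after the error suggests mysqld restarted.
</span><span class='line'>show global status like 'Uptime';</span></code></pre></td></tr></table></div></figure>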

<p>It was a harmless crash since it was on a passive server, but it pointed to a problem with the backup, so I had to find out exactly what went wrong and fix it.</p>

<p>I checked the error log and found the source of the crash:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>InnoDB: Error: tablespace id is 17722 in the data dictionary
</span><span class='line'>InnoDB: but in file ./test/promo_history_old.ibd it is 18280!
</span><span class='line'>130221  9:39:55  InnoDB: Assertion failure in thread 139984392570624 in file fil0fil.c line 765
</span><span class='line'>InnoDB: We intentionally generate a memory trap.
</span><span class='line'>InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
</span><span class='line'>InnoDB: If you get repeated assertion failures or crashes, even
</span><span class='line'>InnoDB: immediately after the mysqld startup, there may be
</span><span class='line'>InnoDB: corruption in the InnoDB tablespace. Please refer to
</span><span class='line'>InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
</span><span class='line'>InnoDB: about forcing recovery.
</span><span class='line'>130221  9:39:55 - mysqld got signal 6 ;
</span><span class='line'>...</span></code></pre></td></tr></table></div></figure>


<p>I&#8217;ve omitted the rest of the error output (including the backtrace) for brevity, but it&#8217;s clear which table caused the problem: <code>test.promo_history_old</code>.</p>

<p>After confirming that the table was no longer needed, I dropped it and re-ran the <code>information_schema</code> query successfully, so I could move on with the rest of my integrity check.</p>

<p>In the future I&#8217;ll run a quick fail-fast test on <code>information_schema</code> before doing the more time-consuming check of running <code>CHECK TABLE</code> on all of the tables. I think this should be sufficient:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>select * from information_schema.tables;</span></code></pre></td></tr></table></div></figure>


<p>Of course I&#8217;ll still run <code>CHECK TABLE</code> on all of the tables, but the <code>information_schema</code> test will help me find some problems faster.</p>
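
<p>As an illustration of the flexibility mentioned earlier, here is a sketch of the same generator restricted to a single schema. The schema name and output path are placeholders, and the backticks are an addition to protect table names that contain unusual characters.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class='sql'><span class='line'>-- Generate CHECK TABLE statements for one schema only, smallest tables first.
</span><span class='line'>-- 'flite' and the output path are illustrative placeholders.
</span><span class='line'>select concat('check table `', table_schema, '`.`', table_name, '`;') as check_stmt
</span><span class='line'>into outfile '/tmp/check_tables_flite.sql'
</span><span class='line'>from information_schema.tables
</span><span class='line'>where table_type = 'BASE TABLE'
</span><span class='line'>  and table_schema = 'flite'
</span><span class='line'>order by (data_length + index_length) asc;</span></code></pre></td></tr></table></div></figure>

<p>Note that <code>INTO OUTFILE</code> refuses to overwrite an existing file, so the output file has to be removed or renamed between runs.</p>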
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Why I use ONLY_FULL_GROUP_BY]]></title>
    <link href="http://mechanics.flite.com/blog/2013/02/12/why-i-use-only-full-group-by/"/>
    <updated>2013-02-12T15:25:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/02/12/why-i-use-only-full-group-by</id>
    <content type="html"><![CDATA[<p>MySQL uses the concept of <a href="http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html">SQL_MODE</a> to &#8220;define what SQL syntax MySQL should support and what kind of data validation checks it should perform&#8221;. This post is about one of those modes, <a href="http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html#sqlmode_only_full_group_by">ONLY_FULL_GROUP_BY</a>, and why I use it.</p>

<p>Roland Bouman wrote a <a href="http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html">great post</a> a few years ago that debunks some myths about using <code>GROUP BY</code> in MySQL. His post has a lot of detail and examples, and does a very good job detailing the way <code>GROUP BY</code> works in MySQL with and without <code>ONLY_FULL_GROUP_BY</code> enabled. I recommend that you go and read that post now. Among other things, Roland points out one case where query performance is improved by <em>not</em> using a full <code>GROUP BY</code>. The post is several years old, but the performance difference is still present today in MySQL 5.6.</p>

<p>That post doesn&#8217;t make any strong recommendation on using <code>ONLY_FULL_GROUP_BY</code>, so why do I use it? For me, it makes it safer to run dynamically generated report queries. I execute a variety of dynamic reports that select one or more metrics for a specific time period over a specific set of dimensions. In order for the report to return accurate data, all of the dimensional columns must be in both the <code>SELECT</code> clause and the <code>GROUP BY</code> clause. I trust myself to write good SQL by hand, but since I have code generating dynamic SQL I value the extra protection provided by this SQL mode. If MySQL throws an error due to a partial <code>GROUP BY</code> I can catch it with a functional test, rather than trying to catch a more subtle error, namely incorrect report data.</p>

<!-- more -->


<p>To illustrate the problem I am trying to avoid, I will execute a reporting query on the <a href="http://dev.mysql.com/doc/sakila/en/sakila-installation.html">sakila database</a>. Let&#8217;s say I want to see the 10 most popular language/category combinations for the films in the sakila database. I could use a query like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select l.name,c.name,count(*)
</span><span class='line'>    -&gt; from sakila.film f
</span><span class='line'>    -&gt; left outer join sakila.film_category fc on fc.film_id = f.film_id
</span><span class='line'>    -&gt; left outer join sakila.category c on c.category_id = fc.category_id
</span><span class='line'>    -&gt; left outer join sakila.language l on l.language_id = f.language_id
</span><span class='line'>    -&gt; group by l.name,c.name
</span><span class='line'>    -&gt; order by count(*) desc
</span><span class='line'>    -&gt; limit 10;
</span><span class='line'>+---------+-------------+----------+
</span><span class='line'>| name    | name        | count(*) |
</span><span class='line'>+---------+-------------+----------+
</span><span class='line'>| English | Sports      |       74 |
</span><span class='line'>| English | Foreign     |       73 |
</span><span class='line'>| English | Family      |       69 |
</span><span class='line'>| English | Documentary |       68 |
</span><span class='line'>| English | Animation   |       66 |
</span><span class='line'>| English | Action      |       64 |
</span><span class='line'>| English | New         |       63 |
</span><span class='line'>| English | Drama       |       62 |
</span><span class='line'>| English | Games       |       61 |
</span><span class='line'>| English | Sci-Fi      |       61 |
</span><span class='line'>+---------+-------------+----------+
</span><span class='line'>10 rows in set (0.00 sec)</span></code></pre></td></tr></table></div></figure>


<p>That looks pretty good. Now imagine my query was written by a query builder in code, and due to a bug the category column was not added to the <code>GROUP BY</code> clause. By default MySQL will still execute the query and give me a result, but the result is misleading:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select l.name,c.name,count(*)
</span><span class='line'>    -&gt; from sakila.film f
</span><span class='line'>    -&gt; left outer join sakila.film_category fc on fc.film_id = f.film_id
</span><span class='line'>    -&gt; left outer join sakila.category c on c.category_id = fc.category_id
</span><span class='line'>    -&gt; left outer join sakila.language l on l.language_id = f.language_id
</span><span class='line'>    -&gt; group by l.name
</span><span class='line'>    -&gt; order by count(*) desc
</span><span class='line'>    -&gt; limit 10;
</span><span class='line'>+---------+-------------+----------+
</span><span class='line'>| name    | name        | count(*) |
</span><span class='line'>+---------+-------------+----------+
</span><span class='line'>| English | Documentary |     1000 |
</span><span class='line'>+---------+-------------+----------+
</span><span class='line'>1 row in set (0.01 sec)</span></code></pre></td></tr></table></div></figure>


<p>That makes it look like every film in the sakila database is a Documentary, which is obviously not accurate.</p>

<p>To avoid this I will set the <code>SQL_MODE</code>. When setting the <code>SQL_MODE</code> you need to be careful not to remove any existing <code>SQL_MODE</code> values. In my case I will append <code>ONLY_FULL_GROUP_BY</code> to the 2 <code>SQL_MODE</code> values I am already using:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select @@sql_mode;
</span><span class='line'>+--------------------------------------------+
</span><span class='line'>| @@sql_mode                                 |
</span><span class='line'>+--------------------------------------------+
</span><span class='line'>| STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION |
</span><span class='line'>+--------------------------------------------+
</span><span class='line'>1 row in set (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; set session sql_mode = 'STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY';
</span><span class='line'>Query OK, 0 rows affected (0.00 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; select @@sql_mode;
</span><span class='line'>+---------------------------------------------------------------+
</span><span class='line'>| @@sql_mode                                                    |
</span><span class='line'>+---------------------------------------------------------------+
</span><span class='line'>| ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION |
</span><span class='line'>+---------------------------------------------------------------+
</span><span class='line'>1 row in set (0.00 sec)</span></code></pre></td></tr></table></div></figure>
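
<p>To avoid retyping the existing values, the mode can also be appended to whatever the session already has. This is a sketch of one way to do it, not the approach used above:</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class='sql'><span class='line'>-- Append ONLY_FULL_GROUP_BY to whatever modes the session already has.
</span><span class='line'>-- nullif() turns an empty sql_mode into NULL, which concat_ws() skips,
</span><span class='line'>-- so the new value never starts with a stray comma.
</span><span class='line'>set session sql_mode = concat_ws(',', nullif(@@session.sql_mode, ''), 'ONLY_FULL_GROUP_BY');
</span><span class='line'>
</span><span class='line'>-- Confirm the result.
</span><span class='line'>select @@session.sql_mode;</span></code></pre></td></tr></table></div></figure>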


<p>Now that I&#8217;ve set the <code>SQL_MODE</code>, I will try my query again:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select l.name,c.name,count(*)
</span><span class='line'>    -&gt; from sakila.film f
</span><span class='line'>    -&gt; left outer join sakila.film_category fc on fc.film_id = f.film_id
</span><span class='line'>    -&gt; left outer join sakila.category c on c.category_id = fc.category_id
</span><span class='line'>    -&gt; left outer join sakila.language l on l.language_id = f.language_id
</span><span class='line'>    -&gt; group by l.name
</span><span class='line'>    -&gt; order by count(*) desc
</span><span class='line'>    -&gt; limit 10;
</span><span class='line'>ERROR 1055 (42000): 'sakila.c.name' isn't in GROUP BY
</span><span class='line'>mysql&gt; </span></code></pre></td></tr></table></div></figure>


<p>That&#8217;s much better. In my case I would rather get an error than get bad data, so this is the result I want.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Capturing errors and warnings from LOAD DATA INFILE]]></title>
    <link href="http://mechanics.flite.com/blog/2013/02/05/capturing-errors-and-warnings-from-load-data-infile/"/>
    <updated>2013-02-05T10:51:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/02/05/capturing-errors-and-warnings-from-load-data-infile</id>
    <content type="html"><![CDATA[<p>MySQL makes it easy to bulk load multiple rows of data from a flat file into a MySQL table using the <a href="http://dev.mysql.com/doc/refman/5.5/en/load-data.html"><code>LOAD DATA INFILE</code></a> command, but that command can quickly get you into trouble if you are not careful about capturing the warnings and errors it produces.</p>

<p>Running <code>LOAD DATA INFILE</code> commands at the mysql prompt gives you pretty good output, but if you run the same command at the terminal or in a shell script you have to do a little bit of extra work to capture the errors and warnings.</p>

<p>Here are a few techniques I use when I run <code>LOAD DATA INFILE</code> at the terminal or in a shell script:</p>

<ol>
<li>Use <a href="http://dev.mysql.com/doc/refman/5.5/en/mysql-command-options.html#option_mysql_verbose">double-verbose mode</a> (<code>-vv</code>) to capture the high-level counts of Records, Deleted, Skipped, Warnings, etc.</li>
<li>Use <a href="http://dev.mysql.com/doc/refman/5.5/en/mysql-command-options.html#option_mysql_init-command"><code>--init-command</code></a> to set session variables (<a href="http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html">sql_mode</a>, <a href="http://dev.mysql.com/doc/refman/5.5/en/set-sql-log-bin.html">sql_log_bin</a>, <a href="http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_foreign_key_checks">foreign_key_checks</a>, etc.)</li>
<li>Use <a href="http://dev.mysql.com/doc/refman/5.5/en/mysql-command-options.html#option_mysql_show-warnings"><code>--show-warnings</code></a> to get all of the warnings, even if they exceed the value of <a href="http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_max_error_count"><code>max_error_count</code></a>.</li>
<li>Use <a href="http://dev.mysql.com/doc/refman/5.5/en/user-variables.html">user-defined variables</a> and <code>SET</code> statements to explicitly handle NULL values and defaults.</li>
</ol>


<p>Below I will provide more detail on each of those techniques, using the <a href="http://dev.mysql.com/doc/sakila/en/sakila-installation.html">sakila database</a> for my code examples:</p>

<!-- more -->


<h2>1. Use double-verbose mode (<code>-vv</code>) to capture the high-level counts of Records, Deleted, Skipped, Warnings, etc.</h2>

<p>For my first example, I&#8217;ll dump all of the rows from a table with a primary key into a flat file on disk, and then try to load that flat file into the table again and see what happens.</p>

<p>If I do a straightforward <code>LOAD DATA INFILE</code> I get an error:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; select * into outfile '/tmp/rental.txt' from sakila.rental;
</span><span class='line'>Query OK, 16044 rows affected (0.02 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; load data infile '/tmp/rental.txt' into table sakila.rental;
</span><span class='line'>ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'</span></code></pre></td></tr></table></div></figure>


<p>If I add the <code>IGNORE</code> keyword I can avoid that error, and MySQL tells me that all 16044 rows were skipped as duplicates:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; load data infile '/tmp/rental.txt' ignore into table sakila.rental;
</span><span class='line'>Query OK, 0 rows affected (0.23 sec)
</span><span class='line'>Records: 16044  Deleted: 0  Skipped: 16044  Warnings: 0</span></code></pre></td></tr></table></div></figure>


<p>The <code>IGNORE</code> keyword is also useful if your file contains some duplicates and some new rows. To illustrate this I delete a range of 1000 rows from the table and then re-run the same command:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql&gt; delete from sakila.rental where rental_id &gt;= 8000 and rental_id &lt; 9000;
</span><span class='line'>Query OK, 1000 rows affected (0.11 sec)
</span><span class='line'>
</span><span class='line'>mysql&gt; load data infile '/tmp/rental.txt' ignore into table sakila.rental;
</span><span class='line'>Query OK, 1000 rows affected (0.74 sec)
</span><span class='line'>Records: 16044  Deleted: 0  Skipped: 15044  Warnings: 0</span></code></pre></td></tr></table></div></figure>


<p>The 1000 rows I deleted were successfully loaded back into the table, and the other 15044 rows were rejected as duplicates.</p>

<p>By default, running <code>LOAD DATA INFILE</code> from a shell script will not give you any output unless there is an error.</p>

<p>To illustrate what happens in a shell script I&#8217;ll run the same commands from the terminal.</p>

<p>Error conditions are the same:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   -e "load data infile '/tmp/rental.txt' into table sakila.rental;"
</span><span class='line'>ERROR 1062 (23000) at line 1: Duplicate entry '1' for key 'PRIMARY'
</span><span class='line'>$ </span></code></pre></td></tr></table></div></figure>


<p>If I add the <code>IGNORE</code> keyword, by default I don&#8217;t get the information about how many rows were skipped:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila -e \
</span><span class='line'>&gt;   "load data infile '/tmp/rental.txt' ignore into table sakila.rental;"
</span><span class='line'>$ </span></code></pre></td></tr></table></div></figure>


<p>I can capture that information by passing the <a href="http://dev.mysql.com/doc/refman/5.5/en/mysql-command-options.html#option_mysql_verbose"><code>--verbose</code></a> flag twice (<code>-vv</code>). Double-verbose mode also gives me some extra output: the SQL command I execute is echoed back to me.</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   -vve "load data infile '/tmp/rental.txt' ignore into table sakila.rental;"
</span><span class='line'>--------------
</span><span class='line'>load data infile '/tmp/rental.txt' ignore into table sakila.rental
</span><span class='line'>--------------
</span><span class='line'>
</span><span class='line'>Query OK, 0 rows affected (0.43 sec)
</span><span class='line'>Records: 16044  Deleted: 0  Skipped: 16044  Warnings: 0
</span><span class='line'>
</span><span class='line'>Bye
</span><span class='line'>$ </span></code></pre></td></tr></table></div></figure>


<h2>2. Use <code>--init-command</code> to set session variables (sql_mode, sql_log_bin, foreign_key_checks, etc.)</h2>

<p><code>--init-command</code> is a handy way to set your session up the way you want it when running mysql commands from the terminal or in a shell script. For example, if your server uses a relaxed global sql_mode, you may enforce a stricter sql_mode for your session like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql --init-command="set sql_mode = 'STRICT_ALL_TABLES';"</span></code></pre></td></tr></table></div></figure>


<p>You can set multiple variables in a single init-command. For example you can set the sql_mode, disable binary logging for your session, and disable foreign key checks like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>mysql --init-command="set sql_mode = 'STRICT_ALL_TABLES'; set sql_log_bin = 0; set foreign_key_checks = 0;"</span></code></pre></td></tr></table></div></figure>


<p>To further illustrate this point, I&#8217;ll use the sakila.rental table again in several different error scenarios that benefit from the use of <code>--init-command</code>.</p>

<p>First I&#8217;ll truncate the table before I load the data back in:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   -e "truncate table sakila.rental;"
</span><span class='line'>ERROR 1701 (42000) at line 1: Cannot truncate a table referenced in a foreign key constraint (`sakila`.`payment`, CONSTRAINT `fk_payment_rental` FOREIGN KEY (`rental_id`) REFERENCES `sakila`.`rental` (`rental_id`))</span></code></pre></td></tr></table></div></figure>


<p>Oops, I can&#8217;t truncate the rental table because the payment table has a foreign key to it. Since I&#8217;m planning to load the data back into the rental table I want to just ignore the foreign key. I can do that with <code>--init-command</code>:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   --init-command="set foreign_key_checks=0;" \
</span><span class='line'>&gt;   -e "truncate table sakila.rental;"
</span><span class='line'>$ </span></code></pre></td></tr></table></div></figure>


<p>That&#8217;s better. Now I&#8217;ll change return_date from NULL to NOT NULL and see what happens when I try to load the flat file back in, knowing that it has 183 rows with NULL values in that column:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   -e "alter table sakila.rental modify column return_date datetime not null;"
</span><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   -vve "load data infile '/tmp/rental.txt' into table sakila.rental;"
</span><span class='line'>--------------
</span><span class='line'>load data infile '/tmp/rental.txt' into table sakila.rental
</span><span class='line'>--------------
</span><span class='line'>
</span><span class='line'>Query OK, 16044 rows affected, 183 warnings (0.82 sec)
</span><span class='line'>Records: 16044  Deleted: 0  Skipped: 0  Warnings: 183
</span><span class='line'>
</span><span class='line'>Bye
</span><span class='line'>$ </span></code></pre></td></tr></table></div></figure>


<p>All of the rows loaded successfully, but I got 183 warnings. Unfortunately I don&#8217;t see any of the warnings, which brings me to:</p>

<h2>3. Use <code>--show-warnings</code> to get all of the warnings, even if they exceed the value of <a href="http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_max_error_count"><code>max_error_count</code></a></h2>

<p>I&#8217;ll try that truncate and load again using <code>--show-warnings</code> so I can see all of the warnings.</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   --init-command="set foreign_key_checks=0;" \
</span><span class='line'>&gt;   -e "truncate table sakila.rental;"
</span><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   --show-warnings \
</span><span class='line'>&gt;   -vve "load data infile '/tmp/rental.txt' into table sakila.rental;"
</span><span class='line'>--------------
</span><span class='line'>load data infile '/tmp/rental.txt' into table sakila.rental
</span><span class='line'>--------------
</span><span class='line'>
</span><span class='line'>Query OK, 16044 rows affected, 183 warnings (0.82 sec)
</span><span class='line'>Records: 16044  Deleted: 0  Skipped: 0  Warnings: 183
</span><span class='line'>
</span><span class='line'>Warning (Code 1263): Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11492
</span><span class='line'>Warning (Code 1263): Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11537
</span><span class='line'>Warning (Code 1263): Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11559
</span><span class='line'>Warning (Code 1263): Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11573
</span><span class='line'>Warning (Code 1263): Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11589
</span><span class='line'>Warning (Code 1263): Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11607
</span><span class='line'>Warning (Code 1263): Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11642
</span><span class='line'>...
</span><span class='line'>Bye
</span><span class='line'>$ </span></code></pre></td></tr></table></div></figure>


<p>I snipped the list of warnings for brevity. It was 183 occurrences of the same warning.</p>

<p>This raises the question: what default value did MySQL use?</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila -e "select min(return_date) from sakila.rental;"
</span><span class='line'>+---------------------+
</span><span class='line'>| min(return_date)    |
</span><span class='line'>+---------------------+
</span><span class='line'>| 0000-00-00 00:00:00 |
</span><span class='line'>+---------------------+</span></code></pre></td></tr></table></div></figure>


<p>That&#8217;s not ideal. I&#8217;ll try again with a stricter sql_mode, which turns that warning into an error instead of silently storing a zero date:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   --init-command="set foreign_key_checks=0;" \
</span><span class='line'>&gt;   -e "truncate table sakila.rental;"
</span><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   --show-warnings \
</span><span class='line'>&gt;   --init-command="set sql_mode = 'STRICT_ALL_TABLES';" \
</span><span class='line'>&gt;   -vve "load data infile '/tmp/rental.txt' into table sakila.rental;"
</span><span class='line'>--------------
</span><span class='line'>load data infile '/tmp/rental.txt' into table sakila.rental
</span><span class='line'>--------------
</span><span class='line'>
</span><span class='line'>ERROR 1263 (22004) at line 1: Column set to default value; NULL supplied to NOT NULL column 'return_date' at row 11492
</span><span class='line'>Bye</span></code></pre></td></tr></table></div></figure>


<p>Okay, that&#8217;s better. I don&#8217;t get any zero dates, but instead none of the data is loaded. Which brings me to:</p>

<h2>4. Use user-defined variables and <code>SET</code> statements to explicitly handle NULL values and defaults</h2>

<p>Let&#8217;s say I don&#8217;t want zero dates, but I do want to enforce a default value for rows without a date in the specific context of my <code>LOAD DATA INFILE</code> statement. For the purposes of this example I&#8217;ll use the current date and time as returned by the MySQL <code>NOW()</code> function.</p>

<p>I can capture the relevant column value from the flat file in a user-defined variable, and then use a <code>SET</code> clause within <code>LOAD DATA INFILE</code> to explicitly set return_date=now() if a NULL is in the file:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ mysql --database=sakila \
</span><span class='line'>&gt;   --show-warnings \
</span><span class='line'>&gt;   --init-command="set sql_mode = 'STRICT_ALL_TABLES';" \
</span><span class='line'>&gt;   -vve "load data infile '/tmp/rental.txt' into table sakila.rental \
</span><span class='line'>&gt;   (rental_id,rental_date,inventory_id,customer_id,@return_date,staff_id,last_update) \
</span><span class='line'>&gt;   set return_date = coalesce(@return_date,now());"
</span><span class='line'>--------------
</span><span class='line'>load data infile '/tmp/rental.txt' into table sakila.rental   (rental_id,rental_date,inventory_id,customer_id,@return_date,staff_id,last_update)   set return_date = coalesce(@return_date,now())
</span><span class='line'>--------------
</span><span class='line'>
</span><span class='line'>Query OK, 16044 rows affected (0.99 sec)
</span><span class='line'>Records: 16044  Deleted: 0  Skipped: 0  Warnings: 0
</span><span class='line'>
</span><span class='line'>Bye
</span><span class='line'>$ </span></code></pre></td></tr></table></div></figure>



]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[JavaScript event bindings that will throw you for a loop]]></title>
    <link href="http://mechanics.flite.com/blog/2013/01/30/javascript-event-bindings-that-will-throw-you-for-a-loop/"/>
    <updated>2013-01-30T16:47:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/01/30/javascript-event-bindings-that-will-throw-you-for-a-loop</id>
    <content type="html"><![CDATA[<p>If you build rich internet applications that rely heavily on JavaScript and frameworks such as Backbone and jQuery, chances are you have come across a quirky little thing about JavaScript event bindings, specifically when you are dealing with things within loops.</p>

<!-- more -->


<p>Let&#8217;s take a simple example:<br /></p>

<iframe style="width: 100%; height: 300px" src="http://jsfiddle.net/grizznant/nDXSb/embedded/" allowfullscreen="allowfullscreen" frameborder="0"></iframe>


<p>You want to bind a click handler to each LI element of a UL in the DOM. Let&#8217;s say you decide to do it by getting a reference to the items, looping over them, and binding a click handler that alerts the value of the loop counter &#8220;i&#8221;. You might expect the items to alert &#8220;0&#8221;, &#8220;1&#8221;, &#8220;2&#8221;, etc., right? Well, if you run the example, you will see that this is not the case! It turns out that because JavaScript scopes the counter to the enclosing function rather than to the loop body, every click handler holds a reference to the same &#8220;i&#8221;, and by the time any handler runs the loop has already finished incrementing it. So instead of getting a nice list of items that happily alert the appropriate index number, you get a list of items that all alert &#8220;5&#8221;. Annoying!</p>
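
<p>For readers who cannot load the embedded fiddle, here is a minimal sketch of the same pitfall. It is not the fiddle&#8217;s code: plain objects stand in for the five LI elements (an assumption, so that it runs without a DOM), and the handlers return the counter instead of alerting it.</p>

```javascript
// Stand-ins for the five LI elements; plain objects are used so the
// sketch runs without a DOM (an assumption, not the fiddle's markup).
var items = [{}, {}, {}, {}, {}];

for (var i = 0; i < items.length; i++) {
  // Every handler closes over the same function-scoped variable i.
  items[i].onclick = function () {
    return i; // the original example alerts this value
  };
}

// "Click" each item only after the loop has finished, as a user would.
var clicked = items.map(function (item) {
  return item.onclick();
});
console.log(clicked); // every entry is 5
```

<p>The loop exits when the counter reaches the number of items, and that final value is the only one the handlers ever see.</p>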

<p>I&#8217;m not saying you would actually try to code things as I did in the example, but what are some other ways to do the same thing that would avoid this problem? Here are just a few other examples that you might consider:</p>

<h2>1. Bypass using the loop entirely<br /></h2>

<p>jQuery makes it super easy to just skip the whole looping mess using the power of CSS selectors and the <code>on()</code> function. Even without a loop or counter, we can still alert the index, as we were trying to do in the first example, by using jQuery&#8217;s <code>index()</code> function. <a href="http://jsfiddle.net/grizznant/nEtuK/" target="_blank">try&nbsp;it</a></p>
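
<p>As a rough, DOM-free sketch of why this approach sidesteps the problem: the index is looked up from the clicked item at click time instead of being captured from a loop counter. The <code>list</code> object and <code>handleClick</code> function are hypothetical stand-ins, and a plain <code>indexOf</code> lookup plays the role that jQuery&#8217;s <code>index()</code> plays in the linked fiddle.</p>

```javascript
// Hypothetical stand-in for a UL with five LI children (no DOM or jQuery needed).
var list = { children: [] };
for (var n = 0; n < 5; n++) {
  list.children.push({ parent: list });
}

// A single shared handler: it derives the index from the clicked item itself,
// so there is no captured loop counter that could go stale.
function handleClick(item) {
  return item.parent.children.indexOf(item);
}

// "Click" every item in turn and collect what each one reports.
var indexes = list.children.map(function (item) {
  return handleClick(item);
});
console.log(indexes); // each item reports its own position, 0 through 4
```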

<h2>2. Use function closures<br /></h2>

<p>The first approach works, but what if you still want to do things using a loop? Since the problem stems from every handler referring to the same variable, wrapping each binding in its own closure safely protects the variable&#8217;s value for that iteration and gives you the expected result. <a href="http://jsfiddle.net/grizznant/LvzZj/" target="_blank">try&nbsp;it</a></p>
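
<p>A minimal sketch of the closure approach, assuming plain objects in place of the LI elements and handlers that return the index rather than alert it (the linked fiddle may be structured differently):</p>

```javascript
// Plain-object stand-ins for the five LI elements (an assumption).
var items = [{}, {}, {}, {}, {}];

for (var i = 0; i < items.length; i++) {
  // The immediately invoked function creates a new scope on every iteration;
  // its parameter "index" is a private copy of i's value at that moment.
  (function (index) {
    items[index].onclick = function () {
      return index;
    };
  })(i);
}

// "Click" each item after the loop has finished.
var clicked = items.map(function (item) {
  return item.onclick();
});
console.log(clicked); // each item reports its own index, 0 through 4
```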

<h2>3. Use a separate function<br /></h2>

<p>If you don&#8217;t like using inline closures for whatever reason, defining a separate outside function achieves the same result. This approach might even be a bit cleaner and more reusable. <a href="http://jsfiddle.net/grizznant/BYYhX/" target="_blank">try&nbsp;it</a></p>
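
<p>Sketched under the same assumptions (plain objects instead of LI elements, and a hypothetical function name <code>makeClickHandler</code>), the separate-function version might look like this:</p>

```javascript
// Hypothetical named factory: each call gets its own "index" parameter,
// so the handler it returns remembers the value it was created with.
function makeClickHandler(index) {
  return function () {
    return index;
  };
}

// Plain-object stand-ins for the five LI elements (an assumption).
var items = [{}, {}, {}, {}, {}];
for (var i = 0; i < items.length; i++) {
  items[i].onclick = makeClickHandler(i);
}

// "Click" each item after the loop has finished.
var clicked = items.map(function (item) {
  return item.onclick();
});
console.log(clicked); // each item reports its own index, 0 through 4
```

<p>The mechanism is the same as in the previous approach; the function simply has a name and lives outside the loop, which makes it easy to reuse.</p>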

<p>What are some ways you have solved this type of problem? I&#8217;d love to hear from our readers to see how they might have done things differently!</p>

<p>For further reading on this topic, I found <a href="https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Closures#Creating_closures_in_loops.3A_A_common_mistake">this article</a> rather interesting.<br /></p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The other reason you need unique server_ids in MySQL replication]]></title>
    <link href="http://mechanics.flite.com/blog/2013/01/28/the-other-reason-you-need-unique-server-ids-in-mysql-replication/"/>
    <updated>2013-01-28T08:41:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/01/28/the-other-reason-you-need-unique-server-ids-in-mysql-replication</id>
    <content type="html"><![CDATA[<p>I recently encountered an error when two slave databases were using the same server_id value. I know having unique server_ids across all MySQL databases in a given replication topology is a best practice, but I&#8217;m not sure I understand exactly why MySQL is trying to enforce uniqueness in this specific case.</p>

<p>The most obvious reason to use unique server_id values for every database in a replication setup is to avoid having a master and slave with the same server_id. If that happens the slave will skip any events coming from that master.</p>

<p>But if I have two slaves with the same server_id, that should be safe unless I promote one of those slaves to be the master of the other. Is that the case that MySQL is trying to protect me from, or is there some other case I am missing?</p>

<!-- more -->


<p>My specific problem started when I cloned a slave database. This is something I do fairly often, but most of the time I am creating a new read slave, and I change its server_id to something unique. This time I was replacing an existing database, so I left the server_id the same. When I started the slave on the new copy of the database, I now had two slaves with the same server_id. It didn&#8217;t seem like a big deal at the time, because I knew I would be shutting down the old copy in the near future.</p>

<p>The first symptom I noticed was an explosion of relay logs on the new database. On a healthy slave database there should usually only be one or two relay logs, but this database had tens of thousands of them, all very small, and all very recent.</p>

<p>I looked into the MySQL error log to see what was going on, and saw the same error over and over again:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>130122 20:28:08 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'bin-log.000049' at position 340959933
</span><span class='line'>130122 20:28:08 [Note] Slave: received end packet from server, apparent master shutdown: 
</span><span class='line'>130122 20:28:08 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'bin-log.000049' at position 340959933
</span><span class='line'>130122 20:28:08 [Note] Slave: received end packet from server, apparent master shutdown: 
</span><span class='line'>130122 20:28:08 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'bin-log.000049' at position 340959933
</span><span class='line'>130122 20:28:08 [Note] Slave: received end packet from server, apparent master shutdown: 
</span><span class='line'>130122 20:28:08 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'bin-log.000049' at position 340959933
</span><span class='line'>130122 20:28:09 [Note] Slave: received end packet from server, apparent master shutdown: 
</span><span class='line'>130122 20:28:09 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'bin-log.000049' at position 340959933
</span><span class='line'>130122 20:28:09 [Note] Slave: received end packet from server, apparent master shutdown: 
</span><span class='line'>130122 20:28:09 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'bin-log.000049' at position 340959933
</span><span class='line'>130122 20:28:09 [Note] Slave: received end packet from server, apparent master shutdown: </span></code></pre></td></tr></table></div></figure>


<p>I did a search for that error message and found <a href="http://www.mysqlperformanceblog.com/2008/06/04/confusing-mysql-replication-error-message/">Peter&#8217;s post</a> from a few years ago which pointed me in the right direction.</p>

<p>So I changed the server_id on the old slave db, restarted it, and the errors went away.</p>

<p>In the future I&#8217;ll be more careful about this when cloning databases, but I have to say this seems like an odd way to enforce this restriction, and not a very descriptive error message. Even though the master was constantly sending end packets to the slave and trying to make it go away, the slave continued to reconnect and process updates from the master, although it certainly did so inefficiently.</p>

<p>I&#8217;ve learned my lesson for the future, but I&#8217;m still left with two questions:</p>

<ol>
<li>Why is MySQL enforcing unique server_ids for slave databases?</li>
<li>Can we get a more descriptive error message when this situation occurs?</li>
</ol>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Using the ExternalInterface Class to Bridge JavaScript and Flash]]></title>
    <link href="http://mechanics.flite.com/blog/2013/01/14/using-the-externalinterface-class-to-bridge-javascript-and-flash/"/>
    <updated>2013-01-14T21:10:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/01/14/using-the-externalinterface-class-to-bridge-javascript-and-flash</id>
    <content type="html"><![CDATA[<p>At <a href="http://www.flite.com">Flite</a> the core of our <a href="http://www.flite.com/platform/">Desktop Ad product</a> is a SWF file that pulls in data and the structure of an ad in order to put it together on the client. In previous versions, the SWF file contained most of the code to interface with our servers. Recently, we built a <a href="http://www.flite.com/platform/touch-ad-studio/">Touch Ad product</a> that was entirely HTML5, and with it a new JavaScript API was created to talk to our servers. Instead of maintaining two versions of the API code, we decided to go with a pure JavaScript implementation and used Flash&#8217;s native ExternalInterface class to call into the JavaScript API.</p>

<p>The ExternalInterface class is extremely handy, but troubleshooting your implementation can prove to be difficult. Here are some helpful tips for using the ExternalInterface class that I&#8217;ve picked up along the way.</p>

<!-- more -->


<h2>Basic examples of using ExternalInterface</h2>

<p>The most useful function of the ExternalInterface class is <code>call</code>. A typical use case is to execute existing JavaScript that might be on the page. But a more exciting use case is to execute your own JavaScript functions, defined in ActionScript code. Anonymous functions are great for one-time use. For example, opening a predetermined URL:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="s2">&quot;function(){window.location.href = &#39;http://flite.com&#39;;}&quot;</span><span class="p">);</span>
</span></code></pre></td></tr></table></div></figure>


<p><code>ExternalInterface.call</code> also allows you to pass any number of parameters to JavaScript functions. The first argument is the function (passed as a String), followed by the parameters to be passed to the function. For example, you can get <code>trace</code>-style output on machines without the Flash Debugger version installed by using ExternalInterface to log to your browser&#8217;s JavaScript debugging console instead of the default location.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="s2">&quot;console.log&quot;</span><span class="o">,</span> <span class="nx">myVar</span><span class="o">,</span> <span class="nx">myVar2</span><span class="p">);</span>
</span></code></pre></td></tr></table></div></figure>


<p>For times when you need to make repeated calls to the same function, improve your performance by using static constants.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="kd">private</span> <span class="kd">static</span> <span class="kd">const</span> <span class="nx">ALERT_FN</span><span class="o">:</span><span class="nb">String</span> <span class="o">=</span> <span class="s2">&quot;function(msg){alert(msg);}&quot;</span><span class="o">;</span>
</span><span class='line'><span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="nx">ALERT_FN</span><span class="o">,</span> <span class="s2">&quot;ding!&quot;</span><span class="p">);</span>
</span></code></pre></td></tr></table></div></figure>


<p>As your code gets more complex, I suggest breaking it up across several lines to make it readable.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="kd">private</span> <span class="kd">static</span> <span class="kd">const</span> <span class="nx">ALERT_FN</span><span class="o">:</span><span class="nb">String</span> <span class="o">=</span> <span class="s2">&quot;function(msg){&quot;</span> <span class="o">+</span>
</span><span class='line'>    <span class="s2">&quot;alert(msg);&quot;</span> <span class="o">+</span>
</span><span class='line'><span class="s2">&quot;}&quot;</span><span class="o">;</span>
</span></code></pre></td></tr></table></div></figure>


<h2>How to Debug your JavaScript</h2>

<p>Make your code debuggable by adding &#8220;debugger;&#8221; statements. The newline characters can help with formatting of the code when the breakpoint is hit in the debugging console. Most developer tools &#8211; such as <a href="http://stackoverflow.com/questions/4484407/human-readable-javascripts-in-chrome-developer-tools">Chrome&#8217;s Web Inspector</a> &#8211; allow you to format JavaScript, so the newline characters may not be needed.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="kd">private</span> <span class="kd">static</span> <span class="kd">const</span> <span class="nx">ALERT_FN</span><span class="o">:</span><span class="nb">String</span> <span class="o">=</span> <span class="s2">&quot;function(msg){\n&quot;</span> <span class="o">+</span>
</span><span class='line'>    <span class="s2">&quot;debugger;\n&quot;</span> <span class="o">+</span>
</span><span class='line'>    <span class="s2">&quot;alert(msg);\n&quot;</span> <span class="o">+</span>
</span><span class='line'><span class="s2">&quot;}&quot;</span><span class="o">;</span>
</span></code></pre></td></tr></table></div></figure>


<p>Improve performance and your ability to debug your code even further by injecting complex functions onto the page in an initialization function. By doing this, you can also execute your injected JavaScript functions directly in the browser&#8217;s debugging console. I recommend creating a namespace object on the <code>window</code> that is unique to your SWF, so that you don&#8217;t overwrite existing code in JavaScript.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="kd">private</span> <span class="kd">static</span> <span class="kd">const</span> <span class="nx">MYFUNC_FN</span><span class="o">:</span><span class="nb">String</span> <span class="o">=</span> <span class="s2">&quot;window.MY_NAMESPACE.myFunc&quot;</span><span class="o">;</span>
</span><span class='line'><span class="kd">private</span> <span class="kd">static</span> <span class="kd">const</span> <span class="nx">INIT_JS</span><span class="o">:</span><span class="nb">String</span> <span class="o">=</span> <span class="s2">&quot;window.MY_NAMESPACE = window.MY_NAMESPACE || {};&quot;</span> <span class="o">+</span>
</span><span class='line'><span class="nx">MYFUNC_FN</span> <span class="o">+</span> <span class="s2">&quot; = function(arg1, arg2){ /*do a lot of things*/}&quot;</span><span class="o">;</span>
</span><span class='line'>
</span><span class='line'><span class="kd">public</span> <span class="kd">function</span> <span class="nx">MyConstructor</span><span class="p">(){</span>
</span><span class='line'>     <span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="nx">INIT_JS</span><span class="p">);</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="kd">private</span> <span class="kd">function</span> <span class="nx">doMyFunc</span><span class="p">(</span><span class="nx">p1</span><span class="o">:</span><span class="nb">String</span><span class="o">,</span> <span class="nx">p2</span><span class="o">:</span><span class="nb">Boolean</span><span class="p">)</span> <span class="o">:</span> <span class="nx">void</span> <span class="p">{</span>
</span><span class='line'>     <span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="nx">MYFUNC_FN</span><span class="o">,</span> <span class="nx">p1</span><span class="o">,</span> <span class="nx">p2</span><span class="p">);</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<h2>From JavaScript to ActionScript</h2>

<p>Beyond just executing JavaScript, Flash can also read the values returned from JavaScript. You could use JavaScript to generate random numbers, for example:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="kd">private</span> <span class="kd">function</span> <span class="nx">getJSRandom</span><span class="p">()</span> <span class="o">:</span> <span class="nb">Number</span> <span class="p">{</span>
</span><span class='line'>     <span class="k">return</span> <span class="nb">Number</span><span class="p">(</span><span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="s2">&quot;Math.random&quot;</span><span class="p">));</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>Avoid passing objects that reference non-primitive types (e.g. DOM nodes, functions, etc.). JavaScript is designed to work with the DOM, which is full of circular references. For example, a DOM node has a property referencing its parent element, and the parent also has references to its children. When passing values back and forth between Flash and JavaScript, the arguments and return values are serialized, and the circular references will bring Flash to its knees.</p>

<p>For example, the first code block will <em>not</em> work, while the second one will:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="k">var</span> <span class="nx">el</span><span class="o">:</span><span class="nb">Object</span> <span class="o">=</span> <span class="nb">Object</span><span class="p">(</span><span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="s2">&quot;document.getElementById(&#39;container&#39;)&quot;</span><span class="p">));</span>
</span><span class='line'><span class="nf">trace</span><span class="p">(</span><span class="nx">el</span><span class="p">.</span><span class="nx">nodeName</span><span class="p">);</span>
</span></code></pre></td></tr></table></div></figure>




<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="k">var</span> <span class="nx">nodeName</span><span class="o">:</span><span class="nb">String</span> <span class="o">=</span> <span class="nb">String</span><span class="p">(</span><span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="s2">&quot;function(){return document.getElementById(&#39;container&#39;).nodeName;}&quot;</span><span class="p">));</span>
</span><span class='line'><span class="nf">trace</span><span class="p">(</span><span class="nx">nodeName</span><span class="p">);</span>
</span></code></pre></td></tr></table></div></figure>
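
<p>One way to stay on the safe side of that serialization step, sketched here in plain JavaScript, is to copy only the primitive fields you need into a fresh object before handing it back to Flash. The helper name <code>toPlainObject</code> and the stand-in node below are hypothetical and exist only for illustration; on a real page you would pass an actual DOM element.</p>

```javascript
// Hypothetical helper: copy only primitive fields (string, number, boolean)
// from a source object so the result has no circular references.
function toPlainObject(source, fields) {
  var result = {};
  fields.forEach(function (name) {
    var value = source[name];
    var type = typeof value;
    if (type === "string" || type === "number" || type === "boolean") {
      result[name] = value;
    }
  });
  return result;
}

// Stand-in for a DOM node: parent and child reference each other.
var parent = { nodeName: "BODY", children: [] };
var fakeNode = { nodeName: "DIV", id: "container", parentNode: parent };
parent.children.push(fakeNode);

// parentNode is an object, so it is skipped; only the primitives survive.
var safe = toPlainObject(fakeNode, ["nodeName", "id", "parentNode"]);
console.log(JSON.stringify(safe));
```

<p>The resulting object can be serialized without trouble because nothing in it points back to the DOM.</p>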


<p>And finally, if the need arises, you can also call into your ActionScript code from JavaScript by using the <code>ExternalInterface.addCallback</code> function to register an ActionScript function. JavaScript can also use the values returned by your ActionScript functions.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='as'><span class='line'><span class="c1">//AS3 Code</span>
</span><span class='line'><span class="k">var</span> <span class="nx">doSomething</span><span class="o">:</span><span class="nb">Function</span> <span class="o">=</span> <span class="kd">function</span><span class="p">(</span><span class="nx">arg1</span><span class="p">)</span> <span class="o">:</span> <span class="nx">void</span><span class="p">{</span>
</span><span class='line'>    <span class="c1">//do something here</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'><span class="nb">ExternalInterface</span><span class="p">.</span><span class="nx">addCallback</span><span class="p">(</span><span class="s2">&quot;myASFunc&quot;</span><span class="o">,</span> <span class="nx">doSomething</span><span class="p">);</span>
</span></code></pre></td></tr></table></div></figure>


<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="c1">//JS Code</span>
</span><span class='line'><span class="kd">var</span> <span class="nx">myObj</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nx">getElementById</span><span class="p">(</span><span class="s2">&quot;my-object-tag&quot;</span><span class="p">);</span>
</span><span class='line'><span class="k">if</span><span class="p">(</span><span class="nx">myObj</span><span class="p">.</span><span class="nx">myASFunc</span><span class="p">)</span> <span class="p">{</span>
</span><span class='line'>   <span class="nx">myObj</span><span class="p">.</span><span class="nx">myASFunc</span><span class="p">(</span><span class="s2">&quot;foo&quot;</span><span class="p">);</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>
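
<p>The existence check in the JavaScript above matters because a callback is only available once the SWF has loaded and registered it. If you make such calls in several places, a small wrapper can keep the check in one spot. This is only a sketch: <code>callFlash</code> is a hypothetical helper, and the stand-in object below takes the place of the real object tag so the snippet runs outside a browser.</p>

```javascript
// Hypothetical helper: call a Flash-registered callback with one argument
// if it exists, otherwise return a fallback value instead of throwing.
function callFlash(target, name, arg, fallback) {
  if (!target) { return fallback; }
  if (!target[name]) { return fallback; }
  return target[name](arg);
}

// Stand-in for the object tag; a real page would use
// document.getElementById("my-object-tag").
var fakeSwf = {
  myASFunc: function (value) { return "got " + value; }
};

var registered = callFlash(fakeSwf, "myASFunc", "foo", null);
var missing = callFlash(fakeSwf, "notRegistered", "foo", "n/a");
console.log(registered, missing);
```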


<p>For more on Flash&#8217;s ExternalInterface class, see Adobe&#8217;s API documentation <a href="http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/external/ExternalInterface.html">here</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A simple way to make MySQL replication more crash-safe]]></title>
    <link href="http://mechanics.flite.com/blog/2013/01/07/a-simple-way-to-make-mysql-replication-more-crash-safe/"/>
    <updated>2013-01-07T14:55:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2013/01/07/a-simple-way-to-make-mysql-replication-more-crash-safe</id>
    <content type="html"><![CDATA[<p>I recently discovered the <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-binary-log.html#sysvar_sync_binlog">sync_binlog</a>, <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-slave.html#sysvar_sync_relay_log">sync_relay_log</a>, <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-slave.html#sysvar_sync_master_info">sync_master_info</a>, and <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-slave.html#sysvar_sync_relay_log_info">sync_relay_log_info</a> system variables in MySQL, and am using them to make my MySQL replication more crash-safe.</p>

<p>Here&#8217;s the problem that inspired me to make this change.</p>

<p>It started when one of our passive MySQL master database hosts restarted unexpectedly. The host came back up fairly quickly, and MySQL started up cleanly once InnoDB finished its crash recovery.</p>

<p>However, all of this master&#8217;s slave databases were in a failed state with the same error:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>Last_IO_Errno: 1236
</span><span class='line'>Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position'</span></code></pre></td></tr></table></div></figure>




<!-- more -->


<p>Apparently some replication events were lost from the end of the binary log when the host went down, so the slave was asking for a position that no longer existed. Since this was a passive master we didn&#8217;t lose any data (the relevant events were still on the active master and came through in the next binary log on the passive master), but replication from this master was temporarily broken.</p>

<p>I manually rolled all of the slaves forward to the appropriate binary log using <code>CHANGE MASTER</code>, but they were behind for several minutes before I did that.</p>

<p>Digging through the configuration on the master I discovered that sync_binlog was disabled, as it is by default in MySQL 5.5. Here&#8217;s how that option is explained in the <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-binary-log.html#sysvar_sync_binlog">manual</a>:</p>

<blockquote><p>If the value of this variable is greater than 0, the MySQL server synchronizes its binary log to disk (using fdatasync()) after every sync_binlog writes to the binary log. There is one write to the binary log per statement if autocommit is enabled, and one write per transaction otherwise. The default value of sync_binlog is 0, which does no synchronizing to disk—in this case, the server relies on the operating system to flush the binary log&#8217;s contents from time to time as for any other file. A value of 1 is the safest choice because in the event of a crash you lose at most one statement or transaction from the binary log. However, it is also the slowest choice (unless the disk has a battery-backed cache, which makes synchronization very fast).</p></blockquote>

<p>That&#8217;s a pretty good description of the trade-offs involved, and given my write volume I figured it would be worth it for us to enable that option. I did some testing to verify that our masters can handle the extra IO overhead and then enabled sync_binlog on all master databases where it was disabled. Now replication will be more likely to resume automatically after a master failure.</p>

<p>I also updated all of our slave databases to enable sync_relay_log and sync_relay_log_info so if a slave db restarts it&#8217;s more likely that replication will resume automatically. As expected, the manual enumerates the same &#8220;safest&#8221; and &#8220;slowest&#8221; arguments for <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-slave.html#sysvar_sync_relay_log">sync_relay_log</a>.</p>

<p>The manual is less clear about how to set <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-slave.html#sysvar_sync_master_info">sync_master_info</a> and <a href="http://dev.mysql.com/doc/refman/5.5/en/replication-options-slave.html#sysvar_sync_relay_log_info">sync_relay_log_info</a>. It recommends <em>disabling</em> sync_master_info and <em>enabling</em> sync_relay_log_info, even though both are disabled by default. And it doesn&#8217;t explain the reasoning behind those recommendations. I guess that&#8217;s left as an exercise for the reader :)</p>

<p>In my opinion, if you enable sync_relay_log it makes sense to enable sync_master_info and sync_relay_log_info, too.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Underappreciated NPM Commands Developers Should Know About]]></title>
    <link href="http://mechanics.flite.com/blog/2012/12/11/underappreciated-npm-commands/"/>
    <updated>2012-12-11T15:10:00-05:00</updated>
    <id>http://mechanics.flite.com/blog/2012/12/11/underappreciated-npm-commands</id>
    <content type="html"><![CDATA[<p><a href="http://www.npmjs.org">npm</a>, the package manager that comes bundled with <a href="http://www.nodejs.com">node.js</a>, is generally described as &#8220;awesome&#8221;. While it was originally a separate project, npm has been bundled with node.js for the past few releases and the number of npm packages <a href="http://www.npmjs.org">available in the npm registry</a> has exploded.</p>

<p>While there’s a well-documented workflow for interacting with npm packages (<code>npm install</code>, <code>npm test</code>, <code>npm publish</code>, rinse and repeat), there are several powerful npm commands that are very useful but less commonly used.
 <!-- more --></p>

<h2>1. The power of npm scripts</h2>

<p>It’s generally known that the <code>scripts</code> member of a <code>package.json</code> file allows developers to specify actions that <code>npm</code> can execute. For instance, if you have a <code>scripts</code> member in your package.json file that looks like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="s2">&quot;scripts&quot;</span><span class="o">:</span> <span class="p">{</span>
</span><span class='line'>    <span class="s2">&quot;start&quot;</span><span class="o">:</span> <span class="s2">&quot;node server.js&quot;</span><span class="p">,</span>
</span><span class='line'>    <span class="s2">&quot;test&quot;</span><span class="o">:</span> <span class="s2">&quot;grunt test&quot;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>The command <code>npm start</code> will execute <code>node server.js</code> and <code>npm test</code> will execute the <code>grunt test</code> command (or any other shell script). It’s very useful for defining a common interface to start and test node.js packages across projects.</p>

<p>What’s less well-known, however, is that the <code>scripts</code> member supports over <a href="https://npmjs.org/doc/scripts.html">a dozen other actions</a> that can execute before and after other <code>npm</code> commands. For example:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="s2">&quot;scripts&quot;</span><span class="o">:</span> <span class="p">{</span>
</span><span class='line'>    <span class="s2">&quot;preinstall&quot;</span><span class="o">:</span> <span class="s2">&quot;./bin/custom-script.sh&quot;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>Before <code>npm install</code> is run, the <code>./bin/custom-script.sh</code> file is executed. If the custom script returns a non-zero exit code, the <code>npm install</code> command aborts.</p>
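
<p>The script does not have to be a shell script. As a rough sketch, here is what a Node.js version check might look like if you pointed <code>preinstall</code> at something like <code>node ./bin/check-node.js</code>. The file name, the <code>exitCodeFor</code> helper and the minimum version are all assumptions made up for this example.</p>

```javascript
// Hypothetical preinstall check: succeed (exit code 0) only when the
// running Node.js major version is at least minMajor.
function exitCodeFor(nodeVersion, minMajor) {
  var major = parseInt(nodeVersion.split(".")[0], 10);
  return major >= minMajor ? 0 : 1;
}

var MIN_MAJOR = 4; // assumed requirement, pick your own
var code = exitCodeFor(process.versions.node, MIN_MAJOR);
if (code !== 0) {
  console.error("Node.js " + MIN_MAJOR + " or newer is required");
}
// A non-zero exit code is what makes npm abort the install.
process.exitCode = code;
```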

<p>Other actions include <code>postinstall</code>, <code>poststart</code>, <code>prepublish</code> and <code>pretest</code>.</p>

<p>This is very nice for CoffeeScript projects. You can define a <code>prepublish</code> script that compiles your <code>*.coffee</code> files to <code>*.js</code> and your app is ready to run in node.js. All users have to do is run <code>npm install</code> as they normally would.</p>

<p>Alternatively, you could have a <code>prestart</code> script that uses <a href="http://github.com/gruntjs">grunt</a> to minify and concatenate <code>*.js</code> files before starting the web server. This is particularly useful for keeping minified and optimized resources out of source control, not to mention making it easier to deploy to cloud providers like Heroku and Nodejitsu.</p>

<h2>2. Faster package development with <code>npm link</code></h2>

<p><code>npm link</code> is an essential tool when developing npm packages that you’d like to test in other projects on your system. It creates a globally-installed symlink for a package folder and then allows you to use that globally-installed symlink in another project. As you change the linked package, the other package gets the changes instantly. It’s great for package development.</p>

<p>As the <a href="https://npmjs.org/doc/link.html">documentation states</a>, it’s a two step process:</p>

<ul>
<li>Run <code>npm link</code> in a package folder (typically the package you’re developing).</li>
<li>In another package folder (typically the one you want to test with your in-development package) run <code>npm link &lt;your development package name&gt;</code>.</li>
</ul>


<p>That’s it.</p>

<h2>3. Keeping track of old dependencies with <code>npm outdated</code></h2>

<p>The npm package world moves very fast. It’s easy to write a new package and then have its dependencies get out-of-date. The <a href="https://npmjs.org/doc/outdated.html"><code>npm outdated</code> command</a> does exactly what it says &#8211; it reads your <code>package.json</code> file and tells you which of your dependencies have new versions.</p>

<p>It’s a simple idea but one that I’ve tried to work into my development workflow more often. It&#8217;s a good sanity check to make sure I have fairly recent versions of my package’s dependencies installed.</p>

<h2>4. Know your npm aliases</h2>

<p>How many times have you typed <code>npm install</code> after cloning a git repository? That’s an entire 11 characters! Fortunately, there’s a shortcut: <code>npm i</code> does exactly the same thing.</p>

<p>The list of aliases is longer than you might expect &#8211; there are no fewer than five ways you can <code>uninstall</code> a package: <code>npm rm</code>, <code>npm unlink</code>, <code>npm r</code>, <code>npm remove</code> and <code>npm un</code>.</p>

<p>There’s even a helpful <code>npm isntall</code> alias that maps to <code>npm install</code>.</p>

<p>Here’s the <a href="https://github.com/isaacs/npm/blob/390cc4097caa7516949bcdf9c384204a31c7380d/lib/npm.js#L77">complete list</a>.</p>

<h2>5. Shell completion with <code>npm completion</code></h2>

<p>npm aliases are nice, but what’s even better than aliases is <em>not typing text at all</em>. <code>npm completion</code> is one of those commands I wish I had known about months ago &#8211; it prints a shell script snippet that adds shell completion for npm commands.</p>

<p>For example, typing <code>npm i&lt;tab&gt;</code> then gives you all of the wonderful npm command possibilities that start with i:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='sh'><span class='line'>sh <span class="nv">$ </span>npm i <span class="o">[</span>press tab<span class="o">]</span>
</span><span class='line'>i        info     init     install
</span></code></pre></td></tr></table></div></figure>


<p>To install npm shell completion, it’s just a matter of adding the script <code>npm completion</code> outputs <a href="https://npmjs.org/doc/completion.html">to your shell init file</a>. Or, even better, add it to <a href="http://mechanics.flite.com/blog/2012/07/24/better-dev-environments-with-vagrant-and-dotfiles/">your dotfiles repository</a>.</p>

<p>npm has great man pages, and with autocompletion it’s even easier to learn new interesting commands.</p>

<h2>Bonus: Happy Holidays from npm!</h2>

<p>Finally, perhaps the one command every node.js developer should know about: <code>npm xmas</code></p>

<p>Open your terminal and run this command immediately.</p>
]]></content>
  </entry>
  
</feed>
