Search This Blog

Thursday, August 25, 2011

Keeping Your Database “Fresh” Using MERGE Engine

Keeping Your Database “Fresh” Using MERGE Engine:

Problem Statement:

As you store more and more records in your tables, your database becomes slower and slower. if you do not have a good archiving strategy and implementation, chances are your Web applications will slow down over time due to time consuming data scans in large tables. Don’t you wish your database looked exactly as new as it looked the day you turned it on live?

Good news! There is a way you can achieve a “fresh” feel for your database using the MERGE table engine. In this article we will discuss how. The best part of this solution is that you do not have to change your Web application a single line to take advantage of this hack!


The best way to describe this solution would be to use an example scenario. Assume that you have a MySQL table that stores order information. Every day new orders are being inserted into this table. If you run a large e-commerce site, you would have a huge order table in your hand in a year or so. As order table gets larger, your Web application gets incrementally slower and slower.

You could solve this problem by removing or archiving the order information to a different table and change  your Web application responsible for showing order info to handle both tables in the code. This is a very desirable solution as the burden of database size issue falls on the Web application and not the database itself.

What would be ideal is a solution which allows you to empty the order table on a schedule — say daily, weekly, monthly, quarterly or even yearly — and have no change in your Web apps. Imagine if you could empty your order table every night and have all the old records still accessible as easily as before without changing a single line of Web application! Wow, that would be great. Enter MERGE Engine from MySQL!

MySQL Merge Table

Benefits of MERGE Engine

When you recreate a table to exists as a union of two or more merged tables using the MySQL MERGE engine type, you can direct all your INSERT traffic to a table in the union so that INSERT traffic completes less with your SELECT, UPDATE, DELETE traffic for older records. How much of the competition is reduced depends entirely on your applications.

Recreating your table with MERGE engine

MySQL MERGE table engine is designed to allow you just that. Here is how you can make it work:

  • Create two identical copies of your order table and call them order_live and order_archive

  • Create a new table called order with the same structure with ENGINE=MERGE UNION=(order_live,order_archive) INSERT_METHOD=FIRST; at the end of your create table statement

Now you should have order_live, order_archive and a merge table called order which is a union of the first two tables. Your Web application accesses the order table as usual. All inserts go to order table which automatically writes them into order_live (due to setting the INSERT_METHOD=FIRST option in the create table statement). This means your Web application continues to function exactly as before but new orders automatically enters into order_live table.

Setting up a cron job to move old data

Now setup a simple cron job that moves all records from order_live to order_archive using:

LOCK TABLE db_name.order_live WRITE;

REPLACE INTO db_name.order_archive SELECT * FROM db_name.order_live;
TRUNCATE db_name.order_live;
UNLOCK TABLE db_name.order_live;

This cron job locks the order_live table in db_name database so that no write operation is permitted during the execution of the cron job. It then inserts (or replaces if the records already exists) the order_live records into order_archive table. Once the data from order_live is copied into the order_archive table, the data in order_live is truncated. This frees up the order_live table. The lock on the order_live table is then released.

At this point, you have a fresh, empty order_live table ready to take the orders of tomorrow. The new orders will insert themselves into order_live via the merge interface provided by the order table and your older records are safely stored in the order_archive for other parts of your Web application to perform SELECT operation. Since your Web application only deal with the order table itself, there is no need for your application to know where the actual record resides.

For example, say you inserted an order with order_id set to 1000 today and it got moved to order_archive when the cron job ran. Now say for some reason your application need to perform a rare but needed update statement tomorrow on this record, it still will perform a query such as UPDATE order SET field=value WHERE order_id = 1000; without needing to know anything about the physical table location of the record. Beautiful, isn’t it?

Caveats of MERGE table

Nothing is ever free from caveats; sorry. The major caveats that we found when using MERGE table is related to altering merge table fields. Say you want to add a new field to your order table, you will have to add the fields in all the member tables and then recreate the merge table itself.

1 comment: