As applications mature, tables tend to grow infinitely,
specially the time series ones. They have tens of millions of rows and run up to
a few TBs in size. Even though the tables have so much data, applications only
need to access data that was recently saved. (Say within, the last year or so).
A trivial task of adding a new index or changing the type of a column for such
a humungous table, becomes a very painful task. I was charged with such a task
recently and that got me thinking about archiving strategies.
Archiving is a good practice to keep the data to a required
working set, while at the same time, maintaining the data that might be needed
for legal or compliance reasons, or for certain less frequent workflows in the application,
like looking at the 10 year purchase history of a customer.
Though archiving seems like a database problem, it cannot be
done in vacuum. It needs buy in from application, legal and compliance to know
what the archiving boundaries are, what are the access patterns and
availability requirements for the archived data.
But solving for the database problem is something right up
my alley, so I thought of testing a few approaches. One of my favorites for the
ease of use is the "EXCHANGE PARTITION"
feature for 5.7.
In MySQL 5.7, it is possible to exchange a table partition
or subpartition with a table using –
ALTER
TABLE pt EXCHANGE PARTITION p WITH TABLE nt
where
pt is the partitioned table
p is the partition of pt to be exchanged
and nt is a non partitioned table.
Privileges
needed for the statement is the combination of ALTER
TABLE and
TRUNCATE TABLE on both tables.
This provides a great opportunity for archival of
partitioned tables.
Consider an online retail store, that logs its invoices or
orders in a table. Here is an over simplified invoice table for the store.
CREATE TABLE `invoice` (
`id` int(10) unsigned NOT NULL
AUTO_INCREMENT,
`invoice_num` int(10) unsigned NOT NULL,
`stockcode` int(10) unsigned NOT NULL,
`invoice_date` datetime NOT NULL,
`price` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`id`,`invoice_date`)
) ENGINE=InnoDB
PARTITION BY RANGE
(YEAR(invoice_date))
(PARTITION p2009 VALUES
LESS THAN (2010) ENGINE = InnoDB,
PARTITION p2010 VALUES LESS THAN (2011) ENGINE
= InnoDB,
PARTITION p2011 VALUES LESS THAN (2012) ENGINE
= InnoDB,
PARTITION p2012 VALUES LESS THAN (2013) ENGINE
= InnoDB,
PARTITION p2013 VALUES LESS THAN (2014) ENGINE
= InnoDB,
PARTITION p2014 VALUES LESS THAN (2015) ENGINE
= InnoDB,
PARTITION p2015 VALUES LESS THAN (2016) ENGINE
= InnoDB,
PARTITION p2016 VALUES LESS THAN (2017) ENGINE
= InnoDB,
PARTITION p2017 VALUES LESS THAN (2018) ENGINE
= InnoDB,
PARTITION p2018 VALUES LESS THAN (2019) ENGINE
= InnoDB,
PARTITION pMAX VALUES LESS THAN MAXVALUE
ENGINE = InnoDB)
The queries on this table only need data from a year ago, to
be available for queries, however for compliance and other non-frequent
workflow we need to maintain older data. The EXCHANGE
PARTITION gives us an opportunity to archive partitions, with quick DDL
operations.
Consider the need to archive 2010 data, which is in p2010 partition.
mysql> select count(*)
from invoice PARTITION (p2010);
+----------+
| count(*) |
+----------+
| 1111215 |
+----------+
1 row in set (0.38 sec)
To use EXCHANGE PARTITION
the table to be partitioned pt and the
non-partitioned table that the data will be archived into nt need to meet a few requirements -
-
The non-partitioned table needs to have the same structure
as the partitioned table.
-
It cannot be a temporary table
-
It cannot have foreign keys, or have any foreign keys
that refer to it.
-
There are no rows in nt
that lie beyond the boundaries of the partition p.
Lets create the table invoice_2010
which will archive all invoices from 2010.
mysql> create table
invoice_2010 like invoice;
Query OK, 0 rows affected
(0.44 sec)
mysql> alter table
invoice_2010 remove partitioning;
Query OK, 0 rows affected
(0.38 sec)
Records: 0 Duplicates: 0
Warnings: 0
mysql> show create
table invoice_2010 \G
Table: invoice_2010
Create Table: CREATE TABLE
`invoice_2010` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`invoice_num` int(10) unsigned NOT NULL,
`stockcode` int(10) unsigned NOT NULL,
`invoice_date` datetime NOT NULL,
`price` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`id`,`invoice_date`)
) ENGINE=InnoDB DEFAULT
CHARSET=utf8
1 row in set (0.00 sec)
Archiving the partition using EXCHANGE partition, can be
done as –
mysql> select count(*)
from invoice PARTITION (p2010);
+----------+
| count(*) |
+----------+
| 1111215 |
+----------+
1 row in set (0.38 sec)
mysql> ALTER TABLE invoice EXCHANGE PARTITION p2010 WITH TABLE
invoice_2010;
Query OK, 0 rows affected
(0.14 sec)
mysql> select count(*)
from invoice PARTITION (p2010);
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*)
from invoice_2010;
+----------+
| count(*) |
+----------+
| 1111215 |
+----------+
1 row in set (0.21 sec)
If needed the partition can also be moved back.
mysql> ALTER TABLE invoice EXCHANGE PARTITION p2010 WITH TABLE
invoice_2010;
Query OK, 0 rows affected
(0.73 sec)
mysql> select count(*)
from invoice_2010;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*)
from invoice PARTITION (p2010);
+----------+
| count(*) |
+----------+
| 1111215 |
+----------+
1 row in set (0.23 sec)
If any of the rows in the non partitioned table violate the
partitioning rule for the partition it is being exchanged with, you would get
an error.
mysql> update
invoice_2010 set invoice_date='2012-01-01' where id=10000009;
Query OK, 1 row affected
(0.17 sec)
Rows matched: 1 Changed: 1
Warnings: 0
mysql> ALTER TABLE invoice EXCHANGE PARTITION p2010 WITH TABLE
invoice_2010;
ERROR 1737 (HY000): Found
a row that does not match the partition
In its current implementation a row by row validation, does
a full table scan on the non partitioned table to evaluate if there is a row in
the table that violates the partitioning rule. From the open worklog, it seems
like there are plans to have the command use index instead of a full table scan
but it isn't implemented yet. A workaround to this is using WITHOUT VALIDATION.
To avoid time consuming validation when exchanging a
partition with a table that has many rows, it is possible to skip the
row-by-row validation step by appending WITHOUT
VALIDATION to the ALTER TABLE ... EXCHANGE
PARTITION statement. However with this the onus lies on the engineer to
verify that no partitioning rules are getting violated.
mysql> ALTER TABLE invoice EXCHANGE PARTITION p2010 WITH TABLE
invoice_2010 WITHOUT VALIDATION;
Query OK, 0 rows affected
(0.13 sec)
From a 20,000 ft view, the high level architecture for this involves
-
Taking a upgradable metadata lock on both tables
-
Verifying that the metadata matches (i.e both tables
have the same structure)
-
If WITHOUT VALIDATION is not used, verifying data in
the non partitioned table
-
Upgrading to a exclusive metadata lock for both tables
-
Rename non-partitioned table to partition and partition
to non-partitioned table.
-
Releasing the metadata locks
It would have been nice if it was possible to append to the non partitioned
table, rather than a exchange, but my guess is then it wouldn't be a metadata
operation. The same can be achieved by an exchange to an intermediate table, followed
by copy into the archive table.
As pertains to previous example, this would involve creating
a table invoice_2010_intermediate. Exchanging
p2010 from invoice table with invoice_2010_intermediate and doing a copy
from invoice_2010_intermediate to invoice_2010.
But for what it does, I think it is a delightful approach to
archiving data that you still need on your database server, for reading and
archiving, but it doesn't change.
Resources :
Script for populating data into invoice table - https://gist.github.com/dontstopbelieveing/3c42338c8f5c756a526ab2f7bef5525e