Tuesday, October 15, 2019

MySQL storing time

Storing time in databases seems like a daunting task to many. The general principle with storing time in databases is - ALWAYS STORE TIME IN UTC. However for a lot of people this raises questions.. but my application expects time to be in US/Pacific, or my database server's time zone in PDT or most of my users are in US/Eastern. Some also ask that the global setting for MySQL variable time_zone be set to US/Pacific or US/Eastern to match their requirement.

From a DBA's perspective, I think 99% use cases meet the ALWAYS STORE TIME IN UTC guideline, and for 1% that don't an understanding of the right way of storing and reading/writing time is necessary to ensure that you are doing the right thing.

MySQL's datetime data types


MySQL provides two data types to store time - TIMESTAMP and DATETIME

TIMESTAMP stores time as unix timestamp, or seconds since epoch, regardless of what the time zone is.

DATETIME simply takes the date and stores it as a string like representation as YYYYMMDDHHmmSS

While this difference might not be very evident while writing data, it becomes clearer while reading data.

TIMESTAMP


Let's take an example of a table with TIMESTAMP to store date and time. My time_zone variables has the global setting 'UTC'


CREATE TABLE `test_timestamp` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `time_created` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Inserting data into the table.

mysql> insert into test_timestamp (name, time_created) values ('a','2019-10-09 21:51:00');
Query OK, 1 row affected (0.01 sec)

Reading data.

mysql> select * from test_timestamp;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | a    | 2019-10-09 21:51:00 |
+----+------+---------------------+
1 row in set (0.00 sec)


So far the difference isn't quite clear, since we wrote the data in UTC and are reading it in UTC.

Let's change the time zones around a little and read the same data.

mysql> set time_zone='US/Pacific';

mysql> select * from test_timestamp;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | a    | 2019-10-09 14:51:00 |
+----+------+---------------------+
1 row in set (0.00 sec)

When the data is retrieved it is presented in the time zone, the session value of time_zone variable is set to. Well, what about if I write in one time zone (other than UTC) and read it in another.

mysql> set time_zone='US/Pacific';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test_timestamp (name, time_created) values ('b','2019-10-09 14:57:00');
Query OK, 1 row affected (0.01 sec)

mysql> select * from test_timestamp;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | a    | 2019-10-09 14:51:00 |
|  2 | b    | 2019-10-09 14:57:00 |
+----+------+---------------------+
2 rows in set (0.00 sec)

Lets switch back to UTC and read it.

mysql> set time_zone='UTC';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test_timestamp;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | a    | 2019-10-09 21:51:00 |
|  2 | b    | 2019-10-09 21:57:00 |
+----+------+---------------------+
2 rows in set (0.00 sec)

How about reading it in US/Eastern ?

mysql> set time_zone='US/Eastern';
Query OK, 0 rows affected (0.01 sec)

mysql> select * from test_timestamp;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | a    | 2019-10-09 17:51:00 |
|  2 | b    | 2019-10-09 17:57:00 |
+----+------+---------------------+
2 rows in set (0.00 sec)

TL;DR

  • TIMESTAMP stores data in the format of seconds since epoch or unix time.
  • If you want the time you store to follow the time zone of user or application server, use TIMESTAMP
  • Change the time zone via session value of time_zone, rather than setting the global value
  • If it is easier, you can add the time_zone setting to your connection parameters or even if you expect it to be global to your init_connect.

DATETIME


Now let's take a look at what happens with DATETIME

CREATE TABLE `test_datetime` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `time_created` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Writing data in US/Pacific time zone

mysql> set time_zone='US/Pacific';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test_datetime (name, time_created) values ('c','2019-10-09 15:59');
Query OK, 1 row affected (0.01 sec)

mysql> select * from test_datetime;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | c    | 2019-10-09 15:59:00 |
+----+------+---------------------+
1 row in set (0.00 sec)

Now let's try to read it in UTC.

mysql> set time_zone='UTC';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test_datetime;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | c    | 2019-10-09 15:59:00 |
+----+------+---------------------+
1 row in set (0.00 sec)

The data does not follow the time zone change. How about if we insert data in UTC  and try to read it in other time zones?

mysql> set time_zone='UTC';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test_datetime (name, time_created) values ('d','2019-10-09 22:00');
Query OK, 1 row affected (0.01 sec)

mysql> select * from test_datetime;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | c    | 2019-10-09 15:59:00 |
|  2 | d    | 2019-10-09 22:00:00 |
+----+------+---------------------+
2 rows in set (0.00 sec)

mysql> set time_zone='US/Pacific';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test_datetime;
+----+------+---------------------+
| id | name | time_created        |
+----+------+---------------------+
|  1 | c    | 2019-10-09 15:59:00 |
|  2 | d    | 2019-10-09 22:00:00 |
+----+------+---------------------+
2 rows in set (0.00 sec)

The time value you stored always remains what you stored regardless of time zone changes while reading or writing it.

TL; DR

  • DATETIME stores the date time value as a string like literal YYYYMMDDHHmmSS
  • No matter what the time zone changes are while reading or writing the data you will always see what you stored. 

What to use when



  • Use TIMESTAMP when you want the timestamp to follow time zone changes of the application users. Examples of this are - timestamps for log events, or a created at / modified at timestamp, timestamps for chat messages or comments etc., notifications 
  • Use DATETIME when you don't want the timestamp to follow time zone changes. Examples of this are - Date of Birth

If using these two data types right, with setting time_zone on the session level cannot take care of your time zone issues, then and only then explore changing the time_zone variable globally.

On a lighter note, If you are in the business of using TIMESTAMP, have you heard of the Year 2038 problem? Nothing to freak out about yet, we are only 19 years away :) Anyone having Y2K flashbacks?







Tuesday, October 1, 2019

Quick fix for Replication Error "but the master has purged binary logs containing GTIDs"

I ran into this error recently, and wanted to share a quick fix.

This is what the problem looks like -

mysql> show slave status \G
..
              Master_Log_File: mysql_binary_log.000382
          Read_Master_Log_Pos: 341467211
               Relay_Log_File: mysql_relay_log.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql_binary_log.000382
             Slave_IO_Running: No
            Slave_SQL_Running: Yes

        Seconds_Behind_Master: NULL
..
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0

..

Checking the binary on the master

mysql> show binary logs;
+-------------------------+------------+
| Log_name                | File_size  |
+-------------------------+------------+
| mysql_binary_log.000388 | 1073741987 |
| mysql_binary_log.000389 | 1073742774 |
| mysql_binary_log.000390 | 1074106024 |
| mysql_binary_log.000391 | 1073748407 |
| mysql_binary_log.000392 |  120633755 |
+-------------------------+------------+
5 rows in set (0.00 sec)

As we can see that the binary log mysql_binary_log.000382 has indeed been purged.
However I have a script that backups binary logs before it purges them. Since the database is in order of TBs, I will attempt to recover from this using those, rather than rebuild the database using a backup. 

Copy the binary logs from the backup location to MySQL's binary log directory.

cp mysql_binary_log.00038* /mysql_binlog_dir

Be sure to change the ownership to mysql user

chown -R mysql:mysql /mysql_binlog_dir

Edit mysql_binary.index and add these files to it.

mysql_binary_log.000382
mysql_binary_log.000383
..
mysql_binary_log.000392

The logs still won't show up on MySQL. Lets try FLUSH LOGS;

mysql> flush logs;
Query OK, 0 rows affected (0.02 sec)

mysql> show binary logs;
+-------------------------+------------+
| Log_name                | File_size  |
+-------------------------+------------+
| mysql_binary_log.000382 | 1073897047 |
| mysql_binary_log.000383 | 1073860223 |
| mysql_binary_log.000384 | 1073742372 |
| mysql_binary_log.000385 | 1075249248 |
| mysql_binary_log.000386 | 1074009237 |
| mysql_binary_log.000387 |  115207752 |
| mysql_binary_log.000388 | 1073741987 |
| mysql_binary_log.000389 | 1073742774 |
| mysql_binary_log.000390 | 1074106024 |
| mysql_binary_log.000391 | 1073748407 |
| mysql_binary_log.000392 |  120924130 |
| mysql_binary_log.000393 |       6322 |
+-------------------------+------------+

12 rows in set (0.00 sec)

The logs now show up in MySQL.

Let us log in to the slave and try to resume replication now.

mysql> change master to master_auto_position=0;
Query OK, 0 rows affected (0.02 sec)

mysql> change master to master_log_file='mysql_binary_log.000382';
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> set global slave_parallel_workers=0;

There are warnings, if you use start slave until with parallel replications, you could run into issues. 


start slave until master_log_file='mysql_binary_log.000393', master_log_pos=38204485;

MySQL replication now resumes

mysql> show slave status \G
..
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
  ..
        Seconds_Behind_Master: 45844
..

Monitor the replication lag, until the slave catches up to previous position.

Then switch back to master_auto_position and parallel replication

mysql> change master to master_auto_position=1;
mysql> set global slave_parallel_workers=20;
mysql> start slave;

mysql> show slave status \G
..
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
  ..
        Seconds_Behind_Master: 600
..

Eventually replication will catch up and things will be back to normal

mysql> show slave status \G
..
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
  ..
        Seconds_Behind_Master: 0
..

It is a good practice to validate that the data remains consistent after this fix. One way to do this is to check whether the Executed_Gtid_Set that shows up in show slave status \G, contains all transactions from master.  However this only validates that the transaction ID is present and not whether the actual data is consistent. If you want to validate if the actual data is consistent, this can be done using a tool like pt-table-checksum to check data consistency.

MySQL storing time

Storing time in databases seems like a daunting task to many. The general principle with storing time in databases is - ALWAYS STORE TIME ...