I ran into this error recently, and wanted to share a quick fix.
This is what the problem looks like:
mysql> show slave status \G
..
Master_Log_File: mysql_binary_log.000382
Read_Master_Log_Pos: 341467211
Relay_Log_File: mysql_relay_log.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql_binary_log.000382
Slave_IO_Running: No
Slave_SQL_Running: Yes
Seconds_Behind_Master: NULL
..
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Last_SQL_Errno: 0
..
Checking the binary logs on the master:
mysql> show binary logs;
+-------------------------+------------+
| Log_name                | File_size  |
+-------------------------+------------+
| mysql_binary_log.000388 | 1073741987 |
| mysql_binary_log.000389 | 1073742774 |
| mysql_binary_log.000390 | 1074106024 |
| mysql_binary_log.000391 | 1073748407 |
| mysql_binary_log.000392 | 120633755 |
+-------------------------+------------+
5 rows in set (0.00 sec)
As we can see, the binary log mysql_binary_log.000382 has indeed been purged on the master.
However, I have a script that backs up binary logs before purging them. Since the database is on the order of terabytes, I will attempt to recover using those backed-up logs rather than rebuild the replica from a full backup.
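For reference, the backup-before-purge idea can be sketched in a few lines of shell. This is a minimal sketch, not my actual script: the helper name, the cutoff of 388, and the /backup/binlogs path are all illustrative assumptions.

```shell
# older_than_cutoff reads binlog file names on stdin and prints those
# whose numeric suffix is below CUTOFF -- i.e. safe to back up before
# purging. Assumes logs are named mysql_binary_log.NNNNNN.
older_than_cutoff() {
  cutoff=$1
  while read -r f; do
    n=${f##*.}                              # numeric suffix, e.g. 000382
    n=$(printf '%s' "$n" | sed 's/^0*//')   # strip leading zeros
    if [ "${n:-0}" -lt "$cutoff" ]; then
      echo "$f"
    fi
  done
}

# Illustrative use (mysql/cp calls shown for context, not executed here):
# mysql -N -e 'SHOW BINARY LOGS' | awk '{print $1}' \
#   | older_than_cutoff 388 \
#   | while read -r f; do cp -p "/mysql_binlog_dir/$f" /backup/binlogs/; done
# mysql -e "PURGE BINARY LOGS TO 'mysql_binary_log.000388'"
```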
Copy the binary logs from the backup location to MySQL's binary log directory.
cp mysql_binary_log.00038* /mysql_binlog_dir
Be sure to change the ownership to the mysql user:
chown -R mysql:mysql /mysql_binlog_dir
Edit mysql_binary.index and add these files to it.
mysql_binary_log.000382
mysql_binary_log.000383
..
mysql_binary_log.000392
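Rather than hand-typing eleven file names, the index entries can be generated. A small sketch, assuming the path and range from this example (the function name is mine, and the entry format, relative or absolute paths, should match the existing lines in your index file):

```shell
# gen_index_entries prints one index line per restored binlog.
# MySQL's binlog index file expects one path per line.
gen_index_entries() {
  first=$1; last=$2
  for i in $(seq -f '%06g' "$first" "$last"); do
    echo "/mysql_binlog_dir/mysql_binary_log.$i"
  done
}

# Append to the index file (run as a user that can write it):
# gen_index_entries 382 392 >> /mysql_binlog_dir/mysql_binary.index
```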
The logs still won't show up in MySQL. Let's try FLUSH LOGS:
mysql> flush logs;
Query OK, 0 rows affected (0.02 sec)
mysql> show binary logs;
+-------------------------+------------+
| Log_name                | File_size  |
+-------------------------+------------+
| mysql_binary_log.000382 | 1073897047 |
| mysql_binary_log.000383 | 1073860223 |
| mysql_binary_log.000384 | 1073742372 |
| mysql_binary_log.000385 | 1075249248 |
| mysql_binary_log.000386 | 1074009237 |
| mysql_binary_log.000387 | 115207752 |
| mysql_binary_log.000388 | 1073741987 |
| mysql_binary_log.000389 | 1073742774 |
| mysql_binary_log.000390 | 1074106024 |
| mysql_binary_log.000391 | 1073748407 |
| mysql_binary_log.000392 | 120924130 |
| mysql_binary_log.000393 | 6322 |
+-------------------------+------------+
12 rows in set (0.00 sec)
The logs now show up in MySQL.
Let us log in to the slave and try to resume replication now.
mysql> change master to master_auto_position=0;
Query OK, 0 rows affected (0.02 sec)
mysql> change master to master_log_file='mysql_binary_log.000382';
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> set global slave_parallel_workers=0;
Note the warning above: if you use START SLAVE UNTIL with parallel replication, you could run into issues, which is why slave_parallel_workers is set to 0 first.
mysql> start slave until master_log_file='mysql_binary_log.000393', master_log_pos=38204485;
MySQL replication now resumes:
mysql> show slave status \G
..
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
..
Seconds_Behind_Master: 45844
..
Monitor the replication lag until the slave catches up to the UNTIL position.
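A simple way to watch the lag from a shell. The parsing helper below is my own sketch, and the commented loop assumes a local mysql client that can reach the slave:

```shell
# extract_lag pulls the Seconds_Behind_Master value out of
# SHOW SLAVE STATUS \G output (one "Key: value" pair per line).
extract_lag() {
  awk -F': ' '/Seconds_Behind_Master/ {print $2}'
}

# Poll every 30 seconds until the slave is caught up (not executed here):
# while :; do
#   lag=$(mysql -e 'SHOW SLAVE STATUS\G' | extract_lag)
#   echo "$(date '+%F %T') lag=${lag}s"
#   [ "$lag" = "0" ] && break
#   sleep 30
# done
```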
Then switch back to master_auto_position and parallel replication:
mysql> change master to master_auto_position=1;
mysql> set global slave_parallel_workers=20;
mysql> start slave;
mysql> show slave status \G
..
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
..
Seconds_Behind_Master: 600
..
Eventually replication catches up and things are back to normal:
mysql> show slave status \G
..
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
..
Seconds_Behind_Master: 0
..
It is good practice to validate that the data remains consistent after this fix. One way is to check whether the Executed_Gtid_Set shown in show slave status \G contains all transactions from the master. However, this only validates that the transaction IDs are present, not that the underlying data matches. To verify the data itself, use a tool like pt-table-checksum.
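For the GTID check, MySQL's built-in GTID_SUBSET() function can do the set comparison directly: it returns 1 when every GTID in the first set is contained in the second. A sketch, assuming shell access to both servers; the host names are placeholders and the helper function is mine:

```shell
# gtid_ok interprets the result of MySQL's GTID_SUBSET(): "1" means
# every master GTID has also been executed on the slave.
gtid_ok() {
  [ "$1" = "1" ] && echo "slave has all master GTIDs" \
                 || echo "slave is missing transactions"
}

# Illustrative use (host names/credentials are placeholders, not run here):
# master_set=$(mysql -h master-host -N -e "SELECT @@GLOBAL.gtid_executed")
# mysql -h slave-host -N -e \
#   "SELECT GTID_SUBSET('$master_set', @@GLOBAL.gtid_executed)" \
#   | { read -r r; gtid_ok "$r"; }
```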
A reader asked: how did you come up with master_log_pos=38204485 in the START SLAVE UNTIL command?

I had stopped the slave before running CHANGE MASTER TO MASTER_AUTO_POSITION=0; that was the position reported by SHOW MASTER STATUS at that point.