During this morning's routine database check, I found that the sql_thread on a delayed slave had stopped.
Last_SQL_Errno: 1755
Last_SQL_Error: Cannot execute the current event group in the parallel mode. Encountered event Gtid, relay-log name ./oracle-relay-bin.000093, position 152912092 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly..
Checking the replication_applier_status_by_worker table in performance_schema turned up nothing more specific than the GTID:
"root@localhost:mysql3308.sock [(none)]>select * from performance_schema.replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+------------------------------------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION                          | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+------------------------------------------------+-------------------+--------------------+----------------------+
|              |         1 |      NULL | OFF           | 0b961fcc-41c2-11e7-84fd-286ed488c7da:156369774 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         2 |      NULL | OFF           |                                                |                 0 |                    | 0000-00-00 00:00:00  |
|              |         3 |      NULL | OFF           |                                                |                 0 |                    | 0000-00-00 00:00:00  |
|              |         4 |      NULL | OFF           |                                                |                 0 |                    | 0000-00-00 00:00:00  |
|              |         5 |      NULL | OFF           |                                                |                 0 |                    | 0000-00-00 00:00:00  |
|              |         6 |      NULL | OFF           |                                                |                 0 |                    | 0000-00-00 00:00:00  |
|              |         7 |      NULL | OFF           |                                                |                 0 |                    | 0000-00-00 00:00:00  |
|              |         8 |      NULL | OFF           |                                                |                 0 |                    | 0000-00-00 00:00:00  |
+--------------+-----------+-----------+---------------+------------------------------------------------+-------------------+--------------------+----------------------+
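Since the error message gives the relay log file and position, the next step is to look at the log itself.
Decode the relay log with mysqlbinlog: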
mysqlbinlog -v --base64-output=decode-rows oracle-relay-bin.000093 >1.sql
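Locate the events around position 152912092:
I checked the database, and the row with ID 14816035 in this table indeed does not exist.
Also, apart from this one event, every other event in the log has last_committed and sequence_number both set to 0. last_committed records the sequence number of the most recent transaction that had already committed when this transaction was committing. Together, last_committed and sequence_number make up the so-called LOGICAL_CLOCK.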
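Hunting for the odd event by eye in a multi-gigabyte decoded log is tedious, so here is a minimal Python sketch that scans mysqlbinlog text output for GTID events whose logical timestamps look inconsistent. The rule it applies (0/0 is treated as "no timestamps written", otherwise last_committed should be smaller than sequence_number) is my simplified reading of what the slave expects, not the server's exact logic. The function name and the SAMPLE excerpt are made up for illustration; the real values from this incident are not reproduced here.

```python
import re

# Logical-clock fields as printed on GTID event lines by a 5.7 mysqlbinlog.
CLOCK_RE = re.compile(r"last_committed=(\d+)\s+sequence_number=(\d+)")
# "# at <position>" lines that precede every event in the decoded output.
POS_RE = re.compile(r"^# at (\d+)")


def find_suspect_events(lines):
    """Return (position, last_committed, sequence_number) for suspicious events."""
    suspects = []
    position = None
    for line in lines:
        m = POS_RE.match(line)
        if m:
            position = int(m.group(1))
            continue
        m = CLOCK_RE.search(line)
        if m:
            last_committed = int(m.group(1))
            sequence_number = int(m.group(2))
            # Assumption: 0/0 means the master wrote no logical timestamps.
            # Otherwise last_committed should be strictly below sequence_number.
            if last_committed != 0 and last_committed >= sequence_number:
                suspects.append((position, last_committed, sequence_number))
    return suspects


# Hypothetical excerpt, invented for illustration only.
SAMPLE = """\
# at 100
#180716 17:00:00 server id 50  end_log_pos 165 CRC32 0x00000000 \tGTID\tlast_committed=0\tsequence_number=0
# at 400
#180716 17:00:01 server id 50  end_log_pos 465 CRC32 0x00000000 \tGTID\tlast_committed=987654321\tsequence_number=0
""".splitlines()

print(find_suspect_events(SAMPLE))
```

To run it against the decoded file from the command above, pass an open file handle for 1.sql instead of SAMPLE. Opening it with errors="replace" is probably wise, since decoded row data may not be valid UTF-8.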
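My guess is that manually inserting this row into the delayed slave, injecting an empty transaction to skip this GTID, and then restarting the sql_thread would also clear the error.
But if any transaction that carries LOGICAL_CLOCK timestamps triggers the failure, skipping a single transaction gives no guarantee that it will not happen again.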
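For reference, the empty-transaction skip mentioned above is usually a short, fixed sequence of statements. The Python sketch below only builds that sequence as strings so it can be reviewed before anyone pastes it into a client; it does not connect to MySQL. The helper name and the GTID in the example are hypothetical, and the GTID of the failing transaction would have to be confirmed from the relay log first.

```python
import re

# A single GTID has the form <server uuid>:<transaction number>.
GTID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:\d+$")


def empty_transaction_statements(gtid):
    """Build the statements that commit an empty transaction under `gtid`."""
    if not GTID_RE.match(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        "STOP SLAVE SQL_THREAD;",
        f"SET GTID_NEXT='{gtid}';",
        "BEGIN;",
        "COMMIT;",
        "SET GTID_NEXT='AUTOMATIC';",
        "START SLAVE SQL_THREAD;",
    ]


# Hypothetical GTID, not the one from this incident.
for stmt in empty_transaction_statements("3e11fa47-71ca-11e1-9e33-c80aa9429562:23"):
    print(stmt)
```

Validating the GTID format up front is a deliberate choice: the value is interpolated into a statement, so anything that does not look like a single GTID is rejected rather than quoted.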
I noticed that last_committed in this event is an abnormally large value, and the error message mentions "The master event is logically timestamped incorrectly". That made me suspect the parallel replication configuration.
Slave configuration:
"root@localhost:mysql3308.sock [(none)]>show variables like '%para%';
+------------------------+---------------+
| Variable_name          | Value         |
+------------------------+---------------+
| slave_parallel_type    | LOGICAL_CLOCK |
| slave_parallel_workers | 8             |
+------------------------+---------------+
Then check the master configuration:
(root@localhost:mysql.sock) [(none)]>show variables like '%para%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| slave_parallel_workers | 0     |
+------------------------+-------+
The master has no slave_parallel_type variable at all. Then I remembered that the master runs MySQL 5.6.
(root@localhost:mysql.sock) [(none)]>select version();
+------------+
| version()  |
+------------+
| 5.6.35-log |
+------------+
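That pretty much explains the problem: the 5.6 master supports only DATABASE-based parallel replication and does not write LOGICAL_CLOCK timestamps into its binlog, while the 5.7 slave was configured with LOGICAL_CLOCK, which caused the failure.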
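The rule of thumb from this diagnosis can be written down as a small check, for example as part of a configuration audit script. This is only a sketch under one assumption: that a MySQL master older than 5.7 never writes logical timestamps, so a slave of such a master is better off with DATABASE. The function names are hypothetical, the exact cut-over within early 5.7 releases is not checked, and version strings from other forks (MariaDB, for instance) are out of scope.

```python
def parse_version(version):
    """Turn a string such as '5.6.35-log' into a tuple like (5, 6, 35)."""
    numeric = version.split("-", 1)[0]
    return tuple(int(part) for part in numeric.split("."))


def safe_parallel_type(master_version):
    """Suggest a slave_parallel_type for a slave of the given master."""
    # Assumption: only masters at 5.7 or later write last_committed and
    # sequence_number, which LOGICAL_CLOCK scheduling depends on.
    if parse_version(master_version)[:2] >= (5, 7):
        return "LOGICAL_CLOCK"
    return "DATABASE"


print(safe_parallel_type("5.6.35-log"))
```

Feeding it the output of select version() from the master is enough; the suffix after the first hyphen is ignored.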
With the cause understood, the fix is straightforward: change slave_parallel_type on the slave to DATABASE and start the sql_thread again, which should resolve the problem:
"root@localhost:mysql3308.sock [none]>set global slave_parallel_type='DATABASE';
Query OK, 0 rows affected (0.00 sec)

"root@localhost:mysql3308.sock [none]>show global variables like '%slave_parallel_type%';
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| slave_parallel_type | DATABASE |
+---------------------+----------+
1 row in set (0.00 sec)

"root@localhost:mysql3308.sock [none]>show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: master
                   Master_User: rep
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: binlog.000104
           Read_Master_Log_Pos: 160115307
                Relay_Log_File: oracle-relay-bin.000093
                 Relay_Log_Pos: 152912092
         Relay_Master_Log_File: binlog.000100
              Slave_IO_Running: Yes
             Slave_SQL_Running: No
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1755
                    Last_Error: Cannot execute the current event group in the parallel mode. Encountered event Gtid, relay-log name ./oracle-relay-bin.000093, position 152912092 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly..
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 152911925
               Relay_Log_Space: 4455094667
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 1755
                Last_SQL_Error: Cannot execute the current event group in the parallel mode. Encountered event Gtid, relay-log name ./oracle-relay-bin.000093, position 152912092 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly..
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 50
                   Master_UUID: 0b961fcc-41c2-11e7-84fd-286ed488c7da
              Master_Info_File: mysql.slave_master_info
                     SQL_Delay: 3600
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State:
            Master_Retry_Count: 86400
                   Master_Bind:
       Last_IO_Error_Timestamp:
      Last_SQL_Error_Timestamp: 180716 18:02:56
                Master_SSL_Crl:
            Master_SSL_Crlpath:
            Retrieved_Gtid_Set: 0b961fcc-41c2-11e7-84fd-286ed488c7da:111060115-163843604
             Executed_Gtid_Set: 0b961fcc-41c2-11e7-84fd-286ed488c7da:1-156369774
                 Auto_Position: 1
          Replicate_Rewrite_DB:
                  Channel_Name:
            Master_TLS_Version:
1 row in set (0.00 sec)

"root@localhost:mysql3308.sock [none]>stop slave sql_thread;
Query OK, 0 rows affected, 1 warning (0.00 sec)

"root@localhost:mysql3308.sock [none]>start slave sql_thread;
Query OK, 0 rows affected (0.01 sec)

"root@localhost:mysql3308.sock [none]>show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: master
                   Master_User: rep
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: binlog.000104
           Read_Master_Log_Pos: 160161836
                Relay_Log_File: oracle-relay-bin.000093
                 Relay_Log_Pos: 169205552
         Relay_Master_Log_File: binlog.000100
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 169205385
               Relay_Log_Space: 4455141196
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 5351
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 50
                   Master_UUID: 0b961fcc-41c2-11e7-84fd-286ed488c7da
              Master_Info_File: mysql.slave_master_info
                     SQL_Delay: 3600
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Waiting for Slave Worker to release partition
            Master_Retry_Count: 86400
                   Master_Bind:
       Last_IO_Error_Timestamp:
      Last_SQL_Error_Timestamp:
                Master_SSL_Crl:
            Master_SSL_Crlpath:
            Retrieved_Gtid_Set: 0b961fcc-41c2-11e7-84fd-286ed488c7da:111060115-163843692
             Executed_Gtid_Set: 0b961fcc-41c2-11e7-84fd-286ed488c7da:1-156400100
                 Auto_Position: 1
          Replicate_Rewrite_DB:
                  Channel_Name:
            Master_TLS_Version:
1 row in set (0.00 sec)
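Comparing two long status dumps by eye is error-prone, so the before/after check can also be scripted. The sketch below parses the text of a show slave status\G result into a dictionary and reports whether the SQL thread looks healthy. The function names are hypothetical, and the two samples are shortened excerpts of the output above rather than complete dumps.

```python
def parse_slave_status(text):
    """Parse 'Field: value' lines from vertical (\\G) output into a dict."""
    status = {}
    for line in text.splitlines():
        # Skip the row separator and lines without a field name.
        if line.lstrip().startswith("*") or ":" not in line:
            continue
        # Split on the first colon only: values such as GTID sets contain colons.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def sql_thread_healthy(status):
    """True when the SQL thread is running and reports no error."""
    return (status.get("Slave_SQL_Running") == "Yes"
            and status.get("Last_SQL_Errno") == "0")


BEFORE = """
*************************** 1. row ***************************
             Slave_SQL_Running: No
                Last_SQL_Errno: 1755
             Executed_Gtid_Set: 0b961fcc-41c2-11e7-84fd-286ed488c7da:1-156369774
"""

AFTER = """
*************************** 1. row ***************************
             Slave_SQL_Running: Yes
                Last_SQL_Errno: 0
             Executed_Gtid_Set: 0b961fcc-41c2-11e7-84fd-286ed488c7da:1-156400100
"""

print(sql_thread_healthy(parse_slave_status(BEFORE)))
print(sql_thread_healthy(parse_slave_status(AFTER)))
```

If the captured text still includes the client prompt line, it ends up as a meaningless extra key, which is harmless for this check.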
Job done.
Please credit the source when reposting.
Original article: https://www.cnblogs.com/ajiangg/p/9399892.html