並行複製從庫更新的記錄不存在實際卻存在 背景 開了並行複製的半同步從庫SQL 線程報1032錯誤,非同步複製從庫沒有報錯,偶爾會出現這種 分析 版本mysql 5.7.16 mysql show variables like '%slave_para%'; + + + | Variable_name ...
並行複製從庫更新的記錄不存在實際卻存在
背景
開了並行複製的半同步從庫SQL 線程報1032錯誤,非同步複製從庫沒有報錯,偶爾會出現這種
分析
版本mysql 5.7.16
mysql> show variables like '%slave_para%';
+------------------------+---------------+
| Variable_name | Value |
+------------------------+---------------+
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 16 |
+------------------------+---------------+
1、疑是對從庫執行了更新操作,導致更新的記錄不存在
2、查看error log發現
2018-04-03T10:11:47.720156+08:00 16 [ERROR] Slave SQL for channel '': Worker 13 failed executing transaction 'a272bbcf-874f-11e7-a288-00505695b721:687871861' at master log mysql-bin.004119, end_log_pos 376471678; Could not execute Update_rows event on table anytxn.seq_xxxx; Can't find record in 'seq_xxxx', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.004119, end_log_pos 376471678, Error_code: 1032
2018-04-03T10:11:47.720230+08:00 2 [Warning] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2018-04-03T10:11:47.720959+08:00 2 [Note] Error reading relay log event for channel '': slave SQL thread was killed
3、從 SQL線程停止的position分析binlog發現
SET @@SESSION.GTID_NEXT= 'a272bbcf-874f-11e7-a288-00505695b721:687871861'/*!*/;
# at 376471694
#180403 10:11:47 server id 104073 end_log_pos 376471555 CRC32 0x1be91176 Query thread_id=2086049 exec_time=0 error_code=0
SET TIMESTAMP=1522721507/*!*/;
BEGIN
/*!*/;
# at 376471768
#180403 10:11:47 server id 104073 end_log_pos 376471616 CRC32 0x10644d77 Table_map: `anytxn`.`seq_xxxx` mapped to number 301
# at 376471829
#180403 10:11:47 server id 104073 end_log_pos 376471678 CRC32 0x871a9787 Update_rows: table id 301 flags: STMT_END_F
### UPDATE `anytxn`.`seq_xxxx`
### WHERE
### @1=7116088 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=1 /* INT meta=0 nullable=0 is_null=0 */
### SET
### @1=7116089 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2=1 /* INT meta=0 nullable=0 is_null=0 */
# at 376471891
#180403 10:11:47 server id 104073 end_log_pos 376471709 CRC32 0x9eb59238 Xid = 22247621418
COMMIT/*!*/;
# at 376471922
#180403 10:11:47 server id 104073 end_log_pos 376471774 CRC32 0xf7b6ad5d GTID last_committed=641254 sequence_number=641259
SET @@SESSION.GTID_NEXT= 'a272bbcf-874f-11e7-a288-00505695b721:687871862'/*!*/;
# at 376471987
#180403 10:11:47 server id 104073 end_log_pos 376471856 CRC32 0x6256de00 Query thread_id=2085350 exec_time=0 error_code=0
SET TIMESTAMP=1522721507/*!*/;
BEGIN
/*!*/;
# at 376472069
#180403 10:11:47 server id 104073 end_log_pos 376471979 CRC32 0x6c329578 Table_map: `anytxn`.`bm_cc_customer_address_info` mapped to number 1569
# at 376472192
#180403 10:11:47 server id 104073 end_log_pos 376472162 CRC32 0x834cc8b9 Write_rows: table id 1569 flags: STMT_END_F
### INSERT INTO `anytxn`.`bm_xxxxxxxxxxxxxx`
### SET
### @1=14480779 /* LONGINT meta=0 nullable=0 is_null=0 */
### @2='0000001002380654' /* STRING(96) meta=65120 nullable=0 is_null=0 */
### @3='B001' /* VARSTRING(30) meta=30 nullable=1 is_null=0 */
### @4=NULL /* STRING(12) meta=65036 nullable=1 is_null=1 */
### @5='10000010001202000000001' /* VARSTRING(96) meta=96 nullable=1 is_null=0 */
### @6='B00' /* STRING(9) meta=65033 nullable=1 is_null=0 */
### @7='xxxxxxxxxxx' /* VARSTRING(765) meta=765 nullable=1 is_null=0 */
### @8=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */
### @9=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */
### @10=NULL /* STRING(18) meta=65042 nullable=1 is_null=1 */
[email protected]:/home/mysql> mysqlbinlog -v -v --start-datetime='2018-04-03 10:11:45' --stop-datetime='2018-04-03 10:11:48' /data/mysql/ACSXFDB6-relay-bin.005477 | grep last_comm | grep 10:11:47 | grep 641254
#180403 10:11:47 server id 104073 end_log_pos 376469618 CRC32 0xb6dc6cef GTID last_committed=641227 sequence_number=641254
#180403 10:11:47 server id 104073 end_log_pos 376471774 CRC32 0xf7b6ad5d GTID last_committed=641254 sequence_number=641259
#180403 10:11:47 server id 104073 end_log_pos 376472258 CRC32 0x27cf3013 GTID last_committed=641254 sequence_number=641260
從上面信息可以看出,發生更新記錄不存在是在更新anytxn
.seq_xxxx
表的標識為7116088 的記錄
有兩個併發提交的事務last_committed=641254 ,與發現更新的記錄不存在的 687871861 事務還有提交的事務 sequence_number=641260(即insert另一張表的操作)
4、查看更新的記錄不存在的表和相關記錄
show create table seq_xxxx;
| seq_xxxx | CREATE TABLEseq_xxxx
(
currentValue
bigint(20) NOT NULL,
increment
int(11) NOT NULL DEFAULT '1'
mysql> select * from seq_xxxx;
+--------------+-----------+
| currentValue | increment |
+--------------+-----------+
| 7116088 | 1 |
+--------------+-----------+
可以發現實際資料庫中是存在該記錄的
測試
mysql> select @@version;
+------------+
| @@version |
+------------+
| 5.7.19-log |
+------------+
1 row in set (0.00 sec)
mysql> show variables like '%para%';
+------------------------+---------------+
| Variable_name | Value |
+------------------------+---------------+
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 4 |
+------------------------+---------------+
sysbench /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=10.186.30.73 --mysql-socket=/opt/mysql/data/3307/mysqld.sock --mysql-port=3307 --db-driver=mysql --mysql-db=test --mysql-user=admin --mysql-password=admin --table_size=100000 --tables=5 --threads=100 --time=120 --report-interval=5 run
有併發提交的事務,但沒有模擬重現出更新的記錄不存在,但在庫中卻存在的情況
相關bug鏈接
Repeated multi-threaded slave replication failures