1.背景概述 客戶業務發生死鎖的報錯,根據業務程式日誌及業務流程,發現造成死鎖的原因是:事務1 delete + insert ,事務2 delete + insert 2個事務交替執行導致的死鎖;由於GAP鎖阻塞了插入意向鎖,並且當delete的數據存在時死鎖不會發生,當delete的數據不存在時 ...
1.背景概述
客戶業務發生死鎖的報錯,根據業務程式日誌及業務流程,發現造成死鎖的原因是:事務1 delete + insert ,事務2 delete + insert 2個事務交替執行導致的死鎖;由於GAP鎖阻塞了插入意向鎖,並且當delete的數據存在時死鎖不會發生,當delete的數據不存在時,會發生死鎖。
2.問題復現
本次測試基於 GreatSQL-8.0.32-24,隔離級別為 RR
2.1 創建測試表
greatsql> create database test;
greatsql> create table test(c1 int unique key, c2 int, c3 int, c4 int);
greatsql> insert into test values (1,1,1,1),(3,3,3,3),(5,5,5,5),(9,9,9,9);
greatsql> select * from test;
+------+------+------+------+
| c1 | c2 | c3 | c4 |
+------+------+------+------+
| 1 | 1 | 1 | 1 |
| 3 | 3 | 3 | 3 |
| 5 | 5 | 5 | 5 |
| 9 | 9 | 9 | 9 |
+------+------+------+------+
4 rows in set (0.01 sec)
2.2 事務執行順序
按以下事務執行順序,如果要刪除的數據存在,則不會發生死鎖;如果要刪除的數據不存,並且要刪除的數據在同一個GAP鎖的區間內則會發生死鎖;
時間 | 事務1 | 事務2 |
---|---|---|
T1 | BEGIN; | BEGIN; |
T2 | delete from test where c1=?; | |
T3 | delete from test where c1=?; | |
T4 | insert into test value(?,?,?,?); | |
T5 | insert into test value(?,?,?,?); |
2.3 當delete的數據存在時
事務1:delete
greatsql> begin;
Query OK, 0 rows affected (0.00 sec)
greatsql> delete from test where c1=3;
Query OK, 1 row affected (0.00 sec)
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| 59 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
3 rows in set (0.00 sec)
此時事務1給 3, 0x000000000201 這條數據加了 記錄鎖 X,REC_NOT_GAP
事務2:delete
greatsql> begin;
Query OK, 0 rows affected (0.00 sec)
greatsql> delete from test where c1=5;
Query OK, 1 row affected (0.00 sec)
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| 57 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 57 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 5, 0x000000000202 |
| 57 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000202 |
| 59 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
6 rows in set (0.00 sec)
此時事務2給 5, 0x000000000202 這條數據加了 記錄鎖 X,REC_NOT_GAP
事務1:insert
greatsql> insert into test value(3,3,3,3);
Query OK, 1 row affected (0.00 sec)
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| 57 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 57 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 5, 0x000000000202 |
| 57 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000202 |
| 59 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000202 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000206 |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
9 rows in set (0.00 sec)
此時事務1給 被delete刪除的數據 3, 0x000000000201 ,插入的數據 3, 0x000000000202,以及相鄰的下一條數據 5, 0x000000000202 加了 間隙鎖 S,GAP
事務2:insert
greatsql> insert into test value(5,5,5,5);
Query OK, 1 row affected (0.01 sec)
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| 57 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 57 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 5, 0x000000000202 |
| 57 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000202 |
| 57 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000202 |
| 57 | test | test | c1 | RECORD | S,GAP | GRANTED | 9, 0x000000000203 |
| 57 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000207 |
| 59 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000202 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000206 |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
12 rows in set (0.00 sec)
此時事務2給 被delete刪除的數據 5, 0x000000000202 ,插入的數據 5, 0x000000000207,以及相鄰的下一條數據 9, 0x000000000203 加了 間隙鎖 S,GAP
由於GAP鎖之間是相互相容的,所以沒有發生鎖等待及死鎖,此時事務1,事務2都執行完成,可以正常提交。
2.4 當delete的數據不存在時
事務1:delete
greatsql> begin;
Query OK, 0 rows affected (0.01 sec)
greatsql> delete from test where c1=6;
Query OK, 0 rows affected (0.00 sec)
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| 61 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
2 rows in set (0.00 sec)
此時事務1給 9, 0x000000000203 這條數據加了 間隙鎖 X,GAP
事務2:delete
greatsql> begin;
Query OK, 0 rows affected (0.00 sec)
greatsql> delete from test where c1=7;
Query OK, 0 rows affected (0.00 sec)
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
| 62 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 62 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 |
| 61 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+
4 rows in set (0.00 sec)
此時事務2給 9, 0x000000000203 這條數據加了 間隙鎖 X,GAP 間隙鎖可以相互相容,因此沒有報錯
事務1:insert
greatsql> insert into test value(6,6,6,6);
---hang住,處於鎖等待
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
| 62 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 62 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 |
| 61 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 |
| 61 | test | test | c1 | RECORD | X,GAP,INSERT_INTENTION | WAITING | 9, 0x000000000203 |
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+
5 rows in set (0.00 sec)
此時事務1,要申請給 9, 0x000000000203 這條數據加 間隙鎖,插入意向鎖 X,GAP,INSERT_INTENTION;由於事務2已經加了 間隙鎖 X,GAP 間隙鎖與插入意向鎖並不相容,因此事務1的insert處於鎖等待狀態
事務2:insert
greatsql> insert into test value(7,7,7,7);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
查看鎖信息:
greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
| 61 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 |
| 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 6, 0x000000000213 |
| 61 | test | test | GEN_CLUST_INDEX | RECORD | X | GRANTED | supremum pseudo-record |
| 61 | test | test | GEN_CLUST_INDEX | RECORD | X,GAP | GRANTED | 0x000000000213 |
| 61 | test | test | c1 | RECORD | X,GAP,INSERT_INTENTION | GRANTED | 9, 0x000000000203 |
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+
6 rows in set (0.01 sec)
事務2回滾,只有事務1的加鎖信息。
由於此時事務2,要申請給 9, 0x000000000203 這條數據加 間隙鎖,插入意向鎖由於事務1已經加了 間隙鎖 X,GAP 間隙鎖與插入意向鎖並不相容,因此事務2的insert處於鎖等待狀態,2個事務相互等待鎖導致死鎖,此時事務2回滾。
3.總結
此次死鎖的發生主要是GAP 鎖 和 插入意向鎖的衝突,建議讓業務修改一下邏輯,先判斷數據是否存在 select 一下,存在的話 delete 後在 insert ; 不存在的話直接 insert 不用delete了。
Enjoy GreatSQL