一、背景 公元2023-10-12(周四)上午,組內的亞梅反饋,用戶生成標簽報死鎖異常 二、排查異常日誌 查到當時報錯的日誌 具體異常信息如下 server-provider-info-2023-10-12.0.log:2023-10-12 09:40:50.593 [TID:bf623bded18 ...
一、背景
公元2023-10-12(周四)上午,組內的亞梅反饋,用戶生成標簽報死鎖異常
二、排查異常日誌
查到當時報錯的日誌
具體異常信息如下
server-provider-info-2023-10-12.0.log:2023-10-12 09:40:50.593 [TID:bf623bded189486cbb0b6a64d81b64b4.357.16970748504097047] [4ed7b7943a8a47de912e4b644d70285e] [SimpleAsyncTaskExecutor-5080] INFO com.emax.user.user.provider.UserTagAPIImpl:? - 簽約完成,更新用戶標簽信息異常,用戶id[1712282059297107970]org.apache.ibatis.exceptions.PersistenceException: server-provider-info-2023-10-12.0.log-### Error flushing statements. Cause: org.apache.ibatis.executor.BatchExecutorException: com.emax.user.user.mapper.UserTagMapper.insert (batch index #1) failed. Cause: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction server-provider-info-2023-10-12.0.log-### Cause: org.apache.ibatis.executor.BatchExecutorException: com.emax.user.user.mapper.UserTagMapper.insert (batch index #1)failed. Cause: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction server-provider-info-2023-10-12.0.log- at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30) server-provider-info-2023-10-12.0.log- at org.apache.ibatis.session.defaults.DefaultSqlSession.flushStatements(DefaultSqlSession.java:254) server-provider-info-2023-10-12.0.log- at com.baomidou.mybatisplus.extension.service.impl.ServiceImpl.saveBatch(ServiceImpl.java:127) server-provider-info-2023-10-12.0.log- at com.baomidou.mybatisplus.extension.service.IService.saveBatch(IService.java:58) server-provider-info-2023-10-12.0.log- at com.baomidou.mybatisplus.extension.service.IService$$FastClassBySpringCGLIB$$f8525d18.invoke(<generated>) server-provider-info-2023-10-12.0.log- at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) server-provider-info-2023-10-12.0.log- at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:749) server-provider-info-2023-10-12.0.log- at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) server-provider-info-2023-10-12.0.log- at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294) server-provider-info-2023-10-12.0.log- at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98) server-provider-info-2023-10-12.0.log- at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) server-provider-info-2023-10-12.0.log- at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688) server-provider-info-2023-10-12.0.log- at com.emax.user.user.service.UserTagManager$$EnhancerBySpringCGLIB$$67ca0067.saveBatch(<generated>) server-provider-info-2023-10-12.0.log- at com.emax.user.user.provider.UserTagAPIImpl.addUserTagByUser(UserTagAPIImpl.java:236) server-provider-info-2023-10-12.0.log- at com.emax.user.user.provider.UserTagAPIImpl.addTagAfterSign(UserTagAPIImpl.java:116) server-provider-info-2023-10-12.0.log- at com.emax.user.user.provider.UserTagAPIImpl$$FastClassBySpringCGLIB$$2547ce26.invoke(<generated>) server-provider-info-2023-10-12.0.log- at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) server-provider-info-2023-10-12.0.log- at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:749) -- server-provider-info-2023-10-12.0.log- at io.seata.rm.datasource.StatementProxy.lambda$executeBatch$9(StatementProxy.java:129) server-provider-info-2023-10-12.0.log- at io.seata.rm.datasource.exec.ExecuteTemplate.execute(ExecuteTemplate.java:73) server-provider-info-2023-10-12.0.log- at io.seata.rm.datasource.exec.ExecuteTemplate.execute(ExecuteTemplate.java:51) server-provider-info-2023-10-12.0.log- at io.seata.rm.datasource.StatementProxy.executeBatch(StatementProxy.java:129) server-provider-info-2023-10-12.0.log- at sun.reflect.GeneratedMethodAccessor1346.invoke(Unknown Source) server-provider-info-2023-10-12.0.log- at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) server-provider-info-2023-10-12.0.log- at java.lang.reflect.Method.invoke(Method.java:498) server-provider-info-2023-10-12.0.log- at org.apache.ibatis.logging.jdbc.PreparedStatementLogger.invoke(PreparedStatementLogger.java:78) server-provider-info-2023-10-12.0.log- at com.sun.proxy.$Proxy501.executeBatch(Unknown Source) server-provider-info-2023-10-12.0.log- at com.baomidou.mybatisplus.core.executor.MybatisBatchExecutor.doFlushStatements(MybatisBatchExecutor.java:132) server-provider-info-2023-10-12.0.log- ... 71 common frames omitted server-provider-info-2023-10-12.0.log-Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction server-provider-info-2023-10-12.0.log- at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123) server-provider-info-2023-10-12.0.log- at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97) server-provider-info-2023-10-12.0.log- at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) server-provider-info-2023-10-12.0.log- at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:970) server-provider-info-2023-10-12.0.log- at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1109) server-provider-info-2023-10-12.0.log- at com.mysql.cj.jdbc.ClientPreparedStatement.executeBatchSerially(ClientPreparedStatement.java:849) server-provider-info-2023-10-12.0.log- ... 89 common frames omitted server-provider-info-2023-10-12.0.log- |
三、代碼所在地
查看發生Deadlock的代碼現場。
代碼如下:
/** * 增加用戶標簽 * @param user 用戶信息 * @param tagNameList 標簽集合 */ private void addUserTagByUser(User user, List<String> tagNameList){ log.info("用戶添加標簽入參信息,當前用戶id[{}],標簽信息[{}]", user.getUserId(), JSON.toJSONString(tagNameList)); Set<String> existTags = new HashSet<>(); final List<UserTag> userTagList = userTagManager.listUserTagByUserId(user.getUserId()); if(CollectionUtils.isNotEmpty(userTagList)){ existTags = userTagList.stream().map(UserTag::getTagName).collect(Collectors.toSet()); } //遍歷合集 如果不存在舊數據 直接都新增 存在 判斷當前數據是否在舊數據中 List<UserTag> newInsertList = new ArrayList<>(); for(String name : tagNameList){ if (StringUtils.isNotBlank(name) && existTags.add(name)) { newInsertList.add(userTagManager.buildEntity(user.getUserId(), user.getIdcardNo(), name)); } } userTagManager.saveBatch(newInsertList); } |
觀察代碼,大致邏輯是,先根據用戶id查詢出所有標簽信息,計算出來需要新增的用戶標簽,批量入庫。其中,user_tag表存儲用戶的標簽數據。兩個關鍵欄位 user_id(用戶id)和tag_name(標簽)上面具有唯一索引,插入的數據也主要是這兩個欄位。
從這段代碼並不容易判斷出來如何會產生死鎖。推測是多個線程同時操作相同的數據,併發插入導致的死鎖。
四、併發操作日誌
elk上查詢方法入參日誌。果然在發生異常的時間節點上,發現了兩條插入相同數據的日誌。相同時間點,兩個traceId,可見當時發生了併發調用。
找到當時另一個線程的操作日誌,日誌無異常,數據入庫成功。見下麵截圖。
五、資料庫死鎖日誌驗證推測
當天下午上班後,找運維大哥幫忙,找出死鎖日誌
執行命令 show ENGINE INNODB status
得到信息如下
===================================== 2023-10-12 13:49:17 0x7fea5ae87700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 61 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 3123131 srv_active, 0 srv_shutdown, 7715544 srv_idle srv_master_thread log flush and writes: 10838675 ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 3905404 OS WAIT ARRAY INFO: signal count 27617015 RW-shared spins 0, rounds 21456656, OS waits 2772950 RW-excl spins 0, rounds 41132727, OS waits 487320 RW-sx spins 227170, rounds 4363782, OS waits 66152 Spin rounds per wait: 21456656.00 RW-shared, 41132727.00 RW-excl, 19.21 RW-sx ------------------------ LATEST DETECTED DEADLOCK ------------------------ 2023-10-12 09:40:50 0x7fecedb1e700 *** (1) TRANSACTION: TRANSACTION 404898333, ACTIVE 0 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1 MySQL thread id 11375251, OS thread handle 140644580488960, query id 1031478425 10.128.0.160 emax_base update INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '市場推廣', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' ) *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1155 page no 65295 n bits 312 index uniq_user_id_tag_name of table `emax_base`.`emax_user_tag` trx id 404898333 lock mode S waiting Record lock, heap no 242 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 97c33f75fbadb002; asc ?u ;; 1: len 12; hex e5b882e59cbae68ea8e5b9bf; asc ;; 2: len 8; hex 000000000044c0da; asc D ;; *** (2) TRANSACTION: TRANSACTION 404898332, ACTIVE 0 sec inserting mysql tables in use 1, locked 1 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2 MySQL thread id 11375224, OS thread handle 140655576868608, query id 1031478426 10.128.0.160 emax_base update INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '信息技術服務', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' ) *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 1155 page no 65295 n bits 312 index uniq_user_id_tag_name of table `emax_base`.`emax_user_tag` trx id 404898332 lock_mode X locks rec but not gap Record lock, heap no 242 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 97c33f75fbadb002; asc ?u ;; 1: len 12; hex e5b882e59cbae68ea8e5b9bf; asc ;; 2: len 8; hex 000000000044c0da; asc D ;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1155 page no 65295 n bits 312 index uniq_user_id_tag_name of table `emax_base`.`emax_user_tag` trx id 404898332 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 242 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 97c33f75fbadb002; asc ?u ;; 1: len 12; hex e5b882e59cbae68ea8e5b9bf; asc ;; 2: len 8; hex 000000000044c0da; asc D ;; *** WE ROLL BACK TRANSACTION (1) |
該日誌信息量很多,已省略掉若幹非關鍵信息行,我們主要關註最後一次死鎖相關信息。
六、死鎖復現
sql準備
INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '市場推廣', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' );
INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '信息技術服務', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' );
本地創建一個和生產一樣的資料庫emax_user_tag,同樣,在user_id和tag_name兩個欄位上創建唯一索引。
操作步驟1
在視窗一執行
begin;
INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '市場推廣', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' );
操作步驟2
在視窗二執行
begin;
INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '市場推廣', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' );
操作步驟3
在視窗一執行
INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '信息技術服務', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' );
commit;
操作步驟4
切回視窗二
操作步驟5
執行命令 show ENGINE INNODB status 查看死鎖日誌。與上面的情況一樣。
===================================== 2023-10-13 14:52:43 0x7fe759575700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 24 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 137590 srv_active, 0 srv_shutdown, 2010475 srv_idle srv_master_thread log flush and writes: 2148065 ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 6627679 OS WAIT ARRAY INFO: signal count 14339066 RW-shared spins 0, rounds 22104735, OS waits 1859432 RW-excl spins 0, rounds 258795010, OS waits 2238669 RW-sx spins 6551822, rounds 136672247, OS waits 1786313 Spin rounds per wait: 22104735.00 RW-shared, 258795010.00 RW-excl, 20.86 RW-sx ------------------------ LATEST DETECTED DEADLOCK ------------------------ 2023-10-13 14:50:31 0x7fe892ad0700 *** (1) TRANSACTION: TRANSACTION 293171186, ACTIVE 10 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1 MySQL thread id 1266048, OS thread handle 140631613069056, query id 35172347 192.168.48.179 develop update /* ApplicationName=DBeaver 22.2.1 - SQLEditor <Script-4.sql> */ INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '市場推廣', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' ) *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 14171 page no 4 n bits 72 index user_id_tag_name of table `dbsyncer_mf`.`emax_user_tag` trx id 293171186 lock mode S waiting Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 97c33f75fbadb002; asc ?u ;; 1: len 12; hex e5b882e59cbae68ea8e5b9bf; asc ;; 2: len 8; hex 80000000000358d0; asc X ;; *** (2) TRANSACTION: TRANSACTION 293171182, ACTIVE 19 sec inserting mysql tables in use 1, locked 1 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 2 MySQL thread id 1266036, OS thread handle 140636869953280, query id 35172492 192.168.48.179 develop update /* ApplicationName=DBeaver 22.2.1 - SQLEditor <Script-1.sql> */ INSERT INTO emax_user_tag ( user_id, id_card_no, tag_name, create_time, update_time ) VALUES ( 1712282059297107970, 'A978B69811B82F4B08F929D26B360F2D980CDCB5F419F1DD46527FB90BE1E5A9', '信息技術服務', '2023-10-12 09:40:50.481', '2023-10-12 09:40:50.481' ) *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 14171 page no 4 n bits 72 index user_id_tag_name of table `dbsyncer_mf`.`emax_user_tag` trx id 293171182 lock_mode X locks rec but not gap Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 97c33f75fbadb002; asc ?u ;; 1: len 12; hex e5b882e59cbae68ea8e5b9bf; asc ;; 2: len 8; hex 80000000000358d0; asc X ;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 14171 page no 4 n bits 72 index user_id_tag_name of table `dbsyncer_mf`.`emax_user_tag` trx id 293171182 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 97c33f75fbadb002; asc ?u ;; 1: len 12; hex e5b882e59cbae68ea8e5b9bf; asc ;; 2: len 8; hex 80000000000358d0; asc X ;; *** WE ROLL BACK TRANSACTION (1)View Code
七、死鎖分析
根據死鎖日誌進行分析。
執行操作步驟一時:
事務一 獲取到了插入意向鎖
執行操作步驟二時:
事務二 將事務的意向鎖升級為唯一索引排他鎖,並且嘗試獲取唯一索引共用鎖(還沒有獲得共用鎖,排隊中)
執行操作步驟三時:
事務一 嘗試獲取插入意向間隙鎖
事務二等待事務一釋放排他鎖
事務一等待事務二釋放共用鎖
形成相互等待關係,死鎖。
八、總結歸納
1.批量插入操作儘量保證數據有序性
2.可藉助性能更高的redis進行併發攔截 或 同步處理控制
3.將參數 innodb_deadlock_detect
設置為 on,死鎖時會進行回滾(資料庫預設開啟)
4.業務代碼需要考慮資料庫異常
5.當前mysql資料庫版本是5.7.28-log。經細心的紅潔同學驗證,mysql-8.0版本針對這個場景做了優化,不會發生死鎖。
文獻參考 https://blog.csdn.net/minghao0508/article/details/129093202
當看到一些不好的代碼時,會發現我還算優秀;當看到優秀的代碼時,也才意識到持續學習的重要!--buguge
本文來自博客園,轉載請註明原文鏈接:https://www.cnblogs.com/buguge/p/17788113.html