MVCC併發版本控制 本文大部分來自《MySQL是怎樣運行的》,這裡只是簡單總結,用於各位回憶和複習。 版本鏈 對於使用 InnoDB 存儲引擎的表來說,它的聚簇索引記錄中都包含兩個必要的隱藏列(不知道的快去看《MySQL是怎樣運行的》) trx_id :每次一個事務對某條聚簇索引記錄進行改動時,都 ...
[20230616]One Deadlock of 'row cache lock' and 'library cache lock'.txt
--//鏈接http://ksun-oracle.blogspot.com/2023/06/one-deadlock-of-row-cache-lock-and.html演示一個有趣的測試.
--//他測試採用cluster表,我估計普通表這樣操作不會出現這樣的情況,先重覆作者的測試看看.
1.環境:
SCOTT@test01p> @ ver1
PORT_STRING VERSION BANNER CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0 12.2.0.1.0 Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production 0
2.建立測試腳本:
--//drop table test_tab purge;
--//drop cluster test_cluster;
SCOTT@test01p> create cluster test_cluster (x number);
Cluster created.
SCOTT@test01p> create index test_cluster_indx on cluster test_cluster;
Index created.
SCOTT@test01p> create table test_tab (x, y) cluster test_cluster(x) as select level, 'abc' from dual connect by level <= 3;
Table created.
create or replace procedure test_proc_sel_tab(p_cnt number) as
begin
for i in 1..p_cnt loop
for c in (select * from test_tab) loop
null;
end loop;
--dbms_session.sleep(0.01);
dbms_lock.sleep(0.01);
end loop;
end;
/
-- exec test_proc_sel_tab(5);
create or replace procedure test_proc_alt_cluster(p_cnt number) as
begin
for i in 1..p_cnt loop
execute immediate 'ALTER CLUSTER test_cluster PCTFREE 18 INITRANS 19';
end loop;
end;
/
-- exec test_proc_alt_cluster(5);
--//簡單說明:存儲過程test_proc_sel_tab 不斷執行select操作,每次修改停頓0.01秒.存儲過程test_proc_alt_cluster不斷的執行
--//alert修改cluster表的PCTFREE,INITRANS屬性.
3.測試:
--//session 1,按照作者介紹需要10秒完成, 感覺不對0.01*10000 = 100秒. 我修改為1e3.
SCOTT@test01p> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
252 29639 4784:8016 DEDICATED 260 27 7 alter system kill session '252,29639' immediate;
SCOTT@test01p> @zzdate
C30 C30 C31
-------------------------------------- -------------------------------------- --------------------------------------
2023-06-19 21:15:59 trunc(sysdate)+21/24+15/1440+59/86400 "timestamp'2023-06-19 21:15:59'"
SCOTT@test01p> exec test_proc_sel_tab(1e3);
PL/SQL procedure successfully completed.
Elapsed: 00:00:14.27
--//註:先執行test_proc_sel_tab ,切換會話執行test_proc_alt_cluster。
--//session 2:
SCOTT@test01p> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
267 7490 3624:6852 DEDICATED 6964 59 13 alter system kill session '267,7490' immediate;
SCOTT@test01p> exec test_proc_alt_cluster(1e4);
BEGIN test_proc_alt_cluster(1e4); END;
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "SCOTT.TEST_PROC_ALT_CLUSTER", line 4
ORA-06512: at line 1
SCOTT@test01p> @zzdate
C30 C30 C31
-------------------------------------- -------------------------------------- --------------------------------------
2023-06-19 21:18:38 trunc(sysdate)+21/24+18/1440+38/86400 "timestamp'2023-06-19 21:18:38'"
--//作者給出的解析,實際上就是形成了相鎖.當然這個給看運氣,當然對於一個在cluster中的表如果頻繁訪問,如果修改cluster屬性,很
--//有可能遇到這樣的情況.
4.我的分析:
--//作者的分析使用gdb,我的測試環境是windows,僅僅簡單的使用ashtop觀察看看.
SCOTT@test01p> @ ashtop sql_id,SESSION_ID,SESSION_SERIAL#,event 1=1 trunc(sysdate)+21/24+15/1440+59/86400 trunc(sysdate)+21/24+18/1440+38/86400
Total Distinct Distinct
Seconds AAS %This SQL_ID SESSION_ID SESSION_SERIAL# EVENT FIRST_SEEN LAST_SEEN Execs Seen Tstamps
--------- ------- ------- ------------- ---------- --------------- ------------------ ------------------- ------------------- ---------- --------
5 .0 42% | 42y5q6y5vyxsw 267 7490 library cache lock 2023-06-19 21:17:09 2023-06-19 21:17:13 1 5
5 .0 42% | g677askf3bry6 252 29639 row cache lock 2023-06-19 21:17:09 2023-06-19 21:17:13 1 5
2 .0 17% | 42y5q6y5vyxsw 267 7490 2023-06-19 21:17:07 2023-06-19 21:17:08 2 2
SCOTT@test01p> @ sql_id g677askf3bry6
--SQL_ID = g677askf3bry6
SELECT * FROM TEST_TAB;
SCOTT@test01p> @ sql_id 42y5q6y5vyxsw
--SQL_ID = 42y5q6y5vyxsw
--//alter 語句這樣查詢不到。
SYS@test> @ sharepool/shp4x 42y5q6y5vyxsw 0
TEXT KGLHDADR KGLHDPAR C40 KGLHDLMD KGLHDPMD KGLHDIVC KGLOBHD0 KGLOBHD6 KGLOBHS0 KGLOBHS6 KGLOBT16 N0_6_16 N20 KGLNAHSH KGLOBT03 KGLOBT09
--------------------- ---------------- ---------------- ---------------------------------------- ---------- ---------- ---------- ---------------- ---------------- ---------- ---------- ---------- --------- ---------- ---------- ------------- ----------
child handle address 000007FF1527C4D8 000007FF11755510 ALTER CLUSTER test_cluster PCTFREE 18 IN 0 0 10000 00 00 0 0 3186 3186 3186 2344580892 42y5q6y5vyxsw 0
parent handle address 000007FF11755510 000007FF11755510 ALTER CLUSTER test_cluster PCTFREE 18 IN 0 0 10000 000007FF1153E188 00 4072 0 0 4072 4072 2344580892 42y5q6y5vyxsw 65535
SCOTT@test01p> @ ash/ash_wait_chains2.sql username||':'||program2||event2 session_type='FOREGROUND' trunc(sysdate)+21/24+15/1440+59/86400 trunc(sysdate)+21/24+18/1440+38/86400
-- Display ASH Wait Chain Signatures script v0.7 by Tanel Poder ( http://blog.tanelpoder.com )
%This SECONDS AAS WAIT_CHAIN FIRST_SEEN LAST_SEEN
------ ---------- ------- ----------------------------------------------------------------------------------------------------------------------------- ------------------- -------------------
42% 5 .0 -> SCOTT:(sqlplus.exe) row cache lock -> SCOTT:(sqlplus.exe) library cache lock -> [idle blocker 1,252,29639 (sqlplus.exe)] 2023-06-19 21:17:09 2023-06-19 21:17:13
42% 5 .0 -> SCOTT:(sqlplus.exe) library cache lock -> SCOTT:(sqlplus.exe) row cache lock -> [idle blocker 1,267,7490 (sqlplus.exe)] 2023-06-19 21:17:09 2023-06-19 21:17:13
17% 2 .0 -> SCOTT:(sqlplus.exe) ON CPU 2023-06-19 21:17:07 2023-06-19 21:17:08
SCOTT@test01p> @wcx trunc(sysdate)+21/24+15/1440+59/86400 trunc(sysdate)+21/24+18/1440+38/86400
-- Display ASH Wait Chain Signatures script v0.7 by Tanel Poder ( http://blog.tanelpoder.com )
%This SECONDS AAS WAIT_CHAIN FIRST_SEEN LAST_SEEN
------ ---------- ------- ------------------------------------------------------------------------------------------------------------------------------------------ ------------------- -------------------
42% 5 .0 -> 267,7490,@1=>252,29639,@1=>row cache lock -> 252,29639,@1=>267,7490,@1=>library cache lock -> [idle blocker 1,252,29639 (sqlplus.exe)] 2023-06-19 21:17:09 2023-06-19 21:17:13
42% 5 .0 -> 252,29639,@1=>267,7490,@1=>library cache lock -> 267,7490,@1=>252,29639,@1=>row cache lock -> [idle blocker 1,267,7490 (sqlplus.exe)] 2023-06-19 21:17:09 2023-06-19 21:17:13
17% 2 .0 -> ,,@=>267,7490,@1=> 2023-06-19 21:17:07 2023-06-19 21:17:08
--// row cache lock 和 library cache lock 形成相互等待,兩個session導致互鎖。ORA-00060: deadlock detected while waiting for resource。
5. Root Cacuse Analysis
--//鏈接http://ksun-oracle.blogspot.com/2023/06/one-deadlock-of-row-cache-lock-and.html的解析:
Look gdb lock request order of both sessions:
(a). ALTER CLUSTER GDB shows:
first 'row cache lock' reqeust on TEST_CLUSTER by kqrLockPo,
then 'library cache lock' reqeust on TEST_TAB by kgllkal.
(b). Query GDB shows
first 'library cache lock' reqeust on TEST_TAB by kgllkal,
then 'row cache lock' reqeust on TEST_CLUSTER by kqrLockPo.
We can see that the deadlock is caused by the crossing order of 'row cache lock' and 'library cache lock' reqeusts.
There are also similar cases of deadlocks documented in:
Oracle MOS: Deadlock on Row Cache Lock and Library Cache Lock Dropping REF-PARTITION (Doc ID 2796372.1)
Deadlock Caused by Row Cache and Library Cache Locks (https://nenadnoveljic.com/blog/deadlock-row-cache-lock-library-cache-lock/)
6.附上測試腳本:
> cat wcx.sql
@ tpt/ash/ash_wait_chains2 BLOCKING_SESSION||','||BLOCKING_SESSION_SERIAL#||',@'||BLOCKING_INST_ID||'=>'||session_id||','||SESSION_SERIAL#||',@'||inst_id||'=>'||event 1=1 "&&1" "&&2"
> cat sharepool/shp4x.sql
column N0_6_16 format 99999999
SELECT /*+ USE_CONCAT(@"SEL$1" 8 OR_PREDICATES(1)) */ DECODE (kglhdadr,
kglhdpar, 'parent handle address',
'child handle address')
text,
kglhdadr,
kglhdpar,
substr(kglnaobj,1,40) c40,
KGLHDLMD,
KGLHDPMD,
kglhdivc,
kglobhd0,
kglobhd6,
kglobhs0,kglobhs6,kglobt16,
kglobhs0+kglobhs6+kglobt16 N0_6_16,
kglobhs0+kglobhs1+kglobhs2+kglobhs3+kglobhs4+kglobhs5+kglobhs6+kglobt16 N20,
kglnahsh,
kglobt03,
kglobt09
FROM x$kglob
WHERE kglobt03 = lower('&1') or KGLNAHSH= &2;
7.如果換成表呢?
--//drop table test_tab purge;
--//drop cluster test_cluster;
SCOTT@test01p> create table test_tab (x, y) as select level, 'abc' from dual connect by level <= 3;
Table created.
create or replace procedure test_proc_alt_table(p_cnt number) as
begin
for i in 1..p_cnt loop
execute immediate 'ALTER TABLE test_tab PCTFREE 18 INITRANS 19';
end loop;
end;
/
--//session 1:
SCOTT@test01p> exec test_proc_sel_tab(1e3);
PL/SQL procedure successfully completed.
Elapsed: 00:00:11.26
--//session 2:
SCOTT@test01p> exec test_proc_alt_table(1e4);
PL/SQL procedure successfully completed.
--//沒有任何問題。