[20181130]hash衝突導致查詢緩慢.txt

来源:https://www.cnblogs.com/lfree/archive/2018/11/30/10043021.html
-Advertisement-
Play Games

[20181130]hash衝突導致查詢緩慢.txt--//昨天看了鏈接https://jonathanlewis.wordpress.com/2018/11/26/shrink-space-2/,演示了Shrink Space導致--//執行語句緩慢的情況,我自己重覆測試,實際上這樣發生的概率還是 ...


[20181130]hash衝突導致查詢緩慢.txt

--//昨天看了鏈接https://jonathanlewis.wordpress.com/2018/11/26/shrink-space-2/,演示了Shrink Space導致
--//執行語句緩慢的情況,我自己重覆測試,實際上這樣發生的概率還是很低的,我個人認為,至於Shrink Space是否好壞,
--//我個人還是根據實際的情況來確定.

1.環境:
SCOTT@book> @ ver1

PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

SCOTT@book> rename  emp to empxxx;
Table renamed.

--//作者測試的表名與scoot用戶下衝突,我先修改原系統的表名.

2.建立測試腳本:
create table emp(
        dept_no         not null,
        sal,
        emp_no          not null,
        padding,
        constraint e_pk primary key(emp_no)
)
as
with generator as (
        select  null
        from    dual
        connect by
                level <= 1e4 -- > comment to avoid wordpress format issue
)
select
        mod(rownum,6),
        rownum,
        rownum,
        rpad('x',60)
from
        generator       v1,
        generator       v2
where
        rownum <= 2e4 -- > comment to avoid wordpress format issue
;
 
 
insert into emp values(432, 20001, 20001, rpad('x',60));
delete /*+ full(emp) */ from emp where emp_no <= 1000;      -- > comment to avoid wordpress format issue
--//註:執行時要刪除後面的註解.不然報錯.作者應該把分號放在最後才能正常執行ok.
commit;
 
begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'EMP',
                method_opt       => 'for all columns size 1'
        );
end;
/

3.測試:
SCOTT@book> alter session set statistics_level = all;
Session altered.

select
        /*+ gather_plan_statistics pre-shrink */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from
                emp outer
        where
                outer.sal > (
                        select  /*+ no_unnest */
                                avg(inner.sal)
                        from
                                emp inner
                        where
                                inner.dept_no = outer.dept_no
                )
        )
;


  COUNT(*)
----------
      9998

SCOTT@book> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  9bkx1f5cpcv14, child number 1
-------------------------------------
select         /*+ gather_plan_statistics pre-shrink */
count(*) from    (         select  /*+ no_merge */
outer.*         from                 emp outer         where
     outer.sal > (                         select  /*+ no_unnest */
                            avg(inner.sal)                         from
                                emp inner                         where
                                inner.dept_no = outer.dept_no
      )         )

Plan hash value: 322796046

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |       |   569 (100)|          |      1 |00:00:00.04 |    1912 |
|   1 |  SORT AGGREGATE       |      |      1 |      1 |       |            |          |      1 |00:00:00.04 |    1912 |
|   2 |   VIEW                |      |      1 |    143 |       |   569   (1)| 00:00:07 |   9998 |00:00:00.04 |    1912 |
|*  3 |    FILTER             |      |      1 |        |       |            |          |   9998 |00:00:00.03 |    1912 |
|   4 |     TABLE ACCESS FULL | EMP  |      1 |  20001 |   156K|    71   (0)| 00:00:01 |  20001 |00:00:00.01 |     239 |
|   5 |     SORT AGGREGATE    |      |      7 |      1 |     8 |            |          |      7 |00:00:00.02 |    1673 |
|*  6 |      TABLE ACCESS FULL| EMP  |      7 |   2857 | 22856 |    71   (0)| 00:00:01 |  20001 |00:00:00.01 |    1673 |
------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$2 / from$_subquery$_001@SEL$1
   3 - SEL$2
   4 - SEL$2 / OUTER@SEL$2
   5 - SEL$3
   6 - SEL$3 / INNER@SEL$3

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("OUTER"."SAL">)
   6 - filter("INNER"."DEPT_NO"=:B1)

--//你可以發現內層迴圈掃描emp表7次.因為有7個部門.也就是oracle緩存了執行過.雖然最後1個記錄是dept_no=432存在衝突,僅僅1條,
--//影響不大.

4.測試Shrink Space後:
SCOTT@book> alter table emp enable row movement;
Table altered.

SCOTT@book> alter table emp shrink space compact;
Table altered.

SCOTT@book> select * from emp where rownum<=4;
   DEPT_NO        SAL     EMP_NO PADDING
---------- ---------- ---------- ---------
       432      20001      20001 x
         4      19978      19978 x
         5      19979      19979 x
         0      19980      19980 x

--//這樣dept_no=432被移動到前面.

select
        /*+ gather_plan_statistics post-shrink  */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from emp outer
        where outer.sal >
                (
                        select /*+ no_unnest */ avg(inner.sal)
                        from emp inner
                        where inner.dept_no = outer.dept_no
                )
        )
;

  COUNT(*)
----------
      9498

SCOTT@book> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  gx7xb7rhfd2zf, child number 0
-------------------------------------
select         /*+ gather_plan_statistics post-shrink  */
count(*) from    (         select  /*+ no_merge */
outer.*         from emp outer         where outer.sal >
 (                         select /*+ no_unnest */ avg(inner.sal)
                  from emp inner                         where
inner.dept_no = outer.dept_no                 )         )

Plan hash value: 322796046

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |       |   569 (100)|          |      1 |00:00:03.43 |     783K|
|   1 |  SORT AGGREGATE       |      |      1 |      1 |       |            |          |      1 |00:00:03.43 |     783K|
|   2 |   VIEW                |      |      1 |    143 |       |   569   (1)| 00:00:07 |   9498 |00:00:03.43 |     783K|
|*  3 |    FILTER             |      |      1 |        |       |            |          |   9498 |00:00:03.43 |     783K|
|   4 |     TABLE ACCESS FULL | EMP  |      1 |  20001 |   156K|    71   (0)| 00:00:01 |  19001 |00:00:00.01 |     247 |
|   5 |     SORT AGGREGATE    |      |   3172 |      1 |     8 |            |          |   3172 |00:00:03.42 |     783K|
|*  6 |      TABLE ACCESS FULL| EMP  |   3172 |   2857 | 22856 |    71   (0)| 00:00:01 |     10M|00:00:02.71 |     783K|
------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$2 / from$_subquery$_001@SEL$1
   3 - SEL$2
   4 - SEL$2 / OUTER@SEL$2
   5 - SEL$3
   6 - SEL$3 / INNER@SEL$3

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("OUTER"."SAL">)
   6 - filter("INNER"."DEPT_NO"=:B1)

--//註:我的測試機器比較快,沒有作者測試的9秒,僅僅接近4秒完成,不過還是看出比原來執行慢.註意看id=6,迴圈執行次數是3172.
--//也就是dept_no=432與dept_no=0,1,2,3,4,5存在hash衝突,這樣每次執行內層迴圈dept_no=:B1是都要重覆調用.

SCOTT@book> select dept_no,count(*) from emp group by dept_no order by 1;
   DEPT_NO   COUNT(*)
---------- ----------
         0       3167
         1       3167
         2       3167
         3       3166
         4       3166
         5       3167
       432          1
7 rows selected.

--//假設與dept_no=1出現hash衝突.
--//dept_no=432 迴圈1次
--//dept_no=0   迴圈1次
--//dept_no=1   迴圈3167次
--//dept_no=2   迴圈1次
--//dept_no=3   迴圈1次
--//dept_no=4   迴圈1次
--//dept_no=5   迴圈1次

--//這樣累加: 1+1+3167+1+1+1+1 = 3173 ,不對相差1.我做了一些細節,證明hash衝突是dept_no=4.

5.其它有趣的測試:
--//執行如下,只要dept_no in 裡面包括4,432查詢就很慢(至少查詢3個部門).就會有點慢.
--//也就是證明hash衝突的是dept_no=4.

select
        /*+ gather_plan_statistics post-shrink  */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from emp outer
        where outer.sal >
                (
                        select /*+ no_unnest */ avg(inner.sal)
                        from emp inner
                        where inner.dept_no = outer.dept_no
                )
        ) where dept_no in (432,4,5)
;

--//如果你執行如下,你會發現執行很快:
select
        /*+ gather_plan_statistics post-shrink  */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from emp outer
        where outer.sal >
                (
                        select /*+ no_unnest */ avg(inner.sal)
                        from emp inner
                        where inner.dept_no = outer.dept_no
                )
        ) where dept_no in (432,4)
;


SCOTT@book> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  9v9984wd6k9t5, child number 0
-------------------------------------
select         /*+ gather_plan_statistics post-shrink  */
count(*) from    (         select  /*+ no_merge */
outer.*         from emp outer         where outer.sal >
 (                         select /*+ no_unnest */ avg(inner.sal)
                  from emp inner                         where
inner.dept_no = outer.dept_no                 )         ) where dept_no
in (432,4)

Plan hash value: 322796046

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |       |   214 (100)|          |      1 |00:00:00.01 |     741 |
|   1 |  SORT AGGREGATE       |      |      1 |      1 |       |            |          |      1 |00:00:00.01 |     741 |
|   2 |   VIEW                |      |      1 |    143 |       |   214   (1)| 00:00:03 |   1583 |00:00:00.01 |     741 |
|*  3 |    FILTER             |      |      1 |        |       |            |          |   1583 |00:00:00.01 |     741 |
|*  4 |     TABLE ACCESS FULL | EMP  |      1 |   5715 | 45720 |    71   (0)| 00:00:01 |   3167 |00:00:00.01 |     247 |
|   5 |     SORT AGGREGATE    |      |      2 |      1 |     8 |            |          |      2 |00:00:00.01 |     494 |
|*  6 |      TABLE ACCESS FULL| EMP  |      2 |   2857 | 22856 |    71   (0)| 00:00:01 |   3167 |00:00:00.01 |     494 |
------------------------------------------------------------------------------------------------------------------------

--//迴圈僅僅2次.這是因為參考鏈接:
http://blog.itpub.net/267265/viewspace-2155927/
https://blogs.oracle.com/oraclemagazine/on-caching-and-evangelizing-sql

-//摘要如下:

When you're using a scalar subquery, Oracle Database will set up a small in-memory hash table for the subquery and its
results each time it runs the query. So, when you run the previous query, Oracle Database sets up in memory a hash table
that looks like this:

Oracle Database will use this hash table to remember the scalar subquery and the inputs to it—just :DEPTNO in this case
—and the output from it. At the beginning of every query execution, this cache is empty, but suppose you run the query
and the first PROJECTS row you retrieve has a DEPTNO value of 10. Oracle Database will assign the number 10 to a hash
value between 1 and 255 (the size of the hash table cache in Oracle Database 10g and Oracle Database 11g currently) and
will look in that hash table slot to see if the answer exists. In this case, it will not, so Oracle Database must run
the scalar subquery with the input of 10 to get the answer. If that answer (count) is 42, the hash table may look
something like this:

//註:補充說明我測試10.2.0.5,buckets=512而不是255.有機會測試11.2.0.4的情況.

Select count(*) from emp where emp.deptno = :deptno
:deptno     Count(*)

You'll have saved the DEPTNO value of 10 and the answer (count) of 42 in some slot—probably not the first or last slot,
but whatever slot the hash value 10 is assigned to. Now suppose the second row you get back from the PROJECTS table
includes a DEPTNO value of 20. Oracle Database will again look in the hash table after assigning the value 20, and it
will discover "no result in the cache yet." So it will run the scalar subquery, get the result, and put it into the hash
table cache. Now the cache may look like this:

Select count(*) from emp where emp.deptno = :deptno
:deptno     Count(*)
Select count(*) from emp where emp.deptno = :deptno
:deptno     Count(*)
…     …
10     42

Now suppose the query returns a third row and it again includes a DEPTNO value of 10. This time, Oracle Database will
see DEPTNO = 10, find that it already has that value in the hash table cache, and will simply return 42 from the cache
instead of executing the scalar subquery. In fact, it will never have to run that scalar subquery for the DEPTNO values
of 10 or 20 again for that query—it will already have the answer.

What happens if the number of unique DEPTNO values exceeds the size of the hash table? What if there are more than 255
values? Or, more generally, if more than one DEPTNO value is assigned to the same slot in the hash table, what happens
in a hash collision?

The answer is the same for all these questions and is rather simple: Oracle Database will not be able to cache the
second or nth value to that slot in the hash table. For example, what if the third row returned by the query contains
the DEPTNO = 30 value? Further, suppose that DEPTNO = 30 is to be assigned to exactly the same hash table slot as DEPTNO
= 10. The database won't be able to effectively cache DEPTNO = 30 in this case—the value will never make it into the
hash table. It will, however, be "partially cached." Oracle Database still has the hash table with all the previous
executions, but it also keeps the last scalar subquery result it had "next to" the hash table. That is, if the fourth
row also includes a DEPTNO = 30 value, Oracle Database will discover that the result is not in the hash table but is
"next to" the hash table, because the last time it ran the scalar subquery, it was run with an input of 30. On the other
hand, if the fourth row includes a DEPTNO = 40 value, Oracle Database will run the scalar subquery with the DEPTNO = 40
value (because it hasn't seen that value yet during this query execution) and overwrite the DEPTNO = 30 result. The next
time Oracle Database sees DEPTNO = 30 in the result set, it'll have to run that scalar subquery again.

--//答案在這一段落中,如果查詢結果臨近它會從前面的查詢獲得結果,而不用進入迴圈.我僅僅查詢dept_no in (432,4),這樣後面全部是
--//dept_no=4返回,這樣可以從臨近的查詢獲得結果.實際上你看作者的表設計就知道答案:
SCOTT@book> select * from emp where rownum<=10;
   DEPT_NO        SAL     EMP_NO PADDING
---------- ---------- ---------- --------
       432      20001      20001 x
         4      19978      19978 x
         5      19979      19979 x
         0      19980      19980 x
         1      19981      19981 x
         2      19982      19982 x
         3      19983      19983 x
         4      19984      19984 x
         5      19985      19985 x
         0      19986      19986 x
10 rows selected.
--//dept_no記錄不是聚集在一起的.

總結:
--//實際上這個例子我記憶在作者<基於成本的優化>的書中提到過.當時想作者如何知道那個數存在衝突.感覺作者很厲害.
--//另外寫一篇blog猜測那些hash存在衝突的.




您的分享是我們最大的動力!

-Advertisement-
Play Games
更多相關文章
  • 一. 好用便利的工具,常用 pt-align 對齊文本格式pt-archiver 循序漸進的歸檔表,刪除表,遷移數據pt-config-diff 對比不同配置文件、伺服器配置參數pt-diskstats 查看磁碟iopt-fifo-split 把大文件通過管道分割成若幹小文件pt-kill 批量殺連 ...
  • 問題:Oracle EBS 11無法打開Form及Form顯示亂碼 解決: 1、嘗試使用jre1.5或1.6安裝目錄下jre/bin/server目錄里的jvm.dll替換JInitiator安裝目錄里的bin/hotspot目錄下的jvm.dll文件,如果替換後,後續操作提示錯誤,不妨再換為預設的 ...
  • 以前記錄數據可能很少也很簡單,比如說老王借了老李半斤肉,這樣的數據老李直接就寫到牆上就行了。 後來數據多了人們就以表格的方式開始記錄,寫到一張A4紙上,比如學生的檔案,有表頭和序號等。 表頭裡有姓名、性別、年齡、籍貫等等,有橫向的信息有豎向的信息。這樣的A4紙散放在那裡肯定是不行的。 於是人們把它們 ...
  • oracle11g中沒有scott用戶的解決,通過創建scott.sql文件,然後授予許可權,解鎖,修改密碼等一系列的操作來完成賬號建立 ...
  • 一.概述 除了上篇介紹的RDB持久化功能之外,Redis還提供了AOF(Append Only File)持久化功能。與RDB保存資料庫中的鍵值對來記錄資料庫狀態不同,AOF是通過保存redis伺服器所執行的寫命令來記錄資料庫狀態的。AOF持久化方式記錄每次對伺服器寫的操作,當伺服器啟動時,就會通過 ...
  • 摘要:當前研發工作中經常出現因資料庫表、資料庫表欄位格式不規則而影響開發進度的問題,在後續開發使用原來資料庫表時,也會因為資料庫表的可讀性不夠高,表欄位規則不統一,造成數據查詢,數據使用效率低的問題,所以有必要整理出一套合適的資料庫表欄位命名規範來解決優化這些問題。 本文是一篇包含了資料庫命名、數據 ...
  • mysql是一種關係型資料庫管理系統。以mysql5.7版本為例,安裝過程如下: 首先百度出mysql的官網,進入:(以下是自己安裝失敗的過程,直接下拉最後看大佬的安裝過程吧,就是那個紅紅的網址) 找到mysql的下載社區,找到對應的版本,這裡以mysql5.7為例: 這裡我們選擇zip格式安裝,對 ...
  • [20181130]如何猜測那些值存在hash衝突.txt--//今年6月份開始kerrycode的1個帖子提到子查詢結果緩存在哈希表中情況:--//鏈接:http://www.cnblogs.com/kerrycode/p/9099507.html,摘要:通俗來將,當使用標量子查詢的時候,ORAC ...
一周排行
    -Advertisement-
    Play Games
  • 示例項目結構 在 Visual Studio 中創建一個 WinForms 應用程式後,項目結構如下所示: MyWinFormsApp/ │ ├───Properties/ │ └───Settings.settings │ ├───bin/ │ ├───Debug/ │ └───Release/ ...
  • [STAThread] 特性用於需要與 COM 組件交互的應用程式,尤其是依賴單線程模型(如 Windows Forms 應用程式)的組件。在 STA 模式下,線程擁有自己的消息迴圈,這對於處理用戶界面和某些 COM 組件是必要的。 [STAThread] static void Main(stri ...
  • 在WinForm中使用全局異常捕獲處理 在WinForm應用程式中,全局異常捕獲是確保程式穩定性的關鍵。通過在Program類的Main方法中設置全局異常處理,可以有效地捕獲並處理未預見的異常,從而避免程式崩潰。 註冊全局異常事件 [STAThread] static void Main() { / ...
  • 前言 給大家推薦一款開源的 Winform 控制項庫,可以幫助我們開發更加美觀、漂亮的 WinForm 界面。 項目介紹 SunnyUI.NET 是一個基於 .NET Framework 4.0+、.NET 6、.NET 7 和 .NET 8 的 WinForm 開源控制項庫,同時也提供了工具類庫、擴展 ...
  • 說明 該文章是屬於OverallAuth2.0系列文章,每周更新一篇該系列文章(從0到1完成系統開發)。 該系統文章,我會儘量說的非常詳細,做到不管新手、老手都能看懂。 說明:OverallAuth2.0 是一個簡單、易懂、功能強大的許可權+可視化流程管理系統。 有興趣的朋友,請關註我吧(*^▽^*) ...
  • 一、下載安裝 1.下載git 必須先下載並安裝git,再TortoiseGit下載安裝 git安裝參考教程:https://blog.csdn.net/mukes/article/details/115693833 2.TortoiseGit下載與安裝 TortoiseGit,Git客戶端,32/6 ...
  • 前言 在項目開發過程中,理解數據結構和演算法如同掌握蓋房子的秘訣。演算法不僅能幫助我們編寫高效、優質的代碼,還能解決項目中遇到的各種難題。 給大家推薦一個支持C#的開源免費、新手友好的數據結構與演算法入門教程:Hello演算法。 項目介紹 《Hello Algo》是一本開源免費、新手友好的數據結構與演算法入門 ...
  • 1.生成單個Proto.bat內容 @rem Copyright 2016, Google Inc. @rem All rights reserved. @rem @rem Redistribution and use in source and binary forms, with or with ...
  • 一:背景 1. 講故事 前段時間有位朋友找到我,說他的窗體程式在客戶這邊出現了卡死,讓我幫忙看下怎麼回事?dump也生成了,既然有dump了那就上 windbg 分析吧。 二:WinDbg 分析 1. 為什麼會卡死 窗體程式的卡死,入口門檻很低,後續往下分析就不一定了,不管怎麼說先用 !clrsta ...
  • 前言 人工智慧時代,人臉識別技術已成為安全驗證、身份識別和用戶交互的關鍵工具。 給大家推薦一款.NET 開源提供了強大的人臉識別 API,工具不僅易於集成,還具備高效處理能力。 本文將介紹一款如何利用這些API,為我們的項目添加智能識別的亮點。 項目介紹 GitHub 上擁有 1.2k 星標的 C# ...