官網網址參考: https://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_stats.htm#CIHBIEII https://docs.oracle.com/cd/B12037_01/server.101/b10759/statements_ ...
官網網址參考:
https://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_stats.htm#CIHBIEII
https://docs.oracle.com/cd/B12037_01/server.101/b10759/statements_4005.htm#i2150533
https://asktom.oracle.com/pls/asktom/f?p=100:11:::::p11_question_id:5792247321358
https://docs.oracle.com/database/121/TGSQL/tgsql_histo.htm#TGSQL366
查詢表上一次收集統計信息的時間:
select owner,table_name,last_analyzed from dba_tables where owner='SCOTT';
統計信息涉及的視圖:
Column statistics appear in the data dictionary views USER_TAB_COLUMNS
, ALL_TAB_COLUMNS
, and DBA_TAB_COLUMNS
. Histograms appear in the data dictionary views USER_TAB_HISTOGRAMS
, DBA_TAB_HISTOGRAMS
, and ALL_TAB_HISTOGRAMS
; USER_PART_HISTOGRAMS
, DBA_PART_HISTOGRAMS
, and ALL_PART_HISTOGRAMS
; and USER_SUBPART_HISTOGRAMS
, DBA_SUBPART_HISTOGRAMS
, and ALL_SUBPART_HISTOGRAMS
.
收集統計信息主要有2種方法:
1. analyze
analyze可以用來收集表,索引,列以及系統的統計信息和直方圖,以下為一些典型用法:
analyze table scott.emp compute statistics; --收集所有的統計信息和直方圖信息,包括表、列、索引。 analyze table scott.emp compute statistics for table; --收集emp表的統計信息,不含列、索引統計信息和直方圖。 analyze table scott.emp compute statistics for all columns; --收集所有列的統計信息和直方圖(超大表較耗資源,因為只要列中有非空值,那麼就會收集這個列的統計信息和直方圖)。 analyze table scott.emp compute statistics for all indexed columns; --收集所有索引列的統計信息和直方圖。 analyze table scott.emp compute statistics for all indexes; --收集所有索引統計信息,不含列的統計信息和直方圖。 analyze table scott.emp compute statistics for columns 列1,列2; --收集2個列的統計信息和直方圖。 analyze index idx_ename delete statistics; --刪除索引idx_ename的統計信息。 analyze table scott.emp delete statistics; --刪除表t1所有的表,列,索引的統計信息和列直方圖。 analyze table scott.emp estimate statistics sample 15 percent for table; --收集emp表的統計信息,以估算模式採樣比例為15%進行收集,不含列、索引統計信息和直方圖。
從語法可以看出,只有指定列統計信息收集時,才會收集相關列的直方圖,此外收集直方圖時for子句還可以加size子句,size的取值範圍是1-254,預設值是75,表示直方圖的buckets的最大數目。而dbms_stats包的size選擇則有:數字|auto|repeat|skewonly選項,但analyze的size只能是數字。
關於直方圖:
A histogram is a special type of column statistic that provides more detailed information about the data distribution in a table column. A histogram sorts values into "buckets," as you might sort coins into buckets.
從官網解釋(參考第四個網址)來看,直方圖就是一種特殊的列統計信息,這也與我們上邊的推斷相符,只有列才有直方圖。
這裡貼一個Tom Kyte用於查看analyze後統計信息的SQL:(已稍作改進,僅示例,這種格式的SQL不推薦,原SQL較簡單參考第三個網址)
select t.num_rows as num_rows_in_table, i.index_name, i.num_rows as num_rows_in_index, co.num_analyzed_cols,ch.histogram_cnt from (select num_rows from user_tables where table_name ='EMP') t, (select index_name,num_rows from user_indexes where table_name = 'EMP') i, (select count(*) as num_analyzed_cols from user_tab_columns where table_name='EMP' and num_distinct is not null) co, (select count(distinct column_name) histogram_cnt from user_tab_histograms where table_name = 'EMP' ) ch;
需要註意的一點是for table選項在某些版本中並不只收集表統計信息,而是連列和索引的統計信息一塊收集了,至於具體哪些版本的表現不同這裡不做深究,使用上述SQL可以輕易的測試出你的analyze和dbms_stats語句到底收集了什麼統計信息和直方圖。
2. 調用dbms_stats包
dbms_stats與analyze的區別是:
analyze收集系統內部對象會報錯,而dbms_stats不會
analyze不能正確的收集分區表的統計信息,而dbms_stats可以通過指定粒度來實現(granularity)。
analyze不能並行的收集統計信息,而dbms_stats可以(可以加上degree=>4來實現並行度為4的收集)。
Oracle推薦使用dbms_stats來收集統計信息,analyze將會被逐漸拋棄。
dbms_stats中負責收集統計信息的是以下幾個存儲過程:
GATHER_DATABASE_STATS --This procedure gathers statistics for all objects in the database. GATHER_DICTIONARY_STATS --This procedure gathers statistics for dictionary schemas 'SYS', 'SYSTEM' and schemas of RDBMS components. GATHER_FIXED_OBJECTS_STATS --This procedure gathers statistics for all fixed objects (dynamic performance tables). GATHER_INDEX_STATS --This procedure gathers index statistics. It attempts to parallelize as much of the work as possible. Restrictions are described in the individual parameters. This operation will not parallelize with certain types of indexes, including cluster indexes, domain indexes, and bitmap join indexes. The granularity and no_invalidate arguments are not relevant to these types of indexes. GATHER_SCHEMA_STATS --This procedure gathers statistics for all objects in a schema. GATHER_SYSTEM_STATS --This procedure gathers system statistics. GATHER_TABLE_STATS --This procedure gathers table and column (and index) statistics. It attempts to parallelize as much of the work as possible, but there are some restrictions as described in the individual parameters.
三個常用Procedure用法詳解:GATHER_SCHEMA_STATS(兩種用法)、GATHER_TABLE_STATS、GATHER_INDEX_STATS
PROCEDURE GATHER_SCHEMA_STATS Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- OWNNAME VARCHAR2 IN ESTIMATE_PERCENT NUMBER IN DEFAULT BLOCK_SAMPLE BOOLEAN IN DEFAULT METHOD_OPT VARCHAR2 IN DEFAULT DEGREE NUMBER IN DEFAULT GRANULARITY VARCHAR2 IN DEFAULT CASCADE BOOLEAN IN DEFAULT STATTAB VARCHAR2 IN DEFAULT STATID VARCHAR2 IN DEFAULT OPTIONS VARCHAR2 IN DEFAULT OBJLIST OBJECTTAB OUT STATOWN VARCHAR2 IN DEFAULT NO_INVALIDATE BOOLEAN IN DEFAULT GATHER_TEMP BOOLEAN IN DEFAULT GATHER_FIXED BOOLEAN IN DEFAULT STATTYPE VARCHAR2 IN DEFAULT FORCE BOOLEAN IN DEFAULT OBJ_FILTER_LIST OBJECTTAB IN DEFAULT PROCEDURE GATHER_SCHEMA_STATS Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- OWNNAME VARCHAR2 IN ESTIMATE_PERCENT NUMBER IN DEFAULT BLOCK_SAMPLE BOOLEAN IN DEFAULT METHOD_OPT VARCHAR2 IN DEFAULT DEGREE NUMBER IN DEFAULT GRANULARITY VARCHAR2 IN DEFAULT CASCADE BOOLEAN IN DEFAULT STATTAB VARCHAR2 IN DEFAULT STATID VARCHAR2 IN DEFAULT OPTIONS VARCHAR2 IN DEFAULT STATOWN VARCHAR2 IN DEFAULT NO_INVALIDATE BOOLEAN IN DEFAULT GATHER_TEMP BOOLEAN IN DEFAULT GATHER_FIXED BOOLEAN IN DEFAULT STATTYPE VARCHAR2 IN DEFAULT FORCE BOOLEAN IN DEFAULT OBJ_FILTER_LIST OBJECTTAB IN DEFAULT PROCEDURE GATHER_TABLE_STATS Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- OWNNAME VARCHAR2 IN TABNAME VARCHAR2 IN PARTNAME VARCHAR2 IN DEFAULT ESTIMATE_PERCENT NUMBER IN DEFAULT BLOCK_SAMPLE BOOLEAN IN DEFAULT METHOD_OPT VARCHAR2 IN DEFAULT DEGREE NUMBER IN DEFAULT GRANULARITY VARCHAR2 IN DEFAULT CASCADE BOOLEAN IN DEFAULT STATTAB VARCHAR2 IN DEFAULT STATID VARCHAR2 IN DEFAULT STATOWN VARCHAR2 IN DEFAULT NO_INVALIDATE BOOLEAN IN DEFAULT STATTYPE VARCHAR2 IN DEFAULT FORCE BOOLEAN IN DEFAULT PROCEDURE GATHER_INDEX_STATS Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- OWNNAME VARCHAR2 IN INDNAME VARCHAR2 IN PARTNAME VARCHAR2 IN DEFAULT ESTIMATE_PERCENT NUMBER IN DEFAULT STATTAB VARCHAR2 IN DEFAULT STATID VARCHAR2 IN DEFAULT STATOWN VARCHAR2 IN DEFAULT DEGREE NUMBER IN DEFAULT GRANULARITY VARCHAR2 IN DEFAULT NO_INVALIDATE BOOLEAN IN DEFAULT STATTYPE VARCHAR2 IN DEFAULT FORCE BOOLEAN IN DEFAULT
GATHER_SCHEMA_STATS參數詳解:(其他存儲過程的參數解釋參見官方頁面,很多參數description都是通用的)
詳見:Table 103-30 GATHER_SCHEMA_STATS Procedure Parameters
一些實際用例:
EXEC DBMS_STATS.GATHER_SCHEMA_STATS('SCOTT',estimate_percent=>80,method_opt=>'FOR ALL COLUMNS SIZE AUTO',degree=>4,cascade=>TRUE); EXEC DBMS_STATS.GATHER_TABLE_STATS('SCOTT','EMP',estimate_percent=>80,method_opt=>'FOR ALL COLUMNS SIZE AUTO',degree=>4,cascade=>TRUE); EXEC DBMS_STATS.GATHER_INDEX_STATS('SCOTT','PK_EMP',estimate_percent=>80,degree=>4);
一些特別提示:
- 雖然method_opt的description中並未提及for table這個選項,但其實這個選項也是有效的,同analyze一樣,這個參數在不同版本的表現也是不一樣的,具體差異也可以輕易的使用本文中提供的SQL觀察到。
- 如果想使用compute方式收集統計信息,將estimate_percent設為100或者null即可。
- Oracle有auto optimizer stats collection的自動維護任務定期的收集統計信息,這些任務是預設開啟的,但當資料庫變的很大之後就會引發嚴重的性能問題,建議只保留周末的一個視窗,其他視窗全部關閉。
- 直方圖統計信息並不是那麼的重要,只有在遇到對傾斜列(skew)的查詢很頻繁時才有用,這種情況並不常見。
- 不再推薦使用analyze來收集統計信息,除非是做測試或者表很小,dbms_stats的並行度選項能加快收集速度。
- 對大表採樣收集統計信息時一般採樣比例不需要很大,通常10%到30%即可,如果業務可以提供維護視窗,那100%也沒什麼大不了。
- 如果要詳細瞭解統計信息收集了什麼內容,可以參考本文提供的網址鏈接和視圖。
關於執行許可權:
To invoke this procedure you must be owner of the table, or you need the ANALYZE ANY privilege. For objects owned by SYS, you need to be either the owner of the table, or you need the ANALYZE ANY DICTIONARY privilege or the SYSDBA privilege.當然關於許可權還有個取巧的辦法,示例如下:
conn hr/hr create or replace procedure gather_stats is begin dbms_stats.gather_table_stats('HR', 'EMPLOYEES'); end gather_stats; / grant select on hr.employees to scott; grant execute on gather_stats to scott; conn scott/scott exec hr.gather_stats;