摘要:本文主要講解數倉運維中遇到單SQL磁碟空間管控問題的解析和方案。 本文分享自華為雲社區《GaussDB(DWS)運維 -- 單SQL磁碟空間管控》,作者: 譡里個檔。 【問題描述】 執行部分SQL語句時出現如下報錯信息(具體數值可能因為配置有差異),本文針對根因和場景觸發場景,確定觸發此類問題 ...
摘要:本文主要講解數倉運維中遇到單SQL磁碟空間管控問題的解析和方案。
本文分享自華為雲社區《GaussDB(DWS)運維 -- 單SQL磁碟空間管控》,作者: 譡里個檔。
【問題描述】
執行部分SQL語句時出現如下報錯信息(具體數值可能因為配置有差異),本文針對根因和場景觸發場景,確定觸發此類問題的根因
The space used on DN (209715224 kB) has exceeded the sql use space limit (209715200 kB)
【問題根因】
該報錯表示用戶執行的sql在單DN上所用空間超過了參數sql_use_spacelimit的限制。sql_use_spacelimit限制單個SQL在單個DN上,觸發落盤操作時,落盤文件的空間大小,管控的空間包括普通表、臨時表以及中間結果集落盤占用的空間
可以使用如下SQL查看所有實例上的配置參數sql_use_spacelimit的值
SELECT * FROM pgxc_settings WHERE name = 'sql_use_spacelimit';
【解決方案】
當前現網最常見的此類錯誤一般都是INERT語句觸發的,我們以常見如下語句為例,說明這類問題的解決方案
INSERT INTO dwljaa.bif_col_edw_dut_257_t (attribute1, attribute2, attribute3, attribute4, attribute5, column_name1, column_name2, column_name3, column_name4, column_name5, tag_code, tag_id, table_name, period, tbl_code, tag_grp_code, target_key_val, cycle_id, creation_date, target_key_num, priority) SELECT 'SCN_SVC_3003', NULL, NULL, NULL, NULL, 'BIZ_SCR_CODE', NULL, NULL, NULL, NULL, 'SCN_SVC_3003-02', 3026937, 'dwr_fin_hwip_man_je_f_tmp0', '202208', 'PL_E17360237', 'SUB_PL_PUB_SCN', A.record_seq_num, 20230321000000, SYSDATE, A.record_seq_num, 53333 FROM (SELECT /*+PARALLEL(8) NO_EXPAND*/ T.record_seq_num FROM dwljaa.dwr_fin_hwip_man_je_f_tmp0 T INNER JOIN dwrdim.dwr_dim_department_d PL_E100134 ON T.COA_DEPT_KEY = PL_E100134.DEPT_KEY INNER JOIN dwrdim.dwr_dim_grp_acct_code_d PL_E100119 ON T.GROUP_ACCOUNT_CODE = PL_E100119.GROUP_ACCOUNT_CODE INNER JOIN dwrdim.dwr_dim_journal_category_d PL_E100147 ON T.JE_CATEGORY_ID = PL_E100147.JE_CATEGORY_ID INNER JOIN dwrdim.dwr_dim_product_d PL_E100121 ON T.MAJOR_PROD_KEY = PL_E100121.PROD_KEY INNER JOIN dwrdim.dwr_dim_product_d PL_E100122 ON T.MINOR_PROD_KEY = PL_E100122.PROD_KEY INNER JOIN dwrdim.dwr_dim_project_d PL_E100155 ON T.PROJ_KEY = PL_E100155.PROJ_KEY WHERE 1 = 1 AND ((((((((((PL_E100119.LVL1_ACCOUNT_CODE IN('501', '503', '506', '512')) OR(PL_E100119.GROUP_ACCOUNT_CODE = '5980406')) AND((PL_E100121.PROD_CLASS_FLAG = '0') OR(PL_E100122.PROD_CLASS_FLAG = '0'))) OR((PL_E100119.LVL1_ACCOUNT_CODE IN('504', 'SVC')) OR(PL_E100119.LVL2_ACCOUNT_CODE IN('55107', '57702', '57703', '58303')) OR(PL_E100119.GROUP_ACCOUNT_CODE IN('5980407', '5825107', '5827702', '5827703', 'C501'))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E01', 'E03', 'E05', 'E06', 'E08', 'E09', 'E10', 'E11', 'E02', 'E04', 'E07', 'E12')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE = 'Z4') AND((PL_E100155.PROJ_TYPE_CODE <> '02') OR(PL_E100155.PROJ_NUM IN('9000000')))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E02', 'E04', 'E07', 'E12', 'E05', 'E09', 'E10')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE IN('Z1', 'Z2', 'Z3')) AND(PL_E100155.PROJ_TYPE_CODE = '01'))) AND((PL_E100119.GROUP_ACCOUNT_CODE <> 'EXP5555')))) AND(PL_E100119.LVL2_ACCOUNT_CODE <> '58303') AND(PL_E100155.PROJ_TYPE_CODE = '01'))) AND((PL_E100119.LVL1_ACCOUNT_CODE = '504') AND(PL_E100147.CN_NAME IN('JV-PFC cooper cost', 'JV-ADJ PA Cooper/Constr', 'JV-Agent REV&COST Adj', 'JV-Agent totalvalue Adj')))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE = 'E05') AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE IN('Z1', 'Z2', 'Z3')) AND(PL_E100155.PROJ_TYPE_CODE = '01') AND(PL_E100147.CN_NAME <> 'JV-ADJ 557 WITH B CODE') AND((((((((PL_E100119.LVL1_ACCOUNT_CODE IN('501', '503', '506', '512')) OR(PL_E100119.GROUP_ACCOUNT_CODE = '5980406')) AND((PL_E100121.PROD_CLASS_FLAG = '0') OR(PL_E100122.PROD_CLASS_FLAG = '0'))) OR((PL_E100119.LVL1_ACCOUNT_CODE IN('504', 'SVC')) OR(PL_E100119.LVL2_ACCOUNT_CODE IN('55107', '57702', '57703', '58303')) OR(PL_E100119.GROUP_ACCOUNT_CODE IN('5980407', '5825107', '5827702', '5827703', 'C501'))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E01', 'E03', 'E05', 'E06', 'E08', 'E09', 'E10', 'E11', 'E02', 'E04', 'E07', 'E12')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE = 'Z4') AND((PL_E100155.PROJ_TYPE_CODE <> '02') OR(PL_E100155.PROJ_NUM IN('9000000')))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E02', 'E04', 'E07', 'E12', 'E05', 'E09', 'E10')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE IN('Z1', 'Z2', 'Z3')) AND(PL_E100155.PROJ_TYPE_CODE = '01'))) AND((PL_E100119.GROUP_ACCOUNT_CODE <> 'EXP5555')))) AND(PL_E100119.LVL2_ACCOUNT_CODE <> '58303') AND(PL_E100155.PROJ_TYPE_CODE = '01'))))) AND T.PERIOD_ID >= 202208 AND T.PERIOD_ID <= 202208 ) A
場景1
1)往目標表INSERT的數據量過大,導致INSERT的數據量在單DN上使用的存儲空間超過sql_use_spacelimit閾值
這種場景一般是配置參數sql_use_spacelimit設置不合理,建議直接調大配置參數sql_use_spacelimit的值
2)INSERT的數據存在傾斜,導致某個DN上數據寫入量特別大,在這個DN上觸發了sql_use_spacelimit閾值
這類問題的解決一般分為以下幾個步驟
a)使用如下SQL查找目標表分佈列
SELECT pg_get_tabledef('dwljaa.bif_col_edw_dut_257_t'::regclass);
獲取的表定義如下
SET search_path = dwljaa; CREATE TABLE bif_col_edw_dut_257_t ( tag_grp_code character varying(100), tag_code character varying(100), tag_id numeric, period character varying(20), tbl_code character varying(100) NOT NULL, table_name character varying(50), target_key_val character varying(100), target_key_num numeric, cycle_id numeric, creation_date timestamp(0) without time zone, attribute1 character varying(100), attribute2 character varying(100), attribute3 character varying(100), attribute4 character varying(100), attribute5 character varying(100), priority numeric, column_name1 character varying(100), column_name2 character varying(100), column_name3 character varying(100), column_name4 character varying(100), column_name5 character varying(100), carrying_dimension1 character varying(100), carrying_dimension2 character varying(100), carrying_dimension3 character varying(100) ) WITH (orientation=column, compression=low, colversion=2.0, enable_delta=false) DISTRIBUTE BY HASH(target_key_num) TO GROUP group_version1;
b)根據表定義和INSERT語句,確認分佈列在查詢語句中的輸出列位置
根據表定義(分佈列為target_key_num)以及原始的INSERT語句描述,查詢語句輸出的導數第二列( A.record_seq_num)對應目標表的分佈列target_key_num
c)構建如下查詢語句,判斷查詢語句輸出數據在欄位A.record_seq_num是否存在嚴重
如果輸出的第一條記錄的cnt值非常大(比如上百萬甚至更多),導致此值對應記錄的存儲空間可能觸發sql_use_spacelimit閾值,那麼就可以明確是數據傾斜導致的。 這種場景一般需要先排查數據傾斜產生的原因是否合理,如果數據缺失存在傾斜,那麼建議修改表的分佈列,具體修改方案參見《GaussDB(DWS)性能調優系列實戰篇三:十八般武藝之好味道表定義》
WITH t AS(-- 把原始語句中的查詢部分封裝為CTE,查詢語句實處列只包含分佈列 SELECT A.record_seq_num FROM (SELECT /*+PARALLEL(8) NO_EXPAND*/ T.record_seq_num FROM dwljaa.dwr_fin_hwip_man_je_f_tmp0 T INNER JOIN dwrdim.dwr_dim_department_d PL_E100134 ON T.COA_DEPT_KEY = PL_E100134.DEPT_KEY INNER JOIN dwrdim.dwr_dim_grp_acct_code_d PL_E100119 ON T.GROUP_ACCOUNT_CODE = PL_E100119.GROUP_ACCOUNT_CODE INNER JOIN dwrdim.dwr_dim_journal_category_d PL_E100147 ON T.JE_CATEGORY_ID = PL_E100147.JE_CATEGORY_ID INNER JOIN dwrdim.dwr_dim_product_d PL_E100121 ON T.MAJOR_PROD_KEY = PL_E100121.PROD_KEY INNER JOIN dwrdim.dwr_dim_product_d PL_E100122 ON T.MINOR_PROD_KEY = PL_E100122.PROD_KEY INNER JOIN dwrdim.dwr_dim_project_d PL_E100155 ON T.PROJ_KEY = PL_E100155.PROJ_KEY WHERE 1 = 1 AND ((((((((((PL_E100119.LVL1_ACCOUNT_CODE IN('501', '503', '506', '512')) OR(PL_E100119.GROUP_ACCOUNT_CODE = '5980406')) AND((PL_E100121.PROD_CLASS_FLAG = '0') OR(PL_E100122.PROD_CLASS_FLAG = '0'))) OR((PL_E100119.LVL1_ACCOUNT_CODE IN('504', 'SVC')) OR(PL_E100119.LVL2_ACCOUNT_CODE IN('55107', '57702', '57703', '58303')) OR(PL_E100119.GROUP_ACCOUNT_CODE IN('5980407', '5825107', '5827702', '5827703', 'C501'))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E01', 'E03', 'E05', 'E06', 'E08', 'E09', 'E10', 'E11', 'E02', 'E04', 'E07', 'E12')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE = 'Z4') AND((PL_E100155.PROJ_TYPE_CODE <> '02') OR(PL_E100155.PROJ_NUM IN('9000000')))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E02', 'E04', 'E07', 'E12', 'E05', 'E09', 'E10')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE IN('Z1', 'Z2', 'Z3')) AND(PL_E100155.PROJ_TYPE_CODE = '01'))) AND((PL_E100119.GROUP_ACCOUNT_CODE <> 'EXP5555')))) AND(PL_E100119.LVL2_ACCOUNT_CODE <> '58303') AND(PL_E100155.PROJ_TYPE_CODE = '01'))) AND((PL_E100119.LVL1_ACCOUNT_CODE = '504') AND(PL_E100147.CN_NAME IN('JV-PFC cooper cost', 'JV-ADJ PA Cooper/Constr', 'JV-Agent REV&COST Adj', 'JV-Agent totalvalue Adj')))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE = 'E05') AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE IN('Z1', 'Z2', 'Z3')) AND(PL_E100155.PROJ_TYPE_CODE = '01') AND(PL_E100147.CN_NAME <> 'JV-ADJ 557 WITH B CODE') AND((((((((PL_E100119.LVL1_ACCOUNT_CODE IN('501', '503', '506', '512')) OR(PL_E100119.GROUP_ACCOUNT_CODE = '5980406')) AND((PL_E100121.PROD_CLASS_FLAG = '0') OR(PL_E100122.PROD_CLASS_FLAG = '0'))) OR((PL_E100119.LVL1_ACCOUNT_CODE IN('504', 'SVC')) OR(PL_E100119.LVL2_ACCOUNT_CODE IN('55107', '57702', '57703', '58303')) OR(PL_E100119.GROUP_ACCOUNT_CODE IN('5980407', '5825107', '5827702', '5827703', 'C501'))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E01', 'E03', 'E05', 'E06', 'E08', 'E09', 'E10', 'E11', 'E02', 'E04', 'E07', 'E12')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE = 'Z4') AND((PL_E100155.PROJ_TYPE_CODE <> '02') OR(PL_E100155.PROJ_NUM IN('9000000')))) OR((PL_E100119.ACCOUNT_EXPENSE_CATG_CODE IN('E02', 'E04', 'E07', 'E12', 'E05', 'E09', 'E10')) AND(PL_E100134.DEPT_EXPENSE_TYPE_CODE IN('Z1', 'Z2', 'Z3')) AND(PL_E100155.PROJ_TYPE_CODE = '01'))) AND((PL_E100119.GROUP_ACCOUNT_CODE <> 'EXP5555')))) AND(PL_E100119.LVL2_ACCOUNT_CODE <> '58303') AND(PL_E100155.PROJ_TYPE_CODE = '01'))))) AND T.PERIOD_ID >= 202208 AND T.PERIOD_ID <= 202208 ) A ) -- 對分佈列做彙總求和,查找分佈列值重覆次數最多的值 SELECT record_seq_num, cnt FROM ( SELECT record_seq_num, count(1) AS cnt FROM t GROUP BY record_seq_num HAVING count(1) > 10000 ) ORDER BY cnt LIMIT 10