問題描述 優化過程中遇到一個SQL: 其執行計劃為: 從執行計劃來看,使用Using index(覆蓋索引)已經是最優的執行計劃,但每次查詢掃描數據較多,影響整體查詢性能。 優化方案 查詢需要使用SUM計算user_value的總和,借用1+1+0+0+0+0+0=1+1=2的例子,進行如下測試: ...
問題描述
優化過程中遇到一個SQL:
SELECT SUM(user_value) FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1;
其執行計劃為:
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: user_log partitions: NULL type: ref possible_keys: index_sId_ty_vl_ct,IDX_product_id_oth1 key: IDX_product_id_oth1 key_len: 12 ref: const,const,const rows: 14884 filtered: 100.00 Extra: Using index 1 row in set, 1 warning (0.00 sec
從執行計劃來看,使用Using index(覆蓋索引)已經是最優的執行計劃,但每次查詢掃描數據較多,影響整體查詢性能。
優化方案
查詢需要使用SUM計算user_value的總和,借用1+1+0+0+0+0+0=1+1=2的例子,進行如下測試:
SELECT SUM(CASE WHEN user_value>0 THEN 1 ELSE 0 END) AS count1, COUNT(1) AS count2 FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1; +--------+--------+ | count1 | count2 | +--------+--------+ | 680 | 8067 | +--------+--------+
在假設user_value沒有負值的情況下,下麵兩條SQL的結果相同:
##測試SQL1 SELECT SUM(user_value), COUNT(1) AS count2 FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1; ##測試SQL2 SELECT SUM(user_value) FROM user_log WHERE del_flag = 0 AND product_id = 2324 AND user_type = 1 AND user_value>0;
測試SQL1的執行時間為0.00327250,其資源消耗為:
+----------------------+----------+----------+------------+--------------+---------------+-------+ | Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | Swaps | +----------------------+----------+----------+------------+--------------+---------------+-------+ | starting | 0.000066 | 0.000000 | 0.000000 | 0 | 0 | 0 | | checking permissions | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Opening tables | 0.000016 | 0.000000 | 0.000000 | 0 | 0 | 0 | | init | 0.000021 | 0.000000 | 0.000000 | 0 | 0 | 0 | | System lock | 0.000015 | 0.000000 | 0.000000 | 0 | 0 | 0 | | optimizing | 0.000016 | 0.000000 | 0.000000 | 0 | 0 | 0 | | statistics | 0.000127 | 0.000000 | 0.000000 | 0 | 0 | 0 | | preparing | 0.000019 | 0.000000 | 0.000000 | 0 | 0 | 0 | | executing | 0.000011 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Sending data | 0.002867 | 0.002999 | 0.000000 | 0 | 0 | 0 | | end | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | query end | 0.000013 | 0.000000 | 0.000000 | 0 | 0 | 0 | | closing tables | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | freeing items | 0.000054 | 0.000000 | 0.000000 | 0 | 8 | 0 | | cleaning up | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | +----------------------+----------+----------+------------+--------------+---------------+-------+
而測試SQL2的執行時間為0.00072325,其資源消耗為:
+----------------------+----------+----------+------------+--------------+---------------+-------+ | Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | Swaps | +----------------------+----------+----------+------------+--------------+---------------+-------+ | starting | 0.000072 | 0.000000 | 0.000000 | 0 | 0 | 0 | | checking permissions | 0.000013 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Opening tables | 0.000016 | 0.000000 | 0.000000 | 0 | 0 | 0 | | init | 0.000021 | 0.000000 | 0.000000 | 0 | 0 | 0 | | System lock | 0.000014 | 0.000000 | 0.000000 | 0 | 0 | 0 | | optimizing | 0.000020 | 0.000000 | 0.000000 | 0 | 0 | 0 | | statistics | 0.000089 | 0.000000 | 0.000000 | 0 | 0 | 0 | | preparing | 0.000020 | 0.000000 | 0.000000 | 0 | 0 | 0 | | executing | 0.000011 | 0.000000 | 0.000000 | 0 | 0 | 0 | | Sending data | 0.000365 | 0.000000 | 0.000000 | 0 | 0 | 0 | | end | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | query end | 0.000014 | 0.000000 | 0.000000 | 0 | 0 | 0 | | closing tables | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | | freeing items | 0.000035 | 0.000000 | 0.000000 | 0 | 0 | 0 | | cleaning up | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | 0 | +----------------------+----------+----------+------------+--------------+---------------+-------+
在Sending data部分,兩者在Durion部分差距約10倍,而測試SQL2在CPU_user部分差距更明顯。
總結:
DBA在優化SQL時,除了從數據分佈/索引結構等方面入手外,還需要從業務邏輯方面入手。
PS:上面的優化是假設user_value沒有負值,而實際業務邏輯中user_value可能存在負值,因此以上優化純屬於瞎編。