前言: 本文是對這篇博客MySQL 8.0 Histograms的翻譯,翻譯如有不當的地方,敬請諒解,請尊重原創和翻譯勞動成果,轉載的時候請註明出處。謝謝! 英文原文地址:https://lefred.be/content/mysql-8-0-histograms/ 翻譯原文地址:https://w... ...
前言: 本文是對這篇博客MySQL 8.0 Histograms的翻譯,翻譯如有不當的地方,敬請諒解,請尊重原創和翻譯勞動成果,轉載的時候請註明出處。謝謝!
英文原文地址:https://lefred.be/content/mysql-8-0-histograms/
翻譯原文地址:https://www.cnblogs.com/kerrycode/p/11817026.html
在MySQL 8.0之前,MySQL缺失了其它關係資料庫中一個眾所周知的功能:優化器的直方圖
優化器團隊(Optimizer Team)在越來越多的MySQL DBA的呼聲中實現了這個功能。
直方圖定義
但什麼是直方圖呢?我們來看維基百科的定義吧,直方圖是數值數據分佈的準確表示。 對於RDBMS來說,直方圖是特定列內數據分佈的近似值。因此在MySQL中,直方圖能夠幫助優化器找到最有效的執行計劃。
直方圖例子
為了說明直方圖是如何影響優化器工作的,我會用dbt3生成的數據來演示。
我們準備了一個簡單查詢:
SELECT * FROM orders
JOIN customer ON o_custkey = c_custkey
WHERE o_orderdate < '1993-01-01'
AND c_mktsegment = "AUTOMOBILE"\G
讓我們看一下傳統的執行計劃的EXPLAIN輸出,以及可視化方式(VISUAL one):
mysql> EXPLAIN SELECT * FROM orders
JOIN customer ON o_custkey = c_custkey
WHERE o_orderdate < '1993-01-01' AND c_mktsegment = "AUTOMOBILE"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 149050
filtered: 10.00
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: orders
partitions: NULL
type: ref
possible_keys: i_o_custkey,i_o_orderdate
key: i_o_custkey
key_len: 5
ref: dbt3.customer.c_custkey
rows: 14
filtered: 30.62
Extra: Using where
2 rows in set, 1 warning (0.28 sec)
我們看到MySQL首先對customer表做了一個全表掃描,並且它的選擇估計記錄(過濾)是10%;
接下來讓我們運行這個查詢(我使用了COUNT(*)),然後我們來看看有多少行記錄
mysql> SELECT count(*) FROM orders
JOIN customer ON o_custkey = c_custkey
WHERE o_orderdate < '1993-01-01' AND c_mktsegment = "AUTOMOBILE"\G
*************************** 1. row ***************************
count(*): 45127
1 row in set (49.98 sec)
創建直方圖
現在,我將在表customer上的欄位c_mktsegment上創建一個直方圖
mysql> ANALYZE TABLE customer UPDATE HISTOGRAM ON c_mktsegment WITH 1024 BUCKETS;
+---------------+-----------+----------+---------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+---------------+-----------+----------+---------------------------------------------------------+
| dbt3.customer | histogram | status | Histogram statistics created for column 'c_mktsegment'. |
+---------------+-----------+----------+---------------------------------------------------------+
接下來,我們來驗證查詢的執行計劃:
mysql> EXPLAIN SELECT * FROM orders
JOIN customer ON o_custkey = c_custkey
WHERE o_orderdate < '1993-01-01' AND c_mktsegment = "AUTOMOBILE"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: orders
partitions: NULL
type: ALL
possible_keys: i_o_custkey,i_o_orderdate
key: NULL
key_len: NULL
ref: NULL
rows: 1494230
filtered: 30.62
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: customer
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: dbt3.orders.o_custkey
rows: 1
filtered: 19.84
Extra: Using where
2 rows in set, 1 warning (1.06 sec)