在StoneDB中,數據包分為以下幾類: 不相關的數據包:不滿足查詢條件的數據包。 相關的數據包:滿足查詢條件的數據包。 可疑的數據包:數據包中的數據部分滿足查詢條件,需要進一步解壓縮數據包才能得到滿足條件的數據行。 通過對數據包的劃分,知識網格技術過濾掉不相關的數據包,讀取相關的數據包和可疑的數據 ...
在StoneDB中,數據包分為以下幾類:
- 不相關的數據包:不滿足查詢條件的數據包。
- 相關的數據包:滿足查詢條件的數據包。
- 可疑的數據包:數據包中的數據部分滿足查詢條件,需要進一步解壓縮數據包才能得到滿足條件的數據行。
通過對數據包的劃分,知識網格技術過濾掉不相關的數據包,讀取相關的數據包和可疑的數據包。其中相關的數據包不需要解壓縮,只讀取元數據,不會發生IO,可疑的數據包需要解壓縮,會發生IO。
1)創建表t_user
CREATE TABLE t_user(
id INT NOT NULL AUTO_INCREMENT,
first_name VARCHAR(10) NOT NULL,
last_name VARCHAR(10) NOT NULL,
sex VARCHAR(5) NOT NULL,
score INT NOT NULL,
copy_id INT NOT NULL,
PRIMARY KEY (`id`),
key idx_lastname(last_name)
) engine=STONEDB;
2)創建存儲過程
DELIMITER //
create PROCEDURE add_user(in num INT)
BEGIN
DECLARE rowid INT DEFAULT 0;
DECLARE firstname CHAR(1);
DECLARE name1 CHAR(1);
DECLARE name2 CHAR(1);
DECLARE lastname VARCHAR(3) DEFAULT '';
DECLARE sex CHAR(1);
DECLARE score CHAR(2);
WHILE rowid < num DO
SET firstname = SUBSTRING('趙錢孫李周吳鄭王林楊柳劉孫陳江阮侯鄒高彭徐',FLOOR(1+21*RAND()),1);
SET name1 = SUBSTRING('一二三四五六七八九十甲乙丙丁靜景京晶名明銘敏閔民軍君俊駿天田甜兲恬益依成城誠立莉力黎勵',ROUND(1+43*RAND()),1);
SET name2 = SUBSTRING('一二三四五六七八九十甲乙丙丁靜景京晶名明銘敏閔民軍君俊駿天田甜兲恬益依成城誠立莉力黎勵',ROUND(1+43*RAND()),1);
SET sex=FLOOR(0 + (RAND() * 2));
SET score= FLOOR(40 + (RAND() *60));
SET rowid = rowid + 1;
IF ROUND(RAND())=0 THEN
SET lastname =name1;
END IF;
IF ROUND(RAND())=1 THEN
SET lastname = CONCAT(name1,name2);
END IF;
insert INTO t_user(first_name,last_name,sex,score,copy_id) VALUES (firstname,lastname,sex,score,rowid);
END WHILE;
END //
DELIMITER ;
3)插入數據
call add_user(10000000);
4)創建表t_user_innodb
create table t_user_innodb like t_user;
insert into t_user_innodb select * from t_user;
alter table t_user_innodb engine=innodb;
1)驗證讀取相關數據包
SQL的語義邏輯是對欄位 first_name 進行分組統計,在StoneDB中,元數據信息記錄在元數據節點,如果能通過元數據節點讀取到元數據,就不需要解壓縮數據包,不發生IO。
在InnoDB中,表的統計信息記錄在mysql.innodb_table_stats,優化器根據表和索引的統計信息,生成一個最優的執行計劃,然後執行SQL。分別在InnoDB與StoneDB執行,通過SQL profile觀察讀取IO的情況。
註:為規避緩存的影響,每組測試前重啟資料庫實例。
InnoDB
mysql> set profiling=on;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> select first_name,count(*) from t_user_innodb group by first_name;
+------------+----------+
| first_name | count(*) |
+------------+----------+
| 侯 | 476424 |
| 劉 | 475764 |
| 吳 | 475979 |
| 周 | 475891 |
| 孫 | 950444 |
| 彭 | 476632 |
| 徐 | 476219 |
| 李 | 475521 |
| 楊 | 476026 |
| 林 | 477289 |
| 柳 | 476250 |
| 江 | 476623 |
| 王 | 475119 |
| 趙 | 476529 |
| 鄒 | 476852 |
| 鄭 | 476379 |
| 錢 | 476829 |
| 阮 | 476336 |
| 陳 | 476746 |
| 高 | 476148 |
+------------+----------+
20 rows in set (8.62 sec)
mysql> show profiles;
+----------+------------+-------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+-------------------------------------------------------------------+
| 1 | 8.61591075 | select first_name,count(*) from t_user_innodb group by first_name |
+----------+------------+-------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> show profile cpu,block io for query 1;
+----------------------+----------+----------+------------+--------------+---------------+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+----------------------+----------+----------+------------+--------------+---------------+
| starting | 0.000149 | 0.000059 | 0.000083 | 0 | 0 |
| checking permissions | 0.000027 | 0.000011 | 0.000015 | 0 | 0 |
| Opening tables | 0.048181 | 0.003919 | 0.007952 | 608 | 0 |
| init | 0.000036 | 0.000014 | 0.000021 | 0 | 0 |
| System lock | 0.000022 | 0.000009 | 0.000013 | 0 | 0 |
| optimizing | 0.000017 | 0.000007 | 0.000010 | 0 | 0 |
| statistics | 0.000029 | 0.000012 | 0.000016 | 0 | 0 |
| preparing | 0.000022 | 0.000009 | 0.000013 | 0 | 0 |
| Creating tmp table | 0.000045 | 0.000019 | 0.000027 | 0 | 0 |
| Sorting result | 0.000016 | 0.000007 | 0.000009 | 0 | 0 |
| executing | 0.000014 | 0.000005 | 0.000008 | 0 | 0 |
| Sending data | 8.566974 | 6.905969 | 0.772964 | 873888 | 0 |
| Creating sort index | 0.000144 | 0.000164 | 0.000037 | 64 | 0 |
| end | 0.000014 | 0.000012 | 0.000003 | 32 | 0 |
| query end | 0.000028 | 0.000038 | 0.000009 | 0 | 0 |
| removing tmp table | 0.000019 | 0.000015 | 0.000003 | 0 | 0 |
| query end | 0.000012 | 0.000010 | 0.000002 | 0 | 0 |
| closing tables | 0.000031 | 0.000025 | 0.000006 | 0 | 0 |
| freeing items | 0.000032 | 0.000027 | 0.000006 | 0 | 0 |
| logging slow query | 0.000067 | 0.000054 | 0.000012 | 0 | 8 |
| cleaning up | 0.000035 | 0.000028 | 0.000006 | 0 | 0 |
+----------------------+----------+----------+------------+--------------+---------------+
21 rows in set, 1 warning (0.00 sec)
從SQL profile可知,SQL在InnoDB執行的過程中,發生IO的階段有Opening tables、Sending data、Creating sort index、end,其中Opening tables是每張表第一次載入都會經歷的,可排除討論。重點關註Sending data部分,它表示在執行器的任意階段,通常是存儲引擎層與Server層的IO交互過程。
StoneDB
mysql> set profiling=on;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> select first_name,count(*) from t_user group by first_name;
+------------+----------+
| first_name | count(*) |
+------------+----------+
| 趙 | 476529 |
| 徐 | 476219 |
| 王 | 475119 |
| 阮 | 476336 |
| 柳 | 476250 |
| 侯 | 476424 |
| 孫 | 950444 |
| 鄭 | 476379 |
| 高 | 476148 |
| 林 | 477289 |
| 鄒 | 476852 |
| 彭 | 476632 |
| 李 | 475521 |
| 吳 | 475979 |
| 劉 | 475764 |
| 錢 | 476829 |
| 周 | 475891 |
| 楊 | 476026 |
| 陳 | 476746 |
| 江 | 476623 |
+------------+----------+
20 rows in set (0.59 sec)
mysql> show profiles;
+----------+------------+------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+------------------------------------------------------------+
| 1 | 0.59069975 | select first_name,count(*) from t_user group by first_name |
+----------+------------+------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> show profile cpu,block io for query 1;
+----------------------+----------+----------+------------+--------------+---------------+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+----------------------+----------+----------+------------+--------------+---------------+
| starting | 0.000160 | 0.000066 | 0.000089 | 0 | 0 |
| checking permissions | 0.000027 | 0.000011 | 0.000015 | 0 | 0 |
| Opening tables | 0.011405 | 0.003718 | 0.007688 | 0 | 240 |
| System lock | 0.000385 | 0.000163 | 0.000222 | 0 | 0 |
| init | 0.000050 | 0.000021 | 0.000028 | 0 | 0 |
| optimizing | 0.000143 | 0.000061 | 0.000082 | 0 | 0 |
| update multi-index | 0.000052 | 0.000022 | 0.000030 | 0 | 0 |
| aggregation | 0.578315 | 2.639504 | 0.981471 | 0 | 8 |
| query end | 0.000069 | 0.000043 | 0.000026 | 0 | 0 |
| closing tables | 0.000035 | 0.000021 | 0.000013 | 0 | 0 |
| freeing items | 0.000034 | 0.000021 | 0.000013 | 0 | 0 |
| cleaning up | 0.000027 | 0.000017 | 0.000010 | 0 | 0 |
+----------------------+----------+----------+------------+--------------+---------------+
12 rows in set, 1 warning (0.00 sec)
從SQL profile可知,SQL在StoneDB執行的過程中,只在Opening tables階段發生IO。其它階段沒有發生IO,說明相關數據包是不需要解壓縮的,通過元數據得到。
2)驗證讀取可疑數據包
SQL的語義邏輯是查詢一行數據,StoneDB可以通過知識網格技術過濾掉不相關的數據包,由於只返回一行數據,最終只能找到可疑的數據包,然後解壓縮可疑的數據包,最終得到這一行數據。InnoDB還是根據統計信息生成一個最優的執行計划去執行SQL。
InnoDB
mysql> set profiling=on;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> select count(*) from t_user_innodb where first_name='柳' and copy_id=9968888;
+----------+
| count(*) |
+----------+
| 1 |
+----------+
1 row in set (3.20 sec)
mysql> show profile cpu,block io for query 1;
+----------------------+----------+----------+------------+--------------+---------------+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+----------------------+----------+----------+------------+--------------+---------------+
| starting | 0.000170 | 0.000072 | 0.000092 | 0 | 0 |
| checking permissions | 0.000030 | 0.000013 | 0.000016 | 0 | 0 |
| Opening tables | 0.024121 | 0.004351 | 0.008638 | 800 | 0 |
| init | 0.000049 | 0.000021 | 0.000027 | 0 | 0 |
| System lock | 0.000019 | 0.000008 | 0.000011 | 0 | 0 |
| optimizing | 0.000022 | 0.000010 | 0.000012 | 0 | 0 |
| statistics | 0.000030 | 0.000013 | 0.000016 | 0 | 0 |
| preparing | 0.000026 | 0.000012 | 0.000015 | 0 | 0 |
| executing | 0.000013 | 0.000005 | 0.000007 | 0 | 0 |
| Sending data | 3.169882 | 2.755171 | 0.389367 | 534176 | 0 |
| end | 0.000069 | 0.000050 | 0.000018 | 0 | 0 |
| query end | 0.000029 | 0.000022 | 0.000008 | 0 | 0 |
| closing tables | 0.000031 | 0.000023 | 0.000009 | 0 | 0 |
| freeing items | 0.000035 | 0.000025 | 0.000009 | 0 | 0 |
| cleaning up | 0.000038 | 0.000028 | 0.000010 | 0 | 0 |
+----------------------+----------+----------+------------+--------------+---------------+
15 rows in set, 1 warning (0.00 sec)
StoneDB
mysql> set profiling=on;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> select count(*) from t_user where first_name='柳' and copy_id=9968888;
+----------+
| count(*) |
+----------+
| 1 |
+----------+
1 row in set (0.01 sec)
mysql> show profile cpu,block io for query 1;
+----------------------+----------+----------+------------+--------------+---------------+
| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+----------------------+----------+----------+------------+--------------+---------------+
| starting | 0.000173 | 0.000081 | 0.000086 | 0 | 0 |
| checking permissions | 0.000026 | 0.000013 | 0.000013 | 0 | 0 |
| Opening tables | 0.010228 | 0.009385 | 0.000843 | 0 | 240 |
| System lock | 0.000232 | 0.000113 | 0.000119 | 0 | 0 |
| init | 0.000045 | 0.000021 | 0.000022 | 0 | 0 |
| optimizing | 0.000144 | 0.000071 | 0.000074 | 0 | 0 |
| update multi-index | 0.003694 | 0.002027 | 0.006428 | 0 | 0 |
| aggregation | 0.000191 | 0.000093 | 0.000098 | 0 | 16 |
| query end | 0.000020 | 0.000010 | 0.000010 | 0 | 0 |
| closing tables | 0.000029 | 0.000014 | 0.000015 | 0 | 0 |
| freeing items | 0.000033 | 0.000016 | 0.000017 | 0 | 0 |
| cleaning up | 0.000027 | 0.000013 | 0.000013 | 0 | 0 |
+----------------------+----------+----------+------------+--------------+---------------+
12 rows in set, 1 warning (0.00 sec)
從SQL profile可知,SQL在StoneDB執行的過程中,在aggregation階段發生IO。
綜上所述:
- 知識網格技術過濾出相關的數據包後,只需要讀取元數據,不再解壓縮數據包;
- 知識網格技術過濾出可疑的數據包後,需要解壓縮數據包。