https://clickhouse.com/ 概念 ClickHouse 是俄羅斯的 Yandex 於 2016 年開源的列式存儲資料庫(DBMS),使用 C++語言編寫,主要用於線上分析處理查詢(OLAP),能夠使用 SQL 查詢實時生成分析數據報告。 OLAP:一次寫入,多次讀取 ClickH ...
概念
ClickHouse 是俄羅斯的 Yandex 於 2016 年開源的列式存儲資料庫(DBMS),使用 C++語言編寫,主要用於線上分析處理查詢(OLAP),能夠使用 SQL 查詢實時生成分析數據報告。
OLAP:一次寫入,多次讀取
ClickHouse憑藉著其出色的分析查詢性能,尤其是在日誌處理上的優勢,非常有效的解決了關係型資料庫在海量數據場景下的分析查詢短板問題
ClickHouse 的特點
Id | Name | Age |
---|---|---|
1 | 張三 | 18 |
2 | 李四 | 22 |
3 | 王五 | 34 |
➢ 採用行式存儲時,數據在磁碟上的組織結構為:
1 | 張三 | 18 | 2 | 李四 | 22 | 3 | 王五 | 34 |
➢ 採用列式存儲時,數據在磁碟上的組織結構為:
1 | 2 | 3 | 張三 | 李四 | 王五 | 18 | 22 | 34 |
DBMS 的功能
幾乎覆蓋了標準 SQL 的大部分語法,包括 DDL 和 DML,以及配套的各種函數,用戶管理及許可權管理,數據的備份與恢復。
多樣化引擎
ClickHouse 和 MySQL 類似,把表級的存儲引擎插件化,根據表的不同需求可以設定
不同的存儲引擎。目前包括合併樹、日誌、介面和其他四大類 20 多種引擎。
高吞吐寫入能力
ClickHouse 採用類 LSM Tree 的結構,數據寫入後定期在後臺 Compaction。通過類 LSM tree 的結構,ClickHouse 在數據導入時全部是順序 append 寫,寫入後數據段不可更改,在後臺 compaction 時也是多個段 merge sort 後順序寫回磁碟。順序寫的特性,充分利用了磁碟的吞吐能力,即便在 HDD 上也有著優異的寫入性能。
官方公開 benchmark 測試顯示能夠達到 50MB-200MB/s 的寫入吞吐能力,按照每行 100Byte 估算,大約相當於 50W-200W 條/s 的寫入速度。
數據分區與線程級並行
ClickHouse 將數據劃分為多個 partition,每個 partition 再進一步劃分為多個 index granularity(索引粒度),然後通過多個 CPU 核心分別處理其中的一部分來實現並行數據處理。在這種設計下,單條 Query 就能利用整機所有 CPU。極致的並行處理能力,極大的降低了查詢延時。
所以,ClickHouse 即使對於大量數據的查詢也能夠化整為零平行處理。但是有一個弊端就是對於單條查詢使用多 cpu,就不利於同時併發多條查詢。所以對於高 QPS(Queries Per Second 每秒查詢率) 的查詢業務,ClickHouse 並不是強項。
性能對比
單表查詢
關聯查詢
結論: ClickHouse 像很多 OLAP 資料庫一樣,單表查詢速度由於關聯查詢,而且ClickHouse 的兩者差距更為明顯。
安裝
https://clickhouse.com/docs/en/install/
[root@VipSoft clickhouse]# pwd
/opt/clickhouse
# 下載安裝程式
[root@VipSoft clickhouse]# curl https://clickhouse.com/ | sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2473 0 2473 0 0 1514 0 --:--:-- 0:00:01 --:--:-- 1515
Will download https://builds.clickhouse.com/master/amd64/clickhouse into clickhouse
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 502M 100 502M 0 0 5262k 0 0:01:37 0:01:37 --:--:-- 6022k
Successfully downloaded the ClickHouse binary, you can run it as:
./clickhouse
You can also install it:
sudo ./clickhouse install
# 設置許可權
[root@VipSoft clickhouse]# chmod -R 775 ./
[root@VipSoft clickhouse]# ll
total 4447904
-rwxrwxr-x 1 root root 2147479552 Dec 30 13:46 clickhouse
[root@VipSoft clickhouse]# ./clickhouse install
Copying ClickHouse binary to /usr/bin/clickhouse.new
Renaming /usr/bin/clickhouse.new to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-server to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-client to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-local to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-benchmark to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-copier to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-obfuscator to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-git-import to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-compressor to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-format to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-extract-from-config to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-keeper to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-keeper-converter to /usr/bin/clickhouse.
Creating symlink /usr/bin/clickhouse-disks to /usr/bin/clickhouse.
Creating clickhouse group if it does not exist.
groupadd -r clickhouse
Creating clickhouse user if it does not exist.
useradd -r --shell /bin/false --home-dir /nonexistent -g clickhouse clickhouse
Will set ulimits for clickhouse user in /etc/security/limits.d/clickhouse.conf.
Creating config directory /etc/clickhouse-server.
Creating config directory /etc/clickhouse-server/config.d that is used for tweaks of main server configuration.
Creating config directory /etc/clickhouse-server/users.d that is used for tweaks of users configuration.
Data path configuration override is saved to file /etc/clickhouse-server/config.d/data-paths.xml.
Log path configuration override is saved to file /etc/clickhouse-server/config.d/logger.xml.
User directory path configuration override is saved to file /etc/clickhouse-server/config.d/user-directories.xml.
OpenSSL path configuration override is saved to file /etc/clickhouse-server/config.d/openssl.xml.
Creating log directory /var/log/clickhouse-server.
Creating data directory /var/lib/clickhouse.
Creating pid directory /var/run/clickhouse-server.
chown -R clickhouse:clickhouse '/var/log/clickhouse-server'
chown -R clickhouse:clickhouse '/var/run/clickhouse-server'
chown clickhouse:clickhouse '/var/lib/clickhouse'
Enter password for default user: 【輸入密碼】# 輸入密碼
Password for default user is saved in file /etc/clickhouse-server/users.d/default-password.xml.
Setting capabilities for clickhouse binary. This is optional.
Cannot set 'net_admin' or 'ipc_lock' or 'sys_nice' or 'net_bind_service' capability for clickhouse binary. This is optional. Taskstats accounting will be disabled. To enable taskstats accounting you may add the required capability later manually.
Allow server to accept connections from the network (default is localhost only), [y/N]: y # 允許外面訪問
The choice is saved in file /etc/clickhouse-server/config.d/listen.xml.
chown -R clickhouse:clickhouse '/etc/clickhouse-server'
ClickHouse has been successfully installed.
Start clickhouse-server with:
sudo clickhouse start
Start clickhouse-client with:
clickhouse-client --password
# 啟動服務
[root@VipSoft clickhouse]# clickhouse start
chown -R clickhouse: '/var/run/clickhouse-server/'
Will run sudo -u 'clickhouse' /usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid --daemon
Waiting for server to start
Waiting for server to start
Waiting for server to start
Waiting for server to start
Waiting for server to start
Server started
客戶端工具
DBeaver、NineData