MySQL Backup mydumper_ZenDei技術網路在線

正文生產環境中有一實例每天使用mysqldump備份時長達到了2個小時53分鐘，接近3個小時，還不算上備份文件歸檔的時間，這個時間對於邏輯備份來說有點久。為了提高邏輯備份效率，打算替換為使用mydumper。對比mysqldump，mydumper具有如下特點： 1. 多線程備份 2. 備份執行 ...

生產環境中有一實例每天使用mysqldump備份時長達到了2個小時53分鐘，接近3個小時，還不算上備份文件歸檔的時間，這個時間對於邏輯備份來說有點久。為了提高邏輯備份效率，打算替換為使用mydumper。

對比mysqldump，mydumper具有如下特點：

多線程備份
備份執行速度更快
支持備份文件壓縮
支持行級別切塊備份

更多關於mydumper的說明，可以查看官方GitHub：https://github.com/maxbube/mydumper

安裝

之前在測試mydumper時有使用過早期版本，是通過編譯進行安裝的，而mydumper是C語言寫的，編譯過程中出現了一系列的依賴問題。為了避免出現安裝依賴問題，官方從0.9.3版本開始提供了編譯後的安裝包，建議採用RPM包的方式進行安裝。同樣通過官方GitHub獲取安裝包。

本文使用的RPM安裝包為：

mydumper-0.9.5-2.el7.x86_64.rpm

安裝完成後驗證：

# rpm -qa |grep mydumper
mydumper-0.9.5-2.x86_64

# rpm -ql mydumper-0.9.5-2.x86_64
/usr/bin/mydumper
/usr/bin/myloader

mydumper：用來備份數據。
myloader：用來還原數據。

本文主要討論mydumper，查看版本信息：

# mydumper -V
mydumper 0.9.5, built against MySQL 5.7.21-21

主要選項

# mydumper --help
Usage:
  mydumper [OPTION?] multi-threaded MySQL dumping

Help Options:
  -?, --help                  Show help options

Application Options:
  -B, --database              Database to dump
  -T, --tables-list           Comma delimited table list to dump (does not exclude regex option)
  -O, --omit-from-file        File containing a list of database.table entries to skip, one per line (skips before applying regex option)
  -o, --outputdir             Directory to output files to
  -s, --statement-size        Attempted size of INSERT statement in bytes, default 1000000
  -r, --rows                  Try to split tables into chunks of this many rows. This option turns off --chunk-filesize
  -F, --chunk-filesize        Split tables into chunks of this output file size. This value is in MB
  -c, --compress              Compress output files
  -e, --build-empty-files     Build dump files even if no data available from table
  -x, --regex                 Regular expression for 'db.table' matching
  -i, --ignore-engines        Comma delimited list of storage engines to ignore
  -N, --insert-ignore         Dump rows with INSERT IGNORE
  -m, --no-schemas            Do not dump table schemas with the data
  -d, --no-data               Do not dump table data
  -G, --triggers              Dump triggers
  -E, --events                Dump events
  -R, --routines              Dump stored procedures and functions
  -W, --no-views              Do not dump VIEWs
  -k, --no-locks              Do not execute the temporary shared read lock.  WARNING: This will cause inconsistent backups
  --no-backup-locks           Do not use Percona backup locks
  --less-locking              Minimize locking time on InnoDB tables.
  -l, --long-query-guard      Set long query timer in seconds, default 60
  -K, --kill-long-queries     Kill long running queries (instead of aborting)
  -D, --daemon                Enable daemon mode
  -I, --snapshot-interval     Interval between each dump snapshot (in minutes), requires --daemon, default 60
  -L, --logfile               Log file name to use, by default stdout is used
  --tz-utc                    SET TIME_ZONE='+00:00' at top of dump to allow dumping of TIMESTAMP data when a server has data in different time zones or data is being moved between servers with different time zones, defaults to on use --skip-tz-utc to disable.
  --skip-tz-utc
  --use-savepoints            Use savepoints to reduce metadata locking issues, needs SUPER privilege
  --success-on-1146           Not increment error count and Warning instead of Critical in case of table doesn't exist
  --lock-all-tables           Use LOCK TABLE for all, instead of FTWRL
  -U, --updated-since         Use Update_time to dump only tables updated in the last U days
  --trx-consistency-only      Transactional consistency only
  --complete-insert           Use complete INSERT statements that include column names
  -h, --host                  The host to connect to
  -u, --user                  Username with the necessary privileges
  -p, --password              User password
  -a, --ask-password          Prompt For User password
  -P, --port                  TCP/IP port to connect to
  -S, --socket                UNIX domain socket file to use for connection
  -t, --threads               Number of threads to use, default 4
  -C, --compress-protocol     Use compression on the MySQL connection
  -V, --version               Show the program version and exit
  -v, --verbose               Verbosity of output, 0 = silent, 1 = errors, 2 = warnings, 3 = info, default 2
  --defaults-file             Use a specific defaults file
  --ssl                       Connect using SSL
  --key                       The path name to the key file
  --cert                      The path name to the certificate file
  --ca                        The path name to the certificate authority file
  --capath                    The path name to a directory that contains trusted SSL CA certificates in PEM format
  --cipher                    A list of permissible ciphers to use for SSL encryption

-B, --database
指定dump資料庫
-T, --tables-list
指定dump表，多個表用逗號分隔(不排除正則匹配)
-O, --omit-from-file
指定dump需要跳過包含一行或多行database.table格式的文件，跳過dump的優先順序大於dump正則匹配
-o, --outputdir
指定dump文件保存目錄
-s, --statement-size
指定dump生成insert語句的大小，單位位元組，預設是1000000
-r, --rows
把表多少行分割成chunks，這個選項會關閉選項 --chunk-filesize
-F, --chunk-filesize
表分割成chunks的大小，單位為MB，這個指定大小預設為加1MB，如果想切割成每個3MB大小的文件，則指定 -F 2，如果指定 -F 1，則不進行切割，不清楚為什麼這麼設置
-c, --compress
壓縮輸出文件
-e, --build-empty-files
表中如果沒有數據也創建dump文件
-x, --regex
正則匹配，如'db.table'
-i, --ignore-engines
忽略存儲引擎，如有多個用逗號分隔
-N, --insert-ignore
dump文件中不使用INSERT語句
-m, --no-schemas
dump文件中只有表數據而沒有表結構信息
-d, --no-data
dump文件中只有表結構而沒有表數據
-G, --triggers
dump觸發器
-E, --events
dump事件
-R, --routines
dump存儲過程和函數
-W, --no-views
不要dump視圖
-k, --no-locks
不執行臨時的共用讀鎖，這有可能會導致不一致的備份
--no-backup-locks
不使用Percona備份鎖
--less-locking
最小化對InnoDB表的鎖定時間
-l, --long-query-guard
設置長查詢的時間, 單位秒，預設60秒
-K, --kill-long-queries
Kill長時間執行的查詢 (instead of aborting)
-D, --daemon
指定為守護進程模式
-I, --snapshot-interval
每次dump的快照間隔，單位分鐘，需要開啟 --daemon，預設60分鐘
-L, --logfile
指定輸出日誌文件名，預設為屏幕標準輸出
--tz-utc
在dump一開始加入時區timestamp，數據移動或恢復至不同時區上的資料庫適用，預設通過選項 --skip-tz-utc 來禁用
--skip-tz-utc
用法如上
--use-savepoints
通過使用savepoints來避免元數據鎖的產生，需要SUPER許可權
--success-on-1146
不統計增量錯誤和警告，除非是表不存在的錯誤
--lock-all-tables
使用LOCK TABLE鎖定所有表，代替FTWRL
-U, --updated-since
指定Update_time天數來dump只在過去幾天內更新的表
--trx-consistency-only
事務一致性備份導出
--complete-insert
dump文件中包含完整的INSERT語句，語句中包含所有欄位的名稱
-h, --host
指定連接host
-u, --user
指定連接用戶，需有相應的許可權
-p, --password
指定用戶密碼
-a, --ask-password
指定用戶密碼提示輸入
-P, --port
指定連接port
-S, --socket
指定本地socket文件連接
-t, --threads
指定dump線程數, 預設是4
-C, --compress-protocol
在mysql連接時使用壓縮協議
-V, --version
顯示程式版本並退出
-v, --verbose
顯示更詳細的輸出, 0 = silent, 1 = errors, 2 = warnings, 3 = info, 預設是2
--defaults-file
指定預設參數文件
--ssl
使用SSL連接
--key
指定key file的文件路徑
--cert
指定證書文件路徑
--ca
指定證書授權文件路徑
--capath
指定所有CA頒發的PEM格式文件路徑
--cipher
指定允許使用SSL連接加密的密碼列表

備份流程

測試MySQL版本為官方社區版5.7.24。

(root@localhost) [test] > select version();
+------------+
| version()  |
+------------+
| 5.7.24-log |
+------------+
1 row in set (0.00 sec)

通過開啟mysql的general log觀察下mydumper在備份過程中做了哪些操作。

開啟general log

(root@localhost) [(none)] > show global variables like '%general%';
+------------------+---------------------------------+
| Variable_name    | Value                           |
+------------------+---------------------------------+
| general_log      | OFF                             |
| general_log_file | /data/mysql/3306/data/dbabd.log |
+------------------+---------------------------------+
2 rows in set (0.00 sec)

(root@localhost) [(none)] > set global general_log = 1;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [(none)] > show global variables like '%general%';
+------------------+---------------------------------+
| Variable_name    | Value                           |
+------------------+---------------------------------+
| general_log      | ON                              |
| general_log_file | /data/mysql/3306/data/dbabd.log |
+------------------+---------------------------------+
2 rows in set (0.01 sec)

備份test庫

# mydumper -h 192.168.58.3 -u root -a -P 3306 -B test -o /data/test/

備份文件結構(以test.t1表為例)：

# ll /data/test/
total 66728
-rw-r--r--. 1 root root      136 Dec 27 16:02 metadata
-rw-r--r--. 1 root root       63 Dec 27 16:02 test-schema-create.sql
-rw-r--r--. 1 root root      278 Dec 27 16:02 test.t1-schema.sql
-rw-r--r--. 1 root root 18390048 Dec 27 16:02 test.t1.sql

通過以上信息可知，備份所有文件都存放在一個目錄當中，可以指定。如果沒有指定路徑，則在運行mydumper命令的當前目錄下，生成一個新的目錄，名稱命名規則為：export-yyyymmdd-HHMMSS 。每個備份目錄中主要產生的備份文件為：

metadata文件

metadata：備份元數據信息。包含備份開始和備份結束時間，以及MASTER LOG FILE和MASTER LOG POS。如果是在從庫進行備份，則記錄的是 SHOW SLAVE STATUS 中同步到的主庫binlog文件及binlog位置。

# cat metadata
Started dump at: 2018-12-27 16:02:06
SHOW MASTER STATUS:
        Log: mysql-bin.000034
        Pos: 154
        GTID:

Finished dump at: 2018-12-27 16:02:35

庫創建語句文件

test-schema-create.sql：test庫的創建語句。

# cat test-schema-create.sql
CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET utf8 */;

每張表兩個備份文件

test.t1-schema.sql：t1表的創建語句。

test.t1.sql：t1表數據文件，以INSERT語句存儲。

如果涉及到大表進行表切片備份的話，會有多個表數據文件。

查看general log

-- 主線程連接資料庫，設置臨時session級別參數
 7   Connect   admin@dbabd on test using TCP/IP
 7   Query     SET SESSION wait_timeout = 2147483
 7   Query     SET SESSION net_write_timeout = 2147483
 7   Query     SHOW PROCESSLIST

-- 主線程執行FTWRL獲取全局讀鎖，並開啟一致性快照事務，記錄當前binlog文件及位置
 7   Query     FLUSH TABLES WITH READ LOCK
 7   Query     START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */
 7   Query     /*!40101 SET NAMES binary*/
 7   Query     SHOW MASTER STATUS
 7   Query     SHOW SLAVE STATUS

-- 產生了4個子進程，並且設置會話級事務隔離級別為REPEATABLE READ，4個子線程同時進行dump操作
 8   Connect   admin@dbabd on  using TCP/IP
 8   Query     SET SESSION wait_timeout = 2147483
 8   Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
 8   Query     START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */
 8   Query     /*!40103 SET TIME_ZONE='+00:00' */
 8   Query     /*!40101 SET NAMES binary*/

 9   Connect   admin@dbabd on  using TCP/IP
 9   Query     SET SESSION wait_timeout = 2147483
 9   Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
 9   Query     START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */
 9   Query     /*!40103 SET TIME_ZONE='+00:00' */
 9   Query     /*!40101 SET NAMES binary*/

10   Connect   admin@dbabd on  using TCP/IP
10   Query     SET SESSION wait_timeout = 2147483
10   Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
10   Query     START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */
10   Query     /*!40103 SET TIME_ZONE='+00:00' */
10   Query     /*!40101 SET NAMES binary*/

11   Connect   admin@dbabd on  using TCP/IP
11   Query     SET SESSION wait_timeout = 2147483
11   Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
11   Query     START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */
11   Query     /*!40103 SET TIME_ZONE='+00:00' */
11   Query     /*!40101 SET NAMES binary*/

-- 主線程獲取備份庫語句和表狀態 
 7   Init DB   test
 7   Query     SHOW TABLE STATUS
 7   Query     SHOW CREATE DATABASE `test`

-- 4個子進程備份庫中所有的表
 8   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='course' and extra like '%GENERATED%'
11   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='t' and extra like '%GENERATED%'
10   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='t2' and extra like '%GENERATED%'
 9   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='t1' and extra like '%GENERATED%'
 8   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`course`
 7   Query     UNLOCK TABLES /* FTWRL */
 7   Quit
11   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`t`
 9   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`t1`
10   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`t2`
11   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='t3' and extra like '%GENERATED%'
11   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`t3`
11   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='teacher' and extra like '%GENERATED%'
11   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`teacher`
 8   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='teachercard' and extra like '%GENERATED%'
 8   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`teachercard`
 8   Query     SHOW CREATE TABLE `test`.`course`
 8   Query     SHOW CREATE TABLE `test`.`t`
 8   Query     SHOW CREATE TABLE `test`.`t1`
 8   Query     SHOW CREATE TABLE `test`.`t2`
 8   Query     SHOW CREATE TABLE `test`.`t3`
 8   Query     SHOW CREATE TABLE `test`.`teacher`
 8   Query     SHOW CREATE TABLE `test`.`teachercard`
 8   Query     SHOW CREATE TABLE `test`.`v9_pic_tag_content`
11   Query     select COLUMN_NAME from information_schema.COLUMNS where TABLE_SCHEMA='test' and TABLE_NAME='v9_pic_tag_content' and extra like '%GENERATED%'
 8   Quit
11   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`.`v9_pic_tag_content`
 9   Quit
10   Quit
11   Quit

總結下mydumper的工作流程：

主線程連接MySQL，查詢當前服務線程狀態確定是否中止dump或是KILL長查詢；
通過FTWRL獲取全局讀鎖，確保dump一致性，開啟一致性快照事務，查詢當前binlog信息寫入metadata文件；
創建多個子線程(預設4個)，開啟一致性快照事務，將session級事務隔離級別設置成REPEATABLE READ；
子線程備份非事務引擎表(non-InnoDB tables)；
待子線程備份完非事務引擎表後，主線程執行UNLOCK TABLES釋放全局讀鎖；
子線程備份事務引擎表(InnoDB tables)；
(如有)子線程備份函數、存儲過程、觸發器和視圖；
dump過程結束。

用法示例

備份全庫

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -o /data/backupdir

備份某個庫

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -o /data/test/

備份多個庫(可使用正則匹配)

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -x '^(test\.|test2\.)' -o /data/

不備份某(幾)個庫

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -x '^(?!(mysql\.|sys\.))' -o /data/

備份某(幾)張表

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -T test.t1,test2.t3 -o /data/

不備份某(幾)張表

通過正則匹配

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -x '^(?!test.t2)' -o /data/test/

通過選項 -O, --omit-from-file

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -O nodump.file -o /data/test/

切割表數據文件，指定每份文件包含行數

test.t2表有100萬行：

(root@localhost) [test] > select count(*) from t2;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.45 sec)

現在指定備份test.t2表分割成每個chunks包含的行數為10萬行：

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -T test.t2 -r 100000 -o /data/test/

查看表備份文件：

# ls /data/test/
metadata           test.t2.00001.sql  test.t2.00003.sql  test.t2.00005.sql  test.t2.00007.sql  test.t2.00009.sql
test.t2.00000.sql  test.t2.00002.sql  test.t2.00004.sql  test.t2.00006.sql  test.t2.00008.sql  test.t2-schema.sql

切割表數據文件，指定每份文件大小

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -T test.t2 -F 2 -o /data/test/

查看表備份文件：

# ll -h /data/test/
total 18M
-rw-r--r--. 1 root root  141 Dec 27 16:32 metadata
-rw-r--r--. 1 root root 2.9M Dec 27 16:32 test.t2.00001.sql
-rw-r--r--. 1 root root 2.9M Dec 27 16:32 test.t2.00002.sql
-rw-r--r--. 1 root root 2.9M Dec 27 16:32 test.t2.00003.sql
-rw-r--r--. 1 root root 2.9M Dec 27 16:32 test.t2.00004.sql
-rw-r--r--. 1 root root 2.9M Dec 27 16:32 test.t2.00005.sql
-rw-r--r--. 1 root root 2.9M Dec 27 16:32 test.t2.00006.sql
-rw-r--r--. 1 root root 381K Dec 27 16:32 test.t2.00007.sql
-rw-r--r--. 1 root root  278 Dec 27 16:32 test.t2-schema.sql

對備份文件進行壓縮

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -c -o /data/test/

沒壓縮之前的大小：

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -o /data/test/

# du /data/test  --max-depth=1 -h
53M     /data/test

壓縮之後的大小：

# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -c -o /data/test/

# du /data/test/  --max-depth=1 -h
22M     /data/test/

對空表備份也生成數據文件
```
# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -e -o /data/test/
```
這樣即使是張空表，不僅備份會生成table-schema.sql文件，也會生成table.sql文件。
指定備份子線程數
```
# mydumper -h 192.168.58.3 -u admin -a -P 3306 -B test -e -t 8 -o /data/test/
```
沒指定 -t 選項預設是4個子線程，可以根據機器配置進行適當增加子線程數加快備份效率。

參考

☆〖本人水平有限，文中如有錯誤還請留言批評指正！〗☆

作者: H_Johnny 出處: http://www.cnblogs.com/dbabd/ 本文版權歸作者和博客園共有，歡迎轉載，但未經作者同意必須保留此段聲明，且在文章頁面明顯位置給出原文鏈接。如果您覺得文章不錯，請幫忙點擊右下角的推薦，謝謝！