# 一、前言 原有的業務系統跑在MySQL主從架構中,高可用通過腳本完成,但存在切換數據丟失和切換不及時風險,調研了高可用更穩定的MGR後,準備入手一試。本篇文章主要記錄GreatSQL從單機擴展到MGR的詳細過程,遇到的問題及解決方法。 # 二、基礎環境 伺服器角色如下 | IP | 埠 | 主 ...
一、前言
原有的業務系統跑在MySQL主從架構中,高可用通過腳本完成,但存在切換數據丟失和切換不及時風險,調研了高可用更穩定的MGR後,準備入手一試。本篇文章主要記錄GreatSQL從單機擴展到MGR的詳細過程,遇到的問題及解決方法。
二、基礎環境
伺服器角色如下
IP | 埠 | 主機名 | 作用 |
---|---|---|---|
172.17.140.25 | 3310 | gdb1 | 原單機 |
172.17.140.24 | 3310 | gdb2 | MGR擴展節點1 |
172.17.139.164 | 3310 | gdb3 | MGR擴展節點2 |
本次驗證使用的是GreatSQL8.0.32-24版本,相比MySQL官網的8.0.32,GreatSQL在性能、可靠性、易用性和安全方面,都有不錯的提升,可以在後續環境中驗證。具體提升點可以參考:GreatSQL簡介[https://greatsql.cn/docs/8032/user-manual/2-about-greatsql/2-1-greatsql-brief-intro.html]
通過下麵的命令下載GreatSQL8.0.32-24進行測試。
$ wget https://product.greatdb.com/GreatSQL-8.0.32-24/GreatSQL-8.0.32-24-Linux-glibc2.17-x86_64-minimal.tar.xz
三、單機搭建
3.1 單機搭建
將軟體包,上傳至172.17.140.25的/opt/software目錄,解壓安裝
# cd /opt/software
# xz -d GreatSQL-8.0.32-24-Linux-glibc2.17-x86_64-minimal.tar.xz
# tar xf GreatSQL-8.0.32-24-Linux-glibc2.17-x86_64-minimal.tar
# ln -s GreatSQL-8.0.32-24-Linux-glibc2.17-x86_64-minimal mysql
結果如下所示
創建必要的目錄,並修改屬主
[#25#root@root /opt/software 09:55:30]25 mkdir -p /data/dbdata/data3310/{data,log,tmp}
[#26#root@root /opt/software 09:55:41]26 chown -R mysql. /data/dbdata/data3310
[#27#root@root /opt/software 09:55:53]27 ll /data/dbdata/data3310
total 12
drwxr-xr-x 2 mysql mysql 4096 Apr 26 09:55 data
drwxr-xr-x 2 mysql mysql 4096 Apr 26 09:55 log
drwxr-xr-x 2 mysql mysql 4096 Apr 26 09:55 tmp
[#28#root@root /opt/software 09:56:02]28
創建配置文件/data/dbdata/data3310/my3310.cnf
,配置文件中並沒有與MGR相關的參數配置,內容如下
[client]
socket = /data/dbdata/data3310/mysql3310.sock
[mysqld]
report_host=172.17.139.164
report_port=3310
user=mysql
basedir = /data/mysql
datadir = /data/dbdata/data3310/data
log_error=/data/dbdata/data3310/log/error3310.log
port = 3310
socket = /data/dbdata/data3310/mysql3310.sock
server-id = 253310
default_authentication_plugin=mysql_native_password
back_log = 5000
binlog_format = row
character_set_server = utf8
enforce_gtid_consistency = 1
expire_logs_days = 7
federated = 1
gtid_mode = on
innodb_buffer_pool_instances = 8
innodb_buffer_pool_size = 2G
innodb_change_buffering = all
innodb_doublewrite = true
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 1
innodb_flush_method = O_DIRECT
innodb_io_capacity = 8000
#innodb_locks_unsafe_for_binlog = 1
innodb_log_buffer_size = 256M
innodb_log_file_size = 1G
innodb_log_files_in_group = 4
innodb_print_all_deadlocks = on
innodb_read_io_threads = 16
innodb_temp_data_file_path = ibtmp1:12M:autoextend:max:8192M
innodb_thread_concurrency = 64
innodb_write_io_threads = 16
interactive_timeout = 3600
lock_wait_timeout = 600
log_bin = mysql-bin
log_bin_trust_function_creators = 1
log_slave_updates = 1
log_timestamps = SYSTEM
long_query_time = 1
lower_case_table_names = 1
master_info_repository = TABLE
max_allowed_packet = 16M
max_connections = 20480
max_prepared_stmt_count = 1048576
net_read_timeout = 10000
net_write_timeout = 10000
open_files_limit = 1000000
relay_log = mysql-relay
relay_log_info_repository = TABLE
skip_external_locking = 1
skip_name_resolve = 1
skip_slave_start = 1
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 16
slave_pending_jobs_size_max = 2147483648
slave_preserve_commit_order = on
slave_rows_search_algorithms = INDEX_SCAN,HASH_SCAN
slow_query_log = on
sort_buffer_size = 2M
ssl = OFF
sync_binlog = 1
table_definition_cache = 5000
table_open_cache = 5000
thread_cache_size = 3000
tmpdir = /data/dbdata/data3310/tmp
transaction_isolation = READ-COMMITTED
wait_timeout = 31536000
plugin_dir = /opt/software/mysql/lib/plugin
transaction_write_set_extraction=XXHASH64
binlog_transaction_dependency_tracking=WRITESET
初始化實例,並啟動
[#34#root@root /opt/software 10:01:55]34 /opt/software/mysql/bin/mysqld --defaults-file=/data/dbdata/data3310/my3310.cnf --initialize-insecure
[#35#root@root /opt/software 10:02:05]35
[#35#root@root /opt/software 10:02:25]35 /opt/software/mysql/bin/mysqld_safe --defaults-file=/data/dbdata/data3310/my3310.cnf &
[1] 4038
[#36#root@root /opt/software 10:02:36]36 2023-04-26T02:02:37.359772Z mysqld_safe Logging to '/data/dbdata/data3310/log/error3310.log'.
2023-04-26T02:02:37.423502Z mysqld_safe Starting mysqld daemon with databases from /data/dbdata/data3310/data
[#36#root@root /opt/software 10:02:39]36 ps -ef| grep 3310
root 4038 21630 1 10:02 pts/1 00:00:00 /bin/sh /opt/software/mysql/bin/mysqld_safe --defaults-file=/data/dbdata/data3310/my3310.cnf
mysql 5167 4038 72 10:02 pts/1 00:00:05 /opt/software/mysql/bin/mysqld --defaults-file=/data/dbdata/data3310/my3310.cnf --basedir=/data/mysql --datadir=/data/dbdata/data3310/data --plugin-dir=/data/mysql/lib/plugin --user=mysql
--log-error=/data/dbdata/data3310/log/error3310.log --open-files-limit=1000000 --pid-file=root.pid --socket=/data/dbdata/data3310/mysql3310.sock --port=3310root 5294 21630 0 10:02 pts/1 00:00:00 grep --color=auto 3310
[#37#root@root /opt/software 10:02:44]37
通過進程信息可以確認,資料庫啟動完成。
3.2 測試數據構造
連接資料庫,修改預設密碼,並且創建部分測試數據
[#37#root@root /opt/software 10:06:12]37 /opt/software/mysql/bin/mysql --defaults-file=/data/dbdata/data3310/my3310.cnf
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.32-24 GreatSQL, Release 24, Revision 3714067bc8c
Copyright (c) 2021-2021 GreatDB Software Co., Ltd
Copyright (c) 2009-2021 Percona LLC and/or its affiliates
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select user();
+----------------+
| user() |
+----------------+
| root@localhost |
+----------------+
1 row in set (0.00 sec)
mysql> alter user user() identified by 'a123456';
Query OK, 0 rows affected (0.01 sec)
mysql> create user m1 identified by 'a123456';
Query OK, 0 rows affected (0.01 sec)
mysql> grant all on test.* to m1;
Query OK, 0 rows affected (0.01 sec)
mysql> create database test;
Query OK, 1 row affected (0.01 sec)
mysql> create table test.t1(id int primary key, cname varchar(10));
Query OK, 0 rows affected (0.03 sec)
mysql> insert into test.t1 values(1,'a'),(2,'b'),(3,'c');
Query OK, 3 rows affected (0.02 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select * from test.t1;
+----+-------+
| id | cname |
+----+-------+
| 1 | a |
| 2 | b |
| 3 | c |
+----+-------+
3 rows in set (0.00 sec)
四、擴展節點搭建,MGR構建
4.1 擴展節點搭建
按照3.1章節步驟,分別在伺服器172.17.140.24、172.17.139.164
上初始化單機實例
4.2 在gdb1、gdb2、gdb3實例中,創建用於MGR同步數據的用戶
[#41#root@root /opt/software 10:23:42]41 /opt/software/mysql/bin/mysql -S /data/dbdata/data3310/mysql3310.sock -pa123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 8.0.32-24 GreatSQL, Release 24, Revision 3714067bc8c
Copyright (c) 2021-2021 GreatDB Software Co., Ltd
Copyright (c) 2009-2021 Percona LLC and/or its affiliates
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create user mgr_sync identified by '!QAZ2wsx';
Query OK, 0 rows affected (0.01 sec)
mysql> GRANT REPLICATION SLAVE,BACKUP_ADMIN ON *.* to mgr_sync;
Query OK, 0 rows affected (0.01 sec)
4.3 安裝必要插件
在gdb1、gdb2、gdb3伺服器實例中,安裝下麵插件
[#46#root@root /opt/software 10:37:08]46 /opt/software/mysql/bin/mysql -S /data/dbdata/data3310/mysql3310.sock -pa123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.32-24 GreatSQL, Release 24, Revision 3714067bc8c
Copyright (c) 2021-2021 GreatDB Software Co., Ltd
Copyright (c) 2009-2021 Percona LLC and/or its affiliates
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> install plugin group_replication soname 'group_replication.so';
Query OK, 0 rows affected (0.23 sec)
mysql> install plugin clone soname 'mysql_clone.so';
Query OK, 0 rows affected (0.09 sec)
4.4 配置並啟動MGR
在三個節點中,進行如下參數配置
mysql> set persist group_replication_group_name='0762c38f-e3dd-11ed-9fb3-00163ece3e28';
mysql> set persist group_replication_group_seeds='172.17.140.25:13310,172.17.140.24:13310,172.17.139.164:13310';
mysql> CHANGE MASTER TO MASTER_USER='mgr_sync', MASTER_PASSWORD='!QAZ2wsx' FOR CHANNEL 'group_replication_recovery';
在三個節點中,對group_replication_local_address進行配置,配置規則見章節5.2 錯誤2
mysql> set persist group_replication_local_address='172.17.140.25:13310';
在已有數據節點gdb1設置引導參數,並且啟動mgr
mysql> set global group_replication_bootstrap_group=on;
Query OK, 0 rows affected (0.01 sec)
mysql> start group_replication;
Query OK, 0 rows affected (0.01 sec)
mysql> set global group_replication_bootstrap_group=off;
Query OK, 0 rows affected (0.01 sec)
在擴展節點gdb2、gdb3啟動mgr
mysql> start group_replication;
Query OK, 0 rows affected (0.01 sec)
五、日常運維
5.1 節點狀態查詢
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 5301e31b-e3d6-11ed-bc14-00163ece3e28 | 172.17.140.25 | 3310 | ONLINE | PRIMARY | 8.0.32 | XCom |
| group_replication_applier | e01908f9-e3d8-11ed-be2e-00163e02fd34 | 172.17.140.24 | 3310 | ONLINE | SECONDARY | 8.0.32 | XCom |
| group_replication_applier | e0e5b2f9-e3d8-11ed-86cf-00163e073778 | 172.17.139.164 | 3310 | ONLINE | SECONDARY | 8.0.32 | XCom |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
3 rows in set (0.00 sec)
5.2 節點複製通道狀態查詢
mysql> select * from replication_connection_status \G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
GROUP_NAME: 0762c38f-e3dd-11ed-9fb3-00163ece3e28
SOURCE_UUID: 0762c38f-e3dd-11ed-9fb3-00163ece3e28
THREAD_ID: NULL
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00.000000
RECEIVED_TRANSACTION_SET: 0762c38f-e3dd-11ed-9fb3-00163ece3e28:1-13,
5301e31b-e3d6-11ed-bc14-00163ece3e28:1-8
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION: 0762c38f-e3dd-11ed-9fb3-00163ece3e28:11
LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2023-04-26 11:38:02.990181
LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2023-04-26 11:38:02.990181
LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP: 2023-04-26 11:38:02.990295
LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP: 2023-04-26 11:38:02.990340
QUEUEING_TRANSACTION:
QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_START_QUEUE_TIMESTAMP: 0000-00-00 00:00:00.000000
1 row in set (0.00 sec)
更多運維信息詳見https://gitee.com/GreatSQL/GreatSQL-Doc/tree/master/deep-dive-mgr
六、問題處理
6.1 異常1
提示信息:The group_replication_group_name option is mandatory
原因:變數group_replication_group_name必須在每個MGR成員中設置,並且保持一致
解決方法:如果是初始化集群,可以查詢一個uuid()的值作為group_replication_group_name的值,如果是已經存在的集群,則到其他MGR節點查詢該變數,並且在本節點設置
mysql> select uuid();
+--------------------------------------+
| uuid() |
+--------------------------------------+
| 0762c38f-e3dd-11ed-9fb3-00163ece3e28 |
+--------------------------------------+
1 row in set (0.01 sec)
mysql> set persist group_replication_group_name='0762c38f-e3dd-11ed-9fb3-00163ece3e28';
Query OK, 0 rows affected (0.00 sec)
mysql> select @@group_replication_group_name;
+--------------------------------------+
| @@group_replication_group_name |
+--------------------------------------+
| 0762c38f-e3dd-11ed-9fb3-00163ece3e28 |
+--------------------------------------+
1 row in set (0.00 sec)
6.2 異常2
提示信息:[GCS] Invalid hostname or IP address () assigned to the parameter local_node!
原因:變數group_replication_local_address必須在每個MGR成員中設置,並且每個成員的值不一樣
解決方法:參數group_replication_local_address設置規則為IP:PORT
,一般是設置為本機IP,然後在實例埠前面加1作為MGR監聽埠,即文章中的實例埠為3310,此時MGR監聽埠設置為13310
mysql> set persist group_replication_local_address='172.17.140.25:13310';
Query OK, 0 rows affected (0.00 sec)
6.3 異常3
提示信息:[GCS] Unable to join the group: peers not configured.
原因:變數group_replication_group_seeds必須在每個MGR成員中設置,並且保持一致
解決方法:設置格式為所有節點的group_replication_local_address值使用逗號拼接上
mysql> set persist group_replication_group_seeds='172.17.140.25:13310,172.17.140.24:13310,172.17.139.164:13310';
Query OK, 0 rows affected (0.01 sec)
6.4 異常4
提示信息:performance_schema.replication_group_members中節點狀態一直處於RECOVERING,並且在error log中有如下信息
2023-04-26T11:06:32.048986+08:00 38 [ERROR] [MY-013117] [Repl] Slave I/O for channel 'group_replication_recovery': Fatal error: Invalid (empty) username when attempting to connect to the master server. Connection attempt terminated. Error_code: MY-013117
2023-04-26T11:06:32.179376+08:00 37 [ERROR] [MY-011582] [Repl] Plugin group_replication reported: 'There was an error when connecting to the donor server. Please check that group_replication_recovery channel credentials and all MEMBER_HOST column values of performance_schema.replication_group_members table are correct and DNS resolvable.'
2023-04-26T11:06:32.179493+08:00 37 [ERROR] [MY-011583] [Repl] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_rep
lication_recovery.'
原因: MGR同步數據賬戶密碼設置不正確,導致節點間無法通信
解決方法:通過下麵的命令重新設置同步賬號信息
mysql> CHANGE MASTER TO MASTER_USER='mgr_sync', MASTER_PASSWORD='!QAZ2wsx' FOR CHANNEL 'group_replication_recovery';
Query OK, 0 rows affected (0.01 sec)
6.5 異常5
提示信息:執行start group_replication命令返回失敗,在錯誤日誌中存在如下信息
2023-04-26T11:27:14.610060+08:00 0 [ERROR] [MY-011526] [Repl] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 0762c38f-e3dd-11ed-9fb3-00163ece3e28:1-7,
5301e31b-e3d6-11ed-bc14-00163ece3e28:1-8, e0e5b2f9-e3d8-11ed-86cf-00163e073778:1 > Group transactions: 0762c38f-e3dd-11ed-9fb3-00163ece3e28:1-7, 5301e31b-e3d6-11ed-bc14-00163ece3e28:1-8'
原因:由於新加入的MGR節點存在本地事務,與MGR集群事務衝突,導致無法加入
解決方法有兩種:
- 確認集群為新節點,並且本地操作未修改數據,可以直接重置本節點的GTID,再啟動即可
## 在新節點中執行如下命令
mysql> reset master;
Query OK, 0 rows affected (0.03 sec)
mysql> start group_replication;
Query OK, 0 rows affected (2.06 sec)
- 手動從其他正常節點進行數據clone,clone完成後,實例會自動重啟,自動加入mgr組
mysql> set global clone_valid_donor_list='172.17.140.25:3310';
Query OK, 0 rows affected (0.00 sec)
mysql> set global super_read_only=off;
Query OK, 0 rows affected (0.00 sec)
mysql> clone instance from [email protected]:3310 identified by '!QAZ2wsx';
Query OK, 0 rows affected (1.62 sec)
mysql> Restarting mysqld...
2023-04-26T03:37:53.502149Z mysqld_safe Number of processes running now: 0
2023-04-26T03:37:53.511862Z mysqld_safe mysqld restarte
Enjoy GreatSQL