1. Deployment steps for single-primary mode
Goal: deploy a MySQL group of three hosts in single-primary mode.
Primary: 192.168.197.110
Secondary: 192.168.197.111
Secondary: 192.168.197.112
MySQL port: 3306; MySQL group replication port: 33061.
(1) Operations on host 192.168.197.110.
Follow the steps below to configure MySQL, add this server to the group, and finally bootstrap the group.
(a) Modify the MySQL configuration.
Edit the MySQL server configuration file:
[mysqld]
server_id=110
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address="192.168.197.110:33061"
loose-group_replication_group_seeds="192.168.197.110:33061,192.168.197.111:33061,192.168.197.112:33061"
loose-group_replication_bootstrap_group=off
Here, transaction_write_set_extraction=XXHASH64 makes the server collect the write set of every transaction and encode it with the XXHASH64 algorithm.
loose-group_replication_group_name is the name (a UUID) of the group this server will join.
loose-group_replication_start_on_boot controls whether group replication starts automatically when the MySQL server starts.
loose-group_replication_local_address is the local address on which the group replication service listens.
loose-group_replication_group_seeds lists the seed members of the group; when this server needs to contact the group, it contacts these members.
loose-group_replication_bootstrap_group controls whether this server bootstraps the group. Only one member of a group should ever bootstrap it.
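For comparison, the configuration on a Secondary host differs only in server_id and the local listening address; a minimal sketch for 192.168.197.111 (assuming the same group name, hostnames, and ports as above):

```ini
[mysqld]
server_id=111
# GTID/binlog settings identical to those on 192.168.197.110:
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
loose-group_replication_start_on_boot=off
# Only these two lines differ per host:
loose-group_replication_local_address="192.168.197.111:33061"
loose-group_replication_group_seeds="192.168.197.110:33061,192.168.197.111:33061,192.168.197.112:33061"
loose-group_replication_bootstrap_group=off
```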
(b) Create the replication user.
mysql> SET SQL_LOG_BIN=0;
Query OK, 0 rows affected (0.03 sec)
mysql> create user 'repl'@'%' identified by '123456';
Query OK, 0 rows affected (0.03 sec)
mysql> grant replication slave on *.* to 'repl'@'%';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
mysql> SET SQL_LOG_BIN=1;
Query OK, 0 rows affected (0.03 sec)
(c) Establish the replication relationship.
Set up the dedicated group replication recovery channel:
mysql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='123456' FOR CHANNEL 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.04 sec)
(d) Install the group replication plugin.
mysql> INSTALL PLUGIN group_replication SONAME 'group_replication.so';
Query OK, 0 rows affected (0.17 sec)
After installation, run show plugins; to list the installed plugins; you should see this entry:
| group_replication | ACTIVE | GROUP REPLICATION | group_replication.so | GPL |
(e) Bootstrap the group and start group replication.
Bootstrap the group:
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.00 sec)
Start group replication:
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (2.17 sec)
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
Query OK, 0 rows affected (0.00 sec)
At this point the MySQL group is up. Query the group membership with the following command; there should be one online member.
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c110 | 110.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
1 row in set (0.00 sec)
(2) Operations on host 192.168.197.111.
Configure MySQL on this host in the same way (adjusting server_id and group_replication_local_address) and join it to the group.
The difference is that 192.168.197.110 bootstrapped the group and started group replication, while every other host only starts group replication and must not bootstrap the group again.
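Concretely, the sequence on 192.168.197.111 mirrors steps (b) through (e) on the Primary, minus the bootstrap. A sketch (same user, password, and channel as above):

```sql
-- On 192.168.197.111, after applying the adjusted my.cnf and restarting:
SET SQL_LOG_BIN=0;
CREATE USER 'repl'@'%' IDENTIFIED BY '123456';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;
CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='123456'
  FOR CHANNEL 'group_replication_recovery';
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
-- Do NOT set group_replication_bootstrap_group here; the group already exists.
START GROUP_REPLICATION;
```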
mysql> start group_replication;
Query OK, 0 rows affected (4.94 sec)
Under normal circumstances, the MySQL servers on both 110 and 111 are now online:
mysql> SELECT * FROM performance_schema.replication_group_members;
No connection. Trying to reconnect...
Connection id: 16
Current database: mysql
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c110 | 110.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c111 | 111.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
2 rows in set (0.00 sec)
Sometimes start group_replication fails, or the member remains in the RECOVERING state for a long time and finally ends up in the ERROR state. The MySQL error log then contains entries such as:
2017-08-19T06:12:16.471125Z 15 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'repl@110.coe2coe.me:3306' - retry-time: 60 retries: 1, Error_code: 2003
2017-08-19T06:12:16.471163Z 15 [Note] Slave I/O thread for channel 'group_replication_recovery' killed while connecting to master
2017-08-19T06:12:16.471169Z 15 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'FIRST', position 4
2017-08-19T06:12:16.471315Z 10 [ERROR] Plugin group_replication reported: 'There was an error when connecting to the donor server. Check group replication recovery's connection credentials.'
When this happens, check that DNS resolution works on this host and that it can connect to the MySQL servers that are already online.
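The error above is a connectivity failure on the recovery channel. Assuming the hostnames and credentials used in this walkthrough, the following commands, run on the joining member, are a quick way to check name resolution and reachability of the donor:

```shell
# Check that the donor's hostname resolves (DNS or /etc/hosts):
getent hosts 110.coe2coe.me

# Check that the MySQL port on the donor is reachable
# using the recovery credentials configured earlier:
mysql -h 110.coe2coe.me -P 3306 -u repl -p123456 -e "SELECT 1;"
```

If either command fails, fix name resolution (for example via /etc/hosts) or the firewall rules before retrying START GROUP_REPLICATION.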
(3) Deploy the 192.168.197.112 node.
Set it up and deploy it the same way as in (2).
(4) Verify the deployment.
After deployment, list the nodes in the MySQL group:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c110 | 110.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c111 | 111.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c112 | 112.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
3 rows in set (0.00 sec)
All three nodes are online.
Next, verify that data is replicated.
First, on 192.168.197.110 (the Primary), create a database and a table and insert some rows:
mysql> create database mydb;
Query OK, 1 row affected (0.01 sec)
mysql> use mydb;
Database changed
mysql> create table test (name varchar(100) primary key);
Query OK, 0 rows affected (0.03 sec)
mysql> insert into test (name) values ('001'),('002'),('003');
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select * from test;
+------+
| name |
+------+
| 001 |
| 002 |
| 003 |
+------+
3 rows in set (0.00 sec)
Then query the data on 192.168.197.111 and 192.168.197.112 (the Secondaries):
mysql> use mydb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from test;
+------+
| name |
+------+
| 001 |
| 002 |
| 003 |
+------+
3 rows in set (0.00 sec)
The data has been correctly replicated to the other two nodes.
At this point, the MySQL group replication deployment is complete.
The two Secondary nodes have automatically been put into read-only mode:
mysql> show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | ON |
| super_read_only | ON |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.01 sec)
The Primary node remains in read-write mode:
mysql> show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | OFF |
| super_read_only | OFF |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.01 sec)
1.2. Automatic Primary failover in single-primary mode
While a single-primary group is running normally, stopping the Primary node causes the group to automatically elect a new Primary from the remaining members.
Group membership before the failure:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c110 | 110.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c111 | 111.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c112 | 112.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
3 rows in set (0.00 sec)
On 192.168.197.110 (the Primary), shut the server down with the following command:
mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)
Now list the group membership on 192.168.197.111 (a Secondary):
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c111 | 111.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c112 | 112.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
2 rows in set (0.00 sec)
Check the read-only variables on 111:
mysql> show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | OFF |
| super_read_only | OFF |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.00 sec)
Check which node is currently the Primary:
mysql> SELECT VARIABLE_NAME,VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME like 'group_replication%';
+----------------------------------+--------------------------------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | a2392929-6dfb-11e7-b294-000c29b1c111 |
+----------------------------------+--------------------------------------+
1 row in set (0.00 sec)
Log entries on 111:
2017-08-19T07:16:51.519461Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-08-19T07:16:52.634010Z 0 [Note] Plugin group_replication reported: 'Unsetting super_read_only.'
2017-08-19T07:16:52.634129Z 98 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
Log entries on 112:
2017-08-19T07:00:43.407749Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-08-19T07:00:45.570407Z 0 [Note] Plugin group_replication reported: 'Marking group replication view change with view_id 15031230388599955:19'
2017-08-19T07:00:45.849564Z 0 [Note] Plugin group_replication reported: 'The member with address 111.coe2coe.me:3306 was declared online within the replication group'
2017-08-19T07:16:50.830199Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-08-19T07:16:51.944908Z 0 [Note] Plugin group_replication reported: 'Setting super_read_only.'
2017-08-19T07:16:51.945033Z 33 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
Taken together, this shows that 192.168.197.111 has become the Primary node:
(a) 111 is now read-write.
(b) 111's membership view changed, even though its state is still ONLINE.
If the former Primary (192.168.197.110) is now restarted, it ends up in an awkward position: if it again starts as the bootstrapping node, it forms a group of its own, which is not the same group as the one that already exists.
The group as seen from 110 after its restart:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c110 | 110.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
1 row in set (0.00 sec)
The original group still has only two members online:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c111 | 111.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c112 | 112.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
2 rows in set (0.00 sec)
After this happens, to have node 110 rejoin the original group as a Secondary, proceed as follows on 110.
110 originally started as the bootstrapping node, so bootstrap must first be turned off:
mysql> set @@global.group_replication_bootstrap_group=off;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c110 | 110.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
1 row in set (0.00 sec)
mysql> stop group_replication;
Query OK, 0 rows affected (9.51 sec)
mysql> start group_replication;
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.
This error occurs because 110 contains transactions (GTIDs) that the new Primary (111) does not have. The following command tells 110 to ignore these disjoint GTIDs; without it, 110 cannot rejoin the original group. Note that ignoring disjoint GTIDs can mask genuine data divergence, so in production it is safer to first reconcile or discard 110's extra transactions (for example, by rebuilding 110 from a backup of the current Primary).
mysql> set @@global.group_replication_allow_local_disjoint_gtids_join=1;
Query OK, 0 rows affected (0.00 sec)
mysql> start group_replication;
Query OK, 0 rows affected (3.18 sec)
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c110 | 110.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c111 | 111.coe2coe.me | 3306 | ONLINE |
| group_replication_applier | a2392929-6dfb-11e7-b294-000c29b1c112 | 112.coe2coe.me | 3306 | ONLINE |
+---------------------------+--------------------------------------+----------------+-------------+--------------+
3 rows in set (0.00 sec)
With that, node 110 has successfully rejoined the original group after the simulated failure and recovery.