Preface

I installed MasterHA yesterday. Now let's test its master-slave switchover and failover features.

Framework
Hostname | IP | Port | Identity | OS Version | MySQL Version |
zlm2 | 192.168.1.101 | 3306 | master | CentOS 7.0 | 5.7.21 |
zlm3 | 192.168.1.102 | 3306 | slave/mha-manager | CentOS 7.0 | 5.7.21 |
null | 192.168.1.200 | null | vip | null | null |
[root@zlm3 07:35:00 ~]
#masterha_check_ssh --conf=/etc/masterha/app1.conf
Fri Aug 3 07:37:13 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Aug 3 07:37:13 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Fri Aug 3 07:37:13 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Fri Aug 3 07:37:13 2018 - [info] Starting SSH connection tests..
Fri Aug 3 07:37:13 2018 - [debug]
Fri Aug 3 07:37:13 2018 - [debug] Connecting via SSH from root@192.168.1.101(192.168.1.101:22) to root@192.168.1.102(192.168.1.102:22)..
Fri Aug 3 07:37:13 2018 - [debug] ok.
Fri Aug 3 07:37:14 2018 - [debug]
Fri Aug 3 07:37:13 2018 - [debug] Connecting via SSH from root@192.168.1.102(192.168.1.102:22) to root@192.168.1.101(192.168.1.101:22)..
Fri Aug 3 07:37:13 2018 - [debug] ok.
Fri Aug 3 07:37:14 2018 - [info] All SSH connection tests passed successfully.

[root@zlm3 07:37:14 ~]
#masterha_check_repl --conf=/etc/masterha/app1.conf --global_conf=/etc/masterha/masterha_default.conf
Fri Aug 3 07:37:37 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Fri Aug 3 07:37:37 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Fri Aug 3 07:37:37 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Fri Aug 3 07:37:37 2018 - [info] MHA::MasterMonitor version 0.56.
Fri Aug 3 07:37:38 2018 - [info] GTID failover mode = 1
Fri Aug 3 07:37:38 2018 - [info] Dead Servers:
Fri Aug 3 07:37:38 2018 - [info] Alive Servers:
Fri Aug 3 07:37:38 2018 - [info] 192.168.1.101(192.168.1.101:3306)
Fri Aug 3 07:37:38 2018 - [info] 192.168.1.102(192.168.1.102:3306)
Fri Aug 3 07:37:38 2018 - [info] Alive Slaves:
Fri Aug 3 07:37:38 2018 - [info] 192.168.1.102(192.168.1.102:3306) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
Fri Aug 3 07:37:38 2018 - [info] GTID ON
Fri Aug 3 07:37:38 2018 - [info] Replicating from 192.168.1.101(192.168.1.101:3306)
Fri Aug 3 07:37:38 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Aug 3 07:37:38 2018 - [info] Current Alive Master: 192.168.1.101(192.168.1.101:3306)
Fri Aug 3 07:37:38 2018 - [info] Checking slave configurations..
Fri Aug 3 07:37:38 2018 - [info] read_only=1 is not set on slave 192.168.1.102(192.168.1.102:3306).
Fri Aug 3 07:37:38 2018 - [info] Checking replication filtering settings..
Fri Aug 3 07:37:38 2018 - [info] binlog_do_db= , binlog_ignore_db=
Fri Aug 3 07:37:38 2018 - [info] Replication filtering check ok.
Fri Aug 3 07:37:38 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Fri Aug 3 07:37:38 2018 - [info] Checking SSH publickey authentication settings on the current master..
ssh_exchange_identification: Connection closed by remote host
Fri Aug 3 07:37:38 2018 - [warning] HealthCheck: SSH to 192.168.1.101 is NOT reachable.
Fri Aug 3 07:37:38 2018 - [info]
192.168.1.101(192.168.1.101:3306) (current master)
 +--192.168.1.102(192.168.1.102:3306)

Fri Aug 3 07:37:38 2018 - [info] Checking replication health on 192.168.1.102..
Fri Aug 3 07:37:38 2018 - [info] ok.
Fri Aug 3 07:37:38 2018 - [info] Checking master_ip_failover_script status:
Fri Aug 3 07:37:38 2018 - [info] /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.101 --orig_master_ip=192.168.1.101 --orig_master_port=3306 --orig_master_ssh_port=3306
Fri Aug 3 07:37:38 2018 - [info] OK.
Fri Aug 3 07:37:38 2018 - [warning] shutdown_script is not defined.
Fri Aug 3 07:37:38 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

[root@zlm3 07:40:03 ~]
#Fri Aug 3 07:40:03 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Fri Aug 3 07:40:03 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Fri Aug 3 07:40:03 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
ssh_exchange_identification: Connection closed by remote host
^C

[root@zlm3 07:40:11 ~]
#masterha_check_status --conf=/etc/masterha/app1.conf
app1 (pid:5628) is running(0:PING_OK), master:192.168.1.101
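The status line printed by masterha_check_status at the end of the session above has a regular shape, so a script can pull the manager state and the current master out of it. Below is a minimal sketch, assuming the output always looks like the "app1 (pid:5628) is running(0:PING_OK), master:192.168.1.101" line shown; the helper names mha_status_code and mha_current_master are hypothetical and are not part of MHA.

```shell
# Hypothetical helpers (not shipped with MHA): parse one status line such as
#   app1 (pid:5628) is running(0:PING_OK), master:192.168.1.101

# Print the status word that follows "<number>:" inside the parentheses, e.g. PING_OK.
mha_status_code() {
    printf '%s\n' "$1" | sed -n 's/.*([0-9][0-9]*:\([A-Z_]*\)).*/\1/p'
}

# Print the address that follows "master:"; prints nothing if the field is absent.
mha_current_master() {
    printf '%s\n' "$1" | sed -n 's/.*master:\([^ ,]*\).*/\1/p'
}

# Usage sketch (assumes masterha_check_status is installed on this host):
#   line=$(masterha_check_status --conf=/etc/masterha/app1.conf)
#   mha_status_code "$line"
#   mha_current_master "$line"
```

Keeping the parsing in small functions that take the line as an argument means they can be tried against a saved status line without a running manager.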
Switch the master role over to the slave, and make the original master a slave of the new master.
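As the session that follows shows, masterha_master_switch refuses to run an online switch while MHA Manager is still monitoring the application, and masterha_stop has to be run first. The sketch below wraps that check in a function; it assumes that masterha_check_status reports a monitored application with the words "is running", as in the status line earlier in this post. The function name ensure_manager_stopped is hypothetical.

```shell
# Hypothetical wrapper (not part of MHA): stop the manager only if it is running.
# All arguments are passed through unchanged to masterha_check_status and
# masterha_stop, e.g. --conf=... and --global_conf=...
ensure_manager_stopped() {
    # masterha_check_status may exit non-zero when nothing is running,
    # so its exit code is ignored and only its text is inspected.
    mha_check_output=$(masterha_check_status "$@" 2>&1) || true
    case $mha_check_output in
        *"is running"*)
            echo "MHA Manager is running; stopping it before the switch"
            masterha_stop "$@"
            ;;
        *)
            echo "MHA Manager is not reported as running; nothing to stop"
            ;;
    esac
}

# Usage sketch:
#   ensure_manager_stopped --conf=/etc/masterha/app1.conf --global_conf=/etc/masterha/masterha_default.conf
#   masterha_master_switch --conf=/etc/masterha/app1.conf ... --master_state=alive ...
```

The wrapper deliberately does not run the switch itself, because masterha_master_switch asks interactive yes/no questions that are better answered by hand.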
[root@zlm3 08:21:27 ~]
#masterha_master_switch --conf=/etc/masterha/app1.conf --global_conf=/etc/masterha/masterha_default.conf --master_state=alive --new_master_host=192.168.1.102 --orig_master_is_new_slave --running_updates_limit=60
Fri Aug 3 08:21:29 2018 - [info] MHA::MasterRotate version 0.56.
Fri Aug 3 08:21:29 2018 - [info] Starting online master switch..
Fri Aug 3 08:21:29 2018 - [info]
Fri Aug 3 08:21:29 2018 - [info] * Phase 1: Configuration Check Phase..
Fri Aug 3 08:21:29 2018 - [info]
Fri Aug 3 08:21:29 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Fri Aug 3 08:21:29 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Fri Aug 3 08:21:29 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Fri Aug 3 08:21:30 2018 - [info] GTID failover mode = 1
Fri Aug 3 08:21:30 2018 - [info] Current Alive Master: 192.168.1.101(192.168.1.101:3306)
Fri Aug 3 08:21:30 2018 - [info] Alive Slaves:
Fri Aug 3 08:21:30 2018 - [info] 192.168.1.102(192.168.1.102:3306) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
Fri Aug 3 08:21:30 2018 - [info] GTID ON
Fri Aug 3 08:21:30 2018 - [info] Replicating from 192.168.1.101(192.168.1.101:3306)
Fri Aug 3 08:21:30 2018 - [info] Primary candidate for the new Master (candidate_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.1.101(192.168.1.101:3306)? (YES/no): yes
Fri Aug 3 08:21:33 2018 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Fri Aug 3 08:21:33 2018 - [info] ok.
Fri Aug 3 08:21:33 2018 - [info] Checking MHA is not monitoring or doing failover..
Fri Aug 3 08:21:33 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterRotate.pm, ln142] Getting advisory lock failed on the current master. MHA Monitor runs on the current master. Stop MHA Manager/Monitor and try again.
Fri Aug 3 08:21:33 2018 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/bin/masterha_master_switch line 53.

//This means that MHA Manager must be stopped before doing an online master switchover.

[root@zlm3 08:21:33 ~]
#masterha_stop --conf=/etc/masterha/app1.conf --global_conf=/etc/masterha/masterha_default.conf
Stopped app1 successfully.
[1]+ Exit 1 masterha_manager --conf=/etc/masterha/app1.conf --global_conf=/etc/masterha/masterha_default.conf

[root@zlm3 08:28:07 ~]
#masterha_master_switch --conf=/etc/masterha/app1.conf --global_conf=/etc/masterha/masterha_default.conf --master_state=alive --new_master_host=192.168.1.102 --orig_master_is_new_slave --running_updates_limit=60
Fri Aug 3 08:28:21 2018 - [info] MHA::MasterRotate version 0.56.
Fri Aug 3 08:28:21 2018 - [info] Starting online master switch..
Fri Aug 3 08:28:21 2018 - [info]
Fri Aug 3 08:28:21 2018 - [info] * Phase 1: Configuration Check Phase..
Fri Aug 3 08:28:21 2018 - [info]
Fri Aug 3 08:28:21 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Fri Aug 3 08:28:21 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Fri Aug 3 08:28:21 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Fri Aug 3 08:28:22 2018 - [info] GTID failover mode = 1
Fri Aug 3 08:28:22 2018 - [info] Current Alive Master: 192.168.1.101(192.168.1.101:3306)
Fri Aug 3 08:28:22 2018 - [info] Alive Slaves:
Fri Aug 3 08:28:22 2018 - [info] 192.168.1.102(192.168.1.102:3306) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
Fri Aug 3 08:28:22 2018 - [info] GTID ON
Fri Aug 3 08:28:22 2018 - [info] Replicating from 192.168.1.101(192.168.1.101:3306)
Fri Aug 3 08:28:22 2018 - [info] Primary candidate for the new Master (candidate_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.1.101(192.168.1.101:3306)? (YES/no): yes
Fri Aug 3 08:28:25 2018 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Fri Aug 3 08:28:25 2018 - [info] ok.
Fri Aug 3 08:28:25 2018 - [info] Checking MHA is not monitoring or doing failover..
Fri Aug 3 08:28:25 2018 - [info] Checking replication health on 192.168.1.102..
Fri Aug 3 08:28:25 2018 - [info] ok.
Fri Aug 3 08:28:25 2018 - [info] 192.168.1.102 can be new master.
Fri Aug 3 08:28:25 2018 - [info]
From:
192.168.1.101(192.168.1.101:3306) (current master)
 +--192.168.1.102(192.168.1.102:3306)

To:
192.168.1.102(192.168.1.102:3306) (new master)
 +--192.168.1.101(192.168.1.101:3306)

Starting master switch from 192.168.1.101(192.168.1.101:3306) to 192.168.1.102(192.168.1.102:3306)? (yes/NO): yes
Fri Aug 3 08:28:31 2018 - [info] Checking whether 192.168.1.102(192.168.1.102:3306) is ok for the new master..
Fri Aug 3 08:28:31 2018 - [info] ok.
Fri Aug 3 08:28:31 2018 - [info] 192.168.1.101(192.168.1.101:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Fri Aug 3 08:28:31 2018 - [info] 192.168.1.101(192.168.1.101:3306): Resetting slave pointing to the dummy host.
Fri Aug 3 08:28:31 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Fri Aug 3 08:28:31 2018 - [info]
Fri Aug 3 08:28:31 2018 - [info] * Phase 2: Rejecting updates Phase..
Fri Aug 3 08:28:31 2018 - [info]
Fri Aug 3 08:28:31 2018 - [info] Executing master ip online change script to disable write on the current master:
Fri Aug 3 08:28:31 2018 - [info] /etc/masterha/master_ip_online_change --command=stop --orig_master_host=192.168.1.101 --orig_master_ip=192.168.1.101 --orig_master_port=3306 --orig_master_user='zlm' --orig_master_password='zlmzlm' --new_master_host=192.168.1.102 --new_master_ip=192.168.1.102 --new_master_port=3306 --new_master_user='zlm' --new_master_password='zlmzlm' --orig_master_ssh_user=root --new_master_ssh_user=root --orig_master_ssh_port=3306 --new_master_ssh_port=3306 --orig_master_is_new_slave
Unknown option: new_master_ssh_port
Fri Aug 3 08:28:32 2018 116409 Set read_only on the new master.. ok.
Fri Aug 3 08:28:32 2018 125643 drop vip 10.33.101.239..
ssh_exchange_identification: Connection closed by remote host
Fri Aug 3 08:28:32 2018 142948 Waiting all running 1 threads are disconnected.. (max 1500 milliseconds)
{'Time' => '13435','db' => undef,'Id' => '21','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => 'zlm3:40535'}
Fri Aug 3 08:28:32 2018 646769 Waiting all running 1 threads are disconnected.. (max 1000 milliseconds)
{'Time' => '13435','db' => undef,'Id' => '21','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => 'zlm3:40535'}
Fri Aug 3 08:28:33 2018 149221 Waiting all running 1 threads are disconnected.. (max 500 milliseconds)
{'Time' => '13436','db' => undef,'Id' => '21','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => 'zlm3:40535'}
Fri Aug 3 08:28:33 2018 650816 Set read_only=1 on the orig master.. ok.
Fri Aug 3 08:28:33 2018 653323 Waiting all running 1 queries are disconnected.. (max 500 milliseconds)
{'Time' => '13436','db' => undef,'Id' => '21','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => 'zlm3:40535'}
Fri Aug 3 08:28:34 2018 154965 Killing all application threads..
Fri Aug 3 08:28:34 2018 167919 done.
Fri Aug 3 08:28:34 2018 - [info] ok.
Fri Aug 3 08:28:34 2018 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Fri Aug 3 08:28:34 2018 - [info] Executing FLUSH TABLES WITH READ LOCK..
Fri Aug 3 08:28:34 2018 - [info] ok.
Fri Aug