A Deep Dive into Redis Sentinel, the Redis High-Availability Solution


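Redis Sentinel is the high-availability solution for Redis. It was officially introduced in Redis 2.8.

With plain master-slave replication, a master failure has to be handled by hand: promote one slave to master, point the remaining slaves at the new master, and change the master address used by the application. Every step requires human intervention.

The logs below show exactly how Sentinel performs the switch.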

The Sentinel failover process

The cluster topology is as follows.

Role        IP         Port   runID
Master      127.0.0.1  6379
Slave-1     127.0.0.1  6380
Slave-2     127.0.0.1  6381
Sentinel-1  127.0.0.1  26379  d4424b8684977767be4f5abd1e364153fbb0adbd
Sentinel-2  127.0.0.1  26380  18311edfbfb7bf89fe4b67d08ef432053db62fff
Sentinel-3  127.0.0.1  26381  3e9eb1aa9378d89cfe04fe21bf4a05a901747fa8

Kill the master process with kill -9.

1. The slaves react first.

They immediately print the following.

28244:S 08 Oct 16:03:34.184 # Connection with master lost.
28244:S 08 Oct 16:03:34.184 * Caching the disconnected master state.
28244:S 08 Oct 16:03:34.548 * Connecting to MASTER 127.0.0.1:6379
28244:S 08 Oct 16:03:34.548 * MASTER <-> SLAVE sync started
28244:S 08 Oct 16:03:34.548 # Error condition on socket for SYNC: Connection refused
28244:S 08 Oct 16:03:35.556 * Connecting to MASTER 127.0.0.1:6379
28244:S 08 Oct 16:03:35.556 * MASTER <-> SLAVE sync started
...

2. The Sentinels log nothing until about 30 s later, which matches the setting "sentinel down-after-milliseconds mymaster 30000".
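The 30 s gap can be checked with a little timestamp arithmetic. The sketch below takes the slave's "Connection with master lost." timestamp from the log above and the first +sdown timestamp logged by Sentinel-1 (16:04:04.277), and compares the difference with down-after-milliseconds. The helper name parse_ts is made up for this illustration, and the slave's timestamp is only an approximation of the moment the master died.

```python
from datetime import datetime, timedelta

# Redis log timestamps look like "08 Oct 16:03:34.184" (no year).
LOG_TS_FORMAT = "%d %b %H:%M:%S.%f"


def parse_ts(stamp: str) -> datetime:
    """Parse a Redis log timestamp; the missing year defaults to 1900."""
    return datetime.strptime(stamp, LOG_TS_FORMAT)


# Slave log: "Connection with master lost."
master_lost = parse_ts("08 Oct 16:03:34.184")
# Sentinel-1 log: "+sdown master mymaster 127.0.0.1 6379"
sdown_seen = parse_ts("08 Oct 16:04:04.277")
# sentinel down-after-milliseconds mymaster 30000
down_after = timedelta(milliseconds=30000)

delay = sdown_seen - master_lost
print("seconds until +sdown:", delay.total_seconds())
print("waited at least down-after-milliseconds:", delay >= down_after)
```

The delay is slightly more than 30 s because Sentinel only evaluates the timeout on its next periodic check.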

The log output of each Sentinel node and of the slaves follows, in turn.

Sentinel-1

28087:X 08 Oct 16:04:04.277 # +sdown master mymaster 127.0.0.1 6379
28087:X 08 Oct 16:04:04.379 # +new-epoch 1
28087:X 08 Oct 16:04:04.385 # +vote-for-leader 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28087:X 08 Oct 16:04:05.388 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2
28087:X 08 Oct 16:04:05.388 # Next failover delay: I will not start a failover before Mon Oct  8 16:10:04 2018
28087:X 08 Oct 16:04:05.631 # +config-update-from sentinel 18311edfbfb7bf89fe4b67d08ef432053db62fff 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
28087:X 08 Oct 16:04:05.631 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
28087:X 08 Oct 16:04:05.631 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
28087:X 08 Oct 16:04:05.631 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
28087:X 08 Oct 16:04:35.656 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

Sentinel-2

28163:X 08 Oct 16:04:04.289 # +sdown master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:04.366 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2
28163:X 08 Oct 16:04:04.366 # +new-epoch 1
28163:X 08 Oct 16:04:04.366 # +try-failover master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:04.373 # +vote-for-leader 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28163:X 08 Oct 16:04:04.385 # 3e9eb1aa9378d89cfe04fe21bf4a05a901747fa8 voted for 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28163:X 08 Oct 16:04:04.385 # d4424b8684977767be4f5abd1e364153fbb0adbd voted for 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28163:X 08 Oct 16:04:04.450 # +elected-leader master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:04.450 # +failover-state-select-slave master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:04.528 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:04.528 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:04.586 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:05.543 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:05.543 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:05.629 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:06.554 # -odown master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:06.555 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:06.555 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:06.606 # +failover-end master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:06.606 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
28163:X 08 Oct 16:04:06.606 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
28163:X 08 Oct 16:04:06.606 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
28163:X 08 Oct 16:04:36.687 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

Sentinel-3

28234:X 08 Oct 16:04:04.288 # +sdown master mymaster 127.0.0.1 6379
28234:X 08 Oct 16:04:04.378 # +new-epoch 1
28234:X 08 Oct 16:04:04.385 # +vote-for-leader 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28234:X 08 Oct 16:04:04.385 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2
28234:X 08 Oct 16:04:04.385 # Next failover delay: I will not start a failover before Mon Oct  8 16:10:04 2018
28234:X 08 Oct 16:04:05.630 # +config-update-from sentinel 18311edfbfb7bf89fe4b67d08ef432053db62fff 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
28234:X 08 Oct 16:04:05.630 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
28234:X 08 Oct 16:04:05.630 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
28234:X 08 Oct 16:04:05.630 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
28234:X 08 Oct 16:04:35.709 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

slave2 (127.0.0.1:6380)

28244:S 08 Oct 16:04:04.762 * MASTER <-> SLAVE sync started
28244:S 08 Oct 16:04:04.762 # Error condition on socket for SYNC: Connection refused
28244:S 08 Oct 16:04:05.630 * SLAVE OF 127.0.0.1:6381 enabled (user request from 'id=6 addr=127.0.0.1:43880 fd=12 name= age=148 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=224 qbuf-free=32544 obl=81 oll=0 omem=0 events=r cmd=slaveof')
28244:S 08 Oct 16:04:05.636 # CONFIG REWRITE executed with success.
28244:S 08 Oct 16:04:05.770 * Connecting to MASTER 127.0.0.1:6381
28244:S 08 Oct 16:04:05.770 * MASTER <-> SLAVE sync started
28244:S 08 Oct 16:04:05.770 * Non blocking connect for SYNC fired the event.
28244:S 08 Oct 16:04:05.770 * Master replied to PING, replication can continue...
28244:S 08 Oct 16:04:05.770 * Trying a partial resynchronization (request b95802ca8afd97c578b355a5838d219681d0af27:24302).
28244:S 08 Oct 16:04:05.770 * Successful partial resynchronization with master.
28244:S 08 Oct 16:04:05.770 # Master replication ID changed to a4022bb5c361353a4773fd460cec5cdcc5c02031
28244:S 08 Oct 16:04:05.770 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.

slave3 (127.0.0.1:6381)

28253:S 08 Oct 16:04:03.655 * MASTER <-> SLAVE sync started
28253:S 08 Oct 16:04:03.655 # Error condition on socket for SYNC: Connection refused
28253:M 08 Oct 16:04:04.586 # Setting secondary replication ID to b95802ca8afd97c578b355a5838d219681d0af27, valid up to offset: 24302. New replication ID is a4022bb5c361353a4773fd460cec5cdcc5c02031
28253:M 08 Oct 16:04:04.586 * Discarding previously cached master state.
28253:M 08 Oct 16:04:04.586 * MASTER MODE enabled (user request from 'id=9 addr=127.0.0.1:49316 fd=8 name=sentinel-18311edf-cmd age=137 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
28253:M 08 Oct 16:04:04.593 # CONFIG REWRITE executed with success.
28253:M 08 Oct 16:04:05.770 * Slave 127.0.0.1:6380 asks for synchronization
28253:M 08 Oct 16:04:05.770 * Partial resynchronization request from 127.0.0.1:6380 accepted. Sending 156 bytes of backlog starting from offset 24302.

From the logs above, we can see the following.

Every Sentinel node judges 127.0.0.1 6379 to be subjectively down (sdown).

28163:X 08 Oct 16:04:04.289 # +sdown master mymaster 127.0.0.1 6379

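Once the configured quorum is reached, Sentinel-2 judges the master to be objectively down (odown). Comparing this with the logs of the other two Sentinels shows that Sentinel-2 was the first to reach that verdict. The Sentinels then elect a leader. Generally, whichever Sentinel completes the odown judgment first becomes the leader, and only the leader may carry out the failover.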

28163:X 08 Oct 16:04:04.366 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2
28163:X 08 Oct 16:04:04.366 # +new-epoch 1
28163:X 08 Oct 16:04:04.366 # +try-failover master mymaster 127.0.0.1 6379
28163:X 08 Oct 16:04:04.373 # +vote-for-leader 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28163:X 08 Oct 16:04:04.385 # 3e9eb1aa9378d89cfe04fe21bf4a05a901747fa8 voted for 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28163:X 08 Oct 16:04:04.385 # d4424b8684977767be4f5abd1e364153fbb0adbd voted for 18311edfbfb7bf89fe4b67d08ef432053db62fff 1
28163:X 08 Oct 16:04:04.450 # +elected-leader master mymaster 127.0.0.1 6379

Look for a suitable slave to become the new master.

28163:X 08 Oct 16:04:04.450 # +failover-state-select-slave master mymaster 127.0.0.1 6379

+failover-state-select-slave <instance details> -- New failover state is select-slave: we are trying to find a suitable slave for promotion.

127.0.0.1 6381 is selected as the new master.

28163:X 08 Oct 16:04:04.528 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

+selected-slave <instance details> -- We found the specified good slave to promote.

Node 6381 is told to run slaveof no one so that it becomes a master.

28163:X 08 Oct 16:04:04.528 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

+failover-state-send-slaveof-noone <instance details> -- We are trying to reconfigure the promoted slave as master, waiting for it to switch.

Wait for node 6381 to be promoted to master.

28163:X 08 Oct 16:04:04.586 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

Confirm that node 6381 has been promoted to master.

28163:X 08 Oct 16:04:05.543 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

Now look at what slave3 logged between 16:04:04.528 and 16:04:05.543. It switched to MASTER mode and rewrote its configuration file.

28253:M 08 Oct 16:04:04.586 # Setting secondary replication ID to b95802ca8afd97c578b355a5838d219681d0af27, valid up to offset: 24302. New replication ID is a4022bb5c361353a4773fd460cec5cdcc5c02031
28253:M 08 Oct 16:04:04.586 * Discarding previously cached master state.
28253:M 08 Oct 16:04:04.586 * MASTER MODE enabled (user request from 'id=9 addr=127.0.0.1:49316 fd=8 name=sentinel-18311edf-cmd age=137 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
28253:M 08 Oct 16:04:04.593 # CONFIG REWRITE executed with success.

The failover enters the phase of reconfiguring the slaves.

28163:X 08 Oct 16:04:05.543 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379

Node 6380 is told to replicate from the new master.

28163:X 08 Oct 16:04:05.629 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

+slave-reconf-sent <instance details> -- The leader sentinel sent the SLAVEOF command to this instance in order to reconfigure it for the new slave.

The log of slave2 at this point in time lines up with this. It performs a partial (incremental) resynchronization.

28244:S 08 Oct 16:04:05.630 * SLAVE OF 127.0.0.1:6381 enabled (user request from 'id=6 addr=127.0.0.1:43880 fd=12 name= age=148 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=224 qbuf-free=32544 obl=81 oll=0 omem=0 events=r cmd=slaveof')
28244:S 08 Oct 16:04:05.636 # CONFIG REWRITE executed with success.
28244:S 08 Oct 16:04:05.770 * Connecting to MASTER 127.0.0.1:6381
28244:S 08 Oct 16:04:05.770 * MASTER <-> SLAVE sync started
28244:S 08 Oct 16:04:05.770 * Non blocking connect for SYNC fired the event.
28244:S 08 Oct 16:04:05.770 * Master replied to PING, replication can continue...
28244:S 08 Oct 16:04:05.770 * Trying a partial resynchronization (request b95802ca8afd97c578b355a5838d219681d0af27:24302).
28244:S 08 Oct 16:04:05.770 * Successful partial resynchronization with master.
28244:S 08 Oct 16:04:05.770 # Master replication ID changed to a4022bb5c361353a4773fd460cec5cdcc5c02031
28244:S 08 Oct 16:04:05.770 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.

At the same moment the Sentinels also produce log output; take sentinel1 as an example. The log shows that it updates its configuration at this point.

28087:X 08 Oct 16:04:05.631 # +config-update-from sentinel 18311edfbfb7bf89fe4b67d08ef432053db62fff 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
28087:X 08 Oct 16:04:05.631 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
28087:X 08 Oct 16:04:05.631 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
28087:X 08 Oct 16:04:05.631 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

+switch-master <master name> <oldip> <oldport> <newip> <newport> -- The master new IP and address is the specified one after a configuration change. This is the message most external users are interested in.
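Because +switch-master is the event clients care about most, here is a minimal sketch of pulling the old and new master addresses out of such a line. It assumes the five-field layout described above; the SwitchMaster type and the parse_switch_master function are names invented for this illustration and are not part of Redis or of any client library.

```python
from typing import NamedTuple, Optional


class SwitchMaster(NamedTuple):
    master_name: str
    old_ip: str
    old_port: int
    new_ip: str
    new_port: int


def parse_switch_master(line: str) -> Optional[SwitchMaster]:
    """Return the parsed event, or None if the line is not a +switch-master event."""
    marker = "+switch-master "
    idx = line.find(marker)
    if idx == -1:
        return None
    fields = line[idx + len(marker):].split()
    if len(fields) != 5:
        return None
    name, old_ip, old_port, new_ip, new_port = fields
    return SwitchMaster(name, old_ip, int(old_port), new_ip, int(new_port))


# A line taken from the Sentinel-1 log above.
event = parse_switch_master(
    "28087:X 08 Oct 16:04:05.631 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381"
)
print(event)
```

An application could use the new_ip and new_port fields to redirect its writes after a failover.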

The synchronization is not yet complete.

28163:X 08 Oct 16:04:06.555 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

+slave-reconf-inprog <instance details> -- The slave being reconfigured showed to be a slave of the new master ip:port pair, but the synchronization process is not yet complete.

Master-slave synchronization is complete.

28163:X 08 Oct 16:04:06.555 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

+slave-reconf-done <instance details> -- The slave is now synchronized with the new master.

The failover is complete.

28163:X 08 Oct 16:04:06.606 # +failover-end master mymaster 127.0.0.1 6379

After the failover succeeds, the master switch is announced.

28163:X 08 Oct 16:04:06.606 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381

The slaves are attached to the new master. Note that the old master is registered as a slave of the new master.

28163:X 08 Oct 16:04:06.606 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
28163:X 08 Oct 16:04:06.606 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

+slave <instance details> -- A new slave was detected and attached.

30 s later, the old master is judged subjectively down.

28163:X 08 Oct 16:04:36.687 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

Putting it all together, Sentinel performs a failover as follows.

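1. Every second, each Sentinel node sends a ping command to the master, the slaves, and the other Sentinel nodes as a heartbeat, to check whether they are reachable. If a node gives no valid reply for longer than down-after-milliseconds, the Sentinel judges that node to be subjectively down.

2. If the node judged subjectively down is the master, the Sentinel asks the other Sentinels for their view of the master with the sentinel is-master-down-by-addr command. Once the number of Sentinels that agree reaches <quorum>, the Sentinel judges the master to be objectively down. If a slave or a Sentinel node is judged subjectively down, no failover follows.

3. The Sentinels elect a leader, which carries out the rest of the failover. The election algorithm is based on Raft.

4. The leader Sentinel starts the failover.

5. It selects a suitable slave as the new master.

6. The leader Sentinel runs slaveof no one on the slave selected in the previous step to make it a master.

7. It sends commands to the remaining slaves to make them slaves of the new master. How they are switched over depends on the parallel-syncs parameter.

8. It records the old master as a slave and keeps it under Sentinel's management, so that it replicates from the new master once it recovers.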
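Steps 1 and 2 above come down to two small predicates. The sketch below is only an illustration of the rules as described in this article, not Sentinel's actual implementation; the function names and the millisecond bookkeeping are assumptions.

```python
def is_subjectively_down(last_valid_reply_ms: int, now_ms: int, down_after_ms: int) -> bool:
    """sdown: no valid reply for longer than down-after-milliseconds."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_objectively_down(sdown_votes: int, quorum: int) -> bool:
    """odown: enough Sentinels (including this one) agree the master is down."""
    return sdown_votes >= quorum


# With down-after-milliseconds 30000, a master silent for 30.1 s is sdown.
print(is_subjectively_down(last_valid_reply_ms=0, now_ms=30100, down_after_ms=30000))

# The logs above show "#quorum 3/2" on Sentinel-2 and "#quorum 2/2" on
# Sentinel-3: both counts reach the configured quorum of 2.
print(is_objectively_down(sdown_votes=3, quorum=2))
print(is_objectively_down(sdown_votes=2, quorum=2))
print(is_objectively_down(sdown_votes=1, quorum=2))
```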

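The Sentinel leader election process.

The Sentinel leader election is based on the Raft protocol.

1. Every online Sentinel node is eligible to become the leader. When it has confirmed that the master is down and starts a failover attempt (+try-failover in the log above), it sends the sentinel is-master-down-by-addr command to the other Sentinel nodes, asking them to make it the leader.

2. A Sentinel that receives the command grants the request if it has not already granted another Sentinel's sentinel is-master-down-by-addr request; otherwise it refuses.

3. If the requesting Sentinel finds that its vote count is greater than or equal to max(quorum, num(sentinels)/2+1), it becomes the leader.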
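The voting rule above can be simulated in a few lines. This is a toy model under stated assumptions: SentinelVoter, run_election, and the in-memory vote table are invented for the illustration, a candidate is assumed to vote for itself (as Sentinel-2 does in the log), and real Sentinels exchange these votes over the network per epoch.

```python
from typing import Dict, List


def votes_needed(quorum: int, num_sentinels: int) -> int:
    """Votes a candidate must collect: max(quorum, num(sentinels)/2 + 1)."""
    return max(quorum, num_sentinels // 2 + 1)


class SentinelVoter:
    """Grants at most one vote per epoch, first come, first served."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.votes: Dict[int, str] = {}  # epoch -> run ID this voter backed

    def request_vote(self, epoch: int, candidate: str) -> str:
        # The first candidate to ask in an epoch gets the vote; later ones do not.
        return self.votes.setdefault(epoch, candidate)


def run_election(candidate: str, voters: List[SentinelVoter], epoch: int, quorum: int) -> bool:
    """Ask every voter (the candidate included) and check the threshold."""
    granted = sum(1 for v in voters if v.request_vote(epoch, candidate) == candidate)
    return granted >= votes_needed(quorum, len(voters))


sentinels = [
    SentinelVoter("d4424b8684977767be4f5abd1e364153fbb0adbd"),  # Sentinel-1
    SentinelVoter("18311edfbfb7bf89fe4b67d08ef432053db62fff"),  # Sentinel-2
    SentinelVoter("3e9eb1aa9378d89cfe04fe21bf4a05a901747fa8"),  # Sentinel-3
]
# Sentinel-2 asks first in epoch 1, so it collects all three votes.
print(run_election(sentinels[1].run_id, sentinels, epoch=1, quorum=2))
# A later request from Sentinel-1 in the same epoch gets no votes.
print(run_election(sentinels[0].run_id, sentinels, epoch=1, quorum=2))
```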

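How the new master is selected.

1. Remove all slaves that are already down or disconnected.

2. Remove slaves that have not replied to the leader Sentinel's INFO command within the last 5 seconds.

3. Remove all slaves whose link to the failed master has been down for more than down-after-milliseconds*10 milliseconds.

4. Choose the slave with the highest priority.

5. If priorities are equal, choose the slave with the largest replication offset.

6. If offsets are also equal, choose the slave with the smallest runid.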
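The filter-then-rank procedure above can be written as one filter and one sort key. The sketch below is an illustration of the listed rules only: the Replica dataclass and its field names are hypothetical. It also assumes two details that come from the Redis configuration documentation rather than from this article: a lower priority number means a higher priority, and a priority of 0 means the slave is never promoted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    addr: str
    run_id: str
    priority: int             # lower number = preferred; 0 = never promote (assumption)
    repl_offset: int          # replication offset reported by the slave
    is_down: bool             # already down or disconnected
    ms_since_info_reply: int  # time since the last INFO reply
    ms_link_down: int         # time disconnected from the failed master


def select_new_master(replicas: List[Replica], down_after_ms: int) -> Optional[Replica]:
    """Apply rules 1-3 as filters, then rules 4-6 as an ordered sort key."""
    max_link_down_ms = down_after_ms * 10
    candidates = [
        r for r in replicas
        if not r.is_down
        and r.priority > 0
        and r.ms_since_info_reply <= 5000
        and r.ms_link_down <= max_link_down_ms
    ]
    if not candidates:
        return None
    # Best priority first, then largest offset, then smallest run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# Hypothetical values: equal priority and offset, so the smaller run ID wins.
slaves = [
    Replica("127.0.0.1:6380", "bbbb", 100, 24302, False, 1000, 500),
    Replica("127.0.0.1:6381", "aaaa", 100, 24302, False, 1000, 500),
]
chosen = select_new_master(slaves, down_after_ms=30000)
print(chosen.addr if chosen else None)
```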

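The three periodic monitoring tasks

1. Every 10 seconds, each Sentinel node sends the info command to the master and the slaves to obtain the latest topology. This serves the following purposes:

1> Running info on the master returns the list of slaves, which is why Sentinel does not need to be explicitly configured to monitor slaves.
2> A newly added slave is noticed promptly.
3> After a node becomes unreachable or a failover completes, the topology can be refreshed through the info command.

2. Every 2 seconds, each Sentinel node publishes to the __sentinel__:hello channel of the Redis data nodes its own view of the master together with information about itself. Every Sentinel node also subscribes to this channel to learn about the other Sentinels and their views of the master. This serves the following purposes:

1> Discovering new Sentinel nodes: by subscribing to __sentinel__:hello on the master, a Sentinel learns about the others. If a Sentinel is new, its information is saved and a connection to it is created.
2> Exchanging the master's state between Sentinels, which later serves as the basis for the odown judgment and the leader election.

3. Every second, each Sentinel node sends a ping command to the master, the slaves, and the other Sentinel nodes as a heartbeat, to check whether they are reachable. This task is the main basis for deciding that a node has failed.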
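To make the second task more concrete, here is a sketch of parsing a message from the __sentinel__:hello channel. It assumes the payload is the eight comma-separated fields that Sentinel writes in its hello message (Sentinel address, run ID, current epoch, then master name, address, and config epoch); that layout is not stated in this article, so treat it as an assumption to verify against your Redis version. The Hello type, the parse_hello function, and the sample payload are constructed for the illustration from values in the logs above, not captured from a live server.

```python
from typing import NamedTuple


class Hello(NamedTuple):
    sentinel_ip: str
    sentinel_port: int
    sentinel_runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> Hello:
    """Split a hello payload into its fields (assumed 8-field layout)."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError("expected 8 comma-separated fields, got %d" % len(parts))
    s_ip, s_port, s_runid, epoch, m_name, m_ip, m_port, m_epoch = parts
    return Hello(s_ip, int(s_port), s_runid, int(epoch),
                 m_name, m_ip, int(m_port), int(m_epoch))


# Constructed example: what Sentinel-2 might announce after the failover.
sample = "127.0.0.1,26380,18311edfbfb7bf89fe4b67d08ef432053db62fff,1,mymaster,127.0.0.1,6381,1"
hello = parse_hello(sample)
print(hello.sentinel_runid, "->", hello.master_ip, hello.master_port)
```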

Sentinel configuration parameters

# bind 127.0.0.1 192.168.1.1
# protected-mode no
port 26379
# sentinel announce-ip <ip>
# sentinel announce-port <port>
dir /tmp
sentinel monitor mymaster 127.0.0.1 6379 2
# sentinel auth-pass <master-name> <password>
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
# sentinel notification-script mymaster /var/redis/notify.sh
# sentinel client-reconfig-script mymaster /var/redis/reconfig.sh
sentinel deny-scripts-reconfig yes

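Where:

dir: sets Sentinel's working directory.

sentinel monitor mymaster 127.0.0.1 6379 2: the 2 is the quorum, meaning that at least two Sentinel nodes must consider the master subjectively down before it can be judged objectively down. The usual recommendation is half the number of Sentinel nodes plus 1. The quorum also affects the Sentinel leader election: to elect a leader, at least max(quorum, num(sentinels) / 2 + 1) Sentinel nodes must take part in the vote.

sentinel down-after-milliseconds mymaster 30000: each Sentinel node periodically sends ping commands to check whether the Redis nodes and the other Sentinel nodes are reachable.

If no valid reply arrives from the master within the specified time, the master is judged subjectively down. Note that this parameter is used not only for the master but also for the slaves under that master and for the other Sentinels. The default is 30 s.

sentinel parallel-syncs mymaster 1: the number of slaves allowed to switch to the new master at the same time during a failover. If this value is set high, replication itself does not block the master, but several slaves syncing from the new master at once increase its network and disk I/O load.

sentinel failover-timeout mymaster 180000: defines the failover timeout. The default is 180000, in milliseconds, that is, 3 min. Note that this is not the total time allowed for a failover; it applies to several different failover situations.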

# Specifies the failover timeout in milliseconds. It is used in many ways:
#
# - The time needed to re-start a failover after a previous failover was
#   already tried against the same master by a given Sentinel, is two
#   times the failover timeout.
#
# - The time needed for a slave replicating to a wrong master according
#   to a Sentinel current configuration, to be forced to replicate
#   with the right master, is exactly the failover timeout (counting since
#   the moment a Sentinel detected the misconfiguration).
#
# - The time needed to cancel a failover that is already in progress but
#   did not produced any configuration change (SLAVEOF NO ONE yet not
#   acknowledged by the promoted slave).
#
# - The maximum time a failover in progress waits for all the slaves to be
#   reconfigured as slaves of the new master. However even after this time
#   the slaves will be reconfigured by the Sentinels anyway, but not with
#   the exact parallel-syncs progression as specified.

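The first case: if Redis Sentinel fails to complete a failover for a master, the next failover attempt for that master can start no earlier than twice the failover-timeout later. This is visible in the Sentinel log (28234:X 08 Oct 16:04:04.385 # Next failover delay: I will not start a failover before Mon Oct 8 16:10:04 2018).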
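The timestamp in that log line can be reproduced with simple arithmetic. The sketch below assumes the delay is counted from the second in which the failover attempt was registered (16:04:04); the function name next_failover_allowed is made up for this example.

```python
from datetime import datetime, timedelta


def next_failover_allowed(attempt_start: datetime, failover_timeout_ms: int) -> datetime:
    """Earliest time a new failover may start: attempt start + 2 * failover-timeout."""
    return attempt_start + timedelta(milliseconds=2 * failover_timeout_ms)


# sentinel failover-timeout mymaster 180000
attempt_start = datetime(2018, 10, 8, 16, 4, 4)
earliest = next_failover_allowed(attempt_start, 180000)
print(earliest)
```

Two times 180000 ms is 6 minutes, which moves 16:04:04 to the 16:10:04 shown in the log.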

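sentinel notification-script: defines a notification script. Sentinel calls it whenever a WARNING-level event occurs and passes two arguments: the event type and the event description.

sentinel client-reconfig-script: the script defined here is called when the master changes. It receives the following arguments: <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>

Scripts must follow certain rules.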

# SCRIPTS EXECUTION
#
# sentinel notification-script and sentinel reconfig-script are used in order
# to configure scripts that are called to notify the system administrator
# or to reconfigure clients after a failover. The scripts are executed
# with the following rules for error handling:
#
# If script exits with "1" the execution is retried later (up to a maximum
# number of times currently set to 10).
#
# If script exits with "2" (or an higher value) the script execution is
# not retried.
#
# If script terminates because it receives a signal the behavior is the same
# as exit code 1.
#
# A script has a maximum running time of 60 seconds. After this limit is
# reached the script is terminated with a SIGKILL and the execution retried.
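The rules above map naturally onto a script's exit codes. The following is a minimal sketch of the core of a client-reconfig-script, assuming the seven arguments listed earlier. The function names and publish_new_master are hypothetical placeholders, and treating OSError as the retryable failure is a choice made only for this example.

```python
from typing import Callable, List, Tuple

EXIT_OK = 0        # success
EXIT_RETRY = 1     # Sentinel retries the script later
EXIT_NO_RETRY = 2  # Sentinel does not retry


def publish_new_master(master_name: str, addr: Tuple[str, int]) -> None:
    """Hypothetical placeholder: push the new master address to the clients."""
    print("new master for %s: %s:%d" % (master_name, addr[0], addr[1]))


def handle_reconfig(
    argv: List[str],
    apply: Callable[[str, Tuple[str, int]], None] = publish_new_master,
) -> int:
    """Map the outcome of a reconfiguration onto the exit codes above."""
    # <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
    if len(argv) != 7:
        return EXIT_NO_RETRY  # malformed call: retrying cannot help
    master_name, _role, _state, _from_ip, _from_port, to_ip, to_port = argv
    try:
        new_master = (to_ip, int(to_port))
    except ValueError:
        return EXIT_NO_RETRY
    try:
        apply(master_name, new_master)
    except OSError:
        return EXIT_RETRY  # transient failure: ask Sentinel to retry
    return EXIT_OK


# A real script would finish with: sys.exit(handle_reconfig(sys.argv[1:]))
code = handle_reconfig(
    ["mymaster", "leader", "failover", "127.0.0.1", "6379", "127.0.0.1", "6381"]
)
print("exit code:", code)
```

Keeping the work inside a function that returns the exit code makes the retry behaviour easy to test without involving Sentinel. The script also has to finish within the 60-second limit stated above.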

sentinel deny-scripts-reconfig: disallows changing notification-script and client-reconfig-script at runtime with SENTINEL SET.

Common Sentinel operations

PING This command simply returns PON