# Keepalived High Availability Clusters

## Introduction to High Availability Clusters

**What is a high availability cluster?**
A high availability cluster (HA Cluster for short) is a server clustering technique whose goal is to minimize service downtime. By keeping the user's business applications continuously available to the outside world, it reduces the impact of failures caused by software, hardware, or human error on the business to a minimum.
**Automatic switchover / failover (FailOver)**

In the automatic switchover phase, once one host has confirmed that its peer has failed, the healthy host not only carries on with its own tasks but also, according to the configured fault-tolerance/failover mode, takes over the pre-defined backup jobs and runs the follow-up programs and services.

Put simply: when A can no longer serve its clients, the system switches over automatically so that B steps in and keeps serving them, and the clients never notice that the host serving them has changed.

After a node has been judged faulty in this way, the high availability cluster resources (such as the VIP, httpd, and so on) are moved from the node that no longer holds a quorum of votes to the failover domain (Failover Domain: the set of nodes allowed to receive the failed-over resources).
**Automatic detection / split-brain**

In the automatic detection phase, software on each host probes the other's state over redundant detection links, using monitoring programs and logic checks to decide whether the peer is still running.

The usual method: the cluster nodes exchange heartbeat messages and use them to decide whether a node has failed.

Split-brain: in an HA system, when the "heartbeat link" between the two nodes breaks, what used to be one coordinated HA system splits into two independent nodes. Having lost contact with each other, each assumes the other has failed, and the HA software on the two nodes behaves like a "split-brain" patient: both sides grab the shared resources and both try to start the application services, with serious consequences. Either the shared resources get carved up and neither side can bring the service up, or both sides bring the service up and read and write the shared storage at the same time, corrupting data (a typical example is corruption of the database's online redo logs).

Split-brain countermeasures: 1. add redundant heartbeat links; 2. use disk locks; 3. set up an arbitration (quorum) mechanism (see the sketch below); 4. monitor and alert on split-brain.
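As a rough illustration of point 3 (an arbitration mechanism), the sketch below is a hypothetical bash check; the peer address 192.168.70.132 and the gateway 192.168.70.2 acting as a third-party witness are assumptions, not taken from this text. If this node can reach neither the peer nor the witness, it assumes it is the isolated side of a split brain and releases its resources rather than fighting over them.

#!/bin/bash
# Hypothetical arbitration sketch (addresses are assumptions):
# if this node loses both the peer and the witness (gateway), treat itself
# as the isolated half of a split brain and stop keepalived so it gives up the VIP.
PEER=192.168.70.132
WITNESS=192.168.70.2
if ! ping -c 2 -W 1 "$PEER" &>/dev/null && ! ping -c 2 -W 1 "$WITNESS" &>/dev/null; then
    systemctl stop keepalived
fi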
Other HA solutions: heartbeat, pacemaker, piranha (with a web UI).
## Keepalived

**What is keepalived?**

keepalived is a service used in cluster management to keep the cluster highly available; its job is to prevent single points of failure.

**How keepalived works**

keepalived is built on the VRRP protocol. VRRP stands for Virtual Router Redundancy Protocol.

N servers providing the same function are grouped into a server group with one master and several backups. The master holds the VIP that provides the service to the outside world (the other machines on the master's LAN use this VIP as their default route). The master sends VRRP advertisements by multicast; when the backups stop receiving VRRP packets they consider the master down, and one of the backups is then elected as the new master according to its VRRP priority.
**keepalived's three main modules**

They are core, check, and vrrp. The core module is the heart of keepalived: it starts and maintains the main process and loads and parses the global configuration file. The check module performs health checks and covers the common check methods. The vrrp module implements the VRRP protocol.
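A quick way to see this in practice is to capture the master's VRRP advertisements on the wire: VRRP is IP protocol 112, multicast to 224.0.0.18. For example (assuming the interface is ens33, as in the cases below):

tcpdump -i ens33 -nn 'ip proto 112'    # one advertisement per advert_int from the current master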
## Hands-on case 1: keepalived + nginx

Preparation: two hosts, server1 and server2, with the firewall stopped and SELinux disabled.
server1:
yum install -y keepalived
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf-backup //back up the original file
vim /etc/keepalived/keepalived.conf //delete all existing content (ggdG in vim), then configure as follows
! Configuration File for keepalived

global_defs {
    router_id 1
}

#vrrp_script chk_nginx {
#    script "/etc/keepalived/ck_ng.sh"
#    interval 2
#    weight -5
#    fall 3
#}

vrrp_instance VI_1 {
    state MASTER
    interface ens33
    mcast_src_ip 192.168.70.130
    virtual_router_id 55
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.70.140
    }
    #track_script {
    #    chk_nginx
    #}
}
yum install -y nginx
systemctl enable nginx
systemctl start nginx
vim /usr/share/nginx/html/index.html //edit the page so you can tell it apart from server2's nginx
curl -i 192.168.70.130
systemctl start keepalived
systemctl enable keepalived
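With keepalived running, server1 (MASTER, priority 100) should now hold the VIP on ens33; a quick way to confirm this is:

ip addr show dev ens33 | grep 192.168.70.140    # the VIP should be listed as an additional address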
server2:
yum install -y keepalived
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf-backup //back up the original file
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
    router_id 2
}

#vrrp_script chk_nginx {
#    script "/etc/keepalived/ck_ng.sh"
#    interval 2
#    weight -5
#    fall 3
#}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    mcast_src_ip 192.168.70.132
    virtual_router_id 55
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.70.140
    }
    #track_script {
    #    chk_nginx
    #}
}
yum install -y nginx
systemctl enable nginx
systemctl start nginx
curl -i localhost
systemctl start keepalived
systemctl enable keepalived
Test:
[root@localhost local]# curl -i 192.168.70.140 //this should return server1's nginx page
You can then cut server1 off the network (in the VMware settings, untick the network connection) and access the VIP again; this time it should return server2's nginx page.
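An alternative to pulling the network in VMware (not one of the original steps, but equivalent from VRRP's point of view) is to stop keepalived on server1 and watch the VIP move:

systemctl stop keepalived      # on server1: it stops advertising, so server2 takes over the VIP
curl -i 192.168.70.140         # should now return server2's page
systemctl start keepalived     # on server1: with the higher priority it should take the VIP back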
**The problem: keepalived does not know nginx's state**

Restore the previous setup: start keepalived and nginx on both hosts and make sure the page is reachable. Now stop nginx on the master (systemctl stop nginx) and keep accessing the VIP. Will the page switch over to the backup? It will not: keepalived does not care about nginx's state, because keepalived monitors the state of the interface/IP; it cannot monitor the nginx service itself. The solution:

1. A monitoring script

Add the nginx monitoring script on both server1 and server2.
vim /etc/keepalived/ck_ng.sh
#!/bin/bash
# Check whether an nginx process exists. If not, try to restart nginx;
# if nginx is still not running 5 seconds later, stop keepalived so the
# VIP fails over to the backup.
counter=$(ps -C nginx --no-heading | wc -l)
if [ "${counter}" -eq 0 ]; then
    systemctl restart nginx
    sleep 5
    counter2=$(ps -C nginx --no-heading | wc -l)
    if [ "${counter2}" -eq 0 ]; then
        systemctl stop keepalived
    fi
fi
chmod +x /etc/keepalived/ck_ng.sh
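Before wiring the script into keepalived, you may want to test it by hand (a quick sanity check, not in the original steps): stop nginx and run the script; it should bring nginx back up within about 5 seconds.

systemctl stop nginx
/etc/keepalived/ck_ng.sh
systemctl is-active nginx      # expect "active" if the restart worked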
Edit keepalived.conf and remove the comment markers added earlier. Do this on both server1 and server2; everything else stays the same.
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
    router_id 1
}

vrrp_script chk_nginx {
    script "/etc/keepalived/ck_ng.sh"
    interval 2
    weight -5
    fall 3
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    mcast_src_ip 192.168.70.130
    virtual_router_id 55
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.70.140
    }
    track_script {
        chk_nginx
    }
}
systemctl restart keepalived
Test:
systemctl stop nginx
systemctl status nginx
If the test does not behave as expected:
add debug inside the vrrp_script chk_nginx {} block, then watch the log:
tail -f /var/log/messages //watch the log
If you see a message like: Aug 27 20:59:44 localhost Keepalived_vrrp[51703]: /etc/keepalived/ck_ng.sh exited due to signal 15
it means the advert_int interval is set too short: try raising advert_int to 5 seconds and, since interval must be larger than advert_int, try setting interval to 6 seconds. Both servers must be changed in the same way.
## Hands-on case 2: keepalived + LVS cluster

1. Install and configure keepalived and ipvsadm on the master
yum install keepalived ipvsadm -y
2. Edit the configuration file on the master
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
    router_id Director1
}

#Keepalived
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.70.140/24 dev ens33
    }
}

#LVS
virtual_server 192.168.70.140 80 {
    delay_loop 3            # run the real-server health checks every 3 seconds
    lb_algo rr
    lb_kind DR
    protocol TCP
    real_server 192.168.70.133 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 5
        }
    }
    real_server 192.168.70.134 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
3. Install and configure keepalived and ipvsadm on the backup
yum install keepalived ipvsadm -y
4. Edit the configuration file on the backup
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
    router_id Director2
}

#Keepalived
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.70.140/24 dev ens33
    }
}

#LVS
virtual_server 192.168.70.140 80 {
    delay_loop 3            # run the real-server health checks every 3 seconds
    lb_algo rr
    lb_kind DR
    protocol TCP
    real_server 192.168.70.133 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 5
        }
    }
    real_server 192.168.70.134 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
5. Start keepalived on both directors
systemctl start keepalived
systemctl enable keepalived
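At this point the active director should hold the VIP, and keepalived should have created the IPVS rules from the virtual_server block; you can check them with ipvsadm:

ipvsadm -Ln
# expect a TCP 192.168.70.140:80 rr entry; the real servers 192.168.70.133:80 and
# 192.168.70.134:80 will appear once they are up and pass the TCP_CHECK (step 6 onwards)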
6. Install and start httpd on both real servers
yum install -y httpd
systemctl start httpd
systemctl enable httpd
7. Create the lo:0 loopback alias file on the real server
vim /etc/sysconfig/network-scripts/ifcfg-lo:0 //configure as follows
DEVICE=lo:0
IPADDR=192.168.70.140
NETMASK=255.255.255.255
ONBOOT=yes
8. Configure the route so that it is re-added on every boot: any request to 192.168.70.140 that reaches this host is handled by the loopback interface.
vim /etc/rc.local //add the following
/sbin/route add -host 192.168.70.140 dev lo:0
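Note that on CentOS 7, /etc/rc.d/rc.local is not executable by default, so the route would silently not be added at boot; and to apply the lo:0 configuration without rebooting you can bring it up by hand (both of these are assumptions about a CentOS 7-style setup, not from the original steps):

chmod +x /etc/rc.d/rc.local                       # make rc.local actually run at boot
ifup lo:0                                         # bring up the loopback alias now
/sbin/route add -host 192.168.70.140 dev lo:0     # add the host route now as well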
9. Configure the sysctl.conf file
vim /etc/sysctl.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
10. Copy the lo:0 file and sysctl.conf to the other real server
scp /etc/sysconfig/network-scripts/ifcfg-lo:0 192.168.70.134:/etc/sysconfig/network-scripts/ifcfg-lo:0
scp /etc/sysctl.conf 192.168.70.134:/etc/sysctl.conf
11. Configure the rc.local file on the other real server in the same way
vim /etc/rc.local //add
/sbin/route add -host 192.168.70.140 dev lo:0
12. Configure sysctl.conf on the other real server in the same way
vim /etc/sysctl.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
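On each real server, the sysctl settings can be applied immediately (without rebooting):

sysctl -p      # reload /etc/sysctl.conf and print the values that were applied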
13. Test
Visit 192.168.70.140 in a browser, then disconnect the master's network and try again. If the page is still reachable, the experiment succeeded.
## Common LVS + Keepalived interview questions

1. What is a cluster? What types of clusters are there? Name representative products of each type.
2. Which load-balancing cluster services are there, and how do they differ?
3. How do LVS-DR and LVS-NAT work?
4. How does keepalived work?
5. Which high availability cluster products are there, and how do they differ?
6. Which load-balancing scheduling strategies are there? Can you give examples?