Keepalived High Availability

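3.7.1 Introduction to Keepalived
| Item | Description |
| --- | --- |
| Core function | Health checking plus VRRP failover |
| Use cases | High availability for LVS directors and for Nginx/HAProxy |
| How it works | Uses the VRRP protocol to float a VIP between nodes, with automatic master/backup switchover |
| Relationship to LVS | Manages the IPVS rules itself, so no manual ipvsadm configuration is needed |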

3.7.2 How the VRRP protocol works
        ┌──────────────┐                     ┌──────────────┐
        │    Master    │ ←────── VIP ──────→ │    Backup    │
        │ (active LVS) │    192.168.1.200    │(standby LVS) │
        └──────┬───────┘                     └──────┬───────┘
               │                                    │
               │     heartbeat (once per second)    │
               └────────────────────────────────────┘

Master fails → Backup stops receiving heartbeats → Backup takes over the VIP → Backup becomes the new Master
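| Concept | Description |
| --- | --- |
| VRID | Virtual router ID (1-255); must be identical on the master and backup of one group |
| Priority | Priority (1-254); the node with the higher value becomes Master |
| Preempt | Preemption mode: whether a recovered Master takes the VIP back |
| Advert_int | Heartbeat (advertisement) interval; default 1 second |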
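
To make the failover timing concrete: under VRRPv2 (RFC 3768), a Backup declares the Master dead after the Master_Down_Interval, which is 3 × Advert_int plus a skew time of (256 − Priority) / 256 seconds. The sketch below computes that value. The function name `master_down_interval` is illustrative and is not part of Keepalived.

```bash
#!/usr/bin/env bash
# Illustrative helper (not part of Keepalived): VRRPv2 failover timing per RFC 3768.
#   skew_time            = (256 - priority) / 256
#   master_down_interval = 3 * advert_int + skew_time
master_down_interval() {
    local priority=$1 advert_int=${2:-1}
    awk -v p="$priority" -v a="$advert_int" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

# A Backup with priority 90 and a 1-second advertisement interval
master_down_interval 90 1    # prints 3.648
```

With a priority of 90 and a 1-second interval, the Backup therefore waits roughly 3.65 seconds before taking over. A higher Backup priority shortens the skew slightly.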

3.7.3 Installing and configuring Keepalived
1. Install Keepalived
# Install on both the master and the backup LVS node
yum install -y keepalived

# Check the version
keepalived -v
2. Master node configuration (lvs-server1)
cat > /etc/keepalived/keepalived.conf << 'EOF'
global_defs {
    router_id LVS_MASTER    # Router ID; a unique identifier for this node
    script_user root
    enable_script_security
}

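# VRRP instance definition
vrrp_instance VI_1 {
    state MASTER            # Initial state: MASTER
    interface eth0          # Network interface to bind to
    virtual_router_id 51    # VRID; must match on master and backup
    priority 100            # Priority; the MASTER must be higher than the BACKUP
    advert_int 1            # Heartbeat interval: 1 second
    authentication {
        auth_type PASS      # Authentication type
        auth_pass 1111      # Authentication password (same on master and backup)
    }
    virtual_ipaddress {
        192.168.1.200/24    # VIP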
    }
}

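# LVS virtual server configuration (Keepalived manages the IPVS rules)
virtual_server 192.168.1.200 80 {
    delay_loop 6            # Health check interval: 6 seconds
    lb_algo wrr             # Scheduling algorithm: weighted round robin
    lb_kind DR              # Forwarding mode: DR (or NAT/TUN)
    protocol TCP            # Protocol

    # RS1
    real_server 192.168.1.11 80 {
        weight 1            # Weight
        TCP_CHECK {         # TCP health check
            connect_timeout 3
            nb_get_retry 3  # Newer Keepalived releases prefer the keyword: retry
            delay_before_retry 3
        }
    }

    # RS2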
    real_server 192.168.1.12 80 {
        weight 2
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
EOF
3. Backup node configuration (lvs-server2)
cat > /etc/keepalived/keepalived.conf << 'EOF'
global_defs {
    router_id LVS_BACKUP    # Router ID differs from the master's
}

vrrp_instance VI_1 {
    state BACKUP            # Initial state: BACKUP
    interface eth0
    virtual_router_id 51    # Same VRID as the MASTER
    priority 90             # Lower priority than the MASTER
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111      # Same password as the MASTER
    }
    virtual_ipaddress {
        192.168.1.200/24    # Same VIP
    }
}

# The LVS configuration is identical to the MASTER's
virtual_server 192.168.1.200 80 {
    delay_loop 6
    lb_algo wrr
    lb_kind DR
    protocol TCP
    
    real_server 192.168.1.11 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    
    real_server 192.168.1.12 80 {
        weight 2
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
EOF
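
Since VRID, password, and advertisement interval must match on both nodes while the priorities must differ, a quick comparison of the two files can catch mistakes before the service is started. The following is a minimal sketch, assuming one `vrrp_instance` per file; the function names and the example paths are illustrative.

```bash
#!/usr/bin/env bash
# Illustrative helper: print the first value of a keyword in a keepalived.conf file.
conf_value() {
    local key=$1 file=$2
    awk -v k="$key" '$1 == k { print $2; exit }' "$file"
}

# Compare the settings that must match (or differ) between the two nodes.
# Arguments: path to the master's config, path to a copy of the backup's config.
check_vrrp_pair() {
    local master=$1 backup=$2 rc=0 key mp bp
    for key in virtual_router_id auth_pass advert_int; do
        if [ "$(conf_value "$key" "$master")" != "$(conf_value "$key" "$backup")" ]; then
            echo "MISMATCH: $key"
            rc=1
        fi
    done
    mp=$(conf_value priority "$master")
    bp=$(conf_value priority "$backup")
    # Fall back to 100 when no priority line is present (assumed default)
    if [ "${mp:-100}" -le "${bp:-100}" ]; then
        echo "WARNING: master priority is not higher than backup priority"
        rc=1
    fi
    return "$rc"
}

# Example (paths are assumptions; copy the backup's file to the master first):
#   check_vrrp_pair /etc/keepalived/keepalived.conf /tmp/backup-keepalived.conf
```

The function prints nothing and returns 0 when the pair is consistent, so it can be used directly in an `if` or before `systemctl start keepalived`.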
4. Start the service
# Run on both master and backup
systemctl start keepalived
systemctl enable keepalived

# Check the status
systemctl status keepalived

# Check the VIP (visible only on the MASTER)
ip addr show eth0
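
Reading `ip addr` output by eye works for a one-off check. For scripts, a small filter is handier. The sketch below is a hypothetical helper named `has_vip` that succeeds when the given VIP appears in `ip addr` style output on standard input.

```bash
#!/usr/bin/env bash
# Illustrative helper: succeed if the VIP appears in `ip addr` style output on stdin.
has_vip() {
    local vip=$1
    # Match "inet <vip>/" literally so 192.168.1.200 does not match 192.168.1.20
    grep -qF "inet ${vip}/"
}

# Typical use on a director (eth0 and the VIP come from the configuration above):
#   ip -4 addr show dev eth0 | has_vip 192.168.1.200 && echo "this node holds the VIP"
```

Running the same check on both nodes should succeed on exactly one of them. Success on both at once indicates split brain.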

3.7.4 Health check mechanisms
TCP_CHECK (Layer 4 check)
TCP_CHECK {
    connect_timeout 3      # Connection timeout: 3 seconds
    nb_get_retry 3         # Retry 3 times
    delay_before_retry 3   # Wait 3 seconds between retries
    connect_port 80        # Port to check
}
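
When a real server is unexpectedly marked down, it helps to reproduce the Layer 4 check by hand from the director. The sketch below approximates it with bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command. It is a stand-in for debugging, not Keepalived's own implementation, and the function name and argument order are illustrative.

```bash
#!/usr/bin/env bash
# Illustrative stand-in for TCP_CHECK (not Keepalived's own code):
# try to open a TCP connection, retrying a few times before giving up.
# Arguments: host port [timeout_s] [retries] [delay_s]
tcp_check() {
    local host=$1 port=$2 timeout_s=${3:-3} retries=${4:-3} delay_s=${5:-3}
    local attempt
    # One initial attempt plus up to $retries further attempts
    for (( attempt = 0; attempt <= retries; attempt++ )); do
        # /dev/tcp opens the connection; `timeout` bounds how long we wait
        if timeout "$timeout_s" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            return 0
        fi
        [ "$attempt" -lt "$retries" ] && sleep "$delay_s"
    done
    return 1
}

# Example: tcp_check 192.168.1.11 80 && echo "RS1 port 80 is reachable"
```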
HTTP_GET (Layer 7 check)
HTTP_GET {
    url {
        path /health.html   # URL path to check
        status_code 200     # Expected status code
    }
    connect_timeout 3
    nb_get_retry 3
    delay_before_retry 3
}
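
The Layer 7 check can likewise be reproduced from the director with `curl`, which is useful for confirming that the health page really returns the expected status code. This is a minimal sketch; the function name `http_check` is illustrative, and the URL in the example assumes the RS address and path used above.

```bash
#!/usr/bin/env bash
# Illustrative stand-in for HTTP_GET: fetch a URL and compare the status code.
# Arguments: url [expected_status] [timeout_s]
http_check() {
    local url=$1 expected=${2:-200} timeout_s=${3:-3}
    local code
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time "$timeout_s" "$url")
    [ "$code" = "$expected" ]
}

# Example: http_check http://192.168.1.11/health.html 200 && echo "RS1 is healthy"
```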
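Custom script check (MISC_CHECK)
# Check script (for example, checking MySQL).
# Keepalived runs it on the director, so the RS address is passed as $1;
# MySQL credentials are assumed to come from a client option file such as ~/.my.cnf.
cat > /etc/keepalived/check_mysql.sh << 'EOF'
#!/bin/bash
if mysql -h "$1" -e "show status;" > /dev/null 2>&1; then
    exit 0    # healthy
else
    exit 1    # unhealthy
fi
EOF
chmod +x /etc/keepalived/check_mysql.sh

# Use it in the keepalived configuration
real_server 192.168.1.11 3306 {
    weight 1
    MISC_CHECK {
        misc_path "/etc/keepalived/check_mysql.sh 192.168.1.11"
        misc_timeout 10
    }
}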

3.7.5 High-availability test
# 1. Check the state on the MASTER
grep Keepalived /var/log/messages
# Output: VRRP_Instance(VI_1) Transition to MASTER STATE

# 2. Check that the VIP is bound (only on the MASTER)
ip addr | grep 192.168.1.200

# 3. Simulate a MASTER failure
systemctl stop keepalived    # or power the node off

# 4. Watch the BACKUP log (it takes over the VIP automatically)
# Output: VRRP_Instance(VI_1) Transition to MASTER STATE

# 5. Restore the MASTER (whether it takes the VIP back depends on the preempt setting)
systemctl start keepalived
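
To follow a failover test without scanning the whole log, the most recent state transition can be extracted with a short filter. The sketch below matches the "Transition to MASTER STATE" wording shown above and also accepts an "Entering ... STATE" wording. The latter is an assumption, since the exact message text varies between Keepalived versions.

```bash
#!/usr/bin/env bash
# Illustrative helper: read Keepalived log lines on stdin and print the most
# recent VRRP state (MASTER, BACKUP or FAULT) reported for one instance.
last_vrrp_state() {
    local instance=$1
    grep -F "($instance)" \
        | grep -oE '(Transition to|Entering) (MASTER|BACKUP|FAULT) STATE' \
        | tail -n 1 \
        | awk '{ print $(NF - 1) }'
}

# Example (log path as used above; on journald-only systems read from
# `journalctl -u keepalived` instead):
#   grep Keepalived /var/log/messages | last_vrrp_state VI_1
```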

3.7.6 Keepalived + LVS architecture diagram
                  Client requests
                         ↓
                ┌─────────────────┐
                │  Floating VIP   │
                │  192.168.1.200  │
                └────────┬────────┘
                         │
            ┌────────────┴────────────┐
            │                         │
    ┌───────┴───────┐         ┌───────┴───────┐
    │    MASTER     │←───────→│    BACKUP     │
    │  lvs-server1  │  VRRP   │  lvs-server2  │
    │ Priority: 100 │  (1 s)  │ Priority: 90  │
    └───────┬───────┘         └───────────────┘
            │
     ┌──────┴──────┐
     ↓             ↓
    RS1           RS2
  1.11:80       1.12:80

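3.7.7 Troubleshooting common problems
| Problem | Cause | Fix |
| --- | --- | --- |
| VIP does not fail over | VRID mismatch, password mismatch, or a firewall blocking VRRP | Check the configuration; allow the VRRP protocol (multicast 224.0.0.18) |
| Split brain (VIP on both nodes) | Heartbeat link broken, so each node believes the other has failed | Add a serial cable or a dedicated heartbeat NIC; configure an arbitration mechanism |
| RS health check fails | RS not running, firewall, or a faulty check script | Run the check script by hand; turn off the RS firewall |
| LVS rules not taking effect | Wrong lb_kind setting | Confirm that the DR/NAT/TUN mode matches the network environment |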
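
For the firewall case, VRRP advertisements are carried as IP protocol 112 to the multicast group 224.0.0.18 unless another group is configured. The sketch below only prints candidate `iptables` rules so that they can be reviewed first. The function name is illustrative, and the rules assume a plain iptables setup rather than firewalld or nftables.

```bash
#!/usr/bin/env bash
# Illustrative helper: print (not run) iptables rules that admit VRRP adverts.
# VRRP is IP protocol 112; its default multicast group is 224.0.0.18.
# Arguments: [interface] [multicast_group]
print_vrrp_rules() {
    local iface=${1:-eth0} group=${2:-224.0.0.18}
    echo "iptables -I INPUT  -i $iface -p 112 -d $group -j ACCEPT"
    echo "iptables -I OUTPUT -o $iface -p 112 -d $group -j ACCEPT"
}

print_vrrp_rules eth0
```

After reviewing the output, the rules can be applied as root on both directors. If `vrrp_mcast_group4` is set to a non-default group, pass that address as the second argument.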

Basic Keepalived setup and configuration

1. Install keepalived

[root@KA1 ~]# dnf install keepalived.x86_64 -y
[root@KA2 ~]# dnf install keepalived.x86_64 -y

2. Environment setup

# Deploy rs1 and rs2 (single NIC, NAT mode)
[root@rs1 ~]# vmset.sh eth0 172.25.254.10 rs1
[root@rs1 ~]# dnf install httpd -y
[root@rs1 ~]# echo RS1 - 172.25.254.10 > /var/www/html/index.html
[root@rs1 ~]# systemctl enable --now httpd

[root@rs2 ~]# vmset.sh eth0 172.25.254.20 rs2
[root@rs2 ~]# dnf install httpd -y
[root@rs2 ~]# echo RS2 - 172.25.254.20 > /var/www/html/index.html
[root@rs2 ~]# systemctl enable --now httpd

Test

# Configure KA1 and KA2
[root@KA1 ~]# vmset.sh eth0 172.25.254.50 KA1
[root@KA2 ~]# vmset.sh eth0 172.25.254.60 KA2


# Set up local name resolution
[root@KA1 ~]# vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.25.254.50     KA1
172.25.254.60     KA2
172.25.254.10     rs1
172.25.254.20     rs2


[root@KA1 ~]# for i in 60 10 20
> do
> scp /etc/hosts 172.25.254.$i:/etc/hosts
> done

# Check /etc/hosts on all hosts


# Enable the time synchronization service on KA1
[root@KA1 ~]# vim /etc/chrony.conf
allow 0.0.0.0/0
local stratum 10
 
[root@KA1 ~]# systemctl restart chronyd
[root@KA1 ~]# systemctl enable --now chronyd



# On KA2, use KA1 as the time source
[root@KA2 ~]# vim /etc/chrony.conf
pool 172.25.254.50 iburst

[root@KA2 ~]# systemctl restart chronyd
[root@KA2 ~]# systemctl enable --now chronyd

[root@KA2 ~]# chronyc sources -v
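
In `chronyc sources` output, the line whose state column reads `^*` is the source chronyd is currently synchronised to. The sketch below uses a hypothetical helper named `synced_source` to pull that address out, so a script can confirm that KA2 is following KA1.

```bash
#!/usr/bin/env bash
# Illustrative helper: read `chronyc sources` output on stdin and print the
# source chronyd is currently synchronised to (the line flagged "^*").
# Returns non-zero when no source is selected yet.
synced_source() {
    awk '$1 == "^*" { print $2; found = 1 } END { exit !found }'
}

# Example on KA2:
#   chronyc sources | synced_source || echo "not synchronised yet"
```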

3. Configure the virtual router

# On the master
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
     timinglee_zln@163.com
   }
   notification_email_from timinglee_zln@163.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id KA1
   vrrp_skip_check_adv_addr
   #vrrp_strict
   vrrp_garp_interval 1
   vrrp_gna_interval 1
   vrrp_mcast_group4 224.0.0.44
}
vrrp_instance WEB_VIP {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.25.254.100/24 dev eth0 label eth0:0
    }
}

[root@KA1 ~]# systemctl enable --now keepalived.service
Created symlink /etc/systemd/system/multi-user.target.wants/keepalived.service → /usr/lib/systemd/system/keepalived.service.

# Configuration on KA2
[root@KA2 ~]# vim /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
     timinglee_zln@163.com
   }
   notification_email_from timinglee_zln@163.com
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id KA2
   vrrp_skip_check_adv_addr
   #vrrp_strict
   vrrp_garp_interval 1
   vrrp_gna_interval 1
   vrrp_mcast_group4 224.0.0.44
}
vrrp_instance WEB_VIP {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.25.254.100/24 dev eth0 label eth0:0
    }
}

[root@KA2 ~]# systemctl enable --now keepalived.service
Created symlink /etc/systemd/system/multi-user.target.wants/keepalived.service → /usr/lib/systemd/system/keepalived.service.


# Verify
[root@KA1 ~]# tcpdump -i eth0 -nn host 224.0.0.44
11:38:46.183386 IP 172.25.254.50 > 224.0.0.44: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
11:38:47.184051 IP 172.25.254.50 > 224.0.0.44: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
11:38:48.184610 IP 172.25.254.50 > 224.0.0.44: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
11:38:49.185084 IP 172.25.254.50 > 224.0.0.44: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
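
A capture like the one above is easier to read once it is reduced to one line per sender. The sketch below assumes the tcpdump line format shown in this capture and uses a hypothetical helper named `vrrp_senders`. It prints each distinct combination of source address, VRID, and priority once.

```bash
#!/usr/bin/env bash
# Illustrative helper: summarise tcpdump VRRP lines (format as in the capture
# above) as "<sender> vrid=<id> prio=<priority>", printing each combination once.
vrrp_senders() {
    awk '/VRRPv[23], Advertisement/ {
        src = $3
        for (i = 1; i < NF; i++) {
            if ($i == "vrid") { vrid = $(i + 1); sub(/,$/, "", vrid) }
            if ($i == "prio") { prio = $(i + 1); sub(/,$/, "", prio) }
        }
        line = src " vrid=" vrid " prio=" prio
        if (!(line in seen)) { seen[line] = 1; print line }
    }'
}

# Example: capture five adverts, then summarise them
#   tcpdump -i eth0 -nn -c 5 host 224.0.0.44 | vrrp_senders
```

In a healthy pair only the current master should appear. Two different senders for the same VRID suggest that both nodes believe they are master.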


[root@KA1 ~]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.25.254.50  netmask 255.255.255.0  broadcast 172.25.254.255
        inet6 fe80::3901:aeea:786a:7227  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:26:33:d9  txqueuelen 1000  (Ethernet)
        RX packets 5847  bytes 563634 (550.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5224  bytes 698380 (682.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.25.254.100  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 00:0c:29:26:33:d9  txqueuelen 1000  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 42  bytes 3028 (2.9 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 42  bytes 3028 (2.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


# Failure test
# Run in a separate shell
[root@KA1 ~]# tcpdump -i eth0 -nn host 224.0.0.44

# Simulate a failure on KA1
[root@KA1 ~]# systemctl stop keepalived.service

# On KA2, check whether the VIP has moved to this host
[root@KA2 ~]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.25.254.60  netmask 255.255.255.0  broadcast 172.25.254.255
        inet6 fe80::26df:35e5:539:56bc  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:1e:fd:7a  txqueuelen 1000  (Ethernet)
        RX packets 2668  bytes 237838 (232.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2229  bytes 280474 (273.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.25.254.100  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 00:0c:29:1e:fd:7a  txqueuelen 1000  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 52  bytes 3528 (3.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 52  bytes 3528 (3.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0