LVS + Keepalived Load Balancing and High Availability for Cloud-Native Environments
Overview
In cloud-native architectures, highly available load balancing is the core mechanism that keeps critical services reachable. Combining Keepalived with kube-proxy's IPVS mode provides high-throughput, low-latency traffic distribution for both the Kubernetes control plane and the data plane. This document covers Keepalived DaemonSet deployment, kube-proxy IPVS configuration and tuning, scheduling algorithm selection, VIP management conventions, and production validation and troubleshooting, forming a complete cloud-native load-balancing and high-availability solution.
1. Keepalived DaemonSet Deployment
1.1 DaemonSet Configuration
In a Kubernetes cluster, Keepalived is usually deployed as a DaemonSet on the control-plane (master) nodes to provide a highly available VIP for the API Server. A tuned DaemonSet manifest looks like this:
yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: keepalived
  namespace: kube-system
  labels:
    app: keepalived
spec:
  selector:
    matchLabels:
      app: keepalived
  template:
    metadata:
      labels:
        app: keepalived
      annotations:
        container.apparmor.security.beta.kubernetes.io/keepalived: unconfined
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: keepalived
        image: bitnami/keepalived:2.2.4
        securityContext:
          privileged: true
        command:
        - /usr/sbin/keepalived
        - --dont-fork
        - --log-console
        - --use-file=/etc/keepalived/keepalived.conf
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: keepalived-conf
          mountPath: /etc/keepalived/keepalived.conf
          subPath: keepalived.conf
        - name: health-check
          mountPath: /usr/local/bin/health_check.sh
          subPath: health_check.sh
          readOnly: true
      volumes:
      - name: keepalived-conf
        configMap:
          name: keepalived-config
      - name: health-check
        configMap:
          name: health-check-config
          defaultMode: 0755
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""  # kubeadm sets this label with an empty value
1.2 Keepalived Master Node Configuration
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: keepalived-config
  namespace: kube-system
data:
  keepalived.conf: |
    global_defs {
        router_id kube-apiserver
        script_user root
        enable_script_security
    }
    vrrp_script chk_apiserver {
        script "/usr/local/bin/health_check.sh"
        interval 2
        weight -5
        fall 2
        rise 1
    }
    vrrp_instance VI_API {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass YourStrongPass123!
        }
        virtual_ipaddress {
            192.168.1.100/24 dev eth0
        }
        track_script {
            chk_apiserver
        }
        preempt_delay 30
    }
1.3 Health Check Script
bash
#!/bin/bash
# Check the local API Server's health endpoint
curl -k -s --connect-timeout 2 https://127.0.0.1:6443/healthz > /dev/null
if [ $? -eq 0 ]; then
exit 0
else
exit 1
fi
1.4 Unicast Configuration for Cloud Environments
In cloud environments, VRRP multicast is often blocked by the underlying network, so unicast mode is required (a per-node rendering sketch follows the example):
bash
vrrp_instance VI_API {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    # Unicast configuration for cloud environments
    unicast_src_ip 10.0.0.10
    unicast_peer {
        10.0.0.11
        10.0.0.12
    }
    authentication {
        auth_type PASS
        auth_pass YourStrongPass123!
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0
    }
    track_script {
        chk_apiserver
    }
}
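Because unicast_src_ip (and the peer list) differ on every control-plane node, a single static ConfigMap cannot be mounted unmodified on all of them. A common workaround is to ship a template and render it per node before starting keepalived. A minimal sketch, assuming the node IP is injected through the downward API (status.hostIP) as NODE_IP and that the template uses a __NODE_IP__ placeholder; both are assumptions, not part of the manifests above:
bash
#!/bin/bash
# Hypothetical entrypoint wrapper: render this node's unicast settings, then start keepalived.
# NODE_IP is assumed to be injected via the downward API (fieldPath: status.hostIP).
set -e
sed "s/__NODE_IP__/${NODE_IP}/g" /etc/keepalived/keepalived.conf.tmpl > /etc/keepalived/keepalived.conf
# A fuller version would also remove this node's own address from unicast_peer.
exec /usr/sbin/keepalived --dont-fork --log-console --use-file=/etc/keepalived/keepalived.conf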
2. kube-proxy IPVS Mode Configuration
2.1 kube-proxy ConfigMap
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-proxy
  namespace: kube-system
data:
  config.conf: |
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    mode: "ipvs"
    ipvs:
      strictARP: true
      scheduler: "wrr"
      syncPeriod: 30s
      minSyncPeriod: 0s
      tcpTimeout: 0s
      tcpFinTimeout: 0s
      udpTimeout: 0s
    metricsBindAddress: "0.0.0.0:10249"
    bindAddress: "0.0.0.0"
    clusterCIDR: "10.244.0.0/16"
    healthzBindAddress: "0.0.0.0:10256"
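Once this ConfigMap is applied and kube-proxy restarted (see section 9.2), it is worth verifying that IPVS mode is actually active. Two quick checks, assuming metricsBindAddress is exposed on 0.0.0.0:10249 as configured above:
bash
# Ask kube-proxy which proxy mode it is running in; expected output: ipvs
curl -s http://127.0.0.1:10249/proxyMode
# List the IPVS virtual servers kube-proxy has programmed for the cluster Services
ipvsadm -Ln | head -n 20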
2.2 Scheduling Algorithm Selection
| Algorithm | Use Case | Configuration |
|---|---|---|
| Round robin (rr) | Short-lived connections, evenly matched backend Pods | scheduler: "rr" |
| Weighted round robin (wrr) | Short-lived connections, backend Pods with unequal capacity | scheduler: "wrr" |
| Least connections (lc) | Long-lived connections, dynamic request distribution | scheduler: "lc" |
| Weighted least connections (wlc) | Long-lived connections, backend Pods with unequal capacity | scheduler: "wlc" |
| Source hashing (sh) | Applications that require session affinity | scheduler: "sh" |
Note that kube-proxy assigns every IPVS real server the same weight, so under kube-proxy the weighted variants behave essentially like their unweighted counterparts; the scheduler in effect is visible in the ipvsadm output shown below.
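The active scheduler appears directly in the ipvsadm listing, one line per virtual service. With scheduler: "wrr", output along the following lines is expected (addresses and counters are illustrative):
bash
ipvsadm -Ln
# IP Virtual Server version 1.2.1 (size=4096)
# Prot LocalAddress:Port Scheduler Flags
#   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
# TCP  10.96.0.1:443 wrr
#   -> 10.0.0.10:6443               Masq    1      3          1
#   -> 10.0.0.11:6443               Masq    1      2          0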
3. How IPVS Scheduling Works in Kubernetes
3.1 Scheduling Mechanism
- kube-proxy watches the API Server: it continuously watches Service and Endpoint objects for changes
- Rule updates: when a change is detected, kube-proxy updates the IPVS rules through the kernel IPVS interface
- Traffic distribution: traffic sent to a Service's ClusterIP is forwarded to backend Pods according to the configured scheduling algorithm
- Health awareness: Pods that fail their readinessProbe/livenessProbe drop out of the Service's Endpoints, and kube-proxy removes them from the IPVS rules automatically (see the example after this list)
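This watch-and-sync loop is easy to observe on any node: scale a Deployment that backs a Service and watch its Pod IPs appear as IPVS real servers. The Service/Deployment name web in the default namespace is purely illustrative:
bash
# Scale an illustrative deployment backing the "web" Service
kubectl -n default scale deployment web --replicas=5
# Look up the Service's ClusterIP and watch kube-proxy add the new endpoints as real servers
CLUSTER_IP=$(kubectl -n default get svc web -o jsonpath='{.spec.clusterIP}')
watch -n 1 "ipvsadm -Ln | grep -A 10 ${CLUSTER_IP}"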
3.2 Scheduling Algorithm Comparison
| Algorithm | In-kernel scheduling latency | Throughput | Overhead | Typical use case |
|---|---|---|---|---|
| Round robin (rr) | Microseconds | High | Low | Short-lived connections, evenly matched backend Pods |
| Weighted round robin (wrr) | Microseconds | High | Low | Short-lived connections, backend Pods with unequal capacity |
| Least connections (lc) | Microseconds | High | Low | Long-lived connections, dynamic request distribution |
| Weighted least connections (wlc) | Microseconds | High | Low | Long-lived connections, backend Pods with unequal capacity |
| Source hashing (sh) | Microseconds | Medium | Low | Applications that require session affinity |
4. VIP Management Conventions
4.1 Planning VIP and ClusterIP Ranges
- ClusterIP: the default Kubernetes Service CIDR is 10.96.0.0/12 (set by the API Server's --service-cluster-ip-range flag) and is managed automatically by kube-proxy.
- VIP: the VIPs managed by Keepalived, typically for the API Server and other critical entry points, must be chosen outside that range, e.g. from an external segment such as 192.168.0.0/16 (see the check below).
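To verify the boundary on a running cluster, read the Service CIDR straight from the API Server manifest and confirm the planned VIP is outside it (the manifest path is the kubeadm default):
bash
# Show the configured Service CIDR (kubeadm static-pod manifest)
grep service-cluster-ip-range /etc/kubernetes/manifests/kube-apiserver.yaml
# Expected output similar to: --service-cluster-ip-range=10.96.0.0/12
# The Keepalived VIP (e.g. 192.168.1.100) must not fall inside this range.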
4.2 Network Policy Considerations
Keepalived runs with hostNetwork: true, so standard Kubernetes NetworkPolicies generally do not apply to its traffic. In addition, the NetworkPolicy API only accepts TCP, UDP and SCTP in the ports `protocol` field, so VRRP (IP protocol 112) cannot be expressed as a NetworkPolicy rule at all. VRRP between the control-plane nodes therefore has to be allowed at the host firewall and/or cloud security-group level, as illustrated below.
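As an illustration, the following host-level rules (a sketch assuming iptables and the 10.0.0.0/24 control-plane subnet from the unicast example) allow VRRP between the peers; on a cloud platform the equivalent is a security-group rule permitting IP protocol 112 within that subnet:
bash
# Allow VRRP (IP protocol 112) between control-plane nodes; adjust the subnet to your network
iptables -A INPUT  -p 112 -s 10.0.0.0/24 -j ACCEPT
iptables -A OUTPUT -p 112 -d 10.0.0.0/24 -j ACCEPT
# If multicast VRRP is used instead of unicast, also allow the VRRP multicast group
iptables -A INPUT -d 224.0.0.18/32 -j ACCEPT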
5. Production Validation and Troubleshooting
5.1 VIP Failover Test
bash
# Check where the VIP is currently bound
ip addr show eth0 | grep 192.168.1.100
kubectl get pods -n kube-system -l app=keepalived
# Simulate a failure on the node that currently holds the VIP.
# With the DaemonSet deployment, delete its keepalived pod (it is recreated automatically,
# and the VIP fails over to a backup in the meantime):
kubectl -n kube-system delete pod <keepalived-pod-on-the-vip-holder>
# With a host-installed keepalived, stop the service instead:
systemctl stop keepalived
# Watch the VIP move to a backup node
watch -n 1 'ip addr show eth0 | grep 192.168.1.100'
kubectl get pods -n kube-system -l app=keepalived
# Restore the failed master (host install)
systemctl start keepalived
5.2 Load Balancing Test
bash
# Inspect the IPVS rules for the VIP
ipvsadm -Ln | grep 192.168.1.100
# Test API Server reachability through the VIP
curl -k https://192.168.1.100:6443/healthz
# Check per-service connection statistics
ipvsadm -Ln --stats | grep 192.168.1.100
5.3 Troubleshooting
5.3.1 The VIP Is Not Acquired
bash
# Check the Keepalived configuration
cat /etc/keepalived/keepalived.conf
# Check the Keepalived process state
systemctl status keepalived
ps aux | grep keepalived
# Check whether the VIP is bound
ip addr show eth0 | grep 192.168.1.100
# Check VRRP traffic on the wire (224.0.0.18 is the VRRP multicast group)
tcpdump -i eth0 -nn host 224.0.0.18
5.3.2 Requests Are Not Distributed Correctly
bash
# Check the kube-proxy configuration
kubectl get cm kube-proxy -n kube-system -o yaml
# Check kube-proxy pod status
kubectl get pods -n kube-system -l k8s-app=kube-proxy
# Check the IPVS rules
ipvsadm -Ln
# Check backend Pod health
kubectl get pods -o wide
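If the rules look wrong, also confirm that kube-proxy actually came up in IPVS mode and that the kernel prerequisites are met; a couple of quick checks (the k8s-app=kube-proxy label matches the default kubeadm deployment):
bash
# Look for IPVS-related warnings or errors in the kube-proxy logs
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=100 | grep -iE 'ipvs|error'
# Verify the IPVS and conntrack kernel modules are loaded
lsmod | grep -e ip_vs -e nf_conntrack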
6. Kernel Parameter Tuning
bash
# /etc/sysctl.d/k8s-ipvs.conf
net.ipv4.ip_forward = 1
# The bridge-nf-call settings require the br_netfilter module to be loaded
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
# IPVS connection tracking
net.ipv4.vs.conntrack = 1
net.ipv4.vs.expire_nodest_conn = 1
net.ipv4.vs.expire_quiescent_template = 1
# Connection tracking table size
net.nf_conntrack_max = 1000000
net.netfilter.nf_conntrack_max = 1000000
# Apply the settings
sysctl -p /etc/sysctl.d/k8s-ipvs.conf
7. Monitoring and Alerting Integration
7.1 Prometheus ServiceMonitor
yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-proxy
  endpoints:
  - port: metrics
    interval: 15s
    path: /metrics
  namespaceSelector:
    matchNames:
    - kube-system
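Note that this ServiceMonitor selects a Service labelled k8s-app: kube-proxy, which kubeadm does not create by default, so a matching Service exposing port 10249 under the name metrics usually has to be added by hand. Independently of Prometheus, the endpoint can be sanity-checked directly on a node, given the metricsBindAddress from section 2.1:
bash
# Fetch kube-proxy metrics locally and show the IPVS rule-sync series
curl -s http://127.0.0.1:10249/metrics | grep kubeproxy_sync_proxy_rules | head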
7.2 Key Metrics
- Connection counts: IPVS connection totals, e.g. node_ipvs_connections_total from node_exporter's ipvs collector
- Traffic: IPVS byte counters, e.g. node_ipvs_incoming_bytes_total and node_ipvs_outgoing_bytes_total
- VIP status: whether the VIP is bound on the expected node, via a custom script (see the sketch below)
- Health checks: the API Server /healthz status behind the VIP
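For the VIP-status metric, one lightweight option is a cron-driven script feeding node_exporter's textfile collector. A minimal sketch; the output directory and the keepalived_vip_bound metric name are assumptions to adapt locally:
bash
#!/bin/bash
# Export a 0/1 gauge indicating whether this node currently holds the VIP.
# Assumes node_exporter runs with --collector.textfile.directory=/var/lib/node_exporter/textfile
VIP="192.168.1.100"
OUT="/var/lib/node_exporter/textfile/keepalived_vip.prom"
if ip addr show eth0 | grep -q "${VIP}"; then
    echo "keepalived_vip_bound{vip=\"${VIP}\"} 1" > "${OUT}"
else
    echo "keepalived_vip_bound{vip=\"${VIP}\"} 0" > "${OUT}"
fi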
8. Security Recommendations
8.1 Do Not Disable SELinux
bash
# Keep SELinux in enforcing mode and grant only the specific permission required, e.g.:
setsebool -P haproxy_connect_any=on
8.2 Use a Strong Authentication Password
bash
authentication {
    auth_type PASS
    auth_pass Your$tr0ngP@ss123!
}
Note that keepalived only uses the first eight characters of auth_pass for VRRP PASS authentication, so the effective secret is short; restricting who can send VRRP at the network level (section 8.3) is at least as important as password strength.
8.3 Network Security
- In cloud environments, configure security groups to allow IP protocol 112 (VRRP) between control-plane nodes
- Use unicast mode to avoid multicast restrictions
- Apply appropriate host firewall rules and network policies
9. Deployment Workflow
9.1 Environment Preparation
bash
# Install the required tools
yum install -y ipvsadm keepalived
# Configure kernel parameters
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
sysctl -p
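kube-proxy's IPVS mode also needs the IPVS kernel modules available on every node; a typical preparation step, persisting the module list via modules-load.d:
bash
# Load the IPVS modules now and on every boot
cat > /etc/modules-load.d/ipvs.conf <<'EOF'
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do modprobe "$m"; done
# Verify
lsmod | grep ip_vs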
9.2 Deployment Steps
bash
# 1. Apply the kube-proxy configuration
kubectl apply -f kube-proxy-configmap.yaml
# 2. Restart kube-proxy so the IPVS settings take effect
kubectl rollout restart daemonset kube-proxy -n kube-system
# 3. Apply the Keepalived configuration
kubectl apply -f keepalived-configmap.yaml
kubectl apply -f health-check-config.yaml
# 4. Deploy the Keepalived DaemonSet
kubectl apply -f keepalived-daemonset.yaml
# 5. Verify the deployment
kubectl get pods -n kube-system -l app=keepalived
ip addr show eth0 | grep 192.168.1.100
10. Summary and Best Practices
10.1 Advantages of This Design
- High availability: Keepalived provides a highly available VIP for the API Server
- High performance: kube-proxy's IPVS mode significantly outperforms iptables mode at scale
- Flexibility: multiple scheduling algorithms are available and can be chosen per workload profile
- Security: network policies, firewall rules and hardening settings protect the setup
- Observability: monitoring and alerting are integrated
10.2 Best Practice Recommendations
- Scheduling algorithm selection:
  - Short-lived connections: round robin (rr) or weighted round robin (wrr)
  - Long-lived connections: least connections (lc) or weighted least connections (wlc)
  - Session affinity: source hashing (sh)
- VIP management:
  - Keep the VIP range clearly separated from the ClusterIP range
  - Use unicast mode in cloud environments
  - Run VIP failover tests regularly
- Network plugin:
  - Make sure the network plugin and underlying network are compatible with VRRP
  - Configure the appropriate security-group rules
- Production validation:
  - Perform failover drills regularly
  - Monitor the key metrics
  - Set sensible alert thresholds