使用 keepalived 和 haproxy 实现Kubernetes Control Plane的高可用 (HA)

文章目录

小结
问题
解决
- keepalived设置
- haproxy设置
- [部署Kubernetes control plane集群](#部署Kubernetes control plane集群)
参考

小结

本文记录了使用 keepalived 和 haproxy 实现Kubernetes Control Plane的高可用 (HA)。

VRRP(Virtual Router Redundancy Protocol) - 虚拟路由冗余协议

VIP - Virtual IP, 虚拟IP（浮动IP）

问题

需要实现Kubernetes Control Plane的高可用 (HA)，也就是Kubernetes Control Plane部署成一个集群，集群中的单个Kubernetes Control Plane损坏，Kubernetes Control Plane仍然工作。

解决

Kubernetes的Worker Nodes通过负载均衡连接Kubernetes Control Plane的集群，负载均衡使用keepalived 和 haproxy 来实现。

keepalived设置

在三台主机上安装并启动keepalived （Master, Node1 和 Node2），这三个节点也同时运行haproxy 和 Kubernetes Control Plane的集群。

shell 复制代码

yum install keepalived
systemctl enable keepalived

设置keepalived

shell 复制代码

vim /etc/keepalived/keepalived.conf

配置如下，

shell 复制代码

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51
    priority 100  # 优先级
    advert_int 1
    mcast_src_ip 192.168.238.130 #本机IP
    unicast_src_ip 192.168.238.130 #本机IP
    unicast_peer {
        192.168.238.131, #另外两个节点
        192.168.238.132
    }
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.238.100/24  #VIP
    }
    track_script {
        check_apiserver
    }
}

创建心跳线探测脚本：

shell 复制代码

vim /etc/keepalived/check_apiserver.sh
chmod +x /etc/keepalived/check_apiserver.sh

脚本如下：

shell 复制代码

#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:4300/ -o /dev/null || errorExit "Error GET https://localhost:4300/"
if ip addr | grep -q 192.168.238.100; then
    curl --silent --max-time 2 --insecure https://192.168.238.100:4300/ -o /dev/null || errorExit "Error GET https://192.168.238.100:4300/"
fi

启动keepalived服务

shell 复制代码

systemctl start keepalived

haproxy设置

安装并启动haproxy

shell 复制代码

yum install haproxy
systemctl enable haproxy

配置haproxy

shell 复制代码

vim /etc/haproxy/haproxy.cfg

配置的内容如下：

shell 复制代码

...
#---------------------------------------------------------------------
# apiserver frontend which proxys to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:4300
    mode tcp
    option tcplog
    default_backend apiserver
...
#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
        #server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
        # [...]
        server Master 192.168.238.130:6443 check
        server Node1 192.168.238.131:6443 check
        server Node2 192.168.238.132:6443 check
...

系统设置：

shell 复制代码

[root@Master ~]# echo net.ipv4.ip_nonlocal_bind=1 >> /etc/sysctl.d/haproxy-keepalived.conf
[root@Master ~]# echo net.ipv4.ip_forward=1 >> /etc/sysctl.d/haproxy-keepalived.conf
[root@Master ~]# cat /etc/sysctl.d/haproxy-keepalived.conf
net.ipv4.ip_nonlocal_bind=1
net.ipv4.ip_forward=1
[root@Master ~]#
[root@Master ~]# sysctl -p /etc/sysctl.d/haproxy-keepalived.conf
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
[root@Master ~]#

启动haproxy服务

shell 复制代码

systemctl start haproxy

查看虚拟IP绑定：

shell 复制代码

[root@Master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:9e:d5:84 brd ff:ff:ff:ff:ff:ff
    inet 192.168.238.130/24 brd 192.168.238.255 scope global dynamic noprefixroute ens160
       valid_lft 1790sec preferred_lft 1790sec
    inet 192.168.238.100/24 scope global secondary ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::b52:173:210c:48d9/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[root@Master ~]# ip a | grep 192.168.238.100
    inet 192.168.238.100/24 scope global secondary ens160
[root@Master ~]#

验证VIP和haproxy在工作：

shell 复制代码

[root@Master ~]# curl --insecure https://localhost:4300
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}
[root@Master ~]# curl --insecure https://192.168.238.100:4300
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}[root@Master ~]#

在浏览器中输入https://192.168.238.100:4300/可以得到以上同样的内容。如果输入http://192.168.238.100:4300/，则返回Client sent an HTTP request to an HTTPS server.。

部署Kubernetes control plane集群

根据以上配置，使用以下指令

shell 复制代码

#以下指令在三个节点中的一个节点执行，这里在Master上
kubeadm init --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint 192.168.238.100

按提示在三个节点上运行以下指令：

shell 复制代码

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

拷贝证书文件到剩下的节点，如下：

shell 复制代码

scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@Node1:/etc/kubernetes/pki/
mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/pki/etcd/ca.* root@Node1:/etc/kubernetes/pki/etcd

在加入其它的两个Kubernetes Control Plane之前，需要部署flannel网络模块：

shell 复制代码

kubectl apply -f kube-flannel.yml

以下指令根据提示在剩下下两个节点执行

shell 复制代码

kubeadm join 192.168.238.100:4300 --token si5oek.mbrw418p8mr357qt --discovery-token-ca-cert-hash sha256:0e23eb637e09afc4c6dbb1f891409b314d5731e46fe33d84793ba2d58da006d6 --control-plane

再安装剩下的模块即可成功，结果如下：

shell 复制代码

[root@Master ~]#kubectl get nodes -o wide 
NAME     STATUS   ROLES           AGE     VERSION   INTERNAL-IP       EXTERNAL-IP   OS-IMAGE                               KERNEL-VERSION          CONTAINER-RUNTIME
master   Ready    control-plane   3h57m   v1.27.3   192.168.238.130   <none>        Red Hat Enterprise Linux 8.3 (Ootpa)   4.18.0-240.el8.x86_64   containerd://1.6.21
node1    Ready    control-plane   3h53m   v1.27.3   192.168.238.131   <none>        Red Hat Enterprise Linux 8.3 (Ootpa)   4.18.0-240.el8.x86_64   containerd://1.6.21
node2    Ready    control-plane   3h51m   v1.27.3   192.168.238.132   <none>        Red Hat Enterprise Linux 8.3 (Ootpa)   4.18.0-240.el8.x86_64   containerd://1.6.21
[root@Master ~]#

这里的核心组件如下：

shell 复制代码

[root@Master ~]# kubectl get pods -o wide -A
NAMESPACE      NAME                             READY   STATUS    RESTARTS         AGE     IP                NODE     NOMINATED NODE   READINESS GATES
kube-flannel   kube-flannel-ds-6n8rh            1/1     Running   0                3m11s   192.168.238.131   node1    <none>           <none>
kube-flannel   kube-flannel-ds-c65rz            1/1     Running   0                80s     192.168.238.132   node2    <none>           <none>
kube-flannel   kube-flannel-ds-ss8pw            1/1     Running   0                10m     192.168.238.130   master   <none>           <none>
kube-system    coredns-5d78c9869d-cw5cz         1/1     Running   0                17m     10.244.1.96       node1    <none>           <none>
kube-system    coredns-5d78c9869d-z99r8         1/1     Running   0                17m     10.244.1.95       node1    <none>           <none>
kube-system    etcd-master                      1/1     Running   13               17m     192.168.238.130   master   <none>           <none>
kube-system    etcd-node1                       1/1     Running   0                3m10s   192.168.238.131   node1    <none>           <none>
kube-system    etcd-node2                       1/1     Running   0                79s     192.168.238.132   node2    <none>           <none>
kube-system    kube-apiserver-master            1/1     Running   11               17m     192.168.238.130   master   <none>           <none>
kube-system    kube-apiserver-node1             1/1     Running   1                3m11s   192.168.238.131   node1    <none>           <none>
kube-system    kube-apiserver-node2             1/1     Running   2                79s     192.168.238.132   node2    <none>           <none>
kube-system    kube-controller-manager-master   1/1     Running   20 (2m58s ago)   17m     192.168.238.130   master   <none>           <none>
kube-system    kube-controller-manager-node1    1/1     Running   1                3m11s   192.168.238.131   node1    <none>           <none>
kube-system    kube-controller-manager-node2    1/1     Running   2                79s     192.168.238.132   node2    <none>           <none>
kube-system    kube-proxy-87gfk                 1/1     Running   0                17m     192.168.238.130   master   <none>           <none>
kube-system    kube-proxy-crjc4                 1/1     Running   0                80s     192.168.238.132   node2    <none>           <none>
kube-system    kube-proxy-mnl2d                 1/1     Running   0                3m11s   192.168.238.131   node1    <none>           <none>
kube-system    kube-scheduler-master            1/1     Running   20 (2m55s ago)   17m     192.168.238.130   master   <none>           <none>
kube-system    kube-scheduler-node1             1/1     Running   1                3m11s   192.168.238.131   node1    <none>           <none>
kube-system    kube-scheduler-node2             1/1     Running   2                79s     192.168.238.132   node2    <none>           <none>
[root@Master ~]#

参考

Github: High Availability Considerations
Kubernetes: Options for Highly Available Topology
刘达的博客: haproxy+keepalived配置示例
 阿里云：搭建高可用负载均衡器: haproxy+keepalived
HAproxy and keepAlived for Multiple Kubernetes Master Nodes
Create a Highly Available Kubernetes Cluster Using Keepalived and HAproxy
Install and Configure a Multi-Master HA Kubernetes Cluster with kubeadm, HAProxy and Keepalived on CentOS 7
CSDN: haproxy+keepalived搭建高可用k8s集群
 CSDN: kubernetes学习---安装haproxy并配置keepalived高可用
 使用 Keepalived 和 HAproxy 创建高可用 Kubernetes 集群
 CSDN: HAProxy + Keepalived 配置架构解析
 cnblog: haproxy+keepalived原理特点
 Keepalived与HaProxy的协调合作原理分析
 High availability Kubernetes cluster on bare metal - part 2