A Detailed Guide to Deploying Highly Available Kubernetes 1.28 on Ubuntu 22.04 (Part 1): Multi-Master + Keepalived + Nginx Load Balancing

Table of Contents

  • I. Overall Architecture
  • II. Environment Preparation
  • III. Deploying the K8s Cluster with kubeadm
    • 1. Install common dependencies (every server)
    • 2. Base configuration (192.168.24.31~34)
    • 3. Install containerd (192.168.24.31~34)
    • 4. Install kubelet, kubeadm, kubectl (192.168.24.31~34)
    • 5. Install and configure Nginx (192.168.24.35~36)
    • 6. Install and configure Keepalived (192.168.24.35~36)
    • 7. Initialize the cluster (192.168.24.31)
    • 8. Install the network plugin
  • Summary

As business scale grows and stability requirements keep rising, a single-master Kubernetes cluster can no longer meet the high-availability demands of production. Once the master node fails, the cluster faces risks such as an unavailable API and halted scheduling, severely disrupting the business.

Kubernetes officially recommends building a highly available control plane with multiple masters plus an etcd cluster. In a multi-master architecture, however, providing a single, stable access endpoint for kube-apiserver becomes the key problem of the HA deployment.

In production practice, a common solution is Keepalived + Nginx:

  • Keepalived provides VIP failover, keeping the access endpoint highly available
  • Nginx acts as a layer-4 (or layer-7) load balancer, distributing requests across the kube-apiserver instances on the master nodes
  • Multiple master nodes run kube-apiserver / controller-manager / scheduler simultaneously and, together with the etcd cluster, make the whole control plane highly available

Based on Kubernetes 1.28, this article walks through a production-ready high-availability deployment built on:

  • a multi-master + multi-etcd architecture
  • Keepalived providing the VIP
  • Nginx load-balancing kube-apiserver

The result is a stable, reliable, and scalable highly available Kubernetes cluster, laying a solid foundation for subsequent containerized workloads and day-to-day cluster operations.

I. Overall Architecture

This solution uses the classic HA layout of three masters + two load-balancer nodes:

  • Highly available control plane: multiple master nodes run the kube-apiserver, controller-manager, and scheduler components simultaneously
  • Highly available etcd: each master node hosts one etcd instance, forming a three-node etcd cluster
  • Highly available access endpoint: Keepalived provides the VIP; Nginx listens on it and load-balances requests to the backend master nodes

Whether a single master, a single etcd member, or a single load-balancer node fails, the cluster fails over automatically and the Kubernetes API stays available.
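
As a quick sanity check of this topology (a minimal sketch, assuming the cluster built in the sections below is already running), client kubeconfigs should point at the VIP rather than at any individual master:

bash
# Print the API endpoint of the current kubeconfig context;
# in this architecture it should be the VIP, never a single master
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
# Expected output: https://192.168.24.30:6443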

II. Environment Preparation

| Server | Hostname | IP Address | Main Components |
| --- | --- | --- | --- |
| master-01 + etcd-01 | k8s-master-1 | 192.168.24.31 | apiserver, controller-manager, scheduler, kube-proxy, etcd |
| master-02 + etcd-02 | k8s-master-2 | 192.168.24.32 | apiserver, controller-manager, scheduler, kube-proxy, etcd |
| master-03 + etcd-03 | k8s-master-3 | 192.168.24.33 | apiserver, controller-manager, scheduler, kube-proxy, etcd |
| node-01 | k8s-node-1 | 192.168.24.34 | kubelet, kube-proxy (resources are limited, so this demo uses a single worker node) |
| nginx + keepalived-01 | keepalived-1 | 192.168.24.35 | nginx, keepalived (MASTER) |
| nginx + keepalived-02 | keepalived-2 | 192.168.24.36 | nginx, keepalived (BACKUP) |

III. Deploying the K8s Cluster with kubeadm

1. Install common dependencies (every server)

bash
apt install -y iproute2 net-tools ntpdate tcpdump telnet traceroute nfs-kernel-server nfs-common lrzsz tree openssl libssl-dev libpcre3 libpcre3-dev zlib1g-dev gcc openssh-server iotop unzip zip vim

2. Base configuration (192.168.24.31~34)

bash
# Set the hostname (run the matching line on each server)
hostnamectl set-hostname k8s-master-1
hostnamectl set-hostname k8s-master-2
hostnamectl set-hostname k8s-master-3
hostnamectl set-hostname k8s-node-1

# Configure /etc/hosts
cat >> /etc/hosts << EOF
192.168.24.31 k8s-master-1
192.168.24.32 k8s-master-2
192.168.24.33 k8s-master-3
192.168.24.34 k8s-node-1
EOF

# Disable swap
swapoff -a
sed -i '/swap/d' /etc/fstab

# Load the required kernel modules
cat <<EOF | tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

modprobe overlay
modprobe br_netfilter

# Apply sysctl parameters
cat <<EOF | tee /etc/sysctl.d/kubernetes.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
EOF

sysctl --system
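
# Optional sanity check: confirm the modules and sysctls actually took effect
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward   # both should print 1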

3. Install containerd (192.168.24.31~34)

bash
# Install containerd (Kubernetes removed the dockershim in v1.24, so Docker is no longer used as the container runtime)
apt update
apt install -y containerd

# Generate the default config file
mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml

# Switch the cgroup driver to systemd and point the sandbox (pause) image at the Aliyun mirror
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sed -i 's#sandbox_image = "registry.k8s.io/pause:3.8"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#' /etc/containerd/config.toml

# Start containerd and enable it at boot
systemctl restart containerd
systemctl enable containerd
containerd --version
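
# Optional sanity check: confirm the config edits landed and the sandbox image is reachable
containerd config dump | grep SystemdCgroup   # should print: SystemdCgroup = true
ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/pause:3.9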

4. Install kubelet, kubeadm, kubectl (192.168.24.31~34)

bash
# Add the Aliyun GPG key
mkdir -p /etc/apt/keyrings
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/Release.key \
| gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

# Add the Aliyun Kubernetes apt repository
cat <<EOF | tee /etc/apt/sources.list.d/kubernetes.list
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] \
https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/ /
EOF

# Update the package index and install
apt update
apt install -y kubelet kubeadm kubectl

# Enable kubelet at boot
systemctl enable kubelet --now
systemctl status kubelet  # the status shows activating (auto-restart); this is normal until the node joins a cluster
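
# Optional: pin the versions so a routine apt upgrade cannot skew the cluster
apt-mark hold kubelet kubeadm kubectl
kubeadm version -o short   # should print v1.28.x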

5. Install and configure Nginx (192.168.24.35~36)

bash
# Install nginx
# Install the prerequisites
sudo apt install curl gnupg2 ca-certificates lsb-release ubuntu-keyring
# Import the official nginx signing key so apt can verify package authenticity
curl https://nginx.org/keys/nginx_signing.key | gpg --dearmor \
| sudo tee /usr/share/keyrings/nginx-archive-keyring.gpg >/dev/null
# Add the apt repository for the stable nginx packages
echo "deb [signed-by=/usr/share/keyrings/nginx-archive-keyring.gpg] \
http://nginx.org/packages/ubuntu `lsb_release -cs` nginx" \
| sudo tee /etc/apt/sources.list.d/nginx.list
# ...or use the mainline packages instead (this overwrites the stable entry above; run only one of the two)
echo "deb [signed-by=/usr/share/keyrings/nginx-archive-keyring.gpg] \
http://nginx.org/packages/mainline/ubuntu `lsb_release -cs` nginx" \
| sudo tee /etc/apt/sources.list.d/nginx.list
# Pin the nginx.org repository so it takes priority over Ubuntu's own packages
echo -e "Package: *\nPin: origin nginx.org\nPin: release o=nginx\nPin-Priority: 900\n" \
| sudo tee /etc/apt/preferences.d/99nginx
# Refresh the package index and install nginx
apt update
apt install nginx

# Configure nginx (note: the stream block below sits at the top level of nginx.conf, alongside the http block, not inside it)
vim /etc/nginx/nginx.conf
http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

stream {
    upstream kube_apiserver {
        server 192.168.24.31:6443;
        server 192.168.24.32:6443;
        server 192.168.24.33:6443;
    }

    server {
        listen 6443;
        proxy_pass kube_apiserver;
    }
}

# Check the config and start nginx
nginx -t && nginx

# Verify that nginx is listening on port 6443
ss -tnlp
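
# Optional end-to-end check (only meaningful once the cluster in step 7 is up):
# the stream proxy forwards TLS straight through to a backend apiserver
curl -k https://127.0.0.1:6443/healthz   # expected output: ok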

6. Install and configure Keepalived (192.168.24.35~36)

bash
# Install keepalived
apt install keepalived

# Configure keepalived
vim /etc/keepalived/keepalived.conf (on 192.168.24.35)
vrrp_instance VI_1 {
    state MASTER            # primary node
    interface ens160        # change to match your NIC name
    virtual_router_id 51
    priority 100            # node priority; the higher value holds the VIP
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    virtual_ipaddress {
        192.168.24.30       # the VIP defined for the cluster endpoint
    }
}

vim /etc/keepalived/keepalived.conf (on 192.168.24.36)
vrrp_instance VI_1 {
    state BACKUP            # backup node
    interface ens160
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    virtual_ipaddress {
        192.168.24.30
    }
}

# Start keepalived and enable it at boot
systemctl enable keepalived --now

# Check that the VIP is up (it should appear on the MASTER node's NIC)
ip a
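
# Optional failover drill (a sketch): stop keepalived on the MASTER and the VIP
# should move to the BACKUP within a few seconds, then preempt back on restart
systemctl stop keepalived                 # on 192.168.24.35
ip a show ens160 | grep 192.168.24.30     # on 192.168.24.36: the VIP appears here
systemctl start keepalived                # on 192.168.24.35: priority 100 reclaims the VIP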

7. Initialize the cluster (192.168.24.31)

bash
# Initialize the cluster (the control-plane endpoint is the VIP on port 6443)
kubeadm init --control-plane-endpoint "192.168.24.30:6443" --upload-certs \
--pod-network-cidr=10.244.0.0/16 \
--kubernetes-version=1.28.15 \
--image-repository=registry.aliyuncs.com/google_containers
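
# Set up kubectl access for the current user (kubeadm init prints these exact steps)
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config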

# Join the remaining master nodes (a command like the following is printed once initialization completes)
kubeadm join 192.168.24.30:6443 --token odmvxr.yvxk22il5749xkdc \
--discovery-token-ca-cert-hash sha256:40c0f9fdb7f426a18a5d90e30bb65de4d6115d2dab27c78b20194e7424c5afdc \
--control-plane \
--certificate-key 382d1d4669d7149422b6a6aace780e3d38965c5450888403b8d0b3bc871434f3

# Join the worker node (likewise printed once initialization completes)
kubeadm join 192.168.24.30:6443 \
--token odmvxr.yvxk22il5749xkdc \
--discovery-token-ca-cert-hash sha256:40c0f9fdb7f426a18a5d90e30bb65de4d6115d2dab27c78b20194e7424c5afdc

# Verify the cluster (NotReady nodes simply mean the network plugin is not installed yet; see the next section)
root@k8s-master-1:~# kubectl get node -A
NAME           STATUS   ROLES           AGE     VERSION
k8s-master-1   Ready    control-plane   3d21h   v1.28.15
k8s-master-2   Ready    control-plane   3d21h   v1.28.15
k8s-master-3   Ready    control-plane   3d21h   v1.28.15
k8s-node-1     Ready    <none>          3d21h   v1.28.15

root@k8s-master-1:~# kubectl get pod -A -owide
NAMESPACE       NAME                                         READY   STATUS    RESTARTS        AGE     IP              NODE           NOMINATED NODE   READINESS GATES
default         demo-6bb9f6fc6-k9ckz                         1/1     Running   0               2d22h   10.244.3.25     k8s-node-1     <none>           <none>
default         demo-6bb9f6fc6-pz598                         1/1     Running   0               2d22h   10.244.3.24     k8s-node-1     <none>           <none>
ingress-nginx   ingress-nginx-controller-pzz9k               1/1     Running   0               2d22h   192.168.24.34   k8s-node-1     <none>           <none>
kube-flannel    kube-flannel-ds-bchvv                        1/1     Running   0               3d6h    192.168.24.32   k8s-master-2   <none>           <none>
kube-flannel    kube-flannel-ds-tqw7l                        1/1     Running   0               3d6h    192.168.24.34   k8s-node-1     <none>           <none>
kube-flannel    kube-flannel-ds-xx895                        1/1     Running   0               3d6h    192.168.24.31   k8s-master-1   <none>           <none>
kube-flannel    kube-flannel-ds-z9bcz                        1/1     Running   0               3d6h    192.168.24.33   k8s-master-3   <none>           <none>
kube-system     coredns-66f779496c-d2c8d                     1/1     Running   0               3d21h   10.244.2.3      k8s-master-3   <none>           <none>
kube-system     coredns-66f779496c-wrvwp                     1/1     Running   0               3d21h   10.244.2.2      k8s-master-3   <none>           <none>
kube-system     dashboard-metrics-scraper-64fbc9c6c7-9phmb   1/1     Running   0               3d6h    10.244.3.3      k8s-node-1     <none>           <none>
kube-system     etcd-k8s-master-1                            1/1     Running   0               3d21h   192.168.24.31   k8s-master-1   <none>           <none>
kube-system     etcd-k8s-master-2                            1/1     Running   0               3d21h   192.168.24.32   k8s-master-2   <none>           <none>
kube-system     etcd-k8s-master-3                            1/1     Running   0               3d21h   192.168.24.33   k8s-master-3   <none>           <none>
kube-system     kube-apiserver-k8s-master-1                  1/1     Running   0               3d21h   192.168.24.31   k8s-master-1   <none>           <none>
kube-system     kube-apiserver-k8s-master-2                  1/1     Running   0               3d21h   192.168.24.32   k8s-master-2   <none>           <none>
kube-system     kube-apiserver-k8s-master-3                  1/1     Running   0               3d21h   192.168.24.33   k8s-master-3   <none>           <none>
kube-system     kube-controller-manager-k8s-master-1         1/1     Running   1 (3d21h ago)   3d21h   192.168.24.31   k8s-master-1   <none>           <none>
kube-system     kube-controller-manager-k8s-master-2         1/1     Running   0               3d21h   192.168.24.32   k8s-master-2   <none>           <none>
kube-system     kube-controller-manager-k8s-master-3         1/1     Running   0               3d21h   192.168.24.33   k8s-master-3   <none>           <none>
kube-system     kube-proxy-fftrh                             1/1     Running   0               3d21h   192.168.24.34   k8s-node-1     <none>           <none>
kube-system     kube-proxy-hl9sl                             1/1     Running   0               3d21h   192.168.24.32   k8s-master-2   <none>           <none>
kube-system     kube-proxy-ksh6p                             1/1     Running   0               3d21h   192.168.24.33   k8s-master-3   <none>           <none>
kube-system     kube-proxy-r72c6                             1/1     Running   0               3d21h   192.168.24.31   k8s-master-1   <none>           <none>
kube-system     kube-scheduler-k8s-master-1                  1/1     Running   1 (3d21h ago)   3d21h   192.168.24.31   k8s-master-1   <none>           <none>
kube-system     kube-scheduler-k8s-master-2                  1/1     Running   1 (4h28m ago)   3d21h   192.168.24.32   k8s-master-2   <none>           <none>
kube-system     kube-scheduler-k8s-master-3                  1/1     Running   0               3d21h   192.168.24.33   k8s-master-3   <none>           <none>
kube-system     kubernetes-dashboard-78d7494969-wmd7l        1/1     Running   0               3d6h    10.244.3.2      k8s-node-1     <none>           <none>
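
The bootstrap token in the join commands above expires after 24 hours, and the uploaded certificate key after 2 hours. If either has lapsed, fresh values can be generated on any existing master:

bash
# Print a fresh worker join command (token included)
kubeadm token create --print-join-command

# Re-upload the control-plane certificates and print a new certificate key
kubeadm init phase upload-certs --upload-certs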

8. Install the network plugin

bash
# Flannel is used as the example here
mkdir -p /root/kubernetes/cni

vim /root/kubernetes/cni/kube-flannel.yaml
---
kind: Namespace
apiVersion: v1
metadata:
  name: kube-flannel
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
  labels:
    tier: node
    k8s-app: flannel
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-flannel
  labels:
    tier: node
    app: flannel
    k8s-app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        image: hub.hwj.com/k8s/flannel-cni-plugin:v1.4.1-flannel1
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        image: hub.hwj.com/k8s/flannel:v0.25.2
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: hub.hwj.com/k8s/flannel:v0.25.2
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate

# Deploy the flannel pods
# Wait patiently for the pods to start; kubectl get node will then show Ready
cd /root/kubernetes/cni && kubectl apply -f .
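
# Optional: watch the rollout; every node should run one kube-flannel pod
kubectl -n kube-flannel get pods -o wide
kubectl get node   # all nodes should turn Ready once flannel is up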

# If image pulls fail with an X509 error, it is because this demo pulls from a Harbor
# registry with a self-signed certificate; swap in images that match your own setup
# Troubleshooting commands
kubectl get pod -n kube-flannel
kubectl describe pod -n kube-flannel kube-flannel-ds-xxxxx
systemctl status containerd

# Fix: copy the Harbor server's CA certificate to each k8s server, refresh the trust store, and restart containerd
scp -rp root@harbor:/data/cert/ca.crt k8s:/usr/local/share/ca-certificates/harbor-ca.crt
sudo update-ca-certificates  # refresh the CA trust store
ll /etc/ssl/certs/harbor-ca.pem
systemctl restart containerd
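
# Optional check that the trust fix worked: pull an image from the private registry
# directly through containerd (hub.hwj.com is this demo's Harbor; substitute your own)
ctr -n k8s.io images pull hub.hwj.com/k8s/flannel:v0.25.2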

Summary

🚀 Based on Ubuntu 22.04 + Kubernetes 1.28, this article walked through a complete, production-ready Kubernetes high-availability deployment, using a multi-master + Keepalived + Nginx architecture to build a stable and reliable HA control plane.

The next article builds on this HA cluster to introduce Ingress, continuing to flesh out a production-grade Kubernetes stack.
