Latest Kubernetes 1.33.1 High-Availability Cluster Setup (No Proxy Required)

Hi everyone. Today I'd like to share a walkthrough for building a highly available cluster on the latest Kubernetes release, 1.33.1. For more Kubernetes content, you can follow my WeChat official account "运维日常手记".

Introduction:

Kubernetes offers two HA topologies. The first is the stacked topology, shown in the figure below: etcd runs as one of the control-plane components on the master nodes, each master hosts its own etcd instance, that instance talks only to the kube-apiserver on the same node, and the etcd members across the masters form an etcd cluster. Its advantages are simpler configuration, fewer machines, and easier replica management; the drawback is the coupling risk: when a node fails, both its control-plane components and its etcd member are lost.

[Figure: stacked etcd HA topology]

The other is the external etcd topology, where the etcd cluster runs on dedicated nodes. This decouples etcd from the control-plane components, so losing an etcd member or a control-plane component has a smaller impact; the downsides are a more complex setup and more machines.

[Figure: external etcd HA topology]

The only part of a Kubernetes HA setup we have to build ourselves is the load balancer. Its goal is to expose a single VIP for the nodes' kubelets (and other clients) to talk to, and behind that VIP it reverse-proxies to the kube-apiserver on each master node, giving kube-apiserver both load balancing and high availability. This guide deploys the HA cluster with the stacked topology and implements the load balancer as haproxy + keepalived static Pods.
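
As a quick illustration of that data path, once everything below is in place you should be able to reach a healthy kube-apiserver through the VIP from any node (a sanity check only, using the VIP and port planned in the next section):

# The VIP answers on haproxy's port and forwards to a healthy kube-apiserver
curl -k https://172.16.10.205:8443/healthz
# expected output once the cluster is initialized: ok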

Preparing the base environment:

Operating system: Rocky Linux 9.5

Kernel version: 5.14.0-503.14.1.el9_5.x86_64

Kubernetes version: 1.33.1

VIP: 172.16.10.205

The three master nodes are:

k8s-master-01 172.16.10.200

k8s-master-02 172.16.10.201

k8s-master-03 172.16.10.202

Disable the firewall, SELinux, swap, etc.

Run the following on all three nodes:

# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld


# Temporarily disable swap
swapoff -a

# To disable swap permanently, edit /etc/fstab and comment out the swap entry:
#/dev/mapper/rl-swap     none    swap    defaults        0 0


# Disable SELinux
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config



# Add a module config file and load the IPVS kernel modules (kube-proxy will use IPVS mode)
cat <<EOF | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF

# Load the modules now
systemctl restart systemd-modules-load.service
modprobe -- ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack


# Add the following entries to /etc/hosts
172.16.10.200  k8s-master-01
172.16.10.201  k8s-master-02
172.16.10.202  k8s-master-03


# Add a kernel-parameter config file so the setting persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF


# Apply the parameters
sysctl --system

Set hostnames and passwordless SSH

On k8s-master-01, run:

# Set the hostname
hostnamectl set-hostname k8s-master-01


# Create an SSH key pair (just press Enter at every prompt)
ssh-keygen -t rsa


# Copy the public key to the other two nodes
ssh-copy-id root@k8s-master-02
ssh-copy-id root@k8s-master-03

On k8s-master-02, run:

# Set the hostname
hostnamectl set-hostname k8s-master-02


# Create an SSH key pair (just press Enter at every prompt)
ssh-keygen -t rsa


# Copy the public key to the other two nodes
ssh-copy-id root@k8s-master-01
ssh-copy-id root@k8s-master-03

On k8s-master-03, run:

# Set the hostname
hostnamectl set-hostname k8s-master-03


# Create an SSH key pair (just press Enter at every prompt)
ssh-keygen -t rsa


# Copy the public key to the other two nodes
ssh-copy-id root@k8s-master-01
ssh-copy-id root@k8s-master-02

Pre-installation checks

After finishing the steps above and before starting the actual installation, it is a good idea to reboot the servers and run the commands below to confirm that everything has taken effect.

# Check the SELinux status; the output should include "SELinux status: disabled"
sestatus


# Check that swap is off; /proc/swaps should list no swap devices (header line only)
cat /proc/swaps


# Check that the IPVS modules are loaded
lsmod | grep ip_vs


# Check that the firewall is stopped and disabled
systemctl status firewalld


# Check the kernel parameter; the value should be 1
cat /proc/sys/net/ipv4/ip_forward


# From each of the three nodes, try passwordless SSH to the other two
ssh root@k8s-master-01

Install the container runtime

Run the following on all three nodes:

# Add the Docker repository
dnf -y install dnf-plugins-core
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo


# Install containerd
dnf install -y containerd.io


# Generate the default configuration
containerd config default | sudo tee /etc/containerd/config.toml


# Edit /etc/containerd/config.toml and change the two settings below
disabled_plugins = []

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true # change this to true

[plugins."io.containerd.grpc.v1.cri"]
  # point the sandbox (pause) image at a mirror you can reach
  sandbox_image = "crpi-u3f530xi0zgtvmy0.cn-beijing.personal.cr.aliyuncs.com/image-infra/pause:3.10"



# Enable containerd on boot
systemctl enable containerd

# Start containerd
systemctl start containerd

# Check its status
systemctl status containerd
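
Before moving on, it is worth confirming that both edits actually landed in containerd's merged configuration; a quick check (just a sketch, grepping for the keys changed above):

# Show the effective configuration and confirm the two changes took effect
containerd config dump | grep -E 'SystemdCgroup|sandbox_image'

# Confirm the client can talk to the daemon
ctr version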

Install kubelet, kubeadm, and kubectl

Run the following on all three nodes:

# Configure the Kubernetes repository
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.33/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.33/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF


# Install kubelet, kubeadm, and kubectl
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes



# Enable and start kubelet; its status will look unhealthy until the cluster is initialized, which is expected
systemctl enable --now kubelet
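
Optionally, confirm that the installed versions match on all three nodes and that kubelet is enabled; it will keep restarting until kubeadm init runs, which is normal:

# Versions should be identical on all three nodes
kubeadm version -o short
kubectl version --client
kubelet --version

# kubelet should be enabled even though it is not yet healthy
systemctl is-enabled kubelet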

Configure the load balancer: haproxy + keepalived static Pods

The haproxy.cfg file is identical on all three nodes. On each node, run:

# Create the configuration directory
mkdir /etc/haproxy/


# Then save the following content as /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log stdout format raw local0
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          35s
    timeout server          35s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxies to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend apiserverbackend

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserverbackend
    option httpchk

    http-check connect ssl
    http-check send meth GET uri /healthz
    http-check expect status 200

    mode tcp
    balance     roundrobin
    
    server k8s-master-01 172.16.10.200:6443 check verify none
    server k8s-master-02 172.16.10.201:6443 check verify none
    server k8s-master-03 172.16.10.202:6443 check verify none

The haproxy static-Pod manifest, haproxy.yaml, is also identical on all three nodes; save the following as haproxy.yaml on each node (a later step copies it into place):

apiVersion: v1
kind: Pod
metadata:
  name: haproxy
  namespace: kube-system
spec:
  containers:
  - image: crpi-u3f530xi0zgtvmy0.cn-beijing.personal.cr.aliyuncs.com/image-infra/haproxy:2.8
    name: haproxy
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: localhost
        path: /healthz
        port: 8443
        scheme: HTTPS
    volumeMounts:
    - mountPath: /usr/local/etc/haproxy/haproxy.cfg
      name: haproxyconf
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/haproxy/haproxy.cfg
      type: FileOrCreate
    name: haproxyconf
status: {}

The keepalived configuration is not quite the same on all three nodes; we make k8s-master-01 the keepalived MASTER.

# Create the configuration directory
mkdir /etc/keepalived

On k8s-master-01, keepalived.conf looks like the following. A few things to note: "state" is set to "MASTER", "priority" is set to "101", "interface" must be your NIC name (in my environment it is "ens18"; see the snippet after this configuration if you are unsure of yours), and "virtual_ipaddress" is the VIP we planned at the beginning.

! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens18
    virtual_router_id 51
    priority 101
    authentication {
        auth_type PASS
        auth_pass 42
    }
    virtual_ipaddress {
        172.16.10.205
    }
    track_script {
        check_apiserver
    }
}
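
If you are unsure what to put in "interface", list the interfaces briefly and pick the one that carries the node's IP (ens18 in my environment; yours may differ):

# Find the interface that holds this node's IP, e.g. 172.16.10.200 on k8s-master-01
ip -br addr | grep 172.16.10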

On k8s-master-02 and k8s-master-03, keepalived.conf is identical on both: set "state" to "BACKUP" and "priority" to "100"; everything else matches the configuration on k8s-master-01.

! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens18
    virtual_router_id 51
    priority 100
    authentication {
        auth_type PASS
        auth_pass 42
    }
    virtual_ipaddress {
        172.16.10.205
    }
    track_script {
        check_apiserver
    }
}

The keepalived health-check script is identical on all three nodes. Save it as /etc/keepalived/check_apiserver.sh:

#!/bin/sh

APISERVER_DEST_PORT=8443

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl -sfk --max-time 2 https://localhost:${APISERVER_DEST_PORT}/healthz -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/healthz"
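
The copy step below also expects a keepalived.yaml static-Pod manifest, which has not been shown yet. Here is a minimal sketch modeled on the upstream kubeadm high-availability guide; the osixia/keepalived:2.0.17 image is the upstream example (my assumption, not something published by this article), so replace it with an image you can actually pull, for example one mirrored to the same Aliyun registry as the other images. Making the check script executable is also a good idea, since keepalived runs it directly:

# Make the health-check script executable
chmod +x /etc/keepalived/check_apiserver.sh

# Write the keepalived static-Pod manifest next to haproxy.yaml.
# NOTE: the image below is the upstream example; point it at a registry you can reach.
cat <<'EOF' | tee keepalived.yaml
apiVersion: v1
kind: Pod
metadata:
  name: keepalived
  namespace: kube-system
spec:
  containers:
  - image: osixia/keepalived:2.0.17
    name: keepalived
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
        - NET_BROADCAST
        - NET_RAW
    volumeMounts:
    - mountPath: /usr/local/etc/keepalived/keepalived.conf
      name: config
    - mountPath: /etc/keepalived/check_apiserver.sh
      name: check
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/keepalived/keepalived.conf
    name: config
  - hostPath:
      path: /etc/keepalived/check_apiserver.sh
    name: check
status: {}
EOF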

Once the steps above are done, and before initializing the cluster, copy the haproxy and keepalived static-Pod manifests into /etc/kubernetes/manifests/. The kubelet watches this directory and will bring up the Pods defined there. Run on each of the three nodes:

# Copy the haproxy and keepalived manifests into the static-Pod directory
cp keepalived.yaml haproxy.yaml  /etc/kubernetes/manifests

Initialize the cluster

On k8s-master-01, create the kubeadm-config.yaml file. A few notes: "serviceSubnet" is the ClusterIP range and "podSubnet" is the Pod IP range; plan both yourself so they do not conflict with anything in your network. "kubernetesVersion" is the version to install. "controlPlaneEndpoint" is the planned VIP plus the port haproxy listens on, which must match the port configured in haproxy.cfg. "localAPIEndpoint" is the address and port the local kube-apiserver listens on; since we run the init on k8s-master-01, it is set to k8s-master-01's IP. Finally, "caCertificateValidityPeriod" and "certificateValidityPeriod" extend the certificate lifetimes to 100 years for the CA and 10 years for the leaf certificates:

apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication

nodeRegistration:
  name: "k8s-master-01"
  criSocket: "unix:///var/run/containerd/containerd.sock"
  imagePullPolicy: "IfNotPresent"

localAPIEndpoint:
  advertiseAddress: "172.16.10.200"
  bindPort: 6443

---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
networking:
  serviceSubnet: "10.96.0.0/16"
  podSubnet: "10.244.0.0/16"
  dnsDomain: "cluster.local"
kubernetesVersion: "1.33.1"
controlPlaneEndpoint: "172.16.10.205:8443"
certificatesDir: "/etc/kubernetes/pki"
imageRepository: "crpi-u3f530xi0zgtvmy0.cn-beijing.personal.cr.aliyuncs.com/image-infra"
clusterName: "kubernetes"
caCertificateValidityPeriod: 876000h0m0s
certificateValidityPeriod: 87600h0m0s
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: "systemd"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"

Initialize the cluster on k8s-master-01:

# Initialize the cluster
kubeadm init --config kubeadm-config.yaml --upload-certs --v=5

If all goes well, the cluster initializes in a few minutes (all required images have already been pushed to Alibaba Cloud, so they can be used directly), and you will see output like the following, indicating that initialization succeeded.

[Screenshot: successful kubeadm init output, including the kubeadm join commands]

Following that output, configure kubectl:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
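
A quick sanity check at this point (a sketch; the interface name and VIP follow the values used earlier):

# The first control-plane node is listed (NotReady until the network plugin is installed)
kubectl get nodes

# The haproxy and keepalived static Pods should be running in kube-system
kubectl get pods -n kube-system -o wide | grep -E 'haproxy|keepalived'

# The VIP should be bound on the keepalived MASTER node and answer health checks
ip addr show ens18 | grep 172.16.10.205
curl -k https://172.16.10.205:8443/healthz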

Copy the kubeadm join command from that output and run it on k8s-master-02 and k8s-master-03 to add them as control-plane nodes, for example:

kubeadm join 172.16.10.205:8443 --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:53950443f3c42bbf8c6aa17289327e03555649d9ce0b11f6e5df1f87fc67956f \
  --control-plane --certificate-key 98b0dd21cd66398aa656d2724d56c583ec7131808c42845d4fb76cc51c243d9d
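
Note that the --certificate-key is only valid for about two hours and the bootstrap token for 24 hours (per the TTL in kubeadm-config.yaml). If they have expired by the time you add the other masters, regenerate both on k8s-master-01 instead of re-initializing:

# Print a fresh join command (new token plus CA cert hash)
kubeadm token create --print-join-command

# Re-upload the control-plane certificates and print a new --certificate-key
kubeadm init phase upload-certs --upload-certs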

Once all three master nodes have joined, kubectl get no will show them in the NotReady state; that is because the network plugin has not been installed yet:

[Screenshot: kubectl get no showing the three masters in NotReady state]

Install the Calico network plugin

Install the Calico operator:

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.30.0/manifests/tigera-operator.yaml
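
The operator lands in the tigera-operator namespace; wait for it to become available before creating the custom resources:

# Wait for the Calico operator deployment to roll out
kubectl rollout status deployment/tigera-operator -n tigera-operator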

Create the custom-resources.yaml configuration. Note that "cidr" must match "podSubnet" in kubeadm-config.yaml. I also set "registry" and "imagePath" so the images are pulled straight from the ones I uploaded to Alibaba Cloud, so there is no image-download hassle; if you have your own private registry you can substitute it here.

# This section includes base Calico installation configuration.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  registry: "crpi-u3f530xi0zgtvmy0.cn-beijing.personal.cr.aliyuncs.com/"
  imagePath: "image-infra"
  calicoNetwork:
    ipPools:
    - name: default-ipv4-ippool
      blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()

---

# This section configures the Calico API server.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}

---

# Configures the Calico Goldmane flow aggregator.
apiVersion: operator.tigera.io/v1
kind: Goldmane
metadata:
  name: default

---

# Configures the Calico Whisker observability UI.
apiVersion: operator.tigera.io/v1
kind: Whisker
metadata:
  name: default

Create the Calico resources:

kubectl create -f custom-resources.yaml

Wait a few minutes, then check the cluster again: everything should now be healthy, and you can create a test Pod to try it out. Congratulations!
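
The checks I would run at this point (the tigerastatus resource is installed by the operator manifest above):

# All three masters should now be Ready
kubectl get nodes -o wide

# All Pods, including those in calico-system and kube-system, should be Running
kubectl get pods -A

# The Calico operator reports component health through tigerastatus
kubectl get tigerastatus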

[Screenshot: all nodes Ready and all Pods Running]

That's all. If this article was useful to you, please consider following me.

References:

github.com/kubernetes/...

kubernetes.io/zh-cn/docs/...

docs.tigera.io/calico/late...
