Complete Guide to Deploying a Multi-Master, Multi-Worker Kubernetes Cluster

Long time no see! Today I'm serving up some real substance (it's actually my own set of notes), with no filler from start to finish.

1. Machine List

PS: the master, lb, and nfs machines all run CentOS 7; the others run Ubuntu 22.04 LTS.

Machine     IP address       Notes
lb1         192.168.1.120    Load balancer 1
lb2         192.168.1.119    Load balancer 2
master-a    192.168.1.74     Master node 1
master-b    192.168.1.93     Master node 2
master-c    192.168.1.107    Master node 3
node01      192.168.1.13     Worker node 1
master01    192.168.1.53     Former single-master cluster node (now a worker)
vip         192.168.1.150    Virtual IP address

2. Load Balancer Configuration

2.1 lb1 Configuration

2.1.1 Base environment setup

# Back up the original yum repo file
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak

# Switch to the Aliyun yum mirror
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo

# Clean and rebuild the yum cache
yum clean all
yum makecache

# Set the hostname
hostnamectl set-hostname lb1

# Set the timezone
timedatectl set-timezone Asia/Shanghai

# Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
2.1.2 Install required packages

# Install base tools
yum install -y curl socat conntrack ebtables ipset ipvsadm

# Install the load-balancing components
yum install -y keepalived haproxy psmisc
2.1.3 Configure HAProxy

Edit the configuration file /etc/haproxy/haproxy.cfg:

global
    log /dev/log  local0 warning
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    log global
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 192.168.1.74:6443 check
    server kube-apiserver-2 192.168.1.93:6443 check
    server kube-apiserver-3 192.168.1.107:6443 check

Start the HAProxy service:

systemctl restart haproxy
systemctl enable haproxy
2.1.4 Configure Keepalived

Edit the configuration file /etc/keepalived/keepalived.conf:

global_defs {
    notification_email {
    }
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance haproxy-vip {
    state BACKUP
    priority 100
    interface ens192                       # adjust to the actual NIC name
    virtual_router_id 60
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    unicast_src_ip 192.168.1.120          # this machine's IP
    unicast_peer {
        192.168.1.119                     # peer (lb2) IP
    }
    virtual_ipaddress {
        192.168.1.150/24                  # VIP address
    }
    track_script {
        chk_haproxy
    }
}
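
The chk_haproxy script relies on `killall -0 haproxy`, which sends signal 0: the kernel only verifies that a matching process exists, nothing is actually delivered. A minimal sketch of the same mechanism with `kill -0` against a short-lived background process:

```shell
# Start a background process and remember its PID.
sleep 30 & pid=$!
# Signal 0 succeeds while the process exists.
kill -0 "$pid" && echo "process alive"
# Terminate and reap it, then the same probe fails.
kill "$pid"
wait "$pid" 2>/dev/null || true
kill -0 "$pid" 2>/dev/null || echo "no such process"
```

When the probe fails, keepalived lowers this node's effective priority, letting the peer take over the VIP.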

Start the Keepalived service:

systemctl restart keepalived
systemctl enable keepalived

2.2 lb2 Configuration

lb2's configuration is essentially the same as lb1's; the main differences are:

  1. The hostname is set to lb2
  2. unicast_src_ip in the Keepalived configuration becomes 192.168.1.119
  3. unicast_peer in the Keepalived configuration becomes 192.168.1.120

For the full procedure, follow lb1's configuration.
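
Since only the two unicast addresses are swapped, lb2's keepalived.conf can be derived mechanically from lb1's. A small sed sketch (the /tmp paths and here-doc stand in for the real /etc/keepalived/keepalived.conf):

```shell
# Stand-in for lb1's keepalived.conf, reduced to the lines that differ.
cat > /tmp/keepalived-lb1.conf <<'EOF'
unicast_src_ip 192.168.1.120
unicast_peer {
    192.168.1.119
}
EOF
# Swap the two addresses via a temporary token to avoid clobbering.
sed -e 's/192\.168\.1\.120/SWAP/g' \
    -e 's/192\.168\.1\.119/192.168.1.120/g' \
    -e 's/SWAP/192.168.1.119/g' \
    /tmp/keepalived-lb1.conf > /tmp/keepalived-lb2.conf
cat /tmp/keepalived-lb2.conf
```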

The detailed configuration follows.

2.2.1 Base environment setup

# Back up the original yum repo file
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak

# Switch to the Aliyun yum mirror
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo

# Clean and rebuild the yum cache
yum clean all
yum makecache

# Set the hostname
hostnamectl set-hostname lb2

# Set the timezone
timedatectl set-timezone Asia/Shanghai

# Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
2.2.2 Install required packages

# Install base tools
yum install -y curl socat conntrack ebtables ipset ipvsadm

# Install the load-balancing components
yum install -y keepalived haproxy psmisc
2.2.3 Configure HAProxy

Edit the configuration file /etc/haproxy/haproxy.cfg:

global
    log /dev/log  local0 warning
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    log global
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 192.168.1.74:6443 check
    server kube-apiserver-2 192.168.1.93:6443 check
    server kube-apiserver-3 192.168.1.107:6443 check

Start the HAProxy service:

systemctl restart haproxy
systemctl enable haproxy
2.2.4 Configure Keepalived

Edit the configuration file /etc/keepalived/keepalived.conf:

global_defs {
    notification_email {
    }
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance haproxy-vip {
    state BACKUP
    priority 100
    interface ens192                       # adjust to the actual NIC name
    virtual_router_id 60
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    unicast_src_ip 192.168.1.119          # this machine's IP
    unicast_peer {
        192.168.1.120                     # peer (lb1) IP
    }
    virtual_ipaddress {
        192.168.1.150/24                  # VIP address
    }
    track_script {
        chk_haproxy
    }
}

Start the Keepalived service:

systemctl restart keepalived
systemctl enable keepalived

3. NFS Server Configuration

# Extract the NFS package
tar zxf nfs/nfs.tar.gz

# Install the NFS service
yum -y localinstall nfs-rpm/*.rpm

# Configure the NFS export
cat > /etc/exports << EOF
/nfs-data/data *(rw,sync,no_root_squash,no_subtree_check)
EOF

# Start the NFS service
systemctl start nfs
systemctl enable nfs
systemctl restart nfs

# Install the NFS client tools on the other nodes
yum install -y nfs-utils
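
Once exported, the share can be consumed from the cluster through a static PersistentVolume. A minimal sketch, assuming ReadWriteMany access; the machine list above does not give the nfs server's IP, so the `server` value is a placeholder to substitute:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-data-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.200     # placeholder - use the real nfs machine's IP
    path: /nfs-data/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-data-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 10Gi
```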

4. Master Node Configuration

4.1 master-a Configuration

4.1.1 Base environment setup
# Switch to the Aliyun yum mirror
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache

# Set the hostname
hostnamectl set-hostname master-a

# Set the timezone
timedatectl set-timezone Asia/Shanghai

# Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config

# Configure the hosts file
cat >> /etc/hosts << EOF
192.168.1.53 master01
192.168.1.13 node01
192.168.1.81 ai3
192.168.1.74 master-a
192.168.1.93 master-b
192.168.1.107 master-c
EOF

# Configure time synchronization
yum install -y ntpdate
ntpdate cn.pool.ntp.org
echo "*/5 * * * * root /usr/sbin/ntpdate cn.pool.ntp.org &>/dev/null" >> /etc/crontab
4.1.2 Install Docker
# Extract the Docker package
tar xf docker-20.10.23.tgz -C /usr/local

# Copy the Docker binaries
cp /usr/local/docker/* /usr/bin/

# Create the Docker systemd service file
cat > /usr/lib/systemd/system/docker.service <<EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
 
[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP \$MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
 
[Install]
WantedBy=multi-user.target
EOF

# Configure Docker
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{ 
  "insecure-registries":["192.168.1.13:5000"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "data-root": "/home/docker",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF

# Start the Docker service
systemctl daemon-reload
systemctl start docker
systemctl enable docker
docker -v
4.1.3 Install the Kubernetes components
# Extract the Kubernetes package
tar -xvf k8s-v1.23.16.tar

# Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/g' /etc/fstab

# Configure kernel parameters
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
vm.swappiness                       = 0
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0
net.ipv6.conf.all.forwarding = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf

# Allow forwarding
iptables -P FORWARD ACCEPT

# Install the Kubernetes RPM packages
yum -y localinstall k8s-rpm/*.rpm

# Install NFS
mkdir tmp
tar zxf nfs/nfs.tar.gz -C tmp
yum -y localinstall tmp/nfs-rpm/*.rpm
systemctl start nfs
systemctl enable nfs
4.1.4 Initialize the Kubernetes cluster

Create the kubeadm configuration file kubeadm-config.yaml:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.74
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master-a
  taints: null
---
apiServer:
  certSANs:
  - master-a
  - master-b
  - master-c
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.1.150:6443
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: 192.168.1.13:5000
kind: ClusterConfiguration
kubernetesVersion: 1.23.17
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}

Initialize the cluster:

kubeadm init --config kubeadm-config.yaml --upload-certs

Configure kubectl:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install the network plugin:

kubectl apply -f flannel/kube-flannel.yml

Package and distribute the cluster configuration files:

cd /etc/kubernetes
tar zcf /root/k8s_conf.tar.gz pki/ca.crt pki/ca.key pki/sa.key pki/sa.pub pki/front-proxy-ca.crt pki/front-proxy-ca.key pki/etcd/ca.crt pki/etcd/ca.key admin.conf
scp /root/k8s_conf.tar.gz [email protected]:~/
scp /root/k8s_conf.tar.gz [email protected]:~/

Generate the cluster join command:

kubeadm token create --print-join-command
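
The sha256 value that --print-join-command emits as --discovery-token-ca-cert-hash is simply the SHA-256 digest of the cluster CA's DER-encoded public key, so it can be recomputed from /etc/kubernetes/pki/ca.crt at any time. The sketch below runs the same openssl pipeline against a throwaway self-signed certificate so the mechanics are visible without a cluster:

```shell
# Generate a stand-in CA cert (on a real master, skip this and use
# /etc/kubernetes/pki/ca.crt in the pipeline below).
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=kubernetes" -days 1 \
    -keyout /tmp/demo-ca.key -out /tmp/demo-ca.crt 2>/dev/null

# Extract the public key, DER-encode it, and hash it; the result is the
# hex string that follows "sha256:" in the join command.
openssl x509 -pubkey -noout -in /tmp/demo-ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'
```

This is handy when the token is still valid but the printed join command has been lost.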

4.2 master-b Configuration

master-b's configuration is similar to master-a's; the main differences are:

  1. The hostname is set to master-b
  2. Setup uses the configuration files fetched from master-a
  3. The join command is used instead of the init command

Concrete steps:

# Switch to the Aliyun yum mirror
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache

# Set the hostname
hostnamectl set-hostname master-b

# Set the timezone
timedatectl set-timezone Asia/Shanghai

# Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config

# Configure the hosts file
cat >> /etc/hosts << EOF
192.168.1.53 master01
192.168.1.13 node01
192.168.1.81 ai3
192.168.1.74 master-a
192.168.1.93 master-b
192.168.1.107 master-c
EOF

# Configure time synchronization
yum install -y ntpdate
ntpdate time.windows.com
4.2.2 Install Docker
# Extract the Docker package
tar xf docker-20.10.23.tgz -C /usr/local

# Copy the Docker binaries
cp /usr/local/docker/* /usr/bin/

# Create the Docker systemd service file
cat > /usr/lib/systemd/system/docker.service <<EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
 
[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP \$MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
 
[Install]
WantedBy=multi-user.target
EOF

# Configure Docker
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{ 
  "insecure-registries":["192.168.1.13:5000"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "data-root": "/home/docker",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF

# Start the Docker service
systemctl daemon-reload
systemctl start docker
systemctl enable docker
docker -v
4.2.3 Install the Kubernetes components
# Extract the Kubernetes package
tar -xvf k8s-v1.23.16.tar

# Extract the configuration fetched from master-a into the Kubernetes directory
mkdir -p /etc/kubernetes
tar zxf /root/k8s_conf.tar.gz -C /etc/kubernetes/

# Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/g' /etc/fstab

# Configure kernel parameters
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
vm.swappiness                       = 0
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0
net.ipv6.conf.all.forwarding = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf

# Allow forwarding
iptables -P FORWARD ACCEPT

# Install the Kubernetes RPM packages
yum -y localinstall k8s-rpm/*.rpm


Join the cluster as a control-plane node:

kubeadm join 192.168.1.150:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:8e13ce8a9e6ce68c4ba9b6b01ca98cff62a61d4d6c9b6063bd6b37aca19f7890 \
    --control-plane
systemctl restart kubelet
systemctl status kubelet

# Check which process holds port 10250
lsof -i:10250

Configure kubectl:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install NFS:

mkdir tmp
tar zxf nfs/nfs.tar.gz -C tmp
yum -y localinstall tmp/nfs-rpm/*.rpm
systemctl start nfs
systemctl enable nfs

With internet access, install it directly instead:

yum install -y nfs-utils

4.3 master-c Configuration

master-c's configuration is identical to master-b's.

5. Worker Node Configuration

5.1 Ubuntu node configuration

# Update the system and install prerequisites
apt-get update && apt-get install -y apt-transport-https ca-certificates curl software-properties-common

# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

# Add the Docker repository
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

# Install Docker
apt-get update && apt-get install -y docker-ce=5:20.10.23~3-0~ubuntu-$(lsb_release -cs) docker-ce-cli=5:20.10.23~3-0~ubuntu-$(lsb_release -cs) containerd.io

# Configure Docker
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

# Restart Docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker

# Install kubeadm, kubelet, and kubectl
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add - 
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet=1.23.16-00 kubeadm=1.23.16-00 kubectl=1.23.16-00

# Disable swap
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab

# Disable the firewall
ufw disable

# Join the cluster
kubeadm join 192.168.1.150:6443 --token 661ic1.vgfsbtnxte96nldg \
    --discovery-token-ca-cert-hash sha256:8e13ce8a9e6ce68c4ba9b6b01ca98cff62a61d4d6c9b6063bd6b37aca19f7890

# Install the NFS client
apt-get install -y nfs-common
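
If a worker should mount the share persistently, an /etc/fstab entry can be used; a sketch (the server address is a placeholder for the nfs machine's real IP, and _netdev delays the mount until the network is up):

```
# placeholder server IP - substitute the real nfs machine's address
192.168.1.200:/nfs-data/data  /mnt/nfs-data  nfs  defaults,_netdev  0  0
```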

6. Troubleshooting Common Issues

6.1 kubelet startup timeout

Error message:

error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition To see the stack trace of this error execute with --v=5 or higher

Fix:

kubeadm reset -f
docker rm -f $(docker ps -a -q)
rm -rf /var/lib/cni/
systemctl daemon-reload
systemctl restart kubelet
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

6.2 CNI network plugin issues

Error message:

(combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f4aac82f1a810b98057c8bb838deec809eb0750d703abcfb4a505ddcfb8406cd" network for pod "eip-nfs-nfs-client-6478c978c9-tqxld": networkPlugin cni failed to set up pod "eip-nfs-nfs-client-6478c978c9-tqxld_kube-system" network: failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.4.1/24

Fix:

rm -rf /etc/cni
ip link set cni0 down
ip link delete cni0

6.3 HAProxy port binding failure

If HAProxy in the high-availability setup fails to start with the error "Starting frontend api: cannot bind socket [0.0.0.0:6443]", run the following:

setsebool -P haproxy_connect_any=1
systemctl restart haproxy

6.4 Kubernetes certificate expiry

6.4.1 Check certificate expiration

kubeadm certs check-expiration

6.4.2 Renew all certificates

kubeadm certs renew all

6.4.3 Update the kubeconfig file

cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

6.4.4 Restart the affected services

systemctl restart kubelet
docker ps | grep -E 'k8s_kube-apiserver|k8s_kube-controller-manager|k8s_kube-scheduler|k8s_etcd_etcd' | awk '{print $1}' | xargs docker restart
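
Independently of kubeadm, any certificate under /etc/kubernetes/pki can be inspected with plain openssl. Sketched here against a throwaway certificate; on a real node, point -in at the file you care about:

```shell
# Stand-in certificate (replace with e.g. /etc/kubernetes/pki/apiserver.crt on a node).
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" -days 365 \
    -keyout /tmp/demo.key -out /tmp/demo.crt 2>/dev/null

# Print the certificate's expiry date, e.g. "notAfter=...".
openssl x509 -enddate -noout -in /tmp/demo.crt
```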

7. Verify Cluster Status

Run on all master nodes:

kubectl get nodes
kubectl get pods --all-namespaces
kubectl get cs

The expected output shows every node in the Ready state, all system Pods running normally, and every component status Healthy.
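
On a larger cluster the node list is easier to scan with a small filter. The awk sketch below flags anything whose STATUS column is not Ready; it is shown against canned sample output, but piping the real `kubectl get nodes` output through the same awk works identically:

```shell
# Sample `kubectl get nodes` output; substitute the real command's output.
sample='NAME       STATUS     ROLES                  AGE   VERSION
master-a   Ready      control-plane,master   10d   v1.23.16
master-b   Ready      control-plane,master   10d   v1.23.16
node01     NotReady   <none>                 10d   v1.23.16'

# Skip the header row and print any node that is not Ready.
echo "$sample" | awk 'NR>1 && $2 != "Ready" {print $1 " is " $2}'
```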

8. Configuring the VIP on an H3C Router

Configuration path

Advanced Options -> Policy Routing

Configuration parameters

Setting               Value                               Description
Interface             VLAN1                               VLAN interface the policy applies to
Protocol type         IP                                  Match the IP protocol
Source IP range       192.168.1.150-192.168.1.150         Exact match on the single source IP
Destination IP range  192.168.1.119-192.168.1.120         Match the destination IP range
Source port                                               Source port unrestricted
Destination port                                          Destination port unrestricted
Effective time                                            Active all day
Priority              Auto                                Priority assigned automatically by the system
Outbound interface    WAN1                                Egress interface for the traffic
Enabled               Yes                                 Enable the policy
Description           LB-VIP, proxies kube-api port 6443  Purpose of the policy

Notes on the configuration

  1. This policy forces traffic from 192.168.1.150 to 192.168.1.119-120 out through the WAN1 interface
  2. It serves the VIP proxy scenario for the k8s API Server (port 6443)
  3. Leaving source/destination ports empty matches traffic on all ports
  4. Automatic priority assignment keeps the policies ordered correctly

Cautions

  • Make sure the WAN1 interface is configured correctly
  • Check that the VLAN1 interface is UP
  • Test connectivity after applying the configuration