A Guide to Deploying a Kubernetes 1.26.4 Cluster with kubeadm

This article documents the complete process of deploying a Kubernetes 1.26.4 cluster. The cluster uses containerd as the container runtime, Flannel as the network plugin, and IPVS as the Service proxy mode, and integrates the Dashboard web UI, Metrics-Server resource monitoring, and the Ingress-Nginx ingress controller. The document walks through environment preparation and system configuration, then cluster initialization, node joining, network configuration, and installation of the core add-ons, and finishes with a complete application test. This guide is intended for operators and developers who need to stand up a production-grade Kubernetes cluster quickly.

I. Pre-Deployment Environment Preparation (All Nodes)

1. Architecture Overview

This deployment uses a classic three-node architecture (the node inventory is listed below):

  • One Master (control-plane) node: runs the core components, including the API Server, Controller Manager, Scheduler, and etcd.
  • Two Worker nodes: run the actual workloads. Flannel provides Pod-to-Pod networking, and IPVS provides high-performance Service load balancing, ensuring cluster stability and performance.
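
For reference, the nodes and addresses used throughout this guide (they also appear in the /etc/hosts entries configured later) are:

  • node-18 (192.168.91.18): Master / control plane
  • node-19 (192.168.91.19): Worker
  • node-20 (192.168.91.20): Worker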

2. Environment Initialization and System Configuration

Before starting the Kubernetes deployment, all nodes must be prepared with the same baseline configuration. This includes disabling the firewall and SELinux to avoid network access restrictions, installing common operations tools for later convenience, and rebooting the system so the changes take effect. This step clears the system-level obstacles for a healthy Kubernetes cluster.

sh
yum install wget vim lrzsz net-tools -y
# disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# disable SELinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

3. Load the IPVS Kernel Modules

Kubernetes Service proxying supports multiple modes. Compared with the traditional iptables mode, IPVS offers better performance and scalability, especially for large clusters. Make sure the kernel modules required by IPVS are loaded automatically at boot, in preparation for configuring kube-proxy in IPVS mode later.

sh
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
modprobe -- br_netfilter
EOF

4. Install the IPVS Management Tools

ipvsadm is the userspace tool for managing and inspecting IPVS rules. Once installed, you can view the load-balancing rules created by kube-proxy and verify that Service forwarding works as expected. It is very useful when troubleshooting Service networking problems, as it shows the mapping between virtual services and real servers directly.

sh
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack

yum install ipvsadm ipset -y

# view the rules
ipvsadm -ln

5. Set Node Hostnames

Giving each node a unique, meaningful hostname is a Kubernetes cluster-management best practice. Clear hostnames make it much easier to identify a machine's role and location in a multi-node cluster, especially when reviewing node status, Pod placement, and logs.

sh
hostnamectl set-hostname node-18   # run on 192.168.91.18
hostnamectl set-hostname node-19   # run on 192.168.91.19
hostnamectl set-hostname node-20   # run on 192.168.91.20

6. Configure Hostname Resolution

Although Kubernetes supports DNS-based service discovery, it is essential during cluster bootstrap that the nodes can resolve each other by hostname. Configuring static entries in /etc/hosts avoids problems caused by an unreliable internal DNS and ensures operations such as kubeadm join proceed smoothly.

sh
vim /etc/hosts
192.168.91.18 node-18
192.168.91.19 node-19
192.168.91.20 node-20

7. Configure Passwordless SSH Login

In a multi-node cluster, passwordless SSH from the Master node to every Worker node greatly simplifies administration. It lets the administrator run commands, distribute configuration files, and collect logs from the Master node, enabling centralized management and improving operational efficiency.

sh
ssh-keygen    # press Enter at every prompt to accept the defaults
ssh-copy-id node-18
ssh-copy-id node-19
ssh-copy-id node-20

8. Tune Kernel Parameters

Kubernetes has specific requirements for Linux kernel parameters, and sensible settings can significantly improve the cluster's network performance and stability. The tuning below enlarges the local port range, enables TCP connection reuse, and raises backlog and connection-tracking limits so the system can handle the network traffic of a large number of Pods and Services.

sh
tee /etc/sysctl.conf <<-"EOF"
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
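# note: tcp_tw_recycle was removed in Linux 4.12+ and can break clients behind NAT; drop the next line on newer kernels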
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30
net.core.somaxconn = 20480
net.core.netdev_max_backlog = 20480
net.ipv4.tcp_max_syn_backlog = 20480
net.ipv4.tcp_max_tw_buckets = 800000
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
EOF

# apply the settings (on every node)
sysctl -p

9. Install and Configure Containerd

Since Kubernetes 1.24, dockershim is no longer included, so a CRI-compatible container runtime must be installed directly. Containerd is an industry-standard container runtime that is lighter and more efficient than Docker. Here we configure it to use the systemd cgroup driver and point it at China-based mirror registries to speed up image pulls.

sh
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install containerd -y

# generate the default configuration file, then adjust it
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sed -i 's#sandbox_image =.*#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#' /etc/containerd/config.toml
sed -i '/.*plugins."io.containerd.grpc.v1.cri".registry.mirrors.*/ a\        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.k8s.io"]\n          endpoint = ["https://registry.aliyuncs.com/google_containers"]\n        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."k8s.gcr.io"]\n          endpoint = ["https://registry.aliyuncs.com/google_containers"]\n        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]\n          endpoint = ["https://usydjf4t.mirror.aliyuncs.com"]' /etc/containerd/config.toml

# enable at boot and start
systemctl restart containerd
systemctl enable containerd
systemctl status containerd

10. Install the Containerd Management Tool (crictl)

crictl is a CRI-compatible container runtime CLI, similar to the Docker CLI. It lets administrators inspect container status, pull images, and view logs directly on a node, and is an important tool for managing containerd-based Kubernetes nodes. Make sure the crictl version is compatible with your Kubernetes version.

sh
# download crictl; pick a version that supports your Kubernetes release
cd /root
VERSION="v1.26.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin

# create the configuration file (on every worker node)
vi /etc/crictl.yaml
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 10
debug: true

# list the images on the node
crictl images

# pull an image as a test
ctr image pull registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.3

II. Deploying Kubernetes

1. Configure the Kubernetes Yum Repository

The Kubernetes components are installed from a Yum repository. Using the Aliyun mirror avoids installation failures caused by network problems and significantly speeds up downloads. This step configures the repository on the Master node and then distributes it to all Worker nodes with scp, so every node uses the same source.

sh
# on the Master node (repository for kubeadm, kubelet, and kubectl)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# distribute the repo file to the other nodes (run on the Master node, node-18)
for i in 19 20; do scp /etc/yum.repos.d/kubernetes.repo node-$i:/etc/yum.repos.d/ ; done

2. Install the Core Kubernetes Components

Install the specified version (1.26.4) of kubelet, kubeadm, and kubectl on all nodes. kubelet is the core agent that runs on every node and manages Pods and containers; kubeadm is the cluster bootstrapping tool; kubectl is the cluster management CLI. Start the kubelet service after installation; it will stay in a waiting state (restarting repeatedly) until cluster initialization completes.

sh
# install the pinned version on every node
yum install -y kubelet-1.26.4 kubeadm-1.26.4 kubectl-1.26.4 --disableexcludes=kubernetes

# enable and start kubelet on every node
systemctl enable kubelet
systemctl start kubelet

3. Cluster Initialization Configuration

Before initializing the cluster, generate and edit the kubeadm configuration file. It defines the cluster's parameters, including the API Server address, the container runtime socket, the Pod and Service CIDRs, and the image registry. A correct configuration is the key to a successful initialization; in particular, make sure the container runtime socket path is right.

sh
# list the required images
kubeadm config images list

# generate the default configuration file
kubeadm config print init-defaults > /root/init-config.yaml

# edit the configuration file
vi /root/init-config.yaml
# adjust the parameters to match your environment
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdzf
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.91.18  # change to the Master node's IP address
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock  # path to containerd.sock (the unix:// scheme avoids the deprecation warning shown below)
  imagePullPolicy: IfNotPresent
  name: node-18   # change to the Master node's hostname
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers  # use the Aliyun image registry
kind: ClusterConfiguration
kubernetesVersion: 1.26.4  # the Kubernetes version to install
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"  # Pod CIDR
  serviceSubnet: 10.96.0.0/16  # Service CIDR
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
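
# (optional) pre-pull the control-plane images with this config before running kubeadm init
kubeadm config images pull --config /root/init-config.yaml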

4. Initialize the Kubernetes Control Plane

Run kubeadm init on the Master node to bootstrap the complete Kubernetes control plane. This generates the certificates, starts the static Pods (API Server, Controller Manager, Scheduler, etcd), and creates the kubeconfig files. On success it prints the command for joining nodes to the cluster and instructions for configuring kubectl.

sh
[root@node-18 ~]# kubeadm init --config=./init-config.yaml --upload-certs

# output
W0508 10:18:14.000745   11217 initconfiguration.go:305] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeadm.k8s.io", Version:"v1beta3", Kind:"InitConfiguration"}: the bootstrap token "abcdef.0123456789abcdefz" was not of the form "\\A([a-z0-9]{6})\\.([a-z0-9]{16})\\z"
the bootstrap token "abcdef.0123456789abcdefz" was not of the form "\\A([a-z0-9]{6})\\.([a-z0-9]{16})\\z"
To see the stack trace of this error execute with --v=5 or higher
[root@node-18 ~]# vim init-config.yaml 
[root@node-18 ~]# kubeadm init --config=./init-config.yaml --upload-certs
W0508 10:19:00.177331   11248 initconfiguration.go:119] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/run/containerd/containerd.sock". Please update your configuration!
[init] Using Kubernetes version: v1.26.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local node-18] and IPs [10.96.0.1 192.168.91.18]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost node-18] and IPs [192.168.91.18 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost node-18] and IPs [192.168.91.18 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig文件
[kubeconfig] Writing "kubelet.conf" kubeconfig文件
[kubeconfig] Writing "controller-manager.conf" kubeconfig文件
[kubeconfig] Writing "scheduler.conf" kubeconfig文件
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 10.003563 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
20410a49b0e3bed290f53a1ae344a6dd26b60964e9cfe1b5a6660513a4ab3762
[mark-control-plane] Marking the node node-18 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node node-18 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdzf
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.91.18:6443 --token abcdef.0123456789abcdzf \
        --discovery-token-ca-cert-hash sha256:87c1ef98da8e594d830d846e231f0a8118b41522d9db7eac15c701d58390e663

5. Configure the kubectl Client

After initialization succeeds, copy the admin kubeconfig file into the user's home directory so the cluster can be managed with kubectl. This file contains the certificate and key needed to access the API Server and grants cluster-administrator privileges.

sh
# run the commands from the init output above
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
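
# verify that kubectl can reach the API server
kubectl get nodes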

6. Join the Worker Nodes to the Cluster

Use the join command printed during initialization to add node-19 and node-20 to the cluster as worker nodes. The command makes the kubelet on each node register with the control plane and start accepting workloads. Right after joining, the nodes show as NotReady because no Pod network plugin has been installed yet.

sh
# run on node-19 and node-20
kubeadm join 192.168.91.18:6443 --token abcdef.0123456789abcdzf \
        --discovery-token-ca-cert-hash sha256:87c1ef98da8e594d830d846e231f0a8118b41522d9db7eac15c701d58390e663

# output
[preflight] Running pre-flight checks
        [WARNING Hostname]: hostname "node-19" could not be reached
        [WARNING Hostname]: hostname "node-19": lookup node-19 on 192.168.91.2:53: no such host
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

# check node status
# nodes become Ready only after a CNI plugin is installed
[root@node-18 ~]# kubectl get nodes
NAME      STATUS     ROLES           AGE   VERSION
node-18   NotReady   control-plane   93m   v1.26.4
node-19   NotReady   <none>          40s   v1.26.4
node-20   NotReady   <none>          37s   v1.26.4

7. Install the Flannel Network Plugin

A Kubernetes cluster needs a Pod network plugin (CNI) so Pods can communicate with each other. Flannel is a simple and reliable option: it assigns a subnet to each node and builds an overlay network. After Flannel is installed, all nodes switch to Ready and Pods can reach one another.

sh
# make sure the Pod CIDR in the manifest matches 10.244.0.0/16
# download from
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
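wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml   # download the manifest to the Master node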

[root@node-18 ~]# kubectl apply -f kube-flannel.yml 
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created

# check Pod status (this may take around 3 minutes)
[root@node-18 ~]#  kubectl get pods -A      
NAMESPACE      NAME                              READY   STATUS    RESTARTS      AGE          
kube-flannel   kube-flannel-ds-9cgzq             1/1     Running   0             47s
kube-flannel   kube-flannel-ds-gqwsh             1/1     Running   0             47s
kube-flannel   kube-flannel-ds-h8lcp             1/1     Running   0             47s
kube-system    coredns-567c556887-fnlj2          1/1     Running   0             114m          
kube-system    coredns-567c556887-sr2bn          1/1     Running   0             114m          
kube-system    etcd-node-18                      1/1     Running   0             114m          
kube-system    kube-apiserver-node-18            1/1     Running   0             114m          
kube-system    kube-controller-manager-node-18   1/1     Running   0             114m          
kube-system    kube-proxy-bjmcb                  1/1     Running   0             4m31s
kube-system    kube-proxy-k8cbc                  1/1     Running   0             4m42s
kube-system    kube-proxy-pxjfn                  1/1     Running   8 (10m ago)   21m          
kube-system    kube-scheduler-node-18            1/1     Running   0             114m   

# check node status
[root@node-18 ~]# kubectl get nodes
NAME      STATUS   ROLES           AGE    VERSION
node-18   Ready    control-plane   116m   v1.26.4
node-19   Ready    <none>          23m    v1.26.4
node-20   Ready    <none>          23m    v1.26.4

8. Verify IPVS Load Balancing

Service load balancing in Kubernetes is implemented by kube-proxy. With IPVS mode configured, the virtual services and real servers it creates can be inspected with ipvsadm. This confirms that Service traffic forwarding works and is also an important technique for troubleshooting Service networking issues.

sh
# list the forwarding rules (on any node)
[root@node-19 ~]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.96.0.1:443 rr
  -> 192.168.91.21:6443           Masq    1      0          0         
TCP  10.96.0.10:53 rr
TCP  10.96.0.10:9153 rr
UDP  10.96.0.10:53 rr

# check the proxy mode (on node-18)
[root@node-18 ~]# kubectl logs -n kube-system kube-proxy-sg8ln             
I1123 17:21:34.664421       1 node.go:172] Successfully retrieved node IP: 192.168.91.22
I1123 17:21:34.664516       1 server_others.go:142] kube-proxy node IP is an IPv4 address (192.168.91.22), assume IPv4 operation
I1123 17:21:34.691739       1 server_others.go:258] Using ipvs Proxier.

9. Test Basic Cluster Functionality

Deploying a simple Nginx application and exposing it as a NodePort Service exercises the cluster's basic functions end to end: scheduling (the Pod is assigned to a node), networking (the container runs in the Pod), service discovery (the Pod is reached through the Service), and load balancing (the kube-proxy rules take effect). This is the key step for verifying that the cluster works correctly.

sh
# test workload (run on node-18)
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc
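# note the NodePort assigned to the nginx Service (30146 in the sample output below), then curl <node-IP>:<NodePort>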

# access the test application
# do not use the Master node address (svc address in this sample: 192.168.91.21); use any Worker node address
[root@node-18 ~]# curl 192.168.91.22:30146

# list the containers running on the node
[root@node-19 ~]# crictl ps          
DEBU[0000] get runtime connection                       
DEBU[0000] get image connection                         
DEBU[0000] ListContainerResponse: [... verbose debug output omitted; set debug: false in /etc/crictl.yaml to suppress it ...]
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
64111ce7169d1       605c77e624ddb       7 minutes ago       Running             nginx               0                   10d49bbfd8018       nginx-748c667d99-9ksg2
edc56e6de02dc       a6c0cb5dbd211       9 minutes ago       Running             kube-flannel        0                   24c2ddbf9bd82       kube-flannel-ds-gqwsh
2bde5c57a9c61       b19f8eada6a93       13 minutes ago      Running             kube-proxy          0                   707b6aac52e76       kube-proxy-k8cbc

10. Cluster Reset and Cleanup

If you need to redeploy or re-test, the kubeadm reset command restores the node's state and removes the related configuration files and data, including the kubelet configuration, certificates, and etcd data, leaving a clean environment for a fresh deployment.

sh
# cleanup before reinstalling
kubeadm reset -f
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm -C
yum remove docker* -y
rm -rf /var/lib/etcd

III. Deploying Additional Components

1. Deploy the Kubernetes Dashboard

The Kubernetes Dashboard provides a web UI for managing the applications and resources in the cluster visually. After deployment, create a service account with appropriate permissions and obtain its access token in order to log in to the Dashboard from a browser. This gives non-CLI users a friendly way to manage the cluster.

sh
# install the Dashboard (on node-18)
# manifest: https://raw.githubusercontent.com/cby-chen/Kubernetes/main/yaml/dashboard.yaml
# RBAC file: https://raw.githubusercontent.com/cby-chen/Kubernetes/main/yaml/dashboard-user.yaml
[root@node-18 ~]# kubectl apply -f dashboard.yaml

# check the service ports
[root@node-18 ~]#  kubectl get services -n kubernetes-dashboard
NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
dashboard-metrics-scraper   ClusterIP   10.96.25.224   <none>        8000/TCP        48s
kubernetes-dashboard        NodePort    10.96.10.198   <none>        443:31000/TCP   48s

# create a user and grant permissions
[root@node-18 ~]# kubectl create serviceaccount admin-user -n kubernetes-dashboard
[root@node-18 ~]# kubectl create clusterrolebinding  \
admin-user --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:admin-user

# create a token
[root@node-18 ~]#  kubectl -n kubernetes-dashboard create token admin-user
eyJhbGciOiJSUzI1NiIsImtpZCI6ImFfanRoaFBRcDQ1SmR6U3ZFNy16YUJyLURDTGlXYXhkLWlWNHNRWFJYUWMifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjgzNTI1NDg3LCJpYXQiOjE2ODM1MjE4ODcsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiYzhlNmM1MDQtZDJmNi00NTBkLThjYzktYTU3ZmE5YzI3NDYxIn19LCJuYmYiOjE2ODM1MjE4ODcsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn0.GiVBS3OlTFuqKvgggenH13kWj4TqiEjj_nls-wNnye98wMrin-1864mT5BTJoBfinaPZNqylqaWHPfugV378ZAYNXLp0JEeJya65n8EMarEka6WOb5nC6B8GhzfDZuaZTkiSrUFxJyigDfl_17QHeaNpDX8szMat1bNDEBk1LHz3dSs8fQ-GkqV6WgUXFe2QR8sreMHGVPrl70SoQ3jWaJqVUmZE-zGaZJPUYHjVZwoH8Qxx5eAfX4tUPT7RGZm3ILNTbKZ8TAV3BgwhItbFHzb8NrEW48uo6eO_s_qgsnkNiM8HqVE4IQeRfGPGYh1-sEaMdFqpgp_t5cSIdUuV-w

# open in a browser
https://192.168.91.19:31000

2. Deploy Metrics-Server for Resource Monitoring

Metrics Server is a cluster-wide aggregator of resource usage data. It collects CPU and memory metrics from the kubelet on each node and backs the kubectl top command and the Dashboard's resource charts. It is also the foundation for HPA (Horizontal Pod Autoscaler) automatic scaling.

sh
# install the Metrics Server
# 1. modify the API server (on the Master)
# add the following flag
vi /etc/kubernetes/manifests/kube-apiserver.yaml
...
spec:
  containers:
  - command:
...
  - --enable-aggregator-routing=true

# after the change, the apiserver (port 6443) restarts automatically; wait about 3 minutes

# 2. deploy Metrics Server
# download: https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/components.yaml
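# note: on kubeadm clusters the kubelet serving certificates are usually self-signed, so metrics-server may stay unready;
# a common workaround (assess the security impact first) is to add the following flag to the metrics-server container args in components.yaml:
#   - --kubelet-insecure-tls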
[root@node-18 ~]# kubectl apply -f components.yaml

# 3. check the metrics
[root@node-18 ~]# kubectl top nodes
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
node-18   263m         13%    1126Mi          59%       
node-19   46m          2%     630Mi           16%       
node-20   36m          1%     908Mi           23%

# refresh the Dashboard to see the resource graphs

# 4. grant log-viewing permission
kubectl create clusterrolebinding system:anonymous --clusterrole=cluster-admin --user=system:anonymous
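# warning: the binding above grants cluster-admin to unauthenticated (anonymous) users; only do this in an isolated lab environment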

3. Deploy the Ingress-Nginx Ingress Controller

An Ingress is the API object that manages external access to Services inside the cluster, typically providing HTTP routing, load balancing, and TLS termination. The ingress-nginx controller implements the Ingress rules. Once deployed, external traffic can be routed to different backend Services by hostname and path, providing flexible layer-7 load balancing.

sh
# ingress
# install the Kubernetes community ingress-nginx controller
mkdir ~/nginx-ingress
cd ~/nginx-ingress

# download the deployment manifest (namespace, service account, controller, etc.)
wget https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.7.1/deploy/static/provider/cloud/deploy.yaml

# label the node
kubectl label  node  node-19  ingress=true

# import the images (from pre-downloaded tarballs)
ctr -n=k8s.io images import kube-webhook-certgen.1.7.1.tar.gz
ctr -n=k8s.io images import ingress.nginx.controller.v1.7.1.tar.gz

# deploy ingress-nginx
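# note: ingress-nginx.yaml is assumed to be the deploy.yaml downloaded above after local customization
#       (e.g. a nodeSelector matching ingress=true and image references pointing at the imported tarballs)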
kubectl apply -f ingress-nginx.yaml

# check the Pods
[root@node-18 nginx-ingress]# kubectl get pods -n ingress-nginx -o wide
NAME                                        READY   STATUS      RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
ingress-nginx-admission-create-w8c5s        0/1     Completed   0          2m47s   10.244.1.6      node-19   <none>           <none>
ingress-nginx-admission-patch-7rmn8         0/1     Completed   0          2m47s   10.244.2.12     node-20   <none>           <none>
ingress-nginx-controller-57d5c8598b-zxn2x   1/1     Running     0          2m47s   192.168.91.19   node-19   <none>           <none>

# -- test access to the ingress controller (note which node it runs on)
[root@node-18 nginx-ingress]# curl 192.168.91.19
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

# -- test Ingress routing
vi nginx-ingress-test.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-test
  annotations:
    kubernetes.io/ingress.class: "nginx"
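    # note: with ingress-nginx v1.x, spec.ingressClassName: nginx is the preferred replacement for this annotation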
spec:
  rules:
  - host: nginx.hostscc.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx
            port: 
              number: 80
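
# create the Ingress resource
kubectl apply -f nginx-ingress-test.yaml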

# add a hosts entry
[root@node-18 ~]# cat /etc/hosts
# add the following line
192.168.91.19 nginx.hostscc.com

# access via the domain name
[root@node-18 ~]# curl nginx.hostscc.com
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

IV. Summary

Following this guide, we deployed a fully functional Kubernetes 1.26.4 cluster. The process covered everything from low-level system tuning and container runtime configuration, through core cluster initialization and the networking solution, up to observability (Dashboard, Metrics) and the traffic entry point (Ingress). The cluster uses containerd as the container runtime for better performance and lower resource usage, Flannel for a stable Pod network, and IPVS mode for better Service performance, and integrates the Dashboard web UI and Metrics-Server resource monitoring, providing a complete foundation for application deployment and operations.

Key points to watch during deployment include keeping all node clocks synchronized, configuring the correct container runtime socket, choosing suitable Pod and Service CIDRs, and installing the CNI network plugin promptly. This guide provides detailed steps and configuration examples; adjust the IP addresses, hostnames, and other parameters to match your environment. After deployment, a simple Nginx application test verified the cluster's core functions, including Pod scheduling, Service discovery, and network connectivity.

This cluster setup is suitable for development and testing, learning, and small to medium production environments. For production, consider adding a highly available control plane, a more robust network solution (such as Calico), persistent storage, and security hardening.
