This tutorial walks through deploying Kubernetes with kubeadm in detail. We purchased five ECS servers on Alibaba Cloud, and the VPC we created uses the CIDR block 172.16.0.0/12.
I. Install Docker and Docker Compose
Installing Docker also installs the container runtime containerd by default, and Kubernetes supports containerd as its runtime.
1. Install Docker
bash
sudo yum install -y docker-ce-24.0.6 docker-ce-cli-24.0.6 containerd.io
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<EOF
{
"registry-mirrors": [
"https://docker.m.daocloud.io",
"https://hfxvwdfx.mirror.aliyuncs.com",
"https://dockerproxy.com",
"https://proxy.1panel.live",
"https://dockerproxy.cn",
"https://hub1.nat.tf",
"https://docker.ketches.cn",
"https://hub1.nat.tf",
"https://hub2.nat.tf",
"https://docker.6252662.xyz"
],
"insecure-registries" : ["https://ude7leho.mirror.aliyuncs.com"],
"log-driver": "json-file",
"log-opts": {"max-size": "100m", "max-file": "1"}
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl enable docker
sudo systemctl start docker
sudo docker version --format '{{.Server.Version}}'
Docker's default bridge subnet (172.17.0.0/16) overlaps with the VPC CIDR (172.16.0.0/12), so the Docker address ranges must be changed.
The two existing servers that were already running Docker containers are configured with the following subnets:
json
{
"bip": "192.168.0.1/24",
"default-address-pools": [
{"base": "10.10.0.0/16","size": 24}
]
}
{
"bip": "192.168.1.1/24",
"default-address-pools": [
{"base": "10.11.0.0/16","size": 24}
]
}
Assign node1 through node5 the bip addresses 192.168.2.1/24 through 192.168.6.1/24 respectively, and modify the Docker configuration accordingly.
Add bip and default-address-pools to /etc/docker/daemon.json:
json
"bip": "192.168.6.1/24",
"default-address-pools": [
{
"base": "10.11.0.0/16",
"size": 24
}
]
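A minimal check, after restarting Docker with the daemon.json above, that the bridge no longer overlaps the VPC (the --format expression simply prints the first IPAM subnet of the default bridge):
bash
sudo systemctl restart docker
# should print the bip subnet configured above (e.g. 192.168.6.0/24), not 172.17.0.0/16
docker network inspect bridge --format '{{(index .IPAM.Config 0).Subnet}}'
ip -4 addr show docker0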
2. Install docker-compose
bash
curl -SL https://github.com/docker/compose/releases/download/v2.18.1/docker-compose-$(uname -s)-$(uname -m) -o docker-compose
mv docker-compose /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose --version
II. Install Kubernetes
A Kubernetes cluster created by kubeadm depends on software that uses kernel features.
This software includes, but is not limited to, the container runtime, the kubelet, and the Container Network Interface (CNI) plugin.
Node inventory
| IP (internal) | hostname | spec | role |
| --- | --- | --- | --- |
| 172.22.162.243 | k8s-node1 | 4C/16G/100G | worker |
| 172.22.162.245 | k8s-node2 | 4C/16G/100G | worker |
| 172.22.162.244 | k8s-node3 | 4C/8G/100G | master |
| 172.22.162.242 | k8s-node4 | 4C/8G/100G | master |
| 172.22.162.241 | k8s-node5 | 4C/8G/100G | master |
Configure /etc/hosts
bash
cat >> /etc/hosts << EOF
172.22.162.243 k8s-node1
172.22.162.245 k8s-node2
172.22.162.244 k8s-node3
172.22.162.242 k8s-node4
172.22.162.241 k8s-node5
172.22.162.241 cluster-endpoint
EOF
- The mapping pair `master-ip cluster-endpoint` corresponds to the `--control-plane-endpoint=cluster-endpoint` flag passed to `kubeadm init`; this flag must be specified when building a highly available cluster.
- Once the HA cluster has been deployed successfully, the mapped IP can be changed to a load-balancer address (see the sketch below).
- For now, the mapped IP is the address of the first master node to be deployed.
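A minimal sketch of repointing cluster-endpoint at a load balancer later on; 172.22.162.100 is only a placeholder address, not part of this setup:
bash
# run on every node once the LB is in place (172.22.162.100 is hypothetical)
sed -i 's/^172\.22\.162\.241 cluster-endpoint$/172.22.162.100 cluster-endpoint/' /etc/hosts
grep cluster-endpoint /etc/hosts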
1. Prerequisites
- Linux hosts with 2 GB+ of RAM each
- Control-plane nodes need 2+ CPU cores
- Full network connectivity between all machines (internal or public network both work)
- No duplicate hostnames, MAC addresses, or product_uuids among the nodes
- Certain ports must be open on the machines
- Swap must be dealt with: by default the kubelet fails to start if swap is detected on a node
All of the following steps must be executed on every node.
2. Check that MAC addresses and product_uuids are unique
Physical hardware has unique addresses, but some virtual machines may have duplicates. If these values are not unique on every node, the installation may fail. A cross-node check is sketched after the commands below.
bash
# check MAC addresses
ip link
ifconfig -a
# check product_uuid
cat /sys/class/dmi/id/product_uuid
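A minimal sketch for comparing these values across all five nodes from a single host; it assumes root SSH access to each node and that the primary NIC is eth0 (the usual case on Alibaba Cloud ECS):
bash
# print product_uuid and the primary NIC's MAC address for every node; any duplicates are a problem
for h in k8s-node1 k8s-node2 k8s-node3 k8s-node4 k8s-node5; do
  echo -n "$h  "
  ssh root@$h 'printf "%s  %s\n" "$(cat /sys/class/dmi/id/product_uuid)" "$(cat /sys/class/net/eth0/address)"'
done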
3. Check the required ports
- Port 6443 must be open for inbound requests to the Kubernetes API server.
- The firewall is currently disabled on all servers, so every port is reachable on the internal network. A quick connectivity check is sketched below.
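A minimal connectivity check using bash's built-in /dev/tcp (no extra tools needed); the 6443 probe is only meaningful once the API server is running on the first master:
bash
# probe the API server port on the first master from any other node
timeout 3 bash -c '</dev/tcp/172.22.162.241/6443' && echo "6443 reachable" || echo "6443 unreachable"
# list the ports this node is currently listening on
ss -lntp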
4. Configure kernel parameters: forward IPv4 and let iptables see bridged traffic
- Enable the overlay filesystem module; Kubernetes needs it to manage container images and storage.
- Make Linux bridged networking subject to iptables firewall rules. Linux bridges (such as the virtual bridge cni0 created by Docker or Kubernetes) connect the networks of multiple containers/Pods into a local network. By default, bridged traffic does not pass through the host's iptables chains.
- iptables is the Linux firewall tool; Kubernetes uses it for Service load balancing, network policies, and node traffic forwarding.
- If bridged traffic bypasses iptables, Kubernetes Services and network policies stop working. The two bridge-nf-call settings below force bridged traffic through the iptables/ip6tables chains.
bash
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# set the required sysctl parameters; they persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
# enable IPv4 packet forwarding in the kernel so the host can act as a router and forward traffic between interfaces
net.ipv4.ip_forward = 1
EOF
# apply the sysctl parameters without rebooting
sudo sysctl --system
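A quick verification that the modules are loaded and the sysctl values took effect:
bash
# both modules should appear
lsmod | grep -E 'overlay|br_netfilter'
# all three values should print 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward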
5. Configure the yum repos and install common packages
yum install -y vim bash-completion net-tools gcc
6. Time synchronization
bash
yum install -y chrony
# change the time server to the National Time Service Center NTP server
sed -i 's/^pool 2.centos.pool.ntp.org iburst$/pool ntp.ntsc.ac.cn iburst/' /etc/chrony.conf
systemctl start chronyd
systemctl enable chronyd
# show the time servers being used
chronyc sources
7. Disable the firewall
bash
systemctl stop firewalld
systemctl disable firewalld
8. Disable the swap partition
bash
# takes effect immediately (until reboot)
swapoff -a
# to make it permanent, edit /etc/fstab and remove (or comment out) a line like the following; see the sketch below
/dev/mapper/cl-swap swap swap defaults 0 0
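A minimal sketch that comments out the swap entry in /etc/fstab instead of deleting it by hand, then verifies that swap is off:
bash
# comment out any fstab line that mounts a swap device
sed -ri '/\sswap\s/s/^/#/' /etc/fstab
# both commands should show no active swap
free -h
swapon --show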
9. Disable SELinux
bash
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
10. Configure SSH mutual trust
ssh-copy-id copies the local SSH public key to a remote host (e.g. k8s-node1) so that you can log in without a password.
bash
# generate the SSH key pair
ssh-keygen
# populate authorized_keys (run on each node, copying the key to all of the other nodes)
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node3
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node4
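A minimal loop that does the same thing for all five hosts in one pass, skipping the node it runs on; it assumes each machine's hostname has already been set to its k8s-nodeX name and still prompts for each password:
bash
for h in k8s-node1 k8s-node2 k8s-node3 k8s-node4 k8s-node5; do
  [ "$h" = "$(hostname)" ] && continue   # skip the local node
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@$h
done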
11. containerd configuration
Switch the runtime to containerd.
⚠️ Note: containerd is installed alongside Docker, but their images are stored separately:
- Docker images are stored in /var/lib/docker, containerd images in /var/lib/containerd
- A Docker image can be exported manually and imported into containerd; the format is the same (OCI). A sketch follows below.
bash
crictl config runtime-endpoint unix:///var/run/containerd/containerd.sock
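A minimal sketch of moving an image from Docker into containerd's k8s.io namespace (the namespace that kubelet and crictl use); nginx:1.25 is only a placeholder image name:
bash
# export the image from Docker (nginx:1.25 is just an example)
docker save nginx:1.25 -o /tmp/nginx.tar
# import it into the k8s.io namespace so Kubernetes can use it
ctr -n k8s.io images import /tmp/nginx.tar
crictl images | grep nginx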
Modify the containerd config file; if it does not exist, generate the default config first:
bash
containerd config default > /etc/containerd/config.toml
Edit the config file: vim /etc/containerd/config.toml
toml
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.k8s.io"]
endpoint = ["https://docker.6252662.xyz","https://docker.m.daocloud.io","https://dockerproxy.com"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://docker.6252662.xyz","https://docker.m.daocloud.io","https://dockerproxy.com"]
Change the sandbox (pause) image address
bash
grep sandbox_image /etc/containerd/config.toml
# pick the sed command that matches the output of the grep above
sed -i "s#k8s.gcr.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
# or
sed -i "s#registry.k8s.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
# confirm the change
grep sandbox_image /etc/containerd/config.toml
# after applying all the changes, restart containerd
systemctl restart containerd
# check that the registry configuration is correct
crictl info | grep -A 20 "registry"
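A hedged sanity check that containerd can actually pull through the configured mirrors (busybox is just a small, convenient test image):
bash
crictl pull docker.io/library/busybox:latest
crictl images | grep busybox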
12. Switch to the systemd cgroup driver
The official recommendation is the systemd cgroup driver. SystemdCgroup defaults to false (the cgroupfs driver); set it to true to use the systemd driver.
bash
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
# after applying all the changes, restart containerd
systemctl restart containerd
13. Install kubeadm, kubelet, and kubectl
- kubeadm: the command used to bootstrap the cluster.
- kubelet: runs on every node in the cluster and is responsible for starting Pods and containers.
- kubectl: the command-line tool for talking to the cluster.
Configure the Kubernetes yum repo
bash
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
Install:
bash
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
systemctl enable --now kubelet
⚠️ Note: the kubelet now restarts every few seconds. This is normal; it is stuck in a crash loop waiting for kubeadm to tell it what to do.
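Before running kubeadm init, it is worth confirming which versions the repo actually installed, since the init command below pins a specific Kubernetes version:
bash
# these should be consistent with the --kubernetes-version passed to kubeadm init
kubeadm version -o short
kubelet --version
kubectl version --client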
14. Create the cluster with kubeadm
Run kubeadm init on master1, i.e. the node that cluster-endpoint points to:
bash
# --apiserver-advertise-address sets the address the API server on this control-plane node advertises.
# --image-repository specifies where to pull images from.
# --control-plane-endpoint sets a shared endpoint for all control-plane nodes. If it is not set, a single-control-plane kubeadm cluster cannot later be upgraded to high availability.
# --upload-certs uploads the certificates shared among control-plane instances to the cluster; they can be reused when joining the other master nodes later.
# --kubernetes-version specifies the Kubernetes version.
# --pod-network-cidr specifies the Pod network range. Different CNIs have different default CIDRs; Calico's default is 192.168.0.0/16.
kubeadm init \
--apiserver-advertise-address=172.22.162.241 \
--image-repository registry.aliyuncs.com/google_containers \
--control-plane-endpoint=cluster-endpoint \
--upload-certs \
--kubernetes-version v1.31.9 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=192.168.0.0/16 \
--v=5
After a successful initialization, the output looks like the following; save the kubeadm join strings:
text
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
--discovery-token-ca-cert-hash sha256:87219069b3e533e86dc28d62446daebe0628b76703aa740f149b032fa7690b6a \
--control-plane --certificate-key ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
--discovery-token-ca-cert-hash sha256:ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf
- cluster-endpoint:6443: the cluster API server endpoint
- token: the authentication token for joining the cluster, valid for 24 hours
- discovery-token-ca-cert-hash: the SHA-256 hash of the CA certificate, used to verify the API server's identity
- control-plane: the joining node is a control-plane node
- certificate-key: used to fetch the control-plane certificates, expires after 2 hours
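If the join output has been lost, the token and CA hash can be recovered on an existing master; the pipeline below is the standard openssl way of deriving discovery-token-ca-cert-hash from the cluster CA:
bash
# list existing bootstrap tokens and their expiry
kubeadm token list
# recompute the discovery-token-ca-cert-hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'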
Manually regenerating the join command
Get the worker join command:
bash
kubeadm token create --print-join-command
Get the master join command:
First get the worker join string, then append --control-plane --certificate-key <key>, where <key> comes from kubeadm init phase upload-certs --upload-certs.
If the token has expired:
bash
kubeadm token create # yqiri0.oxybs1qfy7trixy3
kubeadm token create --print-join-command # kubeadm join cluster-endpoint:6443 --token 0tx28e.kskomhxaz21efutu --discovery-token-ca-cert-hash sha256:87219069b3e533e86dc28d62446daebe0628b76703aa740f149b032fa7690b6a
If the certificate-key has expired:
bash
kubeadm init phase upload-certs --upload-certs
kubeadm token create --print-join-command
On master1, configure kubectl's kubeconfig and the environment variable
bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# set the environment variable
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
Check the node status: node5 is NotReady
bash
[root@k8s-node5 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-node5 NotReady control-plane 115s v1.28.2
Check the kubelet logs for the cause of the error:
bash
journalctl -xeu kubelet
# Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Install the CNI network plugin: Calico
You must deploy a Container Network Interface (CNI) based Pod network add-on so that your Pods can communicate with each other. Cluster DNS (CoreDNS) will not start until a network is installed.
- Install it with kubectl apply.
- After installing the Pod network, confirm that the CoreDNS Pod is Running. Once the CoreDNS Pod is up and running, you can continue joining nodes.
- The Kubernetes version we installed is 1.31; the compatible Calico version is 3.29.5.
Download the Calico 3.29.5 YAML
bash
wget https://raw.githubusercontent.com/projectcalico/calico/v3.29.5/manifests/calico.yaml
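# (hedged check) Calico's default pool is 192.168.0.0/16, which matches the --pod-network-cidr used above;
# CALICO_IPV4POOL_CIDR is commented out in calico.yaml and only needs editing if your pod CIDR differs
grep -n -A1 "CALICO_IPV4POOL_CIDR" calico.yaml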
kubectl create -f calico.yaml
kubectl get pod -n kube-system -o wide
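A minimal way to confirm the network plugin came up and the node leaves NotReady (pod names will differ, and image pulls can take a few minutes):
bash
# calico-node, calico-kube-controllers and coredns should all reach Running
kubectl get pods -n kube-system -o wide
# once the CNI is initialized, the control-plane node should report Ready
kubectl get nodes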
If the installation fails, delete all of the Calico resources and then re-run the installation commands above.
bash
# delete the DaemonSet and Deployment
kubectl delete daemonset calico-node -n kube-system
kubectl delete deployment calico-kube-controllers -n kube-system
# delete the ServiceAccounts
kubectl delete serviceaccount calico-kube-controllers -n kube-system
kubectl delete serviceaccount calico-node -n kube-system
kubectl delete serviceaccount calico-cni-plugin -n kube-system
# delete the ClusterRoles and ClusterRoleBindings
kubectl delete clusterrole calico-kube-controllers
kubectl delete clusterrole calico-node
kubectl delete clusterrole calico-cni-plugin
kubectl delete clusterrole calico-tier-getter
kubectl delete clusterrolebinding calico-kube-controllers
kubectl delete clusterrolebinding calico-node
kubectl delete clusterrolebinding calico-cni-plugin
kubectl delete clusterrolebinding calico-tier-getter
# delete the ConfigMap
kubectl delete configmap calico-config -n kube-system
# delete the PodDisruptionBudget
kubectl delete poddisruptionbudget calico-kube-controllers -n kube-system
# delete the Calico CRDs
kubectl delete crd \
bgpconfigurations.crd.projectcalico.org \
bgpfilters.crd.projectcalico.org \
bgppeers.crd.projectcalico.org \
blockaffinities.crd.projectcalico.org \
caliconodestatuses.crd.projectcalico.org \
clusterinformations.crd.projectcalico.org \
felixconfigurations.crd.projectcalico.org \
globalnetworkpolicies.crd.projectcalico.org \
globalnetworksets.crd.projectcalico.org \
hostendpoints.crd.projectcalico.org \
ipamblocks.crd.projectcalico.org \
ipamconfigs.crd.projectcalico.org \
ipamhandles.crd.projectcalico.org \
ippools.crd.projectcalico.org \
ipreservations.crd.projectcalico.org \
kubecontrollersconfigurations.crd.projectcalico.org \
networkpolicies.crd.projectcalico.org \
networksets.crd.projectcalico.org \
tiers.crd.projectcalico.org \
adminnetworkpolicies.policy.networking.k8s.io
Alternatively, Calico can be reinstalled via the Tigera operator manifests:
bash
kubectl apply -f tigera-operator.yaml
kubectl apply -f custom-resources.yaml
Join node3 and node4 to the cluster (as control-plane nodes)
The certificate-key may have expired (it is only valid for 2 hours); in that case, regenerate it on master1, which has already joined the cluster:
bash
kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
--discovery-token-ca-cert-hash sha256:87219069b3e533e86dc28d62446daebe0628b76703aa740f149b032fa7690b6a \
--control-plane --certificate-key ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf
- Calico only needs to be installed once for the whole cluster.
- But kubectl still needs to be initialized on the newly joined masters (see the sketch below).
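A minimal sketch of that kubectl initialization on the newly joined masters (node3/node4), mirroring what was done on master1:
bash
# run on k8s-node3 and k8s-node4 after the control-plane join succeeds
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes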
Join node1 and node2 to the cluster as workers
bash
kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
--discovery-token-ca-cert-hash sha256:ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf
- Worker nodes do not need kubectl; its initialization can be skipped, and worker nodes cannot list the cluster's nodes anyway.
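Back on a master, a minimal check that all five nodes have joined, plus an optional label so the workers show something in the ROLES column (the label is a common convention, not a requirement):
bash
kubectl get nodes -o wide
# optional: label the worker nodes
kubectl label node k8s-node1 node-role.kubernetes.io/worker=
kubectl label node k8s-node2 node-role.kubernetes.io/worker=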
Install KubeSphere
Install Helm
- Helm must be installed on a master node.
1. Download the Helm release package: github.com/helm/helm/r...
2. After extracting it, move the binary: mv linux-amd64/helm /usr/local/bin/helm
3. Run helm version to check that the installation succeeded.
4. Change the chart repository; Microsoft's Azure mirror is recommended:
bash
helm repo remove stable
helm repo add stable http://mirror.azure.cn/kubernetes/charts/
helm repo update
Install KubeSphere
bash
helm upgrade --install \
-n kubesphere-system \
--create-namespace \
ks-core \
https://charts.kubesphere.com.cn/main/ks-core-1.1.4.tgz \
--debug
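A hedged way to watch the KubeSphere installation and find the console entry point afterwards; the ks-console NodePort is commonly 30880, but the service output below is authoritative:
bash
# wait for the KubeSphere core pods to become Ready
kubectl get pods -n kubesphere-system -w
# find the NodePort the console is published on
kubectl get svc ks-console -n kubesphere-system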