[k8s in practice] Setting up a Kubernetes cluster with kubeadm

Environment requirements

Three Linux VMs with internet access. I used Rocky Linux 9.4 as the distribution.

Minimum specs per VM: 2 CPUs, 2 GB RAM, 20 GB disk.

The IPs and hostnames of the three VMs:

k8s-master 192.168.52.200 (control-plane / master node)
k8s-node1 192.168.52.201 (worker node 1)
k8s-node2 192.168.52.202 (worker node 2)

0. Preparation

Disable the firewall

systemctl stop firewalld
systemctl disable firewalld

Disable SELinux

setenforce 0
sed --follow-symlinks -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
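
A quick check that SELinux is really off (getenforce should print Permissive now, and Disabled after a reboot):

getenforce
sestatus | grep -i mode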

Set the hostnames

On the master node:

hostnamectl set-hostname k8s-master

On worker node 1:

hostnamectl set-hostname k8s-node1

On worker node 2:

hostnamectl set-hostname k8s-node2

Update the hosts file

On every machine, add the IPs and hostnames of all three VMs to /etc/hosts:

192.168.52.200 k8s-master 
192.168.52.201 k8s-node1 
192.168.52.202 k8s-node2 
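
A minimal sketch that appends all three entries in one go (run on each machine; it assumes the entries are not already present):

cat >> /etc/hosts <<EOF
192.168.52.200 k8s-master
192.168.52.201 k8s-node1
192.168.52.202 k8s-node2
EOF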

Disable swap

By default, kubelet refuses to start when it detects swap enabled on the node, so disable swap first. To disable it temporarily:

swapoff -a

To disable swap permanently, edit /etc/fstab and remove the swap line.
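
A hedged one-liner that comments the swap entry out instead of deleting it (it assumes a standard fstab where the swap line contains the word "swap"):

sed -ri 's/^([^#].*\bswap\b.*)$/#\1/' /etc/fstab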

Load kernel modules and set kernel parameters

Load the modules for the current boot (overlay is needed by containerd and br_netfilter by the bridge sysctls below; ip_vs_rr would only be needed if you later switch kube-proxy to IPVS mode, which this guide does not use):

modprobe overlay
modprobe br_netfilter

To load them automatically at every boot, create /etc/modules-load.d/k8s.conf (e.g. with vim) containing:

overlay
br_netfilter

Then set the kernel parameters:

cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

sysctl -p /etc/sysctl.d/k8s.conf
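
Verify that the module and the sysctls took effect (both values should print 1):

lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward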

1. Install containerd, kubeadm, kubelet, kubectl, and calico

Machines in mainland China should switch YUM to a domestic mirror first:

sed -e 's|^mirrorlist=|#mirrorlist=|g' \
    -e 's|^#baseurl=http://dl.rockylinux.org/$contentdir|baseurl=https://mirrors.aliyun.com/rockylinux|g' \
    -i.bak \
    /etc/yum.repos.d/[Rr]ocky-*.repo
dnf makecache

Install containerd on all machines:

dnf config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo # Aliyun mirror for users in China
#dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf install -y containerd.io

Configure containerd

Generate a default config file:

containerd config default > /etc/containerd/config.toml

Then edit /etc/containerd/config.toml:

  • Set SystemdCgroup to true (note the exact spelling: SystemdCgroup)
  • Set sandbox_image to registry.aliyuncs.com/google_containers/pause:3.9 (this step is for users in mainland China, swapping the pause image for Aliyun's mirror); a sed sketch for both edits follows below
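
A hedged sketch of those two edits with sed, assuming the stock key layout of a containerd 1.7 default config (check the file afterwards):

sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#' /etc/containerd/config.toml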

Start containerd:

systemctl daemon-reload
systemctl enable --now containerd

Check the containerd and runc versions:

# ctr version
Client:
  Version:  1.7.24
# runc -v
runc version 1.2.2

Install kubeadm, kubelet, and kubectl on all machines

First add the Kubernetes repo:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Then install kubelet, kubeadm, and kubectl with yum:

yum install -y kubelet kubeadm kubectl
systemctl enable --now kubelet
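
This repo tracks the v1.28 line, so a plain install resolves to v1.28.2 here; if you would rather pin the versions explicitly (a hedged variant, assuming the same repo), run instead:

yum install -y kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2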

Check the kubelet, kubeadm, and kubectl versions:

kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.2", GitCommit:"89a4ea3e1e4ddd7f7572286090359983e0387b2f", GitTreeState:"clean", BuildDate:"2023-09-13T09:34:32Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}

[root@k8s-master yum.repos.d]# kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
The connection to the server localhost:8080 was refused - did you specify the right host or port?

[root@k8s-master yum.repos.d]# kubelet --version
Kubernetes v1.28.2

Note: kubelet now restarts every few seconds, stuck in a crash loop waiting for instructions from kubeadm; this is expected. (The "connection to the server localhost:8080 was refused" message from kubectl is likewise normal, since no cluster exists yet.) The next step is to run kubeadm init on the master node to initialize the cluster.
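
If you want to watch the crash loop for yourself, follow the kubelet logs:

journalctl -u kubelet -f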

2. Initialize the Kubernetes cluster

Run kubeadm init on the master node:

kubeadm init --apiserver-advertise-address 192.168.52.200 \
             --image-repository registry.aliyuncs.com/google_containers \
             --kubernetes-version v1.28.2 \
             --pod-network-cidr=198.18.0.0/16

Flag reference:

--apiserver-advertise-address: the address the API server advertises on; use the master node's IP.

--image-repository: users in mainland China must point this at Aliyun's registry, because the default registry is unreachable from China.

--kubernetes-version: the Kubernetes version to deploy.

--pod-network-cidr=198.18.0.0/16: the IP range Pods will use; pick any range that does not clash with the rest of your network.
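
Optionally, you can pre-pull the control-plane images beforehand so the init step itself goes faster:

kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.28.2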

kubeadm init itself takes several minutes, so be patient. On success it prints the following, telling you what to do next:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.52.200:6443 --token zk8fth.5psohqfk9lomq0tw \
        --discovery-token-ca-cert-hash sha256:f32851bb6a86cc7f0a394f1d77e1db5b217cde1b0f40909ee3916959519173f7

I am running as root, so per the hint above I only need to export the KUBECONFIG environment variable:

Edit /etc/profile and append one line:

export KUBECONFIG=/etc/kubernetes/admin.conf

Make it take effect immediately:

source /etc/profile

At this point kubectl can already list the pods below, but the coredns pods are not running. The next step is to install a network plugin on the master node.

kubectl get pods -A
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   coredns-66f779496c-8ctsc             0/1     Pending   0          3m11s
kube-system   coredns-66f779496c-hx76v             0/1     Pending   0          3m11s
kube-system   etcd-k8s-master                      1/1     Running   0          3m24s
kube-system   kube-apiserver-k8s-master            1/1     Running   0          3m24s
kube-system   kube-controller-manager-k8s-master   1/1     Running   0          3m24s
kube-system   kube-proxy-89c9k                     1/1     Running   0          3m11s
kube-system   kube-scheduler-k8s-master            1/1     Running   0          3m24s

3. Install the calico network plugin on the master node

For the network plugin you can choose calico or flannel; I went with calico, installing the latest version.

For the calico install I followed the official quickstart: https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart

1. Install the Tigera Calico operator and custom resource definitions.

yum install -y wget
wget https://raw.githubusercontent.com/projectcalico/calico/v3.29.1/manifests/tigera-operator.yaml
kubectl create -f tigera-operator.yaml
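
Before moving on, you can confirm the operator pod itself came up:

kubectl get pods -n tigera-operator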

2. Install Calico by creating the necessary custom resource.

wget https://raw.githubusercontent.com/projectcalico/calico/v3.29.1/manifests/custom-resources.yaml
# edit custom-resources.yaml: change the cidr to 198.18.0.0/16 so it matches --pod-network-cidr
kubectl create -f custom-resources.yaml
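
A hedged one-liner for that cidr edit, assuming the manifest still ships with calico's default 192.168.0.0/16 (run it before the create):

sed -i 's#192.168.0.0/16#198.18.0.0/16#' custom-resources.yaml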

3. Confirm that all of the pods are running with the following command.

watch kubectl get pods -n calico-system

Wait until each pod has the STATUS of Running.

4. Remove the taints on the control plane so that you can schedule pods on it.

kubectl taint nodes --all node-role.kubernetes.io/control-plane-
node/k8s-master untainted

5. Confirm that you now have a node in your cluster with the following command.

kubectl get nodes -o wide
NAME         STATUS   ROLES           AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                      KERNEL-VERSION                 CONTAINER-RUNTIME
k8s-master   Ready    control-plane   5h16m   v1.28.2   10.206.216.96   <none>        Rocky Linux 9.4 (Blue Onyx)   5.14.0-427.13.1.el9_4.x86_64   containerd://1.7.24

For users in mainland China, the most common problem is pods failing to start because their images cannot be pulled. Run kubectl describe on the failing pod to find the image that failed, pull it from a domestic mirror, and retag it.
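
For example (the pod name is a placeholder; the Events section at the bottom of the output names the image that failed to pull):

kubectl describe pod <pod-name> -n calico-system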

The images calico needs are:

[root@k8s-master ~]# ctr -n k8s.io image list | awk '{print $1}'
REF
docker.io/calico/apiserver:v3.29.1
docker.io/calico/cni:v3.29.1
docker.io/calico/csi:v3.29.1
docker.io/calico/kube-controllers:v3.29.1
docker.io/calico/node-driver-registrar:v3.29.1
docker.io/calico/node:v3.29.1
docker.io/calico/pod2daemon-flexvol:v3.29.1
docker.io/calico/typha:v3.29.1
quay.io/tigera/operator:v1.36.2

A domestic mirror site: https://docker.aityp.com

Example: manually pull the docker.io/calico/kube-controllers:v3.29.1 image, then retag it:

ctr -n k8s.io images pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/kube-controllers:v3.29.1
ctr -n k8s.io images tag  swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/kube-controllers:v3.29.1 docker.io/calico/kube-controllers:v3.29.1
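
A sketch that does the same for the whole image list above (a bash loop; it assumes the mirror keeps each original path under its own prefix, as in the example):

MIRROR=swr.cn-north-4.myhuaweicloud.com/ddn-k8s
for img in docker.io/calico/{apiserver,cni,csi,kube-controllers,node-driver-registrar,node,pod2daemon-flexvol,typha}:v3.29.1 \
           quay.io/tigera/operator:v1.36.2; do
    # pull from the mirror, then retag to the name the manifests expect
    ctr -n k8s.io images pull "$MIRROR/$img"
    ctr -n k8s.io images tag "$MIRROR/$img" "$img"
done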

Install calicoctl

For the calicoctl install I followed the official docs: https://docs.tigera.io/calico/latest/operations/calicoctl/install

curl -L https://github.com/projectcalico/calico/releases/download/v3.29.1/calicoctl-linux-amd64 -o calicoctl
chmod +x ./calicoctl
cp calicoctl /usr/bin/
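
A quick check that the binary works:

calicoctl version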

Check whether the calico pods were created successfully:

kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-658d97c59c-gftzg   1/1     Running   0          12m
kube-system   calico-node-x7dhl                          1/1     Running   0          12m
kube-system   coredns-66f779496c-8ctsc                   1/1     Running   0          62m
kube-system   coredns-66f779496c-hx76v                   1/1     Running   0          62m
kube-system   etcd-k8s-master                            1/1     Running   0          62m
kube-system   kube-apiserver-k8s-master                  1/1     Running   0          62m
kube-system   kube-controller-manager-k8s-master         1/1     Running   0          62m
kube-system   kube-proxy-89c9k                           1/1     Running   0          62m
kube-system   kube-scheduler-k8s-master                  1/1     Running   0          62m

At this point calicoctl can already see the master node k8s-master:

# calicoctl node status
Calico process is running.

IPv4 BGP status
No IPv4 peers found.

IPv6 BGP status
No IPv6 peers found.

# calicoctl get nodes
NAME
k8s-master

4. Join the two worker nodes to the cluster

First, get the token on the master node:

kubeadm token list | awk '{print $1}'
TOKEN
zk8fth.5psohqfk9lomq0tw

Tokens expire after 24 hours by default; if yours has expired, create a new one on the master node:

kubeadm token create
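
A handy variant prints the complete join command, token and CA cert hash included, letting you skip the openssl step below:

kubeadm token create --print-join-command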

Then compute the value for --discovery-token-ca-cert-hash on the master node:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
   openssl dgst -sha256 -hex | sed 's/^.* //'
f32851bb6a86cc7f0a394f1d77e1db5b217cde1b0f40909ee3916959519173f7

Finally, run kubeadm join on both worker nodes to join them to the cluster:

kubeadm join 192.168.52.200:6443 \
        --token zk8fth.5psohqfk9lomq0tw \
        --discovery-token-ca-cert-hash sha256:f32851bb6a86cc7f0a394f1d77e1db5b217cde1b0f40909ee3916959519173f7

[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Enable kubelet to start on boot, as the warning in the join output suggests:

systemctl enable kubelet.service

Back on the master node, check whether the workers joined successfully.

It takes a few minutes until all pods are created, as below:

watch kubectl get pods -A
NAMESPACE          NAME                                       READY   STATUS    RESTARTS   AGE
calico-apiserver   calico-apiserver-7f67554766-dfzc9          1/1     Running   0          22m
calico-apiserver   calico-apiserver-7f67554766-xsdlk          1/1     Running   0          22m
calico-system      calico-kube-controllers-6b7df74554-qxqgf   1/1     Running   0          22m
calico-system      calico-node-7lhhg                          1/1     Running   0          10m
calico-system      calico-node-ksrfp                          1/1     Running   0          22m
calico-system      calico-node-lnt8q                          1/1     Running   0          10m
calico-system      calico-typha-6855cf6f56-dzfd4              1/1     Running   0          22m
calico-system      calico-typha-6855cf6f56-z77jt              1/1     Running   0          10m
calico-system      csi-node-driver-cn6h5                      2/2     Running   0          10m
calico-system      csi-node-driver-pqrw8                      2/2     Running   0          10m
calico-system      csi-node-driver-rhd5m                      2/2     Running   0          22m
kube-system        coredns-5dd5756b68-84wqr                   1/1     Running   0          5h32m
kube-system        coredns-5dd5756b68-hc9xm                   1/1     Running   0          5h32m
kube-system        etcd-k8s-master                            1/1     Running   0          5h32m
kube-system        kube-apiserver-k8s-master                  1/1     Running   0          5h32m
kube-system        kube-controller-manager-k8s-master         1/1     Running   0          5h32m
kube-system        kube-proxy-2hmd5                           1/1     Running   0          10m
kube-system        kube-proxy-4h2cz                           1/1     Running   0          10m
kube-system        kube-proxy-6qptd                           1/1     Running   0          5h32m
kube-system        kube-scheduler-k8s-master                  1/1     Running   0          5h32m
tigera-operator    tigera-operator-c7ccbd65-rddsw             1/1     Running   0          5h25m

Both worker nodes now show as Ready:

# kubectl get nodes
NAME         STATUS   ROLES           AGE     VERSION
k8s-master   Ready    control-plane   5h33m   v1.28.2
k8s-node1    Ready    <none>          11m     v1.28.2
k8s-node2    Ready    <none>          11m     v1.28.2

# calicoctl node status
Calico process is running.

IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS  |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+---------------+-------------------+-------+----------+-------------+
| 10.206.216.98 | node-to-node mesh | up    | 08:48:32 | Established |
| 10.206.216.99 | node-to-node mesh | up    | 08:49:43 | Established |
+---------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

# calicoctl get nodes
NAME
k8s-master
k8s-node1
k8s-node2

5. Test the cluster by deploying Nginx

Create an Nginx Deployment with the replica count set to 3 (save it as nginx-deployment.yaml):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 3 # with three nodes in the cluster, each node will typically run one Nginx instance (actual placement depends on resources and the scheduler)
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

Deploy Nginx with kubectl apply -f nginx-deployment.yaml, then check the pods:

# kubectl get pods  -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP               NODE         NOMINATED NODE   READINESS GATES
nginx-deployment-7c79c4bf97-2bk2q   1/1     Running   0          22m   198.18.169.130   k8s-node2    <none>           <none>
nginx-deployment-7c79c4bf97-5pdr5   1/1     Running   0          22m   198.18.235.199   k8s-master   <none>           <none>
nginx-deployment-7c79c4bf97-w4s8h   1/1     Running   0          22m   198.18.36.66     k8s-node1    <none>           <none>

As shown, each node is running one Nginx instance.
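
To confirm Nginx actually serves traffic, you can curl one of the pod IPs from the listing above on any node (pod IPs are routable inside the cluster via calico):

curl -s http://198.18.169.130 | grep title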

Delete the Nginx pods on all nodes:

[root@k8s-master ~]# kubectl delete deploy nginx-deployment
deployment.apps "nginx-deployment" deleted

6. Remove worker nodes from the cluster

On the master node, delete the worker nodes:

kubectl delete node k8s-node1
kubectl delete node k8s-node2

After a worker has left the cluster, run the following on that worker to rejoin:

systemctl stop kubelet
rm -rf /etc/kubernetes/*
kubeadm join ... # run on the worker to rejoin the cluster
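
An arguably cleaner way to wipe a node before rejoining is kubeadm's own reset (it also clears kubelet state under /var/lib/kubelet; leftover CNI config may still need removing by hand):

kubeadm reset -f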

