k8s Migration: 岁月云 Field Notes

The new servers run Rocky Linux 9.5; the old virtual machines run CentOS 7.

1 Target servers

1.1 Disable swap

swapoff -a
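# Comment out the swap entry in /etc/fstab so swap stays off after a reboot
vi /etc/fstab
#/dev/mapper/rl-swap     none                    swap    defaults        0 0
# Verify: every value in the Swap row should be 0
free -h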
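
The manual fstab edit above can also be scripted. The following is a minimal sketch, assuming GNU sed; the function name disable_swap_entries is illustrative and not part of the original notes. It comments out every active line whose third field is swap and leaves a .bak copy of the file.

```shell
# Comment out every active swap entry in an fstab-style file.
# Usage: disable_swap_entries /etc/fstab   (hypothetical helper name)
disable_swap_entries() {
    fstab="${1:-/etc/fstab}"
    # Match uncommented lines whose third field is "swap" and prefix them with '#'.
    # sed keeps the untouched original next to the file as <name>.bak.
    sed -i.bak -E 's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab"
}
```

On a node this would be run as root with disable_swap_entries /etc/fstab, after swapoff -a.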

1.2 Disable the firewall

This is done only to reduce maintenance overhead.

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

1.3 Disable SELinux

# Temporary: takes effect now but reverts after a reboot
setenforce 0
# Permanent
vi /etc/selinux/config
# Change SELINUX=enforcing to SELINUX=disabled
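
The permanent change can be made without opening an editor. This is a sketch only, assuming GNU sed; set_selinux_mode is a hypothetical helper name. It rewrites the SELINUX= line and leaves SELINUXTYPE= untouched.

```shell
# Set the SELINUX= line of an SELinux config file to the given mode.
# Usage: set_selinux_mode disabled [/etc/selinux/config]   (hypothetical helper)
set_selinux_mode() {
    mode="$1"
    config="${2:-/etc/selinux/config}"
    # Accept only the three modes the config file understands.
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown SELinux mode: $mode" >&2; return 1 ;;
    esac
    # "^SELINUX=" does not match "SELINUXTYPE=", so only the mode line changes.
    sed -i -E "s/^SELINUX=.*/SELINUX=${mode}/" "$config"
}
```

The new mode applies from the next boot; setenforce 0 covers the time until then.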

1.4 Set the hostname

hostnamectl set-hostname master7

1.5 Add hosts entries

vi /etc/hosts
# Append the following entries
10.101.10.6 master6
10.101.10.7 master7
10.101.10.8 master8

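1.6 Configure IP forwarding

# Load the bridge netfilter module
modprobe br_netfilter
# If net.ipv4.ip_forward is 0, pod traffic cannot be forwarded
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.bridge.bridge-nf-call-iptables=1
sysctl -w net.bridge.bridge-nf-call-ip6tables=1
# Note: sysctl -w changes only the running kernel, and sysctl -p reloads /etc/sysctl.conf,
# so add the three keys to that file as well if they must survive a reboot
sysctl -p
# Verify
sysctl net.ipv4.ip_forward
sysctl net.bridge.bridge-nf-call-iptables
sysctl net.bridge.bridge-nf-call-ip6tables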
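
The sysctl -w settings above last only until the next reboot. One way to persist them is sketched below; the file name k8s.conf and the function name write_k8s_sysctl are assumptions, while /etc/sysctl.d and /etc/modules-load.d are the standard drop-in directories on systemd-based systems.

```shell
# Write persistent kernel settings for Kubernetes networking.
# Usage: write_k8s_sysctl [SYSCTL_DIR] [MODULES_DIR]   (hypothetical helper)
write_k8s_sysctl() {
    sysctl_dir="${1:-/etc/sysctl.d}"
    modules_dir="${2:-/etc/modules-load.d}"
    # Load br_netfilter at boot so the net.bridge.* keys exist.
    echo "br_netfilter" > "${modules_dir}/k8s.conf"
    # The same three keys that are set with sysctl -w in the notes.
    cat > "${sysctl_dir}/k8s.conf" <<'EOF'
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
}

# On a real node, as root:
#   write_k8s_sysctl && sysctl --system
```

sysctl --system reloads every drop-in file, so the values apply immediately as well as after a reboot.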

1.7 Time synchronization

sudo dnf install chrony
sudo systemctl start chronyd
sudo systemctl enable chronyd

# Edit the configuration
vi /etc/chrony.conf
# Add the following lines
pool ntp1.aliyun.com iburst
pool ntp2.aliyun.com iburst


server ntp1.aliyun.com iburst
server ntp2.aliyun.com iburst
server ntp3.aliyun.com iburst
server ntp4.aliyun.com iburst
server ntp5.aliyun.com iburst
server ntp7.aliyun.com iburst


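# Restart chronyd so the edited configuration takes effect, then step the clock immediately
sudo systemctl restart chronyd
sudo chronyc -a makestep

# Check the time status
timedatectl status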

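1.8 Add the rancher user

useradd rancher
# The docker group exists only after Docker is installed
usermod -aG docker rancher
# Replace 123456 with a strong password
echo 123456 | passwd --stdin rancher
# Confirm that rancher is in the docker group
grep docker /etc/group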

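2 Source server

New nodes are added from the original master node, so the steps in this section run on the source server.

2.1 Passwordless login

# Run on the original master node
su - rancher
ssh-copy-id rancher@master7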

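2.2 Install the new rke

# get.rke2.io installs RKE2, a different product; the rke up commands below need the RKE v1 binary (v1.6.5 here)
curl -fL -o /usr/local/bin/rke https://github.com/rancher/rke/releases/download/v1.6.5/rke_linux-amd64 && chmod +x /usr/local/bin/rke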

2.3 Add nodes

rke manages adding and removing k8s nodes: edit cluster.yml, then run rke up --update-only --config cluster.yml. Because this change adds etcd members, schedule it for an off-peak window.
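
For illustration, the entry added to the nodes: list of cluster.yml might look like the output of the sketch below. The address, user, and role keys are the ones RKE reads for each node; the helper name rke_node_entry and the roles chosen for master7 are assumptions, since the notes do not show the actual cluster.yml.

```shell
# Print one entry for the "nodes:" list of an RKE cluster.yml.
# Usage: rke_node_entry ADDRESS USER ROLE [ROLE...]   (hypothetical helper)
rke_node_entry() {
    address="$1"; user="$2"; shift 2
    printf '  - address: %s\n' "$address"
    printf '    user: %s\n' "$user"
    printf '    role:\n'
    # Each remaining argument becomes one role (controlplane, etcd, worker).
    for role in "$@"; do
        printf '      - %s\n' "$role"
    done
}

# Example: master7 as a combined controlplane/etcd/worker node (assumed roles)
rke_node_entry master7 rancher controlplane etcd worker
```

The printed block is pasted under nodes: in cluster.yml before running rke up.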

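2.4 Install kubectl

Install the kubectl release that matches the cluster:

https://dl.k8s.io/release/v1.30.7/bin/linux/amd64/kubectl

# Make the downloaded binary executable and put it on the PATH
chmod +x kubectl
cp -a kubectl /usr/bin
# Use the kubeconfig generated by rke as root's default config
mkdir -p /root/.kube
cp /home/rancher/kube_config_cluster.yml /root/.kube/config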
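
Before copying the binary into place it is worth checking the download. The sketch below assumes sha256sum is available and that the checksum is published next to the binary as kubectl.sha256, as the upstream install guide describes; verify_sha256 is a hypothetical helper name.

```shell
# Return success only if FILE has the expected SHA-256 digest.
# Usage: verify_sha256 FILE EXPECTED_HEX   (hypothetical helper)
verify_sha256() {
    file="$1"; expected="$2"
    # sha256sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}

# Typical use on the source server:
#   curl -LO https://dl.k8s.io/release/v1.30.7/bin/linux/amd64/kubectl
#   curl -LO https://dl.k8s.io/release/v1.30.7/bin/linux/amd64/kubectl.sha256
#   verify_sha256 kubectl "$(cat kubectl.sha256)" && echo "kubectl: OK"
```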

3 Problems encountered

3.1 Docker version incompatibility

su - rancher
rke up --update-only --config cluster.yml

The command fails with the error below. The Rancher documentation also lists it, under the heading Failed to set up SSH tunneling for host [xxx.xxx.xxx.xxx]: Can't retrieve Docker Info.

WARN[0000] Failed to set up SSH tunneling for host [master6]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [master6:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain 
WARN[0000] Removing host [master6] from node lists      
INFO[0000] [network] No hosts added existing cluster, skipping port check 

Yet the following command, run on the source server, succeeds:

ssh -i ~/.ssh/id_rsa rancher@master6

Comparing Docker versions suggests that the version gap is the likely cause:

# Target server
[root@master6 ~]# docker --version
Docker version 27.4.0, build bde2b89
# Source server
[root@master1 ~]# docker --version
Docker version 19.03.8, build afacb8b

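Newer Docker is not automatically better. The current rke release is v1.6.5, and the run below shows that it rejects Docker 27.4.1 as unsupported, so Docker has to be rolled back to a supported version.

[rancher@master8 ~]$ rke up --config cluster.yml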
INFO[0000] Running RKE version: v1.6.5                  
INFO[0000] Initiating Kubernetes cluster                
INFO[0000] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates 
INFO[0000] [certificates] Generating Kubernetes API server certificates 
INFO[0000] [certificates] Generating admin certificates and kubeconfig 
INFO[0000] [certificates] Generating kube-etcd-master6 certificate and key 
INFO[0000] [certificates] Generating kube-etcd-master7 certificate and key 
INFO[0000] [certificates] Generating kube-etcd-master8 certificate and key 
INFO[0000] Successfully Deployed state file at [./cluster.rkestate] 
INFO[0000] Building Kubernetes cluster                  
INFO[0000] [dialer] Setup tunnel for host [master7]     
INFO[0000] [dialer] Setup tunnel for host [master8]     
INFO[0000] [dialer] Setup tunnel for host [master6]     
FATA[0001] Unsupported Docker version found [27.4.1] on host [master8], supported versions are [1.13.x 17.03.x 17.06.x 17.09.x 18.06.x 18.09.x 19.03.x 20.10.x 23.0.x 24.0.x 25.0.x 26.0.x 26.1.x 27.0.x 27.1.x 27.2.x] 
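
The check that rke performs here can be reproduced on a node before running rke up. The sketch below copies the supported series from the FATA line above; the function name docker_version_supported is hypothetical, and the matching on major.minor is an assumption about how the list is meant to be read.

```shell
# Supported series, copied from the rke v1.6.5 error message above.
SUPPORTED="1.13.x 17.03.x 17.06.x 17.09.x 18.06.x 18.09.x 19.03.x 20.10.x 23.0.x 24.0.x 25.0.x 26.0.x 26.1.x 27.0.x 27.1.x 27.2.x"

# Return success if VERSION (for example 27.2.1) falls in a supported series.
docker_version_supported() {
    version="$1"
    # Reduce "27.2.1" to "27.2.x" so it can be compared with the list.
    series="$(printf '%s' "$version" | cut -d. -f1,2).x"
    for s in $SUPPORTED; do
        [ "$s" = "$series" ] && return 0
    done
    return 1
}

# On a node:
#   docker_version_supported "$(docker version --format '{{.Server.Version}}')" \
#       || echo "Docker version not supported by this rke release"
```

With this list, 27.4.1 and 27.4.0 are rejected while 27.2.1 and 19.03.8 pass, which matches the behaviour seen in the log.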

Reset the Docker environment:

systemctl disable docker
sudo systemctl stop docker.socket
systemctl stop docker
dnf remove docker-ce docker-ce-cli containerd.io docker-compose-plugin -y
# Delete Docker data
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
rm -rf /home/docker
# Remove leftover configuration; both steps can be skipped when reinstalling
sudo rm -rf /etc/docker
sudo rm -rf /etc/systemd/system/docker.service.d
# List the available docker-ce versions
sudo yum list docker-ce --showduplicates | sort -r
# Install a specific version that rke supports (27.2.x is the newest series in the list above)
yum install docker-ce-27.2.1-1.el9 docker-ce-cli-27.2.1-1.el9 containerd.io -y
# Change the Docker data path, then reload systemd so the edited unit file is picked up
vi /lib/systemd/system/docker.service
systemctl daemon-reload
# Start Docker and enable it at boot
systemctl start docker
systemctl enable docker

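3.2 rke cannot download images

Even after editing /etc/docker/daemon.json, rke up --config cluster.yml still fails to pull images. Pull the required image by hand on every node, as in the example below, then rerun rke up --config cluster.yml and it continues.

docker pull rancher/rke-tools:v0.1.105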
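
Pulling by hand on every node gets tedious once several images are involved. A minimal sketch of a loop over nodes and images follows, assuming the passwordless SSH set up for the rancher user earlier; prepull_images and the DRY_RUN switch are hypothetical names introduced here.

```shell
# Pull every image listed in a file on every given node over SSH.
# Usage: prepull_images IMAGE_LIST_FILE NODE [NODE...]   (hypothetical helper)
# Set DRY_RUN=1 to print the commands instead of running them.
prepull_images() {
    list="$1"; shift
    for node in "$@"; do
        while IFS= read -r image; do
            # Skip blank lines in the list.
            [ -n "$image" ] || continue
            # ssh -n keeps ssh from consuming the rest of the image list on stdin.
            ${DRY_RUN:+echo} ssh -n "rancher@${node}" docker pull "$image"
        done < "$list"
    done
}

# Example:
#   printf '%s\n' rancher/rke-tools:v0.1.105 > images.txt
#   DRY_RUN=1 prepull_images images.txt master6 master7 master8
```

The image list itself can be written by hand from the failing pod events, or taken from rke config --system-images; check rke config --help for the flags your release accepts.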

My screenshot from this run (not reproduced here) showed that some Rancher-related images are large, listed as 16.GB, while others were still downloading.

3.3 canal installation fails

calico-kube-controllers fails to install as well, but fixing the problem below resolves both.

# Show the detailed error events for the pod
kubectl describe pod canal-5vznx -n kube-system

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  32m                   default-scheduler  Successfully assigned kube-system/canal-5vznx to master7
  Normal   Pulling    27m (x4 over 32m)     kubelet            Pulling image "rancher/calico-cni:v3.28.1-rancher1"
  Warning  Failed     25m (x4 over 31m)     kubelet            Error: ErrImagePull
  Warning  Failed     24m (x7 over 31m)     kubelet            Error: ImagePullBackOff
  Warning  Failed     11m (x7 over 31m)     kubelet            Failed to pull image "rancher/calico-cni:v3.28.1-rancher1": rpc error: code = Canceled desc = context canceled
  Normal   BackOff    2m44s (x77 over 31m)  kubelet            Back-off pulling image "rancher/calico-cni:v3.28.1-rancher1"

# So pull the images manually
docker pull rancher/calico-cni:v3.28.1-rancher1
docker pull rancher/mirrored-calico-node:v3.28.1

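3.5 kuboard installation fails

The events below show the same symptom: the image cannot be pulled. This time the cause is that kuboard needs an image pull secret to download its image from the local Harbor registry.

Events: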
  Type     Reason                           Age                From               Message
  ----     ------                           ----               ----               -------
  Normal   Scheduled                        46s                default-scheduler  Successfully assigned kube-system/kuboard-559bccdc6-zf67z to master6
  Normal   BackOff                          18s (x2 over 44s)  kubelet            Back-off pulling image "10.101.10.2:8081/mid/eipwork/kuboard:latest"
  Warning  Failed                           18s (x2 over 44s)  kubelet            Error: ImagePullBackOff
  Warning  FailedToRetrieveImagePullSecret  3s (x5 over 46s)   kubelet            Unable to retrieve some image pull secrets (regcred); attempting to pull the image may not succeed.
  Normal   Pulling                          3s (x3 over 45s)   kubelet            Pulling image "10.101.10.2:8081/mid/eipwork/kuboard:latest"
  Warning  Failed                           3s (x3 over 45s)   kubelet            Failed to pull image "10.101.10.2:8081/mid/eipwork/kuboard:latest": Error response from daemon: unauthorized: unauthorized to access repository: mid/eipwork/kuboard, action: pull: unauthorized to access repository: mid/eipwork/kuboard, action: pull
  Warning  Failed                           3s (x3 over 45s)   kubelet            Error: ErrImagePull

Create the regcred secret that the events above report as missing:

kubectl create secret docker-registry regcred \
  --docker-server=http://<harbor-ip>:<port> \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email> \
  -n kube-system

Logging in to kuboard then requires its token:

echo $(kubectl -n kube-system get secret $(kubectl -n kube-system get secret | grep kuboard-user | awk '{print $1}') -o go-template='{{.data.token}}' | base64 -d)

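3.6 Cannot get the kuboard token

The command above used to be all it took, but this time kuboard had no matching secret. Inspecting the service account confirms that it references no secret. This is expected on Kubernetes 1.24 and later, which no longer creates a token Secret automatically for each ServiceAccount.

kubectl get serviceaccount kuboard-user -n kube-system -o yaml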
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"kuboard-user","namespace":"kube-system"}}
  creationTimestamp: "2024-12-21T07:24:12Z"
  name: kuboard-user
  namespace: kube-system
  resourceVersion: "3491"
  uid: 7d46c0a1-07e9-4cb2-ad99-00b7e6091151

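The fix is to create the secret by hand. After that, the command from the previous section reads the token from the secret, and the token logs in to the kuboard web page.

# Create a new token Secret for the service account
kubectl apply -f - <<EOF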
apiVersion: v1
kind: Secret
metadata:
  name: kuboard-user-token
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: kuboard-user
type: kubernetes.io/service-account-token
EOF



# Link the newly created Secret to the ServiceAccount
kubectl patch serviceaccount kuboard-user -n kube-system --patch '{"secrets":[{"name":"kuboard-user-token"}]}'
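
As an alternative to a long-lived token Secret, kubectl 1.24 and later can request a short-lived token directly with kubectl create token. The sketch below is not the method used in these notes; kuboard_token is a hypothetical wrapper, the default duration of 1h is an assumption, and whether a time-limited token suits your kuboard login is something to verify.

```shell
# Request a short-lived token for the kuboard-user service account.
# Usage: kuboard_token [DURATION]   (hypothetical helper; DURATION such as 1h or 30m)
kuboard_token() {
    # kubectl create token uses the TokenRequest API, so no Secret is stored.
    kubectl -n kube-system create token kuboard-user --duration="${1:-1h}"
}

# Example:
#   kuboard_token 8h
```

The token printed this way expires after the requested duration, so it has to be requested again for later logins.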