k8s Migration: 岁月云 (Suiyue Cloud) Hands-On Notes

The new system runs Rocky Linux 9.5; the old system's virtual machines run CentOS 7.

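1 Target servers

1.1 Disable swap

# Turn swap off for the running system
swapoff -a
# Comment out the swap entry in /etc/fstab so that swap stays off after a reboot
vi /etc/fstab
#/dev/mapper/rl-swap     none                    swap    defaults        0 0
# Verify: every value on the Swap line should be 0
free -h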
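
The manual /etc/fstab edit can also be scripted. The following is a minimal sketch, assuming GNU sed; the function name comment_out_swap is made up for illustration. It only prints the modified file, so nothing changes until the output has been reviewed.

```shell
# comment_out_swap is a hypothetical helper: it prints the given fstab
# with every active (uncommented) swap entry commented out.
comment_out_swap() {
  # A line qualifies when it does not start with '#' and has "swap"
  # as a whitespace-delimited field.
  sed -E 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# Usage (review the result before replacing the real file):
#   comment_out_swap /etc/fstab > /tmp/fstab.new
```

Printing instead of editing in place keeps the step reversible; a broken fstab can prevent the host from booting.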

1.2 Disable the firewall

This is done only to reduce maintenance overhead.

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

1.3 Disable SELinux

# Temporary: switches SELinux to permissive mode until the next reboot
setenforce 0
# Permanent
vi /etc/selinux/config
# Change SELINUX=enforcing to SELINUX=disabled
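
The same edit can be done without opening an editor. This is a sketch only; set_selinux_disabled is a hypothetical helper name, and it prints the result rather than modifying /etc/selinux/config directly.

```shell
# set_selinux_disabled is a hypothetical helper: it prints the given config
# with the SELINUX= line set to disabled and all other lines unchanged
# (SELINUXTYPE= is not touched because the pattern requires "SELINUX=").
set_selinux_disabled() {
  sed -E 's/^SELINUX=(enforcing|permissive)$/SELINUX=disabled/' "$1"
}

# Usage:
#   set_selinux_disabled /etc/selinux/config > /tmp/selinux-config.new
```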

1.4 Set the hostname

# Run on each node with its own name; master7 is shown here
hostnamectl set-hostname master7

1.5 Add hosts entries

vi /etc/hosts
# Add one line per node
10.101.10.6 master6
10.101.10.7 master7
10.101.10.8 master8
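
When the same entries have to be added on several machines, an idempotent helper avoids duplicate lines. A minimal sketch; add_host_entry is a hypothetical name, and the file path is passed in so the helper can be tried on a copy first.

```shell
# add_host_entry is a hypothetical helper: append "IP NAME" to a hosts file
# unless a line for that hostname already exists.
add_host_entry() {
  local ip="$1" name="$2" file="$3"
  # Match the hostname as a whole field so master6 does not match master60.
  if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file"; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}

# Usage (as root):
#   add_host_entry 10.101.10.6 master6 /etc/hosts
#   add_host_entry 10.101.10.7 master7 /etc/hosts
#   add_host_entry 10.101.10.8 master8 /etc/hosts
```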

1.6 Configure IP forwarding

# Apply the settings
modprobe br_netfilter
# If net.ipv4.ip_forward is 0, pod traffic cannot be forwarded
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.bridge.bridge-nf-call-iptables=1
sysctl -w net.bridge.bridge-nf-call-ip6tables=1
# Note: sysctl -w changes only the running kernel, and sysctl -p merely reloads /etc/sysctl.conf
sysctl -p
# Verify
sysctl net.ipv4.ip_forward
sysctl net.bridge.bridge-nf-call-iptables
sysctl net.bridge.bridge-nf-call-ip6tables
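
Settings applied with sysctl -w and modules loaded with modprobe do not survive a reboot. One common way to make them persistent is to drop files into /etc/sysctl.d and /etc/modules-load.d. The sketch below assumes that approach; the helper name write_k8s_kernel_config and the file name k8s.conf are illustrative choices, not something the original notes use.

```shell
# write_k8s_kernel_config is a hypothetical helper. ROOT defaults to "/";
# pointing it at a scratch directory gives a dry run.
write_k8s_kernel_config() {
  local root="${1:-/}"
  mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d"
  # Load br_netfilter at boot so that the net.bridge.* keys exist.
  echo br_netfilter > "$root/etc/modules-load.d/k8s.conf"
  # The same three settings as the sysctl -w commands in this section.
  cat > "$root/etc/sysctl.d/k8s.conf" <<'EOF'
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
}

# Usage (as root), then load every sysctl configuration file:
#   write_k8s_kernel_config / && sysctl --system
```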

1.7 Time synchronization

sudo dnf install -y chrony
sudo systemctl start chronyd
sudo systemctl enable chronyd

# Edit the configuration
vi /etc/chrony.conf
# Add the following lines
pool ntp1.aliyun.com iburst
pool ntp2.aliyun.com iburst

server ntp1.aliyun.com iburst
server ntp2.aliyun.com iburst
server ntp3.aliyun.com iburst
server ntp4.aliyun.com iburst
server ntp5.aliyun.com iburst
server ntp7.aliyun.com iburst

# Restart chronyd so that the new configuration is loaded
sudo systemctl restart chronyd

# Step the clock immediately
sudo chronyc -a makestep

# Check the time status
timedatectl status

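1.8 Add the rancher user

useradd rancher
# Docker must already be installed so that the docker group exists
usermod -aG docker rancher
# 123456 is only a placeholder; set a strong password
echo 123456 | passwd --stdin rancher
# Confirm that rancher is in the docker group
grep docker /etc/group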

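2 Source server

New nodes are added from the original master node, so the steps in this section run on the source server.

2.1 Passwordless SSH login

# Run on the original master node; repeat ssh-copy-id for every new node
su - rancher
ssh-copy-id rancher@master7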

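2.2 Install the new RKE

# Note: get.rke2.io installs RKE2, while the rke up commands in the rest of these notes
# come from the RKE v1.x binary (v1.6.5 in the logs below), which is installed separately.
curl -sfL https://get.rke2.io | sh -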

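2.3 Add nodes

RKE manages the addition and removal of k8s nodes: update cluster.yml, then run rke up --update-only --config cluster.yml. Because this adds etcd members, pick an idle period for the change. Note that, according to the RKE documentation, --update-only applies only worker-node changes; etcd and controlplane changes need a plain rke up --config cluster.yml, as used later in section 3.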
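
For orientation, each machine is described by one entry in the nodes list of cluster.yml. The sketch below prints such an entry; node_entry is a hypothetical helper, and the role list is an assumption, since the original notes do not show which roles each new node received.

```shell
# node_entry is a hypothetical helper that prints one entry for the
# "nodes:" list of an RKE cluster.yml. The roles are an assumption.
node_entry() {
  local address="$1" user="$2"
  cat <<EOF
  - address: ${address}
    user: ${user}
    role: [controlplane, etcd, worker]
EOF
}

# Usage: print entries for the new hosts, then paste them under "nodes:".
#   for h in master6 master7 master8; do node_entry "$h" rancher; done
```

The user field matches the rancher account created on the target servers, which RKE uses for its SSH connection.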

2.4 Install kubectl

Download the kubectl build that matches the cluster version:

https://dl.k8s.io/release/v1.30.7/bin/linux/amd64/kubectl

chmod +x kubectl
cp -a kubectl /usr/bin
mkdir -p /root/.kube
cp /home/rancher/kube_config_cluster.yml /root/.kube/config
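
The download itself can be scripted as well. A minimal sketch: kubectl_url is a hypothetical helper that builds a URL of the same form as the link in this section, and the commented usage verifies the binary against the .sha256 file published next to it.

```shell
# kubectl_url is a hypothetical helper that builds the download URL for a
# given release; the architecture defaults to amd64.
kubectl_url() {
  local version="$1" arch="${2:-amd64}"
  echo "https://dl.k8s.io/release/${version}/bin/linux/${arch}/kubectl"
}

# Usage: download the binary and its checksum, then verify before installing.
#   url=$(kubectl_url v1.30.7)
#   curl -LO "$url" && curl -LO "$url.sha256"
#   echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check
```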

3 Problems encountered

3.1 Docker version incompatibility

su - rancher
rke up --update-only --config cluster.yml

The command fails with the error below. The Rancher documentation also lists it, as "Failed to set up SSH tunneling for host [xxx.xxx.xxx.xxx]: Can't retrieve Docker Info".

WARN[0000] Failed to set up SSH tunneling for host [master6]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [master6:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain 
WARN[0000] Removing host [master6] from node lists      
INFO[0000] [network] No hosts added existing cluster, skipping port check 

However, the following command, run on the source server, succeeds:

ssh -i ~/.ssh/id_rsa rancher@master6

Checking the Docker versions suggests that the version difference is the likely cause:

# Target server
[root@master6 ~]# docker --version
Docker version 27.4.0, build bde2b89
# Source server
[root@master1 ~]# docker --version
Docker version 19.03.8, build afacb8b

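Newer is not always better with Docker. The RKE release in use is v1.6.5, and the run below stops with a message saying that Docker 27.4.1 is not currently supported, so Docker has to be rolled back to an earlier version.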

[rancher@master8 ~]$ rke up --config cluster.yml
INFO[0000] Running RKE version: v1.6.5                  
INFO[0000] Initiating Kubernetes cluster                
INFO[0000] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates 
INFO[0000] [certificates] Generating Kubernetes API server certificates 
INFO[0000] [certificates] Generating admin certificates and kubeconfig 
INFO[0000] [certificates] Generating kube-etcd-master6 certificate and key 
INFO[0000] [certificates] Generating kube-etcd-master7 certificate and key 
INFO[0000] [certificates] Generating kube-etcd-master8 certificate and key 
INFO[0000] Successfully Deployed state file at [./cluster.rkestate] 
INFO[0000] Building Kubernetes cluster                  
INFO[0000] [dialer] Setup tunnel for host [master7]     
INFO[0000] [dialer] Setup tunnel for host [master8]     
INFO[0000] [dialer] Setup tunnel for host [master6]     
FATA[0001] Unsupported Docker version found [27.4.1] on host [master8], supported versions are [1.13.x 17.03.x 17.06.x 17.09.x 18.06.x 18.09.x 19.03.x 20.10.x 23.0.x 24.0.x 25.0.x 26.0.x 26.1.x 27.0.x 27.1.x 27.2.x] 
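
The check that RKE performs can be reproduced before running rke up. The sketch below copies the supported series from the FATA line above and compares only the major.minor part of a version; docker_supported is a hypothetical helper, not part of RKE.

```shell
# Supported series copied from the FATA line above (RKE v1.6.5).
SUPPORTED="1.13.x 17.03.x 17.06.x 17.09.x 18.06.x 18.09.x 19.03.x 20.10.x 23.0.x 24.0.x 25.0.x 26.0.x 26.1.x 27.0.x 27.1.x 27.2.x"

# docker_supported is a hypothetical helper: it succeeds when the
# major.minor part of VERSION matches one of the supported series.
docker_supported() {
  local series="${1%.*}.x"   # for example 27.4.1 becomes 27.4.x
  case " $SUPPORTED " in
    *" $series "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a node:
#   docker_supported "$(docker version --format '{{.Server.Version}}')" \
#     || echo "unsupported Docker version"
```

With this list, 27.4.0 and 27.4.1 are rejected, while 27.2.1 and 19.03.8 are accepted.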

Reset the Docker environment:

systemctl disable docker
sudo systemctl stop docker.socket
systemctl stop docker
dnf remove docker-ce docker-ce-cli containerd.io docker-compose-plugin -y
# Delete the Docker data
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
rm -rf /home/docker
# Clean up leftover configuration; for a reinstall, these two steps can be skipped
sudo rm -rf /etc/docker
sudo rm -rf /etc/systemd/system/docker.service.d
# List the available Docker versions
sudo yum list docker-ce --showduplicates | sort -r
# Install a specific Docker version (27.2.x is in the supported list)
yum install docker-ce-27.2.1-1.el9 docker-ce-cli-27.2.1-1.el9 containerd.io -y
# Change the Docker data path
vi /lib/systemd/system/docker.service
# Reload systemd so that the edited unit file is picked up
systemctl daemon-reload
# Start Docker and enable it at boot
systemctl start docker
systemctl enable docker

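3.2 RKE cannot download images

Even after /etc/docker/daemon.json has been changed, rke up --config cluster.yml may still fail to download images. The workaround is to pull the image manually on every node, as shown below, and then run rke up --config cluster.yml again, after which it continues.

docker pull rancher/rke-tools:v0.1.105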
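
Pulling by hand on every node gets tedious with more than one image. The sketch below generates the commands instead of running them, so the list can be checked first; prepull_commands is a hypothetical helper, and it assumes the passwordless SSH login for the rancher user set up in section 2.1.

```shell
# prepull_commands is a hypothetical helper: it prints one ssh command per
# node and image rather than executing anything.
prepull_commands() {
  local nodes="$1" images="$2" node image
  for node in $nodes; do
    for image in $images; do
      echo "ssh rancher@${node} docker pull ${image}"
    done
  done
}

# Usage: review the output, then pipe it to sh to execute.
#   prepull_commands "master6 master7 master8" "rancher/rke-tools:v0.1.105" | sh
```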

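A screenshot taken during the run (not reproduced here) shows that some Rancher-related images are fairly large, listed as 16.GB, while others were still downloading.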

3.3 canal fails to install

calico-kube-controllers fails to install as well, but fixing the problem below resolves both.

# Show the detailed error events for the pod
kubectl describe pod canal-5vznx -n kube-system

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  32m                   default-scheduler  Successfully assigned kube-system/canal-5vznx to master7
  Normal   Pulling    27m (x4 over 32m)     kubelet            Pulling image "rancher/calico-cni:v3.28.1-rancher1"
  Warning  Failed     25m (x4 over 31m)     kubelet            Error: ErrImagePull
  Warning  Failed     24m (x7 over 31m)     kubelet            Error: ImagePullBackOff
  Warning  Failed     11m (x7 over 31m)     kubelet            Failed to pull image "rancher/calico-cni:v3.28.1-rancher1": rpc error: code = Canceled desc = context canceled
  Normal   BackOff    2m44s (x77 over 31m)  kubelet            Back-off pulling image "rancher/calico-cni:v3.28.1-rancher1"

# So pull the images manually
docker pull rancher/calico-cni:v3.28.1-rancher1
docker pull rancher/mirrored-calico-node:v3.28.1

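3.5 kuboard fails to install

The events below show the same symptom: the image cannot be pulled. This time the cause is that kuboard needs an image pull secret to download its image from the local Harbor registry.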

Events:
  Type     Reason                           Age                From               Message
  ----     ------                           ----               ----               -------
  Normal   Scheduled                        46s                default-scheduler  Successfully assigned kube-system/kuboard-559bccdc6-zf67z to master6
  Normal   BackOff                          18s (x2 over 44s)  kubelet            Back-off pulling image "10.101.10.2:8081/mid/eipwork/kuboard:latest"
  Warning  Failed                           18s (x2 over 44s)  kubelet            Error: ImagePullBackOff
  Warning  FailedToRetrieveImagePullSecret  3s (x5 over 46s)   kubelet            Unable to retrieve some image pull secrets (regcred); attempting to pull the image may not succeed.
  Normal   Pulling                          3s (x3 over 45s)   kubelet            Pulling image "10.101.10.2:8081/mid/eipwork/kuboard:latest"
  Warning  Failed                           3s (x3 over 45s)   kubelet            Failed to pull image "10.101.10.2:8081/mid/eipwork/kuboard:latest": Error response from daemon: unauthorized: unauthorized to access repository: mid/eipwork/kuboard, action: pull: unauthorized to access repository: mid/eipwork/kuboard, action: pull
  Warning  Failed                           3s (x3 over 45s)   kubelet            Error: ErrImagePull

# Create the regcred image pull secret; replace the <...> placeholders
kubectl create secret docker-registry regcred \
  --docker-server=http://<harbor-ip>:<port> \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email> \
  -n kube-system

Next, retrieve the kuboard token, which is needed to log in:

echo $(kubectl -n kube-system get secret $(kubectl -n kube-system get secret | grep kuboard-user | awk '{print $1}') -o go-template='{{.data.token}}' | base64 -d)

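3.6 Cannot get the kuboard token

Previously, running the command above was all it took, but this time kuboard had not created the corresponding secret. Inspecting the service account confirms that no secret is attached to it. A likely explanation is that, since Kubernetes 1.24, token Secrets are no longer generated automatically for ServiceAccounts.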

kubectl get serviceaccount kuboard-user -n kube-system -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"kuboard-user","namespace":"kube-system"}}
  creationTimestamp: "2024-12-21T07:24:12Z"
  name: kuboard-user
  namespace: kube-system
  resourceVersion: "3491"
  uid: 7d46c0a1-07e9-4cb2-ad99-00b7e6091151

The fix is to create the secret manually. After that, the command above retrieves the token from the secret, and the token can be used to log in to the kuboard web page.

# Create a new token Secret for the service account
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: kuboard-user-token
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: kuboard-user
type: kubernetes.io/service-account-token
EOF

# Link the newly created Secret to the ServiceAccount
kubectl patch serviceaccount kuboard-user -n kube-system --patch '{"secrets":[{"name":"kuboard-user-token"}]}'