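Preface
The cluster runs on three virtual machines created in VMware, and each VM also acts as a control-plane (Master) node. The systems had gone a long time without security updates, so I recently ran dnf upgrade-minimal --security --allowerasing to upgrade the kernel and packages. Kernel updates only take effect after a reboot, and because all three nodes run etcd, the nodes must be rebooted one at a time so that etcd never loses quorum and the cluster stays available.
Because the cluster runs on VMware VMs, I also hit a few environment-specific problems while following this procedure. The symptoms and fixes are collected in the Troubleshooting section at the end.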
bash
# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
demo-1 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-53-generic containerd://1.7.27
demo-2 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-64-generic containerd://1.7.27
demo-3 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-106-generic containerd://1.7.27
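Node reboot procedure
For each node, run the following steps in order: check etcd → cordon → drain Pods → reboot → uncordon. Finish all of them before moving on to the next node.
1. Check etcd status
etcd is best deployed with an odd number of members, so that the cluster keeps serving reads and writes as long as a quorum (majority) is alive. The quorum size is:
⌊n/2⌋ + 1, so the cluster tolerates n − (⌊n/2⌋ + 1) failed members; see the official fault-tolerance notes. The current 3-member cluster needs at least 2 etcd members alive, which means only 1 may be down at a time.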
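To make the arithmetic concrete, here is a minimal sketch that evaluates the quorum size and the number of tolerated failures for a few cluster sizes. The function names `quorum` and `fault_tolerance` are made up for this illustration and are not part of etcd.

```shell
#!/usr/bin/env bash
# quorum N: members that must stay alive, floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: members that may be down at once, N - quorum(N).
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Going from 3 to 4 members raises the quorum without raising the number of tolerated failures, which is why odd sizes are preferred.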
Run the following checks on every etcd node:
bash
## Verify data consistency / member health
# etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint status --write-out=table
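## DB SIZE: database size; the upper limit is set at deployment time with --quota-backend-bytes (default 2G)
## IS LEADER: whether this member is the leader
## IS LEARNER: whether this member is a non-voting learner
## RAFT TERM: leader term; must be identical on all members. It increases by 1 on every leader election, which a restart or network jitter can trigger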
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://x.x.x.x:2379 | 683c58b549788bd9 | 3.5.15 | 30 MB | true | false | 40 | 130863104 | 130863104 | |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
## Verify connectivity / response latency
# etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint health --write-out=table
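## HEALTH: whether the endpoint can serve reads and writes
## TOOK: time for a test read of one key; no error means healthy. Roughly network round trip + leader heartbeat confirmation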
+------------------------+--------+-------------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+------------------------+--------+-------------+-------+
| https://x.x.x.x:2379 | true | 60.842745ms | |
+------------------------+--------+-------------+-------+
## List members
# etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
member list -w table
## STATUS: member status
## PEER ADDRS: addresses used for member-to-member traffic
## CLIENT ADDRS: addresses used by clients (the API server)
## IS LEARNER: whether this member is a non-voting learner
+---------+---------+--------+----------------------+----------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+---------+---------+--------+----------------------+----------------------+------------+
| xxxxxxx | started | demo-1 | https://x.x.x.x:2380 | https://x.x.x.x:2379 | false |
| xxxxxxx | started | demo-2 | https://x.x.x.x:2380 | https://x.x.x.x:2379 | false |
| xxxxxxx | started | demo-3 | https://x.x.x.x:2380 | https://x.x.x.x:2379 | false |
+---------+---------+--------+----------------------+----------------------+------------+
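The three checks above can be turned into a simple gate before each reboot. The sketch below counts healthy endpoints and decides whether one of them can be taken down without losing quorum. The helper names `count_healthy` and `safe_to_restart` are hypothetical, and the parsing assumes that `etcdctl endpoint health --cluster -w json` prints compact JSON containing `"health":true` for each healthy endpoint.

```shell
#!/usr/bin/env bash
# count_healthy: read the JSON from
#   etcdctl <TLS flags> endpoint health --cluster -w json
# on stdin and print how many endpoints report "health":true.
# Assumption: compact JSON, no space after the colon.
count_healthy() {
  grep -o '"health":true' | wc -l | tr -d ' '
}

# safe_to_restart HEALTHY TOTAL: succeed only if quorum (TOTAL/2 + 1)
# still holds after one healthy member is taken down.
safe_to_restart() {
  local healthy="$1" total="$2"
  [ $(( healthy - 1 )) -ge $(( total / 2 + 1 )) ]
}
```

With 3 members, `safe_to_restart 3 3` succeeds and `safe_to_restart 2 3` fails, so a reboot is only allowed while all three members are healthy.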
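2. Cordon the node
Mark the node to be rebooted with cordon so that no new Pods are scheduled onto it:
Note: operate on only one node at a time. Finish cordon → reboot → uncordon on that node before touching the next one.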
bash
# kubectl cordon demo-1
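3. Drain Pods
kube-apiserver, kube-controller-manager, kube-scheduler and etcd are static Pods managed directly by the kubelet, so drain does not evict them. As long as etcd on the other two nodes stays alive, the control plane keeps working.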
bash
# kubectl drain demo-1 \
--ignore-daemonsets \
--delete-emptydir-data \
--timeout=300s
After draining, confirm that scheduling is disabled on the node:
bash
# kubectl get node
NAME STATUS ROLES AGE VERSION
demo-1 Ready,SchedulingDisabled control-plane 336d v1.31.14
demo-2 Ready control-plane 336d v1.31.14
demo-3 Ready control-plane 336d v1.31.14
Common error: a drain timeout is usually caused by a PDB (PodDisruptionBudget) that does not allow the eviction, for example:
bash
error when evicting pods/"prometheus-k8s-0" -n "monitoring": Cannot evict pod as it would violate the pod's disruption budget.
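Inspect and handle the PDB that blocks the drain on this node:
Besides temporarily changing the PDB, you can also scale up the corresponding replicas, or delete the Pod by hand (which defeats the purpose of the PDB).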
bash
# kubectl get pdb -n monitoring prometheus-k8s
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
prometheus-k8s 1 N/A 0 337d
## Temporarily set minAvailable to 0 (restore it when you are done)
# kubectl patch pdb prometheus-k8s -n monitoring --type=json -p='[{"op":"replace","path":"/spec/minAvailable","value":0}]'
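Since the PDB should be restored afterwards, it helps to record the old value before patching. The following is a sketch only: `pdb_patch` and `relax_and_restore_pdb` are hypothetical helper names, and it assumes the PDB uses an integer `minAvailable` (not a percentage and not `maxUnavailable`).

```shell
#!/usr/bin/env bash
# pdb_patch VALUE: print a JSON patch that sets spec.minAvailable to VALUE.
pdb_patch() {
  printf '[{"op":"replace","path":"/spec/minAvailable","value":%s}]\n' "$1"
}

# relax_and_restore_pdb NAMESPACE NAME COMMAND...
# Remember the current minAvailable, set it to 0, run COMMAND, then put the old value back.
relax_and_restore_pdb() {
  local ns="$1" name="$2" old rc
  shift 2
  old=$(kubectl get pdb "$name" -n "$ns" -o jsonpath='{.spec.minAvailable}') || return 1
  # Assumption: only plain integers are handled here.
  case "$old" in
    ''|*[!0-9]*) echo "minAvailable is not an integer: '$old'" >&2; return 1 ;;
  esac
  kubectl patch pdb "$name" -n "$ns" --type=json -p="$(pdb_patch 0)" || return 1
  "$@"
  rc=$?
  kubectl patch pdb "$name" -n "$ns" --type=json -p="$(pdb_patch "$old")"
  return "$rc"
}
```

A possible call would be `relax_and_restore_pdb monitoring prometheus-k8s kubectl drain demo-1 --ignore-daemonsets --delete-emptydir-data --timeout=300s`, which restores the original value even if the drain fails.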
4. Reboot the node
bash
# ssh demo-1
# reboot
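5. Uncordon and verify
After the node reboots, confirm that it is back to Ready (the kernel version also changes to the updated one), then lift the scheduling restriction with uncordon: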
bash
# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
demo-1 Ready,SchedulingDisabled control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-124-generic containerd://1.7.27
demo-2 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-64-generic containerd://1.7.27
demo-3 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-106-generic containerd://1.7.27
# kubectl uncordon demo-1
node/demo-1 uncordoned
# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
demo-1 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-124-generic containerd://1.7.27
demo-2 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-64-generic containerd://1.7.27
demo-3 Ready control-plane 336d v1.31.14 x.x.x.x <none> Ubuntu 24.04.4 LTS 6.8.0-106-generic containerd://1.7.27
6. Repeat
With the current node done, go back to step 1 and repeat the whole procedure on the next node.
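The per-node cycle can be sketched as a small script. This is an illustration under assumptions rather than the exact procedure used here: the function names and the `DRY_RUN` switch are invented for the example, the 60-second pause and the 600-second timeout are arbitrary, and the etcd checks from step 1 are still expected to be run by hand before each node.

```shell
#!/usr/bin/env bash
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# restart_node NODE: cordon -> drain -> reboot -> wait for Ready -> uncordon.
restart_node() {
  local node="$1"
  run kubectl cordon "$node" || return 1
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s || return 1
  # The ssh session usually drops during reboot, so its exit code is ignored.
  run ssh "$node" reboot
  # Give the node time to go down before waiting for it to come back.
  run sleep 60
  run kubectl wait --for=condition=Ready "node/$node" --timeout=600s || return 1
  run kubectl uncordon "$node"
}

# rolling_restart NODE...: handle the nodes strictly one after another.
rolling_restart() {
  local node
  for node in "$@"; do
    restart_node "$node" || { echo "stopped at $node" >&2; return 1; }
  done
}
```

Running `DRY_RUN=1 rolling_restart demo-1 demo-2 demo-3` prints the commands without executing them, which is a cheap way to review the order first.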
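Troubleshooting
1. Pods show <invalid> after reboot
1.1. Symptom
After rebooting a node, the RESTARTS column of every Pod on that node shows <invalid>, and static Pods show 0/1 in the READY column. In the Kubernetes source, a timestamp that lies 2 seconds or more in the future relative to the current time is rendered as <invalid>.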
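The rule can be imitated in a few lines of shell. This is a simplified sketch based on my reading of the duration helper in the Kubernetes source, not a copy of it: the function name `restart_age` is hypothetical, and real kubectl output uses units such as `10m` instead of raw seconds.

```shell
#!/usr/bin/env bash
# restart_age NOW TS: both arguments are Unix epoch seconds.
# Prints "<invalid>" when TS is 2 or more seconds ahead of NOW.
restart_age() {
  local d=$(( $1 - $2 ))
  if [ "$d" -lt -1 ]; then
    echo "<invalid>"
  elif [ "$d" -lt 0 ]; then
    echo "0s"
  else
    echo "${d}s"
  fi
}
```

A container whose start time is hours ahead of the node clock therefore always lands in the first branch.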
bash
# kubectl get pods -A -o wide | grep demo-3
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system coredns-dbbb9ff68-z8wjd 1/1 Running 0 (<invalid> ago) 131m x.x.x.x demo-3
kube-system etcd-demo-3 0/1 Running 0 (<invalid> ago) 10m x.x.x.x demo-3
kube-system kube-apiserver-demo-3 0/1 Running 0 (<invalid> ago) 10m x.x.x.x demo-3
kube-system kube-controller-manager-demo-3 0/1 Running 0 (<invalid> ago) 10m x.x.x.x demo-3
kube-system kube-proxy-f2fj8 1/1 Running 1 (<invalid> ago) 75d x.x.x.x demo-3
kube-system kube-scheduler-demo-3 0/1 Running 0 (<invalid> ago) 10m x.x.x.x demo-3
kube-system node-local-dns-xkfxc 1/1 Running 3 (<invalid> ago) 75d x.x.x.x demo-3
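1.2. Investigation
1.2.1. Compare container time with node time
Taking the etcd Pod as an example, its startedAt (UTC) is about 7 hours 40 minutes later than the node's current UTC time, i.e. it lies in the "future":
The trailing Z on the timestamp indicates that it is in UTC.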
bash
## Current node time (CST / UTC)
# date
Wed Jun 17 09:06:16 PM CST 2026
# date -u
Wed Jun 17 01:06:16 PM UTC 2026
## Pod container status (times are UTC)
# kubectl get pod -n kube-system etcd-demo-1 -o jsonpath='{.status.containerStatuses}' | jq .
[
{
...
"lastState": {
"terminated": {
"exitCode": 255,
"finishedAt": "2026-06-17T20:46:05Z",
"reason": "Unknown",
"startedAt": "2026-06-17T09:56:07Z"
}
},
"name": "etcd",
"ready": false,
"restartCount": 4,
"state": {
"running": {
"startedAt": "2026-06-17T20:46:16Z"
}
}
}
]
## Container create/start times recorded by containerd
## https://github.com/kubernetes-sigs/cri-tools/blob/v1.26.1/cmd/crictl/container.go#L862
# crictl inspect 2ea3cdadfec6f | grep -Ei "createdAt|startedAt|finishedAt"
"createdAt": "2026-06-18T04:46:16.17870878+08:00",
"startedAt": "2026-06-18T04:46:16.641950022+08:00",
"finishedAt": "0001-01-01T00:00:00Z",
## Convert to UTC for comparison
## containerd: 2026-06-17 20:46:16 UTC
## node now:   2026-06-17 13:06:16 UTC ← container time is in the "future" (7h40m ahead)
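The conversion above can also be done mechanically. The sketch below assumes GNU date (its `-d` option parses RFC 3339 timestamps with `Z` or a numeric offset); `to_epoch` and `skew_seconds` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# to_epoch TIMESTAMP: convert an RFC 3339 timestamp to Unix epoch seconds (GNU date).
to_epoch() {
  date -u -d "$1" +%s
}

# skew_seconds A B: how many seconds A is ahead of B.
skew_seconds() {
  echo $(( $(to_epoch "$1") - $(to_epoch "$2") ))
}

# Container start time vs. node time, both taken from the output above.
skew=$(skew_seconds "2026-06-17T20:46:16Z" "2026-06-17T13:06:16Z")
printf '%dh%02dm\n' $(( skew / 3600 )) $(( skew % 3600 / 60 ))   # 7h40m
```

Because everything is reduced to epoch seconds, the `+08:00` value from crictl and the `Z` value from kubectl can be compared directly.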
1.2.2. Check the underlying container state
Rule out a containerd / etcd fault. The underlying container is actually in the Running state:
bash
# crictl ps -a | grep 'etcd'
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
2ea3cdadfec6f 2e96e5913fc06 Less than a second ago Running etcd 4 7451b51061d22 etcd-demo-1
fab531bce16e7 2e96e5913fc06 3 hours ago Exited etcd 3 5f897d72fd206 etcd-demo-1
## The process really is running
# ps aux | grep -v 'grep' | grep 'etcd'
root 2089 ... etcd --advertise-client-urls=https://x.x.x.x:2379 ...
## etcd member status
# etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint status -w table
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://x.x.x.x:2379 | 683c58b549788bd9 | 3.5.15 | 30 MB | false | false | 59 | 131325454 | 131325454 | |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
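1.2.3. Find the root cause of the time problem
Every node has NTP configured, and their clocks agree when compared. My reasoning was this: containerd has been around for many years, so a bug like this is unlikely; more likely the time it read when creating the container really was in the "future". Since the time looks normal now, the time synchronization configured on the node must have corrected it afterwards. That is why I looked at the kernel log after boot with dmesg: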
bash
# dmesg -T | grep -iEw "rtc|time|clock"
[Wed Jun 17 20:54:08 2026] vmware: Host bus clock speed read from hypervisor : 66000000 Hz
[Wed Jun 17 20:54:08 2026] vmware: using clock offset of 5629468472 ns
[Wed Jun 17 20:54:08 2026] PM: RTC time: 12:54:08, date: 2026-06-17
[Wed Jun 17 20:54:09 2026] PTP clock support registered
## The system clock is set (UTC)
[Wed Jun 17 20:54:10 2026] rtc_cmos 00:01: setting system clock to 2026-06-17T12:54:10 UTC (1781700850)
[Wed Jun 17 20:54:11 2026] Loaded X.509 cert 'Build time autogenerated kernel key: ...'
## About 40 seconds after the kernel set the time, systemd-journald recorded a backward time jump
[Wed Jun 17 20:54:50 2026] systemd-journald[525]: Time jumped backwards, rotating.
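I exported the log recorded by systemd-journald with journalctl and trimmed it, because the output is long. The timeline turned out to be 20:54:xx → 04:45:xx → 20:54:xx: containerd started only after the host clock had been pushed into the "future" (04:45), so the containers it created carry "future" timestamps as well.
Note: this log does not directly prove that VMware changed the time. My reasoning is that before the clock jumped to 04:45:xx only system services such as ssh/cron had started, and they are not able to change the time, while VGAuthService is the service closest to the jump that is able to change it. I therefore treated it as the prime suspect, and the permanent fix later confirmed this.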
bash
## The output is long; write it to a file and extract the relevant parts
# journalctl -b --no-pager > journalctl.txt
## -b: only show logs since the current boot;
## --no-pager: print everything without paging
## Around the 50-second mark the timestamps visibly jump: 20:54:xx → 04:45:xx → 20:54:xx
Jun 17 20:54:19 demo-1 kernel: DMI: VMware, Inc. VMware Virtual Platform/...
Jun 17 20:54:19 demo-1 kernel: vmware: hypercall mode: 0x02
Jun 17 20:54:19 demo-1 kernel: Hypervisor detected: VMware
Jun 17 20:54:19 demo-1 kernel: vmware: TSC freq read from hypervisor : 2600.000 MHz
Jun 17 20:54:19 demo-1 kernel: vmware: Host bus clock speed read from hypervisor : 66000000 Hz
Jun 17 20:54:19 demo-1 kernel: vmware: using clock offset of 5629468472 ns
Jun 17 20:54:19 demo-1 kernel: Booting paravirtualized kernel on VMware hypervisor
Jun 17 20:54:26 demo-1 VGAuthService[799]: Using '/var/lib/vmware/VGAuth/aliasStore' for alias store root directory
Jun 17 20:54:26 demo-1 VGAuthService[799]: LoadCatalogAndSchema: Using '/etc/vmware-tools/vgauth/schemas' for SAML schemas
Jun 17 20:54:26 demo-1 VGAuthService[799]: LoadPrefs: Allowing 300 of clock skew for SAML date validation
Jun 17 20:54:26 demo-1 VGAuthService[799]: SAML_Init: Using xmlsec1 1.2.39 for XML signature support
Jun 17 20:54:26 demo-1 VGAuthService[799]: ServiceNetworkCreateSocketDir: Created socket directory '/var/run/vmware'
Jun 17 20:54:26 demo-1 VGAuthService[799]: BEGIN SERVICE
Jun 17 20:54:26 demo-1 systemd[1]: Starting etcd.service - Etcd Service...
## containerd starts only after the host clock has been pushed into the "future"
Jun 18 04:45:57 demo-1 systemd-resolved[796]: Clock change detected. Flushing caches.
Jun 18 04:45:57 demo-1 systemd[1]: Started kubelet.service - kubelet: The Kubernetes Node Agent.
Jun 18 04:45:58 demo-1 systemd[1]: Starting containerd.service - containerd container runtime...
Jun 18 04:46:16 demo-1 containerd[919]: time="..." msg="CreateContainer ... for &ContainerMetadata{Name:etcd,Attempt:4,} returns container id \"2ea3cdadfec6f...\""
Jun 18 04:46:16 demo-1 containerd[919]: time="..." msg="StartContainer for \"2ea3cdadfec6f...\""
Jun 18 04:46:16 demo-1 systemd[1]: Started cri-containerd-2ea3cdadfec6f....scope - libcontainer container 2ea3cdadfec6f....
Jun 18 04:46:16 demo-1 containerd[919]: time="..." msg="StartContainer for \"2ea3cdadfec6f...\" returns successfully"
## The clock is changed again
Jun 17 20:54:50 demo-1 systemd-resolved[796]: Clock change detected. Flushing caches.
Jun 17 20:54:50 demo-1 systemd-journald[525]: Time jumped backwards, rotating.
Jun 17 20:54:50 demo-1 systemd-timesyncd[797]: Contacted time server 91.189.91.157:123 (ntp.ubuntu.com).
Jun 17 20:54:50 demo-1 systemd-timesyncd[797]: Initial clock synchronization to Wed 2026-06-17 20:54:50.365713 CST.
Jun 17 20:54:50 demo-1 systemd[1]: etcd.service: Scheduled restart job, restart counter is at 2.
Jun 17 20:54:50 demo-1 systemd[1]: Starting etcd.service - Etcd Service...
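Finding such jumps by eye is tedious in a long journal. As a rough sketch, the filter below flags consecutive entries whose wall-clock timestamps differ by more than a threshold. It assumes input from `journalctl -b -o short-unix`, where each line starts with an epoch timestamp; the function name `find_time_jumps` and the 300-second default are choices made for this example.

```shell
#!/usr/bin/env bash
# find_time_jumps [THRESHOLD]: read `journalctl -b -o short-unix --no-pager`
# on stdin and print each entry whose timestamp differs from the previous
# entry by more than THRESHOLD seconds (default 300), in either direction.
find_time_jumps() {
  awk -v limit="${1:-300}" '
    # Skip lines that do not start with an epoch timestamp.
    $1 !~ /^[0-9]+(\.[0-9]+)?$/ { next }
    seen {
      delta = $1 - prev
      if (delta > limit || delta < -limit)
        printf "%s jump of %d s before: %s\n", (delta > 0 ? "forward" : "backward"), delta, $0
    }
    { prev = $1; seen = 1 }
  '
}
```

Usage would look like `journalctl -b -o short-unix --no-pager | find_time_jumps`; for the boot shown above it should report one forward and one backward jump.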
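1.3. Fix
Based on the log output above, the suspected root cause is a time jump caused by VMware, which leaves two options:
- change the boot order inside the VM (workaround);
- change the VMware time synchronization settings (permanent fix).
1.3.1. Workaround: change the boot order
Use this when you have no access to the VMware host. Create a service that waits for clock synchronization and make kubelet depend on it, so that containers are only started once the host clock is back to normal. containerd needs the same dependency, otherwise the next problem (section 2) appears.
The exact content of this custom service does not matter; replacing it with sleep 60 achieves the same goal.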
bash
## 1. Use a drop-in instead of editing kubelet.service directly
## /usr/lib/systemd/system/kubelet.service is owned by the rpm/deb package and gets overwritten on upgrade;
## a systemd drop-in (/etc/systemd/system/kubelet.service.d/) belongs to no package, similar to Helm custom values.
# systemctl status kubelet
## ● kubelet.service - kubelet: The Kubernetes Node Agent
## Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; preset: enabled)
## Drop-In: /usr/lib/systemd/system/kubelet.service.d/
## └─10-kubeadm.conf
# mkdir -p /etc/systemd/system/kubelet.service.d/
## 2. Create the service that waits for clock synchronization
cat > /etc/systemd/system/wait-for-clock-sync.service <<'EOF'
[Unit]
Description=Wait for system clock to be synchronized
After=systemd-timesyncd.service network-online.target
Before=kubelet.service
Wants=network-online.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash -c 'for i in $(seq 1 60); do if timedatectl show -p NTPSynchronized --value 2>/dev/null | grep -q yes; then exit 0; fi; sleep 1; done; exit 0'
TimeoutStartSec=70
[Install]
WantedBy=multi-user.target
EOF
## 3. Make kubelet depend on it
cat > /etc/systemd/system/kubelet.service.d/20-wait-time-sync.conf <<'EOF'
[Unit]
After=wait-for-clock-sync.service
Requires=wait-for-clock-sync.service
EOF
## 4. Reload and enable
# systemctl daemon-reload
# systemctl enable --now wait-for-clock-sync.service
## 5. Verify the dependency chain
# systemctl list-dependencies --reverse wait-for-clock-sync.service
# systemctl status wait-for-clock-sync.service
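The commands above only wire up kubelet, while this section also calls for the same dependency on containerd. A sketch of the matching drop-in follows; the helper name `write_clock_dropin` and its optional second argument (a target root, handy for previewing the file somewhere harmless) are additions for this example, and the drop-in file name simply mirrors the kubelet one.

```shell
#!/usr/bin/env bash
# write_clock_dropin UNIT [ROOT]: create ROOT/UNIT.d/20-wait-time-sync.conf
# so that UNIT starts only after wait-for-clock-sync.service.
# ROOT defaults to the real systemd directory /etc/systemd/system.
write_clock_dropin() {
  local unit="$1" root="${2:-/etc/systemd/system}"
  mkdir -p "$root/$unit.d" || return 1
  cat > "$root/$unit.d/20-wait-time-sync.conf" <<'EOF'
[Unit]
After=wait-for-clock-sync.service
Requires=wait-for-clock-sync.service
EOF
}
```

Calling `write_clock_dropin containerd.service` followed by `systemctl daemon-reload` would apply it; `systemctl cat containerd.service` then shows whether the drop-in was picked up.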
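1.3.2. Permanent fix: disable VMware time synchronization
In this environment the VMware host and the VM disagreed on the time, and the VM clock was thrown off during boot. Following the official VMware blog, disable all time synchronization:
- shut down the VM;
- edit the vmx configuration of that machine;
- power it on.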
bash
## On the ESXi host, edit the vmx of the corresponding VM
# grep -i 'time' /vmfs/volumes/data-1/demo-1/demo-1.vmx
time.synchronize.continue = "FALSE"
time.synchronize.restore = "FALSE"
time.synchronize.resume.disk = "FALSE"
time.synchronize.shrink = "FALSE"
time.synchronize.tools.startup = "FALSE"
time.synchronize.tools.enable = "FALSE"
time.synchronize.resume.host = "FALSE"
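## Optional check, written as a sketch: report any of the seven keys above that is
## missing from a vmx file or not set to "FALSE". The function name
## check_vmx_timesync is hypothetical; it relies only on grep.

```shell
# check_vmx_timesync FILE: print every time.synchronize.* key that is missing
# from FILE or not set to "FALSE"; succeed only if nothing is printed.
check_vmx_timesync() {
  local file="$1" key bad=0
  for key in continue restore resume.disk shrink tools.startup tools.enable resume.host; do
    if ! grep -Eiq "^time\.synchronize\.${key} *= *\"FALSE\" *\$" "$file"; then
      echo "time.synchronize.${key}"
      bad=1
    fi
  done
  return "$bad"
}
```

## Example: check_vmx_timesync /vmfs/volumes/data-1/demo-1/demo-1.vmx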
bash
## After disabling it and rebooting, dmesg shows no more time jumps
# dmesg -T | grep -iEw "rtc|time|clock"
[Thu Jun 18 18:14:06 2026] vmware: Host bus clock speed read from hypervisor : 66000000 Hz
[Thu Jun 18 18:14:06 2026] vmware: using clock offset of 4177512396 ns
[Thu Jun 18 18:14:07 2026] PM: RTC time: 10:14:06, date: 2026-06-18
[Thu Jun 18 18:14:08 2026] PTP clock support registered
[Thu Jun 18 18:14:08 2026] rtc_cmos 00:01: setting system clock to 2026-06-18T10:14:08 UTC (1781777648)
[Thu Jun 18 18:14:08 2026] Loaded X.509 cert 'Build time autogenerated kernel key: ...'
[Thu Jun 18 18:14:09 2026] Loaded X.509 cert 'Build time autogenerated kernel key: ...'
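2. Container name is reserved after reboot and containers cannot be created
2.1. Symptom
The trigger is the same as for problem 1 (the time jump leaves containerd's data partially written), and the permanent fix is again the approach from 1.3. This section covers how to recover by hand once the symptom has already appeared after a reboot.
This problem usually has one of two causes:
- non-atomic writes: when containerd creates a container it writes two records (name + details); if a power loss, kernel panic or time jump lets only one of them be written, a stale record is left behind;
- a time jump (the root cause in this case).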
bash
# kubectl get pods -n kube-system -o wide | grep demo-1
NAME READY STATUS RESTARTS AGE IP NODE
coredns-dbbb9ff68-pzr4j 1/1 Running 4 5h8m x.x.x.x demo-1
etcd-demo-1 0/1 Unknown 8 20m x.x.x.x demo-1
kube-apiserver-demo-1 0/1 Unknown 32 20m x.x.x.x demo-1
kube-controller-manager-demo-1 0/1 Unknown 20 20m x.x.x.x demo-1
kube-proxy-jgx42 1/1 Running 0 34m x.x.x.x demo-1
kube-scheduler-demo-1 0/1 Unknown 19 20m x.x.x.x demo-1
node-local-dns-vgjlf 1/1 Running 0 28m x.x.x.x demo-1
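2.2. Investigation
2.2.1. Check the underlying container and its logs
The container has been created but is in the Exited state, and its log file does not exist (the container never really started):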
bash
# crictl ps -a | grep -i 'etcd'
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
c93ac13a41b67 2e96e5913fc06 Less than a second ago Exited etcd 8 f2fc82e9a4d98 etcd-demo-1
## The log file does not exist
# crictl logs 624bf43aff2db
FATA[0000] failed to try resolving symlinks in path "/var/log/pods/kube-system_etcd-demo-1_c93ac13a41b67d89d5dbbfbc90cf9c8f/etcd/8.log":
lstat /var/log/pods/kube-system_etcd-demo-1_c93ac13a41b67d89d5dbbfbc90cf9c8f/etcd/8.log: no such file or directory
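2.2.2. Check the kubelet log
Containers are started by the kubelet, so look at its errors. The log lines below all say the same thing: kubelet wants to create etcd-demo-1, but that name is already reserved in containerd: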
bash
# journalctl -u kubelet --since "3 min ago" --no-pager | grep -i 'etcd'
Jun 18 23:02:36 demo-1 kubelet[23429]: E0618 23:02:36.675072 ... "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to reserve sandbox name \"etcd-demo-1_kube-system_df1fae7c70ff1a1dfc6127a8f7bf67a2_6\": name ... is reserved for \"7cc9d6627964...\""
Jun 18 23:02:36 demo-1 kubelet[23429]: E0618 23:02:36.675212 ... "Failed to create sandbox for pod" err="... failed to reserve sandbox name \"etcd-demo-1_kube-system_..._6\": name ... is reserved for \"7cc9d6627964...\"" pod="kube-system/etcd-demo-1"
Jun 18 23:02:36 demo-1 kubelet[23429]: E0618 23:02:36.675273 ... "CreatePodSandbox for pod failed" err="... failed to reserve sandbox name \"etcd-demo-1_kube-system_..._6\": name ... is reserved for \"7cc9d6627964...\"" pod="kube-system/etcd-demo-1"
Jun 18 23:02:36 demo-1 kubelet[23429]: E0618 23:02:36.675410 ... "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"etcd-demo-1_kube-system(...)\" with CreatePodSandboxError: ... name ... is reserved for \"7cc9d6627964...\"" pod="kube-system/etcd-demo-1"
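2.3. Fix
Additional note: the containerd community already has a matching issue, #10848, which is fixed in 2.2.x / 2.3.x (see PR #11576); upgrading avoids a recurrence.
Stop the kubelet service, clean up the reserved sandbox by hand, then restart containerd: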
bash
## Stop kubelet, otherwise it keeps retrying the creation
# systemctl stop kubelet
## Find the reserved sandbox (look up the duplicate by name)
# crictl pods ## list all sandboxes
# crictl pods -q --name <name> ## get the matching ID
## Remove the stale sandbox
# crictl stopp $ID
# crictl rmp -f $ID
# systemctl restart containerd
# systemctl start kubelet
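When several static Pods are affected, the IDs that hold the reserved names can be pulled out of the kubelet log instead of being read off by hand. This is a sketch under assumptions: `reserved_ids` is a hypothetical helper, and it relies on the error text containing `is reserved for "<id>"` as in the log lines above (with or without backslash-escaped quotes).

```shell
#!/usr/bin/env bash
# reserved_ids: read kubelet log lines on stdin and print the unique sandbox
# IDs that appear after "is reserved for".
reserved_ids() {
  grep -oE 'is reserved for \\?"[0-9a-f]+' | grep -oE '[0-9a-f]+$' | sort -u
}
```

For example, `journalctl -u kubelet --since "3 min ago" --no-pager | reserved_ids` lists the candidates, which can then be compared with the output of `crictl pods -q` before anything is removed.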