k8s 1.10.26: a containerd failure that left kubectl unusable
After booting the k8s 1.10.26 node, kubectl reported the following errors:
bash
[root@master ~]# kubectl get no
E0515 08:03:00.914894 7993 memcache.go:265] couldn't get current server API group list: Get "https://192.168.80.50:6443/api?timeout=32s": dial tcp 192.168.80.50:6443: connect: connection refused
E0515 08:03:00.915787 7993 memcache.go:265] couldn't get current server API group list: Get "https://192.168.80.50:6443/api?timeout=32s": dial tcp 192.168.80.50:6443: connect: connection refused
E0515 08:03:00.917903 7993 memcache.go:265] couldn't get current server API group list: Get "https://192.168.80.50:6443/api?timeout=32s": dial tcp 192.168.80.50:6443: connect: connection refused
E0515 08:03:00.920028 7993 memcache.go:265] couldn't get current server API group list: Get "https://192.168.80.50:6443/api?timeout=32s": dial tcp 192.168.80.50:6443: connect: connection refused
E0515 08:03:00.922527 7993 memcache.go:265] couldn't get current server API group list: Get "https://192.168.80.50:6443/api?timeout=32s": dial tcp 192.168.80.50:6443: connect: connection refused
The connection to the server 192.168.80.50:6443 was refused - did you specify the right host or port?
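"Connection refused" means nothing is listening on the API server port, so the problem is on the server side rather than in kubectl itself. A minimal sketch for confirming that from the shell, assuming bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available; the function name `port_open` is made up for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp pseudo-device; `timeout` guards against a hanging connect.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

With the address from the error above, `port_open 192.168.80.50 6443 && echo listening || echo "not listening"` would show whether the API server has come up.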
Check the kubelet status with systemctl status kubelet:

bash
kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor pre>
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Thu 2025-05-15 0>
Docs: https://kubernetes.io/docs/
Process: 8114 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CO>
Main PID: 8114 (code=exited, status=1/FAILURE)
Check the logs with journalctl -xu kubelet.
Key log lines:
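The kubelet journal is usually long, so it helps to narrow it down to lines that mention the container runtime. The filter below is only a sketch: the function name `runtime_hints` and the search patterns are assumptions chosen for illustration, not strings taken from the actual log.

```bash
#!/usr/bin/env bash
# Hypothetical filter: keep only log lines that mention the container runtime.
# Reads log text on stdin; the patterns are illustrative guesses, adjust as needed.
runtime_hints() {
  # `|| true` keeps the exit status at 0 when nothing matches.
  grep -iE 'containerd|container runtime' || true
}
```

A possible invocation is `journalctl -u kubelet --no-pager -n 200 | runtime_hints`.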
Start containerd and check the error:
journalctl -xe
● containerd.service - containerd container runtime
Loaded: loaded (/usr/lib/systemd/system/containerd.service; disabled; vendor>
Active: inactive (dead)
Docs: https://containerd.io
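The status output carries two separate facts: the `Loaded:` line shows the unit's enablement (`disabled` means systemd will not start it at boot), and the `Active:` line shows whether it is running now. A small sketch that pulls both out of the status text; the function name `unit_summary` is hypothetical and the parsing assumes the layout shown above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: summarize `systemctl status <unit>` text read from stdin.
unit_summary() {
  awk '
    /Loaded:/ {
      # The enablement state is the second ";"-separated field of the Loaded line.
      n = split($0, parts, ";")
      if (n >= 2) { gsub(/^[ \t]+|[ \t]+$/, "", parts[2]); enabled = parts[2] }
    }
    /Active:/ {
      # The word right after "Active:" is the current state.
      for (i = 1; i < NF; i++) if ($i == "Active:") active = $(i + 1)
    }
    END { printf "enablement=%s active=%s\n", enabled, active }
  '
}
```

For example, `systemctl status containerd --no-pager | unit_summary`. If the unit is reported as disabled, `systemctl enable containerd` would make it start on the next boot; whether that is wanted on this node is a judgment call.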
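Fix:
Investigation showed that the containerd configuration file, /etc/containerd/config.toml, was the problem. Back it up, regenerate the default configuration, then reapply the two customizations: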
bash
systemctl stop containerd.service
cp /etc/containerd/config.toml /etc/containerd/config.toml.bak
sudo containerd config default > $HOME/config.toml
sudo cp $HOME/config.toml /etc/containerd/config.toml
sudo sed -i "s#registry.k8s.io/pause#registry.cn-hangzhou.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
sudo sed -i "s#SystemdCgroup = false#SystemdCgroup = true#g" /etc/containerd/config.toml
# Start containerd
systemctl start containerd.service
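Because the two `sed` commands silently do nothing when their patterns are absent from the generated file, it can be worth confirming that both substitutions actually landed. A minimal sketch; the function name `check_containerd_config` is made up, and it only looks for the two strings written by the `sed` lines above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: verify the two edits applied to a containerd config file.
# Returns 0 if both are present, 1 otherwise.
check_containerd_config() {
  local cfg="${1:-/etc/containerd/config.toml}"
  local status=0
  if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$cfg"; then
    echo "SystemdCgroup: true"
  else
    echo "SystemdCgroup: NOT true"
    status=1
  fi
  if grep -Fq 'registry.cn-hangzhou.aliyuncs.com/google_containers/pause' "$cfg"; then
    echo "pause image: mirror"
  else
    echo "pause image: NOT mirror"
    status=1
  fi
  return "$status"
}
```

Running `check_containerd_config` with no argument checks /etc/containerd/config.toml; a non-zero exit status means at least one edit is missing.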

Start kubelet:
bash
systemctl start kubelet
systemctl status kubelet
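After kubelet starts, the control plane may need some time before the API server answers again, so an immediate `kubectl` call can still be refused. A small retry loop, sketched here with a hypothetical `wait_until` function, avoids re-running the command by hand:

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      echo "ready after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "still failing after ${attempts} attempt(s)" >&2
  return 1
}
```

For instance, `wait_until 30 5 kubectl get no` would poll every 5 seconds for up to 30 attempts; both numbers are arbitrary choices.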

Problem solved.
kubectl get no works normally again.