利用Kubespray安装生产环境的k8s集群-排错篇

利用Kubespray安装生产环境的k8s集群 - 排错

Ansible 排错

  1. Ansible 安装后需要加入到PATH,以便能够直接运行。 一般用非root用户运行,注意在playbook或者inventory里要显性定义become属性。 最好采用ansible运行账号可以免密码sudo的模式。
  2. Ansible 的配置文件,在kuberspray安装时已经存在,需要将Ansible运行配置文件指向相应位置。配置文件约定了Ansible运行时需要的各类依赖的位置,非常重要。
    可通过运行 ansible --version 查看ansible的各类配置情况
    以下代码可以看到ansible尚未绑定任何配置文件。
bash 复制代码
bill@jump-server:~/kubespray/playbooks$ ansible --version
ansible [core 2.16.14]
  config file = None # 未绑定默认配置文件
  configured module search path = ['/home/bill/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/bill/.local/lib/python3.11/site-packages/ansible
  ansible collection location = /home/bill/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/bill/.local/bin/ansible
  python version = 3.11.2 (main, Nov 30 2024, 21:22:50) [GCC 12.2.0] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True

两种(二选一 Alternative)临时定义实现配置文件绑定:

bash 复制代码
export ANSIBLE_CONFIG=/home/bill/kubespray/ansible.cf  # 方法1 定义系统变量
ansible-playbook -i ../inventory/mycluster/inventory.ini -c /home/bill/kubespray/ansible.cf cluster.yml  # 方法2, 每次运行ansible时 用 -c 去显性指定
  1. Ansbile 是可以反复执行同一个playbook,如果上一次执行时有任务失败有任务成功,ansible会从新执行失败的任务,并掠过成功的任务,然后记录每次执行成功和失败 已经未变动的TASK情况。

Kubernets 排错

用一条ansible 命令去执行cluster.yml 这一个playbook,即完成了整个kubernets集群的安装。然后你就可以ssh到各个node 去验收了。

我自己装完后几乎没发现大的问题。 小问题倒是有几个。

  1. admin.conf 配置文件没有和kubectl绑定,导致第一次运行kubectl报错。只有定义好kubectl执行的config文件即可。
  2. 3台master node上,都是在127.0.0.1:6443监听和暴露Kubernets API server的,这样显然无法提供api server的高可用。
    我也不知道给3 master node的k8s做api server 高可用的最佳实践是什么,先挖个坑。

附录 - 安装完成后的Kubernets Cluster 情况

1 . 大致的资源

bash 复制代码
bill@master-1:~$ sudo kubectl get ns -A
NAME              STATUS   AGE
default           Active   2d6h
kube-node-lease   Active   2d6h
kube-public       Active   2d6h
kube-system       Active   2d6h
bill@master-1:~$ sudo kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE
kube-system   calico-kube-controllers-69d8557557-txxwk   1/1     Running   0             2d6h
kube-system   calico-node-2wj9h                          1/1     Running   0             2d6h
kube-system   calico-node-4ntch                          1/1     Running   0             2d6h
kube-system   calico-node-66pd7                          1/1     Running   0             2d6h
kube-system   calico-node-vs699                          1/1     Running   0             2d6h
kube-system   calico-node-x29jb                          1/1     Running   0             2d6h
kube-system   coredns-5c54f84c97-5src4                   1/1     Running   0             2d6h
kube-system   coredns-5c54f84c97-lbl9g                   1/1     Running   0             2d6h
kube-system   dns-autoscaler-76ddddbbc-9c56w             1/1     Running   0             2d6h
kube-system   kube-apiserver-master-1                    1/1     Running   1             2d6h
kube-system   kube-apiserver-master-2                    1/1     Running   1             2d6h
kube-system   kube-apiserver-master-3                    1/1     Running   1             2d6h
kube-system   kube-controller-manager-master-1           1/1     Running   3 (11h ago)   2d6h
kube-system   kube-controller-manager-master-2           1/1     Running   2             2d6h
kube-system   kube-controller-manager-master-3           1/1     Running   2             2d6h
kube-system   kube-proxy-52rsj                           1/1     Running   0             2d6h
kube-system   kube-proxy-c5z9k                           1/1     Running   0             2d6h
kube-system   kube-proxy-hctbg                           1/1     Running   0             2d6h
kube-system   kube-proxy-hkth5                           1/1     Running   0             2d6h
kube-system   kube-proxy-rbpp2                           1/1     Running   0             2d6h
kube-system   kube-scheduler-master-1                    1/1     Running   1             2d6h
kube-system   kube-scheduler-master-2                    1/1     Running   2             2d6h
kube-system   kube-scheduler-master-3                    1/1     Running   1             2d6h
kube-system   nginx-proxy-worker-1                       1/1     Running   0             2d6h
kube-system   nginx-proxy-worker-2                       1/1     Running   0             2d6h
kube-system   nodelocaldns-7ftp2                         1/1     Running   0             2d6h
kube-system   nodelocaldns-ckxd4                         1/1     Running   0             2d6h
kube-system   nodelocaldns-kp64g                         1/1     Running   0             2d6h
kube-system   nodelocaldns-x6bhp                         1/1     Running   0             2d6h
kube-system   nodelocaldns-z5grf                         1/1     Running   0             2d6h
bill@master-1:~$ sudo kubectl get nodes -A
NAME       STATUS   ROLES           AGE    VERSION
master-1   Ready    control-plane   2d6h   v1.32.0
master-2   Ready    control-plane   2d6h   v1.32.0
master-3   Ready    control-plane   2d6h   v1.32.0
worker-1   Ready    <none>          2d6h   v1.32.0
worker-2   Ready    <none>          2d6h   v1.32.0
bill@master-1:~$ sudo kubectl get svc -A
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes   ClusterIP   10.233.0.1   <none>        443/TCP                  2d6h
kube-system   coredns      ClusterIP   10.233.0.3   <none>        53/UDP,53/TCP,9153/TCP   2d6h
bill@master-1:~$ sudo kubectl get ds -A
NAMESPACE     NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   calico-node    5         5         5       5            5           kubernetes.io/os=linux   2d6h
kube-system   kube-proxy     5         5         5       5            5           kubernetes.io/os=linux   2d6h
kube-system   nodelocaldns   5         5         5       5            5           kubernetes.io/os=linux   2d6h

确认没问题了, 你就可以用Lens 来管理和监控新的K8s 集群了。

  1. 以 systemd方式部署的二进制etcd
bash 复制代码
bill@master-1:~$ etcdctl version
etcdctl version: 3.5.16
API version: 3.5
bill@master-1:~$ sudo systemctl status etcd
● etcd.service - etcd
     Loaded: loaded (/etc/systemd/system/etcd.service; enabled; preset: enabled)
     Active: active (running) since Mon 2025-01-20 12:21:29 EST; 2 days ago
   Main PID: 19218 (etcd)
      Tasks: 13 (limit: 4602)
     Memory: 202.9M
        CPU: 48min 35.517s
     CGroup: /system.slice/etcd.service
             └─19218 /usr/local/bin/etcd

Jan 22 18:54:59 master-1 etcd[19218]: {"level":"info","ts":"2025-01-22T18:54:59.823494-0500","caller":"mvcc/hash.go:151","msg":"storing new hash","hash":1770769020,"revision":116007,"compa>
Jan 22 18:59:59 master-1 etcd[19218]: {"level":"info","ts":"2025-01-22T18:59:59.827207-0500","caller":"mvcc/index.go:214","msg":"compact tree index","revision":116632}
  1. CRI 采用了containerd
bash 复制代码
bill@master-1:~$ sudo systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/etc/systemd/system/containerd.service; enabled; preset: enabled)
     Active: active (running) since Mon 2025-01-20 12:14:33 EST; 2 days ago
       Docs: https://containerd.io
   Main PID: 16548 (containerd)
      Tasks: 108
     Memory: 995.8M
        CPU: 12min 17.084s
     CGroup: /system.slice/containerd.service
             ├─16548 /usr/local/bin/containerd
             ├─20832 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id b8999694f0d531b9cff8a7bb5797d7e091988801296d72ba615d10851eee175d -address /run/containerd/containerd.sock
             ├─21418 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id 06de62add7a4030a3af3a4f2978d0e8d88ef8af100f682b9f350f3b2bf1fbdf7 -address 

以上 Kubernets Cluster 通过Kuberspray进行全自动化定制安装就搞定了。 下一篇讲讲哪些常用的重要的配置是需要设置的,已经如何设置。

相关推荐
涛声依旧3931625 分钟前
构建部署kubernetes所需主机
linux·运维·云原生·容器·kubernetes
槐序深巷里打雨伞的人1 小时前
k8s中部署prometheus并监控k8s集群以及nginx案例
nginx·kubernetes·prometheus
eRTE XFUN2 小时前
Redis 设置密码(配置文件、docker容器、命令行3种场景)
数据库·redis·docker
万象.2 小时前
Docker网络原理
网络·docker·容器
春日见2 小时前
从底层思维3分钟彻底弄清卷积神经网络CNN
人工智能·深度学习·神经网络·计算机视觉·docker·cnn·计算机外设
wudl55662 小时前
MySQL 8.0.42 Docker 开发部署手册
数据库·mysql·docker
IT一氪3 小时前
K8s Admin:一个轻量级的多集群 Kubernetes 管理平台
云原生·容器·kubernetes
大新新大浩浩3 小时前
Deerflow部署-X86架构-在ubuntu2204操作系统上使用docker模式部署
docker·容器·架构
魔都吴所谓3 小时前
【Linux】Ubuntu22.04 Docker+四大数据库(挂载本地)一键安装脚本
linux·数据库·docker
大道V至简3 小时前
解决docker apt安装缓慢,切换国内源
docker