ARM64 Adaptation Series

Chapter 1: Deploying KubeSphere and K8s in an arm64 Environment


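Preface

The business platform I operate has to be deployed in a customer's environment, which runs on Huawei 910B machines. My organization does not have any of those yet, only older arm64 machines, so I am doing the adaptation on these first to avoid scrambling at the last minute.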


1. Gathering Machine Information

1.1 CPU Information

lscpu

Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    1
Core(s) per socket:    40
Socket(s):             1
NUMA node(s):          1
Model:                 1
CPU max MHz:           2500.0000
CPU min MHz:           600.0000
BogoMIPS:              40.00
L1d cache:             unknown size
L1i cache:             unknown size
L2 cache:              unknown size
L3 cache:              unknown size
NUMA node0 CPU(s):     0-39
Flags:                 fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid asimdrdm

1.2 Operating System Version

hostnamectl

   Static hostname: datax3
         Icon name: computer-server
           Chassis: server
        Machine ID: 570e6fdcda17439886d6364f7a3ba217
           Boot ID: c6b431eb288d4de4b62a823a7f383e7b
  Operating System: CentOS Linux 7 (AltArch)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 4.14.0-115.el7a.0.1.aarch64
      Architecture: arm64

1.3 Disk Partitions

lsblk

NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0  1.8T  0 disk 
├─sda2            8:2    0    1G  0 part /boot
├─sda3            8:3    0  1.8T  0 part 
│ ├─centos-swap 253:1    0 15.9G  0 lvm  
│ ├─centos-home 253:2    0  1.8T  0 lvm  /home
│ └─centos-root 253:0    0   50G  0 lvm  /
└─sda1            8:1    0  200M  0 part /boot/efi

1.4 Kernel Check

The main point here is to check the BPF support of the running kernel.

grep BPF /boot/config-$(uname -r)

CONFIG_BPF=y
# CONFIG_BPF_SYSCALL is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
# CONFIG_NET_ACT_BPF is not set
CONFIG_BPF_JIT=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
# CONFIG_TEST_BPF is not set

Problem found: this kernel is built without CONFIG_BPF_SYSCALL, so it needs to be upgraded.
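
The manual grep can be wrapped in a small check that prints only the options that are missing. This is a minimal sketch, assuming the kernel config is at /boot/config-$(uname -r) as on this CentOS machine; the function name missing_kernel_options and the option list passed to it are illustrative, not part of any tool.

```shell
# Print every option from the argument list that is not enabled
# (neither =y nor =m) in the given kernel config file.
# missing_kernel_options is a hypothetical helper name.
missing_kernel_options() {
  config_file=$1
  shift
  for opt in "$@"; do
    # An enabled option appears as "CONFIG_X=y" or "CONFIG_X=m";
    # a disabled one appears as "# CONFIG_X is not set" or is absent.
    if ! grep -Eq "^${opt}=(y|m)$" "$config_file"; then
      echo "$opt"
    fi
  done
}

# Check the running kernel, but only if its config file is readable.
config="/boot/config-$(uname -r)"
if [ -r "$config" ]; then
  missing_kernel_options "$config" CONFIG_BPF CONFIG_BPF_SYSCALL
fi
```

On the 4.14 kernel shown above, this would print CONFIG_BPF_SYSCALL, and it prints nothing once both options are enabled.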


2. Upgrading the Kernel

2.1 Using the Alibaba Cloud ARM Mirror

Alibaba Cloud ARM mirror: https://developer.aliyun.com/mirror/centos-altarch/?spm=a2c6h.13651104.d-2001.3.40cd320cKIvAMX

# Fetch the repo file
wget http://mirrors.aliyun.com/repo/Centos-altarch-7.repo -O /etc/yum.repos.d/CentOS-Base.repo
# Upgrade the kernel
yum clean all
yum makecache
yum list kernel
yum update -y kernel
reboot

2.2 Checking BPF Support After the Upgrade

hostnamectl

   Static hostname: datax3
         Icon name: computer-server
           Chassis: server
        Machine ID: 570e6fdcda17439886d6364f7a3ba217
           Boot ID: c6b431eb288d4de4b62a823a7f383e7b
  Operating System: CentOS Linux 7 (AltArch)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 4.18.0-348.20.1.el7.aarch64
      Architecture: arm64

grep BPF /boot/config-$(uname -r)

CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
# CONFIG_BPF_PRELOAD is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_BPFILTER is not set
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_TEST_BPF=m

CONFIG_BPF_SYSCALL=y is now set, so the kernel is ready.


3. Installing the Base Packages

Docker's DNS servers need to be configured at this step.

yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Keep Docker's data under /home (the large volume) via a symlink at /var/lib/docker
mkdir -p /home/data/docker_data/docker/
ln -s /home/data/docker_data/docker/ /var/lib/
yum install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Make sure the config directory exists before writing daemon.json
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
  "dns": [
    "8.8.8.8",
    "114.114.114.114"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF

systemctl start docker
systemctl enable docker
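
The exec-opts entry in daemon.json switches Docker to the systemd cgroup driver, and it is easy to lose when the file is edited later. Below is a rough sketch of a sanity check, assuming the file lives at /etc/docker/daemon.json; uses_systemd_cgroup is a made-up helper name, and the check is a plain text match rather than a real JSON parse.

```shell
# Return success if the given daemon.json selects the systemd cgroup driver.
# uses_systemd_cgroup is a hypothetical helper; it only does a text match.
uses_systemd_cgroup() {
  grep -q 'native.cgroupdriver=systemd' "$1"
}

daemon_json=/etc/docker/daemon.json
if [ -r "$daemon_json" ]; then
  if uses_systemd_cgroup "$daemon_json"; then
    echo "daemon.json: systemd cgroup driver configured"
  else
    echo "daemon.json: systemd cgroup driver NOT configured"
  fi
fi
```

Once the daemon is running, docker info --format '{{.CgroupDriver}}' reports the driver that is actually in effect.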

4. Preparing the Installation Tool

4.1 Downloading kk

kk download page: https://github.com/kubesphere/kubekey/releases/tag/v3.1.8

wget https://github.com/kubesphere/kubekey/releases/download/v3.1.8/kubekey-v3.1.8-linux-arm64.tar.gz

4.2 Preparing config-sample.yaml

tar -xzvf kubekey-v3.1.8-linux-arm64.tar.gz
chmod a+x ./kk
# Generate the configuration file
./kk create config --with-kubernetes v1.23.17 --with-kubesphere

Edit the host entries in the configuration file and add the architecture setting arch: arm64 to each one.

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: datax3, address: xxx.xxx.103.6, internalAddress: xxx.xxx.103.6, user: root, arch: arm64 ,password: "smartcore"}
  roleGroups:
    etcd:
    - datax3
    control-plane: 
    - datax3
    worker:
    - datax3

Official documentation on adding the architecture setting: https://kubesphere.io/zh/docs/v4.1/03-installation-and-upgrade/02-install-kubesphere/02-install-kubernetes-and-kubesphere/
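
Note that uname -m and lscpu report aarch64 on this machine, while the hosts entry uses arm64. A small sketch of that mapping, assuming amd64 is the value for x86_64 hosts; kk_arch is a hypothetical helper name, not a KubeKey command.

```shell
# Map `uname -m` output to the architecture name written in the hosts entry.
# kk_arch is a hypothetical helper; only the two common cases are handled.
kk_arch() {
  case "$1" in
    aarch64|arm64) echo arm64 ;;
    x86_64|amd64)  echo amd64 ;;
    *) echo "unsupported machine type: $1" >&2; return 1 ;;
  esac
}

# Print the value for the current machine (ignore unsupported types here).
kk_arch "$(uname -m)" || true
```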

4.3 Starting the Deployment

# KKZONE=cn makes kk download from sources reachable in mainland China
export KKZONE=cn
./kk create cluster -f /home/k8s-one-node/config-sample.yaml -y --debug

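4.4 Replacing the Backend Image After Deployment

The default-http-backend Deployment is created with an amd64 image, which fails on arm64 (see 6.2), so swap in the arm64 build.

# Pull from a mirror inside China
docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Retag the image after the pull
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4  mirrorgooglecontainers/defaultbackend-arm64:1.4
# Point the Deployment at the new image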
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system

5. Deployment Complete: Accessing the Web Console

The console is reachable and works without problems.


6. Problems Encountered During Deployment

6.1 Calico Fails to Start Because of BPF

Error message: Error from server (BadRequest): pod ks-installer-ddbcf44f8-8zhb5 does not have a host assigned

Investigation pointed to Calico.

kubectl get pod -A 
NAMESPACE           NAME                                           READY   STATUS     RESTARTS   AGE
kube-system         calico-kube-controllers-6f996c8485-7f6rf       0/1     Pending    0          20m
kube-system         calico-node-q82bk                              0/1     Init:0/3   0          20m
kube-system         coredns-5667b47695-qsd6f                       0/1     Pending    0          20m
kube-system         coredns-5667b47695-rttmr                       0/1     Pending    0          20m
kube-system         kube-apiserver-datax3                          1/1     Running    0          21m
kube-system         kube-controller-manager-datax3                 1/1     Running    0          21m
kube-system         kube-proxy-2h4xf                               1/1     Running    0          20m
kube-system         kube-scheduler-datax3                          1/1     Running    0          21m
kube-system         nodelocaldns-bjfm7                             1/1     Running    0          20m
kube-system         openebs-localpv-provisioner-7bbcf865cd-pmk7s   0/1     Pending    0          20m
kubesphere-system   ks-installer-ddbcf44f8-8zhb5                   0/1     Pending    0          20m

Inspect the pod calico-kube-controllers-6f996c8485-7f6rf:

kubectl describe pods  calico-kube-controllers-6f996c8485-7f6rf -n kube-system
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  31s (x22 over 22m)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

Inspect the pod calico-node-q82bk:

kubectl describe pods  calico-node-q82bk  -n kube-system
  Warning  FailedMount  7s (x2 over 2m25s)  kubelet            (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[bpffs], unattached volumes=[var-run-calico bpffs kube-api-access-f8tww xtables-lock policysync host-local-net-dir cni-bin-dir cni-log-dir sys-fs nodeproc lib-modules cni-net-dir var-lib-calico]: timed out waiting for the condition
  Warning  FailedMount  2s (x19 over 22m)   kubelet            MountVolume.SetUp failed for volume "bpffs" : hostPath type check failed: /sys/fs/bpf is not a directory

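Check the kernel's BPF support and make sure CONFIG_BPF and CONFIG_BPF_SYSCALL are both set to y.

eBPF was introduced in Linux 3.18, so the 4.14 kernel is new enough by version number; this particular build simply has CONFIG_BPF_SYSCALL disabled, which matches the missing /sys/fs/bpf directory.

Fixing this requires a kernel upgrade. Since the Alibaba Cloud mirror carries a 4.18 kernel, I skipped building one by hand and upgraded directly with yum. Fortunately, that was enough to fix it.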
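
The calico-node event boils down to a hostPath check on /sys/fs/bpf, which can be reproduced on the node before installing anything. A minimal sketch under that assumption; check_bpffs_dir is a made-up helper name.

```shell
# Report whether a path exists as a directory, which is what the
# hostPath type check in the calico-node event requires.
# check_bpffs_dir is a hypothetical helper name.
check_bpffs_dir() {
  if [ -d "$1" ]; then
    echo "ok: $1 is a directory"
  else
    echo "missing: $1 is not a directory"
    return 1
  fi
}

check_bpffs_dir /sys/fs/bpf || echo "hint: check CONFIG_BPF_SYSCALL in the kernel config"
```

Running this on a node before kk create cluster catches the problem without waiting for the pods to time out.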

6.2 default-http-backend Fails to Start

Error message:

kubesphere-controls-system     default-http-backend-659cc67b6b-652n7              0/1     CrashLoopBackOff    5 (87s ago)   6m6s

To investigate, inspect the pod default-http-backend-659cc67b6b-652n7:

kubectl describe pods  default-http-backend-659cc67b6b-652n7 -n kubesphere-controls-system


  Normal   Scheduled               8m26s                  default-scheduler  Successfully assigned kubesphere-controls-system/default-http-backend-659cc67b6b-652n7 to datax3
  Warning  FailedCreatePodSandBox  8m8s (x2 over 8m16s)   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "default-http-backend-659cc67b6b-652n7": Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: unable to freeze: unknown
  Normal   SandboxChanged          8m7s (x2 over 8m15s)   kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 7m56s                  kubelet            Pulling image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4"
  Normal   Pulled                  7m19s                  kubelet            Successfully pulled image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" in 5.626141615s (37.059010311s including waiting)
  Normal   Pulled                  6m23s (x3 over 7m12s)  kubelet            Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" already present on machine
  Normal   Created                 6m22s (x4 over 7m18s)  kubelet            Created container default-http-backend
  Normal   Started                 6m18s (x4 over 7m13s)  kubelet            Started container default-http-backend
  Warning  BackOff                 3m14s (x23 over 7m5s)  kubelet            Back-off restarting failed container

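The image is wrong: the events show the amd64 build (defaultbackend-amd64:1.4) being pulled onto an arm64 node. I looked for a matching thread on the KubeSphere forum.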
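
The events above give the mismatch away through the image name alone. Here is a rough sketch of that name-based check; image_name_arch is a hypothetical helper, and the "-amd64" / "-arm64" suffix is a naming convention that only some images follow, so an empty result proves nothing.

```shell
# Guess the architecture from an image reference's "-<arch>" name suffix,
# e.g. ".../defaultbackend-amd64:1.4" -> amd64. Prints nothing when the
# name carries no such suffix. image_name_arch is a hypothetical helper.
image_name_arch() {
  last=${1##*/}       # final path component, e.g. defaultbackend-amd64:1.4
  name=${last%%:*}    # drop the tag
  name=${name%%@*}    # drop a digest marker, if any
  case "$name" in
    *-amd64) echo amd64 ;;
    *-arm64) echo arm64 ;;
  esac
}

image=registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4
node_arch=arm64
img_arch=$(image_name_arch "$image")
if [ -n "$img_arch" ] && [ "$img_arch" != "$node_arch" ]; then
  echo "mismatch: $image is built for $img_arch, node is $node_arch"
fi
```

For an image that is already pulled, docker image inspect --format '{{.Architecture}}' followed by the image name reads the architecture from the image metadata, which is more reliable than the name.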

Thread: https://ask.kubesphere.com.cn/forum/d/8874-arm-default-http-backend-elasticsearch-logging-curator/11

I followed the approach described in that thread.

# Pull from a mirror inside China
docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Retag the image after the pull
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4  mirrorgooglecontainers/defaultbackend-arm64:1.4
# Replace the image in the Deployment
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system

Check the cluster state after the replacement:

kubectl get pod -A 
NAMESPACE                      NAME                                               READY   STATUS    RESTARTS   AGE
kube-system                    calico-kube-controllers-6f996c8485-7b7cw           1/1     Running   0          24m
kube-system                    calico-node-qljdf                                  1/1     Running   0          24m
kube-system                    coredns-7bfd7cb54c-ctcps                           1/1     Running   0          24m
kube-system                    coredns-7bfd7cb54c-nb7xz                           1/1     Running   0          24m
kube-system                    kube-apiserver-datax3                              1/1     Running   0          24m
kube-system                    kube-controller-manager-datax3                     1/1     Running   0          24m
kube-system                    kube-proxy-s4scz                                   1/1     Running   0          24m
kube-system                    kube-scheduler-datax3                              1/1     Running   0          24m
kube-system                    nodelocaldns-pxmfx                                 1/1     Running   0          24m
kube-system                    openebs-localpv-provisioner-7bbcf865cd-qr8qq       1/1     Running   0          24m
kube-system                    snapshot-controller-0                              1/1     Running   0          19m
kubesphere-controls-system     default-http-backend-658d66d59f-mvxmf              1/1     Running   0          2m23s
kubesphere-controls-system     kubectl-admin-7966644f4b-9rdj6                     1/1     Running   0          7m
kubesphere-monitoring-system   alertmanager-main-0                                2/2     Running   0          12m
kubesphere-monitoring-system   kube-state-metrics-856b7b8fdd-f4ltb                3/3     Running   0          13m
kubesphere-monitoring-system   node-exporter-h9dgm                                2/2     Running   0          13m
kubesphere-monitoring-system   notification-manager-deployment-6cd86468dc-f99jx   2/2     Running   0          10m
kubesphere-monitoring-system   notification-manager-operator-b9d6bf9d4-4n8wx      2/2     Running   0          12m
kubesphere-monitoring-system   prometheus-k8s-0                                   2/2     Running   0          13m
kubesphere-monitoring-system   prometheus-operator-684988fc5c-c6dbn               2/2     Running   0          13m
kubesphere-system              ks-apiserver-68648cb47c-9sg6w                      1/1     Running   0          16m
kubesphere-system              ks-console-777b56767b-vl8sp                        1/1     Running   0          16m
kubesphere-system              ks-controller-manager-86f56844c-jwnzb              1/1     Running   0          16m
kubesphere-system              ks-installer-ddbcf44f8-6scmx                       1/1     Running   0          23m

All pods are healthy, and the web console loads normally.
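
Scanning a long pod table by eye gets tedious, so the check can be filtered down to the pods that still need attention. A minimal sketch, assuming the column layout shown in this article (NAMESPACE, NAME, READY, STATUS); not_running_pods is a made-up helper name, and the sample rows are taken from the failed run in 6.1.

```shell
# Read `kubectl get pod -A --no-headers` style rows on stdin and print
# "namespace/name ready status" for every pod that is not fully ready.
# Completed pods are skipped. not_running_pods is a hypothetical helper.
not_running_pods() {
  awk '{
    split($3, ready, "/")
    if ($4 == "Completed") next
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2, $3, $4
  }'
}

# Demo with two rows from the failed run earlier in this article.
not_running_pods <<'EOF'
kube-system   calico-node-q82bk   0/1   Init:0/3   0   20m
kube-system   kube-proxy-2h4xf    1/1   Running    0   20m
EOF
# prints: kube-system/calico-node-q82bk 0/1 Init:0/3
```

Against a live cluster the input would come from kubectl get pod -A --no-headers piped into the function; empty output means every pod is Running and fully ready.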


Summary

It works well. The deployment experience is almost identical to x86.
