ARM64 Adaptation Series - Chapter 1 - Deploying KubeSphere and K8s on ARM64

ARM64 Adaptation Series

Chapter 1: Deploying KubeSphere and K8s on an ARM64 Environment


Preface

The business platform I operate needs to be deployed into a customer environment running Huawei Ascend 910B machines. We don't have any of those in-house yet, only some older ARM64 machines, so I'm doing the adaptation on those first to avoid scrambling when the real deployment comes.


1. Gathering Machine Information

1.1 CPU Information

lscpu

shell
Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    1
Core(s) per socket:    40
Socket(s):             1
NUMA node(s):          1
Model:                 1
CPU max MHz:           2500.0000
CPU min MHz:           600.0000
BogoMIPS:              40.00
L1d cache:             unknown size
L1i cache:             unknown size
L2 cache:              unknown size
L3 cache:              unknown size
NUMA node0 CPU(s):     0-39
Flags:                 fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid asimdrdm

1.2 OS Version Information

hostnamectl

shell
   Static hostname: datax3
         Icon name: computer-server
           Chassis: server
        Machine ID: 570e6fdcda17439886d6364f7a3ba217
           Boot ID: c6b431eb288d4de4b62a823a7f383e7b
  Operating System: CentOS Linux 7 (AltArch)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 4.14.0-115.el7a.0.1.aarch64
      Architecture: arm64

1.3 Disk Partition Information

lsblk

shell
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0  1.8T  0 disk 
├─sda2            8:2    0    1G  0 part /boot
├─sda3            8:3    0  1.8T  0 part 
│ ├─centos-swap 253:1    0 15.9G  0 lvm  
│ ├─centos-home 253:2    0  1.8T  0 lvm  /home
│ └─centos-root 253:0    0   50G  0 lvm  /
└─sda1            8:1    0  200M  0 part /boot/efi

1.4 Kernel Check

Mainly checking the current kernel's BPF support:

cat /boot/config-$(uname -r) | grep BPF

shell
CONFIG_BPF=y
# CONFIG_BPF_SYSCALL is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
# CONFIG_NET_ACT_BPF is not set
CONFIG_BPF_JIT=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
# CONFIG_TEST_BPF is not set

Problem found: CONFIG_BPF_SYSCALL is not enabled in this kernel, so the kernel needs to be upgraded.
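
Besides the config flags, it can be worth checking whether the BPF filesystem is present, since Calico later mounts /sys/fs/bpf as a hostPath (see section 6.1); a minimal check:

shell
# /sys/fs/bpf should exist as a directory once the kernel supports the bpf() syscall
ls -ld /sys/fs/bpf
mount | grep /sys/fs/bpf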


2. Upgrading the Kernel

2.1 Using the Aliyun ARM Repository

Aliyun ARM repository: https://developer.aliyun.com/mirror/centos-altarch/?spm=a2c6h.13651104.d-2001.3.40cd320cKIvAMX

shell
# Fetch the repo file
wget http://mirrors.aliyun.com/repo/Centos-altarch-7.repo -O /etc/yum.repos.d/CentOS-Base.repo
# Upgrade the kernel
yum clean all
yum makecache
yum list kernel
yum update -y kernel
reboot
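
If the machine still comes up on the old kernel after rebooting, the GRUB default entry may need to be pointed at the newly installed kernel first; a sketch assuming the stock CentOS 7 grub2/grubby tooling (the kernel path must match what yum actually installed):

shell
# List installed kernels and the current default entry
grubby --info=ALL | grep ^kernel
grubby --default-kernel
# Point the default at the new 4.18 kernel, then reboot
grubby --set-default /boot/vmlinuz-4.18.0-348.20.1.el7.aarch64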

2.2 Checking BPF Support After the Upgrade

After the reboot, hostnamectl confirms the new 4.18 kernel:

shell
   Static hostname: datax3
         Icon name: computer-server
           Chassis: server
        Machine ID: 570e6fdcda17439886d6364f7a3ba217
           Boot ID: c6b431eb288d4de4b62a823a7f383e7b
  Operating System: CentOS Linux 7 (AltArch)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 4.18.0-348.20.1.el7.aarch64
      Architecture: arm64

Grepping the new kernel's config for BPF again:

shell
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
# CONFIG_BPF_PRELOAD is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_BPFILTER is not set
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_TEST_BPF=m

CONFIG_BPF_SYSCALL is now enabled, so we're good.


3. Installing the Base Environment Packages

Install Docker from the Aliyun mirror, move its data directory onto the large /home partition via a symlink, and configure DNS, the systemd cgroup driver, and log rotation in daemon.json:

shell
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
mkdir -p /home/data/docker_data/docker/
ln -s /home/data/docker_data/docker/ /var/lib/
sudo yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
cat > /etc/docker/daemon.json <<EOF
{
  "dns": [
    "8.8.8.8",
    "114.114.114.114"
  ],
    "exec-opts":["native.cgroupdriver=systemd"],
    "log-driver":"json-file",
    "log-opts":{
        "max-size":"100m"
    }
}
EOF

service docker start
systemctl enable docker
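
Before moving on, it's worth a quick sanity check that Docker is running with the systemd cgroup driver and reports an aarch64 architecture; the format fields below are standard docker info ones:

shell
systemctl is-active docker
# Should print "systemd" for the cgroup driver and "aarch64" for the architecture
docker info --format 'Cgroup Driver: {{.CgroupDriver}}, Arch: {{.Architecture}}'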

4. Preparing the Installation Tool

4.1 Downloading kk

kk (KubeKey) release page: https://github.com/kubesphere/kubekey/releases/tag/v3.1.8

shell
wget https://github.com/kubesphere/kubekey/releases/download/v3.1.8/kubekey-v3.1.8-linux-arm64.tar.gz

4.2 Preparing the config-sample.yaml File

shell
tar -xvf kubekey-v3.1.8-linux-arm64.tar.gz
chmod a+x ./kk
# Create the configuration file
./kk create config --with-kubernetes v1.23.17 --with-kubesphere
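
Before going further, I'd also confirm the extracted kk binary is actually the arm64 build; a minimal check:

shell
# Should report an ARM aarch64 ELF executable
file ./kk
./kk version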

In the generated config file, update the host information and add the architecture field (arch: arm64) to each host entry:

yaml
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: datax3, address: xxx.xxx.103.6, internalAddress: xxx.xxx.103.6, user: root, arch: arm64, password: "smartcore"}
  roleGroups:
    etcd:
    - datax3
    control-plane: 
    - datax3
    worker:
    - datax3

Official documentation on specifying host architecture: https://kubesphere.io/zh/docs/v4.1/03-installation-and-upgrade/02-install-kubesphere/02-install-kubernetes-and-kubesphere/
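
Before deploying, a quick grep confirms the architecture field actually made it into every host entry:

shell
# Each host line should carry arch: arm64
grep -n 'arch:' config-sample.yaml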

4.3 Starting the Deployment

shell
export KKZONE=cn
./kk create cluster -f /home/k8s-one-node/config-sample.yaml -y --debug
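
kk prints its own progress, and once the cluster itself is up, the KubeSphere installer keeps working in the background; its progress can be followed with the usual ks-installer log command from the KubeSphere docs:

shell
# Tail the installer logs until "Welcome to KubeSphere!" shows up
kubectl logs -n kubesphere-system \
  $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -f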

4.4 Replacing the default-http-backend Image After Deployment

shell
# Pull the arm64 image from a domestic mirror
sudo docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Retag it after pulling
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4  mirrorgooglecontainers/defaultbackend-arm64:1.4
# Point the Deployment at the new image
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system
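
Before wiring the image into the Deployment, it doesn't hurt to verify that the pulled image really is an arm64 build; the inspect fields below are standard:

shell
# Should print linux/arm64
docker image inspect mirrorgooglecontainers/defaultbackend-arm64:1.4 --format '{{.Os}}/{{.Architecture}}'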

5. Deployment Complete: Accessing the Web Console

Confirmed that the console is reachable; no issues.
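
For reference, the console is exposed through the ks-console NodePort service (30880 and the admin / P@88w0rd account by default, per the KubeSphere docs); the actual port can be looked up with:

shell
kubectl get svc ks-console -n kubesphere-system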


6. Problems Encountered During Deployment

6.1 Calico Fails to Start Due to Missing BPF Support

Error message: Error from server (BadRequest): pod ks-installer-ddbcf44f8-8zhb5 does not have a host assigned

Tracking it down, the root cause turned out to be Calico:

shell
kubectl get pod -A 
NAMESPACE           NAME                                           READY   STATUS     RESTARTS   AGE
kube-system         calico-kube-controllers-6f996c8485-7f6rf       0/1     Pending    0          20m
kube-system         calico-node-q82bk                              0/1     Init:0/3   0          20m
kube-system         coredns-5667b47695-qsd6f                       0/1     Pending    0          20m
kube-system         coredns-5667b47695-rttmr                       0/1     Pending    0          20m
kube-system         kube-apiserver-datax3                          1/1     Running    0          21m
kube-system         kube-controller-manager-datax3                 1/1     Running    0          21m
kube-system         kube-proxy-2h4xf                               1/1     Running    0          20m
kube-system         kube-scheduler-datax3                          1/1     Running    0          21m
kube-system         nodelocaldns-bjfm7                             1/1     Running    0          20m
kube-system         openebs-localpv-provisioner-7bbcf865cd-pmk7s   0/1     Pending    0          20m
kubesphere-system   ks-installer-ddbcf44f8-8zhb5                   0/1     Pending    0          20m

Describe pod calico-kube-controllers-6f996c8485-7f6rf:

shell
kubectl describe pods  calico-kube-controllers-6f996c8485-7f6rf -n kube-system
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  31s (x22 over 22m)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

Describe pod calico-node-q82bk:

shell
kubectl describe pods  calico-node-q82bk  -n kube-system
  Warning  FailedMount  7s (x2 over 2m25s)  kubelet            (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[bpffs], unattached volumes=[var-run-calico bpffs kube-api-access-f8tww xtables-lock policysync host-local-net-dir cni-bin-dir cni-log-dir sys-fs nodeproc lib-modules cni-net-dir var-lib-calico]: timed out waiting for the condition
  Warning  FailedMount  2s (x19 over 22m)   kubelet            MountVolume.SetUp failed for volume "bpffs" : hostPath type check failed: /sys/fs/bpf is not a directory

Check the kernel's BPF support and make sure both CONFIG_BPF and CONFIG_BPF_SYSCALL are set to y.

eBPF was introduced in Linux 3.18.

This problem has to be solved by upgrading the kernel. Since the Aliyun repository already carries a 4.18 kernel, I didn't build one by hand and simply upgraded via yum; fortunately everything worked after the upgrade.
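
After the upgrade and reboot, the bpffs mount that Calico expects shows up automatically; a quick way to confirm Calico recovered (the label is the standard one from the Calico manifests):

shell
# bpffs should now be mounted at /sys/fs/bpf
mount | grep /sys/fs/bpf
# calico-node should reach Running once the volume mounts succeed
kubectl get pods -n kube-system -l k8s-app=calico-node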

6.2 default-http-backend Fails to Start

Error message:

shell
kubesphere-controls-system     default-http-backend-659cc67b6b-652n7              0/1     CrashLoopBackOff    5 (87s ago)   6m6s

To investigate, describe pod default-http-backend-659cc67b6b-652n7:

shell
kubectl describe pods  default-http-backend-659cc67b6b-652n7 -n kubesphere-controls-system


  Normal   Scheduled               8m26s                  default-scheduler  Successfully assigned kubesphere-controls-system/default-http-backend-659cc67b6b-652n7 to datax3
  Warning  FailedCreatePodSandBox  8m8s (x2 over 8m16s)   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "default-http-backend-659cc67b6b-652n7": Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: unable to freeze: unknown
  Normal   SandboxChanged          8m7s (x2 over 8m15s)   kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 7m56s                  kubelet            Pulling image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4"
  Normal   Pulled                  7m19s                  kubelet            Successfully pulled image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" in 5.626141615s (37.059010311s including waiting)
  Normal   Pulled                  6m23s (x3 over 7m12s)  kubelet            Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" already present on machine
  Normal   Created                 6m22s (x4 over 7m18s)  kubelet            Created container default-http-backend
  Normal   Started                 6m18s (x4 over 7m13s)  kubelet            Started container default-http-backend
  Warning  BackOff                 3m14s (x23 over 7m5s)  kubelet            Back-off restarting failed container

The image is wrong: an amd64 defaultbackend image is being run on an arm64 node, so I went looking for a forum post.

Post: https://ask.kubesphere.com.cn/forum/d/8874-arm-default-http-backend-elasticsearch-logging-curator/11

Handle it the way the post describes:

shell
# Pull the arm64 image from a domestic mirror
sudo docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Retag it after pulling
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4  mirrorgooglecontainers/defaultbackend-arm64:1.4
# Swap the image in the Deployment
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system
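
To confirm the rollout picked up the new image, the Deployment's container image can be checked directly:

shell
# Should now point at the arm64 image
kubectl get deployment default-http-backend -n kubesphere-controls-system \
  -o jsonpath='{.spec.template.spec.containers[0].image}'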

Cluster status after the replacement:

shell
kubectl get pod -A 
NAMESPACE                      NAME                                               READY   STATUS    RESTARTS   AGE
kube-system                    calico-kube-controllers-6f996c8485-7b7cw           1/1     Running   0          24m
kube-system                    calico-node-qljdf                                  1/1     Running   0          24m
kube-system                    coredns-7bfd7cb54c-ctcps                           1/1     Running   0          24m
kube-system                    coredns-7bfd7cb54c-nb7xz                           1/1     Running   0          24m
kube-system                    kube-apiserver-datax3                              1/1     Running   0          24m
kube-system                    kube-controller-manager-datax3                     1/1     Running   0          24m
kube-system                    kube-proxy-s4scz                                   1/1     Running   0          24m
kube-system                    kube-scheduler-datax3                              1/1     Running   0          24m
kube-system                    nodelocaldns-pxmfx                                 1/1     Running   0          24m
kube-system                    openebs-localpv-provisioner-7bbcf865cd-qr8qq       1/1     Running   0          24m
kube-system                    snapshot-controller-0                              1/1     Running   0          19m
kubesphere-controls-system     default-http-backend-658d66d59f-mvxmf              1/1     Running   0          2m23s
kubesphere-controls-system     kubectl-admin-7966644f4b-9rdj6                     1/1     Running   0          7m
kubesphere-monitoring-system   alertmanager-main-0                                2/2     Running   0          12m
kubesphere-monitoring-system   kube-state-metrics-856b7b8fdd-f4ltb                3/3     Running   0          13m
kubesphere-monitoring-system   node-exporter-h9dgm                                2/2     Running   0          13m
kubesphere-monitoring-system   notification-manager-deployment-6cd86468dc-f99jx   2/2     Running   0          10m
kubesphere-monitoring-system   notification-manager-operator-b9d6bf9d4-4n8wx      2/2     Running   0          12m
kubesphere-monitoring-system   prometheus-k8s-0                                   2/2     Running   0          13m
kubesphere-monitoring-system   prometheus-operator-684988fc5c-c6dbn               2/2     Running   0          13m
kubesphere-system              ks-apiserver-68648cb47c-9sg6w                      1/1     Running   0          16m
kubesphere-system              ks-console-777b56767b-vl8sp                        1/1     Running   0          16m
kubesphere-system              ks-controller-manager-86f56844c-jwnzb              1/1     Running   0          16m
kubesphere-system              ks-installer-ddbcf44f8-6scmx                       1/1     Running   0          23m

All pods are healthy, and the web console loads normally.


Summary

Works well; the deployment experience is nearly identical to x86.
