ARM64 Adaptation Series
Chapter 1: Deploying KubeSphere and Kubernetes on an ARM64 Environment
Table of Contents
- ARM64 Adaptation Series
- Preface
- 1. Gathering Machine Information
  - 1.1 CPU information
  - 1.2 OS version information
  - 1.3 Disk partition information
  - 1.4 Kernel information check
- 2. Upgrading the Kernel
  - 2.1 Using Alibaba Cloud's ARM repository
  - 2.2 Checking BPF support after the upgrade
- 3. Installing Base Environment Packages
- 4. Preparing the Installation Tools
  - 4.1 Downloading the kk tool
  - 4.2 Preparing the config-sample.yaml file
  - 4.3 Starting the deployment
  - 4.4 Replacing the backend image after deployment
- 5. Deployment Complete: Accessing the Web UI
- 6. Problems Encountered During Deployment
  - 6.1 Calico failing to start due to BPF
  - 6.2 default-http-backend startup failure
- Summary
Preface
The business platform I maintain has to be deployed at a customer site that runs Huawei 910B machines. My organization doesn't have any of those at the moment, only older arm64 machines, so I'm doing the adaptation work on them first to avoid scrambling when the time comes.
1. Gathering Machine Information
1.1 CPU information
Run `lscpu`:

```shell
Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    1
Core(s) per socket:    40
Socket(s):             1
NUMA node(s):          1
Model:                 1
CPU max MHz:           2500.0000
CPU min MHz:           600.0000
BogoMIPS:              40.00
L1d cache:             unknown size
L1i cache:             unknown size
L2 cache:              unknown size
L3 cache:              unknown size
NUMA node0 CPU(s):     0-39
Flags:                 fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid asimdrdm
```
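The `aarch64` machine string matters later, because Docker images and kk label this architecture `arm64`. A minimal sketch of the mapping (the `arch_label` helper name is mine, not part of any tool):

```shell
# Map `uname -m` output to the arch label used by Docker images and kk.
# arch_label is a hypothetical helper, written here for illustration.
arch_label() {
  case "$1" in
    aarch64) echo "arm64" ;;
    x86_64)  echo "amd64" ;;
    *)       echo "unknown" ;;
  esac
}

# On the node: arch_label "$(uname -m)"
```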
1.2 OS version information
Run `hostnamectl`:

```shell
   Static hostname: datax3
         Icon name: computer-server
           Chassis: server
        Machine ID: 570e6fdcda17439886d6364f7a3ba217
           Boot ID: c6b431eb288d4de4b62a823a7f383e7b
  Operating System: CentOS Linux 7 (AltArch)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 4.14.0-115.el7a.0.1.aarch64
      Architecture: arm64
```
1.3 Disk partition information
Run `lsblk`:

```shell
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0  1.8T  0 disk
├─sda2            8:2    0    1G  0 part /boot
├─sda3            8:3    0  1.8T  0 part
│ ├─centos-swap 253:1    0 15.9G  0 lvm
│ ├─centos-home 253:2    0  1.8T  0 lvm  /home
│ └─centos-root 253:0    0   50G  0 lvm  /
└─sda1            8:1    0  200M  0 part /boot/efi
```
1.4 Kernel information check
The main goal here is to check the running kernel's BPF support:

```shell
cat /boot/config-$(uname -r) | grep BPF
```

```shell
CONFIG_BPF=y
# CONFIG_BPF_SYSCALL is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
# CONFIG_NET_ACT_BPF is not set
CONFIG_BPF_JIT=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
# CONFIG_TEST_BPF is not set
```

This reveals the problem: the kernel does not enable CONFIG_BPF_SYSCALL, so it needs to be upgraded.
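Rather than eyeballing the grep output, the check can be scripted. A small sketch (the function name is mine) that reports whether a kernel config file enables the option:

```shell
# Return "ok" when a kernel config file enables CONFIG_BPF_SYSCALL,
# "missing" otherwise. /boot/config-$(uname -r) is the usual location
# on CentOS; the file is passed explicitly so this is easy to test.
check_bpf_syscall() {
  if grep -q '^CONFIG_BPF_SYSCALL=y' "$1"; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Usage on the node:
# check_bpf_syscall "/boot/config-$(uname -r)"
```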
2. Upgrading the Kernel
2.1 Using Alibaba Cloud's ARM repository
Alibaba Cloud ARM repository: https://developer.aliyun.com/mirror/centos-altarch/?spm=a2c6h.13651104.d-2001.3.40cd320cKIvAMX

```shell
# Fetch the repo file
wget http://mirrors.aliyun.com/repo/Centos-altarch-7.repo -O /etc/yum.repos.d/CentOS-Base.repo
# Upgrade the kernel
yum clean all
yum makecache
yum list kernel
yum update -y kernel
reboot
```
2.2 Checking BPF support after the upgrade
After the reboot, `hostnamectl` shows the new kernel:

```shell
   Static hostname: datax3
         Icon name: computer-server
           Chassis: server
        Machine ID: 570e6fdcda17439886d6364f7a3ba217
           Boot ID: c6b431eb288d4de4b62a823a7f383e7b
  Operating System: CentOS Linux 7 (AltArch)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 4.18.0-348.20.1.el7.aarch64
      Architecture: arm64
```

And `cat /boot/config-$(uname -r) | grep BPF` now reports:

```shell
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
# CONFIG_BPF_PRELOAD is not set
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_BPFILTER is not set
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_TEST_BPF=m
```

CONFIG_BPF_SYSCALL is now enabled, so we are good to go.
3. Installing Base Environment Packages
Docker's DNS settings are configured here as well:

```shell
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Keep Docker's data on the large /home partition via a symlink
mkdir -p /home/data/docker_data/docker/
ln -s /home/data/docker_data/docker/ /var/lib/
sudo yum install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
cat > /etc/docker/daemon.json <<EOF
{
  "dns": [
    "8.8.8.8",
    "114.114.114.114"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF
service docker start
systemctl enable docker
```
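dockerd refuses to start when daemon.json is not valid JSON (a stray trailing comma is enough), so it is worth validating the file before restarting the service. A sketch using python3 as a portable JSON checker, assuming python3 is available (the function name is mine):

```shell
# Print "valid" if the given file parses as JSON, "invalid" otherwise.
validate_daemon_json() {
  if python3 -m json.tool "$1" >/dev/null 2>&1; then
    echo "valid"
  else
    echo "invalid"
  fi
}

# Usage on the node:
# validate_daemon_json /etc/docker/daemon.json
```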
4. Preparing the Installation Tools
4.1 Downloading the kk tool
kk download page: https://github.com/kubesphere/kubekey/releases/tag/v3.1.8

```shell
wget https://github.com/kubesphere/kubekey/releases/download/v3.1.8/kubekey-v3.1.8-linux-arm64.tar.gz
```
4.2 Preparing the config-sample.yaml file

```shell
tar -xvf kubekey-v3.1.8-linux-arm64.tar.gz
chmod a+x ./kk
# Generate the configuration file
./kk create config --with-kubernetes v1.23.17 --with-kubesphere
```
Edit the host information in the generated file and add the architecture setting `arch: arm64`:

```yaml
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: datax3, address: xxx.xxx.103.6, internalAddress: xxx.xxx.103.6, user: root, arch: arm64, password: "smartcore"}
  roleGroups:
    etcd:
    - datax3
    control-plane:
    - datax3
    worker:
    - datax3
```
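Forgetting `arch: arm64` on a host entry is an easy mistake; as I understand it, kk then falls back to its default architecture and fetches amd64 binaries. A quick grep sketch (the function name is mine) that prints any inline host entry missing an arch field:

```shell
# Print host entries in a kk config that lack an explicit arch field.
hosts_missing_arch() {
  grep -E '^\s*- \{name:' "$1" | grep -v 'arch:' || true
}

# Usage:
# hosts_missing_arch config-sample.yaml   # no output means every host declares arch
```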
4.3 Starting the deployment

```shell
export KKZONE=cn
./kk create cluster -f /home/k8s-one-node/config-sample.yaml -y --debug
```
4.4 Replacing the backend image after deployment

```shell
# Pull from a domestic mirror
sudo docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Re-tag the image after pulling
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4 mirrorgooglecontainers/defaultbackend-arm64:1.4
# Point the deployment at the new image
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system
```
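To confirm the rollout actually picked up the arm64 image, the deployment's image can be read back. This sketch parses a saved `kubectl get deployment ... -o yaml` dump (the awk helper name is mine):

```shell
# Extract the first container image from a saved `kubectl get deploy -o yaml` dump.
deployed_image() {
  awk '/- image: /{print $3; exit}' "$1"
}

# Usage on the node:
# kubectl get deployment default-http-backend -n kubesphere-controls-system -o yaml > /tmp/dhb.yaml
# deployed_image /tmp/dhb.yaml
# (or directly: kubectl get deployment default-http-backend -n kubesphere-controls-system \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')
```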
5. Deployment Complete: Accessing the Web UI
Confirmed the web console is reachable; no issues.
6. Problems Encountered During Deployment
6.1 Calico failing to start due to BPF
Error message: Error from server (BadRequest): pod ks-installer-ddbcf44f8-8zhb5 does not have a host assigned
Digging in, the root cause turned out to be Calico:

```shell
kubectl get pod -A
NAMESPACE           NAME                                           READY   STATUS     RESTARTS   AGE
kube-system         calico-kube-controllers-6f996c8485-7f6rf       0/1     Pending    0          20m
kube-system         calico-node-q82bk                              0/1     Init:0/3   0          20m
kube-system         coredns-5667b47695-qsd6f                       0/1     Pending    0          20m
kube-system         coredns-5667b47695-rttmr                       0/1     Pending    0          20m
kube-system         kube-apiserver-datax3                          1/1     Running    0          21m
kube-system         kube-controller-manager-datax3                 1/1     Running    0          21m
kube-system         kube-proxy-2h4xf                               1/1     Running    0          20m
kube-system         kube-scheduler-datax3                          1/1     Running    0          21m
kube-system         nodelocaldns-bjfm7                             1/1     Running    0          20m
kube-system         openebs-localpv-provisioner-7bbcf865cd-pmk7s   0/1     Pending    0          20m
kubesphere-system   ks-installer-ddbcf44f8-8zhb5                   0/1     Pending    0          20m
```

Inspecting pod calico-kube-controllers-6f996c8485-7f6rf:
```shell
kubectl describe pods calico-kube-controllers-6f996c8485-7f6rf -n kube-system
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  31s (x22 over 22m)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
```
Inspecting pod calico-node-q82bk:

```shell
kubectl describe pods calico-node-q82bk -n kube-system
Warning  FailedMount  7s (x2 over 2m25s)  kubelet  (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[bpffs], unattached volumes=[var-run-calico bpffs kube-api-access-f8tww xtables-lock policysync host-local-net-dir cni-bin-dir cni-log-dir sys-fs nodeproc lib-modules cni-net-dir var-lib-calico]: timed out waiting for the condition
Warning  FailedMount  2s (x19 over 22m)   kubelet  MountVolume.SetUp failed for volume "bpffs" : hostPath type check failed: /sys/fs/bpf is not a directory
```
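The `hostPath type check failed: /sys/fs/bpf is not a directory` message means the node has no BPF filesystem mounted at that path; on a kernel that supports it, a `bpf` entry would appear in /proc/mounts. A small check sketch (the function name is mine):

```shell
# Report whether a bpf filesystem is mounted at /sys/fs/bpf.
# Reads /proc/mounts by default; a file can be passed in for testing.
has_bpffs() {
  if grep -q ' /sys/fs/bpf bpf ' "${1:-/proc/mounts}"; then
    echo "yes"
  else
    echo "no"
  fi
}
```

On the old 4.14 kernel this reports no, which matches the calico-node mount failure; after the kernel upgrade the volume mounted and the pod started.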
Check the kernel's BPF support and make sure both CONFIG_BPF and CONFIG_BPF_SYSCALL are set to y. eBPF was introduced in Linux 3.18.
This problem has to be solved by upgrading the kernel. Since the Alibaba Cloud repository carries a 4.18 kernel, I skipped building one manually and simply upgraded via yum; luckily, everything worked after the upgrade.
6.2 default-http-backend startup failure
Error message:

```shell
kubesphere-controls-system   default-http-backend-659cc67b6b-652n7   0/1   CrashLoopBackOff   5 (87s ago)   6m6s
```
Digging in, inspecting pod default-http-backend-659cc67b6b-652n7:

```shell
kubectl describe pods default-http-backend-659cc67b6b-652n7 -n kubesphere-controls-system
Normal   Scheduled               8m26s                  default-scheduler  Successfully assigned kubesphere-controls-system/default-http-backend-659cc67b6b-652n7 to datax3
Warning  FailedCreatePodSandBox  8m8s (x2 over 8m16s)   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "default-http-backend-659cc67b6b-652n7": Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: unable to freeze: unknown
Normal   SandboxChanged          8m7s (x2 over 8m15s)   kubelet            Pod sandbox changed, it will be killed and re-created.
Normal   Pulling                 7m56s                  kubelet            Pulling image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4"
Normal   Pulled                  7m19s                  kubelet            Successfully pulled image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" in 5.626141615s (37.059010311s including waiting)
Normal   Pulled                  6m23s (x3 over 7m12s)  kubelet            Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" already present on machine
Normal   Created                 6m22s (x4 over 7m18s)  kubelet            Created container default-http-backend
Normal   Started                 6m18s (x4 over 7m13s)  kubelet            Started container default-http-backend
Warning  BackOff                 3m14s (x23 over 7m5s)  kubelet            Back-off restarting failed container
```
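The crash loop here comes from an amd64 image running on an arm64 node. This can be confirmed before swapping images: `docker image inspect --format '{{.Architecture}}'` (a real docker flag) reports what a pulled image was built for. A comparison sketch (the function name is mine):

```shell
# Compare an image's architecture against the node's; docker reports
# arm64/amd64 for images, while uname reports aarch64/x86_64 for the host.
arch_mismatch() {
  image_arch="$1"
  node_arch="$2"
  case "$node_arch" in
    aarch64) node_arch="arm64" ;;
    x86_64)  node_arch="amd64" ;;
  esac
  if [ "$image_arch" = "$node_arch" ]; then
    echo "match"
  else
    echo "mismatch: image=$image_arch node=$node_arch"
  fi
}

# Usage on the node:
# arch_mismatch "$(docker image inspect --format '{{.Architecture}}' \
#     registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4)" "$(uname -m)"
```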
The image is built for the wrong architecture, so I searched the forum for a fix.
Thread: https://ask.kubesphere.com.cn/forum/d/8874-arm-default-http-backend-elasticsearch-logging-curator/11
Following the approach from that thread:

```shell
# Pull from a domestic mirror
sudo docker pull hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4
# Re-tag the image after pulling
docker tag hub.fast360.xyz/mirrorgooglecontainers/defaultbackend-arm64:1.4 mirrorgooglecontainers/defaultbackend-arm64:1.4
# Swap in the arm64 image
kubectl set image deployment/default-http-backend default-http-backend=mirrorgooglecontainers/defaultbackend-arm64:1.4 -n kubesphere-controls-system
kubectl rollout restart deployment/default-http-backend -n kubesphere-controls-system
```
After the replacement, check the cluster state:

```shell
kubectl get pod -A
NAMESPACE                      NAME                                               READY   STATUS    RESTARTS   AGE
kube-system                    calico-kube-controllers-6f996c8485-7b7cw           1/1     Running   0          24m
kube-system                    calico-node-qljdf                                  1/1     Running   0          24m
kube-system                    coredns-7bfd7cb54c-ctcps                           1/1     Running   0          24m
kube-system                    coredns-7bfd7cb54c-nb7xz                           1/1     Running   0          24m
kube-system                    kube-apiserver-datax3                              1/1     Running   0          24m
kube-system                    kube-controller-manager-datax3                     1/1     Running   0          24m
kube-system                    kube-proxy-s4scz                                   1/1     Running   0          24m
kube-system                    kube-scheduler-datax3                              1/1     Running   0          24m
kube-system                    nodelocaldns-pxmfx                                 1/1     Running   0          24m
kube-system                    openebs-localpv-provisioner-7bbcf865cd-qr8qq       1/1     Running   0          24m
kube-system                    snapshot-controller-0                              1/1     Running   0          19m
kubesphere-controls-system     default-http-backend-658d66d59f-mvxmf              1/1     Running   0          2m23s
kubesphere-controls-system     kubectl-admin-7966644f4b-9rdj6                     1/1     Running   0          7m
kubesphere-monitoring-system   alertmanager-main-0                                2/2     Running   0          12m
kubesphere-monitoring-system   kube-state-metrics-856b7b8fdd-f4ltb                3/3     Running   0          13m
kubesphere-monitoring-system   node-exporter-h9dgm                                2/2     Running   0          13m
kubesphere-monitoring-system   notification-manager-deployment-6cd86468dc-f99jx   2/2     Running   0          10m
kubesphere-monitoring-system   notification-manager-operator-b9d6bf9d4-4n8wx      2/2     Running   0          12m
kubesphere-monitoring-system   prometheus-k8s-0                                   2/2     Running   0          13m
kubesphere-monitoring-system   prometheus-operator-684988fc5c-c6dbn               2/2     Running   0          13m
kubesphere-system              ks-apiserver-68648cb47c-9sg6w                      1/1     Running   0          16m
kubesphere-system              ks-console-777b56767b-vl8sp                        1/1     Running   0          16m
kubesphere-system              ks-controller-manager-86f56844c-jwnzb              1/1     Running   0          16m
kubesphere-system              ks-installer-ddbcf44f8-6scmx                       1/1     Running   0          23m
```
All pods are healthy, and the web console loads normally.
Summary
Works nicely; the deployment experience is almost identical to x86.