Guide to Deploying a Docker Swarm Cluster on RK3588 + Galaxy Kylin (银河麒麟)

I. Prerequisites (all nodes)

1) Set the hostname (on each device)

	# Run on master-node
	sudo hostnamectl set-hostname master-node
	# Run on worker-1
	sudo hostnamectl set-hostname worker-1
	# Run on worker-2
	sudo hostnamectl set-hostname worker-2
	
	# Verify
	hostname
	
	# Edit the hosts file (all nodes)
	sudo vi /etc/hosts
	# Add the following entries (adjust to your actual IPs):
	192.168.137.224 master-node
	192.168.137.225 worker-1  # change to worker-1's actual IP
	192.168.137.226 worker-2  # change to worker-2's actual IP
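
Since the hosts entries have to be added on every node, it may be convenient to script the edit instead of using vi. The following is a minimal sketch, not part of the original procedure: `add_host_entry` is a hypothetical helper that appends a mapping only when the hostname is not already present, so it can be re-run safely.

```shell
# add_host_entry FILE IP NAME  (hypothetical helper)
# Appends "IP NAME" to FILE only if NAME is not already mapped there.
add_host_entry() {
    file="$1"; ip="$2"; name="$3"
    # Look for NAME as a whole field (column 2 or later) on a non-comment line.
    if awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file"; then
        echo "skip: $name already present in $file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "added: $ip $name"
    fi
}

# Example invocation (needs root to write /etc/hosts):
#   add_host_entry /etc/hosts 192.168.137.224 master-node
```

Running it twice with the same arguments leaves a single entry in the file.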

2) Enable sshd

	sudo vi /etc/ssh/sshd_config
	# Change the setting to: PermitRootLogin yes
	sudo systemctl restart ssh
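
The same change can be made non-interactively, which helps when preparing several boards. This is only a sketch under assumptions: `set_sshd_option` is a hypothetical helper, and it relies on `sed -i` with a backup suffix being available.

```shell
# set_sshd_option FILE KEY VALUE  (hypothetical helper)
# Rewrites an existing (possibly commented-out) "KEY ..." line, or appends one.
set_sshd_option() {
    file="$1"; key="$2"; value="$3"
    if grep -Eq "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]" "$file"; then
        # Replace every matching line; the original file is kept as FILE.bak.
        sed -i.bak -E "s|^[[:space:]]*#?[[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Example invocation (as root):
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   sshd -t && systemctl restart ssh   # sshd -t checks the config before restarting
```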

3) Change the apt sources

	sudo vi /etc/apt/sources.list
	# Main repository
	deb http://archive.kylinos.cn/kylin/KYLIN-ALL 10.1 main restricted universe multiverse
	# Update repositories
	deb http://archive.kylinos.cn/kylin/KYLIN-ALL 10.1-2107-updates main restricted universe multiverse
	deb http://archive.kylinos.cn/kylin/KYLIN-ALL 10.1-2203-updates main restricted universe multiverse

4) Update the system and install basic tools

	sudo apt update && sudo apt upgrade -y
	sudo apt install libcurl4=7.68.0-1kylin2.12
	sudo apt install -y curl wget net-tools ssh iputils-ping

5) Make sure the clocks are synchronized

	sudo apt install -y chrony
	sudo systemctl enable chrony --now

	# Check the time synchronization status
	chronyc sources -v

6) Disable the firewall

	systemctl list-units | grep -E "(ufw|firewall)"   # find out which firewall is in use
	# If ufw is in use
	sudo ufw disable
	# If firewalld is in use
	sudo systemctl stop firewalld && sudo systemctl disable firewalld
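
If turning the firewall off completely is not acceptable on your network, an alternative that this guide does not otherwise cover is to open only the ports Swarm uses: 2377/tcp (cluster management), 7946/tcp and 7946/udp (node-to-node communication), and 4789/udp (overlay network traffic). The sketch below only prints the ufw commands so they can be reviewed first; `swarm_ufw_rules` is a hypothetical helper name.

```shell
# swarm_ufw_rules  (hypothetical helper)
# Prints, without executing, one "ufw allow" command per Swarm port.
swarm_ufw_rules() {
    for rule in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
        echo "ufw allow $rule"
    done
}

# Review the output, then apply it as root, for example:
#   swarm_ufw_rules | sudo sh
```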

7) Working directory

	mkdir -p /home/swarm

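8) modprobe shim (spoofing modprobe)

Reason: the kernel version does not match the module version shipped in the Kylin root filesystem, and some drivers are compiled directly into the kernel, so modprobe fails for them. This guide is about swarm, where the step is a precaution; for k3s and k8s it is mandatory.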
	# 1. Back up the real modprobe
	sudo mv /sbin/modprobe /sbin/modprobe.real

	# 2. Create the shim script (<<- strips the leading tabs, so the indented copy still works)
	cat <<-'EOF' | sudo tee /sbin/modprobe
	#!/bin/sh
	# Pretend that loading/removing br_netfilter and overlay succeeded;
	# pass every other request to the real modprobe.
	MODULE="$1"
	if [ "$MODULE" = "br_netfilter" ] || [ "$MODULE" = "overlay" ]; then
	    exit 0
	elif [ "$MODULE" = "-r" ] || [ "$MODULE" = "--remove" ]; then
	    if [ "$2" = "br_netfilter" ] || [ "$2" = "overlay" ]; then
	        exit 0
	    fi
	    exec /sbin/modprobe.real "$@"
	else
	    exec /sbin/modprobe.real "$@"
	fi
	EOF
	# 3. Make it executable
	sudo chmod +x /sbin/modprobe

	# 4. Make sure the sysctl settings are in place (critical!)
	cat <<-'EOF' | sudo tee /etc/sysctl.d/99-k8s-bridge-netfilter.conf
	net.bridge.bridge-nf-call-iptables = 1
	net.bridge.bridge-nf-call-ip6tables = 1
	net.bridge.bridge-nf-call-arptables = 1
	EOF

	# "sysctl -p" without a file only reads /etc/sysctl.conf; --system also loads /etc/sysctl.d/*.conf
	sudo sysctl --system

	# 5. Verify that the shim works
	sudo modprobe br_netfilter && echo "✅ Success" || echo "❌ Failed"    # expected output: ✅ Success
	sudo modprobe overlay && echo "✅ Success" || echo "❌ Failed"
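
Note that the check in step 5 only proves that the shim answers with success; it says nothing about whether the kernel actually provides the two features. A possible complementary check, sketched here under the assumption that the features are built into the kernel, is to look for the overlay filesystem in `/proc/filesystems` and for the bridge netfilter sysctl under `/proc/sys/net/bridge`. `check_kernel_features` is a hypothetical helper; its optional argument exists only so the function can be pointed at a test directory.

```shell
# check_kernel_features [PROC_ROOT]  (hypothetical helper)
# PROC_ROOT defaults to /proc. Returns 0 only if both features are present.
check_kernel_features() {
    proc="${1:-/proc}"
    rc=0
    # A registered overlay filesystem shows up as a line in /proc/filesystems.
    if grep -qw overlay "$proc/filesystems" 2>/dev/null; then
        echo "overlay: available"
    else
        echo "overlay: MISSING"; rc=1
    fi
    # The bridge-nf-call-* entries exist only when br_netfilter is active.
    if [ -e "$proc/sys/net/bridge/bridge-nf-call-iptables" ]; then
        echo "br_netfilter: available"
    else
        echo "br_netfilter: MISSING"; rc=1
    fi
    return $rc
}

# Example invocation:
#   check_kernel_features && echo "kernel features OK"
```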

II. Installing Docker

1) Install

	sudo apt update
	sudo apt install -y docker.io
	sudo usermod -aG docker $USER   # log out and back in for the group change to take effect
	
	docker -v
	# Output:
	Docker version 20.10.7, build 20.10.7-0kylin5~20.04.2

2) Configure

	sudo tee /etc/docker/daemon.json <<-'EOF'
	{
	  "registry-mirrors": [
	    "https://docker.mirrors.ustc.edu.cn",
	    "https://hub-mirror.c.163.com",
	    "https://mirror.baidubce.com",
	    "https://dockerproxy.com",
	    "https://mirrors.ustc.edu.cn",
	    "https://docker-cf.registry.cyou",
	    "https://dockercf.jsdelivr.fyi",
	    "https://docker.jsdelivr.fyi",
	    "https://dockertest.jsdelivr.fyi",
	    "https://mirror.aliyuncs.com",
	    "https://docker.m.daocloud.io",
	    "https://docker.nju.edu.cn",
	    "https://docker.mirrors.sjtug.sjtu.edu.cn",
	    "https://mirror.iscas.ac.cn",
	    "https://docker.rainbond.cc",
	    "https://docker.registry.cyou",
	    "https://do.nark.eu.org",
	    "https://dc.j8.work",
	    "https://gst6rzl9.mirror.aliyuncs.com",
	    "https://registry.docker-cn.com",
	    "https://mirrors.tuna.tsinghua.edu.cn",
	    "https://registry.cn-beijing.aliyuncs.com"
	  ],
	 "insecure-registries" : [
	    "192.168.137.224:5000",
	    "registry.docker-cn.com",
	    "http://mirrors.sohu.com",
	    "http://hub-mirror.c.163.com",
	    "docker.mirrors.ustc.edu.cn"
	    ],
	 "debug": true,
	 "experimental": false
	}
	EOF

	systemctl daemon-reload
	systemctl restart docker.service
	docker info
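
A syntax error in daemon.json (a trailing comma, for instance) prevents the Docker daemon from starting, so it can be worth checking the file before the restart. The sketch below assumes `python3` is installed and uses its standard `json.tool` module; `validate_json` is a hypothetical helper name.

```shell
# validate_json FILE  (hypothetical helper)
# Succeeds only if FILE parses as JSON. Assumes python3 is available.
validate_json() {
    python3 -m json.tool "$1" >/dev/null 2>&1
}

# Example invocation: restart Docker only when the file is valid.
#   validate_json /etc/docker/daemon.json && sudo systemctl restart docker.service
```

This only checks JSON syntax; it does not tell you whether Docker accepts the individual keys or whether the mirrors are reachable.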

3) Test

	docker system prune -f   # clean up earlier downloads
	docker pull hello-world
	docker run hello-world
	
	Output:
	Hello from Docker!
	This message shows that your installation appears to be working correctly.

	To generate this message, Docker took the following steps:
	 1. The Docker client contacted the Docker daemon.
	 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
		(arm64v8)
	 3. The Docker daemon created a new container from that image which runs the
		executable that produces the output you are currently reading.
	 4. The Docker daemon streamed that output to the Docker client, which sent it
		to your terminal.

	To try something more ambitious, you can run an Ubuntu container with:
	 $ docker run -it ubuntu bash

	Share images, automate workflows, and more with a free Docker ID:
	 https://hub.docker.com/

	For more examples and ideas, visit:
	 https://docs.docker.com/get-started/

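4) Some useful docker commands

# Remove dangling images and other unused data
docker system prune -f
# Check whether an image exists
docker images | grep portainer
# Pull Portainer (goes through the registry mirrors configured above)
docker pull portainer/portainer-ce:latest

# --- Exporting an image ---
# Check that the hello-world image exists
docker images | grep hello-world
# Save the image to a tar file
docker save -o hello-world.tar hello-world:latest
# Or use output redirection
docker save hello-world:latest > hello-world-v2.tar
# List the generated files
ls -lh hello-world*.tar

# --- Importing a local image ---
# Load the image from a tar file
docker load -i hello-world.tar
# Or use input redirection
docker load < hello-world.tar
# Verify that the image was imported
docker images | grep hello-world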

III. Setting Up the Swarm

1) Initialize the manager and inspect it

docker swarm init --advertise-addr 192.168.137.224
	Output:
	Swarm initialized: current node (quj6te9i5wbhnvlukhr8umyyw) is now a manager.
	To add a worker to this swarm, run the following command:
	docker swarm join --token SWMTKN-1-09sgobht9vrzc0kzh80lvejpvyp2suda5akgmbf96dvrbiszpi-al15wyhpzlr2cyd68291mup3q 192.168.137.224:2377
	To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

Check the swarm cluster status: docker info
	
List the nodes: docker node ls
		Output:
		root@master-node:/home/swarm# docker node ls
		ID                            HOSTNAME      STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
		quj6te9i5wbhnvlukhr8umyyw *   master-node   Ready     Active         Leader           20.10.7
Show the join command that workers need: docker swarm join-token worker
		Output:
		docker swarm join --token SWMTKN-1-09sgobht9vrzc0kzh80lvejpvyp2suda5akgmbf96dvrbiszpi-al15wyhpzlr2cyd68291mup3q 192.168.137.224:2377
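
When the join step is scripted, the token can be read on its own with `docker swarm join-token -q worker` instead of copying the full command from the terminal. The sketch below shows one way to assemble the worker command from it; `build_join_command` is a hypothetical helper, and 2377 is Swarm's default management port.

```shell
# build_join_command TOKEN MANAGER_IP [PORT]  (hypothetical helper)
# Prints the command a worker would run to join the swarm.
build_join_command() {
    printf 'docker swarm join --token %s %s:%s\n' "$1" "$2" "${3:-2377}"
}

# Example invocation on the manager (requires an initialized swarm):
#   TOKEN=$(docker swarm join-token -q worker)
#   build_join_command "$TOKEN" 192.168.137.224
```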

2) Join the workers and verify

Run on worker-1 and worker-2:
	docker swarm join --token SWMTKN-1-09sgobht9vrzc0kzh80lvejpvyp2suda5akgmbf96dvrbiszpi-al15wyhpzlr2cyd68291mup3q 192.168.137.224:2377
	Output:
	This node joined a swarm as a worker.
List the nodes on the manager: docker node ls
	Output:
	ID                            HOSTNAME      STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
	quj6te9i5wbhnvlukhr8umyyw *   master-node   Ready     Active         Leader           20.10.7
	v7n81ykd7mvine0ul6npshbyz     worker-1      Ready     Active                          20.10.7
	r9zl96awq209m8kjpffr9a1es     worker-2      Ready     Active                          20.10.7
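
Instead of reading the node table by eye, the Ready count can be checked from a script. This is a sketch only: `count_ready_nodes` and `wait_for_nodes` are hypothetical helpers, and the 30 x 2 second polling window is an arbitrary choice.

```shell
# count_ready_nodes  (hypothetical helper)
# Reads "HOSTNAME STATUS" lines on stdin and prints how many are Ready.
count_ready_nodes() {
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# wait_for_nodes EXPECTED  (hypothetical helper, run on the manager)
# Polls "docker node ls" until EXPECTED nodes are Ready; gives up after ~60 s.
wait_for_nodes() {
    expected="$1"; tries=0
    while [ "$tries" -lt 30 ]; do
        ready=$(docker node ls --format '{{.Hostname}} {{.Status}}' | count_ready_nodes)
        [ "$ready" -ge "$expected" ] && return 0
        tries=$((tries + 1)); sleep 2
    done
    return 1
}

# Example invocation for the three-node cluster in this guide:
#   wait_for_nodes 3 && echo "all nodes Ready"
```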

IV. Deploying the hello-world Service

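Note: the hello-world image only prints a message and then exits. Because a swarm service restarts its tasks whenever they exit (the default restart condition is "any"), the service keeps starting new tasks in a loop!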
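
If the restart loop is not wanted for such a one-shot image, `docker service create` accepts `--restart-condition none`, which tells Swarm not to start a replacement after a task exits. The sketch below just prints the resulting command so it can be inspected; `one_shot_service_cmd` is a hypothetical helper, and the steps in this guide deliberately keep the default behaviour.

```shell
# one_shot_service_cmd NAME REPLICAS IMAGE  (hypothetical helper)
# Prints a "docker service create" command whose tasks are not restarted on exit.
one_shot_service_cmd() {
    printf 'docker service create --name %s --replicas %s --restart-condition none %s\n' \
        "$1" "$2" "$3"
}

# Example invocation:
#   one_shot_service_cmd test-service 3 hello-world
```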

1) Steps and commands

# 1. Create a service from the existing hello-world image
docker service create --name test-service --replicas 3 hello-world

# 2. Check the service status
docker service ps test-service

# 3. View the service logs
docker service logs test-service

# 4. Scale the service to more replicas
docker service scale test-service=6

# 5. Check the task distribution again
docker service ps test-service

# 6. Remove the test service
docker service rm test-service

2) Output

root@master-node:~# docker service ps test-service
ID             NAME                 IMAGE                NODE          DESIRED STATE   CURRENT STATE             ERROR     PORTS
anemuatoff61   test-service.1       hello-world:latest   master-node   Ready           Ready 3 seconds ago                 
0iwhwbqm1w0i    \_ test-service.1   hello-world:latest   master-node   Shutdown        Complete 3 seconds ago              
ylqqghxb9jd6    \_ test-service.1   hello-world:latest   master-node   Shutdown        Complete 9 seconds ago              
5zh3m6cf1a6n    \_ test-service.1   hello-world:latest   master-node   Shutdown        Complete 15 seconds ago             
jvec6zvc3ti0    \_ test-service.1   hello-world:latest   master-node   Shutdown        Complete 21 seconds ago             
d5j4mbjai5ay   test-service.2       hello-world:latest   worker-2      Ready           Ready 2 seconds ago                 
v6d2one9fs0c    \_ test-service.2   hello-world:latest   worker-2      Shutdown        Complete 2 seconds ago              
hbj46wuzzva2    \_ test-service.2   hello-world:latest   worker-2      Shutdown        Complete 8 seconds ago              
uq0nq6z3wh0k    \_ test-service.2   hello-world:latest   worker-2      Shutdown        Complete 15 seconds ago             
mjfrz9ie9qks    \_ test-service.2   hello-world:latest   worker-2      Shutdown        Complete 21 seconds ago             
0bti65ql81y5   test-service.3       hello-world:latest   worker-1      Ready           Ready 2 seconds ago                 
fvr1thptrn5y    \_ test-service.3   hello-world:latest   worker-1      Shutdown        Complete 3 seconds ago              
za0b02yrakt3    \_ test-service.3   hello-world:latest   worker-1      Shutdown        Complete 9 seconds ago              
rmgiomcz54g1    \_ test-service.3   hello-world:latest   worker-1      Shutdown        Complete 15 seconds ago             
i4bh61dg0o4q    \_ test-service.3   hello-world:latest   worker-1      Shutdown        Complete 21 seconds ago             
root@master-node:~# docker service logs test-service	

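3) Evidence of success

1. Service scheduling works
	The 3 replicas run on 3 different nodes:
	test-service.1 on master-node
	test-service.2 on worker-2
	test-service.3 on worker-1
2. Automatic restart (self-healing)
	Each task slot has several history entries (Shutdown/Complete, followed by a new Ready task). This shows that Swarm creates a replacement task whenever a container exits in order to keep the desired replica count, which is the mechanism behind Swarm's self-healing. Here it is triggered by a normal exit rather than by a fault.
3. Cross-node communication works
	The output shows that all 3 nodes executed tasks successfully.
	The network is reachable, and results are reported back to the manager node.
4. Image distribution works
	All nodes can pull and run the hello-world image.
	ARM64 compatibility is good.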
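
The distribution described above can also be summarized from a script rather than read off the table. A minimal sketch, assuming the `--format` placeholders `.Node` and `.DesiredState` of `docker service ps`; `tasks_per_node` is a hypothetical helper that counts every listed task, including the Shutdown history entries.

```shell
# tasks_per_node  (hypothetical helper)
# Reads "NODE DESIRED_STATE" lines on stdin and prints "<node> <count>" per node,
# sorted by node name.
tasks_per_node() {
    awk '{ count[$1]++ } END { for (n in count) print n, count[n] }' | sort
}

# Example invocation on the manager:
#   docker service ps test-service --format '{{.Node}} {{.DesiredState}}' | tasks_per_node
```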

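4) 🎯 Cluster function verification complete

Function	Status	Result
Service creation	✅	Multi-replica service created successfully
Node scheduling	✅	Tasks spread evenly across the 3 nodes
Load balancing	✅	Tasks assigned automatically
High availability	✅	Tasks recreated automatically after they exit
Log collection	✅	Logs aggregated across nodes
Network communication	✅	Inter-node communication works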