On the road of microservice architecture evolution, we started with single-host Docker Compose, moved through a Docker Swarm cluster, and now step into the far more powerful Kubernetes ecosystem. Building on the actual project environment from the earlier posts (including the permission problems we ran into there), this article walks through upgrading the RocketMQ microservice project from Docker Swarm to a production-grade Kubernetes deployment.
1. Kubernetes vs. Docker Swarm: Architecture Comparison
1.1 Core Differences
| Aspect | Docker Swarm | Kubernetes |
|---|---|---|
| Learning curve | Gentle, Docker-native | Steep, many new concepts |
| Feature completeness | Basic orchestration | Complete enterprise-grade ecosystem |
| Extensibility | Moderate | Extremely strong |
| Community ecosystem | Average | Extremely rich |
| Deployment complexity | Simple | Complex but flexible |
1.2 Why Upgrade
- ✅ Stronger self-healing
- ✅ Fine-grained resource management
- ✅ A rich ecosystem of extensions (monitoring, logging, network policies, and more)
- ✅ The industry standard, with strong community support
2. Kubernetes Cluster Planning and Setup
2.1 Node Planning (based on your existing CentOS machines)
Assume three CentOS 7.9 machines:
- k8s-master (192.168.1.10) - control-plane node
- k8s-node1 (192.168.1.11) - worker node
- k8s-node2 (192.168.1.12) - worker node
2.2 Base Environment Preparation (all nodes)
```bash
# All nodes: disable SELinux and the firewall
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
systemctl stop firewalld
systemctl disable firewalld
# Disable swap (required by the kubelet)
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/' /etc/fstab
# Configure kernel parameters for bridged traffic
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
# Configure Aliyun yum mirrors (faster inside mainland China)
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-7.repo
```
2.3 Install Docker and the Kubernetes Components
```bash
# All nodes: install Docker
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum makecache fast
yum install -y docker-ce-20.10.12 docker-ce-cli-20.10.12 containerd.io
# Configure Docker registry mirrors and the systemd cgroup driver
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"registry-mirrors": [
"https://docker.mirrors.ustc.edu.cn",
"https://hub-mirror.c.163.com"
]
}
EOF
# Start Docker
systemctl enable docker && systemctl start docker
# All nodes: install kubeadm, kubelet, and kubectl
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0
systemctl enable kubelet && systemctl start kubelet
```
2.4 Initialize the Kubernetes Cluster
```bash
# Run the initialization on the master node
kubeadm init \
--apiserver-advertise-address=192.168.1.10 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=Swap
# Configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# On each worker node, run the join command printed by kubeadm init
kubeadm join 192.168.1.10:6443 --token <token> --discovery-token-ca-cert-hash <hash>
```
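If the join command is no longer at hand, it can be regenerated on the master at any time:

```bash
# Run on the master: prints a fresh "kubeadm join ..." command with a new token
kubeadm token create --print-join-command
```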
2.5 Install the Network Plugin (Flannel)
```bash
# Run on the master node (note: the flannel repo has moved from coreos to flannel-io)
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
# Verify node status
kubectl get nodes
kubectl get pods -n kube-system
```
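Nodes only turn `Ready` once the Flannel pods are up; a convenient way to block until that happens:

```bash
# Wait up to five minutes for every node to report Ready
kubectl wait --for=condition=Ready node --all --timeout=300s
```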
3. Converting Docker Swarm Configuration to Kubernetes
3.1 Core Concept Mapping
| Docker Swarm concept | Kubernetes resource | Notes |
|---|---|---|
| `service` + `replicas` | `Deployment` | Stateless workload deployment |
| `service` (stateful) | `StatefulSet` | Stateful workload deployment |
| `service` exposure | `Service` | Service discovery and load balancing |
| `ports` | `Service` + `Ingress` | Port exposure and routing |
| `volumes` | `PersistentVolumeClaim` | Persistent storage |
| `configs` | `ConfigMap`/`Secret` | Configuration management |
| `networks` | `NetworkPolicy` | Network policy |
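To make the mapping concrete, here is a minimal sketch of a hypothetical Swarm service (the names are illustrative, not taken from the original stack file) next to the equivalent Kubernetes Deployment fields:

```yaml
# Swarm stack file (hypothetical):
#   demo-api:
#     image: demo-api:1.0
#     deploy:
#       replicas: 3
#       resources:
#         limits: { cpus: "0.5", memory: 512M }
#
# Equivalent Kubernetes Deployment (abbreviated):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-api
spec:
  replicas: 3                      # <- deploy.replicas
  selector:
    matchLabels: { app: demo-api }
  template:
    metadata:
      labels: { app: demo-api }
    spec:
      containers:
        - name: demo-api
          image: demo-api:1.0
          resources:
            limits:                # <- deploy.resources.limits
              cpu: "500m"
              memory: "512Mi"
```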
3.2 Create the Kubernetes Namespace
```yaml
# 00-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: rocketmq-app
labels:
    name: rocketmq-app
```
4. Reworking the Stateful Services (Key: Fixing the Permission Problems)
4.1 Redis Deployment
```yaml
# 01-redis-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: redis-pvc
namespace: rocketmq-app
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
storageClassName: standard
---
# 02-redis-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: redis
namespace: rocketmq-app
labels:
app: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
      # Key point: fix the permission problem with a pod-level security context
securityContext:
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
containers:
- name: redis
image: redis:7-alpine
ports:
- containerPort: 6379
command:
- redis-server
- "--requirepass"
- "123456"
- "--appendonly"
- "yes"
volumeMounts:
- name: redis-data
mountPath: /data
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "200m"
        livenessProbe:
          exec:
            # authenticate; PING without auth is rejected once requirepass is set
            command: ["redis-cli", "-a", "123456", "ping"]
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command: ["redis-cli", "-a", "123456", "ping"]
          initialDelaySeconds: 5
          periodSeconds: 10
volumes:
- name: redis-data
persistentVolumeClaim:
claimName: redis-pvc
---
# 03-redis-service.yaml
apiVersion: v1
kind: Service
metadata:
name: redis
namespace: rocketmq-app
spec:
selector:
app: redis
ports:
- port: 6379
targetPort: 6379
  clusterIP: None # Headless Service for stateful applications
```
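A quick smoke test that both the password and the probes work (run from any machine with cluster access):

```bash
# Should print PONG; -a passes the same requirepass value as the Deployment
kubectl -n rocketmq-app exec deploy/redis -- redis-cli -a 123456 ping
```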
4.2 MongoDB Deployment
```yaml
# 04-mongo-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mongo-pvc
namespace: rocketmq-app
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
storageClassName: standard
---
# 05-mongo-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: mongo
namespace: rocketmq-app
labels:
app: mongo
spec:
replicas: 1
selector:
matchLabels:
app: mongo
template:
metadata:
labels:
app: mongo
spec:
securityContext:
runAsUser: 999
runAsGroup: 999
fsGroup: 999
containers:
- name: mongo
image: mongo:5
ports:
- containerPort: 27017
env:
- name: MONGO_INITDB_ROOT_USERNAME
value: "root"
- name: MONGO_INITDB_ROOT_PASSWORD
value: "root"
volumeMounts:
- name: mongo-data
mountPath: /data/db
resources:
requests:
memory: "512Mi"
cpu: "200m"
limits:
memory: "1Gi"
cpu: "500m"
livenessProbe:
exec:
command:
- mongosh
- --eval
- "db.adminCommand('ping')"
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- mongosh
- --eval
- "db.adminCommand('ping')"
initialDelaySeconds: 5
periodSeconds: 10
volumes:
- name: mongo-data
persistentVolumeClaim:
claimName: mongo-pvc
---
# 06-mongo-service.yaml
apiVersion: v1
kind: Service
metadata:
name: mongo
namespace: rocketmq-app
spec:
selector:
app: mongo
ports:
- port: 27017
      targetPort: 27017
```
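The same kind of smoke test for MongoDB (mongosh ships inside the mongo:5 image):

```bash
# Should report ok: 1
kubectl -n rocketmq-app exec deploy/mongo -- \
  mongosh -u root -p root --authenticationDatabase admin --eval "db.adminCommand('ping')"
```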
5. RocketMQ Cluster Deployment
5.1 RocketMQ NameServer
```yaml
# 07-rmqnamesrv-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: rmqnamesrv
namespace: rocketmq-app
labels:
app: rmqnamesrv
spec:
  replicas: 2 # two replicas for high availability
selector:
matchLabels:
app: rmqnamesrv
template:
metadata:
labels:
app: rmqnamesrv
spec:
securityContext:
runAsUser: 3000
runAsGroup: 3000
fsGroup: 3000
containers:
- name: rmqnamesrv
image: apache/rocketmq:5.1.0
command: ["sh", "mqnamesrv"]
ports:
- containerPort: 9876
resources:
requests:
memory: "256Mi"
cpu: "200m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
tcpSocket:
port: 9876
initialDelaySeconds: 30
periodSeconds: 10
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rmqnamesrv
topologyKey: kubernetes.io/hostname
---
# 08-rmqnamesrv-service.yaml
apiVersion: v1
kind: Service
metadata:
name: rmqnamesrv
namespace: rocketmq-app
spec:
selector:
app: rmqnamesrv
ports:
- port: 9876
targetPort: 9876
  clusterIP: None # Headless Service for stateful discovery
```
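Because the Service is headless, the DNS name resolves to the individual pod IPs rather than a single virtual IP, which is exactly what RocketMQ clients expect for a NameServer list. This is easy to confirm:

```bash
# Both rmqnamesrv pod IPs should show up as endpoints
kubectl -n rocketmq-app get endpoints rmqnamesrv
```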
5.2 RocketMQ Broker (Cluster-Ready)
```yaml
# 09-rocketmq-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: rocketmq-config
namespace: rocketmq-app
data:
broker.conf: |
    # Broker cluster configuration
brokerClusterName = DefaultCluster
brokerName = broker-a
brokerId = 0
deleteWhen = 04
fileReservedTime = 48
brokerRole = ASYNC_MASTER
flushDiskType = ASYNC_FLUSH
    # NameServer discovery via the cluster DNS name
namesrvAddr = rmqnamesrv.rocketmq-app.svc.cluster.local:9876
---
# 10-rmqbroker-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: rmqbroker
namespace: rocketmq-app
spec:
serviceName: "rmqbroker"
replicas: 2
selector:
matchLabels:
app: rmqbroker
template:
metadata:
labels:
app: rmqbroker
spec:
securityContext:
runAsUser: 3000
runAsGroup: 3000
fsGroup: 3000
containers:
- name: rmqbroker
image: apache/rocketmq:5.1.0
        command:
          - sh
          - -c
          - |
            # The ConfigMap mount is read-only, so copy the config to a writable
            # path before patching the per-pod broker name
            cp /opt/rocketmq/conf/broker.conf /tmp/broker.conf
            BROKER_NAME="broker-${HOSTNAME##*-}"
            sed -i "s/brokerName = broker-a/brokerName = ${BROKER_NAME}/" /tmp/broker.conf
            sh mqbroker -c /tmp/broker.conf
ports:
- containerPort: 10911
- containerPort: 10909
env:
- name: JAVA_OPTS
value: " -Duser.home=/opt"
- name: JAVA_OPT_EXT
value: "-server -Xms512m -Xmx512m -Xmn256m"
volumeMounts:
- name: broker-config
mountPath: /opt/rocketmq/conf/broker.conf
subPath: broker.conf
- name: broker-data
mountPath: /home/rocketmq/store
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "1"
livenessProbe:
tcpSocket:
port: 10911
initialDelaySeconds: 60
periodSeconds: 30
volumes:
- name: broker-config
configMap:
name: rocketmq-config
volumeClaimTemplates:
- metadata:
name: broker-data
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: standard
resources:
requests:
storage: 20Gi
---
# 11-rmqbroker-service.yaml
apiVersion: v1
kind: Service
metadata:
name: rmqbroker
namespace: rocketmq-app
spec:
selector:
app: rmqbroker
ports:
- name: main
port: 10911
targetPort: 10911
- name: vip
port: 10909
targetPort: 10909
  clusterIP: None
```
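Once both broker pods are running, the NameServer's cluster view confirms whether registration worked. The sketch below assumes mqadmin resolves from the image's default working directory, the same way mqbroker does above:

```bash
# Should list DefaultCluster with broker-0 and broker-1
kubectl -n rocketmq-app exec rmqbroker-0 -- \
  sh mqadmin clusterList -n rmqnamesrv.rocketmq-app.svc.cluster.local:9876
```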
6. Reworking the Spring Boot Application (Key: Fixing the Permission Problems)
6.1 Application Configuration and Deployment
```yaml
# 12-springboot-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: springboot-app-config
namespace: rocketmq-app
data:
application.yml: |
server:
port: 8080
spring:
application:
name: rocketmq-user-service
data:
mongodb:
uri: mongodb://root:root@mongo.rocketmq-app.svc.cluster.local:27017/product_db?authSource=admin
redis:
host: redis.rocketmq-app.svc.cluster.local
port: 6379
password: 123456
database: 0
rocketmq:
name-server: rmqnamesrv.rocketmq-app.svc.cluster.local:9876
producer:
group: user-service-producer-group
logging:
level:
RocketmqClient: ERROR
com.example: DEBUG
file:
name: /home/spring/logs/application/application.log
---
# 13-springboot-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: springboot-app-secret
namespace: rocketmq-app
type: Opaque
data:
# echo -n 'your-password' | base64
redis-password: MTIzNDU2 # 123456
---
# 14-springboot-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-springboot-app
namespace: rocketmq-app
labels:
app: my-springboot-app
spec:
replicas: 3
selector:
matchLabels:
app: my-springboot-app
template:
metadata:
labels:
app: my-springboot-app
spec:
      # Key point: security context that fixes the file-permission problem
securityContext:
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
fsGroupChangePolicy: "OnRootMismatch"
containers:
- name: my-springboot-app
image: my-springboot-app:latest
ports:
- containerPort: 8080
env:
- name: SPRING_PROFILES_ACTIVE
value: "k8s"
- name: ROCKETMQ_CLIENT_LOG_ROOT
value: "/home/spring/logs/rocketmqlogs"
volumeMounts:
- name: app-config
mountPath: /app/config
readOnly: true
- name: app-logs
mountPath: /home/spring/logs
- name: tmp-volume
mountPath: /tmp
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "500m"
livenessProbe:
httpGet:
path: /actuator/health
port: 8080
initialDelaySeconds: 60
periodSeconds: 10
timeoutSeconds: 5
readinessProbe:
httpGet:
path: /actuator/health
port: 8080
initialDelaySeconds: 30
periodSeconds: 5
timeoutSeconds: 3
        # postStart hook: ensure the log directories exist (ownership is fixed by the initContainer below)
lifecycle:
postStart:
exec:
command: ["/bin/sh", "-c", "mkdir -p /home/spring/logs/application /home/spring/logs/rocketmqlogs && chmod -R 755 /home/spring/logs"]
initContainers:
- name: init-logs-permission
image: busybox:1.35
command: ['sh', '-c', 'mkdir -p /home/spring/logs/application /home/spring/logs/rocketmqlogs && chown -R 1000:1000 /home/spring/logs && chmod -R 755 /home/spring/logs']
volumeMounts:
- name: app-logs
mountPath: /home/spring/logs
volumes:
- name: app-config
configMap:
name: springboot-app-config
items:
- key: application.yml
path: application.yml
- name: app-logs
persistentVolumeClaim:
claimName: springboot-logs-pvc
- name: tmp-volume
emptyDir: {}
---
# 15-springboot-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: springboot-logs-pvc
namespace: rocketmq-app
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
storageClassName: standard
---
# 16-springboot-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-springboot-app
  namespace: rocketmq-app
  labels:
    app: my-springboot-app # lets the ServiceMonitor in section 9 select this Service
spec:
selector:
app: my-springboot-app
ports:
- port: 8080
targetPort: 8080
type: ClusterIP
---
# 17-springboot-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: springboot-app-ingress
namespace: rocketmq-app
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
rules:
- host: rocketmq-app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-springboot-app
port:
                number: 8080
```
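One loose end: 13-springboot-secret.yaml is created but never consumed, and the Redis password remains a plain-text literal in the ConfigMap. A hedged sketch of wiring the Secret in via an environment variable instead (Spring Boot resolves `${...}` placeholders from the environment):

```yaml
# In 14-springboot-deployment.yaml, under the application container:
        env:
          - name: REDIS_PASSWORD
            valueFrom:
              secretKeyRef:
                name: springboot-app-secret
                key: redis-password
# ...and in application.yml, replace the literal:
#   spring:
#     redis:
#       password: ${REDIS_PASSWORD}
```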
7. Deployment and Verification
7.1 Apply All Resources in Order
```bash
# Create the namespace
kubectl apply -f 00-namespace.yaml
# Storage first
kubectl apply -f 01-redis-pvc.yaml
kubectl apply -f 04-mongo-pvc.yaml
kubectl apply -f 15-springboot-pvc.yaml
# Then configuration
kubectl apply -f 09-rocketmq-configmap.yaml
kubectl apply -f 12-springboot-configmap.yaml
kubectl apply -f 13-springboot-secret.yaml
# Stateful services
kubectl apply -f 02-redis-deployment.yaml
kubectl apply -f 03-redis-service.yaml
kubectl apply -f 05-mongo-deployment.yaml
kubectl apply -f 06-mongo-service.yaml
kubectl apply -f 07-rmqnamesrv-deployment.yaml
kubectl apply -f 08-rmqnamesrv-service.yaml
kubectl apply -f 10-rmqbroker-statefulset.yaml
kubectl apply -f 11-rmqbroker-service.yaml
# Finally the application layer
kubectl apply -f 14-springboot-deployment.yaml
kubectl apply -f 16-springboot-service.yaml
kubectl apply -f 17-springboot-ingress.yaml
# Verify deployment status
kubectl get all -n rocketmq-app
kubectl get pvc -n rocketmq-app
```
7.2 Verify Service Connectivity
```bash
# Watch the status of all pods
kubectl get pods -n rocketmq-app -w
# Inspect the Deployment in detail
kubectl describe deployment my-springboot-app -n rocketmq-app
# Test service discovery
kubectl run test-pod --image=busybox -it --rm -n rocketmq-app -- sh
# Inside the test pod, run:
nslookup redis.rocketmq-app.svc.cluster.local
nslookup rmqnamesrv.rocketmq-app.svc.cluster.local
telnet my-springboot-app 8080
```
7.3 Verify the Permission Fix
```bash
# Exec into the Spring Boot pod
kubectl exec -it deployment/my-springboot-app -n rocketmq-app -- sh
# Inside the container, verify write access:
whoami
ls -la /home/spring/logs/
touch /home/spring/logs/test-permission.log
echo "权限测试成功" > /home/spring/logs/test-permission.log
cat /home/spring/logs/test-permission.log
8. Advanced Features
8.1 Resource Limits and HPA
```yaml
# 18-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: springboot-app-hpa
namespace: rocketmq-app
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: my-springboot-app
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
          averageUtilization: 80
```
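CPU-based autoscaling only works once the resource metrics API is served; on a bare kubeadm cluster that means installing metrics-server first:

```bash
# Install metrics-server (required for the HPA's CPU metrics)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# On kubeadm clusters with self-signed kubelet certificates, you may also need
# to add --kubelet-insecure-tls to the metrics-server container args
```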
8.2 Network Policy
```yaml
# 19-network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: rocketmq-app-policy
namespace: rocketmq-app
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: rocketmq-app
egress:
- to:
- namespaceSelector:
matchLabels:
              name: rocketmq-app
```
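Caution: as written, this policy confines egress to the rocketmq-app namespace, which also blocks DNS lookups against kube-dns in kube-system and would break service discovery. An extra egress rule fixes that (the `kubernetes.io/metadata.name` label is set automatically on namespaces from Kubernetes 1.21 onward):

```yaml
# Additional rule to append under spec.egress
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```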
9. Monitoring and Log Collection
9.1 Deploy Prometheus
```bash
# Add the Prometheus community Helm repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install the kube-prometheus-stack chart into a dedicated namespace
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace
```
9.2 Application Monitoring
```yaml
# 20-service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: springboot-app-monitor
  namespace: rocketmq-app
  labels:
    release: prometheus # by default kube-prometheus-stack only picks up ServiceMonitors carrying its release label
spec:
selector:
matchLabels:
app: my-springboot-app
  endpoints:
    - targetPort: 8080 # the Service port is unnamed, so match it by number
      path: /actuator/prometheus
      interval: 30s
```
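The /actuator/prometheus endpoint only exists if the application bundles micrometer-registry-prometheus and exposes it through Actuator; assuming that dependency is present, the relevant application.yml fragment looks like this:

```yaml
# application.yml: expose the Prometheus scrape endpoint via Actuator
management:
  endpoints:
    web:
      exposure:
        include: health,prometheus
```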
10. Migration Verification Checklist
10.1 Functional Checks
```bash
# 1. Basic service check
kubectl get pods -n rocketmq-app --no-headers | grep -v Running && echo "some pods are not ready"
# 2. Service connectivity test
kubectl run connectivity-test --image=curlimages/curl -it --rm -n rocketmq-app -- \
  curl -s http://my-springboot-app:8080/actuator/health
# 3. RocketMQ functional test (assumes the app exposes this test endpoint)
kubectl exec -it deployment/my-springboot-app -n rocketmq-app -- \
curl -X POST http://localhost:8080/api/test-message
# 4. Data persistence check
kubectl exec -it statefulset/rmqbroker -n rocketmq-app -- \
ls -la /home/rocketmq/store/
# 5. Permission check (the key one)
kubectl exec -it deployment/my-springboot-app -n rocketmq-app -- \
  touch /home/spring/logs/permission-test.log && echo "permissions configured correctly"
```
Summary
Following the steps in this article, we migrated the RocketMQ microservice project from Docker Swarm to a far more capable Kubernetes cluster, and resolved the permission problems encountered earlier along the way. The key improvements:
🎯 Key Outcomes
- ✅ Permission problems fixed for good: SecurityContext plus an initContainer guarantee correct file ownership
- ✅ Production-grade availability: NameServer, Broker, and the application all run with multiple replicas
- ✅ Fine-grained resource management: CPU/memory requests and limits, plus HPA autoscaling
- ✅ Full service governance: service discovery, health checks, and network policies
- ✅ Enterprise-grade monitoring: Prometheus integrated via kube-prometheus-stack
🔄 Migration Value
- Scalability: from simple Swarm orchestration to the full Kubernetes ecosystem
- Stability: robust self-healing and failover
- Observability: monitoring in place, with a clear path to logging and tracing
- Security: network policies, RBAC, and security contexts