RabbitMQ + STS + discovery_k8s + local PV: Deployment Pitfalls Explained

Author: 朱雷

Contents

  • 1. Deployment Pitfalls
    • 1.1. ConfigMap file
    • 1.2. PV file
    • 1.3. StorageClass file
    • 1.4. ServiceAccount file
    • 1.5. Headless Service file
    • 1.6. StatefulSet file
  • 2. Summary of Key Issues in RabbitMQ Cluster Deployment

1. Deployment Pitfalls

1.1. ConfigMap file

Edit the cm.yaml file:

apiVersion: v1
kind: ConfigMap
metadata:
  name: rabbitmq-config
  namespace: rabbitmq-clu-9
data:
  rabbitmq.conf: |
    # Basic settings
    listeners.tcp.default = 5672
    # management.listener.port = 15672
    # management.listener.ssl = false
    disk_free_limit.absolute = 1GB
    # Use the Kubernetes plugin for peer discovery
    cluster_formation.peer_discovery_backend = k8s
    # cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
    cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
    cluster_formation.k8s.address_type = hostname
    # rabbitmq-clu-9 in the suffix is the namespace; keep it identical to the namespace in the StatefulSet manifest
    cluster_formation.k8s.hostname_suffix = .rabbitmq-headless.rabbitmq-clu-9.svc.cluster.local
    cluster_formation.discovery_retry_limit = 10
    cluster_formation.discovery_retry_interval = 3000
    # service_name must match the name of the headless Service
    cluster_formation.k8s.service_name = rabbitmq-headless
    cluster_formation.node_cleanup.interval = 30
    cluster_formation.node_cleanup.only_log_warning = false
    cluster_formation.etcd.ssl_options.verify = verify_none
    # Memory settings
    vm_memory_high_watermark.relative = 0.6
    vm_memory_high_watermark_paging_ratio = 0.5
    # Logging settings
    log.console = true
    log.console.level = debug
    log.file = false
    # Temporarily enable debug logging
    log.connection.level = debug
    log.channel.level = debug
    log.queue.level = debug

1.2. PV file

apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbitmq-cluster-pv-0
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: hostpath-storage
  hostPath:
    path: /tmp/rabbitmq/0
    type: DirectoryOrCreate
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - 192.168.88.201   # change to the hostname of the node this PV is bound to
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbitmq-cluster-pv-1
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: hostpath-storage
  hostPath:
    path: /tmp/rabbitmq/1
    type: DirectoryOrCreate
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - 192.168.88.202   # change to the hostname of the node this PV is bound to
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbitmq-cluster-pv-2
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: hostpath-storage
  hostPath:
    path: /tmp/rabbitmq/2
    type: DirectoryOrCreate
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - 192.168.88.203   # change to the hostname of the node this PV is bound to

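Pitfall 1: The node selector is matched against the hostname of the cluster node (the value of its kubernetes.io/hostname label). If a node's IP and hostname differ, entering the IP here leaves the Pod stuck in the Pending state.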
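
As a minimal sketch of the point above, the fragment below shows the nodeAffinity section of a PV pinned to a node by hostname. The name k8s-node-01 is a hypothetical placeholder, not a value from this deployment; the actual label values can be listed with kubectl get nodes --show-labels.

```yaml
# Fragment of a PersistentVolume spec only.
# k8s-node-01 is a hypothetical hostname used for illustration.
nodeAffinity:
  required:
    nodeSelectorTerms:
      - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
              - k8s-node-01   # must equal the node's kubernetes.io/hostname label value
```

If the nodes in a cluster happen to be registered under their IP addresses, the label value is the IP and the manifests in this section work unchanged.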

1.3. StorageClass file

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hostpath-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

1.4. ServiceAccount file

apiVersion: v1
kind: ServiceAccount
metadata:
  name: rabbitmq
  namespace: rabbitmq-clu-9
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: rabbitmq-peer-discovery
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "endpoints"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["discovery.k8s.io"]
  resources: ["endpointslices"] 
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: rabbitmq-peer-discovery
subjects:
- kind: ServiceAccount
  name: rabbitmq
  namespace: rabbitmq-clu-9
roleRef:
  kind: ClusterRole
  name: rabbitmq-peer-discovery
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rabbitmq-configmap
  namespace: rabbitmq-clu-9
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "update"]
  resourceNames: ["rabbitmq-config"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rabbitmq-configmap
  namespace: rabbitmq-clu-9
subjects:
- kind: ServiceAccount
  name: rabbitmq
roleRef:
  kind: Role
  name: rabbitmq-configmap
  apiGroup: rbac.authorization.k8s.io

Pitfall 2: If the ClusterRole rabbitmq-peer-discovery is not granted access to the nodes resource, the Pod keeps reporting errors during startup and fails to start.

1.5. Headless Service file

apiVersion: v1
kind: Service
metadata:
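  name: rabbitmq-headless
  # Same namespace as the StatefulSet, as assumed by hostname_suffix in the ConfigMap
  namespace: rabbitmq-clu-9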
  labels:
    app: rabbitmq
spec:
  clusterIP: None
  ports:
  - name: amqp
    port: 5672
    targetPort: 5672
  - name: management
    port: 15672 
    targetPort: 15672
  - name: epmd
    port: 4369
    targetPort: 4369
  - name: dist
    port: 25672
    targetPort: 25672
  selector:
    app: rabbitmq
  publishNotReadyAddresses: true

Pitfall 3: All of the ports above must be exposed; otherwise cluster formation fails.

1.6. StatefulSet file

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: rabbitmq-clu-9
  labels:
    app: rabbitmq
spec:
  serviceName: rabbitmq-headless
  replicas: 3
  #podManagementPolicy: "Parallel"
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      terminationGracePeriodSeconds: 10
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - rabbitmq
              topologyKey: kubernetes.io/hostname
      serviceAccountName: rabbitmq        
      containers:
      - name: rabbitmq
        image: rabbitmq:3.8.27-management
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 5672
          name: amqp
        - containerPort: 15672
          name: http
        env:
        - name: RABBITMQ_USE_LONGNAME
          value: "true"        
        - name: RABBITMQ_ERLANG_COOKIE
          value: "secret-cookie"
        - name: RABBITMQ_DEFAULT_USER
          value: "admin"
        - name: RABBITMQ_DEFAULT_PASS
          value: "admin123"
        volumeMounts:
        - name: config
          mountPath: /etc/rabbitmq/rabbitmq.conf
          subPath: rabbitmq.conf
        - name: data
          mountPath: /var/lib/rabbitmq
          readOnly: false
        #readinessProbe:
        #  exec:
        #    command: ["rabbitmq-diagnostics", "status"]
        #  initialDelaySeconds: 20
        #  periodSeconds: 30
        livenessProbe:
          exec:
            command: ["rabbitmq-diagnostics", "ping"]
          initialDelaySeconds: 60
          periodSeconds: 30
      volumes:
      - name: config
        configMap:
          name: rabbitmq-config
  volumeClaimTemplates:
  - metadata:
      name: data
      namespace: rabbitmq-clu-9
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1000M
      storageClassName: hostpath-storage

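Pitfall 4: The RABBITMQ_USE_LONGNAME environment variable must be set to true so that nodes use fully qualified domain names (FQDNs). Otherwise the node names are truncated to short hostnames, inter-node communication fails, and the cluster cannot be formed.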
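
To make the FQDN requirement concrete, the fragment below repeats the relevant env entry and lists, as comments, the peer hostnames that follow from the StatefulSet name, the replica count, and the hostname_suffix in the ConfigMap. This is a sketch derived from those settings, not output captured from a running cluster.

```yaml
# Fragment of the container spec from the StatefulSet in 1.6.
env:
- name: RABBITMQ_USE_LONGNAME
  value: "true"
# With address_type = hostname and
# hostname_suffix = .rabbitmq-headless.rabbitmq-clu-9.svc.cluster.local,
# the three replicas are expected to be addressed as:
#   rabbitmq-0.rabbitmq-headless.rabbitmq-clu-9.svc.cluster.local
#   rabbitmq-1.rabbitmq-headless.rabbitmq-clu-9.svc.cluster.local
#   rabbitmq-2.rabbitmq-headless.rabbitmq-clu-9.svc.cluster.local
```

Because discovery hands out these full names, a node that identifies itself by a short hostname does not match what its peers look for.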

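Pitfall 5: If podManagementPolicy is not "Parallel", the readiness probe must be disabled; otherwise cluster startup fails. With the default OrderedReady policy, the StatefulSet controller creates the next Pod only after the previous one is Ready, so a readiness probe that does not pass blocks the remaining replicas from ever being created.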
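
The other way to satisfy this constraint is to switch to the Parallel policy and keep the readiness probe. The fragment below is a sketch of that variant, reusing the probe that is commented out in the manifest above; it was not the configuration tested in this article, so treat it as an assumption to verify.

```yaml
# Fragment of the StatefulSet spec only; merge into the manifest from 1.6.
spec:
  podManagementPolicy: "Parallel"   # create all replicas at once instead of one by one
  template:
    spec:
      containers:
      - name: rabbitmq
        readinessProbe:
          exec:
            command: ["rabbitmq-diagnostics", "status"]
          initialDelaySeconds: 20
          periodSeconds: 30
```

Note that podManagementPolicy cannot be changed on an existing StatefulSet, so switching policies means deleting and recreating the StatefulSet.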

2. Summary of Key Issues in RabbitMQ Cluster Deployment

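The five pitfalls above are the core issues encountered in this RabbitMQ cluster deployment. Checking each item against your configuration before rolling out should noticeably improve the chance of a successful deployment.

  1. Node selector binding: use the cluster node's hostname, not its IP; otherwise the Pod stays in the Pending state.
  2. Role authorization: make sure the rabbitmq-peer-discovery ClusterRole is granted access to the nodes resource; otherwise the Pod reports errors on startup.
  3. Port exposure: core ports such as 4369 (EPMD) and 25672 (Erlang distribution) must be exposed; otherwise cluster initialization fails.
  4. Long node names: set the RABBITMQ_USE_LONGNAME environment variable to true to force FQDN node names, so that truncated names do not break inter-node communication.
  5. Pod management policy: if the Parallel policy is not used, disable the readiness probe so that cluster startup is not blocked.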