# Deploying Highly Available Kuboard on CentOS Stream 8

Below is a detailed plan for deploying a highly available Kuboard management platform on CentOS Stream 8, covering multiple replicas, persistent storage, and scheduled backups.

## I. Architecture Design

### High-Availability Architecture

*(architecture diagram omitted)*

### Node Layout

| Hostname | IP Address | Role | Resources (CPU/mem) |
|---|---|---|---|
| lb1 | 192.168.1.10 | HAProxy + Keepalived | 2C/4G |
| lb2 | 192.168.1.11 | HAProxy + Keepalived | 2C/4G |
| master1 | 192.168.1.101 | Kubernetes control plane | 4C/8G |
| master2 | 192.168.1.102 | Kubernetes control plane | 4C/8G |
| worker1 | 192.168.1.201 | Kubernetes worker node | 8C/32G |
| worker2 | 192.168.1.202 | Kubernetes worker node | 8C/32G |
| worker3 | 192.168.1.203 | Kubernetes worker node | 8C/32G |
| storage | 192.168.1.50 | MinIO backup storage | 4C/16G |

## II. Prerequisites

### 1. Base Configuration on All Nodes

```bash
# Set SELinux to permissive
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

# Disable the firewall
sudo systemctl stop firewalld
sudo systemctl disable firewalld

# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# Add hosts entries
sudo tee -a /etc/hosts <<EOF
192.168.1.10 lb1
192.168.1.11 lb2
192.168.1.101 master1
192.168.1.102 master2
192.168.1.201 worker1
192.168.1.202 worker2
192.168.1.203 worker3
192.168.1.50 storage
EOF
```
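kubeadm also expects bridged traffic to be visible to iptables and IP forwarding to be enabled; the module and sysctl settings below follow the upstream Kubernetes container-runtime guidance and should be applied on every node:

```bash
# Load the kernel modules containerd/Kubernetes networking relies on
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

# Make bridged traffic visible to iptables and enable IP forwarding
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system
```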

### 2. Load Balancer Configuration (lb1, lb2)

```bash
# Install HAProxy and Keepalived
sudo dnf install -y haproxy keepalived

# Configure HAProxy (/etc/haproxy/haproxy.cfg)
sudo tee /etc/haproxy/haproxy.cfg <<EOF
global
    log /dev/log local0
    maxconn 10000
    user haproxy
    group haproxy

defaults
    mode tcp
    timeout connect 5s
    timeout client 50s
    timeout server 50s

frontend k8s-api
    bind *:6443
    default_backend k8s-masters

frontend kuboard-http
    bind *:80
    default_backend kuboard-http-backend

frontend kuboard-https
    bind *:443
    default_backend kuboard-https-backend

backend k8s-masters
    balance roundrobin
    option tcp-check
    server master1 192.168.1.101:6443 check fall 3 rise 2
    server master2 192.168.1.102:6443 check fall 3 rise 2

backend kuboard-http-backend
    balance roundrobin
    server worker1 192.168.1.201:30080 check
    server worker2 192.168.1.202:30080 check
    server worker3 192.168.1.203:30080 check

backend kuboard-https-backend
    balance roundrobin
    server worker1 192.168.1.201:30443 check
    server worker2 192.168.1.202:30443 check
    server worker3 192.168.1.203:30443 check
EOF

# Start HAProxy
sudo systemctl enable --now haproxy
```
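A quick sanity check after enabling the service: `haproxy -c` validates the configuration file, and `ss` confirms the listeners are up:

```bash
# Validate the configuration file syntax
sudo haproxy -c -f /etc/haproxy/haproxy.cfg

# Confirm HAProxy is listening on 6443, 80, and 443
sudo ss -lntp | grep haproxy
```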

### 3. Keepalived Configuration (lb1 as MASTER)

```bash
# lb1 configuration (/etc/keepalived/keepalived.conf)
sudo tee /etc/keepalived/keepalived.conf <<EOF
vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens192    # replace with the actual NIC name
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secretpassword
    }
    virtual_ipaddress {
        192.168.1.100/24
    }
    track_script {
        chk_haproxy
    }
}
EOF

# lb2 configuration (BACKUP node)
sudo tee /etc/keepalived/keepalived.conf <<EOF
vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens192    # replace with the actual NIC name
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secretpassword
    }
    virtual_ipaddress {
        192.168.1.100/24
    }
    track_script {
        chk_haproxy
    }
}
EOF

# Start Keepalived
sudo systemctl enable --now keepalived
```
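To verify the failover wiring, check that the VIP sits on lb1 and moves to lb2 when HAProxy stops (NIC name `ens192` as assumed in the configs above):

```bash
# On lb1: the VIP should be present
ip addr show ens192 | grep 192.168.1.100

# Simulate a failure on lb1, then check lb2 the same way;
# the chk_haproxy track_script should pull the VIP over
sudo systemctl stop haproxy
```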

## III. Kubernetes Cluster Deployment

### 1. Install the Container Runtime on All Nodes

```bash
sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf install -y containerd.io

# Configure containerd with the systemd cgroup driver
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd && sudo systemctl enable containerd
```

### 2. Install Kubernetes Components on All Nodes

```bash
# The legacy packages.cloud.google.com repos have been decommissioned;
# pkgs.k8s.io is the community-hosted replacement, pinned here to v1.28.
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

sudo dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
sudo systemctl enable kubelet
```

### 3. Initialize the Control Plane (master1)

```bash
sudo kubeadm init \
  --control-plane-endpoint="192.168.1.100:6443" \
  --upload-certs \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=192.168.1.101

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

### 4. Join the Second Control-Plane Node (master2)

```bash
# On master1, print a fresh join command
kubeadm token create --print-join-command

# On master2, run the join with the --control-plane flag added
sudo kubeadm join 192.168.1.100:6443 --token <token> \
  --discovery-token-ca-cert-hash <hash> \
  --control-plane \
  --certificate-key <cert-key>
```

### 5. Join the Worker Nodes

```bash
# Run the join command on each worker node
sudo kubeadm join 192.168.1.100:6443 --token <token> \
  --discovery-token-ca-cert-hash <hash>
```

### 6. Install the Network Plugin (Calico)

```bash
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
```
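Before continuing, confirm the cluster is healthy; every node should report Ready once Calico is up:

```bash
# All nodes should reach Ready status once the CNI is running
kubectl get nodes -o wide

# Calico and the core components should all be Running
kubectl -n kube-system get pods
```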

## IV. Storage Deployment

### 1. Install Longhorn Distributed Storage

```bash
# Add the Helm repository (assumes Helm is already installed)
helm repo add longhorn https://charts.longhorn.io
helm repo update

# Install Longhorn
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --set persistence.defaultClass=true \
  --set defaultSettings.defaultDataLocality="best-effort" \
  --set defaultSettings.replicaSoftAntiAffinity=true \
  --set defaultSettings.storageOverProvisioningPercentage=200 \
  --set defaultSettings.storageMinimalAvailablePercentage=15 \
  --set defaultSettings.guaranteedEngineCPU=0.25
```
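Longhorn's documented prerequisite on RHEL-family nodes is the iSCSI initiator; install it on every node that will host replicas, then confirm the components come up:

```bash
# Longhorn requires the iSCSI initiator on each storage-hosting node
sudo dnf install -y iscsi-initiator-utils
sudo systemctl enable --now iscsid

# Verify the Longhorn components reach Running
kubectl -n longhorn-system get pods
```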

### 2. Create a Dedicated StorageClass for Kuboard

```yaml
# kuboard-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kuboard-storage
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Retain
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"   # 48 hours
  dataLocality: "best-effort"
```

## V. Highly Available Kuboard Deployment

### 1. Create the Kuboard Namespace and PVC

```bash
kubectl create namespace kuboard-system
```

```yaml
# kuboard-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kuboard-data
  namespace: kuboard-system
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: kuboard-storage
  resources:
    requests:
      storage: 20Gi
```
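Longhorn serves ReadWriteMany volumes through its share-manager (NFS) component, so the RWX access mode above is valid. Apply both manifests and confirm the claim binds before deploying Kuboard:

```bash
kubectl apply -f kuboard-storageclass.yaml
kubectl apply -f kuboard-pvc.yaml

# The PVC should reach the Bound state
kubectl -n kuboard-system get pvc kuboard-data
```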

### 2. Deploy Highly Available Kuboard

```bash
# Download the upstream manifest
curl -LO https://addons.kuboard.cn/kuboard/kuboard-v3.yaml

# Scale Kuboard to three replicas
sed -i 's/replicas: 1/replicas: 3/' kuboard-v3.yaml

# Mount the shared PVC: edit the Deployment's pod template so the pod spec
# gains the volume and the kuboard container mounts it at /data. Injecting
# nested YAML with sed is fragile, so make this edit manually (or with yq):
#
#   volumes:
#     - name: data
#       persistentVolumeClaim:
#         claimName: kuboard-data
#   ...
#   volumeMounts:
#     - name: data
#       mountPath: /data

# Apply the manifest
kubectl apply -f kuboard-v3.yaml
```
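Watch the rollout and make sure all three replicas become Ready (deployment name `kuboard-v3` and namespace as used throughout this plan):

```bash
kubectl -n kuboard-system rollout status deployment kuboard-v3
kubectl -n kuboard-system get pods -l app=kuboard -o wide
```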

### 3. Expose the Service

```yaml
# kuboard-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kuboard-v3
  namespace: kuboard-system
spec:
  selector:
    app: kuboard
  ports:
    - name: http
      port: 80
      targetPort: 80
      nodePort: 30080
    - name: https
      port: 443
      targetPort: 443
      nodePort: 30443
  type: NodePort
```

### 4. Create a Long-Lived Access Token

```bash
kubectl -n kuboard-system create serviceaccount kuboard-admin

kubectl create clusterrolebinding kuboard-admin-binding \
  --clusterrole=cluster-admin \
  --serviceaccount=kuboard-system:kuboard-admin

# Create a token valid for one year (the API server's configured
# maximum token expiration may cap this)
kubectl -n kuboard-system create token kuboard-admin --duration=8760h > kuboard-token.txt
```

## VI. Backup Solution

### 1. Install MinIO for Backup Storage

```bash
# Install MinIO on the storage node
# (MinIO is not in the standard CentOS repos; this assumes a repo or a local
# RPM obtained from https://min.io/download has been set up)
sudo dnf install -y minio

# Create the data directory
sudo mkdir -p /data/backups
sudo chown minio-user:minio-user /data/backups

# Configure the MinIO service
sudo tee /etc/default/minio <<EOF
MINIO_VOLUMES="/data/backups"
MINIO_OPTS="--address :9000 --console-address :9001"
MINIO_ROOT_USER=admin
MINIO_ROOT_PASSWORD=StrongPassword123!
EOF

# Start MinIO
sudo systemctl enable --now minio
```
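Velero will fail to validate its backup location if the target bucket does not exist. A minimal sketch using MinIO's `mc` client (client download path per MinIO's published releases):

```bash
# Install the MinIO client
curl -LO https://dl.min.io/client/mc/release/linux-amd64/mc
sudo install -m 755 mc /usr/local/bin/mc

# Point mc at the MinIO server and create the backup bucket
mc alias set backup http://192.168.1.50:9000 admin 'StrongPassword123!'
mc mb backup/kuboard-backups
```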

### 2. Install Velero

```bash
# Download the Velero CLI
wget https://github.com/vmware-tanzu/velero/releases/download/v1.11.1/velero-v1.11.1-linux-amd64.tar.gz
tar -zxvf velero-v1.11.1-linux-amd64.tar.gz
sudo mv velero-v1.11.1-linux-amd64/velero /usr/local/bin/

# Create the backup credentials file
cat <<EOF > credentials-velero
[default]
aws_access_key_id = admin
aws_secret_access_key = StrongPassword123!
EOF

# Install Velero into the cluster
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.7.0 \
  --bucket kuboard-backups \
  --secret-file ./credentials-velero \
  --use-volume-snapshots=true \
  --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://192.168.1.50:9000 \
  --snapshot-location-config region=minio
```
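Before creating schedules, confirm the server components are healthy and the backup location is reachable:

```bash
# The Velero deployment should be Running
kubectl -n velero get pods

# The backup storage location should report Available
velero backup-location get
```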

### 3. Configure Scheduled Backups

```bash
# Daily full backup, kept for 72 hours
velero schedule create kuboard-daily \
  --schedule="0 3 * * *" \
  --include-namespaces kuboard-system \
  --ttl 72h

# Weekly snapshot backup, kept for 30 days
velero schedule create kuboard-weekly \
  --schedule="0 4 * * 0" \
  --include-namespaces kuboard-system \
  --ttl 720h \
  --snapshot-volumes
```

### 4. Backup Verification Script

```bash
#!/bin/bash
# check-backup.sh - check the status of the most recent daily backup

LATEST_BACKUP=$(velero backup get | grep kuboard-daily | sort -r | head -n1 | awk '{print $1}')
BACKUP_STATUS=$(velero backup describe "$LATEST_BACKUP" --details | grep Phase | awk '{print $2}')

if [ "$BACKUP_STATUS" != "Completed" ]; then
    echo "Backup $LATEST_BACKUP failed! Status: $BACKUP_STATUS"
    exit 1
else
    echo "Backup $LATEST_BACKUP completed successfully"
fi

# Add to cron for a daily check:
# 0 4 * * * /path/to/check-backup.sh | mail -s "Kuboard Backup Report" admin@example.com
```

## VII. Access and Monitoring

### 1. Accessing Kuboard
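With the Keepalived VIP fronting HAProxy, Kuboard is reachable directly on the VIP; HAProxy forwards ports 80/443 to the NodePorts 30080/30443 defined above. Log in with the token generated in section V:

```bash
# Kuboard UI via the VIP (HAProxy forwards 80/443 -> NodePorts 30080/30443)
echo "http://192.168.1.100/"
echo "https://192.168.1.100/"

# Token for the login screen
cat kuboard-token.txt
```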

### 2. Monitoring Configuration

This ServiceMonitor assumes the Prometheus Operator CRDs are already installed in the cluster:

```yaml
# kuboard-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kuboard-monitor
  namespace: kuboard-system
spec:
  selector:
    matchLabels:
      app: kuboard
  endpoints:
    - port: http
      interval: 30s
      path: /metrics
```

### 3. Alerting Rules

```yaml
# kuboard-alerts.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kuboard-alerts
  namespace: kuboard-system
spec:
  groups:
    - name: kuboard-rules
      rules:
        - alert: KuboardDown
          expr: up{job="kuboard"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: Kuboard pod down in {{ $labels.namespace }}
        - alert: KuboardHighLatency
          expr: histogram_quantile(0.95, sum(rate(kuboard_request_duration_seconds_bucket[5m])) by (le)) > 3
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Kuboard high request latency
```

## VIII. Operations and Maintenance

### 1. Routine Maintenance Commands

| Task | Command |
|---|---|
| Check Kuboard status | `kubectl -n kuboard-system get pods -l app=kuboard` |
| Check backup status | `velero backup get` |
| Check storage usage | `kubectl -n longhorn-system get volumes` |
| Restart Kuboard | `kubectl -n kuboard-system rollout restart deployment kuboard-v3` |

### 2. Disaster Recovery

1. Restore cluster state:

```bash
velero restore create --from-backup kuboard-daily-latest
```

2. Restore storage volumes:

```bash
# List available snapshot locations
velero snapshot-location get

# Restore only the volume objects from a backup
velero restore create --from-backup kuboard-daily-latest \
  --restore-volumes \
  --include-resources persistentvolumeclaims,persistentvolumes
```
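In either case, confirm the restore completed cleanly before handing the system back to users:

```bash
# Phase should be Completed; warnings and errors are itemized in describe
velero restore get
velero restore describe <restore-name> --details
```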

### 3. Upgrade Strategy

*(upgrade-flow diagram omitted)*

## IX. Security Hardening

### 1. RBAC Access Control

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: kuboard-system
  name: kuboard-viewer
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]   # deployments live in the apps API group, not core
    resources: ["deployments"]
    verbs: ["get", "list", "watch"]
```

2. 网络策略

yaml

apiVersion: projectcalico.org/v3

kind: NetworkPolicy

metadata:

name: kuboard-access

namespace: kuboard-system

spec:

selector: app == 'kuboard'

ingress:

  • action: Allow

protocol: TCP

source:

namespaceSelector: name == 'ingress-nginx'

destination:

ports: [80, 443]

egress:

  • action: Allow

protocol: TCP

destination:

ports: [80, 443]

  • action: Allow

protocol: UDP

destination:

ports: [53]
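Note that `projectcalico.org/v3` resources are not served by the core Kubernetes API; apply the policy with `calicoctl` (or install the Calico API server so plain `kubectl apply` works). Assuming the manifest above is saved as kuboard-networkpolicy.yaml:

```bash
# Apply the Calico NetworkPolicy with calicoctl
calicoctl apply -f kuboard-networkpolicy.yaml
```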

### 3. Certificate Management

```bash
# Generate a self-signed TLS certificate for Kuboard
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout kuboard.key -out kuboard.crt \
  -subj "/CN=kuboard.example.com" \
  -addext "subjectAltName=DNS:kuboard.example.com,IP:192.168.1.100"

# Create the Kubernetes TLS Secret
kubectl -n kuboard-system create secret tls kuboard-tls \
  --key kuboard.key \
  --cert kuboard.crt
```

## X. Performance Tuning

### 1. Kuboard Resource Limits

```yaml
# kuboard-resources.yaml (patch for the kuboard-v3 Deployment)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kuboard-v3
  namespace: kuboard-system
spec:
  template:
    spec:
      containers:
        - name: kuboard
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1"
```

### 2. Database Tuning

```sql
-- Run in Kuboard's PostgreSQL instance (if Kuboard is backed by PostgreSQL)
ALTER SYSTEM SET shared_buffers = '1GB';
ALTER SYSTEM SET work_mem = '32MB';
ALTER SYSTEM SET maintenance_work_mem = '256MB';
ALTER SYSTEM SET effective_cache_size = '3GB';
```
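`ALTER SYSTEM` only writes the values to `postgresql.auto.conf`; they take effect after a configuration reload, and `shared_buffers` in particular requires a full server restart. A minimal sketch (the systemd unit name may differ per installation):

```bash
# Reload the configuration for reload-safe parameters
psql -U postgres -c "SELECT pg_reload_conf();"

# shared_buffers only changes after a restart (unit name may vary)
sudo systemctl restart postgresql
```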

### 3. Cache Configuration

```yaml
# kuboard-cache.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kuboard-v3
  namespace: kuboard-system
spec:
  template:
    spec:
      containers:
        - name: kuboard
          env:
            - name: CACHE_TYPE
              value: "redis"
            - name: REDIS_URL
              value: "redis://redis.kuboard-system:6379/0"
```

## Summary

This plan provides a complete solution for deploying highly available Kuboard on CentOS Stream 8. Its key features:

1. **High-availability architecture**
   - Multiple Kuboard replicas (3 instances)
   - Load balancing (HAProxy + Keepalived VIP)
   - Distributed storage (Longhorn)
2. **Persistent storage**
   - Longhorn distributed block storage
   - A dedicated StorageClass for Kuboard
   - Multi-replica data protection
3. **Backups**
   - Scheduled Velero backups
   - MinIO backup storage
   - Backup status monitoring
4. **Security hardening**
   - RBAC access control
   - Network policy isolation
   - TLS-encrypted traffic
5. **Monitoring and alerting**
   - Prometheus integration
   - Alerts on key metrics
   - Performance monitoring

This architecture can support a medium-sized production environment. Run a full-path load test quarterly and a backup-restore drill monthly to verify high availability and data safety.
