1. Solution Overview
1.1 Core Components
Rook is an open-source cloud-native storage orchestrator that provides platform-level storage services for Kubernetes. It simplifies the deployment and management of storage systems such as Ceph in Kubernetes (older Rook releases also supported EdgeFS, Cassandra, and others).
Ceph is a unified distributed storage system that provides object storage, block storage, and file system services, with high reliability, high scalability, and high performance.
1.2 Advantages of Rook + Ceph
- Automated management: Rook automates the deployment, configuration, scaling, upgrading, and monitoring of Ceph clusters
- Cloud-native integration: deeply integrated with Kubernetes, managed through Kubernetes primitives
- Simplified operations: less manual intervention and lower operational complexity
- Elastic scaling: supports dynamic expansion of storage capacity
- Multi-cloud support: deployable across different cloud platforms and on-premises environments
2. Architecture Design
2.1 Rook Architecture Components
```text
# Main Rook/Ceph components
- Rook Operator: deploys and manages the Ceph cluster
- Ceph Monitors (MON): maintain the cluster map and monitor cluster state
- Ceph Managers (MGR): provide monitoring and module management
- Ceph OSDs: object storage daemons that actually store the data
- Ceph MDS: metadata servers (required only for CephFS)
- Rook discover / CSI plugin pods: run on each node to discover devices and attach storage
  (the legacy Rook Agents are only needed with the deprecated flex driver)
```
2.2 Data Flow Architecture
```text
Kubernetes Pod → CSI Driver → Rook Operator → Ceph Cluster
      ↓               ↓              ↓              ↓
StorageClass  PersistentVolume   Ceph Pool      OSD Nodes
```
3. Detailed Deployment Steps
3.1 Environment Preparation
3.1.1 System Requirements
```bash
# Check the kernel version (4.x+ recommended)
uname -r
# Check the Kubernetes version (1.13+)
kubectl version
# Make sure the nodes have raw disks or partitions available
lsblk
# Install lvm2 (required when using storage devices)
sudo apt-get install -y lvm2   # Ubuntu/Debian
sudo yum install -y lvm2       # CentOS/RHEL
```
3.1.2 Enable Kernel Modules
```bash
# Load the required kernel module
sudo modprobe rbd
# Load it automatically on boot
echo "rbd" | sudo tee /etc/modules-load.d/rook-ceph.conf
# Verify the module is loaded
lsmod | grep rbd
```
3.2 Deploying the Rook Operator
3.2.1 Clone the Rook Repository
```bash
# Clone a release tag rather than master, for a reproducible deployment
git clone --single-branch --branch v1.5.9 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
```
3.2.2 Create the Rook Namespace and CRDs
```yaml
# common.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-system
  namespace: rook-ceph
---
# Rook CRD definitions follow...
```
Deploy:
```bash
kubectl create -f common.yaml
```
3.2.3 Deploy the Rook Operator
```yaml
# operator.yaml, key settings shown
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-operator
  namespace: rook-ceph
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rook-ceph-operator
  template:
    metadata:
      labels:
        app: rook-ceph-operator
    spec:
      serviceAccountName: rook-ceph-system
      containers:
        - name: rook-ceph-operator
          image: rook/ceph:v1.5.9
          args: ["ceph", "operator"]
          env:
            - name: ROOK_CURRENT_NAMESPACE_ONLY
              value: "false"
            - name: ROOK_ENABLE_DISCOVERY_DAEMON
              value: "true"
            - name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
              value: "true"
            - name: ROOK_ENABLE_FLEX_DRIVER
              value: "false"
            - name: ROOK_CEPH_STATUS_CHECK_INTERVAL
              value: "60s"
          securityContext:
            privileged: true
```
Deploy the Operator:
```bash
kubectl create -f operator.yaml
# Verify the Operator is running
kubectl -n rook-ceph get pods -l app=rook-ceph-operator
```
3.3 Creating the Ceph Cluster
3.3.1 Cluster Configuration Example
```yaml
# cluster.yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  # Ceph image to deploy (required; section 5.2.3 later upgrades to v15.2.4)
  cephVersion:
    image: ceph/ceph:v15.2.1
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
  dashboard:
    enabled: true
    ssl: true
  monitoring:
    enabled: true
    rulesNamespace: rook-ceph
  storage:
    useAllNodes: true
    useAllDevices: true
    config:
      databaseSizeMB: "1024"
      journalSizeMB: "1024"
  disruptionManagement:
    managePodBudgets: true
    osdMaintenanceTimeout: 30
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: rook-ceph
```
3.3.2 Advanced Storage Configuration
```yaml
# Per-node device configuration (instead of useAllNodes/useAllDevices)
storage:
  nodes:
    - name: "node1"
      devices:
        - name: "sdb"
      config:
        storeType: bluestore
        osdsPerDevice: "1"
    - name: "node2"
      devices:
        - name: "sdb"
        - name: "sdc"
      config:
        storeType: bluestore
        journalSizeMB: "5120"
# Note: Ceph recovery/backfill tuning (osd_max_backfills,
# osd_recovery_max_active, and so on) is applied through the
# rook-config-override ConfigMap (see section 7.1), not here.
```
Deploy the cluster:
```bash
kubectl create -f cluster.yaml
# Check the cluster status
kubectl -n rook-ceph get cephcluster
# Show detailed status
kubectl -n rook-ceph describe cephcluster rook-ceph
```
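The `ceph` commands used throughout the rest of this guide run inside the Rook toolbox pod, which is not created by default. A minimal deployment step, assuming the stock toolbox.yaml that ships in the same examples directory:
```bash
# Deploy the Rook toolbox (provides the ceph CLI inside the cluster)
kubectl create -f toolbox.yaml
# Wait until the tools pod is running
kubectl -n rook-ceph rollout status deploy/rook-ceph-tools
```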
3.3.3 Monitor the Deployment Progress
```bash
# List all related pods
kubectl -n rook-ceph get pods -o wide
# List the monitor pods
kubectl -n rook-ceph get pods -l app=rook-ceph-mon
# List the OSD pods
kubectl -n rook-ceph get pods -l app=rook-ceph-osd
# Check cluster health (uses the toolbox deployed above)
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
# Show the OSD tree
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree
```
3.4 Deploying Storage Classes (StorageClass)
3.4.1 Block Storage (RBD)
```yaml
# storageclass-block.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  # The expand secret is required for allowVolumeExpansion to work
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
```
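The StorageClass above references a `replicapool` pool that must already exist. A minimal CephBlockPool sketch matching the pool name used above:
```yaml
# pool.yaml -- the RBD pool referenced by the StorageClass
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
```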
3.4.2 File Storage (CephFS)
```yaml
# storageclass-filesystem.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
  pool: myfs-data0
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
allowVolumeExpansion: true
```
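Likewise, the filesystem `myfs` with its `myfs-data0` data pool must exist before this StorageClass can provision volumes. A minimal CephFilesystem sketch (the first data pool is automatically named `myfs-data0`):
```yaml
# filesystem.yaml -- creates the CephFS instance referenced above
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - replicated:
        size: 3
  metadataServer:
    activeCount: 1
    activeStandby: true
```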
3.4.3 Object Storage (Object Store)
```yaml
# object.yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    replicated:
      size: 3
  gateway:
    type: s3
    sslCertificateRef:
    port: 80
    securePort:
    instances: 1
```
Deploy the storage classes:
```bash
# Deploy block storage
kubectl create -f storageclass-block.yaml
# Deploy file storage
kubectl create -f storageclass-filesystem.yaml
# Deploy object storage
kubectl create -f object.yaml
kubectl create -f object-user.yaml
kubectl create -f object-bucket-claim.yaml
```
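The `object-user.yaml` referenced above is not shown; a minimal CephObjectStoreUser sketch that produces the `rook-ceph-object-user-my-store-my-user` secret read in section 4.3.1:
```yaml
# object-user.yaml -- S3 user for the my-store object store
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
  name: my-user
  namespace: rook-ceph
spec:
  store: my-store
  displayName: "My S3 user"
```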
4. Usage Examples
4.1 Using Block Storage (RBD)
4.1.1 Create a PersistentVolumeClaim
```yaml
# pvc-block.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```
4.1.2 Use It in a Pod
```yaml
# pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-pod
spec:
  containers:
    - name: mysql
      image: mysql:5.7
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: "password"
      volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
  volumes:
    - name: mysql-persistent-storage
      persistentVolumeClaim:
        claimName: mysql-pvc
```
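To verify provisioning end to end (a quick check, assuming the manifests above were saved under the names shown):
```bash
kubectl apply -f pvc-block.yaml
kubectl apply -f pod-with-pvc.yaml
# The PVC should move to Bound once the CSI driver provisions the RBD image
kubectl get pvc mysql-pvc
# Inspect the backing PV and its StorageClass
kubectl get pv
```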
4.2 Using File Storage (CephFS)
4.2.1 Create a StatefulSet
```yaml
# statefulset-cephfs.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-server
spec:
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteMany" ]
        storageClassName: "rook-cephfs"
        resources:
          requests:
            storage: 1Gi
```
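Each replica gets its own PVC from the template; since CephFS supports ReadWriteMany, a single shared PVC mounted by all replicas would also work. To check the generated claims:
```bash
# PVCs created from the template follow the <template>-<statefulset>-<ordinal> pattern
kubectl get pvc | grep www-web-server
```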
4.3 Using Object Storage (S3-Compatible)
4.3.1 Obtain Access Credentials
```bash
# Get the access key
kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user -o yaml | grep AccessKey | awk '{print $2}' | base64 --decode
# Get the secret key
kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user -o yaml | grep SecretKey | awk '{print $2}' | base64 --decode
# Get the endpoint
kubectl -n rook-ceph get service rook-ceph-rgw-my-store
```
4.3.2 Use It in an Application
```python
# Python example
import boto3
from botocore.client import Config

s3 = boto3.resource(
    's3',
    # In-cluster service endpoint; port 80 matches the gateway port in object.yaml
    endpoint_url='http://rook-ceph-rgw-my-store.rook-ceph:80',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    config=Config(signature_version='s3v4'),
)
# Create a bucket
s3.create_bucket(Bucket='my-bucket')
# Upload a file
s3.Bucket('my-bucket').upload_file('local_file.txt', 'remote_file.txt')
```
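For testing from outside the cluster, the RGW service can be port-forwarded and the endpoint pointed at localhost (assuming boto3 is installed):
```bash
pip install boto3
kubectl -n rook-ceph port-forward svc/rook-ceph-rgw-my-store 8080:80
# then use endpoint_url='http://localhost:8080' in the script above
```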
5. Operations and Management
5.1 Monitoring and Dashboard
5.1.1 Enable the Ceph Dashboard
```yaml
# configured in cluster.yaml
dashboard:
  enabled: true
  ssl: true
  port: 8443
```
Access the Dashboard:
```bash
# Get the admin password
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
# Port-forward, then open https://localhost:8443 (user: admin)
kubectl port-forward svc/rook-ceph-mgr-dashboard 8443:8443 -n rook-ceph
```
5.1.2 Prometheus Monitoring
```bash
# Deploy the Prometheus Operator first (if not already installed)
kubectl create -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/bundle.yaml
# Enable monitoring (ServiceMonitor and alerting rules)
kubectl create -f cluster-monitoring.yaml
```
5.2 Cluster Operations
5.2.1 Scaling the Cluster
```bash
# Add a new node: edit the CephCluster resource
kubectl -n rook-ceph edit cephcluster rook-ceph
# then add the new node's configuration under storage.nodes
```
5.2.2 OSD Management
```bash
# Check OSD status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd status
# Safely remove an OSD: mark it out, wait for data to migrate, then remove it
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd out osd.<id>
kubectl -n rook-ceph delete deployment rook-ceph-osd-<id>
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd purge <id> --yes-i-really-mean-it
# Add a new disk:
# 1. Attach the disk to the node
# 2. Update the cluster.yaml configuration
# 3. Apply the configuration
kubectl apply -f cluster.yaml
```
5.2.3 Upgrading the Cluster
```bash
# Upgrade the Rook Operator (apply the new release's operator manifest)
kubectl apply -f operator-upgrade.yaml
# Upgrade the Ceph version of the cluster
kubectl -n rook-ceph patch CephCluster rook-ceph --type merge -p '{"spec": {"cephVersion": {"image": "ceph/ceph:v15.2.4"}}}'
```
5.3 Troubleshooting
5.3.1 Common Checks
```bash
# 1. Check the status of all components
kubectl -n rook-ceph get pods
# 2. Inspect the logs of a failing pod
kubectl -n rook-ceph logs <pod-name> -c <container-name>
# 3. Check events
kubectl -n rook-ceph get events --sort-by='.lastTimestamp'
# 4. Inspect the cluster resource
kubectl -n rook-ceph get cephcluster -o yaml
# 5. Check Ceph status from inside the toolbox pod
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
ceph status
ceph osd tree
ceph df
```
5.3.2 Data Recovery
```bash
# If the cluster health is HEALTH_ERR, get the details first
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph health detail
# Repair an inconsistent PG (Placement Group)
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph pg repair <pg-id>
# Rebalance data (there is no "ceph osd rebalance" command; use the balancer module)
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph balancer on
```
6. Best Practices
6.1 Production Configuration Recommendations
6.1.1 Hardware Recommendations
- Monitors: 3-5 nodes, SSD disks, at least 2 cores and 4 GB RAM each
- OSDs: multiple disks per node; avoid using the root partition
- Managers: co-located with the Monitors, SSD disks
- Network: 10GbE or faster, with separate public and cluster networks
6.1.2 Storage Pool Configuration
```yaml
# Storage pools with different performance tiers
# 1. High-performance pool (SSD)
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ssd-pool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
  deviceClass: ssd
---
# 2. Capacity pool (HDD)
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: hdd-pool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
  deviceClass: hdd
```
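Applications select a tier through the StorageClass. A sketch pointing a class at the SSD pool, reusing the CSI secrets from section 3.4.1:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-ssd
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: ssd-pool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
```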
6.1.3 Backup and Recovery Strategy
```yaml
# Backup with Velero
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: ceph-backup
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: ceph-backup-bucket
  config:
    region: us-west-2
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: ceph-snapshot
  namespace: velero
spec:
  provider: csi.ceph.com
  config:
    clusterID: rook-ceph
    csiDriver: rook-ceph.rbd.csi.ceph.com
```
6.2 Security Configuration
6.2.1 Network Security Policies
```yaml
# NetworkPolicy example
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: rook-ceph-network-policy
  namespace: rook-ceph
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              rook.io/namespace: rook-ceph
      ports:
        - protocol: TCP
          port: 6789   # Monitor
        - protocol: TCP
          port: 6800   # OSD (in practice OSDs use the 6800-7300 range)
```
6.2.2 RBAC Configuration
```yaml
# Least-privilege configuration
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-limited
  namespace: application-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rook-ceph-pvc-creator
  namespace: application-namespace
rules:
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
```
7. Performance Optimization
7.1 Tuning Parameters
Ceph tuning options like these are not CephCluster spec fields; they are typically applied through the rook-config-override ConfigMap, whose contents Rook injects into ceph.conf:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-config-override
  namespace: rook-ceph
data:
  config: |
    [global]
    # Separate public and cluster (replication) networks
    public network = 10.0.0.0/24
    cluster network = 192.168.0.0/24
    [osd]
    # Recovery/backfill throttling vs. client I/O priority
    osd_max_backfills = 2
    osd_recovery_max_active = 3
    osd_recovery_max_single_start = 1
    osd_recovery_op_priority = 3
    osd_client_op_priority = 63
    # osd_crush_chooseleaf_type = 0 would place replicas per-OSD
    # instead of per-host; use with care
```
7.2 Monitoring Metrics
```bash
# Key performance indicators
# 1. Cluster utilization
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph df
# 2. Per-OSD commit/apply latency
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd perf
# 3. Per-pool client I/O rates (IOPS and throughput)
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd pool stats
```
8. Migration and Scaling
8.1 Migrating from Traditional Ceph
```bash
# 1. Incremental migration strategy
# 2. Copy RBD images across pools ("rbd export-pool" is not a real command;
#    export/import works per image)
rbd export old-pool/my-image - | rbd import - new-pool/my-image
# 3. Staged application cutover
#    Phase 1: new applications use Rook Ceph
#    Phase 2: migrate existing application data
#    Phase 3: complete the switchover
```
8.2 Multi-Cloud Deployment
```yaml
# Cross-cloud/region configuration
spec:
  storage:
    nodes:
      - name: "aws-node-1"
        devices: [...]
        config:
          location: "region=us-west-2,zone=a"
      - name: "azure-node-1"
        devices: [...]
        config:
          location: "region=eastus,zone=1"
  placement:
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: topology.kubernetes.io/region
                  operator: In
                  values: [us-west-2, eastus]
```
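Recent Rook releases build the CRUSH topology from the standard Kubernetes topology labels rather than a per-node `location` entry, so the nodes should carry those labels (the values here are illustrative):
```bash
kubectl label node aws-node-1 topology.kubernetes.io/region=us-west-2 topology.kubernetes.io/zone=us-west-2a
kubectl label node azure-node-1 topology.kubernetes.io/region=eastus topology.kubernetes.io/zone=eastus-1
```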
Summary
Rook combined with Ceph provides an enterprise-grade storage solution for Kubernetes, with the following characteristics:
- Highly automated: simplifies the deployment and management of Ceph clusters
- Cloud-native integration: deeply integrated with Kubernetes through native APIs
- Full-featured: supports block storage, file storage, and object storage
- Highly scalable: supports dynamic expansion of storage capacity and performance
- Production-ready: provides enterprise features such as monitoring, backup, and security
With the configuration guide in this article, you can deploy and manage a Rook Ceph storage system in a Kubernetes cluster and provide applications with reliable, high-performance persistent storage.