基于ceph rbd生成pv
在集群中认证ceph
用下面代码生成ceph的secret
.创建 ceph 的 secret,在 k8s 的控制节点操作:
回到 ceph 管理节点创建 pool 池:
[root@master1-admin ~]# ceph osd pool create k8stest 56
pool 'k8stest' created
[root@master1-admin ~]# rbd create rbda -s 1024 -p k8stest
[root@master1-admin ~]# rbd feature disable k8stest/rbda object-map fast-diff deepflatten
创建pv(RBD共享挂载)
[root@xianchaomaster1 ~]# cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: ceph-pv
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteOnce
rbd:
monitors:
- '192.168.40.201:6789'
- '192.168.40.200:6789'
- '192.168.40.202:6789'
pool: k8stest
image: rbda
user: admin
secretRef:
name: ceph-secret
fsType: xfs
readOnly: false
persistentVolumeReclaimPolicy: Recycle
ceph rbd 块存储的特点,ceph rbd 块存储能在同一个 node 上跨 pod 以 ReadWriteOnce 共享挂载,ceph rbd 块存储能在同一个 node 上同一个 pod 多个容器中以 ReadWriteOnce 共享挂载,ceph rbd 块存储不能跨 node 以 ReadWriteOnce 共享挂载。如果一个使用ceph rdb 的pod所在的node挂掉,这个pod虽然会被调度到其它node, 但是由于 rbd 不能跨 node 多次挂载和挂掉的 pod 不能自动解绑 pv 的问题,这个新 pod 不会正常运行。
问题:
结合 ceph rbd 共享挂载的特性和 deployment 更新的特性,我们发现原因如下:
由于 deployment 触发更新,为了保证服务的可用性,deployment 要先创建一个 pod 并运行正常之后,再去删除老 pod。而如果新创建的 pod 和老 pod 不在一个 node,就 会导致此故障。
解决办法:
1,使用能支持跨 node 和 pod 之间挂载的共享存储,例如 cephfs,GlusterFS 等
2,给 node 添加 label,只允许 deployment 所管理的 pod 调度到一个固定的 node 上。
(不建议,这个 node 挂掉的话,服务就故障了)
storageClass动态生成pv
role和clusterrole等配置:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: rbd-provisioner
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "update", "patch"]
- apiGroups: [""]
resources: ["services"]
resourceNames: ["kube-dns","coredns"]
verbs: ["list", "get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: rbd-provisioner
subjects:
- kind: ServiceAccount
name: rbd-provisioner
namespace: default
roleRef:
kind: ClusterRole
name: rbd-provisioner
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: rbd-provisioner
rules:
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get"]
- apiGroups: [""]
resources: ["endpoints"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: rbd-provisioner
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: rbd-provisioner
subjects:
- kind: ServiceAccount
name: rbd-provisioner
namespace: default
设置rbd provisioner
apiVersion: apps/v1
kind: Deployment
metadata:
name: rbd-provisioner
spec:
selector:
matchLabels:
app: rbd-provisioner
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
app: rbd-provisioner
spec:
containers:
- name: rbd-provisioner
image: quay.io/xianchao/external_storage/rbd-provisioner:v1
imagePullPolicy: IfNotPresent
env:
- name: PROVISIONER_NAME
value: ceph.com/rbd
serviceAccount: rbd-provisioner
设置secret和sa
apiVersion: v1
kind: ServiceAccount
metadata:
name: rbd-provisioner
apiVersion: v1
kind: Secret
metadata:
name: ceph-secret-1
type: "ceph.com/rbd"
data:
key: QVFBWk0zeGdZdDlhQXhBQVZsS0poYzlQUlBianBGSWJVbDNBenc9PQ==
storageclass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: k8s-rbd
provisioner: ceph.com/rbd
parameters:
monitors: 192.168.40.201:6789
adminId: admin
adminSecretName: ceph-secret-1
pool: k8stest1
userId: admin
userSecretName: ceph-secret-1
fsType: xfs
imageFormat: "2"
imageFeatures: "layering
pvc
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: rbd-pvc
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 1Gi
storageClassName: k8s-rbd
发现PVC绑定错误,绑定的为k8stest1,而创建的pool资源为k8stest。
创建新的pool资源:ceph osd pool create k8stest1 56
测试pod
[root@xianchaomaster1 ~]# cat pod-sto.yaml
apiVersion: v1
kind: Pod
metadata:
labels:
test: rbd-pod
name: ceph-rbd-pod
spec:
containers:
- name: ceph-rbd-nginx
image: nginx
imagePullPolicy: IfNotPresent
volumeMounts:
- name: ceph-rbd
mountPath: /mnt
readOnly: false
volumes:
- name: ceph-rbd
persistentVolumeClaim:
claimName: rbd-pvc
K8s挂载cephFS
k8s挂载ceph后,就可以将ceph作为一个存储路径使用。
仍使用之前创建的secret文件。
先在ceph集群的节点上创建一个secret file
[root@master1-admin ~]# cat /etc/ceph/ceph.client.admin.keyring |grep key|awk -
F" " '{print $3}' > /etc/ceph/admin.secret
然后在ceph集群中创建一个目录,并把ceph的根目录挂载到这个目录。xianchao作为一个文件系统,是我们之前已经创建好了的(见CSDN)。这里再重复一遍:
[root@master1-admin ceph]# ceph osd pool create cephfs_data 26
pool 'cephfs_data' created
[root@master1-admin ceph]# ceph osd pool create cephfs_metadata 26
pool 'cephfs_metadata' created
[root@master1-admin ceph]# ceph fs new xianchao cephfs_metadata cephfs_data
[root@master1-admin ~]# mkdir xianchao_data
[root@master1-admin ~]# mount -t ceph 192.168.40.201:6789:/ /root/xianchao_data
-o name=admin,secretfile=/etc/ceph/admin.secret
[root@master1-admin ~]# df -h
192.168.40.201:6789:/ 165G 106M 165G 1% /root/xianchao_data
在 cephfs 的根目录里面创建了一个子目录 lucky,k8s 以后就可以挂载这个目录
[root@master1-admin ~]# cd /root/xianchao_data/
[root@master1-admin xianchao_data]# mkdir lucky
[root@master1-admin xianchao_data]# chmod 0777 lucky/
mount -t ceph 表明挂载的是一个cephFS系统,而网址是monitor的地址,跟着他的根目录。将这个根目录挂载到xianchao_data 文件夹下。
使用 admin
用户来挂载 CephFS。
使用secret 来验证这个挂载。
这里只有xianchao一个cephFS,如果有多个,可以指定cephFS的名称:
mount -t ceph 192.168.40.201:6789:/xianchao /root/xianchao_data -o name=admin,secretfile=/etc/ceph/admin.secret
CephFS 是一个分布式文件系统,允许将其挂载到不同节点上。为了更方便地管理和组织数据,通常会在 CephFS 挂载的根目录下创建子目录。例如,在这个的场景中,lucky
目录是为了以后 Kubernetes 的持久卷使用。
创建pv
[root@xianchaomaster1 ceph]# cat cephfs-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: cephfs-pv
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteMany
cephfs:
monitors:
- 192.168.40.201:6789
path: /lucky
user: admin
readOnly: false
secretRef:
name: cephfs-secret
persistentVolumeReclaimPolicy: Recycle
其余的pvc创建和之前类似,通过创建在不同节点上的pod,可以看到不同node上的pod可以共享一个pv。
pod文件
[root@xianchaomaster1 ceph]# cat cephfs-pod-1.yaml
apiVersion: v1
kind: Pod
metadata:
name: cephfs-pod-1
spec:
containers:
- image: nginx
name: nginx
imagePullPolicy: IfNotPresent
volumeMounts:
- name: test-v1
mountPath: /mnt
volumes:
- name: test-v1
persistentVolumeClaim:
claimName: cephfs-pvc
[root@xianchaomaster1 ceph]# kubectl apply -f cephfs-pod-1.yaml
创建第二个 pod,挂载 cephfs-pvc
[root@xianchaomaster1 ceph]# cat cephfs-pod-2.yaml
apiVersion: v1
kind: Pod
metadata:
name: cephfs-pod-2
spec:
containers:
- image: nginx
name: nginx
imagePullPolicy: IfNotPresent
volumeMounts:
- name: test-v1
mountPath: /mnt
volumes:
- name: test-v1
persistentVolumeClaim:
claimName: cephfs-pvc