bash
health: HEALTH_ERR
mons are allowing insecure global_id reclaim
1 filesystem is offline
1 filesystem is online with fewer MDS than max_mds
mon ceph-02 is low on available space
1/3 mons down, quorum ceph-03,ceph-02
1 daemons have recently crashed
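The warnings above are handled one by one in the sections that follow. To see which health check produces which message, start from the detailed output:
bash
# Cluster summary plus the per-check details behind every warning
ceph -s
ceph health detail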
bash
mons ceph-02,ceph-03 are low on available space
The OS root partition is running out of space.
Solution:
Extend the root partition; see the section on extending an LVM-based root partition.
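A minimal sketch of that LVM expansion, assuming the root LV is /dev/centos/root on xfs and /dev/sdb is a spare disk (all device, VG and LV names here are examples; adjust them to your environment):
bash
# Turn the spare disk into a physical volume and add it to the root VG (example names)
pvcreate /dev/sdb
vgextend centos /dev/sdb
# Give all of the new free space to the root logical volume
lvextend -l +100%FREE /dev/centos/root
# Grow the filesystem: xfs_growfs for xfs, resize2fs for ext4
xfs_growfs /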
Error
bash
1/3 mons down, quorum ceph-03,ceph-02
It is clear from this that the mon process on ceph-01 is down, but confirm it with the commands below anyway:
bash
ceph crash ls
# ceph crash archive <id>
# or
# ceph crash archive-all
bash
[root@ceph-02 ~]# ceph crash ls
ID ENTITY NEW
2023-04-01T22:37:01.370914Z_923d40d5-085e-4d4c-b164-96c8c514c093 mon.ceph-01 *
[root@ceph-02 ~]# ceph crash archive 2023-04-01T22:37:01.370914Z_923d40d5-085e-4d4c-b164-96c8c514c093
Solution
bash
systemctl restart ceph-mon@ceph-01.service
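Then check that the daemon is back and that all three mons are in quorum again:
bash
# Verify the mon rejoined and the cluster sees a full quorum
systemctl status ceph-mon@ceph-01.service
ceph quorum_status --format json-pretty | grep quorum_names
ceph -s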
Error
bash
[root@ceph-01 ~]# ceph health detail
HEALTH_ERR mons are allowing insecure global_id reclaim; 1 filesystem is offline; 1 filesystem is online with fewer MDS than max_mds
[WRN] AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure global_id reclaim
mon.ceph-03 has auth_allow_insecure_global_id_reclaim set to true
mon.ceph-01 has auth_allow_insecure_global_id_reclaim set to true
mon.ceph-02 has auth_allow_insecure_global_id_reclaim set to true
[ERR] MDS_ALL_DOWN: 1 filesystem is offline
fs cephfs is offline because no MDS is active for it.
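A sketch of how these two checks are usually cleared, assuming all clients have already been upgraded to a release that handles global_id reclaim securely and that the MDS only needs to be brought back up (the daemon name is an example):
bash
# Once every client is upgraded, stop allowing insecure global_id reclaim
ceph config set mon auth_allow_insecure_global_id_reclaim false
# Bring the filesystem back by restarting the MDS daemon, then re-check
systemctl restart ceph-mds@ceph-01.service
ceph fs status cephfs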
1 MDSs report oversized cache
While stress-testing a Ceph directory with vdbench, which creates a massive number of small files, the cluster raised the warning 1 MDSs report oversized cache once the file count passed one billion.
As the message suggests, this is caused by the MDS cache running out of memory.
Solution
The current MDS cache limit mds_cache_memory_limit is 4 GiB and the warning threshold mds_health_cache_threshold is 1.5, i.e. the warning fires once the cache grows past 1.5 times the limit (1.5 × 4 GiB = 6 GiB).
Pick one mds daemon (pod) to inspect; myfs-a is the mds name (it can be found with ceph fs status):
bash
ceph fs status
myfs - 18 clients
====
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active myfs-a Reqs: 1461 /s 3957k 2931k 744k 534
1 active myfs-b Reqs: 0 /s 3830k 3830k 6836 212
2 active myfs-c Reqs: 1 /s 3832k 3832k 1061 421
1-s standby-replay myfs-e Evts: 0 /s 2984k 2984k 961 0
2-s standby-replay myfs-f Evts: 0 /s 2917k 2917k 872 0
0-s standby-replay myfs-d Evts: 1494 /s 56.6k 950 1392 0
For a Ceph cluster deployed with rook-ceph, these commands have to be run inside the mds pods:
bash
kubectl get pod |grep mds
rook-ceph-mds-myfs-a-5cd89f74cb-kjq2r 2/2 Running 0 5d20h
rook-ceph-mds-myfs-b-5788f77df7-v7dfx 2/2 Running 0 5d20h
rook-ceph-mds-myfs-c-5c9ff6c85d-zb4vc 2/2 Running 0 5d20h
rook-ceph-mds-myfs-d-5db9b4cc75-slwlr 2/2 Running 1 (13m ago) 14m
rook-ceph-mds-myfs-e-777fbb5f96-drz7z 2/2 Running 1 (4d22h ago) 5d20h
rook-ceph-mds-myfs-f-658ccfc65-vh5z4 2/2 Running 0 5d20h
Pick the rook-ceph-mds-myfs-a-5cd89f74cb-kjq2r pod and inspect it.
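A minimal sketch of getting a shell in that pod (add -n rook-ceph if your current context is not already the Rook namespace):
bash
# Exec into the mds pod; the ceph daemon commands below are run from this shell
kubectl exec -it rook-ceph-mds-myfs-a-5cd89f74cb-kjq2r -- bash
Inside the pod, check the current cache settings: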
bash
ceph daemon mds.myfs-a config show|grep mds_cache
"mds_cache_memory_limit": "4294967296",
"mds_cache_mid": "0.700000",
"mds_cache_release_free_interval": "10",
"mds_cache_reservation": "0.050000",
"mds_cache_trim_decay_rate": "1.000000",
"mds_cache_trim_interval": "1",
"mds_cache_trim_threshold": "262144",
bash
ceph daemon mds.myfs-a config show|grep mds_health
"mds_health_cache_threshold": "1.500000",
"mds_health_summarize_threshold": "10",
Change mds_cache_memory_limit to 10 GiB.
Method 1:
Note: every mds daemon has to be changed. myfs-a is used as the example here; substitute the appropriate mds name for each daemon.
bash
ceph daemon mds.myfs-a config set mds_cache_memory_limit 10737418240
{
"success": "mds_cache_memory_limit = '10737418240' "
}
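Note that ceph daemon ... config set only changes the value in the running daemon and does not survive a restart. On releases with the centralized config database, the same change can be persisted for every mds at once; a sketch:
bash
# Persist the 10 GiB limit for all mds daemons in the mon config database
ceph config set mds mds_cache_memory_limit 10737418240
# Confirm the stored value
ceph config get mds mds_cache_memory_limit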
Method 2:
bash
vim /etc/ceph/ceph.conf
[global]
mds cache memory limit = 10737418240
Restart the affected mds pods and the cluster status goes back to HEALTH_OK.
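A minimal sketch of that restart for the rook-ceph case, using the pod name from the listing above (deleting the pod lets its Deployment recreate it):
bash
# Recreate the affected mds pod
kubectl delete pod rook-ceph-mds-myfs-a-5cd89f74cb-kjq2r
# From inside the new pod, confirm the limit, then check overall health
ceph daemon mds.myfs-a config get mds_cache_memory_limit
ceph -s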
Additional note: the live mds performance counters can be watched with ceph daemonperf:
bash
ceph daemonperf mds.node1
---------------mds---------------- --mds_cache--- ------mds_log------ -mds_mem- ----mds_server----- mds_ -----objecter------ purg
req rlat fwd inos caps exi imi |stry recy recd|subm evts segs repl|ino dn |hcr hcs hsr cre |sess|actv rd wr rdwr|purg|
0 0 0 2.4M 2.4M 0 0 | 0 0 0 |657 322k 438 0 |2.4M 2.4M| 0 0 0 0 | 1 | 0 0 0 0 | 0
5.3k 2 0 2.4M 2.4M 0 0 | 0 0 0 |7.5k 329k 448 0 |2.4M 2.4M|5.3k 0 0 0 | 1 | 0 10 93 0 | 0
7.1k 1 0 2.4M 2.4M 0 0 | 0 0 0 |7.2k 336k 461 0 |2.4M 2.4M|7.1k 0 0 0 | 1 | 0 14 173 0 | 0
4.4k 1 0 2.4M 2.4M 0 0 | 0 0 0 |4.4k 341k 467 0 |2.4M 2.4M|4.4k 0 0 0 | 1 | 0 8 48 0 | 0
1.1k 1 0 2.4M 2.4M 0 0 | 0 0 0 |1.1k 342k 470 0 |2.4M 2.4M|1.1k 0 0 0 | 1 | 0 2 24 0 | 0
cephfs mount Permission denied
Cause: https://www.spinics.net/lists/ceph-users/msg66881.html
This renders the mount useless until it is completely remounted. Once the mount point is no longer usable, you can verify this by printing the OSD blocklist with the ceph osd blocklist ls command.
The purpose of this behavior is to deal with malicious or failing clients. If your client currently holds capabilities on important directories and the machine hits a hardware error (and will not come back soon), access to those directories would be blocked: other clients could not reach them until the broken machine recovered. A network outage is another example.
You can configure the mds session timeout that triggers the blocklisting, but keep in mind that simply using a longer timeout can cause other problems when a real failure occurs.
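A sketch of inspecting and, once the client is known to be healthy again, clearing such an entry, and of where the session timeout lives (the client address is an example; on older releases the command is ceph osd blacklist):
bash
# List the clients the cluster has blocklisted
ceph osd blocklist ls
# Remove one entry, using the address exactly as printed by blocklist ls (example value)
ceph osd blocklist rm 10.97.4.200:0/123456789
# The eviction is driven by the filesystem's session_timeout (in seconds)
ceph fs get myfs | grep session_timeout
ceph fs set myfs session_timeout 60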
Symptom: a pod cannot mount cephfs and therefore fails to start. Log in to the node the pod is scheduled on and check the kernel log:
bash
dmesg
Output:
bash
[14339331.775013] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339331.793133] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339331.812353] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339331.830346] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339346.864268] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339346.882236] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339346.901333] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339346.919118] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339361.953755] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339361.971964] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339361.991242] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339362.009235] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339377.043628] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339377.061753] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339377.081095] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339377.098940] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339392.133366] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339392.151389] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339392.171953] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
[14339392.194282] ceph: get_quota_realm: ino (10000003584.fffffffffffffffe) null i_snap_realm
Filter the mount entries on the node by the pod UID:
bash
mount|grep e42e2d74-d77d-4677-bdf5-447e9f921200
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volumes/kubernetes.io~csi/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/mount type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs,_netdev)
tmpfs on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volumes/kubernetes.io~empty-dir/shm type tmpfs (rw,relatime,size=247857152k,inode64)
tmpfs on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volumes/kubernetes.io~projected/kube-api-access-ssft6 type tmpfs (rw,relatime,size=247857152k,inode64)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/init/1 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/init/2 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/main/2 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/main/3 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/main/4 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/main/5 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/main/6 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/main/7 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
10.97.4.161:6789,10.97.24.187:6789,10.97.27.96:6789,10.97.171.67:6789,10.97.128.140:6789:/volumes/csi/csi-vol-af15c093-ab93-4c0b-bc21-5ea417ec20c9/f39de2b8-93af-4ca4-80aa-2e0e535db82a on /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/sidecar/1 type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=myfs)
The pod mount point cannot be entered:
bash
cd /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/init/1
-bash: cd: /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/init/1: Permission denied
Go to the directory one level above the mount point and check the permissions:
bash
cd /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/init
[wlcb] root@a6000-6:/var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/init# ll
ls: cannot access '2': Permission denied
ls: cannot access '1': Permission denied
total 8
drwxr-x--- 4 root root 4096 Jun 3 20:15 ./
drwxr-x--- 5 root root 4096 Jun 3 20:17 ../
d????????? ? ? ? ? ? 1/
d????????? ? ? ? ? ? 2/
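A stale mount like this cannot be repaired in place; it has to be unmounted and remounted, which for a kubelet-managed CephFS volume usually means cleaning up the stale mounts on the node and recreating the pod. A rough sketch under those assumptions (paths are the ones from the listing above; the pod name is a placeholder):
bash
# Check whether this node's cephfs client was blocklisted by the cluster
ceph osd blocklist ls
# Lazily unmount the stale subpath mounts so kubelet/CSI can mount again
umount -l /var/lib/kubelet/pods/e42e2d74-d77d-4677-bdf5-447e9f921200/volume-subpaths/pvc-73fa05ff-6c2b-45b5-b49f-6489c67d0df0/init/1
# ...repeat for the other stale mounts listed above...
# Recreate the pod so the CSI driver performs a fresh mount
kubectl delete pod <pod-name>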
References
Issues discussing mount failures caused by network interruption:
When the network fluctuates, the mount will report an error #970
cephfs: detect corrupted FUSE mounts and try to restore them #2634