Meanings of common CephFS MDS alerts

CephFS health messages

Cluster health checks

The Ceph Monitor daemons will generate health messages in response to certain states of the file system map structure (and the enclosed MDS maps).

Message: mds rank(s) <ranks> have failed

Description: One or more MDS ranks are not currently assigned to an MDS daemon; the cluster will not recover until a suitable replacement daemon starts.

Message: mds rank(s) <ranks> are damaged

Description: One or more MDS ranks have encountered severe damage to their stored metadata and cannot start again until the metadata is repaired.

Message: mds cluster is degraded

Description: One or more MDS ranks are not currently up and running; clients may pause metadata IO until this situation is resolved. This includes ranks that have failed or are damaged, as well as ranks that are running on an MDS but have not yet reached the active state (e.g. ranks currently in the replay state).

Message: mds <names> are laggy

Description: The named MDS daemons have failed to send beacon messages to the Monitor for at least mds_beacon_grace (default 15s), although they are supposed to send one every mds_beacon_interval (default 4s). The daemons may have crashed. The Ceph Monitor automatically replaces laggy daemons with standbys if any are available.
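
The laggy check described above can be sketched as a simple timestamp comparison. This is only an illustration of the rule as stated in the text, not the Monitor's actual code; the function name, the dictionary of last-beacon timestamps, and the daemon names are hypothetical.

```python
def laggy_daemons(last_beacon, now, beacon_grace=15.0):
    """Return the names of daemons whose last beacon is at least
    beacon_grace seconds old (mirrors the mds_beacon_grace default of 15s)."""
    return sorted(
        name for name, seen in last_beacon.items()
        if now - seen >= beacon_grace
    )


# Hypothetical timestamps (in seconds) of the last beacon received per daemon.
last_seen = {"mds.a": 100.0, "mds.b": 82.0}

# mds.a was heard from 1s ago, mds.b 19s ago, so only mds.b is laggy.
print(laggy_daemons(last_seen, now=101.0))  # ['mds.b']
```

With a 4s beacon interval and a 15s grace period, a daemon has to miss several consecutive beacons before it is reported.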

Message: insufficient standby daemons available

Description: One or more file systems are configured to have a certain number of standby daemons available (including daemons in standby-replay), but the cluster does not have enough standby daemons. Standby daemons not in replay count towards any file system (i.e. they may overlap). The wanted count is configured with ceph fs set <fs> standby_count_wanted <count>; use zero for count to disable the warning.

Daemon-reported health checks

MDS daemons can identify a variety of unwanted conditions, and indicate these to the operator in the output of ceph status. These conditions have human-readable messages, and additionally a unique code starting with MDS_.

ceph health detail shows the details of the conditions. The following is a typical health report from a cluster experiencing MDS-related performance issues:

ceph health detail
HEALTH_WARN 1 MDSs report slow metadata IOs; 1 MDSs report slow requests
MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
   mds.fs-01(mds.0): 3 slow metadata IOs are blocked > 30 secs, oldest blocked for 51123 secs
MDS_SLOW_REQUEST 1 MDSs report slow requests
   mds.fs-01(mds.0): 5 slow requests are blocked > 30 secs

Here, for instance, MDS_SLOW_REQUEST is the unique code for the condition in which requests are taking a long time to complete. The description that follows the code shows its severity and the MDS daemons that are serving the slow requests.
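
For scripting against such a report, the layout shown above (a status line, then one line per health code followed by indented detail lines) can be parsed with a few lines of Python. This is a minimal sketch that assumes exactly the text layout of the sample; other Ceph releases may format the output differently, and the function name is hypothetical.

```python
import re

SAMPLE = """\
HEALTH_WARN 1 MDSs report slow metadata IOs; 1 MDSs report slow requests
MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
   mds.fs-01(mds.0): 3 slow metadata IOs are blocked > 30 secs, oldest blocked for 51123 secs
MDS_SLOW_REQUEST 1 MDSs report slow requests
   mds.fs-01(mds.0): 5 slow requests are blocked > 30 secs
"""


def parse_health_detail(text):
    """Split a health report into (overall status, {code: summary/details})."""
    lines = text.splitlines()
    status = lines[0].split()[0] if lines else None
    checks = {}
    current = None
    for line in lines[1:]:
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented lines are details belonging to the preceding code.
            if current is not None:
                checks[current]["details"].append(line.strip())
            continue
        match = re.match(r"([A-Z][A-Z0-9_]*)\s+(.*)", line)
        if match:
            current = match.group(1)
            checks[current] = {"summary": match.group(2), "details": []}
    return status, checks


status, checks = parse_health_detail(SAMPLE)
print(status)
for code, info in checks.items():
    print(code, "->", info["details"])
```

Keying the result by health code makes it easy to alert on specific conditions such as MDS_SLOW_REQUEST.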

This page lists the health checks raised by MDS daemons. For the checks from other daemons, please see Health Checks.

MDS_TRIM

Message

"Behind on trimming..."

Description

CephFS maintains a metadata journal that is divided into log segments. The length of the journal (in number of segments) is controlled by the setting mds_log_max_segments; when the number of segments exceeds that setting, the MDS starts writing back metadata so that it can remove (trim) the oldest segments. If this writeback is happening too slowly, or a software bug is preventing trimming, this health message may appear. The threshold for the message is controlled by the config option mds_log_warn_factor, which defaults to 2.0.
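
One way to read the threshold is that the warning fires once the segment count exceeds mds_log_max_segments multiplied by mds_log_warn_factor. The sketch below encodes that reading; the multiplication rule is an assumption drawn from the description, and the function name and the value 128 used for max_segments are purely illustrative.

```python
def behind_on_trimming(num_segments, max_segments, warn_factor=2.0):
    """Assumed rule: warn when the journal holds more than
    max_segments * warn_factor segments."""
    return num_segments > max_segments * warn_factor


# Illustrative values only: with max_segments=128 the assumed threshold is 256.
print(behind_on_trimming(200, max_segments=128))  # False
print(behind_on_trimming(300, max_segments=128))  # True
```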

MDS_HEALTH_CLIENT_LATE_RELEASE, MDS_HEALTH_CLIENT_LATE_RELEASE_MANY

Message

"Client <name> failing to respond to capability release"

Description

CephFS clients are issued capabilities by the MDS, which are like locks. Sometimes, for example when another client needs access, the MDS will request clients release their capabilities. If the client is unresponsive or buggy, it might fail to do so promptly or fail to do so at all. This message appears if a client has taken longer than session_timeout (default 60s) to comply.

MDS_CLIENT_RECALL, MDS_HEALTH_CLIENT_RECALL_MANY

Message

"Client <name> failing to respond to cache pressure"

Description

Clients maintain a metadata cache. Items (such as inodes) in the client cache are also pinned in the MDS cache, so when the MDS needs to shrink its cache (to stay within mds_cache_memory_limit), it sends messages to clients to shrink their caches too. If the client is unresponsive or buggy, this can prevent the MDS from properly staying within its cache limits and it may eventually run out of memory and crash. This message appears if a client has failed to release more than mds_recall_warning_threshold capabilities (decaying with a half-life of mds_recall_max_decay_rate) within the last mds_recall_warning_decay_rate seconds.
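
The phrase "decaying with a half-life" refers to a counter whose value halves after every half-life period, so old events gradually stop counting towards the threshold. The class below is a generic illustration of that idea only; it is not the MDS implementation, and the class name, the half-life of 60 seconds, and the way events are fed in are all hypothetical.

```python
class DecayingCounter:
    """Counter whose accumulated value halves every `half_life` seconds."""

    def __init__(self, half_life):
        self.half_life = float(half_life)
        self.value = 0.0
        self.last = 0.0  # time of the most recent update, in seconds

    def _decay(self, now):
        elapsed = now - self.last
        if elapsed > 0:
            self.value *= 0.5 ** (elapsed / self.half_life)
            self.last = now

    def hit(self, now, amount=1.0):
        """Record `amount` new events at time `now`."""
        self._decay(now)
        self.value += amount

    def get(self, now):
        """Return the decayed value at time `now`."""
        self._decay(now)
        return self.value


counter = DecayingCounter(half_life=60)
counter.hit(now=0, amount=1000)   # e.g. 1000 capabilities recalled at t=0
print(counter.get(now=60))        # 500.0 after one half-life
print(counter.get(now=120))       # 250.0 after two half-lives
```

A warning rule built on such a counter would compare the decayed value against a threshold, so a client that stops misbehaving eventually drops back below it.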

MDS_CLIENT_OLDEST_TID, MDS_CLIENT_OLDEST_TID_MANY

Message

"Client <name> failing to advance its oldest client/flush tid"

Description

The CephFS client-MDS protocol uses a field called the oldest tid to inform the MDS of which client requests are fully complete and may therefore be forgotten about by the MDS. If a buggy client is failing to advance this field, then the MDS may be prevented from properly cleaning up resources used by client requests. This message appears if a client appears to have more than max_completed_requests (default 100000) requests that are complete on the MDS side but haven't yet been accounted for in the client's oldest tid value. The last tid used by the MDS to trim completed client requests (or flush) is included as part of session ls (or client ls) command as a debug aid.

MDS_DAMAGE

Message

"Metadata damage detected"

Description

Corrupt or missing metadata was encountered when reading from the metadata pool. This message indicates that the damage was sufficiently isolated for the MDS to continue operating, although client accesses to the damaged subtree will return IO errors. Use the damage ls admin socket command to get more detail on the damage. This message appears as soon as any damage is encountered.

MDS_HEALTH_READ_ONLY

Message

"MDS in read-only mode"

Description

The MDS has gone into readonly mode and will return EROFS error codes to client operations that attempt to modify any metadata. The MDS will go into readonly mode if it encounters a write error while writing to the metadata pool, or if forced to by an administrator using the force_readonly admin socket command.

MDS_SLOW_REQUEST

Message

"<N> slow requests are blocked"

Description

One or more client requests have not been completed promptly, indicating that the MDS is either running very slowly, or that the RADOS cluster is not acknowledging journal writes promptly, or that there is a bug. Use the ops admin socket command to list outstanding metadata operations. This message appears if any client requests have taken longer than mds_op_complaint_time (default 30s).
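
When triaging this warning, it can help to filter the outstanding operations by age. The sketch below works on a JSON document shaped like a list of operations with a description and an age in seconds; the field names "ops", "description", and "age", as well as the sample entries, are assumptions made for illustration and should be checked against the real output of the ops command on your release.

```python
import json

# Hypothetical sample; real output contains more fields per operation.
SAMPLE_OPS = json.dumps({
    "ops": [
        {"description": "client_request(client.4123:17 getattr)", "age": 45.2},
        {"description": "client_request(client.4123:18 lookup)", "age": 1.3},
    ]
})


def slow_ops(ops_json, complaint_time=30.0):
    """Return descriptions of operations older than complaint_time seconds
    (mirrors the mds_op_complaint_time default of 30s)."""
    doc = json.loads(ops_json)
    return [
        op["description"]
        for op in doc.get("ops", [])
        if float(op.get("age", 0)) > complaint_time
    ]


print(slow_ops(SAMPLE_OPS))
```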

MDS_CACHE_OVERSIZED

Message

"Too many inodes in cache"

Description

The MDS is not succeeding in trimming its cache to comply with the limit set by the administrator. If the MDS cache becomes too large, the daemon may exhaust available memory and crash. By default, this message appears if the actual cache size (in memory) is at least 50% greater than mds_cache_memory_limit (default 4GB). Modify mds_health_cache_threshold to set the warning ratio.
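
The default rule above amounts to comparing the in-memory cache size against the limit multiplied by a warning ratio of 1.5. A minimal sketch, assuming that "4GB" means 4 GiB and that the ratio is what mds_health_cache_threshold sets; the function and parameter names are hypothetical.

```python
GIB = 1024 ** 3


def cache_oversized(cache_bytes, memory_limit=4 * GIB, health_threshold=1.5):
    """Warn when the cache is at least health_threshold times the limit
    (1.5 corresponds to "at least 50% greater")."""
    return cache_bytes >= memory_limit * health_threshold


print(cache_oversized(5 * GIB))  # False: below the 6 GiB warning point
print(cache_oversized(6 * GIB))  # True: 50% over the 4 GiB limit
```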

FS_WITH_FAILED_MDS

Message

"Some MDS ranks do not have standby replacements"

Description

Normally, a failed MDS rank will be replaced by a standby MDS. This situation is transient and is not considered critical. However, if there are no standby MDSs available to replace an active MDS rank, this health warning is generated.

MDS_INSUFFICIENT_STANDBY

Message

"Insufficient number of available standby(-replay) MDS daemons than configured"

Description

The minimum number of standby(-replay) MDS daemons can be configured by setting the standby_count_wanted configuration variable. This health warning is generated when fewer standby(-replay) MDS daemons are available than the configured value.

FS_DEGRADED

Message

"Some MDS ranks have been marked failed or damaged"

Description

One or more MDS ranks have ended up in the failed or damaged state due to an unrecoverable error. The file system may be partially or fully unavailable while one or more ranks are offline.

MDS_UP_LESS_THAN_MAX

Message

"Number of active ranks are less than configured number of maximum MDSs"

Description

The maximum number of MDS ranks can be configured by setting the max_mds configuration variable. This health warning is generated when the number of active MDS ranks falls below this configured value.

MDS_ALL_DOWN

Message

"None of the MDS ranks are available (file system offline)"

Description

All MDS ranks are unavailable, leaving the file system completely offline.

MDS_CLIENTS_LAGGY

Message

"Client <ID> is laggy; not evicted because some OSD(s) is/are laggy"

Description

If one or more OSDs are laggy (for example because of a network cut-off), clients may become laggy as well: a session might go idle, or the client might be unable to flush dirty data for cap revokes. If defer_client_eviction_on_laggy_osds is set to true (default true), such clients are not evicted and this health warning is generated instead.

MDS_CLIENTS_BROKEN_ROOTSQUASH

Message

"X client(s) with broken root_squash implementation (MDS_CLIENTS_BROKEN_ROOTSQUASH)"

Description

A bug was discovered in root_squash that could lose changes made by a client restricted with root_squash caps. The fix required a change to the protocol, so clients must be upgraded.

This is a HEALTH_ERR warning because of the danger of inconsistency and lost data. It is recommended to either upgrade your clients, discontinue using root_squash in the interim, or silence the warning if desired.

To evict and permanently block broken clients from connecting to the cluster, set the required_client_feature bit client_mds_auth_caps.

MDS_ESTIMATED_REPLAY_TIME

Message

HEALTH_WARN Replay: x% complete. Estimated time remaining x seconds

Description

When an MDS journal replay takes more than 30 seconds, this message indicates the estimated time to completion.
