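MongoDB Monitoring in Depth: A Practical Guide to Zabbix/Prometheus Integration

I. Why MongoDB Monitoring Matters and Core Metrics

1.1 Why Dedicated Monitoring Is Needed

MongoDB is the core data store for many modern applications, so its performance and availability directly affect business continuity. According to an official MongoDB report, 73% of performance problems can be detected from monitoring metrics before they cause a visible business impact.

Key statistics: organizations that implement dedicated monitoring see mean time to repair (MTTR) fall by 45% on average, service downtime fall by 60%, and operational efficiency rise by 35% (Gartner 2023).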

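1.2 Core Metric Categories

1.2.1 Basic System Metrics

Metric	Importance	Notes
CPU utilization	High	Alert when above 80% for 5 consecutive minutes
Memory utilization	High	Check WiredTiger cache usage
Disk I/O	High	Monitor IOPS and throughput
Disk space	High	Forecast remaining space to avoid running out
Network traffic	Medium	Watch for abnormal network activity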
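
The CPU rule in the table (above 80% for 5 consecutive minutes) is a sustained-threshold check rather than a single-sample check. A minimal sketch of that logic follows, assuming one sample per minute arrives on standard input; the function name `sustained_breach` is hypothetical and not part of Zabbix or Prometheus.

```shell
# sustained_breach THRESHOLD COUNT < samples (one number per line)
# Prints "alert" when the last COUNT samples are all above THRESHOLD,
# otherwise prints "ok". Illustrative helper only.
sustained_breach() {
  awk -v t="$1" -v n="$2" '
    NF { v[++c] = $1 }
    END {
      if (c < n) { print "ok"; exit }
      for (i = c - n + 1; i <= c; i++)
        if (v[i] + 0 <= t + 0) { print "ok"; exit }
      print "alert"
    }'
}

# Usage: five one-minute CPU samples, threshold 80
printf '85\n90\n82\n88\n91\n' | sustained_breach 80 5
```

Requiring every sample in the window to exceed the threshold keeps a single short spike from paging anyone, which is the same idea as the `for:` clause in a Prometheus rule.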

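1.2.2 MongoDB-Specific Metrics

Metric	Importance	Notes
Active connections	High	Alert when approaching the connection limit
Slow query count	High	Indicates query performance problems
Lock wait time	High	Indicates a write bottleneck
Replication lag	High	Key replica set health indicator
Memory-mapped file size	Medium	Indicates dataset size (applies to the legacy MMAPv1 engine only)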
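
To make "approaching the connection limit" concrete: `db.serverStatus().connections` reports both `current` and `available`, so utilization can be estimated as current / (current + available). The sketch below assumes those two numbers have already been collected; the helper name `conn_utilization` is hypothetical.

```shell
# conn_utilization CURRENT AVAILABLE
# Prints connection utilization as a percentage with one decimal place.
# Assumes the connection limit is approximately CURRENT + AVAILABLE.
conn_utilization() {
  awk -v c="$1" -v a="$2" 'BEGIN {
    total = c + a
    if (total <= 0) { print "0.0"; exit }
    printf "%.1f\n", 100 * c / total
  }'
}

# Usage: 850 connections in use, 150 still available
conn_utilization 850 150
```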

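1.2.3 Business Metrics

Metric	Description	Why it matters
Key query latency	Response time of core business queries	Directly affects user experience
Operations per second	Business throughput	Measures business load
Transaction success rate	Success rate of business operations	Indicates business health
Size of business-critical collections	Growth of important data	Forecasts future demand

1.3 Monitoring Architecture Design Principles

Layered monitoring: infrastructure → MongoDB service → business metrics
Time-series storage: long-term trend analysis
Real-time alerting: immediate response to abnormal events
Scalability: room for future growth
Security first: the monitoring system itself must be secured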

II. Monitoring MongoDB with Zabbix in Practice

2.1 Zabbix Architecture and MongoDB Monitoring

2.1.1 Zabbix Monitoring Architecture

+------------------+      +------------------+      +------------------+
|   MongoDB Server |      |   Zabbix Server  |      |    Monitoring    |
|                  |<---->|                  |<---->|      Dashboard   |
+------------------+      +------------------+      +------------------+
       ↑                         ↑
       |                         |
+------------------+      +------------------+
|  Zabbix Agent    |      |  External Checks |
| (MongoDB module) |      | (Custom Scripts) |
+------------------+      +------------------+

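2.1.2 Why Choose Zabbix

Mature and stable: more than 15 years of development, enterprise-grade reliability
Rich features: auto-discovery, templates, alerting
Extensible: supports custom monitoring scripts
Community support: an active community plus commercial support
Cost-effective: the software is fully open source, with paid professional support available

2.2 Zabbix Deployment and Configuration

2.2.1 Installing Zabbix Server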

# CentOS/RHEL
sudo rpm -Uvh https://repo.zabbix.com/zabbix/6.0/rhel/8/x86_64/zabbix-release-6.0-3.el8.noarch.rpm
sudo dnf clean all
sudo dnf install zabbix-server-mysql zabbix-web-mysql zabbix-apache-conf zabbix-sql-scripts zabbix-agent

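# Configure the database (run the SQL statements at the mysql prompt)
sudo mysql -uroot -p
CREATE DATABASE zabbix CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
CREATE USER 'zabbix'@'localhost' IDENTIFIED BY 'StrongPassword123!';
GRANT ALL PRIVILEGES ON zabbix.* TO 'zabbix'@'localhost';
FLUSH PRIVILEGES;
EXIT

# Import the database schema
# (Zabbix 6.0 ships it in the zabbix-sql-scripts package; confirm the path with `rpm -ql zabbix-sql-scripts`)
zcat /usr/share/zabbix-sql-scripts/mysql/server.sql.gz | mysql --default-character-set=utf8mb4 -uzabbix -p zabbix

# Configure Zabbix Server (set these values in the file)
sudo nano /etc/zabbix/zabbix_server.conf
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=StrongPassword123!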

2.2.2 Configuring Zabbix Agent

# Install Zabbix Agent
sudo dnf install zabbix-agent

# Configure the agent (set these values in the file)
sudo nano /etc/zabbix/zabbix_agentd.conf
Server=192.168.1.100
ServerActive=192.168.1.100
Hostname=MongoDB-Server
Include=/etc/zabbix/zabbix_agentd.d/*.conf

# Create the MongoDB monitoring configuration
# (MongoDB 6.0+ no longer ships the legacy mongo shell; substitute mongosh there)
sudo mkdir -p /etc/zabbix/zabbix_agentd.d
sudo nano /etc/zabbix/zabbix_agentd.d/mongodb.conf
# mongodb.status reads the numeric .ok field so the item can be stored as a number
UserParameter=mongodb.status, echo "db.adminCommand('ping').ok" | mongo --quiet
UserParameter=mongodb.uptime, echo "db.serverStatus().uptime" | mongo --quiet
UserParameter=mongodb.connections, echo "db.serverStatus().connections.current" | mongo --quiet
UserParameter=mongodb.memory, echo "db.serverStatus().mem.resident" | mongo --quiet
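
The template in the next section stores `mongodb.status` as a numeric item and fires a trigger when it equals 0, so the agent has to emit a bare 0 or 1 even when the shell prints an error message instead. One way to guarantee that is a small filter between the shell and the agent. This is a sketch only: the function name `ping_to_flag` is hypothetical, and it assumes the command being wrapped prints the value of `db.adminCommand('ping').ok` as its last line.

```shell
# ping_to_flag: read shell output on stdin and print 1 when the last
# non-empty line is exactly "1", otherwise print 0.
ping_to_flag() {
  awk '
    NF { last = $0 }
    END { if (last ~ /^[ \t]*1[ \t\r]*$/) print 1; else print 0 }'
}

# Usage with captured output (a healthy server prints 1)
printf '1\n' | ping_to_flag
# A connection failure prints an error message instead, which maps to 0
printf 'Error: connection refused\n' | ping_to_flag
```

Placed in a wrapper script, a filter like this also lets connection errors surface as 0 rather than as an unsupported item.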

2.3 MongoDB Monitoring Template Configuration

2.3.1 Custom Items

<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export>
  <version>5.0</version>
  <date>2023-05-15T10:00:00Z</date>
  <groups>
    <group>
      <name>MongoDB</name>
    </group>
  </groups>
  <templates>
    <template>
      <name>MongoDB Monitoring</name>
      <groups>
        <group>
          <name>MongoDB</name>
        </group>
      </groups>
      <items>
        <item>
          <name>MongoDB Status</name>
          <key>mongodb.status</key>
          <delay>60s</delay>
          <type>0</type>
          <value_type>3</value_type>
        </item>
        <item>
          <name>MongoDB Uptime</name>
          <key>mongodb.uptime</key>
          <delay>300s</delay>
          <type>0</type>
          <value_type>3</value_type>
        </item>
        <item>
          <name>MongoDB Connections</name>
          <key>mongodb.connections</key>
          <delay>60s</delay>
          <type>0</type>
          <value_type>3</value_type>
        </item>
        <item>
          <name>MongoDB Memory Usage</name>
          <key>mongodb.memory</key>
          <delay>300s</delay>
          <type>0</type>
          <value_type>3</value_type>
        </item>
      </items>
      <triggers>
        <trigger>
          <expression>{MongoDB Monitoring:mongodb.status.last()}=0</expression>
          <name>MongoDB Service Down</name>
          <priority>4</priority>
        </trigger>
        <trigger>
          <expression>{MongoDB Monitoring:mongodb.connections.last()}>100</expression>
          <name>High MongoDB Connections</name>
          <priority>3</priority>
        </trigger>
      </triggers>
    </template>
  </templates>
</zabbix_export>

2.3.2 Advanced Item Configuration

# /etc/zabbix/zabbix_agentd.d/mongodb.conf
UserParameter=mongodb.ops, echo "db.serverStatus().opcounters" | mongo --quiet
UserParameter=mongodb.repl.status, echo "rs.status()" | mongo --quiet
UserParameter=mongodb.slow.queries, echo "db.getSiblingDB('local').system.profile.find().count()" | mongo --quiet
UserParameter=mongodb.wiredtiger.cache, echo "db.serverStatus().wiredTiger.cache" | mongo --quiet

2.4 Advanced Monitoring Configuration

2.4.1 Auto-Discovery Configuration

# /etc/zabbix/zabbix_agentd.d/mongodb-autodiscovery.conf
UserParameter=mongodb.databases, echo "db.getMongo().getDBNames()" | mongo --quiet
UserParameter=mongodb.collections[*], echo "db.getSiblingDB('$1').getCollectionNames()" | mongo --quiet

2.4.2 LLD (Low-Level Discovery) Configuration

<discovery_rules>
  <discovery_rule>
    <name>MongoDB Databases</name>
    <key>mongodb.databases</key>
    <delay>3600s</delay>
    <type>0</type>
    <item_prototypes>
      <item_prototype>
        <name>Database Size: {#DBNAME}</name>
        <key>mongodb.database.size[{#DBNAME}]</key>
        <delay>300s</delay>
      </item_prototype>
    </item_prototypes>
  </discovery_rule>
</discovery_rules>
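
For the `{#DBNAME}` macro in the rule above to resolve, the discovery key has to return JSON in Zabbix's low-level discovery format rather than a raw list of names. The sketch below shows the conversion, assuming the database names arrive one per line; the function name `to_lld_json` is hypothetical, and no escaping is done because MongoDB database names cannot contain quotes or backslashes.

```shell
# to_lld_json: read one database name per line on stdin and print a
# Zabbix low-level discovery document keyed by the {#DBNAME} macro.
to_lld_json() {
  awk '
    BEGIN { printf "{\"data\":[" }
    NF {
      printf "%s{\"{#DBNAME}\":\"%s\"}", sep, $1
      sep = ","
    }
    END { print "]}" }'
}

# Usage: two database names in, one discovery document out
printf 'admin\nlocal\n' | to_lld_json
```

Each discovered name then needs a matching agent key, such as the `mongodb.database.size[{#DBNAME}]` item prototype referenced in the rule.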

III. Monitoring MongoDB with Prometheus in Practice

3.1 Prometheus Architecture and MongoDB Monitoring

3.1.1 Prometheus Monitoring Architecture

+------------------+      +------------------+      +------------------+
|   MongoDB Server |      |   Prometheus     |      |    Monitoring    |
| (Exporter)       |<---->| (Time Series DB) |<---->|      Dashboard   |
+------------------+      +------------------+      +------------------+
       ↑                         ↑
       |                         |
+------------------+      +------------------+
|  Service Discovery|      |  Alert Manager   |
| (Kubernetes, DNS)|      | (Alert Routing)  |
+------------------+      +------------------+

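3.1.2 Why Choose Prometheus

Time-series database: purpose-built for monitoring
Powerful query language: PromQL
Dynamic service discovery: Kubernetes integration
Active community: a CNCF project with wide adoption
Rich ecosystem: Grafana, Alertmanager, and more

3.2 Prometheus Deployment and Configuration

3.2.1 Installing Prometheus

# Download and install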
wget https://github.com/prometheus/prometheus/releases/download/v2.44.0/prometheus-2.44.0.linux-amd64.tar.gz
tar xvfz prometheus-2.44.0.linux-amd64.tar.gz
cd prometheus-2.44.0.linux-amd64

# Configure Prometheus
cat << EOF > prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'mongodb'
    static_configs:
      - targets: ['localhost:9216']
EOF

# Start Prometheus
./prometheus --config.file=prometheus.yml

3.2.2 Configuring MongoDB Exporter

# Download MongoDB Exporter
wget https://github.com/percona/mongodb_exporter/releases/download/v0.32.0/mongodb_exporter-0.32.0.linux-amd64.tar.gz
tar xvfz mongodb_exporter-0.32.0.linux-amd64.tar.gz
cd mongodb_exporter-0.32.0.linux-amd64

# Configure the exporter: it takes flags and environment variables rather than a config file
export MONGODB_URI="mongodb://monitor:monitor123@localhost:27017"

# Start the exporter (confirm the flag names for your release with ./mongodb_exporter --help)
./mongodb_exporter --mongodb.direct-connect --log.level=info --web.listen-address=:9216

3.3 Prometheus Monitoring Configuration in Detail

3.3.1 Basic Scrape Configuration

# prometheus.yml
# MongoDB credentials belong in the exporter's connection URI;
# Prometheus only scrapes the exporter's HTTP endpoint.
scrape_configs:
  - job_name: 'mongodb'
    metrics_path: /metrics
    static_configs:
      - targets: ['localhost:9216']

3.3.2 Advanced Configuration (Kubernetes Environments)

# prometheus-k8s.yml
scrape_configs:
  - job_name: 'kubernetes-mongodb'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: mongo
        action: keep
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: kubernetes_pod_name

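3.4 Example Queries for Key Metrics

3.4.1 Basic Metric Queries

# Metric and label names differ between exporter versions;
# check the exporter's /metrics output before reusing these queries.

# Active connections
mongodb_connections{instance="localhost:9216"}

# Query operation count (a cumulative counter of all queries, not only slow ones)
mongodb_mongod_op_counters_total{operation="query", instance="localhost:9216"}

# Replication lag
mongodb_repl_lag_seconds{instance="localhost:9216"}

# WiredTiger cache utilization (bytes in cache / configured maximum)
mongodb_wiredtiger_cache_bytes{instance="localhost:9216"} / mongodb_wiredtiger_cache_max_bytes{instance="localhost:9216"}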

3.4.2 More Complex Queries

# Operations per second
rate(mongodb_mongod_op_counters_total{operation=~"insert|query", instance="localhost:9216"}[5m])

# Replication lag above 5 seconds
mongodb_repl_lag_seconds{instance="localhost:9216"} > 5

# Connection utilization
mongodb_connections{instance="localhost:9216"} / mongodb_connections_max{instance="localhost:9216"}

# Slow query ratio
sum(rate(mongodb_slow_queries_total[5m])) / sum(rate(mongodb_mongod_op_counters_total[5m]))

IV. Alerting Configuration and Best Practices

4.1 Zabbix Alert Configuration

4.1.1 Basic Alert Configuration

<triggers>
  <trigger>
    <expression>{MongoDB Monitoring:mongodb.status.last()}=0</expression>
    <name>MongoDB Service Down</name>
    <priority>4</priority>
    <description>MongoDB service is not responding</description>
  </trigger>
  <trigger>
    <expression>{MongoDB Monitoring:mongodb.connections.last()}>2*{MongoDB Monitoring:mongodb.connections.avg(15m)}</expression>
    <name>Connection Spikes Detected</name>
    <priority>3</priority>
  </trigger>
</triggers>

4.1.2 Advanced Alert Configuration

<triggers>
  <trigger>
    <expression>
      {mongodb.repl.status["primary"]} = 0 and
      {mongodb.repl.status["secondary"]} = 0
    </expression>
    <name>Replica Set Inoperable</name>
    <priority>5</priority>
    <description>All replica set members are down</description>
  </trigger>
  <trigger>
    <expression>
      {mongodb.wiredtiger.cache["used"]}/
      {mongodb.wiredtiger.cache["max"]} > 0.9
    </expression>
    <name>WiredTiger Cache Pressure</name>
    <priority>3</priority>
    <description>WiredTiger cache utilization > 90%</description>
  </trigger>
</triggers>

4.2 Prometheus Alert Configuration

4.2.1 Basic Alert Rules

# alert-rules.yml
groups:
  - name: MongoDB
    rules:
      - alert: MongoDBDown
        expr: mongodb_up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "MongoDB instance down"
          description: "MongoDB instance {{ $labels.instance }} is down"

      - alert: HighConnections
        expr: mongodb_connections_current / mongodb_connections_max > 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High MongoDB connections"
          description: "MongoDB instance {{ $labels.instance }} is using {{ $value | humanizePercentage }} of its connection limit"

4.2.2 Advanced Alert Configuration

groups:
  - name: MongoDBAdvanced
    rules:
      - alert: HighReplicationLag
        expr: mongodb_repl_lag_seconds > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High MongoDB replication lag"
          description: "Replication lag on {{ $labels.instance }} is {{ $value }} seconds"

      - alert: SlowQueryRate
        expr: rate(mongodb_slow_queries_total[5m]) > 5
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "High slow query rate"
          description: "Slow query rate is {{ $value }} queries/s on {{ $labels.instance }}"

      - alert: LowCacheUtilization
        expr: (mongodb_wiredtiger_cache_bytes / mongodb_wiredtiger_cache_max_bytes) < 0.2
        for: 30m
        labels:
          severity: info
        annotations:
          summary: "Low WiredTiger cache utilization"
          description: "WiredTiger cache utilization is low ({{ $value }}) on {{ $labels.instance }}"

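4.3 Alert Management Best Practices

4.3.1 Alert Severity Levels

Level	Severity	Handling	Response time
1 (Critical)	System outage	Respond immediately	<15 minutes
2 (High)	Degraded service	Handle urgently	<1 hour
3 (Medium)	Performance problem	Schedule a fix	<4 hours
4 (Low)	Potential problem	Log and track	<24 hours

4.3.2 Alert Inhibition Rules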

# alert-suppression.yml
route:
  group_by: [alertname, instance]
  receiver: 'slack-notifications'
  routes:
    - match:
        severity: 'critical'
      receiver: 'on-call-team'
      continue: true

    - match:
        alertname: 'LowCacheUtilization'
      receiver: 'daily-summary'

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['instance']

V. Visualization and Analysis

5.1 Grafana Configuration

5.1.1 Installing and Configuring Grafana

# Install Grafana
sudo rpm -Uvh https://dl.grafana.com/oss/release/grafana-10.0.3-1.x86_64.rpm
sudo systemctl daemon-reload
sudo systemctl start grafana-server
sudo systemctl enable grafana-server

5.1.2 MongoDB Monitoring Dashboard

{
  "title": "MongoDB Monitoring Dashboard",
  "panels": [
    {
      "title": "Connections",
      "type": "graph",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "mongodb_connections_current",
          "legendFormat": "Current"
        },
        {
          "expr": "mongodb_connections_available",
          "legendFormat": "Available"
        }
      ]
    },
    {
      "title": "Replication Status",
      "type": "gauge",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "mongodb_repl_lag_seconds",
          "legendFormat": "Lag (seconds)"
        }
      ]
    }
  ]
}

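5.2 Key Visualization Scenarios

5.2.1 Basic Health Overview

Real-time metrics: active connections, CPU/memory utilization
Replica set status: Primary/Secondary state
Key operation counts: read and write operation rates
System resources: disk space, I/O latency

5.2.2 Slow Query Analysis

Slow query count: trend over time
Query latency distribution: 95th/99th percentiles
Slow query patterns: identify frequently recurring slow queries
Index usage: identify missing indexes

5.2.3 Replication Lag Monitoring

Real-time replication lag: updated at second-level granularity
Historical trends: analyze lag patterns
Lag distribution: compare across shards and replica set members
Lag versus load: correlation analysis

5.3 Advanced Analysis Techniques

5.3.1 Anomaly Detection

# Detect anomalies using the standard deviation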
(
  mongodb_connections_current > 
  avg_over_time(mongodb_connections_current[1h]) + 
  2 * stddev_over_time(mongodb_connections_current[1h])
)
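
To see what the rule above computes, the same arithmetic can be worked through on a handful of samples: the threshold is the window mean plus two population standard deviations, and a value counts as anomalous only when it exceeds that threshold. This is a sketch for illustration, with hypothetical helper names and made-up sample values.

```shell
# anomaly_threshold: read samples (one per line) and print
# mean + 2 * population standard deviation, to two decimals.
anomaly_threshold() {
  awk '
    NF { n++; sum += $1; sumsq += $1 * $1 }
    END {
      if (n == 0) exit 1
      mean = sum / n
      var = sumsq / n - mean * mean
      if (var < 0) var = 0   # guard against floating-point rounding
      printf "%.2f\n", mean + 2 * sqrt(var)
    }'
}

# is_anomalous VALUE THRESHOLD: succeeds when VALUE is above THRESHOLD.
is_anomalous() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}

# Usage: mean 5, standard deviation 2, so the threshold is 9.00
limit=$(printf '2\n4\n4\n4\n5\n5\n7\n9\n' | anomaly_threshold)
is_anomalous 12 "$limit" && echo "12 connections is anomalous (limit $limit)"
```

A wider window smooths the baseline but reacts more slowly to a genuine shift in load, so the one-hour window in the query is a tuning choice rather than a fixed rule.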

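5.3.2 Predictive Analysis

# Linear-regression forecast of the disk space value 24 hours from now
# (a forecast at or below zero means the disk is expected to fill within that horizon)
predict_linear(mongodb_disk_space_bytes[24h], 3600 * 24)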
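
The forecast above rests on an ordinary least-squares line fitted to the samples in the window and then extended forward. The sketch below reproduces that arithmetic on "time value" pairs so the result can be checked by hand. The function name `forecast_linear` is hypothetical, and it extrapolates from the last sample's timestamp, whereas PromQL extrapolates from the evaluation time.

```shell
# forecast_linear HORIZON < "time value" pairs (one pair per line)
# Fits value = intercept + slope * time by least squares and prints the
# fitted value HORIZON time units after the last sample.
forecast_linear() {
  awk -v h="$1" '
    NF >= 2 { n++; sx += $1; sy += $2; sxx += $1 * $1; sxy += $1 * $2; last = $1 }
    END {
      d = n * sxx - sx * sx
      if (n < 2 || d == 0) exit 1
      slope = (n * sxy - sx * sy) / d
      icpt = (sy - slope * sx) / n
      printf "%.2f\n", icpt + slope * (last + h)
    }'
}

# Usage: free space falls by 10 units per step, so two steps after t=3 it is 50
printf '0 100\n1 90\n2 80\n3 70\n' | forecast_linear 2
```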

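VI. Advanced Monitoring Scenarios and Optimization

6.1 Sharded Cluster Monitoring

6.1.1 Sharded-Cluster-Specific Metrics

Metric	Description	Why it matters
`db.collection.getShardDistribution()`	Chunk distribution	Detect data skew
`sh.status()`	Shard status	Understand cluster health
`sh.getBalancerState()` / `sh.isBalancerRunning()`	Balancer state	Detect balancing activity
`config.changelog` (moveChunk events)	Migration status	Identify migration bottlenecks

6.1.2 Sharded Cluster Monitoring Configuration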

# prometheus.yml
# Each target is an exporter instance; MongoDB credentials are set in each exporter's connection URI.
scrape_configs:
  - job_name: 'mongodb-sharded'
    static_configs:
      - targets: ['config-svr1:9216', 'mongos1:9216', 'shard1:9216', 'shard2:9216']

6.2 Monitoring in Containerized Environments

6.2.1 Kubernetes-Specific Metrics

Metric	Description	Example query
`container_memory_usage_bytes`	Container memory usage	`container_memory_usage_bytes{pod=~"mongo-.*"}`
`container_cpu_usage_seconds_total`	CPU usage	`rate(container_cpu_usage_seconds_total{pod=~"mongo-.*"}[5m])`
`container_network_receive_bytes_total`	Network receive	`rate(container_network_receive_bytes_total{pod=~"mongo-.*"}[5m])`

6.2.2 Containerized Monitoring Configuration

# prometheus-k8s.yml
scrape_configs:
  - job_name: 'kubernetes-mongo'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: mongo
        action: keep
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__

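6.3 Performance Optimization Recommendations

6.3.1 Tuning the Monitoring System

Sampling rate: adjust the collection interval to the load
Metric filtering: collect only key metrics
Storage: tune the retention policy
Sharding: use sharded storage in high-load environments

6.3.2 Tuning the MongoDB Configuration

# Tune the WiredTiger configuration (mongod.conf)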
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8
      journalCompressor: zlib
      directoryForIndexes: true
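
Before overriding `cacheSizeGB`, it helps to know what WiredTiger would pick by default: the larger of 50% of (RAM minus 1 GB) and 256 MB. The sketch below works that formula through for a given amount of RAM; the helper name `default_wt_cache_gb` is hypothetical.

```shell
# default_wt_cache_gb RAM_GB
# Prints the default WiredTiger cache size in GB for a host with RAM_GB
# of memory: max(0.5 * (RAM - 1), 0.25).
default_wt_cache_gb() {
  awk -v ram="$1" 'BEGIN {
    size = 0.5 * (ram - 1)
    if (size < 0.25) size = 0.25
    printf "%.2f\n", size
  }'
}

# Usage: on a 16 GB host the default is 7.50 GB, close to the 8 GB set above
default_wt_cache_gb 16
```

An explicit value mainly matters when mongod shares the host with other processes or runs in a container whose memory limit is lower than the host's RAM.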

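VII. Monitoring Operations Best Practices

7.1 Monitoring Configuration Management

7.1.1 Version-Controlling the Configuration

# Manage the monitoring configuration with Git
mkdir -p ~/monitoring-config
cp /etc/zabbix/zabbix_agentd.d/*.conf ~/monitoring-config/
cp /etc/prometheus/prometheus.yml ~/monitoring-config/
# Run the Git commands inside the repository directory
cd ~/monitoring-config
git init
git add .
git commit -m "Initial monitoring configuration"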

7.1.2 Automating Configuration Deployment

# monitoring-config-deploy.yml (Ansible)
- name: Deploy Monitoring Config
  hosts: monitoring_servers
  tasks:
    - name: Copy Zabbix config
      copy:
        src: zabbix_agentd.conf
        dest: /etc/zabbix/zabbix_agentd.conf
      notify: restart zabbix-agent

    - name: Copy Prometheus config
      copy:
        src: prometheus.yml
        dest: /etc/prometheus/prometheus.yml
      notify: restart prometheus

  handlers:
    - name: restart zabbix-agent
      service:
        name: zabbix-agent
        state: restarted

    - name: restart prometheus
      service:
        name: prometheus
        state: restarted

7.2 Monitoring Health Checks

7.2.1 Self-Checks for the Monitoring System

# Run the self-check daily (crontab entry)
0 0 * * * /usr/local/bin/monitoring-health-check.sh

7.2.2 Health Check Script

#!/bin/bash
# monitoring-health-check.sh

# Check the Zabbix agent
zabbix_status=$(systemctl is-active zabbix-agent)
if [ "$zabbix_status" != "active" ]; then
  echo "ALERT: Zabbix agent is not running" | mail -s "Monitoring Alert" admin@yourcompany.com
fi

# Check Prometheus: the readiness endpoint answers HTTP 200 when ready,
# and -f makes curl exit non-zero on any other status
if ! curl -sf -o /dev/null http://localhost:9090/-/ready; then
  echo "ALERT: Prometheus is not ready" | mail -s "Monitoring Alert" admin@yourcompany.com
fi

# Check key metrics: count series whose mongodb_up sample is "0"
# (sample values are JSON strings, e.g. "value":[1700000000,"0"])
down_instances=$(curl -s 'http://localhost:9090/api/v1/query?query=mongodb_up' | grep -o '"value":\[[^]]*,"0"\]' | wc -l)
if [ "$down_instances" -gt 0 ]; then
  echo "ALERT: MongoDB instances down" | mail -s "Monitoring Alert" admin@yourcompany.com
fi
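
The last check in the script depends on the shape of the Prometheus instant-query response, where every sample is a `"value":[timestamp,"string"]` pair. The sketch below isolates that parsing step so it can be tried against a canned response without a running server. The function name `count_down_instances` and the sample payload are illustrative, and a JSON-aware tool would be more robust than a regular expression in production.

```shell
# count_down_instances: read a Prometheus /api/v1/query response on stdin
# and print how many series have the sample value "0".
count_down_instances() {
  grep -o '"value":\[[^]]*,"0"\]' | wc -l | tr -d ' '
}

# Usage with a canned response: one instance up, one down
sample='{"status":"success","data":{"resultType":"vector","result":[{"metric":{"instance":"a:9216"},"value":[1700000000.1,"1"]},{"metric":{"instance":"b:9216"},"value":[1700000000.1,"0"]}]}}'
printf '%s\n' "$sample" | count_down_instances
```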

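7.3 Continuous Improvement Process

[Flowchart, image not preserved. Nodes: monitoring data collection; decision "Data anomaly?" (yes/no); root cause analysis; performance trend analysis; remediation; optimization recommendations; implement improvements; verify results; update monitoring strategy]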

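VIII. Common Problems and Solutions

8.1 Monitoring Data Problems

Problem	Cause	Solution
Missing data	Service unavailable	Check the status of the target service
Abnormal data	Collection interval too long	Adjust the collection interval
Inconsistent data	Multiple monitoring systems	Consolidate on a single monitoring data source
Missing metrics	Configuration error	Validate the monitoring configuration

8.2 Alert Management Problems

Problem	Cause	Solution
Alert storm	Configuration error	Adjust alert thresholds and grouping
False positives	Thresholds too low	Analyze historical data and set reasonable thresholds
Missed alerts	Incomplete configuration	Extend monitoring coverage
Alert fatigue	Too many alerts	Implement alert severity levels and inhibition

8.3 Performance Problems

Problem	Cause	Solution
High resource usage by the monitoring system	Collection frequency too high	Lower the collection frequency
Insufficient storage space	Unsuitable retention policy	Adjust the retention policy
Slow queries	Data volume too large	Optimize the query statements
Unstable service	Insufficient resources	Allocate more resources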

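IX. Conclusions and Recommendations

9.1 Monitoring Implementation Roadmap

Baseline monitoring phase (1-2 weeks):
- Deploy the base monitoring system
- Configure monitoring of core metrics
- Set up critical alerts
Refinement phase (2-4 weeks):
- Extend monitoring coverage
- Configure advanced alerts
- Build visualization dashboards
Optimization phase (1-2 months):
- Analyze historical data
- Tune the monitoring configuration
- Introduce automation
Continuous improvement (ongoing):
- Review the monitoring system regularly
- Adjust the alerting strategy
- Tune performance settings

9.2 Key Success Factors

Factor	Description	Recommendation
Business alignment	Monitoring metrics match business goals	Review business requirements regularly
Right-sized monitoring	Avoid over-monitoring	Focus on key metrics
Continuous improvement	Adapt to environmental change	Establish a regular review process
Team collaboration	DBAs and operations teams work together	Define shared KPIs
Automation	Reduce manual intervention	Invest in automation tooling

Key takeaway: a monitoring system is not a one-off project but an ongoing practice. A successful monitoring system evolves as the business grows and the environment changes. The value of monitoring lies not in how much data is collected but in how that data is used to make better decisions.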

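Appendix: Monitoring Tool Cheat Sheet

Zabbix Command Reference

# Show the Zabbix Agent version (use `systemctl status zabbix-agent` for the service status)
zabbix_agentd -V

# Test a Zabbix item
zabbix_get -s localhost -k mongodb.status

# View the Zabbix agent log
tail -f /var/log/zabbix/zabbix_agentd.log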

Prometheus Command Reference

# Check Prometheus readiness
curl http://localhost:9090/-/ready

# Test MongoDB Exporter
curl http://localhost:9216/metrics

# Run a PromQL query
curl "http://localhost:9090/api/v1/query?query=mongodb_connections_current"

General Monitoring Commands

# Check MongoDB status
mongo --eval "db.serverStatus()"

# View slow queries
mongo --eval "db.getSiblingDB('local').system.profile.find().sort({ts: -1}).limit(10)"

# Check replica set status
mongo --eval "rs.status()"

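By implementing the monitoring strategies in this guide, your MongoDB deployment gains monitoring that covers day-to-day operations and also supports advanced analysis and troubleshooting. Remember that a good monitoring system is the foundation of a secure, stable, and efficient database environment.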