Real-Time System Metrics Monitoring with Prometheus + Grafana

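I. Monitoring Architecture Design

Core components and data flow

Exporters expose metrics over HTTP, Prometheus scrapes and stores them, and Grafana queries Prometheus for display.

  • Prometheus: scrapes and stores time-series data and evaluates alerting rules
  • Node Exporter: collects host metrics (CPU, memory, disk, network, and so on)
  • Database exporters: for example mysqld_exporter and postgres_exporter
  • Grafana: data visualization and dashboards
  • Alertmanager (optional): manages alert notifications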
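
To make the data flow concrete: each exporter serves plain-text samples at /metrics, and Prometheus reads them on every scrape. The sketch below parses a hand-written sample in that text format. The helper name metric_value and the sample values are assumptions made for illustration; they are not part of any of the tools above.

```shell
#!/usr/bin/env bash
# metric_value NAME: read Prometheus text-format samples on stdin and
# print the value of the first label-free sample named NAME.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Hand-written sample in the exposition format (values are made up).
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_memory_MemTotal_bytes 8000000000'

printf '%s\n' "$sample" | metric_value node_load1
```

Against a running Node Exporter, the same helper could be fed with: curl -s http://localhost:9100/metrics | metric_value node_load1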

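II. Host Environment Preparation

1. System requirements

  • Linux (CentOS 7+ or Ubuntu 20.04+ recommended)
  • Open ports: 9090 (Prometheus), 3000 (Grafana), 9100 (Node Exporter)
  • Keep the clocks on all nodes synchronized (NTP service)
bash
# Install NTP on CentOS 7
# (CentOS 8+ ships chrony instead of the ntp package; use chronyd there)
sudo yum install -y ntp
sudo systemctl start ntpd
sudo systemctl enable ntpd

# Install NTP on Ubuntu
sudo apt install -y ntp
sudo systemctl restart ntp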

III. Component Installation and Configuration

1. Install the Prometheus server

Download the binary package
bash
wget https://github.com/prometheus/prometheus/releases/download/v2.39.1/prometheus-2.39.1.linux-amd64.tar.gz
tar xvfz prometheus-2.39.1.linux-amd64.tar.gz
sudo mv prometheus-2.39.1.linux-amd64 /usr/local/prometheus
Create the system service
bash
# Dedicated service account with no home directory and no login shell
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus

# Create the service file. tee runs under sudo, so the write itself has
# root privileges; the quoted 'EOF' keeps the content literal.
sudo tee /etc/systemd/system/prometheus.service > /dev/null <<'EOF'
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
ExecStart=/usr/local/prometheus/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.listen-address=0.0.0.0:9090

Restart=always

[Install]
WantedBy=multi-user.target
EOF

# Install the default Prometheus configuration
sudo cp /usr/local/prometheus/prometheus.yml /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus

# Start the service and enable it at boot
sudo systemctl daemon-reload
sudo systemctl start prometheus
sudo systemctl enable prometheus
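
After starting the service, it can help to confirm that Prometheus actually answers before moving on. The following is a minimal sketch that polls Prometheus's /-/ready endpoint with curl; the helper name wait_for_url and the retry counts are assumptions chosen for illustration, and it presumes Prometheus listens on localhost:9090 as configured above.

```shell
#!/usr/bin/env bash
# wait_for_url URL [TRIES] [DELAY]: poll URL with curl until it returns
# an HTTP success status; return 1 after TRIES failed attempts.
wait_for_url() {
  local url=$1 tries=${2:-10} delay=${3:-1} i
  for ((i = 1; i <= tries; i++)); do
    if curl -fs -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if ((i < tries)); then sleep "$delay"; fi
  done
  return 1
}

# Assumes Prometheus listens on localhost:9090.
if wait_for_url http://localhost:9090/-/ready 3 1; then
  echo "Prometheus is ready"
else
  echo "Prometheus is not ready; check: journalctl -u prometheus"
fi
```

The same helper should work for the exporters by pointing it at their /metrics URLs.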

2. Deploy Node Exporter (on every node)

Download and install
bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.4.0.linux-amd64.tar.gz
sudo mv node_exporter-1.4.0.linux-amd64/node_exporter /usr/local/bin/
# System account with no login shell
sudo useradd -rs /bin/false node_exporter
Create the system service
bash
# tee runs under sudo, so the unit file is written with root privileges
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<'EOF'
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter

Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter

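3. Configure Prometheus scrape targets

Edit /etc/prometheus/prometheus.yml and add the node job under scrape_configs. The default file already contains a scrape_configs key, so append the job to it rather than repeating the key.

yaml
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['node1:9100', 'node2:9100', 'node3:9100']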
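
With more than a few hosts, typing the target list by hand gets error-prone. As a rough sketch, the list can be generated from hostnames; the helper name node_targets is hypothetical, and node1 to node3 are the same placeholder hostnames used in the configuration above.

```shell
#!/usr/bin/env bash
# node_targets PORT HOST...: print a YAML flow-style list of host:port
# entries suitable for a static_configs "targets:" value.
node_targets() {
  local port=$1 out="" host
  shift
  for host in "$@"; do
    # Prefix a comma separator for every entry except the first.
    out+="${out:+, }'${host}:${port}'"
  done
  printf '[%s]\n' "$out"
}

node_targets 9100 node1 node2 node3
```

Before restarting, the edited file can be validated with the promtool binary that ships in the Prometheus tarball: /usr/local/prometheus/promtool check config /etc/prometheus/prometheus.yml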

Restart Prometheus to apply the change:

bash
sudo systemctl restart prometheus

IV. Database Monitoring (MySQL Example)

1. Install mysqld_exporter

bash
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.14.0/mysqld_exporter-0.14.0.linux-amd64.tar.gz
tar xvfz mysqld_exporter-0.14.0.linux-amd64.tar.gz
sudo mv mysqld_exporter-0.14.0.linux-amd64/mysqld_exporter /usr/local/bin/
sudo useradd -rs /bin/false mysqld_exporter

2. Create the monitoring user

sql
-- Replace the example password with your own
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'SecurePass123!' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';

3. Create the credentials file

bash
sudo mkdir -p /etc/mysqld_exporter
# MySQL client option file read by the exporter (not a systemd environment file)
sudo tee /etc/mysqld_exporter/.my.cnf > /dev/null <<'EOF'
[client]
user=exporter
password=SecurePass123!
EOF
# The file holds a password, so restrict it to the exporter's service account
sudo chown mysqld_exporter:mysqld_exporter /etc/mysqld_exporter/.my.cnf
sudo chmod 600 /etc/mysqld_exporter/.my.cnf

4. Create the system service

bash
sudo tee /etc/systemd/system/mysqld_exporter.service > /dev/null <<'EOF'
[Unit]
Description=MySQL Exporter
After=network.target

[Service]
User=mysqld_exporter
ExecStart=/usr/local/bin/mysqld_exporter \
  --config.my-cnf=/etc/mysqld_exporter/.my.cnf \
  --web.listen-address=0.0.0.0:9104

Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl start mysqld_exporter
sudo systemctl enable mysqld_exporter
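
A running exporter process does not by itself prove that the exporter can log in to MySQL. mysqld_exporter reports this through the mysql_up gauge, which is 1 when the database answers. The sketch below checks that gauge in exporter output; the helper name mysql_reachable is an assumption, and the sample input is hand-written rather than captured from a real server.

```shell
#!/usr/bin/env bash
# mysql_reachable: read exporter output on stdin; succeed only if the
# mysql_up gauge is reported as 1 (the exporter can query MySQL).
mysql_reachable() {
  grep -Eq '^mysql_up 1(\.0+)?$'
}

# Illustrative sample line; a live check would instead pipe in
#   curl -s http://localhost:9104/metrics
if printf 'mysql_up 1\n' | mysql_reachable; then
  echo "exporter can reach MySQL"
else
  echo "exporter cannot reach MySQL; check the credentials in .my.cnf"
fi
```

Note that the scrape configuration in section III only defines the node job. For these metrics to reach Prometheus, a second job pointing at port 9104 on the database host would presumably need to be added to prometheus.yml in the same way.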

V. Installing and Configuring Grafana

1. Install Grafana (CentOS)

bash
sudo tee /etc/yum.repos.d/grafana.repo <<EOF
[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF

sudo yum install -y grafana
sudo systemctl start grafana-server
sudo systemctl enable grafana-server

2. Configure the Grafana data source

  1. Open http://<server-IP>:3000 and log in with the default account admin/admin (Grafana asks for a new password on first login)
  2. Left-hand menu → Configuration → Data Sources → Add data source (the menu path differs between Grafana versions)
  3. Select Prometheus and enter the URL http://localhost:9090
  4. Click Save & Test
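
The same data source can also be created without the UI through Grafana's HTTP API (POST /api/datasources). The following is a minimal sketch: the helper name datasource_payload is hypothetical, the JSON is built naively without escaping, and the commented-out curl call assumes Grafana on localhost:3000 with the default admin/admin login still in place.

```shell
#!/usr/bin/env bash
# datasource_payload NAME URL: print a JSON body for Grafana's
# POST /api/datasources endpoint (Prometheus type, server-side proxy).
# No JSON escaping is done, so NAME and URL must not contain quotes.
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

payload=$(datasource_payload Prometheus http://localhost:9090)
echo "$payload"

# To apply it, uncomment the call below (credentials are an assumption):
# curl -s -u admin:admin -H 'Content-Type: application/json' \
#   -X POST -d "$payload" http://localhost:3000/api/datasources
```

Scripting this step is mostly useful when the setup has to be repeated on several servers; for a single installation the UI steps above are enough.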

VI. Importing Dashboards

1. Host monitoring dashboards

  • Node Exporter Full: ID 1860
  • Linux Hosts Metrics: ID 11074

2. MySQL monitoring dashboards

  • MySQL Overview: ID 7362
  • Percona MySQL: ID 11323

Steps

  1. Left-hand menu → Create → Import
  2. Enter the dashboard ID → Load
  3. Select the Prometheus data source → Import

VII. Security Hardening

1. Firewall configuration

bash
# CentOS
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload

# Ubuntu
sudo ufw allow 3000/tcp
sudo ufw allow 9090/tcp
sudo ufw reload

# On each monitored node, port 9100/tcp must also be reachable from the
# Prometheus server so that Node Exporter can be scraped.

2. Grafana reverse proxy (Nginx example)

nginx
server {
    listen 80;
    server_name grafana.yourdomain.com;

    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

VIII. Alerting Configuration Example

1. Create the alert rule file

bash
# The quoted 'EOF' stops the shell from expanding $labels inside the rule
sudo tee /etc/prometheus/alerts.yml > /dev/null <<'EOF'
groups:
- name: host-alerts
  rules:
  - alert: HighMemoryUsage
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 85
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High memory usage (instance {{ $labels.instance }})"
      description: "Memory usage has been above 85% for 5 minutes"
EOF
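
To see what the rule expression computes, here is the same arithmetic worked through in shell with made-up figures. The helper name mem_used_pct and the 16 GiB / 2 GiB values are assumptions for illustration only; Prometheus evaluates the real expression per instance from the scraped node_memory_* samples.

```shell
#!/usr/bin/env bash
# mem_used_pct TOTAL AVAILABLE: percentage of memory in use, computed
# the same way as the alert expression:
#   (MemTotal - MemAvailable) / MemTotal * 100
mem_used_pct() {
  awk -v t="$1" -v a="$2" 'BEGIN { printf "%.1f\n", (t - a) / t * 100 }'
}

# Made-up figures: 16 GiB total, 2 GiB available.
total=$((16 * 1024 * 1024 * 1024))
avail=$((2 * 1024 * 1024 * 1024))
pct=$(mem_used_pct "$total" "$avail")
echo "used: ${pct}%"

# The rule fires only if the value stays above 85 for the whole 5m window.
if awk -v p="$pct" 'BEGIN { exit !(p > 85) }'; then
  echo "above threshold"
fi
```

The rule file itself can be syntax-checked with promtool from the Prometheus tarball: /usr/local/prometheus/promtool check rules /etc/prometheus/alerts.yml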

2. Update the Prometheus configuration

yaml
# /etc/prometheus/prometheus.yml
# Relative paths are resolved against the directory of this file
rule_files:
  - alerts.yml

Restart the service:

bash
sudo systemctl restart prometheus

IX. Troubleshooting Guide

1. Check service status

bash
sudo systemctl status prometheus
sudo systemctl status node_exporter
sudo systemctl status mysqld_exporter

2. View logs

bash
# Prometheus logs
journalctl -u prometheus -f

# Node Exporter logs
journalctl -u node_exporter -f

# MySQL Exporter logs
journalctl -u mysqld_exporter -f
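
The per-service checks above can be folded into one loop. This is a sketch built on systemctl is-active; the helper name check_services and the SYSTEMCTL override variable are assumptions, the latter added only so the loop can be dry-run on a machine without these units.

```shell
#!/usr/bin/env bash
# check_services UNIT...: print one status line per unit and return
# non-zero if any unit is not active. Set SYSTEMCTL to substitute
# another command for systemctl (hypothetical hook for dry runs).
check_services() {
  local systemctl=${SYSTEMCTL:-systemctl} unit failed=0
  for unit in "$@"; do
    if "$systemctl" is-active --quiet "$unit" 2>/dev/null; then
      echo "OK    $unit"
    else
      echo "DOWN  $unit  (see: journalctl -u $unit -n 50)"
      failed=1
    fi
  done
  return "$failed"
}

check_services prometheus node_exporter mysqld_exporter grafana-server || true
```

Because the function returns non-zero when any unit is down, it could also serve as a simple health check in a cron job or deployment script.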

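X. Summary

With this native (non-container) installation, you now have a complete monitoring system:

  • Resource monitoring: a real-time view of CPU, memory, disk, and other host metrics
  • Database monitoring: query performance, connection counts, and replication status
  • Alerting: threshold-based alert rules (delivering email or DingTalk notifications additionally requires Alertmanager)
  • Security hardening: services protected by a firewall and a reverse proxy

Next steps

  • Integrate Alertmanager for multi-channel alerting
  • Monitor middleware such as Redis and Kafka
  • Deploy long-term storage (such as Thanos) to manage historical data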
