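Monitoring applications with Prometheus and Grafana

Configuring Prometheus

Official download page: Download | Prometheus

(Download the tarball with wget, extract it, rename the extracted directory to prometheus, and place it under /data/prometheus, so that the binaries end up in /data/prometheus/prometheus.)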
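
The download step can be sketched as a small shell helper. The release version, the CPU architecture, and both function names are assumptions for illustration, since the article does not name a version. The URL follows the naming scheme of the project's GitHub release assets.

```shell
#!/bin/sh
# Sketch: download and unpack a Prometheus release into /data/prometheus/prometheus.
# Assumptions (not from the article): the caller supplies the version and the
# CPU architecture, and the tarball is fetched from the GitHub release page.

# prometheus_tarball_url VERSION ARCH
# Print the download URL of the Linux tarball for that release.
prometheus_tarball_url() {
    printf 'https://github.com/prometheus/prometheus/releases/download/v%s/prometheus-%s.linux-%s.tar.gz\n' "$1" "$1" "$2"
}

# install_prometheus VERSION [ARCH]
# Download, extract, and rename the directory to "prometheus".
# Runs in a subshell so that "cd" and "set -e" do not leak to the caller.
install_prometheus() (
    set -e
    version=$1
    arch=${2:-amd64}
    mkdir -p /data/prometheus
    cd /data/prometheus
    wget "$(prometheus_tarball_url "$version" "$arch")"
    tar -xzf "prometheus-$version.linux-$arch.tar.gz"
    # The tarball unpacks into a directory named after the release.
    mv "prometheus-$version.linux-$arch" prometheus
)
```

Call it as `install_prometheus <version>` with a version taken from the download page. If /data/prometheus/prometheus already exists, remove or rename it first, because `mv` would otherwise move the new directory inside the old one.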

Set it up as a systemd service (place the file at /etc/systemd/system/prometheus.service):

[Unit]

Description=Prometheus Server

Documentation=https://prometheus.io/docs/introduction/overview/

After=network-online.target

[Service]

Type=simple

User=prometheus
Group=prometheus

ExecStart=/data/prometheus/prometheus/prometheus \
  --config.file=/data/prometheus/prometheus/prometheus.yml \
  --storage.tsdb.path=/data/prometheus/prometheus/data \
  --storage.tsdb.retention.time=60d \
  --web.enable-lifecycle


Restart=on-failure

[Install]

WantedBy=multi-user.target

Create the prometheus user and give it ownership of the files

# Create the user: -M skips creating a home directory, -s sets the login shell
useradd -M -s /usr/sbin/nologin prometheus

# Grant ownership
chown prometheus:prometheus -R /data/prometheus

Start the service and check its status

# After creating or editing the unit file, reload the systemd configuration
sudo systemctl daemon-reload

# Start
systemctl start prometheus.service

# Check the status
systemctl status prometheus.service

# Enable start on boot
systemctl enable prometheus.service

# Follow the logs
journalctl -u prometheus.service -f
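
Right after `systemctl start`, the server may need a moment before it answers requests. A minimal sketch of a retry helper follows. The function name `retry` and its arguments are hypothetical, and the usage line assumes `curl` is installed and Prometheus listens on localhost:9090. `/-/ready` is Prometheus's readiness endpoint.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between tries. Returns 0 on success, 1 otherwise.
# (Hypothetical helper; it uses prefixed global variables because plain
# POSIX sh has no local variables.)
retry() {
    retry_attempts=$1
    retry_delay=$2
    shift 2
    retry_i=1
    while [ "$retry_i" -le "$retry_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$retry_i" -lt "$retry_attempts" ]; then
            sleep "$retry_delay"
        fi
        retry_i=$((retry_i + 1))
    done
    return 1
}
```

For example, `retry 10 1 curl -fsS http://localhost:9090/-/ready` polls once per second for up to ten attempts and fails if the server never becomes ready.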

Reloading Prometheus

# -X POST sends a POST request; the endpoint is available because the unit file passes --web.enable-lifecycle
curl -X POST http://localhost:9090/-/reload

Access the web UI on port 9090:

# Access via ip:9090
Prometheus    http://xxx.xxx.xxx.xxx:9090

Metrics       http://xxx.xxx.xxx.xxx:9090/metrics
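
The /metrics page is plain text with one sample per line, so single values can be pulled out with standard tools. The sketch below is a hypothetical helper (`metric_value` is not part of Prometheus) and only handles samples that carry no labels.

```shell
#!/bin/sh
# metric_value NAME
# Read Prometheus text-format metrics on stdin and print the value of the
# first sample whose name is exactly NAME. Only label-less samples match,
# e.g. "process_open_fds 12". "# HELP" and "# TYPE" lines never match
# because their first field is "#".
# Exit status is 0 if the metric was found, 1 otherwise.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; found = 1; exit }
                      END { exit !found }'
}
```

A possible use, assuming `curl` is installed: `curl -s http://localhost:9090/metrics | metric_value process_open_fds`.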

Configuring Alertmanager

Service unit file:

[Unit]

Description=Alert Manager

Wants=network-online.target
After=network-online.target

[Service]

Type=simple

User=prometheus
Group=prometheus

# Keep Alertmanager's state in its own directory, not in Prometheus's TSDB directory
ExecStart=/data/prometheus/alertmanager/alertmanager \
  --config.file=/data/prometheus/alertmanager/alertmanager.yml \
  --storage.path=/data/prometheus/alertmanager/data


Restart=always

[Install]

WantedBy=multi-user.target

Access the web UI on port 9093.
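
To see something in the Alertmanager UI before any Prometheus rule fires, a test alert can be posted by hand. This is only a sketch: `test_alert_payload` is a hypothetical helper, the alert name and severity are placeholders, and the usage line assumes `curl` is installed and Alertmanager listens on localhost:9093. The payload shape is the JSON array accepted by Alertmanager's `POST /api/v2/alerts` endpoint.

```shell
#!/bin/sh
# test_alert_payload ALERTNAME SEVERITY
# Print a JSON array holding one alert with the given labels.
# (Hypothetical helper; the arguments are inserted verbatim, so they must
# not contain double quotes or backslashes.)
test_alert_payload() {
    printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"description":"manual test alert"}}]\n' "$1" "$2"
}
```

One way to send it: `test_alert_payload TestAlert critical | curl -fsS -X POST -H 'Content-Type: application/json' -d @- http://localhost:9093/api/v2/alerts`. The alert should then show up in the web UI until it expires.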

Configure the alert rules on the Prometheus server:

root@pass:~# cat /data/prometheus/prometheus/alert.yml

groups:
- name: Prometheus alert
  rules:
  - alert: ServiceAlert
    expr: up == 0
    for: 30s
    labels:
        severity: critical
    annotations:
        instance: "{{ $labels.instance }}"
        description: "{{ $labels.job }} server is down"

Check the result in the web UI.

Validate the YAML configuration files with promtool:

root@pass:~# cd /data/prometheus/prometheus/
root@pass:/data/prometheus/prometheus# ./promtool check config prometheus.yml
Checking prometheus.yml
  SUCCESS: 1 rule files found
 SUCCESS: prometheus.yml is valid prometheus config file syntax

Checking alert.yml
  SUCCESS: 1 rules found
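
The promtool check combines naturally with the reload endpoint shown earlier: validate first, and reload only if validation passes. The sketch below assumes the paths used in this article. The function name and the `PROMTOOL` and `CURL` override variables are hypothetical.

```shell
#!/bin/sh
# check_and_reload CONFIG_FILE [BASE_URL]
# Validate CONFIG_FILE with "promtool check config"; only if that passes,
# ask the running server to reload via POST /-/reload.
# PROMTOOL and CURL may be set in the environment to use other binaries.
check_and_reload() {
    "${PROMTOOL:-/data/prometheus/prometheus/promtool}" check config "$1" || return 1
    "${CURL:-curl}" -fsS -X POST "${2:-http://localhost:9090}/-/reload"
}
```

For example: `check_and_reload /data/prometheus/prometheus/prometheus.yml`. A broken prometheus.yml or alert.yml then never reaches the running server.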

Configuring Grafana

Download page: Download Grafana | Grafana Labs

Service unit file:

[Unit]
Description=Grafana server

Documentation=http://docs.grafana.org
 
[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=on-failure
ExecStart=/data/prometheus/grafana/bin/grafana-server \
  --config=/data/prometheus/grafana/conf/defaults.ini \
  --homepath=/data/prometheus/grafana

[Install]
WantedBy=multi-user.target

Access the web UI on port 3000 (default username and password: admin/admin).
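
Grafana also exposes a health endpoint, `/api/health`, which is handy for checking the service without logging in. The sketch below is a hypothetical helper that inspects the JSON body with a simple pattern match instead of a full JSON parser.

```shell
#!/bin/sh
# grafana_db_ok
# Read the JSON body of Grafana's /api/health endpoint on stdin and succeed
# only if it reports "database": "ok".
grafana_db_ok() {
    grep -Eq '"database"[[:space:]]*:[[:space:]]*"ok"'
}
```

A possible use, assuming `curl` is installed: `curl -fsS http://localhost:3000/api/health | grafana_db_ok && echo "Grafana is healthy"`.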

Configuring node_exporter

Download it from the same page as Prometheus.

Service unit file:

[Unit]
Description=node_exporter

Documentation=http://prometheus.io/

After=network.target
 
[Service]
User=prometheus
Group=prometheus
Restart=on-failure
ExecStart=/data/prometheus/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target

Access the web UI on port 9100.

Finally, remember to add everything you need to prometheus.yml:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093 
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "alert.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]
       
  - job_name: "node-exporter"
    scrape_interval: 15s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      # node_exporter listens on port 9100
      - targets: ["localhost:9100"]
        labels:
          instance: Prometheus server
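
When more hosts run node_exporter, each one becomes another entry under `targets`. The sketch below is a hypothetical helper that prints a scrape job stanza to paste under `scrape_configs`. The function name and the host names in the usage line are placeholders.

```shell
#!/bin/sh
# scrape_job JOB_NAME TARGET [TARGET...]
# Print a scrape_configs entry with one static target per argument,
# indented to sit directly under "scrape_configs:".
scrape_job() {
    scrape_job_name=$1
    shift
    printf '  - job_name: "%s"\n' "$scrape_job_name"
    printf '    static_configs:\n'
    printf '      - targets:\n'
    for scrape_target in "$@"; do
        printf '          - "%s"\n' "$scrape_target"
    done
}
```

For example, `scrape_job node-exporter localhost:9100 host2:9100` prints a job with two targets. After editing prometheus.yml, validate it with promtool and reload Prometheus.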

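Displaying the results in the web UI

To set up a dashboard manually, browse the official page: Grafana dashboards | Grafana Labs

Copy the ID of the dashboard you want, then import it into Grafana.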

