Prometheus Service Discovery with Consul: Deployment and Hands-On Practice

Note: this is an original article, focused on practical content and kept short and clear so it is easy to follow.

Contents

  • Preface
  • 1. Consul-based service discovery
    • 1.1 Overview
    • 1.2 Overall workflow
    • 1.3 File-based vs. Consul-based service discovery
  • 2. Hands-on: Consul-based service discovery
    • 2.1 Environment
    • 2.2 Running Consul
    • 2.3 Temporarily registering services in Consul with JSON files
    • 2.4 Editing prometheus.yml
    • 2.5 Persistently registering services in Consul with JSON files
    • 2.6 Installing the ConsulManager management UI

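Preface

In a monitoring stack, traditional file-based service discovery requires maintaining the target list by hand. Every time nodes are added or removed the configuration has to be edited, which makes operations inefficient. To cope with dynamically changing services and environments, Prometheus supports several mainstream service discovery mechanisms. This article covers automatic service discovery based on Consul: using Consul's service registration and lookup capabilities, monitoring targets are managed dynamically and operations work is simplified.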

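1. Consul-based service discovery

1.1 Overview

Consul is a distributed service registry and configuration center. Its core has two parts, service registration and service query:

  1. After starting, exporters and business services register their address, port, tags, and other details with Consul.
  2. Prometheus periodically queries the Consul API, pulls the list of registered services, and generates scrape targets automatically, so addresses no longer need to be maintained by hand.
  3. When a service goes offline and is deregistered, it disappears from Consul and Prometheus updates its scrape targets accordingly; in the meantime, Consul's health checks flag failing instances. Targets are thus managed dynamically.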

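1.2 Overall workflow

  1. Deploy a Consul server to act as the registry.
  2. Register each monitoring component or business service with Consul.
  3. Configure consul_sd_configs in Prometheus to point at Consul.
  4. Prometheus periodically pulls the service list and scrapes metrics automatically.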
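
Steps 2 to 4 can be pictured with a small Python sketch. It is only an illustration under assumptions: the entries below imitate the shape of a Consul health API response (`Node`, `Service`, `Address`, `Port`, `Tags`, `Meta`), the sample data is made up from the hosts used later in this article, and `build_targets` is a hypothetical helper, not Prometheus's actual implementation.

```python
def build_targets(entries):
    """Turn Consul-style service entries into simple scrape-target dicts."""
    targets = []
    for entry in entries:
        svc = entry["Service"]
        # Use the service's own address, falling back to the node address.
        host = svc.get("Address") or entry["Node"]["Address"]
        targets.append({
            "address": f"{host}:{svc['Port']}",
            "service": svc["Service"],
            "tags": list(svc.get("Tags") or []),
            "meta": dict(svc.get("Meta") or {}),
        })
    return targets


# Made-up sample data shaped like a Consul health API response (assumption).
sample_entries = [
    {
        "Node": {"Address": "192.168.13.141"},
        "Service": {
            "ID": "node1", "Service": "node_exporter",
            "Address": "192.168.13.141", "Port": 9100,
            "Tags": ["exporter"], "Meta": {"job": "node_exporter"},
        },
    },
    {
        "Node": {"Address": "192.168.13.142"},
        "Service": {
            "ID": "node2", "Service": "node_exporter",
            "Address": "", "Port": 9100,  # empty: node address is used
            "Tags": ["exporter"], "Meta": {"job": "node_exporter"},
        },
    },
]

targets = build_targets(sample_entries)
for target in targets:
    print(target["address"], target["tags"], target["meta"])
```

The point of the sketch is that the target list is derived entirely from what is registered in Consul, so adding or removing a registration changes the targets without touching the Prometheus configuration.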

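1.3 File-based vs. Consul-based service discovery

File-based: maintained by hand, simple and lightweight. Consul-based: automatic registration and dynamic discovery.

| Feature | File-based service discovery | Consul service discovery |
| --- | --- | --- |
| How it works | YAML files maintained by hand; no service registry, no automatic discovery | Services register with Consul automatically; no manual config edits |
| Automatic service registration/deregistration | Not supported | Supported |
| Health checks | Not supported | Supported |
| Typical use | Few servers; testing, learning, small-scale monitoring | Production, microservices, docker, k8s, large-scale monitoring platforms |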

For more on file-based service discovery, see this article:
https://blog.csdn.net/m0_63756214/article/details/161393935?spm=1001.2014.3001.5501

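2. Hands-on: Consul-based service discovery

2.1 Environment

| Hostname | IP address | Services | Notes |
| --- | --- | --- | --- |
| prometheus | 192.168.13.141 | docker, docker-compose, prometheus, alertmanager, node-exporter, grafana | Monitoring host, already installed |
| ubuntu | 192.168.13.142 | docker, docker-compose, various exporters | Monitored host, already installed |

The services on the monitoring host are already installed. Prometheus can be installed either from binaries or with Docker; this walkthrough uses the container installation, and the exporters on the monitored host are deployed as containers as well. Choose whichever method suits you.

Installing the services on the monitoring host is not repeated here. If you need a walkthrough, see these articles:
Prometheus binary installation: https://blog.csdn.net/m0_63756214/article/details/161196428?spm=1001.2014.3001.5501
Prometheus container installation: https://blog.csdn.net/m0_63756214/article/details/161225636?spm=1001.2014.3001.5501

For installing docker and docker-compose on the monitored host, see sections 2.1 and 2.2 of this article:
https://blog.csdn.net/m0_63756214/article/details/161240598?spm=1001.2014.3001.5501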
2.2 Running Consul

Consul website: https://developer.hashicorp.com/consul/install

root@prometheus:~# mkdir /opt/prometheus/consul
root@prometheus:~# cd /opt/prometheus/consul
root@prometheus:/opt/prometheus/consul# vim docker-compose.yaml
services:
  consul:
    # An older version is used here
    image: consul:1.14.5
    container_name: consul
    restart: always
    ports:
      - "8500:8500"
    volumes:
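      # /consul/config is the directory Consul scans for configuration files at startup
      - ./config:/consul/config
      # /consul/data is the directory where Consul stores all of its runtime data
      - ./data:/consul/data
    # Start a single-node Consul in server mode, enable the web UI, and accept connections from any host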
    command: agent -server -bootstrap-expect=1 -ui -client 0.0.0.0
root@prometheus:/opt/prometheus/consul# docker-compose up -d

Open http://192.168.13.141:8500 in a browser to reach the Consul UI.

The Consul agent registers itself as a service named consul.

2.3 Temporarily registering services in Consul with JSON files

Here services are registered temporarily with curl, mainly to check that Consul works.

root@prometheus:/opt/prometheus/consul# vim node_exporter.json 
{
  "id": "node1",
  "name": "node_exporter",
  "address": "192.168.13.141",
  "port": 9100,
  "tags": ["exporter"],
  "meta": {
    "job": "node_exporter",
    "instance": "Prometheus server"
  },
  "checks": [{
    "http": "http://192.168.13.141:9100/metrics",
    "interval": "10s"
  }]
}

# Register the service from the JSON file
root@prometheus:/opt/prometheus/consul# curl --request PUT --data @node_exporter.json http://localhost:8500/v1/agent/service/register
root@prometheus:/opt/prometheus/consul# vim node_exporter2.json 
{
  "id": "node2",
  "name": "node_exporter",
  "address": "192.168.13.142",
  "port": 9100,
  "tags": ["exporter"],
  "meta": {
    "job": "node_exporter",
    "instance": "Linux server"
  },
  "checks": [{
    "http": "http://192.168.13.142:9100/metrics",
    "interval": "10s"
  }]
}

# Register the service from the JSON file
root@prometheus:/opt/prometheus/consul# curl --request PUT --data @node_exporter2.json http://localhost:8500/v1/agent/service/register

Open http://192.168.13.141:8500 in a browser; the services are now registered in Consul.
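
The same registration can be sketched in Python with only the standard library. This is a minimal sketch, not part of the original setup: `build_registration` and `register` are hypothetical helper names, the payload is modeled on node_exporter.json above, and the health check is assumed to be an HTTP check against `/metrics`. The `register` call needs a reachable Consul agent, so it is left commented out.

```python
import json
import urllib.request


def build_registration(service_id, name, address, port, tags, meta, interval="10s"):
    """Build a registration payload in the same shape as the JSON files above."""
    return {
        "id": service_id,
        "name": name,
        "address": address,
        "port": port,
        "tags": tags,
        "meta": meta,
        # Assumption: every exporter serves its metrics at /metrics.
        "checks": [{"http": f"http://{address}:{port}/metrics", "interval": interval}],
    }


def register(payload, consul_url="http://localhost:8500"):
    """PUT the payload to the agent's register endpoint (same URL as the curl call)."""
    request = urllib.request.Request(
        f"{consul_url}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = build_registration(
    "node1", "node_exporter", "192.168.13.141", 9100,
    tags=["exporter"],
    meta={"job": "node_exporter", "instance": "Prometheus server"},
)
print(json.dumps(payload, indent=2))

# Uncomment on a host that can reach the Consul agent:
# print(register(payload))
```

Building the payload in a function makes it easy to register many hosts in a loop instead of writing one JSON file per host by hand.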

2.4 Editing prometheus.yml

Above, two node_exporter services were registered in Consul. Next, configure Prometheus to discover exporter services automatically through Consul.

root@prometheus:/opt/prometheus/consul# cd /opt/prometheus/prometheus/
root@prometheus:/opt/prometheus/prometheus# ls
alert.yml  prometheus.yml  rules
root@prometheus:/opt/prometheus/prometheus# cp prometheus.yml prometheus.yml.bak
root@prometheus:/opt/prometheus/prometheus# vim prometheus.yml
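# Global configuration
global:
  scrape_interval:     15s # Scrape targets every 15 seconds. The default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# Alertmanager configuration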
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['alertmanager:9093']

# Alerting rule files
rule_files:
  - "alert.yml"
  - "rules/*.yml"

# Scrape configuration
scrape_configs:
  - job_name: 'prometheus'
    # Override the global default: scrape this job's targets every 15 seconds
    scrape_interval: 15s
    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'alertmanager'
    # Override the global default: scrape this job's targets every 15 seconds
    scrape_interval: 15s
    static_configs:
    - targets: ['alertmanager:9093']

  - job_name: 'consul_exporter'
    consul_sd_configs:
      - server: '192.168.13.141:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*exporter.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
# Scrape configuration for Spring Boot applications
  - job_name: 'consul_springboot_demo'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    consul_sd_configs:
      - server: '192.168.13.141:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*springboot.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
# HTTP probe configuration
  - job_name: "consul-blackbox_http"
    metrics_path: /probe
    params:
      module: [http_2xx]
    consul_sd_configs:
      - server: '192.168.13.141:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*blackbox_http.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.13.142:9115
# TCP probe configuration
  - job_name: "consul_blackbox_tcp"
    metrics_path: /probe
    params:
      module: [tcp_connect]
    consul_sd_configs:
      - server: '192.168.13.141:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*blackbox_tcp.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.13.142:9115

# ICMP probe configuration
  - job_name: "consul_blackbox_icmp"
    metrics_path: /probe
    params:
      module: [icmp]
    consul_sd_configs:
      - server: '192.168.13.141:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*blackbox_icmp.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.13.142:9115

# Domain checks
  - job_name: consul_domain_exporter
    scrape_interval: 10s
    metrics_path: /probe
    consul_sd_configs:
      - server: '192.168.13.141:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*domain.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - target_label: __address__
        replacement: 192.168.13.142:9222
root@prometheus:/opt/prometheus/prometheus# curl -X POST http://localhost:9090/-/reload

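Open the Prometheus web UI; Prometheus has discovered the node_exporter targets automatically. Note that the /-/reload endpoint used above only works when Prometheus is started with --web.enable-lifecycle.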
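
To make the relabel_configs above easier to follow, here is a simplified Python emulation of the three actions they use (keep, labelmap, and the default replace). It is a sketch under assumptions, not Prometheus's code: `apply_relabel` is a hypothetical helper, regexes are matched against the whole value as Prometheus does, replacements use Python's `\1` instead of `$1`, and the sample labels are made up (Prometheus exposes the tag list surrounded by separators, for example `,exporter,`).

```python
import re


def apply_relabel(labels, rules):
    """Apply a small subset of relabel actions; return None if the target is dropped."""
    labels = dict(labels)
    for rule in rules:
        regex = re.compile(rule.get("regex", "(.*)"))
        action = rule.get("action", "replace")
        # Source label values are joined with ';' before matching.
        value = ";".join(labels.get(name, "") for name in rule.get("source_labels", []))
        if action == "keep":
            if not regex.fullmatch(value):
                return None
        elif action == "labelmap":
            # Copy every label whose NAME matches to a new name built from the match.
            for name, label_value in list(labels.items()):
                match = regex.fullmatch(name)
                if match:
                    labels[match.expand(rule.get("replacement", r"\1"))] = label_value
        elif action == "replace":
            match = regex.fullmatch(value)
            if match:
                labels[rule["target_label"]] = match.expand(rule.get("replacement", r"\1"))
    return labels


discovered = {
    "__address__": "192.168.13.141:9100",
    "__meta_consul_tags": ",exporter,",
    "__meta_consul_service_metadata_job": "node_exporter",
    "__meta_consul_service_metadata_instance": "Prometheus server",
}
exporter_rules = [
    {"source_labels": ["__meta_consul_tags"], "regex": ".*exporter.*", "action": "keep"},
    {"regex": "__meta_consul_service_metadata_(.+)", "action": "labelmap"},
]
springboot_rules = [
    {"source_labels": ["__meta_consul_tags"], "regex": ".*springboot.*", "action": "keep"},
]
kept = apply_relabel(discovered, exporter_rules)
dropped = apply_relabel(discovered, springboot_rules)

# Blackbox jobs: the registered address becomes the probe target, and the
# scrape itself is sent to the blackbox exporter.
blackbox_rules = [
    {"source_labels": ["__meta_consul_service_address"], "target_label": "__param_target"},
    {"source_labels": ["__param_target"], "target_label": "instance"},
    {"target_label": "__address__", "replacement": "192.168.13.142:9115"},
]
probe = apply_relabel(
    {"__address__": "https://www.jd.com:0",
     "__meta_consul_service_address": "https://www.jd.com"},
    blackbox_rules,
)
print(kept["job"], kept["instance"], dropped, probe["__address__"])
```

Every job in the configuration lists all Consul services (`services: []`) and relies on the tag-based keep rule to pick its own targets, which is why the tag chosen at registration time decides which job scrapes a service.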

2.5 Persistently registering services in Consul with JSON files

root@prometheus:/opt/prometheus/consul# cd config/
# The files below are service definitions in JSON; Consul loads the JSON files under config/
root@prometheus:/opt/prometheus/consul/config# vim nginx.json 
{
  "service": {
    "id": "nginx1",
    "name": "nginx_exporter",
    "address": "192.168.13.142",
    "port": 9113,
    "tags": ["exporter"],
    "meta": {
      "job": "nginx_exporter",
      "instance": "Linux server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.142:9113/metrics",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim mysql.json 
{
  "service": {
    "id": "mysql1",
    "name": "mysqld_exporter",
    "address": "192.168.13.142",
    "port": 9104,
    "tags": ["exporter"],
    "meta": {
      "job": "mysqld_exporter",
      "instance": "Linux server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.142:9104/metrics",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim process.json 
{
  "service": {
    "id": "process1",
    "name": "process_exporter",
    "address": "192.168.13.142",
    "port": 9256,
    "tags": ["exporter"],
    "meta": {
      "job": "process_exporter",
      "instance": "Linux server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.142:9256/metrics",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim rabbitmq.json 
{
  "service": {
    "id": "rabbitmq1",
    "name": "rabbitmq_exporter",
    "address": "192.168.13.142",
    "port": 15692,
    "tags": ["exporter"],
    "meta": {
      "job": "rabbitmq_exporter",
      "instance": "Linux server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.142:15692/metrics",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim springboot.json 
{
  "service": {
    "id": "springboot1",
    "name": "springboot",
    "address": "192.168.13.142",
    "port": 8081,
    "tags": ["springboot"],
    "meta": {
      "job": "springboot",
      "instance": "Linux server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.142:8081/actuator/prometheus",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim redis.json 
{
  "service": {
    "id": "redis1",
    "name": "redis_exporter",
    "address": "192.168.13.142",
    "port": 9121,
    "tags": ["exporter"],
    "meta": {
      "job": "redis_exporter",
      "instance": "Linux server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.142:9121/metrics",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim cadvisor1.json 
{
  "service": {
    "id": "cadvisor1",
    "name": "cadvisor",
    "address": "192.168.13.141",
    "port": 8080,
    "tags": ["exporter"],
    "meta": {
      "job": "cadvisor",
      "instance": "Prometheus server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.141:8080/metrics",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim cadvisor2.json 
{
  "service": {
    "id": "cadvisor2",
    "name": "cadvisor",
    "address": "192.168.13.142",
    "port": 8080,
    "tags": ["exporter"],
    "meta": {
      "job": "cadvisor",
      "instance": "Linux server",
      "env": "test"
    },
    "checks": [
      {
        "http": "http://192.168.13.142:8080/metrics",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim blackbox-http.json 
{
  "service": {
    "id": "http1",
    "name": "blackbox_http",
    "address": "https://www.jd.com",
    "tags": ["blackbox_http"],
    "checks": [
      {
        "http": "http://192.168.13.142:9115",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim blackbox-tcp.json 
{
  "service": {
    "id": "tcp1",
    "name": "blackbox_tcp",
    "address": "192.168.11.61",
    "port": 9090,
    "tags": ["blackbox_tcp"],
    "checks": [
      {
        "http": "http://192.168.13.142:9115",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim blackbox-icmp.json 
{
  "service": {
    "id": "icmp1",
    "name": "blackbox_icmp",
    "address": "192.168.11.62",
    "tags": ["blackbox_icmp"],
    "checks": [
      {
        "http": "http://192.168.13.142:9115",
        "interval": "5s"
      }
    ]
  }
}


root@prometheus:/opt/prometheus/consul/config# vim domain.json 
{
  "service": {
    "id": "domain1",
    "name": "domain_exporter",
    "address": "baidu.com",
    "tags": ["domain"],
    "checks": [
      {
        "http": "http://192.168.13.142:9222",
        "interval": "5s"
      }
    ]
  }
}

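Open http://192.168.13.141:8500 in a browser; the Consul UI shows that the services have been added.

Consul reads the files under config/ when the agent starts or reloads, so if the services do not appear, restart the Consul container with docker restart consul.

Open the Prometheus web UI; all of the services have been picked up automatically.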
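
Since the exporter files above differ only in id, name, and port, they can also be generated. The following Python sketch shows one way to do that under assumptions: `service_definition` and `write_definitions` are hypothetical helpers, files are named after the service id (the article names them by hand), and the output goes to a temporary directory here rather than to /opt/prometheus/consul/config.

```python
import json
import tempfile
from pathlib import Path


def service_definition(service_id, name, address, port, tags, instance,
                       env="test", metrics_path="/metrics"):
    """Build one service definition in the same shape as the files above."""
    return {
        "service": {
            "id": service_id,
            "name": name,
            "address": address,
            "port": port,
            "tags": tags,
            "meta": {"job": name, "instance": instance, "env": env},
            "checks": [
                {"http": f"http://{address}:{port}{metrics_path}", "interval": "5s"}
            ],
        }
    }


def write_definitions(definitions, config_dir):
    """Write each definition to <config_dir>/<id>.json and return the paths."""
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for definition in definitions:
        target = config_dir / f"{definition['service']['id']}.json"
        text = json.dumps(definition, indent=2, ensure_ascii=False) + "\n"
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written


# (id, name, port) taken from the exporter files in this section.
exporters = [
    ("nginx1", "nginx_exporter", 9113),
    ("mysql1", "mysqld_exporter", 9104),
    ("redis1", "redis_exporter", 9121),
]
definitions = [
    service_definition(sid, name, "192.168.13.142", port, ["exporter"], "Linux server")
    for sid, name, port in exporters
]

with tempfile.TemporaryDirectory() as tmp:
    paths = write_definitions(definitions, tmp)
    # Read the files back to confirm they are valid JSON.
    reloaded = [json.loads(path.read_text(encoding="utf-8")) for path in paths]
    file_names = sorted(path.name for path in paths)

print(file_names)
```

To use the output for real, point `write_definitions` at the mounted config directory and restart the Consul container so the new files are loaded.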

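2.6 Installing the ConsulManager management UI

In production, it is common to install ConsulManager, a management UI for Consul.

It is a web panel dedicated to managing Consul. Services can be registered directly in ConsulManager without writing JSON or restarting Consul, and it supports batch adding and deleting, editing tags, editing health checks, and more.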

root@prometheus:/opt/prometheus/consul# mkdir consulmanager
root@prometheus:/opt/prometheus/consul# cd consulmanager/
root@prometheus:/opt/prometheus/consul/consulmanager# vim docker-compose.yaml
version: "3.8"
services:
  flask-consul:
    image: swr.cn-south-1.myhuaweicloud.com/starsl.cn/flask-consul:latest
    container_name: flask-consul
    hostname: flask-consul
    restart: always
    volumes:
      - /usr/share/zoneinfo/PRC:/etc/localtime
    environment:
      # Consul login token; none is set in this environment
      consul_token: ""
      # Consul address
      consul_url: http://192.168.13.141:8500/v1
      # Login password for ConsulManager (choose your own)
      admin_passwd: Qing@123456
      log_level: INFO

  nginx-consul:
    image: swr.cn-south-1.myhuaweicloud.com/starsl.cn/nginx-consul:latest
    container_name: nginx-consul
    hostname: nginx-consul
    restart: always
    ports:
      - "1026:1026"
    volumes:
      - /usr/share/zoneinfo/PRC:/etc/localtime
root@prometheus:/opt/prometheus/consul/consulmanager# docker-compose up -d

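Open http://192.168.13.141:1026 in a browser and log in to the ConsulManager UI.

Because many JSON files were already written by hand earlier, those services show up in ConsulManager right away.

Services can be added or deleted directly in ConsulManager.

For example, add a service (the nginx.json file created earlier has already been deleted from the server).

The service is added successfully, and the corresponding target appears in Prometheus as well.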


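Notes:

If you spot any omissions or mistakes in this article, corrections are welcome.

This article is 100% original. When reposting, please credit the original author.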

