Service Discovery with Prometheus + Ansible + Consul

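I. Introduction


1. Consul overview

  • Consul is an open-source tool written in Go for distributed, service-oriented systems. It provides service registration and discovery, health checks, a key/value store, multi-datacenter support, and distributed consistency guarantees, and it can also be used for configuration management.

  • Without automatic service discovery, the Prometheus configuration file has to be edited every time a monitored target is added or removed, which is a real burden for operators. With Consul in place, the monitored targets are maintained in Consul and Prometheus discovers them dynamically.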

2. Lab environment

IP              Operating system   Installed services
172.18.200.52   Ubuntu 22.04.1     Docker, Prometheus, Grafana, Consul
172.18.200.53   Ubuntu 22.04.1     node-exporter

II. Installing Consul


1. Configure docker-compose.yml

# cat docker-compose.yml
version: '3'
services:
  consul:
    image: consul:1.15
    restart: always
    container_name: consul
    hostname: consul
    environment:
      TZ: Asia/Shanghai
    ports:
      - 8500:8500
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./consul/config:/consul/config
      - ./consul/data:/consul/data/
    command: ["consul","agent","-config-dir","/consul/config"]

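2. Configure consul.hcl

server: when set to true, the agent runs in server mode instead of client mode.
data_dir: the directory where Consul persists its state.
log_level: the log verbosity of the agent.
client_addr: the address the client interfaces (HTTP API, DNS) listen on. 0.0.0.0 accepts connections on all interfaces; to tighten access, restrict it by subnet or allow connections only from known IPs.
bind_addr: the server's IP address used for cluster communication. It can be omitted when the host has only one network interface.
connect: enables Consul Connect (service mesh).
ui_config: enables the built-in web UI.
bootstrap: lets this single server elect itself leader without waiting for other servers.
acl: enables ACLs with a default policy of deny, so every API call needs a token.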

# cat consul/config/consul.hcl
client_addr = "0.0.0.0"
bind_addr = "127.0.0.1"
data_dir = "/consul/data"
log_level = "INFO"
server = true
bootstrap = true
connect{
  enabled = true
}
ui_config{
  enabled = true
}
acl = {
  enabled = true
  default_policy = "deny"
  enable_token_persistence = true
}

3. Start Consul

# docker-compose up -d
# docker exec -it consul '/bin/sh'
/ # consul acl bootstrap
AccessorID:       738dba6d-xxxx-6f8e-xxxx-8b10d9b06a6f
SecretID:         c32db00c-xxxx-37be-xxxx-8b674d033ce3
Description:      Bootstrap Token (Global Management)
Local:            false
Create Time:      2023-11-14 06:16:01.812609522 +0000 UTC
Policies:
   00000000-0000-0000-0000-000000000001 - global-management
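
The SecretID printed above is the token used in every later step (the UI login, the Ansible inventory, and prometheus.yml). If you capture the bootstrap output in a script, the value can be pulled out as in the sketch below. This is only an illustration: `parse_acl_bootstrap` is a hypothetical helper name, and the sample text is the masked output shown above.

```python
def parse_acl_bootstrap(output):
    """Collect the 'Key: value' lines of `consul acl bootstrap` into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines that carry no value, such as the "Policies:" header
        # and the policy entries listed below it.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Masked sample copied from the bootstrap output shown above.
sample_output = """\
AccessorID:       738dba6d-xxxx-6f8e-xxxx-8b10d9b06a6f
SecretID:         c32db00c-xxxx-37be-xxxx-8b674d033ce3
Description:      Bootstrap Token (Global Management)
Local:            false
Create Time:      2023-11-14 06:16:01.812609522 +0000 UTC
Policies:
   00000000-0000-0000-0000-000000000001 - global-management
"""

bootstrap_fields = parse_acl_bootstrap(sample_output)
print(bootstrap_fields["SecretID"])
```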

4. Open the web UI

Log in with the SecretID printed by consul acl bootstrap.

http://172.18.200.52:8500
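
The same token can also be checked against the HTTP API instead of the browser. The sketch below is a minimal example using only the Python standard library; `build_consul_request` and `fetch_json` are hypothetical helper names, the base URL is the one from the environment table, and the token must be replaced with your own SecretID. It assumes the `/v1/agent/self` endpoint, which returns the agent's own configuration when the token is accepted.

```python
import json
import urllib.request

# Address of the Consul server from the lab environment above.
CONSUL_URL = "http://172.18.200.52:8500"


def build_consul_request(path, token, base_url=CONSUL_URL):
    """Build a GET request for the Consul HTTP API with the ACL token attached."""
    request = urllib.request.Request(base_url + path, method="GET")
    request.add_header("X-Consul-Token", token)
    return request


def fetch_json(request, timeout=5):
    """Send the request and decode the JSON body (raises on HTTP errors such as 403)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a reachable Consul server, so it is left commented out):
# info = fetch_json(build_consul_request("/v1/agent/self", "<your SecretID>"))
# print(info["Config"]["NodeName"])
```

With ACLs enabled and a default policy of deny, a request with a wrong token should be rejected with an HTTP error rather than return data.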

III. Configuring Ansible


1. Install

# apt-get install ansible

2. Edit the configuration

# cat /etc/ansible/ansible.cfg
[defaults]
#host_key_checking = False
#error_on_undefined_vars = True
#timeout = 60
#inventory = inventory.tmp
#roles_path = /conjurinc
#remote_tmp = /tmp
host_key_checking = False
log_path = /var/log/ansible.log

IV. Writing the ansible-playbook


1. Directory layout

# tree ./
./
├── inventory
│   └── hosts
├── node_exporter_roles.yml
└── roles
    ├── node-exporter
    │   ├── defaults
    │   │   └── main.yml
    │   ├── files
    │   │   └── node_exporter-1.6.1.linux-amd64.tar.gz
    │   ├── handlers
    │   │   └── main.yml
    │   ├── tasks
    │   │   └── main.yml
    │   └── templates
    │       └── node_exporter.service.j2
    └── register
        ├── files
        │   └── consul_register.sh
        └── tasks
            ├── main.yml
            └── register.yml

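2. Configure hosts

service_name is optional. When it is omitted, the default from roles/node-exporter/defaults/main.yml is used.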

# cat inventory/hosts
[linux]
172.18.200.53 service_name=linux-172.18.200.53

[linux:vars]
consul_ip=172.18.200.52
consul_port=8500
node_exporter_port=9100
consul_token=c32db00c-xxxx-37be-xxxx-8b674d033ce3

3. Configure node_exporter_roles.yml

# cat node_exporter_roles.yml
- hosts: linux
  gather_facts: no
  roles:
    - role: node-exporter

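4. Configure roles/node-exporter

(1) Download the exporter

Download: https://github.com/prometheus/node_exporter/releases/tag/v1.6.1

Place node_exporter-1.6.1.linux-amd64.tar.gz in roles/node-exporter/files.

(2) Configure defaults

Set the default value of service_name.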

# cat roles/node-exporter/defaults/main.yml
service_name: "{{ group_names[0] }}-{{ inventory_hostname }}"
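
To make the default concrete, the sketch below mimics in Python how this value resolves for one host. It is an illustration, not Ansible code: `resolve_service_name` is a hypothetical function, and it assumes that Ansible builds `group_names` as an alphabetically sorted list that leaves out the implicit `all` group, which is why `group_names[0]` is not always the group you expect when a host belongs to several groups.

```python
def resolve_service_name(groups, inventory_hostname, host_vars=None):
    """Return the service_name a host would end up with.

    groups: every inventory group the host belongs to
    host_vars: variables set on the host line of the inventory, if any
    """
    host_vars = host_vars or {}
    # A value set in the inventory overrides the role default.
    if "service_name" in host_vars:
        return host_vars["service_name"]
    # Assumption: group_names is sorted and does not contain "all".
    group_names = sorted(g for g in groups if g != "all")
    return f"{group_names[0]}-{inventory_hostname}"


print(resolve_service_name(["linux"], "172.18.200.53"))
print(resolve_service_name(["linux"], "172.18.200.53", {"service_name": "web-01"}))
```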

(3) Configure handlers

# cat roles/node-exporter/handlers/main.yml
- name: restart node exporter service
  systemd:
    name: node_exporter
    state: restarted
    daemon-reload: yes

# import_tasks replaces the bare include, which newer Ansible releases no longer accept
- import_tasks: roles/register/tasks/register.yml

(4) Configure tasks

# cat roles/node-exporter/tasks/main.yml
- name: push node_exporter
  unarchive:
    src: node_exporter-1.6.1.linux-amd64.tar.gz
    dest: /usr/local

- name: rename
  shell: |
    cd /usr/local
    if [ ! -d node_exporter ]
      then mv node_exporter-1.6.1.linux-amd64 node_exporter
    fi

- name: copy node_exporter systemd
  template:
    src: node_exporter.service.j2
    dest: /usr/lib/systemd/system/node_exporter.service
  notify: restart node exporter service

- name: start node_exporter
  systemd:
    name: node_exporter
    state: started
    enabled: yes
    daemon-reload: yes

# import_tasks replaces the bare include, which newer Ansible releases no longer accept
- import_tasks: roles/register/tasks/main.yml

(5) Configure templates

node_exporter_port: the listen port is configurable through this inventory variable.

# cat roles/node-exporter/templates/node_exporter.service.j2
[Unit]
Description=node_exporter

[Service]
ExecStart=/usr/local/node_exporter/node_exporter --web.listen-address=:{{ node_exporter_port }}
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure

[Install]
WantedBy=multi-user.target

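5. Configure roles/register

(1) Configure files

The script takes seven positional arguments, which register.yml fills from these variables:

service_name: the per-host name from the Ansible inventory; it is used as the Consul service ID
group_names[0]: the inventory group name; it is used as the Consul service name. If the host also belongs to a parent group through children, the wanted group may sit at another index, such as group_names[1]
inventory_hostname: the host's IP as listed in the Ansible inventory
node_exporter_port: the node_exporter port, 9100 by default
consul_ip: the IP of the Consul server
consul_port: the port of the Consul server
consul_token: the SecretID of the Consul ACL token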

# cat roles/register/files/consul_register.sh
#!/bin/bash

instance_id=$1
service_name=$2
ip=$3
port=$4
consul_ip=$5
consul_port=$6
consul_token=$7

curl -X PUT --header "X-CONSUL-TOKEN: $consul_token" -d '{"id": "'"$instance_id"'","name": "'"$service_name"'","address": "'"$ip"'","port": '"$port"',"tags": ["'"$service_name"'"],"checks": [{"http": "http://'"$ip"':'"$port"'","interval": "5s"}]}' http://$consul_ip:$consul_port/v1/agent/service/register
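
The JSON body in the curl command is hard to read as a shell one-liner. The sketch below rebuilds the same payload in Python so its structure is visible; `build_registration` is a hypothetical helper, and the example values follow the argument order used by register.yml (the per-host service_name as the ID, the group name as the Consul service name).

```python
import json


def build_registration(instance_id, service_name, ip, port):
    """Build the body sent to /v1/agent/service/register by consul_register.sh."""
    return {
        "id": instance_id,            # unique per host
        "name": service_name,         # shared by all hosts of the group
        "address": ip,
        "port": int(port),
        "tags": [service_name],
        # HTTP health check against the exporter, repeated every 5 seconds.
        "checks": [{"http": f"http://{ip}:{port}", "interval": "5s"}],
    }


payload = build_registration("linux-172.18.200.53", "linux", "172.18.200.53", 9100)
print(json.dumps(payload, indent=2))
```

Because every host of a group registers under the same name with a different ID, Prometheus only needs the group name to find all of them.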

(2) Configure tasks

# cat roles/register/tasks/main.yml
- name: push consul_register.sh
  copy:
    src: roles/register/files/consul_register.sh
    dest: /usr/local/bin

# import_tasks replaces the bare include, which newer Ansible releases no longer accept
- import_tasks: roles/register/tasks/register.yml

# cat roles/register/tasks/register.yml
- name: register nodes into consul
  shell: /bin/bash /usr/local/bin/consul_register.sh {{ service_name }} {{ group_names[0] }} {{ inventory_hostname }} {{ node_exporter_port }} {{ consul_ip }} {{ consul_port }} {{ consul_token }}

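V. Updating the Prometheus configuration


1. Configure prometheus.yml

linux in services is the group name from the Ansible hosts file, which the register role uses as the Consul service name.
services is a list, so servers from several groups can be added here, which also gives you grouping.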

# cat prometheus/conf/prometheus.yml
...
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "linux"
    consul_sd_configs:
      - server: 172.18.200.52:8500
        token: c32db00c-xxxx-37be-xxxx-8b674d033ce3
        services: ['linux']
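
As a rough picture of what this job discovers, the sketch below turns Consul service entries into scrape targets. It is an assumption-laden illustration, not Prometheus code: `targets_from_health_entries` is a hypothetical function, the input mimics the shape of Consul's `/v1/health/service/<name>` response, the node values in the sample are made up, and the fallback to the node address when a service has no address of its own reflects my understanding of how the Consul service discovery behaves.

```python
def targets_from_health_entries(entries):
    """Derive host:port scrape targets from Consul health-service entries."""
    targets = []
    for entry in entries:
        service = entry["Service"]
        # Prefer the address registered with the service; fall back to the node.
        host = service.get("Address") or entry["Node"]["Address"]
        targets.append(f"{host}:{service['Port']}")
    return targets


# Illustrative entry for the host registered by the playbook.
sample_entries = [
    {
        "Node": {"Node": "consul", "Address": "127.0.0.1"},
        "Service": {
            "ID": "linux-172.18.200.53",
            "Service": "linux",
            "Tags": ["linux"],
            "Address": "172.18.200.53",
            "Port": 9100,
        },
    },
]

print(targets_from_health_entries(sample_entries))
```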

2. Restart Prometheus

# docker restart prometheus
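
Once hosts have been registered, the targets of the job can be checked through the Prometheus HTTP API as well as on the Targets page. The sketch below only shows how to read the response of `/api/v1/targets`; `summarize_targets` is a hypothetical helper, and the sample payload is an abbreviated, made-up response that keeps just the fields used here. The Prometheus port is not stated in this article, so fetching the data is left to the reader.

```python
def summarize_targets(payload, job):
    """Return (scrape URL, health) pairs for one job from /api/v1/targets."""
    active = payload["data"]["activeTargets"]
    return [
        (target["scrapeUrl"], target["health"])
        for target in active
        if target["labels"].get("job") == job
    ]


# Abbreviated, illustrative response body.
sample_payload = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"instance": "172.18.200.53:9100", "job": "linux"},
                "scrapeUrl": "http://172.18.200.53:9100/metrics",
                "health": "up",
            },
        ],
    },
}

for url, health in summarize_targets(sample_payload, "linux"):
    print(url, health)
```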

VI. Running the playbook and adding Grafana


1. Run the ansible-playbook command

# ansible-playbook -i inventory/hosts node_exporter_roles.yml

2. Check Consul

After the playbook finishes, the linux service should appear in the Consul UI.
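
The registration can also be inspected from a script. The sketch below groups the service IDs known to the agent by service name; `group_service_ids` is a hypothetical helper, and the sample input mimics the shape of the `/v1/agent/services` response (a JSON object keyed by service ID) for the one host in this setup.

```python
from collections import defaultdict


def group_service_ids(agent_services):
    """Group the service IDs returned by /v1/agent/services by service name."""
    grouped = defaultdict(list)
    for service_id, service in agent_services.items():
        grouped[service["Service"]].append(service_id)
    return {name: sorted(ids) for name, ids in grouped.items()}


# Illustrative response for the host registered by the playbook.
sample_services = {
    "linux-172.18.200.53": {
        "ID": "linux-172.18.200.53",
        "Service": "linux",
        "Tags": ["linux"],
        "Address": "172.18.200.53",
        "Port": 9100,
    },
}

print(group_service_ids(sample_services))
```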

3. Add the Grafana dashboard

Dashboard template ID: 9276
