Setting Up Prometheus Monitoring (Installed with Ansible, Step by Step)

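Contents

1. What each component does
2. Install the batch deployment tool Ansible
3. Target servers
4. Set up passwordless SSH between the servers
5. Download the packages
5.1 Prometheus download URL
5.2 Exporter download URL
5.3 Grafana download URL
6. Create the configuration files Ansible needs
7. Write the Ansible playbook
8. Verify the result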


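Today I'd like to share how to set up Prometheus. A Prometheus monitoring stack needs three components: Prometheus, Grafana, and an exporter. If you also want alerting, you need to install Alertmanager as well. I have tested this on Kylin V10, CentOS 7, Ubuntu 18, and Ubuntu 20, and it ran on all of them. If it fails on your system, a small tweak is usually enough, and you can message me if you get stuck. Nothing here runs in Docker, so it should work in most environments. If you follow my steps and it still doesn't work, you can hit me, hahaha.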

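1. What each component does

Prometheus:

Role: Prometheus is an open-source system monitoring and alerting toolkit, originally developed at SoundCloud. It collects and stores time series data (metrics) from systems and services and provides a powerful query language (PromQL) for analyzing that data. Its data model is multi-dimensional: every time series is identified by a metric name plus a set of labels, which works well with dynamic service discovery.

Exporter:

Role: An exporter reads metrics from an existing system or service and converts them into the Prometheus format. Exporters may be official, or developed by the community or third parties, and they cover many kinds of systems (databases, web servers, message brokers, and so on). An exporter exposes an HTTP endpoint (conventionally /metrics) that Prometheus scrapes at regular intervals and stores. This guide uses node_exporter, which reports host-level metrics.

Grafana:

Role: Grafana is an open-source data visualization and monitoring platform for displaying and analyzing metrics from Prometheus and other data sources. It offers a rich set of charts and a dashboard editor, so you can build monitoring dashboards to suit your needs and combine data from several sources in one view. Besides charts, Grafana also supports alerting: it can send notifications when a configured threshold condition is met.

Alertmanager:

Role: Alertmanager receives the alerts that Prometheus fires and, following the routing rules in its configuration, sends notifications to the chosen receivers such as email, Slack, or PagerDuty. The playbook in this guide does not install it.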
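To make the exporter description more concrete, here is a small sketch of the plain-text format an exporter serves and how a single value can be pulled out of it. The sample values are invented for illustration, and the function names `sample_metrics` and `metric_value` are made up for this example; only the metric names come from node_exporter.

```shell
#!/bin/sh
# Sample of the text format served on /metrics (values invented for illustration).
sample_metrics() {
cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_memory_MemFree_bytes 1.234e+09
EOF
}

# Hypothetical helper: read metrics text on stdin and print the value of the
# metric named in $1. Comment lines start with "#", so they never match.
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

sample_metrics | metric_value node_load1   # prints 0.42
```

Against a live exporter the same filter could be fed from `curl -s http://192.168.1.11:9100/metrics` instead of the sample text.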

2. Install the batch deployment tool Ansible
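The original post gives no commands for this step. As a rough sketch, the helper below prints a plausible install command for the control node (host1) depending on which package manager it finds. The function name `ansible_install_cmd` is made up for this example, and the package and repository names (EPEL on CentOS 7, the distribution's `ansible` package on Ubuntu) are assumptions; Kylin V10 may need its own repository.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print an install command
# for Ansible based on the package manager found on this machine.
# Package and repository names are assumptions; check them for your distro.
ansible_install_cmd() {
    if command -v yum >/dev/null 2>&1; then
        # On CentOS 7 the ansible package comes from EPEL, so enable it first.
        echo "yum install -y epel-release && yum install -y ansible"
    elif command -v apt-get >/dev/null 2>&1; then
        echo "apt-get update && apt-get install -y ansible"
    else
        # Fallback: install from PyPI for the current user.
        echo "python3 -m pip install --user ansible"
    fi
}

# Only prints the suggestion; review it, then run it as root on host1.
ansible_install_cmd
```

After installing, `ansible --version` should print the installed version.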

3. Target servers

| Hostname | Host IP | Services deployed |
|-------|--------------|---------------------|
| host1 | 192.168.1.11 | exporter, prometheus |
| host2 | 192.168.1.12 | exporter, grafana |

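4. Set up passwordless SSH between the servers

The Prometheus host (host1) needs passwordless SSH access to every server, including itself.

[root@host1 ~]# ssh-keygen -t rsa -b 4096
[root@host1 ~]# ssh-copy-id 192.168.1.11
[root@host1 ~]# ssh-copy-id 192.168.1.12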
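With more than a couple of servers, the ssh-copy-id step is easier as a loop. The sketch below assumes the key pair already exists and uses the two addresses from section 3; the `HOSTS` and `DRY_RUN` variables and the `distribute_key` function are names invented for this example.

```shell
#!/bin/sh
# Hosts from the table in section 3; extend the list as you add machines.
HOSTS="192.168.1.11 192.168.1.12"

# Copy the public key to every host. With DRY_RUN=1 (the default in this
# sketch) the commands are only printed, so nothing is changed by accident.
distribute_key() {
    for host in $HOSTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id $host"
        else
            ssh-copy-id "$host"
        fi
    done
}

distribute_key
```

Saved as a script (say `distribute_key.sh`, a hypothetical name), running it with `DRY_RUN=0` performs the real copy and prompts for each host's password once.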

5. Download the packages

You can download them from the official sites, or from the Tsinghua University (TUNA) mirror.

5.1 Prometheus download URL

wget https://mirrors.tuna.tsinghua.edu.cn/github-release/prometheus/prometheus/LatestRelease/prometheus-2.49.1.linux-amd64.tar.gz

5.2 Exporter download URL

wget https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz

5.3 Grafana download URL

wget https://dl.grafana.com/enterprise/release/grafana-enterprise-10.2.3.linux-amd64.tar.gz

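## Put all of these files in /tmp on host1 and rename them to:

grafana-enterprise-10.2.3.linux-amd64.tar.gz
node_exporter.tar.gz
prometheus.tar.gz

## If you keep the original file names, change the corresponding names in the playbook YAML instead. The extracted directory names contain the version number, so the paths in the playbook must also match the versions you downloaded.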
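Downloading and renaming can be done in one step with `wget -O`. The sketch below reuses the three URLs from this section; the `fetch_as` helper and the `RUN_DOWNLOADS` guard are names invented for this example, and it assumes the URLs are still reachable.

```shell
#!/bin/sh
# Hypothetical helper: download URL ($1) and store it as /tmp/NAME ($2).
fetch_as() {
    wget -O "/tmp/$2" "$1"
}

# Guarded so that loading this file does not touch the network by accident.
if [ "${RUN_DOWNLOADS:-0}" = "1" ]; then
    fetch_as "https://mirrors.tuna.tsinghua.edu.cn/github-release/prometheus/prometheus/LatestRelease/prometheus-2.49.1.linux-amd64.tar.gz" prometheus.tar.gz
    fetch_as "https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz" node_exporter.tar.gz
    fetch_as "https://dl.grafana.com/enterprise/release/grafana-enterprise-10.2.3.linux-amd64.tar.gz" grafana-enterprise-10.2.3.linux-amd64.tar.gz
fi
```

Run it with `RUN_DOWNLOADS=1` on host1 to fetch all three tarballs under the names the playbook expects.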

6. Create the configuration files Ansible needs

[root@host1 ~]# vim host.txt

[prometheus_node]
192.168.1.11

[all]
192.168.1.11
192.168.1.12

[prometheus_grafana]
192.168.1.12

[root@host1 ~]# vim prometheus.yml.j2

global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets: []

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets:
          - "192.168.1.11:9090"

  - job_name: "node_exporter"
    static_configs:
      - targets:
          - "192.168.1.11:9100"
          - "192.168.1.12:9100"

### If you have other exporters, add them below the existing entries.
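When the host list grows, typing target lines by hand gets error-prone. As a rough sketch, the helper below prints one target line per host for a given port, indented the way a conventionally laid-out prometheus.yml nests items under `- targets:`. The function name `target_lines` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print one YAML list item per host for a given port.
# $1 = port, remaining arguments = host addresses.
target_lines() {
    port="$1"
    shift
    for host in "$@"; do
        printf '          - "%s:%s"\n' "$host" "$port"
    done
}

# node_exporter listens on port 9100 by default.
target_lines 9100 192.168.1.11 192.168.1.12
```

The output can be pasted under the `targets:` key of the matching job.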

7. Write the Ansible playbook

[root@host1 ~]# vim prometheus.yaml

---
- name: Install and configure Prometheus and Node Exporter
  hosts: all
  become: yes
  tasks:
    - name: Create a user for Prometheus
      user:
        name: prometheus
        shell: /sbin/nologin

    - name: Create directories for Prometheus
      file:
        path: "{{ item }}"
        state: directory
        owner: prometheus
        group: prometheus
        mode: '0755'
      with_items:
        - /etc/prometheus
        - /var/lib/prometheus

- name: Install Prometheus on the Prometheus node
  hosts: prometheus_node
  become: yes
  vars:
    # Must match the version of the tarball downloaded in section 5.
    prometheus_dir: /tmp/prometheus-2.49.1.linux-amd64
  tasks:
    - name: Extract Prometheus binary
      unarchive:
        src: /tmp/prometheus.tar.gz
        dest: /tmp
        remote_src: yes

    - name: Move Prometheus binaries to the proper location
      shell: |
        mv {{ prometheus_dir }}/prometheus /usr/local/bin/
        mv {{ prometheus_dir }}/promtool /usr/local/bin/

    - name: Copy Prometheus configuration files
      ansible.builtin.copy:
        src: "{{ item }}"
        dest: /etc/prometheus/
        owner: prometheus
        group: prometheus
        remote_src: yes
      loop:
        - "{{ prometheus_dir }}/consoles"
        - "{{ prometheus_dir }}/console_libraries"
        - "{{ prometheus_dir }}/prometheus.yml"

    - name: Create the Prometheus systemd unit
      copy:
        content: |
          [Unit]
          Description=Prometheus
          Wants=network-online.target
          After=network-online.target

          [Service]
          User=prometheus
          ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus
          Restart=always

          [Install]
          WantedBy=multi-user.target
        dest: /etc/systemd/system/prometheus.service
        owner: root
        group: root
        mode: '0644'

    - name: Reload systemd to pick up Prometheus service
      systemd:
        daemon_reload: yes

    - name: Enable Prometheus service
      systemd:
        name: prometheus
        enabled: yes

    - name: Start Prometheus service
      systemd:
        name: prometheus
        state: started

- name: Install Node Exporter on all nodes
  hosts: all
  become: yes
  vars:
    # Must match the version of the tarball downloaded in section 5.
    node_exporter_dir: /tmp/node_exporter-1.6.0.linux-amd64
  tasks:
    - name: Copy node_exporter.tar.gz to target host
      ansible.builtin.copy:
        src: /tmp/node_exporter.tar.gz
        dest: /tmp/node_exporter.tar.gz

    - name: Extract Node Exporter binary
      unarchive:
        src: /tmp/node_exporter.tar.gz
        dest: /tmp
        remote_src: yes

    - name: Move Node Exporter binary to the proper location
      command: mv {{ node_exporter_dir }}/node_exporter /usr/local/bin/

    - name: Create the Node Exporter systemd unit
      copy:
        content: |
          [Unit]
          Description=Node Exporter
          Wants=network-online.target
          After=network-online.target

          [Service]
          User=root
          ExecStart=/usr/local/bin/node_exporter
          Restart=always

          [Install]
          WantedBy=multi-user.target
        dest: /etc/systemd/system/node_exporter.service
        owner: root
        group: root
        mode: '0644'

    - name: Reload systemd to pick up Node Exporter service
      systemd:
        daemon_reload: yes

    - name: Enable Node Exporter service
      systemd:
        name: node_exporter
        enabled: yes

    - name: Start Node Exporter service
      systemd:
        name: node_exporter
        state: started

- name: Install and configure Grafana Enterprise
  hosts: prometheus_grafana
  become: yes
  tasks:
    - name: Copy grafana-enterprise-10.2.3.linux-amd64.tar.gz to target host
      ansible.builtin.copy:
        src: /tmp/grafana-enterprise-10.2.3.linux-amd64.tar.gz
        dest: /tmp/grafana-enterprise-10.2.3.linux-amd64.tar.gz

    - name: Extract Grafana Enterprise tarball
      ansible.builtin.unarchive:
        src: /tmp/grafana-enterprise-10.2.3.linux-amd64.tar.gz
        dest: /usr/local/
        # Use the copy that the previous task placed on the target host.
        remote_src: yes
        # Skip on reruns, once the directory has been renamed below.
        creates: /usr/local/grafana

    - name: Rename Grafana directory
      ansible.builtin.command:
        argv:
          - mv
          - /usr/local/grafana-v10.2.3
          - /usr/local/grafana
        creates: /usr/local/grafana

    - name: Create Grafana systemd service file
      ansible.builtin.copy:
        content: |
          [Unit]
          Description=Grafana instance
          After=network.target

          [Service]
          Type=simple
          WorkingDirectory=/usr/local/grafana/
          ExecStart=/usr/local/grafana/bin/grafana-server
          Restart=always

          [Install]
          WantedBy=multi-user.target
        dest: /etc/systemd/system/grafana.service
      notify:
        - restart grafana

  handlers:
    - name: restart grafana
      ansible.builtin.systemd:
        name: grafana
        state: restarted
        # Also start Grafana at boot and make systemd re-read the unit file.
        enabled: yes
        daemon_reload: yes

- name: Backup and Modify Prometheus configuration
  hosts: prometheus_node
  become: yes
  tasks:
    - name: Backup original prometheus.yml
      ansible.builtin.copy:
        src: /etc/prometheus/prometheus.yml
        dest: /etc/prometheus/prometheus.yml.bak
        # The source file lives on the target host, not on the control node.
        remote_src: yes
        # Keep the first backup; do not overwrite it on later runs.
        force: no
      register: backup_result

    - name: Ensure backup completed successfully
      assert:
        that:
          # Checking "changed" would fail on reruns, when the backup already exists.
          - backup_result is succeeded
        fail_msg: "Failed to backup prometheus.yml"
        success_msg: "Backup of prometheus.yml completed successfully"

    - name: Replace prometheus.yml configuration
      ansible.builtin.template:
        src: /root/prometheus.yml.j2
        dest: /etc/prometheus/prometheus.yml
      notify: restart prometheus

  handlers:
    - name: restart prometheus
      ansible.builtin.systemd:
        name: prometheus
        state: restarted

[root@host1 ~]# ansible-playbook -i host.txt prometheus.yaml
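Before a real run it can help to confirm that every host is reachable and that the playbook parses. The sketch below chains Ansible's ping module with `ansible-playbook --syntax-check`; the wrapper name `preflight` is made up for this example, and the file names are the ones used in the command above.

```shell
#!/bin/sh
# Hypothetical wrapper: stop at the first failing step.
# $1 = inventory file, $2 = playbook file.
preflight() {
    # Every host in the inventory must answer over SSH...
    ansible all -i "$1" -m ping &&
    # ...and the playbook must parse without errors.
    ansible-playbook -i "$1" "$2" --syntax-check
}

# Usage on host1: preflight host.txt prometheus.yaml
```

If the ping step fails for a host, recheck the passwordless SSH setup from section 4 before going further.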

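8. Verify the result

[root@host1 ~]# netstat -antup | grep 9100
[root@host1 ~]# netstat -antup | grep 9090

Both commands should show a process in the LISTEN state: node_exporter on port 9100 and Prometheus on port 9090.

Then open http://192.168.1.11:9090 (Prometheus) and http://192.168.1.12:3000 (Grafana) in a browser.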
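The same checks can be done from the command line with curl. The sketch below assumes the default ports and uses the health endpoints that Prometheus (`/-/healthy`) and Grafana (`/api/health`) provide; the `check_url` helper and the `RUN_CHECKS` guard are names invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: report whether an HTTP endpoint answers successfully.
# curl -f turns HTTP errors (4xx/5xx) into a non-zero exit status.
check_url() {
    if curl -fsS --max-time 3 -o /dev/null "$1"; then
        echo "OK   $1"
    else
        echo "FAIL $1"
    fi
}

# Guarded so the checks only run when asked for (RUN_CHECKS=1).
if [ "${RUN_CHECKS:-0}" = "1" ]; then
    check_url "http://192.168.1.11:9090/-/healthy"   # Prometheus
    check_url "http://192.168.1.11:9100/metrics"     # node_exporter on host1
    check_url "http://192.168.1.12:9100/metrics"     # node_exporter on host2
    check_url "http://192.168.1.12:3000/api/health"  # Grafana
fi
```

A FAIL line points at the service to inspect with `systemctl status` on the affected host.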
