Setting up Prometheus monitoring (installed with Ansible, very detailed)

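Contents

1. Component overview

2. Install the batch deployment tool Ansible

3. Target servers

4. Set up passwordless SSH between the servers

5. Download the installation packages

5.1 Prometheus download address

5.2 exporter download address

5.3 Grafana download address

6. Edit the configuration files Ansible needs

7. Write the Ansible playbook

8. Verify the result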


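Today I'd like to share how to set up Prometheus. A Prometheus monitoring setup needs three components: Prometheus, Grafana, and an exporter. If you also want alerting, you need the Alertmanager component as well. So far I have tested this on Kylin V10, CentOS 7, Ubuntu 18, and Ubuntu 20, and it runs on all of them. Where it doesn't run out of the box, a small tweak gets it working, and you can message me privately if you get stuck. Nothing here runs on Docker, so it should work in most environments. If you follow my steps and it still doesn't work, you can hit me, hahaha.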

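### 1. Component overview

Prometheus:

Function: Prometheus is an open-source system monitoring and alerting toolkit, originally developed at SoundCloud. It collects and stores time series data (metrics) from systems and services and provides a powerful query language (PromQL) for analyzing that data. Its multi-dimensional data model identifies each time series by a metric name and key/value labels, which suits dynamic service discovery.

Exporter:

Function: An exporter collects metrics from an existing system or service and converts them into the Prometheus format. Exporters can be officially supported or developed by the community or third parties, and they exist for many kinds of systems (databases, web servers, message brokers, and so on). An exporter exposes an HTTP endpoint that Prometheus scrapes at regular intervals, storing the metrics it finds there.

Grafana:

Function: Grafana is an open-source data visualization and monitoring platform for displaying and analyzing metrics from Prometheus or other data sources. It offers rich chart and dashboard editing, so you can build dashboards tailored to your needs and combine data from several data sources. Besides charts, Grafana also supports alerting and can send notifications when configured threshold conditions are met.

Alertmanager:

Alertmanager sends alert notifications to the configured receivers, such as email, Slack, or PagerDuty, according to its routing rules.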
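To make the exporter's role more concrete, here is a minimal sketch of the text format an exporter serves and of how a single value can be read out of it. The sample values are made up, and the helper names `sample_metrics` and `metric_value` are hypothetical; `node_load1` and `node_memory_MemAvailable_bytes` are metrics that node_exporter exposes on Linux.

```shell
# Sample of the text format an exporter serves at /metrics (values are made up).
sample_metrics() {
  cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_memory_MemAvailable_bytes Memory information field MemAvailable_bytes.
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 2.048e+09
EOF
}

# Print the value of one metric that has no labels.
# Comment lines start with "#", so their first field never matches the name.
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

sample_metrics | metric_value node_load1
```

Once node_exporter is running, the same helper should work on live data, for example `curl -s http://192.168.1.11:9100/metrics | metric_value node_load1`.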

### 2. Install the batch deployment tool Ansible
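No install command is listed for this step, so here is a sketch, assuming Ansible is installed on host1 from the distribution's package repositories. The helper names `ansible_install_cmd` and `detect_pkg_manager` are hypothetical. The script only prints the command rather than running it. On CentOS 7 the `ansible` package comes from EPEL; on Kylin V10 the package source may differ, so treat the `yum` line as an assumption there.

```shell
# Print (not run) the command that would install Ansible for a given package manager.
ansible_install_cmd() {
  case "$1" in
    apt-get) echo "apt-get update && apt-get install -y ansible" ;;
    yum)     echo "yum install -y epel-release && yum install -y ansible" ;;
    *)       echo "unsupported package manager: $1" >&2; return 1 ;;
  esac
}

# Print the name of the first supported package manager found on this host.
detect_pkg_manager() {
  for pm in apt-get yum; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  return 1
}

if pm=$(detect_pkg_manager); then
  echo "Run as root: $(ansible_install_cmd "$pm")"
else
  echo "No supported package manager found; install Ansible another way."
fi
```

After installing, `ansible --version` should print the installed version.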

### 3. Target servers

| Hostname | Host IP | Deployed services |
|----------|--------------|----------------------|
| host1 | 192.168.1.11 | exporter, prometheus |
| host2 | 192.168.1.12 | exporter, grafana |

### 4. Set up passwordless SSH between the servers

The Prometheus host (host1) needs passwordless SSH access to every server, including itself.

    [root@host1 ~]# ssh-keygen -t rsa -b 4096
    [root@host1 ~]# ssh-copy-id 192.168.1.11
    [root@host1 ~]# ssh-copy-id 192.168.1.12

### 5. Download the installation packages

You can download them from the official sites or from the Tsinghua (TUNA) mirror.

#### 5.1 Prometheus download address

    wget https://mirrors.tuna.tsinghua.edu.cn/github-release/prometheus/prometheus/LatestRelease/prometheus-2.49.1.linux-amd64.tar.gz

#### 5.2 exporter download address

    wget https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz

#### 5.3 Grafana download address

    wget https://dl.grafana.com/enterprise/release/grafana-enterprise-10.2.3.linux-amd64.tar.gz

Put all of these files in /tmp on host1 and rename them to:

    grafana-enterprise-10.2.3.linux-amd64.tar.gz
    node_exporter.tar.gz
    prometheus.tar.gz

If you keep the original file names, change the corresponding names in the YAML file instead.

### 6. Edit the configuration files Ansible needs

Create the inventory file. It is named host.txt here so that it matches the `-i host.txt` argument used when running the playbook.

    [root@host1 ~]# vim host.txt

    [prometheus_node]
    192.168.1.11

    [all]
    192.168.1.11
    192.168.1.12

    [prometheus_grafana]
    192.168.1.12
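Before running anything against the inventory, it can help to confirm which hosts each group contains. With Ansible installed, `ansible -i host.txt all --list-hosts` reports this. The sketch below does the same with plain `awk` on an embedded copy of the inventory, so it runs without Ansible. The helper names `sample_inventory` and `hosts_in_group` are hypothetical, and the parser only handles the simple one-host-per-line layout used here.

```shell
# Embedded copy of the inventory from this section.
sample_inventory() {
  cat <<'EOF'
[prometheus_node]
192.168.1.11

[all]
192.168.1.11
192.168.1.12

[prometheus_grafana]
192.168.1.12
EOF
}

# Print the hosts listed under one [group] header of an INI-style inventory.
hosts_in_group() {
  awk -v group="[$1]" '
    /^\[/ { in_group = ($0 == group); next }  # a header line switches the current group
    in_group && NF { print $1 }               # non-empty lines inside the wanted group
  '
}

sample_inventory | hosts_in_group all
```

Run as is, this prints the two addresses in the `all` group. To check the real file, use `hosts_in_group prometheus_grafana < host.txt`.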

Then create the Prometheus configuration template:

    [root@host1 ~]# vim prometheus.yml.j2

    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    alerting:
      alertmanagers:
        - static_configs:
            - targets: []

    scrape_configs:
      - job_name: "prometheus"
        static_configs:
          - targets:
              - "192.168.1.11:9090"

      - job_name: "node_exporter"
        static_configs:
          - targets:
              - "192.168.1.11:9100"
              - "192.168.1.12:9100"

If you have other exporters, add them below the existing entries.
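When more exporters are added, the extra scrape job follows the same shape as the `node_exporter` job. As a sketch, the hypothetical helper `render_scrape_job` below prints such a job for a given job name, port, and list of hosts, using the same indentation as the template; the output is meant to be pasted under `scrape_configs:`.

```shell
# Print one scrape job in prometheus.yml syntax.
# Arguments: job name, port, then one or more hosts.
render_scrape_job() {
  job="$1"
  port="$2"
  shift 2
  printf '  - job_name: "%s"\n' "$job"
  printf '    static_configs:\n'
  printf '      - targets:\n'
  for host in "$@"; do
    printf '          - "%s:%s"\n' "$host" "$port"
  done
}

render_scrape_job node_exporter 9100 192.168.1.11 192.168.1.12
```

After the configuration is deployed, `promtool check config /etc/prometheus/prometheus.yml` can validate the file; `promtool` ships in the same tarball as Prometheus.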

### 7. Write the Ansible playbook

    [root@host1 ~]# vim prometheus.yaml

    ---
    - name: Install and configure Prometheus and Node Exporter
      hosts: all
      become: yes
      tasks:
        - name: Create a user for Prometheus
          user:
            name: prometheus
            shell: /sbin/nologin

        - name: Create directories for Prometheus
          file:
            path: "{{ item }}"
            state: directory
            owner: prometheus
            group: prometheus
            mode: '0755'
          with_items:
            - /etc/prometheus
            - /var/lib/prometheus

    - name: Install Prometheus on the Prometheus node
      hosts: prometheus_node
      become: yes
      tasks:
        - name: Extract Prometheus binary
          unarchive:
            src: /tmp/prometheus.tar.gz
            dest: /tmp
            remote_src: yes

        # The directory name must match the version downloaded in section 5.1 (2.49.1).
        - name: Move Prometheus binaries to the proper location
          shell: |
            mv /tmp/prometheus-2.49.1.linux-amd64/prometheus /usr/local/bin/
            mv /tmp/prometheus-2.49.1.linux-amd64/promtool /usr/local/bin/
          become: yes

        - name: Move Prometheus configuration files
          ansible.builtin.copy:
            src: "{{ item }}"
            dest: "/etc/prometheus/"
            owner: prometheus
            group: prometheus
            remote_src: yes
          loop:
            - "/tmp/prometheus-2.49.1.linux-amd64/consoles"
            - "/tmp/prometheus-2.49.1.linux-amd64/console_libraries"
            - "/tmp/prometheus-2.49.1.linux-amd64/prometheus.yml"

        - name: Ensure Prometheus is running as a service
          copy:
            content: |
              [Unit]
              Description=Prometheus
              Wants=network-online.target
              After=network-online.target

              [Service]
              User=prometheus
              ExecStart=/usr/local/bin/prometheus --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /var/lib/prometheus
              Restart=always

              [Install]
              WantedBy=multi-user.target
            dest: /etc/systemd/system/prometheus.service
            owner: root
            group: root
            mode: '0644'

        - name: Reload systemd to pick up Prometheus service
          command: systemctl daemon-reload

        - name: Enable Prometheus service
          systemd:
            name: prometheus
            enabled: yes

        - name: Start Prometheus service
          systemd:
            name: prometheus
            state: started

    - name: Install Node Exporter on all nodes
      hosts: all
      become: yes
      tasks:
        - name: Copy node_exporter.tar.gz to target host
          ansible.builtin.copy:
            src: /tmp/node_exporter.tar.gz
            dest: /tmp/node_exporter.tar.gz

        - name: Extract Node Exporter binary
          unarchive:
            src: /tmp/node_exporter.tar.gz
            dest: /tmp
            remote_src: yes

        # The directory name must match the version downloaded in section 5.2 (1.6.0).
        - name: Move Node Exporter binary to the proper location
          command: mv /tmp/node_exporter-1.6.0.linux-amd64/node_exporter /usr/local/bin/

        - name: Ensure Node Exporter is running as a service
          copy:
            content: |
              [Unit]
              Description=Node Exporter
              Wants=network-online.target
              After=network-online.target

              [Service]
              User=root
              ExecStart=/usr/local/bin/node_exporter
              Restart=always

              [Install]
              WantedBy=multi-user.target
            dest: /etc/systemd/system/node_exporter.service
            owner: root
            group: root
            mode: '0644'

        - name: Reload systemd to pick up Node Exporter service
          command: systemctl daemon-reload

        - name: Enable Node Exporter service
          systemd:
            name: node_exporter
            enabled: yes

        - name: Start Node Exporter service
          systemd:
            name: node_exporter
            state: started

    - name: Install and configure Grafana Enterprise
      hosts: prometheus_grafana
      become: yes
      tasks:
        - name: Copy grafana-enterprise-10.2.3.linux-amd64.tar.gz to target host
          ansible.builtin.copy:
            src: /tmp/grafana-enterprise-10.2.3.linux-amd64.tar.gz
            dest: /tmp/grafana-enterprise-10.2.3.linux-amd64.tar.gz

        - name: Extract Grafana Enterprise tarball
          ansible.builtin.unarchive:
            src: /tmp/grafana-enterprise-10.2.3.linux-amd64.tar.gz
            dest: /usr/local/
            creates: /usr/local/grafana-v10.2.3

        - name: Rename Grafana directory
          ansible.builtin.command:
            argv:
              - mv
              - /usr/local/grafana-v10.2.3
              - /usr/local/grafana
            creates: /usr/local/grafana

        - name: Create Grafana systemd service file
          ansible.builtin.copy:
            content: |
              [Unit]
              Description=Grafana instance
              After=network.target

              [Service]
              Type=simple
              WorkingDirectory=/usr/local/grafana/
              ExecStart=/usr/local/grafana/bin/grafana-server
              Restart=always

              [Install]
              WantedBy=multi-user.target
            dest: /etc/systemd/system/grafana.service
          notify:
            - restart grafana

      handlers:
        # Reload systemd so it sees the new unit file, and enable Grafana at boot.
        - name: restart grafana
          ansible.builtin.systemd:
            name: grafana
            state: restarted
            daemon_reload: yes
            enabled: yes

    - name: Backup and Modify Prometheus configuration
      hosts: prometheus_node
      become: yes
      tasks:
        # remote_src makes the backup copy the file that is already on the target host.
        - name: Backup original prometheus.yml
          ansible.builtin.copy:
            src: /etc/prometheus/prometheus.yml
            dest: /etc/prometheus/prometheus.yml.bak
            remote_src: yes
          register: backup_result
          become: yes
          become_method: sudo

        # Checking for success (not "changed") keeps repeated runs from failing here.
        - name: Ensure backup completed successfully
          assert:
            that:
              - backup_result is succeeded
            fail_msg: "Failed to backup prometheus.yml"
            success_msg: "Backup of prometheus.yml completed successfully"

        - name: Replace prometheus.yml configuration
          ansible.builtin.template:
            src: /root/prometheus.yml.j2
            dest: /etc/prometheus/prometheus.yml
          notify: restart prometheus

      handlers:
        - name: restart prometheus
          ansible.builtin.systemd:
            name: prometheus
            state: restarted
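The tasks that move files out of /tmp depend on the directory name inside each tarball, which includes the version number. If you downloaded a different version, the paths in the playbook need to change with it. As a sketch, the hypothetical helper `archive_top_dir` below prints the top-level directory of a `.tar.gz` archive so that the name can be checked before running the playbook. It assumes the archive has a single top-level directory, as the Prometheus and node_exporter release tarballs appear to.

```shell
# Print the top-level directory name inside a .tar.gz archive.
archive_top_dir() {
  tar -tzf "$1" | head -n 1 | cut -d/ -f1
}

# Report the directory each renamed tarball in /tmp would unpack to, if present.
for archive in /tmp/prometheus.tar.gz /tmp/node_exporter.tar.gz; do
  if [ -f "$archive" ]; then
    echo "$archive unpacks to: $(archive_top_dir "$archive")"
  else
    echo "$archive not found, skipping"
  fi
done
```

`ansible-playbook -i host.txt prometheus.yaml --syntax-check` is another cheap check: it parses the playbook without touching any host.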

Run the playbook:

    [root@host1 ~]# ansible-playbook -i host.txt prometheus.yaml

![](https://i-blog.csdnimg.cn/direct/7e9f4af7fbd34a97abe75ff9ad565e09.png)

### 8. Verify the result

    [root@host1 ~]# netstat -antup | grep 9100
    [root@host1 ~]# netstat -antup | grep 9090

![](https://i-blog.csdnimg.cn/direct/9b0c22965ee046cea34bfb10f9ea1d13.png)

Then open 192.168.1.11:9090 (Prometheus) and 192.168.1.12:3000 (Grafana) in a browser.

![](https://i-blog.csdnimg.cn/direct/d2810d77e4eb46e8af33be177f8c3911.png)

![](https://i-blog.csdnimg.cn/direct/3a54ea181fdc4a9d9c33f52925307647.png)
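The port check can also be done over HTTP from the command line. The sketch below lists one URL per deployed service and probes each with `curl`. The helper names `health_endpoints`, `probe`, and `check_all` are hypothetical; `/-/healthy` is Prometheus's health endpoint and `/api/health` is Grafana's. Nothing touches the network until `check_all` is called.

```shell
# Print "name url" pairs for the services deployed in this guide.
health_endpoints() {
  cat <<'EOF'
prometheus http://192.168.1.11:9090/-/healthy
node_exporter_host1 http://192.168.1.11:9100/metrics
node_exporter_host2 http://192.168.1.12:9100/metrics
grafana http://192.168.1.12:3000/api/health
EOF
}

# Probe one URL; fails on an HTTP error status (400 and above) or after 3 seconds.
probe() {
  curl --silent --fail --max-time 3 --output /dev/null "$1"
}

# Probe every endpoint and print one OK/FAILED line per service.
check_all() {
  health_endpoints | while read -r name url; do
    if probe "$url"; then
      echo "$name OK"
    else
      echo "$name FAILED"
    fi
  done
}
```

Run `check_all` on host1 after the playbook finishes. Any line marked FAILED points at the service to inspect with `systemctl status`.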
