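Series articles
Building an operations monitoring platform
Monitoring labels for the operations monitoring platform
golang_Consul: registering Prometheus monitoring targets with dynamic discovery and configuration (V1)
Prometheus operations monitoring platform: developing the script that registers monitoring metrics to Consul, and debugging custom metric collection (Part 3)
Table of contents
- Series articles
- Preface
- I. What is an Alertmanager webhook?
- II. Example: capturing raw Alertmanager alert data with golang
- III. Sample webhook code
- IV. Using the webhook
- Summary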
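Preface
In today's complex, fast-changing system environments, efficient alert management and response matter more than ever. Alertmanager, a core component of the Prometheus ecosystem, provides powerful alert handling, but its built-in notification methods do not cover every need, for example routing different alerts to different groups of recipients or to a customer's own office software. The Alertmanager webhook addresses this: it lets users connect Alertmanager alerts to the webhooks of external channels and build their own notification and handling flow. This article uses golang to send alerts to several channels (email, WeChat Work groups, and DingTalk groups).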
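I. What is an Alertmanager webhook?
Figure: webhook workflow
As the figure shows, the Alertmanager webhook acts as middleware. When Alertmanager fires an alert, it sends the alert data to the Webhook Adapter.
The Webhook Adapter converts the alert data, according to predefined rules, into a format the webhook service accepts and sends it on.
This lets users implement their own alert notification and handling flow through the webhook service.
How the webhook works
1. Receive the alert:
When Alertmanager fires an alert, it sends the alert data to the Webhook Adapter. The data includes the condition that triggered the alert, the trigger time, the severity, and so on.
2. Convert the alert:
The Webhook Adapter converts the alert data, according to predefined rules, into a format the webhook service accepts. This includes converting it to JSON and adding the required request headers.
3. Send the alert:
The Webhook Adapter sends the converted alert to the webhook service. The webhook service can be anything that accepts HTTP requests, such as Slack or DingTalk.
4. Handle the alert:
The webhook service processes the alert according to its own features. For example, Slack can deliver it as a message to a given user or channel, and DingTalk can show it as a card in a group chat.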
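Alertmanager supports several notification methods
In other words, it supports several kinds of receiver:
webhook: a web callback, that is, pushing to the API endpoint of an HTTP service
wechat: sent through the WeChat API
sms: SMS (typically through a webhook or an SMS gateway)
email: email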
II. Example: capturing raw Alertmanager alert data with golang
1. Write the golang code
go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"

	"github.com/gin-gonic/gin"
)

func f1(c *gin.Context) {
	// Reply "ok" to the client once the handler returns
	defer c.String(http.StatusOK, "ok\n")
	// Print the request method
	fmt.Println("method:", c.Request.Method)
	// Read the request body (io.ReadAll replaces the deprecated ioutil.ReadAll)
	body, err := io.ReadAll(c.Request.Body)
	if err != nil {
		fmt.Printf("read body err: %v\n", err)
		return
	}
	fmt.Println("json: ", string(body))
	// Restore the body after reading it so later handlers can read the same content
	c.Request.Body = io.NopCloser(bytes.NewBuffer(body))
}

func main() {
	r := gin.Default()
	// Register the route; POST is used here, change it to another HTTP method if needed
	r.POST("/", f1)
	// Start the server
	err := r.Run("127.0.0.1:8888")
	if err != nil {
		fmt.Printf("could not start server: %v\n", err)
	}
}
2. Modify alertmanager.yml
yaml
global:
  resolve_timeout: 5m
  http_config:
    basic_auth:
      username: admin
      password: "QAZXCFRF"
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 2m
  repeat_interval: 5m
  receiver: 'webhook'
receivers:
  - name: 'webhook'
    webhook_configs:
      - url: 'http://127.0.0.1:8888/' # address and port of the golang program above
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
3. Start the golang program and capture the raw JSON
shell
[root@python2 alertmanagerTest]# go run main.go
# Paste the output into an online JSON formatter; the formatted result is shown below
{
  "receiver": "web\\.hook",
  "status": "firing",
  "alerts": [
    {
      "status": "firing",
      "labels": {
        "alertname": "HostDisk",
        "device": "dm-0",
        "fstype": "xfs",
        "host": "python2",
        "instance": "192.168.56.131:9273",
        "ip": "192.168.56.131",
        "job": "consul-prometheus",
        "mode": "rw",
        "path": "/",
        "port": "9273",
        "serverity": "middle"
      },
      "annotations": {
        "description": "192.168.56.131:9273, mountpoint /: Disk Usage is 85.36014882971507, above 80%",
        "summary": "192.168.56.131:9273: High Disk Usage Detected"
      },
      "startsAt": "2024-10-19T09:16:28.627Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://python2:9090/graph?g0.expr=disk_used_percent+%3E+80\u0026g0.tab=1",
      "fingerprint": "7d93a04c3406308a"
    }
  ],
  "groupLabels": {
    "alertname": "HostDisk"
  },
  "commonLabels": {
    "alertname": "HostDisk",
    "device": "dm-0",
    "fstype": "xfs",
    "host": "python2",
    "instance": "192.168.56.131:9273",
    "ip": "192.168.56.131",
    "job": "consul-prometheus",
    "mode": "rw",
    "path": "/",
    "port": "9273",
    "serverity": "middle"
  },
  "commonAnnotations": {
    "description": "192.168.56.131:9273, mountpoint /: Disk Usage is 85.36014882971507, above 80%",
    "summary": "192.168.56.131:9273: High Disk Usage Detected"
  },
  "externalURL": "http://python2:9093",
  "version": "4",
  "groupKey": "{}:{alertname=\"HostDisk\"}",
  "truncatedAlerts": 0
}
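III. Sample webhook code
1. Technical dependencies for webhook development
1. golang 1.20.0
2. vscode
3. gin framework + template
4. logrus logging framework
5. redis (single node)
6. Official documentation for WeChat Work, email, and DingTalk
7. alertmanager
2. Overall code directory structure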
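3. Define the configuration file
This file holds the settings for DingTalk, WeChat Work, email, logging, and so on
settings.yaml looks like this (example):
yaml
DingDing:
  enabled: false # whether DingTalk receives alerts; the same applies to the sections below
  dingdingKey: "xxxx" # token of the robot in the DingTalk group
QyWeChat:
  enabled: true
  qywechatKey: "xxx"
Email:
  enabled: true
  smtp_host: "smtp.163.com"
  smtp_port: 25
  smtp_from: "xxxxx@163.com" # sender
  smtp_password: "xxxx" # authorization code, not the login password
  smtp_to: "sss@qq.com" # recipient; more than one is allowed
Redis:
  redisServer: "192.168.56.131" # required
  redisPort: 7001 # optional; defaults to 6379 when empty
  redisPassword: ""
System:
  host: 0.0.0.0
  port: 19093
  env: release
  # log settings
  logFileDir: /opt/monitor/alertmanagerWebhook/ # optional; defaults to the program's working directory when empty
  logFilePath: alertmanager-webhook.log # required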
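The article does not show the `config.Config` struct that this file is loaded into. Judging only from how `global.Config` is used in the code that follows (`System.Addr()`, `Redis.RedisAddr()`, `QyWeChat.Enabled`), it might look roughly like the sketch below. The field layout, the yaml tags, and the bodies of the two helper methods are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// System mirrors the System section of settings.yaml (assumed layout).
type System struct {
	Host        string `yaml:"host"`
	Port        int    `yaml:"port"`
	Env         string `yaml:"env"`
	LogFileDir  string `yaml:"logFileDir"`
	LogFilePath string `yaml:"logFilePath"`
}

// Addr returns the listen address as host:port.
func (s System) Addr() string {
	return net.JoinHostPort(s.Host, strconv.Itoa(s.Port))
}

// Redis mirrors the Redis section of settings.yaml (assumed layout).
type Redis struct {
	RedisServer   string `yaml:"redisServer"`
	RedisPort     int    `yaml:"redisPort"`
	RedisPassword string `yaml:"redisPassword"`
}

// RedisAddr returns server:port and falls back to 6379 when no port is set,
// matching the comment in settings.yaml.
func (r Redis) RedisAddr() string {
	port := r.RedisPort
	if port == 0 {
		port = 6379
	}
	return net.JoinHostPort(r.RedisServer, strconv.Itoa(port))
}

// QyWeChat mirrors the QyWeChat section of settings.yaml.
type QyWeChat struct {
	Enabled     bool   `yaml:"enabled"`
	QywechatKey string `yaml:"qywechatKey"`
}

// Config is the top-level structure; only the sections used in this sketch are listed.
type Config struct {
	QyWeChat QyWeChat `yaml:"QyWeChat"`
	Redis    Redis    `yaml:"Redis"`
	System   System   `yaml:"System"`
}

func main() {
	cfg := Config{
		Redis:  Redis{RedisServer: "192.168.56.131"},
		System: System{Host: "0.0.0.0", Port: 19093},
	}
	fmt.Println(cfg.System.Addr())
	fmt.Println(cfg.Redis.RedisAddr())
}
```

Explicit tags are used because the keys in settings.yaml are mixed-case.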
4. Define global variables and related structs
The global directory defines the global variables
The code is as follows (example):
go
package global

import (
	"alertmanagerWebhook/config"

	"github.com/sirupsen/logrus"
)

var (
	Config *config.Config
	Logger *logrus.Logger
)
The config directory holds the struct definitions the code needs; the message struct and the WeChat Work structs are shown here
go
package config

// Message defines the two message types, firing and resolved, both in markdown format
type Message struct {
	QywechatMessage QyWechatMessage
}

type QyWechatMessage struct {
	MarkdownFiring   *QyWeChatMarkdown
	MarkdownResolved *QyWeChatMarkdown
}

func NewMessage(markdownFiring, markdownResolved *QyWeChatMarkdown) *Message {
	return &Message{
		QywechatMessage: QyWechatMessage{
			MarkdownFiring:   markdownFiring,
			MarkdownResolved: markdownResolved,
		},
	}
}

// WeChat Work structs (a separate file in the same package)
package config

type QyWeChat struct {
	Enabled     bool   `yaml:"enabled"`
	QywechatKey string `yaml:"qywechatKey"`
}

type QyWeChatMarkdown struct {
	MsgType  string   `json:"msgtype"`
	Markdown Markdown `json:"markdown"`
}

type Markdown struct {
	Content string `json:"content"`
}

func NewQyWeChatMarkdown(content string) *QyWeChatMarkdown {
	return &QyWeChatMarkdown{
		MsgType: "markdown",
		Markdown: Markdown{
			Content: content,
		},
	}
}

// Structs for the raw Alertmanager alert data (a separate file in the same package).
// The fields come from the raw alert payload; see the capture code in section II.
package config

import "time"

type Alert struct {
	Status      string              `json:"status"`
	Labels      ReqAlertLabel       `json:"labels"`
	Annotations ReqAlertAnnotations `json:"annotations"`
	StartsAt    time.Time           `json:"startsAt"`
	EndsAt      time.Time           `json:"endsAt"`
	StartTime   string              `json:"startTime"`
	EndTime     string              `json:"endTime"`
	Fingerprint string              `json:"fingerprint"`
	Count       int                 `json:"count"`
}

type ReqGroupLabels struct {
	Alertname string `json:"alertname"`
}

type ReqCommonLabels struct {
	Alertname string `json:"alertname"`
	Instance  string `json:"instance"`
	Job       string `json:"job"`
	Severity  string `json:"severity"`
}

type ReqCommonAnnotations struct {
	Description string `json:"description"`
	Summary     string `json:"summary"`
}

type ReqAlertLabel struct {
	Alertname string `json:"alertname"`
	Instance  string `json:"instance"`
	Job       string `json:"job"`
	Severity  string `json:"severity"`
}

type ReqAlertAnnotations struct {
	Description string `json:"description"`
	Summary     string `json:"summary"`
}

type Notification struct {
	Version           string               `json:"version"`
	GroupKey          string               `json:"groupKey"`
	Status            string               `json:"status"`
	Receiver          string               `json:"receiver"`
	GroupLabels       ReqGroupLabels       `json:"groupLabels"`
	CommonLabels      ReqCommonLabels      `json:"commonLabels"`
	ExternalURL       string               `json:"externalURL"`
	Alerts            []Alert              `json:"alerts"`
	CommonAnnotations ReqCommonAnnotations `json:"commonAnnotations"`
}
5. Program entry point
cmd/main.go is the entry point of the program
go
package main

import (
	"context"
	"errors"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"

	"alertmanagerWebhook/core"
	"alertmanagerWebhook/global"
	"alertmanagerWebhook/routers"

	"github.com/gin-gonic/gin"
	"github.com/sirupsen/logrus"
)

func main() {
	// Initialize configuration from the settings.yaml file
	core.InitYaml()
	// Set the log file location
	logFilePath := global.Config.System.LogFilePath
	var closeLogger func() // function that closes the log file
	global.Logger, closeLogger = setupLogrus(logFilePath)
	defer closeLogger() // make sure the log file is closed
	// Set the gin run mode and send gin's own log output to logrus
	gin.SetMode(gin.ReleaseMode)
	gin.DefaultWriter = global.Logger.Writer()
	gin.DefaultErrorWriter = global.Logger.Writer()
	// Initialize the routes
	router := routers.InitRouterGroup()
	srv := &http.Server{
		Addr:    global.Config.System.Addr(),
		Handler: router,
	}
	// Run the server in a goroutine
	go func() {
		global.Logger.Info("Starting server...")
		if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
			global.Logger.Fatalf("Listen: %s\n", err)
		}
	}()
	// Block until a signal arrives, then shut down gracefully
	handleShutdown(srv)
}

// setupLogrus configures log output and returns the logger plus a function that closes the file
func setupLogrus(path string) (*logrus.Logger, func()) {
	logger := logrus.New()
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0666)
	if err != nil {
		logrus.Fatalf("Error opening log file: %v", err)
	}
	logger.SetOutput(f)
	logger.SetFormatter(&logrus.TextFormatter{
		FullTimestamp:   true,
		TimestampFormat: "2006-01-02 15:04:05",
	})
	logger.SetReportCaller(true)
	return logger, func() { f.Close() }
}

// handleShutdown waits for SIGINT or SIGTERM and then stops the server
func handleShutdown(srv *http.Server) {
	quit := make(chan os.Signal, 1)
	// os.Interrupt is the same signal as syscall.SIGINT, so it is listed once
	signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
	<-quit
	global.Logger.Info("Shutting down server...")
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		global.Logger.Fatal("Server forced to shutdown:", err)
	}
	global.Logger.Info("Server exiting")
}
6. Initialize the configuration file and redis
The core directory holds the initialization code
go
// Initialize redis
package core

import (
	"fmt"

	"alertmanagerWebhook/global"

	"github.com/gomodule/redigo/redis"
)

func ConnRedis() (redis.Conn, error) {
	c, err := redis.Dial("tcp", global.Config.Redis.RedisAddr())
	if err != nil {
		return nil, fmt.Errorf("connect redis failed: %w", err)
	}
	if global.Config.Redis.RedisPassword != "" {
		if _, err = c.Do("AUTH", global.Config.Redis.RedisPassword); err != nil {
			c.Close()
			return nil, fmt.Errorf("redis password failed: %w", err)
		}
	}
	return c, nil
}
go
// Initialize the configuration file
package core

import (
	"log"
	"os"
	"path/filepath"

	"alertmanagerWebhook/config"
	"alertmanagerWebhook/global"

	"gopkg.in/yaml.v2"
)

func InitYaml() {
	// Build the path of the configuration file from the working directory
	dir, err := os.Getwd()
	if err != nil {
		log.Fatalln("Error getting current directory:", err)
	}
	ConfigFile := filepath.Join(dir, "settings.yaml")
	c := &config.Config{}
	// Read the yaml file
	yamlConf, err := os.ReadFile(ConfigFile)
	if err != nil {
		log.Fatalf("get yamlconf error: %s", err)
	}
	// Deserialize the yaml file
	err = yaml.Unmarshal(yamlConf, c)
	if err != nil {
		log.Fatalf("config init Unmarsharl: %v", err)
	}
	log.Println("config init Unmarsharl success")
	// Assign it to the global variable
	global.Config = c
}
7. Route entry code
The routers directory holds the request entry points for the external channels: DingTalk, email, and WeChat Work
This part is built on the gin framework. It defines the webhook request paths and converts the raw Alertmanager alert data it receives before sending it on. We go through it step by step.
WeChat Work is used as the example here; the other two channels are not described
go
// Main route entry
package routers

import (
	"fmt"
	"time"

	"alertmanagerWebhook/global"

	"github.com/gin-gonic/gin"
)

type RouterGroup struct {
	*gin.RouterGroup
}

func InitRouterGroup() *gin.Engine {
	gin.SetMode(global.Config.System.Env)
	router := gin.New()
	// Write the access log to logrus in a custom format. A single logger
	// middleware is registered so each request is logged only once.
	router.Use(gin.LoggerWithConfig(gin.LoggerConfig{
		Output: global.Logger.Writer(),
		Formatter: func(param gin.LogFormatterParams) string {
			// the client access log format
			return fmt.Sprintf("%s - - [%s] \"%s %s %s\" %d %s \"%s\" \"%s\"\n",
				param.ClientIP,
				param.TimeStamp.Format(time.RFC1123),
				param.Method,
				param.Path,
				param.Request.Proto,
				param.StatusCode,
				param.Latency,
				param.Request.UserAgent(),
				param.ErrorMessage,
			)
		},
	}))
	router.Use(gin.Recovery()) // recovery middleware: turns panics into 500 responses
	// Route group
	apiRouterGroup := router.Group("api")
	routerGroup := RouterGroup{apiRouterGroup}
	routerGroup.SendWeChat_Router()
	routerGroup.SendDingDing_alert_Router()
	routerGroup.SendEmail_alert_Router()
	return router
}
WeChat Work
go
package routers

import (
	"net/http"

	"alertmanagerWebhook/api"
	"alertmanagerWebhook/config"
	"alertmanagerWebhook/global"

	"github.com/gin-gonic/gin"
)

func (router RouterGroup) SendWeChat_Router() {
	router.POST("v1/wechat", func(c *gin.Context) {
		var notification config.Notification
		// Parse the JSON request body into notification
		err := c.ShouldBindJSON(&notification)
		if err != nil {
			global.Logger.Errorf("Error shouldBindJson wechat: %v\n", err)
			c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
			return
		}
		// Uncomment to print the received notification
		//log.Printf("Received notification: %+v\n", notification)
		// Call api.SendToQywechat with the notification. It writes the HTTP
		// response itself, so no second response is written here.
		api.SendToQywechat(c, notification)
	})
}
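Code walkthrough:
1. The webhook endpoint is /api/v1/wechat. After starting the program, configure this endpoint under receivers in alertmanager.yml.
2. Once that is configured and Alertmanager has been restarted, Alertmanager sends a POST request to the configured endpoint whenever it receives an alert from Prometheus, delivering the raw Alertmanager alert data to the webhook. The incoming data can be inspected by printing it with log.Printf.
3. When api.SendToQywechat(c, notification) runs, the raw JSON data is converted into the alert message format the channel supports, which is markdown in this case.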
8. Convert the raw data
This is the most important part. The data received is JSON and contains many fields the alert message does not need, so it has to be cleaned and converted: the required fields are extracted and assembled into an alert message according to the alert template.
go
package common

import (
	"bytes"
	"os"
	"path/filepath"
	"text/template"
	"time"

	"alertmanagerWebhook/config"
	"alertmanagerWebhook/core"
	"alertmanagerWebhook/global"

	"github.com/gomodule/redigo/redis"
)

func TransformToMarkdown(notification config.Notification) (message *config.Message, err error) {
	c, err := core.ConnRedis()
	if err != nil {
		global.Logger.Errorf("Failed to connect to Redis: %v\n", err)
		return
	}
	defer c.Close() // close the connection when the function returns

	var (
		notificationFiring   config.Notification
		notificationResolved config.Notification
		cstZone              = time.FixedZone("CST", 8*3600)
		bufferFiring         bytes.Buffer
		bufferResolved       bytes.Buffer
	)

	dir, err := os.Getwd()
	if err != nil {
		global.Logger.Errorf("Error getting current directory: %v\n", err)
		return
	}
	// Use filepath.Join to create the correct file path
	templatePath := filepath.Join(dir, "template", "alert.tmpl")
	// Load the template once and reuse it for every alert
	tmpl, err := template.ParseFiles(templatePath)
	if err != nil {
		global.Logger.Errorln("template parse error: ", err)
		return nil, err
	}

	// Split the alerts into firing and resolved
	for _, alert := range notification.Alerts {
		if alert.Status == "firing" {
			notificationFiring.Version = notification.Version
			notificationFiring.GroupKey = notification.GroupKey
			notificationFiring.Status = "firing"
			notificationFiring.Receiver = notification.Receiver
			notificationFiring.GroupLabels = notification.GroupLabels
			notificationFiring.CommonLabels = notification.CommonLabels
			notificationFiring.ExternalURL = notification.ExternalURL
			notificationFiring.Alerts = append(notificationFiring.Alerts, alert)
		} else if alert.Status == "resolved" {
			notificationResolved.Version = notification.Version
			notificationResolved.GroupKey = notification.GroupKey
			notificationResolved.Status = "resolved"
			notificationResolved.Receiver = notification.Receiver
			notificationResolved.GroupLabels = notification.GroupLabels
			notificationResolved.CommonLabels = notification.CommonLabels
			notificationResolved.ExternalURL = notification.ExternalURL
			notificationResolved.Alerts = append(notificationResolved.Alerts, alert)
		}
	}

	// Templated body for firing alerts (the loop is a no-op when there are none)
	for _, alert := range notificationFiring.Alerts {
		alert.StartTime = alert.StartsAt.In(cstZone).Format("2006-01-02 15:04:05")
		fingerprint := alert.Fingerprint
		// Save state in Redis as a hash: the key is the fingerprint, the field is startTime
		if _, err = c.Do("HSET", fingerprint, "startTime", alert.StartTime); err != nil {
			global.Logger.Errorln(err)
			return nil, err
		}
		// HINCRBY adds the given increment to a field of the hash
		if _, err = c.Do("HINCRBY", fingerprint, "count", 1); err != nil {
			global.Logger.Errorln(err)
			return nil, err
		}
		count, err := redis.Int(c.Do("HGET", fingerprint, "count"))
		if err != nil {
			global.Logger.Errorln("get alert count error: ", err)
		}
		alert.Count = count // number of times this alert has fired, tracked in redis
		// Fall back to the summary when the description is missing or empty
		if alert.Annotations.Description == "" {
			alert.Annotations.Description = alert.Annotations.Summary
		}
		// Default the severity to warning when it is empty
		if alert.Labels.Severity == "" {
			alert.Labels.Severity = "warning"
		}
		// Execute the template and write to bufferFiring
		if err := tmpl.Execute(&bufferFiring, alert); err != nil {
			global.Logger.Errorln("template execute error: ", err)
			return nil, err
		}
		bufferFiring.WriteString("\n") // newline separates one alert from the next
	}

	// Templated body for resolved alerts
	for _, alert := range notificationResolved.Alerts {
		alert.StartTime = alert.StartsAt.In(cstZone).Format("2006-01-02 15:04:05")
		alert.EndTime = alert.EndsAt.In(cstZone).Format("2006-01-02 15:04:05")
		// Fall back to the summary when the description is missing or empty
		if alert.Annotations.Description == "" {
			alert.Annotations.Description = alert.Annotations.Summary
		}
		// Execute the template and write to bufferResolved
		if err := tmpl.Execute(&bufferResolved, alert); err != nil {
			global.Logger.Errorln("template execute error: ", err)
			return nil, err
		}
		bufferResolved.WriteString("\n") // newline separates one alert from the next
		// Once resolved, delete the corresponding key from redis
		if _, err := c.Do("DEL", alert.Fingerprint); err != nil {
			global.Logger.Errorln("delete key error: ", err)
		}
	}

	// Convert to the format WeChat Work understands
	var markdownFiring, markdownResolved *config.QyWeChatMarkdown
	var title string
	title = "# <font color=\"red\">Alert firing</font>\n"
	if bufferFiring.String() != "" {
		markdownFiring = config.NewQyWeChatMarkdown(title + bufferFiring.String())
	} else {
		markdownFiring = config.NewQyWeChatMarkdown("")
	}
	title = "# <font color=\"green\">Alert resolved</font>\n"
	if bufferResolved.String() != "" {
		markdownResolved = config.NewQyWeChatMarkdown(title + bufferResolved.String())
	} else {
		markdownResolved = config.NewQyWeChatMarkdown("")
	}
	// Wrap the WeChat Work messages
	message = config.NewMessage(markdownFiring, markdownResolved)
	//log.Printf("messages: %v\n", message.QywechatMessage.MarkdownFiring.Markdown.Content)
	global.Logger.Infof("messagesWeChatFiring: %v\n", message.QywechatMessage.MarkdownFiring.Markdown.Content)
	global.Logger.Infof("messagesWeChatResovled: %v\n", message.QywechatMessage.MarkdownResolved.Markdown.Content)
	return message, nil
}
9. Send the rebuilt message to WeChat Work
This part shows how the message is sent once it has been reassembled
go
package api

import (
	"net/http"

	"alertmanagerWebhook/common"
	"alertmanagerWebhook/config"
	"alertmanagerWebhook/global"

	"github.com/gin-gonic/gin"
)

// SendToQywechat handles sending notifications to WeChat.
func SendToQywechat(c *gin.Context, notification config.Notification) {
	if err := common.SendWeChatNotification(notification); err != nil {
		global.Logger.Errorln("Failed to send WeChat notification:", err)
		c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
		return
	}
	// Return a success response when the message was sent
	c.JSON(http.StatusOK, gin.H{"status": "wechat message sent successfully"})
}
The api layer calls the service code in the common layer, which performs the final delivery of the alert message
go
package common

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"

	"alertmanagerWebhook/config"
	"alertmanagerWebhook/global"
)

// SendNotification handles sending notifications to specified platforms.
func SendNotification(notification config.Notification, platform string) error {
	var webhookURL string
	message, err := TransformToMarkdown(notification)
	if err != nil {
		global.Logger.Errorf("error transforming notification: %v\n", err)
		return err
	}
	switch platform {
	case "wechat":
		if !global.Config.QyWeChat.Enabled {
			// Not treated as a failure: returning nil lets the handler answer 200
			global.Logger.Errorln("WeChat notifications are disabled.")
			return nil
		}
		global.Logger.Infof("messageWeChatFiringSize: %d\n", len(message.QywechatMessage.MarkdownFiring.Markdown.Content))
		global.Logger.Infof("messageWeChatResolvedSize: %d\n", len(message.QywechatMessage.MarkdownResolved.Markdown.Content))
		webhookURL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=" + global.Config.QyWeChat.QywechatKey
	default:
		global.Logger.Errorf("unsupported platform: %s\n", platform)
		return fmt.Errorf("unsupported platform: %s", platform)
	}
	// Send only the alert messages that belong to the current platform
	alertDetails := []struct {
		markdown  *config.QyWeChatMarkdown
		alertType string
	}{
		{message.QywechatMessage.MarkdownFiring, "firing"},
		{message.QywechatMessage.MarkdownResolved, "resolved"},
	}
	for _, detail := range alertDetails {
		// Skip empty messages instead of posting them to the robot API
		if detail.markdown == nil || detail.markdown.Markdown.Content == "" {
			global.Logger.Infof("No %s alerts to send.\n", detail.alertType)
			continue
		}
		if err := sendSingleMessage(webhookURL, detail.markdown); err != nil {
			global.Logger.Errorf("Error sending message for %s: %v\n", detail.alertType, err)
		}
	}
	return nil
}

// sendSingleMessage handles the actual sending for a single part of the message
func sendSingleMessage(webhookURL string, messageData interface{}) error {
	data, err := json.Marshal(messageData)
	if err != nil {
		global.Logger.Errorf("error marshalling message: %v\n", err)
		return err
	}
	req, err := http.NewRequest(http.MethodPost, webhookURL, bytes.NewBuffer(data))
	if err != nil {
		global.Logger.Errorf("error creating request: %v\n", err)
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		global.Logger.Errorf("resp error sending request: %v\n", err)
		return err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		global.Logger.Errorf("error reading response body: %v\n", err)
		return err
	}
	if resp.StatusCode != http.StatusOK {
		global.Logger.Errorf("Non-OK HTTP status: %s, Response: %s\n", resp.Status, body)
		return fmt.Errorf("non-OK HTTP status: %s", resp.Status)
	}
	var response map[string]interface{}
	if err = json.Unmarshal(body, &response); err != nil {
		global.Logger.Errorf("error unmarshalling response: %v\n", err)
		return err
	}
	// A non-zero errcode in the response body means the message was rejected
	if errCode, ok := response["errcode"].(float64); ok && int(errCode) != 0 {
		global.Logger.Errorf("send alert message error: %d, Response: %v\n", int(errCode), response)
		return fmt.Errorf("send alert message error: errcode %d", int(errCode))
	}
	return nil
}

// SendWeChatNotification is now a wrapper around SendNotification for WeChat
func SendWeChatNotification(notification config.Notification) error {
	return SendNotification(notification, "wechat")
}
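This completes the walkthrough of the webhook code for WeChat Work
IV. Using the webhook
In settings.yaml, set the enabled flags so that only WeChat Work and email alerts are turned on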
1. Modify alertmanager.yml
yaml
global:
  resolve_timeout: 5m
  http_config:
    basic_auth:
      username: admin
      password: "QAZXCFRF"
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 2m
  repeat_interval: 5m
  receiver: 'webhook'
receivers:
  - name: 'webhook'
    webhook_configs:
      - url: 'http://127.0.0.1:19093/api/v1/dingding' # the port is the one the webhook program listens on; the path is defined in the router layer
      - url: 'http://127.0.0.1:19093/api/v1/wechat'
      - url: 'http://127.0.0.1:19093/api/v1/email'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
2. Build and start the webhook program
First trigger a few alerts for testing
shell
[root@python2 alertmanagerWebhook]# pwd
/opt/monitor/alertmanagerWebhook
[root@python2 alertmanagerWebhook]# go build -o alertmanagerWebhook cmd/main.go
Start the built program in the foreground and check the alertmanager-webhook.log log
shell
[root@python2 alertmanagerWebhook]# ./alertmanagerWebhook
2024/10/26 21:37:45 config init Unmarsharl success
Below is the alert template defined in the template directory. The code renders it with Go's text/template package, which replaces the placeholders with values and produces the alert message that is finally sent
yaml
{{- if eq .Status `firing` -}}
{{- /* Custom format for firing alerts */ -}}
Alert summary: {{.Annotations.Summary}}
Severity: {{.Labels.Severity}}
Alert count: {{.Count}}
Host: {{.Labels.Instance}}
Details: {{.Annotations.Description}}
Triggered at: {{.StartTime}}
{{- else if eq .Status `resolved` -}}
{{- /* Custom format for resolved alerts */ -}}
Alert summary: {{.Annotations.Summary}}
Host: {{.Labels.Instance}}
Started at: {{.StartTime}}
Resolved at: {{.EndTime}}
{{- end -}}
Check the messages in the WeChat Work group and the mailbox
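That concludes the development and demonstration of the alert webhook. You may be wondering what redis is for, so read on.
The data Alertmanager sends contains no count of how many times an alert has fired.
In this article redis stores that state in a hash whose key is the alert's fingerprint value.
The hash has two fields: startTime, which holds the time the alert started, and count, which holds the number of times it has fired.
The count is incremented with the redis HINCRBY command, which adds a given increment to a field of a hash. After the alert is resolved, the fingerprint key is deleted.
That is the role redis plays in this webhook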
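The counting logic can be illustrated without a Redis server. In the sketch below an in-memory map stands in for Redis, and `stateStore`, `Fire`, and `Resolve` are hypothetical names. `Fire` corresponds to the HSET plus HINCRBY pair and `Resolve` to DEL. Unlike Redis, the map is neither persistent nor safe for concurrent use.

```go
package main

import "fmt"

// alertState mirrors the two hash fields stored per fingerprint.
type alertState struct {
	StartTime string
	Count     int
}

// stateStore is an in-memory stand-in for Redis, keyed by alert fingerprint.
// It is not safe for concurrent use.
type stateStore map[string]*alertState

// Fire records one firing notification (HSET startTime, HINCRBY count 1)
// and returns the updated count.
func (s stateStore) Fire(fingerprint, startTime string) int {
	st, ok := s[fingerprint]
	if !ok {
		st = &alertState{}
		s[fingerprint] = st
	}
	st.StartTime = startTime
	st.Count++
	return st.Count
}

// Resolve drops the state for a fingerprint (DEL), so a later firing starts again at 1.
func (s stateStore) Resolve(fingerprint string) {
	delete(s, fingerprint)
}

func main() {
	store := stateStore{}
	fp := "7d93a04c3406308a"
	fmt.Println(store.Fire(fp, "2024-10-19 17:16:28")) // first notification
	fmt.Println(store.Fire(fp, "2024-10-19 17:16:28")) // repeat notification
	store.Resolve(fp)
	fmt.Println(store.Fire(fp, "2024-10-20 08:00:00")) // fires again after being resolved
}
```

The program prints 1, 2, and 1.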
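Summary
Configuring Alertmanager's webhook receiver makes it easy to send Prometheus alerts to multiple custom HTTP services. This flexibility lets Prometheus integrate with other systems and tools and provide more capable monitoring and alerting. Several problems came up while developing this webhook. For example, DingTalk and WeChat Work limit a message to 4096 bytes and allow only 20 API calls per minute; exceeding either limit causes the alert to fail. How to handle a burst of alerts that exceeds these limits is the key question, and it remains one of the open issues from this work. In addition, how to notify a specific member of a WeChat Work group precisely, and how to implement alert inhibition inside the webhook, are still worth exploring; solving them is what would make this webhook truly practical.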