Forwarding AWS Alarm Notifications to DingTalk (Also Works for WeCom)

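SNS cannot post to a DingTalk robot Webhook directly, because the robot expects its own JSON message format rather than the SNS notification envelope. A Lambda function therefore acts as a relay: it parses the SNS message and pushes it to DingTalk (or WeCom).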

Architecture

CloudWatch alarm -> SNS topic -> Lambda function -> DingTalk robot Webhook

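Deployment steps

Create the DingTalk robot

Steps:

  1. Create a DingTalk group for notifications (a regular group is free and does not consume enterprise resources).
  2. Go to "Group Settings" -> "Robots" -> "Add Robot".
  3. ‼️ In the security settings, add the keyword Alarm (the notifications sent by the code below include this word by default).
  4. Copy the access_token from the Webhook URL (the Lambda code needs it later).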
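
The last step above can also be done programmatically. A minimal sketch, assuming the Webhook has the same shape as the URL used in the appendix code (`https://oapi.dingtalk.com/robot/send?access_token=...`); the function name `extract_access_token` and the placeholder token are hypothetical:

```python
from urllib.parse import urlparse, parse_qs


def extract_access_token(webhook_url: str) -> str:
    """Return the access_token query parameter of a DingTalk robot Webhook URL."""
    query = parse_qs(urlparse(webhook_url).query)
    tokens = query.get("access_token", [])
    # parse_qs drops empty values, so a missing or blank token ends up here.
    if len(tokens) != 1:
        raise ValueError("Webhook URL does not contain exactly one access_token")
    return tokens[0]


# Placeholder URL for illustration only; use the one DingTalk shows for your robot.
example = "https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN"
print(extract_access_token(example))
```

The returned value is what goes into the Lambda environment variable later on.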

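Deploy the DingTalk notification function on Lambda

The Lambda setup goes roughly as follows:

  1. 📦 Package the code and its dependencies locally (Python is used here; the code file may even be empty, since it can be edited later in the Lambda console):
# 1) Create the packaging directory
mkdir -p lambda_pkg && cd lambda_pkg

# 2) Install the dependencies into the current directory (requests is the only one here)
pip install --upgrade pip
pip install requests -t .

# 3) Copy in your code file
cp /path/to/lambda_function.py .

# 4) Zip it (note: the archive must contain the files themselves, not an enclosing folder)
zip -r ../lambda.zip .

# 5) Go back up one level (the resulting lambda.zip is ready to upload)
cd ..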
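  2. In the Lambda console, create the function: navigation pane "Functions" -> "Create function".
  3. Upload the lambda.zip packaged in step 1.
  4. Set the environment variable token to the access_token obtained when creating the DingTalk robot.
  5. (Optional) Modify the code.
  6. Deploy and test.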
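
The packaging note in step 1 (files at the archive root, not inside an enclosing folder) is easy to get wrong, so it can be worth checking before uploading. A minimal sketch using only the standard library; the function name `handler_at_zip_root` is hypothetical, and the default file name assumes the `lambda_function.py` used in the packaging commands:

```python
import zipfile


def handler_at_zip_root(zip_path: str, handler_file: str = "lambda_function.py") -> bool:
    """Return True if handler_file sits at the top level of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Entries nested in a folder look like "lambda_pkg/lambda_function.py"
        # and therefore do not match the bare file name.
        return handler_file in archive.namelist()
```

Calling `handler_at_zip_root("lambda.zip")` from the directory that holds the archive should return `True` for a package built with the commands above.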

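Create the SNS topic and subscription

(Optional) Create the topic

If a topic already exists, skip this part and go straight to creating the subscription.

A "Standard" topic is sufficient; leave the other settings at their defaults unless you have specific requirements.

Create the subscription

  1. For the protocol, choose "AWS Lambda".
  2. For the endpoint, choose the Lambda function created in the previous section.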
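
For readers who prefer scripting over the console, the subscription can be sketched with boto3 as well. This is an illustration under assumptions, not the author's method: the function name `subscribe_lambda_to_topic` and the statement id are hypothetical, and the clients are passed in so the logic stays independent of credentials. When done through the API, the function also needs a resource-based permission that lets SNS invoke it.

```python
def subscribe_lambda_to_topic(sns_client, lambda_client, topic_arn, function_arn):
    """Allow the topic to invoke the function, then create the subscription."""
    # Resource-based permission: lets the SNS service invoke this function,
    # restricted to messages coming from the given topic.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId="AllowSnsInvoke",  # hypothetical statement id
        Action="lambda:InvokeFunction",
        Principal="sns.amazonaws.com",
        SourceArn=topic_arn,
    )
    # Same choice as in the console: protocol "lambda", endpoint = function ARN.
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="lambda",
        Endpoint=function_arn,
        ReturnSubscriptionArn=True,
    )
    return response["SubscriptionArn"]


# Usage (requires boto3 and AWS credentials; the ARNs are placeholders):
#   import boto3
#   subscribe_lambda_to_topic(boto3.client("sns"), boto3.client("lambda"),
#                             "<topic-arn>", "<function-arn>")
```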

Appendix

Code

import json
import os

import requests

DINGTALK_URL = "https://oapi.dingtalk.com/robot/send"


def send_msg(title, msg):
    """Post a markdown message to the DingTalk robot and return the response body."""
    # The access_token is read from the Lambda environment variable named "token".
    token = os.getenv('token')
    if not token:
        raise RuntimeError("environment variable 'token' is not set")

    # Let requests serialize the payload so that quotes, newlines and
    # non-ASCII characters in the message are escaped correctly.
    payload = {"msgtype": "markdown", "markdown": {"title": title, "text": msg}}
    print(payload)
    response = requests.post(
        DINGTALK_URL, params={"access_token": token}, json=payload, timeout=10
    )
    return response.text


def lambda_handler(event, context):
    print(event)
    sns = event['Records'][0]['Sns']
    raw_message = sns['Message']
    timestamp = sns['Timestamp']

    try:
        # SNS delivers Message as a string; CloudWatch alarms put a JSON document in it.
        # A dict is accepted too, so the expanded test event below works as-is.
        message = raw_message if isinstance(raw_message, dict) else json.loads(raw_message)
        alarm_name = message['AlarmName']
        alarm_description = message['AlarmDescription']
        new_state_value = message['NewStateValue']
        new_state_reason = message['NewStateReason']
        region = message['Region']

        msg = (
            f"**Alarm**: {alarm_name}\n\n"
            f"**Region**: {region}\n\n"
            f"**Current state**: {new_state_value}\n\n"
            f"**Reason**: {new_state_reason}\n\n"
            f"**Time**: {timestamp}\n\n"
            f"**Description**: {alarm_description}"
        )
        print(msg)
        send_msg(alarm_name, msg)
    except json.JSONDecodeError:
        # The publisher is not a CloudWatch alarm: forward the raw message unchanged.
        print(raw_message)
        send_msg("Alarm notification", raw_message)
    except KeyError as e:
        # A required field is missing: report it as a parsing failure.
        msg = (
            f"**Alarm message parsing failed**\n\n"
            f"**Raw message**: {raw_message}\n\n"
            f"**Time**: {timestamp}\n\n"
            f"**Error**: missing required field {e}"
        )
        print(f"Missing field: {e}")
        print(msg)
        send_msg("Alarm message parsing failed", msg)
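
The title notes that the same approach works for WeCom. A minimal sketch of a WeCom counterpart to `send_msg`, assuming WeCom's group robot Webhook at `qyapi.weixin.qq.com/cgi-bin/webhook/send` with a `key` query parameter and a markdown message whose body goes in `content` (there is no separate title field). The function names are hypothetical, and the standard library is used instead of requests only to keep the sketch dependency-free:

```python
import json
import urllib.request
from urllib.parse import urlencode

WECOM_URL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send"


def build_wecom_request(key, msg):
    """Build the POST request for a WeCom group robot markdown message."""
    payload = {"msgtype": "markdown", "markdown": {"content": msg}}
    return urllib.request.Request(
        f"{WECOM_URL}?{urlencode({'key': key})}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_wecom_msg(key, msg):
    """Send the message and return the response body as text."""
    with urllib.request.urlopen(build_wecom_request(key, msg), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

In the handler, `send_wecom_msg(os.getenv('token'), msg)` would take the place of `send_msg(title, msg)`, with the environment variable holding the WeCom robot key instead of the DingTalk access_token.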

SNS test events

CloudWatch metric alarm

This event simulates what Lambda receives along the CloudWatch -> SNS -> Lambda path. In a real delivery, Message is a JSON-encoded string; it is shown here as an expanded object for readability.

{
    "Records": [
        {
            "EventSource": "aws:sns",
            "EventVersion": "1.0",
            "EventSubscriptionArn": "arn:aws:sns:********",
            "Sns": {
                "Type": "Notification",
                "MessageId": "b46076bd-********-17a94cdbfbd5",
                "TopicArn": "arn:aws:sns:********",
                "Message": {
                    "AlarmName": "test_1 high memory utilization alarm",
                    "AlarmDescription": null,
                    "AWSAccountId": "514986213302",
                    "AlarmConfigurationUpdatedTimestamp": "2025-08-12T07:39:49.733+0000",
                    "NewStateValue": "ALARM",
                    "NewStateReason": "Threshold Crossed: 1 out of the last 1 datapoints [42.067474122272536 (12/08/25 07:36:00)] was greater than or equal to the threshold (40.0) (minimum 1 datapoint for OK -> ALARM transition).",
                    "StateChangeTime": "2025-08-12T07:41:05.608+0000",
                    "Region": "US West (Oregon)",
                    "AlarmArn": "arn:aws:cloudwatch:********",
                    "OldStateValue": "OK",
                    "OKActions": [

                    ],
                    "AlarmActions": [
                        "arn:aws:sns:********"
                    ],
                    "InsufficientDataActions": [

                    ],
                    "Trigger": {
                        "MetricName": "mem_used_percent",
                        "Namespace": "CWAgent",
                        "StatisticType": "ExtendedStatistic",
                        "ExtendedStatistic": "p90",
                        "Unit": null,
                        "Dimensions": [
                            {
                                "value": "i-********",
                                "name": "InstanceId"
                            }
                        ],
                        "Period": 300,
                        "EvaluationPeriods": 1,
                        "DatapointsToAlarm": 1,
                        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
                        "Threshold": 40,
                        "TreatMissingData": "missing",
                        "EvaluateLowSampleCountPercentile": ""
                    }
                },
                "Timestamp": "2025-08-12T07:41:05.649Z",
                "SignatureVersion": "1",
                "Signature": "********",
                "SigningCertUrl": "https://sns.us-west-2.amazonaws.com/SimpleNotificationService-********",
                "Subject": "ALARM: \"test_\" in US West (Oregon)",
                "UnsubscribeUrl": "https://sns.us-west-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:********",
                "MessageAttributes": {

                }
            }
        }
    ]
}
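
Because SNS delivers `Message` as a string, a test that mirrors a real delivery needs the expanded object above encoded back into JSON text. A minimal sketch of that conversion; the helper name `stringify_sns_messages` is hypothetical:

```python
import copy
import json


def stringify_sns_messages(event):
    """Return a copy of event in which every dict-valued Sns.Message is JSON-encoded."""
    result = copy.deepcopy(event)  # leave the caller's event untouched
    for record in result.get("Records", []):
        message = record["Sns"]["Message"]
        if isinstance(message, dict):
            # ensure_ascii=False keeps non-ASCII alarm names readable.
            record["Sns"]["Message"] = json.dumps(message, ensure_ascii=False)
    return result
```

Loading the sample with `json.load`, passing it through this helper, and dumping the result gives an event that can be pasted into the Lambda console's test dialog.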

CloudWatch log alarm

This event simulates a message that a scheduled Lambda function builds from the results of a CloudWatch Logs query and publishes to SNS.

{
  "Records": [
    {
      "EventSource": "aws:sns",
      "EventVersion": "1.0",
      "EventSubscriptionArn": "arn:aws:sns:********",
      "Sns": {
        "Type": "Notification",
        "MessageId": "f664004c-xxxxxxxxx-615af8115e52",
        "TopicArn": "arn:aws:sns:********",
        "Message": "# 🔥 CloudWatch Logs Insights Alarm\n\n---\n\n### 1 alert in total\n\n---\n\n#### ERROR (hits = 1)\n> **msg**: test error\n- **line**: 330\n- **logger**: com.amzless.ads.dispatch.task.ad.AdSyncHandler\n- **method**: getPostData\n- **firstSeen**: 2025-08-13 04:51:36.463\n",
        "Timestamp": "2025-08-13T05:10:33.276Z",
        "SignatureVersion": "1",
        "Signature": "xxxxxxxxxx",
        "SigningCertUrl": "https://sns.us-west-2.amazonaws.com/SimpleNotificationService-xxxxxxxx.pem",
        "Subject": "CloudWatch Logs Insights Alarm",
        "UnsubscribeUrl": "https://sns.us-west-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:xxxxxxxx",
        "MessageAttributes": {}
      }
    }
  ]
}
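
The scheduled Lambda that produces such a message is not shown in this article. As a rough sketch of how its markdown body might be assembled, assuming each query result has already been grouped into a dict with the fields seen in the sample (`level`, `hits`, `msg`, and optionally `line`, `logger`, `method`, `firstSeen`); the function name `format_log_alert` and the input shape are hypothetical:

```python
def format_log_alert(groups):
    """Render grouped log hits as a markdown message to publish to SNS."""
    lines = [
        "# CloudWatch Logs Insights Alarm",
        "",
        "---",
        "",
        f"### {len(groups)} alert(s) in total",
        "",
        "---",
        "",
    ]
    for group in groups:
        lines.append(f"#### {group['level']} (hits = {group['hits']})")
        lines.append(f"> **msg**: {group['msg']}")
        # Optional details are listed only when present in the group.
        for field in ("line", "logger", "method", "firstSeen"):
            if field in group:
                lines.append(f"- **{field}**: {group[field]}")
        lines.append("")
    return "\n".join(lines)
```

Since the result is plain markdown rather than JSON, the handler's `json.loads` call raises `json.JSONDecodeError` and the message is forwarded to DingTalk unchanged.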