AWS-WAF-Log S3存放,通过Athena查看

1.创建好waf-cdn 并且设置好规则和log存储方式为s3

2. Amazon Athena 服务 使用 (注意s3桶位置相同得区域)

https://docs.aws.amazon.com/zh_cn/athena/latest/ug/waf-logs.html#waf-example-count-matched-ip-addresses

官方文档参考,建一个分区查询表

不能直接使用 因为是cdn 资源需要修改相关字段

bash 复制代码
CREATE EXTERNAL TABLE `waf_logs`(
  `timestamp` bigint,
  `formatversion` int,
  `webaclid` string,
  `terminatingruleid` string,
  `terminatingruletype` string,
  `action` string,
  `terminatingrulematchdetails` array <
                                    struct <
                                        conditiontype: string,
                                        sensitivitylevel: string,
                                        location: string,
                                        matcheddata: array < string >
                                          >
                                     >,
  `httpsourcename` string,
  `httpsourceid` string,
  `rulegrouplist` array <
                      struct <
                          rulegroupid: string,
                          terminatingrule: struct <
                                              ruleid: string,
                                              action: string,
                                              rulematchdetails: array <
                                                                   struct <
                                                                       conditiontype: string,
                                                                       sensitivitylevel: string,
                                                                       location: string,
                                                                       matcheddata: array < string >
                                                                          >
                                                                    >
                                                >,
                          nonterminatingmatchingrules: array <
                                                              struct <
                                                                  ruleid: string,
                                                                  action: string,
                                                                  overriddenaction: string,
                                                                  rulematchdetails: array <
                                                                                       struct <
                                                                                           conditiontype: string,
                                                                                           sensitivitylevel: string,
                                                                                           location: string,
                                                                                           matcheddata: array < string >
                                                                                              >
                                                                   >,
                                                                  challengeresponse: struct <
                                                                            responsecode: string,
                                                                            solvetimestamp: string
                                                                              >,
                                                                  captcharesponse: struct <
                                                                            responsecode: string,
                                                                            solvetimestamp: string
                                                                              >
                                                                    >
                                                             >,
                          excludedrules: string
                            >
                       >,
`ratebasedrulelist` array <
                         struct <
                             ratebasedruleid: string,
                             limitkey: string,
                             maxrateallowed: int
                               >
                          >,
  `nonterminatingmatchingrules` array <
                                    struct <
                                        ruleid: string,
                                        action: string,
                                        rulematchdetails: array <
                                                             struct <
                                                                 conditiontype: string,
                                                                 sensitivitylevel: string,
                                                                 location: string,
                                                                 matcheddata: array < string >
                                                                    >
                                                             >,
                                        challengeresponse: struct <
                                                            responsecode: string,
                                                            solvetimestamp: string
                                                             >,
                                        captcharesponse: struct <
                                                            responsecode: string,
                                                            solvetimestamp: string
                                                             >
                                          >
                                     >,
  `requestheadersinserted` array <
                                struct <
                                    name: string,
                                    value: string
                                      >
                                 >,
  `responsecodesent` string,
  `httprequest` struct <
                    clientip: string,
                    country: string,
                    headers: array <
                                struct <
                                    name: string,
                                    value: string
                                      >
                                 >,
                    uri: string,
                    args: string,
                    httpversion: string,
                    httpmethod: string,
                    requestid: string
                      >,
  `labels` array <
               struct <
                   name: string
                     >
                >,
  `captcharesponse` struct <
                        responsecode: string,
                        solvetimestamp: string,
                        failureReason: string
                          >,
  `challengeresponse` struct <
                        responsecode: string,
                        solvetimestamp: string,
                        failureReason: string
                        >,
  `ja3Fingerprint` string,
  `oversizefields` string,
  `requestbodysize` int,
  `requestbodysizeinspectedbywaf` int
)
PARTITIONED BY ( 
`region` string, 
`date` string) 
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://<aws-waf-logs-xxx>/AWSLogs/<accountID>/WAFLogs/cloudfront/<waf-acl>'
TBLPROPERTIES(
 'projection.enabled' = 'true',
 'projection.region.type' = 'enum',
 'projection.region.values' = 'cloudfront',
 'projection.date.type' = 'date',
 'projection.date.range' = '2024/07/08,NOW',
 'projection.date.format' = 'yyyy/MM/dd',
 'projection.date.interval' = '1',
 'projection.date.interval.unit' = 'DAYS',
 'storage.location.template' = 's3://<aws-waf-logs-xxx>/AWSLogs/<accountID>/WAFLogs/${region}/<waf-acl>/${date}/')

修改< >中的字符为自己的资源

测试查询

bash 复制代码
SELECT 
  COUNT(httpRequest.country) as count, 
  httpRequest.country 
FROM waf_logs
WHERE 
  terminatingruletype='RATE_BASED' 
GROUP BY httpRequest.country
ORDER BY count
LIMIT 100;
bash 复制代码
SELECT 
  COUNT(*) AS count,
  webaclid,
  action,
  httprequest.clientip,
  httprequest.uri
FROM waf_logs
WHERE terminatingruleid='<id>'
GROUP BY webaclid, action, httprequest.clientip, httprequest.uri
ORDER BY count DESC
LIMIT 100;

具体的sql 字段需要修改成自己的 可以先检索全表 查看字段 方便搜索

相关推荐
码农的救赎1 小时前
阿里云ECS访问不了
阿里云·云计算
wayuncn6 小时前
黑龙江 GPU 服务器租用:开启高效计算新征程
运维·服务器·云计算·gpu算力·算力
Linux运维老纪11 小时前
Linux之 grep、find、ls、wc 命令
linux·运维·服务器·数据库·云计算·运维开发
csdn5659738501 天前
阿里云实时计算Flink版产品体验测评
阿里云·flink·云计算·datav·实时数仓hologres
西柚小萌新1 天前
【大模型实战篇】--阿里云百炼搭建MCP Agent
阿里云·云计算
huang_xiaoen2 天前
试一下阿里云新出的mcp服务
人工智能·阿里云·ai·云计算·mcp
代码小学僧2 天前
🤗 赛博佛祖 Cloudflare 初体验托管自定义域名与无限邮箱注册
前端·serverless·云计算
AKAMAI2 天前
GitOps实战:使用Flux新建Kubernetes集群
后端·云原生·云计算
阿里云大数据AI技术2 天前
大模型落地的关键:如何用 RAG 打造更智能的 AI 搜索——阿里云 AI 搜索开放平台
人工智能·云计算
Blossom.1182 天前
时间的重构:科技如何重塑人类的时间感知与存在方式
人工智能·科技·学习·重构·云计算·生活·时序数据库