AWS-WAF-Log S3存放,通过Athena查看

1.创建好waf-cdn 并且设置好规则和log存储方式为s3

2. Amazon Athena 服务 使用 (注意s3桶位置相同得区域)

https://docs.aws.amazon.com/zh_cn/athena/latest/ug/waf-logs.html#waf-example-count-matched-ip-addresses

官方文档参考,建一个分区查询表

不能直接使用 因为是cdn 资源需要修改相关字段

bash 复制代码
CREATE EXTERNAL TABLE `waf_logs`(
  `timestamp` bigint,
  `formatversion` int,
  `webaclid` string,
  `terminatingruleid` string,
  `terminatingruletype` string,
  `action` string,
  `terminatingrulematchdetails` array <
                                    struct <
                                        conditiontype: string,
                                        sensitivitylevel: string,
                                        location: string,
                                        matcheddata: array < string >
                                          >
                                     >,
  `httpsourcename` string,
  `httpsourceid` string,
  `rulegrouplist` array <
                      struct <
                          rulegroupid: string,
                          terminatingrule: struct <
                                              ruleid: string,
                                              action: string,
                                              rulematchdetails: array <
                                                                   struct <
                                                                       conditiontype: string,
                                                                       sensitivitylevel: string,
                                                                       location: string,
                                                                       matcheddata: array < string >
                                                                          >
                                                                    >
                                                >,
                          nonterminatingmatchingrules: array <
                                                              struct <
                                                                  ruleid: string,
                                                                  action: string,
                                                                  overriddenaction: string,
                                                                  rulematchdetails: array <
                                                                                       struct <
                                                                                           conditiontype: string,
                                                                                           sensitivitylevel: string,
                                                                                           location: string,
                                                                                           matcheddata: array < string >
                                                                                              >
                                                                   >,
                                                                  challengeresponse: struct <
                                                                            responsecode: string,
                                                                            solvetimestamp: string
                                                                              >,
                                                                  captcharesponse: struct <
                                                                            responsecode: string,
                                                                            solvetimestamp: string
                                                                              >
                                                                    >
                                                             >,
                          excludedrules: string
                            >
                       >,
`ratebasedrulelist` array <
                         struct <
                             ratebasedruleid: string,
                             limitkey: string,
                             maxrateallowed: int
                               >
                          >,
  `nonterminatingmatchingrules` array <
                                    struct <
                                        ruleid: string,
                                        action: string,
                                        rulematchdetails: array <
                                                             struct <
                                                                 conditiontype: string,
                                                                 sensitivitylevel: string,
                                                                 location: string,
                                                                 matcheddata: array < string >
                                                                    >
                                                             >,
                                        challengeresponse: struct <
                                                            responsecode: string,
                                                            solvetimestamp: string
                                                             >,
                                        captcharesponse: struct <
                                                            responsecode: string,
                                                            solvetimestamp: string
                                                             >
                                          >
                                     >,
  `requestheadersinserted` array <
                                struct <
                                    name: string,
                                    value: string
                                      >
                                 >,
  `responsecodesent` string,
  `httprequest` struct <
                    clientip: string,
                    country: string,
                    headers: array <
                                struct <
                                    name: string,
                                    value: string
                                      >
                                 >,
                    uri: string,
                    args: string,
                    httpversion: string,
                    httpmethod: string,
                    requestid: string
                      >,
  `labels` array <
               struct <
                   name: string
                     >
                >,
  `captcharesponse` struct <
                        responsecode: string,
                        solvetimestamp: string,
                        failureReason: string
                          >,
  `challengeresponse` struct <
                        responsecode: string,
                        solvetimestamp: string,
                        failureReason: string
                        >,
  `ja3Fingerprint` string,
  `oversizefields` string,
  `requestbodysize` int,
  `requestbodysizeinspectedbywaf` int
)
PARTITIONED BY ( 
`region` string, 
`date` string) 
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://<aws-waf-logs-xxx>/AWSLogs/<accountID>/WAFLogs/cloudfront/<waf-acl>'
TBLPROPERTIES(
 'projection.enabled' = 'true',
 'projection.region.type' = 'enum',
 'projection.region.values' = 'cloudfront',
 'projection.date.type' = 'date',
 'projection.date.range' = '2024/07/08,NOW',
 'projection.date.format' = 'yyyy/MM/dd',
 'projection.date.interval' = '1',
 'projection.date.interval.unit' = 'DAYS',
 'storage.location.template' = 's3://<aws-waf-logs-xxx>/AWSLogs/<accountID>/WAFLogs/${region}/<waf-acl>/${date}/')

修改< >中的字符为自己的资源

测试查询

bash 复制代码
SELECT 
  COUNT(httpRequest.country) as count, 
  httpRequest.country 
FROM waf_logs
WHERE 
  terminatingruletype='RATE_BASED' 
GROUP BY httpRequest.country
ORDER BY count
LIMIT 100;
bash 复制代码
SELECT 
  COUNT(*) AS count,
  webaclid,
  action,
  httprequest.clientip,
  httprequest.uri
FROM waf_logs
WHERE terminatingruleid='<id>'
GROUP BY webaclid, action, httprequest.clientip, httprequest.uri
ORDER BY count DESC
LIMIT 100;

具体的sql 字段需要修改成自己的 可以先检索全表 查看字段 方便搜索

相关推荐
翼龙云_cloud1 小时前
阿里云云渠道商:如何选择阿里云 GPU 配置方案?
服务器·人工智能·阿里云·云计算
Wnq100722 小时前
解构中心化困境:工业控制SCADA的延时与可靠性症结及分布式边缘计算转型路径
人工智能·分布式·云计算·去中心化·边缘计算
元气满满-樱2 小时前
安装Windows Server 2008
windows·云计算
Wnq100722 小时前
新型基于“去中心化分布式Agent“技术的操作系统DIOS
分布式·嵌入式硬件·中间件·架构·云计算·去中心化·信息与通信
这儿有一堆花15 小时前
浏览器指纹:互联网中无处遁形的数字身份证
云计算
云老大TG:@yunlaoda3601 天前
开通华为云国际站代理商的UCS服务需要哪些资质?
大数据·数据库·华为云·云计算
TG:@yunlaoda360 云老大1 天前
如何评估华为云国际站代理商跨境合规要求?
大数据·数据库·华为云·云计算
@HNUSTer1 天前
基于 GEE 的 Landsat 9 数据实现 11 种植被指数批量计算与导出
云计算·数据集·遥感大数据·gee·云平台·植被指数·landsat 9
TG:@yunlaoda360 云老大1 天前
如何了解华为云国际站代理商的GACS主要有什么作用呢?
大数据·华为云·云计算
咕噜企业分发小米1 天前
阿里云基因测序数据分析平台有哪些成功案例?
阿里云·数据分析·云计算