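Uploading instance network traffic metrics to CloudWatch with Network Flow Monitor in the AWS China regions

About Network Flow Monitor

AWS Network Flow Monitor is a network monitoring sub-service of Amazon CloudWatch. Its core component, aws-network-sonar-agent (also called nfm-agent), is a lightweight agent deployed on every compute node that uses Linux eBPF to collect performance metrics for TCP connections. The image is published at 602401143452.dkr.ecr.us-east-2.amazonaws.com/aws-network-sonar-agent:v1.1.3-eksbuild.3.

Analysis of the image shows that the agent's main logic lives in /usr/local/bin/nfm-agent. The agent uses the Rust aya framework (v0.13.1) to load its eBPF program into the kernel: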

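  • Program type: nfm_sock_ops (a sockops BPF program)
  • Attach point: cgroupv2, which an init container must mount beforehand
  • BPF maps: NFM_SK_PROPS (socket properties), NFM_SK_STATS (socket statistics), NFM_COUNTERS, NFM_CONTROL
  • Kernel events collected: active_connect_events, established_events, state_change_events, rtt_events, retrans_events, rto_events

The agent never reads the payload of a TCP connection; it collects only connection-level metadata and performance metrics.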

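Data processing layer

Data is aggregated every 500 ms (aggregate_msecs defaults to 500) and reported every 30 seconds with ±5 seconds of jitter.

  • conntrack is used to recover the real IP addresses behind NAT translation
  • Flows are aggregated by local_address + remote_address + service port, with the ephemeral port set to zero
    • Outbound (active connect): local_port=0, remote_port=destination port
    • Inbound (passive accept): local_port=service port, remote_port=0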
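
The aggregation rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the agent's actual code: the FlowKey type, the flow_key and aggregate helpers, the "active" flag and the sample connections are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    local_address: str
    remote_address: str
    local_port: int
    remote_port: int


def flow_key(local_addr, local_port, remote_addr, remote_port, active):
    """Zero the ephemeral side so all connections to one service share a key."""
    if active:
        # Outbound connect: the local port is ephemeral.
        return FlowKey(local_addr, remote_addr, 0, remote_port)
    # Inbound accept: the remote port is ephemeral.
    return FlowKey(local_addr, remote_addr, local_port, 0)


def aggregate(connections):
    """Sum bytes_delivered per normalized flow key."""
    totals = defaultdict(int)
    for c in connections:
        key = flow_key(c["local_addr"], c["local_port"],
                       c["remote_addr"], c["remote_port"], c["active"])
        totals[key] += c["bytes_delivered"]
    return dict(totals)


# Hypothetical sample: two outbound connections to IMDS, one inbound HTTPS connection.
conns = [
    {"local_addr": "172.31.14.46", "local_port": 40312,
     "remote_addr": "169.254.169.254", "remote_port": 80,
     "active": True, "bytes_delivered": 300},
    {"local_addr": "172.31.14.46", "local_port": 40318,
     "remote_addr": "169.254.169.254", "remote_port": 80,
     "active": True, "bytes_delivered": 200},
    {"local_addr": "172.31.14.46", "local_port": 443,
     "remote_addr": "10.0.0.7", "remote_port": 51544,
     "active": False, "bytes_delivered": 1000},
]
flows = aggregate(conns)
for key, total in flows.items():
    print(key, total)
```

The two outbound connections collapse into a single flow with local_port=0, which is why the sample report shows local_port 0 and remote_port 80 for traffic to 169.254.169.254.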

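Reporting layer

Reports use OTLP (OpenTelemetry Protocol), are protobuf-encoded and gzip-compressed, and are sent to the endpoint https://networkflowmonitorreports.{region}.api.aws/publish

Each report carries at most 500 flows (top_k defaults to 500).

Actual metric logs

Running the agent with report logging enabled (-l on) yields the following complete report structure:

Flow-level metrics (network_stats):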

{
  "flow": {
    "protocol": "TCP",
    "local_address": "172.31.14.46",
    "remote_address": "169.254.169.254",
    "local_port": 0,
    "remote_port": 80
  },
  "stats": {
    "sockets_connecting": 0,
    "sockets_established": 0,
    "sockets_completed": 5,
    "severed_connect": 0,
    "severed_establish": 0,
    "connect_attempts": 5,
    "bytes_received": 1302,
    "bytes_delivered": 1626,
    "segments_received": 5,
    "segments_delivered": 5,
    "retrans_syn": 0,
    "retrans_est": 0,
    "retrans_close": 0,
    "rtos_syn": 0,
    "rtos_est": 0,
    "rtos_close": 0,
    "connect_us": {"count": 5, "min": 102, "max": 199, "sum": 696},
    "rtt_us": {"count": 5, "min": 61, "max": 91, "sum": 380},
    "rtt_smoothed_us": {"count": 5, "min": 92, "max": 131, "sum": 536}
  }
}

Network interface allowance metrics (host_stats):

{
  "interface_id": "eni-030xxxx06fd2",
  "stats": {
    "bw_in_allowance_exceeded": 0,
    "bw_out_allowance_exceeded": 0,
    "conntrack_allowance_exceeded": 0,
    "linklocal_allowance_exceeded": 0,
    "pps_allowance_exceeded": 0,
    "conntrack_allowance_available": 119972
  }
}

OpenMetrics port (:9090/metrics): exposes only 5 network interface allowance metrics, which duplicate host_stats and add no extra data.

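Analysis of entrypoint.sh and the partition data embedded in the binary confirms that the stock agent cannot report from the China regions:

  • The default endpoint in entrypoint.sh is https://networkflowmonitorreports.$region.api.aws/publish, whereas the DNS suffix for the China regions should be api.amazonwebservices.com.cn
  • The Network Flow Monitor service itself is not deployed in cn-north-1 or cn-northwest-1
  • A custom endpoint is supported (the CUSTOM_INGESTION_ENDPOINT environment variable), but there is no backend service to point it at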
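
To make the suffix mismatch concrete, here is a small sketch that rebuilds the default URL and compares it with the suffix the region's partition would call for. The helper names are hypothetical, and the "cn-" prefix test is a simplification assumed for the example. Even a URL with the matching suffix would not help, since according to the analysis above no backend exists in these regions.

```python
def default_endpoint(region: str) -> str:
    """URL that the stock entrypoint.sh builds, per the analysis above."""
    return f"https://networkflowmonitorreports.{region}.api.aws/publish"


def partition_dns_suffix(region: str) -> str:
    """Hypothetical helper: DNS suffix matching the region's partition."""
    if region.startswith("cn-"):
        return "api.amazonwebservices.com.cn"
    return "api.aws"


def endpoint_matches_partition(region: str) -> bool:
    """True when the default URL already uses the partition's DNS suffix."""
    expected = f".{region}.{partition_dns_suffix(region)}/"
    return expected in default_endpoint(region)


for region in ("us-east-2", "cn-north-1", "cn-northwest-1"):
    print(region, default_endpoint(region), endpoint_matches_partition(region))
```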

Implementation approach

The core idea is to bypass the AWS Network Flow Monitor backend, keep the agent's eBPF collection, and write the metrics to CloudWatch through EMF (Embedded Metric Format).

eBPF kernel collection → nfm-agent log output → EMF conversion → CloudWatch Agent → CloudWatch Metrics

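Key parameters:

  • -p off: disable reporting to the AWS backend (avoids errors)
  • -l on: enable report logging (the full JSON is written to stdout)
  • -n on: enable NAT resolution
  • --open-metrics on: enable the Prometheus port (optional)

File structure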

nfm-compose/
├── docker-compose.yml
├── cwagent-config.json
└── scripts/
    ├── start-nfm.sh
    ├── start-emf.sh
    └── nfm_to_emf.py

docker-compose.yml

services:
  nfm-agent:
    image: 602401143452.dkr.ecr.us-east-2.amazonaws.com/aws-network-sonar-agent:v1.1.3-eksbuild.3
    container_name: nfm-agent
    entrypoint: ["/bin/sh", "/scripts/start-nfm.sh"]
    privileged: true          # eBPF requires privileged mode
    network_mode: host        # observe the host's network
    pid: host                 # access host processes
    restart: unless-stopped
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    volumes:
      - ./scripts:/scripts:ro
      - shared-logs:/shared-logs
      - /sys/fs/cgroup:/host-cgroup:rw  # mount the host cgroup so eBPF can capture all traffic

  emf-converter:
    image: python:3.11-slim
    container_name: emf-converter
    entrypoint: ["/bin/sh", "/scripts/start-emf.sh"]
    depends_on:
      - nfm-agent
    restart: unless-stopped
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    volumes:
      - ./scripts:/scripts:ro
      - shared-logs:/shared-logs
      - emf-logs:/emf-logs

  cloudwatch-agent:
    image: public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest
    container_name: cloudwatch-agent
    depends_on:
      - emf-converter
    restart: unless-stopped
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    volumes:
      - ./cwagent-config.json:/etc/cwagentconfig/cwagent.json:ro
      - emf-logs:/emf-logs
    environment:
      - AWS_REGION=cn-north-1
    network_mode: host        # reach IMDS to obtain IAM credentials

  log-rotator:
    image: alpine:3
    container_name: log-rotator
    restart: unless-stopped
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        while true; do
          sleep 3600
          for f in /shared-logs/nfm.log /emf-logs/emf.log; do
            if [ -f "$$f" ]; then
              size=$$(stat -c%s "$$f" 2>/dev/null || echo 0)
              if [ "$$size" -gt 52428800 ]; then
                : > "$$f"
                echo "$$(date) Truncated $$f (was $${size} bytes)"
              fi
            fi
          done
        done
    volumes:
      - shared-logs:/shared-logs
      - emf-logs:/emf-logs

volumes:
  shared-logs:
  emf-logs:

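start-nfm.sh: agent startup script

#!/bin/sh
set -e
# Use the host's root cgroupv2 (bind-mounted at /host-cgroup in docker-compose.yml)
# so that eBPF captures the TCP connections of every process.
CGROUP_PATH="/host-cgroup"
# Assumption: fall back to cn-north-1 when $region is not set in the environment.
region="${region:-cn-north-1}"
# Flags (a comment cannot follow a trailing backslash, so they are listed here):
#   -p off   disable reporting to the AWS backend
#   -l on    enable report logging
#   -k off   disable K8s metadata (not needed on a standalone EC2 instance)
#   -n on    enable NAT resolution
exec /usr/local/bin/nfm-agent \
  --cgroup "$CGROUP_PATH" \
  --endpoint-region "$region" \
  --endpoint "" \
  -p off \
  -l on \
  -k off \
  -n on \
  --publish-secs 30 \
  --jitter-secs 5 \
  --open-metrics on \
  --open-metrics-port 9090 \
  --open-metrics-address 0.0.0.0 \
  2>&1 | tee /shared-logs/nfm.log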

start-emf.sh: EMF converter startup script

#!/bin/sh
echo "Waiting for nfm-agent logs..."
while [ ! -f /shared-logs/nfm.log ]; do
  sleep 1
done
echo "Starting EMF converter, writing to /emf-logs/emf.log"
tail -F /shared-logs/nfm.log | python3 /scripts/nfm_to_emf.py >> /emf-logs/emf.log

nfm_to_emf.py: converts the agent log to EMF format

#!/usr/bin/env python3
"""nfm-agent JSON log -> CloudWatch EMF format (network_stats + host_stats)"""
import json, sys

def to_emf(entry):
    report = entry["report"]
    ts = entry["timestamp"]  # EMF needs epoch milliseconds; convert here if the log uses another format
    env = dict(report.get("env_metadata", []))
    k8s = report.get("k8s_metadata", {})
    instance_id = env.get("instance-id", {}).get("String", "unknown")
    node_name = k8s.get("node_name") or instance_id

    # Part 1: flow-level metrics
    for item in report.get("network_stats", []):
        flow = item["flow"]
        stats = item["stats"]
        rtt_count = stats["rtt_us"]["count"]
        avg_rtt = stats["rtt_us"]["sum"] / max(rtt_count, 1)
        seg_total = stats["segments_delivered"] + stats["segments_received"]
        retrans_total = (stats["retrans_syn"] + stats["retrans_est"]
                         + stats["retrans_close"])

        emf = {
            "_aws": {
                "Timestamp": ts,
                "CloudWatchMetrics": [{
                    "Namespace": "NFM/NetworkFlows",
                    "Dimensions": [
                        ["NodeName", "LocalAddress", "RemoteAddress",
                         "RemotePort"],
                        ["NodeName", "RemoteAddress", "RemotePort"],
                        ["NodeName"],
                    ],
                    "Metrics": [
                        {"Name": "AvgRttUs", "Unit": "Microseconds"},
                        {"Name": "MaxRttUs", "Unit": "Microseconds"},
                        {"Name": "AvgConnectUs", "Unit": "Microseconds"},
                        {"Name": "BytesReceived", "Unit": "Bytes"},
                        {"Name": "BytesDelivered", "Unit": "Bytes"},
                        {"Name": "Retransmissions", "Unit": "Count"},
                        {"Name": "RetransRate", "Unit": "None"},
                        {"Name": "SeveredConnect", "Unit": "Count"},
                        {"Name": "SocketsCompleted", "Unit": "Count"},
                    ]
                }]
            },
            "NodeName": node_name,
            "InstanceId": instance_id,
            "LocalAddress": flow["local_address"],
            "RemoteAddress": flow["remote_address"],
            "RemotePort": str(flow["remote_port"]),
            "AvgRttUs": round(avg_rtt, 1),
            "MaxRttUs": stats["rtt_us"]["max"],
            "AvgConnectUs": round(
                stats["connect_us"]["sum"]
                / max(stats["connect_us"]["count"], 1), 1),
            "BytesReceived": stats["bytes_received"],
            "BytesDelivered": stats["bytes_delivered"],
            "Retransmissions": retrans_total,
            "RetransRate": round(
                retrans_total / max(seg_total, 1), 6),
            "SeveredConnect": stats["severed_connect"],
            "SocketsCompleted": stats["sockets_completed"],
        }
        print(json.dumps(emf), flush=True)

    # Part 2: network interface allowance metrics
    for iface in report.get("host_stats", {}).get("interface_stats", []):
        s = iface.get("stats", {})
        emf = {
            "_aws": {
                "Timestamp": ts,
                "CloudWatchMetrics": [{
                    "Namespace": "NFM/HostStats",
                    "Dimensions": [
                        ["NodeName", "InterfaceId"],
                        ["NodeName"],
                    ],
                    "Metrics": [
                        {"Name": "BwInAllowanceExceeded", "Unit": "Count"},
                        {"Name": "BwOutAllowanceExceeded", "Unit": "Count"},
                        {"Name": "PpsAllowanceExceeded", "Unit": "Count"},
                        {"Name": "ConntrackAllowanceExceeded", "Unit": "Count"},
                        {"Name": "ConntrackAllowanceAvailable", "Unit": "Count"},
                        {"Name": "LinklocalAllowanceExceeded", "Unit": "Count"},
                    ]
                }]
            },
            "NodeName": node_name,
            "InstanceId": instance_id,
            "InterfaceId": iface.get("interface_id", "unknown"),
            "BwInAllowanceExceeded": s.get("bw_in_allowance_exceeded", 0),
            "BwOutAllowanceExceeded": s.get("bw_out_allowance_exceeded", 0),
            "PpsAllowanceExceeded": s.get("pps_allowance_exceeded", 0),
            "ConntrackAllowanceExceeded": s.get(
                "conntrack_allowance_exceeded", 0),
            "ConntrackAllowanceAvailable": s.get(
                "conntrack_allowance_available", 0),
            "LinklocalAllowanceExceeded": s.get(
                "linklocal_allowance_exceeded", 0),
        }
        print(json.dumps(emf), flush=True)

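for line in sys.stdin:
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        continue  # plain-text log lines are expected because stderr is merged in
    if not isinstance(entry, dict) or entry.get("message") != "Publishing report":
        continue
    try:
        to_emf(entry)
    except (KeyError, TypeError, AttributeError) as exc:
        # Report malformed reports on stderr instead of dropping them silently
        print(f"skipping malformed report: {exc!r}", file=sys.stderr, flush=True)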
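
Before shipping the converter output, it can help to check each record against the basic EMF rules: the timestamp is a number of epoch milliseconds, every dimension key points at a string member, and every metric name points at a numeric member. The validator below is a minimal sketch with a hypothetical name (validate_emf). It covers only scalar metric values, which is all the converter emits, and it requires an integer timestamp, which is stricter than EMF's "a number".

```python
from numbers import Real


def validate_emf(record: dict) -> list:
    """Return a list of problems found in one EMF record (empty if none)."""
    problems = []
    aws = record.get("_aws", {})
    ts = aws.get("Timestamp")
    if isinstance(ts, bool) or not isinstance(ts, int):
        problems.append("Timestamp must be an integer (epoch milliseconds)")
    for directive in aws.get("CloudWatchMetrics", []):
        # Check each dimension key once, even if it appears in several sets.
        dims = sorted({d for dim_set in directive.get("Dimensions", [])
                       for d in dim_set})
        for dim in dims:
            if not isinstance(record.get(dim), str):
                problems.append(f"dimension {dim!r} missing or not a string")
        for metric in directive.get("Metrics", []):
            name = metric["Name"]
            value = record.get(name)
            if isinstance(value, bool) or not isinstance(value, Real):
                problems.append(f"metric {name!r} missing or not numeric")
    return problems


# Hypothetical record with one deliberate mistake: RemotePort is an int.
sample = {
    "_aws": {
        "Timestamp": 1700000000000,
        "CloudWatchMetrics": [{
            "Namespace": "NFM/NetworkFlows",
            "Dimensions": [["NodeName", "RemotePort"], ["NodeName"]],
            "Metrics": [{"Name": "AvgRttUs", "Unit": "Microseconds"}],
        }],
    },
    "NodeName": "i-0123456789abcdef0",
    "RemotePort": 80,
    "AvgRttUs": 76.0,
}
print(validate_emf(sample))

fixed = dict(sample, RemotePort="80")
print(validate_emf(fixed))
```

This is why nfm_to_emf.py wraps the port in str() before using it as a dimension.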

cwagent-config.json: CloudWatch Agent configuration

{
  "agent": {
    "region": "cn-north-1",
    "debug": false
  },
  "logs": {
    "metrics_collected": {
      "emf": {}
    },
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/emf-logs/emf.log",
            "log_group_name": "/nfm/emf-metrics",
            "log_stream_name": "{instance_id}",
            "timezone": "UTC"
          }
        ]
      }
    },
    "force_flush_interval": 5
  }
}

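Key configuration notes:

  • metrics_collected.emf: {}: enables EMF parsing; when the CloudWatch Agent finds a _aws.CloudWatchMetrics field in a log line, it extracts the metrics automatically
  • file_path: the EMF log file written by emf-converter
  • log_group_name: the EMF logs are also stored in CloudWatch Logs, so the details can be queried with Logs Insights
  • force_flush_interval: 5: flush every 5 seconds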

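Metrics in CloudWatch

After deployment, two namespaces appear in CloudWatch automatically:

NFM/NetworkFlows

Dimension sets:

  • NodeName: node-level summary
  • NodeName + RemoteAddress + RemotePort: view by destination
  • NodeName + LocalAddress + RemoteAddress + RemotePort: full view

Metric            Unit          Description
AvgRttUs          Microseconds  Average TCP round-trip time
MaxRttUs          Microseconds  Maximum RTT
AvgConnectUs      Microseconds  Average time to establish a TCP connection
BytesReceived     Bytes         Bytes received
BytesDelivered    Bytes         Bytes sent
Retransmissions   Count         Number of TCP retransmissions
RetransRate       None          Retransmission rate
SeveredConnect    Count         Connections that were severed abnormally
SocketsCompleted  Count         Connections that completed normally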

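NFM/HostStats

Metric                       Description
BwInAllowanceExceeded        Packets queued or dropped because inbound bandwidth exceeded the allowance
BwOutAllowanceExceeded       Packets queued or dropped because outbound bandwidth exceeded the allowance
PpsAllowanceExceeded         Packets queued or dropped because PPS exceeded the allowance
ConntrackAllowanceExceeded   Packets dropped because connection tracking exceeded the allowance
ConntrackAllowanceAvailable  Tracked connections still available
LinklocalAllowanceExceeded   Packets dropped because PPS to local proxy services exceeded the allowance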

Approaches to RTT anomaly detection

Static thresholds

AvgRttUs > 2000 (2 ms) → WARNING
AvgRttUs > 5000 (5 ms) and Retransmissions > 0 → CRITICAL
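
As a sketch, the two static rules map onto a small classifier. The function name and the "OK" fallback level are assumptions added for the example; the thresholds are the ones listed above.

```python
def rtt_severity(avg_rtt_us: float, retransmissions: int) -> str:
    """Classify one flow's reporting window using the static thresholds."""
    if avg_rtt_us > 5000 and retransmissions > 0:
        return "CRITICAL"
    if avg_rtt_us > 2000:
        return "WARNING"
    return "OK"  # assumed default level when no rule fires


for avg, retrans in [(76, 0), (2500, 0), (6000, 0), (6000, 3)]:
    print(avg, retrans, rtt_severity(avg, retrans))
```

Note that a window with a 6 ms RTT but no retransmissions only reaches WARNING, because the CRITICAL rule requires both conditions.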

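Combined diagnosis

RTT spike + normal retransmissions → slow peer, an application-layer problem
RTT spike + rising retransmissions → network congestion or a link problem
Normal RTT + rising retransmissions → packet loss, possibly a NIC or switch problem
RTT spike + rising SeveredConnect → severe network failure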
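
The decision table can be written as a short function. This is a sketch: the boolean inputs are assumed to come from whatever spike and trend detection is in place, and the precedence (severed connections are checked first) is an assumption, since the table does not say how overlapping rows are resolved.

```python
def diagnose(rtt_spike: bool, retrans_up: bool, severed_up: bool = False) -> str:
    """Map the three signals onto the diagnoses in the table above."""
    if rtt_spike and severed_up:
        return "severe network failure"
    if rtt_spike and retrans_up:
        return "network congestion or link problem"
    if rtt_spike:
        return "slow peer, application-layer problem"
    if retrans_up:
        return "packet loss, possibly NIC or switch"
    return "no anomaly"  # assumed label when no row matches


print(diagnose(rtt_spike=True, retrans_up=False))
print(diagnose(rtt_spike=False, retrans_up=True))
```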

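For long-lived connections, RTT data comes from the kernel's sample on every ACK, so each 30-second window carries a continuous rtt_us {count, min, max, sum} summary. Two checks are used:

  • Spike: cur_avg / prev_avg > 3
  • Jitter: max / avg > 10 (intermittent problems such as GC pauses or brief network interruptions)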
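
Both checks need only two consecutive rtt_us summaries for the same flow. The sketch below assumes hypothetical helper names (window_avg, detect) and treats an empty previous window as "no spike"; the sample windows are made up, apart from the first one, which reuses the rtt_us values from the report shown earlier.

```python
def window_avg(rtt: dict) -> float:
    """Average RTT of one window from its {count, min, max, sum} summary."""
    return rtt["sum"] / max(rtt["count"], 1)


def detect(prev: dict, cur: dict, spike_ratio: float = 3.0,
           jitter_ratio: float = 10.0) -> dict:
    """Apply the spike and jitter checks to two consecutive windows."""
    prev_avg, cur_avg = window_avg(prev), window_avg(cur)
    return {
        "spike": prev_avg > 0 and cur_avg / prev_avg > spike_ratio,
        "jitter": cur_avg > 0 and cur["max"] / cur_avg > jitter_ratio,
    }


baseline = {"count": 5, "min": 61, "max": 91, "sum": 380}        # avg 76 us
slow = {"count": 5, "min": 70, "max": 4000, "sum": 4400}         # avg 880 us
bursty = {"count": 100, "min": 60, "max": 2000, "sum": 8000}     # avg 80 us

print(detect(baseline, slow))
print(detect(baseline, bursty))
```

The first comparison flags a spike (880 / 76 is well above 3) but no jitter, while the second flags jitter only, because a single 2000 us sample stands out against an 80 us average.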

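Simulating high RTT for verification

Use tc (traffic control) to inject latency on the network interface and verify the collection and alerting pipeline.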
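
One possible way to do this is the netem queueing discipline. The sketch below only builds and prints the commands by default. The interface name eth0, the 5 ms delay and the helper names are assumptions for illustration, and actually running the commands requires root (CAP_NET_ADMIN) on the host.

```python
import shlex
import subprocess


def netem_commands(dev: str, delay_ms: int):
    """Build the tc commands that add and remove a fixed netem delay."""
    add = ["tc", "qdisc", "add", "dev", dev, "root", "netem",
           "delay", f"{delay_ms}ms"]
    delete = ["tc", "qdisc", "del", "dev", dev, "root"]
    return add, delete


def inject_delay(dev: str, delay_ms: int, dry_run: bool = True) -> None:
    """Print the commands, or run the 'add' command when dry_run is False."""
    add, delete = netem_commands(dev, delay_ms)
    if dry_run:
        print(shlex.join(add))
        print("# to undo:", shlex.join(delete))
        return
    subprocess.run(add, check=True)  # needs root / CAP_NET_ADMIN


inject_delay("eth0", 5)  # dry run: prints the commands only
```

With a delay of a few milliseconds in place, AvgRttUs for affected flows should rise past the static thresholds within one or two reporting intervals. Remember to remove the qdisc afterwards.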
