Collecting K8S Logs with fluentd

1 Introduction

Fluentd turns out to be an excellent log collection and transformation tool, and a very lightweight one.

It supports flexible extension through Input, Filter, Parser, and Output plugins.

Put plainly: it is more feature-rich than Filebeat and lighter than Logstash, which is obvious just from comparing memory usage.

So let's use fluentd to collect K8S logs. The basis for this: by default, Docker writes container logs to JSON files under /var/lib/docker/containers/:

shell
[root@k8s-node04 ~]# head  -100 /var/lib/docker/containers/feb5be2528d96c21d6e77508d0e5e25d9fc5aada95a8cea8191fcc77f3154dc1/feb5be2528d96c21d6e77508d0e5e25d9fc5aada95a8cea8191fcc77f3154dc1-json.log
{"log":"2025-05-17 09:47:20.649  INFO 1 --- [nio-8081-exec-3] com.willlink.web.core.mvc.WebLogAspect   : 参数 : [1, 10, null, null, null, null]\n","stream":"stdout","time":"2025-05-17T01:47:20.65052207Z"}
{"log":"  Creating a new SqlSession\n","stream":"stdout","time":"2025-05-17T01:47:20.650531313Z"}
{"log":"  SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@47925525] was not registered for synchronization because synchronization is not active\n","stream":"stdout","time":"2025-05-17T01:47:20.650532873Z"}
{"log":"  JDBC Connection [HikariProxyConnection@2075305405 wrapping com.mysql.jdbc.JDBC4Connection@2ddf3e2e] will not be managed by Spring\n","stream":"stdout","time":"2025-05-17T01:47:20.650534233Z"}
{"log":"  ==\u003e  Preparing: SELECT COUNT(*) AS total FROM t_pile_type WHERE flag = 1\n","stream":"stdout","time":"2025-05-17T01:47:20.650535501Z"}
{"log":"  ==\u003e Parameters:\n","stream":"stdout","time":"2025-05-17T01:47:20.650537251Z"}

2 Log Processing Requirements

  • The log is JSON with three fields: log is the program's original output (the trailing \n is appended by Docker itself), stream is the program's output stream, and time is a UTC timestamp. Note that this time must be adjusted, otherwise it will lag Beijing time by 8 hours. This has nothing to do with the container's or the host's clock: even if those are set correctly, the timestamp in the log file will still be wrong, so fluentd is later used to add the 8 hours

  • Multi-line merging is also needed: the first line of a record starts with a date, while the following lines are plain message output, e.g. Java stack traces or database query results, which must be merged upward. The merge rule is therefore whether log contains (or starts with) a date — see the sketch after this list

  • Add a log_level field to mark each record's log level, e.g. INFO, ERROR, WARN, ... — these appear inside the log field

  • Add K8S metadata, e.g. the container's docker ID, image address, pod ID, pod IP, namespace, container IP, ... By default all K8S metadata is attached, so the irrelevant fields need to be stripped afterwards

  • Indices written by fluentd are uniformly named k8s-pod-%Y-%m-%d, rolled over by date
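
As a quick sanity check of that merge rule, the record-start regex can be run against the raw docker log shown above (a rough sketch; it assumes jq is available on the node, and the path glob is illustrative):

shell
# Lines whose `log` field begins with a yyyy-mm-dd date start a new record;
# everything else is a continuation line that gets merged upward.
LOG=/var/lib/docker/containers/feb5be2528d9*/*-json.log
jq -r '.log' $LOG | grep -E  '^[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -3   # record starters
jq -r '.log' $LOG | grep -vE '^[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -3   # continuation lines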

3 Creating the Pods

3.1 Creating Elasticsearch

1. Prepare Elasticsearch. Prerequisite: a dynamic StorageClass (SC) must be configured for the volume claims. If you would rather not deploy ES into K8S, the traditional way of deploying ES works too.

yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: kube-logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: elasticsearch:7.17.1
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              cpu: 2000m
            requests:
              cpu: 500m
          ports:
            - containerPort: 9200
              name: rest
              protocol: TCP
            - containerPort: 9300
              name: inter-node
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
            - name: localtime
              readOnly: true
              mountPath: /etc/localtime
          env:
            - name: cluster.name
              value: k8s-logs
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
            - name: cluster.initial_master_nodes
              value: "es-cluster-0,es-cluster-1,es-cluster-2"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
      initContainers:
        - name: fix-permissions
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
          securityContext:
            privileged: true
          volumeMounts:
          - name: data
            mountPath: /usr/share/elasticsearch/data
        - name: increase-vm-max-map
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
        - name: increase-fd-ulimit
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ["sh", "-c", "ulimit -n 65536"]
          securityContext:
            privileged: true
      volumes:
        - name: localtime
          hostPath:
            path: /etc/localtime          
  volumeClaimTemplates:
    - metadata:
        name: data
        labels:
          app: elasticsearch
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: nfs-storage
        resources:
          requests:
            storage: 100Gi
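
The StatefulSet's serviceName: elasticsearch assumes a matching headless Service in kube-logging, which is not shown above; a minimal sketch, followed by a cluster health check once all three pods are Running:

shell
# Headless Service backing the StatefulSet DNS names (es-cluster-0.elasticsearch, ...)
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: kube-logging
  labels:
    app: elasticsearch
spec:
  clusterIP: None
  selector:
    app: elasticsearch
  ports:
    - name: rest
      port: 9200
    - name: inter-node
      port: 9300
EOF

# Spot-check cluster health (the ES image ships with curl)
kubectl -n kube-logging exec es-cluster-0 -- curl -s http://localhost:9200/_cluster/health?pretty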

3.2 Creating the fluentd Pod

Prepare the ConfigMap

1. Create the ConfigMap, i.e. fluentd's configuration file

yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: kube-logging
data:
  containers.input.conf: |-
    <source>
      @id fluentd-containers.log
      @type tail
      # hostPath mounts (see the DaemonSet below) give fluentd access to the node's pod logs
      path "/var/log/containers/*.log"
      pos_file /var/log/es-containers.log.pos
      tag raw.kubernetes.*
      # false = only tail new (incremental) lines; true would read existing files from the beginning
      read_from_head false
      <parse>
        @type multi_format
        <pattern>
          # try matching log lines as JSON first
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%N%z
          keep_time_key false
        </pattern>
        # fallback pattern for log lines that are not JSON
        <pattern>
          format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%z
        </pattern>
      </parse>
    </source>

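    # detect_exceptions (fluent-plugin-detect-exceptions) folds multi-line exception
    # stack traces into a single event before the concat filter runs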
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes   500000
      max_lines   1000
    </match>

    <filter **>
      @id filter_concat
      @type concat
      key log
      # multi-line merge rule: a new record starts where a date (or ==> / <==) appears
      multiline_start_regexp /(\d{4}-\d{2}-\d{2}|==>|<==)/
      separator ""
      flush_interval 5s
    </filter>

    <filter kubernetes.**>
      @id filter_kubernetes_metadata
      @type kubernetes_metadata
      skip_container_metadata true
      skip_labels true
      skip_master_url true
    </filter>

    <filter kubernetes.**>
      @id filter_k8s_cleanup
      @type record_transformer
      # remove unneeded K8S metadata (optional)
      remove_keys $.kubernetes.pod_name,$.kubernetes.namespace_id,$.docker,$.stream
    </filter>

    <filter kubernetes.**>
      # promote the nested kubernetes.* fields one level up
      @id filter_promote_k8s_fields
      @type record_transformer
      enable_ruby true
      <record>
        container_name  ${record['kubernetes']['container_name']}
        namespace_name  ${record['kubernetes']['namespace_name']}
        pod_id          ${record['kubernetes']['pod_id']}
        pod_ip          ${record['kubernetes']['pod_ip']}
        host            ${record['kubernetes']['host']}
      </record>
      # keep the original record fields instead of rebuilding the record from scratch
      renew_record false
      remove_keys $.kubernetes
    </filter>

    # extract the log level from the log field
    <filter kubernetes.**>
      @id filter_log_level
      @type record_transformer
      enable_ruby true
      <record>
        log_level ${record['log'] =~ /(INFO|DEBUG|ERROR|WARN|error|debug|info|warning)/ ? $1 : 'UNKNOWN'}
      </record>
    </filter>


  output.conf: |-
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level error
      type_name _doc
      # resolves via the Service name if your elasticsearch runs inside K8S
      host elasticsearch
      port 9200
      # uncomment if ES has authentication enabled
      #user elastic
      #password ****
      # do not use logstash's default index naming
      logstash_format false
      index_name k8s-pod-%Y-%m-%d
      time_key time
      time_key_format %Y-%m-%dT%H:%M:%S%z
      include_timestamp true
      <buffer time>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        total_limit_size 500M
        overflow_action block
        timekey 1d
        timekey_use_utc false
      </buffer>
    </match>

  fluent.conf: |-
    <system>
      log_level error
    </system>
    
    <label @FLUENT_LOG>
      <match fluent.*>
        @type stdout
      </match>
    </label>
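
Since record_transformer with enable_ruby evaluates plain Ruby, the log_level expression above can be sanity-checked outside fluentd (a sketch; assumes ruby is installed locally):

shell
ruby -e 'log = "2025-05-17 09:47:20.649  INFO 1 --- [nio-8081-exec-3] ..."; puts(log =~ /(INFO|DEBUG|ERROR|WARN|error|debug|info|warning)/ ? $1 : "UNKNOWN")'
# prints: INFO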

Set up RBAC

Because the K8S metadata is fetched from the api-server component, an authorized service account needs to be configured

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-es
  namespace: kube-logging
  labels:
    k8s-app: fluentd-es
    addonmanager.kubernetes.io/mode: Reconcile
---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd-es
  namespace: kube-logging
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd-es
  apiGroup: ""
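
After applying these, the grant can be verified by impersonating the service account:

shell
kubectl auth can-i list pods --as=system:serviceaccount:kube-logging:fluentd-es
# expected output: yes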

Create the DaemonSet

yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-es-v3.1.0
  namespace: kube-logging
  labels:
    k8s-app: fluentd-es
    version: v3.1.0
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-es
      version: v3.1.0
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        version: v3.1.0
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      priorityClassName: system-node-critical
      serviceAccountName: fluentd-es
      containers:
      - name: fluentd-es
        #image: quay.io/fluentd_elasticsearch/fluentd:v3.1.0
        image: quay.io/fluentd_elasticsearch/fluentd:v4.7.2
        imagePullPolicy: IfNotPresent
        env:
        - name: FLUENTD_ARGS
          value: --no-supervisor -q
        resources:
          limits:
            memory: 1024Mi
            cpu: 800m
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
          - name: varlog
            mountPath: /var/log
          - name: varlibdockercontainers
            mountPath: /var/lib/docker/containers
            readOnly: true
          - name: config-volume
            mountPath: /etc/fluent/fluent.conf
            subPath: fluent.conf
          - name: config-volume
            mountPath: /etc/fluent/config.d/containers.input.conf
            subPath: containers.input.conf
          - name: config-volume
            mountPath: /etc/fluent/config.d/output.conf
            subPath: output.conf
          - name: localtime
            mountPath: /etc/localtime
        ports:
          - containerPort: 24231
            name: prometheus
            protocol: TCP
        livenessProbe:
          tcpSocket:
            port: prometheus
          initialDelaySeconds: 5
          timeoutSeconds: 10
        readinessProbe:
          tcpSocket:
            port: prometheus
          initialDelaySeconds: 5
          timeoutSeconds: 10
      terminationGracePeriodSeconds: 30
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: config-volume
          configMap:
            name: fluentd-config
        - name: localtime
          hostPath:
            path: /etc/localtime
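
Once the DaemonSet is up, a fluentd pod should be running on every node, and the daily indices should appear in ES shortly afterwards (reusing the exec-into-ES check from above):

shell
kubectl -n kube-logging get pods -l k8s-app=fluentd-es -o wide
kubectl -n kube-logging exec es-cluster-0 -- curl -s 'http://localhost:9200/_cat/indices/k8s-pod-*?v'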

4 Viewing the Logs

Here are the results after the collection has run for a while. Shards and replicas can be configured yourself via an index template in Kibana.
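
For example, a minimal index template pinning shards and replicas for the daily indices (a sketch; the template name and the settings values are assumptions, adjust to taste):

shell
kubectl -n kube-logging exec es-cluster-0 -- curl -s -XPUT \
  'http://localhost:9200/_index_template/k8s-pod' \
  -H 'Content-Type: application/json' \
  -d '{"index_patterns":["k8s-pod-*"],"template":{"settings":{"number_of_shards":3,"number_of_replicas":1}}}'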

Looking at a document's JSON fields, only the necessary K8S information has been kept

json
{
  "_index": "k8s-pod-2025-05-19",
  "_type": "_doc",
  "_id": "Eh__55YB_RI1PaU21I5o",
  "_version": 1,
  "_score": 1,
  "_source": {
    "log": "2025-05-19 18:03:40.819 [nioEventLoopGroup-6-31] INFO  c.c.n.k.CustomDelimiterDecoder - 解析报文完成,channelId = 6adbc0e1, oldReadableBytes = 1115,nowReadableBytes = 0\n",
    "datetime": "2025-05-19T18:03:40.820231799+0800",
    "container_name": "automotive-netty",
    "namespace_name": "automotive-dev",
    "pod_id": "dca185db-e1a4-4392-9d84-53b4971c2b93",
    "pod_ip": "172.17.217.71",
    "host": "k8s-node04",
    "log_level": "INFO",
    "@timestamp": "2025-05-19T18:03:40.820233305+08:00"
  },
  "fields": {
    "log": [
      "2025-05-19 18:03:40.819 [nioEventLoopGroup-6-31] INFO  c.c.n.k.CustomDelimiterDecoder - 解析报文完成,channelId = 6adbc0e1, oldReadableBytes = 1115,nowReadableBytes = 0\n"
    ],
    "pod_ip": [
      "172.17.217.71"
    ],
    "log_level": [
      "INFO"
    ],
    "pod_ip.keyword": [
      "172.17.217.71"
    ],
    "log_level.keyword": [
      "INFO"
    ],
    "container_name.keyword": [
      "automotive-netty"
    ],
    "namespace_name": [
      "automotive-dev"
    ],
    "pod_id.keyword": [
      "dca185db-e1a4-4392-9d84-53b4971c2b93"
    ],
    "datetime": [
      "2025-05-19T10:03:40.820Z"
    ],
    "@timestamp": [
      "2025-05-19T10:03:40.820Z"
    ],
    "container_name": [
      "automotive-netty"
    ],
    "host": [
      "k8s-node04"
    ],
    "log.keyword": [
      "2025-05-19 18:03:40.819 [nioEventLoopGroup-6-31] INFO  c.c.n.k.CustomDelimiterDecoder - 解析报文完成,channelId = 6adbc0e1, oldReadableBytes = 1115,nowReadableBytes = 0\n"
    ],
    "namespace_name.keyword": [
      "automotive-dev"
    ],
    "host.keyword": [
      "k8s-node04"
    ],
    "pod_id": [
      "dca185db-e1a4-4392-9d84-53b4971c2b93"
    ]
  }
}