【ELK】Routing Filebeat data to specific Kafka topics

Background

A new requirement came in today: in production, the Filebeat output needs to be refined so that different kinds of content go to different Kafka topics. If you haven't read the previous installment in this series, go catch up on it first.

Filebeat main configuration

The main Filebeat configuration again follows the official Filebeat documentation; it looks like this:

    filebeat.inputs:
    - type: log    # note: the log input is deprecated in Filebeat 8.x; filestream is its successor
      paths:
        - /data/filebeat-data/*.log
      processors:
      - add_fields:
          target: ""
          fields:
            log_type: "bizlog"


    # Hints-based autodiscover for container logs (used here alongside filebeat.inputs):
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          hints.default_config:
            type: container
            paths:
              - /var/log/containers/*.log
            processors:
            - add_fields:
                target: ""
                fields:
                  log_type: "k8slog"    # tag container logs so they match the k8slog topic condition below
   

    #output.elasticsearch:
    #  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
    #  username: ${ELASTICSEARCH_USERNAME}
    #  password: ${ELASTICSEARCH_PASSWORD}
    output.kafka:
      hosts: ['${KAFKA_HOST:kafka}:${KAFKA_PORT:9092}']
      topic: log_topic_all
      topics:
        - topic: "bizlog-%{[agent.version]}"
          when.contains:
            log_type: "bizlog"
        - topic: "k8slog-%{[agent.version]}"
          when.contains:
            log_type: "k8slog"
---

The filebeat.inputs section

We added a processors block to the input; it injects a custom field into every output document.

      processors:
      - add_fields:
          target: ""      # an empty target puts the fields at the document root
          fields:
            log_type: "bizlog"      # the custom field we will route on

With this in place, every event carries the custom field at its root. A trimmed sketch of an output document (field values are illustrative, not a real capture):
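
    {
      "@timestamp": "2024-05-20T08:00:00.000Z",
      "message": "order created, id=10086",
      "log_type": "bizlog",
      "agent": {
        "type": "filebeat",
        "version": "8.9.2"
      },
      "log": {
        "file": {
          "path": "/data/filebeat-data/app.log"
        }
      }
    }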

The filebeat.output section

The output points straight at the Kafka endpoint.

This example resolves the connection details from environment variables; the ${KAFKA_HOST:kafka} syntax falls back to the default (kafka) when the variable is unset.

Routing to different topics is configured directly in the topics block.

    output.kafka:
      hosts: ['${KAFKA_HOST:kafka}:${KAFKA_PORT:9092}']
      topic: log_topic_all    # events that match none of the conditions below go to this default topic
      topics:              
        - topic: "bizlog-%{[agent.version]}"      # 指定输出topic
          when.contains:
            log_type: "bizlog"                    # input中我们注入的标签这里可以作为判断条件使用
        - topic: "k8slog-%{[agent.version]}"
          when.contains:
            log_type: "k8slog"

Events now land in the per-type topics. A quick spot-check from the broker host (assuming the stock Kafka CLI scripts are available; the topic suffix comes from agent.version, 8.9.2 for this image):
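
    # list topics; bizlog-8.9.2 and k8slog-8.9.2 should appear once events flow
    kafka-topics.sh --bootstrap-server 100.99.17.19:9092 --list

    # consume a few messages from the bizlog topic to confirm routing
    kafka-console-consumer.sh --bootstrap-server 100.99.17.19:9092 \
      --topic bizlog-8.9.2 --from-beginning --max-messages 5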

This matches the requirement exactly.

Complete Filebeat configuration

Filebeat is deployed here as a DaemonSet; the complete YAML manifest follows:

apiVersion: v1
data:
  .dockerconfigjson: ewogICJhdXRocyI6IHsKICAgICJmdmhiLmZqZWNsb3VkLmNvbSI6IHsKICAgICAgInVzZXJuYW1lIjogImFkbWluIiwKICAgICAgInBhc3N3b3JkIjogIkF0eG41WW5MWG5KS3JsVFciCiAgICB9CiAgfQp9Cg==
kind: Secret
metadata:
  name: harbor-login
  namespace: kube-system
type: kubernetes.io/dockerconfigjson
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
    - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
    - jobs
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat
  # should be the namespace where filebeat is running
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: log    # note: the log input is deprecated in Filebeat 8.x; filestream is its successor
      paths:
        - /data/filebeat-data/*.log
      processors:
      - add_fields:
          target: ""
          fields:
            log_type: "bizlog"


    # Hints-based autodiscover for container logs (used here alongside filebeat.inputs):
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          hints.default_config:
            type: container
            paths:
              - /var/log/containers/*.log
            processors:
            - add_fields:
                target: ""
                fields:
                  log_type: "k8slog"    # tag container logs so they match the k8slog topic condition below
   

    #output.elasticsearch:
    #  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
    #  username: ${ELASTICSEARCH_USERNAME}
    #  password: ${ELASTICSEARCH_PASSWORD}
    output.kafka:
      hosts: ['${KAFKA_HOST:kafka}:${KAFKA_PORT:9092}']
      topic: log_topic_all
      topics:
        - topic: "bizlog-%{[agent.version]}"
          when.contains:
            log_type: "bizlog"
        - topic: "k8slog-%{[agent.version]}"
          when.contains:
            log_type: "k8slog"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: fvhb.fjecloud.com/beats/filebeat:8.9.2
        securityContext:
          runAsUser: 0
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: KAFKA_HOST
          value: "100.99.17.19"
        - name: KAFKA_PORT
          value: "9092" 
        - name: ELASTICSEARCH_HOST
          value: "100.99.17.19"
        - name: ELASTICSEARCH_PORT
          value: "19200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: qianyue@2024
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /data/docker/containers
          readOnly: true
        - name: filebeat-data
          mountPath: /data/filebeat-data
        - name: varlog
          mountPath: /var/log
          readOnly: true
      imagePullSecrets:
      - name: harbor-login
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /data/docker/containers
      - name: filebeat-data
        hostPath:
          path: /data/filebeat-data
          type: DirectoryOrCreate
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
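To roll it out, apply the manifest and confirm that one Filebeat pod reaches Running on each node (a minimal sketch; filebeat-kafka.yaml is a placeholder for whatever file name you saved the manifest under):

    # create / update all objects in the manifest
    kubectl apply -f filebeat-kafka.yaml

    # the DaemonSet should schedule one pod per node
    kubectl -n kube-system get pods -l k8s-app=filebeat -o wide

    # tail the pods' logs to confirm the Kafka connection is established
    kubectl -n kube-system logs -l k8s-app=filebeat --tail=50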