【ELK】Sending Filebeat output to specific Kafka topics

Background

Today a request came in: in our production environment, Filebeat's output needs to be reworked so that different kinds of content go to different Kafka topics. If you haven't read the previous post in this series, it's worth starting there.

Filebeat main configuration

As before, the main part of the Filebeat configuration follows the official Filebeat documentation. It looks like this:

    filebeat.inputs:
    - type: log    
      paths:
        - /data/filebeat-data/*.log
      processors:
      - add_fields:
          target: ""
          fields:
            log_type: "bizlog"


    # Hints-based autodiscover runs alongside filebeat.inputs here, picking up container logs automatically:
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          hints.default_config:
            type: container
            paths:
              - /var/log/containers/*.log
            processors:
            - add_fields:
                target: ""
                fields:
                  log_type: "bizlog"
   

    #output.elasticsearch:
    #  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
    #  username: ${ELASTICSEARCH_USERNAME}
    #  password: ${ELASTICSEARCH_PASSWORD}
    output.kafka:
      hosts: ['${KAFKA_HOST:kafka}:${KAFKA_PORT:9092}']
      topic: log_topic_all
      topics:
        - topic: "bizlog-%{[agent.version]}"
          when.contains:
            log_type: "bizlog"
        - topic: "k8slog-%{[agent.version]}"
          when.contains:
            log_type: "k8slog"
---

The filebeat.inputs section

In the input we add a processors block, which injects a custom field into every event Filebeat emits:

      processors:
      - add_fields:
          target: ""      # an empty target places the fields at the root of the event
          fields:
            log_type: "bizlog"      # the custom field we will route on

Each emitted event then carries the injected field at its root.
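
A minimal sketch of what one emitted event might look like (the values are illustrative, not captured from a real cluster; agent version 8.9.2 matches the image used later in this post):

    {
      "@timestamp": "2024-04-07T08:00:00.000Z",
      "message": "example business log line",
      "log_type": "bizlog",
      "agent": { "type": "filebeat", "version": "8.9.2" },
      "log": { "file": { "path": "/data/filebeat-data/app.log" } }
    }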

The output.kafka section

The output points straight at the Kafka brokers.

This example uses environment variables; the ${KAFKA_HOST:kafka} syntax falls back to the default value kafka when the variable is unset.

Routing events to different topics is configured directly in the topics list:

    output.kafka:
      hosts: ['${KAFKA_HOST:kafka}:${KAFKA_PORT:9092}']
      topic: log_topic_all    # fallback topic for events that match none of the conditions below
      topics:
        - topic: "bizlog-%{[agent.version]}"      # target topic; the format string expands at runtime, e.g. "bizlog-8.9.2" here
          when.contains:
            log_type: "bizlog"                    # the field injected in the input serves as the routing condition
        - topic: "k8slog-%{[agent.version]}"
          when.contains:
            log_type: "k8slog"

With this in place, each event lands on the topic that matches its log_type.

Exactly what the requirement called for.
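
To spot-check the routing, you can read a few messages from one of the topics with Kafka's bundled console consumer. A sketch, assuming the broker address from the environment variables below and agent version 8.9.2, so the topic name expands to bizlog-8.9.2:

    # consume a handful of events from the business-log topic
    kafka-console-consumer.sh \
      --bootstrap-server 100.99.17.19:9092 \
      --topic bizlog-8.9.2 \
      --from-beginning \
      --max-messages 5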

Complete Filebeat configuration

In this post Filebeat is deployed as a DaemonSet; the complete YAML manifest follows:

apiVersion: v1
data:
  .dockerconfigjson: ewogICJhdXRocyI6IHsKICAgICJmdmhiLmZqZWNsb3VkLmNvbSI6IHsKICAgICAgInVzZXJuYW1lIjogImFkbWluIiwKICAgICAgInBhc3N3b3JkIjogIkF0eG41WW5MWG5KS3JsVFciCiAgICB9CiAgfQp9Cg==
kind: Secret
metadata:
  name: harbor-login
  namespace: kube-system
type: kubernetes.io/dockerconfigjson
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
    - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
    - jobs
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat
  # should be the namespace where filebeat is running
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: log    
      paths:
        - /data/filebeat-data/*.log
      processors:
      - add_fields:
          target: ""
          fields:
            log_type: "bizlog"


    # Hints-based autodiscover runs alongside filebeat.inputs here, picking up container logs automatically:
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          hints.default_config:
            type: container
            paths:
              - /var/log/containers/*.log
            processors:
            - add_fields:
                target: ""
                fields:
                  log_type: "bizlog"
   

    #output.elasticsearch:
    #  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
    #  username: ${ELASTICSEARCH_USERNAME}
    #  password: ${ELASTICSEARCH_PASSWORD}
    output.kafka:
      hosts: ['${KAFKA_HOST:kafka}:${KAFKA_PORT:9092}']
      topic: log_topic_all
      topics:
        - topic: "bizlog-%{[agent.version]}"
          when.contains:
            log_type: "bizlog"
        - topic: "k8slog-%{[agent.version]}"
          when.contains:
            log_type: "k8slog"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: fvhb.fjecloud.com/beats/filebeat:8.9.2
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: KAFKA_HOST
          value: "100.99.17.19"
        - name: KAFKA_PORT
          value: "9092" 
        - name: ELASTICSEARCH_HOST
          value: "100.99.17.19"
        - name: ELASTICSEARCH_PORT
          value: "19200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: qianyue@2024
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /data/docker/containers
          readOnly: true
        - name: filebeat-data
          mountPath: /data/filebeat-data
        - name: varlog
          mountPath: /var/log
          readOnly: true
      imagePullSecrets:
      - name: harbor-login
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /data/docker/containers
      - name: filebeat-data
        hostPath:
          path: /data/filebeat-data
          type: DirectoryOrCreate
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
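
To roll this out, apply the manifest and confirm that a Filebeat pod is running on every node (filebeat-kafka.yaml is just an assumed file name for the manifest above):

    # deploy the DaemonSet and watch the pods come up
    kubectl apply -f filebeat-kafka.yaml
    kubectl -n kube-system get pods -l k8s-app=filebeat -o wide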