ELK: An Open-Source Log Collection Stack

I. Introduction

In modern enterprise IT systems, applications keep growing in scale, architectures keep evolving, and the dependencies between systems become increasingly complex. To keep these systems available, performant, and stable, building a solid log collection and analysis platform has become essential.

ELK (Elasticsearch, Logstash, Kibana) is one of the most widely adopted open-source logging solutions. Combined with Beats components such as Filebeat, Heartbeat, and Metricbeat, it provides unified collection, processing, and visualization of multi-source data such as application logs, system metrics, and uptime probes, helping operations and development teams locate problems quickly, monitor system health, and respond to incidents faster.

II. Components

A complete log collection stack usually consists of the following core components:

1. Elasticsearch (ES)

Elasticsearch is a distributed search and analytics engine used to store and query log data. Its core features include:

  • Strong horizontal scalability for massive data volumes;

  • Efficient full-text search and aggregation analytics;

  • Multi-node clusters for high availability.

2. Logstash

Logstash is a powerful data processing pipeline responsible for parsing, filtering, and transforming logs. Its characteristics include:

  • A plugin mechanism covering input, filter, output, and other pipeline stages;

  • Filters such as grok, geoip, and mutate;

  • A good fit for scenarios that need complex ETL or multi-format log processing.

3. Kibana

Kibana is the visualization and management UI for exploring and analyzing the data stored in Elasticsearch. Its features include:

  • Dashboard building;

  • Log search;

  • System monitoring;

  • Index management and visualization editing.

4. Beats (Filebeat / Metricbeat, etc.)

Beats are lightweight data shippers, the main ones being:

  • Filebeat: collects log files;

  • Metricbeat: collects system and application metrics;

  • Heartbeat: collects availability (uptime probe) data;

  • Auditbeat: collects audit data.

Beats are typically deployed on the application servers and forward the raw data to Logstash or Elasticsearch.

5. Kafka (optional)

In enterprise architectures, Kafka is often added as a buffering and decoupling layer for logs, improving reliability and throughput, as sketched below.
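
For illustration only (Kafka is not part of the deployment below), a minimal sketch of what that buffering tier could look like: Filebeat writes to a Kafka topic and Logstash consumes from it. The broker address kafka:9092 and the topic name app-logs are hypothetical placeholders.

# filebeat.yml (excerpt) - ship events to Kafka instead of Logstash
output.kafka:
  hosts: ["kafka:9092"]          # hypothetical broker address
  topic: "app-logs"              # hypothetical topic name
  required_acks: 1
  compression: gzip

# Logstash pipeline (excerpt) - consume the same topic
input {
  kafka {
    bootstrap_servers => "kafka:9092"
    topics => ["app-logs"]
    codec => "json"              # Filebeat's Kafka output sends JSON-encoded events
  }
}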

III. Deployment and Configuration

Taking a common architecture as an example, we deploy and configure the stack in a Kubernetes environment.

Assume we have two log formats, one from a frontend service and one from a backend service, with the following contents:

# Frontend service iwt-demo-site log

!!!!!!!! ::: 10.3.3.220 ::: - ::: - ::: [14/Nov/2025:09:21:17 +0000] ::: "GET /assets/intro.8e730512.png HTTP/1.1" ::: 200 ::: 6320762 ::: "https://www.demo.com/assets/index.4fb9a4ac.css" ::: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36" ::: "10.3.2.154"

# Backend service iwt-demo-service log

!!!!!!!! 2025:12:12 14:49:44.512 | IcerCarbonRegistryService | [main] | INFO | c.a.n.client.config.impl.CacheData | [fixed-project-demo-10.3.3.23_8848] [add-listener] ok, tenant=project-demo, dataId=project-demo.yaml, group=DEFAULT_GROUP, cnt=1

1. Elasticsearch

We use a single-node Elasticsearch as an example. Create an es.yaml file and deploy it to Kubernetes with the following content:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: iwt-ctx-elk-es-config
  namespace: log
data:
  elasticsearch.yml: |-
    cluster.name: "docker-cluster"
    network.host: 0.0.0.0
    xpack.security.enabled: true
    xpack.security.http.ssl.enabled: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: iwt-ctx-elk-es
  name: iwt-ctx-elk-es
  namespace: log
spec:
  minReadySeconds: 10
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: iwt-ctx-elk-es
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: iwt-ctx-elk-es
    spec:
      affinity:
        nodeAffinity: {}
      containers:
        - env:
            - name: discovery.type
              value: single-node
            - name: ES_JAVA_OPTS
              value: -Xms4096m -Xmx12288m
            - name: MINIMUM_MASTER_NODES
              value: "1"
          image: elastic/elasticsearch:8.19.8
          imagePullPolicy: IfNotPresent
          name: iwt-ctx-elk-es
          ports:
            - containerPort: 9200
              name: db
              protocol: TCP
            - containerPort: 9300
              name: transport
              protocol: TCP
          resources:
            limits:
              cpu: 4000m
              memory: 16384Mi
            requests:
              cpu: 1000m
              memory: 4096Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: data
            - name: iwt-ctx-elk-es-config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
      dnsPolicy: ClusterFirst
      imagePullSecrets:
        - name: oci-container-registry
      initContainers:
        - command:
            - /sbin/sysctl
            - -w
            - vm.max_map_count=262144
          image: alpine:3.6
          imagePullPolicy: IfNotPresent
          name: iwt-ctx-elk-es-init
          securityContext:
            privileged: true
            procMount: Default
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - hostPath:
            path: /data/elk/elasticsearch/data
            type: ""
          name: data
        - name: iwt-ctx-elk-es-config
          configMap:
            name: iwt-ctx-elk-es-config
---
apiVersion: v1
kind: Service
metadata:
  namespace: log
  name: iwt-ctx-elk-es
  labels:
    app: iwt-ctx-elk-es
spec:
  ports:
    - port: 9200
      targetPort: 9200
      name: db
    - port: 9300
      targetPort: 9300
      name: transport
  selector:
    app: iwt-ctx-elk-es
  type: NodePort

Initialize the passwords of the built-in users:

$ kubectl -n log exec -it iwt-ctx-elk-es-6c6787fc49-6rxq8 -- /bin/sh

$ cd bin

$ ./elasticsearch-setup-passwords interactive

In this example we follow the script's prompts and set the password of every built-in user to admin123, to keep the configuration demo simple.
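
As an optional sanity check, you can confirm the credentials work by querying the cluster health API through the NodePort (look it up with kubectl; the <node-ip>:<node-port> below is a placeholder):

$ kubectl -n log get svc iwt-ctx-elk-es
$ curl -u elastic:admin123 "http://<node-ip>:<node-port>/_cluster/health?pretty"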

2. Kibana

apiVersion: v1
kind: ConfigMap
metadata:
  name: kibana-config
  namespace: log
data:
  kibana.yml: |
    server.name: iwt-ctx-elk-kibana
    i18n.locale: "en"
    server.host: "0.0.0.0"
    # Replace the encryptionKey with your own value
    xpack.encryptedSavedObjects.encryptionKey: "b8f3e1a2d4c9b6f1a7d3e2c8f0b1a5d7"
    elasticsearch.hosts: [ "http://iwt-ctx-elk-es:9200" ]
    elasticsearch.username: "kibana_system"
    elasticsearch.password: "admin123"
    server.publicBaseUrl: "https://kibana.iwt.devops.com"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iwt-ctx-elk-kibana
  namespace: log
  labels:
    app: iwt-ctx-elk-kibana
spec:
  replicas: 1
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
  selector:
    matchLabels:
      app: iwt-ctx-elk-kibana
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  template:
    metadata:
      labels:
        app: iwt-ctx-elk-kibana
        name: iwt-ctx-elk-kibana
    spec:
      containers:
        - name: kibana
          image: docker.elastic.co/kibana/kibana:8.19.8
          volumeMounts:
            - name: kibana-config-volume
              mountPath: /usr/share/kibana/config/kibana.yml
              subPath: kibana.yml
          imagePullPolicy: IfNotPresent
          command: ["sh", "-c", "/usr/share/kibana/bin/kibana"]
          ports:
            - containerPort: 5601
              protocol: TCP
          env:
            - name: ELASTICSEARCH_URL
              value: http://iwt-ctx-elk-es:9200
            - name: ELASTICSEARCH_HOSTS
              value: http://iwt-ctx-elk-es:9200
            - name: NODE_OPTIONS
              value: --max-old-space-size=6144
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
            limits:
              cpu: "4"
              memory: 8Gi
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      imagePullSecrets:
        - name: oci-container-registry
      dnsPolicy: ClusterFirst
      schedulerName: default-scheduler
      volumes:
        - name: kibana-config-volume
          configMap:
            name: kibana-config
---
apiVersion: v1
kind: Service
metadata:
  name: iwt-ctx-elk-kibana
  namespace: log
  labels:
    app: iwt-ctx-elk-kibana
spec:
  ports:
    - name: http
      port: 5601
      protocol: TCP
      targetPort: 5601
  selector:
    app: iwt-ctx-elk-kibana
  type: NodePort
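
Once the Kibana pod is ready, an optional quick check is its status API; the sketch below uses a port-forward so it works regardless of how the Service is exposed:

$ kubectl -n log port-forward svc/iwt-ctx-elk-kibana 5601:5601
$ curl -s "http://localhost:5601/api/status" | head -c 300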

3. Logstash

Logstash receives the log data sent by Filebeat, processes it according to the configured rules (in this example, frontend and backend logs go through different filters), and then ships it to Elasticsearch.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: iwt-ctx-elk-logstash-config
  namespace: log
data:
  logstash.yml: |-
    http.host: "0.0.0.0"
    path.config: /usr/share/logstash/conf.d/*.conf
  change.conf: |-
    input {
        beats {
            port => 5044
        }
    }
    
    filter {
      # Backend log processing
      if [logtype] =~ /-service$/ {
    
        grok {
          match => {"message" => "^!{8}\s+(?<timestamp>\d{4}[:\-]\d{2}[:\-]\d{2} \d{2}:\d{2}:\d{2}(?:[.,]\d{3})?)\s*\|\s*%{DATA:contextName}\s*\|\s*%{DATA:thread}\s*\|\s*(?:\[(?<traceId>[^\]]+)\]\s*\|)?\s*%{WORD:level}\s*\|\s*%{DATA:logger}\s*\|\s*%{GREEDYDATA:msg}$"}
        }
    
        mutate {
          remove_field => ["agent","ecs","host","log","input"]
        }
    
        date {
          match => ["timestamp", "yyyy:MM:dd HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSS"]
          target => "@timestamp"
          timezone => "Asia/Shanghai"
        }
      }
      # Frontend log processing
      if [logtype] =~ /-site$/ {
        grok {
          match => { "message" => '^!{8} ::: %{IP:remote_addr} ::: %{DATA:ident} ::: %{DATA:remote_user} ::: \[%{HTTPDATE:time_local}\] ::: "%{WORD:method} %{DATA:request} HTTP/%{NUMBER:http_version}" ::: %{NUMBER:status} ::: %{NUMBER:body_bytes_sent} ::: "%{DATA:http_referer}" ::: "%{DATA:http_user_agent}" ::: "%{DATA:http_x_forwarded_for}"$' }
        }
      
        mutate {
          remove_field => ["agent", "ecs", "host", "log", "input"]
        }
      
        date {
          match => ["time_local", "dd/MMM/yyyy:HH:mm:ss Z"]
          target => "@timestamp"
        }
      }
    
    }
    # Output to Elasticsearch
    output {
      elasticsearch {
        hosts => ["iwt-ctx-elk-es:9200"]
        index => "iwt-%{[logtype]}-%{+YYYY-MM}"
        user => "elastic"
        password => "admin123"
      }
    }

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iwt-ctx-elk-logstash
  namespace: log
  labels: 
    app: iwt-ctx-elk-logstash
spec:
  replicas: 1
  selector:
    matchLabels: 
      app: iwt-ctx-elk-logstash
  template:
    metadata:
      labels: 
        app: iwt-ctx-elk-logstash
        name: iwt-ctx-elk-logstash
    spec:
      imagePullSecrets:
      - name: oci-container-registry
      containers:
      - name: iwt-ctx-elk-logstash
        image: docker.elastic.co/logstash/logstash:8.19.8
        ports:
        - containerPort: 5044
          protocol: TCP
          name: transfer
        - containerPort: 9600
          protocol: TCP
          name: config
        resources:
          limits:
            cpu: 2000m
            memory: 16Gi
          requests:
            cpu: 418m
            memory: 1Gi
        volumeMounts:
        - name: iwt-ctx-elk-logstash-config
          mountPath: /usr/share/logstash/conf.d/change.conf
          subPath: change.conf
        - name: iwt-ctx-elk-logstash-config
          mountPath: /usr/share/logstash/config/logstash.yml
          subPath: logstash.yml

      volumes:
      - name: iwt-ctx-elk-logstash-config
        configMap:
          name: iwt-ctx-elk-logstash-config
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: iwt-ctx-elk-logstash
  name: iwt-ctx-elk-logstash
  namespace: log
spec:
  ports:
  - name: transfer
    port: 5044
    protocol: TCP
    targetPort: 5044
  - name: http
    port: 9600
    protocol: TCP
    targetPort: 9600
  selector:
    app: iwt-ctx-elk-logstash
  type: NodePort
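
After the pod starts, you can optionally confirm that the pipeline defined in change.conf was loaded by querying the Logstash node API on port 9600:

$ kubectl -n log port-forward svc/iwt-ctx-elk-logstash 9600:9600
$ curl -s "http://localhost:9600/_node/pipelines?pretty"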

4. Filebeat

Filebeat collects the raw logs and forwards them to Logstash; adding a logtype field makes it easy for Logstash to pick the right filter. Note that the HTTP endpoint must be enabled so that Metricbeat can scrape monitoring data from it.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: iwt-ctx-elk-filebeat-config
  namespace: log
  labels:
    app: iwt-ctx-elk-filebeat
data:
  filebeat.yml: |-
    http:
      enabled: true
      host: 0.0.0.0
      port: 5066
    filebeat.config:
      inputs:
        # Mounted `iwt-ctx-elk-filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: true
        reload.period: 60s 
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    # Output to Logstash for processing
    output.logstash:
      hosts: ["iwt-ctx-elk-logstash:5044"]
      bulk_max_size: 4096
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: iwt-ctx-elk-filebeat-inputs
  namespace: log
  labels:
    app: iwt-ctx-elk-filebeat
data:
  filebeat-inputs.yml: |-
    
    - type: log
      enabled: true
      backoff: "1s"
      tail_files: true
      multiline:
        pattern: '^!{8}'
        negate: true
        match: after
      paths:
        - /data/logs/iwt-demo-site/access.log
      fields:
        logtype: iwt-demo-site
      fields_under_root: true
    
    - type: log
      enabled: true
      backoff: "1s"
      tail_files: true
      multiline:
        pattern: '^!{8}'
        negate: true
        match: after
      paths:
        - /data/logs/iwt-demo-service/*.log
      fields:
        logtype: iwt-demo-service
      fields_under_root: true

---
apiVersion: apps/v1 
kind: Deployment
metadata:
  name: iwt-ctx-elk-filebeat
  namespace: log
  labels:
    app: iwt-ctx-elk-filebeat
spec:
  replicas: 1
  selector:
    matchLabels:
      app: iwt-ctx-elk-filebeat
  template:
    metadata:
      labels:
        app: iwt-ctx-elk-filebeat
    spec:
      containers:
      - name: iwt-ctx-elk-filebeat
        image: docker.elastic.co/beats/filebeat:8.19.8
        ports:
          - containerPort: 5066
            name: http-metrics
            protocol: TCP
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            cpu: 2000m
            memory: 8Gi
          requests:
            cpu: 512m
            memory: 1Gi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: app-logs
          mountPath: /data/logs
        - name: inputs
          mountPath: /usr/share/filebeat/inputs.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: iwt-ctx-elk-filebeat-config
      - name: app-logs
        hostPath:
          path: /data/logs
      - name: inputs
        configMap:
          defaultMode: 0600
          name: iwt-ctx-elk-filebeat-inputs
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          path: /data/elk/filebeat
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: iwt-ctx-elk-filebeat
  name: iwt-ctx-elk-filebeat
  namespace: log
spec:
  ports:
  - name: transfer
    port: 5066
    protocol: TCP
    targetPort: 5066
  selector:
    app: iwt-ctx-elk-filebeat
  type: NodePort
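
To verify that Filebeat is shipping events and that the HTTP endpoint (which Metricbeat will scrape in the next step) is reachable, you can optionally query its stats API on port 5066:

$ kubectl -n log port-forward svc/iwt-ctx-elk-filebeat 5066:5066
$ curl -s "http://localhost:5066/stats?pretty"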

5. Metricbeat

In this example, Metricbeat collects monitoring data from the components above and sends it to Elasticsearch; Kibana's Stack Monitoring app then generates the monitoring dashboards automatically.

apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-elasticsearch
  namespace: log
data:
  # Main configuration file
  metricbeat.yml: |-
    metricbeat.config.modules:
      path: ${path.config}/modules.d/*.yml
      reload.enabled: false
    queue.mem:
      events: 4096
    setup.kibana:
      host: "http://iwt-ctx-elk-kibana:5601"
      username: "elastic"
      password: "admin123"

    output.elasticsearch:
      hosts: ["http://iwt-ctx-elk-es:9200"]
      username: "elastic"
      password: "admin123"
      bulk_max_size: 2048

    monitoring:
      enabled: true
      elasticsearch:
        hosts: ["http://iwt-ctx-elk-es:9200"]
        username: "elastic"
        password: "admin123"
        pipeline: "xpack-monitoring-8"

  # Elasticsearch module
  elasticsearch.yml: |-
    - module: elasticsearch
      xpack.enabled: true
      period: 30s
      hosts: ["http://iwt-ctx-elk-es:9200"]
      username: "elastic"
      password: "admin123"
      cluster_stats: true
      node: true
      enabled: true
  # Kibana module
  kibana.yml: |-
    - module: kibana
      xpack.enabled: true
      period: 30s
      hosts: ["http://iwt-ctx-elk-kibana:5601"]
      username: "elastic"
      password: "admin123"
      enabled: true
  # Logstash module
  logstash.yml: |-
    - module: logstash
      xpack.enabled: true
      period: 30s
      hosts: ["http://iwt-ctx-elk-logstash:9600"]
      username: "elastic"
      password: "admin123"
      enabled: true
  # Beat module (monitors Filebeat)
  beat.yml: |-
    - module: beat
      xpack.enabled: true
      period: 30s
      hosts: ["http://iwt-ctx-elk-filebeat:5066"]
      username: "elastic"
      password: "admin123"
      enabled: true
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metricbeat-es-monitor
  namespace: log
  labels:
    app: metricbeat-es
spec:
  replicas: 1
  selector:
    matchLabels:
      app: metricbeat-es
  template:
    metadata:
      labels:
        app: metricbeat-es
    spec:
      containers:
        - name: metricbeat
          image: docker.elastic.co/beats/metricbeat:8.19.0
          args: ["-e"]
          volumeMounts:
            - name: config
              mountPath: /usr/share/metricbeat/metricbeat.yml
              subPath: metricbeat.yml
            - name: modules
              mountPath: /usr/share/metricbeat/modules.d
      volumes:
        - name: config
          configMap:
            name: metricbeat-elasticsearch
            items:
              - key: metricbeat.yml
                path: metricbeat.yml
        - name: modules
          configMap:
            name: metricbeat-elasticsearch
            items:
              - key: elasticsearch.yml
                path: elasticsearch.yml
              - key: kibana.yml
                path: kibana.yml
              - key: logstash.yml
                path: logstash.yml
              - key: beat.yml
                path: beat.yml
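
After Metricbeat has been running for a minute or two, monitoring data should start arriving in Elasticsearch. An optional check (the exact index/data-stream names vary by stack version, so the pattern below is a best-effort guess):

$ kubectl -n log port-forward svc/iwt-ctx-elk-es 9200:9200
$ curl -u elastic:admin123 "http://localhost:9200/_cat/indices/.monitoring-*?v&expand_wildcards=all"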

IV. Web UI

Use an Ingress (or a similar component) to expose the Elasticsearch and Kibana Services behind domain names for browser access, for example as sketched below.
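
A minimal Ingress sketch for Kibana, assuming an nginx ingress controller (the ingressClassName is an assumption) and reusing the hostname from server.publicBaseUrl in the Kibana config; an analogous rule can be added for Elasticsearch:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: iwt-ctx-elk-kibana
  namespace: log
spec:
  ingressClassName: nginx            # assumption: an nginx ingress controller is installed
  rules:
    - host: kibana.iwt.devops.com    # matches server.publicBaseUrl in kibana.yml
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: iwt-ctx-elk-kibana
                port:
                  number: 5601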

1. Elasticsearch

2. Kibana

Data Views

Navigate to Management >> Stack Management >> Index Management to see the indices created for the various service logs:
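
The same index listing can be pulled from the Elasticsearch API, which is handy for scripting. An optional sketch, reusing the demo credentials and the iwt-* naming from the Logstash output (run it through a port-forward to the iwt-ctx-elk-es Service, or substitute the NodePort/Ingress address):

$ curl -u elastic:admin123 "http://localhost:9200/_cat/indices/iwt-*?v&s=index"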

Navigate to Management >> Stack Management >> Data Views and create data views for regular users to search logs with:

Navigate to Management >> Stack Management >> Security >> Users and create accounts for your users (in production, create Roles for fine-grained access control); an API-based alternative is sketched below:
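
The same can be done through the Elasticsearch security API. A minimal sketch that creates a read-only role for the iwt-* indices and a user holding it; the names log-reader and demo-reader and the password are hypothetical, and the built-in viewer role is added only to grant basic read access in Kibana. Use the same address/port-forward as above.

$ curl -u elastic:admin123 -X PUT "http://localhost:9200/_security/role/log-reader" \
  -H 'Content-Type: application/json' -d '
{
  "indices": [
    { "names": ["iwt-*"], "privileges": ["read", "view_index_metadata"] }
  ]
}'

$ curl -u elastic:admin123 -X PUT "http://localhost:9200/_security/user/demo-reader" \
  -H 'Content-Type: application/json' -d '
{
  "password": "ChangeMe123",
  "roles": ["log-reader", "viewer"]
}'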

After the new user logs in to Kibana, navigate to Analytics >> Discover to open the data views and search the logs:

Stack Monitoring

Navigate to Management >> Stack Monitoring to check the running state of each component:

The above covers the basic usage of ELK. Enterprise observability platforms today often bring in additional components to integrate logs, metrics, tracing, security, and storage, which may involve the Elastic APM agents, Fluent Bit, Loki, and others; these will be covered in detail in a later comparison of observability platform options.
