Background
A recent feature needed log collection and analysis with ELK (Elasticsearch, Logstash, Kibana). A quick search suggests that Logstash is no longer the recommended collector and Filebeat should be used instead, i.e. EFK.
Download links
Elasticsearch
Kibana
Filebeat
Logstash
analysis-ik
Prerequisites
A Java environment must be installed first; this is not covered here — look up the installation steps if needed.
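A quick sanity check that Java is available on the PATH (recent Elasticsearch releases bundle their own JDK, but having one installed does no harm):
bash
java -version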
Download example

Elasticsearch installation
After downloading, unzip the package. The configuration files are in the config directory and the startup script is bin/elasticsearch.bat; no configuration changes are needed — just double-click the bat file to start.
Account, password, and enrollment token
While the bat file starts, the console prints the account (the default user is elastic), its password, and the enrollment token for Kibana. Save them — they are needed for later logins — and strip any leading/trailing spaces from the password and token. The password can be changed with the bin/elasticsearch-reset-password -u elastic command.
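For reference, a sketch of the reset command run from the Elasticsearch installation directory (-i lets you type a password of your own instead of an auto-generated one):
bash
bin\elasticsearch-reset-password -u elastic      # auto-generate a new password for the elastic user
bin\elasticsearch-reset-password -u elastic -i   # or set one interactively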
Configuration notes
Version 9.0.3 uses SSL by default; after a successful start, security protection is enabled out of the box. The settings are explained one by one below, and a combined sketch follows the list.
xpack.security.enabled: false — disables Elasticsearch's security features, so no user authentication or access control is enforced; make sure Elasticsearch is protected in some other way.
xpack.security.enrollment.enabled: true — enables the enrollment feature, which lets new nodes (and Kibana) be enrolled and authenticate each other via certificates.
xpack.security.http.ssl.enabled: false — disables HTTPS for connections from Kibana, Logstash, and Agents to Elasticsearch; those connections then transmit data in plain text.
xpack.security.transport.ssl.enabled: true — enables transport-layer encryption and mutual authentication between nodes, securing communication inside the Elasticsearch cluster.
cluster.initial_master_nodes: ["PC-20230824PCHD"] — names the initial master-eligible node(s); only the listed node(s) can form the cluster's initial master.
http.host: 0.0.0.0 — allows HTTP API connections from anywhere; connections are encrypted and require user authentication.
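Taken together, the corresponding block in config/elasticsearch.yml would look roughly like this (a sketch of the settings described above; the node name is this machine's and should be replaced with your own):
bash
xpack.security.enabled: false
xpack.security.enrollment.enabled: true
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: true
cluster.initial_master_nodes: ["PC-20230824PCHD"]
http.host: 0.0.0.0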
Fixing cross-origin (CORS) issues
To allow cross-origin requests, add the following to config/elasticsearch.yml:
bash
http.cors.enabled: true
http.cors.allow-origin: "*"
Successful startup
Open the Elasticsearch URL https://localhost:9200 (note: https, not http), enter the account and password obtained above, and the cluster info page confirms that startup succeeded.
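The same check can be done from the command line — a sketch, assuming the CA certificate path from this install and your own elastic password:
bash
curl -k -u elastic https://localhost:9200
# or verify the certificate explicitly:
curl --cacert "D:/Program Files/elasticsearch-9.0.3/config/certs/http_ca.crt" -u elastic https://localhost:9200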
Configuring the IK analyzer (startup error unresolved — do not follow this part)
The IK version must match the Elasticsearch version. Unzip the downloaded elasticsearch-analysis-ik-9.0.3.zip and copy the extracted directory into the plugins directory of ES; that completes the plugin installation. Delete the zip file after extracting, otherwise Elasticsearch fails to start.

Restart ES; if the startup log shows loaded plugin [analysis-ik], the IK analyzer was loaded.
Kibana installation
After downloading, unzip the package. The configuration files are in the config directory: in kibana.yml, uncomment i18n.locale and set it to zh-CN to run Kibana in Chinese. The startup script is bin/kibana.bat — no other changes are needed, just double-click it.
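The relevant line in config/kibana.yml:
bash
i18n.locale: "zh-CN"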
Successful startup
During startup the console prints a login link; hold Ctrl and click it in the cmd window to open it in the browser. The Kibana URL is http://localhost:5601.
Enter the enrollment token
The first login asks for an enrollment token; paste the token captured from the Elasticsearch startup output above.
Enter the verification code
The verification code is printed in Kibana's startup window.

If you forgot to save the token, a new one can be generated from the Elasticsearch bin directory with the command elasticsearch-create-enrollment-token -s kibana
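For example, run from the Elasticsearch installation directory:
bash
bin\elasticsearch-create-enrollment-token -s kibana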
Once accepted, the login page appears; sign in with the Elasticsearch account and password.
After logging in, the interface is displayed in Chinese.
Startup error
After startup completes, one error line appears at the end (it does not stop Kibana from working): Error: Unable to create alerts client because the Encrypted Saved Objects plugin is missing encryption key. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.
I first assumed Elasticsearch was missing a plugin: elasticsearch-plugin list in the ES bin directory shows the installed plugins, and if it is missing you would supposedly install it with elasticsearch-plugin install com.floragunn:encrypted-saved-objects:x.y.z, where x.y.z is the version matching your Elasticsearch.
Run elasticsearch-plugin list
Run
elasticsearch-plugin install com.floragunn:encrypted-saved-objects:9.0.3
That install fails, however (DeepSeek suggested rolling back to version 6 or 7, but that misses the point): as the error message itself says, encrypted saved objects is built into Kibana, and the message only means that no encryption key has been set in kibana.yml. The error does not affect normal use and can be ignored, or fixed as shown below.
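The fix that the error message itself points at — a sketch run from the Kibana (not Elasticsearch) directory, with the generated key then added to config/kibana.yml:
bash
bin\kibana-encryption-keys generate
# then copy the printed keys into config/kibana.yml, e.g.:
# xpack.encryptedSavedObjects.encryptionKey: "<a random string of at least 32 characters>"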
Configuration notes
Because Elasticsearch runs with SSL, Kibana is configured accordingly; after a successful start the following configuration appears (written into kibana.yml by the enrollment process).
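The auto-generated section at the bottom of config/kibana.yml looks roughly like this (illustrative values only; the exact entries written by the enrollment process may differ):
bash
# This section was automatically generated during setup.
elasticsearch.hosts: ['https://192.168.1.10:9200']
elasticsearch.serviceAccountToken: <generated service account token>
elasticsearch.ssl.certificateAuthorities: [<path to the CA certificate under the Kibana data directory>]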
Filebeat installation
After downloading, unzip the package. Open a cmd window in the extracted folder and run filebeat.exe setup and then filebeat.exe -e -c filebeat.yml. Strictly speaking only the second command is needed to start Filebeat; the first performs the one-time setup described below and takes a while on the first run:
Index setup finished
In this step Filebeat creates an index template and the corresponding index pattern. A new index pattern appears in Kibana, and every file ingested by Filebeat becomes accessible through it.
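The two commands, for reference (run from the Filebeat folder; -c points at the config file and -e logs to the console):
bash
filebeat.exe setup               # one-time: load the index template and index pattern
filebeat.exe -e -c filebeat.yml  # start Filebeat with the given configuration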
Configuration management
In Filebeat 9 the filebeat.inputs type seems to have changed: use filestream instead of log. Perhaps I simply misconfigured it, but with log the input would not start and reported that the file could not be found. The main fields are listed below; a minimal sketch follows the list.
type: filestream — sets the input type to filestream, which watches files for changes (reads appended content in real time, suited to logs and other continuously updated files).
id — a unique identifier that distinguishes multiple input configurations; the data source can be traced by this ID in logs and monitoring.
enabled — whether this input is active: true enables it, false disables it (as in the shipped default configuration).
paths — the file paths to watch; globs such as *.csv are supported. Quote paths (especially ones containing Chinese characters or spaces), prefer absolute paths, and use / as the separator for cross-platform paths (Windows accepts it too).
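A minimal filestream input with just these fields might look like this (the path is a hypothetical example; the full filebeat.yml actually used follows in the next section):
bash
filebeat.inputs:
- type: filestream
  id: csv
  enabled: true
  paths:
    - "D:/data/*.csv"   # hypothetical example path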
Configuration file
Note that only one output can be enabled. Also, if Filebeat is first started with the Elasticsearch output and later switched to the Logstash output, it refuses to start with Exiting: index management requested but the Elasticsearch output is not configured/enabled. I tried adding setup.ilm.enabled: false and setup.template.enabled: false, which did not help, and deleting the corresponding indices did not help either; the only thing that worked was deleting the Elasticsearch and Kibana installs and re-initializing. If anyone knows a proper fix, please share.
bash
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
# ============================== Filebeat inputs ===============================
filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input-specific configurations.

# filestream is an input for collecting log messages from files.
- type: filestream

  # Unique ID among all inputs, an ID is required.
  id: csv

  # Change to true to enable this input configuration.
  enabled: true
  encoding: utf-8

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    #- /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*
    - D:/Maruko/AI智能体/民航代理/*.csv
  fields:
    data_source: "hu_csys"   # data source identifier
    index_prefix: "hu-csys"  # ES index prefix
  fields_under_root: true    # promote custom fields to the root of the event
  close_eof: true            # close the file once the end of file is reached
  parsers:
    - multiline:             # handle multi-line records (if any)
        pattern: '^[^,]+(,[^,]+){30,}'   # match lines containing 30+ comma-separated fields
        negate: false
        match: after
# - type: filestream
#   # Unique ID among all inputs, an ID is required.
#   id: xlsx
#   # Change to true to enable this input configuration.
#   enabled: true
#   encoding: utf-8
#   # Paths that should be crawled and fetched. Glob based paths.
#   paths:
#     #- /var/log/*.log
#     #- c:\programdata\elasticsearch\logs\*
#     - D:/Maruko/AI智能体/民航代理/*.xlsx
#   fields:
#     file_type: "xlsx"
#   fields_under_root: true
# Exclude lines. A list of regular expressions to match. It drops the lines that are
# matching any regular expression from the list.
# Line filtering happens after the parsers pipeline. If you would like to filter lines
# before parsers, use include_message parser.
#exclude_lines: ['^DBG']
# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list.
# Line filtering happens after the parsers pipeline. If you would like to filter lines
# before parsers, use include_message parser.
#include_lines: ['^ERR', '^WARN']
# Exclude files. A list of regular expressions to match. Filebeat drops the files that
# are matching any regular expression from the list. By default, no files are dropped.
#prospector.scanner.exclude_files: ['.gz$']
# Optional additional fields. These fields can be freely picked
# to add additional information to the crawled log files for filtering
#fields:
# level: debug
# review: 1
# journald is an input for collecting logs from Journald
#- type: journald
# Unique ID among all inputs, if the ID changes, all entries
# will be re-ingested
#id: my-journald-id
# The position to start reading from the journal, valid options are:
# - head: Starts reading at the beginning of the journal.
# - tail: Starts reading at the end of the journal.
# This means that no events will be sent until a new message is written.
# - since: Use also the `since` option to determine when to start reading from.
#seek: head
# A time offset from the current time to start reading from.
# To use since, seek option must be set to since.
#since: -24h
# Collect events from the service and messages about the service,
# including coredumps.
#units:
#- docker.service
# ============================== Filebeat modules ==============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false
# ================================== General ===================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboard archive. By default, this URL
# has a value that is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
# =================================== Kibana ===================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
# Kibana Host
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify and additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
#host: "localhost:5601"
# Kibana Space ID
# ID of the Kibana Space into which the dashboards should be loaded. By default,
# the Default Space will be used.
#space.id:
# =============================== Elastic Cloud ================================
# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
# ================================== Outputs ===================================
# Configure what output to use when sending the data collected by the beat.
# ---------------------------- Elasticsearch Output ----------------------------
# output.elasticsearch:
#   enabled: false
#   # Array of hosts to connect to.
#   hosts: ["https://localhost:9200"]
#   ssl:
#     # Replace with your own CA certificate path; use / rather than \
#     certificate_authorities: ["D:/Program Files/elasticsearch-9.0.3/config/certs/http_ca.crt"]
#     # Strict certificate verification
#     verification_mode: "full"
#   # Performance preset - one of "balanced", "throughput", "scale",
#   # "latency", or "custom".
#   preset: balanced
#   # Protocol - either `http` (default) or `https`.
#   #protocol: "https"
#   # Authentication credentials - either API key or username/password.
#   #api_key: "id:api_key"
#   username: "elastic"
#   # Replace with your own password
#   password: "8fy0k7b_mAu-m+aCD+rX"
#   index: "critical-%{[fields.log_type]}"
# ------------------------------ Logstash Output -------------------------------
output.logstash:
  hosts: ["logstash-server:5044"]   # Logstash server address
  loadbalance: true                 # load balance across the listed hosts
  worker: 4                         # number of worker threads
  bulk_max_size: 512                # events per batch

# ====== index management settings ======
#setup.ilm.enabled: false           # key setting
#setup.template.enabled: false
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
# ================================== Logging ===================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors, use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]
# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#monitoring.enabled: false
# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch outputs are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:
# ============================== Instrumentation ===============================
# Instrumentation support for the filebeat.
#instrumentation:
# Set to true to enable instrumentation of filebeat.
#enabled: false
# Environment in which filebeat is running on (eg: staging, production, etc.)
#environment: ""
# APM Server hosts to report instrumentation results to.
#hosts:
# - http://localhost:8200
# API Key for the APM Server(s).
# If api_key is set then secret_token will be ignored.
#api_key:
# Secret token for the APM Server(s).
#secret_token:
# ================================= Migration ==================================
# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
Logstash installation
Because the CSV and XLSX files need to be converted to JSON, and doing that in Filebeat looked cumbersome, Logstash is used for the processing. Unzip the package.
Configuration notes
Under the config directory of the Logstash installation, create a conf file with the processing logic for the files you want to handle. For example, to convert CSV files to JSON, add a csv_to_json.conf with the following content:
bash
input {
  beats {
    port => 5044
    codec => "plain"
  }
}

filter {
  # Parse the CSV (assumes the first row is the header)
  csv {
    separator => ","
    skip_header => true
    columns => [
      "t_date", "host_name", "t_csn", "collect_date", "block_num",
      "block_time", "step_seq_num", "step_num", "pid", "orig_run_id",
      "gen_run_id", "major_code", "minor_code", "host_num", "agent",
      "office", "in_pid", "app_level", "usr_level", "usr_group",
      "func_num", "func_code", "text_log", "t_date_orig", "protime",
      "cust_num", "six_pid_orgin", "six_pid_indicator", "six_pid",
      "sys", "t_topic", "t_partition", "t_offset", "year", "month",
      "day", "hour"
    ]
    convert => {
      "block_num" => "integer"
      "step_seq_num" => "integer"
      "host_num" => "integer"
      "year" => "integer"
      "month" => "integer"
      "day" => "integer"
      "hour" => "integer"
    }
  }

  # Normalize the date field
  date {
    match => ["t_date", "ISO8601"]   # adjust to the actual format
    target => "@timestamp"           # overwrite the default timestamp
  }

  # Fingerprint used as the document ID (computed before host_name is renamed below)
  fingerprint {
    source => ["host_name", "t_csn", "collect_date"]
    concatenate_sources => true      # hash all source fields together instead of only the last one
    target => "[@metadata][_id]"
    method => "SHA1"
  }

  # Clean up fields
  mutate {
    remove_field => ["message", "host", "log"]
    rename => {
      "t_date" => "[@metadata][event_date]"
      "host_name" => "[host][name]"
    }
    gsub => [
      "text_log", "\n", " ",   # replace newlines
      "text_log", "\t", " "    # replace tabs
    ]
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "%{[fields][index_prefix]}-%{+YYYY.MM.dd}"   # dynamic index name (with fields_under_root: true in Filebeat the field sits at the root, i.e. %{[index_prefix]})
    document_id => "%{[@metadata][_id]}"                  # use the fingerprint ID to avoid duplicates
    # document_type was removed in recent versions of the Elasticsearch output plugin
    #document_type => "_doc"
    pipeline => "hu-csys-pipeline"                        # optional: ES ingest pipeline
  }

  # Debug output (optional)
  stdout {
    codec => json_lines
  }
}
Startup
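Logstash can then be started with that conf file — a sketch, run from the Logstash installation directory:
bash
bin\logstash.bat -f config\csv_to_json.conf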
Error 1
bash
Exiting: couldn't connect to any of the configured Elasticsearch hosts. Errors: [error connecting to Elasticsearch at http://localhost:9200: Get "http://localhost:9200": EOF]
This error occurs because Filebeat defaults to a non-SSL connection, so filebeat.yml must be changed: use https in the host and point ssl.certificate_authorities at the CA certificate under config/certs in the Elasticsearch directory.
bash
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["https://localhost:9200"]
  ssl:
    # Replace with your own CA certificate path; use / rather than \
    certificate_authorities: ["D:/Program Files/elasticsearch-9.0.3/config/certs/http_ca.crt"]
    # Strict certificate verification
    verification_mode: "full"

  # Performance preset - one of "balanced", "throughput", "scale",
  # "latency", or "custom".
  preset: balanced

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: "elastic"
  # Replace with your own password
  password: "8fy0k7b_mAu-m+aCD+rX"

Error 2
Exiting: error loading config file: yaml: line 166: found unknown escape character
YAML is sensitive to the backslash \; if a path or string contains \ without proper escaping, this error is raised. Use forward slashes (/) instead — here, in the certificate_authorities path.
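For example, in filebeat.yml:
bash
# Fails: backslashes inside the double-quoted string are read as escape sequences
#certificate_authorities: ["D:\Program Files\elasticsearch-9.0.3\config\certs\http_ca.crt"]
# Works: forward slashes need no escaping
certificate_authorities: ["D:/Program Files/elasticsearch-9.0.3/config/certs/http_ca.crt"]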