# EFK Logging Stack: Complete Setup Tutorial

This tutorial walks through building a complete EFK (Elasticsearch + Filebeat + Kibana) log collection and analysis system on Linux.

## Table of Contents

- [Requirements](#requirements)
- [Installing and Configuring Elasticsearch](#installing-and-configuring-elasticsearch)
- [Installing and Configuring Kibana](#installing-and-configuring-kibana)
- [Deploying Filebeat on Kubernetes](#deploying-filebeat-on-kubernetes)
- [System Service Configuration](#system-service-configuration)
- [Verifying Access](#verifying-access)
- [Troubleshooting](#troubleshooting)

## Requirements

- OS: Linux
- Memory: at least 2 GB (4 GB+ recommended)
- Disk: at least 10 GB free
- Network: ports 9200 (Elasticsearch) and 5601 (Kibana) must be reachable
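The requirements above can be checked up front. Below is a hedged preflight sketch (the helper name is invented for illustration) that flags insufficient RAM or disk and warns if the two ports are already in use on this host:

```python
# Preflight check for the requirements listed above. mem_gb/disk_gb are
# passed in; on a real host you could derive them, e.g. from
# shutil.disk_usage("/").free / 1e9.
import socket

def check_requirements(mem_gb: float, disk_gb: float, ports=(9200, 5601)) -> list:
    """Return a list of human-readable warnings; an empty list means all checks pass."""
    warnings = []
    if mem_gb < 2:
        warnings.append(f"only {mem_gb} GB RAM; at least 2 GB required")
    if disk_gb < 10:
        warnings.append(f"only {disk_gb} GB free disk; at least 10 GB required")
    for port in ports:
        with socket.socket() as s:
            # connect_ex returns 0 when something is already listening
            if s.connect_ex(("127.0.0.1", port)) == 0:
                warnings.append(f"port {port} is already in use")
    return warnings
```

Run it before installing; any returned warning should be resolved first.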
## Installing and Configuring Elasticsearch

### 1. Download and Install Elasticsearch

```bash
# Download Elasticsearch (version 8.17.3)
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.17.3-linux-x86_64.tar.gz

# Extract the archive
tar -zxvf elasticsearch-8.17.3-linux-x86_64.tar.gz

# Move the elasticsearch-8.17.3 directory to /usr/local
sudo mv elasticsearch-8.17.3 /usr/local/
```
### 2. Configure the Java Environment

Note: Elasticsearch 8.x ships with a bundled JDK and uses it by default, so this step is usually unnecessary; apply it only if startup fails to locate Java.

```bash
# Enter the bin directory
cd /usr/local/elasticsearch-8.17.3/bin

# Edit the startup script to resolve the JDK dependency
vim elasticsearch
```

Append at the end of the file:

```bash
export JAVA_HOME=/usr/local/elasticsearch-8.17.3/jdk
export PATH=$JAVA_HOME/bin:$PATH
if [ -x "$JAVA_HOME/bin/java" ]; then
    JAVA="/usr/local/elasticsearch-8.17.3/jdk/bin/java"
else
    JAVA=$(which java)
fi
```
### 3. Adjust JVM Heap Settings

```bash
# Edit the JVM options
vim /usr/local/elasticsearch-8.17.3/config/jvm.options
```

Set the heap size:

```
-Xms1g
-Xmx1g
```

Note: size the heap to the server's actual memory; if the server has ample RAM, you can raise this to 2g. Keep `-Xms` and `-Xmx` equal.
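The sizing rule used later in this guide (about 50% of system RAM) can be sketched as a small function; the extra 31 GB cap keeps the JVM's compressed object pointers enabled. The function name is illustrative, not part of any Elastic tooling:

```python
# Heap-sizing rule of thumb for -Xms/-Xmx: half of RAM,
# capped at 31 GB (compressed oops), never below 1 GB.
def recommended_heap_gb(total_ram_gb: float) -> int:
    """Return a heap size in GB for jvm.options."""
    return max(1, min(31, int(total_ram_gb // 2)))

assert recommended_heap_gb(2) == 1     # the 2 GB minimum host -> 1 GB heap
assert recommended_heap_gb(8) == 4
assert recommended_heap_gb(128) == 31  # cap preserves compressed oops
```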
### 4. Create a Dedicated User

```bash
# Create the elasticsearch user
sudo useradd elasticsearch

# Set directory ownership and permissions
sudo chown -R elasticsearch:elasticsearch /usr/local/elasticsearch-8.17.3
sudo chmod -R 755 /usr/local/elasticsearch-8.17.3/logs
```
### 5. Configure Elasticsearch Core Settings

```bash
# Edit the main configuration file
vim /usr/local/elasticsearch-8.17.3/config/elasticsearch.yml
```

Add the following:

```yaml
# Cluster settings
cluster.name: elasticsearch-cluster
node.name: es-node-1

# Network settings
network.host: 0.0.0.0
http.port: 9200
# Change the port if needed (optional)
# http.port: 19200

# Cluster discovery
cluster.initial_master_nodes: ["es-node-1"]

# Path settings (optional; defaults are used if omitted)
# path.data: /home/<username>/elasticsearch/data
# path.logs: /home/<username>/elasticsearch/logs

# Security settings (recommended in production)
xpack.security.enabled: true
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false

# Index settings
action.destructive_requires_name: true
```
### 6. Tune System Parameters

```bash
# Switch to root
su root

# Edit the system configuration
vim /etc/sysctl.conf
```

Add the memory-map and swap settings:

```
vm.max_map_count=262144
vm.swappiness=1
```

Apply the changes:

```bash
sysctl -p
```
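To confirm the two settings landed in the file before running `sysctl -p`, a minimal sketch (the helper is hypothetical, not part of the stack) that parses `sysctl.conf`-style `key=value` lines:

```python
# Parse sysctl.conf content into a {key: value} dict, skipping comments.
def parse_sysctl(text: str) -> dict:
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blanks and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

sample = """
# Elasticsearch requirements
vm.max_map_count=262144
vm.swappiness=1
"""
conf = parse_sysctl(sample)
assert int(conf["vm.max_map_count"]) >= 262144
```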
### 7. Start Elasticsearch

```bash
# Switch to the elasticsearch user
su elasticsearch

# Enter the bin directory
cd /usr/local/elasticsearch-8.17.3/bin

# Start in the background, writing the PID to a file
./elasticsearch -d -p pid

# Verify the process is running
ps aux | grep elasticsearch
```
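Grepping `ps` only proves the process exists, not that it is serving requests. A small sketch, assuming Elasticsearch listens on 9200, that polls the TCP port until it accepts connections:

```python
# Poll host:port until a TCP connection succeeds or the timeout expires.
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(1)  # not up yet; retry
    return False

# Example: wait_for_port("127.0.0.1", 9200, timeout=120)
```

Elasticsearch can take a minute or more to start, so a generous timeout is appropriate.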
### 8. Reset the elastic Password

```bash
# Reset the password for the elastic user
cd /usr/local/elasticsearch-8.17.3/bin
./elasticsearch-reset-password -u elastic

# Save the printed password; the output looks like:
# Password for [elastic] = [generated password]
```
## Installing and Configuring Kibana

### 1. Download and Install Kibana

```bash
# Download Kibana (the version MUST match Elasticsearch!)
cd /opt/efk
wget https://artifacts.elastic.co/downloads/kibana/kibana-8.17.3-linux-x86_64.tar.gz

# Extract the archive
tar -zxvf kibana-8.17.3-linux-x86_64.tar.gz

# Move it into place
sudo mv kibana-8.17.3 /usr/local/kibana
```
### 2. Obtain the Elasticsearch CA Certificate

```bash
# Elasticsearch generates a self-signed certificate on first startup
# Locate the certificate
ls /usr/local/elasticsearch-8.17.3/config/certs/

# Copy the CA certificate into the Kibana directory
sudo cp /usr/local/elasticsearch-8.17.3/config/certs/http_ca.crt /usr/local/kibana/config/
sudo chown kibana:kibana /usr/local/kibana/config/http_ca.crt
```

Note: the `chown` above assumes the `kibana` user already exists; if you have not created it yet (step 5 below), run the `chown` afterwards.
### 3. Create a Service Account Token

```bash
# Enter the ES installation directory
cd /usr/local/elasticsearch-8.17.3

# Create a service account token for Kibana
./bin/elasticsearch-service-tokens create elastic/kibana kibana-token
```

Important: save the generated token. The output looks like:

```
SERVICE_TOKEN elastic/kibana/kibana-token = [generated token value]
```
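The token value is what matters: Kibana presents it to Elasticsearch as an HTTP bearer token on every request. A minimal sketch (the token below is a placeholder, not a real credential):

```python
# Build the Authorization header that a service account token travels in.
def bearer_header(token: str) -> dict:
    return {"Authorization": f"Bearer {token}"}

headers = bearer_header("AAEAAWVsYXN0aWMva2liYW5hL2tpYmFuYS10b2tlbg")
assert headers["Authorization"].startswith("Bearer ")
```

This is also why the token can be verified manually with `curl -H "Authorization: Bearer ..."` (see Troubleshooting).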
### 4. Configure Kibana

```bash
# Edit the Kibana configuration file
vim /usr/local/kibana/config/kibana.yml
```

Configuration:

```yaml
# Server settings
server.port: 5601
server.host: "0.0.0.0"

# Elasticsearch connection
elasticsearch.hosts: ["https://localhost:9200"]
elasticsearch.serviceAccountToken: "[actual service token]"
elasticsearch.ssl.certificateAuthorities: ["/usr/local/kibana/config/http_ca.crt"]

# Chinese UI locale
i18n.locale: "zh-CN"

# Logging
logging.appenders:
  file:
    type: file
    fileName: /var/log/kibana/kibana.log
    layout:
      type: json
logging.root:
  level: info
  appenders: [file]
logging.loggers:
  - name: http.server.response
    level: debug
```
### 5. Create the Kibana User and Set Permissions

```bash
# Create the kibana user
sudo useradd -M -s /bin/bash kibana

# Set directory ownership
sudo chown -R kibana:kibana /usr/local/kibana
sudo mkdir -p /var/log/kibana
sudo chown kibana:kibana /var/log/kibana
```
### 6. Start Kibana

```bash
# Start in the background
sudo -u kibana nohup /usr/local/kibana/bin/kibana > /var/log/kibana/kibana.log 2>&1 &

# Verify it is running
ps aux | grep kibana
tail -f /var/log/kibana/kibana.log
```
## Deploying Filebeat on Kubernetes

### 1. Create the Namespace

```bash
kubectl create namespace logging
```

### 2. Create the ConfigMap and Secret

```bash
# Create a ConfigMap holding the CA certificate
kubectl create configmap elastic-ca-cert \
  --namespace=logging \
  --from-file=http_ca.crt=/usr/local/elasticsearch-8.17.3/config/certs/http_ca.crt

# Create a Secret with the Elasticsearch credentials
kubectl create secret generic elasticsearch-credentials \
  --namespace=logging \
  --from-literal=username=elastic \
  --from-literal=password=[actual password]
```
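Kubernetes stores Secret values base64-encoded, which is encoding, not encryption. A small sketch (placeholder password, not a real credential) of what `kubectl get secret elasticsearch-credentials -o jsonpath='{.data.password}'` would return and how to decode it:

```python
# Secrets round-trip through base64 in the Kubernetes API.
import base64

encoded = base64.b64encode(b"changeme").decode()  # what the API stores
decoded = base64.b64decode(encoded).decode()      # what the pod receives
assert decoded == "changeme"
```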
### 3. Deploy Filebeat

Create a `filebeat-daemonset.yaml` file:
```yaml
# Namespace (rename if needed)
apiVersion: v1
kind: Namespace
metadata:
  name: logging
---
# ServiceAccount + RBAC (lets Filebeat query the k8s API to enrich pod metadata)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces", "nodes"]
    verbs: ["get", "watch", "list"]
  - apiGroups: [""]
    resources: ["endpoints", "services"]
    verbs: ["get", "watch", "list"]
  - apiGroups: ["apps"]
    resources: ["replicasets", "deployments", "daemonsets"]
    verbs: ["get", "watch", "list"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "deployments", "daemonsets"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: filebeat
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: logging
---
# ConfigMap: filebeat.yml (core configuration)
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: logging
data:
  filebeat.yml: |
    # Performance tuning
    queue.mem:
      events: 4096
      flush.min_events: 512
      flush.timeout: 5s

    # Harvester tuning
    filebeat.registry.flush: 5s
    filebeat.shutdown_timeout: 10s

    # Enable ILM (index lifecycle management)
    setup.ilm.enabled: true
    setup.ilm.rollover_alias: "filebeat"
    setup.ilm.pattern: "{now/d}-000001"  # daily rollover indices
    setup.ilm.policy: "filebeat-ilm-policy"
    setup.template.name: "filebeat"
    setup.template.pattern: "filebeat-*"

    filebeat.inputs:
      - type: container
        enabled: true
        paths:
          # Kubernetes container log paths (recommended for the container input)
          - /var/log/containers/*.log
          - /opt/mnt/docker/containers/*.log
          - /home/docker/containers/*.log
          - /data/docker/containers/*.log
          - /var/lib/docker/containers/*/*.log  # raw Docker log path
        # Container-input specific settings
        format: docker
        scan_frequency: 5s
        ignore_older: 0
        close_inactive: 1h
        # Multiline settings (directly at the input level)
        multiline.type: pattern
        multiline.pattern: '^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3}\s+'
        multiline.negate: true
        multiline.match: after
        multiline.max_lines: 500
        multiline.timeout: 5s
        # Parse the Docker JSON format automatically
        parsers:
          - container: ~
          # Merge multiline logs (Java exception stack traces)
          - multiline:
              type: pattern
              # A line starting with a timestamp begins a new log entry
              pattern: '^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3}\s+'
              negate: true
              match: after
              max_lines: 500
              timeout: 5s

    # Add Kubernetes metadata (in-cluster)
    processors:
      - add_host_metadata: ~
      - add_cloud_metadata: ~
      - add_kubernetes_metadata:
          in_cluster: true
          host: ${NODE_NAME}
          # Default indexers/matchers are the most stable choice
          default_indexers.enabled: true
          default_matchers.enabled: true
          # Keep the matcher focused on container logs
          matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
                resource_type: "container"
      # Keep only ERROR-level events (exact matching)
      - drop_event:
          when:
            not:
              or:
                # Exact match on the standard log format's ERROR level
                # e.g.: 2025-10-23 16:16:14.459 ERROR
                - regexp:
                    message: '^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3}\s+(ERROR|FATAL)'
                # Lines that start with ERROR/FATAL
                - regexp:
                    message: '^(ERROR|FATAL)'
                # Standard log.level field set to error/fatal
                - regexp:
                    log.level: '^(error|ERROR|fatal|FATAL)$'
                # level field in JSON-formatted logs
                - regexp:
                    level: '^(error|ERROR|fatal|FATAL)$'
                # Stack-trace lines (starting with "at ")
                - regexp:
                    message: '^\s+at\s+[a-zA-Z0-9_.$]+\.'
                # "Caused by" exception chains
                - regexp:
                    message: '^Caused by:'
                # Fully-qualified Java exception class names
                # (a full exception, not just any line containing "exception")
                - regexp:
                    message: '^[a-zA-Z0-9_.]+Exception:'
                - regexp:
                    message: '^[a-zA-Z0-9_.]+Error:'
      # Tag ERROR logs that contain Chinese characters
      - add_fields:
          target: ""
          fields:
            alarm_flag: "backend_biz"
          when:
            regexp:
              message: '[一-龥]+'
      # Tag ERROR logs that contain no Chinese characters
      - add_fields:
          target: ""
          fields:
            alarm_flag: "backend_error"
          when:
            not:
              regexp:
                message: '[一-龥]+'
      # Environment labels
      - add_fields:
          target: ""
          fields:
            environment: "production"
            cluster: "k8s-prod"
      # Rename fields for readability (with safety flags)
      - rename:
          fields:
            - from: "kubernetes.container.name"
              to: "container_name"
            - from: "kubernetes.namespace"
              to: "namespace"
            - from: "kubernetes.pod.name"
              to: "pod_name"
          ignore_missing: true
          fail_on_error: false
      # Drop fields we do not need, to reduce storage
      - drop_fields:
          fields:
            - "agent.ephemeral_id"
            - "agent.id"
            - "ecs.version"
            - "host.architecture"
          ignore_missing: true

    # Output to Elasticsearch (production settings)
    output.elasticsearch:
      hosts: ["https://<es-ip>:9200"]
      username: "elastic"
      password: "<password>"
      # Verify TLS with the mounted CA file (the production approach)
      ssl.certificate_authorities: ["/etc/ssl/certs/http_ca.crt"]
      # ILM: write through the ILM-managed index name
      index: "filebeat-%{[agent.version]}"
      # Production performance tuning
      bulk_max_size: 1000
      worker: 2
      compression_level: 1

    # Production monitoring
    monitoring:
      enabled: true
      elasticsearch:
        hosts: ["https://<es-ip>:9200"]
        username: "elastic"
        password: "<es-password>"
        ssl.certificate_authorities: ["/etc/ssl/certs/http_ca.crt"]
        # Monitoring indices are ILM-managed too
        index: "filebeat-monitoring-%{[agent.version]}-%{+yyyy.MM.dd}"

    # Production logging
    logging.level: info
    logging.to_files: true
    logging.files:
      path: /var/log/filebeat
      name: filebeat
      keepfiles: 7
      permissions: 0644

    # HTTP monitoring endpoint
    http.enabled: true
    http.host: "0.0.0.0"
    http.port: 5066
---
# DaemonSet running Filebeat (as root, to read host logs)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
  labels:
    app: filebeat
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
        - name: filebeat
          image: x454262h22.qicp.vip:86/common/beats/filebeat:8.17.3
          args:
            - "-c"
            - "/etc/filebeat.yml"
            - "-e"
          ports:
            - name: http-monitoring
              containerPort: 5066
              protocol: TCP
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            runAsUser: 0
          # Health checks
          livenessProbe:
            httpGet:
              path: /
              port: 5066
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /
              port: 5066
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          resources:
            limits:
              memory: 500Mi
              cpu: 500m
            requests:
              memory: 200Mi
              cpu: 100m
          volumeMounts:
            # Custom log directories
            - name: data-docker-containers
              mountPath: /data/docker/containers
              readOnly: true
            - name: opt-mnt-docker-containers
              mountPath: /opt/mnt/docker/containers
              readOnly: true
            - name: home-docker-containers
              mountPath: /home/docker/containers
              readOnly: true
            # CA certificate
            - name: elastic-ca-cert
              mountPath: /etc/ssl/certs/http_ca.crt
              subPath: http_ca.crt
            # Filebeat configuration
            - name: config
              mountPath: /etc/filebeat.yml
              subPath: filebeat.yml
            # Standard kube log directories (if present)
            - name: varlogcontainers
              mountPath: /var/log/containers
              readOnly: true
            - name: varlogpods
              mountPath: /var/log/pods
              readOnly: true
            # Raw Docker JSON logs (fallback)
            - name: var-lib-docker-containers
              mountPath: /var/lib/docker/containers
              readOnly: true
            # Filebeat's own log output
            - name: filebeat-logs
              mountPath: /var/log/filebeat
      volumes:
        - name: elastic-ca-cert
          configMap:
            name: elastic-ca-cert
            items:
              - key: http_ca.crt
                path: http_ca.crt
        - name: config
          configMap:
            name: filebeat-config
            items:
              - key: filebeat.yml
                path: filebeat.yml
        - name: data-docker-containers
          hostPath:
            path: /data/docker/containers
            type: DirectoryOrCreate
        - name: opt-mnt-docker-containers
          hostPath:
            path: /opt/mnt/docker/containers
            type: DirectoryOrCreate
        - name: home-docker-containers
          hostPath:
            path: /home/docker/containers
            type: DirectoryOrCreate
        - name: var-lib-docker-containers
          hostPath:
            path: /var/lib/docker/containers
            type: DirectoryOrCreate
        - name: varlogcontainers
          hostPath:
            path: /var/log/containers
            type: DirectoryOrCreate
        - name: varlogpods
          hostPath:
            path: /var/log/pods
            type: DirectoryOrCreate
        - name: filebeat-logs
          hostPath:
            path: /var/log/filebeat
            type: DirectoryOrCreate
      # Tolerate master/control-plane taints so the DaemonSet runs on every node
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      hostNetwork: false
```
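The filtering in the manifest hinges on two regular expressions: the ERROR/FATAL filter in `drop_event` and the CJK detector that drives the `alarm_flag` tagging. A standalone sketch with invented sample log lines:

```python
# The two key patterns from the filebeat.yml above, tested in isolation.
import re

error_pattern = re.compile(
    r'^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3}\s+(ERROR|FATAL)'
)
cjk_pattern = re.compile(r'[一-龥]+')  # matches common CJK ideographs

kept = '2025-10-23 16:16:14.459 ERROR c.e.OrderService - 订单创建失败'
dropped = '2025-10-23 16:16:14.459 INFO  c.e.OrderService - order created'

assert error_pattern.match(kept) is not None  # passes the drop_event filter
assert error_pattern.match(dropped) is None   # would be dropped
assert cjk_pattern.search(kept) is not None   # tagged alarm_flag=backend_biz
```

Testing the patterns like this before deploying avoids silently dropping every event because of a typo in the timestamp format.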
### 4. Apply the Manifest

```bash
# Apply the manifest
kubectl apply -f filebeat-daemonset.yaml

# Check deployment status
kubectl get pods -n logging
kubectl logs -f filebeat-xxxxx -n logging
```
## System Service Configuration

### 1. Elasticsearch Systemd Service

Create the service file (note: `sudo cat << EOF > file` fails because the redirection runs as the unprivileged user; `sudo tee` works, and the quoted `'EOF'` keeps `$MAINPID` literal):

```bash
sudo tee /etc/systemd/system/elasticsearch.service << 'EOF'
[Unit]
Description=Elasticsearch
Documentation=https://www.elastic.co
After=network.target

[Service]
Type=simple
User=elasticsearch
Group=elasticsearch
Environment="ES_HOME=/usr/local/elasticsearch-8.17.3"
Environment="ES_PATH_CONF=/usr/local/elasticsearch-8.17.3/config"
Environment="PID_DIR=/usr/local/elasticsearch-8.17.3"
Environment="ES_JAVA_HOME=/usr/local/elasticsearch-8.17.3/jdk"
Environment="ES_JAVA_OPTS=-Xms1g -Xmx1g"
WorkingDirectory=/usr/local/elasticsearch-8.17.3
LimitMEMLOCK=infinity
LimitNOFILE=65536
LimitNPROC=4096
ExecStart=/usr/local/elasticsearch-8.17.3/bin/elasticsearch
ExecStop=/bin/kill -INT $MAINPID
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
EnvironmentFile=-/etc/default/elasticsearch

[Install]
WantedBy=multi-user.target
EOF
```

The paths are written out in full because systemd does not expand `$ES_HOME` inside other `Environment=` lines.

Start the service:

```bash
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
sudo systemctl status elasticsearch
```
### 2. Kibana Systemd Service

Create the service file:

```bash
sudo tee /etc/systemd/system/kibana.service << 'EOF'
[Unit]
Description=Kibana
Documentation=https://www.elastic.co
After=network.target elasticsearch.service

[Service]
User=kibana
Group=kibana
WorkingDirectory=/usr/local/kibana
ExecStart=/usr/local/kibana/bin/kibana
Restart=always
RestartSec=10
Environment="NODE_OPTIONS=--max-old-space-size=4096"
LimitNOFILE=65536
LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target
EOF
```

Start the service:

```bash
sudo systemctl daemon-reload
sudo systemctl enable kibana
sudo systemctl start kibana
sudo systemctl status kibana
```
## Verifying Access

### 1. Elasticsearch

```bash
# Check cluster health
curl -X GET "https://localhost:9200/_cluster/health?pretty" \
  --cacert /usr/local/kibana/config/http_ca.crt \
  -u elastic:[actual password]

# List cluster nodes
curl -X GET "https://localhost:9200/_cat/nodes?v" \
  --cacert /usr/local/kibana/config/http_ca.crt \
  -u elastic:[actual password]
```

### 2. Kibana

Open in a browser:

```
http://[server IP]:5601
```

Log in with:

- Username: elastic
- Password: the password obtained when resetting it earlier
### 3. Filebeat

```bash
# Check Filebeat pod status
kubectl get pods -n logging
kubectl logs filebeat-xxxxx -n logging

# Check the indices
curl -X GET "https://localhost:9200/_cat/indices/filebeat-*?v" \
  --cacert /usr/local/kibana/config/http_ca.crt \
  -u elastic:[actual password]
```
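The `_cat/indices?v` call above returns whitespace-aligned columns. A small sketch that parses that tabular output into dicts; the sample text is a made-up illustration of the format (real output includes more columns, such as `uuid`), not actual cluster output:

```python
# Split _cat 'v' (verbose) output into one dict per index row.
def parse_cat_indices(text: str) -> list:
    lines = [l for l in text.splitlines() if l.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]

sample = """health status index            pri rep docs.count store.size
green  open   filebeat-8.17.3  1   1   12345      10.2mb
"""
rows = parse_cat_indices(sample)
assert rows[0]["index"].startswith("filebeat-")
assert rows[0]["health"] == "green"
```

A `green` health value and a growing `docs.count` confirm that Filebeat is shipping events.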
## Troubleshooting

### Common Problems

### 1. Elasticsearch Fails to Start

```bash
# Check the journal
sudo journalctl -u elasticsearch -f

# Check for port conflicts
sudo netstat -tulpn | grep 9200

# Check the Elasticsearch logs
tail -f /usr/local/elasticsearch-8.17.3/logs/*.log
```

### 2. Kibana Cannot Connect to Elasticsearch

```bash
# Check the Kibana journal
sudo journalctl -u kibana -f

# Verify the service account token by authenticating with it
curl -X GET "https://localhost:9200/_security/_authenticate" \
  --cacert /usr/local/kibana/config/http_ca.crt \
  -H "Authorization: Bearer [actual service token]"
```

### 3. Filebeat Is Not Collecting Logs

```bash
# Inspect the Filebeat configuration inside the pod
kubectl exec -it filebeat-xxxxx -n logging -- cat /etc/filebeat.yml

# Check the container log path
kubectl exec -it filebeat-xxxxx -n logging -- ls -la /var/log/containers/
```

### 4. Out-of-Memory Problems

Adjust the JVM heap:

```bash
# Edit the JVM options
vim /usr/local/elasticsearch-8.17.3/config/jvm.options
```

```
# Adjust to the available memory
-Xms1g
-Xmx1g
```

### 5. Certificate Verification Problems

If you hit certificate verification errors:

```bash
# Inspect the certificate
openssl x509 -in /usr/local/kibana/config/http_ca.crt -text -noout
```

As a temporary measure (testing only), disable verification in `filebeat.yml` under `output.elasticsearch` with `ssl.verification_mode: none` (the Kibana-side equivalent is `elasticsearch.ssl.verificationMode: none` in `kibana.yml`).
## Important Notes

- **Adjust these parameters for your environment:**
  - namespace: change to match your cluster
  - ES address: use the actual Elasticsearch IP address and port
  - credentials: use the correct ES username and password
  - service token: use the service account token you actually generated
  - CA certificate: make sure the certificate path is correct
- **Mounted log directories:** custom log mounts live under `/opt/mnt/docker/containers/`
- **Security:** use encrypted connections and a strong password policy in production
- **Resource monitoring:** keep an eye on Elasticsearch and Kibana memory usage

## Performance Tuning Tips

- **JVM heap:** size it to about 50% of server memory
- **Index settings:** use ILM (index lifecycle management)
- **Log rotation:** configure rotation to avoid running out of disk space
- **Resource monitoring:** use the Kibana monitoring dashboards
## Summary

With this tutorial you have built a complete EFK logging stack:

- **Elasticsearch**: powerful search and analytics
- **Filebeat**: log collection and shipping
- **Kibana**: a friendly visualization UI

You can now browse and analyze all error logs from your Kubernetes cluster through the Kibana interface.