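15 - EFK Log Collection Configuration Guide
This document describes how to deploy and configure an EFK (Elasticsearch + Fluentd + Kibana) log collection stack.
Overview
EFK is a mature stack for collecting, storing, and visualizing logs:
- Elasticsearch: a distributed search engine that stores and retrieves the logs
- Fluentd: a log collector that runs on every node and gathers container logs
- Kibana: a web UI for querying and analyzing the logs
Architecture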
┌─────────────────────────────────────────────────────────────────┐
│                   manage-net (172.20.5.0/24)                    │
│                                                                 │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │  Fluentd-N1  │     │  Fluentd-N2  │     │  Fluentd-N3  │     │
│  │ 172.20.5.11  │     │ 172.20.5.12  │     │ 172.20.5.22  │     │
│  └──────┬───────┘     └──────┬───────┘     └──────┬───────┘     │
│         │                    │                    │             │
│         └────────────────────┼────────────────────┘             │
│                              │                                  │
│                              ▼                                  │
│                     ┌─────────────────┐                         │
│                     │  Elasticsearch  │                         │
│                     │   172.20.5.21   │                         │
│                     └────────┬────────┘                         │
│                              │                                  │
│                              ▼                                  │
│                     ┌─────────────────┐                         │
│                     │     Kibana      │                         │
│                     │   172.20.5.23   │                         │
│                     └─────────────────┘                         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
IP plan
| Component | IP address | Node | Ports |
|---|---|---|---|
| Fluentd | 172.20.5.11 | Node1 | 24224 |
| Fluentd | 172.20.5.12 | Node2 | 24224 |
| Fluentd | 172.20.5.22 | Node3 | 24224 |
| Elasticsearch | 172.20.5.21 | Node3 | 9200, 9300 |
| Kibana | 172.20.5.23 | Node3 | 5601 |
Deployment steps
Step 1: Create the configuration directories
Run on all nodes:
mkdir -p /opt/cluster-deploy/config/{elasticsearch,fluentd,kibana}
mkdir -p /opt/cluster-deploy/logs/elasticsearch
chmod -R 777 /opt/cluster-deploy/logs
Step 2: Create the Elasticsearch configuration
On Node3, create the Elasticsearch configuration file:
cat > /opt/cluster-deploy/config/elasticsearch/elasticsearch.yml << 'EOF'
cluster.name: docker-cluster
node.name: elasticsearch-node1
node.attr.box_type: normal
path.data: /usr/share/elasticsearch/data
path.logs: /usr/share/elasticsearch/logs
bootstrap.memory_lock: true
discovery.type: single-node
network.host: 0.0.0.0
http.port: 9200
transport.publish_host: 172.20.5.21
http.publish_host: 172.20.5.21
xpack.security.enabled: false
xpack.security.enrollment.enabled: false
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: OPTIONS,HEAD,GET,POST,PUT,DELETE
http.cors.allow-headers: "X-Requested-With,Content-Type,Content-Length"
indices.memory.index_buffer_size: 20%
indices.queries.cache.size: 15%
action.auto_create_index: true
EOF
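Create the JVM options file (the Compose file in step 4 does not mount it; the heap size there comes from ES_JAVA_OPTS):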
cat > /opt/cluster-deploy/config/elasticsearch/jvm.options << 'EOF'
-Xms512m
-Xmx512m
-XX:+UseG1GC
-XX:+DisableExplicitGC
-XX:+ExitOnOutOfMemoryError
EOF
Step 3: Create the Fluentd configuration
Run on all nodes to create the Fluentd configuration file (Node1 is shown as the example):
cat > /opt/cluster-deploy/config/fluentd/fluentd-node1.conf << 'EOF'
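<source>
  @type tail
  @id input_tail_docker_containers
  path /var/lib/docker/containers/*/*-json.log
  # /var/log is mounted read-only in the container, so the position file
  # lives on the writable buffer volume
  pos_file /var/log/fluentd-buffers/containers.log.pos
  tag docker.*
  read_from_head true
  <parse>
    @type json
    # Docker writes "time" as an ISO 8601 string, not a float
    time_key time
    time_type string
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>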
<source>
  @type tail
  @id input_tail_nginx_access
  path /var/log/nginx/access.log
  # Position file kept on the writable buffer volume (/var/log is read-only)
  pos_file /var/log/fluentd-buffers/nginx-access.log.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>
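<source>
  @type tail
  @id input_tail_nginx_error
  path /var/log/nginx/error.log
  # Position file kept on the writable buffer volume (/var/log is read-only)
  pos_file /var/log/fluentd-buffers/nginx-error.log.pos
  tag nginx.error
  <parse>
    # The nginx parser only matches the access-log format, so error lines
    # are kept as raw text in the "message" field
    @type none
  </parse>
</source>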
<filter docker.**>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
    node_name Node1
  </record>
</filter>
<filter nginx.**>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
    node_name Node1
  </record>
</filter>
<match docker.**>
  @type elasticsearch
  host 172.20.5.21
  port 9200
  logstash_format true
  logstash_prefix docker
  logstash_dateformat %Y.%m.%d
  include_tag_key true
  tag_key docker_tag
  # type_name is omitted: Elasticsearch 8 has no mapping types
  <buffer>
    @type file
    # Each output needs its own buffer path
    path /var/log/fluentd-buffers/docker.buffer
    flush_interval 10s
    chunk_limit_size 2M
    queue_limit_length 256
    overflow_action block
    retry_max_interval 30
    retry_forever true
  </buffer>
</match>
<match nginx.**>
  @type elasticsearch
  host 172.20.5.21
  port 9200
  logstash_format true
  logstash_prefix nginx
  logstash_dateformat %Y.%m.%d
  include_tag_key true
  tag_key nginx_tag
  # type_name is omitted: Elasticsearch 8 has no mapping types
  <buffer>
    @type file
    # Each output needs its own buffer path
    path /var/log/fluentd-buffers/nginx.buffer
    flush_interval 10s
    chunk_limit_size 2M
    queue_limit_length 256
    overflow_action block
    retry_max_interval 30
    retry_forever true
  </buffer>
</match>
<system>
  process_name fluentd
</system>
EOF
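Note: the Fluentd configuration for Node2 and Node3 is identical except for node_name, which must be set to the matching node name. Save the files as fluentd-node2.conf and fluentd-node3.conf.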
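Since only node_name differs between nodes, the other two files can be derived from the Node1 file instead of being typed again. The following is a minimal sketch, assuming the directory layout from step 1; the CONF_DIR variable and the make_node_conf helper are illustrative names, not part of the original setup.

```shell
#!/bin/sh
# Derive per-node Fluentd configs from the Node1 file by rewriting node_name.
# CONF_DIR defaults to the path used in this guide (override it if needed).
CONF_DIR="${CONF_DIR:-/opt/cluster-deploy/config/fluentd}"

make_node_conf() {
    # $1 = node number; reads fluentd-node1.conf, writes fluentd-node<N>.conf
    sed "s/\(node_name[[:space:]]*\)Node1/\1Node$1/" \
        "$CONF_DIR/fluentd-node1.conf" > "$CONF_DIR/fluentd-node$1.conf"
}

# Only generate the files when the Node1 config is actually present
if [ -f "$CONF_DIR/fluentd-node1.conf" ]; then
    for n in 2 3; do
        make_node_conf "$n"
    done
fi
```

Each node only needs its own file, so the generated configs can be produced on one machine and copied to the others.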
Step 4: Create the Docker Compose files
Node1 EFK deployment file
cat > /opt/cluster-deploy/docker-compose-efk-node1.yml << 'EOF'
services:
  fluentd:
    image: fluent/fluentd:v1.16-1
    container_name: fluentd
    # The image defaults to the unprivileged "fluent" user, which cannot
    # read the root-owned files under /var/lib/docker/containers
    user: root
    networks:
      manage-net:
        ipv4_address: 172.20.5.11
    volumes:
      - ./config/fluentd/fluentd-node1.conf:/fluentd/etc/fluent.conf:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/log:/var/log:ro
      - fluentd-buffer:/var/log/fluentd-buffers
    environment:
      - FLUENTD_CONF=fluent.conf
    restart: unless-stopped
    logging:
      driver: none
networks:
  manage-net:
    external: true
volumes:
  fluentd-buffer:
EOF
Node2 EFK deployment file
cat > /opt/cluster-deploy/docker-compose-efk-node2.yml << 'EOF'
services:
  fluentd:
    image: fluent/fluentd:v1.16-1
    container_name: fluentd
    # The image defaults to the unprivileged "fluent" user, which cannot
    # read the root-owned files under /var/lib/docker/containers
    user: root
    networks:
      manage-net:
        ipv4_address: 172.20.5.12
    volumes:
      - ./config/fluentd/fluentd-node2.conf:/fluentd/etc/fluent.conf:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/log:/var/log:ro
      - fluentd-buffer:/var/log/fluentd-buffers
    environment:
      - FLUENTD_CONF=fluent.conf
    restart: unless-stopped
    logging:
      driver: none
networks:
  manage-net:
    external: true
volumes:
  fluentd-buffer:
EOF
Node3 EFK deployment file
cat > /opt/cluster-deploy/docker-compose-efk-node3.yml << 'EOF'
services:
  fluentd:
    image: fluent/fluentd:v1.16-1
    container_name: fluentd
    # The image defaults to the unprivileged "fluent" user, which cannot
    # read the root-owned files under /var/lib/docker/containers
    user: root
    networks:
      manage-net:
        ipv4_address: 172.20.5.22
    volumes:
      - ./config/fluentd/fluentd-node3.conf:/fluentd/etc/fluent.conf:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/log:/var/log:ro
      - fluentd-buffer:/var/log/fluentd-buffers
    environment:
      - FLUENTD_CONF=fluent.conf
    restart: unless-stopped
    logging:
      driver: none
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    container_name: elasticsearch
    hostname: elasticsearch
    networks:
      manage-net:
        ipv4_address: 172.20.5.21
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
      - xpack.security.enabled=false
      - xpack.security.http.ssl.enabled=false
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
      - ./config/elasticsearch/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro
    ulimits:
      memlock:
        soft: -1
        hard: -1
    restart: unless-stopped
  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    container_name: kibana
    networks:
      manage-net:
        ipv4_address: 172.20.5.23
    environment:
      - ELASTICSEARCH_HOSTS=http://172.20.5.21:9200
      - SERVER_NAME=kibana
      - SERVER_HOST=0.0.0.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
    restart: unless-stopped
networks:
  manage-net:
    external: true
volumes:
  elasticsearch-data:
    driver: local
  fluentd-buffer:
EOF
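The Compose files above use the stock fluent/fluentd:v1.16-1 image, which does not bundle fluent-plugin-elasticsearch, so the `@type elasticsearch` outputs from step 3 would fail to load unless the plugin is installed. One way to handle this is a small derived image. The sketch below only prepares the build context and prints the build command; BUILD_DIR and the fluentd-es:v1.16-1 tag are illustrative names, and no plugin version is pinned here.

```shell
#!/bin/sh
# Sketch: prepare a build context for a Fluentd image that includes the
# Elasticsearch output plugin. BUILD_DIR and IMAGE_TAG are hypothetical;
# BUILD_DIR defaults to a scratch directory.
BUILD_DIR="${BUILD_DIR:-$(mktemp -d)}"
IMAGE_TAG="${IMAGE_TAG:-fluentd-es:v1.16-1}"
mkdir -p "$BUILD_DIR"

cat > "$BUILD_DIR/Dockerfile" << 'EOF'
FROM fluent/fluentd:v1.16-1
USER root
# Build tools are only needed while gems with native extensions compile
RUN apk add --no-cache --virtual .build-deps build-base ruby-dev \
 && gem install fluent-plugin-elasticsearch --no-document \
 && apk del .build-deps
USER fluent
EOF

# Print the build command instead of running it, so this stays a dry run
echo "docker build -t $IMAGE_TAG $BUILD_DIR"
```

After building the image on each node, the `image:` field of the fluentd service would point at the new tag instead of fluent/fluentd:v1.16-1.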
Step 5: Start the EFK services
On each node, run the commands for that node:
# Node1
cd /opt/cluster-deploy
docker compose -f docker-compose-efk-node1.yml up -d
# Node2
cd /opt/cluster-deploy
docker compose -f docker-compose-efk-node2.yml up -d
# Node3
cd /opt/cluster-deploy
docker compose -f docker-compose-efk-node3.yml up -d
Verifying the deployment
Check container status
docker ps | grep -E "fluentd|elasticsearch|kibana"
Verify Elasticsearch
curl http://172.20.5.21:9200
Expected output:
{
"name" : "elasticsearch-node1",
"cluster_name" : "docker-cluster",
"cluster_uuid" : "...",
"version" : {
"number" : "8.11.0",
...
},
"tagline" : "You Know, for Search"
}
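Elasticsearch can take some time to answer after the container starts, so a single curl call may fail even when the deployment is fine. A small retry wrapper makes the check more forgiving. This is a minimal sketch; the retry helper is a hypothetical name, and the attempt count and delay in the example are arbitrary.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS runs out
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only when another attempt will follow
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (not executed here): try for up to about a minute
# retry 30 2 curl -fsS http://172.20.5.21:9200 > /dev/null
```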
Verify Kibana
Open http://192.168.64.130:5601 in a browser; the Kibana UI should load.
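View log indices
In Kibana, create a data view (called an index pattern before Kibana 8) as follows:
- Go to Management → Stack Management → Data Views
- Click Create data view
- Enter the index pattern docker-* or nginx-*
- Select @timestamp as the timestamp field
- Save the data view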
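Kibana usage guide
Viewing container logs
- Click Discover in the left-hand menu
- Select the docker-* index pattern
- Filter by any of the following fields:
  - node_name: filter by node
  - docker_tag: filter by container tag
  - log: full-text search over the log message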
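Creating a visualization
- Click Visualize Library in the left-hand menu
- Choose a visualization type (for example Lens or Data Table)
- Select the index pattern
- Configure the aggregations and filters
- Save the visualization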
Creating a dashboard
- Click Dashboard in the left-hand menu
- Click Create dashboard
- Click Add from library to add existing visualizations
- Save the dashboard
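Troubleshooting
Elasticsearch fails to start
Check the memory-lock settings:
# Check the system limits (entries in limits.conf use lowercase "memlock")
grep -i memlock /etc/security/limits.conf
# Temporarily lift the locked-memory limit for the current shell (testing only)
ulimit -l unlimited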
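Fluentd cannot connect to Elasticsearch
# View the Fluentd logs (unavailable while the service uses the "none"
# logging driver from step 4; remove that setting temporarily to debug)
docker logs fluentd
# Check network connectivity to the Elasticsearch address used in the Fluentd config
docker exec fluentd nc -zv 172.20.5.21 9200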
Kibana cannot connect to Elasticsearch
# Check that Elasticsearch is running normally
curl http://172.20.5.21:9200/_cluster/health
# Check the Kibana logs
docker logs kibana
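The health endpoint returns a JSON document whose "status" field is the part that matters for a quick check. The sketch below pulls that field out with sed so it can be used in scripts; health_status is a hypothetical helper name, and the approach assumes the compact single-line JSON that the endpoint returns by default.

```shell
#!/bin/sh
# health_status: read a _cluster/health JSON document on stdin and print
# the value of its "status" field (green, yellow, or red)
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example (not executed here):
# curl -s http://172.20.5.21:9200/_cluster/health | health_status
```

On a single-node cluster, yellow is common rather than alarming: indices created with the default of one replica cannot place that replica anywhere.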
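Performance tuning
Elasticsearch tuning
- Shard count: the default of one primary shard per index is enough for a single node
- Refresh interval: defaults to 1 second and can be raised to reduce indexing overhead
- Index lifecycle management (ILM): configure it to delete old logs automatically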
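The shard and refresh settings above can be applied to new daily indices through a composable index template. The following is a rough sketch that writes the request body and prints the matching curl command without sending it. The template name efk-logs, the 30s interval, and the zero-replica setting are illustrative choices, not values from the original setup.

```shell
#!/bin/sh
# Sketch: write an index template body that pins the shard count and relaxes
# the refresh interval for the docker-* and nginx-* indices.
# TEMPLATE_FILE defaults to a scratch file (hypothetical variable name).
TEMPLATE_FILE="${TEMPLATE_FILE:-$(mktemp)}"

cat > "$TEMPLATE_FILE" << 'EOF'
{
  "index_patterns": ["docker-*", "nginx-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "refresh_interval": "30s"
    }
  }
}
EOF

# Print the request instead of sending it, so this stays a dry run
echo "curl -X PUT http://172.20.5.21:9200/_index_template/efk-logs -H 'Content-Type: application/json' -d @$TEMPLATE_FILE"
```

A template only affects indices created after it is stored, so existing daily indices keep their current settings.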
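Fluentd tuning
- Buffer size: adjust the chunk size limit (buffer_chunk_limit, or chunk_limit_size in the v1 <buffer> syntax) to match the log volume
- Multiple workers: enable multi-worker mode under high load
- Filtering: drop unneeded logs to reduce the volume sent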
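As an illustration of the filtering point, Fluentd's built-in grep filter can drop records before they reach the output. The sketch below writes such a filter to a scratch file; the "healthcheck" pattern and the SNIPPET variable are hypothetical, and the right pattern depends on what the containers actually log.

```shell
#!/bin/sh
# Sketch: write a Fluentd filter that drops container log lines whose "log"
# field matches a pattern. SNIPPET defaults to a scratch file.
SNIPPET="${SNIPPET:-$(mktemp)}"

cat > "$SNIPPET" << 'EOF'
<filter docker.**>
  @type grep
  <exclude>
    key log
    pattern /healthcheck/
  </exclude>
</filter>
EOF

echo "Paste $SNIPPET into the Fluentd config above the <match docker.**> block"
```

Filters run in the order they appear and only apply to events that have not yet reached a match block, which is why the snippet has to sit before the output section.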
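Scheduled cleanup of old logs
Create cron jobs that delete old indices from Elasticsearch:
# Every day at 02:00, delete the indices from 30 days ago.
# In a crontab, % must be escaped as \% and an entry cannot span lines.
# Elasticsearch 8 rejects wildcard deletes by default, so exact index names are used.
0 2 * * * curl -s -XDELETE "http://172.20.5.21:9200/docker-$(date -d '30 days ago' +\%Y.\%m.\%d)"
0 2 * * * curl -s -XDELETE "http://172.20.5.21:9200/nginx-$(date -d '30 days ago' +\%Y.\%m.\%d)"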
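A cron entry that targets exactly one day leaves an index behind whenever a run is missed. A script that covers a window of days is more tolerant. The sketch below is a dry run that only prints the DELETE commands; index_for_days_ago and print_cleanup are hypothetical helper names, GNU date is assumed, and index dates are computed in UTC because Fluentd's logstash_format names indices in UTC by default.

```shell
#!/bin/sh
# index_for_days_ago PREFIX DAYS [REF]: print the daily index name
# PREFIX-YYYY.MM.DD for the day DAYS before REF (default: now), in UTC.
index_for_days_ago() {
    ref_epoch=$(date -u -d "${3:-now}" +%s)
    date -u -d "@$((ref_epoch - $2 * 86400))" +"$1-%Y.%m.%d"
}

# print_cleanup PREFIX FROM TO [REF]: print one DELETE command per day in the
# window FROM..TO days ago, so a missed run is caught up by the next one.
print_cleanup() {
    prefix=$1
    day=$2
    last=$3
    while [ "$day" -le "$last" ]; do
        echo "curl -s -XDELETE http://172.20.5.21:9200/$(index_for_days_ago "$prefix" "$day" "$4")"
        day=$((day + 1))
    done
}

# Dry run: list the commands for indices 30 to 36 days old
print_cleanup docker 30 36
```

Piping the printed lines to `sh` would execute them; keeping the dry run as the default makes it easy to review what would be removed first.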