ElasticStack从入门到精通

什么是ElasticStack

ElasticStack早期名称为elk

elk代表了三个组件

ElasticSearch
负责数据存储和检索。
Logstash
负责数据的采集，将源数据采集到ElasticSearch进行存储。
Kibana
负责数据的展示。类似Granfa

由于Logstash是一个重量级产品，安装包超过300MB+，很多同学只是用于采集日志，于是使用其他采集工具代替，比如flume，fluentd等产品替代。

后来elastic公司也发现了这个问题，于是开发了一堆beats产品，其中典型代表就是Filebeat，metricbeat，heartbeat等。

而后，对于安全而言，又推出了xpack等相关组件，以及云环境的组件。

后期名称命名为elk stack(elk 技术栈)，后来公司为了宣传ElasticStack

ElasticStack 架构

ElasticStack版本

https://www.elastic.co/ elastic官网

最新版本8+，8版本默认启用了https协议，我们先安装7.17版本，然后手动启动https协议。

后面再练习安装8版本

选择elastic安装方式，我们再Ubuntu上部署elastic

二进制包部署单机es环境

部署

复制代码

1.下载elk安装包
root@elk:~# cat install_elk.sh 
#!/bin/bash
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-linux-x86_64.tar.gz.sha512
shasum -a 512 -c elasticsearch-7.17.28-linux-x86_64.tar.gz.sha512
tar -xzf elasticsearch-7.17.28-linux-x86_64.tar.gz -C /usr/local
cd elasticsearch-7.17.28/


2.修改配置文件
root@elk:~# vim /usr/local/elasticsearch-7.17.28/config/elasticsearch.yml 
root@elk:~# egrep -v "^#|^$" /usr/local/elasticsearch-7.17.28/config/elasticsearch.yml
cluster.name: xu-elasticstack
path.data: /var/lib/es7
path.logs: /var/log/es7
network.host: 0.0.0.0
discovery.type: single-node
相关参数说明:
	port
		默认端口是9200
		# By default Elasticsearch listens for HTTP traffic on the first free port it
		# finds starting at 9200. Set a specific HTTP port here:
		#
		#http.port: 9200
		
	cluster.name
		集群的名称
		
	path.data
		ES的数据存储路径。
		
	path.logs
		ES的日志存储路径。
		
	network.host
		# elasticStack默认只允许本机访问
		# By default Elasticsearch is only accessible on localhost. Set a different
		# address here to expose this node on the network:
		#
		#network.host: 192.168.0.1

		ES服务监听的地址。
		
	discovery.type
		# 如果部署的es集群就需要配置discovery.seed_hosts和cluster.initial_master_nodes参数
		# Pass an initial list of hosts to perform discovery when this node is started:
		# The default list of hosts is ["127.0.0.1", "[::1]"]
		#
		#discovery.seed_hosts: ["host1", "host2"]
		#
		# Bootstrap the cluster using an initial set of master-eligible nodes:
		#
		#cluster.initial_master_nodes: ["node-1", "node-2"]		
		指的ES集群的部署类型，此处的"single-node"，表示的是一个单点环境。
		
3.如果此时直接启动elastic会报错
3.1测试报错，官方给出的启动命令
Elasticsearch can be started from the command line as follows:
./bin/elasticsearch
root@elk:~# /usr/local/elasticsearch-7.17.28/bin/elasticsearch
# 这些是java类型报错
Mar 17, 2025 7:44:51 AM sun.util.locale.provider.LocaleProviderAdapter <clinit>
WARNING: COMPAT locale provider will be removed in a future release
[2025-03-17T07:44:53,125][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [elk] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:173) ~[elasticsearch-7.17.28.jar:7.17.28]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:160) ~[elasticsearch-7.17.28.jar:7.17.28]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) ~[elasticsearch-7.17.28.jar:7.17.28]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-7.17.28.jar:7.17.28]
	at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-7.17.28.jar:7.17.28]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:125) ~[elasticsearch-7.17.28.jar:7.17.28]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-7.17.28.jar:7.17.28]
Caused by: java.lang.RuntimeException: can not run elasticsearch as root
	at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:107) ~[elasticsearch-7.17.28.jar:7.17.28]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:183) ~[elasticsearch-7.17.28.jar:7.17.28]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.17.28.jar:7.17.28]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:169) ~[elasticsearch-7.17.28.jar:7.17.28]
	... 6 more
uncaught exception in thread [main]
java.lang.RuntimeException: can not run elasticsearch as root  # 不允许root直接启动
	at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:107)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:183)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434)
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:169)
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:160)
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
	at org.elasticsearch.cli.Command.main(Command.java:77)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:125)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80)
For complete error details, refer to the log at /var/log/es7/xu-elasticstack.log
2025-03-17 07:44:53,713764 UTC [1860] INFO  Main.cc@111 Parent process died - ML controller exiting

3.2创建启动用户
root@elk:~# useradd -m  elastic
root@elk:~# id elastic 
uid=1001(elastic) gid=1001(elastic) groups=1001(elastic)
# 通过elastic用户启动，此时还有一个报错
root@elk:~# su - elastic  -c  "/usr/local/elasticsearch-7.17.28/bin/elasticsearch"
could not find java in bundled JDK at /usr/local/elasticsearch-7.17.28/jdk/bin/java
# 系统中是存在java包的，但是elastic用户找不到，切换到elastic查看下
root@elk:~# ll /usr/local/elasticsearch-7.17.28/jdk/bin/java
-rwxr-xr-x 1 root root 12328 Feb 20 09:09 /usr/local/elasticsearch-7.17.28/jdk/bin/java*
root@elk:~# su - elastic 
$ pwd     
/home/elastic
$ ls /usr/local/elasticsearch-7.17.28/jdk/bin/java
# 报错的原因就是权限被拒绝，也就是elastic没有权限访问java包
ls: cannot access '/usr/local/elasticsearch-7.17.28/jdk/bin/java': Permission denied
# 一层一层地向外找，最后能找到/usr/local/elasticsearch-7.17.28/jdk/bin目录没有权限，导致报错
root@elk:~#  chown elastic:elastic  -R /usr/local/elasticsearch-7.17.28/
root@elk:~# ll -d /usr/local/elasticsearch-7.17.28/jdk/bin/
drwxr-x--- 2 elastic elastic 4096 Feb 20 09:09 /usr/local/elasticsearch-7.17.28/jdk/bin//

# 此时再进行启动测试，发现其他错误
# 我们指定地path.data和path.log不存在，我们需要手动创建
java.lang.IllegalStateException: Unable to access 'path.data' (/var/lib/es7)
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Unable to access 'path.data' (/var/lib/es7)
root@elk:~# install -d /var/{log,lib}/es7 -o elastic -g elastic
root@elk:~# ll -d /var/{log,lib}/es7
drwxr-xr-x 2 elastic elastic 4096 Mar 17 08:01 /var/lib/es7/
drwxr-xr-x 2 elastic elastic 4096 Mar 17 07:44 /var/log/es7/

# 现在重新启动服务，可以启动成功，检测端口
root@elk:~# su - elastic  -c  "/usr/local/elasticsearch-7.17.28/bin/elasticsearch"
root@elk:~# netstat -tunlp | egrep "9[2|3]00"
tcp6       0      0 :::9200                 :::*                    LISTEN      2544/java           
tcp6       0      0 :::9300                 :::*                    LISTEN      2544/java

通过浏览器访问9200

同时elastic提供了一个api，可以查看当前的主机数

复制代码

[root@zabbix ~]# curl 192.168.121.21:9200/_cat/nodes
172.16.1.21 40 97 0 0.11 0.29 0.20 cdfhilmrstw * elk
# 在命令行访问，由于目前的单节点部署es，所以node只有一个

# 前面我们启动es是前台启动前台启动会存在两个问题
1.占用终端
2.如果想结束es比较困难，所以这里一般我们采用后台运行的方式启动
官方给我们的后台运行方式
elasticsearch 的-d参数
To run Elasticsearch as a daemon, specify -d on the command line, and record the process ID in a file using the -p option:
./bin/elasticsearch -d -p pid
root@elk:~# su - elastic -c '/usr/local/elasticsearch-7.17.28/bin/elasticsearch -d'

# 常见报错问题
Q1：最大虚拟内存映射太小
bootstrap check failure [1] of [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
ERROR: Elasticsearch did not exit normally - check the logs at /var/log/es7/AAA.log
root@elk:~# sysctl -q vm.max_map_count
vm.max_map_count = 65530
root@elk:~# echo "vm.max_map_count = 262144" >> /etc/sysctl.d/es.conf
root@elk:~# sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144
root@elk:~# sysctl -q vm.max_map_count
vm.max_map_count = 262144


Q2：es配置文件写错
java.net.UnknownHostException: single-node


Q3：出现lock字样说明已经有ES实例启动。杀死现有进程后再重新执行启动命令
java.lang.IllegalStateException: failed to obtain node locks, tried [[/var/lib/es7]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?

Q5：ES集群部署的有问题，缺少master角色。
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

卸载环境

复制代码

1.停止elasticsearch
root@elk:~# kill `ps -ef | grep java | grep -v grep |awk '{print $2}'`
root@elk:~# ps -ef | grep java
root        4437    1435  0 09:21 pts/2    00:00:00 grep --color=auto java

2.删除数据目录、日志目录、安装包、用户
root@elk:~# rm -rf /usr/local/elasticsearch-7.17.28/ /var/{lib,log}/es7/
root@elk:~# userdel -r elastic

基于deb包安装ES单点

复制代码

1.安装deb包
root@elk:~# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb

2.安装es
root@elk:~# dpkg -i elasticsearch-7.17.28-amd64.deb 
# 通过二进制包安装es可以使用systemctl管理
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
 sudo systemctl daemon-reload
 sudo systemctl enable elasticsearch.service
### You can start elasticsearch service by executing
 sudo systemctl start elasticsearch.service
Created elasticsearch keystore in /etc/elasticsearch/elasticsearch.keystore


3.修改es配置文件
root@elk:~# vim /etc/elasticsearch/elasticsearch.yml 
root@elk:~# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: xu-es
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
discovery.type: single-node

4.启动es
systemctl enable elasticsearch --now
# 查看es的service文件，下面的参数都是在二进制安装的时候我们自己做的
User=elasticsearch
Group=elasticsearch
ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quie

cat /usr/share/elasticsearch/bin/systemd-entrypoint 
#!/bin/sh

# This wrapper script allows SystemD to feed a file containing a passphrase into
# the main Elasticsearch startup script

if [ -n "$ES_KEYSTORE_PASSPHRASE_FILE" ] ; then
  exec /usr/share/elasticsearch/bin/elasticsearch "$@" < "$ES_KEYSTORE_PASSPHRASE_FILE"
else
  exec /usr/share/elasticsearch/bin/elasticsearch "$@"
fi

es常见术语

复制代码

1.索引 Index
	用户进行数据读写的单元
2.分片 Shared
	一个索引至少要有一个分片，如果一个索引仅有一个分片，意味着该索引的数据只能全量存储在某个节点上，且分片是不可拆分的，隶属于某个节点。
	换句话说，分片是ES集群最小的调度单元。
	一个索引数据也可以被分散的存储在不同的分片上，且这些分片可以放在不同的节点，从而实现数据的分布式存储。
3.副本 replica
	副本是针对分片来说的，一个分片可以有0个或多个副本。
	当副本数量为0时，意味着只有主分片(priamry shard)，当主分片所在的节点宕机时，数据就无法访问了。
	当副本数量大于0时，意味着同时存在主分片和副本分片(replica shard):
		主分片负责数据的读写(read write,rw)
		副本分片负责数据的读的负载均衡(read only,ro)
4.文档 document
	指的是用户存储的数据。其中包含元数据和源数据。
	元数据：
		用于描述源数据的数据。
	源数据:  
		用户实际存储的数据。
5.分配： allocation
	指的是将索引的不同分片(包含主分片和副本分片)分配到整个集群的过程。

查看集群状态

复制代码

# es提供了api /_cat/health
root@elk:~# curl 127.1:9200/_cat/health
1742210504 11:21:44 xu-es green 1 1 3 3 0 0 0 0 - 100.0%
root@elk:~# curl 127.1:9200/_cat/health?v
epoch      timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1742210512 11:21:52  xu-es   green           1         1      3   3    0    0        0             0                  -                100.0%

es集群环境部署

复制代码

1.安装es集群服务
root@elk1:~# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb
root@elk1:~# dpkg -i elasticsearch-7.17.28-amd64.deb
root@elk2:~# dpkg -i elasticsearch-7.17.28-amd64.deb
root@elk3:~# dpkg -i elasticsearch-7.17.28-amd64.deb


2.配置es，三台机器一样的配置
# 不需要配置discovery.type了
[root@elk1 ~]# grep -E "^(cluster|path|network|discovery|http)" /etc/elasticsearch/elasticsearch.yml 
cluster.name: es-cluster
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["192.168.121.91", "192.168.121.92", "192.168.121.93"]


3.启动服务
systemctl enable elasticsearch --now 

4.测试，带有*的是master节点
root@elk:~# curl 127.1:9200/_cat/nodes
172.16.1.23  6 97 25 0.63 0.57 0.25 cdfhilmrstw - elk3
172.16.1.22  5 96 23 0.91 0.76 0.33 cdfhilmrstw - elk2
172.16.1.21 19 90 39 1.22 0.87 0.35 cdfhilmrstw * elk
root@elk:~# curl 127.1:9200/_cat/nodes?v
ip          heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
172.16.1.23            9          83   2    0.12    0.21     0.18 cdfhilmrstw -      elk3
172.16.1.22            8          96   3    0.16    0.28     0.24 cdfhilmrstw -      elk2
172.16.1.21           22          97   3    0.09    0.30     0.25 cdfhilmrstw *      elk

# 集群部署故障 没有uuid  集群缺少master

[root@elk3 ~]# curl http://192.168.121.92:9200/_cat/nodes?v
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
[root@elk3 ~]# curl 192.168.121.91:9200
{
  "name" : "elk91",
  "cluster_name" : "es-cluster",
  "cluster_uuid" : "_na_",
  ...
}
[root@elk3 ~]# 
[root@elk3 ~]# curl 10.0.0.92:9200
{
  "name" : "elk92",
  "cluster_name" : "es-cluster",
  "cluster_uuid" : "_na_",
  ...
}
[root@elk3 ~]# 
[root@elk3 ~]# 
[root@elk3 ~]# curl 10.0.0.93:9200
{
  "name" : "elk93",
  "cluster_name" : "es-cluster",
  "cluster_uuid" : "_na_",
  ...
}
[root@elk3 ~]# 
[root@elk3 ~]# curl http://192.168.121.91:9200/_cat/nodes
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}


# 解决方式
1.停止集群的ES服务
[root@elk91 ~]# systemctl stop elasticsearch.service 
[root@elk92 ~]# systemctl stop elasticsearch.service 
[root@elk93 ~]# systemctl stop elasticsearch.service 


2.删除数据，日志，和临时数据 
[root@elk91 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
[root@elk92 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
[root@elk93 ~]# rm -rf /var/{lib,log}/elasticsearch/* /tmp/*

3.添加配置项
[root@elk1 ~]# grep -E "^(cluster|path|network|discovery|http)" /etc/elasticsearch/elasticsearch.yml 
cluster.name: es-cluster
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["192.168.121.91", "192.168.121.92", "192.168.121.93"]
cluster.initial_master_nodes: ["192.168.121.91", "192.168.121.92", "192.168.121.93"]  ######

4.重启服务

5.测试



es集群master选举流程
1.启动时会检查集群是否有master，如果有则不发起选举master;
1.刚开始启动，所有节点均为人自己是master，并向集群的其他节点发送信息（包含ClusterStateVersion，ID等）
2.基于类似gossip协议获取所有可以参与master选举的节点列表;
3.先比较"ClusterStateVersion"，谁最大，谁优先级高，会被选举出master;
4.如果比不出来，则比较ID，谁的ID小，就优先成为master;
5.当集群半数以上节点参与选举完成后，则完成master选举，比如有N个节点，仅需要"(N/2)+1"节点就可以确认master;
6.master选举完成后，会向集群列表通报最新的master节点，此时才意味着选举完成;

DSL

ES相比与MySQL
- MySQL属于关系型数据库:
  增删改查：基于SQL
- ES属于文档型数据库，和MangoDB很相似
  增删改查： DSL语句，属于ES独有的有种查询语言。
  针对模糊查询，mysql无法充分利用索引，性能较低，而是用ES查询模糊数据，是非常高效的。

往es中添加单条请求

使用postman进行测试

复制代码

# 本质上是使用了curl
curl --location 'http://192.168.121.21:9200/test_linux/doc' \
--header 'Content-Type: application/json' \
--data '{
    "name": "孙悟空",
    "hobby": [
        "蟠桃",
        "紫霞仙子"
    ]
}



curl --location '192.168.121.21:9200/_bulk' \
--header 'Content-Type: application/json' \
--data '{ "create" : { "_index" : "test_linux_ss", "_id" : "1001" } }
{ "name" : "猪八戒","hobby": ["猴哥","高老庄"] }


{"create": {"_index":"test_linux_ss","_id":"1002"}}
{"name":"白龙马","hobby":["驮唐僧","吃草"]}
'

查询数据

复制代码

curl --location '192.168.121.22:9200/test_linux_ss/_doc/1001' \
--data ''



curl --location --request GET '192.168.121.22:9200/test_linux_ss/_search' \
--header 'Content-Type: application/json' \
--data '{
    "query":{
        "match":{
            "name":"猪八戒"
        }
    }
}'

删除数据

复制代码

curl --location --request DELETE '192.168.121.22:9200/test_linux_ss/_doc/1001'

kibana

部署kibana

kibana是针对ES做的一款可视化工具。将来的操作都可以在ES中完成。

复制代码

1.下载kibana
root@elk:~# wget https://artifacts.elastic.co/downloads/kibana/kibana-7.17.28-amd64.deb

2.安装kibana
root@elk:~# dpkg -i kibana-7.17.28-amd64.deb

3.修改配置文件
root@elk:~# vim /etc/kibana/kibana.yml 
root@elk:~# grep -E "^(elasticsearch.host|i18n|server)" /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://192.168.121.21:9200","http://192.168.121.22:9200","http://192.168.121.23:9200"]
i18n.locale: "zh-CN"

4.启动kibana
root@elk:~# systemctl enable kibana.service --now
Synchronizing state of kibana.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable kibana
Created symlink /etc/systemd/system/multi-user.target.wants/kibana.service → /etc/systemd/system/kibana.service.
root@elk:~# netstat  -tunlp | grep 5601
tcp        0      0 0.0.0.0:5601            0.0.0.0:*               LISTEN      19392/node

web端访问测试

基于KQL基本使用

过滤数据

Filebeat

部署Filebeat

复制代码

1.下载Filebeat
root@elk2:~# wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.28-amd64.deb

2.安装Filebeat
root@elk2:~# dpkg -i filebeat-7.17.28-amd64.deb 

3.编写Filebeat配置文件
# Filebeat需要我们自己创建config目录，然后编写配置文件
mkdir /etc/filebeat/config
vim /etc/filebeat/config/01-log-to-console.yaml
# Filebeat配置文件包含两个部分，Input和Output，Input指明要从哪里采集数据；Output指明数据存储到哪里，根据官方文档进行配置
# 目前还没有服务的运行日志，所以Input来源指定为特殊文件，Output去向输出的终端console

root@elk2:~# cat /etc/filebeat/config/01-log-to-console.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /tmp/student.log

# 定义数据到终端
output.console:
  pretty: true

4.运行filebeat实例
filebeat -e -c /etc/filebeat/config/01-log-to-console.yaml

5.创建student.log文件，并写入数据
root@elk2:~# echo ABC > /tmp/student.log

// 输出提示
{
  "@timestamp": "2025-03-18T14:48:42.432Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.28"
  },
  "message": "ABC",   // 检测到的内容变化
  "input": {
    "type": "log"
  },
  "ecs": {
    "version": "1.12.0"
  },
  "host": {
    "name": "elk2"
  },
  "agent": {
    "type": "filebeat",
    "version": "7.17.28",
    "hostname": "elk2",
    "ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe",
    "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
    "name": "elk2"
  },
  "log": {
    "offset": 0,   // 偏移量offset=0 表示从0开始 
    "file": {
      "path": "/tmp/student.log"
    }
  }
}


# 向文件student.log中追加写入数据
root@elk2:~# echo 123 >> /tmp/student.log

// 查看filebeat输出提示
{
  "@timestamp": "2025-03-18T14:51:17.449Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.28"
  },
  "log": {
    "offset": 4,  // 偏移量从4开始的
    "file": {
      "path": "/tmp/student.log"
    }
  },
  "message": "123",   // 统计到了数据 123
  "input": {
    "type": "log"
  },
  "ecs": {
    "version": "1.12.0"
  },
  "host": {
    "name": "elk2"
  },
  "agent": {
    "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
    "name": "elk2",
    "type": "filebeat",
    "version": "7.17.28",
    "hostname": "elk2",
    "ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe"
  }
}

Filebeat特性

复制代码

# 使用echo的静默输出进行追加 # 此时filebeat采集不到数据
root@elk2:~# echo -n 456 >> /tmp/student.log
root@elk2:~# cat /tmp/student.log
ABC
123
456root@elk2:~# 
root@elk2:~# echo -n abc  >> /tmp/student.log
root@elk2:~# cat /tmp/student.log
ABC
123
456789abcroot@elk2:~# 

# 使用非静默写入数据  此时filebeat就可以采集到数据了
root@elk2:~# echo haha >> /tmp/student.log
root@elk2:~# cat /tmp/student.log
ABC
123
456789abchaha


// 查看filebeat输出信息
{
  "@timestamp": "2025-03-18T14:55:37.476Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.28"
  },
  "host": {
    "name": "elk2"
  },
  "agent": {
    "name": "elk2",
    "type": "filebeat",
    "version": "7.17.28",
    "hostname": "elk2",
    "ephemeral_id": "7f116862-382c-48f4-8797-c4b689e6e6fe",
    "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf"
  },
  "log": {
    "offset": 8,  // 偏移量=8
    "file": {
      "path": "/tmp/student.log"
    }
  },
  "message": "456789abchaha",  // 采集到的数据是静默输出+非静默输出 的所有数据
  "input": {
    "type": "log"
  },
  "ecs": {
    "version": "1.12.0"
  }
}

由此可以得出Filebeat的第一条性质：

filebeat默认是按行采集数据;

复制代码

# 现在将Filebeat停止，修改student.log，如果重新开始进行数据采集，那么Filebeat是采集文件中所有的内容，还是只采集Filebeat停止后新增的内容
root@elk2:~# echo xixi >> /tmp/student.log

# 重启Filebeat

// 查看Filebeat的输出信息
{
  "@timestamp": "2025-03-18T15:00:51.759Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.28"
  },
  "ecs": {
    "version": "1.12.0"
  },
  "host": {
    "name": "elk2"
  },
  "agent": {
    "type": "filebeat",
    "version": "7.17.28",
    "hostname": "elk2",
    "ephemeral_id": "81db6575-7f98-4ca4-a86f-4d0127c1e2a4",
    "id": "ba0b7fa3-59b2-4988-bfa1-d9ac8728bcaf",
    "name": "elk2"
  },
  "log": {
    "offset": 22,  // 偏移量也不是从0开始
    "file": {
      "path": "/tmp/student.log"
    }
  }, 
  "message": "xixi",  // 只采集Filebeat停止后新增的内容
  "input": {
    "type": "log"
  }
}


// 在上面有一条提示，当再次启动filebeat时回导入/var/lib/filebeat/registry/filebeat目录下的json文件
2025-03-18T15:00:51.756Z	INFO	memlog/store.go:124	Finished loading transaction log file for '/var/lib/filebeat/registry/filebeat'. Active transaction id=5
// 在我们第一次启动时也会从/var/lib/filebeat/registry/filebeat目录下导入，但是第一次启动时不会存在该目录，自然也没有信息

// 查看/var/lib/filebeat/registry/filebeat下的json文件，在这个json文件中记录了offset值，这就是filebeat能够在停止后，重启不会从头开始记录文件内容
{"op":"set","id":1}
{"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","prev_id":"","timestamp":[431172511,1742309322],"ttl":-1,"identifier_name":"native","source":"/tmp/student.log","offset":0,"type":"log","FileStateOS":{"inode":1441831,"device":64768}}}
{"op":"set","id":2}
{"k":"filebeat::logs::native::1441831-64768","v":{"prev_id":"","source":"/tmp/student.log","type":"log","FileStateOS":{"inode":1441831,"device":64768},"id":"native::1441831-64768","offset":4,"timestamp":[434614328,1742309323],"ttl":-1,"identifier_name":"native"}}
{"op":"set","id":3}
{"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","identifier_name":"native","ttl":-1,"type":"log","FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","source":"/tmp/student.log","offset":8,"timestamp":[450912955,1742309478]}}
{"op":"set","id":4}
{"k":"filebeat::logs::native::1441831-64768","v":{"type":"log","identifier_name":"native","offset":22,"timestamp":[478003874,1742309738],"source":"/tmp/student.log","ttl":-1,"FileStateOS":{"inode":1441831,"device":64768},"id":"native::1441831-64768","prev_id":""}}
{"op":"set","id":5}
{"k":"filebeat::logs::native::1441831-64768","v":{"id":"native::1441831-64768","ttl":-1,"FileStateOS":{"device":64768,"inode":1441831},"identifier_name":"native","prev_id":"","source":"/tmp/student.log","offset":22,"timestamp":[478003874,1742309738],"type":"log"}}
{"op":"set","id":6}
{"k":"filebeat::logs::native::1441831-64768","v":{"offset":22,"timestamp":[759162512,1742310051],"type":"log","FileStateOS":{"device":64768,"inode":1441831},"id":"native::1441831-64768","prev_id":"","identifier_name":"native","source":"/tmp/student.log","ttl":-1}}
{"op":"set","id":7}
{"k":"filebeat::logs::native::1441831-64768","v":{"offset":22,"timestamp":[759368397,1742310051],"type":"log","FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","source":"/tmp/student.log","ttl":-1,"identifier_name":"native","id":"native::1441831-64768"}}
{"op":"set","id":8}
{"k":"filebeat::logs::native::1441831-64768","v":{"ttl":-1,"identifier_name":"native","id":"native::1441831-64768","source":"/tmp/student.log","timestamp":[761513338,1742310052],"FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","offset":27,"type":"log"}}
{"op":"set","id":9}
{"k":"filebeat::logs::native::1441831-64768","v":{"source":"/tmp/student.log","timestamp":[795028411,1742310356],"FileStateOS":{"inode":1441831,"device":64768},"prev_id":"","offset":27,"ttl":-1,"type":"log","identifier_name":"native","id":"native::1441831-64768"}}

这是Filebeat的第二个特性

filebeat默认会在"/var/lib/filebeat"目录下记录已经采集的文件offset信息，以便于下一次采集接着该位置继续采集数据;

Filebeat写入es

复制代码

将Filebeat的Output写入到es中
根据官方文档查看配置
The Elasticsearch output sends events directly to Elasticsearch using the Elasticsearch HTTP API.

Example configuration:

output.elasticsearch:  
  hosts: ["https://myEShost:9200"] 

root@elk2:~# cat /etc/filebeat/config/02-log-to-es.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /tmp/student.log

# 定义数据到终端
output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
# 删除filebeat的json文件
root@elk2:~# rm -rf /var/lib/filebeat

# 启动filebeat实例
root@elk2:~# filebeat -e -c  /etc/filebeat/config/02-log-to-es.yaml

在kibana中收集到数据了

查看收集到的数据

设置刷新频率

自定义索引

复制代码

# 索引名称我们可以自己定义，官方给出了定义方式 index参数设置
output.elasticsearch:
  hosts: ["http://localhost:9200"]
  index: "%{[fields.log_type]}-%{[agent.version]}-%{+yyyy.MM.dd}" 
  
root@elk2:~# cat /etc/filebeat/config/03-log-to-constom.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /tmp/student.log

# 定义数据到终端
output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # 自定义索引名称
  index: "test_filebeat-%{+yyyy.MM.dd}"

# 启动filebeat，此时会报错
root@elk2:~# filebeat -e -c /etc/filebeat/config/03-log-to-constom.yaml 
2025-03-19T02:55:18.951Z	INFO	instance/beat.go:698	Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat] Hostfs Path: [/]
2025-03-19T02:55:18.958Z	INFO	instance/beat.go:706	Beat ID: a109c2d1-fbb6-4b82-9416-29f9488ccabc
# 你必须要设置setup.template.name and setup.template.patter这两个参数，也就是说如果我们想自定义index名称，必须设置setup.template.name and setup.template.patter
2025-03-19T02:55:18.958Z	ERROR	instance/beat.go:1027	Exiting: setup.template.name and setup.template.pattern have to be set if index name is modified
Exiting: setup.template.name and setup.template.pattern have to be set if index name is modified

# setup.template.name and setup.template.patter在官网中给出了提示、
 If you change this setting, you also need to configure the setup.template.name and setup.template.pattern options (see Elasticsearch index template).

# 官方给出的实例
setup.template.name
	The name of the template. The default is filebeat. The Filebeat version is always appended to the given name, so the final name is filebeat-%{[agent.version]}.
	
setup.template.pattern
	The template pattern to apply to the default index settings. The default pattern is filebeat. The Filebeat version is always included in the pattern, so the final pattern is filebeat-%{[agent.version]}.

Example:

setup.template.name: "filebeat"
setup.template.pattern: "filebeat"

# 还需要设置shared和replicas 官方给出的默认设置
setup.template.settings:
  index.number_of_shards: 1
  index.number_of_replicas: 1

# 配置我们自己的索引模板(就是创建索引的规则)
root@elk2:~# cat /etc/filebeat/config/03-log-to-constom.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /tmp/student.log

# 定义数据到终端
output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # 自定义索引名称
  index: "test_filebeat-%{+yyyy.MM.dd}"

# 定义索引模板(就是创建索引的规则)的名称
setup.template.name: "test_filebeat"
# 定义索引模板的匹配模式，表示当前索引模板针对哪些索引生效。
setup.template.pattern: "test_filebeat-*"
# # 定义索引模板的规则信息
setup.template.settings:
  # 分片数
  index.number_of_shards: 3
  # 每个分片有多少个副本
  index.number_of_replicas: 0


# 启动filebeat 此时可以正常启动filebeat，但是在kibana中发现没有建立我们设置index，查看启动信息
root@elk2:~# filebeat  -e -c /etc/filebeat/config/03-log-to-constom.yaml 
# 这里的大概意思就是ILM设置为了auto  如果启用了此配置，则忽略自定义索引的所有信息  所以我们要将ILM设置为false
2025-03-19T03:10:02.548Z	INFO	[index-management]	idxmgmt/std.go:260	Auto ILM enable success.
2025-03-19T03:10:02.558Z	INFO	[index-management.ilm]	ilm/std.go:170	ILM policy filebeat exists already.
2025-03-19T03:10:02.559Z	INFO	[index-management]	idxmgmt/std.go:396	Set setup.template.name to '{filebeat-7.17.28 {now/d}-000001}' as ILM is enabled.

# 查看官网对于index lifecycle management  ILM配置
When index lifecycle management (ILM) is enabled, the default index is "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}-%{index_num}", for example, "filebeat-8.17.3-2025-03-17-000001". Custom index settings are ignored when ILM is enabled. If you're sending events to a cluster that supports index lifecycle management, see Index lifecycle management (ILM) to learn how to change the index name.

# ilm默认是auto模式，支持true、false、auto
Enables or disables index lifecycle management on any new indices created by Filebeat. Valid values are true, false, and auto. When auto (the default) is specified on version 7.0 and later
setup.ilm.enabled: auto

# 添加ilm配置到我们自己的配置文件中

# 启动filebeat
root@elk2:~# filebeat  -e -c /etc/filebeat/config/03-log-to-constom.yaml

索引模板已被建立

复制代码

# 此时我想修改我的shared和replicas
# 直接修改配置文件 改为5shared 0repicas

此时还是3shared 0replicas

复制代码

# 这是由于 setup.template.overwrite 参数 默认是false也就是不覆盖
setup.template.overwrite
A boolean that specifies whether to overwrite the existing template. The default is false. Do not enable this option if you start more than one instance of Filebeat at the same time. It can overload Elasticsearch by sending too many template update requests.


# 设置setup.template.overwrite 为 true
root@elk2:~# cat /etc/filebeat/config/03-log-to-constom.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /tmp/student.log

# 定义数据到终端
output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # 自定义索引名称
  index: "test_filebeat-%{+yyyy.MM.dd}"

# 禁用索引生命周期管理(index lifecycle management，ILM)
# 如果启用了此配置，则忽略自定义索引的所有信息
setup.ilm.enabled: false
# 如果索引模板存在，是否覆盖，默认值为false，如果明确需要，则可以将其设置为ture。
# 但是官方建议将其设置为false，原因是每次写数据时，都会建立tcp链接，消耗资源。
setup.template.overwrite: true
# 定义索引模板(就是创建索引的规则)的名称
setup.template.name: "test_filebeat"
# 定义索引模板的匹配模式，表示当前索引模板针对哪些索引生效。
setup.template.pattern: "test_filebeat-*"
# # 定义索引模板的规则信息
setup.template.settings:
  # 分片数
  index.number_of_shards: 5
  # 每个分片有多少个副本
  index.number_of_replicas: 0

# 启动filebeat
# 此时就修改完成了shared和replicas

Filebeat采集nginx实战

复制代码

1.安装nginx
root@elk2:~# apt install -y nginx

2.启动nginx
root@elk2:~# systemctl start nginx
root@elk2:~# netstat -tunlp | grep 80
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      17956/nginx: master 
tcp6       0      0 :::80                   :::*                    LISTEN      17956/nginx: master

3.测试访问
root@elk2:~# curl 127.1
# 日志位置
root@elk2:~# ll /var/log/nginx/access.log 
-rw-r----- 1 www-data adm 86 Mar 19 06:58 /var/log/nginx/access.log
root@elk2:~# cat /var/log/nginx/access.log
127.0.0.1 - - [19/Mar/2025:06:58:31 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0"


4.编写Filebeat实例
root@elk2:~# cat /etc/filebeat/config/04-log-to-nginx.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /var/log/nginx/access.log*

# 定义数据到终端
output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # 自定义索引名称
  index: "test_filebeat-%{+yyyy.MM.dd}"

# 禁用索引生命周期管理(index lifecycle management，ILM)
# 如果启用了此配置，则忽略自定义索引的所有信息
setup.ilm.enabled: false
# 如果索引模板存在，是否覆盖，默认值为false，如果明确需要，则可以将其设置为ture。
# 但是官方建议将其设置为false，原因是每次写数据时，都会建立tcp链接，消耗资源。
setup.template.overwrite: true
# 定义索引模板(就是创建索引的规则)的名称
setup.template.name: "test_filebeat"
# 定义索引模板的匹配模式，表示当前索引模板针对哪些索引生效。
setup.template.pattern: "test_filebeat-*"
# # 定义索引模板的规则信息
setup.template.settings:
  # 分片数
  index.number_of_shards: 5
  # 每个分片有多少个副本
  index.number_of_replicas: 0


5.启动filebeat
root@elk2:~# filebeat  -e -c /etc/filebeat/config/04-log-to-nginx.yaml

Filebeat分析nginx日志

filebeat modules

复制代码

# filebeat support any modules
# 关于filebeat模块官方的解释：他能简化filebeat分析的日志格式
# Filebeat modules simplify the collection, parsing, and visualization of common log formats.
# 默认情况下这些模块都是disable  在需要时我们需要自己设置为enabled
root@elk2:~# ls -l /etc/filebeat/modules.d/
total 300
-rw-r--r-- 1 root root   484 Feb 13 16:58 activemq.yml.disabled
-rw-r--r-- 1 root root   476 Feb 13 16:58 apache.yml.disabled
-rw-r--r-- 1 root root   281 Feb 13 16:58 auditd.yml.disabled
-rw-r--r-- 1 root root  2112 Feb 13 16:58 awsfargate.yml.disabled
。。。
root@elk2:~# ls -l /etc/filebeat/modules.d/ | wc -l
72

# 查看哪些模块处于enabled和disabled
root@elk2:~# filebeat modules list


# 启动模块
root@elk2:~# filebeat modules enable apache nginx mysql redis
Enabled apache
Enabled nginx
Enabled mysql
Enabled redis

# 停止模块
root@elk2:~# filebeat modules disable  apache  mysql redis
Disabled apache
Disabled mysql
Disabled redis

配置filebeat监控nginx

复制代码

# 需要在filebeat的配置文件中配置模块功能  在/etc/filebeat/filebeat.yml文件中规定了配置方式
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

module_nginx# 写入filebeat实例
root@elk2:~# cat /etc/filebeat/config/07-module-nginx-to-es.yaml
# Config modules
filebeat.config.modules:
  # Glob pattern for configuration loading  指定在哪个路径下加载
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading 自动加载/etc/filebeat/modules.d/下面的yml文件。
  reload.enabled: true

  # Period on which files under path should be checked for changes
  #reload.period: 10s

output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  index: "module_nginx-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.overwrite: true
setup.template.name: "module_nginx"
setup.template.pattern: "module_nginx-*"
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0


# 准备nginx日志测试用例
root@elk2:~# cat /var/log/nginx/access.log
192.168.121.1 - - [19/Mar/2025:16:42:23 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 CrKey/1.54.248666"
1.168.121.1 - - [19/Mar/2025:16:42:26 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 CrKey/1.54.248666"
92.168.121.1 - - [19/Mar/2025:16:42:29 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1"
192.168.11.1 - - [19/Mar/2025:16:42:31 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 13; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36"
192.168.121.1 - - [19/Mar/2025:16:42:40 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15"



# 启动filebeat实例
root@elk2:~# filebeat -e -c /etc/filebeat/config/07-module-nginx-to-es.yaml 





# 在收集到的结果中包含正确的日志和错误的日志，我们如果想只收集正确的，需要设置nginx模板的配置文件 /etc/filebeat/module.d/nginx.yml
- module: nginx
  # Access logs
  access:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/nginx/access.log"]

  # Error logs
  error:
    enabled: false    # 从true改为false

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

  # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
  ingress_controller:
    enabled: false

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

kibana分析PV

复制代码

pv：  page view  页面访问量
1个request 就是一个 pv

kibana分析IP

kibana分析带宽

kibana制作Dashboard

kibana分析设备

kibana分析操作系统占比

kibana分析全球用户占比

filebeat采集tomcat日志

部署tomcat

复制代码

[root@elk2 ~]# wget https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.5/bin/apache-tomcat-11.0.5.tar.gz
[root@elk2 ~]# tar xf apache-tomcat-11.0.5.tar.gz  -C /usr/local

# 配置环境变量
# es本身有jdk环境，我们将es的jdk环境配置到环境变量中，让tomcat调用es的jdk环境
# es的jdk环境目录
[root@elk2 ~]# ll /usr/share/elasticsearch/jdk/
# 添加环境变量
[root@elk2 ~]# vim /etc/profile.d/tomcat.sh
[root@elk2 ~]# source /etc/profile.d/tomcat.sh
[root@elk2 ~]# cat /etc/profile.d/tomcat.sh 
#!/bin/bash
export JAVA_HOME=/usr/share/elasticsearch/jdk
export TOMCAT_HOME=/usr/local/apache-tomcat-11.0.5
export PATH=$PATH:$JAVA_HOME/bin:$TOMCAT_HOME/bin
[root@elk3 ~]# java -version
openjdk version "22.0.2" 2024-07-16
OpenJDK Runtime Environment (build 22.0.2+9-70)
OpenJDK 64-Bit Server VM (build 22.0.2+9-70, mixed mode, sharing)

# 因为tomcat默认的日志格式显示的信息很少，所以这里需要修改tomcat配置文件，修改日志格式
[root@elk3 ~]# vim /usr/local/apache-tomcat-11.0.5/conf/server.xml 
...
          <Host name="tomcat.test.com"  appBase="webapps"
                unpackWARs="true" autoDeploy="true">

		<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
            prefix="tomcat.test.com_access_log" suffix=".json"
pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;request&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;http_user_agent&quot;:&quot;%{User-Agent}i&quot;}"/>

          </Host>



# 启动tomcat
[root@elk2 ~]# catalina.sh  start 
[root@elk2 ~]# netstat -tunlp | grep 8080
tcp6       0      0 :::8080                 :::*                    LISTEN      98628/java   


# 访问测试
[root@elk2 ~]# cat /etc/hosts
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters


192.168.121.92 tomcat.test.com

[root@elk2 ~]# cat /usr/local/apache-tomcat-11.0.5/logs/tomcat.test.com_access_log.2025-03-23.json
{"clientip":"192.168.121.92","ClientUser":"-","authenticated":"-","AccessTime":"[23/Mar/2025:20:55:41 +0800]","request":"GET / HTTP/1.1","status":"200","SendBytes":"11235","Query?string":"","partner":"-","http_user_agent":"curl/7.81.0"}

配置filebeat监控tomcat

复制代码

# 启动tomcat模块
[root@elk3 ~]# filebeat modules enable tomcat
Enabled tomcat
[root@elk3 ~]# ll /etc/filebeat/modules.d/tomcat.yml 
-rw-r--r-- 1 root root 623 Feb 14 00:58 /etc/filebeat/modules.d/tomcat.yml



# 配置tomcat模块
[root@elk3 ~]# cat /etc/filebeat/modules.d/tomcat.yml 
# Module: tomcat
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-module-tomcat.html

- module: tomcat
  log:
    enabled: true

    # Set which input to use between udp (default), tcp or file.
    # var.input: udp
    var.input: file
    # var.syslog_host: tomcat.test.com
    # var.syslog_port: 8080

    # Set paths for the log files when file input is used.
    # var.paths:
    #   - /var/log/tomcat/*.log
    var.paths:
      - /usr/local/apache-tomcat-11.0.5/logs/tomcat.test.com_access_log.2025-03-23.json

    # Toggle output of non-ECS fields (default true).
    # var.rsa_fields: true

    # Set custom timezone offset.
    # "local" (default) for system timezone.
    # "+02:00" for GMT+02:00
    # var.tz_offset: local

# 配置filebeat
[root@elk3 ~]# cat /etc/filebeat/config/02-tomcat-es.yaml 
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat"
setup.template.pattern: "test-modules-tomcat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0


# 启动filebeat
[root@elk3 ~]# filebeat -e -c /etc/filebeat/config/02-tomcat-es.yaml

filebeat processors

filebeat 处理器

https://www.elastic.co/guide/en/beats/filebeat/7.17/filtering-and-enhancing-data.html

复制代码

# 在我们使用filebeat采集tomcat日志时，由于tomcat的日志在我们的设置下是json格式的，我们想要得出json格式中的具体信息，需要通过filebeat processors进行进一步配置
# decode JSON fields 参数可以实现解析json格式
# 官方配置
processors:
  - decode_json_fields:
      fields: ["field1", "field2", ...]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
# fields：指定要对哪个字段进行json解析
# process_array：一个bool值，指定是否解析数字，默认是false，可选配置
# max_depth：最大解析深度，默认值为1 将解码中所示字段中的 JSON 对象fields，值为2还将解码嵌入在这些解析文档的字段中的对象，可选配置
# target：解码后的 JSON 将写入的字段。默认情况下，解码后的 JSON 对象将替换读取它的字符串字段。要将解码后的 JSON 字段合并到事件的根目录中，请指定 target一个空字符串 ( target: "")。请注意，null值 ( target:) 被视为未设置字段 可选配置
# overwrite_keys：布尔值，指定事件中的现有键是否被解码的 JSON 对象中的键覆盖。默认值为false。可选配置
# add_error_key：如果设置为 ，true并且在解码 JSON 密钥时发生错误，则该error字段将成为事件的一部分，并带有错误消息。如果设置为false，则事件的字段中不会有任何错误。默认值为false。可选配置

# 写入filebeat配置
[root@elk3 ~]# cat /etc/filebeat/config/02-tomcat-es.yaml,
cat: /etc/filebeat/config/02-tomcat-es.yaml,: No such file or directory
[root@elk3 ~]# cat /etc/filebeat/config/02-tomcat-es.yaml
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
processors:
  - decode_json_fields:
      fields: ["event.original"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat"
setup.template.pattern: "test-modules-tomcat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

# 启动filebeat
[root@elk3 ~]# filebeat -e -c /etc/filebeat/config/02-tomcat-es.yaml



# 通过filebeat删除一个字段
processors:
  - drop_fields:
      when:
        condition
      fields: ["field1", "field2", ...]
      ignore_missing: false
      
The supported conditions are:

equals
contains
regexp
range
network
has_fields
or
and
not

# 删除status 是404的 event.module 字段值
[root@elk3 ~]# cat /etc/filebeat/config/02-tomcat-es.yaml
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
processors:
  - decode_json_fields:
      fields: ["event.original"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
  - drop_fields:
      when:
        equals:
          status: "404"
      fields: ["event.module"]
      ignore_missing: false
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat"
setup.template.pattern: "test-modules-tomcat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
[root@elk3 ~]# filebeat -e -c /etc/filebeat/config/02-tomcat-es.yam

filebeat采集es集群日志

复制代码

# 启动模块
[root@elk1 ~]# filebeat modules enable elasticsearch
Enabled elasticsearch

[root@elk1 ~]# cat /etc/filebeat/config/02-es-log.yml 
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true

output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: es-log-modules-eslog-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-log"
setup.template.pattern: "es-log-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

filebeat采集mysql日志

复制代码

# 部署mysql
[root@elk1 ~]# wget https://dev.mysql.com/get/Downloads/MySQL-8.4/mysql-8.4.4-linux-glibc2.28-x86_64.tar.xz
[root@elk1 ~]# tar xf mysql-8.4.4-linux-glibc2.28-x86_64.tar.xz -C /usr/local/


# 准备启动脚本并授权 
[root@elk1 ~]# cp /usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/support-files/mysql.server /etc/init.d/
[root@elk1 ~]# vim /etc/init.d/mysql.server 
[root@elk1 ~]# grep -E "^(basedir=|datadir=)" /etc/init.d/mysql.server
basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
datadir=/var/lib/mysql
[root@elk1 ~]# useradd -m mysql
[root@elk1 ~]# install -d /var/lib/mysql -o mysql -g mysql
[root@elk1 ~]# ll -d /var/lib/mysql
drwxr-xr-x 2 mysql mysql 4096 Mar 25 17:05 /var/lib/mysql/

# 准备配置文件
[root@elk1 ~]# vim /etc/my.cnf
[root@elk1 ~]# cat /etc/my.cnf
[mysqld]
basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
datadir=/var/lib/mysql
socket=/tmp/mysql80.sock
port=3306

[client]
socket=/tmp/mysql80.sock

# 启动数据库
[root@elk1 ~]# vim /etc/profile.d/mysql.sh
[root@elk1 ~]# cat /etc/profile.d/mysql.sh
#!/bin/bash
export MYSQL_HOME=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/
export PATH=$PATH:$MYSQL_HOME/bin
[root@elk1 ~]# source  /etc/profile.d/mysql.sh
[root@elk1 ~]# mysqld --initialize-insecure  --user=mysql  --datadir=/var/lib/mysql  --basedir=/usr/local/mysql-8.4.4-linux-glibc2.28-x86_64
2025-03-25T09:08:36.829914Z 0 [System] [MY-015017] [Server] MySQL Server Initialization - start.
2025-03-25T09:08:36.842773Z 0 [System] [MY-013169] [Server] /usr/local/mysql-8.4.4-linux-glibc2.28-x86_64/bin/mysqld (mysqld 8.4.4) initializing of server in progress as process 7905
2025-03-25T09:08:36.918780Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2025-03-25T09:08:37.818933Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2025-03-25T09:08:42.504501Z 6 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.
2025-03-25T09:08:46.909940Z 0 [System] [MY-015018] [Server] MySQL Server Initialization - end.
[root@elk1 ~]# /etc/init.d/mysql.server start
Starting mysql.server (via systemctl): mysql.server.service.
[root@elk1 ~]# netstat -tunlp | grep 3306
tcp6       0      0 :::3306                 :::*                    LISTEN      8141/mysqld         
tcp6       0      0 :::33060                :::*                    LISTEN      8141/mysqld  



# 开启filebeat的模块
[root@elk1 ~]# filebeat modules enable mysql
Enabled mysql

# 配置filebeat
[root@elk1 ~]# cat /etc/filebeat/config/03-es-mysql-log.yaml
filebeat.config.modules:
  path: ${path.config}/modules.d/mysql.yml
  reload.enabled: true

output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: es-modules-mysql-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-modules-mysql"
setup.template.pattern: "es-modules-mysql-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
# 配置mysql modules
[root@elk1 ~]# cat /etc/filebeat/modules.d/mysql.yml
# Module: mysql
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-module-mysql.html

- module: mysql
  # Error logs
  error:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
    var.paths: ["/var/lib/mysql/elk1.err"]

  # Slow logs
  slowlog:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

# 启动filebeat实例
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/03-es-mysql-log.yaml

filebeat采集redis

复制代码

# 安装redis
[root@elk1 ~]# apt install -y redis

# redis日志文件位置
[root@elk1 ~]# cat /var/log/redis/redis-server.log 
8618:C 25 Mar 2025 17:18:37.442 # WARNING supervised by systemd - you MUST set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
8618:C 25 Mar 2025 17:18:37.442 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
8618:C 25 Mar 2025 17:18:37.442 # Redis version=6.0.16, bits=64, commit=00000000, modified=0, pid=8618, just started
8618:C 25 Mar 2025 17:18:37.442 # Configuration loaded
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 6.0.16 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 8618
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

8618:M 25 Mar 2025 17:18:37.446 # Server initialized
8618:M 25 Mar 2025 17:18:37.446 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
8618:M 25 Mar 2025 17:18:37.447 * Ready to accept connections

# 启动redis modules
[root@elk1 ~]# filebeat modules enable redis
Enabled redis

[root@elk1 ~]# cat /etc/filebeat/config/04-es-redis-log.yaml 
filebeat.config.modules:
  path: ${path.config}/modules.d/redis.yml
  reload.enabled: true

output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: es-modules-redis-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-modules-redis"
setup.template.pattern: "es-modules-redis-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

# 启动filebeat实例
[root@elk1 ~]# filebeat  -e -c /etc/filebeat/config/04-es-redis-log.yaml

filebeat多行合并问题

复制代码

# Manage multiline messages
# 
parsers:
- multiline:
    type: pattern
    pattern: '^\['
    negate: true
    match: after
    
multiline.type
	Defines which aggregation method to use. The default is pattern. The other option is count which lets you aggregate constant number of lines.
multiline.pattern
	Specifies the regular expression pattern to match. Note that the regexp patterns supported by Filebeat differ somewhat from the patterns supported by Logstash. See Regular expression 		support for a list of supported regexp patterns. Depending on how you configure other multiline options, lines that match the specified regular expression are considered either 			continuations of a previous line or the start of a new multiline event. You can set the negate option to negate the pattern.
multiline.negate
	Defines whether the pattern is negated. The default is false.
multiline.match
	Specifies how Filebeat combines matching lines into an event. The settings are after or before. The behavior of these settings depends on what you specify for negate:

manager multiline redis log message

复制代码

# 通过manager multiline message优化redis集群日志采集规则
# type: filestream 是对 旧版本log的替代 
[root@elk1 ~]# cat /etc/filebeat/config/04-es-redis-log.yaml
filebeat.inputs:
- type: filestream
  paths:
    - /var/log/redis/redis-server.log*
  # 配置解析器
  parsers:
    # 定义多行匹配
  - multiline:
      # 指定匹配的类型
      type: pattern
      # 定义匹配模式
      pattern: '^\d'
      # 参考官网: https://www.elastic.co/guide/en/beats/filebeat/current/multiline-examples.html
      negate: true
      match: after
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: es-modules-redis-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-modules-redis"
setup.template.pattern: "es-modules-redis-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/04-es-redis-log.yaml 
# redis日志中的图案就筛选出来了

manager multiline tomcat error log message

复制代码

# tomcat error log path : /usr/local/apache-tomcat-11.0.5/logs/catalina.*
[root@elk2 ~]# cat /etc/filebeat/config/01-es-cluster-tomcat.yml
filebeat.inputs:
- type: filestream
  paths:
    - /usr/local/apache-tomcat-11.0.5/logs/catalina*
  parsers:
  - multiline:
      type: pattern
      pattern: '^\d'
      negate: true
      match: after
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat-elk2-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat-elk2"
setup.template.pattern: "test-modules-tomcat-elk2*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

filebeat多实例

复制代码

1.启动实例1
filebeat -e -c /etc/filebeat/config/01-log-to-console.yaml  --path.data /tmp/xixi


2.启动实例2 
filebeat -e -c /etc/filebeat/config/02-log-to-es.yaml  --path.data /tmp/haha

# 通过filebeat采集 /var/log/syslog  /var/log/auth.log
root@elk2:~# cat /etc/filebeat/config/05-log-to-syslog.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /var/log/syslog

# 定义数据到终端
output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # 自定义索引名称
  index: "test_syslog-%{+yyyy.MM.dd}"

# 禁用索引生命周期管理(index lifecycle management，ILM)
# 如果启用了此配置，则忽略自定义索引的所有信息
setup.ilm.enabled: false
# 如果索引模板存在，是否覆盖，默认值为false，如果明确需要，则可以将其设置为ture。
# 但是官方建议将其设置为false，原因是每次写数据时，都会建立tcp链接，消耗资源。
setup.template.overwrite: true
# 定义索引模板(就是创建索引的规则)的名称
setup.template.name: "test_syslog"
# 定义索引模板的匹配模式，表示当前索引模板针对哪些索引生效。
setup.template.pattern: "test_syslog-*"
# # 定义索引模板的规则信息
setup.template.settings:
  # 分片数
  index.number_of_shards: 5
  # 每个分片有多少个副本
  index.number_of_replicas: 0
root@elk2:~# cat /etc/filebeat/config/06-log-to-auth.yaml
# 定义数据从哪里来
filebeat.inputs:
  # 指定数据源的类型是log，表示从文件读取数据
- type: log
  # 指定文件的路径
  paths:
    - /var/log/auth.log

# 定义数据到终端
output.elasticsearch:  
  hosts:
    - 192.168.121.21:9200
    - 192.168.121.22:9200
    - 192.168.121.23:9200
  # 自定义索引名称
  index: "test_auth-%{+yyyy.MM.dd}"

# 禁用索引生命周期管理(index lifecycle management，ILM)
# 如果启用了此配置，则忽略自定义索引的所有信息
setup.ilm.enabled: false
# 如果索引模板存在，是否覆盖，默认值为false，如果明确需要，则可以将其设置为ture。
# 但是官方建议将其设置为false，原因是每次写数据时，都会建立tcp链接，消耗资源。
setup.template.overwrite: true
# 定义索引模板(就是创建索引的规则)的名称
setup.template.name: "test_auth"
# 定义索引模板的匹配模式，表示当前索引模板针对哪些索引生效。
setup.template.pattern: "test_auth-*"
# # 定义索引模板的规则信息
setup.template.settings:
  # 分片数
  index.number_of_shards: 5
  # 每个分片有多少个副本
  index.number_of_replicas: 0


# 通过多实例的方式启动filebeat
root@elk2:~# filebeat -e -c /etc/filebeat/config/05-log-to-syslog.yaml --path.data /tmp/xixi
root@elk2:~# filebeat -e -c /etc/filebeat/config/06-log-to-syslog.yaml --path.data /tmp/haha

EFK分析web集群

部署web集群

复制代码

1.部署tomcat服务器
# 192.168.121.92 192.168.121.93部署tomcat
参照filebeat收集tomcat章节的部署tomcat方式


2.部署nginx
# 192.168.121.91部署nginx
[root@elk1 ~]# apt install -y nginx
[root@elk1 ~]# vim /etc/nginx/nginx.conf
...
upstream es-web{
    server 192.168.121.92:8080;
    server 192.168.121.93:8080;
}
server {
    server_name es.web.com;
    location / {
        proxy_pass http://es-web;
    }
}
...
[root@elk1 ~]# nginx -t
[root@elk1 ~]# systemctl restart nginx
# 访问测试
[root@elk1 ~]# curl es.web.com

采集web集群日志

复制代码

# 91加载nginx模块
# 92 93 加载tomcat模块
[root@elk1 ~]# filebeat modules enable nginx
Enabled nginx
[root@elk2 ~]# filebeat modules enable tomcat
Enabled tomcat
[root@elk3 ~]# filebeat modules enable tomcat
Enabled tomcat


1.配置nginx模块功能
[root@elk1 ~]# cat /etc/filebeat/modules.d/nginx.yml
# Module: nginx
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-module-nginx.html

- module: nginx
  # Access logs
  access:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: /var/log/nginx/access.log

  # Error logs
  error:
    enabled: false

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

  # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
  ingress_controller:
    enabled: false

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:
2.配置tomcat模块功能
[root@elk2 ~]# cat /etc/filebeat/modules.d/tomcat.yml
# Module: tomcat
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.17/filebeat-module-tomcat.html

- module: tomcat
  log:
    enabled: true

    # Set which input to use between udp (default), tcp or file.
    var.input: file
    # var.syslog_host: localhost
    # var.syslog_port: 9501

    # Set paths for the log files when file input is used.
    var.paths:
      - /usr/local/apache-tomcat-11.0.5/logs/*.json

    # Toggle output of non-ECS fields (default true).
    # var.rsa_fields: true

    # Set custom timezone offset.
    # "local" (default) for system timezone.
    # "+02:00" for GMT+02:00
    # var.tz_offset: local
3.配置91filebeat配置文件
[root@elk1 ~]# cat /etc/filebeat/config/01-es-web-nginx.yaml
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: true
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: es-web-nginx-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "es-web-nginx"
setup.template.pattern: "es-web-nginx-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
4.92配置监控tomcat配置文件
[root@elk2 ~]# cat /etc/filebeat/config/01-es-web-tomcat.yaml 
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
processors:
  - decode_json_fields:
      fields: ["event.original"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
  - drop_fields:
      when:
        equals:
          status: "404"
      fields: ["event.module"]
      ignore_missing: false
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat91-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat91"
setup.template.pattern: "test-modules-tomcat91-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
5.93配置监控tomcat配置文件
[root@elk3 ~]# cat /etc/filebeat/config/02-es-web-tomcat.yaml 
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
processors:
  - decode_json_fields:
      fields: ["event.original"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
  - drop_fields:
      when:
        equals:
          status: "404"
      fields: ["event.module"]
      ignore_missing: false
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat93-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat93"
setup.template.pattern: "test-modules-tomcat93-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

# 依次启动filebeat

更改字段类型

复制代码

我们想统计带宽大小，但是此时是无法统计的



这是一个字符类型的值



# 通过processors修改字段类型
# 官网配置
# 支持的类型
# The supported types include: integer, long, float, double, string, boolean, and ip.

processors:
  - convert:
      fields:
        - {from: "src_ip", to: "source.ip", type: "ip"}
        - {from: "src_port", to: "source.port", type: "integer"}
      ignore_missing: true
      fail_on_error: false

# 配置filebeat配置文件
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
processors:
  - decode_json_fields:
      fields: ["event.original"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: false
      add_error_key: true
  - convert:
      fields:
        - {from: "SendBytes", type: "long"}
  - drop_fields:
      when:
        equals:
          status: "404"
      fields: ["event.module"]
      ignore_missing: false
output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  index: test-modules-tomcat91-%{+yyyy.MM.dd}
setup.ilm.enabled: false
setup.template.name: "test-modules-tomcat91"
setup.template.pattern: "test-modules-tomcat91-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

Ansible部署EFK集群

复制代码

[root@ansible efk]# cat set_es.sh
#!/bin/bash
ansible-playbook 01-install-elaticsearch.yaml
ansible-playbook 02-install-kibana.yaml
ansible-playbook 03-install-filebeat.yaml
ansible-playbook 04-set-web.yaml
ansible-playbook 05-config-filebeat.yaml

[root@ansible efk]# bash set_es.sh

[root@ansible efk]# cat 01-install-elaticsearch.yaml 
---
- name: Install es cluster
  hosts: all
  tasks:
    - name: get es deb package
      get_url:
        url: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb
        dest: /root/
    - name: Install es
      shell: 
        cmd: dpkg -i /root/elasticsearch-7.17.28-amd64.deb | cat
    - name: Configer es
      copy:
        src: conf/elasticsearch.yml
        dest: /etc/elasticsearch/elasticsearch.yml
    - name: start es
      systemd:
        name: elasticsearch
        state: started
        enabled: yes


[root@ansible efk]# cat 02-install-kibana.yaml 
---
- name: Install kibana
  hosts: elk1
  tasks:
    - name: Get kibana deb package
      get_url:
        url: https://artifacts.elastic.co/downloads/kibana/kibana-7.17.28-amd64.deb
        dest: /root
    - name: Install kibana
      shell:
        cmd: dpkg -i kibana-7.17.28-amd64.deb | cat
    - name: Config kibana
      copy:
        src: conf/kibana.yml
        dest: /etc/kibana/kibana.yml
    - name: Start kibana
      systemd: 
        name: kibana
        state: started
        enabled: yes
[root@ansible efk]# cat 03-install-filebeat.yaml 
---
- name: Install filebeat
  hosts: elk
  tasks:
    - name: Get filebeat code
      get_url:
        url: https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.28-amd64.deb
        dest: /root
    - name: Install filebeat
      shell:
        cmd: dpkg -i filebeat-7.17.28-amd64.deb | cat
    - name: Configer filebeat
      file:
        path: /etc/filebeat/config
        state: directory

[root@ansible efk]# cat 04-set-web.yaml 
---
- name: Set nginx
  hosts: elk1
  tasks:
    - name: Install nginx
      shell:
        cmd: apt install -y nginx | cat
    - name: config nginx
      copy:
        src: conf/nginx.conf
        dest: /etc/nginx/nginx.conf
    - name: start nginx
      systemd:
        name: nginx
        state: started
        enabled: yes
    - name: Configure hosts
      copy:
        content: 192.168.121.91 es.web.com
        dest: /etc/hosts
- name: Set tomcat
  hosts: elk2,elk3
  tasks:
    - name: Get tomcat code
      get_url:
        url: https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.5/bin/apache-tomcat-11.0.5.tar.gz
        dest: /root/
    - name: unarchive tomcat code
      unarchive:
        src: /root/apache-tomcat-11.0.5.tar.gz
        dest: /usr/local
        remote_src: yes
    - name: Configure jdk PATH
      copy:
        src: conf/tomcat.sh
        dest: /etc/profile.d/
    - name: reload profile 
      shell:
        cmd: source /etc/profile.d/tomcat.sh | cat
    - name: Configure tomcat
      copy:
        src: conf/server.xml
        dest: /usr/local/apache-tomcat-11.0.5/conf/server.xml
    - name: start tomcat
      shell:
        cmd: catalina.sh  start |cat
[root@ansible efk]# cat 05-config-filebeat.yaml 
---
- name: configure filebeat
  hosts: elk1
  tasks:
    - name: enable nginx modules
      shell:
        cmd: filebeat modules enable nginx | cat
    - name: configure nginx modules
      copy:
        src: conf/nginx.yml
        dest: /etc/filebeat/modules.d/nginx.yml
    - name: configure filebeat 
      copy:
        src: conf/01-es-cluster-nginx.yml
        dest: /etc/filebeat/config/01-es-cluster-nginx.yml

- name: configure filebeat
  hosts: elk2 elk3
  tasks:
    - name: enable tomcat modules
      shell:
        cmd: filebeat modules enable nginx | cat
    - name: configure tomcat modules
      copy:
        src: conf/tomcat.yml
        dest: /etc/filebeat/modules.d/tomcat.yml
    - name: configure filebeat
      template:
        src: conf/01-es-cluster-tomcat.yml.j2
        dest: /etc/filebeat/config/01-es-cluster-tomcat.yml

logstash

安装配置logstash

复制代码

1.部署logstash
[root@elk3 ~]# wget https://artifacts.elastic.co/downloads/logstash/logstash-7.17.28-amd64.deb
[root@elk3 ~]# dpkg -i logstash-7.17.28-amd64.deb 

2.创建符号链接，将Logstash命令添加到PATH环境变量
[root@elk3 ~]# ln -svf  /usr/share/logstash/bin/logstash /usr/local/bin/
'/usr/local/bin/logstash' -> '/usr/share/logstash/bin/logstash'

3.基于命令行的方式启动实例，使用-e选项指定配置信息（不推荐）
[root@elk3 ~]# logstash -e "input { stdin { type => stdin } } output { stdout { codec => rubydebug } }"  --log.level warn
...
The stdin plugin is now waiting for input:
111111111111111111111111111
{
    "@timestamp" => 2025-03-13T06:51:32.821Z,
          "type" => "stdin",
       "message" => "111111111111111111111111111",
          "host" => "elk93",
      "@version" => "1"
}

4.基于配置文件方式启动Logstash
[root@elk3 ~]# vim /etc/logstash/conf.d/01-stdin-to-stdout.conf
[root@elk3 ~]# cat /etc/logstash/conf.d/01-stdin-to-stdout.conf
input { 
  stdin { 
    type => stdin 
  } 
} 


output { 
  stdout { 
    codec => rubydebug 
  } 
}

[root@elk3 ~]# logstash -f /etc/logstash/conf.d/01-stdin-to-stdout.conf
...
333333333333333333333333333333
{
          "type" => "stdin",
       "message" => "333333333333333333333333333333",
          "host" => "elk93",
    "@timestamp" => 2025-03-13T06:54:20.223Z,
      "@version" => "1"
}

# https://www.elastic.co/guide/en/logstash/7.17/plugins-inputs-file.html
[root@elk3 ~]# cat /etc/logstash/conf.d/02-file-to-stdout.conf 
input { 
  file { 
    path => "/tmp/student.txt"
  } 
} 


output { 
  stdout { 
    codec => rubydebug 
  } 
}
[WARN ] 2025-03-26 09:40:52.788 [[main]<file] plain - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
{
          "path" => "/tmp/student.txt",
      "@version" => "1",
    "@timestamp" => 2025-03-26T01:40:52.879Z,
       "message" => "aaaddd",
          "host" => "elk3"
}

Logstash采集文本日志策略

复制代码

Logstash采集策略和filebeat的采集策略类似
	1.以换行符为准，以行为单位进行采集
	2.也存在和filebeat类似的偏移量的概念

[root@elk3 ~]# ll /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd 
-rw-r--r-- 1 root root 53 Mar 26 09:57 /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
[root@elk3 ~]# cat /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd 
408794 0 64768 12 1742955373.9715059 /tmp/student.txt  # 12就是偏移量
[root@elk3 ~]# cat /tmp/student.txt
ABC
2025def
[root@elk3 ~]# ll -i /tmp/student.txt 
408794 -rw-r--r-- 1 root root 12 Mar 26 09:45 /tmp/student.txt

# 可以直接修改偏移量进行指定位置采集  我们修改偏移量到8  查看采集结果
{
    "@timestamp" => 2025-03-26T02:20:50.776Z,
          "host" => "elk3",
       "message" => "def",
          "path" => "/tmp/student.txt",
      "@version" => "1"
}

start_position

复制代码

在filebeat中如果我们删除了filebeat的json文件，filebeat下一次采集从头开始，对于logstash来说，并不是这样

[root@elk3 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
[root@elk3 ~]# logstash -f /etc/logstash/conf.d/02-file-to-stdout.conf
# 还是默认采集最后面的数据
[root@elk3 ~]# echo 123 >> /tmp/student.txt 
[root@elk3 ~]# cat /tmp/student.txt
ABC
2025def
123
{
      "@version" => "1",
    "@timestamp" => 2025-03-26T02:26:17.008Z,
       "message" => "123",
          "host" => "elk3",
          "path" => "/tmp/student.txt"
}

// 这时就需要star_position参数
start_position
Value can be any of: beginning, end
Default value is "end"

[root@elk3 ~]# cat /etc/logstash/conf.d/02-file-to-stdout.conf
input { 
  file { 
    path => "/tmp/student.txt"
    start_position => "beginning"
  } 
} 


output { 
  stdout { 
    codec => rubydebug 
  } 
}


[root@elk3 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_782d533684abe27068ac85b78871b9fd
[root@elk3 ~]# logstash -f /etc/logstash/conf.d/02-file-to-stdout.conf
{
      "@version" => "1",
          "host" => "elk3",
       "message" => "2025def",
          "path" => "/tmp/student.txt",
    "@timestamp" => 2025-03-26T02:31:50.020Z
}
{
      "@version" => "1",
          "host" => "elk3",
       "message" => "ABC",
          "path" => "/tmp/student.txt",
    "@timestamp" => 2025-03-26T02:31:49.813Z
}
{
      "@version" => "1",
          "host" => "elk3",
       "message" => "123",
          "path" => "/tmp/student.txt",
    "@timestamp" => 2025-03-26T02:31:50.037Z
}

filter plugins

复制代码

# logstash的输出有很多的字段，如果有一些我不想要，可以使用filter plugins进行过滤
# 移除@version字段 
[root@elk3 ~]# cat /etc/logstash/conf.d/02-file-to-stdout.conf
input { 
  file { 
    path => "/tmp/student.txt"
    start_position => "beginning"
  } 
} 
filter {
  mutate {
    remove_field => [ "@version" ]
  }
}

output { 
  stdout { 
    codec => rubydebug 
  } 
}

# -r模式启动logstash 可以实现重载
[root@elk3 ~]# logstash -r -f /etc/logstash/conf.d/02-file-to-stdout.conf
{
    "@timestamp" => 2025-03-26T03:01:02.078Z,
          "host" => "elk3",
       "message" => "111",
          "path" => "/tmp/student.txt"
}

logstash架构

logstash多实例

复制代码

启动实例1：
[root@elk93 ~]# logstash -f /etc/logstash/conf.d/01-stdin-to-stdout.conf 


启动实例2：
[root@elk93 ~]# logstash -rf /etc/logstash/conf.d/02-file-to-stdout.conf  --path.data /tmp/logstash-multiple

logstash与pipeline关系

复制代码

- 一个Logstash实例可以有多个pipeline，若没有定义pipeline id，则默认为main pipeline。
	
	
	- 每个pipeline都有三个组件组成，其中filter插件是可选组件:
		- input :
			数据从哪里来 。
			
		- filter:
			数据经过哪些插件处理，该组件是可选组件。
			
		- output: 
			数据到哪里去。

logstash采集nginx日志

复制代码

1.安装nginx
[root@elk3 ~]# apt install -y nginx


2.logstash采集nginx
[root@elk3 ~]# cat /etc/logstash/conf.d/03-nginx-grok.conf
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 
filter {
  mutate {
    remove_field => [ "@version" ]
  }
}

output { 
  stdout { 
    codec => rubydebug 
  } 
}
[root@elk3 ~]# logstash -r -f /etc/logstash/conf.d/03-nginx-grok.conf 
{
          "host" => "elk3",
       "message" => "127.0.0.1 - - [26/Mar/2025:14:43:58 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
    "@timestamp" => 2025-03-26T06:45:24.375Z,
          "path" => "/var/log/nginx/access.log"
}
{
          "host" => "elk3",
       "message" => "127.0.0.1 - - [26/Mar/2025:14:43:57 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
    "@timestamp" => 2025-03-26T06:45:24.293Z,
          "path" => "/var/log/nginx/access.log"
}
{
          "host" => "elk3",
       "message" => "127.0.0.1 - - [26/Mar/2025:14:43:58 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/7.81.0\"",
    "@timestamp" => 2025-03-26T06:45:24.373Z,
          "path" => "/var/log/nginx/access.log"
}

grok plugins

复制代码

# 基于正则提取
Logstash ships with about 120 patterns by default. You can find them here: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns.

[root@elk3 ~]# cat  /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.3.4/patterns/legacy/httpd
HTTPDUSER %{EMAILADDRESS}|%{USER}
HTTPDERROR_DATE %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}

# Log formats
HTTPD_COMMONLOG %{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes})
HTTPD_COMBINEDLOG %{HTTPD_COMMONLOG} %{QS:referrer} %{QS:agent}

# Error logs
HTTPD20_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[%{LOGLEVEL:loglevel}\] (?:\[client %{IPORHOST:clientip}\] ){0,1}%{GREEDYDATA:message}
HTTPD24_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[(?:%{WORD:module})?:%{LOGLEVEL:loglevel}\] \[pid %{POSINT:pid}(:tid %{NUMBER:tid})?\]( \(%{POSINT:proxy_errorcode}\)%{DATA:proxy_message}:)?( \[client %{IPORHOST:clientip}:%{POSINT:clientport}\])?( %{DATA:errorcode}:)? %{GREEDYDATA:message}
HTTPD_ERRORLOG %{HTTPD20_ERRORLOG}|%{HTTPD24_ERRORLOG}

# Deprecated
COMMONAPACHELOG %{HTTPD_COMMONLOG}
COMBINEDAPACHELOG %{HTTPD_COMBINEDLOG}


# 配置logstash
[root@elk3 ~]# cat /etc/logstash/conf.d/03-nginx-grok.conf
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 
filter {
  mutate {
    remove_field => [ "@version" ]
  }
  # 基于正则提取任意文本，并将其封装为一个特定的字段。使用设置好的模板
  grok {
        match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
}

output { 
  stdout { 
    codec => rubydebug 
  } 
}

[root@elk3 ~]# logstash -r -f /etc/logstash/conf.d/03-nginx-grok.conf 
{
        "message" => "192.168.121.1 - - [26/Mar/2025:14:52:06 +0800] \"GET / HTTP/1.1\" 200 396 \"-\" \"Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36\"",
           "path" => "/var/log/nginx/access.log",
        "request" => "/",
       "clientip" => "192.168.121.1",
           "host" => "elk3",
      "timestamp" => "26/Mar/2025:14:52:06 +0800",
           "auth" => "-",
           "verb" => "GET",
       "response" => "200",
          "ident" => "-",
    "httpversion" => "1.1",
     "@timestamp" => 2025-03-26T06:52:07.342Z,
          "bytes" => "396"
}

useragent plugins

复制代码

用于提取用户的设备信息

[root@elk3 ~]# cat /etc/logstash/conf.d/03-nginx-grok.conf
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 
filter {
  mutate {
    remove_field => [ "@version" ]
  }
  grok {
        match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  useragent {
  	# 指定从哪个字段解析用户设备信息
    source => 'message'
    # 将解析的结果存储到某个特定字段，若不指定，则默认放在顶级字段。
    target => "xu-ua"
  }      

}

[root@elk3 ~]# logstash -r -f /etc/logstash/conf.d/03-nginx-grok.conf 
output { 
  stdout { 
    codec => rubydebug 
  } 
}

{
        "message" => "192.168.121.1 - - [26/Mar/2025:16:45:10 +0800] \"GET / HTTP/1.1\" 200 396 \"-\" \"Mozilla/5.0 (Linux; Android 13; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Mobile Safari/537.36\"",
       "clientip" => "192.168.121.1",
      "timestamp" => "26/Mar/2025:16:45:10 +0800",
        "request" => "/",
          "bytes" => "396",
           "verb" => "GET",
    "httpversion" => "1.1",
     "@timestamp" => 2025-03-26T08:45:11.587Z,
           "host" => "elk3",
           "auth" => "-",
          "xu-ua" => {
              "name" => "Chrome Mobile",
           "version" => "134.0.0.0",
                "os" => "Android",
           "os_name" => "Android",
        "os_version" => "13",
            "device" => "Samsung SM-G981B",
           "os_full" => "Android 13",
             "minor" => "0",
          "os_major" => "13",
             "patch" => "0",
             "major" => "134"
    },
          "ident" => "-",
           "path" => "/var/log/nginx/access.log",
       "response" => "200"
}

geoip plugins

复制代码

基于公网IP地址分析你的经纬度坐标点

[root@elk3 ~]# cat /etc/logstash/conf.d/03-nginx-grok.conf
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 
filter {
  mutate {
    remove_field => [ "@version" ]
  }
  grok {
        match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  useragent {
    source => 'message'
    target => "xu-ua"
  }      
  geoip {
	source => "clientip"
  }

}
[root@elk3 ~]# logstash -r -f /etc/logstash/conf.d/03-nginx-grok.conf 
output { 
  stdout { 
    codec => rubydebug 
  } 
}

"geoip" => {
             "longitude" => -119.705,
         "country_code2" => "US",
           "region_name" => "Oregon",
              "timezone" => "America/Los_Angeles",
                    "ip" => "52.222.36.125",
        "continent_code" => "NA",
         "country_code3" => "US",
              "latitude" => 45.8401,
          "country_name" => "United States",
              "dma_code" => 810,
           "postal_code" => "97818",
           "region_code" => "OR",
              "location" => {
            "lat" => 45.8401,
            "lon" => -119.705
        },
             "city_name" => "Boardman"
    }

date plugins

复制代码

[root@elk3 ~]# cat /etc/logstash/conf.d/03-nginx-grok.conf
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 
filter {
  mutate {
    remove_field => [ "@version" ]
  }
  grok {
        match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  useragent {
    source => 'message'
    target => "xu-ua"
  }      
  geoip {
	source => "clientip"
  }
  date {
 	# 匹配日期字段，将其转换为日期格式，将来存储到ES，基于官方的示例对号入座对应的格式即可。
    # https://www.elastic.co/guide/en/logstash/7.17/plugins-filters-date.html#plugins-filters-date-match
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    # 将match匹配的日期修改后的值直接覆盖到指定字段，若不定义，则默认覆盖"@timestamp"。
    target => "xu-timestamp"
  }

}

output { 
  stdout { 
    codec => rubydebug 
  } 
}

[root@elk3 ~]# logstash -r -f /etc/logstash/conf.d/03-nginx-grok.conf
"xu-timestamp" => 2025-03-26T09:17:18.000Z,

mutate plugins

复制代码

如果我们想统计带宽 我们会发现 "bytes" => "396" 
是字符串类型，不能累加，所以要使用mutate plugins转换类型

[root@elk3 ~]# cat /etc/logstash/conf.d/03-nginx-grok.conf
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 
filter {
  mutate {
    convert => {
      "bytes" => "integer"
    }
    remove_field => [ "@version" ]
  }
  grok {
        match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  useragent {
    source => 'message'
    target => "xu-ua"
  }      
  geoip {
	source => "clientip"
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "xu-timestamp"
  }

}

output { 
  stdout { 
    codec => rubydebug 
  } 
}

logstash 采集日志输出到es

复制代码

[root@elk3 ~]# cat /etc/logstash/conf.d/08-nginx-to-es.conf
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 


filter {
  grok {
    match => { "message" => "%{HTTPD_COMMONLOG}" }
  }

  useragent {
    source => "message"
    target => "xu_user_agent"
  }

  geoip {
    source => "clientip"
  }

  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "xu-timestamp"
  }
 
  # 对指定字段进行转换处理
  mutate {
    # 将指定字段转换成我们需要转换的类型
    convert => {
      "bytes" => "integer"
    }
    
    remove_field => [ "@version","host","message" ]
  }
}

output { 
  stdout { 
    codec => rubydebug 
  }

  elasticsearch {
      # 对应的ES集群主机列表
      hosts => ["10.0.0.91:9200","10.0.0.92:9200","10.0.0.93:9200"]
      # 对应的ES集群的索引名称
      index => "xu-elk-nginx"
  }
}

存在的问题:
	Failed (timed out waiting for connection to open). Sleeping for 0.02

问题描述:
	此问题在 ElasticStack 7.17.28版本中，可能会出现Logstash无法写入ES的情况。
	
TODO:
	需要调研官方是否做了改动，导致无法写入成功，需要额外的参数配置。

临时解决方案:
	- 回退版本到7.17.23版本。
	- 注释掉geoip的配置

解决在写入es时geoip plugins识别时间过长问题

复制代码

通过查看官网，我们可以看到geoip模块可以指定数据库， 我们通过指定数据库的方式来解决这个问题

1.查看Logstash本地默认的geoip插件
[root@elk3 ~]# tree /usr/share/logstash/data/plugins/filters/geoip/1742980310/
/usr/share/logstash/data/plugins/filters/geoip/1742980310/
├── COPYRIGHT.txt
├── elastic-geoip-database-service-agreement-LICENSE.txt
├── GeoLite2-ASN.mmdb
├── GeoLite2-ASN.tgz
├── GeoLite2-City.mmdb
├── GeoLite2-City.tgz
├── LICENSE.txt
└── README.txt

0 directories, 8 files

2.配置logstash
[root@elk3 ~]# cat /etc/logstash/conf.d/04-nginx-to-es.conf 
input { 
  file { 
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  } 
} 

filter {
  mutate {
    convert => {
      "bytes" => "integer"
    }
    remove_field => [ "@version" ]
  }
  grok {
        match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  useragent {
    source => 'message'
    target => "xu-ua"
  }      
  geoip {
	source => "clientip"
	database => "/usr/share/logstash/data/plugins/filters/geoip/1742980310/GeoLite2-City.mmdb"
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "xu-timestamp"
  }

}

output { 
  stdout { 
    codec => rubydebug 
  } 
  elasticsearch {
	index => "xu-logstash"
	hosts => ["http://192.168.121.91:9200","http://192.168.121.92:9200","http://192.168.121.93:9200"]
  }
}
[root@elk3 ~]#

解决geoip.location数据类型不正确问题

这时的经纬度是float类型，是不能出图的

在kibana中创建索引模板

ELFK架构

json插件案例

复制代码

graph LR
filebeat--->|发送|logstash

让logstash接收filebeat收集到的数据，logstash优先于filebeat启动
也就是logstash中的input plugins type 是beats

# 配置logstash
[root@elk3 ~]# grep -v "^#" /etc/logstash/conf.d/05-beat-es.conf
input { 
  beats {
      port => 5044
  }
} 

filter {
  mutate {
    remove_field => [ "@version","host","agent","ecs","tags","input","log" ]
  }
  json {
     source => "message"
   }
}

output { 
  stdout { 
    codec => rubydebug 
  } 
}


# 配置filebeat
[root@elk1 ~]# cat /etc/filebeat/config/05-json.yaml 
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/student.json
output.logstash:
  hosts: ["192.168.121.93:5044"]

# 先启动logstash，在启动filebeat
[root@elk3 conf.d]# logstash -rf 05-beat-es.conf 
[root@elk3 ~]# netstat -tunlp | grep 5044
tcp6       0      0 :::5044                 :::*                    LISTEN      120181/java    

# 在启动filebeat
[root@elk1 ~]# filebeat  -e -c /etc/filebeat/config/05-json.yaml 


// 准备测试数据
{
  "name":"aaa",
  "hobby":["写小说","唱歌"]
}
{
  "name":"bbb",
  "hobby":["健身","台球","打豆豆"]
}
{
  "name":"ccc",
  "hobby":["乒乓球","游泳","游戏"]
}
{
   "name": "ddd",
   "hobby": ["打游戏","打篮球"]
}

# 查看采集结果，由于filebeat的采集规则是按行采集，就导致我们准备的一条数据它采集出来了多条
"message" => "   \"name\": \"ddd\",",
"message" => "   \"hobby\": [\"打游戏\",\"打篮球\"]",
...

# 需要使用filebeat的多行合并进行处理
[root@elk1 ~]# cat /etc/filebeat/config/05-json.yaml 
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/student.json
  parsers:
    - multiline:
        type: count
        count_lines: 4
output.logstash:
  hosts: ["192.168.121.93:5044"]

# 查看数据采集情况
{
       "message" => "{\n  \"name\":\"aaa\",\n  \"hobby\":[\"写小说\",\"唱歌\"]\n}",
          "name" => "aaa",
         "hobby" => [
        [0] "写小说",
        [1] "唱歌"
    ],
    "@timestamp" => 2025-03-27T07:46:14.390Z
}

写入es

复制代码

[root@elk3 ~]# grep -v "^#" /etc/logstash/conf.d/05-beat-es.conf
input { 
  beats {
      port => 5044
  }
} 

filter {
  mutate {
    remove_field => [ "@version","host","agent","ecs","tags","input","log" ]
  }
  json {
     source => "message"
   }
}

output { 
  stdout { 
    codec => rubydebug 
  } 
  elasticsearch {
	hosts => ["http://192.168.121.91:9200"]
  }
}

ELFK架构梳理之电商指标分享项目案例

复制代码

1.生成测试数据
[root@elk1 ~]# cat gen-log.py
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# @author : Jason Yin

import datetime
import random
import logging
import time
import sys

LOG_FORMAT = "%(levelname)s %(asctime)s [com.oldboyedu.%(module)s] - %(message)s "
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

# 配置root的logging.Logger实例的基本配置
logging.basicConfig(level=logging.INFO, format=LOG_FORMAT, datefmt=DATE_FORMAT, filename=sys.argv[1]
, filemode='a',)
actions = ["浏览页面", "评论商品", "加入收藏", "加入购物车", "提交订单", "使用优惠券", "领取优惠券",
 "搜索", "查看订单", "付款", "清空购物车"]

while True:
    time.sleep(random.randint(1, 5))
    user_id = random.randint(1, 10000)
    # 对生成的浮点数保留2位有效数字.
    price = round(random.uniform(15000, 30000),2)
    action = random.choice(actions)
    svip = random.choice([0,1,2])
    logging.info("DAU|{0}|{1}|{2}|{3}".format(user_id, action,svip,price))
[root@elk1 ~]#  python3 gen-log.py /tmp/apps.log

2.查看数据内容
[root@elk1 ~]# tail -f /tmp/apps.log
...
INFO 2025-03-27 17:03:10 [com.oldboyedu.gen-log] - DAU|7973|加入购物车|0|19300.65 
INFO 2025-03-27 17:03:13 [com.oldboyedu.gen-log] - DAU|8617|加入购物车|2|19720.57 
INFO 2025-03-27 17:03:14 [com.oldboyedu.gen-log] - DAU|6879|搜索|2|24774.85 
INFO 2025-03-27 17:03:19 [com.oldboyedu.gen-log] - DAU|804|付款|2|21352.22 
INFO 2025-03-27 17:03:22 [com.oldboyedu.gen-log] - DAU|3014|清空购物车|0|19908.62 
...

# 启动logstash实例
[root@elk3 conf.d]# cat 06-beats_apps-to-es.conf
input { 
  beats {
      port => 9999
  }
} 
filter {
mutate {
    split => { "message" => "|" }

    add_field => { 
      "other" => "%{[message][0]}" 
      "userId" => "%{[message][1]}" 
      "action" => "%{[message][2]}" 
      "svip" => "%{[message][3]}" 
      "price" => "%{[message][4]}" 
    }
}
mutate{
	split => { "other" => " " }

    add_field => {
       datetime => "%{[other][1]} %{[other][2]}"
    }
    
    convert => {
       "price" => "float"
     }
	remove_field => [ "@version","host","agent","ecs","tags","input","log","message","other"]
  }
}
output { 
#  stdout { 
#    codec => rubydebug 
#  }

  elasticsearch {
     index => "linux96-logstash-elfk-apps"
     hosts => ["http://192.168.121.91:9200","http://192.168.121.92:9200","http://192.168.121.93:9200"]
  }
}
# 启动filebeat实例
[root@elk1 ~]# cat /etc/filebeat/config/06-filestream-to-logstash.yml 
filebeat.inputs:
- type: filestream
  paths:
    - /tmp/apps.log

output.logstash:
  hosts: ["192.168.121.93:9999"]

ELK架构

logstash if语句

复制代码

在logstash中支持if语句，假如有多个input，可以通过if来对不同的input做不同的过滤，以及不同的输出

# 配置logstash if
[root@elk3 ~]# cat /etc/logstash/conf.d/08-logstash-if.conf 
input {
	beats {
		port => 9999
		type => "xu-filebeat"
	}
	file {
		path => "/var/log/nginx/access.log"
		start_position => "beginning"
    	type => "xu-file"
	}
	tcp {
		port => 8888
    	type => "xu-tcp"
	}
}

fileter {
	if [type] == "xu-tcp" {
	   mutate {
       add_field => {
          school => "school1"
          class => "one"
       } 
       remove_field => [ "@version","port"]
     }
	} else if [type] == "xu-filebeat" {
		mutate {
        split => { "message" => "|" }

        add_field => { 
          "other" => "%{[message][0]}" 
          "userId" => "%{[message][1]}" 
          "action" => "%{[message][2]}" 
          "svip" => "%{[message][3]}" 
          "price" => "%{[message][4]}" 
          "address" => "1.1.1.1"
        }

      }

      mutate {

        split => { "other" => " " }

        add_field => {
           datetime => "%{[other][1]} %{[other][2]}"
        }
        
        convert => {
           "price" => "float"
         }

        remove_field => [ "@version","host","agent","ecs","tags","input","log","message","other"]
      }

      date {
        match => [ "datetime", "yyyy-MM-dd HH:mm:ss" ]
      }
	} else {
		grok {
        match => { "message" => "%{HTTPD_COMMONLOG}" }
      }
    
      useragent {
        source => "message"
        target => "xu_user_agent"
      }
    
      geoip {
        source => "clientip"
        database => "/usr/share/logstash/data/plugins/filters/geoip/CC/GeoLite2-City.mmdb"
      }
    
      date {
        match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
        target => "xu-timestamp"
      }
     
      mutate {
        convert => {
          "bytes" => "integer"
        }
        
        add_field => {
           office => "https://studylinux.cn"
        }

        remove_field => [ "@version","host","message" ]
      }
	}
}

output {
	if [type] == "xu-filebeat" {
      elasticsearch {
         index => "xu-logstash-if-filebeat"
         hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
      }
  } else if [type] == "xu-tcp" {
      elasticsearch {
         index => "xu-logstash-if-tcp"
         hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
      }

  }else {
      elasticsearch {
         index => "xu-logstash-if-file"
         hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"]
      }
  }
}
[root@elk3 ~]#

pipeline

复制代码

# pipline配置文件位置
[root@elk3 ~]# ll /etc/logstash/pipelines.yml
-rw-r--r-- 1 root root 285 Feb 18 18:52 /etc/logstash/pipelines.yml

# 修改pipline配置文件
[root@elk3 ~]# tail -4 /etc/logstash/pipelines.yml
- pipeline.id: xixi
  path.config: "/etc/logstash/conf.d/01-file-to-stdout.conf"
- pipeline.id: haha
  path.config: "/etc/logstash/conf.d/03-nginx-grok.conf"

# 启动logstash，可以通过logstash -r直接启动不需要再指定配置文件了
[root@elk3 ~]# logstash  -r 
	# 直接会有一个报错 ERROR: Failed to read pipelines yaml file. Location: /usr/share/logstash/config/pipelines.yml
	# logstash默认会去 /usr/share/logstash/config/pipelines.yml找这个文件，此时我们做一个软链接即可

# 配置软链接
[root@elk3 ~]# mkdir /usr/share/logstash/config/
[root@elk3 ~]# ln -svf /etc/logstash/pipelines.yml  /usr/share/logstash/config/
'/usr/share/logstash/config/pipelines.yml' -> '/etc/logstash/pipelines.yml'
[root@elk3 ~]# logstash  -r 
...
[INFO ] 2025-03-29 10:16:50.372 [[xixi]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"xixi"}
[INFO ] 2025-03-29 10:16:54.380 [[haha]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"haha"}
...

ES集群安全

基于base_auth加密

es集群加密

复制代码

# 在配置加密之前是可以正常访问的
[root@elk1 ~]# curl 127.1:9200/_cat/nodes
192.168.121.92  6 94 0 0.05 0.03 0.00 cdfhilmrstw - elk2
192.168.121.91 22 95 4 0.25 0.27 0.25 cdfhilmrstw * elk1
192.168.121.93 29 94 6 0.02 0.25 0.48 cdfhilmrstw - elk3


1 生成证书文件
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert -out /etc/elasticsearch/elastic-certificates.p12 -pass ""
...
Certificates written to /etc/elasticsearch/elastic-certificates.p12

This file should be properly secured as it contains the private key for 
your instance.

This file is a self contained file and can be copied and used 'as is'
For each Elastic product that you wish to configure, you should copy
this '.p12' file to the relevant configuration directory
and then follow the SSL configuration instructions in the product guide.


2.把证书文件拷贝到其他节点
[root@elk1 ~]# chmod 640 /etc/elasticsearch/elastic-certificates.p12
[root@elk1 ~]# scp -r /etc/elasticsearch/elastic-certificates.p12  192.168.121.92:/etc/elasticsearch/elastic-certificates.p12
[root@elk1 ~]# scp - /etc/elasticsearch/elastic-certificates.p12  192.168.121.93:/etc/elasticsearch/elastic-certificates.p12

3.修改ES集群的配置文件，并同步到所有节点
[root@elk1 ~]# tail -5 /etc/elasticsearch/elasticsearch.yml
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
[root@elk1 ~]# scp /etc/elasticsearch/elasticsearch.yml  192.168.121.92:/etc/elasticsearch/elasticsearch.yml
[root@elk1 ~]# scp /etc/elasticsearch/elasticsearch.yml  192.168.121.93:/etc/elasticsearch/elasticsearch.yml

4.重启es
[root@elk1 ~]# systemctl restart elasticsearch.service 

# 此时就不能直接访问了
[root@elk1 ~]# curl 127.1:9200
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}


5.生成随机密码
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-setup-passwords  auto

Changed password for user apm_system
PASSWORD apm_system = aBsQ3WI9ydUVTx2hk2JT

Changed password for user kibana_system
PASSWORD kibana_system = xoMBWbFyYmadDyrYcwyI

Changed password for user kibana
PASSWORD kibana = xoMBWbFyYmadDyrYcwyI

Changed password for user logstash_system
PASSWORD logstash_system = fWx19jXFHinpcraglh8E

Changed password for user beats_system
PASSWORD beats_system = NgKipgH0LfnFGFAazun6

Changed password for user remote_monitoring_user
PASSWORD remote_monitoring_user = Af4hu6PrhPYvn2S5zcEj

Changed password for user elastic
PASSWORD elastic = 0Nj2dpMTSNYurPqQHInA


[root@elk1 ~]# curl -u elastic:MSfRhWKA3lRhufYpxF9u 127.1:9200/_cat/nodes
192.168.121.91 40 96 22 0.62 0.74 0.53 cdfhilmrstw - elk1
192.168.121.92 17 96 20 0.44 0.67 0.36 cdfhilmrstw * elk2
192.168.121.93 23 96 32 0.54 1.00 0.73 cdfhilmrstw - elk3


6.kibana连接es
	6.1修改kibana配置文件
[root@elk1 ~]# tail -2 /etc/kibana/kibana.yml
elasticsearch.username: "kibana_system"
elasticsearch.password: "47UD4ZOypuWO100QciH4"
	6.2重启kibana
[root@elk1 ~]# systemctl restart kibana.service 
	6.3web访问kibana

重置es密码

复制代码

在es集群中有类似root用户的superuser的角色，我们可以创建一个用户属于superuser的角色，通过这个用户去修改elastic的密码

1.创建一个超级管理员角色
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-users useradd xu -p 123456 -r superuser

2.基于管理员修改密码
[root@elk1 ~]# curl -s --user xu:123456 -XPUT "http://localhost:9200/_xpack/security/user/elastic/_password?pretty" -H 'Content-Type: application/json' -d'
     {
       "password" : "654321"
     }'
     
[root@elk1 ~]# curl -uelastic:654321 127.1:9200/_cat/nodes
192.168.121.91 35 96 7 0.38 0.43 0.52 cdfhilmrstw - elk1
192.168.121.92 20 96 2 0.20 0.20 0.25 cdfhilmrstw * elk2
192.168.121.93 27 97 5 0.10 0.18 0.38 cdfhilmrstw - elk3

filebeat对接es加密

复制代码

[root@elk1 ~]# cat /etc/filebeat/config/07-tcp-to-es_tls.yaml
filebeat.inputs:
- type: tcp
  host: "0.0.0.0:9000"


output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  # 指定连接ES集群的用户名
  username: "elastic"
  # 指定连接ES集群的密码
  password: "654321"
  index: xu-es-tls-filebeat

setup.ilm.enabled: false
setup.template.name: "xu-es-tls-filebeat"
setup.template.pattern: "xu-es-tls-filebeat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0

logstash对接es加密

复制代码

[root@elk3 ~]# cat /etc/logstash/conf.d/09-tcp-to-es_tls.conf
input { 
  tcp {
    port => 8888
  }
}  

output { 

  elasticsearch {
    hosts => ["192.168.121.91:9200","192.168.121.92:9200","192.168.121.93:9200"]
    index => "oldboyedu-logstash-tls-es"
    user => elastic
    password => "654321"
  }
}

api-key

复制代码

为什么要启用api-key
	为了安全性，使用用户名和密码的方式进行认证会暴露用户信息。
	ElasticSearch也支持api-key的方式进行认证。这样就可以保证安全性。api-key是不能用于登录kibana，安全性得到保障。
	而且可以基于api-key实现权限控制。

elasticsearch默认是没有启动api的，需要通过配置文件设置并启动api功能

启动es api功能

复制代码

[root@elk1 ~]# tail /etc/elasticsearch/elasticsearch.yml
# 启用api_key功能
xpack.security.authc.api_key.enabled: true
# 指定API密钥加密算法
xpack.security.authc.api_key.hashing.algorithm: pbkdf2
# 缓存的API密钥时间
xpack.security.authc.api_key.cache.ttl: 1d
# API密钥保存数量的上限
xpack.security.authc.api_key.cache.max_keys: 10000
# 用于内存中缓存的API密钥凭据的哈希算法
xpack.security.authc.api_key.cache.hash_algo: ssha256

[root@elk1 ~]# !scp
scp /etc/elasticsearch/elasticsearch.yml  192.168.121.93:/etc/elasticsearch/elasticsearch.yml
root@192.168.121.93's password: 
elasticsearch.yml                                                                                                                                          100% 4270   949.6KB/s   00:00    
[root@elk1 ~]# scp /etc/elasticsearch/elasticsearch.yml  192.168.121.92:/etc/elasticsearch/elasticsearch.yml
root@192.168.121.92's password: 
elasticsearch.yml

[root@elk1 ~]# systemctl restart elasticsearch.service

创建api

复制代码

# 解析api
[root@elk1 ~]# echo "TzBCTzY1VUJiWUdnVHlBNjZRTXc6eE9JWW9wT3dTT09Sam1UNE5RYnRjUQ==" | base64 -d ;echo
O0BO65UBbYGgTyA66QMw:xOIYopOwSOORjmT4NQbtcQ


# 配置filebeat
[root@elk1 ~]# cat /etc/filebeat/config/07-tcp-to-es_tls.yaml
filebeat.inputs:
- type: tcp
  host: "0.0.0.0:9000"


output.elasticsearch:
  hosts:
  - 192.168.121.91:9200
  - 192.168.121.92:9200
  - 192.168.121.93:9200
  #username: "elastic"
  #password: "654321"
  api_key: zvWA4JUBqFmHNaf3P8bM:d-goeFONRPelMuRxSr2Bxg
  index: xu-es-tls-filebeat

setup.ilm.enabled: false
setup.template.name: "xu-es-tls-filebeat"
setup.template.pattern: "xu-es-tls-filebeat-*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0
[root@elk1 ~]# filebeat -e -c /etc/filebeat/config/07-tcp-to-es_tls.yaml

基于ES的api创建api-key并实现权限管理

参考链接:
https://www.elastic.co/guide/en/beats/filebeat/7.17/beats-api-keys.html
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/security-privileges.html#privileges-list-cluster
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/security-privileges.html#privileges-list-indices

复制代码

1.创建api-key
# 发送请求
POST /_security/api_key
{
  "name": "jasonyin2020", 
  "role_descriptors": {
    "filebeat_monitoring": { 
      "cluster": ["all"],
      "index": [
        {
          "names": ["xu-es-apikey*"],
          "privileges": ["create_index", "create"]
        }
      ]
    }
  }
}
# 返回数据
{
  "id" : "0vXs4ZUBqFmHNaf3s8Zn",
  "name" : "jasonyin2020",
  "api_key" : "y1Vi5fL6RfGy_B47YWBXcw",
  "encoded" : "MHZYczRaVUJxRm1ITmFmM3M4Wm46eTFWaTVmTDZSZkd5X0I0N1lXQlhjdw=="
}


# 解析
[root@elk1 ~]# echo MHZYczRaVUJxRm1ITmFmM3M4Wm46eTFWaTVmTDZSZkd5X0I0N1lXQlhjdw== | base64 -d  ;echo
0vXs4ZUBqFmHNaf3s8Zn:y1Vi5fL6RfGy_B47YWBXcw

https

es集群配置https

复制代码

1.自建CA证书
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil ca --out /etc/elasticsearch/elastic-stack-ca.p12 --pass ""
[root@elk1 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12 
-rw------- 1 root elasticsearch 2672 Mar 29 20:44 /etc/elasticsearch/elastic-stack-ca.p12


2.基于CA证书创建ES证书
[root@elk1 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca /etc/elasticsearch/elastic-stack-ca.p12 --out /etc/elasticsearch/elastic-certificates-https.p12 --pass "" --days 3650 --ca-pass ""
[root@elk1 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12 
-rw------- 1 root elasticsearch 2672 Mar 29 20:44 /etc/elasticsearch/elastic-stack-ca.p12
[root@elk1 ~]# ll /etc/elasticsearch/elastic-certificates-https.p12 
-rw------- 1 root elasticsearch 3596 Mar 29 20:48 /etc/elasticsearch/elastic-certificates-https.p12


3.修改配置文件
[root@elk1 ~]# tail -2 /etc/elasticsearch/elasticsearch.yml 
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: elastic-certificates-https.p12

[root@elk1 ~]# chmod 640 /etc/elasticsearch/elastic-certificates-https.p12

[root@elk1 ~]# scp  -rp /etc/elasticsearch/elastic{-certificates-https.p12,search.yml} 192.168.121.92:/etc/elasticsearch/
root@192.168.121.92's password: 
elastic-certificates-https.p12                                                                                                                             100% 3596     1.6MB/s   00:00    
elasticsearch.yml                                                                                                                                          100% 4378     6.0MB/s   00:00    
[root@elk1 ~]# scp  -rp /etc/elasticsearch/elastic{-certificates-https.p12,search.yml} 192.168.121.93:/etc/elasticsearch/
root@192.168.121.93's password: 
elastic-certificates-https.p12                                                                                                                             100% 3596   894.2KB/s   00:00    
elasticsearch.yml  

4.重启ES集群
[root@elk1 ~]# systemctl restart elasticsearch.service 
[root@elk2 ~]# systemctl restart elasticsearch.service 
[root@elk3 ~]# systemctl restart elasticsearch.service 

[root@elk1 ~]# curl https://127.1:9200/_cat/nodes -u elastic:654321 -k
192.168.121.92 16 94 63 1.88 0.92 0.35 cdfhilmrstw - elk2
192.168.121.91 14 96 30 0.79 0.90 0.55 cdfhilmrstw * elk1
192.168.121.93  8 97 53 1.22 0.71 0.33 cdfhilmrstw - elk3

5.修改kibana的配置跳过自建证书校验
[root@elk1 ~]# vim /etc/kibana/kibana.yml 
...
# 指向ES集群的地址协议为https
elasticsearch.hosts: ["https://192.168.121.91:9200","https://192.168.121.92:9200","https://192.168.121.93:9200"]
# 跳过证书校验
elasticsearch.ssl.verificationMode: none
[root@elk1 ~]# systemctl restart kibana.service

filebeat对接https加密

复制代码

# 编写filebeat配置文件
[root@elk92 filebeat]# cat 17-tcp-to-es-tls.yaml 
filebeat.inputs:
- type: tcp
  host: "0.0.0.0:9000"


output.elasticsearch:
  hosts:
  - https://192.168.121.91:9200
  - https://192.168.121.92:9200
  - https://192.168.121.93:9200
  api_key: "m1wPlJUBrDbi_DeiIc-1:RcEw7Mk2QQKH_CGhMBnfbg"
  index: xu-es-apikey-tls-2025
  # 配置es集群的tls，此处跳过证书校验。默认值为: full
  # 参考链接:
  # 	https://www.elastic.co/guide/en/beats/filebeat/7.17/configuration-ssl.html#client-verification-mode
  ssl.verification_mode: none

setup.ilm.enabled: false
setup.template.name: "xu"
setup.template.pattern: "xu*"
setup.template.overwrite: true
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 0

logstash对接https加密

复制代码

[root@elk93 logstash]# cat 13-tcp-to-es_api-key.conf 
input { 
  tcp {
    port => 8888
  }
}  

output { 

  elasticsearch {
    hosts => ["192.168.121.91:9200","192.168.121.92:9200","192.168.121.93:9200"]
    index => "xu-api-key"
    #user => elastic
    #password => "123456"xu
    # 指定api-key的方式认证
    api_key => "oFwZlJUBrDbi_DeiLc9O:HWBj0LC2RWiUNTudV-6CBw"
   
    # 使用api-key则必须启动ssl
    ssl => true

    # 跳过ssl证书验证
    ssl_certificate_verification => false
  }
}
[root@elk93 logstash]# 
[root@elk93 logstash]# logstash -rf 13-tcp-to-es_api-key.conf

基于kibana实现RBAC

参考链接:
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/security-privileges.html

创建角色

创建用户

ES8部署

单点部署ES8集群

复制代码

环境准备：
	192.168.121.191 elk191
	192.168.121.192 elk192
	192.168.121.193 elk193

1.获取安装包，安装es8
[root@elk191 ~]# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.17.3-amd64.deb
[root@elk191 ~]# dpkg -i elasticsearch-8.17.3-amd64.deb


# es8默认就支持https
--------------------------- Security autoconfiguration information ------------------------------

Authentication and authorization are enabled.
TLS for the transport and HTTP layers is enabled and configured.

The generated password for the elastic built-in superuser is : P0-MRYuCOTFj*4*rGNZk   # 内置elastic超级用户的密码

If this node should join an existing cluster, you can reconfigure this with
'/usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <token-here>'
after creating an enrollment token on your existing cluster.

You can complete the following actions at any time:

Reset the password of the elastic built-in superuser with 
'/usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic'.

Generate an enrollment token for Kibana instances with 
 '/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana'.

Generate an enrollment token for Elasticsearch nodes with 
'/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node'.

-------------------------------------------------------------------------------------------------
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
 sudo systemctl daemon-reload
 sudo systemctl enable elasticsearch.service
### You can start elasticsearch service by executing
 sudo systemctl start elasticsearch.service


2.启动es8
[root@elk191 ~]# systemctl enable elasticsearch.service --now
Created symlink /etc/systemd/system/multi-user.target.wants/elasticsearch.service → /lib/systemd/system/elasticsearch.service.
[root@elk191 ~]# netstat  -tunlp | grep -E "9[2|3]00"
tcp6       0      0 127.0.0.1:9300          :::*                    LISTEN      1669/java           
tcp6       0      0 ::1:9300                :::*                    LISTEN      1669/java           
tcp6       0      0 :::9200                 :::*                    LISTEN      1669/java 

3.测试访问
[root@elk191 ~]# curl -u elastic:NVPLcMy0_n8aGL=UGAGc https://127.1:9200 -k
{
  "name" : "elk191",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "-cw1TGvZSau0J2x-ThOJsg",
  "version" : {
    "number" : "8.17.3",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "a091390de485bd4b127884f7e565c0cad59b10d2",
    "build_date" : "2025-02-28T10:07:26.089129809Z",
    "build_snapshot" : false,
    "lucene_version" : "9.12.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}
[root@elk191 ~]# curl -u elastic:NVPLcMy0_n8aGL=UGAGc https://127.1:9200/_cat/nodes -k
127.0.0.1 9 97 13 0.35 0.59 0.31 cdfhilmrstw * elk191

部署kibana8

复制代码

1.获取安装包，安装kibana
[root@elk191 ~]# wget  https://artifacts.elastic.co/downloads/kibana/kibana-8.17.3-amd64.deb
[root@elk191 ~]# dpkg -i kibana-8.17.3-amd64.deb 

2.配置kibana
[root@elk191 ~]# grep -vE "^$|^#" /etc/kibana/kibana.yml 
server.port: 5601
server.host: "0.0.0.0"
logging:
  appenders:
    file:
      type: file
      fileName: /var/log/kibana/kibana.log
      layout:
        type: json
  root:
    appenders:
      - default
      - file
pid.file: /run/kibana/kibana.pid
i18n.locale: "zh-CN"

3.启动kibana
[root@elk191 ~]# systemctl enable --now kibana.service 
[root@elk191 ~]# ss -ntl | grep 5601
LISTEN 0      511               0.0.0.0:5601      0.0.0.0:*     

4.生成kibana专用token
[root@elk191 ~]# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana
eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiZmNjMWI3MzJlNzIwMzMzMjI0ZDc5Zjk1YTUyZjIzZmUyNjMzMzYwZDIxY2Q0NzY3YjQ2ZjExZDhiOGYxZTFlZiIsImtleSI6IjdjNTk3SlVCeEI5S3NHd1ZPWVQ5OmYtN0FRWkhEUTVtMnlCZXdiMnJLbXcifQ==

5.kiban服务器获取校验码
[root@elk191 ~]# /usr/share/kibana/bin/kibana-verification-code 
Your verification code is:  414 756

es8集群部署

复制代码

1.拷贝配置文件到其他节点
[root@elk191 ~]# scp elasticsearch-8.17.3-amd64.deb 10.0.0.192:~
[root@elk191 ~]# scp elasticsearch-8.17.3-amd64.deb 10.0.0.193:~

2.其他节点安装ES8软件包 
[root@elk192 ~]# dpkg -i elasticsearch-8.17.3-amd64.deb 
[root@elk193 ~]# dpkg -i elasticsearch-8.17.3-amd64.deb 

# 配置es8
[root@elk191 ~]# grep -Ev "^$|^#" /etc/elasticsearch/elasticsearch.yml
cluster.name: xu-application
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
discovery.seed_hosts: ["192.168.121.191","192.168.121.192","192.168.121.193"]
cluster.initial_master_nodes: ["192.168.121.191","192.168.121.192","192.168.121.193"]
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
http.host: 0.0.0.0

3.在现有集群任意节点生成token令牌文件
[root@elk191 ~]#  /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node

4.在待加入节点使用token重新配置新节点的配置文件
[root@elk192 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token  eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiMzIwODY0YzMxNmEyMDQ4YmIwYzVjNDNhY2FlZjQ4MTg2OTM3MmVhNTg2NjdiYTAwMjBjN2Y2ZTczN2YzNWU0MCIsImtleSI6IkE3RTY4SlVCU1BhTWhMRFN0VWdlOmdaM0dIS0RNUndld3o3ZWM0Qk1ySEEifQ==
[root@elk193 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token  eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTkyLjE2OC4xMjEuMTkxOjkyMDAiXSwiZmdyIjoiMzIwODY0YzMxNmEyMDQ4YmIwYzVjNDNhY2FlZjQ4MTg2OTM3MmVhNTg2NjdiYTAwMjBjN2Y2ZTczN2YzNWU0MCIsImtleSI6IkE3RTY4SlVCU1BhTWhMRFN0VWdlOmdaM0dIS0RNUndld3o3ZWM0Qk1ySEEifQ==

5.同步配置文件
[root@elk191 ~]# scp /etc/elasticsearch/elasticsearch.yml  192.168.121.192:/etc/elasticsearch/
[root@elk191 ~]# scp /etc/elasticsearch/elasticsearch.yml  192.168.121.193:/etc/elasticsearch/


6.启动客户端es
[root@elk192 ~]# systemctl enable elasticsearch.service --now
[root@elk193 ~]# systemctl enable elasticsearch.service --now

7.访问测试
[root@elk193 ~]# curl -u elastic:123456 -k https://192.168.121.191:9200/_cat/nodes
192.168.121.191 17 97 10 0.61 0.55 0.72 cdfhilmrstw * elk191
192.168.121.193 15 97 55 1.72 1.05 0.49 cdfhilmrstw - elk193
192.168.121.192 13 97  4 0.25 0.45 0.52 cdfhilmrstw - elk192

常见错误

复制代码

1.常见的错误处理Q1:
ERROR: Aborting enrolling to cluster. Unable to remove existing secure settings. Error was: Aborting enrolling to cluster. Unable to remove existing security configuration, elasticsearch.keystore did not contain expected setting [autoconfiguration.password_hash]., with exit code 74


问题分析:
说明本地已经有安全配置的相关参数设置了。删除之前的配置"elasticsearch.keystore"。


解决方案:
rm -f /etc/elasticsearch/elasticsearch.keystore 




常见的错误处理Q2:
ERROR: Skipping security auto configuration because this node is configured to bootstrap or to join a multi-node cluster, which is not supported., with exit code 80


解决方案:
export IS_UPGRADE=false


常见的错误处理Q3：
ERROR: Aborting enrolling to cluster. This node doesn't appear to be auto-configured for security. Expected configuration is missing from elasticsearch.yml., with exit code 64


错误分析:
检查配置文件，发现会缺少安全相关的配置，可能是同步'elasticsearch.yml'失败了。

解决方案:
修改"/etc/elasticsearch/elasticsearch.yml"，添加安全配置即可，可以手动将elk192节点的配置拷贝到该节点即可。

如果还是解决不了，可以对比elk191节点的配置和elk192的配置文件不同之处，然后将对应的配置拷贝过去即可，我此处测试缺少certs目录。


[root@elk191 ~]# scp -rp /etc/elasticsearch/certs/ 10.0.0.192:/etc/elasticsearch/
[root@elk191 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.0.0.192:/etc/elasticsearch/
[root@elk191 ~]# scp /etc/elasticsearch/elasticsearch.keystore  10.0.0.192:/etc/elasticsearch/
[root@elk191 ~]# scp -rp /etc/elasticsearch/elasticsearch.yml  10.0.0.192:/etc/elasticsearch/


ERROR: Aborting enrolling to cluster. Unable to remove existing secure settings. Error was: Aborting enrolling to cluster. Unable to remove existing security configuration, elasticsearch.keystore did not contain expected setting [xpack.security.transport.ssl.keystore.secure_password]., with exit code 74

es8和es7区别

复制代码

- ES8和ES7对比部署
	1.ES8默认启用了https，支持认证等功能;
	2.ES8新增'elasticsearch-reset-password'脚本，对于elastic用户重置密码更加简单;
	3.ES8新增'elasticsearch-create-enrollment-token'脚本，可以为组件创建token信息，比如kibana组件；
	4.ES8新增kibana新增'kibana-verification-code'用于生成校验码。
	5.kibana支持更多的语言：English (default) "en", Chinese "zh-CN", Japanese "ja-JP", French "fr-FR"
	6.kibana的webUI更加丰富，支持AI助手，手动创建索引等功能;
	7.ES8集群部署时，需要借助'elasticsearch-reconfigure-node'脚本来加入已存在的集群，默认就是单master节点的配置;

ES7 JVM调优

复制代码

1.ES默认会吃掉物理机的一半内存
[root@elk91 ~]# ps -ef | grep java | grep Xms
elastic+   10045       1  2 Mar14 ?        00:56:32 /usr/share/elasticsearch/jdk/bin/java ...  -Xms1937m -Xmx1937m ...

2.关于ES集群的调优原则
		- 1.ES集群的JVM大小应该为物理机的一半，但是不的超过32GB；
		- 2.比如你的集群内存为32GB，则默认应该为16GB，但是如果你的物理机是128GB，默认也会吃掉一半，因此我们需要手动配置为32GB；

3.将ES内存设置为256Mb
[root@elk1 ~]# vim /etc/elasticsearch/jvm.options
[root@elk1 ~]# egrep "^-Xm[s|x]" /etc/elasticsearch/jvm.options
-Xms256m
-Xmx256m

4.拷贝配置文件并滚动重启ES7集群
[root@elk1 ~]# scp /etc/elasticsearch/jvm.options 192.168.121.92:/etc/elasticsearch/
jvm.options                                                                                                                                                100% 3474     2.7MB/s   00:00    
[root@elk1 ~]# scp /etc/elasticsearch/jvm.options 192.168.121.93:/etc/elasticsearch/
jvm.options

[root@elk1 ~]# systemctl restart elasticsearch.service 
[root@elk2 ~]# systemctl restart elasticsearch.service 
[root@elk3 ~]# systemctl restart elasticsearch.service 

5.测试验证
[root@elk1 ~]# free -h
               total        used        free      shared  buff/cache   available
Mem:           3.8Gi       1.1Gi       1.9Gi       1.0Mi       800Mi       2.4Gi
Swap:          3.8Gi        26Mi       3.8Gi
[root@elk1 ~]# ps -ef | grep java | grep Xms
-Xms256m -Xmx256m


curl -k -u elastic:123456 https://127.1:9200/_cat/nodes
192.168.121.92 68 67 94 4.01 2.12 0.96 cdfhilmrstw * elk2
192.168.121.91 59 56 42 1.72 0.87 0.43 cdfhilmrstw - elk1
192.168.121.93 63 61 92 3.30 2.26 1.14 cdfhilmrstw - elk3