Collecting Application Logs and Multiline Matching with Filebeat

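1 Collecting nginx JSON logs with Filebeat

01 Switch the nginx log format to JSON

Install nginx on the elk93 node. Comment out the default access log directive (# access_log /var/log/nginx/access.log;), add the following configuration below it, then restart nginx.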

log_format wzy_nginx_json '{"@timestamp":"$time_iso8601",'
                            '"host":"$server_addr",'
                            '"clientip":"$remote_addr",'
                            '"SendBytes":$body_bytes_sent,'
                            '"responsetime":$request_time,'
                            '"upstreamtime":"$upstream_response_time",'
                            '"upstreamhost":"$upstream_addr",'
                            '"http_host":"$host",'
                            '"uri":"$uri",'
                            '"domain":"$host",'
                            '"xff":"$http_x_forwarded_for",'
                            '"referer":"$http_referer",'
                            '"tcp_xff":"$proxy_protocol_addr",'
                            '"http_user_agent":"$http_user_agent",'
                            '"status":"$status"}';

# Write the access log using the JSON format defined above
access_log /var/log/nginx/access.log wzy_nginx_json;
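
Since Filebeat will decode every access-log line as JSON, it can help to check one line by hand first. The sketch below parses a hypothetical line in the wzy_nginx_json format (every value is made up, and parse_access_line is an illustrative helper, not part of nginx or Filebeat) and shows which fields arrive as numbers and which as strings.

```python
import json

# Hypothetical access-log line in the wzy_nginx_json format; every value is made up.
SAMPLE_LINE = (
    '{"@timestamp":"2024-11-16T17:38:58+08:00","host":"10.0.0.93",'
    '"clientip":"10.0.0.253","SendBytes":612,"responsetime":0.000,'
    '"upstreamtime":"-","upstreamhost":"-","http_host":"pc.wzy666.com",'
    '"uri":"/index.html","domain":"pc.wzy666.com","xff":"-","referer":"-",'
    '"tcp_xff":"-","http_user_agent":"curl/7.81.0","status":"200"}'
)


def parse_access_line(line):
    """Decode one log line; raises json.JSONDecodeError if it is not valid JSON."""
    return json.loads(line)


event = parse_access_line(SAMPLE_LINE)

# Fields written without quotes in log_format are decoded as numbers.
numeric_fields = sorted(k for k, v in event.items() if isinstance(v, (int, float)))
print(numeric_fields)

# status is quoted in log_format, so it is decoded as a string.
print(type(event["status"]).__name__)
```

Because status is quoted in the log format, queries against it compare with the string "200" rather than a number.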

02 Configure Filebeat to collect the nginx logs

cat > 07-nginx-to-es.yaml <<EOF
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log
  # Decode each line as JSON and place the keys at the top level of the event instead of leaving the raw JSON in the message field
  json.keys_under_root: true

output.elasticsearch:
  hosts: 
  - "http://10.0.0.91:9200"
  - "http://10.0.0.92:9200"
  - "http://10.0.0.93:9200"
  index: "zhiyong18-luckyboy-log-nginx"

setup.ilm.enabled: false
setup.template.name: "zhiyong18-luckyboy"
setup.template.pattern: "zhiyong18-luckyboy-log*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
EOF
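
To make the effect of json.keys_under_root concrete, here is a simplified model of how a decoded line could be merged into an event. It is a sketch under stated assumptions, not Filebeat's implementation: decode_json_line and base_fields are hypothetical names, and the conflict rule (existing fields win unless overwriting is requested) follows the documented behaviour of json.overwrite_keys.

```python
import json


def decode_json_line(line, base_fields, keys_under_root=False, overwrite_keys=False):
    """Simplified model: merge one decoded JSON log line into an event dict."""
    decoded = json.loads(line)
    event = dict(base_fields)
    if not keys_under_root:
        # Default behaviour: decoded keys are nested under a "json" key.
        event["json"] = decoded
        return event
    for key, value in decoded.items():
        # Fields that already exist are kept unless overwriting is enabled.
        if key in event and not overwrite_keys:
            continue
        event[key] = value
    return event


base = {"input": {"type": "log"}}  # stand-in for fields Filebeat adds itself
line = '{"status":"200","domain":"pc.wzy666.com"}'

nested = decode_json_line(line, base)
flat = decode_json_line(line, base, keys_under_root=True)
print(sorted(nested))
print(sorted(flat))
```

With the keys at the top level, Kibana queries can reference fields such as status and domain directly.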

Visualization

Filter by status code

Filter for status code 200 and the domain pc.wzy666.com

domain : "pc.wzy666.com" and status : "200"

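03 Build a dashboard from the nginx access logs

1 Build individual metrics

Save

IP statistics

UV statistics (bandwidth traffic)

2 Combine the metrics into a dashboard

When creating the dashboard, import the finished visualizations from the library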

2 Collecting tomcat logs with Filebeat

01 Deploy and configure tomcat

1. Download tomcat on the elk93 node, reuse the JDK bundled with elasticsearch, and load the environment variables

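# Quote the heredoc delimiter so $PATH, $JAVA_HOME and $TOMCAT_HOME are written literally instead of being expanded now
cat > /etc/profile.d/tomcat.sh <<'EOF'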
#!/bin/bash
export JAVA_HOME=/usr/share/elasticsearch/jdk
export TOMCAT_HOME=/app/apache-tomcat-10.1.25
export PATH=$PATH:$JAVA_HOME/bin:$TOMCAT_HOME/bin
EOF

2. Switch the tomcat access log to JSON: edit conf/server.xml, delete lines 143 to 159, and put the following in their place

          <Host name="tomcat.zhiyong18.com"  appBase="webapps"
                unpackWARs="true" autoDeploy="true">

            <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
                   prefix="tomcat.zhiyong18.com_access_log" suffix=".json"
                   pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;request&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;http_user_agent&quot;:&quot;%{User-Agent}i&quot;}"/>
          </Host>

3. Start tomcat

catalina.sh start

4. Add a hosts record on a Windows or Linux machine and visit http://tomcat.zhiyong18.com:8080, or pass the Host header with curl:

curl -H 'HOST: tomcat.zhiyong18.com' 10.0.0.93:8080

5. Confirm that the tomcat access log is written as JSON

[root@elk93~]# tail -2 /app/apache-tomcat-10.1.25/logs/tomcat.zhiyong18.com_access_log.2024-11-16.json 
{"clientip":"10.0.0.253","ClientUser":"-","authenticated":"-","AccessTime":"[16/Nov/2024:17:38:58 +0000]","request":"GET /asf-logo-wide.svg HTTP/1.1","status":"200","SendBytes":"27235","Query?string":"","partner":"http://tomcat.zhiyong18.com:8080/tomcat.css","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0"}
{"clientip":"10.0.0.253","ClientUser":"-","authenticated":"-","AccessTime":"[16/Nov/2024:17:38:58 +0000]","request":"GET /favicon.ico HTTP/1.1","status":"200","SendBytes":"21630","Query?string":"","partner":"http://tomcat.zhiyong18.com:8080/","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0"}
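
Note that this valve pattern quotes every field, so SendBytes arrives as a string and AccessTime keeps tomcat's bracketed format. If a downstream consumer needs typed values, a conversion along these lines might work. It is a minimal sketch: normalize is a hypothetical helper, the line is a shortened copy of the first entry above, and month parsing assumes English month names.

```python
import json
from datetime import datetime

# Shortened copy of the first access-log line shown above.
LINE = (
    '{"clientip":"10.0.0.253","AccessTime":"[16/Nov/2024:17:38:58 +0000]",'
    '"request":"GET /asf-logo-wide.svg HTTP/1.1","status":"200","SendBytes":"27235"}'
)


def normalize(line):
    """Decode one tomcat access-log line and convert the time and byte count."""
    event = json.loads(line)
    # %t is written as [dd/Mon/yyyy:HH:mm:ss +zzzz].
    event["AccessTime"] = datetime.strptime(
        event["AccessTime"], "[%d/%b/%Y:%H:%M:%S %z]"
    )
    # %b logs "-" when no bytes were sent.
    sent = event["SendBytes"]
    event["SendBytes"] = 0 if sent == "-" else int(sent)
    return event


event = normalize(LINE)
print(event["AccessTime"].isoformat(), event["SendBytes"])
```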

02 Deploy Filebeat to collect the tomcat logs

1. Run Filebeat on the elk93 node with this configuration file

cat > 08-tomcat-to-es.yaml <<EOF
filebeat.inputs:
- type: log
  paths:
    - /app/apache-tomcat-10.1.25/logs/tomcat.zhiyong18.com_access_log*.json
  json.keys_under_root: true

output.elasticsearch:
  hosts: 
  - "http://10.0.0.91:9200"
  - "http://10.0.0.92:9200"
  - "http://10.0.0.93:9200"
  index: "zhiyong18-luckyboy-log-tomcat"

setup.ilm.enabled: false
setup.template.name: "zhiyong18-luckyboy"
setup.template.pattern: "zhiyong18-luckyboy-log*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
EOF

2. After creating the index pattern, query the logs under Analytics --> Discover

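3 Filebeat multiline matching

01 Multiline matching in the official Elastic documentation

Official multiline documentation

Why multiline matching is needed: Java errors have a distinctive shape. Once you spot a run of lines that each begin with at, reading upward from there usually locates the problem quickly, as in the figure below:

1. Start with the figure. The official documentation describes four multiline matching modes, one for each combination of multiline.negate and multiline.match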

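multiline.negate

Description: whether the pattern is negated, that is, whether the lines matching the regular expression or the lines not matching it are treated as continuation lines
Default: false
If set to true: lines that do not match are merged as continuation lines, and with multiline.match: after a matching line starts a new event
If set to false: lines that match are merged as continuation lines, and with multiline.match: after a non-matching line starts a new event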

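2. Example of mode 1: collecting Java stack traces

Tip: reading the official multiline.pattern examples requires a basic grasp of regular expressions

Example figure

multiline.negate: false, so the lines matched by the regex (those starting with whitespace) do not start a new event; they are merged into the multiline event
multiline.match: after, so the matched lines (those starting with whitespace) are appended after the preceding line (the event line)

3. Example

multiline.negate: true, so the lines matched by the regex (those starting with a date) start a new event; every line the regex does not match is merged
multiline.match: after, so the non-event lines are placed after the event line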
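
The two examples can be reproduced with a small simulation. The sketch below is a simplified model of pattern-based grouping, not Filebeat's code: group_multiline is a hypothetical function, timeouts and line limits are ignored, the sample log lines are made up, and Python's \s stands in for the whitespace class used in the official pattern.

```python
import re


def group_multiline(lines, pattern, negate=False, match="after"):
    """Simplified model of pattern-based multiline grouping (no timeouts or max_lines)."""
    regex = re.compile(pattern)

    def is_continuation(line):
        # negate flips the meaning: non-matching lines become continuation lines.
        return bool(regex.search(line)) != negate

    events, buffer = [], []
    for line in lines:
        if not buffer:
            buffer = [line]
            continue
        # after:  the current line decides whether it joins the previous lines.
        # before: the previous line decides whether the current line joins it.
        joins = is_continuation(line) if match == "after" else is_continuation(buffer[-1])
        if joins:
            buffer.append(line)
        else:
            events.append("\n".join(buffer))
            buffer = [line]
    if buffer:
        events.append("\n".join(buffer))
    return events


# Made-up sample lines for the two examples.
java_trace = [
    "Exception in thread main java.lang.IllegalStateException: boom",
    "    at com.example.App.run(App.java:10)",
    "    at com.example.App.main(App.java:5)",
    "Exception in thread main java.lang.RuntimeException: again",
]
dated_log = [
    "2024-11-16 17:38:58 ERROR failed to start",
    "java.lang.IllegalStateException: boom",
    "    at com.example.App.run(App.java:10)",
    "2024-11-16 17:39:00 INFO recovered",
]

by_whitespace = group_multiline(java_trace, r"^\s", negate=False, match="after")
by_date = group_multiline(dated_log, r"^\d{4}-\d{2}-\d{2}", negate=True, match="after")
print(len(by_whitespace), len(by_date))
```

In both cases the stack-trace lines end up in the same event as the line that introduced them, which is what keeps a Java error readable as a single document.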

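02 Case study: collecting tomcat error logs

1. Edit tomcat's server.xml to deliberately trigger some errors. First look at the format of catalina.out: every line either starts with two digits or does not

2. Run Filebeat with this configuration: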

cat > 09-tomcat_errlog-to-es.yaml <<EOF
filebeat.inputs:
- type: log
  paths:
    - /app/apache-tomcat-10.1.25/logs/catalina.out
  multiline:
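    # Multiline aggregation type; pattern mode is used here
    type: pattern
    # The pattern: lines starting with two digits are event lines, all other lines are merged into them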
    pattern: '^\d{2}'
    negate: true
    match: after

output.elasticsearch:
  hosts: 
  - "http://10.0.0.91:9200"
  - "http://10.0.0.92:9200"
  - "http://10.0.0.93:9200"
  index: "zhiyong18-luckyboy-log-tomcat-errorlog"

setup.ilm.enabled: false
setup.template.name: "zhiyong18-luckyboy"
setup.template.pattern: "zhiyong18-luckyboy-log*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
EOF

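3. Create the index pattern, then view the results

4 Case study: count-based multiline matching and writing inputs to different indices

Requirements:

input1 is /tmp/xixi.log and is written to the zhiyong18-luckyboy-log-xixi-%{+yyyy.MM.dd} index
input2 is a TCP data stream and is written to zhiyong18-luckyboy-log-haha-%{+yyyy.MM.dd}

Approach:

Tag each input, then have output.elasticsearch check the tags and select the matching index

1. Run Filebeat with this file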

cat > 10-mutiple-to-es.yaml <<EOF
filebeat.inputs:
- type: log
  paths:
    - /tmp/xixi.log
  # Tag the events from this input
  tags: ["xixi"]
  multiline:
    # Treat every 4 lines as one document
    type: count
    count_lines: 4

- type: tcp
  host: "0.0.0.0:9000"
  tags: ["haha"]

output.elasticsearch:
  hosts: 
  - "http://10.0.0.91:9200"
  - "http://10.0.0.92:9200"
  - "http://10.0.0.93:9200"
  # Route events to different indices based on the tags field
  indices:
  - index: "zhiyong18-luckyboy-log-xixi-%{+yyyy.MM.dd}"
    # Matches when the tags field contains xixi
    when.contains:
      tags: xixi
  - index: "zhiyong18-luckyboy-log-haha-%{+yyyy.MM.dd}"
    # Matches when the tags field contains haha
    when.contains:
      tags: haha

setup.ilm.enabled: false
setup.template.name: "zhiyong18-luckyboy"
setup.template.pattern: "zhiyong18-luckyboy-log*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0
EOF
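
The routing idea in this configuration can be sketched in a few lines. This is a simplified model, not Filebeat's implementation: pick_index and RULES are hypothetical names, the contains check is reduced to exact tag membership, and returning None stands in for the output falling back to its default index when no rule matches.

```python
from datetime import datetime, timezone

# (tag, index prefix) pairs mirroring the indices rules above, checked in order.
RULES = [
    ("xixi", "zhiyong18-luckyboy-log-xixi-"),
    ("haha", "zhiyong18-luckyboy-log-haha-"),
]


def pick_index(tags, timestamp):
    """Return the index of the first rule whose tag is present in tags, else None."""
    day = timestamp.strftime("%Y.%m.%d")  # same shape as %{+yyyy.MM.dd}
    for tag, prefix in RULES:
        if tag in tags:
            return prefix + day
    return None  # no rule matched


ts = datetime(2024, 11, 16, 17, 38, 58, tzinfo=timezone.utc)
print(pick_index(["xixi"], ts))
print(pick_index(["haha"], ts))
```

Because the rules are checked in order, an event carrying both tags would land in the xixi index in this model.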

2. Prepare the test data in /tmp/xixi.log with the following content

name: wenzy
  hobby:
  - blogging
  - singing
name: she
  hobby:
  - dancing
  - fishing

The TCP stream data is sent as follows:

[root@elk91~]# echo -e '1 \n 2 \n 3 \n 4 \n A \n B \n C \n D' | nc 10.0.0.93 9000
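
Before opening Discover, it may help to predict how many documents each index should receive. The sketch below models count-based grouping under simplifying assumptions: group_by_count is a hypothetical function, timeout-based flushing is ignored, and the file lines only mirror the shape of /tmp/xixi.log.

```python
def group_by_count(lines, count_lines):
    """Join every count_lines consecutive lines into one event (no timeout handling)."""
    return [
        "\n".join(lines[i:i + count_lines])
        for i in range(0, len(lines), count_lines)
    ]


# Lines shaped like the /tmp/xixi.log test data.
xixi_lines = [
    "name: wenzy", "  hobby:", "  - blogging", "  - singing",
    "name: she", "  hobby:", "  - dancing", "  - fishing",
]
file_events = group_by_count(xixi_lines, 4)

# The tcp input has no multiline settings, so each received line stays its own event.
tcp_lines = ["1 ", " 2 ", " 3 ", " 4 ", " A ", " B ", " C ", " D"]
tcp_events = list(tcp_lines)

print(len(file_events), len(tcp_events))
```

Under these assumptions the xixi index should hold two documents of four lines each, while the haha index should hold one document per line sent over TCP.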

3. Verify that both indices were created, then check their contents in Discover
