Collecting nginx logs with Logstash on a Kunpeng server

Preparation

1. Server information

shell
[root@ecs ~]# cat /etc/kylin-release 
Kylin Linux Advanced Server release V10 (Tercel)
[root@ecs ~]# 
[root@ecs ~]# uname -a
Linux ecs.novalocal 4.19.148+ #1 SMP Mon Oct 5 22:04:46 EDT 2020 aarch64 aarch64 aarch64 GNU/Linux

2. Logstash image

Version: 8.4.0. The image can be pulled from https://hub.docker.com/ or downloaded from https://download.csdn.net/download/Angushine/91436965

Collection approach

1. Configure nginx to write its access log in JSON format, which makes parsing straightforward

2. Point Logstash at the nginx log file access_yyyy-mm-dd.log and write the collected events to Kafka or Elasticsearch
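The second step can be sketched in a few lines of Python (a hypothetical helper for illustration, not how Logstash is actually implemented): read any newly appended JSON lines from the log and hand each parsed event to a sink.

```python
import io
import json

def collect_new_events(fp, sink):
    """Append each non-empty JSON log line read from fp to sink as a dict."""
    for line in fp:
        line = line.strip()
        if line:  # skip blank lines
            sink.append(json.loads(line))

# simulate two freshly appended access-log lines
log = io.StringIO('{"status":"200","url":"/a"}\n{"status":"404","url":"/b"}\n')
events = []
collect_new_events(log, events)
print(events[0]["url"])  # -> /a
```

A real collector would additionally sleep and retry at EOF and persist its file offset across restarts, which is what the Logstash file input's sincedb mechanism handles.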

Import the image

bash
[root@ecs public]# 
[root@ecs public]# docker load -i logstash-8.4.tar 
9beca9c8e2ec: Loading layer [==================================================>]   68.1MB/68.1MB
deebf99a620e: Loading layer [==================================================>]  66.89MB/66.89MB
1ec913148952: Loading layer [==================================================>]  345.6kB/345.6kB
5ffe4168f194: Loading layer [==================================================>]  595.9MB/595.9MB
ed15c32d9098: Loading layer [==================================================>]  4.096kB/4.096kB
ef6a633680e3: Loading layer [==================================================>]  4.096kB/4.096kB
898f302c5bdb: Loading layer [==================================================>]  4.608kB/4.608kB
1e88cd63ad1e: Loading layer [==================================================>]  4.608kB/4.608kB
2681b072ef0f: Loading layer [==================================================>]  14.34kB/14.34kB
ea41d0d53acc: Loading layer [==================================================>]  2.945MB/2.945MB
ddd9c047a343: Loading layer [==================================================>]  3.584kB/3.584kB
Loaded image: logstash:8.4.0
[root@ecs public]# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
logstash            8.4.0               8ec9de6fbf46        2 years ago         720MB

Adapting the nginx log format

1. Edit the nginx configuration file nginx.conf

nginx
http {
    include       mime.types;
    default_type  application/octet-stream;
    
    # original log format
    # log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';
    # new JSON log format
    log_format json '{"@timestamp":"$time_iso8601",'
                    '"@version":"1",'
                    '"client":"$remote_addr",'
                    '"url":"$uri",'
                    '"status":"$status",'
                    '"domain":"$host",'
                    '"host":"$server_addr",'
                    '"size":$body_bytes_sent,'
                    '"responsetime":$request_time,'
                    '"referer": "$http_referer",'
                    '"ua": "$http_user_agent"'
                    '}';

    # daily log rotation, method 1 (map in http context)
    #map $time_iso8601 $logdate {
    #    '~^(?<ymd>\d{4}-\d{2}-\d{2})' $ymd;
    #    default 'date-not-found';
    #}
    #access_log logs/access_$logdate.log json;

    open_log_file_cache max=10;
    sendfile        on;
    keepalive_timeout  65;
    server_tokens off;

    # gzip  on;
    client_max_body_size 200M;

    server {
        listen       80;
        server_name  localhost;

        # daily log rotation, method 2 (if in server context)
        if ($time_iso8601 ~ '(\d{4}-\d{2}-\d{2})') {
            set $tttt $1;
        }
        access_log  logs/access_$tttt.log  json;
    }
}
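Both rotation methods rely on capturing the yyyy-mm-dd prefix of $time_iso8601 with the regex (\d{4}-\d{2}-\d{2}). The same capture can be checked in Python (a sketch using a sample timestamp value):

```python
import re

# same pattern nginx uses to derive the per-day log file name
pattern = re.compile(r'^(\d{4}-\d{2}-\d{2})')

timestamp = "2025-07-21T17:05:21+08:00"  # sample $time_iso8601 value
match = pattern.match(timestamp)
logdate = match.group(1) if match else "date-not-found"
print(f"access_{logdate}.log")  # -> access_2025-07-21.log
```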

2. Reload the configuration

bash
./sbin/nginx -s reload

3. Check that logs are generated in the specified format

bash
[root@server nginx]# cat ./logs/access_2025-07-21.log 
{"@timestamp":"2025-07-21T17:05:21+08:00","@version":"1","client":"192.168.74.6","url":"/test/uc/test/applyPageList.do","status":"200","domain":"192.168.1.100","host":"192.168.1.100","size":5928,"responsetime":0.019,"referer": "http://192.168.1.100/home/","ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"}
{"@timestamp":"2025-07-21T17:05:26+08:00","@version":"1","client":"192.168.74.6","url":"/test/uc/test/pageList.do","status":"200","domain":"192.168.1.100","host":"192.168.1.100","size":5288,"responsetime":0.015,"referer": "http://192.168.1.100/home/","ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"}
{"@timestamp":"2025-07-21T17:05:28+08:00","@version":"1","client":"192.168.74.6","url":"/test/uc/test/getSafetyPostList.do","status":"200","domain":"192.168.1.100","host":"192.168.1.100","size":5590,"responsetime":0.010,"referer": "http://192.168.1.100/home/","ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"}
{"@timestamp":"2025-07-21T17:05:28+08:00","@version":"1","client":"192.168.74.6","url":"/test/uc/test.do","status":"200","domain":"192.168.1.100","host":"192.168.1.100","size":19018,"responsetime":0.026,"referer": "http://192.168.1.100/home/","ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"}

The requests are now written to the access log file in the specified JSON format.
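Every line must be valid JSON for the downstream pipeline, so it is worth validating one programmatically. A minimal check in Python, using a shortened version of the first line above:

```python
import json

line = ('{"@timestamp":"2025-07-21T17:05:21+08:00","@version":"1",'
        '"client":"192.168.74.6","url":"/test/uc/test/applyPageList.do",'
        '"status":"200","domain":"192.168.1.100","host":"192.168.1.100",'
        '"size":5928,"responsetime":0.019,'
        '"referer": "http://192.168.1.100/home/","ua": "Mozilla/5.0"}')

event = json.loads(line)
# size and responsetime are unquoted in log_format, so they parse as numbers
print(type(event["size"]).__name__, type(event["responsetime"]).__name__)  # -> int float
```

Keeping size and responsetime unquoted in the log_format is what makes them arrive as numbers rather than strings in Kafka or Elasticsearch.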

Logstash configuration

The configuration can be downloaded here: https://download.csdn.net/download/Angushine/91497749

Configuration overview

Unzip the downloaded logstash.zip into a directory of your choice, here /data/public/logstash/

1. logstash.yml

Full path of logstash.yml: /data/public/logstash/config/logstash.yml

yaml
# pipeline config path; a directory loads every .conf file in lexicographic order
path.config: /usr/share/logstash/config/conf.d/*.conf
path.logs: /var/log/logstash
# reload the pipeline automatically when the config changes
config.reload.automatic: true
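Because path.config points at a glob, all matching .conf files are concatenated in lexicographic order, so a numeric prefix convention keeps inputs before filters before outputs. The load order is a plain string sort (illustrated with hypothetical file names):

```python
# hypothetical conf.d contents; sorted() mirrors the Logstash load order
files = ["30-output.conf", "10-input.conf", "20-filter.conf", "beats.conf"]
print(sorted(files))
# -> ['10-input.conf', '20-filter.conf', '30-output.conf', 'beats.conf']
```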

2. beats.conf

Full path of beats.conf: /data/public/logstash/config/conf.d/beats.conf

conf
input {
    file {
        path => ["/data/logs/access_*.log"]
        type => "nginx"
        start_position => "end"
    }
}
filter{
    grok{
        match => ['message','%{TIMESTAMP_ISO8601:logdate}']
    }
    json{
        source => "message"
        remove_field => ["message"]
    }
    ruby {
        code => "
            event.set('@timestamp', LogStash::Timestamp.at(event.get('@timestamp').time.localtime + 8*60*60))
        "
    }
    mutate {
        add_field => { "log_time" => "%{+YYYY-MM-dd HH:mm:ss}" }
        remove_field => ["@version"]
    }
}
output {
    kafka {
        bootstrap_servers => "192.168.1.101:9092"
        topic_id => "nginx-logs"
        codec => json
        acks => "all"
        retries => 5
        batch_size => 16384
    }
}
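The ruby filter above shifts @timestamp forward by 8 hours so the stored time reads as CST (UTC+8), since Logstash normalizes @timestamp to UTC internally. The arithmetic can be sketched in Python (sample timestamp assumed):

```python
from datetime import datetime, timedelta

# Logstash keeps @timestamp in UTC; the ruby filter adds 8 hours
utc_ts = datetime(2025, 7, 21, 9, 5, 21)   # 2025-07-21T09:05:21 UTC
shifted = utc_ts + timedelta(hours=8)
print(shifted.isoformat())  # -> 2025-07-21T17:05:21
```

Note this bakes the offset into the stored value; the alternative is to leave @timestamp in UTC and let the viewing tool (e.g. Kibana) apply the browser's timezone.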
  • Input configuration

File input: the file plugin reads events from files; common options are path (file path, wildcards supported, e.g. /var/log/*.log) and start_position (where to begin reading: beginning or end).

Beats input: receives data from clients such as Filebeat; just set port (listening port).

HTTP input: receives data via HTTP requests; set port (listening port).

  • Filter configuration

grok filter: parses unstructured logs with regular expressions, e.g. match => { "message" => "%{COMBINEDAPACHELOG}" }.

mutate filter: modifies fields, e.g. add_field to add a field, remove_field to drop one.

date filter: parses date fields, e.g. match => [ "timestamp", "ISO8601" ].

  • Output configuration

Elasticsearch output: sends data to Elasticsearch; set hosts (ES addresses) and index (index name).

File output: writes data to a file; set path (file path).

  • Global settings

path.config: path to the pipeline config; a directory loads all .conf files in lexicographic order.

pipeline.workers: number of parallel worker threads; defaults to the number of CPU cores.

pipeline.batch.size: maximum number of events a single worker thread processes per batch; defaults to 125.

Output to Elasticsearch

To write to both Elasticsearch and Kafka at the same time, just add an elasticsearch block to the output:

conf
output {
    elasticsearch {
        hosts => ["http://192.168.1.101:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
        user => ""
        password => ""
        manage_template => true
        template => "/usr/share/logstash/config/template/logs.json"
        template_overwrite => true 
    }
    kafka {
        bootstrap_servers => "192.168.1.101:9092"
        topic_id => "test-logs"
        codec => json
        acks => "all"
        retries => 5
        batch_size => 16384
    }
}    
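The index option logstash-%{+YYYY.MM.dd} produces one index per day, derived from each event's @timestamp. The resulting name can be sketched with strftime (sample date assumed):

```python
from datetime import date

# mirror the Logstash sprintf date format %{+YYYY.MM.dd}
event_date = date(2025, 7, 21)
index = f"logstash-{event_date.strftime('%Y.%m.%d')}"
print(index)  # -> logstash-2025.07.21
```

Daily indices keep each index small and make retention simple: old days can be dropped by deleting whole indices.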

Start the container

bash
docker run --log-driver json-file --log-opt max-size=100m --log-opt max-file=2 \
--privileged=true --network=host --name logstash \
-v /data/public/logstash/config/:/usr/share/logstash/config/ \
-v /data/public/nginx/logs/:/data/logs/ \
-d logstash:8.4.0

Once the container is running, check whether log events are arriving in Kafka.

Common issues

If Logstash reports permission errors after startup, run the following:

bash
# grant access to the configuration files
chmod 777 -R /data/public/logstash/config/
# grant access to the log files
chmod 777 /data/public/nginx/logs/access_*.log

If nginx creates a new log file each day, the permission change is lost on rotation; add a cron entry to reapply it periodically:

bash
crontab -e
# reapply permissions so Logstash can read the nginx access logs
* * * * * chmod 777 /data/public/nginx/logs/access_*.log