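Anyone who queries logs in Elasticsearch has probably run into out-of-order logs. By default, the `@timestamp` of an event reflects when the line was collected and shipped to ES, not the time printed in the log line itself, so when several files are being written to ES at once some reordering is almost unavoidable.
The fix is to extract the time from the log line and use it as the event's timestamp. Here is how to do it, step by step: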
- Use dissect to extract the date and time from the log line

      - dissect:
          # Adapt the tokenizer to your own log format; it does not have to match mine
          tokenizer: "%{log_date} %{log_time} %{message}"
          field: "message"
          target_prefix: "extracted"

- Use script to build the new timestamp and handle the failure cases

      # The trailing 'Z' treats the time in the log as UTC
      - script:
          lang: javascript
          source: >
            function process(event) {
              try {
                var date = event.Get("extracted.log_date");
                var time = event.Get("extracted.log_time");
                if (date && time) {
                  var timestamp = new Date(date + 'T' + time + 'Z');
                  if (isNaN(timestamp.getTime())) {
                    return;
                  }
                  event.Put("@timestamp", timestamp);
                }
              } catch (e) {
                event.Put("error", e.message);
              }
            }

- Finally, use timestamp to replace the original timestamp

      - timestamp:
          field: "@timestamp"
          # Layouts are written as Go reference-time layouts
          layouts:
            - '2006-01-02T15:04:05Z'
            - '2006-01-02T15:04:05.999Z'
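To check the date handling without starting Filebeat, the same string concatenation can be tried in plain JavaScript. The sketch below is only an illustration: `buildTimestamp` is a hypothetical helper, not part of Filebeat, and it assumes the log prints `YYYY-MM-DD` and `HH:MM:SS` with optional `.mmm` milliseconds, in UTC (which is what the appended `Z` implies).

```javascript
// Hypothetical helper (not a Filebeat API): combines the extracted date and
// time strings into a Date, or returns null when the input is unusable.
function buildTimestamp(date, time) {
    // Assumed formats: "2024-01-15" and "10:30:45" or "10:30:45.123".
    var dateOk = /^\d{4}-\d{2}-\d{2}$/.test(date);
    var timeOk = /^\d{2}:\d{2}:\d{2}(\.\d{3})?$/.test(time);
    if (!dateOk || !timeOk) {
        return null;
    }
    // The trailing "Z" marks the value as UTC.
    var timestamp = new Date(date + "T" + time + "Z");
    return isNaN(timestamp.getTime()) ? null : timestamp;
}

// A well-formed pair yields a Date; a comma before the milliseconds is rejected.
console.log(buildTimestamp("2024-01-15", "10:30:45.123").toISOString());
console.log(buildTimestamp("2024-01-15", "10:30:45,123"));
```

If the application logs in local time rather than UTC, the `Z` would need to be replaced by the matching offset, for example `+08:00`.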
Full configuration:
    # filebeat.yml
    filebeat.inputs:
      - type: log
        enabled: true
        paths:
          - /data/app/folder1/logs/aa.log
          - /data/app/folder2/logs/bb.log
          - /data/app/folder3/logs/cc.log
          - /data/app/folder4/logs/dd.log
          - /data/app/folder5/logs/ee.log
        # Multiline handling (merges a multi-line exception into a single log record)
        multiline:
          pattern: '^\d{4}-\d{2}-\d{2}'
          negate: true
          match: after

    processors:
      - dissect:
          tokenizer: "%{log_date} %{log_time} %{message}"
          field: "message"
          target_prefix: "extracted"
      # The trailing 'Z' treats the time in the log as UTC
      - script:
          lang: javascript
          source: >
            function process(event) {
              try {
                var date = event.Get("extracted.log_date");
                var time = event.Get("extracted.log_time");
                if (date && time) {
                  var timestamp = new Date(date + 'T' + time + 'Z');
                  if (isNaN(timestamp.getTime())) {
                    return;
                  }
                  event.Put("@timestamp", timestamp);
                }
              } catch (e) {
                event.Put("error", e.message);
              }
            }
      - timestamp:
          field: "@timestamp"
          # Layouts are written as Go reference-time layouts
          layouts:
            - '2006-01-02T15:04:05Z'
            - '2006-01-02T15:04:05.999Z'
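Before rolling the configuration out, it can help to dry-run the script logic against a stand-in event. The following is a rough sketch, not Filebeat code: `FakeEvent` is a hypothetical mock that imitates only `Get` (with dotted paths) and `Put`, and the field names `extracted.log_date` and `extracted.log_time` assume the dissect settings used in this post.

```javascript
// Hypothetical mock of the event object; only Get and Put are imitated.
function FakeEvent(fields) {
    this.fields = fields;
}
FakeEvent.prototype.Get = function (path) {
    // Walk a dotted path such as "extracted.log_date"; null when absent.
    var parts = path.split(".");
    var node = this.fields;
    for (var i = 0; i < parts.length; i++) {
        if (node === null || typeof node !== "object" || !(parts[i] in node)) {
            return null;
        }
        node = node[parts[i]];
    }
    return node;
};
FakeEvent.prototype.Put = function (key, value) {
    // Flat keys are enough for this sketch.
    this.fields[key] = value;
};

// Same flow as the script processor: build the Date, skip bad input.
function process(event) {
    try {
        var date = event.Get("extracted.log_date");
        var time = event.Get("extracted.log_time");
        if (date && time) {
            var timestamp = new Date(date + "T" + time + "Z");
            if (isNaN(timestamp.getTime())) {
                return;
            }
            event.Put("@timestamp", timestamp);
        }
    } catch (e) {
        event.Put("error", e.message);
    }
}

// An event with both fields gets a new @timestamp.
var ok = new FakeEvent({ extracted: { log_date: "2024-01-15", log_time: "10:30:45.123" } });
process(ok);
console.log(ok.fields["@timestamp"].toISOString()); // 2024-01-15T10:30:45.123Z

// An event without the extracted fields is left untouched.
var missing = new FakeEvent({ message: "no date here" });
process(missing);
console.log("@timestamp" in missing.fields); // false
```

The second case mirrors what happens to lines that dissect could not split: the event keeps whatever timestamp it already had.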