一、安装
二、主要配置
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| - type: log
``# Change to true to enable this input configuration.
``enabled: true
``# Paths that should be crawled and fetched. Glob based paths.
``paths:
``- /home/centos/pip_v2.csv #源路径
``#- c:\programdata\elasticsearch\logs\*
``#exclude_lines: ["^Restaurant Name,"] #第一行为字段头以"Restaurant Name"开头,不要第一行
``multiline:
``pattern: ^\d{4}
``#pattern: ',\d+,[^\",]+$'
``negate: true
``match: after
``max_lines: 1000
``timeout: 30s
|
三、关于elastic的pipline
https://hacpai.com/article/1512990272091
我简单介绍主流程,详情见上链接
1.开启数据预处理,node.ingest: true
2.向es提交pipline,并命名为my-pipeline-id
PUT _ingest/pipeline/my-pipeline-id
{
"description" : "describe pipeline",
"processors" : [
{
"set" : {
"field": "foo",
"value": "bar"
}
}
]
}
3.以上pipline的作用
若产生新的数据,会新增一个字段为foo:bar
4.curl的pipline即时测试
POST _ingest/pipeline/_simulate
是一个测试接口,提供pipline的规则和测试数据,返回结果数据
四、关于grok
是pipline中的正则匹配模式,以上规则的复杂版
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| POST _ingest/pipeline/_simulate
{
``"pipeline": {
``"description": "grok processor",
``"processors" : [
``{
``"grok": {
``"field": "message",
``"patterns": ["%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"]
``}
``}
``]
``},
``"docs": [
``{
``"_index": "index",
``"_type": "type",
``"_id": "id",
``"_source": {
``"message": "55.3.244.1 GET /index.html 15824 0.043"
``}
``}
``]
}
|
五、使用pipline导入csv
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| utput.elasticsearch:
``# Array of hosts to connect to.
``hosts: ["localhost:9200"]
``#index: "csvindex"
``pipline: "my-pipeline-id"
```# Protocol - either
http(default) or
https.``
``#protocol: "https"` |
测试结果pipline配置后,并没生效。
六、结论
1.filebeat 导入csv的资料很少,主要为pipline方式,测试几个失败。
2.J和数据组并没有filebaeat 导入csv的成功案例。J不太建议使用
结论:filebeat导csv并不方便,建议采用logstash。
一般日志收集可使用logstash,每行的信息会存到message中