ES API: Batch Operations with the Bulk API

bulk is an API provided by Elasticsearch for creating, updating, and deleting documents in batches.

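The bulk API is strict about the format of its JSON body. Each JSON object must sit on a single line with no line breaks inside it, and adjacent JSON objects must be separated by a newline (\n on Linux, \r\n on Windows). The last line of the body must also end with a newline. Every operation is written as a pair of JSON lines, an action line followed by a source line, except delete, which needs only the action line.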
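The formatting rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of any Elasticsearch client: the helper name `build_bulk_body` and the `(action, metadata, source)` tuple layout are assumptions made for this example. It relies on the fact that `json.dumps` without an `indent` argument never emits a raw newline, so each object stays on one line.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) triples into a bulk request body.

    `source` is None for delete, which has no second line.
    (Hypothetical helper for illustration only.)
    """
    lines = []
    for action, metadata, source in operations:
        # Action line, e.g. {"index": {"_index": "ad", "_id": "6"}}
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if action == "delete":
            continue  # delete consists of the action line alone
        if source is None:
            raise ValueError(f"{action} requires a source line")
        lines.append(json.dumps(source, ensure_ascii=False))
    # Separate objects with \n and terminate the last line with \n as well.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", {"_index": "ad", "_id": "6"}, {"name": "bulk"}),
    ("delete", {"_index": "ad", "_id": "1"}, None),
    ("update", {"_index": "ad", "_id": "3"}, {"doc": {"name": "huawei p20"}}),
])
print(body, end="")
```

The resulting string could be sent as the body of a `_bulk` request by whatever HTTP client is in use; building it in one place makes it harder to forget the trailing newline or to pretty-print an object across several lines.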

The action must be one of the following:

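| Action | Description |
| --- | --- |
| create | Creates the document if it does not exist; returns an error if it already exists. |
| index | Creates the document if it does not exist; if it exists, replaces it and increments _version by 1. |
| update | Updates an existing document; returns an error if the document does not exist. |
| delete | Deletes a document. If the id does not exist, the item is reported as not_found (status 404). |

The _bulk endpoint accepts both POST and PUT requests.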

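As the table suggests, index is the most commonly used action. A bulk request is not atomic and cannot be used to implement transactions. Each operation is executed separately, so the success or failure of one does not affect the others.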

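How much data can a single bulk request handle?

A bulk request is loaded into memory before it is processed, so its size is limited. There is no single optimal size: it depends on your hardware, the size and complexity of your documents, and the indexing and search load on the cluster.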

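A common recommendation is 1,000 to 5,000 documents, or 5 to 15 MB, per request. By default a request cannot exceed 100 MB; this limit can be changed in the ES configuration file (config/elasticsearch.yml under $ES_HOME) through the http.max_content_length setting.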
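One way to stay within such limits is to split a long list of operations into several request bodies on the client side. The sketch below is only an illustration: the name `chunk_bulk_bodies` is hypothetical, and the defaults of 1,000 operations and 5 MB are picked from the low end of the ranges above rather than taken from any official guidance. A single operation larger than `max_bytes` is still emitted on its own.

```python
import json


def chunk_bulk_bodies(operations, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield bulk bodies with at most max_docs operations and about max_bytes bytes.

    `operations` is an iterable of (action_line, source_line) dict pairs;
    source_line is None for delete. (Hypothetical helper for illustration.)
    """
    batch, batch_docs, batch_bytes = [], 0, 0
    for action_line, source_line in operations:
        piece = json.dumps(action_line) + "\n"
        if source_line is not None:
            piece += json.dumps(source_line) + "\n"
        size = len(piece.encode("utf-8"))
        # Flush the current batch before it would exceed either limit.
        if batch and (batch_docs >= max_docs or batch_bytes + size > max_bytes):
            yield "".join(batch)
            batch, batch_docs, batch_bytes = [], 0, 0
        batch.append(piece)
        batch_docs += 1
        batch_bytes += size
    if batch:
        yield "".join(batch)


ops = [
    ({"index": {"_index": "example", "_id": i}}, {"id": i, "counter": i * 10})
    for i in range(1, 8)
]
bodies = list(chunk_bulk_bodies(ops, max_docs=3))
print([b.count("\n") // 2 for b in bodies])  # operations per body: [3, 3, 1]
```

Both limits are worth tuning against the real cluster: start small, increase the batch size, and stop when throughput no longer improves.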

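# A bulk request that mixes several action types. This style is generally not recommended and is rarely used in real projects.
# For create and index, the second line is the document source itself; only update wraps it in "doc".
PUT /_bulk
{ "create" : { "_index" : "ad", "_id" : "6" }}
{ "name" : "bulk" }
{ "index" : { "_index" : "ad", "_id" : "6" }}
{ "name" : "bulk" }
{ "delete":{ "_index" : "ad", "_id" : "1"}}
{ "update":{ "_index" : "ad", "_id" : "3"}}
{ "doc" : {"name" : "huawei p20"}}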

# Response
{
"took" : 77,
# true if any operation in the request failed
"errors" : true,
# the items array lists the result of each operation, in the same order as the request
"items" : [
{
   # create: the document already exists, so this item fails
    "create":{
        "_index":"ad",
        "_type":"_doc",
        "_id":"6",
        "status":409,
        "error":{
            "type":"version_conflict_engine_exception",
            "reason":"[6]: version conflict, document already exists (current version [1])",
            "index_uuid":"90zLKRHyT02kyN148mQpqg",
            "shard":"0",
            "index":"ad"
        }
    }
},
# index: the document already exists, so it is overwritten
{
    "index":{
        "_index":"ad",
        "_type":"_doc",
        "_id":"6",
        "_version":2,
        "result":"updated",
        "_shards":{
            "total":2,
            "successful":1,
            "failed":0
        },
        "_seq_no":11,
        "_primary_term":3,
        "status":200
    }
},
{
    "delete":{
        "_index":"ad",
        "_type":"_doc",
        "_id":"1",
        "_version":2,
        "result":"deleted",
        "_shards":{
            "total":2,
            "successful":1,
            "failed":0
        },
        "_seq_no":12,
        "_primary_term":3,
        "status":200
    }
},
{
    "update":{
        "_index":"ad",
        "_type":"_doc",
        "_id":"3",
        "_version":2,
        "result":"updated",
        "_shards":{
            "total":2,
            "successful":1,
            "failed":0
        },
        "_seq_no":13,
        "_primary_term":3,
        "status":200
    }
}
]
}
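Because the top-level errors flag only says that something failed, client code usually has to walk the items array to find out which operations did. The following Python sketch shows one way to do that; `failed_items` is a hypothetical helper, and the sample dictionary is an abbreviated copy of the response above, not real client output.

```python
def failed_items(response):
    """Return (position, action, _id, status, error type) for each failed item.

    (Hypothetical helper for illustration only.)
    """
    failures = []
    if not response.get("errors"):
        return failures  # nothing failed, no need to scan the items
    for position, item in enumerate(response["items"]):
        # Each item has exactly one key, the action name.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((
                position,
                action,
                result["_id"],
                result["status"],
                result["error"]["type"],
            ))
    return failures


response = {
    "took": 77,
    "errors": True,
    "items": [
        {"create": {"_index": "ad", "_id": "6", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
        {"index": {"_index": "ad", "_id": "6", "status": 200, "result": "updated"}},
        {"delete": {"_index": "ad", "_id": "1", "status": 200, "result": "deleted"}},
        {"update": {"_index": "ad", "_id": "3", "status": 200, "result": "updated"}},
    ],
}
print(failed_items(response))
# [(0, 'create', '6', 409, 'version_conflict_engine_exception')]
```

Since the items come back in request order, the position can be used to look up the original operation, for example to log it or to resend only the operations that failed.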
Preparing test data

# Prepare test data: create the index, then define its field mapping
PUT example
PUT example/_mapping
{
    "properties":{
        "id":{
            "type":"long"
        },
        "name":{
            "type":"text"
        },
        "counter":{
            "type":"integer"
        },
        "tags":{
            "type":"text"
        }
    }
}
Bulk insert

# Bulk insert
POST /example/_bulk
{"index": {"_id": 1}}
{"id":1, "name":"admin", "counter":10, "tags":["red", "black"]}
{"index": {"_id": 2}}
{"id":2, "name":"张三", "counter":20, "tags":["green", "purple"]}
{"index": {"_id": 3}}
{"id":3, "name":"李四", "counter":30, "tags":["red", "blue"]}
{"index": {"_id": 4}}
{"id":4, "name":"tom", "counter":40, "tags":["orange"]}

# Response
{
    "took":7,
    "errors":false,
    "items":[
        {
            "index":{
                "_index":"example",
                "_type":"_doc",
                "_id":"1",
                "_version":1,
                "result":"created",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":0,
                "_primary_term":1,
                "status":201
            }
        },
        {
            "index":{
                "_index":"example",
                "_type":"_doc",
                "_id":"2",
                "_version":1,
                "result":"created",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":1,
                "_primary_term":1,
                "status":201
            }
        },
        {
            "index":{
                "_index":"example",
                "_type":"_doc",
                "_id":"3",
                "_version":1,
                "result":"created",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":2,
                "_primary_term":1,
                "status":201
            }
        },
        {
            "index":{
                "_index":"example",
                "_type":"_doc",
                "_id":"4",
                "_version":1,
                "result":"created",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":3,
                "_primary_term":1,
                "status":201
            }
        }
    ]
}

Bulk update

# Bulk update: partial document, script, and doc_as_upsert variants
POST /example/_bulk
{"update": {"_id": 1}}
{"doc": {"id":1, "name": "admin-02", "counter":11}}
{"update": {"_id": 2}}
{"script":{"lang":"painless","source":"ctx._source.counter += params.num","params":{"num":2}}}
{"update":{"_id": 3}}
{"doc": {"name": "test3333name", "counter": 999}}
{"update":{"_id": 4}}
{"doc": {"name": "test444name", "counter": 888}, "doc_as_upsert" : true}

# Response
{
    "took":149,
    "errors":false,
    "items":[
        {
            "update":{
                "_index":"example",
                "_type":"_doc",
                "_id":"1",
                "_version":2,
                "result":"updated",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":4,
                "_primary_term":1,
                "status":200
            }
        },
        {
            "update":{
                "_index":"example",
                "_type":"_doc",
                "_id":"2",
                "_version":2,
                "result":"updated",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":5,
                "_primary_term":1,
                "status":200
            }
        },
        {
            "update":{
                "_index":"example",
                "_type":"_doc",
                "_id":"3",
                "_version":2,
                "result":"updated",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":6,
                "_primary_term":1,
                "status":200
            }
        },
        {
            "update":{
                "_index":"example",
                "_type":"_doc",
                "_id":"4",
                "_version":2,
                "result":"updated",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":7,
                "_primary_term":1,
                "status":200
            }
        }
    ]
}
Bulk delete

# Bulk delete
POST /example/_bulk
{"delete": {"_id": 1}}
{"delete": {"_id": 2}}
{"delete": {"_id": 3}}
{"delete": {"_id": 4}}

# Response
{
    "took":7,
    "errors":false,
    "items":[
        {
            "delete":{
                "_index":"example",
                "_type":"_doc",
                "_id":"1",
                "_version":3,
                "result":"deleted",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":8,
                "_primary_term":1,
                "status":200
            }
        },
        {
            "delete":{
                "_index":"example",
                "_type":"_doc",
                "_id":"2",
                "_version":3,
                "result":"deleted",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":9,
                "_primary_term":1,
                "status":200
            }
        },
        {
            "delete":{
                "_index":"example",
                "_type":"_doc",
                "_id":"3",
                "_version":3,
                "result":"deleted",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":10,
                "_primary_term":1,
                "status":200
            }
        },
        {
            "delete":{
                "_index":"example",
                "_type":"_doc",
                "_id":"4",
                "_version":3,
                "result":"deleted",
                "_shards":{
                    "total":2,
                    "successful":1,
                    "failed":0
                },
                "_seq_no":11,
                "_primary_term":1,
                "status":200
            }
        }
    ]
}

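Note also that ES can store data even when no mapping has been defined in advance.

In that case, the type of each field is fixed when the first document containing that field is indexed.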
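To make this concrete, the sketch below imitates a few of Elasticsearch's default dynamic mapping rules in plain Python. It is a simplified model and not Elasticsearch code: the function names are hypothetical, date detection on strings is ignored, and the keyword sub-field that Elasticsearch adds to dynamically mapped text fields is only mentioned in a comment.

```python
def guess_field_type(value):
    """Toy version of a few default dynamic mapping rules (illustration only)."""
    if isinstance(value, bool):   # test bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"             # Elasticsearch also adds a keyword sub-field
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # An array takes the type of its first non-null element.
        for element in value:
            if element is not None:
                return guess_field_type(element)
    return None                   # null or empty array: no field is added yet


def infer_mapping(document):
    """Map each field of the first document to its guessed type."""
    mapping = {}
    for field, value in document.items():
        field_type = guess_field_type(value)
        if field_type is not None:
            mapping[field] = field_type
    return mapping


first_doc = {"id": 1, "name": "admin", "counter": 10, "tags": ["red", "black"]}
mapping = infer_mapping(first_doc)
print(mapping)
# {'id': 'long', 'name': 'text', 'counter': 'long', 'tags': 'text'}
```

In this model counter comes out as long, whereas the explicit mapping in the test data section declares it as integer. That difference is one reason to define the mapping up front when the field types matter.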
