Elasticsearch Exam Prep -- Index shrink

1. Problem

The index task has 5 primary shards and 1 replica. Run a shrink operation on it so that the resulting index has 1 primary shard and is named task-new.

2. Analysis

Three preconditions must be met before a shrink can run:

The index must be read-only.
A copy of every shard in the index must reside on the same node.
The index must have a green health status.
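The three preconditions can be sketched as a small check over data you already have. This is only an illustration: `shrink_blockers` is a hypothetical helper, not part of any Elasticsearch client, and it assumes `shard_rows` has the shape returned by `GET _cat/shards/task?format=json` (string fields `shard`, `prirep`, `state`, `node`).

```python
def shrink_blockers(write_blocked, health_status, shard_rows, node_name):
    """Return the reasons an index is not ready to shrink (empty list = ready).

    write_blocked: True when index.blocks.write is set on the index.
    health_status: "green", "yellow" or "red" for the index.
    shard_rows:    assumed rows from GET _cat/shards/<index>?format=json.
    node_name:     the node that should hold a copy of every shard.
    """
    blockers = []
    if not write_blocked:
        blockers.append("index is not read-only (index.blocks.write is not true)")
    if health_status != "green":
        blockers.append(f"health is {health_status}, expected green")
    # Shard numbers with a started copy (primary or replica) on node_name.
    on_node = {row["shard"] for row in shard_rows
               if row["node"] == node_name and row["state"] == "STARTED"}
    all_shards = {row["shard"] for row in shard_rows}
    missing = sorted(all_shards - on_node)
    if missing:
        blockers.append(f"shards {missing} have no copy on {node_name}")
    return blockers
```

An empty result means all three conditions hold. Note that a replica on the chosen node counts as a copy, which is why the second condition speaks of "a copy" rather than the primary.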

3. Solution

Step 1: Initialize the index task


# DELETE task
PUT task
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 5
  }
}

POST task/_bulk
{"create":{"_id":1}}
{"a":"key","b":"mom","c":"mom","d":1}
{"create":{"_id":2}}
{"a":"key","b":"cake mix","c":"mom","d":2}
{"create":{"_id":3}}
{"a":"key","b":"mom","c":"cake mix","d":3}
{"create":{"_id":4}}
{"a":"cake mix","b":"mom","c":"mom","d":4}

Inspecting the newly created index with the head plugin shows 5 primary shards spread across three nodes, each with one replica.
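If the head plugin is not available, the same picture can be rebuilt from the cat shards API. The sketch below assumes `shard_rows` comes from `GET _cat/shards/task?format=json`; `shards_by_node` is a hypothetical helper name introduced for illustration.

```python
from collections import defaultdict


def shards_by_node(shard_rows):
    """Group shard copies by node, e.g. {"node-1": ["0p", "3r"]}.

    Each copy is labelled with its shard number plus "p" (primary)
    or "r" (replica), taken from the "prirep" column.
    """
    layout = defaultdict(list)
    for row in shard_rows:
        # Unassigned copies report no node in the JSON output.
        node = row.get("node") or "UNASSIGNED"
        layout[node].append(f'{row["shard"]}{row["prirep"]}')
    return {node: sorted(copies) for node, copies in sorted(layout.items())}
```

For the task index right after creation, the result should list 10 shard copies in total (5 primaries and 5 replicas) spread over the three node names.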

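Step 2: Prepare the index for shrinking

index.number_of_replicas: set the number of replicas to 0
index.routing.allocation.require._name: the name of the node that all shards of the index are relocated to
index.blocks.write: set to true to block writes, which makes the index read-only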


PUT /task/_settings
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.routing.allocation.require._name": "node-1",
    "index.blocks.write": true
  }
}

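These settings satisfy the three preconditions for shrinking. Once relocation finishes, the head plugin shows that the replicas are gone and all 5 shards have been moved to node-1.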
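Relocation runs in the background, so the shrink request should only be sent after the shards have actually arrived on the target node. A minimal polling sketch follows, assuming a caller-supplied `fetch_rows` function (hypothetical) that returns the rows of `GET _cat/shards/task?format=json`, and assuming the replicas have already been removed as in the settings above.

```python
import time


def relocated(shard_rows, node_name):
    """True when there is at least one shard and every one is started on node_name."""
    return bool(shard_rows) and all(
        row["state"] == "STARTED" and row["node"] == node_name
        for row in shard_rows
    )


def wait_for_relocation(fetch_rows, node_name, attempts=30,
                        delay_seconds=2.0, sleep=time.sleep):
    """Poll fetch_rows() until all shards sit on node_name or attempts run out.

    fetch_rows is a hypothetical callable supplied by the caller; sleep is
    injectable so the loop can be exercised without real waiting.
    """
    for attempt in range(attempts):
        if relocated(fetch_rows(), node_name):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

The attempt count and delay are arbitrary defaults; a large index may need a much longer window.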

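Step 3: Run the shrink

In the request path, the name before _shrink is the source index and the name after it is the new target index.

The settings block specifies the parameters of the target index:

"index.number_of_replicas": 1 gives the target index one replica
"index.number_of_shards": 1 gives the target index one primary shard
"index.codec": "best_compression" selects the codec for stored data; best_compression trades slower stored-field reads for a higher compression ratio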


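# The target index is named task-new, as the problem requires.
# The two null settings clear the allocation filter and the write block
# copied from the source index, so the replica can be assigned to another
# node and task-new accepts writes.
POST /task/_shrink/task-new
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1,
    "index.codec": "best_compression",
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}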
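One more constraint applies to the request above: the number of primary shards in the target index must be a factor of the number of primary shards in the source index. The sketch below lists the allowed targets; `valid_shrink_targets` is a hypothetical helper introduced for illustration.

```python
def valid_shrink_targets(source_shards):
    """Primary shard counts that an index with source_shards primaries can shrink to.

    A target is valid when it is smaller than the source count
    and divides it evenly.
    """
    return [n for n in range(1, source_shards) if source_shards % n == 0]


print(valid_shrink_targets(5))   # [1]
print(valid_shrink_targets(8))   # [1, 2, 4]
```

Because 5 is prime, the task index can only be shrunk to a single primary shard, which is exactly what the problem asks for.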

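4. Summary

All three conditions must hold at the same time before shrinking:

The index must be read-only
A copy of every shard must reside on a single node (removing the replicas first is optional, but leaves fewer shard copies to relocate)
The index health status must be green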

References

Shrink index API | Elasticsearch Guide [8.15] | Elastic
