Elasticsearch 7.x Production Issues Collection (1)

Problem description:

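The health status of the production Elasticsearch cluster turned red, and some shards could not be assigned. The relevant part of the allocation explain result (GET /_cluster/allocation/explain) is excerpted below: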

    "note": "No shard was specified in the explain API request, so this response explains a randomly chosen unassigned shard. There may be other unassigned shards in this cluster which cannot be assigned for different reasons. It may not be possible to assign this shard until one of the other shards is assigned correctly. To explain the allocation of other shards (whether assigned or unassigned) you must specify the target shard in the request to this API.",
    "index": "demo-2022.02.06",
    "shard": 3,
    "primary": true,
    "current_state": "unassigned",
    "unassigned_info": {
        "reason": "CLUSTER_RECOVERED",
        "at": "2023-05-29T08:08:22.697Z",
        "last_allocation_status": "no_valid_shard_copy"
    },
    "can_allocate": "no_valid_shard_copy",
    "allocate_explanation": "cannot allocate because all found copies of the shard are either stale or corrupt",
    ...
            "store": {
                "in_sync": true,
                "allocation_id": "82iRvG0KTTm9NT_5Fx8BRA",
                "store_exception": {
                    "type": "corrupt_index_exception",
                    "reason": "failed engine (reason: [corrupt file (source: [start])]) (resource=preexisting_corruption)",
                    "caused_by": {
                        "type": "i_o_exception",
                        "reason": "failed engine (reason: [corrupt file (source: [start])])",
                        "caused_by": {
                            "type": "corrupt_index_exception",
                            "reason": "checksum passed (d87020fd). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path=\"/data/es/data/nodes/0/indices/dzcoAoZjSzGus0qj1sKTFg/3/index/segments_6\")))"
                        }
                    }
                }
            }
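
The nested caused_by entries in the excerpt above are easier to read once flattened. The following is a minimal sketch, not part of the original troubleshooting session: the helper name exception_chain is hypothetical, and the sample dictionary is a copy of the excerpt with the last reason shortened.

```python
from typing import Any, Dict, List


def exception_chain(store_exception: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten a nested store_exception into a list, outermost exception first."""
    chain: List[Dict[str, str]] = []
    current = store_exception
    while current is not None:
        chain.append({
            "type": current.get("type", ""),
            "reason": current.get("reason", ""),
        })
        # Each level may carry the next cause under "caused_by".
        current = current.get("caused_by")
    return chain


# Copy of the store_exception from the excerpt; the innermost reason is shortened.
sample = {
    "type": "corrupt_index_exception",
    "reason": "failed engine (reason: [corrupt file (source: [start])]) (resource=preexisting_corruption)",
    "caused_by": {
        "type": "i_o_exception",
        "reason": "failed engine (reason: [corrupt file (source: [start])])",
        "caused_by": {
            "type": "corrupt_index_exception",
            "reason": "checksum passed (d87020fd). possibly transient resource issue, or a Lucene or JVM bug",
        },
    },
}

if __name__ == "__main__":
    for depth, entry in enumerate(exception_chain(sample)):
        print(f"{'  ' * depth}{entry['type']}: {entry['reason']}")
```

The last element of the returned list is the innermost cause, which here is the corrupt_index_exception raised while reading the segments_6 file.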

Solution:

  1. Step 1: Check the shard stores

Run GET /_shard_stores?pretty to get the details of the corrupted shard copies, so that they can be repaired. (Figure: result of the request.)
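
On a cluster with many indices the response can be long, so it may help to filter it down to the copies that report a store_exception. The sketch below assumes the response layout documented for the index shard stores API (indices, then shards, then a stores list). The function name corrupted_copies and the sample response are hypothetical; only the index name, shard number, and allocation ID are taken from the explain excerpt.

```python
from typing import Any, Dict, List


def corrupted_copies(shard_stores: Dict[str, Any]) -> List[Dict[str, Any]]:
    """List every shard copy in a _shard_stores response that has a store_exception."""
    found: List[Dict[str, Any]] = []
    for index_name, index_info in shard_stores.get("indices", {}).items():
        for shard_id, shard_info in index_info.get("shards", {}).items():
            for store in shard_info.get("stores", []):
                exc = store.get("store_exception")
                if exc is None:
                    continue  # healthy copy, nothing to report
                found.append({
                    "index": index_name,
                    "shard": int(shard_id),
                    "allocation_id": store.get("allocation_id"),
                    "exception_type": exc.get("type"),
                })
    return found


# Hypothetical response: shard 0 is healthy, shard 3 has a corrupted copy.
sample_response = {
    "indices": {
        "demo-2022.02.06": {
            "shards": {
                "0": {"stores": [{"allocation_id": "healthy-copy", "allocation": "primary"}]},
                "3": {"stores": [{
                    "allocation_id": "82iRvG0KTTm9NT_5Fx8BRA",
                    "allocation": "unused",
                    "store_exception": {"type": "corrupt_index_exception", "reason": "..."},
                }]},
            }
        }
    }
}

if __name__ == "__main__":
    for copy in corrupted_copies(sample_response):
        print(copy)
```

The index and shard values found this way are the ones to use in the reroute command.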

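  2. Step 2: Reroute the index

The allocate_empty_primary command allocates a new, empty primary shard, so any data previously stored in that shard is lost. This is why accept_data_loss must be set to true. Replace index, shard, and {nodename} with the values of the affected shard and the target node.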

    POST /_cluster/reroute?master_timeout=5m
    {
      "commands": [
        {
          "allocate_empty_primary": {
            "index": "demo-2023.04.04",
            "shard": 2,
            "node": "{nodename}",
            "accept_data_loss": true
          }
        }
      ]
    }
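
When several shards need the same treatment, the request body can be generated rather than edited by hand. This is a minimal sketch using only the Python standard library. The function names and the base_url argument (for example http://localhost:9200) are assumptions, and it presumes a cluster reachable over plain HTTP without authentication.

```python
import json
import urllib.request
from typing import Any, Dict


def build_empty_primary_command(index: str, shard: int, node: str) -> Dict[str, Any]:
    """Build a reroute body that allocates an empty primary (the shard's data is lost)."""
    return {
        "commands": [
            {
                "allocate_empty_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "accept_data_loss": True,
                }
            }
        ]
    }


def send_reroute(base_url: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """POST the body to /_cluster/reroute; base_url is a hypothetical cluster address."""
    request = urllib.request.Request(
        f"{base_url}/_cluster/reroute?master_timeout=5m",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Client-side timeout slightly above the 5 minute master_timeout.
    with urllib.request.urlopen(request, timeout=330) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the body; call send_reroute explicitly once the values are verified.
    body = build_empty_primary_command("demo-2023.04.04", 2, "{nodename}")
    print(json.dumps(body, indent=2))
```

Running the script only prints the body. Since the command discards data, sending it is left as an explicit call to send_reroute.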
