Elasticsearch Error Notes


I. fielddata load exceeds the limit and trips the ES circuit breaker

1. Symptom

Request sent to ES:

POST /indexname/_search

{"_source":["@timestamp","Resource.k8s.cluster.name","Resource.k8s.pod.name","Resource.k8s.namespace.name","Resource.logservice.project","Resource.service.name","Body"],"query":{"bool":{"must":[{"term":{"Resource.k8s.cluster.name":"xxx"}},{"term":{"Resource.k8s.pod.name":"mypod-42wjd"}},{"term":{"Resource.service.name":"my-service-impl"}},{"bool":{"should":[{"match":{"Resource.k8s.namespace.name":"my-ns"}}]}}],"filter":[{"range":{"@timestamp":{"gte":"2025-11-12T10:27:26+08:00","lt":"2025-11-12T11:15:26+08:00"}}}]}},"size":500,"search_after":[1762484400260,"qEVBXJoBwBNcCGsGtsZX"],"sort":[{"@timestamp":{"order":"desc"},"_id":{"order":"desc"}}],"track_total_hits":true}

Response:

{
  "error": {
    "root_cause": [
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8846062101,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      },
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8846509299,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      },
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8841863228,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      },
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8615344024,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "otel-rrsoms-2025.11.12",
        "node": "nNtQiwxhQjysG6A9K73Jfw",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8846062101,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      },
      {
        "shard": 1,
        "index": "otel-xxx-2025.11.12",
        "node": "nNtQiwxhQjysG6A9K73Jfw",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8846509299,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      },
      {
        "shard": 2,
        "index": "otel-xxx-2025.11.12",
        "node": "jLEV30xQTjWvjxWSRX6lnQ",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8841863228,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      },
      {
        "shard": 3,
        "index": "otel-xxx-2025.11.12",
        "node": "jLEV30xQTjWvjxWSRX6lnQ",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8615344024,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      }
    ],
    "caused_by": {
      "type": "circuit_breaking_exception",
      "reason": "[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]",
      "bytes_wanted": 8846062101,
      "bytes_limit": 8589934592,
      "durability": "PERMANENT"
    }
  },
  "status": 500
}

2. Cause

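This is a protective rejection by Elasticsearch's circuit breaker mechanism.

Specifically:

The search tries to load the _id field into fielddata memory, because both sort and search_after include _id, i.e. results are ordered by _id as a tiebreaker;

with that load, estimated fielddata usage would reach about 8.2 GB;

the fielddata breaker limit is 8 GB (by default, 40% of the JVM heap);

so ES refuses to run the request rather than risk the node crashing with an OOM.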
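The figures in the points above can be read directly from the breaker's reason string. A minimal sketch, assuming the message keeps the format seen in the response earlier; `parse_breaker_reason` is a hypothetical helper name, not part of any Elasticsearch client:

```python
import re

# Pattern for the reason string format shown in the response above.
_REASON = re.compile(
    r"\[(?P<breaker>\w+)\] Data too large, data for \[(?P<field>[^\]]+)\] "
    r"would be \[(?P<wanted>\d+)/[^\]]+\], which is larger than the limit of "
    r"\[(?P<limit>\d+)/[^\]]+\]"
)


def parse_breaker_reason(reason: str) -> dict:
    """Extract breaker name, field name, and byte counts from a reason string."""
    m = _REASON.search(reason)
    if m is None:
        raise ValueError("unrecognized circuit breaker message")
    return {
        "breaker": m["breaker"],
        "field": m["field"],
        "bytes_wanted": int(m["wanted"]),
        "bytes_limit": int(m["limit"]),
    }


reason = (
    "[fielddata] Data too large, data for [_id] would be "
    "[8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]"
)
info = parse_breaker_reason(reason)
print(info)
```

Running this over every entry in root_cause makes it easy to confirm that all failing shards name the same breaker and the same field.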

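Fielddata is normally associated with text fields: sorting on a text field, using one in a terms aggregation, or reading its values from a script. _id is a special metadata field, though. It has no doc values, so nothing is loaded for it by default, but as soon as a sort, aggregation, or script references _id, ES builds fielddata for it on the heap.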

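My index holds a very large number of documents (on the order of hundreds of millions of log entries);

each _id averages 20 to 30 bytes, so the total can easily exceed 8 GB;

ES estimates that loading _id into fielddata would bring usage to 8.2 GB;

indices.breaker.fielddata.limit defaults to 40% of the JVM heap (assuming ES is given a 20 GB heap, that is an 8 GB limit);

8.2 GB > 8 GB, so circuit_breaking_exception is raised;

the request is rejected immediately and nothing is executed.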
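The arithmetic in this chain can be checked in a few lines. This is only a sketch: the 20 GB heap is the assumption stated above (taken here as 20 GiB), and `fielddata_limit` is an illustrative helper, not an Elasticsearch API.

```python
GIB = 1024 ** 3


def fielddata_limit(heap_bytes: int, percent: int = 40) -> int:
    """Breaker limit as an integer percentage of the JVM heap."""
    return heap_bytes * percent // 100


heap_bytes = 20 * GIB          # assumed heap size, as in the text
limit = fielddata_limit(heap_bytes)
bytes_wanted = 8846062101      # bytes_wanted reported for shard 0 in the response
overrun = bytes_wanted - limit

print("limit:", limit)
print("trips breaker:", bytes_wanted > limit)
print("overrun (MiB):", round(overrun / 1024 ** 2, 1))
```

The computed limit equals the bytes_limit value in the response, which is consistent with a 20 GiB heap and the default 40% setting.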

3. Fix

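Do not use _id for sorting, scripting, or aggregations. If search_after needs a unique tiebreaker, sort on a field that has doc values instead, for example a keyword copy of the document ID.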
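One way to apply this while keeping search_after pagination is to swap the _id sort key for a keyword field backed by doc values. The sketch below assumes each document also stores its ID in a hypothetical keyword field named `doc_id`; that field is not in the original index and would have to be added to the mapping and populated at ingest time. `replace_id_sort` is likewise an illustrative helper.

```python
import copy


def replace_id_sort(body: dict, tiebreaker: str = "doc_id") -> dict:
    """Return a copy of a search body whose sort uses `tiebreaker` instead of _id.

    `doc_id` is a hypothetical keyword field with doc values; it is an
    assumption of this sketch, not a field from the original index.
    """
    new_body = copy.deepcopy(body)
    new_sort = []
    for clause in new_body.get("sort", []):
        if isinstance(clause, str):
            # Shorthand form such as "_id" or "@timestamp".
            new_sort.append(tiebreaker if clause == "_id" else clause)
            continue
        # Object form: one clause may carry several fields, as in the request above.
        for field, options in clause.items():
            new_sort.append({tiebreaker if field == "_id" else field: options})
    new_body["sort"] = new_sort
    return new_body


body = {
    "size": 500,
    "sort": [{"@timestamp": {"order": "desc"}, "_id": {"order": "desc"}}],
}
fixed = replace_id_sort(body)
print(fixed["sort"])
```

With this change, the second value in any search_after cursor should be the `doc_id` of the last hit on the previous page rather than its _id.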

4. Related documentation

fielddata explained
