Elasticsearch Error Log


I. Fielddata to load is too large and Elasticsearch trips the circuit breaker

1. Symptom

Request sent to Elasticsearch:

POST /indexname/_search

{"_source":["@timestamp","Resource.k8s.cluster.name","Resource.k8s.pod.name","Resource.k8s.namespace.name","Resource.logservice.project","Resource.service.name","Body"],"query":{"bool":{"must":[{"term":{"Resource.k8s.cluster.name":"xxx"}},{"term":{"Resource.k8s.pod.name":"mypod-42wjd"}},{"term":{"Resource.service.name":"my-service-impl"}},{"bool":{"should":[{"match":{"Resource.k8s.namespace.name":"my-ns"}}]}}],"filter":[{"range":{"@timestamp":{"gte":"2025-11-12T10:27:26+08:00","lt":"2025-11-12T11:15:26+08:00"}}}]}},"size":500,"search_after":[1762484400260,"qEVBXJoBwBNcCGsGtsZX"],"sort":[{"@timestamp":{"order":"desc"},"_id":{"order":"desc"}}],"track_total_hits":true}

Response:

{
  "error": {
    "root_cause": [
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8846062101,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      },
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8846509299,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      },
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8841863228,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      },
      {
        "type": "circuit_breaking_exception",
        "reason": "[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]",
        "bytes_wanted": 8615344024,
        "bytes_limit": 8589934592,
        "durability": "PERMANENT"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "otel-rrsoms-2025.11.12",
        "node": "nNtQiwxhQjysG6A9K73Jfw",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8846062101,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      },
      {
        "shard": 1,
        "index": "otel-xxx-2025.11.12",
        "node": "nNtQiwxhQjysG6A9K73Jfw",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8846509299/8.2gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8846509299,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      },
      {
        "shard": 2,
        "index": "otel-xxx-2025.11.12",
        "node": "jLEV30xQTjWvjxWSRX6lnQ",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8841863228/8.2gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8841863228,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      },
      {
        "shard": 3,
        "index": "otel-xxx-2025.11.12",
        "node": "jLEV30xQTjWvjxWSRX6lnQ",
        "reason": {
          "type": "exception",
          "reason": "java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]]",
          "caused_by": {
            "type": "execution_exception",
            "reason": "execution_exception: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[fielddata] Data too large, data for [_id] would be [8615344024/8gb], which is larger than the limit of [8589934592/8gb]",
              "bytes_wanted": 8615344024,
              "bytes_limit": 8589934592,
              "durability": "PERMANENT"
            }
          }
        }
      }
    ],
    "caused_by": {
      "type": "circuit_breaking_exception",
      "reason": "[fielddata] Data too large, data for [_id] would be [8846062101/8.2gb], which is larger than the limit of [8589934592/8gb]",
      "bytes_wanted": 8846062101,
      "bytes_limit": 8589934592,
      "durability": "PERMANENT"
    }
  },
  "status": 500
}

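2. Cause

This is a protective rejection triggered by Elasticsearch's circuit breaker mechanism.

Specifically:

The request sorts on _id (both sort and search_after use _id as the tiebreaker), so Elasticsearch tries to load the _id field into fielddata memory;

With this load, the estimated fielddata usage would reach about 8.2 GB;

The fielddata circuit breaker limit is 8 GB (indices.breaker.fielddata.limit, which defaults to 40% of the JVM heap);

So Elasticsearch rejects the request up front to keep the node from crashing with an out-of-memory error.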

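Fielddata is normally used when sorting on a text field, using a text field in a terms aggregation, or reading field values from a script. _id is a special metadata field: it has no doc values, so it stays out of fielddata until a request sorts, aggregates, or scripts on it, and at that point its values have to be built on the heap.

My indices hold a very large number of documents (for example, hundreds of millions of log entries);

Each _id averages 20 to 30 bytes, so the total easily exceeds 8 GB;

Elasticsearch estimates that loading _id into fielddata would bring usage to 8.2 GB;

indices.breaker.fielddata.limit defaults to 40% of the JVM heap (assuming a 20 GB heap for Elasticsearch, the limit is 8 GB);

8.2 GB > 8 GB, so a circuit_breaking_exception is raised;

The request is rejected immediately and nothing is executed.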
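
The arithmetic above can be checked with a short Python sketch. The 20 GB heap and the 40% limit are the assumptions stated in the text, not values read from the cluster, and `fielddata_limit_bytes` and `overage_bytes` are hypothetical helper names used only for illustration.

```python
GIB = 1024 ** 3


def fielddata_limit_bytes(heap_bytes, limit_percent=40):
    """Breaker limit as a percentage of the JVM heap.

    40 is assumed here as the default for indices.breaker.fielddata.limit;
    integer arithmetic avoids floating-point rounding.
    """
    return heap_bytes * limit_percent // 100


def overage_bytes(bytes_wanted, bytes_limit):
    """Bytes by which the request exceeds the limit (0 if it fits)."""
    return max(0, bytes_wanted - bytes_limit)


heap = 20 * GIB                      # assumed heap size
limit = fielddata_limit_bytes(heap)  # matches bytes_limit in the error
wanted = 8846062101                  # bytes_wanted reported for shard 0

print(f"limit   = {limit} bytes ({limit / GIB:.1f} GiB)")
print(f"wanted  = {wanted} bytes ({wanted / GIB:.1f} GiB)")
print(f"overage = {overage_bytes(wanted, limit)} bytes")
print("breaker trips" if wanted > limit else "request fits")
```

Under these assumptions the computed limit equals the `bytes_limit` in the response, which supports (but does not prove) the guess that the node runs with a 20 GB heap and the default breaker setting.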

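3. Fix

Do not use the _id field for sorting, scripts, aggregations, or similar operations. If the sort needs a tiebreaker, use a field that has doc values instead.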
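
One way to apply this to the request shown earlier, sketched in Python: replace the _id tiebreaker in the sort with a field that has doc values. The field name `log_id` is hypothetical; it assumes the mapping contains a keyword field with a unique value per document (for example, a copy of the document ID written at ingest time). `replace_id_tiebreaker` is an illustrative helper, not part of any Elasticsearch client.

```python
import copy
import json


def replace_id_tiebreaker(body, tiebreaker_field):
    """Return a copy of a search body whose sort no longer references _id."""
    new_body = copy.deepcopy(body)
    new_sort = []
    for clause in new_body.get("sort", []):
        if isinstance(clause, str):
            # Short form, e.g. "sort": ["_id"]
            new_sort.append(tiebreaker_field if clause == "_id" else clause)
        else:
            # Object form, e.g. {"_id": {"order": "desc"}}
            new_sort.append({
                (tiebreaker_field if field == "_id" else field): spec
                for field, spec in clause.items()
            })
    new_body["sort"] = new_sort
    # Drop the old cursor and restart paging, so that later cursor values
    # are taken from the new sort keys.
    new_body.pop("search_after", None)
    return new_body


original = {
    "size": 500,
    "search_after": [1762484400260, "qEVBXJoBwBNcCGsGtsZX"],
    "sort": [{"@timestamp": {"order": "desc"}, "_id": {"order": "desc"}}],
}

# "log_id" is a hypothetical doc-values-backed keyword field.
fixed = replace_id_tiebreaker(original, "log_id")
print(json.dumps(fixed, indent=2))
```

The helper only rewrites the request body; the replacement field still has to exist in the index mapping and be populated for every document.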

4. Related documentation

Fielddata explained
