Elasticsearch Troubleshooting Notes

Text fields cannot be aggregated or sorted

Background: while using Elasticsearch, I ran aggregation and sort operations on the text field [content], for example:

POST /my_index/_search
{
  "size": 0,
  "aggs": {
    "terms_of_content": {
      "terms": {
        "field": "content"
      }
    }
  }
}

The mapping of this field in the index looks roughly like this:

"content" : {
      "type" : "text",
      "norms" : false,
      "analyzer" : "field_analyzer@content",
      "term_hash_enable" : true
    }

Error message:

java.lang.IllegalArgumentException: Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [content] in order to load field data by uninverting the inverted index. Note that this can use significant memory.

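Cause: a text field is meant for full-text search, so its value is split into tokens by the analyzer and stored in the inverted index. Sorting and aggregation need the opposite view: the value of the field for each document, treated as a single unit. A text field does not keep that per-document value by default, so Elasticsearch rejects sorting and aggregation on text fields unless it is explicitly enabled.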
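
To make the difference concrete, here is a small sketch in plain Python, not Elasticsearch code. The sample documents and the whitespace `analyze` function are assumptions standing in for a real index and analyzer. It shows that bucketing by token gives very different results from bucketing by whole value, which is what a terms aggregation is normally expected to do.

```python
from collections import Counter

# Hypothetical sample documents; only the "content" field matters here.
docs = [
    {"content": "disk full"},
    {"content": "disk full"},
    {"content": "disk error"},
]


def analyze(text):
    """Stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def token_buckets(documents):
    """Count documents per token, roughly what aggregating on an analyzed text field yields."""
    counts = Counter()
    for doc in documents:
        # set() so a token repeated inside one document is counted once
        counts.update(set(analyze(doc["content"])))
    return counts


def keyword_buckets(documents):
    """Count documents per whole value, which is what a keyword field yields."""
    return Counter(doc["content"] for doc in documents)


print(token_buckets(docs))
print(keyword_buckets(docs))
```

The token view produces buckets such as "disk" and "full", while the whole-value view produces "disk full" and "disk error", which is usually what the aggregation was meant to return.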
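
Solutions
1. Update the mapping to add a keyword sub-field under fields, then aggregate or sort on [content.keyword] instead of [content]. Documents indexed before the mapping change have no value in the new sub-field until they are reindexed.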
```
"content" : {
    "type" : "text",
    "fields" : {
        "keyword" : {
            "type" : "keyword"
        }
    }
}
```
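2. Alternatively, follow the hint in the error message and set fielddata=true on the field in the mapping (not recommended: fielddata is loaded into heap memory and can use a lot of it).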
```
 "content": {
   "type": "text",
   "fielddata": true
 }
```
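
A minimal sketch of the first option, using only the Python standard library to build the request bodies (sending them is left out). The helper names `with_keyword_subfield` and `terms_agg_on_keyword` are hypothetical. The sketch assumes the existing field parameters, such as the analyzer, have to be repeated unchanged when the mapping is updated, so it copies the current field mapping and only adds the sub-field.

```python
import copy
import json

# Existing field mapping, as shown earlier in these notes.
existing_content_mapping = {
    "type": "text",
    "norms": False,
    "analyzer": "field_analyzer@content",
    "term_hash_enable": True,
}


def with_keyword_subfield(field_mapping):
    """Return a copy of a text field mapping with a keyword sub-field added."""
    updated = copy.deepcopy(field_mapping)
    updated.setdefault("fields", {})["keyword"] = {"type": "keyword"}
    return updated


def terms_agg_on_keyword(field):
    """Build a terms aggregation that targets the keyword sub-field of `field`."""
    return {
        "size": 0,
        "aggs": {
            "terms_of_" + field: {
                "terms": {"field": field + ".keyword"}
            }
        },
    }


# Body for PUT /my_index/_mapping
mapping_body = {"properties": {"content": with_keyword_subfield(existing_content_mapping)}}
# Body for POST /my_index/_search
search_body = terms_agg_on_keyword("content")

print(json.dumps(mapping_body, indent=2))
print(json.dumps(search_body, indent=2))
```

The only change to the original aggregation is the field name: `content.keyword` instead of `content`.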

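Highlighted field is too long and exceeds the default highlight limit

Background: a search request that included highlighting failed with an error.

Query: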

POST /logs/_search
{
  "query": {
    "match": {
      "message": "specific_error_code"
    }
  },
  "highlight": {
    "fields": {
      "request_arguments": {} // Highlighting is attempted on the request_arguments field
    }
  }
}

Error message:

The length of [specific_error_code] field of [32535924] doc "illegal argument exception', maximum allowed to be analyzed for highlighting.

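Cause: the text of the highlighted field is longer than the index setting highlight.max_analyzed_offset, which is the maximum number of characters that will be analyzed for a highlight request. The default is [1000000].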
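
To find out which fields are affected, one option is to check the string lengths in a document's _source on the client side. The helper below is a hypothetical sketch, not part of any Elasticsearch client. It only looks at top-level string fields, and Python's `len` counts code points, so the result is an approximation of what Elasticsearch measures.

```python
def oversized_fields(source, limit=1_000_000):
    """Return the names of top-level string fields longer than `limit` characters."""
    return [
        name
        for name, value in source.items()
        if isinstance(value, str) and len(value) > limit
    ]


# Hypothetical document; a tiny limit is used so the example stays readable.
sample_source = {
    "message": "specific_error_code",
    "request_arguments": "x" * 50,
    "status": 500,
}

print(oversized_fields(sample_source, limit=20))
```

Fields reported by such a check are the ones to leave out of the highlight section or to highlight with a larger limit.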

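Solutions: raise the index-level limit (this carries a risk of OOM), or change the highlight part of the query so that the oversized field is not highlighted.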

  PUT index/_settings
  {
      "index" : {
          "highlight.max_analyzed_offset" : 100000000
      }
  }

Modified query that avoids highlighting the oversized field:

POST /logs/_search
{
  "query": {
    "match": {
      "message": "specific_error_code"
    }
  },
  "highlight": {
    "fields": {
      "message": {} // List only the fields that need highlighting; request_arguments is left out
    }
  }
}
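
Another possibility, assuming the cluster runs a version that supports the request-level `max_analyzed_offset` highlight option (it was added in later 7.x releases, as far as I know), is to cap the analyzed text per request so that highlighting stops at the cap instead of failing. The sketch below only builds the request body with the Python standard library. The function name `highlight_request` and the cap of 999999 are illustrative assumptions.

```python
import json


def highlight_request(query_text, fields, max_analyzed_offset):
    """Build a search body whose highlighter stops analyzing at `max_analyzed_offset` characters."""
    return {
        "query": {"match": {"message": query_text}},
        "highlight": {
            # Assumed to be supported by the target Elasticsearch version.
            "max_analyzed_offset": max_analyzed_offset,
            "fields": {name: {} for name in fields},
        },
    }


# Body for POST /logs/_search
body = highlight_request(
    "specific_error_code",
    ["message", "request_arguments"],
    max_analyzed_offset=999_999,
)
print(json.dumps(body, indent=2))
```

With this approach text beyond the cap is simply not highlighted, which avoids both the error and the memory cost of raising the index-level limit.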