A Complete Guide to Common Search Operations in Elasticsearch 8.13.4

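Elasticsearch is a distributed search and analytics engine with a rich set of search capabilities. This article walks through the most commonly used search operations in Elasticsearch 8.13.4 to help you get up to speed with its core search features.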

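I. Basic Concepts Review

Before getting into the search operations, here is a brief review of a few core concepts:

Index: a container that stores documents, roughly comparable to a table in a relational database
Document: the basic unit of searchable information, represented as JSON
Inverted index: the data structure Elasticsearch uses to map each term to the documents containing it, which makes full-text search fast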
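
To make the inverted index idea concrete, here is a minimal Python sketch. It is a toy model, not how Elasticsearch or Lucene is implemented: the `tokenize` function is a hypothetical stand-in for a real analyzer, and the two sample documents are made up for illustration.

```python
from collections import defaultdict

def tokenize(text):
    """Rough stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Made-up sample documents keyed by document ID.
docs = {
    1: "Elasticsearch is a search engine",
    2: "A search engine uses an inverted index",
}
inverted = build_inverted_index(docs)
print(inverted["search"])    # [1, 2]
print(inverted["inverted"])  # [2]
```

Looking up a term is a single dictionary access, which is why term lookups stay fast regardless of how long the individual documents are.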

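II. Common Search Operations

1. Match All Query

The most basic search matches every document in the index (by default, only the first 10 hits are returned in the response):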

GET /your_index/_search
{
  "query": {
    "match_all": {}
  }
}

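2. Full-Text Search (Match Query)

The most commonly used full-text query. The query text is analyzed before matching, and by default a document matches if it contains any of the resulting terms: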

GET /your_index/_search
{
  "query": {
    "match": {
      "field_name": "search keywords"
    }
  }
}

3. Phrase Matching (Match Phrase Query)

Requires all terms to appear in the same order and, by default, adjacent to one another:

GET /your_index/_search
{
  "query": {
    "match_phrase": {
      "field_name": "exact phrase search"
    }
  }
}
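
The difference between "all terms present" and "terms present as a phrase" can be sketched in a few lines of Python. This is a simplified model that assumes whitespace tokenization and no slop; the real query works on analyzed terms and their stored positions.

```python
def phrase_match(doc_tokens, phrase_tokens):
    """True if phrase_tokens occur in doc_tokens consecutively and in order."""
    n = len(phrase_tokens)
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )

doc = "the quick brown fox".split()
print(phrase_match(doc, ["quick", "brown"]))  # True
print(phrase_match(doc, ["brown", "quick"]))  # False: wrong order
print(phrase_match(doc, ["quick", "fox"]))    # False: both present, but not adjacent
```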

4. Multi-Field Search (Multi-match Query)

Runs the same query text against several fields:

GET /your_index/_search
{
  "query": {
    "multi_match": {
      "query": "search keywords",
      "fields": ["field1", "field2", "field3"]
    }
  }
}

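5. Exact Matching (Term Query)

Matches an exact value; the query text is not analyzed. It is best suited to keyword, numeric, and date fields, because the analyzed terms stored for a text field rarely equal the raw input: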

GET /your_index/_search
{
  "query": {
    "term": {
      "field_name": "exact value"
    }
  }
}
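
The practical difference between match and term comes down to whether the query text is analyzed. The Python sketch below is a toy model under stated assumptions: `analyze` is a hypothetical stand-in for the standard analyzer (lowercase plus whitespace split), and a field is modeled as a plain set of indexed terms.

```python
def analyze(text):
    # Hypothetical stand-in for an analyzer: lowercase, then split on whitespace.
    return text.lower().split()

def match_query(indexed_terms, query_text):
    # match: analyze the query text, then match if ANY resulting term is indexed
    # (OR is the default operator).
    return any(term in indexed_terms for term in analyze(query_text))

def term_query(indexed_terms, value):
    # term: compare the value as given, with no analysis.
    return value in indexed_terms

text_field_terms = set(analyze("Quick Brown Fox"))  # text field: analyzed at index time
keyword_field_terms = {"Quick Brown Fox"}           # keyword field: stored verbatim

print(match_query(text_field_terms, "QUICK"))              # True
print(term_query(text_field_terms, "Quick"))               # False: indexed term is "quick"
print(term_query(keyword_field_terms, "Quick Brown Fox"))  # True
```

This is why a term query against a text field often returns nothing even though the value is visibly present in the document source.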

6. Multiple Exact Values (Terms Query)

Matches documents whose field contains at least one of the listed terms:

GET /your_index/_search
{
  "query": {
    "terms": {
      "field_name": ["value1", "value2", "value3"]
    }
  }
}

7. Range Query

Finds values within a given range (gte and lte are inclusive bounds):

GET /your_index/_search
{
  "query": {
    "range": {
      "field_name": {
        "gte": 10,
        "lte": 20
      }
    }
  }
}

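8. Boolean Query (Bool Query)

Combines multiple query clauses. It supports must (the clause must match and contributes to the score), should (the clause should match), must_not (the clause must not match), and filter (the clause must match but does not contribute to the score):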

GET /your_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" } }
      ],
      "filter": [
        { "term": { "status": "published" } },
        { "range": { "publish_date": { "gte": "2023-01-01" } } }
      ],
      "should": [
        { "term": { "tags": "tutorial" } },
        { "term": { "tags": "guide" } }
      ],
      "minimum_should_match": 1
    }
  }
}
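
When bool queries are assembled in application code, a small helper keeps the request body tidy. The following Python sketch only builds the JSON body; `build_bool_query` is a hypothetical helper name, not part of any Elasticsearch client, and the field names are placeholders.

```python
import json

def build_bool_query(must=(), filters=(), should=(), must_not=(),
                     minimum_should_match=None):
    """Build a search body with a bool query, omitting empty clause lists."""
    bool_body = {}
    clause_groups = (
        ("must", must),
        ("filter", filters),
        ("should", should),
        ("must_not", must_not),
    )
    for name, clauses in clause_groups:
        if clauses:
            bool_body[name] = list(clauses)
    if minimum_should_match is not None:
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}

body = build_bool_query(
    must=[{"match": {"title": "Elasticsearch"}}],
    filters=[{"term": {"status": "published"}}],
    should=[{"term": {"tags": "tutorial"}}, {"term": {"tags": "guide"}}],
    minimum_should_match=1,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of a _search request by whatever HTTP or client library the application already uses.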

9. Wildcard Query

Use * to match any sequence of characters (including none) and ? to match any single character:

GET /your_index/_search
{
  "query": {
    "wildcard": {
      "field_name": "search*"
    }
  }
}
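
The semantics of the two wildcard characters can be illustrated by translating a pattern into a regular expression. This Python sketch is only a model of the matching rule applied to a single term; it says nothing about how Elasticsearch executes the query internally.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern (* and ?) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any sequence of characters, possibly empty
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    """The pattern has to cover the whole term, not just part of it."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

print(wildcard_match("search*", "searching"))  # True
print(wildcard_match("search*", "search"))     # True: * also matches zero characters
print(wildcard_match("se?rch", "search"))      # True
print(wildcard_match("se?rch", "serch"))       # False: ? needs exactly one character
```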

10. Prefix Query

Finds terms that begin with the given prefix:

GET /your_index/_search
{
  "query": {
    "prefix": {
      "field_name": "prefix"
    }
  }
}

11. Fuzzy Query

Matches terms that are similar to the search term, as measured by edit distance:

GET /your_index/_search
{
  "query": {
    "fuzzy": {
      "field_name": {
        "value": "search term",
        "fuzziness": "AUTO"
      }
    }
  }
}
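
To show what edit distance and "fuzziness": "AUTO" mean in practice, here is a Python sketch. It implements plain Levenshtein distance, which is a simplification: by default Elasticsearch also counts swapping two adjacent characters as a single edit, and this sketch does not. The length thresholds in `auto_fuzziness` follow the documented AUTO defaults (terms of 0 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two).

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def auto_fuzziness(term):
    """Allowed edits under the default AUTO thresholds, based on term length."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

def fuzzy_match(query_term, indexed_term):
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(query_term)

print(levenshtein("kitten", "sitting"))   # 3
print(fuzzy_match("serch", "search"))     # True: 5 characters, one insertion
print(fuzzy_match("elastik", "elastic"))  # True: 7 characters, one substitution
print(fuzzy_match("ab", "ac"))            # False: short terms must match exactly
```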

III. Controlling Search Results

1. Pagination

GET /your_index/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "match_all": {}
  }
}
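
Applications usually think in page numbers rather than offsets. The Python sketch below converts a 1-based page number into from and size, and refuses requests that would exceed the default index.max_result_window of 10,000. `page_request` is a hypothetical helper, and the limit check assumes the index setting has not been changed.

```python
MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window

def page_request(page, page_size, query=None):
    """Build a search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    if offset + page_size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the default result window")
    return {
        "from": offset,
        "size": page_size,
        "query": query or {"match_all": {}},
    }

print(page_request(1, 10))          # {'from': 0, 'size': 10, 'query': {'match_all': {}}}
print(page_request(3, 10)["from"])  # 20
```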

2. Sorting

GET /your_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    { "field1": { "order": "asc" } },
    { "field2": { "order": "desc" } }
  ]
}

3. Selecting Returned Fields

GET /your_index/_search
{
  "_source": ["field1", "field2"],
  "query": {
    "match_all": {}
  }
}

4. Highlighting

GET /your_index/_search
{
  "query": {
    "match": {
      "content": "search keywords"
    }
  },
  "highlight": {
    "fields": {
      "content": {}
    }
  }
}
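
In the response, each hit that has highlights carries a highlight object keyed by field name, with matched terms wrapped in <em> tags by default. The Python sketch below shows one way to read those fragments. The `sample_response` dictionary is hand-written for illustration and heavily trimmed; it is not real server output, and `extract_highlights` is a hypothetical helper.

```python
def extract_highlights(response, field):
    """Return (document id, list of highlighted fragments) for every hit."""
    results = []
    for hit in response["hits"]["hits"]:
        # A hit may have no highlight section for the requested field.
        fragments = hit.get("highlight", {}).get(field, [])
        results.append((hit["_id"], fragments))
    return results

# Hand-written, trimmed-down response used only to exercise the helper.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"content": "A guide to search keywords"},
                "highlight": {
                    "content": ["A guide to <em>search</em> <em>keywords</em>"]
                },
            },
            {"_id": "2", "_source": {"content": "Unrelated text"}},
        ]
    }
}

for doc_id, fragments in extract_highlights(sample_response, "content"):
    print(doc_id, fragments)
```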

IV. Aggregations

Elasticsearch provides powerful aggregations for data analysis:

1. Metric Aggregations

GET /your_index/_search
{
  "size": 0,
  "aggs": {
    "avg_price": {
      "avg": { "field": "price" }
    },
    "max_price": {
      "max": { "field": "price" }
    }
  }
}

2. Bucket Aggregations

GET /your_index/_search
{
  "size": 0,
  "aggs": {
    "group_by_category": {
      "terms": {
        "field": "category",
        "size": 10
      }
    }
  }
}

3. Combined (Nested) Aggregations

GET /your_index/_search
{
  "size": 0,
  "aggs": {
    "group_by_category": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
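
A terms aggregation with an avg sub-aggregation computes roughly what a GROUP BY with AVG would. The Python sketch below reproduces that calculation over an in-memory list to show the shape of the buckets. It is a model only: the documents are made up, the result is exact (whereas a distributed terms aggregation can be approximate), and breaking ties by key is an assumption of this sketch.

```python
from collections import defaultdict

def terms_with_avg(docs, group_field, value_field, size=10):
    """Group docs by group_field and average value_field within each bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    # Buckets are ordered by document count, largest first (the default order);
    # ties are broken by key here as an assumption of this sketch.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        {
            "key": key,
            "doc_count": len(values),
            "avg_price": {"value": sum(values) / len(values)},
        }
        for key, values in ordered[:size]
    ]

# Made-up documents with the field names used in the request above.
docs = [
    {"category": "books", "price": 10.0},
    {"category": "books", "price": 20.0},
    {"category": "games", "price": 60.0},
]
for bucket in terms_with_avg(docs, "category", "price"):
    print(bucket)
# {'key': 'books', 'doc_count': 2, 'avg_price': {'value': 15.0}}
# {'key': 'games', 'doc_count': 1, 'avg_price': {'value': 60.0}}
```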

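V. Performance Tips

Use filter context: filter clauses skip scoring, so they run faster, and frequently used filters can be cached
Avoid patterns that begin with a wildcard: a wildcard query such as *text has to examine a large number of terms and can be very slow
Paginate carefully: deep pagination is expensive, and by default requests with from + size > 10000 are rejected (the index.max_result_window setting); consider search_after instead
Choose appropriate field types: text for full-text search, keyword for exact matching, sorting, and aggregations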
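
As a rough illustration of the search_after approach mentioned above, the following Python sketch builds the request bodies for successive pages. It assumes a sort on a hypothetical publish_date field plus a hypothetical unique keyword field named id as a tiebreaker; `first_page` and `next_page` are illustrative helpers, and the sort values in the usage example are invented.

```python
def first_page(page_size):
    """Initial request: a deterministic sort with a unique tiebreaker field."""
    return {
        "size": page_size,
        "query": {"match_all": {}},
        "sort": [
            {"publish_date": "desc"},  # hypothetical date field
            {"id": "asc"},             # hypothetical unique keyword field
        ],
    }

def next_page(previous_body, last_hit):
    """Follow-up request: resume after the sort values of the last hit."""
    body = dict(previous_body)              # shallow copy; the original is untouched
    body["search_after"] = last_hit["sort"]
    return body

# Invented last hit of a previous page; each hit in a sorted search
# carries its sort values in a "sort" array.
last_hit = {"_id": "doc-42", "sort": [1672531200000, "doc-42"]}

body = first_page(10)
follow_up = next_page(body, last_hit)
print(follow_up["search_after"])  # [1672531200000, 'doc-42']
```

Because each request starts from the last seen sort values rather than from an offset, the cost per page does not grow with how deep into the result set the client has paged.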

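VI. Summary

Elasticsearch 8.13.4 offers a wide range of search features, from basic full-text search to complex aggregation analysis. Mastering these common search operations will help you build efficient, flexible search applications.

In practice, choose the query type that fits your scenario and keep performance in mind to get the best search experience.

I hope this article helps you better understand and use Elasticsearch's search features!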