Connecting a Spring Service to Elasticsearch: How to Do CRUD Elegantly

A Detailed Guide to the Elasticsearch API and Its Java Client

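Elasticsearch is a powerful distributed search and analytics engine whose RESTful API exposes a rich set of operations. It also ships official Java clients (the Java High-Level REST Client, deprecated since 7.15, and the newer Elasticsearch Java API Client) that make it easy to work with from Java. This article walks through the core Elasticsearch APIs and how to build the corresponding requests with the Java client, at two levels: indices and documents.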

I. Index-Level Operations

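An index is the logical unit in which Elasticsearch stores data, loosely comparable to a "database" in a relational system. Index-level operations cover creating, managing, and deleting indices.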

1. Create Index

Elasticsearch creates an index via PUT /<index>; the request body can specify mappings and settings.

REST API example
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "date": { "type": "date" }
    }
  }
}

Java client implementation

Create the index with RestHighLevelClient:

import java.io.IOException;

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
// The typeless request/response classes used here live in org.elasticsearch.client.indices (7.x).
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.common.settings.Settings;
// In recent releases XContentType moved to org.elasticsearch.xcontent; adjust to your client version.
import org.elasticsearch.common.xcontent.XContentType;

public void createIndex(RestHighLevelClient client) throws IOException {
    CreateIndexRequest request = new CreateIndexRequest("my_index");
    // Same settings as the REST example: 3 primary shards, 1 replica each.
    request.settings(Settings.builder()
        .put("index.number_of_shards", 3)
        .put("index.number_of_replicas", 1)
    );
    // Mapping passed as a JSON string: title is full-text, date is a date field.
    request.mapping("{\"properties\":{\"title\":{\"type\":\"text\"},\"date\":{\"type\":\"date\"}}}",
        XContentType.JSON);

    CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.println("Index created: " + response.isAcknowledged());
}

2. Delete Index

Delete an index with DELETE /<index>.

REST API example
DELETE /my_index
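
Java client implementation

import java.io.IOException;

import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public void deleteIndex(RestHighLevelClient client) throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest("my_index");
    // Deleting an index removes all of its documents and cannot be undone.
    AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println("Index deleted: " + response.isAcknowledged());
}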

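II. Document-Level Operations

2. Search Documents

Search is the core feature of Elasticsearch, and its query DSL (Domain Specific Language) covers everything from simple matches to complex aggregations. The REST API uses GET /<index>/_search; the Java client builds queries with SearchRequest and SearchSourceBuilder.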

Basic query recap

Start with a simple query:

REST API example
GET /my_index/_search
{
  "query": {
    "match": {
      "title": "Elasticsearch"
    }
  }
}
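
Java client implementation

import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void basicSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("title", "Elasticsearch"));
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Total hits: " + response.getHits().getTotalHits()); // 7.x: a TotalHits object; use .value for the number
}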

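In practice, though, query requirements are usually more complex. The following sections introduce them step by step.

Complex query scenarios
2.1 Boolean Query

A bool query combines multiple clauses (must, should, must_not, filter) for precise control over what matches.

REST API example

Find documents whose title contains "Elasticsearch" and whose date is on or after 2025-01-01, excluding those whose title contains "Basics":
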
GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" }}
      ],
      "filter": [
        { "range": { "date": { "gte": "2025-01-01" }}}
      ],
      "must_not": [
        { "match": { "title": "Basics" }}
      ]
    }
  }
}

Java client implementation

import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.search.SearchHit;

public void booleanSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    
    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("title", "Elasticsearch"))      // must match, contributes to the score
        .filter(QueryBuilders.rangeQuery("date").gte("2025-01-01"))    // must match, not scored
        .mustNot(QueryBuilders.matchQuery("title", "Basics"));         // must not match
    
    sourceBuilder.query(boolQuery);
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boolean query hits: " + response.getHits().getTotalHits());
    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsString());
    }
}
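
To make the clause semantics concrete, here is a toy in-memory model, not Elasticsearch code: a document is kept only if every must and filter predicate holds and no must_not predicate does. The class name BoolQuerySketch and the String[] {title, date} document shape are assumptions made for illustration, and the model ignores scoring and should clauses entirely.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Toy model of bool-query matching; a hypothetical helper, not part of any client API. */
final class BoolQuerySketch {

    /** Keeps documents that satisfy all must and filter predicates and no mustNot predicate. */
    static <T> List<T> select(List<T> docs,
                              List<Predicate<T>> must,
                              List<Predicate<T>> filter,
                              List<Predicate<T>> mustNot) {
        return docs.stream()
                .filter(d -> must.stream().allMatch(p -> p.test(d)))
                .filter(d -> filter.stream().allMatch(p -> p.test(d)))
                .filter(d -> mustNot.stream().noneMatch(p -> p.test(d)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Each document is {title, date}; ISO dates compare correctly as strings.
        List<String[]> docs = Arrays.asList(
                new String[] {"Elasticsearch in Action", "2025-03-01"},
                new String[] {"Elasticsearch Basics", "2025-02-01"},
                new String[] {"Elasticsearch Internals", "2024-12-31"});
        Predicate<String[]> titleHasEs = d -> d[0].contains("Elasticsearch");
        Predicate<String[]> onOrAfter2025 = d -> d[1].compareTo("2025-01-01") >= 0;
        Predicate<String[]> titleHasBasics = d -> d[0].contains("Basics");

        List<String[]> hits = select(docs,
                Collections.singletonList(titleHasEs),
                Collections.singletonList(onOrAfter2025),
                Collections.singletonList(titleHasBasics));
        for (String[] hit : hits) {
            System.out.println(hit[0]); // prints "Elasticsearch in Action"
        }
    }
}
```

In the real query the must clause also contributes to the relevance score, while filter and must_not only decide whether a document is included.
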

2.2 Multi-Match Query

To search across several fields at once, use multi_match.

REST API example

Search for documents whose title or content contains "Elasticsearch":

GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title", "content"],
      "type": "best_fields"
    }
  }
}

Java client implementation

import org.elasticsearch.index.query.MultiMatchQueryBuilder;

public void multiMatchSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    
    // Search both fields; best_fields scores a document by its best-matching field.
    MultiMatchQueryBuilder multiMatchQuery = QueryBuilders.multiMatchQuery("Elasticsearch", "title", "content")
        .type(MultiMatchQueryBuilder.Type.BEST_FIELDS);
    
    sourceBuilder.query(multiMatchQuery);
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Multi-match hits: " + response.getHits().getTotalHits());
}

2.3 Match Phrase Query

Requires the terms to appear together in the given order, which suits exact phrase searches.

REST API example
GET /my_index/_search
{
  "query": {
    "match_phrase": {
      "title": "Elasticsearch Basics"
    }
  }
}

Java client implementation
public void phraseSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    
    sourceBuilder.query(QueryBuilders.matchPhraseQuery("title", "Elasticsearch Basics"));
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Phrase query hits: " + response.getHits().getTotalHits());
}

2.4 Fuzzy Query

Tolerates a limited number of spelling errors, which suits typo-tolerant search.

REST API example
GET /my_index/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "Elastcsearch",
        "fuzziness": "AUTO"
      }
    }
  }
}

Java client implementation

import org.elasticsearch.common.unit.Fuzziness;

public void fuzzySearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    
    // "Elastcsearch" is deliberately misspelled; AUTO picks the allowed edit distance from the term length.
    sourceBuilder.query(QueryBuilders.fuzzyQuery("title", "Elastcsearch").fuzziness(Fuzziness.AUTO));
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Fuzzy query hits: " + response.getHits().getTotalHits());
}
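
As a rough illustration of what fuzziness AUTO permits, the sketch below computes an edit distance that counts an adjacent transposition as one edit and applies the documented AUTO thresholds (terms of 0 to 2 characters must match exactly, 3 to 5 allow one edit, longer terms allow two). FuzzinessSketch and its methods are hypothetical names, not client API, and the real engine evaluates this against indexed terms with Lucene automata rather than a pairwise loop.

```java
/** Illustrative model of fuzziness AUTO; hypothetical helper, not Elasticsearch code. */
final class FuzzinessSketch {

    /** Edit distance where insert, delete, substitute and adjacent transposition each cost 1. */
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                    d[i - 1][j - 1] + cost);
                // Adjacent transposition, e.g. "ab" -> "ba".
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    /** Maximum edits allowed by AUTO for a query term of this length. */
    static int autoMaxEdits(String queryTerm) {
        int n = queryTerm.length();
        if (n <= 2) return 0;
        return n <= 5 ? 1 : 2;
    }

    static boolean matches(String queryTerm, String indexedTerm) {
        return editDistance(queryTerm, indexedTerm) <= autoMaxEdits(queryTerm);
    }

    public static void main(String[] args) {
        System.out.println(editDistance("elastcsearch", "elasticsearch")); // prints 1
        System.out.println(matches("elastcsearch", "elasticsearch"));      // prints true
    }
}
```

Assuming the query value is compared as-is against a lowercased indexed term, the capitalised "Elastcsearch" from the example costs two edits against "elasticsearch" (one substitution, one insertion), which is still within the AUTO limit for a 12-character term.
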

2.5 Aggregations

Aggregations are used for statistical analysis, such as grouping by a field or computing an average.

REST API example

Count documents per date:

GET /my_index/_search
{
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date"
      }
    }
  }
}
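
Java client implementation

import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;

public void aggregationSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    
    // One bucket per distinct value of the date field.
    TermsAggregationBuilder aggregation = AggregationBuilders.terms("by_date").field("date");
    sourceBuilder.aggregation(aggregation);
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // Look the aggregation up by the name it was registered under.
    Terms byDateAgg = response.getAggregations().get("by_date");
    for (Terms.Bucket bucket : byDateAgg.getBuckets()) {
        System.out.println("Date: " + bucket.getKeyAsString() + ", Count: " + bucket.getDocCount());
    }
}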

Relevance scoring and tuning

Elasticsearch computes relevance scores with the BM25 algorithm by default. The two techniques below adjust that score.
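
For intuition about what BM25 rewards, the sketch below evaluates the textbook single-term formula with the default parameters k1 = 1.2 and b = 0.75. Bm25Sketch and its parameter names are illustrative assumptions; Lucene stores field lengths in a lossy form and may fold constant factors differently, so the numbers will not match the explain API exactly.

```java
/** Textbook BM25 for one query term; an illustrative sketch, not Lucene's implementation. */
final class Bm25Sketch {
    static final double K1 = 1.2;   // term-frequency saturation (Elasticsearch default)
    static final double B = 0.75;   // field-length normalisation (Elasticsearch default)

    /** Inverse document frequency: terms found in fewer documents weigh more. */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Score contribution of one term in one field of one document. */
    static double termScore(double termFreq, double fieldLength, double avgFieldLength,
                            long docCount, long docFreq) {
        double lengthNorm = K1 * (1.0 - B + B * fieldLength / avgFieldLength);
        return idf(docCount, docFreq) * termFreq * (K1 + 1.0) / (termFreq + lengthNorm);
    }

    public static void main(String[] args) {
        // Same term frequency, but the second field is four times longer than average,
        // so it receives a lower score.
        System.out.println(termScore(2, 10, 10, 1000, 50));
        System.out.println(termScore(2, 40, 10, 1000, 50));
    }
}
```

Two properties are worth noticing: repeating a term raises the score with diminishing returns, and the same match counts for less in a longer field.
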

2.6 Boosting

Give a particular field or clause more weight.

REST API example
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title^2", "content"]
    }
  }
}

Java client implementation

public void boostedSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    
    MultiMatchQueryBuilder query = QueryBuilders.multiMatchQuery("Elasticsearch", "title", "content")
        .field("title", 2.0f); // raise the weight of the title field (equivalent to "title^2")
    sourceBuilder.query(query);
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boosted query hits: " + response.getHits().getTotalHits());
}
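
To see how the ^2 boost changes the outcome, recall that best_fields (the default multi_match type) takes the highest per-field score and adds tie_breaker times the remaining ones, with tie_breaker defaulting to 0.0. The sketch below models that combination on made-up per-field scores; BestFieldsSketch and the input numbers are assumptions for illustration only.

```java
/** Models how best_fields combines boosted per-field scores; hypothetical helper. */
final class BestFieldsSketch {

    /** Returns the best boosted score plus tieBreaker times the sum of the other boosted scores. */
    static double combine(double[] fieldScores, double[] boosts, double tieBreaker) {
        if (fieldScores.length != boosts.length) {
            throw new IllegalArgumentException("one boost per field score is required");
        }
        double best = 0.0;
        double sum = 0.0;
        for (int i = 0; i < fieldScores.length; i++) {
            double boosted = fieldScores[i] * boosts[i];
            sum += boosted;
            best = Math.max(best, boosted);
        }
        return best + tieBreaker * (sum - best);
    }

    public static void main(String[] args) {
        double[] scores = {1.0, 1.5};  // made-up raw scores for title and content
        double[] boosts = {2.0, 1.0};  // title^2, content unboosted
        // Without the boost, content (1.5) would win; with it, title (2.0) does.
        System.out.println(combine(scores, boosts, 0.0)); // prints 2.0
    }
}
```
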

2.7 Function Score Query

Customize the scoring logic, for example by factoring in a field value or a distance.

REST API example
GET /my_index/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "title": "Elasticsearch" }},
      "functions": [
        {
          "field_value_factor": {
            "field": "views",
            "factor": 1.5,
            "modifier": "sqrt"
          }
        }
      ]
    }
  }
}

Java client implementation

import org.elasticsearch.common.lucene.search.function.FieldValueFactorFunction;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;

public void functionScoreSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    
    // Base query on title, with the score adjusted by sqrt(1.5 * views).
    FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
        QueryBuilders.matchQuery("title", "Elasticsearch"),
        ScoreFunctionBuilders.fieldValueFactorFunction("views")
            .factor(1.5f)
            .modifier(FieldValueFactorFunction.Modifier.SQRT)
    );
    sourceBuilder.query(functionScoreQuery);
    request.source(sourceBuilder);
    
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Function score hits: " + response.getHits().getTotalHits());
}
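
The effect of this query can be worked through by hand: with modifier sqrt the function yields sqrt(factor * views), and under the default boost_mode of multiply that value is multiplied into the query score. A minimal sketch of the arithmetic, where FieldValueFactorSketch and the sample numbers are hypothetical:

```java
/** Arithmetic behind field_value_factor with modifier sqrt and boost_mode multiply. */
final class FieldValueFactorSketch {

    static double finalScore(double queryScore, double views, double factor) {
        double functionValue = Math.sqrt(factor * views); // the sqrt modifier
        return queryScore * functionValue;                // boost_mode: multiply
    }

    public static void main(String[] args) {
        System.out.println(finalScore(2.0, 6.0, 1.5)); // 2.0 * sqrt(9.0), prints 6.0
        System.out.println(finalScore(2.0, 0.0, 1.5)); // prints 0.0
    }
}
```

A document with zero views ends up with a score of 0 under this setup, which is one reason modifiers such as log1p are often preferred.
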
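
Query performance tips
  1. Prefer filter over must where scoring is not needed: filter clauses do not compute relevance scores, so they are cheaper.
  2. Pagination: limit the result window with the from and size parameters.
  3. Field selection: use stored_fields or _source filtering to trim the returned fields and reduce data transfer.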
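
For the pagination tip, page-based UIs usually need to convert a page number into from and size. The sketch below does that and rejects windows beyond 10,000 hits, the default of the index.max_result_window setting. PageWindow is a hypothetical helper, and deep paging would normally switch to search_after instead.

```java
/** Converts a 1-based page number into the from/size pair of a search request. */
final class PageWindow {
    // Default value of the index.max_result_window index setting.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    final int from;
    final int size;

    private PageWindow(int from, int size) {
        this.from = from;
        this.size = size;
    }

    static PageWindow of(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be at least 1");
        }
        long from = (long) (page - 1) * size; // long avoids int overflow on large pages
        if (from + size > DEFAULT_MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException(
                    "from + size exceeds " + DEFAULT_MAX_RESULT_WINDOW + "; consider search_after");
        }
        return new PageWindow((int) from, size);
    }

    public static void main(String[] args) {
        PageWindow w = PageWindow.of(3, 20);
        System.out.println("from=" + w.from + ", size=" + w.size); // prints from=40, size=20
    }
}
```

With the client used in this article, the result would be applied as sourceBuilder.from(w.from).size(w.size).
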
