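When a Spring Service Meets Elasticsearch: How to Do CRUD Elegantly

Elasticsearch APIs and Their Java Client Operations in Detail

Elasticsearch is a distributed search and analytics engine whose RESTful API exposes a rich set of operations. It also ships official Java clients, namely the Java High-Level REST Client (deprecated since 7.15) and the newer Elasticsearch Java API Client, so Java applications can talk to it without hand-writing HTTP calls. This article walks through the core APIs at two levels, the index level and the document level, and shows how to build each request with the High-Level REST Client.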

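I. Index-Level Operations

An index is the logical unit in which Elasticsearch stores data. It is loosely comparable to a database in a relational system (or, now that mapping types have been removed, to a single table). Index-level operations cover creating, managing, and deleting indices.

1. Create Index

PUT /<index> creates an index. The request body can optionally carry settings and mappings.

REST API example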
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "date": { "type": "date" }
    }
  }
}
Java client implementation

Create the index with RestHighLevelClient:

import java.io.IOException;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
// Use the client-side request/response classes; they accept a typeless mapping.
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.common.settings.Settings;
// Newer 7.x releases moved this class to org.elasticsearch.xcontent.
import org.elasticsearch.common.xcontent.XContentType;

public void createIndex(RestHighLevelClient client) throws IOException {
    CreateIndexRequest request = new CreateIndexRequest("my_index");
    // Same settings as the REST example: 3 primary shards, 1 replica each.
    request.settings(Settings.builder()
        .put("index.number_of_shards", 3)
        .put("index.number_of_replicas", 1)
    );
    // The mapping is passed as a JSON string.
    request.mapping("{\"properties\":{\"title\":{\"type\":\"text\"},\"date\":{\"type\":\"date\"}}}",
        XContentType.JSON);

    CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.println("Index created: " + response.isAcknowledged());
}
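To make the link between the REST call and the Java client concrete, here is a minimal sketch that builds the same PUT request with only the JDK's java.net.http API (Java 11+). The class name CreateIndexRequestSketch and the base URL http://localhost:9200 are assumptions for illustration; the code builds the request but does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) the REST call that creates my_index. */
final class CreateIndexRequestSketch {

    /** JSON body mirroring the settings and mappings of the REST example. */
    static String body(int shards, int replicas) {
        return "{"
            + "\"settings\":{\"number_of_shards\":" + shards
            + ",\"number_of_replicas\":" + replicas + "},"
            + "\"mappings\":{\"properties\":{"
            + "\"title\":{\"type\":\"text\"},"
            + "\"date\":{\"type\":\"date\"}}}"
            + "}";
    }

    /** PUT <baseUrl>/<index> with a JSON body; baseUrl is an assumed node address. */
    static HttpRequest request(String baseUrl, String index, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = request("http://localhost:9200", "my_index", body(3, 1));
        System.out.println(req.method() + " " + req.uri());
        System.out.println(body(3, 1));
    }
}
```

Sending it would be a matter of passing the request to HttpClient.newHttpClient().send(...) against a running node. In practice the official client is usually preferable because it handles serialization, retries, and node discovery.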

2. Delete Index

DELETE /<index> deletes an index together with all of its documents. The operation cannot be undone.

REST API example
DELETE /my_index
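Java client implementation
import java.io.IOException;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public void deleteIndex(RestHighLevelClient client) throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest("my_index");
    AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
    // isAcknowledged() reports whether the cluster accepted the deletion.
    System.out.println("Index deleted: " + response.isAcknowledged());
}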


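II. Document-Level Operations

2. Search Documents

Search is the core feature of Elasticsearch. Its query DSL (Domain Specific Language) covers everything from simple matching to complex aggregations. The REST API uses GET /<index>/_search (POST also works), and the Java client builds queries with SearchRequest and SearchSourceBuilder.

Basic query recap

Start with a simple query:

REST API example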
GET /my_index/_search
{
  "query": {
    "match": {
      "title": "Elasticsearch"
    }
  }
}
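Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void basicSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("title", "Elasticsearch"));
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // In 7.x getTotalHits() returns a TotalHits object; .value holds the count.
    System.out.println("Total hits: " + response.getHits().getTotalHits().value);
}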
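What a match query does can be approximated in a few lines of plain Java: the query text is analyzed into terms, and a document matches if its field contains at least one of them (the default operator is OR). The sketch below is a simplified stand-in, not the real analyzer; the class MatchQuerySketch and its lowercase-and-split tokenizer are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Simplified model of a match query on a text field. */
final class MatchQuerySketch {

    /** Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics. */
    static Set<String> analyze(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
            .filter(t -> !t.isEmpty())
            .collect(Collectors.toSet());
    }

    /** OR semantics: one shared term is enough for a match. */
    static boolean matches(String fieldValue, String queryText) {
        Set<String> docTerms = analyze(fieldValue);
        return analyze(queryText).stream().anyMatch(docTerms::contains);
    }

    public static void main(String[] args) {
        System.out.println(matches("Getting Started with Elasticsearch", "elasticsearch"));
        System.out.println(matches("Kibana dashboards", "Elasticsearch"));
    }
}
```

This is why the query "Elasticsearch" finds titles regardless of letter case: both sides go through the same analysis before terms are compared.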

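Real-world queries are usually more complex. The sections below build up step by step.

Complex query scenarios
2.1 Boolean Query

A bool query combines several clauses (must, should, must_not, and filter) to control precisely which documents match.

REST API example

Find documents whose title contains "Elasticsearch", whose date is on or after 2025-01-01, and whose title does not contain "Basics":
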
GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" }}
      ],
      "filter": [
        { "range": { "date": { "gte": "2025-01-01" }}}
      ],
      "must_not": [
        { "match": { "title": "Basics" }}
      ]
    }
  }
}
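Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void booleanSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("title", "Elasticsearch"))    // scored
        .filter(QueryBuilders.rangeQuery("date").gte("2025-01-01"))  // not scored
        .mustNot(QueryBuilders.matchQuery("title", "Basics"));       // excluded

    sourceBuilder.query(boolQuery);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boolean query hits: " + response.getHits().getTotalHits().value);
    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsString());
    }
}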
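The matching logic of the bool query above can be modeled with java.util.function.Predicate: a document passes when every must and filter clause holds and no must_not clause does. This is a rough in-memory sketch; the Doc class, the substring-based titleContains helper, and the class name BoolQuerySketch are assumptions for illustration, and scoring (the difference between must and filter) is deliberately left out.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

/** In-memory model of bool query matching (no scoring). */
final class BoolQuerySketch {

    /** Hypothetical document with the two mapped fields. */
    static final class Doc {
        final String title;
        final LocalDate date;
        Doc(String title, LocalDate date) { this.title = title; this.date = date; }
    }

    /** Simplification: case-insensitive substring test instead of term matching. */
    static Predicate<Doc> titleContains(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        return d -> d.title.toLowerCase(Locale.ROOT).contains(needle);
    }

    /** All must and filter clauses hold, and no mustNot clause holds. */
    static Predicate<Doc> bool(List<Predicate<Doc>> must,
                               List<Predicate<Doc>> filter,
                               List<Predicate<Doc>> mustNot) {
        return d -> must.stream().allMatch(p -> p.test(d))
            && filter.stream().allMatch(p -> p.test(d))
            && mustNot.stream().noneMatch(p -> p.test(d));
    }

    /** The example query: title has "Elasticsearch", date >= 2025-01-01, no "Basics". */
    static Predicate<Doc> sampleQuery() {
        Predicate<Doc> onOrAfter2025 = d -> !d.date.isBefore(LocalDate.parse("2025-01-01"));
        return bool(
            List.of(titleContains("elasticsearch")),
            List.of(onOrAfter2025),
            List.of(titleContains("basics")));
    }

    public static void main(String[] args) {
        Doc doc = new Doc("Elasticsearch in Action", LocalDate.parse("2025-03-01"));
        System.out.println(sampleQuery().test(doc));
    }
}
```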
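2.2 Multi-Match Query

Use multi_match to search across several fields at once.

REST API example

Find documents that contain "Elasticsearch" in the title or the content (content is assumed to be an additional text field; it is not part of the mapping created earlier):
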
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title", "content"],
      "type": "best_fields"
    }
  }
}
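Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MultiMatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void multiMatchSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    // best_fields: the score comes from the best-matching field.
    MultiMatchQueryBuilder multiMatchQuery = QueryBuilders.multiMatchQuery("Elasticsearch", "title", "content")
        .type(MultiMatchQueryBuilder.Type.BEST_FIELDS);

    sourceBuilder.query(multiMatchQuery);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Multi-match hits: " + response.getHits().getTotalHits().value);
}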
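For best_fields, the per-field scores are combined the way a dis_max query does it: the best field's score wins, and the other fields only contribute through an optional tie_breaker (0.0 by default). A minimal sketch of that combination step, with BestFieldsSketch as a hypothetical name and the per-field scores taken as given inputs:

```java
/** Sketch of how best_fields combines per-field scores (dis_max style). */
final class BestFieldsSketch {

    /** max(scores) + tieBreaker * (sum of the remaining scores). */
    static double combine(double tieBreaker, double... fieldScores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : fieldScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Assumed scores: title matched strongly, content matched weakly.
        System.out.println(combine(0.0, 2.0, 0.5));
        System.out.println(combine(0.3, 2.0, 0.5));
    }
}
```

With the default tie_breaker the weaker field is ignored entirely; raising it lets documents that match in several fields edge ahead of those that match in only one.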
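2.3 Match Phrase Query

A match_phrase query requires all terms to appear adjacent to each other and in the given order, which makes it suitable for exact phrase search.

REST API example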
GET /my_index/_search
{
  "query": {
    "match_phrase": {
      "title": "Elasticsearch Basics"
    }
  }
}
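Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void phraseSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    sourceBuilder.query(QueryBuilders.matchPhraseQuery("title", "Elasticsearch Basics"));
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Phrase query hits: " + response.getHits().getTotalHits().value);
}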
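The difference from a plain match query is that term positions matter. The sketch below models the default case (slop 0) in plain Java: the query tokens must appear as a contiguous run inside the document tokens. PhraseMatchSketch and its simple tokenizer are assumptions for illustration, not the real analysis chain.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Simplified model of match_phrase with slop = 0. */
final class PhraseMatchSketch {

    /** Ordered tokens: lowercase, split on non-alphanumerics. */
    static List<String> tokens(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
            .filter(t -> !t.isEmpty())
            .collect(Collectors.toList());
    }

    /** True when the phrase tokens occur consecutively and in order. */
    static boolean matchesPhrase(String fieldValue, String phrase) {
        List<String> query = tokens(phrase);
        return !query.isEmpty() && Collections.indexOfSubList(tokens(fieldValue), query) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(matchesPhrase("Elasticsearch Basics for Beginners", "Elasticsearch Basics"));
        System.out.println(matchesPhrase("Basics of Elasticsearch", "Elasticsearch Basics"));
    }
}
```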
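2.4 Fuzzy Query

A fuzzy query tolerates a bounded number of spelling mistakes, measured as edit distance, which makes it useful for typo-tolerant search. It is a term-level query, so the search value is not analyzed.

REST API example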
GET /my_index/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "Elastcsearch",
        "fuzziness": "AUTO"
      }
    }
  }
}
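Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void fuzzySearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    // "Elastcsearch" is misspelled on purpose; AUTO picks the allowed edits by term length.
    sourceBuilder.query(QueryBuilders.fuzzyQuery("title", "Elastcsearch").fuzziness(Fuzziness.AUTO));
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Fuzzy query hits: " + response.getHits().getTotalHits().value);
}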
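To see why the misspelled "Elastcsearch" still matches, it helps to compute the edit distance by hand. With fuzziness AUTO and its default thresholds, terms of 1 to 2 characters must match exactly, terms of 3 to 5 characters may differ by one edit, and longer terms by two. The sketch below pairs that rule with an optimal-string-alignment distance (insert, delete, substitute, adjacent transposition). FuzzinessSketch is a hypothetical name, and this is an approximation of the idea rather than Lucene's actual automaton-based implementation.

```java
/** Sketch of fuzziness AUTO plus an edit-distance check. */
final class FuzzinessSketch {

    /** Edits allowed by AUTO with its default length thresholds (3 and 6). */
    static int autoMaxEdits(String term) {
        int len = term.codePointCount(0, term.length());
        if (len <= 2) return 0;
        if (len <= 5) return 1;
        return 2;
    }

    /** Optimal string alignment distance; an adjacent transposition costs one edit. */
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    /** True when the indexed term is within the edits AUTO allows for the query term. */
    static boolean fuzzyMatches(String queryTerm, String indexedTerm) {
        return editDistance(queryTerm, indexedTerm) <= autoMaxEdits(queryTerm);
    }

    public static void main(String[] args) {
        System.out.println(editDistance("elastcsearch", "elasticsearch"));
        System.out.println(fuzzyMatches("Elastcsearch", "elasticsearch"));
    }
}
```

Note that the capital "E" costs an extra edit against a lowercased indexed term, which is one reason to lowercase user input before sending it to a fuzzy query.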
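2.5 Aggregations

Aggregations compute statistics over the matching documents, for example grouping by a field or calculating an average.

REST API example

Count documents per date value. A terms aggregation buckets by exact value; to bucket by calendar interval such as day or month, date_histogram is usually the better fit.
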
GET /my_index/_search
{
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date"
      }
    }
  }
}
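Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void aggregationSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    // "by_date" is the aggregation name used to look up the result below.
    TermsAggregationBuilder aggregation = AggregationBuilders.terms("by_date").field("date");
    sourceBuilder.aggregation(aggregation);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    Terms byDateAgg = response.getAggregations().get("by_date");
    for (Terms.Bucket bucket : byDateAgg.getBuckets()) {
        System.out.println("Date: " + bucket.getKeyAsString() + ", Count: " + bucket.getDocCount());
    }
}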
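Conceptually, a terms aggregation is a group-and-count over one field, returning the largest buckets first (10 buckets by default). The in-memory sketch below shows the same idea with Java streams; TermsAggSketch is a hypothetical name, the input list stands in for the date values of the matching documents, and ties are broken by key here simply to keep the output deterministic.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** In-memory analogue of a terms aggregation. */
final class TermsAggSketch {

    /** Groups equal values, counts them, and returns the top `size` buckets by count. */
    static List<Map.Entry<String, Long>> bucketCounts(List<String> values, int size) {
        return values.stream()
            .collect(Collectors.groupingBy(v -> v, Collectors.counting()))
            .entrySet().stream()
            .sorted((a, b) -> {
                int byCount = Long.compare(b.getValue(), a.getValue()); // larger count first
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            })
            .limit(size)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> dates = List.of("2025-01-01", "2025-01-02", "2025-01-01");
        for (Map.Entry<String, Long> bucket : bucketCounts(dates, 10)) {
            System.out.println("Date: " + bucket.getKey() + ", Count: " + bucket.getValue());
        }
    }
}
```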
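Relevance scoring and tuning

By default Elasticsearch computes relevance scores with the BM25 algorithm. The score can be tuned in the following ways:

2.6 Boosting

Give a specific field or clause more weight.

REST API example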
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title^2", "content"]
    }
  }
}
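Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MultiMatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void boostedSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    MultiMatchQueryBuilder query = QueryBuilders.multiMatchQuery("Elasticsearch")
        .field("title", 2.0f)  // equivalent to "title^2": title matches weigh double
        .field("content");     // default boost of 1.0
    sourceBuilder.query(query);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boosted query hits: " + response.getHits().getTotalHits().value);
}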
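The "title^2" notation is just a field name with a multiplier attached. As a rough illustration, the sketch below parses that notation and applies the boosts to a set of assumed raw per-field scores, keeping the best result as best_fields would. FieldBoostSketch and the score values are hypothetical, and real scoring involves more than a single multiplication.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of field^boost parsing and its effect on per-field scores. */
final class FieldBoostSketch {

    /** Parses entries such as "title^2"; a field without '^' gets boost 1.0. */
    static Map<String, Float> parse(String... fields) {
        Map<String, Float> boosts = new LinkedHashMap<>();
        for (String f : fields) {
            int caret = f.lastIndexOf('^');
            if (caret < 0) {
                boosts.put(f, 1.0f);
            } else {
                boosts.put(f.substring(0, caret), Float.parseFloat(f.substring(caret + 1)));
            }
        }
        return boosts;
    }

    /** Multiplies each raw field score by its boost and keeps the highest. */
    static double bestBoostedScore(Map<String, Float> boosts, Map<String, Double> rawScores) {
        double best = 0.0;
        for (Map.Entry<String, Float> e : boosts.entrySet()) {
            best = Math.max(best, e.getValue() * rawScores.getOrDefault(e.getKey(), 0.0));
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Float> boosts = parse("title^2", "content");
        // Assumed raw scores: content matched slightly better than title.
        System.out.println(bestBoostedScore(boosts, Map.of("title", 1.0, "content", 1.5)));
    }
}
```

In this example the title score (1.0) overtakes the content score (1.5) once doubled, which is exactly the ranking shift boosting is meant to produce.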
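2.7 Function Score Query

A function_score query lets you customize the score, for example by factoring in a field value or a distance. The example below assumes views is a numeric field present on the documents; it is not part of the mapping created earlier.

REST API example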
GET /my_index/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "title": "Elasticsearch" }},
      "functions": [
        {
          "field_value_factor": {
            "field": "views",
            "factor": 1.5,
            "modifier": "sqrt"
          }
        }
      ]
    }
  }
}
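Java client implementation
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.lucene.search.function.FieldValueFactorFunction;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void functionScoreSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    // Scale the match score by sqrt(1.5 * views).
    FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
        QueryBuilders.matchQuery("title", "Elasticsearch"),
        ScoreFunctionBuilders.fieldValueFactorFunction("views")
            .factor(1.5f)
            .modifier(FieldValueFactorFunction.Modifier.SQRT)
    );
    sourceBuilder.query(functionScoreQuery);
    request.source(sourceBuilder);

    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Function score hits: " + response.getHits().getTotalHits().value);
}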
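The arithmetic behind this query is small enough to write out. With factor 1.5 and modifier sqrt, the function value is sqrt(1.5 * views), and with the default boost_mode (multiply) it is multiplied into the query score. The sketch below reproduces that calculation; FieldValueFactorSketch and the sample numbers are assumptions for illustration.

```java
/** Sketch of field_value_factor with modifier sqrt and boost_mode multiply. */
final class FieldValueFactorSketch {

    /** Function value: sqrt(factor * fieldValue). */
    static double sqrtFactor(double fieldValue, double factor) {
        return Math.sqrt(factor * fieldValue);
    }

    /** Final score when the function value is multiplied into the query score. */
    static double finalScore(double queryScore, double views, double factor) {
        return queryScore * sqrtFactor(views, factor);
    }

    public static void main(String[] args) {
        // Assumed inputs: query score 2.0, 6 views, factor 1.5.
        System.out.println(finalScore(2.0, 6.0, 1.5));
    }
}
```

One consequence worth noticing: a document with zero views ends up with a final score of zero under multiply, so a different modifier (such as log1p) or boost_mode may be a better fit when that is not intended.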
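Query performance tips
  1. Prefer filter over must: filter clauses skip relevance scoring and can be cached, so they are cheaper whenever a condition only needs a yes/no answer.
  2. Control pagination: use the from and size parameters to limit the result window.
  3. Select fields: use _source filtering or stored_fields to return only the fields you need and reduce data transfer.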
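The three tips can be combined in a single request body. The sketch below derives from and size from a page number, rejects windows beyond the default index.max_result_window of 10,000, limits the returned fields through _source, and puts the date condition in a filter clause. SearchPageSketch, the 1-based page convention, and the hard-coded date range are assumptions for illustration.

```java
/** Sketch: paginated, filtered search body with a restricted _source. */
final class SearchPageSketch {

    /** Default value of index.max_result_window. */
    static final int MAX_RESULT_WINDOW = 10_000;

    /** Converts a 1-based page number into the `from` offset. */
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        int from = Math.multiplyExact(page - 1, size);
        if (from > MAX_RESULT_WINDOW - size) {
            throw new IllegalArgumentException(
                "from + size exceeds " + MAX_RESULT_WINDOW + "; consider search_after");
        }
        return from;
    }

    /** Builds the JSON body; _source is omitted when no fields are requested. */
    static String body(int page, int size, String... sourceFields) {
        String source = sourceFields.length == 0
            ? ""
            : ",\"_source\":[\"" + String.join("\",\"", sourceFields) + "\"]";
        return "{\"from\":" + from(page, size)
            + ",\"size\":" + size
            + source
            + ",\"query\":{\"bool\":{\"filter\":["
            + "{\"range\":{\"date\":{\"gte\":\"2025-01-01\"}}}]}}}";
    }

    public static void main(String[] args) {
        System.out.println(body(2, 10, "title", "date"));
    }
}
```

For deep pagination past the result window, search_after is the usual alternative to raising the limit.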
