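Elasticsearch APIs and Their Java Client Equivalents
Elasticsearch is a distributed search and analytics engine whose RESTful API exposes a rich set of operations. It also ships official Java clients, namely the Java High-Level REST Client (deprecated since 7.15) and the newer Elasticsearch Java API Client, so that Java applications can talk to it directly. This article covers the core Elasticsearch APIs at two levels, the index level and the document level, and shows how to build each request with the Java client.
I. Index-Level Operations
An index is the logical unit in which Elasticsearch stores data, loosely comparable to a database (or, since mapping types were removed, a table) in a relational system. Index-level operations cover creating, managing, and deleting indices.
1. Create Index
An index is created with PUT /<index>, optionally with mappings and settings in the request body.
REST API Example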
json
PUT /my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "date": { "type": "date" }
    }
  }
}
Java Client Implementation
Create the index with RestHighLevelClient (7.x):
java
import java.io.IOException;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
// The typeless mapping(String, XContentType) overload belongs to the client-side request class.
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.common.settings.Settings;
// Later 7.x releases move this class to org.elasticsearch.xcontent.XContentType.
import org.elasticsearch.common.xcontent.XContentType;

public void createIndex(RestHighLevelClient client) throws IOException {
    CreateIndexRequest request = new CreateIndexRequest("my_index");
    request.settings(Settings.builder()
        .put("index.number_of_shards", 3)
        .put("index.number_of_replicas", 1)
    );
    request.mapping("{\"properties\":{\"title\":{\"type\":\"text\"},\"date\":{\"type\":\"date\"}}}",
        XContentType.JSON);
    CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.println("Index created: " + response.isAcknowledged());
}
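To see what the client sends over the wire, the same request can be assembled with the JDK's own HTTP API. The following is only a sketch: the class name CreateIndexHttpSketch and the node address http://localhost:9200 are assumptions for illustration, and the request is built but never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class CreateIndexHttpSketch {
    // Same settings and mappings as the REST example above.
    static final String BODY =
        "{\"settings\":{\"number_of_shards\":3,\"number_of_replicas\":1},"
      + "\"mappings\":{\"properties\":{"
      + "\"title\":{\"type\":\"text\"},\"date\":{\"type\":\"date\"}}}}";

    // Builds, but does not send, the PUT /<index> request.
    static HttpRequest build(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(BODY))
                .build();
    }

    public static void main(String[] args) {
        // http://localhost:9200 is an assumed local node address.
        HttpRequest request = build("http://localhost:9200", "my_index");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would take an HttpClient and a reachable cluster; in real code the official client remains the more convenient route because it also parses the response.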
2. Delete Index
An index is deleted with DELETE /<index>.
REST API Example
json
DELETE /my_index
Java Client Implementation
java
import java.io.IOException;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public void deleteIndex(RestHighLevelClient client) throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest("my_index");
    AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println("Index deleted: " + response.isAcknowledged());
}
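II. Document-Level Operations
2. Search Documents
Search is the core feature of Elasticsearch. Its query DSL (Domain Specific Language) covers everything from simple matches to complex aggregations. The REST API uses GET /<index>/_search; the Java client builds the same request with SearchRequest and SearchSourceBuilder.
Basic Query Recap
Start with a simple query:
REST API Example
json
GET /my_index/_search
{
  "query": {
    "match": {
      "title": "Elasticsearch"
    }
  }
}
Java Client Implementation
java
import java.io.IOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public void basicSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("title", "Elasticsearch"));
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // In 7.x getTotalHits() returns a TotalHits object; value is capped at 10,000 unless track_total_hits is set.
    System.out.println("Total hits: " + response.getHits().getTotalHits().value);
}
In practice, queries are usually more complex. The following sections build up step by step.
Complex Query Scenarios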
2.1 Boolean Query
A bool query combines several clauses (must, should, must_not, filter) to control precisely which documents match and which clauses contribute to the score.
REST API Example
Find documents whose title contains "Elasticsearch", whose date is on or after 2025-01-01, and whose title does not contain "Basics":
json
GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" }}
      ],
      "filter": [
        { "range": { "date": { "gte": "2025-01-01" }}}
      ],
      "must_not": [
        { "match": { "title": "Basics" }}
      ]
    }
  }
}
Java Client Implementation
java
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.search.SearchHit;

public void booleanSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("title", "Elasticsearch"))
        .filter(QueryBuilders.rangeQuery("date").gte("2025-01-01"))
        .mustNot(QueryBuilders.matchQuery("title", "Basics"));
    sourceBuilder.query(boolQuery);
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boolean query hits: " + response.getHits().getTotalHits().value);
    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsString());
    }
}
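The clause semantics can be illustrated with a toy in-memory model. This is a sketch of the matching logic only, not of how Elasticsearch executes queries: the Doc class, the sample data, and the whole-word titleHas check standing in for a match query are all assumptions made for illustration.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class BoolQuerySketch {
    // Hypothetical in-memory stand-in for an indexed document.
    static final class Doc {
        final String title;
        final LocalDate date;
        Doc(String title, LocalDate date) { this.title = title; this.date = date; }
    }

    // Crude stand-in for a match query: case-insensitive whole-word containment.
    static Predicate<Doc> titleHas(String word) {
        return d -> List.of(d.title.toLowerCase().split("\\W+")).contains(word.toLowerCase());
    }

    // A document matches when must AND filter hold and must_not does not.
    // In Elasticsearch only the must clause would also contribute to the score.
    static List<String> search(List<Doc> docs) {
        Predicate<Doc> must = titleHas("Elasticsearch");
        Predicate<Doc> filter = d -> !d.date.isBefore(LocalDate.of(2025, 1, 1)); // gte
        Predicate<Doc> mustNot = titleHas("Basics");
        return docs.stream()
                .filter(must.and(filter).and(mustNot.negate()))
                .map(d -> d.title)
                .collect(Collectors.toList());
    }

    static List<Doc> sampleDocs() {
        return List.of(
            new Doc("Elasticsearch in Production", LocalDate.of(2025, 3, 10)),
            new Doc("Elasticsearch Basics", LocalDate.of(2025, 2, 1)),      // rejected by must_not
            new Doc("Elasticsearch Internals", LocalDate.of(2024, 12, 31)), // rejected by filter
            new Doc("Kafka Streams", LocalDate.of(2025, 5, 5)),             // rejected by must
            new Doc("Scaling Elasticsearch", LocalDate.of(2025, 1, 1)));    // gte is inclusive
    }

    public static void main(String[] args) {
        System.out.println(search(sampleDocs()));
    }
}
```

Only the first and last sample documents survive all three clauses, which mirrors what the bool query above would return for equivalent data.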
2.2 Multi-Match Query
Use multi_match to search across several fields at once.
REST API Example
Find documents whose title or content contains "Elasticsearch":
json
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title", "content"],
      "type": "best_fields"
    }
  }
}
Java Client Implementation
java
import org.elasticsearch.index.query.MultiMatchQueryBuilder;

public void multiMatchSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    MultiMatchQueryBuilder multiMatchQuery = QueryBuilders.multiMatchQuery("Elasticsearch", "title", "content")
        .type(MultiMatchQueryBuilder.Type.BEST_FIELDS);
    sourceBuilder.query(multiMatchQuery);
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Multi-match hits: " + response.getHits().getTotalHits().value);
}
2.3 Match Phrase Query
Requires the terms to appear adjacent and in the same order, which suits exact-phrase search.
REST API Example
json
GET /my_index/_search
{
  "query": {
    "match_phrase": {
      "title": "Elasticsearch Basics"
    }
  }
}
Java Client Implementation
java
public void phraseSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchPhraseQuery("title", "Elasticsearch Basics"));
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Phrase query hits: " + response.getHits().getTotalHits().value);
}
2.4 Fuzzy Query
Tolerates a limited number of spelling errors, measured as edit distance, which suits typo-tolerant search.
REST API Example
json
GET /my_index/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "Elastcsearch",
        "fuzziness": "AUTO"
      }
    }
  }
}
Java Client Implementation
java
import org.elasticsearch.common.unit.Fuzziness;

public void fuzzySearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.fuzzyQuery("title", "Elastcsearch").fuzziness(Fuzziness.AUTO));
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Fuzzy query hits: " + response.getHits().getTotalHits().value);
}
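To make "a limited number of spelling errors" concrete, here is a small sketch of the idea behind fuzziness. It uses plain Levenshtein distance, which is a simplification: by default Elasticsearch also counts a swap of two adjacent characters as a single edit. The AUTO thresholds follow the documented defaults (0 to 2 characters: exact match, 3 to 5: one edit, longer: two edits); the class and method names are hypothetical.

```java
final class FuzzinessSketch {
    // Classic Levenshtein distance (insert, delete, substitute) with a two-row table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // the finished row becomes prev
        }
        return prev[b.length()];
    }

    // Default AUTO thresholds: 0-2 chars -> 0 edits, 3-5 chars -> 1 edit, 6+ chars -> 2 edits.
    static int autoMaxEdits(String term) {
        int n = term.length();
        return n < 3 ? 0 : (n < 6 ? 1 : 2);
    }

    // True when the indexed term is within the allowed edit distance of the query term.
    static boolean fuzzyMatches(String queryTerm, String indexedTerm) {
        return levenshtein(queryTerm, indexedTerm) <= autoMaxEdits(queryTerm);
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("elastcsearch", "elasticsearch"));
        System.out.println(fuzzyMatches("elastcsearch", "elasticsearch"));
    }
}
```

The misspelled term from the example is one insertion away from the correct one, well inside the two edits AUTO allows for a term of that length.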
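2.5 Aggregations
Aggregations compute statistics over the matching documents, for example grouping by a field or calculating an average.
REST API Example
Count documents per distinct date value. A terms aggregation creates one bucket per distinct value; for bucketing by calendar interval, date_histogram is usually the better fit.
json
GET /my_index/_search
{
  "aggs": {
    "by_date": {
      "terms": {
        "field": "date"
      }
    }
  }
}
Java Client Implementation
java
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;

public void aggregationSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    TermsAggregationBuilder aggregation = AggregationBuilders.terms("by_date").field("date");
    sourceBuilder.aggregation(aggregation);
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    Terms byDateAgg = response.getAggregations().get("by_date");
    for (Terms.Bucket bucket : byDateAgg.getBuckets()) {
        System.out.println("Date: " + bucket.getKeyAsString() + ", Count: " + bucket.getDocCount());
    }
}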
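Conceptually, a terms aggregation groups documents by a field value and counts each group. The sketch below mimics that with Java streams on in-memory strings. It is an illustration only: the class name is hypothetical, the buckets are ordered by key for readability, and a real terms aggregation returns only the top buckets (10 by default) ordered by document count.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

final class TermsAggSketch {
    // One bucket per distinct value, holding the number of occurrences.
    static Map<String, Long> countByKey(List<String> values) {
        return values.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical date values of three documents.
        System.out.println(countByKey(List.of("2025-01-01", "2025-01-02", "2025-01-01")));
    }
}
```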
Relevance Scoring and Tuning
Elasticsearch computes relevance scores with the BM25 algorithm by default. The score can be tuned in the following ways:
2.6 Boosting
Raise the weight of specific fields or clauses.
REST API Example
json
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title^2", "content"]
    }
  }
}
Java Client Implementation
java
import org.elasticsearch.index.query.MultiMatchQueryBuilder;

public void boostedSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    MultiMatchQueryBuilder query = QueryBuilders.multiMatchQuery("Elasticsearch", "title", "content")
        .field("title", 2.0f); // boost the title field, equivalent to "title^2"
    sourceBuilder.query(query);
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Boosted query hits: " + response.getHits().getTotalHits().value);
}
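How a field boost changes the outcome can be sketched with a toy model of best_fields scoring, the default multi_match type: each field is scored separately, the boost multiplies that field's score, and the best field wins, with tie_breaker (default 0) optionally mixing in the other fields. The per-field scores below are made-up numbers and the class name is hypothetical; real scores come from BM25.

```java
import java.util.Map;

final class BestFieldsSketch {
    // fieldScores: hypothetical per-field match scores; boosts: per-field multipliers (default 1.0).
    static double bestFields(Map<String, Double> fieldScores, Map<String, Double> boosts, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : fieldScores.entrySet()) {
            double boosted = e.getValue() * boosts.getOrDefault(e.getKey(), 1.0);
            max = Math.max(max, boosted);
            sum += boosted;
        }
        // Best field plus a tie_breaker-weighted share of the remaining fields.
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        Map<String, Double> scores = Map.of("title", 1.2, "content", 2.0);
        System.out.println(bestFields(scores, Map.of(), 0.0));             // content wins
        System.out.println(bestFields(scores, Map.of("title", 2.0), 0.0)); // boosted title wins
    }
}
```

With these made-up scores, content is the best field until title is boosted by 2, after which title determines the document's score.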
2.7 Function Score Query
Customizes the scoring logic, for example by factoring in a field value or a distance.
REST API Example
json
GET /my_index/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "title": "Elasticsearch" }},
      "functions": [
        {
          "field_value_factor": {
            "field": "views",
            "factor": 1.5,
            "modifier": "sqrt"
          }
        }
      ]
    }
  }
}
Java Client Implementation
java
import org.elasticsearch.common.lucene.search.function.FieldValueFactorFunction;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;

public void functionScoreSearch(RestHighLevelClient client) throws IOException {
    SearchRequest request = new SearchRequest("my_index");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
        QueryBuilders.matchQuery("title", "Elasticsearch"),
        ScoreFunctionBuilders.fieldValueFactorFunction("views")
            .factor(1.5f)
            .modifier(FieldValueFactorFunction.Modifier.SQRT)
    );
    sourceBuilder.query(functionScoreQuery);
    request.source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    System.out.println("Function score hits: " + response.getHits().getTotalHits().value);
}
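The arithmetic behind this query is short enough to work through by hand. With the sqrt modifier, field_value_factor evaluates to sqrt(factor * views), and under the default boost_mode (multiply) that value is multiplied with the query score. The sketch below reproduces the calculation; the class name and the input numbers are assumptions for illustration.

```java
final class FieldValueFactorSketch {
    // field_value_factor with modifier "sqrt": sqrt(factor * fieldValue).
    static double fieldValueFactorSqrt(double fieldValue, double factor) {
        return Math.sqrt(factor * fieldValue);
    }

    // Default boost_mode "multiply": the query score is multiplied by the function value.
    static double finalScore(double queryScore, double views, double factor) {
        return queryScore * fieldValueFactorSqrt(views, factor);
    }

    public static void main(String[] args) {
        // Hypothetical document: query score 2.0, views = 6, factor = 1.5.
        System.out.println(finalScore(2.0, 6.0, 1.5));
    }
}
```

For a document with 6 views, the function value is sqrt(1.5 * 6) = 3, so a query score of 2.0 becomes 6.0. The square root dampens the effect of very large view counts.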
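Query Performance Tips
- Prefer filter over must: filter clauses skip relevance scoring, so they are cheaper for conditions that should not affect ranking.
- Control pagination: use the from and size parameters to bound the result window.
- Select fields: use stored_fields or _source filtering to trim the returned fields and reduce data transfer.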
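The pagination and field-selection tips can be combined in a small helper that turns a page number into from/size and lists the wanted _source fields. This is a sketch under stated assumptions: the class and method names are hypothetical, and the 10,000 limit is the default value of index.max_result_window, beyond which from + size requests are rejected and search_after is the usual alternative.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class PaginationSketch {
    static final int MAX_RESULT_WINDOW = 10_000; // default index.max_result_window

    // Converts a 1-based page number into the "from" offset, refusing windows past the default limit.
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size;
        if (from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds " + MAX_RESULT_WINDOW + "; consider search_after");
        }
        return (int) from;
    }

    // Builds a search body that pages the results and returns only the listed _source fields.
    static String searchBody(int page, int size, String... sourceFields) {
        String fields = Arrays.stream(sourceFields)
                .map(f -> "\"" + f + "\"")
                .collect(Collectors.joining(", "));
        return "{ \"from\": " + fromOffset(page, size) + ", \"size\": " + size
                + ", \"_source\": [" + fields + "] }";
    }

    public static void main(String[] args) {
        System.out.println(searchBody(3, 20, "title", "date"));
    }
}
```

Page 3 with 20 results per page starts at offset 40. The field names title and date match the mapping used throughout this article.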