Advanced Use Cases for Elasticsearch QueryBuilders

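This article walks through advanced uses of Elasticsearch QueryBuilders: compound boolean queries, aggregations, nested queries, pagination and sorting, and highlighting, each with a code example. The examples build requests with Spring Data Elasticsearch (NativeSearchQueryBuilder and ElasticsearchRestTemplate).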

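1. Compound Queries (Multi-Level Boolean Queries)

1.1 Multi-Level Nested Boolean Queries

java
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.query.NativeSearchQuery;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;

// Build a multi-level nested boolean query
BoolQueryBuilder outerBoolQuery = QueryBuilders.boolQuery();

// Level 1: a condition every document must satisfy
outerBoolQuery.must(QueryBuilders.termQuery("status", "active"));

// Level 2: condition group 1 (OR semantics)
BoolQueryBuilder innerShouldQuery1 = QueryBuilders.boolQuery();
innerShouldQuery1.should(QueryBuilders.matchQuery("title", "elasticsearch"));
innerShouldQuery1.should(QueryBuilders.matchQuery("content", "elasticsearch"));
innerShouldQuery1.minimumShouldMatch(1); // at least one should clause must match
outerBoolQuery.must(innerShouldQuery1);

// Level 3: condition group 2 (range conditions)
BoolQueryBuilder innerBoolQuery2 = QueryBuilders.boolQuery();
innerBoolQuery2.filter(QueryBuilders.rangeQuery("publishDate").gte("2023-01-01").lte("2023-12-31"));
innerBoolQuery2.filter(QueryBuilders.rangeQuery("viewCount").gt(100));
outerBoolQuery.filter(innerBoolQuery2);

// Level 4: exclusion
outerBoolQuery.mustNot(QueryBuilders.termQuery("category", "spam"));

// Build the search request
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(outerBoolQuery)
    .withPageable(PageRequest.of(0, 20))
    .build();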
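
One detail worth spelling out is why the inner should group sets minimumShouldMatch(1) explicitly. In Elasticsearch, should clauses are required by default only when the bool query has no must or filter clauses; otherwise the default minimum_should_match is 0 and they only influence scoring. The plain-Java sketch below models that rule in memory. BoolQuerySketch and its matches method are hypothetical names used for illustration, not part of any Elasticsearch API.

```java
// In-memory model of how a bool query decides whether a document matches.
final class BoolQuerySketch {

    /**
     * @param mustOrFilter       outcome of each must/filter clause for one document
     * @param should             outcome of each should clause
     * @param mustNot            outcome of each must_not clause
     * @param minimumShouldMatch explicit setting, or null to use the default
     */
    static boolean matches(boolean[] mustOrFilter, boolean[] should, boolean[] mustNot,
                           Integer minimumShouldMatch) {
        for (boolean clause : mustOrFilter) {
            if (!clause) {
                return false; // every must/filter clause has to match
            }
        }
        for (boolean clause : mustNot) {
            if (clause) {
                return false; // any matching must_not clause excludes the document
            }
        }
        int matched = 0;
        for (boolean clause : should) {
            if (clause) {
                matched++;
            }
        }
        // Default: 1 when there are should clauses but no must/filter clauses, else 0.
        int required = minimumShouldMatch != null
            ? minimumShouldMatch
            : (should.length > 0 && mustOrFilter.length == 0 ? 1 : 0);
        return matched >= required;
    }
}
```

Setting the value explicitly, as the example does, keeps the OR group mandatory even if a must or filter clause is later added to the same bool query.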

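1.2 Multi-Field Weighted Queries

java
import org.elasticsearch.common.lucene.search.function.CombineFunction;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;

// Weight matches on several fields with function_score
FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
    // Base query
    QueryBuilders.matchQuery("content", "elasticsearch tutorial"),
    // Scoring functions
    new FunctionScoreQueryBuilder.FilterFunctionBuilder[] {
        // Boost documents whose title matches
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            QueryBuilders.matchQuery("title", "elasticsearch tutorial"),
            ScoreFunctionBuilders.weightFactorFunction(3.0f)
        ),
        // Boost documents whose tags match
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            QueryBuilders.termsQuery("tags", "elasticsearch", "tutorial"),
            ScoreFunctionBuilders.weightFactorFunction(2.0f)
        ),
        // Boost recently published documents
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            QueryBuilders.matchAllQuery(),
            // Arguments are (field, origin, scale, offset, decay): the function
            // value drops to 0.5 for documents published 30 days before now.
            // A DateHistogramInterval is not a valid origin.
            ScoreFunctionBuilders.gaussDecayFunction("publishDate", "now", "30d", "0d", 0.5)
        )
    }
).boostMode(CombineFunction.SUM); // add the combined function score to the query score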

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(functionScoreQuery)
    .build();
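
To make the decay function less of a black box, here is a minimal sketch of the Gaussian decay formula as described in the Elasticsearch function_score documentation, using a scale of 30 days and a decay of 0.5 as in the example. GaussDecaySketch is a hypothetical helper, and distances are plain numbers of days rather than real date math.

```java
// Sketch of the Gaussian decay used by function_score (distances in days).
final class GaussDecaySketch {

    /**
     * @param distance distance between the field value and the origin
     * @param scale    distance beyond the offset at which the score equals decay
     * @param offset   distance within which no decay is applied
     * @param decay    score at distance == offset + scale, between 0 and 1 (exclusive)
     */
    static double score(double distance, double scale, double offset, double decay) {
        // sigma^2 is chosen so that the curve passes through `decay` at `scale`.
        double sigmaSquared = -(scale * scale) / (2.0 * Math.log(decay));
        double effective = Math.max(0.0, Math.abs(distance) - offset);
        return Math.exp(-(effective * effective) / (2.0 * sigmaSquared));
    }

    public static void main(String[] args) {
        for (double days : new double[] {0, 15, 30, 60}) {
            System.out.println(days + " days old -> " + score(days, 30, 0, 0.5));
        }
    }
}
```

A document published exactly at the origin scores 1, one that is 30 days old scores 0.5, and the score keeps falling quickly after that (1/16 at 60 days).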

2. Aggregations (AggregationBuilders)

2.1 Basic Aggregations

java
// 1. Terms aggregation (group and count)
TermsAggregationBuilder termsAgg = AggregationBuilders.terms("by_category")
    .field("category.keyword")
    .size(10);

// 2. Stats aggregation on a numeric field
StatsAggregationBuilder statsAgg = AggregationBuilders.stats("price_stats")
    .field("price");

// 3. Average aggregation
AvgAggregationBuilder avgAgg = AggregationBuilders.avg("avg_rating")
    .field("rating");

// Build the search request
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.matchAllQuery())
    .addAggregation(termsAgg)
    .addAggregation(statsAgg)
    .addAggregation(avgAgg)
    // Return aggregations only, no documents. PageRequest.of(0, 0) would throw
    // IllegalArgumentException because Spring Data requires a page size of at least 1;
    // withMaxResults(0) is available in Spring Data Elasticsearch 4.2 and later.
    .withMaxResults(0)
    .build();

// Run the query
SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);

// Read the aggregation results
Terms byCategoryTerms = searchHits.getAggregations().get("by_category");
for (Terms.Bucket bucket : byCategoryTerms.getBuckets()) {
    String category = bucket.getKeyAsString();
    long count = bucket.getDocCount();
    System.out.println("Category: " + category + ", Count: " + count);
}

Stats priceStats = searchHits.getAggregations().get("price_stats");
System.out.println("Min Price: " + priceStats.getMin());
System.out.println("Max Price: " + priceStats.getMax());
System.out.println("Avg Price: " + priceStats.getAvg());

2.2 Sub-Aggregations (Nested Buckets)

java
// Group by brand, then by price range within each brand, then compute the average rating of each group
TermsAggregationBuilder brandAgg = AggregationBuilders.terms("by_brand")
    .field("brand.keyword")
    .size(10);

RangeAggregationBuilder priceRangeAgg = AggregationBuilders.range("price_ranges")
    .field("price")
    .addUnboundedTo(1000) // price < 1000
    .addRange(1000, 5000) // 1000 <= price < 5000
    .addUnboundedFrom(5000); // price >= 5000

AvgAggregationBuilder ratingAgg = AggregationBuilders.avg("avg_rating")
    .field("rating");

// Attach the sub-aggregations
priceRangeAgg.subAggregation(ratingAgg);
brandAgg.subAggregation(priceRangeAgg);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.matchAllQuery())
    .addAggregation(brandAgg)
    .withMaxResults(0) // aggregations only; PageRequest.of(0, 0) would throw IllegalArgumentException
    .build();

// Run the query and read the nested aggregation results
SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);
Terms byBrandTerms = searchHits.getAggregations().get("by_brand");

for (Terms.Bucket brandBucket : byBrandTerms.getBuckets()) {
    String brand = brandBucket.getKeyAsString();
    System.out.println("Brand: " + brand);
    
    Range priceRanges = brandBucket.getAggregations().get("price_ranges");
    for (Range.Bucket rangeBucket : priceRanges.getBuckets()) {
        String range = rangeBucket.getKeyAsString();
        long count = rangeBucket.getDocCount();
        Avg avgRating = rangeBucket.getAggregations().get("avg_rating");
        
        System.out.println("  Price Range: " + range + ", Count: " + count + ", Avg Rating: " + avgRating.getValue());
    }
}
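
The boundaries of a range aggregation are easy to get wrong: each bucket includes its from value and excludes its to value, so a price of exactly 1000 lands in the middle bucket. The following in-memory sketch mirrors the three ranges of the example. RangeBucketSketch is a hypothetical name, and the bucket keys follow the default key format Elasticsearch generates when no explicit key is set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory model of the price_ranges buckets from the example.
final class RangeBucketSketch {

    // Each bucket includes its lower bound and excludes its upper bound.
    static String bucketKey(double price) {
        if (price < 1000) {
            return "*-1000.0";
        }
        if (price < 5000) {
            return "1000.0-5000.0";
        }
        return "5000.0-*";
    }

    // Counts documents per bucket, like the doc_count of each range bucket.
    static Map<String, Long> countByBucket(double... prices) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (double price : prices) {
            counts.merge(bucketKey(price), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByBucket(999.99, 1000, 4999, 5000));
    }
}
```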

2.3 Date Histogram Aggregations

java
// Count documents per month
DateHistogramAggregationBuilder dateAgg = AggregationBuilders.dateHistogram("by_month")
    .field("publishDate")
    // calendarInterval replaces dateHistogramInterval, which is deprecated since Elasticsearch 7.2
    .calendarInterval(DateHistogramInterval.MONTH)
    .format("yyyy-MM")
    .minDocCount(1); // only return months that contain documents

// Sub-aggregation: average view count per month
AvgAggregationBuilder viewAvgAgg = AggregationBuilders.avg("avg_views")
    .field("viewCount");
dateAgg.subAggregation(viewAvgAgg);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.rangeQuery("publishDate").gte("2023-01-01"))
    .addAggregation(dateAgg)
    .withMaxResults(0) // aggregations only; PageRequest.of(0, 0) would throw IllegalArgumentException
    .build();
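
For readers who want to sanity-check what this aggregation returns, the sketch below computes the same thing in memory: documents are grouped by calendar month and the views are averaged per group. Months without documents simply do not appear, which corresponds to minDocCount(1). MonthlyHistogramSketch and its method are hypothetical names, and the parallel arrays stand in for indexed documents.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Map;
import java.util.TreeMap;

// In-memory equivalent of a monthly date histogram with an avg sub-aggregation.
final class MonthlyHistogramSketch {

    static Map<YearMonth, Double> averageViewsByMonth(LocalDate[] publishDates, long[] viewCounts) {
        // Accumulate {sum, count} per month; TreeMap keeps the months in order.
        Map<YearMonth, long[]> totals = new TreeMap<>();
        for (int i = 0; i < publishDates.length; i++) {
            long[] cell = totals.computeIfAbsent(YearMonth.from(publishDates[i]), key -> new long[2]);
            cell[0] += viewCounts[i];
            cell[1]++;
        }
        Map<YearMonth, Double> averages = new TreeMap<>();
        totals.forEach((month, cell) -> averages.put(month, (double) cell[0] / cell[1]));
        return averages;
    }

    public static void main(String[] args) {
        LocalDate[] dates = {
            LocalDate.of(2023, 1, 5), LocalDate.of(2023, 1, 20), LocalDate.of(2023, 3, 1)
        };
        long[] views = {100, 300, 50};
        // YearMonth prints as yyyy-MM, the same shape as the format used in the aggregation.
        System.out.println(averageViewsByMonth(dates, views));
    }
}
```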

3. Nested Queries

3.1 Basic Nested Queries

java
// Assume documents contain a comments field mapped as type nested
// {"id": 1, "title": "...", "comments": [{"user": "张三", "content": "...", "rating": 5}]}

// Query on the nested field (the term query assumes comments.user is a keyword field)
NestedQueryBuilder nestedQuery = QueryBuilders.nestedQuery(
    "comments", // path of the nested field
    QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("comments.user", "张三"))
        .must(QueryBuilders.rangeQuery("comments.rating").gte(4)),
    ScoreMode.Avg // how scores of matching nested objects are combined
);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(nestedQuery)
    .build();
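
The reason this query needs a nested mapping is that arrays of plain objects are flattened per field, so two conditions can be satisfied by two different comments. The sketch below contrasts the two behaviours in memory. NestedMatchSketch, Comment, and the user names alice and bob are hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.List;

// Contrast between flattened object arrays and nested objects.
final class NestedMatchSketch {

    static final class Comment {
        final String user;
        final int rating;

        Comment(String user, int rating) {
            this.user = user;
            this.rating = rating;
        }
    }

    // Flattened (object type): each condition may be satisfied by a different comment.
    static boolean flattenedMatch(List<Comment> comments, String user, int minRating) {
        boolean anyUser = comments.stream().anyMatch(c -> c.user.equals(user));
        boolean anyRating = comments.stream().anyMatch(c -> c.rating >= minRating);
        return anyUser && anyRating;
    }

    // Nested type: both conditions must hold on the same comment.
    static boolean nestedMatch(List<Comment> comments, String user, int minRating) {
        return comments.stream().anyMatch(c -> c.user.equals(user) && c.rating >= minRating);
    }

    public static void main(String[] args) {
        List<Comment> comments = Arrays.asList(new Comment("alice", 2), new Comment("bob", 5));
        System.out.println("flattened: " + flattenedMatch(comments, "alice", 4));
        System.out.println("nested:    " + nestedMatch(comments, "alice", 4));
    }
}
```

Here alice only left a rating of 2, yet the flattened check reports a match because bob's rating of 5 satisfies the range condition; the nested check correctly reports no match.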

3.2 Combining Nested Queries with Aggregations

java
// Find articles that have highly rated comments and compute the average comment rating
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

// Base condition
boolQuery.must(QueryBuilders.termQuery("status", "published"));

// Nested query: at least one comment rated above 4
boolQuery.filter(
    QueryBuilders.nestedQuery(
        "comments",
        QueryBuilders.rangeQuery("comments.rating").gt(4),
        ScoreMode.None
    )
);

// Nested aggregation: average rating over all comments of the matching articles
// (a single value for the whole result set, not one value per article)
NestedAggregationBuilder nestedAgg = AggregationBuilders.nested("comments_agg", "comments");
AvgAggregationBuilder avgRatingAgg = AggregationBuilders.avg("avg_comment_rating").field("comments.rating");
nestedAgg.subAggregation(avgRatingAgg);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(boolQuery)
    .addAggregation(nestedAgg)
    .withPageable(PageRequest.of(0, 10))
    .build();

4. Pagination, Sorting, and Highlighting

4.1 Efficient Pagination and Sorting

java
// Build the query
BoolQueryBuilder query = QueryBuilders.boolQuery()
    .must(QueryBuilders.matchQuery("title", "elasticsearch"))
    .filter(QueryBuilders.termQuery("status", "active"));

// Sort on several fields. withSorts expects SortBuilder<?> elements, and
// search_after needs a deterministic order, so add a unique tiebreaker field
// as the last sort if the mapping has one.
List<SortBuilder<?>> sortBuilders = new ArrayList<>();
sortBuilders.add(SortBuilders.fieldSort("publishDate").order(SortOrder.DESC));
sortBuilders.add(SortBuilders.fieldSort("viewCount").order(SortOrder.DESC));
sortBuilders.add(SortBuilders.scoreSort().order(SortOrder.DESC));

// Build the first-page request (later pages use search_after for deep pagination)
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(query)
    .withSorts(sortBuilders)
    .withPageable(PageRequest.of(0, 100)) // first page
    .build();

// Run the query
SearchHits<Article> searchHits = elasticsearchRestTemplate.search(searchQuery, Article.class);

// Take the sort values of the last hit as the cursor for the next page
if (searchHits.hasSearchHits() && searchHits.getSearchHits().size() == 100) {
    SearchHit<Article> lastHit = searchHits.getSearchHits().get(searchHits.getSearchHits().size() - 1);
    List<Object> sortValues = lastHit.getSortValues(); // getSortValues() returns a List, not an array

    // Build the next-page request with search_after
    NativeSearchQuery nextPageQuery = new NativeSearchQueryBuilder()
        .withQuery(query)
        .withSorts(sortBuilders)
        .withSearchAfter(sortValues) // sort values of the previous page's last hit
        .withPageable(PageRequest.of(0, 100)) // the page number always stays 0
        .build();
}
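
The cursor logic behind search_after can be modelled without a cluster. The sketch below pages through an in-memory list with a sort on publishDate (descending) plus a unique id as tiebreaker, and uses the last document of each page as the cursor for the next one. SearchAfterSketch, Doc, and page are hypothetical names; in a real request the cursor is the list of sort values of the last hit rather than the document itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// In-memory model of search_after pagination.
final class SearchAfterSketch {

    static final class Doc {
        final String id;        // unique tiebreaker
        final long publishDate; // for example, days since some epoch

        Doc(String id, long publishDate) {
            this.id = id;
            this.publishDate = publishDate;
        }
    }

    // Sort order: publishDate DESC, then id ASC so that no two documents compare equal.
    static final Comparator<Doc> ORDER =
        Comparator.comparingLong((Doc d) -> d.publishDate).reversed()
            .thenComparing(d -> d.id);

    // Returns up to `size` documents that sort strictly after the cursor (null = first page).
    static List<Doc> page(List<Doc> index, Doc after, int size) {
        List<Doc> sorted = new ArrayList<>(index);
        sorted.sort(ORDER);
        List<Doc> result = new ArrayList<>();
        for (Doc doc : sorted) {
            if (after != null && ORDER.compare(doc, after) <= 0) {
                continue; // at or before the cursor: already returned on an earlier page
            }
            result.add(doc);
            if (result.size() == size) {
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> index = Arrays.asList(
            new Doc("a", 3), new Doc("b", 3), new Doc("c", 2), new Doc("d", 1), new Doc("e", 1));
        Doc cursor = null;
        List<Doc> hits;
        while (!(hits = page(index, cursor, 2)).isEmpty()) {
            StringBuilder line = new StringBuilder();
            for (Doc doc : hits) {
                line.append(doc.id).append(' ');
            }
            System.out.println(line.toString().trim());
            cursor = hits.get(hits.size() - 1);
        }
    }
}
```

Without the id tiebreaker, documents a and b would compare equal, and a cursor placed on either of them could skip the other. That is the reason for ending the sort with a unique field.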

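4.2 Highlighting

java
// Build the query
QueryBuilder query = QueryBuilders.multiMatchQuery(
    "elasticsearch tutorial", 
    "title", "content", "description"
);

// Configure highlighting. HighlightBuilder.field(String) returns the HighlightBuilder
// itself, so options chained after it become global settings and the second field
// would overwrite the first. Per-field options belong on HighlightBuilder.Field.
HighlightBuilder highlightBuilder = new HighlightBuilder()
    .field(new HighlightBuilder.Field("title")
        .preTags("<em>")
        .postTags("</em>")
        .fragmentSize(100)
        .numOfFragments(1))
    .field(new HighlightBuilder.Field("content")
        .preTags("<em>")
        .postTags("</em>")
        .fragmentSize(200)
        .numOfFragments(3));

// Build the search request
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(query)
    .withHighlightBuilder(highlightBuilder)
    .build();

// Run the query and process the highlighted results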
SearchHits<Article> searchHits = elasticsearchRestTemplate.search(searchQuery, Article.class);
for (SearchHit<Article> hit : searchHits) {
    Article article = hit.getContent();
    Map<String, List<String>> highlightFields = hit.getHighlightFields();
    
    // Handle the highlighted fragments
    if (highlightFields.containsKey("title")) {
        String highlightedTitle = highlightFields.get("title").get(0);
        System.out.println("Highlighted Title: " + highlightedTitle);
    }
    
    if (highlightFields.containsKey("content")) {
        List<String> contentFragments = highlightFields.get("content");
        System.out.println("Content Fragments:");
        for (String fragment : contentFragments) {
            System.out.println("  " + fragment);
        }
    }
}
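
Highlight fragments come back as plain strings with the configured tags embedded, so any further processing is ordinary string handling. As a small sketch, assuming the `<em>` and `</em>` tags configured above, the helper below pulls out the highlighted terms and produces a tag-free version of a fragment. HighlightSketch and its methods are hypothetical helpers, not part of Spring Data Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Post-processing of a highlight fragment that uses <em>...</em> tags.
final class HighlightSketch {

    private static final Pattern EM = Pattern.compile("<em>(.*?)</em>");

    // Returns the text of every highlighted span, in order of appearance.
    static List<String> highlightedTerms(String fragment) {
        List<String> terms = new ArrayList<>();
        Matcher matcher = EM.matcher(fragment);
        while (matcher.find()) {
            terms.add(matcher.group(1));
        }
        return terms;
    }

    // Removes the highlight tags, for example for a plain-text preview.
    static String stripTags(String fragment) {
        return fragment.replace("<em>", "").replace("</em>", "");
    }

    public static void main(String[] args) {
        String fragment = "An <em>Elasticsearch</em> <em>tutorial</em> for beginners";
        System.out.println(highlightedTerms(fragment));
        System.out.println(stripTags(fragment));
    }
}
```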

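5. Script Queries and Performance Tuning

5.1 Script Queries

java
// Script query: keep documents whose price exceeds discount_price by more than params.threshold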
Script script = new Script(ScriptType.INLINE, "painless", 
    "doc['price'].value - doc['discount_price'].value > params.threshold", 
    Collections.singletonMap("threshold", 100));

ScriptQueryBuilder scriptQuery = QueryBuilders.scriptQuery(script);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(scriptQuery)
    .build();

5.2 Performance Tuning Tips

java
// 1. Use filter context instead of must for pure filtering (no scoring, and results can be cached)
BoolQueryBuilder optimizedQuery = QueryBuilders.boolQuery();
// Filtering conditions go in filter
optimizedQuery.filter(QueryBuilders.termQuery("status", "active"));
optimizedQuery.filter(QueryBuilders.rangeQuery("price").gte(100).lte(1000));
// Only the full-text condition goes in must
optimizedQuery.must(QueryBuilders.matchQuery("title", "elasticsearch"));

// 2. Use one terms query instead of several term queries
optimizedQuery.filter(QueryBuilders.termsQuery("category.keyword", "tech", "programming", "database"));

// 3. Set a sensible result size and timeout
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(optimizedQuery)
    .withPageable(PageRequest.of(0, 50)) // cap the number of hits returned
    .withTimeout(TimeValue.timeValueSeconds(10)) // query timeout
    .build();

// 4. Turn off features you do not need
NativeSearchQuery optimizedSearchQuery = new NativeSearchQueryBuilder()
    .withQuery(optimizedQuery)
    // Return only the needed fields (FetchSourceFilter lives in
    // org.springframework.data.elasticsearch.core.query)
    .withSourceFilter(new FetchSourceFilter(new String[] {"id", "title", "price"}, null))
    .withTrackScores(false) // skip score tracking when scores are not needed
    .withTrackTotalHitsUpTo(10000) // stop counting total hits beyond 10,000
    .build();

6. A Combined Real-World Example

6.1 E-Commerce Search

java
@Service
public class ProductSearchService {
    
    @Autowired
    private ElasticsearchRestTemplate elasticsearchRestTemplate;
    
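    /**
     * Combined e-commerce search.
     * Supports keyword search, category filter, brand filter, price range, rating filter, sorting, pagination, and highlighting.
     */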
    public SearchResult searchProducts(ProductSearchRequest request) {
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        
        // 1. Keyword search
        if (StringUtils.hasText(request.getKeyword())) {
            boolQuery.must(QueryBuilders.multiMatchQuery(
                request.getKeyword(),
                "name", "description", "keywords"
            ).minimumShouldMatch("70%"));
        }
        
        // 2. Category filter
        if (StringUtils.hasText(request.getCategory())) {
            boolQuery.filter(QueryBuilders.termQuery("category.keyword", request.getCategory()));
        }
        
        // 3. Brand filter (isEmpty exists in both Spring's and Apache Commons' CollectionUtils)
        if (!CollectionUtils.isEmpty(request.getBrands())) {
            boolQuery.filter(QueryBuilders.termsQuery("brand.keyword", request.getBrands()));
        }
        
        // 4. Price range
        if (request.getMinPrice() != null || request.getMaxPrice() != null) {
            RangeQueryBuilder priceRange = QueryBuilders.rangeQuery("price");
            if (request.getMinPrice() != null) priceRange.gte(request.getMinPrice());
            if (request.getMaxPrice() != null) priceRange.lte(request.getMaxPrice());
            boolQuery.filter(priceRange);
        }
        
        // 5. Rating filter
        if (request.getMinRating() != null) {
            boolQuery.filter(QueryBuilders.rangeQuery("rating").gte(request.getMinRating()));
        }
        
        // 6. Sorting (withSorts expects SortBuilder<?> elements)
        List<SortBuilder<?>> sorts = new ArrayList<>();
        if (StringUtils.hasText(request.getSortBy())) {
            SortOrder order = request.isAscending() ? SortOrder.ASC : SortOrder.DESC;
            switch (request.getSortBy()) {
                case "price":
                    sorts.add(SortBuilders.fieldSort("price").order(order));
                    break;
                case "rating":
                    sorts.add(SortBuilders.fieldSort("rating").order(order));
                    break;
                case "sales":
                    sorts.add(SortBuilders.fieldSort("salesCount").order(order));
                    break;
                case "newest":
                    sorts.add(SortBuilders.fieldSort("createTime").order(SortOrder.DESC));
                    break;
                default:
                    // Default to relevance ordering
                    sorts.add(SortBuilders.fieldSort("_score").order(SortOrder.DESC));
            }
        } else {
            // Default ordering: relevance first, then sales
            sorts.add(SortBuilders.fieldSort("_score").order(SortOrder.DESC));
            sorts.add(SortBuilders.fieldSort("salesCount").order(SortOrder.DESC));
        }
        
        // 7. Highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("name")
            .preTags("<em class='highlight'>")
            .postTags("</em>")
            .fragmentSize(100)
            .numOfFragments(1);
        
        // 8. Aggregations (used to build the filter facets)
        TermsAggregationBuilder brandAgg = AggregationBuilders.terms("by_brand")
            .field("brand.keyword")
            .size(20);
            
        RangeAggregationBuilder priceAgg = AggregationBuilders.range("price_ranges")
            .field("price")
            .addUnboundedTo(100)
            .addRange(100, 500)
            .addRange(500, 1000)
            .addRange(1000, 3000)
            .addUnboundedFrom(3000);
        
        // 9. Build the search request
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(boolQuery)
            .withSorts(sorts)
            .withHighlightBuilder(highlightBuilder)
            .addAggregation(brandAgg)
            .addAggregation(priceAgg)
            .withPageable(PageRequest.of(request.getPage(), request.getSize()))
            .withTimeout(TimeValue.timeValueSeconds(5))
            .build();
        
        // 10. Run the query
        SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);
        
        // 11. Process the hits
        List<ProductDTO> products = new ArrayList<>();
        for (SearchHit<Product> hit : searchHits) {
            Product product = hit.getContent();
            ProductDTO dto = convertToDTO(product);
            
            // Apply the highlighted name
            Map<String, List<String>> highlightFields = hit.getHighlightFields();
            if (highlightFields.containsKey("name")) {
                dto.setHighlightName(highlightFields.get("name").get(0));
            }
            
            products.add(dto);
        }
        
        // 12. Process the aggregation results
        Map<String, List<FilterOption>> filters = new HashMap<>();
        
        // Brand facet
        Terms byBrandTerms = searchHits.getAggregations().get("by_brand");
        List<FilterOption> brandOptions = byBrandTerms.getBuckets().stream()
            .map(bucket -> new FilterOption(bucket.getKeyAsString(), bucket.getDocCount()))
            .collect(Collectors.toList());
        filters.put("brand", brandOptions);
        
        // Price facet
        Range priceRanges = searchHits.getAggregations().get("price_ranges");
        List<FilterOption> priceOptions = priceRanges.getBuckets().stream()
            .map(bucket -> new FilterOption(bucket.getKeyAsString(), bucket.getDocCount()))
            .collect(Collectors.toList());
        filters.put("price", priceOptions);
        
        // 13. Return the result
        SearchResult result = new SearchResult();
        result.setProducts(products);
        result.setTotalHits(searchHits.getTotalHits());
        result.setFilters(filters);
        result.setCurrentPage(request.getPage());
        result.setPageSize(request.getSize());
        
        return result;
    }
    
    private ProductDTO convertToDTO(Product product) {
        // Conversion logic goes here
        return new ProductDTO();
    }
}

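7. Caveats and Best Practices

  1. Version compatibility: QueryBuilders methods differ between Elasticsearch versions, and NativeSearchQueryBuilder methods differ between Spring Data Elasticsearch versions, so check the official documentation for the versions you actually run.

  2. Query performance

    • Put conditions that only include or exclude documents in filter context rather than query context
    • Avoid wildcard queries that start with a wildcard
    • For deep pagination, use search_after instead of from/size
    • Set a sensible query timeout
  3. Memory usage

    • Aggregations can consume a lot of memory, especially on high-cardinality fields
    • Limit the returned fields to the data you need
    • For large result sets, consider the scroll API
  4. Index design

    • Use the keyword type for fields that need exact matching
    • Choose a suitable analyzer for full-text fields
    • Keep doc_values enabled (the default for keyword, numeric, and date fields) on fields used for sorting and aggregations
  5. Debugging complex queries

    • Use the explain API to see how a document's score was computed, and the profile API to see where query time is spent
    • Check the slow query log and fix the bottlenecks it reveals
    • Consider precomputing or caching hot data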
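
The advice to prefer search_after over from/size has a concrete limit behind it: Elasticsearch rejects a request when from + size exceeds index.max_result_window, which defaults to 10,000. A service could check this before sending the request. A minimal sketch, assuming the default limit; PageWindowSketch and fitsWindow are hypothetical names.

```java
// Guard for from/size pagination against index.max_result_window.
final class PageWindowSketch {

    // Default value of index.max_result_window; adjust if the index overrides it.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // True when a zero-based page of the given size stays within the result window.
    static boolean fitsWindow(int page, int size, int maxResultWindow) {
        long from = (long) page * size; // long arithmetic avoids int overflow
        return from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        System.out.println(fitsWindow(99, 100, DEFAULT_MAX_RESULT_WINDOW));
        System.out.println(fitsWindow(100, 100, DEFAULT_MAX_RESULT_WINDOW));
    }
}
```

With a page size of 100, page 99 is the last one that fits (from 9,900 plus size 100 equals 10,000); page 100 would be rejected, which is the point at which switching to search_after makes sense.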

Used well, these features let you build powerful, efficient Elasticsearch queries for a wide range of complex business requirements.
