Advanced Use Cases for Elasticsearch QueryBuilders

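This article walks through advanced uses of Elasticsearch QueryBuilders: complex compound queries, aggregations, nested queries, pagination and sorting, highlighting, and script queries, each with a code example. The examples build queries with the Elasticsearch Java QueryBuilders API and run them through Spring Data Elasticsearch (NativeSearchQueryBuilder and ElasticsearchRestTemplate).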

1. Complex Compound Queries (Multi-Level Bool Queries)

1.1 Multi-level nested bool query

java
// Build a multi-level nested bool query
BoolQueryBuilder outerBoolQuery = QueryBuilders.boolQuery();

// Level 1: condition every document must satisfy
outerBoolQuery.must(QueryBuilders.termQuery("status", "active"));

// Level 2: condition group 1 (OR semantics)
BoolQueryBuilder innerShouldQuery1 = QueryBuilders.boolQuery();
innerShouldQuery1.should(QueryBuilders.matchQuery("title", "elasticsearch"));
innerShouldQuery1.should(QueryBuilders.matchQuery("content", "elasticsearch"));
innerShouldQuery1.minimumShouldMatch(1); // at least one should clause must match
outerBoolQuery.must(innerShouldQuery1);

// Level 3: condition group 2 (range conditions in filter context, so they do not affect scoring)
BoolQueryBuilder innerBoolQuery2 = QueryBuilders.boolQuery();
innerBoolQuery2.filter(QueryBuilders.rangeQuery("publishDate").gte("2023-01-01").lte("2023-12-31"));
innerBoolQuery2.filter(QueryBuilders.rangeQuery("viewCount").gt(100));
outerBoolQuery.filter(innerBoolQuery2);

// Level 4: exclusion
outerBoolQuery.mustNot(QueryBuilders.termQuery("category", "spam"));

// Build the search request
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(outerBoolQuery)
    .withPageable(PageRequest.of(0, 20))
    .build();
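
To make the clause semantics concrete, here is a minimal plain-Java sketch that mirrors the matching logic of the query above with boolean tests over a `Map`. It does not use Elasticsearch. The class `BoolSemantics` and the substring-based `mentions` helper are illustrative assumptions: there is no text analysis or scoring, and the two range filters are left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the matching logic only (no scoring, no text analysis).
// BoolSemantics and mentions() are hypothetical names, not Elasticsearch API.
final class BoolSemantics {

    // Simplified stand-in for a match query: case-insensitive substring test.
    static boolean mentions(Map<String, String> doc, String field, String term) {
        String value = doc.get(field);
        return value != null && value.toLowerCase().contains(term);
    }

    static boolean matches(Map<String, String> doc) {
        // must: status equals "active"
        boolean must = "active".equals(doc.get("status"));
        // must(bool(should, should, minimumShouldMatch = 1)): title OR content
        boolean anyShould = mentions(doc, "title", "elasticsearch")
                || mentions(doc, "content", "elasticsearch");
        // must_not: category equals "spam"
        boolean excluded = "spam".equals(doc.get("category"));
        return must && anyShould && !excluded;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("status", "active");
        doc.put("title", "Elasticsearch in practice");
        doc.put("category", "tech");
        System.out.println(matches(doc));
    }
}
```

The point of the sketch is that wrapping the two should clauses in their own bool query and attaching it with must turns "either field may match" into a required condition.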

1.2 Weighted multi-field query

java
// Use function_score to weight matches on several fields
FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
    // Base query
    QueryBuilders.matchQuery("content", "elasticsearch tutorial"),
    // Scoring functions
    new FunctionScoreQueryBuilder.FilterFunctionBuilder[] {
        // Boost documents whose title matches
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            QueryBuilders.matchQuery("title", "elasticsearch tutorial"),
            ScoreFunctionBuilders.weightFactorFunction(3.0f)
        ),
        // Boost documents whose tags match
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            QueryBuilders.termsQuery("tags", "elasticsearch", "tutorial"),
            ScoreFunctionBuilders.weightFactorFunction(2.0f)
        ),
        // Boost recently published documents. The arguments are
        // (field, origin, scale, offset, decay); passing a DateHistogramInterval
        // as the second argument would be taken as the origin, not the scale.
        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
            QueryBuilders.matchAllQuery(),
            ScoreFunctionBuilders.gaussDecayFunction(
                "publishDate",
                "now",  // origin: documents published now score 1.0
                "30d",  // scale
                "0d",   // offset
                0.5     // decay: score at a distance of offset + scale
            )
        )
    }
).boostMode(CombineFunction.SUM); // add the function score to the query score

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(functionScoreQuery)
    .build();
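
The effect of a Gaussian decay with a 30-day scale and a decay of 0.5 is easier to see with numbers. The sketch below evaluates the decay formula from the Elasticsearch function_score documentation in plain Java; `GaussDecay` is a hypothetical helper, and distances are expressed in days as a simplifying assumption.

```java
// Numeric sketch of the documented Gaussian decay formula:
//   sigma^2 = -scale^2 / (2 * ln(decay))
//   score   = exp(-max(0, |distance| - offset)^2 / (2 * sigma^2))
// GaussDecay is a hypothetical helper, not part of the Elasticsearch client.
final class GaussDecay {

    // distance, scale and offset share one unit (days here).
    static double score(double distance, double scale, double offset, double decay) {
        double sigmaSquared = -(scale * scale) / (2.0 * Math.log(decay));
        double effective = Math.max(0.0, Math.abs(distance) - offset);
        return Math.exp(-(effective * effective) / (2.0 * sigmaSquared));
    }

    public static void main(String[] args) {
        System.out.println(score(0, 30, 0, 0.5));  // published at the origin
        System.out.println(score(30, 30, 0, 0.5)); // one scale away
        System.out.println(score(60, 30, 0, 0.5)); // two scales away
    }
}
```

A document exactly one scale away from the origin receives the decay value, and the score falls off quickly beyond that. Since the query sets no score mode, the values of all matching functions are combined with the default (multiply) before the boost mode adds the result to the query score.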

2. Aggregations (AggregationBuilders)

2.1 Basic aggregations

java
// 1. Terms aggregation (group and count)
TermsAggregationBuilder termsAgg = AggregationBuilders.terms("by_category")
    .field("category.keyword")
    .size(10);

// 2. Stats aggregation on a numeric field
StatsAggregationBuilder statsAgg = AggregationBuilders.stats("price_stats")
    .field("price");

// 3. Average aggregation
AvgAggregationBuilder avgAgg = AggregationBuilders.avg("avg_rating")
    .field("rating");

// Build the search request
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.matchAllQuery())
    .addAggregation(termsAgg)
    .addAggregation(statsAgg)
    .addAggregation(avgAgg)
    // Return aggregations only, no documents. PageRequest.of(0, 0) would throw
    // IllegalArgumentException because the page size must be at least 1;
    // withMaxResults(0) assumes a Spring Data Elasticsearch version that provides it.
    .withMaxResults(0)
    .build();

// Run the query
SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);

// Read the aggregation results. This assumes getAggregations() returns the
// Elasticsearch Aggregations object directly; newer Spring Data Elasticsearch
// versions wrap it in a container that has to be unwrapped first.
Terms byCategoryTerms = searchHits.getAggregations().get("by_category");
for (Terms.Bucket bucket : byCategoryTerms.getBuckets()) {
    String category = bucket.getKeyAsString();
    long count = bucket.getDocCount();
    System.out.println("Category: " + category + ", Count: " + count);
}

Stats priceStats = searchHits.getAggregations().get("price_stats");
System.out.println("Min Price: " + priceStats.getMin());
System.out.println("Max Price: " + priceStats.getMax());
System.out.println("Avg Price: " + priceStats.getAvg());
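
For readers who think in terms of Java collections, the following sketch computes the same kind of result in memory: a terms-style count per category limited to the largest buckets, and a stats-style summary. `BasicAggSketch` and its methods are hypothetical names; this only illustrates the shape of the results and is no substitute for running the aggregation in Elasticsearch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of what terms and stats aggregations return.
final class BasicAggSketch {

    // Terms-style bucketing: count per category, largest buckets first,
    // at most `size` buckets.
    static List<Map.Entry<String, Long>> topCategories(List<String> categories, int size) {
        Map<String, Long> counts = new TreeMap<>();
        for (String category : categories) {
            counts.merge(category, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> buckets = new ArrayList<>(counts.entrySet());
        // The sort is stable, so equal counts keep the TreeMap's key order.
        buckets.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return buckets.subList(0, Math.min(size, buckets.size()));
    }

    // Stats-style summary: count, min, max, average and sum in one pass.
    static DoubleSummaryStatistics stats(double... values) {
        return Arrays.stream(values).summaryStatistics();
    }

    public static void main(String[] args) {
        System.out.println(topCategories(Arrays.asList("tech", "news", "tech"), 10));
        System.out.println(stats(10, 20, 60));
    }
}
```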

2.2 Sub-aggregations

java
// Group by brand, then by price range within each brand, then compute the average rating per group
TermsAggregationBuilder brandAgg = AggregationBuilders.terms("by_brand")
    .field("brand.keyword")
    .size(10);

RangeAggregationBuilder priceRangeAgg = AggregationBuilders.range("price_ranges")
    .field("price")
    .addUnboundedTo(1000)    // price < 1000
    .addRange(1000, 5000)    // 1000 <= price < 5000
    .addUnboundedFrom(5000); // price >= 5000

AvgAggregationBuilder ratingAgg = AggregationBuilders.avg("avg_rating")
    .field("rating");

// Attach the sub-aggregations
priceRangeAgg.subAggregation(ratingAgg);
brandAgg.subAggregation(priceRangeAgg);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.matchAllQuery())
    .addAggregation(brandAgg)
    // Aggregations only; PageRequest.of(0, 0) is rejected (page size must be >= 1)
    .withMaxResults(0)
    .build();

// Run the query and walk the nested aggregation results
SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);
Terms byBrandTerms = searchHits.getAggregations().get("by_brand");

for (Terms.Bucket brandBucket : byBrandTerms.getBuckets()) {
    String brand = brandBucket.getKeyAsString();
    System.out.println("Brand: " + brand);

    Range priceRanges = brandBucket.getAggregations().get("price_ranges");
    for (Range.Bucket rangeBucket : priceRanges.getBuckets()) {
        String range = rangeBucket.getKeyAsString();
        long count = rangeBucket.getDocCount();
        Avg avgRating = rangeBucket.getAggregations().get("avg_rating");

        System.out.println("  Price Range: " + range + ", Count: " + count + ", Avg Rating: " + avgRating.getValue());
    }
}
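
One detail of the range aggregation that is easy to get wrong is the boundary handling: each range includes its `from` value and excludes its `to` value. The sketch below assigns a price to one of the three ranges above under that rule. `RangeBuckets` is a hypothetical helper, and the key strings are an assumption about how default bucket keys are rendered.

```java
// Boundary rule of the range aggregation: `from` is inclusive, `to` is exclusive.
// RangeBuckets is a hypothetical helper; the key format is an assumption.
final class RangeBuckets {

    static String bucketFor(double price) {
        if (price < 1000) {
            return "*-1000.0";        // addUnboundedTo(1000)
        }
        if (price < 5000) {
            return "1000.0-5000.0";   // addRange(1000, 5000)
        }
        return "5000.0-*";            // addUnboundedFrom(5000)
    }

    public static void main(String[] args) {
        System.out.println(bucketFor(999.99));
        System.out.println(bucketFor(1000));
        System.out.println(bucketFor(5000));
    }
}
```

A price of exactly 1000 therefore lands in the middle range, and a price of exactly 5000 lands in the open-ended top range, so no document is counted twice.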

2.3 Date histogram aggregation

java
// Count documents per month
DateHistogramAggregationBuilder dateAgg = AggregationBuilders.dateHistogram("by_month")
    .field("publishDate")
    // calendarInterval replaces dateHistogramInterval, which is deprecated since Elasticsearch 7.2
    .calendarInterval(DateHistogramInterval.MONTH)
    .format("yyyy-MM")
    .minDocCount(1); // only return months that contain documents

// Sub-aggregation: average view count per month
AvgAggregationBuilder viewAvgAgg = AggregationBuilders.avg("avg_views")
    .field("viewCount");
dateAgg.subAggregation(viewAvgAgg);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.rangeQuery("publishDate").gte("2023-01-01"))
    .addAggregation(dateAgg)
    // Aggregations only; PageRequest.of(0, 0) is rejected (page size must be >= 1)
    .withMaxResults(0)
    .build();
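
As a rough mental model, a monthly date histogram with a minimum document count of 1 behaves like grouping dates by year and month and dropping empty months. The plain-Java sketch below does exactly that with `java.time`; `MonthlyHistogram` is a hypothetical name, and time zones are ignored for simplicity.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory analogue of a monthly date histogram keyed as "yyyy-MM".
final class MonthlyHistogram {

    private static final DateTimeFormatter KEY = DateTimeFormatter.ofPattern("yyyy-MM");

    static Map<String, Long> countByMonth(List<LocalDate> publishDates) {
        Map<String, Long> buckets = new TreeMap<>(); // sorted keys = chronological order
        for (LocalDate date : publishDates) {
            buckets.merge(YearMonth.from(date).format(KEY), 1L, Long::sum);
        }
        // Months without documents never get a key, which matches minDocCount(1).
        return buckets;
    }

    public static void main(String[] args) {
        System.out.println(countByMonth(Arrays.asList(
                LocalDate.of(2023, 1, 5),
                LocalDate.of(2023, 1, 20),
                LocalDate.of(2023, 3, 1))));
    }
}
```

With a minimum document count of 0, Elasticsearch would also return the empty months between the first and last bucket, which this sketch deliberately does not model.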

3. Nested Queries

3.1 Basic nested query

java
// Assume the document contains a nested field named comments:
// {"id": 1, "title": "...", "comments": [{"user": "张三", "content": "...", "rating": 5}]}

// Query on the nested field
NestedQueryBuilder nestedQuery = QueryBuilders.nestedQuery(
    "comments", // path of the nested field
    QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("comments.user", "张三"))
        .must(QueryBuilders.rangeQuery("comments.rating").gte(4)),
    ScoreMode.Avg // how scores of matching nested objects are combined
);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(nestedQuery)
    .build();
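
The reason for mapping comments as nested is that both conditions must hold for the same comment. With a plain object array, field values are flattened per document, so the user condition and the rating condition can be satisfied by different comments. The sketch below contrasts the two behaviours in plain Java; the class, the `Comment` type and the user names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Contrast between nested matching and flattened object-array matching.
final class NestedVsFlattened {

    static final class Comment {
        final String user;
        final int rating;

        Comment(String user, int rating) {
            this.user = user;
            this.rating = rating;
        }
    }

    // Nested semantics: one single comment must satisfy both conditions.
    static boolean nestedMatch(List<Comment> comments, String user, int minRating) {
        for (Comment c : comments) {
            if (c.user.equals(user) && c.rating >= minRating) {
                return true;
            }
        }
        return false;
    }

    // Flattened semantics: each condition may be met by a different comment.
    static boolean flattenedMatch(List<Comment> comments, String user, int minRating) {
        boolean userSeen = false;
        boolean ratingSeen = false;
        for (Comment c : comments) {
            userSeen |= c.user.equals(user);
            ratingSeen |= c.rating >= minRating;
        }
        return userSeen && ratingSeen;
    }

    public static void main(String[] args) {
        List<Comment> comments = Arrays.asList(new Comment("alice", 2), new Comment("bob", 5));
        System.out.println(nestedMatch(comments, "alice", 4));
        System.out.println(flattenedMatch(comments, "alice", 4));
    }
}
```

In the sample data no single comment is both written by alice and rated 4 or higher, so only the flattened check reports a match. That false positive is what the nested query avoids.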

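3.2 Combining nested queries with aggregations

java
// Find published articles that have at least one highly rated comment,
// and compute the average comment rating across the matching articles
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

// Base condition
boolQuery.must(QueryBuilders.termQuery("status", "published"));

// Nested query: at least one comment with a rating greater than 4
boolQuery.filter(
    QueryBuilders.nestedQuery(
        "comments",
        QueryBuilders.rangeQuery("comments.rating").gt(4),
        ScoreMode.None
    )
);

// Nested aggregation: average rating over all comments of the matching articles.
// This is one value for the whole result set, not one value per article; a
// per-article figure would need a bucket per article or a precomputed field.
NestedAggregationBuilder nestedAgg = AggregationBuilders.nested("comments_agg", "comments");
AvgAggregationBuilder avgRatingAgg = AggregationBuilders.avg("avg_comment_rating").field("comments.rating");
nestedAgg.subAggregation(avgRatingAgg);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(boolQuery)
    .addAggregation(nestedAgg)
    .withPageable(PageRequest.of(0, 10))
    .build();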

4. Pagination, Sorting, and Highlighting

4.1 Efficient pagination and sorting

java
// Build the query
BoolQueryBuilder query = QueryBuilders.boolQuery()
    .must(QueryBuilders.matchQuery("title", "elasticsearch"))
    .filter(QueryBuilders.termQuery("status", "active"));

// Multi-field sort. search_after needs a deterministic order, so in practice
// append a unique tiebreaker field as the last sort key.
List<SortBuilder<?>> sortBuilders = new ArrayList<>();
sortBuilders.add(SortBuilders.fieldSort("publishDate").order(SortOrder.DESC));
sortBuilders.add(SortBuilders.fieldSort("viewCount").order(SortOrder.DESC));
sortBuilders.add(SortBuilders.scoreSort().order(SortOrder.DESC));

// Build the request for the first page (later pages use search_after for deep pagination)
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(query)
    .withSorts(sortBuilders)
    .withPageable(PageRequest.of(0, 100)) // first page
    .build();

// Run the query
SearchHits<Article> searchHits = elasticsearchRestTemplate.search(searchQuery, Article.class);

// A full page means there may be more results: take the sort values of the last hit as the cursor
if (searchHits.hasSearchHits() && searchHits.getSearchHits().size() == 100) {
    SearchHit<Article> lastHit = searchHits.getSearchHits().get(searchHits.getSearchHits().size() - 1);
    List<Object> sortValues = lastHit.getSortValues(); // returns a List<Object>, not an array

    // Build the next-page query with search_after
    NativeSearchQuery nextPageQuery = new NativeSearchQueryBuilder()
        .withQuery(query)
        .withSorts(sortBuilders)
        .withSearchAfter(sortValues) // sort values of the last hit on the previous page
        .withPageable(PageRequest.of(0, 100)) // the page number always stays 0
        .build();
}
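
The snippet above builds only the second page. The full loop keeps feeding the last sort value back in until a page comes back short or empty. The sketch below simulates that loop over an in-memory list sorted in descending order, so it runs without a cluster; `SearchAfterSketch`, `page` and `allPages` are hypothetical names, and a single unique integer stands in for the sort values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory simulation of cursor-based paging in the style of search_after.
final class SearchAfterSketch {

    // Returns up to `size` values that sort strictly after `cursor`
    // (descending order); a null cursor requests the first page.
    static List<Integer> page(List<Integer> sortedDesc, Integer cursor, int size) {
        List<Integer> out = new ArrayList<>();
        for (Integer value : sortedDesc) {
            if (cursor != null && value >= cursor) {
                continue; // at or before the cursor: already served
            }
            out.add(value);
            if (out.size() == size) {
                break;
            }
        }
        return out;
    }

    // Fetches pages until one comes back empty or shorter than `size`.
    static List<List<Integer>> allPages(List<Integer> sortedDesc, int size) {
        List<List<Integer>> pages = new ArrayList<>();
        Integer cursor = null;
        while (true) {
            List<Integer> current = page(sortedDesc, cursor, size);
            if (current.isEmpty()) {
                break;
            }
            pages.add(current);
            if (current.size() < size) {
                break;
            }
            cursor = current.get(current.size() - 1); // last sort value becomes the cursor
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(allPages(Arrays.asList(9, 7, 5, 3, 1), 2));
    }
}
```

Because the cursor comparison skips everything at or before the cursor value, duplicate sort values would be skipped as well. That is the in-memory counterpart of why a real search_after sort should end with a unique tiebreaker.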

4.2 Highlighting

java
// Build the query
QueryBuilder query = QueryBuilders.multiMatchQuery(
    "elasticsearch tutorial",
    "title", "content", "description"
);

// Configure highlighting
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("title")
    .preTags("<em>")
    .postTags("</em>")
    .fragmentSize(100)
    .numOfFragments(1);

highlightBuilder.field("content")
    .preTags("<em>")
    .postTags("</em>")
    .fragmentSize(200)
    .numOfFragments(3);

// Build the search request
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(query)
    .withHighlightBuilder(highlightBuilder)
    .build();

// Run the query and process the highlight results
SearchHits<Article> searchHits = elasticsearchRestTemplate.search(searchQuery, Article.class);
for (SearchHit<Article> hit : searchHits) {
    Article article = hit.getContent();
    Map<String, List<String>> highlightFields = hit.getHighlightFields();

    // A field only appears in the map if it produced at least one fragment
    if (highlightFields.containsKey("title")) {
        String highlightedTitle = highlightFields.get("title").get(0);
        System.out.println("Highlighted Title: " + highlightedTitle);
    }

    if (highlightFields.containsKey("content")) {
        List<String> contentFragments = highlightFields.get("content");
        System.out.println("Content Fragments:");
        for (String fragment : contentFragments) {
            System.out.println("  " + fragment);
        }
    }
}

5. Script Queries and Performance Tuning

5.1 Script query

java
// Script query: keep documents whose discount (price minus discount_price) exceeds a threshold
Script script = new Script(ScriptType.INLINE, "painless",
    "doc['price'].value - doc['discount_price'].value > params.threshold",
    Collections.singletonMap("threshold", 100));

ScriptQueryBuilder scriptQuery = QueryBuilders.scriptQuery(script);

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(scriptQuery)
    .build();

5.2 Performance tips

java
// 1. Use filter context instead of must for pure filtering (no scoring, cacheable)
BoolQueryBuilder optimizedQuery = QueryBuilders.boolQuery();
// Filtering conditions go into filter
optimizedQuery.filter(QueryBuilders.termQuery("status", "active"));
optimizedQuery.filter(QueryBuilders.rangeQuery("price").gte(100).lte(1000));
// Only the actual search condition goes into must
optimizedQuery.must(QueryBuilders.matchQuery("title", "elasticsearch"));

// 2. Use one terms query instead of several term queries
optimizedQuery.filter(QueryBuilders.termsQuery("category.keyword", "tech", "programming", "database"));

// 3. Set a sensible result size and timeout
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(optimizedQuery)
    .withPageable(PageRequest.of(0, 50)) // limit the number of returned documents
    .withTimeout(TimeValue.timeValueSeconds(10)) // query timeout
    .build();

// 4. Turn off what you do not need
NativeSearchQuery optimizedSearchQuery = new NativeSearchQueryBuilder()
    .withQuery(optimizedQuery)
    // Return only the required fields (FetchSourceFilter takes includes and excludes)
    .withSourceFilter(new FetchSourceFilter(new String[] {"id", "title", "price"}, null))
    .withTrackScores(false) // skip score tracking when scores are not needed
    .withTrackTotalHitsUpTo(10000) // cap total-hit counting to reduce work
    .build();

6. Putting It Together: A Real-World Scenario

6.1 E-commerce search

java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.elasticsearch.common.unit.TimeValue; // org.elasticsearch.core.TimeValue in newer 7.x clients
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.RangeQueryBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.range.Range;
import org.elasticsearch.search.aggregations.bucket.range.RangeAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.sort.SortBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.NativeSearchQuery;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.stereotype.Service;
import org.springframework.util.CollectionUtils;
import org.springframework.util.StringUtils;

// Product, ProductDTO, ProductSearchRequest, SearchResult and FilterOption
// are application classes that are not shown here.
@Service
public class ProductSearchService {

    @Autowired
    private ElasticsearchRestTemplate elasticsearchRestTemplate;

    /**
     * Combined e-commerce search.
     * Supports keyword search, category, brand, price-range and rating filters,
     * sorting, pagination, and highlighting.
     */
    public SearchResult searchProducts(ProductSearchRequest request) {
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

        // 1. Keyword search
        if (StringUtils.hasText(request.getKeyword())) {
            boolQuery.must(QueryBuilders.multiMatchQuery(
                request.getKeyword(),
                "name", "description", "keywords"
            ).minimumShouldMatch("70%"));
        }

        // 2. Category filter
        if (StringUtils.hasText(request.getCategory())) {
            boolQuery.filter(QueryBuilders.termQuery("category.keyword", request.getCategory()));
        }

        // 3. Brand filter (Spring's CollectionUtils only offers isEmpty)
        if (!CollectionUtils.isEmpty(request.getBrands())) {
            boolQuery.filter(QueryBuilders.termsQuery("brand.keyword", request.getBrands()));
        }

        // 4. Price range
        if (request.getMinPrice() != null || request.getMaxPrice() != null) {
            RangeQueryBuilder priceRange = QueryBuilders.rangeQuery("price");
            if (request.getMinPrice() != null) priceRange.gte(request.getMinPrice());
            if (request.getMaxPrice() != null) priceRange.lte(request.getMaxPrice());
            boolQuery.filter(priceRange);
        }

        // 5. Rating filter
        if (request.getMinRating() != null) {
            boolQuery.filter(QueryBuilders.rangeQuery("rating").gte(request.getMinRating()));
        }

        // 6. Sorting
        List<SortBuilder<?>> sorts = new ArrayList<>();
        if (StringUtils.hasText(request.getSortBy())) {
            SortOrder order = request.isAscending() ? SortOrder.ASC : SortOrder.DESC;
            switch (request.getSortBy()) {
                case "price":
                    sorts.add(SortBuilders.fieldSort("price").order(order));
                    break;
                case "rating":
                    sorts.add(SortBuilders.fieldSort("rating").order(order));
                    break;
                case "sales":
                    sorts.add(SortBuilders.fieldSort("salesCount").order(order));
                    break;
                case "newest":
                    sorts.add(SortBuilders.fieldSort("createTime").order(SortOrder.DESC));
                    break;
                default:
                    // Unknown sort key: fall back to relevance
                    sorts.add(SortBuilders.scoreSort());
            }
        } else {
            // Default: relevance first, then sales
            sorts.add(SortBuilders.scoreSort());
            sorts.add(SortBuilders.fieldSort("salesCount").order(SortOrder.DESC));
        }

        // 7. Highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("name")
            .preTags("<em class='highlight'>")
            .postTags("</em>")
            .fragmentSize(100)
            .numOfFragments(1);

        // 8. Aggregations (used to render the filter options)
        TermsAggregationBuilder brandAgg = AggregationBuilders.terms("by_brand")
            .field("brand.keyword")
            .size(20);

        RangeAggregationBuilder priceAgg = AggregationBuilders.range("price_ranges")
            .field("price")
            .addUnboundedTo(100)
            .addRange(100, 500)
            .addRange(500, 1000)
            .addRange(1000, 3000)
            .addUnboundedFrom(3000);

        // 9. Build the search request
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(boolQuery)
            .withSorts(sorts)
            .withHighlightBuilder(highlightBuilder)
            .addAggregation(brandAgg)
            .addAggregation(priceAgg)
            .withPageable(PageRequest.of(request.getPage(), request.getSize()))
            .withTimeout(TimeValue.timeValueSeconds(5))
            .build();

        // 10. Run the query
        SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);

        // 11. Convert the hits
        List<ProductDTO> products = new ArrayList<>();
        for (SearchHit<Product> hit : searchHits) {
            Product product = hit.getContent();
            ProductDTO dto = convertToDTO(product);

            // Attach the highlighted name if there is one
            Map<String, List<String>> highlightFields = hit.getHighlightFields();
            if (highlightFields.containsKey("name")) {
                dto.setHighlightName(highlightFields.get("name").get(0));
            }

            products.add(dto);
        }

        // 12. Convert the aggregation results
        Map<String, List<FilterOption>> filters = new HashMap<>();

        // Brand filter options
        Terms byBrandTerms = searchHits.getAggregations().get("by_brand");
        List<FilterOption> brandOptions = byBrandTerms.getBuckets().stream()
            .map(bucket -> new FilterOption(bucket.getKeyAsString(), bucket.getDocCount()))
            .collect(Collectors.toList());
        filters.put("brand", brandOptions);

        // Price filter options
        Range priceRanges = searchHits.getAggregations().get("price_ranges");
        List<FilterOption> priceOptions = priceRanges.getBuckets().stream()
            .map(bucket -> new FilterOption(bucket.getKeyAsString(), bucket.getDocCount()))
            .collect(Collectors.toList());
        filters.put("price", priceOptions);

        // 13. Assemble the result
        SearchResult result = new SearchResult();
        result.setProducts(products);
        result.setTotalHits(searchHits.getTotalHits());
        result.setFilters(filters);
        result.setCurrentPage(request.getPage());
        result.setPageSize(request.getSize());

        return result;
    }

    private ProductDTO convertToDTO(Product product) {
        // Mapping logic goes here
        return new ProductDTO();
    }
}
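
The keyword clause above uses a minimum should match of 70%. According to the Elasticsearch documentation, a positive percentage is converted into a clause count by rounding down. The sketch below shows that arithmetic for a few query lengths; `MinShouldMatch` is a hypothetical helper that covers only positive percentages and ignores how a multi-field query applies the setting per field.

```java
// Arithmetic behind a positive minimum_should_match percentage:
// required clauses = floor(optionalClauses * percent / 100).
// MinShouldMatch is a hypothetical helper, not Elasticsearch API.
final class MinShouldMatch {

    static int required(int optionalClauses, int percent) {
        return optionalClauses * percent / 100; // integer division rounds down
    }

    public static void main(String[] args) {
        for (int terms = 1; terms <= 5; terms++) {
            System.out.println(terms + " terms -> " + required(terms, 70) + " required");
        }
    }
}
```

For short queries the rounding matters: with three search terms only two have to match, so a misspelled or rare word does not empty the result list.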

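7. Caveats and Best Practices

  1. Version compatibility: QueryBuilders methods differ between Elasticsearch versions, so check the official documentation for the version you actually run.

  2. Query performance

    • Prefer filter context over query context for conditions that only include or exclude documents
    • Avoid wildcard queries with a leading wildcard
    • For deep pagination, use search_after instead of from/size
    • Set a sensible query timeout
  3. Memory usage

    • Aggregations can consume a lot of memory, especially on high-cardinality fields
    • Limit the returned fields to the data you need
    • For large result sets, consider the scroll API
  4. Index design

    • Use the keyword type for fields that need exact matching
    • Choose an appropriate analyzer for full-text fields
    • Keep doc_values enabled (the default for keyword, numeric, and date fields) on fields used for sorting and aggregations
  5. Debugging complex queries

    • Use the Explain API to see how a document's score was computed, and the Profile API to see where query time is spent
    • Check the slow query log to find bottlenecks
    • Consider precomputing or caching hot data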
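
The advice on deep pagination follows from a hard limit: with from/size, a request is rejected once `from + size` exceeds the index setting `index.max_result_window`, which defaults to 10,000. The sketch below checks a page request against that default; `PagingWindow` is a hypothetical helper, and an index may configure a different limit.

```java
// Checks whether a from/size page stays inside the default result window.
// PagingWindow is a hypothetical helper; 10,000 is the default value of
// index.max_result_window and can be changed per index.
final class PagingWindow {

    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    static boolean withinWindow(int page, int size) {
        long from = (long) page * size; // zero-based page number
        return from + size <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(withinWindow(99, 100));  // from = 9900, last allowed page
        System.out.println(withinWindow(100, 100)); // from = 10000, rejected
    }
}
```

With a page size of 100, page 99 is the last one that from/size can serve under the default setting; anything deeper calls for search_after or the scroll API.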

Used sensibly, these advanced features let you build powerful, efficient Elasticsearch queries for a wide range of complex business requirements.
