Writing sub-aggregations with the Elasticsearch 8 Java API

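As is well known, the Java API style before Elasticsearch 8 was fairly conservative and changed little from release to release. That ended with the upgrade to Elasticsearch 8, which deprecates RestHighLevelClient and the dependencies that went with it: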

XML
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.17.29</version>
</dependency>
XML
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.17.24</version>
</dependency>
XML
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>7.17.24</version>
</dependency>

The replacement is the new Java API Client:

XML
<dependency>
    <groupId>co.elastic.clients</groupId>
    <artifactId>elasticsearch-java</artifactId>
    <version>8.8.2</version>
</dependency>

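So after upgrading the Elasticsearch server, a large amount of client code has to be updated, because almost every API has changed. Query APIs are comparatively easy to port; aggregation APIs changed a great deal, especially sub-aggregations nested many levels deep.

First, here is how sub-aggregations were written with the Java API before ES 8.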

java
                // First aggregation: sum of last7DayAvgRealTimeOrders
                AggregationBuilder realtimeOrdersAgg = AggregationBuilders.sum("last7DayAvgRealTimeOrdersSum").field("last7DayAvgRealTimeOrders");
                // Second aggregation: terms on productNo
                AggregationBuilder productAgg = AggregationBuilders.terms("termProductCode").field("productNo").size(1000);
                // Third aggregation: hourly date histogram on startTime (UTC+8)
                DateHistogramAggregationBuilder aggregationBuilder = AggregationBuilders
                        .dateHistogram("startTimeTerm").field("startTime").timeZone(DateTimeZone.forOffsetHours(8))
                        .dateHistogramInterval(DateHistogramInterval.hours(1))
                        .extendedBounds(new ExtendedBounds(dateStart.getTime(), dateEnd.getTime()));
                // Wire up the nesting: terms -> date histogram -> sum
                productAgg.subAggregation(aggregationBuilder.subAggregation(realtimeOrdersAgg));

                SearchResponse response = monitorEsUtils.getClient().prepareSearch("test")
                        .setSearchType(SearchType.QUERY_THEN_FETCH).setQuery(boolQueryBuilder)
                        .addAggregation(productAgg)
                        .setSize(0)
                        .get();

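As you can see, nesting sub-aggregations before ES 8 was quite elegant. With ES 8, writing them becomes far more cumbersome.

Another article describes how sub-aggregations are written in ES 8: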

https://blog.csdn.net/wojiushiwo945you/article/details/149544848

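As that article shows, an ES 8 aggregation is created through the static of method of the Aggregation class, and its sub-aggregations have to be created inside the parent aggregation. When nesting sub-aggregations you therefore have to keep the dependencies between all aggregations and sub-aggregations clear, and the final aggregation code ends up as one large block.

That style works, but the code is hard to reuse, and it has two further problems:

1. When sub-aggregations are deep and numerous, the ES 8 lambda style makes the code hard to write and hard to modify, and the whole block grows very large.

2. If a sub-aggregation is optional, for example added only when some condition holds, this style runs into trouble.

Readers who have come this far probably know what I mean, and may be facing exactly this problem.

After repeated experiments, I found the following rules:

1. Once an ES 8 aggregation instance has been built, sub-aggregations can no longer be added to it or changed.

2. An ES 8 aggregation instance cannot be given a name when it is created; the name is assigned only when it is added to a query or to a parent as a sub-aggregation.

3. Sub-aggregations can be added to an ES 8 parent aggregation as a collection (Map<String, Aggregation>), where the key is the aggregation name.

Based on these rules, the approach I settled on is to collect the sub-aggregations in a Map: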

java
        // Aggregation.of(...) is used here only as a means to build the variant: dateHistogram() extracts the DateHistogramAggregation from it
        DateHistogramAggregation dateHistogramAggregation = Aggregation.of(a -> a.dateHistogram(b -> {
            DateHistogramAggregation.Builder builder = b.field("startTime").timeZone(ZoneId.of("Asia/Shanghai").toString())
                    .fixedInterval(Time.of(t -> t.time("1d"))).extendedBounds(ExtendedBounds.of(e -> e.max(FieldDateMath.of(f -> f.value((double) 1765273915)))
                            .min(FieldDateMath.of(f -> f.value((double) 1765273915)))));
            return builder;
        })).dateHistogram();

        Map<String, Aggregation> aggregationBuilderSubAggregationMap = new HashMap<>();
        // With the map in place, sub-aggregations can be added on demand (if/else); the map's contents stay mutable until the parent is built
        Script script = Script.of(m -> m.inline(n -> n.source("if (doc.oo_code.size()!=0) { return doc.oo_code.value.splitOnToken(','); }").lang(ScriptLanguage.Painless)));
        Script script2 = Script.of(m -> m.inline(n -> n.source("if(doc.comp_time.value.toInstant().toEpochMilli() - doc.start_time.value.toInstant().toEpochMilli() > 0 ) return doc.comp_time.value.toInstant().toEpochMilli() - doc.start_time.value.toInstant().toEpochMilli() ").lang(ScriptLanguage.Painless)));
        aggregationBuilderSubAggregationMap.put("term_sum_invokeagg", Aggregation.of(a -> a.sum(v -> v.script(script))));
        // "true" stands in for whatever condition decides whether the optional sub-aggregation is needed
        if (true) {
            aggregationBuilderSubAggregationMap.put("term_sum_codeagg", Aggregation.of(a -> a.sum(v -> v.script(script2))));
        }
        Aggregation aggregation = Aggregation.of(a -> {
            Aggregation.Builder.ContainerBuilder containerBuilder = a.dateHistogram(dateHistogramAggregation);
            // Attach the map prepared above; this works even if the map is empty
            containerBuilder.aggregations(aggregationBuilderSubAggregationMap);
            return containerBuilder;
        });
        SearchRequest searchRequest = new SearchRequest.Builder()
                .index("test").searchType(SearchType.QueryThenFetch)
                .query(boolQuery.build()._toQuery())
                .aggregations(TERM_START_TIME, aggregation)
                .size(0).build();


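As shown above, the sub-aggregation instances are added to the aggregationBuilderSubAggregationMap object, and only at the end is the parent aggregation instantiated and added to the search request.

In practice, deep nesting may require several Maps nested inside one another. This makes conditional sub-aggregations possible and lets some leaf aggregation instances be reused. The outermost parent aggregation still has to be instantiated last.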
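The nested-Map pattern can be sketched without the client library. The following is a minimal illustration under stated assumptions, not the author's code: Agg is a hypothetical immutable stand-in for an aggregation (it is not a client class), and the names orders_sum, orders_avg and by_hour are invented for the example. It only shows the order of work: fill the innermost Map first, conditionally if needed, and create each parent last.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an immutable aggregation: once constructed,
// its children can no longer change (mirrors rule 1). Not a client class.
final class Agg {
    final String kind;
    final Map<String, Agg> children;

    Agg(String kind, Map<String, Agg> children) {
        this.kind = kind;
        // Defensive copy wrapped as unmodifiable, so later changes to the caller's map have no effect
        this.children = Collections.unmodifiableMap(new LinkedHashMap<>(children));
    }

    static Agg leaf(String kind) {
        return new Agg(kind, Collections.emptyMap());
    }
}

final class NestedMapSketch {
    // Builds terms -> date_histogram -> {sum, optional avg}, innermost level first.
    static Agg build(boolean includeAvg) {
        Agg sum = Agg.leaf("sum"); // a finished leaf could be put into several maps

        Map<String, Agg> histogramChildren = new LinkedHashMap<>();
        histogramChildren.put("orders_sum", sum);
        if (includeAvg) {
            histogramChildren.put("orders_avg", Agg.leaf("avg")); // conditional child
        }
        Agg histogram = new Agg("date_histogram", histogramChildren);

        Map<String, Agg> termsChildren = new LinkedHashMap<>();
        termsChildren.put("by_hour", histogram);
        return new Agg("terms", termsChildren); // the outermost parent is created last
    }

    public static void main(String[] args) {
        System.out.println(build(true).children.get("by_hour").children.keySet());
        System.out.println(build(false).children.get("by_hour").children.keySet());
    }
}
```

With the real client, each new Agg(...) call would correspond to an Aggregation.of(...) call that receives the prepared Map through aggregations(...), as in the snippet earlier in this article.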

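The Map approach satisfies the query requirements, but compared with how aggregations were created before ES 8 it still has drawbacks: several Maps have to be created, the Maps may themselves be nested, and the aggregations are no longer created in their natural order. Code reuse is still limited.

After using the Map approach for a while, I noticed that the API contains an AggregationVariant interface. Judging by its name it is a variant of an aggregation, or what I will call a half-finished aggregation, and with it in hand a finished aggregation instance is easy to obtain. Since aggregations depend on one another, they have to be instantiated in a particular order, which led to the following solution: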

java
import co.elastic.clients.elasticsearch._types.aggregations.*;
import co.elastic.clients.elasticsearch._types.query_dsl.Query;


import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

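/**
 * Context for one node of a nested aggregation tree.
 */
public class NestAggregationContext {
    /**
     * Aggregation name.
     */
    private String name;

    /**
     * Finished aggregation for this name; required to build the tree.
     * Set this when the node has no child nodes. For a node with
     * children, set aggregationVariant below instead.
     */
    private Aggregation aggregation;

    /**
     * Aggregation variant (a "half-finished" aggregation) for this name.
     * Set this only when the node has child nodes; a node without
     * children must use the aggregation property instead.
     */
    private AggregationVariant aggregationVariant;

    /**
     * Child contexts, i.e. the sub-aggregations of this node.
     */
    protected Set<NestAggregationContext> subNestAggregationContexts;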

    private NestAggregationContext(Builder builder) {
        this.name = builder.name;
        this.aggregationVariant = builder.aggregationVariant;
        this.aggregation = builder.aggregation;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Aggregation getAggregation() {
        return aggregation;
    }


    public AggregationVariant getAggregationVariant() {
        return aggregationVariant;
    }

    public void setAggregationVariant(AggregationVariant aggregationVariant) {
        this.aggregationVariant = aggregationVariant;
    }

    public Set<NestAggregationContext> getSubNestAggregationContexts() {
        return subNestAggregationContexts;
    }

    public static class Builder {
        private String name;
        private Aggregation aggregation;
        private AggregationVariant aggregationVariant;

        public Builder(String name) {
            this.name = name;
        }

        public Builder aggregation(Aggregation aggregation) {
            this.aggregation = aggregation;
            return this;
        }

        public Builder aggregationVariant(AggregationVariant aggregationVariant) {
            this.aggregationVariant = aggregationVariant;
            return this;
        }

        public NestAggregationContext build() {
            if (aggregation == null && aggregationVariant == null) {
                throw new IllegalArgumentException("One of aggregation and aggregationVariant must be set");
            }
            if (aggregation != null && aggregationVariant != null) {
                throw new IllegalArgumentException("aggregation and aggregationVariant must not both be set");
            }
            return new NestAggregationContext(this);
        }
    }

    /**
     * Sets the child contexts (sub-aggregation dependencies) of this node.
     *
     * @param subNestAggregationContexts direct children of this node; may be empty only for a finished aggregation
     */
    public void setSubNestAggregationContexts(Set<NestAggregationContext> subNestAggregationContexts) {
        if (subNestAggregationContexts.isEmpty() && aggregation == null) {
            String format = "Aggregation '%s' is misconfigured: aggregation is null, but no child aggregations were registered for it";
            throw new IllegalArgumentException(String.format(format, name));
        }
        this.subNestAggregationContexts = subNestAggregationContexts;
    }


    /**
     * Builds the aggregation for a context whose aggregation property is still null.
     */
    protected void buildAggregation() {
        if (aggregation != null || subNestAggregationContexts.isEmpty()) {
            return;
        }
        // Convert the child contexts into a name -> Aggregation map
        Map<String, Aggregation> collect = subNestAggregationContexts.stream().collect(Collectors.toMap(a -> a.getName(), a -> a.getAggregation()));
        // Dispatch on the variant type, attach the children, and store the finished aggregation in the aggregation field
        if (aggregationVariant instanceof ValueCountAggregation) {
            aggregation = Aggregation.of(a -> {
                Aggregation.Builder.ContainerBuilder containerBuilder = a.valueCount((ValueCountAggregation) aggregationVariant);
                containerBuilder.aggregations(collect);
                return containerBuilder;
            });
            return;
        }

        if (aggregationVariant instanceof Query) {
            aggregation = Aggregation.of(a -> {
                Aggregation.Builder.ContainerBuilder containerBuilder = a.filter((Query) aggregationVariant);
                containerBuilder.aggregations(collect);
                return containerBuilder;
            });
            return;
        }

        if (aggregationVariant instanceof TermsAggregation) {
            aggregation = Aggregation.of(a -> {
                Aggregation.Builder.ContainerBuilder containerBuilder = a.terms((TermsAggregation) aggregationVariant);
                containerBuilder.aggregations(collect);
                return containerBuilder;
            });
            return;
        }

        if (aggregationVariant instanceof NestedAggregation) {
            aggregation = Aggregation.of(a -> {
                Aggregation.Builder.ContainerBuilder containerBuilder = a.nested((NestedAggregation) aggregationVariant);
                containerBuilder.aggregations(collect);
                return containerBuilder;
            });
            return;
        }

        if (aggregationVariant instanceof DateHistogramAggregation) {
            aggregation = Aggregation.of(a -> {
                Aggregation.Builder.ContainerBuilder containerBuilder = a.dateHistogram((DateHistogramAggregation) aggregationVariant);
                containerBuilder.aggregations(collect);
                return containerBuilder;
            });
            return;
        }

        if (aggregationVariant instanceof AverageAggregation) {
            aggregation = Aggregation.of(a -> {
                Aggregation.Builder.ContainerBuilder containerBuilder = a.avg((AverageAggregation) aggregationVariant);
                containerBuilder.aggregations(collect);
                return containerBuilder;
            });
            return;
        }
        if (aggregationVariant instanceof SumAggregation) {
            aggregation = Aggregation.of(a -> {
                Aggregation.Builder.ContainerBuilder containerBuilder = a.sum((SumAggregation) aggregationVariant);
                containerBuilder.aggregations(collect);
                return containerBuilder;
            });
            return;
        }
        // TODO: add further variant types as needed, or replace the chain with generic code
        throw new UnsupportedOperationException("Aggregation '" + name + "' uses an unsupported variant type: " + aggregationVariant.getClass());

    }
}

java
import co.elastic.clients.elasticsearch._types.aggregations.Aggregation;
import co.elastic.clients.elasticsearch._types.aggregations.AggregationBuilders;
import com.google.common.graph.GraphBuilder;
import com.google.common.graph.MutableGraph;

import java.util.*;

/**
 * Utility for nesting aggregations with the ES 8 Java API.
 */
public class Es8AggregateNestUtil {

    /**
     * Guava graph that holds the nested aggregation tree, i.e. the dependencies between aggregations.
     */
    private MutableGraph<NestAggregationContext> mapMutableGraph = GraphBuilder.directed().allowsSelfLoops(false).build();

    /**
     * Adds one parent/child dependency. It is called a graph, but the result is really a tree.
     *
     * @param parentNode parent aggregation node
     * @param subNode    child aggregation node
     */
    public void buildAggregateGraph(NestAggregationContext parentNode, NestAggregationContext subNode) {
        // putEdge also adds both nodes to the graph if they are not present yet
        mapMutableGraph.putEdge(parentNode, subNode);
    }


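    /**
     * Returns the finished aggregation as a map. Because the structure is a tree, the map has a single key.
     *
     * @return map from the root aggregation name to the root aggregation
     */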
    public Map<String, Aggregation> getAggregateMap() {
        NestAggregationContext nestAggregationContext = buildAggregate();
        Map<String, Aggregation> map = new HashMap<>();
        map.put(nestAggregationContext.getName(), nestAggregationContext.getAggregation());
        return map;
    }

    /**
     * Returns the finished top-level aggregation.
     *
     * @return the root aggregation
     */
    public Aggregation getAggregate() {
        NestAggregationContext nestAggregationContext = buildAggregate();
        return nestAggregationContext.getAggregation();
    }

    /**
     * Builds the aggregations from the contexts stored in the graph.
     *
     * @return the root context, with its aggregation built
     */
    NestAggregationContext buildAggregate() {
        NestAggregationContext rootAgg = null;
        // Scan all nodes in the graph
        for (NestAggregationContext nest : mapMutableGraph.nodes()) {
            // In-degree of the current node
            int o = mapMutableGraph.inDegree(nest);
            if (o == 0) {
                // In-degree 0 means this node is the root of the tree; stop searching
                rootAgg = nest;
                break;
            }
        }
        if (rootAgg == null) {
            throw new IllegalStateException("No root aggregation found: the graph is empty or contains a cycle");
        }
        // Sorted map from tree level to the contexts on that level. Reverse order, because
        // a parent can only be built after its children have been built.
        Map<Integer, List<NestAggregationContext>> nestAggregationContextOrderMap = new TreeMap<>(Collections.reverseOrder());
        // Collect the contexts level by level, starting from the root at level 1
        buildSubNestAggregateContext(rootAgg, nestAggregationContextOrderMap, 1);
        // The map is now keyed by tree level; since it iterates in reverse order, building starts at the deepest level
        for (Map.Entry<Integer, List<NestAggregationContext>> integerListEntry : nestAggregationContextOrderMap.entrySet()) {
            for (NestAggregationContext nestAggregationContext : integerListEntry.getValue()) {
                // Build the ES aggregation for this context
                nestAggregationContext.buildAggregation();
            }
        }

        return rootAgg;
    }


    /**
     * Registers a node at its tree level and recurses into its children.
     *
     * @param parentNode parent context
     * @param map        contexts grouped by tree level
     * @param index      tree level of parentNode
     */
    private void buildSubNestAggregateContext(NestAggregationContext parentNode, Map<Integer, List<NestAggregationContext>> map, int index) {
        if (!map.containsKey(index)) {
            map.put(index, new ArrayList<>());
        }
        map.get(index).add(parentNode);
        // Successors are the direct children (one level down only)
        Set<NestAggregationContext> subNestAggregationContexts = mapMutableGraph.successors(parentNode);
        // Attach the direct children to the parent
        parentNode.setSubNestAggregationContexts(subNestAggregationContexts);
        // Recurse into every child
        for (NestAggregationContext aggregationMap : subNestAggregationContexts) {
            // index + 1 rather than ++index, so that siblings land on the same level
            buildSubNestAggregateContext(aggregationMap, map, index + 1);
        }
    }

    public static void main(String[] args) {
        NestAggregationContext a1 = new NestAggregationContext.Builder("A1").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
        NestAggregationContext a2 = new NestAggregationContext.Builder("A2").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
        NestAggregationContext a3 = new NestAggregationContext.Builder("A3").aggregationVariant(AggregationBuilders.filter().bool(b -> b.mustNot(q -> q.term(o -> o.field("mutual_rescue").value("Y")))).build()).build();
        NestAggregationContext a4 = new NestAggregationContext.Builder("A4").aggregationVariant(AggregationBuilders.valueCount().build()).build();
        NestAggregationContext a5 = new NestAggregationContext.Builder("A5").aggregationVariant(AggregationBuilders.nested().build()).build();
        Es8AggregateNestUtil aggregationNestUtil = new Es8AggregateNestUtil();
        aggregationNestUtil.buildAggregateGraph(a5, a4);
        aggregationNestUtil.buildAggregateGraph(a5, a3);
        aggregationNestUtil.buildAggregateGraph(a3, a2);
        aggregationNestUtil.buildAggregateGraph(a4, a1);


        Aggregation allAggregate = aggregationNestUtil.getAggregate();
        System.out.println(allAggregate);
    }
}

The code above provides the utility class and a test method. Note that it depends on the Guava library:

XML
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>24.0-jre</version>
</dependency>

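The approach works as follows.

For a nested aggregation, first work out which aggregations can be instantiated directly, and instantiate those:

java
// Builder pattern: the name is mandatory and later becomes the name of the aggregation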
NestAggregationContext a1 = new NestAggregationContext.Builder("A1").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();

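If an aggregation cannot be instantiated directly, the reason is always that its sub-aggregations have not been instantiated yet. In that case, obtain its half-finished form (the variant) first:

java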
NestAggregationContext a4 = new NestAggregationContext.Builder("A4").aggregationVariant(AggregationBuilders.valueCount().build()).build();

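Note the difference between the two snippets: one calls aggregation() on the Builder, the other calls aggregationVariant(). Both eventually return a NestAggregationContext, a custom class that takes care of instantiating all the aggregations (see its buildAggregation() method; only some variant types are implemented there, and it can be extended).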
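One possible way to extend buildAggregation() without lengthening the instanceof chain is a lookup table keyed by variant class. The sketch below is only an illustration that runs on the plain JDK: TermsVariant, FilterVariant and the String result are hypothetical stand-ins, not client types. In the real class, a registered function would presumably wrap the same calls the existing branches already make, such as a.terms(variant) followed by aggregations(map).

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;

final class VariantDispatchSketch {
    // Hypothetical stand-ins for variant classes; not part of the Elasticsearch client.
    static final class TermsVariant {}
    static final class FilterVariant {}

    // variant class -> function that turns (variant, named children) into a finished result
    private final Map<Class<?>, BiFunction<Object, Map<String, String>, String>> finishers = new LinkedHashMap<>();

    <T> void register(Class<T> type, BiFunction<T, Map<String, String>, String> finisher) {
        // The cast is safe because finish() looks the function up by the variant's own class
        finishers.put(type, (variant, subs) -> finisher.apply(type.cast(variant), subs));
    }

    String finish(Object variant, Map<String, String> subs) {
        BiFunction<Object, Map<String, String>, String> finisher = finishers.get(variant.getClass());
        if (finisher == null) {
            throw new UnsupportedOperationException("unsupported variant type: " + variant.getClass());
        }
        return finisher.apply(variant, subs);
    }

    public static void main(String[] args) {
        VariantDispatchSketch dispatch = new VariantDispatchSketch();
        dispatch.register(TermsVariant.class, (v, subs) -> "terms" + subs.keySet());
        dispatch.register(FilterVariant.class, (v, subs) -> "filter" + subs.keySet());
        // prints terms[orders_sum]
        System.out.println(dispatch.finish(new TermsVariant(), Collections.singletonMap("orders_sum", "sum")));
    }
}
```

Supporting a new variant type then means one more register call instead of another if block; an unregistered type still fails loudly, as in the original method.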

Because the finished aggregations and the variants are wrapped in NestAggregationContext objects, the next step after creating those objects is to configure the relationships between them.

java
        NestAggregationContext a1 = new NestAggregationContext.Builder("A1").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
        NestAggregationContext a2 = new NestAggregationContext.Builder("A2").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
        NestAggregationContext a3 = new NestAggregationContext.Builder("A3").aggregationVariant(AggregationBuilders.filter().bool(b -> b.mustNot(q -> q.term(o -> o.field("mutual_rescue").value("Y")))).build()).build();
        NestAggregationContext a4 = new NestAggregationContext.Builder("A4").aggregationVariant(AggregationBuilders.valueCount().build()).build();
        NestAggregationContext a5 = new NestAggregationContext.Builder("A5").aggregationVariant(AggregationBuilders.nested().build()).build();
        Es8AggregateNestUtil aggregationNestUtil = new Es8AggregateNestUtil();
        aggregationNestUtil.buildAggregateGraph(a5, a4);
        aggregationNestUtil.buildAggregateGraph(a5, a3);
        aggregationNestUtil.buildAggregateGraph(a3, a2);
        aggregationNestUtil.buildAggregateGraph(a4, a1);

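The code instantiates Es8AggregateNestUtil and calls its buildAggregateGraph() method, which records the dependencies between aggregations: the first argument is the parent node, the second is a direct child of that parent.

Once the relationships are configured:

java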
Aggregation allAggregate = aggregationNestUtil.getAggregate();

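Calling getAggregate() or getAggregateMap() on Es8AggregateNestUtil then returns the final top-level aggregation.

In short: wrap each aggregation (finished instance or variant), record the dependencies between them in a Guava graph, then instantiate level by level from the leaf nodes up to the root and return the result.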
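The leaf-to-root ordering can be illustrated on its own. The sketch below is a simplified model under stated assumptions, not the utility itself: it uses plain JDK collections instead of Guava, node names instead of NestAggregationContext objects, and the hypothetical class name BuildOrderSketch. It finds the root as the node that is never a child, groups nodes by tree level, and walks the levels from deepest to shallowest, using the same A1 to A5 wiring as the test method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class BuildOrderSketch {
    // parent name -> direct child names, in insertion order
    private final Map<String, List<String>> children = new LinkedHashMap<>();
    private final Set<String> nodes = new LinkedHashSet<>();
    private final Set<String> hasParent = new LinkedHashSet<>();

    void putEdge(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        nodes.add(parent);
        nodes.add(child);
        hasParent.add(child);
    }

    // The root is the node that never appears as a child (in-degree 0).
    String root() {
        for (String node : nodes) {
            if (!hasParent.contains(node)) {
                return node;
            }
        }
        throw new IllegalStateException("no root: the graph is empty or contains a cycle");
    }

    // Returns node names deepest level first, so every child precedes its parent.
    List<String> buildOrder() {
        Map<Integer, List<String>> byDepth = new TreeMap<>(Collections.reverseOrder());
        collect(root(), 1, byDepth);
        List<String> order = new ArrayList<>();
        byDepth.values().forEach(order::addAll);
        return order;
    }

    private void collect(String node, int depth, Map<Integer, List<String>> byDepth) {
        byDepth.computeIfAbsent(depth, k -> new ArrayList<>()).add(node);
        for (String child : children.getOrDefault(node, Collections.emptyList())) {
            collect(child, depth + 1, byDepth); // depth + 1 keeps siblings on the same level
        }
    }

    public static void main(String[] args) {
        BuildOrderSketch sketch = new BuildOrderSketch();
        sketch.putEdge("A5", "A4");
        sketch.putEdge("A5", "A3");
        sketch.putEdge("A3", "A2");
        sketch.putEdge("A4", "A1");
        // prints [A1, A2, A4, A3, A5]
        System.out.println(sketch.buildOrder());
    }
}
```

Any order in which children come before their parents would do; grouping by level is simply the scheme the utility class uses.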

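Notes:

1. The utility handles a single root node. If there are several roots, use one utility instance per root. (This already covers roughly 80% of requirements.)

2. Configure the dependencies carefully: a leaf node must always be a finished aggregation, never a variant, and the dependencies must not form a cycle. Keep in mind that the Elasticsearch server only accepts sub-aggregations under bucket aggregations, so metric variants such as valueCount, avg and sum, which appear as parents in the test method purely to show the wiring, will be rejected by the server if they carry children.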
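The structural requirements in these notes (one root, no cycles) can be checked before building. The following is a minimal sketch under stated assumptions rather than part of the utility: it works on node names with JDK collections only, AggregationTreeCheck is a hypothetical name, and it additionally reports nodes with more than one parent, which a tree does not allow. Guava also ships Graphs.hasCycle, which could serve for the cycle part when working on the MutableGraph directly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AggregationTreeCheck {
    private final Map<String, List<String>> children = new LinkedHashMap<>();
    private final Map<String, Integer> inDegree = new LinkedHashMap<>();

    void putEdge(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        inDegree.putIfAbsent(parent, 0);
        inDegree.merge(child, 1, Integer::sum);
    }

    // Returns a list of problems; an empty list means the edges form a single tree.
    List<String> problems() {
        List<String> problems = new ArrayList<>();
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey()); // candidate root
            } else if (entry.getValue() > 1) {
                problems.add(entry.getKey() + " has more than one parent");
            }
        }
        if (ready.size() != 1) {
            problems.add("expected exactly one root, found " + ready.size());
        }
        // Kahn's algorithm: repeatedly remove nodes whose parents have all been visited.
        // Nodes that are never reached this way lie on, or behind, a cycle.
        Map<String, Integer> remaining = new LinkedHashMap<>(inDegree);
        int visited = 0;
        while (!ready.isEmpty()) {
            String node = ready.poll();
            visited++;
            for (String child : children.getOrDefault(node, Collections.emptyList())) {
                if (remaining.merge(child, -1, Integer::sum) == 0) {
                    ready.add(child);
                }
            }
        }
        if (visited != inDegree.size()) {
            problems.add("the dependencies contain a cycle");
        }
        return problems;
    }

    public static void main(String[] args) {
        AggregationTreeCheck check = new AggregationTreeCheck();
        check.putEdge("A5", "A4");
        check.putEdge("A5", "A3");
        check.putEdge("A3", "A2");
        check.putEdge("A4", "A1");
        // prints []
        System.out.println(check.problems());
    }
}
```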
