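As most readers know, the Elasticsearch Java API stayed fairly conservative before Elasticsearch 8 and changed little between releases. That ended with the upgrade to Elasticsearch 8, which dropped the RestHighLevelClient (deprecated since 7.15) together with the dependencies that went with it: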
XML
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.17.29</version>
</dependency>
XML
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>7.17.24</version>
</dependency>
XML
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>7.17.24</version>
</dependency>
and switched to the new Java API client:
XML
<dependency>
<groupId>co.elastic.clients</groupId>
<artifactId>elasticsearch-java</artifactId>
<version>8.8.2</version>
</dependency>
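As a result, upgrading the Elasticsearch server means rewriting a large amount of client code, because almost every API has changed. Query APIs are comparatively easy to port. Aggregation APIs changed a lot, especially sub-aggregations nested several levels deep.
First, here is how sub-aggregations were written with the Java API before ES8.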
java
// First aggregation: sum over the metric field
AggregationBuilder realtimeOrdersAgg = AggregationBuilders.sum("last7DayAvgRealTimeOrdersSum")
        .field("last7DayAvgRealTimeOrders");
// Second aggregation: terms on the product number
AggregationBuilder productAgg = AggregationBuilders.terms("termProductCode").field("productNo").size(1000);
// Third aggregation: hourly date histogram
DateHistogramAggregationBuilder aggregationBuilder = AggregationBuilders
        .dateHistogram("startTimeTerm").field("startTime").timeZone(DateTimeZone.forOffsetHours(8))
        .dateHistogramInterval(DateHistogramInterval.hours(1))
        .extendedBounds(new ExtendedBounds(dateStart.getTime(), dateEnd.getTime()));
// Wire up the nesting: terms -> date histogram -> sum
productAgg.subAggregation(aggregationBuilder.subAggregation(realtimeOrdersAgg));
SearchResponse response = monitorEsUtils.getClient().prepareSearch("test")
        .setSearchType(SearchType.QUERY_THEN_FETCH).setQuery(boolQueryBuilder)
        .addAggregation(productAgg)
        .setSize(0)
        .get();
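As you can see, setting up sub-aggregations before ES8 was quite elegant. With ES8, writing them becomes remarkably cumbersome.
For the ES8 way of writing sub-aggregations, I will refer to another article: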
https://blog.csdn.net/wojiushiwo945you/article/details/149544848
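In ES8, an aggregation is created through the static of method of the Aggregation class, and its sub-aggregations have to be created inside the parent aggregation. When sub-aggregations are nested, the dependencies between all aggregations and sub-aggregations therefore have to be clear up front, and everything ends up in one large block of aggregation code.
This form works, but the code is hard to reuse, and it has two further problems:
1. When sub-aggregations are deep and numerous, the ES8 lambda style makes the code hard to write and hard to modify, and the block grows very large.
2. If a sub-aggregation is optional, for example added only when some condition holds, this style gets awkward.
Anyone who has read this far probably knows what I mean, and may be facing exactly this problem right now.
After repeated experiments, I found the following rules:
1. Once an ES8 aggregation instance has been built, sub-aggregations can no longer be added to it or modified.
2. An ES8 aggregation instance cannot be given a name when it is created. The name is assigned only when it is added to a search request or attached to a parent as a sub-aggregation.
3. Sub-aggregations can be added to an ES8 parent aggregation as a collection (Map<String,Aggregation>), where the key is the aggregation name.
Based on these rules, the pattern I arrived at is to use a Map, like this: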
java
// Aggregation.of is used here, but the Aggregation itself is not kept;
// dateHistogram() extracts the DateHistogramAggregation variant from it
DateHistogramAggregation dateHistogramAggregation = Aggregation.of(a -> a.dateHistogram(b -> b
        .field("startTime")
        .timeZone(ZoneId.of("Asia/Shanghai").toString())
        .fixedInterval(Time.of(t -> t.time("1d")))
        .extendedBounds(ExtendedBounds.of(e -> e
                .max(FieldDateMath.of(f -> f.value((double) 1765273915)))
                .min(FieldDateMath.of(f -> f.value((double) 1765273915)))))
)).dateHistogram();
Map<String, Aggregation> aggregationBuilderSubAggregationMap = new HashMap<>();
// The map stays mutable, so sub-aggregations can be added conditionally (if/else) from here on
Script script = Script.of(m -> m.inline(n -> n.source("if (doc.oo_code.size()!=0) { return doc.oo_code.value.splitOnToken(',')").lang(ScriptLanguage.Painless)));
Script script2 = Script.of(m -> m.inline(n -> n.source("if(doc.comp_time.value.toInstant().toEpochMilli() - doc.start_time.value.toInstant().toEpochMilli() > 0 ) return doc.comp_time.value.toInstant().toEpochMilli() - doc.start_time.value.toInstant().toEpochMilli() ").lang(ScriptLanguage.Painless)));
aggregationBuilderSubAggregationMap.put("term_sum_invokeagg", Aggregation.of(a -> a.sum(v -> v.script(script))));
if (true) { // placeholder: substitute the real condition
    aggregationBuilderSubAggregationMap.put("term_sum_codeagg", Aggregation.of(a -> a.sum(v -> v.script(script2))));
}
// The parent is built last, once all of its sub-aggregations are known
Aggregation aggregation = Aggregation.of(a -> a
        .dateHistogram(dateHistogramAggregation)
        // attach the prepared map; an empty map is fine too
        .aggregations(aggregationBuilderSubAggregationMap));
// boolQuery and TERM_START_TIME are defined elsewhere in the surrounding code
SearchRequest searchRequest = new SearchRequest.Builder()
        .index("test")
        .searchType(SearchType.QueryThenFetch)
        .query(boolQuery.build()._toQuery())
        .aggregations(TERM_START_TIME, aggregation)
        .size(0)
        .build();
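Here the sub-aggregation instances are added to the aggregationBuilderSubAggregationMap object first, and only then is the parent aggregation instantiated and added to the search request.
In practice, deeper nesting may require several Map collections, nested one inside another. This approach makes conditional sub-aggregations possible and lets some leaf aggregation instances be reused, but the top-level parent aggregation still has to be instantiated last.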
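The multi-Map nesting described above can be sketched without the Elasticsearch client on the classpath. This is only an illustration of the order of operations, not the client's API: `SketchAgg` is a hypothetical stand-in for an immutable aggregation, and the aggregation names are made up. The innermost Map is filled first, optional entries are added conditionally, and each parent is constructed only after its Map is complete.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for an immutable aggregation: a type label plus named children. */
final class SketchAgg {
    private final String type;
    private final Map<String, SketchAgg> children;

    SketchAgg(String type, Map<String, SketchAgg> children) {
        this.type = type;
        // Defensive copy: once built, later changes to the caller's map have no effect
        this.children = Collections.unmodifiableMap(new LinkedHashMap<String, SketchAgg>(children));
    }

    static SketchAgg leaf(String type) {
        return new SketchAgg(type, Collections.<String, SketchAgg>emptyMap());
    }

    /** Renders the tree as type{name:child,...}; the name comes from the parent's map key. */
    String render() {
        if (children.isEmpty()) {
            return type;
        }
        StringBuilder sb = new StringBuilder(type).append('{');
        boolean first = true;
        for (Map.Entry<String, SketchAgg> e : children.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(e.getKey()).append(':').append(e.getValue().render());
            first = false;
        }
        return sb.append('}').toString();
    }
}

final class NestedMapSketch {
    static SketchAgg build(boolean includeDuration) {
        // Innermost level first: children of the date histogram
        Map<String, SketchAgg> hourChildren = new LinkedHashMap<>();
        hourChildren.put("order_sum", SketchAgg.leaf("sum"));
        if (includeDuration) {
            // Optional sub-aggregation, added only when the condition holds
            hourChildren.put("duration_sum", SketchAgg.leaf("sum"));
        }
        // Next level up: children of the terms aggregation
        Map<String, SketchAgg> productChildren = new LinkedHashMap<>();
        productChildren.put("by_hour", new SketchAgg("date_histogram", hourChildren));
        // The top-level parent is created last
        return new SketchAgg("terms", productChildren);
    }

    public static void main(String[] args) {
        System.out.println(build(true).render());
        System.out.println(build(false).render());
    }
}
```

With the real client, each `new SketchAgg(type, map)` would correspond to an `Aggregation.of(...)` call that attaches the prepared Map through `aggregations(...)`, as in the snippet above.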
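This approach covers the query requirements, but compared with how aggregations were created before ES8 it still has drawbacks: several Maps are needed, the Maps may themselves be nested, and the order in which aggregations are created gets scrambled. Code reuse is still limited.
After using the Map approach for a while, I noticed the AggregationVariant interface in the API. Judging by its name, it is a variant of an aggregation, or, put differently, a half-finished aggregation, and from such a half-finished product it is easy to obtain a complete aggregation instance. Because aggregations depend on each other, they have to be instantiated in a particular order, which led to the following design: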
java
import co.elastic.clients.elasticsearch._types.aggregations.*;
import co.elastic.clients.elasticsearch._types.query_dsl.Query;

import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/**
 * Context for one node in a tree of nested aggregations.
 */
public class NestAggregationContext {
    /**
     * Aggregation name.
     */
    private String name;
    /**
     * The finished aggregation for this name; every node needs one before its parent can be built.
     * Set it directly when the node has no children (a leaf);
     * otherwise set the aggregationVariant property below instead.
     */
    private Aggregation aggregation;
    /**
     * The aggregation variant (half-finished aggregation) for this name.
     * Set it when the node has children; do not set it on a leaf,
     * which takes the aggregation property instead.
     */
    private AggregationVariant aggregationVariant;
    /**
     * Contexts of this node's direct sub-aggregations.
     */
    protected Set<NestAggregationContext> subNestAggregationContexts;

    private NestAggregationContext(Builder builder) {
        this.name = builder.name;
        this.aggregationVariant = builder.aggregationVariant;
        this.aggregation = builder.aggregation;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Aggregation getAggregation() {
        return aggregation;
    }

    public AggregationVariant getAggregationVariant() {
        return aggregationVariant;
    }

    public void setAggregationVariant(AggregationVariant aggregationVariant) {
        this.aggregationVariant = aggregationVariant;
    }

    public Set<NestAggregationContext> getSubNestAggregationContexts() {
        return subNestAggregationContexts;
    }

    public static class Builder {
        private String name;
        private Aggregation aggregation;
        private AggregationVariant aggregationVariant;

        public Builder(String name) {
            this.name = name;
        }

        public Builder aggregation(Aggregation aggregation) {
            this.aggregation = aggregation;
            return this;
        }

        public Builder aggregationVariant(AggregationVariant aggregationVariant) {
            this.aggregationVariant = aggregationVariant;
            return this;
        }

        public NestAggregationContext build() {
            // Exactly one of aggregation / aggregationVariant must be set
            if (aggregation == null && aggregationVariant == null) {
                throw new IllegalArgumentException("Either aggregation or aggregationVariant must be set");
            }
            if (aggregation != null && aggregationVariant != null) {
                throw new IllegalArgumentException("aggregation and aggregationVariant must not both be set");
            }
            return new NestAggregationContext(this);
        }
    }

    /**
     * Sets the sub-aggregation contexts this node depends on.
     *
     * @param subNestAggregationContexts contexts of the direct sub-aggregations
     */
    public void setSubNestAggregationContexts(Set<NestAggregationContext> subNestAggregationContexts) {
        if (subNestAggregationContexts.isEmpty() && aggregation == null) {
            String format = "Aggregation %s is misconfigured: aggregation is null, yet it has no sub-aggregations to build from";
            throw new IllegalArgumentException(String.format(format, name));
        }
        this.subNestAggregationContexts = subNestAggregationContexts;
    }

    /**
     * Builds the aggregation for a context whose aggregation property is still null.
     */
    protected void buildAggregation() {
        if (aggregation != null || subNestAggregationContexts == null || subNestAggregationContexts.isEmpty()) {
            return;
        }
        // Turn the child contexts into a name -> Elasticsearch aggregation map
        Map<String, Aggregation> collect = subNestAggregationContexts.stream()
                .collect(Collectors.toMap(NestAggregationContext::getName, NestAggregationContext::getAggregation));
        // Dispatch on the variant type and attach the sub-aggregations: the half-finished
        // variant becomes a finished aggregation, which is stored in the aggregation property
        if (aggregationVariant instanceof ValueCountAggregation) {
            aggregation = Aggregation.of(a -> a.valueCount((ValueCountAggregation) aggregationVariant).aggregations(collect));
            return;
        }
        if (aggregationVariant instanceof Query) {
            aggregation = Aggregation.of(a -> a.filter((Query) aggregationVariant).aggregations(collect));
            return;
        }
        if (aggregationVariant instanceof TermsAggregation) {
            aggregation = Aggregation.of(a -> a.terms((TermsAggregation) aggregationVariant).aggregations(collect));
            return;
        }
        if (aggregationVariant instanceof NestedAggregation) {
            aggregation = Aggregation.of(a -> a.nested((NestedAggregation) aggregationVariant).aggregations(collect));
            return;
        }
        if (aggregationVariant instanceof DateHistogramAggregation) {
            aggregation = Aggregation.of(a -> a.dateHistogram((DateHistogramAggregation) aggregationVariant).aggregations(collect));
            return;
        }
        if (aggregationVariant instanceof AverageAggregation) {
            aggregation = Aggregation.of(a -> a.avg((AverageAggregation) aggregationVariant).aggregations(collect));
            return;
        }
        if (aggregationVariant instanceof SumAggregation) {
            aggregation = Aggregation.of(a -> a.sum((SumAggregation) aggregationVariant).aggregations(collect));
            return;
        }
        // todo: add further variant types as needed, or write a generic dispatch
        throw new UnsupportedOperationException("Aggregation " + name + " uses an unsupported variant type: " + aggregationVariant.getClass());
    }
}
java
import co.elastic.clients.elasticsearch._types.aggregations.Aggregation;
import co.elastic.clients.elasticsearch._types.aggregations.AggregationBuilders;
import com.google.common.graph.GraphBuilder;
import com.google.common.graph.MutableGraph;

import java.util.*;

/**
 * Utility for nesting aggregations with the ES8 Java API.
 */
public class Es8AggregateNestUtil {
    /**
     * Guava's graph library holds the tree of nested aggregations, i.e. the dependencies between them.
     */
    private MutableGraph<NestAggregationContext> mapMutableGraph = GraphBuilder.directed().allowsSelfLoops(false).build();

    /**
     * Adds one edge to the aggregation dependency graph. It is called a graph, but it is really a tree.
     *
     * @param parentNode parent aggregation node
     * @param subNode    child aggregation node
     */
    public void buildAggregateGraph(NestAggregationContext parentNode, NestAggregationContext subNode) {
        // Guava adds both nodes if they are missing and records the parent -> child edge
        mapMutableGraph.putEdge(parentNode, subNode);
    }

    /**
     * Returns the built aggregations as a map. Because the structure is a tree, the map has a single key.
     *
     * @return map from the root aggregation's name to the root aggregation
     */
    public Map<String, Aggregation> getAggregateMap() {
        NestAggregationContext nestAggregationContext = buildAggregate();
        Map<String, Aggregation> map = new HashMap<>();
        map.put(nestAggregationContext.getName(), nestAggregationContext.getAggregation());
        return map;
    }

    /**
     * Returns the built top-level aggregation.
     *
     * @return the root aggregation
     */
    public Aggregation getAggregate() {
        NestAggregationContext nestAggregationContext = buildAggregate();
        return nestAggregationContext.getAggregation();
    }

    /**
     * Builds the aggregations from the contexts stored in the graph.
     *
     * @return the root context, with its aggregation built
     */
    NestAggregationContext buildAggregate() {
        NestAggregationContext rootAgg = null;
        // Look for the root among all nodes in the graph
        for (NestAggregationContext nest : mapMutableGraph.nodes()) {
            // A node with in-degree 0 has no parent, so it is the root of the tree
            if (mapMutableGraph.inDegree(nest) == 0) {
                rootAgg = nest;
                break;
            }
        }
        if (rootAgg == null) {
            throw new IllegalStateException("No root aggregation found: the graph is empty or contains a cycle");
        }
        // Sorted map from tree level to the contexts on that level, in descending key order:
        // a parent can only be built after its children, so the deepest level goes first
        Map<Integer, List<NestAggregationContext>> nestAggregationContextOrderMap = new TreeMap<>(Collections.reverseOrder());
        // Collect the contexts level by level, starting at the root (level 1)
        buildSubNestAggregateContext(rootAgg, nestAggregationContextOrderMap, 1);
        // The map is in descending order, so iteration starts at the deepest level
        for (List<NestAggregationContext> level : nestAggregationContextOrderMap.values()) {
            for (NestAggregationContext nestAggregationContext : level) {
                // Let the context build its Elasticsearch aggregation
                nestAggregationContext.buildAggregation();
            }
        }
        return rootAgg;
    }

    /**
     * Registers a context on its level and attaches its direct children, recursively.
     *
     * @param parentNode parent aggregation context
     * @param map        level -> contexts on that level
     * @param index      level of parentNode in the tree
     */
    private void buildSubNestAggregateContext(NestAggregationContext parentNode, Map<Integer, List<NestAggregationContext>> map, int index) {
        map.computeIfAbsent(index, k -> new ArrayList<>()).add(parentNode);
        // Successors are the direct children of this node (one level down only)
        Set<NestAggregationContext> subNestAggregationContexts = mapMutableGraph.successors(parentNode);
        // Attach the direct children to the parent
        parentNode.setSubNestAggregationContexts(subNestAggregationContexts);
        // Recurse into each child
        for (NestAggregationContext aggregationMap : subNestAggregationContexts) {
            // Children sit one level below the parent; index is left unchanged so siblings share a level
            buildSubNestAggregateContext(aggregationMap, map, index + 1);
        }
    }

    public static void main(String[] args) {
        NestAggregationContext a1 = new NestAggregationContext.Builder("A1")
                .aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
        NestAggregationContext a2 = new NestAggregationContext.Builder("A2")
                .aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
        NestAggregationContext a3 = new NestAggregationContext.Builder("A3")
                .aggregationVariant(AggregationBuilders.filter().bool(b -> b.mustNot(q -> q.term(o -> o.field("mutual_rescue").value("Y")))).build()).build();
        NestAggregationContext a4 = new NestAggregationContext.Builder("A4")
                .aggregationVariant(AggregationBuilders.valueCount().build()).build();
        NestAggregationContext a5 = new NestAggregationContext.Builder("A5")
                .aggregationVariant(AggregationBuilders.nested().build()).build();
        Es8AggregateNestUtil aggregationNestUtil = new Es8AggregateNestUtil();
        aggregationNestUtil.buildAggregateGraph(a5, a4);
        aggregationNestUtil.buildAggregateGraph(a5, a3);
        aggregationNestUtil.buildAggregateGraph(a3, a2);
        aggregationNestUtil.buildAggregateGraph(a4, a1);
        // aggregationNestUtil.configSubAggregateGraph(a5, a3);
        // aggregationNestUtil.configSubAggregateGraph(a3, a2);
        // aggregationNestUtil.configSubAggregateGraph(a3, a1);
        // aggregationNestUtil.configSubAggregateGraph(a3, a4);
        Aggregation allAggregate = aggregationNestUtil.getAggregate();
        System.out.println(allAggregate);
    }
}
The code above contains the utility class and a test method. Note that it depends on the Guava library:
XML
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>24.0-jre</version>
</dependency>
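The approach works as follows:
When you need sub-aggregations, first work out which aggregations can be instantiated directly, and instantiate those: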
java
// Builder pattern: the name is mandatory and becomes the name of the aggregation in the request
NestAggregationContext a1 = new NestAggregationContext.Builder("A1").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
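If an aggregation cannot be instantiated directly, the reason must be that its sub-aggregations have not been instantiated yet, so obtain its half-finished variant first: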
java
NestAggregationContext a4 = new NestAggregationContext.Builder("A4").aggregationVariant(AggregationBuilders.valueCount().build()).build();
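Note the difference between the two snippets above: one calls aggregation() on the Builder, the other calls aggregationVariant(). Both end up returning a NestAggregationContext, a class we define ourselves, which takes care of instantiating all the aggregations for us (see its buildAggregation() method; I only implemented part of it, and it can be extended).
Because the finished aggregations and the half-finished variants are now wrapped in NestAggregationContext objects, the next step after creating them is to configure the relationships between those objects.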
java
NestAggregationContext a1 = new NestAggregationContext.Builder("A1").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
NestAggregationContext a2 = new NestAggregationContext.Builder("A2").aggregation(AggregationBuilders.terms().build()._toAggregation()).build();
NestAggregationContext a3 = new NestAggregationContext.Builder("A3").aggregationVariant(AggregationBuilders.filter().bool(b -> b.mustNot(q -> q.term(o -> o.field("mutual_rescue").value("Y")))).build()).build();
NestAggregationContext a4 = new NestAggregationContext.Builder("A4").aggregationVariant(AggregationBuilders.valueCount().build()).build();
NestAggregationContext a5 = new NestAggregationContext.Builder("A5").aggregationVariant(AggregationBuilders.nested().build()).build();
Es8AggregateNestUtil aggregationNestUtil = new Es8AggregateNestUtil();
aggregationNestUtil.buildAggregateGraph(a5, a4);
aggregationNestUtil.buildAggregateGraph(a5, a3);
aggregationNestUtil.buildAggregateGraph(a3, a2);
aggregationNestUtil.buildAggregateGraph(a4, a1);
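As the code shows, an Es8AggregateNestUtil is instantiated and its buildAggregateGraph() method is called. That method records the dependency between two aggregations: the first argument is the parent node, the second is a direct child of that parent.
Once the wiring is done: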
java
Aggregation allAggregate = aggregationNestUtil.getAggregate();
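call getAggregate() or getAggregateMap() on Es8AggregateNestUtil to obtain the final top-level aggregation.
In short, this approach wraps each aggregation (finished or half-finished), records the dependencies between them in Guava's graph library, then locates the leaf nodes and instantiates from the leaves up to the root before returning the result.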
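The leaf-to-root ordering can be tried out in isolation with a small, dependency-free sketch. It is an illustration under simplifying assumptions, not the utility class itself: plain strings stand in for NestAggregationContext, a pair of maps stands in for Guava's MutableGraph, and the class name `BottomUpBuildSketch` is made up. Nodes are grouped by tree level in a TreeMap with reversed key order, so every child is finished before its parent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BottomUpBuildSketch {
    // parent -> direct children, in insertion order
    private final Map<String, List<String>> children = new LinkedHashMap<>();
    // node -> number of incoming edges
    private final Map<String, Integer> inDegree = new LinkedHashMap<>();

    void putEdge(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        children.computeIfAbsent(child, k -> new ArrayList<>());
        inDegree.putIfAbsent(parent, 0);
        inDegree.merge(child, 1, Integer::sum);
    }

    /** The root is the first node without a parent (in-degree 0). */
    String root() {
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("no root: the graph is empty or cyclic");
    }

    private void collect(String node, int level, Map<Integer, List<String>> levels) {
        levels.computeIfAbsent(level, k -> new ArrayList<>()).add(node);
        for (String child : children.get(node)) {
            collect(child, level + 1, levels); // children sit one level below their parent
        }
    }

    /** Nodes ordered from the deepest level up to the root. */
    List<String> buildOrder() {
        Map<Integer, List<String>> levels = new TreeMap<>(Collections.reverseOrder());
        collect(root(), 1, levels);
        List<String> order = new ArrayList<>();
        for (List<String> level : levels.values()) {
            order.addAll(level);
        }
        return order;
    }

    /** "Builds" each node from its already built children and returns the root's result. */
    String build() {
        Map<String, String> built = new HashMap<>();
        for (String node : buildOrder()) {
            List<String> parts = new ArrayList<>();
            for (String child : children.get(node)) {
                parts.add(built.get(child)); // always present: children are built first
            }
            built.put(node, parts.isEmpty() ? node : node + "{" + String.join(",", parts) + "}");
        }
        return built.get(root());
    }

    public static void main(String[] args) {
        BottomUpBuildSketch sketch = new BottomUpBuildSketch();
        sketch.putEdge("A5", "A4");
        sketch.putEdge("A5", "A3");
        sketch.putEdge("A3", "A2");
        sketch.putEdge("A4", "A1");
        System.out.println(sketch.buildOrder());
        System.out.println(sketch.build());
    }
}
```

For the A1 to A5 wiring used in this article, the build order comes out as A1, A2, A4, A3, A5, and the assembled result is A5{A4{A1},A3{A2}}.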
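Notes:
1. The utility class handles the case of a single root node. If there are several roots, use it once per root. (It already covers about 80% of needs.)
2. Take care when configuring the dependencies: a leaf node must be a finished aggregation, never a half-finished variant, and the dependencies must not form a cycle.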
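The two notes can also be checked mechanically before anything is built. The following is a minimal sketch of such a check, written against plain collections rather than the classes above: `AggTreeCheckSketch`, its `check` method, and the `variantOnly` set (names of nodes that only carry a half-finished variant) are all hypothetical and would need adapting to NestAggregationContext.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AggTreeCheckSketch {
    /**
     * @param edges       parent name -> names of its direct children
     * @param variantOnly names of nodes that only have a half-finished variant
     * @return human-readable problems; empty when the wiring looks valid
     */
    static List<String> check(Map<String, List<String>> edges, Set<String> variantOnly) {
        List<String> problems = new ArrayList<>();
        Set<String> nodes = new LinkedHashSet<>(edges.keySet());
        Set<String> hasParent = new HashSet<>();
        for (List<String> kids : edges.values()) {
            nodes.addAll(kids);
            hasParent.addAll(kids);
        }

        // Note 1: exactly one root, i.e. exactly one node without a parent
        List<String> roots = new ArrayList<>();
        for (String node : nodes) {
            if (!hasParent.contains(node)) {
                roots.add(node);
            }
        }
        if (roots.size() != 1) {
            problems.add("expected exactly one root, found " + roots);
        }

        // Note 2a: a leaf must be a finished aggregation, not a variant
        for (String node : nodes) {
            boolean leaf = edges.getOrDefault(node, Collections.<String>emptyList()).isEmpty();
            if (leaf && variantOnly.contains(node)) {
                problems.add("leaf " + node + " is only a variant and cannot be built");
            }
        }

        // Note 2b: the dependencies must not form a cycle
        Set<String> done = new HashSet<>();
        for (String node : nodes) {
            if (hasCycle(node, edges, new HashSet<String>(), done)) {
                problems.add("cycle reachable from " + node);
                break;
            }
        }
        return problems;
    }

    // Depth-first search; onPath holds the nodes on the current path from the start node
    private static boolean hasCycle(String node, Map<String, List<String>> edges,
                                    Set<String> onPath, Set<String> done) {
        if (onPath.contains(node)) {
            return true;
        }
        if (done.contains(node)) {
            return false;
        }
        onPath.add(node);
        for (String child : edges.getOrDefault(node, Collections.<String>emptyList())) {
            if (hasCycle(child, edges, onPath, done)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }
}
```

Calling `check` with the A1 to A5 wiring and `variantOnly` set to A3, A4 and A5 returns an empty list. Marking A1 as variant-only, or adding an edge back from a child to its ancestor, produces a problem entry instead of a failure deep inside the build.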