ElasticSearch:文章检索

实现目标

思路与ES前期准备

使用postman添加映射put请求 :

搜索结果展示内容:标题、布局、枫叶图片、发布时间、作者名称、文章id、作者id、静态url 需要对:内容、标题进行分词

json "content":{ "type":"text", "ananlyze":"ik_smart" }

http://{url}:{port}/appinfoarticle

校验尝试:

GET请求查询映射:http://{url}:{port}/appinfoarticle

DELETE请求,删除索引及映射:http://{url}:{port}/appinfoarticle

GET请求,查询所有文档:http://{url}:{port}/appinfoarticle/_search

查询所有文章,批量导入到es库中

```java BulkRequest bulkRequest = new BulkRequest("appinfoarticle");

for (SearchArticleVo searchArticleVo : searchArticleVos) {

IndexRequest indexRequest = new IndexRequest().id(searchArticleVo.getId().toString())
        .source(JSON.toJSONString(searchArticleVo), XContentType.JSON);

//批量添加数据
bulkRequest.add(indexRequest);

} restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);

``` 其中: bulkRequest用法

```java RestHighLevelClient restHighLevelClient = new RestHighLevelClient();//创建一个es客户端,用于执行请求

BulkRequest bulkRequest = new BulkRequest(); //创建一个批量请求

IndexRequest indexRequest1 = new IndexRequest("indexname"); //创建第一个"添加文档"的请求 indexRequest1.id("1") .source(XContentType.JSON,"field name", "foo1");//设置添加的文档信息 IndexRequest indexRequest2 = new IndexRequest("indexname"); //创建第二个"添加文档"的请求 indexRequest2.id("1") .source(XContentType.JSON,"fieldname", "foo2");//设置添加的文档信息

bulkRequest.add(indexRequest1); //将第一个"添加文档"请求加入到批量请求中 bulkRequest.add(indexRequest2); //将第二个"添加文档"请求加入到批量请求中

BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);//执行批量请求,返回执行结果

```

查询时各种条件的构造的正确姿势

由于es在java中查询没法像mybatis那样方便,而且es的构造器使用也比较繁琐,理解不是很方便,所以下面来记录es构造器BoolQueryBuilder查询时各种条件的构造的正确姿势。

1.构造准备

java //1.构建SearchRequest请求对象,指定索引库 SearchRequest searchRequest = new SearchRequest("data_info"); //2.构建SearchSourceBuilder查询对象 SearchSourceBuilder sourceBuilder = new SearchSourceBuilder(); //2.1 这个条件用于返回所有命中条件的数据数量, 不设置则返回大概数值 sourceBuilder.trackTotalHits(true); //3.检索条件构造 BoolQueryBuilder bqb = QueryBuilders.boolQuery();

2.条件构造:must可以使用filter代替,效率更高,因为must会对_score评估

```java //3.1 完全匹配 bqb.must(QueryBuilders.matchQuery("code", 666L);

//3.2 模糊匹配 bqb.must(QueryBuilders.matchPhraseQuery("name", "张");

//3.3 in的效果 传单个参数就是完全匹配 bqb.must(QueryBuilders.termsQuery("code", new Long[]{1L, 2L, 3L}); bqb.must(QueryBuilders.termsQuery("code", 1L, 2L, 3L);

//3.4 or条件 BoolQueryBuilder shouldQuery = QueryBuilders.boolQuery(); shouldQuery.should(QueryBuilders.matchQuery("code", 1L); shouldQuery.should(QueryBuilders.matchQuery("code", 2L); shouldQuery.minimumShouldMatch(1); //至少满足一个 bqb.must(shouldQuery);

//3.5 非null bqb.must(QueryBuilders.existsQuery("iden")); //是null

bqb.mustNot(QueryBuilders.existsQuery("iden"));

//3.6 大于等于gte (gt-大于 lt-小于 lte-小于等于) bqb.must(QueryBuilders.rangeQuery("time").gte(new Date());

//3.7 中文完全匹配 bqb.must(queryBuilder.matchPhraseQuery("key", value));

//3.8 匹配多个字段 bqb.must(queryBuilder.multiMatchQuery(value, key1, key2, key3)); ```

3.构造完成 准备查询

```java //4.将QuseryBuilder对象设置到SearchSourceBuilder对象中 sourceBuilder.query(bqb); //5.排序 sourceBuilder.sort("updateTime", SortOrder.DESC); //6.分页 sourceBuilder.from((dto.getPageNum() - 1) * dto.getPageSize()); sourceBuilder.size(dto.getPageSize());

//设置高亮 title HighlightBuilder highlightBuilder = new HighlightBuilder(); highlightBuilder.field("title"); highlightBuilder.preTags(""); highlightBuilder.postTags("");

searchBuilder.highlighter(highlightBuilder);

//7.将SearchSourceBuilder设置到SearchRequest中 searchRequest.source(sourceBuilder);

//8.调用方法查询数据 SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

//9.解析返回结果 SearchHit[] hits = searchResponse.getHits().getHits(); for (int i = 0; i < hits.length; i++) { System.out.println("返回的结果: " + hits[i].getSourceAsString()); }

log.info("返回总数为:" + searchResponse.getHits().getTotalHits()); int total = (int)searchResponse.getHits().getTotalHits().value; ```

检索业务实现

  1. 配置config ```java @Getter @Setter @Configuration @ConfigurationProperties(prefix = "elasticsearch") public class ElasticSearchConfig { private String host; private int port;

    @Bean public RestHighLevelClient client(){ System.out.println(host); System.out.println(port); return new RestHighLevelClient(RestClient.builder( new HttpHost( host, port, "http" ) )); } } ```

  2. 构造一个UserSearchDto其中包含字段 搜索关键字searchWords、当前页pageNum、分页条数pageSize、最小时间minBehotTime。其中提供一个分页算法

java public int getFromIndex(){ if(this.pageNum<1)return 0; if(this.pageSize<1) this.pageSize = 10; return this.pageSize * (pageNum-1); } 2. 进行编写es代码

```java // 传入参数为UserSearchDto // 1 设置查询条件 SearchRequest searchRequest = new SearchRequest("key_name"); SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// 2 设置查询条件 BoolQueryBuilder boolQueryBuilder = QueryBuilders.query();

// 关键字的分词后查询 QueryStringQueryBuilder queryStringQueryBuilder = QueryBuilders.queryStringQuery(dto.getSearchWords()).field("title").field("content").defaultOperatior(Operator.OR); boolQueryBuilder.must(queryStringQueryBuilder);

// 查询小于minBehotTim的数据 RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("publishTime").lt(dto.getMinHotTime().getTime()); boolQueryBuilder.filter(rangeQueryBuilder);

// 分页查询 searchSourceBuilder.from(0); searchSourceBuilder.size(dto.getPageSize());

// 按照发布时间倒序查询 searchSourceBuilder.sort("publishTime", SortOrder.DESC);

// 设置高亮 title HighlightBuilder highlightBuilder = new HighlightBuilder(); highlightBuilder.field("title"); highlightBuilder.preTags(""); highlightBuilder.postTags("") searchSourceSearch.highlighter(highlightBuilder);

// 执行查询 searchSourceBuilder.query(boolQueryBuilder); searchRequest.source(searchSourceBuilder); SearchResponse searchResponse = restHighLevelClient.search(searchRequet, RequestOptions.DEFAULT); ```

文章审核后构建索引

收集数据并发送消息

```java //发送消息,创建索引 createArticleESIndex(apArticle,content,path);

@Autowired private KafkaTemplate kafkaTemplate;

/** * 送消息,创建索引 * @param apArticle * @param content * @param path */ private void createArticleESIndex(ApArticle apArticle, String content, String path) { SearchArticleVo vo = new SearchArticleVo(); BeanUtils.copyProperties(apArticle,vo); vo.setContent(content); vo.setStaticUrl(path);

kafkaTemplate.send(ArticleConstants.ARTICLE_ES_SYNC_TOPIC, JSON.toJSONString(vo));

}

public static final String ARTICLEESSYNC_TOPIC = "article.es.sync.topic"; ```

搜索的微服务 接收消息并创建场景

  1. 搜索微服务中添加kafka的配置,nacos配置

yaml spring: kafka: bootstrap-servers: ${url}:${port} consumer: group-id: ${spring.application.name} key-deserializer: org.apache.kafka.common.serialization.StringDeserializer value-deserializer: org.apache.kafka.common.serialization.StringDeserializer 2. 定义监听接收消息,保存索引数据,class SyncArticleListener

```java @AutoWired private RestHighLevelClient restHighLevelClient;

@KafkaListener(topics = ArticleConstants.ARTICLEESSYNC_TOPIC) public void onMessage(String message){ if(StringUtils.isNotBlank(message)){ log.info("SyncArticleListener message = {}", message); SearchArticleVo searchArticleVo = JSON.parseObject } } ```

相关推荐
孤水寒月2 小时前
Git忽略文件.gitignore
git·elasticsearch
田猿笔记2 小时前
Typesense:开源的高速搜索引擎
搜索引擎
LKAI.8 小时前
搭建Elastic search群集
linux·运维·elasticsearch·搜索引擎
WTT001110 小时前
2024楚慧杯WP
大数据·运维·网络·安全·web安全·ctf
云云32115 小时前
怎么通过亚矩阵云手机实现营销?
大数据·服务器·安全·智能手机·矩阵
新加坡内哥谈技术15 小时前
苏黎世联邦理工学院与加州大学伯克利分校推出MaxInfoRL:平衡内在与外在探索的全新强化学习框架
大数据·人工智能·语言模型
Data-Miner15 小时前
经典案例PPT | 大型水果连锁集团新零售数字化建设方案
大数据·big data
lovelin+v1750304096616 小时前
安全性升级:API接口在零信任架构下的安全防护策略
大数据·数据库·人工智能·爬虫·数据分析
道一云黑板报16 小时前
Flink集群批作业实践:七析BI批作业执行
大数据·分布式·数据分析·flink·kubernetes