Springboot集成ElasticSearch实现minio文件内容全文检索

一、docker安装Elasticsearch

(1)springboot和Elasticsearch的版本对应关系如下,请看版本对应

注意安装对应版本,否则可能会出现一些未知的错误。

(2)拉取镜像

java 复制代码
docker pull elasticsearch:7.17.6

(3)运行容器

python 复制代码
docker run  -it  -d  --name elasticsearch -e "discovery.type=single-node"  -e "ES_JAVA_OPTS=-Xms512m -Xmx1024m"   -p 9200:9200 -p 9300:9300  elasticsearch:7.17.6

访问http://localhost:9200/,出现如下内容表示安装成功。

(4)安装中文分词器

进入容器:

python 复制代码
docker exec -it elasticsearch bash

然后进入bin目录执行下载安装ik分词器命令:

python 复制代码
elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.6/elasticsearch-analysis-ik-7.17.6.zip

退出bash并重启容器:

python 复制代码
docker restart elasticsearch

二、安装kibana

Kibana 是为 Elasticsearch设计的开源分析和可视化平台。你可以使用 Kibana 来搜索,查看存储在 Elasticsearch 索引中的数据并与之交互。你可以很容易实现高级的数据分析和可视化,以图表的形式展现出来。

(1)拉取镜像

python 复制代码
docker pull kibana:7.17.6

(2)运行容器

python 复制代码
docker run --name kibana -p 5601:5601 --link elasticsearch:es  -e "elasticsearch.hosts=http://es:9200"  -d kibana:7.17.6

--link elasticsearch:es表示容器互联,即容器kibana连接到elasticsearch。

(3)使用kibana dev_tools发送http请求操作Elasticsearch

三、后端代码

(1)引入maven依赖

javascript 复制代码
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>

(2)application.yml配置

javascript 复制代码
spring:
  elasticsearch:
    uris: http://localhost:9200

(3)实体类

java 复制代码
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.util.Date;


/**
 * @author yangfeng
 */
@Data
@NoArgsConstructor
@AllArgsConstructor
@Document(indexName = "file")
public class File {

    @Id
    private String id;

    /**
     * 文件名称
     */
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String fileName;

    /**
     * 文件分类
     */
    @Field(type = FieldType.Keyword)
    private String fileCategory;

    /**
     * 文件内容
     */
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String fileContent;

    /**
     * 文件存储路径
     */
    @Field(type = FieldType.Keyword, index = false)
    private String filePath;

    /**
     * 文件大小
     */
    @Field(type = FieldType.Keyword, index = false)
    private Long fileSize;

    /**
     * 文件类型
     */
    @Field(type = FieldType.Keyword, index = false)
    private String fileType;

    /**
     * 创建人
     */
    @Field(type = FieldType.Keyword, index = false)
    private String createBy;

    /**
     * 创建日期
     */
    @Field(type = FieldType.Keyword, index = false)
    private Date createTime;

    /**
     * 更新人
     */
    @Field(type = FieldType.Keyword, index = false)
    private String updateBy;

    /**
     * 更新日期
     */
    @Field(type = FieldType.Keyword, index = false)
    private Date updateTime;

}

(4)repository接口,继承ElasticsearchRepository

java 复制代码
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.annotations.Highlight;
import org.springframework.data.elasticsearch.annotations.HighlightField;
import org.springframework.data.elasticsearch.annotations.HighlightParameters;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;

import java.util.List;

/**
 * @author yangfeng
 * @date: 2024年11月9日 15:29
 */
@Repository
public interface FileRepository extends ElasticsearchRepository<File, String> {

    /**
     * 关键字查询
     *
     * @return
     */
    @Highlight(fields = {@HighlightField(name = "fileName"), @HighlightField(name = "fileContent")},
            parameters = @HighlightParameters(preTags = {"<span style='color:red'>"}, postTags = {"</span>"}, numberOfFragments = 0))
    List<SearchHit<File>> findByFileNameOrFileContent(String fileName, String fileContent, Pageable pageable);
}

(5)service接口

java 复制代码
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;

import java.util.List;

/**
 * description: ES文件服务
 *
 * @author yangfeng
 * @version V1.0
 * @date 2023-02-21
 */
public interface IFileService {

    /**
     * 保存文件
     */
    void saveFile(String filePath, String fileCategory) throws Exception;

    /**
     * 关键字查询
     *
     * @return
     */
    List<SearchHit<File>> search(FileDTO dto);

    /**
     * 关键字查询
     *
     * @return
     */
    SearchHits<File> searchPage(FileDTO dto);
}

(6)service实现类

java 复制代码
import cn.hutool.core.util.IdUtil;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.apache.shiro.SecurityUtils;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.jeecg.common.exception.JeecgBootException;
import org.jeecg.common.system.vo.LoginUser;
import org.jeecg.common.util.CommonUtils;
import org.jeecg.common.util.MinioUtil;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.domain.Sort;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.NativeSearchQuery;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.stereotype.Service;

import java.io.InputStream;
import java.util.Date;
import java.util.List;
import java.util.Objects;

/**
 * description: ES文件服务
 *
 * @author yangfeng
 * @version V1.0
 * @date 2023-02-21
 */
@Slf4j
@Service
public class FileServiceImpl implements IFileService {

    @Autowired
    private FileRepository fileRepository;

    @Autowired
    private ElasticsearchRestTemplate elasticsearchRestTemplate;

    /**
     * 保存文件
     */
    @Override
    public void saveFile(String filePath, String fileCategory) throws Exception {
        if (Objects.isNull(filePath)) {
            throw new JeecgBootException("文件不存在");
        }

        LoginUser user = (LoginUser) SecurityUtils.getSubject().getPrincipal();

        String fileName = CommonUtils.getFileNameByUrl(filePath);
        String fileType = StringUtils.isNotBlank(fileName) ? fileName.substring(fileName.lastIndexOf(".") + 1) : null;

        InputStream inputStream = MinioUtil.getMinioFile(filePath);

        // 读取文件内容,上传到es,方便后续的检索
        String fileContent = FileUtils.readFileContent(inputStream, fileType);
        File file = new File();
        file.setId(IdUtil.getSnowflake(1, 1).nextIdStr());
        file.setFileContent(fileContent);
        file.setFileName(fileName);
        file.setFilePath(filePath);
        file.setFileType(fileType);
        file.setFileCategory(fileCategory);
        file.setCreateBy(user.getUsername());
        file.setCreateTime(new Date());
        fileRepository.save(file);
    }


    /**
     * 关键字查询
     *
     * @return
     */
    @Override
    public List<SearchHit<File>> search(FileDTO dto) {
        Pageable pageable = PageRequest.of(dto.getPageNo() - 1, dto.getPageSize(), Sort.Direction.DESC, "createTime");
        return fileRepository.findByFileNameOrFileContent(dto.getKeyword(), dto.getKeyword(), pageable);
    }


    @Override
    public SearchHits<File> searchPage(FileDTO dto) {
        NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder();
        queryBuilder.withQuery(QueryBuilders.multiMatchQuery(dto.getKeyword(), "fileName", "fileContent"));
        // 设置高亮
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        String[] fieldNames = {"fileName", "fileContent"};
        for (String fieldName : fieldNames) {
            highlightBuilder.field(fieldName);
        }
        highlightBuilder.preTags("<span style='color:red'>");
        highlightBuilder.postTags("</span>");
        highlightBuilder.order();
        queryBuilder.withHighlightBuilder(highlightBuilder);

        // 也可以添加分页和排序
        queryBuilder.withSorts(SortBuilders.fieldSort("createTime").order(SortOrder.DESC))
                .withPageable(PageRequest.of(dto.getPageNo() - 1, dto.getPageSize()));

        NativeSearchQuery nativeSearchQuery = queryBuilder.build();

        return elasticsearchRestTemplate.search(nativeSearchQuery, File.class);
    }

}

(7)controller

java 复制代码
import lombok.extern.slf4j.Slf4j;
import org.jeecg.common.api.vo.Result;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * 文件es操作
 *
 * @author yangfeng
 * @since 2024-11-09
 */
@Slf4j
@RestController
@RequestMapping("/elasticsearch/file")
public class FileController {

    @Autowired
    private IFileService fileService;


    /**
     * 保存文件
     *
     * @return
     */
    @PostMapping(value = "/saveFile")
    public Result<?> saveFile(@RequestBody File file) throws Exception {
        fileService.saveFile(file.getFilePath(), file.getFileCategory());
        return Result.OK();
    }


    /**
     * 关键字查询-repository
     *
     * @throws Exception
     */
    @PostMapping(value = "/search")
    public Result<?> search(@RequestBody FileDTO dto) {
        return Result.OK(fileService.search(dto));
    }

    /**
     * 关键字查询-原生方法
     *
     * @throws Exception
     */
    @PostMapping(value = "/searchPage")
    public Result<?> searchPage(@RequestBody FileDTO dto) {
        return Result.OK(fileService.searchPage(dto));
    }

}

(8)工具类

java 复制代码
import lombok.extern.slf4j.Slf4j;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

@Slf4j
public class FileUtils {

    private static final List<String> FILE_TYPE;

    static {
        FILE_TYPE = Arrays.asList("pdf", "doc", "docx", "text");
    }

    public static String readFileContent(InputStream inputStream, String fileType) throws Exception{
        if (!FILE_TYPE.contains(fileType)) {
            return null;
        }
        // 使用PdfBox读取pdf文件内容
        if ("pdf".equalsIgnoreCase(fileType)) {
            return readPdfContent(inputStream);
        } else if ("doc".equalsIgnoreCase(fileType) || "docx".equalsIgnoreCase(fileType)) {
            return readDocOrDocxContent(inputStream);
        } else if ("text".equalsIgnoreCase(fileType)) {
            return readTextContent(inputStream);
        }

        return null;
    }


    private static String readPdfContent(InputStream inputStream) throws Exception {
        // 加载PDF文档
        PDDocument pdDocument = PDDocument.load(inputStream);

        // 创建PDFTextStripper对象, 提取文本
        PDFTextStripper textStripper = new PDFTextStripper();

        // 提取文本
        String content = textStripper.getText(pdDocument);
        // 关闭PDF文档
        pdDocument.close();
        return content;
    }


    private static String readDocOrDocxContent(InputStream inputStream) {
        try {
            // 加载DOC文档
            XWPFDocument document = new XWPFDocument(inputStream);

            // 2. 提取文本内容
            XWPFWordExtractor extractor = new XWPFWordExtractor(document);
            return extractor.getText();
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }


    private static String readTextContent(InputStream inputStream) {
        StringBuilder content = new StringBuilder();
        try (InputStreamReader isr = new InputStreamReader(inputStream, StandardCharsets.UTF_8)) {
            int ch;
            while ((ch = isr.read()) != -1) {
                content.append((char) ch);
            }
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
        return content.toString();
    }

}

(9)dto

java 复制代码
import lombok.Data;

@Data
public class FileDTO {

    private String keyword;

    private Integer pageNo;

    private Integer pageSize;

}

四、前端代码

(1)查询组件封装

html 复制代码
<template>
    <a-input-search
        v-model:value="pageInfo.keyword"
        placeholder="全文检索"
        @search="handleSearch"
        style="width: 220px;margin-left:30px"
    />
    <a-modal v-model:visible="showSearch" title="全文检索" width="900px" :footer="null"
             destroy-on-close>
      <SearchContent :items="searchItems" :loading="loading"/>
      <div style="padding: 10px;display: flex;justify-content: flex-end">
        <Pagination v-if="pageInfo.total" :pageSize="pageInfo.pageSize" :pageNo="pageInfo.pageNo"
                    :total="pageInfo.total" @pageChange="changePage" :show-total="total => `共 ${total} 条`"/>
      </div>
    </a-modal>
</template>

<script lang="ts" setup>
import {ref} from 'vue'
import {Pagination} from "ant-design-vue";
import SearchContent from "@/components/ElasticSearch/SearchContent.vue"
import {searchPage} from "@/api/sys/elasticsearch"

const loading = ref<boolean>(false)
const showSearch = ref<any>(false)
const searchItems = ref<any>();

const pageInfo = ref<{
  pageNo: number;
  pageSize: number;
  keyword: string;
  total: number;
}>({
  // 当前页码
  pageNo: 1,
  // 当前每页显示多少条数据
  pageSize: 10,
  keyword: '',
  total: 0,
});

async function handleSearch() {
  if (!pageInfo.value.keyword) {
    return;
  }
  pageInfo.value.pageNo = 1
  showSearch.value = true
  await getSearchItems();
}

function changePage(pageNo) {
  pageInfo.value.pageNo = pageNo
  getSearchItems();
}

async function getSearchItems() {
  loading.value = true
  try {
    const res: any = await searchPage(pageInfo.value);
    searchItems.value = res?.searchHits;
    debugger
    pageInfo.value.total = res?.totalHits
  } finally {
    loading.value = false
  }
}
</script>

<style scoped></style>

(2)接口elasticsearch.ts

javascript 复制代码
import {defHttp} from '/@/utils/http/axios';

enum Api {
    saveFile = '/elasticsearch/file/saveFile',
    searchPage = '/elasticsearch/file/searchPage',
}


/**
 * 保存文件到es
 * @param params
 */
export const saveFile = (params) => defHttp.post({
    url: Api.saveFile,
    params
});

/**
 * 关键字查询-原生方法
 * @param params
 */
export const searchPage = (params) => defHttp.post({
    url: Api.searchPage,
    params
},);

(3)搜索内容组件SearchContent.vue

javascript 复制代码
<template>
  <a-spin :spinning="loading">
    <div class="searchContent">
      <div v-for="(item,index) in items" :key="index" v-if="!!items.length > 0">
        <a-card class="contentCard">
          <template #title>
            <a @click="detailSearch(item.content)">
              <div class="flex" style="align-items: center">
                <div>
                  <img src="../../assets/images/pdf.png" v-if="item?.content?.fileType=='pdf'" style="width: 20px"/>
                  <img src="../../assets/images/word.png" v-if="item?.content?.fileType=='word'" style="width: 20px"/>
                  <img src="../../assets/images/excel.png" v-if="item?.content?.fileType=='excel'" style="width: 20px"/>
                </div>
                <div style="margin-left:10px">
                  <article class="article" v-html="item.highlightFields.fileName"
                           v-if="item?.highlightFields?.fileName"></article>
                  <span v-else>{{ item?.content?.fileName }}</span>
                </div>
              </div>
            </a>
          </template>
          <div class="item">
            <article class="article" v-html="item.highlightFields.fileContent"
                     v-if="item?.highlightFields?.fileContent"></article>
            <span v-else>{{
                item?.content?.fileContent?.length > 150 ? item.content.fileContent.substring(0, 150) + '......' : item.content.fileContent
              }}</span>
          </div>
        </a-card>
      </div>
      <EmptyData v-else/>
    </div>
  </a-spin>
</template>
<script lang="ts" setup>
import {useGlobSetting} from "@/hooks/setting";
import EmptyData from "/@/components/ElasticSearch/EmptyData.vue";
import {ref} from "vue";

const glob = useGlobSetting();

const props = defineProps({
  loading: {
    type: Boolean,
    default: false
  },
  items: {
    type: Array,
    default: []
  },
})

function detailSearch(searchItem) {
  const url = ref(`${glob.domainUrl}/sys/common/pdf/preview/`);
  window.open(url.value + searchItem.filePath + '#scrollbars=0&toolbar=0&statusbar=0', '_blank');
}

</script>
<style lang="less" scoped>
.searchContent {
  min-height: 500px;
  overflow-y: auto;
}

.contentCard {
  margin: 10px 20px;
}

a {
  color: black;
}

a:hover {
  color: #3370ff;
}

:deep(.ant-card-body) {
  padding: 13px;
}
</style>

五、效果展示

相关推荐
hashiqimiya19 分钟前
springboot事务触发滚动与不滚蛋
java·spring boot·后端
因我你好久不见22 分钟前
Windows部署springboot jar支持开机自启动
windows·spring boot·jar
无关86881 小时前
SpringBootApplication注解大解密
spring boot
追梦者1233 小时前
springboot整合minio
java·spring boot·后端
帅气的你3 小时前
Spring Boot 集成 AOP 实现日志记录与接口权限校验
java·spring boot
计算机毕设VX:Fegn08954 小时前
计算机毕业设计|基于springboot + vue在线音乐播放系统(源码+数据库+文档)
数据库·vue.js·spring boot·后端·课程设计
计算机毕设VX:Fegn08954 小时前
计算机毕业设计|基于springboot + vue博物馆展览与服务一体化系统(源码+数据库+文档)
数据库·vue.js·spring boot·后端·课程设计
帅气的你4 小时前
Spring Boot 1.x 接口性能优化:从 3 秒到 200 毫秒的实战调优之路
java·spring boot
yangminlei5 小时前
Spring Boot/Spring MVC核心注解深度解析
spring boot
goodlook01235 小时前
监控平台搭建-日志-springboot直接推送loki篇(九)
java·spring boot·后端·grafana