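Integrating Middleware with Spring Boot: Elasticsearch, from Basics to Advanced

Introduction

Elasticsearch is an open-source distributed search and analytics engine, widely used to build real-time search and analytics applications. In this post we walk through how to integrate Elasticsearch with Spring Boot to index, search, and analyze data.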

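I. Basic Elasticsearch Operations and Configuration

1. Preparation

Before starting, make sure the following are in place:

An Elasticsearch cluster is installed and running
The Elasticsearch index and mapping have been created

2. Add the dependency

First, add the Elasticsearch dependency to the Spring Boot project. In pom.xml, add: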

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

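3. Configure the Elasticsearch connection

Configure the connection in application.properties or application.yml. The property below applies to Spring Boot 2.6 and later, which talk to Elasticsearch over the REST client; earlier 2.x releases use spring.elasticsearch.rest.uris instead. The older spring.data.elasticsearch.cluster-nodes property belonged to the transport client (port 9300) and is no longer supported:

spring.elasticsearch.uris=http://localhost:9200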

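4. Create the entity class

Create a Java entity class that maps to documents in Elasticsearch.

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;

// Mapping types were removed in Elasticsearch 7, so only the index name is declared
@Document(indexName = "blog")
public class Article {

    @Id
    private String id;

    private String title;

    private String content;

    // Getters and setters
}

In the code above, the @Document annotation defines the name of the Elasticsearch index that stores Article documents. The type attribute used by older versions is omitted because mapping types no longer exist in Elasticsearch 7 and later.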

5. Create the Repository interface

Use the ElasticsearchRepository interface provided by Spring Data Elasticsearch to define operations against Elasticsearch.

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

import java.util.List;

public interface ArticleRepository extends ElasticsearchRepository<Article, String> {

    List<Article> findByTitle(String title);

    List<Article> findByContent(String content);
}

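By extending ElasticsearchRepository, we can use the CRUD methods that Spring Data provides directly, and the findByTitle and findByContent queries are derived from the method names.

6. Write the Service

Create a Service class that encapsulates the business logic and calls the Repository for data access.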

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.List;

@Service
public class ArticleService {

    private final ArticleRepository articleRepository;

    @Autowired
    public ArticleService(ArticleRepository articleRepository) {
        this.articleRepository = articleRepository;
    }

    public List<Article> searchByTitle(String title) {
        return articleRepository.findByTitle(title);
    }

    public List<Article> searchByContent(String content) {
        return articleRepository.findByContent(content);
    }
}

7. Usage example

Use the Service in the Controller layer to expose the searches over HTTP.

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;

import java.util.List;

@RestController
@RequestMapping("/articles")
public class ArticleController {

    private final ArticleService articleService;

    @Autowired
    public ArticleController(ArticleService articleService) {
        this.articleService = articleService;
    }

    @GetMapping("/searchByTitle")
    public List<Article> searchByTitle(@RequestParam String title) {
        return articleService.searchByTitle(title);
    }

    @GetMapping("/searchByContent")
    public List<Article> searchByContent(@RequestParam String content) {
        return articleService.searchByContent(content);
    }
}

8. Run and test

With the application running, call the endpoints exposed by the Controller to search the indexed articles:

curl -X GET "http://localhost:8080/articles/searchByTitle?title=Elasticsearch"
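When the title contains spaces or other special characters, the query parameter has to be URL-encoded before the request is sent. The following is a minimal sketch of building that request URI from Java; the class name ArticleSearchUri and the base URL on port 8080 are assumptions for illustration, not part of the application above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ArticleSearchUri {

    // Assumed base URL of the locally running Spring Boot application
    private static final String BASE = "http://localhost:8080/articles";

    // Builds the URI for the searchByTitle endpoint, URL-encoding the title parameter
    static URI searchByTitle(String title) {
        String encoded = URLEncoder.encode(title, StandardCharsets.UTF_8);
        return URI.create(BASE + "/searchByTitle?title=" + encoded);
    }

    public static void main(String[] args) {
        // A space in the title is encoded as '+'
        System.out.println(searchByTitle("Spring Boot"));
    }
}
```

The resulting URI can be passed to any HTTP client, or pasted into the curl command in quotes.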

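II. Mapping Entity Classes to Elasticsearch Indices

Unlike a traditional relational database, Elasticsearch is document-oriented: data is stored as documents. Instead of creating tables we create indices; each index holds many documents, and each document holds many fields.

The basic steps for creating an index and mapping an entity class to it are as follows:

1. Create the index

In Elasticsearch, an index is where related documents are stored. An index can be created through the REST API or, in a Spring Boot project, through the Elasticsearch Java client. Here is the REST API request:

PUT /my_index

This request creates an index named my_index. In a Spring Boot project, the same can be done through the IndexOperations interface, obtained from ElasticsearchRestTemplate.indexOps: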

@Autowired
private ElasticsearchRestTemplate elasticsearchRestTemplate;

public void createIndex() {
    elasticsearchRestTemplate.indexOps(MyEntity.class).create();
}

2. Define the entity class

The entity class maps the structure of documents in Elasticsearch; each instance corresponds to one document. Annotations on the class define how its fields are mapped. A simple example:

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = "my_index")
public class MyEntity {

    @Id
    private String id;

    @Field(type = FieldType.Text)
    private String name;

    @Field(type = FieldType.Keyword)
    private String category;

    // Other fields and methods
}

In this example, @Document sets the index name to my_index (no type name is given, since mapping types were removed in Elasticsearch 7). The @Field annotations define the field mappings: name is mapped as Text, which is analyzed for full-text search, and category as Keyword, which is stored as an exact value.

3. Save documents

Saving a document means storing an instance of the entity class in Elasticsearch. In a Spring Boot project this can be done with ElasticsearchRestTemplate or with an ElasticsearchRepository. Here is the ElasticsearchRepository approach:

public interface MyEntityRepository extends ElasticsearchRepository<MyEntity, String> {
}

Here MyEntityRepository extends ElasticsearchRepository, with the entity type and the ID type as generic parameters. Spring Data Elasticsearch supplies the CRUD methods for the entity, so calling save stores an instance in Elasticsearch.

@Autowired
private MyEntityRepository myEntityRepository;

public void saveDocument() {
    MyEntity entity = new MyEntity();
    entity.setName("Document Name");
    entity.setCategory("Document Category");

    myEntityRepository.save(entity);
}

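In this code we create a MyEntity instance and store it in Elasticsearch with save.

Advanced Part

I. A Practical Case: Going Deeper with Spring Boot and Elasticsearch

In this case study we build a book search engine and use it to cover performance tuning, complex queries, pagination, and aggregations.

1. Preparation

First, make sure an Elasticsearch cluster is running and that the Elasticsearch dependency has been added to the Spring Boot project.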

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

2. Configuration file

Configure the Elasticsearch connection in application.properties or application.yml (spring.elasticsearch.uris applies to Spring Boot 2.6 and later; earlier 2.x releases use spring.elasticsearch.rest.uris):

spring.elasticsearch.uris=http://localhost:9200

3. Entity class

Create a Book entity class that maps to documents in Elasticsearch.

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;

@Document(indexName = "books")
public class Book {

    @Id
    private String id;

    private String title;

    private String author;

    private String genre;

    // Getters and setters
}

4. Repository interface

Create a Repository interface that extends ElasticsearchRepository to operate on book documents.

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

import java.util.List;

public interface BookRepository extends ElasticsearchRepository<Book, String> {

    List<Book> findByTitleLike(String title);

    List<Book> findByAuthorAndGenre(String author, String genre);

    // More custom query methods
}

5. Service class

Create a service class that handles the business logic and calls the Repository to work with book documents.

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.List;

@Service
public class BookService {

    private final BookRepository bookRepository;

    @Autowired
    public BookService(BookRepository bookRepository) {
        this.bookRepository = bookRepository;
    }

    public List<Book> searchBooksByTitle(String title) {
        return bookRepository.findByTitleLike(title);
    }

    public List<Book> searchBooksByAuthorAndGenre(String author, String genre) {
        return bookRepository.findByAuthorAndGenre(author, genre);
    }

    // More business logic and custom query methods
}

6. Performance tuning

6.1 Client configuration

The spring.data.elasticsearch.properties.* keys passed settings to the legacy transport client and have no effect on the REST client. Values such as http.max_content_length are node-level settings that belong in the server's elasticsearch.yml. On the client side, Spring Boot 2.6 and later expose the connection and socket timeouts in application.properties:

spring.elasticsearch.connection-timeout=5s
spring.elasticsearch.socket-timeout=5s

6.2 JVM tuning

Edit the jvm.options file of the Elasticsearch node to adjust the heap size, keeping the minimum and maximum equal:

-Xms2g
-Xmx2g

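7. Complex queries

The Controller exposes the custom query methods of the service class, for example a fuzzy search by title and a combined search by author and genre: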

@RestController
@RequestMapping("/books")
public class BookController {

    private final BookService bookService;

    @Autowired
    public BookController(BookService bookService) {
        this.bookService = bookService;
    }

    @GetMapping("/searchByTitle")
    public List<Book> searchByTitle(@RequestParam String title) {
        return bookService.searchBooksByTitle(title);
    }

    @GetMapping("/searchByAuthorAndGenre")
    public List<Book> searchByAuthorAndGenre(@RequestParam String author, @RequestParam String genre) {
        return bookService.searchBooksByAuthorAndGenre(author, genre);
    }
}

8. Pagination and aggregation

Add pagination and aggregation endpoints to the Controller (page is zero-based):

@GetMapping("/searchWithPagination")
public List<Book> searchWithPagination(@RequestParam String title, @RequestParam int page, @RequestParam int size) {
    PageRequest pageRequest = PageRequest.of(page, size);
    return bookService.searchBooksByTitleWithPagination(title, pageRequest);
}

@GetMapping("/aggregateByGenre")
public Map<String, Long> aggregateByGenre() {
    return bookService.aggregateBooksByGenre();
}

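Implement the pagination and aggregation methods in the service class. The version below assumes Spring Data Elasticsearch 4.0 to 4.2, an ElasticsearchRestTemplate injected as elasticsearchRestTemplate, and a genre field mapped as keyword:

public List<Book> searchBooksByTitleWithPagination(String title, Pageable pageable) {
    // Match query on title; the Pageable supplies from and size
    NativeSearchQuery query = new NativeSearchQueryBuilder()
            .withQuery(QueryBuilders.matchQuery("title", title))
            .withPageable(pageable)
            .build();
    SearchHits<Book> searchHits = elasticsearchRestTemplate.search(query, Book.class);
    return searchHits.stream().map(SearchHit::getContent).collect(Collectors.toList());
}

public Map<String, Long> aggregateBooksByGenre() {
    TermsAggregationBuilder aggregation = AggregationBuilders.terms("genres").field("genre").size(10);
    NativeSearchQuery query = new NativeSearchQueryBuilder()
            .withQuery(QueryBuilders.matchAllQuery())
            .addAggregation(aggregation)
            .build();
    SearchHits<Book> searchHits = elasticsearchRestTemplate.search(query, Book.class);

    // One entry per genre bucket: genre name mapped to the number of books
    Terms genres = searchHits.getAggregations().get("genres");
    return genres.getBuckets().stream()
            .collect(Collectors.toMap(Terms.Bucket::getKeyAsString, Terms.Bucket::getDocCount));
}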
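Since PageRequest.of(page, size) is zero-based, the offset that ends up in the Elasticsearch request as from is page * size. The sketch below spells out that arithmetic and checks it against the default index.max_result_window of 10,000, beyond which from/size requests are rejected. The class and method names are illustrative assumptions, not Spring or Elasticsearch APIs.

```java
class PageWindow {

    // Elasticsearch default for index.max_result_window
    static final int MAX_RESULT_WINDOW = 10_000;

    // Offset of the first hit on a zero-based page
    static long from(int page, int size) {
        return (long) page * size;
    }

    // True when the requested page still fits inside the default result window
    static boolean fitsDefaultWindow(int page, int size) {
        return from(page, size) + size <= MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(from(3, 20));
        System.out.println(fitsDefaultWindow(1000, 10));
    }
}
```

A check like this could run in the Controller before the query is sent, so that overly deep pages fail fast with a clear message.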

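9. Run and test

II. More Advanced Features of the Spring Boot and Elasticsearch Integration

1. Document data handling

Real applications often need more flexibility in how document data is handled. Annotations on the entity class allow finer-grained field mappings and settings:

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = "books")
public class Book {

    @Id
    private String id;

    @Field(type = FieldType.Text, analyzer = "standard", fielddata = true)
    private String title;

    @Field(type = FieldType.Keyword)
    private String author;

    @Field(type = FieldType.Keyword)
    private String genre;

    // Other fields and methods
}

In this example, the @Field annotations set the field types precisely and configure the analyzer: title is analyzed text with fielddata enabled, while author and genre are exact-value keyword fields.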

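2. Script queries

Elasticsearch can evaluate scripts as part of a query, which is useful for some complex business logic. The endpoint below builds a script query from the request parameter: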

@GetMapping("/searchWithScript")
public List<Book> searchWithScript(@RequestParam String script) {
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(QueryBuilders.scriptQuery(new Script(script)))
            .build();
    return elasticsearchRestTemplate.search(searchQuery, Book.class).stream()
            .map(SearchHit::getContent)
            .collect(Collectors.toList());
}

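In this example, a Script object is wrapped in a script query and executed as a NativeSearchQuery.

3. Performance optimization: Bulk operations

When many documents have to be written at once, the Bulk API sends them in a single request and can improve throughput considerably: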

public void bulkIndexBooks(List<Book> books) {
    List<IndexQuery> indexQueries = books.stream()
            .map(book -> new IndexQueryBuilder()
                    .withObject(book)
                    .build())
            .collect(Collectors.toList());

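    // Spring Data Elasticsearch 4.x takes the target index explicitly
    elasticsearchRestTemplate.bulkIndex(indexQueries, IndexCoordinates.of("books"));
    // Make the new documents visible to search immediately
    elasticsearchRestTemplate.indexOps(Book.class).refresh();
}

In this example, bulkIndex indexes the books in one batch, and refresh on the index operations makes them searchable right away.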
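For very large lists, a single bulk request can become too big, so a common approach is to split the input into fixed-size batches and send one bulk request per batch. Here is a minimal sketch of the splitting step; the class name BulkBatches and the batch size are assumptions to be tuned for the actual documents.

```java
import java.util.ArrayList;
import java.util.List;

class BulkBatches {

    // Splits items into consecutive batches of at most batchSize elements
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Five items in batches of two give three batches, the last with one item
        System.out.println(partition(List.of(1, 2, 3, 4, 5), 2));
    }
}
```

Each batch could then be passed to a method like bulkIndexBooks, with a single refresh after the last batch rather than one per batch.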

4. Advanced usage: Highlight

Highlighting the matched keywords in search results improves the user experience. The query below requests highlighting:

public List<Book> searchBooksWithHighlight(String keyword) {
    QueryStringQueryBuilder query = QueryBuilders.queryStringQuery(keyword);
    HighlightBuilder.Field highlightTitle = new HighlightBuilder.Field("title")
            .preTags("<span style='background-color:yellow'>")
            .postTags("</span>");

    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(query)
            .withHighlightFields(highlightTitle)
            .build();

    SearchHits<Book> searchHits = elasticsearchRestTemplate.search(searchQuery, Book.class);
    return searchHits.stream()
            .map(searchHit -> {
                Book book = searchHit.getContent();
                Map<String, List<String>> highlightFields = searchHit.getHighlightFields();
                if (highlightFields.containsKey("title")) {
                    book.setTitle(String.join(" ", highlightFields.get("title")));
                }
                return book;
            })
            .collect(Collectors.toList());
}

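In this example, HighlightBuilder.Field configures highlighting for the title field, and withHighlightFields attaches it to the query. The highlighted fragments then replace the title of each returned book.

5. Advanced usage: SearchTemplate

Elasticsearch's search template feature runs parameterized queries written as Mustache templates: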

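public List<Book> searchBooksWithTemplate(String genre) throws IOException {
    Map<String, Object> params = Collections.singletonMap("genre", genre);
    String script = "{\"query\":{\"match\":{\"genre\":\"{{genre}}\"}}}";

    // ElasticsearchRestTemplate 4.x has no search template call, so this sketch
    // assumes an injected RestHighLevelClient named restHighLevelClient
    SearchTemplateRequest request = new SearchTemplateRequest();
    request.setRequest(new SearchRequest("books"));
    request.setScriptType(ScriptType.INLINE);
    request.setScript(script);
    request.setScriptParams(params);

    SearchResponse response = restHighLevelClient
            .searchTemplate(request, RequestOptions.DEFAULT)
            .getResponse();

    // Convert each raw hit back into a Book with the template's converter.
    // Document here is org.springframework.data.elasticsearch.core.document.Document
    return Arrays.stream(response.getHits().getHits())
            .map(hit -> {
                Document document = Document.from(hit.getSourceAsMap());
                document.setId(hit.getId());
                return elasticsearchRestTemplate.getElasticsearchConverter().read(Book.class, document);
            })
            .collect(Collectors.toList());
}

In this example, a simple inline Mustache template is executed through the search template API, with genre supplied as a template parameter.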
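To see what the template step produces, the sketch below renders the same template string locally by replacing each {{name}} placeholder with its parameter. It is a deliberately simplified stand-in: the Mustache engine inside Elasticsearch also handles escaping, sections, and defaults, and the class name TemplateRendering is an assumption for illustration.

```java
import java.util.Map;

class TemplateRendering {

    // Replaces each {{name}} placeholder with the matching parameter value.
    // Simplification: no escaping and no Mustache sections are supported.
    static String render(String template, Map<String, Object> params) {
        String rendered = template;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            rendered = rendered.replace("{{" + entry.getKey() + "}}", String.valueOf(entry.getValue()));
        }
        return rendered;
    }

    public static void main(String[] args) {
        String template = "{\"query\":{\"match\":{\"genre\":\"{{genre}}\"}}}";
        // Prints the query body that would be run for genre = fantasy
        System.out.println(render(template, Map.of("genre", "fantasy")));
    }
}
```

Rendering the template this way is only useful for inspection; the actual search should still pass the parameters to Elasticsearch so that they are substituted on the server.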

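Advanced Usage

Common Advanced Elasticsearch Techniques

In this part we look at some common advanced Elasticsearch techniques: aggregations, geospatial search, fuzzy matching, index aliases, and deep pagination.

1. Aggregation

Aggregations are a powerful Elasticsearch feature for analyzing and summarizing a data set. Some common aggregation types follow.

1.1 Bucket aggregation

A bucket aggregation assigns documents to buckets; further aggregations can then be computed per bucket.

GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "category.keyword"
      }
    }
  }
}

In this example, the terms bucket aggregation counts the documents in each category. The category.keyword sub-field exists when category was dynamically mapped as text; if the field is mapped directly as Keyword, as in MyEntity above, aggregate on category instead.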
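Conceptually, a terms aggregation is a group-by-and-count over exact field values. The following in-memory sketch shows the same computation with Java streams; it is only an analogy (Elasticsearch computes this per shard, returns the top buckets ordered by document count, and defaults to ten buckets), and the class name TermsBuckets is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class TermsBuckets {

    // Groups values into buckets keyed by the exact term and counts the entries per bucket.
    // A TreeMap is used so the buckets come back sorted by key.
    static Map<String, Long> countByTerm(List<String> categories) {
        return categories.stream()
                .collect(Collectors.groupingBy(category -> category, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByTerm(List.of("book", "music", "book")));
    }
}
```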

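1.2 Metric aggregation

A metric aggregation computes a statistic over a field, such as the average, maximum, or minimum.

GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "average_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

In this example, the avg metric aggregation computes the average of the price field.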
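As a rough in-memory analogy, the average, minimum, and maximum that metric aggregations return correspond to what DoubleSummaryStatistics computes over a list of values. The class name PriceMetrics and the sample prices are assumptions for illustration.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;

class PriceMetrics {

    // Computes count, sum, min, max, and average over the given prices in one pass
    static DoubleSummaryStatistics summarize(List<Double> prices) {
        return prices.stream().mapToDouble(Double::doubleValue).summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stats = summarize(List.of(10.0, 20.0, 30.0));
        System.out.println(stats.getAverage());
    }
}
```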

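2. Geospatial search

Elasticsearch has strong geospatial search support, with data types for geo points, geo shapes, and more.

2.1 Geo point search

GET /my_geo_index/_search
{
  "query": {
    "geo_distance": {
      "distance": "10km",
      "location": {
        "lat": 40,
        "lon": -70
      }
    }
  }
}

In this example, the geo_distance query finds documents whose location lies within 10 km of the given coordinates (latitude 40, longitude -70).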
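The filter amounts to computing a great-circle distance between two points and comparing it with the radius. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; Elasticsearch's own calculation may differ in detail, and the class and method names are assumptions for illustration.

```java
class GeoDistance {

    // Mean Earth radius in kilometres (an approximation)
    private static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two latitude/longitude points, in kilometres
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding just above the valid asin range
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(Math.min(1.0, a)));
    }

    // True when the second point lies within radiusKm of the first
    static boolean withinKm(double lat1, double lon1, double lat2, double lon2, double radiusKm) {
        return haversineKm(lat1, lon1, lat2, lon2) <= radiusKm;
    }

    public static void main(String[] args) {
        // 0.05 degrees of latitude is roughly 5.6 km, so this point is inside the 10 km radius
        System.out.println(withinKm(40, -70, 40.05, -70, 10));
    }
}
```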

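2.2 Geo shape search

GET /my_geo_shape_index/_search
{
  "query": {
    "geo_shape": {
      "location": {
        "shape": {
          "type": "envelope",
          "coordinates": [[-74.1, 40.85], [-73.9, 40.73]]
        },
        "relation": "within"
      }
    }
  }
}

In this example, the geo_shape query finds documents that lie within the given rectangle. An envelope is written as its top-left and bottom-right corners, each in [longitude, latitude] order.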

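3. Fuzzy matching queries

Elasticsearch supports several kinds of approximate matching, including wildcard queries, fuzzy queries, and proximity queries.

3.1 Wildcard query

GET /my_index/_search
{
  "query": {
    "wildcard": {
      "name": "el*"
    }
  }
}

In this example, the wildcard query finds documents whose name contains a term starting with "el".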
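In a wildcard pattern, * matches any sequence of characters and ? matches exactly one character. The sketch below reproduces that matching rule by translating the pattern into a regular expression; it only illustrates the semantics (Elasticsearch matches the pattern against indexed terms, which for analyzed text fields are typically lowercased tokens), and the class name WildcardMatch is an assumption.

```java
import java.util.regex.Pattern;

class WildcardMatch {

    // Translates a wildcard pattern (* = any sequence, ? = any single character) into a regex
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Quote everything else so regex metacharacters are taken literally
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True when the whole term matches the wildcard pattern
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("el*", "elasticsearch"));
    }
}
```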

3.2 Fuzzy query

GET /my_index/_search
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "elastic",
        "fuzziness": "AUTO"
      }
    }
  }
}

In this example, the fuzzy query finds documents whose name contains a term within a small edit distance of "elastic".
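With fuzziness set to AUTO, the allowed number of edits depends on the length of the query term: terms of up to 2 characters must match exactly, terms of 3 to 5 characters allow one edit, and longer terms allow two. The sketch below combines that rule with a plain Levenshtein distance. Note the simplification: by default Elasticsearch also counts a transposition of two adjacent characters as a single edit, which plain Levenshtein does not. The class and method names are assumptions for illustration.

```java
class Fuzziness {

    // Maximum edits allowed by fuzziness AUTO for a query term of this length
    static int autoMaxEdits(String term) {
        int length = term.length();
        if (length <= 2) {
            return 0;
        }
        if (length <= 5) {
            return 1;
        }
        return 2;
    }

    // Plain Levenshtein distance (insertions, deletions, substitutions), two-row variant
    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] current = new int[b.length() + 1];
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(Math.min(current[j - 1] + 1, previous[j] + 1), previous[j - 1] + cost);
            }
            previous = current;
        }
        return previous[b.length()];
    }

    // True when the term is within the AUTO edit budget of the query
    static boolean fuzzyMatches(String query, String term) {
        return levenshtein(query, term) <= autoMaxEdits(query);
    }

    public static void main(String[] args) {
        System.out.println(fuzzyMatches("elastic", "elastik"));
    }
}
```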

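4. Index aliases

An index alias is a virtual index name that points to one or more indices. It can be used to simplify queries, switch between index versions, or effectively rename an index.

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "new_index",
        "alias": "my_alias"
      }
    }
  ]
}

In this example, an alias named "my_alias" is created that points to the index "new_index".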
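Switching index versions is usually done by putting a remove and an add action into one request, since all actions in a single _aliases call are applied atomically. The sketch below only builds that request body as a string; the class name AliasSwap and the index name old_index are hypothetical, and the inputs are assumed to need no JSON escaping.

```java
class AliasSwap {

    // Builds the JSON body for POST /_aliases that moves an alias from oldIndex to newIndex.
    // Assumes the names contain no characters that require JSON escaping.
    static String swapBody(String alias, String oldIndex, String newIndex) {
        return String.format(
                "{\"actions\":["
                        + "{\"remove\":{\"index\":\"%s\",\"alias\":\"%s\"}},"
                        + "{\"add\":{\"index\":\"%s\",\"alias\":\"%s\"}}]}",
                oldIndex, alias, newIndex, alias);
    }

    public static void main(String[] args) {
        System.out.println(swapBody("my_alias", "old_index", "new_index"));
    }
}
```

Because the alias never points at zero indices during the swap, clients that query my_alias see either the old or the new index, never a missing one.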

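5. Deep pagination

For deep pagination, the usual from and size parameters become expensive, because every shard has to collect and sort from + size hits. search_after avoids this by continuing from the sort values of the last hit on the previous page.

GET /my_index/_search
{
  "size": 10,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "date": {
        "order": "asc"
      }
    }
  ],
  "search_after": [1700000000000]
}

In this example, search_after carries the sort value of the last hit from the previous page (the value shown is a placeholder), so no from offset is needed. The first page is requested without search_after. In practice, add a unique field to the sort as a tiebreaker so that documents sharing the same date are not skipped.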
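The cursor logic can be sketched in memory: each page takes the next size values strictly greater than the last sort value seen, and that last value becomes the cursor for the following page. This is an illustration only (the list stands in for an index sorted by date, and Elasticsearch does not scan linearly like this); the class name SearchAfterPaging is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

class SearchAfterPaging {

    // Returns up to size values strictly greater than after from an ascending list.
    // Passing null for after fetches the first page.
    static List<Long> nextPage(List<Long> sortedDates, Long after, int size) {
        return sortedDates.stream()
                .filter(date -> after == null || date > after)
                .limit(size)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> dates = List.of(1L, 2L, 3L, 4L, 5L);
        Long cursor = null;
        List<Long> page;
        // Keep requesting pages until an empty one comes back
        while (!(page = nextPage(dates, cursor, 2)).isEmpty()) {
            System.out.println(page);
            // The last sort value of this page is the cursor for the next one
            cursor = page.get(page.size() - 1);
        }
    }
}
```

The strict greater-than comparison also shows why a tiebreaker matters: two entries with the same sort value that straddle a page boundary would otherwise be dropped.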

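Conclusion

In this post we covered how to integrate Elasticsearch with Spring Boot: adding the dependency, configuring the connection, creating entity classes and Repository interfaces, writing the Service, and using it from a Controller. We also looked at more advanced Elasticsearch features, including aggregations, geospatial search, fuzzy matching, and index aliases. These techniques should help you handle complex queries and analysis tasks more flexibly and efficiently. I hope this is useful in your own projects. Thanks for reading!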