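The following are more detailed best practices for integrating Spring Boot with Elasticsearch 8.x, covering configuration, performance tuning, security, testing, and other key areas.
1. Version and Dependency Management
1.1 Pin a compatible version set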
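- Spring Boot 3.x + Spring Data Elasticsearch 5.x + Elasticsearch 8.x is the officially recommended combination.
- Check the dependency tree (for example with mvn dependency:tree) to catch version conflicts:
xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<!-- Make sure the Elasticsearch 8.x Java client is used. Spring Boot already manages
     this version; pin it explicitly only when it has to match your cluster. -->
<dependency>
    <groupId>co.elastic.clients</groupId>
    <artifactId>elasticsearch-java</artifactId>
    <version>8.12.0</version>
</dependency>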
1.2 Exclude the legacy client
Make sure RestHighLevelClient is not pulled in by any dependency:
xml
<exclusions>
    <exclusion>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
    </exclusion>
</exclusions>
2. Configuration and Connection Management
2.1 Basic configuration (application.yml)
yaml
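spring:
  elasticsearch:
    uris: ["https://elasticsearch-host:9200"]  # Elasticsearch 8 enables HTTPS by default
    username: "elastic"
    password: "your_password"
    connection-timeout: 5s
    socket-timeout: 30s
    restclient:
      ssl:
        bundle: "es-ca"  # Spring Boot 3.1+; "es-ca" is an example bundle name
  ssl:
    bundle:
      pem:
        es-ca:
          truststore:
            certificate: "/path/to/http_ca.crt"  # the CA certificate must be configured
# Spring Boot has no ssl.certificate-authorities or max-connections property for Elasticsearch.
# The CA is trusted through the SSL bundle above; the connection pool size (e.g. 100) has to be
# set in code on the HTTP client builder, for example with setMaxConnTotal(100).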
2.2 Custom configuration class
java
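import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import javax.net.ssl.SSLContext;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticsearchConfig {
    @Bean
    public ElasticsearchClient elasticsearchClient(RestClient restClient) {
        return new ElasticsearchClient(new RestClientTransport(restClient, new JacksonJsonpMapper()));
    }

    // Note: a hand-built RestClient does not pick up spring.elasticsearch.username/password.
    @Bean
    public RestClient restClient(@Value("${spring.elasticsearch.uris}") String[] uris) {
        // Only the first URI is used here; pass every host to the builder for a multi-node cluster.
        return RestClient.builder(HttpHost.create(uris[0]))
                .setHttpClientConfigCallback(httpClientBuilder -> {
                    // Configure the SSL context. loadSSLContext() is a placeholder that you must
                    // implement yourself, e.g. by building a trust store from http_ca.crt.
                    SSLContext sslContext = loadSSLContext();
                    return httpClientBuilder.setSSLContext(sslContext);
                })
                .build();
    }
}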
3. Index and Mapping Design
3.1 Define index mappings explicitly
java
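import java.time.LocalDateTime;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.DateFormat;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Dynamic;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
import org.springframework.data.elasticsearch.annotations.Setting;

// createIndex = false disables automatic index creation;
// dynamic = STRICT rejects fields that are not declared in the mapping.
@Document(indexName = "orders", createIndex = false, dynamic = Dynamic.STRICT)
@Setting(shards = 3, replicas = 1) // @Setting uses shards/replicas, and has no dynamic attribute
public class Order {
    @Id
    private String id;

    @Field(type = FieldType.Keyword)
    private String orderId;

    @Field(type = FieldType.Text, analyzer = "ik_smart") // requires the IK analysis plugin
    private String description;

    @Field(type = FieldType.Date, format = DateFormat.date_optional_time)
    private LocalDateTime orderDate;

    // Getters and setters omitted
}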
3.2 Manage dynamic fields with index templates
bash
PUT _index_template/order_template
{
  "index_patterns": ["orders*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "analysis": {
        "analyzer": {
          "ik_smart": { "type": "ik_smart" }
        }
      }
    },
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "orderId": { "type": "keyword" },
        "description": { "type": "text", "analyzer": "ik_smart" }
      }
    }
  }
}
4. Data Operations and Query Optimization
4.1 Use a repository to simplify access
java
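import java.util.List;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

public interface OrderRepository extends ElasticsearchRepository<Order, String> {
    // The query DSL is derived automatically from the method name
    List<Order> findByDescription(String description);
}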
4.2 Use ElasticsearchClient for complex queries
java
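import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch._types.query_dsl.Query;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.elasticsearch.core.search.Hit;
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;
import org.springframework.beans.factory.annotation.Autowired;

@Autowired
private ElasticsearchClient elasticsearchClient;

// search() performs blocking I/O and declares IOException, so the method must declare it too
public List<Order> searchOrders(String keyword) throws IOException {
    Query query = Query.of(q -> q
            .bool(b -> b
                    .must(m -> m.match(t -> t
                            .field("description")
                            .query(keyword)
                    ))
            )
    );
    SearchResponse<Order> response = elasticsearchClient.search(s -> s
            .index("orders")
            .query(query)
            .size(100), Order.class);
    return response.hits().hits().stream()
            .map(Hit::source)
            .collect(Collectors.toList());
}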
4.3 Optimize bulk writes
java
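BulkRequest.Builder bulkRequest = new BulkRequest.Builder();
orders.forEach(order -> bulkRequest
        .operations(op -> op
                .index(idx -> idx
                        .index("orders")
                        .id(order.getId())
                        .document(order)
                )
        )
);
// bulk() declares IOException. Failures of individual items are not thrown;
// they are reported in the response and have to be checked explicitly.
BulkResponse response = elasticsearchClient.bulk(bulkRequest.build());
if (response.errors()) {
    response.items().stream()
            .filter(item -> item.error() != null)
            .forEach(item -> System.err.println(item.id() + ": " + item.error().reason()));
}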
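A single bulk request that holds every order can grow very large. One common approach is to split the list into fixed-size batches and send one bulk request per batch. The sketch below shows only the batching step in plain Java; BulkBatcher is a hypothetical helper, not part of the Elasticsearch client, and the batch size is an assumption to tune by measurement.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a list into consecutive batches for bulk indexing. */
final class BulkBatcher {
    private BulkBatcher() {}

    /** Returns consecutive sublists of at most batchSize elements, in the original order. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch stays valid if the source list changes later
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch returned by, say, BulkBatcher.partition(orders, 1000) would then be turned into its own BulkRequest in the same way as above.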
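5. Performance Tuning
5.1 Shard and replica strategy
- Shard count: estimate it from the expected data volume; keep each shard at or below roughly 50 GB.
- Replica count: use at least 1 replica in production; 2 for high-availability scenarios.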
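The sizing rule above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes data is spread evenly across shards and treats the 50 GB ceiling as a parameter; ShardPlanner is a hypothetical helper name used only for illustration.

```java
/** Hypothetical helper: rough shard-count estimate from expected data volume. */
final class ShardPlanner {
    private ShardPlanner() {}

    /**
     * Returns the smallest number of primary shards that keeps each shard
     * at or below maxShardSizeGb, assuming data is distributed evenly.
     */
    static int primaryShards(double expectedDataGb, double maxShardSizeGb) {
        if (expectedDataGb < 0 || maxShardSizeGb <= 0) {
            throw new IllegalArgumentException("data size must be >= 0 and shard size > 0");
        }
        // An index always has at least one primary shard
        return Math.max(1, (int) Math.ceil(expectedDataGb / maxShardSizeGb));
    }

    /** Total shard copies held by the cluster: each primary plus its replicas. */
    static int totalShardCopies(int primaryShards, int replicas) {
        return primaryShards * (1 + replicas);
    }
}
```

For example, 120 GB of expected data with a 50 GB ceiling gives 3 primary shards, and with 1 replica the cluster stores 6 shard copies in total.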
5.2 Refresh interval and write optimization
java
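// Configure these settings when creating the index
CreateIndexRequest request = new CreateIndexRequest.Builder()
        .index("orders")
        .settings(s -> s
                // Refresh less often; refreshInterval takes a Time value, not a plain String
                .refreshInterval(t -> t.time("30s"))
                .numberOfShards("3")
        )
        .build();
elasticsearchClient.indices().create(request);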
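5.3 Query performance optimization
- Use filter context instead of query context for clauses that do not need relevance scoring; filter clauses skip scoring and can be cached:
java
Query query = Query.of(q -> q
        .bool(b -> b
                .filter(f -> f.term(t -> t.field("status").value("completed")))
        )
);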
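- Avoid wildcard queries, especially ones with a leading wildcard; for prefix or autocomplete matching, index the field with an edge_ngram tokenizer instead.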
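To see why edge_ngram can replace a prefix wildcard, it helps to look at what it indexes: every prefix of a token between min_gram and max_gram characters long, so a prefix search becomes an exact term lookup. The sketch below imitates that output for a single token in plain Java. It is a simplified illustration (EdgeNgrams is a hypothetical class, and options such as token_chars are ignored), not the analyzer itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the prefixes an edge n-gram tokenizer emits for one token. */
final class EdgeNgrams {
    private EdgeNgrams() {}

    /** Returns the prefixes of token whose length lies between minGram and maxGram. */
    static List<String> of(String token, int minGram, int maxGram) {
        if (minGram < 1 || maxGram < minGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        // Prefixes cannot be longer than the token itself
        int upper = Math.min(maxGram, token.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }
}
```

With minGram 2 and maxGram 4, the token "order" yields "or", "ord", and "orde", so a user typing "ord" matches an indexed term directly instead of triggering a scan over the term dictionary.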
6. Security and Authentication
6.1 Enforce HTTPS and TLS
- Generate a CA certificate and configure Elasticsearch (elasticsearch.yml):
yaml
xpack.security.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/elastic-certificates.p12
  truststore.path: certs/elastic-certificates.p12
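6.2 Role-based access control
- Define least-privilege roles. File-based roles are defined in roles.yml rather than elasticsearch.yml; they can also be managed through the role management API or Kibana:
yaml
app_user:
  indices:
    - names: ['orders*']
      privileges: ['read', 'index']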
7. Exception Handling and Retries
7.1 Custom exception handling
java
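import co.elastic.clients.elasticsearch._types.ElasticsearchException;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;

// ErrorResponse is an application-defined DTO (error code plus message), not a library class
@ControllerAdvice
public class ElasticsearchExceptionHandler {
    @ExceptionHandler(ElasticsearchException.class)
    public ResponseEntity<ErrorResponse> handleEsException(ElasticsearchException ex) {
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(new ErrorResponse("ES_ERROR", ex.getMessage()));
    }
}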
7.2 Retry mechanism (with Resilience4j)
java
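@Bean
public RetryConfig retryConfig() {
    return RetryConfig.custom()
            .maxAttempts(3)
            .waitDuration(Duration.ofMillis(500))
            .retryOnException(e -> e instanceof ElasticsearchException)
            .build();
}

// @Retry resolves its settings by instance name. With the Resilience4j Spring Boot starter
// these normally come from resilience4j.retry.instances.elasticsearchRetry.* properties,
// so make sure the RetryConfig above is actually registered under that name.
@Retry(name = "elasticsearchRetry")
public void saveOrder(Order order) {
    orderRepository.save(order);
}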
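To make the policy concrete, the same behavior (up to 3 attempts with a fixed 500 ms wait between them) can be sketched without any library. SimpleRetry below is a hypothetical stand-in for illustration only; unlike the configuration above it retries on any RuntimeException, and it offers none of Resilience4j's metrics, backoff strategies, or circuit breaker integration.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical minimal retry loop: fixed wait, bounded number of attempts. */
final class SimpleRetry {
    private SimpleRetry() {}

    /** Calls action up to maxAttempts times and rethrows the last failure if all attempts fail. */
    static <T> T call(Supplier<T> action, int maxAttempts, Duration wait) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if attempts remain
            }
            if (attempt < maxAttempts) {
                sleep(wait);
            }
        }
        throw last;
    }

    private static void sleep(Duration wait) {
        try {
            Thread.sleep(wait.toMillis());
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop retrying
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```

A call such as SimpleRetry.call(() -> orderRepository.save(order), 3, Duration.ofMillis(500)) would mirror the settings shown above.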
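8. Monitoring and Operations
8.1 Integrate Prometheus + Grafana
- Configure an Elasticsearch exporter. Elasticsearch does not expose Prometheus metrics itself, and xpack.monitoring.exporters accepts only the local and http exporter types, so there is no "type: prometheus" setting to add to elasticsearch.yml. Run a standalone exporter (for example the prometheus-community elasticsearch_exporter) against the cluster's HTTP endpoint, localhost:9200 in this example, and have Prometheus scrape the exporter.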
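8.2 Logging
- Log slow queries. The slow log thresholds are index-level settings, so apply them per index (or through an index template) rather than in elasticsearch.yml:
bash
PUT orders/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.fetch.debug": "500ms"
}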
9. Testing Strategy
9.1 Integration tests with Testcontainers
java
@Testcontainers
@SpringBootTest
public class OrderRepositoryTest {
    @Container
    static ElasticsearchContainer container =
            new ElasticsearchContainer("docker.elastic.co/elasticsearch/elasticsearch:8.12.0")
                    .withPassword("password")
                    .withExposedPorts(9200);

    @DynamicPropertySource
    static void setProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.elasticsearch.uris",
                () -> "https://" + container.getHost() + ":" + container.getMappedPort(9200));
        registry.add("spring.elasticsearch.username", () -> "elastic");
        registry.add("spring.elasticsearch.password", () -> "password");
        // Trust the CA that the container generates at startup instead of a fixed file path.
        // Assumes Spring Boot 3.1+ SSL bundles; "es-test" is an arbitrary bundle name.
        registry.add("spring.ssl.bundle.pem.es-test.truststore.certificate",
                () -> new String(container.caCertAsBytes().orElseThrow(), StandardCharsets.UTF_8));
        registry.add("spring.elasticsearch.restclient.ssl.bundle", () -> "es-test");
    }
}
10. Advanced Features
10.1 Vector search (approximate kNN search is new in Elasticsearch 8)
java
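@Field(type = FieldType.Dense_Vector, dims = 512) // the enum constant is Dense_Vector
private float[] embedding;

// Run a kNN search at query time. The top-level knn option of the search request is used
// here; it takes k and numCandidates directly. embeddingVector is assumed to be a
// List<Float> holding the query embedding.
SearchResponse<Order> response = elasticsearchClient.search(s -> s
        .index("orders")
        .knn(k -> k
                .field("embedding")
                .queryVector(embeddingVector)
                .k(10)
                .numCandidates(100)
        ), Order.class);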
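Conceptually, a kNN search returns the k stored vectors most similar to the query vector. Elasticsearch answers this approximately with an HNSW graph, and numCandidates controls how many candidates each shard considers. The plain-Java sketch below shows the exact, brute-force version of the same idea, assuming cosine similarity as the metric (the actual metric depends on the field's similarity setting). BruteForceKnn is a hypothetical class for illustration only.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical exact kNN over an in-memory list, for illustration only. */
final class BruteForceKnn {
    private BruteForceKnn() {}

    /** Cosine similarity of two non-zero vectors of equal length. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("cosine similarity is undefined for zero vectors");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the indices of the k documents most similar to query, best match first. */
    static List<Integer> nearest(float[] query, List<float[]> docs, int k) {
        return IntStream.range(0, docs.size())
                .boxed()
                // Negate the score so that the most similar documents sort first
                .sorted(Comparator.comparingDouble((Integer i) -> -cosine(query, docs.get(i))))
                .limit(k)
                .collect(Collectors.toList());
    }
}
```

Scanning every vector like this costs time proportional to the index size per query, which is why the approximate search trades a little recall for speed.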
10.2 Index lifecycle management (ILM)
bash
PUT _ilm/policy/orders_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50GB" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
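Summary: Key Takeaways
- Security first: enforce HTTPS, configure the CA certificate, and apply the principle of least privilege.
- Index design: define mappings explicitly, size shards sensibly, and manage dynamic indices with templates.
- Performance tuning: write in bulk, adjust the refresh interval, and avoid deep pagination.
- Fault tolerance: global exception handling, retry policies, and circuit breaking with graceful degradation.
- Monitoring and alerting: integrate Prometheus and record slow query logs.
With these practices in place, you can build a high-performance, highly available Elasticsearch 8 integration that meets production requirements.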