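Semantic Search with Java 21 + Spring Boot + Elasticsearch 8.10
Contents
- 1. Background and Approach Selection
- 2. Tech Stack and Dependencies
- 3. Environment Setup
- 4. Integrating Elasticsearch 8.10 with Spring Boot
- 5. Index Design and Mapping
- 6. Vector Embedding Integration
- 7. Writing Data and Building the Vector Index
- 8. Implementing Semantic Search
- 9. Hybrid Search (Vector + Keyword)
- 10. Result Optimization and Reranking
- 11. Performance Optimization
- 12. Full Project Structure
- 13. Deployment and Operations
- 14. Summary and Outlook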
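1. Background and Approach Selection
1.1 What is semantic search
Traditional keyword search relies on term matching and term-frequency scoring, which leads to the following limitations:
| Problem | Example |
|---|---|
| Synonyms do not match | Searching for "mobile phone" (手机) does not find "iPhone" |
| No semantic understanding | Searching for "how to lose weight" (怎么减肥) does not find "slimming methods" (瘦身方法) |
| Context is lost | Searching for "apple" (苹果) cannot tell the fruit from the company |
Semantic search converts text into high-dimensional vectors and measures semantic similarity in that vector space, so texts with similar meaning can match even when they share no terms.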
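To make "similarity in vector space" concrete, here is a minimal sketch of cosine similarity, the measure used for the `dense_vector` fields later in this article. The three-dimensional vectors are hand-made toy values chosen for illustration, not real model output; a model such as bge-large-zh-v1.5 produces 1024 dimensions.

```java
/** Minimal cosine-similarity sketch over toy 3-dimensional vectors. */
final class CosineSimilarityDemo {

    // Hand-made toy vectors (assumption for illustration, NOT real embeddings).
    static final float[] PHONE = {0.9f, 0.1f, 0.0f};
    static final float[] IPHONE = {0.8f, 0.2f, 0.1f};
    static final float[] APPLE_FRUIT = {0.1f, 0.1f, 0.9f};

    /** Returns (a . b) / (|a| * |b|); rejects mismatched lengths and zero vectors. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

With these toy values, `cosine(PHONE, IPHONE)` is much closer to 1 than `cosine(PHONE, APPLE_FRUIT)`, which is the property a kNN index exploits to rank documents.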
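1.2 Comparing the approaches
Traditional search (BM25): text → tokenization → inverted index → keyword matching → results
Semantic search (kNN): text → embedding → vector index → similarity retrieval → results
Hybrid search: both of the above in parallel → score fusion → results
| Approach | Accuracy | Implementation complexity | Typical use |
|---|---|---|---|
| BM25 | Medium | Low | Exact-match scenarios |
| kNN vector search | High | Medium | Scenarios that need semantic understanding |
| Hybrid search | Highest | High | Production-grade search |
1.3 Why these components
| Component | Version | Reason |
|---|---|---|
| Java | 21 | Long-term support release; virtual threads improve concurrency for blocking I/O |
| Spring Boot | 3.2+ | Supports Java 21 and works with the ES 8.x client |
| Elasticsearch | 8.10+ | Built-in kNN vector search on `dense_vector` fields |
| Embedding model | bge-large-zh / text2vec | Good semantic quality on Chinese text |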
2. Tech Stack and Dependencies
2.1 Core dependencies in pom.xml
xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.2.5</version>
</parent>
<groupId>com.example</groupId>
<artifactId>semantic-search</artifactId>
<version>1.0.0</version>
<properties>
<java.version>21</java.version>
<elasticsearch.version>8.10.4</elasticsearch.version>
</properties>
<dependencies>
<!-- Spring Boot Web -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- WebFlux: provides the WebClient used by EmbeddingService in section 6.1 -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
<!-- Elasticsearch Java Client 8.x (pulls in the low-level REST client, which is built on Apache HttpClient) -->
<dependency>
<groupId>co.elastic.clients</groupId>
<artifactId>elasticsearch-java</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<!-- Jackson (JSON mapping, required by the ES Java Client) -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<!-- Jakarta JSON Processing (required by the ES Java Client) -->
<dependency>
<groupId>jakarta.json</groupId>
<artifactId>jakarta.json-api</artifactId>
<version>2.1.3</version>
</dependency>
<!-- Lombok -->
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<!-- Test -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</project>
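2.2 Why use the ES Java Client 8.x
Elasticsearch 8.x ships a new Java client (`co.elastic.clients`) that replaces the older RestHighLevelClient:
| Legacy (deprecated) | New (recommended) |
|---|---|
| RestHighLevelClient (7.x) | ElasticsearchClient (8.x) |
| Depends on Elasticsearch server classes | Depends only on the low-level RestClient (Apache HttpClient) and a JSON mapper |
| Hand-written request objects | Type-safe builder API with lambdas |
| Many mistakes surface only at runtime | More mistakes are caught at compile time |
| WrapperQueryBuilder for non-standard queries | Native builders for kNN search |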
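3. Environment Setup
3.1 Installing Elasticsearch
bash
# Docker (recommended for local development; security is disabled here, so do not reuse this in production)
docker run -d \
--name elasticsearch \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
-e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \
docker.elastic.co/elasticsearch/elasticsearch:8.10.4
# Verify
curl http://localhost:9200
# kNN vector search (HNSW) is built into ES 8.x and needs no extra plugin.
# The ik_max_word / ik_smart analyzers used in section 5 are NOT built in: install the IK analysis
# plugin build that matches your Elasticsearch version, otherwise index creation fails.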
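3.2 Embedding model service
The core of semantic search is turning text into vectors. Common options:
Option 1: deploy a local model service (recommended)
bash
# Deploy bge-large-zh-v1.5 (1024 dimensions) with Xinference
pip install xinference
xinference-local --host 0.0.0.0 --port 9997
xinference launch --model-name bge-large-zh-v1.5 --model-type embedding
Option 2: call a hosted API
bash
# OpenAI API
# Model: text-embedding-3-small (1536 dimensions; the mapping's dims must then be 1536, not 1024)
# Price: $0.00002 per 1K tokens
Option 3: local inference with DJL (Deep Java Library)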
xml
<dependency>
<groupId>ai.djl</groupId>
<artifactId>model-zoo</artifactId>
<version>0.24.0</version>
</dependency>
<dependency>
<groupId>ai.djl.huggingface</groupId>
<artifactId>tokenizers</artifactId>
<version>0.24.0</version>
</dependency>
4. Integrating Elasticsearch 8.10 with Spring Boot
4.1 Configuration file
yaml
# application.yml
spring:
  application:
    name: semantic-search
elasticsearch:
  host: localhost
  port: 9200
  scheme: http
# Embedding service configuration
embedding:
  service:
    url: http://localhost:9997/v1/embeddings
    model: bge-large-zh-v1.5
    dimensions: 1024
4.2 Elasticsearch Client 配置
java
package com.example.search.config;
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class ElasticsearchConfig {
@Value("${elasticsearch.host}")
private String host;
@Value("${elasticsearch.port}")
private int port;
@Value("${elasticsearch.scheme}")
private String scheme;
@Bean
public ElasticsearchClient elasticsearchClient() {
// Create the low-level RestClient
RestClient restClient = RestClient.builder(
new HttpHost(host, port, scheme)
).build();
// JSON mapper with java.time support
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.registerModule(new JavaTimeModule());
objectMapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);
// Create the transport layer
ElasticsearchTransport transport = new RestClientTransport(
restClient, new JacksonJsonpMapper(objectMapper)
);
// Create the typed API client
return new ElasticsearchClient(transport);
}
}
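4.3 If the connection requires authentication
java
// Imports needed in addition to those shown in 4.2
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClientBuilder;
import org.springframework.util.StringUtils;
@Configuration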
public class ElasticsearchConfig {
@Value("${elasticsearch.host}")
private String host;
@Value("${elasticsearch.port}")
private int port;
@Value("${elasticsearch.username:}")
private String username;
@Value("${elasticsearch.password:}")
private String password;
@Bean
public ElasticsearchClient elasticsearchClient() {
RestClientBuilder builder = RestClient.builder(
new HttpHost(host, port, "https")
);
if (StringUtils.hasText(username)) {
CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
credentialsProvider.setCredentials(
AuthScope.ANY,
new UsernamePasswordCredentials(username, password)
);
builder.setHttpClientConfigCallback(httpClientBuilder ->
httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider)
);
}
RestClient restClient = builder.build();
ObjectMapper objectMapper = new ObjectMapper()
.registerModule(new JavaTimeModule())
.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);
ElasticsearchTransport transport = new RestClientTransport(
restClient, new JacksonJsonpMapper(objectMapper)
);
return new ElasticsearchClient(transport);
}
}
5. Index Design and Mapping
5.1 Mapping for the semantic search index
java
package com.example.search.service;

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch.indices.CreateIndexRequest;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

import java.io.IOException;

@Slf4j
@Service
@RequiredArgsConstructor
public class IndexService {
    private static final String INDEX_NAME = "articles";
    private final ElasticsearchClient esClient;

    /**
     * Creates the semantic search index.
     * Core fields: title (keyword search) and content_vector (semantic search).
     */
    public void createSemanticSearchIndex() throws IOException {
        boolean exists = esClient.indices().exists(e -> e.index(INDEX_NAME)).value();
        if (exists) {
            log.info("Index {} already exists, skipping creation", INDEX_NAME);
            return;
        }
        CreateIndexRequest request = CreateIndexRequest.of(b -> b
            .index(INDEX_NAME)
            // Elasticsearch needs no index-level kNN switch: index.knn, knn.space_type and
            // knn.algo_param.ef_search are OpenSearch settings and would be rejected here.
            // The client takes shard and replica counts as strings.
            .settings(s -> s
                .numberOfShards("1")
                .numberOfReplicas("1")
            )
            .mappings(m -> m
                .properties("title", p -> p.text(t -> t
                    .analyzer("ik_max_word")
                    .searchAnalyzer("ik_smart")
                ))
                .properties("content", p -> p.text(t -> t
                    .analyzer("ik_max_word")
                    .searchAnalyzer("ik_smart")
                ))
                .properties("summary", p -> p.keyword(k -> k))
                .properties("category", p -> p.keyword(k -> k))
                .properties("author", p -> p.keyword(k -> k))
                .properties("status", p -> p.integer(i -> i))
                // LocalDateTime is serialized as ISO-8601 (e.g. 2024-01-01T10:00:00) by the
                // ObjectMapper in section 4.2, so the date format has to accept that shape
                .properties("created_at", p -> p.date(d -> d
                    .format("strict_date_optional_time||epoch_millis")
                ))
                // Vector field: dense_vector
                // dims: vector dimension, must equal the embedding model's output size
                // index: true enables the kNN (HNSW) index
                // similarity: cosine, which suits bge models
                .properties("content_vector", p -> p.denseVector(dv -> dv
                    .dims(1024)
                    .index(true)
                    .similarity("cosine")
                    .indexOptions(io -> io
                        .type("hnsw")
                        .m(32)
                        .efConstruction(256)
                    )
                ))
                .properties("title_vector", p -> p.denseVector(dv -> dv
                    .dims(1024)
                    .index(true)
                    .similarity("cosine")
                    .indexOptions(io -> io
                        .type("hnsw")
                        .m(32)
                        .efConstruction(256)
                    )
                ))
            )
        );
        esClient.indices().create(request);
        log.info("Index {} created", INDEX_NAME);
    }
}
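5.2 Mapping structure explained
json
{
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "ik_max_word" },
      "content": { "type": "text", "analyzer": "ik_max_word" },
      "summary": { "type": "keyword" },
      "category": { "type": "keyword" },
      "content_vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "hnsw",
          "m": 32,
          "ef_construction": 256
        }
      }
    }
  },
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}
Key parameters:
| Parameter | Description | Recommended value |
|---|---|---|
| `dims` | Vector dimension; must equal the embedding model's output size | 768 / 1024 / 1536 |
| `similarity` | Similarity function; use cosine for bge models | cosine |
| `index_options.type` | Vector index algorithm | hnsw |
| `index_options.m` | Maximum number of connections per node in the HNSW graph | 16~64 |
| `index_options.ef_construction` | Number of candidate neighbors considered while building the HNSW graph | 128~512 |
| `num_candidates` (per kNN request) | Number of candidates examined per shard at query time | 64~512 |
Elasticsearch has no `index.knn`, `knn.space_type` or `knn.algo_param.ef_search` index settings; those belong to the OpenSearch k-NN plugin. In Elasticsearch, kNN is enabled per field with `"index": true`, the similarity is set on the field, and the query-time search breadth is controlled by `num_candidates` on each request.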
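Because `dims` must match the embedding model exactly, it can help to validate vectors in the application before sending them to Elasticsearch, so that a misconfigured model fails with a clear message. The following is a minimal sketch; the class and method names (`VectorChecks`, `requireDims`) are hypothetical and not part of any library.

```java
/** Hypothetical pre-indexing checks for embedding vectors. */
final class VectorChecks {

    /**
     * Throws IllegalArgumentException if the vector is null, has a length different
     * from the mapping's dims, or contains NaN / infinite components.
     */
    static void requireDims(float[] vector, int expectedDims) {
        if (vector == null) {
            throw new IllegalArgumentException("vector is null");
        }
        if (vector.length != expectedDims) {
            throw new IllegalArgumentException(
                "vector has " + vector.length + " dims, mapping expects " + expectedDims);
        }
        for (int i = 0; i < vector.length; i++) {
            if (Float.isNaN(vector[i]) || Float.isInfinite(vector[i])) {
                throw new IllegalArgumentException("non-finite component at index " + i);
            }
        }
    }

    /** Euclidean (L2) length of the vector; 0 means cosine similarity is undefined for it. */
    static double l2Norm(float[] vector) {
        double sum = 0;
        for (float v : vector) {
            sum += (double) v * v;
        }
        return Math.sqrt(sum);
    }
}
```

A call such as `VectorChecks.requireDims(vector, 1024)` right after the embedding call would catch, for example, a 1536-dimension OpenAI vector being written into a 1024-dimension field.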
6. Vector Embedding Integration
6.1 Embedding service client
java
package com.example.search.service;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
@Slf4j
@Service
public class EmbeddingService {
@Value("${embedding.service.url}")
private String embeddingUrl;
@Value("${embedding.service.model}")
private String model;
@Value("${embedding.service.dimensions}")
private int dimensions;
private final WebClient webClient;
private final ObjectMapper objectMapper = new ObjectMapper();
public EmbeddingService(@Value("${embedding.service.url}") String url) {
this.webClient = WebClient.builder()
.baseUrl(url)
.codecs(configurer -> configurer
.defaultCodecs()
.maxInMemorySize(10 * 1024 * 1024) // 10MB
)
.build();
}
/**
* Returns the embedding vector for a single text
*/
public float[] embed(String text) {
List<float[]> result = embedBatch(List.of(text));
return result.get(0);
}
/**
* Returns embedding vectors for a batch of texts, in input order
*/
public List<float[]> embedBatch(List<String> texts) {
try {
Map<String, Object> body = Map.of(
"model", model,
"input", texts
);
String response = webClient.post()
.contentType(MediaType.APPLICATION_JSON)
.bodyValue(body)
.retrieve()
.bodyToMono(String.class)
.block();
return parseEmbeddingResponse(response);
} catch (Exception e) {
log.error("Embedding request failed: {}", e.getMessage());
throw new RuntimeException("Embedding service call failed", e);
}
}
private List<float[]> parseEmbeddingResponse(String json) {
try {
JsonNode root = objectMapper.readTree(json);
JsonNode dataArray = root.get("data");
List<float[]> embeddings = new ArrayList<>(dataArray.size());
for (JsonNode item : dataArray) {
JsonNode embeddingNode = item.get("embedding");
float[] vector = new float[embeddingNode.size()];
for (int i = 0; i < embeddingNode.size(); i++) {
vector[i] = (float) embeddingNode.get(i).asDouble();
}
embeddings.add(vector);
}
return embeddings;
} catch (Exception e) {
throw new RuntimeException("Failed to parse the embedding response", e);
}
}
}
6.2 Simplifying with Spring AI (optional)
If the project already uses Spring AI, the client code can be reduced further:
xml
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
<version>1.0.0-M6</version>
</dependency>
yaml
spring:
ai:
openai:
api-key: your-api-key
base-url: http://localhost:9997 # local model service
java
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingService {
    private final EmbeddingModel embeddingModel;

    public EmbeddingService(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    public float[] embed(String text) {
        // EmbeddingModel#embed(String) builds the request and unwraps the response
        return embeddingModel.embed(text);
    }
}
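7. Writing Data and Building the Vector Index
7.1 Document entity class
java
package com.example.search.model;

import com.fasterxml.jackson.databind.PropertyNamingStrategies;
import com.fasterxml.jackson.databind.annotation.JsonNaming;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.time.LocalDateTime;

// Serialize camelCase fields as snake_case so they line up with the mapping:
// contentVector -> content_vector, createdAt -> created_at. Without this, the vectors
// would land in a dynamically mapped "contentVector" field and kNN search would find nothing.
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@JsonNaming(PropertyNamingStrategies.SnakeCaseStrategy.class)
public class ArticleDocument {
    private String id;
    private String title;
    private String content;
    private String summary;
    private String category;
    private String author;
    private Integer status;
    private LocalDateTime createdAt;
    // Vector fields (populated before writing to ES)
    private float[] contentVector;
    private float[] titleVector;
}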
7.2 Indexing service
java
package com.example.search.service;

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch._types.Refresh;
import co.elastic.clients.elasticsearch.core.BulkResponse;
import co.elastic.clients.elasticsearch.core.bulk.BulkOperation;
import com.example.search.model.ArticleDocument;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.List;

@Slf4j
@Service
@RequiredArgsConstructor
public class DocumentService {
    private static final String INDEX_NAME = "articles";
    private final ElasticsearchClient esClient;
    private final EmbeddingService embeddingService;

    /**
     * Indexes a single document (the vector is generated automatically)
     */
    public void index(ArticleDocument doc) throws IOException {
        // Generate the vector from title + content
        String textForEmbed = doc.getTitle() + " " + doc.getContent();
        float[] vector = embeddingService.embed(textForEmbed);
        doc.setContentVector(vector);
        esClient.index(i -> i
            .index(INDEX_NAME)
            .id(doc.getId())
            .document(doc)
        );
        log.info("Document indexed: id={}, dims={}", doc.getId(), vector.length);
    }

    /**
     * Indexes documents in bulk
     */
    public void bulkIndex(List<ArticleDocument> docs) throws IOException {
        // 1. Generate vectors in one batch call
        List<String> texts = docs.stream()
            .map(doc -> doc.getTitle() + " " + doc.getContent())
            .toList();
        List<float[]> vectors = embeddingService.embedBatch(texts);
        // 2. Attach the vectors (embedBatch returns them in input order)
        for (int i = 0; i < docs.size(); i++) {
            docs.get(i).setContentVector(vectors.get(i));
        }
        // 3. Build the bulk request
        List<BulkOperation> operations = docs.stream()
            .map(doc -> BulkOperation.of(b -> b
                .index(idx -> idx
                    .index(INDEX_NAME)
                    .id(doc.getId())
                    .document(doc)
                )
            ))
            .toList();
        BulkResponse response = esClient.bulk(b -> b
            .operations(operations)
            // Refresh immediately so the documents are searchable right away;
            // convenient for small batches, expensive for large imports
            .refresh(Refresh.True)
        );
        if (response.errors()) {
            log.error("Bulk indexing had errors: {}", response.items().stream()
                .filter(item -> item.error() != null)
                .map(item -> item.error().reason())
                .toList());
        } else {
            log.info("Bulk indexing succeeded: {} documents", docs.size());
        }
    }

    /**
     * Deletes a document
     */
    public void delete(String docId) throws IOException {
        esClient.delete(d -> d.index(INDEX_NAME).id(docId));
    }
}
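7.3 Optimizing bulk vectorization
For large data sets (millions of documents), process the data in batches and run the batches concurrently on virtual threads. StructuredTaskScope is a preview API in Java 21, so this code must be compiled and run with --enable-preview. Forking every batch at once can also overload the embedding service, so cap the concurrency in a real import.
java
import java.util.concurrent.StructuredTaskScope;

public void bulkIndexParallel(List<ArticleDocument> docs) throws Exception {
    int batchSize = 100;
    // partition(...) is a helper not shown in the article: it splits the list into sublists of at most batchSize
    List<List<ArticleDocument>> batches = partition(docs, batchSize);
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        // Each fork runs on its own virtual thread
        for (List<ArticleDocument> batch : batches) {
            scope.fork(() -> {
                indexBatchWithRetry(batch);
                return null;
            });
        }
        scope.join();           // wait for all batches
        scope.throwIfFailed();  // rethrow the first failure, if any
        log.info("All batches indexed: {} batches, {} documents", batches.size(), docs.size());
    }
}

private void indexBatchWithRetry(List<ArticleDocument> batch) {
    int maxRetries = 3;
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
        try {
            List<String> texts = batch.stream()
                .map(d -> d.getTitle() + " " + d.getContent()).toList();
            List<float[]> vectors = embeddingService.embedBatch(texts);
            for (int i = 0; i < batch.size(); i++) {
                batch.get(i).setContentVector(vectors.get(i));
            }
            // bulkRequest(...) is a helper not shown in the article: it sends the
            // already vectorized batch with the bulk API, as in step 3 of 7.2
            bulkRequest(batch);
            return;
        } catch (Exception e) {
            log.warn("Batch indexing failed (attempt {}/{}): {}", attempt, maxRetries, e.getMessage());
            if (attempt == maxRetries) throw new RuntimeException(e);
        }
    }
}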
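The code in 7.3 calls a `partition` helper that the article does not define and retries immediately without waiting. As a minimal sketch, the two pieces might look like the following; the class name `BatchHelpers`, the method names and the doubling delay are assumptions made for illustration, not the author's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helpers for batched indexing. */
final class BatchHelpers {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /**
     * Runs the task up to maxAttempts times, doubling the delay after each failure.
     * Rethrows the last failure when all attempts are used up.
     */
    static <T> T retryWithBackoff(Supplier<T> task, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
                delay *= 2;
            }
        }
    }
}
```

Waiting between attempts gives a briefly overloaded embedding service or Elasticsearch node time to recover, which an immediate retry does not.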
8. Implementing Semantic Search
8.1 Search request DTO
java
package com.example.search.model;

import lombok.Data;

@Data
public class SearchRequest {
    /** Query text */
    private String query;
    /** Pagination: page number (starting at 1) */
    private int page = 1;
    /** Pagination: page size */
    private int size = 10;
    /** Category filter */
    private String category;
    /** Search mode: semantic, keyword, or hybrid */
    private String mode = "hybrid";
}
8.2 Search result DTO
java
package com.example.search.model;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.List;
import java.util.Map;

@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class SearchResult {
    private long total;
    private int page;
    private int size;
    private int totalPages;
    private List<SearchHit> hits;

    @Data
    @Builder
    @NoArgsConstructor
    @AllArgsConstructor
    public static class SearchHit {
        private String id;
        private String title;
        private String content;
        private String summary;
        private String category;
        private float score;
        // Field name -> highlighted fragments, the shape returned by Hit#highlight()
        private Map<String, List<String>> highlight;
    }
}
8.3 Core semantic search implementation
java
package com.example.search.service;
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch._types.FieldSort;
import co.elastic.clients.elasticsearch._types.KnnQuery;
import co.elastic.clients.elasticsearch._types.SortOptions;
import co.elastic.clients.elasticsearch._types.query_dsl.*;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.elasticsearch.core.search.Hit;
// co.elastic.clients.elasticsearch.core.SearchRequest is deliberately not imported:
// it would clash with the DTO of the same name imported below
import com.example.search.model.SearchRequest;
import com.example.search.model.SearchResult;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import java.io.IOException;
import java.util.List;
import java.util.Map;
@Slf4j
@Service
@RequiredArgsConstructor
public class SearchService {
private static final String INDEX_NAME = "articles";
private final ElasticsearchClient esClient;
private final EmbeddingService embeddingService;
/**
* Single entry point for all search modes
*/
public SearchResult search(SearchRequest request) throws IOException {
return switch (request.getMode()) {
case "semantic" -> semanticSearch(request);
case "keyword" -> keywordSearch(request);
default -> hybridSearch(request);
};
}
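/**
* Pure semantic search (kNN).
* Embeds the user query and retrieves the nearest documents in vector space.
*/
public SearchResult semanticSearch(SearchRequest request) throws IOException {
// 1. Embed the query text; the 8.10 client takes the query vector as List<Float>
float[] embedding = embeddingService.embed(request.getQuery());
List<Float> queryVector = new java.util.ArrayList<>(embedding.length);
for (float v : embedding) {
queryVector.add(v);
}
// 2. Build the search request
int from = (request.getPage() - 1) * request.getSize();
// kNN returns at most k hits, so k has to cover the requested page
int k = Math.max(request.getSize() * 5, from + request.getSize());
// num_candidates must not be smaller than k and may not exceed 10,000
int numCandidates = Math.min(10_000, Math.max(100, k * 2));
Query filterQuery = buildFilterQuery(request);
// kNN clause
KnnQuery knn = KnnQuery.of(kq -> kq
.field("content_vector")
.queryVector(queryVector)
.k(k)
.numCandidates(numCandidates)
.filter(filterQuery)
);
SearchResponse<Map> response = esClient.search(s -> s
.index(INDEX_NAME)
.from(from)
.size(request.getSize())
.knn(knn)
.highlight(h -> h
.fields("title", hf -> hf.preTags("<em>").postTags("</em>"))
.fields("content", hf -> hf
.preTags("<em>").postTags("</em>")
.fragmentSize(200)
.numberOfFragments(3)
)
),
Map.class
);
return buildSearchResult(response, request);
}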
/**
* Keyword search (BM25).
* Classic full-text retrieval, suited to exact-match scenarios.
*/
public SearchResult keywordSearch(SearchRequest request) throws IOException {
int from = (request.getPage() - 1) * request.getSize();
Query textQuery = Query.of(q -> q
.bool(b -> b
.should(
Query.of(m -> m.match(ma -> ma
.field("title").query(request.getQuery()).boost(3.0f)
)),
Query.of(m -> m.match(ma -> ma
.field("content").query(request.getQuery()).boost(1.0f)
))
)
.minimumShouldMatch("1")
.filter(buildFilterQuery(request))
)
);
SearchResponse<Map> response = esClient.search(s -> s
.index(INDEX_NAME)
.from(from)
.size(request.getSize())
.query(textQuery)
.sort(SortOptions.of(so -> so
.field(FieldSort.of(f -> f.field("_score").order(co.elastic.clients.elasticsearch._types.SortOrder.Desc)))
))
.highlight(h -> h
.fields("title", hf -> hf.preTags("<em>").postTags("</em>"))
.fields("content", hf -> hf.preTags("<em>").postTags("</em>"))
),
Map.class
);
return buildSearchResult(response, request);
}
/**
* Hybrid search.
* Runs kNN vector search and BM25 keyword search in one request and fuses the two rankings with RRF.
*/
public SearchResult hybridSearch(SearchRequest request) throws IOException {
// 1. Embed the query; the 8.10 client takes the query vector as List<Float>
float[] embedding = embeddingService.embed(request.getQuery());
List<Float> queryVector = new java.util.ArrayList<>(embedding.length);
for (float v : embedding) {
queryVector.add(v);
}
int from = (request.getPage() - 1) * request.getSize();
// Both k and the RRF window have to cover the requested page
int k = Math.max(request.getSize() * 5, from + request.getSize());
int numCandidates = Math.min(10_000, Math.max(100, k * 2));
// 2. kNN part
KnnQuery knn = KnnQuery.of(kq -> kq
.field("content_vector")
.queryVector(queryVector)
.k(k)
.numCandidates(numCandidates)
.filter(buildFilterQuery(request))
);
// 3. BM25 part
Query textQuery = Query.of(q -> q
.bool(b -> b
.should(
Query.of(m -> m.match(ma -> ma
.field("title").query(request.getQuery()).boost(3.0f)
)),
Query.of(m -> m.match(ma -> ma
.field("content").query(request.getQuery()).boost(1.0f)
))
)
.minimumShouldMatch("1")
.filter(buildFilterQuery(request))
)
);
// 4. Combined search.
// Highlighting is left out here: the 8.10 RRF documentation lists highlighting among the
// features that cannot be combined with rank; run a separate query if highlights are needed.
SearchResponse<Map> response = esClient.search(s -> s
.index(INDEX_NAME)
.from(from)
.size(request.getSize())
.knn(knn)
.query(textQuery)
.rank(r -> r
.rrf(rf -> rf.windowSize((long) k))
),
Map.class
);
return buildSearchResult(response, request);
}
/**
* Builds the filter query (an empty bool query matches everything when no category is given)
*/
private Query buildFilterQuery(SearchRequest request) {
BoolQuery.Builder boolBuilder = new BoolQuery.Builder();
if (request.getCategory() != null && !request.getCategory().isBlank()) {
boolBuilder.filter(f -> f
.term(t -> t.field("category").value(request.getCategory()))
);
}
return Query.of(q -> q.bool(boolBuilder.build()));
}
/**
* Maps the Elasticsearch response to the result DTO
*/
private SearchResult buildSearchResult(SearchResponse<Map> response, SearchRequest request) {
long total = response.hits().total().value();
int totalPages = (int) Math.ceil((double) total / request.getSize());
List<SearchResult.SearchHit> hits = response.hits().hits().stream()
.map(this::mapHit)
.toList();
return SearchResult.builder()
.total(total)
.page(request.getPage())
.size(request.getSize())
.totalPages(totalPages)
.hits(hits)
.build();
}
private SearchResult.SearchHit mapHit(Hit<Map> hit) {
Map<String, Object> source = hit.source();
return SearchResult.SearchHit.builder()
.id(hit.id())
.title((String) source.get("title"))
.content((String) source.get("content"))
.summary((String) source.get("summary"))
.category((String) source.get("category"))
.score(hit.score() != null ? hit.score().floatValue() : 0f)
.highlight(hit.highlight())
.build();
}
}
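The kNN clauses above combine three numbers that constrain each other: the page offset, `k`, and `num_candidates`. A kNN search returns at most `k` hits, `num_candidates` must not be smaller than `k`, and Elasticsearch caps `num_candidates` at 10,000. The sketch below derives consistent values from a page request; the class name `KnnPaging` and the "twice k, at least 100" heuristic are assumptions for illustration, not values prescribed by Elasticsearch.

```java
/** Hypothetical helper that derives consistent kNN paging parameters. */
final class KnnPaging {

    /** from: result offset, k: neighbors to return, numCandidates: per-shard candidates. */
    record Params(int from, int k, int numCandidates) {}

    static final int MAX_CANDIDATES = 10_000; // Elasticsearch upper bound for num_candidates

    static Params of(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long end = (long) page * size; // index just past the last requested hit
        if (end > MAX_CANDIDATES) {
            throw new IllegalArgumentException("page too deep for kNN search: " + end);
        }
        int from = (page - 1) * size;
        int k = (int) end; // k must reach the end of the requested page
        // Heuristic (assumption): examine twice as many candidates as k, but at least 100
        int numCandidates = Math.min(MAX_CANDIDATES, Math.max(100, k * 2));
        return new Params(from, k, numCandidates);
    }
}
```

For example, `KnnPaging.of(3, 30)` yields an offset of 60, `k` of 90 and 180 candidates, whereas a fixed `numCandidates(100)` would be smaller than `k` for that request.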
8.4 Search controller
java
package com.example.search.controller;
import com.example.search.model.SearchRequest;
import com.example.search.model.SearchResult;
import com.example.search.service.SearchService;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
@RestController
@RequestMapping("/api/search")
@RequiredArgsConstructor
public class SearchController {
private final SearchService searchService;
@GetMapping
public SearchResult search(
@RequestParam String q,
@RequestParam(defaultValue = "1") int page,
@RequestParam(defaultValue = "10") int size,
@RequestParam(required = false) String category,
@RequestParam(defaultValue = "hybrid") String mode
) throws Exception {
SearchRequest request = new SearchRequest();
request.setQuery(q);
request.setPage(page);
request.setSize(size);
request.setCategory(category);
request.setMode(mode);
return searchService.search(request);
}
}
9. Hybrid Search (Vector + Keyword)
9.1 Comparing the three search modes
Query: "How to learn Spring Boot"
┌──────────────────────────────────────────────────────────────────┐
│ Keyword search (BM25)                                            │
│ Hits:                                                            │
│ ✅ "Spring Boot Beginner Tutorial" (title contains the keywords) │
│ ✅ "What's New in Spring Boot 3.0" (title contains the keywords) │
│ ❌ "Java Web Framework Comparison" (related, but no keyword hit) │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ Semantic search (kNN)                                            │
│ Hits:                                                            │
│ ✅ "Spring Boot Beginner Tutorial" (semantically very close)     │
│ ✅ "Java Web Framework Comparison" (semantically related)        │
│ ✅ "Spring Framework Learning Path" (semantically related)       │
│ ❌ "Fixing Spring Boot Startup Errors" (keywords, other intent)  │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ Hybrid search                                                    │
│ Hits (ranked by RRF fusion):                                     │
│ 1. "Spring Boot Beginner Tutorial" (high on keyword + semantic)  │
│ 2. "What's New in Spring Boot 3.0" (high on keyword)             │
│ 3. "Java Web Framework Comparison" (high on semantic)            │
│ 4. "Spring Framework Learning Path" (semantically related)       │
└──────────────────────────────────────────────────────────────────┘
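9.2 RRF (Reciprocal Rank Fusion)
Elasticsearch 8.x supports RRF natively, and it is a common way to fuse hybrid search results. In 8.10 the feature is still a technical preview and is tied to a commercial subscription level, so check licensing before relying on it in production.
java
// Native RRF in ES 8.x (already used in section 8.3); windowSize takes a Long
.rank(r -> r
.rrf(rf -> rf.windowSize(50L))
)
How RRF works:
For each document, take its rank r_i in each individual result list and compute:
RRF_score = Σ 1/(k + r_i) (k is the rank constant, 60 by default)
Example:
Document A is ranked 1st by kNN and 3rd by BM25
RRF_score = 1/(60+1) + 1/(60+3) = 0.0164 + 0.0159 = 0.0323
Document B is ranked 5th by kNN and 1st by BM25
RRF_score = 1/(60+5) + 1/(60+1) = 0.0154 + 0.0164 = 0.0318
Result: A is ranked ahead of B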
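The arithmetic above can be reproduced with a few lines of plain Java. This is a minimal client-side sketch of the RRF formula for illustration only; inside Elasticsearch the fusion happens on the server, and the class name `RrfFusion` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Client-side illustration of Reciprocal Rank Fusion. */
final class RrfFusion {

    /**
     * Each inner list holds document ids ordered best-first (rank 1 = index 0).
     * A document's score is the sum of 1 / (rankConstant + rank) over the lists it appears in.
     */
    static Map<String, Double> fuse(List<List<String>> rankings, int rankConstant) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                int rank = i + 1;
                scores.merge(ranking.get(i), 1.0 / (rankConstant + rank), Double::sum);
            }
        }
        return scores;
    }

    /** Document ids ordered by fused score, highest first; ties are broken by id. */
    static List<String> ranked(Map<String, Double> scores) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id))
            .reversed()
            .thenComparing(Comparator.naturalOrder()));
        return ids;
    }
}
```

Feeding it a kNN list with A first and B fifth, and a BM25 list with B first and A third, gives A a score of 1/61 + 1/63 and B a score of 1/65 + 1/61, so A comes out ahead, as in the worked example.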
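9.3 Custom weighted fusion
If you need finer control over the weights, combine the scores with boosts instead of RRF:
java
/**
 * Custom weighted fusion (kNN weight + BM25 weight)
 */
public SearchResult weightedHybridSearch(SearchRequest request) throws IOException {
    // The 8.10 client takes the query vector as List<Float>
    float[] embedding = embeddingService.embed(request.getQuery());
    List<Float> queryVector = new java.util.ArrayList<>(embedding.length);
    for (float v : embedding) {
        queryVector.add(v);
    }
    int from = (request.getPage() - 1) * request.getSize();
    // kNN weight 0.7, BM25 weight 0.3
    KnnQuery knn = KnnQuery.of(k -> k
        .field("content_vector")
        .queryVector(queryVector)
        .k(50)
        .numCandidates(100)
        .boost(0.7f)
        .filter(buildFilterQuery(request))
    );
    // Apply the same category filter to the keyword side, otherwise BM25 hits bypass it
    Query textQuery = Query.of(q -> q
        .bool(b -> b
            .must(mu -> mu.multiMatch(mm -> mm
                .fields("title^3", "content")
                .query(request.getQuery())
                .type(TextQueryType.BestFields)
                .boost(0.3f)
            ))
            .filter(buildFilterQuery(request))
        )
    );
    // No rank(rrf) here: RRF uses rank positions only and would ignore both boosts.
    // Without it, each hit is scored as the sum of its boosted kNN and BM25 scores.
    SearchResponse<Map> response = esClient.search(s -> s
        .index(INDEX_NAME)
        .from(from)
        .size(request.getSize())
        .knn(knn)
        .query(textQuery)
        , Map.class
    );
    return buildSearchResult(response, request);
}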
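One caveat with boost-based weighting is that kNN scores and BM25 scores live on different scales, so a 0.7 / 0.3 split does not translate directly into a 70 / 30 influence. A possible alternative, sketched below under that assumption, is to run the two searches separately, min-max normalize each score list in the application, and then apply the weights. The class name `WeightedFusion` is hypothetical and this is not something Elasticsearch does for you.

```java
import java.util.HashMap;
import java.util.Map;

/** Client-side weighted fusion with min-max normalization (illustrative sketch). */
final class WeightedFusion {

    /** Rescales scores to [0, 1]; if all scores are equal, every document gets 1.0. */
    static Map<String, Double> minMax(Map<String, Double> raw) {
        Map<String, Double> normalized = new HashMap<>();
        if (raw.isEmpty()) {
            return normalized;
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : raw.values()) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        for (Map.Entry<String, Double> e : raw.entrySet()) {
            normalized.put(e.getKey(), range == 0 ? 1.0 : (e.getValue() - min) / range);
        }
        return normalized;
    }

    /** Weighted sum of the normalized scores; a document missing from one list contributes 0 there. */
    static Map<String, Double> fuse(Map<String, Double> knnScores, Map<String, Double> bm25Scores,
                                    double knnWeight, double bm25Weight) {
        Map<String, Double> fused = new HashMap<>();
        minMax(knnScores).forEach((id, s) -> fused.merge(id, knnWeight * s, Double::sum));
        minMax(bm25Scores).forEach((id, s) -> fused.merge(id, bm25Weight * s, Double::sum));
        return fused;
    }
}
```

Min-max normalization is sensitive to outliers in a single result list, which is one reason rank-based fusion such as RRF is often preferred when the score distributions are unknown.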
10. Result Optimization and Reranking
10.1 Search suggestions (Suggest)
java
/**
 * Search autocomplete
 */
public List<String> suggest(String prefix) throws IOException {
    SearchResponse<Map> response = esClient.search(s -> s
        .index(INDEX_NAME)
        .size(0)
        .suggest(sg -> sg
            .suggesters("title-suggest", sg1 -> sg1
                .prefix(prefix)
                .completion(c -> c
                    .field("title_suggest")
                    .skipDuplicates(true)
                    .size(10)
                )
            )
        ),
        Map.class
    );
    // Each Suggestion is a tagged union; the completion variant carries the options
    return response.suggest().get("title-suggest").stream()
        .flatMap(suggestion -> suggestion.completion().options().stream())
        .map(option -> option.text())
        .toList();
}
The completion suggester only works on a field of type completion, so the mapping needs such a field (and documents must populate it, for example with the title):
java
.properties("title_suggest", p -> p.completion(c -> c))
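10.2 Popular searches and personalized ranking
java
/**
 * Personalized search: combines the query with user behavior and document popularity
 */
public SearchResult personalizedSearch(SearchRequest request, String userId) throws IOException {
    float[] queryVector = embeddingService.embed(request.getQuery());
    // Fetch the user's interest vector (getUserInterestVector is not shown in this article)
    float[] userInterestVector = getUserInterestVector(userId);
    // Blend the query vector with the user interest vector
    float[] combinedVector = combineVectors(queryVector, userInterestVector, 0.7f, 0.3f);
    // ... the subsequent kNN search uses combinedVector
}

private float[] combineVectors(float[] a, float[] b, float weightA, float weightB) {
    if (a.length != b.length) {
        throw new IllegalArgumentException("Vectors must have the same dimension");
    }
    float[] result = new float[a.length];
    for (int i = 0; i < a.length; i++) {
        result[i] = a[i] * weightA + b[i] * weightB;
    }
    // Normalize to unit length
    float norm = 0;
    for (float v : result) norm += v * v;
    norm = (float) Math.sqrt(norm);
    if (norm == 0f) {
        return result; // zero vector: nothing to normalize, and dividing would produce NaN
    }
    for (int i = 0; i < result.length; i++) result[i] /= norm;
    return result;
}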
10.3 Multi-channel recall + Cross-Encoder reranking
User query → [vector recall]  ─────┐
           → [BM25 recall]    ─────┼──→ merge and deduplicate → [Cross-Encoder reranking] → final results
           → [tag recall]     ─────┘
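The "merge and deduplicate" step in the diagram can be sketched as follows. The `Candidate` record and the rule "keep the first occurrence of each id, in channel order" are assumptions for illustration; a real pipeline might instead keep the best score per id.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative merge step for multi-channel recall. */
final class RecallMerger {

    /** A recalled document: its id, the channel that found it, and that channel's score. */
    record Candidate(String id, String channel, double score) {}

    /**
     * Concatenates the recall lists in the given order and keeps only the first
     * occurrence of each document id, preserving encounter order.
     */
    static List<Candidate> mergeAndDeduplicate(List<List<Candidate>> recalls) {
        Map<String, Candidate> byId = new LinkedHashMap<>();
        for (List<Candidate> recall : recalls) {
            for (Candidate candidate : recall) {
                byId.putIfAbsent(candidate.id(), candidate);
            }
        }
        return new ArrayList<>(byId.values());
    }
}
```

The deduplicated list is what gets handed to the reranker, so each document is scored by the Cross-Encoder only once.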
java
// Requires: java.util.Comparator, java.util.List, java.util.stream.IntStream
/**
 * Cross-Encoder reranking.
 * Uses a dedicated reranking model to re-score the recalled candidates.
 */
public List<SearchResult.SearchHit> rerank(
        String query, List<SearchResult.SearchHit> candidates) {
    // Candidate texts to score against the query (title + content)
    List<String> documents = candidates.stream()
        .map(hit -> hit.getTitle() + " " + hit.getContent())
        .toList();
    // callRerankModel is a placeholder for your reranker client (not shown in this article);
    // it is expected to return one score per document, in input order
    List<Double> scores = callRerankModel(query, documents);
    // Sort candidate indices by score, highest first. Sorting indices instead of using
    // the hits as map keys keeps hits that compare equal from being collapsed.
    return IntStream.range(0, candidates.size())
        .boxed()
        .sorted(Comparator.comparingDouble((Integer i) -> scores.get(i)).reversed())
        .map(candidates::get)
        .toList();
}
11. Performance Optimization
11.1 Index build optimization
java
/**
 * Bulk indexing strategy for large imports.
 * Needs co.elastic.clients.elasticsearch.core.BulkRequest and BulkResponse in addition to the imports in 7.2.
 */
public void optimizedBulkIndex(List<ArticleDocument> docs) throws IOException {
    int batchSize = 500;
    int embeddingBatchSize = 50;
    // 1. Drop replicas while loading to speed up the index build
    esClient.indices().putSettings(ps -> ps
        .index(INDEX_NAME)
        .settings(s -> s.numberOfReplicas("0"))
    );
    try {
        // 2. Vectorize in batches
        for (int i = 0; i < docs.size(); i += embeddingBatchSize) {
            int end = Math.min(i + embeddingBatchSize, docs.size());
            List<String> batchTexts = docs.subList(i, end).stream()
                .map(d -> d.getTitle() + " " + d.getContent())
                .toList();
            List<float[]> vectors = embeddingService.embedBatch(batchTexts);
            for (int j = 0; j < batchTexts.size(); j++) {
                docs.get(i + j).setContentVector(vectors.get(j));
            }
        }
        // 3. Write to ES in bulk, without forcing a refresh per batch
        for (int i = 0; i < docs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, docs.size());
            List<ArticleDocument> batch = docs.subList(i, end);
            BulkRequest bulkRequest = new BulkRequest.Builder()
                .operations(batch.stream()
                    .map(doc -> BulkOperation.of(op -> op
                        .index(idx -> idx
                            .index(INDEX_NAME)
                            .id(doc.getId())
                            .document(doc)
                        )
                    ))
                    .toList()
                )
                .timeout(t -> t.time("5m"))
                .build();
            BulkResponse bulkResponse = esClient.bulk(bulkRequest);
            // Handle failed items...
        }
        // Make the new documents searchable once, at the end
        esClient.indices().refresh(r -> r.index(INDEX_NAME));
    } finally {
        // 4. Restore the replica count even if the import fails
        esClient.indices().putSettings(ps -> ps
            .index(INDEX_NAME)
            .settings(s -> s.numberOfReplicas("1"))
        );
    }
}
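11.2 Query performance optimization
java
// 1. Return only the fields you need; this also keeps the large vector fields out of the response
SearchResponse<Map> response = esClient.search(s -> s
    .index(INDEX_NAME)
    .source(src -> src
        .filter(f -> f.includes("id", "title", "summary", "category", "created_at"))
    )
    .knn(knn)
    .size(10)
    , Map.class
);
// 2. Use search_after instead of from/size for deep pagination.
// A kNN search returns at most k hits, so this matters mainly for keyword queries.
// The sort must be identical on every page and should end in a unique tiebreaker.
// _shard_doc is only available inside a point-in-time (PIT) search, so this sketch sorts on
// _score and created_at, which pages reliably only if created_at values are unique.
List<SortOptions> sort = List.of(
    SortOptions.of(so -> so.score(sc -> sc.order(SortOrder.Desc))),
    SortOptions.of(so -> so.field(f -> f.field("created_at").order(SortOrder.Desc)))
);
// First page
SearchResponse<Map> firstPage = esClient.search(s -> s
    .index(INDEX_NAME)
    .query(query)
    .size(10)
    .sort(sort)
    , Map.class
);
// Sort values of the last hit on the page
List<FieldValue> lastSort = firstPage.hits().hits()
    .get(firstPage.hits().hits().size() - 1).sort();
// Next page
SearchResponse<Map> nextPage = esClient.search(s -> s
    .index(INDEX_NAME)
    .query(query)
    .size(10)
    .searchAfter(lastSort)
    .sort(sort)
    , Map.class
);
// 3. Use collapse to keep only the top hit per group
SearchResponse<Map> collapsed = esClient.search(s -> s
    .index(INDEX_NAME)
    .query(query)
    .collapse(c -> c.field("category")) // one hit per category
    .size(10)
    , Map.class
);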
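11.3 Caching strategy
java
@Service
@RequiredArgsConstructor
public class CachedSearchService {
    private final SearchService searchService;
    private final EmbeddingService embeddingService;

    /**
     * Cache for popular queries.
     * The key includes category, page and size; otherwise page 2 or a filtered request
     * would be served the cached result of a different request.
     */
    @Cacheable(value = "search",
        key = "#request.query + '_' + #request.mode + '_' + #request.category + '_' + #request.page + '_' + #request.size",
        unless = "#result.total == 0")
    public SearchResult search(SearchRequest request) throws IOException {
        return searchService.search(request);
    }

    /**
     * Embedding cache (the vector for the same query text can be reused).
     * The text itself is the key: hashCode() values can collide and return the wrong vector.
     * @Cacheable works through a Spring proxy, so call this method from another bean.
     */
    @Cacheable(value = "embedding", key = "#text")
    public float[] getEmbedding(String text) {
        return embeddingService.embed(text);
    }
}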
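If the SpEL key expression becomes unwieldy, an explicit key type is one alternative. The sketch below is a hypothetical `SearchCacheKey` record that normalizes the query before it is used as a key; the normalization rules (trim, collapse whitespace, lower-case) are assumptions and should match how your queries are actually treated downstream.

```java
import java.util.Locale;

/** Hypothetical cache key covering every request field that changes the result. */
record SearchCacheKey(String query, String mode, String category, int page, int size) {

    /** Builds a key with a normalized query so trivially different inputs share a cache entry. */
    static SearchCacheKey of(String query, String mode, String category, int page, int size) {
        String normalizedQuery = query.trim()
            .replaceAll("\\s+", " ")
            .toLowerCase(Locale.ROOT);
        String normalizedCategory = (category == null || category.isBlank()) ? "" : category.trim();
        return new SearchCacheKey(normalizedQuery, mode, normalizedCategory, page, size);
    }
}
```

Because records implement `equals` and `hashCode` over all components, two requests map to the same cache entry only when query, mode, category, page and size all agree.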
11.4 使用 Java 21 Virtual Threads 优化
java
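/**
 * Runs semantic search and keyword search in parallel, then merges the results.
 * StructuredTaskScope is a preview API in Java 21 (JEP 453): compile and run with --enable-preview.
 * Forked subtasks run on virtual threads by default.
 * Requires: import java.util.concurrent.StructuredTaskScope;
 *           import java.util.concurrent.StructuredTaskScope.Subtask;
 */
public SearchResult asyncHybridSearch(SearchRequest request) throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        // Fork both searches: semantic and keyword
        Subtask<SearchResult> semanticTask = scope.fork(() ->
            searchService.semanticSearch(request)
        );
        Subtask<SearchResult> keywordTask = scope.fork(() ->
            searchService.keywordSearch(request)
        );
        // Wait for both and rethrow the first failure;
        // without this, get() on a failed subtask throws IllegalStateException
        scope.join().throwIfFailed();
        // Merge the results
        return mergeResults(
            semanticTask.get(),
            keywordTask.get(),
            0.6 // weight of the semantic search
        );
    }
}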
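If enabling preview features is not an option, the same fan-out can be written with the virtual-thread executor, which is a final API in Java 21. The following is a minimal sketch under that assumption; `ParallelSearch` and `runBoth` are hypothetical names, and the two callables stand in for `semanticSearch` and `keywordSearch`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BinaryOperator;

/** Hypothetical helper: runs two searches concurrently on virtual threads. */
final class ParallelSearch {
    private ParallelSearch() {}

    /** Submits both tasks, each on its own virtual thread, then merges their results. */
    static <T> T runBoth(Callable<T> semantic, Callable<T> keyword, BinaryOperator<T> merge) {
        // ExecutorService is AutoCloseable since Java 19; close() waits for submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<T> semanticFuture = executor.submit(semantic);
            Future<T> keywordFuture = executor.submit(keyword);
            // get() blocks the caller until the task finishes and surfaces task failures.
            return merge.apply(semanticFuture.get(), keywordFuture.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for search results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A search task failed", e.getCause());
        }
    }
}
```

Unlike `ShutdownOnFailure`, this version does not cancel the second task when the first one fails; it simply waits for both before the executor closes.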
12. Complete project structure
semantic-search/
├── pom.xml
├── src/main/java/com/example/search/
│ ├── SemanticSearchApplication.java
│ │
│ ├── config/
│ │ ├── ElasticsearchConfig.java # ES client configuration
│ │ └── WebClientConfig.java # HTTP client configuration
│ │
│ ├── model/
│ │ ├── ArticleDocument.java # Document entity
│ │ ├── SearchRequest.java # Search request DTO
│ │ └── SearchResult.java # Search result DTO
│ │
│ ├── service/
│ │ ├── EmbeddingService.java # Embedding service
│ │ ├── DocumentService.java # Document indexing service
│ │ ├── SearchService.java # Core search service
│ │ ├── IndexService.java # Index management service
│ │ └── CacheService.java # Cache service
│ │
│ ├── controller/
│ │ ├── SearchController.java # Search API
│ │ └── DocumentController.java # Document management API
│ │
│ └── exception/
│ ├── GlobalExceptionHandler.java # Global exception handling
│ └── SearchException.java # Custom exception
│
├── src/main/resources/
│ ├── application.yml
│ └── application-prod.yml
│
└── src/test/java/com/example/search/
├── service/
│ ├── EmbeddingServiceTest.java
│ └── SearchServiceTest.java
└── integration/
└── SearchIntegrationTest.java
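13. Deployment and operations
13.1 Recommended production settings for Elasticsearch 8.10
yaml
# elasticsearch.yml
# Cluster
cluster.name: semantic-search-prod
node.name: node-1
network.host: 0.0.0.0
# Memory
bootstrap.memory_lock: true
# Heap size is not an elasticsearch.yml setting. Set it in config/jvm.options.d/heap.options
# (-Xms8g and -Xmx8g, one per line) or through the ES_JAVA_OPTS environment variable.
# Indexing
# indices.query.bool.max_clause_count is deprecated since 8.0 and has no effect, so it is omitted.
action.auto_create_index: false
# Indexing buffer (default 10%); a larger buffer helps heavy bulk writes of vectors
indices.memory.index_buffer_size: 30%
# Security
xpack.security.enabled: true
xpack.security.http.ssl.enabled: true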
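When sizing node memory, keep in mind that HNSW vector data is read through the operating system's filesystem cache rather than the JVM heap, so RAM left outside the heap matters for kNN latency. The sketch below estimates the raw size of the vectors alone, assuming `float` elements (4 bytes per dimension). It is a lower bound: the HNSW graph links and other index structures come on top. `VectorMemoryEstimate` is a hypothetical helper name.

```java
/** Hypothetical helper: back-of-the-envelope size of raw vector data. */
final class VectorMemoryEstimate {
    private VectorMemoryEstimate() {}

    /** Raw vector bytes only; excludes HNSW graph links and other index structures. */
    static long rawVectorBytes(long numVectors, int dims, int bytesPerDim) {
        // multiplyExact fails fast on overflow instead of silently wrapping.
        return Math.multiplyExact(Math.multiplyExact(numVectors, (long) dims), (long) bytesPerDim);
    }

    /** Converts a byte count to GiB (1 GiB = 1024^3 bytes). */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }
}
```

For 1 million documents with 1024-dimensional float vectors, `rawVectorBytes(1_000_000, 1024, 4)` gives 4,096,000,000 bytes, roughly 3.8 GiB per copy of the data; each replica needs the same amount again on its own node.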
13.2 Monitoring metrics
java
/**
 * Periodic health check for ES search.
 * INDEX_NAME must be a concrete index name here: the stats response is keyed
 * by concrete index, so an alias would yield a null entry.
 */
@Scheduled(fixedRate = 60000)
public void checkSearchHealth() throws IOException {
    // Cluster health
    var health = esClient.cluster().health();
    log.info("ES cluster status: {}, nodes: {}", health.status(), health.numberOfNodes());
    // Document count
    var count = esClient.count(c -> c.index(INDEX_NAME));
    log.info("Index {} document count: {}", INDEX_NAME, count.count());
    // Index size (primary shards only)
    var stats = esClient.indices().stats(s -> s.index(INDEX_NAME));
    long sizeInBytes = stats.indices().get(INDEX_NAME).primaries().store().sizeInBytes();
    log.info("Index {} size: {} MB", INDEX_NAME, sizeInBytes / 1024 / 1024);
}
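13.3 Rebuilding the vector index
The vector index has to be rebuilt whenever the embedding model changes, because vectors produced by different models are not comparable:
java
/**
 * Rebuilds the vector index:
 * 1. Create the new index
 * 2. Re-embed the documents in batches and write them to the new index
 * 3. Clear the scroll context
 * 4. Switch the alias atomically
 * 5. Delete the old index
 * Assumes the alias currently points to INDEX_NAME and clients query through the alias.
 */
public void rebuildVectorIndex(String newModel) throws Exception {
    String newIndex = INDEX_NAME + "_v2_" + System.currentTimeMillis();
    String alias = "articles-alias";
    // 1. Create the new index (1024 must match the output dimension of newModel)
    esClient.indices().create(c -> c.index(newIndex)
        .settings(buildVectorIndexSettings(1024))
        .mappings(buildVectorIndexMapping(1024))
    );
    // 2. Read documents from the old index, generate new vectors, write to the new index
    SearchResponse<Map> scrollResponse = esClient.search(s -> s
        .index(INDEX_NAME)
        .size(1000)
        .scroll(Time.of(t -> t.time("5m")))
        , Map.class
    );
    String scrollId = scrollResponse.scrollId();
    List<Hit<Map>> hits = scrollResponse.hits().hits();
    while (!hits.isEmpty()) {
        List<ArticleDocument> docs = hits.stream().map(this::mapToDocument).toList();
        documentService.bulkIndexToIndex(newIndex, docs, newModel);
        // Lambdas can only capture effectively final variables, so copy the current id
        final String currentScrollId = scrollId;
        scrollResponse = esClient.scroll(sc -> sc
            .scrollId(currentScrollId)
            .scroll(Time.of(t -> t.time("5m")))
            , Map.class
        );
        scrollId = scrollResponse.scrollId();
        hits = scrollResponse.hits().hits();
    }
    // 3. Clear the scroll context
    final String lastScrollId = scrollId;
    esClient.clearScroll(cs -> cs.scrollId(lastScrollId));
    // 4. Atomic alias switch: both actions are applied in a single updateAliases request
    esClient.indices().updateAliases(u -> u
        .actions(a -> a.remove(r -> r.index(INDEX_NAME).alias(alias)))
        .actions(a -> a.add(ad -> ad.index(newIndex).alias(alias)))
    );
    // 5. Delete the old index (keep it for a while instead if you want a rollback path)
    esClient.indices().delete(d -> d.index(INDEX_NAME));
    log.info("Vector index rebuild complete: {} -> {}", INDEX_NAME, newIndex);
}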
14. Summary and outlook
Key technical points
┌────────────────────────────────────────────────────────────────────────┐
│                   Semantic search: the full picture                    │
│                                                                        │
│  Text ──→ Embedding model ──→ dense vector (1024 dims)                 │
│                                                                        │
│  Indexing:                                                             │
│  title + content ──→ embedding ──→ ES dense_vector (HNSW)              │
│                                                                        │
│  Search:                                                               │
│  user query ──→ embedding ──→ approximate kNN ──→ candidates           │
│       │                                                                │
│       └──→ BM25 keyword search ──→ candidates                          │
│                                                                        │
│  Fusion and ranking:                                                   │
│  RRF over kNN + BM25 ──→ cross-encoder rerank ──→ final results        │
└────────────────────────────────────────────────────────────────────────┘
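As a compact recap of the fusion step in the diagram, Reciprocal Rank Fusion combines result lists by rank position rather than by raw score, which sidesteps the fact that kNN similarity and BM25 scores live on different scales. The sketch below assumes the usual formula, score(d) = Σ 1 / (k + rank), with ranks starting at 1 and k = 60 as a commonly used constant. `ReciprocalRankFusion` is a hypothetical class name, and the lists hold document IDs only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: client-side Reciprocal Rank Fusion over ranked ID lists. */
final class ReciprocalRankFusion {
    private ReciprocalRankFusion() {}

    /** Fuses ranked lists: score(d) = sum over lists of 1 / (k + rank), rank starting at 1. */
    static List<String> fuse(int k, List<List<String>> rankedLists) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                // i is zero-based, so the rank is i + 1.
                scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties broken by ID for a deterministic order.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }
}
```

With a kNN ranking of `[a, b, c]` and a BM25 ranking of `[b, d, a]`, `fuse(60, ...)` returns `[b, a, d, c]`: `b` and `a` appear in both lists and rise to the top, with `b` ahead because its two ranks (2 and 1) beat those of `a` (1 and 3).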
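Performance baseline (reference values)
| Scenario | QPS | P99 latency | Notes |
|---|---|---|---|
| kNN search (1M docs) | 200+ | <50 ms | HNSW, ef_search=256 (num_candidates in Elasticsearch) |
| Hybrid search (1M docs) | 150+ | <80 ms | kNN + BM25 + RRF |
| Embedding (local model) | 50+ | <30 ms | bge-large-zh, batch=50 |
| Bulk indexing (1K docs per batch) | - | <2 s | Includes embedding generation |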
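Outlook
- Multimodal search: joint vector retrieval over text and images
- Real-time updates: incremental vector indexing with near-real-time search
- RAG architecture: question-answering search backed by an LLM
- Multi-vector strategy: retrieve separately on title_vector and content_vector
- Vector quantization: Product Quantization (PQ) to reduce memory usage
- Distributed deployment: multiple shards, replicas, and cross-cluster search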
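This document builds on the native kNN search in Elasticsearch 8.10, combined with Spring Boot 3.2 and Java 21 virtual threads, to lay out a production-ready approach to semantic search. Tune it against your own business data: num_candidates (the Elasticsearch counterpart of HNSW's ef_search), m, and ef_construction in particular should be adjusted to your data volume and accuracy requirements.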