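Spring AI Redis Vector Store Reference

Overview

This section walks you through setting up RedisVectorStore to store document embeddings and perform similarity searches.

Redis is an open source (BSD licensed), in-memory data structure store used as a database, cache, message broker, and streaming engine. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, HyperLogLogs, geospatial indexes, and streams.

Redis Search and Query extends the core features of Redis OSS and allows you to use Redis as a vector database:

- Store vectors and the associated metadata within hashes or JSON documents
- Retrieve vectors
- Perform vector similarity searches (KNN)
- Perform range-based vector searches with a radius threshold
- Perform full-text searches on TEXT fields
- Use multiple distance metrics (COSINE, L2, IP) and vector algorithms (HNSW, FLAT)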

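Prerequisites

- A Redis Stack instance
  - Redis Cloud (recommended)
  - Docker image redis/redis-stack:latest
- An EmbeddingModel instance to compute the document embeddings. Several options are available.
- If required, an API key for the EmbeddingModel that generates the embeddings stored by the RedisVectorStore.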

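Auto-configuration

Note: The artifact names of the Spring AI auto-configuration and starter modules have changed significantly. Refer to the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the Redis vector store. To enable it, add the following dependency to your project's Maven pom.xml file: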

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-redis</artifactId>
</dependency>

Or add it to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-vector-store-redis'
}

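Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Refer to the Artifact Repositories section to add Maven Central and/or snapshot repositories to your build file.

The vector store implementation can initialize the required schema for you, but you must opt in by setting the initializeSchema boolean in the appropriate constructor or by setting ...initialize-schema=true in the application.properties file.

⚠️ This is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.

Refer to the list of configuration parameters for the vector store to learn about the default values and configuration options.

Additionally, you need a configured EmbeddingModel bean. Refer to the EmbeddingModel section for more information.

You can now auto-wire the RedisVectorStore as a vector store in your application: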

@Autowired VectorStore vectorStore;

// ...

List<Document> documents = List.of(
    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
    new Document("The World is Big and Salvation Lurks Around the Corner"),
    new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));

// Add the documents to Redis
vectorStore.add(documents);

// Retrieve documents similar to a query
List<Document> results = this.vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());

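Configuration Properties

To connect to Redis and use the RedisVectorStore, you need to provide access details for your instance. A simple configuration can be provided through Spring Boot's application.yml: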

spring:
  data:
    redis:
      url: <redis instance url>
  ai:
    vectorstore:
      redis:
        initialize-schema: true
        index-name: custom-index
        prefix: custom-prefix

The Redis connection can also be configured through Spring Boot's application.properties:

spring.data.redis.host=localhost
spring.data.redis.port=6379
spring.data.redis.username=default
spring.data.redis.password=

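Properties starting with spring.ai.vectorstore.redis.* are used to configure the RedisVectorStore:

| Property | Description | Default value |
| --- | --- | --- |
| `spring.ai.vectorstore.redis.initialize-schema` | Whether to initialize the required schema | `false` |
| `spring.ai.vectorstore.redis.index-name` | Name of the index that stores the vectors | `spring-ai-index` |
| `spring.ai.vectorstore.redis.prefix` | Prefix for Redis keys | `embedding:` |
| `spring.ai.vectorstore.redis.distance-metric` | Distance metric for vector similarity (COSINE, L2, IP) | `COSINE` |
| `spring.ai.vectorstore.redis.vector-algorithm` | Vector indexing algorithm (HNSW, FLAT) | `HNSW` |
| `spring.ai.vectorstore.redis.hnsw-m` | HNSW: maximum number of outgoing connections | `16` |
| `spring.ai.vectorstore.redis.hnsw-ef-construction` | HNSW: maximum number of connections considered during index building | `200` |
| `spring.ai.vectorstore.redis.hnsw-ef-runtime` | HNSW: number of connections considered during search | `10` |
| `spring.ai.vectorstore.redis.default-range-threshold` | Default radius threshold for range searches | `0.8` |
| `spring.ai.vectorstore.redis.text-scorer` | Text scoring algorithm (BM25, TFIDF, BM25STD, DISMAX, DOCSCORE) | `BM25` |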

Metadata Filtering

You can also use the generic, portable metadata filters with Redis.

For example, you can use the text expression language:

vectorStore.similaritySearch(SearchRequest.builder()
        .query("The World")
        .topK(TOP_K)
        .similarityThreshold(SIMILARITY_THRESHOLD)
        .filterExpression("country in ['UK', 'NL'] && year >= 2020").build());

Or programmatically, using the Filter.Expression DSL:

FilterExpressionBuilder b = new FilterExpressionBuilder();

vectorStore.similaritySearch(SearchRequest.builder()
        .query("The World")
        .topK(TOP_K)
        .similarityThreshold(SIMILARITY_THRESHOLD)
        .filterExpression(b.and(
                b.in("country", "UK", "NL"),
                b.gte("year", 2020)).build()).build());

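These (portable) filter expressions are automatically converted into Redis search queries. For example, this portable filter expression:

country in ['UK', 'NL'] && year >= 2020

is converted into the proprietary Redis filter format:

@country:{UK | NL} @year:[2020 inf]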
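
To make the mapping concrete, the following is a minimal sketch of how a tag `in` clause and a numeric `>=` clause could be rendered in the Redis query syntax shown above. The class `RedisFilterSketch` and its helpers `tagIn`, `numericGte`, and `and` are hypothetical names used only for illustration. They are not part of Spring AI, which performs this conversion internally.

```java
import java.util.List;

// Hypothetical helper, not part of Spring AI: renders two portable filter
// clauses in the Redis query syntax shown in the example above.
class RedisFilterSketch {

    // country in ['UK', 'NL']  ->  @country:{UK | NL}
    static String tagIn(String field, List<String> values) {
        return "@" + field + ":{" + String.join(" | ", values) + "}";
    }

    // year >= 2020  ->  @year:[2020 inf]
    static String numericGte(String field, long value) {
        return "@" + field + ":[" + value + " inf]";
    }

    // In a Redis query, clauses separated by a space are intersected (AND).
    static String and(String left, String right) {
        return left + " " + right;
    }

    public static void main(String[] args) {
        String query = and(
                tagIn("country", List.of("UK", "NL")),
                numericGte("year", 2020));
        System.out.println(query); // prints "@country:{UK | NL} @year:[2020 inf]"
    }
}
```

The sketch covers only the two operators used in the example. A real converter also has to handle the remaining comparison operators, negation, grouping, and escaping of special characters.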

Manual Configuration

Instead of using the Spring Boot auto-configuration, you can configure the Redis vector store manually. To do so, add the spring-ai-redis-store dependency to your project:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-redis-store</artifactId>
</dependency>

Or add it to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-redis-store'
}

Create a RedisClient bean:

@Bean
public RedisClient jedisClient() {
    return RedisClient.builder().hostAndPort("<host>", 6379).build();
}

Then create the RedisVectorStore bean using the builder pattern:

@Bean
public VectorStore vectorStore(RedisClient jedisClient, EmbeddingModel embeddingModel) {
    return RedisVectorStore.builder(jedisClient, embeddingModel)
        .indexName("custom-index")                // Optional: defaults to "spring-ai-index"
        .prefix("custom-prefix")                  // Optional: defaults to "embedding:"
        .contentFieldName("content")              // Optional: field for the document content
        .embeddingFieldName("embedding")          // Optional: field for the vector embedding
        .vectorAlgorithm(Algorithm.HNSW)          // Optional: HNSW or FLAT (defaults to HNSW)
        .distanceMetric(DistanceMetric.COSINE)    // Optional: COSINE, L2, or IP (defaults to COSINE)
        .hnswM(16)                                // Optional: HNSW connections (defaults to 16)
        .hnswEfConstruction(200)                  // Optional: HNSW build parameter (defaults to 200)
        .hnswEfRuntime(10)                        // Optional: HNSW search parameter (defaults to 10)
        .defaultRangeThreshold(0.8)               // Optional: default radius for range searches
        .textScorer(TextScorer.BM25)              // Optional: text scoring algorithm (defaults to BM25)
        .metadataFields(                          // Optional: define metadata fields used for filtering
            MetadataField.tag("country"),
            MetadataField.numeric("year"),
            MetadataField.text("description"))
        .initializeSchema(true)                   // Optional: defaults to false
        .batchingStrategy(new TokenCountBatchingStrategy()) // Optional: defaults to TokenCountBatchingStrategy
        .build();
}

// This can be any EmbeddingModel implementation
@Bean
public EmbeddingModel embeddingModel() {
    return new OpenAiEmbeddingModel(OpenAiEmbeddingOptions.builder().apiKey(System.getenv("OPENAI_API_KEY")).build());
}

You must explicitly list the names and types (TAG, TEXT, or NUMERIC) of all metadata fields used in filter expressions. The metadataFields call above registers the filterable metadata fields: country of type TAG, year of type NUMERIC, and description of type TEXT.

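Accessing the Native Client

The Redis vector store implementation provides access to the underlying native Redis client (RedisClient) through the getNativeClient() method:

RedisVectorStore vectorStore = context.getBean(RedisVectorStore.class);
Optional<RedisClient> nativeClient = vectorStore.getNativeClient();

if (nativeClient.isPresent()) {
    RedisClient jedisClient = nativeClient.get();
    // Use the native client for Redis-specific operations
}

The native client gives you access to Redis-specific features and operations that may not be exposed through the VectorStore interface.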

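Distance Metrics

The Redis vector store supports three distance metrics for vector similarity:

- COSINE: cosine similarity (default). Measures the cosine of the angle between vectors.
- L2: Euclidean distance. Measures the straight-line distance between vectors.
- IP: inner product. Measures the dot product between vectors.

Each metric is automatically normalized to a similarity score between 0 and 1, where 1 is the most similar.

RedisVectorStore vectorStore = RedisVectorStore.builder(jedisClient, embeddingModel)
    .distanceMetric(DistanceMetric.COSINE)  // or L2, IP
    .build();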
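
As a rough illustration of what normalizing each metric to a 0-1 similarity score can look like, the sketch below computes the three raw measures for two vectors and maps each onto [0, 1]. The mapping formulas are assumptions chosen for illustration (they are common choices, and the inner-product mapping additionally assumes unit-length vectors). The source does not state which formulas RedisVectorStore actually applies, and `DistanceMetricSketch` is a hypothetical class.

```java
// Hypothetical illustration; the normalization formulas below are assumptions,
// not necessarily the ones RedisVectorStore applies internally.
class DistanceMetricSketch {

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Cosine distance = 1 - cos(angle); ranges from 0 (same direction) to 2 (opposite).
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    // Euclidean (straight-line) distance; ranges from 0 upward.
    static double l2Distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Assumed mapping: distance 0 -> 1.0, distance 2 -> 0.0.
    static double cosineToSimilarity(double distance) {
        return (2.0 - distance) / 2.0;
    }

    // Assumed mapping: distance 0 -> 1.0, large distances approach 0.0.
    static double l2ToSimilarity(double distance) {
        return 1.0 / (1.0 + distance);
    }

    // Assumed mapping for unit-length vectors, whose inner product lies in [-1, 1].
    static double ipToSimilarity(double innerProduct) {
        return (1.0 + innerProduct) / 2.0;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 0.0};
        double[] y = {0.0, 1.0};
        System.out.println("COSINE: " + cosineToSimilarity(cosineDistance(x, y)));
        System.out.println("L2:     " + l2ToSimilarity(l2Distance(x, y)));
        System.out.println("IP:     " + ipToSimilarity(dot(x, y)));
    }
}
```

Whatever the exact formulas, the point of the normalization is that a similarity threshold means the same thing ("closer to 1 is more similar") regardless of the configured metric.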

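HNSW Algorithm Configuration

By default, the Redis vector store uses the HNSW (Hierarchical Navigable Small World) algorithm for efficient approximate nearest-neighbor search. You can tune the HNSW parameters for your use case:

RedisVectorStore vectorStore = RedisVectorStore.builder(jedisClient, embeddingModel)
    .vectorAlgorithm(Algorithm.HNSW)
    .hnswM(32)                    // Maximum outgoing connections per node (default: 16)
    .hnswEfConstruction(100)      // Connections considered during index building (default: 200)
    .hnswEfRuntime(50)            // Connections considered during search (default: 10)
    .build();

Parameter guidelines:

- M: higher values improve recall but increase memory usage and indexing time. Typical values: 12-48.
- EF_CONSTRUCTION: higher values improve index quality but increase build time. Typical values: 100-500.
- EF_RUNTIME: higher values improve search accuracy but increase latency. Typical values: 10-100.

For smaller datasets, or when you need exact results, use the FLAT algorithm instead:

RedisVectorStore vectorStore = RedisVectorStore.builder(jedisClient, embeddingModel)
    .vectorAlgorithm(Algorithm.FLAT)
    .build();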

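Text Search

The Redis vector store provides text search through the full-text search capabilities of the Redis Query Engine. This lets you find documents by keywords and phrases in TEXT fields:

// Search for documents containing specific text
List<Document> textResults = vectorStore.searchByText(
    "machine learning",   // search query
    "content",            // field to search (must be of type TEXT)
    10,                   // limit
    "category == 'AI'"    // optional filter expression
);

Text search supports:

- Single-term searches
- Exact phrase matching when inOrder is true
- Term-based searches with OR semantics when inOrder is false
- Stopword filtering to ignore common words
- Multiple text scoring algorithms

Configure the text search behavior at build time:

RedisVectorStore vectorStore = RedisVectorStore.builder(jedisClient, embeddingModel)
    .textScorer(TextScorer.TFIDF)                    // text scoring algorithm
    .inOrder(true)                                   // match terms in order
    .stopwords(Set.of("is", "a", "the", "and"))      // ignore common words
    .metadataFields(MetadataField.text("description")) // define TEXT fields
    .build();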

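Text Scoring Algorithms

Several text scoring algorithms are available:

- BM25: a modern variant of TF-IDF with term saturation (default)
- TFIDF: classic term frequency-inverse document frequency
- BM25STD: standardized BM25
- DISMAX: disjunction max
- DOCSCORE: document score

Scores are normalized to the 0-1 range so that they are consistent with vector similarity scores.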
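
To illustrate what "term saturation" means when comparing BM25 with TFIDF, the sketch below evaluates the textbook per-term formulas for a growing term frequency. It is a hedged illustration only: the class `TextScoringSketch`, the parameter values k1 = 1.2 and b = 0.75 (conventional defaults in the literature), and the fixed IDF of 1.0 are assumptions, and the exact formulas used by Redis may differ.

```java
// Hypothetical illustration of textbook scoring formulas; Redis may use variants.
class TextScoringSketch {

    // Classic TF-IDF contribution of one term: grows linearly with term frequency.
    static double tfIdf(double termFrequency, double idf) {
        return termFrequency * idf;
    }

    // Okapi BM25 contribution of one term: saturates as term frequency grows,
    // approaching idf * (k1 + 1) but never exceeding it.
    static double bm25(double termFrequency, double idf,
                       double docLength, double avgDocLength,
                       double k1, double b) {
        double lengthNorm = 1.0 - b + b * (docLength / avgDocLength);
        return idf * (termFrequency * (k1 + 1.0)) / (termFrequency + k1 * lengthNorm);
    }

    public static void main(String[] args) {
        double idf = 1.0;   // assumed constant for the comparison
        double k1 = 1.2;    // conventional default
        double b = 0.75;    // conventional default
        for (int tf : new int[] {1, 2, 4, 8, 16}) {
            // Document length equals the average, so length normalization is neutral.
            System.out.printf("tf=%d  TFIDF=%.3f  BM25=%.3f%n",
                    tf, tfIdf(tf, idf), bm25(tf, idf, 100, 100, k1, b));
        }
    }
}
```

With these inputs the TF-IDF column doubles with every row, while the BM25 column flattens out below idf * (k1 + 1) = 2.2, which is why repeating a keyword many times pays off less under BM25.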

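Range Search

A range search returns all documents within a given radius threshold rather than a fixed number of nearest neighbors:

// Search with an explicit radius
List<Document> rangeResults = vectorStore.searchByRange(
    "AI and machine learning",  // query
    0.8,                        // radius (similarity threshold)
    "category == 'AI'"          // optional filter expression
);

You can also set a default range threshold at build time:

RedisVectorStore vectorStore = RedisVectorStore.builder(jedisClient, embeddingModel)
    .defaultRangeThreshold(0.8)  // set the default threshold
    .build();

// Use the default threshold
List<Document> results = vectorStore.searchByRange("query");

Range search is useful when you want to retrieve every relevant document above a given similarity threshold instead of limiting the result to a fixed count.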
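
The difference between a top-K (KNN) query and a range query can be sketched without Redis at all. The example below works on a map of precomputed similarity scores. The class `RangeVersusKnnSketch`, its method names, and the document ids and scores are all made up for illustration; in the real store the scores come from the vector index.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration of top-K versus range selection.
class RangeVersusKnnSketch {

    // KNN-style: always the k best-scoring ids, however low their scores are.
    static List<String> topK(Map<String, Double> scores, int k) {
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Range-style: every id whose score reaches the threshold, however many there are.
    static List<String> withinThreshold(Map<String, Double> scores, double threshold) {
        return scores.entrySet().stream()
                .filter(entry -> entry.getValue() >= threshold)
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up similarity scores for four documents.
        Map<String, Double> scores = Map.of("a", 0.95, "b", 0.85, "c", 0.60, "d", 0.40);

        System.out.println(topK(scores, 3));              // prints "[a, b, c]"
        System.out.println(withinThreshold(scores, 0.8)); // prints "[a, b]"
    }
}
```

Here the top-3 query still returns the weak match `c`, while the range query with threshold 0.8 drops it. Conversely, a range query can return more than K results when many documents are close to the query.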

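Semantic Caching

Semantic caching is an optimization technique that uses Redis vector search to cache and retrieve AI chat responses based on the semantic similarity of user queries rather than exact string matching. Responses can be reused even when semantically similar questions are phrased differently.

Why Semantic Caching?

Traditional caching relies on exact key matches, which fails when users phrase semantically equivalent questions differently:

- "What is the capital of France?"
- "Tell me the capital city of France"
- "Which city is the capital of France?"

All three questions have the same answer, but a traditional cache treats them as different requests, which leads to redundant LLM API calls. Semantic caching solves this by comparing the meaning of queries using vector embeddings.

Benefits:

- Lower API costs: avoid redundant calls to expensive LLM APIs
- Lower latency: return cached responses immediately, without waiting for model inference
- Better scalability: handle higher query volumes without a proportional increase in API costs
- Consistent responses: return the same answer for semantically similar questions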

Auto-configuration

Spring AI provides Spring Boot auto-configuration for the Redis semantic cache. To enable it, add the following dependency to your project's Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-redis-semantic-cache</artifactId>
</dependency>

Or add it to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-vector-store-redis-semantic-cache'
}

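The auto-configuration provides a default embedding model (redis/langcache-embed-v1) for the semantic cache. You can override it by providing your own EmbeddingModel bean.

Configuration Properties

Properties starting with spring.ai.vectorstore.redis.semantic-cache.* configure the semantic cache:

| Property | Description | Default value |
| --- | --- | --- |
| `spring.ai.vectorstore.redis.semantic-cache.enabled` | Enables or disables the semantic cache | `true` |
| `spring.ai.vectorstore.redis.semantic-cache.host` | Redis server host | `localhost` |
| `spring.ai.vectorstore.redis.semantic-cache.port` | Redis server port | `6379` |
| `spring.ai.vectorstore.redis.semantic-cache.similarity-threshold` | Similarity threshold for a cache hit (0.0-1.0). Higher values require a closer semantic match. | `0.95` |
| `spring.ai.vectorstore.redis.semantic-cache.index-name` | Name of the Redis search index | `semantic-cache-index` |
| `spring.ai.vectorstore.redis.semantic-cache.prefix` | Key prefix for cache entries in Redis | `semantic-cache:` |

Example configuration in application.yml: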

spring:
  ai:
    vectorstore:
      redis:
        semantic-cache:
          enabled: true
          host: localhost
          port: 6379
          similarity-threshold: 0.85
          index-name: my-app-cache
          prefix: "my-app:semantic-cache:"

Using the SemanticCacheAdvisor

The SemanticCacheAdvisor plugs into Spring AI's ChatClient advisor pattern. It automatically caches responses and returns cached results for similar queries:

@Autowired
private SemanticCache semanticCache;

@Autowired
private ChatModel chatModel;

public void example() {
    // Create the cache advisor
    SemanticCacheAdvisor cacheAdvisor = SemanticCacheAdvisor.builder()
        .cache(semanticCache)
        .build();

    // First query - calls the LLM and caches the response
    ChatResponse response1 = ChatClient.builder(chatModel)
        .build()
        .prompt("What is the capital of France?")
        .advisors(cacheAdvisor)
        .call()
        .chatResponse();

    // Similar query - returns the cached response (no LLM call)
    ChatResponse response2 = ChatClient.builder(chatModel)
        .build()
        .prompt("Tell me the capital city of France")
        .advisors(cacheAdvisor)
        .call()
        .chatResponse();

    // response1 and response2 contain the same cached answer
}

The advisor automatically:

- Checks the cache for semantically similar queries before calling the LLM
- Returns the cached response when a match above the similarity threshold is found
- Caches new responses after a successful LLM call
- Supports both synchronous and streaming chat operations

Using the Cache Directly

You can also interact with the SemanticCache directly for fine-grained control:

@Autowired
private SemanticCache semanticCache;

// Store a response together with its query
semanticCache.set("What is the capital of France?", chatResponse);

// Store with a TTL (time-to-live) for automatic expiration
semanticCache.set("What's the weather today?", weatherResponse, Duration.ofHours(1));

// Retrieve a semantically similar response
Optional<ChatResponse> cached = semanticCache.get("Tell me France's capital");

if (cached.isPresent()) {
    // Use the cached response
    String answer = cached.get().getResult().getOutput().getText();
}

// Clear all cache entries
semanticCache.clear();

Manual Configuration

For more control, you can configure the semantic cache components manually:

@Configuration
public class SemanticCacheConfig {

    @Bean
    public RedisClient jedisClient() {
        return RedisClient.builder().hostAndPort("localhost", 6379).build();
    }

    @Bean
    public SemanticCache semanticCache(RedisClient jedisClient, EmbeddingModel embeddingModel) {
        return DefaultSemanticCache.builder()
            .jedisClient(jedisClient)
            .embeddingModel(embeddingModel)
            .distanceThreshold(0.3)           // lower value = stricter matching
            .indexName("my-semantic-cache")
            .prefix("cache:")
            .build();
    }

    @Bean
    public SemanticCacheAdvisor semanticCacheAdvisor(SemanticCache cache) {
        return SemanticCacheAdvisor.builder()
            .cache(cache)
            .build();
    }
}

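Cache Isolation with Namespaces

For multi-tenant applications, or whenever you need separate cache spaces, use different index names to isolate cache entries:

// Create isolated caches for different users or contexts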
SemanticCache user1Cache = DefaultSemanticCache.builder()
    .jedisClient(jedisClient)
    .embeddingModel(embeddingModel)
    .indexName("user-1-cache")
    .build();

SemanticCache user2Cache = DefaultSemanticCache.builder()
    .jedisClient(jedisClient)
    .embeddingModel(embeddingModel)
    .indexName("user-2-cache")
    .build();

// Each user gets an independent cache space
SemanticCacheAdvisor user1Advisor = SemanticCacheAdvisor.builder()
    .cache(user1Cache)
    .build();

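System Prompt Isolation

The SemanticCacheAdvisor automatically isolates cached responses by system prompt. The same user query therefore returns different cached responses under different system prompts, which is essential for applications with multiple AI personas or context-dependent behavior.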

SemanticCacheAdvisor cacheAdvisor = SemanticCacheAdvisor.builder()
    .cache(semanticCache)
    .build();

// Query with the technical support persona
ChatResponse technicalResponse = ChatClient.builder(chatModel)
    .build()
    .prompt()
    .system("You are a technical support specialist. Provide detailed technical answers.")
    .user("How do I reset my password?")
    .advisors(cacheAdvisor)
    .call()
    .chatResponse();

// Same query with the customer service persona - cache miss (different context)
ChatResponse serviceResponse = ChatClient.builder(chatModel)
    .build()
    .prompt()
    .system("You are a friendly customer service agent. Keep responses brief and helpful.")
    .user("How do I reset my password?")
    .advisors(cacheAdvisor)
    .call()
    .chatResponse();

// Technical support persona again - cache hit
ChatResponse technicalAgain = ChatClient.builder(chatModel)
    .build()
    .prompt()
    .system("You are a technical support specialist. Provide detailed technical answers.")
    .user("How do I reset my password?")
    .advisors(cacheAdvisor)
    .call()
    .chatResponse();
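// Returns the cached technical support response

How it works:

The advisor computes a deterministic hash of the system prompt and uses it as a metadata filter when storing and retrieving cached responses:
- Same user question + same system prompt → cache hit
- Same user question + different system prompt → cache miss (separate cache entry)
- Queries without a system prompt share a common cache space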
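
A deterministic context key of this kind can be sketched with the JDK alone. The example below hashes the system prompt with SHA-256 and returns an empty key when there is no system prompt. Both choices are assumptions made for illustration: the source only says the hash is deterministic, not which algorithm the advisor uses, and `ContextHashSketch` is a hypothetical class.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration; the advisor's actual hash function is not specified.
class ContextHashSketch {

    // Returns a stable hex key for a system prompt. Queries without a system
    // prompt all map to the same empty key, so they share one cache space.
    static String contextHash(String systemPrompt) {
        if (systemPrompt == null || systemPrompt.isBlank()) {
            return "";
        }
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(systemPrompt.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String technical = "You are a technical support specialist. Provide detailed technical answers.";
        String service = "You are a friendly customer service agent. Keep responses brief and helpful.";

        // Same prompt, same key: lookups land in the same cache partition.
        System.out.println(contextHash(technical).equals(contextHash(technical))); // prints "true"
        // Different prompt, different key: lookups are isolated from each other.
        System.out.println(contextHash(technical).equals(contextHash(service)));   // prints "false"
    }
}
```

A key like this can then be passed to the context-aware cache methods, so that a lookup only matches entries stored under the same key.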

Context-Aware Cache API

For advanced use cases, you can call the context-aware cache methods directly:

// Store with an explicit context hash
String contextHash = "technical-support-context";
semanticCache.set("How do I reset my password?", response, contextHash);

// Retrieve with context filtering
Optional<ChatResponse> cached = semanticCache.get("How do I reset my password?", contextHash);

// A different context hash returns empty (no match)
Optional<ChatResponse> otherContext = semanticCache.get("How do I reset my password?", "billing-context");

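Tuning the Similarity Threshold

The similarity threshold determines how closely a query must match a cached entry to count as a hit. It is expressed as a value between 0.0 and 1.0:

- Higher threshold (e.g. 0.95): requires a very close semantic match. Reduces false positives but may miss valid cache hits.
- Lower threshold (e.g. 0.70): allows broader semantic matches. Increases the cache hit rate but may return less relevant cached responses.

Note that the builder's distanceThreshold works in the opposite direction: it is a distance, so a lower value means stricter matching.

// Strict matching - only very similar queries hit the cache
SemanticCache strictCache = DefaultSemanticCache.builder()
    .jedisClient(jedisClient)
    .embeddingModel(embeddingModel)
    .distanceThreshold(0.2)  // strict (distance-based: lower is stricter)
    .build();

// Lenient matching - accepts broader semantic similarity
SemanticCache lenientCache = DefaultSemanticCache.builder()
    .jedisClient(jedisClient)
    .embeddingModel(embeddingModel)
    .distanceThreshold(0.5)  // lenient
    .build();

Recommendation: start with strict matching (a high similarity threshold, or equivalently a low distance threshold) and relax it gradually based on how much semantic variation your application can tolerate.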

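TTL and Cache Expiration

Cached responses can be given a TTL (time-to-live) so that they expire automatically. This is essential for time-sensitive data:

// Cache weather data for 1 hour
semanticCache.set("What's the weather in New York?", weatherResponse, Duration.ofHours(1));

// Cache general knowledge indefinitely (no TTL)
semanticCache.set("What is photosynthesis?", scienceResponse);

// Redis removes expired entries automatically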

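How It Works

The semantic cache operates through the following workflow:

- Query embedding: when a query arrives, it is converted into a vector embedding using the configured EmbeddingModel
- Vector search: Redis performs a range-based vector search (VECTOR_RANGE) to find cached entries within the similarity threshold
- Cache hit: if a semantically similar query is found, the cached ChatResponse is returned immediately
- Cache miss: if no match is found, the query goes to the LLM and the response is cached for future use

The implementation uses Redis's efficient vector indexing (the HNSW algorithm) for fast similarity searches, which keeps performance high even with a large cache.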
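
The workflow above can be sketched as a small in-memory cache. This is a simplified stand-in, not the Redis-backed implementation: `SemanticCacheFlowSketch` is a hypothetical class, the linear scan replaces the vector index, responses are plain strings instead of ChatResponse objects, and `toyEmbed` is a deliberately crude keyword-based embedding invented for the demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical in-memory stand-in for the embed -> range search -> hit/miss flow.
class SemanticCacheFlowSketch {

    private final Function<String, double[]> embedder; // stands in for an EmbeddingModel
    private final double distanceThreshold;            // lower = stricter matching
    private final List<double[]> vectors = new ArrayList<>();
    private final List<String> responses = new ArrayList<>();

    SemanticCacheFlowSketch(Function<String, double[]> embedder, double distanceThreshold) {
        this.embedder = embedder;
        this.distanceThreshold = distanceThreshold;
    }

    void set(String query, String response) {
        vectors.add(embedder.apply(query));
        responses.add(response);
    }

    // Embeds the query, then returns the closest entry within the distance threshold.
    Optional<String> get(String query) {
        double[] q = embedder.apply(query);
        int best = -1;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < vectors.size(); i++) {
            double distance = cosineDistance(q, vectors.get(i));
            if (distance <= distanceThreshold && distance < bestDistance) {
                best = i;
                bestDistance = distance;
            }
        }
        return best < 0 ? Optional.empty() : Optional.of(responses.get(best));
    }

    // Cache hit: return the stored answer. Cache miss: call the model, then store.
    String answer(String query, Function<String, String> llm) {
        Optional<String> cached = get(query);
        if (cached.isPresent()) {
            return cached.get();
        }
        String fresh = llm.apply(query);
        set(query, fresh);
        return fresh;
    }

    static double cosineDistance(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0; // treat an all-zero vector as unrelated to everything
        }
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    // Toy embedding: one dimension per keyword, 1.0 if the text mentions it.
    static double[] toyEmbed(String text) {
        String[] vocabulary = {"capital", "france", "weather", "today"};
        double[] vector = new double[vocabulary.length];
        String lower = text.toLowerCase();
        for (int i = 0; i < vocabulary.length; i++) {
            if (lower.contains(vocabulary[i])) {
                vector[i] = 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        SemanticCacheFlowSketch cache =
                new SemanticCacheFlowSketch(SemanticCacheFlowSketch::toyEmbed, 0.2);
        // Miss: the "model" is called and its answer is cached.
        System.out.println(cache.answer("What is the capital of France?", q -> "Paris"));
        // Hit: the rephrased question reuses the cached answer; the lambda is not called.
        System.out.println(cache.answer("Tell me the capital city of France", q -> "not called"));
    }
}
```

Both calls in `main` print "Paris". The second never reaches its model function, because the two questions produce the same toy embedding and so have a cosine distance of 0.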
