Embracing the Intelligent Era: Building AI Applications in the Spring Ecosystem with Spring AI, an In-Depth Analysis and Practical Guide

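With the rapid advance of AI, large language models (LLMs) and generative AI are reshaping how software is built. Bringing AI capabilities into business applications is no longer reserved for a few specialist AI teams; it has become a key way to enhance existing applications and improve user experience. The Spring ecosystem, a widely used foundation for enterprise application development, has followed this trend with a new project in the AI space: Spring AI.

This article covers Spring AI's core concepts, architecture, key features, and advanced usage, and shows how to build production-grade intelligent applications within the Spring ecosystem. The aim is a thorough, practical guide for developers moving into AI-enabled development.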

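1. Spring AI: A Bridge Between Spring and the AI World

Spring AI is a relatively new Spring project that aims to simplify integrating AI functionality, especially LLM-based functionality, into Spring applications. Its core goal is to provide a unified API and abstraction layer that hides the complexity of the underlying AI services (such as OpenAI, Google Gemini, Azure AI, and Hugging Face), so that Spring developers can build AI-driven applications efficiently and in a familiar way.

1.1 Design Philosophy and Core Value

Spring AI follows the same design philosophy as the Spring Framework: high-level abstractions and a pluggable architecture. Its core value shows up in the following areas: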

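Simplified integration: With a unified API and auto-configuration, developers do not need to deal with each provider's SDK differences, authentication details, or REST API conventions. Adding a Spring Boot starter is enough to get connected.
Development efficiency: Spring's dependency injection, auto-configuration, and component scanning speed up the development and deployment of AI features. Developers write AI-related business logic using the Spring idioms they already know.
Portability and flexibility: The abstraction layer makes it easy to switch the underlying model or service. Moving from OpenAI to Google Gemini, for example, takes a small change to configuration and dependencies rather than a rewrite of core business code, which greatly reduces the risk of vendor lock-in.
Enterprise features: Spring AI builds on Spring Boot's production-ready features, including metrics, observability, security, and health checks, so AI applications fit into existing enterprise monitoring and management setups.
Broad range of use cases: Out-of-the-box support covers text generation (articles, reports), code generation and completion, summarization, multi-turn chatbots, question answering systems, and knowledge retrieval based on RAG (Retrieval Augmented Generation), among others.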

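2. Spring AI Core Concepts and Architecture in Depth

Spring AI is designed around a handful of key abstractions that together make up an extensible, easy-to-use architecture. Understanding them is essential to using Spring AI effectively.

2.1 Core Components and Interfaces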

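ChatClient: the core of LLM conversations
- Responsibility: This is the central entry point in Spring AI for text conversations with an LLM. It offers a fluent, chainable API for building a prompt and executing the call.
- Usage: chatClient.prompt().user("Hello, AI!").call().content() is the simplest form of a call. System messages, function calling, and streaming output are also supported.
- Implementation: In the fluent API (Spring AI 1.0.0-M1 and later), ChatClient wraps a ChatModel. The concrete ChatModel comes from the model provider's starter, for example OpenAiChatModel or VertexAiGeminiChatModel. In 0.8.x these classes were named OpenAiChatClient, VertexAiGeminiChatClient, and so on.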
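Prompt and Generation: wrapping input and output
- Prompt: Wraps the complete input sent to the LLM. It is more than a string: it can hold several Message objects (user, system, AI, and function messages) and can carry model parameters (ChatOptions) and function-calling descriptions (FunctionCallingOptions).
- Generation: Represents one output produced by the LLM. A single call can return several Generation objects (for example, when the model is asked for multiple candidate answers). Each Generation contains the generated text, metadata, and any function-call information.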
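EmbeddingModel (called EmbeddingClient before 1.0.0-M1): vector representations of text
- Responsibility: Converts text (or document fragments) into high-dimensional vectors, known as embeddings. These vectors capture the semantics of the text, so the similarity of two texts can be measured by the distance between their vectors.
- Importance: Embeddings are the foundation for semantic search, recommendation, classification and clustering, and RAG (Retrieval Augmented Generation).
- Usage: embeddingModel.embed(text) or embeddingModel.embed(document).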
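VectorStore: the core of the knowledge base
- Responsibility: Stores and retrieves embedding vectors efficiently. It is usually backed by a vector database.
- Integrations: Spring AI provides abstractions and integrations for many popular vector databases, for example:
  - SQL: PostgreSQL (with the PGVector extension)
  - NoSQL: Redis (RediSearch)
  - Dedicated vector databases and others: Pinecone, Weaviate, Milvus, Chroma, Qdrant, Neo4j (using its vector index), and more.
- Usage: vectorStore.add(List<Document>) adds documents together with their embeddings; vectorStore.similaritySearch(SearchRequest) embeds the query text and returns the most similar documents.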
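PromptTemplate: flexible prompt engineering
- Responsibility: Provides a declarative way to build and render complex prompts. It supports variable substitution with {name} placeholders, much like URI template variables in Spring's RestTemplate, as well as conditional logic, which makes prompts more flexible and maintainable.
- Advantage: Avoids hard-coded prompt strings and makes prompts easy to parameterize and generate dynamically, reducing the complexity of prompt engineering.
- Usage: new PromptTemplate("Summarize the following content: {content}").create(Map.of("content", text)).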
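Model provider integrations (starters):
- Spring AI supports the various LLM providers through separate Spring Boot starters. Each starter pulls in the corresponding client library and supplies auto-configuration that simplifies connection and authentication. For example:
  - spring-ai-openai-spring-boot-starter
  - spring-ai-azure-openai-spring-boot-starter
  - spring-ai-vertex-ai-gemini-spring-boot-starter (Google Gemini via Vertex AI)
  - spring-ai-ollama-spring-boot-starter (for running models locally, such as Mistral or Llama 2)
  - spring-ai-huggingface-spring-boot-starter, and others.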
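
To make the "distance between vectors" idea behind embeddings concrete, here is a minimal plain-Java sketch of cosine similarity, one common way to compare two embedding vectors. It does not use Spring AI; the class name CosineSimilarityDemo and the tiny three-dimensional vectors are made up for illustration, and real embedding models return vectors with hundreds or thousands of dimensions.

```java
class CosineSimilarityDemo {

    /**
     * Cosine similarity of two equal-length vectors.
     * Returns a value in [-1, 1]; values near 1 mean the vectors point the same way.
     */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("A zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical embeddings, hand-written for the example.
        double[] springBoot = {0.9, 0.1, 0.0};
        double[] springFramework = {0.8, 0.2, 0.1};
        double[] cooking = {0.0, 0.1, 0.9};

        System.out.println("springBoot vs springFramework: " + cosine(springBoot, springFramework));
        System.out.println("springBoot vs cooking:         " + cosine(springBoot, cooking));
    }
}
```

A vector store performs essentially this comparison, typically with an approximate index, across all stored vectors and returns the closest matches.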

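2.2 Architecture Overview

(Figure: Spring AI architecture overview. Image source: Spring AI official documentation.)

As the figure shows, the Spring AI architecture divides cleanly into several layers: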

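Application layer (Application Code): Developers work directly with the core interfaces Spring AI provides (ChatClient, EmbeddingModel, VectorStore) and do not need to care about the complexity of the underlying AI service.
Spring AI Core: The core layer defines the common API and abstractions such as Prompt, Message, Generation, and ChatOptions. It also contains general-purpose utilities such as PromptTemplate.
Model provider adapter layer (Model Providers): This is Spring AI's key extension point. Each provider (OpenAI, Google Gemini, and so on) has its own adapter, shipped as a starter, which translates Spring AI's generic requests into that service's API calls and parses the results.
Underlying AI models and services (Underlying AI Models/Services): The services that actually perform the AI computation. These can be cloud services (OpenAI API, Azure AI) or local models (through Ollama, for example).
External knowledge base and data sources (External Knowledge Base/Data Sources): Used in scenarios such as RAG and integrated through VectorStore to supply additional context.

This layered design lets Spring AI offer rich functionality while staying flexible and maintainable.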

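3. Building Intelligent Applications with Spring AI: From Basics to Advanced Practice

3.1 Basics: Dependencies and Configuration

First, make sure your Spring Boot project is on a recent version, then add the appropriate Spring AI starter and the BOM.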

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.3.0</version>
        <relativePath/>
    </parent>
    <groupId>com.example</groupId>
    <artifactId>spring-ai-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-ai-demo</name>
    <description>Demo project for Spring AI</description>

    <properties>
        <java.version>17</java.version>
        <!-- The fluent ChatClient API used in this article needs 1.0.0-M1 or later; 0.8.x predates it. -->
        <spring-ai.version>1.0.0-M1</spring-ai.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <!-- Milestone releases are published to the Spring milestone repository, not Maven Central. -->
    <repositories>
        <repository>
            <id>spring-milestones</id>
            <url>https://repo.spring.io/milestone</url>
        </repository>
    </repositories>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>
```

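Important: Spring AI versions have strict compatibility requirements with Spring Boot versions. Check the official Spring AI documentation or GitHub repository to confirm which Spring Boot versions your Spring AI version supports, so you avoid dependency conflicts.

Configure the AI service's API key and related parameters in application.properties or application.yml: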

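```properties
# application.properties
# Comments must sit on their own line: in a .properties file, text after a value becomes part of the value.

# Read the key from an environment variable instead of hard-coding it
spring.ai.openai.api-key=${OPENAI_API_KEY}
# Chat model to use, e.g. gpt-3.5-turbo or gpt-4o
spring.ai.openai.chat.options.model=gpt-4o
# Sampling temperature; higher values give more random output (OpenAI accepts 0 to 2)
spring.ai.openai.chat.options.temperature=0.7
# Nucleus sampling threshold
spring.ai.openai.chat.options.top-p=1.0

# Embedding model, needed only if you use the embedding service
spring.ai.openai.embedding.options.model=text-embedding-ada-002
```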

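For Google Gemini, Azure AI, and other providers the property names differ slightly, but they generally follow the spring.ai.{provider}.* naming convention.

3.2 Core Practice: Text Generation and Conversation

Basic text generation with ChatClient

Inject the auto-configured ChatClient.Builder into a Spring component, build a ChatClient from it, and use the client to generate text with the LLM.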

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux; // used for the streaming response

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    // Demo only: a single in-memory history shared by all callers.
    // A real application would keep one history per user or session.
    private final List<Message> chatHistory = new ArrayList<>();

    // Spring AI auto-configures a ChatClient.Builder (1.0.0-M1 and later), not a ChatClient bean.
    public ChatController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    /**
     * Simple text generation API
     * GET /generate?message=Hello AI
     */
    @GetMapping("/generate")
    public String generate(@RequestParam(value = "message", defaultValue = "Tell me a short, funny joke.") String message) {
        // Build and send the prompt with the fluent API, then read the generated text
        return chatClient.prompt()
            .user(message) // user message
            .call()        // call the model
            .content();    // get the generated content
    }

    /**
     * Parameterized prompt using PromptTemplate
     * GET /poem?topic=spring
     */
    @GetMapping("/poem")
    public String generatePoem(@RequestParam(value = "topic", defaultValue = "spring") String topic) {
        // Define a PromptTemplate with a {topic} placeholder
        PromptTemplate promptTemplate = new PromptTemplate("""
            Write a five-character quatrain (wuyan jueju) about {topic}, with graceful imagery and a sense of vitality.
            """);
        // Render the template into a Prompt
        Prompt prompt = promptTemplate.create(Map.of("topic", topic));

        return chatClient.prompt(prompt)
            .call()
            .content();
    }

    /**
     * Streaming response
     * For long generations or chat, streaming gives a better user experience
     * GET /stream/joke
     */
    @GetMapping("/stream/joke")
    public Flux<String> streamJoke() {
        // stream() switches to the streaming API; content() emits the text chunks as a Flux<String>
        return chatClient.prompt()
            .user("Tell a dry joke about programmers, generating it step by step.")
            .stream()
            .content();
    }

    /**
     * Multi-turn conversation example
     * The full history is sent on every call so the model keeps its context
     * GET /chat?q=...
     */
    @GetMapping("/chat")
    public synchronized String chat(@RequestParam("q") String query) {
        // Add the user's query to the history
        chatHistory.add(new UserMessage(query));

        // Send the whole history as the message list
        String aiResponse = chatClient.prompt()
            .messages(chatHistory)
            .call()
            .content();

        // Record the model's reply so the next turn can see it
        chatHistory.add(new AssistantMessage(aiResponse));

        return aiResponse;
    }
}
```

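3.3 Advanced Practice: Retrieval Augmented Generation (RAG) and Function Calling

Retrieval Augmented Generation (RAG): combining EmbeddingModel and VectorStore

RAG (Retrieval Augmented Generation) is a popular pattern for LLM applications. It augments the LLM's generation with relevant information retrieved from an external knowledge base, which helps mitigate hallucinations and the problem of stale knowledge.

Basic steps: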

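Data loading and chunking: Load raw data from the various sources (documents, databases, APIs, and so on). The data usually needs to be split into chunks of a suitable size so that it fits the model's context window and can be retrieved efficiently.
Document embedding: Use the EmbeddingModel to convert the chunked text (Document objects) into high-dimensional vectors.
Vector storage: Store the embeddings, together with the original text and metadata, in the VectorStore for efficient similarity search.
Query embedding: When a user asks a question, convert the query into an embedding vector as well.
Vector retrieval: Run a similarity search in the VectorStore to retrieve the K document fragments most similar to the query embedding.
Prompt augmentation: Combine the retrieved fragments, as additional context, with the original user query to build a new, richer prompt.
LLM generation: Send the augmented prompt to the LLM, which uses the context to produce a more accurate and relevant answer.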
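
The chunking step above can be sketched in plain Java. This is an illustrative, character-based splitter with overlap, not Spring AI's own splitter; the class name FixedSizeChunker and its parameters are assumptions made for the example. Production pipelines usually split on token counts and sentence or paragraph boundaries instead of raw character offsets.

```java
import java.util.ArrayList;
import java.util.List;

class FixedSizeChunker {

    /**
     * Splits text into chunks of at most chunkSize characters.
     * Consecutive chunks share `overlap` characters so that a sentence cut at a
     * boundary still appears intact in at least one neighbouring chunk.
     */
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "Spring Boot simplifies the creation of production-ready Spring applications.";
        for (String c : chunk(text, 30, 5)) {
            System.out.println("[" + c + "]");
        }
    }
}
```

Each chunk would then be wrapped in a Document and added to the vector store. The overlap trades a little extra storage for better recall at chunk boundaries.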

Code example (conceptual):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;
import java.util.stream.Collectors;

@RestController
public class RagController {

    private final ChatClient chatClient;
    // Assumes a VectorStore bean is already configured; it uses the EmbeddingModel internally,
    // so the controller does not need to inject the embedding model itself.
    private final VectorStore vectorStore;

    public RagController(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    /**
     * Example: load documents into the VectorStore
     * (usually done at application startup or triggered from an admin interface)
     */
    @GetMapping("/load-docs")
    public String loadDocuments() {
        // Simulate loading documents from some data source
        List<Document> documents = List.of(
            new Document("Spring Framework is an open-source application framework for Java."),
            new Document("Spring Boot simplifies the creation of production-ready Spring applications."),
            new Document("Microservices architecture is a style of developing a single application as a suite of small services.")
        );
        // The VectorStore embeds each document and stores the vectors
        vectorStore.add(documents);
        return "Documents loaded and embedded into VectorStore.";
    }

    /**
     * RAG question answering API
     * GET /ask?query=What is Spring Boot?
     */
    @GetMapping("/ask")
    public String askQuestion(@RequestParam("query") String query) {
        // 1. Retrieve the K most relevant document fragments for the query.
        //    The VectorStore embeds the query text with its configured EmbeddingModel.
        //    SearchRequest.query(...).withTopK(...) and Document.getContent() are the early 1.0.0
        //    milestone signatures; later releases changed both, so check your version.
        List<Document> relevantDocuments = vectorStore.similaritySearch(
            SearchRequest.query(query).withTopK(3) // the 3 most similar documents
        );

        // 2. Join the retrieved content into a single context string
        String context = relevantDocuments.stream()
            .map(Document::getContent)
            .collect(Collectors.joining("\n---\n")); // separator between fragments

        // 3. Build the augmented prompt (prompt engineering):
        //    give the LLM a role and say how to treat the context and unknown cases
        String enhancedPrompt = String.format("""
            You are an expert question-answering assistant. Answer the user's question based on the "Context" provided.
            If the "Context" does not contain enough information to answer, say honestly: "I cannot find the answer in the provided material."

            Context:
            %s

            User question: %s
            """, context, query);

        // 4. Send the augmented prompt to the LLM and return the answer
        return chatClient.prompt()
            .user(enhancedPrompt)
            .call()
            .content();
    }
}
```

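Function calling: giving the LLM external capabilities

Function calling lets the model recognize the user's intent and call external tools or APIs to complete a specific task. Spring AI supports it directly, which allows the LLM to act as an intelligent coordinator.

Basic flow: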

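Define the function: Implement a java.util.function.Function and expose it as a Spring bean. Describe what it does with Spring Framework's @Description annotation (org.springframework.context.annotation.Description), and describe its parameters on the request type, for example with Jackson's @JsonPropertyDescription. Spring AI collects these descriptions and passes them to the LLM.
Register the function: Function beans in the application context are looked up by bean name. You make one available by naming it on the ChatClient call, or by configuring a list of functions on the ChatClient.
User prompt: The user sends a prompt that may imply a function call (for example, "What's the weather in San Francisco today?").
LLM decision and generation: Based on the prompt and the descriptions of the registered functions, the LLM decides whether a function is needed and, if so, produces a function-call instruction containing the function name and arguments. This instruction is part of the LLM's output.
Spring AI executes it: Spring AI intercepts the function-call instruction, deserializes the arguments into the request type, and invokes the corresponding function bean.
Result feedback: The function's return value is sent back to the LLM as a new function-result message (FunctionMessage in the API this article was written against; the class name has changed across releases), where it becomes additional context.
Final response: The LLM combines the function result with the conversation and produces the final answer for the user.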
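
The "Spring AI executes it" step in the flow above boils down to a name-based lookup followed by an invocation. Here is a minimal plain-Java sketch of that idea, with no Spring AI involved: the class ToolDispatchSketch, its String-in/String-out signature, and the canned weather answer are all simplifications invented for illustration. The real framework works with typed request objects and JSON arguments.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ToolDispatchSketch {

    // Stands in for the function beans in the application context, keyed by name.
    private final Map<String, Function<String, String>> registry = new HashMap<>();

    void register(String name, Function<String, String> fn) {
        registry.put(name, fn);
    }

    /**
     * Simulates the framework step: the model has asked for function `name`
     * with `argument`; look it up, run it, and return the result text that
     * would be fed back to the model.
     */
    String dispatch(String name, String argument) {
        Function<String, String> fn = registry.get(name);
        if (fn == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        return fn.apply(argument);
    }

    public static void main(String[] args) {
        ToolDispatchSketch tools = new ToolDispatchSketch();
        // Hypothetical weather lookup with a hard-coded answer.
        tools.register("currentWeather",
            city -> "San Francisco".equalsIgnoreCase(city) ? "20C, sunny" : "unknown");

        // Pretend the model emitted: call currentWeather("San Francisco")
        System.out.println(tools.dispatch("currentWeather", "San Francisco"));
    }
}
```

Rejecting unknown names matters in practice: the function name comes from model output, so it should only ever select from an explicit allow-list of registered functions.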

```java
import com.fasterxml.jackson.annotation.JsonPropertyDescription;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.function.Function;

@RestController
public class FunctionCallingController {

    private final ChatClient chatClient;

    public FunctionCallingController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    /**
     * API that can trigger a function call
     * GET /ask-weather?q=What is the weather in San Francisco right now?
     */
    @GetMapping("/ask-weather")
    public String askWeather(@RequestParam("q") String query) {
        // functions(...) names the function beans the model may call for this request.
        // This is the 1.0.0 milestone API; later releases moved to a tools-based API.
        // If the model decides to call the function, Spring AI runs it, sends the result back
        // to the model for a second pass, and content() returns the final natural-language answer.
        return chatClient.prompt()
            .user(query)
            .functions("currentWeatherFunction")
            .call()
            .content();
    }
}

/**
 * Function definition: a simulated current-weather lookup, registered as a bean.
 */
@Configuration
class WeatherFunctionConfig {

    @Bean
    @Description("Get the current weather for a given city.") // describes the function to the LLM
    public Function<WeatherFunction.Request, WeatherFunction.Response> currentWeatherFunction() {
        return new WeatherFunction();
    }
}

class WeatherFunction implements Function<WeatherFunction.Request, WeatherFunction.Response> {

    // Request parameters. Spring's @Description cannot be placed on record components,
    // so the parameter description uses Jackson's @JsonPropertyDescription.
    public record Request(
        @JsonPropertyDescription("City name, e.g. San Francisco, New York, London") String city
    ) {}

    // Response structure
    public record Response(String city, String temperature, String conditions) {}

    @Override
    public Response apply(Request request) {
        // Simulate a call to an external weather API
        String city = request.city();
        System.out.println("--- Calling function: currentWeatherFunction (city: " + city + ") ---");
        if ("旧金山".equals(city) || "San Francisco".equalsIgnoreCase(city)) {
            return new Response(city, "20°C", "Sunny");
        } else if ("伦敦".equals(city) || "London".equalsIgnoreCase(city)) {
            return new Response(city, "15°C", "Cloudy with light rain");
        }
        return new Response(city, "unknown", "unknown");
    }
}
```

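4. Production Considerations and Best Practices

Taking Spring AI to production involves more than writing the code. Performance, reliability, cost, and security all need careful attention.

4.1 Performance and Cost Optimization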

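Token management and cost monitoring: LLM calls are usually billed by token count.
- Optimize prompts: Keep prompts as concise as possible and avoid redundant information.
- Chunk and summarize: For RAG, chunk documents sensibly and summarize retrieved documents to reduce the number of tokens sent to the LLM.
- Logs and metrics: Use tools such as Micrometer to monitor API call volume and token usage so abnormal consumption is caught early.
Caching: For repeated or high-frequency LLM requests, consider an application-level cache (such as Spring Cache) to avoid unnecessary API calls and reduce cost and latency.
Concurrency and throughput: LLM services may enforce rate limits. When handling high request volumes:
- Use the streaming API: ChatClient offers stream(), which returns a Flux and works with non-blocking I/O. Embedding calls are blocking, so run them off request threads where that matters.
- Manage thread pools: Configure Spring Boot's thread pools so that concurrent requests are handled efficiently.
- Rate limiting and circuit breaking: Use a library such as Resilience4j for rate limiting and circuit breaking so the LLM service is not overloaded. Hystrix, often mentioned in this role, is in maintenance mode.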
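
To illustrate the caching idea from 4.1, here is a minimal plain-Java sketch that memoizes responses by exact prompt text. It is a stand-in for what Spring Cache would do declaratively. The class CachedCompletion is hypothetical, and the backend function represents whatever actually calls the LLM. The sketch has no eviction or expiry, which a real cache would need.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class CachedCompletion {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> backend; // stands in for the real LLM call
    private final AtomicInteger backendCalls = new AtomicInteger();

    CachedCompletion(Function<String, String> backend) {
        this.backend = backend;
    }

    /** Returns the cached answer for this exact prompt, calling the backend only on a miss. */
    String complete(String prompt) {
        return cache.computeIfAbsent(prompt, p -> {
            backendCalls.incrementAndGet();
            return backend.apply(p);
        });
    }

    /** Number of times the backend was actually invoked. */
    int backendCalls() {
        return backendCalls.get();
    }

    public static void main(String[] args) {
        CachedCompletion llm = new CachedCompletion(prompt -> "echo: " + prompt);
        llm.complete("What is Spring Boot?");
        llm.complete("What is Spring Boot?"); // served from the cache
        System.out.println("backend calls: " + llm.backendCalls());
    }
}
```

Exact-match caching only helps when prompts repeat verbatim, and it makes sense mainly for deterministic settings (low temperature) where returning a stored answer is acceptable.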

4.2 Observability and Reliability

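Logging: Record AI calls in detail, including request, response, latency, and error information, to support troubleshooting and auditing.
Metrics: Integrate Micrometer to collect metrics on AI service calls, such as success rate, response time, error rate, and token usage, and visualize them with tools such as Prometheus and Grafana.
Distributed tracing: Use Micrometer Tracing (the Spring Boot 3 successor to Spring Cloud Sleuth) with OpenTelemetry to trace AI calls across services and analyze AI request flows in complex distributed systems.
Error handling and retries: LLM services can fail because of network problems, overload, or internal model errors.
- Robust exception handling: Wrap ChatClient and EmbeddingModel calls in try-catch blocks.
- Retries: Use Spring Retry or Resilience4j's Retry with a sensible policy (for example exponential backoff) to raise the success rate of API calls.
- Fallbacks: When the AI service is unavailable, offer an alternative such as a default value, cached data, or a message asking the user to try again later.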
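
The retry-with-exponential-backoff policy mentioned in 4.2 can be sketched in plain Java as follows. This is a hand-rolled illustration, not the Spring Retry or Resilience4j API: the class RetryWithBackoff and its parameters are assumptions for the example, and a production policy would usually add jitter, a maximum delay, and a filter so that only transient errors are retried.

```java
import java.util.function.Supplier;

class RetryWithBackoff {

    /**
     * Runs the action up to maxAttempts times. After each failure it waits,
     * doubling the delay every time (initialDelayMillis, 2x, 4x, ...).
     * Rethrows the last failure once the attempts are used up.
     */
    static <T> T call(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical flaky call: fails twice, then succeeds.
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "ok";
        }, 5, 100);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A fallback fits naturally around such a call: catch the final exception and return a default or cached answer instead of surfacing the error to the user.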

4.3 Security Considerations

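API key management: Never hard-code API keys in source code. Manage and inject them securely through environment variables, Spring Cloud Config, HashiCorp Vault, or similar mechanisms.
Data privacy and compliance:
- Sensitive data filtering: Mask or filter sensitive data before sending it to the LLM service.
- Data residency: Understand the data storage policy and geographic location of the LLM service you choose, and make sure it meets privacy regulations such as GDPR and CCPA.
- Model training: Confirm whether the LLM service uses your data for model training, and disable that where needed.
Access control: Apply strict authentication and authorization to any AI-related API endpoints you expose, so that only legitimate users or services can reach them.
Prompt injection: This is a security risk specific to LLM applications. A malicious user may craft a prompt that bypasses the LLM's instructions or safety policies. Mitigations include:
- Input validation and sanitization: Validate and sanitize user input strictly.
- Stronger system prompts: State the LLM's role and behavioral constraints explicitly in the system prompt, for example "Answer strictly from the context and do not go off topic."
- Output review: Review the LLM's output to some degree to prevent inappropriate content.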
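
As one concrete form of the sensitive data filtering mentioned in 4.3, the following plain-Java sketch masks email addresses and long digit runs (such as card numbers) before text is placed in a prompt. The class PromptRedactor and its two patterns are illustrative assumptions only; they are deliberately simple and will miss many kinds of personal data, so treat this as a starting point and not as a compliance control.

```java
import java.util.regex.Pattern;

class PromptRedactor {

    // Simplified email pattern; real-world addresses can be more varied.
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Runs of 13 to 19 digits, the usual length range of payment card numbers.
    private static final Pattern LONG_DIGITS = Pattern.compile("\\b\\d{13,19}\\b");

    /** Replaces matches with fixed placeholders so the original values never reach the LLM. */
    static String redact(String input) {
        String withoutEmails = EMAIL.matcher(input).replaceAll("[EMAIL]");
        return LONG_DIGITS.matcher(withoutEmails).replaceAll("[NUMBER]");
    }

    public static void main(String[] args) {
        String raw = "Contact alice@example.com about card 4111111111111111";
        System.out.println(redact(raw));
    }
}
```

Redaction like this belongs at the boundary where user or business data is assembled into the prompt, so every call path to the model passes through it.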

5. Summary and Outlook

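Spring AI is the Spring ecosystem's major move into artificial intelligence, and it greatly lowers the barrier to integrating LLM technology into enterprise applications. It offers an elegant, consistent, and extensible API that lets Java developers work in a familiar way and focus on business logic and user experience without digging into the details of the underlying AI services.

From basic text generation to full RAG systems and function calling, Spring AI gives developers a solid foundation for building many kinds of intelligent applications. Combined with Spring Boot's production-ready features, it supports AI-driven applications that are performant, reliable, observable, and secure.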