LLMs with Spring AI: Web + AI (Part 1)

Official site: https://docs.spring.io/spring-ai/reference/index.html

Version: 1.0.0-M1

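Spring AI is a project that aims to simplify building AI applications by removing unnecessary complexity. It draws inspiration from Python projects such as LangChain and LlamaIndex, but it is not a direct port of them. Spring AI is built on the belief that the next wave of AI applications will not be limited to Python developers and will spread across many programming languages.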

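Features:

- Support for all major model providers, such as OpenAI, Microsoft, Amazon, Google, and Hugging Face.
- Supported model types include chat, text-to-image, audio transcription, and text-to-speech, with more to come.
- A unified API across AI providers for all models, with both synchronous and streaming options. Model-specific features remain accessible.
- Mapping of AI model output to POJOs.
- Support for all major vector database providers, such as Apache Cassandra, Azure Vector Search, Chroma, Milvus, MongoDB Atlas, Neo4j, Oracle, PostgreSQL/PGVector, Pinecone, Qdrant, Redis, and Weaviate.
- ...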

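1. Concepts

1.1 Models

Categories by function: text-to-text, text-to-image, image-to-text, speech-to-text, text-to-speech, text (or image) to vector (embedding), and so on.

Temperature: this parameter controls the randomness of generation. A higher temperature makes the text more diverse and creative, possibly at the cost of some accuracy or coherence.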
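To make the effect of temperature concrete, here is a minimal sketch of the common softmax-with-temperature formulation: the raw scores (logits) are divided by the temperature before being turned into probabilities. The class name TemperatureDemo and the sample logits are made up for illustration, and hosted models may implement sampling differently.

```java
import java.util.Arrays;

class TemperatureDemo {
    /** Turns raw scores (logits) into probabilities after dividing them by the temperature. */
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("temperature must be positive");
        }
        // Subtract the maximum first for numerical stability
        double max = Arrays.stream(logits).max().orElseThrow();
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.1}; // illustrative scores for three candidate tokens
        System.out.println(Arrays.toString(softmax(logits, 0.5))); // sharper distribution
        System.out.println(Arrays.toString(softmax(logits, 1.5))); // flatter distribution
    }
}
```

With a low temperature the top candidate dominates, so output is more predictable; with a high temperature the probabilities even out, so less likely tokens get picked more often.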

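1.2 Prompts

Prompts are the language-based input that guides an AI model to produce specific output, but a prompt is more than the text typed into a chat box. The ChatGPT API accepts multiple text inputs within a prompt, each assigned a role. For example, the system role tells the model how to behave and sets the context for the interaction, while the user role typically carries the input from the user.

Prompt templates: crafting an effective prompt involves establishing the context of the request and substituting parts of it with values specific to the user's input. This uses a traditional text-based template engine for prompt creation and management. Spring AI uses the open-source StringTemplate library for this purpose.

Example from the official docs: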

Tell me a {adjective} joke about {content}.

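In Spring AI, a prompt template can be likened to the "view" in the Spring MVC architecture. A model object, typically a java.util.Map, is provided to fill the placeholders in the template.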
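The substitution step can be sketched in plain Java. This is not StringTemplate or Spring AI's own template class, just a hypothetical PromptTemplateSketch helper that fills {name} placeholders from a Map, using the template from the official example. The values "funny" and "cows" are illustrative.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PromptTemplateSketch {
    // Matches placeholders such as {adjective}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /** Replaces every {name} in the template with the matching value from the model map. */
    static String render(String template, Map<String, Object> model) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!model.containsKey(name)) {
                throw new IllegalArgumentException("No value for placeholder: " + name);
            }
            // quoteReplacement keeps '$' and '\' in values from being treated specially
            matcher.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(model.get(name))));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String prompt = render("Tell me a {adjective} joke about {content}.",
                Map.of("adjective", "funny", "content", "cows"));
        System.out.println(prompt); // prints "Tell me a funny joke about cows."
    }
}
```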

Prompts will get their own chapter later, based on the OpenAI documentation.

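1.3 Embeddings

Embeddings are numerical representations of text, images, or video that capture the relationships between inputs. **I prefer to think of them as the vectorization of text and images.** Embeddings work by converting text, images, and video into arrays of floating-point numbers called vectors. These vectors are designed to capture the meaning of the text, images, and video. The length of the embedding array is called the vector's dimensionality.

By computing the numerical distance between the vector representations of two pieces of text, an application can determine how similar the objects used to generate the embedding vectors are.

Embeddings are especially important in practical applications such as the Retrieval Augmented Generation (RAG) pattern. They let data be represented as points in a semantic space, which resembles the two-dimensional space of Euclidean geometry but with many more dimensions. Just as points on a plane are near or far according to their coordinates, proximity in the semantic space reflects similarity in meaning: sentences about similar topics sit closer together, like points clustered on a chart. This proximity helps with tasks such as text classification, semantic search, and even product recommendation, because it lets the AI discern and group related concepts by their "location" in this expanded semantic landscape.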
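One common way to measure how close two embeddings are is cosine similarity. The sketch below uses tiny hand-written vectors in place of real embeddings (which usually have hundreds or thousands of dimensions); the class name and sample values are illustrative only.

```java
class CosineSimilarity {
    /** Cosine of the angle between two vectors: near 1 means same direction, 0 means unrelated. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        // Note: an all-zero vector yields NaN, since its direction is undefined
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] cat = {0.9f, 0.1f, 0.0f};    // made-up "embedding" values
        float[] kitten = {0.8f, 0.2f, 0.1f};
        float[] invoice = {0.0f, 0.1f, 0.9f};
        System.out.println(cosine(cat, kitten));  // close to 1: similar meaning
        System.out.println(cosine(cat, invoice)); // close to 0: unrelated
    }
}
```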

RAG, agents, and related topics will be covered in later articles. Stay tuned!

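1.4 Tokens

In English, one token corresponds to roughly 75% of a word. API access that you purchase is consumed in tokens; loosely speaking, tokens are your metered usage of OpenAI. For hosted AI models, your charges are determined by the number of tokens used, and both input and output count toward the total.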
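As a back-of-the-envelope illustration, the 75% rule of thumb can be turned into a rough estimate of token usage and cost. The class name is hypothetical and the prices in the example are invented, not taken from any real price list; actual token counts depend on the model's tokenizer.

```java
class TokenEstimate {
    /** Rough token count for English text, using the rule of thumb 1 token ≈ 0.75 words. */
    static long estimateTokens(int words) {
        return Math.round(words / 0.75);
    }

    /** Cost of one request; prices are per 1,000 tokens and purely illustrative. */
    static double estimateCost(long inputTokens, long outputTokens,
                               double inputPricePer1K, double outputPricePer1K) {
        return inputTokens / 1000.0 * inputPricePer1K
                + outputTokens / 1000.0 * outputPricePer1K;
    }

    public static void main(String[] args) {
        long promptTokens = estimateTokens(75);       // 75 words -> about 100 tokens
        long completionTokens = estimateTokens(300);  // 300 words -> about 400 tokens
        // Hypothetical prices: input and output are billed separately
        System.out.println(estimateCost(promptTokens, completionTokens, 0.5, 1.5));
    }
}
```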

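1.5 Structured output

Model output normally arrives as a String. You can ask for JSON, but the reply is still just a string and is not guaranteed to be accurate JSON.

Structured output conversion relies on carefully crafted prompts and often takes several interactions with the model to get the desired format.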

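1.6 Bringing private data to the AI model

There are three common approaches:

1. Fine-tune the model on your private data, which changes the model's weights.
2. Convert the private data into vectors stored in a vector database and query it during the conversation, i.e. Retrieval Augmented Generation (RAG).
3. Function calling, supported by providers such as OpenAI and Ollama.

Approaches 2 and 3 are recommended.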

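1.7 Retrieval Augmented Generation (RAG)

The approach uses a batch-style programming model in which a job reads unstructured data from your documents, transforms it, and writes it to a vector database. At a high level this is an ETL (extract, transform, load) pipeline. The vector database serves the retrieval part of the RAG technique.

When loading unstructured data into the vector database, one of the most important transformations is splitting the original document into smaller pieces. This involves two important steps:

1. Split the document into parts while preserving the semantic boundaries of the content. For a document with paragraphs and tables, avoid splitting in the middle of a paragraph or table. For code, avoid splitting in the middle of a method implementation.
2. Split those parts further into pieces whose size is a small fraction of the AI model's token limit.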
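The two splitting steps can be sketched as follows. This is a simplified, hypothetical ParagraphChunker, not Spring AI's document splitter: it treats blank lines as semantic boundaries and uses a character budget as a stand-in for a token budget. A single paragraph longer than the budget is kept whole rather than cut.

```java
import java.util.ArrayList;
import java.util.List;

class ParagraphChunker {
    /** Packs whole paragraphs into chunks of at most maxChars characters where possible. */
    static List<String> split(String document, int maxChars) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        // A blank line (two line breaks, optionally with whitespace between) ends a paragraph
        for (String paragraph : document.split("\\R\\s*\\R")) {
            String p = paragraph.strip();
            if (p.isEmpty()) {
                continue;
            }
            // Start a new chunk if adding this paragraph would exceed the budget
            if (current.length() > 0 && current.length() + 2 + p.length() > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append("\n\n");
            }
            current.append(p);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String doc = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph.";
        split(doc, 40).forEach(chunk -> System.out.println("[" + chunk + "]"));
    }
}
```

Each resulting chunk would then be embedded and written to the vector database.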

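The next phase of RAG is processing user input. When the AI model is to answer a user's question, the question and all the "similar" document pieces are placed into the prompt sent to the model. This is why a vector database is used: it is very good at finding similar content.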

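1.8 Function calling

Spring AI greatly simplifies the code you need to write to support function calling and handles the function-calling conversation for you. You provide your function as a @Bean and then supply the function's bean name in the prompt options to activate it. You can also define and reference multiple functions in a single prompt.

1. Perform a chat request, sending along the function definitions. Each definition provides a name, a description (for example, explaining when the model should call the function), and the input parameters (for example, the schema of the function's input parameters).
2. When the model decides to call the function, it calls it with the input parameters, and the output is returned to the model.
3. Spring AI handles this conversation for you: it dispatches the function call to the appropriate function and returns the result to the model.
4. The model can perform multiple function calls to retrieve all the information it needs.
5. Once all the needed information has been acquired, the model generates a response.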
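The conversation loop described in these steps can be sketched without any AI library. Everything here is hypothetical: FakeModel stands in for a real model, ModelTurn for its reply, and getWeather for a registered function. Spring AI's real implementation differs; the point is only to show the dispatch loop.

```java
import java.util.Map;
import java.util.function.Function;

class FunctionCallSketch {
    /** One model reply: either a request to call a function, or the final answer. */
    record ModelTurn(String functionName, String argument, String finalAnswer) {
        boolean isFunctionCall() {
            return functionName != null;
        }
    }

    /** Hypothetical stand-in for the model: receives the latest input, returns its next turn. */
    interface FakeModel extends Function<String, ModelTurn> {}

    /** Keeps dispatching requested function calls until the model produces a final answer. */
    static String converse(FakeModel model, Map<String, Function<String, String>> functions, String userInput) {
        ModelTurn turn = model.apply(userInput);
        while (turn.isFunctionCall()) {
            Function<String, String> fn = functions.get(turn.functionName());
            if (fn == null) {
                throw new IllegalStateException("Unknown function: " + turn.functionName());
            }
            String result = fn.apply(turn.argument());
            turn = model.apply(result); // hand the function output back to the model
        }
        return turn.finalAnswer();
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> functions =
                Map.of("getWeather", city -> "22C in " + city);
        // Scripted model: asks for the weather first, then answers once it has the result
        FakeModel model = input -> input.startsWith("22C")
                ? new ModelTurn(null, null, "It is " + input + ".")
                : new ModelTurn("getWeather", "Paris", null);
        System.out.println(converse(model, functions, "What is the weather in Paris?"));
        // prints "It is 22C in Paris."
    }
}
```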

2. Usage

2.1 Getting started

Requires Spring Boot 3 and OpenJDK 17 or later. The Maven pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.8</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.ai</groupId>
    <artifactId>spring-ai</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-ai</name>
    <description>spring-ai</description>
    <url/>
    <licenses>
        <license/>
    </licenses>
    <developers>
        <developer/>
    </developers>
    <scm>
        <connection/>
        <developerConnection/>
        <tag/>
        <url/>
    </scm>
    <properties>
        <java.version>17</java.version>
        <spring-ai.version>1.0.0-M1</spring-ai.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
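        <!-- Enable the Ollama starter only when you need it. With auto-configuration on, startup otherwise fails with a message that ChatClient.Builder needs a single chat model to be specified. If auto-configuration is not used, the dependency can stay uncommented. -->
        <!--
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
        </dependency>
        -->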
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>

</project>

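2.2 API

The AI Model API comes in two flavors, the regular Model and the StreamingModel. A StreamingModel returns its response as a stream, similar to a chat UI that shows the answer piece by piece rather than all at once.

Supported AI model providers include OpenAI, Microsoft, Amazon, Google, Amazon Bedrock, and Hugging Face, and 14 vector databases are supported. Spring AI also makes it easy to let the AI model invoke your POJO java.util.Function objects.

2.2.1 Chat Client API

Spring AI offers a fluent API for communicating with an AI model. It supports both a synchronous and a reactive programming model, and it is the simplest, most basic API. An AI model handles two main kinds of messages: user messages (direct input from the user) and system messages (generated by the system to guide the conversation). These messages often contain placeholders that are substituted at runtime based on user input, to customize the model's response. You can also specify prompt options, such as the name of the AI model to use and the temperature setting that controls the randomness or creativity of the output.

A ChatClient is created with a ChatClient.Builder object. You can obtain an auto-configured ChatClient.Builder instance for any ChatModel that has Spring Boot auto-configuration, or create one programmatically.

Code from the official docs: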

package com.ai.springai.controller;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * @Title: MyController
 * @Author lizhe
 * @Package com.ai.springai.controller
 * @Date 2024/8/15 11:33 PM
 * @description: Tests the ChatClient API
 */

@RestController
public class MyController {
    private final ChatClient chatClient;
    public MyController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }
    @GetMapping("/ai")
    String generation(String userInput) {
        return this.chatClient.prompt()
                .user(userInput)
                .call()
                .content();
    }


}

Adjust the method so that it takes a request parameter with a default value (this calls the OpenAI API):

package com.ai.springai.controller;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

/**
 * @Title: MyController
 * @Author lizhe
 * @Package com.ai.springai.controller
 * @Date 2024/8/15 11:33 PM
 * @description: Tests the ChatClient API
 */

@RestController
public class MyController {
    private final ChatClient chatClient;
    public MyController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }
    @GetMapping("/ai")
    String generation(@RequestParam(value = "message", defaultValue = "Tell me a joke") String userInput) {
        return this.chatClient.prompt()
                .user(userInput)
                .call()
                .content();
    }


}

Configuration file:

spring:
  application:
    name: spring-ai

  ai:
    # You have to buy the api-key yourself; the vendor tells you the base-url when you buy it
    openai:
      api-key: sk-***********************************
      base-url: https://*********

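[Screenshot: test result]

To chat with more than one model, Spring's auto-configuration has to be switched off. Setting the property spring.ai.chat.client.enabled=false disables the ChatClient.Builder auto-configuration.

Updated configuration file: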

spring:
  application:
    name: spring-ai

  ai:
    # You have to buy the api-key yourself; the vendor tells you the base-url when you buy it
    openai:
      api-key: sk-***********************************
      base-url: https://*********
    chat:
      client:
        enabled: false

Updated code:

package com.ai.springai.controller;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

/**
 * @Title: MyController
 * @Author lizhe
 * @Package com.ai.springai.controller
 * @Date 2024/8/15 11:33 PM
 * @description: Tests the ChatClient API
 */

@RestController
public class MyController {
    private final ChatClient chatClient;
    // A second client may be needed when using multiple models; to be covered when it comes up
    // private final ChatClient chatClientOther;

    // Build the client from the OpenAI model explicitly, since the ChatClient.Builder auto-configuration is disabled
    public MyController(OpenAiChatModel openAiChatModel) {
        this.chatClient = ChatClient.create(openAiChatModel);
    }



    @GetMapping("/ai")
    String generation(@RequestParam(value = "message", defaultValue = "Tell me a joke") String userInput) {
        return this.chatClient.prompt()
                .user(userInput)
                .call()
                .content();
    }


}

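[Screenshot: test result]

Returning a ChatResponse

The response from the AI model is a rich structure defined by the ChatResponse type. It includes metadata about how the response was generated and can contain multiple responses (called Generations), each with its own metadata. The metadata includes the number of tokens used to create the response (each token is roughly 3/4 of a word). This matters because hosted AI models charge by the number of tokens used per request.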

    @GetMapping("/getChatResponse")
    String index() {
        ChatResponse chatResponse = chatClient.prompt()
                .user("Tell me a joke")
                .call()
                .chatResponse();
        return chatResponse.toString();
    }

Response:

ChatResponse [metadata={ @type: org.springframework.ai.openai.metadata.OpenAiChatResponseMetadata, id: chatcmpl-DsGla7joCoVCOc5ZiBoUDbFbNZYnA, usage: Usage[completionTokens=17, promptTokens=11, totalTokens=28], rateLimit: { @type: org.springframework.ai.openai.metadata.OpenAiRateLimit, requestsLimit: null, requestsRemaining: null, requestsReset: null, tokensLimit: null; tokensRemaining: null; tokensReset: null } }, generations=[Generation{assistantMessage=AssistantMessage{content='Sure thing! What kind of shoes do ninjas wear? Sneakers!', properties={finishReason=STOP, role=ASSISTANT, id=chatcmpl-DsGla7joCoVCOc5ZiBoUDbFbNZYnA, messageType=ASSISTANT}, messageType=ASSISTANT}, chatGenerationMetadata=org.springframework.ai.chat.metadata.ChatGenerationMetadata$1@5e07f126}]]

Returning an entity

    @GetMapping("/getEntity")
    ActorFilms getEntity() {
        ActorFilms actorFilms = chatClient.prompt()
                .user("Generate the filmography for a random actor.")
                .call()
                .entity(ActorFilms.class);
        return actorFilms;
    }
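The endpoint above maps the reply onto an ActorFilms type that is not shown in this post. A minimal definition, assuming the same shape as the record used in the Spring AI reference docs (an actor name plus a list of movie titles), could look like this:

```java
import java.util.List;

// Target type for .entity(ActorFilms.class): the model's reply is converted into this record.
// The field names are an assumption; they determine the structure requested from the model.
record ActorFilms(String actor, List<String> movies) {}
```

Because it is a record, Java 16 or later is required, which the JDK 17 baseline of this project satisfies.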

[Screenshot: result]

Streaming responses

Calling stream() gives you an asynchronous response:

    @GetMapping("/getStreamOutput")
    Flux<String> getStreamOutput() {
        // stream() returns a reactive Flux that emits the answer piece by piece
        return chatClient.prompt()
                .user("Tell a fairy tale of at least 500 words")
                .stream()
                .content();
    }

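The generated text arrives bit by bit as a stream, which gives a better user experience when a lot of text is generated.

Using defaults

Creating a ChatClient with a default system text in a @Configuration class simplifies runtime code. With defaults in place, you only specify the user text when calling the ChatClient, instead of setting the system text for every request in your runtime code path.

The default system text lets you set the model's role, task, style, and so on in advance.

config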

package com.ai.springai.config;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @Title: AIConfig
 * @Author lizhe
 * @Package com.ai.springai.config
 * @Date 2024/8/16 11:10 PM
 * @description: Configures the default system text
 */
@Configuration
public class AIConfig {
    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.defaultSystem("You are a professional legal adviser in China. Cite Chinese legal provisions when answering questions.")
                .build();
    }
}

controller

package com.ai.springai.controller;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.Map;

/**
 * @Title: TestDefaultText
 * @Author lizhe
 * @Package com.ai.springai.controller
 * @Date 2024/8/16 11:14 PM
 * @description: Tests the default system text
 */
@RestController
public class TestDefaultTextController {
    private final ChatClient chatClient;
    TestDefaultTextController(ChatClient chatClient){
        this.chatClient = chatClient;
    }
    @GetMapping("/defaultText")
    public Map<String, String> completion(@RequestParam(value = "message", defaultValue = "Is it illegal for an employer to withhold wages?") String message) {
        return Map.of("completion", chatClient.prompt().user(message).call().content());
    }

}

[Screenshot: result]

You can add a parameter to the default text; the parameter lets you give the AI a different role per request.

config

package com.ai.springai.config;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @Title: AIConfig
 * @Author lizhe
 * @Package com.ai.springai.config
 * @Date 2024/8/16 11:10 PM
 * @description: Configures the default system text
 */
@Configuration
public class AIConfig {
    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.defaultSystem("You are a professional {profession} in China")
                .build();
    }
}

controller


package com.ai.springai.controller;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.Map;

/**
 * @Title: TestDefaultText
 * @Author lizhe
 * @Package com.ai.springai.controller
 * @Date 2024/8/16 11:14 PM
 * @description: Tests the default system text
 */
@RestController
public class TestDefaultTextController {
    private final ChatClient chatClient;
    TestDefaultTextController(ChatClient chatClient){
        this.chatClient = chatClient;
    }
    @GetMapping("/defaultText")
    public Map<String, String> completion(@RequestParam(value = "message", defaultValue = "Give your professional view on environmental protection") String message,
                                          @RequestParam("profession") String profession) {
        // Fill the {profession} placeholder of the default system text for this request
        String content = chatClient.prompt()
                .system(systemSpec -> systemSpec.param("profession", profession))
                .user(message)
                .call()
                .content();
        return Map.of("completion", content);
    }

}

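[Screenshot: the farmer role]

[Screenshot: the environmentalist role]

Chat memory

ChatMemory represents the storage for a chat conversation's history. It provides methods to add messages to a conversation, retrieve messages from a conversation, and clear the conversation history.

There are two implementations, InMemoryChatMemory and CassandraChatMemory, which store the conversation history in memory and persisted with a time-to-live, respectively.

Several advisor implementations use the ChatMemory interface to enrich the prompt with conversation history; they differ in how the memory is added to the prompt:

- MessageChatMemoryAdvisor: retrieves the memory and adds it to the prompt as a collection of messages.
- PromptChatMemoryAdvisor: retrieves the memory and adds it to the prompt's system text.
- VectorStoreChatMemoryAdvisor: the constructor VectorStoreChatMemoryAdvisor(VectorStore vectorStore, String defaultConversationId, int chatHistoryWindowSize) lets you specify the VectorStore to retrieve the chat history from, the unique conversation ID, and the size of the chat history to retrieve, in tokens.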
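To illustrate the three operations a chat memory offers (add, retrieve, clear), here is a minimal in-memory sketch. SimpleChatMemory is a hypothetical class written for this post, not Spring AI's InMemoryChatMemory; it stores plain strings instead of message objects and is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SimpleChatMemory {
    // Conversation ID -> messages in the order they were added
    private final Map<String, List<String>> history = new HashMap<>();

    /** Appends a message to the given conversation. */
    void add(String conversationId, String message) {
        history.computeIfAbsent(conversationId, id -> new ArrayList<>()).add(message);
    }

    /** Returns up to the last N messages of the conversation, oldest first. */
    List<String> get(String conversationId, int lastN) {
        List<String> all = history.getOrDefault(conversationId, List.of());
        if (lastN <= 0) {
            return List.of();
        }
        return List.copyOf(all.subList(Math.max(0, all.size() - lastN), all.size()));
    }

    /** Forgets everything stored for the conversation. */
    void clear(String conversationId) {
        history.remove(conversationId);
    }

    public static void main(String[] args) {
        SimpleChatMemory memory = new SimpleChatMemory();
        memory.add("chat-1", "user: Hi, my name is Li.");
        memory.add("chat-1", "assistant: Hello Li!");
        memory.add("chat-1", "user: What is my name?");
        // The last two messages would be sent along with the next prompt
        System.out.println(memory.get("chat-1", 2));
    }
}
```

An advisor would call get(...) before each request and add(...) after each reply, which is how the model appears to remember earlier turns.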

Code from the official docs:

import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY;
import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_RETRIEVE_SIZE_KEY;

@Service
public class CustomerSupportAssistant {

    private final ChatClient chatClient;

    public CustomerSupportAssistant(ChatClient.Builder builder, VectorStore vectorStore, ChatMemory chatMemory) {

        this.chatClient = builder
                .defaultSystem("""
                        You are a customer chat support agent of an airline named "Funnair". Respond in a friendly,
                        helpful, and joyful manner.

                        Before providing information about a booking or cancelling a booking, you MUST always
                        get the following information from the user: booking number, customer first name and last name.

                        Before changing a booking you MUST ensure it is permitted by the terms.

                        If there is a charge for the change, you MUST ask the user to consent before proceeding.
                        """)
                .defaultAdvisors(
                        new PromptChatMemoryAdvisor(chatMemory),
                        // new MessageChatMemoryAdvisor(chatMemory), // CHAT MEMORY
                        new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()), // RAG
                        new LoggingAdvisor())
                .defaultFunctions("getBookingDetails", "changeBooking", "cancelBooking") // FUNCTION CALLING
                .build();
    }

    public Flux<String> chat(String chatId, String userMessageContent) {

        return this.chatClient.prompt()
                .user(userMessageContent)
                .advisors(a -> a
                        .param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId)
                        .param(CHAT_MEMORY_RETRIEVE_SIZE_KEY, 100))
                .stream().content();
    }
}

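After pasting this and adding the imports, my IDE still flags errors. I am not sure why; the official docs will probably be updated later.

The source code has no LoggingAdvisor among the available advisors.

Logging

SimpleLoggerAdvisor logs the request and response data of the ChatClient, which is useful for debugging and monitoring AI interactions.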
