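A getting-started tutorial for Spring AI: integrating a local Ollama model and implementing conversation memory.

Spring AI Getting Started: Integrating Local Ollama Models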

1. Environment Setup

1.1 Install Ollama

Ollama is a tool for running large language models locally. Install it first:

macOS: install with Homebrew
```
brew install ollama
```
Windows/Linux: download the installer for your platform from the Ollama website.

1.2 Pull an Ollama Model

After installation, pull a model (for example, Llama 3):

ollama pull llama3

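You can choose another model as needed, such as Mistral or DeepSeek. The configuration in section 3 uses deepseek-r1:1.5b, so pull that model as well (ollama pull deepseek-r1:1.5b) if you follow the configuration unchanged.

1.3 Verify the Ollama Service

Start the Ollama service (it usually starts automatically after installation), then verify that it is running: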

ollama list

If the service is running, this prints the list of downloaded models. By default the Ollama service listens on http://localhost:11434.
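
The same check can be done from Java. The sketch below assumes Ollama's model-listing endpoint GET /api/tags and uses only the JDK's HTTP client; the class and method names (OllamaProbe, tagsRequest, isUp) are hypothetical and chosen for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class OllamaProbe {

    // Builds a GET request for the model-listing endpoint (/api/tags).
    static HttpRequest tagsRequest(String baseUrl) {
        // Drop a trailing slash so the path separator is not doubled.
        String root = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return HttpRequest.newBuilder(URI.create(root + "/api/tags"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    // Returns true when the service answers with HTTP 200, false on any I/O failure.
    static boolean isUp(String baseUrl) {
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(tagsRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling OllamaProbe.isUp("http://localhost:11434") before starting the Spring Boot application is one way to fail fast when the service is not running.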

2. Create the Spring Boot Project

2.1 Create the Project with Spring Initializr

Create a Spring Boot 3.2+ project with the following dependencies:

Spring Web
Spring Boot DevTools

2.2 Add the Spring AI 1.0.0-M6 Dependency

Add the following to pom.xml:

<properties>
    <spring-ai.version>1.0.0-M6</spring-ai.version>
</properties>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <!-- Spring Boot Web dependency -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- Ollama starter -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
</dependencies>

3. Configure Ollama

3.1 Configuration File

Configure Ollama in application.yml or application.properties:

application.yml example:

server:
  port: 8080

spring:
  ai:
    ollama:
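      base-url: http://localhost:11434  # URL of the Ollama service
      chat:
        options:
          model: deepseek-r1:1.5b  # model name; must already be pulled locally
          temperature: 0.7  # optional: sampling temperature
          num-predict: 1024  # optional: maximum number of tokens to generate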

application.properties example:

server.port=8080

# Ollama configuration
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=deepseek-r1:1.5b
spring.ai.ollama.chat.options.temperature=0.7
spring.ai.ollama.chat.options.num-predict=1024
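
To see what such options mean at the HTTP level, it can help to look at the request Ollama itself receives. The sketch below builds a JSON body for Ollama's native POST /api/chat endpoint by hand, assuming that endpoint's model, messages, stream, and options fields, where num_predict is Ollama's native name for the generation-length limit. The class OllamaChatPayload and its methods are hypothetical helpers, not part of Spring AI.

```java
import java.util.Locale;

final class OllamaChatPayload {

    // Escapes characters that must not appear raw inside a JSON string.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '"') sb.append("\\\"");
            else if (c == '\\') sb.append("\\\\");
            else if (c == '\n') sb.append("\\n");
            else if (c == '\r') sb.append("\\r");
            else if (c == '\t') sb.append("\\t");
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.toString();
    }

    // Builds a non-streaming chat request with a single user message.
    static String body(String model, String userMessage, double temperature, int numPredict) {
        return String.format(Locale.ROOT,
                "{\"model\":\"%s\",\"stream\":false,"
                        + "\"messages\":[{\"role\":\"user\",\"content\":\"%s\"}],"
                        + "\"options\":{\"temperature\":%s,\"num_predict\":%d}}",
                escape(model), escape(userMessage), temperature, numPredict);
    }
}
```

In a real project a JSON library would replace the hand-written escaping; the point here is only the shape of the payload that the starter assembles from the configuration.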

4. Interacting with Ollama via ChatModel

ChatModel is the basic Spring AI interface for talking to chat models. The following shows how to use it with a local Ollama model.

4.1 Create the ChatModel Service

import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class OllamaChatService {

    private final ChatModel chatModel;

    @Autowired
    public OllamaChatService(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    public String chat(String message) {
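        // Wrap the text in a user message
        UserMessage userMessage = new UserMessage(message);

        // Build the prompt
        Prompt prompt = new Prompt(userMessage);

        // Call the Ollama model and return the text of the first result
        // (getText() is the accessor in 1.0.0-M6; older milestones used getContent())
        return chatModel.call(prompt).getResult().getOutput().getText();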
    }
}

4.2 Create the ChatModel Controller

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OllamaChatController {

    private final OllamaChatService chatService;

    @Autowired
    public OllamaChatController(OllamaChatService chatService) {
        this.chatService = chatService;
    }

    @GetMapping("/chat-model")
    public String chatWithModel(@RequestParam String message) {
        return chatService.chat(message);
    }
}

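5. Interacting with Ollama via ChatClient

ChatClient is a higher-level wrapper around ChatModel that offers a fluent API. The services below inject a ChatClient bean; Spring AI auto-configures only a ChatClient.Builder, so the bean itself comes from the configuration class in section 7.

5.1 Create the ChatClient Service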

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class OllamaClientService {

    private final ChatClient chatClient;

    @Autowired
    public OllamaClientService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String simpleChat(String message) {
        // Call Ollama through the fluent API
        return chatClient.prompt().user(message).call().content();
    }

    public String chatWithSystemPrompt(String message) {
        // Add a system prompt for this request
        return chatClient.prompt()
            .system("You are a professional assistant. Answer questions concisely.")
            .user(message)
            .call()
            .content();
    }
}

5.2 Create the ChatClient Controller

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OllamaClientController {

    private final OllamaClientService clientService;

    @Autowired
    public OllamaClientController(OllamaClientService clientService) {
        this.clientService = clientService;
    }

    @GetMapping("/chat-client")
    public String chatWithClient(@RequestParam String message) {
        return clientService.simpleChat(message);
    }

    @GetMapping("/chat-with-system")
    public String chatWithSystem(@RequestParam String message) {
        return clientService.chatWithSystemPrompt(message);
    }
}

6. Advanced Features

6.1 Streaming Responses

Spring AI can stream content as Ollama generates it, which suits interfaces that display output in real time.

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

@Service
public class OllamaStreamingService {

    private final ChatClient chatClient;

    @Autowired
    public OllamaStreamingService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public Flux<String> streamChat(String message) {
        return chatClient.prompt()
            .user(message)
            .stream()
            .content();
    }
}

The corresponding controller:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class OllamaStreamingController {

    private final OllamaStreamingService streamingService;

    @Autowired
    public OllamaStreamingController(OllamaStreamingService streamingService) {
        this.streamingService = streamingService;
    }

    @GetMapping(value = "/stream", produces = "text/event-stream")
    public Flux<String> stream(@RequestParam String message) {
        return streamingService.streamChat(message);
    }
}
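
On the client side, a text/event-stream response arrives as lines that start with "data:", separated by blank lines. The sketch below shows one way to pull the chunks out of such text with plain Java; SseDataParser is a hypothetical helper, and it handles only the data field, not the full server-sent events format.

```java
import java.util.ArrayList;
import java.util.List;

final class SseDataParser {

    // Extracts the payload of every "data:" line from raw SSE text.
    static List<String> dataLines(String rawStream) {
        List<String> chunks = new ArrayList<>();
        for (String line : rawStream.split("\r?\n")) {
            // Skip blank separators, comment lines, and other fields.
            if (!line.startsWith("data:")) {
                continue;
            }
            String value = line.substring("data:".length());
            // The SSE format allows one optional space after the colon.
            if (value.startsWith(" ")) {
                value = value.substring(1);
            }
            chunks.add(value);
        }
        return chunks;
    }
}
```

Joining the returned chunks with String.join("", chunks) rebuilds the streamed answer. Because a parser that follows the format drops one leading space per line, chunks that begin with a space may lose it; wrapping each chunk in a small JSON object is one possible way around that.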

6.2 Conversation Memory

Use ChatMemory to manage conversation context:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.messages.Message;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.UUID;

@Service
public class OllamaConversationService {

    private final ChatClient chatClient;
    private final ChatMemory memory;

    @Autowired
    public OllamaConversationService(ChatClient chatClient, ChatMemory memory) {
        this.chatClient = chatClient;
        this.memory = memory;
    }

    public String startConversation(String message) {
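        // Generate a fresh ID so each conversation gets its own history
        String conversationId = UUID.randomUUID().toString();

        // Arguments: memory store, conversation ID, history window size (20 messages), advisor order (1)
        MessageChatMemoryAdvisor messageChatMemoryAdvisor = new MessageChatMemoryAdvisor(memory, conversationId, 20, 1);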

        String response = chatClient.prompt()
                .user(message)
                .advisors(messageChatMemoryAdvisor)
                .call()
                .content();

        System.out.println("ConversationId : " + conversationId);
        return "ConversationId : " + conversationId + "\nResponse: " + response;
    }

    public String continueConversation(String conversationId, String message) {
        // Reuse the caller's conversation ID so the advisor loads that conversation's history
        MessageChatMemoryAdvisor messageChatMemoryAdvisor = new MessageChatMemoryAdvisor(memory, conversationId, 20, 1);
        return chatClient.prompt()
                .user(message)
                .advisors(messageChatMemoryAdvisor)
                .call()
                .content();
    }

    public List<Message> getMessageList(String conversationId) {
        // Return up to the last 20 messages stored for this conversation
        return memory.get(conversationId, 20);
    }
}
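
The idea behind a per-conversation memory with a history window can be illustrated without Spring AI. The sketch below is a simplified stand-in, not the library's InMemoryChatMemory: it stores plain strings per conversation ID and returns only the most recent entries, which is the role the window size of 20 plays above. All names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SlidingWindowMemory {

    // One message list per conversation ID.
    private final Map<String, List<String>> store = new HashMap<>();

    // Appends a message to the history of the given conversation.
    synchronized void add(String conversationId, String message) {
        store.computeIfAbsent(conversationId, id -> new ArrayList<>()).add(message);
    }

    // Returns a copy of at most the last n messages of the conversation.
    synchronized List<String> lastN(String conversationId, int n) {
        List<String> all = store.getOrDefault(conversationId, List.of());
        int from = Math.max(0, all.size() - n);
        return new ArrayList<>(all.subList(from, all.size()));
    }
}
```

Because histories are keyed by conversation ID, two users with different IDs never see each other's messages, and an unknown ID simply yields an empty list.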

The corresponding controller:

import com.example.service.OllamaConversationService;
import org.springframework.ai.chat.messages.Message;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;

@RestController
public class OllamaConversationController {

    private final OllamaConversationService conversationService;

    @Autowired
    public OllamaConversationController(OllamaConversationService conversationService) {
        this.conversationService = conversationService;
    }

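    /**
     * Starts a new conversation.
     *
     * @param message the first user message
     * @return the generated conversation ID followed by the model's reply
     */
    @GetMapping("/start-conversation")
    public String startConversation(@RequestParam(defaultValue = "My name is Zhang San") String message) {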
        return conversationService.startConversation(message);
    }


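    /**
     * Continues an existing conversation.
     *
     * @param sessionId the ConversationId returned by /start-conversation
     * @param message   the next user message
     * @return the model's reply
     */
    @GetMapping("/continue-conversation")
    public String continueConversation(@RequestParam(defaultValue = "sessionId") String sessionId, @RequestParam(defaultValue = "What is my name?") String message) {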
        return conversationService.continueConversation(sessionId, message);
    }

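    /**
     * Returns the stored messages of a conversation.
     *
     * @param sessionId the ConversationId returned by /start-conversation
     * @return the most recent messages of that conversation
     */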
    @GetMapping("/getMessageList")
    public List<Message> getMessageList(@RequestParam(defaultValue = "sessionId") String sessionId) {
        return conversationService.getMessageList(sessionId);
    }
}

7. Customizing the ChatClient

A configuration class can customize the ChatClient's behavior:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.PromptChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.api.Advisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
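public class ChatClientConfig {

    @Bean
    public ChatMemory chatMemory() {
        // Keeps conversation history in memory only; it is lost on restart
        return new InMemoryChatMemory();
    }

    @Bean
    public ChatClient ollamaChatClient(ChatClient.Builder chatClientBuilder) {
        return chatClientBuilder
                // Optional: default system prompt applied to every request
                .defaultSystem("You are a friendly AI assistant. Answer questions in clear, concise language.")
                // Optional: default advisors. Caution: a default chat-memory advisor makes
                // every request share one context. In production, manage memory per
                // conversation ID instead.
                // .defaultAdvisors(new PromptChatMemoryAdvisor(chatMemory()))
                .build();
    }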
}

8. Error Handling

Add a global exception handler to give users a better experience:

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;

@ControllerAdvice
public class GlobalExceptionHandler {

    // Disabled: a dedicated handler for model-call failures. It relies on a
    // ChatException type, so enable it (and add the matching import) only if your
    // Spring AI version provides that class.
    //
    // @ExceptionHandler(ChatException.class)
    // public ResponseEntity<String> handleChatException(ChatException ex) {
    //     System.err.println("Ollama model call failed: " + ex.getMessage());
    //     return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
    //         .body("Sorry, the model call failed. Please check that the Ollama service is running.");
    // }

    @ExceptionHandler(Exception.class)
    public ResponseEntity<String> handleGeneralException(Exception ex) {
        // Log the error
        System.err.println("System error: " + ex.getMessage());
        // Return a generic message to the caller
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
            .body("A system error occurred. Please try again later.");
    }
}
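
A generic handler cannot tell a stopped Ollama service apart from any other failure. One possible refinement, sketched below with plain Java, is to walk the exception's cause chain and look for a connection failure. This assumes the failure surfaces as a java.net.ConnectException somewhere in the chain, which depends on the HTTP client in use; FriendlyErrors and its methods are hypothetical names.

```java
import java.net.ConnectException;

final class FriendlyErrors {

    // Returns true if any exception in the cause chain is a ConnectException.
    static boolean causedByConnectionFailure(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }

    // Maps an exception to a message that is safe to show to the caller.
    static String toUserMessage(Throwable error) {
        if (causedByConnectionFailure(error)) {
            return "Cannot reach the Ollama service. Please check that it is running.";
        }
        return "A system error occurred. Please try again later.";
    }
}
```

An exception handler could return FriendlyErrors.toUserMessage(ex) as the response body, so that a refused connection produces a hint about the Ollama service instead of the generic message.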

9. Complete Example Project Structure

src/main/java/com/example/
├── config/
│   └── ChatClientConfig.java
├── controller/
│   ├── OllamaChatController.java
│   ├── OllamaClientController.java
│   ├── OllamaStreamingController.java
│   └── OllamaConversationController.java
├── service/
│   ├── OllamaChatService.java
│   ├── OllamaClientService.java
│   ├── OllamaStreamingService.java
│   └── OllamaConversationService.java
├── exception/
│   └── GlobalExceptionHandler.java
└── OllamaSpringAiApplication.java
src/main/resources/
└── application.yml

10. Testing the Application

After starting the Spring Boot application, test it as follows:

Test basic chat in a browser or with curl:

http://localhost:8080/chat-client?message=你好，请介绍一下自己
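
Browsers percent-encode non-ASCII query values automatically, but curl and programmatic clients generally need the encoding done up front. The sketch below shows one way to build such a URL in Java; ChatUrl is a hypothetical helper, and the base URL is passed in rather than assumed.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ChatUrl {

    // Builds the /chat-client URL with the message encoded as UTF-8 form data.
    static String chatClientUrl(String baseUrl, String message) {
        return baseUrl + "/chat-client?message="
                + URLEncoder.encode(message, StandardCharsets.UTF_8);
    }
}
```

URLEncoder writes spaces as "+", which servlet containers normally decode back to a space for query parameters.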

Test the streaming response:

http://localhost:8080/stream?message=详细解释一下Spring Boot框架

Test conversation memory:

First, start a conversation:

http://localhost:8080/start-conversation?message=我的名字是张三

Then continue the conversation, passing the ConversationId from the previous response as the sessionId parameter:

http://localhost:8080/continue-conversation?sessionId=你的sessionId&message=我叫什么名字
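
When scripting this two-step test, the ID has to be extracted from the first response. The sketch below assumes the response text format produced by startConversation in section 6.2 ("ConversationId : <id>" on the first line, then "Response: ..."); ConversationIdExtractor is a hypothetical helper.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConversationIdExtractor {

    // Matches a whole line of the form "ConversationId : <id>".
    private static final Pattern ID_LINE =
            Pattern.compile("^ConversationId : (\\S+)$", Pattern.MULTILINE);

    // Returns the conversation ID if the response body contains one.
    static Optional<String> extract(String responseBody) {
        Matcher matcher = ID_LINE.matcher(responseBody);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

The extracted value can then be sent as the sessionId query parameter of /continue-conversation. Returning the ID in a structured JSON field instead of free text would make this parsing step unnecessary.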

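11. Summary

This tutorial showed how to integrate a local Ollama model with Spring AI 1.0.0-M6. It covered:

Environment setup: installing Ollama and pulling a model
Project configuration: adding the Spring AI 1.0.0-M6 dependency and configuring Ollama
Basic interaction with Ollama through ChatModel
Higher-level ChatClient features such as the fluent API and system prompts
Streaming responses for a more responsive user experience
Conversation memory for keeping context across multi-turn conversations
Custom ChatClient configuration and error handling

With these pieces in place, you should be able to integrate a local Ollama model into a Spring Boot application and build on the features Spring AI provides. A local model keeps data on your own machine, needs no network connection once the model is downloaded, and is easier to customize, which makes it a good fit for scenarios with strict data-security requirements.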

12. Code Link

gitee.com/tree_boss/s...