# Building a Scalable Streaming AI Agent with NestJS + LangChain + RxJS (with Tool Calling)

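In early implementations of many AI applications, we tend to focus only on "can we call the LLM at all" and overlook two key questions:

How do we support streaming output gracefully?
How do we give the model the ability to act (Tool / Agent)?

In this article I walk through a real codebase and build, from 0 to 1, an AI service with the following capabilities:

✅ An SSE streaming endpoint built on NestJS
✅ LLM access through LangChain
✅ Tool calling (querying a database, for example)
✅ RxJS as the bridge for the streaming response
✅ A simple Agent Loop

More importantly, the architecture is designed to be extended rather than to stay a one-off demo.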

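I. Why "Streaming + Agent"

1. The essence of streaming output

A traditional call looks like this:

const result = await llm.invoke(...)
return result

The problem: the user sees nothing until the entire answer has been generated.

Streaming output, by contrast:

Returns content while it is still being generated
Gives a ChatGPT-like user experience

Under the hood it relies on:

HTTP chunked transfer + a persistent (keep-alive) connection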
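
To make the wire format concrete: an SSE response body is plain text in which every event consists of one or more `data:` lines followed by a blank line. The sketch below shows that serialization. `formatSseEvent` is a hypothetical helper written for this article; with NestJS you never write it yourself, because the framework serializes each event for you.

```typescript
// Serialize one payload as a Server-Sent Events frame.
// `formatSseEvent` is a hypothetical helper for illustration only.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event !== undefined) {
    lines.push(`event: ${event}`);
  }
  // A payload that contains newlines must be split into several `data:` lines.
  for (const line of data.split("\n")) {
    lines.push(`data: ${line}`);
  }
  // A blank line terminates the event.
  return lines.join("\n") + "\n\n";
}

// Print the frames with their newlines escaped so the structure is visible.
console.log(JSON.stringify(formatSseEvent("Hello")));
console.log(JSON.stringify(formatSseEvent("line 1\nline 2", "token")));
```

Each frame is flushed to the client as soon as it is written, which is what lets the browser render tokens as they arrive.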

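2. The essence of an Agent

An LLM cannot "query a database" by itself. All it can do is:

Decide when to call a tool, and with which arguments

For example:

User: look up the info for user 001

Model: I need to call query_user

That is the Agent mindset:

User question → LLM → call a tool? → execute the tool → hand the result back to the LLM → output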

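II. Overall architecture

The system is split into three layers:

Controller (SSE)
    ↓
Service (Agent + Stream)
    ↓
LLM + Tools (LangChain)

Key points:

Controller: responsible for streaming the response
Service: responsible for the Agent logic
Model: responsible for generation and tool calling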

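III. SSE: making the endpoint stream

1. The elegant NestJS way: @Sse

@Sse('chat/stream')
chatStream(@Query('q') query: string): Observable<MessageEvent> {
  return new Observable<MessageEvent>((subscriber) => {
    let cancelled = false;
    (async () => {
      try {
        const stream = this.aiService.runChainStream(query);

        for await (const chunk of stream) {
          if (cancelled) break; // client disconnected: stop pulling from the generator
          subscriber.next({ data: chunk });
        }

        subscriber.complete();
      } catch (error) {
        subscriber.error(error);
      }
    })();
    // Teardown: runs when the client disconnects and the subscription is closed
    return () => {
      cancelled = true;
    };
  });
}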

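Key points:

@Sse() sets the SSE response headers for you. The exact set depends on the NestJS version and the HTTP layer, but the ones that matter are:

Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Transfer-Encoding: chunked

The handler must return:

Observable<MessageEvent>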

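2. Why RxJS?

A common question: why not just use async/await and return the result directly?

Because:

SSE is a continuous push, not a one-shot return

An async function resolves exactly once, while an Observable can emit any number of values before it completes. That makes RxJS a natural fit for:

Multiple events
Streaming data
A push model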
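
To show what "bridging" a pull-based stream into a push model involves, here is a dependency-free sketch. It is deliberately not RxJS: `PushObserver`, `pump`, and `fakeTokens` are hypothetical names, and the observer merely has the same three callbacks an RxJS subscriber exposes. The returned cancel function plays the role of an Observable's teardown.

```typescript
// The three callbacks a push-style consumer exposes.
interface PushObserver<T> {
  next(value: T): void;
  error(err: unknown): void;
  complete(): void;
}

// Pull values from an async iterable and push them into an observer.
// Returns a cancel function that stops the pumping.
function pump<T>(source: AsyncIterable<T>, observer: PushObserver<T>): () => void {
  let cancelled = false;
  (async () => {
    try {
      for await (const value of source) {
        if (cancelled) return; // consumer went away: stop pulling
        observer.next(value);
      }
      if (!cancelled) observer.complete();
    } catch (err) {
      if (!cancelled) observer.error(err);
    }
  })();
  return () => {
    cancelled = true;
  };
}

// Hypothetical token source standing in for the LLM stream.
async function* fakeTokens(): AsyncGenerator<string> {
  yield "Hel";
  yield "lo";
}
```

Calling `pump(fakeTokens(), { next, error, complete })` delivers each token to `next` as soon as it is produced. Invoking the returned function corresponds to the client closing the connection.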

IV. Service: implementing Agent + Streaming

The core logic lives here 👇

async *runChainStream(query: string): AsyncIterable<string>

This is an:

✅ async generator

1. The value of a generator

A regular function:

run → return → done

A generator function:

run → yield → pause → resume

That makes it a great fit for:

AI streaming output
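
As a minimal illustration of the pause-and-resume behaviour, the sketch below simulates a token stream with an async generator. `fakeLlmStream` and the 10 ms delay are made up for the example; a real model stream would take their place.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Yields one word at a time; execution pauses at each `yield`
// until the consumer asks for the next value.
async function* fakeLlmStream(text: string): AsyncGenerator<string> {
  for (const word of text.split(" ")) {
    await sleep(10); // simulate per-token latency
    yield word + " ";
  }
}

// The consumer pulls values with `for await` and can handle each one immediately.
async function printStream(): Promise<void> {
  for await (const piece of fakeLlmStream("streaming feels faster")) {
    console.log(piece);
  }
}
```

Calling `printStream()` logs each word as soon as it is yielded, instead of waiting for the whole sentence.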

2. Initializing the conversation

const messages: BaseMessage[] = [
  new SystemMessage(`You are an intelligent assistant that can call tools`),
  new HumanMessage(query)
];

3. Agent Loop (the core idea)

while (true) {

This is the most valuable piece of logic in the whole system 👇

4. Reading the LLM output as a stream

const stream = await this.modelWithTools.stream(messages);

for await (const chunk of stream) {

Key points:

The LLM returns an AsyncIterable
Each chunk is an incremental piece of content

5. Assembling the complete AIMessage

fullAIMessage = fullAIMessage
  ? fullAIMessage.concat(chunk)
  : chunk;

Why:

tool_call information arrives in fragments, so a single chunk rarely contains a complete call
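
The following sketch shows why accumulation is needed. It uses a simplified stand-in type, `ToolCallFragment`, rather than the real LangChain chunk type, and `mergeToolCallFragments` is a hypothetical helper; in the actual service, `concat` on the message chunk does this work for you.

```typescript
// Simplified stand-in for a streamed tool-call fragment (not the real LangChain type).
interface ToolCallFragment {
  index: number; // which tool call this fragment belongs to
  id?: string; // typically only present on the first fragment
  name?: string; // typically only present on the first fragment
  args?: string; // a slice of the JSON-encoded arguments
}

interface MergedToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

function mergeToolCallFragments(fragments: ToolCallFragment[]): MergedToolCall[] {
  const partial = new Map<number, { id: string; name: string; args: string }>();
  for (const f of fragments) {
    const acc = partial.get(f.index) ?? { id: "", name: "", args: "" };
    acc.id += f.id ?? "";
    acc.name += f.name ?? "";
    acc.args += f.args ?? "";
    partial.set(f.index, acc);
  }
  // Only after the last fragment is the argument string valid JSON.
  return Array.from(partial.entries())
    .sort(([a], [b]) => a - b)
    .map(([, acc]) => ({
      id: acc.id,
      name: acc.name,
      args: JSON.parse(acc.args || "{}") as Record<string, unknown>,
    }));
}
```

Parsing any single fragment on its own would throw, because a slice such as `{"user` is not valid JSON. That is the reason the loop keeps one accumulated message per round.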

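6. Detecting a tool call

const hasToolCallChunk =
  !!fullAIMessage.tool_call_chunks &&
  fullAIMessage.tool_call_chunks.length > 0;

The key logic:

// content can also be an array of content blocks, so only forward plain strings
if (!hasToolCallChunk && typeof chunk.content === 'string' && chunk.content) {
  yield chunk.content;
}

👉 Only plain text is sent straight to the frontend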

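7. Deciding whether to stop

const toolCalls = fullAIMessage?.tool_calls ?? []; // assumed from context: the parsed calls once the stream has ended
if (!toolCalls.length) {
  return;
}

Meaning:

The model no longer needs any tool, so the loop ends here. Otherwise, the accumulated AI message has to be appended to messages before any tool result, so that the model sees its own tool calls in the next round.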

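8. Executing the tool

// The loop and the two bindings below are assumed; the original excerpt starts at the if
for (const toolCall of toolCalls) {
  const toolName = toolCall.name;
  const toolCallId = toolCall.id ?? '';
  if (toolName === 'query_user') {
    const args = queryUserArgsSchema.parse(toolCall.args);
    const result = await queryUserTool.invoke(args);

    messages.push(
      new ToolMessage({
        content: result,
        name: toolName,
        tool_call_id: toolCallId
      })
    );
  }
}

Then:

👉 Enter the next round of while(true)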
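
Putting steps 3 to 8 together, here is a self-contained sketch of the loop that runs without LangChain. Everything in it is a stand-in: `fakeModel` replaces the real model, `queryUser` and `users` replace the real tool and database, and the `maxRounds` guard is an addition of this sketch, not part of the original code.

```typescript
interface ToolCall { id: string; name: string; args: { userId: string } }
interface ModelChunk { content?: string; toolCall?: ToolCall }
type Message =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string; toolCalls: ToolCall[] }
  | { role: "tool"; content: string; toolCallId: string };

// Hypothetical in-memory data and tool.
const users: Record<string, { name: string }> = { "001": { name: "Alice" } };
async function queryUser(args: { userId: string }): Promise<string> {
  return JSON.stringify(users[args.userId] ?? { error: "user not found" });
}

// Fake model: requests the tool first, then answers from the tool result.
async function* fakeModel(messages: Message[]): AsyncGenerator<ModelChunk> {
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === "tool") {
    yield { content: "User info: " };
    yield { content: last.content };
  } else {
    yield { toolCall: { id: "call_1", name: "query_user", args: { userId: "001" } } };
  }
}

async function* runAgent(query: string, maxRounds = 5): AsyncGenerator<string> {
  const messages: Message[] = [
    { role: "system", content: "You are an assistant that can call tools." },
    { role: "user", content: query },
  ];
  for (let round = 0; round < maxRounds; round++) {
    let text = "";
    const toolCalls: ToolCall[] = [];
    for await (const chunk of fakeModel(messages)) {
      if (chunk.toolCall) toolCalls.push(chunk.toolCall);
      if (chunk.content && toolCalls.length === 0) {
        text += chunk.content;
        yield chunk.content; // plain text goes straight to the client
      }
    }
    if (toolCalls.length === 0) return; // the final answer has already been streamed
    // Record the assistant turn, then one tool message per call.
    messages.push({ role: "assistant", content: text, toolCalls });
    for (const call of toolCalls) {
      const result = call.name === "query_user" ? await queryUser(call.args) : "unknown tool";
      messages.push({ role: "tool", content: result, toolCallId: call.id });
    }
  }
  throw new Error("agent did not finish within maxRounds");
}
```

Bounding the loop with `maxRounds` instead of `while (true)` is a defensive choice: if the model keeps requesting tools, the request fails loudly rather than running forever.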

V. Tool design (the important part)

1. Defining the arguments with zod

const queryUserArgsSchema = z.object({
  userId: z.string().describe('User ID')
});

👉 This step is critical:

It tells the LLM what the arguments look like
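
For intuition, the schema ends up in the request as a JSON Schema attached to the tool definition. The object below is hand-written to show the approximate shape an OpenAI-compatible endpoint receives; it is an assumption for illustration, and the exact output depends on the LangChain and zod versions in use.

```typescript
// Approximate tool definition sent to an OpenAI-compatible endpoint.
// Hand-written for illustration; not generated by LangChain.
const queryUserToolDefinition = {
  type: "function",
  function: {
    name: "query_user",
    description: "Look up user information",
    parameters: {
      type: "object",
      properties: {
        // The text passed to .describe() becomes the field description.
        userId: { type: "string", description: "User ID" },
      },
      required: ["userId"],
    },
  },
};

console.log(JSON.stringify(queryUserToolDefinition, null, 2));
```

The field descriptions are the only documentation the model gets, so it pays to write them as carefully as you would a public API comment.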

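2. Defining the Tool

const queryUserTool = tool(
  async ({ userId }) => {
    const user = database.users[userId];
    // JSON.stringify(undefined) returns undefined, so report a miss explicitly
    return JSON.stringify(user ?? { error: `user ${userId} not found` });
  },
  {
    name: 'query_user',
    description: 'Look up user information',
    schema: queryUserArgsSchema
  }
);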

3. Binding the tool

this.modelWithTools = model.bindTools([queryUserTool]);

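VI. Decoupling the model with a Provider

{
  provide: 'CHAT_MODEL',
  useFactory: (configService: ConfigService) => {
    return new ChatOpenAI({
      // Despite the key's name, this must resolve to a chat model, not an embedding model
      model: configService.get('EMBEDDING_MODEL_NAME'),
      apiKey: configService.get('OPENAI_API_KEY'),
      configuration: {
        baseURL: configService.get('OPENAI_BASE_URL')
      }
    });
  },
  inject: [ConfigService]
}

Why design it this way?

👉 The key ideas:

The model is replaceable
Business logic does not depend on a specific LLM

Later on you can:

Switch to OpenAI
Switch to a local model
Plug in multi-model routing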
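
The same idea can be sketched without NestJS: the service depends on a small interface, and a factory decides which implementation to build. All names here (`ChatModel`, `FakeOpenAIModel`, `FakeLocalModel`, `createChatModel`, `ChatService`) are hypothetical, and both models are fakes that only echo their input.

```typescript
// Minimal contract the service depends on.
interface ChatModel {
  readonly provider: string;
  invoke(prompt: string): Promise<string>;
}

class FakeOpenAIModel implements ChatModel {
  readonly provider = "openai";
  async invoke(prompt: string): Promise<string> {
    return `[openai] ${prompt}`;
  }
}

class FakeLocalModel implements ChatModel {
  readonly provider = "local";
  async invoke(prompt: string): Promise<string> {
    return `[local] ${prompt}`;
  }
}

// Factory keyed by a config value, playing the role of `useFactory` in the provider.
function createChatModel(config: { provider: "openai" | "local" }): ChatModel {
  return config.provider === "local" ? new FakeLocalModel() : new FakeOpenAIModel();
}

// The service only sees the interface, so swapping providers requires no change here.
class ChatService {
  private readonly model: ChatModel;
  constructor(model: ChatModel) {
    this.model = model;
  }
  ask(question: string): Promise<string> {
    return this.model.invoke(question);
  }
}
```

Multi-model routing then becomes one more `ChatModel` implementation that picks a delegate per request, without touching `ChatService`.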

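VII. The real value of this architecture

Many tutorials stop here, but this is where the interesting questions begin 👇

1. From "calling a model" → "building a system"

What you have now is:

The skeleton of an extensible AI system

It is easy to extend with:

🔍 A web search Tool
📅 Calendar management
📧 Sending email
📝 Automated writing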

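2. The Agent Loop is the core abstraction

This piece:

while(true)

is in essence:

A "reason, then act" cycle (Reasoning Loop)

Later on you can extend it with:

Scheduling across multiple tools
Multi-step reasoning
A memory system (Memory)
ReAct Agent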

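3. RxJS + Generator are a perfect match

Generator: produces the data
RxJS: distributes the data

👉 One is responsible for generating, the other for pushing

4. Tools are the AI's "hands"

An AI without Tools:

Can only chat

An AI with Tools:

Can actually get things done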

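VIII. Where to go next

If you want to take this system to production grade, these are the directions to explore:

1. Tooling

Dynamic Tool registration
Tool permission control
Tool timeout / retry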
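
As one possible starting point for the first and third items, the sketch below replaces an `if (toolName === ...)` chain with a registry and wraps every call in a timeout. `ToolRegistry`, `ToolHandler`, and the 5000 ms default are assumptions made for this example; retry and permission checks are left out.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

class ToolRegistry {
  private readonly handlers = new Map<string, ToolHandler>();

  // Tools can be added at runtime instead of being hard-coded in the loop.
  register(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  // Run a tool by name; rejects if the tool is unknown or exceeds `timeoutMs`.
  async invoke(
    name: string,
    args: Record<string, unknown>,
    timeoutMs = 5000,
  ): Promise<string> {
    const handler = this.handlers.get(name);
    if (!handler) throw new Error(`unknown tool: ${name}`);

    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error(`tool ${name} timed out`)), timeoutMs);
    });
    try {
      // Whichever settles first wins: the tool result or the timeout.
      return await Promise.race([handler(args), timeout]);
    } finally {
      clearTimeout(timer); // do not leave the timer pending after the race
    }
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying work. A tool that supports cancellation would need its own abort mechanism.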

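2. Memory

Short-term memory (conversation context)
Long-term memory (vector database)

3. Multi-Agent collaboration

Planner Agent
Executor Agent

4. Streaming optimizations

Typewriter effect on the frontend
Chunk merging strategy
Interruption control (Abort)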
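
A chunk merging strategy can be as simple as buffering tiny chunks until a minimum size is reached, which reduces the number of SSE events without changing the text. The sketch below is one such length-based strategy; `coalesce` and `singleCharTokens` are hypothetical names.

```typescript
// Buffer small chunks until at least `minLength` characters are available, then emit.
async function* coalesce(
  source: AsyncIterable<string>,
  minLength: number,
): AsyncGenerator<string> {
  let buffer = "";
  for await (const chunk of source) {
    buffer += chunk;
    if (buffer.length >= minLength) {
      yield buffer;
      buffer = "";
    }
  }
  if (buffer) yield buffer; // flush whatever is left when the source ends
}

// Hypothetical source that emits one character per chunk.
async function* singleCharTokens(): AsyncGenerator<string> {
  for (const t of ["H", "e", "l", "l", "o", "!"]) {
    yield t;
  }
}
```

Wrapping a stream as `coalesce(stream, 4)` keeps the generator interface intact, so it can be inserted between the service and the controller. A purely length-based buffer can hold back the last few characters while the model is slow, so a production version would probably add a time-based flush as well.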

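IX. Summary

The heart of this implementation is not the code but the ideas behind it:

✅ SSE addresses user experience

✅ A Generator addresses streaming generation

✅ RxJS addresses streaming distribution

✅ Tools give the AI the ability to act

✅ The Agent Loop unifies reasoning and execution

If you understand all of this, you have already gone from:

❌ An engineer who calls APIs

✅ To a designer of AI systems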