Elasticsearch: Building AI agents with AI SDK and Elastic

Author: Carly Richmond, Elastic

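Do you keep hearing about AI agents without being quite sure what they are, or how to build one in TypeScript (or JavaScript)? Join me as I dig into the concept of AI agents, the use cases they suit, and an example travel planning agent built with AI SDK and Elasticsearch.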

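Do you keep hearing about AI agents without being quite sure what they are, or how they connect to Elastic? Here I'll dig into AI agents, specifically:

What is an AI agent?
What problems can AI agents solve?
An example travel planning agent built with AI SDK, TypeScript and Elasticsearch, with the code available on GitHub.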

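What is an AI agent?

An AI agent is software that uses artificial intelligence to carry out tasks autonomously and take actions on behalf of a human. It does so by combining one or more large language models (LLMs) with user-defined tools (or functions) that perform specific actions. For example, these tools can:

Extract information from databases, sensors, APIs or search engines such as Elasticsearch.
Perform complex calculations and have the LLM summarize the results.
Make key decisions quickly based on a variety of data inputs.
Raise the necessary alerts and feedback based on the response.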
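
To make the idea of an LLM combined with tools concrete, here is a minimal sketch of an agent loop. It does not use AI SDK: the model is a hard-coded stand-in, and the names lookupWeather, fakeModel and runAgent are hypothetical, chosen only for illustration.

```typescript
// A tool is a named function with a description the model can read.
type Tool = {
  description: string;
  execute: (input: string) => string;
};

// Hypothetical tool; a real one would query an API or a search engine.
const tools: Record<string, Tool> = {
  lookupWeather: {
    description: "Return the weather for a location",
    execute: (location) => `Sunny in ${location}`,
  },
};

type ModelDecision =
  | { kind: "tool"; toolName: string; input: string }
  | { kind: "answer"; text: string };

// Stand-in for an LLM: a real model would pick a tool from the descriptions.
function fakeModel(prompt: string, toolResults: string[]): ModelDecision {
  if (toolResults.length > 0) {
    return { kind: "answer", text: `Based on the tools: ${toolResults.join("; ")}` };
  }
  if (prompt.toLowerCase().includes("weather")) {
    return { kind: "tool", toolName: "lookupWeather", input: "Glasgow" };
  }
  return { kind: "answer", text: "No tool needed." };
}

// The agent loop: ask the model, run any requested tool, feed the result back.
function runAgent(prompt: string, maxSteps = 3): string {
  const toolResults: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const decision = fakeModel(prompt, toolResults);
    if (decision.kind === "answer") return decision.text;
    const tool = tools[decision.toolName];
    if (!tool) return `Unknown tool: ${decision.toolName}`;
    toolResults.push(tool.execute(decision.input));
  }
  return "Stopped after reaching the step limit.";
}
```

The step limit plays the same role as a cap on model round trips: it stops the loop if the model keeps asking for tools instead of answering.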

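What can AI agents do?

AI agents can be applied across many domains depending on their type. Possible examples include:

Utility-based agents: evaluate actions and make recommendations to maximize the gain, such as suggesting films and TV series based on a person's viewing history.
Model-based agents: make real-time decisions based on sensor input, such as self-driving cars or smart vacuum cleaners.
Learning agents: combine data and machine learning to identify patterns and anomalies, for example in fraud detection.
Investment advice agents: recommend investments based on a person's risk appetite and existing portfolio to maximize their return. If such agents can weigh accuracy, reputational risk and regulatory factors, they could speed up decision making.
Simple chatbots: like today's chatbots, these can access a user's account information and answer basic questions in natural language.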

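Example: travel planner

To better understand what AI agents can do, and how to build one with familiar web technologies, let's walk through a simple travel planning assistant written with AI SDK, TypeScript and Elasticsearch.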

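Architecture

Our example consists of 5 distinct elements:

A tool named weatherTool that fetches weather data from the Weather API for the location given by the person asking.
A tool named fcdoTool that provides the current travel status of the destination from the GOV.UK Content API.
A tool named flightTool that retrieves flight information from Elasticsearch using a simple query.
All of the information above is passed to the LLM, GPT-4 Turbo.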

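Model choice

When building your first AI agent, it can be hard to decide which model to use. Resources such as the Hugging Face Open LLM Leaderboard are a good starting point. You can also consult the Berkeley Function-Calling Leaderboard for guidance on tool usage.

In our case, AI SDK specifically recommends using models with strong tool calling capabilities, such as gpt-4 or gpt-4-turbo, as described in its Prompt Engineering documentation. Choosing the wrong model can leave the LLM unable to call multiple tools as expected, or can even produce compatibility errors such as the following: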

```bash
# Llama3 lack of tooling support (3.1 or higher)
llama3 does not support tools

# Unsupported toolChoice option to configure tool usage
AI_UnsupportedFunctionalityError: 'Unsupported tool choice type: required' functionality not supported.
```

Prerequisites

To run this example, make sure you follow the prerequisites in the repository README.

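Basic chat assistant

The simplest AI agent you can create with AI SDK generates a response from the LLM without any additional context. AI SDK supports many JavaScript frameworks, as listed in its documentation. However, the AI SDK UI library documentation lists varying levels of support for React, Svelte, Vue.js and SolidJS, and many tutorials target Next.js. For that reason, our example is written with Next.js.

The basic structure of any AI SDK chatbot uses the useChat hook to handle requests to a backend route, which by default is /api/chat/.

The page.tsx file contains our client-side implementation in the Chat component, including the submission, loading and error handling capabilities exposed by the useChat hook. Loading and error handling are optional, but giving an indication of the request state is recommended. Compared with a simple REST call, an agent can take considerable time to respond, so it is important to keep the user informed along the way and to avoid rapid repeated clicks and duplicate calls.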
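
One way to reason about those request states is to derive the UI flags from the loading and error values in a single place. The sketch below is illustrative only: ChatStatus, UiState and deriveUiState are hypothetical names, and disabling the input while a request is in flight is an assumption of this sketch rather than something the example page does.

```typescript
// Subset of the state exposed by the chat hook that the UI cares about.
type ChatStatus = { isLoading: boolean; error?: Error };

type UiState = {
  showSpinner: boolean;
  showRetry: boolean;
  inputDisabled: boolean;
};

// Map the request state to what the page should render.
function deriveUiState({ isLoading, error }: ChatStatus): UiState {
  const hasError = error != null;
  return {
    // Spinner only while waiting and nothing has gone wrong
    showSpinner: isLoading && !hasError,
    // Retry button only after a failure
    showRetry: hasError,
    // Assumption: block new input while loading to avoid duplicate calls
    inputDisabled: isLoading || hasError,
  };
}
```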

Because this component involves client-side interaction, I use the use client directive to make sure it is treated as part of the client bundle:

```tsx
'use client';

import { useChat } from '@ai-sdk/react';
import { Converter } from 'showdown';

import Spinner from './components/spinner';

// Showdown converts the Markdown returned by the LLM into HTML
const markdownConverter = new Converter();

// Note: className attributes are omitted in this listing
export default function Chat() {
  /* useChat hook helps us handle the input, resulting messages, and also handle the loading and error states for a better user experience */
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop, error, reload } = useChat();

  return (
    <div>
      <div>
        {
          /* Display all user messages and assistant responses */
          messages.map(m => (
            <div key={m.id}>
              <div>
                {/* Messages with the role of *assistant* denote responses from the LLM */}
                <div>{m.role === 'assistant' ? 'Sorley' : 'Me'}</div>
                {/* User or LLM generated content */}
                <div dangerouslySetInnerHTML={{ __html: markdownConverter.makeHtml(m.content) }}></div>
              </div>
            </div>
          ))}
      </div>

      {
        /* Spinner shows when awaiting a response */
        isLoading && (
          <div>
            <Spinner />
            <button id="stop__button" type="button" onClick={() => stop()}>
              Stop
            </button>
          </div>
        )}

      {
        /* Show error message and retry button when something goes wrong */
        error && (
          <>
            <div>Unable to generate a plan. Please try again later!</div>
            <button id="retry__button" type="button" onClick={() => reload()}>
              Retry
            </button>
          </>
        )}

      {/* Form using default input and submission handler from the useChat hook */}
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          placeholder="Where would you like to go?"
          onChange={handleInputChange}
          disabled={error != null}
        />
      </form>
    </div>
  );
}
```

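The Chat component keeps the user input in the input property exposed by the hook and sends it to the relevant route on submission. I've used the default handleSubmit method, which calls the /api/chat/ POST route.

The handler for this route, located in /api/chat/route.ts, initializes the connection to the gpt-4-turbo LLM through the OpenAI provider: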

```typescript
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { NextResponse } from 'next/server';

// Allow streaming responses up to 30 seconds to address typically longer responses from LLMs
export const maxDuration = 30;

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  try {
    // Generate response from the LLM using the provided model, system prompt and messages
    const result = streamText({
      model: openai('gpt-4-turbo'),
      system: 'You are a helpful assistant that returns travel itineraries',
      messages
    });

    // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
    return result.toDataStreamResponse();
  } catch (e) {
    console.error(e);
    // Signal the failure with an error status so the client can surface it
    return new NextResponse('Unable to generate a plan. Please try again later!', { status: 500 });
  }
}
```

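Note that the implementation above picks up the API key from the OPENAI_API_KEY environment variable by default. If you need to customize the configuration of the OpenAI provider, use the createOpenAI method to override the provider settings.

With the route above, a little help from Showdown to format the GPT Markdown output as HTML, and a bit of CSS magic in the globals.css file, we end up with a simple responsive UI that generates an itinerary based on the user's prompt:

Video: basic LLM itinerary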

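Adding tools

Adding tools to an AI agent essentially means creating custom functionality that the LLM can use to enhance the response it generates. At this stage I'll add 3 new tools that the LLM can choose to use when generating an itinerary, as shown in the diagram below:

Weather tool

While the generated itinerary is a great start, we may want to add information the LLM was not trained on, such as the weather. This prompts us to write our first tool, which not only serves as input to the LLM but also provides additional data that lets us adapt the UI.

The weather tool, shown in full below, takes a single parameter, location, which the LLM extracts from the user input. The parameters property holds the schema, written with the TypeScript schema validation library Zod, and validates that arguments of the correct type are passed in. The description property lets you define what the tool does, helping the LLM decide whether to call it.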

```typescript
import { tool as createTool } from 'ai';
import { z } from 'zod';

import { WeatherResponse } from '../model/weather.model';

export const weatherTool = createTool({
  description: 'Display the weather for a holiday location',
  parameters: z.object({
    location: z.string().describe('The location to get the weather for')
  }),
  execute: async function ({ location }) {
    // While a historical forecast may be better, this example gets the next 3 days
    // The location is encoded because it can contain spaces or other reserved characters
    const url = `https://api.weatherapi.com/v1/forecast.json?q=${encodeURIComponent(location)}&days=3&key=${process.env.WEATHER_API_KEY}`;

    try {
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error(`Weather API responded with status ${response.status}`);
      }

      const weather: WeatherResponse = await response.json();
      // Flatten the current conditions into a single object for the UI and the LLM
      return {
        location: location,
        condition: weather.current.condition.text,
        condition_image: weather.current.condition.icon,
        temperature: Math.round(weather.current.temp_c),
        feels_like_temperature: Math.round(weather.current.feelslike_c),
        humidity: weather.current.humidity
      };
    } catch (e) {
      console.error(e);
      return {
        message: 'Unable to obtain weather information',
        location: location
      };
    }
  }
});
```

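As you may have guessed, the execute property is where we define an asynchronous function containing the tool logic. Specifically, the location passed to our tool function is sent to the weather API. The response is then transformed into a single JSON object that can be shown in the UI and is also used to generate the itinerary.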
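
The transformation step can be pulled out into a pure function, which makes it easy to test without calling the API. The sketch below assumes a response shape limited to the fields the tool reads; the WeatherResponse and WeatherSummary interfaces and the toWeatherSummary name are hypothetical and may not match the models in the repository.

```typescript
// Assumed subset of the Weather API response, limited to the fields used in this post.
interface WeatherResponse {
  current: {
    condition: { text: string; icon: string };
    temp_c: number;
    feelslike_c: number;
    humidity: number;
  };
}

// Flat object handed to both the UI and the LLM.
interface WeatherSummary {
  location: string;
  condition: string;
  condition_image: string;
  temperature: number;
  feels_like_temperature: number;
  humidity: number;
}

// Flatten the nested response and round temperatures to whole degrees.
function toWeatherSummary(location: string, weather: WeatherResponse): WeatherSummary {
  return {
    location,
    condition: weather.current.condition.text,
    condition_image: weather.current.condition.icon,
    temperature: Math.round(weather.current.temp_c),
    feels_like_temperature: Math.round(weather.current.feelslike_c),
    humidity: weather.current.humidity,
  };
}
```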

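Given that we are only running a single tool at this stage, we don't need to consider sequential or parallel flows. It is simply a case of adding the tools property to the streamText method that handles the LLM output in the original api/chat route: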

```typescript
import { weatherTool } from '@/app/ai/weather.tool';

// Other imports omitted

export const tools = {
  displayWeather: weatherTool,
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    system:
      'You are a helpful assistant that returns travel itineraries based on the specified location.',
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}
```

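The tool output is provided alongside the messages, which lets us offer the user a more complete experience. Each message contains a parts property, and each part has a type; tool parts also carry a toolInvocation with a state. When the type is tool-invocation and the state is result, we can pull the returned result from the toolInvocation property and display it as needed.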
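
As a rough illustration of that check, the sketch below collects the finished results for one tool from a message. The types are simplified stand-ins written for this example, not the types exported by AI SDK, and completedToolResults is a hypothetical helper name.

```typescript
// Simplified stand-ins for the message shapes described above (assumptions).
type ToolInvocation =
  | { state: "call"; toolName: string; toolCallId: string }
  | { state: "result"; toolName: string; toolCallId: string; result: unknown };

type MessagePart =
  | { type: "text"; text: string }
  | { type: "tool-invocation"; toolInvocation: ToolInvocation };

type ChatMessage = {
  id: string;
  role: "user" | "assistant";
  parts: MessagePart[];
};

// Return the results of every completed invocation of the named tool.
function completedToolResults(message: ChatMessage, toolName: string): unknown[] {
  const results: unknown[] = [];
  for (const part of message.parts) {
    // Skip plain text parts
    if (part.type !== "tool-invocation") continue;
    const invocation = part.toolInvocation;
    // Only invocations in the result state carry a result
    if (invocation.state === "result" && invocation.toolName === toolName) {
      results.push(invocation.result);
    }
  }
  return results;
}
```

Invocations that have not yet reached the result state are skipped here; in the page they are the ones that get a loading placeholder.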

The updated page.tsx source shows the weather summary alongside the generated itinerary:

```tsx
'use client';

import { useChat } from '@ai-sdk/react';
import Image from 'next/image';
import { Converter } from 'showdown';

import { Weather } from './components/weather';

import pending from '../../public/multi-cloud.svg';

// Showdown converts the Markdown returned by the LLM into HTML
const markdownConverter = new Converter();

// Note: className attributes are omitted in this listing
export default function Chat() {
  /* useChat hook helps us handle the input, resulting messages, and also handle the loading and error states for a better user experience */
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop, error, reload } = useChat();

  return (
    <div>
      <div>
        {
          /* Display all user messages and assistant responses */
          messages.map(m => (
            <div key={m.id}>
              <div>
                {/* Messages with the role of *assistant* denote responses from the LLM */}
                <div>{m.role === 'assistant' ? 'Sorley' : 'Me'}</div>
                {/* Tool handling */}
                <div>
                  {
                    m.parts.map(part => {
                      if (part.type === 'tool-invocation') {
                        const { toolName, toolCallId, state } = part.toolInvocation;

                        if (state === 'result') {
                          // Show weather results
                          if (toolName === 'displayWeather') {
                            const { result } = part.toolInvocation;
                            return (
                              <div key={toolCallId}>
                                <Weather {...result} />
                              </div>
                            );
                          }
                        } else {
                          // Tool call still in progress: show a placeholder
                          return (
                            <div key={toolCallId}>
                              {toolName === 'displayWeather' ? (
                                <div>
                                  <Image src={pending} width={80} height={80} alt="Placeholder Weather" />
                                  <p>Loading weather...</p>
                                </div>
                              ) : null}
                            </div>
                          );
                        }
                      }
                      // Nothing to render for other part types or tools
                      return null;
                    })}
                </div>
                {/* User or LLM generated content */}
                <div dangerouslySetInnerHTML={{ __html: markdownConverter.makeHtml(m.content) }}></div>
              </div>
            </div>
          ))}
      </div>

      {/* Spinner and loading handling omitted */}

      {/* Form using default input and submission handler from the useChat hook */}
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          placeholder="Where would you like to go?"
          onChange={handleInputChange}
          disabled={error != null}
        />
      </form>
    </div>
  );
}
```

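The code above gives the user the following output:

FCDO tool

The power of AI agents is that the LLM can choose to trigger multiple tools to gather relevant information when generating a response. Let's say we want to check the travel guidance for the destination country. The code below shows how to create a new tool, fcdoTool (registered later under the name fcdoGuidance), that triggers a call to the GOV.UK Content API: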

```typescript
import { tool as createTool } from 'ai';
import { z } from 'zod';

import { FCDOResponse } from '../model/fco.model';

export const fcdoTool = createTool({
  description: 'Display the FCDO guidance for a destination',
  parameters: z.object({
    country: z.string().describe('The country of the location to get the guidance for')
  }),
  execute: async function ({ country }) {
    // GOV.UK slugs are lower case and hyphenated, for example south-africa
    const slug = country.trim().toLowerCase().replaceAll(' ', '-');
    const url = `https://www.gov.uk/api/content/foreign-travel-advice/${slug}`;

    try {
      const response = await fetch(url, { headers: { 'Content-Type': 'application/json' } });
      const fcoResponse: FCDOResponse = await response.json();

      // An empty alert_status array is reported as Unknown
      const alertStatus: string = fcoResponse.details.alert_status.length === 0
        ? 'Unknown'
        : fcoResponse.details.alert_status[0].replaceAll('_', ' ');

      return {
        status: alertStatus,
        url: fcoResponse.details?.document?.url
      };
    } catch (e) {
      console.error(e);
      // Report the requested country (the original referenced an undefined location variable)
      return {
        message: 'Unable to obtain FCDO information',
        location: country
      };
    }
  }
});
```

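You'll notice that the format is very similar to the weather tool discussed previously. Indeed, to include the tool in the LLM output, it is just a case of adding it to the tools property and amending the prompt in the /api/chat route: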

```typescript
// Imports omitted

export const tools = {
  fcdoGuidance: fcdoTool,
  displayWeather: weatherTool,
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try/catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    // Each fragment ends with a space so the sentences do not run together
    system:
      "You are a helpful assistant that returns travel itineraries based on a location. " +
      "Use the current weather from the displayWeather tool to adjust the itinerary and give packing suggestions. " +
      "If the FCDO tool warns against travel DO NOT generate an itinerary.",
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}
```
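
System prompts assembled from string fragments are easy to get subtly wrong, for instance when two sentences are joined without a space or a full stop between them. A minimal sketch of one way to guard against that is below; buildSystemPrompt is a hypothetical helper and is not part of AI SDK.

```typescript
// Join prompt instructions into one string, making sure each fragment
// ends with punctuation and fragments are separated by a single space.
function buildSystemPrompt(instructions: string[]): string {
  return instructions
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => (/[.!?]$/.test(line) ? line : `${line}.`))
    .join(" ");
}

const systemPrompt = buildSystemPrompt([
  "You are a helpful assistant that returns travel itineraries based on a location",
  "Use the current weather from the displayWeather tool to adjust the itinerary and give packing suggestions.",
  "If the FCDO tool warns against travel DO NOT generate an itinerary.",
]);
```

The resulting string could then be passed as the system value in place of the hand-concatenated one.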

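Once the components displaying the tool output are added to the page, the output for a country where travel is advised against should look something like this:

LLMs that support tool calling can choose not to call a tool unless they judge it necessary. With gpt-4-turbo, both of our tools are called in parallel. However, earlier attempts with llama3.1 resulted in only a single tool being called, depending on the input.

Flight information tool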

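RAG, or Retrieval Augmented Generation, refers to software architectures where documents retrieved from a search engine or database are passed as context to the LLM, so that it grounds its response in the provided set of documents. This architecture allows the LLM to generate more accurate responses based on data it has not been trained on. While Agentic RAG processes the documents using a defined set of tools, or alongside vector or hybrid search, it is also possible to use RAG as part of a more complex flow with traditional lexical search, as we do here.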
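
The core of the RAG idea, stripped of any particular SDK, is placing retrieved documents into the prompt ahead of the question. The sketch below shows that step only; RetrievedDoc and buildRagPrompt are hypothetical names, and the prompt wording is an assumption rather than anything prescribed by AI SDK or Elasticsearch.

```typescript
// A document pulled back from a search engine or database.
type RetrievedDoc = { id: string; text: string };

// Build a prompt that asks the model to answer from the supplied documents.
function buildRagPrompt(question: string, docs: RetrievedDoc[]): string {
  // Number each document so the model can refer to it
  const context = docs.map((doc, i) => `[${i + 1}] ${doc.text}`).join("\n");
  return [
    "Answer the question using only the context below.",
    "Context:",
    context.length > 0 ? context : "(no documents found)",
    `Question: ${question}`,
  ].join("\n");
}
```

In the agent described in this post the retrieved documents reach the model as tool results rather than through a hand-built prompt, but the principle of grounding the response in retrieved data is the same.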

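To pass flight information to the LLM alongside the other tools, the final tool, flightTool, uses the Elasticsearch JavaScript client to pull outbound and inbound flights from Elasticsearch for the provided origin and destination: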

```typescript
import { tool as createTool } from 'ai';
import { z } from 'zod';

import { Client } from '@elastic/elasticsearch';
import { SearchResponseBody } from '@elastic/elasticsearch/lib/api/types';

import { Flight } from '../model/flight.model';

const index: string = "upcoming-flight-data";
const client: Client = new Client({
  node: process.env.ELASTIC_ENDPOINT,
  auth: {
    apiKey: process.env.ELASTIC_API_KEY || "",
  },
});

// Pull the source documents out of a search response
function extractFlights(response: SearchResponseBody<Flight>): (Flight | undefined)[] {
  return response.hits.hits.map(hit => hit._source);
}

export const flightTool = createTool({
  description:
    "Get flight information for a given destination from Elasticsearch, both outbound and return journeys",
  parameters: z.object({
    destination: z.string().describe("The destination we are flying to"),
    origin: z
      .string()
      .describe(
        "The origin we are flying from (defaults to London if not specified)"
      ),
  }),
  execute: async function ({ destination, origin }) {
    try {
      // msearch takes alternating header and body entries, one pair per search
      const responses = await client.msearch({
        searches: [
          // Outbound leg
          { index: index },
          {
            query: {
              bool: {
                must: [
                  {
                    match: {
                      origin: origin,
                    },
                  },
                  {
                    match: {
                      destination: destination,
                    },
                  },
                ],
              },
            },
          },

          // Return leg
          { index: index },
          {
            query: {
              bool: {
                must: [
                  {
                    match: {
                      origin: destination,
                    },
                  },
                  {
                    match: {
                      destination: origin,
                    },
                  },
                ],
              },
            },
          },
        ],
      });

      if (responses.responses.length < 2) {
        throw new Error("Unable to obtain flight data");
      }

      return {
        outbound: extractFlights(responses.responses[0] as SearchResponseBody<Flight>),
        inbound: extractFlights(responses.responses[1] as SearchResponseBody<Flight>)
      };
    } catch (e) {
      console.error(e);
      // Report the requested route (the original referenced an undefined location variable)
      return {
        message: "Unable to obtain flight information",
        origin: origin,
        destination: destination,
      };
    }
  },
});
```

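This example uses the Multi search API to pull the outbound and inbound flights in separate searches, before extracting the documents with the extractFlights utility method.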
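
A multi search request is a flat list of alternating header and body entries, so the two legs can be generated from one helper instead of being written out twice. The sketch below builds that list and extracts the documents without needing a cluster; routeQuery, buildFlightSearches and extractSources are hypothetical names, and the Flight type is an assumed subset of the real model.

```typescript
// Assumed subset of the flight document, for illustration only.
type Flight = { origin: string; destination: string };

type SearchHeader = { index: string };
type SearchBody = {
  query: { bool: { must: { match: Record<string, string> }[] } };
};

// One search body matching flights between two places.
function routeQuery(origin: string, destination: string): SearchBody {
  return {
    query: {
      bool: { must: [{ match: { origin } }, { match: { destination } }] },
    },
  };
}

// Header and body pairs for the outbound leg followed by the return leg.
function buildFlightSearches(
  index: string,
  origin: string,
  destination: string
): (SearchHeader | SearchBody)[] {
  return [
    { index },
    routeQuery(origin, destination),
    { index },
    routeQuery(destination, origin),
  ];
}

// Keep only the hits that actually carry a source document.
function extractSources<T>(hits: { _source?: T }[]): T[] {
  const sources: T[] = [];
  for (const hit of hits) {
    if (hit._source !== undefined) sources.push(hit._source);
  }
  return sources;
}
```

Filtering out hits without a source document is a design choice of this sketch; it gives the caller a plain array of flights rather than one that may contain undefined entries.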

To use the tool output, we need to amend our prompt and tool collection once more, updating the /api/chat/route.ts file:

```typescript
// Imports omitted

// Allow streaming responses up to 30 seconds to address typically longer responses from LLMs
export const maxDuration = 30;

export const tools = {
  getFlights: flightTool,
  displayWeather: weatherTool,
  fcdoGuidance: fcdoTool
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try/catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    // Each fragment ends with a space so the sentences do not run together
    system:
      "You are a helpful assistant that returns travel itineraries based on location, the FCDO guidance from the specified tool, and the weather captured from the displayWeather tool. " +
      "Use the flight information from tool getFlights only to recommend possible flights in the itinerary. " +
      "Return an itinerary of sites to see and things to do based on the weather. " +
      "If the FCDO tool warns against travel DO NOT generate an itinerary.",
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}
```

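With the final prompt, all 3 tools are called to generate an itinerary that includes flight options:

Summary

If you weren't fully familiar with the concept of AI agents before, you should be now! We explored it through a simple travel planner example built with AI SDK, TypeScript and Elasticsearch. We could extend our planner to add other data sources, let users book trips and tours, or even generate image banners based on the location (support for this in AI SDK is currently experimental).

If you haven't dug into the code yet, you can check it out here!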

Resources

AI SDK Core documentation
AI SDK Core > Tool Calling
Elasticsearch JavaScript client
Travel Planner AI Agent | GitHub


Original article: Building AI Agents with AI SDK and Elastic - Elasticsearch Labs