Let's Migrate This Expense Tool from n8n to Elastic One Workflow

Author: Vladimir_Filonov, from Elastic

Building a conversational expense assistant with Elastic One Workflow, Gemini, and Telegram

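I recently stumbled upon a really good hands-on guide by Som (published yesterday):

"Building a Conversational Expense Assistant with Elasticsearch Agent Builder, Telegram, n8n, and Bedrock" (this will change before publication)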

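The idea is elegant: Telegram as the chat UI, n8n for overall orchestration, STT for voice messages, Bedrock for intent recognition and information extraction, and Elasticsearch + Agent Builder for storage and querying. Very clean. Surprisingly smooth. Honestly? It inspired me right away, which doesn't happen often with technical walkthroughs.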

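But since I spend most of my time working with Elastic's Workflow Engine, I started wondering:

"What if I rebuilt a simplified version of Som's assistant, but with Elastic One Workflow as the orchestrator instead of n8n?"

Same idea, same "talk to your expenses" magic, but with the whole flow running natively in Elastic. It looked like it should be doable. And it mostly was.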

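One Workflow is still at an early stage and doesn't yet have the full range of capabilities that n8n has. I made a few simplifications: I used Gemini for the LLM tasks (partly for LLM diversity, partly because I wanted to try it), and I poll Telegram instead of using webhooks. Polling works fine, though webhooks would be better.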

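So I built it. It's a compact expense assistant. It supports voice and text. It classifies intent. It indexes structured expense data together with semantic embeddings. It uses ES|QL tools for analytics. It replies right in Telegram. That's roughly the whole picture.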

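The main differences from Som's version? Elastic One Workflow replaces n8n, Gemini replaces Bedrock, and I use scheduled polling instead of Telegram webhooks. Everything happens inside Elastic, which is the whole point. I wasn't trying to make a 1:1 clone; I just wanted to see what the same idea looks like when everything lives in the ELK stack. That said, I'm honestly not sure polling every 15 seconds is the best approach in the long run. Webhooks would be cleaner, but One Workflow doesn't support them yet. So polling it is.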

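It polls Telegram every 15 seconds. It converts voice to text; I use Deepgram, but any STT service will do. It uses Gemini to classify the intent as INGEST or QUERY. Then it either indexes the expense data or runs an ES|QL query through an Agent Builder tool. If confidence is low? It asks for clarification. Once the initial setup is done, the whole flow is fairly straightforward.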
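Polling only works if each run asks Telegram for updates it has not seen yet. With the Bot API's getUpdates method, that means passing an offset one greater than the highest update_id already processed. Here is a minimal Python sketch of that bookkeeping, written outside One Workflow; the function name and the sample updates are hypothetical.

```python
from typing import Any


def next_offset(updates: list[dict[str, Any]], last_update_id: int = 0) -> int:
    """Return the offset to pass to the next getUpdates call.

    Telegram treats an update as confirmed once getUpdates is called with an
    offset higher than its update_id, so we ask for highest-seen + 1.
    """
    highest = max((u["update_id"] for u in updates), default=last_update_id)
    return max(highest, last_update_id) + 1


# Hypothetical sample of what getUpdates might return in its "result" array.
sample_updates = [
    {"update_id": 1001, "message": {"text": "Spent 250 on lunch"}},
    {"update_id": 1002, "message": {"text": "How much did I spend on food last week?"}},
]

print(next_offset(sample_updates, last_update_id=1000))
```

If no new updates arrive, the function simply returns last_update_id + 1 again, so an empty poll does not move the cursor.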

image_placeholder.png

You'll need Elasticsearch with Agent Builder and inference endpoints. Gemini configured through Vertex AI on GCP. A Telegram bot token (if you don't have one yet, get it from @BotFather). And an STT provider; I use Deepgram because it's easy to set up, but there are other options.

Setting up the index

Start by creating the Elasticsearch index. Nothing fancy here:

```
PUT /expenses
{
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "amount": { "type": "float" },
      "merchant": { "type": "keyword" },
      "category": { "type": "keyword" },
      "payment_method": { "type": "keyword" },
      "raw_transcript": { "type": "text" },
      "user_id": { "type": "keyword" },
      "telegram_chat_id": { "type": "keyword" },
      "semantic_text": {
        "type": "semantic_text",
        "inference_id": "gemini_embeddings"
      }
    }
  }
}
```

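Setting up Gemini

This part took longer than I expected. GCP setup is tedious. You know the drill: create a project, enable the Vertex AI API, create a service account with the Vertex AI User role, download the JSON key. Pick a region where Vertex AI is available; I used us-central1 because I already had it configured. There may be better regions, I didn't look into it closely.

Then in Kibana, go to Stack Management → Connectors and create a Gemini connector. Name it whatever you like. Set the API URL to the endpoint for your region. Fill in your project ID and region. For the model, use gemini-2.5-pro (or whichever version you prefer). Paste the entire service account JSON into the credentials field. The entire JSON, not part of it. I made that mistake the first time.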
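Regional Vertex AI endpoints follow the pattern `https://<region>-aiplatform.googleapis.com`, so the API URL can be derived from the region you picked. A tiny Python sketch; the helper name is made up for illustration, and the exact label of the URL field in the connector form may differ from what is described here.

```python
def vertex_ai_endpoint(region: str) -> str:
    """Build the regional Vertex AI API base URL for a GCP region."""
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    return f"https://{region}-aiplatform.googleapis.com"


print(vertex_ai_endpoint("us-central1"))
```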

For embeddings you need an inference endpoint. Create it like this:

```
PUT _inference/text_embedding/gemini_embeddings
{
  "service": "googlevertexai",
  "service_settings": {
    "service_account_json": "<SERVICE_ACCOUNT_JSON>",
    "project_id": "my-gemini-project-12345",
    "location": "us-central1",
    "model_id": "text-embedding-004"
  }
}
```

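Then create the .inference connector in Kibana. Path: Stack Management → Connectors → AI Connector.

Set the task type to text_embedding, the provider to googlevertexai, and the inference ID to gemini_embeddings. This must match what you used in the index mapping, or it won't work. Paste the same service account JSON in the secrets section. On save, the connector automatically creates/updates the inference endpoint. At least in theory; I had to refresh a few times before it took effect.

Important: the inference_id must match your index mapping, and the endpoint must exist before you can index documents with a semantic_text field. Elasticsearch generates the embeddings automatically at index time. At least that's what the docs say; I ran into some problems at first, but it worked in the end.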
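Because a mismatched inference_id is such an easy mistake, it can help to compare the mapping against the endpoints you know exist before indexing anything. The following is a minimal Python sketch under a few assumptions: it only inspects top-level fields, it flags a semantic_text field with no inference_id at all, and the set of existing endpoint IDs is supplied by you (the function and variable names are hypothetical).

```python
from typing import Any


def semantic_fields_missing_endpoint(
    mapping: dict[str, Any], existing_endpoints: set[str]
) -> dict[str, str]:
    """Return {field: inference_id} for semantic_text fields whose endpoint is unknown."""
    properties = mapping.get("mappings", {}).get("properties", {})
    return {
        name: spec.get("inference_id", "")
        for name, spec in properties.items()
        # Only semantic_text fields depend on an inference endpoint.
        if spec.get("type") == "semantic_text"
        and spec.get("inference_id") not in existing_endpoints
    }


expenses_mapping = {
    "mappings": {
        "properties": {
            "amount": {"type": "float"},
            "semantic_text": {
                "type": "semantic_text",
                "inference_id": "gemini_embeddings",
            },
        }
    }
}

# An empty result means every semantic_text field points at a known endpoint.
print(semantic_fields_missing_endpoint(expenses_mapping, {"gemini_embeddings"}))
print(semantic_fields_missing_endpoint(expenses_mapping, {"some_other_endpoint"}))
```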

Sample document:

```
POST /expenses/_doc
{
  "@timestamp": "2025-01-15T10:30:00Z",
  "amount": 250.0,
  "merchant": "cafe",
  "category": "food",
  "payment_method": "credit_card",
  "raw_transcript": "Spent 250 on lunch at the cafe",
  "semantic_text": "Spent 250 on lunch at the cafe",
  "user_id": "user123",
  "telegram_chat_id": "chat456"
}
```
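To make the document shape concrete, here is a minimal Python sketch that assembles a document like the sample above from the fields an LLM might extract plus the raw transcript. The function name, the "unknown" fallbacks, and the timestamp format are assumptions made for illustration, not part of the workflow.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_expense_doc(
    extracted: dict[str, Any],
    transcript: str,
    user_id: int,
    chat_id: int,
    now: Optional[datetime] = None,
) -> dict[str, Any]:
    """Assemble a document shaped like the expenses mapping."""
    now = now or datetime.now(timezone.utc)
    return {
        # Fall back to the current UTC time when no date was extracted.
        "@timestamp": extracted.get("date") or now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "amount": float(extracted["amount"]),
        "merchant": extracted.get("merchant", "unknown"),
        "category": extracted.get("category", "unknown"),
        "payment_method": extracted.get("payment_method", "unknown"),
        # The same text feeds both the plain text field and the semantic field.
        "raw_transcript": transcript,
        "semantic_text": transcript,
        # Telegram IDs are numeric; the mapping stores them as keywords.
        "user_id": str(user_id),
        "telegram_chat_id": str(chat_id),
    }


doc = build_expense_doc(
    {"amount": 250, "merchant": "cafe", "category": "food", "payment_method": "credit_card"},
    "Spent 250 on lunch at the cafe",
    user_id=123,
    chat_id=456,
)
print(doc)
```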

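The semantic_text field generates embeddings automatically. You can check whether it worked, although I'm not entirely sure what the output looks like on failure. I simply assumed it worked because the query returned results: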

```
GET /expenses/_search
{
  "_source": {
    "includes": ["*", "_inference_fields"]
  },
  "query": {
    "match_all": {}
  },
  "size": 1
}
```

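Creating the ES|QL tools

I created two ES|QL tools for Agent Builder. Well, I meant to create more, but these two were enough for my needs. More can probably be added later if needed. The first one searches by date range: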

```
POST kbn://api/agent_builder/tools
{
  "id": "search_expenses_by_date",
  "type": "esql",
  "description": "Search expenses within a date range. Returns amount, merchant, category, and payment method.",
  "tags": ["expenses", "analytics"],
  "configuration": {
    "query": "FROM expenses | WHERE @timestamp >= ?start_date AND @timestamp <= ?end_date | WHERE category == ?category | STATS total = SUM(amount) BY category, payment_method | SORT total DESC",
    "params": {
      "start_date": {
        "type": "date",
        "description": "Start date in ISO format (e.g., 2025-01-01)"
      },
      "end_date": {
        "type": "date",
        "description": "End date in ISO format (e.g., 2025-01-31)"
      },
      "category": {
        "type": "keyword",
        "description": "Category filter (optional - use empty string for all categories)",
        "optional": true,
        "defaultValue": ""
      }
    }
  }
}
```

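ES|QL tools use the ?param_name syntax. The category parameter is optional; an empty string returns all categories. I think so, anyway; I didn't test every edge case. I probably should have, but it works for my use case.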
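Since every ?param_name in the query has to line up with an entry under params, a quick consistency check before registering a tool can catch typos. The following is a minimal Python sketch using a regex heuristic rather than a real ES|QL parser (a literal question mark inside a quoted string would confuse it); the function name is hypothetical.

```python
import re

# Matches ?name placeholders such as ?start_date.
PARAM_PATTERN = re.compile(r"\?([A-Za-z_][A-Za-z0-9_]*)")


def check_esql_params(query: str, declared: dict) -> tuple[set[str], set[str]]:
    """Return (used but undeclared, declared but unused) parameter names."""
    used = set(PARAM_PATTERN.findall(query))
    declared_names = set(declared)
    return used - declared_names, declared_names - used


date_query = (
    "FROM expenses | WHERE @timestamp >= ?start_date AND @timestamp <= ?end_date "
    "| WHERE category == ?category "
    "| STATS total = SUM(amount) BY category, payment_method | SORT total DESC"
)
date_params = {"start_date": {}, "end_date": {}, "category": {}}

undeclared, unused = check_esql_params(date_query, date_params)
print("undeclared:", undeclared, "unused:", unused)
```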

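The second tool does semantic search. This turned out to be more useful than I expected; the semantic search results are actually quite good: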

```
POST kbn://api/agent_builder/tools
{
  "id": "semantic_search_expenses",
  "type": "esql",
  "description": "Semantically search expenses using natural language query. Useful for finding expenses by description, merchant name, or context.",
  "tags": ["expenses", "semantic-search"],
  "configuration": {
    "query": "FROM expenses METADATA _score | WHERE MATCH(semantic_text, ?query) | SORT _score DESC, @timestamp DESC | LIMIT ?limit",
    "params": {
      "query": {
        "type": "text",
        "description": "Natural language search query"
      },
      "limit": {
        "type": "integer",
        "description": "Maximum number of results",
        "optional": true,
        "defaultValue": 10
      }
    }
  }
}
```

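The MATCH function runs a semantic search on the semantic_text field using the inference endpoint from the mapping. You need METADATA _score to sort by relevance; I learned that only after wondering why the results weren't sorted properly. The error message wasn't particularly helpful either.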
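The METADATA _score requirement is easy to forget when writing new tools, so a rough safeguard is to lint the query string before registering it. This is a minimal Python sketch based on a simple regex heuristic, not a real ES|QL parser; the function name is hypothetical.

```python
import re


def sorts_by_score_without_metadata(query: str) -> bool:
    """True if the query sorts on _score but never requests it via METADATA."""
    # A SORT stage (text between two pipes) that mentions _score.
    sorts_on_score = re.search(r"\|\s*SORT\b[^|]*\b_score\b", query, re.IGNORECASE)
    # A METADATA clause in the same stage that lists _score.
    requests_score = re.search(r"\bMETADATA\b[^|]*\b_score\b", query, re.IGNORECASE)
    return bool(sorts_on_score) and not requests_score


good_query = (
    "FROM expenses METADATA _score | WHERE MATCH(semantic_text, ?query) "
    "| SORT _score DESC, @timestamp DESC | LIMIT ?limit"
)
bad_query = "FROM expenses | WHERE MATCH(semantic_text, ?query) | SORT _score DESC"

print(sorts_by_score_without_metadata(good_query))
print(sorts_by_score_without_metadata(bad_query))
```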

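The workflow

The workflow runs every 15 seconds. It polls Telegram. It processes messages. It's long because it handles voice and text, intent classification, confidence checks, and routing. It could probably be shorter, but I didn't optimize it much. Here it is: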

```yaml
name: "Expense Assistant Workflow"
description: "Scheduled workflow that polls Telegram and processes expense messages"
enabled: true

triggers:
  - type: scheduled
    with:
      every: "15s"

consts:
  telegram_bot_token: "<TELEGRAM_BOT_TOKEN>"
  telegram_api_url: "https://api.telegram.org/bot"
  last_update_id: 0  # This will be stored/retrieved from Elasticsearch

steps:
    # Step 1: Poll Telegram for new messages
    - name: poll_telegram
      type: http
      with:
        url: "{{ consts.telegram_api_url }}{{ consts.telegram_bot_token }}/getUpdates"
        method: GET
        body:
          offset: "{{ steps.get_last_update_id.output._source.last_update_id | default: 0 | plus: 1 }}"
      on-failure:
        continue: true  # Skip if there's a conflict, will retry on next run

    # Step 2: Check if there are new messages
    - name: check_new_messages
      type: if
      condition: "{{ steps.poll_telegram.output.result }}"
      steps:
        # Step 3: Process each message
        - name: process_messages
          type: foreach
          foreach: "{{ steps.poll_telegram.output.result | json: 2 }}"
          steps:
            # Extract message data from Telegram update
            - name: extract_message_data
              type: console
              with:
                message: "Processing message from user {{ foreach.item.message.from.id | json: 2 }}"

            # Check if message has text or voice
            - name: check_message_type
              type: if
              condition: "{{ foreach.item.message.voice != null or foreach.item.message.audio != null }}"
              steps:
                # Voice message - get file and transcribe
                - name: get_voice_file
                  type: http
                  with:
                    url: "{{ consts.telegram_api_url }}{{ consts.telegram_bot_token }}/getFile"
                    method: GET
                    body:
                      file_id: "{{ foreach.item.message.voice.file_id | default: foreach.item.message.audio.file_id }}"

                - name: transcribe_voice
                  type: http
                  with:
                    url: "https://api.deepgram.com/v1/listen"
                    method: POST
                    headers:
                      Authorization: "Token YOUR_DEEPGRAM_KEY"
                      Content-Type: "application/json"
                    body:
                      url: "https://api.telegram.org/file/bot{{ consts.telegram_bot_token }}/{{ steps.get_voice_file.output.result.file_path }}"
                  on-failure:
                    fallback:
                      - name: fallback_transcription
                        type: http
                        with:
                          url: "http://localhost:8000/transcribe"
                          method: POST
                          body:
                            audio_url: "https://api.telegram.org/file/bot{{ consts.telegram_bot_token }}/{{ steps.get_voice_file.output.result.file_path }}"
              else:
                # Text message
                - name: use_text_directly
                  type: console
                  with:
                    message: "Processing text message: {{ foreach.item.message.text | json: 2 }}"

            # Extract transcript (from voice or text)
            - name: get_transcript
              type: console
              with:
                message: "Transcript: {{ steps.transcribe_voice.output.results?.channels[0]?.alternatives[0]?.transcript or foreach.item.message.text }}"

            # Intent Classification
            - name: classify_intent
              type: .gemini
              connector-id: "gemini-expense-assistant-connector-id"
              with:
                subAction: invokeAI
                subActionParams:
                  model: "gemini-2.5-pro"
                  messages:
                    - role: user
                      content: "{{ steps.transcribe_voice.output.results?.channels[0]?.alternatives[0]?.transcript or foreach.item.message.text }}"
                  systemInstruction: |
                    You are an intent classifier for an expense assistant.
                    Classify the user's message as either:
                    - INGEST: User wants to add/record an expense (e.g., "Spent 250 on lunch", "Add dinner for 350")
                    - QUERY: User wants to query/search expenses (e.g., "How much did I spend last week?", "Show my food expenses")

                    Always respond with valid JSON only:
                    {
                      "intent": "INGEST" or "QUERY",
                      "confidence": 0.0-1.0,
                      "reasoning": "brief explanation"
                    }

            - name: check_confidence
              type: if
              # TODO: This condition needs to be adjusted based on your workflow structure.
              # The template extracts the confidence, but KQL needs a field name to compare.
              # Consider restructuring to extract the confidence value first, then reference it.
              condition: "confidence < 0.7"
              steps:
                - name: request_clarification
                  type: http
                  with:
                    url: "{{ consts.telegram_api_url }}{{ consts.telegram_bot_token }}/sendMessage"
                    method: POST
                    headers:
                      Content-Type: "application/json"
                    body:
                      chat_id: "{{ foreach.item.message.from.id | json: 2 }}"
                      text: "{{ steps.classify_intent.output.message.reasoning }}. Could you please clarify: Are you trying to add an expense or ask about existing expenses?"
              else:
                - name: route_by_intent
                  type: if
                  # TODO: This condition needs to be adjusted. KQL needs a field name.
                  # Consider restructuring to extract the intent value first, then reference it.
                  condition: "intent: INGEST"
                  steps:
                    # INGEST BRANCH
                    - name: extract_expense_data
                      type: .gemini
                      connector-id: "gemini-expense-assistant-connector-id"
                      with:
                        subAction: invokeAI
                        subActionParams:
                          model: "gemini-2.5-pro"
                          messages:
                            - role: user
                              content: "{{ steps.transcribe_voice.output.results?.channels[0]?.alternatives[0]?.transcript or foreach.item.message.text }}"
                          systemInstruction: |
                            Extract expense information from the user's message.
                            Return valid JSON with:
                            {
                              "amount": number,
                              "merchant": string,
                              "category": string (food, transport, entertainment, etc.),
                              "payment_method": string (credit_card, cash, debit, etc.),
                              "date": string (ISO format, default to today if not specified)
                            }

                    - name: index_expense
                      type: elasticsearch.index
                      with:
                        index: "expenses"
                        document:
                          "@timestamp": "{{ steps.extract_expense_data.output.message.date | default: 'now' }}"
                          amount: "{{ steps.extract_expense_data.output.message.amount }}"
                          merchant: "{{ steps.extract_expense_data.output.message.merchant }}"
                          category: "{{ steps.extract_expense_data.output.message.category }}"
                          payment_method: "{{ steps.extract_expense_data.output.message.payment_method }}"
                          raw_transcript: "{{ steps.transcribe_voice.output.results?.channels[0]?.alternatives[0]?.transcript or foreach.item.message.text }}"
                          semantic_text: "{{ steps.transcribe_voice.output.results?.channels[0]?.alternatives[0]?.transcript or foreach.item.message.text }}"
                          user_id: "{{ foreach.item.message.from.id | json: 1 }}"
                          telegram_chat_id: "{{ foreach.item.message.chat.id | json: 1 }}"

                    - name: send_ingest_response
                      type: http
                      with:
                        url: "{{ consts.telegram_api_url }}{{ consts.telegram_bot_token }}/sendMessage"
                        method: POST
                        headers:
                          Content-Type: "application/json"
                        body:
                          chat_id: "{{ foreach.item.message.from.id | json: 1 }}"
                          text: "✅ Added expense: {{ steps.extract_expense_data.output.message.amount }} at {{ steps.extract_expense_data.output.message.merchant }} ({{ steps.extract_expense_data.output.message.category }})"

                  else:
                    # QUERY BRANCH
                    - name: query_agent
                      type: http
                      with:
                        url: "http://localhost:5601/api/agent_builder/mcp"
                        method: POST
                        headers:
                          Authorization: "ApiKey YOUR_API_KEY"
                          Content-Type: "application/json"
                        body:
                          method: "tools/call"
                          params:
                            name: "semantic_search_expenses"
                            arguments:
                              query: "{{ steps.transcribe_voice.output.results?.channels[0]?.alternatives[0]?.transcript or foreach.item.message.text }}"
                              limit: 10

                    - name: send_query_response
                      type: http
                      with:
                        url: "{{ consts.telegram_api_url }}{{ consts.telegram_bot_token }}/sendMessage"
                        method: POST
                        headers:
                          Content-Type: "application/json"
                        body:
                          chat_id: "{{ foreach.item.message.from.id | json: 2 }}"
                          text: "{{ steps.query_agent.output.content[0].text | default: 'I found your expense information.' }}"

            # Step 4: Update last_update_id (store in Elasticsearch for persistence)
            # Update on every iteration - the last one will be the final value
            # This avoids needing array[length-1] syntax which LiquidJS doesn't support
            - name: update_last_update_id
              type: elasticsearch.index
              with:
                index: "telegram-bot-state"
                id: "last_update_id"
                document:
                  last_update_id: "{{ foreach.item.update_id }}"
                  updated_at: "now"
```
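The TODO comments on check_confidence and route_by_intent point at the same gap: the classifier's JSON reply has to be parsed into values before anything can be compared. Outside the workflow engine, that extract-then-route logic can be sketched in Python as follows. This is only an illustration of the intended behavior, not One Workflow syntax; the function names are hypothetical, and tolerating a Markdown code fence around the JSON is an assumption about how the model might reply.

```python
import json
import re
from typing import Any

CONFIDENCE_THRESHOLD = 0.7  # same cutoff as the check_confidence step

# A Markdown code fence, built from single backticks to keep this example readable.
FENCE = "`" * 3
FENCED_JSON = re.compile(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", re.DOTALL)


def parse_classifier_reply(reply: str) -> dict[str, Any]:
    """Parse the model's JSON reply, tolerating a code fence around it."""
    text = reply.strip()
    fenced = FENCED_JSON.match(text)
    if fenced:
        text = fenced.group(1)
    return json.loads(text)


def route(reply: str) -> str:
    """Return 'CLARIFY', 'INGEST', or 'QUERY' for a classifier reply."""
    try:
        parsed = parse_classifier_reply(reply)
        intent = str(parsed["intent"]).upper()
        confidence = float(parsed["confidence"])
    except (ValueError, KeyError, TypeError):
        # Unparseable or incomplete replies are treated like low confidence.
        return "CLARIFY"
    if confidence < CONFIDENCE_THRESHOLD or intent not in {"INGEST", "QUERY"}:
        return "CLARIFY"
    return intent


print(route('{"intent": "INGEST", "confidence": 0.92, "reasoning": "records a purchase"}'))
print(route('{"intent": "QUERY", "confidence": 0.41, "reasoning": "ambiguous"}'))
```

Treating a malformed reply the same as a low-confidence one keeps the bot asking for clarification instead of indexing or querying on a guess.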

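Testing

I sent "Spent 250 on lunch" (should be ingested) and "How much did I spend on food last week?" (should run a query).

It failed. Alas, as I mentioned earlier, One Workflow is at a very early stage and not everything runs smoothly. Luckily, I've already tracked down most of the bugs, and the workflow eventually ran in my local environment. So I just need a little time to collect all the fixes into PRs and get them merged, and hopefully you'll be able to get everything working soon. I'll update this post to keep it in sync =)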

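My takeaways

Reimplementing Som's expense assistant was more than a technical experiment. It was a chance to see what happens when orchestration, search, semantics, and AI inference all run on the same platform. Honestly? It feels remarkably cohesive. Like everything was meant to work together, which is rare.

Elastic has always been strong at storing and searching data. But watching a single workflow pull messages from Telegram, process voice or text, classify intent with Gemini, enrich and index documents, run ES|QL analytics, and deliver conversational replies, all without leaving the Elastic ecosystem, is genuinely fun. Elasticsearch is clearly growing into something more than a search engine. It's becoming a platform where AI agents and operational workflows can truly blend without getting in each other's way.

This assistant is far from its final state. With the foundation in place, there's a lot you can build on top of it. Spending insights. Anomaly detection ("hey, this expense looks odd..."). Monthly summaries. Conversational dashboards. Proactive notifications. Budget management. Multi-user bots. The usual stuff. All of it powered by the same engine, which is what makes it interesting.

If you want to see how AI agents behave when paired with real observable data, rather than chat completions floating in the cloud, a project like this is an unexpectedly hands-on way to explore it. More hands-on than I expected, at least.

Maybe your expenses will finally start answering your questions. Mine haven't yet, but they're getting closer.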

Original: discuss.elastic.co/t/dec-4th-2...