Using CrewAI with Elasticsearch

By: Jeffrey Rengifo, Elastic

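Learn how to use CrewAI to create an Elasticsearch agent for your crew of agents and carry out a market research task.

CrewAI is a framework for orchestrating agents. It uses role-playing so that multiple agents can collaborate on complex tasks.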

If you want to learn more about agents and how they work, we recommend reading this article.
Image source: https://github.com/crewAIInc/crewAI

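CrewAI claims to be faster and simpler than similar frameworks such as LangGraph, because it does not need large amounts of boilerplate or extra agent-orchestration code the way Autogen does. CrewAI is also compatible with Langchain tools, which opens up further possibilities.

CrewAI has a wide range of use cases, including research agents, stock market analysis, lead capture, contract analysis, website generation, and travel recommendations.

In this article, you will create an agent that uses Elasticsearch as its data retrieval tool and works with other agents to run market research on the products stored in our Elasticsearch index.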

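Starting from a concept such as "summer clothes", an expert agent searches Elasticsearch for the most semantically similar products, a researcher agent searches the web for relevant sites and products, and finally a writer agent combines everything into a market analysis report.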

You can find a Notebook with the complete example here.

To get this crew of agents running, complete the following steps:

  • Install and import packages

  • Prepare the data

  • Create the Elasticsearch CrewAI tool

  • Configure the agents

  • Configure the tasks

Install and import packages

pip install elasticsearch==8.17 'crewai[tools]'

import json
import os

from getpass import getpass
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

from crewai import Agent, Crew, Task
from crewai.tools import tool
from crewai_tools import SerperDevTool, WebsiteSearchTool

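We import SerperDevTool to search the internet, through the Serper API, for websites related to our query, and WebsiteSearchTool to run a RAG search over the content that is found.

Serper offers 2,500 free queries, which you can claim here.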

Prepare the data

Elasticsearch client

os.environ["ELASTIC_ENDPOINT"] = "your_es_url"
os.environ["ELASTIC_API_KEY"] = "es_api_key"

_client = Elasticsearch(
    os.environ["ELASTIC_ENDPOINT"],
    api_key=os.environ["ELASTIC_API_KEY"],
)
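
The imports above include `getpass`, which the client setup never uses. A minimal sketch, assuming you would rather not hard-code credentials in the notebook: read each setting from the environment and fall back to an interactive prompt only when it is missing. The helper name `get_setting` is hypothetical and not part of the original example.

```python
import os
from getpass import getpass


def get_setting(name: str, prompt=getpass) -> str:
    """Return the value of environment variable `name`, prompting if unset.

    `prompt` defaults to getpass so that secrets are not echoed; it is a
    parameter only so the function can be exercised without a terminal.
    """
    value = os.environ.get(name)
    if not value:
        value = prompt(f"{name}: ")
        os.environ[name] = value  # cache for later cells in the same session
    return value
```

With this helper, the client could be created as `Elasticsearch(get_setting("ELASTIC_ENDPOINT"), api_key=get_setting("ELASTIC_API_KEY"))`.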

Create the inference endpoint

To enable semantic search, you need to create an inference endpoint that uses ELSER:

_client.options(
    request_timeout=60, max_retries=3, retry_on_timeout=True
).inference.put(
    task_type="sparse_embedding",
    inference_id="clothes-inference",
    body={
        "service": "elasticsearch",
        "service_settings": {
            "adaptive_allocations": {"enabled": True},
            "num_threads": 1,
            "model_id": ".elser_model_2",
        },
    },
)

Create the mappings

Now we apply the ELSER model to a single semantic_text field, so that the agent can run hybrid queries.

_client.indices.create(
        index="summer-clothes",
        body={
             "mappings": {
                "properties": {
                  "title": {
                    "type": "text",
                    "copy_to": "semantic_field"
                  },
                  "description": {
                    "type": "text",
                    "copy_to": "semantic_field"
                  },
                  "price": {
                    "type": "float"
                  },
                  "semantic_field": {
                    "type": "semantic_text",
                    "inference_id": "clothes-inference"
                  }
                }
             }
        }
    )
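
To make the role of `copy_to` in the mapping above more concrete, here is a rough pure-Python illustration (not how Elasticsearch implements it) of which values end up feeding `semantic_field` for a given document. The function name `collect_copy_to` is hypothetical, and the sketch only handles flat documents with a single string `copy_to` target per field.

```python
def collect_copy_to(doc: dict, properties: dict) -> dict:
    """Group a document's values by the copy_to target declared in the mapping."""
    targets: dict = {}
    for field, config in properties.items():
        target = config.get("copy_to")
        if target and field in doc:
            targets.setdefault(target, []).append(doc[field])
    return targets


# Same properties as the mapping created above
properties = {
    "title": {"type": "text", "copy_to": "semantic_field"},
    "description": {"type": "text", "copy_to": "semantic_field"},
    "price": {"type": "float"},
    "semantic_field": {"type": "semantic_text", "inference_id": "clothes-inference"},
}

doc = {
    "title": "Swim Shorts",
    "description": "Swim shorts in woven fabric.",
    "price": 14.99,
}

copied = collect_copy_to(doc, properties)
print(copied)
```

Because `price` has no `copy_to`, the semantic query only sees the title and description text.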

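Index the data

We will store some data about clothes so that we can compare our own source with the information the researcher agent finds on the internet.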

documents = [
    {
        "title": "Twist-Detail Crop Top",
        "description": "Fitted crop top in woven, patterned fabric with linen content. Wide shoulder straps, sweetheart neckline, and gathered side seams for a gently draped effect. Twisted detail at center bust, cut-out section at front, and wide smocking at back. Lined",
        "price": 34.99
    },
    {
        "title": "Rib-knit Tank Top",
        "description": "Short, fitted top in a soft rib knit. Extra-narrow shoulder straps and a square neckline.",
        "price": 7.49
    },
    {
        "title": "Linen-blend Shorts",
        "description": "Shorts in an airy, woven linen blend. High, ruffle-trimmed waist, narrow drawstring and covered elastic at waistband, and discreet side pockets.",
        "price": 13.99
    },
    {
        "title": "Twill Cargo Shorts",
        "description": "Fitted shorts in cotton twill with a V-shaped yoke at front and back. High waist, zip fly with button, and patch front pockets.",
        "price": 20.99
    },
    {
        "title": "Slim Fit Ribbed Tank Top",
        "description": "Slim-fit tank top in medium weight, ribbed cotton-blend jersey with a fitted silhouette. Straight-cut hem.",
        "price": 8.49
    },
    {
        "title": "Relaxed Fit Linen Resort Shirt",
        "description": "Relaxed-fit shirt in airy linen. Resort collar, buttons without placket, yoke at back, and short sleeves. Straight-cut hem. Fabric made from linen is breathable, looks great when ironed or wrinkled, and softens over time.",
        "price": 17.99
    },
    {
        "title": "Swim Shorts",
        "description": "Swim shorts in woven fabric. Drawstring and covered elastic at waistband, side pockets, and a back pocket with hook-loop fastener. Small slit at sides. Mesh liner shorts.",
        "price": 14.99
    },
    {
        "title": "Baggy Fit Cargo Shorts",
        "description": "Baggy-fit cargo shorts in cotton canvas with a generous but not oversized silhouette. Zip fly with button, diagonal side pockets, back pockets with flap and snap fasteners, and bellows leg pockets with snap fasteners.",
        "price": 20.99
    },
    {
        "title": "Muslin Shorts",
        "description": "Shorts in airy cotton muslin. High, ruffle-trimmed waist, covered elastic at waistband, and an extra-narrow drawstring with a bead at ends. Discreet side pockets.",
        "price": 15.99
    },
    {
        "title": "Oversized Lyocell-blend Dress",
        "description": "Short, oversized dress in a woven lyocell blend. Gathered, low-cut V-neck with extra-narrow ties at front, 3/4-length, raglan-cut balloon sleeves with narrow elastic at cuffs, and seams at waist and hips with delicate piping. Unlined.",
        "price": 38.99
    }
]

def build_data():
    for doc in documents:
        yield {
            "_index": "summer-clothes",
            "_source": doc
        }

try:
    success, errors = bulk(_client, build_data())
    print(f"{success} documents indexed successfully")
    if errors:
        print("Errors during indexing:", errors)
                
except Exception as e:
    print(f"Error: {str(e)}")
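
Note that `build_data` lets Elasticsearch generate the document IDs, so running the indexing cell twice would store every product twice. One possible variation, sketched below under the assumption that titles are unique within the catalog, derives a stable `_id` from the title so that a re-run overwrites documents instead of duplicating them. The helper names `stable_id` and `build_actions` are hypothetical.

```python
import hashlib


def stable_id(title: str) -> str:
    """Derive a deterministic document ID from the product title."""
    normalized = title.strip().lower().encode("utf-8")
    return hashlib.sha1(normalized).hexdigest()[:16]


def build_actions(docs, index="summer-clothes"):
    """Yield bulk actions in the shape accepted by elasticsearch.helpers.bulk."""
    for doc in docs:
        yield {"_index": index, "_id": stable_id(doc["title"]), "_source": doc}


# Two sample products; in the notebook you would pass `documents` instead
sample = [
    {"title": "Swim Shorts", "description": "Swim shorts in woven fabric.", "price": 14.99},
    {"title": "Muslin Shorts", "description": "Shorts in airy cotton muslin.", "price": 15.99},
]
actions = list(build_actions(sample))
```

The generator can then replace `build_data()` in the call above, as in `bulk(_client, build_actions(documents))`.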

Create the Elasticsearch CrewAI tool

CrewAI's tool decorator makes it simple to turn a regular Python function into a tool that agents can use. Here is how we create an Elasticsearch search tool:

@tool("es tool")
def elasticsearch_tool(question: str) -> str:
    """
    Search in Elasticsearch using hybrid search capabilities.

    Args:
        question (str): The search query to be semantically matched

    Returns:
        str: Concatenated hits from Elasticsearch as string JSON
    """

    response = _client.search(
        index="summer-clothes",
        body={
            "size": 10,
            "_source": {"includes": ["description", "title", "price"]},
            "retriever": {
                "rrf": {
                    "retrievers": [
                        {"standard": {"query": {"match": {"title": question}}}},
                        {
                            "standard": {
                                "query": {
                                    "semantic": {
                                        "field": "semantic_field",
                                        "query": question,
                                    }
                                }
                            }
                        },
                    ]
                }
            },
        },
    )

    hits = response["hits"]["hits"]

    if not hits:
        return ""

    result = json.dumps([hit["_source"] for hit in hits], indent=2)

    return result
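
The `rrf` retriever in the tool merges the lexical `match` ranking and the `semantic` ranking using reciprocal rank fusion. As a rough illustration of the idea (not Elasticsearch's actual implementation), each document scores the sum of 1 / (k + rank) over the rankings in which it appears. The sketch assumes k = 60, which to our knowledge matches the default `rank_constant`, and the rankings are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document receives 1 / (k + rank) from every ranking that contains
    it (rank starts at 1); documents are returned by descending total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical rankings for the query "summer clothes"
lexical = ["swim-shorts", "muslin-shorts", "cargo-shorts"]
semantic = ["linen-shirt", "swim-shorts", "muslin-shorts"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

In this toy example the two documents returned by both retrievers outrank the ones returned by only one of them, which is the behavior hybrid search relies on.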

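Import the other required tools and credentials

Now we instantiate the tools introduced at the beginning, which search the internet and run RAG over the content that is found.

You also need an OpenAI API key to communicate with the LLM.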

os.environ["SERPER_API_KEY"] = "your-key"
os.environ["OPENAI_API_KEY"] = "your-key"

search_tool = SerperDevTool()
web_rag_tool = WebsiteSearchTool()

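Configure the agents

Now you need to define the agents:

  • Retriever: searches Elasticsearch using the tool created earlier.

  • Researcher: uses search_tool to search the internet.

  • Writer: summarizes the information from the other two agents into a Markdown blog file.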

es_retriever_agent = Agent(
    role="Retriever",
    goal="Retrieve Elasticsearch documents",
    backstory="You are an expert researcher",
    tools=[elasticsearch_tool],
    verbose=True,
)

internet_researcher_agent = Agent(
    role="Research analyst",
    goal="Provide up-to-date market analysis of the industry",
    backstory="You are an expert analyst",
    tools=[search_tool, web_rag_tool],
    verbose=True,
)

writer_agent = Agent(
    role="Content Writer",
    goal="Craft engaging blog posts about the information gathered",
    backstory="A skilled writer with a passion for writing about fashion",
    tools=[],
    verbose=True,
)

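Configure the tasks

Now that you have defined the agents and tools, you need to create a task for each agent. The task descriptions ask for content sources so that the writer agent can cite them, which makes it visible that both the retriever and the researcher agents contributed information.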

es_retriever_task = Task(
    description="Retrieve documents from the Elasticsearch index.",
    expected_output="A list of documents retrieved from the Elasticsearch index based on the query.",
    agent=es_retriever_agent,
)


internet_research_task = Task(
    description="Conduct research on the latest fashion trends for summer 2025. Identify five key trends that are shaping the industry, including popular colors, fabrics, and styles. Use reliable sources such as fashion magazines, retail market reports, and industry analyses. For each trend, provide a concise summary explaining its significance and why it is gaining popularity. Clearly cite your sources using a format like [Source Name]. Your response should be structured and fact-based",
    expected_output="A structured summary of the top five fashion trends for summer 2025, including citations for each trend in the format [Source Name]",
    agent=internet_researcher_agent,
)

write_task = Task(
    description="Compare the fashion trends from the Research Agent with product listings from Elasticsearch. Write a short report highlighting how the store's products align with trends. Use '[Elasticsearch]' for database references and '[Source Name]' for external sources",
    expected_output="A short, structured report combining trend insights with product recommendations, with clearly marked references",
    agent=writer_agent,
    output_file="blog-posts/new_post.md",
)

Now you only need to instantiate the crew with all the agents and tasks, and run it:

# Use in a crew
crew = Crew(
    agents=[es_retriever_agent, internet_researcher_agent, writer_agent],
    tasks=[
        es_retriever_task,
        internet_research_task,
        write_task,
    ],
)

# Execute tasks
crew.kickoff()

We can see the result in the new_post.md file:

"**Short Report on Fashion Trends and Product Alignment**

In this report, we will explore how the current fashion trends for summer 2025 align with the offerings in our store, as evidenced by product listings from [Elasticsearch]. The analysis focuses on five prominent trends and identifies specific products that reflect these aesthetics.

**1. Romantic Florals with Kitsch Twist**

The resurgence of floral patterns, particularly those that are intricate rather than large, embodies a whimsical approach to summer fashion. While our current inventory lacks offerings specifically featuring floral designs, there is an opportunity to curate products that align with this trend, potentially expanding into tops or dresses adorned with delicate floral patterns. [Source: Teen Vogue]

**2. High Waisted and Baggy Silhouettes**

High-waisted styles are a key trend for summer 2025, emphasizing comfort without sacrificing style. Among our offerings, the following products fit this criterion:

- **Baggy Fit Cargo Shorts** ($20.99): These cargo shorts present a relaxed, generous silhouette, complementing the cultural shift towards practical fashion that allows ease of movement.

- **Twill Cargo Shorts** ($20.99): These fitted options also embrace the high-waisted trend, providing versatility for various outfits.

**3. Bold Colors: Turquoise and Earthy Tones**

This summer promises a palette of vibrant turquoise alongside earthy tones. While our current collection does not showcase products that specifically reflect these colors, introducing pieces such as tops, dresses, or accessories in these hues could strategically cater to this emerging aesthetic. [Source: Heuritech]

**4. Textured Fabrics**

As textured fabrics gain popularity, we recognize an opportunity in our offerings:

- **Oversized Lyocell-blend Dress** ($38.99): This dress showcases unique fabric quality with gathered seams and balloon sleeves, making it a textural delight that speaks to the trend of tactile experiences in fashion.

- **Twist-Detail Crop Top** ($34.99): Featuring gathered side seams and a twist detail, it embraces the layered, visually engaging designs consumers are seeking.

**5. Quiet Luxury**

Quiet luxury resonates with those prioritizing quality and sustainability over fast fashion. Our offerings in this category include:

- **Relaxed Fit Linen Resort Shirt** ($17.99): This piece's breathable linen fabric and classic design underline a commitment to sustainable, timeless pieces that exemplify understated elegance.

In conclusion, our current product listings from [Elasticsearch] demonstrate alignment with several key summer fashion trends for 2025. There are unique opportunities to further harness these trends by expanding our collection to include playful floral designs and vibrant colors. Additionally, leveraging the existing offerings that emphasize comfort and quality can enhance our customer appeal in the face of evolving consumer trends.

We are well positioned to make strategic enhancements to our inventory, ensuring we stay ahead in the fast-evolving fashion landscape."
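
Since the tasks ask the writer to mark references as `[Elasticsearch]` or `[Source Name]`, one quick way to review a generated report is to list the bracketed tags it contains. A minimal sketch, assuming the tags appear in square brackets as in the output above (where the model wrote `[Source: Teen Vogue]` rather than the bare source name); the function name `extract_citations` is hypothetical, and Markdown link text in brackets would be counted too.

```python
import re
from collections import Counter


def extract_citations(markdown_text: str) -> Counter:
    """Count bracketed reference tags such as [Elasticsearch] or [Source: X]."""
    tags = re.findall(r"\[([^\[\]]+)\]", markdown_text)
    # Drop an optional "Source:" prefix so both tag styles are normalized.
    cleaned = [re.sub(r"^Source:\s*", "", tag).strip() for tag in tags]
    return Counter(cleaned)


# Shortened stand-in for the generated report
report = (
    "Listings from [Elasticsearch] match two trends. "
    "Florals are back. [Source: Teen Vogue] "
    "Turquoise is rising. [Source: Heuritech] "
    "See [Elasticsearch] for prices."
)
citations = extract_citations(report)
print(citations)
```

A report whose counter lacks either `Elasticsearch` or any external source suggests that one of the upstream agents did not contribute.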

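Conclusion

CrewAI simplifies instantiating agent workflows based on role-playing. It also supports Langchain tools, including custom ones, and abstractions such as the tool decorator make creating tools easier.

This crew of agents shows that it can carry out complex tasks that combine a local data source with internet search.

If you want to keep improving this workflow, you could try creating a new agent that writes the writer_agent results into Elasticsearch!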
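
As a possible starting point for that idea, the sketch below turns the generated Markdown into a bulk action that could be passed to `bulk` from `elasticsearch.helpers`. The index name `market-reports`, the field names, and the helper `build_report_action` are all assumptions made for illustration and are not part of the original workflow.

```python
from datetime import datetime, timezone


def build_report_action(markdown_text: str, index: str = "market-reports") -> dict:
    """Wrap a Markdown report in a bulk action (hypothetical index and fields)."""
    lines = [line.strip() for line in markdown_text.splitlines() if line.strip()]
    # Use the first non-empty line as the title, without Markdown emphasis marks.
    title = lines[0].strip('"#* ') if lines else "Untitled report"
    return {
        "_index": index,
        "_source": {
            "title": title,
            "content": markdown_text,
            "created_at": datetime.now(timezone.utc).isoformat(),
        },
    }


action = build_report_action("**Short Report on Fashion Trends**\n\nBody text.")
print(action["_source"]["title"])
```

Wrapped in a function decorated with `@tool`, a helper like this could give a fourth agent the ability to store each report after the writer finishes.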


Original article: Using CrewAI with Elasticsearch - Elasticsearch Labs
