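Using AutoGen with Elasticsearch

Author: Jeffrey Rengifo, Elastic

Learn how to create an Elasticsearch tool for your agents with AutoGen.

Elasticsearch has native integrations with industry-leading generative AI tools and providers. Check out our webinars on going beyond RAG basics, or on building production-ready apps with the Elastic vector database.

To build the best search solution for your use case, start a free cloud trial or try Elastic on your local machine now.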

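AutoGen is a Microsoft framework for building applications that can interact with humans or act autonomously. It provides a complete ecosystem with several levels of abstraction, depending on how much you need to customize.

If you want to learn more about agents and how they work, I recommend reading this article.

Image source: github.com/microsoft/a...

AgentChat lets you easily instantiate preset agents on top of the AutoGen core, so you can configure model prompts, tools, and more.

On top of AgentChat, you can use extensions to expand its functionality. These extensions come from both the official library and the community.

The highest level of abstraction is Magnetic-One, a generalist multi-agent system designed for complex tasks, which comes preconfigured as described in the paper that introduces the approach.

AutoGen is known for facilitating communication between agents and has proposed groundbreaking interaction patterns, such as the Group Chat pattern used in this article.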

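In this article, we will create an agent that uses Elasticsearch as a semantic search tool, so it can collaborate with other agents to find the best match between candidate profiles stored in Elasticsearch and job offers found online.

We will create a group of agents that share Elasticsearch and online information to try to match candidates with job offers. We will use the "Group Chat" pattern, in which an admin moderates the conversation and executes tasks while each agent specializes in a specific task.

The complete example is available in this Notebook.

Steps:

Install dependencies and import packages
Prepare data
Configure agents
Configure tools
Execute task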

Install dependencies and import packages

```bash
pip install autogen elasticsearch==8.17 nest-asyncio
```

```python
import json
import os
import nest_asyncio
import requests

from getpass import getpass
from autogen import (
    AssistantAgent,
    GroupChat,
    GroupChatManager,
    UserProxyAgent,
    register_function,
)
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

nest_asyncio.apply()
```

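Prepare data

Set up keys

For the agents' AI endpoint, we need to provide an OpenAI API key. We also need a Serper API key to give the agents search capabilities. Serper provides 2,500 free search calls on sign-up. We use Serper to give the agents access to the internet, more specifically to Google search results: an agent sends a search query through the API, and Serper returns the top Google results.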

```python
os.environ["SERPER_API_KEY"] = "serper-api-key"
os.environ["OPENAI_API_KEY"] = "openai-api-key"
os.environ["ELASTIC_ENDPOINT"] = "elastic-endpoint"
os.environ["ELASTIC_API_KEY"] = "elastic-api-key"
```
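Hard-coding keys in a notebook makes them easy to leak. As a minimal sketch of an alternative, assuming you would rather be prompted for any key that is not already set, the hypothetical helper `ensure_env` below (not part of the original notebook) falls back to `getpass` from the standard library:

```python
import os
from getpass import getpass


def ensure_env(name: str, prompt=getpass) -> str:
    """Return os.environ[name], prompting for it once if it is missing or empty.

    `ensure_env` is a hypothetical helper; `prompt` is injectable so the
    function can be exercised without an interactive terminal.
    """
    value = os.environ.get(name)
    if not value:
        value = prompt(f"{name}: ")
        os.environ[name] = value
    return value
```

Calling `ensure_env("OPENAI_API_KEY")`, and likewise for the other three variables, would then replace the literal assignments while leaving the rest of the notebook unchanged.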

Elasticsearch client

```python
_client = Elasticsearch(
    os.environ["ELASTIC_ENDPOINT"],
    api_key=os.environ["ELASTIC_API_KEY"],
)
```

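Inference endpoint and mappings

To enable semantic search, we need to create an inference endpoint using ELSER. ELSER lets us run semantic or hybrid queries, so we can give the agents broad tasks and Elasticsearch will return semantically related documents without requiring the exact keywords that appear in them.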

```python
try:
    _client.options(
        request_timeout=60, max_retries=3, retry_on_timeout=True
    ).inference.put(
        task_type="sparse_embedding",
        inference_id="jobs-candidates-inference",
        body={
            "service": "elasticsearch",
            "service_settings": {
                "adaptive_allocations": {"enabled": True},
                "num_threads": 1,
                "model_id": ".elser_model_2",
            },
        },
    )

    print("Inference endpoint created successfully.")

except Exception as e:
    # Not every exception carries an `info` payload, so print the exception itself
    print(f"Error creating inference endpoint: {e}")
```

Mappings

For the mappings, we will copy all the relevant text fields into a [semantic_text](https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text "semantic_text") field so that we can run semantic or hybrid queries against the data.

```python
try:
    _client.indices.create(
        index="available-candidates",
        body={
            "mappings": {
                "properties": {
                    "candidate_name": {
                        "type": "text",
                        "copy_to": "semantic_field",
                    },
                    "position_title": {
                        "type": "text",
                        "copy_to": "semantic_field",
                    },
                    "profile_description": {
                        "type": "text",
                        "copy_to": "semantic_field",
                    },
                    "expected_salary": {
                        "type": "text",
                        "copy_to": "semantic_field",
                    },
                    "skills": {
                        "type": "keyword",
                        "copy_to": "semantic_field",
                    },
                    "semantic_field": {
                        "type": "semantic_text",
                        # Must match the inference endpoint created above
                        "inference_id": "jobs-candidates-inference",
                    },
                }
            }
        },
    )

    print("index created successfully")
except Exception as e:
    # Not every exception carries an `info` payload, so print the exception itself
    print(f"Error creating index: {e}")
```
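Because every source field repeats the same `copy_to` target, the mapping can also be generated rather than written by hand. The sketch below is one way to do that, assuming a hypothetical helper named `build_mapping` (not part of the original notebook); it takes the inference endpoint ID once, here the `jobs-candidates-inference` endpoint created earlier, so the ID cannot drift between fields:

```python
def build_mapping(text_fields, keyword_fields, inference_id, semantic_field="semantic_field"):
    """Build an index body in which every field is copied into one semantic_text field.

    `build_mapping` is a hypothetical helper written for this illustration.
    """
    properties = {}
    for name in text_fields:
        properties[name] = {"type": "text", "copy_to": semantic_field}
    for name in keyword_fields:
        properties[name] = {"type": "keyword", "copy_to": semantic_field}
    # The semantic_text field is the copy_to target and points at the inference endpoint
    properties[semantic_field] = {"type": "semantic_text", "inference_id": inference_id}
    return {"mappings": {"properties": properties}}


mapping = build_mapping(
    text_fields=["candidate_name", "position_title", "profile_description", "expected_salary"],
    keyword_fields=["skills"],
    inference_id="jobs-candidates-inference",
)
```

The resulting dictionary has the same shape as the `body` passed to `_client.indices.create` above.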

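Ingesting documents into Elasticsearch

We will load data about job applicants and ask our agents to find the ideal job for each of them based on their experience and expected salary.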

```python
documents = [
  {
    "candidate_name": "John",
    "position_title": "Software Engineer",
    "expected_salary": "$85,000 - $120,000",
    "profile_description": "Experienced software engineer with expertise in backend development, cloud computing, and scalable system architecture.",
    "skills": ["Python", "Java", "AWS", "Microservices", "Docker", "Kubernetes"]
  },
  {
    "candidate_name": "Emily",
    "position_title": "Data Scientist",
    "expected_salary": "$90,000 - $140,000",
    "profile_description": "Data scientist with strong analytical skills and experience in machine learning and big data processing.",
    "skills": ["Python", "SQL", "TensorFlow", "Pandas", "Hadoop", "Spark"]
  },
  {
    "candidate_name": "Michael",
    "position_title": "DevOps Engineer",
    "expected_salary": "$95,000 - $130,000",
    "profile_description": "DevOps specialist focused on automation, CI/CD pipelines, and infrastructure as code.",
    "skills": ["Terraform", "Ansible", "Jenkins", "Docker", "Kubernetes", "AWS"]
  },
  {
    "candidate_name": "Sarah",
    "position_title": "Product Manager",
    "expected_salary": "$110,000 - $150,000",
    "profile_description": "Product manager with a technical background, skilled in agile methodologies and user-centered design.",
    "skills": ["JIRA", "Agile", "Scrum", "A/B Testing", "SQL", "UX Research"]
  },
  {
    "candidate_name": "David",
    "position_title": "UX/UI Designer",
    "expected_salary": "$70,000 - $110,000",
    "profile_description": "Creative UX/UI designer with experience in user research, wireframing, and interactive prototyping.",
    "skills": ["Figma", "Adobe XD", "Sketch", "HTML", "CSS", "JavaScript"]
  },
  {
    "candidate_name": "Jessica",
    "position_title": "Cybersecurity Analyst",
    "expected_salary": "$100,000 - $140,000",
    "profile_description": "Cybersecurity expert with experience in threat detection, penetration testing, and compliance.",
    "skills": ["Python", "SIEM", "Penetration Testing", "Ethical Hacking", "Nmap", "Metasploit"]
  },
  {
    "candidate_name": "Robert",
    "position_title": "Cloud Architect",
    "expected_salary": "$120,000 - $180,000",
    "profile_description": "Cloud architect specializing in designing secure and scalable cloud infrastructures.",
    "skills": ["AWS", "Azure", "GCP", "Kubernetes", "Terraform", "CI/CD"]
  },
  {
    "candidate_name": "Sophia",
    "position_title": "AI/ML Engineer",
    "expected_salary": "$100,000 - $160,000",
    "profile_description": "Machine learning engineer with experience in deep learning, NLP, and computer vision.",
    "skills": ["Python", "PyTorch", "TensorFlow", "Scikit-Learn", "OpenCV", "NLP"]
  },
  {
    "candidate_name": "Daniel",
    "position_title": "QA Engineer",
    "expected_salary": "$60,000 - $100,000",
    "profile_description": "Quality assurance engineer focused on automated testing, test-driven development, and software reliability.",
    "skills": ["Selenium", "JUnit", "Cypress", "Postman", "Git", "CI/CD"]
  },
  {
    "candidate_name": "Emma",
    "position_title": "Technical Support Specialist",
    "expected_salary": "$50,000 - $85,000",
    "profile_description": "Technical support specialist with expertise in troubleshooting, customer support, and IT infrastructure.",
    "skills": ["Linux", "Windows Server", "Networking", "SQL", "Help Desk", "Scripting"]
  }
]
```

```python
def build_data():
    for doc in documents:
        yield {"_index": "available-candidates", "_source": doc}


try:
    success, errors = bulk(_client, build_data())
    if errors:
        print("Errors during indexing:", errors)
    else:
        print(f"{success} documents indexed successfully")

except Exception as e:
    print(f"Error: {str(e)}")
```
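The bulk actions above carry no `_id`, so Elasticsearch generates one per document and re-running the cell indexes every candidate again. A minimal sketch of one way to make the ingest repeatable, assuming that name plus position title identifies a candidate in this sample data (an assumption that would not hold for real profiles), derives a stable `_id` instead. `build_actions` is a hypothetical helper:

```python
import hashlib


def build_actions(documents, index="available-candidates"):
    """Yield bulk actions whose _id is derived from the candidate's name and title.

    Assumption for this sketch: name + position title is unique per candidate.
    """
    for doc in documents:
        key = f"{doc['candidate_name']}|{doc['position_title']}"
        doc_id = hashlib.sha1(key.encode("utf-8")).hexdigest()
        yield {"_index": index, "_id": doc_id, "_source": doc}


sample = [{"candidate_name": "John", "position_title": "Software Engineer"}]
first_run = list(build_actions(sample))
second_run = list(build_actions(sample))
# Both runs produce the same _id, so a second ingest overwrites instead of duplicating
```

Passing `build_actions(documents)` to `bulk` in place of `build_data()` would then be enough to switch over.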

Configure agents

AI endpoint configuration

Let's configure the AI endpoint based on the environment variables we defined earlier.

```python
config_list = [{"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}]
ai_endpoint_config = {"config_list": config_list}
```

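Create agents

We will start by creating the admin, who moderates the conversation and executes the tasks the other agents propose.

Then we will create the agents that carry out each task:

Admin: leads the conversation and executes the other agents' actions.
Researcher: searches the internet for job offers.
Retriever: looks up candidates in Elastic.
Matcher: tries to match job offers with candidates.
Critic: evaluates the quality of the matches before the final answer is given.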

```python
# Agent names follow the speaker labels shown in the chat output below
user_proxy = UserProxyAgent(
    name="Admin",
    system_message="""You are a human administrator.
        Your role is to interact with agents and tools to execute tasks efficiently.
        Execute tasks and agents in a logical order, ensuring that all agents perform
        their duties correctly. All tasks must be approved by you before proceeding.""",
    human_input_mode="NEVER",
    code_execution_config=False,
    is_termination_msg=lambda msg: msg.get("content") is not None
    and "TERMINATE" in msg["content"],
    llm_config=ai_endpoint_config,
)

researcher = AssistantAgent(
    name="Researcher",
    system_message="""You are a Researcher.
        Your role is to use the 'search_in_internet' tool to find individual
        job offers related to the candidates' profiles. Each job offer must include a direct link to a specific position,
        not just a category or group of offers. Ensure that all job offers are relevant and accurate.""",
    llm_config=ai_endpoint_config,
)

retriever = AssistantAgent(
    name="Retriever",
    llm_config=ai_endpoint_config,
    system_message="""You are a Retriever.
        Your task is to use the 'elasticsearch_hybrid_search' tool to retrieve
        candidate profiles from Elasticsearch.""",
)

matcher = AssistantAgent(
    name="Matcher",
    system_message="""Your role is to match job offers with suitable candidates.
        The matches must be accurate and beneficial for both parties.
        Only match candidates with job offers that fit their qualifications.""",
    llm_config=ai_endpoint_config,
)

critic = AssistantAgent(
    name="Critic",
    system_message="""You are the Critic.
        Your task is to verify the accuracy of job-candidate matches.
        If the matches are correct, inform the Admin and include the word 'TERMINATE' to end the process.""",  # End condition
    llm_config=ai_endpoint_config,
)
```

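Configure tools

For this project, we need to create two tools: one to search in Elasticsearch and another to search online. A tool is a Python function that we will register and assign to an agent in the next step.

Tool methods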

```python
async def elasticsearch_hybrid_search(question: str):
    """
    Search in Elasticsearch using semantic search capabilities.
    """

    response = _client.search(
        index="available-candidates",
        body={
            "_source": {
                "includes": [
                    "candidate_name",
                    "position_title",
                    "profile_description",
                    "expected_salary",
                    "skills",
                ],
            },
            "size": 10,
            # Hybrid search: fuse a lexical match and a semantic query with RRF
            "retriever": {
                "rrf": {
                    "retrievers": [
                        {
                            "standard": {
                                "query": {"match": {"position_title": question}}
                            }
                        },
                        {
                            "standard": {
                                "query": {
                                    "semantic": {
                                        "field": "semantic_field",
                                        "query": question,
                                    }
                                }
                            }
                        },
                    ]
                }
            },
        },
    )

    hits = response["hits"]["hits"]

    if not hits:
        return ""

    # Return the matching documents to the agent as a JSON string
    result = json.dumps([hit["_source"] for hit in hits], indent=2)

    return result


async def search_in_internet(query: str):
    """Search in internet using Serper and retrieve results in json format"""

    url = "https://google.serper.dev/search"
    headers = {
        "X-API-KEY": os.environ["SERPER_API_KEY"],
        "Content-Type": "application/json",
    }

    payload = json.dumps({"q": query})
    response = requests.request("POST", url, headers=headers, data=payload)
    original_results = response.json()

    related_searches = original_results.get("relatedSearches", [])
    original_organics = original_results.get("organic", [])

    # Run each related search as well and append its organic results
    for search in related_searches:
        payload = json.dumps({"q": search.get("query")})
        response = requests.request("POST", url, headers=headers, data=payload)
        original_organics.extend(response.json().get("organic", []))

    return original_organics
```
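The `rrf` retriever in `elasticsearch_hybrid_search` merges the lexical and semantic result lists using reciprocal rank fusion. The sketch below illustrates the standard RRF formula in plain Python; it is a simplified illustration rather than Elasticsearch's implementation, and the document IDs are made up. Each document scores the sum of `1 / (rank_constant + rank)` over the lists it appears in, with `rank_constant` set to 60, the retriever's documented default:

```python
def reciprocal_rank_fusion(rankings, rank_constant=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (rank_constant + rank)) over the lists it
    appears in, with ranks starting at 1.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by ID for a stable order
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


lexical = ["john", "michael", "robert"]   # e.g. ranking from the match query
semantic = ["michael", "sophia", "john"]  # e.g. ranking from the semantic query
fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)  # ['michael', 'john', 'sophia', 'robert']
```

Documents that rank well in both lists rise to the top, which is why a candidate found by both the title match and the semantic query outranks one found by only one of them.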

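Assign tools to agents

For a tool to work, we need to define a caller, which decides the function's arguments, and an executor, which runs the function. We will set the admin as the executor and the corresponding agent as the caller.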

```python
# Tool names match the ones referenced in the agents' system messages
register_function(
    elasticsearch_hybrid_search,
    caller=retriever,
    executor=user_proxy,
    name="elasticsearch_hybrid_search",
    description="A method to retrieve information from Elasticsearch using semantic search capabilities",
)

register_function(
    search_in_internet,
    caller=researcher,
    executor=user_proxy,
    name="search_in_internet",
    description="A method to search the internet",
)
```
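The caller/executor split can be pictured with a small, framework-free sketch. This is a conceptual illustration only, not how AutoGen implements it; `TOOLS`, `register`, `propose_call`, `execute_call`, and `echo_search` are all hypothetical names. The caller only produces a tool name and JSON arguments, and the executor is the only side that actually runs code:

```python
import json

# Hypothetical registry mapping tool names to plain Python functions
TOOLS = {}


def register(name):
    """Decorator that records a function in the toy registry under `name`."""
    def decorator(func):
        TOOLS[name] = func
        return func
    return decorator


@register("echo_search")
def echo_search(question: str) -> str:
    return f"results for: {question}"


def propose_call(name: str, **arguments) -> dict:
    """Caller side: decide which tool to use and with which arguments."""
    return {"name": name, "arguments": json.dumps(arguments)}


def execute_call(call: dict) -> str:
    """Executor side: look the tool up and run it with the proposed arguments."""
    func = TOOLS[call["name"]]
    return func(**json.loads(call["arguments"]))


call = propose_call("echo_search", question="software engineers")
result = execute_call(call)
print(result)  # results for: software engineers
```

In the group chat, the Retriever and Researcher play the caller role and the Admin plays the executor role, which is why the log further down shows a "Suggested tool call" from one agent followed by the Admin executing the function.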

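Execute task

We will now define a group chat containing all the agents, in which the admin gives each agent its turn, specifies the task it should call, and ends the run once the conditions defined in the earlier instructions are met.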

```python
groupchat = GroupChat(
    agents=[user_proxy, researcher, retriever, matcher, critic],
    messages=[],
    max_round=50,
)

manager = GroupChatManager(groupchat=groupchat, llm_config=ai_endpoint_config)

user_proxy.initiate_chat(
    manager,
    message="""Compare the candidate profiles retrieved by the Retriever with the job offers
        found by the Researcher on the internet.
        Both candidate profiles and job offers are related to the software industry.
        Ensure that each match is accurate and beneficial for both parties.
        Each candidate should be matched with a single job offer.
        Include the job offer link provided by the Researcher.""",
)
```
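The end condition is the `is_termination_msg` predicate given to the Admin, combined with the Critic's instruction to include the word 'TERMINATE'. Written as a named function instead of a lambda, the same check can be tried on sample messages. The message dictionaries below are made-up examples, and the `None` case assumes that some messages, such as tool-call suggestions, carry no text content, which is presumably why the original lambda guards against it:

```python
def is_termination_msg(msg: dict) -> bool:
    """Return True when a message has text content containing 'TERMINATE'."""
    content = msg.get("content")
    return content is not None and "TERMINATE" in content


# Made-up sample messages for illustration
critic_reply = {"content": "The matches are accurate. TERMINATE"}
tool_call = {"content": None, "tool_calls": [{"name": "search_in_internet"}]}

print(is_termination_msg(critic_reply))  # True
print(is_termination_msg(tool_call))     # False
```

The check is case-sensitive, so a reply containing "terminate" in lowercase would not stop the chat, and `max_round=50` remains the fallback limit.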

Reasoning

(Formatted for readability)

The output will look like this:

```text
Admin (to chat_manager):

Compare the candidate profiles retrieved by the Retriever with the job offers 
        found by the Researcher on the internet. 
        Both candidate profiles and job offers are related to the software industry. 
        Ensure that each match is accurate and beneficial for both parties. 
        Each candidate should be matched with a single job offer. 
        Include the job offer link provided by the Researcher.
```

Next speaker: Matcher

```text
Matcher (to chat_manager):

I currently do not have access to external databases or the ability to retrieve specific candidate profiles or job offers. However, I can help guide you on how to effectively match candidates with job offers based on their qualifications and the requirements of the jobs.

1. **Identify Key Qualities**: For each job offer, list the key qualifications, skills, and experience required. Look for programming languages, technologies, years of experience, and any other competencies that are mentioned.

2. **Analyze Candidate Profiles**: For each candidate, evaluate their skills, education, and experience. Pay special attention to their technical skills and past roles to ensure they align with the job requirements.

3. **Create Matching Criteria**: Develop criteria for grading matches, such as:
   - Exact match (essential skills/experience)
   - General match (related skills/experience)
   - Desirable match (preferred qualifications)

4. **Match Candidates to Job Offers**: Based on the above analysis, start matching candidates to job offers. Aim to find roles that play to the candidate's strengths and career aspirations.

5. **Link Job Offers**: Ensure that for every candidate matched, the corresponding job offer link is provided.

Once you have the specific candidate profiles and job offers, you can follow these steps to create accurate and beneficial matches. If you have specific examples or data, I can help you formulate the matches!
```

Next speaker: Retriever

```text
Retriever (to chat_manager):

***** Suggested tool call (call_0yjI5x0gvcLAJeDbiDKYPtFD): elasticsearch_hybrid_search *****
Arguments: 
{"question":"Retrieve candidate profiles related to software development and engineering."}
********************************************************************************************
```

Next speaker: Admin

```text
>>>>>>>> EXECUTING FUNCTION elasticsearch_hybrid_search...
Call ID: call_0yjI5x0gvcLAJeDbiDKYPtFD
Input arguments: {'question': 'Retrieve candidate profiles related to software development and engineering.'}
```

Admin (to chat_manager):

```text
***** Response from calling tool (call_0yjI5x0gvcLAJeDbiDKYPtFD) *****
[
  {
....
```

Researcher (to chat_manager):

```text
***** Suggested tool call (call_X0bSSvAKuilPJ9hY90AJvMud): search_in_internet *****
Arguments: 
{"query": "Software Engineer job offers Python Java AWS Microservices Docker Kubernetes"}
....
```

Next speaker: Critic

```text
Critic (to chat_manager):

The matches are accurate and beneficial for both candidates and job offers. 

**Matches Summary:**
1. **John** - Lead Software Engineer, Full Stack (Java, Python, AWS) at Capital One.
2. **Daniel** - Quality Assurance Engineer for Homey - Taro.
3. **Michael** - DevOps Engineer - Kubernetes, Terraform, Jenkins, Ansible, AWS at Cisco.
4. **Sophia** - AI/ML Engineer at Chesterfield, Missouri, United States.
5. **David** - UX/UI Designer at HR Force International in New York.
```

Admin has been informed of the successful candidate-job offer matches.

```text
**TERMINATE**
```

Results

(Formatted for readability)

```markdown
### Candidate: John
- **Profile**: Experienced software engineer with expertise in backend development, cloud computing, and scalable system architecture.
- **Skills**: Python, Java, AWS, Microservices, Docker, Kubernetes.
- **Expected Salary**: $85,000 - $120,000.
- **Match**: [Lead Software Engineer, Full Stack (Java, Python, AWS) at Capital One](https://www.capitalonecareers.com/en/job/new-york/lead-software-engineer-full-stack-java-python-aws/1732/77978761520)

### Candidate: Daniel
- **Profile**: Quality assurance engineer focused on automated testing, test-driven development, and software reliability.
- **Skills**: Selenium, JUnit, Cypress, Postman, Git, CI/CD.
- **Expected Salary**: $60,000 - $100,000.
- **Match**: [Quality Assurance Engineer for Homey - Taro](https://www.jointaro.com/jobs/homey/quality-assurance-engineer/)

### Candidate: Michael
- **Profile**: DevOps specialist focused on automation, CI/CD pipelines, and infrastructure as code.
- **Skills**: Terraform, Ansible, Jenkins, Docker, Kubernetes, AWS.
- **Expected Salary**: $95,000 - $130,000.
- **Match**: [DevOps Engineer - Kubernetes, Terraform, Jenkins, Ansible, AWS at Cisco](https://jobs.cisco.com/jobs/ProjectDetail/Software-Engineer-DevOps-Engineer-Kubernetes-Terraform-Jenkins-Ansible-AWS-8-11-Years/1436347)

### Candidate: Sophia
- **Profile**: Machine learning engineer with experience in deep learning, NLP, and computer vision.
- **Skills**: Python, PyTorch, TensorFlow, Scikit-Learn, OpenCV, NLP.
- **Expected Salary**: $100,000 - $160,000.
- **Match**: [AI/ML Engineer - Chesterfield, Missouri, United States](https://careers.mii.com/jobs/ai-ml-engineer-chesterfield-missouri-united-states)

### Candidate: David
- **Profile**: Creative UX/UI designer with experience in user research, wireframing, and interactive prototyping.
- **Skills**: Figma, Adobe XD, Sketch, HTML, CSS, JavaScript.
- **Expected Salary**: $70,000 - $110,000.
- **Match**: [HR Force International is hiring: UX/UI Designer in New York](https://www.mediabistro.com/jobs/604658829-hr-force-international-is-hiring-ux-ui-designer-in-new-york)
```

Note that at the end of each candidate stored in Elasticsearch, there is a Match field showing the job offer that fits them best!

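Conclusion

AutoGen lets you create groups of agents that collaborate to solve a problem, with a level of complexity you can adjust to your needs. One of the available patterns is "group chat", in which an admin moderates a conversation among agents until they reach a successful solution.

You can add more features to the project by creating more agents. For example, you could store the matches back in Elasticsearch and then use the WebSurfer agent to apply to the jobs automatically. The WebSurfer agent can browse websites using vision models and a headless browser.

To index documents in Elasticsearch, you can use a tool similar to elasticsearch_hybrid_search, with additional ingestion logic. Then create a dedicated "ingestor" agent to handle the indexing. Once that is in place, you can implement the WebSurfer agent by following the official documentation.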
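As a rough sketch of what the ingestion logic behind such a tool might look like, the function below indexes a single match. Everything specific here is an assumption made for illustration: the function name `store_match`, the `candidate-matches` index name, and the document fields. It relies only on the Elasticsearch client's `index` method, and takes the client as a parameter so it can be tried with a stand-in object:

```python
def store_match(client, candidate_name: str, job_title: str, job_link: str,
                index: str = "candidate-matches") -> str:
    """Index one candidate-job match and return the ID Elasticsearch assigned.

    `client` is expected to be an `Elasticsearch` instance; the index name and
    document fields are assumptions made for this sketch.
    """
    document = {
        "candidate_name": candidate_name,
        "job_title": job_title,
        "job_link": job_link,
    }
    response = client.index(index=index, document=document)
    return response["_id"]
```

Since an LLM can only supply JSON-serializable arguments, the function registered with the "ingestor" agent would need to be a thin wrapper that closes over `_client` and exposes only the three string parameters.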

Original article: Using AutoGen with Elasticsearch - Elasticsearch Labs