AutoGen Study Notes Series (17): Examples - Literature Review

This note covers the Literature Review example from the Examples chapter of the AutoGen documentation, which searches Arxiv (and Google) for papers on a given topic and turns the results into a literature review.


Obtaining the Google API KEY

How to obtain a Google API KEY has already been covered in detail in earlier posts in this series; refer to those articles.

[Note]: All three API KEYs (GOOGLE_API_KEY, GOOGLE_SEARCH_ENGINE_ID, and OPENAI_API_KEY) are required.
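
Since the tool code below loads these keys with python-dotenv, a quick way to confirm they are actually visible to your process is a small check like the one below (a minimal sketch; it assumes the keys are either exported in the shell or stored in a local .env file):

python
# Sanity check for the three required keys. Assumes python-dotenv is installed
# and the keys live either in the environment or in a local .env file.
import os

from dotenv import load_dotenv

load_dotenv()  # silently does nothing if there is no .env file

for key in ("GOOGLE_API_KEY", "GOOGLE_SEARCH_ENGINE_ID", "OPENAI_API_KEY"):
    print(f"{key}: {'set' if os.getenv(key) else 'MISSING'}")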


Literature Review

This demo forms a Team from the following Agents to summarize papers in a field and produce a literature review:

  • Arxiv Search Agent: uses the Arxiv API to search for papers related to the given topic and return them;
  • Google Search Agent: uses the Google Search API to search for papers related to the given topic and return them;
  • Report Agent: aggregates the results returned by the two search agents above into a report;

First, install the dependencies. Note that besides arxiv, the google_search_tool below also imports requests, beautifulsoup4, and python-dotenv, so those need to be installed as well:

bash
$ pip install arxiv

Defining Tools

[Note]: The official docs contain two typos here: the two tool functions are defined as google_search and arxiv_search, but are referred to as google_search_tool and arxiv_search_tool when they are used later. I have standardized both names on the _tool suffix:

  • The tool used by the Google Search Agent: google_search_tool
python
def google_search_tool(query: str, num_results: int = 2, max_chars: int = 500) -> list:  # type: ignore[type-arg]
    import os
    import time

    import requests
    from bs4 import BeautifulSoup
    from dotenv import load_dotenv

    load_dotenv()

    api_key = os.getenv("GOOGLE_API_KEY")
    search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")

    if not api_key or not search_engine_id:
        raise ValueError("API key or Search Engine ID not found in environment variables")

    url = "https://www.googleapis.com/customsearch/v1"
    params = {"key": api_key, "cx": search_engine_id, "q": query, "num": num_results}

    response = requests.get(url, params=params)  # type: ignore[arg-type]

    if response.status_code != 200:
        print(response.json())
        raise Exception(f"Error in API request: {response.status_code}")

    results = response.json().get("items", [])

    def get_page_content(url: str) -> str:
        try:
            response = requests.get(url, timeout=10)
            soup = BeautifulSoup(response.content, "html.parser")
            text = soup.get_text(separator=" ", strip=True)
            words = text.split()
            content = ""
            for word in words:
                if len(content) + len(word) + 1 > max_chars:
                    break
                content += " " + word
            return content.strip()
        except Exception as e:
            print(f"Error fetching {url}: {str(e)}")
            return ""

    enriched_results = []
    for item in results:
        body = get_page_content(item["link"])
        enriched_results.append(
            {"title": item["title"], "link": item["link"], "snippet": item["snippet"], "body": body}
        )
        time.sleep(1)  # Be respectful to the servers

    return enriched_results
  • The tool used by the Arxiv Search Agent: arxiv_search_tool
python
def arxiv_search_tool(query: str, max_results: int = 2) -> list:  # type: ignore[type-arg]
    """
    Search Arxiv for papers and return the results including abstracts.
    """
    import arxiv
    import json  # needed for the json.dump below

    client = arxiv.Client()
    search = arxiv.Search(query=query, max_results=max_results, sort_by=arxiv.SortCriterion.Relevance)

    results = []
    for paper in client.results(search):
        results.append(
            {
                "title": paper.title,
                "authors": [author.name for author in paper.authors],
                "published": paper.published.strftime("%Y-%m-%d"),
                "abstract": paper.summary,
                "pdf_url": paper.pdf_url,
            }
        )

    # Write results to a file
    with open('arxiv_search_results.json', 'w') as f:
        json.dump(results, f, indent=2)

    return results
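
Because both tools are plain Python functions, they can be smoke-tested on their own before any Agent is involved (a small sketch; it assumes the Google keys are configured and the arxiv package is installed):

python
# Standalone check of both tools; no LLM or Agent required.
if __name__ == "__main__":
    papers = arxiv_search_tool("no code multi agent ai systems", max_results=2)
    print([p["title"] for p in papers])

    pages = google_search_tool("no code tools for building multi agent ai systems", num_results=2)
    print([r["title"] for r in pages])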

Defining Agents

Define the three Agents:

  • Google Search Agent
python
google_search_agent = AssistantAgent(
    name="Google_Search_Agent",
    tools=[google_search_tool],
    model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),
    description="An agent that can search Google for information, returns results with a snippet and body content",
    system_message="You are a helpful AI assistant. Solve tasks using your tools.",
)
  • Arxiv Search Agent
python
arxiv_search_agent = AssistantAgent(
    name="Arxiv_Search_Agent",
    tools=[arxiv_search_tool],
    model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),
    description="An agent that can search Arxiv for papers related to a given topic, including abstracts",
    system_message="You are a helpful AI assistant. Solve tasks using your tools. Specifically, you can take into consideration the user's request and craft a search query that is most likely to return relevant academi papers.",
)
  • Report Agent
python
report_agent = AssistantAgent(
    name="Report_Agent",
    model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),
    description="Generate a report based on a given topic",
    system_message="You are a helpful assistant. Your task is to synthesize data extracted into a high quality literature review including CORRECT references. You MUST write a final report that is formatted as a literature review with CORRECT references.  Your response should end with the word 'TERMINATE'",
)
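
A side note on tool wrapping: AssistantAgent accepts the plain Python callables shown above and wraps them into tools automatically, building the tool schema from their signatures and docstrings. If you prefer to set the tool description explicitly, the functions can also be wrapped in a FunctionTool first (a sketch; the demo behaves the same either way, and the complete code below imports FunctionTool without actually using it because it passes the bare functions):

python
from autogen_core.tools import FunctionTool

# Optional explicit wrapping; makes the tool description visible and editable.
arxiv_tool = FunctionTool(
    arxiv_search_tool,
    description="Search Arxiv for papers related to a given topic, including abstracts",
)
google_tool = FunctionTool(
    google_search_tool,
    description="Search Google for information, returns results with a snippet and body content",
)
# These could then be passed to the agents as tools=[arxiv_tool] / tools=[google_tool].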

Creating the Team

Create the Team with its termination condition:

python
termination = TextMentionTermination("TERMINATE")
team = RoundRobinGroupChat(
    participants=[google_search_agent, arxiv_search_agent, report_agent], termination_condition=termination
)
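
This is not part of the original example, but as a safeguard against a run that never produces the word "TERMINATE", the text condition can be combined with a message cap using AutoGen's composable termination conditions (a minimal sketch):

python
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination

# Stop on "TERMINATE" or after at most 12 messages, whichever happens first.
termination = TextMentionTermination("TERMINATE") | MaxMessageTermination(max_messages=12)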

Complete Code

python
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_core.tools import FunctionTool
from autogen_ext.models.openai import OpenAIChatCompletionClient
import json, os, asyncio

os.environ["GOOGLE_SEARCH_ENGINE_ID"] = "你的GOOGLE_SEARCH_ENGINE_ID"	# Google
os.environ["GOOGLE_API_KEY"] = "你的GOOGLE_API_KEY"						# Google
os.environ["OPENAI_API_KEY"] = "你的OPENAI_API_KEY"						# OpenAI

#-----------------------------------------------------------#
# Step 1. Define the tools used by the two search agents:
def google_search_tool(query: str, num_results: int = 2, max_chars: int = 500) -> list:  # type: ignore[type-arg]
    import os
    import time

    import requests
    from bs4 import BeautifulSoup
    from dotenv import load_dotenv

    load_dotenv()

    api_key = os.getenv("GOOGLE_API_KEY")
    search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")

    if not api_key or not search_engine_id:
        raise ValueError("API key or Search Engine ID not found in environment variables")

    url = "https://www.googleapis.com/customsearch/v1"
    params = {"key": api_key, "cx": search_engine_id, "q": query, "num": num_results}

    response = requests.get(url, params=params)  # type: ignore[arg-type]

    if response.status_code != 200:
        print(response.json())
        raise Exception(f"Error in API request: {response.status_code}")

    results = response.json().get("items", [])

    def get_page_content(url: str) -> str:
        try:
            response = requests.get(url, timeout=10)
            soup = BeautifulSoup(response.content, "html.parser")
            text = soup.get_text(separator=" ", strip=True)
            words = text.split()
            content = ""
            for word in words:
                if len(content) + len(word) + 1 > max_chars:
                    break
                content += " " + word
            return content.strip()
        except Exception as e:
            print(f"Error fetching {url}: {str(e)}")
            return ""

    enriched_results = []
    for item in results:
        body = get_page_content(item["link"])
        enriched_results.append(
            {"title": item["title"], "link": item["link"], "snippet": item["snippet"], "body": body}
        )
        time.sleep(1)  # Be respectful to the servers

    return enriched_results


def arxiv_search_tool(query: str, max_results: int = 2) -> list:  # type: ignore[type-arg]
    """
    Search Arxiv for papers and return the results including abstracts.
    """
    import arxiv

    client = arxiv.Client()
    search = arxiv.Search(query=query, max_results=max_results, sort_by=arxiv.SortCriterion.Relevance)

    results = []
    for paper in client.results(search):
        results.append(
            {
                "title": paper.title,
                "authors": [author.name for author in paper.authors],
                "published": paper.published.strftime("%Y-%m-%d"),
                "abstract": paper.summary,
                "pdf_url": paper.pdf_url,
            }
        )

    # Write results to a file
    with open('arxiv_search_results.json', 'w') as f:
        json.dump(results, f, indent=2)

    return results

#-----------------------------------------------------------#
# Step 2. Define the three Agents
google_search_agent = AssistantAgent(
    name="Google_Search_Agent",
    tools=[google_search_tool],
    model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),
    description="An agent that can search Google for information, returns results with a snippet and body content",
    system_message="You are a helpful AI assistant. Solve tasks using your tools.",
)

arxiv_search_agent = AssistantAgent(
    name="Arxiv_Search_Agent",
    tools=[arxiv_search_tool],
    model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),
    description="An agent that can search Arxiv for papers related to a given topic, including abstracts",
    system_message="You are a helpful AI assistant. Solve tasks using your tools. Specifically, you can take into consideration the user's request and craft a search query that is most likely to return relevant academi papers.",
)

report_agent = AssistantAgent(
    name="Report_Agent",
    model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),
    description="Generate a report based on a given topic",
    system_message="You are a helpful assistant. Your task is to synthesize data extracted into a high quality literature review including CORRECT references. You MUST write a final report that is formatted as a literature review with CORRECT references.  Your response should end with the word 'TERMINATE'",
)

#-----------------------------------------------------------#
# Step 3. Define the termination condition and the Team
termination = TextMentionTermination("TERMINATE")
team = RoundRobinGroupChat(
    participants=[google_search_agent, arxiv_search_agent, report_agent], termination_condition=termination
)

#-----------------------------------------------------------#
# Step 4. Run the task
asyncio.run(
    Console(
        team.run_stream(
            task="Write a literature review on no code tools for building multi agent ai systems",
        )
    )
)

Run the demo:

bash
$ python demo.py
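
If you run this inside a Jupyter notebook instead of as a script, there is already a running event loop, so drop the asyncio.run wrapper and await the console directly (the same Step 4, notebook style):

python
# Notebook variant of Step 4: await directly instead of asyncio.run(...).
await Console(
    team.run_stream(
        task="Write a literature review on no code tools for building multi agent ai systems",
    )
)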

[Note]: In the complete code above I un-commented the official version's write-to-file lines inside arxiv_search_tool, so an arxiv_search_results.json file is generated in the same directory. Because the Arxiv API returns real, verifiable information (the search results match what is actually stored in the Arxiv database rather than being LLM hallucinations), this tool is genuinely useful on its own, and calling it standalone requires no LLM or Agent.

The contents of arxiv_search_results.json are as follows:

json
[
  {
    "title": "AutoGen Studio: A No-Code Developer Tool for Building and Debugging Multi-Agent Systems",
    "authors": [
      "Victor Dibia",
      "Jingya Chen",
      "Gagan Bansal",
      "Suff Syed",
      "Adam Fourney",
      "Erkang Zhu",
      "Chi Wang",
      "Saleema Amershi"
    ],
    "published": "2024-08-09",
    "abstract": "Multi-agent systems, where multiple agents (generative AI models + tools)\ncollaborate, are emerging as an effective pattern for solving long-running,\ncomplex tasks in numerous domains. However, specifying their parameters (such\nas models, tools, and orchestration mechanisms etc,.) and debugging them\nremains challenging for most developers. To address this challenge, we present\nAUTOGEN STUDIO, a no-code developer tool for rapidly prototyping, debugging,\nand evaluating multi-agent workflows built upon the AUTOGEN framework. AUTOGEN\nSTUDIO offers a web interface and a Python API for representing LLM-enabled\nagents using a declarative (JSON-based) specification. It provides an intuitive\ndrag-and-drop UI for agent workflow specification, interactive evaluation and\ndebugging of workflows, and a gallery of reusable agent components. We\nhighlight four design principles for no-code multi-agent developer tools and\ncontribute an open-source implementation at\nhttps://github.com/microsoft/autogen/tree/main/samples/apps/autogen-studio",
    "pdf_url": "http://arxiv.org/pdf/2408.15247v1"
  },
  {
    "title": "Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform",
    "authors": [
      "Cheonsu Jeong"
    ],
    "published": "2025-01-01",
    "abstract": "This study proposes the design and implementation of a multimodal LLM-based\nMulti-Agent System (MAS) leveraging a No-Code platform to address the practical\nconstraints and significant entry barriers associated with AI adoption in\nenterprises. Advanced AI technologies, such as Large Language Models (LLMs),\noften pose challenges due to their technical complexity and high implementation\ncosts, making them difficult for many organizations to adopt. To overcome these\nlimitations, this research develops a No-Code-based Multi-Agent System designed\nto enable users without programming knowledge to easily build and manage AI\nsystems. The study examines various use cases to validate the applicability of\nAI in business processes, including code generation from image-based notes,\nAdvanced RAG-based question-answering systems, text-based image generation, and\nvideo generation using images and prompts. These systems lower the barriers to\nAI adoption, empowering not only professional developers but also general users\nto harness AI for significantly improved productivity and efficiency. By\ndemonstrating the scalability and accessibility of No-Code platforms, this\nstudy advances the democratization of AI technologies within enterprises and\nvalidates the practical applicability of Multi-Agent Systems, ultimately\ncontributing to the widespread adoption of AI across various industries.",
    "pdf_url": "http://arxiv.org/pdf/2501.00750v2"
  },
  {
    "title": "Improving Performance of Commercially Available AI Products in a Multi-Agent Configuration",
    "authors": [
      "Cory Hymel",
      "Sida Peng",
      "Kevin Xu",
      "Charath Ranganathan"
    ],
    "published": "2024-10-29",
    "abstract": "In recent years, with the rapid advancement of large language models (LLMs),\nmulti-agent systems have become increasingly more capable of practical\napplication. At the same time, the software development industry has had a\nnumber of new AI-powered tools developed that improve the software development\nlifecycle (SDLC). Academically, much attention has been paid to the role of\nmulti-agent systems to the SDLC. And, while single-agent systems have\nfrequently been examined in real-world applications, we have seen comparatively\nfew real-world examples of publicly available commercial tools working together\nin a multi-agent system with measurable improvements. In this experiment we\ntest context sharing between Crowdbotics PRD AI, a tool for generating software\nrequirements using AI, and GitHub Copilot, an AI pair-programming tool. By\nsharing business requirements from PRD AI, we improve the code suggestion\ncapabilities of GitHub Copilot by 13.8% and developer task success rate by\n24.5% -- demonstrating a real-world example of commercially-available AI\nsystems working together with improved outcomes.",
    "pdf_url": "http://arxiv.org/pdf/2410.22129v1"
  },
  {
    "title": "AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents",
    "authors": [
      "Jiabin Tang",
      "Tianyu Fan",
      "Chao Huang"
    ],
    "published": "2025-02-09",
    "abstract": "Large Language Model (LLM) Agents have demonstrated remarkable capabilities\nin task automation and intelligent decision-making, driving the widespread\nadoption of agent development frameworks such as LangChain and AutoGen.\nHowever, these frameworks predominantly serve developers with extensive\ntechnical expertise - a significant limitation considering that only 0.03 % of\nthe global population possesses the necessary programming skills. This stark\naccessibility gap raises a fundamental question: Can we enable everyone,\nregardless of technical background, to build their own LLM agents using natural\nlanguage alone? To address this challenge, we introduce AutoAgent-a\nFully-Automated and highly Self-Developing framework that enables users to\ncreate and deploy LLM agents through Natural Language Alone. Operating as an\nautonomous Agent Operating System, AutoAgent comprises four key components: i)\nAgentic System Utilities, ii) LLM-powered Actionable Engine, iii) Self-Managing\nFile System, and iv) Self-Play Agent Customization module. This lightweight yet\npowerful system enables efficient and dynamic creation and modification of\ntools, agents, and workflows without coding requirements or manual\nintervention. Beyond its code-free agent development capabilities, AutoAgent\nalso serves as a versatile multi-agent system for General AI Assistants.\nComprehensive evaluations on the GAIA benchmark demonstrate AutoAgent's\neffectiveness in generalist multi-agent tasks, surpassing existing\nstate-of-the-art methods. Furthermore, AutoAgent's Retrieval-Augmented\nGeneration (RAG)-related capabilities have shown consistently superior\nperformance compared to many alternative LLM-based solutions.",
    "pdf_url": "http://arxiv.org/pdf/2502.05957v2"
  },
  {
    "title": "Interactive Debugging and Steering of Multi-Agent AI Systems",
    "authors": [
      "Will Epperson",
      "Gagan Bansal",
      "Victor Dibia",
      "Adam Fourney",
      "Jack Gerrits",
      "Erkang Zhu",
      "Saleema Amershi"
    ],
    "published": "2025-03-03",
    "abstract": "Fully autonomous teams of LLM-powered AI agents are emerging that collaborate\nto perform complex tasks for users. What challenges do developers face when\ntrying to build and debug these AI agent teams? In formative interviews with\nfive AI agent developers, we identify core challenges: difficulty reviewing\nlong agent conversations to localize errors, lack of support in current tools\nfor interactive debugging, and the need for tool support to iterate on agent\nconfiguration. Based on these needs, we developed an interactive multi-agent\ndebugging tool, AGDebugger, with a UI for browsing and sending messages, the\nability to edit and reset prior agent messages, and an overview visualization\nfor navigating complex message histories. In a two-part user study with 14\nparticipants, we identify common user strategies for steering agents and\nhighlight the importance of interactive message resets for debugging. Our\nstudies deepen understanding of interfaces for debugging increasingly important\nagentic workflows.",
    "pdf_url": "http://arxiv.org/pdf/2503.02068v1"
  }
]

Take a breath

Congratulations: reaching this point means you have completed another stage of learning. You have now finished all three major modules of the official AutoGen documentation, Tutorial, Advanced, and Examples, and we have walked through the official tutorial from start to finish. Next I will start a new series covering the examples under samples in the source repository. That part is a step up in difficulty, but don't worry: I will step on all the landmines first and clear the way for you.

The revolution is not yet complete; comrades, we must keep striving.
