An Analysis of AutoGen's Retrieval-Augmented Generation (RAG) Feature

1. What Is Retrieval-Augmented Generation (RAG)?

2. Retrieval Augmentation (RAG) in AutoGen

3. Example

This article explains AutoGen's retrieval-augmented generation (RAG) feature and illustrates it with a worked example.

1. What Is Retrieval-Augmented Generation (RAG)?

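Retrieval-augmented generation (RAG) is a technique that extends a large language model (LLM) by combining it with external knowledge retrieval, which improves the quality and relevance of the generated responses. FastGPT is one example of an application that uses it.

The RAG architecture is shown in the figure below.

The main flow is as follows:

(1) The user (User) asks a question (Query);

(2) The data source (Data Source) is searched for content related to the query. That content becomes the LLM's context (Text), and the query and the context are passed to the LLM together;

(3) The LLM composes an answer from the content retrieved from the data source and returns it to the user.

Because the answer is grounded in retrieved content, this largely mitigates LLM hallucination while still making full use of the LLM's language-generation ability.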
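The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DATA_SOURCE`, `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical names, the "embedding" is a toy word-count vector rather than a real embedding model, and the final call to the LLM is left out.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "data source"; a real system would use a vector database.
DATA_SOURCE = [
    "AutoGen is a framework for building multi-agent LLM applications.",
    "Chroma is an open-source vector database.",
    "Bananas are rich in potassium.",
]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Step 2: rank documents by similarity to the query and keep the best top_k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Step 2 (continued): combine the query and the retrieved text into one prompt."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query


query = "What is AutoGen?"  # Step 1: the user asks a question
context = retrieve(query, DATA_SOURCE)
prompt = build_prompt(query, context)
print(prompt)  # Step 3 would send this prompt to the LLM
```

In a real pipeline the word-count vectors would be replaced by dense embeddings and the list by a vector store, but the shape of the flow stays the same: embed the query, retrieve similar text, and prepend it to the question.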

2. Retrieval Augmentation (RAG) in AutoGen

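Although AutoGen is primarily a multi-agent collaboration framework, it also supports RAG. It does so by building an agent chat between the AssistantAgent and RetrieveUserProxyAgent classes.

2.1 The RetrieveUserProxyAgent Class

RetrieveUserProxyAgent retrieves document chunks based on the embedding of the question and sends them, together with the question, to the retrieval-augmented assistant. The class inherits from UserProxyAgent.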

```python
class RetrieveUserProxyAgent(UserProxyAgent): ...
```

Its main method is `__init__`, whose signature is shown below.

```python
def __init__(self,
             name="RetrieveChatAgent",
             human_input_mode: Literal["ALWAYS", "NEVER",
                                       "TERMINATE"] = "ALWAYS",
             is_termination_msg: Optional[Callable[[Dict], bool]] = None,
             retrieve_config: Optional[Dict] = None,
             **kwargs): ...
```

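Parameters:

name: the name of the agent;

human_input_mode: see the description of human_input_mode in the companion article on the AutoGen ConversableAgent base class;

is_termination_msg: covered in the same article as human_input_mode;

retrieve_config: a dict or None. This field is the **key** one. It defaults to None and can hold many parameters; the main ones are listed below:

(1) task: the type of retrieval-augmented task. Possible values are code, qa, and default. The default value, default, supports both code and qa and also provides the sources.

(2) chunk_token_size: the chunk size in tokens (the source text is usually split into multiple chunks, and an embedding model compares the question against each chunk to judge how similar they are). The default value is max_tokens * 0.4;

(3) model: the model used for the retrieve chat. The default value is gpt-4;

(4) embedding_model: the model used to vectorize text for retrieval. The default value is all-MiniLM-L6-v2. The list of available models is at https://www.sbert.net/docs/pretrained_models.html; the AutoGen documentation notes that all-mpnet-base-v2 gives the best quality;

(5) docs_path: the path to the document directory. It can also be the path to a single file, the URL of a single file, or a list of directories, files, and URLs.

(6) custom_text_types: the list of file types to process from docs_path. The default is the set of types supported in autogen.retrieve_utils.TEXT_FORMATS.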
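To make chunk_token_size more concrete, here is a minimal sketch of splitting text into fixed-size chunks. It is not AutoGen's implementation: `split_into_chunks` is a hypothetical helper, whitespace-separated words stand in for real model tokens, and `max_tokens = 10` is an artificially small value chosen so the result is easy to read.

```python
def split_into_chunks(text: str, chunk_token_size: int) -> list[str]:
    """Split text into chunks of at most chunk_token_size whitespace-separated tokens."""
    if chunk_token_size <= 0:
        raise ValueError("chunk_token_size must be positive")
    tokens = text.split()
    return [
        " ".join(tokens[i : i + chunk_token_size])
        for i in range(0, len(tokens), chunk_token_size)
    ]


# Hypothetical context window; 0.4 is the default factor described above.
max_tokens = 10
chunk_token_size = int(max_tokens * 0.4)

sample = "AutoGen enables building next-gen LLM applications based on multi-agent conversations"
chunks = split_into_chunks(sample, chunk_token_size)
for chunk in chunks:
    print(chunk)
```

Smaller chunks give more precise matches but less context per match, while larger chunks do the opposite, which is why the example later in this article sets chunk_token_size explicitly.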

The following example shows these parameters in use.

3. Example

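The LLM I use here is the open-source model llama-3.1-405b, whose Chinese support does not seem very good, so the example below is written in English.

The example asks the model to "introduce AutoGen" and supplies AutoGen's online documentation as the content to retrieve from. The code is shown below.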

```python
import os
import chromadb
from autogen import AssistantAgent, config_list_from_json
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

# Load the LLM configuration
config_list = config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
)

# Assistant agent
assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={
        "timeout": 600,
        "cache_seed": 42,
        "config_list": config_list,
    },
)

# Retrieval-augmented proxy agent
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    retrieve_config={
        "task": "code",
        "docs_path": [
            "https://github.com/microsoft/autogen/blob/main/website/docs/Getting-Started.mdx",
            "https://github.com/microsoft/autogen/blob/main/website/docs/tutorial/introduction.ipynb",
            os.path.join(os.path.abspath(""), "..", "website", "docs"),
        ],
        "custom_text_types": ["mdx"],
        "chunk_token_size": 2000,
        "model": config_list[0]["model"],
        "client": chromadb.PersistentClient(path="/tmp/chromadb"),
        "embedding_model": "all-mpnet-base-v2",
        "get_or_create": True,  # set to False if you don't want to reuse an existing collection, but you'll need to remove the collection manually
    },
    code_execution_config=False,  # set to False if you don't want to execute the code
)


code_problem = "Please introduce AutoGen."

# Start the conversation
ragproxyagent.initiate_chat(
    assistant,
    message=ragproxyagent.message_generator,
    problem=code_problem,
    search_string="AutoGen",
)  # search_string is used as an extra filter for the embeddings search; in this case, we only want to search documents that contain "AutoGen".
```
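The code above loads its LLM settings from `OAI_CONFIG_LIST`, which `config_list_from_json` resolves as either an environment variable or a file holding a JSON list. As a rough sketch of what such a file might contain, the snippet below writes one and reads it back with the standard library only. The model name is the one that appears in this article's output log; the `api_key` and `base_url` values are placeholders, not real settings, and the file is written to a temporary directory so that no existing configuration is overwritten.

```python
import json
import os
import tempfile

# Placeholder values: replace api_key and base_url with those of your own provider.
config_list = [
    {
        "model": "meta/llama-3.1-405b-instruct",
        "api_key": "YOUR_API_KEY",
        "base_url": "https://example.com/v1",
    }
]

# Write the list to a file named OAI_CONFIG_LIST inside a temporary directory.
config_path = os.path.join(tempfile.mkdtemp(), "OAI_CONFIG_LIST")
with open(config_path, "w", encoding="utf-8") as f:
    json.dump(config_list, f, indent=2)

# Read it back to confirm that the file is valid JSON with the expected shape.
with open(config_path, encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded[0]["model"])
```

Placing a file like this next to the script, or exporting its contents in an environment variable of the same name, is what lets `config_list[0]["model"]` in the example resolve to a model name.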

The output is shown below.

(autogenstudy) D:\code\autogenstudio_images\example>python RetrieveUserProxyAgent.py
D:\Software\anaconda3\envs\autogenstudy\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Trying to create collection.
2024-08-31 14:00:25,791 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Use the existing collection `autogen-docs`.
File D:\code\autogenstudio_images\example\..\website\docs does not exist. Skipping.
2024-08-31 14:00:29,757 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 3 chunks.
2024-08-31 14:00:29,764 - autogen.agentchat.contrib.vectordb.chromadb - INFO - No content embedding is provided. Will use the VectorDB's embedding function to generate the content embedding.
Number of requested results 20 is greater than number of elements in index 7, updating n_results = 7
VectorDB returns doc_ids:  [['99947a71', 'b947501f']]
Model meta/llama-3.1-405b-instruct not found. Using cl100k_base encoding.
Adding content of doc 99947a71 to context.
Model meta/llama-3.1-405b-instruct not found. Using cl100k_base encoding.
ragproxyagent (to assistant):

You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
For code generation, you must obey the following rules:
Rule 1. You MUST NOT install any packages because all the packages needed are already installed.
Rule 2. You must follow the formats below to write your code:
```language
# your code
```

User's question is: Please introduce AutoGen.

Context is: ### Main Features

* AutoGen enables building next\-gen LLM applications based on [multi\-agent
conversations](/microsoft/autogen/blob/main/docs/Use-Cases/agent_chat) with minimal effort. It simplifies
the orchestration, automation, and optimization of a complex LLM workflow. It
maximizes the performance of LLM models and overcomes their weaknesses.
* It supports [diverse conversation
patterns](/microsoft/autogen/blob/main/docs/Use-Cases/agent_chat#supporting-diverse-conversation-patterns)
for complex workflows. With customizable and conversable agents, developers can
use AutoGen to build a wide range of conversation patterns concerning
conversation autonomy, the number of agents, and agent conversation topology.
* It provides a collection of working systems with different complexities. These
systems span a [wide range of
applications](/microsoft/autogen/blob/main/docs/Use-Cases/agent_chat#diverse-applications-implemented-with-autogen)
from various domains and complexities. This demonstrates how AutoGen can
easily support diverse conversation patterns.

AutoGen is powered by collaborative [research studies](/microsoft/autogen/blob/main/docs/Research) from
Microsoft, Penn State University, and University of Washington.

### Quickstart

```
pip install pyautogen
```

```
import os
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="Tell me a joke about NVDA and TESLA stock prices.",
)
```

```
</TabItem>
<TabItem value="local" label="Local execution" default>

```

:::warning
When asked, be sure to check the generated code before continuing to ensure it is safe to run.
:::

```
import os
import autogen
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}
assistant = AssistantAgent("assistant", llm_config=llm_config)

user_proxy = UserProxyAgent(
    "user_proxy", code_execution_config={"executor": autogen.coding.LocalCommandLineCodeExecutor(work_dir="coding")}
)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="Plot a chart of NVDA and TESLA stock price change YTD.",
)
```

```
</TabItem>
<TabItem value="docker" label="Docker execution" default>

```

```
import os
import autogen
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}

with autogen.coding.DockerCommandLineCodeExecutor(work_dir="coding") as code_executor:
    assistant = AssistantAgent("assistant", llm_config=llm_config)
    user_proxy = UserProxyAgent(
        "user_proxy", code_execution_config={"executor": code_executor}
    )

    # Start the chat
    user_proxy.initiate_chat(
        assistant,
        message="Plot a chart of NVDA and TESLA stock price change YTD. Save the plot to a file called plot.png",
    )
```

Open `coding/plot.png` to see the generated plot.

```
</TabItem>

```

:::tip
Learn more about configuring LLMs for agents [here](/microsoft/autogen/blob/main/docs/topics/llm_configuration).
:::

#### Multi\-Agent Conversation Framework

Autogen enables the next\-gen LLM applications with a generic multi\-agent conversation framework. It offers customizable and conversable agents which integrate LLMs, tools, and humans.
By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For [example](https://github.com/microsoft/autogen/blob/main/test/twoagent.py),

The figure below shows an example conversation flow with AutoGen.

[![Agent Chat Example](/microsoft/autogen/raw/main/img/chat_example.png)](/microsoft/autogen/blob/main/img/chat_example.png)

### Where to Go Next?

* Go through the [tutorial](/microsoft/autogen/blob/main/docs/tutorial/introduction) to learn more about the core concepts in AutoGen
* Read the examples and guides in the [notebooks section](/microsoft/autogen/blob/main/docs/notebooks)
* Understand the use cases for [multi\-agent conversation](/microsoft/autogen/blob/main/docs/Use-Cases/agent_chat) and [enhanced LLM inference](/microsoft/autogen/blob/main/docs/Use-Cases/enhanced_inference)
* Read the [API](/microsoft/autogen/blob/main/docs/reference/agentchat/conversable_agent) docs
* Learn about [research](/microsoft/autogen/blob/main/docs/Research) around AutoGen
* Chat on [Discord](https://aka.ms/autogen-dc)
* Follow on [Twitter](https://twitter.com/pyautogen)
* See our [roadmaps](https://aka.ms/autogen-roadmap)

If you like our project, please give it a [star](https://github.com/microsoft/autogen/stargazers) on GitHub. If you are interested in contributing, please read [Contributor's Guide](/microsoft/autogen/blob/main/docs/contributor-guide/contributing).

\<iframe
 src\="[https://ghbtns.com/github\-btn.html?user\=microsoft\&repo\=autogen\&type\=star\&count\=true\&size\=large](https://ghbtns.com/github-btn.html?user=microsoft&repo=autogen&type=star&count=true&size=large)"
 frameborder\="0"
 scrolling\="0"
 width\="170"
 height\="30"
 title\="GitHub"
\>\</iframe\>


Footer
------

 © 2024 GitHub, Inc.


### Footer navigation

* [Terms](https://docs.github.com/site-policy/github-terms/github-terms-of-service)
* [Privacy](https://docs.github.com/site-policy/privacy-policies/github-privacy-statement)
* [Security](https://github.com/security)
* [Status](https://www.githubstatus.com/)
* [Docs](https://docs.github.com/)
* [Contact](https://support.github.com?tags=dotcom-footer)
* Manage cookies
* Do not share my personal information

 You can't perform that action at this time.



--------------------------------------------------------------------------------
assistant (to ragproxyagent):

AutoGen is an open-source Python library developed by Microsoft that simplifies the process of building next-gen LLM (Large Language Model) applications based on multi-agent conversations. It enables developers to create customizable and conversable agents that can integrate LLMs, tools, and humans, allowing for easy automation and optimization of complex LLM workflows.

AutoGen supports diverse conversation patterns, including conversation autonomy, multiple agents, and agent conversation topology, making it suitable for a wide range of applications. The library provides a collection of working systems with different complexities and is powered by collaborative research studies from Microsoft, Penn State University, and the University of Washington.

Key features of AutoGen include:

1. Multi-agent conversation framework: AutoGen enables next-gen LLM applications with a generic multi-agent conversation framework, allowing for customization and conversability.
2. Customizable agents: Developers can create agents that integrate LLMs, tools, and humans, making it easy to automate and optimize complex LLM workflows.
3. Diverse conversation patterns: AutoGen supports various conversation patterns, including conversation autonomy, multiple agents, and agent conversation topology.
4. Working systems: The library provides a collection of working systems with different complexities, demonstrating its versatility.

AutoGen is easy to use and provides a quickstart guide, with examples of local, Docker, and API execution. The library also includes extensive documentation, tutorials, and guides, making it accessible to developers of all levels.

Overall, AutoGen is a powerful library that enables the creation of advanced LLM applications with ease, making it an exciting tool for developers and researchers working with large language models.

--------------------------------------------------------------------------------

