多Agent协作入门:群组聊天-AgentGroupChat

大家好,我是Edison。

近日抽空学习了下Semantic Kernel提供的AgentGroupChat对象写了一个多Agent群组对话的Demo,总结一下分享与你。当然,多Agent协作还有其他的方式,就留到后续慢慢介绍给你。

AgentChat是什么鬼?

在Semantic Kernel中,AgentChat提供了一个框架,可以启用多个代理之间的交互,即使它们属于不同类型的代理。 这使得 ChatCompletionAgent和 OpenAIAssistantAgent 可以在同一对话中协同工作。 AgentChat还定义了用于启动代理之间协作的入口点,无论是通过多个响应还是单个代理响应。

在实现层面,AgentGroupChat 提供了 AgentChat 的具体实现,它是使用基于策略的方法来管理聊天的动态。

快速入门案例

这里我们来快速实现一个案例:Reviewer & Writer,让这两个不同功能的Agent能够相互配合协作,完成一个指定的功能:

(1)Reviewer 可以审核用户输入的文案并给出优化建议;

(2)Writer 则根据优化建议进行文案的优化创作;

为了简单地实现这个功能,我们创建一个.NET控制台项目,然后安装以下包:

复制代码
Microsoft.Extensions.Configuration
Microsoft.Extensions.Configuration.Json
Microsoft.SemanticKernel.Agents.Core
Microsoft.SemanticKernel.Agents.OpenAI (Preview版本)

需要注意的是,由于Semantic Kernel的较多功能目前还处于实验预览阶段,所以建议在该项目的csproj文件中加入以下配置,统一取消警告:

复制代码
<PropertyGroup>
  <NoWarn>$(NoWarn);CA2007;IDE1006;SKEXP0001;SKEXP0110;OPENAI001</NoWarn>
</PropertyGroup>

创建一个appsettings.json配置文件,填入以下关于LLM API的配置,其中API_KEY请输入你自己的:

复制代码
{
  "LLM": 
  {
    "BASE_URL": "https://api.siliconflow.cn",
    "API_KEY": "******************************",
    "MODEL_ID": "Qwen/Qwen2.5-32B-Instruct"
  }
}

这里我们使用SiliconCloud提供的 Qwen2.5-32B-Instruct 模型,你可以通过:https://cloud.siliconflow.cn/i/DomqCefW 注册一个账号,获取大量免费的Token来来进行这个DEMO实验。

有了LLM API,我们可以创建一个Kernel供后续使用,这也是老面孔了:

复制代码
Console.WriteLine("Now loading the configuration...");
var config = new ConfigurationBuilder()
    .AddJsonFile($"appsettings.json", optional: false, reloadOnChange: true)
    .Build();
Console.WriteLine("Now loading the chat client...");
var chattingApiConfiguration = new OpenAiConfiguration(
    config.GetSection("LLM:MODEL_ID").Value,
    config.GetSection("LLM:BASE_URL").Value,
    config.GetSection("LLM:API_KEY").Value);
var openAiChattingClient = new HttpClient(new OpenAiHttpHandler(chattingApiConfiguration.EndPoint));
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(chattingApiConfiguration.ModelId, chattingApiConfiguration.ApiKey, httpClient: openAiChattingClient)
    .Build();

接下来,我们就一步一步地来看看核心的代码。

定义两个Agent

这里我们来定义两个Agent:Reviewer & Writer:

(1)Reviewer

复制代码
public class ReviewerAgent
{
    public const string AgentName = "Reviewer";
    public static ChatCompletionAgent Build(Kernel kernel)
    {
        var toolKernel = kernel.Clone();
        toolKernel.Plugins.AddFromType<ClipboardAccessPlugin>();
        var reviewerAgent = new ChatCompletionAgent()
        {
            Name = AgentName,
            Instructions =
                """
                Your responsibility is to review and identify how to improve user provided content.
                If the user has providing input or direction for content already provided, specify how to address this input.
                Never directly perform the correction or provide example.
                Once the content has been updated in a subsequent response, you will review the content again until satisfactory.
                Always copy satisfactory content to the clipboard using available tools and inform user.
                RULES:
                - Only identify suggestions that are specific and actionable.
                - Verify previous suggestions have been addressed.
                - Never repeat previous suggestions.
                """,
            Kernel = toolKernel,
            Arguments = new KernelArguments(
                new OpenAIPromptExecutionSettings()
                {
                    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
                })
        };
        return reviewerAgent;
    }
}

(2)Writer

复制代码
public class WriterAgent
{
    public const string AgentName = "Writer";
    public static ChatCompletionAgent Build(Kernel kernel)
    {
        var writerAgent = new ChatCompletionAgent()
        {
            Name = AgentName,
            Instructions =
                """
                Your sole responsibility is to rewrite content according to review suggestions.
                - Always apply all review direction.
                - Always revise the content in its entirety without explanation.
                - Never address the user.
                """,
            Kernel = kernel
        };
        return writerAgent;
    }
}

这里可以通过静态方法直接Build出来两个Agent实例:

复制代码
// Initialize Reviewer Agent
Console.WriteLine("Now loading the Reviewer Agent...");
var reviewerAgent = ReviewerAgent.Build(kernel);
// Initialize Writer Agent
Console.WriteLine("Now loading the Writer Agent...");
var writerAgent = WriterAgent.Build(kernel);

定义选择策略 和 终止策略

对于多Agent协作,在AgentGroupChat中需要定义选择Agent轮次的策略(即SelectionStrategy)和 终止聊天循环的策略(即TerminationStrategy)。我们可以通过使用 AgentGroupChat.CreatePromptFunctionForStrategy 来轻松地实现,它提供了一种方便的机制,避免了对消息参数进行HTML编码:

(1)SelectionStrategy

所谓选择策略,就是如何定义下一个发言的是谁,或者谁来接龙。这里我们首先让Reviewer评估用户输入的内容,如果觉得需要优化,就给出建议,下一个就轮到Writer来进行优化内容协作。

复制代码
// Define Selection Policy
var selectionFunction =
    AgentGroupChat.CreatePromptFunctionForStrategy(
        $$$"""
        Examine the provided RESPONSE and choose the next participant.
        State only the name of the chosen participant without explanation.
        Never choose the participant named in the RESPONSE.
        Choose only from these participants:
        - {{{ReviewerAgent.AgentName}}}
        - {{{WriterAgent.AgentName}}}
        Always follow these rules when choosing the next participant:
        - If RESPONSE is user input, it is {{{ReviewerAgent.AgentName}}}'s turn.
        - If RESPONSE is by {{{ReviewerAgent.AgentName}}}, it is {{{WriterAgent.AgentName}}}'s turn.
        - If RESPONSE is by {{{WriterAgent.AgentName}}}, it is {{{ReviewerAgent.AgentName}}}'s turn.
        RESPONSE:
        {{${{{KernelFunctionTerminationStrategy.DefaultHistoryVariableName}}}}}
        """);

(2)TerminationStrategy

这个终止策略至关重要,它定义了如何评估什么时候退出聊天循环。对于这个案例来说,就是评估Writer优化的内容是否满足用户的需求了。

复制代码
// Define Termination Policy
const string TerminationToken = "yes";
var terminationFunction =
    AgentGroupChat.CreatePromptFunctionForStrategy(
        $$$"""
        Examine the RESPONSE and determine whether the content has been deemed satisfactory.
        If content is satisfactory, respond with a single word without explanation: {{{TerminationToken}}}.
        If specific suggestions are being provided, it is not satisfactory.
        If no correction is suggested, it is satisfactory.
        RESPONSE:
        {{${{{KernelFunctionTerminationStrategy.DefaultHistoryVariableName}}}}}
        """);

两种策略 都只需要了解最新的1条聊天消息,因此可以使用下面的代码来定一个HistoryReducer,它可以只将最近的1条消息作为历史记录传递给下一个聊天参与者。这将减少Token消耗 也能 一定程度提高性能。

复制代码
var historyReducer = new ChatHistoryTruncationReducer(1);

初始化AgentGroupChat

AgentGroupChat对象会将之前定义的所有内容聚集在一起,相当于我们创建了一个微信群聊,添加了群聊的对象(Reviewer + Writer),以及告诉群主或管理员如何选择Agent的策略 和 终止循环的策略。

复制代码
// Initialize AgentGroupChat
var groupChat = new AgentGroupChat(reviewerAgent, writerAgent)
{
    ExecutionSettings = new AgentGroupChatSettings()
    {
        SelectionStrategy = new KernelFunctionSelectionStrategy(selectionFunction, kernel)
        {
            InitialAgent = reviewerAgent,
            HistoryReducer = historyReducer,
            HistoryVariableName = KernelFunctionTerminationStrategy.DefaultHistoryVariableName,
            ResultParser = (result) =>
            {
                var val = result.GetValue<string>() ?? ReviewerAgent.AgentName;
                return val.ReplaceLineEndings("\n").Trim();
            }
        },
        TerminationStrategy = new KernelFunctionTerminationStrategy(terminationFunction, kernel)
        {
            Agents = [reviewerAgent],
            HistoryReducer = historyReducer,
            HistoryVariableName = KernelFunctionTerminationStrategy.DefaultHistoryVariableName,
            MaximumIterations = 10,
            ResultParser = (result) =>
            {
                var val = result.GetValue<string>() ?? string.Empty;
                return val.Contains(TerminationToken, StringComparison.OrdinalIgnoreCase);
            }
        }
    }
};

开始聊天循环

下面的代码也是老面孔了,就不过多介绍了:

复制代码
// Start Working!
Console.WriteLine("----------Agents are Ready. Let's Start Working!----------");
while (true)
{
    Console.WriteLine("User> ");
    var input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input))
        continue;
    input = input.Trim();
    if (input.Equals("EXIT", StringComparison.OrdinalIgnoreCase))
        break;
    if (input.Equals("RESET", StringComparison.OrdinalIgnoreCase))
    {
        await groupChat.ResetAsync();
        Console.ResetColor();
        Console.WriteLine("System> Conversation has been reset!");
        continue;
    }
    groupChat.AddChatMessage(new ChatMessageContent(AuthorRole.User, input));
    groupChat.IsComplete = false;
    try
    {
        await foreach (var response in groupChat.InvokeAsync())
        {
            if (string.IsNullOrWhiteSpace(response.Content))
                continue;
            Console.ForegroundColor = ConsoleColor.Green;
            Console.WriteLine();
            Console.WriteLine($"{response.AuthorName} ({response.Role})> ");
            Console.WriteLine($"{response.Content.ReplaceLineEndings("\n").Trim()}");
        }
        Console.ResetColor();
        Console.WriteLine();
    }
    catch (HttpOperationException ex)
    {
        Console.ResetColor();
        Console.WriteLine(ex.Message);
        if (ex.InnerException != null)
        {
            Console.WriteLine(ex.InnerException.Message);
            if (ex.InnerException.Data.Count > 0)
                Console.WriteLine(JsonSerializer.Serialize(ex.InnerException.Data, new JsonSerializerOptions() { WriteIndented = true }));
        }
    }
}
Console.ResetColor();
Console.WriteLine("----------See you next time!----------");
Console.ReadKey();

效果展示

第一轮:我给了它一段待优化的文本段落,文本内容如下。
Semantic Kernel (SK) is an open-source SDK that enables developers to build and orchestrate complex AI workflows that involve natural language processing (NLP) and machine learning models. It provides a flexible platform for integrating AI capabilities such as semantic search, text summarization, and dialogue systems into applications. With SK, you can easily combine different AI services and models, define their relationships, and orchestrate interactions between them.

第二轮:让Agent帮忙将其拆分为两个段落

第三轮:提出更高的要求,需要更加学术化以便大学教授能够欣赏

可以看到,Reviewer 和 Writer 的配合还是不错,准确完成了我给它们的Task。

小结

本文介绍了如何通过Semantic Kernel提供的AgentGroupChat来实现多Agent的协作,其中最要的部分就是定义选择轮次策略 和 终止聊天策略,相信通过这个案例你能够有个感性的认识。

当然,除了群组聊天模式之外,多Agent协作还有很多其他的方式(比如 并行、顺序、移交、磁性等等),也还有不同的框架实现(如AutoGen),这就留到后面一一介绍给你,因为我也还在学。

示例源码

Github: https://github.com/EdisonTalk/MultiAgentSamples

参考资料

Microsoft Learn: https://learn.microsoft.com/en-us/semantic-kernel/support/archive/agent-chat-example?pivots=programming-language-csharp

推荐学习

圣杰:《.NET+AI | Semantic Kernel入门到精通


作者:爱迪生

出处:https://edisontalk.cnblogs.com

本文版权归作者和博客园共有,欢迎转载,但未经作者同意必须保留此段声明,且在文章页面明显位置给出原文链接。

相关推荐
CS创新实验室3 小时前
筑牢 AIGC 安全防线:警惕提示词注入攻击
安全·大模型·aigc·提示词·提示词注入
墨风如雪6 小时前
Grok-4来了!马斯克这次要把AI“逼疯”,但你付得起吗?
aigc
后端小肥肠8 小时前
揭秘10W+AI动物运动会视频,我用Coze一键搞定全流程(附保姆级拆解)
人工智能·aigc·coze
m0_7431064610 小时前
【论文笔记】BlockGaussian:巧妙解决大规模场景重建中的伪影问题
论文阅读·计算机视觉·3d·aigc·几何学
聚客AI13 小时前
🎯 RAG系统工业级部署指南:六步实现<3%幻觉率的问答系统
人工智能·langchain·llm
夲奋亻Jay13 小时前
下面我将针对每个应用场景,详细列出前端领域涉及AI推进的具体技术、工具和方案
aigc·ai编程
柠檬豆腐脑15 小时前
Trae-Agent 核心执行思路深度解析
llm·agent·trae
xuedaobian15 小时前
AI IDE里的 context 工程
人工智能·aigc·visual studio code
半旧51816 小时前
Deepseek搭建智能体&个人知识库
大模型·llm·aigc·agent·知识库·智能体