AI Large Models, Part 2: Agents and Hands-On Demos

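I. AI Agents

1. What is an AI agent?

Many people find the concept of an AI agent fuzzy, mainly because the term means different things in different contexts. It helps to look at it from three angles:

(1) The academic view

An AI agent is an intelligent entity with the following capabilities:

Perception, decision making, action, and goal orientation. It is not just a model; it is an entity that operates autonomously within an environment.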

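(2) The view in the era of large models

In the era of large models, an AI agent typically consists of the following parts:

A large model (an LLM or a multimodal model): the core, responsible for reasoning, understanding, and generating content;

Memory: stores long-term knowledge, context, and the history of interactions with users;

Tool use (function calling): the ability to call external capabilities such as APIs, databases, search engines, and code executors;

Planning: breaks a complex task into executable steps, then reflects on, adjusts, and iterates on the plan;

Action: follows the plan by calling tools and operating on systems or applications until the goal is reached.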
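
The parts listed above can be pictured as fields and methods of one object. The following is a minimal sketch, not a reference design: `MiniAgent`, `stub_llm`, and the two toy tools are hypothetical names invented for illustration, and the "model" is a stub that always returns the same two-step plan.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MiniAgent:
    """Hypothetical container that names the parts listed above."""
    llm: Callable[[str], str]                         # large model: text in, text out
    tools: Dict[str, Callable[[str], str]]            # tool use: name -> function
    memory: List[str] = field(default_factory=list)   # memory: past steps and results

    def plan(self, goal: str) -> List[str]:
        """Planning: ask the model for steps, one 'tool: argument' per line."""
        reply = self.llm(f"Goal: {goal}\nHistory: {self.memory}")
        return [line.strip() for line in reply.splitlines() if line.strip()]

    def act(self, step: str) -> str:
        """Action: run one planned step and remember the outcome."""
        name, _, arg = step.partition(":")
        tool = self.tools.get(name.strip())
        result = tool(arg.strip()) if tool else f"unknown tool: {name.strip()}"
        self.memory.append(f"{step} -> {result}")
        return result


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model: always returns the same two-step plan."""
    return "search: agent frameworks\nsummarize: search results"


agent = MiniAgent(
    llm=stub_llm,
    tools={
        "search": lambda q: f"3 hits for '{q}'",
        "summarize": lambda text: f"summary of {text}",
    },
)
results = [agent.act(step) for step in agent.plan("survey agent frameworks")]
print(results)
```

Swapping `stub_llm` for a call to a real model changes only the `llm` field; the memory, tool, planning, and action parts keep the same shape.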

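(3) The product / engineering view

From a product and engineering standpoint, an AI agent is a software agent that runs continuously, can execute tasks repeatedly, and completes work autonomously.

Examples: an AI dev agent that writes code, runs it, and fixes errors automatically; an AI customer-service agent that handles customer inquiries and routes tickets; an AI analytics agent that analyzes business data and produces conclusions.

Summary definition: broadly, an AI agent is an intelligent system that autonomously perceives, reasons, plans, and acts within an environment in order to achieve a specific goal.

Relationship to large models: an agent does not have to be built on a large model, but today's mainstream agents almost all use an LLM or a multimodal model as the core, combined with tool calling, memory, and planning, to form a closed loop that resembles how a person carries out a task.

A more practical working definition: what we call an AI agent today is an autonomous system built on a large model that perceives its environment, remembers information, plans, calls tools, and takes actions to achieve a well-defined goal.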

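2. Agents and large models: from a brain to a complete body

A large language model (LLM) is like a very capable brain: it knows a lot and reasons well, but it has no hands or feet. It cannot directly perceive the world or carry out operations.

An agent takes that brain (the LLM) and adds hands and feet (tools) and memory, so that it can actively perceive, plan, act, and then reflect and adjust.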

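II. How Agents Work

1. The four core components of an agent

One of the most common decompositions splits an agent into four core components:

Component	Function	Role	Analogy
1. Prompt (planning)	Task decomposition and execution-path planning	Defines goals, strategy, and reasoning logic	The brain's "subconscious" (handles the reasoning logic)
2. Tools (hands and feet)	Perform concrete operations	Call external APIs, databases, and functions	Hands (carry out actions)
3. Memory	Stores information	Stores conversation history and execution state	A notebook (records context)
4. Agent Core	Coordination	Scheduling and decisions (e.g. the ReAct loop)	The central nervous system (coordinates everything)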

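(1) Prompt (planning): the agent's top-level instructions and standard operating procedure (SOP). The prompt is where the agent gets its "soul".

In an LLM-based agent architecture, the prompt tells the LLM who it is, what it should do, and how it should think.

Purpose: set the role, define task boundaries, inject domain knowledge, and specify the output format.
Importance: without a prompt, a large model is just a general-purpose chatbot, not a domain-specific agent.

What can a prompt do?

It can state the goal, constrain the format, set the style, and supply background. For example, given "You are a product manager with a humorous tone; write promotional copy aimed at users born in the 1990s", the model, like a newly briefed intern, knows which direction to take.

What can't a prompt do?

It stops at "this one time". You ask, it answers, and that is the end.

It will not look up data on its own, will not automatically carry out a second step, and certainly will not send out your weekly report while you sleep. For that you need the next component: tools.

A common misconception: the longer the prompt, the better.

Not so. A long, rambling prompt often performs worse than a short, clear one. Clarity beats complexity, every time.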
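
One way to keep a prompt short and clear is to assemble it from the same ingredients listed above: role, task, constraints, and output format. The helper below is a hypothetical sketch; `build_system_prompt` and its parameters are illustrative names, not part of any library.

```python
def build_system_prompt(role, task, constraints, output_format):
    """Assemble a compact system prompt from role, task, constraints, and format."""
    lines = [
        f"You are {role}.",
        f"Task: {task}",
        "Constraints:",
        *[f"- {c}" for c in constraints],   # one bullet per constraint
        f"Output format: {output_format}",
    ]
    return "\n".join(lines)


prompt = build_system_prompt(
    role="a product manager with a humorous tone",
    task="write promotional copy for users born in the 1990s",
    constraints=["no more than 80 words", "mention one concrete benefit"],
    output_format="a single paragraph of plain text",
)
print(prompt)
```

Each ingredient takes one line, so an overlong prompt shows up immediately as an overlong argument.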

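2. The agent workflow: perceive, decide, act, reflect

This article focuses on LLM-based agents (LLM agents). The key steps of a typical agent loop are:

Perception: input from the outside world arrives through sensors (for example API listeners or user-input interfaces). This input is the observation.

Thought: an internal reasoning step driven by the LLM, which can be split into two parts:

Planning: combine the current observation with past memory to update the understanding of the task and environment, create or revise the plan, and break a complex goal into subtasks.

Tool selection: pick the best-suited tool from the available toolbox and determine the call arguments.

Action: actuators carry out the concrete operation, usually a call to some tool (such as a code interpreter or a search API), which affects the environment.

Observation and reflection: update memory and the plan based on the result of the action, then start the next iteration.

In practice, the agent loop keeps executing against the user's request and the feedback from the environment until the goal is achieved.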
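
The loop described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: `run_agent`, `scripted_llm`, and the `word_count` tool are hypothetical, and the "LLM" is a scripted function that first requests a tool and then finishes once it has an observation.

```python
def scripted_llm(user_input, observations):
    """Stand-in for a real model: choose the next move from what has been observed."""
    if not observations:
        # Thought: nothing observed yet, so select a tool and its argument.
        return {"action": "word_count", "input": user_input}
    # Reflection: an observation exists, so the goal can be answered.
    return {"action": "finish", "input": f"Your message has {observations[-1]} words."}


TOOLS = {"word_count": lambda text: str(len(text.split()))}


def run_agent(user_input, llm, tools, max_steps=5):
    """Think, act, observe, and repeat until the model finishes or steps run out."""
    observations = []                                  # working memory for this run
    for _ in range(max_steps):
        decision = llm(user_input, observations)       # thought: plan + tool selection
        if decision["action"] == "finish":
            return decision["input"]
        tool = tools.get(decision["action"])
        if tool is None:
            observations.append(f"error: unknown tool {decision['action']}")
            continue
        observations.append(tool(decision["input"]))   # action, then observation
    return "stopped: step limit reached"


print(run_agent("count the words here please", scripted_llm, TOOLS))
```

The `max_steps` cap is a design choice worth keeping when a real model is plugged in: without it, a model that never emits a finishing action would loop forever.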

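III. Multi-Agent Technology: A New Paradigm for AI Collaboration

1. Introduction to multi-agent systems: why do we need them?

(1) Limitations of a single agent:

Narrow capability: it is hard for one agent to master several specialist skills at once;

It struggles with complex tasks, some of which need multiple steps and multiple domains working together;

Low efficiency: it executes serially, step by step, and cannot take full advantage of parallel resources;

It is hard to scale and cannot cope with large-scale distributed scenarios.

(2) Advantages of multiple agents:

Specialization: each agent focuses on the domain it is good at;

Parallel processing: several agents work at the same time, which speeds up the overall task;

Decomposition: a large, complex task is split into subtasks that are handled separately;

Good scalability: agents can be added or replaced as requirements change.

The multi-agent pattern is the main direction in which agent systems are evolving from "one person working alone" to "a team working together":

Multiple specialist agents, plus communication mechanisms and coordination strategies between them, turn a "do-everything generalist assistant" into a "team of experts, each with its own strength".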
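
The parallelism advantage can be illustrated with `asyncio`. In this sketch the specialist agents are hypothetical coroutines that only sleep to simulate work; the names and delays are made up for illustration.

```python
import asyncio


async def specialist(name, task, delay):
    """Simulated specialist agent: 'works' for `delay` seconds, then reports."""
    await asyncio.sleep(delay)
    return f"{name} finished {task}"


async def run_team():
    """Run all specialists concurrently; gather returns results in argument order."""
    return await asyncio.gather(
        specialist("researcher", "collecting sources", 0.02),
        specialist("analyst", "crunching numbers", 0.01),
        specialist("writer", "drafting outline", 0.03),
    )


results = asyncio.run(run_team())
for line in results:
    print(line)
```

Because the three coroutines wait concurrently, the total wall-clock time is close to the longest single delay rather than the sum of all three.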

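2. Multi-agent collaboration patterns

Common multi-agent collaboration patterns include:

Master-worker: one master agent schedules several sub-agents and assigns each its task;

Peer collaboration: several agents with equal standing work together and reconcile the final result through a coordination mechanism;

Blackboard: all agents exchange information and synchronize progress through a shared "blackboard";

Organization / role: tasks are assigned by "department and role", much as each department in a company handles its own responsibilities.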
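
As an illustration of the blackboard pattern, the sketch below uses a plain dictionary as the shared board. The agents `researcher` and `writer` and the driver `run_blackboard` are hypothetical; each agent acts only when the data it needs is already on the board, so no agent calls another directly.

```python
def researcher(board):
    """Posts facts once a topic is on the board. Returns True if it did any work."""
    if "topic" in board and "facts" not in board:
        board["facts"] = [f"fact about {board['topic']}"]
        return True
    return False


def writer(board):
    """Posts a draft once facts are on the board. Returns True if it did any work."""
    if "facts" in board and "draft" not in board:
        board["draft"] = "Draft: " + "; ".join(board["facts"])
        return True
    return False


def run_blackboard(board, agents, max_rounds=10):
    """Give every agent a turn each round until a round makes no progress."""
    for _ in range(max_rounds):
        progressed = [agent(board) for agent in agents]   # list, so all agents run
        if not any(progressed):
            break
    return board


# The writer is listed first on purpose: ordering is driven by the data on the
# board, not by the order in which agents are registered.
board = run_blackboard({"topic": "agents"}, [writer, researcher])
print(board["draft"])
```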

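3. Core mechanisms of multi-agent systems

The core mechanisms of a multi-agent system are:

Task decomposition: split a complex problem into subtasks and assign them according to each agent's specialty.

Agent coordination: use task scheduling, priority management, and load balancing to prevent resource contention and duplicated work.

Communication protocol: establish a uniform way of exchanging information that supports both synchronous and asynchronous communication, so that data and state are transferred accurately and promptly.

Decision fusion: combine the decisions of several agents, through voting, weighted averaging, or an expert system, into a final decision.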
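
Two of the fusion methods named above, voting and weighted averaging, are simple enough to sketch directly. The function names and sample inputs are illustrative assumptions, not taken from any framework.

```python
from collections import Counter


def majority_vote(decisions):
    """Return the most common decision; ties go to the one seen first."""
    return Counter(decisions).most_common(1)[0][0]


def weighted_average(scores, weights):
    """Combine numeric scores, weighting each agent by its assigned weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total


# Three agents vote on a categorical decision.
print(majority_vote(["approve", "reject", "approve"]))

# Three agents score a proposal; the first agent is trusted the most.
print(round(weighted_average([0.9, 0.6, 0.3], [3, 2, 1]), 3))
```

Voting suits categorical decisions, while weighted averaging suits numeric scores where some agents are more reliable than others.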

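IV. Hands-On Demos

Dify's agent is also worth studying: https://blog.csdn.net/hguisu/article/details/138978737?spm=1001.2014.3001.5501

1. An agent demo without a real model

A simple agent demo based on the ReAct pattern. The agent solves problems by calling tools in a "think, act" loop.

The demo shows the key traits of an agent:

Prompt-driven: the prompt defines how the agent thinks and the format of its responses
Tool calling: the agent picks a suitable tool for the task
Memory: conversation history is recorded for context
Decision loop: think → act → observe → answer

In a real application, the _simulate_llm function would be replaced by a call to an actual LLM API, which makes the decisions smarter. The architecture also makes it easy to add new tools and handle more complex tasks.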

# 1. Tool definitions (Tools)
tools = {
    "search_web": {
        "description": "Search the internet for information",
        "function": lambda query: f"Search results: latest information about '{query}'..."
    },
    "calculate": {
        "description": "Evaluate a math expression",
        # eval() is unsafe on untrusted input. It is tolerable here only because
        # the simulated LLM below always supplies a fixed expression.
        "function": lambda expr: f"Result: {eval(expr)}"
    },
    "get_time": {
        "description": "Get the current time",
        # Accepts (and ignores) one argument so that every tool can be called
        # the same way. The time is a fixed, simulated value.
        "function": lambda _=None: "Current time: Saturday, March 28, 2026, 14:30"
    }
}

# 2. Memory store (Memory)
memory = {
    "chat_history": [],
    "task_logs": []
}

# 3. Prompt template (the agent's "brain")
PROMPT_TEMPLATE = """
You are an intelligent assistant. Help the user by following these steps:

Available tools:
{available_tools}

Conversation history:
{chat_history}

Current task: {user_input}

Your reasoning process:
1. Understand what the user needs
2. Choose a suitable tool (or answer directly)
3. Take the action
4. Give the final answer

Respond strictly in the following format:
Thought: [your reasoning]
Action: [tool name] or [Answer]
Action Input: [tool arguments] or [none]
Observation: [tool result] or [none]
Final Answer: [final answer]
"""

# 4. Agent core logic
class SimpleAgent:
    def __init__(self, tools, prompt_template):
        self.tools = tools
        self.prompt_template = prompt_template
        
    def process(self, user_input):
        """Handle one user input with a single ReAct pass."""
        print(f"\nUser: {user_input}")
        
        # Build the prompt
        prompt = self.prompt_template.format(
            available_tools="\n".join([f"- {name}: {info['description']}" 
                                     for name, info in self.tools.items()]),
            chat_history="\n".join(memory["chat_history"][-3:]),
            user_input=user_input
        )
        
        # Simulate the LLM response (simplified decision logic)
        response = self._simulate_llm(prompt, user_input)
        
        # Parse the response
        thought, action, action_input = self._parse_response(response)
        
        # Take the action
        observation = ""
        if action in self.tools:
            observation = self.tools[action]["function"](action_input)
            print(f"Running tool: {action}({action_input})")
        elif action == "Answer":
            observation = "Answered directly"
        
        # Produce the final answer
        final_answer = self._generate_final_answer(thought, observation, user_input)
        
        # Update memory
        memory["chat_history"].append(f"User: {user_input}")
        memory["chat_history"].append(f"Assistant: {final_answer}")
        memory["task_logs"].append({
            "input": user_input,
            "thought": thought,
            "action": action,
            "result": observation
        })
        
        return final_answer
    
    def _simulate_llm(self, prompt, user_input):
        """Simplified keyword-based decision logic (a real application would call an LLM API)."""
        user_input = user_input.lower()
        
        if "time" in user_input:
            return "Thought: The user wants the current time, so I should call the time tool.\nAction: get_time\nAction Input: none"
        elif "calculate" in user_input or "+" in user_input or "-" in user_input:
            return "Thought: This is a math problem, so I should call the calculator tool.\nAction: calculate\nAction Input: 15 + 8 * 2"
        elif "news" in user_input or "information" in user_input:
            return "Thought: The user wants the latest information, so I should search the internet.\nAction: search_web\nAction Input: today's top news"
        else:
            return f"Thought: This can be answered directly without a tool.\nAction: Answer\nAction Input: none\nFinal Answer: Hello! I am your assistant. You asked: {user_input}"
    
    def _parse_response(self, response):
        """Parse the LLM response text (expects Thought / Action / Action Input on the first three lines)."""
        lines = response.split('\n')
        thought = lines[0].replace("Thought: ", "")
        action = lines[1].replace("Action: ", "") if len(lines) > 1 else "Answer"
        action_input = lines[2].replace("Action Input: ", "") if len(lines) > 2 else "none"
        return thought, action, action_input
    
    def _generate_final_answer(self, thought, observation, user_input):
        """Produce the final answer (canned text; a real agent would let the LLM phrase the observation)."""
        text = user_input.lower()
        if "time" in text:
            return "It is currently Saturday, March 28, 2026, 14:30. Have a nice day!"
        elif "calculate" in text:
            return "According to the calculation, 15 + 8 * 2 = 31. Would you like me to calculate anything else?"
        elif "news" in text:
            return "According to the search, today's top story is about a new breakthrough in AI technology. Would you like the details?"
        else:
            return f"Understood. You asked about '{user_input}'. I can currently look up information, do calculations, or tell you the time."

# 5. Example run
if __name__ == "__main__":
    # Create the agent
    agent = SimpleAgent(tools, PROMPT_TEMPLATE)
    
    # Sample interactions
    test_queries = [
        "What time is it now?",
        "Please calculate 15 + 8 * 2",
        "What's in the news today?",
        "Hello, can you introduce yourself?"
    ]
    
    print("=== Simple Agent Demo started ===")
    print("Tools loaded:", list(tools.keys()))
    
    for query in test_queries:
        result = agent.process(query)
        print(f"Assistant: {result}\n")
    
    print("=== Execution log ===")
    for i, log in enumerate(memory["task_logs"], 1):
        print(f"{i}. Task: {log['input']}")
        print(f"   Thought: {log['thought']}")
        print(f"   Action: {log['action']}")
        print(f"   Result: {log['result'][:50]}...")
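

The calculate tool in this demo relies on `eval`, which is risky once the expression comes from a real model or a user. One possible alternative, sketched here as an assumption rather than as part of the original demo, is to walk the parsed syntax tree and allow only numbers and the four basic operators. `safe_eval` is a hypothetical helper name.

```python
import ast
import operator

# Whitelisted operators; exponentiation is deliberately left out so that
# inputs like 9 ** 9 ** 9 cannot consume unbounded CPU or memory.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}


def safe_eval(expr):
    """Evaluate +, -, *, / arithmetic; raise ValueError for anything else."""
    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expr, mode="eval").body)


print(safe_eval("15 + 8 * 2"))
```

Function calls, attribute access, and names all fall through to the `ValueError`, so an input such as `__import__('os')` is rejected instead of executed.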

2. Using Tencent Cloud's Hunyuan model

Next, let's walk through the agent's execution flow using Tencent Cloud's Hunyuan model:

import os
import json
from tencentcloud.common import credential
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.hunyuan.v20230901 import hunyuan_client, models


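class HunyuanAgent:
    def __init__(self):
        # 1. Initialize the Hunyuan client
        self.cred = credential.Credential(
            os.getenv("TENCENTCLOUD_SECRET_ID"),
            os.getenv("TENCENTCLOUD_SECRET_KEY")
        )
        self.client = hunyuan_client.HunyuanClient(self.cred, "ap-guangzhou")

        # 2. Define the tools (note: Parameters must be a JSON string)
        self.tools = [
            {
                "Type": "function",
                "Function": {
                    "Name": "get_weather",
                    "Description": "Get the weather for a given city",
                    "Parameters": json.dumps({  # must be a JSON string
                        "type": "object",
                        "properties": {
                            "city": {
                                "type": "string",
                                "description": "City name, e.g. 'Beijing', 'Shanghai', 'Shenzhen'"
                            }
                        },
                        "required": ["city"]
                    }, ensure_ascii=False)  # ensure_ascii=False keeps non-ASCII text readable
                }
            },
            {
                "Type": "function",
                "Function": {
                    "Name": "calculate",
                    "Description": "Evaluate a math expression",
                    "Parameters": json.dumps({  # must be a JSON string
                        "type": "object",
                        "properties": {
                            "expression": {
                                "type": "string",
                                "description": "Math expression, e.g. '15+8 * 2', '(3+5)*2'"
                            }
                        },
                        "required": ["expression"]
                    }, ensure_ascii=False)
                }
            }
        ]

        # 3. Map tool names to their implementations
        self.tool_functions = {
            "get_weather": self._get_weather,
            "calculate": self._calculate
        }

        # 4. Conversation history
        self.messages = []

        # 5. System prompt: defines the agent's behavior
        self.system_prompt = """You are an intelligent assistant that can call tools to help the user.
        The tools available to you are:
        1. get_weather: look up the weather for a city
        2. calculate: evaluate a math expression

        When the user's question involves these capabilities, call the matching tool.
        When calling a tool, make sure the arguments are complete and accurate.
        If no tool is needed, answer the user's question directly."""

        # Seed the history with the system message
        self.messages.append({"Role": "system", "Content": self.system_prompt})

    # --- Tool implementations ---
    def _get_weather(self, city):
        """Simulated weather lookup (static sample data)."""
        weather_data = {
            "Beijing": "Weather: sunny, temperature: 15°C, humidity: 45%, air quality: good",
            "Shanghai": "Weather: cloudy, temperature: 18°C, humidity: 60%, air quality: excellent",
            "Shenzhen": "Weather: showers, temperature: 22°C, humidity: 75%, air quality: good",
            "Guangzhou": "Weather: overcast, temperature: 20°C, humidity: 65%, air quality: excellent",
            "Hangzhou": "Weather: sunny, temperature: 16°C, humidity: 50%, air quality: good"
        }
        return weather_data.get(city, f"Sorry, no weather data is available for {city} yet")

    def _calculate(self, expression):
        """Evaluate an arithmetic expression after basic input filtering."""
        try:
            # Strip spaces before filtering
            expression = expression.replace(" ", "")
            # Allow only digits, arithmetic operators, parentheses, and the decimal point
            allowed_chars = "0123456789+-*/()."
            if not all(c in allowed_chars for c in expression):
                return "Error: the expression contains disallowed characters"

            # Evaluate with builtins disabled. The whitelist still permits "**",
            # so very large powers (e.g. 9**9**9) can be expensive to compute.
            result = eval(expression, {"__builtins__": {}}, {})
            return f"Result: {result}"
        except ZeroDivisionError:
            return "Error: division by zero"
        except Exception as e:
            return f"Calculation failed: {str(e)}"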

    def _simple_process(self, user_input):
        """Simplified flow (no tool calling)."""
        print(f"User: {user_input}")

        # Append to the message history
        self.messages.append({"Role": "user", "Content": user_input})

        try:
            # Build the request
            req = models.ChatCompletionsRequest()
            req.Model = "hunyuan-standard"  # standard model
            req.Messages = self.messages
            req.Stream = False

            # Call Hunyuan
            resp = self.client.ChatCompletions(req)
            message = resp.Choices[0].Message

            # Extract the answer
            response = message.Content

            # Append to the history
            self.messages.append({"Role": "assistant", "Content": response})

            return response

        except TencentCloudSDKException as e:
            return f"Call failed: {e}"

    def _process_with_tools(self, user_input):
        """Flow with tool calling: one round of tool calls, then a final answer."""
        print(f"User: {user_input}")

        # Append to the message history
        self.messages.append({"Role": "user", "Content": user_input})

        try:
            # Build the request with tools enabled
            req = models.ChatCompletionsRequest()
            req.Model = "hunyuan-standard"  # or use hunyuan-pro
            req.Messages = self.messages
            req.Tools = self.tools
            req.ToolChoice = "auto"  # let the model decide
            req.Stream = False

            # Call Hunyuan
            resp = self.client.ChatCompletions(req)
            message = resp.Choices[0].Message
            print("Model reply ------------------------:")
            print(message)

            # Check whether the model requested any tool calls
            if hasattr(message, 'ToolCalls') and message.ToolCalls:
                print("Tool call request detected")

                # Save the assistant message (including its tool calls)
                tool_calls_serializable = []
                for tool_call in message.ToolCalls:
                    tool_calls_serializable.append({
                        "Function": {
                            "Name": tool_call.Function.Name,
                            "Arguments": tool_call.Function.Arguments
                        },
                        "Id": tool_call.Id
                    })
                self.messages.append({
                    "Role": "assistant",
                    "Content": message.Content if hasattr(message, 'Content') else "",
                    "ToolCalls": tool_calls_serializable
                })

                # Handle each tool call
                tool_results = []
                for tool_call in message.ToolCalls:
                    try:
                        func_name = tool_call.Function.Name
                        args_str = tool_call.Function.Arguments

                        # Parse the arguments
                        args = json.loads(args_str) if isinstance(args_str, str) else args_str

                        print(f"Calling tool: {func_name}, arguments: {args}")

                        if func_name in self.tool_functions:
                            # Run the tool
                            result = self.tool_functions[func_name](**args)
                            tool_results.append(result)

                            # Append the tool result to the message history
                            self.messages.append({
                                "Role": "tool",
                                "Content": json.dumps({"result": result}, ensure_ascii=False),
                                "ToolCallId": tool_call.Id
                            })
                        else:
                            error_msg = f"Unknown tool: {func_name}"
                            self.messages.append({
                                "Role": "tool",
                                "Content": json.dumps({"error": error_msg}, ensure_ascii=False),
                                "ToolCallId": tool_call.Id
                            })
                            tool_results.append(error_msg)

                    except Exception as e:
                        error_msg = f"Tool call failed: {str(e)}"
                        self.messages.append({
                            "Role": "tool",
                            "Content": json.dumps({"error": error_msg}, ensure_ascii=False),
                            "ToolCallId": getattr(tool_call, 'Id', 'unknown')
                        })
                        tool_results.append(error_msg)

                # Ask the model for a final answer based on the tool results
                req2 = models.ChatCompletionsRequest()
                req2.Model = "hunyuan-standard"
                req2.Messages = self.messages
                req2.Stream = False

                resp2 = self.client.ChatCompletions(req2)
                final_message = resp2.Choices[0].Message
                final_response = final_message.Content

                # Append the final answer to the history
                self.messages.append({"Role": "assistant", "Content": final_response})

                return final_response

            else:
                # No tool call: return the reply directly
                response = message.Content
                self.messages.append({"Role": "assistant", "Content": response})
                return response

        except TencentCloudSDKException as e:
            return f"Call failed: {e}"
        except Exception as e:
            return f"Processing failed: {str(e)}"

    def process(self, user_input, use_tools=True):
        """Handle one user input."""
        if use_tools:
            return self._process_with_tools(user_input)
        else:
            return self._simple_process(user_input)

    def chat(self, use_tools=True):
        """Interactive chat loop."""
        print("=" * 60)
        print("Tencent Cloud Hunyuan Agent Demo")
        print("=" * 60)
        print(f"Mode: {'tool calling' if use_tools else 'plain chat'}")
        print("Supported features:")
        print("  1. Weather lookup: What's the weather like in Beijing?")
        print("  2. Math: What is 15+8 * 2?")
        print("  3. Multi-city weather: Compare the weather in Beijing and Shanghai")
        print("  4. General conversation")
        print("Type 'quit' to exit, 'clear' to reset the history")
        print("=" * 60)

        while True:
            try:
                user_input = input("\nUser: ").strip()

                if not user_input:
                    continue

                if user_input.lower() == 'quit':
                    print("Goodbye!")
                    break

                if user_input.lower() == 'clear':
                    self.messages = [{"Role": "system", "Content": self.system_prompt}]
                    print("History cleared")
                    continue

                # Handle the input
                response = self.process(user_input, use_tools)
                print(f"Assistant: {response}")

            except KeyboardInterrupt:
                print("\n\nGoodbye!")
                break
            except Exception as e:
                print(f"Error: {e}")


def test_direct_api():
    """Call the API directly (for debugging)."""
    print("Testing a direct Hunyuan API call...")

    cred = credential.Credential(
        os.getenv("TENCENTCLOUD_SECRET_ID"),
        os.getenv("TENCENTCLOUD_SECRET_KEY")
    )
    client = hunyuan_client.HunyuanClient(cred, "ap-guangzhou")

    # Simple smoke test
    req = models.ChatCompletionsRequest()
    req.Model = "hunyuan-standard"
    req.Messages = [
        {"Role": "system", "Content": "You are a helpful assistant"},
        {"Role": "user", "Content": "Hello, please introduce yourself in one sentence"}
    ]
    req.Stream = False

    try:
        resp = client.ChatCompletions(req)
        print(f"Test succeeded! Response: {resp.Choices[0].Message.Content}")
        return True
    except TencentCloudSDKException as e:
        print(f"API call failed: {e}")
        return False


if __name__ == "__main__":
    # Check the environment variables
    secret_id = os.getenv("TENCENTCLOUD_SECRET_ID")
    secret_key = os.getenv("TENCENTCLOUD_SECRET_KEY")

    if not secret_id or not secret_key:
        print("⚠️ Warning: Tencent Cloud credential environment variables are not set")
        print("Set them first:")
        print("  Linux/Mac: export TENCENTCLOUD_SECRET_ID='your-id'")
        print("             export TENCENTCLOUD_SECRET_KEY='your-key'")
        print("  Windows:   set TENCENTCLOUD_SECRET_ID=your-id")
        print("             set TENCENTCLOUD_SECRET_KEY=your-key")
        print("\nAlternatively, enter the credentials here:")
        secret_id = input("SecretId: ").strip()
        secret_key = input("SecretKey: ").strip()

        if secret_id and secret_key:
            os.environ["TENCENTCLOUD_SECRET_ID"] = secret_id
            os.environ["TENCENTCLOUD_SECRET_KEY"] = secret_key
        else:
            # No mock mode is implemented, so stop here.
            print("No credentials provided; exiting.")
            raise SystemExit(1)

    # Test the API connection
    if not test_direct_api():
        print("API connection failed; check your credentials and network")
        raise SystemExit(1)

    # Choose the mode
    print("\nChoose a mode:")
    print("1. Tool calling (recommended)")
    print("2. Plain chat")
    choice = input("Enter your choice (1 or 2): ").strip()

    use_tools = (choice == "1")

    # Create and run the agent
    agent = HunyuanAgent()
    agent.chat(use_tools=use_tools)

pip install tencentcloud-sdk-python

export TENCENTCLOUD_SECRET_ID='your-id'

export TENCENTCLOUD_SECRET_KEY='your-key'

python3 tools/agent_demo.py

Output: