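This article walks through the survey Agentic Reasoning for Large Language Models and its companion open-source repository, Awesome-Agentic-Reasoning, to explain agentic reasoning, a core technique for LLM agents. It first defines agentic reasoning as a closed "think - act - feedback - evolve" loop that is goal-directed, interacts with its environment, and improves itself. It then details the three-layer architecture (foundational reasoning: planning, tool use, agentic search; self-evolution: feedback mechanisms, memory management, capability evolution; collective collaboration: role assignment, cooperation, shared memory) and surveys the key papers and methods under each module. It also summarizes applications in scientific discovery, software development, embodied agents, healthcare, and web automation, together with the corresponding benchmarks, and covers how to use and contribute to the repository. It closes with future directions such as cross-modal reasoning, low-resource adaptation, and real-world deployment, offering AI researchers, algorithm engineers, and founders a structured, up-to-date guide to learning and practicing agentic reasoning.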

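As large language models (LLMs) evolve from "text generators" into "autonomous agents", one capability becomes the key differentiator: **Agentic Reasoning**. It lets AI not only "think" but also "act": plan tasks, use tools, improve itself, and collaborate, ultimately solving complex problems in the real world.
Researchers recently released the survey Agentic Reasoning for Large Language Models and open-sourced the accompanying Awesome-Agentic-Reasoning repository, which gathers hundreds of conference papers, application case studies, and benchmarks into a one-stop resource for the field. This article takes a close look at the repository and covers the core architecture, recent progress, and deployment scenarios of agentic reasoning, so you can get up to speed on the next generation of AI technology.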
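I. What Is Agentic Reasoning? The Agent's "Think + Act" Loop
Traditional LLMs are good at text interaction but lack the ability to make decisions autonomously and to interact with an environment. The core of Agentic Reasoning is a closed "think - act - feedback - evolve" loop with three defining characteristics:
- Goal-directed: the agent plans around a concrete task instead of passively responding;
- Environment interaction: the agent affects the world through tool use, search, physical interaction, and similar means;
- Self-evolution: the agent uses memory and feedback to keep improving its behavior policy.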
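The loop above can be sketched in a few lines of Python. This is a toy illustration, not code from the survey or the repository: `ToyAgent`, `environment`, and `run` are hypothetical names, and the "task" is simply guessing a hidden integer. It shows how a goal-directed choice (think), an action against an environment, the returned feedback, and a memory-driven strategy update fit together.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Guesses a hidden integer; remembered feedback narrows the search range."""
    low: int = 0
    high: int = 100
    memory: list = field(default_factory=list)  # (guess, feedback) pairs

    def think(self) -> int:
        # Goal-directed: pick the midpoint of the range still consistent with memory.
        return (self.low + self.high) // 2

    def evolve(self, guess: int, feedback: str) -> None:
        # Self-evolution: store the feedback and tighten the strategy.
        self.memory.append((guess, feedback))
        if feedback == "higher":
            self.low = guess + 1
        elif feedback == "lower":
            self.high = guess - 1


def environment(secret: int, guess: int) -> str:
    # Environment interaction: the world responds to the agent's action.
    if guess < secret:
        return "higher"
    if guess > secret:
        return "lower"
    return "correct"


def run(agent: ToyAgent, secret: int, max_steps: int = 10) -> int:
    """Return the number of steps needed, or -1 if the budget runs out."""
    for step in range(1, max_steps + 1):
        guess = agent.think()                   # think
        feedback = environment(secret, guess)   # act + feedback
        agent.evolve(guess, feedback)           # evolve
        if feedback == "correct":
            return step
    return -1
```

Calling `run(ToyAgent(), 37)` walks through the guesses 50, 24, 37. In a real agent, `think` would be an LLM call and `environment` a tool, API, or simulator.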
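II. Repository Highlights: Why Has It Become a Reference for the Field?
The repository's core value lies in being systematic, current, and practical. Three highlights stand out:
- Backed by a 2026 survey: built on an arXiv preprint (arXiv:2601.12538) that covers research from 2022 to 2026, including work from Google DeepMind, OpenAI, Meta, and other institutions;
- A full classification framework: organized into five modules (foundational reasoning, self-evolution, collective collaboration, applications, benchmarks), each subdivided into specific directions (for example, planning reasoning covers workflow design, tree search, and decomposition strategies);
- Community-driven updates: new papers can be submitted via pull request; the repository currently lists 500+ papers from venues such as NeurIPS, ICML, and ACL, and keeps tracking the field.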
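III. Core Content: From Foundational Capabilities to Frontier Applications
(1) Foundational reasoning layer: the agent's "fundamentals"
Foundational reasoning is the agent's core capability. It covers three modules (planning, tool use, and search) and underpins every complex task:
1. Planning Reasoning: the agent's "mind map"
Planning teaches the AI to decompose a task and execute it step by step. The core methods fall into two categories:
- In-context planning: no fine-tuning; the LLM is prompted to generate the steps (e.g. Least-to-Most Prompting, Plan-and-Solve);
- Tree search / algorithm simulation: explores a tree of candidate reasoning paths with classical search, such as BFS/DFS in Tree of Thoughts, A*-style search in Q*, or AlphaZero-inspired Monte Carlo tree search (MCTS) in other methods;
- Key papers: Tree of Thoughts (NeurIPS 2023), Algorithm of Thoughts (ICML 2024), Plan-and-Act (2025).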
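To make the tree-search idea concrete, here is a minimal breadth-first search with a beam, in the spirit of Tree of Thoughts but not taken from that paper's code. The assumptions are loud: `propose` and `score` are hypothetical stand-ins for LLM calls that generate and evaluate candidate "thoughts", and the toy task is reaching a target number from a start value using the moves `+3` and `*2`.

```python
from typing import List, Optional, Tuple

State = Tuple[int, Tuple[str, ...]]  # (current value, steps taken so far)


def propose(state: State) -> List[State]:
    # Stand-in for an LLM proposing candidate next thoughts.
    value, steps = state
    return [(value + 3, steps + ("+3",)), (value * 2, steps + ("*2",))]


def score(state: State, target: int) -> int:
    # Stand-in for an LLM or heuristic evaluator; higher is better.
    return -abs(target - state[0])


def tree_search(start: int, target: int, beam_width: int = 2,
                max_depth: int = 6) -> Optional[List[str]]:
    frontier: List[State] = [(start, ())]
    for _ in range(max_depth):
        # Expand every state on the frontier by one step.
        candidates = [child for s in frontier for child in propose(s)]
        for value, steps in candidates:
            if value == target:
                return list(steps)
        # Keep only the most promising partial plans (the beam).
        candidates.sort(key=lambda s: score(s, target), reverse=True)
        frontier = candidates[:beam_width]
    return None  # no plan found within the depth budget
```

For example, `tree_search(1, 14)` finds the plan `+3, +3, *2` (1 → 4 → 7 → 14). The beam width trades exploration for cost: a wider beam keeps more partial plans alive but requires more evaluator calls.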
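2. Tool-Use Optimization: the agent's "Swiss Army knife"
Teaches the AI to use external tools such as APIs, calculators, and search engines fluently. The main directions are:
- In-context tool integration: chain-of-thought (CoT) reasoning combined with tool calls (e.g. ReAct, HuggingGPT);
- Fine-tuning: supervised fine-tuning (ToolLLM) or reinforcement learning (ToolRL) so the model internalizes how to use tools;
- Tool orchestration: coordinating multiple tools (e.g. ToolPlanner, OctoTools);
- Key papers: Toolformer (NeurIPS 2023), Gorilla (2023), ToolRL (2025).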
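A ReAct-style loop alternates between model output and tool observations. The sketch below is an assumption-heavy illustration rather than any library's API: `scripted_model` is a hypothetical stand-in for an LLM, the `Action: tool[arg]` / `Final: ...` text format is chosen for this example only, and `TOOLS` holds two toy tools.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Toy tool registry (hypothetical): each tool maps a string argument to a string result.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(float(x) for x in arg.split(","))),
    "upper": lambda arg: arg.upper(),
}

ACTION = re.compile(r"^Action: (\w+)\[(.*)\]$")


def scripted_model(transcript: List[str]) -> str:
    # Stand-in for an LLM: call the calculator once, then report the observation.
    observations = [line for line in transcript if line.startswith("Observation: ")]
    if not observations:
        return "Action: add[2.5,4]"
    return "Final: " + observations[-1][len("Observation: "):]


def react_loop(question: str, model: Callable[[List[str]], str],
               tools: Dict[str, Callable[[str], str]],
               max_turns: int = 5) -> Tuple[Optional[str], List[str]]:
    transcript = ["Question: " + question]
    for _ in range(max_turns):
        output = model(transcript)
        transcript.append(output)
        if output.startswith("Final: "):
            return output[len("Final: "):], transcript
        match = ACTION.match(output)
        if match is None or match.group(1) not in tools:
            # Feed the error back so the model can correct itself next turn.
            transcript.append("Observation: invalid action")
            continue
        name, arg = match.groups()
        transcript.append("Observation: " + tools[name](arg))
    return None, transcript
```

Running `react_loop("What is 2.5 + 4?", scripted_model, TOOLS)` yields the answer `"6.5"` after one tool call. The fine-tuning approaches listed above aim to make a real model emit well-formed actions reliably, which this loop simply assumes.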
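3. Agentic Search: the agent's "information radar"
Lets the AI search for information and verify answers on its own, which mitigates the stale-knowledge problem:
- In-context search: reasoning interleaved with retrieval (e.g. Self-RAG, Interleaving Retrieval);
- Structured search: precise retrieval over knowledge graphs (e.g. Agent-G, GeAR);
- Key papers: Self-RAG (ICLR 2024), Search-R1 (2025), DeepRAG (2025).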
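Interleaving reasoning with retrieval means that what the agent searches for next depends on what it just found. The following is a deliberately tiny sketch under stated assumptions: the three-sentence `CORPUS`, the word-overlap `retrieve` function, and the "last word is the entity" rule in `last_entity` are all hypothetical simplifications standing in for a real retriever and an LLM reasoning step.

```python
from typing import List, Tuple

CORPUS = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "Paris is the capital of France.",
]


def tokens(text: str) -> set:
    return set(text.lower().strip("?.").split())


def retrieve(query: str) -> str:
    # Toy retriever: return the document sharing the most words with the query.
    return max(CORPUS, key=lambda doc: len(tokens(query) & tokens(doc)))


def last_entity(doc: str) -> str:
    # Toy "reasoning" step: treat the final word of the evidence as the entity.
    return doc.rstrip(".").split()[-1]


def answer_two_hop(first_query: str, second_template: str) -> Tuple[str, List[str]]:
    trace = []
    evidence = retrieve(first_query)                    # hop 1: search
    trace.append(evidence)
    bridge = last_entity(evidence)                      # reason over the evidence
    evidence = retrieve(second_template.format(bridge)) # hop 2: search again
    trace.append(evidence)
    return last_entity(evidence), trace
```

With `answer_two_hop("Where was Marie Curie born?", "{} is the capital of which country?")`, the first hop surfaces Warsaw, and the second query is built from that result to reach Poland. A single up-front retrieval with the original question would not have matched the second document.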
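(2) Self-evolution layer: the agent's "lifelong learning" ability
This layer lets the AI learn from experience and keep improving its performance. It has three modules:
1. Agentic Feedback: the agent's "error notebook"
- Reflective feedback: self-critique and revision (e.g. Reflexion, Self-Refine);
- Verifier-driven feedback: results are checked by external tools (e.g. LEVER, CodeRL);
- Key papers: Reflexion (NeurIPS 2023), Self-Refine (NeurIPS 2023).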
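The two feedback styles above combine naturally: an external verifier finds a concrete failure, and the failure is turned into a textual reflection for the next attempt. The sketch below assumes a scripted generator (`DRAFTS` and `generate` are hypothetical stand-ins for LLM outputs) and uses unit tests for an absolute-value function as the verifier; it is not the Reflexion or LEVER implementation.

```python
from typing import Callable, List, Optional, Tuple

TESTS = [(3, 3), (-2, 2), (0, 0)]  # (input, expected) pairs for an absolute-value function

# Scripted drafts standing in for successive LLM outputs (hypothetical).
DRAFTS: List[Tuple[str, Callable[[int], int]]] = [
    ("return x", lambda x: x),
    ("return -x", lambda x: -x),
    ("return x if x >= 0 else -x", lambda x: x if x >= 0 else -x),
]


def generate(reflections: List[str]) -> Tuple[str, Callable[[int], int]]:
    # A real generator would condition on the reflection text; this stub
    # simply advances to the next draft each time a reflection is added.
    return DRAFTS[min(len(reflections), len(DRAFTS) - 1)]


def verify(fn: Callable[[int], int]) -> List[Tuple[int, int, int]]:
    # External verifier: run the tests and report (input, expected, got) failures.
    return [(x, want, fn(x)) for x, want in TESTS if fn(x) != want]


def refine(max_attempts: int = 5) -> Tuple[Optional[str], List[str]]:
    reflections: List[str] = []
    for _ in range(max_attempts):
        source, fn = generate(reflections)
        failures = verify(fn)
        if not failures:
            return source, reflections
        x, want, got = failures[0]
        # Turn the concrete failure into feedback for the next attempt.
        reflections.append(f"'{source}' returned {got} for input {x}, expected {want}")
    return None, reflections
```

Here `refine()` accepts the third draft after recording two reflections. The key design point is that the feedback names a specific failing input, which gives the next attempt something actionable rather than a bare "wrong".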
2. Agentic Memory: the agent's "brain storage"
- Flat memory: factual memory (e.g. RAG, MemGPT);
- Structured memory: experience stored in graph or tree structures (e.g. Zep, Mem0);
- Key papers: MemGPT (2023), Nemori (2025), MemOS (2025).
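The difference between flat and structured memory shows up in the queries each can answer. The following sketch uses two hypothetical classes, `FlatMemory` and `GraphMemory`, written for this article; they do not reflect the internals of MemGPT, Zep, or Mem0.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FlatMemory:
    """An append-only list of text records with keyword lookup."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def write(self, text: str) -> None:
        self.records.append(text)

    def search(self, keyword: str) -> List[str]:
        return [r for r in self.records if keyword.lower() in r.lower()]


class GraphMemory:
    """Stores (subject, relation, object) triples and supports multi-hop lookup."""

    def __init__(self) -> None:
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def write(self, subject: str, relation: str, obj: str) -> None:
        self.edges[subject].append((relation, obj))

    def neighbors(self, subject: str, relation: str) -> List[str]:
        return [o for r, o in self.edges.get(subject, []) if r == relation]

    def follow(self, start: str, relations: List[str]) -> List[str]:
        # Walk the graph along a chain of relations, one hop per relation.
        frontier = [start]
        for relation in relations:
            frontier = [o for node in frontier for o in self.neighbors(node, relation)]
        return frontier
```

After writing `("alice", "works_at", "acme")` and `("acme", "located_in", "berlin")` to a `GraphMemory`, the call `follow("alice", ["works_at", "located_in"])` returns `["berlin"]` in one query. A `FlatMemory` holding the same two facts as sentences can return both for the keyword "acme", but leaves it to the caller to chain them.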
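3. Evolving Capabilities: the agent's "self-upgrade"
Lets the AI improve its own core capabilities such as planning and tool use:
- Self-challenging: the agent generates hard tasks to train itself (e.g. Self-challenging);
- Tool creation: the agent designs new tools on its own (e.g. Large Language Models as Tool Makers);
- Key papers: SELF: Self-Evolution with Language Feedback (2023), LLM Agents Making Agent Tools (2025).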
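Tool creation can be pictured as: the model writes a function, the system validates it against a few checks, and only then registers it for reuse. The sketch below is a hedged illustration of that flow, not the method of any cited paper. `GENERATED_SOURCE` is a hypothetical stand-in for model-written code, and `make_tool` is a name invented here.

```python
from typing import Any, Callable, Dict, List, Tuple

# Stand-in for source code that a model might write (hypothetical).
GENERATED_SOURCE = '''
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32
'''


def make_tool(source: str, name: str,
              checks: List[Tuple[tuple, Any]]) -> Callable[..., Any]:
    namespace: Dict[str, Any] = {}
    # Toy only: never exec untrusted code outside a sandbox.
    exec(source, namespace)
    tool = namespace[name]
    # Validate the new tool before it is allowed into the registry.
    for args, expected in checks:
        got = tool(*args)
        if got != expected:
            raise ValueError(f"{name}{args} returned {got}, expected {expected}")
    return tool


registry: Dict[str, Callable[..., Any]] = {}
registry["celsius_to_fahrenheit"] = make_tool(
    GENERATED_SOURCE,
    "celsius_to_fahrenheit",
    checks=[((0,), 32), ((100,), 212)],
)
```

Once registered, `registry["celsius_to_fahrenheit"](100)` returns `212.0`, and later tasks can call the tool without regenerating it. The validation step matters because a tool that is wrong once will be wrong every time it is reused.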
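(3) Collective collaboration layer: multi-agent "teamwork"
A single agent has limited capability; multiple agents working together can solve more complex tasks:
- Role assignment: roles are allocated by task (e.g. MetaGPT's split between product manager and developer);
- Cooperation: agents reach the goal through dialogue and negotiation (e.g. AgentOrchestra, Collab-RAG);
- Shared memory: pooled experience improves team efficiency (e.g. G-Memory, MIRIX);
- Key papers: MetaGPT (ICLR 2024), Chain of Agents (NeurIPS 2024), Theory of mind for multi-agent collaboration (2023).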
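Role assignment with a shared message board can be sketched as a fixed pipeline in which each role reads what earlier roles wrote. This is an illustrative toy, not MetaGPT's code: the three role functions are hypothetical, return canned strings instead of calling an LLM, and `board` plays the part of shared memory.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Role = Callable[[str, List[Message]], str]


def product_manager(task: str, board: List[Message]) -> str:
    return f"SPEC: implement a function for '{task}'"


def developer(task: str, board: List[Message]) -> str:
    spec = board[-1]["content"]  # read the previous role's output
    return f"CODE: def solve(): ...  # written against [{spec}]"


def reviewer(task: str, board: List[Message]) -> str:
    code = board[-1]["content"]
    return "REVIEW: approved" if code.startswith("CODE:") else "REVIEW: rejected"


PIPELINE: List[tuple] = [
    ("product_manager", product_manager),
    ("developer", developer),
    ("reviewer", reviewer),
]


def run_team(task: str) -> List[Message]:
    board: List[Message] = []  # shared message log visible to every role
    for role, act in PIPELINE:
        board.append({"role": role, "content": act(task, board)})
    return board
```

`run_team("sort a list")` produces three messages, ending with the reviewer's approval. A hand-written pipeline like this fixes the order of roles in advance; the LLM-driven pipelines mentioned in the repository let the agents decide who speaks next.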
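(4) Applications: from research to industry
The repository collects applications of agentic reasoning in five core domains:
| Domain | Typical scenarios | Representative papers / tools |
|---|---|---|
| Scientific discovery | Materials design, drug development, mathematical reasoning | ProtAgents (Digital Discovery 2024), The AI Scientist (2024) |
| Software development | Repository-level programming, automated debugging | CodePlan (FSE 2024), AgentCoder (2023) |
| Embodied agents | Robot manipulation, autonomous driving | Voyager (2023), Gemini Robotics (2025) |
| Healthcare | Disease diagnosis, clinical decision-making | AgentMD (Nature Communications 2025), MedAgent-Pro (2025) |
| Web / office automation | Autonomous search, task collaboration | WebGPT (2021), PC-Agent (2025) |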
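(5) Benchmarks: the "yardstick" for evaluating agentic reasoning
The repository organizes benchmarks that cover both core mechanisms and application scenarios, so researchers can validate model performance:
- Core-mechanism benchmarks: tool use (ToolQA, GTA), memory (LongMemEval), planning (PlanBench);
- Application benchmarks: embodied agents (ALFWorld, OSWorld), scientific discovery (ScienceWorld), web agents (WebArena, Mind2Web).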
Awesome Agentic Reasoning Papers
This repository organizes research by thematic areas that integrate reasoning with action, including planning, tool use, search, self-evolution through memory and feedback, multi-agent systems, and real-world applications and benchmarks.
📄 Based on the survey: Agentic Reasoning for Large Language Models: A Survey
🔔 News
[01/21/26] 🚀 We have released a comprehensive survey on Agentic Reasoning for Large Language Models! The paper is now available on arXiv and HuggingFace. We welcome contributions from the community to help expand and improve our survey 🤗!
📋 Table of Contents
- 🔔 News
- 📋 Table of Contents
- 🌟 Introduction
- 🤝 Contributing
- 📝 Citation
- 🏗️ Foundational Agentic Reasoning
  - 🗺️ Planning Reasoning
  - 🛠️ Tool-Use Optimization
  - 🔍 Agentic Search
- 🧬 Self-evolving Agentic Reasoning
  - 🔄 Agentic Feedback Mechanisms
  - 🧠 Agentic Memory
  - 🚀 Evolving Foundational Agentic Capabilities
- 👥 Collective Multi-agent Reasoning
  - 🎭 Role Taxonomy of Multi-Agent Systems (MAS)
  - 🤝 Collaboration and Division of Labor
  - 🌱 Multi-Agent Memory and Evolution
- 🎨 Applications
  - 💻 Math Exploration & Vibe Coding Agents
  - 🔬 Scientific Discovery Agents
  - 🤖 Embodied Agents
  - 🏥 Healthcare & Medicine Agents
  - 🌐 Autonomous Web Exploration & Research Agents
- 📊 Benchmarks
  - ⚙️ Core Mechanisms of Agentic Reasoning
    - Tool Use
    - Search
    - Memory and Planning
    - Multi-Agent System
  - 🎯 Applications of Agentic Reasoning
    - Embodied Agents
    - Scientific Discovery Agents
    - Autonomous Research Agents
    - Medical and Clinical Agents
    - Web Agents
    - General Tool-Use Agents
🌟 Introduction
Bridging thought and action through autonomous agents that reason, act, and learn via continual interaction with their environments. The goal is to enhance agent capabilities by grounding reasoning in action.
We organize agentic reasoning into three layers, each corresponding to a distinct reasoning paradigm under different environmental dynamics:
🔹 Foundational Reasoning. Core single-agent abilities (planning, tool-use, search) in stable environments
🔹 Self-Evolving Reasoning. Adaptation through feedback, memory, and learning in dynamic settings
🔹 Collective Reasoning. Multi-agent coordination, role specialization, and collaborative intelligence
Across these layers, we further identify complementary reasoning paradigms defined by their optimization settings.
🔸 In-Context Reasoning. Test-time scaling through structured orchestration and adaptive workflows
🔸 Post-Training Reasoning. Behavior optimization via RL and supervised fine-tuning
🤝 Contributing
This collection is an ongoing effort. We are actively expanding and refining its coverage, and welcome contributions from the community. You can:
- Submit a pull request to add papers or resources
- Open an issue to suggest additional papers or resources
- Email us at twei10@illinois.edu, twli@illinois.edu, liu326@illinois.edu
We regularly update the repository to include new research works on agentic reasoning.
📝 Citation
If you find this repository or paper useful, please consider citing the survey paper:
@article{wei2026agentic,
  title={Agentic Reasoning for Large Language Models},
  author={Wei, Tianxin and Li, Ting-Wei and Liu, Zhining and Ning, Xuying and Yang, Ze and Zou, Jiaru and Zeng, Zhichen and Qiu, Ruizhong and Lin, Xiao and Fu, Dongqi and others},
  journal={arXiv preprint arXiv:2601.12538},
  year={2026}
}
🏗️ Foundational Agentic Reasoning
🗺️ Planning Reasoning
In-context Planning
Workflow Design
Tree Search / Algorithm Simulation
Process Formalization
Decoupling / Decomposition
External Aid / Tool Use
Post-training Planning
🛠️ Tool-Use Optimization
In-Context Tool-Integration
Interleaving Reasoning and Tool Use
Optimizing Context for Tool Interaction
Post-training Tool-Integration
Bootstrapping of Tool Use via SFT
Mastery of Tool Use via RL
Orchestration-based Tool-Integration
Agentic Pipelines for Tool Orchestration
Tool Representations for Orchestration
🔍 Agentic Search
In-Context Search
Interleaving Reasoning and Search
Structure-Enhanced Search
Post-Training Search
SFT-Based Agentic Search
RL-Based Agentic Search
🧬 Self-evolving Agentic Reasoning
🔄 Agentic Feedback Mechanisms
Reflective Feedback
Parametric Adaptation
Validator-Driven Feedback
🧠 Agentic Memory
Agentic Use of Flat Memory
Factual Memory
Experience Memory
Structured Use of Memory
Post-training Memory Control
🚀 Evolving Foundational Agentic Capabilities
Self-evolving Planning
Self-evolving Tool-use
Self-evolving Search for Memory Retrieval
👥 Collective Multi-agent Reasoning
🤝 Collaboration and Division of Labor
In-context Collaboration
Manually Crafted Pipelines
LLM-Driven Pipelines
Theory-of-Mind-Augmented Collaboration
Post-training Collaboration
Multi-agent Prompt Optimization
Graph-based Topology Generation
Policy-based Topology Generation
🌱 Multi-Agent Memory and Evolution
From Single-Agent Evolution to Multi-Agent Evolution
Intra-test-time Evolution
Inter-test-time Evolution
Multi-agent Evolution
Multi-agent Memory Management for Evolution
Training Multi-agent to Evolve
🎨 Applications
💻 Math Exploration & Vibe Coding Agents
Foundational Agentic Reasoning
Self-evolving Agentic Reasoning
Collective Multi-agent Reasoning
🔬 Scientific Discovery Agents
Foundational Agentic Reasoning
Self-evolving Agentic Reasoning
Collective multi-agent reasoning
🤖 Embodied Agents
Foundational Agentic Reasoning
Self-evolving Agentic Reasoning
Collective multi-agent reasoning
🏥 Healthcare & Medicine Agents
Foundational agentic reasoning
Self-evolving agentic reasoning
Collective multi-agent reasoning
🌐 Autonomous Web Exploration & Research Agents
Foundational agentic reasoning
Self-evolving agentic reasoning
Collective multi-agent reasoning
⚙️ Core Mechanisms of Agentic Reasoning
Tool Use
Single-Turn Tool Use
Multi-Turn Tool Use
Search
Memory and Planning
Long-Horizon Episodic Memory
Multi-session Recall
Planning and Feedback
Multi-Agent System
Game-based reinforcement learning evaluation
Simulation-centric real-world assessment
Language, Communication, and Social Reasoning
🎯 Applications of Agentic Reasoning
Embodied Agents
Scientific Discovery Agents
Autonomous Research Agents
Medical and Clinical Agents
Web Agents
General Tool-Use Agents
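Source
https://github.com/weitianxin/Awesome-Agentic-Reasoning/tree/main
Outlook: What Comes Next for Agentic Reasoning
As LLM capabilities keep improving, Agentic Reasoning is positioned to become a core engine for deploying AI. Three directions are worth watching:
- Cross-modal reasoning: multimodal agents that combine text, images, and speech (e.g. Emma-X, M3-Agent);
- Low-resource adaptation: giving small models efficient reasoning ability as well (e.g. LightMem, BTGenBot);
- Real-world deployment: addressing latency, reliability, and safety so that agents can be applied at scale in industry, healthcare, and other domains.
Conclusion
The Awesome-Agentic-Reasoning repository is more than a paper list; it is a "knowledge map" of the agentic reasoning field. Whether you are an AI researcher, an algorithm engineer, or a founder looking to build agents, it can help you learn the core techniques quickly and keep up with new work.
Star the repository now (https://github.com/weitianxin/Awesome-Agentic-Reasoning), join the community discussion, and watch agents make the journey from the lab to the real world.
Follow me for upcoming walkthroughs of the repository's conference papers and hands-on case studies, taking Agentic Reasoning from theory to practice.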