A Deep Dive into Google ADK Internals: Agent System Architecture and Execution Flow Explained

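This article digs into the Google ADK source code to explain how the Agent system works, following the complete technical path from CLI startup to the LLM call. It covers the two-stage architecture, the processor chain pattern, the event-driven mechanism, and the tool call lifecycle. The analysis is based on the real source code and aims to give developers a complete technical view of the Google ADK architecture.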

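Preface

In the tutorial "Google ADK in 5 Minutes: Build Your First AI Agent from Scratch", we learned how to build an AI Agent with Google ADK. But how does the Agent system work internally? How does a simple user question pass through layer after layer of processing before the LLM is called and a result comes back?

This article goes into the Google ADK source code and analyzes the technical architecture and execution flow of the Agent system. By reading the actual source, we will uncover:

The design idea behind Google ADK's two-stage architecture
The complete call path from the CLI to the LLM
The design and implementation of the processor chain pattern
How the event-driven architecture works
The complete lifecycle of a tool call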

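How the Agent System Works

Let us take one concrete user query as an example and follow it through the Agent system:

User input: "What's the weather in New York?"

The real flow, based on source analysis: after reading the Google ADK source code (.venv/lib/python3.13/site-packages/google/adk/), we found that the actual architecture is:

LlmAgent: a configuration container that manages settings such as instruction, tools, and model
SingleFlow: inherits from BaseLlmFlow and configures a specific processor chain
BaseLlmFlow: the abstract base class that defines the generic algorithm of the LLM call loop (inherited by SingleFlow)
Functions module: the concrete logic that executes tool calls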

sequenceDiagram
    participant Dev as Developer
    participant CLI as ADK CLI
    participant IM as importlib
    participant User as User
    participant Web as Web UI
    participant R as Runner
    participant LA as LlmAgent
    participant SF as SingleFlow
    participant LLM as Gemini LLM
    participant FC as Functions
    participant T as Tool function

    Note over Dev,IM: Stage 1 (start the ADK service)
    Dev->>CLI: adk web
    Note over Dev,CLI: The developer starts the web service
    CLI->>IM: importlib.import_module("multi_tool_agent")
    Note over CLI,IM: Dynamically import the Agent module
    IM->>CLI: agent_module.agent.root_agent
    Note over IM,CLI: Obtain the LlmAgent instance
    Note over CLI: ADK web service is up<br/>waiting for user requests

    Note over User,T: Stage 2 (user interaction)
    User->>Web: "What's the weather in New York?"
    Note over User,Web: The user asks a question in the web UI
    Web->>R: Runner(agent=root_agent)
    Note over Web,R: The web UI creates a Runner for the request
    Web->>R: runner.run_async(user_input)
    Note over Web,R: Start the Agent run for the user query
    R->>R: _find_agent_to_run(session, root_agent)
    Note over R: Select the Agent to run<br/>(root_agent in the single-Agent case)
    R->>LA: invocation_context.agent.run_async(ctx)
    Note over R,LA: Call the LlmAgent's execution method
    LA->>SF: self._llm_flow.run_async(ctx)
    Note over LA,SF: LlmAgent delegates to SingleFlow<br/>(inherits BaseLlmFlow)

    loop LLM call loop (BaseLlmFlow logic)
        SF->>SF: Run the processor chain
        Note over SF: basic → auth → instructions<br/>→ identity → contents → planning
        SF->>LLM: _call_llm_async()
        Note over SF,LLM: Call the Gemini API<br/>with tool definitions and the user message
        LLM->>SF: Return the LLM response
        Note over LLM,SF: May contain a tool call or the final answer
        alt LLM requests a tool call
            SF->>FC: handle_function_calls_async()
            Note over SF,FC: Handle the tool call request
            FC->>T: get_weather("New York")
            Note over FC,T: Run the concrete Python tool function
            T->>FC: Return the weather data
            Note over T,FC: {"status": "success", "report": "..."}
            FC->>SF: Tool result Event
            Note over FC,SF: Result wrapped as an Event
            Note over SF: Continue with the next LLM call<br/>and send the tool result to the LLM
        else LLM gives the final answer
            Note over SF: Exit the loop
        end
    end

    SF->>LA: yield Event stream
    LA->>R: yield Event stream
    R->>Web: yield Event stream
    Web->>User: "New York weather is sunny, 25°C"
    Note over Web,User: The user sees the answer in the web UI

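Step-by-Step Walkthrough Based on the Source Code

Stage 1: ADK Service Startup

1. The developer starts the service

Step: The developer runs adk web to start the web service.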

# Run by the developer on the command line
adk web

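The ADK CLI scans the current directory and prepares to load every available Agent.

2. Agent module loading

Step: The CLI loads the Agent module dynamically through importlib.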

# Excerpt from the actual cli.py code
import importlib

agent_module = importlib.import_module(agent_folder_name)  # imports the multi_tool_agent module
root_agent = agent_module.agent.root_agent  # the root_agent we defined

At this stage the system preloads all Agents so that they can be selected from the dropdown menu in the web UI.
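
The loading step can be tried without ADK installed. The following is a minimal sketch, assuming a package whose __init__.py imports its agent submodule; the package name demo_weather_agent, the helper load_root_agent, and the dict standing in for an LlmAgent are all hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_root_agent(agents_dir: str, agent_folder_name: str):
    """Import the agent package by name and return its root_agent."""
    if agents_dir not in sys.path:
        sys.path.insert(0, agents_dir)  # make the folder importable
    importlib.invalidate_caches()       # pick up files created after startup
    agent_module = importlib.import_module(agent_folder_name)
    return agent_module.agent.root_agent


# Build a throwaway agent package so the sketch runs anywhere.
workdir = tempfile.mkdtemp()
package_dir = Path(workdir) / "demo_weather_agent"  # hypothetical package name
package_dir.mkdir()
(package_dir / "__init__.py").write_text("from . import agent\n")
(package_dir / "agent.py").write_text('root_agent = {"name": "demo_weather_agent"}\n')

root_agent = load_root_agent(workdir, "demo_weather_agent")
print(root_agent["name"])
```

The attribute access agent_module.agent.root_agent only works because the package's __init__.py imports the agent submodule; without that import the lookup would raise AttributeError.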

3. Web service ready

Step: The ADK web service has finished starting and waits for user requests.

ADK web service running at: http://localhost:8000
Available agents: multi_tool_agent

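Stage 2: Handling User Interaction

4. The user sends a query

Step: The user types a question into the web UI.

The user opens the ADK web UI in a browser, selects "multi_tool_agent", and enters: "What's the weather in New York?"

5. The web UI creates a Runner

Step: The web UI creates a Runner instance for each user request.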

# The web UI handles the user request
runner = Runner(
    app_name=agent_folder_name,
    agent=root_agent,  # the preloaded LlmAgent instance
    artifact_service=artifact_service,
    session_service=session_service,
)

# Start the Agent run
async for event in runner.run_async(
    user_id=session.user_id,
    session_id=session.id,
    new_message=content,
):
    ...  # handle each returned event

6. The Runner dispatches execution

Step: The Runner calls the run_async method of the root Agent.

# Excerpt from the actual runners.py code (simplified)
async def run_async(self, *, user_id: str, session_id: str, new_message: types.Content, run_config: RunConfig = RunConfig()) -> AsyncGenerator[Event, None]:
    """Main entry point of the Runner."""
    session = await self.session_service.get_session(app_name=self.app_name, user_id=user_id, session_id=session_id)
    invocation_context = self._new_invocation_context(session, new_message=new_message, run_config=run_config)

    # Key step: find the agent to run (usually root_agent)
    invocation_context.agent = self._find_agent_to_run(session, self.agent)

    # Call that agent's run_async method
    async for event in invocation_context.agent.run_async(invocation_context):
        if not event.partial:
            # Only complete events are persisted; partial streaming chunks are skipped
            await self.session_service.append_event(session=session, event=event)
        yield event

def _find_agent_to_run(self, session: Session, root_agent: BaseAgent) -> BaseAgent:
    """Find the agent to run (root_agent in the single-Agent case)."""
    # In more complex scenarios the agent is chosen from the session history.
    # In our simple scenario, root_agent is returned directly.
    return root_agent
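
The Runner's behavior of forwarding every event while persisting only the non-partial ones can be reproduced in isolation. This is a minimal sketch with stand-in classes; SketchEvent, InMemorySession, SketchAgent, and SketchRunner are hypothetical names, not ADK types.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator


@dataclass
class SketchEvent:
    author: str
    text: str
    partial: bool = False  # True for streaming chunks


@dataclass
class InMemorySession:
    events: list = field(default_factory=list)


class SketchAgent:
    name = "weather_agent"

    async def run_async(self, user_text: str) -> AsyncIterator[SketchEvent]:
        # One streaming chunk followed by the complete answer.
        yield SketchEvent(self.name, "The weather", partial=True)
        yield SketchEvent(self.name, "The weather in New York is sunny.")


class SketchRunner:
    def __init__(self, agent: SketchAgent, session: InMemorySession):
        self.agent = agent
        self.session = session

    async def run_async(self, user_text: str) -> AsyncIterator[SketchEvent]:
        async for event in self.agent.run_async(user_text):
            if not event.partial:
                self.session.events.append(event)  # persist complete events only
            yield event                            # forward every event to the caller


async def demo():
    session = InMemorySession()
    runner = SketchRunner(SketchAgent(), session)
    seen = [e.text async for e in runner.run_async("What's the weather in New York?")]
    stored = [e.text for e in session.events]
    return seen, stored


seen, stored = asyncio.run(demo())
print(seen)
print(stored)
```

The caller sees both events, which allows live rendering, while the session history keeps only the complete answer.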

7. LlmAgent execution layer

Step: LlmAgent delegates to the SingleFlow execution engine.

# Excerpt from the actual llm_agent.py code
@override
async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
    """Core execution method of LlmAgent."""
    async for event in self._llm_flow.run_async(ctx):  # delegate to the Flow execution engine
        self.__maybe_save_output_to_state(event)
        yield event

@property
def _llm_flow(self) -> BaseLlmFlow:
    """Pick the Flow execution engine that matches the Agent configuration."""
    if (self.disallow_transfer_to_parent and self.disallow_transfer_to_peers and not self.sub_agents):
        return SingleFlow()  # our simple scenario: one Agent, no transfer
    else:
        return AutoFlow()    # complex scenario: multiple Agents, transfer supported
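
The selection rule is easy to misread, so here is a minimal sketch of it as a standalone property. AgentConfigSketch, SingleFlowSketch, and AutoFlowSketch are hypothetical stand-ins; only the boolean condition is taken from the excerpt above.

```python
from dataclasses import dataclass, field


class SingleFlowSketch:
    """Stand-in for the single-Agent flow."""


class AutoFlowSketch:
    """Stand-in for the flow that supports Agent transfer."""


@dataclass
class AgentConfigSketch:
    name: str
    disallow_transfer_to_parent: bool
    disallow_transfer_to_peers: bool
    sub_agents: list = field(default_factory=list)

    @property
    def llm_flow(self):
        # All three conditions must hold for the single-Agent flow.
        isolated = (
            self.disallow_transfer_to_parent
            and self.disallow_transfer_to_peers
            and not self.sub_agents
        )
        return SingleFlowSketch() if isolated else AutoFlowSketch()


solo = AgentConfigSketch("solo", True, True)
open_agent = AgentConfigSketch("open", False, False)
parent = AgentConfigSketch("parent", True, True, sub_agents=["child"])

for agent in (solo, open_agent, parent):
    print(agent.name, type(agent.llm_flow).__name__)
```

Under this rule an Agent without sub-agents still gets the transfer-capable flow unless both disallow flags are set.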

8. SingleFlow processor chain

Step: SingleFlow sets up and runs the processor chain.

# Excerpt from the actual single_flow.py code
class SingleFlow(BaseLlmFlow):
    def __init__(self):
        super().__init__()
        self.request_processors += [
            basic.request_processor,             # basic request handling
            auth_preprocessor.request_processor, # authentication
            instructions.request_processor,      # instructions
            identity.request_processor,          # identity
            contents.request_processor,          # conversation contents
            _nl_planning.request_processor,      # planning
            _code_execution.request_processor,   # code execution
        ]
        self.response_processors += [
            _nl_planning.response_processor,     # planning (response side)
            _code_execution.response_processor,  # code execution (response side)
        ]
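
To make the ordering concrete, here is a minimal sketch of a processor chain in which each processor fills in one part of a shared request object. It uses plain synchronous functions for brevity; RequestSketch, FlowSketch, SingleFlowSketch, and the three processors are hypothetical and much simpler than the ADK processors.

```python
from dataclasses import dataclass, field


@dataclass
class RequestSketch:
    model: str = ""
    system_instruction: str = ""
    contents: list = field(default_factory=list)


def basic_processor(ctx: dict, request: RequestSketch) -> None:
    request.model = ctx["model"]


def instructions_processor(ctx: dict, request: RequestSketch) -> None:
    request.system_instruction = ctx["instruction"]


def contents_processor(ctx: dict, request: RequestSketch) -> None:
    request.contents.extend(ctx["history"])


class FlowSketch:
    def __init__(self):
        self.request_processors = []

    def preprocess(self, ctx: dict) -> RequestSketch:
        request = RequestSketch()
        for processor in self.request_processors:  # run in registration order
            processor(ctx, request)
        return request


class SingleFlowSketch(FlowSketch):
    def __init__(self):
        super().__init__()
        self.request_processors += [
            basic_processor,
            instructions_processor,
            contents_processor,
        ]


ctx = {
    "model": "demo-model",  # hypothetical model name
    "instruction": "Answer weather questions.",
    "history": ["What's the weather in New York?"],
}
request = SingleFlowSketch().preprocess(ctx)
print(request)
```

Because every processor mutates the same request, the order of the list matters whenever a later processor reads what an earlier one wrote.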

9. LLM call loop

Step: SingleFlow (which inherits from BaseLlmFlow) runs the LLM call loop.

# Excerpt from the actual base_llm_flow.py code (simplified)
async def run_async(self, invocation_context: InvocationContext) -> AsyncGenerator[Event, None]:
    """Call the LLM in a loop until a final response is produced."""
    while True:
        last_event = None
        async for event in self._run_one_step_async(invocation_context):
            last_event = event
            yield event
        if not last_event or last_event.is_final_response():
            break  # final response reached, leave the loop

async def _run_one_step_async(self, invocation_context: InvocationContext) -> AsyncGenerator[Event, None]:
    """One step = one LLM call."""
    llm_request = LlmRequest()

    # 1. Preprocess: run all request_processors
    async for event in self._preprocess_async(invocation_context, llm_request):
        yield event

    # 2. Call the LLM; the event that will carry the model response is created first
    model_response_event = Event(
        invocation_id=invocation_context.invocation_id,
        author=invocation_context.agent.name,
    )
    async for llm_response in self._call_llm_async(invocation_context, llm_request, model_response_event):
        # 3. Postprocess: handle the LLM response
        async for event in self._postprocess_async(invocation_context, llm_request, llm_response, model_response_event):
            yield event
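
The loop is easier to follow when it can be run against a scripted model. The sketch below mirrors the two-level structure above with stand-ins: StepEvent, ScriptedModel, run_one_step, and run_loop are hypothetical, and the tool result is hard-coded.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepEvent:
    text: str = ""
    function_call: Optional[str] = None        # tool name requested by the model
    function_response: Optional[dict] = None   # tool result sent back to the model

    def is_final_response(self) -> bool:
        return self.function_call is None and self.function_response is None


class ScriptedModel:
    """Stand-in for an LLM: asks for a tool first, answers once it has a result."""

    def __init__(self):
        self.calls = 0

    def generate(self, history: list) -> StepEvent:
        self.calls += 1
        if any(event.function_response for event in history):
            return StepEvent(text="It is sunny in New York.")
        return StepEvent(function_call="get_weather")


async def run_one_step(model: ScriptedModel, history: list):
    """One step = one model call, plus the tool result if a tool was requested."""
    response = model.generate(history)
    history.append(response)
    yield response
    if response.function_call:
        tool_event = StepEvent(function_response={"status": "success", "report": "sunny"})
        history.append(tool_event)
        yield tool_event


async def run_loop(model: ScriptedModel, history: list):
    while True:
        last_event = None
        async for event in run_one_step(model, history):
            last_event = event
            yield event
        if not last_event or last_event.is_final_response():
            break


async def demo():
    model = ScriptedModel()
    events = [event async for event in run_loop(model, [])]
    return model.calls, events


calls, events = asyncio.run(demo())
print(calls, len(events), events[-1].text)
```

The first step ends on a tool result, which is not a final response, so the loop runs a second step in which the model sees the result and answers.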

10. Gemini API call

Step: The actual call to the Gemini API.

# Excerpt from the actual base_llm_flow.py code
async def _call_llm_async(self, invocation_context: InvocationContext, llm_request: LlmRequest, model_response_event: Event) -> AsyncGenerator[LlmResponse, None]:
    llm = self.__get_llm(invocation_context)  # get the Gemini model

    # The actual API call
    async for llm_response in llm.generate_content_async(
        llm_request,  # contains tool definitions, the user message, system instructions, etc.
        stream=invocation_context.run_config.streaming_mode == StreamingMode.SSE,
    ):
        yield llm_response
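
The effect of the stream flag can be illustrated with a fake model. This is a sketch under the assumption that streaming yields partial chunks followed by one complete response; FakeModel, ResponseSketch, and StreamingModeSketch are hypothetical stand-ins and do not reproduce the Gemini client.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum


class StreamingModeSketch(Enum):
    NONE = "none"
    SSE = "sse"


@dataclass
class ResponseSketch:
    text: str
    partial: bool = False


class FakeModel:
    async def generate_content_async(self, prompt: str, stream: bool = False):
        words = ["Sunny,", "25", "degrees."]
        if stream:
            for word in words:
                await asyncio.sleep(0)  # stands in for network latency
                yield ResponseSketch(word, partial=True)
        yield ResponseSketch(" ".join(words))  # the complete response comes last


async def call_llm(model: FakeModel, prompt: str, mode: StreamingModeSketch):
    stream = mode == StreamingModeSketch.SSE
    async for response in model.generate_content_async(prompt, stream=stream):
        yield response


async def collect(mode: StreamingModeSketch) -> list:
    return [(r.text, r.partial) async for r in call_llm(FakeModel(), "weather?", mode)]


print(asyncio.run(collect(StreamingModeSketch.NONE)))
print(asyncio.run(collect(StreamingModeSketch.SSE)))
```

Either way the caller consumes an async generator, so the surrounding flow code does not need a separate path for streaming.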

11. Tool call handling

Step: The Functions module handles the tool calls.

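# Excerpt from the actual functions.py code (simplified)
async def handle_function_calls_async(invocation_context: InvocationContext, function_call_event: Event, tools_dict: dict[str, BaseTool]) -> Optional[Event]:
    """Handle the tool calls requested by the LLM."""
    function_calls = function_call_event.get_function_calls()
    function_response_events = []

    for function_call in function_calls:
        # Look up the tool and its arguments
        tool = tools_dict[function_call.name]  # the get_weather tool
        function_args = function_call.args     # {"city": "New York"}
        tool_context = ToolContext(invocation_context=invocation_context, function_call_id=function_call.id)

        # Run the tool
        function_response = await __call_tool_async(tool, args=function_args, tool_context=tool_context)

        # Build the response event
        function_response_events.append(__build_response_event(tool, function_response, tool_context, invocation_context))

    if not function_response_events:
        return None
    # Several tool results are merged into a single event
    return merge_parallel_function_response_events(function_response_events)

async def __call_tool_async(tool: BaseTool, args: dict[str, Any], tool_context: ToolContext) -> Any:
    """Run the tool itself."""
    return await tool.run_async(args=args, tool_context=tool_context)  # calls our get_weather function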
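
The dispatch step boils down to a name lookup followed by a call with keyword arguments. Below is a minimal sketch using plain dicts; handle_function_calls and call_tool are hypothetical helpers, and get_weather is a toy tool. Unlike the excerpt above, the sketch returns an error result for an unknown tool name instead of raising KeyError, which is a choice made here for illustration.

```python
import asyncio
import inspect


def get_weather(city: str) -> dict:
    """Toy tool: only knows about New York."""
    if city.lower() == "new york":
        return {"status": "success", "report": "Sunny, 25 degrees Celsius."}
    return {"status": "error", "error_message": f"No weather data for {city}."}


async def call_tool(tool, args: dict):
    """Call a sync or async tool with keyword arguments."""
    result = tool(**args)
    if inspect.isawaitable(result):
        result = await result
    return result


async def handle_function_calls(function_calls: list, tools_dict: dict) -> list:
    responses = []
    for call in function_calls:
        tool = tools_dict.get(call["name"])
        if tool is None:
            result = {"status": "error", "error_message": f"Unknown tool {call['name']}."}
        else:
            result = await call_tool(tool, call["args"])
        # Wrap the raw result together with the tool name it belongs to.
        responses.append({"name": call["name"], "response": result})
    return responses


calls = [
    {"name": "get_weather", "args": {"city": "New York"}},
    {"name": "get_time", "args": {"city": "New York"}},
]
responses = asyncio.run(handle_function_calls(calls, {"get_weather": get_weather}))
for response in responses:
    print(response)
```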

12. Returning the result

Step: The Event stream travels back up, layer by layer, to the user interface.

# Path taken by the Event stream
SingleFlow → LlmAgent → Runner → Web UI → User's browser

The user sees the final answer in the web UI: "The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit)."
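
Because every layer is an async generator that re-yields what it receives, events reach the top layer one at a time rather than as a batch. A minimal sketch that records the interleaving follows; the four functions are hypothetical stand-ins for the layers named above.

```python
import asyncio


async def flow(trace: list):
    for name in ("function_call", "function_response", "final_answer"):
        trace.append(f"flow produced {name}")
        yield name


async def agent(trace: list):
    async for event in flow(trace):
        yield event  # pass through unchanged


async def runner(trace: list):
    async for event in agent(trace):
        yield event  # pass through unchanged


async def web_ui() -> list:
    trace = []
    async for event in runner(trace):
        trace.append(f"web received {event}")
    return trace


trace = asyncio.run(web_ui())
print("\n".join(trace))
```

The trace alternates between "flow produced" and "web received" lines, which shows that the web layer handles each event before the flow produces the next one.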

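Key Architectural Insights

Advantages of the two-stage design

Service startup stage: Agents are loaded once, so individual requests do not pay the loading cost again
User interaction stage: each request is handled independently, which supports concurrency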
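
The split between the two stages can be sketched as one object that loads Agents once and then serves several requests concurrently, each with its own session. AppSketch and AgentSketch are hypothetical, and asyncio.sleep(0) stands in for LLM and tool latency.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentSketch:
    name: str
    instruction: str


class AppSketch:
    def __init__(self):
        self.load_count = 0
        self.agents = {}

    def startup(self) -> None:
        """Stage 1: load every Agent once."""
        self.load_count += 1
        self.agents["weather_agent"] = AgentSketch("weather_agent", "Answer weather questions.")

    async def handle_request(self, agent_name: str, user_id: str, text: str) -> dict:
        """Stage 2: per-request state around a shared, preloaded Agent."""
        agent = self.agents[agent_name]
        session = {"user_id": user_id, "events": [text]}
        await asyncio.sleep(0)  # stands in for LLM and tool latency
        session["events"].append(f"{agent.name} answered {user_id}")
        return session


async def serve(app: AppSketch) -> list:
    requests = (
        app.handle_request("weather_agent", f"user{i}", "What's the weather?")
        for i in range(3)
    )
    return await asyncio.gather(*requests)


app = AppSketch()
app.startup()
sessions = asyncio.run(serve(app))
print(app.load_count, [s["user_id"] for s in sessions])
```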

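Processor chain pattern

SingleFlow uses the processor chain pattern, in which each processor is responsible for one specific function. For example:

basic: basic request handling
instructions: merges the Agent's instruction into the request
contents: assembles the conversation contents
functions: adds the tool definitions to the LLM request (note that it does not appear as an entry in the SingleFlow request_processors list shown in step 8)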
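
A tool definition is a structured description of a Python function that the LLM can read. The following is a minimal sketch of deriving such a description from a function signature with the standard inspect module; build_declaration, the PY_TO_SCHEMA mapping, and the output layout are assumptions made for illustration, not the ADK implementation.

```python
import inspect

# Hypothetical mapping from Python annotations to schema type names.
PY_TO_SCHEMA = {str: "STRING", int: "INTEGER", float: "NUMBER", bool: "BOOLEAN"}


def build_declaration(func) -> dict:
    """Describe a function's name, docstring, and parameters as a dict."""
    params = inspect.signature(func).parameters
    properties = {
        name: {"type": PY_TO_SCHEMA.get(param.annotation, "STRING")}
        for name, param in params.items()
    }
    required = [
        name for name, param in params.items()
        if param.default is inspect.Parameter.empty  # no default means required
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "OBJECT", "properties": properties, "required": required},
    }


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Return the current weather report for a city."""
    return {"status": "success", "report": f"Sunny in {city} ({unit})."}


declaration = build_declaration(get_weather)
print(declaration)
```

This is why docstrings and type hints on tool functions matter: in a scheme like this one they are the only information the model receives about what a tool does and how to call it.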

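Loop until done

BaseLlmFlow calls the LLM in a loop until one of the following happens:

A final reply that contains no tool call is produced
An Agent transfer is triggered
An error occurs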
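
The three exit conditions can be written as a single predicate over the last event of a step. This is a sketch of the conditions as listed here, not the ADK is_final_response implementation; EventSketch, its fields, and should_stop are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EventSketch:
    text: str = ""
    function_calls: list = field(default_factory=list)
    function_responses: list = field(default_factory=list)
    partial: bool = False
    transfer_to_agent: str = ""
    error_message: str = ""


def should_stop(last_event: Optional[EventSketch]) -> bool:
    """Decide whether the LLM call loop should end after this step."""
    if last_event is None:
        return True  # the step produced nothing, so there is nothing to continue
    if last_event.error_message or last_event.transfer_to_agent:
        return True  # error, or control handed to another Agent
    # Otherwise stop only on a complete reply with no pending tool traffic.
    return not (
        last_event.function_calls
        or last_event.function_responses
        or last_event.partial
    )


print(should_stop(EventSketch(text="It is sunny.")))
print(should_stop(EventSketch(function_responses=[{"report": "sunny"}])))
```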

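Event-driven architecture

The whole system is built on a stream of Events:

Every LLM response is an Event
Every tool call result is an Event
Streaming and real-time feedback are supported

The complete lifecycle of a tool call

LLM decision: Gemini analyzes the user's intent and decides which tool to call
Argument extraction: the LLM generates the arguments for the tool call
Tool execution: the Functions module runs the actual Python function
Result wrapping: the tool result is wrapped as an Event and returned to the LLM
Final reply: the LLM produces a user-friendly reply based on the tool result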

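Design Patterns in Depth

Template method pattern

BaseLlmFlow uses the template method pattern to define the skeleton of the algorithm, and subclasses fill in the details (in SingleFlow, by registering processors):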

# Template method: a fixed execution flow (conceptual sketch, not verbatim source)
async def run_async(self, invocation_context):
    while True:  # generic loop logic
        last_event = None
        # One step = 1. preprocess (request_processors registered by the subclass)
        #            2. call the LLM (generic logic)
        #            3. postprocess (response_processors registered by the subclass)
        async for event in self._run_one_step_async(invocation_context):
            last_event = event
            yield event

        # 4. Decide whether to stop (generic logic)
        if not last_event or last_event.is_final_response():
            break

Strategy pattern

LlmAgent picks a different Flow strategy depending on its configuration:

@property
def _llm_flow(self) -> BaseLlmFlow:
    if (self.disallow_transfer_to_parent and self.disallow_transfer_to_peers and not self.sub_agents):
        return SingleFlow()  # single-Agent strategy
    else:
        return AutoFlow()    # multi-Agent strategy

Chain of responsibility pattern

The SingleFlow processor chain implements the chain of responsibility pattern:

# Each processor handles one specific responsibility
basic.request_processor                # basic handling
→ auth_preprocessor.request_processor  # authentication
→ instructions.request_processor       # instructions
→ identity.request_processor           # identity
→ contents.request_processor           # conversation contents

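Summary

With its two-stage design (service startup plus user interaction), Google ADK implements an efficient Agent execution engine. The whole system is event-driven and supports complex tool calls and multi-turn conversations.

Core design highlights:

Layered architecture: each layer has a clear responsibility, which eases maintenance and extension
Design patterns: classic patterns such as template method, strategy, and chain of responsibility are used throughout
Event-driven: supports streaming and real-time feedback
Two-stage design: balances performance and flexibility

With this architecture, Google ADK gives developers an Agent development platform that is both powerful and flexible. It supports simple single-Agent scenarios as well as complex multi-Agent collaboration.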

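Google ADK Architecture Layers: A Summary

After the detailed flow analysis above, the Google ADK architecture can be condensed into six clear layers, from user interaction down to tool execution.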

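Responsibilities of each layer

🖥️ Presentation Layer (user interaction)

Web UI: a visual environment for interacting with Agents, with live conversation and debugging
CLI terminal: command-line interaction, suited to development and testing
API server: a RESTful interface for programmatic integration

🎯 Orchestration Layer

Runner: manages the request lifecycle and coordinates the components
Session management: maintains conversation state and history
Routing: selects the right Agent to run in multi-Agent scenarios

🤖 Agent Layer

LlmAgent: the core Agent implementation, which manages configuration and the execution strategy
Agent configuration: management of instructions, tools, model, and similar settings
Sub-agents: support for complex multi-Agent collaboration

⚙️ Flow Engine Layer

SingleFlow: the single-Agent execution flow, suited to simple scenarios
AutoFlow: the multi-Agent flow with automatic transfer, suited to complex scenarios
Processor chain: basic → auth → instructions → identity → contents → planning → code execution

🧠 LLM Service Layer

Gemini API: the actual communication with the Google Gemini model
Model invocation: request construction, parameter passing, response parsing
Response handling: streaming and detection of tool calls

🔧 Tool Execution Layer

Functions module: dispatch and management of tool calls
Tool functions: the concrete business logic (for example get_weather)
Result wrapping: wraps the tool result as an Event and returns it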

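Architectural Advantages

Layered decoupling: each layer depends only on the layer below it, which makes independent development and testing easier

Single responsibility: each layer focuses on one function, which improves maintainability

Flexible extension: any layer can be extended without affecting the others

Streaming: Events flow between the layers as a stream, which supports real-time feedback

Configuration-driven: Agent behavior is controlled through configuration rather than code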

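This article is based on an analysis of the Google ADK source code, and all code examples are drawn from the actual implementation, in places in simplified form. For the latest changes, see the official documentation: google.github.io/adk-docs/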