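Building an AI Agent with Google ADK, Gemma 3, and MCP Tools

Introduction

In this article, we explore how to use Google's Agent Development Kit (ADK) to build an agent backed by a local LLM (such as Gemma 3 or Llama 3.2) running through Ollama, and how that agent performs real-time searches by calling external tools through the Model Context Protocol (MCP).

We will use the ADK framework to build a YouTube search agent. The agent uses the Gemma 3 model for reasoning and response generation, and calls tools dynamically through MCP to fetch search results.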

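What is the Gemma 3 model?

Gemma 3 is Google's latest open large language model, designed to deliver high performance, efficiency, and multimodal capabilities.

Gemma 3 comes in several sizes (1B, 4B, 12B, and 27B), so we can pick the model that best fits our hardware and performance requirements.

Gemma 3 - a family of lightweight, state-of-the-art open models built from the same research and technology as the Gemini 2.0 models.

Key features of the Gemma 3 models

Image and text input: multimodal input support for analyzing and interpreting visual data.
128K-token context: a 16x larger input context than previous Gemma models, for analyzing more data and solving more complex problems.
Function calling: build natural-language interfaces that interact with APIs and tools.
Language support: more than 140 languages, for multilingual AI applications.
Model sizes: 1B, 4B, 12B, and 27B variants to match our compute and task requirements.

In the previous article, we explored the ADK framework and the MCP protocol, and showed how to build an agent with Gemini 2.5 Pro using ADK as the MCP client. Here we replace Gemini 2.5 Pro with the Gemma 3 model served through Ollama.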

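Architecture

In this demo, we use the ADK framework to build a YouTube search agent that relies on the Gemma 3 model for reasoning and response generation and calls tools dynamically through MCP to fetch search results.

Core components:

Google ADK - provides the agent framework.
MCP - standardizes tool communication.
Gemma 3 (12B) - powers language understanding and generation.
Ollama - hosts the Gemma model locally.
MCP server - provides the YouTube search capability.
Python 3.11+ - the base runtime environment.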

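Workflow:

The user submits a query through the interface.
The ADK agent framework processes the query and determines the intent.
If a search is needed:
    The request is routed to the MCP tool registry.
    The MCP search tool receives the query and fetches the results.
    The results are returned in the standard MCP format.
The Gemma 3 model (via Ollama and LiteLlm):
    Receives the search results.
    Generates a natural-language response.
The formatted response is returned to the user interface.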
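
The workflow above can be sketched in plain Python. This is only an illustration of the control flow, not ADK code: handle_query and the three fake_* functions are hypothetical stand-ins for the agent framework, the intent check, the MCP search tool, and the Gemma 3 model.

```python
from typing import Callable, Optional


def handle_query(
    query: str,
    needs_search: Callable[[str], bool],
    search_tool: Callable[[str], list],
    generate: Callable[[str, Optional[list]], str],
) -> str:
    """Route a query through the optional search step, then to the model."""
    results = None
    if needs_search(query):
        # In the real system this call goes through the MCP tool registry.
        results = search_tool(query)
    # In the real system the model is Gemma 3, served by Ollama via LiteLlm.
    return generate(query, results)


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_needs_search(query: str) -> bool:
    return "video" in query.lower()


def fake_search(query: str) -> list:
    return [{"title": "Example video", "link": "https://example.com/watch"}]


def fake_generate(query: str, results: Optional[list]) -> str:
    if results is None:
        return f"Answering directly: {query}"
    titles = ", ".join(item["title"] for item in results)
    return f"Found {len(results)} result(s): {titles}"


if __name__ == "__main__":
    print(handle_query("Find a video about ADK", fake_needs_search, fake_search, fake_generate))
    print(handle_query("Hello", fake_needs_search, fake_search, fake_generate))
```

The search step is optional: a query that does not need a search goes straight to the model, which is why results may be None when the response is generated.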

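Implementation guide:

Let's break down, step by step, how to build a local AI assistant with the ADK framework and the Gemma 3 model.

Prerequisites

Python 3.11+ installed.
Ollama installed.
The gemma3:12b model downloaded (for example, by running ollama pull gemma3:12b).
A valid SerpAPI key (used for search).
A local or remote MCP server installed or available (here, mcp-youtube-search).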
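
Some of these prerequisites can be checked from Python before going further. The following is a minimal sketch using only the standard library; check_prerequisites is a hypothetical helper, and it assumes that both ollama and mcp-youtube-search are exposed as commands on PATH (the latter only once the package is installed). It does not verify that the gemma3:12b model has been downloaded.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 11), commands=("ollama", "mcp-youtube-search")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    for command in commands:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(command) is None:
            problems.append(f"'{command}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```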

Project structure

adk_mcp_gemma3/
├── search/
│   ├── __init__.py              # Package initialization
│   └── agent.py                 # Agent implementation with ADK
├── README.md                    # Project documentation
└── requirements.txt             # Dependencies

Step 1: Set up a virtual environment

Install the dependencies:

# Set up a virtual environment (macOS or Linux)
python -m venv venv && source venv/bin/activate

# Install agent development kit
pip install google-adk

# Install MCP server 
pip install mcp-youtube-search

# Install litellm
pip install litellm

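google-adk: Google's Agent Development Kit (ADK).
mcp-youtube-search: a dedicated MCP server that handles YouTube searches and fetches the results.
litellm: the LiteLLM library provides a unified interface to various LLM providers; in our case, the provider is Ollama.

Set the environment variable: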

export SERP_API_KEY="your-serpapi-key"
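
Because agent.py calls dotenv.load_dotenv(), the key can alternatively be placed in a .env file in the working directory. Either way, it helps to fail early with a clear message when the key is missing, rather than passing an empty value to the MCP server. A minimal sketch, where require_env is a hypothetical helper that is not part of ADK or of the original project:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            f"export it in the shell or add it to a .env file."
        )
    return value


if __name__ == "__main__":
    try:
        require_env("SERP_API_KEY")
        print("SERP_API_KEY is set")
    except RuntimeError as error:
        print(error)
```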

Step 2: Install the MCP server

In this article, we use mcp-youtube-search, an MCP server built with ADK.

# Install from PyPI (already installed in Step 1)
pip install mcp-youtube-search

Step 3: Implement the helper functions (agent.py)

# Step 1: Import required modules
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StdioServerParameters
from google.adk.models.lite_llm import LiteLlm
import os
import dotenv

# Step 2: Load environment variables
dotenv.load_dotenv()

# Step 3: Create tool execution helper functions
async def execute_tool(tool, args):
    """Execute a single tool and handle cleanup."""
    try:
        result = await tool.run_async(args=args, tool_context=None)
        return (True, result, None)  # Success, result, no error
    except Exception as e:
        return (False, None, str(e))  # Failed, no result, error message


async def try_tools_sequentially(tools, args, exit_stack):
    """Try each tool in sequence until one succeeds."""
    errors = []
    
    for tool in tools:
        success, result, error = await execute_tool(tool, args)
        if success:
            return result
        errors.append(f"Tool '{tool.name}' failed: {error}")
    
    if errors:
        return f"All tools failed: {'; '.join(errors)}"
    return "No tools available"

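LlmAgent - the central agent class; it orchestrates the whole system, manages conversation state and history, and manages tools and prompt instructions.
MCPToolset - connects to and interacts with MCP-compatible tools, acting as the interface for discovering and accessing tools over MCP.
StdioServerParameters - configuration for connecting to an MCP server over standard I/O.
LiteLlm - a compatibility layer between ADK and various LLM providers.
In short, LlmAgent acts as the orchestrator, LiteLlm as the model interface, and MCPToolset as the tool registry.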
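
To see how the two helper functions behave without an MCP server, they can be exercised against fake tools. In the sketch below, FakeTool is a hypothetical stand-in that exposes only the two members the helpers rely on (name and run_async); the helpers themselves are repeated so that the example runs on its own.

```python
import asyncio


class FakeTool:
    """Hypothetical stand-in for an MCP tool: only name and run_async are provided."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    async def run_async(self, args, tool_context=None):
        if self.fail:
            raise RuntimeError("simulated failure")
        return {"tool": self.name, "args": args}


async def execute_tool(tool, args):
    """Run one tool and report (success, result, error) instead of raising."""
    try:
        result = await tool.run_async(args=args, tool_context=None)
        return (True, result, None)
    except Exception as e:
        return (False, None, str(e))


async def try_tools_sequentially(tools, args, exit_stack=None):
    """Return the first successful result, or a string describing the failures."""
    errors = []
    for tool in tools:
        success, result, error = await execute_tool(tool, args)
        if success:
            return result
        errors.append(f"Tool '{tool.name}' failed: {error}")
    if errors:
        return f"All tools failed: {'; '.join(errors)}"
    return "No tools available"


async def demo():
    args = {"search_query": "Google Cloud Next 25"}
    first = await try_tools_sequentially([FakeTool("broken", fail=True), FakeTool("search")], args)
    second = await try_tools_sequentially([FakeTool("broken", fail=True)], args)
    third = await try_tools_sequentially([], args)
    return first, second, third


if __name__ == "__main__":
    for outcome in asyncio.run(demo()):
        print(outcome)
```

Note that failures come back as an ordinary string rather than as an exception, so the model receives the error text as if it were a tool result.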

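Step 4: Create the MCP tool executor

The function below creates an async wrapper that connects to an MCP tool server and runs the available tools. MCPToolset.from_server() opens the connection; the wrapper then tries each tool returned by the MCP server in turn, as defined in Step 3, until one succeeds, and exit_stack.aclose() ensures that resources are cleaned up properly.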

# Step 4: Create MCP tool executor
def create_mcp_tool_executor(command, args=None, env=None):
    """Create a function that connects to an MCP server and executes tools."""
    async def mcp_tool_executor(**kwargs):
        # Connect to MCP server
        tools, exit_stack = await MCPToolset.from_server(
            connection_params=StdioServerParameters(
                command=command,
                args=args or [],
                env=env or {},
            )
        )
        
        try:
            # Try all tools until one succeeds
            return await try_tools_sequentially(tools, kwargs, exit_stack)
        finally:
            # cleanup
            await exit_stack.aclose()
    
    return mcp_tool_executor
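
The try/finally around exit_stack.aclose() is what guarantees that the server connection is released even when a tool call raises. The sketch below shows the same pattern with the standard library's contextlib.AsyncExitStack, assuming that the exit_stack returned by from_server() behaves like one; fake_connection is a hypothetical placeholder for the MCP server connection.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


@asynccontextmanager
async def fake_connection(name, events):
    """Hypothetical resource that records when it is opened and closed."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")


async def run_with_cleanup(events, fail=False):
    exit_stack = AsyncExitStack()
    connection = await exit_stack.enter_async_context(fake_connection("mcp-server", events))
    try:
        if fail:
            raise RuntimeError("tool call failed")
        return f"result from {connection}"
    finally:
        # Runs on success and on failure, closing everything registered on the stack.
        await exit_stack.aclose()


if __name__ == "__main__":
    log = []
    print(asyncio.run(run_with_cleanup(log)))
    print(log)
```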

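Step 5: Create the search function using the functions above

We need to give the search function clear instructions (its name and docstring) so that the LLM can translate natural-language queries and call the search function when needed. Using the mcp-youtube-search MCP server, define the tool executor: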

# Step 5: Create YouTube search function
search_youtube = create_mcp_tool_executor(
    command="mcp-youtube-search",
    args=[],
    env={"SERP_API_KEY": os.getenv("SERP_API_KEY")}
)

# Step 6: Add documentation for the LLM
search_youtube.__name__ = "search_youtube"
search_youtube.__doc__ = """
Search for YouTube videos based on a search query.
    
Args:
    search_query: The search terms to look for on YouTube (e.g., 'Google Cloud Next 25')
    max_results: Optional. Maximum number of results to return (default: 10)

Returns:
    List of YouTube videos with details including title, channel, link, published date, 
    duration, views, thumbnail URL, and description.
"""

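Setting __name__ and __doc__ matters because the executor is a closure: without the overrides it would be named mcp_tool_executor and have no docstring, and its signature is just **kwargs. The sketch below demonstrates this with the standard inspect module. The assumption, not verified against ADK internals here, is that the framework reads these attributes when it describes the tool to the model; create_executor is a simplified hypothetical version of create_mcp_tool_executor.

```python
import inspect


def create_executor():
    """Simplified stand-in for create_mcp_tool_executor: returns an async closure."""
    async def mcp_tool_executor(**kwargs):
        return kwargs
    return mcp_tool_executor


search_youtube = create_executor()
original_name = search_youtube.__name__  # the inner function's name
original_doc = search_youtube.__doc__    # None: the closure has no docstring

search_youtube.__name__ = "search_youtube"
search_youtube.__doc__ = """
Search for YouTube videos based on a search query.

Args:
    search_query: The search terms to look for on YouTube.
"""

if __name__ == "__main__":
    print(original_name, "->", search_youtube.__name__)
    print(inspect.signature(search_youtube))
    print(inspect.getdoc(search_youtube))
```

Since the signature carries no parameter names, the Args section of the docstring is the only place where search_query and max_results are described.
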
Step 6: Create the LlmAgent

Let's create the agent using LlmAgent and the ollama/gemma3:12b model.

LiteLLM/Ollama KeyError bug

There is a known issue in how LiteLLM parses Ollama's JSON-formatted responses, which can raise KeyError: 'name'.

  File "venv/lib/python3.12/site-packages/litellm/llms/ollama/completion/transformation.py", line 266, in transform_response
    "name": function_call["name"]
KeyError: 'name'

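This happens when Ollama returns JSON that is not in the expected tool-call format.
The error surfaces as KeyError: 'name' in litellm/llms/ollama/completion/transformation.py.
I have submitted a fix for the LiteLLM package in PR #9966; it is still awaiting approval and merge.
When calling functions with Ollama models, this bug can crash the application or send it into an infinite loop.

Temporary workaround:

Manually patch the local LiteLLM installation with the changes from PR #9966 by overwriting .venv/lib/python3.11/site-packages/litellm/llms/ollama/completion/transformation.py (adjust the path to match your virtual environment and Python version).

Once the fix is merged into litellm and released, this workaround is no longer needed.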
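
The failure mode can be illustrated in isolation. The sketch below is not the change in PR #9966 and not LiteLLM's code; it only shows the general idea of checking for the "name" key before treating a JSON response as a tool call. The payload shape (a "name" key plus an "arguments" key) is an assumption based on the traceback, and extract_tool_call is a hypothetical function.

```python
import json


def extract_tool_call(content):
    """Return (name, arguments) if content is a JSON tool call, otherwise None."""
    try:
        payload = json.loads(content)
    except (json.JSONDecodeError, TypeError):
        return None  # not JSON at all: treat it as plain text
    if not isinstance(payload, dict):
        return None
    # Indexing payload["name"] directly would raise KeyError for ordinary JSON answers.
    name = payload.get("name")
    if not isinstance(name, str) or not name:
        return None
    return name, payload.get("arguments", {})


if __name__ == "__main__":
    print(extract_tool_call('{"name": "search_youtube", "arguments": {"search_query": "adk"}}'))
    print(extract_tool_call('{"answer": "plain JSON, not a tool call"}'))
```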


# Step 7: Create the agent with instructions for formatting results
agent = LlmAgent(
    name="youtube_assistant",
    model=LiteLlm(model="ollama/gemma3:12b"),
    instruction="""You are a helpful YouTube video search assistant.
Your goal is to use the search_youtube tool and present the results clearly.

1.  When asked to find videos, call the search_youtube tool.
2.  The tool will return a JSON object. Find the list of videos in the 'results' field of this JSON.
3.  For each video in the list, create a bullet point (*).
4.  Format each bullet point like this: **Title** (Link) by Channel: Description. (Published Date, Views, Duration)
    - Use the 'title', 'link', 'channel', 'description', 'published_date', 'views', and 'duration' fields from the JSON for each video.
    - Make the title bold.
    - Put the link in parentheses right after the title.
5.  Your final response should ONLY be the formatted bullet list of videos. Do not include the raw JSON.
6.  If the 'results' list in the JSON is empty, simply respond: "I couldn't find any videos for that search."
""",
    tools=[search_youtube],
)
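
To check whether the model follows the formatting instructions, it can help to have a reference rendering of the same rules in plain Python. The sketch below is one reading of the instruction text: the field names come from the instruction, while format_video, format_results, and the sample data are hypothetical.

```python
def format_video(video: dict) -> str:
    """Render one video as: * **Title** (Link) by Channel: Description. (Published Date, Views, Duration)"""
    return (
        f"* **{video['title']}** ({video['link']}) by {video['channel']}: "
        f"{video['description']}. "
        f"({video['published_date']}, {video['views']}, {video['duration']})"
    )


def format_results(payload: dict) -> str:
    """Render the 'results' list as a bullet list, or the fallback message if it is empty."""
    videos = payload.get("results", [])
    if not videos:
        return "I couldn't find any videos for that search."
    return "\n".join(format_video(video) for video in videos)


if __name__ == "__main__":
    sample = {
        "results": [
            {
                "title": "ADK demo",
                "link": "https://example.com/v1",
                "channel": "Example Channel",
                "description": "Intro to ADK",
                "published_date": "1 month ago",
                "views": "1K views",
                "duration": "10:00",
            }
        ]
    }
    print(format_results(sample))
    print(format_results({"results": []}))
```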