MCP - The Golden Key to AI Automation

What is MCP? How do large language models make calls through MCP? How does MCP authorization work? And other related aspects

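MCP looks like the golden key to unlocking the full potential of large language models, with interest running from Google to Microsoft and the companies in between.

The Model Context Protocol (MCP) is attracting a lot of attention, but it also seems over-specified. It has already had some failed attempts, such as SSE (which is being replaced by streamable HTTP because of cloud hosting costs).

Still, it fills a gap in integrating LLMs with API calls, and it is popular enough to become the one standard to rule them all. It is essentially a way of publishing an API, much as Swagger does, so that an LLM can understand it and generate API calls with parameters. It is almost like REST, except that it is implemented over JSON-RPC and is transport-agnostic, whereas REST and Protobuf/gRPC are tied to HTTP. (In practice, though, all remote MCP servers use HTTP, while local servers communicate over STDIO, that is, standard input/output pipes.)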

Demystifying MCP with a simple calculator MCP server example

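LLMs are very powerful, but they were not designed to be calculators. They often struggle with exact arithmetic and logic-heavy operations.

So what if, instead of forcing them to compute, we handed them the right tool every time a calculation came up?

This idea is not new, and it has been implemented in several ways. The LLM can generate code for the calculation, which is then extracted, run in a Python sandbox, and the result returned. Or the LLM can generate a URL or JSON for calling another service (over REST or gRPC), which is then extracted and used to call the service. There may be other, more elaborate approaches as well.

With MCP, this tool calling is standardized. It is based on JSON-RPC.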

It comes in two flavors: local tools are invoked over STDIO pipes, and remote tools are invoked over HTTP.
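To make the JSON-RPC part concrete, the sketch below builds the two requests a client typically sends: `tools/list` to discover tools and `tools/call` to invoke one. The method names and the `name`/`arguments` layout follow the MCP specification; the helpers `make_request` and `frame_for_stdio` are made up for illustration. Over STDIO each message travels as a single line of JSON, while over HTTP the same object would be the body of a POST.

```python
import itertools
import json
from typing import Optional

# Monotonic request ids; JSON-RPC uses the id to pair responses with requests.
_ids = itertools.count(1)


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object (hypothetical helper)."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


def frame_for_stdio(message: dict) -> bytes:
    """Serialize one message as a single newline-terminated line of JSON."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


list_request = make_request("tools/list")
call_request = make_request("tools/call", {"name": "add", "arguments": {"a": 1, "b": 2}})

print(frame_for_stdio(list_request))
print(frame_for_stdio(call_request))
```

A framework such as FastMCP hides these messages, but this is roughly what goes over the wire when a client lists or calls a tool.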

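A simple way to think about MCP over HTTP is that it is almost like REST. The biggest difference is that REST is stateless, while MCP is stateful, and this affects many design decisions.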
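One visible consequence of statefulness on the HTTP transport is the session id. In the streamable HTTP transport the server may hand out an `Mcp-Session-Id` header when the client initializes, and the client is expected to echo it on later requests. The sketch below only models that bookkeeping; the class name `SessionHeaders` and the sample id are invented for illustration, and no network traffic is involved.

```python
from typing import Dict, Mapping, Optional


class SessionHeaders:
    """Remember the session id from the initialize response and replay it."""

    HEADER = "Mcp-Session-Id"

    def __init__(self) -> None:
        self.session_id: Optional[str] = None

    def remember(self, response_headers: Mapping[str, str]) -> None:
        # HTTP header names are case-insensitive, so compare in lower case.
        for name, value in response_headers.items():
            if name.lower() == self.HEADER.lower():
                self.session_id = value

    def for_request(self) -> Dict[str, str]:
        headers = {
            "Content-Type": "application/json",
            # The client accepts either a plain JSON reply or an SSE stream.
            "Accept": "application/json, text/event-stream",
        }
        if self.session_id is not None:
            headers[self.HEADER] = self.session_id
        return headers


session = SessionHeaders()
first = session.for_request()  # initialize request: no session id yet
session.remember({"mcp-session-id": "example-session-123"})  # made-up id
later = session.for_request()  # every later request carries the id
print(first)
print(later)
```

A stateless REST call would instead carry everything it needs in each request, with nothing to remember between calls.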

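MCP is a simple concept that even non-technical people can grasp.

I have hosted a simple version of an MCP server for free on HuggingFace Spaces using Docker. The MCP specification is fairly complex, so to build a server in Python you will want a framework such as FastMCP.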

import argparse
from fastmcp import FastMCP
mcp = FastMCP()
# Assume that this is the tool you want to expose
# Give all the types and description
@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b
if __name__ == "__main__":
    # take host and port from command line
    parser = argparse.ArgumentParser(description="Run FastMCP server")
    parser.add_argument("--host", type=str, default="0.0.0.0", help="Host address (default: 0.0.0.0)")
    parser.add_argument("--port", type=int, default=7860, help="Port number (default: 7860)")
    args = parser.parse_args()
    mcp.run(
        transport="streamable-http",  # https://github.com/modelcontextprotocol/python-sdk/?tab=readme-ov-file#streamable-http-transport
        host=args.host,
        port=args.port,
        path="/mcp",
        log_level="debug",
    )

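Follow the link for details. The server exposes a single add method.

I also have an MCP client hosted on Colab that you can run and test.

The flow is easier to follow from the client's side. First, we connect to the MCP server and ask it to list its tools.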

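from fastmcp import Client
# Top-level "async with" works in a notebook such as Colab;
# in a plain script, wrap this in an async function and run it with asyncio.run().
async with Client("https://alexcpn-mcpserver-demo.hf.space/mcp/") as client:
    await client.ping()
    # List available tools
    tools = await client.list_tools()
    print("Available tools:", tools)
    # Pass integers, as the tool's input schema requires
    tool_result = await client.call_tool("add", {"a": 1, "b": 2})
    print("Tool result:", tool_result)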

The output looks like this.

Available tools: [Tool(name='add', description='Add two numbers', inputSchema={'properties': {'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}, 'required': ['a', 'b'], 'type': 'object'}, annotations=None)]

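Without a framework, this is how we would have to describe the tool on the server; as you may have noticed, it is exactly what the client receives. A framework like FastMCP infers all of this from the type information.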

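# Assumed setup (not shown in the original snippet): the low-level Server
# from the official MCP Python SDK; the server name is arbitrary.
import mcp.types as types
from mcp.server.lowlevel import Server
app = Server("calculator")
@app.list_tools()
async def list_tools() -> list[types.Tool]:
    return [
        types.Tool(
            name="add",
            description="add two numbers and return the result.",
            inputSchema={
                "type": "object",
                "required": ["a", "b"],
                "properties": {
                    "a": {
                        "type": "integer",
                        "description": "The first number to add",
                    },
                    "b": {
                        "type": "integer",
                        "description": "The second number to add",
...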
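To give a feel for what a framework does with the type information, here is a rough approximation of the inference step using only the standard library. It is not FastMCP's actual implementation; `tool_schema` and the small type mapping are illustrative assumptions and cover only a few primitive types.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def tool_schema(func) -> dict:
    """Derive a tool description from a function's signature and docstring."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": JSON_TYPES[hints[name]]}
        # Parameters without a default value must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


schema = tool_schema(add)
print(schema)
```

The result has the same shape as the tool listing the client received: a name, a description taken from the docstring, and an input schema built from the annotations.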

Next, we pass the tool listing obtained above (tools) to the LLM and ask it to produce the correct JSON for the tool call, including the relevant parameters.

import json
from openai import OpenAI
openai_client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
a = 123124522
b = 865734234
# Describe the available tools and the expected output format to the model
question = (
    f"Using the tools available {tools} frame the JSON RPC call to the tool add "
    f"with a={a} and b={b}, do not add anything else to the output. "
    'Here is the JSON RPC call format: {"method": "<method name>", '
    '"params": {"<param 1 name>": <param 1 value>, "<param 2 name>": <param 2 value>}}'
)
# Use a simple model like gpt-3.5-turbo
completion = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": question}],
)
# Parse the reply; this assumes the model returned bare JSON as instructed
tool_call = json.loads(completion.choices[0].message.content)
# Print the response
print("LLM response:", tool_call)
print(tool_call["method"], tool_call["params"])

We get the JSON output shown below.

LLM response: {'method': 'add', 'params': {'a': 123124522, 'b': 865734234}}
add {'a': 123124522, 'b': 865734234}
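Models do not always return bare JSON; they sometimes wrap it in a Markdown fence or add a sentence around it. A defensive parsing step before the tool call might look like the sketch below. The function names and validation rules are illustrative assumptions, and the schema is the one the server listed for the add tool.

```python
import json

ADD_SCHEMA = {
    "type": "object",
    "required": ["a", "b"],
    "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
}


def extract_json(reply: str) -> dict:
    """Parse the span from the first '{' to the last '}' as a JSON object."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(reply[start : end + 1])


def check_arguments(params: dict, schema: dict) -> None:
    """Check required names and integer types; raise ValueError on mismatch."""
    missing = [name for name in schema["required"] if name not in params]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    for name, spec in schema["properties"].items():
        if name not in params:
            continue  # optional argument that was not supplied
        value = params[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if spec["type"] == "integer" and (isinstance(value, bool) or not isinstance(value, int)):
            raise ValueError(f"argument {name!r} must be an integer")


reply = 'Sure!\n```json\n{"method": "add", "params": {"a": 123124522, "b": 865734234}}\n```'
tool_call = extract_json(reply)
check_arguments(tool_call["params"], ADD_SCHEMA)
print(tool_call["method"], tool_call["params"])
```

Rejecting a malformed call here is cheaper than sending it to the server and interpreting the error that comes back.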

We then send this JSON to the MCP server through the MCP client's tool-call API.

tool_result = await client.call_tool(tool_call["method"], tool_call["params"])
print("Tool result:", tool_result)

The MCP server's response looks like this.

Tool result: [TextContent(type='text', text='988858756', annotations=None)]

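In short, "the LLM calls the tool"; or, more precisely, the LLM takes the tool information and generates the call signature as JSON, and our program uses the MCP client to actually call the MCP server.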

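Authorization in MCP

Original source: modelcontextprotocol.io/specificati...

MCP uses the OAuth2 flow for authorization. This means you never have to share your username and/or password with any MCP client. The MCP client redirects you to the resource server (a photo store, Gmail, or some other service), and you log in there with your credentials.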

1. MCP Client --> e.g. Gmail MCP Server (establish an unauthenticated connection; the server assigns a session id)
2. Client requests login(); the MCP server returns a link that carries the above session id
3. User clicks the link (or the browser opens it) and lands on the Gmail server
4. User enters Gmail credentials and is shown that they are logged in
5. Internally, the Gmail server calls the Gmail MCP Server (redirect URL) with the session id
    and tells the MCP Server that the session id is authenticated
6. Now the MCP Client --> MCP Server session is authenticated
7. Further calls that need authentication do not need any token or secret key;
    the MCP Client can request the user's emails, etc.

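Once authentication completes, the MCP server receives a callback from the resource server (e.g. Gmail) via the redirect URL and internally marks the SSE or streamable HTTP session as authenticated. From then on, the remaining requests are handled as authenticated.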
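The numbered flow above can be modelled in a few lines. The toy server below is purely illustrative: the class `ToyMcpServer`, its method names, the example URL, and the in-memory dictionary are all assumptions, and a real server would also verify the callback and exchange an authorization code. What it shows is that the client never holds a token; the server simply flips a flag on the session.

```python
import uuid


class ToyMcpServer:
    """In-memory stand-in for an MCP server that tracks session auth state."""

    def __init__(self) -> None:
        self._authenticated = {}  # session_id -> bool

    def connect(self) -> str:
        # Step 1: open a new, unauthenticated session.
        session_id = str(uuid.uuid4())
        self._authenticated[session_id] = False
        return session_id

    def login(self, session_id: str) -> str:
        # Step 2: return a link that carries the session id (made-up host).
        return f"https://login.example.com/?session_id={session_id}"

    def oauth_callback(self, session_id: str) -> None:
        # Step 5: the resource server reports that the user logged in.
        if session_id not in self._authenticated:
            raise KeyError("unknown session")
        self._authenticated[session_id] = True

    def call_tool(self, session_id: str, name: str) -> str:
        # Steps 6-7: tools only run for sessions marked as authenticated.
        if not self._authenticated.get(session_id, False):
            raise PermissionError("please call login first")
        return f"{name}: ok"


server = ToyMcpServer()
sid = server.connect()
link = server.login(sid)
try:
    server.call_tool(sid, "get_emails")
except PermissionError as exc:
    print("before login:", exc)
server.oauth_callback(sid)  # simulates the user finishing the browser login
print("after login:", server.call_tool(sid, "get_emails"))
```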

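We will walk through this flow using the MCP server recently launched by Zerodha Kite. Zerodha is one of India's largest discount brokers, similar to the Robinhood trading platform in the US. They recently launched the Kite MCP server, which lets you integrate your portfolio and other operations with AI assistants.

With this integration you can query your holdings, research stocks, and run other agentic workflows from thick clients such as Claude Desktop.

Although its documentation shows how to integrate Claude Desktop with the MCP server, developers often wonder:

How do I authenticate my own custom MCP client code without passing login credentials or secrets? Especially when I am building a thick client like Claude Desktop, or agentic AI on top of an authenticated MCP server such as Kite.

This section aims to simplify the process of authenticating a Python MCP client against an authentication-enabled MCP server. We use Zerodha Kite's MCP server as the example, because financial services demand stronger security than almost any other domain.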

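The challenge: a new authentication paradigm

Traditional Zerodha Kite Connect API authentication usually involves obtaining your own API key and API secret, and then obtaining an access_token directly through an OAuth 2.0 flow (typically using PKCE for public clients). The MCP server, however, introduces a layer of abstraction.

The MCP authorization flow (detailed in the specification cited earlier) follows the OAuth2 protocol as well.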

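I am no expert in this area, but the approach is fairly easy to follow. Instead of you handing your login credentials (such as your Kite username and password) to a third party (for example, my Python MCP client, so that it can fetch your Kite holdings), the third party first opens an unauthenticated session with the Kite MCP server and then redirects you to a web browser you trust, where you authenticate with Kite.

Once you have authenticated with Kite out of band, the Kite MCP server marks the session the third party opened earlier as authenticated. You can now return to that session and perform all authenticated tasks, such as fetching your holdings.

The token remains valid for as long as the session is active, and it is not stored in or used from cookies or the like. More details follow.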

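Breaking down the MCP authentication flow

At the heart of MCP authentication is an elaborate OAuth flow that uses Zerodha's backend service as an intermediary. Here is a step-by-step breakdown of how your Python fastmcp client authenticates:

Establish the SSE session: your Python fastmcp client connects to the MCP server over Server-Sent Events (SSE) (for example https://mcp.kite.trade/sse). At this point the MCP server establishes a unique, persistent session with your client and generates an internal session_id to track this particular connection.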

# Your FastMCP Client --> SSE Session --> MCP Server (https://mcp.kite.trade/sse)
from fastmcp import Client
from fastmcp.client.transports import SSETransport
async def main():
    transport = SSETransport(
        url="https://mcp.kite.trade/sse",
        headers={},
    )
    async with Client(transport) as client:
        print("Connected to fastmcp client.")
        # 3. Call the 'login' tool

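Issue the login command:

Once the SSE session is open, your fastmcp client calls the login() tool. This is not a direct authentication call; it asks the MCP server to start the external user-authentication process.

Internally, the MCP server (the Kite MCP server) generates a new temporary session_id for this OAuth login attempt and maps it to your current SSE session.
It then returns a login URL to your fastmcp client. This URL is crucial: it carries api_key=kitemcp together with the generated session_id, which travels inside redirect_params.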

    async with Client(transport) as client:
        print("Connected to fastmcp client.")
        # 3. Call the 'login' tool
        print("Calling fastmcp 'login' tool...")
        login_result = await client.call_tool("login", {})
        print("Login result from fastmcp:", login_result)
        # 4. Extract and display the URL to the user
        # Newer fastmcp releases wrap the content list in a result object,
        # so fall back to its .content attribute when the result is not a list.
        items = login_result if isinstance(login_result, list) else getattr(login_result, "content", [])
        login_url = None
        for item in items:
            if getattr(item, "type", None) == "text" and "URL: " in item.text:
                # Keep everything after the "URL: " marker
                login_url = item.text.split("URL: ", 1)[1].strip()
                break

The output looks like this:

Example Login URL: https://kite.zerodha.com/connect/login?api_key=kitemcp&v=3&redirect_params=session_id%3D44ce2d96-710b-4442-bf57-cd0d5d0e4050
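It can help to see how the session_id rides inside that URL. The snippet below takes the example login URL apart with the standard library; `redirect_params` is itself a URL-encoded query string, so it is parsed twice. The function name is illustrative.

```python
from urllib.parse import parse_qs, urlsplit


def session_id_from_login_url(login_url: str) -> str:
    """Pull the session_id out of the nested redirect_params value."""
    query = parse_qs(urlsplit(login_url).query)
    # parse_qs already percent-decodes, giving "session_id=<uuid>".
    redirect_params = parse_qs(query["redirect_params"][0])
    return redirect_params["session_id"][0]


login_url = (
    "https://kite.zerodha.com/connect/login"
    "?api_key=kitemcp&v=3"
    "&redirect_params=session_id%3D44ce2d96-710b-4442-bf57-cd0d5d0e4050"
)
session_id = session_id_from_login_url(login_url)
print(session_id)
```

The client has no real use for this value; it is shown only to make clear that the link, not the client, carries the identity of the session through the browser login.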

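External browser authentication: our Python script then opens this login URL in the user's default web browser (Chrome, Firefox, and so on). You can also copy and paste the URL and open it yourself. The user then authenticates directly with Zerodha Kite, entering their credentials and completing any two-factor authentication step.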

        # Requires "import webbrowser" (standard library) at the top of the script
        if login_url:
            print("\n=======================================================")
            print("  Please open this URL in your browser to login to Kite:")
            print(f"  {login_url}")
            print("=======================================================\n")
            # Open the URL automatically for convenience
            webbrowser.open(login_url)

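The MCP server's coordination (the "hidden" handshake): after successful authentication, Zerodha Kite's main OAuth authorization server redirects the user's browser back to a pre-configured redirect_uri. The key point is that this redirect_uri points to an endpoint inside Zerodha's MCP server infrastructure, not to your local machine, so you do not need to start an HTTP server and handle a callback yourself.

The redirect carries the authorization code (request_token) and the session_id that the MCP server originally embedded in the login URL.

Internal mapping: once the Kite login succeeds, the redirect reaches the Kite MCP server (the redirect URI) with the session_id. The Kite MCP server uses that session ID to look up which active SSE session (your fastmcp client's connection) started the login, and marks that session as authenticated.

Implicit session update: the MCP server does not send the resulting access_token back to your fastmcp client. Instead, it implicitly updates the authentication state of your SSE session to indicate that your client is now authenticated.

Make authenticated calls: after the user confirms the login (for example by pressing Enter in the terminal), the fastmcp client can go on to call other API tools such as get_holdings.

When you call client.call_tool("get_holdings", {}), the fastmcp client sends the command to the MCP server over the authenticated SSE session. The mcp.kite.trade backend, which now securely holds the access_token for that particular session, uses it to make the actual get_holdings API call to Zerodha's core Kite API on your behalf.

The result is then streamed back to your fastmcp client over the SSE connection.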

            webbrowser.open(login_url)
        # 5. Wait for user confirmation (crucial step for CLI)
        input("Press Enter after you have successfully logged in to Kite in your browser...")
        print("User confirmed login. Attempting to proceed with fastmcp calls.")
        print("Session details after login:")
        print(client.session)
        # Give the server a short buffer to register the login (requires "import asyncio").
        await asyncio.sleep(2)
        # 6. Call subsequent tools (assuming the server-side session is now authenticated)
        print("Calling fastmcp 'get_holdings' tool...")
        try:
            holdings = await client.call_tool("get_holdings", {})
            print("Your holdings:", holdings)
        except Exception as exc:
            # The original snippet is cut off here; report the failure rather than guess the handler
            print("get_holdings failed:", exc)

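Key takeaway: the client does not need to store an `API key` or `token secret`

One of the defining features of this flow, and an important security benefit, is that your Python client never needs to know or store the API_Secret associated with the kitemcp API key. The kitemcp API_Key is public, while the corresponding API_Secret is managed securely by Zerodha's MCP service. This makes your fastmcp client a true "public client" in the OAuth ecosystem, which simplifies deployment and improves security.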

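To sum up

MCP appears to be becoming the one standard to rule them all. It looks like the key to taking automation with LLMs further.

Well, that is all for today!

This is just one person's view, and criticism is welcome!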

Happy Coding! Stay GOLDEN!
