MCP Client Agent: Architecture and Implementation, Integration with LLMs

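Introduction

In this article we take a closer look at the overall architecture and the MCP client flow, and then implement an MCP client agent.

...and hopefully clarify what happens when you submit a request to an LLM through MCP.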

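High-Level MCP Architecture

MCP Components

Host: the AI code editor the user interacts with directly (such as Claude Desktop or Cursor), acting as the main interface and system manager.
Client: the intermediary that maintains the connection between the host and the MCP servers, handling the communication protocol and data flow.
Server: a component that exposes specific functionality, data sources, and tools to the AI model through a standardized interface.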

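Without further ado, let's get to the core of this article.

What Is an MCP Client Agent?

Custom MCP client: calling MCP servers programmatically

Most of the use cases we have seen so far use MCP inside an AI-powered IDE. The user configures MCP servers in the IDE and interacts with them through its chat interface; here the chat interface acts as the MCP client/host.

But what if you want to call these MCP servers programmatically from your own service? This is where MCP really shines: it is a standardized way to supply context and tools to an LLM, so we do not have to write custom code to integrate every external API, resource, or file. We only provide the context and tools, send them to the LLM, and let the LLM supply the intelligence.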

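MCP Client Agent Flow (with Multiple MCP Servers)

The diagram shows how a custom MCP client/AI agent handles a user request through MCP servers. Here is a step-by-step breakdown of the interaction:

Step 1: The user initiates a request

The user submits a query or request through an IDE, browser, or terminal.
The query is received by the custom MCP client/agent interface.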

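Step 2: The MCP client connects to the servers

The MCP client connects to the MCP servers. It can connect to several servers at once and requests the tool list from each of them.
Each server sends back a list of its supported tools and capabilities.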
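
The tool list from step 2 usually has to be reshaped before an OpenAI-style chat API will accept it. Here is a minimal sketch of that conversion. `ToolInfo` and `to_openai_tools` are hypothetical names introduced for illustration; the fields `name`, `description`, and `inputSchema` are assumed to mirror what a server returns for each tool.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolInfo:
    """Hypothetical stand-in for one tool descriptor returned by a server."""
    name: str
    description: str = ""
    inputSchema: dict[str, Any] = field(default_factory=dict)


def to_openai_tools(tools: list[ToolInfo]) -> list[dict[str, Any]]:
    """Convert tool descriptors into the function-calling `tools` format."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description,
                "parameters": tool.inputSchema,  # JSON Schema for the arguments
            },
        }
        for tool in tools
    ]


weather = ToolInfo(
    name="get_weather",
    description="Look up the current weather for a city.",
    inputSchema={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
openai_tools = to_openai_tools([weather])
print(openai_tools[0]["function"]["name"])  # get_weather
```

The resulting list is the kind of value that would be passed as the `tools` argument in step 3.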

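Step 3: AI processing

Both the user query and the tool list are sent to the LLM (for example, OpenAI).
The LLM analyzes the request, suggests an appropriate tool and input parameters, and sends the response back to the MCP client.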

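Step 4: Function execution

The MCP client calls the selected function on the MCP server with the suggested parameters.
The MCP server receives the function call and processes the request; depending on the request, the corresponding tool on the relevant MCP server is invoked. Note: make sure tool names are unique across MCP servers to avoid LLM hallucinations and nondeterministic responses.
The server may interact with databases, external APIs, or the file system to handle the request.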
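
One way to enforce the unique-name advice in step 4 is to check for clashes before any tool list is handed to the LLM. The sketch below is illustrative only: `find_tool_name_conflicts` and the server and tool names are made up for the example.

```python
from collections import defaultdict


def find_tool_name_conflicts(tools_by_server: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return each tool name exposed by more than one server, with its owners."""
    owners: dict[str, list[str]] = defaultdict(list)
    for server_name, tool_names in tools_by_server.items():
        for tool_name in tool_names:
            owners[tool_name].append(server_name)
    # Keep only the names that are claimed by two or more servers.
    return {name: servers for name, servers in owners.items() if len(servers) > 1}


conflicts = find_tool_name_conflicts({
    "files": ["read_file", "search"],
    "web": ["fetch_page", "search"],
})
print(conflicts)  # {'search': ['files', 'web']}
```

A client could refuse to start, or prefix tool names with the server name, whenever this returns a non-empty result.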

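Step 5: (Optional) Refine the response with the LLM

The MCP server returns the function execution response to the MCP client.
(Optional)
- The MCP client can then forward that response to the LLM for refinement
- The LLM turns the technical response into natural language or produces a summary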

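Step 6: Reply to the user

The final processed response is sent back to the user through the client interface.
The user receives the answer to the original query.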
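
The six steps above can be sketched end to end with stand-ins for the LLM and the server. Everything here is hypothetical (`FAKE_SERVER`, `fake_llm`, `handle_request`); no real MCP or LLM call is made, and the point is only to show the order of operations.

```python
import json
from typing import Any, Callable

# Hypothetical stand-in for an MCP server: tool name -> Python function.
FAKE_SERVER: dict[str, Callable[..., str]] = {
    "add_numbers": lambda a, b: str(a + b),
}


def fake_llm(query: str, tool_names: list[str]) -> dict[str, Any]:
    """Stand-in for step 3: always suggests the first tool with fixed arguments."""
    return {"name": tool_names[0], "arguments": json.dumps({"a": 2, "b": 3})}


def handle_request(query: str) -> str:
    tool_names = list(FAKE_SERVER)              # step 2: discover the tools
    suggestion = fake_llm(query, tool_names)    # step 3: the LLM suggests a tool
    tool = FAKE_SERVER[suggestion["name"]]      # step 4: route and execute the call
    result = tool(**json.loads(suggestion["arguments"]))
    # Steps 5 and 6: format the raw result and reply to the user.
    return f"Result of {suggestion['name']}: {result}"


print(handle_request("What is 2 + 3?"))  # Result of add_numbers: 5
```

In a real client the suggested arguments arrive as a JSON string, which is why the sketch decodes them with `json.loads` before the call.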

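Custom MCP Client Implementation / Source Code

Connecting to MCP servers: as described above, one MCP client can connect to multiple MCP servers, and we can simulate the same behavior in a custom MCP client.
Note: to avoid excessive hallucination and get deterministic results, make sure tool names do not conflict across these servers.
MCP servers offer two transport options: STDIO (for local processes) and SSE (Server-Sent Events over HTTP, for servers reached by URL).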
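
If a client accepts both kinds of server from one configuration list, it needs a rule for choosing the transport. A minimal sketch, assuming http(s) URLs mean SSE and anything else is a local script path; `pick_transport` is a hypothetical helper, not part of any SDK.

```python
from urllib.parse import urlparse


def pick_transport(target: str) -> str:
    """Return "sse" for http(s) URLs and "stdio" for local script paths."""
    scheme = urlparse(target).scheme.lower()
    return "sse" if scheme in ("http", "https") else "stdio"


print(pick_transport("http://localhost:8080/sse"))  # sse
print(pick_transport("./weather_server.py"))        # stdio
```

The caller would then dispatch to the matching connect method for each target.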

Connecting over the STDIO transport


from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def connect_to_stdio_server(self, server_script_path: str):
        """Connect to an MCP stdio server"""
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            raise ValueError("Server script must be a .py or .js file")
        command = "python" if is_python else "node"
        server_params = StdioServerParameters(
            command=command,
            args=[server_script_path],
            env=None
        )
        stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
        self.stdio, self.write = stdio_transport
        self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))
        await self.session.initialize()
        print("Initialized stdio...")

Connecting over the SSE transport


from mcp import ClientSession
from mcp.client.sse import sse_client

async def connect_to_sse_server(self, server_url: str):
        """Connect to an MCP server running with SSE transport"""
        # Store the context managers so they stay alive
        self._streams_context = sse_client(url=server_url)
        streams = await self._streams_context.__aenter__()

        self._session_context = ClientSession(*streams)
        self.session: ClientSession = await self._session_context.__aenter__()

        # Initialize
        await self.session.initialize()
        print("Initialized SSE...")

Get tools and process the user request with the LLM and MCP servers

Once the servers are initialized, we can fetch the tools from all available servers and process the user query, following the steps described above.

# get available tools from the servers
stdio_tools = await std_server.list_tools()
sse_tools = await sse_server.list_tools()
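
The request handler further down expects a mapping from tool name to the session that serves it. Here is one hedged way to build that mapping from several sessions. `FakeSession` and `build_tool_session_map` are hypothetical names; the sketch assumes `list_tools()` returns an object with a `.tools` list whose items have a `.name`.

```python
import asyncio
from types import SimpleNamespace


class FakeSession:
    """Hypothetical stand-in for an initialized MCP client session."""

    def __init__(self, tool_names):
        self._tool_names = tool_names

    async def list_tools(self):
        # Mirrors the assumed shape: a result object with a `.tools` list.
        return SimpleNamespace(tools=[SimpleNamespace(name=n) for n in self._tool_names])


async def build_tool_session_map(sessions):
    """Map every tool name to the session that serves it."""
    tool_session_map = {}
    for session in sessions:
        listing = await session.list_tools()
        for tool in listing.tools:
            if tool.name in tool_session_map:
                # Two servers expose the same name, so routing would be ambiguous.
                raise ValueError(f"Duplicate tool name: {tool.name}")
            tool_session_map[tool.name] = session
    return tool_session_map


stdio_session = FakeSession(["read_file"])
sse_session = FakeSession(["fetch_page"])
tool_session_map = asyncio.run(build_tool_session_map([stdio_session, sse_session]))
print(sorted(tool_session_map))  # ['fetch_page', 'read_file']
```

Raising on a duplicate name is a design choice for this sketch; a client could instead rename or skip the clashing tool.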

Processing the user request


import json

from openai import AzureOpenAI

async def process_user_query(self, available_tools: list, user_query: str, tool_session_map: dict):
        """
        Process the user query and return the response.
        """
        # Placeholder values: use a deployment and API version that support tool calling
        model_name = "gpt-35-turbo"
        api_version = "2022-12-01-preview"

        # Start the conversation with the user query
        self.messages = [
            {
                "role": "user",
                "content": user_query
            }
        ]

        # Initialize your LLM - e.g., Azure OpenAI client
        openai_client = AzureOpenAI(
            api_version=api_version,
            azure_endpoint="<OPENAI_ENDPOINT>",  # placeholder: your Azure OpenAI endpoint URL
            api_key="<API_KEY>",  # placeholder: your API key
        )

        # send the user query to the LLM along with the available tools from MCP Servers
        response = openai_client.chat.completions.create(
            messages=self.messages,
            model=model_name,
            tools=available_tools,
            tool_choice="auto"
        )

        azure_response = response.choices[0].message

        # append the LLM response (the user query is already in self.messages)
        self.messages.append(azure_response)

        # Process response and handle tool calls
        if azure_response.tool_calls:

            # assuming only one tool call suggested by LLM or keep in for loop to go over all suggested tool_calls
            tool_call = azure_response.tool_calls[0]

            # tool call based on the LLM suggestion
            result = await tool_session_map[tool_call.function.name].call_tool(
                tool_call.function.name,
                json.loads(tool_call.function.arguments)
            )

            # append the response to messages
            self.messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result.content[0].text
            })

            # optionally send the response to LLM to summarize
            azure_response = openai_client.chat.completions.create(
                messages=self.messages,
                model=model_name,
                tools=available_tools,
                tool_choice="auto"
            ).choices[0].message