[Spring AI MCP] Part 4: The MCP Server

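In this article, an MCP (Model Context Protocol) server is a backend system that receives MCP-compliant requests, parses the context field, coordinates tool calls and model inference, and returns a standardized response. It acts as the dispatch hub that connects an LLM to external capabilities such as tools, knowledge bases, and multimodal inputs.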

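I. Core Responsibilities of an MCP Server

Responsibility	Description
✅ Parse MCP requests	Read extension fields such as context.tools and context.retrieval
✅ Route to the base LLM	Call the underlying model API (OpenAI, Claude, Llama, etc.)
✅ Execute tool calls (Tool Execution)	When the model returns tool_calls, automatically invoke the matching local or remote function
✅ Inject retrieval results (RAG)	Query the vector store according to context.retrieval and splice the results into the context
✅ Manage multi-turn conversation state	Maintain the conversation history and the intermediate state of tool calls
✅ Degrade gracefully	For models without MCP support, fall back to traditional prompt concatenation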
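
Every responsibility in the table starts from reading the context field. As a rough illustration, the sketch below pulls the extension fields out of a request dict. ParsedContext and parse_context are hypothetical names introduced here, and the field layout simply mirrors the examples used in this article.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ParsedContext:
    """Extension fields this article attaches to a request (hypothetical container)."""
    tools: List[Dict[str, Any]] = field(default_factory=list)
    retrieval: Optional[Dict[str, Any]] = None
    images: List[Dict[str, Any]] = field(default_factory=list)
    cache: Optional[Dict[str, Any]] = None


def parse_context(request: Dict[str, Any]) -> ParsedContext:
    """Read the context.* fields, tolerating requests that omit them entirely."""
    context = request.get("context") or {}
    return ParsedContext(
        tools=list(context.get("tools") or []),
        retrieval=context.get("retrieval"),
        images=list(context.get("images") or []),
        cache=context.get("cache"),
    )


parsed = parse_context({
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "hi"}],
    "context": {"tools": [{"type": "function", "function": {"name": "get_weather"}}]},
})
print(len(parsed.tools), parsed.retrieval)  # 1 None
```

Defaulting every field means a plain chat request with no context block passes through the same code path.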

II. MCP Server Architecture

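[Architecture diagram: the MCP Client sends an MCP Request to the MCP Server over HTTP POST. Components shown inside the server: context parser; tool registry, vector database, and multimodal processor; LLM router (OpenAI, Claude, Ollama); tool execution engine (weather API, database query, code interpreter); response assembler.]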
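
To make the LLM router component a little more concrete, here is a minimal sketch that maps a model name to a provider endpoint. The prefix rules and the route_model name are assumptions made for illustration, the Ollama URL assumes a default local install, and a real router would more likely load this mapping from configuration.

```python
from typing import Tuple

# (prefix, provider, endpoint). The prefixes are illustrative assumptions.
ROUTES = [
    ("gpt-", "openai", "https://api.openai.com/v1/chat/completions"),
    ("claude-", "anthropic", "https://api.anthropic.com/v1/messages"),
    ("llama", "ollama", "http://localhost:11434/api/chat"),
]


def route_model(model: str) -> Tuple[str, str]:
    """Return (provider, endpoint) for a model name, or raise if no rule matches."""
    for prefix, provider, endpoint in ROUTES:
        if model.lower().startswith(prefix):
            return provider, endpoint
    raise ValueError(f"No route configured for model {model!r}")


print(route_model("gpt-4o")[0])  # openai
```

Note that the three providers do not share one request format, so each route would also need its own payload adapter.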

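III. The MCP Server Workflow in Detail

Scenario: the user asks "What's the weather like in Beijing?" and a get_weather tool has been registered
Step 1: Receive the MCP request

json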

{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "What's the weather like in Beijing?"}],
  "context": {
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the weather for a city",
          "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
          }
        }
      }
    ]
  }
}

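Step 2: Forward to the LLM (pass-through or augmented)

If the LLM natively supports tool calling (e.g. GPT-4o) → pass tools through unchanged
If it does not (e.g. Llama 3) → automatically build a prompt that describes the tools:

text

You can use the following tools:
- get_weather(city: string): Get the weather for a city
If you need to call one, reply with JSON: {"tool_name": "...", "args": {...}}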
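
The fallback prompt can be generated from the same tool definitions the client sent. The sketch below is one possible way to do that; build_tool_prompt is a hypothetical helper, and it accepts both a JSON Schema parameters object and a flat name-to-type mapping.

```python
from typing import Any, Dict, List


def build_tool_prompt(tools: List[Dict[str, Any]]) -> str:
    """Render tool definitions as plain-text instructions for a model without native tool calling."""
    lines = ["You can use the following tools:"]
    for tool in tools:
        fn = tool["function"]
        params = fn.get("parameters", {})
        # JSON Schema keeps arguments under "properties"; a flat mapping is used as-is.
        props = params.get("properties", params)
        signature = ", ".join(
            f"{name}: {spec.get('type', 'any')}" for name, spec in props.items()
        )
        lines.append(f"- {fn['name']}({signature}): {fn.get('description', '')}")
    lines.append(
        'If you need to call one, reply with JSON: {"tool_name": "...", "args": {...}}'
    )
    return "\n".join(lines)


tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather for a city",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}]
print(build_tool_prompt(tools))
```

The generated text would typically be prepended as a system message before the request is sent to the model.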

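Step 3: Handle the model response

Case A: the model returns tool_calls

json

{
  "tool_calls": [{
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Beijing\"}"}
  }]
}

→ The server parses the arguments and calls the local get_weather("Beijing") function. The id shown is a placeholder; OpenAI-style APIs use it to match a tool result to its call.

Case B: the model answers directly
→ The answer is returned to the client as-is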
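
Telling the two cases apart can be sketched as a single function. extract_tool_calls below is a hypothetical helper: it first looks for native tool_calls and then, as an assumption about the degraded path, tries to read the whole reply as the JSON format requested by the fallback prompt in Step 2.

```python
import json
from typing import Any, Dict, List


def extract_tool_calls(message: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a list of {"name", "args"} dicts, or [] when the model answered directly."""
    calls = []
    # Native format: structured tool_calls with JSON-encoded arguments.
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        calls.append({"name": fn["name"], "args": json.loads(fn["arguments"])})
    if calls:
        return calls
    # Fallback format: the reply text itself is a JSON object with tool_name/args.
    try:
        parsed = json.loads(message.get("content") or "")
    except (json.JSONDecodeError, TypeError):
        return []
    if isinstance(parsed, dict) and "tool_name" in parsed:
        return [{"name": parsed["tool_name"], "args": parsed.get("args", {})}]
    return []


native = {"tool_calls": [{"function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}}]}
fallback = {"content": '{"tool_name": "get_weather", "args": {"city": "Beijing"}}'}
plain = {"content": "It is sunny in Beijing."}
print(extract_tool_calls(native), extract_tool_calls(fallback), extract_tool_calls(plain))
```

An empty list corresponds to Case B, so the caller can return the message to the client unchanged.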

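Step 4: Execute the tool and return the result

python

# Sketch: call_weather_api is a placeholder for a real weather client
def call_weather_api(city: str) -> dict:
    return {"temp": 25, "condition": "sunny"}  # canned data for illustration

def execute_tool(tool_name: str, args: dict) -> dict:
    if tool_name == "get_weather":
        result = call_weather_api(args["city"])
        return {"output": result}  # normalized output
    raise ValueError(f"Unknown tool: {tool_name}")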

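Step 5: Make a second LLM call (with the tool result)

json

{
  "messages": [
    {"role": "user", "content": "What's the weather like in Beijing?"},
    {"role": "assistant", "tool_calls": [...]},
    {"role": "tool", "tool_call_id": "call_1", "content": "{\"temp\": 25, \"condition\": \"sunny\"}"}
  ]
}

With OpenAI-style APIs, tool_call_id must equal the id of the tool call being answered; call_1 is a placeholder value.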
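
Assembling the second-round message list can be sketched as follows. append_tool_results is a hypothetical helper, and it assumes OpenAI-style messages in which each tool call carries an id that the matching tool message repeats as tool_call_id.

```python
import json
from typing import Any, Dict, List


def append_tool_results(
    messages: List[Dict[str, Any]],
    assistant_message: Dict[str, Any],
    results: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Return a new message list; results maps each tool-call id to that tool's output."""
    follow_up = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls", []):
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(results[call["id"]], ensure_ascii=False),
        })
    return follow_up


assistant = {"role": "assistant", "tool_calls": [{
    "id": "call_1", "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
}]}
history = [{"role": "user", "content": "What's the weather like in Beijing?"}]
second_round = append_tool_results(history, assistant, {"call_1": {"temp": 25, "condition": "sunny"}})
print([m["role"] for m in second_round])  # ['user', 'assistant', 'tool']
```

Copying the list rather than mutating it keeps the original request intact in case the call has to be retried.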

Step 6: Return the final answer

json

{
  "choices": [{
    "message": {
      "content": "Beijing is sunny today, with a temperature of 25°C."
    }
  }]
}

IV. Implementing the Key MCP Server Modules (Python Example)

1. Tool registry and execution engine

python

from typing import Callable, Dict, Any

class ToolRegistry:
    def __init__(self):
        self.tools: Dict[str, Callable] = {}
    
    def register(self, name: str, func: Callable):
        self.tools[name] = func
    
    def execute(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name not in self.tools:
            raise ValueError(f"Tool {name} not found")
        result = self.tools[name](**args)
        return {"output": result}  # normalized response format used by this server
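
As a usage sketch, the block below repeats a compact copy of ToolRegistry so that it runs on its own, and adds a hypothetical tool decorator that registers a function under its own name. The get_weather body returns canned data in place of a real API call.

```python
from typing import Any, Callable, Dict


class ToolRegistry:
    """Compact copy of the registry above, so this example runs on its own."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self.tools[name] = func

    def execute(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name not in self.tools:
            raise ValueError(f"Tool {name} not found")
        return {"output": self.tools[name](**args)}

    def tool(self, func: Callable[..., Any]) -> Callable[..., Any]:
        """Hypothetical decorator: register a function under its own name."""
        self.register(func.__name__, func)
        return func


registry = ToolRegistry()


@registry.tool
def get_weather(city: str) -> Dict[str, Any]:
    # Canned data standing in for a real weather API call.
    return {"city": city, "temp": 25, "condition": "sunny"}


print(registry.execute("get_weather", {"city": "Beijing"}))
```

Because execute unpacks args as keyword arguments, the keys the model produces have to match the function's parameter names exactly.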

2. Main MCP server logic (FastAPI)

python

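import json
import os

import httpx
from fastapi import FastAPI

app = FastAPI()
tool_registry = ToolRegistry()  # the class from section 1; tools are assumed to be pre-registered

LLM_URL = "https://api.openai.com/v1/chat/completions"


@app.post("/mcp/chat/completions")
async def mcp_chat(request: dict):
    # 1. Extract the context
    context = request.get("context", {})
    tools = context.get("tools", [])

    # 2. Register tools (simplified: assume they are already registered;
    #    in practice they could be loaded dynamically from a database)

    # Read the key from the environment instead of hard-coding it
    headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}
    payload = {"model": request["model"], "messages": request["messages"]}
    if tools:
        payload["tools"] = tools  # OpenAI supports tools natively; omit the field when empty

    # 3. Forward to the underlying LLM; the client stays open for both rounds
    async with httpx.AsyncClient(timeout=60) as client:
        llm_response = await client.post(LLM_URL, headers=headers, json=payload)
        llm_response.raise_for_status()
        response_data = llm_response.json()
        message = response_data["choices"][0]["message"]

        # 4. Check whether the model asked for any tool calls
        tool_calls = message.get("tool_calls")
        if not tool_calls:
            return response_data

        # Execute every requested tool and answer each call by its id
        new_messages = request["messages"] + [message]
        for tool_call in tool_calls:
            func = tool_call["function"]
            args = json.loads(func["arguments"])
            tool_result = tool_registry.execute(func["name"], args)
            new_messages.append({
                "role": "tool",
                "tool_call_id": tool_call["id"],
                "content": json.dumps(tool_result["output"], ensure_ascii=False),
            })

        # 5. Second round: send the conversation with the tool results attached
        final_response = await client.post(
            LLM_URL,
            headers=headers,
            json={"model": request["model"], "messages": new_messages},
        )
        final_response.raise_for_status()
        return final_response.json()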
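
The handler above stops after one tool round. A model may ask for tools several times in a row, so a loop with an upper bound is a common extension. The sketch below shows the idea with the LLM and the tool executor passed in as plain callables; run_tool_loop and fake_llm are hypothetical names, and fake_llm is a stub used only so the example runs without network access.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_tool_loop(
    call_llm: Callable[[List[Message]], Message],
    execute_tool: Callable[[str, Dict[str, Any]], Any],
    messages: List[Message],
    max_rounds: int = 5,
) -> Message:
    """Call the model until it answers without tool_calls or max_rounds is reached."""
    history = list(messages)
    for _ in range(max_rounds):
        reply = call_llm(history)
        tool_calls = reply.get("tool_calls")
        if not tool_calls:
            return reply  # final answer
        history.append(reply)
        for call in tool_calls:
            fn = call["function"]
            output = execute_tool(fn["name"], json.loads(fn["arguments"]))
            history.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(output),
            })
    raise RuntimeError(f"No final answer after {max_rounds} rounds")


def fake_llm(history: List[Message]) -> Message:
    """Stub model: asks for the weather once, then answers from the tool result."""
    if history[-1]["role"] == "tool":
        data = json.loads(history[-1]["content"])
        return {"role": "assistant", "content": f"It is {data['condition']}, {data['temp']}°C."}
    return {"role": "assistant", "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    }]}


answer = run_tool_loop(
    fake_llm,
    lambda name, args: {"temp": 25, "condition": "sunny"},
    [{"role": "user", "content": "What's the weather like in Beijing?"}],
)
print(answer["content"])  # It is sunny, 25°C.
```

The max_rounds bound keeps a model that keeps requesting tools from looping forever.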

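V. Advanced Features

1. RAG integration (via context.retrieval)

python

def build_rag_prompt(context: dict, original_query: str, vector_db) -> str:
    # vector_db: any client exposing search(query, top_k) (assumed interface)
    if "retrieval" not in context:
        return original_query
    query = context["retrieval"]["query"]
    docs = vector_db.search(query, top_k=3)
    # Join the documents one per line instead of interpolating the raw list
    joined = "\n".join(str(doc) for doc in docs)
    return f"Reference material:\n{joined}\n\nQuestion: {original_query}"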
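
The snippet above relies on a vector_db object with a search(query, top_k) method. To show what such a call does, here is a toy in-memory stand-in. It is only a sketch: InMemoryVectorDB and embed are hypothetical, and word counts take the place of the learned embeddings a real vector database would use.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryVectorDB:
    """Brute-force search over a small list of documents."""

    def __init__(self, docs: List[str]) -> None:
        self._docs = [(embed(doc), doc) for doc in docs]

    def search(self, query: str, top_k: int = 3) -> List[str]:
        q = embed(query)
        ranked = sorted(self._docs, key=lambda item: cosine(q, item[0]), reverse=True)
        return [doc for _, doc in ranked[:top_k]]


vector_db = InMemoryVectorDB([
    "Beijing weather is sunny in autumn",
    "Shanghai is a port city",
    "Python is a programming language",
])
print(vector_db.search("weather in Beijing", top_k=1))
```

Swapping in a real store should only require an object with the same search signature.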

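2. Multimodal handling (context.images)

python

if "images" in context:
    # Start with the text part, then append one part per image so that
    # earlier images are not overwritten by later ones
    parts = [{"type": "text", "text": original_text}]
    for img in context["images"]:
        # Download the image → convert to base64 → inject into messages
        # (download_and_encode is a helper assumed to exist; JPEG is assumed)
        base64_img = download_and_encode(img["url"])
        parts.append(
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_img}"}}
        )
    messages[0]["content"] = parts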
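
The download_and_encode helper is not defined in the snippet above. One possible implementation with only the standard library is sketched below; the function bodies and the separate encode_image helper are assumptions, not part of the original code.

```python
import base64
import urllib.request


def encode_image(data: bytes, mime_type: str = "image/jpeg") -> str:
    """Return a data URL for raw image bytes."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def download_and_encode(url: str, timeout: float = 10.0) -> str:
    """Fetch an image and return only its base64 payload (no data: prefix)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return base64.b64encode(response.read()).decode("ascii")


# The first three bytes of every JPEG file encode to "/9j/".
print(encode_image(b"\xff\xd8\xff"))  # data:image/jpeg;base64,/9j/
```

In a server that accepts arbitrary URLs, the download step would also need a size limit and a host allow-list.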

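3. Cache optimization (context.cache)

python

import json
import redis

cache = redis.Redis()  # redis-py client; assumes a local Redis on the default port
cache_key = context.get("cache", {}).get("key")
cached = cache.get(cache_key) if cache_key else None  # None on a cache miss
if cached is not None:
    return json.loads(cached)  # the cached response is stored as a JSON string
# ... otherwise process the request normally and cache the result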
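
When the client does not supply context.cache.key, the server needs some way to derive one, and cached entries should expire. The sketch below illustrates both ideas without a Redis dependency; make_cache_key and TTLCache are hypothetical names, and hashing the model plus messages is just one reasonable choice of key.

```python
import hashlib
import json
import time
from typing import Any, Dict, Optional, Tuple


def make_cache_key(request: Dict[str, Any]) -> str:
    """Use context.cache.key when given, else hash the model and messages."""
    cache_cfg = (request.get("context") or {}).get("cache") or {}
    if cache_cfg.get("key"):
        return cache_cfg["key"]
    payload = json.dumps(
        {"model": request.get("model"), "messages": request.get("messages")},
        sort_keys=True,
        ensure_ascii=False,
    )
    return "mcp:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """In-memory stand-in for Redis with per-entry expiry."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)


request = {"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}
cache = TTLCache(ttl_seconds=60)
cache.set(make_cache_key(request), {"choices": []})
print(cache.get(make_cache_key(request)))  # {'choices': []}
```

Caching is only safe for requests whose answers do not depend on live tool results, which is one reason to leave the decision to an explicit context.cache field.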

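VI. Mainstream MCP Server Implementations

Type	Project	Features
Commercial platform	OpenRouter MCP Gateway	Supports 50+ models, automatic fallback, enterprise-grade
Open source	mcp-server-python (community)	Basic tool-calling support
Build your own (recommended)	FastAPI + LangChain	Flexible customization, suited to private enterprise deployment

🔍 GitHub search keywords: mcp server, model context protocol server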

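VII. MCP Server vs. Traditional AI Gateway

Capability	Traditional gateway	MCP server
Tool calling	Handled in the application layer	Built-in automatic execution
RAG integration	Prompt assembled by hand	Declarative configuration
Multi-model support	One adapter written per model	Unified MCP interface
Multimodal input	Formats differ per vendor	Standard images field
Development efficiency	Low (reinventing the wheel)	High (focus on business logic)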

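VIII. Summary

MCP server = the operating-system kernel for AI capabilities

It turns the LLM from an isolated model into an intelligent agent that can call external capabilities safely, efficiently, and in a standardized way.

✅ Core value:

For developers: hides the underlying complexity, so one codebase works across all supported models
For enterprises: unified governance of AI calls (auditing, rate limiting, cost control)
For the ecosystem: promotes a plugin-style standard for tools and knowledge bases

💡 Recommendations:

Small and mid-sized teams: use the OpenRouter MCP service directly
Large enterprises: build an in-house MCP server on an open-source framework and integrate internal toolchains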