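MCP Architecture Overview
MCP (Model Context Protocol) is an open protocol introduced by Anthropic. Its goal is to give large language models (LLMs) a single, uniform way to connect to external data sources and tools, so that developers do not have to keep reinventing the wheel.
The MCP architecture involves five main parts:
● Host: an application that contains an MCP Client; it can be a web app, a mobile or desktop app, or any other kind of program
● MCP Client: uses the MCP protocol to maintain a one-to-one connection with a Server
● MCP Server: connects to internal, external, and network resources, and exposes them as services over the MCP protocol
● Local: internal resources
● Remote: external/network resources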
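To make the relationship between these parts concrete, here is a minimal sketch in plain Python. It is only a mental model, not SDK code: the class names `Host`, `Client`, and `Server` and the method `all_tools` are invented for illustration. The point it shows is that a Host creates one Client per Server, and each Client talks to exactly one Server.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Stands in for an MCP Server that exposes some named tools."""
    name: str
    tools: list[str]


@dataclass
class Client:
    """One Client holds exactly one Server connection (the 1:1 rule)."""
    server: Server


@dataclass
class Host:
    """The application (e.g. an IDE); it owns one Client per Server."""
    clients: list[Client] = field(default_factory=list)

    def connect(self, server: Server) -> Client:
        client = Client(server)  # a new Client for every Server
        self.clients.append(client)
        return client

    def all_tools(self) -> dict[str, str]:
        """Map each tool name to the Server that provides it."""
        return {tool: c.server.name for c in self.clients for tool in c.server.tools}


host = Host()
host.connect(Server("arxiv", ["search_papers"]))
host.connect(Server("weather", ["get_alerts", "get_forecast"]))
print(host.all_tools())
```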
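Function calling, proposed by OpenAI, extended what large models can do: by using external tools, a model becomes considerably more capable. However, different models define and invoke tools in slightly different ways, and even common features such as web search or weather lookup still have to be implemented in each program. With MCP, you only need to bring an existing implementation into your project, much like importing a package.
In practice, then, MCP can be understood as a standardization of the tool calls, prompts, and related pieces that an Agent relies on. It simplifies development, cuts down on reinventing the wheel, and makes it easy to use the good tools published by the open-source community.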
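The difference in tool definitions can be seen in a small sketch. An MCP tool description carries a `name`, a `description`, and an `inputSchema`; the two helpers below (hypothetical names, written for this post) reshape that one description into the layout expected by Anthropic's Messages API and by OpenAI-style chat APIs. The sample tool is made up for illustration.

```python
def to_anthropic_tool(mcp_tool: dict) -> dict:
    """Anthropic's Messages API expects the schema under `input_schema`."""
    return {
        "name": mcp_tool["name"],
        "description": mcp_tool["description"],
        "input_schema": mcp_tool["inputSchema"],
    }


def to_openai_tool(mcp_tool: dict) -> dict:
    """OpenAI-style chat APIs wrap the same data in a `function` object."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool["description"],
            "parameters": mcp_tool["inputSchema"],
        },
    }


# Hypothetical tool description, shaped like an entry from an MCP tools list.
weather_tool = {
    "name": "get_alerts",
    "description": "Get weather alerts for a US state.",
    "inputSchema": {
        "type": "object",
        "properties": {"state": {"type": "string"}},
        "required": ["state"],
    },
}

print(to_anthropic_tool(weather_tool)["input_schema"]["required"])
print(to_openai_tool(weather_tool)["function"]["name"])
```

The tool is described once; only the thin wrapper changes per model provider.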
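Cursor MCP in Practice
I found an MCP Server online for searching arXiv papers (link: github.com/blazickjp/a...) and used it with Cursor as the MCP Client. Note that Cursor only supports the tools part of MCP.
1. Install the Server
shell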
uv tool install arxiv-mcp-server
2. Configure Cursor
Open the "Cursor Settings" menu, click "MCP", then click the "+ Add new global MCP server" button. An mcp.json file opens; add the configuration from arxiv-mcp-server to it.
json
{
"mcpServers": {
"arxiv-mcp-server": {
"command": "uv",
"args": [
"tool",
"run",
"arxiv-mcp-server",
"--storage-path", "/path/to/paper/storage"
]
}
}
}
When the indicator in front of the MCP Server turns green, the server is ready to use.
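If you prefer to generate the file rather than edit it by hand, the structure is simple enough to build in a few lines. This is a sketch only: `add_server` is a helper name invented here, and it works on an in-memory dict rather than on Cursor's actual file, whose location is not covered in this post.

```python
import json


def add_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of the config with one more entry under `mcpServers`."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    return {**config, "mcpServers": servers}


config = add_server(
    {},  # start from an empty mcp.json
    "arxiv-mcp-server",
    "uv",
    ["tool", "run", "arxiv-mcp-server", "--storage-path", "/path/to/paper/storage"],
)
print(json.dumps(config, indent=2))
```

Each entry under `mcpServers` is just the command line that starts the Server: `command` plus `args`.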
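3. Trying It Out
I entered the query "ai agent 2025". The model first judged web search to be the most suitable tool for this question. Because I clicked reject, it then chose the "search_papers" tool instead. The tool name shown is exactly the one provided by the MCP Server configured earlier, so I allowed the call to continue. The parameters and result of search_papers are shown below.
Parameters: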
json
{
"query": "AI agents autonomous systems 2025",
"categories": [
"cs.AI",
"cs.LG"
],
"max_results": 5
}
Result:
json
{
"total_results": 5,
"papers": [
{
"id": "2503.10638v1",
"title": "Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective",
"authors": [
"Xiaoming Zhao",
"Alexander G. Schwing"
],
"abstract": "Classifier-free guidance has become a staple for conditional generation with\ndenoising diffusion models. However, a comprehensive understanding of\nclassifier-free guidance is still missing. In this work, we carry out an\nempirical study to provide a fresh perspective on classifier-free guidance.\nConcretely, instead of solely focusing on classifier-free guidance, we trace\nback to the root, i.e., classifier guidance, pinpoint the key assumption for\nthe derivation, and conduct a systematic study to understand the role of the\nclassifier. We find that both classifier guidance and classifier-free guidance\nachieve conditional generation by pushing the denoising diffusion trajectories\naway from decision boundaries, i.e., areas where conditional information is\nusually entangled and is hard to learn. Based on this classifier-centric\nunderstanding, we propose a generic postprocessing step built upon\nflow-matching to shrink the gap between the learned distribution for a\npre-trained denoising diffusion model and the real data distribution, majorly\naround the decision boundaries. Experiments on various datasets verify the\neffectiveness of the proposed approach.",
"categories": [
"cs.CV",
"cs.AI",
"cs.LG"
],
"published": "2025-03-13T17:59:59+00:00",
"url": "http://arxiv.org/pdf/2503.10638v1",
"resource_uri": "arxiv://2503.10638v1"
},
{
"id": "2503.10636v1",
"title": "The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation",
"authors": [
"Ho Kei Cheng",
"Alexander Schwing"
],
"abstract": "Minibatch optimal transport coupling straightens paths in unconditional flow\nmatching. This leads to computationally less demanding inference as fewer\nintegration steps and less complex numerical solvers can be employed when\nnumerically solving an ordinary differential equation at test time. However, in\nthe conditional setting, minibatch optimal transport falls short. This is\nbecause the default optimal transport mapping disregards conditions, resulting\nin a conditionally skewed prior distribution during training. In contrast, at\ntest time, we have no access to the skewed prior, and instead sample from the\nfull, unbiased prior distribution. This gap between training and testing leads\nto a subpar performance. To bridge this gap, we propose conditional optimal\ntransport C^2OT that adds a conditional weighting term in the cost matrix when\ncomputing the optimal transport assignment. Experiments demonstrate that this\nsimple fix works with both discrete and continuous conditions in\n8gaussians-to-moons, CIFAR-10, ImageNet-32x32, and ImageNet-256x256. Our method\nperforms better overall compared to the existing baselines across different\nfunction evaluation budgets. Code is available at\nhttps://hkchengrex.github.io/C2OT",
"categories": [
"cs.LG",
"cs.CV"
],
"published": "2025-03-13T17:59:56+00:00",
"url": "http://arxiv.org/pdf/2503.10636v1",
"resource_uri": "arxiv://2503.10636v1"
},
{
"id": "2503.10635v1",
"title": "A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1",
"authors": [
"Zhaoyi Li",
"Xiaohan Zhao",
"Dong-Dong Wu",
"Jiacheng Cui",
"Zhiqiang Shen"
],
"abstract": "Despite promising performance on open-source large vision-language models\n(LVLMs), transfer-based targeted attacks often fail against black-box\ncommercial LVLMs. Analyzing failed adversarial perturbations reveals that the\nlearned perturbations typically originate from a uniform distribution and lack\nclear semantic details, resulting in unintended responses. This critical\nabsence of semantic information leads commercial LVLMs to either ignore the\nperturbation entirely or misinterpret its embedded semantics, thereby causing\nthe attack to fail. To overcome these issues, we notice that identifying core\nsemantic objects is a key objective for models trained with various datasets\nand methodologies. This insight motivates our approach that refines semantic\nclarity by encoding explicit semantic details within local regions, thus\nensuring interoperability and capturing finer-grained features, and by\nconcentrating modifications on semantically rich areas rather than applying\nthem uniformly. To achieve this, we propose a simple yet highly effective\nsolution: at each optimization step, the adversarial image is cropped randomly\nby a controlled aspect ratio and scale, resized, and then aligned with the\ntarget image in the embedding space. Experimental results confirm our\nhypothesis. Our adversarial examples crafted with local-aggregated\nperturbations focused on crucial regions exhibit surprisingly good\ntransferability to commercial LVLMs, including GPT-4.5, GPT-4o,\nGemini-2.0-flash, Claude-3.5-sonnet, Claude-3.7-sonnet, and even reasoning\nmodels like o1, Claude-3.7-thinking and Gemini-2.0-flash-thinking. Our approach\nachieves success rates exceeding 90% on GPT-4.5, 4o, and o1, significantly\noutperforming all prior state-of-the-art attack methods. Our optimized\nadversarial examples under different configurations and training code are\navailable at https://github.com/VILA-Lab/M-Attack.",
"categories": [
"cs.CV",
"cs.AI",
"cs.LG"
],
"published": "2025-03-13T17:59:55+00:00",
"url": "http://arxiv.org/pdf/2503.10635v1",
"resource_uri": "arxiv://2503.10635v1"
},
{
"id": "2503.10633v1",
"title": "Charting and Navigating Hugging Face's Model Atlas",
"authors": [
"Eliahu Horwitz",
"Nitzan Kurer",
"Jonathan Kahana",
"Liel Amar",
"Yedid Hoshen"
],
"abstract": "As there are now millions of publicly available neural networks, searching\nand analyzing large model repositories becomes increasingly important.\nNavigating so many models requires an atlas, but as most models are poorly\ndocumented charting such an atlas is challenging. To explore the hidden\npotential of model repositories, we chart a preliminary atlas representing the\ndocumented fraction of Hugging Face. It provides stunning visualizations of the\nmodel landscape and evolution. We demonstrate several applications of this\natlas including predicting model attributes (e.g., accuracy), and analyzing\ntrends in computer vision models. However, as the current atlas remains\nincomplete, we propose a method for charting undocumented regions.\nSpecifically, we identify high-confidence structural priors based on dominant\nreal-world model training practices. Leveraging these priors, our approach\nenables accurate mapping of previously undocumented areas of the atlas. We\npublicly release our datasets, code, and interactive atlas.",
"categories": [
"cs.LG",
"cs.CL",
"cs.CV"
],
"published": "2025-03-13T17:59:53+00:00",
"url": "http://arxiv.org/pdf/2503.10633v1",
"resource_uri": "arxiv://2503.10633v1"
},
{
"id": "2503.10632v1",
"title": "Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?",
"authors": [
"Subhajit Maity",
"Killian Hitsman",
"Xin Li",
"Aritra Dutta"
],
"abstract": "Kolmogorov-Arnold networks (KANs) are a remarkable innovation consisting of\nlearnable activation functions with the potential to capture more complex\nrelationships from data. Although KANs are useful in finding symbolic\nrepresentations and continual learning of one-dimensional functions, their\neffectiveness in diverse machine learning (ML) tasks, such as vision, remains\nquestionable. Presently, KANs are deployed by replacing multilayer perceptrons\n(MLPs) in deep network architectures, including advanced architectures such as\nvision Transformers (ViTs). In this paper, we are the first to design a general\nlearnable Kolmogorov-Arnold Attention (KArAt) for vanilla ViTs that can operate\non any choice of basis. However, the computing and memory costs of training\nthem motivated us to propose a more modular version, and we designed particular\nlearnable attention, called Fourier-KArAt. Fourier-KArAt and its variants\neither outperform their ViT counterparts or show comparable performance on\nCIFAR-10, CIFAR-100, and ImageNet-1K datasets. We dissect these architectures'\nperformance and generalization capacity by analyzing their loss landscapes,\nweight distributions, optimizer path, attention visualization, and spectral\nbehavior, and contrast them with vanilla ViTs. The goal of this paper is not to\nproduce parameter- and compute-efficient attention, but to encourage the\ncommunity to explore KANs in conjunction with more advanced architectures that\nrequire a careful understanding of learnable activations. Our open-source code\nand implementation details are available on: https://subhajitmaity.me/KArAt",
"categories": [
"cs.LG",
"cs.CV",
"68T07",
"I.2.6; I.5.1; I.5.5; I.5.4; I.4.10"
],
"published": "2025-03-13T17:59:52+00:00",
"url": "http://arxiv.org/pdf/2503.10632v1",
"resource_uri": "arxiv://2503.10632v1"
}
]
}
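At this point the model judged that the question had been answered, so it took no further steps. In the next exchange I explicitly asked the model to download one of the papers. The "download_papers" tool was called, and the PDF was saved in .md format under the local storage path. It did not stop there: the model went on to call "read_papers" to read and interpret the paper's content, and only then finished.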
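Because the search_papers result is plain JSON, it is easy to post-process outside the chat as well. The sketch below uses a trimmed copy of the result (two papers, abstracts and other fields omitted, so `total_results` is adjusted accordingly); `papers_in_category` is a helper name invented for this example.

```python
import json

# A trimmed copy of the search_papers result (abstracts and other fields omitted).
raw = """
{
  "total_results": 2,
  "papers": [
    {"id": "2503.10638v1",
     "title": "Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective",
     "categories": ["cs.CV", "cs.AI", "cs.LG"]},
    {"id": "2503.10636v1",
     "title": "The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation",
     "categories": ["cs.LG", "cs.CV"]}
  ]
}
"""


def papers_in_category(result: dict, category: str) -> list[str]:
    """Return the ids of papers tagged with the given arXiv category."""
    return [p["id"] for p in result["papers"] if category in p["categories"]]


result = json.loads(raw)
print(papers_in_category(result, "cs.AI"))  # ['2503.10638v1']
```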
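Custom MCP Client and MCP Server
Cursor, as used above, plays the role of the Client in the MCP architecture (strictly speaking, it is a Host that contains a Client). Ordinary users therefore only need to decide what functionality (Server) they want, then find and install it. Developers who want their own program to use existing MCP Servers need to adapt the program so that it conforms to the MCP specification. Official SDKs are available for Python and JavaScript, which makes this straightforward.
1. MCP Client
Following the official tutorial:
First, create a function that connects to a Server. It checks that the script type is supported (only Python and JavaScript are allowed), starts the Server, and prints the tools that are available.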
python
async def connect_to_server(self, server_script_path: str):
    """Connect to an MCP server

    Args:
        server_script_path: Path to the server script (.py or .js)
    """
    is_python = server_script_path.endswith('.py')
    is_js = server_script_path.endswith('.js')
    if not (is_python or is_js):
        raise ValueError("Server script must be a .py or .js file")

    command = "python" if is_python else "node"
    server_params = StdioServerParameters(
        command=command,
        args=[server_script_path],
        env=None
    )

    stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
    self.stdio, self.write = stdio_transport
    self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))

    await self.session.initialize()

    # List available tools
    response = await self.session.list_tools()
    tools = response.tools
    print("\nConnected to server with tools:", [tool.name for tool in tools])
Next, define the query-processing function. This is where the model's tool-calling flow is handled.
python
async def process_query(self, query: str) -> str:
    """Process a query using Claude and available tools"""
    messages = [
        {
            "role": "user",
            "content": query
        }
    ]

    response = await self.session.list_tools()
    available_tools = [{
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.inputSchema
    } for tool in response.tools]

    # Initial Claude API call
    response = self.anthropic.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        messages=messages,
        tools=available_tools
    )

    # Process response and handle tool calls
    final_text = []
    for content in response.content:
        if content.type == 'text':
            final_text.append(content.text)
        elif content.type == 'tool_use':
            tool_name = content.name
            tool_args = content.input

            # Execute tool call
            result = await self.session.call_tool(tool_name, tool_args)
            final_text.append(f"[Calling tool {tool_name} with args {tool_args}]")

            # Continue conversation with tool results
            if hasattr(content, 'text') and content.text:
                messages.append({
                    "role": "assistant",
                    "content": content.text
                })
            messages.append({
                "role": "user",
                "content": result.content
            })

            # Get next response from Claude
            response = self.anthropic.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=1000,
                messages=messages,
            )
            final_text.append(response.content[0].text)

    return "\n".join(final_text)
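The function above passes the tool output back to the model as an ordinary user message. Anthropic's Messages API also defines a dedicated `tool_result` content block that ties the output to the `id` of the `tool_use` block that requested it, with the assistant turn containing that `tool_use` block placed before it in `messages`. The sketch below only builds such a message; `tool_result_message` is a helper name invented here, and the id and text are placeholder values.

```python
def tool_result_message(tool_use_id: str, text: str) -> dict:
    """Build the user turn that carries a tool's output back to Claude."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_id,  # must match the id of the tool_use block
                "content": text,
            }
        ],
    }


# Hypothetical id and output, for illustration only.
msg = tool_result_message("toolu_123", "No active alerts for this state.")
print(msg["content"][0]["type"])
```

In the loop above, `tool_use_id` would come from `content.id` of the `tool_use` block. Whether this changes the answers you get depends on the model and SDK version, so treat it as something to try rather than a required fix.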
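Then comes the chat loop. This demo keeps no conversation history, so every query is handled as an independent single-turn exchange.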
python
async def chat_loop(self):
    """Run an interactive chat loop"""
    print("\nMCP Client Started!")
    print("Type your queries or 'quit' to exit.")

    while True:
        try:
            query = input("\nQuery: ").strip()
            if query.lower() == 'quit':
                break

            response = await self.process_query(query)
            print("\n" + response)
        except Exception as e:
            print(f"\nError: {str(e)}")
The complete code:
python
import asyncio
import sys
from typing import Optional
from contextlib import AsyncExitStack

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()  # load environment variables from .env


class MCPClient:
    def __init__(self):
        # Initialize session and client objects
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        self.anthropic = Anthropic()

    async def connect_to_server(self, server_script_path: str):
        """Connect to an MCP server

        Args:
            server_script_path: Path to the server script (.py or .js)
        """
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            raise ValueError("Server script must be a .py or .js file")

        command = "python" if is_python else "node"
        server_params = StdioServerParameters(
            command=command,
            args=[server_script_path],
            env=None
        )

        stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
        self.stdio, self.write = stdio_transport
        self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))

        await self.session.initialize()

        # List available tools
        response = await self.session.list_tools()
        tools = response.tools
        print("\nConnected to server with tools:", [tool.name for tool in tools])

    async def process_query(self, query: str) -> str:
        """Process a query using Claude and available tools"""
        messages = [
            {
                "role": "user",
                "content": query
            }
        ]

        response = await self.session.list_tools()
        available_tools = [{
            "name": tool.name,
            "description": tool.description,
            "input_schema": tool.inputSchema
        } for tool in response.tools]

        # Initial Claude API call
        response = self.anthropic.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1000,
            messages=messages,
            tools=available_tools
        )

        # Process response and handle tool calls
        final_text = []
        for content in response.content:
            if content.type == 'text':
                final_text.append(content.text)
            elif content.type == 'tool_use':
                tool_name = content.name
                tool_args = content.input

                # Execute tool call
                result = await self.session.call_tool(tool_name, tool_args)
                final_text.append(f"[Calling tool {tool_name} with args {tool_args}]")

                # Continue conversation with tool results
                if hasattr(content, 'text') and content.text:
                    messages.append({
                        "role": "assistant",
                        "content": content.text
                    })
                messages.append({
                    "role": "user",
                    "content": result.content
                })

                # Get next response from Claude
                response = self.anthropic.messages.create(
                    model="claude-3-5-sonnet-20241022",
                    max_tokens=1000,
                    messages=messages,
                )
                final_text.append(response.content[0].text)

        return "\n".join(final_text)

    async def chat_loop(self):
        """Run an interactive chat loop"""
        print("\nMCP Client Started!")
        print("Type your queries or 'quit' to exit.")

        while True:
            try:
                query = input("\nQuery: ").strip()
                if query.lower() == 'quit':
                    break

                response = await self.process_query(query)
                print("\n" + response)
            except Exception as e:
                print(f"\nError: {str(e)}")

    async def cleanup(self):
        """Clean up resources"""
        await self.exit_stack.aclose()


async def main():
    if len(sys.argv) < 2:
        print("Usage: python client.py <path_to_server_script>")
        sys.exit(1)

    client = MCPClient()
    try:
        await client.connect_to_server(sys.argv[1])
        await client.chat_loop()
    finally:
        await client.cleanup()


if __name__ == "__main__":
    asyncio.run(main())
The official example uses a Claude model, but in my testing an OpenAI-compatible API also works, as long as it supports the tools parameter.
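One detail to watch when switching to an OpenAI-compatible API: tool-call arguments arrive as a JSON string rather than as a dict, so they have to be decoded before being passed to the MCP session. The sketch below shows that step on a plain dict shaped like one entry of `message.tool_calls`; the helper name `mcp_call_from_openai` and the sample call are invented for illustration, and a real SDK response exposes the same fields as object attributes instead of dict keys.

```python
import json


def mcp_call_from_openai(tool_call: dict) -> tuple[str, dict]:
    """Turn one OpenAI-style tool call into (tool name, arguments dict)."""
    function = tool_call["function"]
    # OpenAI-style APIs deliver arguments as a JSON string, not a dict.
    return function["name"], json.loads(function["arguments"] or "{}")


# Hypothetical tool call, shaped like an entry of `message.tool_calls`.
call = {
    "id": "call_1",
    "type": "function",
    "function": {
        "name": "get_forecast",
        "arguments": '{"latitude": 38.9, "longitude": -77.0}',
    },
}
name, args = mcp_call_from_openai(call)
print(name, args)
```

The resulting pair can then be handed to `self.session.call_tool(name, args)`, exactly as in the Claude version.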
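2. MCP Server
An MCP Server mainly consists of three parts:
● Resources: static resources that clients are allowed to read, such as files and images
● Tools: the tools this Server provides
● Prompts: pre-written prompt templates that help with specific tasks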
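On the wire, MCP messages are JSON-RPC 2.0, and each of the three parts has its own listing method. The sketch below only builds the request envelopes so you can see their shape; `request` is a helper written for this post, and the `get_alerts` call with its argument is an illustrative example rather than captured traffic.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids


def request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# One listing method per part of the Server.
listings = [request(m) for m in ("resources/list", "tools/list", "prompts/list")]

# Calling a tool by name; the tool and its argument are illustrative.
call = request("tools/call", {"name": "get_alerts", "arguments": {"state": "CA"}})
print(json.dumps(call))
```

The SDKs hide these envelopes behind methods such as `list_tools()` and `call_tool()`, so you rarely write them by hand.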
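The official documentation also provides a simple Server demo:
First, it defines two helper functions: one sends the actual API request, and the other formats the response.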
python
async def make_nws_request(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API with proper error handling."""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "application/geo+json"
    }
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except Exception:
            return None


def format_alert(feature: dict) -> str:
    """Format an alert feature into a readable string."""
    props = feature["properties"]
    return f"""
Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description available')}
Instructions: {props.get('instruction', 'No specific instructions provided')}
"""
It then defines two tool functions. The decorator registers each function as a tool, so the Client can discover and call it.
python
@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)

    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."

    if not data["features"]:
        return "No active alerts for this state."

    alerts = [format_alert(feature) for feature in data["features"]]
    return "\n---\n".join(alerts)


@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # First get the forecast grid endpoint
    points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)

    if not points_data:
        return "Unable to fetch forecast data for this location."

    # Get the forecast URL from the points response
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)

    if not forecast_data:
        return "Unable to fetch detailed forecast."

    # Format the periods into a readable forecast
    periods = forecast_data["properties"]["periods"]
    forecasts = []
    for period in periods[:5]:  # Only show next 5 periods
        forecast = f"""
{period['name']}:
Temperature: {period['temperature']}°{period['temperatureUnit']}
Wind: {period['windSpeed']} {period['windDirection']}
Forecast: {period['detailedForecast']}
"""
        forecasts.append(forecast)

    return "\n---\n".join(forecasts)
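The type hints and docstrings on these functions matter: they are the raw material a tool description (name, description, input schema) can be built from. The sketch below illustrates the idea with the standard `inspect` module. It is a simplified stand-in written for this post, not FastMCP's actual implementation; `describe_tool` and `JSON_TYPES` are invented names, and the stub `get_forecast` only mirrors the signature above.

```python
import inspect

# Simplified mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Derive a tool description from a function's signature and docstring."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location."""
    return ""


tool = describe_tool(get_forecast)
print(tool["inputSchema"]["properties"])
```

This is why a clear docstring and precise type hints are worth the effort: they are what the model sees when deciding whether and how to call the tool.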
The complete code:
python
from typing import Any

import httpx
from mcp.server.fastmcp import FastMCP

# Initialize FastMCP server
mcp = FastMCP("weather")

# Constants
NWS_API_BASE = "https://api.weather.gov"
USER_AGENT = "weather-app/1.0"


async def make_nws_request(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API with proper error handling."""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "application/geo+json"
    }
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except Exception:
            return None


def format_alert(feature: dict) -> str:
    """Format an alert feature into a readable string."""
    props = feature["properties"]
    return f"""
Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description available')}
Instructions: {props.get('instruction', 'No specific instructions provided')}
"""


@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)

    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."

    if not data["features"]:
        return "No active alerts for this state."

    alerts = [format_alert(feature) for feature in data["features"]]
    return "\n---\n".join(alerts)


@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # First get the forecast grid endpoint
    points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)

    if not points_data:
        return "Unable to fetch forecast data for this location."

    # Get the forecast URL from the points response
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)

    if not forecast_data:
        return "Unable to fetch detailed forecast."

    # Format the periods into a readable forecast
    periods = forecast_data["properties"]["periods"]
    forecasts = []
    for period in periods[:5]:  # Only show next 5 periods
        forecast = f"""
{period['name']}:
Temperature: {period['temperature']}°{period['temperatureUnit']}
Wind: {period['windSpeed']} {period['windDirection']}
Forecast: {period['detailedForecast']}
"""
        forecasts.append(forecast)

    return "\n---\n".join(forecasts)


if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport='stdio')
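Recommended MCP Resources
MCP Server:
MCP Client:
You can also filter by criteria on glama:
Summary
The arrival of MCP benefits the whole AI ecosystem:
Ordinary users only need to decide which plugins they want, find them, and install them in an AI application such as Cursor. Developers can pull in good MCP Servers to reduce their own workload, and can package functionality they keep re-implementing as an MCP Server to reuse it across projects.
That said, the Claude models and the Claude desktop app cannot be used directly in China: registration requires a foreign phone number, and accounts are frequently at risk of being banned. Cursor only supports MCP Servers in Agent mode, not in regular Chat mode, and the free plan includes only a limited monthly Agent quota. For now, the MCP ecosystem will only reach its potential if the open-source community produces more good projects.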