Learning MCP from Scratch (7) | Hands-On: Building a Paper Analysis Agent with MCP

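Earlier tutorials in this series covered the basic concepts and core components of MCP (Model Context Protocol). This installment works through a practical case: using MCP to build an agent that analyzes academic papers. The agent reads a PDF paper, extracts key information, and answers the user's questions about its content.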

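I. Project Overview

We will build a paper analysis agent with the following capabilities:

Read and parse PDF papers
Extract basic information about a paper (title, authors, abstract, and so on)
Analyze the paper's content and answer user questions
Summarize the paper's key information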

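II. Environment Setup

First, make sure the following tools are installed:

Node.js (version 18 or later)
npm or yarn
The Claude desktop app, or another client that supports MCP

Create the project directory and initialize it: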

mkdir paper-analysis-agent
cd paper-analysis-agent
npm init -y

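Install the required dependencies. The server classes come from the official MCP SDK package, @modelcontextprotocol/sdk, and the parsing code in this tutorial uses the pdf-parse 1.x API, so that version is pinned here:

npm install @modelcontextprotocol/sdk pdf-parse@1.1.1

III. Implementing the MCP Server

1. Create the server entry file

Create server.js:

const { Server } = require('@modelcontextprotocol/sdk/server/index.js');
const { StdioServerTransport } = require('@modelcontextprotocol/sdk/server/stdio.js');
const {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} = require('@modelcontextprotocol/sdk/types.js');
const { analyzePaper, extractPaperInfo } = require('./paperAnalyzer');

// Input schema property shared by all three tools.
const PDF_PATH = { type: 'string', description: 'Path to the PDF file' };

// Tool definitions returned by tools/list so that clients can discover them.
const TOOLS = [
  {
    name: 'analyze_paper',
    description: 'Analyze a paper and answer a question about it',
    inputSchema: {
      type: 'object',
      properties: {
        pdfPath: PDF_PATH,
        question: { type: 'string', description: 'Question about the paper' },
      },
      required: ['pdfPath'],
    },
  },
  {
    name: 'extract_paper_info',
    description: 'Extract basic paper information (title, authors, abstract)',
    inputSchema: { type: 'object', properties: { pdfPath: PDF_PATH }, required: ['pdfPath'] },
  },
  {
    name: 'summarize_paper',
    description: 'Generate a summary of a paper',
    inputSchema: { type: 'object', properties: { pdfPath: PDF_PATH }, required: ['pdfPath'] },
  },
];

class PaperAnalysisServer {
  constructor() {
    this.server = new Server(
      {
        name: 'paper-analysis-server',
        version: '1.0.0',
      },
      {
        capabilities: {
          // Declare only what is implemented. Add `resources: {}` here once
          // setupResources() registers its handlers.
          tools: {},
        },
      }
    );

    this.setupResources();
    this.setupTools();
    this.setupErrorHandling();
  }

  setupResources() {
    // Resource handlers will be implemented later.
  }

  setupTools() {
    // tools/list: advertise the available tools.
    this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: TOOLS,
    }));

    // tools/call: dispatch to the matching tool method.
    this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
      // `arguments` is optional in the request, so default to an empty object.
      const { name, arguments: args = {} } = request.params;

      try {
        switch (name) {
          case 'analyze_paper':
            return await this.analyzePaper(args);
          case 'extract_paper_info':
            return await this.extractPaperInfo(args);
          case 'summarize_paper':
            return await this.summarizePaper(args);
          default:
            throw new Error(`Unknown tool: ${name}`);
        }
      } catch (error) {
        // Report tool failures in the result so the model can see them.
        return {
          content: [
            {
              type: 'text',
              text: `Error: ${error.message}`,
            },
          ],
          isError: true,
        };
      }
    });
  }

  setupErrorHandling() {
    this.server.onerror = (error) => {
      console.error('Server error:', error);
    };
  }

  async analyzePaper(args) {
    const { pdfPath, question } = args;
    
    if (!pdfPath) {
      throw new Error('PDF path is required');
    }

    const analysis = await analyzePaper(pdfPath, question);
    
    return {
      content: [
        {
          type: 'text',
          text: analysis,
        },
      ],
    };
  }

  async extractPaperInfo(args) {
    const { pdfPath } = args;
    
    if (!pdfPath) {
      throw new Error('PDF path is required');
    }

    const info = await extractPaperInfo(pdfPath);
    
    return {
      content: [
        {
          type: 'text',
          text: JSON.stringify(info, null, 2),
        },
      ],
    };
  }

  async summarizePaper(args) {
    const { pdfPath } = args;
    
    if (!pdfPath) {
      throw new Error('PDF path is required');
    }

    // Placeholder: the summarization logic goes here.
    const summary = 'The paper summary will appear here';
    
    return {
      content: [
        {
          type: 'text',
          text: summary,
        },
      ],
    };
  }

  async run() {
    // Clients such as Claude Desktop talk to the server over stdin/stdout.
    const transport = new StdioServerTransport();
    await this.server.connect(transport);
    // stdout carries the protocol messages, so log to stderr instead.
    console.error('Paper Analysis MCP Server is running...');
  }
}

const server = new PaperAnalysisServer();
server.run().catch(console.error);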
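
The three tool methods in server.js repeat the same argument check and the same result wrapping. As a minimal sketch, that repetition could be factored into two small helpers. The names `requireString` and `textResult` are hypothetical and are not part of the MCP SDK.

```javascript
// Hypothetical helpers; neither name comes from the MCP SDK.

// Return args[key] if it is a non-empty string, otherwise throw.
function requireString(args, key) {
  const value = args ? args[key] : undefined;
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`${key} is required`);
  }
  return value;
}

// Wrap plain text in the result shape that the tool methods return.
function textResult(text, isError = false) {
  const result = { content: [{ type: 'text', text }] };
  if (isError) {
    result.isError = true;
  }
  return result;
}

// Usage: validate the argument, then wrap the output.
const pdfPath = requireString({ pdfPath: './sample.pdf' }, 'pdfPath');
console.log(JSON.stringify(textResult(`Parsed ${pdfPath}`)));
```

With helpers like these, each tool method shrinks to a validation line, a call into paperAnalyzer, and a `textResult(...)` return.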

2. Implement the paper analyzer

Create paperAnalyzer.js:

const fs = require('fs');
// pdf-parse 1.x API: pdf(buffer) resolves to { text, numpages, info, metadata, ... }
const pdf = require('pdf-parse');

class PaperAnalyzer {
  constructor() {
    // Parsed results keyed by path, so repeated queries skip re-parsing.
    this.cache = new Map();
  }

  async parsePDF(pdfPath) {
    if (this.cache.has(pdfPath)) {
      return this.cache.get(pdfPath);
    }

    try {
      // Read asynchronously so a large file does not block the server.
      const dataBuffer = await fs.promises.readFile(pdfPath);
      const data = await pdf(dataBuffer);
      
      const result = {
        text: data.text,
        numpages: data.numpages,
        info: data.info,
        metadata: data.metadata,
      };

      this.cache.set(pdfPath, result);
      return result;
    } catch (error) {
      throw new Error(`Failed to parse PDF: ${error.message}`);
    }
  }

  async extractPaperInfo(pdfPath) {
    const paperData = await this.parsePDF(pdfPath);
    const text = paperData.text;

    // Simple heuristics; real papers vary widely in layout, so a production
    // system would likely need more robust NLP processing.
    // Title: the line directly above the "Abstract" heading.
    const titleMatch = text.match(/^(.+)\n\nAbstract/im);
    // Abstract: the first paragraph after the heading, up to a blank line.
    const abstractMatch = text.match(/Abstract[:.\s]*([\s\S]+?)(?:\n\s*\n|$)/i);
    const authorMatch = text.match(/(?:Authors?|By)[:\s]+(.+?)(?=\n\n)/i);

    return {
      title: titleMatch ? titleMatch[1].trim() : 'Unknown',
      authors: authorMatch ? authorMatch[1].trim() : 'Unknown',
      abstract: abstractMatch ? abstractMatch[1].trim() : 'Unknown',
      // pdf-parse reports the page count as `numpages`, not in `info`.
      pageCount: paperData.numpages || 'Unknown',
    };
  }

  async analyzeContent(pdfPath, question) {
    // Parsing here confirms the file is readable; the text is not used yet.
    await this.parsePDF(pdfPath);
    
    // Placeholder: more sophisticated content analysis can be added here.
    // For now this only echoes the question back in the response.
    return `Paper analysis result:
Question: ${question || '(none provided)'}
Answer: Based on the paper's content, a detailed answer to the question should go here.`;
  }
}

// Create a singleton instance
const analyzer = new PaperAnalyzer();

// Exported functions
async function analyzePaper(pdfPath, question) {
  return await analyzer.analyzeContent(pdfPath, question);
}

async function extractPaperInfo(pdfPath) {
  return await analyzer.extractPaperInfo(pdfPath);
}

module.exports = {
  analyzePaper,
  extractPaperInfo,
};
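
The analyzeContent method returns a placeholder rather than a real answer. One simple way to ground an answer in the paper, shown here only as a sketch, is to pick the paragraphs that share the most words with the question and return those to the model. The function names `tokenize` and `topPassages` are made up for this example, and plain word overlap is a stand-in for real retrieval that only suits space-delimited text such as English.

```javascript
// Split text into lowercase word tokens (ASCII letters and digits only).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Rank paragraphs by how many distinct question words they contain
// and return the k best matches.
function topPassages(paperText, question, k = 3) {
  const queryWords = new Set(tokenize(question));
  return paperText
    .split(/\n\s*\n/) // paragraphs are separated by blank lines
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((p) => {
      const words = new Set(tokenize(p));
      let score = 0;
      for (const w of queryWords) {
        if (words.has(w)) score += 1;
      }
      return { paragraph: p, score };
    })
    .filter((item) => item.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((item) => item.paragraph);
}

// Usage with inline text instead of a parsed PDF
const sampleText = [
  'We propose a graph-based method for citation analysis.',
  'The weather was pleasant during the conference.',
  'Our method outperforms baselines on three citation datasets.',
].join('\n\n');
console.log(topPassages(sampleText, 'What method is used for citation analysis?', 2));
```

Inside analyzeContent, the text from `parsePDF` could be passed to `topPassages` and the selected paragraphs returned alongside the question, leaving the actual reasoning to the model on the client side.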

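IV. Configuring the MCP Client

Add the server to claude_desktop_config.json, which lives in the Claude desktop app's configuration directory (create the file if it does not exist yet):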

{
  "mcpServers": {
    "paper-analysis": {
      "command": "node",
      "args": ["/path/to/your/paper-analysis-agent/server.js"],
      "env": {}
    }
  }
}
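
Where the configuration directory lives depends on the operating system. The sketch below encodes the locations commonly documented for Claude Desktop on macOS and Windows; treat them as an assumption to verify against your installed version. The helper names `claudeConfigPath` and `serverEntry` are hypothetical.

```javascript
const path = require('path');

// Return the expected config file location, or null for platforms
// this sketch does not cover.
function claudeConfigPath(platform, env) {
  const file = 'claude_desktop_config.json';
  if (platform === 'darwin') {
    return path.posix.join(env.HOME, 'Library', 'Application Support', 'Claude', file);
  }
  if (platform === 'win32') {
    return path.win32.join(env.APPDATA, 'Claude', file);
  }
  return null;
}

// Build the mcpServers entry shown above, with an absolute script path.
function serverEntry(serverScript) {
  return { command: 'node', args: [path.resolve(serverScript)], env: {} };
}

console.log(claudeConfigPath(process.platform, process.env));
console.log(
  JSON.stringify({ mcpServers: { 'paper-analysis': serverEntry('server.js') } }, null, 2)
);
```

Using an absolute path in `args` matters because the client, not your shell, launches the process, so a relative path would be resolved against the client's working directory.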

V. Testing the Agent

Create a test script, test.js. It expects a PDF named sample.pdf in the project directory:

const { analyzePaper, extractPaperInfo } = require('./paperAnalyzer');

async function test() {
  try {
    // Test information extraction
    const info = await extractPaperInfo('./sample.pdf');
    console.log('Paper info:', info);

    // Test content analysis
    const analysis = await analyzePaper(
      './sample.pdf',
      'What is the main contribution of this paper?'
    );
    console.log('Analysis result:', analysis);
  } catch (error) {
    console.error('Test failed:', error);
  }
}

test();

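VI. Running and Using the Agent

Start the MCP server once from the terminal to confirm that it launches without errors, then stop it with Ctrl+C. In normal use you do not start it by hand: the Claude desktop app launches the server itself with the command from the configuration file and talks to it over stdin/stdout.

node server.js

Restart the Claude desktop app so that it picks up the configuration. The following tools are then available:

analyze_paper: analyzes the paper's content and answers a question
extract_paper_info: extracts the paper's basic information
summarize_paper: generates a summary of the paper

Example conversation:

User: Please analyze the paper "/path/to/paper.pdf" and tell me its main research method.

Claude: I will use the paper analysis tool to answer this question.

[calls the analyze_paper tool]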
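
Behind the bracketed tool call, the client sends the server a JSON-RPC 2.0 request with the method tools/call. The sketch below shows the shape of that message. The helper `buildToolCall` is hypothetical; real clients construct the request for you.

```javascript
// Build the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
function buildToolCall(id, toolName, toolArgs) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: toolName, arguments: toolArgs },
  };
}

const request = buildToolCall(1, 'analyze_paper', {
  pdfPath: '/path/to/paper.pdf',
  question: 'What is the main research method?',
});
console.log(JSON.stringify(request, null, 2));
```

The `params.name` and `params.arguments` fields are exactly what the tools/call handler in server.js destructures before dispatching to a tool method.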

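VII. Advanced Extensions

You can extend the agent further:

Integrate an NLP library: add natural language processing features such as entity recognition and relation extraction
Add citation analysis: parse the paper's reference list and citation relationships
Add visualization: generate a visual analysis report of the paper's content
Extend caching: parsed PDFs are already cached in memory; caching analysis results as well would speed up repeated queries
Support more formats: handle other document formats such as Word and HTML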
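
As one illustration of the citation-analysis idea, the sketch below splits a reference section into entries. It assumes the section starts at a line that reads "References" and that entries are numbered [1], [2], and so on. Many papers use other styles, so this is a starting point rather than a general parser, and `extractReferences` is a hypothetical name.

```javascript
// Split a numbered reference list into { number, text } entries.
function extractReferences(text) {
  // Find a line that consists only of the heading "References".
  const heading = text.match(/^\s*References\s*$/im);
  if (!heading) return [];
  const section = text.slice(heading.index + heading[0].length);

  const entries = [];
  // Each entry runs from "[n]" up to the next "[m]" at a line start, or the end.
  const pattern = /\[(\d+)\]\s*([\s\S]*?)(?=\n\s*\[\d+\]|$)/g;
  for (const m of section.matchAll(pattern)) {
    entries.push({
      number: Number(m[1]),
      // Join wrapped lines into a single line.
      text: m[2].replace(/\s+/g, ' ').trim(),
    });
  }
  return entries;
}

// Usage with inline text instead of a parsed PDF
const paperText = [
  'Prior work [1] studied widgets.',
  'References',
  '[1] A. Author. First paper.',
  '    Journal of Examples, 2020.',
  '[2] B. Writer. Second paper. 2021.',
].join('\n');
console.log(extractReferences(paperText));
```

In-text citations such as the "[1]" in the first line are ignored because only the text after the heading is scanned.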

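VIII. Summary

In this tutorial you learned how to:

Create an MCP-based paper analysis agent
Implement PDF parsing and content extraction
Configure the integration between an MCP server and the Claude client
Build practical paper analysis tools

The project shows how MCP works in a real application: by combining tools and resources, you can build an agent tailored to a specific domain.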

In real-world use, remember to handle error cases, add appropriate logging, and consider performance and security. Happy coding!
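
On the security point, one concrete risk is that the pdfPath argument is chosen by the model and could point anywhere on disk. The sketch below is a minimal guard that accepts only PDF paths inside one allowed directory. The function name `resolvePdfPath` and the `papers` directory are assumptions for this example, and the check does not resolve symbolic links.

```javascript
const path = require('path');

// Resolve userPath against baseDir and reject anything outside it.
function resolvePdfPath(baseDir, userPath) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  const relative = path.relative(base, target);
  // A path that climbs out of base starts with ".."; on Windows a path on
  // another drive comes back as an absolute path.
  const escapes = relative.split(path.sep)[0] === '..' || path.isAbsolute(relative);
  if (escapes) {
    throw new Error('Path is outside the allowed directory');
  }
  if (path.extname(target).toLowerCase() !== '.pdf') {
    throw new Error('Only .pdf files are accepted');
  }
  return target;
}

// Usage: a path inside the directory resolves, a traversal attempt is rejected.
console.log(resolvePdfPath('papers', 'attention.pdf'));
try {
  resolvePdfPath('papers', '../../etc/passwd');
} catch (error) {
  console.error(error.message);
}
```

Each tool method could call a guard like this before handing the path to the analyzer, so that a malformed or malicious request fails early with a clear message.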