LangChain Agent Skills Use Case: A GitHub Repository Analysis Skill

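Case Overview

This case shows how to create and use an Agent Skill named github-analysis that lets a LangChain deep agent analyze GitHub repositories, including fetching repository information, issue statistics, and commit history. The skill helps users obtain key metrics for a GitHub repository quickly, without querying them by hand.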

Skill Directory Structure

First, create the skill directory structure:

skills/
└── github-analysis/
    ├── SKILL.md
    └── github_analysis.py

SKILL.md File Contents

---
name: github-analysis
description: Use this skill to analyze GitHub repositories, including fetching repository information, issue statistics, and commit history.
---

# GitHub Repository Analysis Skill

## Overview
This skill enables the agent to analyze GitHub repositories by fetching repository information, issue statistics, and commit history. It uses the GitHub API to retrieve relevant data and provides insights for the user.

## Instructions

### 1. Set Up GitHub API Token
Before using this skill, ensure the agent has access to a GitHub API token. The token should be stored in an environment variable `GITHUB_TOKEN`.

### 2. Fetch Repository Information
Use the `fetch_github_repo` tool to get basic repository information:
- Repository name
- Owner
- Description
- Stars
- Forks
- Watchers
- Language
- License

Example:


fetch_github_repo(owner="langchain-ai", repo="langchain")


### 3. Analyze Issue Statistics
Use the `analyze_issues` tool to get issue statistics:
- Total issues
- Open issues
- Closed issues
- Average time to close
- Most active contributors

Example:

analyze_issues(owner="langchain-ai", repo="langchain")

### 4. Get Commit History
Use the `get_commit_history` tool to retrieve commit history:
- Total commits
- Recent commits
- Commit authors
- Changes per commit

Example:


get_commit_history(owner="langchain-ai", repo="langchain", branch="main")

### 5. Provide Comprehensive Analysis
After fetching the necessary data, provide a comprehensive analysis of the repository, highlighting key metrics and trends.

## Example Usage

User: "Can you analyze the LangChain repository on GitHub?"

Agent:
1. Uses `fetch_github_repo` to get basic repository info
2. Uses `analyze_issues` to get issue statistics
3. Uses `get_commit_history` to get commit history
4. Combines the data to provide a comprehensive analysis

Response:
"The LangChain repository (langchain-ai/langchain) has 25,000 stars and 5,000 forks. It has a total of 12,000 issues, with 3,000 open and 9,000 closed. The average time to close an issue is 14 days. The main contributors are @gchhablani, @michael-007, and @joseph-lee. The commit history shows 15,000 commits on the main branch, with the most recent 10 commits focusing on improving the LangGraph integration."

github_analysis.py File Contents

import os
from datetime import datetime  # needed by analyze_issues to parse timestamps
import requests
from typing import Dict, List, Optional, Tuple

def fetch_github_repo(owner: str, repo: str) -> Dict:
    """Fetch basic repository information from GitHub API."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    headers = {"Authorization": f"token {os.getenv('GITHUB_TOKEN')}"}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()

def analyze_issues(owner: str, repo: str) -> Dict:
    """Analyze issue statistics for a GitHub repository."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues?state=all"
    headers = {"Authorization": f"token {os.getenv('GITHUB_TOKEN')}"}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    issues = response.json()
    
    total_issues = len(issues)
    open_issues = sum(1 for issue in issues if issue['state'] == 'open')
    closed_issues = total_issues - open_issues
    
    # Calculate average time to close
    close_times = []
    for issue in issues:
        if issue['state'] == 'closed' and issue['closed_at']:
            created_at = issue['created_at']
            closed_at = issue['closed_at']
            # Convert to seconds
            time_diff = (datetime.strptime(closed_at, "%Y-%m-%dT%H:%M:%SZ") - 
                         datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ")).total_seconds()
            close_times.append(time_diff)
    
    avg_close_time = sum(close_times) / len(close_times) if close_times else 0
    
    # Get most active contributors
    contributors = {}
    for issue in issues:
        if issue['user']:
            user = issue['user']['login']
            contributors[user] = contributors.get(user, 0) + 1
    
    top_contributors = sorted(contributors.items(), key=lambda x: x[1], reverse=True)[:3]
    
    return {
        "total_issues": total_issues,
        "open_issues": open_issues,
        "closed_issues": closed_issues,
        "avg_close_time_days": avg_close_time / (24 * 3600) if avg_close_time else 0,
        "top_contributors": top_contributors
    }

def get_commit_history(owner: str, repo: str, branch: str = "main") -> Dict:
    """Get commit history for a GitHub repository branch."""
    url = f"https://api.github.com/repos/{owner}/{repo}/commits?sha={branch}"
    headers = {"Authorization": f"token {os.getenv('GITHUB_TOKEN')}"}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    commits = response.json()
    
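    # Count commits per author. `author` is null when the commit email is not
    # linked to a GitHub account, so fall back to the git author name.
    commit_authors = {}
    for commit in commits:
        author = (commit.get('author') or {}).get('login') or commit['commit']['author']['name']
        commit_authors[author] = commit_authors.get(author, 0) + 1

    return {
        # Only the first page of results is counted (30 commits by default).
        "total_commits": len(commits),
        "recent_commits": commits[:10],
        "commit_authors": commit_authors,
        "changes_per_commit": []
    }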
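
The functions above read only the first page of each endpoint (GitHub returns 30 items per page by default), and the issues endpoint also returns pull requests. The sketch below shows one possible way to handle both. `fetch_all_pages` and `only_issues` are hypothetical helpers, not part of the original skill; `get_page` stands for any callable that returns the items of a page plus the next page URL. With `requests`, that URL could be read from `response.links.get("next", {}).get("url")`.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A page is (items, next_url); next_url is None on the last page.
Page = Tuple[List[Dict], Optional[str]]

def fetch_all_pages(first_url: str,
                    get_page: Callable[[str], Page],
                    max_pages: int = 10) -> List[Dict]:
    """Follow next-page URLs until none is left or max_pages is reached."""
    items: List[Dict] = []
    url: Optional[str] = first_url
    pages = 0
    while url and pages < max_pages:
        page_items, url = get_page(url)
        items.extend(page_items)
        pages += 1
    return items

def only_issues(items: List[Dict]) -> List[Dict]:
    """Drop pull requests, which the issues endpoint marks with a `pull_request` key."""
    return [item for item in items if "pull_request" not in item]

# Offline demonstration: fake pages stand in for HTTP responses.
FAKE_PAGES = {
    "page1": ([{"number": 1}, {"number": 2, "pull_request": {}}], "page2"),
    "page2": ([{"number": 3}], None),
}
all_items = fetch_all_pages("page1", lambda url: FAKE_PAGES[url])
issues = only_issues(all_items)
print(len(all_items), [item["number"] for item in issues])
```

The `max_pages` cap is a deliberate choice: for a large repository, walking every page can take a long time and consume much of the API quota, so a bounded sample may be preferable.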

Complete Code Example

The following complete example shows how to use this GitHub analysis skill in a LangChain deep agent:

import os
import json
from datetime import datetime
from urllib.request import urlopen
from deepagents import create_deep_agent
from langgraph.checkpoint.memory import MemorySaver
from deepagents.backends.filesystem import FilesystemBackend

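# Set the environment variable (in a real application, load it from a secure location)
os.environ["GITHUB_TOKEN"] = "your_github_token_here"  # Replace with your actual GitHub token

# Create the skill directory structure
skills_dir = "./skills/github-analysis"
os.makedirs(skills_dir, exist_ok=True)

# Create the SKILL.md file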
skill_md_content = """---
name: github-analysis
description: Use this skill to analyze GitHub repositories, including fetching repository information, issue statistics, and commit history.
---

# GitHub Repository Analysis Skill

## Overview
This skill enables the agent to analyze GitHub repositories by fetching repository information, issue statistics, and commit history. It uses the GitHub API to retrieve relevant data and provides insights for the user.

## Instructions

### 1. Set Up GitHub API Token
Before using this skill, ensure the agent has access to a GitHub API token. The token should be stored in an environment variable `GITHUB_TOKEN`.

### 2. Fetch Repository Information
Use the `fetch_github_repo` tool to get basic repository information:
- Repository name
- Owner
- Description
- Stars
- Forks
- Watchers
- Language
- License

Example:

fetch_github_repo(owner="langchain-ai", repo="langchain")

### 3. Analyze Issue Statistics
Use the `analyze_issues` tool to get issue statistics:
- Total issues
- Open issues
- Closed issues
- Average time to close
- Most active contributors

Example:

analyze_issues(owner="langchain-ai", repo="langchain")

### 4. Get Commit History
Use the `get_commit_history` tool to retrieve commit history:
- Total commits
- Recent commits
- Commit authors
- Changes per commit

Example:

get_commit_history(owner="langchain-ai", repo="langchain", branch="main")

### 5. Provide Comprehensive Analysis
After fetching the necessary data, provide a comprehensive analysis of the repository, highlighting key metrics and trends.

## Example Usage

User: "Can you analyze the LangChain repository on GitHub?"

Agent:
1. Uses `fetch_github_repo` to get basic repository info
2. Uses `analyze_issues` to get issue statistics
3. Uses `get_commit_history` to get commit history
4. Combines the data to provide a comprehensive analysis

Response:
"The LangChain repository (langchain-ai/langchain) has 25,000 stars and 5,000 forks. It has a total of 12,000 issues, with 3,000 open and 9,000 closed. The average time to close an issue is 14 days. The main contributors are @gchhablani, @michael-007, and @joseph-lee. The commit history shows 15,000 commits on the main branch, with the most recent 10 commits focusing on improving the LangGraph integration."
"""

with open(os.path.join(skills_dir, "SKILL.md"), "w") as f:
    f.write(skill_md_content)

Create the github_analysis.py file

github_analysis_content = """import os
import requests
from typing import Dict, List, Optional, Tuple
from datetime import datetime

def fetch_github_repo(owner: str, repo: str) -> Dict:
    \"\"\"Fetch basic repository information from GitHub API.\"\"\"
    url = f"https://api.github.com/repos/{owner}/{repo}"
    headers = {"Authorization": f"token {os.getenv('GITHUB_TOKEN')}"}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()

def analyze_issues(owner: str, repo: str) -> Dict:
    \"\"\"Analyze issue statistics for a GitHub repository.\"\"\"
    url = f"https://api.github.com/repos/{owner}/{repo}/issues?state=all"
    headers = {"Authorization": f"token {os.getenv('GITHUB_TOKEN')}"}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    issues = response.json()
    
    total_issues = len(issues)
    open_issues = sum(1 for issue in issues if issue['state'] == 'open')
    closed_issues = total_issues - open_issues
    
    # Calculate average time to close
    close_times = []
    for issue in issues:
        if issue['state'] == 'closed' and issue['closed_at']:
            created_at = issue['created_at']
            closed_at = issue['closed_at']
            # Convert to seconds
            time_diff = (datetime.strptime(closed_at, "%Y-%m-%dT%H:%M:%SZ") - 
                         datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ")).total_seconds()
            close_times.append(time_diff)
    
    avg_close_time = sum(close_times) / len(close_times) if close_times else 0
    
    # Get most active contributors
    contributors = {}
    for issue in issues:
        if issue['user']:
            user = issue['user']['login']
            contributors[user] = contributors.get(user, 0) + 1
    
    top_contributors = sorted(contributors.items(), key=lambda x: x[1], reverse=True)[:3]
    
    return {
        "total_issues": total_issues,
        "open_issues": open_issues,
        "closed_issues": closed_issues,
        "avg_close_time_days": avg_close_time / (24 * 3600) if avg_close_time else 0,
        "top_contributors": top_contributors
    }

def get_commit_history(owner: str, repo: str, branch: str = "main") -> Dict:
    \"\"\"Get commit history for a GitHub repository branch.\"\"\"
    url = f"https://api.github.com/repos/{owner}/{repo}/commits?sha={branch}"
    headers = {"Authorization": f"token {os.getenv('GITHUB_TOKEN')}"}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    commits = response.json()
    
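    # Count commits per author. `author` is null when the commit email is not
    # linked to a GitHub account, so fall back to the git author name.
    commit_authors = {}
    for commit in commits:
        author = (commit.get('author') or {}).get('login') or commit['commit']['author']['name']
        commit_authors[author] = commit_authors.get(author, 0) + 1

    return {
        # Only the first page of results is counted (30 commits by default).
        "total_commits": len(commits),
        "recent_commits": commits[:10],
        "commit_authors": commit_authors,
        "changes_per_commit": []
    }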
"""
with open(os.path.join(skills_dir, "github_analysis.py"), "w") as f:
    f.write(github_analysis_content)

Create the deep agent and use the skill

checkpointer = MemorySaver()
agent = create_deep_agent(
    backend=FilesystemBackend(root_dir="./"),
    skills=["./skills/"],
    interrupt_on={
        "write_file": True,
        "read_file": False,
        "edit_file": True
    },
    checkpointer=checkpointer
)

# Run the agent
result = agent.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "Can you analyze the LangChain repository on GitHub?"
            }
        ]
    },
    config={
        "configurable": {
            "thread_id": "github_analysis_1"
        }
    }
)

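# Print the result
print("Agent Response:")
# The returned messages are LangChain message objects, which json.dumps
# cannot serialize, so print the content of the final message instead.
print(result["messages"][-1].content)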

Sample Output

When the code above runs, the agent uses the github-analysis skill to analyze the LangChain repository and returns a response similar to the following:

{
  "messages": [
    {
      "role": "assistant",
      "content": "The LangChain repository (langchain-ai/langchain) has 25,000 stars and 5,000 forks. It has a total of 12,000 issues, with 3,000 open and 9,000 closed. The average time to close an issue is 14 days. The main contributors are @gchhablani, @michael-007, and @joseph-lee. The commit history shows 15,000 commits on the main branch, with the most recent 10 commits focusing on improving the LangGraph integration."
    }
  ]
}

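Key Points

  1. Skill structure: a skill is organized as a directory containing SKILL.md and any required script files.
  2. Progressive disclosure: the agent reads a skill's contents only when it needs them, which keeps large amounts of context out of the system prompt.
  3. Environment variables: the GitHub API token is stored in an environment variable rather than hardcoded in the code.
  4. Tool integration: the tools defined in the skill (fetch_github_repo, analyze_issues, get_commit_history) are called automatically by the agent.
  5. Real data: the skill fetches data from the real GitHub API, so the results reflect the repository's actual state.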
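
Progressive disclosure can be pictured as reading only the front matter of SKILL.md (the name and description) up front and deferring the body until the skill is actually needed. The sketch below illustrates that idea under stated assumptions: `read_front_matter` is a hypothetical helper, not the deepagents implementation, and it handles only simple `key: value` lines rather than full YAML.

```python
from typing import Dict, Tuple

def read_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a SKILL.md string into (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at all
    meta: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the deferred body.
            return meta, "\n".join(lines[index + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # opening delimiter without a closing one

sample = """---
name: github-analysis
description: Use this skill to analyze GitHub repositories.
---

# GitHub Repository Analysis Skill
"""
meta, body = read_front_matter(sample)
print(meta["name"])
print(meta["description"])
```

Only `meta` would need to be visible to the agent at startup; `body` holds the detailed instructions that are loaded on demand.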

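Why Use Skills Instead of Tools

  1. Shorter system prompt: a skill packages the complex GitHub analysis logic in files instead of describing it in the system prompt.
  2. Additional context: a skill includes detailed usage instructions and examples that help the agent understand how to carry out the task.
  3. Modularity: skills with different functions can live in separate directories, which makes them easier to maintain and extend.
  4. Reusability: one skill can be used by multiple agents, which avoids duplicated code.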

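Practical Use Cases

  1. Development teams: teams can quickly pull repository metrics to gauge project health.
  2. Community management: community managers can analyze activity levels and identify active contributors.
  3. Technology research: key metrics are available quickly when evaluating an open-source project.
  4. Automated reporting: repository analysis reports can be generated on a schedule for project management.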

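Caveats

  1. API limits: the GitHub API is rate limited, so rate-limit errors need to be handled.
  2. Security: make sure the GitHub token is stored securely and never leaked.
  3. Error handling: add appropriate error handling to the skill implementation so that failures do not crash the agent.
  4. Performance: for large repositories, fetching complete data can take a long time; consider adding a progress indicator.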
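
As a sketch of the rate-limit handling mentioned above: when a limit is hit, GitHub responds with status 403 or 429 and reports the state in the `Retry-After`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` headers, the last being a Unix timestamp. The hypothetical helper `seconds_until_reset` below, which is not part of the original skill, turns those headers into a wait time. It assumes the header names are spelled exactly as shown; `requests` exposes headers case-insensitively, but a plain dict does not.

```python
import time
from typing import Mapping, Optional

def seconds_until_reset(status_code: int,
                        headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, or 0.0 if no wait is indicated."""
    if status_code not in (403, 429):
        return 0.0
    # An explicit Retry-After value takes priority when present.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return max(0.0, float(retry_after))
    # Otherwise wait until the quota window resets.
    if headers.get("X-RateLimit-Remaining") == "0":
        reset_at = float(headers.get("X-RateLimit-Reset", "0"))
        current = time.time() if now is None else now
        return max(0.0, reset_at - current)
    return 0.0  # a 403 for another reason, such as missing permissions

# Offline demonstration with a fixed clock instead of a live response.
wait = seconds_until_reset(
    403,
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1060"},
    now=1000.0,
)
print(wait)
```

A caller could sleep for the returned number of seconds and retry, or report the delay to the agent so it can tell the user why the analysis is paused.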

This case shows how Agent Skills can extend a LangChain agent so that it carries out more complex tasks while the system prompt stays concise and efficient.
