🤖 Minimal Universal Agent (极简万能 Agent)
Author: Wang Jiaocheng 王教成 (波动几何) | LLM (brain) + command executor (limbs) = a universal general-purpose AI agent
Core Concept
User input in natural language
↓
┌─────────┐
│   LLM   │ ← understands intent, generates commands / script code
│ (Brain) │
└────┬────┘
     │ auto-generated code
     ↓
┌─────────────┐
│ execute_    │ ← executes any command, drives all software and hardware
│ command     │
│ (Limbs)     │
└──────┬──────┘
       │ actual execution
       ↓
Task complete ✅
Why Is It "Universal"?
| Capability | Description |
|---|---|
| Shell commands | File operations, process management, system administration |
| Python scripts | Data processing, web scraping, machine learning |
| Any CLI tool | git, docker, ffmpeg, aws... |
| Hardware control | Indirect control via software: screen, audio, serial ports... |
| Networking | API calls, SSH remoting, IoT devices |
execute_command can run Python scripts → Python can do anything → so the Agent can do anything
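This brain-plus-limbs loop can be sketched in a few lines of Python. The sketch below is a hedged illustration of the architecture, not the script's actual code; `agent_step`, `execute_command`, and `execute_script` are names invented for the sketch:

```python
import os
import subprocess
import sys
import tempfile

def execute_command(command: str, timeout: int = 300) -> dict:
    """The 'limbs': run one shell command and capture its output."""
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=timeout)
    return {"ok": result.returncode == 0,
            "stdout": result.stdout, "stderr": result.stderr}

def execute_script(code: str, timeout: int = 600) -> dict:
    """Write generated Python to a temp file and run it with the
    same interpreter, mirroring the write-then-run flow."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False,
                                     encoding="utf-8") as f:
        f.write(code)
        path = f.name
    try:
        return execute_command(f'"{sys.executable}" "{path}"', timeout)
    finally:
        os.unlink(path)

def agent_step(decision: dict) -> dict:
    """Dispatch one LLM decision: a shell command or a Python script."""
    if decision["type"] == "command":
        return execute_command(decision["content"])
    if decision["type"] == "script":
        return execute_script(decision["content"])
    raise ValueError(f"unknown decision type: {decision['type']}")
```

In the real script the decision dict is produced by the LLM; in this sketch it would simply be passed in, e.g. `agent_step({"type": "command", "content": "echo hi"})`.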
Quick Start
🏔️ Mode 1: Standalone (requires an API key)
The script is self-sufficient: it calls the LLM API and executes commands on its own.
```bash
# Interactive start (recommended)
python scripts/universal_agent.py

# Single task
python scripts/universal_agent.py --run "list all files in the current directory"

# Using environment variables
set LLM_API_KEY=sk-xxx
python scripts/universal_agent.py --run "analyze the data"
```
🌉 Mode 2: Bridge Execution (no API key needed; recommended for Skill integration)
The script is driven by an external Agent. The external Agent supplies the "brain" (LLM capability) while the script provides the "limbs" (execution + safety + retry + memory). Any tool with an LLM and command execution can drive it: WorkBuddy, Cursor, Continue.dev, Aider, Cline, etc.
```bash
# View the full protocol specification
python scripts/universal_agent.py --bridge-info

# Run a task in bridge mode
set UA_THINK={"type":"command","content":"dir /b","explanation":""}
python scripts/universal_agent.py --backend bridge --run "list the files in the current directory"
```
```python
# Use as an imported module
from universal_agent import UniversalAgent, AgentBridge

bridge = AgentBridge()
bridge.set_response('think', '{"type":"command","content":"dir /b"}')
bridge.set_response('summarize', 'Done: listed the 15 files in the current directory')

agent = UniversalAgent(api_key="bridge", backend="bridge")
agent.run("list the files in the current directory")
```
Communication (environment variables):
| Environment variable | Purpose | Format |
|---|---|---|
| `UA_THINK` | Task decision result | JSON: `{"type":"command","content":"...","explanation":"..."}` |
| `UA_GENERATE_SCRIPT` | Generated Python script | Complete source code |
| `UA_SUMMARIZE` | Summary of execution results | Natural-language text |
| `UA_DEBUG_AND_FIX` | Repaired code | Fixed code |
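On the driving side, a host Agent could assemble these variables and launch the script roughly as follows. This is a hypothetical sketch; only the variable names and the JSON shape come from the protocol, and `build_bridge_env` is an invented helper:

```python
import json
import os

def build_bridge_env(decision: dict, summary: str) -> dict:
    """Compose the environment for one bridge-mode invocation.

    UA_THINK carries the external agent's decision as JSON;
    UA_SUMMARIZE carries the human-readable summary the script
    should emit at the end.
    """
    env = dict(os.environ)
    env["UA_THINK"] = json.dumps(decision, ensure_ascii=False)
    env["UA_SUMMARIZE"] = summary
    return env

env = build_bridge_env(
    {"type": "command", "content": "dir /b", "explanation": ""},
    "Done: listed the files in the current directory",
)
# The driving agent would then run something like:
# subprocess.run(
#     ["python", "scripts/universal_agent.py", "--backend", "bridge",
#      "--run", "list the files in the current directory"],
#     env=env,
# )
```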
📖 Mode 3: Skill Simulation (no script execution required)
An Agent that loads this Skill reads SKILL.md and simulates the entire workflow with its own capabilities. Zero configuration, but the script's built-in safety/retry/memory features are not available.
Mode 4: Import as a Module (standalone mode)
```python
from universal_agent import UniversalAgent

agent = UniversalAgent(
    api_key="your-key",
    model="gpt-4o",
    base_url="https://api.openai.com/v1"
)

# Single run
agent.run("organize the files in the current directory by type")

# Interactive mode
agent.chat()
```
Mode 5: Environment Variables (suited to servers)
```bash
export LLM_API_KEY="sk-xxx"
export LLM_MODEL="gpt-4o"
export LLM_BASE_URL="https://api.openai.com/v1"
python universal_agent.py --run "analyze the data"
```
Supported LLM Providers
| Provider | Models | base_url |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini | https://api.openai.com/v1 |
| DeepSeek | deepseek-chat, deepseek-reasoner | https://api.deepseek.com |
| Qwen (通义千问) | qwen-max, qwen-turbo | https://dashscope.aliyuncs.com/compatible-mode/v1 |
| Zhipu GLM | glm-4-plus | https://open.bigmodel.cn/api/paas/v4 |
| Local Ollama | llama3, qwen2, any model | http://localhost:11434/v1 |
| Groq | llama-3.1-70b-versatile | https://api.groq.com/openai/v1 |
| Other | Any OpenAI-compatible API | - |
Usage Examples
Simple task (auto-generates a shell command)
You: list the files in the current directory
Agent: $ dir /b
(shows the output)
✅ Done: the current directory contains 15 files and 3 folders...
Complex task (auto-generates a Python script)
You: analyze the sales trend in sales.csv and produce a chart
Agent: (auto-generates and runs ~50 lines of Python)
- Imports pandas and matplotlib
- Reads the CSV data
- Computes summary statistics
- Plots the trend chart
- Saves it as an image
✅ Task complete! Sales trend chart saved as trend.png...
Development task
You: write a quicksort implementation and benchmark it
Agent: (generates a complete Python script)
✅ Quicksort implemented
- Random array of 10,000 elements: sorted in 0.012 s
- Verified against the built-in sorted: results match
System administration
You: check which ports are in use
Agent: $ netstat -ano | findstr LISTENING
(shows port usage)
✅ Found 5 active listening ports...
Project Structure
scripts/
├── universal_agent.py # Main program (single file, ~900 lines)
├── config.json # Configuration file (fill in your API key and go)
references/
└── README.md # This document
Features
✅ Fully automated workflow
- The user says a single sentence
- Automatic understand → plan → generate code → execute → summarize
- No manual intervention
🔒 Safety mechanisms
- High-danger command detection (rm -rf /, format C:, etc.)
- Forced confirmation for dangerous operations
- Warnings for medium-risk operations
- Optional dangerous mode (skips confirmations)
🧠 Smart decision-making
- Automatically judges task complexity
- Simple task → generates a shell command
- Complex task → generates a Python script
- Multi-step task → generates a command sequence
🔄 Self-repair
- Automatically analyzes the cause when execution fails
- The LLM fixes the code and re-executes
- Up to 2 retries by default
💾 Persistent memory
- Keeps execution history across sessions
- Stores learned knowledge
- Persists variables
- Auto-loads them on the next start
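A minimal sketch of what such cross-session persistence can look like (illustrative only; the script's real `ContextManager` schema and field names may differ):

```python
import json
import os

class ContextMemory:
    """Task records and variables saved to a JSON file and
    reloaded on the next start. Schema is illustrative."""

    def __init__(self, path: str = "universal_agent_memory.json"):
        self.path = path
        self.data = {"history": [], "variables": {}}
        if os.path.exists(path):                 # auto-load on start
            with open(path, encoding="utf-8") as f:
                self.data = json.load(f)

    def add_task_record(self, task: str, success: bool) -> None:
        self.data["history"].append({"task": task, "success": success})

    def save(self) -> None:
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.data, f, ensure_ascii=False, indent=2)
```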
🌐 Cross-platform support
- Windows (cmd.exe)
- macOS (bash/zsh)
- Linux (bash)
📊 Execution history
- View recent execution records in real time
- Includes success/failure status
- Shows elapsed time
Dependencies
Zero external dependencies (standard library only):
- os, sys --- system operations
- subprocess --- command execution
- json --- JSON parsing
- re --- regular expressions
- time/datetime --- time handling
- urllib --- HTTP requests (fallback)
Optional:
- requests --- better HTTP support (recommended): pip install requests
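The requests-preferred, urllib-fallback pattern behind this zero-dependency design looks roughly like this (a sketch, not the script's exact code; `post_json` is an invented name):

```python
import json
import urllib.request

def post_json(url: str, payload: dict, headers: dict) -> dict:
    """POST JSON to an OpenAI-compatible endpoint.

    Prefers `requests` when installed; otherwise falls back to the
    standard-library urllib, so there is no hard dependency.
    """
    try:
        import requests
        resp = requests.post(url, json=payload, headers=headers, timeout=60)
        return resp.json()
    except ImportError:
        data = json.dumps(payload).encode("utf-8")
        req = urllib.request.Request(
            url, data=data,
            headers={**headers, "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=60) as resp:
            return json.loads(resp.read().decode("utf-8"))
```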
Recommended Free Options
1️⃣ Ollama + local model (completely free)
```bash
# Install Ollama
# Windows: https://ollama.ai/download
# Mac: brew install ollama
# Linux: curl -fsSL https://ollama.ai/install.sh | sh

# Run a local model
ollama run llama3      # recommended, 7B parameters
# or
ollama run qwen2       # Qwen 2
ollama run codellama   # code-specialized

# Then pick the Ollama preset when starting the Agent
```
Pros: free, unlimited use, private
Cons: needs a reasonably powerful machine (8 GB+ VRAM recommended)
2️⃣ DeepSeek (excellent value)
- Sign up: https://platform.deepseek.com
- Price: about ¥1 per million tokens
- Performance: close to GPT-4 level
3️⃣ Groq Cloud (free tier)
- Sign up: https://console.groq.com
- Highlight: ultra-fast inference
- Free tier: a monthly allotment of free API calls
Configuration
config.json format
```json
{
  "api_key": "sk-your-api-key",
  "model": "gpt-4o",
  "base_url": "https://api.openai.com/v1",
  "dangerous_mode": false,
  "auto_save_memory": true,
  "memory_file": "universal_agent_memory.json",
  "command_timeout": 300,
  "script_timeout": 600
}
```
Environment variables
```bash
export LLM_API_KEY="sk-xxx"
export LLM_MODEL="gpt-4o"
export LLM_BASE_URL="https://api.openai.com/v1"
```
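Putting the two sources together, a loader that lets environment variables override config.json might look like this (illustrative sketch; the defaults and the merge logic are assumptions, only the LLM_API_KEY/LLM_MODEL/LLM_BASE_URL names come from the document):

```python
import json
import os

DEFAULTS = {"model": "gpt-4o", "base_url": "https://api.openai.com/v1"}
ENV_MAP = {"api_key": "LLM_API_KEY",
           "model": "LLM_MODEL",
           "base_url": "LLM_BASE_URL"}

def load_config(path: str = "config.json") -> dict:
    """Resolve settings with priority: env vars > config.json > defaults."""
    cfg = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            cfg.update(json.load(f))
    for key, env_name in ENV_MAP.items():
        if os.environ.get(env_name):          # environment variables win
            cfg[key] = os.environ[env_name]
    return cfg
```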
Interactive Commands
The following special commands are available in interactive mode:
| Command | Function |
|---|---|
| `help` | Show help |
| `history` | View execution history |
| `context` | View context |
| `stats` | Statistics |
| `clear` | Clean up temporary files |
| `var X Y` | Set a variable |
| `remember X` | Remember information |
| `exit` | Quit |
Safety Notes
⚠️ Important reminders
- Do not enable dangerous mode in shared environments
- Periodically clean up generated temporary scripts (_agent_task_*.py)
- Keep sensitive information out of task descriptions
- Guard your API key; never commit it to Git
- Double-check any delete operation before confirming
Technical Architecture
universal_agent.py (~1000+ lines)
│
├── class AgentBridge (external-Agent bridge interface --- Mode 2)
│   ├── think()             # receives decision results from the external Agent
│   ├── generate_script()   # receives the generated Python script
│   ├── summarize()         # receives summary text
│   ├── debug_and_fix()     # receives repaired code
│   └── set_response()      # sets responses programmatically
│
├── class LLMBrain (LLM HTTP interface --- Mode 1)
│   ├── think()             # core: understand intent, generate command/script
│   ├── generate_script()   # dedicated Python script generation
│   ├── summarize()         # translates results into human language
│   └── debug_and_fix()     # self-repair on errors
│
├── class UniversalExecutor (universal command executor --- shared by all modes)
│   ├── execute()           # main entry: run a command or script
│   ├── _execute_command()  # shell command execution
│   ├── _execute_script()   # Python script execution
│   └── _check_danger()     # danger-level check
│
├── class ContextManager (context/memory management --- shared by all modes)
│   ├── add_task_record()   # records a task
│   ├── get_context_string()# context for the LLM/Agent
│   ├── save() / load()     # persistence
│   └── get_stats()         # statistics
│
└── class UniversalAgent (main class, composes the three above)
    ├── run()               # single task (fully automated)
    ├── chat()              # interactive mode
    ├── batch_run()         # batch tasks
    └── backend="llm|bridge" # backend selection
License
MIT License --- free to use, modify, and distribute
Made with ❤️ by 极简万能 Agent
name: universal-agent
author: 王教成 Wang Jiaocheng (波动几何)
description: >
This skill should be used when the user needs to execute tasks through a complete
automated workflow: understand natural language intent, dynamically generate commands
or Python scripts, auto-execute them via shell, and summarize results. This is a
minimal universal agent implementation based on "LLM brain + command executor limbs"
architecture. Use this when users say things like "help me do X automatically",
"generate code and run it", "universal agent", "execute task end-to-end", or when
tasks require dynamic code generation and execution with automatic error recovery.
Universal Agent Skill
A minimal universal AI agent that automates end-to-end task execution: understand user intent in natural language, generate commands or scripts, execute them, analyze results, and self-recover from errors.
Architecture
Natural Language Input
↓
┌─────────────┐
│ LLM (Brain) │ Understand intent, generate command/Python script
└──────┬──────┘
│ Auto-generate code/command
↓
┌─────────────────┐
│ Command Executor │ Execute any command, control software & hardware
│ (Limbs) │
└───────┬─────────┘
│ Actual execution
↓
Task Complete ✅
File Structure
universal-agent/
├── SKILL.md # This file (skill definition)
├── scripts/
│ ├── universal_agent.py # Main program (complete standalone implementation)
│ └── config.json # Configuration file (fill in API key for standalone mode)
└── references/
└── README.md # Detailed usage documentation
When to Use
Use this skill when:
- User describes a task in natural language that requires automated execution
- Task needs dynamic code generation (Python script) and immediate execution
- Task involves file operations, data processing, system administration, CLI tools, API calls
- User wants end-to-end automation without manual intervention
- Keywords: "万能agent", "universal agent", "自动执行", "动态生成代码", "生成并执行", "帮我做XX"
How It Works
Automated Workflow (4 Steps)
- Think --- LLM understands task, judges complexity, decides whether to generate a shell command or Python script
- Execute --- Auto-write file → Run command/script → Capture output
- Fix --- On error, LLM analyzes error, auto-fixes code, retries (up to 2 times)
- Summarize --- Translates technical output into human-friendly language
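Step 3 (Fix) can be captured in a compact retry loop (a sketch with invented callback names, not the script's actual control flow):

```python
def run_with_retry(code, execute, debug_and_fix, max_retries=2):
    """Execute generated code; on failure, ask the brain for a fix and retry.

    `execute` runs code and returns (ok, output); `debug_and_fix` is the
    brain callback that returns repaired code. Both names are illustrative.
    """
    for attempt in range(max_retries + 1):
        ok, output = execute(code)
        if ok:
            return output
        if attempt < max_retries:
            code = debug_and_fix(code, output)   # brain repairs the code
    raise RuntimeError(f"still failing after {max_retries} retries: {output}")
```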
Why It's "Universal"
| Capability | Description |
|---|---|
| Shell Commands | File ops, process management, system admin |
| Python Scripts | Data processing, web scraping, ML, image processing |
| CLI Tools | git, docker, ffmpeg, aws, any CLI |
| Hardware Control | Serial/GPIO/network-controlled physical devices |
| API Calls | Any HTTP API |
Command executor can run Python → Python can do anything → Agent can do anything
Usage Modes
This skill supports three distinct usage modes, each suited to different scenarios:
Mode 1: Standalone
Run the bundled script directly as an independent program. The script handles everything internally --- LLM calls, command execution, safety checks, retries, memory.
```bash
# Single-task mode (needs API key)
python scripts/universal_agent.py --run "task description"

# Interactive mode
python scripts/universal_agent.py

# With environment variables
set LLM_API_KEY=sk-xxx && python scripts/universal_agent.py --run "task description"
```
What works: Safety ✅ | Auto-retry ✅ | Memory persistence ✅ | Needs API Key.
Mode 2: Bridge Execution (recommended)
Execute the script with --backend bridge. The script's brain is provided by the external Agent that loaded this Skill, while the script itself handles execution, safety, retries, and memory. Any Agent with an LLM and command execution can use this.
```bash
# Basic bridge execution
python scripts/universal_agent.py --backend bridge --run "task description"

# View the full protocol spec
python scripts/universal_agent.py --bridge-info
```
How it works --- the Agent drives the script through environment variables:
┌─────────────────────────────────────────────────────────────┐
│ Bridge Mode Flow │
├─────────────────────────────────────────────────────────────┤
│ │
│ ① User: "list the files in the current directory" │
│ ↓ │
│ ② External Agent LLM → generates decision │
│ set UA_THINK={"type":"command","content":"dir /b"} │
│ ↓ │
│ ③ Agent executes: │
│ python ... --backend bridge --run "list the files in the current directory" │
│ ↓ │
│ ④ Script reads UA_THINK → runs "dir /b" │
│ → safety check passes → captures output │
│ ↓ (if error) │
│ ⑤ Script requests fix via UA_DEBUG_AND_FIX env var │
│ ↓ │
│ ⑥ External Agent provides fixed code │
│ set UA_DEBUG_AND_FIX="fixed_command_or_script" │
│ ↓ │
│ ⑦ Script re-executes → success │
│ ↓ │
│ ⑧ Script reads UA_SUMMARIZE for final output │
│ → returns structured JSON result │
│ │
└─────────────────────────────────────────────────────────────┘
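On the script side, step ④ amounts to reading and validating `UA_THINK` (minimal sketch under assumed error handling; the real script covers more cases):

```python
import json
import os

def read_bridge_decision() -> dict:
    """Parse the decision the external Agent placed in UA_THINK."""
    raw = os.environ.get("UA_THINK")
    if not raw:
        raise RuntimeError("UA_THINK not set: the external agent must provide a decision")
    decision = json.loads(raw)
    if decision.get("type") not in ("command", "script"):
        raise ValueError(f"unexpected decision type: {decision.get('type')}")
    return decision
```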
Environment Variable Protocol:
| Variable | When Used | Format |
|---|---|---|
| `UA_THINK` | Step 1 --- decision | JSON: `{"type":"command","content":"...","explanation":"..."}` |
| `UA_GENERATE_SCRIPT` | When type=script and code is needed | Complete Python source code |
| `UA_SUMMARIZE` | Final step --- result summary | Natural-language summary text |
| `UA_DEBUG_AND_FIX` | On error retry --- fixed code | Fixed Python/shell code |
What works: Safety ✅ | Auto-retry ✅ | Memory persistence ✅ | No API Key needed (Agent provides LLM).
Who can use this: WorkBuddy, Cursor, Continue.dev, Aider, Cline, any AI IDE/tool with LLM + shell access.
Mode 3: Inline Simulation
The loading Agent reads this SKILL.md, learns the architecture pattern, and simulates the workflow using its own native capabilities without executing the script at all. The script then serves only as a reference/teaching example.
- The Agent uses its own LLM instead of `LLMBrain`
- The Agent uses its own `execute_command` instead of `UniversalExecutor`
- The Agent does its own summarization
What works: Fastest ⚡ | No setup | Safety ❌ Retry ❌ Memory ❌ (script features unused).
Core Components
scripts/universal_agent.py --- Main Program
Four core classes implementing the full agent:
| Class | Role | Key Methods |
|---|---|---|
| `LLMBrain` | Brain --- HTTP LLM interface (Mode 1) | think(), generate_script(), summarize(), debug_and_fix() |
| `AgentBridge` | Brain --- external Agent bridge (Mode 2) | think(), generate_script(), summarize(), debug_and_fix(), set_response() |
| `UniversalExecutor` | Limbs (command execution) | execute(), _execute_command(), _execute_script(), _check_danger() |
| `ContextManager` | Memory (state management) | add_task_record(), get_context_string(), save()/load() |
| `UniversalAgent` | Main orchestrator | run(), chat(), batch_run() |
See references/README.md for full API documentation and examples.
Safety Mechanisms
The executor includes built-in danger detection:
| Level | Examples | Handling |
|---|---|---|
| 🔴 High | rm -rf /, format C: | Forced confirmation required |
| 🟡 Medium | pip uninstall, sudo | Warning prompt |
| 🟢 Low | ls, cat, python script.py | Direct execution |
Danger patterns are defined in HIGH_DANGER_PATTERNS and MEDIUM_DANGER_PATTERNS within the script.
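A check of this shape could be implemented as follows. The pattern lists here are illustrative stand-ins modeled on the table above; the script's actual `HIGH_DANGER_PATTERNS` / `MEDIUM_DANGER_PATTERNS` may differ:

```python
import re

# Illustrative pattern lists (not the script's actual contents)
HIGH_DANGER_PATTERNS = [r"rm\s+-rf\s+/", r"format\s+c:"]
MEDIUM_DANGER_PATTERNS = [r"pip\s+uninstall", r"\bsudo\b"]

def check_danger(command: str) -> str:
    """Classify a command as high / medium / low danger."""
    cmd = command.lower()
    if any(re.search(p, cmd) for p in HIGH_DANGER_PATTERNS):
        return "high"     # forced confirmation
    if any(re.search(p, cmd) for p in MEDIUM_DANGER_PATTERNS):
        return "medium"   # warning prompt
    return "low"          # direct execution
```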
Configuration
Mode 1 (Standalone) --- Needs API Key
Option A --- Config File:
Edit scripts/config.json and fill in your API key.
Option B --- Environment Variables:
```bash
set LLM_API_KEY=your-key-here
set LLM_MODEL=gpt-4o
set LLM_BASE_URL=https://api.openai.com/v1
```
Option C --- Local Ollama (Free):
```bash
ollama run llama3
# Then select the ollama_llama3 preset when starting the script
```
Configuration priority: Environment variables > config.json > Interactive input.
Mode 2 (Bridge) --- No API Key Needed
The external Agent provides all LLM capabilities. Configure only optional settings:
```bash
# Optional: change input source from env to file
set UA_INPUT_SOURCE=file

# Optional: skip safety confirmations (not recommended)
# Use --dangerous flag instead
```
Mode 3 (Simulation) --- No Configuration Needed
Agent uses its own native capabilities. Nothing to configure.
Supported LLM Providers
| Provider | Models | base_url |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini | https://api.openai.com/v1 |
| DeepSeek | deepseek-chat, deepseek-reasoner | https://api.deepseek.com |
| Qwen | qwen-max, qwen-turbo | https://dashscope.aliyuncs.com/compatible-mode/v1 |
| Zhipu GLM | glm-4-plus | https://open.bigmodel.cn/api/paas/v4 |
| Local Ollama | llama3, qwen2, any model | http://localhost:11434/v1 |
| Groq | llama-3.1-70b-versatile | https://api.groq.com/openai/v1 |
| Any OpenAI-compatible API | any | your-url |
Platform Support
✅ Cross-platform --- Windows, macOS, Linux:
| OS | Shell Backend |
|---|---|
| Windows | cmd.exe /c (with CREATE_NO_WINDOW) |
| macOS | bash (shell=True) |
| Linux | bash (shell=True) |
All file I/O uses UTF-8 encoding. Python script execution uses sys.executable for platform-agnostic invocation.
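The platform dispatch described in the table can be sketched like this (illustrative; the script's actual flags and options may differ):

```python
import platform
import subprocess

def run_shell(command: str, timeout: int = 300):
    """Dispatch a shell command through the platform's shell backend."""
    if platform.system() == "Windows":
        # cmd.exe backend; CREATE_NO_WINDOW hides the console window
        flags = getattr(subprocess, "CREATE_NO_WINDOW", 0)
        return subprocess.run(["cmd.exe", "/c", command],
                              capture_output=True, text=True,
                              encoding="utf-8", timeout=timeout,
                              creationflags=flags)
    # macOS / Linux: bash via shell=True
    return subprocess.run(command, shell=True, capture_output=True,
                          text=True, encoding="utf-8", timeout=timeout)
```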
Dependencies
✅ Zero external dependencies --- Python standard library only:
- os, sys --- system operations
- subprocess --- command execution
- json, re --- JSON parsing and regex
- time/datetime --- time handling
- urllib --- HTTP requests (fallback)
Optional:
- requests --- better HTTP support (pip install requests)
Free Options
- Ollama + local model (completely free, unlimited, private)
- DeepSeek (~¥1/million tokens, excellent cost-performance)
- Groq Cloud (free tier available, ultra-fast inference)