Minimal Agent: A Minimal Operating-System Control Agent

Architecture

User input → LLM decision → system-controller script execution → result returned → loop
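
The loop above can be sketched in a few lines of Kotlin. This is a minimal illustration rather than the code in run.kt: `Decision`, `runAgent`, and the `decide`/`execute` callbacks are hypothetical stand-ins for the LLM call and the script runner, and the iteration cap mirrors the `max_iterations` setting in config.toml.

```kotlin
// Hypothetical types: what the LLM may decide on each turn.
sealed class Decision {
    data class RunCommand(val command: String) : Decision()
    data class Finish(val answer: String) : Decision()
}

/** Runs the decide → execute → feed back loop until the LLM finishes or the cap is hit. */
fun runAgent(
    userInput: String,
    decide: (history: List<String>) -> Decision,  // stand-in for the LLM call
    execute: (command: String) -> String,         // stand-in for script execution
    maxIterations: Int = 20,                      // guard against endless loops
): String {
    val history = mutableListOf("user: $userInput")
    for (i in 1..maxIterations) {
        when (val decision = decide(history)) {
            is Decision.Finish -> return decision.answer
            is Decision.RunCommand -> history.add("result: ${execute(decision.command)}")
        }
    }
    return "Stopped after $maxIterations iterations"
}

fun main() {
    // A scripted "LLM": run one command, then report the last result.
    val answer = runAgent(
        userInput = "Set the volume to 50%",
        decide = { h -> if (h.size == 1) Decision.RunCommand("volume set --level 50") else Decision.Finish(h.last()) },
        execute = { cmd -> "ok($cmd)" },
    )
    println(answer)
}
```

Feeding each result back into the history is what lets the LLM chain several atomic operations without a predefined workflow.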

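Version comparison

Feature	Kotlin version	Python Skill version
Runtime environment	Standalone application	WorkBuddy Skill
Supported modes	text (V1) + function (V2) + auto + mixed + force_text + force_function	text (V1) + function (V2) + auto + mixed + force_text + force_function
Mode switching	`mode` field in `config.toml` + command-line flags	`mode` field in `config.toml` + command-line flags
Default mode	auto (automatic detection)	auto (automatic detection)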

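Smart mode selection

Both versions implement the same mode-selection mechanism.

Mode priority

Command-line flags: --text, --function, --mixed, --auto (highest priority)
Configuration file: the mode field in config.toml
Automatic detection: when mode = "auto", the agent checks whether system-controller is available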
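
The priority order can be illustrated with a small resolver. This is a sketch under stated assumptions, not the logic in run.kt: `Mode`, `resolveMode`, and the `controllerAvailable` flag are hypothetical names, and the `force_*` modes are left out for brevity.

```kotlin
// Simplified mode set for illustration (force_text / force_function omitted).
enum class Mode { TEXT, FUNCTION, AUTO, MIXED }

/**
 * Priority: command-line flag > config.toml value > automatic detection.
 * `controllerAvailable` is the result of a hypothetical availability probe.
 */
fun resolveMode(args: List<String>, configMode: String, controllerAvailable: Boolean): Mode {
    val fromCli = when {
        "--text" in args -> Mode.TEXT
        "--function" in args -> Mode.FUNCTION
        "--mixed" in args -> Mode.MIXED
        "--auto" in args -> Mode.AUTO
        else -> null
    }
    // Fall back to the configured mode when no flag was given.
    val requested = fromCli ?: Mode.valueOf(configMode.uppercase())
    // AUTO is resolved last: V2 if system-controller is usable, otherwise V1.
    return if (requested == Mode.AUTO) {
        if (controllerAvailable) Mode.FUNCTION else Mode.TEXT
    } else {
        requested
    }
}

fun main() {
    println(resolveMode(listOf("--text"), "function", controllerAvailable = true))
    println(resolveMode(emptyList(), "auto", controllerAvailable = false))
}
```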

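The 6 run modes

Mode	Description	Capability scope	Use case
`function` (V2)	Structured calls via Function Calling	Restricted to the 55 predefined tools; can only invoke system-controller scripts	Production and security-sensitive environments
`text` (V1)	The LLM generates text commands, which are extracted with a regular expression	Unrestricted; can run any system command (file modification, script execution, network operations, database operations, and so on)	Development, debugging, and operations work that needs full control
`auto` (recommended)	Detection: uses V2 when system-controller is available, V1 when it is missing or unusable	Adapts automatically	General use, no manual switching
`mixed` (advanced)	Analyzes the task and interleaves V1 and V2 commands in sequence	Adapts per step	Complex multi-step tasks that need both V1 and V2
`force_text`	Forces V1 whether or not system-controller is present	Unrestricted	Full control is needed; system-controller is ignored
`force_function`	Forces V2 whether or not system-controller is available	Restricted to the 55 tools	Structured calls are mandatory; availability is ignored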
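
To make the "regular expression extraction" of text mode concrete, here is a minimal sketch. It assumes the LLM wraps each command in a bash or powershell fenced block; the names `fence`, `commandBlock`, and `extractCommands` are hypothetical, and the actual pattern in run.kt may differ.

```kotlin
// The fence marker is built at runtime so this listing contains no literal code fences.
val fence = "`".repeat(3)

// Assumed reply format: commands arrive in bash or powershell fenced blocks.
val commandBlock = Regex("$fence(?:bash|powershell)\\s*\\n(.*?)$fence", RegexOption.DOT_MATCHES_ALL)

/** Returns the body of every fenced command block, in order of appearance. */
fun extractCommands(reply: String): List<String> =
    commandBlock.findAll(reply)
        .map { it.groupValues[1].trim() }
        .filter { it.isNotEmpty() }
        .toList()

fun main() {
    val reply = "Saving the text.\n${fence}bash\necho \"hello\" > output.txt\n$fence\nDone."
    extractCommands(reply).forEach(::println)
}
```

Because anything inside such a block would be executed, this path has none of the guardrails of the 55-tool function mode, which is why the table reserves it for development and operations work.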

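Mode switching and combined use

Both versions support switching modes dynamically and combining them.

Dynamic switching: the mode can be changed by command within a single session.

# Specify the mode at startup
kotlinc run.kt -script -- --text --interactive

# Switch during a session
"Switch to V2 mode"
"Run this command in V1 mode"
"Switch back to auto mode"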

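Combined use: V1 and V2 can be combined within a single complex task.

# Run a complex task: hardware control + file operations + system management
kotlinc run.kt -script -- --text "Take a screenshot, OCR the text, save it to a file, then adjust the volume in V2 mode"

Smart adaptation: auto mode picks the best mode for the task type
- Hardware control, window management → V2 mode (if available)
- File operations, script execution → V1 mode (unrestricted)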

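Unrestricted capabilities of V1 mode

In V1 (text) mode, Minimal Agent can run any system command, including but not limited to:

File operations: create, delete, edit, move, and copy files
Script execution: run Python, PowerShell, and Bash scripts
Network operations: HTTP requests, FTP transfers, port scanning
Database operations: SQL queries, data import and export
System administration: user management, service control, registry editing
Software installation: installers, package manager updates
Security operations: firewall configuration, permission management

Examples:

# V1 mode can run any command
kotlinc run.kt -script -- --text "Create test.txt and write content to it"
kotlinc run.kt -script -- --text "Run a Python script to process the data"
kotlinc run.kt -script -- --text "Query the database and generate a report"
kotlinc run.kt -script -- --text "Install a package and configure the environment"

V2 mode, by contrast, focuses on safe, reliable hardware and software control and is restricted to the 55 predefined tools.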

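Detection logic

system-controller available → use function mode (V2)
system-controller unavailable → use text mode (V1)
Reasons for unavailability: the directory does not exist, scripts are missing, or the Python environment is unusable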
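
The three failure reasons map naturally onto three checks. The sketch below is an assumption-laden illustration: `controllerAvailable` and `pythonWorks` are hypothetical helpers, and it assumes the scripts sit directly inside the skill directory, which the README does not confirm.

```kotlin
import java.io.File
import java.util.concurrent.TimeUnit

/** Returns true when `pythonPath --version` starts and exits with code 0 within 5 seconds. */
fun pythonWorks(pythonPath: String): Boolean = try {
    val process = ProcessBuilder(pythonPath, "--version").redirectErrorStream(true).start()
    process.inputStream.readBytes()  // drain output so the process cannot block on a full pipe
    process.waitFor(5, TimeUnit.SECONDS) && process.exitValue() == 0
} catch (e: Exception) {
    false  // interpreter not found or not executable
}

/** Checks, in order: directory exists, every required script is present, Python is usable. */
fun controllerAvailable(skillDir: File, requiredScripts: List<String>, pythonOk: () -> Boolean): Boolean =
    skillDir.isDirectory &&
        requiredScripts.all { File(skillDir, it).isFile } &&
        pythonOk()

fun main() {
    val skillDir = File(System.getProperty("user.home"), ".workbuddy/skills/system-controller")
    val available = controllerAvailable(skillDir, listOf("window_manager.py")) { pythonWorks("python") }
    println(if (available) "function mode (V2)" else "text mode (V1)")
}
```

Passing the Python probe as a lambda keeps the cheap file checks first and makes the function easy to exercise without a real interpreter.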

File structure

minimal-agent/
├── run.kt          # All code: configuration + LLM calls + script execution + dual-mode routing
├── config.toml     # Configuration (API, mode, paths, etc.)
└── README.md       # This file: usage instructions

Dependencies

Kotlin runtime: kotlinc (download from jetbrains.com)
Python 3.x: runs the system-controller scripts
curl: calls the LLM API (ships with the system)
system-controller Skill: must first be installed to ~/.workbuddy/skills/system-controller/

Downloading and installing system-controller

Both versions of Minimal Agent need system-controller for full hardware and software control:

Official installation (recommended):
- Install from the Tencent Cloud skill marketplace: https://skillhub.tencent.com/skills/system-controller
- Or search for "system-controller" in the WorkBuddy skill center

GitHub installation:

# Clone into the WorkBuddy skills directory
git clone https://github.com/clawhub/wangjiaocheng/system-controller.git ~/.workbuddy/skills/system-controller

# Or from the mirror
git clone https://clawhub.ai/wangjiaocheng/system-controller ~/.workbuddy/skills/system-controller

Manual installation:
- Download: https://clawhub.ai/wangjiaocheng/system-controller/archive/main.zip
- Extract to: ~/.workbuddy/skills/system-controller/

Installing dependencies

# 1. Install the Kotlin compiler (if not already installed)
#    Download: https://github.com/JetBrains/kotlin/releases
#    Extract it and add the bin directory to PATH

# 2. Install Python (if not already installed)
#    https://python.org

# 3. Install the system-controller Skill
#    Install it from the WorkBuddy skill marketplace, or:
#    git clone https://github.com/your-repo/system-controller.git ~/.workbuddy/skills/system-controller

# 4. Configure the API key
#    export LLM_API_KEY="sk-..."

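Running

# Interactive mode (auto-detect)
kotlinc run.kt -script

# Interactive mode (force V1)
kotlinc run.kt -script -- --text

# Interactive mode (force V2)
kotlinc run.kt -script -- --function

# Interactive mode (mixed mode, supports complex multi-step tasks)
kotlinc run.kt -script -- --mixed

# Single-command mode (auto-detect)
kotlinc run.kt -script -- "Open Notepad for me"

# Single-command mode (force V1)
kotlinc run.kt -script -- --text "Take a screenshot, then OCR the text"

# Single-command mode (force V2)
kotlinc run.kt -script -- --function "Set the volume to 50%"

# Single-command mode (mixed mode, analyzes and runs a complex task)
kotlinc run.kt -script -- --mixed "Take a screenshot, OCR the text, save it to a file, then adjust the volume"

# Switching modes via configuration: edit the mode field in config.toml
#   mode = "auto"      # auto-detect (recommended)
#   mode = "function"  # force V2 mode
#   mode = "text"      # force V1 mode
#   mode = "mixed"     # mixed mode (analyzes the task and interleaves V1 and V2 commands)
#   mode = "force_function"  # force V2 mode, ignoring availability
#   mode = "force_text"      # force V1 mode, ignoring availability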

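Core design principles

Atomic operation inventory: each tool is one indivisible atomic operation
LLM orchestration: no predefined workflows; the LLM composes atomic operations freely
Stateless and dependency-free: each execution is independent and does not rely on the previous one
The "three don'ts":
- If the LLM can do it, code doesn't
- If a capability is already covered, don't duplicate it
- If a feature adds complexity, don't build it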

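Tool inventory (55 atomic operations)

Module	Tools	Capabilities
Window management	7	List / activate / close / minimize / maximize / resize / send keys
Process management	5	List / kill / start / details / system resources
Hardware control	16	Volume / brightness / power / network / WiFi / USB
GUI control	14	Mouse / keyboard / screenshot / OCR / image recognition
Serial communication	5	List ports / send / receive / send-and-receive / monitor mode
IoT/HTTP	8	HTTP GET/POST/PUT + 5 HomeAssistant operations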
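
Each entry in this inventory boils down to a name plus a function that turns structured arguments into a script command line. The following sketch shows that idea with two of the tools; `Tool`, `tools`, and `buildCommand` are hypothetical names for illustration, and how the script path is resolved against the skill directory is left out.

```kotlin
/** A tool is a semantic name plus a mapping from arguments to script arguments. */
data class Tool(val name: String, val toCommand: (Map<String, String>) -> List<String>)

// Two illustrative entries; the full inventory would hold all 55.
val tools = listOf(
    Tool("volume_set") { a ->
        listOf("hardware_controller.py", "volume", "set", "--level", a.getValue("level"))
    },
    Tool("process_list") { a ->
        listOf("process_manager.py", "list") + (a["name"]?.let { listOf("--name", it) } ?: emptyList())
    },
).associateBy { it.name }

/** Looks up the tool by name and prepends the interpreter; unknown names are rejected. */
fun buildCommand(python: String, toolName: String, args: Map<String, String>): List<String> {
    val tool = tools[toolName] ?: error("Unknown tool: $toolName")
    return listOf(python) + tool.toCommand(args)
}

fun main() {
    println(buildCommand("python", "volume_set", mapOf("level" to "50")))
    println(buildCommand("python", "process_list", emptyMap()))
}
```

Because the LLM can only pick a name from the map, a request for anything outside the inventory fails at lookup instead of reaching the shell.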

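Capability summary

Both versions provide the full capability stack:

1. Hardware and software control (V2 mode, 55 tools)

Window management: control application windows
Process management: manage system processes
Hardware control: adjust volume, brightness, and power
GUI automation: mouse, keyboard, screenshots, OCR
Serial communication: talk to hardware devices
IoT control: smart home, HTTP APIs

2. Unrestricted system commands (V1 mode)

File operations: create, delete, edit, and move files
Script execution: run any script (Python, Bash, PowerShell)
Network operations: HTTP requests, FTP, port scanning
Database operations: SQL queries, data import and export
System administration: user management, service control, registry
Software installation: install, configure, and update software
Security operations: firewall, permission management
Development tools: compile, build, test
Data handling: process CSV, JSON, and XML files
Any other task that can be done from the command line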

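3. Smart mode switching (V1 + V2 combined)

Dynamic switching: change modes at any time during a session
Combined use: mix V1 and V2 within a single task
Smart adaptation: auto mode selects the best mode automatically
Smart mixing: mixed mode analyzes the task and interleaves V1 and V2 commands in sequence

4. Key advantages

V2 mode: safe, reliable, and structured; suited to production
V1 mode: unrestricted, flexible, and powerful; suited to development and operations
Dual mode: both are available, chosen according to need
Automatic detection: no manual configuration; the best mode is selected automatically

In one sentence: both versions can do anything achievable from the command line through V1 mode, control hardware and software safely and reliably through V2 mode, and switch between or combine the two.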
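
One way to picture mixed mode is as an ordered plan whose steps are each tagged V1 or V2. The sketch below is hypothetical (`Step`, `runPlan`, and the two runner callbacks are not taken from run.kt) and only shows the dispatch, not the task analysis that the LLM performs.

```kotlin
/** A plan step is either a structured tool call (V2) or a raw shell command (V1). */
sealed class Step {
    data class ToolCall(val name: String, val args: Map<String, String> = emptyMap()) : Step()
    data class Shell(val command: String) : Step()
}

/** Runs the steps in order, routing each one to the runner for its mode. */
fun runPlan(
    steps: List<Step>,
    runTool: (Step.ToolCall) -> String,  // stand-in for a system-controller script call
    runShell: (Step.Shell) -> String,    // stand-in for unrestricted command execution
): List<String> = steps.map { step ->
    when (step) {
        is Step.ToolCall -> "V2:" + runTool(step)
        is Step.Shell -> "V1:" + runShell(step)
    }
}

fun main() {
    val plan = listOf(
        Step.ToolCall("screenshot"),
        Step.ToolCall("visual_ocr"),
        Step.Shell("echo \"text\" > output.txt"),
        Step.ToolCall("volume_set", mapOf("level" to "50")),
    )
    // Dry run: echo what would be executed instead of executing it.
    runPlan(plan, runTool = { it.name }, runShell = { it.command }).forEach(::println)
}
```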

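# Minimal Agent configuration (Kotlin version), saved as config.toml

[llm]
# LLM API settings (required by the standalone Kotlin version; the Python Skill version gets them from WorkBuddy)
api_url = "https://api.openai.com/v1/chat/completions"
api_key = "${LLM_API_KEY}"          # read from the environment variable
model = "gpt-4o"
max_tokens = 4096
temperature = 0.7

[system_controller]
skill_path = "~/.workbuddy/skills/system-controller"  # Skill installation path
python_path = "python"                                 # Python interpreter

[execution]
# Run mode (6 modes supported):
#   function - V2 mode, uses the system-controller tools
#   text     - V1 mode, runs arbitrary commands
#   auto     - auto-detect (default, recommended): V2 when system-controller is available, V1 when it is missing or unusable
#   mixed    - mixed mode: analyzes the task and combines V1 and V2 commands
#   force_text     - force V1 whether or not system-controller is present
#   force_function - force V2 whether or not system-controller is available
mode = "auto"
timeout_seconds = 30              # timeout in seconds for a single script run
max_iterations = 20              # maximum operations per conversation turn (guards against infinite loops)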
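
The `timeout_seconds` setting implies that every script run is bounded. A minimal way to enforce such a bound with the JDK's `ProcessBuilder` is sketched here; `RunResult` and `runWithTimeout` are hypothetical names, and the real runner in run.kt may capture output differently.

```kotlin
import java.io.File
import java.io.IOException
import java.util.concurrent.TimeUnit

data class RunResult(val success: Boolean, val output: String)

/** Runs a command, killing it if it exceeds the timeout; output goes through a temp file. */
fun runWithTimeout(command: List<String>, timeoutSeconds: Long): RunResult {
    val outFile = File.createTempFile("agent-", ".out")
    return try {
        val process = ProcessBuilder(command)
            .redirectErrorStream(true)   // merge stderr into stdout
            .redirectOutput(outFile)     // avoids blocking on a full pipe
            .start()
        if (process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            RunResult(process.exitValue() == 0, outFile.readText().trim())
        } else {
            process.destroyForcibly()
            RunResult(false, "Timed out after ${timeoutSeconds}s")
        }
    } catch (e: IOException) {
        RunResult(false, "Could not start command: ${e.message}")
    } finally {
        outFile.delete()
    }
}

fun main() {
    // Uses the configured default of 30 seconds; "python" is the configured interpreter name.
    println(runWithTimeout(listOf("python", "--version"), 30))
}
```

Redirecting output to a file rather than reading the pipe keeps the timeout honest, since a chatty child process can never stall the parent.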

#!/usr/bin/env kotlin
// ═══════════════════════════════════════════
// Minimal Agent: a minimal AI agent for operating-system control
// ═══════════════════════════════════════════
//
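// How it works: user input → LLM decision → run a system-controller script → return the result → loop
// Two execution modes are supported (auto, mixed, and force_* select between them):
//   - text mode (V1): the LLM produces command text, which is extracted with a regex and executed
//     **Capability scope: unrestricted; can run any system command (file modification, script execution, etc.)**
//   - function mode (V2): the LLM returns structured JSON (Function Calling), which the framework maps to scripts
//     **Capability scope: restricted to the 55 predefined tools; can only invoke system-controller scripts**
// Switch modes with the mode field in config.toml
//
// Usage:
//   kotlinc run.kt -script                    # interactive mode
//   kotlinc run.kt -script -- "Open Notepad for me" # single-command mode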

import java.io.File
import java.util.*
import java.util.concurrent.TimeUnit

// ═══════════════════════════════════════════
// Configuration
// ═══════════════════════════════════════════

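/** Run modes */
enum class RunMode {
    /** V1: the LLM generates command text, extracted by regex and executed (good compatibility) */
    TEXT,
    /** V2: structured Function Calling (fewer tokens, more stable) */
    FUNCTION,
    /** Auto: detects whether system-controller is available */
    AUTO,
    /** Force V1: use text mode whether or not system-controller is present */
    FORCE_TEXT,
    /** Force V2: use function mode whether or not system-controller is present */
    FORCE_FUNCTION,
    /** Mixed: analyzes the task and interleaves V1 and V2 commands in sequence */
    MIXED,
}

/** Full configuration */
data class Config(
    val apiUrl: String,          // LLM API URL
    val apiKey: String,          // API key
    val model: String,           // Model name
    val maxTokens: Int,          // Maximum number of output tokens
    val temperature: Double,     // Sampling temperature
    val skillPath: String,       // Installation path of the system-controller Skill
    val pythonPath: String,      // Path to the Python interpreter
    val timeoutSeconds: Long,    // Timeout in seconds for a single script run
    val maxIterations: Int,      // Maximum operations per conversation turn (guards against infinite loops)
    val mode: RunMode,           // Run mode: AUTO/TEXT/FUNCTION/MIXED/FORCE_TEXT/FORCE_FUNCTION
)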

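/** Loads the configuration from config.toml */
fun loadConfig(): Config {
    // Minimal TOML parsing (a library such as ktoml could be used in production; hand-written here to stay dependency-free)
    val lines = File("config.toml").readLines()
    
    // Match the key exactly (so "mode" does not match "model") and strip trailing "# ..." comments
    fun get(key: String): String = lines
        .map { it.trim() }
        .filter { !it.startsWith("#") && it.substringBefore("=").trim() == key }
        .map { it.substringAfter("=").substringBefore(" #").trim().trim('"') }
        .firstOrNull()
        ?: error("Missing required field in config file: $key")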
    
    return Config(
        apiUrl = get("api_url"),
        apiKey = System.getenv("LLM_API_KEY") ?: get("api_key"),
        model = get("model"),
        maxTokens = get("max_tokens").toInt(),
        temperature = get("temperature").toDouble(),
        skillPath = get("skill_path").replace("~", System.getProperty("user.home")),
        pythonPath = get("python_path"),
        timeoutSeconds = get("timeout_seconds").toLong(),
        maxIterations = get("max_iterations").toInt(),
        mode = when (get("mode").lowercase()) {
            "text", "v1" -> RunMode.TEXT
            "function", "v2", "fc" -> RunMode.FUNCTION
            "auto" -> RunMode.AUTO
            "mixed" -> RunMode.MIXED
            "force_text", "force-v1" -> RunMode.FORCE_TEXT
            "force_function", "force-v2", "force-fc" -> RunMode.FORCE_FUNCTION
            else -> error("Invalid run mode: ${get("mode")}; use text/function/auto/mixed/force_text/force_function")
        },
    )
}

// ═══════════════════════════════════════════
// Prompt loading
// ═══════════════════════════════════════════

/**
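 * Loads the system prompt that matches the run mode
 * - TEXT mode: needs the detailed command list, from which the LLM generates command text
 * - FUNCTION mode: tool definitions are passed via toolsJson(), so the prompt only carries behavior rules
 * - MIXED mode: supports interleaving V1 and V2 commands
 * - AUTO/FORCE_* modes: determined by the mode actually in effect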
 */
fun loadPrompt(config: Config, actualMode: RunMode? = null): String {
    val mode = actualMode ?: config.mode
    return when (mode) {
        RunMode.TEXT, RunMode.FORCE_TEXT -> TEXT_MODE_PROMPT
        RunMode.FUNCTION, RunMode.FORCE_FUNCTION -> FUNCTION_MODE_PROMPT
        RunMode.MIXED -> MIXED_MODE_PROMPT
        RunMode.AUTO -> {
            // Generic prompt for AUTO mode
            FUNCTION_MODE_PROMPT // Defaults to the function-mode prompt; callers pass actualMode once detection has run
        }
    }
}

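/** TEXT mode prompt: detailed command list (the LLM uses it to generate script-invocation text) */
val TEXT_MODE_PROMPT = """
You are an AI assistant that can directly control the user's operating system.

## Execution protocol

Execution protocol, awaiting instructions: handle a simple task with a single primitive that layers an adaptive identity over an input-process-output structure; split a complex task into simple tasks and execute them as a chain of primitives. Input-process-output details are hidden by default, but the user may ask to see them. Prefer completing the task with the language model itself, guided by the prompt; call tools, skills, or plugins only when necessary, and plan such calls by treating them as identities or primitives.

## What you can do

Control the Windows system by calling the following scripts:

### Window management (window_manager.py)
- list: list all windows (title, pid, process)
- activate --title/pid: bring a window to the foreground
- close --title/pid: close a window
- minimize/maximize --title/pid: minimize or maximize a window
- resize --pid --x --y --w --h: change size and position
- send-keys --title/pid --text: send keystrokes (e.g. ENTER, ^c for Ctrl+C)

### Process management (process_manager.py)
- list [--name]: list or find processes
- kill --name/pid [--force]: terminate a process
- start "command" [--dir]: launch an application
- info --pid: process details
- system: CPU/memory/disk overview

### Hardware control (hardware_controller.py)
- volume get/set --level / mute: volume
- screen brightness [--level]: brightness (omit the argument to read the current value, pass --level to set it)
- screen info: display information
- power lock/sleep/hibernate/shutdown [--delay]/restart/cancel: power
- network list/enable/disable --name/wifi/info: network (enable/disable require --name)
- usb list: USB devices

### GUI control (gui_controller.py)
- mouse move/click/right-click/double-click --x --y: mouse actions
- mouse drag --start-x --start-y --end-x --end-y: drag
- mouse scroll --direction up/down --clicks: scroll
- mouse position: get the mouse position
- keyboard type --text: type text
- keyboard press --keys: press keys ("ctrl+c", "alt+tab")
- screenshot full/window/size: screenshot (full = whole screen, window = active window, size = resolution)
- visual ocr [--x --y --w --h]: on-screen text recognition
- visual find --template "img.png": image matching
- visual click-image --template "img.png": find an image and click it
- visual pixel --x --y: read a pixel color

### Serial communication (serial_comm.py)
- list: list serial ports
- send/receive/chat/monitor --port COMx --data "...": serial operations

### IoT control (iot_controller.py)
- homeassistant --url --token list/state/on/off/toggle/entity-id
- http --url get/post/put --path [--body/--header]: generic HTTP API

## Execution rules

1. Query before acting: list/find first, then perform the action
2. Explain dangerous operations and wait for confirmation first (shutdown, killing processes, closing windows, etc.)
3. Keep results concise; report only the key information
4. When an operation fails, analyze the cause and suggest an alternative
5. Multiple steps may be combined to complete a complex task
6. When the user says "undo" or "roll back", reverse the action using known information
""".trimIndent()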

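/** FUNCTION mode prompt: concise behavior rules (tools are passed via toolsToJson()) */
val FUNCTION_MODE_PROMPT = """
You are an AI assistant that can directly control the user's operating system.
You can use tool calls to control windows, processes, hardware, the GUI, serial ports, and IoT devices on a Windows system.

## Execution protocol

Execution protocol, awaiting instructions: handle a simple task with a single primitive that layers an adaptive identity over an input-process-output structure; split a complex task into simple tasks and execute them as a chain of primitives. Input-process-output details are hidden by default, but the user may ask to see them. Prefer completing the task with the language model itself, guided by the prompt; call tools, skills, or plugins only when necessary, and plan such calls by treating them as identities or primitives.

## Rules

1. Query before acting (list/find first, then perform the action)
2. Dangerous operations (shutdown, killing processes, closing windows) must be explained and confirmed first
3. Multiple tools may be combined to complete a complex task
4. When an operation fails, analyze the cause and suggest an alternative
5. When the user says "undo" or "roll back", reverse the action using known information
6. Do not invent tool names that do not exist; use only the tools provided
""".trimIndent()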

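/** MIXED mode prompt: analyzes the task and interleaves V1 and V2 commands in sequence */
val MIXED_MODE_PROMPT = """
You are an AI assistant that can directly control the user's operating system and can choose the execution mode yourself.

## Execution protocol

Execution protocol, awaiting instructions: handle a simple task with a single primitive that layers an adaptive identity over an input-process-output structure; split a complex task into simple tasks and execute them as a chain of primitives. Input-process-output details are hidden by default, but the user may ask to see them. Prefer completing the task with the language model itself, guided by the prompt; call tools, skills, or plugins only when necessary, and plan such calls by treating them as identities or primitives.

## Smart mode selection

You have two execution modes and can **choose the best one automatically**:

### 1. V2 mode (Function Calling, safe tools)
- **Use for**: hardware control, window management, process control, GUI automation, serial communication, IoT control
- **Tools**: 55 predefined safe tools (see the tool list below)
- **Strengths**: safe, reliable, structured
- **How to use**: call the tool directly by name with its parameters

### 2. V1 mode (text commands, unrestricted)
- **Use for**: file operations, script execution, network operations, database operations, system administration, software installation, security operations, and so on
- **Capability**: **any task that can be done from the command line**
- **Strengths**: unrestricted, flexible, powerful
- **How to use**: generate the command and wrap it in a ```bash or ```powershell code block

## Tool list (available in V2 mode)

### Window management
- window_list, window_activate, window_close, window_minimize, window_maximize, window_resize, window_send_keys

### Process management
- process_list, process_kill, process_start, process_info, process_system

### Hardware control
- volume_get, volume_set, volume_mute, brightness_set, screen_info, power_lock, power_sleep, power_hibernate, power_shutdown, power_restart, power_cancel, network_list, network_enable, network_disable, wifi_info, usb_list

### GUI control
- mouse_move, mouse_click, mouse_right_click, mouse_double_click, mouse_drag, mouse_scroll, mouse_position, keyboard_type, keyboard_press, screenshot, visual_ocr, visual_find, visual_click_image, visual_pixel

### Serial communication
- serial_list, serial_send, serial_receive, serial_chat, serial_monitor

### IoT control
- homeassistant_list, homeassistant_state, homeassistant_on, homeassistant_off, homeassistant_toggle, http_get, http_post, http_put

## Execution rules

1. **Automatic mode selection**: choose V1 or V2 according to the task type
2. **Mixed execution**: a complex task may interleave V1 and V2 commands in sequence
3. **Analysis**: analyze the user's request, split it into steps, and pick the best mode for each step
4. **Continuous execution**: run all steps in order without manual intervention
5. **Result passing**: the output of one step may be used as the input of the next

## Example

User: "Take a screenshot, OCR the text, save it to a file, then adjust the volume"

Analysis:
1. Screenshot → V2 mode: screenshot
2. OCR → V2 mode: visual_ocr
3. Save to file → V1 mode: ```bash echo "text content" > output.txt```
4. Adjust volume → V2 mode: volume_set --level 50

Execution: run the 4 steps above in order, switching freely between V2 and V1.
""".trimIndent()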

// ═══════════════════════════════════════════
// Message and response data structures
// ═══════════════════════════════════════════

/** Conversation message */
data class Message(val role: String, val content: String)

/** LLM response (generic) */
data class LlmResponse(
    val text: String? = null,              // Text reply
    val toolCalls: List<ToolCall>? = null,  // Tool calls (function mode only)
)

/** A single tool call */
data class ToolCall(
    val name: String,                      // Tool name
    val arguments: Map<String, Any?>,       // Tool arguments
)

/** Script execution result */
data class ExecutionResult(
    val success: Boolean,                  // Whether the run succeeded
    val output: String,                    // Captured output
    val command: String,                   // The command actually executed
)

// ═══════════════════════════════════════════
// JSON helper functions
// ═══════════════════════════════════════════

/** Escapes a string so it can be embedded safely in JSON */
fun escapeJson(s: String): String = s
    .replace("\\", "\\\\")
    .replace("\"", "\\\"")
    .replace("\n", "\\n")
    .replace("\r", "\\r")
    .replace("\t", "\\t")

/** Minimal JSON parser (handles only flat key-value structures, to avoid pulling in a serialization library) */
fun simpleJsonParse(s: String): Any {
    val trimmed = s.trim()
    if (trimmed.startsWith("{")) {
        val result = mutableMapOf<String, Any>()
        """"(\w+)"\s*:\s*""".toRegex().findAll(trimmed).forEach { match ->
            val key = match.groupValues[1]
            val after = trimmed.substring(match.range.last + 1).trim()
            val value = when {
                after.startsWith("\"") -> after.drop(1).takeWhile { it != '"' }
                after.startsWith("{") || after.startsWith("[") -> "complex"  // Nested structures are not parsed; a placeholder is stored
                else -> after.takeWhile { it != ',' && it != '}' && it != ']' }
            }
            result[key] = value
        }
        return result
    }
    return s
}

// ═══════════════════════════════════════════
// Tool definitions (Function Calling mode only)
// ═══════════════════════════════════════════

/**
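 * System Controller: unified tool mapping table
 *
 * Design:
 * - Outwardly, expose a uniform tool-calling interface (semantic name + structured parameters)
 * - Inwardly, map each tool to a concrete Python script command line through the toCommand lambda
 * - The LLM does not need to know script paths or implementation details; it only picks a tool name and fills in parameters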
 */

/** Tool parameter definition */
data class ToolParam(
    val name: String,         // Parameter name
    val type: String,         // Parameter type (string/int/boolean)
    val description: String,  // Parameter description
    val required: Boolean = false,  // Whether the parameter is required
)

/** Tool definition */
data class ToolDef(
    val name: String,                                    // Tool name (the semantic name shown to the LLM)
    val description: String,                             // One-sentence description
    val params: List<ToolParam>,                         // Parameter list
    val toCommand: (Map<String, String>) -> List<String>, // Maps arguments to the script command line
)

/**
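 * The full inventory of atomic operations: this is the complete tool set
 *
 * Six modules: window management / process management / hardware control / GUI control / serial communication / IoT and network
 * Each ToolDef is one indivisible atomic operation; the LLM composes them freely to carry out arbitrarily complex tasks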
 */
val TOOLS = listOf(

    // ─────────────────────────────────────
    // 1. Window management (window_manager.py)
    // ─────────────────────────────────────
    ToolDef(
        name = "window_list",
        description = "List all currently visible windows",
        params = emptyList(),
        toCommand = { listOf("window_manager.py", "list") },
    ),
    ToolDef(
        name = "window_activate",
        description = "Bring a window to the foreground",
        params = listOf(
            ToolParam("title", "string", "Window title (fuzzy matching supported)"),
            ToolParam("pid", "int", "Process ID"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("window_manager.py", "activate")
            args["title"]?.let { cmd.addAll(listOf("--title", it)) }
            args["pid"]?.let { cmd.addAll(listOf("--pid", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "window_close",
        description = "Close the specified window",
        params = listOf(
            ToolParam("title", "string", "Window title"),
            ToolParam("pid", "int", "Process ID"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("window_manager.py", "close")
            args["title"]?.let { cmd.addAll(listOf("--title", it)) }
            args["pid"]?.let { cmd.addAll(listOf("--pid", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "window_minimize",
        description = "Minimize a window",
        params = listOf(ToolParam("title", "string", "Window title"), ToolParam("pid", "int", "Process ID")),
        toCommand = { args ->
            val cmd = mutableListOf("window_manager.py", "minimize")
            args["title"]?.let { cmd.addAll(listOf("--title", it)) }
            args["pid"]?.let { cmd.addAll(listOf("--pid", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "window_maximize",
        description = "Maximize a window",
        params = listOf(ToolParam("title", "string", "Window title"), ToolParam("pid", "int", "Process ID")),
        toCommand = { args ->
            val cmd = mutableListOf("window_manager.py", "maximize")
            args["title"]?.let { cmd.addAll(listOf("--title", it)) }
            args["pid"]?.let { cmd.addAll(listOf("--pid", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "window_resize",
        description = "Change a window's size and position",
        params = listOf(
            ToolParam("pid", "int", "Process ID", required = true),
            ToolParam("x", "int", "X coordinate of the top-left corner", required = true),
            ToolParam("y", "int", "Y coordinate of the top-left corner", required = true),
            ToolParam("width", "int", "Width", required = true),
            ToolParam("height", "int", "Height", required = true),
        ),
        toCommand = { args ->
            listOf("window_manager.py", "resize", "--pid", args["pid"]!!,
                "--x", args["x"]!!, "--y", args["y"]!!,
                "--width", args["width"]!!, "--height", args["height"]!!)
        },
    ),
    ToolDef(
        name = "window_send_keys",
        description = "Send a key combination to a window",
        params = listOf(
            ToolParam("title", "string", "Window title"),
            ToolParam("pid", "int", "Process ID"),
            ToolParam("text", "string", "Key text (e.g. ENTER, ^c for Ctrl+C)", required = true),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("window_manager.py", "send-keys")
            args["title"]?.let { cmd.addAll(listOf("--title", it)) }
            args["pid"]?.let { cmd.addAll(listOf("--pid", it)) }
            cmd.addAll(listOf("--text", args["text"]!!))
            cmd
        },
    ),

    // ─────────────────────────────────────
    // 2. Process management (process_manager.py)
    // ─────────────────────────────────────
    ToolDef(
        name = "process_list",
        description = "List system processes (optionally filtered by name)",
        params = listOf(ToolParam("name", "string", "Process name filter")),
        toCommand = { args ->
            val cmd = mutableListOf("process_manager.py", "list")
            args["name"]?.let { cmd.addAll(listOf("--name", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "process_kill",
        description = "Terminate a process",
        params = listOf(
            ToolParam("name", "string", "Process name"),
            ToolParam("pid", "int", "Process ID"),
            ToolParam("force", "boolean", "Force termination"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("process_manager.py", "kill")
            args["name"]?.let { cmd.addAll(listOf("--name", it)) }
            args["pid"]?.let { cmd.addAll(listOf("--pid", it)) }
            if (args["force"] == "true") cmd.add("--force")
            cmd
        },
    ),
    ToolDef(
        name = "process_start",
        description = "Launch an application",
        params = listOf(
            ToolParam("command", "string", "Launch command or application name", required = true),
            ToolParam("dir", "string", "Working directory"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("process_manager.py", "start", args["command"]!!)
            args["dir"]?.let { cmd.addAll(listOf("--dir", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "process_info",
        description = "Get process details",
        params = listOf(ToolParam("pid", "int", "Process ID", required = true)),
        toCommand = { args -> listOf("process_manager.py", "info", "--pid", args["pid"]!!) },
    ),
    ToolDef(
        name = "process_system",
        description = "Get a system resource overview (CPU/memory/disk)",
        params = emptyList(),
        toCommand = { listOf("process_manager.py", "system") },
    ),

    // ─────────────────────────────────────
    // 3. Hardware control (hardware_controller.py)
    // ─────────────────────────────────────
    ToolDef(name = "volume_get", description = "Get the current volume", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "volume", "get") }),
    ToolDef(
        name = "volume_set", description = "Set the volume (0-100)",
        params = listOf(ToolParam("level", "int", "Volume level 0-100", required = true)),
        toCommand = { args -> listOf("hardware_controller.py", "volume", "set", "--level", args["level"]!!) },
    ),
    ToolDef(name = "volume_mute", description = "Mute or unmute", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "volume", "mute") }),
    ToolDef(
        name = "brightness_set", description = "Get or set the brightness (0-100)",
        params = listOf(
            ToolParam("level", "int", "Brightness level 0-100; omit to read the current brightness")
        ),
        toCommand = { args ->
            val cmd = mutableListOf("hardware_controller.py", "screen", "brightness")
            args["level"]?.let { cmd.addAll(listOf("--level", it)) }
            cmd
        },
    ),
    ToolDef(name = "screen_info", description = "Get display information", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "screen", "info") }),
    ToolDef(name = "power_lock", description = "Lock the screen", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "power", "lock") }),
    ToolDef(name = "power_sleep", description = "Sleep", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "power", "sleep") }),
    ToolDef(
        name = "power_shutdown", description = "Shut down (optionally delayed)",
        params = listOf(ToolParam("delay", "int", "Delay in seconds")),
        toCommand = { args ->
            val cmd = mutableListOf("hardware_controller.py", "power", "shutdown")
            args["delay"]?.let { cmd.addAll(listOf("--delay", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "power_restart", description = "Restart (optionally delayed)",
        params = listOf(ToolParam("delay", "int", "Delay in seconds")),
        toCommand = { args ->
            val cmd = mutableListOf("hardware_controller.py", "power", "restart")
            args["delay"]?.let { cmd.addAll(listOf("--delay", it)) }
            cmd
        },
    ),
    ToolDef(name = "power_hibernate", description = "Hibernate (suspend to disk)", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "power", "hibernate") }),
    ToolDef(name = "power_cancel", description = "Cancel a scheduled shutdown or restart", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "power", "cancel") }),
    ToolDef(name = "network_list", description = "List network adapters", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "network", "adapters") }),
    ToolDef(
        name = "network_enable", description = "Enable a network adapter",
        params = listOf(ToolParam("name", "string", "Adapter name", required = true)),
        toCommand = { args -> listOf("hardware_controller.py", "network", "enable", "--name", args["name"]!!) },
    ),
    ToolDef(
        name = "network_disable", description = "Disable a network adapter",
        params = listOf(ToolParam("name", "string", "Adapter name", required = true)),
        toCommand = { args -> listOf("hardware_controller.py", "network", "disable", "--name", args["name"]!!) },
    ),
    ToolDef(name = "wifi_info", description = "Get WiFi information", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "network", "wifi") }),
    ToolDef(name = "usb_list", description = "List USB devices", params = emptyList(),
        toCommand = { listOf("hardware_controller.py", "usb", "list") }),

    // ─────────────────────────────────────
    // 4. GUI control: mouse (gui_controller.py)
    // ─────────────────────────────────────
    ToolDef(
        name = "mouse_move", description = "Move the mouse",
        params = listOf(ToolParam("x", "int", "X coordinate", required = true), ToolParam("y", "int", "Y coordinate", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "mouse", "move", "--x", args["x"]!!, "--y", args["y"]!!) },
    ),
    ToolDef(
        name = "mouse_click", description = "Left-click",
        params = listOf(ToolParam("x", "int", "X coordinate", required = true), ToolParam("y", "int", "Y coordinate", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "mouse", "click", "--x", args["x"]!!, "--y", args["y"]!!) },
    ),
    ToolDef(
        name = "mouse_right_click", description = "Right-click",
        params = listOf(ToolParam("x", "int", "X coordinate", required = true), ToolParam("y", "int", "Y coordinate", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "mouse", "right-click", "--x", args["x"]!!, "--y", args["y"]!!) },
    ),
    ToolDef(
        name = "mouse_double_click", description = "Double-click",
        params = listOf(ToolParam("x", "int", "X coordinate", required = true), ToolParam("y", "int", "Y coordinate", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "mouse", "double-click", "--x", args["x"]!!, "--y", args["y"]!!) },
    ),
    ToolDef(
        name = "mouse_drag", description = "Drag with the mouse",
        params = listOf(
            ToolParam("start_x", "int", "Start X", required = true),
            ToolParam("start_y", "int", "Start Y", required = true),
            ToolParam("end_x", "int", "End X", required = true),
            ToolParam("end_y", "int", "End Y", required = true),
        ),
        toCommand = { args ->
            listOf("gui_controller.py", "mouse", "drag",
                "--start-x", args["start_x"]!!, "--start-y", args["start_y"]!!,
                "--end-x", args["end_x"]!!, "--end-y", args["end_y"]!!)
        },
    ),
    ToolDef(
        name = "mouse_scroll", description = "Scroll the mouse wheel",
        params = listOf(
            ToolParam("direction", "string", "Direction: up or down", required = true),
            ToolParam("clicks", "int", "Number of scroll clicks"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("gui_controller.py", "mouse", "scroll", "--direction", args["direction"]!!)
            args["clicks"]?.let { cmd.addAll(listOf("--clicks", it)) }
            cmd
        },
    ),
    ToolDef(name = "mouse_position", description = "Get the current mouse position", params = emptyList(),
        toCommand = { listOf("gui_controller.py", "mouse", "position") }),

    // ─────────────────────────────────────
    // GUI control: keyboard (gui_controller.py)
    // ─────────────────────────────────────
    ToolDef(
        name = "keyboard_type", description = "Type text",
        params = listOf(ToolParam("text", "string", "Text to type", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "keyboard", "type", "--text", args["text"]!!) },
    ),
    ToolDef(
        name = "keyboard_press", description = "Press a keyboard shortcut",
        params = listOf(ToolParam("keys", "string", "Key combination (e.g. ctrl+c, alt+tab)", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "keyboard", "press", "--keys", args["keys"]!!) },
    ),

    // ─────────────────────────────────────
    // GUI control: screenshots and visual recognition (gui_controller.py)
    // ─────────────────────────────────────
    ToolDef(
        name = "screenshot", description = "Take a screenshot (full screen / active window / resolution)",
        params = listOf(
            ToolParam("type", "string", "Screenshot type: full (whole screen), window (active window), size (resolution)")
        ),
        toCommand = { args ->
            val cmd = mutableListOf("gui_controller.py", "screenshot")
            when (args["type"]) {
                "full" -> cmd.add("full")
                "window" -> cmd.add("active-window")
                "size" -> cmd.add("size")
                else -> cmd.add("full")
            }
            cmd
        },
    ),
    ToolDef(
        name = "visual_ocr", description = "Recognize text in a screen region (OCR)",
        params = listOf(
            ToolParam("x", "int", "X of the region's top-left corner"),
            ToolParam("y", "int", "Y of the region's top-left corner"),
            ToolParam("width", "int", "Region width"),
            ToolParam("height", "int", "Region height"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("gui_controller.py", "visual", "ocr")
            args["x"]?.let { cmd.addAll(listOf("--x", it)) }
            args["y"]?.let { cmd.addAll(listOf("--y", it)) }
            args["width"]?.let { cmd.addAll(listOf("--width", it)) }
            args["height"]?.let { cmd.addAll(listOf("--height", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "visual_find", description = "Find an image template on the screen",
        params = listOf(ToolParam("template", "string", "Path to the template image", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "visual", "find", "--template", args["template"]!!) },
    ),
    ToolDef(
        name = "visual_click_image", description = "Find an image template and click it",
        params = listOf(ToolParam("template", "string", "Path to the template image", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "visual", "click-image", "--template", args["template"]!!) },
    ),
    ToolDef(
        name = "visual_pixel", description = "Get the pixel color at a position",
        params = listOf(ToolParam("x", "int", "X coordinate", required = true), ToolParam("y", "int", "Y coordinate", required = true)),
        toCommand = { args -> listOf("gui_controller.py", "visual", "pixel", "--x", args["x"]!!, "--y", args["y"]!!) },
    ),

    // ─────────────────────────────────────
    // 5. Serial communication (serial_comm.py)
    // ─────────────────────────────────────
    ToolDef(name = "serial_list", description = "List serial devices", params = emptyList(),
        toCommand = { listOf("serial_comm.py", "list") }),
    ToolDef(
        name = "serial_send", description = "Send data to a serial port",
        params = listOf(
            ToolParam("port", "string", "Port name, e.g. COM3", required = true),
            ToolParam("data", "string", "Data to send", required = true),
            ToolParam("baud", "int", "Baud rate"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("serial_comm.py", "send", "--port", args["port"]!!, "--data", args["data"]!!)
            args["baud"]?.let { cmd.addAll(listOf("--baud", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "serial_chat", description = "Send data and wait for a response",
        params = listOf(
            ToolParam("port", "string", "Port name", required = true),
            ToolParam("data", "string", "Data to send", required = true),
            ToolParam("baud", "int", "Baud rate"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("serial_comm.py", "chat", "--port", args["port"]!!, "--data", args["data"]!!)
            args["baud"]?.let { cmd.addAll(listOf("--baud", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "serial_receive", description = "Receive data from a serial port",
        params = listOf(
            ToolParam("port", "string", "Port name", required = true),
            ToolParam("baud", "int", "Baud rate"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("serial_comm.py", "receive", "--port", args["port"]!!)
            args["baud"]?.let { cmd.addAll(listOf("--baud", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "serial_monitor", description = "Monitor a serial port",
        params = listOf(
            ToolParam("port", "string", "Port name", required = true),
            ToolParam("baud", "int", "Baud rate"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("serial_comm.py", "monitor", "--port", args["port"]!!)
            args["baud"]?.let { cmd.addAll(listOf("--baud", it)) }
            cmd
        },
    ),

    // ─────────────────────────────────────
    // 6. IoT and HTTP (iot_controller.py)
    // ─────────────────────────────────────
    ToolDef(
        name = "http_get", description = "HTTP GET request",
        params = listOf(
            ToolParam("url", "string", "Base URL", required = true),
            ToolParam("path", "string", "Request path", required = true),
            ToolParam("header", "string", "Custom request header"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("iot_controller.py", "http", "--url", args["url"]!!, "get", "--path", args["path"]!!)
            args["header"]?.let { cmd.addAll(listOf("--header", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "http_post", description = "HTTP POST request",
        params = listOf(
            ToolParam("url", "string", "Base URL", required = true),
            ToolParam("path", "string", "Request path", required = true),
            ToolParam("body", "string", "Request body (JSON)"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("iot_controller.py", "http", "--url", args["url"]!!, "post", "--path", args["path"]!!)
            args["body"]?.let { cmd.addAll(listOf("--body", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "http_put", description = "HTTP PUT request",
        params = listOf(
            ToolParam("url", "string", "Base URL", required = true),
            ToolParam("path", "string", "Request path", required = true),
            ToolParam("body", "string", "Request body (JSON)"),
        ),
        toCommand = { args ->
            val cmd = mutableListOf("iot_controller.py", "http", "--url", args["url"]!!, "put", "--path", args["path"]!!)
            args["body"]?.let { cmd.addAll(listOf("--body", it)) }
            cmd
        },
    ),
    ToolDef(
        name = "ha_list", description = "List HomeAssistant entities",
        params = listOf(
            ToolParam("url", "string", "HomeAssistant URL", required = true),
            ToolParam("token", "string", "API Token", required = true),
        ),
        toCommand = { args -> listOf("iot_controller.py", "homeassistant", "--url", args["url"]!!, "--token", args["token"]!!, "list") },
    ),
    ToolDef(
        name = "ha_state", description = "Get the state of a HomeAssistant entity",
        params = listOf(
            ToolParam("url", "string", "HomeAssistant URL", required = true),
            ToolParam("token", "string", "API Token", required = true),
            ToolParam("entity_id", "string", "Entity ID", required = true),
        ),
        toCommand = { args -> listOf("iot_controller.py", "homeassistant", "--url", args["url"]!!, "--token", args["token"]!!, "state", "--entity-id", args["entity_id"]!!) },
    ),
    ToolDef(
        name = "ha_on", description = "Turn on a HomeAssistant entity",
        params = listOf(
            ToolParam("url", "string", "HomeAssistant URL", required = true),
            ToolParam("token", "string", "API Token", required = true),
            ToolParam("entity_id", "string", "Entity ID", required = true),
        ),
        toCommand = { args -> listOf("iot_controller.py", "homeassistant", "--url", args["url"]!!, "--token", args["token"]!!, "on", "--entity-id", args["entity_id"]!!) },
    ),
    ToolDef(
        name = "ha_off", description = "Turn off a HomeAssistant entity",
        params = listOf(
            ToolParam("url", "string", "HomeAssistant URL", required = true),
            ToolParam("token", "string", "API Token", required = true),
            ToolParam("entity_id", "string", "Entity ID", required = true),
        ),
        toCommand = { args -> listOf("iot_controller.py", "homeassistant", "--url", args["url"]!!, "--token", args["token"]!!, "off", "--entity-id", args["entity_id"]!!) },
    ),
    ToolDef(
        name = "ha_toggle", description = "Toggle the state of a HomeAssistant entity",
        params = listOf(
            ToolParam("url", "string", "HomeAssistant URL", required = true),
            ToolParam("token", "string", "API Token", required = true),
            ToolParam("entity_id", "string", "Entity ID", required = true),
        ),
        toCommand = { args -> listOf("iot_controller.py", "homeassistant", "--url", args["url"]!!, "--token", args["token"]!!, "toggle", "--entity-id", args["entity_id"]!!) },
    ),
).associateBy { it.name }

/** Convert the TOOLS map into a JSON string in OpenAI Function Calling format */
fun toolsToJson(): String {
    val sb = StringBuilder("[\n")
    TOOLS.values.forEachIndexed { i, tool ->
        if (i > 0) sb.append(",\n")
        sb.append("""  {"type":"function","function":{"name":"${tool.name}","description":"${tool.description}","parameters":{"type":"object","properties":{""")

        tool.params.forEachIndexed { j, p ->
            if (j > 0) sb.append(",")
            // JSON Schema has no "int" type, so map it to "integer"
            val jsonType = if (p.type == "int") "integer" else p.type
            // Each property is a complete object; required names are listed once, below
            sb.append(""""${p.name}":{"type":"$jsonType","description":"${p.description}"}""")
        }
        sb.append("}")  // close "properties"

        val requiredParams = tool.params.filter { it.required }.map { "\"${it.name}\"" }
        if (requiredParams.isNotEmpty()) {
            sb.append(""","required":[${requiredParams.joinToString(",")}]""")
        }
        sb.append("}}}")  // close "parameters", "function", and the tool object
    }
    sb.append("\n]")
    return sb.toString()
}

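// ═══════════════════════════════════════════
// MIXED mode: task analysis
// ═══════════════════════════════════════════

/**
 * Classify a task by keyword to decide between V1 and V2.
 *
 * Rules:
 * 1. More V2 keyword hits (the predefined tools can do the job) → "V2"
 * 2. More V1 keyword hits (file operations, script execution, etc.) → "V1"
 * 3. A tie or no hits → "MIXED": a complex task can be split into steps,
 *    each step choosing its own mode
 */
fun analyzeTaskType(task: String): String {
    // Keywords are matched as case-insensitive substrings of the task text.
    // "network" appears in both lists, so it never tips the balance.
    val v2Keywords = listOf(
        "window", "process", "volume", "brightness", "power", "network", "mouse", "keyboard", "screenshot", "OCR",
        "serial", "IoT", "homeassistant", "screen", "monitor", "activate", "minimize", "maximize",
        "close window", "open app", "kill process", "start process", "resize", "send keys",
        "mouse move", "mouse click", "keyboard input", "screen capture", "text recognition", "find image", "color",
        "serial send", "serial receive", "smart home", "HTTP request"
    )

    val v1Keywords = listOf(
        "file", "edit", "create", "delete", "move", "copy", "rename", "script", "execute", "run",
        "network", "database", "query", "import", "export", "system", "config", "registry", "service",
        "install", "uninstall", "update", "security", "firewall", "permission", "compile", "build", "test",
        "CSV", "JSON", "XML", "handle", "convert", "package", "deploy", "backup", "restore"
    )

    val taskLower = task.lowercase()
    val v2Count = v2Keywords.count { taskLower.contains(it.lowercase()) }
    val v1Count = v1Keywords.count { taskLower.contains(it.lowercase()) }

    return when {
        v2Count > v1Count -> "V2"
        v1Count > v2Count -> "V1"
        else -> "MIXED" // tie, or no keyword matched
    }
}

// ═══════════════════════════════════════════
// System-controller detection and mode selection
// ═══════════════════════════════════════════

/**
 * Check whether system-controller is available:
 * 1. The skill directory exists
 * 2. Its scripts subdirectory exists
 * 3. The main scripts exist
 * 4. The Python interpreter runs
 */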
fun isSystemControllerAvailable(config: Config): Boolean {
    try {
        // 1. Check the skill directory
        val skillDir = File(config.skillPath)
        if (!skillDir.exists() || !skillDir.isDirectory) {
            println("⚠️ system-controller directory not found: ${config.skillPath}")
            return false
        }

        // 2. Check the scripts subdirectory
        val scriptsDir = File(skillDir, "scripts")
        if (!scriptsDir.exists() || !scriptsDir.isDirectory) {
            println("⚠️ system-controller scripts directory not found: $scriptsDir")
            return false
        }

        // 3. Check the main script files
        val requiredScripts = listOf("window_manager.py", "process_manager.py", "hardware_controller.py")
        for (script in requiredScripts) {
            val scriptFile = File(scriptsDir, script)
            if (!scriptFile.exists()) {
                println("⚠️ system-controller is missing a required script: $script")
                return false
            }
        }

        // 4. Test the Python interpreter
        try {
            val testProc = ProcessBuilder(config.pythonPath, "--version")
                .redirectErrorStream(true)
                .start()
            // waitFor returns false on timeout; exitValue() would throw on a live process
            val finished = testProc.waitFor(5, TimeUnit.SECONDS)
            if (!finished) testProc.destroyForcibly()
            if (!finished || testProc.exitValue() != 0) {
                println("⚠️ Python is not usable: ${config.pythonPath}")
                return false
            }
        } catch (e: Exception) {
            println("⚠️ Python check failed: ${e.message}")
            return false
        }

        println("✅ system-controller is available: ${config.skillPath}")
        return true
    } catch (e: Exception) {
        println("⚠️ system-controller detection error: ${e.message}")
        return false
    }
}

/**
 * Resolve the run mode.
 * Picks a concrete mode from the configuration and the availability of system-controller.
 */
fun determineActualMode(config: Config): RunMode {
    return when (config.mode) {
        RunMode.AUTO -> {
            if (isSystemControllerAvailable(config)) {
                println("📊 Auto: system-controller is available → using function mode (V2)")
                RunMode.FUNCTION
            } else {
                println("📊 Auto: system-controller is unavailable → using text mode (V1)")
                RunMode.TEXT
            }
        }
        RunMode.MIXED -> {
            println("📊 Mixed mode: tasks are analyzed, and V1 and V2 commands can be interleaved in sequence")
            RunMode.MIXED
        }
        RunMode.FORCE_TEXT -> {
            println("📊 Forced: ignoring system-controller → using text mode (V1)")
            RunMode.TEXT
        }
        RunMode.FORCE_FUNCTION -> {
            println("📊 Forced: using function mode (V2); system-controller may be unavailable")
            RunMode.FUNCTION
        }
        else -> config.mode  // TEXT or FUNCTION is used as-is
    }
}

// ═══════════════════════════════════════════
// LLM API call (both modes share the same underlying HTTP request)
// ═══════════════════════════════════════════

/**
 * Call the LLM API
 * @param config configuration
 * @param messages conversation history
 * @param toolsJson Function Calling tool definitions as JSON (required in function mode, null in text mode)
 * @return the LLM response
 */
fun callLlm(config: Config, messages: List<Message>, toolsJson: String? = null): LlmResponse {
    // Build the JSON request body
    val body = buildString {
        append("""{"model":"${config.model}","max_tokens":${config.maxTokens},"temperature":${config.temperature},"messages":[""")
        messages.forEachIndexed { i, m ->
            if (i > 0) append(",")
            append("""{"role":"${m.role}","content":${escapeJson(m.content)}}""")
        }
        append("]")
        // Function Calling mode attaches the tool definitions
        if (toolsJson != null) {
            append(""","tools":$toolsJson,"tool_choice":"auto"""")
        }
        append("}")
    }

    // Send the HTTP POST with curl (plain HTTP, no SDK dependency)
    val proc = ProcessBuilder("curl", "-s", "-X", "POST", config.apiUrl,
        "-H", "Content-Type: application/json",
        "-H", "Authorization: Bearer ${config.apiKey}",
        "-d", body)
        .redirectErrorStream(true)
        .start()

    val output = proc.inputStream.bufferedReader().readText()
    proc.waitFor()

    // Extract the first "content" string. The captured text is still JSON-escaped
    // (for example, \n stays as a backslash followed by n).
    val text = """content"\s*:\s*"((?:[^"\\]|\\.)*)""".toRegex()
        .find(output)?.groupValues?.get(1)

    // In function mode, parse tool_calls
    val toolCalls = if (toolsJson != null) {
        val calls = mutableListOf<ToolCall>()
        // Extract each function call's name and arguments by regex.
        // This assumes "arguments" arrives as a flat inline JSON object; an API that sends it
        // as a JSON-encoded string (as OpenAI's does) will not match this pattern.
        val fcRegex = """"function"\s*:\s*\{\s*"name"\s*:\s*"(\w+)".*?"arguments"\s*:\s*(\{.*?\})""".toRegex(RegexOption.DOT_MATCHES_ALL)
        fcRegex.findAll(output).forEach { match ->
            val funcName = match.groupValues[1]
            try {
                @Suppress("UNCHECKED_CAST")
                val args = simpleJsonParse(match.groupValues[2]) as Map<String, Any?>
                calls.add(ToolCall(funcName, args))
            } catch (_: Exception) { /* skip calls that fail to parse */ }
        }
        calls.ifEmpty { null }
    } else {
        null
    }

    return LlmResponse(text, toolCalls)
}

// ═══════════════════════════════════════════
// Script executor (shared by both modes)
// ═══════════════════════════════════════════

/**
 * Run a system-controller Python script
 * @param config configuration
 * @param scriptName script file name (e.g. window_manager.py)
 * @param args script arguments
 * @return the execution result
 */
fun executeScript(config: Config, scriptName: String, vararg args: String): ExecutionResult {
    // Assemble the full command: python + full script path + arguments
    val finalArgs = mutableListOf(config.pythonPath, "${config.skillPath}/scripts/$scriptName")
    finalArgs.addAll(args)
    val cmdLine = finalArgs.joinToString(" ")

    println("\n ⚡ Running: $cmdLine")

    try {
        val proc = ProcessBuilder(finalArgs)
            .redirectErrorStream(true)
            .start()

        // Read output on another thread: reading inline blocks until the process exits,
        // so the timeout below would never take effect
        val outputFuture = java.util.concurrent.CompletableFuture.supplyAsync {
            proc.inputStream.bufferedReader().readText()
        }
        val exited = proc.waitFor(config.timeoutSeconds, TimeUnit.SECONDS)
        if (!exited) proc.destroyForcibly()
        val output = outputFuture.get()

        return ExecutionResult(
            success = exited && proc.exitValue() == 0,
            output = output.trim().take(2000),  // truncate long output to save tokens
            command = cmdLine,
        )
    } catch (e: Exception) {
        return ExecutionResult(success = false, output = e.message ?: "unknown error", command = cmdLine)
    }
}

/**
 * Execute a tool call (Function Calling mode only).
 * Looks up toolName in TOOLS, maps the arguments through toCommand, then calls executeScript.
 */
fun executeToolCall(config: Config, toolName: String, args: Map<String, Any?>): ExecutionResult {
    val tool = TOOLS[toolName]
        ?: return ExecutionResult(false, "Unknown tool: $toolName", toolName)

    // Convert every argument to a string
    val stringArgs = args.mapValues { (_, v) -> v?.toString() ?: "" }

    // Map to the actual script command line through the toCommand lambda
    val scriptArgs = tool.toCommand(stringArgs)

    // Run it: the first element is the script name, the rest are its arguments
    return executeScript(config, scriptArgs[0], *scriptArgs.drop(1).toTypedArray())
}

// ═══════════════════════════════════════════
// Text mode main loop (V1)
// ═══════════════════════════════════════════

/**
 * Interactive loop for text mode.
 * The LLM replies in natural language; script commands are extracted from it by regex and executed.
 */
fun runLoopText(config: Config, systemPrompt: String) {
    val history = mutableListOf<Message>(
        Message("system", systemPrompt)
    )

    println("╔══════════════════════════════════════╗")
    println("║   Minimal Agent - Text mode (V1)     ║")
    println("║   Type a command, or quit to exit    ║")
    println("╚══════════════════════════════════════╝")

    while (true) {
        print("\n❯ ")
        val input = readLine()?.trim() ?: break
        if (input == "quit") break
        if (input.isEmpty()) continue

        history.add(Message("user", input))

        var iterations = 0
        var done = false

        while (!done && iterations < config.maxIterations) {
            iterations++

            // 1. Ask the LLM to decide (no tool definitions, so it replies in free text)
            val response = callLlm(config, history)

            // 2. Check whether the reply contains a script invocation
            val hasScriptCall = Regex(
                """(window_manager|process_manager|hardware_controller|gui_controller|serial_comm|iot_controller)\.py\s+\w+"""
            ).containsMatchIn(response.text ?: "")

            if (!hasScriptCall) {
                // The LLM only replied with text; nothing to execute
                println("\n🤖 ${response.text}")
                response.text?.let { history.add(Message("assistant", it)) }
                done = true
                continue
            }

            println("\n🤖 Decision: ${(response.text ?: "").take(150)}...")

            // 3. Extract script invocations from the reply. Only spaces and tabs separate
            //    arguments, so one command never runs on into the next line.
            val scriptCalls = Regex("""(\w+\.py)[ \t]+([\w\-][\w\- \t="'/.:]*)""")
                .findAll(response.text ?: "")
                .toList()

            if (scriptCalls.isEmpty()) {
                println("\n🤖 ${response.text}")
                response.text?.let { history.add(Message("assistant", it)) }
                done = true
                continue
            }

            // 4. Run each extracted script command in turn
            val execResults = StringBuilder()

            for (matchResult in scriptCalls) {
                val scriptName = matchResult.groupValues[1]
                val argsStr = matchResult.groupValues[2].trim()
                // Tokenize: a quoted value stays one argument (quotes stripped),
                // everything else splits on whitespace
                val args = Regex(""""([^"]*)"|'([^']*)'|(\S+)""")
                    .findAll(argsStr)
                    .map { m -> m.groupValues.drop(1).firstOrNull { it.isNotEmpty() } ?: "" }
                    .toList()

                val result = executeScript(config, scriptName, *args.toTypedArray())
                val resultIcon = if (result.success) "✅" else "❌"
                execResults.appendLine("$resultIcon [${result.command}]")
                execResults.appendLine(result.output)
            }

            val execSummary = execResults.toString().take(2000)

            // 5. Feed the results back to the LLM so it can decide the next step
            history.add(Message("assistant", response.text ?: ""))
            history.add(Message("user", "[Execution results]:\n$execSummary\n\nContinue or finish based on these results."))
        }

        if (!done && iterations >= config.maxIterations) {
            println("\n⚠️ Reached the maximum number of operations (${config.maxIterations}); stopped automatically.")
        }
    }

    println("\n👋 Goodbye!")
}

// ═══════════════════════════════════════════
// MIXED mode main loop
// ═══════════════════════════════════════════

/**
 * Interactive loop for MIXED mode.
 * Analyzes each task and picks V1 or V2, so V1 and V2 commands can be interleaved in sequence.
 */
fun runLoopMixed(config: Config, systemPrompt: String) {
    val history = mutableListOf<Message>(Message("system", systemPrompt))
    val toolsJson = toolsToJson()  // build the tool definitions JSON once

    println("╔══════════════════════════════════════════╗")
    println("║   Minimal Agent - Mixed mode (MIXED)     ║")
    println("║   V1 and V2 commands run in sequence     ║")
    println("╚══════════════════════════════════════════╝")
    println("📊 System check: ${if (isSystemControllerAvailable(config)) "✅ system-controller is available" else "⚠️ system-controller is unavailable"}")

    while (true) {
        print("\n💬 Enter a command (type 'exit' to quit): ")
        val input = readLine()?.trim()
        if (input == null || input.lowercase() == "exit") break
        if (input.isEmpty()) continue

        // Classify the task by keyword
        val taskType = analyzeTaskType(input)
        println("📊 Analysis: task type = $taskType")

        // Add to the history
        history.add(Message("user", input))

        // Choose how to execute based on the task type
        when (taskType) {
            "V2" -> {
                // V2: Function Calling
                println("📊 Execution mode: V2 (Function Calling)")
                val resp = callLlm(config, history, toolsJson)
                val calls = resp.toolCalls
                if (calls != null) {
                    for (tc in calls) {
                        val r = executeToolCall(config, tc.name, tc.arguments)
                        println(r.output)
                        history.add(Message("assistant", r.output))
                    }
                } else {
                    println(resp.text)
                    history.add(Message("assistant", resp.text ?: ""))
                }
            }
            "V1" -> {
                // V1: text commands
                println("📊 Execution mode: V1 (text commands)")
                val resp = callLlm(config, history)
                println(resp.text)
                history.add(Message("assistant", resp.text ?: ""))

                // Try to extract and run commands
                val commands = extractCommands(resp.text ?: "")
                if (commands.isNotEmpty()) {
                    println("📊 Extracted ${commands.size} command(s)")
                    for (cmd in commands) {
                        val r = executeCommand(config, cmd)
                        println(r.output)
                        history.add(Message("assistant", r.output))
                    }
                }
            }
            "MIXED" -> {
                // MIXED: offer the tools and let the LLM decide how to break the task down
                println("📊 Execution mode: MIXED (LLM decides)")
                val resp = callLlm(config, history, toolsJson)
                val calls = resp.toolCalls

                if (calls != null) {
                    // The LLM chose V2 tool calls
                    println("📊 LLM chose: V2 tool calls")
                    for (tc in calls) {
                        val r = executeToolCall(config, tc.name, tc.arguments)
                        println(r.output)
                        history.add(Message("assistant", r.output))
                    }
                } else {
                    // The LLM chose V1 text commands
                    println("📊 LLM chose: V1 text commands")
                    println(resp.text)
                    history.add(Message("assistant", resp.text ?: ""))

                    // Try to extract and run commands
                    val commands = extractCommands(resp.text ?: "")
                    if (commands.isNotEmpty()) {
                        println("📊 Extracted ${commands.size} command(s)")
                        for (cmd in commands) {
                            val r = executeCommand(config, cmd)
                            println(r.output)
                            history.add(Message("assistant", r.output))
                        }
                    }
                }
            }
        }
    }

    println("\n👋 Goodbye!")
}

// ═══════════════════════════════════════════
// Function Calling mode main loop (V2)
// ═══════════════════════════════════════════

/**
 * Interactive loop for Function Calling mode.
 * The LLM returns structured JSON tool calls, which the framework maps directly to script runs.
 * Uses roughly 85% fewer tokens than text mode.
 */
fun runLoopFunction(config: Config, systemPrompt: String) {
    val history = mutableListOf<Message>(Message("system", systemPrompt))
    val toolsJson = toolsToJson()  // build the tool definitions JSON once

    println("╔══════════════════════════════════════╗")
    println("║   Minimal Agent - Function mode (V2) ║")
    println("║   ${TOOLS.size} atomic tools · structured calls ║")
    println("╚══════════════════════════════════════╝")

    while (true) {
        print("\n❯ ")
        val input = readLine()?.trim() ?: break
        if (input == "quit") break
        if (input.isEmpty()) continue

        history.add(Message("user", input))
        var iterations = 0
        var done = false

        while (!done && iterations < config.maxIterations) {
            iterations++

            // 1. Call the LLM with the tool definitions attached
            val response = callLlm(config, history, toolsJson)
            val calls = response.toolCalls

            // Case 1: the LLM replied with plain text (no tool needed)
            if (calls == null || calls.isEmpty()) {
                println("\n🤖 ${response.text ?: "(no reply)"}")
                response.text?.let { history.add(Message("assistant", it)) }
                done = true
                continue
            }

            // Case 2: the LLM returned tool calls → run them one by one
            println("\n🤖 Calling ${calls.size} tool(s)...")
            val results = mutableListOf<String>()

            for (tc in calls) {
                val result = executeToolCall(config, tc.name, tc.arguments)
                val icon = if (result.success) "✅" else "❌"
                results.add("$icon [${result.command}]\n${result.output}")
            }

            // 2. Feed the results back to the LLM as an ordinary message
            history.add(Message("assistant", response.text ?: ""))
            history.add(Message("user", "[Tool results]:\n${results.joinToString("\n---\n")}\n\nContinue or finish based on these results."))
        }

        if (!done && iterations >= config.maxIterations) {
            println("\n⚠️ Reached the maximum number of operations (${config.maxIterations})")
        }
    }

    println("\n👋 Goodbye!")
}

// ═══════════════════════════════════════════
// Entry point: route to the loop for the selected mode
// ═══════════════════════════════════════════

fun main(args: Array<String>) {
    val config = loadConfig()
    val systemPrompt = loadPrompt(config)

    // A command-line flag takes priority over the `mode` field in config.toml
    val modeFlags = mapOf(
        "--text" to RunMode.TEXT,
        "--function" to RunMode.FUNCTION,
        "--auto" to RunMode.AUTO,
        "--mixed" to RunMode.MIXED,
    )
    val cliMode: RunMode? = modeFlags.entries.firstOrNull { it.key in args }?.value
    if (cliMode != null) println("📋 Command-line mode override: $cliMode")

    // Resolve AUTO and FORCE_* once; the result is TEXT, FUNCTION, or MIXED
    val finalMode = determineActualMode(config.copy(mode = cliMode ?: config.mode))

    println("\n📋 Configured mode: ${config.mode}")
    println("📋 Actual mode: $finalMode")
    println("📋 Skill path: ${config.skillPath}")

    if (args.isNotEmpty() && args[0] == "--") {
        // Single-shot mode: the remaining arguments (minus mode flags) form the user input
        val input = args.drop(1).filterNot { it in modeFlags }.joinToString(" ")
        val msgs = listOf(Message("system", systemPrompt), Message("user", input))

        // One Function Calling round: run the returned tool calls, or print the text reply
        fun runFunctionOnce() {
            val resp = callLlm(config, msgs, toolsJson())
            val calls = resp.toolCalls
            if (calls != null) {
                for (tc in calls) println(executeToolCall(config, tc.name, tc.arguments).output)
            } else {
                println(resp.text)
            }
        }

        when (finalMode) {
            RunMode.FUNCTION -> runFunctionOnce()
            RunMode.MIXED -> {
                println("📊 Mixed mode: analyzing and running the task")
                val taskType = analyzeTaskType(input)
                println("📊 Task type: $taskType")

                if (taskType == "V2") {
                    runFunctionOnce()
                } else {
                    // Call the LLM once and reuse the reply for printing and extraction
                    val text = callLlm(config, msgs).text
                    println(text)

                    // Try to extract and run commands
                    val commands = extractCommands(text ?: "")
                    if (commands.isNotEmpty()) {
                        println("📊 Extracted ${commands.size} command(s)")
                        for (cmd in commands) println(executeCommand(config, cmd).output)
                    }
                }
            }
            else -> println(callLlm(config, msgs).text)  // TEXT
        }
    } else {
        // Interactive mode: pick the loop that matches the resolved mode
        when (finalMode) {
            RunMode.FUNCTION -> runLoopFunction(config, systemPrompt)
            RunMode.MIXED -> runLoopMixed(config, systemPrompt)
            else -> runLoopText(config, systemPrompt)  // TEXT
        }
    }
}

main(args.drop(1).toTypedArray())
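
For reference, here is a minimal standalone sketch of the JSON shape that `toolsToJson()` is meant to produce for a single tool. `ParamSketch` and `toolJsonSketch` are hypothetical names used only in this example and are not part of the script; the layout assumes the OpenAI-style `tools` array, where `parameters` is a JSON Schema object and required parameter names sit in one `required` array next to `properties`.

```kotlin
// Hypothetical stand-in for the script's ToolParam type (illustration only)
data class ParamSketch(
    val name: String,
    val type: String,
    val description: String,
    val required: Boolean = false,
)

// Build the JSON for one tool definition. Descriptions are assumed to contain
// no characters that need JSON escaping.
fun toolJsonSketch(name: String, description: String, params: List<ParamSketch>): String {
    val props = params.joinToString(",") { p ->
        // JSON Schema uses "integer", not "int"
        val jsonType = if (p.type == "int") "integer" else p.type
        """"${p.name}":{"type":"$jsonType","description":"${p.description}"}"""
    }
    val required = params.filter { it.required }.joinToString(",") { "\"${it.name}\"" }
    return """{"type":"function","function":{"name":"$name","description":"$description",""" +
        """"parameters":{"type":"object","properties":{$props},"required":[$required]}}}"""
}

fun main() {
    val json = toolJsonSketch(
        "visual_pixel", "Get the pixel color at a given position",
        listOf(
            ParamSketch("x", "int", "X coordinate", required = true),
            ParamSketch("y", "int", "Y coordinate", required = true),
        ),
    )
    println(json)
}
```

Each property is a closed object of its own, and the braces for `properties`, `parameters`, `function`, and the outer tool object are closed in that order.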

name: minimal-agent

description: >
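  A minimal AI agent for operating-system control. It drives system-controller through an execution protocol
  to handle window management, process control, hardware operations, GUI automation, serial communication, and IoT device interaction.
  Use this Skill when the user needs to control a Windows system (opening or closing apps, adjusting volume or brightness,
  screenshot OCR, managing processes, serial communication, and so on).
  Trigger phrases: open an app, close a window, adjust volume, adjust brightness, lock screen, shut down,
  list processes, send a serial command, control lights, connect to Arduino, turn on WiFi, USB devices,
  screenshot, OCR, find image and click, type text, mouse actions, and so on.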