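AI pipeline-style command invocation

The work is split into one overall task and a series of subtasks.

Each subtask has two phases: execution and inquiry.

A subtask exits only after the model returns the completion marker [TASK_COMPLETED] during the inquiry phase.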
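The two-phase rule can be sketched as follows. This is a minimal illustration, not the author's implementation: `run_subtask`, `ask_model`, and `max_inquiries` are hypothetical names, and the model is replaced by any callable that maps a prompt string to a reply string.

```python
from typing import Callable, List

TASK_DONE = "[TASK_COMPLETED]"  # subtask completion marker named in the post


def run_subtask(task: str, ask_model: Callable[[str], str],
                max_inquiries: int = 5) -> List[str]:
    """Run one subtask: a single execution round, then inquiry rounds.

    ask_model is a hypothetical stand-in for the real model call.
    """
    transcript = []
    # Execution phase: a marker in this reply is deliberately ignored.
    transcript.append(ask_model(f"Current task: {task}\nStart executing."))
    # Inquiry phase: only a reply from this phase may end the subtask.
    for _ in range(max_inquiries):
        reply = ask_model(
            f"Current task: {task}\nIs it done? If so, reply {TASK_DONE}."
        )
        transcript.append(reply)
        if TASK_DONE in reply:
            break
    return transcript


if __name__ == "__main__":
    # Scripted replies stand in for a real model.
    scripted = iter([
        "[TOOL:open_browser,https://www.aplaybox.com/model/model]",
        "[TASK_COMPLETED]",
    ])
    print(run_subtask("打开模之屋", lambda prompt: next(scripted)))
```

Ignoring the marker in the execution round means the model cannot declare success in the same reply that issues the tool call, before the result of that call has been observed.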


工作流程定义:
任务1: 打开模之屋 - 已完成
任务2: 用ocr识别 叶瞬光 模型页面所在位置 - 已完成
任务3: 点击 叶瞬光 模型页面所在位置 - 已完成
任务4: 上下滚动滚轮寻找点赞和收藏按钮 - 已完成
任务5: 用yolo识别 点赞和收藏按钮 - 已完成
任务6: 依次点击 点赞和收藏 按钮 - 已完成
任务7: 上下滚动滚轮寻找下载模型按钮，要滚动到显示全，不然ocr识别不到，多滚动一点 - 已完成
任务8: 用ocr识别 下载模型 按钮所在位置 - 已完成
任务9: 点击 下载模型 按钮 - 已完成
任务10: 按下ctrl+j查看下载栏有没有下载，有才能输出[TOTAL_TASK_COMPLETED]，没有的话继续ocr寻找下载按钮点击 - 已完成

执行历史:
AI响应:  [TOOL:open_browser,https://www.aplaybox.com/model/model]
工具 'open_browser' 执行结果: 成功 打开网页: https://www.aplaybox.com/model/model
任务1执行完成
AI响应:  [TOOL:ocr_detect,叶瞬光]
工具 'ocr_detect' 执行结果: {
  "text": "[绝区零]  叶瞬光 「暖",
  "target_text": "叶瞬光",
  "center_x": 1492,
  "center_y": 466
}
任务2执行完成
AI响应:  [TOOL:click_mouse,1492,466,left,1,0.1]
工具 'click_mouse' 执行结果: 点击成功: (1492.0, 466.0), 类型: left, 次数: 1.0
任务3执行完成
AI响应:  [TOOL:scroll_mouse,-50,1200,500]
工具 'scroll_mouse' 执行结果: 滚动成功: -50单位，位置: (1200, 500)
AI响应:  [TOOL:detect_like_favorite]
工具 'detect_like_favorite' 执行结果: {
  "like_button": {
    "x": 140,
    "y": 925
  },
  "favorite_button": {
    "x": 207,
    "y": 925
  }
}
任务4执行完成
[TASK_COMPLETED]
工具 'detect_like_favorite' 执行结果: 正在截取当前屏幕...

0: 384x640 1 like, 1 favorite, 73.2ms
Speed: 3.8ms preprocess, 73.2ms inference, 9.8ms postprocess per image at shape (1, 3, 384, 640)
识别结果已保存到: E:\code\my_python_server\yolo\output\result_auto_screenshot_1767425426.jpg
检测到 2 个元素:
- like: 位置(246.1, 1003.3), 置信度0.55
- favorite: 位置(390.3, 1001.3), 置信度0.14
任务4执行完成
AI响应:  根据当前屏幕截图分析，已经成功使用YOLO识别出点赞和收藏按钮的位置：

- **点赞按钮**：位于坐标 (246.1, 1003.3)，置信度为 0.55
- **收藏按钮**：位于坐标 (390.3, 1001.3)，置信度为 0.14

接下来，我将按照任务要求点击这两个按钮。

[TOOL:click_mouse,246,1003,left,1,0.1]
[TOOL:click_mouse,390,1001,left,1,0.1]
工具 'click_mouse' 执行结果: 点击成功: (246.0, 1003.0), 类型: left, 次数: 1.0
工具 'click_mouse' 执行结果: 点击成功: (390.0, 1001.0), 类型: left, 次数: 1.0
任务5执行完成
AI响应:  根据当前屏幕截图分析，任务5/10"用yolo识别点赞和收藏按钮"已经完成。点赞和收藏按钮已被成功识别并点击，相关操作已执行完毕。

当前页面显示的是【绝区零】叶瞬光「暖霞拾光」的模型详情页，点赞数为1895，收藏数为1872，表明点赞和收藏操作已生效。

接下来可以继续执行后续任务（如下载模型等），或者根据需求进行其他操作。如果有新的任务，请告知！
任务6执行完成
AI响应:  根据当前屏幕截图分析，任务6/10"依次点击点赞和收藏按钮"已经完成。点赞数为1895，收藏数为1872，表明操作已生效。

接下来，我将按照任务要求上下滚动滚轮寻找下载模型按钮，并确保其完全显示以便OCR识别。

[TOOL:scroll_mouse,-100,1200,500]
工具 'scroll_mouse' 执行结果: 滚动成功: -100单位，位置: (1200, 500)
AI响应:  [TOOL:scroll_mouse,-150,1200,500]
工具 'scroll_mouse' 执行结果: 滚动成功: -150单位，位置: (1200, 500)
AI响应:  [TOOL:scroll_mouse,-200,1200,500]
工具 'scroll_mouse' 执行结果: 滚动成功: -200单位，位置: (1200, 500)
AI响应:  [TOOL:scroll_mouse,-100,1200,500]
工具 'scroll_mouse' 执行结果: 滚动成功: -100单位，位置: (1200, 500)
AI响应:  [TOOL:click_mouse,814,829,left,1,0.1]
[TASK_COMPLETED]
工具 'click_mouse' 执行结果: 点击成功: (814.0, 829.0), 类型: left, 次数: 1.0
AI响应:  [TOOL:click_mouse,814,829,left,1,0.1]
[TASK_COMPLETED]
工具 'click_mouse' 执行结果: 点击成功: (814.0, 829.0), 类型: left, 次数: 1.0
AI响应:  [TOOL:click_mouse,814,829,left,1,0.1]
[TASK_COMPLETED]
工具 'click_mouse' 执行结果: 点击成功: (814.0, 829.0), 类型: left, 次数: 1.0
AI响应:  [TOOL:click_mouse,814,829,left,1,0.1]
[TASK_COMPLETED]
工具 'click_mouse' 执行结果: 点击成功: (814.0, 829.0), 类型: left, 次数: 1.0
AI响应:  [TOOL:click_mouse,814,829,left,1,0.1]
[TASK_COMPLETED]
工具 'click_mouse' 执行结果: 点击成功: (814.0, 829.0), 类型: left, 次数: 1.0
AI响应:  [TOOL:click_mouse,814,829,left,1,0.1]
[TASK_COMPLETED]
工具 'click_mouse' 执行结果: 点击成功: (814.0, 829.0), 类型: left, 次数: 1.0
AI响应:  [TOOL:click_mouse,814,829,left,1,0.1]
[TASK_COMPLETED]
工具 'click_mouse' 执行结果: 点击成功: (814.0, 829.0), 类型: left, 次数: 1.0
任务7执行完成
AI响应:  根据当前屏幕截图分析，任务7/10"上下滚动滚轮寻找下载模型按钮"已经完成。页面中已清晰显示"下载模型"按钮，位于右侧区域，按钮文字为"下载模型"，并且按钮下方还有相关提示信息。

接下来，我将使用OCR识别"下载模型"按钮的具体位置，以便后续点击操作。

[TOOL:ocr_detect,下载模型]
工具 'ocr_detect' 执行结果: {
  "text": "下载模型",
  "target_text": "下载模型",
  "center_x": 1581,
  "center_y": 902
}
任务8执行完成
AI响应:  根据当前屏幕截图分析，任务8/10"使用OCR识别下载模型按钮"已经完成。OCR已成功识别出"下载模型"按钮的位置，其坐标为 (1581, 902)。

接下来，我将按照任务要求点击"下载模型"按钮，以完成最终的下载操作。

[TOOL:click_mouse,1581,902,left,1,0.1]
工具 'click_mouse' 执行结果: 点击成功: (1581.0, 902.0), 类型: left, 次数: 1.0
任务9执行完成
AI响应:  [TOOL:press_key,ctrl,j]
工具 'press_key' 执行结果: 按键成功: ctrl+j
任务10执行完成
工具 'press_key' 执行结果: 按下组合键成功: ctrl+j
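The [TOOL:name,arg1,arg2,...] lines in the log above can be split into a tool name and an argument list with one regular expression. The sketch below is a simplified illustration: `parse_tool_calls` is a hypothetical helper, and it splits arguments on every comma, so it does not handle quoted arguments that contain commas.

```python
import re
from typing import List, Tuple

# Group 1: tool name; group 2 (optional): raw comma-separated arguments.
TOOL_CALL = re.compile(r"\[TOOL:([^\],]+)(?:,([^\]]*))?\]")


def parse_tool_calls(text: str) -> List[Tuple[str, List[str]]]:
    """Return (tool_name, args) for every [TOOL:...] call found in text."""
    calls = []
    for name, raw_args in TOOL_CALL.findall(text):
        # An unmatched optional group comes back as an empty string.
        args = [a.strip() for a in raw_args.split(",")] if raw_args else []
        calls.append((name.strip(), args))
    return calls


if __name__ == "__main__":
    print(parse_tool_calls("[TOOL:click_mouse,1492,466,left,1,0.1]"))
    print(parse_tool_calls("[TOOL:detect_like_favorite]"))
```

All arguments are returned as strings; converting coordinates or counts to numbers is left to the tool being called.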


import tkinter as tk
from tkinter import scrolledtext, messagebox, ttk
import os
import subprocess
import re
import json
import sys
from pathlib import Path
from llm_class import VLMService  # assumes VLMService builds on LLMService
import pyautogui
from PIL import Image
import base64
from io import BytesIO

# Module-level paths
CURRENT_DIR = Path(__file__).parent

CONFIG_PATH = CURRENT_DIR / "config" / "web_tools_config.json"
KNOWLEDGE_FILE_PATH = CURRENT_DIR / "config" / "web_knowledge.txt"
OFTEN_USE_ORDER_PATH = CURRENT_DIR / "config" / "web_often_use_order.txt"
WORKFLOW_PATH = CURRENT_DIR / "config" / "web_workflow.txt"  # workflow definition file

def execute_python_script(script_path, *args):
    """
    Run the Python script at the given path (relative to the project root).
    """
    # Project root: one level above this script's directory
    current_dir = Path(__file__).parent.parent
    script_full_path = current_dir / script_path
    
    if not script_full_path.exists():
        return f"错误: 脚本 '{script_path}' 不存在"
    
    if script_full_path.suffix != '.py':
        return "错误: 文件必须是Python脚本 (.py文件)"
    
    try:
        # Build the command; the tool name is passed as the script's first argument
        cmd = [sys.executable, str(script_full_path)] + list(args)
       
        # Run the script and decode its output as UTF-8
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=30,
            cwd=str(current_dir),
            encoding='utf-8',      # decode stdout/stderr as UTF-8
            errors='replace'       # replace undecodable bytes instead of raising
        )
     
        if result.returncode == 0:
            return result.stdout.strip()
        else:
            return f"脚本执行失败: {result.stderr.strip()}"
     
    except subprocess.TimeoutExpired:
        return f"脚本执行超时: {script_path}"
    except Exception as e:
        return f"执行脚本时出错: {str(e)}"


def list_available_tools():
    """
    从配置文件中列出所有可用工具
    """
    if not CONFIG_PATH.exists():
        return []
    
    try:
        with open(CONFIG_PATH, 'r', encoding='utf-8') as f:
            config = json.load(f)
            return config.get('tools', [])
    except Exception as e:
        return []


def get_tool_by_name(tool_name):
    """
    根据工具名称获取工具信息
    """
    tools = list_available_tools()
    for tool in tools:
        if tool['name'] == tool_name:
            return tool
    return None


def execute_tool(tool_name, *args):
    """
    执行指定名称的工具
    """
    tool_info = get_tool_by_name(tool_name)
    if not tool_info:
        return f"错误: 未找到工具 '{tool_name}'"
    
    script_path = tool_info['path']
    result = execute_python_script(script_path, tool_name, *args)
   
    return result


def get_available_tools_info():
    """
    获取所有可用工具的信息
    """
    if not CONFIG_PATH.exists():
        return "错误: 工具配置文件不存在"
    
    try:
        with open(CONFIG_PATH, 'r', encoding='utf-8') as f:
            config = json.load(f)
            return config.get('tools', [])
    except Exception as e:
        return []


def get_tools_description():
    """获取工具描述，用于提供给VLM"""
    tools = get_available_tools_info()
    if not tools or isinstance(tools, str):  # 检查是否返回错误
        return "当前没有可用工具"
    
    tools_desc = "可用工具列表:\n"
    for tool in tools:
        name = tool.get('name', '未知工具')
        desc = tool.get('description', '无描述')
        params = tool.get('parameters', [])
        
        if params:
            param_desc = ", ".join([f"{p['name']}({p['type']})" for p in params])
            tools_desc += f"- {name}: {desc} (参数: {param_desc})\n"
        else:
            tools_desc += f"- {name}: {desc} (无参数)\n"
    
    tools_desc += "\n使用格式: [TOOL:工具名称,参数1,参数2,...]\n"


    return tools_desc


def image_to_base64(image_path):
    """将图像文件转换为base64编码"""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')


def is_task_completed(ai_response, should_check=True):
    """Return True if the response carries the subtask marker [TASK_COMPLETED].

    Pass should_check=False to skip the check (the function then returns False).
    """
    if not should_check or not ai_response:
        return False
    
    # Match an explicit end marker rather than generic keywords
    return "[TASK_COMPLETED]" in ai_response

def is_workflow_completed(ai_response):
    """Return True if the response carries the workflow marker [TOTAL_TASK_COMPLETED]."""
    return bool(ai_response) and "[TOTAL_TASK_COMPLETED]" in ai_response

def send_task_confirmation_to_ai(vlm_service, task_description, system_prompt, memory_file):
    """
    向AI发送任务完成确认消息
    """
    current_dir = os.path.dirname(os.path.abspath(__file__))
    
    # 截取当前屏幕
    screenshot = pyautogui.screenshot()
    screenshot_path = os.path.join(current_dir, "current_screen.png")
    screenshot.save(screenshot_path)
    
    # 准备消息列表
    messages = []
    
    # 添加系统提示（只包含工具信息）
    if system_prompt:
        messages.append({
            "role": "system",
            "content": system_prompt
        })
    
    # 读取 memory 内容并加入到用户消息中 - 只包含工具执行结果和AI响应
    memory_content = ""
    if os.path.exists(memory_file):
        with open(memory_file, 'r', encoding='utf-8') as f:
            content = f.read().strip()
        
        # 只保留工具执行结果和AI响应，过滤掉工作流程定义
        lines = content.split('\n')
        filtered_lines = []
        for line in lines:
            # 保留工具执行结果和AI响应，过滤掉工作流程定义和任务状态
            if not (line.startswith("工作流程定义:") or 
                    line.startswith("执行历史:") or
                    (line.startswith("任务") and " - " in line and ("待完成" in line or "已完成" in line or "待确定" in line))):
                filtered_lines.append(line)
        
        if filtered_lines:
            memory_content = "历史执行记录:\n" + "\n".join(filtered_lines) + "\n"
    
    # 构建确认消息
    user_message = f"{memory_content}任务 '{task_description}' 已确认完成，请继续下一个任务或输出 [TOTAL_TASK_COMPLETED] 表示整个流程完成\n"
    
    # Build the image part of the message (the screenshot is saved as PNG,
    # so the data URL declares image/png)
    image_content = {
        "type": "image_url",
        "image_url": {
            "url": f"data:image/png;base64,{image_to_base64(screenshot_path)}"
        }
    }
    
    text_content = {
        "type": "text",
        "text": user_message
    }
    
    # 构建用户消息，包含文本和图像
    messages.append({
        "role": "user",
        "content": [text_content, image_content]
    })
    
    try:
        # 调用LLM服务（模拟VLM功能）
        result = vlm_service.create_with_image(messages)
        ai_response = result['choices'][0]['message']['content']
       
        print(f"AI确认响应: {ai_response}")
        
        # 将确认响应写入记忆文件
        with open(memory_file, 'a', encoding='utf-8') as f:
            f.write(f"AI确认响应: {ai_response}\n")
        
        return ai_response
    except Exception as e:
        error_msg = f"发送确认消息时出错: {str(e)}"
        print(error_msg)
        return None

def vision_task_loop(task_description, knowledge_file=None, memory_file=None, workflow_state=None, reset_first_iteration=True):
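    """
    Vision-based task loop (a generator): each round takes a screenshot,
    queries the VLM, yields its reply, and runs the tool calls it contains.
    The loop can keep running after a task is reported complete.
    """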
    current_dir = os.path.dirname(os.path.abspath(__file__))
    if knowledge_file is None:
        knowledge_file = KNOWLEDGE_FILE_PATH  # 使用全局变量
    if memory_file is None:
        memory_file = os.path.join(current_dir, "memory.txt")
    
    # 创建LLM服务实例（模拟VLM）
    vlm_service = VLMService()
    
    # 读取固定知识
    system_prompt_parts = []
    
    # 添加可用工具信息到系统提示
    tools_description = get_tools_description()
    system_prompt_parts.append(f"可用工具信息:\n{tools_description}")
    
    # 添加固定知识
    if os.path.exists(knowledge_file):
        with open(knowledge_file, 'r', encoding='utf-8') as f:
            knowledge_content = f.read()
            if knowledge_content.strip():
                system_prompt_parts.append(f"重要知识:\n{knowledge_content}")
    
    iteration_count = 0
    max_iterations = 50  # 设置最大迭代次数，防止无限循环
    first_iteration = reset_first_iteration  # 使用参数来决定是否重置首次迭代标志
    
    # 记录之前的AI响应，用于检测重复行为
    previous_ai_response = ""
    previous_tool_result = ""
    
    while iteration_count < max_iterations:
        iteration_count += 1
        
        # 构建系统提示（只包含工具信息和固定知识）
        system_prompt = "\n".join(system_prompt_parts)
        
        # 截取当前屏幕
        screenshot = pyautogui.screenshot()
        screenshot_path = os.path.join(current_dir, "current_screen.png")
        screenshot.save(screenshot_path)
        
        # 准备消息列表
        messages = []
        
        # 添加系统提示（只包含工具信息）
        if system_prompt:
            messages.append({
                "role": "system",
                "content": system_prompt
            })
        
        # 读取 memory 内容并加入到用户消息中 - 只包含工具执行结果和AI响应
        memory_content = ""
        if os.path.exists(memory_file):
            with open(memory_file, 'r', encoding='utf-8') as f:
                content = f.read().strip()
            
            # 只保留工具执行结果和AI响应，过滤掉工作流程定义
            lines = content.split('\n')
            filtered_lines = []
            for line in lines:
                # 保留工具执行结果和AI响应，过滤掉工作流程定义和任务状态
                if not (line.startswith("工作流程定义:") or 
                        line.startswith("执行历史:") or
                        (line.startswith("任务") and " - " in line and ("待完成" in line or "已完成" in line or "待确定" in line))):
                    filtered_lines.append(line)
            
            if filtered_lines:
                memory_content = "历史执行记录:\n" + "\n".join(filtered_lines) + "\n"
        
        # 添加任务描述和当前截图信息
        user_message = f"当前任务: {task_description}\n"
        
        # 如果有 memory 内容，添加到用户消息中
        if memory_content:
            user_message += f"{memory_content}\n"
        
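        # From the second iteration on, ask the model to confirm completion
        if iteration_count > 1:
            user_message += "请确认任务是否已完成，若完成请输出 [TASK_COMPLETED] 标记当前步骤完成\n"
            # Inquiry phase: the completion marker is recognized
            is_inquiry_phase = True
        else:
            # First iteration: just ask the model to start executing the task
            user_message += "请分析当前屏幕截图，并开始执行任务。"
            # Execution phase: the completion marker is ignored
            is_inquiry_phase = False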
        
        print(f"user_message: 【{user_message}】")
        
        # Build the image part of the message (the screenshot is saved as PNG,
        # so the data URL declares image/png)
        image_content = {
            "type": "image_url",
            "image_url": {
                "url": f"data:image/png;base64,{image_to_base64(screenshot_path)}"
            }
        }
        
        text_content = {
            "type": "text",
            "text": user_message
        }
        
        # 构建用户消息，包含文本和图像
        messages.append({
            "role": "user",
            "content": [text_content, image_content]
        })
        
        try:
            # 调用LLM服务（模拟VLM功能）
            result = vlm_service.create_with_image(messages)  # 不传递图像路径，因为已经在消息中包含
            ai_response = result['choices'][0]['message']['content']
            
            print(f"LLM服务返回: 【{ai_response}】")
          
            
            # 先显示AI分析
            yield f" {ai_response}"

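            # Run the tool calls in the response; pass the phase through so the
            # completion marker is honored only in the inquiry phase
            tool_execution_result = process_tool_calls(ai_response, memory_file, workflow_state, is_inquiry_phase=is_inquiry_phase)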
            
            # 检查是否包含完成标记 - 重要修改：只在完成整个工作流程时退出
            if is_workflow_completed(ai_response or "") and is_inquiry_phase:
                yield "工作流程已完成，退出循环"
                break  # 只有在完成整个工作流程时才退出循环
            
            # Task completion is only recognized in the inquiry phase
            if is_inquiry_phase and is_task_completed(ai_response) and not has_tool_calls(ai_response):
                # Check whether workflow_state already marks this task as completed
                # (workflow_state defaults to None, so fall back to an empty list)
                task_completed_in_state = False
                for step, completed in (workflow_state or []):
                    if step and task_description in step and completed is True:
                        task_completed_in_state = True
                        break
                
                if task_completed_in_state:
                    # 向AI发送任务完成确认消息
                    confirmation_response = send_task_confirmation_to_ai(
                        vlm_service, 
                        task_description, 
                        system_prompt, 
                        memory_file
                    )
                    if confirmation_response:
                        yield f"向AI发送任务完成确认: {confirmation_response}"
                    
                    yield "当前任务已完成，退出循环"
                    break
            
            # 更新历史记录
            previous_ai_response = ai_response
            previous_tool_result = tool_execution_result or ""
            
            # 更新标志，表示不再是第一次迭代
            first_iteration = False

        except Exception as e:
            error_msg = f"执行任务时出错: {str(e)}"
            yield error_msg
            break
    
    if iteration_count >= max_iterations:
        yield "达到最大迭代次数，停止任务执行"

def has_tool_calls(response_text):
    """Return True if the response contains at least one [TOOL:...] call."""
    tool_pattern = r'\[TOOL:([^\],]+)(?:,([^\]]*))?\]'
    return re.search(tool_pattern, response_text) is not None

def get_workflow_state_from_memory_in_app(memory_file_path):
    """从记忆文件中提取工作流程状态"""
    saved_state = []
    if os.path.exists(memory_file_path):
        try:
            with open(memory_file_path, 'r', encoding='utf-8') as f:
                content = f.read()
            
            lines = content.split('\n')
            in_workflow_section = False
            
            for line in lines:
                line = line.strip()
                
                if line == "工作流程定义:":
                    in_workflow_section = True
                    continue
                elif line == "执行历史:":
                    in_workflow_section = False
                    continue
                elif in_workflow_section and line.startswith("任务") and " - " in line:
                    # 解析格式如 "任务1: 打开模之屋 - 已完成"
                    try:
                        parts = line.split(" - ")
                        if len(parts) >= 2:
                            step_info = parts[0]  # "任务1: 打开模之屋"
                            status = parts[1]     # "已完成" 或 "待完成" 或 "待确定"
                            
                            # 提取任务号
                            step_match = re.search(r'任务(\d+):', step_info)
                            if step_match:
                                step_num = int(step_match.group(1))
                                if status == "已完成":
                                    completed = True
                                elif status == "待确定":
                                    completed = "pending_verification"
                                else:
                                    completed = False
                                
                                # 确保列表长度足够
                                while len(saved_state) < step_num:
                                    saved_state.append((None, False))
                                
                                # 更新对应位置的完成状态
                                step_desc = step_info.split(":", 1)[1].strip()
                                saved_state[step_num - 1] = (step_desc, completed)
                    except Exception as e:
                        print(f"解析记忆中的任务行出错: {line}, 错误: {str(e)}")
                        continue
        except Exception as e:
            print(f"从记忆文件读取工作流程状态失败: {str(e)}")
    
    return saved_state

def process_tool_calls(response_text, memory_file_path=None, workflow_state_ref=None, is_inquiry_phase=False):
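    """
    Parse and execute the tool calls in an AI response.
    Supported format: [TOOL:tool_name,arg1,arg2,arg3...]
    Returns a dict with the joined tool results and the two completion flags.
    """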
    # 只在质询阶段检测任务完成标记，执行阶段不检测
    task_completed = is_task_completed(response_text) if is_inquiry_phase else False
    workflow_completed = is_workflow_completed(response_text) if is_inquiry_phase else False
    
    if task_completed and memory_file_path and is_inquiry_phase:
        with open(memory_file_path, 'a', encoding='utf-8') as f:
            f.write(f"[TASK_COMPLETED]\n")
    
    # 检测工作流程完成标记
    if workflow_completed and memory_file_path and is_inquiry_phase:
        with open(memory_file_path, 'a', encoding='utf-8') as f:
            f.write(f"[TOTAL_TASK_COMPLETED]\n")
    
    # 显示完成标记检测
    if task_completed and is_inquiry_phase:
        print("检测到任务完成标记: [TASK_COMPLETED]")
    if workflow_completed and is_inquiry_phase:
        print("检测到工作流程完成标记: [TOTAL_TASK_COMPLETED]")
    
    # Capture the tool name (group 1) and the raw comma-separated argument string (group 2)
    tool_pattern = r'\[TOOL:([^\],]+)(?:,([^\]]*))?\]'
    matches = re.findall(tool_pattern, response_text)
    
    all_results = []
    if not matches: 
        # 只在响应中包含工具调用格式但未找到匹配时才输出，而不是所有情况
        if '[TOOL:' in response_text:
            print("未找到工具调用指令")
    for match in matches:
        tool_name = match[0]
        tool_args_str = match[1]  # 包含所有参数的字符串，可能为空

        # 验证工具是否存在
        tools = get_available_tools_info()
        if not tools or isinstance(tools, str):  # 检查是否返回错误
            all_results.append(f"工具 '{tool_name}' 执行失败: 无法获取工具列表")
            continue
            
        tool_exists = any(tool['name'] == tool_name for tool in tools)
        if not tool_exists:
            all_results.append(f"工具 '{tool_name}' 执行失败: 工具不存在")
            continue
        
        # 解析参数，处理带引号的参数值（如果存在参数）
        tool_args = []
        if tool_args_str:  # 如果有参数
            current_arg = ""
            inside_quotes = False
            quote_char = None
            
            i = 0
            while i < len(tool_args_str):
                char = tool_args_str[i]
                
                if char in ['"', "'"] and not inside_quotes:
                    # 开始引号
                    inside_quotes = True
                    quote_char = char
                elif char == quote_char and inside_quotes:
                    # 结束引号
                    inside_quotes = False
                    quote_char = None
                elif char == ',' and not inside_quotes:
                    # 参数分隔符，不在引号内
                    tool_args.append(current_arg.strip())
                    current_arg = ""
                else:
                    current_arg += char
                i += 1
            
            # 添加最后一个参数
            if current_arg:
                tool_args.append(current_arg.strip())
       
        result = execute_tool(tool_name, *tool_args) if tool_args else execute_tool(tool_name)
        
        # 处理执行结果为None的情况
        if result is None:
            result = "工具执行结果为空"
        
        # 将工具执行结果写入记忆文件，这样AI可以看到
        if memory_file_path:
            try:
                # 在工具执行结果中移除完成标记，避免在执行阶段被误识别
                clean_result = result.replace("[TASK_COMPLETED]", "").replace("[TOTAL_TASK_COMPLETED]", "").strip()
                with open(memory_file_path, 'a', encoding='utf-8') as f:
                    f.write(f"工具 '{tool_name}' 执行结果: {clean_result}\n")
            except Exception as e:
                print(f"写入记忆文件失败: {e}")
        
        print(f"工具 '{tool_name}' 执行结果: {result}")
        
        all_results.append(f"工具 '{tool_name}' 执行结果: {result}")
    
    # 如果检测到任务完成标记，更新工作流程状态（只在质询阶段）
    if task_completed and workflow_state_ref and is_inquiry_phase:
        # 查找当前任务并更新其状态为已完成
        for i, (step, completed) in enumerate(workflow_state_ref):
            # 简单的匹配：如果响应文本中包含任务描述，则认为是当前任务
            if step in response_text or response_text.strip().startswith(step[:min(len(step), 20)]):
                workflow_state_ref[i] = (step, True)
                break
    
    # 返回工具执行结果和完成状态 - 注意这里只返回完成状态，不直接退出
    return {
        "results": "\n".join(all_results) if all_results else None,
        "task_completed": task_completed,
        "workflow_completed": workflow_completed
    }

def parse_history_content(content):
    """
    解析历史对话内容，转换为messages格式
    """
    messages = []
    lines = content.strip().split('\n')
    
    current_role = None
    current_content = []
    
    for line in lines:
        line = line.strip()
        if line.startswith('用户:'):
            if current_role and current_content:
                messages.append({
                    "role": current_role,
                    "content": '\n'.join(current_content).strip()
                })
            
            current_role = "user"
            current_content = [line[3:].strip()]
        elif line.startswith('AI:'):
            if current_role and current_content:
                messages.append({
                    "role": current_role,
                    "content": '\n'.join(current_content).strip()
                })
            
            current_role = "assistant"
            current_content = [line[3:].strip()]
        elif line == "" and current_content:
            if current_role and current_content:
                messages.append({
                    "role": current_role,
                    "content": '\n'.join(current_content).strip()
                })
            current_role = None
            current_content = []
        elif current_role:
            current_content.append(line)
    
    if current_role and current_content:
        messages.append({
            "role": current_role,
            "content": '\n'.join(current_content).strip()
        })
    
    return messages


class VLMTaskApp:
    def __init__(self, root):
        self.root = root
        self.root.title("VLM任务执行器")
        # 修改窗口大小 - 调整为更窄的尺寸
        self.root.geometry("400x200")
        self.root.attributes('-topmost', True)  # 设置窗口置顶
        
        # 任务执行标志
        self.is_executing = False 
        self.workflow_state = []  # 工作流程状态
        self.current_executing_step = -1  # 当前正在执行的步骤索引
        self.current_page_index = 0  # 当前显示的页面索引
        
        # 创建界面
        self.setup_ui()
        
        # 文件路径
        current_dir = os.path.dirname(os.path.abspath(__file__))
        self.knowledge_file = KNOWLEDGE_FILE_PATH  # 使用全局变量
        self.memory_file = os.path.join(current_dir, "memory.txt")
        
        # 首先加载工作流程（从工作流文件和记忆文件获取状态）
        self.load_workflow_content()
        
    def setup_ui(self):
        # 主容器
        main_frame = tk.Frame(self.root)
        main_frame.pack(fill=tk.BOTH, expand=True, padx=10, pady=10)

        # 任务标题
        self.title_label = tk.Label(
            main_frame,
            text="",
            font=("Arial", 14, "bold"),
            anchor="w"
        )
        self.title_label.pack(fill=tk.X, pady=(0, 5))

        # 任务描述
        self.desc_label = tk.Label(
            main_frame,
            text="",
            font=("Arial", 11),
            wraplength=350,
            justify=tk.LEFT,
            anchor="nw"
        )
        self.desc_label.pack(fill=tk.X, pady=(0, 10))

        # 控制按钮区域 - 第一行：上一页、页码、下一页
        control_frame1 = tk.Frame(self.root)
        control_frame1.pack(fill=tk.X, padx=10, pady=5)

        self.prev_button = tk.Button(
            control_frame1,
            text="上一页",
            command=self.prev_page,
            state=tk.DISABLED
        )
        self.prev_button.pack(side=tk.LEFT, padx=(0, 5))

        self.page_label = tk.Label(
            control_frame1,
            text="第 0/0 页",
            width=15
        )
        self.page_label.pack(side=tk.LEFT, padx=(0, 5))

        self.next_button = tk.Button(
            control_frame1,
            text="下一页",
            command=self.next_page,
            state=tk.DISABLED
        )
        self.next_button.pack(side=tk.LEFT, padx=(0, 5))

        # 控制按钮区域 - 第二行：执行所有任务、执行当前任务、停止、清除记忆
        control_frame2 = tk.Frame(self.root)
        control_frame2.pack(fill=tk.X, padx=10, pady=5)

        self.run_all_button = tk.Button(
            control_frame2,
            text="执行所有任务",
            command=self.run_all_tasks  # 新增按钮：执行所有任务
        )
        self.run_all_button.pack(side=tk.LEFT, padx=(0, 5))

        self.run_current_button = tk.Button(
            control_frame2,
            text="执行当前任务",
            command=self.run_current_task
        )
        self.run_current_button.pack(side=tk.LEFT, padx=(0, 5))

        self.stop_button = tk.Button(
            control_frame2,
            text="停止",
            command=self.stop_all_tasks,
            state=tk.DISABLED
        )
        self.stop_button.pack(side=tk.LEFT, padx=(0, 5))

        self.clear_memory_button = tk.Button(
            control_frame2,
            text="清除记忆",
            command=self.clear_short_term_memory
        )
        self.clear_memory_button.pack(side=tk.LEFT, padx=(0, 5))

        # 任务状态标签
        self.status_label = tk.Label(self.root, text="状态: 等待任务开始", bd=1, relief=tk.SUNKEN, anchor=tk.W)
        self.status_label.pack(side=tk.BOTTOM, fill=tk.X)
        
    def load_workflow_content(self):
        """加载并显示工作流程内容"""
        # 首先尝试从记忆文件中获取步骤完成状态
        saved_state = self.get_workflow_state_from_memory()
        
        if WORKFLOW_PATH.exists():
            try:
                with open(WORKFLOW_PATH, 'r', encoding='utf-8') as f:
                    content = f.read()
                
                # 分割工作流程为独立步骤
                steps = [step.strip() for step in content.split('\n') if step.strip()]
                
                # 初始化工作流程状态，优先使用记忆文件中的状态
                self.workflow_state = []
                for i, step in enumerate(steps):
                    completed = False
                    # 检查记忆文件中是否有该步骤的完成状态
                    if i < len(saved_state):
                        _, saved_completed = saved_state[i]
                        completed = saved_completed
                    self.workflow_state.append((step, completed))
                
                # 更新页面导航
                self.update_page_navigation()
                
                # 显示第一页
                if self.workflow_state:
                    self.update_task_display()
                
            except Exception as e:
                print(f"加载工作流程失败: {str(e)}")
        else:
            print(f"工作流程文件不存在: {WORKFLOW_PATH}")

    def update_task_display(self):
        """更新当前任务显示"""
        if not self.workflow_state or self.current_page_index >= len(self.workflow_state):
            return
            
        step, completed = self.workflow_state[self.current_page_index]
        
        # 更新标题
        if completed == True:
            status_text = "已完成"
            status_color = "green"
        elif completed == "pending_verification":
            status_text = "待确定"
            status_color = "orange"
        else:
            status_text = "待完成"
            status_color = "red"
            
        self.title_label.config(
            text=f"任务 {self.current_page_index + 1}/{len(self.workflow_state)} - {status_text}",
            fg=status_color
        )
        
        # 更新描述
        self.desc_label.config(text=step)

    def update_page_navigation(self):
        """更新页面导航按钮状态"""
        total_pages = len(self.workflow_state)
        
        # 更新按钮状态
        self.prev_button.config(state=tk.NORMAL if self.current_page_index > 0 else tk.DISABLED)
        self.next_button.config(state=tk.NORMAL if self.current_page_index < total_pages - 1 else tk.DISABLED)
        
        # 更新页面标签
        self.page_label.config(text=f"第 {self.current_page_index + 1}/{total_pages} 页")

    def update_page_label(self):
        """更新页面标签"""
        total_pages = len(self.workflow_state)
        self.page_label.config(text=f"第 {self.current_page_index + 1}/{total_pages} 页")

    def prev_page(self):
        """上一页"""
        if self.current_page_index > 0:
            self.current_page_index -= 1
            self.update_task_display()
            self.update_page_navigation()

    def next_page(self):
        """下一页"""
        if self.current_page_index < len(self.workflow_state) - 1:
            self.current_page_index += 1
            self.update_task_display()
            self.update_page_navigation()

    def run_current_task(self):
        """Run the task shown on the current page."""
        if self.is_executing:
            messagebox.showwarning("Warning", "A task is already running; please wait for it to finish")
            return

        if not self.workflow_state:
            messagebox.showwarning("Warning", "There are no tasks to run")
            return

        current_task_index = self.current_page_index
        if current_task_index >= len(self.workflow_state):
            messagebox.showwarning("Warning", "The current page is outside the task range")
            return

        task_step, completed = self.workflow_state[current_task_index]

        # `completed` is True, False, or the string "pending_verification"
        if completed is True:
            messagebox.showinfo("Info", f"Task {current_task_index + 1} is already completed; no need to run it again")
            return

        self.is_executing = True
        self.run_current_button.config(state=tk.DISABLED)
        self.run_all_button.config(state=tk.DISABLED)
        self.stop_button.config(state=tk.NORMAL)
        self.update_status(f"Status: running task {current_task_index + 1}: {task_step}")

        # Run the task in a new thread so the UI does not freeze
        import threading
        task_thread = threading.Thread(
            target=self.execute_single_task,
            args=(current_task_index,),
            daemon=True
        )
        task_thread.start()

    def run_all_tasks(self):
        """Run all tasks."""
        if self.is_executing:
            messagebox.showwarning("Warning", "A task is already running; please wait for it to finish")
            return

        if not self.workflow_state:
            messagebox.showwarning("Warning", "There are no tasks to run")
            return

        self.is_executing = True
        self.run_current_button.config(state=tk.DISABLED)
        self.run_all_button.config(state=tk.DISABLED)
        self.stop_button.config(state=tk.NORMAL)
        self.update_status("Status: running all tasks...")

        # Run all tasks in a new thread so the UI does not freeze
        import threading
        task_thread = threading.Thread(
            target=self.execute_all_tasks,
            daemon=True
        )
        task_thread.start()

    def execute_all_tasks(self):
        """Run all tasks automatically, in order."""
        final_status = "Status: all tasks finished"
        try:
            for task_index in range(len(self.workflow_state)):
                # Stop if execution was cancelled
                if not self.is_executing:
                    print("System: tasks stopped manually")
                    final_status = "Status: tasks stopped"
                    return

                task_step, completed = self.workflow_state[task_index]

                # Skip tasks that are already completed
                if completed is True:
                    print(f"Skipping completed task {task_index + 1}: {task_step}")
                    continue

                print(f"Starting task {task_index + 1}: {task_step}")

                # Show the page of the task being run
                self.root.after(0, lambda idx=task_index: self.set_current_page(idx))

                # Set when the model reports that the whole workflow is done
                workflow_finished = False

                # Iterate over the outputs yielded by vision_task_loop
                for output in vision_task_loop(
                    task_step,
                    self.knowledge_file,
                    self.memory_file,
                    self.workflow_state,
                    reset_first_iteration=True
                ):
                    if not self.is_executing:
                        print("System: tasks stopped manually")
                        final_status = "Status: tasks stopped"
                        return

                    # A completion marker only counts when the output contains no tool call
                    has_tool_call = "[TOOL:" in output
                    if "[TOTAL_TASK_COMPLETED]" in output and not has_tool_call:
                        print("Workflow fully completed")
                        # Mark this step as completed; the workflow ends here
                        self.root.after(0, lambda idx=task_index: self.mark_step_as_completed_and_finish_workflow(idx))
                        workflow_finished = True
                        break
                    elif "[TASK_COMPLETED]" in output and not has_tool_call:
                        # Subtask marker: mark as completed directly, not as pending verification
                        print(f"Task {task_index + 1} result: {output}")
                        self.root.after(0, lambda idx=task_index: self.mark_step_as_completed(idx))
                        # Leave the inner loop and move on to the next task
                        break
                    else:
                        print(f"Task {task_index + 1} result: {output}")

                        # Append the output to the memory file ("AI响应" is the on-disk prefix)
                        with open(self.memory_file, 'a', encoding='utf-8') as f:
                            f.write(f"AI响应: {output}\n")

                # The marker output itself is never stored, so rely on the flag
                # instead of searching the accumulated text for the marker
                if workflow_finished:
                    break

        except Exception as e:
            print(f"System: error while running tasks: {str(e)}")
            final_status = "Status: task execution failed"
        finally:
            self.is_executing = False
            self.root.after(0, lambda: self.run_current_button.config(state=tk.NORMAL))
            self.root.after(0, lambda: self.run_all_button.config(state=tk.NORMAL))
            self.root.after(0, lambda: self.stop_button.config(state=tk.DISABLED))
            self.root.after(0, lambda: self.update_status(final_status))

    def execute_single_task(self, task_index):
        """Run a single task."""
        final_status = "Status: task finished"
        try:
            task_step, completed = self.workflow_state[task_index]

            if completed is True:
                print(f"Task {task_index + 1} is already completed")
                return

            # Iterate over the outputs yielded by vision_task_loop
            for output in vision_task_loop(
                task_step,
                self.knowledge_file,
                self.memory_file,
                self.workflow_state,
                reset_first_iteration=True
            ):
                if not self.is_executing:
                    print("System: task stopped manually")
                    final_status = "Status: task stopped"
                    return

                # A completion marker only counts when the output contains no tool call
                has_tool_call = "[TOOL:" in output
                if "[TOTAL_TASK_COMPLETED]" in output and not has_tool_call:
                    print("Workflow fully completed")
                    # Mark this step as completed; the workflow ends here
                    self.root.after(0, lambda idx=task_index: self.mark_step_as_completed_and_finish_workflow(idx))
                    break
                elif "[TASK_COMPLETED]" in output and not has_tool_call:
                    # Subtask marker: mark as completed directly, not as pending verification
                    print(f"Task {task_index + 1} result: {output}")
                    self.root.after(0, lambda idx=task_index: self.mark_step_as_completed(idx))
                    # Leave the subtask once the marker arrives, as execute_all_tasks does
                    break
                else:
                    print(f"Task {task_index + 1} result: {output}")

                    # Append the output to the memory file ("AI响应" is the on-disk prefix)
                    with open(self.memory_file, 'a', encoding='utf-8') as f:
                        f.write(f"AI响应: {output}\n")

        except Exception as e:
            print(f"System: error while running the task: {str(e)}")
            final_status = "Status: task execution failed"
        finally:
            self.is_executing = False
            self.root.after(0, lambda: self.run_current_button.config(state=tk.NORMAL))
            self.root.after(0, lambda: self.run_all_button.config(state=tk.NORMAL))
            self.root.after(0, lambda: self.stop_button.config(state=tk.DISABLED))
            self.root.after(0, lambda: self.update_status(final_status))

    def set_current_page(self, page_index):
        """Set the current page index and refresh the display."""
        self.current_page_index = page_index
        self.update_task_display()
        self.update_page_navigation()

    def mark_step_as_completed_and_finish_workflow(self, index):
        """Mark the step as completed and finish the whole workflow."""
        if 0 <= index < len(self.workflow_state):
            # Set the task status to completed
            self.workflow_state[index] = (self.workflow_state[index][0], True)

            # Persist the state
            self.save_workflow_state()

            # Refresh the display if the completed step is the one on screen
            if index == self.current_page_index:
                self.update_task_display()

    def mark_step_as_completed(self, index):
        """Mark the step as completed."""
        if 0 <= index < len(self.workflow_state):
            # Set the task status to completed
            self.workflow_state[index] = (self.workflow_state[index][0], True)

            # Persist the state
            self.save_workflow_state()

            # Refresh the display if the completed step is the one on screen
            if index == self.current_page_index:
                self.update_task_display()

            # Record the completion in the memory file (kept in the file's own format)
            with open(self.memory_file, 'a', encoding='utf-8') as f:
                f.write(f"任务{index+1}执行完成\n")

    def mark_step_as_pending_verification(self, index):
        """Mark the step as pending verification (it does not change the page)."""
        if 0 <= index < len(self.workflow_state):
            # Set the status to pending verification rather than completed
            self.workflow_state[index] = (self.workflow_state[index][0], "pending_verification")

            # Persist the state
            self.save_workflow_state()

            # Refresh the display if the affected step is the one on screen
            if index == self.current_page_index:
                self.update_task_display()

            # Record in the memory file that the step ran and awaits confirmation
            with open(self.memory_file, 'a', encoding='utf-8') as f:
                f.write(f"任务{index+1} 执行完成，等待确认: {self.workflow_state[index][0]}\n")

    def confirm_task_completed(self, task_index):
        """Confirm that a pending task is really completed."""
        if 0 <= task_index < len(self.workflow_state):
            task_desc, status = self.workflow_state[task_index]
            if status == "pending_verification":
                self.workflow_state[task_index] = (task_desc, True)
                self.save_workflow_state()

                # Refresh the display if the confirmed step is the one on screen
                if task_index == self.current_page_index:
                    self.update_task_display()

                # Record the confirmation in the memory file
                with open(self.memory_file, 'a', encoding='utf-8') as f:
                    f.write(f"任务{task_index+1} 确认完成: {task_desc}\n")

    def stop_all_tasks(self):
        """Stop all running tasks."""
        # The worker thread checks this flag between outputs and then returns
        self.is_executing = False
        self.run_current_button.config(state=tk.NORMAL)
        self.run_all_button.config(state=tk.NORMAL)
        self.stop_button.config(state=tk.DISABLED)
        self.update_status("Status: tasks stopped")

    def get_workflow_state_from_memory(self):
        """Extract the workflow state from the memory file."""
        saved_state = []
        if os.path.exists(self.memory_file):
            try:
                with open(self.memory_file, 'r', encoding='utf-8') as f:
                    content = f.read()

                lines = content.split('\n')
                in_workflow_section = False

                # The section headers and status words below are the on-disk
                # format written by save_workflow_state, so they stay in Chinese.
                for line in lines:
                    line = line.strip()

                    if line == "工作流程定义:":
                        in_workflow_section = True
                        continue
                    elif line == "执行历史:":
                        in_workflow_section = False
                        continue
                    elif in_workflow_section and line.startswith("任务") and " - " in line:
                        # Parse lines such as "任务1: 打开模之屋 - 已完成"
                        try:
                            # Split on the last " - " so a description containing " - " stays intact
                            step_info, status = line.rsplit(" - ", 1)

                            # Extract the task number
                            step_match = re.search(r'任务(\d+):', step_info)
                            if step_match:
                                step_num = int(step_match.group(1))
                                if status == "已完成":
                                    completed = True
                                elif status == "待确定":
                                    completed = "pending_verification"
                                else:
                                    completed = False

                                # Pad the list so that index step_num - 1 exists
                                while len(saved_state) < step_num:
                                    saved_state.append((None, False))

                                # Store the description and status at the task's position
                                step_desc = step_info.split(":", 1)[1].strip()
                                saved_state[step_num - 1] = (step_desc, completed)
                        except Exception as e:
                            print(f"Failed to parse task line from memory: {line}, error: {str(e)}")
                            continue
            except Exception as e:
                print(f"Failed to read the workflow state from the memory file: {str(e)}")

        return saved_state

    def update_status(self, status_text):
        """Update the status bar."""
        self.status_label.config(text=status_text)

    def clear_short_term_memory(self):
        """Manually clear the short-term memory."""
        if os.path.exists(self.memory_file):
            try:
                # Keep the workflow definition and clear only the execution history
                with open(WORKFLOW_PATH, 'r', encoding='utf-8') as f:
                    workflow_content = f.read()

                # Rewrite the memory file with the workflow definition only
                # (headers and status words are the on-disk format and stay in Chinese)
                with open(self.memory_file, 'w', encoding='utf-8') as f:
                    f.write("工作流程定义:\n")
                    steps = [step.strip() for step in workflow_content.split('\n') if step.strip()]
                    for i, step in enumerate(steps):
                        f.write(f"任务{i+1}: {step} - 待完成\n")
                    f.write("\n执行历史:\n")

                # Reset every step to not completed
                for i in range(len(self.workflow_state)):
                    self.workflow_state[i] = (self.workflow_state[i][0], False)

                # Go back to the first page and refresh the display
                self.current_page_index = 0
                self.update_task_display()
                self.update_page_navigation()

                print("Short-term memory cleared manually; all tasks reset to not completed")
            except Exception as e:
                messagebox.showerror("Error", f"Failed to clear short-term memory: {str(e)}")
        else:
            print("The short-term memory file does not exist")

    def save_workflow_state(self):
        """Save the workflow state to the memory file."""
        try:
            # Read the current content of the memory file
            current_content = ""
            if os.path.exists(self.memory_file):
                with open(self.memory_file, 'r', encoding='utf-8') as f:
                    current_content = f.read()

            # Keep everything from the "执行历史:" header onward; the workflow
            # definition above it is regenerated from self.workflow_state
            history_lines = []
            in_history = False
            for line in current_content.split('\n'):
                if line.strip() == "执行历史:":
                    in_history = True
                if in_history:
                    history_lines.append(line)

            # Make sure the history header exists even for a new or empty file
            if not history_lines:
                history_lines = ["执行历史:"]

            # Rebuild the workflow definition with the current statuses
            updated_workflow_lines = ["工作流程定义:"]
            for i, (step, completed) in enumerate(self.workflow_state):
                if completed is True:
                    status = "已完成"
                elif completed == "pending_verification":
                    status = "待确定"
                else:
                    status = "待完成"
                updated_workflow_lines.append(f"任务{i+1}: {step} - {status}")

            # Merge both parts and write them back
            with open(self.memory_file, 'w', encoding='utf-8') as f:
                f.write('\n'.join(updated_workflow_lines))
                f.write('\n\n')
                f.write('\n'.join(history_lines))
        except Exception as e:
            print(f"Failed to save the workflow state: {str(e)}")

def main():
    root = tk.Tk()
    app = VLMTaskApp(root)
    root.mainloop()


if __name__ == "__main__":
    main()
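
The marker handling inside `execute_all_tasks` and `execute_single_task` can be pulled out into a small pure function, which makes the three cases (workflow finished, subtask finished, keep going) testable without a GUI or a model. What follows is only a minimal sketch under stated assumptions, not the application's actual code: `classify_output`, `run_task` and the sample outputs are hypothetical names introduced for illustration. Only the marker strings `[TOTAL_TASK_COMPLETED]`, `[TASK_COMPLETED]` and `[TOOL:` come from the listing above.

```python
from typing import Iterable, List, Tuple

# Marker strings taken from the listing above
TOTAL_DONE = "[TOTAL_TASK_COMPLETED]"
TASK_DONE = "[TASK_COMPLETED]"
TOOL_PREFIX = "[TOOL:"


def classify_output(output: str) -> str:
    """Return 'workflow_done', 'task_done' or 'continue' for one model output.

    Hypothetical helper: it mirrors the rule used in the loops above, where a
    completion marker is ignored if the same output also contains a tool call.
    """
    if TOOL_PREFIX in output:
        return "continue"
    if TOTAL_DONE in output:
        return "workflow_done"
    if TASK_DONE in output:
        return "task_done"
    return "continue"


def run_task(outputs: Iterable[str]) -> Tuple[str, List[str]]:
    """Consume outputs until a completion marker appears.

    Returns the final state plus the outputs that would be appended to the
    memory file. The marker output itself is not stored, which is why the
    caller has to look at the returned state and not at the stored text.
    """
    history: List[str] = []
    for output in outputs:
        state = classify_output(output)
        if state != "continue":
            return state, history
        history.append(output)
    return "exhausted", history


if __name__ == "__main__":
    # Stand-in for the generator returned by vision_task_loop
    sample_outputs = [
        "[TOOL:ocr_detect,下载模型]",
        "[TOOL:click_mouse,10,20,left,1,0.1] [TASK_COMPLETED]",  # ignored: has a tool call
        "[TASK_COMPLETED]",
        "never reached",
    ]
    print(run_task(sample_outputs))
```

Returning an explicit state keeps the "stop the whole workflow" decision separate from the text that gets logged, so the outer loop does not need to search accumulated output for a marker that was never appended to it.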