Series 09: How Should a Playwright UI Automation Platform Be Designed? MQ Scheduling and the Runner Execution Architecture

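Writing Playwright scripts locally is smooth. Turning them into a team platform runs into four hard problems:

The browser cannot live long-term inside a FastAPI worker (resources, isolation, no desktop on Linux)
How multiple Windows/Mac test machines claim tasks without receiving each other's work
Where screenshots and videos are stored, and how reports roll up across three levels
How a Runner on the public internet connects securely to MQ/Redis/MinIO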

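BrickCore's answer: the platform orchestrates, RabbitMQ decouples, and the Runner executes Playwright. This article walks through the architecture based on the readable CE source. The WebEngine engine is closed source, but the MQ protocol and the scheduling logic are fully open source.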

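Demo and source code

	Address
Feature demo	http://43.142.83.156/showcase/ (screen recordings such as "UI recording" and "UI Agent step generation"; platform login admin / BrickCore123456)
Open-source repository	https://gitee.com/BanZhuanKeOrz/BrickCore
Runner installers	https://gitee.com/BanZhuanKeOrz/BrickCore/releases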

Trial page: http://43.142.83.156/showcase/demo-ui.html (demo / demo123). AI locator self-healing is covered in Series 06; this article focuses on the scheduling and execution architecture.

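1. Why UI execution must live outside the platform

Browser embedded in the web service	Consequence
Competes with API/load tests for CPU	UI tasks starve, or drag down API regression runs
Containers have no DISPLAY	The Xvfb workaround has poor stability
User steps can open arbitrary URLs	Security isolation is hard
Web tier scales horizontally	Scaling is unrelated to browser process needs, so resources are wasted

Conclusion: scheduling stays on the platform; execution happens on the Runner (a desktop or dedicated machine).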

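2. Platform vs. Runner: division of responsibilities

	Platform (FastAPI + Vue)	Runner (BrickCoreRunner)
Step JSON editing / recording entry point	✅	❌
Plan triggering, permissions, reports	✅	❌
Playwright execution	❌	✅
Screenshot/video upload to MinIO	Stores the URL	✅ Uploads
Device online status, heartbeat	Device management	✅
LLM calls for AI self-healing	Backend API	POSTs to the internal endpoint on failure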

3. The four-layer data model

Case (steps JSON: keyword / method / params / desc)
  → Suite (case order, stop_on_failure, setup SQL)
    → Task/Plan (multiple suites, cron schedule)
      → Execution (execution records at three levels: Plan / Suite / Case)
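
To make the three-level roll-up concrete, here is a minimal sketch of how Case results could aggregate into Suite and Plan results, including the stop_on_failure behavior. The class names, fields, and the `run_suite` helper are hypothetical and not taken from the BrickCore source; only the Plan / Suite / Case hierarchy and `stop_on_failure` come from the model above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CaseResult:
    case_id: int
    passed: bool


@dataclass
class SuiteResult:
    suite_id: int
    cases: List[CaseResult] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # A suite passes only if every executed case passed.
        return all(c.passed for c in self.cases)


@dataclass
class PlanResult:
    plan_id: int
    suites: List[SuiteResult] = field(default_factory=list)

    def summary(self) -> dict:
        # Flatten all cases to count totals at the plan level.
        cases = [c for s in self.suites for c in s.cases]
        return {
            "suites_total": len(self.suites),
            "suites_failed": sum(not s.passed for s in self.suites),
            "cases_total": len(cases),
            "cases_failed": sum(not c.passed for c in cases),
            "passed": all(s.passed for s in self.suites),
        }


def run_suite(suite_id: int, case_ids: List[int],
              run_case: Callable[[int], bool],
              stop_on_failure: bool = False) -> SuiteResult:
    """Run cases in order; optionally stop at the first failure."""
    result = SuiteResult(suite_id)
    for case_id in case_ids:
        ok = run_case(case_id)
        result.cases.append(CaseResult(case_id, ok))
        if not ok and stop_on_failure:
            break  # remaining cases are skipped, not recorded
    return result
```

With `stop_on_failure=True`, cases after the first failure never run, so the plan-level totals count only the cases that were actually executed.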

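Example step (the keyword, button name, and desc are literal Chinese values: 点击元素 means "click element", 提交 means "submit", and the desc reads "click the submit button"):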

{
  "keyword": "点击元素",
  "method": "click_ele",
  "params": {"locator": "get_by_role=button,name=提交", "timeout": 20000},
  "desc": "点击提交按钮"
}

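Conditional branches, step fragments, and database assertions are also supported. For database assertions the Backend runs the SQL on the Runner's behalf, so the Runner never holds the database password.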
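
A step in this shape can be validated and dispatched by its `method` name. The following is only a sketch under assumptions: the `Step` dataclass, `parse_step`, `run_step`, and the handler table are hypothetical, and the closed-source WebEngine may resolve methods quite differently.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict

REQUIRED_KEYS = ("keyword", "method", "params")


@dataclass(frozen=True)
class Step:
    keyword: str
    method: str
    params: Dict[str, Any]
    desc: str = ""


def parse_step(raw: str) -> Step:
    """Parse one step from JSON and reject obviously malformed input."""
    data = json.loads(raw)
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise ValueError(f"step is missing keys: {missing}")
    if not isinstance(data["params"], dict):
        raise ValueError("params must be a JSON object")
    return Step(data["keyword"], data["method"], data["params"], data.get("desc", ""))


def run_step(step: Step, handlers: Dict[str, Callable[..., Any]]) -> Any:
    """Look up the handler by method name and call it with the step params."""
    handler = handlers.get(step.method)
    if handler is None:
        raise ValueError(f"unknown method: {step.method}")
    return handler(**step.params)
```

Keeping `keyword` as a display name and `method` as the dispatch key means the editor can show localized labels while the executor depends only on a stable identifier.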

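4. End-to-end sequence

User triggers a UI plan / suite / single case
  → Platform creates the *Execution record (status=running)
  → resolve_locator_heal, headless, etc. are written into env_config
  → dispatch_to_device(device_id): checks that the device is "online" and has the web engine
  → MQ send_test_task(env_config, run_suite, device_id)
  → Runner consumes the message → Playwright runs the steps one by one
  → HTTP post_results (screenshot URLs, step logs, locator_healed)
  → Platform aggregates the report, optionally sends email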

┌──────────────────┐  queue=device_id   ┌───────────────────┐
│ Platform FastAPI │ ─────────────────► │ Runner Playwright │
│ MySQL records    │ ◄───────────────── │ MinIO screenshots │
└──────────────────┘   HTTP write-back  └───────────────────┘
         │                                        ▲
         └────────── RabbitMQ / Redis ────────────┘

5. Source-level view: dispatching tasks and the queue model

5.1 Persist first, then publish to MQ

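# routers/ui/exec.py
async def dispatch_to_device(env_payload, suite_payload, device_id):
    """Publish a task to the device's queue; return True only if it was sent."""
    device = await Device.get_or_none(id=device_id)
    # "在线" is the stored status value meaning "online".
    if not device or device.status != "在线":
        return False
    mq.send_test_task(env_payload, suite_payload, device_id)
    return True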

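Write the Execution first, then send the message: when the Runner writes results back, there is already an ID to associate them with. If the device is offline, nothing is dispatched, which keeps messages from piling up in a queue nobody consumes.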
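
The ordering can be illustrated with an in-memory sketch. Everything here is hypothetical (the `Execution` dataclass, `trigger`, the dict-based store, and carrying `execution_id` in the message); in particular, marking the record as failed when the device is offline or the publish raises is an assumption about how a caller might react, not something the source confirms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass
class Execution:
    id: int
    device_id: str
    status: str = "running"
    error: Optional[str] = None


def trigger(execution_id: int, device_id: str, online_devices: Set[str],
            publish: Callable[[str, dict], None],
            store: Dict[int, Execution]) -> Execution:
    # 1) Persist first, so a later write-back always finds its record.
    execution = Execution(execution_id, device_id)
    store[execution_id] = execution

    # 2) Refuse to publish to a queue that nobody is consuming.
    if device_id not in online_devices:
        execution.status = "failed"
        execution.error = "device offline"
        return execution

    # 3) Publish; if the broker is unreachable, do not leave the record running.
    try:
        publish(device_id, {"execution_id": execution_id})
    except Exception as exc:
        execution.status = "failed"
        execution.error = f"publish failed: {exc}"
    return execution
```

The point of the sketch is the failure handling: once the record exists, every path that does not reach the Runner should end in a terminal status rather than a permanent "running".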

5.2 Queue name = device_id

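# core/mq_producer.py
# One durable queue per device, named after the device_id.
self.channel.queue_declare(queue=device_id, durable=True)
# The default exchange ('') routes by queue name; delivery_mode=2 marks the message persistent.
self.channel.basic_publish(exchange='', routing_key=device_id, body=msg,
                           properties=pika.BasicProperties(delivery_mode=2))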

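Design choice	Effect
`routing_key=device_id`	Each Runner has its own dedicated queue, so tasks never reach the wrong machine
`durable=True` + `delivery_mode=2`	Queues and messages survive a broker restart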

5.3 MQ message body (a reference for anyone building their own Runner)

{
  "env_config": {
    "base_url": "https://demo.example.com",
    "headless": true,
    "ai_heal_enabled": true,
    "project_id": 1,
    "environment_id": 2
  },
  "run_suite": {
    "suite_id": 12,
    "case_id": 101,
    "cases": [{"case_id": 101, "steps": [...]}]
  }
}
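
A self-built Runner would typically decode and sanity-check this body before starting a browser. The sketch below assumes the two top-level keys shown above and treats a non-empty `cases` list with `case_id` and `steps` as required; those rules and the function names are assumptions for illustration, not the platform's actual validation.

```python
import json
from typing import Tuple


def build_task_message(env_config: dict, run_suite: dict) -> bytes:
    """Serialize a task the same way for every device queue."""
    payload = {"env_config": env_config, "run_suite": run_suite}
    # ensure_ascii=False keeps non-ASCII step text readable in the broker UI.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def parse_task_message(body: bytes) -> Tuple[dict, dict]:
    """Decode a task and fail fast on a malformed structure."""
    msg = json.loads(body.decode("utf-8"))
    for key in ("env_config", "run_suite"):
        if not isinstance(msg.get(key), dict):
            raise ValueError(f"missing or invalid '{key}'")
    cases = msg["run_suite"].get("cases")
    if not isinstance(cases, list) or not cases:
        raise ValueError("run_suite.cases must be a non-empty list")
    for case in cases:
        if "case_id" not in case or not isinstance(case.get("steps"), list):
            raise ValueError("each case needs case_id and a steps list")
    return msg["env_config"], msg["run_suite"]
```

Rejecting a bad message up front lets the Runner report a clear error instead of failing halfway through a browser session.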

6. Runner consumption and a classic pitfall

runner/tools/mq_consumer.py:

# The IO thread receives the message → call basic_ack as soon as possible
# Playwright executes on a worker thread → never ack from the worker thread
def _process_message(...):
    runner = Runner(env_config, run_suite)
    result = runner.run()
    self._save_result(run_suite, result)

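Symptom: the case actually succeeds, but the platform stays at running and the log shows the MQ connection was lost.

Cause: basic_ack was called from a different thread than the one that owns the Pika BlockingConnection, which is not thread-safe.

Fix: the CE installer already acks on the IO thread; a self-built Runner must follow the same constraint.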
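
For a self-built Runner, the constraint can be sketched as a consumer callback that acks on the thread delivering the message and only then hands the work to a pool. The callback signature `(channel, method, properties, body)` and `basic_ack(delivery_tag=...)` are pika's; `make_on_message` and `handle_task` are hypothetical names, and this is not the CE Runner's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def make_on_message(executor: ThreadPoolExecutor,
                    handle_task: Callable[[bytes], None]):
    """Build a pika-style on_message_callback that never acks off-thread."""

    def on_message(channel, method, properties, body):
        # This runs on the thread that owns the connection, so acking is safe.
        channel.basic_ack(delivery_tag=method.delivery_tag)
        # The long Playwright run happens elsewhere and never touches the channel.
        executor.submit(handle_task, body)

    return on_message
```

Acking before the run means a Runner crash mid-execution loses that task, so the platform needs its own timeout for stuck records. If acking after completion is preferred, pika documents `BlockingConnection.add_callback_threadsafe` as the way to hand the ack back to the connection's thread.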

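If AI self-healing is enabled, a failing step is handled as follows: basecase.py catches the exception → try_heal_step → the same step is retried (see Series 06 for details).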
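
The catch, heal, retry flow can be sketched as below. This is an illustration only: `run_step_with_heal`, the `execute` and `heal` callables, and the result dictionary are hypothetical, and the real `try_heal_step` signature is not shown in the source. Only the idea of retrying the same step and reporting `locator_healed` comes from the article.

```python
from typing import Any, Callable, Dict, Optional

Step = Dict[str, Any]


def run_step_with_heal(step: Step,
                       execute: Callable[[Step], None],
                       heal: Callable[[Step, Exception], Optional[str]],
                       heal_enabled: bool) -> Dict[str, Any]:
    """Run one step; on failure, ask for a new locator and retry once."""
    try:
        execute(step)
        return {"passed": True, "locator_healed": False}
    except Exception as first_error:
        error = str(first_error)
        # Only call the (possibly expensive) healer when the feature is on.
        new_locator = heal(step, first_error) if heal_enabled else None

    if not new_locator:
        return {"passed": False, "locator_healed": False, "error": error}

    # Retry with a copy so the stored step is not silently rewritten.
    healed_step = {**step, "params": {**step["params"], "locator": new_locator}}
    try:
        execute(healed_step)
    except Exception as second_error:
        return {"passed": False, "locator_healed": False, "error": str(second_error)}
    return {"passed": True, "locator_healed": True, "locator": new_locator}
```

Limiting the retry to a single attempt keeps a broken page from turning into a loop of LLM calls.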

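7. Result write-back and security

Mode	Authentication	Scenario
Desktop client connect	`X-Runner-Token`	A tester's own machine running the Runner
Demo machine / headless	`X-Internal-Token`	`/runner/results/internal`

connect mode (routers/runner/connect.py): provision_device_middleware provisions separate MQ/Redis credentials for each device, and the Runner never receives the business database password.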
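
As a rough sketch of the write-back, the request below selects the header by mode and builds a JSON POST with the standard library. The header names and the `/runner/results/internal` path come from the table above; the function name, the mode keys, the example host, and the payload shape are assumptions.

```python
import json
import urllib.request

# Header names are from the table above; the mode keys are illustrative.
HEADER_BY_MODE = {
    "connect": "X-Runner-Token",
    "internal": "X-Internal-Token",
}


def build_results_request(url: str, mode: str, token: str,
                          results: dict) -> urllib.request.Request:
    """Build (but do not send) the authenticated result write-back request."""
    if mode not in HEADER_BY_MODE:
        raise ValueError(f"unknown mode: {mode}")
    if not token:
        raise ValueError("token must not be empty")
    headers = {
        "Content-Type": "application/json",
        HEADER_BY_MODE[mode]: token,
    }
    body = json.dumps(results).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Sending it is then a matter of `urllib.request.urlopen(request, timeout=10)`; over the public internet the URL should be HTTPS so the token is not exposed in transit.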

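8. RabbitMQ, Redis, MinIO, and deployment

Component	Role
RabbitMQ	Task queue (port 25672)
Redis	Execution log stream, real-time progress (port 26379)
MinIO	Screenshots/videos; public URLs require `MINIO_PUBLIC_ENDPOINT` (port 9200)

With a full-stack Docker deployment, the Runner sits outside the containers and reaches the middleware over the public network. Besides port 80, the security group must allow 25672 / 26379 / 9200; otherwise tasks stay queued forever.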
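
Before blaming the Runner, it helps to confirm from the Runner machine that the three ports are reachable. This is a small preflight sketch using plain TCP connects; the port numbers are the ones listed above, while the function names and the idea of running it as a preflight are assumptions.

```python
import socket
from typing import Dict

# Ports as listed in the component table above.
MIDDLEWARE_PORTS = {"RabbitMQ": 25672, "Redis": 26379, "MinIO": 9200}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def preflight(host: str, ports: Dict[str, int] = MIDDLEWARE_PORTS,
              timeout: float = 3.0) -> Dict[str, bool]:
    """Check every middleware port and report which ones are reachable."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

A result such as `{"RabbitMQ": False, "Redis": True, "MinIO": True}` points at the security group rule for 25672 rather than at the Runner itself. A successful TCP connect only proves reachability, not that the credentials are valid.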

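9. Runner lifecycle (hands-on)

Install BrickCoreRunner from the Releases page (pick the right build: arm64 vs. Intel)
Configure the platform URL, log in, and go online → device management shows "online"
In the UI module, run a suite/plan and select that device_id
Keep the extraction path free of Chinese characters and spaces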
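
The last point is easy to check up front. The helper below is a hypothetical sketch; it flags any non-ASCII character, which is stricter than "no Chinese characters" and is an assumption made for simplicity.

```python
from typing import List


def install_path_problems(path: str) -> List[str]:
    """List reasons why a path may be unsafe for extracting the Runner."""
    problems = []
    if any(ch.isspace() for ch in path):
        problems.append("contains whitespace")
    if not path.isascii():
        problems.append("contains non-ASCII characters")
    return problems
```

For example, `install_path_problems("C:/Program Files/Runner")` reports the space, while `install_path_problems("C:/tools/BrickCoreRunner")` returns an empty list.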

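10. Comparison with Selenium Grid and plain scripts

	Plain Playwright scripts	Selenium Grid	Platform + Runner
Maintained as	Code in Git	Grid nodes	Step JSON edited on the web
Non-developer participation	Hard	Moderate	Fairly easy (recording/Agent)
Reports	Build your own	Partial	Three levels persisted to DB + HTML
Horizontal scaling	CI sharding	Add Grid nodes	Add Runner machines
AI self-healing	Build your own	None	Platform Backend + Runner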

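11. What the CE open-source edition covers

Readable in Gitee CE	Closed source (installer only)
`backend`: scheduling, MQ, devices, reports	`runner/WebEngine` engine
`frontend`: step editor	Runner packaging scripts
MQ message structure, `runner/connect`	(none)

Every feature can be tried out; extending the engine requires a Pro/commercial license or writing your own Playwright layer.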

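12. Common troubleshooting

Symptom	What to check
Task stays queued	Runner is not online; MQ port is not open
Device offline	Heartbeat timed out; restart the Runner
Report has no screenshots	MinIO public endpoint / port 9200 in the security group
Case succeeded but platform shows running	Cross-thread MQ ack (see §6)
Task ran on the wrong machine	Check which device_id was selected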

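13. Summary

UI platform = orchestration + records; Runner = Playwright execution.
MQ dispatches through per-device queues, so scaling out means adding machines.
MinIO is the single store for screenshots; connect mode isolates middleware credentials.
The architecture carries over to any "web test management + remote executor" setup.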

Appendix A: source file index

Order	File	What to look at
1	`routers/ui/exec.py`	Execution, `dispatch_to_device`
2	`core/mq_producer.py`	`send_test_task`
3	`routers/runner/connect.py`	connect bundle, middleware isolation
4	`core/runner_results.py`	Persisting results, notifications
5	`runner/tools/mq_consumer.py`	Consumption, ack strategy
6	`runner/tools/runner_api.py`	`post_results`

Support and discussion

Demo: http://43.142.83.156/showcase/ · Source: https://gitee.com/BanZhuanKeOrz/BrickCore
If this was useful, a Star ⭐ is welcome; questions can go in the comments or Gitee Issues.