浅玩一下，基于 Appium 的自动化测试 AI Agent

前言

上篇文章，我们使用 Appium 的插件 appium-llm-plugin 来辅助我们进行 UI 自动化测试。不过整个流程感觉还是不够智能，我们还是得手动去调用 API 来完成一个测试用例的编写，有没有可能把这个也去掉，用纯自然语言来写一个自动化测试用例呢？这篇文章就来试一下，来实现一个简单的自动化测试 Agent。

先上效果，这是我们的测试用例，

bash 复制代码

你帮我进行如下自动化测试：

等待 2 秒
点击页面上对话框的同意按钮
等待 2 秒
点击文章列表中的第一篇文章
等待 2 秒
点击关注

该用例演示了 "进入掘金首页 -> 打开第一篇文章 -> 点击关注" 这个流程：

实现方式

本文是基于 OpenAI 的 Function Call 来实现。整体流程如下：

首先，我们需要定义一些 Tools 来供大模型使用：

js 复制代码

[
  {
    name: 'wait',
    description: `等待一段时间`,
    inputSchema: {
      type: 'object',
      properties: {
        seconds: {
          type: 'string',
          description: '等待的秒数',
        },
      },
      required: ['seconds'],
    },
  },
  {
    name: 'get_page_xml',
    description: `获取当前页面的 XML 描述文件`,
    inputSchema: {
      type: 'object',
      properties: {},
      required: [],
    },
  },
  {
    name: 'click_by_xpath',
    description: `使用 xpath 选择元素并点击`,
    inputSchema: {
      type: 'object',
      properties: {
        selector: {
          type: 'string',
          description: 'webdriverio 支持的 xpath 选择器',
        },
      },
      required: ['selector'],
    },
  },
]

我们把上述 Tools 作为系统提示词传递给大模型：

vbnet 复制代码

you are a test engineer who is good at automated testing, you can use the following tools to complete automated testing:
 
# Tools
 
${toolsStr}
 
# Notes 
- Ensure responses are based on the latest information available from function calls.
- Maintain an engaging, supportive, and friendly tone throughout the dialogue.
- Always highlight the potential of available tools to assist users comprehensively.
- Decide the next action based on XML.
- Always get the latest page information before you decide the next action.

然后，把我们的测试用例也传递给大模型。接下来不断重复下面这个步骤：

大模型返回 Function Call 的指令
执行 Function Call
将 Function Call 的执行结果传递给大模型，这里要么执行成功，要么大部分情况下就是元素找不到，返回 Element not found，我们期待的是大模型会聪明地重新进行尝试。

下面是测试的记录及分析：

json 复制代码

[
  {
    "role": "user",
    "content": "\n你帮我进行如下自动化测试：\n\n等待 2 秒\n点击页面上对话框的同意按钮\n等待 2 秒\n点击文章列表中的第一篇文章\n等待 2 秒\n点击关注\n\n      "
  },
  // 大模型返回执行 wait 和 get_page_xml 的指令
  {
    "content": null,
    "refusal": null,
    "role": "assistant",
    "annotations": [],
    "audio": null,
    "function_call": null,
    "tool_calls": [
      {
        "id": "call_8ftk1bjDtsVEY0Gotfouahbp",
        "function": {
          "arguments": "{\"seconds\": \"2\"}",
          "name": "wait"
        },
        "type": "function"
      },
      {
        "id": "call_KpfEBZo67WXsQ3YhSRZg2teE",
        "function": {
          "arguments": "{}",
          "name": "get_page_xml"
        },
        "type": "function"
      }
    ]
  },
  // 执行结果都为成功
  {
    "role": "tool",
    "content": "Wait successfully",
    "tool_call_id": "call_8ftk1bjDtsVEY0Gotfouahbp"
  },
  {
    "role": "tool",
    "content": "...", // xml 内容太多了，省略
    "tool_call_id": "call_KpfEBZo67WXsQ3YhSRZg2teE"
  },
  // 大模型返回 click_by_xpath 和 wait 的指令
  {
    "content": null,
    "refusal": null,
    "role": "assistant",
    "annotations": [],
    "audio": null,
    "function_call": null,
    "tool_calls": [
      {
        "id": "call_N44tjs7O76wVCowOJf2JHTj6",
        "function": {
          "arguments": "{\"selector\": \"//android.widget.TextView[@resource-id='com.daimajia.gold:id/btn_ok']\"}",
          "name": "click_by_xpath"
        },
        "type": "function"
      },
      {
        "id": "call_GVRdcAPBVULEMZYnbgztqtSg",
        "function": {
          "arguments": "{\"seconds\": \"2\"}",
          "name": "wait"
        },
        "type": "function"
      }
    ]
  },
  // 执行结果，成功
  {
    "role": "tool",
    "content": "[{\"type\":\"text\",\"content\":\"Click successfully\"}]",
    "tool_call_id": "call_N44tjs7O76wVCowOJf2JHTj6"
  },
  {
    "role": "tool",
    "content": "Wait successfully",
    "tool_call_id": "call_GVRdcAPBVULEMZYnbgztqtSg"
  },
  // 大模型返回 click_by_xpath 和 wait 的指令（点击文章列表中的第一篇文章）
  {
    "content": null,
    "refusal": null,
    "role": "assistant",
    "annotations": [],
    "audio": null,
    "function_call": null,
    "tool_calls": [
      {
        "id": "call_5gUEZ6JrRZ1Cd6mNYmUdGfXR",
        "function": {
          "arguments": "{\"selector\": \"//androidx.recyclerview.widget.RecyclerView/android.widget.LinearLayout[1]\"}",
          "name": "click_by_xpath"
        },
        "type": "function"
      },
      {
        "id": "call_C2oXiINLoDGWdKPfwDDiUMHR",
        "function": {
          "arguments": "{\"seconds\": \"2\"}",
          "name": "wait"
        },
        "type": "function"
      }
    ]
  },
  // click_by_xpath 执行失败，没有找到元素（原因在于大模型没有按照要求再次获取 XML）
  {
    "role": "tool",
    "content": "[{\"type\":\"text\",\"content\":\"Element not found\"}]",
    "tool_call_id": "call_5gUEZ6JrRZ1Cd6mNYmUdGfXR"
  },
  {
    "role": "tool",
    "content": "Wait successfully",
    "tool_call_id": "call_C2oXiINLoDGWdKPfwDDiUMHR"
  },
  // 大模型返回执行 get_page_xml 的指令（表现还算不错，意识到要重新获取 XML）
  {
    "content": null,
    "refusal": null,
    "role": "assistant",
    "annotations": [],
    "audio": null,
    "function_call": null,
    "tool_calls": [
      {
        "id": "call_PMQ72TSMy0i6aTYcTam1HFSM",
        "function": {
          "arguments": "{}",
          "name": "get_page_xml"
        },
        "type": "function"
      },
      {
        "id": "call_vCTopzCHVbLyP3L25H1cngZG",
        "function": {
          "arguments": "{\"seconds\": \"2\"}",
          "name": "wait"
        },
        "type": "function"
      }
    ]
  },
  // 执行成功
  {
    "role": "tool",
    "content": "...",
    "tool_call_id": "call_PMQ72TSMy0i6aTYcTam1HFSM"
  },
  {
    "role": "tool",
    "content": "Wait successfully",
    "tool_call_id": "call_vCTopzCHVbLyP3L25H1cngZG"
  },
  // 再次返回 click_by_xpath （点击文章列表中的第一篇文章）
  {
    "content": null,
    "refusal": null,
    "role": "assistant",
    "annotations": [],
    "audio": null,
    "function_call": null,
    "tool_calls": [
      {
        "id": "call_0nWMw8rEOJvhRfUrKaAafnD6",
        "function": {
          "arguments": "{\"selector\":\"//android.widget.RelativeLayout[@index='0']\"}",
          "name": "click_by_xpath"
        },
        "type": "function"
      }
    ]
  },
  // 这次成功了
  {
    "role": "tool",
    "content": "[{\"type\":\"text\",\"content\":\"Click successfully\"}]",
    "tool_call_id": "call_0nWMw8rEOJvhRfUrKaAafnD6"
  },
  // 大模型返回执行 wait 和 get_page_xml 的指令
  {
    "content": null,
    "refusal": null,
    "role": "assistant",
    "annotations": [],
    "audio": null,
    "function_call": null,
    "tool_calls": [
      {
        "id": "call_7Bg3Qrv52uJ5knZrRX0VMBFR",
        "function": {
          "arguments": "{\"seconds\": \"2\"}",
          "name": "wait"
        },
        "type": "function"
      },
      {
        "id": "call_iAXPQyYDlkz5oCt2jwYBm53T",
        "function": {
          "arguments": "{}",
          "name": "get_page_xml"
        },
        "type": "function"
      }
    ]
  },
  // 执行成功
  {
    "role": "tool",
    "content": "Wait successfully",
    "tool_call_id": "call_7Bg3Qrv52uJ5knZrRX0VMBFR"
  },

  {
    "role": "tool",
    "content": "...",
    "tool_call_id": "call_iAXPQyYDlkz5oCt2jwYBm53T"
  },
  // 大模型返回执行 click_by_xpath 的指令（点击关注）
  {
    "content": null,
    "refusal": null,
    "role": "assistant",
    "annotations": [],
    "audio": null,
    "function_call": null,
    "tool_calls": [
      {
        "id": "call_fkYD6oSJoow7GFdpsbxTHJ3S",
        "function": {
          "arguments": "{\"selector\":\"//android.widget.TextView[@resource-id='com.daimajia.gold:id/web_focus_button']\"}",
          "name": "click_by_xpath"
        },
        "type": "function"
      }
    ]
  },
  {
    "role": "tool",
    "content": "[{\"type\":\"text\",\"content\":\"Click successfully\"}]",
    "tool_call_id": "call_fkYD6oSJoow7GFdpsbxTHJ3S"
  }
]

总结

本文通过一个非常简单的例子试验了一下基于 Appium 的自动化测试 AI Agent，得到了一个成功的结果，但是过程中还是有些问题的：

大模型一开始总是拿着上一个页面的 XML 来找当前页面的元素，如果找不到还好一点，就怕找到一个错的，点击后到了一个不该去的页面，接下来就 GG 了，加上这句提示词后效果会好一点 "Always get the latest page information before you decide the next action."，下一步可以试试把提示词进一步优化。
如何让大模型知道自己点错了元素进入到了错误的页面？是不是可以给目标页面一些简单的描述，比如"你进入的应该是文章详情"。这样当大模型意识到进入到了错误的页面时，可以尝试后退到上一步进行重试。不过这里注意注意的是不要和断言混淆了，因为断言失败时，整个测试的结果应该是失败的。
可以支持让大模型通过页面截图来定位元素，跟 XML 的方法结合起来，提高准确性。