Playwright02-CDP

Notes on Playwright automation development. While learning BrowserUse, I ran into topics involving playwright and cdp-use.


1-Key points

  • 1-Run a first Playwright demo

2-Reference links


3-Hands-on

1-UV environment setup

bash
# 1. Set up the uv environment
uv python pin 3.11.4
uv init python_playwright && cd python_playwright
uv venv && source .venv/bin/activate
uv add python-dotenv pydantic playwright

# 2. Install playwright and re-activate the environment
#    (a no-op if step 1 already added it)
uv add playwright
source .venv/bin/activate

# 3. Install the browser binary (only Chromium is installed here)
playwright install chromium
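
Before moving on, it can help to confirm that the packages added above are importable from the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the check covers Python packages only, not whether the Chromium binary was downloaded.

```python
from importlib.util import find_spec

# Import names for the packages added with `uv add` above.
# Note that python-dotenv is imported as `dotenv`.
REQUIRED_MODULES = ["dotenv", "pydantic", "playwright"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the virtual environment activated; an empty result suggests the `uv add` step succeeded.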

2-CDP interface development

cdp-use is a type-safe Python client library generated for the Chrome DevTools Protocol (CDP).
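
To make "type-safe CDP client" more concrete, here is a rough sketch of the JSON messages that travel over a CDP WebSocket: commands carry an `id`, a `method` and `params`; replies echo the `id` with either `result` or `error`; events have a `method` but no `id`. The names `CDPCommand`, `build_command` and `classify` are hypothetical helpers for illustration and are not part of cdp-use's API.

```python
import itertools
import json
from typing import Any, TypedDict


class CDPCommand(TypedDict):
    """Shape of a command sent to the browser over the CDP WebSocket."""
    id: int
    method: str
    params: dict[str, Any]


# Monotonically increasing ids let the caller match replies to commands.
_ids = itertools.count(1)


def build_command(method: str, **params: Any) -> CDPCommand:
    """Build a CDP command with a fresh id used to match the reply."""
    return {"id": next(_ids), "method": method, "params": params}


def classify(raw: str) -> str:
    """Classify an incoming CDP message as 'result', 'error' or 'event'."""
    msg = json.loads(raw)
    if "id" in msg:
        return "error" if "error" in msg else "result"
    return "event"  # events carry a method name but no id


cmd = build_command("Page.navigate", url="https://www.baidu.com/")
print(json.dumps(cmd))
```

A generated client wraps this plumbing so that method names and parameter shapes are checked by the type checker instead of being passed around as free-form strings and dicts.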


Option A: leave everything to Playwright

Ignore the real WebSocket address and just grab the page that already exists in the default context.

python
import sys
import time

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # 1. Launch the browser with the remote debugging port enabled
    browser = p.chromium.launch(
        headless=False,
        args=["--remote-debugging-port=9222"],  # expose the CDP port
    )
    # 2. Open a new tab
    page = browser.new_page()
    # 3. Navigate to the target URL
    web_url = "https://www.baidu.com/"

    try:
        # Use a longer timeout (60 s) and handle load failures
        page.goto(web_url, timeout=60000)
        print("Page opened:", web_url)
    except Exception as e:
        print(f"Page load failed: {e}")
        browser.close()
        sys.exit(1)

    # 4. Pause briefly so the window can be inspected by eye
    time.sleep(3)

    # 5. Attach to the same browser through Playwright's own CDP connection
    try:
        browser2 = p.chromium.connect_over_cdp("http://localhost:9222")
        browser_contexts = browser2.contexts[0]
        print("=======browser_contexts response structure========")
        print(browser_contexts)
        print("=======browser_contexts response structure========\n")
        default_ctx_page = browser_contexts.pages[0]  # page that already exists in the default context
        print("Default page title:", default_ctx_page.title())
        # 6. Close the CDP connection
        browser2.close()
    except Exception as e:
        print(f"Error while connecting over CDP: {e}")
    finally:
        browser.close()

Sample output:

Connected to pydev debugger (build 231.9225.15)
Page opened: https://www.baidu.com/
=======browser_contexts response structure========
<BrowserContext browser=<Browser type=<BrowserType name=chromium executable_path=/Users/rong/Library/Caches/ms-playwright/chromium-1194/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=141.0.7390.37>>
=======browser_contexts response structure========
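
Indexing `contexts[0].pages[0]` raises `IndexError` when a context or page list happens to be empty. As a hedged alternative, the sketch below walks every context and returns the first page it finds, or `None`. The helper name `first_existing_page` is hypothetical; it relies only on the `contexts` and `pages` attributes used above, and the stand-in objects exist purely so the logic can be tried without launching a browser.

```python
from types import SimpleNamespace


def first_existing_page(browser):
    """Return the first page found in any context of `browser`, or None."""
    for context in browser.contexts:
        if context.pages:
            return context.pages[0]
    return None


# Stand-in objects that mimic the `contexts` / `pages` attributes,
# so the helper can be exercised without a real browser.
empty_browser = SimpleNamespace(contexts=[SimpleNamespace(pages=[])])
fake_browser = SimpleNamespace(
    contexts=[SimpleNamespace(pages=[]), SimpleNamespace(pages=["page-1"])]
)

print(first_existing_page(empty_browser))  # None
print(first_existing_page(fake_browser))   # page-1
```

With a real connection, the same call would be `first_existing_page(browser2)`, followed by a check for `None` before calling `.title()`.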

Option B: Playwright control + WebSocket address

Get both Playwright control and the real WebSocket address.

python
import json
import sys
import time

import requests  # third-party; install with `uv add requests` (not part of the setup step)
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # 1. Launch the browser with the remote debugging port enabled
    browser = p.chromium.launch(
        headless=False,
        args=["--remote-debugging-port=9222"],  # expose the CDP port
    )
    # 2. Open a new tab
    page = browser.new_page()
    # 3. Navigate to the target URL
    web_url = "https://www.baidu.com/"

    try:
        # Use a longer timeout (60 s) and handle load failures
        page.goto(web_url, timeout=60000)
        print("Page opened:", web_url)
    except Exception as e:
        print(f"Page load failed: {e}")
        browser.close()
        sys.exit(1)

    # 4. Pause briefly so the window can be inspected by eye
    time.sleep(3)

    # 5. Query /json/version ourselves to read webSocketDebuggerUrl
    try:
        resp = requests.get("http://localhost:9222/json/version", timeout=5)
        resp.raise_for_status()
        version_info = resp.json()  # parse the body once and reuse it
        print("=======json_version response structure========")
        print(json.dumps(version_info, indent=2, ensure_ascii=False))
        print("=======json_version response structure========\n")
        ws_url = version_info["webSocketDebuggerUrl"]
        print("Browser WebSocket address:", ws_url)

        # Optionally keep driving the same browser through Playwright
        browser2 = p.chromium.connect_over_cdp("http://localhost:9222")
        default_page = browser2.contexts[0].pages[0]
        print("Default page title:", default_page.title())

        browser2.close()
    except requests.exceptions.RequestException as e:
        print(f"Could not reach the debugging endpoint: {e}")
    except Exception as e:
        print(f"Error while handling the debugging connection: {e}")
    finally:
        browser.close()

Sample output:

Connected to pydev debugger (build 231.9225.15)
Page opened: https://www.baidu.com/
=======json_version response structure========
{
  "Browser": "Chrome/141.0.7390.37",
  "Protocol-Version": "1.3",
  "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36",
  "V8-Version": "14.1.146.11",
  "WebKit-Version": "537.36 (@9f043f63b0e5b728c8d09f3e3ddfc1681a4bd58e)",
  "webSocketDebuggerUrl": "ws://localhost:9222/devtools/browser/27e882e5-8999-4a81-8d1f-9092e6698d61"
}
=======json_version response structure========

Browser WebSocket address: ws://localhost:9222/devtools/browser/27e882e5-8999-4a81-8d1f-9092e6698d61
Default page title: 百度一下,你就知道
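
The `webSocketDebuggerUrl` field is what a raw CDP client would connect to, so it can be useful to validate it and split it into parts before handing it on. The sketch below uses only the standard library and the sample payload shown above; the function name `parse_ws_endpoint` and the returned dict layout are assumptions made for illustration.

```python
from urllib.parse import urlparse


def parse_ws_endpoint(version_info: dict) -> dict:
    """Extract and split the browser-level WebSocket URL from /json/version data."""
    ws_url = version_info["webSocketDebuggerUrl"]
    parts = urlparse(ws_url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"unexpected scheme in {ws_url!r}")
    return {
        "url": ws_url,
        "host": parts.hostname,
        "port": parts.port,
        # The last path segment identifies this browser instance.
        "browser_id": parts.path.rsplit("/", 1)[-1],
    }


# Sample value copied from the output above.
sample = {
    "webSocketDebuggerUrl": "ws://localhost:9222/devtools/browser/27e882e5-8999-4a81-8d1f-9092e6698d61"
}
endpoint = parse_ws_endpoint(sample)
print(endpoint["host"], endpoint["port"], endpoint["browser_id"])
```

The browser id changes on every launch, which is why the address has to be read from `/json/version` at runtime instead of being hard-coded.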

At this point you have both the real CDP WebSocket address and, through Playwright, the page that already exists in the default context.

