Building an Automated Sogou WeChat Crawler with Python 3 and Selenium

Introduction

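This article implements a simple crawler that collects articles from Sogou WeChat search and saves them to an Excel file. I chose this site because it does not require a login, which makes it easy to crawl. The article is for learning purposes only; I hope the program gives you a useful starting point for your own work.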

Environment Setup

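  • Install the ChromeDriver release that matches your Google Chrome version
  • Make sure Python 3 and pip are installed, then install the required libraries by running the following commands in order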
pip install selenium
pip install openpyxl
pip install pandas
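
After running the commands above, it can help to confirm that the packages are actually visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library (`importlib.metadata`, available in Python 3.8 and later); the helper name `installed_versions` is made up for this illustration.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this interpreter's environment
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["selenium", "openpyxl", "pandas"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Note that the listing in this article imports only selenium and openpyxl; pandas is installed but not used by the code shown here.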

Implementation

import time
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from openpyxl import Workbook
from datetime import datetime

# Configure the paths to ChromeDriver and the Chrome executable, and return a ready-to-use WebDriver object
def setup_driver(chrome_driver_path, chrome_binary_path=None):
    options = webdriver.ChromeOptions()
    if chrome_binary_path:
        options.binary_location = chrome_binary_path
    # options.add_argument("--headless")  # Run headless, without opening a browser window
    service = Service(chrome_driver_path)
    driver = webdriver.Chrome(service=service, options=options)
    return driver

# Run the search and scrape the results with the given WebDriver; collect them in a list and return it
def search_and_scrape(driver, keyword, num_pages=5):
    # Open the target site
    driver.get("https://weixin.sogou.com/")
    # Find the search box by its ID attribute
    search_box = driver.find_element(By.ID, "query")
    search_box.send_keys(keyword)
    # This is Sogou WeChat's "搜文章" (search articles) button; the element looks like this:
    # <input type="submit" value="搜文章" onclick="weinxinfilter(this, 2)" uigs="search_article">
    search_button = driver.find_element(By.XPATH, '//input[@value="搜文章"]')
    search_button.click()

    results = []

    for page in range(num_pages):
        print(f"Scraping page {page + 1}...")
        time.sleep(5)  # Wait for the page to load
        search_results = driver.find_elements(By.XPATH, '//ul[@class="news-list"]/li')
        
        for result in search_results:
            try:
                # The elements used here can be inspected on https://weixin.sogou.com/
                # I also captured an excerpt of this part of the page's HTML for reference
                title = result.find_element(By.XPATH, './/h3/a').text
                abstract = result.find_element(By.XPATH, './/p[@class="txt-info"]').text
                link = result.find_element(By.XPATH, './/h3/a').get_attribute("href")
                source = result.find_element(By.XPATH, './/div[@class="s-p"]').text
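                # Heuristic: cut at the first "20" to drop a trailing date such as 20xx-x-x;
                # note that any earlier "20" in the text is cut there as well
                date_index = source.find("20")
                if date_index != -1:
                    source = source[:date_index]
                # Append the scraped fields to results (keys: title, abstract, link, source)
                results.append({"标题": title, "摘要": abstract, "链接": link, "来源": source})
            except Exception as e:
                print(f"Error while processing an article: {e}")
                continue

        print(f"Finished page {page + 1}; total articles so far: {len(results)}")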

        if page < num_pages - 1:
            # Scroll to the bottom of the page
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            try:
                # This is the "下一页" (next page) link at the bottom of the page:
                # <a id="sogou_next" href="?query=AI&amp;_sug_type_=&amp;sut=941&amp;
                # lkt=3%2C1718688539526%2C1718688540190&amp;s_from=input&amp;_sug_=y&amp;
                #type=2&amp;sst0=1718688540297&amp;page=2&amp;ie=utf8&amp;w=01019900&amp;dr=1" class="np" 
                # uigs="page_next">下一页</a>
                WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.ID, 'sogou_next')))
            except TimeoutException:
                print("Reached the last page")
                break
            next_page_button = driver.find_element(By.ID, 'sogou_next')
            next_page_button.click()

    return results

# Save the data to an Excel workbook
def save_to_excel(data, keyword):
    now = datetime.now().strftime("%Y%m%d_%H%M%S")
    wb = Workbook()
    ws = wb.active
    ws.append(["标题", "摘要", "链接", "来源"])
    
    for item in data:
        ws.append([item["标题"], item["摘要"], item["链接"], item["来源"]])
    
    filename = f"{keyword}_微信_{now}.xlsx"
    wb.save(filename)
    print(f"Data saved to {filename}")

def main():
    chrome_driver_path = r'F:\applications\chrome\chromedriver\chromedriver.exe'  # Replace with your own chromedriver path
    chrome_binary_path = r'F:\applications\chrome\chrome\chrome.exe'  # Replace with your own Chrome executable path
    keyword = "橘猫"  # Search keyword ("orange cat")
    pages_to_scrape = 5

    driver = setup_driver(chrome_driver_path, chrome_binary_path)
    try:
        results = search_and_scrape(driver, keyword, pages_to_scrape)
        if results:
            save_to_excel(results, keyword)
        else:
            print("No data was scraped.")
    finally:
        driver.quit()

if __name__ == "__main__":
    main()
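
The listing above separates the account name from the publish time by cutting the text at the first "20". The sketch below shows one possible alternative that strips a trailing time with a regular expression instead. It assumes the time appears either as an absolute date like `2024-6-18` or as a relative time like `1小时前` (the form seen in the HTML excerpt in this article); the exact formats the site renders are an assumption, and `split_source` is a hypothetical helper, not part of the original program.

```python
import re

# Trailing publish time: an absolute date (assumed form YYYY-M-D) or a relative
# time such as "16分钟前" (minutes ago), "1小时前" (hours ago), "3天前" (days ago).
_TIME_SUFFIX = re.compile(r"\s*(\d{4}-\d{1,2}-\d{1,2}|\d+\s*(?:分钟|小时|天)前)\s*$")


def split_source(text):
    """Split '<account name><time>' into (account_name, time_text).

    time_text is an empty string when no trailing time is recognised.
    """
    text = text.strip()
    match = _TIME_SUFFIX.search(text)
    if match is None:
        return text, ""
    return text[:match.start()].strip(), match.group(1)


if __name__ == "__main__":
    for raw in ["机器之心1小时前", "创智科成AIGC2024-6-18", "科技每日推送"]:
        print(split_source(raw))
```

This is still a heuristic: an account name that ends in digits cannot be told apart from the start of the time. Reading the name from the `span` with class `all-time-y2` inside the `s-p` block, as the HTML excerpt suggests, would likely avoid the ambiguity altogether.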

Explanation

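To make the code easy to learn from, I commented it thoroughly, so it should be fairly easy to follow. For the parts that depend on the structure of the target page, you can open the site and press F12 to inspect its markup, or refer to the partial HTML I captured below.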

<ul class="news-list">
	
    <!-- a -->
    <li id="sogou_vr_11002601_box_0" d="ab735a258a90e8e1-6bee54fcbd896b2a-1cf460cec1d0c6424b6a11f04ff26bea">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_0" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9QPhh12PpQ4wOmIIoYyYOCOMZM3F69YUKLcXZbPRp9T1AlSBA4bIXjQztW22IknkGTqtcEjayRdZU3EEn3emybLnCNPh5psfT7RQVYRdfjcQ_IjDYHhgl9Hw4Z3IxbrxCXDro-V4NCqYfLz4MIS4G3N-cIYm0euER4RMsG7KrBBX1AfaFW2DefQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_0"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fmmbiz_jpg%2FfvrHGSFCJuOboCUaA1Pw7PIycLm5GDnY2YQGyvTqkKCCnTQrLvITHZ4nIoWPibib1guaibGa2txzQYyD818a8D3Cw%2F0%3Fwx_fmt%3Djpeg&amp;sign=1b045d75e09e5d807cf49a8042908404" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="width: 140px; height: auto; margin-top: -17.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9QPhh12PpQ4wOmIIoYyYOCOMZM3F69YUKLcXZbPRp9T1AlSBA4bIXjQztW22IknkGTqtcEjayRdZU3EEn3emybLnCNPh5psfT7RQVYRdfjcQ_IjDYHhgl9Hw4Z3IxbrxCXDro-V4NCqYfLz4MIS4G3N-cIYm0euER4RMsG7KrBBX1AfaFW2DefQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_0" uigs="article_title_0">"法国版Open<em><!--red_beg-->AI<!--red_end--></em>"又获6亿欧元融资,要做<em><!--red_beg-->AI<!--red_end--></em>领域最省钱的公司</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_0">在最新一轮融资中,坚持开源的法国<em><!--red_beg-->AI<!--red_end--></em>初创公司Mistral <em><!--red_beg-->AI<!--red_end--></em>获得了6亿欧元投资,最新估值已逼近60亿欧元.据了解,本轮融资由美国的...</p>
<div class="s-p">
<span class="all-time-y2">上海股权投资协会行业资讯</span><span class="s2"><script>document.write(timeConvert('1718684014'))</script>1小时前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_1" d="ab735a258a90e8e1-6bee54fcbd896b2a-7431e6afb9964bfd8faeaabc331ece6e">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_1" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9xnpNm-zZx9PeiiSE-g9Z4bv8kKTPHdnYQAKP295_XE9MDs8NA7R9spOgHEZbQgcpFy9FlHff3v9UzKw1Hd5ET9i9Rv14BzNxqkkYHLrala0cZHPDbgfVQoLwc8s558hI1hPJDZOpH6LXVgQEPkqyB5k470bxnVy1SUmjPNDGYYLYl_Q5RRZQjg..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_1"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fsz_mmbiz_jpg%2FUicQ7HgWiaUb3sjueKiaJVfsUFovSNT63QsbRHRtXDb8X1KNibpjdJia4Goaz7o2T7qPHz1cprzmE7vh6UJLN98D6kA%2F0%3Fwx_fmt%3Djpeg&amp;sign=2e1ed979debcd84f9411ffd928000ce3" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="height: 105px; width: auto; margin-left: -53.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9xnpNm-zZx9PeiiSE-g9Z4bv8kKTPHdnYQAKP295_XE9MDs8NA7R9spOgHEZbQgcpFy9FlHff3v9UzKw1Hd5ET9i9Rv14BzNxqkkYHLrala0cZHPDbgfVQoLwc8s558hI1hPJDZOpH6LXVgQEPkqyB5k470bxnVy1SUmjPNDGYYLYl_Q5RRZQjg..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_1" uigs="article_title_1"><em><!--red_beg-->AI<!--red_end--></em>教父Hinton:我支持超级<em><!--red_beg-->AI<!--red_end--></em>取代人类!</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_1">新智元报道 编辑:乔杨 好困【新智元导读】「<em><!--red_beg-->AI<!--red_end--></em>教父」Geoffrey Hinton在最近的采访中表达了自己对<em><!--red_beg-->AI<!--red_end--></em>智能的理解------LLM并不是简单...</p>
<div class="s-p">
<span class="all-time-y2">AI 人工智能助理</span><span class="s2"><script>document.write(timeConvert('1718687564'))</script>16分钟前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_2" d="ab735a258a90e8e1-6bee54fcbd896b2a-e215448dd95d31b181e046dc57755f1e">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_2" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9U43qjqVnvXIDfYctxkCDdmKKyRVwojdT7fp6Vryr-AnK6cqts2YyDNpABBVOLKIdxXPg57MmfJGyc3PPqOEh-urJHzXTuDFcD1XaLziX9-Mdeq7Yz0rI_1gA00Tk5KxpkZdRq8ckzUdkK7MqU80691l52f495G9dDcrGQHFiAql5LU3lqtPw9g..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_2"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fmmbiz_jpg%2FPicLGiaOegHVcVBBFNv1N9Tk20HJKjo2dtItRz9CHTdqyHPPMh8YytCvmrngguRdCuFrNo34SjiaVZHAia0FxzdSCw%2F0%3Fwx_fmt%3Djpeg&amp;sign=922151b6c6df95fe4852fef7bf7b4587" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="width: 140px; height: auto; margin-top: -17.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9U43qjqVnvXIDfYctxkCDdmKKyRVwojdT7fp6Vryr-AnK6cqts2YyDNpABBVOLKIdxXPg57MmfJGyc3PPqOEh-urJHzXTuDFcD1XaLziX9-Mdeq7Yz0rI_1gA00Tk5KxpkZdRq8ckzUdkK7MqU80691l52f495G9dDcrGQHFiAql5LU3lqtPw9g..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_2" uigs="article_title_2">现阶段的<em><!--red_beg-->AI<!--red_end--></em>更需要开放还是监管?</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_2">在全球加紧对<em><!--red_beg-->AI<!--red_end--></em>监管布局的同时,也有不少来自业界的反对声.Meta首席科学家杨立昆认为对<em><!--red_beg-->AI<!--red_end--></em>研发的规范会适得其反,过早监管<em><!--red_beg-->AI<!--red_end--></em>只...</p>
<div class="s-p">
<span class="all-time-y2">科技每日推送</span><span class="s2"><script>document.write(timeConvert('1718687713'))</script>13分钟前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_3" d="ab735a258a90e8e1-6bee54fcbd896b2a-b497de8973bde992ba857742cc90d79e">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_3" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9g2bjy3qaPHyzKFeceUh_063WxWyPymRwp2gOAM0-_Rsa-e2JDR6T0Wx0U4QS4Zmz5DlMYX99LJ-syTC7Gq_BNRCO29cao_fjhAGF6GgljNEeNh7rXEV5edJBxJDg15vVQK1pc2a7GiwnHcH3c-LoJWwDL0jCKwACr1-qXcUmH89f_71bqkOtiw..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_3"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fmmbiz_jpg%2FfOicibLgl3HgMhM9icFv0kySHzbsQKpKUhef0iar5gs6ttDqyJg3QgMTmcSWSHEhEBsiahgNia9oTtd35Ax2nCFB8nsA%2F0%3Fwx_fmt%3Djpeg&amp;sign=c5c1021bba6a57629aa48e7dceb76bf8" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="width: 140px; height: auto; margin-top: -17.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9g2bjy3qaPHyzKFeceUh_063WxWyPymRwp2gOAM0-_Rsa-e2JDR6T0Wx0U4QS4Zmz5DlMYX99LJ-syTC7Gq_BNRCO29cao_fjhAGF6GgljNEeNh7rXEV5edJBxJDg15vVQK1pc2a7GiwnHcH3c-LoJWwDL0jCKwACr1-qXcUmH89f_71bqkOtiw..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_3" uigs="article_title_3"><em><!--red_beg-->AI<!--red_end--></em> 一键生成 PPT、文档、图片,免费的!!</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_3">我是老罗,1.今天推荐一个新的,免费的,<em><!--red_beg-->AI<!--red_end--></em>生成 PPT、文档、图片的网站,<em><!--red_beg-->AI<!--red_end--></em>生成PPT提供文档<em><!--red_beg-->AI<!--red_end--></em>改写成小红书文案根据小红书文案...</p>
<div class="s-p">
<span class="all-time-y2">每日优质搜罗</span><span class="s2"><script>document.write(timeConvert('1718688230'))</script>5分钟前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_4" d="ab735a258a90e8e1-6bee54fcbd896b2a-c1998b603fa6df16258fb237ce9f202c">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_4" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9mEnEcFmmiJLbd-HBEUfggWnmCVTvCTiKmTrPCv7VwLe4lsVzhmlAMKGr8Hb1TFM6CiCmTahnwOd1_VI8kKLO5qqCt-Wy3L_0KzExNZZcubr5PajBjYCGiBRyZBhkHCSd5lgRpqdcg7nreq0RST3L6_3FE4giIXdkq9v4BVquK2jI3OTRHYU2og..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_4"><i></i><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fsz_mmbiz_jpg%2FKmXPKA19gW9mibb7NPZfEiaevxrbBrMgnLPXU8HSMC9eLj5wkBRCuP2ianAxH3ibvgWkQ7sekNCcJtl1Q0ShEohF0A%2F0%3Fwx_fmt%3Djpeg&amp;sign=7d5816fb2ab87bac8487546b54189bec" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="height: 105px; width: auto; margin-left: -53.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9mEnEcFmmiJLbd-HBEUfggWnmCVTvCTiKmTrPCv7VwLe4lsVzhmlAMKGr8Hb1TFM6CiCmTahnwOd1_VI8kKLO5qqCt-Wy3L_0KzExNZZcubr5PajBjYCGiBRyZBhkHCSd5lgRpqdcg7nreq0RST3L6_3FE4giIXdkq9v4BVquK2jI3OTRHYU2og..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_4" uigs="article_title_4">杀疯了!谷歌卷视频到语音,逼真音效让<em><!--red_beg-->AI<!--red_end--></em>视频告别无声!</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_4">机器之心报道编辑:杨文<em><!--red_beg-->AI<!--red_end--></em>圈这遍地开花的大好局面,让吃瓜群众们甚是惊喜.这几天,大洋彼岸杀疯了!Luma 的热乎劲儿还没过去...</p>
<div class="s-p">
<span class="all-time-y2">机器之心</span><span class="s2"><script>document.write(timeConvert('1718684121'))</script>1小时前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_5" d="ab735a258a90e8e1-6bee54fcbd896b2a-aecd95b2f66788ce20e55951c2f2a4f3">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_5" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9BoLmkA-IWQiz2u9_XM5rQXH-x7xyaMxSPDl2MavyCdMeRo0r75_nHw0OjgPy_ywGD8QeMB6YrSL7GsiQkBLqN0egQsTuxSO8ByUzJ8K9KtqM8m7CHi2Oe3VGbS7XbjbYGATFN456YKdzflHRQc-s0yS5eQbEbA54VMt2d8As4qZAe0f8bRARvQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_5"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fsz_mmbiz_jpg%2FnxfZgMicFsW23g2bxM6yftg6Yiar0pPYAU6NYcO0ddQ7UJO1SeGO82pzKDLFpkN7ibymL4uzsMEwYcxeW8ZaeYZWA%2F0%3Fwx_fmt%3Djpeg&amp;sign=7d028147c057887977227f83cdafc9dc" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="height: 105px; width: auto; margin-left: -54px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9BoLmkA-IWQiz2u9_XM5rQXH-x7xyaMxSPDl2MavyCdMeRo0r75_nHw0OjgPy_ywGD8QeMB6YrSL7GsiQkBLqN0egQsTuxSO8ByUzJ8K9KtqM8m7CHi2Oe3VGbS7XbjbYGATFN456YKdzflHRQc-s0yS5eQbEbA54VMt2d8As4qZAe0f8bRARvQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_5" uigs="article_title_5"><em><!--red_beg-->AI<!--red_end--></em>女友 | 皮裤女孩</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_5">靴子猫VR,靴子猫<em><!--red_beg-->AI<!--red_end--></em>尽快关注下往期精彩可以点击图片跳转快上车美图壁纸 | 绚丽国风娘文末有彩蛋哦在繁华的都市里,韩梅梅是一位...</p>
<div class="s-p">
<span class="all-time-y2">靴子猫VRAI</span><span class="s2"><script>document.write(timeConvert('1718683514'))</script>1小时前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_6" d="ab735a258a90e8e1-6bee54fcbd896b2a-76681bc6b73f7e286e9bd2569cef4641">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_6" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9SDVmlkTldai90xbjE0RJZOa_7IgNdt7W-16bwkvlGT-ZJVxdws-L1d-RQ4bz-oTIbp5bsrIRzdX2u8jruyQosbhQ2QFTAXHxn5xNaDSj1VVv87twfHxPgL2sAi3UolxaEi0tiVnjl3BLTPRVQUFqD5jrvt03LmvrKbTEZxJYjwZhlgeRt7bSzA..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_6"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fsz_mmbiz_jpg%2FPsRu4HuQzCnH4d62NHI8zsibu2vSy3jsqcaTFkHhLhqc1X0skzDOGgCF2ibEEocbc6KsX7hm94xnzEUJR8m8LJbQ%2F0%3Fwx_fmt%3Djpeg&amp;sign=837a14ee1cab8ad59f3a006ad0aefcd7" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="height: 105px; width: auto; margin-left: -54px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9SDVmlkTldai90xbjE0RJZOa_7IgNdt7W-16bwkvlGT-ZJVxdws-L1d-RQ4bz-oTIbp5bsrIRzdX2u8jruyQosbhQ2QFTAXHxn5xNaDSj1VVv87twfHxPgL2sAi3UolxaEi0tiVnjl3BLTPRVQUFqD5jrvt03LmvrKbTEZxJYjwZhlgeRt7bSzA..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_6" uigs="article_title_6">谁家<em><!--red_beg-->AI<!--red_end--></em>如此狂野?</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_6">感受到<em><!--red_beg-->AI<!--red_end--></em>疯狂的想象力了吗?还不快来学习<em><!--red_beg-->AI<!--red_end--></em>?</p>
<div class="s-p">
<span class="all-time-y2">顶级程序员</span><span class="s2"><script>document.write(timeConvert('1718688168'))</script>6分钟前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_7" d="ab735a258a90e8e1-6bee54fcbd896b2a-123b840752b7c23325931d9478cc8ecb">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_7" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9U43qjqVnvXIDfYctxkCDdmKKyRVwojdT7fp6Vryr-AnK6cqts2YyDI0T_gGLpAj_o3QSm4fU1FH1GV4L9cSq_o7gC9WCiCj1hD87dH3fr2RndCJrGH7rDItSFUQeO2z8oYfsN21SLy0xygmqPNKfAyHBZE83rYdzPdF3_y_hMILP28temKTOwg..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_7"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=https%3A%2F%2Fmmbiz.qpic.cn%2Fmmbiz_jpg%2FPicLGiaOegHVcVBBFNv1N9Tk20HJKjo2dtqmckyIZaY51Xgic7K6hVG10PicpgVC3XWFRlz422qelFTTgVzX4vIasw%2F0%3Fwx_fmt%3Djpeg&amp;sign=e3c655c3a4752938ab25210426285351" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="width: 140px; height: auto; margin-top: -17.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9U43qjqVnvXIDfYctxkCDdmKKyRVwojdT7fp6Vryr-AnK6cqts2YyDI0T_gGLpAj_o3QSm4fU1FH1GV4L9cSq_o7gC9WCiCj1hD87dH3fr2RndCJrGH7rDItSFUQeO2z8oYfsN21SLy0xygmqPNKfAyHBZE83rYdzPdF3_y_hMILP28temKTOwg..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_7" uigs="article_title_7">OpenAI:非营利组织是核心使命,Runway发布新一代产品,善思开悟完成5000万元A轮融资</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_7">IMF直言对"<em><!--red_beg-->AI<!--red_end--></em>大量抢走人类饭碗"深感担忧,同时呼吁各国政府采取更多措施来保护自己的经济.IMF在报告中表示,生成式人工智能...</p>
<div class="s-p">
<span class="all-time-y2">科技每日推送</span><span class="s2"><script>document.write(timeConvert('1718687713'))</script>13分钟前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_8" d="ab735a258a90e8e1-6bee54fcbd896b2a-3209e5de9c455224e1be5b55981e6764">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_8" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd98W_d1zaDvTkMym4PfWrBC0foLvuzkkLQ_SqNLHeQcc8uYSXAA6PJcRT9V2d4xrZ-9mtz4sX-j3oUnzCzyGqKMAFM6ykCpdnrr1iBPlPeINVJ9OmVXG_a6e6OEqNcTTTZofGFhWBMpN81vh5DC7d53AxGGXFr0h7WINLgEIWF_az1AfaFW2DefQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_8"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=http%3A%2F%2Fmmbiz.qpic.cn%2Fmmbiz_jpg%2FCB1K8ks6fXf1PeIoGuWCVrbnG0AgkVJTosUbTZlxP5hzSRPOWb7RicdXrvQiagRB2SJI3S1jtOWpYzy5XaGL39Hg%2F0%3Fwx_fmt%3Djpeg&amp;sign=a2338fe00b34a0331cac5cb7412736eb" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="height: 105px; width: auto; margin-left: -53.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd98W_d1zaDvTkMym4PfWrBC0foLvuzkkLQ_SqNLHeQcc8uYSXAA6PJcRT9V2d4xrZ-9mtz4sX-j3oUnzCzyGqKMAFM6ykCpdnrr1iBPlPeINVJ9OmVXG_a6e6OEqNcTTTZofGFhWBMpN81vh5DC7d53AxGGXFr0h7WINLgEIWF_az1AfaFW2DefQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_8" uigs="article_title_8">今日<em><!--red_beg-->AI<!--red_end--></em>资讯-6.18</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_8">2024.6.181、绿米 <em><!--red_beg-->AI<!--red_end--></em> 智能存在传感器 FP1E开售2、Google 在 <em><!--red_beg-->AI<!--red_end--></em> 功能推出新功能,需要明确说明可能出错的地方3、北大、快手攻克复...</p>
<div class="s-p">
<span class="all-time-y2">创智科成AIGC</span><span class="s2"><script>document.write(timeConvert('1718684718'))</script>1小时前</span>
</div>
</div>
</li>

    <!-- z -->

    <!-- a -->
    <li id="sogou_vr_11002601_box_9" d="ab735a258a90e8e1-6bee54fcbd896b2a-ef5c4d08d5359fde59d68363acae9efe">
<div class="img-box">
<a data-z="art" target="_blank" id="sogou_vr_11002601_img_9" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9o9LVYFOEUpWo3D6uykufOAoKELT7V1GAKzWSZppfmv0TjtYG33rajaBIN9ehWOmKJqBoFsyBYMUvcjcUKCNbmsHQ_0WVe_wp_2_ziimxmkPczRL0GnuJFb-G5px-Ns7RXxesJtqbmTpyg_s2h8esxm6XnRkWGUV5N9z4kgSS9EL6zvkPgoArRQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" uigs="article_image_9"><img src="//img01.sogoucdn.com/v2/thumb?appid=201147&amp;url=http%3A%2F%2Fmmbiz.qpic.cn%2Fmmbiz_jpg%2FvmVatE9coazZvJBNRYe3XFuIKcwbCj4YGZknX3Jiac4aED06EzJ58DdXtUXsAFd7TliapC4OtdUpJ9dkWDwnXVBg%2F0%3Fwx_fmt%3Djpeg&amp;sign=938e07f8483eaea956a5b0647749d35e" onload="resizeImage(this,140,105)" onerror="errorImage(this)" style="height: 105px; width: auto; margin-left: -53.5px;"></a>
</div>
<div class="txt-box">
<h3>
<a target="_blank" href="/link?url=dn9a_-gY295K0Rci_xozVXfdMkSQTLW6cwJThYulHEtVjXrGTiVgS3e7YtGetgRttbDm-ybvTEZEIja2oyzzvVqXa8Fplpd9o9LVYFOEUpWo3D6uykufOAoKELT7V1GAKzWSZppfmv0TjtYG33rajaBIN9ehWOmKJqBoFsyBYMUvcjcUKCNbmsHQ_0WVe_wp_2_ziimxmkPczRL0GnuJFb-G5px-Ns7RXxesJtqbmTpyg_s2h8esxm6XnRkWGUV5N9z4kgSS9EL6zvkPgoArRQ..&amp;type=2&amp;query=AI&amp;token=0FE61702378AEFC572776A525A64415272D7085A66711B1B" id="sogou_vr_11002601_title_9" uigs="article_title_9"><em><!--red_beg-->AI<!--red_end--></em>在HR领域的未来</a>
</h3>
<p class="txt-info" id="sogou_vr_11002601_summary_9">✦++点击蓝字 关注我们全宇人力Metalent人工智能(<em><!--red_beg-->AI<!--red_end--></em>)正在通过改善员工体验和组织运营来改变人力资源管理.本文将讨论人工智能...</p>
<div class="s-p">
<span class="all-time-y2">全宇人力Metalent</span><span class="s2"><script>document.write(timeConvert('1718683235'))</script>1小时前</span>
</div>
</div>
</li>

    <!-- z -->

</ul>
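
To see how the XPath expressions in the crawler line up with this markup without launching a browser, the structure can be walked offline with the standard library. The sketch below parses a trimmed copy of one `li` from the excerpt above; the `href` is shortened to a placeholder, and `ResultParser` and `parse_results` are names invented for this illustration rather than part of the original program.

```python
from html.parser import HTMLParser

# One result trimmed from the excerpt above; the href is a placeholder.
SAMPLE = """
<ul class="news-list">
<li id="sogou_vr_11002601_box_6">
<div class="txt-box">
<h3><a href="/link?url=PLACEHOLDER" id="sogou_vr_11002601_title_6">谁家<em><!--red_beg-->AI<!--red_end--></em>如此狂野?</a></h3>
<p class="txt-info">感受到<em>AI</em>疯狂的想象力了吗?</p>
<div class="s-p"><span class="all-time-y2">顶级程序员</span><span class="s2">6分钟前</span></div>
</div>
</li>
</ul>
"""


class ResultParser(HTMLParser):
    """Collect title, link, summary and account name from each <li>."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._item = None    # dict for the <li> currently being read
        self._field = None   # which key incoming text belongs to
        self._in_h3 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class") or ""
        if tag == "li":
            self._item = {"title": "", "link": "", "summary": "", "source": ""}
        elif self._item is None:
            return
        elif tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:           # .//h3/a
            self._field = "title"
            self._item["link"] = attrs.get("href") or ""
        elif tag == "p" and css_class == "txt-info":    # .//p[@class="txt-info"]
            self._field = "summary"
        elif tag == "span" and css_class == "all-time-y2":  # account name in .s-p
            self._field = "source"

    def handle_endtag(self, tag):
        if tag in ("a", "p", "span"):
            self._field = None
        elif tag == "h3":
            self._in_h3 = False
        elif tag == "li" and self._item is not None:
            self.items.append(self._item)
            self._item = None

    def handle_data(self, data):
        # Text inside nested <em> highlights is appended to the current field
        if self._item is not None and self._field is not None:
            self._item[self._field] += data


def parse_results(html):
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    return parser.items


if __name__ == "__main__":
    for item in parse_results(SAMPLE):
        print(item)
```

The highlighted keyword sits inside `em` tags, which is why the title and summary have to be assembled from several text fragments; Selenium's `.text` does that joining for you in the crawler.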

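With that, a simple crawler is done! Note that the paths to chrome.exe and chromedriver.exe used in this article must be replaced manually with the locations of those programs on your own machine.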
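
One way to avoid editing the script on every machine is to read the two paths from environment variables. This is only a sketch: the variable names `CHROMEDRIVER_PATH` and `CHROME_BINARY_PATH` and the helper `resolve_chrome_paths` are assumptions made for this example, not something the original program defines.

```python
import os
import shutil


def resolve_chrome_paths(env=None):
    """Return (driver_path, binary_path); either value may be None if not found."""
    env = os.environ if env is None else env
    # CHROMEDRIVER_PATH / CHROME_BINARY_PATH are hypothetical variable names.
    # Fall back to searching PATH for an executable named "chromedriver".
    driver_path = env.get("CHROMEDRIVER_PATH") or shutil.which("chromedriver")
    binary_path = env.get("CHROME_BINARY_PATH") or None
    return driver_path, binary_path


if __name__ == "__main__":
    driver_path, binary_path = resolve_chrome_paths()
    print(f"chromedriver: {driver_path}")
    print(f"chrome binary: {binary_path}")
```

The two values can then be passed to `setup_driver` in place of the hard-coded strings in `main()`. Recent Selenium releases (4.6 and later) also bundle Selenium Manager, which can usually locate or download a matching driver when no driver path is supplied, so the explicit path may not be needed at all on a newer setup.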

Thanks for reading!
