Scraping a table from a front-end page with a Python script and exporting it to Excel

江一铭 2025-01-10 9:20

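A few days ago a front-end colleague mentioned that the back end never implemented an export feature, but HR now needs this table as an Excel file. So let's just scrape it with a script; about 30 lines of code does the job.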

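You need a Python 3 environment on your machine. Open the script with the interpreter, install any missing packages (requests, beautifulsoup4, pandas, and openpyxl, which pandas uses to write .xlsx files), and run it. One thing to note: the script calls find('table') directly, which returns only the first table on the page. If the page has several tables and you want a specific one, look it up by its id. That's it, class dismissed.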

import requests
from bs4 import BeautifulSoup
import pandas as pd

# Fetch the page content
url = "http://127.0.0.1:53893/"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
html_content = response.text

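# Parse the HTML, locate the table, and extract the header cells
soup = BeautifulSoup(html_content, 'html.parser')
table = soup.find('table')  # first <table> on the page
if table is None:
    raise SystemExit("No <table> found on the page")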

headers = []
for th in table.find_all('th'):
    headers.append(th.text.strip())

# Extract the data rows of the table
rows = []
for tr in table.find_all('tr')[1:]:  # start from the second row; the first row is the header
    cells = tr.find_all('td')
    row = [cell.text.strip() for cell in cells]
    if row:
        rows.append(row)

df = pd.DataFrame(rows, columns=headers)

# Export to Excel
df.to_excel('index.xlsx', index=False)

print("Data successfully exported to index.xlsx")
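The note about pages with more than one table can be made concrete. Below is a minimal sketch, not part of the original script: the inline HTML, the id `staff-table`, and the helper name `extract_table` are all made up for illustration. It shows how passing `id=` to `find` picks one specific table instead of the first one on the page.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration: two tables, only one of which we want.
SAMPLE_HTML = """
<table id="menu"><tr><th>Link</th></tr><tr><td>Home</td></tr></table>
<table id="staff-table">
  <tr><th>Name</th><th>Department</th></tr>
  <tr><td>Alice</td><td>HR</td></tr>
  <tr><td>Bob</td><td>Finance</td></tr>
</table>
"""


def extract_table(html, table_id):
    """Return (headers, rows) for the table with the given id (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # id= narrows the search to the one table carrying that id attribute
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"no <table> with id={table_id!r} found")
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows contain only <th>, so they yield no cells
            rows.append(cells)
    return headers, rows


headers, rows = extract_table(SAMPLE_HTML, "staff-table")
print(headers)
print(rows)
```

In the script above, the same idea would amount to replacing `soup.find('table')` with `soup.find('table', id=...)`, using whatever id the target table actually has in the page source.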