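Example code description:
Pick a novel on a novel website and save the content of each chapter as a txt file, with each file named after the chapter title.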
import requests
from lxml import etree

# URL of the novel's catalog page
Anovellink = 'https://www.hongxiu.com/book/18899519001291804#Catalog'
# HTML source of the catalog page
ContentsPageCode = requests.get(Anovellink).text
# Parsed catalog page
ContentsPage = etree.HTML(ContentsPageCode)
# href attribute of every chapter link in the catalog
href = ContentsPage.xpath('//*[@id="j-catalogWrap"]/div[2]/div/ul/li/a/@href')
for link in href:
    # Full chapter URL
    linkaddress = 'https://www.hongxiu.com' + link
    # HTML source of the chapter page
    Chapterpagecode = requests.get(linkaddress).text
    # Parsed chapter page
    Chapterpage = etree.HTML(Chapterpagecode)
    # List of paragraph texts
    Literallist = Chapterpage.xpath('//div[@class="ywskythunderfont"]/p/text()')
    # Chapter title
    title = Chapterpage.xpath('//h1[@class ="j_chapterName"]/text()')[0]
    # Write the chapter to a file named after its title; the output
    # directory must already exist, and the with block closes the file
    with open('E:/novelpython/' + title + '.txt', 'w', encoding='utf-8') as file:
        for paragraph in Literallist:
            file.write(paragraph + '\n')
    print(title + ' Chapter crawling is complete')
print('The novel pulling is complete')
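The script above uses the chapter title directly as a file name and builds each chapter URL by string concatenation. A title may contain characters that Windows does not allow in file names (such as `?` or `:`), and the href values may not always be site-relative paths. The following is a minimal sketch of two helpers that could guard against both cases. `safe_filename`, `absolute_url`, and `BASE_URL` are hypothetical names introduced here for illustration, not part of the original script, and whether the site actually returns such titles or hrefs is an assumption.

```python
import re
from urllib.parse import urljoin

# Assumed site root, taken from the URL used in the script above
BASE_URL = 'https://www.hongxiu.com'


def safe_filename(title, replacement='_'):
    """Replace characters that Windows forbids in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title).strip()
    # Fall back to a placeholder if nothing usable is left
    return cleaned or 'untitled'


def absolute_url(href, base=BASE_URL):
    """Resolve a relative or protocol-relative href against the site root."""
    return urljoin(base, href)
```

In the loop, `linkaddress = absolute_url(link)` and `safe_filename(title)` in the `open` call would replace the plain concatenations.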
Example output:

