[Shell] Simulating a Web Crawler to Download the Novel 天龙八部 (Demi-Gods and Semi-Devils)

因缘而起, 2025-04-08 11:21

Shell script:


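#!/bin/bash
# Fetch the table-of-contents page once; skip the request if it is already on disk.
[ -f tianlong.html ] || curl https://tianlong.5000yan.com/ -o tianlong.html

# Chapter URLs are the 6th double-quote-delimited field; titles are the link text.
# Use > rather than >> so that rerunning the script does not append duplicate entries.
grep "href=" tianlong.html | grep html | awk -F"\"" '{ print $6 }' > urls.txt
grep "href=" tianlong.html | grep html | awk -F">" '{ print $3 }' | awk -F"<" '{ print $1 }' > titles.txt

# Open both lists on separate file descriptors so they can be read in lockstep.
exec 3<urls.txt
exec 4<titles.txt

# -r keeps backslashes literal; -u N reads from file descriptor N (bash-specific).
while read -r -u 3 url && read -r -u 4 title
do
	echo "$title : $url"
	curl "$url" -o "${title}.html"
done

# Close the file descriptors.
exec 3<&-
exec 4<&-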
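
The script above keeps URLs and titles in two separate files and relies on them staying line-aligned. As a rough sketch of an alternative, the functions below extract each URL together with its title in a single pass and sanitize the title before using it as a file name. The function names `extract_links`, `safe_name`, and `download_all` are hypothetical, and the code assumes that every chapter link sits on its own line in the form `<a ... href="URL.html">TITLE</a>`; the real markup of the index page may differ.

```bash
#!/bin/bash
# Sketch only: helper names and the assumed HTML layout are illustrative.

# Read HTML on stdin and print "url<TAB>title" for each line with a .html link.
extract_links() {
	local line re='href="([^"]*\.html)"[^>]*>([^<]+)<'
	while IFS= read -r line; do
		if [[ $line =~ $re ]]; then
			printf '%s\t%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
		fi
	done
}

# Replace slashes and spaces in a title so it is usable as a file name.
safe_name() {
	local name=${1//\//_}
	printf '%s\n' "${name// /_}"
}

# Read "url<TAB>title" pairs on stdin and download each chapter.
download_all() {
	local url title
	while IFS=$'\t' read -r url title; do
		# -f: treat HTTP errors as failures; -sS: quiet, but still show errors.
		curl -fsS "$url" -o "$(safe_name "$title").html" || echo "failed: $url" >&2
		sleep 1   # pause between requests to go easy on the server
	done
}
```

With the index page already saved as tianlong.html, the pieces could be combined as `extract_links < tianlong.html | download_all`. Because each URL travels with its title on one line, a link with a missing title can no longer shift every later pair out of step.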

Downloaded files:

Result after downloading:
