1. Getting Started
Create a project:

```shell
scrapy startproject my_one_project  # command to create a project
cd my_one_project                   # enter the directory first; later commands run from here
```
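For reference, `startproject` generates a standard layout roughly like the sketch below (the exact file set may vary slightly between Scrapy versions):

```
my_one_project/
├── scrapy.cfg            # deploy/run configuration
└── my_one_project/
    ├── settings.py       # project settings (e.g. ROBOTSTXT_OBEY)
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    └── spiders/          # spider scripts live here
```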
The command to run the spider is `scrapy crawl tk`, where `tk` is the spider's `name` attribute; that is the name used when running.
```python
# spiders script
import scrapy


class TkSpider(scrapy.Spider):
    name = 'tk'  # run this spider with: scrapy crawl tk
    start_urls = ['https://www.baidu.com/']

    def parse(self, response, **kwargs):
        print(1111)
        print(response.text)
```
When running, the log shows:

```
[scrapy.downloadermiddlewares.robotstxt] DEBUG: Forbidden by robots.txt: ...
```
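That DEBUG line means Scrapy's robots.txt middleware blocked the request: the project template sets `ROBOTSTXT_OBEY = True` by default, so URLs disallowed by the site's robots.txt are skipped. A minimal fix is to disable the check in the project's `settings.py` (only do this for sites whose robots.txt you have decided to ignore):

```python
# my_one_project/settings.py
# Default is True; when True, Scrapy fetches robots.txt and
# silently drops any request the file disallows.
ROBOTSTXT_OBEY = False
```

After this change, rerunning `scrapy crawl tk` should reach `parse` and print the response body.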