闲来无事想看个小说,打算下载到电脑上看,找了半天,没找到可以下载的网站,于是就想自己爬取一下小说内容并保存到本地圣墟 第一章 沙漠中的彼岸花 - 辰东 - 6毛小说网 http://www.6mao.com/html/40/40184/12601161.html这是要爬取的网页观察结构私信小编01 02 03 04 05 即可获取数十套PDF哦!下一章然后开始创建scrapy项目:其中sixmaospider.py:# -*- coding: utf-8 -*-import scrapyfrom ..items import SixmaoItemclass SixmaospiderSpider(scrapy.Spider): name = 'sixmaospider' #allowed_domains = ['http://www.6mao.com'] start_urls = ['http://www.6mao.com/html/40/40184/12601161.html'] #圣墟 def parse(self, response): novel_biaoti = response.xpath('//div[@id="content"]/h1/text()').extract() #print(novel_biaoti) novel_neirong=response.xpath('//div[@id="neirong"]/text()').extract() print(novel_neirong)
………………………………