Scraping novels from Qidian (起点中文网) with Python

Complete code:

import os

import requests
from lxml import etree

header = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36'}


# Collect the URL and title of every chapter from the catalog page
def getbookurls():
    url = 'https://book.qidian.com/info/1017125042#Catalog'
    # Fetch the page source
    charptes = requests.get(url, headers=header).text
    objects = etree.HTML(charptes)
    # Chapter entries; a leading // matches at any depth
    objs = objects.xpath('//ul[@class="cf"]/li')
    clist = []
    for obj in objs:
        try:
            # Chapter URL
            chapt_urls = obj.xpath('a/@href')[0]
            # Chapter title
            chapt_names = obj.xpath('a/text()')[0]
            into = {'chapt_urls': 'https:' + chapt_urls, 'chapt_names': chapt_names}
            clist.append(into)
        except IndexError:
            # Skip <li> items that have no link or no title
            pass
    return clist


clist = getbookurls()


# Fetch the text of one chapter
def getcontent(url):
    res = requests.get(url, headers=header).text
    objects = etree.HTML(res)
    objs = objects.xpath('//div[@class="read-content j_readContent"]/p/text()')
    content = []
    for i in objs:
        # replace(old, new): strip the two full-width spaces that indent each paragraph
        text = i.replace('\u3000\u3000', '')
        content.append(text)
    return content


# Download the novel
# Save path; change it to suit your own setup
os.makedirs('起点小说', exist_ok=True)
for i in clist:
    chapt_urls = i['chapt_urls']
    chapt_names = i['chapt_names']
    content = getcontent(chapt_urls)
    text = ''.join(content)
    print("Downloading %s" % chapt_names)
    # The file holds plain text despite the .doc extension
    with open('起点小说/%s.doc' % chapt_names, 'w', encoding='utf-8') as f:
        f.write(text)
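
The download loop names each file after its chapter title, so a title containing a character such as / or ? would make open() fail. Below is a minimal sketch of one way to guard against that. The helpers safe_filename and chapter_path are hypothetical names introduced here for illustration and are not part of the original script. The set of rejected characters is an assumption based on common Windows and Unix restrictions.

```python
import re
from pathlib import Path

# Characters commonly rejected in file names on Windows or Unix (assumed set).
_ILLEGAL = re.compile(r'[\\/:*?"<>|\r\n\t]')


def safe_filename(name, replacement='_'):
    """Hypothetical helper: replace characters that are unsafe in file names."""
    cleaned = _ILLEGAL.sub(replacement, name).strip(' .')
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or 'untitled'


def chapter_path(out_dir, chapt_name, suffix='.txt'):
    """Hypothetical helper: build the output path for one chapter.

    Only constructs the path; it does not create files or directories.
    """
    return Path(out_dir) / (safe_filename(chapt_name) + suffix)
```

In the loop, the open() call could then take chapter_path('起点小说', chapt_names) instead of the formatted string, assuming the output directory already exists.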
