Using Selenium in Python to scrape Douyu room data and write it to a MongoDB database

This post revises my earlier Douyu scraper so that the scraped data is written directly into a MongoDB database.

(I learned MongoDB today.)

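from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
import time
from pymongo import MongoClient

# Connect to the douyu database
db = MongoClient(host="127.0.0.1", port=27017).douyu

url = "https://www.douyu.com/directory/all"
Gdriver = webdriver.Chrome()
# Send the request
Gdriver.get(url)
# Wait a few seconds after the browser opens so the page can finish loading
time.sleep(3)

while True:
    # Extract the data: one <li> per room
    room_list = Gdriver.find_elements(By.XPATH, "//li[@class = 'layout-Cover-item']")
    for i in room_list:
        room_dict = {}
        # Keys: 标题 = title, 类型 = category, 主播name = streamer name, 热度 = popularity
        room_dict["标题"] = i.find_element(By.XPATH, ".//h3[@class='DyListCover-intro']").get_attribute("title")
        room_dict["类型"] = i.find_element(By.XPATH, ".//span[@class='DyListCover-zone']").text
        room_dict["主播name"] = i.find_element(By.XPATH, ".//h2[@class='DyListCover-user']").text
        room_dict["热度"] = i.find_element(By.XPATH, ".//span[@class = 'DyListCover-hot']").text
        print(room_dict)
        # Write the record to the database
        db.aa.insert_one(room_dict)
    # Look for the next-page button. find_element raises NoSuchElementException
    # rather than returning None, so stop the loop when the button is missing.
    try:
        next_page = Gdriver.find_element(By.XPATH, "//li[@class=' dy-Pagination-next']/span")
    except NoSuchElementException:
        break
    next_page.click()  # go to the next page
    time.sleep(3)  # sleep 3 seconds while the page loads

# Quit the browser
Gdriver.quit()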
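
The 热度 (popularity) value above is stored as the text shown on the page. If you want to sort or filter on it in MongoDB, it may help to convert it to a number before inserting. The sketch below assumes the text looks like "12.3万" (万 meaning 10,000) or a plain number such as "8560"; that format is my assumption, and parse_hot is a hypothetical helper, not part of the original scraper.

```python
from decimal import Decimal, InvalidOperation


def parse_hot(text):
    """Convert popularity text such as '12.3万' or '8560' to an int.

    Returns None when the text cannot be parsed. The '万' suffix
    (10,000) is an assumed display format.
    """
    text = text.strip()
    multiplier = 1
    if text.endswith("万"):
        multiplier = 10000
        text = text[:-1]
    try:
        # Decimal avoids binary floating-point rounding on values like 12.3
        return int(Decimal(text) * multiplier)
    except (InvalidOperation, ValueError, OverflowError):
        return None
```

It could be used as room_dict["热度"] = parse_hot(raw_text), keeping the raw text in a separate key if the original string is still needed.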

The database-related part of the code is shown below.

# Connect to the douyu database
db = MongoClient(host="127.0.0.1", port=27017).douyu

# Write the record to the database
db.aa.insert_one(room_dict)

Screenshots of a run are shown below.

Running in PyCharm

Checking the inserted data in the database terminal

Note that the insert method in the pymongo library is deprecated (and removed in PyMongo 4); use insert_one or insert_many instead.
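
As a rough sketch of the insert_many route, the rooms from one page could be collected in a list and written in a single call instead of one insert_one per room. The helper name insert_page is hypothetical; it only assumes that the collection object has pymongo's insert_many method, which returns a result with an inserted_ids list.

```python
def insert_page(collection, rooms):
    """Write one page of room dicts with a single insert_many call.

    Returns the number of inserted documents. An empty page is skipped,
    because insert_many does not accept an empty list.
    """
    if not rooms:
        return 0
    result = collection.insert_many(rooms)
    return len(result.inserted_ids)
```

In the scraper above, this would mean appending each room_dict to a list inside the for loop and calling insert_page(db.aa, page_rooms) once per page.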
