Scraping comments with Selenium

from selenium import webdriver
from selenium.webdriver.common.by import By

IFRAME_SELECTOR = "iframe[title='livere-comment']"
SCROLL_TO_BOTTOM = "window.scrollTo(0, document.body.scrollHeight);"

driver = webdriver.Chrome()
# Implicit wait: every element lookup polls for up to 10 seconds.
# It is a session-wide setting, not a pause; raise it on a slow connection.
driver.implicitly_wait(10)
# The page to visit
driver.get("http://www.santostang.com/2018/07/04/hello-world/")

# Empty the result file left over from a previous run
open("result.txt", "w").close()

for ii in range(0, 3):
    # i indexes the page buttons within one group (at most 10 per group)
    for i in range(0, 10):
        # Scroll to the bottom of the page
        driver.execute_script(SCROLL_TO_BOTTOM)
        # Switch into the comment iframe and collect every comment on this page
        driver.switch_to.frame(driver.find_element(By.CSS_SELECTOR, IFRAME_SELECTOR))
        comments = driver.find_elements(By.CSS_SELECTOR, "div.reply-content")
        page_no = i + 1 + ii * 10
        print()
        print("Comments on page %d:" % page_no)
        # Append this page's comments to the result file
        with open("result.txt", "a+") as fo:
            fo.write("\n")
            fo.write("Comments on page %d:\n" % page_no)
            for each_comment in comments:
                content = each_comment.find_element(By.TAG_NAME, "p")
                print(content.text)
                # Drop characters that GBK cannot encode (emoji, for example)
                text = content.text.encode("gbk", "ignore").decode("gbk")
                fo.write(text + "\n")
        # Get all page-number buttons
        page_btn = driver.find_elements(By.CLASS_NAME, "page-btn")
        # Number of comment pages in this group (at most 10)
        page_btn_size = len(page_btn)
        if i == page_btn_size - 1:
            driver.switch_to.default_content()
            break
        # Click the next page button in order
        if i != 9 and i + 1 < page_btn_size:
            page_btn[i + 1].click()
        # Switch back out of the iframe; do not leave this out
        driver.switch_to.default_content()
    driver.execute_script(SCROLL_TO_BOTTOM)
    driver.switch_to.frame(driver.find_element(By.CSS_SELECTOR, IFRAME_SELECTOR))
    # Stop if there is no button leading to the next group of pages
    try:
        next_page = driver.find_element(By.CLASS_NAME, "page-last-btn")
        next_page.click()
        # Switch back out of the iframe; do not leave this out
        driver.switch_to.default_content()
    except Exception:
        print()
        print("Scraping finished! (This line is not scraped content.)")
        break
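
Two pieces of the script above can be tried without opening a browser: the page-number arithmetic (i + 1 + ii * 10) and the GBK round trip that strips characters the result file cannot hold. The following is only a minimal sketch. The helper names page_number, gbk_safe and save_page are hypothetical and do not appear in the original script, and opening the file with encoding="gbk" is an assumption (the original relies on the platform default encoding).

```python
from typing import Iterable


def page_number(group: int, index: int, pages_per_group: int = 10) -> int:
    """Return the 1-based comment page for button `index` in button group `group`."""
    return group * pages_per_group + index + 1


def gbk_safe(text: str) -> str:
    """Drop every character that GBK cannot encode (emoji, for example)."""
    return text.encode("gbk", "ignore").decode("gbk")


def save_page(path: str, page: int, comments: Iterable[str]) -> None:
    """Append one page of comments to a text file.

    Assumption: the file is written as GBK, which is why each comment
    is filtered with gbk_safe() first.
    """
    with open(path, "a", encoding="gbk") as fo:
        fo.write("\nComments on page %d:\n" % page)
        for text in comments:
            fo.write(gbk_safe(text) + "\n")


if __name__ == "__main__":
    print(page_number(1, 0))             # 11
    print(gbk_safe("nice \U0001F44D"))   # the emoji is removed
```

In the scraping loop, save_page("result.txt", page_number(ii, i), [c.text for c in comments]) would stand in for the inline file-writing code, assuming comments holds the paragraph elements found on the page.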
