Scraping Ele.me with Python: a crawler example

The Ele.me (饿了么) food-delivery site loads its content dynamically through AJAX.

Version 1: Extracting directly from the page

import time

import requests
from lxml import etree

url = 'https://www.ele.me/place/ws101hcw982?latitude=22.52721&longitude=113.95232'
response = requests.get(url)
print(response.status_code)
# Note: sleeping here does not help, because requests never runs the page's JavaScript
time.sleep(10)

html = response.content
selector = etree.HTML(html)
rez = selector.xpath('//*[@class="place-rstbox clearfix"]')
print('haha', rez)  # []

for i in rez:
    # The leading "." keeps each query relative to the current node
    name = i.xpath('.//*[@class="rstblock-title"]/text()')
    print(name)
    msales = i.xpath('.//*[@class="rstblock-monthsales"]/text()')
    tip = i.xpath('.//*[@class="rstblock-cost"]/text()')
    stime = i.xpath('.//*[@class="rstblock-logo"]/span/text()')
    print('Restaurant name')
    for j in name:
        print(j)
        break

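Problem: the XPath //*[@class="place-rstbox clearfix"] is correct for the rendered page, yet rez comes back empty. Because the restaurant list is loaded by AJAX after the initial page load, the HTML that requests receives most likely does not contain these nodes yet.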
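The empty result can be reproduced offline. The sketch below uses only the standard library and two made-up HTML strings: STATIC_SHELL stands in for what a plain HTTP client might receive, and RENDERED for the DOM after the AJAX calls finish. Both strings, and the helper names ClassCounter and count_class, are assumptions for illustration and not markup or code taken from Ele.me.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute equals a target value."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if dict(attrs).get("class") == self.target_class:
            self.count += 1


def count_class(html, target_class):
    """Return how many elements in html carry exactly target_class."""
    parser = ClassCounter(target_class)
    parser.feed(html)
    return parser.count


# Hypothetical markup: an empty application shell, as a non-JS client sees it
STATIC_SHELL = (
    '<html><body><div id="app"></div>'
    '<script src="app.js"></script></body></html>'
)
# Hypothetical markup: the same page after JavaScript has filled it in
RENDERED = (
    '<html><body><div id="app">'
    '<div class="place-rstbox clearfix">'
    '<div class="rstblock-title">Shop A</div>'
    '</div></div></body></html>'
)

print(count_class(STATIC_SHELL, "place-rstbox clearfix"))
print(count_class(RENDERED, "place-rstbox clearfix"))
```

If the first count is zero for the real response body as well, the selector is not at fault; the data simply is not in the static HTML.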

Version 2: Extracting through the API

geohash=ws101hcw982&latitude=22.52721&longitude=113.95232: location parameters and their values

terminal=web: channel information

extras[]=activities and offset=0: purpose unknown
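To make the parameters easier to vary, the query string can be assembled from named values instead of being written by hand. This is a minimal sketch: the endpoint and parameter names come from the URL used in this version, while the function name build_url and its defaults are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the request URL used in this version
BASE_URL = "https://www.ele.me/restapi/shopping/restaurants"


def build_url(geohash, latitude, longitude, limit=30, offset=0):
    """Assemble the restaurant-list URL from its individual parameters."""
    params = [
        ("extras[]", "activities"),
        ("geohash", geohash),
        ("latitude", latitude),
        ("longitude", longitude),
        ("limit", limit),
        ("offset", offset),
        ("terminal", "web"),
    ]
    # urlencode percent-encodes the brackets in "extras[]"
    return BASE_URL + "?" + urlencode(params)


url = build_url("ws101hcw982", "22.52721", "113.95232")
print(url)
```

Parsing the result back with parse_qs(urlsplit(url).query) is a quick way to confirm that every parameter survived the encoding.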

import json

import requests

url = 'https://www.ele.me/restapi/shopping/restaurants?extras[]=activities&geohash=ws101hcw982&latitude=22.52721&limit=30&longitude=113.95232&offset=0&terminal=web'
resp = requests.get(url)
print(resp.status_code)

Jdata = json.loads(resp.text)
# print(Jdata)
for n in Jdata:
    name = n['name']
    msales = n['recent_order_num']
    stime = n['order_lead_time']
    tip = n['description']
    phone = n['phone']
    print(name)

Output: I expected limit=100 to return 100 restaurants, but at most 30 are returned per request.
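One possible way around the cap of 30 is to request several pages and advance offset each time. The article does not confirm what offset does, so treating it as a start index is an assumption here. The sketch keeps the HTTP call out of the loop: fetch_all and fake_fetch_page are hypothetical names, and FAKE_DATA is invented test data, not API output.

```python
def fetch_all(fetch_page, total, page_size=30):
    """Collect up to `total` records by stepping the offset page by page.

    fetch_page(offset, limit) must return a list of records; an empty
    list is treated as the end of the data.
    """
    results = []
    offset = 0
    while len(results) < total:
        page = fetch_page(offset, page_size)
        if not page:
            break
        results.extend(page)
        offset += len(page)
    return results[:total]


# Stand-in for the real HTTP call: 75 invented records
FAKE_DATA = [{"name": "shop-%d" % i} for i in range(75)]


def fake_fetch_page(offset, limit):
    limit = min(limit, 30)  # mimic a server-side cap of 30 per request
    return FAKE_DATA[offset:offset + limit]


shops = fetch_all(fake_fetch_page, total=100)
print(len(shops))
```

With the real endpoint, fetch_page would wrap requests.get(url, params=...) and pass offset and limit through. Whether the server honors offset this way would need to be checked against actual responses.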

Version 3: Extracting with Selenium

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
import selenium.webdriver.support.ui as ui

# PhantomJS support was removed in Selenium 4; on newer releases use the Chrome line instead
driver = webdriver.PhantomJS(executable_path=r"C:\Python27\phantomjs.exe")
# driver = webdriver.Chrome()
driver.get('https://www.ele.me/place/ws101hcw982?latitude=22.52721&longitude=113.95232')
time.sleep(10)
driver.get_screenshot_as_file("E:\\Elm_ok.jpg")

# Wait up to 10 s for the restaurant list container to appear
wait = ui.WebDriverWait(driver, 10)
wait.until(lambda driver: driver.find_element(By.XPATH, '//div[@class="place-rstbox clearfix"]'))

name = driver.find_element(By.XPATH, '//*[@class="rstblock-title"]').text
msales = driver.find_element(By.XPATH, '//*[@class="rstblock-monthsales"]').text
tip = driver.find_element(By.XPATH, '//*[@class="rstblock-cost"]').text
stime = driver.find_element(By.XPATH, '//*[@class="rstblock-logo"]/span').text
print(name)  # 乐凯撒比萨(生态园店)

Note: find_element returns only the first matching element.

Improved version

# coding=utf-8
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
import selenium.webdriver.support.ui as ui

# PhantomJS support was removed in Selenium 4; on newer releases use the Chrome line instead
driver = webdriver.PhantomJS(executable_path=r"C:\Python27\phantomjs.exe")
# driver = webdriver.Chrome()
driver.get('https://www.ele.me/place/ws101hcw982?latitude=22.52721&longitude=113.95232')
time.sleep(10)
# driver.get_screenshot_as_file("E:\\Elm_ok.jpg")

wait = ui.WebDriverWait(driver, 10)
wait.until(lambda driver: driver.find_element(By.XPATH, '//div[@class="place-rstbox clearfix"]'))
# driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")  # scroll to the bottom of the page


def execute_times(times):
    # Scrolls times + 1 times in total, pausing 5 s after each scroll
    for i in range(times + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(5)


execute_times(20)

# find_elements returns every match, not just the first
name = driver.find_elements(By.XPATH, '//*[@class="rstblock-title"]')
msales = driver.find_elements(By.XPATH, '//*[@class="rstblock-monthsales"]')
tip = driver.find_elements(By.XPATH, '//*[@class="rstblock-cost"]')
stime = driver.find_elements(By.XPATH, '//*[@class="rstblock-logo"]/span')
# print(name, msales, stime, tip)  # [
print(type(tip))  # find_elements returns a list
print(len(name))  # 120
for i in name:
    print(i.text)

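Explanation: the execute_times function scrolls to the bottom of the page and pauses 5 s after each scroll, which gives the page time to load more restaurants.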
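A fixed number of scrolls either wastes time or stops too early. One alternative, sketched here as an assumption and not as the author's method, is to keep scrolling until the number of loaded items stops growing. The helper scroll_until_stable and the FakePage class are hypothetical; FakePage merely simulates a page that reveals 30 more items per scroll up to 120, so the loop can be exercised without a browser.

```python
import time


def scroll_until_stable(scroll, count_items, max_rounds=20, pause=5.0):
    """Call scroll() until count_items() stops increasing or max_rounds is hit.

    Returns the final item count.
    """
    previous = count_items()
    for _ in range(max_rounds):
        scroll()
        time.sleep(pause)  # give the page time to load the next batch
        current = count_items()
        if current == previous:
            break  # nothing new appeared, so assume the list is complete
        previous = current
    return previous


class FakePage:
    """Simulated page: each scroll reveals `step` more items, up to `total`."""

    def __init__(self, total=120, step=30):
        self.total = total
        self.step = step
        self.loaded = step
        self.scrolls = 0

    def scroll(self):
        self.scrolls += 1
        self.loaded = min(self.total, self.loaded + self.step)

    def count(self):
        return self.loaded


page = FakePage()
final = scroll_until_stable(page.scroll, page.count, pause=0)
print(final, page.scrolls)
```

With Selenium, scroll could be a function that calls driver.execute_script("window.scrollTo(0, document.body.scrollHeight);") and count_items one that returns the length of the list from driver.find_elements. A slow network can make one round look stable by accident, so a generous pause is still advisable.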

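Output: the number of matched elements (120 in the author's run), followed by the name of each restaurant.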
