Scraping Ctrip hotel reviews with Python: fetching the Ctrip hotel list
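A note to self; tested and working. The `key`, `huk`, and cookie values in the code were copied from my browser's DevTools and have most likely expired, so capture fresh ones before running it.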
```python
import requests
import json
from lxml import etree
from bs4 import BeautifulSoup
url = 'https://m.ctrip.com/webapp/hotel/j/hotellistbody?pageid=212093&key=a618c%60iDcf8%C2%A814KGaa035%2FJy2bddcaP%40e65d85d%3Fc61f44d399389006bebbc5cb&huk=S9aJ3pe7Te4PvAnwdYlfYp6E3YLpemkEn4jzaWaYs9jZmw0OJU0jAYoTjH9EbfvgbjMYL6Ia8wl5vAZwHYUcWQ7YTDyUljcSvdFeG1RHfj0leSYkSv0MKgkYGawb9jnke87yhZYtYmYnUvoMem0YHNidqYZYOY1pyDqjanx0PeGPwmMjMfrXgjNlR34iazwo9IU1Y9LjPYmse5LyShi9LiBGjUYoOyBpw8dYhFwFmy7fwakYmDwXY7BRqswTUWMAj9UJcqig6W4HYGdwDaRXUwazjoqRkYGsrQzjhGythw9OjfpjQ8wc8vhzRgOWktRAdyUQYkMiLARdgifYfmRDGw5DWq6JF5yfojHqRdtvFUw8GJ4QJlpYc8RUY6OjmbwApvMqjHYZPROkvzZY3PWaHePnRg8WmZj4AWAYPDRB4vtcY3TWQHeHpRqZWU6Ez6W6YD1RUMw8oW7Tjt0JFciGmW68YZFJt7RhsELSycz'
payloadheader = {
# The :authority, :method, :path and :scheme entries shown in DevTools are HTTP/2
# pseudo-headers, not real request headers, so they are not sent here.
'accept': 'text/html',
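# 'br' is left out: requests only decodes Brotli when the brotli package is installed.
'accept-encoding': 'gzip, deflate',
'accept-language': 'zh-CN,zh;q=0.9',
# Content-Length is computed by requests from the actual body, so it is not hard-coded.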
'content-type': 'application/json',
# The cookie is written as two adjacent string literals, which Python joins into one value.
'cookie': 'HLDUUID=64e4e34a2f6a4a8d9c4f2f001038116e; supportwebp=true; list_hotel_price=%7B%22traceid%22%3A%22100004883-3a7064cb-6afe-4a60-a521-56fc832ff2fd%22%2C%22pageid%22%3A%22212093%22%2C%22searchcandidate%22%3A%7B%22bedtype%22%3A%22%22%2C%22breakfast%22%3A-1%2C%22childs%22%3A%5B%5D%2C%22person%22%3A0%2C%22segmentationno%22%3A0%2C%22showtype%22%3A0%7D%2C%22timestamp%22%3A1608518070983%2C%22minpriceroom%22%3A%7B%22avgprice%22%3A151%2C%22cashbackamount%22%3A0%2C%22couponamount%22%3A0%2C%22currency%22%3A%22RMB%22%2C%22iscanreserve%22%3A1%2C%22isshadow%22%3A0%2C%22isusedcashback%22%3A1%2C%22isusedcoupon%22%3A-1%2C%22reductionamount%22%3A0%2C%22roomid%22%3A822100983%2C%22shadowId%22%3A0%2C%22taxfee%22%3A0%7D%2C%22ttype%22%3A0%2C%22icp%22%3A0%2C%22ipd%22%3A0%2C%22isopenpricetolerate%22%3A1%2C%22passData%22%3A%7B%22minPriceDetailInfo%22%3A%22%22%7D%7D; JSESSIONID=64C3384CD7932D78B0A8FE05C84A2F86; ibulanguage=CN; ibulocale=zh_cn; cookiePricesDisplayed=CNY; _RSG=HG4RzA1yyYA5fvTO5InEOA; _RDG=28c7fc45c749a92e343672259641905462; _RGUID=a8bcf012-bd6c-4411-a60e-c52158779a45; MKT_CKID=1604564189751.y0gdt.z92s; _ga=GA1.2.1442630158.1604564190; _abtest_userid=98f1c193-aeed-44cc-bfb7-dce45016a23e; MKT_Pagesource=PC; cticket=D7E5D6FF1FCFE372FCBA8D08018B18EE850DBAD8BF5709BC690907FDEF5F9F25; AHeadUserInfo=VipGrade=0&VipGradeName=%C6%D5%CD%A8%BB%E1%D4%B1&UserName=&NoReadMessageCount=0; GUID=09031149213039616044; HotelCityID=380split%E5%8D%97%E5%AE%81splitNanningsplit2020-12-4split2020-12-05split0; ticket_ctrip=bJ9RlCHVwlu1ZjyusRi+ypZ7X2r4+yojTmzDcktI7M4C0EskLVMK0lrCXhnuDj8dA92pzWks5c8FmvfoxNQGlaonIkM38NfKbL40tW25T0QscE9b+Sb/7zIpAeumkbTXJIkVlDueYw0WicYQmyektJJvs+HhzLkrE8dJCvC+AOCQaWtGiVt3Cef6UJjmUCzNs2gc/5qxfR0QXOmSVy5ovW67AaAmI4Lh8pxiIkXUzSTlqPEwCu0cCiFT3jPVIS8JpWHx/7RpzE+TuVsYbFWpQPJ1ta5X/rzL7M6DGZvk+zk=; nfes_isSupportWebP=1; '
'MKT_OrderClick=ASID=48972257110&AID=4897&CSID=2257110&OUID=&CT=1608117770096&CURL=https%3A%2F%2Fhotels.ctrip.com%2Fhotels%2FdetailPage%3Fallianceid%3D4897%26sid%3D2257110%26bd_vid%3D12272123516948689266%26keywordid%3D189730625678%26checkin%3D2020-12-16%26checkout%3D2020-12-17%26hotelId%3D1719392%26adult%3D1%26crn%3D1&VAL={"pc_vid":"1605781965521.40zvlg"}; _RF1=220.166.229.158; librauuid=MtdcV8c9mlmF23mo; DUID=u=CB680F8835D75F2D847143D9A633A7C4&v=0; _gid=GA1.2.1929230059.1608515166; MKT_CKID_LMT=1608515166001; __utma=13090024.1442630158.1604564190.1608517975.1608517975.1; __utmz=13090024.1608517975.1.1.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; __utmc=13090024; Session=smartlinkcode=U130026&smartlinklanguage=zh&SmartLinkKeyWord=&SmartLinkQuary=&SmartLinkHost=; MKT_code=OUID=&AllianceID=4897&SID=353693&SourceID=55551825&AppID=&OpenID=&exmktID=&createtime=1608517989&Expires=1609122788723; Union=OUID=&AllianceID=4897&SID=353693&SourceID=55551825&AppID=&OpenID=&exmktID=&createtime=1608517989&Expires=1609122788723; intl_ht1=h4=22249_455053,32_919951,12_1945382,547_37192103,477_2097037,2_2258966; _bfs=1.1; _uetsid=46004140432e11eb889b8b9838b6f929; _uetvid=3799d0201f3f11eba4ab0740f1d353e0; _bfi=p1%3D102002%26p2%3D102002%26v1%3D2424%26v2%3D2423; _jzqco=%7C%7C%7C%7C1608515166292%7C1.453714554.1604564189747.1608521336199.1608529226601.1608521336199.1608529226601.0.0.0.2016.2016; __zpspc=9.52.1608529226.1608529226.1%232%7Cwww.baidu.com%7C%7C%7C%7C%23; appFloatCnt=1998; hotelhst=1164390341; _bfa=1.1605781965521.40zvlg.1.1608520759447.1608530252749.66.2425.212093',
'origin': 'https://m.ctrip.com',
'referer': 'https://m.ctrip.com/webapp/hotel/guangzhou32/',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36',
'x-ctrip-hotel-firend': '----',
'x-requested-with': 'XMLHttpRequest'
}
payloadParam = {
'abMaps': {},
'adultCounts': 0,
'business': False,  # Python booleans serialize to JSON true/false
# 'checkinDate': "20201221",
# 'checkoutDate': "20201222",
'cityID': 28,
'controlBitMap': 1,
'costPerformanceHigh': False,
'districtID': 0,
'domesticHotelList': "domesticHotelList",
'enableAdHotel': True,
'filterItemList': [],
'hiddenHotelIDList': [],
'hiddenHotelIdListStr': "",
'highestPrice': 0,
'hotelIdList': [],
'keyword': "",
# Keyword filter for the hotel chain 锦江之星 (Jinjiang Inn)
'keywordFilterItem': "hotkeyword-锦江之星|||锦江之星12|",
'keywordText': "",
'locationItemList': [],
'lowestPrice': 0,
'multipleHotel': False,
'needLastPageRecommend': False,
'notTopSet': False,
'pageSize': 20  # results per page
}
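# Serialize the payload ourselves: the content-type header declares a JSON body.
dumpJson = json.dumps(payloadParam)
html = requests.post(url, data=dumpJson, headers=payloadheader, timeout=10)
html.raise_for_status()  # stop early on HTTP errors (expired key/cookie, blocked request)
# The response is an HTML fragment; html5lib (pip install html5lib) wraps it in <html><body>.
soup = BeautifulSoup(html.text, "html5lib")
html_tree = etree.HTML(str(soup))
# Collect the data-id attribute (presumably the hotel ID) of every top-level <div>.
code = html_tree.xpath('/html/body/div/@data-id')
json_code = json.dumps(code)
print(json_code)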
```
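The last step of the script, pulling `data-id` out of the top-level `<div>` elements, can be checked offline without hitting Ctrip. The following is only a sketch under assumptions: it swaps lxml/html5lib for the standard-library `html.parser` so it runs with no extra installs, the names `TopLevelDataIdParser` and `extract_hotel_ids` are made up for illustration, and the sample fragment is invented rather than a real Ctrip response.

```python
from html.parser import HTMLParser


class TopLevelDataIdParser(HTMLParser):
    """Collect data-id attributes of top-level <div> elements in an HTML fragment."""

    # Void elements never get a closing tag, so they must not change the depth.
    VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting level of the element currently open
        self.ids = []    # data-id values found on top-level <div> elements

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        if self.depth == 0 and tag == "div":
            data_id = dict(attrs).get("data-id")
            if data_id is not None:
                self.ids.append(data_id)
        self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        self.depth = max(0, self.depth - 1)


def extract_hotel_ids(fragment):
    """Return the data-id values of the fragment's top-level <div> elements."""
    parser = TopLevelDataIdParser()
    parser.feed(fragment)
    parser.close()
    return parser.ids


# Invented sample fragment; a real response will look different.
sample = (
    '<div data-id="1001"><p>Hotel A<br></p><div data-id="nested">x</div></div>'
    '<div data-id="1002"><img src="a.png">Hotel B</div>'
    '<div>no id</div>'
)
print(extract_hotel_ids(sample))
```

Like the XPath `/html/body/div/@data-id` in the script, this ignores nested `<div>` elements and any `<div>` without a `data-id`.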