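Fetching Dynamically Loaded Web Page Data with Python

0. What is XHR?

XHR is the XMLHttpRequest object, that is, the object Ajax functionality depends on. The Ajax functions in jQuery are a wrapper around XHR.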
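Since an XHR is an ordinary HTTP request issued by the page's JavaScript, it can be reproduced from Python. The following is a minimal sketch using the standard library's urllib.request rather than the requests library used later in this article. The helper name build_xhr_request is hypothetical, and whether this particular site inspects the X-Requested-With header (which jQuery sets on same-origin Ajax calls) is an assumption.

```python
import urllib.request


def build_xhr_request(url):
    """Build (but do not send) a GET request that resembles a jQuery Ajax call."""
    headers = {
        # jQuery adds this header to same-origin Ajax calls; some servers check it.
        "X-Requested-With": "XMLHttpRequest",
        "User-Agent": "Mozilla/5.0",
    }
    return urllib.request.Request(url, headers=headers)


req = build_xhr_request("https://knewone.com/discover?page=2")
print(req.full_url)
# urllib stores header names capitalized, hence the lowercase "requested-with".
print(req.get_header("X-requested-with"))
# To actually send the request: urllib.request.urlopen(req).read()
```

Nothing is sent over the network until urlopen is called, so the request can be inspected first.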

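1. Find the Request URL of the asynchronously loaded data

In the browser's developer tools, open the Network panel and filter by XHR. The Request URL is listed in the headers of each request.

Image example: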
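For the site used in this article, the Request URL has the form discover?page=N. A small sketch of generating those URLs, assuming the page parameter is the only part that changes between requests (BASE_URL and page_url are illustrative names):

```python
from urllib.parse import urlencode

BASE_URL = "https://knewone.com/discover"  # Request URL without its query string


def page_url(page):
    """Return the Request URL for the given page number."""
    return BASE_URL + "?" + urlencode({"page": page})


for n in range(1, 4):
    print(page_url(n))
```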

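2. Find the image's absolute location in the HTML page

Image example: you can see that the JavaScript dynamically adds new div tags.

Copy the absolute location (the selector path) of the IMG element in the HTML page.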
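To make clear what a copied selector such as a.cover-inner > img (the one used in this article's code) matches, here is a rough sketch that reproduces that one selector with the standard library's html.parser instead of BeautifulSoup. The class CoverImageParser and the sample markup are made up for illustration; the real page's markup may differ.

```python
from html.parser import HTMLParser

VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}  # tags with no closing tag


class CoverImageParser(HTMLParser):
    """Collect the src of every <img> that is a direct child of <a class="cover-inner">."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, class list) for currently open elements
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and self.open_tags:
            parent_tag, parent_classes = self.open_tags[-1]
            if parent_tag == "a" and "cover-inner" in parent_classes:
                self.srcs.append(attrs.get("src"))
        if tag not in VOID_TAGS:
            classes = (attrs.get("class") or "").split()
            self.open_tags.append((tag, classes))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i:]
                break


sample_html = """
<div><a class="cover-inner" href="/things/tbot"><img src="a.jpg"></a></div>
<div><img src="b.jpg"></div>
"""
parser = CoverImageParser()
parser.feed(sample_html)
print(parser.srcs)  # ['a.jpg']
```

Only the first image is collected, because the second one is not a direct child of an a.cover-inner element. In practice, BeautifulSoup's select does this matching for you.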

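3. Scrape the asynchronously loaded data

This approach can be used to scrape sites that load their content repeatedly, one page per request.

Code example: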

from bs4 import BeautifulSoup
import requests
import time

url = 'https://knewone.com/discover?page='


def get_page(Url, data=None):
    print(Url)
    wb_data = requests.get(Url)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    imgs = soup.select('a.cover-inner > img')  # all img tags on the page
    titles = soup.select('section.content > h4 > a')  # the title of each img
    links = soup.select('section.content > h4 > a')  # the link of each item
    if data is None:
        for img, title, link in zip(imgs, titles, links):
            data = {
                'img': img.get('src'),
                'title': title.get('title'),
                'link': link.get('href')
            }
            print(data)


def get_more_pages(Url, start, end):
    for one in range(start, end):
        get_page(Url + str(one))  # append the page number
        time.sleep(1)  # pause for 1 second to avoid having the IP banned


get_more_pages(url, 1, 10)  # fetch pages 1 through 9
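The loop above needs a fixed end page. A possible variant stops when a page comes back empty; this is only a sketch. The name crawl_until_empty and the stand-in fetcher fake_fetch are hypothetical, and it assumes the site returns no items once the pages run out, which has not been verified for this site.

```python
import time


def crawl_until_empty(fetch_items, start=1, max_pages=50, delay=1.0):
    """Call fetch_items(page) for successive pages until one returns no items."""
    results = []
    for page in range(start, start + max_pages):
        items = fetch_items(page)
        if not items:
            break  # an empty page is taken to mean there is nothing left to load
        results.extend(items)
        time.sleep(delay)  # same politeness pause as in the article's code
    return results


# Stand-in fetcher for illustration: pretends only pages 1 to 3 contain data.
def fake_fetch(page):
    return [{"title": f"item {page}-{i}"} for i in range(2)] if page <= 3 else []


items = crawl_until_empty(fake_fetch, delay=0)
print(len(items))  # 6
```

In real use, fetch_items would wrap the request and parsing done in get_page and return the list of dictionaries instead of printing them. max_pages acts as a safety cap in case the site never returns an empty page.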

Output of the run:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3 /Users/mac/Desktop/data/cloudbility/四周爬虫/2-KneWOne.py

https://knewone.com/discover?page=1

{'img': 'https://making-photos.b0.upaiyun.com/photos/dfaec1d3ba6df86562f9699869ababd4.jpg!thing.fixed.big', 'title': 'Osmo 儿童游戏套件', 'link': '/things/osmo-er-tong-you-xi-tao-jian'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/2883a5c06b4da12a0cde1b7dff26b104.jpg!thing.fixed.big', 'title': 'TBot', 'link': '/things/tbot'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/aaeb7de0751ebc627c0971deb633b265.jpg!thing.fixed.big', 'title': 'olloclip 四合一摄像镜头 iPhone 6/6 Plus 版', 'link': '/things/olloclip-si-he-she-xiang-jing-tou-iphone-6-slash-6-plus-ban'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/91063c24d62dead12a8d9e2a54887f51.jpg!thing.fixed.big', 'title': 'Momax SelfiFit Mini 蓝牙自拍器', 'link': '/things/momax-selfifit-lan-ya-zi-pai-qi'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/a97bf7f2a200bace8bd1d629b6436b85.jpg!thing.fixed.big', 'title': '贱驴 007', 'link': '/things/jian-lu-007-1'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/8c7c2008ebb9844a6a86123e8554b8e4.jpg!thing.fixed.big', 'title': 'Moshi Xync Lightning Keychain 连接线', 'link': '/things/moshi-xync-lightning-keychain-lian-jie-xian'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/9fb1f1372d5b2c486e3ca903ca11826e.jpg!thing.fixed.big', 'title': 'RainDesign iLevel 2 支架', 'link': '/things/raindesign-ilevel-2'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/5ad99a590f7be4ba4b0df26f94e2c8a4.jpg!thing.fixed.big', 'title': 'Smart Rope 智能跳绳', 'link': '/things/smart-rope-zhi-neng-tiao-sheng'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/394b7f9a17177fe1278dc0b6ee51dea5.jpg!thing.fixed.big', 'title': 'Magic 桜', 'link': '/things/magic-ying'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/fa6e238c053df828fb6d516656c8648c.jpg!thing.fixed.big', 'title': 'SwatchMate Color Capturing', 'link': '/things/swatchmate-color-capturing-cube'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/edd9f874c4bf1a5f371ca2e3cba5f01b.jpg!thing.fixed.big', 'title': 'foto.sosho', 'link': '/things/foto-dot-sosho'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/30bfd4c95fd4bafa058e3485b117a2a0.jpg!thing.fixed.big', 'title': 'Bang & Olufsen BeoPlay H8', 'link': '/things/bang-and-olufsen-beoplay-h8'}

https://knewone.com/discover?page=2

{'img': 'https://making-photos.b0.upaiyun.com/photos/6249e13497c84f8d850521a201af008a.jpg!thing.fixed.big', 'title': 'Woody 可折叠创意书灯', 'link': '/things/woody-ke-zhe-die-chuang-yi-shu-deng'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/e46d335dcf45605aae2e5889a0eaf4b6.jpg!thing.fixed.big', 'title': '创意智能感应温度显示魔术水杯', 'link': '/things/chuang-yi-zhi-neng-gan-ying-wen-du-xian-shi-mo-zhu-shui-bei'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/cf7d91d87fd70dd909750e7bd4e81f4d.jpg!thing.fixed.big', 'title': 'Broadlink RM Pro', 'link': '/things/broadlink-rm-pro'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/bae34600ab52cba16aed13309c70999e.jpg!thing.fixed.big', 'title': 'Anker USB 3.0 4-Port Hub', 'link': '/things/anker-r-usb-3-dot-0-4-port'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/62963e790016b599c0c4af6f63339848.jpg!thing.fixed.big', 'title': 'Eva Solo Fridge Carafe 彩漾冷水瓶', 'link': '/things/eva-solo-fridge-carafe-cai-yang-leng-shui-ping'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/b866f966623b4af3b4b00079c572b06e.jpg!thing.fixed.big', 'title': 'Flick Candles', 'link': '/things/flick-candles'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/0ff87359eaff7ca5642d450926ab536d.jpg!thing.fixed.big', 'title': '欧蒂芙 天使冰膜', 'link': '/things/ou-di-fu-tian-shi-bing-mo'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/2cda0a6b2421529b0fee175237ebdee1.png!thing.fixed.big', 'title': 'Propolinse 比那氏蜂胶复合漱口水', 'link': '/things/propolinse-bi-na-shi-feng-xiao-fu-he-shu-kou-shui'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/cc8147ca5612bf0a7e6a13b7e971fd3e.jpg!thing.fixed.big', 'title': 'Withings Activité', 'link': '/things/withings-activite'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/d5585da14eb35e2cf82b855bc406afd7.jpg!thing.fixed.big', 'title': 'RainDesign mStand', 'link': '/things/mstand-laptop-stand'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/04847c2ae796e6af39ab1f821701f044.jpg!thing.fixed.big', 'title': '萌奇 x JOWAY 小冰棒', 'link': '/things/meng-qi-x-joway-xiao-bing-bang'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/ade52894eed66ae6992de74950005e47.jpg!thing.fixed.big', 'title': 'Rivers Demita', 'link': '/things/rivers-demita'}

https://knewone.com/discover?page=3

{'img': 'https://making-photos.b0.upaiyun.com/photos/f27e40a07d9465efb0f868b940865626.jpg!thing.fixed.big', 'title': 'Adorable Pet Bed 可爱的鱼形宠物床', 'link': '/things/adorable-pet-bed-ke-ai-de-yu-xing-chong-wu-chuang'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/43d62540f88969581065fc969cdf1439.jpg!thing.fixed.big', 'title': 'gatsby crazy cool 超凉身体降温喷雾', 'link': '/things/gatsby-crazy-cool-chao-liang-shen-ti-jiang-wen-pen-wu'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/81995fd123f247bd850a0a46783fdf76.jpg!thing.fixed.big', 'title': 'UNI-CUB β 电动代步车', 'link': '/things/uni-cub-b'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/b4a5f52fb530ad5341e94dc9d419f181.jpg!thing.fixed.big', 'title': 'Stadler Form OTTO 古典原木风扇', 'link': '/things/stadler-form-otto-gu-dian-yuan-mu-feng-shan'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/87014d301ec4da5c2906f0ad3d6a81a8.jpg!thing.fixed.big', 'title': 'momoda 床头宝—互联网智能闹钟', 'link': '/things/momoda-chuang-tou-bao-hu-lian-wang-zhi-neng-nao-zhong'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/f2d41f734f108c19fd6a84adf6eed760.jpg!thing.fixed.big', 'title': 'Ithink 手立视 智能网络摄像头', 'link': '/things/ithink-shou-li-shi-zhi-neng-wang-luo-she-xiang-tou'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/310e8c6edcc832639de94f1d208fad32.jpg!thing.fixed.big', 'title': '丝瓜年代 牛皮笔记本', 'link': '/things/si-gua-nian-dai-niu-pi-bi-ji-ben'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/899266e88c00d2413fcab437eb87a6e1.jpg!thing.fixed.big', 'title': 'Okamura Duke CZ 真皮办公椅', 'link': '/things/okamura-duke-cz-zhen-pi-ban-gong-yi'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/ed11079d3d623b95b1a29fa7fe975e40.jpg!thing.fixed.big', 'title': '麦芽智能灯', 'link': '/things/mai-ya-zhi-neng-deng'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/8fea9e5b30d9bb9ba0a4493fd981b36d.jpg!thing.fixed.big', 'title': 'Gekkopod 壁虎自拍支架', 'link': '/things/gekkopod-bi-hu-zi-pai-zhi-jia'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/5ae10361c86fbf47692e68c1f1b166eb.png!thing.fixed.big', 'title': 'TACS Twenty 4 TS1101 男士石英手表', 'link': '/things/tacs-twenty-4-ts1101-nan-shi-shi-ying-shou-biao'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/a590ee9c3cd457c6234923815df8fe9a.png!thing.fixed.big', 'title': 'TACS Pixel TS1302 男士石英手表', 'link': '/things/tacs-pixel-ts1302-nan-shi-shi-ying-shou-biao'}

https://knewone.com/discover?page=4

{'img': 'https://making-photos.b0.upaiyun.com/photos/2ec807d8a030f5eb9e3ecd0d88ef771a.png!thing.fixed.big', 'title': 'LOCA 超薄透明手机壳', 'link': '/things/loca-i6-slash-6-plus-chao-bo-tou-ming-shou-ji-ke'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/816411b67f901ddf3d03bb47d15c9e2b.png!thing.fixed.big', 'title': 'TACS Kraft TS1306 男士石英手表', 'link': '/things/tacs-kraft-ts1306-nan-shi-shi-ying-shou-biao'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/28751c8173b9692a06b32f1a5c3e36a3.jpg!thing.fixed.big', 'title': 'imblu 多功能洗漱包', 'link': '/things/imblu-duo-gong-neng-shou-na-bao'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/abbfe26bec5e061b46959743f1c3a3c6.jpg!thing.fixed.big', 'title': 'Super-Soft Ear Muff for Sleeping', 'link': '/things/super-soft-ear-muff-for-sleeping'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/05dd5c32dfdd95a954aa23d0b288f0fe.jpg!thing.fixed.big', 'title': 'Bluelounge posto 耳机支架', 'link': '/things/bluelounge-posto-er-ji-zhi-jia'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/987dbcd0fee536d08554742bd20687d2.png!thing.fixed.big', 'title': 'ANGELHOOD 日式祝愿娃娃', 'link': '/things/angelhood-ri-shi-zhu-yuan-wa-wa'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/0632f191508933848d123bfd33b82d51.jpg!thing.fixed.big', 'title': 'ten Design Stationery 转动圆珠笔', 'link': '/things/ten-design-stationery-zhuan-dong-yuan-zhu-bi'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/a58213fe2e341928f2e424338f37956c.jpg!thing.fixed.big', 'title': '保友 - Ergonor 独立脚托', 'link': '/things/ergonor-du-li-jiao-tuo'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/167446c4da1e1f971b1aeebf005dc685.jpg!thing.fixed.big', 'title': 'IdeaShow 阿拉神灯', 'link': '/things/ideashow-a-la-shen-deng'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/09ebc797c57930ef6ddc56d9d9f86495.jpg!thing.fixed.big', 'title': 'Fabriano Boutique 匹诺曹钢笔', 'link': '/things/fabriano-boutique-pi-nuo-cao-gang-bi'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/22619d5ab27a7ebec352c3fe20396ab9.jpg!thing.fixed.big', 'title': 'IRIS OHYAMA 手持除螨吸尘器', 'link': '/things/iris-ohyama-shou-chi-chu-man-xi-chen-qi'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/3ddde827164ea9779c3d3732ca67ad0e.jpg!thing.fixed.big', 'title': 'Fred & Friends Doomed Crystal Skull Shot Glass', 'link': '/things/fred-and-friends-doomed-crystal-skull-shot-glass'}

https://knewone.com/discover?page=5

{'img': 'https://making-photos.b0.upaiyun.com/photos/316f43719495992bcf30ec89a85d7ec8.jpg!thing.fixed.big', 'title': '网格吊床', 'link': '/things/wang-ge-diao-chuang'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/2d09c1b3db6b58c633c7b1e3ed1ac6cb.jpg!thing.fixed.big', 'title': 'Broadlink TC1', 'link': '/things/broadlink-tc1'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/a495fbcb757a0b061971ed15ccaff5f9.jpg!thing.fixed.big', 'title': 'DIVOOM Travel III 音箱', 'link': '/things/divoom-travel-mi-ni-san-fang-hu-wai-lan-ya-xiao-yin-xiang'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/d0877ed9dfa2cd855aa5acfd32cdcc62.jpg!thing.fixed.big', 'title': '乐事 A200', 'link': '/things/le-shi-a200'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/46b8cdbf2053b87d8c4c8d275b3d5b05.jpg!thing.fixed.big', 'title': 'Bang & Olufsen Beoplay H4', 'link': '/things/bang-and-olufsen-beoplay-h4'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/0f48595529627efd5ed58d5791385a63.jpg!thing.fixed.big', 'title': '追求科技 ZQ16 艺术签名版', 'link': '/things/zhui-qiu-ke-ji-zq16-yi-zhu-qian-ming-ban'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/d342b49d8effb9f3e60258c39e9193f1.jpeg!thing.fixed.big', 'title': 'Philips SHE4205BK', 'link': '/things/philips-she4205bk'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/dd5616e71ea578c2b00c68656fbadaa8.jpg!thing.fixed.big', 'title': 'SONY a7', 'link': '/things/sony-a7-1'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/6bccdd5c134977ca5149fc0101862340.jpg!thing.fixed.big', 'title': 'Humanscale word chair', 'link': '/things/humanscale-word-chair'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/0d55169f38de18740eb9e4fdc5c76865.jpg!thing.fixed.big', 'title': 'Cutipol DUNA Gold', 'link': '/things/cutipol-duna-gold'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/2194ed37ca0092372763cd9ee2d490e3.jpg!thing.fixed.big', 'title': 'Dunoon 向日葵马克杯', 'link': '/things/dunoon-xiang-ri-kui-ma-ke-bei'}

{'img': 'https://making-photos.b0.upaiyun.com/photos/819bc7954c54311c88952b956fca4901.jpg!thing.fixed.big', 'title': '质造 星球杯', 'link': '/things/zhi-zao-xing-qiu-bei'}

.....
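The link values in the output are site-relative paths such as /things/tbot. If absolute URLs are needed, one way to get them is urllib.parse.urljoin. This is a sketch; SITE_ROOT and the helper absolutize are illustrative names, not part of the original script.

```python
from urllib.parse import urljoin

SITE_ROOT = "https://knewone.com/"


def absolutize(record):
    """Return a copy of a scraped record with its 'link' made absolute."""
    fixed = dict(record)
    fixed["link"] = urljoin(SITE_ROOT, record["link"])
    return fixed


sample = {"title": "TBot", "link": "/things/tbot"}
print(absolutize(sample)["link"])  # https://knewone.com/things/tbot
```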

(End)

Original article: "Python获取网页中动态加载的数据" (Fetching Dynamically Loaded Web Page Data with Python), 运维之路 (Opsroad) community, www.opsroad.com
