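The aiohttp module

Reference: "aiohttp库简单教程" (a simple tutorial on the aiohttp library), 简书 (Jianshu)

What is aiohttp

aiohttp is an asynchronous HTTP client/server library for Python, built on asyncio. asyncio makes concurrent I/O possible within a single thread and provides transports for TCP, UDP, SSL, and other protocols; aiohttp is the HTTP framework implemented on top of it.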
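To make "concurrent I/O in a single thread" concrete, here is a minimal sketch that uses only asyncio. The coroutine `fake_fetch` is a hypothetical stand-in for a network request (it only sleeps), so the numbers illustrate the scheduling behaviour rather than real download times.

```python
import asyncio
import time


async def fake_fetch(name: str, delay: float) -> str:
    """Stand-in for a network request: sleeping yields to the event loop, as real I/O would."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_concurrently() -> tuple[list[str], float]:
    """Run three simulated requests at once and report how long the batch took."""
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_fetch("a", 0.1),
        fake_fetch("b", 0.1),
        fake_fetch("c", 0.1),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently())
    print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the batch should take roughly as long as the slowest single request, not the sum of all three.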

Installation

pip3 install aiohttp

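Usage

With aiohttp, requests are made through a session. The ClientSession object manages that session, and it can be used as an async context manager so that it is closed automatically when the block exits.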

import aiohttp
import asyncio


async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get("https://www.3gbizhi.com/meinv/xgmn_2.html") as resp:
            print(await resp.text())


asyncio.run(main())

Reading the response body

# Read the body as text
await resp.text()
# Read the body as raw bytes (for non-text content such as images)
await resp.read()

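Scraping model wallpaper images

Target site: 36壁纸 (www.3gbizhi.com)

Code walkthrough

The scraper is split into three coroutines, each of which opens its own ClientSession to request a different kind of content.

First, collect the detail-page URL of every image on each listing page.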

async def main(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            tree = etree.HTML(await response.text())
            image_url_list = tree.xpath("/html/body/div[4]/ul/li")
            for image_url in image_url_list:
                image_url = image_url.xpath("./a/@href")[0]
                await get_iamge_url(image_url)

Next, read the image address and name from the detail page.

async def get_iamge_url(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            tree = etree.HTML(await response.text())
            image = tree.xpath("//*[@id='showpicnow']/@src")[0]
            name = tree.xpath("//*[@id='showpicnow']/@alt")[0]
            path = desktop + "//" + name + ".jpg"
            await download(image, path, name)
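The file name above comes straight from the image's alt text, which could contain characters that are not allowed in file names. A minimal sketch of a sanitizing helper follows. The name `safe_image_path` and the fallback name "untitled" are assumptions made for illustration and are not part of the original script.

```python
import re
from pathlib import Path

# Characters that Windows does not allow in file names, plus control characters.
_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_image_path(folder: str, name: str, suffix: str = ".jpg") -> Path:
    """Build folder/name.jpg, replacing characters that are invalid in file names."""
    cleaned = _ILLEGAL.sub("_", name).strip(" .")
    if not cleaned:
        cleaned = "untitled"  # hypothetical fallback when nothing usable remains
    return Path(folder) / f"{cleaned}{suffix}"
```

In the function above, this could stand in for the string concatenation, for example `path = safe_image_path(desktop, name)`.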

Finally, download the image and write it to a file.

async def download(url, path, name):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            async with aiofiles.open(path, mode="wb") as f:
                await f.write(await response.read())
            print(name + "   下载完成")
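Each of the three functions opens its own ClientSession. aiohttp's documentation recommends reusing a session instead of creating one per request, so that its connection pool can be shared. The sketch below shows one possible way to do that by passing the session in as a parameter. The names `fetch_text`, `fetch_all`, and `crawl` are hypothetical and this is not the article's code.

```python
import asyncio

import aiohttp


async def fetch_text(session: aiohttp.ClientSession, url: str) -> str:
    """Fetch one page as text using a session owned by the caller."""
    async with session.get(url) as response:
        return await response.text()


async def fetch_all(session: aiohttp.ClientSession, urls: list[str]) -> list[str]:
    """Fetch several pages concurrently over the same session."""
    return await asyncio.gather(*(fetch_text(session, url) for url in urls))


async def crawl(urls: list[str]) -> list[str]:
    # One session for the whole run; it is closed when the block exits.
    async with aiohttp.ClientSession() as session:
        return await fetch_all(session, urls)
```

A caller would run it with `asyncio.run(crawl([...]))`, and the results come back in the same order as the input URLs.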

Create a task list in which each task fetches a different listing page.

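async def multiple_main():
    tasks = []
    for i in range(1, 15):  # listing pages 1 through 14
        tasks.append(main(f"https://www.3gbizhi.com/meinv/xgmn_{i}.html"))
    # asyncio.wait() no longer accepts bare coroutines (Python 3.11+), so use gather()
    await asyncio.gather(*tasks)


if __name__ == '__main__':
    asyncio.run(multiple_main())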
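Starting every page at once can put a lot of simultaneous load on the target site. A minimal sketch of capping concurrency with asyncio.Semaphore follows. The helper name `bounded_gather` and any particular limit are assumptions for illustration, not part of the original script.

```python
import asyncio


async def bounded_gather(coros, limit: int):
    """Run the coroutines concurrently, but never more than `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run(coro):
        # Each coroutine waits here until one of the `limit` slots is free.
        async with semaphore:
            return await coro

    # gather() returns results in the same order as the input coroutines.
    return await asyncio.gather(*(run(c) for c in coros))
```

In the task list above, `await bounded_gather(tasks, limit=5)` could take the place of the final await; the value 5 is arbitrary.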

Complete code

import asyncio
import winreg
import aiofiles
import aiohttp
from lxml import etree

# Get the desktop path from the Windows registry
key = winreg.OpenKey(winreg.HKEY_CURRENT_USER, r'Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders')
desktop = winreg.QueryValueEx(key, "Desktop")[0]


# Download one image and write it to disk
async def download(url, path, name):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            async with aiofiles.open(path, mode="wb") as f:
                await f.write(await response.read())
            print(name + "   下载完成")


# Read the image address and name from a detail page
async def get_iamge_url(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            tree = etree.HTML(await response.text())
            image = tree.xpath("//*[@id='showpicnow']/@src")[0]
            name = tree.xpath("//*[@id='showpicnow']/@alt")[0]
            path = desktop + "//" + name + ".jpg"
            await download(image, path, name)


# Collect the detail-page URLs from one listing page
async def main(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            tree = etree.HTML(await response.text())
            image_url_list = tree.xpath("/html/body/div[4]/ul/li")
            for image_url in image_url_list:
                image_url = image_url.xpath("./a/@href")[0]
                await get_iamge_url(image_url)


# Fetch listing pages 1 through 14 concurrently
async def multiple_main():
    tasks = []
    for i in range(1, 15):
        tasks.append(main(f"https://www.3gbizhi.com/meinv/xgmn_{i}.html"))
    # asyncio.wait() no longer accepts bare coroutines (Python 3.11+), so use gather()
    await asyncio.gather(*tasks)


if __name__ == '__main__':
    asyncio.run(multiple_main())
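The winreg module exists only on Windows, so the script above cannot be imported on other systems. A hedged sketch of a lookup with a fallback follows. The function name `desktop_dir` is hypothetical, and `~/Desktop` is an assumed conventional location that may not exist on every machine.

```python
from pathlib import Path


def desktop_dir() -> Path:
    """Return the Desktop folder: registry lookup on Windows, ~/Desktop elsewhere."""
    try:
        import winreg  # only available on Windows
    except ImportError:
        # Assumption: the desktop lives at the conventional ~/Desktop path.
        return Path.home() / "Desktop"
    key_path = r"Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders"
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, key_path) as key:
        return Path(winreg.QueryValueEx(key, "Desktop")[0])
```

With this helper, `desktop = str(desktop_dir())` could take the place of the registry lookup near the top of the script.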
