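Scraping Web Page Data with a Python Crawler

This article walks through using a Python crawler to scrape rental listing data from a housing website. The data is used for learning purposes only, not for any commercial use.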

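First, install the requests and parsel modules. Run the pip commands in a terminal (for example, your IDE's Terminal tab) rather than at the Python Console prompt, because pip is a shell command. requests sends the HTTP request and fetches the page, and parsel parses the returned HTML.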

pip install requests
pip install parsel

Now for the hands-on code:

import requests
import parsel

# To save the results, open a file in append mode:
# file = open("C:\\Users\\AUSU\\Desktop\\租房数据.txt", "a")
# To crawl several pages, wrap the code below in a loop:
# for i in range(98):
#     url = "https://hz.lianjia.com/zufang/pg" + str(i + 2) + "rt200600000002/#contentList"
url = "https://nj.lianjia.com/zufang/pg3/#contentList"
# Send a browser-like User-Agent header with the request.
header = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82 Safari/537.36"
}
response = requests.get(url=url, headers=header)
selector = parsel.Selector(response.text)
# Each matched node is one rental listing.
lis = selector.css(".content__list--item--main")
for li in lis:
    # Start from empty strings so a listing with a missing field neither
    # raises NameError nor reuses the previous listing's value.
    info = area = addressInfo = ""
    title = li.css(".content__list--item--title a::text").getall()
    if title:
        info = str(title).replace("\\n", "").replace(" ", "").replace("[", "").replace("'", "").replace("]", "")
    location: list = li.css(".content__list--item--des a::text").getall()
    if location:
        area = "-".join(location)
    address: list = li.css(".content__list--item--des ::text").getall()
    if address:
        addressInfo = str(address).replace("\\n", "").replace(" ", "").replace("[", "").replace("]", "") \
            .replace("'-'", "").replace("'", "").replace(",", "")
    # get() returns None when no price is found, so fall back to "".
    price = li.css(".content__list--item-price em::text").get() or ""
    # "元" is the yuan (CNY) currency unit.
    result = info + "|" + area + "|" + addressInfo + "|" + price + "元"
    # file.write(result)
    # file.write("\n")
    print(result)
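
The commented-out lines in the code above hint at two extensions: looping over several result pages and appending each line to a text file. The sketch below shows one way these could look, together with a whitespace-stripping helper that works on the text fragments directly instead of on the string form of a list. The names page_urls, clean_text, and save_rows are hypothetical, the URL template is an assumption generalized from the single page URL used above, and the sample title is made up. The sketch sends no network requests.

```python
import os
import tempfile

# Assumption: other result pages follow the same pattern as the pg3 URL above.
BASE_URL = "https://nj.lianjia.com/zufang/pg{page}/#contentList"


def page_urls(first_page, last_page):
    """Build listing-page URLs for an inclusive page range."""
    return [BASE_URL.format(page=page) for page in range(first_page, last_page + 1)]


def clean_text(fragments):
    """Join text fragments and drop all whitespace, including newlines."""
    return "".join("".join(fragment.split()) for fragment in fragments)


def save_rows(rows, path):
    """Append one pipe-separated line per row to a UTF-8 text file."""
    with open(path, "a", encoding="utf-8") as file:
        for row in rows:
            file.write("|".join(row) + "\n")


# Usage with made-up values; a real run would fetch and parse each URL.
for url in page_urls(2, 4):
    print(url)

title = clean_text(["\n   Sample Garden  ", " 2 rooms\n"])
with tempfile.TemporaryDirectory() as folder:
    out_path = os.path.join(folder, "rental_data_demo.txt")
    save_rows([(title, "District-Street", "3000元")], out_path)
    with open(out_path, encoding="utf-8") as file:
        print(file.read(), end="")
```

Opening the file with an explicit encoding="utf-8" avoids platform-dependent defaults when the scraped text contains Chinese characters, and the with statement closes the file even if a write fails.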
