Scraping a real-estate platform through a dynamic proxy and writing the results to Excel (Python)

import requests
from lxml import html
import random
import xlwt
import time
import hashlib
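from datetime import datetime

# User-Agent pool to rotate through. These entries are example values (the
# original list was empty, which makes random.choice fail); substitute your own.
ugList = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0",
]
orderno = "DT20210228205219E8iMOzLE"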
secret = "XXXXXXXXXXX"
ip = "dynamic.xiongmaodaili.cn"
# Port for pay-as-you-go (metered) orders
port = "8088"
ip_port = ip + ":" + port
timestamp = str(int(time.time()))
# Alternative: timestamp = str(int(datetime.timestamp(datetime.now())))
txt = "orderno=" + orderno + "," + "secret=" + secret + "," + "timestamp=" + timestamp
txt = txt.encode()
md5_string = hashlib.md5(txt).hexdigest()
sign = md5_string.upper()
#print(sign)
auth = "sign=" + sign + "&" + "orderno=" + orderno + "&" + "timestamp=" + timestamp + "&change=true"
proxy = {"https": "https://" + ip_port}
#print(proxy)
# Send the signature computed above; a hard-coded sign is bound to one fixed timestamp.
headers = {"User-Agent": random.choice(ugList), "Proxy-Authorization": auth}
work_book = xlwt.Workbook(encoding="utf-8")
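sheet = work_book.add_sheet("Bazhou second-hand housing")
# Header row (row 0); the column index is the second argument.
sheet.write(0, 3, "Community name")
sheet.write(0, 4, "District 1")
sheet.write(0, 5, "District 2")
sheet.write(0, 6, "Address")
sheet.write(0, 7, "Total price (10k CNY)")
sheet.write(0, 8, "Unit price (CNY/m²)")
sheet.write(0, 2, "Floor area (m²)")
sheet.write(0, 1, "Layout")
sheet.write(0, 0, "Listing title")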
row_num = 1
# One session for all pages. The original set requests.DEFAULT_RETRIES and
# s.keep_alive, neither of which requests reads, so the intent is expressed
# with an HTTPAdapter and a "Connection: close" header instead.
s = requests.Session()
s.mount("https://", requests.adapters.HTTPAdapter(max_retries=5))
s.headers["Connection"] = "close"

# A single XPath union returns every field of every listing in document order.
xpath_expr = (
    "//div[@class='property-content-title']/h3/text()"
    "|//p[@class='property-content-info-comm-name']/text()"
    "|//p[@class='property-content-info-comm-address']//span/text()"
    "|//span[@class='property-price-total-num']/text()"
    "|//p[@class='property-price-average']/text()"
    "|//p[@class='property-content-info-text'][1]/text()"
    "|//p[@class='property-content-info-text property-content-info-attribute']//span//text()"
)

for page in range(1, 51):
    url = "https://bygl.58.com/ershoufang/p" + str(page) + "/"
    r = s.get(url, headers=headers, proxies=proxy, verify=False, timeout=20)
    r.encoding = "utf-8"
    preview_html = html.fromstring(r.text)
    list_title = [str(x) for x in preview_html.xpath(xpath_expr)]
    # time.sleep(random.random() * 2)
    print("-------------------------Page " + str(page) + "-------------------------------")
    print(list_title)
    # Each listing is assumed to contribute exactly 14 text nodes;
    # an incomplete trailing group is skipped instead of raising IndexError.
    for j in range(0, len(list_title) - 13, 14):
        biaoti = list_title[j]                          # listing title
        house_type = "".join(list_title[j + 1:j + 7])   # layout, split over six nodes
        size = list_title[j + 7].strip()                # floor area
        title = list_title[j + 8]                       # community name
        area1 = list_title[j + 9]
        area2 = list_title[j + 10]
        area3 = list_title[j + 11]                      # address
        totalnum = list_title[j + 12]
        avg = list_title[j + 13]
        sheet.write(row_num, 3, title)
        sheet.write(row_num, 4, area1)
        sheet.write(row_num, 5, area2)
        sheet.write(row_num, 6, area3)
        sheet.write(row_num, 7, totalnum)
        sheet.write(row_num, 8, avg)
        sheet.write(row_num, 2, size)
        sheet.write(row_num, 1, house_type)
        sheet.write(row_num, 0, biaoti)
        row_num += 1
    time.sleep(1)  # pause between pages
file_name = r"F:\巴州二手房爬取.xls"
work_book.save(file_name)
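
Two steps of the script above are easier to check when pulled out as small pure functions: building the Proxy-Authorization value, and regrouping the flat XPath result into one row per listing. The following is only a sketch under assumptions. The function names (build_proxy_auth, split_listings, to_row) are hypothetical and not part of the original script; the signing format is copied from the script rather than from the proxy vendor's documentation; and the 14-items-per-listing layout is inferred from the script's `j % 14` check, so it may not hold if the page markup changes.

```python
import hashlib
import time


def build_proxy_auth(orderno, secret, timestamp=None):
    """Build the Proxy-Authorization value the same way the script does."""
    if timestamp is None:
        timestamp = str(int(time.time()))
    # Plain text to sign: orderno=...,secret=...,timestamp=...
    plain = "orderno={},secret={},timestamp={}".format(orderno, secret, timestamp)
    # Upper-case hex MD5 digest, as in the script
    sign = hashlib.md5(plain.encode()).hexdigest().upper()
    return "sign={}&orderno={}&timestamp={}&change=true".format(sign, orderno, timestamp)


FIELDS_PER_LISTING = 14  # assumption taken from the script's `j % 14` check


def split_listings(flat, width=FIELDS_PER_LISTING):
    """Cut the flat XPath result into complete listings, dropping a short tail."""
    usable = len(flat) - len(flat) % width
    return [flat[k:k + width] for k in range(0, usable, width)]


def to_row(fields):
    """Map one 14-item listing to the spreadsheet column order (columns 0..8)."""
    return [
        fields[0],             # listing title
        "".join(fields[1:7]),  # layout, split over six text nodes
        fields[7].strip(),     # floor area
        fields[8],             # community name
        fields[9],             # district 1
        fields[10],            # district 2
        fields[11],            # address
        fields[12],            # total price
        fields[13],            # unit price
    ]
```

With these helpers, the header value could be rebuilt before each request with build_proxy_auth(orderno, secret), and each row written with a loop over enumerate(to_row(fields)), which keeps the column order in one place.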
