This module is the single-threaded part of an IP tracing tool; it fetches information about an IP address from ip138.

Code:

import ast
import re

import requests
from fake_useragent import UserAgent


def get_ip138html(ip):
    # A User-Agent header is required, otherwise no response is returned.
    # A random User-Agent is generated here to get past User-Agent-based anti-scraping checks.
    user_agent = UserAgent().random
    headers = {'User-Agent': user_agent}
    url = 'https://www.ip138.com/iplookup.asp?ip=' + ip + '&action=2'
    # Re-encode the text and decode it as GBK to avoid garbled characters
    html = requests.post(url, headers=headers).text.encode('raw_unicode_escape').decode('gbk')
    return html


def dig_ip138_information(html):
    # Pattern for the IP segment
    rule1 = re.compile(r'"iP段":"(.*)", "兼容')
    # The response stores all other fields in a single dict; capture that dict with a regex
    rule2 = re.compile(r'"ip_c_list":\[(.*)], "zg"')
    # Extract the fields from the HTML source:
    # result1 is the IP segment, result2 holds the remaining information
    result1 = rule1.search(html).group(1)
    result2 = rule2.search(html).group(1)
    # Convert the dict string into a Python dict.
    # ast.literal_eval replaces the original eval so the remote response is never executed as code.
    ip_information_dict = ast.literal_eval(result2)
    # The segment is already known and idc was always empty in testing, so drop these three keys
    ip_information_dict.pop('begin')
    ip_information_dict.pop('end')
    ip_information_dict.pop('idc')
    # Add the segment
    ip_information_dict['segment'] = result1
    # Merge country, province, city and area into a single place string
    place = (ip_information_dict['ct'] + ip_information_dict['prov']
             + ip_information_dict['city'] + ip_information_dict['area'])
    # Drop the keys that were merged
    ip_information_dict.pop('ct')
    ip_information_dict.pop('prov')
    ip_information_dict.pop('city')
    ip_information_dict.pop('area')
    # Add the place entry
    ip_information_dict['place'] = place
    return ip_information_dict


if __name__ == '__main__':
    ip = '36.110.116.43'
    # Fetch the ip138 page
    ip138_html = get_ip138html(ip)
    # Extract the information and return it as a dict
    ip_information = dig_ip138_information(ip138_html)
    print(ip_information)
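The request setup in get_ip138html can also be sketched without the fake_useragent dependency, by picking a User-Agent at random from a fixed pool. This is only a minimal sketch: the helper name build_ip138_request and the strings in USER_AGENT_POOL are assumptions made for illustration, not part of the original module.

```python
import random
from urllib.parse import urlencode

# Illustrative pool only; these User-Agent strings are assumptions, not from the post.
USER_AGENT_POOL = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 '
    '(KHTML, like Gecko) Version/17.0 Safari/605.1.15',
    'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0',
]


def build_ip138_request(ip, rng=random):
    """Return (url, headers) for an ip138 lookup with a randomly chosen User-Agent.

    Hypothetical helper: it only prepares the request and performs no network I/O.
    """
    # Same query string as the manual concatenation in get_ip138html
    query = urlencode({'ip': ip, 'action': 2})
    url = 'https://www.ip138.com/iplookup.asp?' + query
    headers = {'User-Agent': rng.choice(USER_AGENT_POOL)}
    return url, headers


url, headers = build_ip138_request('36.110.116.43')
print(url)
print(headers['User-Agent'])
```

The returned pair can be passed straight to requests.post(url, headers=headers). A fixed pool is less varied than fake_useragent but avoids its dependency on an external User-Agent database.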

Output of the __main__ block for 36.110.116.43:

{'yunyin': '电信', 'net': '', 'segment': '36.110.87.0 - 36.110.146.255', 'place': '中国北京市北京市'}
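The reshaping step in dig_ip138_information (drop begin, end and idc, add segment, merge the location keys into place) can be tried offline, without a network request. The sketch below is a non-mutating variant under stated assumptions: reshape_record is a hypothetical helper, and the sample record is made up to be consistent with the output shown above (the begin and end values are placeholders, and the split of the place string into ct, prov, city and area is assumed).

```python
def reshape_record(record, segment):
    """Return a new dict laid out like the result of dig_ip138_information.

    Hypothetical helper: the input record is left unchanged.
    """
    dropped = {'begin', 'end', 'idc'}
    place_keys = ('ct', 'prov', 'city', 'area')
    # Keep every key that is neither dropped nor merged into 'place'
    result = {key: value for key, value in record.items()
              if key not in dropped and key not in place_keys}
    result['segment'] = segment
    # Concatenate country, province, city and area in that order
    result['place'] = ''.join(record.get(key, '') for key in place_keys)
    return result


# Made-up sample record; 'begin' and 'end' are placeholder values.
sample = {
    'begin': 0, 'end': 0,
    'ct': '中国', 'prov': '北京市', 'city': '北京市', 'area': '',
    'idc': '', 'yunyin': '电信', 'net': '',
}

print(reshape_record(sample, '36.110.87.0 - 36.110.146.255'))
```

Building a new dict instead of calling pop repeatedly leaves the parsed record intact, which helps when the raw fields are still needed for logging or debugging.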

