Bilibili crawler: querying the account-level makeup of a streamer's guard fleet

Without further ado, here is the code.

import urllib.request
import re
import os
import time


class marine:
    total = 0        # total number of guard (大航海) members
    lv6Count = 0
    lv5Count = 0
    lv4Count = 0
    lv3Count = 0
    lv2Count = 0
    lv1Count = 0
    lv0Count = 0
    ship1Count = 0   # Governor (总督)
    ship2Count = 0   # Admiral (提督)
    ship3Count = 0   # Captain (舰长)
    uid = ''
    ruid = ''        # note: replaced by the ruid() method defined below
    user_space = ''
    all_page = 0     # number of pages in the guard list
    crew_list = './crew_list/'  # folder that stores the downloaded guard list pages
    header = {
        'User-Agent': 'Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36',
        'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8'
    }

    def __init__(self, uid):
        self.uid = uid
        self.space = 'https://api.bilibili.com/x/space/acc/info?mid=' + uid + '&jsonp=jsonp'  # user info

    def ruid(self):  # look up the live room number from the uid
        space_request = urllib.request.Request(self.space, headers=self.header)
        space_html = urllib.request.urlopen(space_request).read().decode('utf-8')
        # This assignment shadows the method on the instance, so call ruid() only once.
        self.ruid = re.findall(r'live.bilibili.com/(.*?)"', space_html)[0]
        print("Live room number:", self.ruid)
        self.caplist = 'https://api.live.bilibili.com/xlive/app-room/v2/guardTab/topList?roomid=' + self.ruid + '&page=1&ruid=' + self.uid + '&page_size=29'  # live room info

    def Snapshot(self):  # download the guard list, one file per page
        if not os.path.exists(self.crew_list):
            os.makedirs(self.crew_list)
        request = urllib.request.Request(self.caplist, headers=self.header)
        response = urllib.request.urlopen(request).read()
        path = self.crew_list + 'page1.html'
        with open(path, "wb") as fh:
            fh.write(response)
        response = response.decode('utf-8')
        # The first page reports the page count and the total number of guards.
        all_page = re.findall(r'"page":(.*?),', response)[0]
        self.all_page = int(all_page)
        total = re.findall(r'"num":(.*?),', response)[0]
        self.total = int(total)
        print("Total guards:", self.total)
        for i in range(2, self.all_page + 1):
            time.sleep(0.5)  # pause between requests
            caplist = 'https://api.live.bilibili.com/xlive/app-room/v2/guardTab/topList?roomid=' + self.ruid + '&page=' + str(i) + '&ruid=' + self.uid + '&page_size=29'  # guard list URL
            request = urllib.request.Request(caplist, headers=self.header)
            response = urllib.request.urlopen(request).read()
            path = self.crew_list + 'page' + str(i) + '.html'
            with open(path, "wb") as fh:
                fh.write(response)

    def count(self):  # run the statistics
        dirs = os.listdir(self.crew_list)
        crew_list = []
        flag = [0] * (self.total + 1)  # flag[rank] == 1 once that rank has been counted
        for file in dirs:  # collect (uid, rank, guard_level) triples from every saved page
            path = self.crew_list + file
            with open(path, encoding='utf-8') as fd:
                content = fd.read()
            crew_list.extend(re.findall(r'"uid":(.*?),"ruid":.*?"rank":(.*?),"username":".*?","face":".*?","is_alive":.*?,"guard_level":(.*?),', content))
        crew_list = list(set(crew_list))  # drop exact duplicate triples
        for each in crew_list:  # count captains, admirals and governors
            if int(each[2]) == 3 and flag[int(each[1])] == 0:
                self.ship3Count += 1
                flag[int(each[1])] = 1
            elif int(each[2]) == 2 and flag[int(each[1])] == 0:
                self.ship2Count += 1
                flag[int(each[1])] = 1
            elif int(each[2]) == 1 and flag[int(each[1])] == 0:
                self.ship1Count += 1
                flag[int(each[1])] = 1
        flag = [0] * (self.total + 1)  # reset the flags for the second pass
        counter = 0
        print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()), end='')
        print("Starting crew level count")
        for each in crew_list:  # count crew account levels, one request per crew member
            if flag[int(each[1])] == 0:
                time.sleep(0.5)
                flag[int(each[1])] = 1
                counter += 1
                if counter % 30 == 0:
                    print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()), end='')
                    print("\033[30mProgress: %5d/%5d\033[0m" % (counter, self.total))
                url = 'https://api.bilibili.com/x/space/acc/info?mid=' + each[0] + '&jsonp=jsonp'
                request = urllib.request.Request(url, headers=self.header)
                response = urllib.request.urlopen(request).read().decode('utf-8')
                lv = re.findall(r'"level":(.*?),"jointime"', response)[0]
                if int(lv) == 6:
                    self.lv6Count += 1
                elif int(lv) == 5:
                    self.lv5Count += 1
                elif int(lv) == 4:
                    self.lv4Count += 1
                elif int(lv) == 3:
                    self.lv3Count += 1
                elif int(lv) == 2:
                    self.lv2Count += 1
                elif int(lv) == 1:
                    self.lv1Count += 1
                elif int(lv) == 0:
                    self.lv0Count += 1
        print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()), end='')
        print("Crew level count finished")

    def display(self):  # show the results
        print("--------------------------------")
        print("Level 6:  %5d/%d, share: % 2.3f%%" % (self.lv6Count, self.total, self.lv6Count * 1.0 / self.total * 100))
        print("Level 5:  %5d/%d, share: % 2.3f%%" % (self.lv5Count, self.total, self.lv5Count * 1.0 / self.total * 100))
        print("Level 4:  %5d/%d, share: % 2.3f%%" % (self.lv4Count, self.total, self.lv4Count * 1.0 / self.total * 100))
        print("Level 3:  %5d/%d, share: % 2.3f%%" % (self.lv3Count, self.total, self.lv3Count * 1.0 / self.total * 100))
        print("Level 2:  %5d/%d, share: % 2.3f%%" % (self.lv2Count, self.total, self.lv2Count * 1.0 / self.total * 100))
        print("Level 1:  %5d/%d, share: % 2.3f%%" % (self.lv1Count, self.total, self.lv1Count * 1.0 / self.total * 100))
        print("Level 0:  %5d/%d, share: % 2.3f%%" % (self.lv0Count, self.total, self.lv0Count * 1.0 / self.total * 100))
        print("Governor: %5d/%d, share: % 2.3f%%" % (self.ship1Count, self.total, self.ship1Count * 1.0 / self.total * 100))
        print("Admiral:  %5d/%d, share: % 2.3f%%" % (self.ship2Count, self.total, self.ship2Count * 1.0 / self.total * 100))
        print("Captain:  %5d/%d, share: % 2.3f%%" % (self.ship3Count, self.total, self.ship3Count * 1.0 / self.total * 100))
        print("--------------------------------")


uid = input("Enter the user uid: ")
jiaran = marine(uid)
jiaran.ruid()
jiaran.Snapshot()
jiaran.count()
jiaran.display()
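
The regular expressions in the script only match if the API keeps returning its fields in the same order. As a rough sketch of an alternative, the tally step could decode the responses with the standard `json` module and count with `collections.Counter`. The sample payloads below are made up for illustration, and the nesting (`data`, then `list` or `level`) is an assumption. Only the field names `uid`, `rank`, `guard_level` and `level` come from the regexes in the script. This sketch also deduplicates by `uid`, whereas the script deduplicates by `rank`.

```python
import json
from collections import Counter

# Hypothetical payloads: the nesting under "data" is an assumption; only the
# field names (uid, rank, guard_level, level) are taken from the script.
SAMPLE_GUARD_PAGE = json.dumps({
    "data": {
        "list": [
            {"uid": 101, "rank": 1, "guard_level": 1},
            {"uid": 102, "rank": 2, "guard_level": 3},
            {"uid": 102, "rank": 2, "guard_level": 3},  # repeated entry
            {"uid": 103, "rank": 3, "guard_level": 3},
        ]
    }
})
SAMPLE_USER_INFO = json.dumps({"data": {"level": 6}})

# guard_level values as used in the script: 1 = Governor, 2 = Admiral, 3 = Captain
GUARD_NAMES = {1: "Governor", 2: "Admiral", 3: "Captain"}


def tally_guards(pages):
    """Count guard tiers across page payloads, keeping one entry per uid."""
    seen = set()
    counts = Counter()
    for page in pages:
        for entry in json.loads(page)["data"]["list"]:
            if entry["uid"] in seen:
                continue
            seen.add(entry["uid"])
            counts[GUARD_NAMES[entry["guard_level"]]] += 1
    return counts


def user_level(payload):
    """Read the account level from a user-info payload (assumed shape)."""
    return json.loads(payload)["data"]["level"]


def percentages(counts):
    """Convert counts to percentages of the total, rounded to 3 decimals."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(n / total * 100, 3) for name, n in counts.items()}


guard_counts = tally_guards([SAMPLE_GUARD_PAGE])
print(guard_counts)
print(percentages(guard_counts))
print(user_level(SAMPLE_USER_INFO))
```

Decoding the JSON avoids depending on field order, at the cost of depending on the response's nesting, which should be checked against a real response before use.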

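A few important APIs:

User profile information, including account level, gender, live room, premium membership (大会员) status, and so on: 'https://api.bilibili.com/x/space/acc/info?mid=(fill in the uid here)&jsonp=jsonp'

Guard list: 'https://api.live.bilibili.com/xlive/app-room/v2/guardTab/topList?roomid=(fill in the live room number here)&page=(fill in the guard list page number here)&ruid=(fill in the uid here)&page_size=29'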
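
The two URLs above can be assembled with `urllib.parse.urlencode` instead of string concatenation. The following is a minimal sketch: the helper names (`user_info_url`, `guard_list_url`, `guard_list_urls`) and the sample uid and room number are hypothetical, and whether the endpoints still respond in this form is not verified here.

```python
from urllib.parse import urlencode

USER_INFO_ENDPOINT = "https://api.bilibili.com/x/space/acc/info"
GUARD_LIST_ENDPOINT = "https://api.live.bilibili.com/xlive/app-room/v2/guardTab/topList"


def user_info_url(uid):
    """Build the user-info URL for a uid (hypothetical helper)."""
    return USER_INFO_ENDPOINT + "?" + urlencode({"mid": uid, "jsonp": "jsonp"})


def guard_list_url(roomid, ruid, page=1, page_size=29):
    """Build the guard-list URL for one page (hypothetical helper)."""
    query = {"roomid": roomid, "page": page, "ruid": ruid, "page_size": page_size}
    return GUARD_LIST_ENDPOINT + "?" + urlencode(query)


def guard_list_urls(roomid, ruid, all_page):
    """Build the URLs for pages 1..all_page, as the download loop visits them."""
    return [guard_list_url(roomid, ruid, page) for page in range(1, all_page + 1)]


# Made-up uid and room number, for illustration only.
print(user_info_url("123456"))
for url in guard_list_urls("7890", "123456", 3):
    print(url)
```

`urlencode` keeps the parameters in dictionary order and escapes any special characters, so the result has the same shape as the hand-built URLs in the script.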
