1. Get the uploader's video BV IDs

Using re

import re
import requests

home_url = "https://api.bilibili.com/x/space/arc/search?mid=10278125&ps=30&tid=0&pn=1&keyword=&order=pubdate&jsonp=jsonp"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0'
}
html = requests.get(url=home_url, headers=headers)
html_content = html.text
# Capture the value of every "bvid" field in the raw response text
pattern = r'"bvid":"(.*?)"'
print(re.findall(pattern, html_content))

Using json

import json
import requests

home_url = "https://api.bilibili.com/x/space/arc/search?mid=10278125&ps=30&tid=0&pn=1&keyword=&order=pubdate&jsonp=jsonp"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0'
}
html = requests.get(url=home_url, headers=headers)
html_content = html.text
json_html = json.loads(html_content)
print(json_html)
# The video entries sit under data -> list -> vlist
for i in json_html['data']['list']['vlist']:
    print(i['bvid'])
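home_url="https://api.bilibili.com/x/space/arc/search?mid=10278125&ps=30&tid=0&pn=1&keyword=&order=pubdate&jsonp=jsonp"

You have to find this home_url yourself. Method:
Enter bvid in the search box; usually the last result is the one that contains the BV IDs we want. (I checked two uploaders and it held for both.)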
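
The home_url above appears to carry the uploader ID in mid, the page size in ps and the page number in pn, so further pages can presumably be requested by increasing pn. The following is a minimal offline sketch of that idea, assuming the response keeps the data -> list -> vlist layout used in the json example. The helper names build_search_url and extract_bvids are hypothetical, and the sample payload is made up for illustration.

```python
from urllib.parse import urlencode

API = "https://api.bilibili.com/x/space/arc/search"


def build_search_url(mid, pn=1, ps=30):
    """Build the search URL for one page of an uploader's videos."""
    params = {"mid": mid, "ps": ps, "tid": 0, "pn": pn,
              "keyword": "", "order": "pubdate", "jsonp": "jsonp"}
    return API + "?" + urlencode(params)


def extract_bvids(payload):
    """Return the BV IDs from one decoded JSON response."""
    return [item["bvid"] for item in payload["data"]["list"]["vlist"]]


# Made-up payload with the same shape as the response parsed above
sample = {"data": {"list": {"vlist": [{"bvid": "BV1gW411F7Gy"},
                                      {"bvid": "BV1La4y1s7Pq"}]}}}
print(build_search_url(10278125, pn=2))
print(extract_bvids(sample))
```

In a real run, each URL would be fetched with requests.get and decoded with json.loads as in the json example, stopping when vlist comes back empty (an assumption about how the API signals the last page).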

2. Get the video and audio.

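With the BV ID we found, the page URL is easy to build,
for example https://www.bilibili.com/video/BV1gW411F7Gy. If the video is part of a series, append ?p= followed by the episode number.
At this point we have the URL.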
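
The URL rule just described can be sketched as a small helper. This is only an illustration; video_page_url is a hypothetical name, not part of the original script.

```python
def video_page_url(bvid, page=None):
    """Return the page URL for a BV ID, optionally for one episode of a series."""
    url = "https://www.bilibili.com/video/" + bvid
    if page is not None:
        # Multi-part videos select the episode with ?p=<number>
        url += "?p=%d" % page
    return url


print(video_page_url("BV1gW411F7Gy"))
print(video_page_url("BV1gW411F7Gy", page=3))
```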
Next, download the video.

import json
import os
import re
import subprocess
import requests
from random import choice
from lxml import etree

headers = {
    'Accept': '*/*',
    'Accept-Language': 'en-US,en;q=0.5',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.116 Safari/537.36'
}


def get_user_agent():
    '''Return a random User-Agent string.'''
    user_agents = [
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; AcooBrowser; .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Acoo Browser; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)",
        "Mozilla/4.0 (compatible; MSIE 7.0; AOL 9.5; AOLBuild 4337.35; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
        "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)",
        "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 2.0.50727; Media Center PC 6.0)",
        "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 1.0.3705; .NET CLR 1.1.4322)",
        "Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.2; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 3.0.04506.30)",
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN) AppleWebKit/523.15 (KHTML, like Gecko, Safari/419.3) Arora/0.3 (Change: 287 c9dfb30)",
        "Mozilla/5.0 (X11; U; Linux; en-US) AppleWebKit/527+ (KHTML, like Gecko, Safari/419.3) Arora/0.6",
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2pre) Gecko/20070215 K-Ninja/2.1.1",
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9) Gecko/20080705 Firefox/3.0 Kapiko/3.0",
        "Mozilla/5.0 (X11; Linux i686; U;) Gecko/20070322 Kazehakase/0.4.5",
        "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.8) Gecko Fedora/1.9.0.8-1.fc10 Kazehakase/0.5.6",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.20 (KHTML, like Gecko) Chrome/19.0.1036.7 Safari/535.20",
        "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; fr) Presto/2.9.168 Version/11.52",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.11 TaoBrowser/2.0 Safari/536.11",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.71 Safari/537.1 LBBROWSER",
        "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; LBBROWSER)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 732; .NET4.0C; .NET4.0E; LBBROWSER)",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.84 Safari/535.11 LBBROWSER",
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)",
        "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; QQBrowser/7.0.3698.400)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 732; .NET4.0C; .NET4.0E)",
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SV1; QQDownload 732; .NET4.0C; .NET4.0E; 360SE)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 732; .NET4.0C; .NET4.0E)",
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)",
        "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1",
        "Mozilla/5.0 (iPad; U; CPU OS 4_2_1 like Mac OS X; zh-cn) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8C148 Safari/6533.18.5",
        "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:2.0b13pre) Gecko/20110307 Firefox/4.0b13pre",
        "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:16.0) Gecko/20100101 Firefox/16.0",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11",
        "Mozilla/5.0 (X11; U; Linux x86_64; zh-CN; rv:1.9.2.10) Gecko/20100922 Ubuntu/10.10 (maverick) Firefox/3.6.10",
        "MQQBrowser/26 Mozilla/5.0 (Linux; U; Android 2.3.7; zh-cn; MB200 Build/GRJ22; CyanogenMod-7) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1",
        "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1",
        "Mozilla/5.0 (Linux; Android 5.1.1; Nexus 6 Build/LYZ28E) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.23 Mobile Safari/537.36",
        "Mozilla/5.0 (iPod; U; CPU iPhone OS 2_1 like Mac OS X; ja-jp) AppleWebKit/525.18.1 (KHTML, like Gecko) Version/3.1.1 Mobile/5F137 Safari/525.20",
        "Mozilla/5.0 (Linux;u;Android 4.2.2;zh-cn;) AppleWebKit/534.46 (KHTML,like Gecko) Version/5.1 Mobile Safari/10600.6.3 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
        "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
    ]
    # Pick one entry at random to act as the simulated browser
    user_agent = choice(user_agents)
    return user_agent


def re_video_info(text, pattern):
    '''Match the video info with a regular expression and parse it as JSON.'''
    match = re.search(pattern, text)
    print(match.group(1))
    return json.loads(match.group(1))


def single_download(aid, acc_quality=1):
    '''Download a single video.'''
    # Request the video page and extract its info
    origin_video_url = 'https://www.bilibili.com/video/' + aid
    res = requests.get(origin_video_url, headers=headers)
    print('************************************')
    print(res.text)
    print('************************************')
    html = etree.HTML(res.text)
    title = html.xpath('//*[@id="viewbox_report"]/h1/span/text()')[0]
    print('Now downloading:', title)
    video_info_temp = re_video_info(res.text, '__playinfo__=(.*?)</script><script>')
    video_info = {}
    # Video quality
    quality = video_info_temp['data']['accept_description'][acc_quality]
    # Video duration
    video_info['duration'] = video_info_temp['data']['dash']['duration']
    # Video stream URL
    video_url = video_info_temp['data']['dash']['video'][acc_quality]['baseUrl']
    # Audio stream URL
    audio_url = video_info_temp['data']['dash']['audio'][acc_quality]['baseUrl']
    # Convert the duration to minutes and seconds
    video_time = int(video_info.get('duration', 0))
    video_minute = video_time // 60
    video_second = video_time % 60
    print('Quality: {}, duration: {} min {} s'.format(quality, video_minute, video_second))
    # Download and save the video
    download_video_single(origin_video_url, video_url, audio_url, title)


def download_video_single(referer_url, video_url, audio_url, video_name):
    '''Download the video and audio streams of a single video.'''
    # Update the request headers
    headers.update({"Referer": referer_url})
    headers['Range'] = 'bytes=0-'
    print("Download started: %s" % video_name)
    # Download and save the video stream
    response = requests.get(video_url, headers=headers)
    print('%s\tvideo size:' % video_name, round(len(response.content) / 1024 / 1024, 2), '\tMB')
    with open('%s_video.mp4' % video_name, 'wb') as output:
        output.write(response.content)
    # Download and save the audio stream
    response = requests.get(audio_url, headers=headers)
    print('%s\taudio size:' % video_name, round(len(response.content) / 1024 / 1024, 2), '\tMB')
    with open('%s_audio.mp3' % video_name, 'wb') as output:
        output.write(response.content)
    print("Download finished: %s" % video_name)
    video_audio_merge_single(video_name)


def video_audio_merge_single(video_name):
    '''Merge the video and audio of a single video with ffmpeg.'''
    print("Merge started: %s" % video_name)
    # -y overwrites an existing output file without asking
    command = ['ffmpeg', '-y', '-i', '%s_video.mp4' % video_name, '-i', '%s_audio.mp3' % video_name,
               '-c:v', 'copy', '-c:a', 'aac', '-strict', 'experimental', '%s.mp4' % video_name]
    print(command)
    # Wait for ffmpeg to finish before deleting the intermediate files
    subprocess.run(command, check=True)
    os.remove('%s_audio.mp3' % video_name)
    os.remove('%s_video.mp4' % video_name)
    print("Merge finished: %s" % video_name)


if __name__ == '__main__':
    single_download("BV1La4y1s7Pq")
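
To make the parsing step in re_video_info and single_download easier to follow, here is an offline sketch that runs the same regular expression and the same key lookups on a made-up, heavily shortened page source. The quality labels, URLs and duration in sample_html are invented for illustration; a real page is far larger and its exact contents may differ.

```python
import json
import re

# Made-up page fragment that mimics the __playinfo__ script tag
sample_html = (
    '<script>window.__playinfo__={"data":{"accept_description":["1080P","720P"],'
    '"dash":{"duration":125,"video":[{"baseUrl":"https://example.com/v0"},'
    '{"baseUrl":"https://example.com/v1"}],"audio":[{"baseUrl":"https://example.com/a0"},'
    '{"baseUrl":"https://example.com/a1"}]}}}</script><script>var x = 1;</script>'
)

# Same pattern as in single_download: capture the JSON up to the next script tag
match = re.search(r'__playinfo__=(.*?)</script><script>', sample_html)
info = json.loads(match.group(1))

acc_quality = 1  # index into the quality, video and audio lists
quality = info['data']['accept_description'][acc_quality]
video_url = info['data']['dash']['video'][acc_quality]['baseUrl']
audio_url = info['data']['dash']['audio'][acc_quality]['baseUrl']
minutes, seconds = divmod(info['data']['dash']['duration'], 60)
print(quality, video_url, audio_url, minutes, seconds)
```

Note that the same index is used for all three lists, so the script fails with an IndexError if the audio list is shorter than the chosen acc_quality allows.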

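It does not use async, so it is slow |-_-| .
For learning purposes only. I borrowed from another blogger's code; I later tried to find the source again but could not, sorry.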
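
On the speed issue mentioned above, one possible direction is to download several videos at once. The sketch below uses a thread pool from the standard library rather than async, which is a swapped-in technique and not what the original script does. download_many and fake_worker are hypothetical names; fake_worker only stands in for single_download so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_many(bvids, worker, max_workers=4):
    """Run worker(bvid) for every BV ID in a thread pool.

    Returns (results, errors): two dicts keyed by BV ID.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, bvid): bvid for bvid in bvids}
        for future in as_completed(futures):
            bvid = futures[future]
            try:
                results[bvid] = future.result()
            except Exception as exc:  # keep going if one download fails
                errors[bvid] = exc
    return results, errors


def fake_worker(bvid):
    """Stand-in for single_download so the sketch runs offline."""
    if not bvid.startswith("BV"):
        raise ValueError("not a BV ID: " + bvid)
    return "downloaded " + bvid


results, errors = download_many(["BV1La4y1s7Pq", "BV1gW411F7Gy", "oops"], fake_worker)
print(results)
print(errors)
```

Before passing single_download as the worker, keep in mind that download_video_single modifies the shared module-level headers dict, so each call would need its own copy of the headers to be safe across threads.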
