Scraping the Top Ten Baidu Hot-List Entries with Python

The full code is as follows:

import json
import re

import requests


def get_html(url):
    """Fetch the page source; return None if the request fails."""
    try:
        response = requests.get(url, timeout=10)
        response.encoding = response.apparent_encoding
        if response.status_code == 200:
            return response.text
        print("Connection error!")
    except requests.RequestException:
        print("Request failed!")
    return None


def get_result(html):
    """Extract the top ten entries into a dict keyed by rank."""
    pattern1 = re.compile(r'<td class="first">.*?>(\d+)</span>', re.S)  # rank
    rank = re.findall(pattern1, html)
    pattern2 = re.compile(r'<td class="keyword">.*?>(.*?)</a>', re.S)  # keyword
    keyword = re.findall(pattern2, html)
    pattern3 = re.compile(r'<td class="last">.*?>(\d+)</span>', re.S)  # popularity index
    last = re.findall(pattern3, html)
    pattern4 = re.compile(r'<td class="keyword">.*?<a href="(.*?)"', re.S)  # link
    link = re.findall(pattern4, html)

    result = {}
    # zip stops at the shortest list, so a page with fewer than ten rows
    # cannot raise IndexError.
    rows = list(zip(rank, keyword, last, link))[:10]
    for r, k, p, href in rows:
        # Use the rank as the outer key so that entries are easy to look up.
        result[r] = {
            "keyword": k,
            "popularity index": p,
            "link": href.replace(
                './detail?b=1&c=513&w',
                'https://www.baidu.com/baidu?cl=3&tn=SE_baiduhomet8_jmjb7mjw&rsv_dl=fyb_top&fr=top1000&wd'),
        }
    return result


def main():
    url = 'http://top.baidu.com/buzz?b=1&fr=topindex'
    html = get_html(url)
    if html is None:
        return
    # Serialize to JSON; ensure_ascii=False keeps Chinese keywords readable.
    result = json.dumps(get_result(html), indent=4, ensure_ascii=False)
    print(result)


if __name__ == '__main__':
    main()
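
The four regular expressions in get_result can be tried offline, without any network request. The sketch below runs them against a small hand-written HTML fragment. SAMPLE_HTML, its class names on the inner tags, its keywords and numbers, and the helper name parse_rows are all made up for illustration; the fragment only imitates the table markup the patterns expect and is not copied from the real page.

```python
import json
import re

# Hypothetical fragment that imitates the <td> structure the patterns target.
SAMPLE_HTML = """
<tr>
<td class="first"><span class="num-top">1</span></td>
<td class="keyword"><a href="./detail?b=1&c=513&w=python" class="list-title">Python</a></td>
<td class="last"><span class="icon-rise">4821</span></td>
</tr>
<tr>
<td class="first"><span class="num-top">2</span></td>
<td class="keyword"><a href="./detail?b=1&c=513&w=regex" class="list-title">regex</a></td>
<td class="last"><span class="icon-fall">3907</span></td>
</tr>
"""

# Prefixes taken from the link rewrite in the original script.
OLD_PREFIX = './detail?b=1&c=513&w'
NEW_PREFIX = ('https://www.baidu.com/baidu?cl=3&tn=SE_baiduhomet8_jmjb7mjw'
              '&rsv_dl=fyb_top&fr=top1000&wd')


def parse_rows(html):
    """Apply the same four patterns as get_result and key the rows by rank."""
    # re.S lets '.' match newlines, so a pattern may span several lines.
    rank = re.findall(r'<td class="first">.*?>(\d+)</span>', html, re.S)
    keyword = re.findall(r'<td class="keyword">.*?>(.*?)</a>', html, re.S)
    last = re.findall(r'<td class="last">.*?>(\d+)</span>', html, re.S)
    link = re.findall(r'<td class="keyword">.*?<a href="(.*?)"', html, re.S)

    result = {}
    for r, k, p, href in zip(rank, keyword, last, link):
        result[r] = {
            "keyword": k,
            "popularity index": p,
            "link": href.replace(OLD_PREFIX, NEW_PREFIX),
        }
    return result


if __name__ == '__main__':
    print(json.dumps(parse_rows(SAMPLE_HTML), indent=4, ensure_ascii=False))
```

Note that the link pattern requires href to be the first attribute of the <a> tag, and that every pattern depends on the exact class names of the <td> cells. If the served markup differs from this shape, the lists come back empty rather than raising an error, so it is worth checking their lengths before indexing into them.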
