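I. Introduction to and Installation of Beautiful Soup

1. Introduction to Beautiful Soup

Answer: Beautiful Soup is a Python library for extracting data from HTML or XML files.

2. Installing Beautiful Soup

Answer: Install Beautiful Soup 4: pip install beautifulsoup4 (pip install bs4 also works; bs4 is a placeholder package on PyPI that pulls in beautifulsoup4)
Install lxml: pip install lxml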
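
After installing, a quick check like the one below can confirm that the library imports and show which parsers it can use. This is only a sketch: the helper name available_parsers and the candidate list are assumptions made for illustration, not part of Beautiful Soup.

```python
import bs4
from bs4 import BeautifulSoup, FeatureNotFound


def available_parsers(candidates=("html.parser", "lxml", "html5lib")):
    """Return the parser names from `candidates` that Beautiful Soup can use here."""
    found = []
    for name in candidates:
        try:
            BeautifulSoup("<p>ok</p>", name)
        except FeatureNotFound:
            # Raised when the requested parser library is not installed.
            continue
        found.append(name)
    return found


print("Beautiful Soup version:", bs4.__version__)
print("Usable parsers:", available_parsers())
```

If lxml is missing from the second line of output, pip install lxml has probably not been run in the active environment.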

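II. The BeautifulSoup Object: Introduction and Creation

1. Introduction to the BeautifulSoup object

Answer: A BeautifulSoup object represents the entire parsed document tree. It supports most of the methods used for navigating and searching the tree.

2. Creating a BeautifulSoup object

Answer:

  1. Import the module: from bs4 import BeautifulSoup
  2. Create the BeautifulSoup object: soup = BeautifulSoup('HTML document', 'lxml')

# Import the module
from bs4 import BeautifulSoup
# Create the BeautifulSoup object
soup = BeautifulSoup('<html>data</html>', 'lxml')
print(soup)  # Result: <html><body><p>data</p></body></html>
# Note that the incomplete HTML is completed automatically when parsed with lxml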
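
The second argument selects the parser, and the automatic completion seen with lxml depends on that choice. As a small comparison, assuming Python's built-in html.parser is acceptable for the task, the same fragment can be parsed without any extra install:

```python
from bs4 import BeautifulSoup

fragment = "<html>data</html>"

# html.parser ships with Python, so this call needs no extra install.
soup_builtin = BeautifulSoup(fragment, "html.parser")

# Unlike lxml, html.parser keeps the fragment as written: no <body> or <p> is added.
print(soup_builtin)
```

Because parsers can build different trees from the same broken markup, it helps to name the parser explicitly, as the examples in this article do.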

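3. The find method of the BeautifulSoup object

Purpose of find: search the document tree

find(self, name=None, attrs={}, recursive=True, text=None, **kwargs)
Parameters
name: tag name
attrs: dictionary of attributes
recursive: whether to search all descendants recursively (if False, only direct children are searched)
text: search by text content (since Beautiful Soup 4.4.0 this argument is also available under the name string)
Returns
The first matching element, or None if nothing matches

Find by tag name: soup.find('a')
Find by attribute: soup.find(id='2') or soup.find(attrs={'id': '2'})
Find by text content: soup.find(text='three')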
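
Two details of find are easy to overlook: it returns None when nothing matches, and recursive=False limits the search to direct children. A minimal sketch, using a made-up snippet of HTML and the built-in html.parser (both are assumptions for illustration):

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, parsed with the built-in html.parser.
html = '<html><body><p class="beyond"><a id="1">one</a><a id="2">two</a></p></body></html>'
soup = BeautifulSoup(html, "html.parser")

first_a = soup.find("a")       # first <a> anywhere in the tree
missing = soup.find("table")   # no <table> in the document, so this is None

# recursive=False restricts the search to direct children of <body>.
direct_a = soup.body.find("a", recursive=False)  # None: <a> is nested inside <p>
direct_p = soup.body.find("p", recursive=False)  # the <p> tag itself

print(first_a)
print(missing, direct_a)
print(direct_p.name)
```

Checking the result against None before using it avoids an AttributeError when a page does not contain the expected tag.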

① Find by tag name
Get the title tag and the a tags from the document below.
Document content:

<html> <head><title>百度一下,你就知道</title></head><body><p class="title"><b>beyondyanyu</b></p><p class="beyond"> Epiphany<a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="1" class=beyond>one</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="2" class=beyond>two</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="3" class=beyond>three</a>humour</p></body>
</html>
# 1. Import the module
from bs4 import BeautifulSoup
# 2. Prepare the document string
html = '''
<html> <head><title>百度一下,你就知道</title></head><body><p class="title"><b>beyondyanyu</b></p><p class="beyond"> Epiphany<a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="1" class=beyond>one</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="2" class=beyond>two</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="3" class=beyond>three</a>humour</p></body>
</html>
'''
# 3. Create the BeautifulSoup object
soup = BeautifulSoup(html, 'lxml')
# 4. Find the title tag
title = soup.find('title')
print(title)  # Result: <title>百度一下,你就知道</title>
# 5. Find the first a tag
a = soup.find('a')
print(a)  # Result: <a class="beyond" href="https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343" id="1">one</a>
# 6. Find all a tags
a_all = soup.find_all('a')
print(a_all)  # Result: [<a class="beyond" href="https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343" id="1">one</a>, <a class="beyond" href="https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343" id="2">two</a>, <a class="beyond" href="https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343" id="3">three</a>]
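
Since find_all returns a list-like result, a common follow-up is to loop over it and pull out the link text and href of each tag. The sketch below uses a shortened document with placeholder example.com URLs (an assumption, not the article's data) and the built-in html.parser:

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical version of the sample document (placeholder URLs).
html = '''
<p class="beyond">
<a href="https://example.com/1" id="1">one</a>
<a href="https://example.com/2" id="2">two</a>
<a href="https://example.com/3" id="3">three</a>
</p>
'''
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet, so it can be iterated directly.
links = [(a.text, a["href"]) for a in soup.find_all("a")]
print(links)

# limit caps the number of results that find_all collects.
first_two = soup.find_all("a", limit=2)
print(len(first_two))
```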

② Find by attribute
Get the tag whose id is 2 from the document below.
Document content:

<html> <head><title>百度一下,你就知道</title></head><body><p class="title"><b>beyondyanyu</b></p><p class="beyond"> Epiphany<a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="1" class=beyond>one</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="2" class=beyond>two</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="3" class=beyond>three</a>humour</p></body>
</html>
# 1. Import the module
from bs4 import BeautifulSoup
# 2. Prepare the document string
html = '''
<html> <head><title>百度一下,你就知道</title></head><body><p class="title"><b>beyondyanyu</b></p><p class="beyond"><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="1" class=beyond>one</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="2" class=beyond>two</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="3" class=beyond>three</a></p></body>
</html>
'''
# 3. Create the BeautifulSoup object
soup = BeautifulSoup(html, 'lxml')
# 4. Find the tag whose id is 2
# Option 1: specify the attribute as a keyword argument
two = soup.find(id='2')
print(two)  # Result: <a class="beyond" href="https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343" id="2">two</a>
# Option 2: pass a dictionary of attributes through attrs
two = soup.find(attrs={'id': '2'})
print(two)  # Result: <a class="beyond" href="https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343" id="2">two</a>
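
The keyword-argument form does not work for every attribute name: class is a reserved word in Python, and names such as data-kind are not valid identifiers. The following sketch shows the usual workarounds; the data-kind attribute and the sample markup are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; data-kind is an invented attribute.
html = ('<p class="title"><b>beyondyanyu</b></p>'
        '<p class="beyond"><a id="1" class="beyond" data-kind="link">one</a></p>')
soup = BeautifulSoup(html, "html.parser")

# class is a Python keyword, so the keyword argument is spelled class_.
by_class = soup.find("a", class_="beyond")

# Attribute names that are not valid Python identifiers go in the attrs dict.
by_data = soup.find(attrs={"data-kind": "link"})

# Combining a tag name with an attribute narrows the match.
title_p = soup.find("p", class_="title")

print(by_class["id"], by_data["id"], title_p.text)
```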

③ Find by text
Get the text node whose content is three from the document below.
Document content:

<html> <head><title>百度一下,你就知道</title></head><body><p class="title"><b>beyondyanyu</b></p><p class="beyond"> Epiphany<a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="1" class=beyond>one</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="2" class=beyond>two</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="3" class=beyond>three</a>humour</p></body>
</html>
# 1. Import the module
from bs4 import BeautifulSoup
# 2. Prepare the document string
html = '''
<html> <head><title>百度一下,你就知道</title></head><body><p class="title"><b>beyondyanyu</b></p><p class="beyond"><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="1" class=beyond>one</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="2" class=beyond>two</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="3" class=beyond>three</a></p></body>
</html>
'''
# 3. Create the BeautifulSoup object
soup = BeautifulSoup(html, 'lxml')
# 4. Find the text node whose content is three
text = soup.find(text='three')  # Beautiful Soup 4.4.0+ also accepts string='three'
print(text)  # Result: three
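
Searching by text alone returns the text node rather than the enclosing tag. If the tag is what you need, one option is to step up with .parent, another is to pass the tag name as well. The sketch below uses the string argument (Beautiful Soup 4.4.0 or later) on a shortened, made-up document:

```python
import re
from bs4 import BeautifulSoup

html = '<p><a id="1">one</a><a id="2">two</a><a id="3">three</a></p>'
soup = BeautifulSoup(html, "html.parser")

# Searching by string alone returns the text node (a NavigableString), not the tag.
node = soup.find(string="three")
tag_via_parent = node.parent  # step up to the enclosing <a>

# Giving a tag name as well returns the tag whose .string matches.
tag_direct = soup.find("a", string="three")

# A compiled regular expression can match part of the text.
starts_with_t = soup.find_all("a", string=re.compile(r"^t"))

print(type(node).__name__, tag_via_parent["id"], tag_direct["id"])
print([a.text for a in starts_with_t])
```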

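4. The Tag object in Beautiful Soup

About the Tag object: a Tag object corresponds to an XML or HTML tag in the original document.
A Tag has many methods and attributes for navigating and searching the document tree and for getting a tag's content.

Common attributes of a Tag object
name: the tag's name
attrs: all of the tag's attributes, as keys and values
text: the tag's text as a string

# 1. Import the module
from bs4 import BeautifulSoup
# 2. Prepare the document string
html = '''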
<html> <head><title>百度一下,你就知道</title></head><body><p class="title"><b>beyondyanyu</b></p><p class="beyond"><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="1" class=beyond>one</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="2" class=beyond>two</a><a href=https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343 id="3" class=beyond>three</a></p></body>
</html>
'''
# 3. Create the BeautifulSoup object
soup = BeautifulSoup(html, 'lxml')
# 4. Find the title tag
title = soup.find('title')
#print(title)
# 5. Find the a tag
a = soup.find('a')
print(type(a))  # Result: <class 'bs4.element.Tag'>
print("Tag name:", a.name)  # Result: Tag name: a
print("All tag attributes:", a.attrs)  # Result: All tag attributes: {'href': 'https://blog.csdn.net/qq_41264055?spm=1011.2124.3001.5343', 'id': '1', 'class': ['beyond']}
print("Tag text:", a.text)  # Result: Tag text: one
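
Besides reading the whole attrs dictionary, a single attribute can be read directly from the Tag. A short sketch with a made-up tag (the example.com URL and the extra class are assumptions for illustration):

```python
from bs4 import BeautifulSoup

html = '<a href="https://example.com/1" id="1" class="beyond extra">one</a>'
soup = BeautifulSoup(html, "html.parser")
a = soup.find("a")

href = a["href"]         # dictionary-style access; raises KeyError if absent
title = a.get("title")   # .get returns None when the attribute is absent
classes = a["class"]     # class is multi-valued, so it comes back as a list
has_id = a.has_attr("id")

print(href, title, classes, has_id)
```

Using .get is the safer choice when an attribute may be missing on some tags.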

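III. Extracting the Latest National Epidemic Data from the Epidemic Home Page

As before, the data source is the home page of the DXY (丁香园) COVID-19 real-time epidemic tracker.
URL: https://ncov.dxy.cn/ncovh5/view/pneumonia

# 1. Import the required modules
import requests
from bs4 import BeautifulSoup
# 2. Send the request and get the content of the epidemic home page
response = requests.get('https://ncov.dxy.cn/ncovh5/view/pneumonia')
home_page = response.content.decode()
#print(home_page)  # Output is too long and is omitted; the relevant part begins: <body><script id="getAreaStat">try { window.getAreaStat = [{"provinceName":"香港","provinceShortName":"香港"
# The last line of the page source below shows that the epidemic data (beginning with Hong Kong) sits in a script tag whose id is getAreaStat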
'''
<!DOCTYPE html><html lang="zh-cn" xmlns:layout="http://www.ultraq.net.nz/web/thymeleaf/layout" style="filter: none;"><head><link rel="stylesheet" href="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/umi.bundle.css?t=1645425725691"><meta charset="utf-8"><meta content="width=device-width,initial-scale=1,user-scalable=0,viewport-fit=cover" name="viewport"><meta content="#000000" name="theme-color"><title></title><script>window.routerBase = "/ncovh5/view";</script>
<script charset="utf-8" src="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/vendors~p__ECommerce~p__Pneumonia~p__Pneumonia__area~p__Pneumonia__area__index_en~p__Pneumonia__dete~e354351c.async.f15190c5.js"></script><script charset="utf-8" src="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/vendors~p__Pneumonia~p__Pneumonia__area~p__Pneumonia__policy~p__Pneumonia__risk-zone~p__Pneumonia__rumor-list.async.035b6a59.js"></script><link rel="stylesheet" type="text/css" href="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/vendors~p__ECommerce~p__Pneumonia~p__Pneumonia__area~p__Pneumonia__hotspot~p__Pneumonia__index_en.async.1eec8c54.css"><script charset="utf-8" src="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/vendors~p__ECommerce~p__Pneumonia~p__Pneumonia__area~p__Pneumonia__hotspot~p__Pneumonia__index_en.async.61a0218b.js"></script><script charset="utf-8" src="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/vendors~p__Pneumonia~p__Pneumonia__area__index_en~p__Pneumonia__index_en.async.72f0956a.js"></script><link rel="stylesheet" type="text/css" href="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/p__Pneumonia.async.5cef6c52.css"><script charset="utf-8" src="//assets.dxycdn.com/gitrepo/ncov-mobile/dist/p__Pneumonia.async.bc83130c.js"></script><meta name="description" content="丁香园、丁香医生整合各权威渠道发布的官方数据,通过疫情地图直观展示,持续更新最新的新型冠状病毒肺炎的实时疫情动态。"><meta name="keywords" content="最新疫情、实时疫情、疫情地图、疫情、丁香园"><meta name="baidu-site-verification" content="IL1HU7F7Vj"></head>
<body><script id="getAreaStat">try { window.getAreaStat = [{"provinceName":"香港","provinceShortName":"香港","currentConfirmedCount":5990,"confirmedCount":22468,"suspectedCount":181,"curedCount":16190,"deadCount":288,"comment":"疑似 1 例","locationId":810000,"statisticsData":"https://file1.dxycdn.com/2020/0223/331/3398299755968040033-135.json","highDangerCount":0,"midDangerCount":0,"detectOrgCount":0,"vaccinationOrgCount":0,"cities":[],"dangerAreas":[]},{"provinceName":"台湾","provinceShortName":"台湾","currentConfirmedCount":5413,"confirmedCount":20007,"suspectedCount":485,"curedCount":13742,"deadCount":852,"comment":"","locationId":710000,"statisticsData":"https://file1.dxycdn.com/2020/0223/045/3398299749526003760-135.json","highDangerCount":0,"midDangerCount":0,"detectOrgCount":0,"vaccinationOrgCount":0,"cities":[],"dangerAreas":[]},{"provinceName":"浙江省","provinceShortName":"浙江","currentConfirmedCount":388,"confirmedCount":2255,"suspectedCount":68,"curedCount":1866,"deadCount":1,"comment":"2月10日通报核减的12例在浙江省治愈的外省病例,根据国家最新要求重新纳入累计病例。","locationId":330000,"statisticsData":"https://file1.dxycdn.com/2020/0223/537/3398299755968455045-135.json","highDangerCount":0,"midDangerCount":0,"detectOrgCount":519,"vaccinationOrgCount":217,"cities":[{"cityName":"杭州","currentConfirmedCount":143,"confirmedCount":328,"suspectedCount":0,"curedCount":185,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330100,"currentConfirmedCountStr":"143"},{"cityName":"境外输入","currentConfirmedCount":119,"confirmedCount":387,"suspectedCount":68,"curedCount":268,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":0,"currentConfirmedCountStr":"119"},{"cityName":"宁波","currentConfirmedCount":110,"confirmedCount":269,"suspectedCount":0,"curedCount":159,"deadCount":0,"highDangerCount":0,
'''
# 3. Use Beautiful Soup to extract the epidemic data
soup = BeautifulSoup(home_page,'lxml')
script = soup.find(id='getAreaStat')
text = script.text
print(text)
'''
try { window.getAreaStat = [{"provinceName":"香港","provinceShortName":"香港","currentConfirmedCount":5990,"confirmedCount":22468,"suspectedCount":181,"curedCount":16190,"deadCount":288,"comment":"疑似 1 例","locationId":810000,"statisticsData":"https://file1.dxycdn.com/2020/0223/331/3398299755968040033-135.json","highDangerCount":0,"midDangerCount":0,"detectOrgCount":0,"vaccinationOrgCount":0,"cities":[],"dangerAreas":[]},{"provinceName":"台湾","provinceShortName":"台湾","currentConfirmedCount":5413,"confirmedCount":20007,"suspectedCount":485,"curedCount":13742,"deadCount":852,"comment":"","locationId":710000,"statisticsData":"https://file1.dxycdn.com/2020/0223/045/3398299749526003760-135.json","highDangerCount":0,"midDangerCount":0,"detectOrgCount":0,"vaccinationOrgCount":0,"cities":[],"dangerAreas":[]},{"provinceName":"浙江省","provinceShortName":"浙江","currentConfirmedCount":388,"confirmedCount":2255,"suspectedCount":68,"curedCount":1866,"deadCount":1,"comment":"2月10日通报核减的12例在浙江省治愈的外省病例,根据国家最新要求重新纳入累计病例。","locationId":330000,"statisticsData":"https://file1.dxycdn.com/2020/0223/537/3398299755968455045-135.json","highDangerCount":0,"midDangerCount":0,"detectOrgCount":519,"vaccinationOrgCount":217,"cities":[{"cityName":"杭州","currentConfirmedCount":143,"confirmedCount":328,"suspectedCount":0,"curedCount":185,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330100,"currentConfirmedCountStr":"143"},{"cityName":"境外输入","currentConfirmedCount":119,"confirmedCount":387,"suspectedCount":68,"curedCount":268,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":0,"currentConfirmedCountStr":"119"},{"cityName":"宁波","currentConfirmedCount":110,"confirmedCount":269,"suspectedCount":0,"curedCount":159,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330200,"currentConfirmedCountStr":"110"},{"cityName":"绍兴","currentConfirmedCount":38,"confirmedCount":430,"suspectedCount":0,"curedCount":392,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locatio
nId":330600,"currentConfirmedCountStr":"38"},{"cityName":"金华","currentConfirmedCount":2,"confirmedCount":57,"suspectedCount":0,"curedCount":55,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330700,"currentConfirmedCountStr":"2"},{"cityName":"温州","currentConfirmedCount":0,"confirmedCount":504,"suspectedCount":0,"curedCount":503,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":330300,"currentConfirmedCountStr":"0"},{"cityName":"台州","currentConfirmedCount":0,"confirmedCount":147,"suspectedCount":0,"curedCount":147,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":331000,"currentConfirmedCountStr":"0"},{"cityName":"嘉兴","currentConfirmedCount":0,"confirmedCount":46,"suspectedCount":0,"curedCount":46,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330400,"currentConfirmedCountStr":"0"},{"cityName":"省十里丰监狱","currentConfirmedCount":0,"confirmedCount":36,"suspectedCount":0,"curedCount":36,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":0,"currentConfirmedCountStr":"0"},{"cityName":"丽水","currentConfirmedCount":0,"confirmedCount":17,"suspectedCount":0,"curedCount":17,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":331100,"currentConfirmedCountStr":"0"},{"cityName":"衢州","currentConfirmedCount":0,"confirmedCount":14,"suspectedCount":0,"curedCount":14,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330800,"currentConfirmedCountStr":"0"},{"cityName":"湖州","currentConfirmedCount":0,"confirmedCount":10,"suspectedCount":0,"curedCount":10,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330500,"currentConfirmedCountStr":"0"},{"cityName":"舟山","currentConfirmedCount":0,"confirmedCount":10,"suspectedCount":0,"curedCount":10,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":330900,"currentConfirmedCountStr":"0"},{"cityName":"待明确地区","currentConfirmedCount":-24,"confirmedCount":0,"suspectedCount":0,"curedCount":24,"deadCount":0,"highDang
erCount":0,"midDangerCount":0,"locationId":0,"notShowCurrentConfirmedCount":true,"currentConfirmedCountStr":"-"}],"dangerAreas":[]},{"provinceName":"广东省","provinceShortName":"广东","currentConfirmedCount":362,"confirmedCount":4163,"suspectedCount":25,"curedCount":3793,"deadCount":8,"comment":"广东卫健委未明确部分治愈病例的地市归属,因此各地市的现存确诊存在一定偏差。","locationId":440000,"statisticsData":"https://file1.dxycdn.com/2020/0223/281/3398299758115524068-135.json","highDangerCount":0,"midDangerCount":3,"detectOrgCount":120,"vaccinationOrgCount":42,"cities":[{"cityName":"深圳","currentConfirmedCount":145,"confirmedCount":798,"suspectedCount":3,"curedCount":650,"deadCount":3,"highDangerCount":0,"midDangerCount":3,"locationId":440300,"currentConfirmedCountStr":"145"},{"cityName":"广州","currentConfirmedCount":103,"confirmedCount":2205,"suspectedCount":3,"curedCount":2101,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":440100,"currentConfirmedCountStr":"103"},{"cityName":"东莞","currentConfirmedCount":31,"confirmedCount":202,"suspectedCount":1,"curedCount":170,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":441900,"currentConfirmedCountStr":"31"},{"cityName":"佛山","currentConfirmedCount":28,"confirmedCount":318,"suspectedCount":1,"curedCount":290,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":440600,"currentConfirmedCountStr":"28"},{"cityName":"阳江","currentConfirmedCount":23,"confirmedCount":51,"suspectedCount":0,"curedCount":28,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":441700,"currentConfirmedCountStr":"23"},{"cityName":"珠海","currentConfirmedCount":15,"confirmedCount":169,"suspectedCount":2,"curedCount":153,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":440400,"currentConfirmedCountStr":"15"},{"cityName":"惠州","currentConfirmedCount":7,"confirmedCount":71,"suspectedCount":0,"curedCount":64,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":441300,"currentConfirmedCountStr":"7"},{"cityName":"江门","
currentConfirmedCount":7,"confirmedCount":47,"suspectedCount":0,"curedCount":40,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":440700,"currentConfirmedCountStr":"7"},{"cityName":"云浮","currentConfirmedCount":7,"confirmedCount":7,"suspectedCount":0,"curedCount":0,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":445300,"currentConfirmedCountStr":"7"},{"cityName":"中山","currentConfirmedCount":4,"confirmedCount":80,"suspectedCount":0,"curedCount":76,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":442000,"currentConfirmedCountStr":"4"},{"cityName":"湛江","currentConfirmedCount":2,"confirmedCount":43,"suspectedCount":2,"curedCount":41,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":440800,"currentConfirmedCountStr":"2"},{"cityName":"河源","currentConfirmedCount":1,"confirmedCount":6,"suspectedCount":0,"curedCount":5,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":441600,"currentConfirmedCountStr":"1"},{"cityName":"肇庆","currentConfirmedCount":0,"confirmedCount":47,"suspectedCount":1,"curedCount":46,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":441200,"currentConfirmedCountStr":"0"},{"cityName":"汕头","currentConfirmedCount":0,"confirmedCount":26,"suspectedCount":0,"curedCount":26,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":440500,"currentConfirmedCountStr":"0"},{"cityName":"清远","currentConfirmedCount":0,"confirmedCount":23,"suspectedCount":0,"curedCount":23,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":441800,"currentConfirmedCountStr":"0"},{"cityName":"梅州","currentConfirmedCount":0,"confirmedCount":19,"suspectedCount":0,"curedCount":19,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":441400,"currentConfirmedCountStr":"0"},{"cityName":"茂名","currentConfirmedCount":0,"confirmedCount":17,"suspectedCount":0,"curedCount":17,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":440900,"currentConfirmedCountSt
r":"0"},{"cityName":"揭阳","currentConfirmedCount":0,"confirmedCount":11,"suspectedCount":0,"curedCount":11,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":445200,"currentConfirmedCountStr":"0"},{"cityName":"韶关","currentConfirmedCount":0,"confirmedCount":10,"suspectedCount":0,"curedCount":9,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":440200,"currentConfirmedCountStr":"0"},{"cityName":"潮州","currentConfirmedCount":0,"confirmedCount":7,"suspectedCount":0,"curedCount":7,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":445100,"currentConfirmedCountStr":"0"},{"cityName":"汕尾","currentConfirmedCount":0,"confirmedCount":6,"suspectedCount":0,"curedCount":6,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":441500,"currentConfirmedCountStr":"0"},{"cityName":"待明确地区","currentConfirmedCount":-11,"confirmedCount":0,"suspectedCount":0,"curedCount":11,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":0,"notShowCurrentConfirmedCount":true,"currentConfirmedCountStr":"-"}],"dangerAreas":[{"cityName":"深圳","areaName":"龙岗区坂田街道马安堂社区侨联东10巷1号顺兴楼","dangerLevel":2},{"cityName":"深圳","areaName":"罗湖区东门街道新园路明华广场1至6楼(含6A与M层)商业区","dangerLevel":2},{"cityName":"深圳","areaName":"中兴路高时石材B区A钢构厂房","dangerLevel":2}]},{"provinceName":"广西壮族自治区","provinceShortName":"广西","currentConfirmedCount":319,"confirmedCount":1028,"suspectedCount":0,"curedCount":707,"deadCount":2,"comment":"","locationId":450000,"statisticsData":"https://file1.dxycdn.com/2020/0223/536/3398299758115523880-135.json","highDangerCount":1,"midDangerCount":10,"detectOrgCount":270,"vaccinationOrgCount":15,"cities":[{"cityName":"百色","currentConfirmedCount":227,"confirmedCount":274,"suspectedCount":0,"curedCount":47,"deadCount":0,"highDangerCount":1,"midDangerCount":10,"locationId":451000,"currentConfirmedCountStr":"227"},{"cityName":"境外输入","currentConfirmedCount":91,"confirmedCount":482,"suspectedCount":0,"curedCount":391,"deadCount":0,"highDangerCount":0,"midDa
ngerCount":0,"locationId":0,"currentConfirmedCountStr":"91"},{"cityName":"南宁","currentConfirmedCount":1,"confirmedCount":57,"suspectedCount":0,"curedCount":56,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":450100,"currentConfirmedCountStr":"1"},{"cityName":"北海","currentConfirmedCount":0,"confirmedCount":44,"suspectedCount":0,"curedCount":43,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":450500,"currentConfirmedCountStr":"0"},{"cityName":"防城港","currentConfirmedCount":0,"confirmedCount":39,"suspectedCount":0,"curedCount":39,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":450600,"currentConfirmedCountStr":"0"},{"cityName":"桂林","currentConfirmedCount":0,"confirmedCount":32,"suspectedCount":0,"curedCount":32,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":450300,"currentConfirmedCountStr":"0"},{"cityName":"河池","currentConfirmedCount":0,"confirmedCount":28,"suspectedCount":0,"curedCount":27,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":451200,"currentConfirmedCountStr":"0"},{"cityName":"柳州","currentConfirmedCount":0,"confirmedCount":24,"suspectedCount":0,"curedCount":24,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":450200,"currentConfirmedCountStr":"0"},{"cityName":"玉林","currentConfirmedCount":0,"confirmedCount":11,"suspectedCount":0,"curedCount":11,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":450900,"currentConfirmedCountStr":"0"},{"cityName":"来宾","currentConfirmedCount":0,"confirmedCount":11,"suspectedCount":0,"curedCount":11,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":451300,"currentConfirmedCountStr":"0"},{"cityName":"钦州","currentConfirmedCount":0,"confirmedCount":8,"suspectedCount":0,"curedCount":8,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":450700,"currentConfirmedCountStr":"0"},{"cityName":"贵港","currentConfirmedCount":0,"confirmedCount":8,"suspectedCount":0,"curedCount":8,"deadCount":0,"hi
ghDangerCount":0,"midDangerCount":0,"locationId":450800,"currentConfirmedCountStr":"0"},{"cityName":"梧州","currentConfirmedCount":0,"confirmedCount":5,"suspectedCount":0,"curedCount":5,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":450400,"currentConfirmedCountStr":"0"},{"cityName":"贺州","currentConfirmedCount":0,"confirmedCount":4,"suspectedCount":0,"curedCount":4,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":451100,"currentConfirmedCountStr":"0"},{"cityName":"崇左","currentConfirmedCount":0,"confirmedCount":1,"suspectedCount":0,"curedCount":1,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":451400,"currentConfirmedCountStr":"0"}],"dangerAreas":[{"cityName":"百色","areaName":"德保县都安乡伏计村陇意屯","dangerLevel":1},{"cityName":"百色","areaName":"城关镇隆盛社区东蒙荣盛二巷25号","dangerLevel":2},{"cityName":"百色","areaName":"城关镇隆盛社区盛象名都小区","dangerLevel":2},{"cityName":"百色","areaName":"都安乡坡那村多麦屯","dangerLevel":2},{"cityName":"百色","areaName":"德保县都安乡福记村山金屯","dangerLevel":2},{"cityName":"百色","areaName":"德保县维也纳酒店(德保腾飞广场店)","dangerLevel":2},{"cityName":"百色","areaName":"东凌镇登限村念洞屯","dangerLevel":2},{"cityName":"百色","areaName":"敬德镇陇正村多果屯","dangerLevel":2},{"cityName":"百色","areaName":"靖西市武平镇大道街大定屯","dangerLevel":2},{"cityName":"百色","areaName":"莲城社区德立山庄","dangerLevel":2},{"cityName":"百色","areaName":"弄贴村新村屯","dangerLevel":2}]},{"provinceName":"上海市","provinceShortName":"上海","currentConfirmedCount":209,"confirmedCount":4003,"suspectedCount":393,"curedCount":3787,"deadCount":7,"comment":"因未公布分区死亡和治愈,仅展示累计确诊和现存确诊","locationId":310000,"statisticsData":"https://file1.dxycdn.com/2020/0223/128/3398299755968454977-135.json","highDangerCount":0,"midDangerCount":0,"detectOrgCount":130,"vaccinationOrgCount":17,"cities":[{"cityName":"境外输入","currentConfirmedCount":208,"confirmedCount":3611,"suspectedCount":8,"curedCount":3403,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":0,"currentConfirmedCountStr":"208"},{"cityName":"奉贤区","currentConfirmedCount":1
,"confirmedCount":11,"suspectedCount":0,"curedCount":10,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":310120,"currentConfirmedCountStr":"1"},{"cityName":"外地来沪","currentConfirmedCount":0,"confirmedCount":113,"suspectedCount":0,"curedCount":112,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":0,"currentConfirmedCountStr":"0"},{"cityName":"浦东新区","currentConfirmedCount":0,"confirmedCount":82,"suspectedCount":0,"curedCount":81,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":310115,"currentConfirmedCountStr":"0"},{"cityName":"宝山区","currentConfirmedCount":0,"confirmedCount":27,"suspectedCount":0,"curedCount":26,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":310113,"currentConfirmedCountStr":"0"},{"cityName":"黄浦区","currentConfirmedCount":0,"confirmedCount":22,"suspectedCount":0,"curedCount":22,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":310101,"currentConfirmedCountStr":"0"},{"cityName":"闵行区","currentConfirmedCount":0,"confirmedCount":19,"suspectedCount":0,"curedCount":19,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":310112,"currentConfirmedCountStr":"0"},{"cityName":"徐汇区","currentConfirmedCount":0,"confirmedCount":18,"suspectedCount":0,"curedCount":17,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":310104,"currentConfirmedCountStr":"0"},{"cityName":"静安区","currentConfirmedCount":0,"confirmedCount":17,"suspectedCount":0,"curedCount":16,"deadCount":1,"highDangerCount":0,"midDangerCount":0,"locationId":310106,"currentConfirmedCountStr":"0"},{"cityName":"松江区","currentConfirmedCount":0,"confirmedCount":16,"suspectedCount":0,"curedCount":16,"deadCount":0,"highDangerCount":0,"midDangerCount":0,"locationId":310117,"currentConfirmedCountStr":"0"},{"cityName":"长宁区","currentConfirmedCount":0,"confirmedCount":14,"suspectedCount":0,"curedCount":14,"deadCount":...remaining output omitted for length
'''
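
The extracted text is JavaScript wrapped around a JSON array, so one possible next step is to cut out the array and load it with the json module. The sketch below is not part of the original walkthrough: it runs on a heavily shortened stand-in for script.text, and the closing "}catch(e){}" is an assumption about how the real script ends.

```python
import json
import re

# Hypothetical, heavily shortened stand-in for script.text; the closing
# "}catch(e){}" part is an assumption about how the real script ends.
text = ('try { window.getAreaStat = ['
        '{"provinceName":"香港","confirmedCount":22468,"cities":[]},'
        '{"provinceName":"台湾","confirmedCount":20007,"cities":[]}'
        ']}catch(e){}')

# Greedy match from the first "[" to the last "]" captures the whole JSON array.
match = re.search(r"\[.+\]", text, re.S)
if match is None:
    raise ValueError("no JSON array found in the script text")

provinces = json.loads(match.group())
for province in provinces:
    print(province["provinceName"], province["confirmedCount"])
```

Once loaded, each province is an ordinary Python dictionary, so fields such as confirmedCount or the nested cities list can be read without further string handling.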
