Visualizing COVID-19 New Cases Over Time

This heat map shows the progression of the COVID-19 pandemic in the United States over time. The map is read from left to right and is color-coded to show the relative number of new cases in each state, adjusted for population.

This visualization was inspired by a similar heat map that I saw on a discussion forum thread. I could never locate the source, as it was only a pasted image with no link. The original version was also crafted to make a political point, separating states by predominant party affiliation, which I was not as interested in. I was fascinated by how concisely it showed the progression of the pandemic, so I decided to create a similar visualization myself that I could update regularly.

Source code is hosted on my GitHub repo. If you are just interested in seeing updated versions of this heat map, I publish them weekly on my Twitter feed. Be careful when comparing graphs from different weeks, as the color map may change as new data is included. Comparisons are only valid within a given heat map.

The script relies on pandas, numpy, matplotlib, and seaborn.

The data comes from the New York Times COVID-19 GitHub repo. A simple launcher script clones the latest copy of the repository, copies the required file, and then launches the Python script to create the heat map. Only one file is really needed, so it could certainly be tightened up, but this works.

echo "Clearing old data..."
rm -rf covid-19-data/
rm -f us-states.csv
echo "Getting new data..."
git clone https://github.com/nytimes/covid-19-data
echo "Done."
cp covid-19-data/us-states.csv .
echo "Starting..."
python3 heatmap-newcases.py
echo "Done."
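
Since only one file is needed, the clone could in principle be replaced by a single download. The following is a minimal Python sketch of that idea, not the author's launcher. The raw-file URL (including the branch name `master`) and the helper name `fetch_states_csv` are assumptions, and the `retrieve` parameter exists only so the function can be exercised without network access.

```python
from urllib.request import urlretrieve

# Assumed raw-file location of us-states.csv; the branch name "master" is an assumption.
RAW_URL = "https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv"


def fetch_states_csv(url=RAW_URL, dest="us-states.csv", retrieve=urlretrieve):
    """Download the one CSV the heat map needs and return the local file name."""
    retrieve(url, dest)  # overwrites any older copy at dest
    return dest
```

Calling `fetch_states_csv()` before running the heat map script would stand in for the clone and copy steps, provided the URL assumption holds.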

The script first loads a CSV file containing the state populations into a dictionary, which is used to scale daily new case results. The new cases are computed for each day from the running total in the NY Times data, and then scaled to new cases per 100,000 people in the population.
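
As a small illustration of that calculation, here is a sketch on made-up numbers. The state names, case counts, and populations are hypothetical, and it uses a pandas pivot rather than the per-state loop in the full script, so treat it as one possible way to get the same quantity.

```python
import pandas as pd

# Hypothetical cumulative counts in the same long format as us-states.csv.
cumulative = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-02", "2020-03-03"] * 2,
    "state": ["A"] * 3 + ["B"] * 3,
    "cases": [10, 25, 45, 0, 2, 8],
})
# Made-up populations, standing in for StatePopulations.csv.
populations = {"A": 1_000_000, "B": 200_000}


def new_cases_per_100k(table, populations):
    """Return a date x state table of daily new cases per 100,000 residents."""
    wide = table.pivot(index="date", columns="state", values="cases")
    daily = wide.diff().fillna(0)        # running total -> daily increase
    scale = pd.Series(populations)       # aligned to the state columns by name
    return daily.div(scale, axis="columns") * 100_000


per_100k = new_cases_per_100k(cumulative, populations)
print(per_100k)
```

The first day of each state has no previous total to subtract, so it is treated as zero new cases, which mirrors how the full script replaces the missing value with 0.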

We could display the heat map at that point, but if we do, states with very high numbers of cases per 100,000 people will swamp the detail of the states with lower numbers of cases. Applying a log(x+1) transform improves contrast and readability significantly.
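
To see why the transform helps, consider a few made-up values spanning two orders of magnitude. This sketch assumes base 10, which is what the full script uses; the input numbers are hypothetical.

```python
import numpy as np

# Hypothetical daily new cases per 100,000 people for four states.
per_100k = np.array([0.0, 0.9, 9.0, 99.0])

# log10(x + 1): zero stays zero, and each tenfold jump adds about one unit,
# so the largest value no longer dominates the color range.
scaled = np.log10(per_100k + 1.0)
print(scaled)
```

On the raw scale the largest value is 110 times the second one; after the transform the four values fall between 0 and 2, so differences among the low-count states remain visible.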

Finally, Seaborn and Matplotlib are used to generate the heatmap and save it to an image file.
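
A stripped-down version of that last step might look like the sketch below. The data are random placeholder values for three hypothetical states, the output file name is made up, and the off-screen `Agg` backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder values: 3 hypothetical states (rows) x 5 days (columns).
rng = np.random.default_rng(0)
dates = pd.date_range("2020-03-01", periods=5).strftime("%Y-%m-%d")
data = pd.DataFrame(rng.random((3, 5)), index=["A", "B", "C"], columns=dates)

plt.figure(figsize=(6, 3))
ax = sns.heatmap(data, cmap="coolwarm", linewidths=0.05, linecolor="lightgrey")
plt.xticks(rotation=90)   # dates read vertically along the time axis
plt.yticks(rotation=0)    # state names stay horizontal
plt.tight_layout()
plt.savefig("heatmap-sketch.png")
plt.close()
```

States go on the rows and dates on the columns, which is why the full script transposes its table before plotting.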

That’s it! Feel free to use this as a framework for your own visualization. You can customize it to zero in on areas of interest.

Full source code is below. Thanks for reading, and I hope you found it useful.

import csv
import datetime

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# load the state populations into a dictionary keyed by state name
statePopulations = {}
with open('StatePopulations.csv') as populationFile:
    for row in csv.reader(populationFile):
        statePopulations[row[0]] = row[1:]

filename = "us-states.csv"
fullTable = pd.read_csv(filename)
fullTable = fullTable.drop(['fips'], axis=1)
fullTable = fullTable.drop(['deaths'], axis=1)

# generate a list of the dates and states in the table
dates = fullTable['date'].unique().tolist()
states = fullTable['state'].unique().tolist()

result = pd.DataFrame()
result['date'] = fullTable['date']

# leave out the territories
states.remove('Northern Mariana Islands')
states.remove('Puerto Rico')
states.remove('Virgin Islands')
states.remove('Guam')
states.sort()

for state in states:
    # create new dataframe with only the current state's data
    population = int(statePopulations[state][0])
    print(state + ": " + str(population))
    # copy so the column assignments below do not act on a slice of fullTable
    stateData = fullTable[fullTable.state.eq(state)].copy()
    newColumnName = state
    # daily new cases are the day-to-day difference of the running total
    stateData[newColumnName] = stateData.cases.diff()
    stateData[newColumnName] = stateData[newColumnName].replace(np.nan, 0)
    stateData = stateData.drop(['state'], axis=1)
    stateData = stateData.drop(['cases'], axis=1)
    # scale to new cases per 100,000 people
    stateData[newColumnName] = stateData[newColumnName].div(population)
    stateData[newColumnName] = stateData[newColumnName].mul(100000.0)
    result = pd.merge(result, stateData, how='left', on='date')

result = result.drop_duplicates()
result = result.fillna(0)

for state in states:
    # log10(x + 1) transform for contrast
    result[state] = result[state].add(1.0)
    result[state] = np.log10(result[state])
    # result[state] = np.sqrt(result[state])

result['date'] = pd.to_datetime(result['date'])
result = result[result['date'] >= '2020-02-15']
result['date'] = result['date'].dt.strftime('%Y-%m-%d')
result.set_index('date', inplace=True)
result.to_csv("result.csv")
# states on the rows, dates on the columns
result = result.transpose()

plt.figure(figsize=(16, 10))
g = sns.heatmap(result, cmap="coolwarm", linewidths=0.05, linecolor='lightgrey')
plt.xlabel('')
plt.ylabel('')
plt.title("Daily New Covid-19 Cases Per 100k Of Population", fontsize=20)
updateText = "Updated " + str(datetime.date.today()) + \
    ". Scaled with Log(x+1) for improved contrast due to wide range of values. Data source: NY Times Github. Visualization by @JRBowling"
plt.suptitle(updateText, fontsize=8)
# one tick per state, centered on each row
plt.yticks(np.arange(.5, len(states) + .5, 1.0), states)
plt.yticks(fontsize=8)
plt.xticks(fontsize=8)
g.set_xticklabels(g.get_xticklabels(), rotation=90)
g.set_yticklabels(g.get_yticklabels(), rotation=0)
plt.savefig("covidNewCasesper100K.png")

Translated from: https://towardsdatascience.com/visualization-of-covid-19-new-cases-over-time-in-python-8c6ac4620c88
