Contents

  • 1. Problem analysis
    • 1.1 Problem description
    • 1.2 Problem analysis
  • 2. Code and comments
    • 2.1 Code
    • 2.2 Plot results

1. Problem analysis

1.1 Problem description

Before working on this assignment please read these instructions fully. In the submission area, you will notice that you can click the link to Preview the Grading for each step of the assignment. This is the criteria that will be used for peer grading. Please familiarize yourself with the criteria before beginning the assignment.

An NOAA dataset has been stored in the file data/C2A2_data/BinnedCsvs_d400/fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89.csv. This is the dataset to use for this assignment. Note: The data for this assignment comes from a subset of The National Centers for Environmental Information (NCEI) Daily Global Historical Climatology Network (GHCN-Daily). The GHCN-Daily is comprised of daily climate records from thousands of land surface stations across the globe.

Each row in the assignment datafile corresponds to a single observation.

The following variables are provided to you:

  • id : station identification code
  • date : date in YYYY-MM-DD format (e.g. 2012-01-24 = January 24, 2012)
  • element : indicator of element type
    • TMAX : Maximum temperature (tenths of degrees C)
    • TMIN : Minimum temperature (tenths of degrees C)
  • value : data value for element (tenths of degrees C)

For this assignment, you must:

  1. Read the documentation and familiarize yourself with the dataset, then write some python code which returns a line graph of the record high and record low temperatures by day of the year over the period 2005-2014. The area between the record high and record low temperatures for each day should be shaded.
  2. Overlay a scatter of the 2015 data for any points (highs and lows) for which the ten year record (2005-2014) record high or record low was broken in 2015.
  3. Watch out for leap days (i.e. February 29th), it is reasonable to remove these points from the dataset for the purpose of this visualization.
  4. Make the visual nice! Leverage principles from the first module in this course when developing your solution. Consider issues such as legends, labels, and chart junk.

The data you have been given is near Ann Arbor, Michigan, United States, and the stations the data comes from are shown on the map below.

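1.2 Problem analysis

The assignment breaks down into four parts:

  1. Find the record high and record low for each calendar day over 2005-2014. Convert the dates with pd.to_datetime, split them into year, month and day, then group with groupby and aggregate with max and min. With one value per calendar day, draw both series as a line graph against the date and shade the area between the record high and the record low.
  2. Find the days in 2015 whose temperature went above the 2005-2014 record high or below the record low, and show them as a scatter plot on the same axes. np.where() can supply the index for plt.scatter(): called with only a condition, it returns a tuple of index arrays giving the positions where the condition is true.
  3. Remove the February 29 data.
  4. Polish the visualization and cut chart junk, for example by spending less ink on elements that carry no data.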
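The first three steps can be tried on a small example before touching the real file. The sketch below is only an illustration: the `toy` frame and its values are made up, and the names `record_high`, `high_2015` and `broken` are hypothetical. It shows the groupby on (month, day), the leap-day filter, and how `np.where` returns index positions.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two calendar days over three years, plus one leap day.
# Values are already in degrees C.
toy = pd.DataFrame({
    'Date': ['2013-01-01', '2014-01-01', '2015-01-01',
             '2013-01-02', '2014-01-02', '2015-01-02',
             '2012-02-29'],
    'Element': ['TMAX'] * 7,
    'value': [1.0, 3.0, 5.0, 4.0, 2.0, 3.5, 9.0],
})

# Split the date into year, month and day
dates = pd.to_datetime(toy['Date'])
toy['year'] = dates.dt.year
toy['month'] = dates.dt.month
toy['day'] = dates.dt.day

# Step 3: drop leap days so every year has the same calendar days
toy = toy[~((toy['month'] == 2) & (toy['day'] == 29))]

# Step 1: record high per calendar day over the baseline years
baseline = toy[toy['year'] <= 2014]
record_high = baseline.groupby(['month', 'day'])['value'].max()

# Step 2: the 2015 highs, aligned to the same (month, day) index
high_2015 = toy[toy['year'] == 2015].groupby(['month', 'day'])['value'].max()
high_2015 = high_2015.reindex(record_high.index)

# np.where(condition) returns a tuple of index arrays; [0] takes the positions
broken = np.where(high_2015.to_numpy() > record_high.to_numpy())[0]
print(broken)
```

Here only January 1 is flagged, because its 2015 value (5.0) exceeds the baseline record (3.0), while January 2 stays below its record (3.5 against 4.0). Reindexing before the comparison is a design choice of this sketch: a day missing from 2015 then becomes NaN and compares as False rather than raising an error.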

2. Code and comments

2.1 Code

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Jupyter magic for interactive figures; drop this line when running as a plain script
%matplotlib notebook

binsize = 400
hashid = 'fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89'
# Read the data (the columns used below are Date, Element and Data_Value)
df = pd.read_csv('data/C2A2_data/BinnedCsvs_d{}/{}.csv'.format(binsize, hashid))
# df = pd.read_csv('assignment2_data.csv')
# Convert from tenths of degrees C to degrees C
df['value'] = df['Data_Value'] / 10
# Split the date into year, month and day
dates = pd.to_datetime(df['Date'])
df['year'] = dates.dt.year
df['month'] = dates.dt.month
df['day'] = dates.dt.day
# Drop the February 29 observations
df = df[~((df['month'] == 2) & (df['day'] == 29))]
# Keep the 2005-2014 data as df_05_14
df_05_14 = df[(df['year'] >= 2005) & (df['year'] <= 2014)]
# Keep the 2015 data as df_15
df_15 = df[df['year'] == 2015]
# Record high and record low for each calendar day, 2005-2014
df_max_05_14 = df_05_14[df_05_14['Element'] == 'TMAX'].groupby(['month', 'day']).agg({'value': 'max'})
df_min_05_14 = df_05_14[df_05_14['Element'] == 'TMIN'].groupby(['month', 'day']).agg({'value': 'min'})
# Highest and lowest value for each calendar day in 2015
df_max_15 = df_15[df_15['Element'] == 'TMAX'].groupby(['month', 'day']).agg({'value': 'max'})
df_min_15 = df_15[df_15['Element'] == 'TMIN'].groupby(['month', 'day']).agg({'value': 'min'})
# Row positions of the days on which 2015 broke the 2005-2014 record.
# The comparison requires both frames to share the same (month, day) index.
broken_max = np.where(df_max_15 > df_max_05_14)[0]
broken_min = np.where(df_min_15 < df_min_05_14)[0]
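The code above stops once the record-breaking days have been found. One possible way to draw the figure is sketched below. It is not the original plotting code: the temperature arrays are synthetic stand-ins generated with NumPy, and the names `record_high`, `record_low`, `high_2015` and `low_2015` are hypothetical placeholders for the 365 per-day values computed above. The colors, labels and title are likewise illustrative choices.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend; remove inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for the per-day values (365 entries each, degrees C)
days = np.arange(365)
rng = np.random.default_rng(0)
seasonal = -15 * np.cos(2 * np.pi * days / 365)
record_high = seasonal + 25
record_low = seasonal - 5
high_2015 = record_high + rng.normal(-3, 2.5, 365)
low_2015 = record_low + rng.normal(3, 2.5, 365)

# Positions of the days on which 2015 broke a record
broken_max = np.where(high_2015 > record_high)[0]
broken_min = np.where(low_2015 < record_low)[0]

fig, ax = plt.subplots(figsize=(10, 5))
# Record lines and the shaded band between them
ax.plot(days, record_high, color='tab:red', linewidth=1, label='Record high 2005-2014')
ax.plot(days, record_low, color='tab:blue', linewidth=1, label='Record low 2005-2014')
ax.fill_between(days, record_low, record_high, color='grey', alpha=0.2)
# Only the 2015 points that broke a record are overlaid
ax.scatter(broken_max, high_2015[broken_max], s=12, color='darkred',
           label='2015 broke record high')
ax.scatter(broken_min, low_2015[broken_min], s=12, color='navy',
           label='2015 broke record low')
# Labels and legend; the top and right spines are hidden to cut chart junk
ax.set_xlabel('Day of year')
ax.set_ylabel('Temperature (°C)')
ax.set_title('Daily record temperatures (synthetic data)')
ax.legend(frameon=False)
for side in ('top', 'right'):
    ax.spines[side].set_visible(False)
```

In a notebook the figure is displayed directly; in a script, `fig.savefig(...)` writes it to a file. To use the real data, the synthetic arrays would be replaced by the `value` columns of the grouped frames, for example `df_max_05_14['value'].to_numpy()`.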

2.2 Plot results
