GitHub repo: https://github.com/guipsamora/pandas_exercises
Dataset: https://www.aliyundrive.com/s/Wk6L6R67rrn

1. Getting to Know the Data

1.1 Import the dataset and take a first look
import pandas as pd
chipotle = pd.read_csv('001_chipotle.tsv',sep='\t')
chipotle.head()
order_id quantity item_name choice_description item_price
0 1 1 Chips and Fresh Tomato Salsa NaN $2.39
1 1 1 Izze [Clementine] $3.39
2 1 1 Nantucket Nectar [Apple] $3.39
3 1 1 Chips and Tomatillo-Green Chili Salsa NaN $2.39
4 2 2 Chicken Bowl [Tomatillo-Red Chili Salsa (Hot), [Black Beans... $16.98
chipotle.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4622 entries, 0 to 4621
Data columns (total 5 columns):
 #   Column              Non-Null Count  Dtype 
---  ------              --------------  ----- 
 0   order_id            4622 non-null   int64 
 1   quantity            4622 non-null   int64 
 2   item_name           4622 non-null   object
 3   choice_description  3376 non-null   object
 4   item_price          4622 non-null   object
dtypes: int64(2), object(3)
memory usage: 180.7+ KB
chipotle.columns
Index(['order_id', 'quantity', 'item_name', 'choice_description','item_price'],dtype='object')
chipotle.index
RangeIndex(start=0, stop=4622, step=1)
1.2 Which item was ordered most often?
  • as_index=False keeps the default integer index instead of turning the group keys into the index
  • In the displayed result this keeps all column names on one row ("no merged cells"), which is usually easier to read
item_buy_count = chipotle[['item_name','quantity']].groupby('item_name').sum()['quantity'].nlargest(1)
item_buy_count
item_name
Chicken Bowl    761
Name: quantity, dtype: int64
  • Original answer
item_buy_count_1 = chipotle[['item_name','quantity']].groupby('item_name', as_index=False).agg({'quantity':'sum'})
item_buy_count_1.sort_values('quantity', ascending=False, inplace=True)
item_buy_count_1.head()
item_name quantity
17 Chicken Bowl 761
18 Chicken Burrito 591
25 Chips and Guacamole 506
39 Steak Burrito 386
10 Canned Soft Drink 351
1.3 How many different items were ordered?
  • nunique() counts distinct (non-duplicate) values
chipotle['item_name'].nunique()
50
# Quick bar chart of how often each item appears
from matplotlib import pyplot as plt

s = chipotle['item_name'].value_counts()
plt.figure(figsize=(20,4),dpi=100)
plt.xticks(fontsize=13, rotation='vertical')
plt.bar(s.index,s.values)
<BarContainer object of 50 artists>


1.4 Ranking of choice_description among ordered items
  • The exact intent of this question is unclear; below is a frequency count of the full description strings
chipotle['choice_description'].value_counts().head()
[Diet Coke]                                                                          134
[Coke]                                                                               123
[Sprite]                                                                              77
[Fresh Tomato Salsa, [Rice, Black Beans, Cheese, Sour Cream, Lettuce]]                42
[Fresh Tomato Salsa, [Rice, Black Beans, Cheese, Sour Cream, Guacamole, Lettuce]]     40
Name: choice_description, dtype: int64
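The intended ranking is ambiguous; one plausible reading is to rank the individual choices (e.g. "Diet Coke", "Black Beans") rather than the full description strings. A rough sketch under that assumption, relying on the bracketed, comma-separated format shown above (Series.explode needs pandas >= 0.25):
# Hypothetical variant: split each description into individual choices and count them
choices = (chipotle['choice_description']
           .dropna()
           .str.strip('[]')
           .str.split(', ')
           .explode()
           .str.strip('[] '))
choices.value_counts().head(10)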
1.5 How many items were ordered in total?
chipotle['quantity'].sum()
4972
1.6 Convert item_price to float
chipotle['item_price'] = chipotle['item_price'].str.lstrip('$').astype('float')
chipotle.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4622 entries, 0 to 4621
Data columns (total 5 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   order_id            4622 non-null   int64  
 1   quantity            4622 non-null   int64  
 2   item_name           4622 non-null   object 
 3   choice_description  3376 non-null   object 
 4   item_price          4622 non-null   float64
dtypes: float64(1), int64(2), object(2)
memory usage: 180.7+ KB
  • Original solution (the original notebook names the DataFrame chipo, so run here it raises a NameError)
chipotle['item_price'] = chipo['item_price'].apply(lambda x: float(x[1:-1]))
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_19644/772157282.py in <module>
----> 1 chipotle['item_price'] = chipo['item_price'].apply(lambda x: float(x[1:-1]))

NameError: name 'chipo' is not defined
1.7 Total revenue
chipotle['sub_total'] = (chipotle['item_price'] * chipotle['quantity'])
chipotle['sub_total'].sum()
39237.02
1.8 Total number of orders
chipotle['order_id'].nunique()
1834
1.9 Average revenue per order
chipotle['sub_total'].sum()/chipotle['order_id'].nunique()
21.39423118865867
  • Original answer
chipotle[['order_id','sub_total']].groupby(by=['order_id']).agg({'sub_total':'sum'})['sub_total'].mean()
21.39423118865867

2. Filtering and Sorting (Euro 2012 Data)

2.1 Import the data and take a first look
import pandas as pd
euro = pd.read_csv('Euro2012_stats.csv')
euro.head()
Team Goals Shots on target Shots off target Shooting Accuracy % Goals-to-shots Total shots (inc. Blocked) Hit Woodwork Penalty goals Penalties not scored ... Saves made Saves-to-shots ratio Fouls Won Fouls Conceded Offsides Yellow Cards Red Cards Subs on Subs off Players Used
0 Croatia 4 13 12 51.9% 16.0% 32 0 0 0 ... 13 81.3% 41 62 2 9 0 9 9 16
1 Czech Republic 4 13 18 41.9% 12.9% 39 0 0 0 ... 9 60.1% 53 73 8 7 0 11 11 19
2 Denmark 4 10 10 50.0% 20.0% 27 1 0 0 ... 10 66.7% 25 38 8 4 0 7 7 15
3 England 5 11 18 50.0% 17.2% 40 0 0 0 ... 22 88.1% 43 45 6 5 0 11 11 16
4 France 3 22 24 37.9% 6.5% 65 1 0 0 ... 6 54.6% 36 51 5 6 0 11 11 19

5 rows × 35 columns

euro.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 35 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   Team                        16 non-null     object 
 1   Goals                       16 non-null     int64  
 2   Shots on target             16 non-null     int64  
 3   Shots off target            16 non-null     int64  
 4   Shooting Accuracy           16 non-null     object 
 5   % Goals-to-shots            16 non-null     object 
 6   Total shots (inc. Blocked)  16 non-null     int64  
 7   Hit Woodwork                16 non-null     int64  
 8   Penalty goals               16 non-null     int64  
 9   Penalties not scored        16 non-null     int64  
 10  Headed goals                16 non-null     int64  
 11  Passes                      16 non-null     int64  
 12  Passes completed            16 non-null     int64  
 13  Passing Accuracy            16 non-null     object 
 14  Touches                     16 non-null     int64  
 15  Crosses                     16 non-null     int64  
 16  Dribbles                    16 non-null     int64  
 17  Corners Taken               16 non-null     int64  
 18  Tackles                     16 non-null     int64  
 19  Clearances                  16 non-null     int64  
 20  Interceptions               16 non-null     int64  
 21  Clearances off line         15 non-null     float64
 22  Clean Sheets                16 non-null     int64  
 23  Blocks                      16 non-null     int64  
 24  Goals conceded              16 non-null     int64  
 25  Saves made                  16 non-null     int64  
 26  Saves-to-shots ratio        16 non-null     object 
 27  Fouls Won                   16 non-null     int64  
 28  Fouls Conceded              16 non-null     int64  
 29  Offsides                    16 non-null     int64  
 30  Yellow Cards                16 non-null     int64  
 31  Red Cards                   16 non-null     int64  
 32  Subs on                     16 non-null     int64  
 33  Subs off                    16 non-null     int64  
 34  Players Used                16 non-null     int64  
dtypes: float64(1), int64(29), object(5)
memory usage: 4.5+ KB
2.2 How many teams took part in Euro 2012?
euro['Team'].nunique()
16
2.3 Store the columns Team, Yellow Cards and Red Cards in a separate DataFrame named discripline
discripline = euro[['Team','Yellow Cards', 'Red Cards']]
discripline
Team Yellow Cards Red Cards
0 Croatia 9 0
1 Czech Republic 7 0
2 Denmark 4 0
3 England 5 0
4 France 6 0
5 Germany 4 0
6 Greece 9 1
7 Italy 16 0
8 Netherlands 5 0
9 Poland 7 1
10 Portugal 12 0
11 Republic of Ireland 6 1
12 Russia 6 0
13 Spain 11 0
14 Sweden 7 0
15 Ukraine 5 0
2.4 Sort discripline by Yellow Cards and Red Cards
discripline.sort_values(['Yellow Cards', 'Red Cards'], ascending=[False, False], inplace=True)
discripline
C:\Users\LiCN\AppData\Local\Temp/ipykernel_19644/3135130171.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  discripline.sort_values(['Yellow Cards', 'Red Cards'], ascending=[False, False], inplace=True)
Team Yellow Cards Red Cards
7 Italy 16 0
10 Portugal 12 0
13 Spain 11 0
6 Greece 9 1
0 Croatia 9 0
9 Poland 7 1
1 Czech Republic 7 0
14 Sweden 7 0
11 Republic of Ireland 6 1
4 France 6 0
12 Russia 6 0
3 England 5 0
8 Netherlands 5 0
15 Ukraine 5 0
2 Denmark 4 0
5 Germany 4 0
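The SettingWithCopyWarning above appears because discripline was created as a slice of euro. A small sketch of the usual fix, taking an explicit copy before sorting in place:
# Build discripline as an independent copy so in-place sorting does not warn
discripline = euro[['Team', 'Yellow Cards', 'Red Cards']].copy()
discripline.sort_values(['Yellow Cards', 'Red Cards'], ascending=False, inplace=True)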
2.5 Average number of yellow cards per team
discripline['Yellow Cards'].mean()
7.4375
2.6 Teams with more than 6 yellow cards
(The original exercise asks for teams that scored more than 6 goals, i.e. euro[euro['Goals'] > 6]; the code below filters on yellow cards instead.)
discripline[discripline['Yellow Cards']>6]
# Note: attribute access (discripline.Yellow Cards) does not work because the column name contains a space
Team Yellow Cards Red Cards
7 Italy 16 0
10 Portugal 12 0
13 Spain 11 0
6 Greece 9 1
0 Croatia 9 0
9 Poland 7 1
1 Czech Republic 7 0
14 Sweden 7 0
2.7 Select teams whose name starts with the letter G
euro[euro['Team'].str.startswith('G')]
Team Goals Shots on target Shots off target Shooting Accuracy % Goals-to-shots Total shots (inc. Blocked) Hit Woodwork Penalty goals Penalties not scored ... Saves made Saves-to-shots ratio Fouls Won Fouls Conceded Offsides Yellow Cards Red Cards Subs on Subs off Players Used
5 Germany 10 32 32 47.8% 15.6% 80 2 1 0 ... 10 62.6% 63 49 12 4 0 15 15 17
6 Greece 5 8 18 30.7% 19.2% 32 1 1 1 ... 13 65.1% 67 48 12 9 1 12 12 20

2 rows × 35 columns

2.8 Select the first 7 columns
euro.iloc[:,0:7]
Team Goals Shots on target Shots off target Shooting Accuracy % Goals-to-shots Total shots (inc. Blocked)
0 Croatia 4 13 12 51.9% 16.0% 32
1 Czech Republic 4 13 18 41.9% 12.9% 39
2 Denmark 4 10 10 50.0% 20.0% 27
3 England 5 11 18 50.0% 17.2% 40
4 France 3 22 24 37.9% 6.5% 65
5 Germany 10 32 32 47.8% 15.6% 80
6 Greece 5 8 18 30.7% 19.2% 32
7 Italy 6 34 45 43.0% 7.5% 110
8 Netherlands 2 12 36 25.0% 4.1% 60
9 Poland 2 15 23 39.4% 5.2% 48
10 Portugal 6 22 42 34.3% 9.3% 82
11 Republic of Ireland 1 7 12 36.8% 5.2% 28
12 Russia 5 9 31 22.5% 12.5% 59
13 Spain 12 42 33 55.9% 16.0% 100
14 Sweden 5 17 19 47.2% 13.8% 39
15 Ukraine 2 7 26 21.2% 6.0% 38
2.9 Select all columns except the last 3
euro.iloc[:,:-3]
Team Goals Shots on target Shots off target Shooting Accuracy % Goals-to-shots Total shots (inc. Blocked) Hit Woodwork Penalty goals Penalties not scored ... Clean Sheets Blocks Goals conceded Saves made Saves-to-shots ratio Fouls Won Fouls Conceded Offsides Yellow Cards Red Cards
0 Croatia 4 13 12 51.9% 16.0% 32 0 0 0 ... 0 10 3 13 81.3% 41 62 2 9 0
1 Czech Republic 4 13 18 41.9% 12.9% 39 0 0 0 ... 1 10 6 9 60.1% 53 73 8 7 0
2 Denmark 4 10 10 50.0% 20.0% 27 1 0 0 ... 1 10 5 10 66.7% 25 38 8 4 0
3 England 5 11 18 50.0% 17.2% 40 0 0 0 ... 2 29 3 22 88.1% 43 45 6 5 0
4 France 3 22 24 37.9% 6.5% 65 1 0 0 ... 1 7 5 6 54.6% 36 51 5 6 0
5 Germany 10 32 32 47.8% 15.6% 80 2 1 0 ... 1 11 6 10 62.6% 63 49 12 4 0
6 Greece 5 8 18 30.7% 19.2% 32 1 1 1 ... 1 23 7 13 65.1% 67 48 12 9 1
7 Italy 6 34 45 43.0% 7.5% 110 2 0 0 ... 2 18 7 20 74.1% 101 89 16 16 0
8 Netherlands 2 12 36 25.0% 4.1% 60 2 0 0 ... 0 9 5 12 70.6% 35 30 3 5 0
9 Poland 2 15 23 39.4% 5.2% 48 0 0 0 ... 0 8 3 6 66.7% 48 56 3 7 1
10 Portugal 6 22 42 34.3% 9.3% 82 6 0 0 ... 2 11 4 10 71.5% 73 90 10 12 0
11 Republic of Ireland 1 7 12 36.8% 5.2% 28 0 0 0 ... 0 23 9 17 65.4% 43 51 11 6 1
12 Russia 5 9 31 22.5% 12.5% 59 2 0 0 ... 0 8 3 10 77.0% 34 43 4 6 0
13 Spain 12 42 33 55.9% 16.0% 100 0 1 0 ... 5 8 1 15 93.8% 102 83 19 11 0
14 Sweden 5 17 19 47.2% 13.8% 39 3 0 0 ... 1 12 5 8 61.6% 35 51 7 7 0
15 Ukraine 2 7 26 21.2% 6.0% 38 0 0 0 ... 0 4 4 13 76.5% 48 31 4 5 0

16 rows × 32 columns

2.10 Find the Shooting Accuracy of England, Italy and Russia
euro[euro['Team'].isin(['England', 'Italy', 'Russia'])][['Team','Shooting Accuracy']]
Team Shooting Accuracy
3 England 50.0%
7 Italy 43.0%
12 Russia 22.5%
  • Original solution
euro.loc[euro['Team'].isin(['England', 'Italy', 'Russia']), ['Team','Shooting Accuracy']]
Team Shooting Accuracy
3 England 50.0%
7 Italy 43.0%
12 Russia 22.5%

3. Grouping: Exploring Alcohol Consumption Data

import pandas as pd
drinks = pd.read_csv('drinks.csv')
drinks.head()
country beer_servings spirit_servings wine_servings total_litres_of_pure_alcohol continent
0 Afghanistan 0 0 0 0.0 AS
1 Albania 89 132 54 4.9 EU
2 Algeria 25 0 14 0.7 AF
3 Andorra 245 138 312 12.4 EU
4 Angola 217 57 45 5.9 AF
drinks.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 193 entries, 0 to 192
Data columns (total 6 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   country                       193 non-null    object 
 1   beer_servings                 193 non-null    int64  
 2   spirit_servings               193 non-null    int64  
 3   wine_servings                 193 non-null    int64  
 4   total_litres_of_pure_alcohol  193 non-null    float64
 5   continent                     170 non-null    object 
dtypes: float64(1), int64(3), object(2)
memory usage: 9.2+ KB
3.1 Which continent drinks the most beer on average?
drinks.groupby('continent')['beer_servings'].mean().sort_values(ascending=False).head(1)
continent
EU    193.777778
Name: beer_servings, dtype: float64
3.2 Descriptive statistics of wine consumption for each continent
drinks.groupby('continent')['wine_servings'].describe()
count mean std min 25% 50% 75% max
continent
AF 53.0 16.264151 38.846419 0.0 1.0 2.0 13.00 233.0
AS 44.0 9.068182 21.667034 0.0 0.0 1.0 8.00 123.0
EU 45.0 142.222222 97.421738 0.0 59.0 128.0 195.00 370.0
OC 16.0 35.625000 64.555790 0.0 1.0 8.5 23.25 212.0
SA 12.0 62.416667 88.620189 1.0 3.0 12.0 98.50 221.0
3.3 Mean consumption of each drink type for each continent
drinks.groupby('continent').mean()
beer_servings spirit_servings wine_servings total_litres_of_pure_alcohol
continent
AF 61.471698 16.339623 16.264151 3.007547
AS 37.045455 60.840909 9.068182 2.170455
EU 193.777778 132.555556 142.222222 8.617778
OC 89.687500 58.437500 35.625000 3.381250
SA 175.083333 114.750000 62.416667 6.308333
3.4 Median consumption of each drink type for each continent
drinks.groupby('continent').median()
beer_servings spirit_servings wine_servings total_litres_of_pure_alcohol
continent
AF 32.0 3.0 2.0 2.30
AS 17.5 16.0 1.0 1.20
EU 219.0 122.0 128.0 10.00
OC 52.5 37.0 8.5 1.75
SA 162.5 108.5 12.0 6.85
3.5 Mean, max and min spirit consumption for each continent
drinks.groupby('continent')['spirit_servings'].agg(['mean','max','min'])
mean max min
continent
AF 16.339623 152 0
AS 60.840909 326 0
EU 132.555556 373 0
OC 58.437500 254 0
SA 114.750000 302 25
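For reference, a sketch of the same aggregation using named aggregation (available since pandas 0.25), which also gives the result columns explicit names; spirit_mean/spirit_max/spirit_min are just illustrative names:
drinks.groupby('continent').agg(
    spirit_mean=('spirit_servings', 'mean'),
    spirit_max=('spirit_servings', 'max'),
    spirit_min=('spirit_servings', 'min'),
)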

4. The apply Function (US Crime Data, 1960-2014)

4.1 Import the data and take a first look
import pandas as pd
crime = pd.read_csv('US_Crime_Rates_1960_2014.csv')
crime.head()
Year Population Total Violent Property Murder Forcible_Rape Robbery Aggravated_assault Burglary Larceny_Theft Vehicle_Theft
0 1960 179323175 3384200 288460 3095700 9110 17190 107840 154320 912100 1855400 328200
1 1961 182992000 3488000 289390 3198600 8740 17220 106670 156760 949600 1913000 336000
2 1962 185771000 3752200 301510 3450700 8530 17550 110860 164570 994300 2089600 366800
3 1963 188483000 4109500 316970 3792500 8640 17650 116470 174210 1086400 2297800 408300
4 1964 191141000 4564600 364220 4200400 9360 21420 130390 203050 1213200 2514400 472800
crime.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55 entries, 0 to 54
Data columns (total 12 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   Year                55 non-null     int64
 1   Population          55 non-null     int64
 2   Total               55 non-null     int64
 3   Violent             55 non-null     int64
 4   Property            55 non-null     int64
 5   Murder              55 non-null     int64
 6   Forcible_Rape       55 non-null     int64
 7   Robbery             55 non-null     int64
 8   Aggravated_assault  55 non-null     int64
 9   Burglary            55 non-null     int64
 10  Larceny_Theft       55 non-null     int64
 11  Vehicle_Theft       55 non-null     int64
dtypes: int64(12)
memory usage: 5.3 KB
4.2 Convert Year to datetime
crime['Year'] = pd.to_datetime(crime['Year'], format='%Y')
crime.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55 entries, 0 to 54
Data columns (total 12 columns):
 #   Column              Non-Null Count  Dtype         
---  ------              --------------  -----         
 0   Year                55 non-null     datetime64[ns]
 1   Population          55 non-null     int64         
 2   Total               55 non-null     int64         
 3   Violent             55 non-null     int64         
 4   Property            55 non-null     int64         
 5   Murder              55 non-null     int64         
 6   Forcible_Rape       55 non-null     int64         
 7   Robbery             55 non-null     int64         
 8   Aggravated_assault  55 non-null     int64         
 9   Burglary            55 non-null     int64         
 10  Larceny_Theft       55 non-null     int64         
 11  Vehicle_Theft       55 non-null     int64         
dtypes: datetime64[ns](1), int64(11)
memory usage: 5.3 KB
4.3 Set Year as the index
# set_index drops the column by default (drop=True), so Year does not stay around as a separate column
crime.set_index('Year', inplace=True)
crime.head()
Population Total Violent Property Murder Forcible_Rape Robbery Aggravated_assault Burglary Larceny_Theft Vehicle_Theft
Year
1960-01-01 179323175 3384200 288460 3095700 9110 17190 107840 154320 912100 1855400 328200
1961-01-01 182992000 3488000 289390 3198600 8740 17220 106670 156760 949600 1913000 336000
1962-01-01 185771000 3752200 301510 3450700 8530 17550 110860 164570 994300 2089600 366800
1963-01-01 188483000 4109500 316970 3792500 8640 17650 116470 174210 1086400 2297800 408300
1964-01-01 191141000 4564600 364220 4200400 9360 21420 130390 203050 1213200 2514400 472800
4.4 Delete the column named Total
del crime['Total']
4.5 Resample the data by decade (resample is very important)
  • resample re-aggregates a regular time series at a new frequency; it is the convenient way to downsample or upsample time-indexed data.
  • The argument is a frequency rule. Common aliases: A year, M month, W week, D day, H hour, T minute, S second.
  • Here the data are grouped into ten-year bins.
crime_10Ys = crime.resample('10As').sum()
crime_10Ys
Population Violent Property Murder Forcible_Rape Robbery Aggravated_assault Burglary Larceny_Theft Vehicle_Theft
Year
1960-01-01 1915053175 4134930 45160900 106180 236720 1633510 2158520 13321100 26547700 5292100
1970-01-01 2121193298 9607930 91383800 192230 554570 4159020 4702120 28486000 53157800 9739900
1980-01-01 2371370069 14074328 117048900 206439 865639 5383109 7619130 33073494 72040253 11935411
1990-01-01 2612825258 17527048 119053499 211664 998827 5748930 10568963 26750015 77679366 14624418
2000-01-01 2947969117 13968056 100944369 163068 922499 4230366 8652124 21565176 67970291 11412834
2010-01-01 1570146307 6072017 44095950 72867 421059 1749809 3764142 10125170 30401698 3569080
  • Replace the resampled Population with the maximum within each decade; summing population across a decade is meaningless.
  • In other words, use the ten-year maximum instead of the ten-year sum.
crime_10Ys['Population'] = crime['Population'].resample('10As').max()
crime_10Ys.head()
Population Violent Property Murder Forcible_Rape Robbery Aggravated_assault Burglary Larceny_Theft Vehicle_Theft
Year
1960-01-01 201385000 4134930 45160900 106180 236720 1633510 2158520 13321100 26547700 5292100
1970-01-01 220099000 9607930 91383800 192230 554570 4159020 4702120 28486000 53157800 9739900
1980-01-01 248239000 14074328 117048900 206439 865639 5383109 7619130 33073494 72040253 11935411
1990-01-01 272690813 17527048 119053499 211664 998827 5748930 10568963 26750015 77679366 14624418
2000-01-01 307006550 13968056 100944369 163068 922499 4230366 8652124 21565176 67970291 11412834
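An equivalent one-pass alternative (a sketch) is to pass a per-column aggregation dict to resample(...).agg(...), taking the decade maximum for Population and the sum for everything else, instead of overwriting Population afterwards:
# Sum the crime counts but take the max Population per decade, in one call
agg_spec = {col: 'sum' for col in crime.columns}
agg_spec['Population'] = 'max'
crime_10Ys = crime.resample('10AS').agg(agg_spec)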
4.6 When was the most dangerous time to live in the US?
  • idxmax() returns the index label of the maximum value
crime
Population Violent Property Murder Forcible_Rape Robbery Aggravated_assault Burglary Larceny_Theft Vehicle_Theft
Year
1960-01-01 179323175 288460 3095700 9110 17190 107840 154320 912100 1855400 328200
1961-01-01 182992000 289390 3198600 8740 17220 106670 156760 949600 1913000 336000
1962-01-01 185771000 301510 3450700 8530 17550 110860 164570 994300 2089600 366800
1963-01-01 188483000 316970 3792500 8640 17650 116470 174210 1086400 2297800 408300
1964-01-01 191141000 364220 4200400 9360 21420 130390 203050 1213200 2514400 472800
1965-01-01 193526000 387390 4352000 9960 23410 138690 215330 1282500 2572600 496900
1966-01-01 195576000 430180 4793300 11040 25820 157990 235330 1410100 2822000 561200
1967-01-01 197457000 499930 5403500 12240 27620 202910 257160 1632100 3111600 659800
1968-01-01 199399000 595010 6125200 13800 31670 262840 286700 1858900 3482700 783600
1969-01-01 201385000 661870 6749000 14760 37170 298850 311090 1981900 3888600 878500
1970-01-01 203235298 738820 7359200 16000 37990 349860 334970 2205000 4225800 928400
1971-01-01 206212000 816500 7771700 17780 42260 387700 368760 2399300 4424200 948200
1972-01-01 208230000 834900 7413900 18670 46850 376290 393090 2375500 4151200 887200
1973-01-01 209851000 875910 7842200 19640 51400 384220 420650 2565500 4347900 928800
1974-01-01 211392000 974720 9278700 20710 55400 442400 456210 3039200 5262500 977100
1975-01-01 213124000 1039710 10252700 20510 56090 470500 492620 3265300 5977700 1009600
1976-01-01 214659000 1004210 10345500 18780 57080 427810 500530 3108700 6270800 966000
1977-01-01 216332000 1029580 9955000 19120 63500 412610 534350 3071500 5905700 977700
1978-01-01 218059000 1085550 10123400 19560 67610 426930 571460 3128300 5991000 1004100
1979-01-01 220099000 1208030 11041500 21460 76390 480700 629480 3327700 6601000 1112800
1980-01-01 225349264 1344520 12063700 23040 82990 565840 672650 3795200 7136900 1131700
1981-01-01 229146000 1361820 12061900 22520 82500 592910 663900 3779700 7194400 1087800
1982-01-01 231534000 1322390 11652000 21010 78770 553130 669480 3447100 7142500 1062400
1983-01-01 233981000 1258090 10850500 19310 78920 506570 653290 3129900 6712800 1007900
1984-01-01 236158000 1273280 10608500 18690 84230 485010 685350 2984400 6591900 1032200
1985-01-01 238740000 1328800 11102600 18980 88670 497870 723250 3073300 6926400 1102900
1986-01-01 240132887 1489169 11722700 20613 91459 542775 834322 3241410 7257153 1224137
1987-01-01 242282918 1483999 12024700 20096 91110 517704 855088 3236184 7499900 1288674
1988-01-01 245807000 1566220 12356900 20680 92490 542970 910090 3218100 7705900 1432900
1989-01-01 248239000 1646040 12605400 21500 94500 578330 951710 3168200 7872400 1564800
1990-01-01 248709873 1820130 12655500 23440 102560 639270 1054860 3073900 7945700 1635900
1991-01-01 252177000 1911770 12961100 24700 106590 687730 1092740 3157200 8142200 1661700
1992-01-01 255082000 1932270 12505900 23760 109060 672480 1126970 2979900 7915200 1610800
1993-01-01 257908000 1926020 12218800 24530 106010 659870 1135610 2834800 7820900 1563100
1994-01-01 260341000 1857670 12131900 23330 102220 618950 1113180 2712800 7879800 1539300
1995-01-01 262755000 1798790 12063900 21610 97470 580510 1099210 2593800 7997700 1472400
1996-01-01 265228572 1688540 11805300 19650 96250 535590 1037050 2506400 7904700 1394200
1997-01-01 267637000 1634770 11558175 18208 96153 498534 1023201 2460526 7743760 1354189
1998-01-01 270296000 1531044 10944590 16914 93103 446625 974402 2329950 7373886 1240754
1999-01-01 272690813 1426044 10208334 15522 89411 409371 911740 2100739 6955520 1152075
2000-01-01 281421906 1425486 10182586 15586 90178 408016 911706 2050992 6971590 1160002
2001-01-01 285317559 1439480 10437480 16037 90863 423557 909023 2116531 7092267 1228391
2002-01-01 287973924 1423677 10455277 16229 95235 420806 891407 2151252 7057370 1246646
2003-01-01 290690788 1383676 10442862 16528 93883 414235 859030 2154834 7026802 1261226
2004-01-01 293656842 1360088 10319386 16148 95089 401470 847381 2144446 6937089 1237851
2005-01-01 296507061 1390745 10174754 16740 94347 417438 862220 2155448 6783447 1235859
2006-01-01 299398484 1418043 9983568 17030 92757 447403 860853 2183746 6607013 1192809
2007-01-01 301621157 1408337 9843481 16929 90427 445125 855856 2176140 6568572 1095769
2008-01-01 304374846 1392628 9767915 16442 90479 443574 842134 2228474 6588046 958629
2009-01-01 307006550 1325896 9337060 15399 89241 408742 812514 2203313 6338095 795652
2010-01-01 309330219 1251248 9112625 14772 85593 369089 781844 2168457 6204601 739565
2011-01-01 311587816 1206031 9052743 14661 84175 354772 752423 2185140 6151095 716508
2012-01-01 313873685 1217067 9001992 14866 85141 355051 762009 2109932 6168874 723186
2013-01-01 316497531 1199684 8650761 14319 82109 345095 726575 1931835 6018632 700294
2014-01-01 318857056 1197987 8277829 14249 84041 325802 741291 1729806 5858496 689527
crime.idxmax()
Population           2014-01-01
Violent              1992-01-01
Property             1991-01-01
Murder               1991-01-01
Forcible_Rape        1992-01-01
Robbery              1991-01-01
Aggravated_assault   1993-01-01
Burglary             1980-01-01
Larceny_Theft        1991-01-01
Vehicle_Theft        1991-01-01
dtype: datetime64[ns]
  • The maxima of most crime categories cluster in 1991-1993. Since the population grows steadily over the whole period (from about 180 million in 1960 to about 320 million in 2014), this means the crime rate in those years was higher than in other periods (though it does not prove the whole 1990s were the most dangerous decade). See the per-capita sketch below.
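As a quick check on the rate claim, the raw counts can be normalised by population before taking idxmax; a small sketch using the columns above:
# Year with the highest violent-crime rate per capita (rather than the highest raw count)
violent_rate = crime['Violent'] / crime['Population']
violent_rate.idxmax()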

5. Merging (Exploring Fictitious Name Data)

5.1 Build the sample data
import numpy as np
import pandas as pd

raw_data_1 = {'subject_id': ['1', '2', '3', '4', '5'],
              'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
              'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']}
raw_data_2 = {'subject_id': ['4', '5', '6', '7', '8'],
              'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
              'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']}
raw_data_3 = {'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
              'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]}

data_1 = pd.DataFrame(raw_data_1)
data_2 = pd.DataFrame(raw_data_2)
data_3 = pd.DataFrame(raw_data_3)
print(data_1)
print(data_2)
print(data_3)
  subject_id first_name last_name
0          1       Alex  Anderson
1          2        Amy  Ackerman
2          3      Allen       Ali
3          4      Alice      Aoni
4          5     Ayoung   Atiches
  subject_id first_name last_name
0          4      Billy    Bonder
1          5      Brian     Black
2          6       Bran   Balwner
3          7      Bryce     Brice
4          8      Betty    Btisan
  subject_id  test_id
0          1       51
1          2       15
2          3       15
3          4       61
4          5       16
5          7       14
6          8       15
7          9        1
8         10       61
9         11       16
5.2 Concatenate vertically (stack the rows), axis=0
all_data = pd.concat([data_1,data_2], ignore_index=True)
all_data
subject_id first_name last_name
0 1 Alex Anderson
1 2 Amy Ackerman
2 3 Allen Ali
3 4 Alice Aoni
4 5 Ayoung Atiches
5 4 Billy Bonder
6 5 Brian Black
7 6 Bran Balwner
8 7 Bryce Brice
9 8 Betty Btisan
5.3 Concatenate side by side, axis=1
by_col = pd.concat([data_1,data_2], axis=1)
by_col
subject_id first_name last_name subject_id first_name last_name
0 1 Alex Anderson 4 Billy Bonder
1 2 Amy Ackerman 5 Brian Black
2 3 Allen Ali 6 Bran Balwner
3 4 Alice Aoni 7 Bryce Brice
4 5 Ayoung Atiches 8 Betty Btisan
5.4 Merge all_data with data_3 on subject_id
all_data = pd.merge(all_data,data_3,on='subject_id')
all_data
subject_id first_name last_name test_id
0 1 Alex Anderson 51
1 2 Amy Ackerman 15
2 3 Allen Ali 15
3 4 Alice Aoni 61
4 4 Billy Bonder 61
5 5 Ayoung Atiches 16
6 5 Brian Black 16
7 7 Bryce Brice 14
8 8 Betty Btisan 15
5.5 Inner join of data_1 and data_2 on subject_id
inner_join = pd.merge(data_1,data_2, on='subject_id', how='inner')
inner_join
subject_id first_name_x last_name_x first_name_y last_name_y
0 4 Alice Aoni Billy Bonder
1 5 Ayoung Atiches Brian Black
5.6 Outer join of data_1 and data_2 on subject_id
outer_join = pd.merge(data_1,data_2, on='subject_id', how='outer')
outer_join
subject_id first_name_x last_name_x first_name_y last_name_y
0 1 Alex Anderson NaN NaN
1 2 Amy Ackerman NaN NaN
2 3 Allen Ali NaN NaN
3 4 Alice Aoni Billy Bonder
4 5 Ayoung Atiches Brian Black
5 6 NaN NaN Bran Balwner
6 7 NaN NaN Bryce Brice
7 8 NaN NaN Betty Btisan
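For outer joins it is often useful to see which side each row came from; merge supports an indicator flag for this (a small sketch):
# indicator=True adds a _merge column: 'left_only', 'right_only' or 'both'
pd.merge(data_1, data_2, on='subject_id', how='outer', indicator=True)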

6. Statistics: Exploring Wind Speed Data

6.1 Import the data
import pandas as pd
import datetime

wind = pd.read_table('wind.data', sep='\s+', parse_dates=[[0, 1, 2]])
wind.head()
Yr_Mo_Dy RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL
0 2061-01-01 15.04 14.96 13.17 9.29 NaN 9.87 13.67 10.25 10.83 12.58 18.50 15.04
1 2061-01-02 14.71 NaN 10.83 6.50 12.62 7.67 11.50 10.04 9.79 9.67 17.54 13.83
2 2061-01-03 18.50 16.88 12.33 10.13 11.17 6.17 11.25 NaN 8.50 7.67 12.75 12.71
3 2061-01-04 10.58 6.63 11.75 4.58 4.54 2.88 8.63 1.79 5.83 5.88 5.46 10.88
4 2061-01-05 13.33 13.25 11.42 6.17 10.71 8.21 11.92 6.54 10.92 10.34 12.92 11.83
wind.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6574 entries, 0 to 6573
Data columns (total 13 columns):
 #   Column    Non-Null Count  Dtype         
---  ------    --------------  -----         
 0   Yr_Mo_Dy  6574 non-null   datetime64[ns]
 1   RPT       6568 non-null   float64       
 2   VAL       6571 non-null   float64       
 3   ROS       6572 non-null   float64       
 4   KIL       6569 non-null   float64       
 5   SHA       6572 non-null   float64       
 6   BIR       6574 non-null   float64       
 7   DUB       6571 non-null   float64       
 8   CLA       6572 non-null   float64       
 9   MUL       6571 non-null   float64       
 10  CLO       6573 non-null   float64       
 11  BEL       6574 non-null   float64       
 12  MAL       6570 non-null   float64       
dtypes: datetime64[ns](1), float64(12)
memory usage: 667.8 KB
6.2 Years such as 2061 are bad data; try to fix them
  • The original answer omits the outer pd.to_datetime(), in which case wind['Yr_Mo_Dy'] is no longer of datetime dtype after applying the function

    • For a single element you can use x.year directly
def fix_year(x):
    year = x.year - 100 if x.year > 1978 else x.year
    return pd.to_datetime(datetime.date(year, x.month, x.day), format='%Y/%m/%d')

wind['Yr_Mo_Dy'] = wind['Yr_Mo_Dy'].apply(fix_year)
wind['Yr_Mo_Dy'].dt.year.unique()
array([1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971,1972, 1973, 1974, 1975, 1976, 1977, 1978], dtype=int64)
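An apply-free alternative for the same fix (a sketch): select the bad rows with a boolean mask and shift them back a century with a DateOffset.
# Shift any year after 1978 back by 100 years, keeping the datetime64 dtype
mask = wind['Yr_Mo_Dy'].dt.year > 1978
wind.loc[mask, 'Yr_Mo_Dy'] = wind.loc[mask, 'Yr_Mo_Dy'] - pd.DateOffset(years=100)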
6.3 Set the date as the index; note the dtype should be datetime64
wind.set_index('Yr_Mo_Dy', inplace=True)
wind.head()
RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL
Yr_Mo_Dy
1961-01-01 15.04 14.96 13.17 9.29 NaN 9.87 13.67 10.25 10.83 12.58 18.50 15.04
1961-01-02 14.71 NaN 10.83 6.50 12.62 7.67 11.50 10.04 9.79 9.67 17.54 13.83
1961-01-03 18.50 16.88 12.33 10.13 11.17 6.17 11.25 NaN 8.50 7.67 12.75 12.71
1961-01-04 10.58 6.63 11.75 4.58 4.54 2.88 8.63 1.79 5.83 5.88 5.46 10.88
1961-01-05 13.33 13.25 11.42 6.17 10.71 8.21 11.92 6.54 10.92 10.34 12.92 11.83
6.4 How many values are missing for each location?
wind.isna().sum()
RPT    6
VAL    3
ROS    2
KIL    5
SHA    2
BIR    0
DUB    3
CLA    2
MUL    3
CLO    1
BEL    0
MAL    4
dtype: int64
6.5 How many complete (non-missing) values are there for each location?
  • Both notna() and the info() output show the non-null counts
wind.notna().sum()
RPT    6568
VAL    6571
ROS    6572
KIL    6569
SHA    6572
BIR    6574
DUB    6571
CLA    6572
MUL    6571
CLO    6573
BEL    6574
MAL    6570
dtype: int64
# Original answer
wind.shape[0] - wind.isnull().sum()
RPT    6568
VAL    6571
ROS    6572
KIL    6569
SHA    6572
BIR    6574
DUB    6571
CLA    6572
MUL    6571
CLO    6573
BEL    6574
MAL    6570
dtype: int64
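DataFrame.count() returns the same per-column non-null totals directly:
# Non-null observations per location in one call
wind.count()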
6.6 Compute the mean wind speed over the whole dataset
wind.mean().mean()
10.227982360836938
6.7 Create a DataFrame with the min, max, mean and standard deviation of the wind speed for each location
  • describe() computes its statistics column by column
wind_des = wind.describe().T.iloc[:,[3,7,1,2]]
wind_des
min max mean std
RPT 0.67 35.80 12.362987 5.618413
VAL 0.21 33.37 10.644314 5.267356
ROS 1.50 33.84 11.660526 5.008450
KIL 0.00 28.46 6.306468 3.605811
SHA 0.13 37.54 10.455834 4.936125
BIR 0.00 26.16 7.092254 3.968683
DUB 0.00 30.37 9.797343 4.977555
CLA 0.00 31.08 8.495053 4.499449
MUL 0.00 25.88 8.493590 4.166872
CLO 0.04 28.21 8.707332 4.503954
BEL 0.13 42.38 13.121007 5.835037
MAL 0.67 42.54 15.599079 6.699794
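An alternative that avoids picking describe() columns by position (a sketch): aggregate the four statistics directly and transpose so each location is a row, matching wind_des above.
loc_stats = wind.agg(['min', 'max', 'mean', 'std']).T
loc_stats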
# Per-day statistics (axis=1): min/max/mean/std across locations for each day, which answers a different question than the per-location stats above
day_stats = pd.DataFrame()
day_stats['min'] = wind.min(axis=1)
day_stats['max'] = wind.max(axis=1)
day_stats['mean'] = wind.mean(axis=1)
day_stats['std'] = wind.std(axis=1)
day_stats.head()
6.8 Compute the mean wind speed in January for each location
  • Note: following the original author, January 1961, January 1962, etc. are all treated as "January"

  • Note the query() usage (see the sketch after the two solutions below)

wind['date'] = wind.index
wind.head()
RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL date
Yr_Mo_Dy
1961-01-01 15.04 14.96 13.17 9.29 NaN 9.87 13.67 10.25 10.83 12.58 18.50 15.04 1961-01-01
1961-01-02 14.71 NaN 10.83 6.50 12.62 7.67 11.50 10.04 9.79 9.67 17.54 13.83 1961-01-02
1961-01-03 18.50 16.88 12.33 10.13 11.17 6.17 11.25 NaN 8.50 7.67 12.75 12.71 1961-01-03
1961-01-04 10.58 6.63 11.75 4.58 4.54 2.88 8.63 1.79 5.83 5.88 5.46 10.88 1961-01-04
1961-01-05 13.33 13.25 11.42 6.17 10.71 8.21 11.92 6.54 10.92 10.34 12.92 11.83 1961-01-05
wind['year']= wind['date'].dt.year
wind['month']= wind['date'].dt.month
wind['day']= wind['date'].dt.day
wind.head()
RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL date year month day
Yr_Mo_Dy
1961-01-01 15.04 14.96 13.17 9.29 NaN 9.87 13.67 10.25 10.83 12.58 18.50 15.04 1961-01-01 1961 1 1
1961-01-02 14.71 NaN 10.83 6.50 12.62 7.67 11.50 10.04 9.79 9.67 17.54 13.83 1961-01-02 1961 1 2
1961-01-03 18.50 16.88 12.33 10.13 11.17 6.17 11.25 NaN 8.50 7.67 12.75 12.71 1961-01-03 1961 1 3
1961-01-04 10.58 6.63 11.75 4.58 4.54 2.88 8.63 1.79 5.83 5.88 5.46 10.88 1961-01-04 1961 1 4
1961-01-05 13.33 13.25 11.42 6.17 10.71 8.21 11.92 6.54 10.92 10.34 12.92 11.83 1961-01-05 1961 1 5
# Original answer
wind['year'] = wind['date'].apply(lambda x: x.year)
wind['month'] = wind['date'].apply(lambda x: x.month)
wind['day'] = wind['date'].apply(lambda x: x.day)
wind[wind['month']==1].iloc[:,0:12].mean()
RPT    14.847325
VAL    12.914560
ROS    13.299624
KIL     7.199498
SHA    11.667734
BIR     8.054839
DUB    11.819355
CLA     9.512047
MUL     9.543208
CLO    10.053566
BEL    14.550520
MAL    18.028763
dtype: float64
  • Solution 2: select via loc on the index
wind.loc[wind.index.month==1].mean()
RPT    14.847325
VAL    12.914560
ROS    13.299624
KIL     7.199498
SHA    11.667734
BIR     8.054839
DUB    11.819355
CLA     9.512047
MUL     9.543208
CLO    10.053566
BEL    14.550520
MAL    18.028763
dtype: float64
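The query() usage mentioned above could look like this (a sketch, relying on the month helper column added earlier); it gives the same January means:
wind.query('month == 1').iloc[:, 0:12].mean()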
6.9 Downsample the records to annual frequency for each location
  • A DatetimeIndex is a collection of individual timestamps
  • A PeriodIndex is like a DatetimeIndex with a fixed frequency; it is obtained with DatetimeIndex.to_period(freq)
wind.groupby(wind.index.to_period('A')).mean()
RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL
Yr_Mo_Dy
1961 12.299583 10.351796 11.362369 6.958227 10.881763 7.729726 9.733923 8.858788 8.647652 9.835577 13.502795 13.680773
1962 12.246923 10.110438 11.732712 6.960440 10.657918 7.393068 11.020712 8.793753 8.316822 9.676247 12.930685 14.323956
1963 12.813452 10.836986 12.541151 7.330055 11.724110 8.434712 11.075699 10.336548 8.903589 10.224438 13.638877 14.999014
1964 12.363661 10.920164 12.104372 6.787787 11.454481 7.570874 10.259153 9.467350 7.789016 10.207951 13.740546 14.910301
1965 12.451370 11.075534 11.848767 6.858466 11.024795 7.478110 10.618712 8.879918 7.907425 9.918082 12.964247 15.591644
1966 13.461973 11.557205 12.020630 7.345726 11.805041 7.793671 10.579808 8.835096 8.514438 9.768959 14.265836 16.307260
1967 12.737151 10.990986 11.739397 7.143425 11.630740 7.368164 10.652027 9.325616 8.645014 9.547425 14.774548 17.135945
1968 11.835628 10.468197 11.409754 6.477678 10.760765 6.067322 8.859180 8.255519 7.224945 7.832978 12.808634 15.017486
1969 11.166356 9.723699 10.902000 5.767973 9.873918 6.189973 8.564493 7.711397 7.924521 7.754384 12.621233 15.762904
1970 12.600329 10.726932 11.730247 6.217178 10.567370 7.609452 9.609890 8.334630 9.297616 8.289808 13.183644 16.456027
1971 11.273123 9.095178 11.088329 5.241507 9.440329 6.097151 8.385890 6.757315 7.915370 7.229753 12.208932 15.025233
1972 12.463962 10.561311 12.058333 5.929699 9.430410 6.358825 9.704508 7.680792 8.357295 7.515273 12.727377 15.028716
1973 11.828466 10.680493 10.680493 5.547863 9.640877 6.548740 8.482110 7.614274 8.245534 7.812411 12.169699 15.441096
1974 13.643096 11.811781 12.336356 6.427041 11.110986 6.809781 10.084603 9.896986 9.331753 8.736356 13.252959 16.947671
1975 12.008575 10.293836 11.564712 5.269096 9.190082 5.668521 8.562603 7.843836 8.797945 7.382822 12.631671 15.307863
1976 11.737842 10.203115 10.761230 5.109426 8.846339 6.311038 9.149126 7.146202 8.883716 7.883087 12.332377 15.471448
1977 13.099616 11.144493 12.627836 6.073945 10.003836 8.586438 11.523205 8.378384 9.098192 8.821616 13.459068 16.590849
1978 12.504356 11.044274 11.380000 6.082356 10.167233 7.650658 9.489342 8.800466 9.089753 8.301699 12.967397 16.771370
6.10 Downsample the records to monthly frequency for each location
wind.groupby(wind.index.to_period('M')).mean()
RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL
Yr_Mo_Dy
1961-01 14.841333 11.988333 13.431613 7.736774 11.072759 8.588065 11.184839 9.245333 9.085806 10.107419 13.880968 14.703226
1961-02 16.269286 14.975357 14.441481 9.230741 13.852143 10.937500 11.890714 11.846071 11.821429 12.714286 18.583214 15.411786
1961-03 10.890000 11.296452 10.752903 7.284000 10.509355 8.866774 9.644194 9.829677 10.294138 11.251935 16.410968 15.720000
1961-04 10.722667 9.427667 9.998000 5.830667 8.435000 6.495000 6.925333 7.094667 7.342333 7.237000 11.147333 10.278333
1961-05 9.860968 8.850000 10.818065 5.905333 9.490323 6.574839 7.604000 8.177097 8.039355 8.499355 11.900323 12.011613
... ... ... ... ... ... ... ... ... ... ... ... ...
1978-08 9.645161 8.259355 9.032258 4.502903 7.368065 5.935161 5.650323 5.417742 7.241290 5.536774 10.466774 12.054194
1978-09 10.913667 10.895000 10.635000 5.725000 10.372000 9.278333 10.790333 9.583000 10.069333 8.939000 15.680333 19.391333
1978-10 9.897742 8.670968 9.295806 4.721290 8.525161 6.774194 8.115484 7.337742 8.297742 8.243871 13.776774 17.150000
1978-11 16.151667 14.802667 13.508000 7.317333 11.475000 8.743000 11.492333 9.657333 10.701333 10.676000 17.404667 20.723000
1978-12 16.175484 13.748065 15.635161 7.094839 11.398710 9.241613 12.077419 10.194839 10.616774 11.028710 13.859677 21.371613

216 rows × 12 columns

6.11 Downsample the records to weekly frequency for each location
wind.groupby(wind.index.to_period('W')).mean()
RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL
Yr_Mo_Dy
1960-12-26/1961-01-01 15.040000 14.960000 13.170000 9.290000 NaN 9.870000 13.670000 10.250000 10.830000 12.580000 18.500000 15.040000
1961-01-02/1961-01-08 13.541429 11.486667 10.487143 6.417143 9.474286 6.435714 11.061429 6.616667 8.434286 8.497143 12.481429 13.238571
1961-01-09/1961-01-15 12.468571 8.967143 11.958571 4.630000 7.351429 5.072857 7.535714 6.820000 5.712857 7.571429 11.125714 11.024286
1961-01-16/1961-01-22 13.204286 9.862857 12.982857 6.328571 8.966667 7.417143 9.257143 7.875714 7.145714 8.124286 9.821429 11.434286
1961-01-23/1961-01-29 19.880000 16.141429 18.225714 12.720000 17.432857 14.828571 15.528571 15.160000 14.480000 15.640000 20.930000 22.530000
... ... ... ... ... ... ... ... ... ... ... ... ...
1978-11-27/1978-12-03 14.934286 11.232857 13.941429 5.565714 10.215714 8.618571 9.642857 7.685714 9.011429 9.547143 11.835714 18.728571
1978-12-04/1978-12-10 20.740000 19.190000 17.034286 9.777143 15.287143 12.774286 14.437143 12.488571 13.870000 14.082857 18.517143 23.061429
1978-12-11/1978-12-17 16.758571 14.692857 14.987143 6.917143 11.397143 7.272857 10.208571 7.967143 9.168571 8.565714 11.102857 15.562857
1978-12-18/1978-12-24 11.155714 8.008571 13.172857 4.004286 7.825714 6.290000 7.798571 8.667143 7.151429 8.072857 11.845714 18.977143
1978-12-25/1978-12-31 14.951429 11.801429 16.035714 6.507143 9.660000 8.620000 13.708571 10.477143 10.868571 11.471429 12.947143 26.844286

940 rows × 12 columns

6.12 Compute the min, max, mean and standard deviation of the wind speed across all locations for the first 52 weeks (assume the first week starts on 2 January 1961)
# Resample weekly and aggregate
weekly = wind.resample('W').agg(['min','max','mean','std'])
weekly
RPT VAL ROS ... CLO BEL MAL
min max mean std min max mean std min max ... mean std min max mean std min max mean std
Yr_Mo_Dy
1961-01-01 15.04 15.04 15.040000 NaN 14.96 14.96 14.960000 NaN 13.17 13.17 ... 12.580000 NaN 18.50 18.50 18.500000 NaN 15.04 15.04 15.040000 NaN
1961-01-08 10.58 18.50 13.541429 2.631321 6.63 16.88 11.486667 3.949525 7.62 12.33 ... 8.497143 1.704941 5.46 17.54 12.481429 4.349139 10.88 16.46 13.238571 1.773062
1961-01-15 9.04 19.75 12.468571 3.555392 3.54 12.08 8.967143 3.148945 7.08 19.50 ... 7.571429 4.084293 5.25 20.71 11.125714 5.552215 5.17 16.92 11.024286 4.692355
1961-01-22 4.92 19.83 13.204286 5.337402 3.42 14.37 9.862857 3.837785 7.29 20.79 ... 8.124286 4.783952 6.50 15.92 9.821429 3.626584 6.79 17.96 11.434286 4.237239
1961-01-29 13.62 25.04 19.880000 4.619061 9.96 23.91 16.141429 5.170224 12.67 25.84 ... 15.640000 3.713368 14.04 27.71 20.930000 5.210726 17.50 27.63 22.530000 3.874721
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1978-12-03 9.08 21.29 14.934286 4.931754 4.54 21.34 11.232857 5.978968 8.21 24.04 ... 9.547143 6.284973 4.92 21.42 11.835714 5.950112 11.50 25.75 18.728571 6.393188
1978-12-10 9.92 29.33 20.740000 7.215012 12.54 24.79 19.190000 4.953060 7.21 25.37 ... 14.082857 5.516405 9.54 26.08 18.517143 5.600389 15.34 34.59 23.061429 8.093976
1978-12-17 9.87 23.13 16.758571 4.499431 3.21 24.04 14.692857 7.578665 8.04 18.05 ... 8.565714 5.487801 5.00 21.50 11.102857 6.631925 6.92 22.83 15.562857 6.005594
1978-12-24 6.21 16.62 11.155714 3.522759 3.63 13.29 8.008571 3.882900 8.50 22.21 ... 8.072857 3.023131 3.21 19.79 11.845714 5.750301 10.29 31.71 18.977143 7.194108
1978-12-31 7.21 20.33 14.951429 4.350400 5.46 17.41 11.801429 4.705392 7.83 27.29 ... 11.471429 5.533397 1.21 21.79 12.947143 7.523148 11.96 41.46 26.844286 11.621233

940 rows × 48 columns

weekly.loc[weekly.index[1:53],:].head()
RPT VAL ROS ... CLO BEL MAL
min max mean std min max mean std min max ... mean std min max mean std min max mean std
Yr_Mo_Dy
1961-01-08 10.58 18.50 13.541429 2.631321 6.63 16.88 11.486667 3.949525 7.62 12.33 ... 8.497143 1.704941 5.46 17.54 12.481429 4.349139 10.88 16.46 13.238571 1.773062
1961-01-15 9.04 19.75 12.468571 3.555392 3.54 12.08 8.967143 3.148945 7.08 19.50 ... 7.571429 4.084293 5.25 20.71 11.125714 5.552215 5.17 16.92 11.024286 4.692355
1961-01-22 4.92 19.83 13.204286 5.337402 3.42 14.37 9.862857 3.837785 7.29 20.79 ... 8.124286 4.783952 6.50 15.92 9.821429 3.626584 6.79 17.96 11.434286 4.237239
1961-01-29 13.62 25.04 19.880000 4.619061 9.96 23.91 16.141429 5.170224 12.67 25.84 ... 15.640000 3.713368 14.04 27.71 20.930000 5.210726 17.50 27.63 22.530000 3.874721
1961-02-05 10.58 24.21 16.827143 5.251408 9.46 24.21 15.460000 5.187395 9.04 19.70 ... 9.460000 2.839501 9.17 19.33 14.012857 4.210858 7.17 19.25 11.935714 4.336104

5 rows × 48 columns
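If the exercise is read as wanting a single summary over the first 52 weeks rather than week-by-week statistics, a sketch working directly on the daily data (the date range is an assumption: 2 January 1961 plus 52 full weeks):
first_year = wind.loc['1961-01-02':'1961-12-31', wind.columns[:12]]
first_year.agg(['min', 'max', 'mean', 'std'])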

7. Exploring the Titanic Disaster Data (Visualization)

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

plt.rcParams["font.sans-serif"] = ['SimHei']   # display Chinese labels correctly
plt.rcParams["axes.unicode_minus"] = False     # display the minus sign correctly
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')
7.1 Import the data
titanic = pd.read_csv('train.csv')
titanic.head()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S
1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C
2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S
3 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S
4 5 0 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S
titanic.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB
7.2 Pie chart of the passenger sex ratio
males = (titanic['Sex']=='male').sum()
females = (titanic['Sex']=='female').sum()
proportions = [males, females]
#plt.figure(figsize=(8,8), dpi=100)
#plt.subplot(111, facecolor='#F0F0F0')
plt.pie(proportions,
        labels=['Males', 'Females'],
        shadow=True,
        colors=['#E76278', '#FAC699'],
        explode=(0.15, 0),
        startangle=90,
        autopct='%1.1f%%')
plt.axis('equal')
plt.title('Sex Proportion')
plt.tight_layout()
plt.show()


7.3 Scatter plot of fare against passenger age
lm = sns.lmplot(x='Age', y='Fare', data=titanic, hue='Sex', fit_reg=False)
lm.set(title='Fare_Age Scatter')
plt.tight_layout()
plt.show()

7.4 How many passengers survived?
titanic['Survived'].sum()
342
7.5 Histogram of ticket fares
plt.figure(figsize=(10,4), dpi=100)
plt.subplot(111, facecolor='#F0F0F0')
plt.title('船票价格分布')
plt.xlabel('船票价格')
plt.ylabel('人数')
plt.hist(titanic['Fare'], bins=50, color='#447C69')
plt.xticks(np.arange(0,300,20), fontsize=10)
plt.yticks(fontsize=10)
plt.tight_layout()

8. Exploring Pokemon Data

8.1 Import pandas and build the raw data
import pandas as pd
raw_data = {"name": ['Bulbasaur', 'Charmander', 'Squirtle', 'Caterpie'],
            "evolution": ['Ivysaur', 'Charmeleon', 'Wartortle', 'Metapod'],
            "type": ['grass', 'fire', 'water', 'bug'],
            "hp": [45, 39, 44, 45],
            "pokedex": ['yes', 'no', 'yes', 'no']}
8.2 Create the DataFrame
pokemon = pd.DataFrame(raw_data)
pokemon.head()
name evolution type hp pokedex
0 Bulbasaur Ivysaur grass 45 yes
1 Charmander Charmeleon fire 39 no
2 Squirtle Wartortle water 44 yes
3 Caterpie Metapod bug 45 no
8.3 Reorder the columns
pokemon = pokemon[['name','type','hp','evolution','pokedex']]
pokemon.head()
name type hp evolution pokedex
0 Bulbasaur grass 45 Ivysaur yes
1 Charmander fire 39 Charmeleon no
2 Squirtle water 44 Wartortle yes
3 Caterpie bug 45 Metapod no
8.4 Add a place column
pokemon['place'] = ['park','street','lake','forest']
pokemon
name type hp evolution pokedex place
0 Bulbasaur grass 45 Ivysaur yes park
1 Charmander fire 39 Charmeleon no street
2 Squirtle water 44 Wartortle yes lake
3 Caterpie bug 45 Metapod no forest
8.5 Check the dtype of each column
pokemon.dtypes
name         object
type         object
hp            int64
evolution    object
pokedex      object
place        object
dtype: object

9. Exploring Apple Stock Price Data

9.1 Import the data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
apple = pd.read_csv('Apple_stock.csv')
apple.head()
Date Open High Low Close Volume Adj Close
0 2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35
1 2014-07-07 94.14 95.99 94.10 95.97 56305400 95.97
2 2014-07-03 93.67 94.10 93.20 94.03 22891800 94.03
3 2014-07-02 93.87 94.06 93.09 93.48 28420900 93.48
4 2014-07-01 93.52 94.07 93.13 93.52 38170200 93.52
apple.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 8465 entries, 1980-12-12 to 2014-07-08
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Open       8465 non-null   float64
 1   High       8465 non-null   float64
 2   Low        8465 non-null   float64
 3   Close      8465 non-null   float64
 4   Volume     8465 non-null   int64  
 5   Adj Close  8465 non-null   float64
dtypes: float64(5), int64(1)
memory usage: 721.0 KB
9.2 Are there any duplicate dates?
  • Two approaches
apple['Date'].duplicated().sum()
# no duplicate dates
0
apple['Date'].is_unique
True
# is_unique can also be used on the index
apple.index.is_unique
True
  • Set the date column as the index
apple['Date'] = pd.to_datetime(apple['Date'])
apple.set_index('Date', inplace=True)
apple.head()
Open High Low Close Volume Adj Close
Date
2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35
2014-07-07 94.14 95.99 94.10 95.97 56305400 95.97
2014-07-03 93.67 94.10 93.20 94.03 22891800 94.03
2014-07-02 93.87 94.06 93.09 93.48 28420900 93.48
2014-07-01 93.52 94.07 93.13 93.52 38170200 93.52
9.3 Sort the index in ascending order
apple.sort_index(ascending=True, inplace=True)
apple.head()
Open High Low Close Volume Adj Close
Date
1980-12-12 28.75 28.87 28.75 28.75 117258400 0.45
1980-12-15 27.38 27.38 27.25 27.25 43971200 0.42
1980-12-16 25.37 25.37 25.25 25.25 26432000 0.39
1980-12-17 25.87 26.00 25.87 25.87 21610400 0.40
1980-12-18 26.63 26.75 26.63 26.63 18362400 0.41
9.4 Find the last business day of each month
# 'BM' = business month end frequency; resampling with mean() gives monthly averages labelled at each month's last business day
apple_month = apple.resample('BM').mean()
apple_month.head()
Open High Low Close Volume Adj Close
Date
1980-12-31 30.481538 30.567692 30.443077 30.443077 2.586252e+07 0.473077
1981-01-30 31.754762 31.826667 31.654762 31.654762 7.249867e+06 0.493810
1981-02-27 26.480000 26.572105 26.407895 26.407895 4.231832e+06 0.411053
1981-03-31 24.937727 25.016818 24.836364 24.836364 7.962691e+06 0.387727
1981-04-30 27.286667 27.368095 27.227143 27.227143 6.392000e+06 0.423333
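If the task is read as literally fetching each month's last trading row rather than monthly averages, resample('BM').last() is one option (a sketch; it returns the last available observation in each business-month bin):
apple.resample('BM').last().head()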
9.5 How many days separate the earliest and latest dates in the dataset?
(apple.index.max()-apple.index.min()).days
12261
9.6 How many months of data are there in total?
len(apple.resample('BM').mean())
404
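An index-based alternative that avoids resampling (a sketch): count the distinct year-month periods in the index.
apple.index.to_period('M').nunique()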
9.7 Plot the Adj Close values over time
plt.figure(figsize=(8,3), dpi=100)
apple['Adj Close'].plot(title='Apple Stock')
<AxesSubplot:title={'center':'Apple Stock'}, xlabel='Date'>

10. Exploring the Iris Dataset

10.1 Import the data
import pandas as pd
import numpy as np   # np.nan is used below
iris = pd.read_csv('iris.csv')   # note: without header=None the first row is consumed as the header, leaving 149 rows
iris.head()
5.1 3.5 1.4 0.2 Iris-setosa
0 4.9 3.0 1.4 0.2 Iris-setosa
1 4.7 3.2 1.3 0.2 Iris-setosa
2 4.6 3.1 1.5 0.2 Iris-setosa
3 5.0 3.6 1.4 0.2 Iris-setosa
4 5.4 3.9 1.7 0.4 Iris-setosa
iris.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 149 entries, 0 to 148
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   5.1          149 non-null    float64
 1   3.5          149 non-null    float64
 2   1.4          149 non-null    float64
 3   0.2          149 non-null    float64
 4   Iris-setosa  149 non-null    object 
dtypes: float64(4), object(1)
memory usage: 5.9+ KB
10.2 Add column names
iris.columns = ['sepal_length','sepal_width', 'petal_length','petal_width', 'class']
iris.head()
sepal_length sepal_width petal_length petal_width class
0 4.9 3.0 1.4 0.2 Iris-setosa
1 4.7 3.2 1.3 0.2 Iris-setosa
2 4.6 3.1 1.5 0.2 Iris-setosa
3 5.0 3.6 1.4 0.2 Iris-setosa
4 5.4 3.9 1.7 0.4 Iris-setosa
10.3 Does the dataset have any missing values?
iris.isna().sum()
sepal_length    0
sepal_width     0
petal_length    0
petal_width     0
class           0
dtype: int64
10.4 Set rows 10 to 19 of petal_length to missing
  • Python's range() is half-open: with stop=20 below, 20 itself is not included
for i in range(10, 20):
    iris['petal_length'].at[i] = np.nan
iris.head(21)
sepal_length sepal_width petal_length petal_width class
0 4.9 3.0 1.4 0.2 Iris-setosa
1 4.7 3.2 1.3 0.2 Iris-setosa
2 4.6 3.1 1.5 0.2 Iris-setosa
3 5.0 3.6 1.4 0.2 Iris-setosa
4 5.4 3.9 1.7 0.4 Iris-setosa
5 4.6 3.4 1.4 0.3 Iris-setosa
6 5.0 3.4 1.5 0.2 Iris-setosa
7 4.4 2.9 1.4 0.2 Iris-setosa
8 4.9 3.1 1.5 0.1 Iris-setosa
9 5.4 3.7 1.5 0.2 Iris-setosa
10 4.8 3.4 NaN 0.2 Iris-setosa
11 4.8 3.0 NaN 0.1 Iris-setosa
12 4.3 3.0 NaN 0.1 Iris-setosa
13 5.8 4.0 NaN 0.2 Iris-setosa
14 5.7 4.4 NaN 0.4 Iris-setosa
15 5.4 3.9 NaN 0.4 Iris-setosa
16 5.1 3.5 NaN 0.3 Iris-setosa
17 5.7 3.8 NaN 0.3 Iris-setosa
18 5.1 3.8 NaN 0.3 Iris-setosa
19 5.4 3.4 NaN 0.2 Iris-setosa
20 5.1 3.7 1.5 0.4 Iris-setosa
# Note: column position 1 is sepal_width, so this sets sepal_width rather than petal_length;
# for petal_length it would be iris.iloc[10:20, 2:3] = np.nan
iris.iloc[10:20, 1:2] = np.nan
iris.head(21)
sepal_length sepal_width petal_length petal_width class
0 4.9 3.0 1.4 0.2 Iris-setosa
1 4.7 3.2 1.3 0.2 Iris-setosa
2 4.6 3.1 1.5 0.2 Iris-setosa
3 5.0 3.6 1.4 0.2 Iris-setosa
4 5.4 3.9 1.7 0.4 Iris-setosa
5 4.6 3.4 1.4 0.3 Iris-setosa
6 5.0 3.4 1.5 0.2 Iris-setosa
7 4.4 2.9 1.4 0.2 Iris-setosa
8 4.9 3.1 1.5 0.1 Iris-setosa
9 5.4 3.7 1.5 0.2 Iris-setosa
10 4.8 NaN 1.6 0.2 Iris-setosa
11 4.8 NaN 1.4 0.1 Iris-setosa
12 4.3 NaN 1.1 0.1 Iris-setosa
13 5.8 NaN 1.2 0.2 Iris-setosa
14 5.7 NaN 1.5 0.4 Iris-setosa
15 5.4 NaN 1.3 0.4 Iris-setosa
16 5.1 NaN 1.4 0.3 Iris-setosa
17 5.7 NaN 1.7 0.3 Iris-setosa
18 5.1 NaN 1.5 0.3 Iris-setosa
19 5.4 NaN 1.7 0.2 Iris-setosa
20 5.1 3.7 1.5 0.4 Iris-setosa
10.5 Replace all missing values with 1.0
iris.fillna(1, inplace=True)
iris.head(21)
sepal_length sepal_width petal_length petal_width class
0 4.9 3.0 1.4 0.2 Iris-setosa
1 4.7 3.2 1.3 0.2 Iris-setosa
2 4.6 3.1 1.5 0.2 Iris-setosa
3 5.0 3.6 1.4 0.2 Iris-setosa
4 5.4 3.9 1.7 0.4 Iris-setosa
5 4.6 3.4 1.4 0.3 Iris-setosa
6 5.0 3.4 1.5 0.2 Iris-setosa
7 4.4 2.9 1.4 0.2 Iris-setosa
8 4.9 3.1 1.5 0.1 Iris-setosa
9 5.4 3.7 1.5 0.2 Iris-setosa
10 4.8 3.4 1.0 0.2 Iris-setosa
11 4.8 3.0 1.0 0.1 Iris-setosa
12 4.3 3.0 1.0 0.1 Iris-setosa
13 5.8 4.0 1.0 0.2 Iris-setosa
14 5.7 4.4 1.0 0.4 Iris-setosa
15 5.4 3.9 1.0 0.4 Iris-setosa
16 5.1 3.5 1.0 0.3 Iris-setosa
17 5.7 3.8 1.0 0.3 Iris-setosa
18 5.1 3.8 1.0 0.3 Iris-setosa
19 5.4 3.4 1.0 0.2 Iris-setosa
20 5.1 3.7 1.5 0.4 Iris-setosa
10.6 Delete the class column
del iris['class']
iris.head()
sepal_length sepal_width petal_length petal_width
0 4.9 3.0 1.4 0.2
1 4.7 3.2 1.3 0.2
2 4.6 3.1 1.5 0.2
3 5.0 3.6 1.4 0.2
4 5.4 3.9 1.7 0.4
10.7 Set the first three rows to missing
iris.iloc[0:3,:] = np.nan
iris.head()
sepal_length sepal_width petal_length petal_width
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 5.0 3.6 1.4 0.2
4 5.4 3.9 1.7 0.4
10.8 Drop rows containing missing values
iris.drop_duplicates(inplace=True)  # note: this only removes duplicate rows; see the sketch below
iris.head()
sepal_length sepal_width petal_length petal_width
0 NaN NaN NaN NaN
3 5.0 3.6 1.4 0.2
4 5.4 3.9 1.7 0.4
5 4.6 3.4 1.4 0.3
6 5.0 3.4 1.5 0.2
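drop_duplicates() above only collapses the three identical all-NaN rows into a single one; the row of missing values itself remains. A sketch of the call that actually drops rows containing missing values, as the heading asks:
iris.dropna(how='any', inplace=True)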
10.9 Reset the index
iris.reset_index(inplace=True, drop=True)
iris.head()
sepal_length sepal_width petal_length petal_width
0 NaN NaN NaN NaN
1 5.0 3.6 1.4 0.2
2 5.4 3.9 1.7 0.4
3 4.6 3.4 1.4 0.3
4 5.0 3.4 1.5 0.2
