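I. Indexing

Label-based indexing on a time series accepts date strings in a variety of formats, as well as datetime.datetime objects.

Because the index is sorted chronologically, there is no need to worry about ordering when looking up labels.
The same indexing methods also apply to a DataFrame.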
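
The different spellings are interchangeable because pandas parses each one into the same Timestamp before looking it up. The short sketch below illustrates this; the variable names (spellings, parsed, all_equal) are made up for illustration. Note that an ambiguous string such as '1/10/2017' is read month-first by default, so it means 10 January, not 1 October.

```python
from datetime import datetime
import pandas as pd

# Several spellings of 10 January 2017; pandas parses each into a Timestamp.
spellings = ['2017/1/10', '20170110', '1/10/2017', '2017-01-10', datetime(2017, 1, 10)]
parsed = [pd.Timestamp(s) for s in spellings]

for raw, stamp in zip(spellings, parsed):
    print(repr(raw), '->', stamp)

# Every spelling resolves to the same point in time.
all_equal = all(stamp == parsed[0] for stamp in parsed)
print("all equal:", all_equal)
```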

import numpy as np
import pandas as pd
from datetime import datetime

# Indexing
rng = pd.date_range('2017/1', '2017/3')  # daily timestamps, 2017-01-01 through 2017-03-01
ts = pd.Series(np.random.rand(len(rng)), index=rng)
print("ts = \n", ts)
print("-" * 50)
print("ts.head() = \n", ts.head())
print("-" * 200)

# Basic positional indexing (.iloc makes sure the integer is read as a position, not a label)
print("Positional indexing: ts.iloc[0] = ", ts.iloc[0])
print("Positional indexing: ts.iloc[:2] = \n", ts.iloc[:2])
print("-" * 200)

# Label-based indexing accepts date strings in various formats, as well as datetime.datetime
# The index is sorted chronologically, so ordering is not a concern
# The same indexing methods also apply to a DataFrame
print("ts['2017/1/2'] = ", ts['2017/1/2'])
print("ts['20170103'] = ", ts['20170103'])
print("ts['1/10/2017'] = ", ts['1/10/2017'])
print("ts[datetime(2017, 1, 20)] = ", ts[datetime(2017, 1, 20)])
print("-" * 200)

Output:

ts =
2017-01-01    0.551172
2017-01-02    0.676984
2017-01-03    0.449515
2017-01-04    0.029888
2017-01-05    0.760317
2017-01-06    0.237550
2017-01-07    0.447621
2017-01-08    0.765687
2017-01-09    0.594706
2017-01-10    0.127133
2017-01-11    0.585002
2017-01-12    0.715092
2017-01-13    0.452857
2017-01-14    0.002166
2017-01-15    0.919406
2017-01-16    0.661433
2017-01-17    0.816985
2017-01-18    0.054109
2017-01-19    0.941522
2017-01-20    0.577710
2017-01-21    0.896383
2017-01-22    0.062862
2017-01-23    0.765347
2017-01-24    0.592148
2017-01-25    0.278556
2017-01-26    0.090711
2017-01-27    0.772405
2017-01-28    0.685413
2017-01-29    0.564777
2017-01-30    0.249494
2017-01-31    0.353693
2017-02-01    0.641812
2017-02-02    0.744452
2017-02-03    0.802991
2017-02-04    0.286702
2017-02-05    0.505531
2017-02-06    0.147288
2017-02-07    0.412554
2017-02-08    0.690443
2017-02-09    0.219935
2017-02-10    0.631287
2017-02-11    0.283691
2017-02-12    0.637356
2017-02-13    0.414368
2017-02-14    0.670913
2017-02-15    0.982919
2017-02-16    0.787294
2017-02-17    0.783862
2017-02-18    0.110436
2017-02-19    0.631306
2017-02-20    0.857404
2017-02-21    0.697764
2017-02-22    0.990373
2017-02-23    0.876479
2017-02-24    0.617759
2017-02-25    0.370738
2017-02-26    0.523457
2017-02-27    0.074906
2017-02-28    0.875270
2017-03-01    0.455254
Freq: D, dtype: float64
--------------------------------------------------
ts.head() =
2017-01-01    0.551172
2017-01-02    0.676984
2017-01-03    0.449515
2017-01-04    0.029888
2017-01-05    0.760317
Freq: D, dtype: float64
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Positional indexing: ts.iloc[0] =  0.5511722618400913
Positional indexing: ts.iloc[:2] =
2017-01-01    0.551172
2017-01-02    0.676984
Freq: D, dtype: float64
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ts['2017/1/2'] =  0.6769837637858711
ts['20170103'] =  0.4495150651749722
ts['1/10/2017'] =  0.12713279349021678
ts[datetime(2017, 1, 20)] =  0.5777095683188953
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Process finished with exit code 0
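
The claim that the same indexing carries over to a DataFrame can be sketched as follows. This is a minimal illustration, not part of the original example: the column names 'a' and 'b' and the variable names are assumptions. On a DataFrame, rows are selected by date label through .loc, since plain square brackets look up columns.

```python
from datetime import datetime
import numpy as np
import pandas as pd

rng = pd.date_range('2017/1', '2017/3')  # same daily index as in the Series example
df = pd.DataFrame(np.random.rand(len(rng), 2), index=rng, columns=['a', 'b'])

# Row lookup by date label: a string or a datetime both work with .loc.
row = df.loc['2017/1/2']                 # one row, returned as a Series
same_row = df.loc[datetime(2017, 1, 2)]  # identical row via a datetime object

# Label slices on rows include the end point, just as for a Series.
first_week = df.loc['2017-01-01':'2017-01-07']

print(row)
print(first_week.shape)
```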

II. Slicing

Slicing by date follows the same rule as label-based slicing on any Series index: the end point is included.

import numpy as np
import pandas as pd

# Slicing
# Note: recent pandas versions prefer the lowercase alias '12h' and may warn on '12H'
rng = pd.date_range('2017/1', '2017/3', freq='12H')
ts = pd.Series(np.random.rand(len(rng)), index=rng)
print("ts = \n", ts)
print('-' * 200)

# Same principle as label-based slicing on a Series index: the end point is included
data1 = ts['2017/1/5':'2017/1/10']
print("data1 = ts['2017/1/5':'2017/1/10'] = \n", data1)
print('-' * 200)

# Passing just a month returns every entry in that month as a slice
data2 = ts['2017/2']
data3 = data2.head()
print("data2 = ts['2017/2'] = \n", data2)
print('-' * 50)
print("data3 = ts['2017/2'].head() = \n", data3)
print('-' * 200)

Output:

ts =
2017-01-01 00:00:00    0.494033
2017-01-01 12:00:00    0.820702
2017-01-02 00:00:00    0.616621
2017-01-02 12:00:00    0.011143
2017-01-03 00:00:00    0.940433
                         ...
2017-02-27 00:00:00    0.978302
2017-02-27 12:00:00    0.414231
2017-02-28 00:00:00    0.218717
2017-02-28 12:00:00    0.580957
2017-03-01 00:00:00    0.090996
Freq: 12H, Length: 119, dtype: float64
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
data1 = ts['2017/1/5':'2017/1/10'] =
2017-01-05 00:00:00    0.652041
2017-01-05 12:00:00    0.773052
2017-01-06 00:00:00    0.463288
2017-01-06 12:00:00    0.335351
2017-01-07 00:00:00    0.099362
2017-01-07 12:00:00    0.883344
2017-01-08 00:00:00    0.426475
2017-01-08 12:00:00    0.580315
2017-01-09 00:00:00    0.863783
2017-01-09 12:00:00    0.494119
2017-01-10 00:00:00    0.577613
2017-01-10 12:00:00    0.168280
Freq: 12H, dtype: float64
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
data2 = ts['2017/2'] =
2017-02-01 00:00:00    0.906511
2017-02-01 12:00:00    0.208719
2017-02-02 00:00:00    0.831267
2017-02-02 12:00:00    0.496934
2017-02-03 00:00:00    0.882586
2017-02-03 12:00:00    0.269308
2017-02-04 00:00:00    0.767492
2017-02-04 12:00:00    0.928533
2017-02-05 00:00:00    0.404165
2017-02-05 12:00:00    0.573177
2017-02-06 00:00:00    0.298927
2017-02-06 12:00:00    0.987986
2017-02-07 00:00:00    0.097949
2017-02-07 12:00:00    0.971335
2017-02-08 00:00:00    0.194750
2017-02-08 12:00:00    0.224471
2017-02-09 00:00:00    0.628354
2017-02-09 12:00:00    0.487055
2017-02-10 00:00:00    0.166684
2017-02-10 12:00:00    0.644644
2017-02-11 00:00:00    0.479011
2017-02-11 12:00:00    0.035003
2017-02-12 00:00:00    0.694782
2017-02-12 12:00:00    0.784163
2017-02-13 00:00:00    0.740384
2017-02-13 12:00:00    0.983730
2017-02-14 00:00:00    0.010376
2017-02-14 12:00:00    0.026971
2017-02-15 00:00:00    0.012298
2017-02-15 12:00:00    0.679321
2017-02-16 00:00:00    0.594517
2017-02-16 12:00:00    0.260168
2017-02-17 00:00:00    0.405923
2017-02-17 12:00:00    0.856798
2017-02-18 00:00:00    0.615552
2017-02-18 12:00:00    0.261799
2017-02-19 00:00:00    0.786273
2017-02-19 12:00:00    0.316262
2017-02-20 00:00:00    0.457370
2017-02-20 12:00:00    0.975753
2017-02-21 00:00:00    0.232189
2017-02-21 12:00:00    0.373186
2017-02-22 00:00:00    0.506089
2017-02-22 12:00:00    0.849335
2017-02-23 00:00:00    0.623559
2017-02-23 12:00:00    0.215287
2017-02-24 00:00:00    0.985915
2017-02-24 12:00:00    0.998497
2017-02-25 00:00:00    0.294932
2017-02-25 12:00:00    0.993772
2017-02-26 00:00:00    0.852245
2017-02-26 12:00:00    0.957576
2017-02-27 00:00:00    0.978302
2017-02-27 12:00:00    0.414231
2017-02-28 00:00:00    0.218717
2017-02-28 12:00:00    0.580957
Freq: 12H, dtype: float64
--------------------------------------------------
data3 = ts['2017/2'].head() =
2017-02-01 00:00:00    0.906511
2017-02-01 12:00:00    0.208719
2017-02-02 00:00:00    0.831267
2017-02-02 12:00:00    0.496934
2017-02-03 00:00:00    0.882586
Freq: 12H, dtype: float64
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Process finished with exit code 0
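
The inclusive end point is easy to confuse with ordinary positional slicing, where the stop position is excluded. The sketch below contrasts the two on a small deterministic series. The names and data are made up for illustration, and .loc is used only to make the label-based intent explicit. It also shows why data1 above contains the 12:00 entry of 2017-01-10: a day-resolution end label covers that whole day.

```python
import numpy as np
import pandas as pd

daily = pd.Series(np.arange(10.0), index=pd.date_range('2017-01-01', periods=10, freq='D'))

by_label = daily.loc['2017-01-03':'2017-01-05']  # label slice: both end points included
by_position = daily.iloc[2:4]                    # positional slice: position 4 excluded

print(len(by_label), len(by_position))

# With sub-daily data, a day-level end label includes every timestamp on that day.
half_daily = pd.Series(
    np.arange(8.0),
    index=pd.date_range('2017-01-01', periods=8, freq=pd.Timedelta(hours=12)),
)
two_days = half_daily.loc['2017-01-02':'2017-01-03']
print(two_days)
```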

III. Time Series with Duplicate Index Values

import numpy as np
import pandas as pd

# The index contains duplicates; is_unique shows the values are unique but the index is not
dates = pd.DatetimeIndex(['1/1/2015', '1/2/2015', '1/3/2015', '1/4/2015', '1/1/2015', '1/2/2015'])
ts = pd.Series(np.random.rand(6), index=dates)
print("ts = \n", ts)
print('-' * 50)
print("ts.is_unique = {0}, ts.index.is_unique = {1}".format(ts.is_unique, ts.index.is_unique))
print('-' * 200)

# A label that occurs more than once returns multiple values
data1 = ts['20150101']
print("data1 = \n{0} \ntype(data1) = {1}".format(data1, type(data1)))
print('-' * 50)
# Because the index is not unique, even a label that occurs once comes back as a Series
data2 = ts['20150104']
print("data2 = \n{0} \ntype(data2) = {1}".format(data2, type(data2)))
print('-' * 200)

# Group by the index (level=0); duplicated timestamps are collapsed to their mean here
data3 = ts.groupby(level=0).mean()
print("data3 = \n{0} \ntype(data3) = {1}".format(data3, type(data3)))
print('-' * 200)

Output:

ts =
2015-01-01    0.488589
2015-01-02    0.621012
2015-01-03    0.657300
2015-01-04    0.164756
2015-01-01    0.078192
2015-01-02    0.899275
dtype: float64
--------------------------------------------------
ts.is_unique = True, ts.index.is_unique = False
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
data1 =
2015-01-01    0.488589
2015-01-01    0.078192
dtype: float64
type(data1) = <class 'pandas.core.series.Series'>
--------------------------------------------------
data2 =
2015-01-04    0.164756
dtype: float64
type(data2) = <class 'pandas.core.series.Series'>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
data3 =
2015-01-01    0.283390
2015-01-02    0.760143
2015-01-03    0.657300
2015-01-04    0.164756
dtype: float64
type(data3) = <class 'pandas.core.series.Series'>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Process finished with exit code 0
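
Averaging is only one way to resolve duplicated timestamps. When a single observation per timestamp should be kept instead, Index.duplicated can build a mask. The following is a rough sketch with fixed values so the result is easy to follow; it is an alternative to the groupby approach above, and the variable names are illustrative only.

```python
import pandas as pd

dates = pd.DatetimeIndex(['1/1/2015', '1/2/2015', '1/3/2015', '1/4/2015', '1/1/2015', '1/2/2015'])
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=dates)

# True marks the second and later occurrences of a timestamp.
dup_mask = ts.index.duplicated(keep='first')

first_only = ts[~dup_mask]                                        # keep the earliest row per timestamp
last_only = ts[~ts.index.duplicated(keep='last')].sort_index()   # keep the latest row per timestamp

print(first_only)
print(last_only)
```

Which occurrence to keep, or whether to aggregate instead, depends on what the duplicates mean in the data at hand.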

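Pandas time series (part 2), indexing and slicing: a time series is a Series with a DatetimeIndex (older pandas releases called this TimeSeries), so indexing and data selection work essentially the same as for any Series, with more convenient date-based shortcuts for indexing and slicing.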
