Pandas usage

import pandas as pd
import numpy as np

1. Creating a Series

1) Create an empty Series

In the pandas version used here, calling pd.Series() with no arguments emitted a DeprecationWarning ("The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning."), so the dtype is passed explicitly.

s = pd.Series(dtype='float64')
s

Series([], dtype: float64)

2) Create a Series from an ndarray, with the index set to [100, 101, 102, 103]

arr = np.array(['a','b','c','d'])
ser1 = pd.Series(arr,index=[100,101,102,103])
ser1

100    a
101    b
102    c
103    d
dtype: object

3) Create a Series from a dictionary; the dictionary keys are used to build the index

dic = {'a':0,'b':1,'c':2,'d':3}
dic

{'a': 0, 'b': 1, 'c': 2, 'd': 3}

ser2 = pd.Series(dic)
ser2

a    0
b    1
c    2
d    3
dtype: int64

4) Create a Series from a scalar. In this case an index must be provided, and the value is repeated to match the length of the index

s = pd.Series(5,index = [0,1,2,3])
s

0    5
1    5
2    5
3    5
dtype: int64
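
The four construction paths above can be checked side by side. This is a minimal sketch; the variable names (empty, from_array, from_dict, from_scalar) are illustrative and do not appear in the original.

```python
import numpy as np
import pandas as pd

# Empty Series: pass dtype explicitly so no version-dependent default is used.
empty = pd.Series(dtype="float64")

# From an ndarray with a custom integer index.
from_array = pd.Series(np.array(["a", "b", "c", "d"]), index=[100, 101, 102, 103])

# From a dict: the keys become the index, in insertion order.
from_dict = pd.Series({"a": 0, "b": 1, "c": 2, "d": 3})

# From a scalar: the value is repeated once per index label.
from_scalar = pd.Series(5, index=[0, 1, 2, 3])

print(len(empty), list(from_array.index), list(from_dict.index), from_scalar.tolist())
```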

2. Accessing data in a Series by position (pandas slicing)

1) Retrieve the first element of the Series

s=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
s

a    1
b    2
c    3
d    4
e    5
dtype: int64

s.iloc[0]  # explicit positional access; s[0] on a label index is deprecated in recent pandas

2) Retrieve the first three elements of the Series

s[:3]

a    1
b    2
c    3
dtype: int64

3) Retrieve the last three elements of the Series

s[2:]

c    3
d    4
e    5
dtype: int64

s[-3:]

c    3
d    4
e    5
dtype: int64
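
Positional access can also be written with .iloc, which always interprets integers as positions and never as labels. The sketch below is an alternative spelling of the three lookups in this section, not something the original prescribes; the names first, first_three and last_three are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

first = s.iloc[0]          # first element by position
first_three = s.iloc[:3]   # first three elements
last_three = s.iloc[-3:]   # last three elements, whatever the length

print(first, first_three.tolist(), last_three.tolist())
```

Note that s[2:] returns the last three elements only because this Series has exactly five; a negative slice such as -3: does not depend on the length.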

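3. Retrieving data by label (index): a Series behaves like a fixed-size dictionary, so values can be read and set through index labels

1) Retrieve a single element using an index label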

s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
s

a    1
b    2
c    3
d    4
e    5
dtype: int64

s['b']

2) Retrieve multiple elements using a list of index labels

s1 = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
s1

a    1
b    2
c    3
d    4
e    5
dtype: int64

s1[['a','b','c','d']]

a    1
b    2
c    3
d    4
dtype: int64
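
Label-based selection can be made explicit with .loc. The following sketch, with illustrative variable names, shows a list of labels and a label slice; unlike positional slices, a label slice includes both endpoints.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

picked = s1.loc[["a", "c", "e"]]  # list of labels, returned in the requested order
span = s1.loc["b":"d"]            # label slice: both "b" and "d" are included

print(picked.tolist(), span.tolist())
```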

3) If the label is not in the index, the lookup raises an exception

s['f']

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2894             try:
-> 2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'f'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-24-c23937ec966b> in <module>
----> 1 s['f']

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    880 
    881         elif key_is_scalar:
--> 882             return self._get_value(key)
    883 
    884         if is_hashable(key):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
    987 
    988         # Similar to Index.get_value, but we do not fall back to positional
--> 989         loc = self.index.get_loc(label)
    990         return self.index._get_values_for_loc(self, loc, label)
    991 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:
-> 2897                 raise KeyError(key) from err
   2898 
   2899         if tolerance is not None:

KeyError: 'f'
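
When a label may be absent, the KeyError can be avoided or handled. This is a small sketch of three common options (a membership test, Series.get with a default, and try/except); the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

has_f = "f" in s                   # `in` tests index labels, not values
value_or_default = s.get("f", 0)   # returns the default instead of raising

try:
    s["f"]
    raised = False
except KeyError:
    raised = True

print(has_f, value_or_default, raised)
```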

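4. Simple operations

A pandas Series supports NumPy array operations (filtering with a boolean array, scalar multiplication, and applying math functions), and the results keep the link between each index label and its value.

1) Example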

ser2 = pd.Series(range(4),index = ["a","b","c","d"])
ser2

a    0
b    1
c    2
d    3
dtype: int64

ser2[ser2 > 2]

d    3
dtype: int64

ser2 * 2

a    0
b    2
c    4
d    6
dtype: int64

np.exp(ser2)

a     1.000000
b     2.718282
c     7.389056
d    20.085537
dtype: float64
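
As a further sketch of the same idea, boolean conditions can be combined and NumPy functions applied while the index labels travel with the values. The names middle and roots are illustrative.

```python
import numpy as np
import pandas as pd

ser = pd.Series(range(4), index=["a", "b", "c", "d"])

# Combine two boolean conditions with & (the parentheses are required).
middle = ser[(ser > 0) & (ser < 3)]

# A NumPy ufunc returns a Series carrying the same index.
roots = np.sqrt(ser)

print(list(middle.index), list(roots.index))
```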

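Automatic alignment of Series

An important feature of Series is automatic alignment, which is easiest to understand from an example. When different Series objects are combined in an operation, their values are matched by index label before the computation; a label present in only one of them produces NaN.

1) Create two Series named ser3 and ser4.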

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
sdata

{'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}

ser3 = pd.Series(sdata)
ser3

Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64

states = ['California', 'Ohio', 'Oregon', 'Texas']
states

['California', 'Ohio', 'Oregon', 'Texas']

ser4 = pd.Series(sdata,index = states)
ser4

California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64

ser3 + ser4

California         NaN
Ohio           70000.0
Oregon         32000.0
Texas         142000.0
Utah               NaN
dtype: float64
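
If NaN for one-sided labels is not wanted, the add() method accepts a fill_value. The sketch below reuses the data from this section; total is an illustrative name. A label that has no value on either side after alignment (California here) still yields NaN.

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}
ser3 = pd.Series(sdata)
ser4 = pd.Series(sdata, index=["California", "Ohio", "Oregon", "Texas"])

# fill_value stands in for a value that is missing on one side only.
# Utah exists only in ser3, so it is added to 0 instead of becoming NaN.
total = ser3.add(ser4, fill_value=0)

print(total)
```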

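5. Adding, deleting, and modifying Series elements

1) Add: the Series add() method performs addition; it does not add elements. To add elements, concatenate with another Series. The original notebook used append for this; Series.append was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat is the replacement.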

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
sdata

{'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}

ser3 = pd.Series(sdata)
ser3

Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64

states = ['California', 'Ohio', 'Oregon', 'Texas']
states

['California', 'Ohio', 'Oregon', 'Texas']

ser4 = pd.Series(sdata,index = states)
ser4

California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64

pd.concat([ser3, ser4])  # replaces ser3.append(ser4), which was removed in pandas 2.0

Ohio          35000.0
Texas         71000.0
Oregon        16000.0
Utah           5000.0
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64
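
Two related ways of adding elements are sketched below: pd.concat with ignore_index=True, which renumbers the result, and assignment to a new label, which adds a single element in place. The Series and names here are illustrative, not from the original.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])
extra = pd.Series([4, 5], index=["d", "e"])

# Concatenate two Series; a new object is returned and the inputs are unchanged.
combined = pd.concat([s, extra])

# ignore_index=True discards the labels and renumbers from 0.
renumbered = pd.concat([s, extra], ignore_index=True)

# Assigning to a label that does not exist yet adds one element in place.
s["z"] = 26

print(combined.tolist(), list(renumbered.index), list(s.index))
```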

2) Delete: the Series drop() method removes elements. It returns a new Series without the dropped labels and leaves the original Series unchanged

s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
s

a    1
b    2
c    3
d    4
e    5
dtype: int64

a1 = s.drop('a')
s

a    1
b    2
c    3
d    4
e    5
dtype: int64

a1

b    2
c    3
d    4
e    5
dtype: int64
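
drop() also accepts a list of labels, and errors="ignore" skips labels that are not present. A minimal sketch with illustrative names:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

without_ab = s.drop(["a", "b"])             # drop several labels at once
unchanged = s.drop("zz", errors="ignore")   # a missing label is skipped, not raised

print(list(without_ab.index), len(unchanged), len(s))
```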

3) Modify: locate an element through its index, then assign a new value with "="

s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
s

a    1
b    2
c    3
d    4
e    5
dtype: int64

s['a']=5
s

a    5
b    2
c    3
d    4
e    5
dtype: int64
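
Assignment works the same way through .loc, .iloc, and a boolean mask. The sketch below is one possible illustration; the values are arbitrary.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

s.loc["a"] = 10    # modify by label
s.iloc[-1] = 50    # modify by position (last element)
s[s < 4] = 0       # modify every element that matches a condition

print(s.tolist())
```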
