pandas functions: handling missing values
The functions covered here will come up often in later data analysis work.

import numpy as np
import pandas as pd
from pandas import DataFrame, Series
from numpy import nan as NA

obj = Series(['c', 'a', 'd', 'a', 'a', 'b', 'b', 'c', 'c'])
uniques = obj.unique()
print("obj is \n", obj)
print("obj.unique is \n ", obj.unique())
# sort() works in place and returns None, so sort first and print the array afterwards
uniques.sort()
print("uniques after uniques.sort() is \n", uniques)
print("obj.value_counts() is \n", obj.value_counts())
print("obj.value_counts(sort=False) \n", obj.value_counts(sort=False))

mask = obj.isin(['b', 'c'])
print("obj.isin(['b','c']) \n", obj.isin(['b', 'c']))
print("mask = obj.isin(['b','c'])")
print("obj[mask] is \n", obj[mask])

data = DataFrame({'Qu1': [1, 3, 4, 3, 4], 'Qu2': [2, 3, 1, 2, 3], 'Qu3': [1, 5, 2, 4, 4]})
print("data is \n", data)
result = data.apply(pd.Series.value_counts).fillna(0)
print("data.apply(pd.Series.value_counts).fillna(0)\n ", result)
print("Counts how often each value occurs in a Series")

print("handling the missing data \n")
string_data = Series(['aardvark', 'artichoke', np.nan, 'avocado'])
print("string_data is \n", string_data)
print("string_data.isnull() \n", string_data.isnull())
print("The built-in python None value is also treated as NA in object Arrays \n")
print("string_data[0]=None\n")
string_data[0] = None
# isnull must be called; without the parentheses only the bound method is printed
print("string_data.isnull() \n ", string_data.isnull())

print(" NA handling methods in P143 Table 5-12")
data = Series([1, NA, 3.5, NA, 7])
data.dropna()  # returns a new Series; data itself is unchanged
print("data is \n", data)
print("data.dropna() is \n", data.dropna())
print("data[data.notnull()],\n", data[data.notnull()])

data = DataFrame([[1., 6.5, 3.], [1., NA, NA], [NA, NA, NA], [NA, 6.5, 3.]])
cleaned = data.dropna()
print("data is \n", data)
print("data.dropna() is \n", cleaned)
print("data.dropna(how='all') is \n", data.dropna(how='all'))
print("passing how=all will only drop rows that are all NA")
data[4] = NA
print("New data is \n", data)
print("data.dropna(axis=1,how='all') \n", data.dropna(axis=1, how='all'))
print("drop by columns")

df = DataFrame(np.random.randn(7, 3))
print("df is \n", df)
# .ix has been removed from pandas; .loc is the label-based equivalent here
df.loc[:4, 1] = NA
df.loc[:2, 2] = NA
print("New df is \n", df)
print("df.dropna(thresh=3)\n", df.dropna(thresh=3))

print("filling in the missing data")
print("df.fillna(0) \n", df.fillna(0))
print("df.fillna({1:0.5,3:-1}) \n", df.fillna({1: 0.5, 3: -1}))
print("calling fillna with a dict you can use a different fill value for each column")
_ = df.fillna(0, inplace=True)
print("_=df.fillna(0,inplace=True) \n", df)

df = DataFrame(np.random.randn(6, 3))
print("DataFrame(np.random.randn(6,3)) \n", df)
df.loc[2:, 1] = NA
df.loc[4:, 2] = NA
print("df.loc[2:,1] = NA; df.loc[4:,2] = NA \n", df)
# ffill() replaces fillna(method='ffill'), which is deprecated in recent pandas
print("df.ffill() \n", df.ffill())
print("df.ffill(limit=2) \n", df.ffill(limit=2))
data = Series([1., NA, 3.5, NA, 7])
print("data is \n", data)
print("data.fillna(data.mean()) \n", data.fillna(data.mean()))
print("fillna function arguments on P146 Table 5-13")

print("Hierarchical indexing")
data = Series(np.random.randn(10),
              index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],
                     [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]])
print("data is \n", data)
print("a Series with multi-index")
print("data.index", data.index)
print("data['b'] \n", data['b'])
print("data['b':'c'] \n", data['b':'c'])
print("data.loc[['b','d']] \n", data.loc[['b', 'd']])
print("data.loc[:,2] \n", data.loc[:, 2])
print("data.unstack() \n", data.unstack())
print("data.unstack().stack() \n ", data.unstack().stack())

print("data frame")
frame = DataFrame(np.arange(12).reshape((4, 3)),
                  index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns=[['Ohio', 'Ohio', 'Colorado'], ['Green', 'Red', 'Green']])
print("frame is \n", frame)
frame.index.names = ["key1", "key2"]
frame.columns.names = ["state", "color"]
print("New frame is \n", frame)
print("frame['Ohio'] \n", frame['Ohio'])
print("frame.swaplevel('key1','key2') \n", frame.swaplevel('key1', 'key2'))
# sortlevel() has been removed from pandas; sort_index(level=...) does the same job
print("frame.sort_index(level=1) \n", frame.sort_index(level=1))
print("frame.swaplevel(0,1).sort_index(level=0)\n", frame.swaplevel(0, 1).sort_index(level=0))

print("summary statistics by level")
# sum(level=...) has been removed from pandas; group by the index level instead
print("frame.groupby(level='key2').sum() \n", frame.groupby(level='key2').sum())
print("frame.T.groupby(level='color').sum().T \n", frame.T.groupby(level='color').sum().T)

print("Using a DataFrame's columns")
frame = DataFrame({'a': range(7), 'b': range(7, 0, -1),
                   'c': ['one', 'one', 'one', 'two', 'two', 'two', 'two'],
                   'd': [0, 1, 2, 0, 1, 2, 3]})
print("frame is \n", frame)
frame2 = frame.set_index(['c', 'd'])
print("creating a new DataFrame using one or more of its columns as the index")
print("frame.set_index(['c','d']) \n", frame2)
frame.set_index(['c', 'd'], drop=False)  # result is discarded; frame itself is unchanged
print("frame.set_index(['c','d'],drop =False) \n", frame.set_index(['c', 'd'], drop=False))
print("reset_index does the opposite of set_index, the hierarchical index levels are moved into the columns")
print("frame2.reset_index() \n", frame2.reset_index())

http://www.xuebuyuan.com/2180572.html
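The missing-value examples above use random numbers, so their printed output changes on every run. As a complement, here is a minimal sketch of the same dropna / fillna / forward-fill steps on a small fixed frame, so each result can be checked by hand. The column names `x` and `y` and the values are made up for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical data with fixed values, chosen so the results are easy to verify
df = pd.DataFrame(
    {
        "x": [1.0, np.nan, 3.0, np.nan],
        "y": [np.nan, np.nan, 6.0, 8.0],
    }
)

# Number of missing values in each column
missing_per_column = df.isnull().sum()

# Drop only the rows in which every value is missing (row 1 here)
rows_with_data = df.dropna(how="all")

# Keep only the rows with at least 2 non-missing values (row 2 here)
complete_rows = df.dropna(thresh=2)

# A dict gives each column its own fill value: the mean for x, 0 for y
filled = df.fillna({"x": df["x"].mean(), "y": 0})

# Forward-fill, carrying each value across at most one consecutive gap;
# the leading NaNs in y stay, since there is nothing before them to carry forward
forward_filled = df.ffill(limit=1)

print(missing_per_column)
print(rows_with_data)
print(complete_rows)
print(filled)
print(forward_filled)
```

None of these calls modify `df`; each one returns a new object, which is why the same frame can be reused for every step.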
Reposted from: https://www.cnblogs.com/wutongyuhou/p/6888148.html