Statistics for numeric and non-numeric data in pandas

Statistics on a single column

Loading the data

import pandas as pd
# Reading .xlsx files needs an Excel engine such as openpyxl to be installed
detail = pd.read_excel('./meal_order_detail.xlsx')

The common methods for numeric statistics are as follows:

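Compute the statistics for the unit price column (amounts) in detail:
print('max', detail.loc[:, 'amounts'].max())
print('min', detail.loc[:, 'amounts'].min())
print('mean', detail.loc[:, 'amounts'].mean())
print('median', detail.loc[:, 'amounts'].median())
print('variance', detail.loc[:, 'amounts'].var())
# Series.ptp() was removed in pandas 1.0, so the range is computed as max - min
print('range', detail.loc[:, 'amounts'].max() - detail.loc[:, 'amounts'].min())
print('standard deviation', detail.loc[:, 'amounts'].std())
print('mode', detail.loc[:, 'amounts'].mode())
print('number of non-null values', detail.loc[:, 'amounts'].count())
print('index of the max value', detail.loc[:, 'amounts'].idxmax())
print('index of the min value', detail.loc[:, 'amounts'].idxmin())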
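
Since meal_order_detail.xlsx is not included here, the following is a minimal sketch on a small made-up DataFrame. The values are hypothetical; only the column name amounts comes from the text. It shows that several of the statistics above can be collected in one call with Series.agg, and that the range can be derived as max minus min.

```python
import pandas as pd

# Hypothetical stand-in for the 'amounts' column of meal_order_detail.xlsx
demo = pd.DataFrame({'amounts': [10.0, 20.0, 20.0, 50.0]})

# Several statistics in one call; the result is a Series indexed by statistic name
summary = demo['amounts'].agg(['max', 'min', 'mean', 'median', 'var', 'std', 'count'])
print(summary)

# Range (peak to peak) computed as max - min
value_range = demo['amounts'].max() - demo['amounts'].min()
print('range', value_range)

# mode() returns a Series because a column can have more than one mode
modes = demo['amounts'].mode().tolist()
print('mode', modes)

# idxmax() / idxmin() return the index label of the first extreme value
pos_max = demo['amounts'].idxmax()
pos_min = demo['amounts'].idxmin()
print('idxmax', pos_max, 'idxmin', pos_min)
```

Note that idxmax and idxmin return index labels, not integer positions; the two only coincide when the DataFrame has the default 0..n-1 index, as it does in this sketch.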

For numeric data, describe returns 8 statistics: count, mean, std, min, the 25%, 50% and 75% quantiles, and max.

print('describe',detail.loc[:,'amounts'].describe())
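
To make the 8 statistics concrete, here is a small sketch on made-up values (hypothetical data, not taken from the order file). The result of describe is itself a Series, so individual statistics can be read by label.

```python
import pandas as pd

# Hypothetical values; only the name 'amounts' comes from the text
amounts = pd.Series([10.0, 20.0, 20.0, 50.0], name='amounts')

stats = amounts.describe()
print(stats)

# The index of the result holds the names of the 8 statistics
stat_names = stats.index.tolist()
print(stat_names)

# Individual statistics are read by label; '50%' is the median
print('median via describe:', stats['50%'])
```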

Statistics on multiple columns

The form is as follows:
print('describe', detail.loc[:, ['amounts', 'counts']].describe())
In short, pass a list of column names in the column position of loc.
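
As a rough illustration of the multi-column case, the sketch below uses a made-up DataFrame (the values and the extra dishes_name entries are hypothetical). With a list of columns, describe returns a DataFrame with one column per selected column and one row per statistic.

```python
import pandas as pd

# Hypothetical data that mimics three columns of the order detail table
demo = pd.DataFrame({
    'amounts': [10.0, 20.0, 20.0, 50.0],
    'counts': [1, 2, 1, 1],
    'dishes_name': ['a', 'b', 'b', 'c'],
})

# Select two numeric columns with a list of names, then describe them together
both = demo.loc[:, ['amounts', 'counts']].describe()
print(both)

# Rows are statistics and columns are the original column names
mean_counts = both.loc['mean', 'counts']
print('mean of counts:', mean_counts)
```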

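Statistics on non-numeric data

Converting the data type of a DataFrame column

Converting another type to object: for non-numeric data, describe returns 4 values (count, unique, top, freq).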

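# Assign by column name so the new dtype is kept; in pandas 2.x, assigning
# through .loc[:, col] writes into the existing column and may keep the old dtype
detail['amounts'] = detail.loc[:, 'amounts'].astype('object')
print(detail.loc[:, 'amounts'].describe())
print(detail.loc[:, 'amounts'].dtypes)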

Converting another type to categorical data

# Assign by column name so the category dtype is kept
detail['amounts'] = detail.loc[:, 'amounts'].astype('category')
print(detail.loc[:, 'amounts'].describe())
print(detail.loc[:, 'amounts'].dtypes)
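
The 4 values returned for non-numeric data can be seen in a small sketch. The dish names below are made up for illustration; only the column name dishes_name comes from the text.

```python
import pandas as pd

# Hypothetical dish names
demo = pd.DataFrame({'dishes_name': ['rice', 'soup', 'rice', 'tea']})

# describe on an object column and on a category column
as_object = demo['dishes_name'].astype('object').describe()
as_category = demo['dishes_name'].astype('category').describe()

# Both results carry the same 4 entries
print(as_object.index.tolist())
print(as_category.index.tolist())

# 'top' is the most frequent value and 'freq' is how often it occurs
print(as_category['top'], as_category['freq'])
```

Here count is the number of non-null values and unique is the number of distinct values that actually occur.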

Which dish in detail is the most popular, and how many times was it sold?

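detail['dishes_name'] = detail.loc[:, 'dishes_name'].astype('category')
print('Descriptive statistics by dishes_name:', detail.loc[:, 'dishes_name'].describe())
The most popular item here turns out to be 白饭/大碗 (a large bowl of white rice), but that is not really a dish, so the statistics are recomputed without it.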

Removing 白饭/大碗 from the data

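# Boolean mask marking the rows whose dish is 白饭/大碗
bool_id = detail.loc[:, 'dishes_name'] == '白饭/大碗'
# Index labels of those rows
index = detail.loc[bool_id, :].index
# Drop those rows in place
detail.drop(labels=index, axis=0, inplace=True)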

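Convert the data type again and assign the result back to the data itself.

detail['dishes_name'] = detail.loc[:, 'dishes_name'].astype('category')
# Compute the descriptive statistics again
print("Descriptive statistics by dishes_name:\n", detail.loc[:, 'dishes_name'].describe())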
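
As an alternative sketch of the same idea, the rows can be filtered with a boolean mask instead of drop, and value_counts can rank every dish rather than reporting only the top one. The data below is made up; 'dish A' and 'dish B' are hypothetical names, and only 白饭/大碗 and dishes_name come from the text.

```python
import pandas as pd

# Hypothetical order rows; each row is one dish entry in an order
demo = pd.DataFrame({
    'dishes_name': ['白饭/大碗', 'dish A', '白饭/大碗', 'dish B', 'dish A', '白饭/大碗'],
})

# Keep every row whose dish is not the rice item (no in-place drop needed)
filtered = demo.loc[demo['dishes_name'] != '白饭/大碗', :]

# value_counts sorts dishes by how many rows they appear in, most frequent first
ranking = filtered['dishes_name'].value_counts()
print(ranking.head(3))

# The first entry matches what describe reports as 'top' and 'freq'
top_dish = ranking.index[0]
top_count = ranking.iloc[0]
print(top_dish, top_count)
```

Like describe, this counts rows. If each row can stand for more than one portion, summing a quantity column per dish would be the closer answer to "how many were sold".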

Which order in detail contains the most dish entries, and how many entries does it have?

Convert order_id to categorical data, then call describe.

detail['order_id'] = detail.loc[:, 'order_id'].astype('category')
print('Descriptive statistics by order_id:', detail.loc[:, 'order_id'].describe())
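
The top and freq values that describe reports for order_id can be cross-checked with value_counts. The sketch below uses made-up order ids and quantities (hypothetical data; only the column names order_id and counts come from the text) and also sums counts per order, which is a different measure from the number of rows.

```python
import pandas as pd

# Hypothetical rows: one row per dish entry, 'counts' is the quantity of that entry
demo = pd.DataFrame({
    'order_id': [101, 101, 102, 101, 103, 102],
    'counts': [1, 2, 1, 1, 1, 3],
})

# Number of rows (dish entries) per order, which is what describe's freq counts
rows_per_order = demo['order_id'].value_counts()
busiest_order = rows_per_order.idxmax()
print('order with the most entries:', busiest_order, rows_per_order.loc[busiest_order])

# Total portions per order, taking the quantity of each entry into account
portions_per_order = demo.groupby('order_id')['counts'].sum()
print(portions_per_order)
```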

Recommendation: for non-numeric columns, convert the data type to category first, then compute the statistics.
