Analyzing Android apps with Python, part 2
This post continues from the previous one, "用python对android APP进行分析1" (part 1 of this series).
Converting the data types of the other columns
data['Reviews']=data['Reviews'].astype('int32')  # astype has no inplace argument; np.int was removed in NumPy 1.24
data.Reviews.head()
0 159
1 967
2 87510
3 215644
4 967
Name: Reviews, dtype: int32
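The cast above raises an error if any entry is not a valid integer string. As a minimal sketch, using a small made-up Series (the values, including the malformed '3.0M' entry, are illustrative assumptions and not taken from the dataset), `pd.to_numeric` with `errors='coerce'` could be used to locate such rows before casting:

```python
import pandas as pd

# Hypothetical sample; '3.0M' stands in for a malformed entry.
reviews = pd.Series(['159', '967', '3.0M', '87510'])

# Values that cannot be parsed become NaN instead of raising an error.
parsed = pd.to_numeric(reviews, errors='coerce')

# Inspect the rows that failed to parse before deciding how to handle them.
bad_rows = reviews[parsed.isna()]
print(bad_rows)

# Once the bad rows are dropped (or repaired), the integer cast is safe.
clean = parsed.dropna().astype('int32')
print(clean.dtype)
```

Whether to drop or repair the failing rows depends on the data; the sketch simply drops them.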
print(data[~data.Size.str.contains('M')].head())
App Category Rating Reviews \
37 Floor Plan Creator ART_AND_DESIGN 4.1 36639
42 Textgram - write on photos ART_AND_DESIGN 4.4 295221
52 Used Cars and Trucks for Sale AUTO_AND_VEHICLES 4.6 17057
58 Restart Navigator AUTO_AND_VEHICLES 4.0 1403
67 Ulysse Speedometer AUTO_AND_VEHICLES 4.3 40211
Size Installs Type Price Content Rating Genres \
37 Varies with device 5000000 Free 0 Everyone Art & Design
42 Varies with device 10000000 Free 0 Everyone Art & Design
52 Varies with device 1000000 Free 0 Everyone Auto & Vehicles
58 201k 100000 Free 0 Everyone Auto & Vehicles
67 Varies with device 5000000 Free 0 Everyone Auto & Vehicles
Last Updated Current Ver Android Ver installs_range
37 July 14, 2018 Varies with device 2.3.3 and up 百万+
42 July 30, 2018 Varies with device Varies with device 百万+
52 July 30, 2018 Varies with device Varies with device 十万+
58 August 26, 2014 1.0.1 2.2 and up 万+
67 July 30, 2018 Varies with device Varies with device 百万+
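Roughly three kinds of Size values appear: kilobyte values (k), megabyte values (M), and undetermined ones ("Varies with device").
# Define a function that normalizes every size to one unit (KB)
def size_normal(x):
    if 'M' in x.upper():
        # megabytes -> kilobytes (1M treated as 1000k)
        return float(x.upper().replace('M',''))*1000
    elif 'k' in x.lower():
        return float(x.lower().replace('k',''))
    else:
        # "Varies with device" has no numeric size
        return np.nan
data.Size.map(size_normal)[[1,146,10595]]  # check that the conversion worked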
1 14000.0
146 NaN
10595 470.0
Name: Size, dtype: float64
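The same normalization can also be written without a row-by-row function. The following is only a sketch of one alternative, run on a few made-up values in the three formats seen above; the regular expression and the variable names are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sample values in the three observed formats.
size = pd.Series(['19M', '8.7M', '201k', 'Varies with device'])

# Split each entry into a numeric part and a unit letter.
# Entries that do not match the pattern yield NaN in both columns.
parts = size.str.extract(r'^([\d.]+)([Mk])$')
number = parts[0].astype(float)
factor = parts[1].map({'M': 1000, 'k': 1})

# Multiply to get kilobytes; NaN propagates for unmatched entries.
size_k = number * factor
print(size_k)
```

This version makes the accepted formats explicit in one pattern, at the cost of being stricter than `size_normal` about unexpected input.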
data['size_k']=data.Size.map(size_normal)
print(data.head())
App Category Rating \
0 Photo Editor & Candy Camera & Grid & ScrapBook ART_AND_DESIGN 4.1
1 Coloring book moana ART_AND_DESIGN 3.9
2 U Launcher Lite – FREE Live Cool Themes, Hide ... ART_AND_DESIGN 4.7
3 Sketch - Draw & Paint ART_AND_DESIGN 4.5
4 Pixel Draw - Number Art Coloring Book ART_AND_DESIGN 4.3
Reviews Size Installs Type Price Content Rating \
0 159 19M 10000 Free 0 Everyone
1 967 14M 500000 Free 0 Everyone
2 87510 8.7M 5000000 Free 0 Everyone
3 215644 25M 50000000 Free 0 Teen
4 967 2.8M 100000 Free 0 Everyone
Genres Last Updated Current Ver \
0 Art & Design January 7, 2018 1.0.0
1 Art & Design;Pretend Play January 15, 2018 2.0.0
2 Art & Design August 1, 2018 1.2.4
3 Art & Design June 8, 2018 Varies with device
4 Art & Design;Creativity June 20, 2018 1.1
Android Ver installs_range size_k
0 4.0.3 and up 千+ 19000.0
1 4.0.3 and up 十万+ 14000.0
2 4.0.3 and up 百万+ 8700.0
3 4.2 and up 千万+ 25000.0
4 4.4 and up 万+ 2800.0
Converting the last-updated time
from dateutil.parser import parse
def time_normal(time):
    # parse a string such as 'January 7, 2018' into a datetime
    return parse(time)
data['Last Updated']=data['Last Updated'].map(time_normal)
print(data.head())
App Category Rating \
0 Photo Editor & Candy Camera & Grid & ScrapBook ART_AND_DESIGN 4.1
1 Coloring book moana ART_AND_DESIGN 3.9
2 U Launcher Lite – FREE Live Cool Themes, Hide ... ART_AND_DESIGN 4.7
3 Sketch - Draw & Paint ART_AND_DESIGN 4.5
4 Pixel Draw - Number Art Coloring Book ART_AND_DESIGN 4.3
Reviews Size Installs Type Price Content Rating \
0 159 19M 10000 Free 0 Everyone
1 967 14M 500000 Free 0 Everyone
2 87510 8.7M 5000000 Free 0 Everyone
3 215644 25M 50000000 Free 0 Teen
4 967 2.8M 100000 Free 0 Everyone
Genres Last Updated Current Ver Android Ver \
0 Art & Design 2018-01-07 1.0.0 4.0.3 and up
1 Art & Design;Pretend Play 2018-01-15 2.0.0 4.0.3 and up
2 Art & Design 2018-08-01 1.2.4 4.0.3 and up
3 Art & Design 2018-06-08 Varies with device 4.2 and up
4 Art & Design;Creativity 2018-06-20 1.1 4.4 and up
installs_range size_k
0 千+ 19000.0
1 十万+ 14000.0
2 百万+ 8700.0
3 千万+ 25000.0
4 万+ 2800.0
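Last Updated is now stored as a datetime. If this column were set as the index, the data could be handled with time-series methods, but that is outside the scope of this analysis.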
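For readers curious about the time-series route mentioned here, the following is a rough sketch on a made-up four-row frame (the app names and the chosen dates are illustrative assumptions). It uses `pd.to_datetime` instead of `dateutil` and then sets the dates as the index:

```python
import pandas as pd

# Hypothetical miniature of the dataset: an app name and its update date.
df = pd.DataFrame({
    'App': ['a', 'b', 'c', 'd'],
    'Last Updated': ['January 7, 2018', 'January 15, 2018',
                     'June 8, 2018', 'August 26, 2014'],
})

# pd.to_datetime parses the whole column in one call.
df['Last Updated'] = pd.to_datetime(df['Last Updated'])

# With the dates as a sorted index, time-series tools become available.
ts = df.set_index('Last Updated').sort_index()

# Partial string indexing: all apps last updated in January 2018.
jan_2018 = ts.loc['2018-01']

# Count how many apps were last updated in each calendar year.
per_year = ts.groupby(ts.index.year)['App'].count()
print(per_year)
```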
Checking for outliers
print(data.describe())
Rating Reviews Installs size_k
count 10841.000000 1.084100e+04 1.084100e+04 9146.000000
mean 4.190739 4.441119e+05 1.546291e+07 21514.504975
std 0.479738 2.927629e+06 8.502557e+07 22588.342683
min 1.000000 0.000000e+00 0.000000e+00 8.500000
25% 4.100000 3.800000e+01 1.000000e+03 4900.000000
50% 4.200000 2.094000e+03 1.000000e+05 13000.000000
75% 4.500000 5.476800e+04 5.000000e+06 30000.000000
max 5.000000 7.815831e+07 1.000000e+09 100000.000000
The numeric columns show no outliers: ratings stay between 1 and 5, and no count is negative. Price will be converted later.
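Reading the min and max rows of `describe()` is a manual check. As a sketch of how the same range check might be made explicit, the snippet below uses a made-up frame with two deliberately invalid values (a rating of 19.0 and a negative install count, both assumptions for illustration):

```python
import pandas as pd

# Hypothetical rows; the 19.0 rating and the -1 install count are invalid on purpose.
df = pd.DataFrame({
    'Rating': [4.1, 3.9, 19.0],
    'Reviews': [159, 967, 4],
    'Installs': [10000, 500000, -1],
})

# Ratings are expected to lie between 1 and 5 inclusive.
bad_rating = df[~df['Rating'].between(1, 5)]

# Counts such as Reviews and Installs should never be negative.
negative = df[(df[['Reviews', 'Installs']] < 0).any(axis=1)]

print(bad_rating.index.tolist(), negative.index.tolist())
```

On the real data both selections would be expected to come back empty, which matches the conclusion drawn from the summary table.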
Removing duplicates
data.duplicated().sum()
483
data.drop_duplicates(inplace=True)
data.info()
Int64Index: 10358 entries, 0 to 10840
Data columns (total 15 columns):
App 10358 non-null object
Category 10358 non-null object
Rating 10358 non-null float64
Reviews 10358 non-null int32
Size 10358 non-null object
Installs 10358 non-null int32
Type 10358 non-null object
Price 10358 non-null object
Content Rating 10358 non-null object
Genres 10358 non-null object
Last Updated 10358 non-null datetime64[ns]
Current Ver 10350 non-null object
Android Ver 10356 non-null object
installs_range 10358 non-null category
size_k 8832 non-null float64
dtypes: category(1), datetime64[ns](1), float64(2), int32(2), object(9)
memory usage: 1.1+ MB
data.to_csv(r'C:\Users\19078\Desktop\中级\第三关\android_data.csv',sep=',',encoding='utf_8_sig')  # save the data in CSV format
Data analysis
Effect of category on the number of reviews
a=pd.pivot_table(data,columns='Type',index='Category',values='Reviews',aggfunc='mean').sort_values(by='Free',ascending=False)[:10]
b=pd.pivot_table(data,columns='Type',index='Category',values='Reviews',aggfunc='mean').sort_values(by='Paid',ascending=False)[:10]
a['Free'].plot(kind='bar',rot=60)
b['Paid'].plot(kind='bar',rot=60)
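Comparing the two charts, the average number of reviews differs considerably between app categories. Among free apps, games, social, and chat apps receive the most reviews, while among paid apps it is family, game, and weather apps. Both the app category and whether the app is paid therefore have some effect on the review count.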
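To make the shape of the pivot table behind these charts easier to see, here is a small sketch on made-up data (the categories and review counts are illustrative assumptions). It builds the table once and takes the Free and Paid rankings from it:

```python
import pandas as pd

# Hypothetical sample: six apps across three categories.
df = pd.DataFrame({
    'Category': ['GAME', 'GAME', 'GAME', 'SOCIAL', 'SOCIAL', 'WEATHER'],
    'Type':     ['Free', 'Free', 'Paid', 'Free', 'Paid', 'Paid'],
    'Reviews':  [100, 300, 50, 400, 10, 80],
})

# One row per category, one column per type, cells hold the mean review count.
table = pd.pivot_table(df, index='Category', columns='Type',
                       values='Reviews', aggfunc='mean')

# Rank categories separately for each type; categories with no apps
# of that type are NaN in the table and are dropped here.
top_free = table['Free'].dropna().sort_values(ascending=False)
top_paid = table['Paid'].dropna().sort_values(ascending=False)

print(table)
```

Building the table once and sorting each column avoids computing the same pivot twice.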
Relationship between category and app size
a=pd.pivot_table(data,index='Category',values='size_k',aggfunc='mean').sort_values(by='size_k',ascending=False)[:15]
print(a)
size_k
Category
GAME 44126.850000
FAMILY 27930.435770
TRAVEL_AND_LOCAL 24515.994413
SPORTS 24181.192568
ENTERTAINMENT 22638.805970
PARENTING 22512.962963
FOOD_AND_DRINK 22056.122449
HEALTH_AND_FITNESS 21643.216667
EDUCATION 20076.895833
AUTO_AND_VEHICLES 20037.146667
MEDICAL 19383.681579
FINANCE 17937.730263
SOCIAL 16875.827586
PHOTOGRAPHY 16832.045267
MAPS_AND_NAVIGATION 16614.712963
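App size differs by category, with games being the largest. Average sizes are generally in the tens of megabytes, which suggests that somewhere between ten and a few tens of megabytes is a reasonable size to aim for.
Which categories of paid apps are priced higher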
data_paid=data[data.Type.isin(['Paid'])]
print(data_paid.head())
App Category Rating \
234 TurboScan: scan documents and receipts in PDF BUSINESS 4.7
235 Tiny Scanner Pro: PDF Doc Scan BUSINESS 4.8
427 Puffin Browser Pro COMMUNICATION 4.0
476 Moco+ - Chat, Meet People DATING 4.2
477 Calculator DATING 2.6
Reviews Size Installs Type Price Content Rating \
234 11442 6.8M 100000 Paid $4.99 Everyone
235 10295 39M 100000 Paid $4.99 Everyone
427 18247 Varies with device 100000 Paid $3.99 Everyone
476 1545 Varies with device 10000 Paid $3.99 Mature 17+
477 57 6.2M 1000 Paid $6.99 Everyone
Genres Last Updated Current Ver Android Ver installs_range \
234 Business 2018-03-25 1.5.2 4.0 and up 万+
235 Business 2017-04-11 3.4.6 3.0 and up 万+
427 Communication 2018-07-05 7.5.3.20547 4.1 and up 万+
476 Dating 2018-06-19 2.6.139 4.1 and up 千+
477 Dating 2017-10-25 1.1.6 4.0 and up 百+
size_k
234 6800.0
235 39000.0
427 NaN
476 NaN
477 6200.0
data_paid.Price=data_paid.Price.str.replace('$','',regex=False).astype('float')  # regex=False: treat '$' as a literal character
a=data_paid.groupby('Category')['Price'].agg(['mean','count']).sort_values(by='mean',ascending=False)[:15]
print(a)
mean count
Category
FINANCE 170.637059 17
LIFESTYLE 124.256316 19
EVENTS 109.990000 1
BUSINESS 14.607500 12
FAMILY 12.945561 187
MEDICAL 12.151071 84
PRODUCTIVITY 8.961786 28
PHOTOGRAPHY 6.111500 20
MAPS_AND_NAVIGATION 5.390000 5
SOCIAL 5.323333 3
PARENTING 4.790000 2
DATING 4.490000 7
EDUCATION 4.490000 4
AUTO_AND_VEHICLES 4.490000 3
HEALTH_AND_FITNESS 4.290000 15
C:\Users\19078\Anaconda3\envs\py\lib\site-packages\pandas\core\generic.py:4405: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[name] = value
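The SettingWithCopyWarning above is raised because `data_paid` was created by filtering `data`, so pandas cannot tell whether the assignment is meant to affect the original frame. A common way to avoid it is to take an explicit copy when filtering. A minimal sketch on made-up rows (the frame `df` and its values are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample with price strings as they appear in the dataset.
df = pd.DataFrame({
    'Category': ['BUSINESS', 'DATING', 'GAME'],
    'Type': ['Paid', 'Paid', 'Free'],
    'Price': ['$4.99', '$6.99', '0'],
})

# .copy() makes paid an independent DataFrame rather than a slice of df.
paid = df[df['Type'] == 'Paid'].copy()

# regex=False treats '$' as a literal character, not an end-of-string anchor.
paid['Price'] = paid['Price'].str.replace('$', '', regex=False).astype(float)

print(paid)
```

The original frame keeps its string prices; only the copy is converted.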
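From the results above, finance, lifestyle, and events apps have the highest average prices, although the EVENTS figure is based on a single paid app.
Paid-to-free ratio by category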
data.size
155370
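def p_f_rate(group):
    # len() counts rows; DataFrame.size counts cells (rows x columns)
    paid=len(group[group['Type'].isin(['Paid'])])
    free=len(group[group['Type'].isin(['Free'])])
    # ratio of paid apps to free apps within the category
    return round(paid/free,2)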
data.groupby('Category').apply(p_f_rate).sort_values(ascending=False)[:15]
Category
PERSONALIZATION 0.27
MEDICAL 0.26
BOOKS_AND_REFERENCE 0.14
WEATHER 0.11
FAMILY 0.11
TOOLS 0.10
COMMUNICATION 0.08
GAME 0.08
SPORTS 0.07
PRODUCTIVITY 0.07
PHOTOGRAPHY 0.07
LIFESTYLE 0.05
FINANCE 0.05
HEALTH_AND_FITNESS 0.05
ART_AND_DESIGN 0.05
dtype: float64
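Note that the figure above is the number of paid apps divided by the number of free apps, which is not the same as the share of paid apps among all apps. The sketch below, on made-up data (the category counts are illustrative assumptions), computes both with `pd.crosstab` so the difference is visible:

```python
import pandas as pd

# Hypothetical sample: MEDICAL has 1 paid and 4 free apps, GAME has 4 free apps.
df = pd.DataFrame({
    'Category': ['MEDICAL'] * 5 + ['GAME'] * 4,
    'Type': ['Paid', 'Free', 'Free', 'Free', 'Free',
             'Free', 'Free', 'Free', 'Free'],
})

# Count apps per category and type; make sure both columns exist.
counts = pd.crosstab(df['Category'], df['Type'])
counts = counts.reindex(columns=['Free', 'Paid'], fill_value=0)

# Paid-to-free ratio, as computed by p_f_rate.
ratio = (counts['Paid'] / counts['Free']).round(2)

# Share of paid apps among all apps in the category.
share = (counts['Paid'] / (counts['Free'] + counts['Paid'])).round(2)

print(pd.DataFrame({'ratio': ratio, 'share': share}))
```

For small ratios the two numbers are close, so the ranking of categories is unlikely to change much.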
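The categories with the highest paid ratio are personalization and medical apps. Looking across all categories, most apps are free regardless of type, so the internet's free-first mindset is a key consideration for operations.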