Learning R packages: using melt and cast from the reshape package, and merge from base R
Contents
- 1. melt:Melt an object into a form suitable for easy casting.
- 2. cast:Cast a molten data frame into the reshaped or aggregated form you want
- 3. merge:Merge two data frames by common columns or row names, or do other versions of database join operations.
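- (1) Data preparation
- (2) By default, merge on the columns both data frames share (identical column names)
- (3) When the two data frames share no column names, specify the columns to merge on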
- i. inner_join
- ii. left_join
- iii. right_join
- iv. full_join
- 4. example of using 'incomparables'
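All examples in this article are taken from the official documentation; in RStudio, type ? followed by the function name to view them. melt, cast, and the tips dataset require library(reshape); merge is part of base R.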
1. melt:Melt an object into a form suitable for easy casting.
> head(tips)
total_bill tip sex smoker day time size
1 16.99 1.01 Female No Sun Dinner 2
2 10.34 1.66 Male No Sun Dinner 3
3 21.01 3.50 Male No Sun Dinner 3
4 23.68 3.31 Male No Sun Dinner 2
5 24.59 3.61 Female No Sun Dinner 4
6 25.29 4.71 Male No Sun Dinner 4
> head(melt(tips))
Using sex, smoker, day, time as id variables
sex smoker day time variable value
1 Female No Sun Dinner total_bill 16.99
2 Male No Sun Dinner total_bill 10.34
3 Male No Sun Dinner total_bill 21.01
4 Male No Sun Dinner total_bill 23.68
5 Female No Sun Dinner total_bill 24.59
6 Male No Sun Dinner total_bill 25.29
>
> head(airquality)
ozone solar.r wind temp month day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6
> names(airquality) <- tolower(names(airquality))
> head(melt(airquality, id=c("month", "day")))
month day variable value
1 5 1 ozone 41
2 5 2 ozone 36
3 5 3 ozone 12
4 5 4 ozone 18
5 5 5 ozone NA
6 5 6 ozone 28
>
> head(ChickWeight)
weight time chick diet
1 42 0 1 1
2 51 2 1 1
3 59 4 1 1
4 64 6 1 1
5 76 8 1 1
6 93 10 1 1
> names(ChickWeight) <- tolower(names(ChickWeight))
> head(melt(ChickWeight, id=2:4))
time chick diet variable value
1 0 1 1 weight 42
2 2 1 1 weight 51
3 4 1 1 weight 59
4 6 1 1 weight 64
5 8 1 1 weight 76
6 10 1 1 weight 93
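The examples above let melt guess the id variables or pass them by name or position. As a minimal sketch of the same idea on a tiny data frame, the block below names both the id and the measured columns explicitly. The data frame `wide` and its column names are hypothetical and not part of the official examples; the reshape package is assumed to be installed.

```r
library(reshape)

# Hypothetical toy data: two patients, two measured variables.
wide <- data.frame(
  patient = c("a", "b"),
  height  = c(170, 182),
  weight  = c(65, 80)
)

# id.vars are kept as they are; each column listed in measure.vars
# is stacked into the (variable, value) pair of columns.
long <- melt(wide, id.vars = "patient",
             measure.vars = c("height", "weight"))
print(long)
```

Each measured column contributes one row per patient, so the molten frame has 2 x 2 = 4 rows.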
2. cast:Cast a molten data frame into the reshaped or aggregated form you want
> names(airquality) <- tolower(names(airquality))
> aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE)
> head(aqm)
month day variable value
1 5 1 ozone 41
2 5 2 ozone 36
3 5 3 ozone 12
4 5 4 ozone 18
5 5 6 ozone 28
6 5 7 ozone 23
> str(cast(aqm, day ~ month ~ variable))
 num [1:31, 1:5, 1:4] 41 36 12 18 NA 28 23 19 8 NA ...
 - attr(*, "dimnames")=List of 3
  ..$ day     : Named chr [1:31] "1" "2" "3" "4" ...
  .. ..- attr(*, "names")= chr [1:31] "1" "20" "39" "58" ...
  ..$ month   : Named chr [1:5] "5" "6" "7" "8" ...
  .. ..- attr(*, "names")= chr [1:5] "1" "600" "8" "12" ...
  ..$ variable: Named chr [1:4] "ozone" "solar.r" "wind" "temp"
  .. ..- attr(*, "names")= chr [1:4] "1" "2" "3" "4"
> cast(aqm, month ~ variable, mean)
month ozone solar.r wind temp
1 5 23.61538 181.2963 11.622581 65.54839
2 6 29.44444 190.1667 10.266667 79.10000
3 7 59.11538 216.4839 8.941935 83.90323
4 8 59.96154 171.8571 8.793548 83.96774
5 9 31.44828 167.4333 10.180000 76.90000
> str(cast(aqm, month ~ variable, mean))
List of 5
 $ month  : int [1:5] 5 6 7 8 9
 $ ozone  : num [1:5] 23.6 29.4 59.1 60 31.4
 $ solar.r: num [1:5] 181 190 216 172 167
 $ wind   : num [1:5] 11.62 10.27 8.94 8.79 10.18
 $ temp   : num [1:5] 65.5 79.1 83.9 84 76.9
 - attr(*, "row.names")= int [1:5] 1 2 3 4 5
 - attr(*, "idvars")= chr "month"
 - attr(*, "rdimnames")=List of 2
  ..$ :'data.frame': 5 obs. of 1 variable:
  .. ..$ month: int [1:5] 5 6 7 8 9
  ..$ :'data.frame': 4 obs. of 1 variable:
  .. ..$ variable: Factor w/ 4 levels "ozone","solar.r",..: 1 2 3 4
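To make the aggregation step in cast easier to follow, here is a small sketch in which the result can be checked by hand. The data frame `wide` and the names `group`, `x`, and `y` are hypothetical; only melt and cast from the reshape package are used, in the same way as in the airquality example.

```r
library(reshape)

# Hypothetical toy data: group g1 has two rows, group g2 has one.
wide <- data.frame(
  group = factor(c("g1", "g1", "g2")),
  x = c(1, 3, 5),
  y = c(10, 30, 50)
)

# Melt to long form, then cast back with one row per group and one
# column per variable. Because g1 has two values per variable,
# the aggregation function (mean) is needed to collapse them.
long <- melt(wide, id.vars = "group")
agg  <- cast(long, group ~ variable, mean)
print(agg)
```

The left-hand side of the formula defines the rows and the right-hand side defines the columns, which is the same pattern as `month ~ variable` in the example above.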
3. merge:Merge two data frames by common columns or row names, or do other versions of database join operations.
(1) Data preparation
> authors <- data.frame(
+ ## I(*) : use character columns of names to get sensible sort order
+ surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
+ nationality = c("US", "Australia", "US", "UK", "Australia"),
+ deceased = c("yes", rep("no", 4)))
> authors
surname nationality deceased
1 Tukey US yes
2 Venables Australia no
3 Tierney US no
4 Ripley UK no
5 McNeil Australia no
# Make a copy in which the surname column is renamed to name (add name, then remove surname)
> authorN <- within(authors, { name <- surname; rm(surname) })
> authorN
nationality deceased name
1 US yes Tukey
2 Australia no Venables
3 US no Tierney
4 UK no Ripley
5 Australia no McNeil
> books <- data.frame(
+ name = I(c("Tukey", "Venables", "Tierney",
+ "Ripley", "Ripley", "McNeil", "R Core")),
+ title = c("Exploratory Data Analysis",
+ "Modern Applied Statistics ...",
+ "LISP-STAT",
+ "Spatial Statistics", "Stochastic Simulation",
+ "Interactive Data Analysis",
+ "An Introduction to R"),
+ other.author = c(NA, "Ripley", NA, NA, NA, NA,
+ "Venables & Smith"))
> books
name title other.author
1 Tukey Exploratory Data Analysis <NA>
2 Venables Modern Applied Statistics ... Ripley
3 Tierney LISP-STAT <NA>
4 Ripley Spatial Statistics <NA>
5 Ripley Stochastic Simulation <NA>
6 McNeil Interactive Data Analysis <NA>
7 R Core An Introduction to R Venables & Smith
(2) By default, merge on the columns both data frames share (identical column names)
> (m0 <- merge(authorN, books))
name nationality deceased title other.author
1 McNeil Australia no Interactive Data Analysis <NA>
2 Ripley UK no Spatial Statistics <NA>
3 Ripley UK no Stochastic Simulation <NA>
4 Tierney US no LISP-STAT <NA>
5 Tukey US yes Exploratory Data Analysis <NA>
6 Venables Australia no Modern Applied Statistics ... Ripley
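The default behaviour can be checked directly: when `by` is omitted, merge uses the column names the two data frames have in common. The sketch below uses base R only; the data frames `left` and `right` are hypothetical, cut-down versions of authorN and books.

```r
# Hypothetical cut-down versions of authorN and books.
left  <- data.frame(name = c("Tukey", "Ripley"),
                    nationality = c("US", "UK"))
right <- data.frame(name = c("Ripley", "Tukey"),
                    title = c("Spatial Statistics",
                              "Exploratory Data Analysis"))

# The columns merge() will join on when `by` is not given.
common <- intersect(names(left), names(right))
print(common)

# Omitting `by` and passing the shared columns explicitly
# should give the same result.
implicit <- merge(left, right)
explicit <- merge(left, right, by = common)
print(identical(implicit, explicit))
```

Passing `by` explicitly is usually the safer habit, since an unexpected shared column name would otherwise silently become part of the join key.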
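(3) When the two data frames share no column names, specify the columns to merge on
Note: the merge variants below correspond to the four dplyr join functions (inner_join, left_join, right_join, full_join). merge sorts the result by the key column by default, so its row order can differ from dplyr's.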
i. inner_join
> library(dplyr)
> (m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
surname nationality deceased title other.author
1 McNeil Australia no Interactive Data Analysis <NA>
2 Ripley UK no Spatial Statistics <NA>
3 Ripley UK no Stochastic Simulation <NA>
4 Tierney US no LISP-STAT <NA>
5 Tukey US yes Exploratory Data Analysis <NA>
6 Venables Australia no Modern Applied Statistics ... Ripley
> (inner_join(authors, books, by = c("surname" = "name")))
surname nationality deceased title other.author
1 Tukey US yes Exploratory Data Analysis <NA>
2 Venables Australia no Modern Applied Statistics ... Ripley
3 Tierney US no LISP-STAT <NA>
4 Ripley UK no Spatial Statistics <NA>
5 Ripley UK no Stochastic Simulation <NA>
6 McNeil Australia no Interactive Data Analysis <NA>
> (m2 <- merge(books, authors, by.x = "name", by.y = "surname"))
name title other.author nationality deceased
1 McNeil Interactive Data Analysis <NA> Australia no
2 Ripley Spatial Statistics <NA> UK no
3 Ripley Stochastic Simulation <NA> UK no
4 Tierney LISP-STAT <NA> US no
5 Tukey Exploratory Data Analysis <NA> US yes
6 Venables Modern Applied Statistics ... Ripley Australia no
ii. left_join
> merge(authors, books, by.x = "surname", by.y = "name", all.x = TRUE)
surname nationality deceased title other.author
1 McNeil Australia no Interactive Data Analysis <NA>
2 Ripley UK no Spatial Statistics <NA>
3 Ripley UK no Stochastic Simulation <NA>
4 Tierney US no LISP-STAT <NA>
5 Tukey US yes Exploratory Data Analysis <NA>
6 Venables Australia no Modern Applied Statistics ... Ripley
> left_join(authors, books, by = c("surname" = "name"))
surname nationality deceased title other.author
1 Tukey US yes Exploratory Data Analysis <NA>
2 Venables Australia no Modern Applied Statistics ... Ripley
3 Tierney US no LISP-STAT <NA>
4 Ripley UK no Spatial Statistics <NA>
5 Ripley UK no Stochastic Simulation <NA>
6 McNeil Australia no Interactive Data Analysis <NA>
iii. right_join
> merge(authors, books, by.x = "surname", by.y = "name", all.y = TRUE)
surname nationality deceased title other.author
1 McNeil Australia no Interactive Data Analysis <NA>
2 R Core <NA> <NA> An Introduction to R Venables & Smith
3 Ripley UK no Spatial Statistics <NA>
4 Ripley UK no Stochastic Simulation <NA>
5 Tierney US no LISP-STAT <NA>
6 Tukey US yes Exploratory Data Analysis <NA>
7 Venables Australia no Modern Applied Statistics ... Ripley
> right_join(authors, books, by = c("surname" = "name"))
surname nationality deceased title other.author
1 Tukey US yes Exploratory Data Analysis <NA>
2 Venables Australia no Modern Applied Statistics ... Ripley
3 Tierney US no LISP-STAT <NA>
4 Ripley UK no Spatial Statistics <NA>
5 Ripley UK no Stochastic Simulation <NA>
6 McNeil Australia no Interactive Data Analysis <NA>
7 R Core <NA> <NA> An Introduction to R Venables & Smith
iv. full_join
> merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)
surname nationality deceased title other.author
1 McNeil Australia no Interactive Data Analysis <NA>
2 R Core <NA> <NA> An Introduction to R Venables & Smith
3 Ripley UK no Spatial Statistics <NA>
4 Ripley UK no Stochastic Simulation <NA>
5 Tierney US no LISP-STAT <NA>
6 Tukey US yes Exploratory Data Analysis <NA>
7 Venables Australia no Modern Applied Statistics ... Ripley
> full_join(authors, books, by = c("surname" = "name"))
surname nationality deceased title other.author
1 Tukey US yes Exploratory Data Analysis <NA>
2 Venables Australia no Modern Applied Statistics ... Ripley
3 Tierney US no LISP-STAT <NA>
4 Ripley UK no Spatial Statistics <NA>
5 Ripley UK no Stochastic Simulation <NA>
6 McNeil Australia no Interactive Data Analysis <NA>
7 R Core <NA> <NA> An Introduction to R Venables & Smith
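Note: with this dataset the results of right_join and full_join look identical, because every author in authors has at least one book in books. The underlying rule still differs: a right join keeps every row of the second (y) data frame, while a full join keeps every row of both data frames.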
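Since the official data cannot show the difference between a right join and a full join, the following sketch uses a small hypothetical dataset in which one author has no book and one book entry has no author record. The names `authors2` and `books2` and their contents are assumptions made for illustration; only base R's merge is used.

```r
# Hypothetical data: "Chambers" has no book; "R Core" has no author record.
authors2 <- data.frame(
  surname     = c("Tukey", "Ripley", "Chambers"),
  nationality = c("US", "UK", "US")
)
books2 <- data.frame(
  name  = c("Tukey", "Ripley", "Ripley", "R Core"),
  title = c("Exploratory Data Analysis", "Spatial Statistics",
            "Stochastic Simulation", "An Introduction to R")
)

# inner join: only keys present in both frames
inner_j <- merge(authors2, books2, by.x = "surname", by.y = "name")
# left join: all rows of x (authors2), so Chambers is kept with NA title
left_j  <- merge(authors2, books2, by.x = "surname", by.y = "name",
                 all.x = TRUE)
# right join: all rows of y (books2), so R Core is kept with NA nationality
right_j <- merge(authors2, books2, by.x = "surname", by.y = "name",
                 all.y = TRUE)
# full join: all rows of both frames
full_j  <- merge(authors2, books2, by.x = "surname", by.y = "name",
                 all = TRUE)

# Row counts per join type
print(sapply(list(inner = inner_j, left = left_j,
                  right = right_j, full = full_j), nrow))
```

Here the right join drops Chambers while the full join keeps him, so the two results no longer coincide.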
4. example of using ‘incomparables’
> x <- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
> x
k1 k2 data
1 NA 1 1
2 NA NA 2
3 3 NA 3
4 4 4 4
5 5 5 5
> y <- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)
> y
k1 k2 data
1 NA NA 1
2 2 NA 2
3 NA 3 3
4 4 4 4
5 5 5 5
# inner join
> merge(x, y, by = c("k1","k2")) # NA's match
k1 k2 data.x data.y
1 4 4 4 4
2 5 5 5 5
3 NA NA 2 1
# Merge on a single column; note that each NA key in x matches both NA keys in y
> merge(x, y, by = "k1") # NA's match, so 6 rows
k1 k2.x data.x k2.y data.y
1 4 4 4 4 4
2 5 5 5 5 5
3 NA 1 1 NA 1
4 NA 1 1 3 3
5 NA NA 2 NA 1
6 NA NA 2 3 3
> merge(x, y, by = "k2") # NA's match, so 6 rows
k2 k1.x data.x k1.y data.y
1 4 4 4 4 4
2 5 5 5 5 5
3 NA NA 2 NA 1
4 NA NA 2 2 2
5 NA 3 3 NA 1
6 NA 3 3 2 2
# Exclude NA keys from matching
> merge(x, y, by = "k2", incomparables = NA) # 2 rows
k2 k1.x data.x k1.y data.y
1 4 4 4 4 4
2 5 5 5 5 5