Contents

  • 1. melt: Melt an object into a form suitable for easy casting.
  • 2. cast: Cast a molten data frame into the reshaped or aggregated form you want
  • 3. merge: Merge two data frames by common columns or row names, or do other versions of database join operations.
    • (1) Data preparation
    • (2) By default, merge on the columns that both data frames share (same column names)
    • (3) When the two data frames have no column names in common, specify the columns to merge on
      • i. inner_join
      • ii. left_join
      • iii. right_join
      • iv. full_join
  • 4. example of using 'incomparables'

All examples in this post are the official ones from the function help pages; in RStudio, type ? followed by the function name to view them.

1. melt: Melt an object into a form suitable for easy casting.

> library(reshape)
> head(tips)
  total_bill  tip    sex smoker day   time size
1      16.99 1.01 Female     No Sun Dinner    2
2      10.34 1.66   Male     No Sun Dinner    3
3      21.01 3.50   Male     No Sun Dinner    3
4      23.68 3.31   Male     No Sun Dinner    2
5      24.59 3.61 Female     No Sun Dinner    4
6      25.29 4.71   Male     No Sun Dinner    4
> head(melt(tips))
Using sex, smoker, day, time as id variables
     sex smoker day   time   variable value
1 Female     No Sun Dinner total_bill 16.99
2   Male     No Sun Dinner total_bill 10.34
3   Male     No Sun Dinner total_bill 21.01
4   Male     No Sun Dinner total_bill 23.68
5 Female     No Sun Dinner total_bill 24.59
6   Male     No Sun Dinner total_bill 25.29
>
> names(airquality) <- tolower(names(airquality))
> head(airquality)
  ozone solar.r wind temp month day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
> head(melt(airquality, id=c("month", "day")))
  month day variable value
1     5   1    ozone    41
2     5   2    ozone    36
3     5   3    ozone    12
4     5   4    ozone    18
5     5   5    ozone    NA
6     5   6    ozone    28
>
> names(ChickWeight) <- tolower(names(ChickWeight))
> head(ChickWeight)
  weight time chick diet
1     42    0     1    1
2     51    2     1    1
3     59    4     1    1
4     64    6     1    1
5     76    8     1    1
6     93   10     1    1
> head(melt(ChickWeight, id=2:4))
  time chick diet variable value
1    0     1    1   weight    42
2    2     1    1   weight    51
3    4     1    1   weight    59
4    6     1    1   weight    64
5    8     1    1   weight    76
6   10     1    1   weight    93
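
As a rough illustration of what melt does, the following base R sketch stacks the measured columns of a small made-up data frame by hand. The data frame df and the names id_vars, measure_vars and molten are assumptions for illustration only; this is not how the reshape package implements melt, and details such as column types may differ.

```r
# Hypothetical toy data, not taken from the examples above
df <- data.frame(id = c("a", "b"),
                 height = c(1.5, 1.7),
                 weight = c(50, 60))

id_vars <- "id"
measure_vars <- setdiff(names(df), id_vars)

# Build one block of rows per measured column, then stack the blocks
molten <- do.call(rbind, lapply(measure_vars, function(v) {
  data.frame(df[id_vars], variable = v, value = df[[v]])
}))
molten$variable <- factor(molten$variable, levels = measure_vars)

print(molten)
```

Each original row appears once per measured column, which is the same long layout as the id / variable / value output shown above.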

2. cast: Cast a molten data frame into the reshaped or aggregated form you want

> names(airquality) <- tolower(names(airquality))
> aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE)
> head(aqm)
  month day variable value
1     5   1    ozone    41
2     5   2    ozone    36
3     5   3    ozone    12
4     5   4    ozone    18
5     5   6    ozone    28
6     5   7    ozone    23
> str(cast(aqm, day ~ month ~ variable))
 num [1:31, 1:5, 1:4] 41 36 12 18 NA 28 23 19 8 NA ...
 - attr(*, "dimnames")=List of 3
  ..$ day     : Named chr [1:31] "1" "2" "3" "4" ...
  .. ..- attr(*, "names")= chr [1:31] "1" "20" "39" "58" ...
  ..$ month   : Named chr [1:5] "5" "6" "7" "8" ...
  .. ..- attr(*, "names")= chr [1:5] "1" "600" "8" "12" ...
  ..$ variable: Named chr [1:4] "ozone" "solar.r" "wind" "temp"
  .. ..- attr(*, "names")= chr [1:4] "1" "2" "3" "4"
> cast(aqm, month ~ variable, mean)
  month    ozone  solar.r      wind     temp
1     5 23.61538 181.2963 11.622581 65.54839
2     6 29.44444 190.1667 10.266667 79.10000
3     7 59.11538 216.4839  8.941935 83.90323
4     8 59.96154 171.8571  8.793548 83.96774
5     9 31.44828 167.4333 10.180000 76.90000
> str(cast(aqm, month ~ variable, mean))
List of 5
 $ month  : int [1:5] 5 6 7 8 9
 $ ozone  : num [1:5] 23.6 29.4 59.1 60 31.4
 $ solar.r: num [1:5] 181 190 216 172 167
 $ wind   : num [1:5] 11.62 10.27 8.94 8.79 10.18
 $ temp   : num [1:5] 65.5 79.1 83.9 84 76.9
 - attr(*, "row.names")= int [1:5] 1 2 3 4 5
 - attr(*, "idvars")= chr "month"
 - attr(*, "rdimnames")=List of 2
  ..$ :'data.frame': 5 obs. of  1 variable:
  .. ..$ month: int [1:5] 5 6 7 8 9
  ..$ :'data.frame': 4 obs. of  1 variable:
  .. ..$ variable: Factor w/ 4 levels "ozone","solar.r",..: 1 2 3 4
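
For readers without the reshape package, the monthly means produced by cast(aqm, month ~ variable, mean) can be approximated in base R. The sketch below is only an assumption-laden equivalent, not the package's method: it builds a long table with stack(), drops NA values (mirroring na.rm=TRUE in the melt call), and aggregates with tapply(). The names aq, measures, long and wide are introduced here for illustration.

```r
# Work on a copy of the built-in data set with lower-case column names
aq <- airquality
names(aq) <- tolower(names(aq))

measures <- c("ozone", "solar.r", "wind", "temp")

# Long ("molten") layout: one row per month / variable / value
stacked <- stack(aq[measures])
long <- data.frame(month = rep(aq$month, times = length(measures)),
                   variable = stacked$ind,
                   value = stacked$values)
long <- long[!is.na(long$value), ]  # same effect as na.rm=TRUE when melting

# Cross-tabulate month by variable, taking the mean within each cell
wide <- tapply(long$value,
               list(month = long$month, variable = long$variable),
               mean)
print(round(wide, 2))
```

The result is a 5 x 4 matrix rather than a data frame, but the cell values should correspond to the cast output above.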

3. merge: Merge two data frames by common columns or row names, or do other versions of database join operations.

(1) Data preparation

> authors <- data.frame(
+   ## I(*) : use character columns of names to get sensible sort order
+   surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
+   nationality = c("US", "Australia", "US", "UK", "Australia"),
+   deceased = c("yes", rep("no", 4)))
> authors
   surname nationality deceased
1    Tukey          US      yes
2 Venables   Australia       no
3  Tierney          US       no
4   Ripley          UK       no
5   McNeil   Australia       no
# Make a copy: add a name column holding the surname values, then drop the surname column
> authorN <- within(authors, { name <- surname; rm(surname) })
> authorN
  nationality deceased     name
1          US      yes    Tukey
2   Australia       no Venables
3          US       no  Tierney
4          UK       no   Ripley
5   Australia       no   McNeil
> books <- data.frame(
+   name = I(c("Tukey", "Venables", "Tierney",
+              "Ripley", "Ripley", "McNeil", "R Core")),
+   title = c("Exploratory Data Analysis",
+             "Modern Applied Statistics ...",
+             "LISP-STAT",
+             "Spatial Statistics", "Stochastic Simulation",
+             "Interactive Data Analysis",
+             "An Introduction to R"),
+   other.author = c(NA, "Ripley", NA, NA, NA, NA,
+                    "Venables & Smith"))
> books
      name                         title     other.author
1    Tukey     Exploratory Data Analysis             <NA>
2 Venables Modern Applied Statistics ...           Ripley
3  Tierney                     LISP-STAT             <NA>
4   Ripley            Spatial Statistics             <NA>
5   Ripley         Stochastic Simulation             <NA>
6   McNeil     Interactive Data Analysis             <NA>
7   R Core          An Introduction to R  Venables & Smith

(2) By default, merge on the columns that both data frames share (same column names)

> (m0 <- merge(authorN, books))
      name nationality deceased                         title other.author
1   McNeil   Australia       no     Interactive Data Analysis         <NA>
2   Ripley          UK       no            Spatial Statistics         <NA>
3   Ripley          UK       no         Stochastic Simulation         <NA>
4  Tierney          US       no                     LISP-STAT         <NA>
5    Tukey          US      yes     Exploratory Data Analysis         <NA>
6 Venables   Australia       no Modern Applied Statistics ...       Ripley
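
The default key can be made explicit: according to the merge help page, by defaults to intersect(names(x), names(y)). The following sketch uses cut-down, hypothetical versions of authorN and books (fewer rows and columns than the data above) to check that the default call and the explicit call give the same result.

```r
# Reduced, illustrative versions of the two data frames
authorN <- data.frame(name = c("Tukey", "Ripley"),
                      nationality = c("US", "UK"))
books <- data.frame(name = c("Tukey", "Ripley", "Ripley"),
                    title = c("Exploratory Data Analysis",
                              "Spatial Statistics",
                              "Stochastic Simulation"))

# Columns present in both data frames; these are what merge uses by default
common <- intersect(names(authorN), names(books))
print(common)

m_default  <- merge(authorN, books)
m_explicit <- merge(authorN, books, by = common)
print(identical(m_default, m_explicit))
```

Spelling out by is usually the safer habit, since an accidental shared column name would otherwise silently become part of the key.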

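(3) When the two data frames have no column names in common, specify the columns to merge on

Note: these uses of merge correspond to four functions in the dplyr package (inner_join, left_join, right_join, full_join). The matched rows are the same, but merge sorts the result by the key column by default, while the dplyr joins do not, so the row order differs in the output below.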

i. inner_join

> library(dplyr)
> (m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
   surname nationality deceased                         title other.author
1   McNeil   Australia       no     Interactive Data Analysis         <NA>
2   Ripley          UK       no            Spatial Statistics         <NA>
3   Ripley          UK       no         Stochastic Simulation         <NA>
4  Tierney          US       no                     LISP-STAT         <NA>
5    Tukey          US      yes     Exploratory Data Analysis         <NA>
6 Venables   Australia       no Modern Applied Statistics ...       Ripley
> (inner_join(authors, books, by = c("surname" = "name")))
   surname nationality deceased                         title other.author
1    Tukey          US      yes     Exploratory Data Analysis         <NA>
2 Venables   Australia       no Modern Applied Statistics ...       Ripley
3  Tierney          US       no                     LISP-STAT         <NA>
4   Ripley          UK       no            Spatial Statistics         <NA>
5   Ripley          UK       no         Stochastic Simulation         <NA>
6   McNeil   Australia       no     Interactive Data Analysis         <NA>
> (m2 <- merge(books, authors, by.x = "name", by.y = "surname"))
      name                         title other.author nationality deceased
1   McNeil     Interactive Data Analysis         <NA>   Australia       no
2   Ripley            Spatial Statistics         <NA>          UK       no
3   Ripley         Stochastic Simulation         <NA>          UK       no
4  Tierney                     LISP-STAT         <NA>          US       no
5    Tukey     Exploratory Data Analysis         <NA>          US      yes
6 Venables Modern Applied Statistics ...       Ripley   Australia       no

ii. left_join


> merge(authors, books, by.x = "surname", by.y = "name", all.x = TRUE)
   surname nationality deceased                         title other.author
1   McNeil   Australia       no     Interactive Data Analysis         <NA>
2   Ripley          UK       no            Spatial Statistics         <NA>
3   Ripley          UK       no         Stochastic Simulation         <NA>
4  Tierney          US       no                     LISP-STAT         <NA>
5    Tukey          US      yes     Exploratory Data Analysis         <NA>
6 Venables   Australia       no Modern Applied Statistics ...       Ripley
> left_join(authors, books, by = c("surname" = "name"))
   surname nationality deceased                         title other.author
1    Tukey          US      yes     Exploratory Data Analysis         <NA>
2 Venables   Australia       no Modern Applied Statistics ...       Ripley
3  Tierney          US       no                     LISP-STAT         <NA>
4   Ripley          UK       no            Spatial Statistics         <NA>
5   Ripley          UK       no         Stochastic Simulation         <NA>
6   McNeil   Australia       no     Interactive Data Analysis         <NA>

iii. right_join

> merge(authors, books, by.x = "surname", by.y = "name", all.y = TRUE)
   surname nationality deceased                         title     other.author
1   McNeil   Australia       no     Interactive Data Analysis             <NA>
2   R Core        <NA>     <NA>          An Introduction to R Venables & Smith
3   Ripley          UK       no            Spatial Statistics             <NA>
4   Ripley          UK       no         Stochastic Simulation             <NA>
5  Tierney          US       no                     LISP-STAT             <NA>
6    Tukey          US      yes     Exploratory Data Analysis             <NA>
7 Venables   Australia       no Modern Applied Statistics ...           Ripley
> right_join(authors, books, by = c("surname" = "name"))
   surname nationality deceased                         title     other.author
1    Tukey          US      yes     Exploratory Data Analysis             <NA>
2 Venables   Australia       no Modern Applied Statistics ...           Ripley
3  Tierney          US       no                     LISP-STAT             <NA>
4   Ripley          UK       no            Spatial Statistics             <NA>
5   Ripley          UK       no         Stochastic Simulation             <NA>
6   McNeil   Australia       no     Interactive Data Analysis             <NA>
7   R Core        <NA>     <NA>          An Introduction to R Venables & Smith

iv. full_join

> merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)
   surname nationality deceased                         title     other.author
1   McNeil   Australia       no     Interactive Data Analysis             <NA>
2   R Core        <NA>     <NA>          An Introduction to R Venables & Smith
3   Ripley          UK       no            Spatial Statistics             <NA>
4   Ripley          UK       no         Stochastic Simulation             <NA>
5  Tierney          US       no                     LISP-STAT             <NA>
6    Tukey          US      yes     Exploratory Data Analysis             <NA>
7 Venables   Australia       no Modern Applied Statistics ...           Ripley
> full_join(authors, books, by = c("surname" = "name"))
   surname nationality deceased                         title     other.author
1    Tukey          US      yes     Exploratory Data Analysis             <NA>
2 Venables   Australia       no Modern Applied Statistics ...           Ripley
3  Tierney          US       no                     LISP-STAT             <NA>
4   Ripley          UK       no            Spatial Statistics             <NA>
5   Ripley          UK       no         Stochastic Simulation             <NA>
6   McNeil   Australia       no     Interactive Data Analysis             <NA>
7   R Core        <NA>     <NA>          An Introduction to R Venables & Smith

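Note: with this data set the results of right_join and full_join look the same, because every author in authors has at least one matching row in books. The underlying principle still differs: a right join keeps every row of the y data set, while a full join keeps every row of both data sets.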
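
To make the difference visible, the sketch below adds an author with no book to a reduced copy of the data. The surname "Doe" and its nationality are invented placeholders, and only base merge is used so that dplyr is not required; all.y = TRUE plays the role of right_join and all = TRUE the role of full_join.

```r
# Reduced, illustrative data; "Doe" is a made-up author with no book
authors <- data.frame(surname = c("Tukey", "Ripley", "Doe"),
                      nationality = c("US", "UK", "NZ"))
books <- data.frame(name = c("Tukey", "Ripley", "R Core"),
                    title = c("Exploratory Data Analysis",
                              "Spatial Statistics",
                              "An Introduction to R"))

# Right join: every row of books (y) is kept, so "Doe" is dropped
right <- merge(authors, books, by.x = "surname", by.y = "name", all.y = TRUE)

# Full join: every row of both inputs is kept, so "Doe" appears with an NA title
full <- merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)

print(right)
print(full)
```

Here the right join should return three rows and the full join four, the extra row being the author without a book.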

4. example of using ‘incomparables’

> x <- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
> x
  k1 k2 data
1 NA  1    1
2 NA NA    2
3  3 NA    3
4  4  4    4
5  5  5    5
> y <- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)
> y
  k1 k2 data
1 NA NA    1
2  2 NA    2
3 NA  3    3
4  4  4    4
5  5  5    5
# inner_join
> merge(x, y, by = c("k1","k2")) # NA's match
  k1 k2 data.x data.y
1  4  4      4      4
2  5  5      5      5
3 NA NA      2      1
# Merge on a single specified column; note that each NA in x matches both NAs in y
> merge(x, y, by = "k1") # NA's match, so 6 rows
  k1 k2.x data.x k2.y data.y
1  4    4      4    4      4
2  5    5      5    5      5
3 NA    1      1   NA      1
4 NA    1      1    3      3
5 NA   NA      2   NA      1
6 NA   NA      2    3      3
> merge(x, y, by = "k2") # NA's match, so 6 rows
  k2 k1.x data.x k1.y data.y
1  4    4      4    4      4
2  5    5      5    5      5
3 NA   NA      2   NA      1
4 NA   NA      2    2      2
5 NA    3      3   NA      1
6 NA    3      3    2      2
# Exclude NA keys from matching
> merge(x, y, by = "k2", incomparables = NA) # 2 rows
  k2 k1.x data.x k1.y data.y
1  4    4      4    4      4
2  5    5      5    5      5
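
The row counts above can be checked by hand. When NA keys are allowed to match, every NA in x pairs with every NA in y, so the NA rows form a cross product. The sketch below recomputes that count for k1; the variable names n_na_pairs and n_value_matches are introduced here for illustration, and the simple intersect() count assumes the non-NA keys are unique on each side, as they are in this data.

```r
x <- data.frame(k1 = c(NA, NA, 3, 4, 5), k2 = c(1, NA, NA, 4, 5), data = 1:5)
y <- data.frame(k1 = c(NA, 2, NA, 4, 5), k2 = c(NA, NA, 3, 4, 5), data = 1:5)

# 2 NA keys in x times 2 NA keys in y
n_na_pairs <- sum(is.na(x$k1)) * sum(is.na(y$k1))

# Non-NA keys present on both sides (4 and 5)
n_value_matches <- length(intersect(na.omit(x$k1), na.omit(y$k1)))

with_na    <- merge(x, y, by = "k1")
without_na <- merge(x, y, by = "k1", incomparables = NA)

print(c(expected = n_na_pairs + n_value_matches, actual = nrow(with_na)))
print(nrow(without_na))
```

With incomparables = NA only the rows with genuinely equal keys remain, which is usually what is wanted when NA means "unknown" rather than a real key value.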
