The family of apply() functions in R is used to apply user-defined functions to the elements of complex structures like matrices, lists or data frames.

These functions help simplify your code and make it more readable. They also have parallel counterparts, such as parLapply() and mclapply() in R's parallel package, so code written in this style is straightforward to parallelize. Let us look at each function with detailed examples.

The apply() function in R Programming

The apply() function in R is used with matrices to apply a user-specified function to the rows or columns of the matrix. The following is the general syntax for the apply() function.

apply(matrix, code, f, fargs)

In the above form, matrix is the matrix object to which we are applying the function. code indicates whether to apply the function over rows (code set to 1) or columns (code set to 2). f is the function to apply, and fargs are any additional arguments to pass to it. In R's own documentation these four arguments are named X, MARGIN, FUN and ... respectively.
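
As a quick illustration of the code and fargs arguments, here is a minimal sketch. The matrix m and the helper scaled_sum() are hypothetical names introduced only for this example; they do not appear elsewhere in the article.

```r
# A small hypothetical matrix: 2 rows, 3 columns (filled column by column)
m <- matrix(1:6, nrow = 2, ncol = 3)

# code = 1: the function receives each row in turn
row_sums <- apply(m, 1, sum)   # one value per row

# code = 2: the function receives each column in turn
col_sums <- apply(m, 2, sum)   # one value per column

# fargs: extra arguments are passed on to the function after the data.
# scaled_sum() is a made-up helper that sums a vector and multiplies by k.
scaled_sum <- function(v, k) sum(v) * k
scaled_rows <- apply(m, 1, scaled_sum, k = 10)

print(row_sums)
print(col_sums)
print(scaled_rows)
```

Passing k by name keeps it clear which value belongs to the helper rather than to apply() itself.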

Let us first define a matrix to illustrate the apply() function.


> x <-matrix(c(4,5,6,10,12,16),nrow=2,ncol=3)
> x
     [,1] [,2] [,3]
[1,]    4    6   12
[2,]    5   10   16

Suppose that we wish to perform a very specific operation on each of these elements: squaring it, then multiplying by 3 and dividing by 4. Let us define a function that does this.


> f <-function(x)
+ {
+   return (x^2*3/4)
+ }
> f(5)
[1] 18.75

Now, in order to apply this function to all the rows/columns in a matrix, we call the apply() function.


> apply(x, 2, f)
      [,1] [,2] [,3]
[1,] 12.00   27  108
[2,] 18.75   75  192

Here apply() passes each column of the matrix to f, which operates element-wise, so every element is transformed without writing an explicit loop. The apply() function in R does not offer any real speed benefit over a loop, but it helps you write cleaner and more compact code.
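
To make the comparison with a loop concrete, the sketch below rebuilds the result column by column with an explicit for loop and checks that it matches apply(). It redefines x and f from above so that it runs on its own; the names looped and applied are introduced just for this illustration.

```r
x <- matrix(c(4, 5, 6, 10, 12, 16), nrow = 2, ncol = 3)
f <- function(x) {
  return(x^2 * 3 / 4)
}

# Explicit loop: apply f to one column at a time and store the result
looped <- matrix(NA_real_, nrow = nrow(x), ncol = ncol(x))
for (j in seq_len(ncol(x))) {
  looped[, j] <- f(x[, j])
}

# The same computation with apply()
applied <- apply(x, 2, f)

# Both comparisons are expected to report TRUE
print(all.equal(looped, applied))
print(all.equal(f(x), applied))
```

Because the arithmetic inside f is vectorized, calling f(x) directly gives the same matrix in this particular case. That shortcut would not hold for a function that summarises a whole row or column, such as sum or mean.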

sapply() and lapply() functions in R Programming

Working with Lists

The lapply() function in R is short for list apply. This works in a manner similar to the apply() function above, but uses lists instead of matrices.

Let us look at an example:


> mylist <- list(c(1,2,3,4),c(10,20,30,40),c(5,5,5,5))
> lapply(mylist,mean)
[[1]]
[1] 2.5

[[2]]
[1] 25

[[3]]
[1] 5

The list mylist is a list of 3 vectors. We wish to apply the mean function to each one of the vectors. This is done by calling lapply(mylist,mean), which returns a list holding the mean values of the three constituent vectors.

Similarly, the sapply() function in R is short for simplified apply. Instead of returning a list with a separate element for each vector, sapply() returns a single vector containing the mean values.

> sapply(mylist,mean)
[1]  2.5 25.0  5.0
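
The difference in return type can be checked directly. The following sketch repeats mylist so that it runs on its own; the variable names as_list, as_vector and checked are introduced here for illustration. It also shows vapply(), a base R relative of sapply() that requires you to state the expected type and length of each result.

```r
mylist <- list(c(1, 2, 3, 4), c(10, 20, 30, 40), c(5, 5, 5, 5))

as_list <- lapply(mylist, mean)    # a list with one element per vector
as_vector <- sapply(mylist, mean)  # simplified to a numeric vector

print(is.list(as_list))
print(is.numeric(as_vector))

# Flattening the lapply() result gives the same values as sapply()
print(all.equal(unlist(as_list), as_vector))

# vapply() checks that every result is a single numeric value
checked <- vapply(mylist, mean, FUN.VALUE = numeric(1))
print(checked)
```

vapply() can be a reasonable choice when the shape of the output matters, since it stops with an error if any result does not match FUN.VALUE.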

Working with Data Frames in R

Since data frames can be treated as a special case of lists, the functions lapply() and sapply() work in both cases. Let us look at an example.

Let us create a data frame first and then apply a sort() function on it using the lapply() function in R.


> names <- c("Adam","Antony","Brian","Carl","Doug")
> ages <- c(23,22,24,25,26)
> playerdata <- data.frame(names,ages,stringsAsFactors = FALSE)
> #Apply a sort function on the dataframe
> lapply(playerdata,sort)
$names
[1] "Adam"   "Antony" "Brian"  "Carl"   "Doug"  

$ages
[1] 22 23 24 25 26

The function returns a list in which each column of the data frame has been sorted separately.


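Similarly, calling sapply() sorts each column separately but simplifies the result into a compact matrix. Since a matrix can hold only one type, the ages are converted to character strings.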


> sapply(playerdata,sort)
     names    ages
[1,] "Adam"   "22"
[2,] "Antony" "23"
[3,] "Brian"  "24"
[4,] "Carl"   "25"
[5,] "Doug"   "26"
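
One detail of this output is easy to miss: sorting each column on its own breaks the pairing between a name and an age. The sketch below checks this, along with the type of the sapply() result. If the goal is to sort the rows of the data frame by one column, a common alternative is base R's order(); the names col_sorted and by_age are introduced only for this illustration.

```r
names <- c("Adam", "Antony", "Brian", "Carl", "Doug")
ages <- c(23, 22, 24, 25, 26)
playerdata <- data.frame(names, ages, stringsAsFactors = FALSE)

col_sorted <- sapply(playerdata, sort)
print(is.matrix(col_sorted))
print(is.character(col_sorted))

# Column-wise sorting pairs "Adam" with "22", although Adam is 23
print(col_sorted[1, ])

# Sorting whole rows by age keeps each name with its own age
by_age <- playerdata[order(playerdata$ages), ]
print(by_age)
```

With order(), the row for Antony (age 22) moves to the top and every name stays attached to its own age.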

The tapply() function

The tapply() function also belongs to the same family, but it is used when the data is grouped by a factor. This is best explained with an example. Suppose we have the salaries of employees in a company in the form of a vector and their respective means of transport in a factor. Suppose that we wish to calculate the average salary of each group of employees that uses a given means of transport.

The tapply() function in R can be called for this purpose with R's built-in mean function.


> salaries <-c(25000,30000,45000,66000,20000,50000,35000,20000,15000)
> transport <-c('Bus','Car','Bus','Car','Metro','Metro','Bus','Bus','Metro')
> tapply(salaries,transport,mean)
     Bus      Car    Metro 
31250.00 48000.00 28333.33 

As you can observe, the tapply() function in R returns a neatly labelled set of mean salaries, one for each means of transport. Notice also that transport is a plain character vector here; tapply() converts it to a factor and groups the salaries by its unique values automatically.
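
The grouping step can be made explicit. The sketch below, which repeats the data from above so that it runs on its own, converts transport to a factor by hand and then reproduces the tapply() result with split() followed by sapply(). The names groups, by_tapply and by_split are introduced here for illustration.

```r
salaries <- c(25000, 30000, 45000, 66000, 20000, 50000, 35000, 20000, 15000)
transport <- c("Bus", "Car", "Bus", "Car", "Metro", "Metro", "Bus", "Bus", "Metro")

# tapply() treats the grouping vector as a factor; its levels are the groups
groups <- factor(transport)
print(levels(groups))

by_tapply <- tapply(salaries, groups, mean)

# Equivalent two-step version: split the salaries by group, then average each piece
by_split <- sapply(split(salaries, groups), mean)

print(by_tapply)
print(all.equal(as.vector(by_tapply), as.vector(by_split)))
```

The two-step version is longer, but it shows what tapply() does in a single call: partition the values by group, then apply the function to each partition.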


