
Vectors are the fundamental data type in R. The R interpreter treats every scalar (numeric, integer, and so on) as a vector of length one, and a matrix is a vector with a dimension attribute attached.
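
A quick way to check this is to ask R about a single number and about a matrix. The sketch below uses only base R functions; the variable names x and m are illustrative.

```r
# A single number is a numeric vector of length one.
x <- 5
is.vector(x)   # TRUE
length(x)      # 1

# A matrix is a vector with a "dim" attribute attached.
m <- matrix(1:6, nrow = 2)
dim(m)         # 2 3
length(m)      # 6
m[4]           # 4 (plain vector indexing still works on a matrix)
```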

From a data scientist’s perspective, you can consider a vector as a collection of observations across an interval of time, such as temperatures read every day, total sales for the day, etc. R provides several relevant functions to handle vectors from this perspective.
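
As a small illustration of that perspective, the sketch below stores a handful of daily temperature readings in a vector and summarises them with base R functions. The readings are made-up values, used only for the example.

```r
# Hypothetical daily temperature readings (made-up values, degrees Celsius).
temps <- c(21.5, 23.0, 19.8, 25.1, 22.4)

mean(temps)        # average reading
range(temps)       # lowest and highest reading
which.max(temps)   # position (day) of the highest reading
```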


Creating Vectors in R

Vectors are created with the c() function, which combines its arguments into a single vector.


> myvec <- c(3.1,45,1,2,80)

The arguments to c() can also be expressions. R evaluates each one and stores the result in the vector.


> myvec2 <- c(3,5*8,(9/4))
> myvec2
[1]  3.00 40.00  2.25

We can create vectors using previously created variables.


> a <- 10
> b <-14.8
> c <-2
> myvec3 <- c(1,a,b,c)
> myvec3
[1]  1.0 10.0 14.8  2.0

We can also create a vector using two or more of the existing vectors.


> bigvec <-c(myvec,myvec2)
> bigvec
[1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25

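Vectors can have any number of items, all of the same data type (sometimes described by the vector's mode). Note: a vector created with c() cannot hold mixed data types. If you pass values of different types, R does not raise an error but silently coerces them to a single common type.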
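
In practice, passing values of different types to c() produces a vector in which every element has been converted to one common type. A minimal sketch, with illustrative variable names:

```r
# Mixing numbers, strings and logicals: everything becomes character.
mixed <- c(1, "a", TRUE)
mixed          # "1" "a" "TRUE"
class(mixed)   # "character"

# Mixing numbers and logicals: TRUE becomes 1 and FALSE becomes 0.
nums <- c(1.5, TRUE, FALSE)
nums           # 1.5 1.0 0.0
class(nums)    # "numeric"
```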

Operations on Vectors in R Language

Vectors are stored contiguously in memory, much like arrays in C. You can index the elements of a vector, extract subsets, sort it, and perform routine mathematical operations on it element-wise. The examples below make these operations concrete.

Indexing elements of a vector

The elements of a vector can be extracted by index, much like accessing array elements, except that R indices start at 1 rather than 0. The following code snippet provides an example.

> bigvec
[1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25
> bigvec[2]
[1] 45
> avar <- bigvec[7]
> avar
[1] 40

When you try accessing an element beyond the vector’s size, R returns an NA value.
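
The sketch below shows this behaviour on a short vector; the vector v is illustrative and not part of the earlier examples.

```r
v <- c(3.1, 45, 1)   # illustrative vector with three elements

v[5]          # NA: index 5 is past the end, but no error is raised
is.na(v[5])   # TRUE
```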


Getting the length of a vector in R language

We often work with datasets downloaded from the internet. We read entire columns into vector variables and may not know their dimensions beforehand. In these cases, the length is an important value to know so that we don't run into NA values by indexing past the end of the data. The length of a vector can be obtained with the length() function.

> length(bigvec)
[1] 8
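
One way to use the length as a guard is to check an index against it before reading. The helper safe_get below is hypothetical (it is not a base R function) and the prices vector holds made-up values; treat this as a sketch rather than a recommended pattern.

```r
# safe_get is a hypothetical helper, not part of base R.
# It returns v[i] when i is a valid position and `default` otherwise.
safe_get <- function(v, i, default = NULL) {
  if (i >= 1 && i <= length(v)) v[i] else default
}

prices <- c(10.5, 11.2, 9.8)        # made-up column of values
safe_get(prices, 2)                 # 11.2
safe_get(prices, 10, default = 0)   # 0 instead of NA
```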

Subsetting vectors in R

When dealing with long vectors, it is sometimes necessary to extract only the elements of interest from the vector. We can do this by making use of subsetting in R.

> bigvec
[1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25
#Extract the last element of a vector - where index equals length
> bigvec[length(x=bigvec)]
[1] 2.25
#Extract the last but one element - subtract one from the length
> bigvec[length(x=bigvec)-1]
[1] 40
#Extract all elements except for the first element
> bigvec[-1]
[1] 45.00  1.00  2.00 80.00  3.00 40.00  2.25
#All elements except for the second one
> bigvec[-2]
[1]  3.10  1.00  2.00 80.00  3.00 40.00  2.25
#Elements from index 1 to index 3
> bigvec[1:3]
[1]  3.1 45.0  1.0
#Extract elements at specified indices 1 and 5
> bigvec[c(1,5)]
[1]  3.1 80.0
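
Negative indices can also be combined into a vector to drop several elements at once. The sketch below uses an illustrative vector v and the base R function tail() as a shortcut for the last element.

```r
v <- c(3.1, 45, 1, 2, 80)   # illustrative vector

v[-c(1, 2)]     # 1 2 80: drop the first two elements
v[-length(v)]   # 3.1 45.0 1.0 2.0: drop the last element
tail(v, 1)      # 80: the last element
```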

We will look deeper into subsetting when we work with real datasets in later tutorials.


Generating Sequences

Sequences are vectors of regularly spaced values. In R they can be generated with the colon operator (:) or with the seq function. These two methods are illustrated below.

> 4:10
[1]  4  5  6  7  8  9 10
#From represents the starting range and to represents the ending range
#By is the increment factor.
> seq(from=1,to=20,by=2)
[1]  1  3  5  7  9 11 13 15 17 19
#By value is negative for decreasing sequences
> seq(from=10,to=2,by=-1)
[1] 10  9  8  7  6  5  4  3  2

Instead of using a by parameter, you can supply a length.out parameter to state how many values you need and get evenly spaced values between the start and end of the range.


> seq(from=3,to=20,length.out=25)
 [1]  3.000000  3.708333  4.416667  5.125000  5.833333  6.541667  7.250000  7.958333
 [9]  8.666667  9.375000 10.083333 10.791667 11.500000 12.208333 12.916667 13.625000
[17] 14.333333 15.041667 15.750000 16.458333 17.166667 17.875000 18.583333 19.291667
[25] 20.000000
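
The spacing that length.out implies is (to - from) / (length.out - 1). The sketch below checks this for the call above using diff(), which returns the differences between neighbouring elements; the names s and step are illustrative.

```r
s <- seq(from = 3, to = 20, length.out = 25)

length(s)                           # 25
step <- (20 - 3) / (25 - 1)         # spacing implied by length.out
all.equal(diff(s), rep(step, 24))   # TRUE: every gap equals step
```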

Vectors can be repeated using the rep function in R. The usage of rep is illustrated below.


> rep(x=1,times=5)
[1] 1 1 1 1 1

The x can be replaced by a vector to obtain a repeating vector as follows.


> rep(x=bigvec, times=3)
 [1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25  3.10 45.00  1.00  2.00 80.00
[14]  3.00 40.00  2.25  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25
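
The rep function also accepts an each argument, which repeats every element in place instead of repeating the whole vector. A short sketch with an illustrative vector v:

```r
v <- c(1, 2, 3)   # illustrative vector

rep(x = v, times = 2)   # 1 2 3 1 2 3: the whole vector, twice
rep(x = v, each = 2)    # 1 1 2 2 3 3: each element, twice
```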

Sorting vectors

Vectors can be sorted in ascending or descending order using the sort() function in the following manner.


#Sorts in ascending order by default.
#decreasing=FALSE is an optional parameter.
> sort(bigvec, decreasing = FALSE)
[1]  1.00  2.00  2.25  3.00  3.10 40.00 45.00 80.00
#Sort in descending order
> sort(bigvec, decreasing=TRUE)
[1] 80.00 45.00 40.00  3.10  3.00  2.25  2.00  1.00
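
Note that sort() returns a new vector and leaves the original unchanged. The related base R function order() returns the positions that would sort the vector, which is useful when another vector has to be rearranged the same way. A sketch with an illustrative vector v:

```r
v <- c(3.1, 45, 1, 2)   # illustrative vector

sorted <- sort(v)
sorted        # 1.0 2.0 3.1 45.0
v             # 3.1 45.0 1.0 2.0 (unchanged)

order(v)      # 3 4 1 2: positions of the elements in ascending order
v[order(v)]   # same result as sort(v)
```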

Vector arithmetic

Vector arithmetic was covered earlier in the discussion of operators in R. One important point about vector arithmetic is recycling. When an operation has two vectors of different lengths as operands, R recycles the values of the shorter vector until it matches the length of the longer one. If the longer length is not a multiple of the shorter length, R still returns a result but also issues a warning.

> a <-c(0,1)
> b <-c(1,2,3,4,5)
#The recycled copy of a is (0,1,0,1,0), which gets added to b; a itself is unchanged.
> a+b
[1] 1 3 3 5 5
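
The sketch below contrasts the two cases: lengths where one is a multiple of the other, and lengths where it is not. tryCatch() is used here only to capture the warning text for inspection; the variable names are illustrative.

```r
a <- c(0, 1)
b <- c(1, 2, 3, 4)   # length 4 is a multiple of length 2

a + b                # 1 3 3 5, no warning

# With lengths 2 and 5 the addition still works but raises a warning.
# tryCatch returns the warning message instead of the sum.
msg <- tryCatch(c(0, 1) + c(1, 2, 3, 4, 5),
                warning = function(w) conditionMessage(w))
msg                  # text of the recycling warning
```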

