R语言处理1975-2011年的人口信息

1975-2011年的数据中。

1）分别统计每年人口最多的国家是哪个?有多少

2）统计出各个国家的1975-2011年的平均人口增长率

3）统计每年人口最多的十个国家

4）统计出每年人口最少的十个国家

5）结合洲的语言的分类，请得出如下结果

5.1）哪个洲的人口最多?哪个洲的人口最少?

每个洲的前3个国家人口排名

5.2）哪种语言的国家人口最多?

librery(xlsx)

data<-read.xlsx("urbanpop.xlsx",sheet_index=3)
i<-0

for(dt in data){
if(i==0){
i<-2
next}
else{
index<-which(dt == max(dt,na.rm=TRUE))
cat(as.character(data$country[index]),dt[index],"\n")

}

data$country[1]

(data$X2011[1]-data$X1975[1])^(1/(2011-1975))-1

paste(((data$X2011[1]-data$X1975[1])^(1/(2011-1975))-1)*100,"%",sep="")

for(i in 1:209){
cat(as.character(data$country[i]),"\t",paste(((data$X2011[i]-data$X1975[i])^(1/(2011-1975))-1)*100,"%",sep=""),"\n")

}

i<-0
year<-1975
for(dt in data){
if(i==0){
i<-2
next}
else{
countrys_id <- order(dt,decreasing=TRUE)[1:10]
cat(year,"\t")
for(index in countrys_id){
cat(as.character(data$country[index]),"\t")
}
year=year+1
cat("\n")

}

i<-0
year<-1975
for(dt in data){
if(i==0){
i<-2
next}
else{
countrys_id <- order(dt,decreasing=FALSE)[1:10]
cat(year,"\t")
for(index in countrys_id){
cat(as.character(data$country[index]),"\t")
}
year=year+1
cat("\n")

}

Asian<-c("Afghanistan", "Armenia", "Azerbaijan", "Bahrain", "Bhutan", "Cambodia", "Indonesia",
"Iran", "Iraq", "Israel", "Japan", "Kazakhstan", "Kuwait", "Malaysia", "Myanmar", "Nepal", "Oman",
"Pakistan", "Qatar", "Saudi Arabia", "Singapore", "Tajikistan", "Thailand", "Turkmenistan", "Uzbekistan", "Yemen",
"Bangladesh", "Georgia", "India", "Jordan", "North Korea", "South Korea", "Lao", "Lebanon", "Maldives", "Mongolia",
"Philippines", "Sri Lanka", "Timor-Leste", "Turkey", "United Arab Emirates","Brunei", "China", "Hong Kong, China",
"Kyrgyz Republic", "Macao, China", "Syria", "Vietnam")

Europe<-c("Albania", "Austria", "Belgium", "Bosnia and Herzegovina", "Bulgaria", "Croatia",
"Cyprus", "Czech Republic", "Denmark", "Estonia", "France", "Germany", "Greece", "Hungary", "Latvia",
"Liechtenstein", "Lithuania", "Malta", "Netherlands", "Norway", "Portugal", "Russia", "Serbia", "Slovenia", "Sweden", "Ukraine",
"Andorra","Channel Islands", "Faeroe Islands", "Finland", "Iceland", "Ireland", "Isle of Man", "Italy", "Luxembourg", "Macedonia, FYR",
"Moldova", "Monaco", "Montenegro", "Poland", "Romania", "San Marino", "Slovak Republic", "Spain", "Switzerland", "United Kingdom")

Afrain<-c("Algeria", "Angola", "Benin", "Botswana", "Burkina Faso", "Burundi", "Chad", "Comoros",
"Cote d'Ivoire", "Djibouti", "Eritrea", "Ethiopia", "Guinea", "Kenya", "Lesotho", "Liberia", "Libya",
"Mauritania", "Mauritius", "Mozambique", "Namibia", "Niger", "Rwanda", "Sao Tome and Principe", "Seychelles",
"Sierra Leone", "Swaziland", "Tanzania", "Uganda", "Zambia", "Zimbabwe", "South Sudan","Cameroon",
"Central African Republic", "Egypt", "Equatorial Guinea", "Gabon", "Gambia", "Ghana", "Guinea-Bissau",
"Madagascar", "Malawi", "Mali", "Morocco", "Nigeria", "Senegal", "Somalia", "South Africa", "Sudan", "Togo","Tunisia",
"Cape Verde", "Congo, Dem. Rep.", "Congo, Rep.")

SouthAmerican<-c("Argentina", "Guyana", "Paraguay", "Peru", "Suriname", "Uruguay", "Venezuela","Brazil", "Chile",
"Colombia", "Ecuador","Aruba","Belarus","Bolivia")

NorthAmerican<-c("Antigua and Barbuda", "Bahamas", "Barbados", "Canada", "Greenland", "Grenada",
"Guatemala", "Honduras", "Jamaica", "Nicaragua", "St. Kitts and Nevis", "Trinidad and Tobago","Belize",
"Bermuda", "Cayman Islands", "Costa Rica", "Cuba", "Dominica", "Dominican Republic", "El Salvador",
"Haiti", "Mexico", "Panama", "Puerto Rico", "St. Lucia", "St. Vincent and the Grenadines", "Turks and Caicos Islands",
"United States", "Virgin Islands (U.S.)")

Oceania<-c("Australia", "Kiribati", "New Caledonia", "New Zealand", "Palau", "Papua New Guinea", "Solomon Islands", "Tuvalu",
"American Samoa", "Fiji", "French Polynesia", "Guam", "Marshall Islands", "Northern Mariana Islands", "Samoa", "Tonga", "Vanuatu",
"Micronesia, Fed. Sts.")

AS_number<-0
AF_number<-0
EU_number<-0
SA_number<-0
NA_number<-0
OC_number<-0
other_number<-0
index<-1
for(country in data$country){
if(country %in% Asian){
AS_number= AS_number+data$X2011[index]
}else if(country %in% Europe){
EU_number = EU_number+data$X2011[index]
}else if(country %in% Afrain){
AF_number= AF_number+data$X2011[index]
}else if(country %in% SouthAmerican){
SA_number= SA_number+data$X2011[index]
}else if(country %in% NorthAmerican){
NA_number= NA_number+data$X2011[index]
}else if(country %in% Oceania){
OC_number= OC_number+data$X2011[index]
}else{
other_number= other_number +data$X2011[index]
}
index=index+1
}

cat("亚洲人口数","欧洲人口数","北美洲人口数","南美洲人口数","非洲人口数","大洋洲人口数","\n")
population<-c(AS_number,EU_number,NA_number,SA_number,AF_number,OC_number)
sort_pl<-order(population)
sort_pl

AS<-c()
AF<-c()
EU<-c()
SA<-c()
NAA<-c()
OC<-c()
AS_I<-c()
AF_I<-c()
EU_I<-c()
SA_I<-c()
NAA_I<-c()
OC_I<-c()
index<-1
dt_2011<-data$X2011
for(country in data$country){
if(country %in% Asian){
AS_I=c(AS_I,country)
AS=c(AS,dt_2011[index])
}else if(country %in% Europe){
EU_I=c(EU_I,country)
EU=c(EU,dt_2011[index])
}else if(country %in% Afrain){
AF_I=c(AF_I,country)
AF=c(AF,dt_2011[index])
}else if(country %in% SouthAmerican){
SA_I=c(SA_I,country)
SA=c(SA,dt_2011[index])
}else if(country %in% NorthAmerican){
NAA_I=c(NAA_I,country)
NAA=c(NAA,dt_2011[index])
}else if(country %in% Oceania){
OC_I=c(OC_I,country)
OC=c(OC,dt_2011[index])
}else{
print(country)
}
index=index+1
}
for(x in order(AS,decreasing=TRUE)[1:3]){
cat(AS_I[x],"\t","人口数",AS[x],"\n")
}
for(x in order(AF,decreasing=TRUE)[1:3]){
cat(AF_I[x],"\t","人口数",AF[x],"\n")
}
for(x in order(EU,decreasing=TRUE)[1:3]){
cat(EU_I[x],"\t","人口数",EU[x],"\n")
}
for(x in order(SA,decreasing=TRUE)[1:3]){
cat(SA_I[x],"\t","人口数",SA[x],"\n")
}
for(x in order(NAA,decreasing=TRUE)[1:3]){
cat(NAA_I[x],"\t","人口数",NAA[x],"\n")
}
for(x in order(OC,decreasing=TRUE)[1:3]){
cat(OC_I[x],"\t","人口数",OC[x],"\n")
}

没想到没有R语言的代码贴士。这里面最麻烦的是第五题，数据要自己去爬，去了百度百科还有个data.cn的网站，爬，但是还剩下50几个爬不出来，心里很难受。

说下注意的东西吧。1.是工作目录得注意，不然读取不到csv文件。

2.因为国家名称是以因子的形式读取出来的，因此得使用as.character()来转换一下。

感觉就这两点东西需要注意，这东西不难，但是第五题太繁琐。

转载于:https://www.cnblogs.com/ke-T3022/p/7811057.html

R语言处理1975-2011年的人口信息相关推荐

R语言构建xgboost模型：控制训练信息输出级别verbose参数
R语言构建xgboost模型:控制训练信息输出级别verbose参数目录 R语言构建xgboost模型:控制训练信息输出级别verbose参数
R语言可视化：使用ggplot2绘制人口金字塔
人口金字塔是进行人口数据可视化时常用的一种统计图形,可以形象地描述人口年龄和性别的分布情况.最近工作上经常处理人口数据,于是试着使用ggplot2绘制了一下.在这里记录一下,顺便也熟悉一下ggplot ...
【R语言】——聚类热图行列分组信息注释热图2
上一期"[R语言]--聚类热图绘制(pheatmap)"介绍了R语言pheatmap包绘制聚类热图的基础代码,本期介绍当需要同时在热图上显示分组情况时,可利用pheatmap包构建 ...
《R语言编程艺术》——1.4　R语言中一些重要的数据结构
1.4 R语言中一些重要的数据结构 R有多种数据结构.本节将简单介绍几种常用的数据结构,使读者在深入细节之前先对R语言有个大概的认识.这样,读者至少可以开始尝试一些很有意义的例子,即使这些例子背后更多 ...
李倩星r语言实战_基于PCR的全球平均气温研究
段晓鸣 [摘要] 本文运用主成分回归的方法研究了全球平均气温与CO2,N2O,CFC.11,CFC.12,TSI,Aerosols六个自变量之间的关系,选取了自1983年5月到2008年12月的数据 ...
R语言ggplot2可视化：可视化人口金字塔图、人口金字塔显示不同性别不同年龄段的人口数，是了解人口组成的最优可视化方法、人口金字塔图可以用来表示按体积排序的群体的分布、形成漏斗结构
R语言ggplot2可视化:可视化人口金字塔图.人口金字塔显示不同性别不同年龄段的人口数,是了解人口组成的最优可视化方法.人口金字塔图可以用来表示按体积排序的群体的分布.形成漏斗结构(Populati ...
R语言ggplot2可视化：可视化人口金字塔图、直方图（堆叠直方图、连续变量堆叠直方图、离散变量堆叠直方图）、密度图、箱图（添加抖动数据点、tufte箱图、多分类变量分组箱图）、小提琴图
R语言ggplot2可视化:可视化人口金字塔图.直方图(堆叠直方图.连续变量堆叠直方图.离散变量堆叠直方图).密度图.箱图(添加抖动数据点.tufte箱图.多分类变量分组箱图).小提琴图目录
R语言-时间日期函数
R语言时间日期函数 1. 返回当前日期时间,有两种方式: Sys.time() date() 举例 format(Sys.time(), "%a %b %d %X %Y %Z")# ...
使用R语言的正确姿势，R包干货奉献
生物信息学习的正确姿势 NGS系列文章包括NGS基础.在线绘图.转录组分析 (Nature重磅综述|关于RNA-seq你想知道的全在这).ChIP-seq分析 (ChIP-seq基本分析流程).单细胞 ...

R语言处理1975-2011年的人口信息

R语言处理1975-2011年的人口信息相关推荐

最新文章

热门文章