iMeta | ggClusterNet微生物网络分析和可视化保姆级教程
写在前面
ggCLusterNet开发已历时两年,去年开始公开使用,不到一年的时间,谷歌上引用过ggClusterNet的文章数量已达23次。如今文章出来了,大家可以通过谷歌学术搜索ggClusterNet或文章主页(https://onlinelibrary.wiley.com/doi/full/10.1002/imt2.32)优雅地引用了。
全文翻译和视频简介见:《iMeta | 南农沈其荣团队发布微生物网络分析和可视化R包ggClusterNet》
欢迎大家扫描助手微信,备注R包,即可加入ggClusterNet用户交流群。
现发布包的使用教程,目录如下:
ggClusterNet:包含多种基于模块可视化布局算法的微生物网络挖掘R包
ggClusterNet: An R package for microbiome network analysis and modularity‐based multiple network layouts
作者:文涛#(wechat: nanjingxuezi),谢鹏昊#,杨胜蝶,牛国庆,刘潇予,丁哲旭,薛超,刘永鑫,沈其荣,袁军(junyuan@njau.edu.cn)
网络分析逐渐被生态学家们重视并持续应用于生态学领域,开发更为强大和方便的网络分析工具十分必要。因此,我们开发了名为ggClusterNet的R包,用于更加容易的进行网络数据分析挖掘和可视化。在ggClusterNet包中设计了数十种网络布局算法用于更好的展示微生物网络模块化信息(randomClusterG, PolygonClus-terG, PolygonRrClusterG, ArtifCluster, randSNEClusterG, PolygonModsquar-eG, PolyRdmNotdCirG, model_Gephi.2, model_igraph, model_maptree)。为了方便研究者使用,设计了多种微生物网络数据挖掘功能,例如相关性计算corMicor(),网络属性计算net_properties(),节点属性计算node_properties(),随机网络计算和比对random_Net_compate()。为了更加快捷挖掘网络,进一步将这些功能封装整合成两个函数network.2() 和 corBionetwork(),分别用于微生物网络数据挖掘和跨域网络数据挖掘。目前,ggClusterNet在GitHub: https://github.com/taowenmicro/ggClusterNet/ ,Gitee(https://gitee.com/wentaomicro/ggClusterNet)上开放使用。完整的说明和示例在wiki页面可以阅读。
ggClusterNet用于挖掘微生物网络和跨域网络,同时提供数十种基于模块化展示的网络布局算法。适应微生物组网络挖掘。
环境配置
R语言和Rstudio安装
R语言和Rstudio都是跨平台软件,从官方网站下载安装包进行安装:
R语言官方下载地址:https://www.r-project.org/
Rstudio官方下载地址:https://www.rstudio.com/products/rstudio/download/#download
测试代码和数据下载:在iMeta、宏基因组公众号后台回复“ggclusternet”
R包安装
方法1. Windows下批量下载R包解压安装
ggClusterNet依赖了十余个R包,如果安装存在困难,可以直接下载我编译好的安装包,这里还有一些其他R包,基本上都是用于组学分析等生物信息方面的分析。供win用户使用。
R 4.2系列的R包 321.56M 链接:https://pan.baidu.com/s/1xc-Eyz3GhJtZ6LaPOGhQWA?pwd=0315
备用链接含作者常用R包 2.08G接:https://pan.baidu.com/s/1pbumFhF-EC2atQTB0wuiMw?pwd=udf5 提取码:udf5
更新版获取方式,在iMeta、宏基因组公众号后台回复“ggclusternet”
使用方式为:下载后解压缩其中的所有目录到R语言安装路径中的4.2或library中,具体位置可执行 .libPaths() 查看。
> .libPaths()
[1] "C:/Users/yongxinliu/AppData/Local/R/win-library/4.2"
[2] "C:/Program Files/R/R-4.2.0/library"
最好复制到第一个用户目录中,为额外安装R包位置。第二个为系统目录,为软件自带R包位置。
方法2. 代码安装
# 基于CRAN安装R包
p_list = c("ggplot2", "BiocManager", "devtools", "igraph", "network", "sna", "tidyverse","tidyfst","ggnewscale")
for(p in p_list){if (!requireNamespace(p)){install.packages(p)}library(p, character.only = TRUE, quietly = TRUE, warn.conflicts = FALSE)}# 基于Bioconductor安装R包
p_list = c("phyloseq", "WGCNA")
for(p in p_list){if (!requireNamespace(p, quietly = TRUE)){BiocManager::install(p)}}# 基于github安装,检测没有则安装
library(devtools)
if (!requireNamespace("ggClusterNet", quietly = TRUE))remotes::install_github("taowenmicro/ggClusterNet")
视频1:安装包-微生物网络构建过程
视频2:二分网络运算过程
导入R包
#--导入所需R包#-------
library(phyloseq)
library(igraph)
library(network)
library(sna)
library(tidyverse)
library(ggClusterNet)
数据
输入方式一
直接输入phyloseq格式的数据。
#-----导入数据#-------
data(ps)
输入方式二
可以从https://github.com/taowenmicro/R-_function下载数据,构造phylsoeq文件。自己的数据也按照网址示例数据进行准备。虽然phylsoeq对象不易用常规手段处理,但是组学数据由于数据量比较大,数据注释内容多样化,所以往往使用诸如phyloseq这类对象进行处理,并简化数据处理过程。ggClusterNet同样使用了phyloseq对象作为微生物网络的分析。
phyloseq对象构建过程如下,网络分析主要用到otu表格,后续pipeline流程可能用到分组文件metadata,如果按照分类水平山色或者区分模块则需要taxonomy。这几个部分并不是都必须加入phyloseq对象中,可以用到哪个加哪个。
library(phyloseq)
library(ggClusterNet)
library(tidyverse)
library(Biostrings)metadata = read.delim("./metadata.tsv",row.names = 1)
otutab = read.delim("./otutab.txt", row.names=1)
taxonomy = read.table("./taxonomy.txt", row.names=1)
# tree = read_tree("./otus.tree")
# rep = readDNAStringSet("./otus.fa")ps = phyloseq(sample_data(metadata),otu_table(as.matrix(otutab), taxa_are_rows=TRUE),tax_table(as.matrix(taxonomy))#,# phy_tree(tree),# refseq(rep))
或者直接从网站读取,但是由于github在国外,所以容易失败
metadata = read.delim("https://raw.githubusercontent.com/taowenmicro/R-_function/main/metadata.tsv",row.names = 1)
otutab = read.delim("https://raw.githubusercontent.com/taowenmicro/R-_function/main/otutab.txt", row.names=1)
taxonomy = read.table("https://raw.githubusercontent.com/taowenmicro/R-_function/main/taxonomy.txt", row.names=1)
# tree = read_tree("https://raw.githubusercontent.com/taowenmicro/R-_function/main/otus.tree")
# rep = readDNAStringSet("https://raw.githubusercontent.com/taowenmicro/R-_function/main/otus.fa")ps = phyloseq(sample_data(metadata),otu_table(as.matrix(otutab), taxa_are_rows=TRUE),tax_table(as.matrix(taxonomy))#,# phy_tree(tree),# refseq(rep))
使用ggCLusterNet进行网络分析的过程
corMicro函数用于计算网络相关
按照丰度过滤微生物表格,并却计算相关矩阵,按照指定的阈值挑选矩阵中展示的数值。调用了psych包中的corr.test函数,使用三种相关方法。
N参数提取丰度最高的150个OTU;
method.scale参数确定微生物组数据的标准化方式,这里我们选用TMM方法标准化微生物数据。
#-提取丰度最高的指定数量的otu进行构建网络#----------计算相关#----
result = corMicro (ps = ps,N = 150,method.scale = "TMM",r.threshold=0.8,p.threshold=0.05,method = "spearman")#--提取相关矩阵
cor = result[[1]]
head(cor)#-网络中包含的OTU的phyloseq文件提取
ps_net = result[[3]]#-导出otu表格
otu_table = ps_net %>% vegan_otu() %>%t() %>%as.data.frame()
制作分组,我们模拟五个分组
ggClusterNet中提供了多种优秀的网络可视化布局算法:
PolygonClusterG:环状模块,环状布局
PolygonRrClusterG:环状模块,模块半径正比于节点数量,环状布局
randomClusterG:随机布局,环状模块
ArtifCluster:环状模块,人工布局
randSNEClusterG:sna包中的布局按照模块布局
PolygonModsquareG 环状布局,顺序行列排布
PolygonRdmNodeCir 实心圈布局,环状布局,控制半径
model_Gephi.2:模仿Gephi布局
model_igraph:模仿igraph布局
model_maptree:按照maptree算法布局模块
model_maptree2:内聚算法改进离散点排布
model_filled_circle 实心圆布局,基于分组信息
等···········
这是网络布局的基础,无论是什么聚类布局,都需要制作一个分组文件,这个文件有两列,一列是节点,一列是分组信息,这个分组信息名称为:group。这个文件信息就是用于对节点进行分组,然后按照分组对节点归类,使用包中可视化函数计算节点位置。
注意分组文件的格式,分为两列,第一列是网络中包含的OTU的名字,第二列是分组信息,同样的分组标记同样的字符。
#--人工构造分组信息:将网络中全部OTU分为五个部分,等分
netClu = data.frame(ID = row.names(otu_table),group =rep(1:5,length(row.names(otu_table)))[1:length(row.names(otu_table))] )
netClu$group = as.factor(netClu$group)
head(netClu)
PolygonClusterG 根据分组,计算布局位置坐标
不同的模块按照分组聚集成不同的圆,并且圆形的大小一样。如果一个分组只有一个点,则这个点坐落在圆心位置。
#--------计算布局#---------
result2 = PolygonClusterG (cor = cor,nodeGroup =netClu )
node = result2[[1]]
head(node)
nodeadd 节点注释的:用otu表格和分组文件进行注释
nodeadd函数只是提供了简单的用注释函数,用户可以自己在node的表格后面添加各种注释信息。
tax_table = ps_net %>%vegan_tax() %>%as.data.frame()
#---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
head(nodes)
计算边
#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)
head(edge)
出图
p <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = Phylum,size = mean),pch = 21, data = nodes) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +theme(panel.background = element_blank()) +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
p
模拟不同的分组—可视化
模拟不同分组效果展示:1个分组
这是网络布局的基础,无论是什么聚类布局,都需要制作一个分组文件,这个文件有两列,一列是节点,一列是分组信息,这个分组信息名称必须设定为:group。
netClu = data.frame(ID = row.names(tax_table),group =rep(1,length(row.names(tax_table)))[1:length(row.names(tax_table))] )
netClu$group = as.factor(netClu$group)
result2 = PolygonClusterG (cor = cor,nodeGroup =netClu )
node = result2[[1]]
# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = Phylum,size = mean),pch = 21, data = nodes) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
模拟不同的分组查看效果:8个分组
netClu = data.frame(ID = row.names(cor),group =rep(1:8,length(row.names(cor)))[1:length(row.names(cor))] )
netClu$group = as.factor(netClu$group)
result2 = PolygonClusterG (cor = cor,nodeGroup =netClu )
node = result2[[1]]
# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)
### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = Phylum,size = mean),pch = 21, data = nodes) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
按照微生物分类不同设定分组
#----------计算相关#----
result = corMicro (ps = ps,N = 200,method.scale = "TMM",r.threshold=0.8,p.threshold=0.05,method = "spearman")
#--提取相关矩阵
cor = result[[1]]
head(cor)
#-网络中包含的OTU的phyloseq文件提取
ps_net = result[[3]]
#-导出otu表格
otu_table = ps_net %>% vegan_otu() %>%t() %>%as.data.frame()
tax = ps_net %>% vegan_tax() %>%as.data.frame()
tax$filed = tax$Phylum
group2 <- data.frame(ID = row.names(tax),group = tax$Phylum)
group2$group =as.factor(group2$group)
result2 = PolygonClusterG (cor = cor,nodeGroup =group2 )
node = result2[[1]]
# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)
### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = Phylum,size = mean),pch = 21, data = nodes) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
结果发现这些高丰度OTU大部分属于放线菌门和变形菌门,其他比较少。所以下面我们按照OTU数量的多少,对每个模块的大小进行重新调整。
result2 = PolygonRrClusterG (cor = cor,nodeGroup =group2 )
node = result2[[1]]
# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)
### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = Phylum,size = mean),pch = 21, data = nodes) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
用实心点作为每个模块的布局方式1
set.seed(12)
#-实心圆2
result2 = model_filled_circle(cor = cor,culxy =TRUE,da = NULL,# 数据框,包含x,和y列nodeGroup = group2,mi.size = 1,# 最小圆圈的半径,越大半径越大zoom = 0.3# 不同模块之间距离)# result2 = PolygonRrClusterG (cor = cor,nodeGroup =group2 )
node = result2[[1]]
# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)
### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = Phylum,size = mean),pch = 21, data = nodes) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
用实心点作为每个模块布局方式2
set.seed(12)
#-实心圆2
result2 = model_maptree_group(cor = cor,nodeGroup = group2,)# result2 = PolygonRrClusterG (cor = cor,nodeGroup =group2 )
node = result2[[1]]
# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)
### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = Phylum,size = mean),pch = 21, data = nodes) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
按照网络模块分析定义分组
netClu = modulGroup( cor = cor,cut = NULL,method = "cluster_fast_greedy" )result2 = model_maptree_group(cor = cor,nodeGroup = group2,)# result2 = PolygonRrClusterG (cor = cor,nodeGroup =group2 )
node = result2[[1]]# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
head(nodes)nodes2 = nodes %>% inner_join(netClu,by = c("elements" = "ID"))
nodes2$group = paste("Model_",nodes2$group,sep = "")#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = group,size = mean),pch = 21, data = nodes2) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
使用升级的model_maptree2:不在可以将每个模块独立区分,而是将模块聚拢,并在整体布局上将离散的点同这些模块一同绘制到同心圆内。
result2 = model_maptree2(cor = cor,method = "cluster_fast_greedy")# result2 = PolygonRrClusterG (cor = cor,nodeGroup =group2 )
node = result2[[1]]# ---node节点注释#-----------
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
head(nodes)nodes2 = nodes %>% inner_join(netClu,by = c("elements" = "ID"))
nodes2$group = paste("Model_",nodes2$group,sep = "")#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)### 出图
pnet <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = as.factor(cor)),data = edge, size = 0.5) +geom_point(aes(X1, X2,fill = group,size = mean),pch = 21, data = nodes2) +scale_colour_brewer(palette = "Set1") +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +# labs( title = paste(layout,"network",sep = "_"))+# geom_text_repel(aes(X1, X2,label=Phylum),size=4, data = plotcord)+# discard default grid + titles in ggplot2theme(panel.background = element_blank()) +# theme(legend.position = "none") +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
pnet
model_igraph布局
result = cor_Big_micro(ps = ps,N = 1000,r.threshold=0.6,p.threshold=0.05,method = "spearman")#--提取相关矩阵
cor = result[[1]]
dim(cor)result2 <- model_igraph(cor = cor,method = "cluster_fast_greedy",seed = 12
)
node = result2[[1]]
head(node)dat = result2[[2]]
head(dat)
tem = data.frame(mod = dat$model,col = dat$color) %>% dplyr::distinct( mod, .keep_all = TRUE)
col = tem$col
names(col) = tem$mod#---node节点注释#-----------
otu_table = as.data.frame(t(vegan_otu(ps)))
tax_table = as.data.frame(vegan_tax(ps))
nodes = nodeadd(plotcord =node,otu_table = otu_table,tax_table = tax_table)
head(nodes)
#-----计算边#--------
edge = edgeBuild(cor = cor,node = node)
colnames(edge)[8] = "cor"
head(edge)tem2 = dat %>% dplyr::select(OTU,model,color) %>%dplyr::right_join(edge,by =c("OTU" = "OTU_1" ) ) %>%dplyr::rename(OTU_1 = OTU,model1 = model,color1 = color)
head(tem2)tem3 = dat %>% dplyr::select(OTU,model,color) %>%dplyr::right_join(edge,by =c("OTU" = "OTU_2" ) ) %>%dplyr::rename(OTU_2 = OTU,model2 = model,color2 = color)
head(tem3)tem4 = tem2 %>%inner_join(tem3)
head(tem4)edge2 = tem4 %>% mutate(color = ifelse(model1 == model2,as.character(model1),"across"),manual = ifelse(model1 == model2,as.character(color1),"#C1C1C1"))
head(edge2)
col_edge = edge2 %>% dplyr::distinct(color, .keep_all = TRUE) %>% select(color,manual)
col0 = col_edge$manual
names(col0) = col_edge$colorlibrary(ggnewscale)p1 <- ggplot() + geom_segment(aes(x = X1, y = Y1, xend = X2, yend = Y2,color = color),data = edge2, size = 1) +scale_colour_manual(values = col0) # ggsave("./cs1.pdf",p1,width = 16,height = 14)
p2 = p1 +new_scale_color() +geom_point(aes(X1, X2,color =model), data = dat,size = 4) +scale_colour_manual(values = col) +scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL) +theme(panel.background = element_blank()) +theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +theme(legend.background = element_rect(colour = NA)) +theme(panel.background = element_rect(fill = "white", colour = NA)) +theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
p2
# ggsave("./cs2.pdf",p2,width = 16,height = 14)
节点模块化计算和可视化
result4 = nodeEdge(cor = cor)
#提取变文件
edge = result4[[1]]
#--提取节点文件
node = result4[[2]]
igraph = igraph::graph_from_data_frame(edge, directed = FALSE, vertices = node)
res = ZiPiPlot(igraph = igraph,method = "cluster_fast_greedy")
p <- res[[1]]
p
网络性质计算
22年6月升级后版本包括了16项网络属性,包括周集中老师21年NCC文章中全部属性
dat = net_properties(igraph)
head(dat)
# 升级后包含的网络属性更多
dat = net_properties.2(igraph,n.hub = T)
head(dat,n = 16)
节点性质计算
nodepro = node_properties(igraph)
head(nodepro)
value
num.edges(L) 1.736000e+03
num.pos.edges 1.247000e+03
num.neg.edges 4.890000e+02
num.vertices(n) 6.010000e+02
Connectance(edge_density) 9.628397e-03
average.degree(Average K) 5.777038e+00
average.path.length 4.750325e+00
diameter 1.263110e+01
edge.connectivity 0.000000e+00
mean.clustering.coefficient(Average.CC) 2.909496e-01
no.clusters 5.100000e+01
centralization.degree 7.203827e-02
centralization.betweenness 8.584971e-02
centralization.closeness 2.737148e-03
RM(relative.modularity) -6.258599e-01
the.number.of.keystone.nodes 1.100000e+02
扩展-关键OTU挑选
Hub节点是在网络中与其他节点连接较多的节点,Hub微生物就是与其他微生物联系较为紧密的微生物,可以称之为关键微生物(keystone)
hub = hub_score(igraph)$vector %>%sort(decreasing = TRUE) %>%head(5) %>%as.data.frame()colnames(hub) = "hub_sca"ggplot(hub) +geom_bar(aes(x = hub_sca,y = reorder(row.names(hub),hub_sca)),stat = "identity",fill = "#4DAF4A")
对应随机网络构建和网络参数比对
result = random_Net_compate(igraph = igraph, type = "gnm", step = 100, netName = layout)
p1 = result[[1]]
sum_net = result[[4]]
p1
head(sum_net)
微生物组网络pipeline分析
使用network函数运行微生物网络全套分析:
使用OTU数量建议少于250个,如果OTU数量为250个,同时计算zipi,整个运算过程为3-5min。
data("ps16s")
path = "./result_micro_200/"
dir.create(path)result = network(ps = ps16s,N = 200,layout_net = "model_Gephi.2",r.threshold=0.6,p.threshold=0.05,label = FALSE,path = path,zipi = TRUE)# 多组网络绘制到一个面板
p = result[[1]]
p
# 全部样本网络参数比对
data = result[[2]]plotname1 = paste(path,"/network_all.jpg",sep = "")
ggsave(plotname1, p,width = 48,height = 16,dpi = 72)plotname1 = paste(path,"/network_all.pdf",sep = "")
ggsave(plotname1, p,width = 48,height = 16)tablename <- paste(path,"/co-occurrence_Grobel_net",".csv",sep = "")
write.csv(data,tablename)
微生物网络-大网络
使用model_maptree2布局计算微生物大网络。
大网络运算时间会比较长,这里我没有计算zipi,用时5min完成全部运行。
N=0,代表用全部的OTU进行计算。
3000个OTU不计算zipi全套需要18min。
path = "./result_big_1000/"
dir.create(path)result = network.2(ps = ps16s,N = 1000,big = TRUE,maxnode = 5,select_layout = TRUE,layout_net = "model_maptree2",r.threshold=0.4,p.threshold=0.01,label = FALSE,path = path,zipi = FALSE)# 多组网络绘制到一个面板
p = result[[1]]
# 全部样本网络参数比对
data = result[[2]]
num= 3
# plotname1 = paste(path,"/network_all.jpg",sep = "")
# ggsave(plotname1, p,width = 16*num,height = 16,dpi = 72)plotname1 = paste(path,"/network_all.pdf",sep = "")
ggsave(plotname1, p,width = 10*num,height = 10,limitsize = FALSE)tablename <- paste(path,"/co-occurrence_Grobel_net",".csv",sep = "")
write.csv(data,tablename)
使用model_igraph布局绘制微生物大网络
path = "./result_1000_igraph/"
dir.create(path)
map = sample_data(ps16s)
map$Group = "one"
sample_data(ps16s) = map
result = network.2(ps = ps16s,N = 1000,big = TRUE,maxnode = 5,select_layout = TRUE,layout_net = "model_igraph",r.threshold=0.4,p.threshold=0.01,label = FALSE,path = path,zipi = FALSE)# 多组网络绘制到一个面板
p = result[[1]]
# 全部样本网络参数比对
data = result[[2]]
num= 3
# plotname1 = paste(path,"/network_all.jpg",sep = "")
# ggsave(plotname1, p,width = 16*num,height = 16,dpi = 72)plotname1 = paste(path,"/network_all.pdf",sep = "")
ggsave(plotname1, p,width = 16*num,height = 16,limitsize = FALSE)tablename <- paste(path,"/co-occurrence_Grobel_net",".csv",sep = "")
write.csv(data,tablename)
跨域网络pipeline分析
细菌,真菌,环境因子三者跨域网络
#--细菌和环境因子的网络#--------
Envnetplot<- paste("./16s_ITS_Env_network",sep = "")
dir.create(Envnetplot)ps16s =ps16s%>% ggClusterNet::scale_micro()
psITS = psITS%>% ggClusterNet::scale_micro()library(phyloseq)
#--细菌和真菌ps对象中的map文件要一样
ps.merge <- ggClusterNet::merge16S_ITS(ps16s = ps16s,psITS= psITS,NITS = 200,N16s = 200)map = phyloseq::sample_data(ps.merge)
head(map)
map$Group = "one"
phyloseq::sample_data(ps.merge) <- map#--环境因子导入
data1 = env
envRDA = data.frame(row.names = env$ID,env[,-1])envRDA.s = vegan::decostand(envRDA,"hellinger")
data1[,-1] = envRDA.sGru = data.frame(ID = colnames(env)[-1],group = "env" )
head(Gru)library(sna)
library(ggClusterNet)
library(igraph)result <- ggClusterNet::corBionetwork(ps = ps.merge,N = 0,r.threshold = 0.4, # 相关阈值p.threshold = 0.05,big = T,group = "Group",env = data1, # 环境指标表格envGroup = Gru,# 环境因子分组文件表格# layout = "fruchtermanreingold",path = Envnetplot,# 结果文件存储路径fill = "Phylum", # 出图点填充颜色用什么值size = "igraph.degree", # 出图点大小用什么数据scale = TRUE, # 是否要进行相对丰度标准化bio = TRUE, # 是否做二分网络zipi = F, # 是否计算ZIPIstep = 100, # 随机网络抽样的次数width = 18,label = TRUE,height = 10
)p = result[[1]]
p
# 全部样本网络参数比对
data = result[[2]]
plotname1 = paste(Envnetplot,"/network_all.jpg",sep = "")
ggsave(plotname1, p,width = 15,height = 12,dpi = 72)
# plotname1 = paste(Envnetplot,"/network_all.png",sep = "")
# ggsave(plotname1, p,width = 10,height = 8,dpi = 72)
plotname1 = paste(Envnetplot,"/network_all.pdf",sep = "")
ggsave(plotname1, p,width = 15,height = 12)
tablename <- paste(Envnetplot,"/co-occurrence_Grobel_net",".csv",sep = "")
write.csv(data,tablename)
细菌环境因子跨域网络
#--细菌-环境因子网络#-----
Envnetplot<- paste("./16s_Env_network",sep = "")
dir.create(Envnetplot)
# ps16s = ps0 %>% ggClusterNet::scale_micro()
psITS = NULL
library(phyloseq)
#--细菌和真菌ps对象中的map文件要一样
ps.merge <- ggClusterNet::merge16S_ITS(ps16s = ps16s,psITS = NULL,N16s = 500)
ps.merge
map = phyloseq::sample_data(ps.merge)
head(map)
map$Group = "one"
phyloseq::sample_data(ps.merge) <- mapdata1 = env
envRDA.s = vegan::decostand(envRDA,"hellinger")
data1[,-1] = envRDA.sGru = data.frame(ID = colnames(env)[-1],group = "env" )
head(Gru)# library(sna)
# library(ggClusterNet)
# library(igraph)result <- ggClusterNet::corBionetwork(ps = ps.merge,N = 0,r.threshold = 0.4, # 相关阈值p.threshold = 0.05,big = T,group = "Group",env = data1, # 环境指标表格envGroup = Gru,# 环境因子分组文件表格# layout = "fruchtermanreingold",path = Envnetplot,# 结果文件存储路径fill = "Phylum", # 出图点填充颜色用什么值size = "igraph.degree", # 出图点大小用什么数据scale = TRUE, # 是否要进行相对丰度标准化bio = TRUE, # 是否做二分网络zipi = F, # 是否计算ZIPIstep = 100, # 随机网络抽样的次数width = 18,label = TRUE,height = 10
)p = result[[1]]
p
# 全部样本网络参数比对
data = result[[2]]
plotname1 = paste(Envnetplot,"/network_all.jpg",sep = "")
ggsave(plotname1, p,width = 13,height = 12,dpi = 72)plotname1 = paste(Envnetplot,"/network_all.pdf",sep = "")
ggsave(plotname1, p,width = 13,height = 12)
tablename <- paste(Envnetplot,"/co-occurrence_Grobel_net",".csv",sep = "")
write.csv(data,tablename)
细菌真菌跨越网络
#---细菌和真菌OTU网络-域网络-二分网络#-------
# 仅仅关注细菌和真菌之间的相关,不关注细菌内部和真菌内部相关Envnetplot<- paste("./16S_ITS_network",sep = "")
dir.create(Envnetplot)data(psITS)
#--细菌和真菌ps对象中的map文件要一样
ps.merge <- ggClusterNet::merge16S_ITS(ps16s = ps16s,psITS = psITS,N16s = 500,NITS = 500
)ps.mergemap = phyloseq::sample_data(ps.merge)head(map)
map$Group = "one"
phyloseq::sample_data(ps.merge) <- mapdata = data.frame(ID = c("fun_ASV_205","fun_ASV_316","fun_ASV_118"),c("Coremode","Coremode","Coremode"))# source("F:/Shared_Folder/Function_local/R_function/my_R_packages/ggClusterNet/R/corBionetwork2.R")result <- corBionetwork(ps = ps.merge,N = 0,lab = data,r.threshold = 0.8, # 相关阈值p.threshold = 0.05,group = "Group",# env = data1, # 环境指标表格# envGroup = Gru,# 环境因子分组文件表格# layout = "fruchtermanreingold",path = Envnetplot,# 结果文件存储路径fill = "Phylum", # 出图点填充颜色用什么值size = "igraph.degree", # 出图点大小用什么数据scale = TRUE, # 是否要进行相对丰度标准化bio = TRUE, # 是否做二分网络zipi = F, # 是否计算ZIPIstep = 100, # 随机网络抽样的次数width = 12,label = TRUE,height = 10,big = TRUE,select_layout = TRUE,layout_net = "model_maptree2",clu_method = "cluster_fast_greedy")tem <- model_maptree(cor =result[[5]],method = "cluster_fast_greedy",seed = 12
)
node_model = tem[[2]]
head(node_model)p = result[[1]]
p
# 全部样本网络参数比对
data = result[[2]]plotname1 = paste(Envnetplot,"/network_all.pdf",sep = "")
ggsave(plotname1, p,width = 20,height = 19)
tablename <- paste(Envnetplot,"/co-occurrence_Grobel_net",".csv",sep = "")
write.csv(data,tablename)
tablename <- paste(Envnetplot,"/node_model_imformation",".csv",sep = "")
write.csv(node_model,tablename)tablename <- paste(Envnetplot,"/nodeG_plot",".csv",sep = "")
write.csv(result[[4]],tablename)
tablename <- paste(Envnetplot,"/edge_plot",".csv",sep = "")
write.csv(result[[3]],tablename)
tablename <- paste(Envnetplot,"/cor_matrix",".csv",sep = "")
write.csv(result[[5]],tablename)
基于属水平绘制跨域网络-网络模块与环境因子相关等
#--细菌真菌任意分类水平跨域网络#----------
library(tidyverse)
# res1path = "result_and_plot/Micro_and_other_index_16s/"
Envnetplot<- paste("./16S_ITS_network_Genus",sep = "")
dir.create(Envnetplot)
#--细菌和真菌ps对象中的map文件要一样
ps.merge <- merge16S_ITS(ps16s = ps16s,psITS = psITS,N16s = 500,NITS = 500
)
ps.merge
map = phyloseq::sample_data(ps.merge)
head(map)
map$Group = "one"
phyloseq::sample_data(ps.merge) <- map
tem.0 = ps.merge %>% tax_glom_wt(ranks = "Genus")
tax = tem.0 %>% vegan_tax() %>%as.data.frame()
head(tax)data = data.frame(ID = c("Acaulium","Acidocella","Acrobeloides"),c("Acaulium","Acidocella","Acrobeloides"))result <- corBionetwork(ps = tem.0,N = 0,lab = data,r.threshold = 0.6, # 相关阈值p.threshold = 0.05,group = "Group",# env = data1, # 环境指标表格# envGroup = Gru,# 环境因子分组文件表格# layout = "fruchtermanreingold",path = Envnetplot,# 结果文件存储路径fill = "Phylum", # 出图点填充颜色用什么值size = "igraph.degree", # 出图点大小用什么数据scale = TRUE, # 是否要进行相对丰度标准化bio = TRUE, # 是否做二分网络zipi = F, # 是否计算ZIPIstep = 100, # 随机网络抽样的次数width = 12,label = TRUE,height = 10,big = TRUE,select_layout = TRUE,layout_net = "model_maptree2",clu_method = "cluster_fast_greedy")tem <- model_maptree(cor =result[[5]],method = "cluster_fast_greedy",seed = 12
)
node_model = tem[[2]]
head(node_model)library(WGCNA)
otu = tem.0 %>% vegan_otu() %>%as.data.frame()
node_model = node_model[match(colnames(otu),node_model$ID),]MEList = moduleEigengenes(otu, colors = node_model$group)
MEs = MEList$eigengenesnGenes = ncol(otu)
nSamples = nrow(otu)
moduleTraitCor = cor(MEs, envRDA, use = "p")
moduleTraitPvalue = corPvalueStudent(moduleTraitCor, nSamples)#sizeGrWindow(10,6)
pdf(file=paste(Envnetplot,"/","Module-env_relationships.pdf",sep = ""),width=10,height=6)
# Will display correlations and their p-values
textMatrix = paste(signif(moduleTraitCor, 2), "\n(",signif(moduleTraitPvalue, 1), ")", sep = "")dim(textMatrix) = dim(moduleTraitCor)
par(mar = c(6, 8.5, 3, 3))
# Display the correlation values within a heatmap plot
labeledHeatmap(Matrix = moduleTraitCor,xLabels = names(envRDA),yLabels = names(MEs),ySymbols = names(MEs),colorLabels = FALSE,colors = greenWhiteRed(50),textMatrix = textMatrix,setStdMargins = FALSE,cex.text = 0.5,zlim = c(-1,1),main = paste("Module-trait relationships"))
dev.off()p = result[[1]]
p
# 全部样本网络参数比对
data = result[[2]]plotname1 = paste(Envnetplot,"/network_all.pdf",sep = "")
ggsave(plotname1, p,width = 10,height = 10)
tablename <- paste(Envnetplot,"/co-occurrence_Grobel_net",".csv",sep = "")
write.csv(data,tablename)tablename <- paste(Envnetplot,"/nodeG_plot",".csv",sep = "")
write.csv(result[[4]],tablename)
tablename <- paste(Envnetplot,"/edge_plot",".csv",sep = "")
write.csv(result[[3]],tablename)
tablename <- paste(Envnetplot,"/cor_matrix",".csv",sep = "")
write.csv(result[[5]],tablename)
引文:Tao Wen, Penghao Xie, Shengdie Yang, Guoqing Niu, Xiaoyu Liu, Zhexu Ding, Chao Xue, Yong-Xin Liu, Qirong Shen, Jun Yuan. 2022. ggClusterNet: An R package for microbiome network analysis and modularity-based multiple network layouts. iMeta 1: e32. https://doi.org/10.1002/imt2.32
期刊简介
“iMeta” 是由威立、肠菌分会和本领域数百位华人科学家合作出版的开放获取期刊,主编由中科院微生物所刘双江研究员和荷兰格罗宁根大学傅静远教授担任。目的是发表原创研究、方法和综述以促进宏基因组学、微生物组和生物信息学发展。目标是发表前10%(IF > 15)的高影响力论文。期刊特色包括视频投稿、可重复分析、图片打磨、青年编委、前3年免出版费、50万用户的社交媒体宣传等。2022年2月正式创刊发行!
联系我们
iMeta主页:http://www.imeta.science
出版社:https://onlinelibrary.wiley.com/journal/2770596x
投稿:https://mc.manuscriptcentral.com/imeta
邮箱:office@imeta.science
微信公众号
iMeta
iMeta | ggClusterNet微生物网络分析和可视化保姆级教程相关推荐
- 决策树可视化保姆级教程
决策树可视化指南 决策树是机器学习的一种经典的模型,因其泛化性能好,可解释性强而被广泛应用到实际商业预测中.通常在我们完成决策树模型搭建后,我们会进一步研究分析我们搭建好的模型,这时候模型的可视化就显 ...
- win11 安装Ubuntu加可视化桌面(最新保姆级教程)
win11.win10 安装Ubuntu加可视化桌面(最新保姆级教程) 前言 保姆级安装流程 第一节·安装前准备 第二节·安装ubuntu本体 第三节·可视化安装 前言 win11.win10 安装U ...
- FastText词向量训练、使用及可视化操作【保姆级教程(包含藏文处理方法)】
目录 一.前言 二.FastText词向量训练 2.1 数据输入格式 2.2词向量训练 三.词向量使用 一.前言 本文是word2vec词向量篇的姊妹篇.fasttext是静态词向量构建方法的一种,本 ...
- Word2Vec词向量训练、使用及可视化操作【保姆级教程(包含藏文处理方法)】
目录 一.前言 二.Word2Vec词向量训练 2.1 数据输入格式 2.2词向量训练 三.词向量使用 四.词向量可视化 一.前言 word2vec是静态词向量构建方法的一种,本文将介绍word2ve ...
- 爬虫保姆级教程3:利用python-Flask框架搭建本地数据可视化网站
成果展示: (1)网站首页: (2)电影表单页 (3) 电影评分页 (4) 词频统计页 (5)团队页面 接下来让我们看看上述网站是如何完成的: 首先简单介绍一下Flask框架: Flask主要功能有两 ...
- 从购买服务器到网站搭建成功保姆级教程~超详细
??从购买服务器到网站搭建成功保姆级教程~真的超详细,各位看官细品 ??前言 ??预备知识 ??什么是云服务器? ??什么是域名? ??什么是SSL证书? ??服务器选配 ??阿里云[官网链接](ht ...
- win系统的阿里云服务器部署IDEASpringBoot项目保姆级教程
win系统的阿里云服务器部署SpringBoot项目保姆级教程 前言 配置阿里云服务器 配置环境 环境配置中的坑 还有别的坑的话,私信我我补充......毕竟配置是最基本的,还想咋样... jar包导 ...
- 快速上手Springboot项目(登录注册保姆级教程)
本文章对SpringBoot开发后端项目结构做了简单介绍,并示范了使用SpringBoot+MySQL实现登录的后端功能,与本博客的另一篇文章 Vue 实现登录注册功能(前后端分离完整案例) | Ma ...
- RocketMQ保姆级教程
上周花了一点时间从头到尾.从无到有地搭建了一套RocketMQ的环境,觉得还挺easy的,所以就写篇文章分享给大家. 整篇文章可以大致分为三个部分,第一部分属于一些核心概念和工作流程的讲解:第二部分就 ...
- 【强化学习实战-04】DQN和Double DQN保姆级教程(2):以MountainCar-v0
[强化学习实战-04]DQN和Double DQN保姆级教程(2):以MountainCar-v0 实战:用Double DQN求解MountainCar问题 MountainCar问题详解 Moun ...
最新文章
- php 获取坐标api,怎么获取百度坐标api返回的json格式字符串或对象呢
- android 动态label,Android仿抖音实现动态壁纸
- OpenStack核心组件-glance镜像服务
- 前端学习(1972)vue之电商管理系统电商系统之弹出修改对话框
- 带你了解FPGA(2)--逻辑设计基础
- java 参数理解,JAVA参数传递值传递的理解。
- Cocoa编程开发者手册
- 如果创业遍地,你觉得会是怎么样的景象
- oracle分区唯一索引,关于分区表与唯一索引的测试记录
- Oracle 迁移至 MySQL 后 需要修改的SQL语句 小总结
- 待解决的问题 POJ 3225 Help with Intervals 线段树 码力
- Win11 在线安装QT5.15.2教程
- 5G+北斗高精度定位系统适用于哪些行业领域?
- android工程角度相机,anglecam角度相机软件下载
- android应用市场汇总
- AI领域各个方向大牛博客汇总
- docker提交腾讯云标准模式
- 概率论与数理统计---基本概念
- 升级gradle版本报错
- 互联网时代,大数据技术与云计算平台结合发展、应用