文章目录

agricolae
DescTools
bruceR
- 1. 回归分析
- - 决定系数R2R^2R2
  - 效应值η²p
  - 方差膨胀系数VIF
  - 标准化偏回归系数
  - 偏相关r.partial与半偏相关r.part
- 2. 方差分析
- - 描述分析结果
  - 方差表
  - 方差一致性检验
  - 球形检验
  - 简单效应

主要介绍agricolae包、DescTools包、bruceR包。

agricolae

iris%>%aov(Sepal.Length~Species,data=.)%>%agricolae::duncan.test("Species",group=F,console=T)

group=T则输出分组标记（abc），否则输出差异值。

Study: . ~ "Species"Duncan's new multiple range test
for Sepal.Length Mean Square Error:  0.2650082 Species,  meansSepal.Length       std  r Min Max
setosa            5.006 0.3524897 50 4.3 5.8
versicolor        5.936 0.5161711 50 4.9 7.0
virginica         6.588 0.6358796 50 4.9 7.9Alpha: 0.05 ; DF Error: 147 Critical Range2         3
0.2034688 0.2141585 Means with the same letter are not significantly different.Sepal.Length groups
virginica         6.588      a
versicolor        5.936      b
setosa            5.006      c

双因素方差分析

library(agricolae)
data(greenhouse)greenhouse%>%.[[1]]%>%as_tibble()%>%split(.$method)%>%map(~aov(weight ~ variety,data=.x))%>%map(~duncan.test(.,"variety",console=F))%>%map(~.$groups)

>   greenhouse%>%.[[1]]%>%
+     as_tibble()%>%
+     split(.$method)%>%
+     map(~aov(weight ~ variety,data=.x))%>%
+     map(~duncan.test(.,"variety",console=F))%>%
+     map(~.$groups)
$aeroponicweight groups
Mariva    95.35625      a
Unica     88.54750      a
Costanera 40.15250      b$bedweight groups
Mariva    96.8025      a
Costanera 87.5475     ab
Unica     69.0825      b$hydroponicweight groups
Unica     26.28500      a
Costanera 26.13500      a
Mariva    14.25125      b$potweight groups
Costanera 141.0425      a
Unica      94.5550      b
Mariva     92.2050      b

DescTools

agricolae::greenhouse[[1]]%>%aov(weight ~ variety+method, data=.)%>%DescTools::PostHocTest(method = "duncan")

> greenhouse[[1]]%>%aov(weight ~ variety+method, data=.)%>%
+   DescTools::PostHocTest(method = "duncan")Posthoc multiple comparisons of means : Duncan's new multiple range test 95% family-wise confidence level$varietydiff     lwr.ci    upr.ci   pval
Mariva-Costanera  0.934375  -8.579594 10.448344 0.8471
Unica-Costanera  -4.101875 -13.615844  5.412094 0.3973
Unica-Mariva     -5.036250 -15.052430  4.979930 0.3306    $methoddiff     lwr.ci    upr.ci    pval
bed-aeroponic          9.792083  -1.193702  20.77787  0.0805 .
hydroponic-aeroponic -52.461667 -63.447452 -41.47588 3.3e-11 ***
pot-aeroponic         34.582083  23.016394  46.14777 2.0e-09 ***
hydroponic-bed       -62.253750 -73.819439 -50.68806 1.7e-11 ***
pot-bed               24.790000  13.804215  35.77579 1.2e-05 ***
pot-hydroponic        87.043750  75.090244  98.99726 1.1e-11 ***---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

bruceR

关于bruceR包的具体内容可参考包寒吴霜的专栏。

1. 回归分析

model=lm(Temp ~ Month + Day + Wind + Solar.R, data=airquality)
GLM_summary(model)

查看结果为：

> GLM_summary(model)
MODEL INFO:
Model type: General Linear Model (GLM) (OLS Regression)
Observations: N = 146 (7 missing cases deleted)MODEL FIT:
F(4, 141) = 22.22, p < 1e-10 ***
R² = 0.38659 (Adjusted R² = 0.36919)ANOVA table:
─────────────────────────────────────────────────────Sum Sq  Df Mean Sq     F     p     η²p
─────────────────────────────────────────────────────
[Model]     4040.28   4 1010.07 18.82 <.001 *** 0.348
Month       1367.28   1 1367.28 25.47 <.001 *** 0.153
Day           76.54   1   76.54  1.43  .234     0.010
Wind        1741.28   1 1741.28 32.44 <.001 *** 0.187
Solar.R      855.18   1  855.18 15.93 <.001 *** 0.102
Residuals   7568.88 141   53.68
Total      11609.16 145   80.06
─────────────────────────────────────────────────────FIXED EFFECTS:
Outcome variable: Temp (N = 146)
──────────────────────────────────────────────────────────────────b    S.E.     t     p        [95%      CI]   VIF
──────────────────────────────────────────────────────────────────
(Intercept)  68.770 (4.391) 15.66 <.001 *** [60.089, 77.450]
Month         2.225 (0.441)  5.05 <.001 *** [ 1.353,  3.096] 1.035
Day          -0.084 (0.070) -1.19  .234     [-0.222,  0.055] 1.024
Wind         -1.003 (0.176) -5.70 <.001 *** [-1.352, -0.655] 1.032
Solar.R       0.027 (0.007)  3.99 <.001 *** [ 0.014,  0.041] 1.034
──────────────────────────────────────────────────────────────────Standardized coefficients: Temp (N = 146)
─────────────────────────────────────────────────────────────Beta*   S.E.*        [95%      CI] r.partial r.part
─────────────────────────────────────────────────────────────
Month     0.339 (0.067) *** [ 0.206,  0.471]     0.391  0.333
Day      -0.080 (0.067)     [-0.212,  0.052]    -0.100 -0.079
Wind     -0.382 (0.067) *** [-0.514, -0.249]    -0.432 -0.376
Solar.R   0.268 (0.067) *** [ 0.135,  0.400]     0.319  0.263
──────────────────────────────────────────────────────────

上述结果主要有如下几个部分：

拟合优度分析
方差分析
偏回归系数
标准化偏回归系数

下面就主要的几个参数进行解释：

决定系数R2R^2R2

决定系数R2R^2R2为复相关系数R（yiy_iyi与其估计值y^\hat{y}y^的pearson相关系数）的平方值。决定系数为模型的可解释的变异SSR占总变异SS的比例。

效应值η²p

变量的效应值。标准化的效应量可以分为三大家族：d-family（difference family）：如Cohen’s d、Hedges’ gr-family（correlation family）：如Pearson r、R²、η²、ω²、fOR-family（categorical family）：如odds ratio (OR)、risk ratio (RR)

方差膨胀系数VIF

方差膨胀系数VIF，表示自变量间的共线性程度，其值为容忍度的倒数，容忍度（小于1）为1减去以改自变量为因变量，其余自变量为自变量的线性模型的决定系数，可以预见如果该自变量与其余自变量存在强线性关系（决定系数越大），则其容忍度越低，而VIF值（大于1）则越大。一般认为VIF不应大于5~10。

标准化偏回归系数

标准化偏回归系数（beta）可以理解为变量的敏感度，即自变量每变化1个标准差sdxsd_xsdx，因变量变化beta个标准差sdysd_ysdy。偏回归系数大的自变量其标准化偏回归系数不一定大。

偏相关r.partial与半偏相关r.part

partial correlations和semi-partial (part)correlations适用于研究消除其他变量的前提下两变量的相关性。具体来说：

partial correlations是指在消除了其他所有随机变量影响的条件下，考察两变量之间的相关性。
semi-partial (part) correlations是指在消除了其他变量对两变量部分影响的条件下(如只消除对其中一个变量的影响)，考察两变量之间的相关性。
R语言ppcor这个包可以计算pcor()与spcor()

Partial Correlation

We measure individual differences in many things, including cognitive ability, personality, interests & motives, attitudes, and so forth. Many times, we want to know about the influence of one IV on a DV, but one or more other IVs pose an alternative explanation. We would like to hold some third variable constant while examining the relations between X and Y. With assignment we can do this by design. With measures of individual differences, we can do this statistically rather than by manipulation.

The basic idea in partial and semipartial correlation is to examine the correlations among residuals (errors of prediction). If we regress variable X on variable Z, then subtract X’ from X, we have a residual e. This e will be uncorrelated with Z, so any correlation X shares with another variable Y cannot be due to Z.

Semipartial Correlation

With partial correlation, we find the correlation between X and Y holding Z constant for both X and Y. Sometimes, however, we want to hold Z constant for just X or just Y. In that case, we compute a semipartial correlation. A partial correlation is computed between two residuals. A semipartial is computed between one residual and another raw or unresidualized variable. The notation r1(2.3) means the semipartial correlation between unmodified X1 and residualized X2, where X3 has been taken from X2.

2. 方差分析

dat2<-data.frame(Nrate=rep(rep(c(30,60,90),each=10),10), Ntype=rep(rep(c("CK","NI"),each=10),15),plots=rep(1:30,each=10),days=rep(1:10,30),flux=runif(300,10,100)) MANOVA(data=dat2, subID="plots",dv="flux",between=c("Nrate","Ntype"),within="days",sph.correction = "GG") %>%EMMEANS("Ntype", by="Nrate")

cov可设置协变量，为协方差分析。
sph.correction参数为df校正，当球形假设不满足时，需要设置该参数。

Sphericity correction method to adjust the degrees of freedom (df) when the sphericity assumption is violated. Default is “none”. If Mauchly’s test of sphericity is significant, you may set it to “GG” (Greenhouse-Geisser) or “HF” (Huynh-Feldt).

描述分析结果

====== MANOVA Output (Mixed Design) ======Descriptive Statistics:
───────────────────────────────Nrate Ntype days  Mean  S.D. N
───────────────────────────────30    CK   1  65.08 22.19 530    CK   2  53.67 24.15 530    CK   3  57.90 24.34 530    CK   4  39.94 21.58 5。。。省略。。。。。
───────────────────────────────
Total sample size: N = 30

方差表

Contrasts set to contr.sum for the following variables: Nrate, Ntype
ANOVA Table:
Dependent variable(s):      flux
Between-subjects factor(s): Nrate, Ntype
Within-subjects factor(s):  days
Covariate(s):               -
───────────────────────────────────────────────────────────────────────────────────MS     MSE   df1    df2    F     p      η²p   [90%     CI]
───────────────────────────────────────────────────────────────────────────────────
Nrate              159.853 596.670  2.00  24.00 0.27  .767     0.022 [0.000, 0.116]
Ntype              251.099 596.670  1.00  24.00 0.42  .523     0.017 [0.000, 0.166]
Nrate:Ntype        178.611 596.670  2.00  24.00 0.30  .744     0.024 [0.000, 0.124]
days              1258.718 976.600  6.63 159.16 1.29  .261     0.051 [0.000, 0.077]
Nrate:days        1029.677 976.600 13.26 159.16 1.05  .403     0.081 [0.000, 0.077]
Ntype:days         624.971 976.600  6.63 159.16 0.64  .714     0.026 [0.000, 0.034]
Nrate:Ntype:days   936.235 976.600 13.26 159.16 0.96  .495     0.074 [0.000, 0.067]
───────────────────────────────────────────────────────────────────────────────────
MSE = Mean Square Error (an estimate of the population variance σ²)
Sphericity correction method: GG (Greenhouse-Geisser)ANOVA Effect Size:ω2      η2   η2[G]   η2[p] Cohen's f
Nrate             -0.005   0.002   0.002   0.022     0.150
Ntype             -0.002   0.001   0.001   0.017     0.132
Nrate:Ntype       -0.005   0.002   0.002   0.024     0.157
days               0.009   0.040   0.047   0.051     0.232
Nrate:days         0.003   0.065   0.074   0.081     0.297
Ntype:days        -0.011   0.020   0.024   0.026     0.163
Nrate:Ntype:days  -0.003   0.059   0.068   0.074     0.283ω² = omega-squared = (SS - df1 * MSE) / (SST + MSE)
η² = eta-squared = SS / SST
η²G = generalized eta-squared (see Olejnik & Algina, 2003)
η²p = partial eta-squared = SS / (SS + SSE) = F * df1 / (F * df1 + df2)
Cohen’s f = sqrt( η²p / (1 - η²p) )

方差一致性检验

Levene’s Test for Homogeneity of Variance:
DV = flux:Levene's F df1 df2     p sig
Based on Mean         0.58   5 294  .713
Based on Median       0.48   5 294  .790

球形检验

Mauchly’s Test of Sphericity:Mauchly's W       p
days                 0.11454 0.44742
Nrate:days           0.11454 0.44742
Ntype:days           0.11454 0.44742
Nrate:Ntype:days     0.11454 0.44742

球形检验的结果，由于交互作用的水平数都大于2，需要进行球形检验。当球形检验的p 值大于0.05时，自变量各水平之间都符合球形假设，即相互独立。反之则表明不符合球形假设，需要参考第三个表Sphericity Corrections中的GGe（Greenhouse-Geisser epsilon）或者HFe（Huynh-Feldt epsilon）对df 进行校正。关于采取哪种校正方案， Girden (1992)认为当epsilon > 0.75时，建议用Huynh-Feldt校正，当epsilon < 0.75时，建议用Greenhouse-Geisser校正。Mauchly’s Test表的结果显示，均满足球形假设，如果不满足，需要根据Sphericity Corrections 中的GGe对交互作用的df 进行校正。

简单效应

------ EMMEANS Output (effect = "Ntype") ------Simple Effects of "Ntype":
Nrate = 30:----       df1 df2    F         p sig   η2p  [90%     CI] Ntype        1  24 0.20 0.6597489     0.008 [0.000, 0.136]days         9  24 1.94 0.0950746 .   0.420 [0.000, 0.446]Ntype:days   9  24 0.31 0.9631948     0.105 [0.000, 1.000]Nrate = 60:----       df1 df2    F         p sig   η2p  [90%     CI] Ntype        1  24 2.23 0.1482685     0.085 [0.000, 0.278]days         9  24 0.95 0.5045090     0.262 [0.000, 0.259]Ntype:days   9  24 1.37 0.2574159     0.339 [0.000, 0.355]Nrate = 90:----       df1 df2    F         p sig   η2p  [90%     CI] Ntype        1  24 3.61 0.0693981 .   0.131 [0.000, 0.331]days         9  24 1.61 0.1686279     0.376 [0.000, 0.398]Ntype:days   9  24 2.84 0.0197606 *   0.516 [0.061, 0.545]Estimated Marginal Means of "Ntype":
Nrate = 30:Ntype EM.Mean    S.E. df  [95%     CI] CK      52.18 (3.650) 24 [44.65, 59.71]NI      54.48 (3.650) 24 [46.95, 62.01]Nrate = 60:Ntype EM.Mean    S.E. df  [95%     CI] CK      58.41 (3.650) 24 [50.88, 65.94]NI      50.70 (3.650) 24 [43.17, 58.23]Nrate = 90:Ntype EM.Mean    S.E. df  [95%     CI] CK      50.84 (3.650) 24 [43.31, 58.37]NI      60.65 (3.650) 24 [53.12, 68.18]Results are averaged over the levels of: days
EM.Mean uses an equally weighted average. Pairwise Comparisons of "Ntype":
Nrate = 30:Contrast     b    S.E. df     t     p sig  Cohen's d  [95%     CI]NI - CK   2.30 (5.162) 24  0.45  .660      0.0891568 [-0.32, 0.50]Nrate = 60:Contrast     b    S.E. df     t     p sig  Cohen's d  [95%     CI]NI - CK  -7.71 (5.162) 24 -1.49  .148     -0.2987534 [-0.71, 0.11]Nrate = 90:Contrast     b    S.E. df     t     p sig  Cohen's d  [95%     CI]NI - CK   9.81 (5.162) 24  1.90  .069 .    0.3801680 [-0.03, 0.79]SD_pooled for computing Cohen’s d: 25.81
Results are averaged over the levels of: days
No need to adjust p values.

基于R语言的方差分析及多重比较相关推荐

单因素方差分析_基于R语言开展方差分析(一)——单因素方差分析
基本原理方差分析(Analysis of variance, ANOVA)是用于两个或两个以上样本均数比较的方法,还可以分析两个或多个研究因素的交互交互作用以及回归方程的线性假设检验等.其基本思想是 ...
R语言应用实战系列（二）-基于R语言的方差分析
前言回归分析重点考察变量间的相关关系或因果关系,当包含因子是解释变量时,关注点通常是从变量间的关系转向组与组之间的差异分析,这种分析样本组之间的区别的方法称为方差分析(Analysis of Var ...
R语言做GGEbiplot_基于R语言的GGE双标图在大豆区试中的应用
基于 R 语言的 GGE 双标图在大豆区试中的应用 * 昝凯,陈亚光,徐淑霞 ** ,周青,张志民,杨慧凤,王凤菊,李明军,郭海芳 [摘要] 摘要:为合理评价大豆区域试验中品系的丰产性.稳产性和适 ...
Protein Cell：基于R语言的微生物组数据挖掘最佳流程（大众评审截止14号晚6点）...
Protein & Cell综述:基于R语言的微生物组数据挖掘的最佳流程近日,中国农业科学院刘永鑫组联合南京农业大学袁军组在国际期刊 Protein & Cell (IF = 15. ...
R语言嵌套方差分析（Nested ANOVA）实战
R语言嵌套方差分析(Nested ANOVA)实战目录 R语言嵌套方差分析(Nested ANOVA)实战 #嵌套方差分析(Nested ANOVA)
R语言Welch方差分析（Welch’s ANOVA）实战：Welch方差分析是典型的单因素方差分析的一种替代方法，当方差相等的假设被违反时我们无法使用单因素方差分析，这时候Welch’s出来救场了
R语言Welch方差分析(Welch's ANOVA)实战:Welch方差分析是典型的单因素方差分析的一种替代方法,当方差相等的假设被违反时我们无法使用单因素方差分析,这时候Welch's出来救场了 ...
canoco5主成分分析步骤_基于R语言的主成分分析
基于R语言的主成分分析加入的SPSS群里有人问,怎么用SPSS进行主成分分析.确实没有注意到这种操作.很好奇,于是翻了翻孙振球的<医学统计学>,发现主成分分析这一块,竟使用了SAS!后来 ...
r语言线性回归相关系数_基于R语言的lmer混合线性回归模型
原文基于R语言的lmer混合线性回归模型tecdat.cn 混合模型适合需求吗? 混合模型在很多方面与线性模型相似.它估计一个或多个解释变量对响应变量的影响.混合模型的输出将给出一个解释值列表,其 ...
基于R语言实现的交通时空大数据处理
2019独角兽企业重金招聘Python工程师标准>>> 基于R语言实现的交通时空大数据处理 Import public NYC taxi and Uber trip data int ...

基于R语言的方差分析及多重比较