The Scatter Matrix and Its Relationship to the Covariance Matrix

2024-04-23 16:52:46

In multivariate statistics and probability theory, the scatter matrix is a statistic used to estimate the covariance matrix, for instance that of the multivariate normal distribution.

Definition

Given $n$ samples of $m$-dimensional data, represented as the $m$-by-$n$ matrix $X=[x_1,x_2,\ldots,x_n]$, the sample mean is
$$\overline{x}=\frac{1}{n}\sum_{j=1}^n x_j$$

where $x_j$ is the $j$-th column of $X$.

The scatter matrix is the $m$-by-$m$ positive semi-definite matrix (the covariance matrix is positive semi-definite as well)

$$S=\sum_{j=1}^n(x_j-\overline{x})(x_j-\overline{x})^T=\sum_{j=1}^n(x_j-\overline{x})\otimes(x_j-\overline{x})=\bigg(\sum_{j=1}^n x_j x_j^T-n\,\overline{x}\,\overline{x}^T\bigg)$$
where $T$ denotes the matrix transpose and $\otimes$ denotes the outer product. The scatter matrix may be expressed more succinctly as
$$S=XC_nX^T$$

where $C_n$ is the $n$-by-$n$ centering matrix.
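The three expressions for the scatter matrix can be checked numerically. The sketch below uses NumPy on random data; the seed and sizes are arbitrary. It assumes the standard form of the centering matrix, $C_n = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^T$, which the text above does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 50
X = rng.normal(size=(m, n))            # m-by-n data matrix, one sample per column

x_bar = X.mean(axis=1, keepdims=True)  # m-by-1 sample mean

# Form 1: sum of outer products of the centered samples
S_outer = sum(
    np.outer(X[:, j] - x_bar[:, 0], X[:, j] - x_bar[:, 0]) for j in range(n)
)

# Form 2: expanded form, sum_j x_j x_j^T - n * x_bar x_bar^T
S_expanded = X @ X.T - n * (x_bar @ x_bar.T)

# Form 3: X C_n X^T, with C_n = I - (1/n) 1 1^T (assumed standard definition)
C_n = np.eye(n) - np.ones((n, n)) / n
S_centering = X @ C_n @ X.T

print(np.allclose(S_outer, S_expanded), np.allclose(S_outer, S_centering))
```

Form 3 builds an $n$-by-$n$ matrix, so with many samples it is usually cheaper to subtract the mean from the columns directly.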

Outer product: $u\otimes v=uv^T$
If this is still unclear, see: Outer product - Wikipedia
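As a small illustration of the outer product, the sketch below compares `np.outer` with the explicit column-times-row product $uv^T$. The vectors are arbitrary example values.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0])

# Outer product: entry (i, k) equals u[i] * v[k]
outer = np.outer(u, v)

# The same result as a (3x1) column times a (1x2) row, i.e. u v^T
via_matmul = u.reshape(-1, 1) @ v.reshape(1, -1)

print(outer)
print(np.array_equal(outer, via_matmul))
```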

Multiplying a vector by the centering matrix has the same effect as subtracting the mean of the vector's components from every component of that vector.
Centering matrix - Wikipedia
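A minimal sketch of this property. It assumes the standard form $C_n = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^T$, which is not stated explicitly above, and the vector `v` is an arbitrary example.

```python
import numpy as np

n = 4
v = np.array([2.0, 4.0, 6.0, 8.0])

# Centering matrix (assumed standard definition): identity minus the averaging matrix
C_n = np.eye(n) - np.ones((n, n)) / n

centered = C_n @ v                 # multiply by the centering matrix
direct = v - v.mean()              # subtract the mean of the components directly

print(centered, direct)
print(np.allclose(C_n @ C_n, C_n)) # centering twice changes nothing further
```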

Application

The maximum likelihood estimate, given $n$ samples, of the covariance matrix of a multivariate normal distribution can be expressed as the normalized scatter matrix
$$C_{ML}=\frac{1}{n}S$$

When the columns of $X$ are independently sampled from a multivariate normal distribution, $S$ has a Wishart distribution (with $n-1$ degrees of freedom, since the sample mean is subtracted).
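The relationship between the scatter matrix and the covariance estimate can be sketched with NumPy. The true covariance `true_cov`, the seed, and the sample size are illustrative assumptions. `np.cov` treats rows as variables by default, and `bias=True` makes it normalize by $n$ rather than $n-1$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 200
true_cov = np.array([[2.0, 0.5],
                     [0.5, 1.0]])   # illustrative covariance, not from the article

# Draw n samples and arrange them as the columns of an m-by-n matrix
X = rng.multivariate_normal(mean=np.zeros(m), cov=true_cov, size=n).T

D = X - X.mean(axis=1, keepdims=True)  # centered data
S = D @ D.T                            # scatter matrix

C_ml = S / n                           # maximum likelihood estimate (divides by n)
C_unbiased = S / (n - 1)               # unbiased sample covariance, for comparison

print(np.allclose(C_ml, np.cov(X, bias=True)))
print(np.allclose(C_unbiased, np.cov(X)))
```

The two estimates differ only by the factor $n/(n-1)$, which matters little for large $n$.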
