1. Introduction to the MATLAB drtoolbox

The Matlab Toolbox for Dimensionality Reduction contains Matlab implementations of 38 techniques for dimensionality reduction and metric learning.


This Matlab toolbox implements 32 techniques for dimensionality reduction. These techniques are all available through the COMPUTE_MAPPING function or through the GUI. The following techniques are available:

- Principal Component Analysis ('PCA')

- Linear Discriminant Analysis ('LDA')

- Multidimensional scaling ('MDS')

- Probabilistic PCA ('ProbPCA')

- Factor analysis ('FactorAnalysis')

- Sammon mapping ('Sammon')

- Isomap ('Isomap')

- Landmark Isomap ('LandmarkIsomap')

- Locally Linear Embedding ('LLE')

- Laplacian Eigenmaps ('Laplacian')

- Hessian LLE ('HessianLLE')

- Local Tangent Space Alignment ('LTSA')

- Diffusion maps ('DiffusionMaps')

- Kernel PCA ('KernelPCA')

- Generalized Discriminant Analysis ('KernelLDA')

- Stochastic Neighbor Embedding ('SNE')

- Symmetric Stochastic Neighbor Embedding ('SymSNE')

- t-Distributed Stochastic Neighbor Embedding ('tSNE')

- Neighborhood Preserving Embedding ('NPE')

- Locality Preserving Projection ('LPP')

- Stochastic Proximity Embedding ('SPE')

- Linear Local Tangent Space Alignment ('LLTSA')

- Conformal Eigenmaps ('CCA', implemented as an extension of LLE)

- Maximum Variance Unfolding ('MVU', implemented as an extension of LLE)

- Landmark Maximum Variance Unfolding ('LandmarkMVU')

- Fast Maximum Variance Unfolding ('FastMVU')

- Locally Linear Coordination ('LLC')

- Manifold charting ('ManifoldChart')

- Coordinated Factor Analysis ('CFA')

- Gaussian Process Latent Variable Model ('GPLVM')

- Autoencoders using stack-of-RBMs pretraining ('AutoEncoderRBM')

- Autoencoders using evolutionary optimization ('AutoEncoderEA')
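
As a rough sketch of how the names in the list above are passed to COMPUTE_MAPPING: the call shape follows the usage shown in the example script of this post. The dataset, the target dimensionality of 2, and the three techniques tried are illustrative assumptions, and the toolbox is assumed to be on the MATLAB path already.

```matlab
% Sketch: try a few techniques by name through compute_mapping.
% Assumes drtoolbox is on the MATLAB path; dataset and no_dims are illustrative.
[X, labels] = generate_data('helix', 500);   % toy dataset from the toolbox
techniques = {'PCA', 'MDS', 'Isomap'};       % names taken from the list above
no_dims = 2;                                 % assumed target dimensionality

results = cell(size(techniques));
for i = 1:numel(techniques)
    % Same call shape for every technique: data, name, target dimensionality
    [mappedX, mapping] = compute_mapping(X, techniques{i}, no_dims);
    results{i} = mappedX;
    fprintf('%s: %d points -> %d dimensions\n', ...
        techniques{i}, size(mappedX, 1), size(mappedX, 2));
end
```

Some neighborhood-graph techniques may return fewer rows than the input, which is why the row count is printed rather than assumed.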

Furthermore, the toolbox contains 6 techniques for intrinsic dimensionality estimation. These techniques are available through the function INTRINSIC_DIM. The following techniques are available:

- Eigenvalue-based estimation ('EigValue')

- Maximum Likelihood Estimator ('MLE')

- Estimator based on correlation dimension ('CorrDim')

- Estimator based on nearest neighbor evaluation ('NearNb')

- Estimator based on packing numbers ('PackingNumbers')

- Estimator based on geodesic minimum spanning tree ('GMST')
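
The estimators listed above can be compared on the same data with a short loop. This is only a sketch: the call shape `intrinsic_dim(X, name)` is the one used in the example script of this post, while the dataset and the subset of estimators are arbitrary choices made for illustration.

```matlab
% Sketch: compare several intrinsic dimensionality estimators on one dataset.
% Assumes drtoolbox is on the MATLAB path; the dataset choice is illustrative.
[X, labels] = generate_data('helix', 1000);
estimators = {'EigValue', 'MLE', 'CorrDim'};   % names taken from the list above

estimates = zeros(1, numel(estimators));
for i = 1:numel(estimators)
    estimates(i) = intrinsic_dim(X, estimators{i});
    fprintf('%-10s estimate: %.2f\n', estimators{i}, estimates(i));
end

% Estimates are generally not integers, so round before using one as a
% target dimensionality for compute_mapping.
no_dims = round(estimates(2));
```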

In addition to these techniques, the toolbox contains functions for prewhitening of data (the function PREWHITEN), exact and estimated out-of-sample extension (the functions OUT_OF_SAMPLE and OUT_OF_SAMPLE_EST), and a function that generates toy datasets (the function GENERATE_DATA).
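
A minimal sketch of how these helpers might fit together is given below. The signatures of `prewhiten`, `out_of_sample`, and `out_of_sample_est` are not shown in this post, so the forms used here are assumptions; check each function's help text (for example `help out_of_sample`) before relying on them.

```matlab
% Sketch only. Assumed signatures (verify with the toolbox help text):
%   Xw       = prewhiten(X)
%   t_points = out_of_sample(points, mapping)
%   t_points = out_of_sample_est(points, X, mappedX)
[X, labels] = generate_data('helix', 1000);

% Random split into points used to learn the mapping and held-out points
idx = randperm(size(X, 1));
trainX = X(idx(1:800), :);
testX  = X(idx(801:end), :);

% Optional preprocessing step (assumed single-output form)
Xw = prewhiten(trainX);

% Learn a mapping on the training points, then embed the held-out points
[mappedTrain, mapping] = compute_mapping(trainX, 'PCA', 2);
mappedTest = out_of_sample(testX, mapping);                  % exact extension
mappedTestEst = out_of_sample_est(testX, trainX, mappedTrain); % estimated extension

fprintf('Embedded %d held-out points into %d dimensions\n', ...
    size(mappedTest, 1), size(mappedTest, 2));
```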

The graphical user interface of the toolbox is accessible through the DRGUI function.

2. Installation



Run the rehash toolboxcache command to finish loading the toolbox:

>> rehash toolboxcache


>> what drtoolbox
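
The same check can be scripted. The sketch below uses only built-in MATLAB functions; the install location is a hypothetical placeholder and should be replaced with wherever the toolbox was actually unpacked.

```matlab
% Hypothetical install location; replace with the folder where drtoolbox lives.
drtoolboxDir = fullfile(matlabroot, 'toolbox', 'drtoolbox');

if exist(drtoolboxDir, 'dir')
    addpath(genpath(drtoolboxDir));   % add the toolbox and its subfolders
    rehash toolboxcache               % refresh MATLAB's toolbox cache
end

% exist returns 2 when a file with this function name is found on the path
found = (exist('compute_mapping', 'file') == 2);
if found
    disp('drtoolbox is available.');
else
    disp('drtoolbox was not found on the MATLAB path.');
end
```

Calling `savepath` afterwards would keep the path change across sessions, if that is wanted.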





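Linear dimensionality reduction means that the low-dimensional data obtained by the reduction preserves the linear relationships among the high-dimensional data points. Linear methods mainly include PCA, LDA, and LPP (LPP is in fact a linear formulation of Laplacian Eigenmaps). Nonlinear methods fall into two groups. The first group is kernel-based, such as KPCA, and is not discussed here. The second group is what is usually called manifold learning: recovering a low-dimensional manifold structure from high-dimensional sampled data (assuming the data are sampled uniformly from a low-dimensional manifold in a high-dimensional Euclidean space), that is, finding the low-dimensional manifold in the high-dimensional space and computing the corresponding embedding map. Nonlinear manifold learning methods include Isomap, LLE, Laplacian Eigenmaps, LTSA, and MVU.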
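
To make "linear" concrete, here is a small sketch of PCA written with base MATLAB only, so it does not depend on the toolbox. The synthetic data and variable names are illustrative assumptions; the point is that both the projection and the reconstruction are plain matrix products.

```matlab
% Sketch: PCA as a linear map, using only base MATLAB (no toolbox needed).
rng(0);                                   % reproducible synthetic data
n = 200;
t = randn(n, 1);
X = [t, 2 * t, -t] + 0.01 * randn(n, 3);  % 3-D points lying close to a line

mu = mean(X, 1);
Xc = X - mu;                              % center (implicit expansion, R2016b+)
[~, ~, V] = svd(Xc, 'econ');              % columns of V are principal directions

no_dims = 1;
W = V(:, 1:no_dims);                      % linear projection matrix (3 x 1)
Y = Xc * W;                               % low-dimensional coordinates (n x 1)

% Because the map is linear, reconstruction is also a matrix product
Xrec = Y * W' + mu;
relErr = norm(X - Xrec, 'fro') / norm(X, 'fro');
fprintf('Relative reconstruction error with %d dimension(s): %.4f\n', ...
    no_dims, relErr);
```

A manifold learning method has no such fixed matrix `W`, which is one reason out-of-sample extension needs separate handling for those methods.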



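The main difference between supervised and unsupervised learning is whether the data samples carry class information. Unsupervised dimensionality reduction methods aim to minimize the loss of information during the reduction; examples are PCA, LPP, Isomap, LLE, Laplacian Eigenmaps, LTSA, and MVU. Supervised methods aim to maximize the discriminative information between classes; LDA is an example. In fact, supervised or semi-supervised variants have been studied for the unsupervised algorithms as well.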
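
The difference can be seen on a toy problem. The sketch below uses base MATLAB only and a textbook two-class Fisher discriminant rather than the toolbox's own LDA code; the synthetic data are an assumption chosen so that the direction of largest variance is not the direction that separates the classes.

```matlab
% Sketch: unsupervised (PCA) vs. supervised (two-class Fisher LDA) direction.
rng(1);
n = 200;
% Both classes spread widely along x, but are separated along y
X1 = [4 * randn(n, 1), 0.3 * randn(n, 1) + 1];
X2 = [4 * randn(n, 1), 0.3 * randn(n, 1) - 1];
X = [X1; X2];
labels = [ones(n, 1); 2 * ones(n, 1)];

% Unsupervised: the first principal direction ignores the labels
Xc = X - mean(X, 1);
[~, ~, V] = svd(Xc, 'econ');
w_pca = V(:, 1);

% Supervised: the Fisher direction uses class means and within-class scatter
m1 = mean(X(labels == 1, :), 1)';
m2 = mean(X(labels == 2, :), 1)';
Sw = cov(X(labels == 1, :)) + cov(X(labels == 2, :));  % within-class scatter (up to scale)
w_lda = Sw \ (m1 - m2);
w_lda = w_lda / norm(w_lda);

disp('PCA direction:'); disp(w_pca');
disp('LDA direction:'); disp(w_lda');
```

With this data the PCA direction lies close to the x axis, where the variance is largest, while the Fisher direction lies close to the y axis, where the classes differ.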


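Local methods consider only the local information of the sample set, that is, the relationship between each data point and its neighbors. LLE is the representative local method; others include Laplacian Eigenmaps, LPP, and LTSA.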
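
The "local information" such methods start from is typically a k-nearest-neighbor graph. The sketch below builds one with base MATLAB only; it illustrates the idea and is not the toolbox's own neighbor search. The random data and the neighborhood size `k = 7` are assumptions.

```matlab
% Sketch: build a k-nearest-neighbor graph, the starting point of local methods.
rng(2);
n = 100;
X = rand(n, 3);          % synthetic points standing in for real data
k = 7;                   % assumed neighborhood size

% Pairwise squared Euclidean distances (n x n), implicit expansion (R2016b+)
sq = sum(X .^ 2, 2);
D2 = sq + sq' - 2 * (X * X');
D2(1:n+1:end) = Inf;     % exclude each point from its own neighbor list

% Indices of the k nearest neighbors of every point
[~, order] = sort(D2, 2);
nbrs = order(:, 1:k);    % n x k

% Sparse adjacency: entry (i, j) is 1 if j is among the k nearest neighbors of i
rows = repmat((1:n)', 1, k);
A = sparse(rows(:), nbrs(:), 1, n, n);
fprintf('Each point is linked to %d of the %d other points.\n', k, n - 1);
```

Everything outside these `k` links per point is invisible to a purely local method.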


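Because local methods do not consider the relationships between samples that are far apart on the data manifold, they cannot achieve the goal of "making samples that are far apart on the data manifold also far apart in feature space".

4. Using the toolbox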





close all

% Generate test data
[X, labels] = generate_data('helix', 2000);
figure
scatter3(X(:,1), X(:,2), X(:,3), 5, labels)
title('Original dataset')

% Estimate the intrinsic dimensionality
no_dims = round(intrinsic_dim(X, 'MLE'));
disp(['MLE estimate of intrinsic dimensionality: ' num2str(no_dims)]);

% Dimensionality reduction with PCA
[mappedX, mapping] = compute_mapping(X, 'PCA', no_dims);
figure   % open a new window so the previous plot is not overwritten
scatter(mappedX(:,1), mappedX(:,2), 5, labels)
title('Result of PCA')

% Dimensionality reduction with Laplacian Eigenmaps
[mappedX, mapping] = compute_mapping(X, 'Laplacian', no_dims, 7);
figure
% mapping.conn_comp selects the labels of the points kept in the embedding
scatter(mappedX(:,1), mappedX(:,2), 5, labels(mapping.conn_comp))
title('Result of Laplacian Eigenmaps')

% Dimensionality reduction with Isomap
[mappedX, mapping] = compute_mapping(X, 'Isomap', no_dims);
figure
scatter(mappedX(:,1), mappedX(:,2), 5, labels(mapping.conn_comp))
title('Result of Isomap')


