Contents

Preface

I. Progress

II. Main Content

1.Unsupervised Learning

2.K-means Algorithm

3.Data Visualization

4.Data Compression

5.PCA Algorithm

6.Advice for Applying PCA

7.Assignment

Summary

Preface

Clustering & dimensionality reduction

I. Progress

Week 8 (86%)

II. Main Content

1.Unsupervised Learning

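Unsupervised learning means there is no y (label) set: all examples are treated alike, and there is no relationship of the form y=f(x) to learn.

The main problem type is Clustering.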

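2.K-means Algorithm

First assign every point to its nearest cluster centroid, then move each centroid to the mean of the points assigned to it, and repeat. No need to spell out the details; I won't forget them.

The objective is to minimize the sum of squared distances from every point to its own cluster centroid.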
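A minimal Octave sketch of the two alternating steps and the cost they reduce. The toy data and the hand-picked starting centroids are made up for illustration; the distortion J is written here as the mean squared distance, which has the same minimizer as the sum.

```octave
% Toy data (made up): two well-separated groups in 2-D, one example per row.
X = [1.0 1.1; 0.9 1.0; 1.1 0.9; 5.0 5.1; 4.9 5.0; 5.1 4.9];
centroids = X([1 4], :);          % starting centroids, picked by hand here
K = size(centroids, 1);
m = size(X, 1);

for iter = 1:10
  % Assignment step: squared distance from every point to every centroid.
  D = zeros(m, K);
  for k = 1:K
    D(:, k) = sum((X - centroids(k, :)).^2, 2);
  end
  [~, idx] = min(D, [], 2);

  % Update step: move each centroid to the mean of its assigned points.
  for k = 1:K
    if any(idx == k)
      centroids(k, :) = mean(X(idx == k, :), 1);
    end
  end
end

% Distortion: mean squared distance from each point to its centroid.
J = mean(sum((X - centroids(idx, :)).^2, 2));
printf("J = %g\n", J);
```

The `any(idx == k)` guard simply leaves a centroid where it is if no point was assigned to it; that is one possible choice, not the only one.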

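As for initializing the centroids, randomly pick some of the existing data points. Randomness can of course cause trouble: the run may get stuck in a local optimum, which gives a poor clustering.

The remedy is to run K-means several times with different random initializations and keep the run with the lowest final cost.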
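The multiple-restart idea could look roughly like this in Octave. The data, the number of trials (50) and the number of inner iterations (10) are arbitrary choices for the sketch, and names such as `bestJ` and `bestCentroids` are made up here.

```octave
% Toy data (made up): three tight pairs of points along a line.
X = [0 0; 0.1 0; 5 0; 5.1 0; 10 0; 10.1 0];
K = 3;
m = size(X, 1);
bestJ = Inf;

for trial = 1:50
  % Random initialization: K distinct training examples become the centroids.
  perm = randperm(m);
  centroids = X(perm(1:K), :);

  for iter = 1:10
    D = zeros(m, K);
    for k = 1:K
      D(:, k) = sum((X - centroids(k, :)).^2, 2);
    end
    [~, idx] = min(D, [], 2);
    for k = 1:K
      if any(idx == k)
        centroids(k, :) = mean(X(idx == k, :), 1);
      end
    end
  end

  % Keep the run with the lowest distortion.
  J = mean(sum((X - centroids(idx, :)).^2, 2));
  if J < bestJ
    bestJ = J;
    bestCentroids = centroids;
    bestIdx = idx;
  end
end
printf("best J over 50 trials = %g\n", bestJ);
```

On this data a bad start (for example two initial centroids inside the same pair) gets stuck with one centroid covering two pairs, which is exactly the local optimum that the restarts are meant to escape.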

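To choose the number of clusters, use the elbow method. As the number of clusters grows, the cost function may show an "elbow", the point where the slope changes most sharply. If there is no clear elbow, pick by hand a point where the cost is low enough. Sometimes the choice is driven by the application: a clothing retailer deciding on garment sizes, for example, can use clustering to define the size groups.

3.Data Visualization

This part sets the stage for Data Compression. Suppose a dataset has many dimensions; how can it be shown intuitively in a single plot? The approach is to combine the dimensions (using the eigenvector ideas from linear algebra) so that the data ends up reduced to a few dimensions. As for the country in the lower-right corner of the plot, no need to say which one it is (the rabbit has something to say).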

4.Data Compression

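Core concept: dimensionality reduction.

5.PCA Algorithm

A dimensionality reduction algorithm.

First, the mathematical idea behind PCA. For a set of multi-dimensional data, first find the eigenvectors of the data's covariance matrix. How many eigenvectors to keep depends on how many dimensions you want to reduce to. Then project the data onto these eigenvectors. In the space spanned by the eigenvectors, the projected coordinates form new vectors, and these are the reduced data we want.

Suppose we reduce the data from n dimensions to k dimensions. The steps are:

1.Feature Scaling. Apply FS to the data as usual. There are several ways to do FS, for example mean normalization: subtract each feature's mean and divide by its standard deviation (or range).

2.Compute the covariance matrix. With the m examples stored as rows of X (as in the assignment code), the formula is:

Sigma = (1/m) * X' * X

Here Sigma is the capital form of the Greek letter σ, used as the name of the matrix; it is not a summation sign.

3.Compute the eigenvectors, using the function svd(Sigma):

[U,S,V] = svd(Sigma);

Here U is an n*n matrix whose columns are the eigenvectors: U = [u1 u2 ... un].

However many dimensions we want, we use that many leading columns of U. The matrix formed by the first k columns is called Ureduce.

4.Get the result.

Z = X * Ureduce

Z has dimensions m*k. (For a single example x written as an n*1 column vector, this is z = Ureduce' * x, a k*1 vector.)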
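The steps above can be sketched end to end in Octave. This is only an illustration: the data is made up, examples are assumed to be stored as rows of X (as in the assignment code), and scaling by the standard deviation is one of several possible FS choices.

```octave
% Toy data (made up): m = 5 examples as rows, n = 2 features.
X = [2.5 2.4; 0.5 0.7; 2.2 2.9; 1.9 2.2; 3.1 3.0];
[m, n] = size(X);
k = 1;                          % target dimension

% Step 1: feature scaling (subtract the mean, divide by the std).
mu = mean(X, 1);
sigma = std(X, 0, 1);
Xnorm = (X - mu) ./ sigma;

% Step 2: covariance matrix (n x n).
Sigma = (Xnorm' * Xnorm) / m;

% Step 3: eigenvectors via SVD; the columns of U are the principal directions.
[U, S, V] = svd(Sigma);
Ureduce = U(:, 1:k);            % n x k

% Step 4: project; each row of Z is one example in k dimensions.
Z = Xnorm * Ureduce;            % m x k
disp(size(Z));
```

Keeping `mu` and `sigma` around matters in practice, because any new example has to be scaled with the same values before it is projected with `Ureduce`.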

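Next, some key points about PCA.

First, recovering the data. This is simply the inverse process: after projecting the data into the reduced space, map it back along the directions of that space. Obviously this is lossy. It is a bit like looking up the training points on the fitted line in linear regression: you get the nearest point on the line, not the original point. In practice the formula is the inverse of the reduction: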

Xapprox = Z*Ureduce'
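A small Octave check of the lossy round trip, on made-up data that is already mean-centered: with k = 1 some information is lost, while with k = n the recovery is exact up to floating-point error.

```octave
% Made-up, mean-centered data: 4 examples (rows), 2 features.
X = [1 1; -1 -1; 2 2.2; -2 -2.2];
m = size(X, 1);
[U, S, V] = svd((X' * X) / m);

for k = 1:2
  Ureduce = U(:, 1:k);
  Z = X * Ureduce;              % compress: m x k
  Xapprox = Z * Ureduce';       % recover:  m x n
  err = norm(X - Xapprox, 'fro');
  printf("k = %d, reconstruction error = %g\n", k, err);
end
```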

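Then, choosing the number of dimensions. The requirement is to lose as little of the data's character as possible. We evaluate this with the ratio of the average squared projection error to the total variation of the data:

( (1/m) * sum_i ||x(i) - xapprox(i)||^2 ) / ( (1/m) * sum_i ||x(i)||^2 )

Note that if the recovered Xapprox is close to the original X, this value is small. We only need to set an acceptable loss (for example 0.01, i.e. 99% of the variance retained) to judge whether a given reduction is good enough.

For different numbers of dimensions, we choose with a loop: start from k = 1, run PCA, compute the ratio, and increase k until the ratio falls below the threshold.

This evaluation actually corresponds to the diagonal matrix S returned by svd above: the ratio equals 1 - (sum of the first k diagonal entries of S) / (sum of all n diagonal entries). Computing it from S is more convenient, because svd only has to run once.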
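A sketch of choosing k from S in Octave, assuming mean-centered data (made up here) and a 99% variance-retained target. The last lines recompute the projection-error ratio directly, only to show that it matches the value read off S.

```octave
% Made-up, mean-centered data: 6 examples (rows), 3 features.
X = [ 2.0  1.9  0.1;
     -2.0 -2.1 -0.1;
      1.0  1.1  0.0;
     -1.0 -0.9  0.0;
      3.0  3.0  0.1;
     -3.0 -3.0 -0.1];
m = size(X, 1);
[U, S, V] = svd((X' * X) / m);
s = diag(S);

% Variance retained by the first k components, for every k at once.
retained = cumsum(s) / sum(s);

% Smallest k that keeps at least 99% of the variance.
k = find(retained >= 0.99, 1);
printf("k = %d, variance retained = %.4f\n", k, retained(k));

% Cross-check against the projection-error ratio computed from the data.
Ureduce = U(:, 1:k);
Xapprox = (X * Ureduce) * Ureduce';
ratio = sum(sum((X - Xapprox).^2)) / sum(sum(X.^2));
printf("error ratio = %.6f, 1 - retained = %.6f\n", ratio, 1 - retained(k));
```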

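Finally: PCA is not linear regression. Linear regression minimizes vertical distances (the error in y) and has a y to predict; PCA minimizes the orthogonal projection distance and treats all features alike, with no x/y distinction. They are fundamentally different things that merely look similar.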
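To make the difference concrete, here is a small Octave comparison on made-up, mean-centered 2-D data: the least-squares slope (vertical errors) and the slope of the first principal direction (orthogonal errors) come out different.

```octave
% Made-up, mean-centered data; column 1 plays the role of x, column 2 of y.
P = [-2 -1.5; -1 -1.2; 0 0.3; 1 0.9; 2 1.5];
x = P(:, 1);
y = P(:, 2);

% Linear regression through the origin: minimizes vertical squared errors.
slopeReg = (x' * y) / (x' * x);

% PCA: the first principal direction minimizes orthogonal squared distances.
[U, S, V] = svd((P' * P) / size(P, 1));
slopePca = U(2, 1) / U(1, 1);

printf("regression slope = %.4f, PCA slope = %.4f\n", slopeReg, slopePca);
```

The ratio `U(2, 1) / U(1, 1)` is used because the sign of a singular vector is arbitrary, and the ratio does not depend on it.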

6.Advice for Applying PCA

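First, in some Supervised Learning problems the input X is very high-dimensional, for example a 100*100-pixel image (10,000 features). In that case we can first reduce X to a smaller Z, say 10*10 (100 features), and then learn y=f(Z).

Also, do not use PCA to prevent overfitting. Reducing the dimension does reduce the number of parameters θ, which is one way to fight overfitting, but "This might work OK, but isn't a good way to address overfitting. Use regularization instead".

Finally, do not reach for PCA first in every situation. It can improve efficiency, but PCA is lossy compression after all, and there is no telling when it will compress away something important (such as the goose's fluorescent blue logo).

7.Assignment

Easy. No big problems.

Image compression uses clustering: to compress an image to 16 colors, cluster all pixel colors of the original image into 16 clusters automatically and replace each pixel with its cluster centroid; the image then contains only 16 colors.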
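The reshaping involved can be sketched in Octave on a tiny made-up image. The 2-color palette in `centroids` is hypothetical; in the real exercise it would come from running K-means on the pixel rows with K = 16.

```octave
% A made-up 2x2 RGB "image" with channel values in [0, 1].
A = zeros(2, 2, 3);
A(1, 1, :) = [0.9 0.1 0.1];
A(1, 2, :) = [1.0 0.0 0.0];
A(2, 1, :) = [0.1 0.1 0.9];
A(2, 2, :) = [0.0 0.0 1.0];

% One row per pixel, one column per color channel.
X = reshape(A, [], 3);

% Hypothetical palette; normally the result of K-means on X.
centroids = [0.95 0.05 0.05; 0.05 0.05 0.95];
K = size(centroids, 1);

% Assign each pixel to its nearest palette color.
D = zeros(size(X, 1), K);
for k = 1:K
  D(:, k) = sum((X - centroids(k, :)).^2, 2);
end
[~, idx] = min(D, [], 2);

% Rebuild the image using only the K palette colors.
Acompressed = reshape(centroids(idx, :), size(A));
```

Only `idx` (one small integer per pixel) and the palette need to be stored, which is where the compression comes from.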

function idx = findClosestCentroids(X, centroids)
%FINDCLOSESTCENTROIDS computes the centroid memberships for every example
%   idx = FINDCLOSESTCENTROIDS (X, centroids) returns the closest centroids
%   in idx for a dataset X where each row is a single example. idx = m x 1
%   vector of centroid assignments (i.e. each entry in range [1..K])
%

% Set K
K = size(centroids, 1);

% You need to return the following variables correctly.
idx = zeros(size(X,1), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Go over every example, find its closest centroid, and store
%               the index inside idx at the appropriate location.
%               Concretely, idx(i) should contain the index of the centroid
%               closest to example i. Hence, it should be a value in the
%               range 1..K
%
% Note: You can use a for-loop over the examples to compute this.
%
for i = 1:size(X,1)
  minDistance = inf;
  for j = 1:K
    distance = sqrt(sum((X(i,:)-centroids(j,:)).^2));
    if distance<minDistance
      minDistance = distance;
      idx(i) = j;
    endif
  endfor
endfor

% =============================================================

end

function centroids = computeCentroids(X, idx, K)
%COMPUTECENTROIDS returns the new centroids by computing the means of the
%data points assigned to each centroid.
%   centroids = COMPUTECENTROIDS(X, idx, K) returns the new centroids by
%   computing the means of the data points assigned to each centroid. It is
%   given a dataset X where each row is a single data point, a vector
%   idx of centroid assignments (i.e. each entry in range [1..K]) for each
%   example, and K, the number of centroids. You should return a matrix
%   centroids, where each row of centroids is the mean of the data points
%   assigned to it.
%

% Useful variables
[m n] = size(X);

% You need to return the following variables correctly.
centroids = zeros(K, n);

% ====================== YOUR CODE HERE ======================
% Instructions: Go over every centroid and compute mean of all points that
%               belong to it. Concretely, the row vector centroids(i, :)
%               should contain the mean of the data points assigned to
%               centroid i.
%
% Note: You can use a for-loop over the centroids to compute this.
%
for i = 1 : K
  sumValue = zeros(1,n);
  sumID = 0;
  for j = 1:size(idx,1)
    if idx(j)==i
      sumValue = sumValue + X(j,:);
      sumID = sumID + 1;
    endif
  endfor
  centroids(i,:) = sumValue/sumID;
endfor

% =============================================================

end

function [U, S] = pca(X)
%PCA Run principal component analysis on the dataset X
%   [U, S] = pca(X) computes eigenvectors of the covariance matrix of X
%   Returns the eigenvectors U, the eigenvalues (on diagonal) in S
%

% Useful values
[m, n] = size(X);

% You need to return the following variables correctly.
U = zeros(n);
S = zeros(n);

% ====================== YOUR CODE HERE ======================
% Instructions: You should first compute the covariance matrix. Then, you
%               should use the "svd" function to compute the eigenvectors
%               and eigenvalues of the covariance matrix.
%
% Note: When computing the covariance matrix, remember to divide by m (the
%       number of examples).
%
Sigma = X'*X/m;
[U, S, V] = svd(Sigma);
% =========================================================================

end

function Z = projectData(X, U, K)
%PROJECTDATA Computes the reduced data representation when projecting only
%on to the top k eigenvectors
%   Z = projectData(X, U, K) computes the projection of
%   the normalized inputs X into the reduced dimensional space spanned by
%   the first K columns of U. It returns the projected examples in Z.
%

% You need to return the following variables correctly.
Z = zeros(size(X, 1), K);

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the projection of the data using only the top K
%               eigenvectors in U (first K columns).
%               For the i-th example X(i,:), the projection on to the k-th
%               eigenvector is given as follows:
%                    x = X(i, :)';
%                    projection_k = x' * U(:, k);
%
for i = 1:size(X, 1)
  x = X(i, :)';
  projection_k = x' * U(:, 1:K);
  Z(i,:) = projection_k;
end

% =============================================================

end

function X_rec = recoverData(Z, U, K)
%RECOVERDATA Recovers an approximation of the original data when using the
%projected data
%   X_rec = RECOVERDATA(Z, U, K) recovers an approximation the
%   original data that has been reduced to K dimensions. It returns the
%   approximate reconstruction in X_rec.
%

% You need to return the following variables correctly.
X_rec = zeros(size(Z, 1), size(U, 1));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the approximation of the data by projecting back
%               onto the original space using the top K eigenvectors in U.
%
%               For the i-th example Z(i,:), the (approximate)
%               recovered data for dimension j is given as follows:
%                    v = Z(i, :)';
%                    recovered_j = v' * U(j, 1:K)';
%
%               Notice that U(j, 1:K) is a row vector.
%
X_rec = Z*U(:,1:K)';
% =============================================================

end

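Summary

A few random thoughts. I happened to hear 祝乾亮's 《二向箔降维打击》 and it felt oddly fitting for this week's topic. Of course, dimensionality reduction is just one way of processing data.

As for Clustering: does human society also spontaneously form many cluster centroids, which then keep shifting as they continue to attract people?

Finally, let's compress the goose with this week's assignment: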
