Python Data Processing: PCA/ZCA Whitening

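References: the UFLDL tutorial pages on PCA and Whitening, plus someone else's solution to the exercises, UFLDL教程答案(3):Exercise:PCA_in_2D&PCA_and_Whitening.
The difference from the reference implementation: there, each column is one sample; in my code, each row is one sample.
The code below walks through PCA and ZCA whitening using the data from the exercises.
The complete code is here.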

Exercises

PCA exercise: http://ufldl.stanford.edu/wiki/index.php/Exercise:PCA_in_2D
PCA and whitening exercise: http://ufldl.stanford.edu/wiki/index.php/Exercise:PCA_and_Whitening
The exercise pages above include the required data as well as MATLAB code.

PCA, PCA whitening and ZCA whitening in 2D

Step 0: Load data

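import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

x = np.loadtxt('pca_2d/pcaData.txt')
# Transpose so that each row is one sample, matching the convention used here
x = x.T
# Plot the raw data (hollow blue markers)
plt.scatter(x[:, 0], x[:, 1], marker='o', facecolors='none', edgecolors='b')
# Zero-mean each feature
x -= np.mean(x, axis=0)
# Plot the zero-mean data (hollow green markers)
plt.scatter(x[:, 0], x[:, 1], marker='o', facecolors='none', edgecolors='g')
plt.show()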

Step 1: Implement PCA

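# Covariance of the zero-mean data (rows are samples)
cov = np.dot(x.T, x) / x.shape[0]
U, S, V = np.linalg.svd(cov)
# The reference computes U.T x with column-wise samples; with row-wise samples this becomes x.dot(U)
xRot = np.dot(x, U)
# Check xRot
plt.scatter(xRot[:, 0], xRot[:, 1], marker='o', facecolors='none', edgecolors='b')
plt.show()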
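One way to sanity-check the rotation is to confirm that the covariance of the rotated data is diagonal. The sketch below does this on synthetic data standing in for pcaData.txt; the names `demo`, `U_demo`, `S_demo` and the mixing matrix are made up for illustration and are not part of the exercise.

```python
import numpy as np

# Synthetic stand-in for pcaData.txt (assumption): 500 correlated 2-D samples, one per row
rng = np.random.default_rng(0)
demo = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])
demo -= demo.mean(axis=0)  # zero-mean each feature

cov_demo = demo.T @ demo / demo.shape[0]
U_demo, S_demo, _ = np.linalg.svd(cov_demo)

# With row-wise samples, rotating into the principal basis is a right-multiplication by U
rot_demo = demo @ U_demo
cov_rot_demo = rot_demo.T @ rot_demo / rot_demo.shape[0]

# Off-diagonal entries should vanish; the diagonal should match S_demo
print(np.round(cov_rot_demo, 6))
print(np.round(S_demo, 6))
```

If the off-diagonal entries are not close to zero, the rotation is probably using the wrong orientation of U for the chosen sample layout.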

Step 2: Reduce dimensions and replot

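k = 1
# Project onto the first k principal directions
xReduce = np.dot(x, U[:, 0:k])
# Pad the dropped components with zeros, then rotate back to the original basis
xHat = np.concatenate((xReduce, np.zeros(shape=(x.shape[0], x.shape[1] - k))), axis=1)
xHat = xHat.dot(U.T)
plt.scatter(xHat[:, 0], xHat[:, 1], marker='o', facecolors='none', edgecolors='b')
plt.show()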
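The zero-padding step is equivalent to multiplying the reduced data by the transpose of the first k columns of U. The following is a minimal sketch on synthetic 3-D data (all names ending in `_demo`, and the data itself, are assumptions for illustration) that also checks a known property: the mean squared reconstruction error equals the sum of the discarded eigenvalues.

```python
import numpy as np

# Synthetic zero-mean data (assumption): 200 samples, 3 features, one sample per row
rng = np.random.default_rng(1)
demo = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 3))
demo -= demo.mean(axis=0)
U_demo, S_demo, _ = np.linalg.svd(demo.T @ demo / demo.shape[0])

k_demo = 2
reduced = demo @ U_demo[:, :k_demo]  # shape (200, k_demo)

# Route 1: pad the dropped components with zeros, then rotate back
padded = np.concatenate(
    [reduced, np.zeros((demo.shape[0], demo.shape[1] - k_demo))], axis=1)
hat_padded = padded @ U_demo.T

# Route 2: multiply by the first k_demo columns of U, transposed
hat_direct = reduced @ U_demo[:, :k_demo].T

# Mean squared reconstruction error vs. sum of discarded eigenvalues
mse = np.mean(np.sum((demo - hat_direct) ** 2, axis=1))
print(np.allclose(hat_padded, hat_direct), mse, S_demo[k_demo:].sum())
```

The second route avoids allocating the zero block, which may matter for high-dimensional data such as the image patches later on.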

Step 3: PCA Whitening

e = 1e-5
# S is a 1-D array of eigenvalues; scale each rotated column by 1/sqrt(S + e)
xPCAwhite = xRot / np.sqrt(S + e)
plt.scatter(xPCAwhite[:, 0], xPCAwhite[:, 1], marker='o', facecolors='none', edgecolors='b')
plt.show()
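After PCA whitening, the covariance should be close to the identity; more precisely, its diagonal entries are S / (S + e). Here is a small check on synthetic data, where the data and every `_demo` name are assumptions introduced only for this sketch.

```python
import numpy as np

# Synthetic zero-mean 2-D data (assumption), one sample per row
rng = np.random.default_rng(2)
demo = rng.normal(size=(1000, 2)) @ np.array([[3.0, 0.0], [1.0, 0.3]])
demo -= demo.mean(axis=0)
U_demo, S_demo, _ = np.linalg.svd(demo.T @ demo / demo.shape[0])

eps_demo = 1e-5
# Rotate into the PCA basis, then rescale each component to (almost) unit variance
white_demo = (demo @ U_demo) / np.sqrt(S_demo + eps_demo)
cov_white = white_demo.T @ white_demo / white_demo.shape[0]

print(np.round(cov_white, 4))
```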

Step 4: ZCA Whitening

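# xPCAwhite lives in the PCA basis; the reference rotates back with U * xPCAwhite,
# which for row-wise samples is a right-multiplication by U.T
xZCAwhite = xPCAwhite.dot(U.T)
plt.scatter(xZCAwhite[:, 0], xZCAwhite[:, 1], marker='o', facecolors='none', edgecolors='b')
plt.show()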
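ZCA whitening can also be written as a single symmetric matrix, W = U diag(1/sqrt(S + e)) U^T, applied directly to the zero-mean data. The sketch below, on synthetic data with hypothetical names (`W_zca`, `pca_white`, and so on), checks that this matches "PCA-whiten, then rotate back" and that the result has near-identity covariance.

```python
import numpy as np

# Synthetic zero-mean 2-D data (assumption), one sample per row
rng = np.random.default_rng(3)
demo = rng.normal(size=(1000, 2)) @ np.array([[3.0, 0.0], [1.0, 0.3]])
demo -= demo.mean(axis=0)
U_demo, S_demo, _ = np.linalg.svd(demo.T @ demo / demo.shape[0])
eps_demo = 1e-5

scale = 1.0 / np.sqrt(S_demo + eps_demo)
pca_white = (demo @ U_demo) * scale           # PCA whitening
W_zca = U_demo @ np.diag(scale) @ U_demo.T    # symmetric ZCA whitening matrix
zca_white = demo @ W_zca                      # W_zca is symmetric, so no transpose is needed

cov_zca = zca_white.T @ zca_white / zca_white.shape[0]
print(np.round(cov_zca, 4))
```

Because W_zca is symmetric, the same matrix works for both the row-wise and column-wise sample conventions.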

PCA and Whitening on natural images

Step 0: Prepare data

Step 0a: Load data

from scipy.io import loadmat

# Load data
data = loadmat('pca_exercise/IMAGES_RAW.mat')
# imgs.shape = (512, 512, 10): 10 images, each 512x512
imgs = data['IMAGESr']
# Show the first image
plt.imshow(imgs[:, :, 0], cmap='gray')
plt.show()

# Sample random patches
patch_size = 12
num_patches = 10000
patches = np.zeros((num_patches, patch_size*patch_size))
p = 0
num_imgs = imgs.shape[2]
num_samples = num_patches // num_imgs  # patches drawn from each image
for im in range(num_imgs):
    for s in range(num_samples):
        # Top-left corner of the patch (named top/left so the 2D data in x is not overwritten)
        top = np.random.randint(imgs.shape[0] - patch_size + 1)
        left = np.random.randint(imgs.shape[1] - patch_size + 1)
        sample = imgs[top:top+patch_size, left:left+patch_size, im]
        patches[p, :] = np.reshape(sample, (patch_size*patch_size))
        p += 1

def display_patches(samples, num_rows, num_cols, padding_size):
    # Tile the patches into one image with a dark border; relies on the global patch_size
    display_height = num_rows*patch_size + (num_rows+1)*padding_size
    display_width = num_cols*patch_size + (num_cols+1)*padding_size
    display_imgs = np.full((display_height, display_width), -1.0)
    # Work on a copy so the caller's array is not modified
    samples = samples - np.mean(samples)
    for i in range(samples.shape[0]):
        row = i // num_cols  # the original divided by num_rows, which only works for square grids
        col = i % num_cols
        row_start = (row+1)*padding_size + row*patch_size
        row_end = row_start + patch_size
        col_start = (col+1)*padding_size + col*patch_size
        col_end = col_start + patch_size
        # Scale each patch to [-1, 1] by its largest absolute value
        clim = np.max(np.abs(samples[i, :]))
        display_imgs[row_start:row_end, col_start:col_end] = np.reshape(samples[i, :]/clim, (patch_size, patch_size))
    plt.figure(figsize=(10, 10))
    plt.imshow(display_imgs, cmap='gray')
    plt.show()

num_rows = 10
num_cols = 10
padding_size = 1
sample_index = np.random.randint(patches.shape[0], size=num_rows*num_cols)
samples = patches[sample_index, :]
display_patches(samples, num_rows, num_cols, padding_size)

Step 0b: Zero mean the data

# Subtract each patch's own mean intensity (the mean of each row)
patches = patches - np.mean(patches, axis=1, keepdims=True)
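Note that this step removes the mean of each patch (each row), which differs from the 2D example, where the mean of each feature (each column) was removed. A tiny sketch with made-up data (`demo_patches` is hypothetical) shows the two options side by side:

```python
import numpy as np

# Hypothetical stand-in for the patch matrix: 5 patches of 16 pixels, one patch per row
rng = np.random.default_rng(4)
demo_patches = rng.uniform(size=(5, 16))

# Per-patch mean removal: every row ends up with zero mean
per_patch = demo_patches - demo_patches.mean(axis=1, keepdims=True)

# Per-pixel (per-feature) mean removal: every column ends up with zero mean
per_pixel = demo_patches - demo_patches.mean(axis=0)

print(np.round(per_patch.mean(axis=1), 12))
print(np.round(per_pixel.mean(axis=0), 12))
```

`keepdims=True` keeps the mean as a column vector so that broadcasting subtracts it row by row, which avoids the double transpose.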

Step 1: Implement PCA

Step 1a: Implement PCA

cov = np.dot(patches.T, patches) / patches.shape[0]
U, S, V = np.linalg.svd(cov)
# Rows are samples, so the rotation into the PCA basis is patches.dot(U)
pRot = np.dot(patches, U)

Step 1b: Check covariance

cov_rot = np.dot(pRot.T, pRot) / pRot.shape[0]
plt.imshow(cov_rot)
plt.show()

Step 2: Find number of components to retain

# Find the smallest k for which PCA retains at least 99% of the variance
k = 0
for i, frac in enumerate(np.cumsum(S) / np.sum(S), 1):
    if frac >= 0.99:
        k = i
        break
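The same search can be done without an explicit loop. This is a minimal sketch using a made-up eigenvalue spectrum (`S_demo` is hypothetical, not the spectrum of the image patches): `np.argmax` on a boolean array returns the index of the first True entry.

```python
import numpy as np

# Hypothetical eigenvalues in descending order, as returned by np.linalg.svd
S_demo = np.array([6.0, 2.5, 1.0, 0.42, 0.07, 0.01])

# Fraction of variance retained by the first 1, 2, ... components
retained = np.cumsum(S_demo) / np.sum(S_demo)

# Index of the first entry reaching 99%, converted to a component count
k_demo = int(np.argmax(retained >= 0.99)) + 1
print(k_demo)
```

This relies on the last entry of `retained` being 1.0, so at least one entry always satisfies the threshold.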

Step 3: PCA with dimension reduction

pReduce = np.dot(patches, U[:, :k])
# Restore the images
pHat = np.concatenate((pReduce, np.zeros(shape=(patches.shape[0], patches.shape[1] - k))), axis=1)
pHat = pHat.dot(U.T)
samples = pHat[sample_index, :]
display_patches(samples, num_rows, num_cols, padding_size)

# Reduce the dimension further (display_patches opens its own figure, so no plt.figure() is needed)
low_dim_k = 20
pReduce = np.dot(patches, U[:, :low_dim_k])
# Restore the images
pHat = np.concatenate((pReduce, np.zeros(shape=(patches.shape[0], patches.shape[1] - low_dim_k))), axis=1)
pHat = pHat.dot(U.T)
samples = pHat[sample_index, :]
display_patches(samples, num_rows, num_cols, padding_size)


Step 4: PCA with whitening and regularization

Step 4a: Implement PCA with whitening and regularization

Step 4b: Check covariance

# Almost no regularization
epsilon = 1e-9
# S is a 1-D array of eigenvalues; scale each rotated column by 1/sqrt(S + epsilon)
pPCAwhite = pRot / np.sqrt(S + epsilon)
cov_PCAwhite = np.dot(pPCAwhite.T, pPCAwhite) / pPCAwhite.shape[0]
plt.imshow(cov_PCAwhite)
plt.show()

# With regularization
epsilon = 0.1
pPCAwhite = pRot / np.sqrt(S + epsilon)
cov_PCAwhite = np.dot(pPCAwhite.T, pPCAwhite) / pPCAwhite.shape[0]
plt.imshow(cov_PCAwhite)
plt.show()
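The two covariance plots differ because, after whitening with regularizer epsilon, the variance of component i is S[i] / (S[i] + epsilon): components with eigenvalues much larger than epsilon stay near 1, while components with small eigenvalues are damped toward 0. Below is a short numeric sketch with a made-up spectrum; `whitened_variances` and `S_demo` are hypothetical names used only here.

```python
import numpy as np

def whitened_variances(eigvals, epsilon):
    """Diagonal of the covariance after PCA whitening with regularizer epsilon."""
    return eigvals / (eigvals + epsilon)

# Hypothetical eigenvalue spectrum spanning several orders of magnitude
S_demo = np.array([10.0, 1.0, 0.1, 0.001])

for eps_demo in (1e-9, 0.1):
    print(eps_demo, np.round(whitened_variances(S_demo, eps_demo), 4))
```

With the tiny epsilon every variance is essentially 1; with epsilon = 0.1 the trailing components are visibly attenuated, which is the intended low-pass effect of the regularization.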

Step 5: ZCA whitening

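# pPCAwhite lives in the PCA basis; with row-wise samples, rotating back is a right-multiplication by U.T
pZCAwhite = pPCAwhite.dot(U.T)
samples = pZCAwhite[sample_index, :]
display_patches(samples, num_rows, num_cols, padding_size)

# Original patches, for comparison (display_patches opens its own figure)
samples = patches[sample_index, :]
display_patches(samples, num_rows, num_cols, padding_size)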

