Code walkthrough

Initialization

# Run some setup code for this notebook.
# __future__ imports have to come first.
from __future__ import print_function
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

# This is a bit of magic to make matplotlib figures appear inline in the notebook
# rather than in a new window.
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

This setup code imports the required packages and makes figures render inline in the notebook page instead of opening a new window.

Loading the data

# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

This loads the CIFAR-10 data and prints the shape of each array. The images are 3-channel color images, so each one has size 32*32*3.

Showing some training images

# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()

This displays seven randomly chosen training images from each class, one class per column.
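The subplot index plt_idx used in the loop may be easier to see in isolation. matplotlib numbers the cells of a subplot grid row by row, starting from 1, so with classes as columns and samples as rows, sample i of class y lands in cell i * num_classes + y + 1. A small sketch, where the helper name subplot_index is hypothetical and not part of the notebook:

```python
def subplot_index(i, y, num_classes):
    """1-based, row-major index of row i, column y in a grid with num_classes columns."""
    return i * num_classes + y + 1

num_classes = 10
# First sample (row 0) of class 0 goes in the top-left cell.
print(subplot_index(0, 0, num_classes))   # 1
# First sample of the last class ends the first row.
print(subplot_index(0, 9, num_classes))   # 10
# Second sample (row 1) of class 0 starts the second row.
print(subplot_index(1, 0, num_classes))   # 11
```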

Subsampling the data

# Subsample the data for more efficient code execution in this exercise
num_training = 5000
mask = list(range(num_training))
X_train = X_train[mask]
y_train = y_train[mask]

num_test = 500
mask = list(range(num_test))
X_test = X_test[mask]
y_test = y_test[mask]

To make the code run faster in this exercise, we keep only part of the data: the first 5000 training images and the first 500 test images.

# Reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
print(X_train.shape, X_test.shape)

Each image is flattened into a row vector, and all the images together form a 2-D matrix. Each row has 32*32*3 = 3072 entries.

So the output is:

(5000L, 3072L) (500L, 3072L)
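As a small illustration of what the -1 in np.reshape does, the sketch below flattens a made-up batch of two 2*2*3 "images" (toy data, not CIFAR-10). NumPy infers the row length from the remaining dimensions.

```python
import numpy as np

# Toy batch: 2 "images" of size 2*2 with 3 channels (not real CIFAR-10 data).
images = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)

# Keep the first axis (one row per image); -1 lets NumPy infer 2*2*3 = 12.
rows = np.reshape(images, (images.shape[0], -1))

print(rows.shape)  # (2, 12)
# Row 0 holds the values of image 0 in their original (C-order) sequence.
print(rows[0])
```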

  

Loading the classifier

from cs231n.classifiers import KNearestNeighbor

# Create a kNN classifier instance.
# Remember that training a kNN classifier is a noop:
# the Classifier simply remembers the data and does no further processing
classifier = KNearestNeighbor()
classifier.train(X_train, y_train)

This imports the KNearestNeighbor class, creates an instance named classifier, and calls its train() method.
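To make "training is a no-op" concrete, here is a minimal sketch of what such a train() method might look like. The class name TinyKNN is hypothetical and this is not the cs231n source; the attribute names X_train and y_train follow the ones used by the methods later in this post.

```python
import numpy as np

class TinyKNN(object):
    """Hypothetical stand-in for a kNN classifier; not the cs231n implementation."""

    def train(self, X, y):
        # "Training" only memorizes the data; all the work happens at prediction time.
        self.X_train = X
        self.y_train = y

knn = TinyKNN()
X = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1])
knn.train(X, y)
print(knn.X_train.shape, knn.y_train.shape)  # (2, 2) (2,)
```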

Computing the distance matrix with two loops

def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))  # 500*5000
    for i in range(num_test):
        for j in range(num_train):
            #########################################################################
            # TODO:                                                                 #
            # Compute the L2 distance between the ith test point and the jth        #
            # training point, and store the result in dists[i, j]. You should       #
            # not use a loop over dimension.                                        #
            #########################################################################
            # Slice out row j of the training data and row i of the test data.
            dists[i, j] = np.sqrt(np.sum(np.square(self.X_train[j, :] - X[i, :])))
            #########################################################################
            #                           END OF YOUR CODE                            #
            #########################################################################
    return dists

The distance used here is the L2 (Euclidean) distance. Note that i indexes the test set and j indexes the training set.
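As a quick sanity check of the expression inside the loop, the following sketch evaluates it on a hand-made pair of points that form a 3-4-5 right triangle. The arrays are toy values chosen for illustration, not CIFAR-10 data.

```python
import numpy as np

X_train = np.array([[0.0, 0.0], [3.0, 4.0]])  # two toy "training" points
X = np.array([[3.0, 4.0]])                    # one toy "test" point

dists = np.zeros((X.shape[0], X_train.shape[0]))
for i in range(X.shape[0]):            # i runs over test points
    for j in range(X_train.shape[0]):  # j runs over training points
        dists[i, j] = np.sqrt(np.sum(np.square(X_train[j, :] - X[i, :])))

# Distance 5 to the first training point, 0 to the second (identical) point.
print(dists)
```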

Testing the distance matrix

# Open cs231n/classifiers/k_nearest_neighbor.py and implement
# compute_distances_two_loops.

# Test your implementation:
dists = classifier.compute_distances_two_loops(X_test)
print(dists.shape)

This calls the function we just wrote and prints the shape of the distance matrix:

(500L, 5000L)

  

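# We can visualize the distance matrix: each row is a single test example and
# its distances to training examples
plt.imshow(dists, interpolation='none')
plt.show()

This visualizes the distance matrix. Each row holds the distances from one test example to all of the training images.

In the resulting plot, the vertical axis indexes the 500 test images and the horizontal axis indexes the 5000 training images. Darker means a smaller distance and brighter means a larger distance.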

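Predicting the labels of the test images

Next we implement the prediction function predict_labels() in the file cs231n/classifiers/k_nearest_neighbor.py. Using the dists distance matrix computed in the two preceding sections, it picks the k training images closest to each test image and lets them vote on the most likely label. If k = 1, it simply takes the single nearest training image and uses its label as the label of the test image.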

def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.

    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)  # 500*1
    for i in range(num_test):
        # A list of length k storing the labels of the k nearest neighbors to
        # the ith test point.
        closest_y = []
        #########################################################################
        # TODO:                                                                 #
        # Use the distance matrix to find the k nearest neighbors of the ith    #
        # testing point, and use self.y_train to find the labels of these       #
        # neighbors. Store these labels in closest_y.                           #
        # Hint: Look up the function numpy.argsort.                             #
        #########################################################################
        # Training indices sorted by increasing distance to test point i.
        closest_y = np.argsort(dists[i, :])
        #########################################################################
        # TODO:                                                                 #
        # Now that you have found the labels of the k nearest neighbors, you    #
        # need to find the most common label in the list closest_y of labels.   #
        # Store this label in y_pred[i]. Break ties by choosing the smaller     #
        # label.                                                                #
        #########################################################################
        # Count the labels of the k nearest neighbors; argmax returns the
        # smallest label among ties.
        y_pred[i] = np.argmax(np.bincount(self.y_train[closest_y[:k]]))
        #########################################################################
        #                           END OF YOUR CODE                            #
        #########################################################################

    return y_pred

  

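The loop visits every test image. For each one, np.argsort() sorts row i of the dists matrix in ascending order (nearest first), and the resulting training-set indices are stored in closest_y. The labels of the first k indices in closest_y are counted with np.bincount(), and np.argmax() then returns the label with the most votes, which becomes the final prediction y_pred[i].

An example:

Suppose the labels of the five nearest neighbors taken from closest_y are [1,1,1,3,2]. np.bincount() then returns how many times each index value occurs in that array:

array([0, 3, 1, 1])

Because the largest number in [1,1,1,3,2] is 3, the result of bincount() has 4 entries, for the index values 0 to 3. The array says that 0 occurs 0 times, 1 occurs 3 times, 2 occurs once, and 3 occurs once. Taking the index of the maximum with np.argmax() gives 1, which is exactly the result we want.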
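The worked example can be checked directly. The sketch below also covers the tie case: np.argmax returns the first maximum, so a tie between labels resolves to the smaller label, which matches the tie-breaking rule requested in the TODO comment. The label arrays are illustrative values only.

```python
import numpy as np

# Labels of the five nearest neighbors from the example in the text.
votes = np.bincount(np.array([1, 1, 1, 3, 2]))
print(votes)             # [0 3 1 1]
print(np.argmax(votes))  # 1

# Tie between labels 2 and 3 (two votes each): argmax picks the first
# maximum, i.e. the smaller label.
tied = np.bincount(np.array([3, 2, 3, 2]))
print(np.argmax(tied))   # 2
```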

Running the prediction code

# Now implement the function predict_labels and run the code below:
# We use k = 1 (which is Nearest Neighbor).
y_test_pred = classifier.predict_labels(dists, k=1)
# Compute and print the fraction of correctly predicted examples
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))

  

The result:

Got 137 / 500 correct => accuracy: 0.274000

  

Testing k = 5

y_test_pred = classifier.predict_labels(dists, k=5)
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))

The result:

Got 139 / 500 correct => accuracy: 0.278000

Compared with the 27.4% obtained with k = 1, the accuracy improves slightly.

Partially vectorized code

def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        #########################################################################
        # TODO:                                                                 #
        # Compute the L2 distance between the ith test point and all training   #
        # points, and store the result in dists[i, :].                          #
        #########################################################################
        # Arrays of different rank are subtracted: X[i, :] is broadcast
        # against every row of self.X_train.
        dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
        #########################################################################
        #                           END OF YOUR CODE                            #
        #########################################################################
    return dists

The difference from the two-loop version is that the single loop only iterates over the test images; the pass over the training images is done with vectorized code. axis=1 means the sum runs along each row (across the columns), because for a given test image we need one distance per training image.
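A small sketch of what the one-loop expression does for a single test point, using toy arrays rather than image data. Subtracting a length-D vector from an (N, D) matrix broadcasts the vector over every row, and summing with axis=1 collapses each row to a single number.

```python
import numpy as np

X_train = np.array([[0.0, 0.0],
                    [3.0, 4.0],
                    [6.0, 8.0]])   # 3 toy training points, D = 2
x = np.array([0.0, 0.0])           # one toy test point

diff = X_train - x                 # shape (3, 2): x is broadcast over the rows
row = np.sqrt(np.sum(np.square(diff), axis=1))  # shape (3,)

print(diff.shape, row.shape)  # (3, 2) (3,)
print(row)                    # distances 0, 5 and 10
```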

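Fully vectorized code

This part is the core of the exercise: the hardest code to understand and also the most efficient. It must not contain any loops and relies on NumPy's broadcasting mechanism to compute the matrix.

First consider the two matrices involved. One is the matrix X of test images, of size 500*3072: there are 500 test images and each row holds the pixels of one image. The other is the matrix X_train of training images; there are 5000 of them, so its size is 5000*3072. The task is to compute the L2 distance between every row of the test matrix and every row of the training matrix, and to store the results in the dists matrix (500*5000).

Consider the formula for the L2 distance between a test row x and a training row y:

d(x, y) = sqrt( sum_k (x_k - y_k)^2 )

Expanding the expression inside the sum gives x^2 + y^2 - 2*x*y.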
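Written out for whole rows, the expansion shows where the three terms of a loop-free computation come from. The symbols x_i (row i of X), y_j (row j of X_train), and D (the number of pixels per image, 3072 here) are notation introduced for this illustration.

```latex
\begin{aligned}
d(x_i, y_j)^2 &= \sum_{k=1}^{D} (x_{ik} - y_{jk})^2 \\
              &= \sum_{k=1}^{D} x_{ik}^2 + \sum_{k=1}^{D} y_{jk}^2 - 2 \sum_{k=1}^{D} x_{ik}\, y_{jk} \\
              &= \lVert x_i \rVert^2 + \lVert y_j \rVert^2 - 2\, \bigl( X X_{\text{train}}^{\top} \bigr)_{ij}
\end{aligned}
```

The first term depends only on i and the second only on j, so each can be computed once as a vector and broadcast over the 500*5000 matrix, while the third term is a single matrix product.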

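Broadcasting means that when two arrays, compared dimension by dimension, either have the same size or one of them has size 1 in that dimension, the smaller array is automatically expanded to the shape of the larger one. For example:

a = [[1,2,3]
     [1,2,3]
     [1,2,3]]

b = [1 1 1]

a has shape 3*3 and b has shape 1*3. The two agree in the column dimension, and b has size 1 in the row dimension, so a = a + b gives:

a = [[2,3,4]
     [2,3,4]
     [2,3,4]]

This is equivalent to broadcasting b along the rows, turning it into [[1,1,1][1,1,1][1,1,1]].

For more on broadcasting, see here.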
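The same example can be run in NumPy. In the sketch below, b is written as a 1-D array of shape (3,), which NumPy treats like a 1*3 row when combining it with the 3*3 matrix. The second part shows a column of shape (3, 1) being broadcast across the columns instead; the array c is an extra illustration not taken from the text.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]])
b = np.array([1, 1, 1])           # shape (3,), broadcast over the rows of a

print(a + b)
# [[2 3 4]
#  [2 3 4]
#  [2 3 4]]

c = np.array([[10], [20], [30]])  # shape (3, 1), broadcast over the columns
print(a + c)
# [[11 12 13]
#  [21 22 23]
#  [31 32 33]]
```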

def compute_distances_no_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #########################################################################
    # TODO:                                                                 #
    # Compute the L2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    #       and two broadcast sums.                                         #
    #########################################################################
    # Squared norms of the training rows, shape 1*5000: first broadcast.
    dists += np.sum(self.X_train ** 2, axis=1).reshape(1, num_train)
    # Squared norms of the test rows, shape 500*1: second broadcast.
    dists += np.sum(X ** 2, axis=1).reshape(num_test, 1)
    # Cross term from the matrix product, shape 500*5000.
    dists -= 2 * np.dot(X, self.X_train.T)
    dists = np.sqrt(dists)
    #########################################################################
    #                           END OF YOUR CODE                            #
    #########################################################################
    return dists
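One way to gain confidence in the expansion is to compare it against the direct definition on small random data. The sketch below uses standalone functions with hypothetical names (dists_naive, dists_vectorized) rather than the class methods, and random toy arrays instead of CIFAR-10. The np.maximum clamp is an addition that is not in the method above: rounding can leave a tiny negative value where the true squared distance is zero, and np.sqrt would then return nan.

```python
import numpy as np

def dists_naive(X, X_train):
    """Direct definition: one Euclidean distance per (test, train) pair."""
    out = np.zeros((X.shape[0], X_train.shape[0]))
    for i in range(X.shape[0]):
        for j in range(X_train.shape[0]):
            out[i, j] = np.sqrt(np.sum((X[i] - X_train[j]) ** 2))
    return out

def dists_vectorized(X, X_train):
    """Expansion ||x||^2 + ||y||^2 - 2 x.y using two broadcasts and one matmul."""
    test_sq = np.sum(X ** 2, axis=1).reshape(-1, 1)         # (num_test, 1)
    train_sq = np.sum(X_train ** 2, axis=1).reshape(1, -1)  # (1, num_train)
    sq = test_sq + train_sq - 2 * np.dot(X, X_train.T)
    # Assumption/addition: clamp small negative rounding errors before sqrt.
    return np.sqrt(np.maximum(sq, 0.0))

rng = np.random.RandomState(0)
X_train = rng.rand(6, 4)
X = rng.rand(3, 4)
print(np.allclose(dists_naive(X, X_train), dists_vectorized(X, X_train)))  # True
```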

  

Testing the vectorized code

# Now implement the fully vectorized version inside compute_distances_no_loops
# and run the code
dists_two = classifier.compute_distances_no_loops(X_test)

# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')

This compares the vectorized result with the two-loop result through the Frobenius norm of their difference; a difference below 0.001 means the two implementations agree.

Comparing the implementations

# Let's compare how fast the implementations are
def time_function(f, *args):
    """
    Call a function f with args and return the time (in seconds) that it took to execute.
    """
    import time
    tic = time.time()
    f(*args)
    toc = time.time()
    return toc - tic

two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
print('Two loop version took %f seconds' % two_loop_time)

one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)
print('One loop version took %f seconds' % one_loop_time)

no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)
print('No loop version took %f seconds' % no_loop_time)

# you should see significantly faster performance with the fully vectorized implementation

We have now implemented the fully vectorized function (no loops), the partially vectorized function (one loop), and the unvectorized function (two loops). This code times each of them in turn.

Cross-validation

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_train_folds = []
y_train_folds = []
################################################################################
# TODO:                                                                        #
# Split up the training data into folds. After splitting, X_train_folds and    #
# y_train_folds should each be lists of length num_folds, where                #
# y_train_folds[i] is the label vector for the points in X_train_folds[i].     #
# Hint: Look up the numpy array_split function.                                #
################################################################################
# Split X_train and y_train into 5 folds.
avg_size = int(X_train.shape[0] / num_folds)  # any remainder is dropped if the split is uneven
for i in range(num_folds):
    X_train_folds.append(X_train[i * avg_size : (i+1) * avg_size])
    y_train_folds.append(y_train[i * avg_size : (i+1) * avg_size])
################################################################################
#                               END OF YOUR CODE                               #
################################################################################

# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}

################################################################################
# TODO:                                                                        #
# Perform k-fold cross validation to find the best value of k. For each        #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times,   #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all     #
# values of k in the k_to_accuracies dictionary.                               #
################################################################################
for k in k_choices:
    accuracies = []
    print(k)
    for i in range(num_folds):
        # Train on every fold except fold i; validate on fold i.
        X_train_cv = np.vstack(X_train_folds[0:i] + X_train_folds[i+1:])
        y_train_cv = np.hstack(y_train_folds[0:i] + y_train_folds[i+1:])
        X_valid_cv = X_train_folds[i]
        y_valid_cv = y_train_folds[i]

        classifier.train(X_train_cv, y_train_cv)
        dists = classifier.compute_distances_no_loops(X_valid_cv)
        accuracy = float(np.sum(classifier.predict_labels(dists, k) == y_valid_cv)) / y_valid_cv.shape[0]
        accuracies.append(accuracy)
    k_to_accuracies[k] = accuracies
################################################################################
#                               END OF YOUR CODE                               #
################################################################################

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))

This code is not hard to follow. It splits the training data into 5 folds, uses each fold in turn as the validation set while training on the other four, and repeats this for every value of k to find the best one. np.vstack() stacks arrays vertically (row-wise), and np.hstack() stacks them horizontally. The output:

1
3
5
8
10
12
15
20
50
100
k = 1, accuracy = 0.263000
k = 1, accuracy = 0.257000
k = 1, accuracy = 0.264000
k = 1, accuracy = 0.278000
k = 1, accuracy = 0.266000
k = 3, accuracy = 0.239000
k = 3, accuracy = 0.249000
k = 3, accuracy = 0.240000
k = 3, accuracy = 0.266000
k = 3, accuracy = 0.254000
k = 5, accuracy = 0.248000
k = 5, accuracy = 0.266000
k = 5, accuracy = 0.280000
k = 5, accuracy = 0.292000
k = 5, accuracy = 0.280000
k = 8, accuracy = 0.262000
k = 8, accuracy = 0.282000
k = 8, accuracy = 0.273000
k = 8, accuracy = 0.290000
k = 8, accuracy = 0.273000
k = 10, accuracy = 0.265000
k = 10, accuracy = 0.296000
k = 10, accuracy = 0.276000
k = 10, accuracy = 0.284000
k = 10, accuracy = 0.280000
k = 12, accuracy = 0.260000
k = 12, accuracy = 0.295000
k = 12, accuracy = 0.279000
k = 12, accuracy = 0.283000
k = 12, accuracy = 0.280000
k = 15, accuracy = 0.252000
k = 15, accuracy = 0.289000
k = 15, accuracy = 0.278000
k = 15, accuracy = 0.282000
k = 15, accuracy = 0.274000
k = 20, accuracy = 0.270000
k = 20, accuracy = 0.279000
k = 20, accuracy = 0.279000
k = 20, accuracy = 0.282000
k = 20, accuracy = 0.285000
k = 50, accuracy = 0.271000
k = 50, accuracy = 0.288000
k = 50, accuracy = 0.278000
k = 50, accuracy = 0.269000
k = 50, accuracy = 0.266000
k = 100, accuracy = 0.256000
k = 100, accuracy = 0.270000
k = 100, accuracy = 0.263000
k = 100, accuracy = 0.256000
k = 100, accuracy = 0.263000
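The fold handling can be tried on toy data. The sketch below uses np.array_split, the function the TODO hint points to (the code in this post slices manually instead), and then rebuilds a training set from all folds but one with np.vstack and np.hstack. The arrays are made-up values.

```python
import numpy as np

X = np.arange(20).reshape(10, 2)   # 10 toy samples with 2 features each
y = np.arange(10)                  # toy labels 0..9
num_folds = 5

X_folds = np.array_split(X, num_folds)  # list of 5 arrays of shape (2, 2)
y_folds = np.array_split(y, num_folds)  # list of 5 arrays of shape (2,)

i = 1  # hold out fold 1 for validation
# List concatenation drops fold i; vstack joins the 2-D folds row-wise,
# hstack joins the 1-D label folds end to end.
X_train_cv = np.vstack(X_folds[0:i] + X_folds[i+1:])
y_train_cv = np.hstack(y_folds[0:i] + y_folds[i+1:])

print(X_train_cv.shape, y_train_cv.shape)  # (8, 2) (8,)
print(y_train_cv)                          # [0 1 4 5 6 7 8 9]
print(y_folds[i])                          # [2 3]
```

Unlike integer-division slicing, np.array_split keeps leftover samples when the data does not divide evenly, making some folds one element larger.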

Visualizing the results

# plot the raw observations
for k in k_choices:
    accuracies = k_to_accuracies[k]
    plt.scatter([k] * len(accuracies), accuracies)

# plot the trend line with error bars that correspond to standard deviation
accuracies_mean = np.array([np.mean(v) for k,v in sorted(k_to_accuracies.items())])
accuracies_std = np.array([np.std(v) for k,v in sorted(k_to_accuracies.items())])
plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)
plt.title('Cross-validation on k')
plt.xlabel('k')
plt.ylabel('Cross-validation accuracy')
plt.show()

The horizontal axis shows the different choices of k, and the vertical axis shows the cross-validation accuracy.

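Evaluating the chosen k

# Based on the cross-validation results above, choose the best value for k,
# retrain the classifier using all the training data, and test it on the test
# data. You should be able to get above 28% accuracy on the test data.
# Track the highest single-fold accuracy seen so far and the k that produced it.
temp = 0
for k in k_choices:
    accuracies = k_to_accuracies[k]
    if temp < accuracies[np.argmax(accuracies)]:
        temp = accuracies[np.argmax(accuracies)]
        best_k = k
print(best_k)

classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
y_test_pred = classifier.predict(X_test, k=best_k)

# Compute and display the accuracy
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))

We pick the best k, here the one whose best single fold scored highest, and check it on the test set. The result:

10
Got 141 / 500 correct => accuracy: 0.282000

It is worth pointing out that the training set, validation set, and test set are three different sets. The training set is the data used for training; part of it is split off as the validation set, which is used for tuning hyperparameters. Never use the test set for tuning: it should only be used at the very end, as the yardstick for the model's ability.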
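The selection loop in this section keeps the k with the highest single-fold accuracy. A common alternative, shown here only as a sketch and not as the post's method, is to rank k by its mean accuracy across folds, which is less sensitive to one lucky fold. The dictionary below copies the k = 1, k = 10, and k = 100 rows from the cross-validation output earlier in the post.

```python
import numpy as np

# Per-fold accuracies copied from the cross-validation output for three values of k.
k_to_accuracies = {
    1:   [0.263, 0.257, 0.264, 0.278, 0.266],
    10:  [0.265, 0.296, 0.276, 0.284, 0.280],
    100: [0.256, 0.270, 0.263, 0.256, 0.263],
}

# Mean accuracy over the folds for each k.
mean_acc = {k: float(np.mean(v)) for k, v in k_to_accuracies.items()}

# Pick the k with the highest mean; iterating in sorted order makes ties
# resolve to the smaller k.
best_k = max(sorted(mean_acc), key=lambda k: mean_acc[k])
print(best_k)  # 10
```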

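Summary
As can be seen, the kNN model has no advantage for image classification: training is trivial (it just stores the data), but testing costs a lot of time and compute (every test image is compared against every training image). Even in the best case here, the accuracy is below 30%. We use this model to get familiar with the overall image-classification workflow and to practice thinking in vectorized terms.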

Reposted from: https://www.cnblogs.com/baiyunwanglai/p/10902464.html
