
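  1. Visualize a convolutional neural network.
  2. Design and train a CNN to classify MNIST handwritten digits.
  3. Design and train a CNN to classify images from the CIFAR10 dataset.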


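  1. SGD optimizer: GD is gradient descent and SGD is stochastic gradient descent. Compared with GD, (mini-batch) SGD has two advantages: (1) it does not compute the gradient over every training image before each update; it updates the network from one small batch at a time, which makes training much faster. (2) Its noisy, zigzagging path makes it easier to escape poor local optima, so in practice it often reaches better accuracy than full-batch GD.
  2. Sobel operator: a discrete differentiation operator that combines Gaussian smoothing with differentiation. It approximates the horizontal and vertical gradient at each pixel of an image. If the gradient magnitude exceeds a threshold, the pixel is treated as an edge point (a place where pixel values change sharply).

    1. The approximate image gradient is computed as follows, where $I$ is the image and $*$ denotes applying the kernel to the image:

       $$G_x = \text{sobel}_x * I, \quad G_y = \text{sobel}_y * I, \quad G = \sqrt{G_x^2 + G_y^2}$$

    2. The sobel x and sobel y kernels are therefore usually:

       sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
       sobel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

  3. Cross-entropy loss

    1. Binary cross-entropy loss ($y$ is the label, $\hat{y}$ is the predicted probability of the positive class):

       $$L(\hat{y}, y) = -\left[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\right]$$

    2. During training, the cost function sums the loss over $m$ samples and divides by $m$:

       $$J = \frac{1}{m} \sum_{j=1}^{m} L\left(\hat{y}^{(j)}, y^{(j)}\right)$$

    3. Multi-class cross-entropy loss:

       $$L = -\sum_{i=1}^{K} y_i \log p_i$$

      1. $K$ is the number of classes.
      2. $y$ is the label: $y_i = 1$ if the class is $i$, otherwise $y_i = 0$.
      3. $p$ is the network output, i.e. $p_i$ is the predicted probability of class $i$. This output is computed with softmax.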
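
The binary and multi-class cross-entropy losses described above can be sketched in plain Python. This is only an illustration under simple assumptions: the function names and the sample numbers are hypothetical, and the natural logarithm is used.

```python
import math

def binary_cross_entropy(y, y_hat):
    """Loss for one sample: y is 0 or 1, y_hat is the predicted P(positive)."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def mean_binary_cross_entropy(labels, probs):
    """Cost over m samples: sum of the per-sample losses divided by m."""
    losses = [binary_cross_entropy(y, p) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs):
    """Multi-class loss: minus the sum over classes of y_i * log(p_i)."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)

# hypothetical scores for a 3-class problem whose true class is index 0
probs = softmax([2.0, 1.0, 0.1])
print(categorical_cross_entropy([1, 0, 0], probs))
print(mean_binary_cross_entropy([1, 0], [0.9, 0.2]))
```

Because the label is one-hot, the multi-class loss reduces to minus the log of the probability assigned to the true class.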


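1 Visualizing a convolutional neural network

1.1 Custom filters

1.2 Visualizing a convolutional layer

1.3 Visualizing a pooling layer

1.3.1 Import the image

1.3.2 Define and visualize the filters

1.3.3 Define convolutional and pooling layers

1.3.4 Visualize the output of each filter

1.3.5 Visualize the output of the pooling layer

2 Designing and training a CNN to classify MNIST handwritten digits

2.1 Loading and visualizing the data

2.1.1 Visualizing one batch of training images

2.1.2 Inspecting a single image in more detail

2.2 Defining the network architecture

2.3 Specifying the loss function and optimizer

2.4 Training the network

2.5 Testing the trained network

2.6 Visualizing predictions on the test set

3 Designing and training a CNN to classify images from the CIFAR10 dataset

3.1 CUDA test

3.2 Loading the data

3.3 Visualizing a batch of training data

3.4 Viewing an image in more detail

3.5 Defining the network architecture

3.6 Specifying the loss function and optimizer

3.7 Training the network

3.8 Loading the model

3.9 Testing the trained model

3.10 Question: what are your model's weaknesses, and how could it be improved?

3.11 Visualizing predictions on the test set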

1 Visualizing a convolutional neural network

1.1 Custom filters


import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import cv2
import numpy as np

%matplotlib inline

# Read in the image
image = mpimg.imread('data/curved_lane.jpg')
plt.imshow(image)


# Convert to grayscale for filtering
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
plt.imshow(gray, cmap='gray')




Now create a Sobel x operator yourself and apply it to the given image.


# Create a custom kernel
# 3x3 array for edge detection
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])

## TODO: Create and apply a Sobel x operator
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

# Filter the image using filter2D, which has inputs: (grayscale image, bit-depth, kernel)
filtered_image_x = cv2.filter2D(gray, -1, sobel_x)
filtered_image_y = cv2.filter2D(gray, -1, sobel_y)

# Figure size is given in inches: 14 x 14 in is 1400 x 1400 px at 100 dpi
plt.figure(figsize=(14, 14))
# plt.subplot(rows, columns, index): one row, two columns, this is the first plot
plt.subplot(1, 2, 1)
plt.title('sobel x')
plt.imshow(filtered_image_x, cmap='gray')
plt.subplot(1, 2, 2)
plt.title('sobel y')
plt.imshow(filtered_image_y, cmap='gray')
plt.show()
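
To see why sobel x responds to vertical edges and sobel y does not, the filtering step can be reproduced by hand on a tiny image. The sketch below is plain Python rather than OpenCV; the helper `filter2d` and the 4x4 test image are hypothetical, and only the valid (unpadded) region is computed.

```python
def filter2d(image, kernel):
    """Slide a square kernel over the image (valid region only, no padding)."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        row = []
        for c in range(cols - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        out.append(row)
    return out

sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
sobel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

# hypothetical 4x4 image: dark left half, bright right half (one vertical edge)
image = [[0, 0, 10, 10] for _ in range(4)]

gx = filter2d(image, sobel_x)  # strong response: intensity changes along x
gy = filter2d(image, sobel_y)  # zero response: nothing changes along y
print(gx)  # [[40, 40], [40, 40]]
print(gy)  # [[0, 0], [0, 0]]
```

One practical difference to keep in mind: with an 8-bit input and a bit-depth of -1, `cv2.filter2D` saturates negative responses to 0, so edges that go from bright to dark are lost unless a wider output depth is requested.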




  1. Create a filter with decimal-valued weights.
  2. Create a 5x5 filter.
  3. Apply the filters to other images in the images directory.

image = mpimg.imread('data/bridge_trees_example.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])
sobel_y_2 = np.array([[-1.5, -2.5, -1.5],
                      [ 0,    0,    0],
                      [ 1.5,  2.5,  1.5]])
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
sobel_x_5x5 = np.array([[-1, 0, 0, 0, 1],
                        [-1, 0, 0, 0, 1],
                        [-2, 0, 0, 0, 2],
                        [-1, 0, 0, 0, 1],
                        [-1, 0, 0, 0, 1]])

# Filter the image using filter2D, which has inputs: (grayscale image, bit-depth, kernel)
filtered_image_y = cv2.filter2D(gray, -1, sobel_y)
filtered_image_y_2 = cv2.filter2D(gray, -1, sobel_y_2)
filtered_image_x = cv2.filter2D(gray, -1, sobel_x)
filtered_image_x_5x5 = cv2.filter2D(gray, -1, sobel_x_5x5)

# Figure size is given in inches: 14 x 14 in is 1400 x 1400 px at 100 dpi
plt.figure(figsize=(14, 14))
plt.subplot(3, 2, 1)
plt.imshow(gray, cmap='gray')
plt.subplot(3, 2, 3)
plt.title('sobel y')
plt.imshow(filtered_image_y, cmap='gray')
plt.subplot(3, 2, 4)
plt.title('sobel y decimal')
plt.imshow(filtered_image_y_2, cmap='gray')
plt.subplot(3, 2, 5)
plt.title('sobel x')
plt.imshow(filtered_image_x, cmap='gray')
plt.subplot(3, 2, 6)
plt.title('sobel x 5*5')
plt.imshow(filtered_image_x_5x5, cmap='gray')
plt.show()


1.2 Visualizing a convolutional layer




import cv2
import matplotlib.pyplot as plt
%matplotlib inline

# TODO: Feel free to try out your own images here by changing img_path
# to a file path to another image on your computer!
img_path = 'data/udacity_sdc.png'

# load color image
bgr_img = cv2.imread(img_path)
# convert to grayscale
gray_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)

# normalize, rescale entries to lie in [0,1]
gray_img = gray_img.astype("float32")/255

# plot image
plt.imshow(gray_img, cmap='gray')


# visualize all four filters
# (`filters` is assumed to be a NumPy array of four 2D kernels from an earlier cell that is not shown)
fig = plt.figure(figsize=(10, 5))
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
    width, height = filters[i].shape
    for x in range(width):
        for y in range(height):
            ax.annotate(str(filters[i][x][y]), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if filters[i][x][y]<0 else 'black')




import torch
import torch.nn as nn
import torch.nn.functional as F

# define a neural network with a single convolutional layer with four filters
class Net(nn.Module):

    def __init__(self, weight):
        super(Net, self).__init__()
        # initializes the weights of the convolutional layer to be the weights of the 4 defined filters
        k_height, k_width = weight.shape[2:]
        # assumes there are 4 grayscale filters
        self.conv = nn.Conv2d(1, 4, kernel_size=(k_height, k_width), bias=False)
        self.conv.weight = torch.nn.Parameter(weight)

    def forward(self, x):
        # calculates the output of a convolutional layer
        # pre- and post-activation
        conv_x = self.conv(x)
        activated_x = F.relu(conv_x)
        # returns both layers
        return conv_x, activated_x

# instantiate the model and set the weights
weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)
model = Net(weight)

# print out the layer in the network
print(model)


First, we define a helper function, viz_layer, which takes a specific layer and a number of filters (an optional argument) and displays the output of that layer after an image has been passed through it.

# helper function for visualizing the output of a given layer
# default number of filters is 4
def viz_layer(layer, n_filters=4):
    fig = plt.figure(figsize=(20, 20))
    for i in range(n_filters):
        ax = fig.add_subplot(1, n_filters, i+1, xticks=[], yticks=[])
        # grab layer outputs
        ax.imshow(np.squeeze(layer[0,i].data.numpy()), cmap='gray')
        ax.set_title('Output %s' % str(i+1))


# plot original image
plt.imshow(gray_img, cmap='gray')

# visualize all filters
fig = plt.figure(figsize=(12, 6))
fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))

# convert the image into an input Tensor
gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)

# get the convolutional layer (pre and post activation)
conv_layer, activated_layer = model(gray_img_tensor)

# visualize the output of a conv layer
viz_layer(conv_layer)


ReLU activation function


# after a ReLU is applied
# visualize the output of an activated conv layer
viz_layer(activated_layer)
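
As a quick reminder of what the activation does to a feature map, here is a minimal plain-Python sketch. The `relu` helper and the sample values are hypothetical; in the notebook the same operation is performed by `F.relu`.

```python
def relu(feature_map):
    """Replace every negative value with 0 and keep positive values unchanged."""
    return [[max(0.0, v) for v in row] for row in feature_map]

# hypothetical pre-activation conv output
conv_out = [[-1.5, 0.0, 2.0],
            [3.0, -0.2, 0.5]]
activated = relu(conv_out)
print(activated)  # [[0.0, 0.0, 2.0], [3.0, 0.0, 0.5]]
```

In the visualizations this is why the activated output keeps only one polarity of each edge: negative filter responses are clipped to zero and show up as black.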


1.3 Visualizing a pooling layer



1.3.1 Import the image

1.3.2 Define and visualize the filters

1.3.3 Define convolutional and pooling layers



1.3.4 Visualize the output of each filter


# helper function for visualizing the output of a given layer
# default number of filters is 4
def viz_layer(layer, n_filters=4):
    fig = plt.figure(figsize=(20, 20))
    for i in range(n_filters):
        ax = fig.add_subplot(1, n_filters, i+1)
        # grab layer outputs
        ax.imshow(np.squeeze(layer[0,i].data.numpy()), cmap='gray')
        ax.set_title('Output %s' % str(i+1))


# plot original image
plt.imshow(gray_img, cmap='gray')

# visualize all filters
fig = plt.figure(figsize=(12, 6))
fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))

# convert the image into an input Tensor
gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)

# get all the layers
# (`model` here is assumed to be the conv + pooling network from 1.3.3, whose cell is not shown;
#  it returns three outputs, unlike the two-output network in 1.2)
conv_layer, activated_layer, pooled_layer = model(gray_img_tensor)

# visualize the output of the activated conv layer
viz_layer(activated_layer)


1.3.5 Visualize the output of the pooling layer



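2 Designing and training a CNN to classify MNIST handwritten digits

In this notebook we train an MLP (multi-layer perceptron) to classify images from the MNIST handwritten digit database.


  1. Load and visualize the data
  2. Define the neural network
  3. Train the model
  4. Evaluate the trained model on the test dataset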


# import libraries
import torch
import numpy as np

2.1 Loading and visualizing the data



# The MNIST datasets are hosted on yann.lecun.com that has moved under CloudFlare protection
# Run this script to enable the datasets download
# Reference: https://github.com/pytorch/vision/issues/1938
from six.moves import urllib
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
# register the opener globally; without this call the custom header is never used
urllib.request.install_opener(opener)

from torchvision import datasets
import torchvision.transforms as transforms

# number of subprocesses to use for data loading
num_workers = 0
# how many samples per batch to load
batch_size = 20

# convert data to torch.FloatTensor
transform = transforms.ToTensor()

# choose the training and test datasets
train_data = datasets.MNIST(root='data', train=True,
                            download=True, transform=transform)
test_data = datasets.MNIST(root='data', train=False,
                           download=True, transform=transform)

# prepare data loaders
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           num_workers=num_workers)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size,
                                          num_workers=num_workers)

2.1.1 Visualizing one batch of training images


2.1.2 Inspecting a single image in more detail

2.2 Defining the network architecture


import torch.nn as nn
import torch.nn.functional as F

## TODO: Define the NN architecture
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # linear layers: 784 -> 256 -> 64 -> 10
        self.fc1 = nn.Linear(28 * 28, 256)
        self.fc2 = nn.Linear(256, 64)
        self.fc3 = nn.Linear(64, 10)
        # dropout layer (p=0.2)
        self.dropout = nn.Dropout(0.2)

    def forward(self, x):
        # flatten image input
        x = x.view(-1, 28 * 28)
        # add hidden layers, with relu activation function and dropout
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        # nn.CrossEntropyLoss applies log-softmax itself, so this log_softmax
        # is redundant (but harmless: applying it twice gives the same values)
        x = F.log_softmax(self.fc3(x), dim=1)
        return x

# initialize the NN
model = Net()

2.3 Specifying the loss function and optimizer


## TODO: Specify loss and optimization functions
from torch import nn, optim

# specify loss function
criterion = nn.CrossEntropyLoss()

# specify optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

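2.4 Training the network


  • 1. Clear the gradients of all optimized variables
  • 2. Forward pass: compute predicted outputs by passing the inputs to the model
  • 3. Calculate the loss
  • 4. Backward pass: compute the gradient of the loss with respect to the model parameters
  • 5. Perform a single optimization step (parameter update)
  • 6. Update the average training loss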
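
The six steps above can be traced without any framework. The following is a minimal sketch of mini-batch SGD on a one-variable linear model in plain Python; the toy data (y = 2x + 1), the learning rate, and the batch size are all hypothetical choices made for illustration, and the gradients are written out by hand instead of using autograd.

```python
import random

random.seed(0)

# hypothetical toy data: y = 2x + 1, no noise
xs = [i / 10 for i in range(-20, 21)]
data = [(x, 2.0 * x + 1.0) for x in xs]

w, b = 0.0, 0.0                       # model parameters
lr, batch_size, n_epochs = 0.1, 4, 50

for epoch in range(n_epochs):
    random.shuffle(data)              # "stochastic": batches differ every epoch
    train_loss = 0.0
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        n = len(batch)
        # 1. clear the gradients
        grad_w, grad_b = 0.0, 0.0
        # 2. forward pass
        errors = [(w * x + b) - y for x, y in batch]
        # 3. loss (mean squared error over the batch)
        loss = sum(e * e for e in errors) / n
        # 4. backward pass: gradient of the loss w.r.t. w and b
        for e, (x, _) in zip(errors, batch):
            grad_w += 2 * e * x / n
            grad_b += 2 * e / n
        # 5. one optimization step
        w -= lr * grad_w
        b -= lr * grad_b
        # 6. accumulate the loss, weighted by batch size
        train_loss += loss * n
    train_loss /= len(data)           # average loss over the epoch

print(w, b, train_loss)
```

Each update only looks at four samples, which is the speed advantage of SGD over full-batch gradient descent; the parameters still end up close to the true values 2 and 1.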


# number of epochs to train the model
n_epochs = 30  # suggest training between 20-50 epochs

model.train()  # prep model for training

for epoch in range(n_epochs):
    # monitor training loss
    train_loss = 0.0

    ###################
    # train the model #
    ###################
    for data, target in train_loader:
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the loss
        loss = criterion(output, target)
        # backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
        # update running training loss
        train_loss += loss.item()*data.size(0)

    # print training statistics
    # calculate average loss over an epoch
    train_loss = train_loss/len(train_loader.dataset)

    print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch+1, train_loss))


  • Epoch: 1     Training Loss: 0.950629
  • Epoch: 2     Training Loss: 0.378016
  • Epoch: 3     Training Loss: 0.292131
  • Epoch: 4     Training Loss: 0.237494
  • Epoch: 5     Training Loss: 0.203416
  • Epoch: 6     Training Loss: 0.178869
  • Epoch: 7     Training Loss: 0.157555
  • Epoch: 8     Training Loss: 0.143985
  • Epoch: 9     Training Loss: 0.132015
  • Epoch: 10     Training Loss: 0.122434
  • Epoch: 11     Training Loss: 0.113976
  • Epoch: 12     Training Loss: 0.105239
  • Epoch: 13     Training Loss: 0.098839
  • Epoch: 14     Training Loss: 0.093791
  • Epoch: 15     Training Loss: 0.088727
  • Epoch: 16     Training Loss: 0.081909
  • Epoch: 17     Training Loss: 0.079282
  • Epoch: 18     Training Loss: 0.074924
  • Epoch: 19     Training Loss: 0.071149
  • Epoch: 20     Training Loss: 0.068345
  • Epoch: 21     Training Loss: 0.065399
  • Epoch: 22     Training Loss: 0.062431
  • Epoch: 23     Training Loss: 0.060230
  • Epoch: 24     Training Loss: 0.056332
  • Epoch: 25     Training Loss: 0.055859
  • Epoch: 26     Training Loss: 0.053873
  • Epoch: 27     Training Loss: 0.050490
  • Epoch: 28     Training Loss: 0.049184
  • Epoch: 29     Training Loss: 0.046799
  • Epoch: 30     Training Loss: 0.047051

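2.5 Testing the trained network


model.eval() puts every layer of the model into evaluation mode. This affects layers such as dropout, which switch nodes off with some probability during training; in evaluation mode dropout is disabled.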
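
The difference between the two modes can be illustrated with a toy dropout layer. This is a plain-Python sketch of the idea (inverted dropout, where kept values are scaled by 1/(1-p) during training), not the actual `torch.nn.Dropout` implementation; the class and its methods are hypothetical.

```python
import random

class ToyDropout:
    """Illustrative inverted dropout with a train/eval switch."""

    def __init__(self, p=0.2):
        self.p = p              # probability of zeroing an element
        self.training = True

    def train(self):
        self.training = True

    def eval(self):
        self.training = False

    def __call__(self, xs):
        if not self.training:
            # evaluation mode: the input passes through unchanged
            return list(xs)
        keep = 1.0 - self.p
        # training mode: zero each element with probability p and rescale
        # the survivors so the expected value stays the same
        return [x / keep if random.random() < keep else 0.0 for x in xs]

random.seed(0)
drop = ToyDropout(p=0.2)
xs = [1.0] * 10

drop.train()
print(drop(xs))   # a random mix of 0.0 and 1.25

drop.eval()
print(drop(xs))   # always [1.0, 1.0, ..., 1.0]
```

Forgetting to switch to evaluation mode would therefore make test predictions noisy, because some activations would still be dropped at random.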

# initialize lists to monitor test loss and accuracy
test_loss = 0.0
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

model.eval()  # prep model for *evaluation*

for data, target in test_loader:
    # forward pass: compute predicted outputs by passing inputs to the model
    output = model(data)
    # calculate the loss
    loss = criterion(output, target)
    # update test loss
    test_loss += loss.item()*data.size(0)
    # convert output probabilities to predicted class
    _, pred = torch.max(output, 1)
    # compare predictions to true label
    correct = np.squeeze(pred.eq(target.data.view_as(pred)))
    # calculate test accuracy for each object class
    # (iterate over the actual batch length so a smaller final batch cannot fail)
    for i in range(len(target)):
        label = target.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1

# calculate and print avg test loss
test_loss = test_loss/len(test_loader.dataset)
print('Test Loss: {:.6f}\n'.format(test_loss))

for i in range(10):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
            str(i), 100 * class_correct[i] / class_total[i],
            class_correct[i], class_total[i]))
    else:
        # the digit itself is the class name here; no `classes` list is defined for MNIST
        print('Test Accuracy of %5s: N/A (no training examples)' % (str(i)))

print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))

2.6 Visualizing predictions on the test set

This cell displays test images with their labels in the format: predicted (ground-truth). The text is green for correctly classified examples and red for incorrect predictions.

# obtain one batch of test images
dataiter = iter(test_loader)
images, labels = next(dataiter)

# get sample outputs
output = model(images)
# convert output probabilities to predicted class
_, preds = torch.max(output, 1)
# prep images for display
images = images.numpy()

# plot the images in the batch, along with predicted and true labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(20):
    # integer division: add_subplot needs an integer column count
    ax = fig.add_subplot(2, 20//2, idx+1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(images[idx]), cmap='gray')
    ax.set_title("{} ({})".format(str(preds[idx].item()), str(labels[idx].item())),
                 color=("green" if preds[idx]==labels[idx] else "red"))

3 Designing and training a CNN to classify images from the CIFAR10 dataset



3.1 CUDA test


3.2 Loading the data


from torchvision import datasets
import torchvision.transforms as transforms
from torch.utils.data.sampler import SubsetRandomSampler

# number of subprocesses to use for data loading
num_workers = 0
# how many samples per batch to load
batch_size = 20
# percentage of training set to use as validation
valid_size = 0.2

# convert data to a normalized torch.FloatTensor
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])

# choose the training and test datasets
train_data = datasets.CIFAR10('data', train=True,
                              download=True, transform=transform)
test_data = datasets.CIFAR10('data', train=False,
                             download=True, transform=transform)

# obtain training indices that will be used for validation
num_train = len(train_data)
indices = list(range(num_train))
split = int(np.floor(valid_size * num_train))
train_idx, valid_idx = indices[split:], indices[:split]

# define samplers for obtaining training and validation batches
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)

# prepare data loaders (combine dataset and sampler)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           sampler=train_sampler, num_workers=num_workers)
valid_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           sampler=valid_sampler, num_workers=num_workers)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size,
                                          num_workers=num_workers)

# specify the image classes
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']

3.3 Visualizing a batch of training data

3.4 Viewing an image in more detail


# `images` is the training batch from section 3.3 (cell not shown);
# index 6 is the red bird in that batch
rgb_img = np.squeeze(images[6])
channels = ['red channel', 'green channel', 'blue channel']

fig = plt.figure(figsize = (36, 36))
for idx in np.arange(rgb_img.shape[0]):
    ax = fig.add_subplot(1, 3, idx + 1)
    img = rgb_img[idx]
    ax.imshow(img, cmap='gray')
    ax.set_title(channels[idx])
    width, height = img.shape
    thresh = img.max()/2.5
    for x in range(width):
        for y in range(height):
            val = round(img[x][y],2) if img[x][y] !=0 else 0
            ax.annotate(str(val), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center', size=8,
                        color='white' if img[x][y]<thresh else 'black')


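3.5 Defining the network architecture


  • Convolutional layers, which can be thought of as a stack of filters applied to the image.
  • Max-pooling layers, which reduce the x-y size of the input and keep only the most active pixels from the previous layer.
  • The usual linear + dropout layers, which help avoid overfitting and produce a 10-dimensional output.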
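
To make the max-pooling bullet concrete, here is a small plain-Python sketch of 2x2 max pooling with stride 2, the same configuration as `nn.MaxPool2d(2, 2)`. The helper name and the 4x4 feature map are hypothetical.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = [feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1]]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# hypothetical 4x4 activation map
feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]]

pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input becomes 2x2: width and height are halved, and only the strongest activation in each window survives.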









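  • We can compute the spatial size of the output volume as a function of the input volume size (W), the kernel size (F), the stride with which the kernel is applied (S), and the amount of zero padding used on the border (P). The formula for the output size is: (W−F+2P)/S + 1.

For example, a 7x7 input with a 3x3 filter, stride 1 and padding 0 gives a 5x5 output. With stride 2 we get a 3x3 output.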
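
The formula is easy to check in code. The sketch below uses a hypothetical helper, `conv_output_size`, to reproduce the two examples above and then traces a 32x32 CIFAR-10 image through three stages of a 3x3 convolution with padding 1 followed by 2x2 max pooling with stride 2 (pooling follows the same formula with F=2, S=2, P=0).

```python
def conv_output_size(w, f, p=0, s=1):
    """Spatial output size for input size w, kernel f, padding p, stride s."""
    return (w - f + 2 * p) // s + 1

print(conv_output_size(7, 3, p=0, s=1))  # 5
print(conv_output_size(7, 3, p=0, s=2))  # 3

# trace a 32x32 input through three conv + pool stages
size = 32
sizes = []
for _ in range(3):
    size = conv_output_size(size, 3, p=1, s=1)  # 3x3 conv, padding 1: size unchanged
    size = conv_output_size(size, 2, p=0, s=2)  # 2x2 max pool, stride 2: size halved
    sizes.append(size)

print(sizes)  # [16, 8, 4]
```

The final 4x4 size, together with 64 output channels, is where the `64 * 4 * 4` input size of the first linear layer in the network comes from.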

import torch.nn as nn
import torch.nn.functional as F

# define the CNN architecture
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # convolutional layer (3 -> 16 channels)
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        # convolutional layer (16 -> 32 channels)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        # convolutional layer (32 -> 64 channels)
        self.conv3 = nn.Conv2d(32, 64, 3, padding=1)
        # max pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # linear layer (64 * 4 * 4 -> 200)
        self.fc1 = nn.Linear(64 * 4 * 4, 200)
        # linear layer (200 -> 10)
        self.fc2 = nn.Linear(200, 10)
        # dropout layer (p=0.2)
        self.dropout = nn.Dropout(0.2)

    def forward(self, x):
        # add sequence of convolutional and max pooling layers
        x = self.pool(F.relu(self.conv1(x)))  # output shape: 16 x 16 x 16
        x = self.pool(F.relu(self.conv2(x)))  # output shape: 32 x 8 x 8
        x = self.pool(F.relu(self.conv3(x)))  # output shape: 64 x 4 x 4
        # flatten image input
        x = x.view(-1, 64 * 4 * 4)
        # add dropout layer
        x = self.dropout(x)
        # add 1st hidden layer, with relu activation function
        x = F.relu(self.fc1(x))  # output size: 200
        # add dropout layer
        x = self.dropout(x)
        x = self.fc2(x)  # output size: 10
        return x

# create a complete CNN
model = Net()
print(model)

# move tensors to GPU if CUDA is available
# (`train_on_gpu` is assumed to be set in section 3.1, whose cell is not shown)
if train_on_gpu:
    model.cuda()

3.6 Specifying the loss function and optimizer

import torch.optim as optim

# specify loss function
criterion = nn.CrossEntropyLoss()

# specify optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

3.7 Training the network


# number of epochs to train the model
n_epochs = 8  # you may increase this number to train a final model

valid_loss_min = np.inf  # track change in validation loss

for epoch in range(1, n_epochs+1):

    # keep track of training and validation loss
    train_loss = 0.0
    valid_loss = 0.0

    ###################
    # train the model #
    ###################
    model.train()
    for data, target in train_loader:
        # move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
        # update training loss
        train_loss += loss.item()*data.size(0)

    ######################
    # validate the model #
    ######################
    model.eval()
    for data, target in valid_loader:
        # move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # update average validation loss
        valid_loss += loss.item()*data.size(0)

    # calculate average losses over the samples each loader actually draws;
    # both loaders wrap the full training set, so len(loader.dataset) would be
    # the combined train + validation size
    train_loss = train_loss/len(train_loader.sampler)
    valid_loss = valid_loss/len(valid_loader.sampler)

    # print training/validation statistics
    print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
        epoch, train_loss, valid_loss))

    # save model if validation loss has decreased
    if valid_loss <= valid_loss_min:
        print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
            valid_loss_min,
            valid_loss))
        torch.save(model.state_dict(), 'model_cifar.pt')
        valid_loss_min = valid_loss


3.8 Loading the model


3.9 Testing the trained model


# track test loss
test_loss = 0.0
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

model.eval()
# iterate over test data
for data, target in test_loader:
    # move tensors to GPU if CUDA is available
    if train_on_gpu:
        data, target = data.cuda(), target.cuda()
    # forward pass: compute predicted outputs by passing inputs to the model
    output = model(data)
    # calculate the batch loss
    loss = criterion(output, target)
    # update test loss
    test_loss += loss.item()*data.size(0)
    # convert output probabilities to predicted class
    _, pred = torch.max(output, 1)
    # compare predictions to true label
    correct_tensor = pred.eq(target.data.view_as(pred))
    correct = np.squeeze(correct_tensor.numpy()) if not train_on_gpu else np.squeeze(correct_tensor.cpu().numpy())
    # calculate test accuracy for each object class
    # (iterate over the actual batch length so a smaller final batch cannot fail)
    for i in range(len(target)):
        label = target.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1

# average test loss
test_loss = test_loss/len(test_loader.dataset)
print('Test Loss: {:.6f}\n'.format(test_loss))

for i in range(10):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
            classes[i], 100 * class_correct[i] / class_total[i],
            np.sum(class_correct[i]), np.sum(class_total[i])))
    else:
        print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))

print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))


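3.10 Question: what are your model's weaknesses, and how could it be improved?


  1. When training ended the loss was still dropping quickly, so far more epochs are needed.
  2. Test accuracy varies widely between classes. Classes whose appearance is complex and highly variable (such as dogs, automobiles, and birds) are generally predicted worse; these classes have larger intra-class variation than the others. Either the model has not trained long enough to learn these classes, or its architecture is still too simple to represent them.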

3.11 Visualizing predictions on the test set

# obtain one batch of test images
dataiter = iter(test_loader)
images, labels = next(dataiter)

# move model inputs to cuda, if GPU available
if train_on_gpu:
    images = images.cuda()

# get sample outputs
output = model(images)
# convert output probabilities to predicted class
_, preds_tensor = torch.max(output, 1)
preds = np.squeeze(preds_tensor.numpy()) if not train_on_gpu else np.squeeze(preds_tensor.cpu().numpy())

# bring the images back to the CPU for plotting
if train_on_gpu:
    images = images.cpu()

# plot the images in the batch, along with predicted and true labels
# (`imshow` is assumed to be the display helper from section 3.3, whose cell is not shown)
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(20):
    # integer division: add_subplot needs an integer column count
    ax = fig.add_subplot(2, 20//2, idx+1, xticks=[], yticks=[])
    imshow(images[idx])
    ax.set_title("{} ({})".format(classes[preds[idx]], classes[labels[idx]]),
                 color=("green" if preds[idx]==labels[idx].item() else "red"))



