1. Why convolutional neural networks suit image recognition

(1) Locality: to decide the category of a photo, it suffices to detect local features within it.

(2) Shared patterns: the same pattern (the same set of weights) can detect the same feature across different photos; the feature may sit at a different position in each image, but the detection operation itself is unchanged.

(3) Invariance: if a large image is downsampled, its essential content is largely preserved.

2. Drawbacks of fully connected networks on large images:

(1) Flattening the image into a vector discards its spatial structure;

(2) the parameter count explodes, making the network inefficient and hard to train;

(3) that many parameters also lead quickly to overfitting. A rough count is sketched below.
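A back-of-the-envelope count makes the problem concrete. The 224x224 RGB input and the layer widths here are hypothetical, chosen only for illustration:

# Hypothetical sizes, for illustration only.
flattened = 224 * 224 * 3          # 150,528 values once the image is unrolled into a vector
fc_params = flattened * 1000       # a single fully connected layer with 1000 units:
print(fc_params)                   # 150,528,000 weights (~150 million)

conv_params = 3 * 3 * 3 * 64       # a 3x3 convolution producing 64 channels:
print(conv_params)                 # 1,728 shared weights, independent of image size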

3. Convolutional neural networks

A CNN is composed mainly of these layer types: an input layer, convolutional layers, ReLU layers, pooling layers, and fully connected layers.

(1) Convolutional layer: feature extraction

Once the stride is defined, the kernel can visit every position in the image; with a stride of 1, for example, it slides one pixel at a time.

The process above assumes no padding. With padding, points at the image border are surrounded with zeros so the kernel can be applied there as well.

Visiting every position produces a new image, called a feature map.

So the role of the convolutional layer is feature extraction; the sketch below makes the stride and padding behavior concrete.
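A minimal sketch, assuming a single-channel 28x28 input, showing how a stride-1 convolution with and without zero padding changes the feature-map size:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)  # one single-channel 28x28 image

# Without padding, a 3x3 kernel with stride 1 shrinks the map to 26x26.
conv = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=0)
print(conv(x).shape)           # torch.Size([1, 1, 26, 26])

# With zero padding of 1, border pixels get a full neighborhood and the map stays 28x28.
conv_pad = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1)
print(conv_pad(x).shape)       # torch.Size([1, 1, 28, 28])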

(2) Pooling layer: downsampling

It removes redundant information from the feature map, lowers its dimensionality, reduces the number of parameters in the network, and helps control overfitting.
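A minimal sketch of the downsampling, assuming a 16-channel 28x28 feature map: a 2x2 max pool with stride 2 halves each spatial dimension and introduces no learnable parameters:

import torch
import torch.nn as nn

fmap = torch.randn(1, 16, 28, 28)            # assumed feature map: 16 channels, 28x28
pool = nn.MaxPool2d(kernel_size=2, stride=2)
print(pool(fmap).shape)                      # torch.Size([1, 16, 14, 14])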

Reference posts: 卷积神经网络(CNN)详解 (知乎); 卷积和池化简单介绍 (BEIERMODE666的博客, CSDN).

4. Handwritten digit recognition with convolutional networks

(1) CNN

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import torchvision
from torch.utils.data import DataLoader
import cv2
import matplotlib.pyplot as plt

# Load the MNIST training and test sets
train_dataset = datasets.MNIST(root='./num/', train=True,
                               transform=transforms.ToTensor(), download=True)
test_dataset = datasets.MNIST(root='./num/', train=False,
                              transform=transforms.ToTensor(), download=True)

batch_size = 64
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True)

# Preview a single batch: arrange the images in a grid and display them
images, labels = next(iter(train_loader))
img = torchvision.utils.make_grid(images)
img = img.numpy().transpose(1, 2, 0)
std = [0.5, 0.5, 0.5]
mean = [0.5, 0.5, 0.5]
img = img * std + mean
print(labels)
cv2.imshow('win', img)
key_pressed = cv2.waitKey(0)

# Network definition
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(               # 1x28x28 -> 16x26x26
            nn.Conv2d(1, 16, kernel_size=3),
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True))
        self.layer2 = nn.Sequential(               # -> 32x24x24 -> pool -> 32x12x12
            nn.Conv2d(16, 32, kernel_size=3),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer3 = nn.Sequential(               # -> 64x10x10
            nn.Conv2d(32, 64, kernel_size=3),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True))
        self.layer4 = nn.Sequential(               # -> 128x8x8 -> pool -> 128x4x4
            nn.Conv2d(64, 128, kernel_size=3),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.fc = nn.Sequential(
            nn.Linear(128 * 4 * 4, 1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, 10))

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = x.view(x.size(0), -1)   # flatten to (batch, 128*4*4)
        x = self.fc(x)
        return x

# Training
device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = CNN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(params=net.parameters(), lr=1e-3)
num_epochs = 1

if __name__ == '__main__':
    arr_loss = []
    for epoch in range(num_epochs):
        sum_loss = 0.0
        for i, data in enumerate(train_loader):
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()               # zero the gradients
            outputs = net(inputs)               # forward pass
            loss = criterion(outputs, labels)   # compute the loss
            loss.backward()                     # backpropagate
            optimizer.step()                    # update the parameters
            sum_loss += loss.item()
            if i % 100 == 99:                   # log the average loss every 100 batches
                arr_loss.append(sum_loss)
                print('[%d,%d] loss:%.03f' % (epoch + 1, i + 1, sum_loss / 100))
                sum_loss = 0.0

    plt.plot(range(1, len(arr_loss) + 1), arr_loss)
    plt.show()

    net.eval()   # switch to evaluation mode
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            output_test = net(images)
            _, predicted = torch.max(output_test, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum()
    print("correct1: ", correct)
    print("Test acc: {0}".format(correct.item() / len(test_dataset)))

Accuracy: 98.98%

(2) LeNet

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 6, 3, padding=1),   # 1x28x28 -> 6x28x28
            nn.MaxPool2d(2, 2))              # -> 6x14x14
        self.layer2 = nn.Sequential(
            nn.Conv2d(6, 16, 5),             # -> 16x10x10
            nn.MaxPool2d(2, 2))              # -> 16x5x5
        self.layer3 = nn.Sequential(
            nn.Linear(400, 120),             # 400 = 16 * 5 * 5
            nn.Linear(120, 84),
            nn.Linear(84, 10))

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = x.view(x.size(0), -1)            # flatten to (batch, 400)
        x = self.layer3(x)
        return x
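LeNet (and each of the models below) plugs into the training and evaluation pipeline from part (1) unchanged; only the construction of the network differs. A minimal sketch, assuming the device, criterion, and data loaders defined above are still in scope:

net = LeNet().to(device)   # swap in any of the model classes in this section
optimizer = optim.Adam(params=net.parameters(), lr=1e-3)
# ...then run the same training and test loops as in part (1)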

Accuracy: 97.22%

(3) AlexNet

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.layer1 = nn.Sequential(                        # input 1x28x28
            nn.Conv2d(1, 32, kernel_size=3, padding=1),     # 32x28x28
            ######################################################
            # in_channels: depth of the input
            # out_channels: depth of the output
            # kernel_size: kernel size; may also be a tuple such as (3, 2)
            # stride: step size of the sliding window
            # padding: amount of zero padding
            ######################################################
            # nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),          # 32x14x14
            nn.Conv2d(32, 64, kernel_size=3, padding=1),    # 64x14x14
            # nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),          # 64x7x7
            nn.Conv2d(64, 128, kernel_size=3, padding=1),   # 128x7x7
            # nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, kernel_size=3, padding=1),  # 256x7x7
            # nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),  # 256x7x7
            # nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2))          # 256x3x3
        self.layer2 = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 3 * 3, 1024),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(1024, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 10))

    def forward(self, x):
        x = self.layer1(x)
        x = x.view(x.size(0), 256 * 3 * 3)
        x = self.layer2(x)
        return x

Accuracy: 98.25%

(4) VGG

class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()
        self.layer1 = nn.Sequential(                        # input 1x28x28
            nn.Conv2d(1, 32, kernel_size=3, padding=1),     # 32x28x28
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),    # 32x28x28
            nn.ReLU(True),
            nn.MaxPool2d(kernel_size=2, stride=2),          # 32x14x14
            nn.Conv2d(32, 64, kernel_size=3, padding=1),    # 64x14x14
            nn.ReLU(True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),    # 64x14x14
            nn.ReLU(True),
            nn.MaxPool2d(kernel_size=2, stride=2),          # 64x7x7
            nn.Conv2d(64, 128, kernel_size=3, padding=1),   # 128x7x7
            nn.ReLU(True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),  # 128x7x7
            nn.ReLU(True),
            nn.Conv2d(128, 256, kernel_size=3, padding=1),  # 256x7x7
            nn.ReLU(True),
            nn.MaxPool2d(kernel_size=3, stride=2))          # 256x3x3
        self.layer2 = nn.Sequential(
            nn.Linear(256 * 3 * 3, 1024),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(1024, 1024),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(1024, 10))

    def forward(self, x):
        x = self.layer1(x)
        x = x.view(x.size(0), 256 * 3 * 3)
        x = self.layer2(x)
        return x

Accuracy: 98.52%

(5) GoogLeNet

class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        # 1x1 branch -> 16 channels
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        # 1x1 -> 5x5 branch -> 24 channels
        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)
        # 1x1 -> 3x3 -> 3x3 branch -> 24 channels
        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)
        # average-pool branch -> 24 channels (the pooling itself is applied in forward)
        self.branch_pool_2 = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)
        branch5x5 = self.branch5x5_2(self.branch5x5_1(x))
        branch3x3 = self.branch3x3_3(self.branch3x3_2(self.branch3x3_1(x)))
        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool_2(branch_pool)
        # concatenate along the channel dimension: 16 + 24 + 24 + 24 = 88 channels
        return torch.cat([branch1x1, branch5x5, branch3x3, branch_pool], dim=1)


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(88, 20, kernel_size=5)
        self.incep1 = InceptionA(in_channels=10)
        self.incep2 = InceptionA(in_channels=20)
        self.mp = nn.MaxPool2d(2)
        self.fc = nn.Linear(1408, 10)

    def forward(self, x):
        batch_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))  # 1 -> 10 channels, 28x28 -> 12x12
        x = self.incep1(x)                  # 10 -> 88 channels
        x = F.relu(self.mp(self.conv2(x)))  # 88 -> 20 channels, 12x12 -> 4x4
        x = self.incep2(x)                  # 20 -> 88 channels
        x = x.view(batch_size, -1)          # 88 * 4 * 4 = 1408
        x = self.fc(x)
        return x
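As a quick sanity check (run after the definitions above; the intermediate sizes are the ones a 28x28 MNIST input produces), the four branches of InceptionA concatenate to 16 + 24 + 24 + 24 = 88 channels, which is why conv2 expects 88 input channels and the final linear layer expects 88 * 4 * 4 = 1408 features:

block = InceptionA(in_channels=10)
x = torch.randn(1, 10, 12, 12)               # shape after conv1 + pooling in Net
print(block(x).shape)                        # torch.Size([1, 88, 12, 12])

net = Net()
print(net(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 10])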

Accuracy: 97.35%
