Res-Net

  • Paper
    • Paper notes
  • Code
    • Main code block
    • Full code

Paper

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr.2016.90

Paper notes

  1. Recent evidence[40, 43] reveals that network depth is of crucial importance, and the leading results [40, 43, 12, 16] on the challenging ImageNet dataset [35] all exploit “very deep” [40] models, with a depth of sixteen [40] to thirty [16].
A neural network needs considerable depth to obtain better results.
  2. Driven by the significance of depth, a question arises: Is learning better networks as easy as stacking more layers? An obstacle to answering this question was the notorious problem of vanishing/exploding gradients [14, 1, 8], which hamper convergence from the beginning. This problem, however, has been largely addressed by normalized initialization [23, 8, 36, 12] and intermediate normalization layers [16], which enable networks with tens of layers to start converging for stochastic gradient descent (SGD) with backpropagation [22].
    When deeper networks are able to start converging, a degradation problem has been exposed: with the network depth increasing, accuracy gets saturated (which might be unsurprising) and then degrades rapidly. Unexpectedly, such degradation is not caused by overfitting, and adding more layers to a suitably deep model leads to higher training error, as reported in [10, 41] and thoroughly verified by our experiments. Fig. 1 shows a typical example.
The vanishing/exploding gradient problem has largely been solved by normalized initialization and intermediate normalization layers.
    As network depth increases, accuracy saturates and then degrades. This degradation is not caused by overfitting: the deeper network has higher training error, as shown in Fig. 1.
  3. Formally, denoting the desired underlying mapping as H(x), we let the stacked nonlinear layers fit another mapping of F(x) := H(x)−x. The original mapping is recast into F(x)+x. We hypothesize that it is easier to optimize the residual mapping than to optimize the original, unreferenced mapping. To the extreme, if an identity mapping were optimal, it would be easier to push the residual to zero than to fit an identity mapping by a stack of nonlinear layers.
    The formulation of F(x)+x can be realized by feedforward neural networks with “shortcut connections” (Fig. 2). Shortcut connections [2, 33, 48] are those skipping one or more layers. In our case, the shortcut connections simply perform identity mapping, and their outputs are added to the outputs of the stacked layers (Fig. 2). Identity shortcut connections add neither extra parameter nor computational complexity. The entire network can still be trained end-to-end by SGD with backpropagation, and can be easily implemented using common libraries (e.g., Caffe [19]) without modifying the solvers.
Recast a block's mapping as F(x)+x, where F(x) is a stack of ordinary layers and x is an identity mapping, as in Fig. 2.
    In other words, the input x of the earlier layers is added to the output of the later layers; since x is an identity mapping, this adds neither parameters nor computational complexity.
    Concretely: suppose a block's input is x and the desired output is H(x). If we pass the input x directly to the output as an initial estimate, the block behaves like a shallower network, which is easier to train; the part this shallow path has not learned is then fitted by the stacked layers F(x). The mapping we actually want to fit is therefore F(x) = H(x) − x, which is exactly the residual structure.
  4. Based on the above plain network, we insert shortcut connections (Fig. 3, right) which turn the network into its counterpart residual version. The identity shortcuts (Eqn.(1)) can be directly used when the input and output are of the same dimensions (solid line shortcuts in Fig. 3). When the dimensions increase (dotted line shortcuts in Fig. 3), we consider two options: (A) The shortcut still performs identity mapping, with extra zero entries padded for increasing dimensions. This option introduces no extra parameter; (B) The projection shortcut in Eqn.(2) is used to match dimensions (done by 1×1 convolutions). For both options, when the shortcuts go across feature maps of two sizes, they are performed with a stride of 2.
There are two kinds of shortcut connections, shown as the solid and dotted lines in Fig. 3. Solid lines: the input and output dimensions match, so the identity shortcut is used directly. Dotted lines: the dimensions differ, and there are two options: (A) pad the shortcut with zeros, introducing no extra parameters, or (B) project it to the right dimensions with a 1×1 convolution; a sketch of both options follows this list.
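Below is a minimal sketch of the two options, assuming the residual branch doubles the channels while halving the spatial size; shortcut_A and the proj layer are hypothetical names for illustration, not from the paper's code.

import torch
import torch.nn as nn
import torch.nn.functional as F

def shortcut_A(x, out_channel):
    # Option A: identity with zero padding, introducing no extra parameters.
    # Subsample spatially with stride 2, then pad the new channels with zeros.
    x = x[:, :, ::2, ::2]
    pad = out_channel - x.shape[1]
    return F.pad(x, (0, 0, 0, 0, 0, pad))  # pad zeros along the channel dimension

# Option B: a projection shortcut, i.e. a 1x1 convolution with stride 2.
proj = nn.Conv2d(64, 128, 1, stride=2)

x = torch.randn(1, 64, 32, 32)
print(shortcut_A(x, 128).shape)  # torch.Size([1, 128, 16, 16])
print(proj(x).shape)             # torch.Size([1, 128, 16, 16])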

Code

Main code block

We can define a residual block:

import torch.nn as nn
import torch.nn.functional as F

# define a 3x3 convolution layer
def conv3x3(in_channel, out_channel, stride=1):
    return nn.Conv2d(in_channel, out_channel, 3, stride=stride, padding=1, bias=False)

# residual block
class residual_block(nn.Module):
    def __init__(self, in_channel, out_channel, same_shape=True):
        super(residual_block, self).__init__()
        self.same_shape = same_shape
        stride = 1 if self.same_shape else 2  # stride=1 keeps the spatial size, stride=2 halves it
        self.conv1 = conv3x3(in_channel, out_channel, stride=stride)  # first conv layer
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.conv2 = conv3x3(out_channel, out_channel)  # second conv layer
        self.bn2 = nn.BatchNorm2d(out_channel)
        if not self.same_shape:
            # if the output shape differs from the input, a 1x1 convolution
            # (projection shortcut) adjusts x to match
            self.conv3 = nn.Conv2d(in_channel, out_channel, 1, stride=stride)

    def forward(self, x):
        out = self.conv1(x)
        out = F.relu(self.bn1(out), True)
        out = self.conv2(out)
        out = F.relu(self.bn2(out), True)
        if not self.same_shape:
            x = self.conv3(x)  # project x so its shape matches out
        return F.relu(x + out, True)  # add the shortcut x to the residual branch out
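As a quick sanity check (a hypothetical usage example, not from the original post), we can pass a random tensor through both variants of the block and compare shapes:

import torch

block1 = residual_block(32, 32)                    # same_shape=True: shape is preserved
block2 = residual_block(32, 64, same_shape=False)  # channels double, spatial size halves
x = torch.randn(1, 32, 96, 96)
print(block1(x).shape)  # torch.Size([1, 32, 96, 96])
print(block2(x).shape)  # torch.Size([1, 64, 48, 48])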

ResNet is essentially a stack of many such residual blocks.

Full code

Below we implement a simple residual network with relatively few layers.

class resnet(nn.Module):
    def __init__(self, in_channel, num_classes, verbose=False):
        super(resnet, self).__init__()
        self.verbose = verbose  # flag controlling whether to print each block's output shape
        self.block1 = nn.Conv2d(in_channel, 64, 7, 2)
        self.block2 = nn.Sequential(
            nn.MaxPool2d(3, 2),
            residual_block(64, 64),
            residual_block(64, 64)
        )
        self.block3 = nn.Sequential(
            residual_block(64, 128, False),  # this residual block's input and output shapes differ
            residual_block(128, 128)
        )
        self.block4 = nn.Sequential(
            residual_block(128, 256, False),
            residual_block(256, 256)
        )
        self.block5 = nn.Sequential(
            residual_block(256, 512, False),
            residual_block(512, 512),
            nn.AvgPool2d(3)
        )
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, x):
        x = self.block1(x)
        if self.verbose:  # if True, print the output shape of this block
            print('block 1 output: {}'.format(x.shape))
        x = self.block2(x)
        if self.verbose:
            print('block 2 output: {}'.format(x.shape))
        x = self.block3(x)
        if self.verbose:
            print('block 3 output: {}'.format(x.shape))
        x = self.block4(x)
        if self.verbose:
            print('block 4 output: {}'.format(x.shape))
        x = self.block5(x)
        if self.verbose:
            print('block 5 output: {}'.format(x.shape))
        x = x.view(x.shape[0], -1)
        x = self.classifier(x)
        return x
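As a hypothetical smoke test, we can run a random 3×96×96 image through the network with verbose=True and check each block's output shape (the expected shapes in the comments assume that input size):

import torch

net = resnet(3, 10, verbose=True)
x = torch.randn(1, 3, 96, 96)
y = net(x)
# block 1 output: torch.Size([1, 64, 45, 45])
# block 2 output: torch.Size([1, 64, 22, 22])
# block 3 output: torch.Size([1, 128, 11, 11])
# block 4 output: torch.Size([1, 256, 6, 6])
# block 5 output: torch.Size([1, 512, 1, 1])
print(y.shape)  # torch.Size([1, 10])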
