Res-Net

  • Paper
    • Paper notes
  • Code
    • Main code block
    • Full code

Paper

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr.2016.90

Paper notes

  1. Recent evidence[40, 43] reveals that network depth is of crucial importance, and the leading results [40, 43, 12, 16] on the challenging ImageNet dataset [35] all exploit “very deep” [40] models, with a depth of sixteen [40] to thirty [16].
A neural network needs considerable depth to obtain better results.
  2. Driven by the significance of depth, a question arises: Is learning better networks as easy as stacking more layers? An obstacle to answering this question was the notorious problem of vanishing/exploding gradients [14, 1, 8], which hamper convergence from the beginning. This problem, however, has been largely addressed by normalized initialization [23, 8, 36, 12] and intermediate normalization layers [16], which enable networks with tens of layers to start converging for stochastic gradient descent (SGD) with backpropagation [22].
    When deeper networks are able to start converging, a degradation problem has been exposed: with the network depth increasing, accuracy gets saturated (which might be unsurprising) and then degrades rapidly. Unexpectedly, such degradation is not caused by overfitting, and adding more layers to a suitably deep model leads to higher training error, as reported in [10, 41] and thoroughly verified by our experiments. Fig. 1 shows a typical example.
The vanishing/exploding gradient problem has largely been solved by normalized initialization and intermediate normalization layers.
    As network depth increases, accuracy saturates and then degrades. This degradation is not caused by overfitting: the deeper network has higher training error, as shown in Fig. 1.
  3. Formally, denoting the desired underlying mapping as H(x), we let the stacked nonlinear layers fit another mapping of F(x) := H(x)−x. The original mapping is recast into F(x)+x. We hypothesize that it is easier to optimize the residual mapping than to optimize the original, unreferenced mapping. To the extreme, if an identity mapping were optimal, it would be easier to push the residual to zero than to fit an identity mapping by a stack of nonlinear layers.
    The formulation of F(x)+x can be realized by feedforward neural networks with “shortcut connections” (Fig. 2). Shortcut connections [2, 33, 48] are those skipping one or more layers. In our case, the shortcut connections simply perform identity mapping, and their outputs are added to the outputs of the stacked layers (Fig. 2). Identity shortcut connections add neither extra parameter nor computational complexity. The entire network can still be trained end-to-end by SGD with backpropagation, and can be easily implemented using common libraries (e.g., Caffe [19]) without modifying the solvers.
Recast a block's mapping as F(x)+x, where F(x) is a stack of ordinary layers and x is an identity mapping, as in Fig. 2.
    In other words, the input x of the earlier layers is added to the output of the later layers; since x is an identity mapping, this adds neither parameters nor computational complexity.
    Concretely: suppose a block's input is x and the desired output is H(x). If we pass the input x directly to the output as an initial estimate, the block behaves like a shallower network, which is easier to train; the part this shallow path has not learned is then fitted by the stacked layers F(x). The mapping we actually want to fit is therefore F(x) = H(x) − x, which is exactly the residual structure.
  4. Based on the above plain network, we insert shortcut connections (Fig. 3, right) which turn the network into its counterpart residual version. The identity shortcuts (Eqn.(1)) can be directly used when the input and output are of the same dimensions (solid line shortcuts in Fig. 3). When the dimensions increase (dotted line shortcuts in Fig. 3), we consider two options: (A) The shortcut still performs identity mapping, with extra zero entries padded for increasing dimensions. This option introduces no extra parameter; (B) The projection shortcut in Eqn.(2) is used to match dimensions (done by 1×1 convolutions). For both options, when the shortcuts go across feature maps of two sizes, they are performed with a stride of 2.
There are two kinds of shortcut connections, shown as the solid and dotted lines in Fig. 3. Solid lines: the input and output dimensions match, so the identity shortcut is used directly. Dotted lines: the dimensions differ, and there are two options: (A) pad the shortcut with zeros, introducing no extra parameters, or (B) project it to the right dimensions with a 1×1 convolution; a sketch of both options follows this list.
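Below is a minimal sketch of the two options, assuming the residual branch doubles the channels while halving the spatial size; shortcut_A and the proj layer are hypothetical names for illustration, not from the paper's code.

import torch
import torch.nn as nn
import torch.nn.functional as F

def shortcut_A(x, out_channel):
    # Option A: identity with zero padding, introducing no extra parameters.
    # Subsample spatially with stride 2, then pad the new channels with zeros.
    x = x[:, :, ::2, ::2]
    pad = out_channel - x.shape[1]
    return F.pad(x, (0, 0, 0, 0, 0, pad))  # pad zeros along the channel dimension

# Option B: a projection shortcut, i.e. a 1x1 convolution with stride 2.
proj = nn.Conv2d(64, 128, 1, stride=2)

x = torch.randn(1, 64, 32, 32)
print(shortcut_A(x, 128).shape)  # torch.Size([1, 128, 16, 16])
print(proj(x).shape)             # torch.Size([1, 128, 16, 16])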

Code

Main code block

We can define a residual block:

import torch.nn as nn
import torch.nn.functional as F

# define a 3x3 convolution layer
def conv3x3(in_channel, out_channel, stride=1):
    return nn.Conv2d(in_channel, out_channel, 3, stride=stride, padding=1, bias=False)

# residual block
class residual_block(nn.Module):
    def __init__(self, in_channel, out_channel, same_shape=True):
        super(residual_block, self).__init__()
        self.same_shape = same_shape
        stride = 1 if self.same_shape else 2  # stride=1 keeps the spatial size, stride=2 halves it
        self.conv1 = conv3x3(in_channel, out_channel, stride=stride)  # first conv layer
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.conv2 = conv3x3(out_channel, out_channel)  # second conv layer
        self.bn2 = nn.BatchNorm2d(out_channel)
        if not self.same_shape:
            # if the output shape differs from the input, a 1x1 convolution
            # (projection shortcut) adjusts x to match
            self.conv3 = nn.Conv2d(in_channel, out_channel, 1, stride=stride)

    def forward(self, x):
        out = self.conv1(x)
        out = F.relu(self.bn1(out), True)
        out = self.conv2(out)
        out = F.relu(self.bn2(out), True)
        if not self.same_shape:
            x = self.conv3(x)  # project x so its shape matches out
        return F.relu(x + out, True)  # add the shortcut x to the residual branch out
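As a quick sanity check (a hypothetical usage example, not from the original post), we can pass a random tensor through both variants of the block and compare shapes:

import torch

block1 = residual_block(32, 32)                    # same_shape=True: shape is preserved
block2 = residual_block(32, 64, same_shape=False)  # channels double, spatial size halves
x = torch.randn(1, 32, 96, 96)
print(block1(x).shape)  # torch.Size([1, 32, 96, 96])
print(block2(x).shape)  # torch.Size([1, 64, 48, 48])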

ResNet is essentially a stack of many such residual blocks.

Full code

Below we implement a simple residual network with relatively few layers.

class resnet(nn.Module):
    def __init__(self, in_channel, num_classes, verbose=False):
        super(resnet, self).__init__()
        self.verbose = verbose  # flag controlling whether to print each block's output shape
        self.block1 = nn.Conv2d(in_channel, 64, 7, 2)
        self.block2 = nn.Sequential(
            nn.MaxPool2d(3, 2),
            residual_block(64, 64),
            residual_block(64, 64)
        )
        self.block3 = nn.Sequential(
            residual_block(64, 128, False),  # this residual block's input and output shapes differ
            residual_block(128, 128)
        )
        self.block4 = nn.Sequential(
            residual_block(128, 256, False),
            residual_block(256, 256)
        )
        self.block5 = nn.Sequential(
            residual_block(256, 512, False),
            residual_block(512, 512),
            nn.AvgPool2d(3)
        )
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, x):
        x = self.block1(x)
        if self.verbose:  # if True, print the output shape of this block
            print('block 1 output: {}'.format(x.shape))
        x = self.block2(x)
        if self.verbose:
            print('block 2 output: {}'.format(x.shape))
        x = self.block3(x)
        if self.verbose:
            print('block 3 output: {}'.format(x.shape))
        x = self.block4(x)
        if self.verbose:
            print('block 4 output: {}'.format(x.shape))
        x = self.block5(x)
        if self.verbose:
            print('block 5 output: {}'.format(x.shape))
        x = x.view(x.shape[0], -1)
        x = self.classifier(x)
        return x
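As a hypothetical smoke test, we can run a random 3×96×96 image through the network with verbose=True and check each block's output shape (the expected shapes in the comments assume that input size):

import torch

net = resnet(3, 10, verbose=True)
x = torch.randn(1, 3, 96, 96)
y = net(x)
# block 1 output: torch.Size([1, 64, 45, 45])
# block 2 output: torch.Size([1, 64, 22, 22])
# block 3 output: torch.Size([1, 128, 11, 11])
# block 4 output: torch.Size([1, 256, 6, 6])
# block 5 output: torch.Size([1, 512, 1, 1])
print(y.shape)  # torch.Size([1, 10])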
