ResNet won the 2015 ImageNet (ILSVRC) classification competition, reducing the top-5 error rate to 3.57%, a result that is even below the reported human error rate of roughly 5% on this task.

Studying classic models such as VGGNet, we can see that as deep learning has developed, models have grown deeper and network structures more complex. Does simply deepening the network always yield better results?

In theory, suppose the newly added layers are all identity mappings; as long as the original layers learn the same parameters as the original model, the deeper model can match the original model's performance. In other words, the original model's solution is a subspace of the new model's solution space, so within the new model's solution space there should be results at least as good as, and possibly better than, those in the original subspace. In practice, however, the training error often rises rather than falls after layers are added. This degradation is commonly attributed to vanishing or exploding gradients, which make very deep plain networks hard to optimize.

Kaiming He et al. proposed the residual network (ResNet) to address this degradation problem; the basic idea is shown in Figure 6.

  • Figure 6(a): when the network is extended, the input x is mapped directly to the output y = F(x).
  • Figure 6(b): improves on Figure 6(a) by outputting y = F(x) + x. Instead of learning the output representation y directly, the layers learn y - x.
    • To reproduce the original model, it suffices to set all parameters of F(x) to 0, which makes the block an identity mapping.
    • F(x) = y - x is called the residual term. When the mapping x → y is close to an identity mapping, learning the residual term as in Figure 6(b) is easier than learning the complete mapping as in Figure 6(a).


Figure 6 Design idea of the residual block
The structure in Figure 6(b) is the foundation of residual networks and is called a residual block. Through its skip connection, the input x can propagate data forward, and gradients backward, more directly. A concrete design of the residual block is shown in Figure 7; this design is also called a bottleneck structure (BottleNeck).

Figure 7 Structure of the residual block
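
To make Figure 7 concrete, here is a minimal sketch of a bottleneck block in Keras. This is our own illustrative code, not the paper's; the function name bottleneck_block and the channel choices are assumptions. The 1x1 convolutions first reduce and then restore the channel count, so the 3x3 convolution operates on fewer channels:

from keras.layers import Conv2D, BatchNormalization, Activation, add

def bottleneck_block(x, filters):
    """Sketch of a bottleneck residual block: 1x1 reduce -> 3x3 -> 1x1 expand.
    Assumes the input x already has 4 * filters channels so the skip
    connection can be added directly; if F(x) learned all-zero weights,
    the block would reduce to an identity mapping."""
    shortcut = x
    y = Conv2D(filters, (1, 1))(x)                  # reduce channels
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(filters, (3, 3), padding='same')(y)  # 3x3 conv on the reduced channels
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(4 * filters, (1, 1))(y)              # restore channels
    y = BatchNormalization()(y)
    y = add([y, shortcut])                          # y = F(x) + x
    return Activation('relu')(y)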

Deep Residual Learning

Identity mapping by shortcuts


A building block is defined as y = F(x, {Wi}) + x, where F(x, {Wi}) is the residual mapping to be learned. The dimensions of x and F must be equal; if they are not, the shortcut connection can either perform a linear projection Ws to match the dimensions, giving y = F(x, {Wi}) + Ws·x, or pad the input with extra zeros to increase the dimension. In both cases, when the shortcut crosses feature maps of two sizes it is performed with a stride of 2.

The figure below compares the structures of VGG-19, Plain-34 (a 34-layer network without residual connections), and ResNet-34:

Some notes on the figure above:

  1. Compared with VGG-19, ResNet uses no large fully connected stack; it uses a global average pooling layer instead, which removes a large number of parameters, since most of VGG-19's parameters are concentrated in its fully connected layers;

  2. In ResNet-34, a solid skip connection means the identity mapping and the residual mapping have the same number of channels, while a dashed connection means the channel counts differ, so a 1x1 convolution is needed to adjust the channel dimension before the two branches can be added, as sketched below.
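
As a rough sketch of the "dashed" shortcut (our own minimal example, assuming the stage halves the spatial size while changing the channel count; projection_shortcut is a name we made up), a 1x1 convolution with stride 2 projects the input so the two branches can be summed:

from keras.layers import Conv2D, add

def projection_shortcut(x, residual, out_channels):
    """Match the identity branch to the residual branch: a 1x1 convolution
    with stride 2 halves height/width and adjusts the channel count,
    after which the two tensors can be added elementwise."""
    proj = Conv2D(out_channels, (1, 1), strides=(2, 2))(x)
    return add([proj, residual])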

The paper proposes five ResNet variants in total (ResNet-18/34/50/101/152); their parameters are summarized in the table below:

Implementation

The basic settings follow earlier classic networks (see the references in the original paper). Batch normalization (BN) is applied after each convolution and before the activation. The weights are initialized and all plain/residual networks are trained from scratch with SGD using a mini-batch size of 256. The learning rate starts at 0.1 and is divided by 10 when the error plateaus, and the models are trained for up to 600,000 iterations. A weight decay of 0.0001 and a momentum of 0.9 are used.
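
A minimal sketch of these settings in Keras (not the authors' code; ReduceLROnPlateau stands in for the paper's manual "divide the learning rate by 10 when the error plateaus" rule, and weight decay is assumed to be applied per layer via kernel_regularizer):

from keras import optimizers
from keras.callbacks import ReduceLROnPlateau

# SGD with momentum 0.9 and an initial learning rate of 0.1
sgd = optimizers.SGD(lr=0.1, momentum=0.9)
# divide the learning rate by 10 whenever the monitored error plateaus
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10)
# model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=256, callbacks=[reduce_lr])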
The figure below shows the structure of ResNet-50, which contains 49 convolutional layers and 1 fully connected layer, hence the name ResNet-50 (the initial 7x7 convolution plus 3+4+6+3 = 16 bottleneck blocks of 3 convolutions each gives 1 + 48 = 49 convolutional layers).


As the figure above shows, the training and validation errors of ResNet are both lower than those of the corresponding plain network, and within the ResNet family the error keeps decreasing as the network gets deeper.

Code implementation

The following code implements ResNet-18 in Keras.

from keras.layers import Input
from keras.layers import Conv2D, MaxPool2D, Dense, BatchNormalization, Activation, add, GlobalAvgPool2D
from keras.models import Model
from keras import regularizers
from keras.utils import plot_model
from keras import backend as K


def conv2d_bn(x, nb_filter, kernel_size, strides=(1, 1), padding='same'):
    """conv2d -> batch normalization -> relu activation"""
    x = Conv2D(nb_filter, kernel_size=kernel_size,
               strides=strides, padding=padding,
               kernel_regularizer=regularizers.l2(0.0001))(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return x


def shortcut(input, residual):
    """Shortcut connection, i.e. the identity-mapping branch."""
    input_shape = K.int_shape(input)
    residual_shape = K.int_shape(residual)
    stride_height = int(round(input_shape[1] / residual_shape[1]))
    stride_width = int(round(input_shape[2] / residual_shape[2]))
    equal_channels = input_shape[3] == residual_shape[3]
    identity = input
    # If the two branches differ in spatial size or channel count,
    # use a 1x1 convolution to project the input to a matching shape
    if stride_width > 1 or stride_height > 1 or not equal_channels:
        identity = Conv2D(filters=residual_shape[3],
                          kernel_size=(1, 1),
                          strides=(stride_width, stride_height),
                          padding="valid",
                          kernel_regularizer=regularizers.l2(0.0001))(input)
    return add([identity, residual])


def basic_block(nb_filter, strides=(1, 1)):
    """Basic ResNet building block, used in ResNet-18 and ResNet-34."""
    def f(input):
        conv1 = conv2d_bn(input, nb_filter, kernel_size=(3, 3), strides=strides)
        residual = conv2d_bn(conv1, nb_filter, kernel_size=(3, 3))
        return shortcut(input, residual)
    return f


def residual_block(nb_filter, repetitions, is_first_layer=False):
    """Build one stage of residual blocks, corresponding to
    conv2_x -> conv5_x in the paper's parameter table."""
    def f(input):
        for i in range(repetitions):
            strides = (1, 1)
            # The first block of every stage except the first downsamples by 2
            if i == 0 and not is_first_layer:
                strides = (2, 2)
            input = basic_block(nb_filter, strides)(input)
        return input
    return f


def resnet_18(input_shape=(224, 224, 3), nclass=1000):
    """Build a ResNet-18 model using Keras with the TensorFlow backend.

    :param input_shape: input shape of the network, default (224, 224, 3)
    :param nclass: number of classes (output dimension), default 1000
    :return: ResNet-18 model
    """
    input_ = Input(shape=input_shape)
    conv1 = conv2d_bn(input_, 64, kernel_size=(7, 7), strides=(2, 2))
    pool1 = MaxPool2D(pool_size=(3, 3), strides=(2, 2), padding='same')(conv1)
    # conv2_x keeps the resolution (pool1 already downsampled);
    # conv3_x -> conv5_x each downsample by 2 in their first block
    conv2 = residual_block(64, 2, is_first_layer=True)(pool1)
    conv3 = residual_block(128, 2)(conv2)
    conv4 = residual_block(256, 2)(conv3)
    conv5 = residual_block(512, 2)(conv4)
    pool2 = GlobalAvgPool2D()(conv5)
    output_ = Dense(nclass, activation='softmax')(pool2)
    model = Model(inputs=input_, outputs=output_)
    model.summary()
    return model


if __name__ == '__main__':
    model = resnet_18()
    plot_model(model, 'ResNet-18.png')  # save a diagram of the model
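
Note that plot_model needs the pydot and graphviz packages installed in order to write the architecture diagram; calling resnet_18() alone will still print the layer summary via model.summary().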

Using ResNet to classify CIFAR-100

import keras
import argparse
import numpy as np
from keras.datasets import cifar10, cifar100
from keras.preprocessing.image import ImageDataGenerator
from keras.layers.normalization import BatchNormalization
from keras.layers import Conv2D, Dense, Input, add, Activation, GlobalAveragePooling2D
from keras.callbacks import LearningRateScheduler, TensorBoard, ModelCheckpoint
from keras.models import Model
from keras import optimizers, regularizers
from keras import backend as K

# set GPU memory
if 'tensorflow' == K.backend():
    import tensorflow as tf
    from keras.backend.tensorflow_backend import set_session
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)
    set_session(sess)  # register the growth-enabled session with Keras

# set parameters via parser
parser = argparse.ArgumentParser()
parser.add_argument('-b', '--batch_size', type=int, default=128, metavar='NUMBER',
                    help='batch size(default: 128)')
parser.add_argument('-e', '--epochs', type=int, default=200, metavar='NUMBER',
                    help='epochs(default: 200)')
parser.add_argument('-n', '--stack_n', type=int, default=5, metavar='NUMBER',
                    help='stack number n, total layers = 6 * n + 2 (default: 5)')
parser.add_argument('-d', '--dataset', type=str, default="cifar10", metavar='STRING',
                    help='dataset. (default: cifar10)')
args = parser.parse_args()

stack_n            = args.stack_n
layers             = 6 * stack_n + 2
num_classes        = 10
img_rows, img_cols = 32, 32
img_channels       = 3
batch_size         = args.batch_size
epochs             = args.epochs
iterations         = 50000 // batch_size + 1
weight_decay       = 1e-4


def color_preprocessing(x_train, x_test):
    # normalize each channel with the dataset's per-channel mean and std
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    mean = [125.307, 122.95, 113.865]
    std  = [62.9932, 62.0887, 66.7048]
    for i in range(3):
        x_train[:, :, :, i] = (x_train[:, :, :, i] - mean[i]) / std[i]
        x_test[:, :, :, i] = (x_test[:, :, :, i] - mean[i]) / std[i]
    return x_train, x_test


def scheduler(epoch):
    # step schedule: 0.1, then 0.01 from epoch 81, then 0.001 from epoch 122
    if epoch < 81:
        return 0.1
    if epoch < 122:
        return 0.01
    return 0.001


def residual_network(img_input, classes_num=10, stack_n=5):
    def residual_block(x, o_filters, increase=False):
        stride = (1, 1)
        if increase:
            stride = (2, 2)
        o1 = Activation('relu')(BatchNormalization(momentum=0.9, epsilon=1e-5)(x))
        conv_1 = Conv2D(o_filters, kernel_size=(3, 3), strides=stride, padding='same',
                        kernel_initializer="he_normal",
                        kernel_regularizer=regularizers.l2(weight_decay))(o1)
        o2 = Activation('relu')(BatchNormalization(momentum=0.9, epsilon=1e-5)(conv_1))
        conv_2 = Conv2D(o_filters, kernel_size=(3, 3), strides=(1, 1), padding='same',
                        kernel_initializer="he_normal",
                        kernel_regularizer=regularizers.l2(weight_decay))(o2)
        if increase:
            # projection shortcut: 1x1 conv with stride 2 to match dimensions
            projection = Conv2D(o_filters, kernel_size=(1, 1), strides=(2, 2), padding='same',
                                kernel_initializer="he_normal",
                                kernel_regularizer=regularizers.l2(weight_decay))(o1)
            block = add([conv_2, projection])
        else:
            block = add([conv_2, x])
        return block

    # build model (total layers = stack_n * 3 * 2 + 2)
    # stack_n = 5 by default, total layers = 32
    # input: 32x32x3 output: 32x32x16
    x = Conv2D(filters=16, kernel_size=(3, 3), strides=(1, 1), padding='same',
               kernel_initializer="he_normal",
               kernel_regularizer=regularizers.l2(weight_decay))(img_input)

    # input: 32x32x16 output: 32x32x16
    for _ in range(stack_n):
        x = residual_block(x, 16, False)

    # input: 32x32x16 output: 16x16x32
    x = residual_block(x, 32, True)
    for _ in range(1, stack_n):
        x = residual_block(x, 32, False)

    # input: 16x16x32 output: 8x8x64
    x = residual_block(x, 64, True)
    for _ in range(1, stack_n):
        x = residual_block(x, 64, False)

    x = BatchNormalization(momentum=0.9, epsilon=1e-5)(x)
    x = Activation('relu')(x)
    x = GlobalAveragePooling2D()(x)

    # input: 64 output: 10
    x = Dense(classes_num, activation='softmax',
              kernel_initializer="he_normal",
              kernel_regularizer=regularizers.l2(weight_decay))(x)
    return x


if __name__ == '__main__':
    print("========================================")
    print("MODEL: Residual Network ({:2d} layers)".format(6 * stack_n + 2))
    print("BATCH SIZE: {:3d}".format(batch_size))
    print("WEIGHT DECAY: {:.4f}".format(weight_decay))
    print("EPOCHS: {:3d}".format(epochs))
    print("DATASET: {:}".format(args.dataset))

    print("== LOADING DATA... ==")
    # load data
    if args.dataset == "cifar100":
        num_classes = 100
        (x_train, y_train), (x_test, y_test) = cifar100.load_data()
    else:
        (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    y_train = keras.utils.to_categorical(y_train, num_classes)
    y_test = keras.utils.to_categorical(y_test, num_classes)

    print("== DONE! ==\n== COLOR PREPROCESSING... ==")
    # color preprocessing
    x_train, x_test = color_preprocessing(x_train, x_test)

    print("== DONE! ==\n== BUILD MODEL... ==")
    # build network
    img_input = Input(shape=(img_rows, img_cols, img_channels))
    output    = residual_network(img_input, num_classes, stack_n)
    resnet    = Model(img_input, output)

    # print model architecture if you need
    # print(resnet.summary())

    # set optimizer
    sgd = optimizers.SGD(lr=.1, momentum=0.9, nesterov=True)
    resnet.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

    # set callbacks
    cbks = [TensorBoard(log_dir='./resnet_{:d}_{}/'.format(layers, args.dataset), histogram_freq=0),
            LearningRateScheduler(scheduler)]
    # dump checkpoints if you need (add it to cbks)
    # ModelCheckpoint('./checkpoint-{epoch}.h5', save_best_only=False, mode='auto', period=10)

    # set data augmentation
    print("== USING REAL-TIME DATA AUGMENTATION, START TRAIN... ==")
    datagen = ImageDataGenerator(horizontal_flip=True,
                                 width_shift_range=0.125,
                                 height_shift_range=0.125,
                                 fill_mode='constant', cval=0.)
    datagen.fit(x_train)

    # start training
    resnet.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                         steps_per_epoch=iterations,
                         epochs=epochs,
                         callbacks=cbks,
                         validation_data=(x_test, y_test))
    resnet.save('resnet_{:d}_{}.h5'.format(layers, args.dataset))
