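Introduction

ShuffleNet is a lightweight network for mobile devices, proposed by Megvii Research.

It was designed as a further improvement on MobileNet.

The ShuffleNet authors observed that although MobileNet uses depthwise convolution and pointwise convolution (that is, convolution with 1x1 kernels) to reduce the number of model parameters, most of the remaining parameters are concentrated in the 1x1 convolutions. ShuffleNet's improvement is therefore to split the feature maps into groups and let each convolution operate only on the feature maps inside its own group. Once the channels are grouped, the network still needs to exchange information between the outputs of different groups, so the authors also introduced channel shuffle, which mixes feature maps from different groups together.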
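As a rough illustration of why the 1x1 convolutions dominate, the sketch below counts weights for one depthwise separable convolution and for a grouped 1x1 convolution. The layer sizes (256 channels, 3x3 kernel, 8 groups) are assumptions chosen for this example, not figures from the paper, and the helper names are hypothetical.

```python
def depthwise_separable_params(in_channels, out_channels, kernel_size=3):
    """Weight counts (no bias) for one depthwise separable convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1x1 conv mixing all channels
    return depthwise, pointwise


def grouped_pointwise_params(in_channels, out_channels, groups):
    """Weight count (no bias) for a 1x1 convolution split into `groups` groups."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    return (in_channels // groups) * (out_channels // groups) * groups


# Illustrative layer sizes, assumed for this example only
dw, pw = depthwise_separable_params(256, 256)
grouped = grouped_pointwise_params(256, 256, groups=8)
print(dw, pw)    # 2304 65536
print(grouped)   # 8192
```

With these sizes the pointwise layer holds far more weights than the depthwise layer, and splitting it into 8 groups divides its weight count by 8.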

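Group convolution

Group convolution simply means that each convolution only sees the feature maps inside its own group.

First, the whole input is split into groups along the channel axis.
Then each group is convolved separately, and the per-group outputs are concatenated.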
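The two steps above can be sketched in NumPy for the 1x1 case, where a convolution reduces to a matrix product over the channel axis. This is a minimal sketch, assuming a single image of shape (H, W, C_in); the function name and the tensor sizes are hypothetical.

```python
import numpy as np


def grouped_pointwise_conv(x, weights):
    """Apply a grouped 1x1 convolution to x of shape (H, W, C_in).

    weights is a list with one (C_in / groups, C_out / groups) matrix per group.
    """
    groups = len(weights)
    in_per_group = x.shape[-1] // groups
    outputs = []
    for g, w in enumerate(weights):
        # Step 1: take only this group's slice of the input channels
        x_g = x[..., g * in_per_group:(g + 1) * in_per_group]
        # Step 2: convolve the slice on its own (a 1x1 conv is a matmul over channels)
        outputs.append(x_g @ w)
    # Join the per-group outputs back along the channel axis
    return np.concatenate(outputs, axis=-1)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 6))                          # 6 input channels
weights = [rng.normal(size=(3, 2)) for _ in range(2)]   # 2 groups, 3 in -> 2 out each
y = grouped_pointwise_conv(x, weights)
print(y.shape)  # (4, 4, 4)
```

Because each output group is computed from its own input slice only, changing the channels of one group leaves the outputs of the other group untouched, which is why some form of cross-group mixing is needed afterwards.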

Channel shuffle

Let's start with a short piece of NumPy code.

import numpy as np

d = np.array([0,1,2,3,4,5,6,7,8])
x = np.reshape(d, (3,3))
x = np.transpose(x, [1,0]) # transpose
x = np.reshape(x, (9,)) # flatten
print(x)

Output:

[0 3 6 1 4 7 2 5 8]

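The original input is the sequence 0 to 8.
Suppose every three consecutive elements belong to one channel:
Channel 1 holds 0, 1, 2
Channel 2 holds 3, 4, 5
Channel 3 holds 6, 7, 8
After the shuffle, 0, 1, 2 end up in channels 1, 2, 3 respectively.
Likewise, 3, 4, 5 and 6, 7, 8 are each spread across the three channels.
In the network itself, the elements correspond to channels and each chunk of three corresponds to a group.

In this way, a reshape followed by a matrix transpose is enough to shuffle the channels.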
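The same reshape, transpose, reshape trick extends to a full feature map. The following is a minimal NumPy sketch, assuming a channels-last tensor of shape (N, H, W, C); the function name `channel_shuffle` is chosen for illustration.

```python
import numpy as np


def channel_shuffle(x, groups):
    """Shuffle the last (channel) axis of x, assumed shape (N, H, W, C)."""
    n, h, w, c = x.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    x = x.reshape(n, h, w, groups, c // groups)  # split channels into groups
    x = x.transpose(0, 1, 2, 4, 3)               # swap group and within-group axes
    return x.reshape(n, h, w, c)                 # flatten back to C channels


x = np.arange(9).reshape(1, 1, 1, 9)
shuffled = channel_shuffle(x, groups=3)
print(shuffled.ravel())  # [0 3 6 1 4 7 2 5 8]
```

Only the channel axis is reordered; the batch and spatial axes keep their positions, so the operation has no weights to learn.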

ShuffleNetUnit

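On the left is the depthwise separable convolution block proposed by the MobileNet authors; inspired by ResNet, it also uses a skip connection.
In the middle is the main ShuffleNet block.
On the right is the ShuffleNet downsampling block.
Downsampling happens when the stride is 2.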
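The difference between the two ShuffleNet blocks can be summarised by their output shapes. The sketch below follows the implementation later in this post, where the stride-2 block concatenates an average-pooled shortcut with the main branch and the stride-1 block adds the shortcut element-wise. The function name is hypothetical, and 'same' padding is assumed, so a stride of 2 rounds the spatial size up to half.

```python
def unit_output_shape(height, width, in_channels, filters, stride):
    """Output (height, width, channels) of one unit, assuming 'same' padding."""
    if stride == 2:
        # Main branch outputs filters - in_channels channels at half resolution;
        # the average-pooled shortcut contributes the remaining in_channels.
        main_channels = filters - in_channels
        out_h, out_w = -(-height // 2), -(-width // 2)  # ceiling division
        return out_h, out_w, main_channels + in_channels
    # Stride 1: the element-wise add needs matching channel counts
    assert filters == in_channels, "residual add needs matching channel counts"
    return height, width, filters


print(unit_output_shape(56, 56, 24, 384, stride=2))   # (28, 28, 384)
print(unit_output_shape(28, 28, 384, 384, stride=1))  # (28, 28, 384)
```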

Implementation

The code in this post comes from https://blog.csdn.net/zjn295771349/article/details/89704086
I have added some comments of my own on top of it.

import numpy as np
from keras.callbacks import LearningRateScheduler
from keras.models import Model
from keras.layers import Input, Conv2D, Dropout,  Dense, GlobalAveragePooling2D, Concatenate, AveragePooling2D
from keras.layers import Activation, BatchNormalization, add, Reshape, ReLU, DepthwiseConv2D, MaxPooling2D, Lambda
from keras.utils.vis_utils import plot_model
from keras import backend as K
from keras.optimizers import SGD


def _group_conv(x, filters, kernel, stride, groups):
    # Pick the channel axis according to the Keras image data format
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    in_channels = K.int_shape(x)[channel_axis]  # number of input channels
    # Number of input channels per group
    nb_ig = in_channels // groups
    # Number of output channels per group
    nb_og = filters // groups
    gc_list = []
    # The output filters must divide evenly across the groups
    assert filters % groups == 0
    for i in range(groups):
        # Lambda wraps a lambda expression as a layer object.
        # i=i binds the current loop value so each layer keeps its own slice.
        if channel_axis == -1:
            # Channels are on the last axis
            x_group = Lambda(lambda z, i=i: z[:, :, :, i*nb_ig: (i+1)*nb_ig])(x)
        else:
            x_group = Lambda(lambda z, i=i: z[:, i*nb_ig: (i+1)*nb_ig, :, :])(x)
        gc_list.append(Conv2D(filters=nb_og, kernel_size=kernel, strides=stride,
                              padding='SAME', use_bias=False)(x_group))
    # Finally, concatenate along the channel axis
    return Concatenate(axis=channel_axis)(gc_list)


def _channel_shuffle(x, groups):
    """
    Channel shuffle layer, implemented with reshape and transpose.
    Example:
        d = np.array([0,1,2,3,4,5,6,7,8])
        x = np.reshape(d, (3,3))
        x = np.transpose(x, [1,0])  # transpose
        x = np.reshape(x, (9,))  # flatten
        [0 1 2 3 4 5 6 7 8] --> [0 3 6 1 4 7 2 5 8]
    :param x: input tensor
    :param groups: number of groups
    :return: tensor with shuffled channels
    """
    # Channels are on the last axis
    if K.image_data_format() == "channels_last":
        # Get height, width and channel count
        height, width, in_channels = K.int_shape(x)[1:]
        channels_per_group = in_channels // groups
        pre_shape = [-1, height, width, groups, channels_per_group]
        dim = (0, 1, 2, 4, 3)
        # Flatten back to in_channels on the channel axis
        later_shape = [-1, height, width, in_channels]
    else:
        # Channels are on the first axis after the batch axis
        in_channels, height, width = K.int_shape(x)[1:]
        channels_per_group = in_channels // groups
        pre_shape = [-1, groups, channels_per_group, height, width]
        dim = (0, 2, 1, 3, 4)
        # -1 on the batch axis lets reshape infer its size from the other dimensions
        later_shape = [-1, in_channels, height, width]

    x = Lambda(lambda z: K.reshape(z, pre_shape))(x)
    x = Lambda(lambda z: K.permute_dimensions(z, dim))(x)  # reorder axes by the given pattern
    x = Lambda(lambda z: K.reshape(z, later_shape))(x)
    return x


def _shufflenet_unit(inputs, filters, kernel, stride, groups, stage, bottleneck_ratio=0.25):
    """ShuffleNet unit
    # Arguments
        inputs: Tensor, input tensor with `channels_last` or `channels_first` data format
        filters: Integer, number of output channels
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        stride: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the width and height.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        groups: Integer, number of groups per channel
        stage: Integer, stage number of ShuffleNet
        bottleneck_ratio: Float, ratio of bottleneck channels to output channels
    # Returns
        Output tensor
    # Note
        For Stage 2, we (authors of shufflenet) do not apply group convolution
        on the first pointwise layer because the number of input channels is
        relatively small.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    in_channels = K.int_shape(inputs)[channel_axis]
    bottleneck_channels = int(filters*bottleneck_ratio)

    if stage == 2:
        x = Conv2D(filters=bottleneck_channels, kernel_size=kernel, strides=1, padding='SAME',
                   use_bias=False)(inputs)
    else:
        x = _group_conv(inputs, bottleneck_channels, (1, 1), 1, groups)
    x = BatchNormalization(axis=channel_axis)(x)
    x = ReLU()(x)

    x = _channel_shuffle(x, groups)

    x = DepthwiseConv2D(kernel_size=kernel, strides=stride, depth_multiplier=1, padding='SAME',
                        use_bias=False)(x)
    x = BatchNormalization(axis=channel_axis)(x)

    if stride == 2:
        # With stride 2 the ShuffleNet unit becomes a downsampling unit.
        # The downsampling unit concatenates the (pooled) input onto the output,
        # so the number of filters here is reduced by the number of input channels.
        x = _group_conv(x, filters-in_channels, (1, 1), 1, groups)
        x = BatchNormalization(axis=channel_axis)(x)
        avg = AveragePooling2D(pool_size=(3, 3), strides=2, padding='SAME')(inputs)
        x = Concatenate(axis=channel_axis)([avg, x])
    else:
        x = _group_conv(x, filters, (1, 1), 1, groups)
        x = BatchNormalization(axis=channel_axis)(x)
        x = add([x, inputs])
    return x


def _stage(x, filters, kernel, groups, repeat, stage):
    # The first unit of each stage downsamples; the rest keep the resolution
    x = _shufflenet_unit(x, filters, kernel, 2, groups, stage)
    for i in range(1, repeat):
        x = _shufflenet_unit(x, filters, kernel, 1, groups, stage)
    return x


def shuffleNet(input_shape, classes):
    inputs = Input(shape=input_shape)
    x = Conv2D(24, (3, 3), strides=2, padding='SAME', use_bias=True, activation='elu')(inputs)
    x = MaxPooling2D(pool_size=(3, 3), strides=2, padding='SAME')(x)

    x = _stage(x, filters=384, kernel=(3, 3), groups=8, repeat=4, stage=2)
    x = _stage(x, filters=768, kernel=(3, 3), groups=8, repeat=8, stage=3)
    x = _stage(x, filters=1536, kernel=(3, 3), groups=8, repeat=4, stage=4)

    x = GlobalAveragePooling2D()(x)
    x = Dense(classes)(x)
    predicts = Activation('softmax')(x)
    model = Model(inputs, predicts)
    return model


if __name__ == '__main__':
    model = shuffleNet((224, 224, 3), 1000)
    model.summary()
    plot_model(model, to_file='./shuffleNet.png')
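One detail worth keeping in mind about `_group_conv`: slicing lambdas created inside a loop are sensitive to Python's late binding of the loop variable `i`. The plain-Python sketch below shows the pitfall and the default-argument idiom `i=i` that avoids it. Whether the late-bound form causes trouble in practice depends on when the framework actually calls the function, which may differ between Keras versions, so treat this as something to check rather than a confirmed bug.

```python
data = list(range(6))
per_group = 2

late, bound = [], []
for i in range(3):
    # Reads i when the function is called, not when it is created
    late.append(lambda z: z[i * per_group:(i + 1) * per_group])
    # Default argument captures the value of i at creation time
    bound.append(lambda z, i=i: z[i * per_group:(i + 1) * per_group])

# All functions are called only after the loop has finished
late_slices = [f(data) for f in late]
bound_slices = [f(data) for f in bound]
print(late_slices)   # [[4, 5], [4, 5], [4, 5]]
print(bound_slices)  # [[0, 1], [2, 3], [4, 5]]
```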
