This article covers two kinds of convolution:

  • 3D convolution
  • Deconvolution

The next article will cover:

  • Dilated (atrous) convolution
  • Depthwise convolution

3D Convolution

While reading papers recently I kept running into 3D convolution and could not figure out how it actually operates. I went through some material:

  • 3D Convolutional Neural Networks for Human Action Recognition
  • a CSDN blog post

but was still somewhat confused, so I did some further hands-on testing with TensorFlow.

The TensorFlow function tf.nn.conv3d

tf.nn.conv3d(
    input,
    filter,
    strides,
    padding,
    data_format='NDHWC',
    dilations=[1, 1, 1, 1, 1],
    name=None
)

Computes a 3D convolution given 5-D input and filter tensors.

Args:

  • input: A Tensor. Must be one of the following types: half, bfloat16, float32, float64. Shape [batch, in_depth, in_height, in_width, in_channels].
  • filter: A Tensor. Must have the same type as input. Shape [filter_depth, filter_height, filter_width, in_channels, out_channels]. in_channels must match between input and filter.
  • strides: A list of ints that has length >= 5. 1-D tensor of length 5. The stride of the sliding window for each dimension of input. Must have strides[0] = strides[4] = 1.
  • padding: A string from: “SAME”, “VALID”. The type of padding algorithm to use.
  • data_format: An optional string from: “NDHWC”, “NCDHW”. Defaults to “NDHWC”. The data format of the input and output data. With the default format “NDHWC”, the data is stored in the order of: [batch, in_depth, in_height, in_width, in_channels]. Alternatively, the format could be “NCDHW”, the data storage order is: [batch, in_channels, in_depth, in_height, in_width].
  • dilations: An optional list of ints. Defaults to [1, 1, 1, 1, 1]. 1-D tensor of length 5. The dilation factor for each dimension of input. If set to k > 1, there will be k-1 skipped cells between each filter element on that dimension. The dimension order is determined by the value of data_format, see above for details. Dilations in the batch and depth dimensions must be 1.
  • name: A name for the operation (optional).

Returns:

A Tensor. Has the same type as input.

Compared with the 2D convolution tf.nn.conv2d:

tf.nn.conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    dilations=[1, 1, 1, 1],
    name=None
)

The two are actually quite similar; the main differences are in the input and filter arguments. In the 2D convolution these are:

  • input: A Tensor. Must be one of the following types: half, bfloat16, float32, float64. A 4-D tensor of shape [batch, height, width, channels].
  • filter: A Tensor. Must have the same type as input. A 4-D tensor of shape [filter_height, filter_width, in_channels, out_channels].

The key difference is the in_depth dimension of input and the filter_depth dimension of filter in the 3D convolution.

  • input: A Tensor. Must be one of the following types: half, bfloat16, float32, float64. Shape [batch, in_depth, in_height, in_width, in_channels].
  • in_depth is a temporal dimension (e.g. the frames of a video) or a spatial one (e.g. consecutive slices in biomedical imaging): it counts frames or slices. This is distinct from the image channels (e.g. the 3 channels of an RGB input). Like height and width in a plane image, in_depth is a dimension the convolution slides over.
  • filter: A Tensor. Must have the same type as input. Shape [filter_depth, filter_height, filter_width, in_channels, out_channels]. in_channels must match between input and filter.
  • filter_depth is the filter's extent along that dimension, just as a filter has a spatial size of, say, 3 x 3 or 5 x 5 on the height/width plane.

The other arguments are similar to those of the 3D convolution.
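
To make the depth-versus-channels distinction concrete, here is a minimal sketch (the shapes are my own illustrative choices): conv2d consumes the 3 RGB channels entirely, while conv3d keeps the frame dimension as an extra axis the filter slides along.

import tensorflow as tf

# An RGB image: the channel axis (3) is reduced away by the convolution.
img = tf.constant(1.0, shape=[1, 224, 224, 3])       # [batch, H, W, C]
k2d = tf.constant(1.0, shape=[3, 3, 3, 64])          # [kH, kW, in_C, out_C]
out2d = tf.nn.conv2d(img, k2d, strides=[1, 1, 1, 1], padding='SAME')

# A 7-frame RGB clip: depth (7) is slid over, channels (3) are reduced.
clip = tf.constant(1.0, shape=[1, 7, 224, 224, 3])   # [batch, D, H, W, C]
k3d = tf.constant(1.0, shape=[3, 3, 3, 3, 64])       # [kD, kH, kW, in_C, out_C]
out3d = tf.nn.conv3d(clip, k3d, strides=[1, 1, 1, 1, 1], padding='SAME')

print(out2d.shape)  # (1, 224, 224, 64)    -- no channel axis left to slide over
print(out3d.shape)  # (1, 7, 224, 224, 64) -- depth survives as a sliding dimension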

Now let's test this with a program.

import tensorflow as tf
import numpy as np

# Input: batch=1, in_depth=7 (frames), 224 x 224 spatial, 3 channels.
input = tf.constant(1, shape=[1, 7, 224, 224, 3], dtype=tf.float32)
# Filters: [filter_depth, filter_height, filter_width, in_channels, out_channels].
filter_1_2 = tf.constant(2, shape=[1, 3, 3, 3, 64], dtype=tf.float32)  # 1 x 3 x 3 kernel
filter_3_4 = tf.constant(2, shape=[1, 5, 3, 3, 64], dtype=tf.float32)  # 1 x 5 x 3 kernel

# Strides: [batch, depth, height, width, channels].
res_1 = tf.nn.conv3d(input=input, filter=filter_1_2, strides=[1, 1, 1, 1, 1], padding='SAME')
res_2 = tf.nn.conv3d(input=input, filter=filter_1_2, strides=[1, 2, 1, 1, 1], padding='SAME')  # stride 2 on depth
res_3 = tf.nn.conv3d(input=input, filter=filter_3_4, strides=[1, 1, 1, 1, 1], padding='SAME')
res_4 = tf.nn.conv3d(input=input, filter=filter_3_4, strides=[1, 2, 1, 1, 1], padding='SAME')  # stride 2 on depth

sess = tf.Session()
conv_res_1 = sess.run(res_1)
conv_res_2 = sess.run(res_2)
conv_res_3 = sess.run(res_3)
conv_res_4 = sess.run(res_4)

print(conv_res_1.shape)
print(conv_res_2.shape)
print(conv_res_3.shape)
print(conv_res_4.shape)

Output:

(1, 7, 224, 224, 64)
(1, 4, 224, 224, 64)
(1, 7, 224, 224, 64)
(1, 4, 224, 224, 64)

As the shapes show, convolution along dimension 2 (depth) follows the same principle as along dimensions 3 and 4 (height and width): with 'SAME' padding, stride 1 preserves the size (7), while stride 2 gives ceil(7 / 2) = 4.

The figure above shows a 3D filter (actually 4-D: depth, channel, height, width) being convolved over the input data.

As the figure suggests, the filter slides along the third dimension (frames or slices) and produces a stack of single-channel feature maps; with 'SAME' padding, the stack height is ceil(in_depth / stride_depth) = floor((in_depth - 1) / stride_depth) + 1. Each different filter produces an equally tall stack of single-channel feature maps, and the stacks are then assembled into an output of shape [batch, ceil(in_depth / stride_d), ceil(in_height / stride_h), ceil(in_width / stride_w), out_channels], as the program above shows.
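
As a quick sanity check, here is a tiny pure-Python helper (my own illustration, not part of TensorFlow) that reproduces the 'SAME'-padding output sizes printed above:

def same_out_size(in_size, stride):
    # 'SAME' padding: out = ceil(in / stride) = floor((in - 1) / stride) + 1
    return (in_size - 1) // stride + 1

print(same_out_size(7, 1))    # 7   -> matches (1, 7, 224, 224, 64)
print(same_out_size(7, 2))    # 4   -> matches (1, 4, 224, 224, 64)
print(same_out_size(224, 1))  # 224 -> height/width are unchanged at stride 1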

Deconvolution

Deconvolution is used a lot in semantic segmentation. I once read a paper devoted to deconvolution but have mostly forgotten it, so let's relearn it here.

The animated figures here illustrate deconvolution very vividly; a look at them should make it clear.

I also recommend the paper A guide to convolution arithmetic for deep learning.

tf.nn.conv2d_transpose in TensorFlow

tf.nn.conv2d_transpose(
    value,
    filter,
    output_shape,
    strides,
    padding='SAME',
    data_format='NHWC',
    name=None
)

Args:

  • value: A 4-D Tensor of type float and shape [batch, height, width, in_channels] for NHWC data format or [batch, in_channels, height, width] for NCHW data format.
  • filter: A 4-D Tensor with the same type as value and shape [height, width, output_channels, in_channels]. filter's in_channels dimension must match that of value.
  • output_shape: A 1-D Tensor representing the output shape of the deconvolution op.
  • strides: A list of ints. The stride of the sliding window for each dimension of the input tensor.
  • padding: A string, either ‘VALID’ or ‘SAME’. The padding algorithm. See the “returns” section of tf.nn.convolution for details.
  • data_format: A string. ‘NHWC’ and ‘NCHW’ are supported.
  • name: Optional name for the returned tensor.

The output_shape argument is puzzling at first: once input, filter, padding, and stride are fixed, why isn't the output shape determined?

import tensorflow as tf
import numpy as np

value = tf.constant(1, shape=[1, 3, 3, 3], dtype=tf.float32)
# Transposed-conv filter layout: [height, width, output_channels, in_channels].
filter = tf.constant(2, shape=[3, 3, 64, 3], dtype=tf.float32)

# Two different output shapes for the same value/filter/strides/padding.
output_shape_1 = tf.constant([1, 6, 6, 64])
output_shape_2 = tf.constant([1, 5, 5, 64])

res_1 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_1,
                               strides=[1, 2, 2, 1], padding='SAME')
res_2 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_2,
                               strides=[1, 2, 2, 1], padding='SAME')

sess = tf.Session()
conv_res_1 = sess.run(res_1)
conv_res_2 = sess.run(res_2)

print(conv_res_1.shape)
print(conv_res_2.shape)

Output

(1, 6, 6, 64)
(1, 5, 5, 64)

The example shows why specifying output_shape matters: the deconvolved shape is not unique, so you pick one from the set of feasible shapes. Here both a 6 x 6 and a 5 x 5 input convolve back to 3 x 3 with stride 2 and 'SAME' padding.

So how exactly does deconvolution operate? Put differently: given input, filter, stride, and padding, which output_shapes are feasible?

Consider the size formula for convolution:

conv: i -> o

deconv: o -> i

o = floor((i + 2p - k) / s) + 1    (1)

Deconvolution infers the shape of i, the input that yields o after convolution.

From (1), because of the floor,

i + 2p - k = (o - 1)s + r, with r ∈ {0, 1, ..., s - 1}

  • If s = 1, then r = 0 and i = (o - 1) + k - 2p is fixed, so output_shape would not even need to be specified. For the common case k = 3, p = 1 (padding='SAME'), i = o; for k = 3, p = 0 (padding='VALID'), i = o + 2.
  • If s = 2, then i + 2p - k = 2(o - 1) or i + 2p - k = 2(o - 1) + 1, i.e. i = 2o - 2 + k - 2p or i = 2o - 1 + k - 2p. For the common case k = 3, p = 1 (padding='SAME'), i = 2o - 1 or i = 2o; for k = 3, p = 0 (padding='VALID'), i = 2o + 1 or i = 2o + 2.

Summary, considering only the common case k = 3:

  • s = 1:

    • padding='SAME' -> i = o
    • padding='VALID' -> i = o + 2
  • s = 2:

    • padding='SAME' -> i = 2o or i = 2o - 1
    • padding='VALID' -> i = 2o + 1 or i = 2o + 2
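
The case analysis above fits in a few lines of Python. The helper below is my own illustration (not a TensorFlow API), assuming an odd kernel size so that p = (k - 1) / 2 under 'SAME' padding:

def feasible_deconv_sizes(o, k, s, padding):
    # From o = floor((i + 2p - k) / s) + 1:
    #   i = s * (o - 1) + k - 2 * p + r,  r in {0, ..., s - 1}
    p = (k - 1) // 2 if padding == 'SAME' else 0
    return [s * (o - 1) + k - 2 * p + r for r in range(s)]

print(feasible_deconv_sizes(64, 3, 1, 'SAME'))   # [64]
print(feasible_deconv_sizes(64, 3, 1, 'VALID'))  # [66]
print(feasible_deconv_sizes(64, 3, 2, 'SAME'))   # [127, 128]
print(feasible_deconv_sizes(64, 3, 2, 'VALID'))  # [129, 130]

These match the shapes exercised by the test program below.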

Program test

import tensorflow as tf
import numpy as np

value = tf.constant(1, shape=[1, 64, 64, 3], dtype=tf.float32)
filter = tf.constant(2, shape=[3, 3, 256, 3], dtype=tf.float32)

def deconv_s_1():
    # s = 1, padding='SAME'
    output_shape_1_same = tf.constant([1, 64, 64, 256])
    res_1 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_1_same,
                                   strides=[1, 1, 1, 1], padding='SAME')
    # s = 1, padding='VALID'
    output_shape_1_valid = tf.constant([1, 66, 66, 256])
    res_2 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_1_valid,
                                   strides=[1, 1, 1, 1], padding='VALID')
    sess = tf.Session()
    conv_res_1 = sess.run(res_1)
    conv_res_2 = sess.run(res_2)
    print("s = 1, padding='SAME', expected: i = o = 64 ")
    print(conv_res_1.shape)
    print("s = 1, padding='VALID', expected i = o + 2 = 66")
    print(conv_res_2.shape)

def deconv_s_2():
    # s = 2, padding='SAME'
    output_shape_2_same_1 = tf.constant([1, 128, 128, 256])
    res_1 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_2_same_1,
                                   strides=[1, 2, 2, 1], padding='SAME')
    output_shape_2_same_2 = tf.constant([1, 127, 127, 256])
    res_2 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_2_same_2,
                                   strides=[1, 2, 2, 1], padding='SAME')
    # s = 2, padding='VALID'
    output_shape_2_valid_1 = tf.constant([1, 129, 129, 256])
    res_3 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_2_valid_1,
                                   strides=[1, 2, 2, 1], padding='VALID')
    print("s = 2, padding='VALID'")  # stray debug print; shows up early in the log below
    output_shape_2_valid_2 = tf.constant([1, 130, 130, 256])
    res_4 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_2_valid_2,
                                   strides=[1, 2, 2, 1], padding='VALID')
    sess = tf.Session()
    conv_res_1 = sess.run(res_1)
    conv_res_2 = sess.run(res_2)
    conv_res_3 = sess.run(res_3)
    conv_res_4 = sess.run(res_4)
    print("s = 2, padding='SAME', expected i = 2o = 128 or i = 2o - 1 = 127 ")
    print(conv_res_1.shape)
    print(conv_res_2.shape)
    print("s = 2, padding='VALID', expected i = 2o + 1 = 129 or i = 2o + 2 = 130")
    print(conv_res_3.shape)
    print(conv_res_4.shape)

def deconv_error():
    # 129 is not a feasible input size for o = 64 with s = 2, padding='SAME'
    # (feasible sizes are 127 and 128), so sess.run raises an error.
    output_shape_2_same_1 = tf.constant([1, 129, 129, 256])
    res_1 = tf.nn.conv2d_transpose(value=value, filter=filter, output_shape=output_shape_2_same_1,
                                   strides=[1, 2, 2, 1], padding='SAME')
    sess = tf.Session()
    conv_res_1 = sess.run(res_1)  # raises; the prints below are never reached
    print("s = 2, padding='SAME', expected i = 2o = 128 or i = 2o - 1 = 127 ")
    print(conv_res_1.shape)

print("input_size = 64:")
print("stride = 1")
deconv_s_1()
print("stride = 2")
deconv_s_2()
deconv_error()

Output

input_size = 64:
2018-12-04 00:04:02.363507: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
stride = 1
s = 1, padding='SAME', expected: i = o = 64
(1, 64, 64, 256)
s = 1, padding='VALID', expected i = o + 2 = 66
(1, 66, 66, 256)
stride = 2
s = 2, padding='VALID'
s = 2, padding='SAME', expected i = 2o = 128 or i = 2o - 1 = 127
(1, 128, 128, 256)
(1, 127, 127, 256)
s = 2, padding='VALID', expected i = 2o + 1 = 129 or i = 2o + 2 = 130
(1, 129, 129, 256)
(1, 130, 130, 256)
2018-12-04 00:04:03.186144: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_grad_input_ops.cc:355 : Invalid argument: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 64, computed = 65 spatial_dim: 1 input: 129 filter: 3 output: 64 stride: 2 dilation: 1
2018-12-04 00:04:03.187537: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_grad_input_ops.cc:355 : Invalid argument: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 64, computed = 65 spatial_dim: 1 input: 129 filter: 3 output: 64 stride: 2 dilation: 1
2018-12-04 00:04:03.188916: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_grad_input_ops.cc:355 : Invalid argument: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 64, computed = 65 spatial_dim: 1 input: 129 filter: 3 output: 64 stride: 2 dilation: 1
2018-12-04 00:04:03.189200: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_grad_input_ops.cc:355 : Invalid argument: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 64, computed = 65 spatial_dim: 1 input: 129 filter: 3 output: 64 stride: 2 dilation: 1

In summary, the program confirms the analysis above.

So for ksize = 3, choosing padding='SAME' and stride = 2 lets us make the output exactly twice the input: Osize = 2 * Isize.
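
This is exactly how transposed convolution is typically used as a 2x upsampling layer in segmentation networks. A minimal sketch (the layer sizes are my own illustrative choices):

import tensorflow as tf

# Upsample a feature map by 2x: [1, 32, 32, 256] -> [1, 64, 64, 128].
feat = tf.constant(1.0, shape=[1, 32, 32, 256])
# Transposed-conv kernel layout: [height, width, output_channels, in_channels].
kernel = tf.constant(0.1, shape=[3, 3, 128, 256])

upsampled = tf.nn.conv2d_transpose(
    value=feat,
    filter=kernel,
    output_shape=[1, 64, 64, 128],  # i = 2o, feasible since k = 3, s = 2, padding='SAME'
    strides=[1, 2, 2, 1],
    padding='SAME')

sess = tf.Session()
print(sess.run(upsampled).shape)  # (1, 64, 64, 128)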

The above analyzed the shape relationship between convolution and deconvolution and worked out the possible values of output_shape.
