PyTorch PixelShuffle and Upsample functions

The design of this function comes from a 2016 super-resolution (SR) paper, "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network". The principle is as follows:

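How sub-pixel convolution works: a convolution produces a feature map with $r^2$ channels whose spatial size equals that of the input image. The $r^2$ feature values at each spatial position are then laid out in order as an $r \times r$ block of pixels, which enlarges the image by a factor of $r$ along each spatial dimension.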
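The rearrangement described above can be sketched in plain Python for a single output channel. This is only an illustration: `pixel_shuffle_single` is a hypothetical helper, not a PyTorch function, and it assumes that channel `k` lands at offset `(k // r, k % r)` inside each block, which is the ordering nn.PixelShuffle uses.

```python
def pixel_shuffle_single(channels, r):
    """Rearrange r*r feature maps (each H x W) into one (H*r) x (W*r) map.

    Hypothetical helper for illustration only; nested lists stand in for tensors.
    """
    assert len(channels) == r * r, "expected r*r input channels"
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0] * (w * r) for _ in range(h * r)]
    for k, fmap in enumerate(channels):
        i, j = divmod(k, r)  # offset of channel k inside each r x r block
        for y in range(h):
            for x in range(w):
                out[y * r + i][x * r + j] = fmap[y][x]
    return out


# Four 1x2 feature maps, r = 2 -> one 2x4 map
channels = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
shuffled = pixel_shuffle_single(channels, 2)
print(shuffled)  # [[1, 3, 2, 4], [5, 7, 6, 8]]
```

Each input position (here two of them) expands into its own 2 x 2 block, so the values 1, 3, 5, 7 from position 0 form the left block and 2, 4, 6, 8 form the right one.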

In PyTorch it is defined in the file https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/pixelshuffle.py

Definition

Here upscale_factor is the factor by which the spatial resolution is increased.

class torch.nn.PixelShuffle(upscale_factor)

Input and output

Input: $(N, C \times \text{upscale\_factor}^2, H, W)$

Output: $(N, C, H \times \text{upscale\_factor}, W \times \text{upscale\_factor})$

Example:

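import torch
import torch.nn as nn

pixel_shuffle = nn.PixelShuffle(3)  # upscale by a factor of 3
input = torch.randn(1, 9, 4, 4)     # 9 channels = 1 output channel * 3**2
output = pixel_shuffle(input)
print(output.size())  # torch.Size([1, 1, 12, 12])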
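In the sub-pixel convolution setting, the $r^2$-channel feature map usually comes from a convolution placed right before the shuffle. The following is a minimal sketch of such a layer; the channel counts, kernel size, and the name `sub_pixel` are arbitrary assumptions chosen for illustration, not values taken from the paper.

```python
import torch
import torch.nn as nn

r = 3                 # upscale factor
in_channels = 16      # hypothetical number of input feature channels
out_channels = 1      # hypothetical number of output image channels

# The convolution keeps H and W unchanged (padding=1 with a 3x3 kernel)
# and produces out_channels * r**2 channels for PixelShuffle to rearrange.
sub_pixel = nn.Sequential(
    nn.Conv2d(in_channels, out_channels * r * r, kernel_size=3, padding=1),
    nn.PixelShuffle(r),
)

features = torch.randn(1, in_channels, 4, 4)
upscaled = sub_pixel(features)
print(upscaled.shape)  # torch.Size([1, 1, 12, 12])
```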

The Upsample function

Its PyTorch implementation is in: https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/upsampling.py

Upsamples the given multi-channel 1D (temporal), 2D (spatial), or 3D (volumetric) data.

Definition

class torch.nn.Upsample(size=None, scale_factor=None, mode='nearest', align_corners=None)

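Parameters:

size: the output size, given as a tuple: ([optional D_out], [optional H_out], W_out)
scale_factor: the multiplier for the height, width, and depth. It can be an int, which scales every spatial dimension by the same factor, or a tuple, which gives a separate factor for each dimension.
mode: the upsampling algorithm. Options include nearest neighbor (nearest), linear interpolation (linear), bilinear interpolation (bilinear), and trilinear interpolation (trilinear). The default is nearest.
align_corners: if True, the corner pixels of the input and output are aligned. This only has an effect when mode is linear, bilinear, or trilinear. The default behaves as False (the None in the signature is treated as False).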
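As a quick check of how these parameters interact, the sketch below applies nn.Upsample to 1D, 2D, and 3D inputs and prints the resulting shapes. The tensor sizes and variable names are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

x1d = torch.randn(1, 2, 8)        # (N, C, W)
x2d = torch.randn(1, 2, 8, 8)     # (N, C, H, W)
x3d = torch.randn(1, 2, 4, 8, 8)  # (N, C, D, H, W)

# size fixes the output resolution directly
out_size = nn.Upsample(size=(20, 30), mode='nearest')(x2d)

# a tuple scale_factor gives one factor per spatial dimension (H, W)
out_scale = nn.Upsample(scale_factor=(2, 3), mode='bilinear',
                        align_corners=False)(x2d)

# the interpolation mode must match the dimensionality of the input
out_1d = nn.Upsample(scale_factor=2, mode='linear', align_corners=False)(x1d)
out_3d = nn.Upsample(scale_factor=2, mode='trilinear', align_corners=False)(x3d)

print(out_size.shape)   # torch.Size([1, 2, 20, 30])
print(out_scale.shape)  # torch.Size([1, 2, 16, 24])
print(out_1d.shape)     # torch.Size([1, 2, 16])
print(out_3d.shape)     # torch.Size([1, 2, 8, 16, 16])
```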

Example

input = torch.arange(1, 5).view(1, 1, 2, 2).float()
input
tensor([[[[ 1.,  2.],
          [ 3.,  4.]]]])

m = nn.Upsample(scale_factor=2, mode='nearest')
m(input)
tensor([[[[ 1.,  1.,  2.,  2.],
          [ 1.,  1.,  2.,  2.],
          [ 3.,  3.,  4.,  4.],
          [ 3.,  3.,  4.,  4.]]]])

m = nn.Upsample(scale_factor=2, mode='bilinear')
m(input)
tensor([[[[ 1.0000,  1.2500,  1.7500,  2.0000],
          [ 1.5000,  1.7500,  2.2500,  2.5000],
          [ 2.5000,  2.7500,  3.2500,  3.5000],
          [ 3.0000,  3.2500,  3.7500,  4.0000]]]])

m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
m(input)
tensor([[[[ 1.0000,  1.3333,  1.6667,  2.0000],
          [ 1.6667,  2.0000,  2.3333,  2.6667],
          [ 2.3333,  2.6667,  3.0000,  3.3333],
          [ 3.0000,  3.3333,  3.6667,  4.0000]]]])

input_3x3 = torch.zeros(3, 3).view(1, 1, 3, 3)
input_3x3[:, :, :2, :2].copy_(input)
tensor([[[[ 1.,  2.],
          [ 3.,  4.]]]])
input_3x3
tensor([[[[ 1.,  2.,  0.],
          [ 3.,  4.,  0.],
          [ 0.,  0.,  0.]]]])

m = nn.Upsample(scale_factor=2, mode='bilinear')  # align_corners=False
m(input_3x3)
tensor([[[[ 1.0000,  1.2500,  1.7500,  1.5000,  0.5000,  0.0000],
          [ 1.5000,  1.7500,  2.2500,  1.8750,  0.6250,  0.0000],
          [ 2.5000,  2.7500,  3.2500,  2.6250,  0.8750,  0.0000],
          [ 2.2500,  2.4375,  2.8125,  2.2500,  0.7500,  0.0000],
          [ 0.7500,  0.8125,  0.9375,  0.7500,  0.2500,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])

m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
m(input_3x3)
tensor([[[[ 1.0000,  1.4000,  1.8000,  1.6000,  0.8000,  0.0000],
          [ 1.8000,  2.2000,  2.6000,  2.2400,  1.1200,  0.0000],
          [ 2.6000,  3.0000,  3.4000,  2.8800,  1.4400,  0.0000],
          [ 2.4000,  2.7200,  3.0400,  2.5600,  1.2800,  0.0000],
          [ 1.2000,  1.3600,  1.5200,  1.2800,  0.6400,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])
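The difference between the two align_corners settings comes down to how an output index is mapped back to a source coordinate. The plain-Python sketch below reproduces the first row of the outputs above in 1D. It is an illustration under stated assumptions, not PyTorch's implementation: `upsample_linear_1d` is a hypothetical helper, and the mapping formulas (pixel centers for align_corners=False, end points for align_corners=True) are my reading of the behavior shown in the examples.

```python
def upsample_linear_1d(values, scale, align_corners):
    """Linearly upsample a 1D list by an integer factor (illustrative sketch)."""
    n_in = len(values)
    n_out = n_in * scale
    out = []
    for dst in range(n_out):
        if align_corners:
            # first and last samples of input and output coincide
            src = dst * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        else:
            # align pixel centers, clamping negative coordinates to 0
            src = max((dst + 0.5) / scale - 0.5, 0.0)
        lo = min(int(src), n_in - 1)   # left neighbor
        hi = min(lo + 1, n_in - 1)     # right neighbor, clamped at the border
        frac = src - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


row = [1.0, 2.0, 0.0]  # first row of input_3x3
print([round(v, 4) for v in upsample_linear_1d(row, 2, False)])
# [1.0, 1.25, 1.75, 1.5, 0.5, 0.0]
print([round(v, 4) for v in upsample_linear_1d(row, 2, True)])
# [1.0, 1.4, 1.8, 1.6, 0.8, 0.0]
```

With align_corners=True the sampling positions depend on the input length, which is why the same values 1 and 2 produce 1.3333 in the 2x2 case but 1.4 once they are embedded in a 3x3 tensor. With align_corners=False the first three values stay 1, 1.25, 1.75 in both cases.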