Paper Notes [5] Learning a Deep Convolutional Network for Image Super-Resolution

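Hmm... this is the paper mentioned in that earlier deblocking/deringing paper. It is again from Xiaoou Tang's group: a deep CNN for super-resolution, known as SRCNN. The paper proposes an end-to-end mapping, low resolution in and high resolution out, and shows that the traditional sparse-coding SR pipeline can also be expressed as a CNN. The network is also fairly lightweight, giving good results at high speed.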

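Super-resolution is a classic problem. Some methods exploit the internal self-similarity of an image, while others learn a mapping from external low- and high-resolution exemplar pairs. Sparse coding (SC) is one of the representative methods for external example-based image super-resolution. It works like this: split the image into patches and preprocess them, encode each patch with a low-resolution dictionary to obtain sparse coefficients, then swap the codebook for a high-resolution dictionary and reconstruct the high-resolution image. Earlier SC-based work therefore focused mainly on how to find the best dictionaries, or on other ways of modeling.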
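The "encode with the LR dictionary, then swap in the HR dictionary" idea can be sketched with a toy example. This is only an illustration under strong assumptions: the dictionaries `lr_dict` and `hr_dict` are made up, the atoms are assumed unit-norm, and a one-atom matching-pursuit step stands in for a real sparse-coding solver. It is not the method of any particular SC paper.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def encode_one_atom(patch, lr_dict):
    """Toy 'sparse code': keep only the LR atom most correlated with the patch.

    Returns (atom index, coefficient). Assumes unit-norm atoms, so the
    coefficient is just the inner product with the chosen atom.
    """
    scores = [dot(patch, atom) for atom in lr_dict]
    k = max(range(len(lr_dict)), key=lambda i: abs(scores[i]))
    return k, scores[k]

def reconstruct(k, coeff, hr_dict):
    """Swap the codebook: reuse the same coefficient with the paired HR atom."""
    return [coeff * v for v in hr_dict[k]]

# Hypothetical coupled dictionaries: atom i of lr_dict is paired with atom i of hr_dict.
lr_dict = [[1.0, 0.0], [0.0, 1.0]]                      # LR patches of length 2
hr_dict = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]  # HR patches of length 4

lr_patch = [0.2, 3.0]
k, coeff = encode_one_atom(lr_patch, lr_dict)
hr_patch = reconstruct(k, coeff, hr_dict)
```

The point of the sketch is only that the coefficients are shared between the two dictionaries; the quality of the result depends entirely on how the dictionary pair was learned.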

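Solving SR with a CNN does not require explicitly learning dictionaries, manifolds, or a model of the patch space; these are learned implicitly by the hidden layers. It also needs very little pre- or post-processing.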

SRCNN

SRCNN consists of the following operations:

patch extraction and representation
non-linear mapping
reconstruction

The paper's network-structure figure shows these three stages in sequence.

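Patch extraction and representation is a convolutional layer that produces n1 feature maps. In the non-linear mapping stage, we want to map each n1-dimensional vector to an n2-dimensional vector, which can be done with 1×1 kernels. The paper says: "It is possible to add more convolutional layers (whose spatial supports are 1 × 1) to increase the non-linearity. But this can significantly increase the complexity of the model, and thus demands more training data and time." In other words, extra layers could add non-linearity (?), but they make the model more complex and so need more data and training time, which is why the paper uses only a single mapping layer. The final layer convolves back to the original number of channels c, completing the three-layer formulation.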
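The three layers can be written out in the paper's notation. Here Y is the bicubic-upscaled low-resolution input and * denotes convolution; the ReLU max(0, ·) and the filter shapes listed afterwards are taken from the paper, not from the notes above.

```latex
\begin{aligned}
F_1(Y) &= \max\bigl(0,\; W_1 * Y + B_1\bigr), \\
F_2(Y) &= \max\bigl(0,\; W_2 * F_1(Y) + B_2\bigr), \\
F(Y)   &= W_3 * F_2(Y) + B_3 .
\end{aligned}
```

W_1 holds n1 filters of size c × f1 × f1, W_2 holds n2 filters of size n1 × 1 × 1, and W_3 holds c filters of size n2 × f3 × f3; B_1, B_2, B_3 are the matching bias vectors. The last layer has no non-linearity.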

Relationship to Sparse-Coding based Method

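The basic idea of SC-based SR is to extract patches from the low-resolution (LR) image and normalize them, project them onto an LR dictionary to obtain coefficients, and then decode them back with an HR dictionary. In the CNN, the first layer plays the role of the LR dictionary, with the filters acting as its atoms. The non-linear mapping then corresponds to the sparse coding solver: in SC methods, once the n1 coefficients are obtained, the solver maps them to n2 coefficients (usually n1 = n2 in SC). Finally, the reconstruction layer corresponds to synthesis with the high-resolution dictionary.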

The paper says: "Our non-linear operator can be considered as a pixel-wise fully-connected layer". Because the kernel is 1×1, it really is a fully-connected layer across channels, applied at every pixel. SC-based methods do not optimize every step: "But not all operations have been considered in the optimization in the sparse-coding-based SR methods. On the contrary, in our convolutional neural network, the low-resolution dictionary, high-resolution dictionary, non-linear mapping, together with mean subtraction and averaging, are all involved in the filters to be optimized." So the CNN can optimize all of the steps jointly.
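The "pixel-wise fully-connected layer" reading of a 1×1 convolution can be checked with a small pure-Python sketch. The tensors, weights, and function names below are made up for illustration; the ReLU follows the paper's second-layer definition.

```python
def relu(x):
    """Rectified linear unit on a scalar."""
    return x if x > 0 else 0.0

def conv1x1(feature_maps, weights, biases):
    """1x1 convolution + ReLU.

    feature_maps: n1 planes of size H x W (nested lists)
    weights:      n2 rows of n1 channel weights
    biases:       n2 values
    Returns n2 planes of size H x W.
    """
    n1 = len(feature_maps)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [[relu(sum(weights[j][i] * feature_maps[i][y][x] for i in range(n1)) + biases[j])
          for x in range(w)]
         for y in range(h)]
        for j in range(len(weights))
    ]

def fc_at_pixel(vec, weights, biases):
    """Fully-connected layer + ReLU applied to one pixel's n1-dim channel vector."""
    return [relu(sum(row[i] * vec[i] for i in range(len(vec))) + b)
            for row, b in zip(weights, biases)]

# Made-up example: n1 = 3 input maps, n2 = 2 output maps, 2 x 2 spatial size.
feature_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, -1.0], [1.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
weights = [[1.0, 0.0, -1.0], [0.5, 2.0, 0.0]]
biases = [0.0, 1.0]

out = conv1x1(feature_maps, weights, biases)

# Compare against the per-pixel fully-connected layer at every location.
matches = all(
    [out[j][y][x] for j in range(2)]
    == fc_at_pixel([feature_maps[i][y][x] for i in range(3)], weights, biases)
    for y in range(2) for x in range(2)
)
```

Every pixel is transformed by the same n2 × n1 weight matrix, which is exactly what sharing a fully-connected layer across spatial positions means.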

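The analogy with SC can also guide hyper-parameter tuning; the paper's typical setting is f1 = 9, f3 = 5, n1 = 64, n2 = 32.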

Others

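The loss function is MSE. Using MSE favors a high PSNR, as the formula relating PSNR to MSE makes clear. PSNR is only partially correlated with perceptual quality, though, so if a better differentiable loss function is available, it can replace MSE within this framework, which is something traditional methods cannot easily do.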
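The PSNR/MSE relation is PSNR = 10 · log10(MAX² / MSE), so lowering MSE always raises PSNR. A minimal sketch, assuming 8-bit images (peak value 255); the function name is just for illustration:

```python
import math

def psnr_from_mse(mse, max_value=255.0):
    """PSNR in dB for a given mean squared error.

    max_value is the peak pixel value; 255 assumes 8-bit images.
    """
    if mse <= 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)

# Reducing MSE by a factor of 10 raises PSNR by 10 dB.
gain_db = psnr_from_mse(10.0) - psnr_from_mse(100.0)
```

Because the mapping is monotonic, minimizing MSE and maximizing PSNR select the same model, which is why an MSE-trained network tends to do well on PSNR specifically.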

Training uses 91 images; Set5 and Set14 are then used for evaluation at different upscaling factors.

Image synthesis: "To synthesize the low-resolution samples {Y_i}, we blur a sub-image by a proper Gaussian kernel, sub-sample it by the upscaling factor, and upscale it by the same factor via bicubic interpolation." The paper calls the training patches sub-images, because unlike patches they need no overlapping and averaging: "we mean these samples are treated as small 'images' rather than 'patches', in the sense that 'patches' are overlapping and require some averaging as post-processing but 'sub-images' need not." The sub-images are all 32×32.

Following [20], we only consider the luminance channel (in YCrCb color space) in our experiments, so c = 1 in the first/last layer. The two chrominance channels are bicubic upsampled only for the purpose of displaying, but not for training/testing.

A CNN can handle multiple channels; the authors say they use only the luminance channel for a fair comparison with earlier SC methods. To avoid border effects, no padding is used, so the output patch is 20×20.
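The 32×32 to 20×20 shrinkage follows from unpadded convolution: each layer with filter size f removes f − 1 pixels per dimension. A quick check, assuming the paper's typical filter sizes f1 = 9, f2 = 1, f3 = 5 (the helper name is illustrative):

```python
def valid_output_size(input_size, filter_sizes):
    """Spatial size after a stack of unpadded, stride-1 convolutions."""
    size = input_size
    for f in filter_sizes:
        size -= f - 1  # an f x f filter without padding trims f - 1 pixels
        if size <= 0:
            raise ValueError("input too small for this filter stack")
    return size

output_size = valid_output_size(32, [9, 1, 5])  # 32 - 8 - 0 - 4
```

During training, the loss is therefore computed against the central 20×20 region of each ground-truth sub-image.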

On the learning rate: "We empirically find that a smaller learning rate in the last layer is important for the network to converge (similar to the denoising case [12])."

Training on the much larger ImageNet data gives better results than training on the 91 images.

On the number of filters, i.e. the number of feature maps: more feature maps improve performance, but if speed matters, fewer filters should be used and can still give decent results. On filter size: "This suggests that a reasonably larger filter size could grasp richer structural information, which in turn lead to better results. However, the deployment speed will also decrease with a larger filter size. Therefore, the choice of the network scale should always be a trade-off between performance and speed." Larger filters give slightly better results.
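The cost side of this trade-off can be made concrete by counting weights. The sketch below assumes c = 1, n1 = 64, n2 = 32, and compares a 9-1-5 filter configuration with an enlarged 11-1-7 one. These numbers are my reading of the paper's settings rather than something stated in the notes above, and biases are ignored.

```python
def srcnn_weight_count(c, filter_sizes, n1, n2):
    """Number of convolution weights in a three-layer SRCNN (biases excluded)."""
    f1, f2, f3 = filter_sizes
    layer1 = f1 * f1 * c * n1    # patch extraction: c -> n1 channels
    layer2 = f2 * f2 * n1 * n2   # non-linear mapping: n1 -> n2 channels
    layer3 = f3 * f3 * n2 * c    # reconstruction: n2 -> c channels
    return layer1 + layer2 + layer3

small = srcnn_weight_count(1, (9, 1, 5), 64, 32)
large = srcnn_weight_count(1, (11, 1, 7), 64, 32)
```

The per-pixel multiply count scales with the same quantity, so larger filters or more feature maps directly slow down deployment.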

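One of the comparison figures against various traditional methods is worth a look. It seems that once PSNR is high enough, small differences in PSNR no longer track differences in visual / perceptual image quality exactly, because the HVS is not equally sensitive to all details and positions, and PSNR does not capture this.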

Reference:
Dong, Chao, Chen Change Loy, Kaiming He, and Xiaoou Tang. "Learning a Deep Convolutional Network for Image Super-Resolution." In Computer Vision – ECCV 2014, 184–99. Lecture Notes in Computer Science. Springer, Cham, 2014. https://doi.org/10.1007/978-3-319-10593-2_13.

2018/01/24

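Everyone in the world seeks long life, not realizing that long life is right before their eyes. I have learned Wanqiu's plain and simple method: just eating congee is the way to become an immortal. (Lu You)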
