1. Authors

Lv Tang, Bo Li, Yijie Zhong, Shouhong Ding, Mofei Song; Youtu Lab

2. Links

2.1 Paper link

https://arxiv.org/abs/2108.03551

2.2 Code link

https://github.com/luckybird1994/HQSOD

3. Abstract

Aiming at discovering and locating the most distinctive objects in visual scenes, salient object detection (SOD) plays an essential role in various computer vision systems. In the era of high resolution, SOD methods face new challenges. The major limitation of previous methods is that they try to identify the salient regions and estimate accurate object boundaries simultaneously with a single regression task at low resolution. This practice ignores the inherent difference between the two difficult problems, resulting in poor detection quality. In this paper, we propose a novel deep learning framework for the high-resolution SOD task, which disentangles the task into a low-resolution saliency classification network (LRSCN) and a high-resolution refinement network (HRRN). As a pixel-wise classification task, LRSCN is designed to capture sufficient semantics at low resolution to identify the definite salient regions (most pixels inside the salient object have the highest saliency value), the background regions (most pixels in the background have the lowest saliency value), and the uncertain regions (saliency values of the pixels at blurry object boundaries fluctuate between 0 and 1). HRRN is a regression task, which aims at accurately refining the saliency value of pixels in the uncertain region to preserve a clear object boundary at high resolution with limited GPU memory. It is worth noting that by introducing uncertainty into the training process, our HRRN can address the high-resolution refinement task without using any high-resolution training data. Extensive experiments on high-resolution saliency datasets as well as some widely used saliency benchmarks show that the method achieves superior performance compared to the state-of-the-art methods.

4. Main Content

4.1 Main Contributions

We provide a new perspective that high-resolution salient object detection should be disentangled into two tasks, and demonstrate that this disentanglement is essential for improving the performance of DNN-based SOD models.
Motivated by the principle of disentanglement, we propose a novel framework for high-resolution salient object detection, which uses LRSCN to capture sufficient semantics at low resolution and HRRN for accurate boundary refinement at high resolution.
We make the earliest efforts to introduce uncertainty into SOD network training, which empowers HRRN to address the high-resolution refinement task without any high-resolution training datasets.
We perform extensive experiments to demonstrate that the proposed method improves on the SOTA performance on high-resolution saliency datasets as well as some widely used saliency benchmarks by a large margin.

4.2 Network Architecture (VGG-16)

4.3 MECF and SGA

4.3.1 ME (based on the Global Convolutional Network, GCN)

4.3.2 CF (uses a cross-level feature fusion module)

4.3.3 SGA (guarantees the alignment of the trimap and the saliency map)

4.4 LOSS

4.4.1 LRSCN Loss

$T^{gt}$: ground-truth trimap

$$
T^{gt}(x, y) =
\begin{cases}
2, & (x, y) \in \text{definite salient region} \\
0, & (x, y) \in \text{definite background region} \\
1, & (x, y) \in \text{uncertain region}
\end{cases}
$$

$$
L_{trimap} = \frac{1}{N} \sum_i -\log\left(\frac{e^{T_i}}{\sum_j e^{T_j}}\right)
$$

$$
L_{LRSCN} = L_{saliency} + L_{trimap}
$$

$L_{saliency}$: BCE + SSIM + F-measure
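As a rough illustration of the trimap term, the sketch below computes a mean softmax cross-entropy over pixels in plain Python. It assumes that $T_i$ in $L_{trimap}$ is the score the network assigns to the ground-truth class of pixel $i$ and that $j$ runs over the three trimap classes (0 = background, 1 = uncertain, 2 = salient). The function name `trimap_cross_entropy` and the flat list layout are hypothetical and not taken from the paper or its code.

```python
import math


def trimap_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over N pixels (illustrative sketch).

    logits: one [background, uncertain, salient] score triple per pixel.
    labels: ground-truth trimap class per pixel (0, 1 or 2).
    """
    total = 0.0
    for scores, label in zip(logits, labels):
        # Subtract the max score before exponentiating for numerical stability.
        m = max(scores)
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
        # -log(softmax(scores)[label]) equals log_sum_exp - scores[label].
        total += log_sum_exp - scores[label]
    return total / len(labels)


# Uniform scores carry no information, so the loss equals log(3).
print(trimap_cross_entropy([[0.0, 0.0, 0.0]], [2]))
# A confident, correct prediction gives a loss close to zero.
print(trimap_cross_entropy([[0.0, 0.0, 10.0]], [2]))
```

In a real training setup this term would typically come from a framework's built-in cross-entropy loss. The loop form is used here only to make the correspondence with the formula explicit.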

4.4.2 HRRN Loss

The uncertainty loss down-weights the loss in the uncertain region, so that the network ignores the effects of noisy data as much as possible.

$$
L_1 = \frac{1}{E} \sum_{i \in E} \left| S_i^H - G_i^H \right|
$$

$E$: number of pixels

$$
L_{uncertainty} = \frac{1}{U} \sum_{i \in U} \left( \frac{\left\| S_i^H - G_i^H \right\|^2}{2\sigma_i^2} + \frac{1}{2} \log \sigma_i^2 \right)
$$

$U$: total number of pixels in the uncertain region

$$
L_{HRRN} = L_{uncertainty} + L_1
$$
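The HRRN loss can be sketched in plain Python as follows. This is a minimal illustration under several assumptions that are not stated in the notes: $S^H$ is taken to be the predicted high-resolution saliency map and $G^H$ its ground truth, both flattened to lists; the network is assumed to output the log-variance $\log \sigma_i^2$ per pixel (a common choice for numerical stability); and the uncertain region is given as a boolean mask. All function and parameter names are hypothetical.

```python
import math


def l1_loss(pred, target):
    """Mean absolute error over all E pixels."""
    return sum(abs(p - g) for p, g in zip(pred, target)) / len(pred)


def uncertainty_loss(pred, target, log_var, uncertain_mask):
    """Variance-weighted squared error, averaged over the U uncertain pixels.

    log_var holds log(sigma_i^2) per pixel (assumed network output).
    """
    total, count = 0.0, 0
    for p, g, s, is_uncertain in zip(pred, target, log_var, uncertain_mask):
        if not is_uncertain:
            continue
        # A large predicted variance shrinks the residual term but is
        # penalised by the 0.5 * log(sigma^2) term.
        total += (p - g) ** 2 / (2.0 * math.exp(s)) + 0.5 * s
        count += 1
    return total / count if count else 0.0


def hrrn_loss(pred, target, log_var, uncertain_mask):
    """L_HRRN = L_uncertainty + L_1."""
    return uncertainty_loss(pred, target, log_var, uncertain_mask) + l1_loss(pred, target)


pred = [0.5, 1.0, 0.0]
target = [1.0, 1.0, 0.0]
log_var = [0.0, 0.0, 0.0]          # sigma^2 = 1 everywhere
uncertain_mask = [True, False, False]
print(hrrn_loss(pred, target, log_var, uncertain_mask))
```

With $\sigma_i^2 = 1$ the uncertainty term reduces to half the squared error. Raising the predicted variance for a noisy pixel lowers the weight of its residual, which matches the down-weighting behaviour described above.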

5. Evaluation Metrics

MAE
F-measure ($F_\beta$ and $F_\beta^{max}$)
Structure Measure
PR (precision-recall) curve
BDE (Boundary Displacement Error)
$B_\mu$
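For orientation, the first two metrics in the list above can be sketched in plain Python. This is not the paper's evaluation code. It assumes saliency maps flattened to lists of values in [0, 1], a binary ground truth, and the weighting $\beta^2 = 0.3$ that is commonly used in the SOD literature (the notes do not state the value). The function names are hypothetical.

```python
def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """F-beta score of the saliency map binarised at `threshold`."""
    binary = [p >= threshold for p in pred]
    true_pos = sum(1 for b, g in zip(binary, gt) if b and g >= 0.5)
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(binary)
    recall = true_pos / sum(1 for g in gt if g >= 0.5)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def max_f_measure(pred, gt, steps=255):
    """Best F-beta score over evenly spaced thresholds in [0, 1]."""
    return max(f_measure(pred, gt, t / steps) for t in range(steps + 1))


pred = [0.9, 0.6, 0.2, 0.1]
gt = [1, 0, 0, 0]
print(mae(pred, gt))
print(f_measure(pred, gt))       # threshold 0.5 also selects one false positive
print(max_f_measure(pred, gt))   # a higher threshold separates the classes
```

Sweeping the threshold is also how a PR curve is usually traced: each threshold yields one (precision, recall) pair.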

6. Conclusion

In this paper, we argue that there are two difficult and inherently different problems in high-resolution SOD. From this perspective, we propose a novel deep learning framework to disentangle high-resolution SOD into two tasks: LRSCN and HRRN. LRSCN identifies the definite salient, background and uncertain regions at low resolution with sufficient semantics, while HRRN accurately refines the saliency value of pixels in the uncertain region to preserve a clear object boundary at high resolution with limited GPU memory. We also make the earliest efforts to introduce uncertainty into SOD network training, which empowers HRRN to learn rich details without using any high-resolution training datasets. Extensive evaluations on high-resolution datasets and popular benchmark datasets not only verify the superiority of our method but also demonstrate the importance of disentanglement for SOD. We believe the novel disentanglement view in this work can contribute to other high-resolution computer vision tasks in the future.

(2021 ICCV) Disentangled High Quality Salient Object Detection (Class A)
