1. Motivation

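Keywords: overlapping, occlusion. Segmenting highly overlapping objects is challenging because there is usually no distinction between real object contours and occlusion boundaries.

Previous work has done little on the mask regression stage, and most objects in the COCO training data carry no occlusion information.

Mask R-CNN and its improved variants directly regress the occluded instance (occludee). This ignores the occluding instances (occluders) and the overlapping relations between objects.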

Segmenting highly-overlapping objects is challenging, because typically no distinction is made between real object contours and occlusion boundaries.

The mask head design of Mask R-CNN in Figure 3 directly regresses the occludee with a fully convolutional network, which neglects both the occluding instances and the overlapping relations between objects.

2. Related Work and Contribution

2.1 Related Work

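Amodal Instance Segmentation: work that also predicts the occluded (invisible) regions of an instance. Earlier papers mainly contributed additional annotations and datasets.

Occlusion Handling: earlier work that tackles the occlusion problem.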

2.2 Contribution

The paper proposes BCNet (Bilayer Convolutional Network). BCNet is built from GCN layers: the top GCN detects the occluder and the bottom GCN infers the occludee.
BCNet explicitly models the relationship between the two layers, decouples the boundaries of the occluder and the occludee, and lets the two interact during mask regression.

We model image formation as composition of two overlapping layers, and propose Bilayer Convolutional Network (BCNet), where the top GCN layer detects the occluding objects (occluder) and the bottom GCN layer infers partially occluded instance (occludee), and considers the interaction between them during mask regression.

BCNet works with both one-stage and two-stage detectors, and it improves results on both the COCO and KINS datasets.

We validate the efficacy of bilayer decoupling on both one-stage and two-stage object detectors with different backbones and network layer choices.

Despite its simplicity, extensive experiments on COCO and KINS show that our occlusion-aware BCNet achieves large and consistent performance gain especially for heavy occlusion cases.

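What I consider one novelty of the paper is the use of a GCN (graph convolutional network). The authors explain that they use a GCN because it can capture global, non-local relationships, which lets information propagate across pixels despite the presence of occluding objects.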

We utilize GCN in our implementation because GCN can consider the non-local relationship between pixels, allowing for propagating information across pixels despite the presence of occluding regions.

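Figure 1 is a simplified illustration of Bilayer Decoupling, showing the Top Layer and the Bottom Layer. The part where the two overlap is the invisible region of the occludee. BCNet models this region explicitly: the first GCN layer provides rich occlusion cues such as shape and location, and guides the segmentation of the occludee (also called the target).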

3. Method

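Figure 2 is a visual comparison between Mask R-CNN and similar networks and BCNet. BCNet's masks are more complete along occluded boundaries.

Figure 3 compares the network designs of Mask R-CNN and similar networks with that of BCNet.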

3.1 Architecture of BCNet

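The BCNet architecture is shown in Figure 4. It consists of three parts: backbone + FPN, the FCOS object detector, and BCNet for mask prediction.

Note that the occluder and the occludee are both inside the same ROI. The authors point out that this makes the final segmentation results more explainable.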

The explicit bilayer occluder-occludee relational modeling within the same ROI also makes our final segmentation results more explainable than previous methods.

3.2 Work flow

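Work flow: given an image, the backbone + FPN extracts features, and FCOS predicts the bboxes and classes. ROI crop is then applied, and the cropped ROI feature is fed into BCNet for mask prediction. The first GCN layer detects both the contour and the mask of the occluder in order to model the occluder region, and its output is added element-wise to the ROI feature (a residual connection). Guided by this occlusion-aware feature, the second GCN outputs the contour and mask of the partially occluded occludee.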

3.3 Bilayer Occluder-Occludee Modeling

3.3.1 Bilayer GCN Structure for Instance Segmentation

Because of its global (non-local) nature, the GCN is used as the basic block of BCNet. Each node in the graph represents one pixel of the feature map.

The input is $X \in R^{N \times K}$, where $N = H \times W$ and $K$ is the number of channels;
$A \in R^{N \times N}$ is the adjacency matrix between pixels, defined through feature similarity;
$W_g \in R^{K \times K'}$ is a learnable weight matrix, with $K = K'$ here, so the output $Z$ keeps the same shape $N \times K$;
$\sigma(\cdot)$ denotes the non-linear transformation, ReLU together with normalization;
there is also a residual connection.

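How is the adjacency matrix A defined? The authors use dot-product similarity between every pair of nodes $X_i$ and $X_j$:

Equation 2 defines a function F of $X_i$ and $X_j$. Equation 3 shows that F is the product of two trainable transformation functions $\theta$ and $\phi$, each implemented as a 1x1 convolution, so that a high-confidence edge between two nodes corresponds to a larger feature similarity (strongly connected nodes are more similar). After thinking about it, this feels like just the authors' packaging of a GCN. As shown in the figure below, the W matrix maps the input to (N, K), the A matrix has shape (N, N), the two are combined by a matrix multiplication, and a residual is added at the end.

Isn't this really just three convs, followed by matrix multiplications to change the dimensions?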
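To make the "three convs plus matrix multiplications" reading concrete, here is a minimal NumPy sketch of one such block, following the symbol definitions above: Z = sigma(A X W_g) + X, with A built from the dot-product similarity of theta(X) and phi(X). It rests on several assumptions: A is normalised with a row-wise softmax, each 1x1 convolution is treated as a per-pixel matrix multiply, and the normalization inside sigma is left out. The function name, weight names and ROI size are illustrative, not the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gcn_block(X, W_theta, W_phi, W_g):
    """X: (N, K) node features, one row per feature-map pixel."""
    theta = X @ W_theta           # (N, K): a 1x1 conv is a per-pixel linear map
    phi = X @ W_phi               # (N, K)
    A = softmax(theta @ phi.T)    # (N, N): assumed row-wise softmax of similarities
    Z = np.maximum(A @ X @ W_g, 0.0) + X   # ReLU, then the residual connection
    return Z, A

rng = np.random.default_rng(0)
H, W, K = 14, 14, 8               # hypothetical ROI feature size
X = rng.standard_normal((H * W, K))
W_theta, W_phi, W_g = (0.1 * rng.standard_normal((K, K)) for _ in range(3))
Z, A = gcn_block(X, W_theta, W_phi, W_g)
print(Z.shape, A.shape)           # (196, 8) (196, 196)
```

Written this way, the block looks very much like a non-local (self-attention) layer: the only N x N object is A, and the output keeps the input shape, which is what allows the residual addition.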

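The two GCNs are then linked through the ROI feature and the output Z of the first GCN layer, as shown in Equations 6, 5 and 4.

The authors argue that the bilayer GCN structure builds a semantic graph space for the occluded region, so that a pixel inside the occluded region can hold two different states at the same time in the bilayer graph.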
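The data flow between the two layers can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes the first layer's output Z0 passes through one extra linear map (called `W_f` here, a hypothetical name) before being added element-wise to the ROI feature, and it reuses the softmax-normalised GCN block assumed earlier. The contour and mask prediction heads on top of Z0 and Z1 are omitted.

```python
import numpy as np

def gcn(X, params):
    """One GCN block: sigma(A X W_g) + X, with A from dot-product similarity."""
    W_theta, W_phi, W_g = params
    S = (X @ W_theta) @ (X @ W_phi).T          # (N, N) pairwise similarities
    S = S - S.max(axis=1, keepdims=True)       # numerical stability
    A = np.exp(S)
    A /= A.sum(axis=1, keepdims=True)          # assumed row-wise softmax
    return np.maximum(A @ X @ W_g, 0.0) + X

def bilayer_forward(X_roi, params0, W_f, params1):
    Z0 = gcn(X_roi, params0)       # first layer: models the occluder
    X_f = Z0 @ W_f + X_roi         # occlusion-aware feature (element-wise add)
    Z1 = gcn(X_f, params1)         # second layer: infers the occludee
    return Z0, X_f, Z1

rng = np.random.default_rng(0)
N, K = 14 * 14, 8                  # hypothetical ROI size and channel count

def make_params():
    return tuple(0.1 * rng.standard_normal((K, K)) for _ in range(3))

p0, p1 = make_params(), make_params()
W_f = 0.1 * rng.standard_normal((K, K))
X_roi = rng.standard_normal((N, K))
Z0, X_f, Z1 = bilayer_forward(X_roi, p0, W_f, p1)
print(Z0.shape, X_f.shape, Z1.shape)
```

The second layer never sees the raw ROI feature alone: everything it receives has already been shifted by what the first layer learned about the occluder.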

3.3.2 Occluder-occludee Modeling

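Occluder boundary detection is trained with the loss in Equation 7:

Occluder mask detection is trained with the loss in Equation 8:

One thing worth noting, I think, is that GT_b and GT_s are the boundary and mask annotations from the dataset, respectively.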

3.4 End-to-end Parameter Learning

The total loss is shown in Equation 9 and consists of L_detect, L_occluder and L_occludee.
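As a rough sketch of how these terms could fit together, assume the boundary and mask losses are per-pixel binary cross-entropy against GT_b and GT_s, and that the total loss is a weighted sum of the three terms. Both are assumptions on my part, and the weights below are placeholders, not values from the paper.

```python
import math

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)        # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)

def occluder_loss(pred_boundary, gt_b, pred_mask, gt_s):
    # Assumed: boundary term plus mask term, equally weighted.
    return bce(pred_boundary, gt_b) + bce(pred_mask, gt_s)

def total_loss(l_detect, l_occluder, l_occludee, weights=(1.0, 1.0, 1.0)):
    # Placeholder weights; the actual balance is a hyperparameter.
    w1, w2, w3 = weights
    return w1 * l_detect + w2 * l_occluder + w3 * l_occludee

# Toy 2-pixel example: predictions are probabilities, labels are 0/1.
l_occ = occluder_loss([0.9, 0.1], [1, 0], [0.8, 0.2], [1, 0])
print(total_loss(l_detect=0.5, l_occluder=l_occ, l_occludee=0.3))
```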

3.4.1 Training

I don't fully understand this part: non-occluded ROIs are filtered out so that occlusion cases make up 50% of the balanced samples?

For training the first GCN layer of BCNet, since partial occlusion cases only occupy a small fraction compared to the complete objects in COCO, we filter out part of the non-occluded ROI proposals to keep occlusion cases taking up 50% for balance sampling.
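One way to read this sentence is as plain class rebalancing: keep every occluded ROI and randomly drop non-occluded ones until occluded ROIs make up half of the batch. The sketch below illustrates that reading only. How an ROI is flagged as occluded is not specified here, so `is_occluded` is taken as a given input, and the function name and fallback behaviour are hypothetical.

```python
import random

def balance_rois(rois, is_occluded, occ_fraction=0.5, seed=0):
    """Keep all occluded ROIs and subsample the non-occluded ones."""
    occ = [r for r, o in zip(rois, is_occluded) if o]
    non = [r for r, o in zip(rois, is_occluded) if not o]
    if not occ:
        return non                 # hypothetical fallback: nothing to balance
    # Non-occluded count needed so occluded ROIs form occ_fraction of the total.
    n_keep = int(len(occ) * (1 - occ_fraction) / occ_fraction)
    rng = random.Random(seed)
    kept = rng.sample(non, min(n_keep, len(non)))
    return occ + kept

rois = list(range(100))                      # 100 dummy ROI ids
flags = [i < 10 for i in range(100)]         # the first 10 are occluded
balanced = balance_rois(rois, flags)
print(len(balanced))                         # 20
```

With 10 occluded ROIs out of 100, this keeps those 10 plus 10 randomly chosen non-occluded ones, so the first GCN layer is not dominated by empty occluder targets.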

3.4.2 Inference

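At test time, the prediction is the occluded target from the second GCN layer (only 50 proposals are kept, coming from FCOS). The role of the first GCN layer is to produce the occlusion-aware feature that is fed to the second GCN layer (the residual part).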

4. Experiment

4.1 Experimental Setup

The authors build COCO-OCC: 1,005 images extracted from the validation set, in which the overlapping ratio between object bboxes is at least 0.2.

For further investigating segmentation performance with occlusion handling, we propose a subset split, called COCO-OCC, which contains 1,005 images extracted from the validation set (5k images) where the overlapping ratio between the bounding boxes of objects is at least 0.2.
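The quoted sentence does not say how the overlapping ratio is computed. As an illustration only, the sketch below assumes it is the IoU between two bounding boxes and selects an image when any pair of boxes reaches the threshold. Boxes are written as (x1, y1, x2, y2), whereas COCO annotations store (x, y, w, h), so a conversion would be needed in practice. The function names are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0                            # boxes do not intersect
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def has_heavy_overlap(boxes, thresh=0.2):
    """True if any pair of boxes in the image overlaps by at least thresh."""
    return any(
        iou(boxes[i], boxes[j]) >= thresh
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
    )

print(has_heavy_overlap([(0, 0, 10, 10), (5, 0, 15, 10)]))    # True, IoU = 1/3
print(has_heavy_overlap([(0, 0, 10, 10), (9, 9, 20, 20)]))    # False
```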

The authors also build a Synthetic Occlusion Dataset.

We synthesize a large-scale instance segmentation dataset which contains 100k images following uniform class distribution for instances among the 80 categories in COCO.

Each synthetic image has true and complete object contours for both occluding and partially occluded objects.

4.2 Ablation Study

4.2.1 Effect of Explicit Occlusion Modeling

4.2.2 Effect of Bilayer Occluder-occludee Modeling

4.2.3 Using FCN or GCN?

4.2.4 Influence of Object Detector

4.3 Performance Comparison and Analysis

4.3.1 Comparison with SOTA Methods

4.3.2 Comparison with Amodal Segmentation Methods

4.3.3 Evaluation on Occluded Images

4.3.4 Qualitative Evaluation

[BCNet] Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers(CVPR. 2021)