Paper reading: Detecting Visual Relationships Using Box Attention (ICCV19)

The idea of this paper is quite simple: an object detection network plus a box attention input.

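Take the figure above as an example. If the attention map is empty, the model detects all the subjects in the image. If the attention map attends to the person on the right, the model finds the objects (bbox and class) that interact with the subject indicated by the attention map, along with the predicates (class). The same goes for an attention map that attends to the person on the left.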

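So what exactly is the attention map?
The attention map is a binary image with the same size as the original image and 3 channels. The first channel marks the subject bbox on the image. If the first channel is empty, the second channel is all ones and the third channel is all zeros. If the first channel is not empty, it is the other way around: the second channel is all zeros and the third channel is all ones.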
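
To make the three channels concrete, here is a minimal NumPy sketch of the construction described above. The function name `box_attention_map`, the `(height, width, 3)` layout, and the `(x_min, y_min, x_max, y_max)` box format with exclusive upper bounds are my own assumptions for illustration, not something taken from the paper's code.

```python
import numpy as np


def box_attention_map(height, width, subject_box=None):
    """Build a 3-channel binary attention map of shape (height, width, 3).

    subject_box is (x_min, y_min, x_max, y_max) in pixels, with x_max and
    y_max exclusive (an assumed convention), or None for the empty map.
    """
    attention = np.zeros((height, width, 3), dtype=np.uint8)
    if subject_box is None:
        # Empty map: channel 0 stays all zeros, channel 1 is all ones,
        # channel 2 stays all zeros.
        attention[..., 1] = 1
    else:
        x_min, y_min, x_max, y_max = subject_box
        # Channel 0: ones inside the subject bbox, zeros elsewhere.
        attention[y_min:y_max, x_min:x_max, 0] = 1
        # Non-empty map: channel 1 stays all zeros, channel 2 is all ones.
        attention[..., 2] = 1
    return attention


empty = box_attention_map(4, 6)
person = box_attention_map(4, 6, subject_box=(1, 0, 3, 2))
```

Here `empty` is the map used to ask for all subjects, and `person` is the map used to ask for the objects and predicates of one particular subject.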

Adding the attention map to the object detection network is also simple:

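Training:
If an image contains k subjects, first make k copies of the image. Each copy is paired with the attention map of one subject, and the objects and predicates related to that subject serve as the gt. That gives k training samples. Then make one more copy of the image, pair it with the empty attention map, and use all the subjects as the gt. That is the (k+1)-th training sample.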
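
The expansion into k+1 samples can be sketched roughly as below. This is only an illustration in plain Python: the function `build_training_samples`, the dictionary keys, and the example annotations are all hypothetical, and `attention_box` stands in for the attention map that would be rendered from it (`None` meaning the empty map).

```python
def build_training_samples(image, relationships):
    """Expand one annotated image into k + 1 training samples.

    relationships is a list of dicts with hypothetical keys:
    subject_box, subject_label, predicate, object_box, object_label.
    Boxes are tuples so they can be used as dictionary keys.
    """
    # Group the annotations by subject (box + label identifies a subject).
    by_subject = {}
    for rel in relationships:
        key = (rel["subject_box"], rel["subject_label"])
        by_subject.setdefault(key, []).append(rel)

    samples = []
    # Samples 1..k: one subject's attention map, its objects and predicates as gt.
    for (subject_box, _), rels in by_subject.items():
        targets = [(r["object_box"], r["object_label"], r["predicate"]) for r in rels]
        samples.append({"image": image, "attention_box": subject_box, "targets": targets})

    # Sample k + 1: empty attention map, all subjects as gt.
    all_subjects = [(box, label) for box, label in by_subject]
    samples.append({"image": image, "attention_box": None, "targets": all_subjects})
    return samples


relationships = [
    {"subject_box": (10, 20, 110, 220), "subject_label": "person",
     "predicate": "holds", "object_box": (60, 90, 100, 130), "object_label": "cup"},
    {"subject_box": (10, 20, 110, 220), "subject_label": "person",
     "predicate": "sits on", "object_box": (0, 150, 150, 300), "object_label": "chair"},
    {"subject_box": (200, 30, 300, 230), "subject_label": "person",
     "predicate": "holds", "object_box": (240, 100, 270, 140), "object_label": "phone"},
]
samples = build_training_samples("image.jpg", relationships)
```

With two distinct subjects in the toy annotations, `samples` holds three entries: one per subject, plus the final one with the empty attention map.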

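Testing:
First feed the image and the empty attention map into the model, which outputs the subject bboxes and subject classes. Then build an attention map from each subject bbox and run the model once more to get the bboxes and classes of the objects related to that subject, together with the predicate classes. Finally, multiply the confidences of the subject, predicate, and object; the triplet with the highest score is the final result.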
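
The two-pass procedure above might look something like this. Everything here is a sketch under assumptions: `model(image, attention_box)` is a hypothetical callable (not the paper's API), its return format is made up for the example, and `toy_model` is a stub with hard-coded detections so the snippet runs on its own.

```python
def detect_relationships(model, image):
    """Two-pass inference sketch: subjects first, then objects per subject.

    model(image, attention_box) is a hypothetical callable. With
    attention_box=None it returns (box, label, score) subject detections.
    With a subject box it returns (box, label, predicate, object_score,
    predicate_score) tuples for the objects related to that subject.
    """
    triplets = []
    # Pass 1: empty attention map, detect all subjects.
    for subj_box, subj_label, subj_score in model(image, None):
        # Pass 2: attention map built from this subject's bbox.
        for obj_box, obj_label, predicate, obj_score, pred_score in model(image, subj_box):
            # Final score is the product of the three confidences.
            score = subj_score * pred_score * obj_score
            triplets.append((score, subj_label, predicate, obj_label, subj_box, obj_box))
    # Highest combined confidence first.
    triplets.sort(key=lambda t: t[0], reverse=True)
    return triplets


def toy_model(image, attention_box):
    """Stub detector with fixed outputs, for illustration only."""
    if attention_box is None:
        return [((0, 0, 50, 100), "person", 0.9), ((60, 0, 110, 100), "person", 0.8)]
    if attention_box == (0, 0, 50, 100):
        return [((20, 40, 40, 60), "cup", "holds", 0.5, 0.5)]
    return [((70, 40, 90, 60), "phone", "holds", 0.5, 1.0)]


ranked = detect_relationships(toy_model, "image.jpg")
best = ranked[0]
```

In this toy run the second person has the lower subject confidence, yet its triplet ranks first because the product of all three confidences is larger.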

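------------------------------------Some ramblings---------------------------------------
My senior labmate already went back to the lab today QAQ
I don't want to go back that early
I'd like to lie low a little longer.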

The day after tomorrow I'm going to see 这个杀手不太冷静 (Too Cool to Kill)
Surely this one can't turn out to be a dud.

---------------------------2022.02.14-------------------------
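A belated review: it really was great
The university has postponed our return to campus
Right now I'm feeling rather torn
I want to go back early, and at the same time I don't.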
