Paper Read: Robust Deep Multi-modal Learning Based on Gated Information Fusion Network

2018-07-27 14:25:26

Paper: https://arxiv.org/pdf/1807.06233.pdf

Related Papers:

1. Infrared and visible image fusion methods and applications: A survey.

2. Chenglong Li, Xiao Wang, Lei Zhang, Jin Tang, Hejun Wu, and Liang Lin. WELD: Weighted Low-rank Decomposition for Robust Grayscale-Thermal Foreground Detection. IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 27(4): 725-738, 2017.

3. Chenglong Li, Xinyan Liang, Yijuan Lu, Nan Zhao, and Jin Tang. RGB-T Object Tracking: Benchmark and Baseline. arXiv.

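This paper addresses the multi-modal fusion problem and proposes a gate-based fusion strategy that adaptively fuses information from multiple modalities. The authors apply the method to object detection; the overall pipeline is shown in the paper's architecture figure.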

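As the architecture figure shows, the authors use two separate network branches to extract features from the two modalities. The network consists of a standard VGG-16 and 8 extra convolutional layers. In addition, the authors propose a new GIF (Gated Information Fusion) network to fuse information across the modalities and obtain better results. The motivation is that the modalities carry complementary information, but not all of it is equally useful: some of it helps a lot, while some is of poor quality and contributes little. Fusing the modalities adaptively should therefore give better results.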

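Gated Information Fusion Network (GIF):

As shown in Figure 2 of the paper:

The input to the GIF network is the pair of already-extracted CNN feature maps, here $F_1$ and $F_2$. These two feature maps are concatenated to obtain $F_G$. The network has two parts:

1. the information fusion network (the part of Figure 2 outside the dashed box);

2. the weight generation network (WG Network, i.e. the dashed box in Figure 2).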

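The Weight Generation Network applies two $3 \times 3 \times 1$ convolution kernels to the concatenated feature map $F_G$, one per modality. Each result is then passed through a sigmoid function (the gate layer), which outputs the corresponding weights $w_1$ and $w_2$.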
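The weight generation step can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: the helper names, the zero padding, the channel-first layout, the absence of bias terms, and the random kernels are all assumptions made for the example.

```python
import numpy as np


def sigmoid(x):
    """Logistic function used as the gate layer."""
    return 1.0 / (1.0 + np.exp(-x))


def conv3x3_to_one_channel(feature_map, kernel):
    """Zero-padded 3x3 convolution mapping (C, H, W) to one (H, W) map.

    kernel has shape (C, 3, 3). As in most deep learning libraries, this is
    a cross-correlation (the kernel is not flipped).
    """
    _, height, width = feature_map.shape
    padded = np.pad(feature_map, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((height, width))
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + height, dx:dx + width]  # (C, H, W)
            out += np.einsum("chw,c->hw", window, kernel[:, dy, dx])
    return out


def weight_generation(f1, f2, kernel1, kernel2):
    """Return the gate maps w1, w2 computed from the concatenated features."""
    f_g = np.concatenate([f1, f2], axis=0)  # F_G with 2k channels
    w1 = sigmoid(conv3x3_to_one_channel(f_g, kernel1))
    w2 = sigmoid(conv3x3_to_one_channel(f_g, kernel2))
    return w1, w2


# Hypothetical sizes and random (untrained) kernels, for illustration only.
rng = np.random.default_rng(0)
k, h, w = 4, 5, 6
f1 = rng.standard_normal((k, h, w))
f2 = rng.standard_normal((k, h, w))
kernel1 = 0.1 * rng.standard_normal((2 * k, 3, 3))
kernel2 = 0.1 * rng.standard_normal((2 * k, 3, 3))
w1, w2 = weight_generation(f1, f2, kernel1, kernel2)
print(w1.shape, w2.shape)
```

Each gate map has one value per spatial location, strictly between 0 and 1, so a modality can be down-weighted in some regions of the image and kept in others.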

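The information fusion network multiplies each original feature map element-wise by its weight to obtain the weighted feature maps, concatenates the two, and applies a $1 \times 1 \times 2k$ convolution kernel to produce the final feature map.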
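A minimal NumPy sketch of this fusion step is given below. It assumes channel-first feature maps of shape (k, H, W), gate maps of shape (H, W), and no bias or activation after the 1x1 convolution. The function name and the number of output channels are choices made for the example, not details taken from the paper.

```python
import numpy as np


def information_fusion(f1, f2, w1, w2, kernel_1x1):
    """Gate two feature maps, concatenate them, and mix channels with a 1x1 conv.

    f1, f2: (k, H, W) feature maps; w1, w2: (H, W) gate maps;
    kernel_1x1: (C_out, 2k) weights of the 1x1 convolution.
    """
    weighted1 = f1 * w1  # (H, W) gate broadcasts over the k channels
    weighted2 = f2 * w2
    stacked = np.concatenate([weighted1, weighted2], axis=0)  # (2k, H, W)
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    return np.einsum("oc,chw->ohw", kernel_1x1, stacked)


# Illustration: fully open the gate for modality 1 and close it for modality 2.
rng = np.random.default_rng(0)
k, h, w = 3, 4, 4
f1 = rng.standard_normal((k, h, w))
f2 = rng.standard_normal((k, h, w))
w1 = np.ones((h, w))
w2 = np.zeros((h, w))
# Hypothetical 1x1 kernel that simply adds the two gated halves together.
kernel_1x1 = np.concatenate([np.eye(k), np.eye(k)], axis=1)  # (k, 2k)
fused = information_fusion(f1, f2, w1, w2, kernel_1x1)
print(np.allclose(fused, f1))
```

With the second gate closed, the fused output depends only on the first modality, which is the behavior the gating is meant to provide when one input is unreliable.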

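To summarize, the whole process can be written as:

$F_G = [F_1, F_2]$

$w_i = \sigma(C_{w_i} * F_G), \quad i = 1, 2$

$F_{out} = C_J * [w_1 \odot F_1, \; w_2 \odot F_2]$

Here $[\cdot, \cdot]$ denotes channel-wise concatenation, $*$ convolution, $\odot$ element-wise multiplication, and $\sigma$ the sigmoid. The symbols $C_{w_1}$ and $C_{w_2}$ stand for the two $3 \times 3 \times 1$ kernels of the weight generation network, $C_J$ for the $1 \times 1 \times 2k$ kernel of the fusion network, and $F_{out}$ for the final feature map; these names are introduced here for convenience.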
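To keep track of tensor sizes through the steps described above, the following small helper traces the shape at each stage. It is only a bookkeeping sketch: it assumes the 3x3 convolutions preserve spatial size (for example through padding) and that the final 1x1 convolution outputs k channels unless told otherwise. Neither detail is stated in this note, and the dictionary keys are names chosen for the example.

```python
def gif_shapes(k, height, width, out_channels=None):
    """Trace (channels, height, width) through the GIF steps.

    k is the channel count of each input feature map. out_channels defaults
    to k, which is an assumption made for this sketch.
    """
    if out_channels is None:
        out_channels = k
    return {
        "F1": (k, height, width),
        "F2": (k, height, width),
        "F_G (concatenated input)": (2 * k, height, width),
        "w1 (gate)": (1, height, width),
        "w2 (gate)": (1, height, width),
        "weighted F1": (k, height, width),
        "weighted F2": (k, height, width),
        "concatenated weighted maps": (2 * k, height, width),
        "fused output": (out_channels, height, width),
    }


# Arbitrary example sizes.
for name, shape in gif_shapes(k=64, height=32, width=32).items():
    print(f"{name}: {shape}")
```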

== Done !
