Reference paper: "Attention gated networks: Learning to leverage salient regions in medical images"

Keywords: Attention Gate (AG) – automatically learns to focus on target structures

A network architecture that actively learns to focus attention on the objects of interest.

1. Introduction

  • AGs automatically learn to focus on target structures without additional supervision.
  • At test time, these gates generate soft region proposals implicitly on-the-fly and highlight salient features useful for a specific task.
  • Image-grid based gating.

1.1 Related work

Attention gates:
  • The idea of attention mechanisms is to generate a context vector that assigns a weight to each element of the input sequence.
Contributions:
  • proposing grid-based gating that allows attention gates to be more specific to local regions
  • We propose one of the use cases of soft-attention in a feed-forward CNN model applied to a medical imaging task that is end-to-end trainable
  • For classification, better performance with AGs over the baseline approach.
  • For segmentation, an extension to the standard U-net model is proposed that provides increased sensitivity without the need of complicated heuristics, while not sacrificing specificity.
  • We demonstrate that the proposed attention mechanism provides fine-scale attention maps that can be visualised, with minimal computational overhead, which helps with interpretability of predictions.

2. Methodology

2.1 Convolutional neural network

AGs progressively suppress feature responses in irrelevant background regions without the requirement to crop a ROI between networks

2.2 Attention gate module

Let $x^l = \{x_i^l\}_{i=1}^n$ be the activation map of a chosen layer $l \in \{1, \dots, L\}$, where each $x_i^l$ represents the pixel-wise feature vector of length $F_l$. For each $x_i^l$, the AG computes coefficients $\alpha^l = \{\alpha_i^l\}_{i=1}^n$, where $\alpha_i^l \in [0, 1]$, in order to preserve only the activations relevant to the specific task. The output of the AG is $\hat{x}^l = \{\alpha_i^l x_i^l\}_{i=1}^n$, where each feature vector is scaled by the corresponding attention coefficient.
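A minimal NumPy sketch of the gating operation defined above; the sizes ($n = 4$ pixels, $F_l = 3$ channels) and the values of the coefficients are illustrative, not from the paper:

```python
import numpy as np

# Activation map x^l: n pixels, each a feature vector of length F_l
n, F_l = 4, 3
x = np.arange(n * F_l, dtype=float).reshape(n, F_l)

# Attention coefficients alpha^l, one scalar in [0, 1] per pixel
alpha = np.array([0.0, 0.25, 0.5, 1.0])

# AG output x_hat^l: each feature vector scaled by its coefficient
x_hat = alpha[:, None] * x

print(x_hat.shape)  # (4, 3)
```

A coefficient of 0 suppresses a pixel's features entirely, while 1 passes them through unchanged, which is how irrelevant background responses get attenuated.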

Two commonly used attention types:
  • Multiplicative attention: faster to compute and more memory-efficient.
  • Additive attention: performs better for large-dimensional input features.

The authors take the feature map before flattening as the coarse-scale feature map (the gating signal g), and then apply the attention mechanism.

In the figure, the + in the first circle denotes additive attention, and the × in the second circle denotes element-wise multiplication.

Additive attention:

The linear transformations are computed using channel-wise 1×1×1 convolutions.

Channel-wise convolution: the number of output channels need not equal the number of input channels!
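This is easy to see once a 1×1×1 convolution is written out: it is just the same linear map over channels applied independently at every voxel. A NumPy sketch (channel counts and spatial sizes are illustrative):

```python
import numpy as np

# A 1x1x1 convolution is a per-voxel linear map over channels,
# so out_channels need not equal in_channels.
C_in, C_out = 16, 1         # e.g. reducing to a single attention channel
D, H, W = 2, 4, 4           # spatial dims of a 3D feature map

rng = np.random.default_rng(0)
feat = rng.normal(size=(C_in, D, H, W))
kernel = rng.normal(size=(C_out, C_in))   # the 1x1x1 kernel weights

# Apply the same channel mixing at every voxel
out = np.einsum('oc,cdhw->odhw', kernel, feat)
print(out.shape)  # (1, 2, 4, 4)
```

Mapping down to a single channel like this is exactly what is needed to produce one scalar attention coefficient per voxel.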
