Paper Reading: Channel Augmented Joint Learning for Visible-Infrared Recognition

code: https://gitee.com/mindspore/contrib/tree/master/papers/CAJ

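Motivation:

Existing image augmentation strategies are designed mainly for single-modality visible images and do not account for the image characteristics involved in visible-infrared matching.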

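Main contributions:

  1. Data augmentation: color-independent images are generated by randomly exchanging color channels. This can be combined with existing augmentation methods and improves robustness to color variation. Combined with random erasing, it also simulates random occlusion and enriches image diversity.
  2. For cross-modality metric learning, a channel-mixed learning strategy is proposed that uses a squared difference to handle intra-class and inter-class variations at the same time. It is further extended to a channel-augmented joint learning strategy that explicitly optimizes the outputs of the augmented images.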

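Image augmentation

  1. Random Channel Exchangeable Augmentation

    This augmentation can be understood as uniformly generating a three-channel visible image from one randomly chosen color channel. It encourages the model to learn the relationship between each color channel of the visible image and the single-channel infrared image.
  2. Channel-Level Random Erasing (CRE)

    The erased region is replaced with the mean values of the R, G and B channels computed on ImageNet.
  3. In addition, grayscale transformation (GA) and random horizontal flip (FP) are also used.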

The code is as follows:

    from __future__ import absolute_import
    import random
    import math


    class ChannelAdap():
        """Randomly replicate one color channel across all three channels.

        With equal probability the R, G or B channel is copied to the other
        two channels, or the image is left unchanged.

        Args:
            probability: Stored but currently unused, because the check that
                would use it is commented out.
        """

        def __init__(self, probability=0.5):
            self.probability = probability

        def __call__(self, img):
            # img has shape (3, H, W) in RGB order and is modified in place.
            # if random.uniform(0, 1) > self.probability:
            #     return img

            idx = random.randint(0, 3)  # inclusive bounds: 0, 1, 2 or 3

            if idx == 0:
                # Select the R channel.
                img[1, :, :] = img[0, :, :]
                img[2, :, :] = img[0, :, :]
            elif idx == 1:
                # Select the G channel.
                img[0, :, :] = img[1, :, :]
                img[2, :, :] = img[1, :, :]
            elif idx == 2:
                # Select the B channel.
                img[0, :, :] = img[2, :, :]
                img[1, :, :] = img[2, :, :]
            # idx == 3: keep the original image.

            return img


    class ChannelAdapGray():
        """Like ChannelAdap, but the fourth case may convert to grayscale.

        Args:
            probability: Probability of grayscale conversion when neither
                R, G nor B is selected; otherwise the image is unchanged.
        """

        def __init__(self, probability=0.5):
            self.probability = probability

        def __call__(self, img):
            idx = random.randint(0, 3)  # inclusive bounds: 0, 1, 2 or 3

            if idx == 0:
                # Select the R channel.
                img[1, :, :] = img[0, :, :]
                img[2, :, :] = img[0, :, :]
            elif idx == 1:
                # Select the G channel.
                img[0, :, :] = img[1, :, :]
                img[2, :, :] = img[1, :, :]
            elif idx == 2:
                # Select the B channel.
                img[0, :, :] = img[2, :, :]
                img[1, :, :] = img[2, :, :]
            elif random.uniform(0, 1) <= self.probability:
                # Weighted grayscale of the R, G and B channels.
                tmp_img = 0.2989 * img[0, :, :] + 0.5870 * img[1, :, :] + 0.1140 * img[2, :, :]
                img[0, :, :] = tmp_img
                img[1, :, :] = tmp_img
                img[2, :, :] = tmp_img

            return img


    class ChannelRandomErasing():
        """Randomly selects a rectangle region in an image and erases its pixels.

        'Random Erasing Data Augmentation' by Zhong et al.

        Args:
            probability: The probability that the Random Erasing operation will be performed.
            sl: Minimum proportion of erased area against input image.
            sh: Maximum proportion of erased area against input image.
            r1: Minimum aspect ratio of erased area.
        """

        def __init__(self, probability=0.5, sl=0.02, sh=0.4, r1=0.3):
            self.probability = probability
            self.mean = [0.4914, 0.4822, 0.4465]  # per-channel erasing value
            self.sl = sl
            self.sh = sh
            self.r1 = r1

        def __call__(self, img):
            if random.uniform(0, 1) > self.probability:
                return img

            # Try up to 100 times to sample a rectangle that fits in the image.
            for _ in range(100):
                area = img.shape[1] * img.shape[2]

                target_area = random.uniform(self.sl, self.sh) * area
                aspect_ratio = random.uniform(self.r1, 1 / self.r1)

                h = int(round(math.sqrt(target_area * aspect_ratio)))
                w = int(round(math.sqrt(target_area / aspect_ratio)))

                if w < img.shape[2] and h < img.shape[1]:
                    x1 = random.randint(0, img.shape[1] - h)  # top row
                    y1 = random.randint(0, img.shape[2] - w)  # left column
                    if img.shape[0] == 3:
                        img[0, x1:x1+h, y1:y1+w] = self.mean[0]
                        img[1, x1:x1+h, y1:y1+w] = self.mean[1]
                        img[2, x1:x1+h, y1:y1+w] = self.mean[2]
                    # TODO when will img.shape != 3
                    else:
                        img[0, x1:x1+h, y1:y1+w] = self.mean[0]
                    return img

            return img


    class ChannelExchange():
        """Replicate one randomly chosen channel, optionally allowing grayscale.

        Args:
            gray: Upper bound (inclusive) of the random index. With gray=2 only
                R, G or B is selected; with gray=3 grayscale is also possible.
        """

        def __init__(self, gray=2):
            self.gray = gray

        def __call__(self, img):
            idx = random.randint(0, self.gray)

            if idx == 0:
                # Select the R channel.
                img[1, :, :] = img[0, :, :]
                img[2, :, :] = img[0, :, :]
            elif idx == 1:
                # Select the G channel.
                img[0, :, :] = img[1, :, :]
                img[2, :, :] = img[1, :, :]
            elif idx == 2:
                # Select the B channel.
                img[0, :, :] = img[2, :, :]
                img[1, :, :] = img[2, :, :]
            else:
                # Weighted grayscale of the R, G and B channels.
                tmp_img = 0.2989 * img[0, :, :] + 0.5870 * img[1, :, :] + 0.1140 * img[2, :, :]
                img[0, :, :] = tmp_img
                img[1, :, :] = tmp_img
                img[2, :, :] = tmp_img
            return img
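The classes above write into `img` in place, which is easy to overlook when the same array is reused elsewhere. As a minimal sketch (not the authors' code), the channel-exchange idea can also be written as a pure function on a NumPy array. The name `exchange_channel` and the `(3, H, W)` RGB layout are assumptions made for illustration.

```python
import random

import numpy as np


def exchange_channel(img, rng=random):
    """Return a copy of a (3, H, W) image with one random channel replicated.

    Hypothetical helper for illustration; the input array is left untouched.
    """
    if img.ndim != 3 or img.shape[0] != 3:
        raise ValueError("expected an array of shape (3, H, W)")
    idx = rng.randint(0, 2)  # inclusive bounds: picks channel 0, 1 or 2
    # Repeat the selected channel three times along the channel axis.
    return np.repeat(img[idx:idx + 1], 3, axis=0)


rgb = np.random.rand(3, 4, 2)
aug = exchange_channel(rgb)
print(aug.shape)
print(np.array_equal(aug[0], aug[1]) and np.array_equal(aug[1], aug[2]))
```

Returning a copy keeps the original visible image available, which matters if the original and the augmented version are both needed in the same batch.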

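Cross-modality metric learning


1. Enhanced Channel-Mixed Learning
A batch is built from images of different modalities, and the relationships among them are optimized directly without regard to the modality gap. Two losses are optimized: the identity loss and the weighted regularization triplet loss.


Note that pj and pk may come from the same modality or from different modalities. This is presumably what the authors mean by "mixed": images are sampled at random from a batch composed of mixed modalities, so the modality difference is ignored and intra- and inter-modality learning are optimized directly.

Here d is the Euclidean distance.

The weighting strategy adaptively accounts for the contribution of each triplet and increases the contribution of hard samples (positive pairs with large distances and negative pairs with small distances), so that all triplets in the batch are fully used.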
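The weighting described above can be sketched in NumPy as follows. This is an illustrative reading rather than the paper's implementation: the function name `weighted_regularized_triplet` is hypothetical, self-pairs are excluded from the positives by assumption, and each identity is assumed to have at least two samples and at least one negative in the batch. The function never looks at the modality of a sample, which is the "mixed" aspect.

```python
import numpy as np


def weighted_regularized_triplet(features, labels):
    """Mean softplus of (weighted positive dist - weighted negative dist).

    Illustrative sketch: every anchor needs at least one positive and one
    negative in the batch, otherwise the softmax weights are undefined.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (N, N).
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    same = labels[:, None] == labels[None, :]
    is_pos = same & ~np.eye(len(labels), dtype=bool)  # same id, not itself
    is_neg = ~same

    def masked_softmax(scores, mask):
        # Softmax over the masked entries of each row; others get weight 0.
        scores = np.where(mask, scores, -np.inf)
        scores = scores - scores.max(axis=1, keepdims=True)
        exp = np.exp(scores)
        return exp / exp.sum(axis=1, keepdims=True)

    w_pos = masked_softmax(dist, is_pos)   # far positives weigh more
    w_neg = masked_softmax(-dist, is_neg)  # close negatives weigh more
    pos_term = (w_pos * dist).sum(axis=1)
    neg_term = (w_neg * dist).sum(axis=1)
    # softplus(x) = log(1 + exp(x)), computed stably.
    return np.logaddexp(0.0, pos_term - neg_term).mean()


feats = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 0.0], [3.1, 0.0]])
ids = np.array([0, 0, 1, 1])
print(weighted_regularized_triplet(feats, ids))
```

Because the weights are a softmax over distances, every positive and negative in the batch contributes, with hard pairs dominating, and no margin hyperparameter is needed.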

Enhanced Squared Difference
The commonly used form is the L1 difference; this paper adopts an enhanced squared difference instead.
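One way to read the squared difference, assuming `phi` denotes the weighted positive distance minus the weighted negative distance for an anchor: the L1 form applies softplus to `phi`, while the squared form applies it to `sign(phi) * phi**2`. The sketch below follows that reading; it is an assumption for illustration, and the function names are hypothetical.

```python
import numpy as np


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return np.logaddexp(0.0, x)


def l1_difference_loss(phi):
    """Softplus applied directly to the distance difference."""
    return softplus(phi)


def squared_difference_loss(phi):
    """Softplus applied to the signed square of the distance difference."""
    # Keeping the sign means satisfied triplets (phi < 0) still lower the loss.
    return softplus(np.sign(phi) * phi ** 2)


phi = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(l1_difference_loss(phi))
print(squared_difference_loss(phi))
```

Under this reading the two forms agree at `phi = 0` and `|phi| = 1`. For `|phi| > 1` the squared form penalizes violated triplets more heavily and well-separated triplets less, and the ordering reverses for `|phi| < 1`.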


The authors plot the loss curves to analyze the benefit of this choice:

Experimental results

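2. Channel-Augmented Joint Learning
The channel-augmented images are explicitly treated as an auxiliary modality, so a batch contains visible RGB images, channel-augmented images and infrared images at the same time. This enlarges the batch, while the classification and metric learning models are shared as before. The authors also tried using separate models, but this did not give better results.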
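The batch composition described here can be sketched as below. This is a hedged illustration, not the authors' training code: `build_joint_batch` is a hypothetical helper, both modalities are assumed to be stored as `(N, 3, H, W)` arrays, and the augmented copy is produced by simple channel replication.

```python
import numpy as np


def build_joint_batch(visible, infrared, labels_vis, labels_ir, rng=None):
    """Stack visible, channel-augmented and infrared images into one batch.

    visible / infrared: arrays of shape (N, 3, H, W). The augmented copy
    reuses the visible labels because augmentation keeps the identity.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Channel augmentation: replicate one random channel per visible image.
    idx = rng.integers(0, 3, size=len(visible))        # values in {0, 1, 2}
    picked = visible[np.arange(len(visible)), idx]     # (N, H, W)
    augmented = np.repeat(picked[:, None], 3, axis=1)  # (N, 3, H, W)

    images = np.concatenate([visible, augmented, infrared], axis=0)
    labels = np.concatenate([labels_vis, labels_vis, labels_ir], axis=0)
    return images, labels


vis = np.random.rand(4, 3, 8, 4)
ir = np.random.rand(4, 3, 8, 4)
ids = np.array([0, 0, 1, 1])
images, labels = build_joint_batch(vis, ir, ids, ids)
print(images.shape, labels.shape)
```

The stacked batch and labels can then be fed to a single shared network, with the identity loss and the triplet loss computed over all three groups together.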

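Experiments

  1. Effect of each augmentation method

  2. Performance of the squared difference

  3. Performance of different learning strategies

  4. Comparison with other methods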
