Overview

This work designs a training strategy that allows high-level knowledge to be distilled from a set of views depicting the target object. We propose Views Knowledge Distillation (VKD), which pins this visual variety as a supervision signal within a teacher-student framework, where the teacher educates a student that observes fewer views. As a result, the student not only outperforms its teacher but also reaches state-of-the-art performance on Image-To-Video tasks.

Paper link: https://link.springer.com/chapter/10.1007%2F978-3-030-58607-2_6
Code link: https://github.com/aimagelab/VKD


Table of Contents

  • Overview
  • Introduction
    • Motivation
    • VKD
    • Main Contributions
  • Related works
  • Method
    • Teacher Network
      • Set Representation.
      • Teacher Optimisation.
    • Views Knowledge Distillation (VKD)
      • Student Optimisation.
  • Experiments
    • Datasets
      • Person Re-ID
      • Vehicle Re-ID
      • Animal Re-ID
    • Self-distillation
    • Comparison with State-Of-The-Art
      • Image-To-Video.
      • Video-To-Video.
    • Analysis on VKD
      • In the Absence of Camera Information.
      • Distilling Viewpoints vs time.
      • VKD Reduces the Camera Bias.
      • Can Performance of the Student be Obtained Without Distillation?
      • Student Explanation.
      • Cross-distillation.
      • On the Impact of Loss Terms.
  • Conclusion

Introduction

Motivation

As observed in [10], a large gap in Re-ID performance still subsists between the Video-To-Video (V2V) and Image-To-Video (I2V) settings.

VKD

we propose Views Knowledge Distillation (VKD), which transfers the knowledge lying in several views in a teacher-student fashion. VKD devises a two-stage procedure, which pins the visual variety as a teaching signal for a student who has to recover it using fewer views.

Main Contributions

  • i) the student outperforms its teacher by a large margin, especially in the Image-To-Video setting;
  • ii) a thorough investigation shows that the student focuses more on the target compared to its teacher and discards uninformative details;
  • iii) importantly, we do not limit our analysis to a single domain, but instead achieve strong results on Person, Vehicle and Animal Re-ID.

Related works

  • Image-To-Video Re-Identification.
  • Knowledge Distillation

Method


Fig. 2: Overview of VKD. The student network is optimised to mimic the teacher's behaviour while observing only a few views.

Our proposal frames the training algorithm as a two-stage procedure, as follows (see the sketch after the two steps):

  • First step (Sect. 3.1): the backbone network is trained for the standard Video-To-Video setting.
  • Second step (Sect. 3.2): we appoint it as the teacher and freeze its parameters. Then, a new network with the role of the student is instantiated. As depicted in Fig. 2, we feed frames representing different views as input to the teacher and ask the student to mimic the same outputs from fewer frames.

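A minimal sketch of this two-stage schedule, assuming a generic `build_backbone()` constructor and user-supplied training callables; the names are illustrative and do not mirror the official repository's API:

```python
# Hypothetical sketch of the two-stage schedule (illustrative names only).
import copy
import torch

def train_two_stage(build_backbone, video_loader, teacher_step, student_step, m_frames=2):
    # First step: train the backbone on the standard Video-To-Video task.
    teacher = build_backbone()
    for frames, labels in video_loader:              # frames: (B, N, C, H, W)
        teacher_step(teacher, frames, labels)

    # Second step: freeze the teacher and instantiate the student.
    for p in teacher.parameters():
        p.requires_grad_(False)
    teacher.eval()
    student = copy.deepcopy(teacher)                 # one possible choice: start from the teacher's weights
    for p in student.parameters():
        p.requires_grad_(True)

    for frames, labels in video_loader:
        with torch.no_grad():
            teacher_out = teacher(frames)            # teacher observes all N views
        student_out = student(frames[:, :m_frames])  # student observes only M < N frames
        student_step(student, student_out, teacher_out, labels)
    return teacher, student
```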

Teacher Network

The network weights are initialised from ImageNet pre-training, with a few modifications to the architecture.

First, we discard the last ReLU activation function and the final classification layer in favour of a BNNeck. Second, to benefit from fine-grained spatial details, the stride of the last residual block is reduced from 2 to 1.
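As a rough illustration of these modifications (ImageNet initialisation, last stride set to 1, BNNeck), assuming a torchvision ResNet-50; this is a sketch, not the repository's exact model code:

```python
import torch.nn as nn
from torchvision import models

class TeacherBackbone(nn.Module):
    """ResNet-50 with ImageNet weights, last stride 1 and a BNNeck head (illustrative sketch)."""
    def __init__(self, num_classes: int, feat_dim: int = 2048):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        # Reduce the stride of the last residual block from 2 to 1 to keep finer spatial detail.
        resnet.layer4[0].conv2.stride = (1, 1)
        resnet.layer4[0].downsample[0].stride = (1, 1)
        # (Dropping the backbone's very last ReLU, as described above, is omitted in this sketch.)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])  # drop avgpool + fc
        self.pool = nn.AdaptiveAvgPool2d(1)
        # BNNeck: a BatchNorm layer between the feature used for the triplet loss
        # and the bias-free classifier used for the cross-entropy loss.
        self.bnneck = nn.BatchNorm1d(feat_dim)
        self.bnneck.bias.requires_grad_(False)
        self.classifier = nn.Linear(feat_dim, num_classes, bias=False)

    def forward(self, x):                                    # x: (B, 3, H, W)
        feat = self.pool(self.backbone(x)).flatten(1)        # pre-BN feature -> triplet loss
        feat_bn = self.bnneck(feat)                          # post-BN feature -> classifier
        logits = self.classifier(feat_bn)
        return feat, feat_bn, logits
```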

Set Representation.

Here, we naively compute the set-level embedding $\mathcal{F}(\mathcal{S})$ through a temporal average pooling. While we acknowledge better aggregation modules exist, we do not place our focus on devising a new one, but instead on improving the earlier features extractor.
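This set-level embedding amounts to a simple mean over per-frame embeddings; a one-line view of it, assuming frame features stacked along a time axis:

```python
import torch

def set_embedding(frame_feats: torch.Tensor) -> torch.Tensor:
    """Temporal average pooling: frame_feats has shape (B, T, D); returns (B, D)."""
    return frame_feats.mean(dim=1)
```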

Teacher Optimisation.

We train the base network (which will serve as the teacher during the following stage) by combining a classification term $\mathcal{L}_{CE}$ (cross-entropy) with the triplet loss $\mathcal{L}_{TR}$. The first can be formulated as:
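In its usual form (assuming one-hot labels and softmax outputs, as stated just below), the cross-entropy term reads:

$$\mathcal{L}_{CE} = -\,\mathbf{y} \cdot \log \hat{\mathbf{y}} = -\sum_{c} y_c \log \hat{y}_c$$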

where $\mathbf{y}$ and $\hat{\mathbf{y}}$ denote the one-hot labels and the softmax outputs, respectively.
$\mathcal{L}_{TR}$ encourages distance constraints in feature space, pulling samples of the same identity closer together and pushing samples of different identities further apart. It can be formalised as:
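Assuming the common batch-hard formulation with a margin $m$ and a distance $D(\cdot,\cdot)$ between set-level embeddings, this term can be written as:

$$\mathcal{L}_{TR} = \Big[\, D\big(\mathcal{F}(\mathcal{S}_a), \mathcal{F}(\mathcal{S}_p)\big) - D\big(\mathcal{F}(\mathcal{S}_a), \mathcal{F}(\mathcal{S}_n)\big) + m \,\Big]_{+}$$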


where $\mathcal{S}_p$ and $\mathcal{S}_n$ are the hardest positive and hardest negative samples for the anchor $\mathcal{S}_a$ within the batch.

Views Knowledge Distillation (VKD)

Views Knowledge Distillation (VKD) stresses this idea by forcing a student network $\mathcal{F}_{\theta_S}(\cdot)$ to match the outputs of the teacher $\mathcal{F}_{\theta_T}(\cdot)$. In doing so, we: i) allow the teacher to access frames $\hat{S}_T = (\hat{s}_1, \hat{s}_2, \hat{s}_3, \dots, \hat{s}_N)$ from different viewpoints; ii) force the student to mimic the teacher's output starting from a subset $\hat{S}_S = (\hat{s}_1, \hat{s}_2, \hat{s}_3, \dots, \hat{s}_M)$ with cardinality $M < N$ (in the paper's experiments, $M = 2$ and $N = 8$).
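A minimal sketch of this matching step, assuming a frame-level network with the three outputs of the earlier TeacherBackbone sketch and an illustrative combination of embedding and softened-logit terms; the exact loss mix used in the paper is richer than this:

```python
import torch
import torch.nn.functional as F

def forward_set(net, frames):
    """Run a frame-level network on (B, T, C, H, W) inputs and average over time."""
    B, T = frames.shape[:2]
    feat, feat_bn, logits = net(frames.flatten(0, 1))          # (B*T, ...)
    pool = lambda x: x.view(B, T, -1).mean(dim=1)               # temporal average pooling
    return pool(feat), pool(feat_bn), pool(logits)

def vkd_step(teacher, student, frames, m_frames: int = 2, tau: float = 4.0):
    """Illustrative distillation step: the teacher sees all N views, the student only M < N."""
    with torch.no_grad():
        t_feat, _, t_logits = forward_set(teacher, frames)               # all N frames
    s_feat, _, s_logits = forward_set(student, frames[:, :m_frames])     # only M frames

    # Align embeddings and softened logits with the teacher (illustrative choice of terms;
    # the paper additionally keeps the Re-ID objectives on the student itself).
    loss_feat = F.mse_loss(s_feat, t_feat)
    loss_kd = F.kl_div(F.log_softmax(s_logits / tau, dim=1),
                       F.softmax(t_logits / tau, dim=1),
                       reduction="batchmean") * tau ** 2
    return loss_feat + loss_kd
```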
