PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer

Abstract / Introduction / Related Work

Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, which neglect long-range spatio-temporal perception and interaction for rPPG modeling (that is, long-range temporal relations).

Counterexample: "Unifying frame rate and temporal dilations for improved remote pulse detection" (a filler paper in an SCI Zone 3 journal)

Proposed: temporal difference transformers, which use global spatio-temporal attention based on fine-grained temporal skin color differences

Distinguishing characteristics of the task:

  • subtle skin color changes

  • long-time monitoring task

  • a video sequence to signal sequence problem

We also propose label distribution learning and a curriculum-learning-inspired dynamic constraint in the frequency domain, which provide elaborate supervision for PhysFormer and alleviate overfitting.

Network Architecture

TDC Module

To be covered in a later post: Applications of Difference Convolution in Computer Vision - Zhihu (zhihu.com)

    import math

    import torch.nn as nn
    import torch.nn.functional as F


    class CDC_T(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size=3, stride=1,
                     padding=1, dilation=1, groups=1, bias=False, theta=0.6):
            super(CDC_T, self).__init__()
            self.conv = nn.Conv3d(in_channels, out_channels, kernel_size=kernel_size,
                                  stride=stride, padding=padding, dilation=dilation,
                                  groups=groups, bias=bias)
            self.theta = theta

        def forward(self, x):
            out_normal = self.conv(x)

            # theta == 0 reduces to a vanilla 3D convolution
            if math.fabs(self.theta - 0.0) < 1e-8:
                return out_normal
            else:
                [C_out, C_in, t, kernel_size, kernel_size] = self.conv.weight.shape

                # only CD works on temporal kernel size>1
                if self.conv.weight.shape[2] > 1:
                    # sum the weights of the first and last temporal slices
                    # (indices 0 and 2, so a temporal kernel size of 3 is assumed)
                    kernel_diff = (self.conv.weight[:, :, 0, :, :].sum(2).sum(2)
                                   + self.conv.weight[:, :, 2, :, :].sum(2).sum(2))
                    kernel_diff = kernel_diff[:, :, None, None, None]
                    # 1x1x1 convolution that applies the summed weights to the current frame
                    out_diff = F.conv3d(input=x, weight=kernel_diff, bias=self.conv.bias,
                                        stride=self.conv.stride, padding=0,
                                        dilation=self.conv.dilation, groups=self.conv.groups)
                    return out_normal - self.theta * out_diff
                else:
                    return out_normal
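To see what the subtraction in CDC_T does, it can help to drop the spatial dimensions and look at a single channel over time. The following is a minimal pure-Python sketch under that simplification; the function name `tdc_1d` and the 3-tap kernel layout are assumptions for illustration, not part of the original code. With `theta=1` and a zero center weight, a constant signal produces zero output away from the padded edges, which shows that the neighbouring-frame weights end up acting on temporal differences rather than on raw intensities.

```python
def tdc_1d(x, w, theta=0.6):
    """1-D temporal analogue of CDC_T (hypothetical helper, one channel, no space).

    x: list of samples over time
    w: (w_prev, w_cur, w_next), weights for frames t-1, t, t+1
    Zero padding of one sample on each side mirrors padding=1 in CDC_T.
    """
    w_prev, w_cur, w_next = w
    padded = [0.0] + list(x) + [0.0]
    out = []
    for t in range(1, len(padded) - 1):
        # vanilla temporal convolution
        vanilla = w_prev * padded[t - 1] + w_cur * padded[t] + w_next * padded[t + 1]
        # summed neighbour weights applied to the current frame (the "kernel_diff" term)
        diff = (w_prev + w_next) * padded[t]
        out.append(vanilla - theta * diff)
    return out


constant = [2.0] * 5
print(tdc_1d(constant, (0.5, 0.0, 0.5), theta=1.0))  # [-1.0, 0.0, 0.0, 0.0, -1.0]
```

The nonzero values at both ends come only from the zero padding; with `theta=0` the function falls back to an ordinary convolution.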

Attention Module

    def forward(self, x, gra_sharp):    # [B, 4*4*40, 128]
        """
        x, q(query), k(key), v(value) : (B(batch_size), S(seq_len), D(dim))
        mask : (B(batch_size) x S(seq_len))
        * split D(dim) into (H(n_heads), W(width of head)) ; D = H * W
        """
        # (B, S, D) -proj-> (B, S, D) -split-> (B, S, H, W) -trans-> (B, H, S, W)
        [B, P, C] = x.shape
        x = x.transpose(1, 2).view(B, C, P//16, 4, 4)      # [B, dim, 40, 4, 4]
        q, k, v = self.proj_q(x), self.proj_k(x), self.proj_v(x)
        q = q.flatten(2).transpose(1, 2)  # [B, 4*4*40, dim]
        k = k.flatten(2).transpose(1, 2)  # [B, 4*4*40, dim]
        v = v.flatten(2).transpose(1, 2)  # [B, 4*4*40, dim]
        q, k, v = (split_last(x, (self.n_heads, -1)).transpose(1, 2) for x in [q, k, v])
        # (B, H, S, W) @ (B, H, W, S) -> (B, H, S, S) -softmax-> (B, H, S, S)
        scores = q @ k.transpose(-2, -1) / gra_sharp
        scores = self.drop(F.softmax(scores, dim=-1))
        # (B, H, S, S) @ (B, H, S, W) -> (B, H, S, W) -trans-> (B, S, H, W)
        h = (scores @ v).transpose(1, 2).contiguous()
        # -merge-> (B, S, D)
        h = merge_last(h, 2)
        self.scores = scores
        return h, scores
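In the attention code, the query-key scores are divided by `gra_sharp` before the softmax, so `gra_sharp` behaves like a temperature. The sketch below illustrates that effect for a single query in plain Python; the helper names `softmax` and `attention_weights` and the toy vectors are assumptions made for this example only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q, keys, gra_sharp):
    """Attention weights of one query over several keys, scaled by 1 / gra_sharp."""
    logits = [sum(qi * ki for qi, ki in zip(q, k)) / gra_sharp for k in keys]
    return softmax(logits)


q = [1.0, 0.0]
keys = [[4.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
sharp = attention_weights(q, keys, gra_sharp=1.0)
flat = attention_weights(q, keys, gra_sharp=8.0)
print(sharp)
print(flat)
```

A small `gra_sharp` concentrates the weights on the best-matching key, while a large value spreads them more evenly across all keys.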

Overall Structure

    def forward(self, x, gra_sharp):
        b, c, t, fh, fw = x.shape

        x = self.Stem0(x)
        x = self.Stem1(x)
        x = self.Stem2(x)  # [B, 64, 160, 64, 64]

        x = self.patch_embedding(x)  # [B, 64, 40, 4, 4]
        x = x.flatten(2).transpose(1, 2)  # [B, 40*4*4, 64]

        # three stacked temporal difference transformer stages
        Trans_features, Score1 = self.transformer1(x, gra_sharp)  # [B, 4*4*40, 64]
        Trans_features2, Score2 = self.transformer2(Trans_features, gra_sharp)  # [B, 4*4*40, 64]
        Trans_features3, Score3 = self.transformer3(Trans_features2, gra_sharp)  # [B, 4*4*40, 64]

        # upsampling heads: restore the temporal length, then average over space
        features_last = Trans_features3.transpose(1, 2).view(b, self.dim, t//4, 4, 4)  # [B, 64, 40, 4, 4]
        features_last = self.upsample(features_last)    # [B, 64, 80, 4, 4]
        features_last = self.upsample2(features_last)   # [B, 32, 160, 4, 4]
        features_last = torch.mean(features_last, 3)    # [B, 32, 160, 4]
        features_last = torch.mean(features_last, 3)    # [B, 32, 160]
        rPPG = self.ConvBlockLast(features_last)        # [B, 1, 160]

        rPPG = rPPG.squeeze(1)  # [B, 160]
        return rPPG, Score1, Score2, Score3
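The shape comments in the forward pass imply a simple bookkeeping rule: the temporal axis is divided by 4 at patch embedding, the tokens form a 4 x 4 spatial grid, and the upsampling heads bring the temporal length back to the input length. The helper below is a small sketch of that arithmetic; the name `token_layout` and its parameters are hypothetical, and the assumption that each upsampling head doubles the temporal length is read off the 40, 80, 160 progression in the comments.

```python
def token_layout(frames=160, temporal_patch=4, grid=4, upsample_stages=2):
    """Return (temporal_tokens, total_tokens, output_length) for the assumed layout."""
    temporal_tokens = frames // temporal_patch        # 160 -> 40
    total_tokens = temporal_tokens * grid * grid      # 40 * 4 * 4 tokens per clip
    # assumption: every upsampling head doubles the temporal length
    output_length = temporal_tokens * (2 ** upsample_stages)
    return temporal_tokens, total_tokens, output_length


print(token_layout())  # (40, 640, 160)
```

With these defaults the predicted rPPG signal has one value per input frame, and the attention module's `P//16` recovers the 40 temporal tokens from the 640-token sequence.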

Label Distribution Learning

A new way of computing the loss
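The notes do not spell the loss out, so the following is only a sketch of how label distribution learning is commonly set up for heart rate estimation: the scalar ground-truth heart rate is replaced by a Gaussian-shaped distribution over candidate heart rates, and the predicted distribution is compared against it with a KL divergence. The bpm range 40 to 179, `sigma=1.0`, the `eps` clamp, and both function names are assumptions for illustration.

```python
import math


def gaussian_label_distribution(hr_bpm, bpm_range=range(40, 180), sigma=1.0):
    """Turn a scalar heart rate into a normalized Gaussian over candidate bpm bins."""
    dens = [math.exp(-((b - hr_bpm) ** 2) / (2 * sigma ** 2)) for b in bpm_range]
    total = sum(dens)
    return [d / total for d in dens]


def kl_divergence(target, predicted, eps=1e-15):
    """KL(target || predicted), with both sides clamped to eps inside the log."""
    return sum(
        p * math.log(max(p, eps) / max(q, eps))
        for p, q in zip(target, predicted)
        if p > 0
    )


target = gaussian_label_distribution(72)
close = gaussian_label_distribution(73)
far = gaussian_label_distribution(80)
print(kl_divergence(target, close), kl_divergence(target, far))
```

Unlike a hard one-hot label, the soft target penalizes a prediction near the true heart rate less than one far from it, which gives the model a smoother supervision signal.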

Curriculum Learning Guided Dynamic Loss
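Only the heading is recorded here. As a rough sketch of the idea, one could keep a fixed weight on the time-domain loss and let the weight on the frequency-domain losses grow exponentially with the epoch, so that training starts with the easier constraint and tightens the harder one gradually. The schedule form and the constants `beta0`, `eta`, and `alpha` below are assumptions for illustration and are not taken from these notes.

```python
def dynamic_beta(epoch, total_epochs, beta0=1.0, eta=5.0):
    """Frequency-loss weight that grows from beta0 toward beta0 * eta (assumed schedule)."""
    return beta0 * eta ** ((epoch - 1) / total_epochs)


def total_loss(loss_time, loss_freq, epoch, total_epochs, alpha=0.1):
    """Weighted sum of a time-domain loss and a frequency-domain loss."""
    return alpha * loss_time + dynamic_beta(epoch, total_epochs) * loss_freq


for epoch in (1, 10, 25):
    print(epoch, dynamic_beta(epoch, total_epochs=25))
```

Here `loss_time` and `loss_freq` stand for whatever time-domain and frequency-domain terms are used during training; only the weighting is shown.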
