PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer
Abstract / Introduction / Related Work
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, which neglect the long-range spatio-temporal perception and interaction for rPPG modeling. (Key point: long-range temporal relationships.)
Counterexample: Unifying frame rate and temporal dilations for improved remote pulse detection (a low-quality paper in an SCI zone-3 journal)
The temporal difference transformers
Proposed: global spatio-temporal attention based on the fine-grained temporal skin color differences
Distinguishing characteristics:
subtle skin color changes
long-time monitoring task
a video sequence to signal sequence problem
We also propose the label distribution learning and a curriculum learning inspired dynamic constraint in frequency domain, which provide elaborate supervisions for PhysFormer and alleviate overfitting.
Network architecture
TDC module
To be covered in a later post: Applications of difference convolution in computer vision - Zhihu (zhihu.com)
import math

import torch.nn as nn
import torch.nn.functional as F


class CDC_T(nn.Module):
    """3D convolution with a temporal central-difference term weighted by theta."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1,
                 padding=1, dilation=1, groups=1, bias=False, theta=0.6):
        super(CDC_T, self).__init__()
        self.conv = nn.Conv3d(in_channels, out_channels, kernel_size=kernel_size,
                              stride=stride, padding=padding, dilation=dilation,
                              groups=groups, bias=bias)
        self.theta = theta

    def forward(self, x):
        out_normal = self.conv(x)

        # theta == 0 reduces to a vanilla 3D convolution
        if math.fabs(self.theta - 0.0) < 1e-8:
            return out_normal

        # only CD works on temporal kernel size > 1
        if self.conv.weight.shape[2] > 1:
            # sum the spatial weights of the two temporal neighbours
            # (indices 0 and 2, i.e. this assumes a temporal kernel size of 3)
            kernel_diff = (self.conv.weight[:, :, 0, :, :].sum(2).sum(2)
                           + self.conv.weight[:, :, 2, :, :].sum(2).sum(2))
            kernel_diff = kernel_diff[:, :, None, None, None]  # [C_out, C_in/groups, 1, 1, 1]
            out_diff = F.conv3d(input=x, weight=kernel_diff, bias=self.conv.bias,
                                stride=self.conv.stride, padding=0,
                                dilation=self.conv.dilation, groups=self.conv.groups)
            return out_normal - self.theta * out_diff

        return out_normal
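To make the difference term concrete, here is a minimal pure-Python sketch that collapses the module to a single channel and to the time axis only. The function name `tdc_1d` and the example weights are illustrative assumptions, not part of the PhysFormer code; the arithmetic mirrors `out_normal - theta * out_diff`, with zero padding of 1 on the vanilla branch and no padding on the difference branch.

```python
def tdc_1d(x, w, theta=0.6):
    """1D, single-channel analogue of the temporal difference convolution.

    x: list of floats (one value per frame)
    w: three temporal weights (w_prev, w_mid, w_next)
    """
    w_prev, w_mid, w_next = w
    padded = [0.0] + list(x) + [0.0]  # zero padding of 1 on each side
    out = []
    for t in range(len(x)):
        # vanilla 3-tap convolution over frames t-1, t, t+1
        vanilla = w_prev * padded[t] + w_mid * padded[t + 1] + w_next * padded[t + 2]
        # difference branch: summed neighbour weights applied to the centre frame
        diff = (w_prev + w_next) * x[t]
        out.append(vanilla - theta * diff)
    return out


signal = [2.0] * 5
weights = (0.5, 1.0, 0.5)
print(tdc_1d(signal, weights, theta=1.0))
print(tdc_1d(signal, weights, theta=0.0))
```

With `theta=1.0` each interior output equals `w_prev * (x[t-1] - x[t]) + w_mid * x[t] + w_next * (x[t+1] - x[t])`, so on a constant signal only the `w_mid` term survives. With `theta=0.0` the function is an ordinary 3-tap convolution.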
Attention module
def forward(self, x, gra_sharp):  # [B, 4*4*40, 128]
    """
    x, q(query), k(key), v(value) : (B(batch_size), S(seq_len), D(dim))
    mask : (B(batch_size) x S(seq_len))
    * split D(dim) into (H(n_heads), W(width of head)) ; D = H * W
    """
    # (B, S, D) -proj-> (B, S, D) -split-> (B, S, H, W) -trans-> (B, H, S, W)
    [B, P, C] = x.shape
    x = x.transpose(1, 2).view(B, C, P // 16, 4, 4)  # [B, dim, 40, 4, 4]
    q, k, v = self.proj_q(x), self.proj_k(x), self.proj_v(x)
    q = q.flatten(2).transpose(1, 2)  # [B, 4*4*40, dim]
    k = k.flatten(2).transpose(1, 2)  # [B, 4*4*40, dim]
    v = v.flatten(2).transpose(1, 2)  # [B, 4*4*40, dim]
    q, k, v = (split_last(x, (self.n_heads, -1)).transpose(1, 2) for x in [q, k, v])
    # (B, H, S, W) @ (B, H, W, S) -> (B, H, S, S) -softmax-> (B, H, S, S)
    scores = q @ k.transpose(-2, -1) / gra_sharp
    scores = self.drop(F.softmax(scores, dim=-1))
    # (B, H, S, S) @ (B, H, S, W) -> (B, H, S, W) -trans-> (B, S, H, W)
    h = (scores @ v).transpose(1, 2).contiguous()
    # -merge-> (B, S, D)
    h = merge_last(h, 2)
    self.scores = scores
    return h, scores
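In the forward pass above, the attention logits are divided by `gra_sharp` before the softmax, so `gra_sharp` acts like a temperature. The sketch below shows that effect on a single row of logits in plain Python; the function name `sharpened_softmax` and the sample values are assumptions made for illustration.

```python
import math


def sharpened_softmax(logits, gra_sharp):
    """Softmax over one row of attention logits scaled by 1 / gra_sharp."""
    scaled = [v / gra_sharp for v in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
soft = sharpened_softmax(logits, gra_sharp=2.0)   # larger value: flatter weights
sharp = sharpened_softmax(logits, gra_sharp=0.5)  # smaller value: peakier weights
print(soft)
print(sharp)
```

A smaller `gra_sharp` concentrates the attention weights on the largest logit, while a larger one spreads them more evenly.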
Overall structure
def forward(self, x, gra_sharp):
    b, c, t, fh, fw = x.shape

    x = self.Stem0(x)
    x = self.Stem1(x)
    x = self.Stem2(x)  # [B, 64, 160, 64, 64]

    x = self.patch_embedding(x)  # [B, 64, 40, 4, 4]
    x = x.flatten(2).transpose(1, 2)  # [B, 40*4*4, 64]

    Trans_features, Score1 = self.transformer1(x, gra_sharp)  # [B, 4*4*40, 64]
    Trans_features2, Score2 = self.transformer2(Trans_features, gra_sharp)  # [B, 4*4*40, 64]
    Trans_features3, Score3 = self.transformer3(Trans_features2, gra_sharp)  # [B, 4*4*40, 64]
    # Trans_features3 = self.normLast(Trans_features3)

    # upsampling heads: restore the temporal length, then average out the spatial grid
    features_last = Trans_features3.transpose(1, 2).view(b, self.dim, t // 4, 4, 4)  # [B, 64, 40, 4, 4]
    features_last = self.upsample(features_last)  # [B, 64, 80, 4, 4]
    features_last = self.upsample2(features_last)  # [B, 32, 160, 4, 4]
    features_last = torch.mean(features_last, 3)  # [B, 32, 160, 4]
    features_last = torch.mean(features_last, 3)  # [B, 32, 160]
    rPPG = self.ConvBlockLast(features_last)  # [B, 1, 160]

    rPPG = rPPG.squeeze(1)  # [B, 160]
    return rPPG, Score1, Score2, Score3
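The shape comments in this forward pass imply a simple piece of bookkeeping: 160 frames become 40 temporal tokens, each with a 4 x 4 spatial grid, which is why the attention module can recover the grid with `P // 16`. A small sketch of that arithmetic follows; the helper name `token_grid` is hypothetical, and the temporal stride of 4 and the 4 x 4 grid are read off the `t // 4, 4, 4` reshape above.

```python
def token_grid(num_frames, temporal_patch=4, spatial_grid=(4, 4)):
    """Return (T_tokens, H, W, total_tokens) for a clip of num_frames frames."""
    t_tokens = num_frames // temporal_patch
    h, w = spatial_grid
    return t_tokens, h, w, t_tokens * h * w


t_tokens, h, w, total = token_grid(160)
print(t_tokens, h, w, total)

# Inverse step used inside the attention module: tokens back to temporal length
print(total // (h * w))
```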
Label Distribution Learning
A new way of computing the loss
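These notes only name the technique, so the following is a hedged sketch of one common reading of label distribution learning for heart rate: the ground-truth HR is replaced by a Gaussian over HR bins, and the predicted distribution is compared against it with a KL divergence. The bin range (40 to 180 bpm), `sigma=1.0`, and both function names are assumptions. The predicted distributions here are stand-in Gaussians, whereas a real model would derive them from the predicted rPPG signal.

```python
import math


def gaussian_label(hr_bpm, bins, sigma=1.0):
    """Normalized Gaussian label distribution centred on the true heart rate."""
    weights = [math.exp(-((b - hr_bpm) ** 2) / (2.0 * sigma ** 2)) for b in bins]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


bins = list(range(40, 181))            # assumed HR classes in bpm
target = gaussian_label(72, bins)      # ground-truth HR of 72 bpm
close = gaussian_label(73, bins)       # stand-in prediction, 1 bpm off
far = gaussian_label(90, bins)         # stand-in prediction, 18 bpm off

print(kl_divergence(target, close))
print(kl_divergence(target, far))
```

Unlike a one-hot cross-entropy target, the Gaussian label gives a smaller penalty to a prediction that is off by 1 bpm than to one that is off by 18 bpm.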
Curriculum Learning Guided Dynamic Loss
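The notes do not record the schedule itself, so the sketch below only illustrates the general idea of a curriculum-style dynamic loss: keep the time-domain weight fixed and let the frequency-domain weight grow as training progresses. The exponential form, the values `alpha=0.1`, `beta0=1.0`, `eta=5.0`, and the function names are all illustrative assumptions, not values taken from these notes.

```python
def dynamic_weights(epoch, total_epochs, alpha=0.1, beta0=1.0, eta=5.0):
    """Return (alpha, beta) where beta grows exponentially from beta0 to beta0 * eta."""
    # progress runs from 0.0 at epoch 1 to 1.0 once epoch - 1 reaches total_epochs
    progress = min(max(epoch - 1, 0), total_epochs) / total_epochs
    beta = beta0 * eta ** progress
    return alpha, beta


def total_loss(loss_time, loss_freq, epoch, total_epochs):
    """Combine a time-domain loss and a frequency-domain loss with dynamic weights."""
    alpha, beta = dynamic_weights(epoch, total_epochs)
    return alpha * loss_time + beta * loss_freq


for epoch in (1, 10, 26):
    print(epoch, dynamic_weights(epoch, total_epochs=25))
```

Early epochs are dominated by the easier time-domain constraint, and the frequency-domain constraint is tightened gradually rather than imposed at full strength from the start.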