Taigman, Yaniv, et al. “DeepFace: Closing the gap to human-level performance in face verification.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014. (Citations: 851).

1 Motivation

Aligning faces in unconstrained settings is difficult because of variations in
• Pose (due to the non-planarity of the face).
• Non-rigid expressions.

The whole detection and alignment pipeline can be seen in Fig. 12.

Figure 12: Detection and alignment pipeline. (a) The detected face, with 6 initial fiducial points. (b) The induced 2D-aligned crop. (c) 67 fiducial points on the 2D-aligned crop with their corresponding Delaunay triangulation; triangles were added on the contour to avoid discontinuities. (d) The reference 3D shape transformed to the 2D-aligned crop image-plane. (e) Triangle visibility w.r.t. the fitted 3D-2D camera; darker triangles are less visible. (f) The 67 fiducial points induced by the 3D model that are used to direct the piece-wise affine warping. (g) The final frontalized crop. (h) A new view generated by the 3D model (not used in this paper).

2 Detection
Fiducial Point Detector
• Choose 6 fiducial points: the centers of the 2 eyes, the nose tip, and 3 mouth points; see Fig. 12(a).
• Fiducial points are extracted by an SVR (support vector regressor) trained to predict point configurations from an image descriptor (LBP histograms).

3 Alignment
2D Alignment
• Use the fiducial points to scale, rotate, and translate the image so that the 6 points land on 6 fixed locations (anchor locations); see Fig. 12(b) for the alignment result.
• However, this alignment fails to compensate for out-of-plane rotation, which is particularly important in unconstrained conditions.
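The scale, rotation, and translation step can be sketched as a least-squares similarity fit. The snippet below is only a minimal illustration under stated assumptions, not the authors' implementation: it treats 2D points as complex numbers, and the function names `fit_similarity` and `apply_similarity` as well as all coordinates are hypothetical.

```python
import cmath


def fit_similarity(src, dst):
    """Least-squares 2D similarity transform (scale, rotation, translation).

    src and dst are equal-length lists of (x, y) points. Reading each point
    as the complex number x + iy, the result (a, b) minimizes
    sum |a * src_i + b - dst_i|^2, where abs(a) is the scale and
    cmath.phase(a) is the rotation angle.
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    n = len(zs)
    z_mean = sum(zs) / n
    w_mean = sum(ws) / n
    num = sum((z - z_mean).conjugate() * (w - w_mean) for z, w in zip(zs, ws))
    den = sum(abs(z - z_mean) ** 2 for z in zs)
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def apply_similarity(a, b, points):
    """Map (x, y) points through w = a * z + b."""
    mapped = [a * complex(x, y) + b for x, y in points]
    return [(w.real, w.imag) for w in mapped]


# Hypothetical detected fiducial points and anchor locations (6 each).
detected = [(30, 40), (70, 38), (50, 60), (35, 80), (50, 84), (66, 79)]
anchors = [(45, 50), (105, 50), (75, 82), (52, 112), (75, 118), (98, 112)]
a, b = fit_similarity(detected, anchors)
print("scale:", abs(a), "rotation (rad):", cmath.phase(a))
```

Because a similarity transform only acts in the image plane, it cannot undo out-of-plane rotation, which is the limitation noted above.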

3D Alignment
• Use a generic 3D shape model.
• On the 2D-aligned crop (Fig. 12(b)), a second SVR localizes 67 additional fiducial points; see Fig. 12(c).
• An affine 3D-to-2D camera P is then fitted using generalized least squares.
• However, this alignment does not model full perspective projections or non-rigid deformations. Therefore, the 2D image is allowed to warp with small distortions; see Fig. 12(g) for the alignment result.
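The camera-fitting step can be illustrated with a small NumPy sketch. It assumes the affine camera is a 2 × 4 matrix acting on homogeneous 3D points and that the landmark covariance is either the identity (ordinary least squares) or supplied by the caller; the function name `fit_affine_camera` and the synthetic data are hypothetical, and this is not claimed to match the authors' code.

```python
import numpy as np


def fit_affine_camera(points_3d, points_2d, cov=None):
    """Fit a 2x4 affine camera P so that points_2d ~= [points_3d, 1] @ P.T.

    points_3d: (n, 3) reference 3D fiducial points.
    points_2d: (n, 2) detected 2D fiducial points.
    cov: optional (2n, 2n) covariance of the stacked 2D coordinates
         (x0, y0, x1, y1, ...). If None, ordinary least squares is used.
    """
    n = points_3d.shape[0]
    homog = np.hstack([points_3d, np.ones((n, 1))])  # (n, 4)
    design = np.zeros((2 * n, 8))
    design[0::2, :4] = homog  # rows predicting x coordinates
    design[1::2, 4:] = homog  # rows predicting y coordinates
    target = points_2d.reshape(-1)
    if cov is None:
        params, *_ = np.linalg.lstsq(design, target, rcond=None)
    else:
        # Generalized least squares: minimize r^T cov^{-1} r.
        cov_inv = np.linalg.inv(cov)
        lhs = design.T @ cov_inv @ design
        rhs = design.T @ cov_inv @ target
        params = np.linalg.solve(lhs, rhs)
    return params.reshape(2, 4)


# Synthetic check with 67 random 3D points and a random camera.
rng = np.random.default_rng(0)
pts3d = rng.normal(size=(67, 3))
camera_true = rng.normal(size=(2, 4))
pts2d = np.hstack([pts3d, np.ones((67, 1))]) @ camera_true.T
camera_fit = fit_affine_camera(pts3d, pts2d)
print(np.abs(camera_fit - camera_true).max())
```

An affine camera has only 8 free parameters, which is why it cannot represent full perspective projection.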

4 Representation + Classification
In a Nutshell (120M Parameters)

• input (3 × 152 × 152).
• conv1 (32@11 × 11), relu1, pool1 (3 × 3, s2).
• conv2 (16@9 × 9), relu2.
• lc3 (16@9 × 9), relu3.
• lc4 (16@7 × 7), relu4.
• lc5 (16@5 × 5), relu5.
• fc6 (4096), relu6, drop6.
• fc7 (4030).

CONV Layers Used to extract low-level features like simple edges and texture.

No POOL2 Several levels of pooling would cause the network to lose information about the precise position of detailed facial structure and microtextures.

Locally Connected Layers (LC) Like a conv layer, they apply a filter bank, but every location in the feature map learns a different set of filters, since different regions of an aligned image have different local statistics.
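The difference between shared and unshared filters can be made concrete with a toy single-channel example. This is a minimal pure-Python sketch (no bias, no stride, "valid" borders), and the function names are hypothetical; it only illustrates the idea, not the layer sizes used in the network.

```python
def conv2d_valid(image, kernel):
    """Shared weights: the same k x k kernel is applied at every location."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def locally_connected2d_valid(image, kernels):
    """Unshared weights: kernels[r][c] is the k x k filter for output (r, c)."""
    k = len(kernels[0][0])
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernels[r][c][i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def conv_param_count(k):
    """One kernel, reused everywhere."""
    return k * k


def lc_param_count(in_h, in_w, k):
    """One kernel per output location."""
    return (in_h - k + 1) * (in_w - k + 1) * k * k


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
# Giving every location the same kernel reduces the LC layer to a convolution.
shared = [[kernel for _ in range(2)] for _ in range(2)]
print(conv2d_valid(image, kernel) == locally_connected2d_valid(image, shared))
print(conv_param_count(2), lc_param_count(3, 3, 2))
```

The parameter count grows with the number of output locations, which is the price paid for letting each face region have its own filters.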

5 Analysis
The features produced by this network are very sparse. See Fig. 13.

6 Verification Task
Definition Verifying whether two input instances belong to the same class (identity).

Idea Use the ℓ2-normalized fc6 features. The key is to design a similarity measure.

Unsupervised Similarity Inner product of the two features.

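Weighted χ^2 Distance

$$\chi^2(f_1, f_2) = \sum_i w_i \frac{(f_1[i] - f_2[i])^2}{f_1[i] + f_2[i]}$$

where f_1 and f_2 are the two feature vectors. The weights w are learned using a linear SVM.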
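Both similarity measures are simple to compute once the features are available. The sketch below assumes the weighted χ^2 distance has the form Σ_i w_i (f1[i] − f2[i])^2 / (f1[i] + f2[i]) given in the cited paper, and that features are non-negative (they come after a ReLU). The function names and the weights in the example are hypothetical; in practice the weights would come from the trained linear SVM.

```python
import math


def l2_normalize(features):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in features))
    return [v / norm for v in features]


def inner_product(f1, f2):
    """Unsupervised similarity: larger means more similar."""
    return sum(a * b for a, b in zip(f1, f2))


def weighted_chi2(f1, f2, weights, eps=1e-12):
    """Weighted chi-squared distance: smaller means more similar.

    Terms where both features are (near) zero are skipped to avoid 0/0,
    which matters because the features are sparse.
    """
    total = 0.0
    for a, b, w in zip(f1, f2, weights):
        denom = a + b
        if denom > eps:
            total += w * (a - b) ** 2 / denom
    return total


f1 = l2_normalize([3.0, 0.0, 4.0])
f2 = l2_normalize([0.0, 0.0, 2.0])
print(inner_product(f1, f2))
print(weighted_chi2(f1, f2, weights=[1.0, 1.0, 1.0]))
```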

Siamese Network End-to-end learning. The face recognition network (without the top layer) is replicated twice (once for each input image), and the features are used to directly predict whether the two input images belong to the same person. This is done in two steps:
• 1. Take the absolute difference between the features.
• 2. Follow it with a top fc layer that maps into a single logistic unit (same/not same).
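The head of the Siamese network, an absolute difference feeding a single logistic unit, can be sketched in a few lines. The function name, the weights `alpha`, and the `bias` below are hypothetical stand-ins for trained parameters; the feature extractor itself is omitted.

```python
import math


def siamese_same_probability(f1, f2, alpha, bias=0.0):
    """Probability that two feature vectors belong to the same person.

    Computes sigmoid(sum_i alpha[i] * |f1[i] - f2[i]| + bias), i.e. an
    element-wise absolute difference followed by one logistic unit.
    """
    score = sum(a * abs(x - y) for a, x, y in zip(alpha, f1, f2)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical trained parameters: negative weights penalize differences.
alpha = [-1.0, -1.0]
print(siamese_same_probability([1.0, 2.0], [1.0, 2.0], alpha, bias=2.0))
print(siamese_same_probability([1.0, 2.0], [3.0, 0.0], alpha, bias=2.0))
```

With negative weights, a larger feature difference lowers the predicted probability of "same".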

7 Ensembles of Networks
Several networks are trained by feeding different types of inputs:
• 3D aligned RGB inputs.
• 2D aligned RGB inputs.
• The gray-level image plus image gradient magnitude and orientation.

8 Experiments
Dataset Face dataset
• SFC (Social Face Classification): 4.4M faces, 4k people.

Verification datasets
• LFW (Labeled Faces in the Wild): 6k face pairs, 5.7k people.
• YTF (YouTube Faces): 5k video pairs, 1.6k people.

Train/Test Splitting The most recent 5% of face images of each identity are left out for testing. This is done according to the images’ time-stamps in order to simulate continuous identification through aging.
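A time-stamp-based split of this kind could look like the sketch below. The record layout `(identity, timestamp, image_id)`, the function name `split_by_recency`, and the choice to round the held-out count down are all assumptions made for illustration.

```python
from collections import defaultdict


def split_by_recency(samples, test_percent=5):
    """Hold out the most recent images of each identity for testing.

    samples: iterable of (identity, timestamp, image_id) records.
    Returns (train, test) lists of (identity, image_id) pairs.
    """
    by_identity = defaultdict(list)
    for identity, timestamp, image_id in samples:
        by_identity[identity].append((timestamp, image_id))

    train, test = [], []
    for identity, items in by_identity.items():
        items.sort(key=lambda item: item[0])  # oldest first
        n_test = len(items) * test_percent // 100  # rounds down (assumption)
        cut = len(items) - n_test
        train += [(identity, image_id) for _, image_id in items[:cut]]
        test += [(identity, image_id) for _, image_id in items[cut:]]
    return train, test


# Hypothetical data: 40 images of one identity, 20 of another.
samples = [("alice", t, f"a{t}") for t in range(40)]
samples += [("bob", t, f"b{t}") for t in range(20)]
train, test = split_by_recency(samples)
print(len(train), sorted(test))
```

Splitting per identity keeps every person in both sets, so the test measures recognition of known people in newer photos.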

Result LFW winner
• Accuracy: 97.25%.
• Human performance: 97.5%.
