[Deep Learning Paper Notes][Face Recognition] DeepFace: Closing the Gap to Human-Level Performance in Face Verification

Taigman, Yaniv, et al. “Deepface: Closing the gap to human-level performance in face verification.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014. (Citations: 851).

1 Motivation

Aligning faces in the unconstrained scenario is difficult because of variations in:
• Pose (due to the non-planarity of the face).
• Non-rigid expressions.

The whole detection and alignment pipeline is shown in Fig. 12.

Figure 12: Detection and alignment pipeline. (a) The detected face, with 6 initial fiducial points. (b) The induced 2D-aligned crop. (c) 67 fiducial points on the 2D-aligned crop with their corresponding Delaunay triangulation; triangles were added on the contour to avoid discontinuities. (d) The reference 3D shape transformed to the 2D-aligned crop image-plane. (e) Triangle visibility with respect to the fitted 3D-2D camera; darker triangles are less visible. (f) The 67 fiducial points induced by the 3D model that are used to direct the piece-wise affine warping. (g) The final frontalized crop. (h) A new view generated by the 3D model (not used in this paper).

2 Detection
Fiducial Point Detector
• Choose 6 fiducial points: the 2 eye centers, the nose tip, and 3 mouth points, see Fig. 12(a).
• Fiducial points are extracted by an SVR (support vector regressor) trained to predict point configurations from an image descriptor (LBP histograms).

3 Alignment
2D Alignment
• Use the fiducial points to scale, rotate, and translate the image so that the points land on 6 fixed anchor locations, see Fig. 12(b) for the alignment result.
• However, this alignment fails to compensate for out-of-plane rotation, which is particularly important in unconstrained conditions.
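A minimal sketch of the 2D step, assuming the scale, rotation, and translation are fitted by ordinary least squares over the 6 point pairs (these notes do not record the paper's exact fitting procedure). The function names are hypothetical. Each point is treated as a complex number, so the similarity transform becomes z' = a * z + t.

```python
def fit_similarity(src, dst):
    """Least-squares 2D similarity transform mapping src points onto dst.

    Points are (x, y) tuples. With each point written as a complex number z,
    the transform is z' = a * z + t: abs(a) is the scale, the phase of a is
    the rotation angle, and t is the translation.
    """
    zs = [complex(x, y) for x, y in src]
    zd = [complex(x, y) for x, y in dst]
    n = len(zs)
    mean_s, mean_d = sum(zs) / n, sum(zd) / n
    # Closed-form least-squares solution on the centered point sets.
    num = sum((d - mean_d) * (s - mean_s).conjugate() for s, d in zip(zs, zd))
    den = sum(abs(s - mean_s) ** 2 for s in zs)
    a = num / den
    t = mean_d - a * mean_s
    return a, t


def apply_similarity(a, t, pts):
    """Apply z' = a * z + t to a list of (x, y) points."""
    out = [a * complex(x, y) + t for x, y in pts]
    return [(z.real, z.imag) for z in out]
```

Here `src` would hold the 6 detected fiducial points and `dst` the 6 anchor locations. Because the transform only has in-plane scale, rotation, and translation, it cannot undo out-of-plane rotation, which is the limitation noted above.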

3D Alignment
• Use a generic 3D shape model.
• On the 2D-aligned crop (Fig. 12(b)), a second SVR localizes 67 additional fiducial points, see Fig. 12(c).
• An affine 3D-to-2D camera P is then fitted using generalized least squares.
• However, an affine camera cannot model full perspective projection or non-rigid deformations. Therefore, the 2D image is allowed to be warped with small distortions, see Fig. 12(g) for the alignment result.
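The camera fit can be sketched as a linear least-squares problem. This is a simplified illustration, not the paper's procedure: it uses ordinary least squares (equivalent to generalized least squares with an identity covariance, which is an assumption here), and the helper names are hypothetical. P is a 2 × 4 matrix such that [u, v] ≈ P · [X, Y, Z, 1].

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_affine_camera(pts3d, pts2d):
    """Fit a 2x4 affine camera P so that [u, v] ~= P @ [X, Y, Z, 1]."""
    rows = [[X, Y, Z, 1.0] for X, Y, Z in pts3d]
    # Normal equations (A^T A) p = A^T b, solved once per image coordinate.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    P = []
    for k in range(2):
        Atb = [sum(r[i] * p[k] for r, p in zip(rows, pts2d)) for i in range(4)]
        P.append(solve_linear(AtA, Atb))
    return P


def project(P, pt3d):
    """Project one 3D point to 2D with the affine camera P."""
    h = list(pt3d) + [1.0]
    return tuple(sum(w * c for w, c in zip(row, h)) for row in P)
```

In the setting above, `pts3d` would be the 67 reference points on the generic 3D shape and `pts2d` the 67 detected points on the 2D-aligned crop. The 3D points must not all be coplanar, otherwise the normal equations are singular.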

4 Representation + Classification
In a Nutshell (120M Parameters)

• input (3 × 152 × 152).
• conv1 (32@11 × 11), relu1, pool1 (3 × 3, s2).
• conv2 (16@9 × 9), relu2.
• lc3 (16@9 × 9), relu3.
• lc4 (16@7 × 7), relu4.
• lc5 (16@5 × 5), relu5.
• fc6 (4096), relu6, drop6.
• fc7 (4030).

CONV Layers Used to extract low-level features like simple edges and texture.

No POOL2 Several levels of pooling would cause the network to lose information about the precise position of detailed facial structure and microtextures.

Locally Connected Layers (LC) Like a conv layer, they apply a filter bank, but every location in the feature map learns a different set of filters, since different regions of an aligned image have different local statistics.
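The difference from a conv layer can be illustrated with a single-channel toy example. This is a sketch under simplifying assumptions (stride 1, no padding, no bias, one input and one output channel), and the function names are hypothetical. As in most deep learning libraries, "convolution" here means cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Conv-style layer: ONE k x k kernel shared by every output location."""
    k = len(kernel)
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for x in range(out_w)] for y in range(out_h)]


def locally_connected2d(image, kernels):
    """Locally connected layer: kernels[y][x] is a separate k x k kernel
    for output location (y, x); no weights are shared across locations."""
    out_h, out_w = len(kernels), len(kernels[0])
    k = len(kernels[0][0])
    return [[sum(image[y + i][x + j] * kernels[y][x][i][j]
                 for i in range(k) for j in range(k))
             for x in range(out_w)] for y in range(out_h)]


def weight_counts(in_size, k, c_in, c_out):
    """Weights (biases excluded) of a conv layer vs. a locally connected
    layer on a square input, stride 1, no padding."""
    out = in_size - k + 1
    conv = k * k * c_in * c_out
    return conv, out * out * conv
```

For a 10 × 10 single-channel input and a 3 × 3 kernel, `weight_counts(10, 3, 1, 1)` gives 9 weights for the conv layer and 576 for the locally connected one (one kernel per each of the 8 × 8 output locations). This multiplication by the number of output locations is why the LC layers account for a large share of the network's parameters.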

5 Analysis
The features produced by this network are very sparse, see Fig. 13.

6 Verification Task
Definition Verifying whether two input instances belong to the same class (identity).

Idea Use the ℓ2-normalized fc6 features. The key is to design a similarity measure.

Unsupervised Similarity Inner product of the two features.
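A minimal sketch of this unsupervised measure, with hypothetical function names. After ℓ2 normalization the inner product equals the cosine of the angle between the two feature vectors, so it is 1 for features pointing in the same direction and 0 for orthogonal ones.

```python
import math


def l2_normalize(f):
    """Scale a feature vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in f))
    return [v / norm for v in f] if norm > 0 else list(f)


def inner_product_similarity(f1, f2):
    """Inner product of the two l2-normalized feature vectors."""
    a, b = l2_normalize(f1), l2_normalize(f2)
    return sum(x * y for x, y in zip(a, b))
```

A verification decision then amounts to comparing this score against a threshold. How that threshold is chosen is not covered in these notes.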

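Weighted χ^2 Distance

χ^2(f1, f2) = Σ_i w_i (f1[i] − f2[i])^2 / (f1[i] + f2[i])

where f1 and f2 are the two feature vectors. The weights w are learned using a linear SVM.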
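A small sketch of the distance itself, assuming it has the form Σ_i w_i (f1[i] − f2[i])^2 / (f1[i] + f2[i]) with non-negative features and weights w that have already been learned. The function name is hypothetical, and skipping components where both features are zero (to avoid 0/0 on sparse features) is an assumption of this sketch.

```python
def weighted_chi2(f1, f2, w, eps=1e-12):
    """Weighted chi-squared distance between two non-negative feature vectors."""
    total = 0.0
    for a, b, wi in zip(f1, f2, w):
        s = a + b
        if s > eps:  # assumption: components where both features are zero contribute nothing
            total += wi * (a - b) ** 2 / s
    return total
```

With all weights equal to 1 this reduces to the plain χ^2 distance. The learned weights let the measure emphasize the feature components that are most discriminative for identity.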

Siamese Network End-to-end learning. The face recognition network (without the top layer) is replicated twice (once for each input image), and the features are used to directly predict whether the two input images belong to the same person. This is done in two steps:
• 1. Take the absolute difference between the features.
• 2. Feed it to a top fc layer that maps into a single logistic unit (same/not same).
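The verification head on top of the two feature vectors can be sketched as follows. This covers only the forward pass of the head. The function name, `weights`, and `bias` are hypothetical stand-ins for the parameters of the top fc layer, which in practice would be learned end to end.

```python
import math


def siamese_same_probability(f1, f2, weights, bias):
    """P(same identity) = sigmoid(sum_i weights[i] * |f1[i] - f2[i]| + bias)."""
    z = bias + sum(w * abs(a - b) for w, a, b in zip(weights, f1, f2))
    return 1.0 / (1.0 + math.exp(-z))
```

Because the absolute difference is symmetric in its two arguments, the prediction does not depend on the order of the two images.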

7 Ensembles of Networks
Several networks are trained on different types of input and combined:
• 3D aligned RGB inputs.
• 2D aligned RGB inputs.
• The gray-level image plus image gradient magnitude and orientation.
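The third input type can be sketched as below. The notes do not say which gradient operator is used, so simple finite differences are an assumption here, and the function name is hypothetical.

```python
import math


def gradient_channels(gray):
    """Return (magnitude, orientation) maps for a 2D gray-level image.

    Uses two-sided differences in the interior and one-sided differences at
    the borders. Differences are not divided by the pixel spacing.
    Orientation is in radians, in (-pi, pi].
    """
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    ori = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ori[y][x] = math.atan2(gy, gx)
    return mag, ori
```

Stacking the gray-level image with the two returned maps gives a 3-channel input of the same spatial size as the RGB inputs.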

8 Experiments
Dataset Face dataset
• SFC (Social Face Classification): 4.4M faces of 4,030 people (matching the fc7 output size).

Verification datasets
• LFW (Labeled Faces in the Wild): 6k face pairs, 5.7k people.
• YTF (YouTube Faces): 5k video pairs, 1.6k people.

Train/Test Splitting The most recent 5% of the face images of each identity are held out for testing. The split follows the images' time-stamps in order to simulate continuous identification through aging.
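A sketch of such a time-based split, assuming each record is an (identity, timestamp, image_id) tuple. The record layout and function name are hypothetical. Holding out at least one image per identity is also an assumption of this sketch.

```python
from collections import defaultdict


def split_by_recency(records, test_fraction=0.05):
    """Split (identity, timestamp, image_id) records into (train, test).

    For every identity, the most recent images go to the test set.
    """
    by_identity = defaultdict(list)
    for rec in records:
        by_identity[rec[0]].append(rec)
    train, test = [], []
    for recs in by_identity.values():
        recs.sort(key=lambda r: r[1])  # oldest first
        # Assumption: hold out at least one image per identity.
        n_test = max(1, int(len(recs) * test_fraction))
        train.extend(recs[:-n_test])
        test.extend(recs[-n_test:])
    return train, test
```

Unlike a random split, every test image of a person is newer than all of that person's training images, so test accuracy reflects how well the model copes with changes in appearance over time.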

Result State of the art on LFW
• Accuracy: 97.25%.
• Human performance: 97.5%.
