FaceNet paper link: facenet; download link for this analysis (PDF version): paper analysis

FaceNet: A Unified Embedding for Face Recognition and Clustering

Abstract

Despite significant recent advances in the field of face recognition [10, 14, 15, 17], implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.


Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128 bytes per face.


On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%.

On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result [15] by 30% on both datasets.


We also introduce the concept of harmonic embeddings, and a harmonic triplet loss, which describe different versions of face embeddings (produced by different networks) that are compatible to each other and allow for direct comparison between each other.


1. Introduction

In this paper we present a unified system for face verification (is this the same person), recognition (who is this person) and clustering (find common people among these faces). Our method is based on learning a Euclidean embedding per image using a deep convolutional network. The network is trained such that the squared L2 distances in the embedding space directly correspond to face similarity: faces of the same person have small distances and faces of distinct people have large distances.


Once this embedding has been produced, then the aforementioned tasks become straight-forward: face verification simply involves thresholding the distance between the two embeddings; recognition becomes a k-NN classification problem; and clustering can be achieved using off-the-shelf techniques such as k-means or agglomerative clustering.

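To make this concrete, here is a minimal Python sketch of how the three tasks reduce to standard operations once the embeddings exist. It assumes the (e.g. 128-D, L2-normalized) embeddings have already been computed by a trained network; the 1.1 threshold is the value quoted for Figure 1 below, and the scikit-learn classes are ordinary off-the-shelf tools, not anything specific to the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def same_person(e1, e2, threshold=1.1):
    # Verification: threshold the squared L2 distance between two embeddings.
    return float(np.sum((e1 - e2) ** 2)) <= threshold

def identify(gallery_embs, gallery_labels, probe_emb, k=5):
    # Recognition: k-NN classification in the embedding space.
    knn = KNeighborsClassifier(n_neighbors=k).fit(gallery_embs, gallery_labels)
    return knn.predict(probe_emb[None, :])[0]

def cluster(embs, n_people):
    # Clustering: off-the-shelf k-means on the embeddings.
    return KMeans(n_clusters=n_people, n_init=10).fit_predict(embs)
```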

Previous face recognition approaches based on deep networks use a classification layer [15, 17] trained over a set of known face identities and then take an intermediate bottleneck layer as a representation used to generalize recognition beyond the set of identities used in training. The downsides of this approach are its indirectness and its inefficiency: one has to hope that the bottleneck representation generalizes well to new faces; and by using a bottleneck layer the representation size per face is usually very large (1000s of dimensions). Some recent work [15] has reduced this dimensionality using PCA, but this is a linear transformation that can be easily learnt in one layer of the network.


In contrast to these approaches, FaceNet directly trains its output to be a compact 128-D embedding using a triplet-based loss function based on LMNN [19]. Our triplets consist of two matching face thumbnails and a non-matching face thumbnail and the loss aims to separate the positive pair from the negative by a distance margin. The thumbnails are tight crops of the face area; no 2D or 3D alignment, other than scale and translation, is performed.


Choosing which triplets to use turns out to be very important for achieving good performance and, inspired by curriculum learning [1], we present a novel online negative exemplar mining strategy which ensures consistently increasing difficulty of triplets as the network trains. To improve clustering accuracy, we also explore hard-positive mining techniques which encourage spherical clusters for the embeddings of a single person.


As an illustration of the incredible variability that our method can handle, see Figure 1. Shown are image pairs from PIE [13] that previously were considered to be very difficult for face verification systems.


Figure 1. Illumination and Pose invariance. Pose and illumination have been a long standing problem in face recognition. This figure shows the output distances of FaceNet between pairs of faces of the same and a different person in different pose and illumination combinations. A distance of 0.0 means the faces are identical, 4.0 corresponds to the opposite spectrum, two different identities. You can see that a threshold of 1.1 would classify every pair correctly.


An overview of the rest of the paper is as follows: in section 2 we review the literature in this area; section 3.1 defines the triplet loss and section 3.2 describes our novel triplet selection and training procedure; in section 3.3 we describe the model architecture used. Finally in section 4 and 5 we present some quantitative results of our embeddings and also qualitatively explore some clustering results.


2. Related Work

Similarly to other recent works which employ deep networks [15, 17], our approach is a purely data driven method which learns its representation directly from the pixels of the face. Rather than using engineered features, we use a large dataset of labelled faces to attain the appropriate invariances to pose, illumination, and other variational conditions.


In this paper we explore two different deep network architectures that have been recently used to great success in the computer vision community. Both are deep convolutional networks [8, 11]. The first architecture is based on the Zeiler&Fergus [22] model which consists of multiple interleaved layers of convolutions, non-linear activations, local response normalizations, and max pooling layers. We additionally add several 1*1*d convolution layers inspired by the work of [9]. The second architecture is based on the Inception model of Szegedy et al. which was recently used as the winning approach for ImageNet 2014 [16]. These networks use mixed layers that run several different convolutional and pooling layers in parallel and concatenate their responses. We have found that these models can reduce the number of parameters by up to 20 times and have the potential to reduce the number of FLOPS required for comparable performance.


There is a vast corpus of face verification and recognition works. Reviewing it is out of the scope of this paper, so we will only briefly discuss the most relevant recent work.


The works of [15, 17, 23] all employ a complex system of multiple stages, that combines the output of a deep convolutional network with PCA for dimensionality reduction and an SVM for classification.


Zhenyao et al. [23] employ a deep network to "warp" faces into a canonical frontal view and then learn a CNN that classifies each face as belonging to a known identity. For face verification, PCA on the network output in conjunction with an ensemble of SVMs is used.

Taigman et al. [17] propose a multi-stage approach that aligns faces to a general 3D shape model. A multi-class network is trained to perform the face recognition task on over four thousand identities. The authors also experimented with a so called Siamese network where they directly optimize the L1-distance between two face features. Their best performance on LFW (97.35%) stems from an ensemble of three networks using different alignments and color channels. The predicted distances (non-linear SVM predictions based on the χ² kernel) of those networks are combined using a non-linear SVM.


Sun et al. [14, 15] propose a compact and therefore relatively cheap to compute network. They use an ensemble of 25 of these networks, each operating on a different face patch. For their final performance on LFW (99.47% [15]) the authors combine 50 responses (regular and flipped). Both PCA and a Joint Bayesian model [2] that effectively correspond to a linear transform in the embedding space are employed. Their method does not require explicit 2D/3D alignment. The networks are trained by using a combination of classification and verification loss. The verification loss is similar to the triplet loss we employ [12, 19], in that it minimizes the L2-distance between faces of the same identity and enforces a margin between the distance of faces of different identities. The main difference is that only pairs of images are compared, whereas the triplet loss encourages a relative distance constraint.


A similar loss to the one used here was explored in Wang et al. [18] for ranking images by semantic and visual similarity.


3. Method

FaceNet uses a deep convolutional network. We discuss two different core architectures: the Zeiler&Fergus [22] style networks and the recent Inception [16] type networks. The details of these networks are described in section 3.3.


Given the model details, and treating it as a black box (see Figure 2), the most important part of our approach lies in the end-to-end learning of the whole system. To this end we employ the triplet loss that directly reflects what we want to achieve in face verification, recognition and clustering. Namely, we strive for an embedding f(x), from an image x into a feature space R^d, such that the squared distance between all faces, independent of imaging conditions, of the same identity is small, whereas the squared distance between a pair of face images from different identities is large.


Figure 2. Model structure. Our network consists of a batch input layer and a deep CNN followed by L2 normalization, which results in the face embedding. This is followed by the triplet loss during training.


Although we did not directly compare to other losses, e.g. the one using pairs of positives and negatives, as used in [14] Eq. (2), we believe that the triplet loss is more suitable for face verification. The motivation is that the loss from [14] encourages all faces of one identity to be projected onto a single point in the embedding space. The triplet loss, however, tries to enforce a margin between each pair of faces from one person to all other faces. This allows the faces for one identity to live on a manifold, while still enforcing the distance and thus discriminability to other identities.


The following section describes this triplet loss and how it can be learned efficiently at scale.


3.1. Triplet Loss

The embedding is represented by f(x) ∈ R^d. It embeds an image x into a d-dimensional Euclidean space. Additionally, we constrain this embedding to live on the d-dimensional hypersphere, i.e. ||f(x)||_2 = 1. This loss is motivated in [19] in the context of nearest-neighbor classification. Here we want to ensure that an image x_i^a (anchor) of a specific person is closer to all other images x_i^p (positive) of the same person than it is to any image x_i^n (negative) of any other person. This is visualized in Figure 3.

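As a small illustration of the hypersphere constraint, the raw network output can simply be projected onto the unit sphere by L2 normalization; this is a generic sketch, not code from the paper. A useful side effect: for unit-norm vectors the squared distance equals 2 - 2cos(θ), so it is bounded by 4.0, which matches the "opposite spectrum" value quoted in Figure 1.

```python
import numpy as np

def l2_normalize(x, eps=1e-10):
    # Project each row onto the d-dimensional unit hypersphere: ||f(x)||_2 = 1.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

raw = np.random.randn(4, 128)  # stand-in for raw network outputs
emb = l2_normalize(raw)
assert np.allclose(np.linalg.norm(emb, axis=1), 1.0)
```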

Figure 3. The Triplet Loss minimizes the distance between an anchor and a positive, both of which have the same identity, and maximizes the distance between the anchor and a negative of a different identity.


Thus we want,

$$\|f(x_i^a) - f(x_i^p)\|_2^2 + \alpha < \|f(x_i^a) - f(x_i^n)\|_2^2, \quad \forall \left(f(x_i^a), f(x_i^p), f(x_i^n)\right) \in T, \tag{1}$$

where α is a margin that is enforced between positive and negative pairs. T is the set of all possible triplets in the training set and has cardinality N.

The loss that is being minimized is then

$$L = \sum_i^N \Big[ \|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \Big]_+ .$$
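For reference, this loss is only a few lines of NumPy; a sketch, with the hinge [·]_+ implemented as max(·, 0) and the margin defaulting to the α = 0.2 used later in the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # anchor, positive, negative: (N, d) arrays of L2-normalized embeddings.
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)  # ||f(x_a) - f(x_p)||^2
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)  # ||f(x_a) - f(x_n)||^2
    return float(np.sum(np.maximum(pos_dist - neg_dist + alpha, 0.0)))
```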

Generating all possible triplets would result in many triplets that are easily satisfied (i.e. fulfill the constraint in Eq. (1)). These triplets would not contribute to the training and result in slower convergence, as they would still be passed through the network. It is crucial to select hard triplets, that are active and can therefore contribute to improving the model. The following section talks about the different approaches we use for the triplet selection.


3.2. Triplet Selection

In order to ensure fast convergence it is crucial to select triplets that violate the triplet constraint in Eq. (1). This means that, given $x_i^a$, we want to select an $x_i^p$ (hard positive) such that $\mathrm{argmax}_{x_i^p} \|f(x_i^a) - f(x_i^p)\|_2^2$, and similarly an $x_i^n$ (hard negative) such that $\mathrm{argmin}_{x_i^n} \|f(x_i^a) - f(x_i^n)\|_2^2$.


It is infeasible to compute the argmin and argmax across the whole training set. Additionally, it might lead to poor training, as mislabelled and poorly imaged faces would dominate the hard positives and negatives. There are two obvious choices that avoid this issue:


• Generate triplets offline every n steps, using the most recent network checkpoint and computing the argmin and argmax on a subset of the data.

• Generate triplets online. This can be done by selecting the hard positive/negative exemplars from within a mini-batch.


Here, we focus on the online generation and use large mini-batches in the order of a few thousand exemplars and only compute the argmin and argmax within a mini-batch.


To have a meaningful representation of the anchor-positive distances, it needs to be ensured that a minimal number of exemplars of any one identity is present in each mini-batch. In our experiments we sample the training data such that around 40 faces are selected per identity per mini-batch. Additionally, randomly sampled negative faces are added to each mini-batch.

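A possible shape for this sampling scheme is sketched below; the identity-to-images mapping and the exact counts are illustrative assumptions (45 identities × 40 faces gives roughly the 1,800-exemplar batches mentioned at the end of this section).

```python
import random

def sample_minibatch(images_by_identity, ids_per_batch=45, faces_per_id=40):
    # images_by_identity: dict mapping identity -> list of image paths (assumed).
    # Every other identity in the batch supplies the random negatives.
    batch = []
    for identity in random.sample(sorted(images_by_identity), ids_per_batch):
        imgs = images_by_identity[identity]
        chosen = random.sample(imgs, min(faces_per_id, len(imgs)))
        batch.extend((identity, img) for img in chosen)
    return batch
```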

Instead of picking the hardest positive, we use all anchor-positive pairs in a mini-batch while still selecting the hard negatives. We don't have a side-by-side comparison of hard anchor-positive pairs versus all anchor-positive pairs within a mini-batch, but we found in practice that the all anchor-positive approach was more stable and converged slightly faster at the beginning of training.


We also explored the offline generation of triplets in conjunction with the online generation and it may allow the use of smaller batch sizes, but the experiments were inconclusive.


Selecting the hardest negatives can in practice lead to bad local minima early on in training, specifically it can result in a collapsed model (i.e. f(x) = 0). In order to mitigate this, it helps to select $x_i^n$ such that

$$\|f(x_i^a) - f(x_i^p)\|_2^2 < \|f(x_i^a) - f(x_i^n)\|_2^2 .$$


We call these negative exemplars semi-hard, as they are further away from the anchor than the positive exemplar, but still hard because the squared distance is close to the anchor-positive distance. Those negatives lie inside the margin α.

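Within a mini-batch, picking a semi-hard negative for a given anchor-positive pair can be sketched as follows; the embedding and integer-label arrays are assumed inputs, and the fallback when no semi-hard candidate exists is a common practical choice rather than something the paper prescribes.

```python
import numpy as np

def semi_hard_negative(emb, labels, a, p, alpha=0.2):
    # emb: (B, d) embeddings of a mini-batch; labels: (B,) identity ids.
    # a, p: indices of an anchor-positive pair (labels[a] == labels[p]).
    pos_dist = np.sum((emb[a] - emb[p]) ** 2)
    neg_dist = np.sum((emb[a] - emb) ** 2, axis=1)
    # Semi-hard: farther from the anchor than the positive, but inside the margin.
    mask = (labels != labels[a]) & (neg_dist > pos_dist) & (neg_dist < pos_dist + alpha)
    candidates = np.where(mask)[0]
    if candidates.size == 0:
        return None  # in practice one might fall back to the hardest valid negative
    return int(candidates[np.argmin(neg_dist[candidates])])
```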

As mentioned before, correct triplet selection is crucial for fast convergence. On the one hand we would like to use small mini-batches as these tend to improve convergence during Stochastic Gradient Descent (SGD) [20]. On the other hand, implementation details make batches of tens to hundreds of exemplars more efficient. The main constraint with regards to the batch size, however, is the way we select hard relevant triplets from within the mini-batches. In most experiments we use a batch size of around 1,800 exemplars.


3.3. Deep Convolutional Networks

In all our experiments we train the CNN using Stochastic Gradient Descent (SGD) with standard backprop [8, 11] and AdaGrad [5]. In most experiments we start with a learning rate of 0.05 which we lower to finalize the model. The models are initialized from random, similar to [16], and trained on a CPU cluster for 1,000 to 2,000 hours. The decrease in the loss (and increase in accuracy) slows down drastically after 500h of training, but additional training can still significantly improve performance. The margin α is set to 0.2.

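In modern terms this recipe maps onto standard tooling. Below is a hedged PyTorch sketch with a toy stand-in for the CNN; note that `TripletMarginLoss` uses the plain (not squared) L2 distance, a small departure from the paper's formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEmbedder(nn.Module):
    # Toy stand-in for the CNN: any module producing L2-normalized 128-D embeddings.
    def __init__(self, d_in=1024, d_emb=128):
        super().__init__()
        self.fc = nn.Linear(d_in, d_emb)

    def forward(self, x):
        return F.normalize(self.fc(x), p=2, dim=1)  # ||f(x)||_2 = 1

model = ToyEmbedder()
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.05)  # starting rate from the paper
criterion = nn.TripletMarginLoss(margin=0.2)                  # alpha = 0.2

anchor, positive, negative = (torch.randn(32, 1024) for _ in range(3))  # dummy triplet batch
loss = criterion(model(anchor), model(positive), model(negative))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```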

We used two types of architectures and explore their trade-offs in more detail in the experimental section. Their practical differences lie in the difference of parameters and FLOPS. The best model may be different depending on the application. E.g. a model running in a data center can have many parameters and require a large number of FLOPS, whereas a model running on a mobile phone needs to have few parameters, so that it can fit into memory. All our models use rectified linear units as the non-linear activation function.


The first category, shown in Table 1, adds 1*1*d convolutional layers, as suggested in [9], between the standard convolutional layers of the Zeiler&Fergus [22] architecture and results in a model 22 layers deep. It has a total of 140 million parameters and requires around 1.6 billion FLOPS per image.


Table 1. NN1. Zeiler&Fergus [22] based model with 1*1 convolutions inspired by [9]. The input and output sizes are described in rows * cols * #filters. The kernel is specified as rows * cols, stride, and the maxout [6] pooling size as p = 2.


The second category we use is based on GoogLeNet style Inception models [16]. These models have 20x fewer parameters (around 6.6M-7.5M) and up to 5x fewer FLOPS (between 500M-1.6B). Some of these models are dramatically reduced in size (both depth and number of filters), so that they can be run on a mobile phone. One, NNS1, has 26M parameters and only requires 220M FLOPS per image. The other, NNS2, has 4.3M parameters and 20M FLOPS. Table 2 describes NN2, our largest network, in detail. NN3 is identical in architecture but has a reduced input size of 160x160. NN4 has an input size of only 96x96, thereby drastically reducing the CPU requirements (285M FLOPS vs 1.6B for NN2). In addition to the reduced input size it does not use 5x5 convolutions in the higher layers as the receptive field is already too small by then. Generally we found that the 5x5 convolutions can be removed throughout with only a minor drop in accuracy. Figure 4 compares all our models.


(Figure 4. FLOPS vs. Accuracy trade-off. The figure shows that there is a wide range of trade-offs between FLOPS and accuracy across the different model sizes and architectures. Highlighted are the four models that we focus on in our experiments.)

(Table 2. NN2. Details of the NN2 Inception incarnation. This model is very similar to the one described in [16]. The two major differences are the use of L2 pooling instead of max pooling (m) where specified, i.e. the L2 norm is computed instead of the spatial max. The pooling is always 3*3 (apart from the final average pooling) and in parallel to the convolutional modules inside each Inception module. If there is a dimensionality reduction after the pooling it is denoted with p. 1*1, 3*3, and 5*5 pooling are then concatenated to get the final output.)

4. Datasets and Evaluation

We evaluate our method on four datasets and with the exception of Labelled Faces in the Wild and YouTube Faces we evaluate our method on the face verification task. I.e. given a pair of two face images a squared L2 distance threshold D(x_i, x_j) is used to determine the classification of same and different. All face pairs (i, j) of the same identity are denoted with P_same, whereas all pairs of different identities are denoted with P_diff.


We define the set of all true accepts as

$$\mathrm{TA}(d) = \{(i, j) \in P_{same}, \ \text{with} \ D(x_i, x_j) \le d\}.$$

These are the face pairs (i, j) that were correctly classified as same at threshold d. Similarly,

$$\mathrm{FA}(d) = \{(i, j) \in P_{diff}, \ \text{with} \ D(x_i, x_j) \le d\}$$

is the set of all pairs that was incorrectly classified as same (false accept).

The validation rate VAL(d) and the false accept rate FAR(d) for a given face distance d are then defined as

$$\mathrm{VAL}(d) = \frac{|\mathrm{TA}(d)|}{|P_{same}|}, \qquad \mathrm{FAR}(d) = \frac{|\mathrm{FA}(d)|}{|P_{diff}|}.$$
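These definitions translate directly into a few lines of NumPy; a sketch assuming the squared distances and same-identity flags for all evaluated pairs have been precomputed.

```python
import numpy as np

def val_far(dist, is_same, d):
    # dist: (M,) squared L2 distances for all face pairs; is_same: (M,) booleans.
    accept = dist <= d                   # pairs classified as "same" at threshold d
    ta = np.sum(accept & is_same)        # |TA(d)|, true accepts
    fa = np.sum(accept & ~is_same)       # |FA(d)|, false accepts
    val = ta / max(np.sum(is_same), 1)   # VAL(d) = |TA(d)| / |P_same|
    far = fa / max(np.sum(~is_same), 1)  # FAR(d) = |FA(d)| / |P_diff|
    return val, far
```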

4.1. Hold-out Test Set

We keep a hold out set of around one million images, that has the same distribution as our training set, but disjoint identities. For evaluation we split it into five disjoint sets of 200k images each. The FAR and VAL rate are then computed on 100k x 100k image pairs. Standard error is reported across the five splits.


4.2. Personal Photos

This is a test set with similar distribution to our training set, but has been manually verified to have very clean labels. It consists of three personal photo collections with a total of around 12k images. We compute the FAR and VAL rate across all 12k squared pairs of images.


4.3. Academic Datasets

Labeled Faces in the Wild (LFW) is the de-facto academic test set for face verification [7]. We follow the standard protocol for unrestricted, labeled outside data and report the mean classification accuracy as well as the standard error of the mean.


Youtube Faces DB [21] is a new dataset that has gained popularity in the face recognition community [17, 15]. The setup is similar to LFW, but instead of verifying pairs of images, pairs of videos are used.


5. Experiments

If not mentioned otherwise we use between 100M-200M training face thumbnails consisting of about 8M different identities. A face detector is run on each image and a tight bounding box around each face is generated. These face thumbnails are resized to the input size of the respective network. Input sizes range from 96x96 pixels to 224x224 pixels in our experiments.


5.1. Computation Accuracy Trade-off

Before diving into the details of more specific experiments we will discuss the trade-off of accuracy versus number of FLOPS that a particular model requires. Figure 4 shows the FLOPS on the x-axis and the accuracy at 0.001 false accept rate on our user labelled test-data set from section 4.2. It is interesting to see the strong correlation between the computation a model requires and the accuracy it achieves. The figure highlights the five models (NN1, NN2, NN3, NNS1, NNS2) that we discuss in more detail in our experiments.


We also looked into the accuracy trade-off with regards to the number of model parameters. However, the picture is not as clear in that case. For example, the Inception based model NN2 achieves a comparable performance to NN1, but only has a 20th of the parameters. The number of FLOPS is comparable, though. Obviously at some point the performance is expected to decrease, if the number of parameters is reduced further. Other model architectures may allow further reductions without loss of accuracy, just like Inception [16] did in this case.


5.2. Effect of CNN Model

We now discuss the performance of our four selected models in more detail. On the one hand we have our traditional Zeiler&Fergus based architecture with 1x1 convolutions [22, 9] (see Table 1). On the other hand we have Inception [16] based models that dramatically reduce the model size. Overall, in the final performance the top models of both architectures perform comparably. However, some of our Inception based models, such as NN3, still achieve good performance while significantly reducing both the FLOPS and the model size.


The detailed evaluation on our personal photos test set is shown in Figure 5. While the largest model achieves a dramatic improvement in accuracy compared to the tiny NNS2, the latter can be run 30ms/image on a mobile phone and is still accurate enough to be used in face clustering. The sharp drop in the ROC for FAR < 10^-4 indicates noisy labels in the test data groundtruth. At extremely low false accept rates a single mislabeled image can have a significant impact on the curve.


(Figure 5. Network Architectures. This figure shows the complete ROC curves on the personal photos test set from section 4.2 for the four different models. The sharp drop at 10E-4 FAR can be explained by noise in the groundtruth labels. In order of performance the models are: NN2: 224x224 input Inception based model; NN1: Zeiler&Fergus based network with 1x1 convolutions; NNS1: small Inception style model with only 220M FLOPS; NNS2: tiny Inception model with only 20M FLOPS.)

5.3. Sensitivity to Image Quality

Table 4 shows the robustness of our model across a wide range of image sizes. The network is surprisingly robust with respect to JPEG compression and performs very well down to a JPEG quality of 20. The performance drop is very small for face thumbnails down to a size of 120x120 pixels and even at 80x80 pixels it shows acceptable performance. This is notable, because the network was trained on 220x220 input images. Training with lower resolution faces could improve this range further.


(Table 4. Image Quality. The table on the left shows the effect of varying JPEG quality on the validation rate at 10E-3 precision. The table on the right shows the effect of image size in pixels on the validation rate at 10E-3 precision. This experiment was run with NN1 on the hold-out dataset.)

5.4. Embedding Dimensionality

We explored various embedding dimensionalities and selected 128 for all experiments other than the comparison reported in Table 5. One would expect the larger embeddings to perform at least as good as the smaller ones, however, it is possible that they require more training to achieve the same accuracy. That said, the differences in the performance reported in Table 5 are statistically insignificant. It should be noted, that during training a 128 dimensional float vector is used, but it can be quantized to 128 bytes without loss of accuracy. Thus each face is compactly represented by a 128 dimensional byte vector, which is ideal for large scale clustering and recognition. Smaller embeddings are possible at a minor loss of accuracy and could be employed on mobile devices.

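The paper does not spell out its quantization scheme, so the sketch below, a simple linear per-dimension mapping of each float coordinate into one byte, is only an illustration of how a 128-D float vector can become 128 bytes (unit-norm embeddings have coordinates in [-1, 1]).

```python
import numpy as np

def quantize(emb, lo=-1.0, hi=1.0):
    # Map each float coordinate linearly into one byte (0..255): 128 dims -> 128 bytes.
    q = np.round((np.clip(emb, lo, hi) - lo) / (hi - lo) * 255)
    return q.astype(np.uint8)

def dequantize(q, lo=-1.0, hi=1.0):
    return q.astype(np.float32) / 255 * (hi - lo) + lo
```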

(Table 5. Embedding Dimensionality. This table compares the effect of the embedding dimensionality on our hold-out test set using model NN1. In addition to the VAL at 10E-3 we report the standard error of the mean computed across the five splits.)

5.5. Amount of Training Data

Table 6 shows the impact of large amounts of training data. Due to time constraints this evaluation was run on a smaller model; the effect may be even larger on larger models. It is clear that using tens of millions of exemplars results in a clear boost of accuracy on our personal photo test set from section 4.2. Compared to only millions of images the relative reduction in error is 60%. Using another order of magnitude more images (hundreds of millions) still gives a small boost, but the improvement tapers off.


(Table 6. Training Data Size. This table compares the performance after 700h of training for a smaller model with 96x96 pixel inputs. The model architecture is similar to NN2, but without the 5x5 convolutions in the Inception modules.)

5.6. Performance on LFW

We evaluate our model on LFW using the standard protocol for unrestricted, labeled outside data. Nine training splits are used to select the L2-distance threshold. Classification (same or different) is then performed on the tenth test split. The selected optimal threshold is 1.242 for all test splits except split eighth (1.256).

Our model is evaluated in two modes:

1. Fixed center crop of the LFW provided thumbnail.

2. A proprietary face detector (similar to Picasa [3]) is run on the provided LFW thumbnails. If it fails to align the face (this happens for two images), the LFW alignment is used.

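Selecting the optimal threshold on the nine training splits amounts to a one-dimensional search; in the sketch below the grid resolution and the plain-accuracy criterion are illustrative assumptions.

```python
import numpy as np

def best_threshold(dist, is_same, grid=np.arange(0.0, 4.0, 0.001)):
    # dist: squared L2 distances on the nine training splits; is_same: booleans.
    acc = [np.mean((dist <= d) == is_same) for d in grid]
    return float(grid[int(np.argmax(acc))])
```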

Figure 6 gives an overview of all failure cases. It shows false accepts on the top as well as false rejects at the bottom. We achieve a classification accuracy of 98.87%±0.15 when using the fixed center crop described in (1) and the record breaking 99.63%±0.09 standard error of the mean when using the extra face alignment (2). This reduces the error reported for Deep Face in [17] by more than a factor of 7 and the previous state-of-the-art reported for DeepId2+ in [15] by 30%. This is the performance of model NN1, but even the much smaller NN3 achieves performance that is not statistically significantly different.


(Figure 6. LFW errors. This shows all pairs of images that were incorrectly classified on LFW. Of the thirteen false rejects shown, only eight are actual errors; the other five are mislabeled in LFW.)

5.7. Performance on YouTube Faces DB

We use the average similarity of all pairs of the first one hundred frames that our face detector detects in each video. This gives us a classification accuracy of 95.12%±0.39. Using the first one thousand frames results in 95.18%. Compared to [17] (91.4%), who also evaluate one hundred frames per video, we reduce the error rate by almost half. DeepId2+ [15] achieved 93.2% and our method reduces this error by 30%, comparable to our improvement on LFW.


5.8. Face Clustering

Our compact embedding lends itself to be used in order to cluster a user's personal photos into groups of people with the same identity. The constraints in assignment imposed by clustering faces, compared to the pure verification task, lead to truly amazing results. Figure 7 shows one cluster in a user's personal photo collection, generated using agglomerative clustering. It is a clear showcase of the incredible invariance to occlusion, lighting, pose and even age.


(Figure 7. Face Clustering. Shown is an exemplar cluster for one user. All these images from the user's personal photo collection were clustered together.)
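Such a clustering is a one-liner with off-the-shelf tools; a sketch using scikit-learn's agglomerative clustering, where the distance threshold is an assumed value chosen in the spirit of the 1.1 verification threshold (here on plain, not squared, euclidean distance).

```python
from sklearn.cluster import AgglomerativeClustering

def cluster_faces(embeddings, threshold=1.0):
    # Group photos by identity without fixing the number of people in advance;
    # clusters keep merging while their average pairwise distance stays below threshold.
    clu = AgglomerativeClustering(n_clusters=None,
                                  distance_threshold=threshold,
                                  linkage="average")
    return clu.fit_predict(embeddings)
```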

6. Summary

We provide a method to directly learn an embedding into an Euclidean space for face verification. This sets it apart from other methods [15, 17] who use the CNN bottleneck layer, or require additional post-processing such as concatenation of multiple models and PCA, as well as SVM classification. Our end-to-end training both simplifies the setup and shows that directly optimizing a loss relevant to the task at hand improves performance.


Another strength of our model is that it only requires minimal alignment (tight crop around the face area). [17], for example, performs a complex 3D alignment. We also experimented with a similarity transform alignment and notice that this can actually improve performance slightly. It is not clear if it is worth the extra complexity.


Future work will focus on better understanding of the error cases, further improving the model, and also reducing model size and reducing CPU requirements. We will also look into ways of improving the currently extremely long training times, e.g. variations of our curriculum learning with smaller batch sizes and offline as well as online positive and negative mining.


7. Appendix: Harmonic Embedding

In this section we introduce the concept of harmonic embeddings. By this we denote a set of embeddings that are generated by different models v1 and v2 but are compatible in the sense that they can be compared to each other.


This compatibility greatly simplifies upgrade paths. E.g. in a scenario where embedding v1 was computed across a large set of images and a new embedding model v2 is being rolled out, this compatibility ensures a smooth transition without the need to worry about version incompatibilities. Figure 8 shows results on our 3G dataset. It can be seen that the improved model NN2 significantly outperforms NN1, while the comparison of NN2 embeddings to NN1 embeddings performs at an intermediate level.


(Figure 8. Harmonic Embedding Compatibility. These ROC curves show the harmonic embedding compatibility of NN2 embeddings to NN1 embeddings. NN2 is an improved model that performs much better than NN1. Comparing embeddings generated by NN1 to those generated by NN2 shows the compatibility of the two. In fact, the mixed-mode performance is still better than NN1 by itself.)

7.1. Harmonic Triplet Loss

In order to learn the harmonic embedding we mix embeddings of v1 together with the embeddings v2, that are being learned. This is done inside the triplet loss and results in additionally generated triplets that encourage the compatibility between the different embedding versions. Figure 9 visualizes the different combinations of triplets that contribute to the triplet loss.


(Figure 9. Learning the harmonic embedding. In order to learn a harmonic embedding, we generate triplets that mix the v1 embeddings with the v2 embeddings that are being trained. The semi-hard negatives are selected from the whole set of both v1 and v2 embeddings.)

We initialized the v2 embedding from an independently trained NN2 and retrained the last layer (embedding layer) from random initialization with the compatibility encouraging triplet loss. First only the last layer is retrained, then we continue training the whole v2 network with the harmonic loss.


Figure 10 shows a possible interpretation of how this compatibility may work in practice. The vast majority of v2 embeddings may be embedded near the corresponding v1 embedding, however, incorrectly placed v1 embeddings can be perturbed slightly such that their new location in embedding space improves verification accuracy.


(Figure 10. Harmonic Embedding Space. This illustrates a possible interpretation of how harmonic embeddings can improve verification accuracy while maintaining compatibility to less accurate embeddings. In this scenario there is one misclassified face whose embedding is perturbed to the "correct" location in v2.)

7.2. Summary

These are very interesting findings and it is somewhat surprising that it works so well. Future work can explore how far this idea can be extended. Presumably there is a limit as to how much the v2 embedding can improve over v1, while still being compatible. Additionally it would be interesting to train small networks that can run on a mobile phone and are compatible to a larger server side model.


I am a beginner in face recognition and my English is weak, so parts of this translation reflect my own shallow understanding; please point out anything that is translated poorly. Possibly due to the blog editor, some of the accompanying figures may be missing or overwritten in this post; for the complete version, please refer to my PDF analysis here. Thank you.
