by Sigurður Skúli


Making your own Face Recognition System

Face recognition is the latest trend when it comes to user authentication. Apple recently launched the iPhone X, which uses Face ID to authenticate users. The OnePlus 5 is getting the Face Unlock feature from the OnePlus 5T soon. And Baidu is using face recognition instead of ID cards to let employees enter its offices. These applications may seem like magic to a lot of people, but in this article we aim to demystify the subject by teaching you how to make your own simplified version of a face recognition system in Python.

Github link for those who do not like reading and only want the code


Background

Before we get into the details of the implementation, I want to discuss FaceNet, which is the network we will be using in our system.

FaceNet

FaceNet is a neural network that learns a mapping from face images to a compact Euclidean space where distances correspond to a measure of face similarity. That is to say, the more similar two face images are, the smaller the distance between them.

Triplet Loss

FaceNet uses a distinct loss function called Triplet Loss. Triplet Loss minimises the distance between an anchor and a positive (an image that contains the same identity as the anchor) and maximises the distance between the anchor and a negative (an image that contains a different identity).

  • f(a) refers to the output encoding of the anchor
  • f(p) refers to the output encoding of the positive
  • f(n) refers to the output encoding of the negative
  • alpha is a constant margin that makes sure the network does not try to optimise towards the trivial solution f(a) - f(p) = f(a) - f(n) = 0
  • […]+ is equal to max(0, sum), where sum is the expression inside the brackets

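Written out with the symbols from the list above, the Triplet Loss over a batch of triplets takes the following form. This is a sketch that matches the triplet_loss function used later in this article; the triplet index i and the batch size N are notation introduced here for illustration.

```latex
\mathcal{L} = \sum_{i=1}^{N} \Big[ \lVert f(a_i) - f(p_i) \rVert_2^2 - \lVert f(a_i) - f(n_i) \rVert_2^2 + \alpha \Big]_+
```

A triplet contributes nothing to the loss once the negative is further from the anchor than the positive by at least alpha, measured in squared distance.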

Siamese Networks

FaceNet is a Siamese Network. A Siamese Network is a type of neural network architecture that learns how to differentiate between two inputs. This allows it to learn which images are similar and which are not. These images could contain faces.

Siamese networks consist of two identical neural networks with exactly the same weights. First, each network takes one of the two input images as input. Then, the outputs of the last layer of each network are sent to a function that determines whether the images contain the same identity.

In FaceNet, this is done by calculating the distance between the two outputs.
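
The idea can be sketched in a few lines of NumPy. This is only a toy illustration, not FaceNet itself: the single linear layer, the names embed and siamese_distance, and the 8-value "images" are all assumptions made up for the example. The point it shows is that both inputs pass through the same weights and the two encodings are then compared by distance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained network: a single linear layer
# mapping a flattened 8-value "image" to a 4-value encoding.
shared_weights = rng.normal(size=(8, 4))


def embed(image, weights):
    """Encode an input with the given weights and L2-normalise the result."""
    encoding = image @ weights
    return encoding / np.linalg.norm(encoding)


def siamese_distance(image_a, image_b, weights):
    """Run both inputs through the same weights and compare the encodings."""
    diff = embed(image_a, weights) - embed(image_b, weights)
    return float(np.linalg.norm(diff))


image = rng.normal(size=8)
near_copy = image + 0.01 * rng.normal(size=8)  # almost the same input
unrelated = rng.normal(size=8)                 # an independent input

d_same = siamese_distance(image, image, shared_weights)
d_near = siamese_distance(image, near_copy, shared_weights)
d_other = siamese_distance(image, unrelated, shared_weights)
print(d_same, d_near, d_other)
```

Because the encodings are normalised to unit length, every distance falls between 0 and 2, and an input compared with itself gives exactly 0.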


Implementation

Now that we have clarified the theory, we can jump straight into the implementation.


In our implementation we’re going to be using Keras and TensorFlow. Additionally, we’re using two utility files that we got from deeplearning.ai’s repo to abstract all interactions with the FaceNet network:

  • fr_utils.py contains functions to feed images to the network and get the encoding of images
  • inception_blocks_v2.py contains functions to prepare and compile the FaceNet network


Compiling the FaceNet network

The first thing we have to do is compile the FaceNet network so that we can use it for our face recognition system.


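import os
import glob
import numpy as np
import cv2
import tensorflow as tf
from fr_utils import *
from inception_blocks_v2 import *
from keras import backend as K

# Assumption: Keras defaults to channels-last images, so switch to
# channels-first to match the (3, 96, 96) input shape used below.
K.set_image_data_format('channels_first')

FRmodel = faceRecoModel(input_shape=(3, 96, 96))

def triplet_loss(y_true, y_pred, alpha=0.3):
    # y_pred holds the anchor, positive and negative encodings, in that order.
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]

    # Squared distances anchor-positive and anchor-negative.
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), axis=-1)

    # Add the margin, clamp negative values to zero and sum over the batch.
    basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0.0))

    return loss

FRmodel.compile(optimizer='adam', loss=triplet_loss, metrics=['accuracy'])
load_weights_from_FaceNet(FRmodel)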

We’ll start by initialising our network with an input shape of (3, 96, 96). That means that the Red-Green-Blue (RGB) channels are the first dimension of the image volume fed to the network, and that all images fed to the network must be 96x96 pixels.

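As a rough illustration of the (3, 96, 96) layout, the sketch below converts an image stored as height x width x channels into a channels-first batch. The function name to_network_input and the scaling of pixel values to [0, 1] are assumptions made for this example; in this article the utility files take care of feeding images to the network.

```python
import numpy as np


def to_network_input(image_hwc):
    """Convert a 96x96 RGB image (height, width, channels) to shape (1, 3, 96, 96)."""
    if image_hwc.shape != (96, 96, 3):
        raise ValueError("expected a 96x96 RGB image, got %s" % (image_hwc.shape,))
    scaled = image_hwc.astype(np.float32) / 255.0     # pixel values to [0, 1]
    channels_first = np.transpose(scaled, (2, 0, 1))  # (96, 96, 3) -> (3, 96, 96)
    return channels_first[np.newaxis, ...]            # add a batch dimension


# A synthetic all-white image stands in for a real photo.
dummy = np.full((96, 96, 3), 255, dtype=np.uint8)
batch = to_network_input(dummy)
print(batch.shape)  # (1, 3, 96, 96)
```

An image of any other size would have to be resized to 96x96 before this step.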

Next we’ll define the Triplet Loss function. The function in the code snippet above follows the Triplet Loss equation that we described in the previous section.

If you are unfamiliar with any of the TensorFlow functions used to perform the calculation, I’d recommend reading their documentation, as it will improve your understanding of the code. That said, comparing the function to the equation in Figure 1 should be enough.

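To see the arithmetic without TensorFlow, here is a small NumPy version of the same calculation applied to two made-up triplets. The name triplet_loss_np and the 2-value encodings are assumptions for illustration only; the steps mirror the triplet_loss function above.

```python
import numpy as np


def triplet_loss_np(anchor, positive, negative, alpha=0.3):
    """NumPy version of the loss: sum over triplets of max(pos - neg + alpha, 0)."""
    pos_dist = np.sum(np.square(anchor - positive), axis=-1)
    neg_dist = np.sum(np.square(anchor - negative), axis=-1)
    return float(np.sum(np.maximum(pos_dist - neg_dist + alpha, 0.0)))


# Two made-up triplets of 2-value encodings (real encodings have 128 values).
anchor = np.array([[0.0, 0.0], [1.0, 0.0]])
positive = np.array([[0.1, 0.0], [1.0, 0.5]])
negative = np.array([[1.0, 0.0], [1.0, 0.6]])

loss = triplet_loss_np(anchor, positive, negative)
print(loss)
```

In the first triplet the negative is already far away (0.01 - 1.0 + 0.3 is negative), so it contributes 0. In the second triplet the negative is barely further than the positive (0.25 - 0.36 + 0.3 = 0.19), so it contributes roughly 0.19, which is the total loss.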

Once we have our loss function, we can compile our face recognition model using Keras. We’ll use the Adam optimizer to minimise the loss calculated by the Triplet Loss function.

Preparing a Database

Now that we have compiled FaceNet, we are going to prepare a database of individuals we want our system to recognise. We are going to use all the images contained in our images directory for our database of individuals.

NOTE: We are only going to use one image of each individual in our implementation. The reason is that the FaceNet network is powerful enough to only need one image of an individual to recognise them!

def prepare_database():
    database = {}

    # Use the file name (without extension) as the person's identity.
    for file in glob.glob("images/*"):
        identity = os.path.splitext(os.path.basename(file))[0]
        database[identity] = img_path_to_encoding(file, FRmodel)

    return database

For each image, we will convert the image data to an encoding of 128 floating-point numbers. We do this by calling the function img_path_to_encoding. The function takes in a path to an image and feeds the image to our face recognition network. Then, it returns the output from the network, which is the encoding of the image.

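The way identities are derived from file names can be tried out without the network. In the sketch below, identity_from_path, build_database, fake_encode and the two file names are hypothetical stand-ins; only the os.path calls are the same as in prepare_database above.

```python
import os


def identity_from_path(path):
    """Return the file name without directory or extension, e.g. the person's name."""
    return os.path.splitext(os.path.basename(path))[0]


def build_database(paths, encode):
    """Map each identity to its encoding; `encode` is any path -> encoding callable."""
    return {identity_from_path(p): encode(p) for p in paths}


def fake_encode(path):
    """Fake encoder so the sketch runs without FaceNet: 128 identical floats."""
    return [float(len(path))] * 128


# Hypothetical file names, one image per person.
paths = ["images/anna.jpg", "images/bob.png"]
database = build_database(paths, fake_encode)
print(sorted(database))  # ['anna', 'bob']
```

Since the file name becomes the dictionary key, two images with the same base name would overwrite each other, which fits the one-image-per-person setup used here.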

Once we have added the encoding for each image to our database, our system can finally start recognising individuals!


Recognising a Face

As discussed in the Background section, FaceNet is trained to minimise the distance between images of the same individual and maximise the distance between images of different individuals. Our implementation uses this information to determine which individual the new image fed to our system is most likely to be.

def who_is_it(image, database, model):
    encoding = img_to_encoding(image, model)

    min_dist = 100
    identity = None

    # Loop over the database dictionary's names and encodings.
    for (name, db_enc) in database.items():
        dist = np.linalg.norm(db_enc - encoding)
        print('distance for %s is %s' % (name, dist))

        if dist < min_dist:
            min_dist = dist
            identity = name

    if min_dist > 0.52:
        return None
    else:
        return identity

The function above feeds the new image into a utility function called img_to_encoding. The function processes an image using FaceNet and returns the encoding of the image. Now that we have the encoding we can find the individual that the image most likely belongs to.

To find the individual, we go through our database and calculate the distance between our new image and each individual in the database. The individual with the lowest distance to the new image is then chosen as the most likely candidate.

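The same search can also be written without an explicit loop. The sketch below is an alternative formulation, not the code from this article: nearest_identity is a hypothetical name, and the 3-value encodings are made up so the result is easy to check by hand.

```python
import numpy as np


def nearest_identity(encoding, database):
    """Return (name, distance) of the database entry closest to `encoding`."""
    names = list(database)
    stacked = np.stack([database[name] for name in names])  # shape (n, d)
    distances = np.linalg.norm(stacked - encoding, axis=1)  # one distance per person
    best = int(np.argmin(distances))
    return names[best], float(distances[best])


# Made-up 3-value encodings; real ones come from the network and have 128 values.
database = {
    "anna": np.array([1.0, 0.0, 0.0]),
    "bob": np.array([0.0, 1.0, 0.0]),
}
name, dist = nearest_identity(np.array([0.9, 0.1, 0.0]), database)
print(name, round(dist, 3))  # anna 0.141
```

For a handful of people the loop in who_is_it is just as good; stacking the encodings mainly pays off when the database grows.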

Finally, we must determine whether the candidate image and the new image contain the same person or not, since by the end of our loop we have only determined the most likely individual. This is where the following code snippet comes into play.

if min_dist > 0.52:
    return None
else:
    return identity

  • If the distance is above 0.52, then we determine that the individual in the new image does not exist in our database.
  • But, if the distance is equal to or below 0.52, then we determine they are the same individual!

Now the tricky part here is that I arrived at the value 0.52 through trial and error on my specific dataset. The best value might be much lower or slightly higher, and it will depend on your implementation and data. I recommend trying out different values and seeing what fits your system best!

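One possible way to make the trial and error a bit more systematic is to collect distances for pairs you know are the same person and pairs you know are different people, then check which threshold separates them best. The sketch below assumes that setup; best_threshold is a hypothetical helper and all the distances are made up for illustration.

```python
def best_threshold(same_dists, diff_dists, candidates):
    """Return (threshold, accuracy) with the highest accuracy on labelled distances."""
    total = len(same_dists) + len(diff_dists)
    best = (None, -1.0)
    for t in candidates:
        correct = sum(d <= t for d in same_dists)   # same person, accepted
        correct += sum(d > t for d in diff_dists)   # different person, rejected
        accuracy = correct / total
        if accuracy > best[1]:
            best = (t, accuracy)
    return best


# Made-up distances for illustration only; measure these on your own images.
same_person = [0.31, 0.40, 0.48, 0.55]
different_people = [0.50, 0.62, 0.71, 0.90]
candidates = [0.40, 0.45, 0.50, 0.55, 0.60]

threshold, accuracy = best_threshold(same_person, different_people, candidates)
print(threshold, accuracy)  # 0.55 0.875
```

The decision rule matches the snippet above: a distance at or below the threshold counts as the same person, anything above it is rejected. On ties the first candidate wins, so the order of the candidates matters.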

Building a System using Face Recognition

Now that we know the details on how we recognise a person using a face recognition algorithm, we can start having some fun with it.


In the Github repository I linked to at the beginning of this article is a demo that uses a laptop’s webcam to feed video frames to our face recognition algorithm. Once the algorithm recognises an individual in the frame, the demo plays an audio message that welcomes the user using the name of their image in the database. Figure 3 shows an example of the demo in action.

Conclusion

By now you should be familiar with how face recognition systems work and how to make your own simplified face recognition system in Python using a pre-trained version of the FaceNet network!

If you want to play around with the demonstration in the Github repository and add images of people you know then go ahead and fork the repository.


Have some fun with the demonstration and impress all your friends with your awesome knowledge of face recognition!



