1. XM2VTSDB multi-modal face database

(including high-quality colour images, 32 kHz 16-bit sound files, video sequences and a 3D model) Downloaded: XM2VTSDB multi-modal face database project

The images are stored in PPM (portable pixmap) format.
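A minimal sketch of reading such a file without third-party libraries is given here. It assumes the images are binary PPM (magic number `P6`) with 8-bit samples, which the notes above do not confirm; the function name `read_ppm_p6` is hypothetical.

```python
def read_ppm_p6(data: bytes):
    """Parse a binary PPM (magic number P6) with 8-bit samples.

    Returns (width, height, maxval, rows); rows is a list of rows,
    each row a list of (r, g, b) tuples.
    Assumption: P6 with maxval <= 255 (not confirmed for XM2VTSDB).
    """
    pos = 0
    tokens = []
    # Header: magic, width, height, maxval, separated by whitespace.
    # A '#' starts a comment that runs to the end of the line.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        if pos == start:
            raise ValueError("truncated PPM header")
        tokens.append(data[start:pos])

    magic = tokens[0]
    width, height, maxval = (int(t) for t in tokens[1:])
    if magic != b"P6":
        raise ValueError("not a binary PPM (expected magic number P6)")
    if not 0 < maxval < 256:
        raise ValueError("only 8-bit samples are handled in this sketch")

    pos += 1  # a single whitespace byte separates the header from the raster
    size = 3 * width * height
    raster = data[pos:pos + size]
    if len(raster) != size:
        raise ValueError("truncated pixel data")

    # Three bytes (R, G, B) per pixel, rows stored top to bottom.
    rows = [
        [tuple(raster[3 * (y * width + x):3 * (y * width + x) + 3])
         for x in range(width)]
        for y in range(height)
    ]
    return width, height, maxval, rows


# Usage with a tiny in-memory image: one red pixel and one green pixel.
sample = b"P6\n# example\n2 1\n255\n" + bytes([255, 0, 0, 0, 255, 0])
width, height, maxval, rows = read_ppm_p6(sample)
```

For a file on disk, the bytes could be obtained with `open(path, "rb").read()`, where `path` is a placeholder for wherever the images were unpacked.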

This is the home page for the XM2VTSDB multi-modal face database project. In this project a large multi-modal database was captured onto high-quality digital video. The XM2VTSDB contains four recordings of 295 subjects taken over a period of four months. Each recording contains a speaking head shot and a rotating head shot. Sets of data taken from this database are available, including high-quality colour images, 32 kHz 16-bit sound files, video sequences and a 3D model.

The database was acquired within the M2VTS project (Multi Modal Verification for Teleservices and Security applications), a part of the EU ACTS programme, which deals with access control by the use of multimodal identification of human faces. The goal of using a multimodal recognition scheme is to improve the recognition efficiency by combining single modalities, namely face and voice features.

The XM2VTSDB is being made available at cost price only - no benefits are expected from the distribution - we ask the end users to acknowledge the M2VTS project whenever this database is used (see the user agreement).
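
The entire database was captured with a Sony VX1000E digital camcorder and a DHR1000UX digital VCR. This setup captures video at a colour sampling resolution of 4:2:0 and 16-bit audio at a sampling frequency of 32 kHz. The hardware was chosen because it can be connected to a computer through a FireWire port. At present the only supported architecture is the PC, although SUN, SGI and DEC are all working on FireWire solutions.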
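
As a rough check on storage needs, the 32 kHz, 16-bit audio format implies the following uncompressed data rate. This is a sketch only: the number of channels is not stated in these notes, so it is left as a parameter defaulting to one, and the function name is hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


# 32 kHz sampling with 16-bit samples; one channel is an assumption.
rate = pcm_bytes_per_second(32_000, 16)
print(rate)  # 64000 bytes per second per channel
```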


2. Biological Cybernetics face database

https://vdb.kyb.tuebingen.mpg.de/ Registration not yet approved (institutional verification required)



The MOBIO database consists of bi-modal (audio and video) data taken from 152 people. The database has a female-male ratio of nearly 1:2 (100 males and 52 females) and was collected from August 2008 until July 2010 at six different sites in five different countries. This led to a diverse bi-modal database with both native and non-native English speakers.

In total 12 sessions were captured for each client: 6 sessions for Phase I and 6 sessions for Phase II. The Phase I data consists of 21 questions, with question types ranging over Short Response Questions, Short Response Free Speech, Set Speech, and Free Speech. The Phase II data consists of 11 questions, with question types ranging over Short Response Questions, Set Speech, and Free Speech.

The database was recorded using two mobile devices: a mobile phone and a laptop computer. The mobile phone was a NOKIA N93i and the laptop was a standard 2008 MacBook. The laptop was only used to capture part of the first session, so the first session consists of data captured on both the laptop and the mobile phone.

  1. Laboratory for Image & Video Engineering




YouTube Faces dataset (downloaded; see the documentation for details)

Welcome to YouTube Faces Database, a database of face videos designed for studying the problem of unconstrained face recognition in videos.
The data set contains 3,425 videos of 1,595 different people. All the videos were downloaded from YouTube. An average of 2.15 videos are available for each subject. The shortest clip duration is 48 frames, the longest clip is 6,070 frames, and the average length of a video clip is 181.3 frames.
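
The quoted per-subject average can be checked directly from the counts above. The total frame count is only an estimate derived from the rounded average clip length, not a figure given by the dataset authors.

```python
videos = 3425        # number of video clips
subjects = 1595      # number of distinct people
avg_frames = 181.3   # average clip length in frames, as quoted above

videos_per_subject = videos / subjects
print(f"{videos_per_subject:.2f}")  # 2.15

# Rough total number of frames (an estimate, since avg_frames is rounded).
estimated_total_frames = videos * avg_frames
print(round(estimated_total_frames))
```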


Downloaded: Face Recognition in Unconstrained Videos with Matched Background Similarity (to read)




7. IMFDB Indian database


Indian Movie Face database (IMFDB) is a large unconstrained face database consisting of 34512 images of 100 Indian actors collected from more than 100 videos. All the images are manually selected and cropped from the video frames, resulting in a high degree of variability in terms of scale, pose, expression, illumination, age, resolution, occlusion, and makeup. IMFDB is the first face database that provides a detailed annotation of every image in terms of age, pose, gender, expression and type of occlusion that may help other face-related applications.

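Expression: anger, happiness, sadness, surprise, fear, disgust
Illumination: bad, medium, high
Pose: frontal, left, right, up, down
Occlusion: glasses, beard, ornaments, hair, hand, none, others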

8. Labeled Faces in the Wild (LFW database)


9. BU-3DFE

(Binghamton University 3D Facial Expression) Database (Static Data)
Analyzing Facial Expressions in Three Dimensional Space

10. Students-in-class database

This IRTT student video database contains one video in .mp4 format; more videos will be added later. The video is 55.938 seconds long and contains 30 frames with a resolution of 720x1280. It was captured with a smartphone. The faces and other features such as eyes, lips and nose are extracted from this video separately. Some of the faces detected in the video database are shown in Fig


11. SCface - Surveillance Cameras Face Database

Summary: SCface is a database of static images of human faces. Images were taken in an uncontrolled indoor environment using five video surveillance cameras of various qualities. The database contains 4160 static images (in the visible and infrared spectrum) of 130 subjects. Images from cameras of different quality mimic real-world conditions and enable robust testing of face recognition algorithms, emphasizing different law enforcement and surveillance use case scenarios. The SCface database is available to the research community.

12. McGillFaces Database


This database contains 18000 video frames of 640x480 resolution from 60 video sequences, each recorded from a different subject (31 female and 29 male). Each video was collected in a different environment (indoor or outdoor), resulting in arbitrary illumination conditions and background clutter. Furthermore, the subjects were completely free in their movements, leading to arbitrary face scales, arbitrary facial expressions, head pose (in yaw, pitch and roll), motion blur, and local or global occlusions.
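
For planning disk space, the figures above give a simple size estimate. This sketch assumes the frames are decoded to uncompressed 24-bit RGB, which is an assumption for illustration; the actual distribution format is not stated here.

```python
frames = 18000
sequences = 60
width, height = 640, 480
bytes_per_pixel = 3  # assumption: uncompressed 24-bit RGB

frames_per_sequence = frames // sequences          # 300
bytes_per_frame = width * height * bytes_per_pixel # 921600
total_bytes = frames * bytes_per_frame             # 16588800000

print(frames_per_sequence, bytes_per_frame, total_bytes)
```

Under that assumption the decoded frames occupy roughly 16.6 GB, so loading sequences one at a time is likely more practical than holding the whole set in memory.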


