Face Alignment by 3000 FPS: Study Notes Series (Part 2)
Preparing the initial data
mean_shape
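mean_shape is the average of the ground-truth landmark points over all training images. How is it computed in practice? Can we simply add up the landmarks and take the mean?
Doing that would be hasty and inaccurate. Faces differ widely from image to image and are affected by lighting, pose, and other factors, so the average should be taken in a common, normalized frame. The MATLAB code is given first: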
function mean_shape = calc_meanshape(shapepathlistfile)
% Compute the mean shape from a list of .pts files (one path per line).
fid = fopen(shapepathlistfile);
if fid == -1
    error('cannot open shape list file');
end
shapepathlist = textscan(fid, '%s', 'delimiter', '\n');
fclose(fid);

% textscan returns a 1x1 cell, so test the inner cell for emptiness
if isempty(shapepathlist{1})
    error('no shape file found');
end

shape_header = loadshape(shapepathlist{1}{1});
if isempty(shape_header)
    error('invalid shape file');
end

mean_shape = zeros(size(shape_header));
num_shapes = 0;
for i = 1:length(shapepathlist{1})
    shape_i = double(loadshape(shapepathlist{1}{i}));
    if isempty(shape_i)
        continue;
    end
    shape_min = min(shape_i, [], 1);
    shape_max = max(shape_i, [], 1);
    % translate to origin point
    shape_i = bsxfun(@minus, shape_i, shape_min);
    % resize shape
    shape_i = bsxfun(@rdivide, shape_i, shape_max - shape_min);
    mean_shape = mean_shape + shape_i;
    num_shapes = num_shapes + 1;
end
mean_shape = mean_shape ./ num_shapes;

% draw the mean shape on a blank 500x500 canvas
img = 255 * ones(500, 500, 3);
drawshapes(img, 50 + 400 * mean_shape);
end

function shape = loadshape(path)
% function: load shape from pts file
file = fopen(path);
if file == -1
    shape = [];
    return;   % fopen failed, so there is no handle to close
end
shape = textscan(file, '%d16 %d16', 'HeaderLines', 3, 'CollectOutput', 2);
fclose(file);
shape = shape{1};
end
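Explanation:
Each shape is first translated so that its bounding box starts at the origin and is then divided by the box width and height, so every shape lies in the unit square before averaging. As a formula, with Region = [x_min, y_min, width, height] of the landmark bounding box (shape_min and shape_max - shape_min in the code):
\{shape_{gt}-[Region(1),Region(2)]\}/[Region(3),Region(4)] \Rightarrow [0,1]\times[0,1]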
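As a quick check of the formula above, the following minimal sketch normalizes a made-up four-point shape with the same min/max steps used in calc_meanshape. The names shape, region and shape_norm are illustrative and do not appear in the original code.

```matlab
% Hypothetical 4-point shape, one (x, y) landmark per row, in pixels.
shape = [120 80; 220 80; 120 230; 220 230];

% Region = [x_min, y_min, width, height] of the landmark bounding box.
shape_min = min(shape, [], 1);
shape_max = max(shape, [], 1);
region = [shape_min, shape_max - shape_min];

% Translate to the origin, then divide by width and height.
shape_norm = bsxfun(@rdivide, bsxfun(@minus, shape, region(1:2)), region(3:4));
disp(shape_norm);
```

The four corners of the box map to the four corners of the unit square, which is the frame in which the shapes are averaged.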
Preparing \Delta S^t
We know that the core idea of 3000FPS is:
\Delta S^t=W^t\Phi^t(I,S^{t-1})
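Here \Delta S^t=S^{gt}-S^{t-1} is the residual that stage t has to regress, \Phi^t(I,S^{t-1}) is the feature extraction function, and W^t is the linear regression matrix. From the article 《人脸配准坐标变换解析》 (an analysis of coordinate transformations in face alignment) we can see that \Delta S^t must undergo a similarity transformation, while \Phi^t(I,S^{t-1}) does not need one.
The main steps of the similarity transformation are:
First center the current shape S^{t-1} and the mean shape S^0 (subtract each shape's mean), then solve for the transformation in:
S^0=cRS^{t-1}. Once cR has been solved, apply the same transformation to \Delta S^t, i.e.
\widetilde{\Delta S^t}=cR\Delta S^t. We then use the transformed \widetilde{\Delta S^t} to solve for the linear regression matrix W^t.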
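The steps above can be sketched without any toolbox. This is only an illustration under stated assumptions: the shapes are made up, the closed-form least-squares fit below stands in for fitgeotrans with 'NonreflectiveSimilarity', and all variable names (S_prev, S0, cR, dS_tilde) are hypothetical.

```matlab
% Ground-truth similarity used to fabricate the data for this sketch.
theta = pi / 6;
c_true = 0.5;
R_true = [cos(theta) -sin(theta); sin(theta) cos(theta)];

% Made-up current shape (rows are landmarks) and its centered version P.
S_prev = [10 20; 50 22; 48 60; 12 58; 30 40];
P = bsxfun(@minus, S_prev, mean(S_prev, 1));

% Fabricated mean shape: centered S0 equals c*R applied to centered S_prev.
S0 = bsxfun(@plus, P * (c_true * R_true)', [0.5 0.5]);
Q = bsxfun(@minus, S0, mean(S0, 1));

% Closed-form least-squares fit of q = cR * p with cR = [a -b; b a].
den = sum(P(:) .^ 2);
a = sum(P(:, 1) .* Q(:, 1) + P(:, 2) .* Q(:, 2)) / den;
b = sum(P(:, 1) .* Q(:, 2) - P(:, 2) .* Q(:, 1)) / den;
cR = [a -b; b a];

% Apply the same transformation to a made-up residual (no translation,
% because a residual is a difference of two shapes).
S_gt = S_prev + [1 -2; 0 3; -1 1; 2 2; 0 0];
dS = S_gt - S_prev;
dS_tilde = dS * cR';
disp(cR);
```

Only the scale-and-rotation part is applied to the residual. Because both shapes are centered before the fit, the translation component is zero anyway.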
Code first: train_model.m, starting at line 103
Param.meanshape = S0(Param.ind_usedpts, :); % select the landmarks that are used
dbsize = length(Data);
% load('Ts_bbox.mat');
augnumber = Param.augnumber; % number of initial shapes chosen for each face
for i = 1:dbsize
    % initialize the shape of current face image by randomly selecting multiple shapes from other face images
    % indice = ceil(dbsize*rand(1, augnumber));
    indice_rotate = ceil(dbsize*rand(1, augnumber));
    indice_shift = ceil(dbsize*rand(1, augnumber));
    scales = 1 + 0.2*(rand([1 augnumber]) - 0.5);

    Data{i}.intermediate_shapes = cell(1, Param.max_numstage); % intermediate shapes
    Data{i}.intermediate_bboxes = cell(1, Param.max_numstage);
    Data{i}.intermediate_shapes{1} = zeros([size(Param.meanshape), augnumber]); % 68*2*augnumber (augnumber = number of initial shapes for image i)
    Data{i}.intermediate_bboxes{1} = zeros([augnumber, size(Data{i}.bbox_gt, 2)]); % augnumber*4
    Data{i}.shapes_residual = zeros([size(Param.meanshape), augnumber]); % shape residuals, size 68*2*augnumber
    Data{i}.tf2meanshape = cell(augnumber, 1);
    Data{i}.meanshape2tf = cell(augnumber, 1);

    % if Data{i}.isdet == 1
    %     Data{i}.bbox_facedet = Data{i}.bbox_facedet*ts_bbox;
    % end

    % If augnumber = 1, each image has a single initial shape, which is simply
    % the mean shape. Data{i}.tf2meanshape{1} is then the identity transform,
    % because it maps the mean shape onto the mean shape. The remaining
    % initial shapes (augnumber > 1) are generated by translation, rotation
    % and scaling; they could also be picked from the shapes of other images.
    for sr = 1:params.augnumber
        if sr == 1
            % estimate the similarity transformation from initial shape to mean shape
            % Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_gt, Param.meanshape);
            % Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_gt;
            Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_facedet, Param.meanshape);
            Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_facedet;

            % reproject the mean shape onto the face detection bbox
            % (meanshape_resize is identical to Data{i}.intermediate_shapes{1}(:,:, sr))
            meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);

            % similarity transformation between the current shape and the mean shape
            Data{i}.tf2meanshape{1} = fitgeotrans( ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
                (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
                'NonreflectiveSimilarity');
            Data{i}.meanshape2tf{1} = fitgeotrans( ...
                (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
                'NonreflectiveSimilarity');

            % calculate the residual shape from initial shape to groundtruth shape under normalization scale
            shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), ...
                [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);
            % transform the shape residual in the image coordinate to the mean shape coordinate
            [u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
            Data{i}.shapes_residual(:, 1, 1) = u';
            Data{i}.shapes_residual(:, 2, 1) = v';
        else
            % randomly rotate the shape
            % shape = resetshape(Data{i}.bbox_gt, Param.meanshape); % Data{indice_rotate(sr)}.shape_gt
            shape = resetshape(Data{i}.bbox_facedet, Param.meanshape); % Data{indice_rotate(sr)}.shape_gt

            % build a new initial shape from the randomly chosen scale, rotation
            % and translation, then project it onto the bbox
            if params.augnumber_scale ~= 0
                shape = scaleshape(shape, scales(sr));
            end
            if params.augnumber_rotate ~= 0
                shape = rotateshape(shape);
            end
            if params.augnumber_shift ~= 0
                shape = translateshape(shape, Data{indice_shift(sr)}.shape_gt);
            end

            Data{i}.intermediate_shapes{1}(:, :, sr) = shape;
            Data{i}.intermediate_bboxes{1}(sr, :) = getbbox(shape);
            meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);

            Data{i}.tf2meanshape{sr} = fitgeotrans( ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                'NonreflectiveSimilarity');
            Data{i}.meanshape2tf{sr} = fitgeotrans( ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), ...
                'NonreflectiveSimilarity');

            shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, sr), ...
                [Data{i}.intermediate_bboxes{1}(sr, 3) Data{i}.intermediate_bboxes{1}(sr, 4)]);
            % note: the quoted code passes tf2meanshape{1} here, not tf2meanshape{sr}
            [u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
            Data{i}.shapes_residual(:, 1, sr) = u';
            Data{i}.shapes_residual(:, 2, sr) = v';
            % Data{i}.shapes_residual(:, :, sr) = tformfwd(Data{i}.tf2meanshape{sr}, shape_residual(:, 1), shape_residual(:, 2));
        end
    end
end
Understanding this code requires the article 《人脸配准坐标变换解析》 mentioned above.
According to that article,
\left.\begin{matrix}\overline{S_0}&=S_0-mean(S_0)\\ \overline{S_1}&=S_1-mean(S_1) \end{matrix}\right\}\Rightarrow \overline{S_0}=c_1R_1\overline{S_1}
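Therefore, from
\Delta S=S_g-S_1
we obtain
\widetilde{\Delta S}=c_1R_1\Delta S
The situation here is a little special, though, and needs one more step.
From:
% reproject the mean shape onto the face detection bbox
meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);
and the definition of resetshape, we see that meanshape is mapped into intermediate_bboxes, so that S_0 and S_1 share the same scale and sit at roughly the same position. In mathematical terms:
S_0\_Resize=S_0*Ratio+[Region(1),Region(2)] where Ratio is in fact the size of intermediate_bboxes.
We then proceed in the same way as above: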
\widetilde{S_0}=S_0\_Resize-mean(S_0\_Resize)=S_0*Ratio-mean(S_0)*Ratio=(S_0-mean(S_0))*Ratio= \overline{S_0}*Ratio
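The centering identity above is easy to confirm numerically. The sketch below models resetshape by the formula S_0*Ratio+[Region(1),Region(2)] given in the text; the actual resetshape source is not shown in this article, so treat that mapping, the bbox values and all variable names as assumptions.

```matlab
% Made-up normalized mean shape (rows are landmarks in [0,1] x [0,1]).
S0 = [0.2 0.3; 0.8 0.3; 0.5 0.6; 0.3 0.8; 0.7 0.8];

% Hypothetical bbox as [x, y, width, height]; Ratio is its size.
bbox = [40 60 200 200];
ratio = bbox(3:4);

% Assumed model of resetshape: scale by the bbox size, shift to its corner.
S0_resize = bsxfun(@plus, bsxfun(@times, S0, ratio), bbox(1:2));

% Left side: center the resized shape. Right side: center first, then scale.
lhs = bsxfun(@minus, S0_resize, mean(S0_resize, 1));
rhs = bsxfun(@times, bsxfun(@minus, S0, mean(S0, 1)), ratio);

max_diff = max(abs(lhs(:) - rhs(:)));
fprintf('max difference: %g\n', max_diff);
```

The bbox offset drops out when the mean is subtracted, which is why only Ratio survives in the formula.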
The computation gives \widetilde{S_0}=Ratio*\overline{S_0}=\widetilde{c_1}\widetilde{R_1}\overline{S_1}. (\bigstar)
This is exactly the code shown earlier:
Data{i}.tf2meanshape{1} = fitgeotrans( ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
    'NonreflectiveSimilarity');
Data{i}.tf2meanshape{1} is the \widetilde{c_1}\widetilde{R_1} computed here.
What we actually want, however, is \overline{S_0}=c_1R_1\overline{S_1}. No need to worry: (\bigstar) points the way.
c_1R_1=\widetilde{c_1}\widetilde{R_1}/Ratio=\widetilde{c_1}\widetilde{R_1}/{intermediate\_{bboxes}}. Therefore:
\widetilde{\Delta S}=\widetilde{c_1}\widetilde{R_1}/{intermediate\_{bboxes}}*\Delta S
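The quoted code divides the residual by the bbox size first and applies tf2meanshape afterwards, while the formula folds the division into c_1R_1. A small sketch, assuming a square bbox so that Ratio is a single scalar, shows the two orders give the same result. All values and names here are made up for illustration.

```matlab
% Hypothetical scale-and-rotation part of tf2meanshape.
theta = pi / 9;
cR_tilde = 1.1 * [cos(theta) -sin(theta); sin(theta) cos(theta)];

% Assumed square bbox: width == height, so Ratio is one scalar.
ratio = 200;

% Made-up residuals in pixels, one landmark per row.
dS = [4 -2; -3 5; 1 1];

% Order used in the quoted code: divide by the bbox size, then transform.
res_code = (dS / ratio) * cR_tilde';

% Order used in the derivation: build c_1 R_1 first, then transform.
cR = cR_tilde / ratio;
res_formula = dS * cR';

max_diff = max(abs(res_code(:) - res_formula(:)));
fprintf('max difference: %g\n', max_diff);
```

With a non-square bbox the per-axis division and the rotation no longer commute in general, so the identity is exact only when width equals height or when there is no rotation.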
which corresponds to this part of the code:
% similarity transformation between the current shape and the mean shape
Data{i}.tf2meanshape{1} = fitgeotrans( ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
    'NonreflectiveSimilarity');
Data{i}.meanshape2tf{1} = fitgeotrans( ...
    (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    'NonreflectiveSimilarity');
% calculate the residual shape from initial shape to groundtruth shape under normalization scale
shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), ...
    [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);
% transform the shape residual in the image coordinate to the mean shape coordinate
[u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
Data{i}.shapes_residual(:, 1, 1) = u';
Data{i}.shapes_residual(:, 2, 1) = v';