Preparing the initial data

mean_shape

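mean_shape is the average of the ground-truth points over all training images. How is it computed in practice? Can we simply add up the landmark coordinates and divide by the number of images?
That would be hasty and inaccurate: faces differ widely from image to image and are affected by lighting, pose, and other factors. The average should therefore be computed within a common, normalized frame. The MATLAB code is given first: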

function mean_shape = calc_meanshape(shapepathlistfile)
% Average all ground-truth shapes after normalizing each one to [0,1] x [0,1].
fid = fopen(shapepathlistfile);
shapepathlist = textscan(fid, '%s', 'delimiter', '\n');
fclose(fid);

% textscan always returns a 1x1 cell, so test the list inside it
if isempty(shapepathlist{1})
    error('no shape file found');
end

shape_header = loadshape(shapepathlist{1}{1});
if isempty(shape_header)
    error('invalid shape file');
end

mean_shape = zeros(size(shape_header));
num_shapes = 0;
for i = 1:length(shapepathlist{1})
    shape_i = double(loadshape(shapepathlist{1}{i}));
    if isempty(shape_i)
        continue;
    end
    shape_min = min(shape_i, [], 1);
    shape_max = max(shape_i, [], 1);
    % translate to origin point
    shape_i = bsxfun(@minus, shape_i, shape_min);
    % resize shape
    shape_i = bsxfun(@rdivide, shape_i, shape_max - shape_min);
    mean_shape = mean_shape + shape_i;
    num_shapes = num_shapes + 1;
end
mean_shape = mean_shape ./ num_shapes;

img = 255 * ones(500, 500, 3);
drawshapes(img, 50 + 400 * mean_shape);
end

function shape = loadshape(path)
% function: load shape from pts file
file = fopen(path);
if file == -1
    shape = [];   % fopen failed, so there is no file handle to close
    return;
end
shape = textscan(file, '%d16 %d16', 'HeaderLines', 3, 'CollectOutput', 2);
fclose(file);
shape = shape{1};
end

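Explanation:

In formula form:

$$\{shape_{gt}-[Region(1),Region(2)]\}/[Region(3),Region(4)] \Rightarrow [0,1]\times[0,1]$$

Here Region(1), Region(2) is the top-left corner of the shape's bounding box (shape_min in the code), Region(3), Region(4) is its width and height (shape_max - shape_min), and the division is element-wise.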
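As a small worked example of this normalization, the sketch below maps a made-up three-point shape into the unit square. The coordinates are invented for illustration; only `min`, `max`, and `bsxfun` are used, in the same way as in `calc_meanshape`.

```matlab
% Hypothetical 3-point shape in image coordinates (one [x y] row per point).
shape_gt = [120 80; 220 80; 170 180];

% Bounding box of the shape: Region = [x_min, y_min, width, height].
shape_min = min(shape_gt, [], 1);
shape_max = max(shape_gt, [], 1);
Region    = [shape_min, shape_max - shape_min];

% Translate to the origin, then divide each axis by the box size.
shape_norm = bsxfun(@rdivide, bsxfun(@minus, shape_gt, Region(1:2)), Region(3:4));

disp(shape_norm);   % every coordinate now lies in [0, 1]
```

Because each axis is divided by its own extent, the aspect ratio of an individual face is not preserved by this step.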

Preparing $\Delta S^t$

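As we know, the core idea of 3000FPS is:

$$\Delta S^t=W^t\Phi^t(I,S^{t-1})$$

where $\Delta S^t=S^{gt}-S^{t-1}$ is the residual to be regressed at stage $t$, $\Phi^t(I,S^{t-1})$ is the feature extraction function, and $W^t$ is the linear regression matrix. From the article 《人脸配准坐标变换解析》 (an analysis of the coordinate transforms used in face alignment) we can see that $\Delta S^t$ has to undergo a similarity transform, whereas $\Phi^t(I,S^{t-1})$ does not.
The similarity transform proceeds as follows:
first center the current shape $S^{t-1}$ and the mean shape $S^0$, then solve for the transform in

$$S^0=cRS^{t-1}$$

Once $cR$ has been found, apply the same transform to $\Delta S^t$:

$$\widetilde{\Delta S^t}=cR\,\Delta S^t$$

The transformed $\widetilde{\Delta S^t}$ is then used to solve for the linear regression matrix $W$.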
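The two steps above can be sketched without any toolbox. In 2-D, a rotation combined with a uniform scale is equivalent to multiplication by a single complex number, so $cR$ can be fitted by least squares on the centered shapes. This is a stand-in chosen for brevity, not the `fitgeotrans` call used in train_model.m, and all shapes and variable names below are made up for the demonstration.

```matlab
% Made-up current shape and ground truth, one [x y] row per landmark.
S_cur = [10 10; 30 12; 28 34; 8 30];
S_gt  = [11 9; 31 13; 27 35; 9 31];

% Build a mean shape that is an exact similarity transform of S_cur, so the
% fit below has a known answer (scale 0.05, rotation 20 degrees).
c_true = 0.05;
theta  = 20 * pi / 180;
R_true = [cos(theta) -sin(theta); sin(theta) cos(theta)];
S0     = (c_true * R_true * S_cur')' + 0.5;

% Step 1: center both shapes and view each point as a complex number x + iy.
center = @(S) bsxfun(@minus, S, mean(S, 1));
to_c   = @(S) S(:, 1) + 1i * S(:, 2);
z_cur  = to_c(center(S_cur));
z_0    = to_c(center(S0));

% Step 2: least-squares fit of z_0 ~ a * z_cur; the complex number a encodes c*R.
% (' is the conjugate transpose, so z_cur' * z_0 equals sum(conj(z_cur) .* z_0).)
a = (z_cur' * z_0) / (z_cur' * z_cur);

% Step 3: apply the same c*R (no translation) to the residual.
dz       = to_c(S_gt - S_cur);
dz_tilde = a * dz;
dS_tilde = [real(dz_tilde), imag(dz_tilde)];

fprintf('scale = %.4f, rotation = %.2f deg\n', abs(a), angle(a) * 180 / pi);
```

The recovered scale and angle should match `c_true` and `theta` up to rounding, since the mean shape here was constructed from the current shape.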
Code first: train_model.m, from line 103 onward

Param.meanshape = S0(Param.ind_usedpts, :);   % keep only the selected landmarks

dbsize = length(Data);

% load('Ts_bbox.mat');

augnumber = Param.augnumber;   % number of initial shapes generated per face

for i = 1:dbsize
    % initialize the shape of current face image by randomly selecting multiple shapes from other face images
    % indice = ceil(dbsize*rand(1, augnumber));

    indice_rotate = ceil(dbsize*rand(1, augnumber));
    indice_shift  = ceil(dbsize*rand(1, augnumber));
    scales        = 1 + 0.2*(rand([1 augnumber]) - 0.5);

    Data{i}.intermediate_shapes = cell(1, Param.max_numstage);   % intermediate shapes
    Data{i}.intermediate_bboxes = cell(1, Param.max_numstage);

    % 68 x 2 x augnumber (augnumber = number of initial shapes for image i)
    Data{i}.intermediate_shapes{1} = zeros([size(Param.meanshape), augnumber]);
    % augnumber x 4
    Data{i}.intermediate_bboxes{1} = zeros([augnumber, size(Data{i}.bbox_gt, 2)]);
    % shape residuals, 68 x 2 x augnumber
    Data{i}.shapes_residual = zeros([size(Param.meanshape), augnumber]);

    Data{i}.tf2meanshape = cell(augnumber, 1);
    Data{i}.meanshape2tf = cell(augnumber, 1);

    % if Data{i}.isdet == 1
    %    Data{i}.bbox_facedet = Data{i}.bbox_facedet*ts_bbox;
    % end

    % If augnumber == 1, each image has a single initial shape, which is simply
    % the mean shape. Data{i}.tf2meanshape{1} is then the identity transform,
    % because it maps the mean shape onto itself. For sr > 1, further initial
    % shapes are generated by translation, rotation and scaling; they could
    % also be picked from the shapes of other images.
    % (params is assumed to be defined elsewhere in train_model.m.)
    for sr = 1:params.augnumber
        if sr == 1
            % estimate the similarity transformation from initial shape to mean shape
            % Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_gt, Param.meanshape);
            % Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_gt;
            Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_facedet, Param.meanshape);
            Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_facedet;

            % reproject the mean shape onto the face detection bbox;
            % meanshape_resize is identical to Data{i}.intermediate_shapes{1}(:,:, sr)
            meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);

            % similarity transform between the current shape and the mean shape
            Data{i}.tf2meanshape{1} = fitgeotrans( ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                'NonreflectiveSimilarity');
            Data{i}.meanshape2tf{1} = fitgeotrans( ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
                'NonreflectiveSimilarity');

            % calculate the residual shape from initial shape to groundtruth shape under normalization scale
            shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), ...
                [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);

            % transform the shape residual in the image coordinate to the mean shape coordinate
            [u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
            Data{i}.shapes_residual(:, 1, 1) = u';
            Data{i}.shapes_residual(:, 2, 1) = v';
        else
            % randomly rotate the shape
            % shape = resetshape(Data{i}.bbox_gt, Param.meanshape);       % Data{indice_rotate(sr)}.shape_gt
            shape = resetshape(Data{i}.bbox_facedet, Param.meanshape);    % Data{indice_rotate(sr)}.shape_gt

            % build a new initial shape from the randomly chosen scale, rotation
            % and translation, then project it onto the bbox
            if params.augnumber_scale ~= 0
                shape = scaleshape(shape, scales(sr));
            end
            if params.augnumber_rotate ~= 0
                shape = rotateshape(shape);
            end
            if params.augnumber_shift ~= 0
                shape = translateshape(shape, Data{indice_shift(sr)}.shape_gt);
            end

            Data{i}.intermediate_shapes{1}(:, :, sr) = shape;
            Data{i}.intermediate_bboxes{1}(sr, :) = getbbox(shape);

            % reproject the mean shape onto this bbox
            meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);

            Data{i}.tf2meanshape{sr} = fitgeotrans( ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                'NonreflectiveSimilarity');
            Data{i}.meanshape2tf{sr} = fitgeotrans( ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), ...
                'NonreflectiveSimilarity');

            shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, sr), ...
                [Data{i}.intermediate_bboxes{1}(sr, 3) Data{i}.intermediate_bboxes{1}(sr, 4)]);

            % use the transform fitted for the sr-th initial shape
            [u, v] = transformPointsForward(Data{i}.tf2meanshape{sr}, shape_residual(:, 1)', shape_residual(:, 2)');
            Data{i}.shapes_residual(:, 1, sr) = u';
            Data{i}.shapes_residual(:, 2, sr) = v';
            % Data{i}.shapes_residual(:, :, sr) = tformfwd(Data{i}.tf2meanshape{sr}, shape_residual(:, 1), shape_residual(:, 2));
        end
    end
end

Understanding this code requires the article 《人脸配准坐标变换解析》 mentioned above.

According to that article,

$$\left.\begin{matrix}\overline{S_0}&=S_0-mean(S_0)\\ \overline{S_1}&=S_1-mean(S_1) \end{matrix}\right\}\Rightarrow \overline{S_0}=c_1R_1\overline{S_1}$$

so from

$$\Delta S=S_g-S_1$$

we obtain

$$\widetilde{\Delta S}=c_1R_1\Delta S$$
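One step is left implicit here: why only $c_1R_1$, and no translation, is applied to the residual. A short derivation, writing the full similarity transform as $T(x)=c_1R_1x+t$ (the symbols $T$ and $t$ are introduced only for this illustration and are not taken from the referenced article):

```latex
\begin{aligned}
T(S_g) - T(S_1) &= (c_1R_1S_g + t) - (c_1R_1S_1 + t) \\
                &= c_1R_1\,(S_g - S_1) \\
                &= c_1R_1\,\Delta S = \widetilde{\Delta S}
\end{aligned}
```

The translation cancels because $\Delta S$ is a difference of two point sets, which is also why both shapes can be centered before the fit.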
The situation here is slightly special, however, and needs one extra step.
Start from:

% reproject the mean shape onto the face detection bbox
meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);

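The definition of resetshape shows that the mean shape is mapped into intermediate_bboxes, so that $S_0$ and $S_1$ are at the same scale and in roughly the same position. In mathematical terms:

$$S_0\_resize=S_0*Ratio+[Region(1),Region(2)]$$

Here Ratio is in fact the size of intermediate_bboxes, that is, its width and height (entries 3 and 4).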
Centering in the same way as above gives:

$$\widetilde{S_0}=S_0\_resize-mean(S_0\_resize)=S_0*Ratio-mean(S_0)*Ratio=(S_0-mean(S_0))*Ratio=\overline{S_0}*Ratio$$

Hence $\widetilde{S_0}=Ratio*\overline{S_0}=\widetilde{c_1}\widetilde{R_1}\overline{S_1}$. ($\bigstar$)
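The centering identity behind ($\bigstar$) can be checked numerically. The sketch below uses a hypothetical stand-in for `resetshape`, written directly from the formula above (the real function in the repository may differ in detail), together with a made-up mean shape and box.

```matlab
% Made-up normalized mean shape (rows are [x y]) and box [x y width height].
S0     = [0.2 0.3; 0.8 0.3; 0.5 0.9];
Region = [40 60 200 200];

% Stand-in for resetshape: S0_resize = S0 * Ratio + [Region(1), Region(2)].
Ratio     = Region(3:4);
S0_resize = bsxfun(@plus, bsxfun(@times, S0, Ratio), Region(1:2));

% Center both shapes.
center   = @(S) bsxfun(@minus, S, mean(S, 1));
S0_bar   = center(S0);
S0_tilde = center(S0_resize);

% The box offset cancels, leaving only the scaling by Ratio.
err = max(max(abs(S0_tilde - bsxfun(@times, S0_bar, Ratio))));
fprintf('max deviation: %g\n', err);
```

The deviation should be at the level of floating-point rounding, confirming that centering removes the box offset and leaves only the factor Ratio.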
This is the fit performed by the following line of the code above:

Data{i}.tf2meanshape{1} = fitgeotrans( ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
    'NonreflectiveSimilarity');

Data{i}.tf2meanshape{1} is the $\widetilde{c_1}\widetilde{R_1}$ computed here.
What we actually want is $\overline{S_0}=c_1R_1\overline{S_1}$, but ($\bigstar$) shows the way:
$c_1R_1=\widetilde{c_1}\widetilde{R_1}/Ratio=\widetilde{c_1}\widetilde{R_1}/intermediate\_bboxes$. Therefore:

$$\widetilde{\Delta S}=\widetilde{c_1}\widetilde{R_1}/intermediate\_bboxes*\Delta S$$

which is exactly what the code does:

% similarity transform between the current shape and the mean shape
Data{i}.tf2meanshape{1} = fitgeotrans( ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
    'NonreflectiveSimilarity');
Data{i}.meanshape2tf{1} = fitgeotrans( ...
    bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    'NonreflectiveSimilarity');

% calculate the residual shape from initial shape to groundtruth shape under normalization scale
shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), ...
    [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);

% transform the shape residual in the image coordinate to the mean shape coordinate
[u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
Data{i}.shapes_residual(:, 1, 1) = u';
Data{i}.shapes_residual(:, 2, 1) = v';
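The whole chain can be checked numerically on toy data. The sketch below assumes a square box of side `s`, so that Ratio acts as a single scalar, and replaces `fitgeotrans` with a toolbox-free least-squares fit in which a 2-D rotation plus uniform scale is represented by one complex number. All shapes, the box size, and the helper names are made up for illustration.

```matlab
% Made-up data: normalized mean shape S0, current shape S1 and ground truth Sg
% in image coordinates (rows are [x y]), and a square box of side s.
S0 = [0.2 0.3; 0.8 0.3; 0.5 0.9];
S1 = [100 130; 215 125; 160 240];
Sg = [104 128; 212 131; 158 236];
s  = 200;

center = @(S) bsxfun(@minus, S, mean(S, 1));
to_c   = @(S) S(:, 1) + 1i * S(:, 2);                  % [x y] rows -> complex x + iy
fit_cR = @(src, dst) (src' * dst) / (src' * src);      % least-squares c*R as one complex number

z1 = to_c(center(S1));

% c1*R1: current shape -> mean shape in normalized coordinates.
a1 = fit_cR(z1, to_c(center(S0)));

% c1~*R1~: current shape -> mean shape resized to the box, as fitted in the code
% (the box offset is omitted because centering removes it).
a_tilde = fit_cR(z1, to_c(center(S0 * s)));

% Residual divided by the box size and then transformed, as in the code,
% compared with applying c1*R1 to the raw residual.
dz        = to_c(Sg - S1);
dS_code   = a_tilde * (dz / s);
dS_direct = a1 * dz;

fprintf('max difference: %g\n', max(abs(dS_code - dS_direct)));
```

With a non-square box the per-axis division no longer commutes with the rotation, so the two results need not agree exactly; the derivation above treats Ratio as if it were a single scale factor.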
