【人脸姿态】2D人脸姿态估计的两种方式：solvePnP与3DMM参数

先看结果：

face

man1

1，solvePNP姿态估计

1.1简介

这里的姿态估计其实就是人脸相对相机的方向估计，估计的要点就是找出2D像素点与3D像素点之间的映射关系。这个映射矩阵是一个平移矩阵和旋转矩阵的组合。我们先给出3D到3D坐标的映射关系，其实就是相机坐标系向世界坐标系的变换关系(称作相机外参)，此变换关系就是人脸相对人脸的方向估计。3D变换关系如下：

可是我们现在不知道对于相机的3D坐标，所以我们需要2D点向相机3D点映射关系（相机内参），关系如下：

其中f是焦距，c是光学中心（我们先不考虑相机畸变）。组合之后的2d到3d变换关系如下

展开得到：

1.2内参标定

内参矩阵我们可以使用棋盘格对相机进行标定,程序如下：

import cv2
import numpy as np
import glob# 设置寻找亚像素角点的参数，采用的停止准则是最大循环次数30和最大误差容限0.001
criteria = (cv2.TERM_CRITERIA_MAX_ITER | cv2.TERM_CRITERIA_EPS, 30, 0.001)# 获取标定板角点的位置
objp = np.zeros((4 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:6, 0:4].T.reshape(-1, 2)  # 将世界坐标系建在标定板上，所有点的Z坐标全部为0，所以只需要赋值x和yobj_points = []  # 存储3D点
img_points = []  # 存储2D点images = glob.glob("image4/*.jpg")
i=0;
for fname in images:img = cv2.imread(fname)gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)size = gray.shape[::-1]ret, corners = cv2.findChessboardCorners(gray, (6, 4), None)#print(corners)if ret:obj_points.append(objp)corners2 = cv2.cornerSubPix(gray, corners, (5, 5), (-1, -1), criteria)  # 在原角点的基础上寻找亚像素角点#print(corners2)if [corners2]:img_points.append(corners2)else:img_points.append(corners)cv2.drawChessboardCorners(img, (6, 4), corners, ret)  # 记住，OpenCV的绘制函数一般无返回值i+=1;cv2.imwrite('conimg'+str(i)+'.jpg', img)cv2.waitKey(1500)print(len(img_points))
cv2.destroyAllWindows()# 标定
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, size, None, None)print("ret:", ret)
print("mtx:\n", mtx) # 内参数矩阵
print("dist:\n", dist)  # 畸变系数   distortion cofficients = (k_1,k_2,p_1,p_2,k_3)
print("rvecs:\n", rvecs)  # 旋转向量  # 外参数
print("tvecs:\n", tvecs ) # 平移向量  # 外参数print("-----------------------------------------------------")img = cv2.imread(images[2])
h, w = img.shape[:2]
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h))#显示更大范围的图片（正常重映射之后会删掉一部分图像）
print (newcameramtx)
print("------------------使用undistort函数-------------------")
dst = cv2.undistort(img,mtx,dist,None,newcameramtx)
x,y,w,h = roi
dst1 = dst[y:y+h,x:x+w]
cv2.imwrite('calibresult3.jpg', dst1)
print ("方法一:dst的大小为:", dst1.shape)

1.3 估计方法

姿势估计的欧拉角都是根据旋转矩阵转换而来，而旋转矩阵是根据2d图像到3D图像，或者3d相机坐标系到3D世界坐标系的变换关系使用solvePnP算法求得，2d图像到3D相机坐标系是有相机内参矩阵变换得到，3D相机到3D世界坐标系是的变换矩阵包括R（旋转矩阵）和T（平移矩阵构成），其中的旋转矩阵可转换成欧拉角，也就是水平，垂直，深度轴三个方向的角度，具体步骤：

1）首先定义一个具有n个关键点的3D脸部模型，n可以根据自己对准确度的容忍程度进行定义，以下示例定义6个关键点的3D脸部模型（左眼角，右眼角，鼻尖，左嘴角，右嘴角，下颌）；

2）采用人脸检测以及面部关键点检测得到上述3D脸部对应的2D人脸关键点；

3）采用Opencv的solvePnP函数解出旋转向量；

4）将旋转向量转换为欧拉角；

OpenCV中solvePnP 和 solvePnPRansac都可以用来估计Pose。第二个solvePnPRansac利用随机抽样一致算法（Random sample consensus，RANSAC）的思想，虽然计算出的姿态更加精确，但速度慢、没法实现实时系统，所以我们这里只关注第一个solvePnP函数，具体的参数可以参见Opencv文档。

注意点1：solvePnP里有三种解法：P3P， EPnP，迭代法（默认）；opencv2里参数分别为CV_P3P，CV_EPNP，CV_ITERATIVE （opencv3里多了DLS和UPnP解法）。

注意点2：solvePnP需要至少3组点：P3P只使用4组点，3组求出多个解，第四组确定最优解；EPnP使用大于等于3组点；迭代法调用cvFindExtrinsicCameraParams2，进而使用SVD分解并调用cvFindHomography，而cvFindHomography需要至少4组点。

具体实现：1

import numpy as np
import math
import cv2
import dlibdetector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("./point68.dat")# 3D model points.
model_points = np.array([(0.0, 0.0, 0.0),             # Nose tip(0.0, -330.0, -65.0),        # Chin(-225.0, 170.0, -135.0),     # Left eye left corner(225.0, 170.0, -135.0),      # Right eye right corne(-150.0, -150.0, -125.0),    # Left Mouth corner(150.0, -150.0, -125.0)      # Right mouth corner])# 从旋转向量转换为欧拉角
def get_euler_angle(rotation_vector):# calculate rotation anglestheta = cv2.norm(rotation_vector, cv2.NORM_L2)# transformed to quaterniondw = math.cos(theta / 2)x = math.sin(theta / 2)*rotation_vector[0][0] / thetay = math.sin(theta / 2)*rotation_vector[1][0] / thetaz = math.sin(theta / 2)*rotation_vector[2][0] / thetaysqr = y * y# pitch (x-axis rotation)t0 = 2.0 * (w * x + y * z)t1 = 1.0 - 2.0 * (x * x + ysqr)print('t0:{}, t1:{}'.format(t0, t1))pitch = math.atan2(t0, t1)# yaw (y-axis rotation)t2 = 2.0 * (w * y - z * x)if t2 > 1.0:t2 = 1.0if t2 < -1.0:t2 = -1.0yaw = math.asin(t2)# roll (z-axis rotation)t3 = 2.0 * (w * z + x * y)t4 = 1.0 - 2.0 * (ysqr + z * z)roll = math.atan2(t3, t4)print('pitch:{}, yaw:{}, roll:{}'.format(pitch, yaw, roll))# 单位转换：将弧度转换为度Y = int((pitch/math.pi)*180)X = int((yaw/math.pi)*180)Z = int((roll/math.pi)*180)return X, Y, Zcap=cv2.VideoCapture(0)while True:# Camera internalsflag,img = cap.read()#img = cv2.imread("000_4.jpg")size = img.shapefocal_length = size[1]center = (size[1]/2, size[0]/2)camera_matrix = np.array([[focal_length, 0, center[0]],[0, focal_length, center[1]],[0, 0, 1]], dtype = "double")#print("Camera Matrix :\n {0}".format(camera_matrix)# 取灰度img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)# 人脸数rectsrects = detector(img_gray, 0)#for i in range(len(rects)):point_list=[]if len(rects)>0:landmarks = list((p.x, p.y) for p in predictor(img, rects[0]).parts())point_list.append(landmarks[30])point_list.append(landmarks[8])point_list.append(landmarks[36])point_list.append(landmarks[45])point_list.append(landmarks[48])point_list.append(landmarks[54])print(point_list)for idx, point in enumerate(point_list):cv2.circle(img, point, 2, color=(0, 255, 0))#        font = cv2.FONT_HERSHEY_SIMPLEX#        cv2.putText(img, str(idx + 1), None, font, 0.8, (0, 0, 255), 1, cv2.LINE_AA)image_points =  np.array(point_list,dtype="double")dist_coeffs = np.zeros((4,1)) # Assuming no lens distortionsuccess, rotation_vector, translation_vector = cv2.solvePnP(model_points, image_points, camera_matrix, dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)#SOLVEPNP_P3P/SOLVEPNP_ITERATIVE#print("Rotation Vector:\n {0}".format(rotation_vector))#print("Translation Vector:\n {0}".format(translation_vector))X1,Y1,Z1 = get_euler_angle(rotation_vector)print("=====================>",X1,Y1,Z1 )# Project a 3D point (0, 0, 1000.0) onto the image plane.# We use this to draw a line sticking out of the nose(nose_end_point2D, jacobian) = cv2.projectPoints(np.array([(0.0, 0.0, 1000.0)]), rotation_vector, translation_vector, camera_matrix, dist_coeffs)for p in image_points:#cv2.circle(img, (int(p[0]), int(p[1])), 3, (0,0,255), -1)p1 = ( int(image_points[0][0]), int(image_points[0][1]))p2 = ( int(nose_end_point2D[0][0][0]), int(nose_end_point2D[0][0][1]))cv2.line(img, p1, p2, (255,0,0), 2)cv2.imshow("src",img)cv2.waitKey(50)cv2.imwrite("dst.jpg",img)cap.release()

1.4检测结果：

pnp

2，用3DMM参数人脸姿态估计

2.1参数估计

3DMM参数常用于人脸重建，是由一个固定的标准模型经过线性变换得到特定形状，纹理的人脸模型。任意的人脸模型可以由数据集中的m个人脸模型进行加权组合如下：

这里的形状就包括姿态和表情。

这里我们加载标准模型和CNN模型推理出3DMM参数（估计出的模型要根据标准模型的方差和平均模型估算），这里包括62个3DMM参数（12个姿态向量，40个姿态向量，10个表情向量），然后选用3DMM参数中姿态向量转化提取出旋转向量，再用旋转向量转化为欧拉角就得出姿态。

 # Crop image, forward to get the paramparam_lst = []roi_box_lst = []crop_policy = kvs.get('crop_policy', 'box')for obj in objs:if crop_policy == 'box':# by face boxroi_box = parse_roi_box_from_bbox(obj)elif crop_policy == 'landmark':# by landmarksroi_box = parse_roi_box_from_landmark(obj)else:raise ValueError(f'Unknown crop policy {crop_policy}')roi_box_lst.append(roi_box)img = crop_img(img_ori, roi_box)img = cv2.resize(img, dsize=(self.size, self.size), interpolation=cv2.INTER_LINEAR)inp = self.transform(img).unsqueeze(0)if self.gpu_mode:inp = inp.cuda(device=self.gpu_id)if kvs.get('timer_flag', False):end = time.time()param = self.model(inp)elapse = f'Inference: {(time.time() - end) * 1000:.1f}ms'print(elapse)else:param = self.model(inp)param = param.squeeze().cpu().numpy().flatten().astype(np.float32)param = param * self.param_std + self.param_mean  # re-scale# print('output', param)param_lst.append(param)

2.2参数转化

得到12pos参数进行分解，转换成欧拉角

def P2sRt(P):""" decompositing camera matrix P.Args:P: (3, 4). Affine Camera Matrix.Returns:s: scale factor.R: (3, 3). rotation matrix.t2d: (2,). 2d translation."""t3d = P[:, 3]R1 = P[0:1, :3]R2 = P[1:2, :3]s = (np.linalg.norm(R1) + np.linalg.norm(R2)) / 2.0r1 = R1 / np.linalg.norm(R1)r2 = R2 / np.linalg.norm(R2)r3 = np.cross(r1, r2)R = np.concatenate((r1, r2, r3), 0)return s, R, t3ddef matrix2angle(R):""" compute three Euler angles from a Rotation Matrix. Ref: http://www.gregslabaugh.net/publications/euler.pdfrefined by: https://stackoverflow.com/questions/43364900/rotation-matrix-to-euler-angles-with-opencvtodo: check and debugArgs:R: (3,3). rotation matrixReturns:x: yawy: pitchz: roll"""if R[2, 0] > 0.998:z = 0x = np.pi / 2y = z + atan2(-R[0, 1], -R[0, 2])elif R[2, 0] < -0.998:z = 0x = -np.pi / 2y = -z + atan2(R[0, 1], R[0, 2])else:x = asin(R[2, 0])y = atan2(R[2, 1] / cos(x), R[2, 2] / cos(x))z = atan2(R[1, 0] / cos(x), R[0, 0] / cos(x))return x, y, zdef calc_pose(param):P = param[:12].reshape(3, -1)  # camera matrixs, R, t3d = P2sRt(P)P = np.concatenate((R, t3d.reshape(3, -1)), axis=1)  # without scalepose = matrix2angle(R)pose = [p * 180 / np.pi for p in pose]return P, pose

2.3检测结果

face