使用MediaPipe工具包进行开发

什么是MediaPipe?

MediaPipe是一款由Google Research 开发并开源的多媒体机器学习模型应用框架,用于处理视频、音频等时间序列数据。这个跨平台架构使用于桌面/服务器、Android、iOS和嵌入式设备等。

我们使用MeidaPipe下的Solutions(方案特定的模型),共有16个Solutions:

  1. 人脸检测
  2. Fase Mesh (人脸上打了特别多网格)
  3. 虹膜(人眼)
  4. 姿态(!这章博客需要用到的)
  5. 人体
  6. 人物分割
  7. 头发分割
  8. 目标检测
  9. Box Tracking
  10. 实例动态跟踪
  11. 3D目标检测
  12. 特征匹配
  13. AutoFlip
  14. MediaSequence
  15. YouTube-8M

人体姿态估计代码

重点代码讲解

1.solutions.pose

import mediapipe as mp
mp_pose = mp.solutions.pose

mediapipe姿态估计模块(.solutions.pose),将人体各个部位分成33个点(0-32)如下图1.

图1.

通常可以通过判断角度,来判断姿态是什么动作。如下图2. (具体动作识别判断:采集不同动作下的图片,然后通过姿态检测,根据角度对图像进行标注,将大量图像作为训练集进行学习,完成姿态行为识别判断。)

图2.


2.mp_pose.Pose()其参数:

static_image_mode 表示 静态图像还是连续帧视频

model_complexity 表示 人体姿态估计模型; 0 表示 速度最快,精度最低(三者之中);1 表示 速度中间,精度中间(三者之中);2 表示 速度最慢,精度最高(三者之中);

smooth_landmarks 表示 是否平滑关键点;

enable_segmentation 表示 是否对人体进行抠图;

min_detection_confidence 表示 检测置信度阈值;

min_tracking_confidence 表示 各帧之间跟踪置信度阈值;

 pose = mp_pose.Pose(static_image_mode=True,model_complexity=1,smooth_landmarks=True,enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)

3.solutions.drawing_utils

绘图(可以绘制3D坐标,也可以直接在原图上绘制关键点,姿态)

drawing = mp.solutions.drawing_utils
...
drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)(原图基础上显示关键点,姿态)
drawing.plot_landmarks(results.pose_world_landmarks, mp_pose.POSE_CONNECTIONS)(3D)

之后的代码就是opencv相关


完整代码

import cv2
import mediapipe as mpif __name__ == '__main__':mp_pose = mp.solutions.posepose = mp_pose.Pose(static_image_mode=True,model_complexity=1,smooth_landmarks=True,# enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)drawing = mp.solutions.drawing_utils# read img BGR to RGBimg = cv2.imread("1.jpg")img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)cv2.imshow("input", img)results = pose.process(img)drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)cv2.imshow("keypoint", img)drawing.plot_landmarks(results.pose_world_landmarks, mp_pose.POSE_CONNECTIONS)cv2.waitKey(0)cv2.destroyAllWindows()

运行结果

原图

原图基础上显示关键点

3D坐标


坐标解析

还是拿这个正在跑步的人举例。

代码

import cv2
import mediapipe as mp
import matplotlib.pyplot as pltmp_pose = mp.solutions.posepose = mp_pose.Pose(static_image_mode=False,model_complexity=0,smooth_landmarks=True,enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)
drawing = mp.solutions.drawing_utilsif __name__ == '__main__':img = cv2.imread('1.jpg')results = pose.process(img)print(results.pose_landmarks)
# results.pose_landmarks

其中 所有关键点的检测结果可以从 results.pose_landmarks 看到如下;这是归一化之后的结果,若要还原,x得乘以宽度得到像素坐标;y得乘以高度得到像素坐标;z坐标的原点在人体左右两个髋关节的中点,方向如果离越近为负值,越远为正值。

landmark {x: 0.5913416743278503y: 0.17020824551582336z: -0.10148811340332031visibility: 0.9999819993972778
}
landmark {x: 0.5996149778366089y: 0.15227100253105164z: -0.11516246944665909visibility: 0.9999804496765137
}
landmark {x: 0.6029789447784424y: 0.15230441093444824z: -0.1152396947145462visibility: 0.9999819993972778
}
landmark {x: 0.6068712472915649y: 0.15244033932685852z: -0.11535673588514328visibility: 0.9999862909317017
}
landmark {x: 0.5956720113754272y: 0.15167823433876038z: -0.07190258055925369visibility: 0.9999619722366333
}
landmark {x: 0.5958214998245239y: 0.1511225700378418z: -0.07187224179506302visibility: 0.9999505281448364
}
landmark {x: 0.5961448550224304y: 0.15046393871307373z: -0.07175181061029434visibility: 0.9999566078186035
}
landmark {x: 0.6275932788848877y: 0.16257867217063904z: -0.12434940785169601visibility: 0.9999855756759644
}
landmark {x: 0.612525463104248y: 0.15917572379112244z: 0.07216572016477585visibility: 0.9999306201934814
}
landmark {x: 0.5972875952720642y: 0.1862889975309372z: -0.10227096825838089visibility: 0.9999692440032959
}
landmark {x: 0.592987596988678y: 0.18590152263641357z: -0.04587363451719284visibility: 0.9999159574508667
}
landmark {x: 0.6709297895431519y: 0.25625985860824585z: -0.19476133584976196visibility: 0.9999887943267822
}
landmark {x: 0.6155267357826233y: 0.27312740683555603z: 0.23764272034168243visibility: 0.9998886585235596
}
landmark {x: 0.76192706823349y: 0.32696548104286194z: -0.23866404592990875visibility: 0.9991742968559265
}
landmark {x: 0.6149069666862488y: 0.37373778223991394z: 0.3292929530143738visibility: 0.2991478443145752
}
landmark {x: 0.7010799646377563y: 0.4162237048149109z: -0.18799468874931335visibility: 0.9904139637947083
}
landmark {x: 0.5629619359970093y: 0.34696149826049805z: 0.2674705684185028visibility: 0.40632331371307373
}
landmark {x: 0.6892314553260803y: 0.43785160779953003z: -0.21043820679187775visibility: 0.9691246151924133
}
landmark {x: 0.5501535534858704y: 0.334354966878891z: 0.26719772815704346visibility: 0.36899450421333313
}
landmark {x: 0.6795801520347595y: 0.4296255111694336z: -0.19730502367019653visibility: 0.9676418304443359
}
landmark {x: 0.5508479475975037y: 0.3242868185043335z: 0.23829618096351624visibility: 0.37453970313072205
}
landmark {x: 0.6808692216873169y: 0.4231947660446167z: -0.17752741277217865visibility: 0.9631088972091675
}
landmark {x: 0.555029034614563y: 0.3278791904449463z: 0.2512615919113159visibility: 0.3893350064754486
}
landmark {x: 0.6576938033103943y: 0.5196953415870667z: -0.14214162528514862visibility: 0.9999549388885498
}
landmark {x: 0.6405556797981262y: 0.5202372074127197z: 0.14222070574760437visibility: 0.9999477863311768
}
landmark {x: 0.5241203904151917y: 0.6201881766319275z: -0.15834815800189972visibility: 0.9693808555603027
}
landmark {x: 0.7318142056465149y: 0.6902449727058411z: 0.11025446653366089visibility: 0.9495575428009033
}
landmark {x: 0.5604070425033569y: 0.815612256526947z: -0.03564663231372833visibility: 0.9501809477806091
}
landmark {x: 0.8767399191856384y: 0.8223288655281067z: 0.1430264711380005visibility: 0.9820705652236938
}
landmark {x: 0.5801612138748169y: 0.8386951684951782z: -0.026119956746697426visibility: 0.9103515148162842
}
landmark {x: 0.8959819078445435y: 0.875182569026947z: 0.14569874107837677visibility: 0.9787982106208801
}
landmark {x: 0.5071742534637451y: 0.875374436378479z: -0.06918345391750336visibility: 0.9140819907188416
}
landmark {x: 0.88722825050354y: 0.9008339643478394z: 0.09929685294628143visibility: 0.9545168280601501
}

调用

print(mp_pose.POSE_CONNECTIONS)# mp_pose.POSE_CONNECTIONS

mp_pose.POSE_CONNECTIONS表示人体33个关键点如何连接的骨架

frozenset({(15, 21), (16, 20), (18, 20),(3, 7), (14, 16), (23, 25), (28, 30), (11, 23), (27, 31), (6, 8), (15, 17),(24, 26), (16, 22), (4, 5), (5, 6), (29, 31), (12, 24), (23, 24), (0, 1), (9, 10), (1, 2), (0, 4), (11, 13), (30, 32), (28, 32), (15, 19), (16, 18), (25, 27), (26, 28), (12, 14), (17, 19), (2, 3), (11, 12), (27, 29), (13, 15)})

调用

print(results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_EYE])
print(results.pose_landmarks.landmark[2])# results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_EYE]
# or
# results.pose_landmarks.landmark[2]

查看左眼属性

x: 0.6029789447784424
y: 0.15230441093444824
z: -0.1152396947145462
visibility: 0.9999819993972778

调用左眼的x轴

print(results.pose_landmarks.landmark[2].x)# results.pose_landmarks.landmark[2].x

查看左眼的x轴

0.6029789447784424

还原左眼像素

    height, width, channels = img.shapeprint("x:{},y:{}".format(results.pose_landmarks.landmark[2].x * width,results.pose_landmarks.landmark[2].y * height))#(results.pose_landmarks.landmark[2].x * width,results.pose_landmarks.landmark[2].y * height)

查看左眼像素

x:602.9789447784424,y:116.5128743648529

获取左眼真实坐标 (真实坐标位于人体左右两个髋关节的中点)

print(results.pose_world_landmarks.landmark[2])# results.pose_world_landmarks.landmark[2]

查看

x: -0.1573336124420166
y: -0.6765229105949402
z: -0.09651455283164978
visibility: 0.9999819993972778

交互式三维可视化

核心思想都是将谷歌的pose_landmarks.landmark坐标提取出来,处理成python的列表数据格式

有两种方式获取三维坐标,可以测得每种方式的FPS

1.map

import timeimport cv2
import numpy as np
import mediapipe as mpmp_pose = mp.solutions.posepose = mp_pose.Pose(static_image_mode=False,model_complexity=0,smooth_landmarks=True,enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)
drawing = mp.solutions.drawing_utils# 提取x,y,z坐标
def get_x(each):return each.xdef get_y(each):return each.ydef get_z(each):return each.zif __name__ == '__main__':t0 = time.time()img = cv2.imread('1.jpg')img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)results = pose.process(img)drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)coords = np.array(results.pose_landmarks.landmark)x = np.array(list(map(get_x, coords)))y = np.array(list(map(get_y, coords)))z = np.array(list(map(get_z, coords)))point = np.vstack((x, y, z)).Tprint("FPS: {}".format(str(int(1 / (time.time() - t0)))))print(point)

运行结果

FPS: 4
[[ 0.59134167  0.17020825 -0.10148811][ 0.59961498  0.152271   -0.11516247][ 0.60297894  0.15230441 -0.11523969][ 0.60687125  0.15244034 -0.11535674][ 0.59567201  0.15167823 -0.07190258][ 0.5958215   0.15112257 -0.07187224][ 0.59614486  0.15046394 -0.07175181][ 0.62759328  0.16257867 -0.12434941][ 0.61252546  0.15917572  0.07216572][ 0.5972876   0.186289   -0.10227097][ 0.5929876   0.18590152 -0.04587363][ 0.67092979  0.25625986 -0.19476134][ 0.61552674  0.27312741  0.23764272][ 0.76192707  0.32696548 -0.23866405][ 0.61490697  0.37373778  0.32929295][ 0.70107996  0.4162237  -0.18799469][ 0.56296194  0.3469615   0.26747057][ 0.68923146  0.43785161 -0.21043821][ 0.55015355  0.33435497  0.26719773][ 0.67958015  0.42962551 -0.19730502][ 0.55084795  0.32428682  0.23829618][ 0.68086922  0.42319477 -0.17752741][ 0.55502903  0.32787919  0.25126159][ 0.6576938   0.51969534 -0.14214163][ 0.64055568  0.52023721  0.14222071][ 0.52412039  0.62018818 -0.15834816][ 0.73181421  0.69024497  0.11025447][ 0.56040704  0.81561226 -0.03564663][ 0.87673992  0.82232887  0.14302647][ 0.58016121  0.83869517 -0.02611996][ 0.89598191  0.87518257  0.14569874][ 0.50717425  0.87537444 -0.06918345][ 0.88722825  0.90083396  0.09929685]]Process finished with exit code 0

2.for

import timeimport cv2
import numpy as np
import mediapipe as mpmp_pose = mp.solutions.posepose = mp_pose.Pose(static_image_mode=False,model_complexity=0,smooth_landmarks=True,enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)
drawing = mp.solutions.drawing_utilspoint_x = []
point_y = []
point_z = []# 提取x,y,z坐标
def get_x_y_z(each):point_x.append(each.x)point_y.append(each.y)point_z.append(each.z)return point_x, point_y, point_zif __name__ == '__main__':t0 = time.time()img = cv2.imread('1.jpg')img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)results = pose.process(img)drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)coords = np.array(results.pose_landmarks.landmark)for index, each in enumerate(coords):x, y, z = get_x_y_z(each)point = np.vstack((x, y, z)).Tprint("FPS: {}".format(str(int(1 / (time.time() - t0)))))print(point)

运行结果

FPS: 4
[[ 0.59134167  0.17020825 -0.10148811][ 0.59961498  0.152271   -0.11516247][ 0.60297894  0.15230441 -0.11523969][ 0.60687125  0.15244034 -0.11535674][ 0.59567201  0.15167823 -0.07190258][ 0.5958215   0.15112257 -0.07187224][ 0.59614486  0.15046394 -0.07175181][ 0.62759328  0.16257867 -0.12434941][ 0.61252546  0.15917572  0.07216572][ 0.5972876   0.186289   -0.10227097][ 0.5929876   0.18590152 -0.04587363][ 0.67092979  0.25625986 -0.19476134][ 0.61552674  0.27312741  0.23764272][ 0.76192707  0.32696548 -0.23866405][ 0.61490697  0.37373778  0.32929295][ 0.70107996  0.4162237  -0.18799469][ 0.56296194  0.3469615   0.26747057][ 0.68923146  0.43785161 -0.21043821][ 0.55015355  0.33435497  0.26719773][ 0.67958015  0.42962551 -0.19730502][ 0.55084795  0.32428682  0.23829618][ 0.68086922  0.42319477 -0.17752741][ 0.55502903  0.32787919  0.25126159][ 0.6576938   0.51969534 -0.14214163][ 0.64055568  0.52023721  0.14222071][ 0.52412039  0.62018818 -0.15834816][ 0.73181421  0.69024497  0.11025447][ 0.56040704  0.81561226 -0.03564663][ 0.87673992  0.82232887  0.14302647][ 0.58016121  0.83869517 -0.02611996][ 0.89598191  0.87518257  0.14569874][ 0.50717425  0.87537444 -0.06918345][ 0.88722825  0.90083396  0.09929685]]Process finished with exit code 0

FPS都差不多

之后显示成三维

import cv2
import time
import numpy as np
import open3d
import mediapipe as mpmp_pose = mp.solutions.posepose = mp_pose.Pose(static_image_mode=False,model_complexity=0,smooth_landmarks=True,enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)
drawing = mp.solutions.drawing_utilspoint_x = []
point_y = []
point_z = []# 提取x,y,z坐标
def get_x_y_z(each):point_x.append(each.x)point_y.append(each.y)point_z.append(each.z)return point_x, point_y, point_zif __name__ == '__main__':img = cv2.imread('1.jpg')img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)results = pose.process(img)drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)coords = np.array(results.pose_landmarks.landmark)for index, each in enumerate(coords):x, y, z = get_x_y_z(each)point = np.vstack((x, y, z)).T# 3d点云 3维可视化point_cloud = open3d.geometry.PointCloud()points = open3d.utility.Vector3dVector(point)point_cloud.points = pointsopen3d.visualization.draw_geometries([point_cloud])

运行结果

关键点3D交互式可视化


优化显示效果

预测的关键点更加显著

将各个关键点加粗成莫兰迪色

莫兰蒂色卡

因为现在的img通道是bgr,所以调色板的rgb顺序得对换以下

代码优化

import cv2
import numpy as np
import mediapipe as mpmp_pose = mp.solutions.pose
pose = mp_pose.Pose(static_image_mode=True,model_complexity=2,smooth_landmarks=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)
drawing = mp.solutions.drawing_utilsimg = cv2.imread("1.jpg")
height, width, channels = img.shapepoint_x = []
point_y = []# 提取x,y坐标
def get_x_y(each):point_x.append(int(each.x * width))point_y.append(int(each.y * height))return point_x, point_yif __name__ == '__main__':print("height:{}, width:{}".format(height, width))results = pose.process(img)drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)coords = np.array(results.pose_landmarks.landmark)for index, each in enumerate(coords):x, y = get_x_y(each)points = np.vstack((x, y)).Tradius = 5for index, point in enumerate(points):# noseif index == 0:img = cv2.circle(img, (point[0], point[1]), radius, (133, 152, 164), -1)# shoulderelif index in [11, 12]:img = cv2.circle(img, (point[0], point[1]), radius, (117, 149, 188), -1)# hip jointelif index in [23, 24]:img = cv2.circle(img, (point[0], point[1]), radius, (177, 202, 215), -1)# elbowelif index in [13, 14]:img = cv2.circle(img, (point[0], point[1]), radius, (221, 227, 229), -1)# lapelif index in [25, 26]:img = cv2.circle(img, (point[0], point[1]), radius, (117, 175, 198), -1)# wrist and ankleelif index in [15, 16, 27, 28]:img = cv2.circle(img, (point[0], point[1]), radius, (146, 134, 118), -1)# left handelif index in [17, 19, 21]:img = cv2.circle(img, (point[0], point[1]), radius, (122, 137, 128), -1)# right handelif index in [18, 20, 22]:img = cv2.circle(img, (point[0], point[1]), radius, (115, 117, 117), -1)# left feetelif index in [27, 29, 31]:img = cv2.circle(img, (point[0], point[1]), radius, (205, 209, 212), -1)# right feetelif index in [28, 30, 32]:img = cv2.circle(img, (point[0], point[1]), radius, (132, 115, 132), -1)# mouthelif index in [9, 10]:img = cv2.circle(img, (point[0], point[1]), radius, (79, 84, 113), -1)# face and eyeelif index in [1, 2, 3, 4, 5, 6, 7, 8]:img = cv2.circle(img, (point[0], point[1]), radius, (212, 195, 202), -1)# otherelse:img = cv2.circle(img, (point[0], point[1]), radius, (140, 47, 240), -1)cv2.imshow("aug_keypoint", img)cv2.waitKey(0)cv2.destroyAllWindows()

运行结果


实时视频人体姿态估计

1.摄像头拍摄视频实时人体姿态估计

算法核心:摄像头打开后估计人体姿态,10秒钟退出。

若觉得10秒太短,可修改:

if ((time.time() - t0) // 1) == 10:

完整代码

import sys
import cv2
import time
import mediapipe as mpmp_pose = mp.solutions.pose
drawing = mp.solutions.drawing_utils
pose = mp_pose.Pose(static_image_mode=False,model_complexity=1,smooth_landmarks=True,enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)def process_frame(img):results = pose.process(img)drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)return imgif __name__ == '__main__':t0 = time.time()cap = cv2.VideoCapture(0)cap.open(0)while cap.isOpened():success, frame = cap.read()if not success:raise ValueError("Error")frame = process_frame(frame)cv2.imshow("keypoint", frame)if ((time.time() - t0) // 1) == 10:sys.exit(0)cv2.waitKey(1)cap.release()cv2.destroyAllWindows()

此代码能正常运行,不展示运行结果!


2.视频实时人体姿态估计

算法核心:打开保存好的视频后估计人体姿态,视频读取完后退出。

完整代码

import os
import sys
import cv2
import mediapipe as mpBASE_DIR = os.path.dirname((os.path.abspath(__file__)))
print(BASE_DIR)
sys.path.append(BASE_DIR)
mp_pose = mp.solutions.pose
drawing = mp.solutions.drawing_utils
pose = mp_pose.Pose(static_image_mode=False,model_complexity=1,smooth_landmarks=True,enable_segmentation=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)def process_frame(img):results = pose.process(img)drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)return imgif __name__ == '__main__':video_dirs = os.path.join(BASE_DIR, "1.mp4")cap = cv2.VideoCapture(video_dirs)while cap.isOpened():success, frame = cap.read()if frame is None:breakif success == True:frame = process_frame(frame)cv2.imshow("keypoint", frame)cv2.waitKey(1)cap.release()cv2.destroyAllWindows()

运行结果

原视频是最近报爆火的刘耕宏健身操。

人体姿态检测

在这个视频中证明此算法还是存在缺陷,因为视频中无法很好的将两人同时识别。


3.视频实时人体姿态估计代码优化

运用了上面所提及的莫兰迪色系,作为关键点的颜色。

完整代码

import cv2
import time
import numpy as np
from tqdm import tqdm
import mediapipe as mpmp_pose = mp.solutions.pose
pose = mp_pose.Pose(static_image_mode=True,model_complexity=2,smooth_landmarks=True,min_detection_confidence=0.5,min_tracking_confidence=0.5)
drawing = mp.solutions.drawing_utilsdef process_frame(img):height, width, channels = img.shapestart = time.time()results = pose.process(img)if results.pose_landmarks:drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)coords = np.array(results.pose_landmarks.landmark)for index, each in enumerate(coords):cx = int(each.x * width)cy = int(each.y * height)cz = each.zradius = 5# noseif index == 0:img = cv2.circle(img, (cx, cy), radius, (133, 152, 164), -1)# shoulderelif index in [11, 12]:img = cv2.circle(img, (cx, cy), radius, (117, 149, 188), -1)# hip jointelif index in [23, 24]:img = cv2.circle(img, (cx, cy), radius, (177, 202, 215), -1)# elbowelif index in [13, 14]:img = cv2.circle(img, (cx, cy), radius, (221, 227, 229), -1)# lapelif index in [25, 26]:img = cv2.circle(img, (cx, cy), radius, (117, 175, 198), -1)# wrist and ankleelif index in [15, 16, 27, 28]:img = cv2.circle(img, (cx, cy), radius, (146, 134, 118), -1)# left handelif index in [17, 19, 21]:img = cv2.circle(img, (cx, cy), radius, (122, 137, 128), -1)# right handelif index in [18, 20, 22]:img = cv2.circle(img, (cx, cy), radius, (115, 117, 117), -1)# left feetelif index in [27, 29, 31]:img = cv2.circle(img, (cx, cy), radius, (205, 209, 212), -1)# right feetelif index in [28, 30, 32]:img = cv2.circle(img, (cx, cy), radius, (132, 115, 132), -1)# mouthelif index in [9, 10]:img = cv2.circle(img, (cx, cy), radius, (79, 84, 113), -1)# face and eyeelif index in [1, 2, 3, 4, 5, 6, 7, 8]:img = cv2.circle(img, (cx, cy), radius, (212, 195, 202), -1)# otherelse:img = cv2.circle(img, (cx, cy), radius, (140, 47, 240), -1)else:fail = "fail detection"img = cv2.putText(img, fail, (25, 100), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 3)FPS = 1 / (time.time() - start)img = cv2.putText(img, "FPS" + str(int(FPS)), (25, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 3)return imgdef out_video(input):file = input.split("/")[-1]output = "out-" + fileprint("It will start processing video: {}".format(input))cap = cv2.VideoCapture(input)frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))frame_size = (cap.get(cv2.CAP_PROP_FRAME_WIDTH), cap.get(cv2.CAP_PROP_FRAME_HEIGHT))# # create VideoWriter,VideoWriter_fourcc is video decodefourcc = cv2.VideoWriter_fourcc(*'mp4v')fps = cap.get(cv2.CAP_PROP_FPS)out = cv2.VideoWriter(output, fourcc, fps, (int(frame_size[0]), int(frame_size[1])))# the progress barwith tqdm(range(frame_count)) as pbar:while cap.isOpened():success, frame = cap.read()if not success:breaktry:frame = process_frame(frame)out.write(frame)pbar.update(1)except:print("ERROR")passpbar.close()cv2.destroyAllWindows()out.release()cap.release()print("{} finished!".format(output))if __name__ == '__main__':video_dirs = "1.mp4"out_video(video_dirs)

运行结果

pbar程序运行

人体关键点检测优化

很明显比之前的效果更好!

第一次做视频效果不太熟练还请见谅!


下一话

3D视觉——2.人体姿态估计(Pose Estimation)入门——OpenPose含安装、编译、使用(单帧、实时视频)https://blog.csdn.net/XiaoyYidiaodiao/article/details/125565738?spm=1001.2014.3001.5502

3D视觉——1.人体姿态估计(Pose Estimation)入门——使用MediaPipe含单帧(Signel Frame)与实时视频(Real-Time Video)相关推荐

  1. 3D视觉——3.人体姿态估计(Pose Estimation) 算法对比 即 效果展示——MediaPipe与OpenPose

    上一话 3D视觉--2.人体姿态估计(Pose Estimation)入门--OpenPose含安装.编译.使用(单帧.实时视频)https://blog.csdn.net/XiaoyYidiaodi ...

  2. 3D视觉——4.手势识别(Gesture Recognition)入门——使用MediaPipe含单帧(Singel Frame)和实时视频(Real-Time Video)

    上一话3D视觉--3.人体姿态估计(Pose Estimation) 算法对比 即 效果展示--MediaPipe与OpenPosehttps://blog.csdn.net/XiaoyYidiaod ...

  3. 基于3D深度视觉的人体姿态估计算法

    点击上方"小白学视觉",选择加"星标"或"置顶" 重磅干货,第一时间送达 本文转自|新机器视觉 人体姿态估计是当前计算机视觉领域的热点研究问 ...

  4. 快速人体姿态估计--Pose Proposal Networks

    Pose Proposal Networks ECCV2018 本文使用 YOLO + bottom-up greedy parsing 进行人体姿态估计 its total runtime usin ...

  5. YOLOv7姿态估计pose estimation(姿态估计+目标检测+跟踪)

    概述 YOLOv7姿态估计:一种快速准确的人体姿态估计模型 ​ 人体姿态估计是计算机视觉中的一项重要任务,具有各种应用,例如动作识别.人机交互和监控.近年来,基于深度学习的方法在人体姿态估计方面取得了 ...

  6. 3d人体姿态估计资料

    人字姿态估计数据集 - 知乎 SMPL论文解读和相关基础知识介绍 - 知乎 3D人体姿态估计方法 MHFormer:Multi-Hypothesis Transformer - 知乎 论文阅读笔记: ...

  7. 基于深度学习的三维人体姿态估计

    目录 ​​​​​一.技术背景 1.1 人体姿态估计 1.2 三维人体重建 1.4 构建多人场景研究情况 二. 技术方法 2.1 基础架构 2.2 重叠loss 2.3 深度顺序感知loss 四.存在的 ...

  8. ICCV 2019 | 基于无标签单目视频序列的单帧三维人体姿态估计算法

    作者丨王璇 学校丨西安交通大学 研究方向丨人体姿态估计 我们都知道,要训练一个准确的三维人体姿态估计深度神经网络,需要大量有三维标签的数据,越多越好.而在现实生活中,我们很难得到很多有标签数据,而无标 ...

  9. 人体姿态估计openpose学习与应用

    前言 2021年时,就有做人体姿态估计的想法,具体应用场景是想去把这个姿态估计与工厂操作工的动作结合起来,搭建一套能够监控和规范产线操作工装配动作的基于视觉的人体姿态估计系统.因为一系列的各种原因就搁 ...

  10. 人体姿态跟踪--Pose Flow: Efficient Online Pose Tracking

    Pose Flow: Efficient Online Pose Tracking https://github.com/MVIG-SJTU/AlphaPose 本文主要是关于人体姿态跟踪方面的内容. ...

最新文章

  1. 阿里云查看mysql版本_查看mysql版本
  2. mysql索引设计策略_MySQL索引设计一些策略
  3. ZooKeeper的API操作(二)(通俗易懂)
  4. 机器学习——大数据与MapReduce
  5. 电脑罗盘时钟代码_苹果电脑怎么设置数字时钟屏保 Word Clock for Mac安装教程
  6. ORACLE 将SQL的执行脚本返回值传给SHELL
  7. 【算法刷题2】二叉树的后序遍历
  8. Netty工作笔记0005---NIO介绍说明
  9. MySQL 5.7.18忘记密码和密码过期解决
  10. b站视频解析php,b站视频解析【调解流程】
  11. vscode代码格式化配置
  12. windowsGHO镜像系统winXPwin7win8win10下载
  13. 蓝桥杯练习系统特殊回文数(python)
  14. Ubuntu(21.04)下UHD(4.1)与Gnuradio安装配置--USRP X410软件无线电平台开发
  15. Data‘ object has no attribute ‘has_isolated_nodes
  16. 【TensorFlow】DNNRegressor 的简单使用
  17. 关于DbVisualizer 6.0.14中的SQL语句显示中文问题
  18. 云服务器面临的问题_无服务器安全面临的多方面威胁以及我们应如何应对
  19. 行进中换轮胎——万字长文解析美团和大众点评两大数据平台是怎么融合的
  20. 西北乱跑娃 --- python类方法重写

热门文章

  1. 知网下载pdf论文的最简单方法
  2. 量子计算机慕课,计算机组成原理-中国大学mooc-题库零氪
  3. Graph Theory 离散数学第五章
  4. 车联网网络安全技术研究
  5. 【亲测】Ripro子主题美化X系列主题(夏系列)-开源未加密
  6. 无人机机架 无人机机架材料 无人机机架的尺寸计算 无人机机架结构图
  7. Java学习笔记(一)
  8. Unity Video Player(视频)
  9. linux 小红伞使用手册,安装LINUX下小红伞图文教程.pdf
  10. linux端口快速释放,Linux 快速释放端口与释放内存缓存,linux释放端口缓存