SSD-Based Detection of the Human Upper and Lower Body

The approach is to convert the training data into the Pascal VOC dataset format so that SSD can be used to detect the upper and lower halves of the human body.

Since there is no dataset annotated with upper/lower-body boxes, the MPII Human Pose Dataset is used here: its pose keypoints are converted into upper-body and lower-body boxes. The resulting boxes are therefore not necessarily very accurate, but they are good enough for study and experimentation.

1. Pose to GTbox

First convert the MPII Human Pose data into a one-JSON-per-line text file, mpii_single.txt, whose content looks like this:

mpii/060111501.jpg|{"PELVIS": [904,237], "THORAX": [858,135], "NECK": [871.1877,180.4244], "HEAD": [835.8123,58.5756], "R_ANKLE": [980,322], "R_KNEE": [896,318], "R_HIP": [865,248], "L_HIP": [943,226], "L_KNEE": [948,290], "L_ANKLE": [881,349], "R_WRIST": [772,294], "R_ELBOW": [754,247], "R_SHOULDER": [792,147], "L_SHOULDER": [923,123], "L_ELBOW": [995,163], "L_WRIST": [961,223]}
mpii/002058449.jpg|{"PELVIS": [846,351], "THORAX": [738,259], "NECK": [795.2738,314.8937], "HEAD": [597.7262,122.1063], "R_ANKLE": [918,456], "R_KNEE": [659,518], "R_HIP": [713,413], "L_HIP": [979,288], "L_KNEE": [1222,453], "L_ANKLE": [974,399], "R_WRIST": [441,490], "R_ELBOW": [446,434], "R_SHOULDER": [599,270], "L_SHOULDER": [877,247], "L_ELBOW": [1112,384], "L_WRIST": [1012,489]}
mpii/029122914.jpg|{"PELVIS": [332,346], "THORAX": [325,217], "NECK": [326.2681,196.1669], "HEAD": [330.7319,122.8331], "R_ANKLE": [301,473], "R_KNEE": [302,346], "R_HIP": [362,345], "L_HIP": [367,470], "L_KNEE": [275,299], "L_ANKLE": [262,300], "R_WRIST": [278,220], "R_ELBOW": [371,213], "R_SHOULDER": [396,309], "L_SHOULDER": [393,290]}
mpii/061185289.jpg|{"PELVIS": [533,322], "THORAX": [515.0945,277.1333], "NECK": [463.9055,148.8667], "HEAD": [353,172], "R_ANKLE": [426,239], "R_KNEE": [513,288], "R_HIP": [552,355]}
mpii/013949386.jpg|{"PELVIS": [159,370], "THORAX": [189,228], "NECK": [191.1195,227.0916], "HEAD": [326.8805,168.9084], "R_ANKLE": [110,385], "R_KNEE": [208,355], "R_HIP": [367,363], "L_HIP": [254,429], "L_KNEE": [166,303], "L_ANKLE": [212,153], "R_WRIST": [319,123], "R_ELBOW": [376,39]}
....
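
The original MPII annotations are distributed as a MATLAB file (the standard release is mpii_human_pose_v1_u12_1.mat). The conversion script is not part of the original write-up; the following is only a minimal sketch of how such lines could be produced. The annolist/annorect/annopoints fields and the joint-id order follow the official MPII format description; restricting to images with exactly one annotated person is an assumption made for this sketch.

# Sketch (assumption): dump single-person MPII annotations into mpii_single.txt.
import json
import scipy.io as sio

# joint names indexed by the MPII joint id (0 = right ankle, ..., 9 = head top, ...)
JOINT_NAMES = ['R_ANKLE', 'R_KNEE', 'R_HIP', 'L_HIP', 'L_KNEE', 'L_ANKLE',
               'PELVIS', 'THORAX', 'NECK', 'HEAD',
               'R_WRIST', 'R_ELBOW', 'R_SHOULDER', 'L_SHOULDER', 'L_ELBOW', 'L_WRIST']

mat = sio.loadmat('mpii_human_pose_v1_u12_1.mat', struct_as_record=False, squeeze_me=True)
release = mat['RELEASE']

f = open('mpii_single.txt', 'w')
for anno in release.annolist:
    rects = anno.annorect
    if not hasattr(rects, '__len__'):      # a single person is squeezed to a scalar struct
        rects = [rects]
    if len(rects) != 1:                    # keep single-person images only (assumption)
        continue
    rect = rects[0]
    if not hasattr(rect, 'annopoints') or not hasattr(rect.annopoints, 'point'):
        continue                           # no joint annotations (e.g. test images)
    points = rect.annopoints.point
    if not hasattr(points, '__len__'):
        points = [points]
    pose = {JOINT_NAMES[int(p.id)]: [float(p.x), float(p.y)] for p in points}
    f.write('mpii/' + str(anno.image.name) + '|' + json.dumps(pose) + '\n')
f.close()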

Define the joints belonging to the upper body and the lower body:

upper = ['HEAD', 'NECK', 'L_SHOULDER', 'L_ELBOW', 'L_WRIST', 'R_WRIST', 'R_ELBOW', 'R_SHOULDER', 'THORAX']
lower = ['PELVIS', 'L_HIP', 'L_KNEE', 'L_ANKLE', 'R_ANKLE', 'R_KNEE', 'R_HIP']

Taking the joint positions in the image, each box is expanded outward by 50 pixels so that the resulting gtbox covers the corresponding body part as well as possible.

get_gtbox.py

#!/usr/bin/env python
import json
import cv2
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import scipy.misc as scm

upper = ['HEAD', 'NECK', 'L_SHOULDER', 'L_ELBOW', 'L_WRIST', 'R_WRIST', 'R_ELBOW', 'R_SHOULDER', 'THORAX']
lower = ['PELVIS', 'L_HIP', 'L_KNEE', 'L_ANKLE', 'R_ANKLE', 'R_KNEE', 'R_HIP']

datas = open('mpii_single.txt').readlines()
print 'Length of datas: ', len(datas)

f = open('mpii_gtbox.txt', 'w')
for data in datas:
    # print data
    datasplit = data.split('|')
    imgname, posedict = datasplit[0], json.loads(datasplit[1])
    img = np.array(Image.open(imgname), dtype=np.uint8)
    height, width, _ = np.shape(img)
    if len(posedict.keys()) == 16:  # only full-body poses (all 16 joints) are used to get gtboxes
        # upper-body box: min/max over the upper joints, expanded by 50 px, clipped to the image
        x_upper, y_upper = [], []
        for joint in upper:
            x_upper.append(posedict[joint][0])
            y_upper.append(posedict[joint][1])
        upper_x1, upper_y1 = int(max(min(x_upper) - 50, 0)),     int(max(min(y_upper) - 50, 0))
        upper_x2, upper_y2 = int(min(max(x_upper) + 50, width)), int(min(max(y_upper) + 50, height))
        img = cv2.rectangle(img, (upper_x1, upper_y1), (upper_x2, upper_y2), (0, 255, 0), 2)

        # lower-body box: same procedure over the lower joints
        x_lower, y_lower = [], []
        for joint in lower:
            x_lower.append(posedict[joint][0])
            y_lower.append(posedict[joint][1])
        lower_x1, lower_y1 = int(max(min(x_lower) - 50, 0)),     int(max(min(y_lower) - 50, 0))
        lower_x2, lower_y2 = int(min(max(x_lower) + 50, width)), int(min(max(y_lower) + 50, height))
        img = cv2.rectangle(img, (lower_x1, lower_y1), (lower_x2, lower_y2), (255, 0, 0), 2)

        # one line per image: imgname|x1,y1,x2,y2,upper|x1,y1,x2,y2,lower
        tempstr_upper = str(upper_x1) + ',' + str(upper_y1) + ',' + str(upper_x2) + ',' + str(upper_y2) + ',upper'
        tempstr_lower = str(lower_x1) + ',' + str(lower_y1) + ',' + str(lower_x2) + ',' + str(lower_y2) + ',lower'
        tempstr = imgname + '|' + tempstr_upper + '|' + tempstr_lower + '\n'
        f.write(tempstr)
        # plt.imshow(img)
        # plt.show()
f.close()
print 'Done.'

The resulting mpii_gtbox.txt has one line per image, of the form image_path|x1,y1,x2,y2,upper|x1,y1,x2,y2,lower.

2. GTbox - txt2xml

Because Pascal VOC uses an image-xml pairing, i.e. one xml annotation file per image, the upper/lower-body gtboxes obtained above are likewise converted into xml annotations.

Every image here carries two annotations: an upper-body gtbox and a lower-body gtbox.

txt2xml.py

#! /usr/bin/python
import os
from PIL import Image

datas = open("mpii_gtbox.txt").readlines()
imgpath = "mpii/"
ann_dir = 'gtboxs/'
for data in datas:
    datasplit = data.split('|')
    img_name = datasplit[0]
    im = Image.open(imgpath + img_name)
    width, height = im.size
    gts = datasplit[1:]
    # write the annotation into an xml file, one per image
    if not os.path.exists(ann_dir + os.path.dirname(img_name)):
        os.makedirs(ann_dir + os.path.dirname(img_name))
    os.mknod(ann_dir + img_name[:-4] + '.xml')
    xml_file = open((ann_dir + img_name[:-4] + '.xml'), 'w')
    xml_file.write('<annotation>\n')
    xml_file.write('    <folder>gtbox</folder>\n')
    xml_file.write('    <filename>' + img_name + '</filename>\n')
    xml_file.write('    <size>\n')
    xml_file.write('        <width>' + str(width) + '</width>\n')
    xml_file.write('        <height>' + str(height) + '</height>\n')
    xml_file.write('        <depth>3</depth>\n')
    xml_file.write('    </size>\n')
    # write one <object> per gtbox (upper and lower)
    for img_each_label in gts:
        spt = img_each_label.split(',')
        xml_file.write('    <object>\n')
        xml_file.write('        <name>' + spt[4].strip() + '</name>\n')
        xml_file.write('        <pose>Unspecified</pose>\n')
        xml_file.write('        <truncated>0</truncated>\n')
        xml_file.write('        <difficult>0</difficult>\n')
        xml_file.write('        <bndbox>\n')
        xml_file.write('            <xmin>' + str(spt[0]) + '</xmin>\n')
        xml_file.write('            <ymin>' + str(spt[1]) + '</ymin>\n')
        xml_file.write('            <xmax>' + str(spt[2]) + '</xmax>\n')
        xml_file.write('            <ymax>' + str(spt[3]) + '</ymax>\n')
        xml_file.write('        </bndbox>\n')
        xml_file.write('    </object>\n')
    xml_file.write('</annotation>')
    xml_file.close()
# print 'Done.'

A gtbox xml file looks like this:

<annotation>
    <folder>gtbox</folder>
    <filename>mpii/000004812.jpg</filename>
    <size>
        <width>1920</width>
        <height>1080</height>
        <depth>3</depth>
    </size>
    <object>
        <name>upper</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1408</xmin>
            <ymin>573</ymin>
            <xmax>1848</xmax>
            <ymax>1025</ymax>
        </bndbox>
    </object>
    <object>
        <name>lower</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1310</xmin>
            <ymin>475</ymin>
            <xmax>1460</xmax>
            <ymax>1042</ymax>
        </bndbox>
    </object>
</annotation>
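
Since the xml is written by hand-concatenating strings, it is worth parsing one generated file back to confirm it is well-formed before building the LMDB. A small sketch (the path below is just an example, adjust it to your layout):

# Sanity check (sketch): parse one generated annotation and print its boxes.
import xml.etree.ElementTree as ET

tree = ET.parse('gtboxs/mpii/000004812.xml')  # example path
for obj in tree.getroot().findall('object'):
    name = obj.find('name').text
    bb = obj.find('bndbox')
    box = [int(bb.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
    print name, box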

3. Create LMDB

Generate trainval.txt and test.txt, whose lines have the following format:

mpii/038796633.jpg gtboxs/038796633.xml
mpii/081305121.jpg gtboxs/081305121.xml
mpii/016047648.jpg gtboxs/016047648.xml
mpii/078242581.jpg gtboxs/078242581.xml
mpii/027364042.jpg gtboxs/027364042.xml
mpii/090828862.jpg gtboxs/090828862.xml
......
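
How the split is produced is not shown in the original write-up; below is a minimal sketch, assuming a random 90/10 trainval/test split of mpii_gtbox.txt and the flat gtboxs/<name>.xml paths shown in the listing above.

# Sketch: build trainval.txt / test.txt from mpii_gtbox.txt (90/10 split is an assumption).
import os
import random

names = [line.split('|')[0] for line in open('mpii_gtbox.txt')]
random.seed(0)
random.shuffle(names)
split = int(0.9 * len(names))

def write_list(path, items):
    with open(path, 'w') as f:
        for img in items:
            # image path and its xml annotation, separated by a space
            xml = 'gtboxs/' + os.path.splitext(os.path.basename(img))[0] + '.xml'
            f.write(img + ' ' + xml + '\n')

write_list('trainval.txt', names[:split])
write_list('test.txt', names[split:])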

labelmap_gtbox.prototxt is defined as follows:

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "upper"
  label: 1
  display_name: "upper"
}
item {
  name: "lower"
  label: 2
  display_name: "lower"
}

test_name_size.py generates test_name_size.txt:

#! /usr/bin/python
import os
from PIL import Image

img_lists = open('test.txt').readlines()
img_lists = [item.split(' ')[0] for item in img_lists]

test_name_size = open('test_name_size.txt', 'w')
imgpath = "mpii/"
for item in img_lists:
    img = Image.open(imgpath + item)
    width, height = img.size
    temp1, temp2 = os.path.splitext(item)
    test_name_size.write(temp1 + ' ' + str(height) + ' ' + str(width) + '\n')
print 'Done.'

Use create_data.sh to create the trainval and test lmdbs, gtbox_trainval_lmdb and gtbox_test_lmdb.

cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir="mpii/data"
ssd_dir="/path/to/caffe-ssd"

cd $root_dir

redo=1
data_root_dir="mpii/"
dataset_name="gtbox"
mapfile="$root_dir/labelmap_gtbox.prototxt"
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0

extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do
  python $ssd_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $root_dir/$subset.txt $root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db ddbox/$dataset_name
done

4. Train/Eval

Modify examples/ssd/ssd_pascal.py and run it with python. The dataset-specific settings that typically need changing are sketched below.
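
The exact edits depend on your directory layout; the sketch below only lists the variables near the top of ssd_pascal.py that point at the dataset. All values here are placeholders, and num_test_image must match the actual number of lines in test.txt.

# Variables to adapt in examples/ssd/ssd_pascal.py (values are placeholders).
train_data = "mpii/data/gtbox/lmdb/gtbox_trainval_lmdb"   # trainval lmdb created above
test_data = "mpii/data/gtbox/lmdb/gtbox_test_lmdb"        # test lmdb created above
name_size_file = "mpii/data/test_name_size.txt"           # image name / size list
label_map_file = "mpii/data/labelmap_gtbox.prototxt"      # background + upper + lower
num_classes = 3          # background, upper, lower
num_test_image = 2000    # placeholder: set to the number of lines in test.txt
gpus = "0"               # GPUs to train on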

The training and test network used here is ssd_detect_human_body.

The test accuracy after training is close to 90%, which is decent.

Detection code: ssd_detect.py

#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt

caffe_root = '/path/to/caffe-ssd/'
import sys
sys.path.insert(0, caffe_root + 'python')
import caffe

caffe.set_device(0)
caffe.set_mode_gpu()

from google.protobuf import text_format
from caffe.proto import caffe_pb2

# load labels
labelmap_file = 'gtbox/labelmap_gtbox.prototxt'
file = open(labelmap_file, 'r')
labelmap = caffe_pb2.LabelMap()
text_format.Merge(str(file.read()), labelmap)

def get_labelname(labelmap, labels):
    num_labels = len(labelmap.item)
    labelnames = []
    if type(labels) is not list:
        labels = [labels]
    for label in labels:
        found = False
        for i in xrange(0, num_labels):
            if label == labelmap.item[i].label:
                found = True
                labelnames.append(labelmap.item[i].display_name)
                break
        assert found == True
    return labelnames

model_def     = 'deploy.prototxt'
model_weights = 'VGG_gtbox_SSD_300x300_iter_120000.caffemodel'
net = caffe.Net(model_def, model_weights, caffe.TEST)

image_resize = 300
net.blobs['data'].reshape(1, 3, image_resize, image_resize)

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.array([104, 117, 123]))  # mean pixel
transformer.set_raw_scale('data', 255)   # the reference model operates on images in [0,255] range instead of [0,1]
transformer.set_channel_swap('data', (2, 1, 0))  # the reference model has channels in BGR order instead of RGB

image = caffe.io.load_image('images/000000011.jpg')
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image

# Forward pass.
detections = net.forward()['detection_out']

# Parse the outputs.
det_label = detections[0,0,:,1]
det_conf = detections[0,0,:,2]
det_xmin = detections[0,0,:,3]
det_ymin = detections[0,0,:,4]
det_xmax = detections[0,0,:,5]
det_ymax = detections[0,0,:,6]

# Get detections with confidence higher than 0.6.
top_indices = [i for i, conf in enumerate(det_conf) if conf >= 0.6]
top_conf = det_conf[top_indices]
top_label_indices = det_label[top_indices].tolist()
top_labels = get_labelname(labelmap, top_label_indices)
top_xmin = det_xmin[top_indices]
top_ymin = det_ymin[top_indices]
top_xmax = det_xmax[top_indices]
top_ymax = det_ymax[top_indices]

colors = plt.cm.hsv(np.linspace(0, 1, 21)).tolist()

plt.imshow(image)
plt.axis('off')
currentAxis = plt.gca()
for i in xrange(top_conf.shape[0]):
    xmin = int(round(top_xmin[i] * image.shape[1]))
    ymin = int(round(top_ymin[i] * image.shape[0]))
    xmax = int(round(top_xmax[i] * image.shape[1]))
    ymax = int(round(top_ymax[i] * image.shape[0]))
    score = top_conf[i]
    label = int(top_label_indices[i])
    label_name = top_labels[i]
    display_txt = '%s: %.2f' % (label_name, score)
    coords = (xmin, ymin), xmax - xmin + 1, ymax - ymin + 1
    color = colors[label]
    currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor=color, linewidth=2))
    currentAxis.text(xmin, ymin, display_txt, bbox={'facecolor': color, 'alpha': 0.5})
plt.show()
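
As a possible follow-up, the detected boxes can be cropped and saved, for example to feed a downstream model. A sketch that reuses the variables from ssd_detect.py above (the output directory name is arbitrary):

# Sketch: crop and save the detected upper/lower body regions.
# Reuses image, top_conf, top_xmin, ..., top_labels from ssd_detect.py above.
import os
import numpy as np
import scipy.misc as scm

out_dir = 'crops'
if not os.path.exists(out_dir):
    os.makedirs(out_dir)
for i in xrange(top_conf.shape[0]):
    xmin = int(round(top_xmin[i] * image.shape[1]))
    ymin = int(round(top_ymin[i] * image.shape[0]))
    xmax = int(round(top_xmax[i] * image.shape[1]))
    ymax = int(round(top_ymax[i] * image.shape[0]))
    # caffe.io.load_image returns HxWxC floats in [0,1]; convert before saving
    crop = (image[ymin:ymax, xmin:xmax, :] * 255).astype(np.uint8)
    scm.imsave(os.path.join(out_dir, '%s_%d.jpg' % (top_labels[i], i)), crop)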

5. Results



6. Reference

[1] - Code: SSD (https://github.com/weiliu89/caffe/tree/ssd)

[2] - SSD: Single Shot MultiBox Detector

[3] - SSD: Single Shot Detector for natural scene text detection
