简介

写这篇文章的初衷是因为最近带的新人里面很多都没有一个实际项目的经历，偶尔有做过图像识别的，但是对于目标检测项目的完整流程还是知之甚少，这篇文章主要是以比较通俗的口吻来详细介绍一下入门上手一个目标检测项目的完整流程，大神请忽略！

我根据自己以往的一些项目经验，总结汇总得到下面一个大致的流程，比较简明扼要地概括了目标检测项目的处理流程，其中，最后一步主要是公司项目才会用到，科研学术项目一般不会过多关注于部署实施层面的工作。接下来，我就结合自己下方总结的步骤流程来详细展开介绍，如何入门实践自己的第一个目标检测项目。

实战入门

这里我们以 people 为待检测的目标对象。

第一步：获取数据集

我们的数据集来源是网络数据采集和实际拍照采集，主要是获取一些包含人在内的图片就行了，爬虫的话网上也有很多，直接百度搜索【人】即可：

数据获取部分不是本节要讲的重点，这里就交给读者自行完成了，这里先看下数据集样例：

第二步：数据标注

获取到我们所需要的图片数据之后，接下来就要对其进行标注处理了，我这里推荐使用的标注软件是labelImg，直接源码安装即可，我是直接网上搜索下载了一个已经打包好的labelImg.exe直接双击即可启动，下面来看下标注的流程。

软件启动界面如下：

接下来按照截图红框操作点击【Open Dir】打开图片存储目录：

确定后自动打开第一张图片：

接下来开始标注：

点击红框按钮，新生成一个标注框：

点击之后可以看到有两条垂直的黑色线条会随着光标移动，这个是用来帮你确定标注框的顶点的。

绘制完单个标注框之后，填入对应的名称，点击ok即可，这样就完成了单个目标对象的标注工作了。剩下的工作就是无尽的重复处理了，目标检测项目里面数据标注可是需要花大工夫才能完成的，数据量太少的话效果肯定不会好。

标注完成后，会生成与图片数据对等个数的xml文件：

单个标注文件部分内容如下：

红框部分表示的就是单个标注框的坐标，分别表示左上角和右下角两个坐标，共四个数值。

第三步：目标检测数据格式转化

这一步主要是将我们标注好的xml数据进行解析处理，生成可用于目标检测模型计算的数据集格式，首先我们会对整体的数据集进行划分计算，生成train.txt和test.txt两个文件，分别用于构建模型的训练数据集和测试数据集，

dataset目录截图如下：

代码实现如下：

import os
import random
random.seed(0)xmlfilepath='./dataset/xmls/'
saveBasePath="./dataset/"trainval_percent=0.2
train_percent=0.8
temp_xml = os.listdir(xmlfilepath)
total_xml = []
for xml in temp_xml:if xml.endswith(".xml"):total_xml.append(xml)
num=len(total_xml)
print("num: ",num)
list=range(num)
tv=int(num*trainval_percent)
tr=int(tv*train_percent)
trainval= random.sample(list,tv)
train=random.sample(trainval,tr)  ftest = open(os.path.join(saveBasePath,'test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath,'train.txt'), 'w')
for i  in list:  name=total_xml[i][:-4]+'\n'  if i in train:  ftrain.write(name)  if i in trainval: ftest.write(name)
ftrain.close()
ftest .close()

代码执行完成之后会在我们的dataset目录下面生成两个文件train.txt和test.txt，这两个文件都是只包含图像的id。dataset目录截图如下所示：

train.txt截图如下：

test.txt截图如下：

为了演示，这里我总的数据集里面只有40张图片，id从0到39。

之后就要基于这两个文件来构建可用于模型的训练数据集和测试数据集了。项目整体目录如下：

代码实现如下所示：

sets=['train','test']
classes = ['people']def convert_annotation(image_id, list_file):in_file = open('dataset/xmls/%s.xml'%(image_id), encoding='utf-8')tree=ET.parse(in_file)root = tree.getroot()for obj in root.iter('object'):difficult = 0 if obj.find('difficult')!=None:difficult = obj.find('difficult').textcls = obj.find('name').textif cls not in classes or int(difficult)==1:continuecls_id = classes.index(cls)xmlbox = obj.find('bndbox')b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text), int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))list_file.write(" " + ",".join([str(a) for a in b]) + ',' + str(cls_id))prefix=""
for image_set in sets:image_ids = open('dataset/%s.txt'%(image_set)).read().strip().split()list_file = open('%s.txt'%(image_set), 'w')for image_id in image_ids:one_line_prefix=prefix+'dataset/JPEGImages/'+image_id+'.jpg'list_file.write(one_line_prefix)convert_annotation(image_id, list_file)list_file.write('\n')list_file.close()

代码执行完成之后会在当前目录下生成两个文件train.txt和test.txt，这是模型的训练数据集和测试数据集，这里可能有点绕，因为代码写的比较冗余，上面的两个环节实际上是可以合并到一起变成一步处理到位的。此时项目截图如下：

可以看到：在当前目录下新生成了两个文件，分别是train.txt和test.txt，下面来具体看下，train.txt截图如下：

我们这里以第四行为例进行解释，前面半部分为图片的路径，后面的数值为标注框的信息，后面我框了两个红框，每个单独的红框表示一个标注框，其中前四个标注框表示左上角和右下角的坐标信息，第五个数值表示的是目标类别列表的索引值，由于这里我们只有一个目标对象，所以这里的目标索引值均为0。

test.txt截图如下：

到这里，我们的数据集划分就介绍结束了。

第四步：模型训练

这一环节是我们目标检测项目实战的核心部分，前面几步我们获取、标注、处理得到了可用于模型训练的数据集以及对应的标注数据，接下来就要基于这批数据集来构建我们的目标检测模型了。首先要修改一下配置信息，这里主要是对应于config.py脚本的内容：

#训练数据集路径
annotation_path = 'train.txt'
#类别清单路径
classes_path = 'model_data/classes.txt'
#锚框路径
anchors_path = 'model_data/anchors.txt'
#预训练模型路径
weights_path = 'model_data/pretrain.h5'
#图像输入尺寸
input_shape = (416,416)
#是否归一化
normalize = False
#读取目标清单
class_names = get_classes(classes_path)
#加载锚框文件
anchors = get_anchors(anchors_path)
print('class_names: ', class_names)
print('anchors: ', anchors)
num_classes = len(class_names)
num_anchors = len(anchors)
print('num_classes: ', num_classes)
print('num_anchors: ', num_anchors)
mosaic = False
Cosine_scheduler = False
label_smoothing = 0

相应的注释我也都放到代码块里面了，其他的参数就保持默认设置就行了，别问，问就是我亲身实践测试的结果。

设置完成之后就可以进行模型训练了，训练方式也很简单，终端执行：

python train.py

记性了，train.py部分实现如下：

def data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes, mosaic=False, random=True):'''数据生成器'''n = len(annotation_lines)i = 0flag = Truewhile True:image_data = []box_data = []for b in range(batch_size):if i==0:np.random.shuffle(annotation_lines)if mosaic:if flag and (i+4) < n:image, box = get_random_data_with_Mosaic(annotation_lines[i:i+4], input_shape)i = (i+4) % nelse:image, box = get_random_data(annotation_lines[i], input_shape, random=random)i = (i+1) % nflag = bool(1-flag)else:image, box = get_random_data(annotation_lines[i], input_shape, random=random)i = (i+1) % nimage_data.append(image)box_data.append(box)image_data = np.array(image_data)box_data = np.array(box_data)y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)yield [image_data, *y_true], np.zeros(batch_size)def preprocess_true_boxes(true_boxes, input_shape, anchors, num_classes):'''预处理box'''assert (true_boxes[..., 4]<num_classes).all(), 'class id must be less than num_classes'num_layers = len(anchors)//3anchor_mask = [[6,7,8], [3,4,5], [0,1,2]] if num_layers==3 else [[3,4,5], [1,2,3]]true_boxes = np.array(true_boxes, dtype='float32')input_shape = np.array(input_shape, dtype='int32') boxes_xy = (true_boxes[..., 0:2] + true_boxes[..., 2:4]) // 2boxes_wh = true_boxes[..., 2:4] - true_boxes[..., 0:2]true_boxes[..., 0:2] = boxes_xy/input_shape[::-1]true_boxes[..., 2:4] = boxes_wh/input_shape[::-1]m = true_boxes.shape[0]grid_shapes = [input_shape//{0:32, 1:16, 2:8}[l] for l in range(num_layers)]y_true = [np.zeros((m,grid_shapes[l][0],grid_shapes[l][1],len(anchor_mask[l]),5+num_classes),dtype='float32') for l in range(num_layers)]anchors = np.expand_dims(anchors, 0)anchor_maxes = anchors / 2.anchor_mins = -anchor_maxesvalid_mask = boxes_wh[..., 0]>0for b in range(m):wh = boxes_wh[b, valid_mask[b]]if len(wh)==0: continuewh = np.expand_dims(wh, -2)box_maxes = wh / 2.box_mins = -box_maxesintersect_mins = np.maximum(box_mins, anchor_mins)intersect_maxes = np.minimum(box_maxes, anchor_maxes)intersect_wh = np.maximum(intersect_maxes - intersect_mins, 0.)intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]box_area = wh[..., 0] * wh[..., 1]anchor_area = anchors[..., 0] * anchors[..., 1]iou = intersect_area / (box_area + anchor_area - intersect_area)best_anchor = np.argmax(iou, axis=-1)for t, n in enumerate(best_anchor):for l in range(num_layers):if n in anchor_mask[l]:i = np.floor(true_boxes[b,t,0] * grid_shapes[l][1]).astype('int32')j = np.floor(true_boxes[b,t,1] * grid_shapes[l][0]).astype('int32')k = anchor_mask[l].index(n)c = true_boxes[b, t, 4].astype('int32')y_true[l][b, j, i, k, 0:4] = true_boxes[b, t, 0:4]y_true[l][b, j, i, k, 4] = 1y_true[l][b, j, i, k, 5+c] = 1return y_trueif __name__ == "__main__":K.clear_session()image_input = Input(shape=(None, None, 3))h, w = input_shapeprint('Create YOLOv4-Tiny model with {} anchors and {} classes.'.format(num_anchors, num_classes))model_body = yolo_body(image_input, num_anchors//2, num_classes)print('Load weights {}.'.format(weights_path))model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)y_true = [Input(shape=(h//{0:32, 1:16}[l], w//{0:32, 1:16}[l], num_anchors//2, num_classes+5)) for l in range(2)]loss_input = [*model_body.output, *y_true]model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5, 'label_smoothing': label_smoothing, 'normalize':normalize})(loss_input)model = Model([model_body.input, *y_true], model_loss)logging = TensorBoard(log_dir=log_dir)checkpoint = ModelCheckpoint(log_dir + 'best.h5',monitor='val_loss', save_weights_only=False, save_best_only=True, period=1)early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1)val_split = 0.1with open(annotation_path) as f:lines = f.readlines()np.random.seed(10101)np.random.shuffle(lines)np.random.seed(None)num_val = int(len(lines)*val_split)num_train = len(lines) - num_valfreeze_layers = 60for i in range(freeze_layers): model_body.layers[i].trainable = Falseprint('Freeze the first {} layers of total {} layers.'.format(freeze_layers, len(model_body.layers)))if True:Init_epoch = 0Freeze_epoch = 50batch_size = 8learning_rate_base = 1e-3if Cosine_scheduler:warmup_epoch = int((Freeze_epoch-Init_epoch)*0.2)total_steps = int((Freeze_epoch-Init_epoch) * num_train / batch_size)warmup_steps = int(warmup_epoch * num_train / batch_size)reduce_lr = WarmUpCosineDecayScheduler(learning_rate_base=learning_rate_base,total_steps=total_steps,warmup_learning_rate=1e-4,warmup_steps=warmup_steps,hold_base_rate_steps=num_train,min_learn_rate=1e-6)model.compile(optimizer=Adam(), loss={'yolo_loss': lambda y_true, y_pred: y_pred})else:reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, verbose=1)model.compile(optimizer=Adam(learning_rate_base), loss={'yolo_loss': lambda y_true, y_pred: y_pred})print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))model.fit_generator(data_generator(lines[:num_train], batch_size, input_shape, anchors, num_classes, mosaic=mosaic),steps_per_epoch=max(1, num_train//batch_size),validation_data=data_generator(lines[num_train:], batch_size, input_shape, anchors, num_classes, mosaic=False),validation_steps=max(1, num_val//batch_size),epochs=Freeze_epoch,initial_epoch=Init_epoch,callbacks=[logging, checkpoint, reduce_lr, early_stopping])model.save_weights(log_dir + 'trained_weights_stage_1.h5')for i in range(freeze_layers): model_body.layers[i].trainable = Trueif True:Freeze_epoch = 50Epoch = 100batch_size = 8learning_rate_base = 1e-4if Cosine_scheduler:warmup_epoch = int((Epoch-Freeze_epoch)*0.2)total_steps = int((Epoch-Freeze_epoch) * num_train / batch_size)warmup_steps = int(warmup_epoch * num_train / batch_size)reduce_lr = WarmUpCosineDecayScheduler(learning_rate_base=learning_rate_base,total_steps=total_steps,warmup_learning_rate=1e-5,warmup_steps=warmup_steps,hold_base_rate_steps=num_train//2,min_learn_rate=1e-6)model.compile(optimizer=Adam(), loss={'yolo_loss': lambda y_true, y_pred: y_pred})else:reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, verbose=1)model.compile(optimizer=Adam(learning_rate_base), loss={'yolo_loss': lambda y_true, y_pred: y_pred})print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))model.fit_generator(data_generator(lines[:num_train], batch_size, input_shape, anchors, num_classes, mosaic=mosaic),steps_per_epoch=max(1, num_train//batch_size),validation_data=data_generator(lines[num_train:], batch_size, input_shape, anchors, num_classes, mosaic=False),validation_steps=max(1, num_val//batch_size),epochs=Epoch,initial_epoch=Freeze_epoch,callbacks=[logging, checkpoint, reduce_lr, early_stopping])model.save_weights(log_dir + 'model.h5')

训练启动，输出截图如下：

可以看到：模型已经正常训练启动了，静静等待模型训练即可，期间会一直覆盖存储最好的模型，可以随时用于模型的测试。

下面是一些模型样例输出：

关于模型评估、部署相关的内容我打算拆分出来后面单独写一篇文章来进行介绍，到这里一个完整的目标检测建模处理流程就全部介绍完成了，我主要是从代码实践的角度来进行讲解的，拿的是网上下载的一个开源的yolov4-tiny项目来进行具体实践的，感兴趣的话可以自己动手实践一下，欢迎互相交流学习！

零基础入门实践目标检测项目相关推荐

Pr零基础入门指南笔记一——项目、序列、预设
1.学习地址 [干货]PR零基础入门指南第二集:新建项目和序列以及预设,基础但非常重要,PR萌新必学!_哔哩哔哩_bilibili 2.视频剪辑 3.项目项目管理文件夹主项目文件夹日期+项目名 ...
【157.1】golang+beego零基础入门实践教程it营大地
go 在线工具 6.goLang语言基本数据类型 -整型详解 package main import "fmt" func main () {//1.定义int类型var num ...
二、零基础入门微信小程序项目开发之页面跳转实现
前言哈喽,大家好,我是明哥上一篇博客我们讲完了项目的引导页面的开发,这篇博文我们来讲讲如何从引导页页面跳转到我们的新闻预览页面,这是我们就要引入微信小程序的路由 ,什么是路由相信有前端开发经验的小伙 ...
零基础入门CV赛事，理论结合实践
Datawhale干货作者:阿水,Datawhale成员本次分享的背景是,Datawhle联合天池发布的学习赛:零基础入门CV赛事之街景字符识别.本文以该比赛为例,对计算机视觉赛事中,赛事理解和B ...
Android Studio新手–下载安装配置–零基础入门–基本使用–调试技能–构建项目基础–使用AS应对常规应用开发
转自:http://blog.csdn.net/yanbober/article/details/45306483 目标:Android Studio新手–>下载安装配置–>零基础入门–& ...
基于实践的LabVIEW零基础入门视频教程
原文地址::http://blog.eeecontrol.com/LabVIEW1/ <基于实践的LabVIEW零基础入门视频教程> 资料不在多,而在于精,资料太多,反而会迷失方向,学习最 ...
react从零基础入门到项目实战视频教程
React 起源于 Facebook 的内部项目,用来架设 Instagram 的网站, 并于 2013年 5 月开源.React 拥有较高的性能,代码逻辑非常简单,越来越多的人已开始关注和使用它.这 ...
一、零基础入门微信小程序开发之创建项目工程同时完成引导页开发
前言创建这个系列博客的原因是因为最近在加深微信小程序的学习,按照我之前的学习习惯是不喜欢记录的,加上自己有拖延症就更不太愿意做这件事情了,同时我要给学生上课,但总是缺少教材所以就开了这个系列的博客, ...
视频教程-20年Nodejs教程零基础入门到项目实战前端视频教程-Node.js
20年Nodejs教程零基础入门到项目实战前端视频教程 7年的开发架构经验,曾就职于国内一线互联网公司,开发工程师,现在是某创业公司技术负责人, 擅长语言有node/java/python,专注于服务 ...
2023年最新最全uniCloud入门学习，零基础入门到实战项目 uni-admin打造uniapp网页后端微信支付宝抖音小程序后端 unicloud数据后台快速打造uniapp小程序项目
今天开始带着大家一起零基础学习uniCloud,在下面的课程中我们就简称uniCloud为cloud吧.我这里从零基础开始教大家,后面可以带大家简单的做一个实战项目.所以不用担心自己没有基础,跟着石头 ...

零基础入门实践目标检测项目

简介