For installation, follow the official tutorial.

Note that during installation you must upgrade protoc to a 3.* version, otherwise compilation will fail with errors such as:

```
cannot import name 'preprocessor_pb2'
cannot import name string_int_label_map_pb2
Import "object_detection/protos/ssd.proto" was not found or had errors.
```

Make sure to compile the object_detection/protos folder first, otherwise you will get errors.
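Concretely, the proto compilation step (the standard command from the official installation guide) is:

```shell
# From tensorflow/models/research/
protoc object_detection/protos/*.proto --python_out=.
```

This generates the `*_pb2.py` modules that the import errors above complain about.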

1. Training

1.1 Creating the label_map.pbtxt file

Use the official code as a reference; the intermediate steps need to be adapted to your own data:

```python
import pandas as pd


def create_labelmap(word_count_file="../data/sub_obj_word_count.txt",
                    labelmap_outfile="../data/labelmap.pbtxt"):
    """
    :param word_count_file: "../data/sub_obj_word_count.txt"
    :param labelmap_outfile:
    :return:
    """
    df = pd.read_csv(word_count_file, header=None,
                     names=["obj_name", "obj_cnt"])
    objects = df.obj_name.tolist()
    end = "\n"
    s = " "
    class_map = {}
    for id, name in enumerate(objects):
        out = ""
        out += "item" + s + "{" + end
        out += (s * 2 + "id:" + " " + (str(id + 1)) + end)
        out += (s * 2 + "name:" + " " + "\'" + name + "\'" + end)
        out += ("}" + end * 2)
        with open(labelmap_outfile, "a") as f:
            f.write(out)
        class_map[name] = id + 1
```

1.2 Creating the TFRecord files

```python
import tensorflow as tf

from object_detection.utils import dataset_util

flags = tf.app.flags
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


def create_tf_example(example):
    # TODO(user): Populate the following variables from your example.
    height = None  # Image height
    width = None  # Image width
    filename = None  # Filename of the image. Empty if image is not from file
    encoded_image_data = None  # Encoded image bytes
    image_format = None  # b'jpeg' or b'png'

    xmins = []  # List of normalized left x coordinates in bounding box (1 per box)
    xmaxs = []  # List of normalized right x coordinates in bounding box (1 per box)
    ymins = []  # List of normalized top y coordinates in bounding box (1 per box)
    ymaxs = []  # List of normalized bottom y coordinates in bounding box (1 per box)
    classes_text = []  # List of string class name of bounding box (1 per box)
    classes = []  # List of integer class id of bounding box (1 per box)

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_image_data),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

    # TODO(user): Write code to read in your dataset to examples variable
    for example in examples:
        tf_example = create_tf_example(example)
        writer.write(tf_example.SerializeToString())

    writer.close()


if __name__ == '__main__':
    tf.app.run()
```

Alternatively, you can put your own labels into a CSV file with the following format:

```
filename width height class xmin ymin xmax ymax
cam_image1.jpg 480 270 queen 173 24 260 137
cam_image1.jpg 480 270 queen 165 135 253 251
cam_image1.jpg 480 270 ten 255 96 337 208
cam_image10.jpg 960 540 ten 501 116 700 353
cam_image10.jpg 960 540 queen 261 124 453 370
cam_image11.jpg 960 540 nine 225 96 490 396
cam_image12.jpg 960 540 king 362 149 560 389
cam_image13.jpg 960 540 jack 349 142 550 388
cam_image14.jpg 960 540 jack 297 167 512 420
cam_image15.jpg 960 540 ace 367 181 589 457
cam_image16.jpg 960 540 ace 303 155 525 456
```
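Annotations in this format can be split into per-image train/test sets before conversion. The helper below is a hypothetical sketch (not part of the official tooling); it keeps all boxes from one image in the same split:

```python
import pandas as pd


# Hypothetical helper: split one annotation table into train/test by image
# filename, so that all boxes of an image end up in the same split.
def split_labels(df, train_frac=0.8, seed=0):
    filenames = df["filename"].drop_duplicates().sample(frac=1.0, random_state=seed)
    n_train = int(len(filenames) * train_frac)
    keep = set(filenames.iloc[:n_train])
    return df[df["filename"].isin(keep)], df[~df["filename"].isin(keep)]


# Toy annotation table in the format shown above.
labels = pd.DataFrame({
    "filename": ["cam_image1.jpg", "cam_image1.jpg", "cam_image10.jpg",
                 "cam_image11.jpg", "cam_image12.jpg"],
    "width": [480, 480, 960, 960, 960],
    "height": [270, 270, 540, 540, 540],
    "class": ["queen", "ten", "queen", "nine", "king"],
    "xmin": [173, 255, 261, 225, 362],
    "ymin": [24, 96, 124, 96, 149],
    "xmax": [260, 337, 453, 490, 560],
    "ymax": [137, 208, 370, 396, 389],
})
train_df, test_df = split_labels(labels, train_frac=0.75)
# train_df.to_csv("train.csv", index=False); test_df.to_csv("test.csv", index=False)
```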

At this point you need three files: the labelmap, train.csv, and test.csv. Then use the following script to generate the TFRecord files:

"""
Usage:# From tensorflow/models/# Create train data:python generate_tfrecord.py --csv_input=images/train_labels.csv --image_dir=images/train_img --output_path=train.record# Create test data:python generate_tfrecord.py --csv_input=images/test_labels.csv  --image_dir=images/test_img --output_path=test.record
"""
from __future__ import division
from __future__ import print_function
from __future__ import absolute_importimport os
import io
import pandas as pd
import tensorflow as tffrom PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDictflags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('image_dir', '', 'Path to the image directory')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS# TO-DO replace this with label map
def class_text_to_int(row_label):words =pd.read_csv("/home/jamesben/relationship_vrd/data/sub_obj_word_count.txt", header=None, names=["name", "freq"]).name.tolist()word2ix = {y: x for x, y in enumerate(words)}return word2ix[row_label]def split(df, group):data = namedtuple('data', ['filename', 'object'])gb = df.groupby(group)return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]def create_tf_example(group, path):with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:encoded_jpg = fid.read()encoded_jpg_io = io.BytesIO(encoded_jpg)image = Image.open(encoded_jpg_io)width, height = image.sizefilename = group.filename.encode('utf8')image_format = b'jpg'xmins = []xmaxs = []ymins = []ymaxs = []classes_text = []classes = []for index, row in group.object.iterrows():xmins.append(row['xmin'] / width)xmaxs.append(row['xmax'] / width)ymins.append(row['ymin'] / height)ymaxs.append(row['ymax'] / height)classes_text.append(row['class'].encode('utf8'))classes.append(class_text_to_int(row['class']))tf_example = tf.train.Example(features=tf.train.Features(feature={'image/height': dataset_util.int64_feature(height),'image/width': dataset_util.int64_feature(width),'image/filename': dataset_util.bytes_feature(filename),'image/source_id': dataset_util.bytes_feature(filename),'image/encoded': dataset_util.bytes_feature(encoded_jpg),'image/format': dataset_util.bytes_feature(image_format),'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),'image/object/class/text': dataset_util.bytes_list_feature(classes_text),'image/object/class/label': dataset_util.int64_list_feature(classes),}))return tf_exampledef main(_):writer = tf.python_io.TFRecordWriter(FLAGS.output_path)path = os.path.join(os.getcwd(), FLAGS.image_dir)examples 
= pd.read_csv(FLAGS.csv_input)grouped = split(examples, 'filename')for group in grouped:tf_example = create_tf_example(group, path)writer.write(tf_example.SerializeToString())writer.close()output_path = os.path.join(os.getcwd(), FLAGS.output_path)print('Successfully created the TFRecords: {}'.format(output_path))if __name__ == '__main__':tf.app.run()

Then run the two commands from the usage docstring above to generate train.record and test.record. This script is the recommended way to generate them.

1.3 Editing the samples/configs/*.config file

This file configures the model, training, and input/output parameters. The key fields to modify are num_classes in the model block, fine_tune_checkpoint in train_config, and the train_input_reader, eval_config, and eval_input_reader sections.

```
model {
  faster_rcnn {
    num_classes: 100
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "test_ckpt/faster_rcnn_resnet101_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "object_detection/vrd_tfrecord/vrd_train.record"
  }
  label_map_path: "object_detection/data/vrd_labelmap.pbtxt"
}

eval_config: {
  num_examples: 955  # Note: this is the number of images in the eval set
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "object_detection/vrd_tfrecord/vrd_val.record"
  }
  label_map_path: "object_detection/data/vrd_labelmap.pbtxt"
  shuffle: false
  num_readers: 1
}
```

1.4 Setting the command-line arguments for train.py

Set the following arguments:

```
--train_dir=train_dir \
--pipeline_config_path=pipeline_config_path
```
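Put together, a full invocation might look like the following sketch (the two paths are placeholders for your own train directory and config file):

```shell
# From tensorflow/models/research/
python object_detection/train.py \
    --logtostderr \
    --train_dir=train_dir \
    --pipeline_config_path=path/to/faster_rcnn_resnet101.config
```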

2. Evaluating the trained model

2.1 Exporting the trained ckpt model to a pb file

After training finishes, you will have the following three files:

  • model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001
  • model.ckpt-${CHECKPOINT_NUMBER}.index
  • model.ckpt-${CHECKPOINT_NUMBER}.meta

Run export_inference_graph.py:

```shell
# From tensorflow/models/research/
python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path path/to/ssd_inception_v2.config \
    --trained_checkpoint_prefix path/to/model.ckpt-369 \
    --output_directory path/to/exported_model_directory
```

This produces a frozen_inference_graph.pb file in the output_directory directory.

2.2 Inference

Run the infer_detections script:

```shell
# From tensorflow/models/research/oid
SPLIT=validation  # or test
TF_RECORD_FILES=$(ls -1 ${SPLIT}_tfrecords/* | tr '\n' ',')  # collect all the tfrecord files

PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection/inference/infer_detections \
  --input_tfrecord_paths=$TF_RECORD_FILES \
  --output_tfrecord_path=${SPLIT}_detections.tfrecord \
  --inference_graph=faster_rcnn_inception_resnet_v2_atrous_oid/frozen_inference_graph.pb \
  --discard_image_pixels  # detections are only used to compute mAP, so image contents need not be saved
```

When it finishes you will have a validation_detections.tfrecord file, which is used to compute mAP.
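The mAP metrics score a detection as correct when it overlaps a ground-truth box above an IoU threshold (0.5 for PASCAL-style metrics, a range of thresholds for COCO). As a reminder of what is being computed, a minimal IoU sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    # Width/height of the intersection rectangle (clamped at 0 if disjoint).
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7 -> 1/7
```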

2.3 Generating the metrics configuration files

```shell
# From tensorflow/models/research/oid
SPLIT=validation  # or test
NUM_SHARDS=1  # Set to NUM_GPUS if using the parallel evaluation script above

mkdir -p ${SPLIT}_eval_metrics

echo "
label_map_path: '../object_detection/data/oid_bbox_trainable_label_map.pbtxt'
tf_record_input_reader: { input_path: '${SPLIT}_detections.tfrecord@${NUM_SHARDS}' }
" > ${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt

echo "
metrics_set: 'coco_detection_metrics'
" > ${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt
```

metrics_set can be one of the following options:

  • pascal_voc_detection_metrics
  • weighted_pascal_voc_detection_metrics
  • pascal_voc_instance_segmentation_metrics
  • open_images_detection_metrics
  • coco_detection_metrics
  • coco_mask_metrics

After this script runs, two configuration files are generated:

  • validation_eval_config.pbtxt
  • validation_input_config.pbtxt

Both configuration files are needed when generating the evaluation results.

2.4 Getting the evaluation metrics

Run the following script:

```shell
# From tensorflow/models/research/oid
SPLIT=validation  # or test

PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection/metrics/offline_eval_map_corloc \
  --eval_dir=${SPLIT}_eval_metrics \
  --eval_config_path=${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt \
  --input_config_path=${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
```

When it finishes, the evaluation results are printed and also written to a metrics.csv file.

3. Monitoring training and overfitting in TensorBoard

To view progress in TensorBoard, organize your data as the official docs require:

```
+ data (folder)
  - label_map file
  - train TFRecord file
  - eval TFRecord file
+ models (folder)
  + model (folder)
    - pipeline config file
    + train (folder)
    + eval (folder)
```

Then start training with the following command:

```shell
# From the tensorflow/models/research/ directory
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --train_dir=${PATH_TO_TRAIN_DIR}
```

Here ${PATH_TO_YOUR_PIPELINE_CONFIG} is the path to the config file from above, and ${PATH_TO_TRAIN_DIR} is the directory where training checkpoints and event files are written, i.e. the train directory above.
While training is running, start the evaluation job:

```shell
# From the tensorflow/models/research/ directory
python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --checkpoint_dir=${PATH_TO_TRAIN_DIR} \
    --eval_dir=${PATH_TO_EVAL_DIR}
```

The evaluation job periodically picks up the latest checkpoint from the train directory and evaluates it on the test data. ${PATH_TO_YOUR_PIPELINE_CONFIG} is the path to the config file, ${PATH_TO_TRAIN_DIR} is the checkpoint directory from training, and ${PATH_TO_EVAL_DIR} is the directory where evaluation event files will be written.

With both jobs running, you can inspect the model in TensorBoard. Change into the models directory above and run:

```shell
tensorboard --logdir=${PATH_TO_MODEL_DIRECTORY}
```

Here ${PATH_TO_MODEL_DIRECTORY} is the parent directory of the train and eval directories, i.e. the model directory above.

TensorBoard will then show both the train and eval losses as well as mAP.
