An Introductory Guide to Real-Time Object Detection with Python

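For years, researchers have explored ways to give machines the ability to visually recognize and identify objects. This field, known as computer vision (CV), has a wide range of modern applications.

From road-object detection in self-driving cars to sophisticated face and body-language recognition that can flag possible crimes or criminal activity, CV has many uses in today's world. Object detection is arguably one of the coolest applications of computer vision.

If you have not started with Python yet, or you are unfamiliar with OpenCV, see this free Python cheat sheet with 240+ notes and the OpenCV Python tutorial.

Today's CV tools make it easy to run object detection on images and even on live video streams. In this article, we walk through a simple demonstration of real-time object detection with TensorFlow.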

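Setting Up a Simple Object Detector

Prerequisites:

TensorFlow >= 1.15.0

Install it by running pip install tensorflow. Note that this installs the latest release, while the code in this article uses the TensorFlow 1.x graph and session API (tf.Session, tf.GraphDef, tf.gfile), so a 1.15.x release is the closest match.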
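The version requirement above can be checked in code before anything else runs. The following is a minimal sketch rather than part of the original article: `parse_version` and `meets_minimum` are hypothetical helpers, and they assume plain numeric versions such as 1.15.0 (no rc or dev suffixes).

```python
def parse_version(version):
    """Turn a dotted version string such as '1.15.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="1.15.0"):
    """Return True when `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# Hypothetical version strings, used only for illustration.
print(meets_minimum("1.15.0"))  # True
print(meets_minimum("1.9.0"))   # False: 9 < 15 when compared as numbers
```

In a notebook, the installed version would come from `tf.__version__`. Comparing integer tuples avoids the pitfall of comparing the raw strings, where "1.9.0" sorts after "1.15.0".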

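Now we are good to go!

Setting up the environment

Step 1. Download or clone the TensorFlow object detection code from GitHub to your local machine

Run the following command in a terminal:

git clone https://github.com/tensorflow/models.git

If git is not installed on your machine, you can download the repository as a zip file from here instead.

Step 2. Install the dependencies

The next step is to make sure we have all the libraries and modules needed to run the object detector on our machine.

Here is the list of libraries the project depends on. (Most of them come with TensorFlow by default.)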

Cython
contextlib2
Pillow
lxml
matplotlib

If you find that any module is missing, simply install it into your environment with pip install.
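One way to find out which of these modules are missing is to ask Python whether each one can be located. The sketch below uses only the standard library; `find_missing` is a hypothetical helper, and the import names are assumptions based on how these packages are normally imported (Pillow, for example, is imported as `PIL`).

```python
import importlib.util

# Import names for the dependencies listed above.
# Pillow is installed as "Pillow" but imported as "PIL".
REQUIRED_MODULES = ["Cython", "contextlib2", "PIL", "lxml", "matplotlib"]


def find_missing(module_names):
    """Return the names from `module_names` that cannot be located."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```

Any name reported as missing can then be installed with pip, keeping in mind that the pip package name may differ from the import name.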

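Step 3. Install the Protobuf compiler

Protobuf, or Protocol Buffers, is Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data. It lets us define once how we want our data to be structured, and then read and write that structured data to and from a variety of data streams using a variety of languages.

It is also a dependency of this project. You can learn more about Protobuf here. For now, we will install Protobuf on our machine.

Go to https://github.com/protocolbuffers/protobuf/releases

Choose the release that matches your operating system and copy its download link.

Open a terminal or command prompt, change into the cloned repository, and run the following commands.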

cd models/research
wget -O protobuf.zip https://github.com/protocolbuffers/protobuf/releases/download/v3.9.1/protoc-3.9.1-osx-x86_64.zip
unzip protobuf.zip

Note: make sure you unzip protobuf.zip inside the models/research directory.
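The download link used in this step points at the macOS build (osx-x86_64). As a rough sketch of how the link could be chosen per operating system, the snippet below builds the URL from `platform.system()`. The asset suffixes for Linux and Windows are assumptions about how the 64-bit x86 builds are named on the release page, so verify them there before use; `protoc_url` is a hypothetical helper.

```python
import platform

PROTOC_VERSION = "3.9.1"  # the release used in this article

# Assumed mapping from platform.system() to the asset suffix of the
# 64-bit x86 build; other architectures are listed on the release page.
ASSET_SUFFIX = {
    "Darwin": "osx-x86_64",
    "Linux": "linux-x86_64",
    "Windows": "win64",
}


def protoc_url(system=None, version=PROTOC_VERSION):
    """Build the download URL of the protoc zip for the given OS name."""
    system = system or platform.system()
    suffix = ASSET_SUFFIX[system]  # raises KeyError for an unlisted OS
    return (
        "https://github.com/protocolbuffers/protobuf/releases/download/"
        f"v{version}/protoc-{version}-{suffix}.zip"
    )


print(protoc_url("Darwin"))
```

Calling `protoc_url()` without arguments uses the operating system the script is running on.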

Step 4. Compile the Protobuf files

From the research/ directory, run the following command to compile the protocol buffer definitions.

./bin/protoc object_detection/protos/*.proto --python_out=.
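The command above relies on the shell to expand the `*.proto` wildcard, which not every command prompt does. As a hedged alternative, the sketch below expands the wildcard in Python and passes each file to protoc explicitly. `build_protoc_command` and `compile_protos` are hypothetical helpers, and the location of the protoc binary (bin/protoc inside research/) is assumed from the unzip step.

```python
import glob
import os
import subprocess


def build_protoc_command(research_dir, protoc):
    """Return the protoc argument list covering every .proto file in protos/."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    # Paths are made relative to research_dir because protoc is run from there.
    proto_files = sorted(
        os.path.relpath(path, research_dir) for path in glob.glob(pattern)
    )
    return [protoc, *proto_files, "--python_out=."]


def compile_protos(research_dir):
    """Run protoc from inside research_dir; raises if protoc reports an error."""
    # Assumed location of the binary extracted from protobuf.zip.
    protoc = os.path.join(os.path.abspath(research_dir), "bin", "protoc")
    command = build_protoc_command(research_dir, protoc)
    subprocess.run(command, cwd=research_dir, check=True)
```

A call such as `compile_protos("models/research")` would then generate the `*_pb2.py` files next to the `.proto` sources, the same result as the shell command.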

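Implementing Object Detection in Python

Now that all the dependencies are installed, let's use Python to implement object detection.

In the downloaded repository, change into models/research/object_detection. In this directory you will find an IPython notebook named object_detection_tutorial.ipynb. This file is an object detection demo which, when executed, uses the specified 'ssd_mobilenet_v1_coco_2017_11_17' model to detect objects in the two test images provided in the repository.

Below is one of the test outputs:

A few small changes are needed to detect objects in a live video stream. Create a new Jupyter notebook in the same folder and follow the code below.

In [1]: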

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append( ".." )
from utils import ops as utils_ops
if StrictVersion(tf.__version__) < StrictVersion('1.12.0'):
    raise ImportError('Please upgrade your TensorFlow installation to v1.12.*.')

In [2]:

# This is needed to display the images.
get_ipython().run_line_magic( 'matplotlib' , 'inline' )

In [3]:

# Object detection imports
# Here are the imports from the object detection module.
from utils import label_map_util
from utils import visualization_utils as vis_util

In [4]:

# Model preparation
# Any model exported using the `export_inference_graph.py` tool can be loaded here simply by changing `PATH_TO_FROZEN_GRAPH` to point to a new .pb file.
# By default we use an "SSD with Mobilenet" model here.
#See https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
#for a list of other models that can be run out-of-the-box with varying speeds and accuracies.
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_2017_11_17'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join( 'data' , 'mscoco_label_map.pbtxt' )

In [5]:

# Download the model archive and extract only the frozen inference graph.
# urlretrieve is used here because URLopener is deprecated.
urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())
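Every run of the download cell fetches the archive again. A possible refinement, sketched here with the standard library only, is to skip the download when the archive is already on disk and to wrap the extraction in a function. `download_model` and `extract_frozen_graph` are hypothetical helpers, not part of the TensorFlow repository.

```python
import os
import tarfile
import urllib.request


def download_model(url, archive_path):
    """Fetch the model archive unless it is already on disk."""
    if not os.path.exists(archive_path):
        urllib.request.urlretrieve(url, archive_path)
    return archive_path


def extract_frozen_graph(archive_path, dest_dir,
                         target="frozen_inference_graph.pb"):
    """Extract only the frozen graph file and return the extracted paths."""
    extracted = []
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # Compare the file name only, ignoring the folder inside the archive.
            if os.path.basename(member.name) == target:
                tar.extract(member, dest_dir)
                extracted.append(os.path.join(dest_dir, member.name))
    return extracted
```

With the notebook's variables, this would be called as `download_model(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)` followed by `extract_frozen_graph(MODEL_FILE, os.getcwd())`.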

In [6]:

# Load a (frozen) TensorFlow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

In [7]:

# Loading label map
# Label maps map indices to category names, so that when our convolution network predicts `5`,
#we know that this corresponds to `airplane`.  Here we use internal utility functions,
#but anything that returns a dictionary mapping integers to appropriate string labels would be fine
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name= True )
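To make the shape of the label map concrete, here is a small stand-alone sketch of the kind of dictionary the comments describe: integer class ids mapped to entries that carry a name. `build_category_index` is a hypothetical stand-in, not the utility from the repository, and the five entries are the first COCO categories as I understand them from mscoco_label_map.pbtxt.

```python
def build_category_index(categories):
    """Map each category id to its {'id', 'name'} dictionary."""
    return {category["id"]: category for category in categories}


# First five COCO categories, written out by hand for illustration.
categories = [
    {"id": 1, "name": "person"},
    {"id": 2, "name": "bicycle"},
    {"id": 3, "name": "car"},
    {"id": 4, "name": "motorcycle"},
    {"id": 5, "name": "airplane"},
]

example_index = build_category_index(categories)
print(example_index[5]["name"])  # airplane
```

Looking up a predicted class id in such a dictionary is what turns the network's numeric output into a readable label.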

In [8]:

def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in ['num_detections', 'detection_boxes', 'detection_scores',
                        'detection_classes', 'detection_masks']:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for single image
                detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
                # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
                real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[1], image.shape[2])
                detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension
                tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
            # Run inference
            output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image})
            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.int64)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict
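The dictionary returned by this function holds parallel sequences of boxes, class ids and scores. A common follow-up step, sketched below, is to keep only the detections above a confidence threshold. `filter_detections` is a hypothetical helper, the 0.5 threshold is an arbitrary choice, and the example values are made up and written as plain lists instead of NumPy arrays so the snippet runs on its own.

```python
def filter_detections(output_dict, min_score=0.5):
    """Keep detections whose score is at least `min_score`."""
    kept = []
    for box, class_id, score in zip(output_dict["detection_boxes"],
                                    output_dict["detection_classes"],
                                    output_dict["detection_scores"]):
        if score >= min_score:
            kept.append({
                "box": tuple(box),          # (ymin, xmin, ymax, xmax), normalized
                "class_id": int(class_id),
                "score": float(score),
            })
    return kept


# Made-up output for a single frame, for illustration only.
example_output = {
    "detection_boxes": [[0.1, 0.2, 0.5, 0.6], [0.3, 0.3, 0.9, 0.8]],
    "detection_classes": [1, 3],
    "detection_scores": [0.92, 0.31],
}

for detection in filter_detections(example_output):
    print(detection["class_id"], detection["score"])
```

Because the function only iterates and indexes, it should accept the NumPy arrays produced by the real model as well as the lists used here.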

In [9]:

import cv2

cam = cv2.VideoCapture(0)  # open the default webcam
while True:
    ret, image_np = cam.read()
    if not ret:
        # Stop if no frame could be read from the camera.
        break
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np_expanded, detection_graph)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    cv2.imshow('image', cv2.resize(image_np, (1000, 800)))
    # Press 'q' to quit.
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break
# Release the camera and close the window once the loop ends.
cam.release()
cv2.destroyAllWindows()
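The visualization call is told that box coordinates are normalized (`use_normalized_coordinates=True`), meaning each box is given as fractions of the image height and width in the order ymin, xmin, ymax, xmax. If you want to draw or crop boxes yourself, they first have to be converted to pixels. A minimal sketch, with `to_pixel_box` as a hypothetical helper:

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to pixel corners.

    Returns (left, top, right, bottom) as integers.
    """
    ymin, xmin, ymax, xmax = box
    left = int(round(xmin * image_width))
    top = int(round(ymin * image_height))
    right = int(round(xmax * image_width))
    bottom = int(round(ymax * image_height))
    return left, top, right, bottom


# A made-up box on a 1000 x 800 frame, the display size used in the loop.
print(to_pixel_box((0.1, 0.2, 0.5, 0.6), 1000, 800))  # (200, 80, 600, 400)
```

Note the swap in ordering: the model reports y before x, while most drawing functions expect x before y.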

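Endnotes

When you run the Jupyter notebook, the system webcam opens and the detector marks objects of every class the original model was trained to detect.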
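Since the model reports every class it knows, it can be useful to summarize what was seen in a frame. The sketch below counts confident detections per class name. `count_by_class` is a hypothetical helper, the 0.5 threshold is an assumption, and the ids, scores and label entries are made up for illustration.

```python
from collections import Counter


def count_by_class(class_ids, scores, category_index, min_score=0.5):
    """Count detections per class name, ignoring low-confidence ones."""
    counts = Counter()
    for class_id, score in zip(class_ids, scores):
        if score >= min_score:
            entry = category_index.get(int(class_id), {})
            counts[entry.get("name", "unknown")] += 1
    return counts


# Made-up detections for a single frame.
labels = {1: {"name": "person"}, 3: {"name": "car"}, 5: {"name": "airplane"}}
frame_counts = count_by_class([1, 1, 3, 5], [0.9, 0.6, 0.8, 0.2], labels)
print(dict(frame_counts))  # {'person': 2, 'car': 1}
```

In the notebook, the first two arguments would be `output_dict['detection_classes']` and `output_dict['detection_scores']`, with `category_index` as the label lookup.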

For more project ideas, see 25 Best Computer Vision Project Ideas for 2020. Happy learning!

Translated from: https://hackernoon.com/introductory-guide-to-real-time-object-detection-with-python-6jyb36t5
