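This follows on from the previous post, a PyTorch-based implementation of YOLOv5 single-image detection. There we implemented forward inference in PyTorch, but that pipeline depends on the YOLOv5 model files and on rebuilding the network structure, which is fairly cumbersome. Is there a way to run forward inference directly and only post-process the results, without depending on the YOLOv5 source files? That is what this post covers: ONNX-based inference. The process is simple: convert the original model to ONNX format, then run it with onnxruntime. For the details, see my article.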

Contents

I. Converting pt to ONNX
II. Forward inference with onnxruntime
- 1. Installing dependencies
- 2. Code implementation
- 3. Comparing onnxruntime and PyTorch

I. Converting pt to ONNX

Here we mainly follow https://github.com/ultralytics/yolov5/issues/251 for the conversion. Go to the yolov5 installation directory and run:

python models/export.py --weights yolov5s.pt --img 640 --batch 1
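For orientation, the inference script later in this post reshapes three raw output heads, one per stride (8, 16, 32). Below is a minimal sketch of the shapes it appears to expect for a 640x640 export. It assumes each head is laid out as (batch, anchors, grid_h, grid_w, 5 + classes); the helper name `expected_head_shapes` is hypothetical and not part of YOLOv5 or onnxruntime.

```python
def expected_head_shapes(img_size=(640, 640), strides=(8, 16, 32),
                         num_anchors=3, num_classes=1, batch=1):
    """Return the assumed raw output shape of each detection head.

    Assumption: each head is (batch, anchors, grid_h, grid_w, 5 + classes),
    which is consistent with how the inference script reshapes the outputs.
    """
    height, width = img_size
    shapes = []
    for stride in strides:
        # Each head predicts on a grid that is `stride` times coarser
        # than the input image.
        shapes.append((batch, num_anchors, height // stride, width // stride,
                       5 + num_classes))
    return shapes


def total_predictions(shapes):
    """Count candidate boxes across all heads (anchors * grid_h * grid_w)."""
    return sum(anchors * gh * gw for _, anchors, gh, gw, _ in shapes)


shapes = expected_head_shapes()
for stride, shape in zip((8, 16, 32), shapes):
    print("stride", stride, "->", shape)
print("candidate boxes:", total_predictions(shapes))
```

If the shapes reported by your exported model differ from these, the reshape calls in the script will need adjusting accordingly.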

II. Forward inference with onnxruntime

1. Installing dependencies

pip install onnxruntime
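Note that the script in the next section also imports cv2, numpy, torch and torchvision, so those need to be present as well. A small sketch for checking what is installed before running it, using only the standard library; the helper name `installed_version` is hypothetical.

```python
import importlib.util
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but distributed under another name
        # (for example, cv2 is shipped by the opencv-python distribution).
        return "unknown"


for name in ("onnxruntime", "numpy", "cv2", "torch", "torchvision"):
    print(name, installed_version(name))
```

Any entry printed as None is missing and needs to be installed before the script will import.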

2. Code implementation

# coding=utf-8
import cv2  # recent opencv-python builds no longer expose cv2.cv2
import numpy as np
import onnxruntime
import torch
import torchvision
import time
import random


class YOLOV5_ONNX(object):
    def __init__(self, onnx_path):
        '''Initialize the ONNX session'''
        self.onnx_session = onnxruntime.InferenceSession(onnx_path)
        self.input_name = self.get_input_name()
        self.output_name = self.get_output_name()

    def get_input_name(self):
        '''Get the input node names'''
        input_name = []
        for node in self.onnx_session.get_inputs():
            input_name.append(node.name)
        return input_name

    def get_output_name(self):
        '''Get the output node names'''
        output_name = []
        for node in self.onnx_session.get_outputs():
            output_name.append(node.name)
        return output_name

    def get_input_feed(self, image_tensor):
        '''Build the input feed dict'''
        input_feed = {}
        for name in self.input_name:
            input_feed[name] = image_tensor
        return input_feed

    def letterbox(self, img, new_shape=(640, 640), color=(114, 114, 114), auto=False, scaleFill=False, scaleup=True,
                  stride=32):
        '''Resize and pad the image (letterbox)'''
        # Resize and pad image while meeting stride-multiple constraints
        shape = img.shape[:2]  # current shape [height, width]
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        # Scale ratio (new / old)
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not scaleup:  # only scale down, do not scale up (for better test mAP)
            r = min(r, 1.0)

        # Compute padding
        ratio = r, r  # width, height ratios
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
        if auto:  # minimum rectangle
            dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
        elif scaleFill:  # stretch
            dw, dh = 0.0, 0.0
            new_unpad = (new_shape[1], new_shape[0])
            ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

        dw /= 2  # divide padding into 2 sides
        dh /= 2

        if shape[::-1] != new_unpad:  # resize
            img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
        return img, ratio, (dw, dh)

    def xywh2xyxy(self, x):
        # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
        y = np.copy(x)
        y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
        y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
        y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
        y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
        return y

    def nms(self, prediction, conf_thres=0.1, iou_thres=0.6, agnostic=False):
        if prediction.dtype is torch.float16:
            prediction = prediction.float()  # to FP32
        xc = prediction[..., 4] > conf_thres  # candidates
        min_wh, max_wh = 2, 4096  # (pixels) minimum and maximum box width and height
        max_det = 300  # maximum number of detections per image
        output = [None] * prediction.shape[0]
        for xi, x in enumerate(prediction):  # image index, image inference
            x = x[xc[xi]]  # confidence
            if not x.shape[0]:
                continue

            x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf
            box = self.xywh2xyxy(x[:, :4])

            conf, j = x[:, 5:].max(1, keepdim=True)
            x = torch.cat((torch.tensor(box), conf, j.float()), 1)[conf.view(-1) > conf_thres]
            n = x.shape[0]  # number of boxes
            if not n:
                continue
            c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
            boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
            i = torchvision.ops.boxes.nms(boxes, scores, iou_thres)
            if i.shape[0] > max_det:  # limit detections
                i = i[:max_det]
            output[xi] = x[i]

        return output

    def clip_coords(self, boxes, img_shape):
        '''Clamp boxes that fall outside the image'''
        # Clip bounding xyxy bounding boxes to image shape (height, width)
        boxes[:, 0].clamp_(0, img_shape[1])  # x1
        boxes[:, 1].clamp_(0, img_shape[0])  # y1
        boxes[:, 2].clamp_(0, img_shape[1])  # x2
        boxes[:, 3].clamp_(0, img_shape[0])  # y2

    def scale_coords(self, img1_shape, coords, img0_shape, ratio_pad=None):
        '''
        Map coordinates back onto the original image; the inverse operation:
        subtract the padding, then divide by the scaling ratio.
        :param img1_shape: input (network) size
        :param coords: input coordinates
        :param img0_shape: size to map onto (original image)
        :param ratio_pad:
        :return:
        '''
        # Rescale coords (xyxy) from img1_shape to img0_shape
        if ratio_pad is None:  # calculate from img0_shape
            gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain = old / new, the scaling ratio
            pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (
                    img1_shape[0] - img0_shape[0] * gain) / 2  # wh padding, the size of the added border
        else:
            gain = ratio_pad[0][0]
            pad = ratio_pad[1]

        coords[:, [0, 2]] -= pad[0]  # x padding: subtract the border added along x
        coords[:, [1, 3]] -= pad[1]  # y padding: subtract the border added along y
        coords[:, :4] /= gain  # map box coordinates onto the original image
        self.clip_coords(coords, img0_shape)  # bounds check
        return coords

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def infer(self, img_path):
        '''Run forward inference and produce the predictions'''
        # Hyperparameters
        img_size = (640, 640)  # size the image is scaled to
        conf_thres = 0.25  # confidence threshold
        iou_thres = 0.45  # IoU threshold
        class_num = 1  # number of classes
        stride = [8, 16, 32]
        anchor_list = [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]]
        # np.float was removed in NumPy 1.24; np.float64 is the same type
        anchor = np.array(anchor_list).astype(np.float64).reshape(3, -1, 2)

        area = img_size[0] * img_size[1]
        size = [int(area / stride[0] ** 2), int(area / stride[1] ** 2), int(area / stride[2] ** 2)]
        feature = [[int(j / stride[i]) for j in img_size] for i in range(3)]

        # Read the image
        src_img = cv2.imread(img_path)
        src_size = src_img.shape[:2]

        # Resize and pad the image
        img = self.letterbox(src_img, img_size, stride=32)[0]

        # Convert
        img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, to 3x640x640
        img = np.ascontiguousarray(img)

        # Normalize to [0, 1]
        img = img.astype(dtype=np.float32)
        img /= 255.0

        # # BGR to RGB
        # img = img[:, :, ::-1].transpose(2, 0, 1)
        # img = np.ascontiguousarray(img)

        # Add the batch dimension
        img = np.expand_dims(img, axis=0)

        # Forward inference
        start = time.time()
        input_feed = self.get_input_feed(img)
        pred = self.onnx_session.run(output_names=self.output_name, input_feed=input_feed)

        # Extract the features of the three heads
        y = []
        y.append(torch.tensor(pred[0].reshape(-1, size[0] * 3, 5 + class_num)).sigmoid())
        y.append(torch.tensor(pred[1].reshape(-1, size[1] * 3, 5 + class_num)).sigmoid())
        y.append(torch.tensor(pred[2].reshape(-1, size[2] * 3, 5 + class_num)).sigmoid())

        # One [x, y] cell offset per grid position, row by row
        grid = []
        for k, f in enumerate(feature):
            grid.append([[i, j] for j in range(f[0]) for i in range(f[1])])

        # Decode each head into pixel-space boxes
        z = []
        for i in range(3):
            src = y[i]
            xy = src[..., 0:2] * 2. - 0.5
            wh = (src[..., 2:4] * 2) ** 2
            dst_xy = []
            dst_wh = []
            for j in range(3):
                dst_xy.append((xy[:, j * size[i]:(j + 1) * size[i], :] + torch.tensor(grid[i])) * stride[i])
                dst_wh.append(wh[:, j * size[i]:(j + 1) * size[i], :] * anchor[i][j])
            src[..., 0:2] = torch.from_numpy(np.concatenate((dst_xy[0], dst_xy[1], dst_xy[2]), axis=1))
            src[..., 2:4] = torch.from_numpy(np.concatenate((dst_wh[0], dst_wh[1], dst_wh[2]), axis=1))
            z.append(src.view(1, -1, 5 + class_num))

        results = torch.cat(z, 1)
        results = self.nms(results, conf_thres, iou_thres)
        cast = time.time() - start
        print("cast time:{}".format(cast))

        # Map back onto the original image
        img_shape = img.shape[2:]
        print(img_size)
        for det in results:  # detections per image
            if det is not None and len(det):
                det[:, :4] = self.scale_coords(img_shape, det[:, :4], src_size).round()
                self.draw(src_img, det)

    def plot_one_box(self, x, img, color=None, label=None, line_thickness=None):
        # Plots one bounding box on image img
        tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
        color = color or [random.randint(0, 255) for _ in range(3)]
        c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
        cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
        if label:
            tf = max(tl - 1, 1)  # font thickness
            t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
            c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
            cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
            cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf,
                        lineType=cv2.LINE_AA)

    def draw(self, img, boxinfo):
        colors = [[0, 0, 255]]
        for *xyxy, conf, cls in boxinfo:
            label = '%s %.2f' % ('image', conf)
            print('xyxy: ', xyxy)
            self.plot_one_box(xyxy, img, label=label, color=colors[int(cls)], line_thickness=1)
        cv2.namedWindow("dst", 0)
        cv2.imshow("dst", img)
        cv2.imwrite("data/res1.jpg", img)
        cv2.waitKey(0)
        # cv2.imencode('.jpg', img)[1].tofile(os.path.join(dst, id + ".jpg"))
        return 0


if __name__ == "__main__":
    model = YOLOV5_ONNX(onnx_path="./weights/image_detect.onnx")
    model.infer(img_path="data/PMC2663376_00004.jpg")
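The densest part of infer() is the decoding step that turns raw head outputs into pixel-space boxes. As a rough illustration, here is the same arithmetic for a single grid cell in plain Python. The formulas and anchor values are taken from the script above; the function name `decode_cell` and its signature are hypothetical, chosen only for this sketch.

```python
import math

# Anchor sizes in pixels, copied from the script above (three per stride).
ANCHORS = {
    8: [(10, 13), (16, 30), (33, 23)],
    16: [(30, 61), (62, 45), (59, 119)],
    32: [(116, 90), (156, 198), (373, 326)],
}


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def decode_cell(raw, grid_x, grid_y, stride, anchor_wh):
    """Decode one raw prediction (tx, ty, tw, th) into a pixel-space box.

    Mirrors the formulas used in infer():
        xy = (sigmoid(t_xy) * 2 - 0.5 + grid) * stride
        wh = (sigmoid(t_wh) * 2) ** 2 * anchor
    Returns (center_x, center_y, width, height) on the letterboxed image.
    """
    tx, ty, tw, th = raw
    center_x = (sigmoid(tx) * 2.0 - 0.5 + grid_x) * stride
    center_y = (sigmoid(ty) * 2.0 - 0.5 + grid_y) * stride
    width = (sigmoid(tw) * 2.0) ** 2 * anchor_wh[0]
    height = (sigmoid(th) * 2.0) ** 2 * anchor_wh[1]
    return center_x, center_y, width, height


# All-zero logits: sigmoid(0) = 0.5, so the box sits at the centre of
# cell (3, 5) and has exactly the anchor's size.
box = decode_cell((0.0, 0.0, 0.0, 0.0), grid_x=3, grid_y=5, stride=8,
                  anchor_wh=ANCHORS[8][0])
print(box)  # (28.0, 44.0, 10.0, 13.0)
```

Because the sigmoid output lies between 0 and 1, the centre can move at most from 0.5 cells before to 1.5 cells past the cell's origin, and the width and height are bounded by 4 times the anchor size.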

Result:

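3. Comparing onnxruntime and PyTorch

[Figure: onnxruntime inference time]
[Figure: PyTorch inference time]

We compared the two on images resized to 640x640: ONNX inference ran about twice as fast as pure PyTorch. So ONNX inference holds up well. I plan to test other acceleration frameworks later, so stay tuned.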
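The "cast time" printed by the script comes from a single run and covers session.run, decoding and NMS together. For a steadier comparison, one option is to add warm-up calls and average over several runs. Below is a minimal sketch using only the standard library; `benchmark` and `dummy_workload` are hypothetical names, and the workload is a stand-in to be replaced by the real ONNX or PyTorch forward pass on the same preprocessed input.

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=20):
    """Time fn() after a few warm-up calls; return (mean, stdev) in seconds."""
    for _ in range(warmup):
        fn()  # warm-up: lazy initialisation, caches, memory allocation
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def dummy_workload():
    # Stand-in for the real call, e.g. a function wrapping
    # onnx_session.run(...) or the PyTorch forward pass.
    return sum(i * i for i in range(10_000))


mean_s, std_s = benchmark(dummy_workload)
print("mean {:.3f} ms, stdev {:.3f} ms".format(mean_s * 1e3, std_s * 1e3))
```

Timing both backends with the same helper, on the same image and the same machine, makes the resulting ratio easier to trust than two one-off measurements.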

GitHub link: yolov5 forward inference implementation

Reference links:
onnxruntime-for-yolov5
python3 onnx inference demo
