Getting Started with OpenVINO (Part 2)

1. Introduction to OpenVINO
- 1.1 What is OpenVINO
- 1.2 How OpenVINO accelerates networks
  - 1.2.1 Linear Operations Fusing
  - 1.2.2 Precision Calibration
  - 1.2.3 Additional notes
- 1.3 Development workflow
  - 1.3.1 Model Optimizer
  - 1.3.2 Inference Engine
  - 1.3.3 Inference Engine development workflow
2. Demonstrating the Inference Engine with the Super Resolution C++ Demo
- 2.1 Super Resolution C++ source code
- 2.2 Results of the different pretrained models
  - 2.2.1 1-o.jpg
  - 2.2.2 2-0.bmp
  - 2.2.3 1-bmp.bmp
  - 2.2.4 3.bmp
3. Demonstrating the Model Optimizer with the Object Detection C++ Demo
- 3.1 Model conversion workflow
- 3.2 Results of different models with OpenVINO
  - 3.2.1 YOLOv3
  - 3.2.2 YOLOv4
  - 3.2.3 SSD pretrained model
  - 3.2.4 YOLOv5: not directly supported by OpenVINO for now
4. Porting an OpenVINO program

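1. Introduction to OpenVINO

Reference blog posts: [1][2][3]

1.1 What is OpenVINO

Once a model has been trained and is ready for production deployment, a series of questions comes up: does the model's performance meet production requirements, how should the model be embedded into the existing engineering system, and can the inference threads handle enough concurrent streams? These questions determine the return on investment. Only a deep and accurate understanding of the deep learning framework lets you handle these tasks well and meet the deployment requirements. In practice, new models and the frameworks behind them keep changing, so you would wish for engineers who are fluent in every framework; unfortunately, such people are currently rare.

OpenVINO is a pipeline toolkit that is compatible with models trained in a variety of open-source frameworks and covers the capabilities needed to deploy a model in production. Once you know the toolkit, you can quickly deploy a pretrained model on Intel CPUs.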

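1.2 How OpenVINO accelerates networks

Why do we need network acceleration and compression?

Well-known networks such as ResNet and DenseNet are giants, and both their latency and their size are unfriendly to users. Imagine downloading a gesture-recognition app that ships with a 100 MB ResNet inside: not a great experience.

Deep learning models are typically deployed on CPU or GPU devices. Fortunately, both NVIDIA and Intel provide official network acceleration tools: NVIDIA has TensorRT (GPU) and Intel has OpenVINO (CPU).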

1.2.1 Linear Operations Fusing
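As a rough illustration of the general idea behind fusing linear operations: at inference time a batch-normalization step that follows a linear layer is itself linear, so it can be folded into that layer's weights and bias ahead of time, leaving one operation instead of two. The sketch below is not the Model Optimizer's code. The ChannelLinear and BatchNorm structs and all function names are hypothetical, and a per-channel multiply-add stands in for a real convolution.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-channel linear layer: y[c] = weight[c] * x + bias[c].
// It stands in for a convolution; the fusing algebra is the same per output channel.
struct ChannelLinear {
    std::vector<float> weight;
    std::vector<float> bias;
};

// Hypothetical batch-normalization parameters, one value per channel.
struct BatchNorm {
    std::vector<float> gamma, beta, mean, variance;
    float epsilon = 1e-5f;
};

// Fold the batch norm that follows `layer` into the layer's own weight and bias.
ChannelLinear fuseBatchNorm(const ChannelLinear& layer, const BatchNorm& bn) {
    const std::size_t channels = layer.weight.size();
    assert(layer.bias.size() == channels && bn.gamma.size() == channels);
    assert(bn.beta.size() == channels && bn.mean.size() == channels);
    assert(bn.variance.size() == channels);

    ChannelLinear fused{std::vector<float>(channels), std::vector<float>(channels)};
    for (std::size_t c = 0; c < channels; ++c) {
        const float scale = bn.gamma[c] / std::sqrt(bn.variance[c] + bn.epsilon);
        fused.weight[c] = layer.weight[c] * scale;
        fused.bias[c] = (layer.bias[c] - bn.mean[c]) * scale + bn.beta[c];
    }
    return fused;
}

// Reference path: run the layer and the batch norm as two separate steps.
float runUnfused(const ChannelLinear& layer, const BatchNorm& bn, std::size_t c, float x) {
    const float y = layer.weight[c] * x + layer.bias[c];
    return bn.gamma[c] * (y - bn.mean[c]) / std::sqrt(bn.variance[c] + bn.epsilon) + bn.beta[c];
}

// Fused path: a single multiply-add per value.
float runFused(const ChannelLinear& fused, std::size_t c, float x) {
    return fused.weight[c] * x + fused.bias[c];
}
```

For any input, runFused on the fused layer should agree with runUnfused up to floating-point rounding, while doing half the work per value at inference time.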

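1.2.2 Precision Calibration

The networks we train are usually FP32. Once training is finished, inference no longer needs backpropagation, so the data precision can safely be lowered somewhat, for example to FP16 or INT8. Lower precision means lower memory usage, lower latency, and a smaller model.

So what is calibration? We lower the precision of the model's layers one at a time, evaluate on a validation set, and fix an accuracy baseline; once the network's accuracy drops to the baseline, we stop lowering precision. We could also lower the precision of every layer, but the model's accuracy would drop along with it.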
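The procedure above can be sketched in a few lines. This is a minimal illustration, not OpenVINO's calibration tool: it assumes symmetric INT8 quantization with the scale taken from the largest magnitude in the calibration samples, and the evaluate callback (which returns validation accuracy for a given choice of INT8 layers) is a hypothetical stand-in for running the validation set.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Symmetric INT8 scale from calibration data: the largest magnitude maps to 127.
float calibrateScale(const std::vector<float>& samples) {
    float maxAbs = 0.0f;
    for (float v : samples) maxAbs = std::max(maxAbs, std::fabs(v));
    return maxAbs > 0.0f ? maxAbs / 127.0f : 1.0f;
}

// Round to the nearest INT8 step and saturate to [-127, 127].
std::int8_t quantize(float value, float scale) {
    assert(scale > 0.0f);
    const float q = std::round(value / scale);
    return static_cast<std::int8_t>(std::max(-127.0f, std::min(127.0f, q)));
}

float dequantize(std::int8_t q, float scale) {
    return static_cast<float>(q) * scale;
}

// Lower layers to INT8 one at a time; stop as soon as validation accuracy
// falls below the baseline, and keep the offending layer in full precision.
// `evaluate` is a hypothetical callback that runs the validation set.
std::vector<bool> chooseInt8Layers(
        std::size_t layerCount, float baseline,
        const std::function<float(const std::vector<bool>&)>& evaluate) {
    std::vector<bool> useInt8(layerCount, false);
    for (std::size_t i = 0; i < layerCount; ++i) {
        useInt8[i] = true;
        if (evaluate(useInt8) < baseline) {
            useInt8[i] = false;  // revert the layer that broke the baseline
            break;               // and stop lowering precision
        }
    }
    return useInt8;
}
```

Each quantized value is off by at most half a step (scale / 2) as long as it lies inside the calibrated range, which is why the choice of calibration samples matters.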

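1.2.3 Additional notes

Besides shrinking the model, OpenVINO's acceleration also comes from optimizing for the hardware's instruction set so that the hardware runs more efficiently.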

1.3 Development workflow

The OpenVINO toolkit consists mainly of two core components:

- Model Optimizer
- Inference Engine

1.3.1 Model Optimizer

The Model Optimizer converts a given model into the standard Intermediate Representation (IR) and optimizes it.
Deep learning frameworks supported by the Model Optimizer:

- ONNX
- TensorFlow
- Caffe
- MXNet
- Kaldi

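1.3.2 Inference Engine

The Inference Engine accelerates deep learning models at the hardware instruction-set level. It also applies instruction-set optimizations to the traditional OpenCV image processing library, giving a noticeable gain in performance and speed.
Supported hardware devices:

- CPU
- GPU
- FPGA
- VPU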

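1.3.3 Inference Engine development workflow

1. Create an InferenceEngine::Core object, which manages the plugin libraries for the devices.
2. Read the model (network structure and weights), which consists of xxx.xml and xxx.bin.
3. Configure the input and output parameters (this step seems to be optional; everything defaults to the model's own configuration).
4. Load the model onto the device with InferenceEngine::Core::LoadNetwork().
5. Create an inference request with CreateInferRequest().
6. Prepare the input data.
7. Run inference.
8. Process the results.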

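2. Demonstrating the Inference Engine with the Super Resolution C++ Demo

Demo: [official documentation]
For how to run the demo, see section 2.4 of the previous post.

2.1 Super Resolution C++ source code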

// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief The entry point for inference engine Super Resolution demo application
 * @file super_resolution_demo/main.cpp
 * @example super_resolution_demo/main.cpp
 */
#include <algorithm>
#include <vector>
#include <string>
#include <memory>

#include <inference_engine.hpp>

#include <samples/slog.hpp>
#include <samples/args_helper.hpp>
#include <samples/ocv_common.hpp>

#include "super_resolution_demo.h"

using namespace InferenceEngine;

bool ParseAndCheckCommandLine(int argc, char *argv[]) {
    // --------------------------- Parsing and validation of input args ------------------------------------
    slog::info << "Parsing input parameters" << slog::endl;

    gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
    if (FLAGS_h) {
        showUsage();
        showAvailableDevices();
        return false;
    }

    if (FLAGS_i.empty()) {
        throw std::logic_error("Parameter -i is not set");
    }

    if (FLAGS_m.empty()) {
        throw std::logic_error("Parameter -m is not set");
    }

    return true;
}

int main(int argc, char *argv[]) {
    try {
        slog::info << "InferenceEngine: " << printable(*GetInferenceEngineVersion()) << slog::endl;

        // --------------------------- Parsing and validation of input args --------------------------------
        if (!ParseAndCheckCommandLine(argc, argv)) {
            return 0;
        }

        /** This vector stores paths to the processed images **/
        std::vector<std::string> imageNames;
        parseInputFilesArguments(imageNames);
        if (imageNames.empty()) throw std::logic_error("No suitable images were found");
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 1. Load inference engine --------------------------------------------
        slog::info << "Loading Inference Engine" << slog::endl;
        Core ie;

        /** Printing device version **/
        slog::info << "Device info: " << slog::endl;
        slog::info << printable(ie.GetVersions(FLAGS_d)) << slog::endl;

        if (!FLAGS_l.empty()) {
            // CPU(MKLDNN) extensions are loaded as a shared library and passed as a pointer to base extension
            IExtensionPtr extension_ptr = make_so_pointer<IExtension>(FLAGS_l);
            ie.AddExtension(extension_ptr, "CPU");
            slog::info << "CPU Extension loaded: " << FLAGS_l << slog::endl;
        }
        if (!FLAGS_c.empty()) {
            // clDNN Extensions are loaded from an .xml description and OpenCL kernel files
            ie.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, FLAGS_c}}, "GPU");
            slog::info << "GPU Extension loaded: " << FLAGS_c << slog::endl;
        }
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 2. Read IR Generated by ModelOptimizer (.xml and .bin files) --------
        slog::info << "Loading network files" << slog::endl;

        /** Read network model **/
        auto network = ie.ReadNetwork(FLAGS_m);
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 3. Configure input & output -----------------------------------------

        // --------------------------- Prepare input blobs -------------------------------------------------
        slog::info << "Preparing input blobs" << slog::endl;

        /** Taking information about all topology inputs **/
        ICNNNetwork::InputShapes inputShapes(network.getInputShapes());

        if (inputShapes.size() != 1 && inputShapes.size() != 2)
            throw std::logic_error("The demo supports topologies with 1 or 2 inputs only");
        std::string lrInputBlobName = inputShapes.begin()->first;
        SizeVector lrShape = inputShapes[lrInputBlobName];
        if (lrShape.size() != 4) {
            throw std::logic_error("Number of dimensions for an input must be 4");
        }
        // A model like single-image-super-resolution-???? may take bicubic interpolation of the input image as the
        // second input
        std::string bicInputBlobName;
        if (inputShapes.size() == 2) {
            bicInputBlobName = (++inputShapes.begin())->first;
            SizeVector bicShape = inputShapes[bicInputBlobName];
            if (bicShape.size() != 4) {
                throw std::logic_error("Number of dimensions for both inputs must be 4");
            }
            if (lrShape[2] >= bicShape[2] && lrShape[3] >= bicShape[3]) {
                lrInputBlobName.swap(bicInputBlobName);
                lrShape.swap(bicShape);
            } else if (!(lrShape[2] <= bicShape[2] && lrShape[3] <= bicShape[3])) {
                throw std::logic_error("Each spatial dimension of one input must surpass or be equal to a spatial "
                                       "dimension of another input");
            }
        }

        /** Collect images **/
        std::vector<cv::Mat> inputImages;
        for (const auto &i : imageNames) {
            /** Get size of low resolution input **/
            int w = lrShape[3];
            int h = lrShape[2];
            int c = lrShape[1];

            cv::Mat img = cv::imread(i, c == 1 ? cv::IMREAD_GRAYSCALE : cv::IMREAD_COLOR);
            if (img.empty()) {
                slog::warn << "Image " + i + " cannot be read!" << slog::endl;
                continue;
            }

            if (c != img.channels()) {
                slog::warn << "Number of channels of the image " << i << " is not equal to " << c << ". Skip it\n";
                continue;
            }

            if (w != img.cols || h != img.rows) {
                slog::warn << "Size of the image " << i << " is not equal to " << w << "x" << h << ". Resize it\n";
                cv::resize(img, img, {w, h});
            }
            inputImages.push_back(img);
        }
        if (inputImages.empty()) throw std::logic_error("Valid input images were not found!");

        /** Setting batch size using image count **/
        inputShapes[lrInputBlobName][0] = inputImages.size();
        if (!bicInputBlobName.empty()) {
            inputShapes[bicInputBlobName][0] = inputImages.size();
        }
        network.reshape(inputShapes);
        slog::info << "Batch size is " << std::to_string(network.getBatchSize()) << slog::endl;

        // --------------------------- Prepare output blobs ------------------------------------------------
        slog::info << "Preparing output blobs" << slog::endl;

        OutputsDataMap outputInfo(network.getOutputsInfo());
        // BlobMap outputBlobs;
        std::string firstOutputName;

        for (auto &item : outputInfo) {
            if (firstOutputName.empty()) {
                firstOutputName = item.first;
            }
            DataPtr outputData = item.second;
            if (!outputData) {
                throw std::logic_error("output data pointer is not valid");
            }

            item.second->setPrecision(Precision::FP32);
        }
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 4. Loading model to the device --------------------------------------
        slog::info << "Loading model to the device" << slog::endl;
        ExecutableNetwork executableNetwork = ie.LoadNetwork(network, FLAGS_d);
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 5. Create infer request ---------------------------------------------
        slog::info << "Create infer request" << slog::endl;
        InferRequest inferRequest = executableNetwork.CreateInferRequest();
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 6. Prepare input ----------------------------------------------------
        Blob::Ptr lrInputBlob = inferRequest.GetBlob(lrInputBlobName);
        for (size_t i = 0; i < inputImages.size(); ++i) {
            cv::Mat img = inputImages[i];
            matU8ToBlob<float_t>(img, lrInputBlob, i);

            if (!bicInputBlobName.empty()) {
                Blob::Ptr bicInputBlob = inferRequest.GetBlob(bicInputBlobName);
                int w = bicInputBlob->getTensorDesc().getDims()[3];
                int h = bicInputBlob->getTensorDesc().getDims()[2];

                cv::Mat resized;
                cv::resize(img, resized, cv::Size(w, h), 0, 0, cv::INTER_CUBIC);

                matU8ToBlob<float_t>(resized, bicInputBlob, i);
            }
        }
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 7. Do inference -----------------------------------------------------
        std::cout << "To close the application, press 'CTRL+C' here";
        if (FLAGS_show) {
            std::cout << " or switch to the output window and press any key";
        }
        std::cout << std::endl;

        slog::info << "Start inference" << slog::endl;
        inferRequest.Infer();
        // -------------------------------------------------------------------------------------------------

        // --------------------------- 8. Process output ---------------------------------------------------
        const Blob::Ptr outputBlob = inferRequest.GetBlob(firstOutputName);
        LockedMemory<const void> outputBlobMapped = as<MemoryBlob>(outputBlob)->rmap();
        const auto outputData = outputBlobMapped.as<float*>();

        size_t numOfImages = outputBlob->getTensorDesc().getDims()[0];
        size_t numOfChannels = outputBlob->getTensorDesc().getDims()[1];
        size_t h = outputBlob->getTensorDesc().getDims()[2];
        size_t w = outputBlob->getTensorDesc().getDims()[3];
        size_t nunOfPixels = w * h;

        slog::info << "Output size [N,C,H,W]: " << numOfImages << ", " << numOfChannels << ", " << h << ", " << w << slog::endl;

        for (size_t i = 0; i < numOfImages; ++i) {
            std::vector<cv::Mat> imgPlanes;
            if (numOfChannels == 3) {
                imgPlanes = std::vector<cv::Mat>{
                    cv::Mat(h, w, CV_32FC1, &(outputData[i * nunOfPixels * numOfChannels])),
                    cv::Mat(h, w, CV_32FC1, &(outputData[i * nunOfPixels * numOfChannels + nunOfPixels])),
                    cv::Mat(h, w, CV_32FC1, &(outputData[i * nunOfPixels * numOfChannels + nunOfPixels * 2]))};
            } else {
                imgPlanes = std::vector<cv::Mat>{cv::Mat(h, w, CV_32FC1, &(outputData[i * nunOfPixels * numOfChannels]))};

                // Post-processing for text-image-super-resolution models
                cv::threshold(imgPlanes[0], imgPlanes[0], 0.5f, 1.0f, cv::THRESH_BINARY);
            }

            for (auto & img : imgPlanes)
                img.convertTo(img, CV_8UC1, 255);

            cv::Mat resultImg;
            cv::merge(imgPlanes, resultImg);

            if (FLAGS_show) {
                cv::imshow("result", resultImg);
                cv::waitKey();
            }

            std::string outImgName = std::string("sr_" + std::to_string(i + 1) + ".png");
            cv::imwrite(outImgName, resultImg);
        }
        // -------------------------------------------------------------------------------------------------
    }
    catch (const std::exception &error) {
        slog::err << error.what() << slog::endl;
        return 1;
    }
    catch (...) {
        slog::err << "Unknown/internal exception happened" << slog::endl;
        return 1;
    }

    slog::info << "Execution successful" << slog::endl;
    slog::info << slog::endl << "This demo is an API example, for any performance measurements "
                                "please use the dedicated benchmark_app tool from the openVINO toolkit" << slog::endl;
    return 0;
}
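Step 8 of the demo is the least obvious part: the output blob is a planar [N,C,H,W] float buffer, so each channel of image i starts at i * nunOfPixels * numOfChannels + channel * nunOfPixels, and the planes are then scaled by 255 and merged into one 8-bit image. The sketch below reproduces that index arithmetic without OpenCV, so the layout is easier to follow. The function names are hypothetical, and the rounding here (std::round plus clamping) only approximates what convertTo does in the demo.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Offset of element (n, c, y, x) in a densely packed NCHW buffer.
std::size_t nchwOffset(std::size_t n, std::size_t c, std::size_t y, std::size_t x,
                       std::size_t channels, std::size_t height, std::size_t width) {
    return ((n * channels + c) * height + y) * width + x;
}

// Convert image `n` of a planar NCHW float buffer (values nominally in [0, 1])
// into an interleaved 8-bit HWC image: scale by 255, round, clamp to [0, 255].
std::vector<std::uint8_t> planarToInterleaved8U(const std::vector<float>& nchw, std::size_t n,
                                                std::size_t channels, std::size_t height,
                                                std::size_t width) {
    assert(nchw.size() >= (n + 1) * channels * height * width);
    std::vector<std::uint8_t> hwc(height * width * channels);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            for (std::size_t c = 0; c < channels; ++c) {
                const float scaled = nchw[nchwOffset(n, c, y, x, channels, height, width)] * 255.0f;
                const float clamped = std::max(0.0f, std::min(255.0f, std::round(scaled)));
                hwc[(y * width + x) * channels + c] = static_cast<std::uint8_t>(clamped);
            }
        }
    }
    return hwc;
}
```

In the demo, the same offsets are used to wrap each plane in a cv::Mat without copying, and cv::merge performs the planar-to-interleaved step.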

2.2 Results of the different pretrained models

Intel provides three pretrained models:

- single-image-super-resolution-1032, which is the model that performs super resolution 4x upscale on a 270x480 image
- single-image-super-resolution-1033, which is the model that performs super resolution 3x upscale on a 360x640 image
- text-image-super-resolution-0001, which is the model that performs super resolution 3x upscale on a 360x640 image

2.2.1 1-o.jpg

1. single-image-super-resolution-1032, which is the model that performs super resolution 4x upscale on a 270x480 image

Overall comparison:
Detail comparison:

2. single-image-super-resolution-1033, which is the model that performs super resolution 3x upscale on a 360x640 image

Overall comparison:
Detail comparison:

3. text-image-super-resolution-0001, which is the model that performs super resolution 3x upscale on a 360x640 image

Comparison of the three models:

2.2.2 2-0.bmp

1. single-image-super-resolution-1032, which is the model that performs super resolution 4x upscale on a 270x480 image

Overall comparison:

Detail comparison:

2. single-image-super-resolution-1033, which is the model that performs super resolution 3x upscale on a 360x640 image

Overall comparison:

Detail comparison:

1033 vs. 1032:

3. text-image-super-resolution-0001, which is the model that performs super resolution 3x upscale on a 360x640 image

2.2.3 1-bmp.bmp

1033:

PNG vs. BMP:

2.2.4 3.bmp

1033:

Detail comparison:

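3. Demonstrating the Model Optimizer with the Object Detection C++ Demo

Demo: [official documentation]

3.1 Model conversion workflow

For the exact YOLO conversion steps, see the reference blog posts or the official documentation.
Reference blog posts: [1][2][yolov4]
OpenVINO official documentation: [official guide to converting YOLOv1-v3 models]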

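3.2 Results of different models with OpenVINO

3.2.1 YOLOv3

3.2.2 YOLOv4

3.2.3 SSD pretrained model

I searched for a way to convert an SSD model and found none; it turns out that the pretrained model OpenVINO provides is already an SSD.
Reference blog post: [1]
Official pretrained model: person-detection-retail-0013

3.2.4 YOLOv5: not directly supported by OpenVINO for now

For the reason, see another author's blog post.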

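4. Porting an OpenVINO program

An ordinary C++ program can be bundled as an exe plus the DLLs it needs and run directly on someone else's computer. An OpenVINO program is different: it needs part of the OpenVINO environment, though not all of it.
For details, see another author's blog post.
In general, whichever DLL is reported missing, find it and copy it over.
If you get: plugins.xml:1:0: File was not found

copy everything shown here along with the program.

Everything that one super-resolution program carries with it:
With this folder, the program runs on any computer.
Folder link: (extraction code: qwww)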
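Before shipping such a folder, it can help to check that everything the program expects is actually there, rather than discovering missing files one error at a time. The helper below is a minimal sketch of that idea: findMissingFiles is a hypothetical name, and the list of required files is whatever your own program turned out to need (plugins.xml from the error above is one entry; the DLL names are up to you).

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Return the entries of `required` that cannot be opened inside `directory`.
// The file list is supplied by the caller; nothing here is specific to OpenVINO.
std::vector<std::string> findMissingFiles(const std::string& directory,
                                          const std::vector<std::string>& required) {
    assert(!directory.empty());
    std::vector<std::string> missing;
    for (const std::string& name : required) {
        std::ifstream file(directory + "/" + name, std::ios::binary);
        if (!file.is_open()) {
            missing.push_back(name);  // not present (or not readable) in the bundle
        }
    }
    return missing;
}
```

Calling findMissingFiles on the bundle folder with a list such as plugins.xml plus the DLLs you copied returns an empty vector when the bundle is complete.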
