Having spent the past while implementing a TensorFlow-to-ncnn converter, I have built up a working understanding of the ncnn framework along the way, which I am sharing here.

For the details and steps of tensorflow2ncnn, see my GitHub:

https://github.com/hanzy88/tensorflow2ncnn

So far CNN+FC conversion is supported, and yolov3 based on both a full CNN and MobileNetV2 has been converted and tested successfully; apart from some precision loss the results are essentially correct. It also runs on a Jetson Nano (though the full-CNN yolov3 is so large that it runs extremely slowly there).

Back to the topic. When I first read the ncnn source I started bottom-up from Mat and friends, and even after reading a fair amount I never formed a concrete picture. So this series reads ncnn's forward propagation top-down, starting from how the model and weights are loaded.

Contents

  • 1. The model and weight files
    • Demo: loading a model and extracting output
  • 2. Reading the param model
  • 3. Reading the bin file
  • 4. Forward propagation

1. The model and weight files

After an ncnn model conversion you normally end up with two files:

ncnn.param
ncnn.bin

param stores the network structure, while bin stores the weights of ops such as convolution.

The param file looks like this:

7767517
3 3
Input         input    0 1 data 0=4 1=4 2=1
InnerProduct  ip       1 1 data fc 0=10 1=1 2=80
Softmax       softmax  1 1 fc prob 0=0

The first line is the magic number, fixed at 7767517.
On the second line, the first number is the layer count and the second the blob count. A blob is the data structure passed between layers, defined in blob.h and blob.cpp as follows:

class Blob
{
public:
    // empty
    Blob();

public:
#if NCNN_STRING
    // blob name
    std::string name;
#endif // NCNN_STRING
    // index of the layer which produces this blob as output
    int producer;
    // indexes of the layers which consume this blob as input
    std::vector<int> consumers;
};

From the third line to the end, each line records one layer. Taking the third line as an example: column 1 is the layer's op type, column 2 the layer name, and columns 3 and 4 the numbers of input and output blobs; the strings that follow are the input and output blob names, and the numbers after those are the layer's constant parameters. In detail:

[layer type] [layer name] [input count] [output count] [input blobs] [output blobs] [layer specific params]

layer type : type name, such as Convolution Softmax etc
layer name : name of this layer, must be unique among all layer names
input count : count of the blobs this layer needs as input
output count : count of the blobs this layer produces as output
input blobs : name list of all the input blob names, separated by space, must be unique among input blob names of all layers
output blobs : name list of all the output blob names, separated by space, must be unique among output blob names of all layers
layer specific params : key=value pair list, separated by space

Layer parameters:

0=1 1=2.5 -23303=2,2.0,3.0

Each key index should be unique within a layer line, and a pair can be omitted if the default value is used. The meaning of each existing param key index can be looked up at https://github.com/Tencent/ncnn/wiki/operation-param-weight-table

integer or float key : index 0 ~ 19
integer value : int
float value : float
integer array or float array key : -23300 minus index 0 ~ 19
integer array value : [array size],int,int,...,int
float array value : [array size],float,float,...,float
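As a minimal sketch of the array-key rule, a converter emitting an integer-array parameter for key index 3 could write it like this. The key index and values here are invented purely for illustration, and pp stands for the FILE* of the .param being generated, as in tensorflow2ncnn.cpp:

#include <cstdio>

int main()
{
    FILE* pp = stdout;                  // stands in for the generated .param file
    const int values[4] = {0, 1, 2, 3};
    const int array_size = 4;
    // array key = -23300 minus the index; index 3 gives -23303
    fprintf(pp, " %d=%d", -23300 - 3, array_size);  // " -23303=4"
    for (int i = 0; i < array_size; i++)
        fprintf(pp, ",%d", values[i]);              // ",0,1,2,3"
    fprintf(pp, "\n");
    return 0;
}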

That is the official explanation. If it still isn't clear, take the code that writes Range's parameters in my tensorflow2ncnn.cpp as an example:

else if (node.op() == "Range")
{
    const tensorflow::TensorProto& start = weights[node.input(0)];
    const tensorflow::TensorProto& limit = weights[node.input(1)];
    const tensorflow::TensorProto& delta = weights[node.input(2)];

    const int* start_data = reinterpret_cast<const int*>(start.int_val().begin());
    const int* limit_data = reinterpret_cast<const int*>(limit.int_val().begin());
    const int* delta_data = reinterpret_cast<const int*>(delta.int_val().begin());

    fprintf(pp, " 0=%d", *start_data);
    fprintf(pp, " 1=%d", *limit_data);
    fprintf(pp, " 2=%d", *delta_data);
}

The keys 0, 1 and 2 here can be assigned however you like, but you must know what each one stands for so the forward pass can fetch the values back by the same indexes (i.e. 0, 1, 2). The same index can therefore mean different things in different layers: in the InnerProduct line above, 0=10 is the output size, while for Range key 0 holds start.

The bin file is laid out as follows:

  +---------+---------+---------+---------+---------+---------+
  | weight1 | weight2 | weight3 | weight4 | ....... | weightN |
  +---------+---------+---------+---------+---------+---------+
  ^         ^         ^         ^
  0x0       0x80      0x140     0x1C0

The model binary is the concatenation of all the weight data; each weight buffer is aligned to 32 bits.

Each weight buffer has the following structure:

[flag] (optional)
[raw data]
[padding] (optional)

flag : unsigned int, little-endian, indicating the weight storage type, 0 => float32, 0x01306B47 => float16, otherwise => quantized int8; may be omitted if the layer implementation forces the storage type explicitly
raw data : raw weight data, little-endian, float32 data or float16 data or quantized table and indexes depending on the storage type flag
padding : padding space for 32bit alignment, may be omitted if already aligned
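To make the layout concrete, here is a minimal sketch of reading one weight buffer, assuming the flag is present and indicates float32. This is my own illustration of the format above, not ncnn's actual ModelBinFromStdio code:

#include <cstdio>
#include <vector>

// Sketch: read one weight buffer (flag + raw data + padding) from a .bin file,
// assuming the flag is present and equals 0 (float32).
std::vector<float> read_weight_buffer(FILE* fp, int count)
{
    unsigned int flag = 0;
    fread(&flag, sizeof(flag), 1, fp);  // storage type flag
    // 0 => float32; 0x01306B47 => float16; otherwise quantized int8 (not handled here)
    std::vector<float> data(count);
    if (flag == 0)
        fread(data.data(), sizeof(float), count, fp);
    // skip padding so the next buffer starts 32bit-aligned
    long pos = ftell(fp);
    long aligned = (pos + 3) & ~3L;
    fseek(fp, aligned - pos, SEEK_CUR);
    return data;
}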

refer to:
https://github.com/Tencent/ncnn/wiki/param-and-model-file-structure

Demo: loading a model and extracting output

Once you have the two model files, construct a Net object, load the param and bin, feed the input, and then you can extract the output of any named blob:

ncnn::Net net;
net.load_param("ncnn.param");
net.load_model("ncnn.bin");

const int target_size = 227;
int img_w = bgr.cols;
int img_h = bgr.rows;
ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR,
                                             bgr.cols, bgr.rows, target_size, target_size);

ncnn::Extractor ex = net.create_extractor();
ex.input("input", in);

ncnn::Mat out;
ex.extract("softmax", out);
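Continuing the snippet, the scores can then be read straight out of the Mat. This mirrors the official squeezenet example; the assumption that the softmax output is a 1D blob of out.w values is mine:

// copy the 1D softmax scores out of the extracted Mat
std::vector<float> cls_scores(out.w);
for (int j = 0; j < out.w; j++)
    cls_scores[j] = out[j];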

(Reading of both the param model and the bin file is defined in net.h and net.cpp.)

2. Reading the param model

Take Net::load_param(FILE* fp) as an example:

int Net::load_param(FILE* fp)
{
    // read the magic number
    int magic = 0;
    int nbr = fscanf(fp, "%d", &magic);
    if (nbr != 1)
    {
        LOG_HAN;
        fprintf(stderr, "issue with param file\n");
        return -1;
    }
    if (magic != 7767517)
    {
        fprintf(stderr, "param is too old, please regenerate\n");
        return -1;
    }

    // parse layer_count and blob_count
    int layer_count = 0;
    int blob_count = 0;
    nbr = fscanf(fp, "%d %d", &layer_count, &blob_count);
    if (nbr != 2 || layer_count <= 0 || blob_count <= 0)
    {
        //LOG_HAN;
        fprintf(stderr, "nbr %d, layer_count %d, blob_count %d", nbr, layer_count, blob_count);
        fprintf(stderr, "issue with param file\n");
        return -1;
    }

    layers.resize((size_t)layer_count);
    blobs.resize((size_t)blob_count);

    ParamDict pd;

    int blob_index = 0;
    for (int i=0; i<layer_count; i++)
    {
        int nscan = 0;

        char layer_type[257];  // each type maps to a different op
        char layer_name[257];  // name of the current layer
        int bottom_count = 0;
        int top_count = 0;
        // read layer op, layer name, input count and output count
        nscan = fscanf(fp, "%256s %256s %d %d", layer_type, layer_name, &bottom_count, &top_count);
        if (nscan != 4)
        {
            continue;
        }

        // create the Layer object, dynamically dispatched by op type;
        // Layer is defined in layer.h/layer.cpp, per-op code lives in src/layer
        Layer* layer = create_layer(layer_type);
        if (!layer)
        {
            layer = create_custom_layer(layer_type);
        }
        if (!layer)
        {
            fprintf(stderr, "layer %s of name %s not exists or registered\n", layer_type, layer_name);
            clear();
            return -1;
        }

        layer->type = std::string(layer_type);
        layer->name = std::string(layer_name);
//         fprintf(stderr, "new layer %d %s\n", i, layer_name);

        // create the input blobs and record them in the current layer
        layer->bottoms.resize(bottom_count);
        for (int j=0; j<bottom_count; j++)
        {
            char bottom_name[257];
            nscan = fscanf(fp, "%256s", bottom_name);
            //fprintf(stderr, "new blob %s, %d\n", bottom_name, __LINE__);
            if (nscan != 1)
            {
                continue;
            }

            int bottom_blob_index = find_blob_index_by_name(bottom_name);
            //LOG_HAN;
            if (bottom_blob_index == -1)
            {
                Blob& blob = blobs[blob_index];
                bottom_blob_index = blob_index;
                blob.name = std::string(bottom_name);
                fprintf(stderr, "new blob %s, %d\n", bottom_name, blob_index);
                blob_index++;
            }

            Blob& blob = blobs[bottom_blob_index];
            blob.consumers.push_back(i);

            layer->bottoms[j] = bottom_blob_index;
        }

        // create the output blobs and record them in the current layer
        layer->tops.resize(top_count);
        for (int j=0; j<top_count; j++)
        {
            Blob& blob = blobs[blob_index];

            char blob_name[257];
            nscan = fscanf(fp, "%256s", blob_name);
            if (nscan != 1)
            {
                continue;
            }

            blob.name = std::string(blob_name);
//             fprintf(stderr, "new blob %s\n", blob_name);

            blob.producer = i;

            layer->tops[j] = blob_index;

            blob_index++;
        }

        // layer specific params: the constant parameters discussed above;
        // ParamDict::load_param reads out every defined value
        int pdlr = pd.load_param(fp);
        if (pdlr != 0)
        {
            fprintf(stderr, "ParamDict load_param failed\n");
            continue;
        }

        // call the load_param overridden by the concrete op layer
        // to pick out the values it needs
        int lr = layer->load_param(pd);
        if (lr != 0)
        {
            fprintf(stderr, "layer load_param failed\n");
            continue;
        }

        layers[i] = layer;
    }

    return 0;
}

As for loading each layer's own parameters, again take my Range as the example (defined in src/layer/tfrange.h and tfrange.cpp):

int TFRange::load_param(const ParamDict& pd)
{
    start = pd.get(0, 0);
    limit = pd.get(1, 1);
    delta = pd.get(2, 1);
    //fprintf(stderr, "slices: %d %d %d \n", start, limit, delta);
    return 0;
}

This recovers the three values written at conversion time; pd.get(key, default) falls back to the default when a key is absent from the param line.
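For example, a hypothetical param line for this layer (the blob names and counts are invented for illustration):

TFRange range1 1 1 data out 0=0 1=10 2=1

would give start=0, limit=10 and delta=1 in the code above; if 2=1 were omitted, delta would stay at the default 1 passed to pd.get.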

3. Reading the bin file

Take Net::load_model(FILE* fp) as an example:

int Net::load_model(FILE* fp)
{
    if (layers.empty())
    {
        fprintf(stderr, "network graph not ready\n");
        return -1;
    }

    // load file
    int ret = 0;

    // mb wraps the file stream; defined in modelbin.h and modelbin.cpp
    ModelBinFromStdio mb(fp);

    // load each layer in turn
    for (size_t i=0; i<layers.size(); i++)
    {
        Layer* layer = layers[i];

        //Here we found inconsistent content in the parameter file.
        if (!layer)
        {
            fprintf(stderr, "load_model error at layer %d, parameter file has inconsistent content.\n", (int)i);
            ret = -1;
            break;
        }

        // call the current layer's load_model; if the layer does not override it,
        // Layer::load_model simply returns 0.
        // an override passes in the element count to read, e.g.:
        //   weight_data = mb.load(weight_data_size, 0);
        // which calls Mat ModelBinFromStdio::load(int w, int type) const
        // and reads from the file stream with fread
        int lret = layer->load_model(mb);
        if (lret != 0)
        {
            fprintf(stderr, "layer load_model %d failed\n", (int)i);
            ret = -1;
            break;
        }

        int cret = layer->create_pipeline(opt);
        if (cret != 0)
        {
            fprintf(stderr, "layer create_pipeline %d failed\n", (int)i);
            ret = -1;
            break;
        }
    }

    fuse_network();

    return ret;
}

Note that not every layer uses the weight file, and the same goes for the layer-specific params in the param file; what gets written is decided per op at conversion time. See FusedBatchnorm in tensorflow2ncnn.cpp:

const tensorflow::TensorProto& scale = weights[node.input(1)];
const tensorflow::TensorProto& B = weights[node.input(2)];
const tensorflow::TensorProto& mean = weights[node.input(3)];
const tensorflow::TensorProto& var = weights[node.input(4)];

int channels = scale.tensor_shape().dim(0).size(); // data size
//fprintf(stderr, "channels: %d\n", channels);

int dtype = scale.dtype();

switch (dtype)
{
    case 1: //float
    {
        float * scale_tensor = (float *)malloc(sizeof(float) * channels);
        float * mean_tensor = (float *)malloc(sizeof(float) * channels);
        float * var_tensor = (float *)malloc(sizeof(float) * channels);
        float * b_tensor = (float *)malloc(sizeof(float) * channels);

        const float * scale_data = reinterpret_cast<const float *>(scale.tensor_content().c_str());
        const float * mean_data = reinterpret_cast<const float *>(mean.tensor_content().c_str());
        const float * var_data = reinterpret_cast<const float *>(var.tensor_content().c_str());
        const float * b_data = reinterpret_cast<const float *>(B.tensor_content().c_str());

        for (int i=0; i<channels; i++)
        {
            scale_tensor[i] = *scale_data++;
            mean_tensor[i] = *mean_data++;
            var_tensor[i] = *var_data++;
            b_tensor[i] = *b_data++;
            //fprintf(stderr, "scale_data: %f\n", * scale_data);
        }

        fwrite(scale_tensor, sizeof(float), channels, bp);
        fwrite(mean_tensor, sizeof(float), channels, bp);
        fwrite(var_tensor, sizeof(float), channels, bp);
        fwrite(b_tensor, sizeof(float), channels, bp);

        break;
    }

After the mean, variance, scale and shift factors are written out, the forward side reads them back in the BatchNorm layer (batchnorm.cpp) through load_model:

int BatchNorm::load_model(const ModelBin& mb)
{
    slope_data = mb.load(channels, 1);
    if (slope_data.empty())
        return -100;

    mean_data = mb.load(channels, 1);
    if (mean_data.empty())
        return -100;

    var_data = mb.load(channels, 1);
    if (var_data.empty())
        return -100;

    bias_data = mb.load(channels, 1);
    if (bias_data.empty())
        return -100;

    a_data.create(channels);
    if (a_data.empty())
        return -100;
    b_data.create(channels);
    if (b_data.empty())
        return -100;

    for (int i=0; i<channels; i++)
    {
        float sqrt_var = sqrt(var_data[i] + eps);
        a_data[i] = bias_data[i] - slope_data[i] * mean_data[i] / sqrt_var;
        b_data[i] = slope_data[i] / sqrt_var;
    }

    return 0;
}
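The precomputed a_data/b_data fold the whole normalization into a single multiply-add per element:

b_data[i] * x + a_data[i]
  = slope * (x - mean) / sqrt(var + eps) + bias

which is exactly the batch-norm formula, with the division and square root paid once at load time instead of on every forward pass.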

4. Forward propagation

After creating an ncnn::Extractor, first feed in the input:

// straightforward: look up the input blob's index by name and assign the data
int Extractor::input(const char* blob_name, const Mat& in)
{
    int blob_index = net->find_blob_index_by_name(blob_name);
    if (blob_index == -1)
        return -1;

    return input(blob_index, in);
}

// create_extractor builds blob_mats, a vector of blob_count Mats
// that holds every blob's computed data
int Extractor::input(int blob_index, const Mat& in)
{
    if (blob_index < 0 || blob_index >= (int)blob_mats.size())
        return -1;

    blob_mats[blob_index] = in;

    return 0;
}

Next comes extraction. You can extract any blob defined in the network, and layers that a previous extract has already run are not recomputed; only when a later extract touches layers that have not run yet is extra time spent computing them.
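A sketch of that caching behavior (the blob names conv5 and softmax are placeholders; substitute whatever your param file defines):

ncnn::Extractor ex = net.create_extractor();
ex.input("input", in);

ncnn::Mat feat_a;
ex.extract("conv5", feat_a);   // runs every layer needed to produce conv5

ncnn::Mat feat_b;
ex.extract("softmax", feat_b); // reuses cached results, only runs the remaining tail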

The code is long, so only excerpts are shown:

// likewise, look up the output blob's index by name
int Extractor::extract(const char* blob_name, Mat& feat)
{
    int blob_index = net->find_blob_index_by_name(blob_name);
    if (blob_index == -1)
        return -1;

    return extract(blob_index, feat);
}

int Extractor::extract(int blob_index, Mat& feat)
{
    if (blob_index < 0 || blob_index >= (int)blob_mats.size())
        return -1;

    int ret = 0;

    // dims == 0 means this blob has not been computed yet:
    // find the layer that produces it and run the network up to there
    if (blob_mats[blob_index].dims == 0)
    {
        int layer_index = net->blobs[blob_index].producer;
        ret = net->forward_layer(layer_index, blob_mats, opt);
    }

    // hand back the result stored for the requested blob index
    feat = blob_mats[blob_index];

    return ret;
}

forward_layer is what carries data between layers:

int Net::forward_layer(int layer_index, std::vector<Mat>& blob_mats, Option& opt) const
{
    // recurse backwards from the extracted layer; each result is stored
    // into blob_mats at the corresponding blob index
    const Layer* layer = layers[layer_index];

    // one_blob_only == true means single input, single output
    if (layer->one_blob_only)
    {
        // load bottom blob
        int bottom_blob_index = layer->bottoms[0];
        int top_blob_index = layer->tops[0];

        if (blob_mats[bottom_blob_index].dims == 0)
        {
            // the input is not computed yet: recurse into its producer
            int ret = forward_layer(blobs[bottom_blob_index].producer, blob_mats, opt);
            if (ret != 0)
                return ret;
        }

        Mat bottom_blob = blob_mats[bottom_blob_index];

        if (opt.lightmode)
        {
            // delete after taken in light mode
            blob_mats[bottom_blob_index].release();
            // deep copy for inplace forward if data is shared
            if (layer->support_inplace && *bottom_blob.refcount != 1)
            {
                bottom_blob = bottom_blob.clone();
            }
        }

        // forward: with support_inplace, the output Mat replaces the input Mat
        if (opt.lightmode && layer->support_inplace)
        {
            Mat& bottom_top_blob = bottom_blob;
            // call the forward_inplace overridden by the concrete layer
            int ret = layer->forward_inplace(bottom_top_blob, opt);
            if (ret != 0)
                return ret;

            // store top blob
            blob_mats[top_blob_index] = bottom_top_blob;
        }
        else
        {
            // otherwise call the forward overload with separate input and output Mats
            Mat top_blob;
            int ret = layer->forward(bottom_blob, top_blob, opt);
            if (ret != 0)
                return ret;

            // store top blob
            blob_mats[top_blob_index] = top_blob;
        }
    }
    else
    {
        // load bottom blobs
        // handles the many-to-many / many-to-one / one-to-many cases
        std::vector<Mat> bottom_blobs(layer->bottoms.size());
        for (size_t i=0; i<layer->bottoms.size(); i++)
        {
            int bottom_blob_index = layer->bottoms[i];

            if (blob_mats[bottom_blob_index].dims == 0)
            {
                int ret = forward_layer(blobs[bottom_blob_index].producer, blob_mats, opt);
                if (ret != 0)
                    return ret;
            }

            bottom_blobs[i] = blob_mats[bottom_blob_index];

            if (opt.lightmode)
            {
                // delete after taken in light mode
                blob_mats[bottom_blob_index].release();
                // deep copy for inplace forward if data is shared
                if (layer->support_inplace && *bottom_blobs[i].refcount != 1)
                {
                    bottom_blobs[i] = bottom_blobs[i].clone();
                }
            }
        }

        // forward
        if (opt.lightmode && layer->support_inplace)
        {
            std::vector<Mat>& bottom_top_blobs = bottom_blobs;
            int ret = layer->forward_inplace(bottom_top_blobs, opt);
            if (ret != 0)
                return ret;

            // store top blobs
            for (size_t i=0; i<layer->tops.size(); i++)
            {
                int top_blob_index = layer->tops[i];
                blob_mats[top_blob_index] = bottom_top_blobs[i];
            }
        }
        else
        {
            std::vector<Mat> top_blobs(layer->tops.size());
            int ret = layer->forward(bottom_blobs, top_blobs, opt);
            if (ret != 0)
                return ret;

            // store top blobs
            for (size_t i=0; i<layer->tops.size(); i++)
            {
                int top_blob_index = layer->tops[i];
                blob_mats[top_blob_index] = top_blobs[i];
            }
        }
    }

    return 0;
}
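To make the recursion concrete with the three-layer param from section 1: extracting prob calls forward_layer on Softmax, which finds its bottom blob fc uncomputed (dims == 0) and recurses into its producer, InnerProduct; InnerProduct in turn needs data, which ex.input already filled, so the recursion bottoms out there. InnerProduct then runs and stores fc, and finally Softmax runs and stores prob.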

This is the end-to-end flow of ncnn's forward pass, driven by the param and bin files.

To be continued.
