Building Your Own Deep Learning Inference Framework - Lesson 6 - Building Your Own Computation Graph

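Project homepage

https://github.com/zjhellofss/KuiperInfer Thank you for the stars and PRs. They are my biggest encouragement.

KuiperInfer now supports YoloV5s inference. A recording is available here: Recording of KuiperInfer running YoloV5s inference

Companion video course for this lesson

Video course link. Please watch the video alongside these notes.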

Companion code

git clone https://gitee.com/fssssss/KuiperCourse/
cd KuiperCourse
git checkout six
# Backup git URL: https://gitee.com/fssssss/Kuipe

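PNNX

The PNNX project: PyTorch Neural Network eXchange (PNNX) is an open standard for PyTorch model interoperability.

PNNX provides an open-source model format for PyTorch. It defines a data-flow graph and a set of operations that match PyTorch's. Our framework wraps PNNX in a simpler, easier-to-use computation graph format. Once a model has been trained in PyTorch, it is first converted to the PNNX format; we then read the PNNX files and build our own computation graph from them.

How do we get from PyTorch to our computation graph?

PNNX already does a lot of graph optimization and operator fusion for us. By using PNNX as the bottom layer we inherit the optimized graph, which makes later inference faster.

However, we do not use PNNX's structures directly throughout the project, because someone else's design never lines up exactly with the way we want to develop our own inference framework. Wrapping it gives us something that is quick to build, convenient, and suited to our own habits. In practice, our only use of PNNX is to read the model it exports and then build our own easy-to-use computation graph structure from it.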

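PNNX format definition

A PNNX graph is made up of operands and operators. pnnx::Graph manages and manipulates both.

Starting from an operand, you can also walk back to the operator that produced it (its Producer) and to the operators that use it (its Consumers).

Code link

Operand

Definition link

An Operand consists of the following parts:

Producer: of type Operator. It is the operator that produced this operand; in other words, the operand is an output of its Producer.

For example, if the Producer is an Add, the Operand is the result of that Add.
Consumer: of type Operator. It is an operator that takes this operand as its input. Note that an operand has exactly one Producer, but it may have several Consumers, each of which takes the operand as an input.
Name: of type std::string. The name of the operand.
Shape: of type std::vector<int>. The dimensions of the operand.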

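Operator

Definition link

An Operator consists of the following parts:

Inputs: of type std::vector<Operand*>. The input operands the operator needs for its computation.
Outputs: of type std::vector<Operand*>. The output operands the operator produces.
Type, Name: both of type std::string. The type and the name of the operator.
Params: of type std::map<std::string, Parameter>. Holds all parameters of the operator (for a Convolution operator, for example, params holds stride, padding, kernel size, and so on).
Attrs: of type std::map<std::string, Attribute>. Holds the weights the operator needs (for a Convolution operator, for example, attrs holds the convolution weight and bias).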
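To make the producer/consumer relationship concrete, here is a minimal sketch. The types MiniOperand and MiniOperator and the helpers LinkOutput, LinkInput, and ConsumerCount are hypothetical stand-ins written for this illustration; they are not the real pnnx classes and only mirror the fields described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the PNNX types (not the real pnnx headers).
struct MiniOperator;

struct MiniOperand {
  std::string name;
  std::vector<int> shape;
  MiniOperator* producer = nullptr;      // exactly one producer per operand
  std::vector<MiniOperator*> consumers;  // zero or more consumers
};

struct MiniOperator {
  std::string type;
  std::string name;
  std::vector<MiniOperand*> inputs;
  std::vector<MiniOperand*> outputs;
};

// Record that `op` produces `operand` (sets both directions of the link).
inline void LinkOutput(MiniOperator& op, MiniOperand& operand) {
  op.outputs.push_back(&operand);
  operand.producer = &op;
}

// Record that `op` consumes `operand` (sets both directions of the link).
inline void LinkInput(MiniOperator& op, MiniOperand& operand) {
  op.inputs.push_back(&operand);
  operand.consumers.push_back(&op);
}

// Number of operators that read the given operand.
inline std::size_t ConsumerCount(const MiniOperand& operand) {
  return operand.consumers.size();
}
```

With an Add operator linked as the producer of an operand and two other operators linked as its inputs, operand.producer points at the Add and ConsumerCount returns 2, which is the "one producer, many consumers" rule stated above.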

Our wrapper around PNNX

Wrapping Operands

struct RuntimeOperand {
  std::string name;  /// Name of the operand
  std::vector<int32_t> shapes;  /// Shape of the operand
  std::vector<std::shared_ptr<Tensor<float>>> datas;  /// Data held by the operand
  RuntimeDataType type = RuntimeDataType::kTypeUnknown;  /// Element type of the operand, usually float
};

Wrapping Operators

RuntimeOperator is our wrapper around pnnx::Operator. The conversion from the PNNX graph to the KuiperInfer graph is described in detail below.

/// A compute node in the computation graph
struct RuntimeOperator {
  ~RuntimeOperator();
  std::string name;  /// Name of the operator node
  std::string type;  /// Type of the operator node
  std::shared_ptr<Layer> layer;  /// The Layer that performs this node's computation
  std::vector<std::string> output_names;  /// Names of the nodes that receive this operator's output
  std::shared_ptr<RuntimeOperand> output_operands;  /// Output operand of the operator
  std::map<std::string, std::shared_ptr<RuntimeOperand>> input_operands;  /// Input operands of the operator
  std::vector<std::shared_ptr<RuntimeOperand>> input_operands_seq;  /// Input operands of the operator, in order
  std::map<std::string, RuntimeParameter *> params;  /// Parameters of the operator
  std::map<std::string, std::shared_ptr<RuntimeAttribute>> attribute;  /// Attributes of the operator, including the weights
};

From the PNNX graph to the KuiperInfer graph

Code link for this section

1. Load the PNNX graph

int load_result = this->graph_->load(param_path_, bin_path_);

2. Get the operators of the PNNX graph

std::vector<pnnx::Operator *> operators = this->graph_->ops;
if (operators.empty()) {
  LOG(ERROR) << "Can not read the layers' define";
  return false;
}

3. Iterate over the operators of the PNNX graph and build the KuiperInfer graph

for (const pnnx::Operator *op : operators) {
  ...
}

4. Initialize the inputs of the RuntimeOperator

This step initializes two members of RuntimeOperator: input_operands and input_operands_seq.

We parse the PNNX graph to initialize the input side of the KuiperInfer RuntimeOperator. Put simply, the inputs of a pnnx::Operator are converted into the inputs of a KuiperInfer RuntimeOperator.

struct RuntimeOperator {
  /// The two members initialized in this step
  std::map<std::string, std::shared_ptr<RuntimeOperand>> input_operands;  /// Input operands of the operator
  std::vector<std::shared_ptr<RuntimeOperand>> input_operands_seq;  /// Input operands of the operator, in order
  ...
};

Converting the inputs of a pnnx::Operator into the inputs of a KuiperInfer operator (code link):

const pnnx::Operator *op = ...
const std::vector<pnnx::Operand *> &inputs = op->inputs;
if (!inputs.empty()) {
  InitInputOperators(inputs, runtime_operator);
}
....
void RuntimeGraph::InitInputOperators(const std::vector<pnnx::Operand *> &inputs,
                                      const std::shared_ptr<RuntimeOperator> &runtime_operator) {
  // Walk the pnnx input operands and use them to initialize
  // the inputs of the KuiperInfer operator (RuntimeOperator).
  for (const pnnx::Operand *input : inputs) {
    if (!input) {
      continue;
    }
    // The producer of this pnnx operand (a pnnx::Operator)
    const pnnx::Operator *producer = input->producer;
    // Create the runtime_operand that becomes an input of the RuntimeOperator
    std::shared_ptr<RuntimeOperand> runtime_operand = std::make_shared<RuntimeOperand>();
    // Set the name and shape of runtime_operand
    runtime_operand->name = producer->name;
    runtime_operand->shapes = input->shape;
    switch (input->type) {
      case 1: {
        runtime_operand->type = RuntimeDataType::kTypeFloat32;
        break;
      }
      case 0: {
        runtime_operand->type = RuntimeDataType::kTypeUnknown;
        break;
      }
      default: {
        LOG(FATAL) << "Unknown input operand type: " << input->type;
      }
    }
    // Store runtime_operand in the KuiperInfer operator, keyed by the producer's name
    runtime_operator->input_operands.insert({producer->name, runtime_operand});
    runtime_operator->input_operands_seq.push_back(runtime_operand);
  }
}
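The key point of this step is that each input is stored twice: once in a map keyed by the producer's name, and once in a vector that preserves input order. The sketch below reproduces only that behavior with hypothetical mock types (MockOperator, MockOperand, MockRuntimeOperand, MockRuntimeOperator) so that it compiles without pnnx or KuiperInfer. The extra null check on the producer is an addition of this sketch, not something the original code does.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical mock types; they only carry the fields this step touches.
struct MockOperator {
  std::string name;
};

struct MockOperand {
  const MockOperator* producer = nullptr;
  std::vector<int> shape;
};

struct MockRuntimeOperand {
  std::string name;
  std::vector<int> shapes;
};

struct MockRuntimeOperator {
  std::map<std::string, std::shared_ptr<MockRuntimeOperand>> input_operands;
  std::vector<std::shared_ptr<MockRuntimeOperand>> input_operands_seq;
};

// Fill both input containers of `op` from the given source operands.
inline void InitInputsSketch(const std::vector<const MockOperand*>& inputs,
                             MockRuntimeOperator& op) {
  for (const MockOperand* input : inputs) {
    if (!input || !input->producer) {
      continue;  // defensive check added in this sketch
    }
    auto operand = std::make_shared<MockRuntimeOperand>();
    operand->name = input->producer->name;  // the input is named after its producer
    operand->shapes = input->shape;
    op.input_operands.insert({operand->name, operand});  // lookup by name
    op.input_operands_seq.push_back(operand);            // lookup by position
  }
}
```

Both containers hold the same shared_ptr, so data written through one view is visible through the other.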

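5. Initialize the outputs of the RuntimeOperator

This step initializes RuntimeOperator.output_names. We parse the PNNX graph to initialize the output side of the KuiperInfer operator. Code link

Put simply, the outputs of a pnnx::Operator are converted into the output of a KuiperInfer operator: for every output operand, the names of its consumers are recorded.

void RuntimeGraph::InitOutputOperators(const std::vector<pnnx::Operand *> &outputs,
                                       const std::shared_ptr<RuntimeOperator> &runtime_operator) {
  for (const pnnx::Operand *output : outputs) {
    if (!output) {
      continue;
    }
    // Every consumer of this output operand is a successor of the current operator
    const auto &consumers = output->consumers;
    for (const auto &c : consumers) {
      runtime_operator->output_names.push_back(c->name);
    }
  }
}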
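Because output_names stores only the names of the consumer operators, some later step has to turn those names back into nodes. The sketch below shows one way that lookup could be done. GraphNode and ResolveSuccessors are hypothetical names invented for this illustration, and this lesson's code does not show how KuiperInfer itself performs the lookup.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type: just a name plus the names of its successors.
struct GraphNode {
  std::string name;
  std::vector<std::string> output_names;  // names of the operators that consume this node's output
};

// Return the successor nodes of `current`, in the order listed in output_names.
// Names that do not match any node are skipped.
inline std::vector<std::shared_ptr<GraphNode>> ResolveSuccessors(
    const std::vector<std::shared_ptr<GraphNode>>& nodes, const GraphNode& current) {
  // Index all nodes by name once, then resolve each recorded output name.
  std::map<std::string, std::shared_ptr<GraphNode>> by_name;
  for (const auto& node : nodes) {
    by_name[node->name] = node;
  }
  std::vector<std::shared_ptr<GraphNode>> successors;
  for (const std::string& name : current.output_names) {
    auto it = by_name.find(name);
    if (it != by_name.end()) {
      successors.push_back(it->second);
    }
  }
  return successors;
}
```

Storing names instead of pointers keeps the node structure simple to build in a single pass, at the cost of one name lookup per edge when the graph is later traversed.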

6. Initialize the weights (Attr) of the RuntimeOperator

This step fills the RuntimeAttribute entries of the KuiperInfer RuntimeOperator. Attributes hold the weights an operator needs for its computation, for example the weights and bias of a Convolution operator.

// Initialize the attributes (weights) of the operator
const pnnx::Operator *op = ...
const std::map<std::string, pnnx::Attribute> &attrs = op->attrs;
if (!attrs.empty()) {
  InitGraphAttrs(attrs, runtime_operator);
}

Code link

void RuntimeGraph::InitGraphAttrs(const std::map<std::string, pnnx::Attribute> &attrs,
                                  const std::shared_ptr<RuntimeOperator> &runtime_operator) {
  for (const auto &pair : attrs) {
    const std::string &name = pair.first;
    // 1. Get the Attribute from pnnx
    const pnnx::Attribute &attr = pair.second;
    switch (attr.type) {
      case 1: {
        // 2. Initialize the KuiperInfer operator's attribute from the pnnx Attribute
        std::shared_ptr<RuntimeAttribute> runtime_attribute = std::make_shared<RuntimeAttribute>();
        runtime_attribute->type = RuntimeDataType::kTypeFloat32;
        // 2.1 Copy the weights (the data here is a raw byte vector, std::vector<char>, not floats)
        runtime_attribute->weight_data = attr.data;
        runtime_attribute->shape = attr.shape;
        runtime_operator->attribute.insert({name, runtime_attribute});
        break;
      }
      default: {
        LOG(FATAL) << "Unknown attribute type";
      }
    }
  }
}
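Since the weights are copied as raw bytes, they have to be reinterpreted as float values before a layer can use them. The sketch below shows one safe way to do this with std::memcpy. BytesToFloats is a hypothetical helper written for this illustration, and it assumes the bytes are float32 values in the machine's native byte order; how KuiperInfer performs this step is not shown in this lesson.

```cpp
#include <cstring>
#include <stdexcept>
#include <vector>

// Reinterpret a raw byte buffer as a vector of float32 values.
// Throws if the buffer length is not a multiple of sizeof(float).
inline std::vector<float> BytesToFloats(const std::vector<char>& bytes) {
  if (bytes.size() % sizeof(float) != 0) {
    throw std::invalid_argument("byte buffer length is not a multiple of sizeof(float)");
  }
  std::vector<float> values(bytes.size() / sizeof(float));
  if (!bytes.empty()) {
    // memcpy avoids the alignment and aliasing problems of casting char* to float*
    std::memcpy(values.data(), bytes.data(), bytes.size());
  }
  return values;
}
```

For a convolution weight with shape {out_channels, in_channels, kh, kw}, the returned vector would be expected to hold out_channels * in_channels * kh * kw values, which is a convenient consistency check against the attribute's shape.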

7. Initialize the parameters (Param) of the RuntimeOperator

Put simply, the params of a pnnx::Operator are used to initialize the params of a KuiperInfer RuntimeOperator.

const std::map<std::string, pnnx::Parameter> &params = op->params;
if (!params.empty()) {
  InitGraphParams(params, runtime_operator);
}

RuntimeParameter has several derived classes so that it can represent the many kinds of parameters that operators carry. A convolution operator, for example, has a std::string parameter, padding_mode, as well as integer-typed (for example uint32_t) parameters such as kernel_size and padding_size, so we need several parameter types to support them.

In other words, each entry in a RuntimeOperator's params can be an instance of any of these derived classes; this is where we rely on polymorphism. Two of the derived classes of RuntimeParameter, the Int parameter and the Float parameter, are shown below.

std::map<std::string, RuntimeParameter *> params;  /// Parameters of the operator
// Polymorphism is achieved through pointers to the base class
struct RuntimeParameter {  /// Parameter information of a compute node
  virtual ~RuntimeParameter() = default;
  explicit RuntimeParameter(RuntimeParameterType type = RuntimeParameterType::kParameterUnknown) : type(type) {}
  RuntimeParameterType type = RuntimeParameterType::kParameterUnknown;
};
/// Parameter of type int
struct RuntimeParameterInt : public RuntimeParameter {
  RuntimeParameterInt() : RuntimeParameter(RuntimeParameterType::kParameterInt) {}
  int value = 0;
};
/// Parameter of type float
struct RuntimeParameterFloat : public RuntimeParameter {
  RuntimeParameterFloat() : RuntimeParameter(RuntimeParameterType::kParameterFloat) {}
  float value = 0.f;
};
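To show how a parameter stored through the base-class pointer can be read back with its concrete type, here is a small self-contained sketch. Param, ParamInt, ParamFloat, and GetIntParam are hypothetical names that mirror the structure above. The sketch also stores the parameters in std::unique_ptr rather than raw pointers, which is a design variation of this example and not what the framework code above does.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical miniature of the parameter hierarchy described above.
enum class ParamType { kUnknown, kInt, kFloat };

struct Param {
  virtual ~Param() = default;
  explicit Param(ParamType t = ParamType::kUnknown) : type(t) {}
  ParamType type;
};

struct ParamInt : public Param {
  ParamInt() : Param(ParamType::kInt) {}
  int value = 0;
};

struct ParamFloat : public Param {
  ParamFloat() : Param(ParamType::kFloat) {}
  float value = 0.f;
};

using ParamMap = std::map<std::string, std::unique_ptr<Param>>;

// Look up `name` and, if it holds an int parameter, write its value to `out`.
// Returns false when the name is missing or the parameter has another type.
inline bool GetIntParam(const ParamMap& params, const std::string& name, int& out) {
  auto it = params.find(name);
  if (it == params.end()) {
    return false;
  }
  // dynamic_cast yields nullptr when the stored object is not a ParamInt
  const auto* as_int = dynamic_cast<const ParamInt*>(it->second.get());
  if (!as_int) {
    return false;
  }
  out = as_int->value;
  return true;
}
```

The virtual destructor in the base class is what makes dynamic_cast usable here and lets the map destroy derived objects correctly through base pointers.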

Converting the params of a pnnx::Operator into the params of a RuntimeOperator.
Code link

void RuntimeGraph::InitGraphParams(const std::map<std::string, pnnx::Parameter> &params,
                                   const std::shared_ptr<RuntimeOperator> &runtime_operator) {
  for (const auto &pair : params) {
    const std::string &name = pair.first;
    const pnnx::Parameter &parameter = pair.second;
    const int type = parameter.type;
    // Initialize the KuiperInfer RuntimeOperator's parameter from the pnnx Parameter
    switch (type) {
      case int(RuntimeParameterType::kParameterUnknown): {
        RuntimeParameter *runtime_parameter = new RuntimeParameter;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      // Use the derived class RuntimeParameterBool here
      case int(RuntimeParameterType::kParameterBool): {
        RuntimeParameterBool *runtime_parameter = new RuntimeParameterBool;
        runtime_parameter->value = parameter.b;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      // Use the derived class RuntimeParameterInt here
      case int(RuntimeParameterType::kParameterInt): {
        RuntimeParameterInt *runtime_parameter = new RuntimeParameterInt;
        runtime_parameter->value = parameter.i;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      case int(RuntimeParameterType::kParameterFloat): {
        RuntimeParameterFloat *runtime_parameter = new RuntimeParameterFloat;
        runtime_parameter->value = parameter.f;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      case int(RuntimeParameterType::kParameterString): {
        RuntimeParameterString *runtime_parameter = new RuntimeParameterString;
        runtime_parameter->value = parameter.s;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      case int(RuntimeParameterType::kParameterIntArray): {
        RuntimeParameterIntArray *runtime_parameter = new RuntimeParameterIntArray;
        runtime_parameter->value = parameter.ai;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      case int(RuntimeParameterType::kParameterFloatArray): {
        RuntimeParameterFloatArray *runtime_parameter = new RuntimeParameterFloatArray;
        runtime_parameter->value = parameter.af;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      case int(RuntimeParameterType::kParameterStringArray): {
        RuntimeParameterStringArray *runtime_parameter = new RuntimeParameterStringArray;
        runtime_parameter->value = parameter.as;
        runtime_operator->params.insert({name, runtime_parameter});
        break;
      }
      default: {
        LOG(FATAL) << "Unknown parameter type";
      }
    }
  }
}

8. Initialization complete

Each KuiperInfer RuntimeOperator initialized by the steps above is stored in a vector:

this->operators_.push_back(runtime_operator);

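Verifying our computation graph

We have prepared the following computation graph in advance (how it was prepared is not the focus of this lesson; readers can simply use it). It is stored in the tmp directory and consists of two convolutions, one Add (expression), and one max-pooling layer.

TEST(test_runtime, runtime1) {
  using namespace kuiper_infer;
  const std::string &param_path = "./tmp/test.pnnx.param";
  const std::string &bin_path = "./tmp/test.pnnx.bin";
  RuntimeGraph graph(param_path, bin_path);
  graph.Init();
  const auto operators = graph.operators();
  // Print the type and name of every operator in the converted graph
  for (const auto &operator_ : operators) {
    LOG(INFO) << "type: " << operator_->type << " name: " << operator_->name;
  }
}

The function above is a test. Init is the function we have just analyzed; it carries out the conversion from the PNNX graph to the KuiperInfer graph.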

Final output

I20230107 11:53:33.033838 56358 test_main.cpp:13] Start test...
I20230107 11:53:33.034411 56358 test_runtime1.cpp:17] type: pnnx.Input name: pnnx_input_0
I20230107 11:53:33.034421 56358 test_runtime1.cpp:17] type: nn.Conv2d name: conv1
I20230107 11:53:33.034425 56358 test_runtime1.cpp:17] type: nn.Conv2d name: conv2
I20230107 11:53:33.034430 56358 test_runtime1.cpp:17] type: pnnx.Expression name: pnnx_expr_0
I20230107 11:53:33.034435 56358 test_runtime1.cpp:17] type: nn.MaxPool2d name: max
I20230107 11:53:33.034440 56358 test_runtime1.cpp:17] type: pnnx.Output name: pnnx_output_0

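As the log shows, the result produced by Init matches the graph defined in Figure 1: besides the input and output nodes, it contains two Conv layers (conv1 and conv2), one add layer (the Expression), and one max-pooling layer (MaxPool2d).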
