Model quantization: pytorch2onnx

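Contents

Creating an operation in PyTorch
- Defining a custom LinearFunction step by step
- Using LinearFunction directly
- Extending LinearFunction into a module
Making torch.onnx recognize a custom op
- The custom op raises an error during ONNX export
  - The op is already standardized in ONNX
  - The op is not standardized
- Handling the custom LinearFunction op
- PyTorch-to-ONNX export code
- The corresponding part of the exported ONNX graph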

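Creating an operation in PyTorch

First check whether PyTorch already provides the operation you need. If it does not, PyTorch lets you implement custom layers and extend it with special operators, and it also lets you write the backward pass for operations that are not differentiable. PyTorch differentiates automatically, but some operations have no usable derivative, and for those you have to define the gradient yourself. This is what the documentation calls "Extending torch.autograd".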

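Defining a custom PyTorch op means extending PyTorch.
How to extend: subclass autograd.Function.
A subclass of autograd.Function only needs to implement two static methods:
- forward: computes the forward pass of the op.
  - Before forward runs, Variable arguments have already been converted to Tensors (since PyTorch 0.4, Variable and Tensor are the same type).
  - forward's parameters may have default values, and a default can be any Python object.
  - It can return any number of Tensors.
  - Any Python operation can be used inside it, but the returned values must be Tensors.
- backward: computes the gradients.
  - It needs one parameter per value returned by forward, plus ctx.
  - It must return one value per forward parameter (not counting ctx).
  - The arguments passed to backward are Variables (plain Tensors since PyTorch 0.4).
  - The values returned by backward must also be Variables (Tensors, or None for an input that needs no gradient).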
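The counting rules in the list above can be sketched with a small toy op. `ScaledMul` and its `scale` parameter are hypothetical names chosen for illustration and do not come from the original article; the sketch assumes a recent PyTorch where Variable and Tensor are merged.

```python
import torch
from torch.autograd import Function


class ScaledMul(Function):
    """Toy op: y = scale * a * b (element-wise); scale is a plain Python float."""

    @staticmethod
    def forward(ctx, a, b, scale=1.0):
        # Tensors needed later go through save_for_backward;
        # non-tensor state can be kept as an attribute on ctx.
        ctx.save_for_backward(a, b)
        ctx.scale = scale
        return scale * a * b

    @staticmethod
    def backward(ctx, grad_out):
        # forward returned one value, so backward receives one gradient.
        a, b = ctx.saved_tensors
        grad_a = grad_out * ctx.scale * b
        grad_b = grad_out * ctx.scale * a
        # forward has three parameters besides ctx, so three values are
        # returned; the non-tensor `scale` gets None.
        return grad_a, grad_b, None


a = torch.tensor([1.0, 2.0], requires_grad=True)
b = torch.tensor([3.0, 4.0], requires_grad=True)
y = ScaledMul.apply(a, b, 2.0)
y.sum().backward()
```

After the backward call, `a.grad` holds `2 * b` and `b.grad` holds `2 * a`, which is what the hand-written backward computes.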

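Defining a custom LinearFunction step by step

import torch
from torch.autograd import gradcheck
from torch.autograd import Variable
from torch.autograd import Function

'''symbolic can be seen as specifying the output format of the pytorch -> onnx
conversion. Put simply, we are creating our own non-ATen operator (op) that is
not standardized in ONNX. The corresponding symbolic in my code looks like this.
'''
class LinearFunction(Function):
    # beta and alpha serve no real purpose here; they only demonstrate that a
    # custom op can pass network parameters through the torch -> onnx conversion.
    # Note: during export, symbolic receives g followed by the same arguments as
    # forward. As written, this parameter list follows addmm and would have to be
    # aligned with forward(input, weight, bias) before exporting.
    @staticmethod
    def symbolic(g, self, mat1, mat2, beta, alpha):
        # return g.op("nonentity", mat1, mat2, self, beta_f=beta, alpha_f=alpha)
        return g.op("nonentity", self, mat1, mat2, beta_f=beta, alpha_f=alpha)

    # forward and backward must both be static methods!
    @staticmethod
    # bias is optional and defaults to None
    def forward(ctx, input, weight, bias=None):
        # input and weight are already Tensors here.
        # Use ctx to stash whatever backward will need.
        # ctx.save_for_backward can only store tensors or None, nothing else,
        # and only forward's arguments or forward's return values.
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

    # forward returns a single value, so backward takes a single gradient.
    @staticmethod
    def backward(ctx, grad_output):
        # This method presumably runs inside a torch.no_grad() context.
        # grad_output is a Variable (a Tensor in current PyTorch).
        # Unpack the saved tensors first, then initialise every gradient
        # that should be returned to None.
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_weight = grad_bias = None

        # The needs_input_grad checks are optional and can be dropped to keep
        # the code simpler. Returning a gradient for an input that does not
        # need one is not an error.
        # The number of return values must equal the number of forward
        # parameters (not counting ctx).
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)
        if ctx.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = grad_output.sum(0).squeeze(0)

        # The gradients must be in the same order as forward's parameters.
        return grad_input, grad_weight, grad_bias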

That is the whole process of subclassing Function. PyTorch offers both Function and Module wrappers: linear can be called directly as a function, like F.conv2d, or wrapped in a Module and used like nn.Conv2d.

Using LinearFunction directly

# input and weight are Variables (Tensors in current PyTorch)
def linear(input, weight, bias=None):
    # The Function must be invoked through apply;
    # Function.apply presumably does a fair amount of work internally.
    return LinearFunction.apply(input, weight, bias)

if __name__ == '__main__':
    in_ = torch.randn((20, 20), requires_grad=True, dtype=torch.double)
    weight_ = torch.randn((20, 20), requires_grad=True, dtype=torch.double)
    res = linear(in_, weight_)
    loss = res.sum()  # reduce to a scalar
    # Backward pass: since loss = sum(y), grad_outputs = dloss/dy = 1
    # and can be left out.
    loss.backward()
    # print(in_.grad)
    # print(weight_.grad)

    input = (torch.randn((20, 20), requires_grad=True, dtype=torch.double),
             torch.randn((30, 20), requires_grad=True, dtype=torch.double))
    test = gradcheck(LinearFunction.apply, input, eps=1e-6, atol=1e-4)
    # Prints True if the check passes
    print(test)
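The gradcheck call above compares analytical gradients against finite-difference estimates, which is why double precision is used. The following is a simplified sketch of that idea, not gradcheck's actual implementation: it checks only the gradient of the summed output rather than the full Jacobian, and `numerical_grad` is a hypothetical helper written for this illustration.

```python
import torch


def numerical_grad(fn, x, eps=1e-6):
    """Central-difference estimate of d(fn(x).sum())/dx, one element at a time."""
    grad = torch.zeros_like(x)
    flat_x = x.detach().clone().view(-1)
    flat_g = grad.view(-1)  # view into grad, so writes below fill grad
    for i in range(flat_x.numel()):
        orig = flat_x[i].item()
        flat_x[i] = orig + eps
        plus = fn(flat_x.view(x.shape)).sum().item()
        flat_x[i] = orig - eps
        minus = fn(flat_x.view(x.shape)).sum().item()
        flat_x[i] = orig  # restore the perturbed element
        flat_g[i] = (plus - minus) / (2 * eps)
    return grad


torch.manual_seed(0)
x = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
w = torch.randn(5, 3, dtype=torch.double)


def fn(t):
    # Same forward computation as the linear op without bias
    return t.mm(w.t())


fn(x).sum().backward()
analytic = x.grad
numeric = numerical_grad(fn, x)
max_err = (analytic - numeric).abs().max().item()
```

With float32 inputs the rounding error of the finite difference would be much larger relative to `eps`, which is the usual reason gradcheck inputs are created with `dtype=torch.double`.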

Extending LinearFunction into a module

Extending to a module is straightforward: override __init__ and forward of nn.Module.

import torch.nn as nn

class Linear(nn.Module):
    def __init__(self, input_features, output_features, bias=True):
        super(Linear, self).__init__()
        self.input_features = input_features
        self.output_features = output_features

        # nn.Parameter is a special kind of Variable, that will get
        # automatically registered as Module's parameter once it's assigned
        # as an attribute. Parameters and buffers need to be registered, or
        # they won't appear in .parameters() (doesn't apply to buffers), and
        # won't be converted when e.g. .cuda() is called. You can use
        # .register_buffer() to register buffers.
        # nn.Parameters can never be volatile and, different than Variables,
        # they require gradients by default.
        # This matters: Parameters require gradients by default!
        self.weight = nn.Parameter(torch.Tensor(output_features, input_features))
        if bias:
            self.bias = nn.Parameter(torch.Tensor(output_features))
        else:
            # You should always register all possible parameters, but the
            # optional ones can be None if you want.
            self.register_parameter('bias', None)

        # Not a very smart way to initialize weights
        self.weight.data.uniform_(-0.1, 0.1)
        # Check self.bias rather than the boolean flag, which is never None
        if self.bias is not None:
            self.bias.data.uniform_(-0.1, 0.1)

    def forward(self, input):
        # See the autograd section for explanation of what happens here.
        return LinearFunction.apply(input, self.weight, self.bias)
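The comments in the module above say that parameters and buffers have to be registered to be visible to the module. A minimal sketch of what that means in practice follows; `Demo` and its attribute names are hypothetical and exist only for this illustration.

```python
import torch
import torch.nn as nn


class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning an nn.Parameter registers it automatically.
        self.registered = nn.Parameter(torch.zeros(3))
        # A plain tensor attribute is not registered as a parameter or buffer.
        self.plain = torch.zeros(3)
        # An optional parameter can be registered as None.
        self.register_parameter("bias", None)
        # Buffers are registered explicitly.
        self.register_buffer("running", torch.ones(3))


demo = Demo()
param_names = [name for name, _ in demo.named_parameters()]
buffer_names = [name for name, _ in demo.named_buffers()]
state_keys = list(demo.state_dict().keys())
```

Only `registered` shows up among the parameters, `running` among the buffers, and `plain` appears in neither list nor in the state dict, so it would not follow the module through calls such as `.cuda()`.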

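Making torch.onnx recognize a custom op

The custom op raises an error during ONNX export

When torch.onnx.export is called on a model that uses the custom op, in order to write the protobuf binary file, the export fails as soon as it reaches the custom op: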

...
%19 : Float(64, 64, 3, 3) = onnx::MaxPool[kernel_shape=[2, 2], pads=[0, 0, 0, 0], strides=[2, 2]](%18), scope: Net/Sequential[conv3]/MaxPool2d[2]
%20 : Float(64, 576) = onnx::Flatten[axis=1](%19), scope: Net
%input.5 : Float(64, 128) = ^LinearFunction()(%20, %dense.0.weight, %dense.0.bias), scope: Net/Sequential[dense]/Linear[0]
%22 : Float(64, 128) = onnx::Relu(%input.5), scope: Net/Sequential[dense]/ReLU[1]
%23 : Float(64, 10) = ^LinearFunction()(%22, %dense.2.weight, %dense.2.bias), scope: Net/Sequential[dense]/Linear[2]

The error reports an undefined operator, LinearFunction.
The fix is to find a way for torch.onnx to understand the custom op LinearFunction.

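The op is already standardized in ONNX

The operators that ONNX currently supports are listed in Operators.md in the onnx source repository on GitHub, which generally reflects the latest version.
Suppose the operator is standardized, that is, it appears in that list, and in your version of torch the operation is an ATen operator, that is, its definition can be found in torch/csrc/autograd/generated/VariableType.h.
In that case, add its symbolic function to torch/onnx/symbolic.py, following these instructions: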

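- Define the symbolic function in torch/onnx/symbolic.py. Make sure the function has the same name as the ATen operator declared in VariableType.h.
- The first parameter is always the ONNX graph. The parameter names must match those in VariableType.h, because dispatch is done with keyword arguments.
- The parameter order does not have to match VariableType.h exactly: the tensors (inputs) always come first, followed by the non-tensor arguments.
- In the symbolic function, if the operator is already standardized in ONNX, we only need to create a node that represents the ONNX operator in the graph.
- If an input argument is a tensor but ONNX expects a scalar, the value has to be converted. The helper _scalar converts a scalar tensor to a Python scalar, and _if_scalar_type_as converts a Python scalar to a PyTorch tensor.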

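The op is not standardized

If the op is not standardized, the torch.onnx module has no definition for it either, and it is a non-ATen operator. In that case the symbolic function has to be added to the corresponding PyTorch Function class. Follow these instructions: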

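- Create a symbolic function named symbolic in the corresponding Function class.
- The first parameter is always the exported ONNX graph.
- The parameter names, apart from the first, must match the names in forward exactly.
- The size of the output tuple must match the outputs of forward exactly.
- In the symbolic function, if the operator is already standardized in ONNX, we only need to create a node that represents the ONNX operator in the graph.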
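The rules above can be sketched with a Function whose symbolic takes the same parameters as forward and maps the op onto the standard ONNX Gemm node. `ScaledLinear` and its `alpha` parameter are hypothetical names for this illustration; the sketch assumes the bias is always provided and only exercises the op in eager mode, so the symbolic is defined but not invoked here.

```python
import torch
from torch.autograd import Function


class ScaledLinear(Function):
    """Toy op: y = alpha * input @ weight.T + bias, with alpha a Python float."""

    @staticmethod
    def symbolic(g, input, weight, bias, alpha):
        # Same parameters as forward (after g). ONNX Gemm computes
        # alpha * A @ B' + beta * C; transB=1 transposes the weight.
        # The _f / _i suffixes mark float / int attributes.
        return g.op("Gemm", input, weight, bias, alpha_f=alpha, transB_i=1)

    @staticmethod
    def forward(ctx, input, weight, bias, alpha):
        ctx.save_for_backward(input, weight)
        ctx.alpha = alpha
        return alpha * input.mm(weight.t()) + bias

    @staticmethod
    def backward(ctx, grad_output):
        input, weight = ctx.saved_tensors
        grad_input = ctx.alpha * grad_output.mm(weight)
        grad_weight = ctx.alpha * grad_output.t().mm(input)
        grad_bias = grad_output.sum(0)
        # One value per forward parameter; alpha is not a tensor, so None.
        return grad_input, grad_weight, grad_bias, None


torch.manual_seed(0)
x = torch.randn(2, 3, dtype=torch.double, requires_grad=True)
w = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
b = torch.randn(4, dtype=torch.double, requires_grad=True)
out = ScaledLinear.apply(x, w, b, 0.5)
expected = 0.5 * x.mm(w.t()) + b
```

Tensor arguments reach symbolic as graph values, while the non-tensor `alpha` arrives as a plain Python float and can be written into the node as an attribute, which is the mechanism the article relies on for beta and alpha.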

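Handling the custom LinearFunction op

As mentioned in the article "Pytorch1.1.0 入门自定义op（python）" (an introduction to custom ops in Python with PyTorch 1.1.0), the definition of LinearFunction includes a @staticmethod named symbolic(). This is called a symbolic function. Later experiments showed that it is the function that helps torch.onnx.export recognize the custom operation when converting to the ONNX format.

At first it was not clear how to use it. Looking through torch/onnx/symbolic.py, many of the functions there return the result of g.op(), and the first argument of g.op() is the operator name that ends up in the exported ONNX model, for example: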

def stack(g, tensor_list, dim):
    unsqueezed = [g.op("Unsqueeze", t, axes_i=[dim]) for t in _unpack_list(tensor_list)]
    return g.op("Concat", *unsqueezed, axis_i=dim)

def mm(g, self, other):
    # Create a dummy C tensor. Only needed for API purposes, the value is
    # since beta = 0
    ty = _try_get_scalar_type(self, other).lower()
    C = g.constant(0, [1], ty)
    return g.op("Gemm", self, other, C, beta_f=0.0, alpha_f=1.0)

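In the end, the corresponding ONNX nodes are named "Concat", "Gemm", and so on.

After exporting the standard torch.nn.Linear() to ONNX, the fully connected layer shows up as "Gemm".
Looking through the ops already defined in torch/onnx/symbolic.py, addmm returns just one Gemm node, so torch.nn.Linear() presumably goes through addmm.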

def addmm(g, self, mat1, mat2, beta, alpha):
    return g.op("Gemm", mat1, mat2, self, beta_f=_scalar(beta), alpha_f=_scalar(alpha))
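The argument order in the addmm symbolic follows from what the two operators compute: torch.addmm returns beta * self + alpha * (mat1 @ mat2), while ONNX Gemm computes alpha * A * B + beta * C, so self takes the place of C. A small numerical sketch of the addmm side, with arbitrary illustrative values for beta and alpha:

```python
import torch

torch.manual_seed(0)
bias = torch.randn(4, dtype=torch.double)
x = torch.randn(2, 3, dtype=torch.double)
w_t = torch.randn(3, 4, dtype=torch.double)  # already transposed weight
beta, alpha = 1.2, 1.3

# addmm(self, mat1, mat2) = beta * self + alpha * (mat1 @ mat2)
via_addmm = torch.addmm(bias, x, w_t, beta=beta, alpha=alpha)
by_hand = beta * bias + alpha * (x @ w_t)

# With the default beta = alpha = 1, addmm is exactly a linear layer.
default_addmm = torch.addmm(bias, x, w_t)
linear_out = torch.nn.functional.linear(x, w_t.t(), bias)
```

This is consistent with the observation that a fully connected layer exports as a single Gemm node with the bias as its third input.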

So the symbolic() function was modeled on addmm, with almost the same structure.

    def symbolic(g, self, mat1, mat2, beta, alpha):
        return g.op("nonentity", self, mat1, mat2, beta_f=beta, alpha_f=alpha)

PyTorch-to-ONNX export code

torch.onnx.export(model,
                  (data, indices, updates),
                  "vfe.onnx",
                  # export_params=True,
                  opset_version=13,
                  # do_constant_folding=True,
                  # keep_initializers_as_inputs=True,
                  input_names=["data", "indices", "updates"],
                  output_names=["output"],
                  operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)

The corresponding part of the exported ONNX graph

...
%19 : Float(64, 64, 3, 3) = onnx::MaxPool[kernel_shape=[2, 2], pads=[0, 0, 0, 0], strides=[2, 2]](%18), scope: Net_LinearFunction/Sequential[conv3]/MaxPool2d[2]
%20 : Float(64, 576) = onnx::Flatten[axis=1](%19), scope: Net_LinearFunction
%21 : Float(64, 128) = onnx::nonentity[alpha=1.3, beta=1.2](%20, %dense.0.weight, %dense.0.bias), scope: Net_LinearFunction/Sequential[dense]/Linear[0]
%22 : Float(64, 128) = onnx::Relu(%21), scope: Net_LinearFunction/Sequential[dense]/ReLU[1]
%23 : Float(64, 10) = onnx::nonentity[alpha=1.33, beta=1.22](%22, %dense.2.weight, %dense.2.bias), scope: Net_LinearFunction/Sequential[dense]/Linear[2]
return (%23)

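%21 and %23 are both computed by the custom op "nonentity". The values in "[]" are the network parameters passed as node attributes, and the values in "()" are the node inputs, including the weights.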
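To make the distinction between "[]" and "()" concrete, here is a small sketch that splits one line of the printed graph into op name, attributes and inputs. `parse_node` and `NODE_RE` are hypothetical helpers written for this illustration, not part of torch or onnx, and the pattern only handles scalar attributes such as alpha and beta (not list-valued ones like kernel_shape).

```python
import re

# One printed graph node: "%out : Type(dims) = domain::Op[attrs](inputs), scope: ..."
NODE_RE = re.compile(
    r"(?P<out>%[\w.]+) : (?P<type>\w+\([^)]*\)) = "
    r"(?P<op>[\w:^]+)(?:\[(?P<attrs>[^\]]*)\])?\((?P<inputs>[^)]*)\)"
)


def parse_node(line):
    """Split a printed graph line into output, op name, attributes and inputs."""
    m = NODE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized node line: " + line)
    attrs = {}
    if m.group("attrs"):
        for item in m.group("attrs").split(","):
            key, value = item.strip().split("=")
            attrs[key] = float(value)  # scalar attributes only
    inputs = [s.strip() for s in m.group("inputs").split(",") if s.strip()]
    return {"out": m.group("out"), "op": m.group("op"),
            "attrs": attrs, "inputs": inputs}


line = ("%21 : Float(64, 128) = onnx::nonentity[alpha=1.3, beta=1.2]"
        "(%20, %dense.0.weight, %dense.0.bias), "
        "scope: Net_LinearFunction/Sequential[dense]/Linear[0]")
node = parse_node(line)
```

For the %21 line, the attributes are alpha and beta, and the inputs are the previous activation %20 followed by the layer's weight and bias.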

Original article: https://blog.csdn.net/u012436149/article/details/78829329

Original article: https://blog.csdn.net/qq_33120609/article/details/99429967