My base model is the one from my earlier post, "blazeface学习笔记" (https://blog.csdn.net/zhqh100/article/details/123688945). The complete project is meant to be a face recognition system. Face recognition, if I may say so immodestly, is something I have done before: I used dlib with the usual OpenCV pipeline, and frankly it was very slow, only two or three frames per second even on an Intel CPU. I have tried other options too. A few years back I downloaded ArcSoft's free SDK to evaluate, and the results were genuinely impressive; what struck me most was not the recognition accuracy but how blazingly fast it ran. I also tried MTCNN, which is what basically every search for "face detection" turns up. I do not know why others praise it so highly; my own experience was that it misdetects constantly.

For quantization, my starting point was the official recipe:

Quantization Recipe — PyTorch Tutorials 1.11.0+cu102 documentation: https://pytorch.org/tutorials/recipes/quantization.html

If your model were as simple as the demo in that recipe, quantization would be easy: once the model is trained, you would only need to run a few lines like these at inference time:

backend = "qnnpack"
model.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend
model_static_quantized = torch.quantization.prepare(model, inplace=False)
model_static_quantized = torch.quantization.convert(model_static_quantized, inplace=False)
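One catch the recipe does mention but the snippet above glosses over: post-training static quantization needs a calibration pass between prepare and convert, so the inserted observers can record activation ranges. A minimal sketch, assuming a representative calibration_loader (a hypothetical name, not from my script):

import torch

model.eval()
prepared = torch.quantization.prepare(model, inplace=False)
with torch.no_grad():
    for images, _ in calibration_loader:  # hypothetical DataLoader of representative inputs
        prepared(images)                  # observers record activation min/max here
model_static_quantized = torch.quantization.convert(prepared, inplace=False)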

But if you do only that, the first thing you hit is this error:

NotImplementedError: Could not run 'quantized::conv2d.new' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'quantized::conv2d.new' is only available for these backends: [QuantizedCPU, BackendSelect, Python, Named, Conjugate, Negative, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradLazy, AutogradXPU, AutogradMLC, Tracer, UNKNOWN_TENSOR_TYPE_ID, Autocast, Batched, VmapMode].

Let's first look at the model before and after quantization. Before quantization it looks like this:

Blaze(
  (conv1): Sequential((0): Conv2d(3, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True))
  (conv2): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=24, bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv3): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=24, bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv4): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=24, bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)))(shortcut): Sequential((0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)(1): Conv2d(24, 48, kernel_size=(1, 1), stride=(1, 1))(2): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(3): ReLU(inplace=True)))
  (conv5): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(48, 48, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=48, bias=False)(1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv6): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(48, 48, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=48, bias=False)(1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv7): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(48, 48, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=48, bias=False)(1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(48, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True))(shortcut): Sequential((0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)(1): Conv2d(48, 96, kernel_size=(1, 1), stride=(1, 1))(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(3): ReLU(inplace=True)))
  (conv8): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)(1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (conv9): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)(1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (conv10): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=96, bias=False)(1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True))(shortcut): Sequential((0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)(1): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1))(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(3): ReLU(inplace=True)))
  (conv11): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)(1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (conv12): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=96, bias=False)(1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)(4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (loc): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): Conv2d(96, 8, kernel_size=(1, 1), stride=(1, 1)))(1): Sequential((0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1))))
  (conf): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): Conv2d(96, 4, kernel_size=(1, 1), stride=(1, 1)))(1): Sequential((0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): Conv2d(96, 12, kernel_size=(1, 1), stride=(1, 1))))
  (landm): Sequential((0): Sequential((0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): Conv2d(96, 20, kernel_size=(1, 1), stride=(1, 1)))(1): Sequential((0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): Conv2d(96, 60, kernel_size=(1, 1), stride=(1, 1))))
)

After quantization it looks like this:

Blaze(
  (conv1): Sequential((0): QuantizedConv2d(3, 24, kernel_size=(3, 3), stride=(2, 2), scale=1.0, zero_point=0, padding=(1, 1), bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True))
  (conv2): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=24, bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv3): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=24, bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv4): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(2, 2), scale=1.0, zero_point=0, padding=(2, 2), groups=24, bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 48, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)))(shortcut): Sequential((0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)(1): QuantizedConv2d(24, 48, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0)(2): QuantizedBatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(3): ReLU(inplace=True)))
  (conv5): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(48, 48, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=48, bias=False)(1): QuantizedBatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv6): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(48, 48, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=48, bias=False)(1): QuantizedBatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))))
  (conv7): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(48, 48, kernel_size=(5, 5), stride=(2, 2), scale=1.0, zero_point=0, padding=(2, 2), groups=48, bias=False)(1): QuantizedBatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(48, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True))(shortcut): Sequential((0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)(1): QuantizedConv2d(48, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0)(2): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(3): ReLU(inplace=True)))
  (conv8): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=96, bias=False)(1): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (conv9): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=96, bias=False)(1): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (conv10): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(5, 5), stride=(2, 2), scale=1.0, zero_point=0, padding=(2, 2), groups=96, bias=False)(1): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True))(shortcut): Sequential((0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)(1): QuantizedConv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0)(2): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(3): ReLU(inplace=True)))
  (conv11): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=96, bias=False)(1): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (conv12): BlazeBlock((actvation): ReLU(inplace=True)(conv): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), groups=96, bias=False)(1): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(1): ReLU(inplace=True)(2): Sequential((0): QuantizedConv2d(24, 24, kernel_size=(5, 5), stride=(1, 1), scale=1.0, zero_point=0, padding=(2, 2), bias=False)(1): QuantizedBatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)(2): ReLU(inplace=True)(3): QuantizedConv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0, bias=False)(4): QuantizedBatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True))(3): ReLU(inplace=True)))
  (loc): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), scale=1.0, zero_point=0, padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): QuantizedConv2d(96, 8, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0))(1): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), scale=1.0, zero_point=0, padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): QuantizedConv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0)))
  (conf): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), scale=1.0, zero_point=0, padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): QuantizedConv2d(96, 4, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0))(1): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), scale=1.0, zero_point=0, padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): QuantizedConv2d(96, 12, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0)))
  (landm): Sequential((0): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), scale=1.0, zero_point=0, padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): QuantizedConv2d(96, 20, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0))(1): Sequential((0): QuantizedConv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), scale=1.0, zero_point=0, padding=(1, 1), groups=96)(1): ReLU(inplace=True)(2): QuantizedConv2d(96, 60, kernel_size=(1, 1), stride=(1, 1), scale=1.0, zero_point=0)))
)

If you look closely, the ReLU modules appear not to have changed at all. I was not sure whether that is right; as far as I can tell, ReLU can operate directly on quantized tensors, so it keeps its float module unless it gets fused into the preceding conv.

Back to that first error: it happens because the input fed to the network at inference time was never quantized, and quantized modules only accept quantized tensors; the two go hand in hand. The recipe linked above actually covers this: you need to add these two members in the model's __init__:

self.quant = torch.quantization.QuantStub()
self.dequant = torch.quantization.DeQuantStub()
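
For reference, the pattern from the official docs looks roughly like this (a toy module, not the Blaze model itself):

import torch

class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # converts the float input to quantized
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.dequant = torch.quantization.DeQuantStub()  # converts the quantized output back to float

    def forward(self, x):
        x = self.quant(x)
        x = self.conv(x)
        return self.dequant(x)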

And, naturally, forward needs a few corresponding lines as well:

inputs = self.quant(inputs)  # quantize the float input before it reaches the first quantized layer
...
bbox_regressions = torch.cat([o.view(o.size(0), -1, 4) for o in loc], 1)
classifications = torch.cat([o.view(o.size(0), -1, 2) for o in conf], 1)
ldm_regressions = torch.cat([o.view(o.size(0), -1, 10) for o in landm], 1)
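
I am omitting the middle of forward here; note that, per the recipe, the outputs should also pass through the DeQuantStub before they reach the float post-processing. In this model that would presumably be something like:

# sketch (my assumption, not shown in the snippet above): hand float tensors
# back to the non-quantized decode/NMS stage
bbox_regressions = self.dequant(bbox_regressions)
classifications = self.dequant(classifications)
ldm_regressions = self.dequant(ldm_regressions)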

Run it again, and the next error shows up:

NotImplementedError: Could not run 'quantized::conv2d.new' with arguments from the 'QuantizedCUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'quantized::conv2d.new' is only available for these backends: [QuantizedCPU, BackendSelect, Python, Named, Conjugate, Negative, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradLazy, AutogradXPU, AutogradMLC, Tracer, UNKNOWN_TENSOR_TYPE_ID, Autocast, Batched, VmapMode].

This looks a lot like the previous error, but the problem is not quite the same: the dispatch key is now QuantizedCUDA instead of CUDA. The input is quantized now, but quantized operators are simply not implemented for CUDA, so the only option left is to run on the CPU.
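
Switching devices is a one-line change; a sketch (the variable names net and img are illustrative, not from the snippet above):

device = torch.device("cpu")         # quantized kernels only exist for the CPU here
net = model_static_quantized.to(device)
img = img.to(device)                 # the input tensor has to move too

With the device switched to CPU, the run now fails with: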

NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'QuantizedCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty.memory_format' is only available for these backends: [CPU, CUDA, Meta, MkldnnCPU, SparseCPU, SparseCUDA, BackendSelect, Python, Named, Conjugate, Negative, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradLazy, AutogradXPU, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, UNKNOWN_TENSOR_TYPE_ID, Autocast, Batched, VmapMode].

The format of this error is again similar, yet the cause is different once more. The model has a shortcut, a residual connection like in ResNet, so there is an elementwise addition, and it is this add on quantized tensors that raises the error. Following some guidance from the internet:

NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'QuantizedCPU' backend - #2 by dalseeroh - quantization - PyTorch Forums: https://discuss.pytorch.org/t/notimplementederror-could-not-run-aten-empty-memory-format-with-arguments-from-the-quantizedcpu-backend/138618/2

Add, subtract, multiply and divide are not supported on quantized tensors; to keep such an operation inside a quantized model you have to dequant first, do the arithmetic, then quant again:

h = self.dequant(h)
x = self.dequant(x)
z = h + x
z = self.actvation(z)
return self.quant(z)
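
An alternative I did not pursue: eager mode also provides torch.nn.quantized.FloatFunctional, which keeps the skip-connection add in the quantized domain instead of bouncing through float. A sketch of how the block could use it (a hypothetical stand-in, not my actual BlazeBlock):

import torch

class AddBlockSketch(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.actvation = torch.nn.ReLU(inplace=True)          # same attribute name as in BlazeBlock
        # FloatFunctional picks up an observer during prepare() and
        # runs as a quantized add after convert()
        self.skip_add = torch.nn.quantized.FloatFunctional()

    def forward(self, h, x):
        z = self.skip_add.add(h, x)  # instead of z = h + x
        return self.actvation(z)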

With that workaround in place, it finally runs.

Surprise

Then came the inference numbers: the quantized model takes 23 ms per run, whereas before quantization I remember it being around 4 ms (on the GPU, admittedly), so I am not sure what I gained. And the accuracy is:

==================== Results ====================
Easy   Val AP: 9.360055428376237e-08
Medium Val AP: 8.086679597040724e-07
Hard   Val AP: 3.3702511281364735e-07
=================================================

Rounded off, that is basically zero. (In hindsight, every quantized layer in the dump above reports scale=1.0, zero_point=0, which suggests the observers never saw any calibration data; that alone could explain the wrecked accuracy.)

The one real upside is that the saved model did get smaller: 785 KB before quantization, 384 KB after. At that point I gave up.
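
(For the record, comparing sizes is just a matter of checking the saved state_dict on disk; a sketch, with an arbitrary file name:)

import os
import torch

torch.save(model_static_quantized.state_dict(), "blaze_int8.pth")  # hypothetical file name
print(os.path.getsize("blaze_int8.pth") / 1024, "KB")  # the int8 weights are what shrink the file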

As for coverage: dynamic quantization supports only a handful of layer types, and static quantization supports only certain ordered patterns for fusion. Convolution followed by Relu is supported; so can you flip it, say Relu then Convolution? Answer: no, it is just that picky. Only the following ops, in exactly these orders, can be fused (a fusion sketch follows the list):

Convolution, Batch normalization
Convolution, Batch normalization, Relu
Convolution, Relu
Linear, Relu
Batch normalization, Relu
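
Fusion is done before prepare() with torch.quantization.fuse_modules. A sketch against this model, where conv1 is Sequential(Conv2d, BatchNorm2d, ReLU), so the child paths would be "conv1.0", "conv1.1", "conv1.2" (adjust to your own module names):

import torch

model.eval()  # fuse in eval mode for post-training quantization
model = torch.quantization.fuse_modules(
    model,
    [["conv1.0", "conv1.1", "conv1.2"]],  # Conv2d + BatchNorm2d + ReLU, one of the patterns above
)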

I also gave QAT a quick try; it seems QAT does not work on CUDA either, which I cannot understand. The standard flow, for reference:
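
For completeness, the eager-mode QAT flow from the docs looks roughly like this (the fine-tuning loop is elided; this is the part I could not get running on the GPU):

import torch

model.train()
model.qconfig = torch.quantization.get_default_qat_qconfig("qnnpack")
model_qat = torch.quantization.prepare_qat(model, inplace=False)
# ... fine-tune model_qat for a few epochs with fake-quant inserted ...
model_qat.eval()
model_int8 = torch.quantization.convert(model_qat, inplace=False)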

That is all.
