I. References

EfficientDet tensorrt
Accelerating EfficientDet inference on Jetson TX2 (Part 2)

II. Environment

System environment

Environment
Operating System + Version: Ubuntu + 18.04
TensorRT Version: 8.0.1.6
GPU Type: Jetson TX2
CUDA Version: 10.2.300
Python Version (if applicable): 3.6.9
gcc: 7.5.0
g++: 7.5.0

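III. Notes

Install Python packages only as they are needed, following the error messages. Installing everything at once with `pip install -r requirements.txt` is not recommended.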

IV. Key steps

1. Download the official EfficientDet code and install its dependencies

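git clone https://github.com/google/automl.git
cd /media/mydisk/MyDocuments/PyProjects/automl/efficientdet
# Run the command from step 3 and install whichever packages it reports as missing. On the Jetson TX2, installing everything with `pip install -r requirements.txt` is not recommended. For example:
pip install tensorflow-model-optimization
pip install dm-tree
Note: if you need dm-tree, build it from source; see the instructions further below.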
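Because the advice here is to install only what is actually missing, it can help to probe for each module before running the conversion scripts. The following is a minimal sketch, not part of automl or TensorRT: `missing_packages` is a hypothetical helper, and the mapping from pip names to import names is an assumption (for example, dm-tree is imported as `tree`).

```python
import importlib.util

# Assumed mapping from pip package name to top-level import name.
PACKAGES = {
    "tensorflow-model-optimization": "tensorflow_model_optimization",
    "dm-tree": "tree",
    "onnx": "onnx",
    "onnx-graphsurgeon": "onnx_graphsurgeon",
    "tf2onnx": "tf2onnx",
    "onnxruntime": "onnxruntime",
}


def missing_packages(packages):
    """Return the pip names whose top-level module cannot be found."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    for name in missing_packages(PACKAGES):
        print(f"missing: {name}")
```

Only the names it prints would then need a `pip install`, which avoids pulling in the whole requirements file on the TX2.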

2. Download the pretrained model and extract it

efficientdet-d0

# /media/mydisk/YOYOFile/efficientdet-d0
├── efficientdet-d0
│   ├── checkpoint
│   ├── d0_coco_test-dev2017.txt
│   ├── d0_coco_val.txt
│   ├── model.data-00000-of-00001
│   ├── model.index
│   └── model.meta
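
Before starting the slow conversion step, it may be worth confirming that the archive extracted completely. The sketch below checks a directory for the four checkpoint files named in the listing above (the two `*.txt` result files are left out on the assumption that conversion does not need them). `missing_checkpoint_files` is a hypothetical helper, not part of the EfficientDet code.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = (
    "checkpoint",
    "model.data-00000-of-00001",
    "model.index",
    "model.meta",
)


def missing_checkpoint_files(ckpt_dir):
    """Return the expected checkpoint files that are absent from ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means all four files are present in the directory that was checked.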

3. Convert the model to a .pb file

cd /media/mydisk/MyDocuments/PyProjects/automl/efficientdet
python model_inspect.py \
    --runmode saved_model \
    --model_name efficientdet-d0 \
    --ckpt_path /media/mydisk/YOYOFile/efficientdet-d0 \
    --saved_model_dir /media/mydisk/YOYOFile/saved_model

# /media/mydisk/YOYOFile/saved_model
├── saved_model
│   ├── efficientdet-d0_frozen.pb
│   ├── saved_model.pb
│   └── variables
# saved_model.pb, 7.3MB
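
If the export has to be repeated for other checkpoints, the command can be assembled in Python rather than retyped. This is a sketch under the assumption that only the four flags used in this step are needed; `build_export_command` is a hypothetical helper, and the paths are the ones used in this walkthrough.

```python
import shlex


def build_export_command(model_name, ckpt_path, saved_model_dir):
    """Assemble the model_inspect.py call from this step as an argument list."""
    return [
        "python", "model_inspect.py",
        "--runmode", "saved_model",
        "--model_name", model_name,
        "--ckpt_path", ckpt_path,
        "--saved_model_dir", saved_model_dir,
    ]


cmd = build_export_command(
    "efficientdet-d0",
    "/media/mydisk/YOYOFile/efficientdet-d0",
    "/media/mydisk/YOYOFile/saved_model",
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

The list can be passed to `subprocess.run(cmd, check=True, cwd=...)` with `cwd` set to the efficientdet directory.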

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/automl/efficientdet$ time python model_inspect.py     --runmode saved_model     --model_name efficientdet-d0     --ckpt_path /media/mydisk/YOYOFile/efficientdet-d0     --saved_model_dir /media/mydisk/YOYOFile/saved_model
2021-10-21 17:37:26.382558: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 17:37:39.119335: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-10-21 17:37:39.136342: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:39.136588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 17:37:39.136711: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 17:37:39.153938: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-21 17:37:39.154242: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.10
2021-10-21 17:37:39.169750: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-10-21 17:37:39.183745: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-10-21 17:37:39.196944: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-10-21 17:37:39.206568: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.10
2021-10-21 17:37:39.207341: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-21 17:37:39.207750: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:39.208261: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:39.208475: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 17:37:39.208631: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 17:37:42.797530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 17:37:42.797636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 17:37:42.797688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 17:37:42.798146: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:42.798527: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:42.798851: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:42.799117: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 576 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 17:37:43.052975: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:43.053240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 17:37:43.053618: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:43.054036: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:43.054195: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/utils.py:602: The name tf.keras.layers.enable_v2_dtype_behavior is deprecated. Please use tf.compat.v1.keras.layers.enable_v2_dtype_behavior instead.
W1021 17:37:45.469214 547581214736 module_wrapper.py:155] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/utils.py:602: The name tf.keras.layers.enable_v2_dtype_behavior is deprecated. Please use tf.compat.v1.keras.layers.enable_v2_dtype_behavior instead.
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:5049: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
W1021 17:38:48.279611 547581214736 deprecation.py:534] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:5049: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
2021-10-21 17:38:49.470314: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 31250000 Hz
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/training/moving_averages.py:457: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W1021 17:38:55.314421 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/training/moving_averages.py:457: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
W1021 17:40:41.735967 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/inference.py:582: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
W1021 17:43:51.238454 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/inference.py:582: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py:857: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
W1021 17:43:51.271648 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py:857: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`

real   7m4.673s
user    6m41.776s
sys 0m7.824s
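
Most of the output above is INFO-level noise from TensorFlow's C++ runtime, which makes the deprecation warnings and the timing hard to spot. The sketch below filters a captured log, assuming each C++ log line starts with a timestamp followed by a one-letter severity (`I`, `W`, `E` or `F`), as the lines in this log do. `drop_info_lines` is a hypothetical helper. Continuation lines without a timestamp, such as the `pciBusID:` lines, are not matched and are kept.

```python
import re

# Matches TensorFlow C++ log lines such as
# "2021-10-21 17:37:26.382558: I tensorflow/..."; the letter is the severity.
_TF_CPP_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) "
)


def drop_info_lines(lines):
    """Keep every line except TensorFlow C++ lines logged at INFO severity."""
    kept = []
    for line in lines:
        match = _TF_CPP_LINE.match(line)
        if match and match.group(1) == "I":
            continue
        kept.append(line)
    return kept
```

Setting the environment variable `TF_CPP_MIN_LOG_LEVEL=1` before launching the script is the usual way to suppress these messages at the source.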

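4. Download NVIDIA's official TensorRT EfficientDet sample and generate the ONNX model

git clone https://github.com/NVIDIA/TensorRT.git
cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
# Run the create_onnx.py command below and install whichever packages it reports as missing. On the Jetson TX2, installing everything with `pip install -r requirements.txt` is not recommended. For example: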
pip install onnx
pip install onnx-graphsurgeon --index-url https://pypi.ngc.nvidia.com
pip install tf2onnx
pip install onnxruntime

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python create_onnx.py \
    --input_shape '1,512,512,3' \
    --saved_model /media/mydisk/YOYOFile/saved_model \
    --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx

# /media/mydisk/YOYOFile/saved_model_onnx
├── saved_model_onnx
│   └── model.onnx
# model.onnx, 16.5MB
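
The `--input_shape` value is a plain comma-separated string, so a typo is easy to miss until the conversion fails several minutes in. The sketch below validates such a string up front, assuming the order is batch, height, width, channels (the trailing 3 suggests NHWC). `parse_input_shape` is a hypothetical helper, not a function from the TensorRT sample.

```python
def parse_input_shape(text):
    """Parse a string like '1,512,512,3' into a tuple of four positive ints."""
    dims = tuple(int(part) for part in text.split(","))
    if len(dims) != 4 or any(d <= 0 for d in dims):
        raise ValueError(f"expected four positive dimensions, got {text!r}")
    return dims


batch, height, width, channels = parse_input_shape("1,512,512,3")
print(batch, height, width, channels)
```

A malformed value such as `'1,512,512'` raises `ValueError` immediately instead of surfacing later in the conversion.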

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python create_onnx.py     --input_shape '1,512,512,3'     --saved_model /media/mydisk/YOYOFile/saved_model     --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx
2021-10-21 18:27:24.750524: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 18:27:35.154053: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-10-21 18:27:35.163132: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.163380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:27:35.163498: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 18:27:35.168944: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-21 18:27:35.169239: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.10
2021-10-21 18:27:35.172684: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-10-21 18:27:35.175041: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-10-21 18:27:35.183730: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-10-21 18:27:35.192669: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.10
2021-10-21 18:27:35.193586: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-21 18:27:35.193970: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.194398: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.194583: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:27:35.197697: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.197924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:27:35.198312: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.198736: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.198869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:27:35.199116: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 18:27:38.117646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:27:38.117758: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:27:38.117813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:27:38.118213: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:38.118654: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:38.119182: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:38.119460: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:31:59.010282: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-21 18:31:59.210905: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 31250000 Hz
INFO:tf2onnx.tf_loader:Signatures found in model: [serving_default].
INFO:tf2onnx.tf_loader:Output names: ['detections:0']
WARNING:tensorflow:Issue encountered when serializing global_step.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing global_step.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing moving_average_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing moving_average_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing trainable_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing trainable_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
2021-10-21 18:33:06.415085: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.415304: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-10-21 18:33:06.415593: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-10-21 18:33:06.417680: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.417887: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:33:06.418187: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.418512: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.418650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:33:06.418763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:33:06.418825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:33:06.418871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:33:06.419332: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.419687: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.419853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:33:17.734622: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1171] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 3761 nodes (0), 3553 edges (0), time = 169.426ms.
  function_optimizer: Graph size after: 3761 nodes (0), 3553 edges (0), time = 175.429ms.
Optimization results for grappler item: __inference_while_cond_22_6432
  function_optimizer: function_optimizer did nothing. time = 0.027ms.
  function_optimizer: function_optimizer did nothing. time = 0.007ms.
Optimization results for grappler item: __inference_TensorArrayV2Write_cond_true_49_3796
  function_optimizer: function_optimizer did nothing. time = 0.025ms.
  function_optimizer: function_optimizer did nothing. time = 0.007ms.
Optimization results for grappler item: __inference_while_body_23_3820
  function_optimizer: Graph size after: 29 nodes (0), 30 edges (0), time = 1.836ms.
  function_optimizer: Graph size after: 29 nodes (0), 30 edges (0), time = 1.947ms.
Optimization results for grappler item: __inference_TensorArrayV2Write_cond_false_50_233
  function_optimizer: function_optimizer did nothing. time = 0.024ms.
  function_optimizer: function_optimizer did nothing. time = 0.007ms.

2021-10-21 18:34:09.261406: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.261647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:34:09.261933: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.262202: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.262317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:34:09.262418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:34:09.262469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:34:09.262515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:34:09.262778: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.263169: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.263336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tf2onnx/tf_loader.py:703: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tf2onnx/tf_loader.py:703: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2021-10-21 18:34:29.838950: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.839220: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-10-21 18:34:29.839566: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-10-21 18:34:29.840683: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.840922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:34:29.841253: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.841564: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.841690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:34:29.841812: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:34:29.841873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:34:29.841919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:34:29.842248: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.842606: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.842794: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:34:39.356289: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1171] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 2010 nodes (-1023), 2320 edges (-1216), time = 1219.22ms.
  function_optimizer: function_optimizer did nothing. time = 16.66ms.
  constant_folding: Graph size after: 2010 nodes (0), 2320 edges (0), time = 276.195ms.
  function_optimizer: function_optimizer did nothing. time = 15.49ms.

INFO:EfficientDetGraphSurgeon:Loaded saved model from /media/mydisk/YOYOFile/saved_model
2021-10-21 18:36:40.496763: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.497016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:36:40.497301: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.497576: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.497690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:36:40.497828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:36:40.497884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:36:40.497925: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:36:40.498179: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.498467: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.498623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
INFO:tf2onnx.tfonnx:Using tensorflow=2.5.0, onnx=1.10.1, tf2onnx=1.9.2/0f28b7
INFO:tf2onnx.tfonnx:Using opset <onnx, 11>
2021-10-21 18:45:21.030096: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.030768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:21.031559: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.032771: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.033244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:21.033721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:21.033990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:21.034280: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:21.035300: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.036304: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.036804: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:45:23.648050: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.648408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:23.648729: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.649003: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.649126: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:23.649279: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:23.649337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:23.649379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:23.649644: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.649943: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.650121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:45:23.685852: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.686416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:23.686875: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.687378: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.687534: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:23.687642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:23.687700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:23.687743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:23.688074: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.688453: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.688630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:45:23.831418: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.831653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:23.831981: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.832326: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.832469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:23.832570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:23.832625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:23.832667: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:23.832958: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.833287: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.833490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
INFO:tf2onnx.tf_utils:Computed 1 values for constant folding
INFO:tf2onnx.tfonnx:folding node using tf type=ExpandDims, name=ExpandDims
INFO:tf2onnx.tfonnx:folding node type=Range, name=range_1
INFO:tf2onnx.optimizer:Optimizing ONNX model
INFO:tf2onnx.optimizer:After optimization: BatchNormalization -45 (108->63), Cast -27 (41->14), Concat -1 (21->20), Const -530 (1139->609), GlobalAveragePool +16 (0->16), GlobalMaxPool +1 (0->1), Identity -22 (22->0), Mul -2 (187->185), ReduceMax -1 (1->0), ReduceMean -16 (16->0), ReduceSum -1 (1->0), Reshape -80 (92->12), Shape -1 (17->16), Slice -4 (33->29), Squeeze -10 (29->19), Transpose -761 (777->16), Unsqueeze -12 (32->20)
INFO:EfficientDetGraphSurgeon:TF2ONNX graph created successfully
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] 'Shape tensor cast elision' routine failed with: None
INFO:EfficientDetGraphSurgeon:Graph was detected as AutoML
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
INFO:EfficientDetGraphSurgeon:ONNX graph input shape: [1, 512, 512, 3] [NHWC format detected]
INFO:EfficientDetGraphSurgeon:Found Conv node 'efficientnet-b0/stem/conv2d/Conv2D' as stem entry
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
INFO:EfficientDetGraphSurgeon:Found Concat node 'concat' as the tip of class_net/
INFO:EfficientDetGraphSurgeon:Found Concat node 'concat_1' as the tip of box_net/
INFO:EfficientDetGraphSurgeon:Created NMS plugin 'EfficientNMS_TRT' with attributes: {'plugin_version': '1', 'background_class': -1, 'max_output_boxes': 100, 'score_threshold': 0.4000000059604645, 'iou_threshold': 0.5, 'score_activation': True, 'box_coding': 1}
Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
INFO:EfficientDetGraphSurgeon:Saved ONNX model to /media/mydisk/YOYOFile/saved_model_onnx/model.onnx

real    28m11.266s
user    27m34.024s
sys     0m10.396s
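
The `After optimization` line in the tf2onnx log above reports, for each operator type, the signed change and the node count before and after optimization. As a rough aid for reading that line, the sketch below parses a few entries copied from it and prints them sorted by how many nodes were removed. The names `SUMMARY`, `ENTRY`, and `parse_summary` are made up for this illustration and are not part of tf2onnx.

```python
import re

# Excerpt copied from the tf2onnx "After optimization" log line above.
SUMMARY = (
    "BatchNormalization -45 (108->63), Cast -27 (41->14), "
    "Const -530 (1139->609), Identity -22 (22->0), "
    "Reshape -80 (92->12), Transpose -761 (777->16)"
)

# One entry looks like: "<OpName> <signed delta> (<before>-><after>)"
ENTRY = re.compile(r"(\w+) ([+-]\d+) \((\d+)->(\d+)\)")


def parse_summary(text):
    """Return {op_name: (before, after)} for each entry in the summary."""
    result = {}
    for name, delta, before, after in ENTRY.findall(text):
        before, after = int(before), int(after)
        # Sanity check: the signed delta must equal after - before.
        assert int(delta) == after - before, name
        result[name] = (before, after)
    return result


counts = parse_summary(SUMMARY)
# Largest reductions first (most negative after - before).
for op, (before, after) in sorted(counts.items(), key=lambda kv: kv[1][1] - kv[1][0]):
    print(f"{op:20s} {before:5d} -> {after:5d}")
```

In this log the largest reduction is in `Transpose` nodes (777 down to 16).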

5. Build the TensorRT engine

TensorRT FP32

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python build_engine.py \
    --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
    --precision fp32
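
If the build is scripted rather than typed by hand, the same invocation can be assembled as an argument list. This is only a sketch: the helper name `build_engine_command` is hypothetical, and it uses only the three flags that appear in the command above (`--onnx`, `--engine`, `--precision`). It builds and prints the command without running it, and it avoids `shlex.join` so that it should also work on the Python 3.6.9 listed in the environment.

```python
import shlex


def build_engine_command(onnx_path, engine_path, precision="fp32"):
    """Return the build_engine.py invocation as an argument list.

    Hypothetical helper for illustration; it only mirrors the flags
    shown in the command above and does not execute anything.
    """
    return [
        "python", "build_engine.py",
        "--onnx", onnx_path,
        "--engine", engine_path,
        "--precision", precision,
    ]


cmd = build_engine_command(
    "/media/mydisk/YOYOFile/saved_model_onnx/model.onnx",
    "/media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt",
)

# Print a copy-pasteable shell line; shlex.quote guards paths with spaces.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` from the `samples/python/efficientdet` directory. Whether the script creates the output directory itself is not confirmed here, so creating it beforehand is a safe precaution.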

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python build_engine.py \
>     --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
>     --precision fp32
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +234, GPU +0, now: CPU 264, GPU 3727 (MiB)
[TensorRT] WARNING: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] WARNING: builtin_op_importers.cpp:4552: Attribute scoreBits not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[TensorRT] INFO: Successfully created plugin: BatchedNMS_TRT
INFO:EngineBuilder:Network Description
INFO:EngineBuilder:Input 'image_arrays:0' with shape (1, 512, 512, 3) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'num_detections' with shape (1,) and dtype DataType.INT32
INFO:EngineBuilder:Output 'detection_boxes' with shape (1, 100, 4) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_scores' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_classes' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Building fp32 Engine in /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt
[TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 284 MiB, GPU 3765 MiB
[TensorRT] INFO: ---------- Layers Running on DLA ----------
[TensorRT] INFO: ---------- Layers Running on GPU ----------
[TensorRT] INFO: [GpuLayer] preprocessor/transpose
[TensorRT] INFO: [GpuLayer] preprocessor/scale_value:0 + preprocessor/scale + preprocessor/mean_value:0 + preprocessor/mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/stem/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/stem/Sigmoid), efficientnet-b0/stem/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/Sigmoid), efficientnet-b0/blocks_0/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid), efficientnet-b0/blocks_0/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid_1), efficientnet-b0/blocks_0/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid), efficientnet-b0/blocks_1/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid_1), efficientnet-b0/blocks_1/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid), efficientnet-b0/blocks_1/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid_1), efficientnet-b0/blocks_1/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid), efficientnet-b0/blocks_2/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid_1), efficientnet-b0/blocks_2/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid), efficientnet-b0/blocks_2/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid_1), efficientnet-b0/blocks_2/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d_1/Conv2D + efficientnet-b0/blocks_2/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid), efficientnet-b0/blocks_3/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid_1), efficientnet-b0/blocks_3/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid), efficientnet-b0/blocks_3/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid_1), efficientnet-b0/blocks_3/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid), efficientnet-b0/blocks_4/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid_1), efficientnet-b0/blocks_4/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid), efficientnet-b0/blocks_4/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid_1), efficientnet-b0/blocks_4/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d_1/Conv2D + efficientnet-b0/blocks_4/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid), efficientnet-b0/blocks_5/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid_1), efficientnet-b0/blocks_5/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid), efficientnet-b0/blocks_5/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid_1), efficientnet-b0/blocks_5/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid), efficientnet-b0/blocks_6/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid_1), efficientnet-b0/blocks_6/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid), efficientnet-b0/blocks_6/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid_1), efficientnet-b0/blocks_6/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d_1/Conv2D + efficientnet-b0/blocks_6/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid), efficientnet-b0/blocks_7/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid_1), efficientnet-b0/blocks_7/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid), efficientnet-b0/blocks_7/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid_1), efficientnet-b0/blocks_7/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d_1/Conv2D + efficientnet-b0/blocks_7/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid), efficientnet-b0/blocks_8/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid_1), efficientnet-b0/blocks_8/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid), efficientnet-b0/blocks_8/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid_1), efficientnet-b0/blocks_8/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid), efficientnet-b0/blocks_9/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid_1), efficientnet-b0/blocks_9/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid), efficientnet-b0/blocks_9/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid_1), efficientnet-b0/blocks_9/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d_1/Conv2D + efficientnet-b0/blocks_9/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid), efficientnet-b0/blocks_10/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid_1), efficientnet-b0/blocks_10/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid), efficientnet-b0/blocks_10/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid_1), efficientnet-b0/blocks_10/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d_1/Conv2D + efficientnet-b0/blocks_10/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid), efficientnet-b0/blocks_11/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid_1), efficientnet-b0/blocks_11/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid), efficientnet-b0/blocks_11/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid_1), efficientnet-b0/blocks_11/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid), efficientnet-b0/blocks_12/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid_1), efficientnet-b0/blocks_12/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid), efficientnet-b0/blocks_12/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid_1), efficientnet-b0/blocks_12/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d_1/Conv2D + efficientnet-b0/blocks_12/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid), efficientnet-b0/blocks_13/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid_1), efficientnet-b0/blocks_13/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid), efficientnet-b0/blocks_13/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid_1), efficientnet-b0/blocks_13/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d_1/Conv2D + efficientnet-b0/blocks_13/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid), efficientnet-b0/blocks_14/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid_1), efficientnet-b0/blocks_14/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid), efficientnet-b0/blocks_14/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid_1), efficientnet-b0/blocks_14/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d_1/Conv2D + efficientnet-b0/blocks_14/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid), efficientnet-b0/blocks_15/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid_1), efficientnet-b0/blocks_15/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid), efficientnet-b0/blocks_15/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid_1), efficientnet-b0/blocks_15/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] resample_p6/conv2d/BiasAdd || fpn_cells/cell_0/fnode5/resample_0_2_10/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] resample_p6/max_pooling2d/MaxPool
[TensorRT] INFO: [GpuLayer] resample_p7/max_pooling2d_1/MaxPool
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/resample_0_2_10/bn/FusedBatchNormV3__894
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/add_n_1/add
[TensorRT] INFO: [GpuLayer] resize_nearest_1
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_0/fnode5/op_after_combine10/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise__898
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode0/mul_1:0 + (Unnamed Layer* 284) [Shuffle] + fpn_cells/cell_0/fnode0/truediv_1, PWN(fpn_cells/cell_0/fnode0/mul:0 + (Unnamed Layer* 274) [Shuffle] + fpn_cells/cell_0/fnode0/truediv, fpn_cells/cell_0/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_0/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/resample_2_10_11/max_pooling2d_4/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_2
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/mul_1:0 + (Unnamed Layer* 313) [Shuffle] + fpn_cells/cell_0/fnode1/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/resample_0_2_6/conv2d/BiasAdd + fpn_cells/cell_0/fnode1/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(PWN(fpn_cells/cell_0/fnode6/mul_1:0 + (Unnamed Layer* 308) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_1, PWN(fpn_cells/cell_0/fnode6/mul:0 + (Unnamed Layer* 271) [Shuffle] + fpn_cells/cell_0/fnode6/truediv, fpn_cells/cell_0/fnode6/add_n_1/add)), PWN(fpn_cells/cell_0/fnode6/mul_2:0 + (Unnamed Layer* 305) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_2, fpn_cells/cell_0/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_0/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_0/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_0/fnode1/op_after_combine6/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/resample_1_11_12/max_pooling2d_5/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_3
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/mul_1:0 + (Unnamed Layer* 339) [Shuffle] + fpn_cells/cell_0/fnode2/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/resample_0_1_7/conv2d/BiasAdd + fpn_cells/cell_0/fnode2/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode7/mul_1:0 + (Unnamed Layer* 336) [Shuffle] + fpn_cells/cell_0/fnode7/truediv_1, PWN(fpn_cells/cell_0/fnode7/mul:0 + (Unnamed Layer* 278) [Shuffle] + fpn_cells/cell_0/fnode7/truediv, fpn_cells/cell_0/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_0/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_0/fnode2/op_after_combine7/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/mul_1:0 + (Unnamed Layer* 357) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_1
[TensorRT] INFO: [GpuLayer] resize_nearest_4
[TensorRT] INFO: [GpuLayer] resize_nearest_5
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_0_1_9/conv2d/BiasAdd + fpn_cells/cell_0/fnode4/add_n_1/add
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/mul_1:0 + (Unnamed Layer* 366) [Shuffle] + fpn_cells/cell_0/fnode3/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/resample_0_0_8/conv2d/BiasAdd + fpn_cells/cell_0/fnode3/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode0/mul_1:0 + (Unnamed Layer* 362) [Shuffle] + fpn_cells/cell_1/fnode0/truediv_1, PWN(fpn_cells/cell_1/fnode0/mul:0 + (Unnamed Layer* 331) [Shuffle] + fpn_cells/cell_1/fnode0/truediv, fpn_cells/cell_1/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_1/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_0/fnode3/op_after_combine8/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_2_8_9/max_pooling2d_2/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_6
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode1/mul_1:0 + (Unnamed Layer* 390) [Shuffle] + fpn_cells/cell_1/fnode1/truediv_1, PWN(fpn_cells/cell_1/fnode1/mul:0 + (Unnamed Layer* 300) [Shuffle] + fpn_cells/cell_1/fnode1/truediv, fpn_cells/cell_1/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_1/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode4/mul_2:0 + (Unnamed Layer* 393) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_2, fpn_cells/cell_0/fnode4/add_n_1/add_1), PWN(PWN(fpn_cells/cell_0/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_0/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_7
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode2/mul_1:0 + (Unnamed Layer* 419) [Shuffle] + fpn_cells/cell_1/fnode2/truediv_1, PWN(fpn_cells/cell_1/fnode2/mul:0 + (Unnamed Layer* 414) [Shuffle] + fpn_cells/cell_1/fnode2/truediv, fpn_cells/cell_1/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_1/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_8
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode3/mul_1:0 + (Unnamed Layer* 433) [Shuffle] + fpn_cells/cell_1/fnode3/truediv_1, PWN(fpn_cells/cell_1/fnode3/mul:0 + (Unnamed Layer* 384) [Shuffle] + fpn_cells/cell_1/fnode3/truediv, fpn_cells/cell_1/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_1/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/resample_2_8_9/max_pooling2d_6/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode4/mul_2:0 + (Unnamed Layer* 446) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode4/mul_1:0 + (Unnamed Layer* 428) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_1, PWN(fpn_cells/cell_1/fnode4/mul:0 + (Unnamed Layer* 411) [Shuffle] + fpn_cells/cell_1/fnode4/truediv, fpn_cells/cell_1/fnode4/add_n_1/add)), fpn_cells/cell_1/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_1/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/resample_2_9_10/max_pooling2d_7/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode5/mul_2:0 + (Unnamed Layer* 462) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode5/mul_1:0 + (Unnamed Layer* 408) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_1, PWN(fpn_cells/cell_1/fnode5/mul:0 + (Unnamed Layer* 297) [Shuffle] + fpn_cells/cell_1/fnode5/truediv, fpn_cells/cell_1/fnode5/add_n_1/add)), fpn_cells/cell_1/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_1/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/resample_2_10_11/max_pooling2d_8/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode6/mul_2:0 + (Unnamed Layer* 478) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode6/mul_1:0 + (Unnamed Layer* 381) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_1, PWN(fpn_cells/cell_1/fnode6/mul:0 + (Unnamed Layer* 328) [Shuffle] + fpn_cells/cell_1/fnode6/truediv, fpn_cells/cell_1/fnode6/add_n_1/add)), fpn_cells/cell_1/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_1/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/resample_1_11_12/max_pooling2d_9/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode7/mul_1:0 + (Unnamed Layer* 494) [Shuffle] + fpn_cells/cell_1/fnode7/truediv_1, PWN(fpn_cells/cell_1/fnode7/mul:0 + (Unnamed Layer* 354) [Shuffle] + fpn_cells/cell_1/fnode7/truediv, fpn_cells/cell_1/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_1/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_9
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode0/mul_1:0 + (Unnamed Layer* 507) [Shuffle] + fpn_cells/cell_2/fnode0/truediv_1, PWN(fpn_cells/cell_2/fnode0/mul:0 + (Unnamed Layer* 490) [Shuffle] + fpn_cells/cell_2/fnode0/truediv, fpn_cells/cell_2/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_2/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_10
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode1/mul_1:0 + (Unnamed Layer* 521) [Shuffle] + fpn_cells/cell_2/fnode1/truediv_1, PWN(fpn_cells/cell_2/fnode1/mul:0 + (Unnamed Layer* 474) [Shuffle] + fpn_cells/cell_2/fnode1/truediv, fpn_cells/cell_2/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_2/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_11
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode2/mul_1:0 + (Unnamed Layer* 535) [Shuffle] + fpn_cells/cell_2/fnode2/truediv_1, PWN(fpn_cells/cell_2/fnode2/mul:0 + (Unnamed Layer* 458) [Shuffle] + fpn_cells/cell_2/fnode2/truediv, fpn_cells/cell_2/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_2/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_12
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode3/mul_1:0 + (Unnamed Layer* 549) [Shuffle] + fpn_cells/cell_2/fnode3/truediv_1, PWN(fpn_cells/cell_2/fnode3/mul:0 + (Unnamed Layer* 442) [Shuffle] + fpn_cells/cell_2/fnode3/truediv, fpn_cells/cell_2/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_2/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/resample_2_8_9/max_pooling2d_10/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode4/mul_2:0 + (Unnamed Layer* 561) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode4/mul_1:0 + (Unnamed Layer* 544) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_1, PWN(fpn_cells/cell_2/fnode4/mul:0 + (Unnamed Layer* 455) [Shuffle] + fpn_cells/cell_2/fnode4/truediv, fpn_cells/cell_2/fnode4/add_n_1/add)), fpn_cells/cell_2/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_2/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid), class_net/mul)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid), box_net/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/resample_2_9_10/max_pooling2d_11/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_1), class_net/mul_1)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_1), box_net/mul_1)
[TensorRT] INFO: [GpuLayer] class_net/class-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode5/mul_2:0 + (Unnamed Layer* 589) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode5/mul_1:0 + (Unnamed Layer* 530) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_1, PWN(fpn_cells/cell_2/fnode5/mul:0 + (Unnamed Layer* 471) [Shuffle] + fpn_cells/cell_2/fnode5/truediv, fpn_cells/cell_2/fnode5/add_n_1/add)), fpn_cells/cell_2/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_2/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_3), class_net/mul_3)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_3), box_net/mul_3)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_2), class_net/mul_2)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_2), box_net/mul_2)
[TensorRT] INFO: [GpuLayer] class_net/class-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/resample_2_10_11/max_pooling2d_12/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_4), class_net/mul_4)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_4), box_net/mul_4)
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd__1552 + Reshape
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd__1591 + Reshape_1
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode6/mul_2:0 + (Unnamed Layer* 633) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode6/mul_1:0 + (Unnamed Layer* 516) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_1, PWN(fpn_cells/cell_2/fnode6/mul:0 + (Unnamed Layer* 487) [Shuffle] + fpn_cells/cell_2/fnode6/truediv, fpn_cells/cell_2/fnode6/add_n_1/add)), fpn_cells/cell_2/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_2/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_6), class_net/mul_6)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_6), box_net/mul_6)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_5), class_net/mul_5)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_5), box_net/mul_5)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/resample_1_11_12/max_pooling2d_13/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_7), class_net/mul_7)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_7), box_net/mul_7)
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd__1482 + Reshape_2
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd__1517 + Reshape_3
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode7/mul_1:0 + (Unnamed Layer* 681) [Shuffle] + fpn_cells/cell_2/fnode7/truediv_1, PWN(fpn_cells/cell_2/fnode7/mul:0 + (Unnamed Layer* 503) [Shuffle] + fpn_cells/cell_2/fnode7/truediv, fpn_cells/cell_2/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_2/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_9), class_net/mul_9)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_9), box_net/mul_9)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_8), class_net/mul_8)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_8), box_net/mul_8)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_10), class_net/mul_10)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_10), box_net/mul_10)
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd__1412 + Reshape_4
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd__1447 + Reshape_5
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_12), class_net/mul_12)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_12), box_net/mul_12)
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_11), class_net/mul_11)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_11), box_net/mul_11)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_13), class_net/mul_13)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_13), box_net/mul_13)
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd__1342 + Reshape_6
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd__1377 + Reshape_7
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_14), class_net/mul_14)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_14), box_net/mul_14)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd__1272 + Reshape_8
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd__1307 + Reshape_9
[TensorRT] INFO: [GpuLayer] Reshape:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_2:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_4:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_6:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_8:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_1:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_3:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_5:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_7:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_9:0 copy
[TensorRT] INFO: [GpuLayer] unstack
[TensorRT] INFO: [GpuLayer] unstack_0
[TensorRT] INFO: [GpuLayer] unstack_1
[TensorRT] INFO: [GpuLayer] unstack_2
[TensorRT] INFO: [GpuLayer] PWN(nms/class_net_sigmoid)
[TensorRT] INFO: [GpuLayer] unstack__1596
[TensorRT] INFO: [GpuLayer] unstack__1595
[TensorRT] INFO: [GpuLayer] unstack__1594
[TensorRT] INFO: [GpuLayer] unstack__1593
[TensorRT] INFO: [GpuLayer] sub_3:0
[TensorRT] INFO: [GpuLayer] sub_2:0
[TensorRT] INFO: [GpuLayer] sub_3:0_3
[TensorRT] INFO: [GpuLayer] sub_2:0_4
[TensorRT] INFO: [GpuLayer] truediv_5:0
[TensorRT] INFO: [GpuLayer] PWN(mul_5, add_6)
[TensorRT] INFO: [GpuLayer] truediv_4:0
[TensorRT] INFO: [GpuLayer] PWN(mul_4, add_5)
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0 + (Unnamed Layer* 813) [Shuffle], PWN(PWN(Exp, mul_2), truediv_7))
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0_5 + (Unnamed Layer* 816) [Shuffle], PWN(PWN(Exp_1, mul_3), truediv_6))
[TensorRT] INFO: [GpuLayer] sub_5
[TensorRT] INFO: [GpuLayer] add_8
[TensorRT] INFO: [GpuLayer] sub_4
[TensorRT] INFO: [GpuLayer] add_7
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600:0 copy
[TensorRT] INFO: [GpuLayer] nms/box_net_reshape
[TensorRT] INFO: [GpuLayer] nms/non_maximum_suppression
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +167, GPU +171, now: CPU 455, GPU 3945 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +249, GPU +251, now: CPU 704, GPU 4196 (MiB)
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 1686 detected for tactic 4.
Try decreasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] INFO: Total Host Persistent Memory: 345200
[TensorRT] INFO: Total Device Persistent Memory: 14929920
[TensorRT] INFO: Total Scratch Memory: 107589120
[TensorRT] INFO: [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 12 MiB, GPU 1078 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 986, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +1, GPU +0, now: CPU 987, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 986, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 986, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageSnapshot] Builder end: CPU 985 MiB, GPU 4853 MiB
INFO:EngineBuilder:Serializing engine to file: /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt

real    11m45.193s
user    9m46.616s
sys     0m49.560s

# engine.trt, 29.2 MB
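The build log above contains one ERROR line ("Tactic Device request: 1686MB Available: 1536MB"), yet the engine is still serialized afterwards, so that tactic was skipped rather than the build aborted. The log itself suggests lowering the workspace size with IBuilderConfig::setMaxWorkspaceSize(). When comparing builds, it can help to pull these numbers out of a saved log automatically. The following is a minimal sketch, assuming the terminal output was saved as text; the function names `find_tactic_oom` and `parse_real_seconds` are hypothetical helpers, not part of TensorRT or the EfficientDet sample.

```python
import re

# Matches lines such as:
# [TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. ...
TACTIC_OOM = re.compile(r"Tactic Device request: (\d+)MB Available: (\d+)MB")

# Matches the wall-clock line printed by the shell `time` keyword: real 11m45.193s
REAL_TIME = re.compile(r"real\s+(\d+)m([\d.]+)s")


def find_tactic_oom(log_text):
    """Return (requested_mb, available_mb) for each tactic that did not fit in memory."""
    return [(int(req), int(avail)) for req, avail in TACTIC_OOM.findall(log_text)]


def parse_real_seconds(log_text):
    """Return the wall-clock build time in seconds, or None if no `real` line exists."""
    match = REAL_TIME.search(log_text)
    if match is None:
        return None
    return int(match.group(1)) * 60 + float(match.group(2))


# Two lines copied from the FP32 build log above.
sample = (
    "[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. "
    "Device memory is insufficient to use tactic.\n"
    "real   11m45.193s\n"
)

for requested, available in find_tactic_oom(sample):
    print("tactic needed %d MB, only %d MB available" % (requested, available))
print("build took %.1f s" % parse_real_seconds(sample))
```

Running the same helpers over the FP16 log makes it easy to check whether the same tactic is skipped there and how the build times compare.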

TensorRT FP16

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python build_engine.py \
    --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
    --precision fp16
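The FP32 and FP16 builds differ only in the `--precision` value and in the output directory (`saved_model_trt_fp32` versus `saved_model_trt_fp16`). A small helper can assemble the command so the two stay consistent. This is only a sketch: `build_engine_command` is a hypothetical name, the directory naming pattern is inferred from the paths used in this article, and only the `--onnx`, `--engine` and `--precision` flags shown above are used. It avoids `shlex.join` because the environment here runs Python 3.6.

```python
import posixpath
import shlex


def build_engine_command(onnx_path, output_root, precision):
    """Assemble the build_engine.py argument list for one precision.

    The engine is written to <output_root>/saved_model_trt_<precision>/engine.trt,
    mirroring the directory names used in this article (an inferred convention).
    """
    engine_path = posixpath.join(
        output_root, "saved_model_trt_" + precision, "engine.trt"
    )
    return [
        "python", "build_engine.py",
        "--onnx", onnx_path,
        "--engine", engine_path,
        "--precision", precision,
    ]


cmd = build_engine_command(
    "/media/mydisk/YOYOFile/saved_model_onnx/model.onnx",
    "/media/mydisk/YOYOFile",
    "fp16",
)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can also be passed directly to `subprocess.run(cmd)` from the `samples/python/efficientdet` directory.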

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python build_engine.py \
>     --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
>     --precision fp16
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +234, GPU +0, now: CPU 264, GPU 3391 (MiB)
[TensorRT] WARNING: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] WARNING: builtin_op_importers.cpp:4552: Attribute scoreBits not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[TensorRT] INFO: Successfully created plugin: BatchedNMS_TRT
INFO:EngineBuilder:Network Description
INFO:EngineBuilder:Input 'image_arrays:0' with shape (1, 512, 512, 3) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'num_detections' with shape (1,) and dtype DataType.INT32
INFO:EngineBuilder:Output 'detection_boxes' with shape (1, 100, 4) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_scores' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_classes' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Building fp16 Engine in /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt
WARNING:EngineBuilder:FP16 is supported natively on this platform/device
[TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 284 MiB, GPU 3428 MiB
[TensorRT] INFO: ---------- Layers Running on DLA ----------
[TensorRT] INFO: ---------- Layers Running on GPU ----------
[TensorRT] INFO: [GpuLayer] preprocessor/transpose
[TensorRT] INFO: [GpuLayer] preprocessor/scale_value:0 + preprocessor/scale + preprocessor/mean_value:0 + preprocessor/mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/stem/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/stem/Sigmoid), efficientnet-b0/stem/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/Sigmoid), efficientnet-b0/blocks_0/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid), efficientnet-b0/blocks_0/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid_1), efficientnet-b0/blocks_0/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid), efficientnet-b0/blocks_1/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid_1), efficientnet-b0/blocks_1/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid), efficientnet-b0/blocks_1/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid_1), efficientnet-b0/blocks_1/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid), efficientnet-b0/blocks_2/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid_1), efficientnet-b0/blocks_2/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid), efficientnet-b0/blocks_2/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid_1), efficientnet-b0/blocks_2/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d_1/Conv2D + efficientnet-b0/blocks_2/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid), efficientnet-b0/blocks_3/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid_1), efficientnet-b0/blocks_3/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid), efficientnet-b0/blocks_3/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid_1), efficientnet-b0/blocks_3/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid), efficientnet-b0/blocks_4/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid_1), efficientnet-b0/blocks_4/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid), efficientnet-b0/blocks_4/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid_1), efficientnet-b0/blocks_4/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d_1/Conv2D + efficientnet-b0/blocks_4/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid), efficientnet-b0/blocks_5/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid_1), efficientnet-b0/blocks_5/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid), efficientnet-b0/blocks_5/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid_1), efficientnet-b0/blocks_5/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid), efficientnet-b0/blocks_6/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid_1), efficientnet-b0/blocks_6/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid), efficientnet-b0/blocks_6/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid_1), efficientnet-b0/blocks_6/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d_1/Conv2D + efficientnet-b0/blocks_6/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid), efficientnet-b0/blocks_7/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid_1), efficientnet-b0/blocks_7/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid), efficientnet-b0/blocks_7/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid_1), efficientnet-b0/blocks_7/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d_1/Conv2D + efficientnet-b0/blocks_7/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid), efficientnet-b0/blocks_8/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid_1), efficientnet-b0/blocks_8/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid), efficientnet-b0/blocks_8/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid_1), efficientnet-b0/blocks_8/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid), efficientnet-b0/blocks_9/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid_1), efficientnet-b0/blocks_9/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid), efficientnet-b0/blocks_9/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid_1), efficientnet-b0/blocks_9/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d_1/Conv2D + efficientnet-b0/blocks_9/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid), efficientnet-b0/blocks_10/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid_1), efficientnet-b0/blocks_10/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid), efficientnet-b0/blocks_10/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid_1), efficientnet-b0/blocks_10/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d_1/Conv2D + efficientnet-b0/blocks_10/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid), efficientnet-b0/blocks_11/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid_1), efficientnet-b0/blocks_11/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid), efficientnet-b0/blocks_11/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid_1), efficientnet-b0/blocks_11/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid), efficientnet-b0/blocks_12/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid_1), efficientnet-b0/blocks_12/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid), efficientnet-b0/blocks_12/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid_1), efficientnet-b0/blocks_12/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d_1/Conv2D + efficientnet-b0/blocks_12/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid), efficientnet-b0/blocks_13/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid_1), efficientnet-b0/blocks_13/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid), efficientnet-b0/blocks_13/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid_1), efficientnet-b0/blocks_13/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d_1/Conv2D + efficientnet-b0/blocks_13/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid), efficientnet-b0/blocks_14/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid_1), efficientnet-b0/blocks_14/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid), efficientnet-b0/blocks_14/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid_1), efficientnet-b0/blocks_14/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d_1/Conv2D + efficientnet-b0/blocks_14/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid), efficientnet-b0/blocks_15/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid_1), efficientnet-b0/blocks_15/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid), efficientnet-b0/blocks_15/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid_1), efficientnet-b0/blocks_15/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] resample_p6/conv2d/BiasAdd || fpn_cells/cell_0/fnode5/resample_0_2_10/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] resample_p6/max_pooling2d/MaxPool
[TensorRT] INFO: [GpuLayer] resample_p7/max_pooling2d_1/MaxPool
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/resample_0_2_10/bn/FusedBatchNormV3__894
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/add_n_1/add
[TensorRT] INFO: [GpuLayer] resize_nearest_1
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_0/fnode5/op_after_combine10/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise__898
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode0/mul_1:0 + (Unnamed Layer* 284) [Shuffle] + fpn_cells/cell_0/fnode0/truediv_1, PWN(fpn_cells/cell_0/fnode0/mul:0 + (Unnamed Layer* 274) [Shuffle] + fpn_cells/cell_0/fnode0/truediv, fpn_cells/cell_0/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_0/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/resample_2_10_11/max_pooling2d_4/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_2
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/mul_1:0 + (Unnamed Layer* 313) [Shuffle] + fpn_cells/cell_0/fnode1/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/resample_0_2_6/conv2d/BiasAdd + fpn_cells/cell_0/fnode1/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(PWN(fpn_cells/cell_0/fnode6/mul_1:0 + (Unnamed Layer* 308) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_1, PWN(fpn_cells/cell_0/fnode6/mul:0 + (Unnamed Layer* 271) [Shuffle] + fpn_cells/cell_0/fnode6/truediv, fpn_cells/cell_0/fnode6/add_n_1/add)), PWN(fpn_cells/cell_0/fnode6/mul_2:0 + (Unnamed Layer* 305) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_2, fpn_cells/cell_0/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_0/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_0/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_0/fnode1/op_after_combine6/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/resample_1_11_12/max_pooling2d_5/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_3
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/mul_1:0 + (Unnamed Layer* 339) [Shuffle] + fpn_cells/cell_0/fnode2/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/resample_0_1_7/conv2d/BiasAdd + fpn_cells/cell_0/fnode2/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode7/mul_1:0 + (Unnamed Layer* 336) [Shuffle] + fpn_cells/cell_0/fnode7/truediv_1, PWN(fpn_cells/cell_0/fnode7/mul:0 + (Unnamed Layer* 278) [Shuffle] + fpn_cells/cell_0/fnode7/truediv, fpn_cells/cell_0/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_0/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_0/fnode2/op_after_combine7/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/mul_1:0 + (Unnamed Layer* 357) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_1
[TensorRT] INFO: [GpuLayer] resize_nearest_4
[TensorRT] INFO: [GpuLayer] resize_nearest_5
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_0_1_9/conv2d/BiasAdd + fpn_cells/cell_0/fnode4/add_n_1/add
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/mul_1:0 + (Unnamed Layer* 366) [Shuffle] + fpn_cells/cell_0/fnode3/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/resample_0_0_8/conv2d/BiasAdd + fpn_cells/cell_0/fnode3/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode0/mul_1:0 + (Unnamed Layer* 362) [Shuffle] + fpn_cells/cell_1/fnode0/truediv_1, PWN(fpn_cells/cell_1/fnode0/mul:0 + (Unnamed Layer* 331) [Shuffle] + fpn_cells/cell_1/fnode0/truediv, fpn_cells/cell_1/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_1/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_0/fnode3/op_after_combine8/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_2_8_9/max_pooling2d_2/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_6
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode1/mul_1:0 + (Unnamed Layer* 390) [Shuffle] + fpn_cells/cell_1/fnode1/truediv_1, PWN(fpn_cells/cell_1/fnode1/mul:0 + (Unnamed Layer* 300) [Shuffle] + fpn_cells/cell_1/fnode1/truediv, fpn_cells/cell_1/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_1/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode4/mul_2:0 + (Unnamed Layer* 393) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_2, fpn_cells/cell_0/fnode4/add_n_1/add_1), PWN(PWN(fpn_cells/cell_0/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_0/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_7
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode2/mul_1:0 + (Unnamed Layer* 419) [Shuffle] + fpn_cells/cell_1/fnode2/truediv_1, PWN(fpn_cells/cell_1/fnode2/mul:0 + (Unnamed Layer* 414) [Shuffle] + fpn_cells/cell_1/fnode2/truediv, fpn_cells/cell_1/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_1/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_8
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode3/mul_1:0 + (Unnamed Layer* 433) [Shuffle] + fpn_cells/cell_1/fnode3/truediv_1, PWN(fpn_cells/cell_1/fnode3/mul:0 + (Unnamed Layer* 384) [Shuffle] + fpn_cells/cell_1/fnode3/truediv, fpn_cells/cell_1/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_1/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/resample_2_8_9/max_pooling2d_6/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode4/mul_2:0 + (Unnamed Layer* 446) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode4/mul_1:0 + (Unnamed Layer* 428) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_1, PWN(fpn_cells/cell_1/fnode4/mul:0 + (Unnamed Layer* 411) [Shuffle] + fpn_cells/cell_1/fnode4/truediv, fpn_cells/cell_1/fnode4/add_n_1/add)), fpn_cells/cell_1/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_1/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/resample_2_9_10/max_pooling2d_7/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode5/mul_2:0 + (Unnamed Layer* 462) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode5/mul_1:0 + (Unnamed Layer* 408) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_1, PWN(fpn_cells/cell_1/fnode5/mul:0 + (Unnamed Layer* 297) [Shuffle] + fpn_cells/cell_1/fnode5/truediv, fpn_cells/cell_1/fnode5/add_n_1/add)), fpn_cells/cell_1/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_1/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/resample_2_10_11/max_pooling2d_8/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode6/mul_2:0 + (Unnamed Layer* 478) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode6/mul_1:0 + (Unnamed Layer* 381) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_1, PWN(fpn_cells/cell_1/fnode6/mul:0 + (Unnamed Layer* 328) [Shuffle] + fpn_cells/cell_1/fnode6/truediv, fpn_cells/cell_1/fnode6/add_n_1/add)), fpn_cells/cell_1/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_1/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/resample_1_11_12/max_pooling2d_9/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode7/mul_1:0 + (Unnamed Layer* 494) [Shuffle] + fpn_cells/cell_1/fnode7/truediv_1, PWN(fpn_cells/cell_1/fnode7/mul:0 + (Unnamed Layer* 354) [Shuffle] + fpn_cells/cell_1/fnode7/truediv, fpn_cells/cell_1/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_1/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_9
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode0/mul_1:0 + (Unnamed Layer* 507) [Shuffle] + fpn_cells/cell_2/fnode0/truediv_1, PWN(fpn_cells/cell_2/fnode0/mul:0 + (Unnamed Layer* 490) [Shuffle] + fpn_cells/cell_2/fnode0/truediv, fpn_cells/cell_2/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_2/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_10
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode1/mul_1:0 + (Unnamed Layer* 521) [Shuffle] + fpn_cells/cell_2/fnode1/truediv_1, PWN(fpn_cells/cell_2/fnode1/mul:0 + (Unnamed Layer* 474) [Shuffle] + fpn_cells/cell_2/fnode1/truediv, fpn_cells/cell_2/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_2/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_11
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode2/mul_1:0 + (Unnamed Layer* 535) [Shuffle] + fpn_cells/cell_2/fnode2/truediv_1, PWN(fpn_cells/cell_2/fnode2/mul:0 + (Unnamed Layer* 458) [Shuffle] + fpn_cells/cell_2/fnode2/truediv, fpn_cells/cell_2/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_2/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_12
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode3/mul_1:0 + (Unnamed Layer* 549) [Shuffle] + fpn_cells/cell_2/fnode3/truediv_1, PWN(fpn_cells/cell_2/fnode3/mul:0 + (Unnamed Layer* 442) [Shuffle] + fpn_cells/cell_2/fnode3/truediv, fpn_cells/cell_2/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_2/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/resample_2_8_9/max_pooling2d_10/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode4/mul_2:0 + (Unnamed Layer* 561) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode4/mul_1:0 + (Unnamed Layer* 544) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_1, PWN(fpn_cells/cell_2/fnode4/mul:0 + (Unnamed Layer* 455) [Shuffle] + fpn_cells/cell_2/fnode4/truediv, fpn_cells/cell_2/fnode4/add_n_1/add)), fpn_cells/cell_2/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_2/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid), class_net/mul)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid), box_net/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/resample_2_9_10/max_pooling2d_11/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_1), class_net/mul_1)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_1), box_net/mul_1)
[TensorRT] INFO: [GpuLayer] class_net/class-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode5/mul_2:0 + (Unnamed Layer* 589) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode5/mul_1:0 + (Unnamed Layer* 530) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_1, PWN(fpn_cells/cell_2/fnode5/mul:0 + (Unnamed Layer* 471) [Shuffle] + fpn_cells/cell_2/fnode5/truediv, fpn_cells/cell_2/fnode5/add_n_1/add)), fpn_cells/cell_2/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_2/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_3), class_net/mul_3)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_3), box_net/mul_3)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_2), class_net/mul_2)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_2), box_net/mul_2)
[TensorRT] INFO: [GpuLayer] class_net/class-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/resample_2_10_11/max_pooling2d_12/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_4), class_net/mul_4)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_4), box_net/mul_4)
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd__1552 + Reshape
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd__1591 + Reshape_1
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode6/mul_2:0 + (Unnamed Layer* 633) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode6/mul_1:0 + (Unnamed Layer* 516) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_1, PWN(fpn_cells/cell_2/fnode6/mul:0 + (Unnamed Layer* 487) [Shuffle] + fpn_cells/cell_2/fnode6/truediv, fpn_cells/cell_2/fnode6/add_n_1/add)), fpn_cells/cell_2/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_2/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_6), class_net/mul_6)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_6), box_net/mul_6)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_5), class_net/mul_5)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_5), box_net/mul_5)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/resample_1_11_12/max_pooling2d_13/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_7), class_net/mul_7)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_7), box_net/mul_7)
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd__1482 + Reshape_2
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd__1517 + Reshape_3
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode7/mul_1:0 + (Unnamed Layer* 681) [Shuffle] + fpn_cells/cell_2/fnode7/truediv_1, PWN(fpn_cells/cell_2/fnode7/mul:0 + (Unnamed Layer* 503) [Shuffle] + fpn_cells/cell_2/fnode7/truediv, fpn_cells/cell_2/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_2/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_9), class_net/mul_9)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_9), box_net/mul_9)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_8), class_net/mul_8)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_8), box_net/mul_8)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_10), class_net/mul_10)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_10), box_net/mul_10)
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd__1412 + Reshape_4
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd__1447 + Reshape_5
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_12), class_net/mul_12)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_12), box_net/mul_12)
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_11), class_net/mul_11)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_11), box_net/mul_11)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_13), class_net/mul_13)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_13), box_net/mul_13)
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd__1342 + Reshape_6
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd__1377 + Reshape_7
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_14), class_net/mul_14)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_14), box_net/mul_14)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd__1272 + Reshape_8
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd__1307 + Reshape_9
[TensorRT] INFO: [GpuLayer] Reshape:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_2:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_4:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_6:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_8:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_1:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_3:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_5:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_7:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_9:0 copy
[TensorRT] INFO: [GpuLayer] unstack
[TensorRT] INFO: [GpuLayer] unstack_0
[TensorRT] INFO: [GpuLayer] unstack_1
[TensorRT] INFO: [GpuLayer] unstack_2
[TensorRT] INFO: [GpuLayer] PWN(nms/class_net_sigmoid)
[TensorRT] INFO: [GpuLayer] unstack__1596
[TensorRT] INFO: [GpuLayer] unstack__1595
[TensorRT] INFO: [GpuLayer] unstack__1594
[TensorRT] INFO: [GpuLayer] unstack__1593
[TensorRT] INFO: [GpuLayer] sub_3:0
[TensorRT] INFO: [GpuLayer] sub_2:0
[TensorRT] INFO: [GpuLayer] sub_3:0_3
[TensorRT] INFO: [GpuLayer] sub_2:0_4
[TensorRT] INFO: [GpuLayer] truediv_5:0
[TensorRT] INFO: [GpuLayer] PWN(mul_5, add_6)
[TensorRT] INFO: [GpuLayer] truediv_4:0
[TensorRT] INFO: [GpuLayer] PWN(mul_4, add_5)
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0 + (Unnamed Layer* 813) [Shuffle], PWN(PWN(Exp, mul_2), truediv_7))
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0_5 + (Unnamed Layer* 816) [Shuffle], PWN(PWN(Exp_1, mul_3), truediv_6))
[TensorRT] INFO: [GpuLayer] sub_5
[TensorRT] INFO: [GpuLayer] add_8
[TensorRT] INFO: [GpuLayer] sub_4
[TensorRT] INFO: [GpuLayer] add_7
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600:0 copy
[TensorRT] INFO: [GpuLayer] nms/box_net_reshape
[TensorRT] INFO: [GpuLayer] nms/non_maximum_suppression
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +167, GPU +287, now: CPU 455, GPU 3720 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +250, GPU +449, now: CPU 705, GPU 4169 (MiB)
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 1686 detected for tactic 4.
Try decreasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
[TensorRT] ERROR: Tactic Device request: 1679MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 1679 detected for tactic 4.
Try decreasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] INFO: Total Host Persistent Memory: 346688
[TensorRT] INFO: Total Device Persistent Memory: 8863232
[TensorRT] INFO: Total Scratch Memory: 107589120
[TensorRT] INFO: [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 12 MiB, GPU 1078 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1002, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1002, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1002, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1001, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageSnapshot] Builder end: CPU 998 MiB, GPU 5823 MiB
INFO:EngineBuilder:Serializing engine to file: /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt

real   30m6.750s
user    26m56.900s
sys 1m8.252s

# engine.trt, 20.6 MB
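The two `Tactic Device request` errors in the build log above did not stop the build: TensorRT skipped those tactics and the engine was still serialized. As a small aid for scanning long build logs, the sketch below pulls out each request together with the memory that was available. The helper name `find_tactic_oom` and the regular expression are assumptions modelled only on the two log lines shown here; they are not part of TensorRT, and other versions may word the message differently.

```python
import re

# Pattern modelled on the two log lines in this build log (an assumption,
# not an official TensorRT log format).
_TACTIC_OOM = re.compile(r"Tactic Device request: (\d+)MB Available: (\d+)MB")


def find_tactic_oom(log_text):
    """Return a list of (requested_mb, available_mb, shortfall_mb) tuples."""
    results = []
    for match in _TACTIC_OOM.finditer(log_text):
        requested = int(match.group(1))
        available = int(match.group(2))
        results.append((requested, available, requested - available))
    return results


# The two error lines copied from the build log above.
sample_log = (
    "[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. "
    "Device memory is insufficient to use tactic.\n"
    "[TensorRT] ERROR: Tactic Device request: 1679MB Available: 1536MB. "
    "Device memory is insufficient to use tactic.\n"
)

for requested, available, shortfall in find_tactic_oom(sample_log):
    print(f"requested {requested} MB, available {available} MB, short by {shortfall} MB")
```

If many tactics are skipped this way, the workspace limit mentioned in the warning (`IBuilderConfig::setMaxWorkspaceSize()`) is the setting to revisit.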

6. Inference

TensorRT FP32

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python infer.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
    --input /media/mydisk/YOYOFile/coco_calib \
    --output /media/mydisk/YOYOFile/infer_fp32

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python infer.py \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
>     --input /media/mydisk/YOYOFile/coco_calib \
>     --output /media/mydisk/YOYOFile/infer_fp32
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
time_1: 1803 age 1 / 1000
len(images): 1
detections: [[{'ymin': -10.35421371459961, 'xmin': 299.7505760192871, 'ymax': 240.5466079711914, 'xmax': 637.5276184082031, 'score': 0.6842605, 'class': 50}, {'ymin': 80.76017379760742, 'xmin': -1.2042045593261719, 'ymax': 436.9062805175781, 'xmax': 459.1698455810547, 'score': 0.62641436, 'class': 50}, {'ymin': 188.0274200439453, 'xmin': 27.046070098876953, 'ymax': 471.3096618652344, 'xmax': 601.4808654785156, 'score': 0.57706714, 'class': 50}, {'ymin': 222.76180267333984, 'xmin': 249.59579467773438, 'ymax': 473.05362701416016, 'xmax': 562.2965240478516, 'score': 0.5429893, 'class': 55}, {'ymin': 69.52826499938965, 'xmin': 388.28861236572266, 'ymax': 141.8428897857666, 'xmax': 470.15674591064453, 'score': 0.45283884, 'class': 54}, {'ymin': 6.308660507202148, 'xmin': 19.28152084350586, 'ymax': 293.54217529296875, 'xmax': 427.21527099609375, 'score': 0.42518762, 'class': 50}]]
time_2: 368
...
...
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000580385.jpg']
time_1: 145 mage 998 / 1000
len(images): 1
detections: [[{'ymin': 47.34269142150879, 'xmin': 125.20036697387695, 'ymax': 359.3659210205078, 'xmax': 525.5953216552734, 'score': 0.9364286, 'class': 6}]]
time_2: 251
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581218.jpg']
time_1: 151 mage 999 / 1000
len(images): 1
detections: [[{'ymin': 309.5945739746094, 'xmin': 241.9521141052246, 'ymax': 348.5173034667969, 'xmax': 271.9435691833496, 'score': 0.69110346, 'class': 4}, {'ymin': 147.7055263519287, 'xmin': 271.3910102844238, 'ymax': 188.7246322631836, 'xmax': 300.4056739807129, 'score': 0.64089125, 'class': 4}, {'ymin': 269.5161247253418, 'xmin': 260.4314994812012, 'ymax': 310.32026290893555, 'xmax': 289.4497871398926, 'score': 0.6314739, 'class': 4}, {'ymin': 234.64244842529297, 'xmin': 308.0194282531738, 'ymax': 274.2452621459961, 'xmax': 339.4849395751953, 'score': 0.526499, 'class': 4}, {'ymin': 273.08494567871094, 'xmin': 298.948917388916, 'ymax': 313.12862396240234, 'xmax': 325.6622314453125, 'score': 0.4886201, 'class': 4}, {'ymin': 233.17075729370117, 'xmin': 282.2890281677246, 'ymax': 268.447322845459, 'xmax': 312.5209617614746, 'score': 0.41873252, 'class': 4}]]
time_2: 430
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581766.jpg']
time_1: 149 mage 1000 / 1000
len(images): 1
detections: [[{'ymin': 148.56106042861938, 'xmin': 204.04022932052612, 'ymax': 283.86545181274414, 'xmax': 297.4337339401245, 'score': 0.8775656, 'class': 69}, {'ymin': 153.74591946601868, 'xmin': 17.941385507583618, 'ymax': 285.72288155555725, 'xmax': 127.41453945636749, 'score': 0.81447136, 'class': 69}, {'ymin': 146.68376743793488, 'xmin': 373.20685386657715, 'ymax': 278.85639667510986, 'xmax': 481.5758466720581, 'score': 0.751435, 'class': 69}]]
time_2: 276
infer time: 504825
Finished Processing

real    8m32.614s
user    6m5.816s
sys 0m10.920s

Summary:

COCO dataset
(best case) FP32 averages 150 ms per image, i.e. 6.7 fps
(worst case) FP32 averages 160 ms per image, i.e. 6.2 fps
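The per-image figures in this summary can be roughly reproduced from the log. The sketch below averages the `time_1` values and converts latency to frames per second. It assumes that `time_1` is the per-image inference time in milliseconds, which the log does not state explicitly; the function names are illustrative. The first image (1803 ms) is left out because it is far slower than the rest, presumably due to warm-up.

```python
import re
from statistics import mean


def mean_latency_ms(log_text):
    """Average the integers on `time_1:` lines (assumed to be ms per image)."""
    values = [int(v) for v in re.findall(r"time_1:\s*(\d+)", log_text)]
    return mean(values)


def fps_from_latency(latency_ms):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# The last three time_1 values printed by the FP32 run above.
tail = "time_1: 145\ntime_1: 151\ntime_1: 149\n"

avg = mean_latency_ms(tail)
print(f"mean latency: {avg:.1f} ms -> {fps_from_latency(avg):.2f} fps")
print(f"150 ms -> {fps_from_latency(150):.2f} fps")
print(f"160 ms -> {fps_from_latency(160):.2f} fps")
```

Three samples are only a spot check; feeding the full log through `mean_latency_ms` would give a more representative average.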

TensorRT FP16

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python infer.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
    --input /media/mydisk/YOYOFile/coco_calib \
    --output /media/mydisk/YOYOFile/infer_fp16

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python infer.py \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
>     --input /media/mydisk/YOYOFile/coco_calib \
>     --output /media/mydisk/YOYOFile/infer_fp16
[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchor_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchorRect_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::NMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Reorg_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Region_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Clip_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::LReLU_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PriorBox_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Normalize_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ScatterND version 1
[TensorRT] VERBOSE: Registered plugin creator - ::RPROI_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::FlattenConcat_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::CropAndResize version 1
[TensorRT] VERBOSE: Registered plugin creator - ::DetectionLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::EfficientNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Proposal version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ProposalLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ResizeNearest_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Split version 1
[TensorRT] VERBOSE: Registered plugin creator - ::SpecialSlice_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::InstanceNormalization_TRT version 1
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +234, GPU +0, now: CPU 264, GPU 7209 (MiB)
[TensorRT] INFO: Loaded engine size: 19 MB
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine begin: CPU 284 MiB, GPU 7228 MiB
[TensorRT] VERBOSE: Using cublas a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +167, GPU +268, now: CPU 466, GPU 7510 (MiB)
[TensorRT] VERBOSE: Using cuDNN as a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +250, GPU +245, now: CPU 716, GPU 7755 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 716, GPU 7755 (MiB)
[TensorRT] VERBOSE: Deserialization required 5901688 microseconds.
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine end: CPU 716 MiB, GPU 7755 MiB
[TensorRT] INFO: [MemUsageSnapshot] ExecutionContext creation begin: CPU 696 MiB, GPU 7736 MiB
[TensorRT] VERBOSE: Using cublas a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 696, GPU 7736 (MiB)
[TensorRT] VERBOSE: Using cuDNN as a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 696, GPU 7736 (MiB)
[TensorRT] VERBOSE: Total per-runner device memory is 9042944
[TensorRT] VERBOSE: Total per-runner host memory is 343280
[TensorRT] VERBOSE: Allocated activation device memory of size 141150208
[TensorRT] INFO: [MemUsageSnapshot] ExecutionContext creation end: CPU 699 MiB, GPU 7770 MiB
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
time_1: 3401mage 1 / 1000
len(images): 1
detections: [[{'ymin': -10.390625, 'xmin': 299.53125, 'ymax': 240.625, 'xmax': 637.5, 'score': 0.684264, 'class': 50}, {'ymin': 80.78125, 'xmin': -1.25, 'ymax': 436.875, 'xmax': 459.0625, 'score': 0.62612414, 'class': 50}, {'ymin': 188.125, 'xmin': 27.1875, 'ymax': 471.25, 'xmax': 601.5625, 'score': 0.57749534, 'class': 50}, {'ymin': 222.8125, 'xmin': 249.6875, 'ymax': 472.8125, 'xmax': 562.1875, 'score': 0.5418937, 'class': 55}, {'ymin': 69.53125, 'xmin': 388.4375, 'ymax': 141.875, 'xmax': 470.3125, 'score': 0.4530755, 'class': 54}, {'ymin': 6.25, 'xmin': 19.375, 'ymax': 293.59375, 'xmax': 427.1875, 'score': 0.42536652, 'class': 50}]]
time_2: 485
...
...
...
time_1: 163Image 998 / 1000
len(images): 1
detections: [[{'ymin': 47.265625, 'xmin': 125.0, 'ymax': 359.375, 'xmax': 525.625, 'score': 0.9365176, 'class': 6}]]
time_2: 449
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581218.jpg']
time_1: 168Image 999 / 1000
len(images): 1
detections: [[{'ymin': 309.53125, 'xmin': 241.875, 'ymax': 349.375, 'xmax': 271.5625, 'score': 0.69014156, 'class': 4}, {'ymin': 147.734375, 'xmin': 271.40625, 'ymax': 188.75, 'xmax': 300.46875, 'score': 0.63973606, 'class': 4}, {'ymin': 269.375, 'xmin': 260.46875, 'ymax': 310.3125, 'xmax': 289.53125, 'score': 0.6306849, 'class': 4}, {'ymin': 234.53125, 'xmin': 307.96875, 'ymax': 274.21875, 'xmax': 339.375, 'score': 0.52634275, 'class': 4}, {'ymin': 273.125, 'xmin': 299.0625, 'ymax': 313.125, 'xmax': 325.625, 'score': 0.4882834, 'class': 4}, {'ymin': 233.125, 'xmin': 282.1875, 'ymax': 268.4375, 'xmax': 312.5, 'score': 0.42059958, 'class': 4}]]
time_2: 665
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581766.jpg']
time_1: 160Image 1000 / 1000
len(images): 1
detections: [[{'ymin': 148.5595703125, 'xmin': 204.1015625, 'ymax': 283.69140625, 'xmax': 297.36328125, 'score': 0.8774537, 'class': 69}, {'ymin': 153.80859375, 'xmin': 17.9443359375, 'ymax': 285.64453125, 'xmax': 127.44140625, 'score': 0.8146434, 'class': 69}, {'ymin': 146.728515625, 'xmin': 373.291015625, 'ymax': 278.80859375, 'xmax': 481.689453125, 'score': 0.7512834, 'class': 69}]]
time_2: 373
infer time: 684610
Finished Processing
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 964, GPU 7654 (MiB)

real  11m37.284s
user    7m18.972s
sys 0m14.796s

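Summary:

COCO dataset
(best case) FP16 averages 140 ms per image, i.e. 7 fps
(worst case) FP16 averages 170 ms per image, i.e. 5.9 fps

person_horse dataset
(best case) FP16 averages 140 ms per image, i.e. 7 fps
(worst case) FP16 averages 170 ms per image, i.e. 5.9 fps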
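Besides speed, the two logs also show how closely the FP16 engine tracks the FP32 one. The sketch below compares the detections printed for `COCO_train2014_000000581766.jpg` in both runs; the values are copied from the logs above. Pairing detections by list position is a simplifying assumption that only works here because both runs return the same number of boxes in the same order, and `compare_detections` is an illustrative helper, not part of the TensorRT sample.

```python
# Detections for COCO_train2014_000000581766.jpg, copied from the FP32 log.
fp32 = [
    {'ymin': 148.56106042861938, 'xmin': 204.04022932052612, 'ymax': 283.86545181274414, 'xmax': 297.4337339401245, 'score': 0.8775656, 'class': 69},
    {'ymin': 153.74591946601868, 'xmin': 17.941385507583618, 'ymax': 285.72288155555725, 'xmax': 127.41453945636749, 'score': 0.81447136, 'class': 69},
    {'ymin': 146.68376743793488, 'xmin': 373.20685386657715, 'ymax': 278.85639667510986, 'xmax': 481.5758466720581, 'score': 0.751435, 'class': 69},
]
# The same image, copied from the FP16 log.
fp16 = [
    {'ymin': 148.5595703125, 'xmin': 204.1015625, 'ymax': 283.69140625, 'xmax': 297.36328125, 'score': 0.8774537, 'class': 69},
    {'ymin': 153.80859375, 'xmin': 17.9443359375, 'ymax': 285.64453125, 'xmax': 127.44140625, 'score': 0.8146434, 'class': 69},
    {'ymin': 146.728515625, 'xmin': 373.291015625, 'ymax': 278.80859375, 'xmax': 481.689453125, 'score': 0.7512834, 'class': 69},
]

BOX_KEYS = ('ymin', 'xmin', 'ymax', 'xmax')


def compare_detections(a, b):
    """Pair detections by position; return (same_classes, max_box_diff, max_score_diff)."""
    if len(a) != len(b):
        raise ValueError("detection counts differ; positional pairing is not meaningful")
    pairs = list(zip(a, b))
    same_classes = all(x['class'] == y['class'] for x, y in pairs)
    max_box = max(abs(x[k] - y[k]) for x, y in pairs for k in BOX_KEYS)
    max_score = max(abs(x['score'] - y['score']) for x, y in pairs)
    return same_classes, max_box, max_score


same_classes, max_box, max_score = compare_detections(fp32, fp16)
print(f"classes match: {same_classes}")
print(f"largest box coordinate difference: {max_box:.3f} px")
print(f"largest score difference: {max_score:.6f}")
```

For this image the classes agree and the box coordinates differ by well under one pixel.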

7. Evaluation metrics

TensorRT FP32

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python eval_coco.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
    --input /media/mydisk/YOYOFile/COCO/val2017 \
    --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json \
    --automl_path /media/mydisk/MyDocuments/PyProjects/automl

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python eval_coco.py \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
>     --input /media/mydisk/YOYOFile/COCO/val2017 \
>     --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json \
>     --automl_path /media/mydisk/MyDocuments/PyProjects/automl
2021-10-26 18:22:17.807992: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
...
...
Processing Image 5000 / 5000
infer time: 951576
loading annotations into memory...
Done (t=3.04s)
creating index...
index created!
Loading and preparing results...
Converting ndarray to lists...
(18132, 7)
0/18132
DONE (t=0.38s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=78.79s).
Accumulating evaluation results...
DONE (t=12.23s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.282
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.397
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.315
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.053
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.319
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.494
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.242
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.315
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.316
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.052
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.349
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.562

real   18m6.646s
user    6m34.516s
sys 0m34.588s

TensorRT FP16

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python eval_coco.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
    --input /media/mydisk/YOYOFile/COCO/val2017 \
    --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json \
    --automl_path /media/mydisk/MyDocuments/PyProjects/automl

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python eval_coco.py     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt     --input /media/mydisk/YOYOFile/COCO/val2017     --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json     --automl_path /media/mydisk/MyDocuments/PyProjects/automl
2021-10-22 14:44:34.439779: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
...
...
Processing Image 5000 / 5000
infer time: 926744
loading annotations into memory...
Done (t=3.13s)
creating index...
index created!
Loading and preparing results...
Converting ndarray to lists...
(18133, 7)
0/18133
DONE (t=0.40s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=70.50s).
Accumulating evaluation results...
DONE (t=11.49s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.282
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.397
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.315
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.053
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.319
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.494
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.242
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.315
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.316
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.052
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.349
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.562

real   17m27.213s
user    6m39.904s
sys 0m38.652s
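The FP32 and FP16 runs print the same twelve COCO metrics to three decimals. To check this mechanically rather than by eye, the summary lines can be parsed into a dictionary and compared. The sketch below is one way to do that; `parse_coco_summary` and its regular expression are assumptions based on the output format shown above, not a pycocotools API. The pattern does not rely on line breaks, so it also works on output where the lines have run together.

```python
import re

# Matches one COCO summary line as printed in the logs above.
_SUMMARY = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=([\d.:]+)\s*"
    r"\| area=\s*(\w+) \| maxDets=\s*(\d+) \] = (-?[\d.]+)"
)


def parse_coco_summary(text):
    """Map (metric, iou, area, max_dets) to the reported value."""
    metrics = {}
    for kind, iou, area, max_dets, value in _SUMMARY.findall(text):
        metrics[(kind, iou, area, int(max_dets))] = float(value)
    return metrics


# Summary block copied from the logs above; both runs printed these values.
summary_text = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.282
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.397
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.315
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.053
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.319
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.494
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.242
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.315
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.316
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.052
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.349
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.562
"""

fp32_metrics = parse_coco_summary(summary_text)
fp16_metrics = parse_coco_summary(summary_text)

# Collect every metric whose value differs between the two runs.
changed = {k: (v, fp16_metrics.get(k)) for k, v in fp32_metrics.items()
           if fp16_metrics.get(k) != v}
print(f"parsed {len(fp32_metrics)} metrics, {len(changed)} differ")
```

In practice each run's captured output would be passed in separately; the same text is reused here only because the two logs report identical values.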

8. Comparing native TensorFlow with TensorRT

TensorRT FP32

cd /home/yichao/MyDocuments/TensorRT/samples/python/efficientdet
python compare_tf.py \
    --engine /home/yichao/Downloads/saved_model_trt_fp32/engine.trt \
    --saved_model /home/yichao/Downloads/saved_model \
    --input /home/yichao/Downloads/coco_calib \
    --output /home/yichao/Downloads/output_fp32

TensorRT FP16

cd /home/yichao/MyDocuments/TensorRT/samples/python/efficientdet
python compare_tf.py \
    --engine /home/yichao/Downloads/saved_model_trt_fp16/engine.trt \
    --saved_model /home/yichao/Downloads/saved_model \
    --input /home/yichao/Downloads/coco_calib \
    --output /home/yichao/Downloads/output_fp16

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python compare_tf.py     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt     --saved_model /media/mydisk/YOYOFile/saved_model     --input /media/mydisk/YOYOFile/coco_calib     --output /media/mydisk/YOYOFile/output_fp16
2021-10-22 16:19:15.765642: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-22 16:19:28.446469: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-10-22 16:19:28.447570: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:19:28.448016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-22 16:19:28.448459: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-22 16:19:28.448790: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-22 16:19:28.448971: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.10
2021-10-22 16:19:28.449133: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-10-22 16:19:28.449401: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-10-22 16:19:28.468938: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-10-22 16:19:28.485613: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.10
2021-10-22 16:19:28.486389: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-22 16:19:28.487274: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:19:28.487977: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:19:28.488423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-22 16:20:55.378425: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:20:55.378885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-22 16:20:55.379462: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:20:55.379853: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:20:55.379990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-22 16:20:55.380803: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-22 16:21:00.945108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-22 16:21:00.945316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-22 16:21:00.945392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-22 16:21:00.946306: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:21:00.947010: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:21:00.947459: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:21:00.947732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2356 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-22 16:24:27.939628: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-22 16:24:28.315283: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 31250000 Hz
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
2021-10-22 16:26:00.194066: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-22 16:26:00.207154: I tensorflow/stream_executor/cuda/cuda_dnn.cc:380] Loaded cuDNN version 8201
2021-10-22 16:26:03.482891: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-22 16:26:11.318267: W tensorflow/core/common_runtime/bfc_allocator.cc:337] Garbage collection: deallocate free memory regions (i.e., allocations) so that we can re-allocate a larger region to avoid OOM due to memory fragmentation. If you see this message frequently, you are running near the threshold of the available device memory and re-allocation may incur great performance overhead. You may try smaller batch sizes to observe the performance impact. Set TF_ENABLE_GPU_GARBAGE_COLLECTION=false if you'd like to disable this feature.
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000151.jpg']
...
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061328.jpg']
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061399.jpg']
Processing 100 / 100 images (TensorFlow)
infer time: 47011
time_1: 47011
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
...
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061328.jpg']
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061399.jpg']
Processing 100 / 100 images (TensorRT)
infer time: 18347
time_2: 18348
Processing 100 / 100 images (Visualization)

real 8m12.180s
user    7m17.240s
sys 0m19.388s

COCO dataset
time_1: 47011
time_2: 18348

time_1: 46871
time_2: 18095

time_1: 46849
time_2: 18110

time_1: 46364
time_2: 17931

person_horse dataset
time_1: 42032
time_2: 20510

time_1: 48867
time_2: 20521

Test data	Resolution	TensorFlow time (ms)	TensorRT time (ms)	Speedup
COCO, 100 images	640x480	16089	5268	3
COCO, 100 images	640x480	11888	2509	4.7
COCO, 100 images	640x480	12012	3008	4
COCO, 100 images	640x480	11803	3183	3.7
COCO, 100 images	640x480	11776	3140	3.7

Test data	Resolution	TensorFlow time (ms)	TensorRT time (ms)	Speedup
person_horse, 100 images	1280x720	12891	3452	3.7
person_horse, 100 images	1280x720	14989	3480	4.3
person_horse, 100 images	1280x720	15565	3515	4.4
person_horse, 100 images	1280x720	12883	3456	3.7

Note: $\text{speedup} = \frac{\text{TensorFlow time}}{\text{TensorRT time}}$
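The speedup column can be recomputed directly from the two timing columns. The sketch below applies the ratio to the COCO rows of the table above; the function name `speedup` is illustrative, and the timings are the ones reported in the table.

```python
def speedup(tensorflow_ms, tensorrt_ms):
    """Speedup = TensorFlow time / TensorRT time (both in the same unit)."""
    if tensorrt_ms <= 0:
        raise ValueError("TensorRT time must be positive")
    return tensorflow_ms / tensorrt_ms


# (TensorFlow ms, TensorRT ms) pairs from the COCO table above.
coco_rows = [
    (16089, 5268),
    (11888, 2509),
    (12012, 3008),
    (11803, 3183),
    (11776, 3140),
]

for tf_ms, trt_ms in coco_rows:
    print(f"{tf_ms} ms / {trt_ms} ms = {speedup(tf_ms, trt_ms):.2f}x")
```

Printing two decimals makes the rounding behind the table's speedup column visible.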
