I. References

EfficientDet tensorrt
Accelerating EfficientDet inference on Jetson TX2 (Part 2)

II. Environment

System environment

Environment
Operating System + Version: Ubuntu + 18.04
TensorRT Version: 8.0.1.6
GPU Type: Jetson TX2
CUDA Version: 10.2.300
Python Version (if applicable): 3.6.9
gcc: 7.5.0
g++: 7.5.0

III. Notes

  1. Install Python packages only as needed, following the error prompts. Installing everything with pip install -r requirements.txt is not recommended.

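IV. Key steps

1. Download the official efficientdet and install its dependencies

git clone https://github.com/google/automl.git
cd /media/mydisk/MyDocuments/PyProjects/automl/efficientdet
# Run the command from step 3 and install packages as prompted. On the Jetson TX2, installing everything with `pip install -r requirements.txt` is not recommended. For example:
pip install tensorflow-model-optimization
pip install dm-tree
Note: if you install dm-tree, build it from source, as described below.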
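Installing packages "as prompted" means waiting for each ImportError in turn. As a rough sketch, the check can be done up front with the standard library. The mapping below covers only the two packages named above, and the helper name `missing_packages` is hypothetical, not part of the automl repository.

```python
import importlib.util

# Import name -> pip package name. Only the two packages mentioned above are
# listed (assumption); extend the dict as further ImportErrors show up.
REQUIRED = {
    "tensorflow_model_optimization": "tensorflow-model-optimization",
    "tree": "dm-tree",
}


def missing_packages(required):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


for pip_name in missing_packages(REQUIRED):
    print(f"missing: pip install {pip_name}")
```

Only the packages reported as missing need to be installed, which keeps the Jetson environment small.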

2. Download the pretrained model and extract it

efficientdet-d0

# /media/mydisk/YOYOFile/efficientdet-d0
├── efficientdet-d0
│   ├── checkpoint
│   ├── d0_coco_test-dev2017.txt
│   ├── d0_coco_val.txt
│   ├── model.data-00000-of-00001
│   ├── model.index
│   └── model.meta
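Before starting the slow conversion in the next step, it may be worth confirming that the archive extracted completely. A minimal sketch, assuming the checkpoint files are exactly those in the tree above (the two `d0_coco_*.txt` result files are ignored). The helper name `missing_checkpoint_files` is hypothetical.

```python
from pathlib import Path

# Checkpoint files taken from the directory listing above.
EXPECTED_FILES = (
    "checkpoint",
    "model.data-00000-of-00001",
    "model.index",
    "model.meta",
)


def missing_checkpoint_files(ckpt_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]
```

For example, `missing_checkpoint_files("/media/mydisk/YOYOFile/efficientdet-d0")` should return an empty list when the extraction succeeded.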

3. Model conversion: convert to a pb file

cd /media/mydisk/MyDocuments/PyProjects/automl/efficientdet
python model_inspect.py \
    --runmode saved_model \
    --model_name efficientdet-d0 \
    --ckpt_path /media/mydisk/YOYOFile/efficientdet-d0 \
    --saved_model_dir /media/mydisk/YOYOFile/saved_model

# /media/mydisk/YOYOFile/saved_model
├── saved_model
│   ├── efficientdet-d0_frozen.pb
│   ├── saved_model.pb
│   └── variables
# saved_model.pb, 7.3MB
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/automl/efficientdet$ time python model_inspect.py     --runmode saved_model     --model_name efficientdet-d0     --ckpt_path /media/mydisk/YOYOFile/efficientdet-d0     --saved_model_dir /media/mydisk/YOYOFile/saved_model
2021-10-21 17:37:26.382558: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 17:37:39.119335: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-10-21 17:37:39.136342: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:39.136588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 17:37:39.136711: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 17:37:39.153938: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-21 17:37:39.154242: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.10
2021-10-21 17:37:39.169750: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-10-21 17:37:39.183745: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-10-21 17:37:39.196944: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-10-21 17:37:39.206568: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.10
2021-10-21 17:37:39.207341: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-21 17:37:39.207750: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:39.208261: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:39.208475: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 17:37:39.208631: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 17:37:42.797530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 17:37:42.797636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 17:37:42.797688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 17:37:42.798146: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:42.798527: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:42.798851: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:42.799117: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 576 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 17:37:43.052975: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:43.053240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 17:37:43.053618: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:43.054036: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 17:37:43.054195: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/utils.py:602: The name tf.keras.layers.enable_v2_dtype_behavior is deprecated. Please use tf.compat.v1.keras.layers.enable_v2_dtype_behavior instead.
W1021 17:37:45.469214 547581214736 module_wrapper.py:155] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/utils.py:602: The name tf.keras.layers.enable_v2_dtype_behavior is deprecated. Please use tf.compat.v1.keras.layers.enable_v2_dtype_behavior instead.
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:5049: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
W1021 17:38:48.279611 547581214736 deprecation.py:534] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:5049: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
2021-10-21 17:38:49.470314: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 31250000 Hz
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/training/moving_averages.py:457: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W1021 17:38:55.314421 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/training/moving_averages.py:457: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
W1021 17:40:41.735967 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/inference.py:582: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
W1021 17:43:51.238454 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/inference.py:582: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py:857: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
W1021 17:43:51.271648 547581214736 deprecation.py:336] From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py:857: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`

real    7m4.673s
user    6m41.776s
sys     0m7.824s
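Most of the output above is INFO-level noise from TensorFlow's C++ runtime, and the long one-line command is easy to mistype. The sketch below wraps the step 3 command in Python under two assumptions: the flags are exactly those used above, and the standard `TF_CPP_MIN_LOG_LEVEL=1` environment variable is acceptable for hiding the `I tensorflow/...` lines (Python-side `WARNING:tensorflow` messages are not affected). The function names are hypothetical.

```python
import os
import subprocess
import sys
import time


def build_export_command(model_name, ckpt_path, saved_model_dir):
    """Return the step 3 command as an argument list (no shell quoting needed)."""
    return [
        sys.executable, "model_inspect.py",
        "--runmode", "saved_model",
        "--model_name", model_name,
        "--ckpt_path", ckpt_path,
        "--saved_model_dir", saved_model_dir,
    ]


def run_timed(cmd, cwd):
    """Run cmd in directory cwd with INFO logs hidden; return elapsed seconds."""
    env = dict(os.environ, TF_CPP_MIN_LOG_LEVEL="1")  # 1 = drop C++ INFO lines
    start = time.monotonic()
    subprocess.run(cmd, cwd=cwd, env=env, check=True)
    return time.monotonic() - start


cmd = build_export_command(
    "efficientdet-d0",
    "/media/mydisk/YOYOFile/efficientdet-d0",
    "/media/mydisk/YOYOFile/saved_model",
)
print(" ".join(cmd))  # inspect the command before actually running it
```

Calling `run_timed(cmd, "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet")` would then perform the export and report the wall-clock time, which was about 7 minutes on the TX2 in the run shown above.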

4. Download the official TensorRT EfficientDet sample and generate the ONNX model

git clone https://github.com/NVIDIA/TensorRT.git
cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
# Run the conversion command and install packages as prompted. On the Jetson TX2, installing everything with `pip install -r requirements.txt` is not recommended. For example:
pip install onnx
pip install onnx-graphsurgeon --index-url https://pypi.ngc.nvidia.com
pip install tf2onnx
pip install onnxruntime
cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python create_onnx.py \
    --input_shape '1,512,512,3' \
    --saved_model /media/mydisk/YOYOFile/saved_model \
    --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx

# /home/yichao/Downloads/saved_model_onnx
├── saved_model_onnx
│   └── model.onnx
# model.onnx, 16.5MB
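The `--input_shape '1,512,512,3'` value is in NHWC order (batch, height, width, channels), and 512 is the default input resolution of EfficientDet-D0. A small sketch for building and validating that string for other variants follows. The resolution table lists the commonly published defaults and assumes the checkpoint was trained at its default size; both helper names are hypothetical.

```python
# Default square input resolutions of the EfficientDet variants (assumption:
# the checkpoint uses its default resolution).
DEFAULT_RESOLUTION = {
    "efficientdet-d0": 512,
    "efficientdet-d1": 640,
    "efficientdet-d2": 768,
    "efficientdet-d3": 896,
    "efficientdet-d4": 1024,
}


def input_shape_arg(model_name, batch_size=1):
    """Build the NHWC string for --input_shape, e.g. '1,512,512,3'."""
    size = DEFAULT_RESOLUTION[model_name]
    return f"{batch_size},{size},{size},3"


def parse_input_shape(text):
    """Parse an 'N,H,W,C' string into a tuple of four positive integers."""
    dims = tuple(int(part) for part in text.split(","))
    if len(dims) != 4 or any(d <= 0 for d in dims):
        raise ValueError(f"expected four positive integers, got {text!r}")
    return dims


print(input_shape_arg("efficientdet-d0"))  # prints 1,512,512,3
```

Passing a shape that does not match the exported model is a likely cause of conversion failures, so validating the string before the multi-minute conversion can save a rerun.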
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python create_onnx.py     --input_shape '1,512,512,3'     --saved_model /media/mydisk/YOYOFile/saved_model     --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx
2021-10-21 18:27:24.750524: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 18:27:35.154053: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-10-21 18:27:35.163132: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.163380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:27:35.163498: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 18:27:35.168944: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-21 18:27:35.169239: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.10
2021-10-21 18:27:35.172684: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-10-21 18:27:35.175041: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-10-21 18:27:35.183730: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-10-21 18:27:35.192669: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.10
2021-10-21 18:27:35.193586: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-21 18:27:35.193970: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.194398: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.194583: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:27:35.197697: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.197924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:27:35.198312: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.198736: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:35.198869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:27:35.199116: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-21 18:27:38.117646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:27:38.117758: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:27:38.117813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:27:38.118213: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:38.118654: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:38.119182: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:27:38.119460: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:31:59.010282: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-21 18:31:59.210905: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 31250000 Hz
INFO:tf2onnx.tf_loader:Signatures found in model: [serving_default].
INFO:tf2onnx.tf_loader:Output names: ['detections:0']
WARNING:tensorflow:Issue encountered when serializing global_step.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing global_step.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing moving_average_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing moving_average_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing trainable_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
WARNING:tensorflow:Issue encountered when serializing trainable_variables.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
to_proto not supported in EAGER mode.
2021-10-21 18:33:06.415085: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.415304: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-10-21 18:33:06.415593: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-10-21 18:33:06.417680: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.417887: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:33:06.418187: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.418512: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.418650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:33:06.418763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:33:06.418825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:33:06.418871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:33:06.419332: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.419687: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:33:06.419853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:33:17.734622: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1171] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 3761 nodes (0), 3553 edges (0), time = 169.426ms.
  function_optimizer: Graph size after: 3761 nodes (0), 3553 edges (0), time = 175.429ms.
Optimization results for grappler item: __inference_while_cond_22_6432
  function_optimizer: function_optimizer did nothing. time = 0.027ms.
  function_optimizer: function_optimizer did nothing. time = 0.007ms.
Optimization results for grappler item: __inference_TensorArrayV2Write_cond_true_49_3796
  function_optimizer: function_optimizer did nothing. time = 0.025ms.
  function_optimizer: function_optimizer did nothing. time = 0.007ms.
Optimization results for grappler item: __inference_while_body_23_3820
  function_optimizer: Graph size after: 29 nodes (0), 30 edges (0), time = 1.836ms.
  function_optimizer: Graph size after: 29 nodes (0), 30 edges (0), time = 1.947ms.
Optimization results for grappler item: __inference_TensorArrayV2Write_cond_false_50_233
  function_optimizer: function_optimizer did nothing. time = 0.024ms.
  function_optimizer: function_optimizer did nothing. time = 0.007ms.
2021-10-21 18:34:09.261406: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.261647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:34:09.261933: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.262202: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.262317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:34:09.262418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:34:09.262469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:34:09.262515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:34:09.262778: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.263169: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:09.263336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tf2onnx/tf_loader.py:703: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
WARNING:tensorflow:From /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tf2onnx/tf_loader.py:703: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2021-10-21 18:34:29.838950: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.839220: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-10-21 18:34:29.839566: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-10-21 18:34:29.840683: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.840922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:34:29.841253: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.841564: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.841690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:34:29.841812: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:34:29.841873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:34:29.841919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:34:29.842248: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.842606: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:34:29.842794: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:34:39.356289: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1171] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 2010 nodes (-1023), 2320 edges (-1216), time = 1219.22ms.
  function_optimizer: function_optimizer did nothing. time = 16.66ms.
  constant_folding: Graph size after: 2010 nodes (0), 2320 edges (0), time = 276.195ms.
  function_optimizer: function_optimizer did nothing. time = 15.49ms.
INFO:EfficientDetGraphSurgeon:Loaded saved model from /media/mydisk/YOYOFile/saved_model
2021-10-21 18:36:40.496763: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.497016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:36:40.497301: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.497576: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.497690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:36:40.497828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:36:40.497884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:36:40.497925: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:36:40.498179: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.498467: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:36:40.498623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
INFO:tf2onnx.tfonnx:Using tensorflow=2.5.0, onnx=1.10.1, tf2onnx=1.9.2/0f28b7
INFO:tf2onnx.tfonnx:Using opset <onnx, 11>
2021-10-21 18:45:21.030096: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.030768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:21.031559: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.032771: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.033244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:21.033721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:21.033990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:21.034280: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:21.035300: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.036304: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:21.036804: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:45:23.648050: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.648408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:23.648729: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.649003: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.649126: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:23.649279: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:23.649337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:23.649379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:23.649644: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.649943: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.650121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:45:23.685852: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.686416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:23.686875: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.687378: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.687534: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:23.687642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:23.687700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:23.687743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:23.688074: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.688453: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.688630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-21 18:45:23.831418: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.831653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-21 18:45:23.831981: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.832326: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.832469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-21 18:45:23.832570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-21 18:45:23.832625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-21 18:45:23.832667: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-21 18:45:23.832958: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.833287: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-21 18:45:23.833490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1011 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
INFO:tf2onnx.tf_utils:Computed 1 values for constant folding
INFO:tf2onnx.tfonnx:folding node using tf type=ExpandDims, name=ExpandDims
INFO:tf2onnx.tfonnx:folding node type=Range, name=range_1
INFO:tf2onnx.optimizer:Optimizing ONNX model
INFO:tf2onnx.optimizer:After optimization: BatchNormalization -45 (108->63), Cast -27 (41->14), Concat -1 (21->20), Const -530 (1139->609), GlobalAveragePool +16 (0->16), GlobalMaxPool +1 (0->1), Identity -22 (22->0), Mul -2 (187->185), ReduceMax -1 (1->0), ReduceMean -16 (16->0), ReduceSum -1 (1->0), Reshape -80 (92->12), Shape -1 (17->16), Slice -4 (33->29), Squeeze -10 (29->19), Transpose -761 (777->16), Unsqueeze -12 (32->20)
INFO:EfficientDetGraphSurgeon:TF2ONNX graph created successfully
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] 'Shape tensor cast elision' routine failed with: None
INFO:EfficientDetGraphSurgeon:Graph was detected as AutoML
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
INFO:EfficientDetGraphSurgeon:ONNX graph input shape: [1, 512, 512, 3] [NHWC format detected]
INFO:EfficientDetGraphSurgeon:Found Conv node 'efficientnet-b0/stem/conv2d/Conv2D' as stem entry
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(int32) in node (truediv_1).
INFO:EfficientDetGraphSurgeon:Found Concat node 'concat' as the tip of class_net/
INFO:EfficientDetGraphSurgeon:Found Concat node 'concat_1' as the tip of box_net/
INFO:EfficientDetGraphSurgeon:Created NMS plugin 'EfficientNMS_TRT' with attributes: {'plugin_version': '1', 'background_class': -1, 'max_output_boxes': 100, 'score_threshold': 0.4000000059604645, 'iou_threshold': 0.5, 'score_activation': True, 'box_coding': 1}
Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
INFO:EfficientDetGraphSurgeon:Saved ONNX model to /media/mydisk/YOYOFile/saved_model_onnx/model.onnx

real    28m11.266s
user    27m34.024s
sys     0m10.396s
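
The conversion log above is long, so a small script can help pull out the two pieces that matter most: the tf2onnx "After optimization" node counts and the wall-clock time. The following is a minimal sketch, not part of the original workflow; the function names `parse_optimizer_summary` and `parse_time_line` are hypothetical helpers introduced here for illustration, and the regular expressions assume the log format shown above.

```python
import re

# Matches one entry of the tf2onnx summary, e.g. "Transpose -761 (777->16)".
_OP_DELTA = re.compile(r"(\w+) ([+-]\d+) \((\d+)->(\d+)\)")
# Matches one line of shell `time` output, e.g. "real    28m11.266s".
_TIME_LINE = re.compile(r"^(real|user|sys)\s+(\d+)m([\d.]+)s$")


def parse_optimizer_summary(line):
    """Return {op: (before, after)} from a tf2onnx 'After optimization' line."""
    return {op: (int(before), int(after))
            for op, _delta, before, after in _OP_DELTA.findall(line)}


def parse_time_line(line):
    """Convert one `time` output line into (label, seconds)."""
    match = _TIME_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a time line: {line!r}")
    label, minutes, seconds = match.groups()
    return label, int(minutes) * 60 + float(seconds)


# Excerpt of the optimizer line from the log above (shortened for readability).
summary = parse_optimizer_summary(
    "INFO:tf2onnx.optimizer:After optimization: "
    "BatchNormalization -45 (108->63), Transpose -761 (777->16), "
    "GlobalAveragePool +16 (0->16)"
)
print(summary["Transpose"])  # (777, 16)
print(parse_time_line("real    28m11.266s"))
```

Read this way, the log shows that most of the Transpose nodes (777 down to 16) were removed during optimization, and that the ONNX conversion took roughly 28 minutes of wall-clock time on the TX2.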

5. Build the TensorRT engine

TensorRT FP32

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python build_engine.py \
    --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
    --precision fp32
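
If the build needs to be repeated (for example, once per precision), the same command can be driven from Python. The sketch below is an assumption-laden illustration rather than part of the original steps: `build_engine_command` and `run_build` are hypothetical helper names, and only the `--onnx`, `--engine`, and `--precision` flags shown in the command above are used.

```python
import subprocess
from pathlib import Path


def build_engine_command(onnx_path, engine_path, precision="fp32"):
    """Assemble the argument list for build_engine.py (flags as used above)."""
    return [
        "python", "build_engine.py",
        "--onnx", str(onnx_path),
        "--engine", str(engine_path),
        "--precision", precision,
    ]


def run_build(sample_dir, onnx_path, engine_path, precision="fp32"):
    """Hypothetical wrapper: check the input, then run the build in sample_dir."""
    if not Path(onnx_path).is_file():
        raise FileNotFoundError(f"ONNX model not found: {onnx_path}")
    # Create the engine's parent directory up front (harmless if it exists).
    Path(engine_path).parent.mkdir(parents=True, exist_ok=True)
    cmd = build_engine_command(onnx_path, engine_path, precision)
    # check=True raises CalledProcessError if the build exits non-zero.
    subprocess.run(cmd, cwd=str(sample_dir), check=True)


# Only assemble and print the command here; run_build() would actually start it.
cmd = build_engine_command(
    "/media/mydisk/YOYOFile/saved_model_onnx/model.onnx",
    "/media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt",
)
print(" ".join(cmd))
```

Calling `run_build` with the sample directory used above would start the actual build. The terminal session for the manual run follows.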
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python build_engine.py \
>     --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
>     --precision fp32
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +234, GPU +0, now: CPU 264, GPU 3727 (MiB)
[TensorRT] WARNING: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] WARNING: builtin_op_importers.cpp:4552: Attribute scoreBits not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[TensorRT] INFO: Successfully created plugin: BatchedNMS_TRT
INFO:EngineBuilder:Network Description
INFO:EngineBuilder:Input 'image_arrays:0' with shape (1, 512, 512, 3) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'num_detections' with shape (1,) and dtype DataType.INT32
INFO:EngineBuilder:Output 'detection_boxes' with shape (1, 100, 4) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_scores' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_classes' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Building fp32 Engine in /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt
[TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 284 MiB, GPU 3765 MiB
[TensorRT] INFO: ---------- Layers Running on DLA ----------
[TensorRT] INFO: ---------- Layers Running on GPU ----------
[TensorRT] INFO: [GpuLayer] preprocessor/transpose
[TensorRT] INFO: [GpuLayer] preprocessor/scale_value:0 + preprocessor/scale + preprocessor/mean_value:0 + preprocessor/mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/stem/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/stem/Sigmoid), efficientnet-b0/stem/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/Sigmoid), efficientnet-b0/blocks_0/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid), efficientnet-b0/blocks_0/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid_1), efficientnet-b0/blocks_0/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid), efficientnet-b0/blocks_1/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid_1), efficientnet-b0/blocks_1/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid), efficientnet-b0/blocks_1/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid_1), efficientnet-b0/blocks_1/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid), efficientnet-b0/blocks_2/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid_1), efficientnet-b0/blocks_2/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid), efficientnet-b0/blocks_2/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid_1), efficientnet-b0/blocks_2/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d_1/Conv2D + efficientnet-b0/blocks_2/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid), efficientnet-b0/blocks_3/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid_1), efficientnet-b0/blocks_3/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid), efficientnet-b0/blocks_3/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid_1), efficientnet-b0/blocks_3/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid), efficientnet-b0/blocks_4/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid_1), efficientnet-b0/blocks_4/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid), efficientnet-b0/blocks_4/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid_1), efficientnet-b0/blocks_4/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d_1/Conv2D + efficientnet-b0/blocks_4/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid), efficientnet-b0/blocks_5/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid_1), efficientnet-b0/blocks_5/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid), efficientnet-b0/blocks_5/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid_1), efficientnet-b0/blocks_5/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid), efficientnet-b0/blocks_6/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid_1), efficientnet-b0/blocks_6/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid), efficientnet-b0/blocks_6/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid_1), efficientnet-b0/blocks_6/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d_1/Conv2D + efficientnet-b0/blocks_6/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid), efficientnet-b0/blocks_7/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid_1), efficientnet-b0/blocks_7/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid), efficientnet-b0/blocks_7/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid_1), efficientnet-b0/blocks_7/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d_1/Conv2D + efficientnet-b0/blocks_7/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid), efficientnet-b0/blocks_8/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid_1), efficientnet-b0/blocks_8/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid), efficientnet-b0/blocks_8/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid_1), efficientnet-b0/blocks_8/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid), efficientnet-b0/blocks_9/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid_1), efficientnet-b0/blocks_9/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid), efficientnet-b0/blocks_9/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid_1), efficientnet-b0/blocks_9/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d_1/Conv2D + efficientnet-b0/blocks_9/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid), efficientnet-b0/blocks_10/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid_1), efficientnet-b0/blocks_10/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid), efficientnet-b0/blocks_10/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid_1), efficientnet-b0/blocks_10/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d_1/Conv2D + efficientnet-b0/blocks_10/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid), efficientnet-b0/blocks_11/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid_1), efficientnet-b0/blocks_11/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid), efficientnet-b0/blocks_11/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid_1), efficientnet-b0/blocks_11/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid), efficientnet-b0/blocks_12/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid_1), efficientnet-b0/blocks_12/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid), efficientnet-b0/blocks_12/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid_1), efficientnet-b0/blocks_12/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d_1/Conv2D + efficientnet-b0/blocks_12/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid), efficientnet-b0/blocks_13/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid_1), efficientnet-b0/blocks_13/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid), efficientnet-b0/blocks_13/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid_1), efficientnet-b0/blocks_13/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d_1/Conv2D + efficientnet-b0/blocks_13/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid), efficientnet-b0/blocks_14/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid_1), efficientnet-b0/blocks_14/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid), efficientnet-b0/blocks_14/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid_1), efficientnet-b0/blocks_14/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d_1/Conv2D + efficientnet-b0/blocks_14/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid), efficientnet-b0/blocks_15/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid_1), efficientnet-b0/blocks_15/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid), efficientnet-b0/blocks_15/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid_1), efficientnet-b0/blocks_15/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] resample_p6/conv2d/BiasAdd || fpn_cells/cell_0/fnode5/resample_0_2_10/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] resample_p6/max_pooling2d/MaxPool
[TensorRT] INFO: [GpuLayer] resample_p7/max_pooling2d_1/MaxPool
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/resample_0_2_10/bn/FusedBatchNormV3__894
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/add_n_1/add
[TensorRT] INFO: [GpuLayer] resize_nearest_1
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_0/fnode5/op_after_combine10/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise__898
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode0/mul_1:0 + (Unnamed Layer* 284) [Shuffle] + fpn_cells/cell_0/fnode0/truediv_1, PWN(fpn_cells/cell_0/fnode0/mul:0 + (Unnamed Layer* 274) [Shuffle] + fpn_cells/cell_0/fnode0/truediv, fpn_cells/cell_0/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_0/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/resample_2_10_11/max_pooling2d_4/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_2
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/mul_1:0 + (Unnamed Layer* 313) [Shuffle] + fpn_cells/cell_0/fnode1/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/resample_0_2_6/conv2d/BiasAdd + fpn_cells/cell_0/fnode1/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(PWN(fpn_cells/cell_0/fnode6/mul_1:0 + (Unnamed Layer* 308) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_1, PWN(fpn_cells/cell_0/fnode6/mul:0 + (Unnamed Layer* 271) [Shuffle] + fpn_cells/cell_0/fnode6/truediv, fpn_cells/cell_0/fnode6/add_n_1/add)), PWN(fpn_cells/cell_0/fnode6/mul_2:0 + (Unnamed Layer* 305) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_2, fpn_cells/cell_0/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_0/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_0/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_0/fnode1/op_after_combine6/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/resample_1_11_12/max_pooling2d_5/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_3
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/mul_1:0 + (Unnamed Layer* 339) [Shuffle] + fpn_cells/cell_0/fnode2/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/resample_0_1_7/conv2d/BiasAdd + fpn_cells/cell_0/fnode2/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode7/mul_1:0 + (Unnamed Layer* 336) [Shuffle] + fpn_cells/cell_0/fnode7/truediv_1, PWN(fpn_cells/cell_0/fnode7/mul:0 + (Unnamed Layer* 278) [Shuffle] + fpn_cells/cell_0/fnode7/truediv, fpn_cells/cell_0/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_0/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_0/fnode2/op_after_combine7/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/mul_1:0 + (Unnamed Layer* 357) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_1
[TensorRT] INFO: [GpuLayer] resize_nearest_4
[TensorRT] INFO: [GpuLayer] resize_nearest_5
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_0_1_9/conv2d/BiasAdd + fpn_cells/cell_0/fnode4/add_n_1/add
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/mul_1:0 + (Unnamed Layer* 366) [Shuffle] + fpn_cells/cell_0/fnode3/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/resample_0_0_8/conv2d/BiasAdd + fpn_cells/cell_0/fnode3/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode0/mul_1:0 + (Unnamed Layer* 362) [Shuffle] + fpn_cells/cell_1/fnode0/truediv_1, PWN(fpn_cells/cell_1/fnode0/mul:0 + (Unnamed Layer* 331) [Shuffle] + fpn_cells/cell_1/fnode0/truediv, fpn_cells/cell_1/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_1/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_0/fnode3/op_after_combine8/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_2_8_9/max_pooling2d_2/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_6
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode1/mul_1:0 + (Unnamed Layer* 390) [Shuffle] + fpn_cells/cell_1/fnode1/truediv_1, PWN(fpn_cells/cell_1/fnode1/mul:0 + (Unnamed Layer* 300) [Shuffle] + fpn_cells/cell_1/fnode1/truediv, fpn_cells/cell_1/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_1/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode4/mul_2:0 + (Unnamed Layer* 393) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_2, fpn_cells/cell_0/fnode4/add_n_1/add_1), PWN(PWN(fpn_cells/cell_0/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_0/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_7
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode2/mul_1:0 + (Unnamed Layer* 419) [Shuffle] + fpn_cells/cell_1/fnode2/truediv_1, PWN(fpn_cells/cell_1/fnode2/mul:0 + (Unnamed Layer* 414) [Shuffle] + fpn_cells/cell_1/fnode2/truediv, fpn_cells/cell_1/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_1/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_8
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode3/mul_1:0 + (Unnamed Layer* 433) [Shuffle] + fpn_cells/cell_1/fnode3/truediv_1, PWN(fpn_cells/cell_1/fnode3/mul:0 + (Unnamed Layer* 384) [Shuffle] + fpn_cells/cell_1/fnode3/truediv, fpn_cells/cell_1/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_1/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/resample_2_8_9/max_pooling2d_6/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode4/mul_2:0 + (Unnamed Layer* 446) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode4/mul_1:0 + (Unnamed Layer* 428) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_1, PWN(fpn_cells/cell_1/fnode4/mul:0 + (Unnamed Layer* 411) [Shuffle] + fpn_cells/cell_1/fnode4/truediv, fpn_cells/cell_1/fnode4/add_n_1/add)), fpn_cells/cell_1/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_1/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/resample_2_9_10/max_pooling2d_7/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode5/mul_2:0 + (Unnamed Layer* 462) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode5/mul_1:0 + (Unnamed Layer* 408) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_1, PWN(fpn_cells/cell_1/fnode5/mul:0 + (Unnamed Layer* 297) [Shuffle] + fpn_cells/cell_1/fnode5/truediv, fpn_cells/cell_1/fnode5/add_n_1/add)), fpn_cells/cell_1/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_1/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/resample_2_10_11/max_pooling2d_8/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode6/mul_2:0 + (Unnamed Layer* 478) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode6/mul_1:0 + (Unnamed Layer* 381) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_1, PWN(fpn_cells/cell_1/fnode6/mul:0 + (Unnamed Layer* 328) [Shuffle] + fpn_cells/cell_1/fnode6/truediv, fpn_cells/cell_1/fnode6/add_n_1/add)), fpn_cells/cell_1/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_1/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/resample_1_11_12/max_pooling2d_9/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode7/mul_1:0 + (Unnamed Layer* 494) [Shuffle] + fpn_cells/cell_1/fnode7/truediv_1, PWN(fpn_cells/cell_1/fnode7/mul:0 + (Unnamed Layer* 354) [Shuffle] + fpn_cells/cell_1/fnode7/truediv, fpn_cells/cell_1/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_1/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_9
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode0/mul_1:0 + (Unnamed Layer* 507) [Shuffle] + fpn_cells/cell_2/fnode0/truediv_1, PWN(fpn_cells/cell_2/fnode0/mul:0 + (Unnamed Layer* 490) [Shuffle] + fpn_cells/cell_2/fnode0/truediv, fpn_cells/cell_2/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_2/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_10
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode1/mul_1:0 + (Unnamed Layer* 521) [Shuffle] + fpn_cells/cell_2/fnode1/truediv_1, PWN(fpn_cells/cell_2/fnode1/mul:0 + (Unnamed Layer* 474) [Shuffle] + fpn_cells/cell_2/fnode1/truediv, fpn_cells/cell_2/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_2/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_11
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode2/mul_1:0 + (Unnamed Layer* 535) [Shuffle] + fpn_cells/cell_2/fnode2/truediv_1, PWN(fpn_cells/cell_2/fnode2/mul:0 + (Unnamed Layer* 458) [Shuffle] + fpn_cells/cell_2/fnode2/truediv, fpn_cells/cell_2/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_2/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_12
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode3/mul_1:0 + (Unnamed Layer* 549) [Shuffle] + fpn_cells/cell_2/fnode3/truediv_1, PWN(fpn_cells/cell_2/fnode3/mul:0 + (Unnamed Layer* 442) [Shuffle] + fpn_cells/cell_2/fnode3/truediv, fpn_cells/cell_2/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_2/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/resample_2_8_9/max_pooling2d_10/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode4/mul_2:0 + (Unnamed Layer* 561) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode4/mul_1:0 + (Unnamed Layer* 544) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_1, PWN(fpn_cells/cell_2/fnode4/mul:0 + (Unnamed Layer* 455) [Shuffle] + fpn_cells/cell_2/fnode4/truediv, fpn_cells/cell_2/fnode4/add_n_1/add)), fpn_cells/cell_2/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_2/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid), class_net/mul)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid), box_net/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/resample_2_9_10/max_pooling2d_11/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_1), class_net/mul_1)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_1), box_net/mul_1)
[TensorRT] INFO: [GpuLayer] class_net/class-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode5/mul_2:0 + (Unnamed Layer* 589) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode5/mul_1:0 + (Unnamed Layer* 530) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_1, PWN(fpn_cells/cell_2/fnode5/mul:0 + (Unnamed Layer* 471) [Shuffle] + fpn_cells/cell_2/fnode5/truediv, fpn_cells/cell_2/fnode5/add_n_1/add)), fpn_cells/cell_2/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_2/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_3), class_net/mul_3)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_3), box_net/mul_3)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_2), class_net/mul_2)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_2), box_net/mul_2)
[TensorRT] INFO: [GpuLayer] class_net/class-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/resample_2_10_11/max_pooling2d_12/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_4), class_net/mul_4)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_4), box_net/mul_4)
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd__1552 + Reshape
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd__1591 + Reshape_1
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode6/mul_2:0 + (Unnamed Layer* 633) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode6/mul_1:0 + (Unnamed Layer* 516) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_1, PWN(fpn_cells/cell_2/fnode6/mul:0 + (Unnamed Layer* 487) [Shuffle] + fpn_cells/cell_2/fnode6/truediv, fpn_cells/cell_2/fnode6/add_n_1/add)), fpn_cells/cell_2/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_2/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_6), class_net/mul_6)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_6), box_net/mul_6)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_5), class_net/mul_5)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_5), box_net/mul_5)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/resample_1_11_12/max_pooling2d_13/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_7), class_net/mul_7)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_7), box_net/mul_7)
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd__1482 + Reshape_2
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd__1517 + Reshape_3
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode7/mul_1:0 + (Unnamed Layer* 681) [Shuffle] + fpn_cells/cell_2/fnode7/truediv_1, PWN(fpn_cells/cell_2/fnode7/mul:0 + (Unnamed Layer* 503) [Shuffle] + fpn_cells/cell_2/fnode7/truediv, fpn_cells/cell_2/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_2/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_9), class_net/mul_9)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_9), box_net/mul_9)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_8), class_net/mul_8)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_8), box_net/mul_8)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_10), class_net/mul_10)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_10), box_net/mul_10)
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd__1412 + Reshape_4
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd__1447 + Reshape_5
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_12), class_net/mul_12)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_12), box_net/mul_12)
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_11), class_net/mul_11)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_11), box_net/mul_11)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_13), class_net/mul_13)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_13), box_net/mul_13)
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd__1342 + Reshape_6
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd__1377 + Reshape_7
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_14), class_net/mul_14)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_14), box_net/mul_14)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd__1272 + Reshape_8
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd__1307 + Reshape_9
[TensorRT] INFO: [GpuLayer] Reshape:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_2:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_4:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_6:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_8:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_1:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_3:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_5:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_7:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_9:0 copy
[TensorRT] INFO: [GpuLayer] unstack
[TensorRT] INFO: [GpuLayer] unstack_0
[TensorRT] INFO: [GpuLayer] unstack_1
[TensorRT] INFO: [GpuLayer] unstack_2
[TensorRT] INFO: [GpuLayer] PWN(nms/class_net_sigmoid)
[TensorRT] INFO: [GpuLayer] unstack__1596
[TensorRT] INFO: [GpuLayer] unstack__1595
[TensorRT] INFO: [GpuLayer] unstack__1594
[TensorRT] INFO: [GpuLayer] unstack__1593
[TensorRT] INFO: [GpuLayer] sub_3:0
[TensorRT] INFO: [GpuLayer] sub_2:0
[TensorRT] INFO: [GpuLayer] sub_3:0_3
[TensorRT] INFO: [GpuLayer] sub_2:0_4
[TensorRT] INFO: [GpuLayer] truediv_5:0
[TensorRT] INFO: [GpuLayer] PWN(mul_5, add_6)
[TensorRT] INFO: [GpuLayer] truediv_4:0
[TensorRT] INFO: [GpuLayer] PWN(mul_4, add_5)
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0 + (Unnamed Layer* 813) [Shuffle], PWN(PWN(Exp, mul_2), truediv_7))
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0_5 + (Unnamed Layer* 816) [Shuffle], PWN(PWN(Exp_1, mul_3), truediv_6))
[TensorRT] INFO: [GpuLayer] sub_5
[TensorRT] INFO: [GpuLayer] add_8
[TensorRT] INFO: [GpuLayer] sub_4
[TensorRT] INFO: [GpuLayer] add_7
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600:0 copy
[TensorRT] INFO: [GpuLayer] nms/box_net_reshape
[TensorRT] INFO: [GpuLayer] nms/non_maximum_suppression
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +167, GPU +171, now: CPU 455, GPU 3945 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +249, GPU +251, now: CPU 704, GPU 4196 (MiB)
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 1686 detected for tactic 4.
Try decreasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] INFO: Total Host Persistent Memory: 345200
[TensorRT] INFO: Total Device Persistent Memory: 14929920
[TensorRT] INFO: Total Scratch Memory: 107589120
[TensorRT] INFO: [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 12 MiB, GPU 1078 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 986, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +1, GPU +0, now: CPU 987, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 986, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 986, GPU 4853 (MiB)
[TensorRT] INFO: [MemUsageSnapshot] Builder end: CPU 985 MiB, GPU 4853 MiB
INFO:EngineBuilder:Serializing engine to file: /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt

real    11m45.193s
user    9m46.616s
sys     0m49.560s

# engine.trt: 29.2 MB
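
The out-of-memory tactic message in the log above is easy to miss among the hundreds of `[GpuLayer]` lines. As a rough sketch, a saved build log could be scanned for it as shown here. The helper names `find_oom_tactics` and `count_gpu_layers` are hypothetical (they are not part of TensorRT or of the efficientdet sample), and the sample lines are copied from the log above.

```python
import re

# Matches the builder message emitted when a tactic needs more device memory
# than is currently available.
OOM_RE = re.compile(r"Tactic Device request: (\d+)MB Available: (\d+)MB")


def find_oom_tactics(log_text):
    """Return (requested_mb, available_mb) for every out-of-memory tactic message."""
    return [(int(req), int(avail)) for req, avail in OOM_RE.findall(log_text)]


def count_gpu_layers(log_text):
    """Count the '[GpuLayer]' entries in the log text."""
    return sum(1 for line in log_text.splitlines() if "[GpuLayer]" in line)


# A few lines taken verbatim from the FP32 build log.
sample = """\
[TensorRT] INFO: [GpuLayer] nms/box_net_reshape
[TensorRT] INFO: [GpuLayer] nms/non_maximum_suppression
[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 1686 detected for tactic 4.
"""

for requested, available in find_oom_tactics(sample):
    shortfall = requested - available
    print(f"tactic requested {requested} MB, {available} MB available "
          f"(short by {shortfall} MB)")
print("GPU layers in sample:", count_gpu_layers(sample))
```

In this run the builder skipped one tactic that asked for 150 MB more than was available. The build still completed, as the "Serializing engine to file" line shows.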

TensorRT FP16

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python build_engine.py \
    --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
    --precision fp16

(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python build_engine.py \
>     --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
>     --precision fp16
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +234, GPU +0, now: CPU 264, GPU 3391 (MiB)
[TensorRT] WARNING: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: ResizeNearest_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: ResizeNearest_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] INFO: Successfully created plugin: ResizeNearest_TRT
[TensorRT] INFO: No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[TensorRT] INFO: Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[TensorRT] WARNING: builtin_op_importers.cpp:4552: Attribute scoreBits not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[TensorRT] INFO: Successfully created plugin: BatchedNMS_TRT
INFO:EngineBuilder:Network Description
INFO:EngineBuilder:Input 'image_arrays:0' with shape (1, 512, 512, 3) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'num_detections' with shape (1,) and dtype DataType.INT32
INFO:EngineBuilder:Output 'detection_boxes' with shape (1, 100, 4) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_scores' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Output 'detection_classes' with shape (1, 100) and dtype DataType.FLOAT
INFO:EngineBuilder:Building fp16 Engine in /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt
WARNING:EngineBuilder:FP16 is supported natively on this platform/device
[TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 284 MiB, GPU 3428 MiB
[TensorRT] INFO: ---------- Layers Running on DLA ----------
[TensorRT] INFO: ---------- Layers Running on GPU ----------
[TensorRT] INFO: [GpuLayer] preprocessor/transpose
[TensorRT] INFO: [GpuLayer] preprocessor/scale_value:0 + preprocessor/scale + preprocessor/mean_value:0 + preprocessor/mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/stem/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/stem/Sigmoid), efficientnet-b0/stem/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/Sigmoid), efficientnet-b0/blocks_0/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid), efficientnet-b0/blocks_0/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_0/se/Sigmoid_1), efficientnet-b0/blocks_0/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_0/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid), efficientnet-b0/blocks_1/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/Sigmoid_1), efficientnet-b0/blocks_1/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid), efficientnet-b0/blocks_1/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_1/se/Sigmoid_1), efficientnet-b0/blocks_1/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_1/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid), efficientnet-b0/blocks_2/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/Sigmoid_1), efficientnet-b0/blocks_2/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid), efficientnet-b0/blocks_2/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_2/se/Sigmoid_1), efficientnet-b0/blocks_2/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_2/conv2d_1/Conv2D + efficientnet-b0/blocks_2/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid), efficientnet-b0/blocks_3/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/Sigmoid_1), efficientnet-b0/blocks_3/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid), efficientnet-b0/blocks_3/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_3/se/Sigmoid_1), efficientnet-b0/blocks_3/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_3/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid), efficientnet-b0/blocks_4/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/Sigmoid_1), efficientnet-b0/blocks_4/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid), efficientnet-b0/blocks_4/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_4/se/Sigmoid_1), efficientnet-b0/blocks_4/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_4/conv2d_1/Conv2D + efficientnet-b0/blocks_4/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid), efficientnet-b0/blocks_5/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/Sigmoid_1), efficientnet-b0/blocks_5/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid), efficientnet-b0/blocks_5/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_5/se/Sigmoid_1), efficientnet-b0/blocks_5/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_5/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid), efficientnet-b0/blocks_6/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/Sigmoid_1), efficientnet-b0/blocks_6/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid), efficientnet-b0/blocks_6/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_6/se/Sigmoid_1), efficientnet-b0/blocks_6/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_6/conv2d_1/Conv2D + efficientnet-b0/blocks_6/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid), efficientnet-b0/blocks_7/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/Sigmoid_1), efficientnet-b0/blocks_7/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid), efficientnet-b0/blocks_7/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_7/se/Sigmoid_1), efficientnet-b0/blocks_7/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_7/conv2d_1/Conv2D + efficientnet-b0/blocks_7/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid), efficientnet-b0/blocks_8/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/Sigmoid_1), efficientnet-b0/blocks_8/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid), efficientnet-b0/blocks_8/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_8/se/Sigmoid_1), efficientnet-b0/blocks_8/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_8/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid), efficientnet-b0/blocks_9/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/Sigmoid_1), efficientnet-b0/blocks_9/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid), efficientnet-b0/blocks_9/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_9/se/Sigmoid_1), efficientnet-b0/blocks_9/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_9/conv2d_1/Conv2D + efficientnet-b0/blocks_9/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid), efficientnet-b0/blocks_10/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/Sigmoid_1), efficientnet-b0/blocks_10/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid), efficientnet-b0/blocks_10/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_10/se/Sigmoid_1), efficientnet-b0/blocks_10/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_10/conv2d_1/Conv2D + efficientnet-b0/blocks_10/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid), efficientnet-b0/blocks_11/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/Sigmoid_1), efficientnet-b0/blocks_11/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid), efficientnet-b0/blocks_11/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_11/se/Sigmoid_1), efficientnet-b0/blocks_11/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_11/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid), efficientnet-b0/blocks_12/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/Sigmoid_1), efficientnet-b0/blocks_12/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid), efficientnet-b0/blocks_12/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_12/se/Sigmoid_1), efficientnet-b0/blocks_12/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_12/conv2d_1/Conv2D + efficientnet-b0/blocks_12/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid), efficientnet-b0/blocks_13/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/Sigmoid_1), efficientnet-b0/blocks_13/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid), efficientnet-b0/blocks_13/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_13/se/Sigmoid_1), efficientnet-b0/blocks_13/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_13/conv2d_1/Conv2D + efficientnet-b0/blocks_13/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid), efficientnet-b0/blocks_14/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/Sigmoid_1), efficientnet-b0/blocks_14/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid), efficientnet-b0/blocks_14/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_14/se/Sigmoid_1), efficientnet-b0/blocks_14/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_14/conv2d_1/Conv2D + efficientnet-b0/blocks_14/Add
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d/Conv2D
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid), efficientnet-b0/blocks_15/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/depthwise_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/Sigmoid_1), efficientnet-b0/blocks_15/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/Mean
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid), efficientnet-b0/blocks_15/se/mul)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/se/conv2d_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(efficientnet-b0/blocks_15/se/Sigmoid_1), efficientnet-b0/blocks_15/se/mul_1)
[TensorRT] INFO: [GpuLayer] efficientnet-b0/blocks_15/conv2d_1/Conv2D
[TensorRT] INFO: [GpuLayer] resample_p6/conv2d/BiasAdd || fpn_cells/cell_0/fnode5/resample_0_2_10/conv2d/BiasAdd
[TensorRT] INFO: [GpuLayer] resample_p6/max_pooling2d/MaxPool
[TensorRT] INFO: [GpuLayer] resample_p7/max_pooling2d_1/MaxPool
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/resample_0_2_10/bn/FusedBatchNormV3__894
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/add_n_1/add
[TensorRT] INFO: [GpuLayer] resize_nearest_1
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_0/fnode5/op_after_combine10/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise__898
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode0/mul_1:0 + (Unnamed Layer* 284) [Shuffle] + fpn_cells/cell_0/fnode0/truediv_1, PWN(fpn_cells/cell_0/fnode0/mul:0 + (Unnamed Layer* 274) [Shuffle] + fpn_cells/cell_0/fnode0/truediv, fpn_cells/cell_0/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_0/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/resample_2_10_11/max_pooling2d_4/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_2
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/mul_1:0 + (Unnamed Layer* 313) [Shuffle] + fpn_cells/cell_0/fnode1/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/resample_0_2_6/conv2d/BiasAdd + fpn_cells/cell_0/fnode1/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(PWN(fpn_cells/cell_0/fnode6/mul_1:0 + (Unnamed Layer* 308) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_1, PWN(fpn_cells/cell_0/fnode6/mul:0 + (Unnamed Layer* 271) [Shuffle] + fpn_cells/cell_0/fnode6/truediv, fpn_cells/cell_0/fnode6/add_n_1/add)), PWN(fpn_cells/cell_0/fnode6/mul_2:0 + (Unnamed Layer* 305) [Shuffle] + fpn_cells/cell_0/fnode6/truediv_2, fpn_cells/cell_0/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_0/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_0/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_0/fnode1/op_after_combine6/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/resample_1_11_12/max_pooling2d_5/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_3
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/mul_1:0 + (Unnamed Layer* 339) [Shuffle] + fpn_cells/cell_0/fnode2/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/resample_0_1_7/conv2d/BiasAdd + fpn_cells/cell_0/fnode2/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode7/mul_1:0 + (Unnamed Layer* 336) [Shuffle] + fpn_cells/cell_0/fnode7/truediv_1, PWN(fpn_cells/cell_0/fnode7/mul:0 + (Unnamed Layer* 278) [Shuffle] + fpn_cells/cell_0/fnode7/truediv, fpn_cells/cell_0/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_0/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_0/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_0/fnode2/op_after_combine7/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/mul_1:0 + (Unnamed Layer* 357) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_1
[TensorRT] INFO: [GpuLayer] resize_nearest_4
[TensorRT] INFO: [GpuLayer] resize_nearest_5
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_0_1_9/conv2d/BiasAdd + fpn_cells/cell_0/fnode4/add_n_1/add
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/mul_1:0 + (Unnamed Layer* 366) [Shuffle] + fpn_cells/cell_0/fnode3/truediv_1
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/resample_0_0_8/conv2d/BiasAdd + fpn_cells/cell_0/fnode3/add_n_1/add
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode0/mul_1:0 + (Unnamed Layer* 362) [Shuffle] + fpn_cells/cell_1/fnode0/truediv_1, PWN(fpn_cells/cell_1/fnode0/mul:0 + (Unnamed Layer* 331) [Shuffle] + fpn_cells/cell_1/fnode0/truediv, fpn_cells/cell_1/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_1/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_0/fnode3/op_after_combine8/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/resample_2_8_9/max_pooling2d_2/MaxPool
[TensorRT] INFO: [GpuLayer] resize_nearest_6
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode1/mul_1:0 + (Unnamed Layer* 390) [Shuffle] + fpn_cells/cell_1/fnode1/truediv_1, PWN(fpn_cells/cell_1/fnode1/mul:0 + (Unnamed Layer* 300) [Shuffle] + fpn_cells/cell_1/fnode1/truediv, fpn_cells/cell_1/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_1/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_0/fnode4/mul_2:0 + (Unnamed Layer* 393) [Shuffle] + fpn_cells/cell_0/fnode4/truediv_2, fpn_cells/cell_0/fnode4/add_n_1/add_1), PWN(PWN(fpn_cells/cell_0/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_0/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_0/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_7
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode2/mul_1:0 + (Unnamed Layer* 419) [Shuffle] + fpn_cells/cell_1/fnode2/truediv_1, PWN(fpn_cells/cell_1/fnode2/mul:0 + (Unnamed Layer* 414) [Shuffle] + fpn_cells/cell_1/fnode2/truediv, fpn_cells/cell_1/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_1/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_8
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode3/mul_1:0 + (Unnamed Layer* 433) [Shuffle] + fpn_cells/cell_1/fnode3/truediv_1, PWN(fpn_cells/cell_1/fnode3/mul:0 + (Unnamed Layer* 384) [Shuffle] + fpn_cells/cell_1/fnode3/truediv, fpn_cells/cell_1/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_1/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/resample_2_8_9/max_pooling2d_6/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode4/mul_2:0 + (Unnamed Layer* 446) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode4/mul_1:0 + (Unnamed Layer* 428) [Shuffle] + fpn_cells/cell_1/fnode4/truediv_1, PWN(fpn_cells/cell_1/fnode4/mul:0 + (Unnamed Layer* 411) [Shuffle] + fpn_cells/cell_1/fnode4/truediv, fpn_cells/cell_1/fnode4/add_n_1/add)), fpn_cells/cell_1/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_1/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/resample_2_9_10/max_pooling2d_7/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode5/mul_2:0 + (Unnamed Layer* 462) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode5/mul_1:0 + (Unnamed Layer* 408) [Shuffle] + fpn_cells/cell_1/fnode5/truediv_1, PWN(fpn_cells/cell_1/fnode5/mul:0 + (Unnamed Layer* 297) [Shuffle] + fpn_cells/cell_1/fnode5/truediv, fpn_cells/cell_1/fnode5/add_n_1/add)), fpn_cells/cell_1/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_1/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/resample_2_10_11/max_pooling2d_8/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode6/mul_2:0 + (Unnamed Layer* 478) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_1/fnode6/mul_1:0 + (Unnamed Layer* 381) [Shuffle] + fpn_cells/cell_1/fnode6/truediv_1, PWN(fpn_cells/cell_1/fnode6/mul:0 + (Unnamed Layer* 328) [Shuffle] + fpn_cells/cell_1/fnode6/truediv, fpn_cells/cell_1/fnode6/add_n_1/add)), fpn_cells/cell_1/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_1/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_1/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/resample_1_11_12/max_pooling2d_9/MaxPool
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_1/fnode7/mul_1:0 + (Unnamed Layer* 494) [Shuffle] + fpn_cells/cell_1/fnode7/truediv_1, PWN(fpn_cells/cell_1/fnode7/mul:0 + (Unnamed Layer* 354) [Shuffle] + fpn_cells/cell_1/fnode7/truediv, fpn_cells/cell_1/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_1/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_1/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_1/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_9
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode0/mul_1:0 + (Unnamed Layer* 507) [Shuffle] + fpn_cells/cell_2/fnode0/truediv_1, PWN(fpn_cells/cell_2/fnode0/mul:0 + (Unnamed Layer* 490) [Shuffle] + fpn_cells/cell_2/fnode0/truediv, fpn_cells/cell_2/fnode0/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode0/op_after_combine5/Sigmoid), fpn_cells/cell_2/fnode0/op_after_combine5/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode0/op_after_combine5/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_10
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode1/mul_1:0 + (Unnamed Layer* 521) [Shuffle] + fpn_cells/cell_2/fnode1/truediv_1, PWN(fpn_cells/cell_2/fnode1/mul:0 + (Unnamed Layer* 474) [Shuffle] + fpn_cells/cell_2/fnode1/truediv, fpn_cells/cell_2/fnode1/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode1/op_after_combine6/Sigmoid), fpn_cells/cell_2/fnode1/op_after_combine6/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode1/op_after_combine6/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_11
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode2/mul_1:0 + (Unnamed Layer* 535) [Shuffle] + fpn_cells/cell_2/fnode2/truediv_1, PWN(fpn_cells/cell_2/fnode2/mul:0 + (Unnamed Layer* 458) [Shuffle] + fpn_cells/cell_2/fnode2/truediv, fpn_cells/cell_2/fnode2/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode2/op_after_combine7/Sigmoid), fpn_cells/cell_2/fnode2/op_after_combine7/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode2/op_after_combine7/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] resize_nearest_12
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode3/mul_1:0 + (Unnamed Layer* 549) [Shuffle] + fpn_cells/cell_2/fnode3/truediv_1, PWN(fpn_cells/cell_2/fnode3/mul:0 + (Unnamed Layer* 442) [Shuffle] + fpn_cells/cell_2/fnode3/truediv, fpn_cells/cell_2/fnode3/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode3/op_after_combine8/Sigmoid), fpn_cells/cell_2/fnode3/op_after_combine8/mul))
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode3/op_after_combine8/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/resample_2_8_9/max_pooling2d_10/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode4/mul_2:0 + (Unnamed Layer* 561) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode4/mul_1:0 + (Unnamed Layer* 544) [Shuffle] + fpn_cells/cell_2/fnode4/truediv_1, PWN(fpn_cells/cell_2/fnode4/mul:0 + (Unnamed Layer* 455) [Shuffle] + fpn_cells/cell_2/fnode4/truediv, fpn_cells/cell_2/fnode4/add_n_1/add)), fpn_cells/cell_2/fnode4/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode4/op_after_combine9/Sigmoid), fpn_cells/cell_2/fnode4/op_after_combine9/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid), class_net/mul)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid), box_net/mul)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode4/op_after_combine9/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1/BiasAdd
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/resample_2_9_10/max_pooling2d_11/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_1), class_net/mul_1)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_1), box_net/mul_1)
[TensorRT] INFO: [GpuLayer] class_net/class-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode5/mul_2:0 + (Unnamed Layer* 589) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode5/mul_1:0 + (Unnamed Layer* 530) [Shuffle] + fpn_cells/cell_2/fnode5/truediv_1, PWN(fpn_cells/cell_2/fnode5/mul:0 + (Unnamed Layer* 471) [Shuffle] + fpn_cells/cell_2/fnode5/truediv, fpn_cells/cell_2/fnode5/add_n_1/add)), fpn_cells/cell_2/fnode5/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode5/op_after_combine10/Sigmoid), fpn_cells/cell_2/fnode5/op_after_combine10/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_3), class_net/mul_3)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_3), box_net/mul_3)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode5/op_after_combine10/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_2), class_net/mul_2)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_2), box_net/mul_2)
[TensorRT] INFO: [GpuLayer] class_net/class-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/resample_2_10_11/max_pooling2d_12/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_4), class_net/mul_4)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_4), box_net/mul_4)
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict/BiasAdd__1552 + Reshape
[TensorRT] INFO: [GpuLayer] box_net/box-predict/BiasAdd__1591 + Reshape_1
[TensorRT] INFO: [GpuLayer] class_net/class-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_1/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode6/mul_2:0 + (Unnamed Layer* 633) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_2, PWN(PWN(fpn_cells/cell_2/fnode6/mul_1:0 + (Unnamed Layer* 516) [Shuffle] + fpn_cells/cell_2/fnode6/truediv_1, PWN(fpn_cells/cell_2/fnode6/mul:0 + (Unnamed Layer* 487) [Shuffle] + fpn_cells/cell_2/fnode6/truediv, fpn_cells/cell_2/fnode6/add_n_1/add)), fpn_cells/cell_2/fnode6/add_n_1/add_1)), PWN(PWN(fpn_cells/cell_2/fnode6/op_after_combine11/Sigmoid), fpn_cells/cell_2/fnode6/op_after_combine11/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_6), class_net/mul_6)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_6), box_net/mul_6)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode6/op_after_combine11/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_5), class_net/mul_5)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_5), box_net/mul_5)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/resample_1_11_12/max_pooling2d_13/MaxPool
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_7), class_net/mul_7)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_7), box_net/mul_7)
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_1/BiasAdd__1482 + Reshape_2
[TensorRT] INFO: [GpuLayer] box_net/box-predict_1/BiasAdd__1517 + Reshape_3
[TensorRT] INFO: [GpuLayer] class_net/class-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_2/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(fpn_cells/cell_2/fnode7/mul_1:0 + (Unnamed Layer* 681) [Shuffle] + fpn_cells/cell_2/fnode7/truediv_1, PWN(fpn_cells/cell_2/fnode7/mul:0 + (Unnamed Layer* 503) [Shuffle] + fpn_cells/cell_2/fnode7/truediv, fpn_cells/cell_2/fnode7/add_n_1/add)), PWN(PWN(fpn_cells/cell_2/fnode7/op_after_combine12/Sigmoid), fpn_cells/cell_2/fnode7/op_after_combine12/mul))
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_9), class_net/mul_9)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_9), box_net/mul_9)
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] fpn_cells/cell_2/fnode7/op_after_combine12/conv/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_8), class_net/mul_8)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_8), box_net/mul_8)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-0_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_10), class_net/mul_10)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_10), box_net/mul_10)
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_2/BiasAdd__1412 + Reshape_4
[TensorRT] INFO: [GpuLayer] box_net/box-predict_2/BiasAdd__1447 + Reshape_5
[TensorRT] INFO: [GpuLayer] class_net/class-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_12), class_net/mul_12)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_12), box_net/mul_12)
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-1_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_11), class_net/mul_11)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_11), box_net/mul_11)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_13), class_net/mul_13)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_13), box_net/mul_13)
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_3/BiasAdd__1342 + Reshape_6
[TensorRT] INFO: [GpuLayer] box_net/box-predict_3/BiasAdd__1377 + Reshape_7
[TensorRT] INFO: [GpuLayer] class_net/class-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-2_4/BiasAdd
[TensorRT] INFO: [GpuLayer] PWN(PWN(class_net/Sigmoid_14), class_net/mul_14)
[TensorRT] INFO: [GpuLayer] PWN(PWN(box_net/Sigmoid_14), box_net/mul_14)
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/separable_conv2d/depthwise
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd
[TensorRT] INFO: [GpuLayer] class_net/class-predict_4/BiasAdd__1272 + Reshape_8
[TensorRT] INFO: [GpuLayer] box_net/box-predict_4/BiasAdd__1307 + Reshape_9
[TensorRT] INFO: [GpuLayer] Reshape:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_2:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_4:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_6:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_8:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_1:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_3:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_5:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_7:0 copy
[TensorRT] INFO: [GpuLayer] Reshape_9:0 copy
[TensorRT] INFO: [GpuLayer] unstack
[TensorRT] INFO: [GpuLayer] unstack_0
[TensorRT] INFO: [GpuLayer] unstack_1
[TensorRT] INFO: [GpuLayer] unstack_2
[TensorRT] INFO: [GpuLayer] PWN(nms/class_net_sigmoid)
[TensorRT] INFO: [GpuLayer] unstack__1596
[TensorRT] INFO: [GpuLayer] unstack__1595
[TensorRT] INFO: [GpuLayer] unstack__1594
[TensorRT] INFO: [GpuLayer] unstack__1593
[TensorRT] INFO: [GpuLayer] sub_3:0
[TensorRT] INFO: [GpuLayer] sub_2:0
[TensorRT] INFO: [GpuLayer] sub_3:0_3
[TensorRT] INFO: [GpuLayer] sub_2:0_4
[TensorRT] INFO: [GpuLayer] truediv_5:0
[TensorRT] INFO: [GpuLayer] PWN(mul_5, add_6)
[TensorRT] INFO: [GpuLayer] truediv_4:0
[TensorRT] INFO: [GpuLayer] PWN(mul_4, add_5)
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0 + (Unnamed Layer* 813) [Shuffle], PWN(PWN(Exp, mul_2), truediv_7))
[TensorRT] INFO: [GpuLayer] PWN(ConstantFolding/truediv_7_recip:0_5 + (Unnamed Layer* 816) [Shuffle], PWN(PWN(Exp_1, mul_3), truediv_6))
[TensorRT] INFO: [GpuLayer] sub_5
[TensorRT] INFO: [GpuLayer] add_8
[TensorRT] INFO: [GpuLayer] sub_4
[TensorRT] INFO: [GpuLayer] add_7
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1597:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1598:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1599:0 copy
[TensorRT] INFO: [GpuLayer] stack_2_Unsqueeze__1600:0 copy
[TensorRT] INFO: [GpuLayer] nms/box_net_reshape
[TensorRT] INFO: [GpuLayer] nms/non_maximum_suppression
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +167, GPU +287, now: CPU 455, GPU 3720 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +250, GPU +449, now: CPU 705, GPU 4169 (MiB)
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 1686 detected for tactic 4.
Try decreasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
[TensorRT] ERROR: Tactic Device request: 1679MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] WARNING: Skipping tactic 3 due to oom error on requested size of 1679 detected for tactic 4.
Try decreasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] INFO: Total Host Persistent Memory: 346688
[TensorRT] INFO: Total Device Persistent Memory: 8863232
[TensorRT] INFO: Total Scratch Memory: 107589120
[TensorRT] INFO: [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 12 MiB, GPU 1078 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1002, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1002, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1002, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1001, GPU 5823 (MiB)
[TensorRT] INFO: [MemUsageSnapshot] Builder end: CPU 998 MiB, GPU 5823 MiB
INFO:EngineBuilder:Serializing engine to file: /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt

real    30m6.750s
user    26m56.900s
sys     1m8.252s

# engine.trt, 20.6 MB
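
The two ERROR lines in the build log above report tactics that asked for more device memory than was available; the builder skipped them and the engine was still serialized. As a minimal sketch, the requested and available sizes can be pulled out of such a log to see how large the shortfall was. The helper name `tactic_shortfalls` is hypothetical and not part of TensorRT or the sample.

```python
import re

# Two ERROR lines copied from the build log above.
LOG = """\
[TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. Device memory is insufficient to use tactic.
[TensorRT] ERROR: Tactic Device request: 1679MB Available: 1536MB. Device memory is insufficient to use tactic.
"""

TACTIC_OOM = re.compile(r"Tactic Device request: (\d+)MB Available: (\d+)MB")


def tactic_shortfalls(log_text):
    """Return (requested_mb, available_mb, shortfall_mb) for each skipped tactic."""
    rows = []
    for requested, available in TACTIC_OOM.findall(log_text):
        requested, available = int(requested), int(available)
        rows.append((requested, available, requested - available))
    return rows


shortfalls = tactic_shortfalls(LOG)
for requested, available, short in shortfalls:
    print(f"requested {requested} MB, available {available} MB, short by {short} MB")
```

Both requests exceed the 1536 MB that was available by a comparatively small margin, which is why only individual tactics were skipped rather than the whole build failing.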

6. Inference

TensorRT FP32

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python infer.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
    --input /media/mydisk/YOYOFile/coco_calib \
    --output /media/mydisk/YOYOFile/infer_fp32
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python infer.py \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
>     --input /media/mydisk/YOYOFile/coco_calib \
>     --output /media/mydisk/YOYOFile/infer_fp32
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
Processing Image 1 / 1000
time_1: 1803
len(images): 1
detections: [[{'ymin': -10.35421371459961, 'xmin': 299.7505760192871, 'ymax': 240.5466079711914, 'xmax': 637.5276184082031, 'score': 0.6842605, 'class': 50}, {'ymin': 80.76017379760742, 'xmin': -1.2042045593261719, 'ymax': 436.9062805175781, 'xmax': 459.1698455810547, 'score': 0.62641436, 'class': 50}, {'ymin': 188.0274200439453, 'xmin': 27.046070098876953, 'ymax': 471.3096618652344, 'xmax': 601.4808654785156, 'score': 0.57706714, 'class': 50}, {'ymin': 222.76180267333984, 'xmin': 249.59579467773438, 'ymax': 473.05362701416016, 'xmax': 562.2965240478516, 'score': 0.5429893, 'class': 55}, {'ymin': 69.52826499938965, 'xmin': 388.28861236572266, 'ymax': 141.8428897857666, 'xmax': 470.15674591064453, 'score': 0.45283884, 'class': 54}, {'ymin': 6.308660507202148, 'xmin': 19.28152084350586, 'ymax': 293.54217529296875, 'xmax': 427.21527099609375, 'score': 0.42518762, 'class': 50}]]
time_2: 368
...
...
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000580385.jpg']
Processing Image 998 / 1000
time_1: 145
len(images): 1
detections: [[{'ymin': 47.34269142150879, 'xmin': 125.20036697387695, 'ymax': 359.3659210205078, 'xmax': 525.5953216552734, 'score': 0.9364286, 'class': 6}]]
time_2: 251
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581218.jpg']
Processing Image 999 / 1000
time_1: 151
len(images): 1
detections: [[{'ymin': 309.5945739746094, 'xmin': 241.9521141052246, 'ymax': 348.5173034667969, 'xmax': 271.9435691833496, 'score': 0.69110346, 'class': 4}, {'ymin': 147.7055263519287, 'xmin': 271.3910102844238, 'ymax': 188.7246322631836, 'xmax': 300.4056739807129, 'score': 0.64089125, 'class': 4}, {'ymin': 269.5161247253418, 'xmin': 260.4314994812012, 'ymax': 310.32026290893555, 'xmax': 289.4497871398926, 'score': 0.6314739, 'class': 4}, {'ymin': 234.64244842529297, 'xmin': 308.0194282531738, 'ymax': 274.2452621459961, 'xmax': 339.4849395751953, 'score': 0.526499, 'class': 4}, {'ymin': 273.08494567871094, 'xmin': 298.948917388916, 'ymax': 313.12862396240234, 'xmax': 325.6622314453125, 'score': 0.4886201, 'class': 4}, {'ymin': 233.17075729370117, 'xmin': 282.2890281677246, 'ymax': 268.447322845459, 'xmax': 312.5209617614746, 'score': 0.41873252, 'class': 4}]]
time_2: 430
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581766.jpg']
Processing Image 1000 / 1000
time_1: 149
len(images): 1
detections: [[{'ymin': 148.56106042861938, 'xmin': 204.04022932052612, 'ymax': 283.86545181274414, 'xmax': 297.4337339401245, 'score': 0.8775656, 'class': 69}, {'ymin': 153.74591946601868, 'xmin': 17.941385507583618, 'ymax': 285.72288155555725, 'xmax': 127.41453945636749, 'score': 0.81447136, 'class': 69}, {'ymin': 146.68376743793488, 'xmin': 373.20685386657715, 'ymax': 278.85639667510986, 'xmax': 481.5758466720581, 'score': 0.751435, 'class': 69}]]
time_2: 276
infer time: 504825
Finished Processing

real    8m32.614s
user    6m5.816s
sys 0m10.920s

Summary:

COCO dataset
(better case) FP32 averages 150 ms per image, i.e. 6.7 fps
(worse case) FP32 averages 160 ms per image, i.e. 6.2 fps
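
The fps figures in the summary follow from dividing 1000 ms by the per-image latency. The sketch below reproduces that conversion and also derives a whole-run average from the `infer time: 504825` line. Treating that number as a total in milliseconds over the 1000 images is an assumption (the log does not state the unit), although it is consistent with the 8m32s wall-clock time. The function name `fps_from_latency_ms` is hypothetical.

```python
def fps_from_latency_ms(ms_per_image):
    """Convert an average per-image latency in milliseconds to frames per second."""
    if ms_per_image <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / ms_per_image


# Per-image latencies quoted in the summary above.
for label, ms in (("better", 150), ("worse", 160)):
    print(f"{label}: {ms} ms/image -> {fps_from_latency_ms(ms):.2f} fps")

# Whole-run average, ASSUMING "infer time: 504825" is a total in milliseconds
# over the 1000 images (the unit is not stated in the log).
run_avg_ms = 504825 / 1000
print(f"whole run: {run_avg_ms:.1f} ms/image -> {fps_from_latency_ms(run_avg_ms):.2f} fps")
```

Under that assumption the whole-run average is several times larger than the `time_1` values, so the quoted fps figures describe only the portion of each iteration measured by `time_1`, not the full loop.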

TensorRT FP16

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python infer.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
    --input /media/mydisk/YOYOFile/coco_calib \
    --output /media/mydisk/YOYOFile/infer_fp16
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python infer.py \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
>     --input /media/mydisk/YOYOFile/coco_calib \
>     --output /media/mydisk/YOYOFile/infer_fp16
[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchor_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchorRect_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::NMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Reorg_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Region_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Clip_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::LReLU_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PriorBox_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Normalize_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ScatterND version 1
[TensorRT] VERBOSE: Registered plugin creator - ::RPROI_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::FlattenConcat_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::CropAndResize version 1
[TensorRT] VERBOSE: Registered plugin creator - ::DetectionLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::EfficientNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Proposal version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ProposalLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ResizeNearest_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Split version 1
[TensorRT] VERBOSE: Registered plugin creator - ::SpecialSlice_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::InstanceNormalization_TRT version 1
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +234, GPU +0, now: CPU 264, GPU 7209 (MiB)
[TensorRT] INFO: Loaded engine size: 19 MB
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine begin: CPU 284 MiB, GPU 7228 MiB
[TensorRT] VERBOSE: Using cublas a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +167, GPU +268, now: CPU 466, GPU 7510 (MiB)
[TensorRT] VERBOSE: Using cuDNN as a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +250, GPU +245, now: CPU 716, GPU 7755 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 716, GPU 7755 (MiB)
[TensorRT] VERBOSE: Deserialization required 5901688 microseconds.
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine end: CPU 716 MiB, GPU 7755 MiB
[TensorRT] INFO: [MemUsageSnapshot] ExecutionContext creation begin: CPU 696 MiB, GPU 7736 MiB
[TensorRT] VERBOSE: Using cublas a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 696, GPU 7736 (MiB)
[TensorRT] VERBOSE: Using cuDNN as a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 696, GPU 7736 (MiB)
[TensorRT] VERBOSE: Total per-runner device memory is 9042944
[TensorRT] VERBOSE: Total per-runner host memory is 343280
[TensorRT] VERBOSE: Allocated activation device memory of size 141150208
[TensorRT] INFO: [MemUsageSnapshot] ExecutionContext creation end: CPU 699 MiB, GPU 7770 MiB
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
Processing Image 1 / 1000
time_1: 3401
len(images): 1
detections: [[{'ymin': -10.390625, 'xmin': 299.53125, 'ymax': 240.625, 'xmax': 637.5, 'score': 0.684264, 'class': 50}, {'ymin': 80.78125, 'xmin': -1.25, 'ymax': 436.875, 'xmax': 459.0625, 'score': 0.62612414, 'class': 50}, {'ymin': 188.125, 'xmin': 27.1875, 'ymax': 471.25, 'xmax': 601.5625, 'score': 0.57749534, 'class': 50}, {'ymin': 222.8125, 'xmin': 249.6875, 'ymax': 472.8125, 'xmax': 562.1875, 'score': 0.5418937, 'class': 55}, {'ymin': 69.53125, 'xmin': 388.4375, 'ymax': 141.875, 'xmax': 470.3125, 'score': 0.4530755, 'class': 54}, {'ymin': 6.25, 'xmin': 19.375, 'ymax': 293.59375, 'xmax': 427.1875, 'score': 0.42536652, 'class': 50}]]
time_2: 485
...
...
...
Processing Image 998 / 1000
time_1: 163
len(images): 1
detections: [[{'ymin': 47.265625, 'xmin': 125.0, 'ymax': 359.375, 'xmax': 525.625, 'score': 0.9365176, 'class': 6}]]
time_2: 449
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581218.jpg']
Processing Image 999 / 1000
time_1: 168
len(images): 1
detections: [[{'ymin': 309.53125, 'xmin': 241.875, 'ymax': 349.375, 'xmax': 271.5625, 'score': 0.69014156, 'class': 4}, {'ymin': 147.734375, 'xmin': 271.40625, 'ymax': 188.75, 'xmax': 300.46875, 'score': 0.63973606, 'class': 4}, {'ymin': 269.375, 'xmin': 260.46875, 'ymax': 310.3125, 'xmax': 289.53125, 'score': 0.6306849, 'class': 4}, {'ymin': 234.53125, 'xmin': 307.96875, 'ymax': 274.21875, 'xmax': 339.375, 'score': 0.52634275, 'class': 4}, {'ymin': 273.125, 'xmin': 299.0625, 'ymax': 313.125, 'xmax': 325.625, 'score': 0.4882834, 'class': 4}, {'ymin': 233.125, 'xmin': 282.1875, 'ymax': 268.4375, 'xmax': 312.5, 'score': 0.42059958, 'class': 4}]]
time_2: 665
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000581766.jpg']
Processing Image 1000 / 1000
time_1: 160
len(images): 1
detections: [[{'ymin': 148.5595703125, 'xmin': 204.1015625, 'ymax': 283.69140625, 'xmax': 297.36328125, 'score': 0.8774537, 'class': 69}, {'ymin': 153.80859375, 'xmin': 17.9443359375, 'ymax': 285.64453125, 'xmax': 127.44140625, 'score': 0.8146434, 'class': 69}, {'ymin': 146.728515625, 'xmin': 373.291015625, 'ymax': 278.80859375, 'xmax': 481.689453125, 'score': 0.7512834, 'class': 69}]]
time_2: 373
infer time: 684610
Finished Processing
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 964, GPU 7654 (MiB)

real    11m37.284s
user    7m18.972s
sys 0m14.796s

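Summary:

COCO dataset
(better case) FP16 averages 140 ms per image, i.e. 7 fps
(worse case) FP16 averages 170 ms per image, i.e. 5.9 fps

person_horse dataset
(better case) FP16 averages 140 ms per image, i.e. 7 fps
(worse case) FP16 averages 170 ms per image, i.e. 5.9 fps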
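
The FP32 and FP16 logs above list detections for the same images, so the effect of reduced precision on the outputs can be checked directly. The sketch below does this for `COCO_train2014_000000581766.jpg`, using the values printed in the two logs with coordinates rounded to four decimals. It assumes both engines return the same detections in the same order, which holds for this image but is not guaranteed in general. The function name `compare` is hypothetical.

```python
# Detections for COCO_train2014_000000581766.jpg copied from the two logs above
# (coordinates rounded to 4 decimals): (ymin, xmin, ymax, xmax, score, class).
FP32 = [
    (148.5611, 204.0402, 283.8655, 297.4337, 0.8775656, 69),
    (153.7459, 17.9414, 285.7229, 127.4145, 0.81447136, 69),
    (146.6838, 373.2069, 278.8564, 481.5758, 0.751435, 69),
]
FP16 = [
    (148.5596, 204.1016, 283.6914, 297.3633, 0.8774537, 69),
    (153.8086, 17.9443, 285.6445, 127.4414, 0.8146434, 69),
    (146.7285, 373.2910, 278.8086, 481.6895, 0.7512834, 69),
]


def compare(a, b):
    """Return (max box-coordinate difference, max score difference).

    Assumes both lists hold the same detections in the same order.
    """
    if len(a) != len(b):
        raise ValueError("detection counts differ")
    max_box = max_score = 0.0
    for da, db in zip(a, b):
        if da[5] != db[5]:
            raise ValueError("class mismatch")
        max_box = max(max_box, *(abs(x - y) for x, y in zip(da[:4], db[:4])))
        max_score = max(max_score, abs(da[4] - db[4]))
    return max_box, max_score


max_box, max_score = compare(FP32, FP16)
print(f"max box diff: {max_box:.3f}, max score diff: {max_score:.5f}")
```

For this image the box coordinates differ by well under one coordinate unit and the scores by less than 0.001, so the two engines agree closely here.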


7. Evaluation metrics

TensorRT FP32

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python eval_coco.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
    --input /media/mydisk/YOYOFile/COCO/val2017 \
    --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json \
    --automl_path /media/mydisk/MyDocuments/PyProjects/automl
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python eval_coco.py \
>     --engine /media/mydisk/YOYOFile/saved_model_trt_fp32/engine.trt \
>     --input /media/mydisk/YOYOFile/COCO/val2017 \
>     --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json \
>     --automl_path /media/mydisk/MyDocuments/PyProjects/automl
2021-10-26 18:22:17.807992: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
...
...
Processing Image 5000 / 5000
infer time: 951576
loading annotations into memory...
Done (t=3.04s)
creating index...
index created!
Loading and preparing results...
Converting ndarray to lists...
(18132, 7)
0/18132
DONE (t=0.38s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=78.79s).
Accumulating evaluation results...
DONE (t=12.23s).
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.282
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.397
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.315
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.053
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.319
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.494
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.242
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.315
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.316
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.052
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.349
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.562

real    18m6.646s
user    6m34.516s
sys 0m34.588s

TensorRT FP16

cd /media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet
python eval_coco.py \
    --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
    --input /media/mydisk/YOYOFile/COCO/val2017 \
    --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json \
    --automl_path /media/mydisk/MyDocuments/PyProjects/automl
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python eval_coco.py     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt     --input /media/mydisk/YOYOFile/COCO/val2017     --annotations /media/mydisk/YOYOFile/COCO/annotations/instances_val2017.json     --automl_path /media/mydisk/MyDocuments/PyProjects/automl
2021-10-22 14:44:34.439779: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
...
...
Processing Image 5000 / 5000
infer time: 926744
loading annotations into memory...
Done (t=3.13s)
creating index...
index created!
Loading and preparing results...
Converting ndarray to lists...
(18133, 7)
0/18133
DONE (t=0.40s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=70.50s).
Accumulating evaluation results...
DONE (t=11.49s).
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.282
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.397
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.315
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.053
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.319
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.494
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.242
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.315
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.316
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.052
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.349
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.562

real    17m27.213s
user    6m39.904s
sys 0m38.652s
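
The FP32 and FP16 evaluations above report the same AP and AR values to three decimals, even though the detection counts differ by one (18132 versus 18133). A quick way to confirm this kind of agreement is to parse both summaries and subtract them. The sketch below does so for three of the twelve entries, copied from the logs above; the regular expression and the helper name `parse_coco_summary` are assumptions written for this log format, not part of pycocotools.

```python
import re

# Matches one entry of the COCO evaluation summary as printed in the logs above.
SUMMARY_ENTRY = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=(\S+)\s*\| area=\s*(\w+) "
    r"\| maxDets=\s*(\d+) \] = (-?\d+\.\d+)"
)

# Three entries from each run, copied from the logs above.
FP32_SUMMARY = (
    "Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.282\n"
    "Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.397\n"
    "Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.242\n"
)
FP16_SUMMARY = (
    "Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.282\n"
    "Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.397\n"
    "Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.242\n"
)


def parse_coco_summary(text):
    """Map (metric, IoU, area, maxDets) -> value for every summary entry found."""
    return {
        (kind, iou, area, int(max_dets)): float(value)
        for kind, iou, area, max_dets, value in SUMMARY_ENTRY.findall(text)
    }


fp32 = parse_coco_summary(FP32_SUMMARY)
fp16 = parse_coco_summary(FP16_SUMMARY)
deltas = {key: round(fp16[key] - fp32[key], 3) for key in fp32}
for key, delta in deltas.items():
    print(key, f"FP32={fp32[key]:.3f}", f"FP16={fp16[key]:.3f}", f"delta={delta:+.3f}")
```

Because the pattern is not anchored to line boundaries, it also works on output where the summary entries have been run together on one line.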

8. Comparing native TensorFlow with TensorRT

TensorRT FP32

cd /home/yichao/MyDocuments/TensorRT/samples/python/efficientdet
python compare_tf.py \
    --engine /home/yichao/Downloads/saved_model_trt_fp32/engine.trt \
    --saved_model /home/yichao/Downloads/saved_model \
    --input /home/yichao/Downloads/coco_calib \
    --output /home/yichao/Downloads/output_fp32

TensorRT FP16

cd /home/yichao/MyDocuments/TensorRT/samples/python/efficientdet
python compare_tf.py \
    --engine /home/yichao/Downloads/saved_model_trt_fp16/engine.trt \
    --saved_model /home/yichao/Downloads/saved_model \
    --input /home/yichao/Downloads/coco_calib \
    --output /home/yichao/Downloads/output_fp16
(venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python compare_tf.py     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt     --saved_model /media/mydisk/YOYOFile/saved_model     --input /media/mydisk/YOYOFile/coco_calib     --output /media/mydisk/YOYOFile/output_fp16
2021-10-22 16:19:15.765642: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-22 16:19:28.446469: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-10-22 16:19:28.447570: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:19:28.448016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-22 16:19:28.448459: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-22 16:19:28.448790: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-22 16:19:28.448971: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.10
2021-10-22 16:19:28.449133: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-10-22 16:19:28.449401: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-10-22 16:19:28.468938: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-10-22 16:19:28.485613: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.10
2021-10-22 16:19:28.486389: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-22 16:19:28.487274: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:19:28.487977: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:19:28.488423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-22 16:20:55.378425: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:20:55.378885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
2021-10-22 16:20:55.379462: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:20:55.379853: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:20:55.379990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2021-10-22 16:20:55.380803: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.10.2
2021-10-22 16:21:00.945108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-22 16:21:00.945316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-10-22 16:21:00.945392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-10-22 16:21:00.946306: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:21:00.947010: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:21:00.947459: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2021-10-22 16:21:00.947732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2356 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-10-22 16:24:27.939628: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-22 16:24:28.315283: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 31250000 Hz
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
2021-10-22 16:26:00.194066: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-10-22 16:26:00.207154: I tensorflow/stream_executor/cuda/cuda_dnn.cc:380] Loaded cuDNN version 8201
2021-10-22 16:26:03.482891: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.10
2021-10-22 16:26:11.318267: W tensorflow/core/common_runtime/bfc_allocator.cc:337] Garbage collection: deallocate free memory regions (i.e., allocations) so that we can re-allocate a larger region to avoid OOM due to memory fragmentation. If you see this message frequently, you are running near the threshold of the available device memory and re-allocation may incur great performance overhead. You may try smaller batch sizes to observe the performance impact. Set TF_ENABLE_GPU_GARBAGE_COLLECTION=false if you'd like to disable this feature.
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000151.jpg']
...
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061328.jpg']
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061399.jpg']
Processing 100 / 100 images (TensorFlow)
infer time: 47011
time_1: 47011
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
...
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061328.jpg']
len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000061399.jpg']
Processing 100 / 100 images (TensorRT)
infer time: 18347
time_2: 18348
Processing 100 / 100 images (Visualization)

real    8m12.180s
user    7m17.240s
sys     0m19.388s

Repeated runs (time_1 is the TensorFlow pass, time_2 the TensorRT pass):

COCO dataset
time_1: 47011
time_2: 18348

time_1: 46871
time_2: 18095

time_1: 46849
time_2: 18110

time_1: 46364
time_2: 17931

person_horse dataset
time_1: 42032
time_2: 20510

time_1: 48867
time_2: 20521

Test data                  Resolution  TensorFlow time (ms)  TensorRT time (ms)  Speedup
COCO, 100 images           640x480     16089                 5268                3
COCO, 100 images           640x480     11888                 2509                4.7
COCO, 100 images           640x480     12012                 3008                4
COCO, 100 images           640x480     11803                 3183                3.7
COCO, 100 images           640x480     11776                 3140                3.7

Test data                  Resolution  TensorFlow time (ms)  TensorRT time (ms)  Speedup
person_horse, 100 images   1280x720    12891                 3452                3.7
person_horse, 100 images   1280x720    14989                 3480                4.3
person_horse, 100 images   1280x720    15565                 3515                4.4
person_horse, 100 images   1280x720    12883                 3456                3.7

Note: $\text{speedup} = \frac{\text{TensorFlow time}}{\text{TensorRT time}}$
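
As a worked check of this definition, the sketch below recomputes the speedup for every row of the two tables above and compares it with the value the table reports. The row data is copied from the tables; the function name `speedup` and the 0.1 tolerance are illustrative choices.

```python
# (dataset, TensorFlow time in ms, TensorRT time in ms, speedup reported in the table)
ROWS = [
    ("COCO", 16089, 5268, 3.0),
    ("COCO", 11888, 2509, 4.7),
    ("COCO", 12012, 3008, 4.0),
    ("COCO", 11803, 3183, 3.7),
    ("COCO", 11776, 3140, 3.7),
    ("person_horse", 12891, 3452, 3.7),
    ("person_horse", 14989, 3480, 4.3),
    ("person_horse", 15565, 3515, 4.4),
    ("person_horse", 12883, 3456, 3.7),
]


def speedup(tensorflow_ms, tensorrt_ms):
    """Speedup = TensorFlow time / TensorRT time."""
    if tensorrt_ms <= 0:
        raise ValueError("TensorRT time must be positive")
    return tensorflow_ms / tensorrt_ms


computed = [speedup(tf_ms, trt_ms) for _, tf_ms, trt_ms, _ in ROWS]
for (dataset, tf_ms, trt_ms, reported), value in zip(ROWS, computed):
    print(f"{dataset}: {tf_ms} / {trt_ms} = {value:.2f} (table: {reported})")
```

Every recomputed value lies within 0.1 of the figure in the table, so the tables are internally consistent with the definition.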
