
[TensorRT / polygraphy] Found duplicate region name Reshape__8560:0'[shuffle input])


While trying to run a model on TensorRT via ONNX, the following error occurred.

[E:onnxruntime:Default, tensorrt_execution_provider.h:58 log] [2023-02-27 15:13:50   ERROR] 2: [checkSanity.cpp::checkSanity::106] Error Code 2: Internal Error (Assertion regionNames.find(r->name) == regionNames.end() failed. Found duplicate region name Reshape__8560:0'[shuffle input])
Traceback (most recent call last):
  File "/usr/local/bin/sit4onnx", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/sit4onnx/onnx_inference_test.py", line 498, in main
    final_results = inference(
  File "/usr/local/lib/python3.8/dist-packages/sit4onnx/onnx_inference_test.py", line 233, in inference
    onnx_session = onnxruntime.InferenceSession(
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 347, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 395, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.EPFail: [ONNXRuntimeError] : 11 : EP_FAIL : TensorRT EP could not build engine for fused node: TensorrtExecutionProvider_TRTKernel_graph_tf2onnx_10356034146968653513_1_0

It seems to be saying that some OP names are duplicated. That shouldn't be the case, though. Sanitize with the steps below. The key point is to use polygraphy, a tool that ships alongside TensorRT.

python -m pip install \
colored \
polygraphy \
onnx-graphsurgeon \
--extra-index-url https://pypi.ngc.nvidia.com

The problem spot can be analyzed with inspect.

polygraphy inspect model dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three.onnx \
--show layers attrs | grep Reshape__8560:0

It points out that the name is duplicated in three places. Why the heck?

-> {Reshape__8560:0 [dtype=float32, shape=(16384, 160)]}
 Reshape__8560:0 [dtype=float32, shape=(16384, 160)]}
 Reshape__8560:0 [dtype=float32, shape=(16384, 160)]}

Now sanitize the model.

polygraphy surgeon sanitize \
dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three.onnx \
--fold-constants \
-o dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three_fold.onnx
[I] RUNNING | Command: /home/xxxx/.local/bin/polygraphy surgeon sanitize dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three.onnx --fold-constants -o dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three_fold.onnx
[I] Inferring shapes in the model with `onnxruntime.tools.symbolic_shape_infer`.
    Note: To force Polygraphy to use `onnx.shape_inference` instead, set `allow_onnxruntime=False` or use the `--no-onnxruntime-shape-inference` command-line option.
[I] Loading model: /home/xxxx/work/MAXIM_bk/pinto_special/dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three.onnx
[I] Original Model:
    Name: tf2onnx | ONNX Opset: 13 | Other Opsets: {'ai.onnx.ml': 2}
    
    ---- 1 Graph Input(s) ----
    {input_image [dtype=float32, shape=(1, 3, 512, 640)]}
    
    ---- 6 Graph Output(s) ----
    {Identity_6:0 [dtype=float32, shape=(1, 128, 160, 3)],
     Identity_7:0 [dtype=float32, shape=(1, 256, 320, 3)],
     Identity_8:0 [dtype=float32, shape=(1, 512, 640, 3)],
     Identity_9:0 [dtype=float32, shape=(1, 128, 160, 3)],
     Identity_10:0 [dtype=float32, shape=(1, 256, 320, 3)],
     Identity_11:0 [dtype=float32, shape=(1, 512, 640, 3)]}
    
    ---- 1859 Initializer(s) ----
    
    ---- 8335 Node(s) ----
    
[I] Folding Constants | Pass 1
[I]     Total Nodes | Original:  8335, After Folding:  8335 |     0 Nodes Folded
[I] Saving ONNX model to: dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three_fold.onnx
[I] New Model:
    Name: tf2onnx | ONNX Opset: 13 | Other Opsets: {'ai.onnx.ml': 2}
    
    ---- 1 Graph Input(s) ----
    {input_image [dtype=float32, shape=(1, 3, 512, 640)]}
    
    ---- 6 Graph Output(s) ----
    {Identity_6:0 [dtype=float32, shape=(1, 128, 160, 3)],
     Identity_7:0 [dtype=float32, shape=(1, 256, 320, 3)],
     Identity_8:0 [dtype=float32, shape=(1, 512, 640, 3)],
     Identity_9:0 [dtype=float32, shape=(1, 128, 160, 3)],
     Identity_10:0 [dtype=float32, shape=(1, 256, 320, 3)],
     Identity_11:0 [dtype=float32, shape=(1, 512, 640, 3)]}
    
    ---- 1859 Initializer(s) ----
    
    ---- 8335 Node(s) ----
    
[I] PASSED | Runtime: 5.715s | Command: /home/xxxx/.local/bin/polygraphy surgeon sanitize dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three.onnx --fold-constants -o dehaze_maxim_2022aug_opt_sim_special_08_redsum_axes_three_fold.onnx

It's not fixed. Why the heck?

2023-02-28 00:40:04.713089620 [W:onnxruntime:Default, tensorrt_execution_provider.h:60 log] [2023-02-27 15:40:04 WARNING] external/onnx-tensorrt/onnx2trt_utils.cpp:367: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-02-28 00:48:01.513350249 [W:onnxruntime:Default, tensorrt_execution_provider.h:60 log] [2023-02-27 15:48:01 WARNING] TensorRT was linked against cuBLAS/cuBLAS LT 11.8.0 but loaded cuBLAS/cuBLAS LT 110.9.2
2023-02-28 00:48:01.892215068 [W:onnxruntime:Default, tensorrt_execution_provider.h:60 log] [2023-02-27 15:48:01 WARNING] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
2023-02-28 00:48:02.302692008 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 log] [2023-02-27 15:48:02   ERROR] 2: [checkSanity.cpp::checkSanity::106] Error Code 2: Internal Error (Assertion regionNames.find(r->name) == regionNames.end() failed. Found duplicate region name Reshape__8560:0'[shuffle input])
Traceback (most recent call last):
  File "/usr/local/bin/sit4onnx", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/sit4onnx/onnx_inference_test.py", line 498, in main
    final_results = inference(
  File "/usr/local/lib/python3.8/dist-packages/sit4onnx/onnx_inference_test.py", line 233, in inference
    onnx_session = onnxruntime.InferenceSession(
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 347, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 395, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.EPFail: [ONNXRuntimeError] : 11 : EP_FAIL : TensorRT EP could not build engine for fused node: TensorrtExecutionProvider_TRTKernel_graph_tf2onnx_10356034146968653513_1_0