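Inference testing with sit4onnx
I tried out sit4onnx, a tool for running simple inference tests on ONNX models.
The model is the same one as in the previous article: distilgpt2 from Hugging Face.
Installation is easy.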
!pip install sit4onnx
Two thoughtful touches: it generates dummy input tensors automatically, and it discards the result of the first run (warm-up).
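The warm-up idea can be illustrated with a minimal sketch. This is not sit4onnx's actual implementation. `benchmark` and `run_once` are hypothetical names, and any zero-argument callable (for example, a wrapper around an inference call) can stand in for the model.

```python
import time
from typing import Callable


def benchmark(run_once: Callable[[], object], test_loop_count: int = 10) -> dict:
    """Time `run_once`, discarding one warm-up call before measuring."""
    if test_loop_count < 1:
        raise ValueError("test_loop_count must be >= 1")

    # Warm-up: the first call often includes one-time costs (lazy
    # initialisation, memory allocation), so its timing is thrown away.
    run_once()

    start = time.perf_counter()
    for _ in range(test_loop_count):
        run_once()
    total_ms = (time.perf_counter() - start) * 1000.0

    return {
        "test_loop_count": test_loop_count,
        "total_ms": total_ms,
        "avg_ms": total_ms / test_loop_count,
    }


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with a real inference call.
    print(benchmark(lambda: sum(range(100_000))))
```

With `test_loop_count=10` the callable runs 11 times in total: one discarded warm-up call plus 10 timed calls.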
Model before simplification
!sit4onnx -if distilgpt2_20230307.onnx -oep cpu # by default it loops 11 times internally
INFO: file: distilgpt2_20230307.onnx
INFO: providers: ['CPUExecutionProvider']
INFO: input_name.1: input_ids shape: [1, 6] dtype: int64
INFO: test_loop_count: 10
INFO: total elapsed time: 59.000492095947266 ms
INFO: avg elapsed time per pred: 5.900049209594727 ms
INFO: output_name.1: outputs shape: [1, 6, 768] dtype: float32
INFO: output_name.2: key.3 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.3: value.3 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.4: key.11 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.5: value.11 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.6: key.19 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.7: value.19 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.8: key.27 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.9: value.27 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.10: key.35 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.11: value.35 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.12: key.43 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.13: value.43 shape: [1, 12, 6, 64] dtype: float32
Model after simplification
!sit4onnx -if distilgpt2_20230307_simplified.onnx -oep cpu # by default it loops 11 times internally
INFO: file: distilgpt2_20230307_simplified.onnx
INFO: providers: ['CPUExecutionProvider']
INFO: input_name.1: input_ids shape: [1, 6] dtype: int64
INFO: test_loop_count: 10
INFO: total elapsed time: 49.99661445617676 ms
INFO: avg elapsed time per pred: 4.999661445617676 ms
INFO: output_name.1: outputs shape: [1, 6, 768] dtype: float32
INFO: output_name.2: key.3 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.3: value.3 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.4: key.11 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.5: value.11 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.6: key.19 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.7: value.19 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.8: key.27 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.9: value.27 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.10: key.35 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.11: value.35 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.12: key.43 shape: [1, 12, 6, 64] dtype: float32
INFO: output_name.13: value.43 shape: [1, 12, 6, 64] dtype: float32
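To compare the two runs, the average times can be pulled out of the logs and turned into a ratio. The following is a small sketch. `parse_avg_ms` and `compare` are hypothetical helper names, and the regular expression assumes the log line format shown in the output above.

```python
import re

# Matches lines such as "INFO: avg elapsed time per pred: 5.9 ms".
AVG_PATTERN = re.compile(r"avg elapsed time per pred:\s*([0-9.]+)\s*ms")


def parse_avg_ms(log_text: str) -> float:
    """Extract the average per-inference time (ms) from sit4onnx log output."""
    match = AVG_PATTERN.search(log_text)
    if match is None:
        raise ValueError("no 'avg elapsed time per pred' line found")
    return float(match.group(1))


def compare(before_ms: float, after_ms: float) -> dict:
    """Return the speed-up factor and the relative time reduction."""
    return {
        "speedup": before_ms / after_ms,
        "reduction_pct": (before_ms - after_ms) / before_ms * 100.0,
    }


if __name__ == "__main__":
    before_log = "INFO: avg elapsed time per pred: 5.900049209594727 ms"
    after_log = "INFO: avg elapsed time per pred: 4.999661445617676 ms"
    result = compare(parse_avg_ms(before_log), parse_avg_ms(after_log))
    print(f"speed-up: {result['speedup']:.2f}x, "
          f"time reduction: {result['reduction_pct']:.1f}%")
```

For the numbers above this works out to roughly a 1.18x speed-up (about 15% less time per inference). Each figure comes from a single run of 10 timed loops on CPU, so part of the difference may be measurement noise.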
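It also has plenty of options, which look useful in cases like the ones below. I suspect I will end up hand-writing a test from scratch before I remember how to use the tool, but using it is far easier because a single command does the job.
- --test_loop_count=11: check that the measured time does not differ much from the processing time expected for the model
- --batch_size: check the processing time at a given batch size
- --output_numpy_file: check that the output OPs produce the expected values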
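As a sketch of how these checks might be scripted, the helper below assembles a sit4onnx command line using only the flags listed in the tool's help text. `build_sit4onnx_cmd` is a hypothetical name, and the batch sizes and loop count in the usage part are arbitrary example values. Note that, according to the help text, --batch_size is ignored when all input dimensions are static.

```python
import shlex
from typing import List, Optional


def build_sit4onnx_cmd(
    model_path: str,
    provider: str = "cpu",
    batch_size: Optional[int] = None,
    test_loop_count: Optional[int] = None,
    save_output: bool = False,
) -> List[str]:
    """Assemble a sit4onnx argument list from the documented flags."""
    cmd = [
        "sit4onnx",
        "--input_onnx_file_path", model_path,
        "--onnx_execution_provider", provider,
    ]
    if batch_size is not None:
        cmd += ["--batch_size", str(batch_size)]
    if test_loop_count is not None:
        cmd += ["--test_loop_count", str(test_loop_count)]
    if save_output:
        # Asks the tool to write the last inference result to .npy.
        cmd.append("--output_numpy_file")
    return cmd


if __name__ == "__main__":
    # Example sweep over batch sizes (values are assumptions).
    for bs in (1, 2, 4):
        cmd = build_sit4onnx_cmd(
            "distilgpt2_20230307_simplified.onnx",
            batch_size=bs,
            test_loop_count=100,
        )
        print(shlex.join(cmd))
        # To actually run it: subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than a single string avoids shell quoting issues if the command is later passed to `subprocess.run`.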
optional arguments:
-h, --help
show this help message and exit.
-if INPUT_ONNX_FILE_PATH, --input_onnx_file_path INPUT_ONNX_FILE_PATH
Input onnx file path.
-b BATCH_SIZE, --batch_size BATCH_SIZE
Value to be substituted if input batch size is undefined.
This is ignored if the input dimensions are all of static size.
Also ignored if input_numpy_file_paths_for_testing or
numpy_ndarrays_for_testing or fixed_shapes is specified.
-fs FIXED_SHAPES [FIXED_SHAPES ...], --fixed_shapes FIXED_SHAPES [FIXED_SHAPES ...]
Input OPs with undefined shapes are changed to the specified shape.
This parameter can be specified multiple times depending on
the number of input OPs in the model.
Also ignored if input_numpy_file_paths_for_testing or
numpy_ndarrays_for_testing is specified.
e.g.
--fixed_shapes 1 3 224 224 \
--fixed_shapes 1 5 \
--fixed_shapes 1 1 224 224
-tlc TEST_LOOP_COUNT, --test_loop_count TEST_LOOP_COUNT
Number of times to run the test.
The total execution time is divided by the number of times the test is executed,
and the average inference time per inference is displayed.
-oep {tensorrt,cuda,openvino_cpu,openvino_gpu,cpu}, \
--onnx_execution_provider {tensorrt,cuda,openvino_cpu,openvino_gpu,cpu}
ONNX Execution Provider.
-ifp INPUT_NUMPY_FILE_PATHS_FOR_TESTING, \
--input_numpy_file_paths_for_testing INPUT_NUMPY_FILE_PATHS_FOR_TESTING
Use an external file of numpy.ndarray saved using np.save as input data for testing.
This parameter can be specified multiple times depending on
the number of input OPs in the model.
If this parameter is specified, the value specified for
batch_size and fixed_shapes are ignored.
e.g.
--input_numpy_file_paths_for_testing aaa.npy \
--input_numpy_file_paths_for_testing bbb.npy \
--input_numpy_file_paths_for_testing ccc.npy
-ofp, --output_numpy_file
Outputs the last inference result to an .npy file.
-n, --non_verbose
Do not show all information logs. Only error logs are displayed.
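The help text says --input_numpy_file_paths_for_testing takes arrays saved with np.save. A minimal sketch of preparing such a file for the distilgpt2 model above might look like this. `save_dummy_input_ids` and the file name input_ids.npy are hypothetical. The shape [1, 6] and dtype int64 are taken from the input_ids line in the logs, and the vocabulary size 50257 is an assumption based on the GPT-2 tokenizer.

```python
import numpy as np


def save_dummy_input_ids(path, shape=(1, 6), vocab_size=50257, seed=0):
    """Save random token IDs as an .npy file and return the array."""
    rng = np.random.default_rng(seed)
    # Token IDs are drawn from [0, vocab_size); dtype matches the model input.
    input_ids = rng.integers(0, vocab_size, size=shape, dtype=np.int64)
    np.save(path, input_ids)
    return input_ids


if __name__ == "__main__":
    arr = save_dummy_input_ids("input_ids.npy")
    print(arr.shape, arr.dtype)
```

The saved file could then be passed as `--input_numpy_file_paths_for_testing input_ids.npy`. Real tokenizer output would be a more meaningful input than random IDs when checking output values.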