
Converting an EfficientNet B0 Keras Model to ONNX and Running Inference

Published 2021/04/23

Deep Learning

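Introduction

In the "Building a classifier for indecent images" (「けしからん画像分類器を作ってみる」) series, I implemented an image classifier with Keras and EfficientNet B0.
In this article I convert that image classification model to an ONNX model and run inference with it.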

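What is ONNX?

ONNX (Open Neural Network Exchange) is a project, led by Facebook and Microsoft, that aims to make machine learning frameworks interoperable. It defines a common model format so that a model exported from one framework can be loaded by another. For details, well... please look it up.

Official site: ONNX
Wikipedia: Open Neural Network Exchange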

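Why convert to an ONNX model?

Personally, my main reason is that inference with ONNX Runtime is fast. It is also nice that models trained with different machine learning frameworks, such as Keras and PyTorch, can be handled in a uniform way at inference time.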

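Two ways to convert the model

As far as I could tell from a quick search, there are two ways to convert a Keras model to an ONNX model:

- Convert with tf2onnx ← recommended!
- Convert with keras2onnx ← did not succeed

I strongly recommend the former. I also tried the latter for learning purposes, and it was quite a struggle: the conversion itself eventually went through, but inference then failed.

Note that this article covers only conversion and inference, with no training. The procedure for converting a model after training is the same.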

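Environment

Both conversions and all inference runs were performed in the following environment. Everything ran inside Docker, and the GPU was not used.

Hardware:
- CPU: AMD Ryzen 7 3700X (8 cores / 16 threads)
- Memory: 64GB
- GPU: GeForce GTX 1070 (8GB memory)
Software:
- OS: Ubuntu 20.04.2 LTS
- Docker: 19.03.8
- NVIDIA driver: 460.39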

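Converting with tf2onnx → success

tf2onnx is a tool that converts TensorFlow models to ONNX models.
If you build a model in Keras and save it in the TensorFlow SavedModel format, you can convert it with this tool.
Note that, as of this writing, the Keras H5 format was not supported.

Building the Docker image

The Dockerfile and requirements.txt I used are shown below.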

Dockerfile
FROM nvidia/cuda:11.0.3-cudnn8-devel-ubuntu20.04
RUN apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends \
    build-essential \
    ca-certificates \
    python3-dev \
    python3-pip \
    python3-setuptools \
    tzdata \
  && rm --recursive --force /var/lib/apt/lists/*
RUN python3 -m pip install --upgrade pip setuptools
WORKDIR /opt/app
COPY requirements.txt ./
RUN python3 -m pip install --requirement requirements.txt
ENV LANG=C.UTF-8
ENV TZ=Asia/Tokyo

requirements.txt

onnxruntime==1.7.0
tensorflow-hub==0.11.0
tensorflow==2.4.1
tf2onnx==1.8.4

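Generating the model

I generate the model files by taking the pretrained EfficientNet B0 from TensorFlow Hub, putting an untrained classification layer on top, and saving it without any training.
Normally you would train the model first, but here I judge whether the conversion succeeded by comparing the inference results of the untrained model before and after conversion.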

save_model.py
#!/usr/bin/env python3

import tensorflow as tf
import tensorflow_hub as hub

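# Pretrained EfficientNet B0 feature extractor (frozen) with a single-unit
# sigmoid head; the head is randomly initialized and left untrained.
model = tf.keras.Sequential(
    [
        hub.KerasLayer(
            "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1",
            trainable=False,
        ),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]
)
model.build([None, 224, 224, 3])  # batches of 224x224 RGB images
model.summary()
model.save("efficientnet-b0")  # directory path, so the SavedModel format is used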

An example run is shown below. On success, the efficientnet-b0 directory is created.

$ ./save_model.py
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
keras_layer (KerasLayer)     (None, 1280)              4049564
_________________________________________________________________
dense (Dense)                (None, 1)                 1281
=================================================================
Total params: 4,050,845
Trainable params: 1,281
Non-trainable params: 4,049,564
_________________________________________________________________
2021-04-23 00:07:12.365759: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.

$ ls efficientnet-b0
assets  saved_model.pb  variables
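
Since tf2onnx takes a SavedModel directory rather than an H5 file, it can be handy to check what model.save produced before converting. The following is a minimal sketch that assumes the layout shown in the ls output above (saved_model.pb plus a variables directory). The helper name looks_like_saved_model is hypothetical and not part of TensorFlow or tf2onnx.

```python
from pathlib import Path


def looks_like_saved_model(model_dir):
    """Return True if model_dir has the basic SavedModel layout."""
    root = Path(model_dir)
    # saved_model.pb holds the serialized graph; variables/ holds the weights.
    return (root / "saved_model.pb").is_file() and (root / "variables").is_dir()
```

Calling looks_like_saved_model("efficientnet-b0") right after save_model.py should return True. It only checks file names, so it cannot tell whether the contents are valid.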

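Running inference with Keras

Before converting to an ONNX model, let's check the inference results with Keras.
Here I run inference on two images, one solid black and one solid white.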

predict_keras.py
#!/usr/bin/env python3

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("efficientnet-b0")

images = np.array(
    [
        np.zeros((224, 224, 3), dtype=np.float32),
        np.ones((224, 224, 3), dtype=np.float32),
    ]
)

results = model.predict(images)
print(results)

An example run is shown below. Note that the fully connected layer is initialized with random values, so the results change every time the model is saved.

$ ./predict_keras.py
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
[[0.5295639]
 [0.5148043]]
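
Solid black and solid white images make an easy smoke test, but they exercise only two points of the input space. As a rough sketch, a few seeded random images could be added to the same batch so that the Keras and ONNX runs see identical, reproducible inputs. The function name make_test_batch and the assumption that pixel values lie in [0, 1] are mine for illustration and are not taken from the scripts above.

```python
import numpy as np


def make_test_batch(num_random=2, seed=0):
    """Black, white, and a few seeded random images as one float32 batch."""
    rng = np.random.default_rng(seed)
    black = np.zeros((1, 224, 224, 3), dtype=np.float32)
    white = np.ones((1, 224, 224, 3), dtype=np.float32)
    # Assumption: the model expects pixel values in [0, 1].
    noise = rng.random((num_random, 224, 224, 3), dtype=np.float32)
    return np.concatenate([black, white, noise], axis=0)


batch = make_test_batch()
print(batch.shape, batch.dtype)
```

Because the seed is fixed, the same batch can be regenerated in both the Keras script and the ONNX script without saving it to disk.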

Converting the model

I convert the model with tf2onnx.

convert.sh
#!/bin/bash
python3 -m tf2onnx.convert --saved-model efficientnet-b0 --output efficientnet-b0.onnx

An example run is shown below. Several warnings are printed, but I ignore them here.

$ ./convert.sh
/usr/lib/python3.8/runpy.py:127: RuntimeWarning: 'tf2onnx.convert' found in sys.modules after import of package 'tf2onnx', but prior to execution of 'tf2onnx.convert'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
2021-04-23 00:14:01,125 - WARNING - '--tag' not specified for saved_model. Using --tag serve
2021-04-23 00:14:06,873 - INFO - Signatures found in model: [serving_default].
2021-04-23 00:14:06,873 - WARNING - '--signature_def' not specified, using first signature: serving_default
2021-04-23 00:14:06,873 - INFO - Output names: ['dense']
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tf2onnx/tf_loader.py:557: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2021-04-23 00:14:09,553 - WARNING - From /usr/local/lib/python3.8/dist-packages/tf2onnx/tf_loader.py:557: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2021-04-23 00:14:10,275 - INFO - Using tensorflow=2.4.1, onnx=1.9.0, tf2onnx=1.8.4/cd55bf
2021-04-23 00:14:10,275 - INFO - Using opset <onnx, 9>
2021-04-23 00:14:11,053 - INFO - Computed 0 values for constant folding
2021-04-23 00:14:13,763 - INFO - Optimizing ONNX model
2021-04-23 00:14:17,027 - INFO - After optimization: BatchNormalization -42 (49->7), Const -240 (442->202), Identity -926 (926->0), Squeeze -16 (16->0), Transpose -275 (276->1), Unsqueeze -64 (64->0)
2021-04-23 00:14:17,056 - INFO -
2021-04-23 00:14:17,057 - INFO - Successfully converted TensorFlow model efficientnet-b0 to ONNX
2021-04-23 00:14:17,057 - INFO - Model inputs: ['keras_layer_input:0']
2021-04-23 00:14:17,057 - INFO - Model outputs: ['dense']
2021-04-23 00:14:17,057 - INFO - ONNX model is saved at efficientnet-b0.onnx
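
Two of the warnings in the log above say that --tag and --signature_def were not specified and that the defaults serve and serving_default were used. The sketch below shows one way to pass them explicitly when driving the conversion from Python. The function names are hypothetical, and the optional opset argument is only an illustration (the log shows that opset 9 was used by default in this environment).

```python
import shlex
import subprocess
import sys


def build_convert_command(saved_model_dir, output_path, opset=None):
    """Build the tf2onnx command line with the tag and signature spelled out."""
    command = [
        sys.executable, "-m", "tf2onnx.convert",
        "--saved-model", saved_model_dir,
        "--output", output_path,
        "--tag", "serve",
        "--signature_def", "serving_default",
    ]
    if opset is not None:
        command += ["--opset", str(opset)]
    return command


def run_conversion(saved_model_dir, output_path, opset=None):
    """Run the conversion; raises CalledProcessError if tf2onnx fails."""
    command = build_convert_command(saved_model_dir, output_path, opset)
    subprocess.run(command, check=True)


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(build_convert_command("efficientnet-b0", "efficientnet-b0.onnx")))
```

Calling run_conversion("efficientnet-b0", "efficientnet-b0.onnx") should behave like convert.sh, minus those two warnings.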

Running inference with ONNX

Let's run inference with the converted ONNX model.

predict_onnx.py
#!/usr/bin/env python3

import numpy as np
import onnxruntime

session = onnxruntime.InferenceSession("efficientnet-b0.onnx")

images = np.array(
    [
        np.zeros((224, 224, 3), dtype=np.float32),
        np.ones((224, 224, 3), dtype=np.float32),
    ]
)

# The input and output names are the ones reported in the tf2onnx conversion log.
results = session.run(["dense"], {"keras_layer_input:0": images})
print(results)

An example run is shown below.

$ ./predict_onnx.py
[array([[0.52956396],
       [0.5148051 ]], dtype=float32)]
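
The script above hardcodes the names keras_layer_input:0 and dense, which were copied from the conversion log. As an alternative sketch, the names can be read from the session itself through get_inputs() and get_outputs(). The helper name run_all_outputs is hypothetical, and it assumes a model with exactly one input, as is the case here.

```python
def run_all_outputs(session, batch):
    """Run the model, reading the input and output names from the session."""
    inputs = session.get_inputs()
    if len(inputs) != 1:
        raise ValueError(f"expected exactly one input, found {len(inputs)}")
    output_names = [output.name for output in session.get_outputs()]
    values = session.run(output_names, {inputs[0].name: batch})
    # Pair each output name with its array so callers need not rely on order.
    return dict(zip(output_names, values))
```

With the model from this article, run_all_outputs(onnxruntime.InferenceSession("efficientnet-b0.onnx"), images)["dense"] should give the same array as the script above.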

The results do not match the Keras model's results exactly, but they agree to the fifth decimal place, so the conversion looks fine.
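
The comparison above was done by eye. A small sketch of the same check with NumPy follows, using the values copied from the two example runs. The absolute tolerance of 1e-5 is an assumption chosen to match "agree to the fifth decimal place"; a different tolerance may be more appropriate for other models.

```python
import numpy as np

# Values copied from the two example runs in this article.
keras_results = np.array([[0.5295639], [0.5148043]], dtype=np.float32)
onnx_results = np.array([[0.52956396], [0.5148051]], dtype=np.float32)

max_abs_diff = float(np.max(np.abs(keras_results - onnx_results)))
print(f"max abs diff: {max_abs_diff:.1e}")

# Raises AssertionError if any element differs by more than the tolerance.
np.testing.assert_allclose(onnx_results, keras_results, rtol=0, atol=1e-5)
```

In a real check, the two arrays would come from model.predict and session.run on the same input batch rather than being typed in.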

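Converting with keras2onnx → failure

Next, I try converting with keras2onnx.

The tf2onnx conversion worked on the first try, but the keras2onnx conversion gave me quite a bit of trouble.
The points to watch out for are as follows:

- keras2onnx does not support TensorFlow v2.4; it only supports up to TensorFlow v2.2 (v1.7.0, as of April 23, 2021).
- A model generated with TensorFlow v2.4 could not be loaded with TensorFlow v2.2, so training also had to be done with v2.2.
- To use TensorFlow v2.2, I had to use CUDA 10.1/cuDNN 7 instead of CUDA 11.0/cuDNN 8.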
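
The version constraint in the first point is easy to trip over, so a guard at the top of a conversion script may save some time. This is only an illustrative sketch: the limit of 2.2 is taken from this article rather than from keras2onnx itself, the function names are hypothetical, and the parser handles plain numeric versions only.

```python
def parse_version(text):
    """Turn '2.4.1' into (2, 4, 1); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


# Assumption taken from this article: keras2onnx 1.7.0 works up to TensorFlow 2.2.x.
MAX_TF_SERIES_FOR_KERAS2ONNX = (2, 2)


def keras2onnx_supports(tf_version):
    """Return True if tf_version is in the 2.2 series or older."""
    return parse_version(tf_version)[:2] <= MAX_TF_SERIES_FOR_KERAS2ONNX


print(keras2onnx_supports("2.2.1"))  # True
print(keras2onnx_supports("2.4.1"))  # False
```

In a script, the argument would typically be tf.__version__.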

Building the Docker image

The Dockerfile and requirements.txt I used are shown below.
The CUDA and TensorFlow versions, among other things, differ from the tf2onnx case.

Dockerfile
FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
RUN apt-get update \
  && DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends \
    build-essential \
    ca-certificates \
    python3-dev \
    python3-pip \
    python3-setuptools \
    tzdata \
  && rm --recursive --force /var/lib/apt/lists/*
RUN python3 -m pip install --upgrade pip setuptools
WORKDIR /opt/app
COPY requirements.txt ./
RUN python3 -m pip install --requirement requirements.txt
ENV LANG=C.UTF-8
ENV TZ=Asia/Tokyo

requirements.txt

keras2onnx==1.7.0
onnxruntime==1.7.0
tensorflow-hub==0.11.0
tensorflow==2.2.1

Generating the model

The basic procedure is the same as for tf2onnx, but for some reason an error occurred when saving the model unless fit or predict had been called first.

save_model.py
#!/usr/bin/env python3

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

model = tf.keras.Sequential(
    [
        hub.KerasLayer(
            "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1",
            trainable=False,
        ),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]
)
model.build([None, 224, 224, 3])
model.summary()
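# Workaround: in this environment, model.save raised an error unless
# fit or predict had been called once beforehand.
model.predict(np.zeros((1, 224, 224, 3), dtype=np.float32))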
model.save("efficientnet-b0")

An example run is shown below.

$ ./save_model.py
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
keras_layer (KerasLayer)     multiple                  4049564
_________________________________________________________________
dense (Dense)                multiple                  1281
=================================================================
Total params: 4,050,845
Trainable params: 1,281
Non-trainable params: 4,049,564
_________________________________________________________________
2021-04-23 00:35:08.379870: W tensorflow/python/util/util.cc:329] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

Running inference with Keras

The source code is the same as in the tf2onnx case, so I omit it.

An example run is shown below.

$ ./predict_keras.py
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
[[0.41439614]
 [0.43379608]]

Converting the model

I convert the model with keras2onnx.

convert.py
#!/usr/bin/env python3

import keras2onnx
import onnx
import tensorflow as tf

model = tf.keras.models.load_model("efficientnet-b0")
onnx_model = keras2onnx.convert_keras(model, "efficientnet-b0")
onnx.save_model(onnx_model, "efficientnet-b0.onnx")

An example run is shown below.

$ ./convert.py
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
tf executing eager_mode: True
tf.keras model eager_mode: False
2021-04-23 00:41:39.446674: W tensorflow/python/util/util.cc:329] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
WARN: No corresponding ONNX op matches the tf.op node sequential/keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/tf_op_layer_BroadcastTo_1/PartitionedCall/BroadcastTo_1 of type BroadcastTo
      The generated ONNX model needs run with the custom op supports.
The ONNX operator number change on the optimization: 4007 -> 492

Running inference with ONNX

The source code is the same as in the tf2onnx case, so I omit it.

An example run is shown below.

$ ./predict_onnx.py
Traceback (most recent call last):
  File "./predict_onnx.py", line 6, in <module>
    session = onnxruntime.InferenceSession("efficientnet-b0.onnx")
  File "/usr/local/lib/python3.6/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 280, in __init__
    self._create_inference_session(providers, provider_options)
  File "/usr/local/lib/python3.6/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 307, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from efficientnet-b0.onnx failed:Fatal error: BroadcastTo is not a registered function/op

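...and it ended in an error.
As the message printed during conversion also says, I think the cause is TensorFlow's BroadcastTo operator, which keras2onnx could not map to an ONNX operator and which ONNX Runtime therefore does not recognize.
Adding a custom operator might make it work, but since the tf2onnx conversion had already succeeded, I stopped investigating.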
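
A failure like this only shows up when the model is loaded, so it may help to scan a converted file for operators that have no ONNX schema before handing it to ONNX Runtime. The following is a rough sketch under stated assumptions: the function names are hypothetical, and it checks the operator registry of the installed onnx package via onnx.defs.has, which is not the same thing as the set of kernels ONNX Runtime implements.

```python
from collections import Counter


def count_unknown_ops(nodes, has_schema):
    """Count (domain, op_type) pairs for which has_schema returns False."""
    unknown = Counter()
    for node in nodes:
        if not has_schema(node.op_type, node.domain):
            unknown[(node.domain, node.op_type)] += 1
    return unknown


def report_unknown_ops(model_path):
    """Load an ONNX file and print operators that have no ONNX schema."""
    import onnx  # imported lazily so the helper above works without onnx

    model = onnx.load(model_path)
    unknown = count_unknown_ops(model.graph.node, onnx.defs.has)
    for (domain, op_type), count in sorted(unknown.items()):
        print(f"{op_type} (domain={domain!r}): {count} node(s)")
    return unknown
```

For the model produced by keras2onnx above, report_unknown_ops("efficientnet-b0.onnx") would be expected to list BroadcastTo, though I have not confirmed which domain keras2onnx assigns to that node.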

Conclusion

Use tf2onnx.
