
Setting up an environment to run wonnx (Python edition)

Published on 2024/03/31

Docker

The final result is available in the wonnx-nvidia-docker repository.
The graphics API used is Vulkan.
I plan to add a Rust environment later.

What is wonnx

wonnx is an ONNX runtime built on Rust's wgpu crate as its backend.
wonnx runs cross-platform.
This is because its backend, the wgpu crate, defines a set of graphics APIs designed primarily for cross-platform use.

https://github.com/webonnx/wonnx

https://github.com/gfx-rs/wgpu

Host PC specs

The environment was built on a host PC with the following specs.

CPU: AMD Ryzen 9 5900X
GPU: GeForce RTX 3070 Ti

Creating the Dockerfile

First, add the required packages and software.
The required software falls into three groups: for Vulkan, for wonnx-py, and for PyTorch.
PyTorch is included because the wonnx sample code uses it.

FROM nvidia/cuda:12.1.0-devel-ubuntu22.04
# FROM ubuntu:22.04

# Needed to share GPU
ENV NVIDIA_DRIVER_CAPABILITIES=all
ENV NVIDIA_VISIBLE_DEVICES=all
ENV XDG_RUNTIME_DIR=/tmp/runtime-xdg_runtime_dir

# Install vulkan tools
# DEBIAN_FRONTEND must prefix `apt-get install` to suppress interactive prompts
RUN apt-get update &&\
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    pciutils \
    vulkan-tools \
    mesa-utils \
    libglib2.0-0

# Install python and packages
RUN apt-get update &&\
    apt-get install -y python3-pip
RUN pip3 install \
    onnx==1.16.0 \
    wonnx==0.5.1 \
    opencv-python==4.9.0.80 \
    autopep8
RUN pip3 install \
    torch \
    torchvision \
    torchaudio \
    --index-url https://download.pytorch.org/whl/cu118

FROM nvidia/cuda:12.1.0-devel-ubuntu22.04

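The base image does not have to be a CUDA image; a plain FROM ubuntu:22.04 also works with Vulkan.
Since PyTorch is installed here, I figured it would be more convenient for PyTorch to be able to use the GPU as well, so a CUDA-based image is used.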
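To confirm that PyTorch can actually reach the GPU inside the container, a small check along the following lines may help. This is a minimal sketch, not part of the original repository; the helper name `describe_torch_cuda` is hypothetical, and it only uses `torch.cuda.is_available()`, `torch.cuda.get_device_name()` and `torch.version.cuda`.

```python
def describe_torch_cuda() -> str:
    """Summarize whether PyTorch can reach a CUDA device (hypothetical helper)."""
    try:
        import torch  # installed by the Dockerfile above
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA not available"
    # Device 0 is assumed to be the GPU shared with the container.
    name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__}: CUDA {torch.version.cuda} on {name}"


if __name__ == "__main__":
    print(describe_torch_cuda())
```

With a plain ubuntu:22.04 base image the same script should still run, but it would be expected to report that CUDA is not available.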

ENV NVIDIA_DRIVER_CAPABILITIES=all
ENV NVIDIA_VISIBLE_DEVICES=all
ENV XDG_RUNTIME_DIR=/tmp/runtime-xdg_runtime_dir

All three of these environment variables are required.
In particular, without the first two the GPU is not recognized.
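One way to check these settings from inside the container is sketched below. It assumes the `vulkaninfo` tool from the `vulkan-tools` package is on the PATH and that it supports the `--summary` flag; the function names are hypothetical and not part of the original repository.

```python
import os
import shutil
import subprocess

# The three variables set in the Dockerfile.
REQUIRED_ENV = (
    "NVIDIA_DRIVER_CAPABILITIES",
    "NVIDIA_VISIBLE_DEVICES",
    "XDG_RUNTIME_DIR",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


def vulkan_summary():
    """Return the output of `vulkaninfo --summary`, or None if the tool is absent."""
    exe = shutil.which("vulkaninfo")
    if exe is None:
        return None
    proc = subprocess.run(
        [exe, "--summary"], capture_output=True, text=True, check=False
    )
    return proc.stdout


if __name__ == "__main__":
    print("missing variables:", missing_env_vars())
    summary = vulkan_summary()
    print(summary if summary is not None else "vulkaninfo not found")
```

If the GPU is shared correctly, the summary should list the NVIDIA device among the Vulkan physical devices.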

devcontainer.json looks like this.

{
    "name": "NVIDIA Docker WONNX dev",
    "build": {
        "dockerfile": "Dockerfile"
    },
    "runArgs": [
        "--runtime=nvidia",
        "--gpus",
        "all"
    ],
    "remoteUser": "vscode",
    "customizations": {
        // Configure properties specific to VS Code.
        "vscode": {
            // Set *default* container specific settings.json values on container create.
            "settings": {
                "terminal.integrated.defaultProfile.linux": "bash",
                "editor.formatOnSave": true
            },
            "extensions": [
                "ms-python.python"
            ]
        }
    },
    "features": {
        "common": {
            "username": "vscode",
            "uid": "automatic",
            "gid": "automatic",
            "installZsh": true,
            "installOhMyZsh": false,
            "upgradePackages": false,
            "nonFreePackages": false
        }
    }
}

    "runArgs": [
        "--runtime=nvidia",
        ...
    ]

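Without this setting, too, the GPU is not recognized.
Being able to pass the --gpus all option suggests that NVIDIA Docker is already in use, yet the runtime apparently still has to be specified explicitly.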
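For readers not using a dev container, the same `runArgs` correspond to flags on the `docker run` command line. The sketch below only assembles that command; the image tag `wonnx-dev` is a placeholder, and the helper name is hypothetical.

```python
import shlex


def docker_run_command(image, command, runtime="nvidia", gpus="all"):
    """Build a `docker run` argument list mirroring the runArgs above."""
    args = [
        "docker", "run", "--rm",
        f"--runtime={runtime}",  # explicit runtime, as in devcontainer.json
        "--gpus", gpus,
        image,
    ]
    return args + list(command)


if __name__ == "__main__":
    # "wonnx-dev" is a placeholder tag for an image built from the Dockerfile.
    cmd = docker_run_command("wonnx-dev", ["vulkaninfo", "--summary"])
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell on the host, or the list can be passed to `subprocess.run` directly.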

Running the sample code

Build the Docker container with the Dockerfile and devcontainer.json above, and the following sample code can then be run.

import onnx
import wonnx
from onnx import helper
from onnx import TensorProto
from torchvision import transforms
import numpy as np
import cv2
import os
import time

basedir = os.path.dirname(os.path.realpath(__file__))

def test_squeezenet():
    image = cv2.imread(os.path.join(basedir, "../../data/images/pelican.jpeg"))
    rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # apply transforms to the input image
    transform = transforms.Compose(
        [
            transforms.ToPILImage(),
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(
                mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
            ),
        ]
    )
    input_tensor = transform(rgb_image)

    # Load the ONNX model into a wonnx session
    session = wonnx.Session.from_path(
        os.path.join(basedir, "../../data/models/opt-squeeze.onnx")
    )
    inputs = {"data": input_tensor.flatten().tolist()}
    
    start = time.time()
    result = session.run(inputs)["squeezenet0_flatten0_reshape0"]
    end = time.time()

    print(f"result 144={result[144]} argmax={np.argmax(result)} score_max={np.max(result)} time={(end - start) * 1000}ms")

    assert (
        np.argmax(result) == 144
    ), "Squeezenet does not work"

if __name__ == '__main__':
    test_squeezenet()

The output looks like this.

$ python3 main.py
result 144=22.830949783325195 argmax=144 score_max=22.830949783325195 time=4.0766277313232ms
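The time reported here comes from a single call measured with `time.time()`, which can be noisy. A more stable measurement could repeat the call and look at the median, as in the sketch below. The helper `time_call` is hypothetical, and the workload shown is a stand-in; with the sample code it would be `lambda: session.run(inputs)`.

```python
import statistics
import time


def time_call(fn, repeats=10, warmup=1):
    """Run fn() `warmup` times untimed, then `repeats` times timed.

    Returns the timed durations in milliseconds.
    """
    for _ in range(warmup):
        fn()
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000)
    return durations


if __name__ == "__main__":
    # Stand-in workload for illustration only.
    ms = time_call(lambda: sum(range(100_000)))
    print(f"median={statistics.median(ms):.3f}ms min={min(ms):.3f}ms")
```

`time.perf_counter()` is used instead of `time.time()` because it is a monotonic, high-resolution clock intended for measuring short intervals.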
