Using Stable Diffusion Web UI on an M1 Mac mini

I am mostly following this article. For the environment setup I use nix develop.

flake.nix

{
  inputs.flake-utils.url = "github:numtide/flake-utils";

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let pkgs = nixpkgs.legacyPackages.${system}; in
      {
        devShell = pkgs.mkShell {
          buildInputs = with pkgs; [
            git
            protobuf
            python310
            python310Packages.torch
            python310Packages.torchvision
            rustup
            wget
          ];
        };
      }
    );
}

kino-ma

stable-diffusion-webui/webui-macos-env.sh

#!/bin/bash
####################################################################
#                          macOS defaults                          #
# Please modify webui-user.sh to change these instead of this file #
####################################################################

if [[ -x "$(command -v python3.10)" ]]
then
    python_cmd="python3.10"
fi

#export install_dir="$HOME/docs/ai/stable-diffusion"
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate --no-half"
#export TORCH_COMMAND="pip install torch==2.0.1 torchvision==0.15.2"
export TORCH_COMMAND="true"
export PYTORCH_ENABLE_MPS_FALLBACK=1

####################################################################

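If install_dir is not set, it seems to default to the path of the git repository, so I comment it out.
The packages that pip would install already come from the flake, so I comment out the original TORCH_COMMAND. In case some fallback command runs when it is unset, I set it to the true command instead.
Finally, I added --no-half as the article suggests.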

kino-ma

bash stable-diffusion-webui/webui.sh

RuntimeError: Failed to import transformers.models.clip.modeling_clip because of the following error (look up to see its traceback):
No module named 'torch._C._distributed_c10d'; 'torch._C' is not a package

The error also reproduces in the Python REPL. What is going on here?

bash-5.2$ python3
Python 3.10.13 (main, Aug 24 2023, 12:59:26) [Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch._C
>>> import torch._C._distributed_c10d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'torch._C._distributed_c10d'; 'torch._C' is not a package

kino-ma

bash-5.2$ python3
Python 3.10.13 (main, Aug 24 2023, 12:59:26) [Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.distributed.is_available()
False

Apparently torch has to be built with distributed support?

Rebuilding it reportedly takes more than an hour, so I would like to avoid that.
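
The two REPL checks above can be folded into one small script that is quick to rerun whenever the torch build changes. This is only a sketch: the function name describe_torch_build is made up for this note, and it uses nothing beyond torch.distributed.is_available() and torch.backends.mps.is_available().

```python
def describe_torch_build() -> dict:
    """Report whether torch is importable and which optional features it has."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment at all.
        return {"installed": False}

    return {
        "installed": True,
        "version": torch.__version__,
        # False here matches the failing Nix-provided build seen above.
        "distributed": torch.distributed.is_available(),
        # Whether the Metal (MPS) backend can be used on this machine.
        "mps": torch.backends.mps.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If distributed is reported as False, the transformers import error above is the likely outcome.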

kino-ma

I suspect the torch that comes from Nix is built without distributed support, so I am going back to Poetry.

flake.nix

{
  inputs.flake-utils.url = "github:numtide/flake-utils";

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let pkgs = nixpkgs.legacyPackages.${system}; in
      {
        devShell = pkgs.mkShell {
          buildInputs = with pkgs; [
            git
            poetry
            protobuf
            python310
            rustup
            wget
          ];
        };
      }
    );
}

pyproject.toml

[tool.poetry]
name = "stable-diffusion"
version = "0.1.0"
description = ""
authors = ["kino-ma <ma@kino.ma>"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.10"
torch = "^2.1.2"
torchvision = "^0.16.2"


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

kino-ma

It looks like TORCH_COMMAND is not run as a standalone command. I will try removing it.

Command: "/Users/kino-ma/Library/Caches/pypoetry/virtualenvs/stable-diffusion-X0gitCx1-py3.10/bin/python3.10" -m true

#export TORCH_COMMAND="true"

The log suggests that a pip install is running, but it is inside a venv, so I will let it be.

  Downloading torch-2.0.1-cp310-none-macosx_11_0_arm64.whl (55.8 MB)
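
Judging only from the logged command line above, the launcher seems to place TORCH_COMMAND after the interpreter and -m, which would explain why "true" failed. The sketch below reproduces that string construction. The function name build_install_command is hypothetical and this is not the web UI's actual code.

```python
def build_install_command(python: str, torch_command: str) -> str:
    """Mimic the command shape seen in the log: "<python>" -m <TORCH_COMMAND>."""
    # Assumption inferred from the log: the value is appended after `-m`,
    # so it has to start with a Python module name (e.g. `pip`), not a
    # shell command such as `true`.
    return f'"{python}" -m {torch_command}'


if __name__ == "__main__":
    # Setting TORCH_COMMAND="true" ends up asking Python for a module named "true".
    print(build_install_command("python3.10", "true"))
    # The commented-out default from webui-macos-env.sh becomes a normal pip call.
    print(build_install_command(
        "python3.10", "pip install torch==2.0.1 torchvision==0.15.2"))
```

Under that assumption, leaving TORCH_COMMAND unset lets the default pip install run, which matches the torch-2.0.1 download in the log.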

kino-ma

The area around the eyes looks a little off, but an image was generated properly.
It takes about 40 seconds per image. Just about usable, I suppose?

kino-ma

According to Activity Monitor the GPU is already at 100%, so there is probably not much room to improve the speed.

kino-ma

Following this article, I go ahead with installing SDXL.

The checkpoint files are huge (about 13 GB in total), so this will take a while.

kino-ma

While waiting, I install ADetailer, an extension that is supposed to clean up faces and similar details.

In the web UI, open the Extensions tab and paste the following into Install from URL.
https://github.com/Bing-su/adetailer

kino-ma

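The server now crashes if I click Generate right after startup.

Following an issue that reported this, I added --loglevel DEBUG when starting the server and waited until a log line like the following appeared before generating. That fixed it.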

Model loaded in 19.6s (load weights from disk: 0.3s, create model: 0.9s, apply weights to model: 17.7s, move model to device: 0.2s, calculate empty prompt: 0.3s).
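
Waiting for that line can be automated by scanning the server output. The following is a minimal sketch that assumes the message always starts with "Model loaded in" followed by a number of seconds, as in the single sample above. The helper name model_load_seconds is made up for this note.

```python
import re
from typing import Optional

# Pattern based on the one sample line in this post; other versions of the
# web UI may word the message differently.
MODEL_LOADED = re.compile(r"Model loaded in (\d+(?:\.\d+)?)s")


def model_load_seconds(line: str) -> Optional[float]:
    """Return the reported load time if the line says the model is loaded."""
    match = MODEL_LOADED.search(line)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    sample = "Model loaded in 19.6s (load weights from disk: 0.3s, create model: 0.9s)."
    print(model_load_seconds(sample))
```

Feeding each line of the server log through this function and generating only after it returns a number gives the same effect as watching the log by hand.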

kino-ma

The results do look slightly cleaner.

kino-ma

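The SDXL checkpoint files finished downloading, so I placed them in stable-diffusion-webui/models, but they did not show up in the checkpoint list in the UI.

Looking around, I found another stable-diffusion-webui directory nested under stable-diffusion-webui/, and the extension I had just installed was in that nested one.

After moving the files to stable-diffusion-webui/stable-diffusion-webui/models, they were loaded.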
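
When it is unclear which models directory the UI actually reads, listing where the checkpoint files sit can help. This is a small sketch using only pathlib. The function name checkpoints_by_directory is hypothetical, and it assumes the checkpoints use the .safetensors extension.

```python
from collections import defaultdict
from pathlib import Path


def checkpoints_by_directory(root: Path) -> dict:
    """Group every *.safetensors file under root by its parent directory."""
    found = defaultdict(list)
    # rglob walks the tree recursively, so nested clones are included too.
    for path in sorted(root.rglob("*.safetensors")):
        found[path.parent].append(path.name)
    return dict(found)
```

Calling checkpoints_by_directory(Path("stable-diffusion-webui")) and printing the keys shows whether files ended up in the outer or the nested directory.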

kino-ma

Setting SD VAE to sdxl_vae.safetensors takes a very long time, and once macOS crashed from running out of memory. I closed memory-hungry processes such as Chrome and tried again.

kino-ma

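Generating an image with the SDXL checkpoint shows an estimated time of around 15 minutes.
The actual time often exceeds the estimate (once it took more than 40 minutes).

On top of that, on a 16 GB M1 Mac an out-of-memory error is shown and generation is aborted.
I tried various combinations, with and without the Refiner and the VAE, but none worked, so it may not run without a machine with more memory, such as an M1 Pro or better.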

    RuntimeError: MPS backend out of memory (MPS allocated: 13.93 GB, other allocations: 3.98 GB, max allowed: 18.13 GB). Tried to allocate 256.00 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

However, there are reports of the same problem on a machine with 96 GB, so the physical amount of RAM may not be what matters.

As the error message suggests, I will try setting PYTORCH_MPS_HIGH_WATERMARK_RATIO.
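
The numbers in the error message can be checked by hand: 13.93 GB + 3.98 GB + 256 MB comes to about 18.16 GB, just over the 18.13 GB limit. The sketch below does the same arithmetic from the message text. The function name parse_mps_oom is made up for this note, and the regular expressions are based only on the one message shown above.

```python
import re


def _number(pattern: str, message: str) -> float:
    """Return the first captured number for pattern, or raise if it is absent."""
    match = re.search(pattern, message)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return float(match.group(1))


def parse_mps_oom(message: str) -> dict:
    """Extract sizes in GB from an 'MPS backend out of memory' message."""
    allocated = _number(r"MPS allocated: ([\d.]+) GB", message)
    other = _number(r"other allocations: ([\d.]+) GB", message)
    max_allowed = _number(r"max allowed: ([\d.]+) GB", message)
    requested = _number(r"Tried to allocate ([\d.]+) MB", message) / 1024
    return {
        "needed_gb": allocated + other + requested,
        "max_allowed_gb": max_allowed,
    }


if __name__ == "__main__":
    msg = ("MPS backend out of memory (MPS allocated: 13.93 GB, "
           "other allocations: 3.98 GB, max allowed: 18.13 GB). "
           "Tried to allocate 256.00 MB on private pool.")
    report = parse_mps_oom(msg)
    print(report, report["needed_gb"] > report["max_allowed_gb"])
```

So the failing request was small. The process was already sitting right at the allocator's upper limit.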

kino-ma

Following a comment in another issue, I set PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.7 and added --precision full --no-half to the launch options.

webui-macos-env.sh

export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate --no-half --precision full"
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.7
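
As a rough estimate of what this ratio does, the sketch below rescales the 18.13 GB limit from the earlier error message. It assumes that the default ratio is 1.7, as described in the PyTorch MPS environment variable documentation, and that the limit scales linearly with the ratio. Both are assumptions here, not something measured on this machine, and the function name allowed_gb is made up for this note.

```python
DEFAULT_HIGH_WATERMARK_RATIO = 1.7  # assumed PyTorch default, not verified here


def allowed_gb(reported_max_gb: float, new_ratio: float,
               default_ratio: float = DEFAULT_HIGH_WATERMARK_RATIO) -> float:
    """Estimate the allocation limit for new_ratio from a limit reported under default_ratio."""
    # The limit is assumed to be ratio * (a fixed device-recommended size).
    recommended_gb = reported_max_gb / default_ratio
    return recommended_gb * new_ratio


if __name__ == "__main__":
    # 18.13 GB is the "max allowed" value from the error message.
    print(round(allowed_gb(18.13, 0.7), 2))
```

Under these assumptions, 0.7 lowers the limit to roughly 7.5 GB rather than raising it, so the speedup may come from staying out of swap rather than from having more memory available.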

kino-ma

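With the settings above, a run without the VAE and with the Refiner completed successfully. (The run time even dropped sharply, to 3 min 20 s.)

The image came out in a strange style, which may just be the intended look, but it ran, so that is good enough for now.
Next I will enable the VAE and try again.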

kino-ma

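The style came out strange again, but generation also worked with the VAE enabled. The run time was again short, a little over three minutes.

Maybe the (negative) prompt is causing the odd results?
I will experiment a bit more.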