
Getting Stable Diffusion running on an RTX 5060 Ti was a hassle

ぽぽこんぽぽこん

Problem

Stock PyTorch builds don't support the RTX 5060 Ti (sm_120), so the following error occurs:

RuntimeError: CUDA error: no kernel image is available for execution on the device
NVIDIA GeForce RTX 5060 Ti with CUDA capability sm_120 is not compatible
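The error boils down to the installed wheel's compiled arch list not covering the card's compute capability. A minimal pure-Python sketch of that check (the "stable" list here is illustrative; the nightly list matches the `torch.cuda.get_arch_list()` output shown further down in this scrap):

```python
# Does a wheel's compiled arch list cover the card's compute capability?
# stable_archs is illustrative; nightly_cu128_archs matches the
# torch.cuda.get_arch_list() output from the cu128 nightly used below.
stable_archs = ["sm_75", "sm_80", "sm_86", "sm_90"]
nightly_cu128_archs = ["sm_75", "sm_80", "sm_86", "sm_90",
                       "sm_100", "sm_120", "compute_120"]

def has_kernels_for(arch_list, capability):
    """capability is (major, minor), e.g. (12, 0) for the RTX 5060 Ti."""
    return f"sm_{capability[0]}{capability[1]}" in arch_list

print(has_kernels_for(stable_archs, (12, 0)))         # False: "no kernel image" error
print(has_kernels_for(nightly_cu128_archs, (12, 0)))  # True
```

So the fix below is simply: use a build whose arch list actually includes sm_120.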

1. Add environment variables to docker-compose.yml

First, clone the project:

git clone https://github.com/AbdBarho/stable-diffusion-webui-docker.git
cd stable-diffusion-webui-docker

The variables below are just in case; I honestly don't know if they're needed, but Claude said to add them.

services:
  auto:
    environment:
      - CUDA_LAUNCH_BLOCKING=1
      - TORCH_USE_CUDA_DSA=1
      - PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
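For context, PYTORCH_CUDA_ALLOC_CONF takes a comma-separated list of key:value options; max_split_size_mb:128 caps how large a cached block the allocator will split, which can reduce fragmentation. A rough sketch of that option format (not PyTorch's actual parser):

```python
# PYTORCH_CUDA_ALLOC_CONF is a comma-separated list of key:value pairs.
# Sketch of how such a value breaks down (not PyTorch's real parser):
def parse_alloc_conf(value):
    opts = {}
    for pair in value.split(","):
        key, _, val = pair.partition(":")
        opts[key.strip()] = val.strip()
    return opts

print(parse_alloc_conf("max_split_size_mb:128"))
# {'max_split_size_mb': '128'}
```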

2. Edit the Dockerfile

Replace services/AUTOMATIC1111/Dockerfile with the following:

FROM alpine/git:2.36.2 as download

COPY clone.sh /clone.sh

RUN . /clone.sh stable-diffusion-webui-assets https://github.com/AUTOMATIC1111/stable-diffusion-webui-assets.git 6f7db241d2f8ba7457bac5ca9753331f0c266917

RUN . /clone.sh stable-diffusion-stability-ai https://github.com/Stability-AI/stablediffusion.git cf1d67a6fd5ea1aa600c4df58e5b47da45f6bdbf \
  && rm -rf assets data/**/*.png data/**/*.jpg data/**/*.gif

RUN . /clone.sh BLIP https://github.com/salesforce/BLIP.git 48211a1594f1321b00f14c9f7a5b4813144b2fb9
RUN . /clone.sh k-diffusion https://github.com/crowsonkb/k-diffusion.git ab527a9a6d347f364e3d185ba6d714e22d80cb3c
RUN . /clone.sh clip-interrogator https://github.com/pharmapsychotic/clip-interrogator 2cf03aaf6e704197fd0dae7c7f96aa59cf1b11c9
RUN . /clone.sh generative-models https://github.com/Stability-AI/generative-models 45c443b316737a4ab6e40413d7794a7f5657c19f
RUN . /clone.sh stable-diffusion-webui-assets https://github.com/AUTOMATIC1111/stable-diffusion-webui-assets 6f7db241d2f8ba7457bac5ca9753331f0c266917

FROM nvidia/cuda:12.8.0-devel-ubuntu22.04

ENV DEBIAN_FRONTEND=noninteractive PIP_PREFER_BINARY=1

RUN --mount=type=cache,target=/var/cache/apt \
  apt-get update && \
  # we need those
  apt-get install -y fonts-dejavu-core rsync git jq moreutils aria2 python3 python3-pip python3-dev \
  # extensions needs those
  ffmpeg libglfw3-dev libgles2-mesa-dev pkg-config libcairo2 libcairo2-dev build-essential && \
  # Create python symlink for compatibility
  ln -s /usr/bin/python3 /usr/bin/python

WORKDIR /
RUN --mount=type=cache,target=/root/.cache/pip \
  git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git && \
  cd stable-diffusion-webui && \
  git reset --hard v1.9.4

# Install PyTorch nightly with CUDA 12.8 support
RUN --mount=type=cache,target=/root/.cache/pip \
  python3 -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128

# Install other requirements
RUN --mount=type=cache,target=/root/.cache/pip \
  cd stable-diffusion-webui && \
  pip install -r requirements_versions.txt

ENV ROOT=/stable-diffusion-webui

COPY --from=download /repositories/ ${ROOT}/repositories/
RUN mkdir ${ROOT}/interrogate && cp ${ROOT}/repositories/clip-interrogator/clip_interrogator/data/* ${ROOT}/interrogate

RUN --mount=type=cache,target=/root/.cache/pip \
  pip install pyngrok \
  git+https://github.com/TencentARC/GFPGAN.git@8d2447a2d918f8eba5a4a01463fd48e45126a379 \
  git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1 \
  git+https://github.com/mlfoundations/open_clip.git@v2.20.0

# Install xformers and fix scikit-image compatibility
RUN --mount=type=cache,target=/root/.cache/pip \
  pip install xformers --upgrade && \
  pip install scikit-image --upgrade --no-cache-dir

# there seems to be a memory leak (or maybe just memory not being freed fast enough) that is fixed by this version of malloc
# maybe move this up to the dependencies list.
RUN apt-get -y install libgoogle-perftools-dev && apt-get clean
ENV LD_PRELOAD=libtcmalloc.so

COPY . /docker

RUN \
  # mv ${ROOT}/style.css ${ROOT}/user.css && \
  # one of the ugliest hacks I ever wrote - find gradio location dynamically \
  GRADIO_PATH=$(python3 -c "import gradio; import os; print(os.path.dirname(gradio.__file__))") && \
  if [ -f "$GRADIO_PATH/routes.py" ]; then \
    sed -i 's/in_app_dir = .*/in_app_dir = True/g' "$GRADIO_PATH/routes.py"; \
  else \
    echo "Warning: gradio routes.py not found, skipping sed command"; \
  fi && \
  git config --global --add safe.directory '*'

WORKDIR ${ROOT}
ENV NVIDIA_VISIBLE_DEVICES=all
ENV CLI_ARGS=""
EXPOSE 7860
ENTRYPOINT ["/docker/entrypoint.sh"]
CMD python -u webui.py --listen --port 7860 ${CLI_ARGS}

3. Build and launch commands

Download the models:
docker compose --profile download up --build

Start the WebUI:
docker compose --profile auto up --build

4. Troubleshooting

If errors occur, run the following inside the container.
Honestly, the errors just kept coming, so I ended up fixing things directly in the container.
I thought the Dockerfile was supposed to take care of all this...

# Enter the container
docker compose exec auto bash

# Reset the Python environment
python3 -m pip install --upgrade pip
python3 -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
python3 -m pip install open-clip-torch xformers
python3 -m pip install jaraco.text jaraco.collections more-itertools

# Repair setuptools (if needed)
pip install setuptools packaging --upgrade --force-reinstall

exit
docker compose restart auto

5. Verify it works

Check step by step:

# Enter the container
docker compose exec auto bash

# 1. Check the base environment
python3 --version
pip --version

# 2. Check PyTorch
python3 -c "
import torch
print(f'PyTorch: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')
print(f'CUDA version: {torch.version.cuda}')
"

# 3. Check GPU details
python3 -c "
import torch
print(f'GPU name: {torch.cuda.get_device_name(0)}')
print(f'GPU capability: {torch.cuda.get_device_capability(0)}')
print(f'CUDA arch list: {torch.cuda.get_arch_list()}')
"

# 4. Check required packages
python3 -c "
try:
    import open_clip
    print('open-clip: OK')
except ImportError as e:
    print(f'open-clip: ERROR - {e}')

try:
    import xformers
    print('xformers: OK')
except ImportError as e:
    print(f'xformers: ERROR - {e}')

try:
    import jaraco.text
    print('jaraco: OK')
except ImportError as e:
    print(f'jaraco: ERROR - {e}')
"

exit
Output:

Python: 3.10.12
PyTorch: 2.8.0.dev20250606+cu128
CUDA available: True
CUDA version: 12.8
GPU name: NVIDIA GeForce RTX 5060 Ti
GPU capability: (12, 0)
CUDA arch list: ['sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120', 'compute_120']
open-clip: OK
xformers: OK
jaraco: OK

Check GPU usage

# Confirm the GPU is recognized
docker compose exec auto nvidia-smi

# Real-time GPU monitoring (watch during image generation)
nvidia-smi -l 1
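If you ever want to log usage instead of watching it, nvidia-smi can emit machine-readable CSV. A small sketch that parses one such line (the query flags are standard nvidia-smi options; the sample values are made up, not captured from a real run):

```python
# Parse one line of:
#   nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader,nounits
# which looks like "97, 14321" (utilization %, memory in MiB).
def parse_gpu_line(line):
    util, mem = (field.strip() for field in line.split(","))
    return {"utilization_pct": int(util), "memory_used_mib": int(mem)}

sample = "97, 14321"  # hypothetical values, not a captured run
print(parse_gpu_line(sample))
```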

6. Create a backup

I do not want to lose this image now that it finally works.

# Back up the working container
docker commit webui-docker-auto-1 sd-webui-rtx5060ti:working
docker save sd-webui-rtx5060ti:working -o sd-webui-backup.tar

# Restore later (or on another machine) with:
# docker load -i sd-webui-backup.tar

Key points

  1. A CUDA 12.8.0 base image is required
  2. PyTorch nightly (with CUDA 12.8 support) is needed
  3. The installation order of dependencies matters
  4. Environment variables enable CUDA debugging
This scrap was closed 3 months ago.