Running faster-whisper on Windows with CUDA and cuDNN installed via WSL

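At first I tried hard to set everything up on the Windows side, but got nowhere (it kept complaining endlessly that the cuDNN DLLs were not on the path... even though they were...). When I tried building the environment on WSL instead, it went smoothly.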

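People say the version requirements are strict, but in the end I could not really work out what depends on what...
On top of that, the information about faster-whisper online is a confusing mix of old and new...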

Apparently faster-whisper supports CUDA 12 from v1.0.0 onward, and the ctranslate2 it uses internally supports only CUDA 12 (CUDA 11 is not supported).

https://github.com/SYSTRAN/faster-whisper/issues/783
The latest version of faster-whisper (as of October 5, 2024) is v1.0.3.

The Requirements section of the README says:
Python 3.8 or greater

and

GPU execution requires the following NVIDIA libraries to be installed:
cuBLAS for CUDA 12
cuDNN 8 for CUDA 12

https://github.com/SYSTRAN/faster-whisper?tab=readme-ov-file#requirements
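The two README requirements can be checked from Python once the installation steps are done. This is a minimal sketch, not part of faster-whisper: the shared-library names `libcublas.so.12` and `libcudnn.so.8` are assumptions about how cuBLAS for CUDA 12 and cuDNN 8 are usually named on Linux, and `python_ok` and `missing_libs` are hypothetical helper names.

```python
import ctypes
import sys

# Assumed sonames for cuBLAS (CUDA 12) and cuDNN 8 on Linux
REQUIRED_LIBS = ["libcublas.so.12", "libcudnn.so.8"]


def python_ok(version_info=sys.version_info):
    """README requirement: Python 3.8 or greater."""
    return tuple(version_info[:2]) >= (3, 8)


def missing_libs(names=REQUIRED_LIBS):
    """Return the library names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("Missing libraries:", missing_libs() or "none")
```

If a library shows up as missing even though it is installed, the loader search path is the first thing to look at, which is the same kind of problem I hit on the Windows side.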

yuuri

I installed the following.

On the Windows side:
GeForce Game Ready Driver 565.90
https://www.nvidia.com/ja-jp/drivers/details/232871/

On the WSL side (Ubuntu):
CUDA Toolkit 12.6 Update 2
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_network

cuDNN 8.9.7

Work notes

Install the GeForce driver on Windows

↓
Install WSL
Run the following command in PowerShell as administrator

wsl --install

Note: the following error appeared while installing WSL

PS C:\Users\yuuri>  wsl --install
Class not registered
Error code: Wsl/CallMsi/Install/REGDB_E_CLASSNOTREG

Version 2.3.24 was installed, and my guess is that this version was the problem. After I downloaded 2.3.22 from GitHub (I downloaded and ran wsl.2.3.24.0.x64.msi), the installation went through.
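To see which WSL version is actually installed before and after swapping the MSI, the output of `wsl --version` can be parsed. This is a rough sketch meant to run with Python on the Windows side. It rests on two assumptions: that `wsl.exe` writes its output as UTF-16LE, and that the WSL version is the first dotted number in that output. Both function names are hypothetical.

```python
import re
import subprocess


def parse_wsl_version(text):
    """Return the first dotted version number in the text as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+){2,3}", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def installed_wsl_version():
    """Run `wsl.exe --version`; return None if the command is unavailable or fails."""
    try:
        raw = subprocess.run(
            ["wsl.exe", "--version"], capture_output=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    # Assumption: wsl.exe emits UTF-16LE text
    return parse_wsl_version(raw.decode("utf-16-le", errors="ignore"))


if __name__ == "__main__":
    print("WSL version:", installed_wsl_version())
```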

I used the following as a reference.

I set up CUDA inside WSL following Option 1 of the guide below.

Opening the link shows the installation instructions, so follow them to install CUDA.

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda-repo-wsl-ubuntu-12-6-local_12.6.2-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-6-local_12.6.2-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-6-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-6
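After these commands finish, it is worth confirming that the toolkit landed where expected. The sketch below assumes the deb packages install under `/usr/local/cuda*` and that `nvcc` may not be on `PATH` yet, so it looks in both places. `find_nvcc_in` and `run_version` are hypothetical helper names.

```python
import shutil
import subprocess
from pathlib import Path


def find_nvcc_in(cuda_root="/usr/local"):
    """Return the first nvcc found under <cuda_root>/cuda*/bin, or None."""
    for candidate in sorted(Path(cuda_root).glob("cuda*/bin/nvcc")):
        if candidate.is_file():
            return str(candidate)
    return None


def run_version(command):
    """Run a command and return its stdout, or None if it is missing or fails."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout


if __name__ == "__main__":
    # Prefer nvcc on PATH, then fall back to the assumed install location
    nvcc = shutil.which("nvcc") or find_nvcc_in()
    print("nvcc:", nvcc or "not found")
    if nvcc:
        print(run_version([nvcc, "--version"]))
    # nvidia-smi inside WSL is provided by the Windows driver
    print(run_version(["nvidia-smi"]) or "nvidia-smi not available")
```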

Next, install cuDNN v8.9.7 for CUDA 12.x.
First, download "Local Installer for Ubuntu22.04 x86_64 (Deb)" from the page below (you need to log in with an NVIDIA account to do this).

# Copy the file from the Windows Downloads folder to the current directory in WSL
yuuri@hotcocoa:~$ cp /mnt/c/Users/yuuri/Downloads/cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb .
# Install it
yuuri@hotcocoa:~$ sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb
[sudo] password for yuuri:
Selecting previously unselected package cudnn-local-repo-ubuntu2204-8.9.7.29.
(Reading database ... 54408 files and directories currently installed.)
Preparing to unpack cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb ...
Unpacking cudnn-local-repo-ubuntu2204-8.9.7.29 (1.0-1) ...
Setting up cudnn-local-repo-ubuntu2204-8.9.7.29 (1.0-1) ...

The public cudnn-local-repo-ubuntu2204-8.9.7.29 GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.7.29/cudnn-local-08A7D361-keyring.gpg /usr/share/keyrings/

yuuri@hotcocoa:~$ sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.7.29/cudnn-local-08A7D361-keyring.gpg /usr/share/keyrings/
yuuri@hotcocoa:~$ sudo apt update
yuuri@hotcocoa:~$ sudo apt install libcudnn8
yuuri@hotcocoa:~$ sudo apt install libcudnn8-dev
yuuri@hotcocoa:~$ sudo apt install libcudnn8-samples
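To confirm which cuDNN the loader picks up after the install, the library can be opened with `ctypes` and asked for its version through `cudnnGetVersion()`. This is a sketch under two assumptions: that the soname is `libcudnn.so.8`, and that cuDNN 8 encodes its version as major*1000 + minor*100 + patch, so 8.9.7 would be reported as 8907. The helper names are hypothetical.

```python
import ctypes


def decode_cudnn8_version(value):
    """Split a cuDNN 8 version integer (major*1000 + minor*100 + patch) into a tuple."""
    major, rest = divmod(value, 1000)
    minor, patch = divmod(rest, 100)
    return (major, minor, patch)


def loaded_cudnn_version(soname="libcudnn.so.8"):
    """Return (major, minor, patch) of the cuDNN the loader finds, or None if it cannot be opened."""
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        return None
    # cudnnGetVersion() returns a size_t
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn8_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    print("cuDNN version:", loaded_cudnn_version())
```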

I borrowed the transcription.py below (saved here as whisper.py) to confirm that whisper works.

### Transcription (using faster_whisper)
from faster_whisper import WhisperModel

# Model settings
model_size = "large-v3"

# GPU settings
model = WhisperModel(model_size, device="cuda", compute_type="float16")

AUDIO_FILE = "record_test.wav"

# Run the transcription
segments, info = model.transcribe(AUDIO_FILE, beam_size=5)

# Show the results
print("Detected language:", info.language)
for segment in segments:
    print(f"[{segment.start:.2f} - {segment.end:.2f}]: {segment.text}")
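One detail worth knowing about the script above: according to the faster-whisper README, `segments` is a generator, so the transcription only actually runs while the `for` loop iterates over it. If the lines are needed more than once, they can be collected into a list first. The sketch below shows this with a stand-in segment type. `FakeSegment`, `format_segment`, and `collect_lines` are hypothetical names used only for illustration, and the formatting matches the `print` call in the script.

```python
from collections import namedtuple

# Stand-in with the three attributes the script reads from each segment
FakeSegment = namedtuple("FakeSegment", ["start", "end", "text"])


def format_segment(segment):
    """Format one segment the same way the script prints it."""
    return f"[{segment.start:.2f} - {segment.end:.2f}]: {segment.text}"


def collect_lines(segments):
    """Consume the segments iterable once and return the formatted lines."""
    return [format_segment(segment) for segment in segments]


if __name__ == "__main__":
    # A generator can only be consumed once, like the one transcribe() returns
    demo = (s for s in [FakeSegment(0.0, 5.3, "first"), FakeSegment(5.3, 13.06, "second")])
    for line in collect_lines(demo):
        print(line)
```

With the real model, `collect_lines(segments)` would take the place of the final `for` loop.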

Next, set up Python inside WSL.
python3 is already installed, so install pip and venv.

yuuri@hotcocoa:~$ python3 -V
Python 3.12.3
yuuri@hotcocoa:~$ sudo apt install python3-pip
yuuri@hotcocoa:~$ sudo apt install python3-venv

Then create a venv and install faster-whisper with pip.

yuuri@hotcocoa:~/whisper-test$ python3 -m venv .venv
yuuri@hotcocoa:~/whisper-test$ source .venv/bin/activate
(.venv) yuuri@hotcocoa:~/whisper-test$ pip3 install faster_whisper
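Before downloading the large model, a quick check inside the venv shows whether CTranslate2 (installed as a dependency of faster-whisper) can see the GPU. This sketch uses `ctranslate2.get_cuda_device_count()` from CTranslate2's Python API. `pick_model_settings` is a hypothetical helper, and the CPU fallback of `device="cpu"` with `compute_type="int8"` is an assumption, not something I tested here.

```python
def cuda_device_count():
    """Number of CUDA devices CTranslate2 can see; 0 if it cannot be imported or the query fails."""
    try:
        import ctranslate2

        return int(ctranslate2.get_cuda_device_count())
    except Exception:
        return 0


def pick_model_settings(device_count):
    """Return WhisperModel keyword arguments: the GPU settings from the script, else a CPU fallback."""
    if device_count > 0:
        return {"device": "cuda", "compute_type": "float16"}
    # Assumed fallback when no GPU is visible
    return {"device": "cpu", "compute_type": "int8"}


if __name__ == "__main__":
    count = cuda_device_count()
    print("CUDA devices visible to CTranslate2:", count)
    print("Suggested settings:", pick_model_settings(count))
```

The returned dictionary could be passed on as `WhisperModel(model_size, **settings)`.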

Place record_test.wav in the directory and check whether it gets transcribed.

(.venv) yuuri@hotcocoa:~/whisper-test$ python whisper.py
config.json: 100%|█████████████████████████████████████████████████████████████████| 2.39k/2.39k [00:00<00:00, 27.2MB/s]
vocabulary.json: 100%|█████████████████████████████████████████████████████████████| 1.07M/1.07M [00:00<00:00, 7.38MB/s]
preprocessor_config.json: 100%|████████████████████████████████████████████████████████| 340/340 [00:00<00:00, 4.36MB/s]
tokenizer.json: 100%|██████████████████████████████████████████████████████████████| 2.48M/2.48M [00:00<00:00, 3.02MB/s]
model.bin: 100%|███████████████████████████████████████████████████████████████████| 3.09G/3.09G [00:50<00:00, 61.5MB/s]
Detected language: ja
[0.00 - 5.30]: 雨にも負けず 風にも負けず
[5.30 - 13.06]: 雪にも夏の暑さにも負けぬ 丈夫な体を持ち
[13.06 - 16.30]: 欲はなく 決して怒らず
[16.30 - 19.36]: いつも静かに笑っている
[22.30 - 27.78]: 一日に玄米四合と 味噌と少しの野菜を食べ
[27.78 - 33.16]: あらゆることを 自分を感情に入れずに
[33.16 - 37.64]: よく見聞きし分かり そして忘れず
[37.64 - 45.46]: 野原の松の林の影の 小さなかやぶきの小屋にいて
[46.60 - 51.66]: 東に病気の子供あれば 行って看病してやり
[51.66 - 56.46]: 西に疲れた母あれば 行って
[56.46 - 57.76]: その稲の手を 探してやる
[57.76 - 65.88]: 南に死にそうな人あれば 行って怖がらなくてもいいといい
[65.88 - 72.60]: 北に喧嘩や訴訟があれば つまらないからやめろといい
[72.60 - 80.26]: 日出りの時は涙を流し 寒さの夏はおろおろ歩き
[80.26 - 85.92]: みんなにデクノボーと呼ばれ 褒められもせず
[85.92 - 87.74]: 苦にもされず 涙を流し 涙を流し
[87.76 - 91.52]: そういうものに 私はなりたい

The passage 「西に疲れた母あれば、その稲の束を負い」 came out a little off, but overall I think the result is pretty good.