
"Stability-AI/stable-audio-tools": running stable-audio-open-1.0 with Gradio

Published 2024/06/14

Audio

Stability AI

tech

Installing and using "Stability-AI/stable-audio-tools", which runs stable-audio-open-1.0 with Gradio

Installation

For now, I create a Python virtual environment and follow the instructions on GitHub (https://github.com/Stability-AI/stable-audio-tools).

I used $HOME/work/stable-audio as the installation directory for now.

cd
mkdir -p work/stable-audio
cd work/stable-audio
python3 -m venv venv_stable_audio
. venv_stable_audio/bin/activate
git clone https://github.com/Stability-AI/stable-audio-tools
cd stable-audio-tools
pip install stable-audio-tools
pip install .
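
Before moving on, it can be useful to confirm that the package is visible from the virtual environment that is currently active. The following is a minimal sketch I added, not part of the GitHub instructions. It assumes the import name is `stable_audio_tools` (the name `run_gradio.py` imports from), and the helper names `is_installed` and `in_virtualenv` are hypothetical.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv (prefix differs from base prefix)."""
    return sys.prefix != sys.base_prefix


# Run this with the interpreter of venv_stable_audio after `pip install .`
print("inside a virtual environment:", in_virtualenv())
print("stable_audio_tools importable:", is_installed("stable_audio_tools"))
```

If the second line prints False, the most likely cause is that the virtual environment was not activated before running pip.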

Access to https://huggingface.co/stabilityai/stable-audio-open-1.0/ is gated, so prepare to log in with the Hugging Face CLI.

pip install "huggingface_hub[cli]"
git config --global credential.helper store

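That completes the preparation.

Launching

The steps below assume you are continuing directly from the installation above.
If you are not, run the following first.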

cd
cd work/stable-audio
. venv_stable_audio/bin/activate
cd stable-audio-tools

Then, whether you are continuing from the installation or have just run the commands above, log in to Hugging Face with the following command.

huggingface-cli login

A screen like the following appears and asks for a token.

(venv_stable_audio) ryuuri@RTX-3090:~/work/stable-audio/stable-audio-tools$ huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible):

To obtain the token you need to visit https://huggingface.co/settings/tokens, but some preparation is required first.

(1) Create a Hugging Face account if you do not have one
- (1-1) Go to https://huggingface.co/
- (1-2) Click the "Sign Up" button at the top right
- (1-3) You are asked for an email address and a password. Enter the email address you want to link to the account and the password you want to use for Hugging Face
- (1-4) Enter a username and full name. The avatar, GitHub username, homepage, Twitter username, and so on are optional. When you are done, click the "Create Account" button
- (1-5) Confirm that you are human: click Start (the usual human-verification puzzle begins)
- (1-6) Hugging Face sends a confirmation email to the registered address. Clicking the link in it completes the registration
(2) Get access to https://huggingface.co/stabilityai/stable-audio-open-1.0/
- (2-1) Go to https://huggingface.co/stabilityai/stable-audio-open-1.0/
- (2-2) The page shows "You need to agree to share your contact information to access this model". Enter your name, email address, country, affiliation, and whether you want to receive email, then click "Agree and access repository"
(3) Issue a token that can access stable-audio-open-1.0
- (3-1) Go to https://huggingface.co/settings/tokens
- (3-2) Click "New token"
- (3-3) Enter stable-audio-open-1.0 as the Name. Leave Type as Fine-grained (custom) and click the "Generate a token" button
- (3-4) The token is issued. Click the Copy button and store the token somewhere safe
- (3-5) Click the "Set permissions" button
- (3-6) The "Edit Access Token Permissions" screen opens. In the "Repositories permissions" search box just below the token name, type stable-audio-open-1.0. stabilityai/stable-audio-open-1.0 then appears among the Models candidates, so click it
- (3-7) Under User permissions, in Repos, tick "Read access to contents of all public gated repos you can access"
- (3-8) Click the "Save" button at the bottom

(venv_stable_audio) ryuuri@RTX-3090:~/work/stable-audio/stable-audio-tools$ huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible):

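Return to this screen, copy the token you stored somewhere safe, paste it here, and press Enter.
Note: as the prompt itself says, the pasted token is not displayed, so take care to paste it only once (I pasted it several times by mistake and had to redo the login).

You are then asked "Add token as git credential? (Y/n)". Type y and press Enter to finish.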

Add token as git credential? (Y/n) y
Token is valid (permission: fineGrained).
Your token has been saved in your configured git credential helpers (store).
Your token has been saved to /home/ryuuri/.cache/huggingface/token
Login successful
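
To double-check later that the login left a token behind, the saved file can be inspected without printing the token itself. The following is a minimal sketch I added. It assumes the token sits at `~/.cache/huggingface/token`, as in the output above; environment variables such as `HF_HOME` can relocate it, in which case the path has to be adjusted. The helper name `token_is_saved` is hypothetical.

```python
from pathlib import Path

# Location reported by `huggingface-cli login` above (assumption: default cache location).
DEFAULT_TOKEN_PATH = Path.home() / ".cache" / "huggingface" / "token"


def token_is_saved(token_path: Path) -> bool:
    """Return True if the file exists and contains a non-empty token string."""
    if not token_path.is_file():
        return False
    return token_path.read_text(encoding="utf-8").strip() != ""


# Only reports whether a token exists; the token value is never printed.
print("token saved:", token_is_saved(DEFAULT_TOKEN_PATH))
```

This only checks that a token file exists. It does not verify that the token is still valid or that it has permission for the gated repository.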

Then run the launch script.

python3 ./run_gradio.py --pretrained-name stabilityai/stable-audio-open-1.0

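After startup, open http://127.0.0.1:7860/ in a browser on the PC where run_gradio.py was started, and the Gradio screen appears.

Modifying it to allow access from another PC

"Modifying" here just means adding a few options that are passed to Gradio.
Change the script as follows.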

ryuuri@RTX-3090:~/work/stable-audio/stable-audio-tools$ git diff
diff --git a/run_gradio.py b/run_gradio.py
index ae3ba95..406a1da 100644
--- a/run_gradio.py
+++ b/run_gradio.py
@@ -15,7 +15,10 @@ def main(args):
         model_half=args.model_half
     )
     interface.queue()
-    interface.launch(share=True, auth=(args.username, args.password) if args.username is not None else None)
+    interface.launch(share=True,
+                     auth=(args.username, args.password) if args.username is not None else None,
+                     server_name=None if not args.listen else (args.listen_host or '0.0.0.0'),
+                     server_port=args.listen_port)

 if __name__ == "__main__":
     import argparse
@@ -26,6 +29,9 @@ if __name__ == "__main__":
     parser.add_argument('--pretransform-ckpt-path', type=str, help='Optional to model pretransform checkpoint', required=False)
     parser.add_argument('--username', type=str, help='Gradio username', required=False)
     parser.add_argument('--password', type=str, help='Gradio password', required=False)
+    parser.add_argument('--listen', action='store_true', help='listen', required=False)
+    parser.add_argument('--listen-host', type=str, help='listen host ip address', required=False)
+    parser.add_argument('--listen-port', type=int, help='listen port number', required=False)
     parser.add_argument('--model-half', action='store_true', help='Whether to use half precision', required=False)
     args = parser.parse_args()

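Added the --listen, --listen-host, and --listen-port arguments.
Added server_name to the Gradio launch arguments: when --listen is specified, it listens on the address given by --listen-host, or on 0.0.0.0 if none is given.
Added server_port to the Gradio launch arguments so that it listens on the port number given by --listen-port.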
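
How the three options combine can be checked without starting Gradio. The following is a minimal sketch I added: it copies the argparse definitions and the server_name / server_port expressions from the diff into a standalone script. The function names `build_parser` and `launch_kwargs` and the address 192.168.0.10 are hypothetical, used only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Only the three options added in the diff above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--listen', action='store_true', help='listen')
    parser.add_argument('--listen-host', type=str, help='listen host ip address')
    parser.add_argument('--listen-port', type=int, help='listen port number')
    return parser


def launch_kwargs(args: argparse.Namespace) -> dict:
    """Same expressions as passed to interface.launch() in the diff."""
    return {
        "server_name": None if not args.listen else (args.listen_host or "0.0.0.0"),
        "server_port": args.listen_port,
    }


examples = (
    [],
    ["--listen"],
    ["--listen", "--listen-host", "192.168.0.10", "--listen-port", "8080"],
    ["--listen-host", "192.168.0.10"],  # --listen missing, so the host is ignored
)
for argv in examples:
    print(argv, "->", launch_kwargs(build_parser().parse_args(argv)))
```

Note the last case: --listen-host has no effect unless --listen is also given. When server_name and server_port are both None, Gradio uses its defaults, which is why the unmodified script is reachable at 127.0.0.1:7860.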

With this change in place, run the launch script as follows.

python3 ./run_gradio.py --listen --listen-port 7860 --pretrained-name stabilityai/stable-audio-open-1.0

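Here, --listen is the option that allows access from another PC.
Without it, the server can only be reached from the PC where run_gradio.py was started.

After startup, open http://<<IP address of the PC running run_gradio.py>>:7860/ in a browser on another PC, and the Gradio screen appears.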
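
If the page does not open from the other PC, a quick TCP check can tell whether the port is reachable at all. The following is a minimal sketch I added, using only the standard library. The helper name `can_connect` is hypothetical, and 127.0.0.1 should be replaced with the IP address of the PC running run_gradio.py when testing from another machine.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Replace 127.0.0.1 with the server's IP address when running this on another PC.
print("port 7860 reachable:", can_connect("127.0.0.1", 7860))
```

If this prints False from the other PC while the browser on the server PC works, check that --listen was passed and that no firewall on the server is blocking the port.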
