Notes on talking with VOICEVOX through a local LLM

It looked like I could hold a spoken conversation with VOICEVOX through a local LLM, so I tried it.

Environment

Windows 11
WSL2

Requirements

LM Studio (already installed, including a model)
A Docker environment running on WSL

The Docker environment runs the following:

VOICEVOX

https://hub.docker.com/r/voicevox/voicevox_engine

OpenWebUI

https://openwebui.com/

I also use the Bridge written by this author.

https://github.com/NP-F/openaitts_voicevox_bridge

inunekousapion

Create a directory anywhere in WSL and put the following files in it.

compose.yml
config.ini
Dockerfile

compose.yml

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: always
    ports:
      - "3000:8080"
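    environment:
      # LM Studio serves an OpenAI-compatible API on port 1234 by default.
      # Inside the container, "localhost" is the container itself, so this URL
      # may need to point at the host instead, or the connection can be set
      # from the Open WebUI admin settings.
      - OLLAMA_BASE_URL=http://localhost:1234/v1/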
    volumes:
      - open-webui:/app/backend/data
  voicevox-engine:
    image: voicevox/voicevox_engine:cpu-latest
    container_name: voicevox-engine
    ports:
      - "50021:50021"
    restart: always
  openaitts-voicevox-bridge:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: openaitts-voicevox-bridge
    command: ["python", "openaitts2voicevox_bridge.py"]
    ports:
      - "8000:8000"
    restart: always
    depends_on: [voicevox-engine, open-webui]
volumes:
  open-webui:

Dockerfile

FROM python:3.12-slim-bullseye

# Install dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    && apt-get clean

# Clone the repository
WORKDIR /app
RUN git clone https://github.com/NP-F/openaitts_voicevox_bridge /app/openaitts_voicevox_bridge

WORKDIR /app/openaitts_voicevox_bridge

# Install Python dependencies
RUN pip install --upgrade pip && pip install -r requirements.txt

COPY config.ini /app/openaitts_voicevox_bridge/config.ini

CMD ["/bin/bash"]

config.ini

Note: this overwrites the configuration file that ships with openaitts_voicevox_bridge.

[SETTINGS]
PARSERATE = 2
HOST = 0.0.0.0
PORT = 8000

[SOUND]
SAMPLING_RATE = 44100
BITDEPTH = 2

[VOICEVOX]
AUTOLAUNCH = false
PATH = /PATH/TO/VOICEVOX/EXECUTABLE
API = http://voicevox-engine:50021

[ALTID]
ALLOY = 0
ECHO = 0
FABLE = 0
ONYX = 0
NOVA = 0
SHIMMER = 0
DEFAULT_SPEAKER = 0
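
To make the [ALTID] section a little more concrete, here is a minimal Python sketch of how such a table could be read: each OpenAI voice name maps to a VOICEVOX speaker ID, and DEFAULT_SPEAKER is used when the voice is not listed. The helper name `speaker_for` is hypothetical, and this is not the bridge's actual code, which may read the file differently.

```python
import configparser

# Trimmed copy of the [ALTID] section from config.ini above.
SAMPLE = """
[ALTID]
ALLOY = 0
NOVA = 0
DEFAULT_SPEAKER = 0
"""


def speaker_for(voice: str, config: configparser.ConfigParser) -> int:
    """Return the VOICEVOX speaker ID for an OpenAI voice name (hypothetical helper)."""
    altid = config["ALTID"]
    # configparser lower-cases option names by default, so lookups ignore case.
    default = altid.getint("DEFAULT_SPEAKER")
    return altid.getint(voice, fallback=default)


config = configparser.ConfigParser()
config.read_string(SAMPLE)
print(speaker_for("alloy", config))
print(speaker_for("unknown-voice", config))
```

With every entry set to 0, all OpenAI voices end up on the same VOICEVOX speaker; giving the entries different IDs would let each voice name select a different character.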

Start Docker Compose.

docker compose up
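
The VOICEVOX engine can take a while to become ready after the containers start. As a rough sketch, the following Python polls the engine's `/version` endpoint until it answers. The function names, retry count, and delay are my own choices for illustration, and the URL assumes the port mapping from compose.yml above.

```python
import time
import urllib.error
import urllib.request


def fetch_version(base_url: str) -> str:
    """GET /version from the VOICEVOX engine and return the response body as text."""
    with urllib.request.urlopen(f"{base_url}/version", timeout=5) as resp:
        return resp.read().decode("utf-8")


def wait_for_engine(base_url: str, attempts: int = 30, delay: float = 2.0,
                    fetch=fetch_version):
    """Poll the engine until it answers; return the version text, or None if it never does."""
    for _ in range(attempts):
        try:
            return fetch(base_url)
        except (urllib.error.URLError, OSError):
            time.sleep(delay)  # engine is not up yet, so wait and retry
    return None
```

Calling `wait_for_engine("http://localhost:50021")` from the machine running Docker should return the engine version once the container is up, and `None` if it does not respond within the retry budget.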

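Open localhost:3000 and create the administrator account.
Then open Audio in the admin settings and configure the text-to-speech settings there.

Apparently any value works as the API key.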
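
To test the bridge without going through Open WebUI, a request in the shape of OpenAI's text-to-speech API can be sent to it directly. The sketch below is written under assumptions: that the bridge accepts `POST /v1/audio/speech` the way the OpenAI API does, and that it is reachable at `http://localhost:8000/v1` from the host (from another container in the same Compose project the host name would presumably be `openaitts-voicevox-bridge`). The function names and the `dummy` API key are placeholders of mine.

```python
import json
import urllib.request


def build_speech_request(base_url: str, text: str, voice: str = "alloy",
                         api_key: str = "dummy") -> urllib.request.Request:
    """Build an OpenAI-style text-to-speech request (nothing is sent here)."""
    payload = {"model": "tts-1", "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Any value seems to be accepted as the key.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def save_speech(request: urllib.request.Request, path: str) -> None:
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(request, timeout=60) as resp, open(path, "wb") as f:
        f.write(resp.read())
```

For example, `save_speech(build_speech_request("http://localhost:8000/v1", "こんにちは"), "hello.mp3")` should write an audio file if the assumptions above hold. The `voice` value is what the [ALTID] table in config.ini maps to a VOICEVOX speaker.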

After that, when you chat, the replies are read aloud in a VOICEVOX voice.

I should have mentioned this earlier: to reach the servers running in WSL via localhost, I added the following to .wslconfig.

[wsl2]
localhostForwarding=True
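
To confirm that the forwarding works, one option is to check from the Windows side whether the published ports accept TCP connections. This is only a sketch: the port numbers come from compose.yml above, while `port_open` and `report` are names I made up for the example.

```python
import socket

# Host ports published in compose.yml above.
SERVICES = {
    "open-webui": 3000,
    "voicevox-engine": 50021,
    "openaitts-voicevox-bridge": 8000,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host: str = "localhost") -> dict:
    """Map each service name to whether its port is reachable on the given host."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}
```

Running `print(report())` on Windows should show `True` for all three services when the containers are up and localhost forwarding is in effect.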