Struggles with Remdis Environment Setup (and a Usage Guide)


Introduction

In this article, I will summarize the difficulties I encountered while setting up the environment for Remdis, a platform for developing text, voice, and multimodal dialogue systems.

I eventually succeeded in setting it up, so I hope this serves as a reference for anyone else who is struggling.

I also tried using Remdis briefly at the end, so I will introduce its usage as well.

What is Remdis?

Remdis is a platform for developing text, voice, and multimodal dialogue systems.
For more details, please see the repository below.

https://github.com/remdis/remdis

As shown in the demo video below, it is a system that enables very natural and low-latency real-time dialogue.
https://www.youtube.com/watch?v=mYT7nC_U3M8

Also, a book explaining this repository is available for purchase. The code is explained very clearly, so I highly recommend it for those who want to use Remdis.

Pythonと大規模言語モデルで作るリアルタイムマルチモーダル対話システム (エンジニア入門シリーズ128)

Environment Setup

As a prerequisite, my environment is as follows:
PC: M2 Mac

Installing Docker

Now, let's actually proceed with the environment setup by following the repository below.
https://github.com/remdis/remdis?tab=readme-ov-file

First, install Docker.
For Mac, just run the following command, open the installed Docker Desktop, and follow the instructions on the screen to complete the setup.

./
brew install --cask docker

Configure Docker Desktop by following its on-screen steps.
(The original article shows screenshots here, taken from a document I compiled for personal use in the past.)

Run the following command to check if Docker is working correctly.

./
docker version

If it prints version information for both the Client and the Server, Docker is working.
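If you would rather script this check, the following Python sketch wraps the same docker version command. The function name docker_is_ready is my own invention. It assumes only that the docker CLI is on PATH, and it relies on docker version returning a non-zero exit code when the client cannot reach the daemon.

```python
import shutil
import subprocess


def docker_is_ready(timeout: float = 10.0) -> bool:
    """Return True if the docker CLI exists and `docker version` succeeds."""
    if shutil.which("docker") is None:
        return False  # CLI not on PATH: Docker Desktop is probably not installed
    try:
        result = subprocess.run(
            ["docker", "version"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # A non-zero exit code usually means the daemon (Docker Desktop) is not running.
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker ready:", docker_is_ready())
```

If this prints False even though Docker is installed, the first thing to check is whether Docker Desktop is actually running.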

Installing Remdis

Next, install Remdis as instructed.
Execute the following commands:

./
git clone https://github.com/remdis/remdis.git
cd remdis

Python version settings

Next, we will install the necessary packages in a virtual environment.
(Although the repository uses conda, I prefer pyenv + venv, so I will use those for the setup.)

First, set the Python version to 3.11, as the repository specifies.

For instructions on installing pyenv itself, see:
https://qiita.com/koooooo/items/b21d87ffe2b56d0c589b

Once pyenv is installed, install and select the Python version with the following commands:

./remdis/
pyenv install 3.11.9
pyenv local 3.11.9 # or pyenv global 3.11.9

pyenv local applies the version only to the current directory and its subdirectories (it writes a .python-version file there).
pyenv global makes the version the default for the whole system.

Check if the Python version has been changed by running the following command:

./remdis/
python -V
# Python 3.11.9
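The same check can also be made from inside Python, which is handy once several interpreters and virtual environments are in play. This is a minimal sketch; the helper name is_expected_python is hypothetical, and the expected value (3, 11) simply mirrors the version chosen above.

```python
import sys


def is_expected_python(version_info=sys.version_info, expected=(3, 11)) -> bool:
    """Return True when the interpreter's major.minor matches `expected`."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    status = "OK" if is_expected_python() else "unexpected version"
    print(sys.version.split()[0], status)
```

Running it with the python inside the virtual environment confirms that the environment was created from the interpreter pyenv selected.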

Installing required packages

Next, create a virtual environment to install the required packages.
We will use venv for the virtual environment.
Since venv is the official Python virtual environment tool, it can be used without any additional installation as long as Python is available.

./remdis/
python -m venv env
source env/bin/activate

pip install -r requirements.txt

Errors Occurring

./remdis/
pip install -r requirements.txt

Running the above command will likely cause errors.
Below, I will present the solutions I used for each error.

Error during PyAudio package installation

Unfortunately I did not save the error message, so I cannot quote it, but the error occurred while pip was installing pyaudio.
(I am not sure at which point it appeared, so come back to this section if errors remain after you have resolved the ones described later.)
On a Mac, it can be resolved with the following commands:

./remdis/
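brew remove portaudio # only needed if an older portaudio is already installed
brew install portaudio # PyAudio is built against PortAudio, so it must be present
pip install pyaudio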

I referred to the following:
https://stackoverflow.com/questions/33851379/how-to-install-pyaudio-on-mac-using-python-3

Error during parallel-wavegan installation

Part of the error message
        File "<string>", line 7, in <module>
      ModuleNotFoundError: No module named 'pip'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

To address this error, I first tried the method described in the repository, namely the following commands:

./remdis/
git clone https://github.com/kan-bayashi/ParallelWaveGAN.git
cd ParallelWaveGAN
./remdis/ParallelWaveGAN
pip install -e .

However, the same error occurred even after running the commands above, so I ran the following instead:

./remdis/ParallelWaveGAN
python setup.py develop

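I arrived at this fix by consulting ChatGPT.
pip install -e . and python setup.py develop both perform a development (editable) install and yield almost identical results.
A likely reason for the difference here is that pip builds the package in an isolated environment in which the pip module cannot be imported, whereas python setup.py develop runs directly inside the virtual environment.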

Then, I was able to install it as follows:

Installation complete
Using /Users/xxx/remdis/ParallelWaveGAN/.eggs/numpy-2.0.0-py3.11-macosx-14.5-arm64.egg
Finished processing dependencies for parallel-wavegan==0.6.2a0

If you are unsure, you can verify the installation with the following command.
parallel-wavegan should be listed.

pip list
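Instead of scanning the pip list output by eye, you can ask Python directly whether a distribution is installed. The sketch below uses the standard library's importlib.metadata; the helper name installed_version is my own, and I am assuming the package registers itself under the name parallel-wavegan, as the log above suggests.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("parallel-wavegan:", installed_version("parallel-wavegan"))
```

Run it with the virtual environment activated; a result of None means the package is not visible from that environment.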

Continuing

Once the errors so far are resolved, run the following command again.

./remdis/
pip install -r requirements.txt

Before running it, comment out parallel-wavegan in requirements.txt as shown below, since it has already been installed:

./remdis/requirements.txt
...
omegaconf==2.3.0
openai==0.28.0
packaging==23.2
#parallel-wavegan==0.6.1
pika==1.3.2
pillow==10.2.0
...
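If you would rather not edit the file by hand, the change can be scripted. The following is a minimal sketch, not part of Remdis: comment_out is a hypothetical helper that prefixes the matching line with #, and it assumes each requirement sits on its own line as in the excerpt above.

```python
import re


def comment_out(requirements_text: str, package: str) -> str:
    """Prefix every active line that names `package` with '#'."""
    # The package name must be followed by a version specifier, whitespace,
    # an extras bracket, a marker, or the end of the line, so that "pika"
    # does not accidentally match a longer name starting with "pika".
    pattern = re.compile(rf"^{re.escape(package)}(?=[=<>!~\s\[;]|$)", re.IGNORECASE)
    lines = []
    for line in requirements_text.splitlines():
        lines.append("#" + line if pattern.match(line) else line)
    trailing_newline = "\n" if requirements_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


if __name__ == "__main__":
    sample = "packaging==23.2\nparallel-wavegan==0.6.1\npika==1.3.2\n"
    print(comment_out(sample, "parallel-wavegan"), end="")
```

To apply it to the real file, read requirements.txt, pass its text through comment_out, and write the result back. Lines that are already commented out are left unchanged, so running it twice is harmless.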

Then you will encounter a new error.

Error during nnmnkwii package installation

I spent a long time stuck on this error.
First, I tried installing it from source, following the standard method in the repository:
https://github.com/r9y9/nnmnkwii
(I skipped the pip method, since pip had already failed on this package.)

./remdis/
git clone https://github.com/r9y9/nnmnkwii
cd nnmnkwii
./remdis/nnmnkwii/
python setup.py develop # or install

However, the same error occurred.
I got stuck here and went through a lot of trial and error.

The cause, as mentioned in the issue below, is a compatibility issue due to the major update of numpy (2.0.0).
https://github.com/r9y9/nnmnkwii/issues/126

(Updated July 8, 2024)
I received a comment from @melon1891 saying that installation now works with the following method:

A new version of nnmnkwii that fixes this error has been released, so changing the version in requirements.txt from 0.1.2 to 0.1.3 makes the installation succeed in one go.

Thank you for letting me know.
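If you would rather script the version change than edit requirements.txt by hand, a small helper is enough. This is only a sketch: set_pin is a hypothetical name, and it assumes the package is pinned with == on its own line, for example nnmnkwii==0.1.2.

```python
import re


def set_pin(requirements_text: str, package: str, new_version: str) -> str:
    """Rewrite `package==<old>` to `package==<new_version>` wherever it is pinned."""
    pattern = re.compile(
        rf"^({re.escape(package)}==)\S+$", re.IGNORECASE | re.MULTILINE
    )
    # A function is used as the replacement so that digits in the version
    # are never misread as group references.
    return pattern.sub(lambda match: match.group(1) + new_version, requirements_text)


if __name__ == "__main__":
    sample = "networkx==3.2.1\nnnmnkwii==0.1.2\nnumba==0.59.0\n"
    print(set_pin(sample, "nnmnkwii", "0.1.3"), end="")
```

Other lines are returned untouched, so the helper can be applied to the whole file content at once.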

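I am leaving the previous solution below for reference. Note that, according to the same comment, the fork it relies on appears to have been deleted and is no longer accessible.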

Previous solution

The cause, as mentioned in the issue below, is a compatibility issue due to the major update of numpy (2.0.0).
https://github.com/r9y9/nnmnkwii/issues/126

As I read through this issue, I found that decfrr has created a fork that resolves this problem!! (Thank you so much!)

Therefore, I downloaded the contents of the decfrr/fix-numpy-version branch from the fork below and proceeded to install nnmnkwii.
https://github.com/decfrr/nnmnkwii/tree/decfrr/fix-numpy-version

./remdis/
git clone -b decfrr/fix-numpy-version https://github.com/decfrr/nnmnkwii.git
cd nnmnkwii
./remdis/nnmnkwii/
pip install -e .

Finally, I was able to install it as follows.

Successfully installed cython-3.0.10 fastdtw-0.3.4 nnmnkwii-0.1.2+e9a96b1 numpy-1.26.4 pysptk-0.2.2

(Aside 1)
By the way, if you use

./remdis/nnmnkwii/
python setup.py develop

instead of

./remdis/nnmnkwii/
pip install -e .

it results in an error. I don't even know why anymore.

(Aside 2)
Finally, with gratitude to decfrr, I posted the solution in the following issue.
https://github.com/r9y9/nnmnkwii/issues/126

I was able to install it using the link below.
https://github.com/decfrr/nnmnkwii/tree/decfrr/fix-numpy-version

git clone -b decfrr/fix-numpy-version https://github.com/decfrr/nnmnkwii.git
cd nnmnkwii
pip install -e .

Thank you, @decfrr.

Once Again

Now that another error had been resolved, I ran the following command again.

./remdis/
pip install -r requirements.txt

As shown below, the installation was completely successful.

Successful package installation
Successfully installed Cython-3.0.8 Jinja2-3.1.3 PyAudio-0.2.14 PyYAML-6.0.1 aiohttp-3.9.3 aiosignal-1.3.1 annotated-types-0.6.0 antlr4-python3-runtime-4.9.3 anyio-4.2.0 argcomplete-3.2.2 attrs-23.2.0 beautifulsoup4-4.12.3 cachetools-5.3.2 certifi-2024.2.2 cffi-1.16.0 contourpy-1.2.0 distro-1.9.0 filelock-3.13.1 fonttools-4.48.1 frozenlist-1.4.1 fsspec-2024.2.0 gdown-5.1.0 google-api-core-2.16.2 google-auth-2.27.0 google-cloud-speech-2.24.1 googleapis-common-protos-1.62.0 grpcio-1.60.1 grpcio-status-1.60.1 h11-0.14.0 h5py-3.10.0 httpcore-1.0.2 httpx-0.26.0 hydra-core-1.3.2 idna-3.6 joblib-1.3.2 lazy_loader-0.3 librosa-0.10.1 llvmlite-0.42.0 matplotlib-3.8.2 mecab-python3-1.0.8 msgpack-1.0.7 multidict-6.0.5 networkx-3.2.1 numba-0.59.0 omegaconf-2.3.0 openai-0.28.0 packaging-23.2 pika-1.3.2 pillow-10.2.0 platformdirs-4.2.0 pooch-1.8.0 proto-plus-1.23.0 protobuf-4.25.2 pyasn1-0.5.1 pyasn1-modules-0.3.0 pycparser-2.21 pydantic-2.6.1 pydantic_core-2.16.2 pyopenjtalk-0.3.3 pyparsing-3.1.1 python-dateutil-2.8.2 pyworld-0.3.4 requests-2.31.0 rsa-4.9 scikit-learn-1.4.0 scipy-1.12.0 sniffio-1.3.0 soxr-0.3.7 sympy-1.12 threadpoolctl-3.2.0 tomlkit-0.12.3 torch-2.2.0 tqdm-4.66.1 ttslearn-0.2.2 typing_extensions-4.9.0 unidic-lite-1.0.8 urllib3-2.2.0 yarl-1.9.4 yq-3.2.3

Obtaining and Setting Various API Keys

Now that the package installation is complete, set up the various API keys according to the instructions in the repository.

Enter them in the corresponding part of config/config.yaml.
Use the format below; quotation marks ("") are not necessary.

./remdis/config/config.yaml
ChatGPT:
 api_key: sk-xxxx
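Before starting the modules, it can help to confirm that the key really ended up under the right section. The sketch below is deliberately not a YAML parser: find_api_key is a hypothetical helper that understands only the two-level layout shown above (a top-level section name followed by an indented api_key line). For anything more complex, a real YAML library is the better tool.

```python
def find_api_key(config_text: str, section: str = "ChatGPT"):
    """Return the api_key value under `section`, or None if it is missing or empty."""
    current = None
    for raw in config_text.splitlines():
        # Drop comments and trailing spaces (assumes the key contains no '#').
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue
        if not line[0].isspace():
            # A non-indented "Name:" line starts a new top-level section.
            current = line.rstrip(":").strip()
            continue
        key, _, value = line.strip().partition(":")
        if current == section and key == "api_key":
            # Surrounding quotes are optional, so strip them if present.
            return value.strip().strip("'\"") or None
    return None


if __name__ == "__main__":
    sample = "ChatGPT:\n api_key: sk-xxxx\n"
    print(find_api_key(sample))
```

To check the real file, pass it the text of config/config.yaml. A result of None means the key is missing or empty under that section.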

Installing VAP

Follow the repository for this as well; there shouldn't be any specific errors.

./remdis/
pip install torch==2.0.1 torchvision torchaudio

git clone https://github.com/ErikEkstedt/VAP.git
cd VAP
./remdis/VAP
pip3 install -r requirements.txt
pip3 install -e .
pip3 install torchsummary

cd models/vap
./remdis/VAP/models/vap
unzip sw2japanese_public0.zip

Installing MMDAgent-EX

I haven't touched this part yet.
I want to do voice dialogue but don't need an avatar, so I'm skipping it.
(Even if you skip it, the avatar just won't be displayed, and voice dialogue will still work fine.)

How to Use

Proceed by following the instructions in the repository.

Running the RabbitMQ Server

With Docker Desktop running, execute the following command:

docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.12-management
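Because the modules cannot connect until the broker is listening, a quick reachability check can save some confusion. The sketch below only tests whether a TCP connection to the published ports succeeds; it does not speak AMQP, so a True result means the port is open, not that RabbitMQ has finished starting. The function name port_is_open is my own, and the ports are the ones published by the command above.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5672, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("AMQP port 5672 reachable:", port_is_open())
    print("Management UI port 15672 reachable:", port_is_open(port=15672))
```

If both values are False, check that the container is still running in the terminal where you started it.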

Running Text Dialogue

Please open three new terminals.
For all terminals, activate the virtual environment and move to the specified folder as shown below.

./remdis/
source env/bin/activate
cd modules/

Execute the following in the three terminals:

./remdis/modules/
python tin.py
./remdis/modules/
python dialogue.py
./remdis/modules/
python tout.py

Next, type text in the tin.py terminal and press Enter to send it.
When you have finished saying what you want to say, press Enter once more on an empty line.
An empty input serves as the end-of-input signal.
The AI's utterance should then be displayed in the tout.py terminal.
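The input convention described above (send lines one by one, then an empty line to end the turn) can be pictured with a small toy function. This is only an illustration of the convention, not the code tin.py actually uses; collect_utterance is a hypothetical name.

```python
from typing import Iterable, List


def collect_utterance(lines: Iterable[str]) -> List[str]:
    """Gather lines until the first empty one, which marks the end of the turn."""
    chunks = []
    for line in lines:
        text = line.rstrip("\n")
        if text == "":
            break  # empty input is the end-of-utterance signal
        chunks.append(text)
    return chunks


if __name__ == "__main__":
    typed = ["Hello\n", "How are you today?\n", "\n", "not part of this turn\n"]
    print(collect_utterance(typed))
```

Here the first two lines form one turn, and anything typed after the empty line belongs to the next one.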

Running Text Input → Voice Output

Since I haven't set up the Google Speech-to-Text API yet, I type my input as text and have the AI respond with voice, so the dialogue runs without speech recognition.
One of the attractions of Remdis is that its use of RabbitMQ makes it highly flexible in this respect.

The execution method is the same as for text dialogue.
First, open four terminals and prepare them in the same way.
(This assumes the RabbitMQ server is running on Docker.)

./remdis/
source env/bin/activate
cd modules/

Execute the following commands in each terminal:
(For convenience, I'll list them in the same block, but when executing, please run each in a separate terminal.)

./remdis/modules/
python tin.py
python dialogue.py
python tts.py
python output.py

Started this way, the text entered in tin.py is passed to dialogue.py, where ChatGPT generates the AI's utterance as text; tts.py then converts it to speech with ttslearn, and output.py plays it through the speakers.

It's simple.
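To make the flow of messages concrete, here is a toy model of the four-stage chain. It is not Remdis code: in-process queue.Queue objects stand in for RabbitMQ, and trivial string functions stand in for ChatGPT, ttslearn, and the speakers. All names (stage, run_pipeline) are hypothetical. The point is only that each stage reads from one queue and writes to the next, which is why individual stages can be swapped.

```python
import queue
import threading


def stage(inbox, outbox, transform):
    """Consume messages from inbox, apply transform, forward results to outbox."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel: stop and pass the signal downstream
            if outbox is not None:
                outbox.put(None)
            return
        result = transform(message)
        if outbox is not None:
            outbox.put(result)


def run_pipeline(user_texts):
    """Send texts through stand-ins for dialogue.py, tts.py and output.py."""
    to_dialogue, to_tts, to_output = queue.Queue(), queue.Queue(), queue.Queue()
    played = []  # what the fake "speaker" received, in order

    workers = [
        # dialogue.py stand-in: the real module asks ChatGPT for a reply
        threading.Thread(target=stage, args=(to_dialogue, to_tts, lambda t: f"reply to: {t}")),
        # tts.py stand-in: the real module synthesizes audio with ttslearn
        threading.Thread(target=stage, args=(to_tts, to_output, lambda t: f"<audio: {t}>")),
        # output.py stand-in: the real module plays audio through the speakers
        threading.Thread(target=stage, args=(to_output, None, played.append)),
    ]
    for worker in workers:
        worker.start()

    for text in user_texts:  # tin.py stand-in: publish each user input
        to_dialogue.put(text)
    to_dialogue.put(None)

    for worker in workers:
        worker.join()
    return played


if __name__ == "__main__":
    print(run_pipeline(["hello"]))
```

Replacing the middle stage with one that simply passes text through would give the text-only setup from the previous section, which mirrors how swapping tts.py and output.py for tout.py changes the behavior of the real system.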

Running Voice Dialogue

Proceed as described in the repository. It should probably work.

Summary

In this article, I summarized the difficulties I encountered while setting up the environment for Remdis, a multimodal dialogue system, along with a brief guide to using it.
I hope this helps anyone facing similar issues.

By the way, I am also building a simple voice dialogue system.
While it's not as sophisticated as Remdis, it uses Style-Bert-VITS2, a very high-performance AI, for speech output, allowing for expressive and more human-like dialogue.
Also, because it is implemented very simply, the code should be easy to understand.

I've introduced it in the following article, so I'd be happy if you'd take a look!
(You can see the power of Style-Bert-VITS2 by watching the embedded video in the article.)
https://zenn.dev/asap/articles/5b1b7553fcaa76

Thank you for reading this far!

Discussion

melon1891 (@melon1891)

Thank you for the useful article!

Regarding "Error during nnmnkwii package installation":

A new version of nnmnkwii that fixes this error has now been released, so changing the version in requirements.txt from 0.1.2 to 0.1.3 made it work in one go.
https://github.com/r9y9/nnmnkwii/releases/tag/v0.1.3

Conversely, the patched fork now seems to have been deleted and is no longer accessible.
https://github.com/decfrr/nnmnkwii/tree/decfrr/fix-numpy-version

asap

Thank you for letting me know!
I have updated the article accordingly.

If you notice anything else, I would be glad to hear about it in the comments!