Translated by AI
Struggles with Remdis Environment Setup (and a Usage Guide)
Introduction
In this article, I will summarize the difficulties I encountered while setting up the environment for Remdis, a platform for developing text, voice, and multimodal dialogue systems.
I eventually succeeded in setting it up, so I hope this serves as a reference for anyone else who is struggling.
I also tried using Remdis briefly at the end, so I will introduce its usage as well.
What is Remdis?
Remdis is a platform for developing text, voice, and multimodal dialogue systems.
For more details, please see the repository below.
As shown in the demo video below, it is a system that enables very natural and low-latency real-time dialogue.
Also, a book explaining this repository is available for purchase. The code is explained very clearly, so I highly recommend it for those who want to use Remdis.
Pythonと大規模言語モデルで作るリアルタイムマルチモーダル対話システム (エンジニア入門シリーズ128)
Environment Setup
As a prerequisite, my environment is as follows:
PC: M2 Mac
Installing Docker
Now, let's actually proceed with the environment setup by following the repository below.
First, install Docker.
For Mac, just run the following command, open the installed Docker Desktop, and follow the instructions on the screen to complete the setup.
brew install --cask docker
Configure Docker Desktop following these steps:
(The following images are part of a document I compiled for personal use in the past.)






Run the following command to check if Docker is working correctly.
docker version
It's OK if you see a display like the one below.

Installing Remdis
Next, install Remdis as instructed.
Execute the following commands:
git clone https://github.com/remdis/remdis.git
cd remdis
Python version settings
Next, we will install the necessary packages in a virtual environment.
(Although the repository uses conda, I prefer pyenv + venv, so I will use those for the setup.)
First, set the Python version to 3.11, as the repository specifies.
For information on how to install pyenv itself, please see here:
Once pyenv is installed, you can install and select the Python version with the following commands:
pyenv install 3.11.9
pyenv local 3.11.9 # or pyenv global 3.11.9
pyenv global applies this version system-wide.
pyenv local applies it only to the current directory (it writes a .python-version file there).
Check if the Python version has been changed by running the following command:
python -V
# Python 3.11.9
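If you want a script to refuse to run under the wrong interpreter, the same check can be done from inside Python. This is only a minimal sketch; the helper name is_required_python is my own and is not part of Remdis.

```python
import sys


def is_required_python(version_info=sys.version_info, required=(3, 11)):
    """Return True when the major.minor version matches `required`."""
    return tuple(version_info[:2]) == tuple(required)


if not is_required_python():
    print(f"Expected Python 3.11, but found {sys.version.split()[0]}")
```

Only major.minor is compared, so any 3.11.x patch release passes.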
Installing required packages
Next, create a virtual environment to install the required packages.
We will use venv for the virtual environment.
Since venv is the official Python virtual environment tool, it can be used without any additional installation as long as Python is available.
python -m venv env
source env/bin/activate
pip install -r requirements.txt
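It is easy to forget to activate the environment in a new terminal, in which case pip installs into the wrong place. As a rough sketch (the function name in_virtualenv is hypothetical), Python itself can tell you whether it is running inside a venv:

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```

If this prints False after source env/bin/activate, the python on your PATH is probably not the one from env.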
Errors Encountered
pip install -r requirements.txt
Running the above command will likely cause errors.
Below, I will present the solutions I used for each error.
Error during PyAudio package installation
I apologize; I didn't save the error message, so I don't remember exactly what it said, but the error occurred during pip install pyaudio.
(I'm not sure exactly when it appeared, so please come back and check this if you still have errors after resolving the ones mentioned later.)
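PyAudio is built against the PortAudio library, so it can be resolved by reinstalling PortAudio and then installing PyAudio again (on Mac):
brew remove portaudio
brew install portaudio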
pip install pyaudio
I referred to the following:
Error during parallel-wavegan installation
File "<string>", line 7, in <module>
ModuleNotFoundError: No module named 'pip'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
For this error, I first tried the method described in the repository to see whether it would resolve it.
That is, the following commands:
git clone https://github.com/kan-bayashi/ParallelWaveGAN.git
cd ParallelWaveGAN
pip install -e .
However, even after executing the above commands, the same error occurred.
Therefore, I ran the following instead:
python setup.py develop
I arrived at this by asking ChatGPT.
pip install -e . and python setup.py develop appear to yield almost identical results: both install the package in development (editable) mode.
With that, the installation succeeded, as shown below:
Using /Users/xxx/remdis/ParallelWaveGAN/.eggs/numpy-2.0.0-py3.11-macosx-14.5-arm64.egg
Finished processing dependencies for parallel-wavegan==0.6.2a0
If you are unsure, you can verify the installation with the following command.
parallel-wavegan should be listed.
pip list
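The output of pip list is long, so looking up one package by name can be more convenient. The sketch below uses importlib.metadata from the standard library; installed_version is a name I made up for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("parallel-wavegan", "nnmnkwii", "numpy"):
    print(name, installed_version(name))
```

A package that prints None is not installed in the currently active environment.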
Continuing
Once the visible errors are resolved, try the following command again.
pip install -r requirements.txt
However, please comment out the parallel-wavegan that has already been installed, as shown below:
...
omegaconf==2.3.0
openai==0.28.0
packaging==23.2
#parallel-wavegan==0.6.1
pika==1.3.2
pillow==10.2.0
...
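Editing the file by hand is enough here, but if you rebuild the environment often, the same edit can be scripted. This is a sketch that assumes every requirement is pinned as name==version, as in the excerpt above; comment_out_requirement is a hypothetical helper.

```python
def comment_out_requirement(text, package):
    """Prefix the line that pins `package` with '#', leaving other lines untouched."""
    result = []
    for line in text.splitlines():
        # The excerpt pins with '==', so the name is the part before it.
        name = line.split("==")[0].strip().lower()
        if name == package.lower():
            line = "#" + line
        result.append(line)
    # splitlines() drops the final newline, so restore it when it was present.
    return "\n".join(result) + ("\n" if text.endswith("\n") else "")


sample = "packaging==23.2\nparallel-wavegan==0.6.1\npika==1.3.2\n"
print(comment_out_requirement(sample, "parallel-wavegan"))
```

A line that is already commented out is left alone, so running it twice does no harm.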
Then you will encounter a new error.
Error during nnmnkwii package installation
I got badly stuck on this error.
First, I tried installing it using the standard method from the repository:
(I skipped the pip method since I knew it wouldn't work).
git clone https://github.com/r9y9/nnmnkwii
cd nnmnkwii
python setup.py develop # or install
However, the same error occurred.
A lot of trial and error followed.
The cause, as mentioned in the issue below, is a compatibility issue due to the major update of numpy (2.0.0).
(Updated July 8, 2024)
I received a comment from @melon1891 saying that installation is now possible with the following method.
A newer version of nnmnkwii with this error fixed has been released, so changing the version in requirements.txt from 0.1.2 to 0.1.3 makes the installation work in one go.
Thank you for letting me know.
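The version change is a one-line edit, but for completeness here is a small sketch that does it programmatically. It assumes the pin is written as nnmnkwii==0.1.2 on its own line; the other lines in the sample are only illustrative, and set_pinned_version is a hypothetical helper.

```python
import re


def set_pinned_version(text, package, new_version):
    """Rewrite `package==<old>` to `package==<new_version>` on its own line."""
    pattern = re.compile(rf"^{re.escape(package)}==\S+$", flags=re.MULTILINE)
    # A function replacement avoids any escape handling in the new string.
    return pattern.sub(lambda match: f"{package}=={new_version}", text)


sample = "networkx==3.2.1\nnnmnkwii==0.1.2\nnumba==0.59.0\n"
print(set_pinned_version(sample, "nnmnkwii", "0.1.3"))
```

If the package is not pinned in the text at all, the text is returned unchanged.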
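I will leave the previous solution below for reference. Note that, according to the same comment, the fork it relies on no longer seems to be accessible.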
Previous solution
As noted above, the cause is a compatibility issue due to the major update of numpy (2.0.0), as discussed in the issue below.
Reading through that issue, I found that decfrr had created a fork that resolves the problem. (Thank you so much!)
I therefore cloned the decfrr/fix-numpy-version branch of the fork below and installed nnmnkwii from it.
git clone -b decfrr/fix-numpy-version https://github.com/decfrr/nnmnkwii.git
cd nnmnkwii
pip install -e .
Finally, I was able to install it as follows.
Successfully installed cython-3.0.10 fastdtw-0.3.4 nnmnkwii-0.1.2+e9a96b1 numpy-1.26.4 pysptk-0.2.2
(Aside 1)
By the way, if you use
python setup.py develop
instead of
pip install -e .
it results in an error. I don't even know why anymore.
(Aside 2)
Finally, with gratitude to decfrr, I posted the solution in the following issue.
I was able to install it using the link below.
https://github.com/decfrr/nnmnkwii/tree/decfrr/fix-numpy-version
git clone -b decfrr/fix-numpy-version https://github.com/decfrr/nnmnkwii.git
cd nnmnkwii
pip install -e .
Thank you, @decfrr.
Once Again
Now that another error had been resolved, I ran the following command again.
pip install -r requirements.txt
As shown below, the installation was completely successful.
Successfully installed Cython-3.0.8 Jinja2-3.1.3 PyAudio-0.2.14 PyYAML-6.0.1 aiohttp-3.9.3 aiosignal-1.3.1 annotated-types-0.6.0 antlr4-python3-runtime-4.9.3 anyio-4.2.0 argcomplete-3.2.2 attrs-23.2.0 beautifulsoup4-4.12.3 cachetools-5.3.2 certifi-2024.2.2 cffi-1.16.0 contourpy-1.2.0 distro-1.9.0 filelock-3.13.1 fonttools-4.48.1 frozenlist-1.4.1 fsspec-2024.2.0 gdown-5.1.0 google-api-core-2.16.2 google-auth-2.27.0 google-cloud-speech-2.24.1 googleapis-common-protos-1.62.0 grpcio-1.60.1 grpcio-status-1.60.1 h11-0.14.0 h5py-3.10.0 httpcore-1.0.2 httpx-0.26.0 hydra-core-1.3.2 idna-3.6 joblib-1.3.2 lazy_loader-0.3 librosa-0.10.1 llvmlite-0.42.0 matplotlib-3.8.2 mecab-python3-1.0.8 msgpack-1.0.7 multidict-6.0.5 networkx-3.2.1 numba-0.59.0 omegaconf-2.3.0 openai-0.28.0 packaging-23.2 pika-1.3.2 pillow-10.2.0 platformdirs-4.2.0 pooch-1.8.0 proto-plus-1.23.0 protobuf-4.25.2 pyasn1-0.5.1 pyasn1-modules-0.3.0 pycparser-2.21 pydantic-2.6.1 pydantic_core-2.16.2 pyopenjtalk-0.3.3 pyparsing-3.1.1 python-dateutil-2.8.2 pyworld-0.3.4 requests-2.31.0 rsa-4.9 scikit-learn-1.4.0 scipy-1.12.0 sniffio-1.3.0 soxr-0.3.7 sympy-1.12 threadpoolctl-3.2.0 tomlkit-0.12.3 torch-2.2.0 tqdm-4.66.1 ttslearn-0.2.2 typing_extensions-4.9.0 unidic-lite-1.0.8 urllib3-2.2.0 yarl-1.9.4 yq-3.2.3
Obtaining and Setting Various API Keys
Now that the package installation is complete, I will set up the various API keys.
Let's carry this out according to the instructions in the repository.
Enter them in the corresponding part of config/config.yaml.
It should follow the format below; quotation marks ("") are not necessary.
ChatGPT:
  api_key: sk-xxxx
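A forgotten or placeholder key only shows up later as a failure when dialogue.py calls the API, so a quick check beforehand can save time. The sketch below works on the config after it has been parsed into a dict. The function name missing_api_keys and the rule that a value containing "xxxx" is a placeholder are my own assumptions, not anything Remdis defines.

```python
def missing_api_keys(config, sections=("ChatGPT",)):
    """Return the names of sections whose api_key is empty or still a placeholder."""
    missing = []
    for section in sections:
        key = str((config.get(section) or {}).get("api_key") or "")
        # Assumption: a value containing "xxxx" is the sample placeholder.
        if not key or "xxxx" in key:
            missing.append(section)
    return missing


# The dict mirrors the YAML fragment above after parsing.
example = {"ChatGPT": {"api_key": "sk-xxxx"}}
print(missing_api_keys(example))
```

With PyYAML, which requirements.txt installs, the dict can be obtained by passing the opened config/config.yaml file to yaml.safe_load.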
Installing VAP
Follow the repository for this as well; I did not run into any particular errors here.
pip install torch==2.0.1 torchvision torchaudio
git clone https://github.com/ErikEkstedt/VAP.git
cd VAP
pip3 install -r requirements.txt
pip3 install -e .
pip3 install torchsummary
cd models/vap
unzip sw2japanese_public0.zip
Installing MMDAgent-EX
I haven't touched this part yet.
I want to do voice dialogue but don't need an avatar, so I'm skipping it.
(Even if you skip it, the avatar just won't be displayed, and voice dialogue will still work fine.)
How to Use
Proceed by following the instructions in the repository.
Running the RabbitMQ Server
With Docker Desktop running, execute the following command:
docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.12-management
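Before starting the modules, it can be useful to confirm that the broker is actually reachable. The command above publishes port 5672 (AMQP) and 15672 (management UI), so a plain TCP connection test is enough for a rough check. This is a sketch using only the standard library; port_is_open is a hypothetical helper, and a successful connection shows only that something is listening, not that RabbitMQ is healthy.

```python
import socket


def port_is_open(host="localhost", port=5672, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("RabbitMQ port reachable:", port_is_open())
```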
Running Text Dialogue
Please open three new terminals.
For all terminals, activate the virtual environment and move to the specified folder as shown below.
source env/bin/activate
cd modules/
Execute the following in the three terminals:
python tin.py
python dialogue.py
python tout.py
Next, enter text in the tin.py terminal screen.
Pressing Enter will send it.
Also, when you have finished saying what you want to say, press Enter again with nothing entered.
Pressing Enter while nothing is entered serves as the signal for input termination.
After that, the AI's utterance content should be displayed in the tout.py terminal screen.
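The input convention described above (Enter sends a chunk, Enter on an empty line ends the utterance) can be modeled in a few lines. This is only my own illustration of the convention, not the actual code of tin.py, and read_utterances is a hypothetical name.

```python
def read_utterances(lines):
    """Group input lines into utterances; an empty line closes the current one."""
    utterances, current = [], []
    for line in lines:
        text = line.strip()
        if text:
            current.append(text)        # Enter after text: send this chunk
        elif current:
            utterances.append(current)  # Enter on an empty line: end of input
            current = []
    return utterances


print(read_utterances(["Hello", "how are you?", "", "Goodbye", ""]))
```

In this model, text that is never followed by an empty line is not counted as a finished utterance, which matches the need to press Enter once more.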
Running Text Input → Voice Output
Since I haven't set up the Google Speech-to-Text API yet, I skip speech recognition here: I type my input as text and have the AI respond with voice.
One of the attractions of Remdis is its high flexibility in this area thanks to the use of RabbitMQ.
The execution method is the same as for text dialogue.
First, open four terminals and prepare them in the same way.
(This assumes the RabbitMQ server is running on Docker.)
source env/bin/activate
cd modules/
Execute the following commands in each terminal:
(For convenience, I'll list them in the same block, but when executing, please run each in a separate terminal.)
python tin.py
python dialogue.py
python tts.py
python output.py
With the modules started this way, the text entered in tin.py is passed to ChatGPT in dialogue.py, which generates the AI's utterance as text; tts.py converts it to speech with ttslearn, and output.py plays it through the speakers.
It's simple.
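To give a feel for why swapping modules is so easy, here is a toy version of the same idea: independent stages that only talk to each other through message queues. In-process queue.Queue objects stand in for RabbitMQ, and fake_dialogue and fake_tts are placeholders I invented. The real Remdis modules are separate processes that exchange messages through the broker, and none of this is their code.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def stage(func, inbox, outbox):
    """Run func on every message from inbox and forward the result to outbox."""
    while True:
        message = inbox.get()
        if message is STOP:
            outbox.put(STOP)
            break
        outbox.put(func(message))


def run_pipeline(texts, funcs):
    """Pass each text through the stages in order and collect the final outputs."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for text in texts:
        queues[0].put(text)
    queues[0].put(STOP)
    results = []
    while (item := queues[-1].get()) is not STOP:
        results.append(item)
    for t in threads:
        t.join()
    return results


# Placeholder stages standing in for dialogue.py and tts.py.
def fake_dialogue(text):
    return f"reply to: {text}"


def fake_tts(text):
    return f"<audio for '{text}'>"


print(run_pipeline(["hello"], [fake_dialogue, fake_tts]))
```

Because each stage only knows its inbox and outbox, replacing the text output stage with a speech stage is just a matter of passing a different list of functions, which is roughly what starting tts.py and output.py instead of tout.py does.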
Running Voice Dialogue
Proceed as described in the repository. It should probably work.
Summary
In this article, I have summarized the difficulties I encountered during the environment setup for Remdis, a multimodal dialogue system, along with a brief look at how to use it.
I hope this helps anyone facing similar issues.
By the way, I am also building a simple voice dialogue system.
While it's not as sophisticated as Remdis, it uses Style-Bert-VITS2, a very high-performance AI, for speech output, allowing for expressive and more human-like dialogue.
Also, because it is implemented very simply, the code should be easy to understand.
I've introduced it in the following article, so I'd be happy if you'd take a look!
(You can see the power of Style-Bert-VITS2 by watching the embedded video in the article.)
Thank you for reading this far!
Discussion
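Thank you for the useful article!
However, a newer version of nnmnkwii with this error fixed is now available, so changing the version in requirements.txt from 0.1.2 to 0.1.3 made it work in one go.
Conversely, the fixed fork now appears to have been deleted and can no longer be accessed.
Thank you for letting me know!
I have updated the article accordingly.
If you notice anything else, I would be glad to hear about it in the comments!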