Playing with Dify + Llama 3 (2) — Also Using Llama-3-ELYZA-JP-8B
Purpose
Some time has passed since Playing with Dify + Llama 3 (1), and since Dify has been updated to 0.6.13, I'm writing this follow-up article. To be honest, the update itself isn't the main point; I mainly want to document the procedure so I don't forget how to do it.
Also, the chat kept responding in English right away, which bothered me, so I'm switching to "Llama-3-ELYZA-JP-8B," which is reputed to be strong at Japanese.
Updating to Dify 0.6.13
Dify has reached v0.6.13. When I pulled the latest changes, there were quite a few differences, so I looked into them.
$ git clone https://github.com/langgenius/dify.git
$ git checkout 0.6.13
Reviewing Settings
According to "Supporting Dify 0.6.12: Running and Integrating n8n and Dify in a Docker environment on a VPS" (which also covers extras like security and nginx settings), the configuration method changed slightly in 0.6.12: instead of editing docker-compose.yaml directly, settings are now overridden through a .env file.
docker-compose.yaml
Continuing from the previous article (https://zenn.dev/derwind/articles/dwd-llm-dify01), I want to keep using ollama's Llama 3 locally, so the Docker containers need to be able to reach the host network. Therefore, only the following setting still goes into docker-compose.yaml directly rather than into .env:
$ git diff
diff --git a/docker/docker-compose.yaml b/docker/docker-compose.yaml
index 3d26ae2ad..883d06aee 100644
--- a/docker/docker-compose.yaml
+++ b/docker/docker-compose.yaml
@@ -177,6 +177,8 @@ services:
networks:
- ssrf_proxy_network
- default
+ extra_hosts:
+ - "host.docker.internal:host-gateway"
.env
First, copy .env.example to create your .env:
$ cd dify/docker
$ cp .env.example .env
Then, modify .env as appropriate. This time, I ambitiously tried to enable SSL encrypted communication as well:
$ diff -ub .env.example .env
--- .env.example 2024-07-10 00:51:44.335410700 +0900
+++ .env 2024-07-10 01:42:06.600874278 +0900
@@ -564,7 +564,7 @@
# Environment Variables for Nginx reverse proxy
# ------------------------------
NGINX_SERVER_NAME=_
-NGINX_HTTPS_ENABLED=false
+NGINX_HTTPS_ENABLED=true
# HTTP port
NGINX_PORT=80
# SSL settings are only applied when HTTPS_ENABLED is true
@@ -602,5 +602,5 @@
# ------------------------------
# Docker Compose Service Expose Host Port Configurations
# ------------------------------
-EXPOSE_NGINX_PORT=80
-EXPOSE_NGINX_SSL_PORT=443
+EXPOSE_NGINX_PORT=8102
+EXPOSE_NGINX_SSL_PORT=8103
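The same overrides can be applied with sed instead of hand-editing. A minimal sketch follows; for illustration it first writes the three stock values to a scratch .env so the example runs anywhere, but in practice you would run only the sed command against dify/docker/.env:

```shell
# For illustration only: create a scratch .env holding the stock values.
printf '%s\n' \
    'NGINX_HTTPS_ENABLED=false' \
    'EXPOSE_NGINX_PORT=80' \
    'EXPOSE_NGINX_SSL_PORT=443' > .env

# Apply the three overrides from the diff above in place.
sed -i \
    -e 's/^NGINX_HTTPS_ENABLED=.*/NGINX_HTTPS_ENABLED=true/' \
    -e 's/^EXPOSE_NGINX_PORT=.*/EXPOSE_NGINX_PORT=8102/' \
    -e 's/^EXPOSE_NGINX_SSL_PORT=.*/EXPOSE_NGINX_SSL_PORT=8103/' \
    .env

# Review the result.
cat .env
```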
SSL Encrypted Communication
$ cd nginx/ssl/
$ ls
dify.crt dify.key
Place the private key and server certificate here under these names. If the certificate has issues (for example, it is self-signed), the browser will warn you when you connect; proceed at your own risk.
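If you only need something for local testing, a self-signed pair can be generated with openssl. A minimal sketch, assuming the dify.key / dify.crt names match the nginx defaults in .env.example and with CN=localhost as a placeholder for your actual host name:

```shell
# Minimal sketch: create a self-signed key/certificate pair for local testing.
# Browsers will still warn about a self-signed certificate, as noted above.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=localhost" \
    -keyout dify.key -out dify.crt

# Sanity-check that the certificate parses and show its subject.
openssl x509 -in dify.crt -noout -subject
```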
That covers the update to 0.6.13. While I'm at it, I'd like to update the LLM engine as well.
Using Llama-3-ELYZA-JP-8B
I will refer to "Alright, I'm going to run Llama-3-ELYZA-JP-8B with Ollama!".
Discarding Llama 3
I introduced Llama 3 in Playing with Dify + Llama 3 (1), but I will discard it and replace it.
Update ollama just in case:
$ docker pull ollama/ollama:latest
Delete the current volume to temporarily discard the old Llama 3:
$ docker volume rm ollama
This will delete the volume of about 5GB that contained Llama 3.
Setting up Llama-3-ELYZA-JP-8B
Download the model from Hugging Face at elyza/Llama-3-ELYZA-JP-8B-GGUF.
$ mkdir Llama-3-ELYZA
$ cd Llama-3-ELYZA
$ curl -LO https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B-GGUF/resolve/main/Llama-3-ELYZA-JP-8B-q4_k_m.gguf
Next, prepare a Modelfile. While it seems you can obtain one using ollama show, I'll follow "Alright, I'm going to run Llama-3-ELYZA-JP-8B with Ollama!" exactly:
[Modelfile]
FROM ./Llama-3-ELYZA-JP-8B-q4_k_m.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"
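If you prefer to script this step, the Modelfile can also be written from the shell with a quoted heredoc. Quoting the delimiter ('EOF') makes the shell treat the body literally, so the {{ ... }} template placeholders survive untouched; note the blank line after each header tag, which the Llama 3 prompt format expects:

```shell
# Sketch: write the Modelfile from a script rather than an editor.
# The quoted 'EOF' delimiter disables all shell expansion inside the body.
cat > Modelfile <<'EOF'
FROM ./Llama-3-ELYZA-JP-8B-q4_k_m.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"
EOF

grep -c '^PARAMETER stop' Modelfile   # expect 4
```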
Start the ollama container in detached mode
To import ./Llama-3-ELYZA-JP-8B-q4_k_m.gguf, start the container by mounting the current directory to /work:
$ docker run -d --rm --gpus=all -v $PWD:/work -v ollama:/root/.ollama \
--name ollama ollama/ollama
Enter the ollama container and set up Llama-3-ELYZA-JP-8B
It should look something like this:
$ docker exec -it ollama bash
root@bd8c5bc7fa9e:/# cd /work/
root@bd8c5bc7fa9e:/work# ls
Llama-3-ELYZA-JP-8B-q4_k_m.gguf Modelfile
root@bd8c5bc7fa9e:/work# ollama create elyza:jp8b -f Modelfile
transferring model data
using existing layer sha256:91553c45080b11d95be21bb67961c9a5d2ed7556275423efaaad6df54ba9beae
creating new layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f
creating new layer sha256:c0aac7c7f00d8a81a8ef397cd78664957fbe0e09f87b08bc7afa8d627a8da87f
creating new layer sha256:bc526ae2132e2fc5e7ab4eef535720ce895c7a47429782231a33f62b0fa4401f
writing manifest
success
root@bd8c5bc7fa9e:/work# exit
$ docker container stop ollama
$ cd ..
$ rm -rf Llama-3-ELYZA
Finally, I discarded the downloaded model since it was no longer needed (as it had been transferred to the Docker volume). During the process, the downloaded model plus the Docker volume likely consumed about 10GB of storage temporarily.
Verification
$ docker run -d --rm --gpus=all -v ollama:/root/.ollama -p 11434:11434 \
--name ollama ollama/ollama
Start the container in detached mode as shown above and run the elyza:jp8b that was just set up:
$ docker exec -it ollama ollama run elyza:jp8b
>>> Hello~
Hello! How are you?
>>> What time is it now?
Since I am an AI, I cannot keep track of the current time. I only have access to information from the conversation or past data.
>>> /bye
It seems to have worked successfully.
Using Llama-3-ELYZA-JP-8B from Dify
Since the preparations on the Dify side are already done, start it as follows and access it from your browser.
$ cd dify/docker
$ docker compose up -d
Model Provider Settings
Click on the user icon to find "Settings," and configure the LLM from there. Since I've already deleted "llama3," remove the old information and set up "elyza:jp8b."
Model Name: elyza:jp8b
Base URL: http://host.docker.internal:11434
Completion mode: Chat
Model context size: 4096
Upper bound for max tokens: 4096
Vision support: No

Now the preparation is complete.
Settings from the Application
Select "Studio" from the top center, then choose "Chatbot" and "Create from scratch" or similar. Immediately set the LLM model to "elyza:jp8b" and prepare the chat app with appropriate instructions.

You can see that it responds in Japanese without even having to specifically ask it to "speak in Japanese."
Termination Process
$ docker compose down
to stop the Dify container stack, and then
$ docker container stop ollama
to terminate the ollama container.
Summary
As it turns out, both the update and the switch to a new LLM were surprisingly easy, and the chat now handles Japanese much more comfortably.