
Playing with Dify + Llama 3 (2) — Also Using Llama-3-ELYZA-JP-8B


Purpose

Some time has passed since Playing with Dify + Llama 3 (1), and since Dify has been updated to 0.6.13, I'm writing this follow-up article. To be honest, the update itself isn't the main point; I mainly need to document the procedure so I don't forget how to do it.

Also, I was having trouble with the chat falling back to English right away, so I'm switching to "Llama-3-ELYZA-JP-8B," which has a reputation for strong Japanese performance.

Updating to Dify 0.6.13

Dify has reached v0.6.13. When I pulled the latest code, there were quite a few differences, so I looked into them out of concern. The commands below fetch the repository and check out the 0.6.13 tag:

$ git clone https://github.com/langgenius/dify.git
$ git checkout 0.6.13

Reviewing Settings

According to "Supporting Dify 0.6.12: Running and integrating n8n and Dify in a Docker environment on a VPS (with extras like security and nginx settings)", the specifications have changed slightly since 0.6.12: instead of editing docker-compose.yaml directly, settings are now overridden via a .env file.
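The override mechanism is Compose's variable substitution: docker-compose.yaml references values as `${VAR:-default}`, and `docker compose` fills them in from `docker/.env`. Plain POSIX shell expands the same syntax, so the following sketch (with hypothetical values) illustrates the precedence:

```shell
# ${VAR:-default} takes VAR's value when set, otherwise the default --
# the same rule Compose applies to placeholders in docker-compose.yaml.
NGINX_PORT=8102                 # as if set in docker/.env
echo "${NGINX_PORT:-80}"        # prints 8102: the .env value wins
unset NGINX_PORT                # as if the line were absent from .env
echo "${NGINX_PORT:-80}"        # prints 80: the yaml's default applies
```

This is why a clean `.env` on top of an untouched docker-compose.yaml survives future `git pull`s much better than editing the yaml itself.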

docker-compose.yaml

Continuing from the previous article https://zenn.dev/derwind/articles/dwd-llm-dify01, I want to use ollama's Llama 3 locally, so I need to be able to reference the host network from within the Docker container. Therefore, I continued to configure only the following in docker-compose.yaml instead of .env:

$ git diff
diff --git a/docker/docker-compose.yaml b/docker/docker-compose.yaml
index 3d26ae2ad..883d06aee 100644
--- a/docker/docker-compose.yaml
+++ b/docker/docker-compose.yaml
@@ -177,6 +177,8 @@ services:
     networks:
       - ssrf_proxy_network
       - default
+    extra_hosts:
+      - "host.docker.internal:host-gateway"

.env

First, start by copying .env.example to .env.

$ cd dify/docker
$ cp .env.example .env

Then, modify .env as appropriate. This time, I ambitiously tried to enable SSL encrypted communication as well:

$ diff -ub .env.example .env
--- .env.example        2024-07-10 00:51:44.335410700 +0900
+++ .env        2024-07-10 01:42:06.600874278 +0900
@@ -564,7 +564,7 @@
 # Environment Variables for Nginx reverse proxy
 # ------------------------------
 NGINX_SERVER_NAME=_
-NGINX_HTTPS_ENABLED=false
+NGINX_HTTPS_ENABLED=true
 # HTTP port
 NGINX_PORT=80
 # SSL settings are only applied when HTTPS_ENABLED is true
@@ -602,5 +602,5 @@
 # ------------------------------
 # Docker Compose Service Expose Host Port Configurations
 # ------------------------------
-EXPOSE_NGINX_PORT=80
-EXPOSE_NGINX_SSL_PORT=443
+EXPOSE_NGINX_PORT=8102
+EXPOSE_NGINX_SSL_PORT=8103

SSL Encrypted Communication

$ cd nginx/ssl/
$ ls
dify.crt  dify.key

Place the private key and server certificate here with exactly these filenames. If the certificate has issues (for example, it is self-signed), the browser will warn you when you try to connect; proceed at your own risk.
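If you only need local testing and don't have a CA-issued certificate, a self-signed pair with the expected filenames can be generated with openssl. The CN and validity period below are placeholder choices, and browsers will still warn about the self-signed certificate:

```shell
# Generate a self-signed certificate/key pair named as Dify's nginx expects.
# CN=localhost and 365 days are arbitrary values for local testing only.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
        -subj "/CN=localhost" \
        -keyout dify.key -out dify.crt
ls dify.crt dify.key
```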

That covers the update to 0.6.13. While I'm at it, I'd like to try updating (?) the LLM engine as well.

Using Llama-3-ELYZA-JP-8B

I will refer to "Alright, I'm going to run Llama-3-ELYZA-JP-8B with Ollama!".

Discarding Llama 3

I introduced Llama 3 in Playing with Dify + Llama 3 (1), but I will discard it and replace it.

Update ollama just in case:

$ docker pull ollama/ollama:latest

Delete the current volume to temporarily discard the old Llama 3:

$ docker volume rm ollama

This will delete the volume of about 5GB that contained Llama 3.

Setting up Llama-3-ELYZA-JP-8B

Download the model from Hugging Face at elyza/Llama-3-ELYZA-JP-8B-GGUF.

$ mkdir Llama-3-ELYZA
$ cd Llama-3-ELYZA
$ curl -LO https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B-GGUF/resolve/main/Llama-3-ELYZA-JP-8B-q4_k_m.gguf
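A multi-gigabyte download can silently truncate, and Hugging Face can return an HTML error page instead of the file. GGUF files begin with the 4-byte ASCII magic "GGUF", so checking the first bytes catches both failure modes. The stand-in file below exists only to keep the snippet self-contained; in practice, run the `head` check against the real `Llama-3-ELYZA-JP-8B-q4_k_m.gguf`:

```shell
# A valid GGUF file starts with the ASCII magic "GGUF"; a truncated or
# HTML-error download will not. demo.gguf is a stand-in for illustration.
printf 'GGUF' > demo.gguf
if [ "$(head -c 4 demo.gguf)" = "GGUF" ]; then
    echo "magic OK"
else
    echo "not a GGUF file -- re-download" >&2
fi
```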

Next, prepare a Modelfile. While it seems you can obtain one from an existing model with ollama show (apparently via its --modelfile flag), I'll follow "Alright, I'm going to run Llama-3-ELYZA-JP-8B with Ollama!" exactly:

[Modelfile]

FROM ./Llama-3-ELYZA-JP-8B-q4_k_m.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"
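For reference, this TEMPLATE is Llama 3's chat format. Given a system prompt and one user turn, the rendered prompt sent to the model would look roughly like the fragment below (the model's reply then fills the .Response slot, terminated by one of the stop tokens):

```text
<|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

こんにちは<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```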

Start the ollama container in detached mode

To import ./Llama-3-ELYZA-JP-8B-q4_k_m.gguf, start the container by mounting the current directory to /work:

$ docker run -d --rm --gpus=all -v $PWD:/work -v ollama:/root/.ollama \
             --name ollama ollama/ollama

Enter the ollama container and set up Llama-3-ELYZA-JP-8B

It should look something like this:

$ docker exec -it ollama bash
root@bd8c5bc7fa9e:/# cd /work/
root@bd8c5bc7fa9e:/work# ls
Llama-3-ELYZA-JP-8B-q4_k_m.gguf  Modelfile
root@bd8c5bc7fa9e:/work# ollama create elyza:jp8b -f Modelfile
transferring model data
using existing layer sha256:91553c45080b11d95be21bb67961c9a5d2ed7556275423efaaad6df54ba9beae
creating new layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f
creating new layer sha256:c0aac7c7f00d8a81a8ef397cd78664957fbe0e09f87b08bc7afa8d627a8da87f
creating new layer sha256:bc526ae2132e2fc5e7ab4eef535720ce895c7a47429782231a33f62b0fa4401f
writing manifest
success
root@bd8c5bc7fa9e:/work# exit
$ docker container stop ollama
$ cd ..
$ rm -rf Llama-3-ELYZA

Finally, I discarded the downloaded model since it was no longer needed (as it had been transferred to the Docker volume). During the process, the downloaded model plus the Docker volume likely consumed about 10GB of storage temporarily.

Verification

$ docker run -d --rm --gpus=all -v ollama:/root/.ollama -p 11434:11434 \
             --name ollama ollama/ollama

Start the container in detached mode as shown above and run the elyza:jp8b that was just set up:

$ docker exec -it ollama ollama run elyza:jp8b
>>> Hello~
Hello! How are you?

>>> What time is it now?
Since I am an AI, I cannot keep track of the current time. I only have access to information from the conversation or past data.

>>> /bye

It seems to have worked successfully.

Using Llama-3-ELYZA-JP-8B from Dify

Since the preparations on the Dify side are already done, start it as follows and access it from your browser.

$ cd dify/docker
$ docker compose up -d

Model Provider Settings

Click on the user icon to find "Settings," and configure the LLM from there. Since I've already deleted "llama3," remove the old information and set up "elyza:jp8b."

Model Name: elyza:jp8b
Base URL: http://host.docker.internal:11434
Completion mode: Chat
Model context size: 4096
Upper bound for max tokens: 4096
Vision support: No

Now the preparation is complete.

Settings from the Application

Select "Studio" from the top center, then choose "Chatbot" and "Create from scratch" or similar. Immediately set the LLM model to "elyza:jp8b" and prepare the chat app with appropriate instructions.

You can see that it responds in Japanese without even having to specifically ask it to "speak in Japanese."

Termination Process

$ docker compose down

to shut down Dify's set of containers, and then

$ docker container stop ollama

to terminate the ollama container.

Summary

As it turns out, both the update and the switch to a new LLM were surprisingly easy, and chatting in Japanese now goes much more smoothly.
