# Running VILA-7B on a Raspberry Pi 4

## About VILA-7B

The GitHub README at https://github.com/NVlabs/VILA links to
https://github.com/mit-han-lab/TinyChatEngine,
which reportedly makes it possible to run VILA-7B on edge devices.

I have already confirmed that it runs under WSL on Windows, so next I check whether it also runs on a Raspberry Pi.
| Key | Value |
|---|---|
| Model | Raspberry Pi 4 Model B |
| RAM | 8GB |
| OS | Raspberry Pi OS (64-bit) |
| PRETTY_NAME | Debian GNU/Linux 12 (bookworm) |
```
raspi@raspberrypi4:~ $ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
raspi@raspberrypi4:~ $ uname -a
Linux raspberrypi4 6.6.62+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.62-1+rpt1 (2024-11-25) aarch64 GNU/Linux
```

Clone and build TinyChatEngine:

```sh
git clone --recursive https://github.com/mit-han-lab/TinyChatEngine
cd TinyChatEngine
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
cd llm
python tools/download_model.py --model LLaMA_3_8B_Instruct_awq_int4 --QM QM_ARM
# make chat -j  # the parallel build option makes the Pi freeze
make chat
```
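The freeze with `-j` is most likely the unbounded parallel build exhausting the Pi's RAM. As an untested middle ground, capping the job count should still shorten the build:

```sh
# Assumption: two parallel compile jobs fit in 8GB of RAM;
# fall back to the serial `make chat` if this also freezes.
make chat -j2
```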
Try a plain text chat first:

```sh
./chat
```
Not enough memory:

```
raspi@raspberrypi4:~/work/TinyChatEngine/llm $ ./chat
TinyChatEngine by MIT HAN Lab: https://github.com/mit-han-lab/TinyChatEngine
Using model: LLaMA_3_8B_Instruct
Using AWQ for 4bit quantization: https://github.com/mit-han-lab/llm-awq
Loading model... Killed
```
Switch the default boot target to console-only (no desktop) to free up RAM:

```sh
sudo systemctl set-default multi-user.target
```
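`set-default` only changes the target used from the next boot onward. To drop to the console session immediately and see how much RAM that freed (standard systemd/Linux commands, not part of the original run):

```sh
sudo systemctl isolate multi-user.target  # switch targets now, without rebooting
free -h                                   # check available memory afterwards
```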
Check the swap size. The default is 512MB:

```
raspi@raspberrypi4:~ $ sudo swapon --show
NAME      TYPE SIZE  USED PRIO
/var/swap file 512M 50.2M   -2
```
Increase the swap size:

```sh
sudo dphys-swapfile swapoff
cat /etc/dphys-swapfile
```

/etc/dphys-swapfile:
```
# /etc/dphys-swapfile - user settings for dphys-swapfile package
# author Neil Franklin, last modification 2010.05.05
# copyright ETH Zuerich Physics Departement
# use under either modified/non-advertising BSD or GPL license

# this file is sourced with . so full normal sh syntax applies

# the default settings are added as commented out CONF_*=* lines

# where we want the swapfile to be, this is the default
#CONF_SWAPFILE=/var/swap

# set size to absolute value, leaving empty (default) then uses computed value
#   you most likely don't want this, unless you have an special disk situation
CONF_SWAPSIZE=512

# set size to computed value, this times RAM size, dynamically adapts,
#   guarantees that there is enough swap without wasting disk space on excess
#CONF_SWAPFACTOR=2

# restrict size (computed and absolute!) to maximally this limit
#   can be set to empty for no limit, but beware of filled partitions!
#   this is/was a (outdated?) 32bit kernel limit (in MBytes), do not overrun it
#   but is also sensible on 64bit to prevent filling /var or even / partition
#CONF_MAXSWAP=2048
```
Change `CONF_SWAPSIZE=512` to `CONF_SWAPSIZE=8192`:

```sh
sudo vi /etc/dphys-swapfile
sudo dphys-swapfile setup
```
```
raspi@raspberrypi4:~ $ sudo dphys-swapfile setup
want /var/swap=8192MByte, restricting to config limit: 2048MBytes, checking existing: deleting wrong size file (536870912), generating swapfile ... of 2048MBytes
```

I asked for 8192MB, but it was capped at 2048MB.
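Judging from the comments in /etc/dphys-swapfile above, the 2048MB cap is the commented-out default `CONF_MAXSWAP=2048`. Raising it alongside `CONF_SWAPSIZE` should lift the limit (an assumption based on those comments; not verified in this run):

```sh
# /etc/dphys-swapfile: assumed settings to allow an 8GB swapfile
CONF_SWAPSIZE=8192
CONF_MAXSWAP=8192
```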
```sh
sudo dphys-swapfile swapon
sudo reboot
```

The swap size has been expanded to 2GB:
```
raspi@raspberrypi4:~ $ sudo swapon --show
NAME      TYPE SIZE USED PRIO
/var/swap file   2G   0B   -2
```
Even so, it still ends up getting Killed...

```
raspi@raspberrypi4:~/work/TinyChatEngine/llm $ ./chat
TinyChatEngine by MIT HAN Lab: https://github.com/mit-han-lab/TinyChatEngine
Using model: LLaMA_3_8B_Instruct
Using AWQ for 4bit quantization: https://github.com/mit-han-lab/llm-awq
Loading model... Killed
```
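A bare `Killed` with no other output is the signature of the kernel OOM killer, which can be confirmed in the kernel log (a standard check, not from the original run):

```sh
# Look for OOM-killer entries naming the chat process
sudo dmesg | grep -iE "out of memory|oom|killed process"
```

The 4-bit 8B weights are roughly 4-5GB on disk, so they nominally fit in 8GB of RAM plus 2GB of swap; presumably the peak memory usage while loading is considerably higher than the final footprint.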
## Checking whether the multimodal model runs

```sh
python -m pip install termvisage
python tools/download_model.py --model VILA_7B_awq_int4_CLIP_ViT-L --QM QM_ARM
./vila ../assets/figures/vlm_demo/pedestrian.png
```
Error:

```
(.venv) raspi@raspberrypi4:~/work/TinyChatEngine/llm $ ./vila ../assets/figures/vlm_demo/pedestrian.png
TinyChatEngine by MIT HAN Lab: https://github.com/mit-han-lab/TinyChatEngine
Using model: VILA1.5_8B
Using AWQ for 4bit quantization: https://github.com/mit-han-lab/llm-awq
Loading model... No such file or directory: models/CLIP_ViT_Large/encoder/layer0/layer_norm1/weight.bin
terminate called after throwing an instance of 'char const*'
./vila: line 7: 857 Aborted ./chat VILA1.5_8B INT4 5 $image_path
```
For some reason the VILA1.5_8B model is being used, so change the script to use VILA_7B:
```diff
--- a/llm/vila
+++ b/llm/vila
@@ -4,4 +4,4 @@ image_path="$1"
 termvisage $image_path -w 70
-./chat VILA1.5_8B INT4 5 $image_path
+./chat VILA_7B INT4 5 $image_path
```
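For context, llm/vila is just a small wrapper script: it renders the image in the terminal with termvisage and then launches ./chat in VLM mode. Reconstructed from the diff above (the surrounding lines are assumptions), the patched script looks roughly like this:

```sh
#!/bin/bash
# Sketch of llm/vila after the patch; reconstructed from the
# diff context above, so the exact contents may differ.
image_path="$1"                     # image passed on the command line
termvisage $image_path -w 70        # draw the image in the terminal
./chat VILA_7B INT4 5 $image_path   # run TinyChatEngine on the image
```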
```
(.venv) raspi@raspberrypi4:~/work/TinyChatEngine/llm $ ./vila ../assets/figures/vlm_demo/pedestrian.png
TinyChatEngine by MIT HAN Lab: https://github.com/mit-han-lab/TinyChatEngine
Using model: VILA_7B
Using AWQ for 4bit quantization: https://github.com/mit-han-lab/llm-awq
Loading model... No such file or directory: models/CLIP_ViT_Large/encoder/layer0/layer_norm1/weight.bin
terminate called after throwing an instance of 'char const*'
./vila: line 8: 892 Aborted ./chat VILA_7B INT4 5 $image_path
```
`No such file or directory: models/CLIP_ViT_Large/encoder/layer0/layer_norm1/weight.bin`

Weights from a different model directory are still being loaded. It looks like the source code itself will need to be modified.
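Before patching the C++ side, it is worth checking where download_model.py actually put the CLIP encoder weights, since the hard-coded path models/CLIP_ViT_Large/... clearly does not exist. The layout below is a guess; the `find` is just a way to locate the real path:

```sh
# List what was actually downloaded, then search for the missing weight file
ls models/
find models -name "weight.bin" -path "*layer_norm1*" 2>/dev/null
```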