🤒
LLM Gemma2 GPU活用
Gemma2
Google 発のGeminiのOSS LLM
- Google AI Studio
- Hugging face などに対応
- 9B 27Bなど大規模なモデルあり
前回はRakutenAI-7B で日本語対応
- やはりマイナーネタの精度は高くないがレスポンスが早い
- 軽い(とはいえ一般PCには結構きつい)のでWork Stationレベルなら色々できそう?
Gemma2 27Bモデルを使えるようにするまで
llama.cppから llama-cpp-pythonへスイッチ
(例)Anaconda 環境へのディプロイ
export CMAKE_ARGS="-DGGML_CUDA=on -DCUDA_PATH=/usr/local/cuda-12.3 -DCUDAToolkit_ROOT=/usr/local/cuda-12.3 -DCUDAToolkit_INCLUDE_DIR=/usr/local/cuda-12.3/include -DCUDAToolkit_LIBRARY_DIR=/usr/local/cuda-12.3/lib64"
export FORCE_CMAKE=1
- いざ、ビルド
pip install llama-cpp-python --force-reinstall --no-cache-dir
サンプル(sample)スクリプト
GUI
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06 Driver Version: 545.29.06 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3090 Off | 00000000:10:00.0 On | N/A |
| 53% 66C P2 167W / 350W | 22792MiB / 24576MiB | 83% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1988 G /usr/lib/xorg/Xorg 325MiB |
| 0 N/A N/A 2169 G /usr/bin/gnome-shell 67MiB |
| 0 N/A N/A 10544 G ...irefox/4539/usr/lib/firefox/firefox 134MiB |
| 0 N/A N/A 31358 C python 22206MiB |
+---------------------------------------------------------------------------------------+
CLI
文章の信憑性
- 日本の時事問題 > 有名な人 > スポーツなど > プライベート
生成例
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06 Driver Version: 545.29.06 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3090 Off | 00000000:10:00.0 On | N/A |
| 0% 58C P0 158W / 350W | 555MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1988 G /usr/lib/xorg/Xorg 305MiB |
| 0 N/A N/A 2169 G /usr/bin/gnome-shell 63MiB |
| 0 N/A N/A 10544 G ...irefox/4539/usr/lib/firefox/firefox 97MiB |
+---------------------------------------------------------------------------------------+
まとめ
- 無料で使おうと目論むなら単純なPCではやはり厳しい
- Webサイト埋め込みなどならなおのこと有償サービス推奨
- ただし、GPT-2のときよりはガセ度がかなりおちついていますね。(特に自社の印象を聞いたときなど)
Discussion