Closed on 2025/01/30

Trying out TinySwallow-1.5B, a small Japanese language model built with the new TAID method

LLM

Sakana AI

TinySwallow

TAID

kun432

https://x.com/SakanaAILabs/status/1884770664353325399
Official blog post
https://sakana.ai/taid-jp/
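TAID is a new method for knowledge distillation, which transfers the knowledge of an LLM to an SLM. Unlike existing knowledge distillation methods, TAID transfers the LLM's knowledge gradually, in step with the SLM's learning progress. By "always having a teacher at just the right level," it achieves efficient and effective knowledge transfer. Furthermore, to make small models built with TAID available to many people, we developed the English SLM "TAID-LLM-1.5B" and the Japanese SLM "TinySwallow-1.5B". Using TAID, we transferred knowledge from a 32B-parameter LLM to a 1.5B-parameter SLM and, as a result, succeeded in creating a small Japanese language model with the best performance among 1.5B-parameter models.
TAID paper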
https://arxiv.org/abs/2501.16937
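
To make the "teacher at just the right level" idea from the blog quote concrete, here is a toy sketch based on my reading of the paper: the distillation target is built by blending the student's and the teacher's logits with a weight `t` that moves from 0 to 1 during training. This is not the official implementation. The logit values are made up, and the linear schedule for `t` is a simplifying assumption (the paper updates `t` adaptively).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def intermediate_teacher(student_logits, teacher_logits, t):
    """Blend logits: t=0 gives the student's own distribution, t=1 the teacher's."""
    mixed = [(1.0 - t) * s + t * p for s, p in zip(student_logits, teacher_logits)]
    return softmax(mixed)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of the same length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Made-up logits over a 3-token vocabulary, for illustration only.
student_logits = [0.2, 0.1, -0.3]
teacher_logits = [2.0, -1.0, 0.5]
student_probs = softmax(student_logits)

for step in range(5):
    t = step / 4  # assumption: linear schedule, not the paper's adaptive update
    target = intermediate_teacher(student_logits, teacher_logits, t)
    loss = kl_divergence(target, student_probs)
    print(f"t={t:.2f}  KL(target || student)={loss:.4f}")
```

At `t=0` the target is the student's own distribution, so the loss is zero. As `t` grows the target moves toward the teacher, so the gap the student has to close grows gradually rather than all at once.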
Models
https://huggingface.co/collections/SakanaAI/tinyswallow-676cf5e57fff9075b5ddb7ec

kun432

Trying out TinySwallow-1.5B-Instruct

Running on a Colaboratory T4.

Load the model and tokenizer

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


# 1. Load the model
device = "cuda" if torch.cuda.is_available() else "cpu"
repo_id = "SakanaAI/TinySwallow-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.to(device)

!nvidia-smi

Output

Thu Jan 30 12:11:39 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   34C    P0             25W /   70W |    6092MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

About 6 GB of GPU memory in use.
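
As a sanity check on that figure, here is a back-of-the-envelope estimate of the memory taken by the weights alone. It assumes the nominal 1.5B parameter count from the model name and that the weights were loaded as float32, neither of which I verified. The CUDA context and any buffers are not included.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}

# Assumption: the nominal "1.5B" from the model name, not an exact count.
NOMINAL_PARAMS = 1.5e9


def weight_memory_gib(num_params, dtype):
    """Memory for the weights alone, in GiB (no activations, no CUDA context)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in ("float32", "float16"):
    print(f"{dtype}: {weight_memory_gib(NOMINAL_PARAMS, dtype):.2f} GiB")
```

The float32 estimate comes out at roughly 5.6 GiB, which is in line with the 6092MiB shown by nvidia-smi. Passing `torch_dtype=torch.float16` to `from_pretrained` should roughly halve the weight memory, though I have not tried that here.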

Generation

%%time

# 2. Input
# Prompt (in Japanese): "Briefly list five things that make horse racing appealing."
text = "競馬の魅力について、簡潔に5つリストアップしてください。"
messages = [{"role": "user", "content": text}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

# 3. Generate
output_ids = model.generate(
    input_ids.to(device),
    max_new_tokens=1024,
)
# Drop the prompt tokens so that only the newly generated tokens are decoded
output_ids = output_ids[:, input_ids.shape[1] :]
generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(generated_text)

1. 予想外の結果を楽しむことができる。
2. 競争心と戦略性を養うことができる。
3. 馬や騎手の個性や物語を感じることができる。
4. 常に新しい情報を得られる。
5. 楽しいエンターテイメントとして楽しめる。 




CPU times: user 3.36 s, sys: 0 ns, total: 3.36 s
Wall time: 4.6 s

Roughly 4 to 6 seconds per run.
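
`%%time` only works inside a notebook. For measuring the same thing in a plain script, a small helper along these lines should do. The names `timed` and `tokens_per_second` are my own, and the demo times a stand-in function rather than the model.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def tokens_per_second(num_new_tokens, elapsed_seconds):
    """Throughput from a token count and a wall-clock duration."""
    return num_new_tokens / elapsed_seconds


# Stand-in workload so the snippet runs without a GPU or a model.
result, elapsed = timed(sum, range(1000))
print(f"result={result}, elapsed={elapsed:.6f}s")
```

With the model loaded, the call would look like `output_ids, elapsed = timed(model.generate, input_ids.to(device), max_new_tokens=1024)`, and the number of new tokens is `output_ids.shape[1] - input_ids.shape[1]`. I did not record the token count for the run above, so no throughput figure is given here.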

nvidia-smi after generation

Output

Thu Jan 30 12:13:55 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   46C    P0             26W /   70W |    6160MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
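
One note on the generation code above: for a decoder-only model, `model.generate` returns the prompt tokens followed by the new tokens, which is why the code slices with `input_ids.shape[1]` before decoding. The same idea with plain Python lists, using made-up token IDs and a hypothetical helper name:

```python
def strip_prompt(output_rows, prompt_len):
    """Keep only the tokens that come after the prompt in each row."""
    return [row[prompt_len:] for row in output_rows]


# Made-up token IDs: the output row starts with a copy of the prompt.
prompt_ids = [101, 7, 42]
output_rows = [prompt_ids + [9, 8, 5]]

new_tokens = strip_prompt(output_rows, len(prompt_ids))
print(new_tokens)  # [[9, 8, 5]]
```

Without this step the decoded text would start with the prompt itself.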

kun432

GGUF versions are also available.
https://huggingface.co/SakanaAI/TinySwallow-1.5B-Instruct-GGUF
There is also an already-converted MLX version.
https://huggingface.co/mlx-community/TinySwallow-1.5B-Instruct-4bit
