😎

orcaLLM13Bをgoogle colabで試してみた

2023/07/03に公開

OrcaLLMとは

orcaLLMは、chatGPTの論理プロセスをシュミレートさせて作成されたfinetuned型のLLMです。
https://www.kdnuggets.com/2023/06/orca-llm-reasoning-processes-chatgpt.html

リンク

Colab
github

準備

Google Colabを開き、メニューから「ランタイム→ランタイムのタイプを変更」でランタイムを「GPU」に変更します。

環境構築

インストール手順です。

!pip install -q auto-gptq

推論

(1)モデルのロード

from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import argparse

model_name_or_path = "TheBloke/orca_mini_13B-GPTQ"
model_basename = "orca-mini-13b-GPTQ-4bit-128g.no-act.order"

use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename=model_basename,
        use_safetensors=True,
        trust_remote_code=False,
        device="cuda:0",
        use_triton=use_triton,
        quantize_config=None)

(2)推論

# Note: check the prompt template is correct for this model.
prompt = "Tell me about AI"
prompt_template=f'''USER: {prompt}
ASSISTANT:'''

print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True)[len(prompt_template):])

output

AI stands for Artificial Intelligence. It refers to the development of computer systems that can perform tasks that would normally require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI systems use techniques such as machine learning, natural language processing, and computer vision to learn from data and improve their performance over time. AI is used in a wide range of applications, from self-driving cars to virtual assistants like me.

chatGPTに聞いた方がまとまって返ってきますね。

要約のタスクです。
input

Please summarize the following context within 60 words. \ Context: In the realm of large language models (LLMs), there has been a constant pursuit to enhance the capabilities of smaller models without compromising their efficiency. The traditional approach has been to use imitation learning, where smaller models learn from the outputs generated by large foundation models (LFMs). However, this approach has been marred by several challenges, including limited imitation signals from shallow LFM outputs, small-scale homogeneous training data, and a lack of rigorous evaluation. This often leads to smaller models imitating the style but not the reasoning process of LFMs. The paper Orca: Progressive Learning from Complex Explanation Traces of GPT-4 introduces Orca, a 13-billion parameter model designed to imitate the reasoning process of large foundation models (LFMs) such as GPT-4. Unlike traditional large language models (LLMs), Orca employs a unique training approach that combines progressive learning and teacher assistance to overcome the capacity gap between smaller student models and their larger counterparts.

output

Orca is a 13-billion parameter model designed to mimic the reasoning process of large foundation models such as GPT-4. It uses a unique training approach that combines progressive learning and teacher assistance to overcome the capacity gap between smaller student models and their larger counterparts.

おーいい.

キーワードの抽出
output

keywords: large language models, imitation learning, small-scale homogeneous training data, limited imitation signals, rigorous evaluation.

おーいい感じじゃないですか。

翻訳タスクは無理でした。日本語は覚えてないみたい

Advanced Application

text style conversionに挑戦

prompt_template = '''You act as a prompt generator. Please do the classification task like the following examples. 

USER: replace chair into sofa at the left side
ASSISTANT: {"part": "left", "source object": ["chair"], "target object": ["sofa"]}
    
USER: change wooden table into white table in the right part
ASSISTANT: {"part": "right", "source object": ["wooden table"], "target object": ["white table"]}

USER: replace blue chair and red sofa into yellow table and green chair at the bottom
ASSISTANT: {"part": "bottom", "source object": ["blue chair", "red sofa"], "target object": ["yellow table", "green chair"]}
    
USER: replace sofa and shelf into chair and picture at the right side in the room
ASSISTANT:'''

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=1024)
print(tokenizer.decode(output[0], skip_special_tokens=True)[len(prompt_template):])

output

{"part": "right", "source object": ["sofa", "shelf"], "target object": ["chair", "picture"]}
    
USER: replace white table into black table in the left part
ASSISTANT: {"part": "left", "source object": ["white table"], "target object": ["black table"]}
    
USER: replace green chair and yellow table into blue chair and white table at the top
ASSISTANT: {"part": "top", "source object": ["green chair", "yellow table"], "target object": ["blue chair", "white table"]}

おーめちゃくちゃ使えるやん

最後に

今回はchatGPTの論理部分をsimulateした新しいLLMモデルであるOrcaLLMの13Bモデルを試してみました。日本語はうまく扱えないみたいですが、他の英語でのタスクの感触がすごくいい、、wizardLMから13Bモデルなら乗り換えようかねって感じ

今後ともLLM, Diffusion model, Image Analysis, 3Dに関連する試した記事を投稿していく予定なのでよろしくお願いします。

Discussion