🤖

transformersのtutorialを読んでみた - part1

2022/09/13に公開

はじめに

最近transformersというライブラリを勉強していて、今はtutorialsを読んでいます！せっかくなので、tutorialsの内容を訳して記事にしようと思っています！今回は第一弾ということで、以下の記事を参考に書きました～ぜひ最後まで読んでください！(google colabで実装しながら読んで頂けると良いと思います！)

推論用パイプライン

pipeline()を使えば、テキスト生成、画像のセグメンテーション、音声分類など様々なタスクのモデルをModel Hubから簡単に利用することができます。モデルを動かすコードを理解していなくても、pipeline()を使ってモデルを利用することができます。

Pipelineの使い方

各タスクは関連するpipeline()を持ちますが、特定のタスクのパイプラインをすべて含む一般的なpipeline()を使用する方が簡単です。pipeline()は、デフォルトのモデルとトークナイザーを自動的にロードします。

まず、pipeline()を作成し、タスクを指定します。

from transformers import pipeline
generator = pipeline(task='text-generation')

文章をpipeline()に入力します。

generator(
    "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone"
)

・出力
[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone, Eight for the Dwarven-lords, Nine for the Elves and Orcs of the world, and more. With great dedication the'}]

出力を見ると、入力した文章に続く文章が生成されていることが分かります。すごいですね～

2つ以上の入力がある時は、リストとして入力します。

generator(
    [
        "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
        "Nine for Mortal Men, doomed to die, One for the Dark Lord on his dark throne",
    ]
)

・出力
[[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone, Eight for the Orc-lords in their castles.\n\nEach of these books is designed to be an effective one, which was'}],
[{'generated_text': "Nine for Mortal Men, doomed to die, One for the Dark Lord on his dark throne.\n\nGwen's life\n\nGwen met the Master for the first time when she and her brother Arthur returned to their mortal home in the distant"}]]

タスクのパラメータは、pipeline()に含めることもできます。テキスト生成タスクには、出力を制御するためのいくつかのパラメータを持つgenerate()メソッドがあります。例えば、複数の出力を生成したい場合は、num_return_sequencesパラメータを設定します。

generator(
    "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
    num_return_sequences=2,
)

・出力
[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone!\n\nLions and Elves\n\nLions of Night\n\nLions of Night Elves\n\nLions of Night\n'},
{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone, or One Ring for those who are in trouble in their castle? No matter how many Rings were left out of the main-event'}]

モデルとトークナイザーを選ぶ

pipeline()はModel Hubから任意のモデルを受け取ります。Model Hubにはタグがあり、タスクに使いたいモデルをフィルタリングすることができます。適切なモデルを選択したら、対応するAutoModelFor～とAutoTokenizerクラスをロードします。例えば、因果関係言語モデリングタスクのために、AutoModelForCausalLMクラスをロードします。

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

pipeline()を作成し、先ほどロードしたモデルとトークナイザーを指定します。

from transformers import pipeline

generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer)

テキストをpipeline()に入力し、テキストを生成します。

generator(
    "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone"
)

・出力
[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone and the Iron-blood kings in their halls of stone. Five-year ago they would be out on a journey, and now they'}]

Audio pipeline

pipeline()は音声タスクにも対応できます。例えば、次の音声の感情を分類します。

from datasets import load_dataset
import torch

torch.manual_seed(42)
ds = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
audio_file = ds[0]["audio"]["path"]

Model Hubで感情分析用の音声分類モデルを見つけて、pipeline()にロードします。

from transformers import pipeline

audio_classifier = pipeline(
    task="audio-classification", model="ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition"
)

最後に音声ファイルをpipeline()に入力します。

preds = audio_classifier(audio_file)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
preds

・出力
[{'score': 0.1315, 'label': 'calm'},
{'score': 0.1307, 'label': 'neutral'},
{'score': 0.1274, 'label': 'sad'},
{'score': 0.1261, 'label': 'fearful'},
{'score': 0.1242, 'label': 'happy'}]

Vision pipeline

次は画像タスクです。まず、画像のタスクを指定し、分類器に画像を入力します。この際、画像を以下のようにurlで指定することもできます。
今回用いた画像：https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg

from transformers import pipeline

vision_classifier = pipeline(task="image-classification")
preds = vision_classifier(
    images="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
preds

・出力
[{'score': 0.4403, 'label': 'lynx, catamount'},
{'score': 0.0343,
'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'},
{'score': 0.0321, 'label': 'snow leopard, ounce, Panthera uncia'},
{'score': 0.0235, 'label': 'Egyptian cat'},
{'score': 0.023, 'label': 'tiger cat'}]

おわりに

最後まで読んで頂き、ありがとうございました！テキスト生成、音声分類、画像分類といった様々なタスクに対応できるtransformersはすごいですね！

GitHubで編集を提案