Working Through the Haystack Tutorial: How to Use Pipelines
I ran this in Colaboratory.
A GPU needs to be enabled, so select "T4 GPU" under "Notebook settings".
Install the packages.
%%bash
pip install --upgrade pip
pip install farm-haystack[colab,inference]
Enable telemetry.
from haystack.telemetry import tutorial_running
tutorial_running(11)
Configure logging.
import logging
logging.basicConfig(format="%(levelname)s - %(name)s - %(message)s", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.INFO)
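Prepare the documents and nodes, all in one go this time.
Write the documents into the DocumentStore, then initialize the BM25Retriever, EmbeddingRetriever, and FARMReader. The embeddings are computed with the EmbeddingRetriever at the same time.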
from haystack.document_stores import InMemoryDocumentStore
from haystack.utils import fetch_archive_from_http, convert_files_to_docs, clean_wiki_text
from haystack.nodes import BM25Retriever, EmbeddingRetriever, FARMReader
document_store = InMemoryDocumentStore(use_bm25=True)
doc_dir = "data/tutorial11"
s3_url = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt11.zip"
fetch_archive_from_http(url=s3_url, output_dir=doc_dir)
got_docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split_paragraphs=True)
document_store.delete_documents()
document_store.write_documents(got_docs)
bm25_retriever = BM25Retriever(document_store=document_store)
embedding_retriever = EmbeddingRetriever(
document_store=document_store, embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1"
)
document_store.update_embeddings(embedding_retriever, update_existing_embeddings=False)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
Prebuilt pipelines
Haystack ships with prebuilt pipelines, so I start by trying those.
ExtractiveQAPipeline produces answers by extractive QA.
from haystack.pipelines import ExtractiveQAPipeline
from haystack.utils import print_answers
p_extractive_premade = ExtractiveQAPipeline(reader=reader, retriever=bm25_retriever)
res = p_extractive_premade.run(
query="Who is the father of Arya Stark?", params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
)
print_answers(res, details="minimum")
Result:
'Answers:'
[ { 'answer': 'Eddard',
'context': 's Nymeria after a legendary warrior queen. She travels '
"with her father, Eddard, to King's Landing when he is made "
'Hand of the King. Before she leaves,'},
{ 'answer': 'Ned',
'context': '\n'
'====Season 1====\n'
'Arya accompanies her father Ned and her sister Sansa to '
"King's Landing. Before their departure, Arya's "
'half-brother Jon Snow gifts A'},
{ 'answer': 'Lord Eddard Stark',
'context': 'ark daughters.\n'
'During the Tourney of the Hand to honour her father Lord '
'Eddard Stark, Sansa Stark is enchanted by the knights '
'performing in the event.'},
{ 'answer': 'Eddard Stark',
'context': "arrested by Catelyn Stark. He then races to King's Landing "
"to inform Eddard Stark. During Lord Eddard's execution, he "
'finds Arya Stark and shields her'},
{ 'answer': 'Joffrey',
'context': 'laying with one of his wooden toys.\n'
"After Eddard discovers the truth of Joffrey's paternity, "
'he tells Sansa that they will be heading back to Winterfe'}]
DocumentSearchPipeline runs only the retrieval step.
from haystack.pipelines import DocumentSearchPipeline
from haystack.utils import print_documents
p_retrieval = DocumentSearchPipeline(embedding_retriever)
res = p_retrieval.run(query="Who is the father of Arya Stark?", params={"Retriever": {"top_k": 10}})
print_documents(res, max_text_len=200)
Result:
Query: Who is the father of Arya Stark?
{ 'content': '\n'
'=== Background ===\n'
'Arya is the third child and younger daughter of Eddard and '
'Catelyn Stark and is nine years old at the beginning of the '
'book series. She has five siblings: an older brother Robb, '
'a...',
'name': '43_Arya_Stark.txt'}
{ 'content': '\n'
'===Arya Stark===\n'
"'''Arya Stark''' portrayed by Maisie Williams. Arya Stark of "
'House Stark is the younger daughter and third child of Lord '
'Eddard and Catelyn Stark of Winterfell. Ever the tomboy, '
'Arya...',
'name': '349_List_of_Game_of_Thrones_characters.txt'}
{ 'content': "'''Arya Stark''' is a fictional character in American author "
"George R. R. Martin's ''A Song of Ice and Fire'' epic fantasy "
'novel series. She is a prominent point of view character in '
'the novels with ...',
'name': '43_Arya_Stark.txt'}
{ 'content': '\n'
'=== Arya Stark ===\n'
'Arya Stark is the third child and younger daughter of Eddard '
'and Catelyn Stark. She serves as a POV character for 33 '
"chapters throughout ''A Game of Thrones'', ''A Clash of "
"Kings''...",
'name': '30_List_of_A_Song_of_Ice_and_Fire_characters.txt'}
{ 'content': '\n'
'== Character description ==\n'
"Robb is 14 years old at the beginning of ''A Game of Thrones'' "
'(1996). He is the oldest legitimate son of Eddard "Ned" Stark '
'and his wife Catelyn, and has five siblings: S...',
'name': '208_Robb_Stark.txt'}
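The prebuilt pipelines apparently include the following.
- Document search (DocumentSearchPipeline)
- Document search with summarization (SearchSummarizationPipeline)
- FAQ-style QA (FAQPipeline)
- Translated search (TranslationWrapperPipeline)
Visualizing pipelines
Pipelines support GraphViz, so they can be rendered as a graph.
Install the GraphViz library and the Python package.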
!apt install libgraphviz-dev
!pip install pygraphviz
Passing an image path to the pipeline object's draw method writes the graph to that file. In Colaboratory, IPython.display can show it inline.
For the ExtractiveQAPipeline:
from IPython.display import Image,display
pipeline_extractive_premade_image = "pipeline_extractive_premade.png"
p_extractive_premade.draw(pipeline_extractive_premade_image)
display(Image(pipeline_extractive_premade_image))
For the DocumentSearchPipeline:
from IPython.display import Image,display
pipeline_retrieval_image = "pipeline_retrieval.png"
p_retrieval.draw(pipeline_retrieval_image)
display(Image(pipeline_retrieval_image))
Custom pipelines
Instead of the prebuilt ones, you can build your own pipelines. Here I rebuild the extractive QA pipeline by hand: initialize a Pipeline and add each node to it with the add_node method.
from haystack.pipelines import Pipeline
from IPython.display import Image,display
p_extractive = Pipeline()
p_extractive.add_node(component=bm25_retriever, name="Retriever", inputs=["Query"])
p_extractive.add_node(component=reader, name="Reader", inputs=["Retriever"])
res = p_extractive.run(query="Who is the father of Arya Stark?", params={"Retriever": {"top_k": 10}})
print_answers(res, details="minimum")
pipeline_extractive_image = "pipeline_extractive.png"
p_extractive.draw(pipeline_extractive_image)
display(Image(pipeline_extractive_image))
Result:
'Answers:'
[ { 'answer': 'Eddard',
'context': 's Nymeria after a legendary warrior queen. She travels '
"with her father, Eddard, to King's Landing when he is made "
'Hand of the King. Before she leaves,'},
{ 'answer': 'Ned',
'context': '\n'
'====Season 1====\n'
'Arya accompanies her father Ned and her sister Sansa to '
"King's Landing. Before their departure, Arya's "
'half-brother Jon Snow gifts A'},
{ 'answer': 'Lord Eddard Stark',
'context': 'ark daughters.\n'
'During the Tourney of the Hand to honour her father Lord '
'Eddard Stark, Sansa Stark is enchanted by the knights '
'performing in the event.'},
{ 'answer': 'Eddard Stark',
'context': "arrested by Catelyn Stark. He then races to King's Landing "
"to inform Eddard Stark. During Lord Eddard's execution, he "
'finds Arya Stark and shields her'},
{ 'answer': 'Joffrey',
'context': 'laying with one of his wooden toys.\n'
"After Eddard discovers the truth of Joffrey's paternity, "
'he tells Sansa that they will be heading back to Winterfe'},
{ 'answer': 'Eddard "Ned" Stark',
'context': '=\n'
"After Varys tells him that Sansa Stark's life is also at "
'stake, Eddard "Ned" Stark agrees to make a false '
'confession and swear loyalty to King Joffr'},
{ 'answer': 'Eddard Stark',
'context': 'e from House Tully in the Riverlands region prior to her '
'marriage to Eddard Stark. She has her hair dyed dark brown '
'later on while in the Vale, disgui'},
{ 'answer': 'Eddard Stark and Catelyn Stark',
'context': 'ces==\n'
'Sansa Stark is the second child and elder daughter of '
'Eddard Stark and Catelyn Stark. She was born and raised in '
'Winterfell, until leaving with '},
{ 'answer': 'Robb',
'context': 'ost, Walder was able to negotiate marriage contracts for '
'his children to Robb and Arya Stark. But during Season 2 '
'Robb broke his word and married Lady'},
{ 'answer': 'Ned Stark',
'context': ' to reveal her true identity, and is surprised to learn '
"she is in fact Ned Stark's daughter. After the Goldcloaks "
'get help from Ser Amory Lorch and hi'}]
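The visualized pipeline.
Combining multiple Retrievers in a custom pipeline
The EmbeddingRetriever and BM25Retriever can be combined to build a pipeline that uses both sparse and dense retrieval. This is done with a node called JoinDocuments.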
from haystack.nodes import JoinDocuments
p_ensemble = Pipeline()
p_ensemble.add_node(component=bm25_retriever, name="BM25Retriever", inputs=["Query"])
p_ensemble.add_node(component=embedding_retriever, name="EmbeddingRetriever", inputs=["Query"])
p_ensemble.add_node(
component=JoinDocuments(join_mode="concatenate"), name="JoinResults", inputs=["BM25Retriever", "EmbeddingRetriever"]
)
p_ensemble.add_node(component=reader, name="Reader", inputs=["JoinResults"])
res = p_ensemble.run(
query="Who is the father of Arya Stark?", params={"EmbeddingRetriever": {"top_k": 5}, "BM25Retriever": {"top_k": 5}}
)
print_answers(res, details="minimum")
pipeline_ensemble_image = "pipeline_ensemble.png"
p_ensemble.draw(pipeline_ensemble_image)
display(Image(pipeline_ensemble_image))
Result:
'Answers:'
[ { 'answer': 'Eddard',
'context': 's Nymeria after a legendary warrior queen. She travels '
"with her father, Eddard, to King's Landing when he is made "
'Hand of the King. Before she leaves,'},
{ 'answer': 'Eddard and Catelyn Stark',
'context': 'tark ===\n'
'Arya Stark is the third child and younger daughter of '
'Eddard and Catelyn Stark. She serves as a POV character '
"for 33 chapters throughout ''A "},
{ 'answer': 'Lord Eddard Stark',
'context': 'ark daughters.\n'
'During the Tourney of the Hand to honour her father Lord '
'Eddard Stark, Sansa Stark is enchanted by the knights '
'performing in the event.'},
{ 'answer': 'Lord Eddard Stark',
'context': "Game of Thrones'', Arya is the third child and younger "
'daughter of Lord Eddard Stark and his wife Lady Catelyn '
'Stark. She is tomboyish, headstrong, f'},
{ 'answer': 'Eddard and Catelyn Stark',
'context': 'Background ===\n'
'Arya is the third child and younger daughter of Eddard and '
'Catelyn Stark and is nine years old at the beginning of '
'the book series. Sh'},
{ 'answer': 'Lord Eddard and Catelyn Stark',
'context': 'rk of House Stark is the younger daughter and third child '
'of Lord Eddard and Catelyn Stark of Winterfell. Ever the '
'tomboy, Arya would rather be traini'},
{ 'answer': 'Eddard Stark',
'context': "arrested by Catelyn Stark. He then races to King's Landing "
"to inform Eddard Stark. During Lord Eddard's execution, he "
'finds Arya Stark and shields her'},
{ 'answer': 'Joffrey',
'context': 'laying with one of his wooden toys.\n'
"After Eddard discovers the truth of Joffrey's paternity, "
'he tells Sansa that they will be heading back to Winterfe'},
{ 'answer': 'Jon Snow',
'context': '. She wields a smallsword named Needle, a gift from her '
'half-brother, Jon Snow, and is trained in the Braavosi '
'style of sword fighting by Syrio Forel'},
{ 'answer': 'Ned Stark',
'context': ' to reveal her true identity, and is surprised to learn '
"she is in fact Ned Stark's daughter. After the Goldcloaks "
'get help from Ser Amory Lorch and hi'}]
The visualized pipeline.
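To get a feel for what joining two retrievers' results involves, here is a minimal plain-Python sketch. It is not Haystack's implementation: the dictionary-based documents, the function name concatenate_results, and the rule of keeping the higher-scoring copy of a duplicate are all assumptions made for illustration. Note also that BM25 and embedding scores are on different scales, so sorting the merged list by raw score is a simplification.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for a retrieved document: just an id and a score.
Doc = Dict[str, Any]


def concatenate_results(*result_lists: List[Doc]) -> List[Doc]:
    """Merge several retriever outputs, keeping each document id only once."""
    merged: Dict[str, Doc] = {}
    for results in result_lists:
        for doc in results:
            doc_id = doc["id"]
            # Assumption: if two retrievers return the same document,
            # keep the copy with the higher score.
            if doc_id not in merged or doc["score"] > merged[doc_id]["score"]:
                merged[doc_id] = doc
    # Sort best-first so a downstream reader sees the strongest candidates first.
    return sorted(merged.values(), key=lambda d: d["score"], reverse=True)


bm25_hits = [{"id": "a", "score": 0.9}, {"id": "b", "score": 0.4}]
embedding_hits = [{"id": "b", "score": 0.7}, {"id": "c", "score": 0.5}]
joined = concatenate_results(bm25_hits, embedding_hits)
print([d["id"] for d in joined])  # ['a', 'b', 'c']
```

Document "b" is returned by both retrievers but appears only once in the joined list, which is the property that matters before handing the documents to the reader.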
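Custom nodes
Once you want custom pipelines, you naturally want custom nodes as well. Haystack supports this too.
The requirements for a custom node are as follows.
- Create a class that inherits from BaseComponent.
- Add a run() method to the class, with the required and optional arguments the processing needs. These arguments must be supplied as inputs to the pipeline, inside params, or as the output of a preceding node.
- Put the processing logic inside run() (for example, reformatting the query).
- Return a tuple of the node's output data and the name of the outgoing edge (by default "output_1" for a node with a single output).
- Add a class attribute outgoing_edges = 1, which defines the number of output options from the node. A larger value is only needed for a decision node (covered below).
The template for a custom node looks like this.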
from haystack import BaseComponent
from typing import Optional, List
class CustomNode(BaseComponent):
    outgoing_edges = 1

    def run(self, query: str, my_optional_param: Optional[int]):
        # process the inputs
        output = {"my_output": ...}
        return output, "output_1"

    def run_batch(self, queries: List[str], my_optional_param: Optional[int]):
        # process the inputs
        output = {"my_output": ...}
        return output, "output_1"
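As a concrete version of the "reformat the query" case, here is a small sketch of the run() contract on its own. QueryCleaner and its lowercase argument are hypothetical names of mine, and the class is written as a plain Python class so that it runs without Haystack installed; in a real pipeline it would inherit from BaseComponent as in the template.

```python
from typing import Dict, Tuple


class QueryCleaner:
    """Hypothetical node; in Haystack this would inherit from BaseComponent."""

    # One outgoing edge: this node transforms data but does not route it.
    outgoing_edges = 1

    def run(self, query: str, lowercase: bool = False) -> Tuple[Dict[str, str], str]:
        # Collapse runs of whitespace and strip leading/trailing spaces.
        cleaned = " ".join(query.split())
        if lowercase:
            cleaned = cleaned.lower()
        # Return the output data plus the name of the outgoing edge.
        return {"query": cleaned}, "output_1"


node = QueryCleaner()
output, edge = node.run("  Who is   the father of Arya Stark? ")
print(output, edge)  # {'query': 'Who is the father of Arya Stark?'} output_1
```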
Decision nodes
A decision node routes data through the pipeline. A concrete example is the quickest way to understand it.
from haystack import BaseComponent
from typing import Optional, List
class CustomQueryClassifier(BaseComponent):
    outgoing_edges = 2

    def run(self, query: str):
        if "?" in query:
            return {}, "output_2"
        else:
            return {}, "output_1"

    def run_batch(self, queries: List[str]):
        split = {"output_1": {"queries": []}, "output_2": {"queries": []}}
        for query in queries:
            if "?" in query:
                split["output_2"]["queries"].append(query)
            else:
                split["output_1"]["queries"].append(query)
        return split, "split"
p_classifier = Pipeline()
p_classifier.add_node(component=CustomQueryClassifier(), name="QueryClassifier", inputs=["Query"])
p_classifier.add_node(component=bm25_retriever, name="BM25Retriever", inputs=["QueryClassifier.output_1"])
p_classifier.add_node(component=embedding_retriever, name="EmbeddingRetriever", inputs=["QueryClassifier.output_2"])
p_classifier.add_node(component=reader, name="QAReader", inputs=["BM25Retriever", "EmbeddingRetriever"])
res_1 = p_classifier.run(query="Who is the father of Arya Stark?")
print("\nEmbedding Retriever Results" + "\n" + "=" * 15)
print_answers(res_1)
print()
res_2 = p_classifier.run(query="Arya Stark father")
print("\nBM25Retriever Results" + "\n" + "=" * 15)
print_answers(res_2)
pipeline_classifier_image = "pipeline_classifier.png"
p_classifier.draw(pipeline_classifier_image)
display(Image(pipeline_classifier_image))
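This creates a custom node called CustomQueryClassifier. Depending on whether the query contains "?", it returns exactly one of its two output edges. Wired into the pipeline, it produces a branch: queries containing "?" go to the EmbeddingRetriever, and all others go to the BM25Retriever.
The result looks like this.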
Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 2.92 Batches/s]
Embedding Retriever Results
===============
'Query: Who is the father of Arya Stark?'
'Answers:'
[ <Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.9058835506439209, 'context': "tark ===\nArya Stark is the third child and younger daughter of Eddard and Catelyn Stark. She serves as a POV character for 33 chapters throughout ''A ", 'offsets_in_document': [{'start': 74, 'end': 98}], 'offsets_in_context': [{'start': 63, 'end': 87}], 'document_ids': ['965789c741c68963042e85b6e7b89757'], 'meta': {'name': '30_List_of_A_Song_of_Ice_and_Fire_characters.txt'}}>,
<Answer {'answer': 'Lord Eddard Stark', 'type': 'extractive', 'score': 0.8793494701385498, 'context': "Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark. She is tomboyish, headstrong, f", 'offsets_in_document': [{'start': 419, 'end': 436}], 'offsets_in_context': [{'start': 67, 'end': 84}], 'document_ids': ['2ee56bdd46dfd30b23f91bcc046456a4'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Robb', 'type': 'extractive', 'score': 0.8409849405288696, 'context': "f the Vale, restoring Stark rule in the North in the process. Meanwhile, Robb's youngest sister Arya Stark returns to Westeros, murders Walder Frey, a", 'offsets_in_document': [{'start': 3268, 'end': 3272}], 'offsets_in_context': [{'start': 73, 'end': 77}], 'document_ids': ['d7b414a49fd49fac14f535bef7169987'], 'meta': {'name': '208_Robb_Stark.txt'}}>,
<Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.8221831321716309, 'context': 'Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series. Sh', 'offsets_in_document': [{'start': 68, 'end': 92}], 'offsets_in_context': [{'start': 63, 'end': 87}], 'document_ids': ['d7a98cb66f592540fa7de20bf46a5e64'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Jon Snow', 'type': 'extractive', 'score': 0.8216632604598999, 'context': ' of Winterfell. She is particularly close to her bastard half-brother Jon Snow, who encourages her to learn how to fight and gives her the smallsword', 'offsets_in_document': [{'start': 656, 'end': 664}], 'offsets_in_context': [{'start': 71, 'end': 79}], 'document_ids': ['a64bb94eab347d5cc10686c16b52a4dd'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Balon Greyjoy', 'type': 'extractive', 'score': 0.7154540419578552, 'context': "He sends Theon to the Iron Islands hoping to broker an alliance with Balon Greyjoy, Theon's father. In exchange for Greyjoy support, Robb as the King ", 'offsets_in_document': [{'start': 983, 'end': 996}], 'offsets_in_context': [{'start': 69, 'end': 82}], 'document_ids': ['d7b414a49fd49fac14f535bef7169987'], 'meta': {'name': '208_Robb_Stark.txt'}}>,
<Answer {'answer': 'Lord Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.6865442991256714, 'context': 'rk of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be traini', 'offsets_in_document': [{'start': 134, 'end': 163}], 'offsets_in_context': [{'start': 61, 'end': 90}], 'document_ids': ['623e446a1b048d81130a29e262152cd5'], 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}}>,
<Answer {'answer': 'Jon Snow', 'type': 'extractive', 'score': 0.5814977884292603, 'context': '. She wields a smallsword named Needle, a gift from her half-brother, Jon Snow, and is trained in the Braavosi style of sword fighting by Syrio Forel', 'offsets_in_document': [{'start': 662, 'end': 670}], 'offsets_in_context': [{'start': 71, 'end': 79}], 'document_ids': ['2ee56bdd46dfd30b23f91bcc046456a4'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Walder Frey', 'type': 'extractive', 'score': 0.5672119855880737, 'context': "while, Robb's youngest sister Arya Stark returns to Westeros, murders Walder Frey, and later uses his face to disguise herself as Frey, to poison all ", 'offsets_in_document': [{'start': 3331, 'end': 3342}], 'offsets_in_context': [{'start': 70, 'end': 81}], 'document_ids': ['d7b414a49fd49fac14f535bef7169987'], 'meta': {'name': '208_Robb_Stark.txt'}}>,
<Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.5108734369277954, 'context': '\n=== Robb Stark ===\nRobb Stark is the oldest child of Eddard and Catelyn Stark, and the heir to Winterfell. He is not a POV character, but features in', 'offsets_in_document': [{'start': 54, 'end': 78}], 'offsets_in_context': [{'start': 54, 'end': 78}], 'document_ids': ['a9aef0a50f4ae283230cd2de274ffaf2'], 'meta': {'name': '30_List_of_A_Song_of_Ice_and_Fire_characters.txt'}}>]
Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00, 4.52 Batches/s]
BM25Retriever Results
===============
'Query: Arya Stark father'
'Answers:'
[ <Answer {'answer': 'Eddard', 'type': 'extractive', 'score': 0.9085890054702759, 'context': "s Nymeria after a legendary warrior queen. She travels with her father, Eddard, to King's Landing when he is made Hand of the King. Before she leaves,", 'offsets_in_document': [{'start': 147, 'end': 153}], 'offsets_in_context': [{'start': 72, 'end': 78}], 'document_ids': ['ba2a8e87ddd95e380bec55983ee7d55f'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Ned', 'type': 'extractive', 'score': 0.7877868413925171, 'context': "\n====Season 1====\nArya accompanies her father Ned and her sister Sansa to King's Landing. Before their departure, Arya's half-brother Jon Snow gifts A", 'offsets_in_document': [{'start': 46, 'end': 49}], 'offsets_in_context': [{'start': 46, 'end': 49}], 'document_ids': ['180c2a6b36369712b361a80842e79356'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Lord Eddard Stark', 'type': 'extractive', 'score': 0.7179794311523438, 'context': 'ark daughters.\nDuring the Tourney of the Hand to honour her father Lord Eddard Stark, Sansa Stark is enchanted by the knights performing in the event.', 'offsets_in_document': [{'start': 659, 'end': 676}], 'offsets_in_context': [{'start': 67, 'end': 84}], 'document_ids': ['d1f36ec7170e4c46cde65787fe125dfe'], 'meta': {'name': '332_Sansa_Stark.txt'}}>,
<Answer {'answer': 'Balon', 'type': 'extractive', 'score': 0.40754902362823486, 'context': "sgusted, Robb acquiesces to Theon's further captivity, as Theon's father Balon has recently died and Theon's absence presents a succession crisis for ", 'offsets_in_document': [{'start': 274, 'end': 279}], 'offsets_in_context': [{'start': 73, 'end': 78}], 'document_ids': ['fc56eb160221cbdc74d223383680dbeb'], 'meta': {'name': '487_Ramsay_Bolton.txt'}}>,
<Answer {'answer': 'Robert', 'type': 'extractive', 'score': 0.2134847790002823, 'context': "d Gendry over to them - King Joffrey has ordered that all of his father Robert's bastards be killed, but Yoren turns the Goldcloaks away. Later, Gendr", 'offsets_in_document': [{'start': 362, 'end': 368}], 'offsets_in_context': [{'start': 72, 'end': 78}], 'document_ids': ['dd4e070a22896afa81748d6510006d2'], 'meta': {'name': '191_Gendry.txt'}}>,
<Answer {'answer': 'Prince Joffrey', 'type': 'extractive', 'score': 0.14156031608581543, 'context': "\n==== ''A Game of Thrones'' ====\nPrince Joffrey is taken by his parents to Winterfell and is betrothed to Sansa Stark in order to create an alliance b", 'offsets_in_document': [{'start': 33, 'end': 47}], 'offsets_in_context': [{'start': 33, 'end': 47}], 'document_ids': ['58e878049c1864c83325dd2a13ab37ee'], 'meta': {'name': '37_Joffrey_Baratheon.txt'}}>,
<Answer {'answer': 'Joffrey', 'type': 'extractive', 'score': 0.07759759575128555, 'context': "laying with one of his wooden toys.\nAfter Eddard discovers the truth of Joffrey's paternity, he tells Sansa that they will be heading back to Winterfe", 'offsets_in_document': [{'start': 1161, 'end': 1168}], 'offsets_in_context': [{'start': 72, 'end': 79}], 'document_ids': ['d1f36ec7170e4c46cde65787fe125dfe'], 'meta': {'name': '332_Sansa_Stark.txt'}}>,
<Answer {'answer': 'Eddard', 'type': 'extractive', 'score': 0.06299781054258347, 'context': 's bodyguard Sandor "The Hound" Clegane hunt down and kill Mycah.\nLater, Eddard Stark discovers that Joffrey is not King Robert\'s biological son and re', 'offsets_in_document': [{'start': 981, 'end': 987}], 'offsets_in_context': [{'start': 72, 'end': 78}], 'document_ids': ['58e878049c1864c83325dd2a13ab37ee'], 'meta': {'name': '37_Joffrey_Baratheon.txt'}}>,
<Answer {'answer': 'Eddard "Ned" Stark agrees to make a false confession and swear loyalty to King Joffrey Baratheon.\nArya Stark finds a crowd gathering to watch her father be judged', 'type': 'extractive', 'score': 0.06236845254898071, 'context': 'Eddard "Ned" Stark agrees to make a false confession and swear loyalty to King Joffrey Baratheon.\nArya Stark finds a crowd gathering to watch her father be judged', 'offsets_in_document': [{'start': 89, 'end': 251}], 'offsets_in_context': [{'start': 0, 'end': 162}], 'document_ids': ['956aa2b653c6debcb6cb217531a6be58'], 'meta': {'name': '450_Baelor.txt'}}>,
<Answer {'answer': 'Robb', 'type': 'extractive', 'score': 0.058649975806474686, 'context': 'allow the army to cross the river and to commit his troops in return for Robb and Arya Stark marrying two of his children.\nTyrion Lannister suspects h', 'offsets_in_document': [{'start': 193, 'end': 197}], 'offsets_in_context': [{'start': 73, 'end': 77}], 'document_ids': ['6b181174d1237878b706e6a12d69e92'], 'meta': {'name': '450_Baelor.txt'}}>]
The visualized graph makes the branching easy to see.
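The branching can also be mimicked in a few lines of plain Python. This is only a sketch of the idea: classify applies the same "?" rule as CustomQueryClassifier.run, while keyword_branch, dense_branch, ROUTES, and route are hypothetical stand-ins of mine for the two retriever branches and the pipeline's edge lookup.

```python
from typing import Callable, Dict


def classify(query: str) -> str:
    """Same rule as CustomQueryClassifier.run: a '?' selects output_2."""
    return "output_2" if "?" in query else "output_1"


# Hypothetical stand-ins for the two branches of the pipeline.
def keyword_branch(query: str) -> str:
    return f"BM25Retriever <- {query}"


def dense_branch(query: str) -> str:
    return f"EmbeddingRetriever <- {query}"


# The edge name returned by the classifier decides which branch runs.
ROUTES: Dict[str, Callable[[str], str]] = {
    "output_1": keyword_branch,
    "output_2": dense_branch,
}


def route(query: str) -> str:
    return ROUTES[classify(query)](query)


print(route("Who is the father of Arya Stark?"))  # EmbeddingRetriever <- Who is the father of Arya Stark?
print(route("Arya Stark father"))                 # BM25Retriever <- Arya Stark father
```

Only one branch runs per query, which matches the two separate result listings above.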
Evaluation nodes
There are also nodes for evaluation. I will cover those separately.
Debugging pipelines
There are several ways to get debug output.
- Enable the debug attribute on a node
bm25_retriever.debug = True
- Pass the debug parameter when running the pipeline
Debug can be enabled either for the whole pipeline or for specific nodes only.
To enable it for the whole pipeline, pass it directly in params.
result = p_classifier.run(query="Who is the father of Arya Stark?", params={"debug": True})
To enable it for a specific node only, put it under that node's name in params.
result = p_classifier.run(query="Who is the father of Arya Stark?", params={"BM25Retriever": {"debug": True}})
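The difference between the two forms comes down to how the params dictionary is keyed. The sketch below is a simplified model of that idea, not Haystack's code: the function params_for_node is hypothetical, and the rule that node-specific settings take precedence over pipeline-wide ones is my assumption. Real Haystack also validates the keys, which is omitted here.

```python
from typing import Any, Dict, Set


def params_for_node(params: Dict[str, Any], node_name: str, node_names: Set[str]) -> Dict[str, Any]:
    """Return the settings one node would see (simplified model)."""
    # Keys that are not node names are treated as pipeline-wide settings.
    resolved = {k: v for k, v in params.items() if k not in node_names}
    # A key equal to this node's name holds settings targeted at this node only.
    # Assumption: targeted settings override pipeline-wide ones.
    resolved.update(params.get(node_name, {}))
    return resolved


NODES = {"QueryClassifier", "BM25Retriever", "EmbeddingRetriever", "QAReader"}

# Pipeline-wide: every node sees debug=True.
print(params_for_node({"debug": True}, "QAReader", NODES))  # {'debug': True}
# Node-specific: only BM25Retriever sees it.
print(params_for_node({"BM25Retriever": {"debug": True}}, "BM25Retriever", NODES))  # {'debug': True}
print(params_for_node({"BM25Retriever": {"debug": True}}, "QAReader", NODES))  # {}
```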
The debug results are written under the _debug key of the pipeline's result.
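Since the debug data is keyed by node name, it is easy to pull out just the part you care about. As a minimal sketch, the helper below collects the exec_time_ms value of each node. The function name exec_times is mine, and sample_result is a hand-written dictionary shaped like the debug output of this pipeline, with the inputs and outputs left empty for brevity.

```python
from typing import Any, Dict


def exec_times(result: Dict[str, Any]) -> Dict[str, float]:
    """Collect per-node execution times (ms) from a result's _debug entry."""
    debug = result.get("_debug", {})
    return {
        name: info["exec_time_ms"]
        for name, info in debug.items()
        if "exec_time_ms" in info
    }


# Hand-written sample in the shape of a debug-enabled result.
sample_result = {
    "_debug": {
        "EmbeddingRetriever": {"exec_time_ms": 199.99, "input": {}, "output": {}},
        "QAReader": {"exec_time_ms": 100.68, "input": {}, "output": {}},
        "QueryClassifier": {"exec_time_ms": 0.08, "input": {}, "output": {}},
    }
}

# Print the slowest node first.
for name, ms in sorted(exec_times(sample_result).items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ms} ms")
```

A result produced without debug enabled has no _debug key, in which case the helper returns an empty dictionary.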
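Here is what the data actually looks like. Because there is a lot of it, I also added options to limit each node to its top 3 results, and enabled debug for the whole pipeline.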
from pprint import pprint
res_1 = p_classifier.run(query="Who is the father of Arya Stark?", params={"debug": True, "BM25Retriever": {"top_k": 3}, "EmbeddingRetriever": {"top_k": 3}, "QAReader": {"top_k": 3}})
print("\nEmbedding Retriever Results" + "\n" + "=" * 15)
pprint(res_1)
{'_debug': {'EmbeddingRetriever': {'exec_time_ms': 199.99,
'input': {'debug': True,
'query': 'Who is the father of '
'Arya Stark?',
'root_node': 'Query',
'top_k': 3},
'output': {'documents': [<Document: {'content': '\n=== Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series. She has five siblings: an older brother Robb, an older sister Sansa, two younger brothers Bran and Rickon, and an older illegitimate half-brother, Jon Snow.', 'content_type': 'text', 'score': 0.560986304295376, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'd7a98cb66f592540fa7de20bf46a5e64'}>,
<Document: {'content': "\n===Arya Stark===\n'''Arya Stark''' portrayed by Maisie Williams. Arya Stark of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be training to use weapons than sewing with a needle. She names her direwolf Nymeria, after a legendary warrior queen.", 'content_type': 'text', 'score': 0.5607921031690192, 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '623e446a1b048d81130a29e262152cd5'}>,
<Document: {'content': "'''Arya Stark''' is a fictional character in American author George R. R. Martin's ''A Song of Ice and Fire'' epic fantasy novel series. She is a prominent point of view character in the novels with the third most viewpoint chapters, and is the only viewpoint character to have appeared in every published book of the series.\nIntroduced in 1996's ''A Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark. She is tomboyish, headstrong, feisty, independent, disdains traditional female pursuits, and is often mistaken for a boy. She wields a smallsword named Needle, a gift from her half-brother, Jon Snow, and is trained in the Braavosi style of sword fighting by Syrio Forel.\nArya is portrayed by English actress Maisie Williams in HBO's Emmy-winning television adaptation of the novel series, ''Game of Thrones''. Her performance has garnered critical acclaim, particularly in the second season for her work opposite veteran actor Charles Dance (Tywin Lannister) when she served as his cupbearer. She is among the most popular characters in either version of the story. Williams was nominated for a Primetime Emmy Award for Outstanding Supporting Actress in a Drama Series for the role in 2016. She and the rest of the cast were nominated for Screen Actors Guild Awards for Outstanding Performance by an Ensemble in a Drama Series in 2011, 2013, 2014, 2015, 2016 and 2017.", 'content_type': 'text', 'score': 0.5599752218462483, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '2ee56bdd46dfd30b23f91bcc046456a4'}>]}},
'QAReader': {'exec_time_ms': 100.68,
'input': {'debug': True,
'documents': [<Document: {'content': '\n=== Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series. She has five siblings: an older brother Robb, an older sister Sansa, two younger brothers Bran and Rickon, and an older illegitimate half-brother, Jon Snow.', 'content_type': 'text', 'score': 0.560986304295376, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'd7a98cb66f592540fa7de20bf46a5e64'}>,
<Document: {'content': "\n===Arya Stark===\n'''Arya Stark''' portrayed by Maisie Williams. Arya Stark of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be training to use weapons than sewing with a needle. She names her direwolf Nymeria, after a legendary warrior queen.", 'content_type': 'text', 'score': 0.5607921031690192, 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '623e446a1b048d81130a29e262152cd5'}>,
<Document: {'content': "'''Arya Stark''' is a fictional character in American author George R. R. Martin's ''A Song of Ice and Fire'' epic fantasy novel series. She is a prominent point of view character in the novels with the third most viewpoint chapters, and is the only viewpoint character to have appeared in every published book of the series.\nIntroduced in 1996's ''A Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark. She is tomboyish, headstrong, feisty, independent, disdains traditional female pursuits, and is often mistaken for a boy. She wields a smallsword named Needle, a gift from her half-brother, Jon Snow, and is trained in the Braavosi style of sword fighting by Syrio Forel.\nArya is portrayed by English actress Maisie Williams in HBO's Emmy-winning television adaptation of the novel series, ''Game of Thrones''. Her performance has garnered critical acclaim, particularly in the second season for her work opposite veteran actor Charles Dance (Tywin Lannister) when she served as his cupbearer. She is among the most popular characters in either version of the story. Williams was nominated for a Primetime Emmy Award for Outstanding Supporting Actress in a Drama Series for the role in 2016. She and the rest of the cast were nominated for Screen Actors Guild Awards for Outstanding Performance by an Ensemble in a Drama Series in 2011, 2013, 2014, 2015, 2016 and 2017.", 'content_type': 'text', 'score': 0.5599752218462483, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '2ee56bdd46dfd30b23f91bcc046456a4'}>],
'query': 'Who is the father of Arya Stark?',
'top_k': 3},
'output': {'answers': [<Answer {'answer': 'Lord Eddard Stark', 'type': 'extractive', 'score': 0.8793496489524841, 'context': "Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark. She is tomboyish, headstrong, f", 'offsets_in_document': [{'start': 419, 'end': 436}], 'offsets_in_context': [{'start': 67, 'end': 84}], 'document_ids': ['2ee56bdd46dfd30b23f91bcc046456a4'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.8221828937530518, 'context': 'Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series. Sh', 'offsets_in_document': [{'start': 68, 'end': 92}], 'offsets_in_context': [{'start': 63, 'end': 87}], 'document_ids': ['d7a98cb66f592540fa7de20bf46a5e64'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
<Answer {'answer': 'Lord Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.6865440607070923, 'context': 'rk of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be traini', 'offsets_in_document': [{'start': 134, 'end': 163}], 'offsets_in_context': [{'start': 61, 'end': 90}], 'document_ids': ['623e446a1b048d81130a29e262152cd5'], 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}}>],
'no_ans_gap': 8.355133056640625,
'query': 'Who is the father of Arya '
'Stark?'}},
'Query': {'exec_time_ms': 0.07,
'input': {'debug': True},
'output': {}},
'QueryClassifier': {'exec_time_ms': 0.08,
'input': {'debug': True,
'query': 'Who is the father of Arya '
'Stark?'},
'output': {}}},