
Trying the Haystack tutorial: How to Use Pipelines

kun432kun432

I'll work through this in Colaboratory.

A GPU needs to be enabled, so select "T4 GPU" under "Notebook settings".

Install the packages.

%%bash

pip install --upgrade pip
pip install farm-haystack[colab,inference]

Enable telemetry.

from haystack.telemetry import tutorial_running

tutorial_running(11)

Configure logging.

import logging

logging.basicConfig(format="%(levelname)s - %(name)s -  %(message)s", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.INFO)

Prepare the documents and nodes, all in one go this time.

Write the documents to the DocumentStore and initialize BM25Retriever, EmbeddingRetriever, and FARMReader. The embeddings are also computed here, using the EmbeddingRetriever.

from haystack.document_stores import InMemoryDocumentStore
from haystack.utils import fetch_archive_from_http, convert_files_to_docs, clean_wiki_text
from haystack.nodes import BM25Retriever, EmbeddingRetriever, FARMReader

document_store = InMemoryDocumentStore(use_bm25=True)

doc_dir = "data/tutorial11"
s3_url = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt11.zip"
fetch_archive_from_http(url=s3_url, output_dir=doc_dir)

got_docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split_paragraphs=True)
document_store.delete_documents()
document_store.write_documents(got_docs)

bm25_retriever = BM25Retriever(document_store=document_store)

embedding_retriever = EmbeddingRetriever(
    document_store=document_store, embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1"
)
document_store.update_embeddings(embedding_retriever, update_existing_embeddings=False)

reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

Built-in pipelines

Haystack ships with ready-made pipelines, so let's try those first.

Use ExtractiveQAPipeline to get answers via extractive QA.

from haystack.pipelines import ExtractiveQAPipeline
from haystack.utils import print_answers

p_extractive_premade = ExtractiveQAPipeline(reader=reader, retriever=bm25_retriever)
res = p_extractive_premade.run(
    query="Who is the father of Arya Stark?", params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
)
print_answers(res, details="minimum")

Result

'Answers:'
[   {   'answer': 'Eddard',
        'context': 's Nymeria after a legendary warrior queen. She travels '
                   "with her father, Eddard, to King's Landing when he is made "
                   'Hand of the King. Before she leaves,'},
    {   'answer': 'Ned',
        'context': '\n'
                   '====Season 1====\n'
                   'Arya accompanies her father Ned and her sister Sansa to '
                   "King's Landing. Before their departure, Arya's "
                   'half-brother Jon Snow gifts A'},
    {   'answer': 'Lord Eddard Stark',
        'context': 'ark daughters.\n'
                   'During the Tourney of the Hand to honour her father Lord '
                   'Eddard Stark, Sansa Stark is enchanted by the knights '
                   'performing in the event.'},
    {   'answer': 'Eddard Stark',
        'context': "arrested by Catelyn Stark. He then races to King's Landing "
                   "to inform Eddard Stark. During Lord Eddard's execution, he "
                   'finds Arya Stark and shields her'},
    {   'answer': 'Joffrey',
        'context': 'laying with one of his wooden toys.\n'
                   "After Eddard discovers the truth of Joffrey's paternity, "
                   'he tells Sansa that they will be heading back to Winterfe'}]

DocumentSearchPipeline runs only the retrieval step.

from haystack.pipelines import DocumentSearchPipeline
from haystack.utils import print_documents

p_retrieval = DocumentSearchPipeline(embedding_retriever)
res = p_retrieval.run(query="Who is the father of Arya Stark?", params={"Retriever": {"top_k": 10}})
print_documents(res, max_text_len=200)

Result

Query: Who is the father of Arya Stark?

{   'content': '\n'
               '=== Background ===\n'
               'Arya is the third child and younger daughter of Eddard and '
               'Catelyn Stark and is nine years old at the beginning of the '
               'book series.  She has five siblings: an older brother Robb, '
               'a...',
    'name': '43_Arya_Stark.txt'}

{   'content': '\n'
               '===Arya Stark===\n'
               "'''Arya Stark''' portrayed by Maisie Williams. Arya Stark of "
               'House Stark is the younger daughter and third child of Lord '
               'Eddard and Catelyn Stark of Winterfell. Ever the tomboy, '
               'Arya...',
    'name': '349_List_of_Game_of_Thrones_characters.txt'}

{   'content': "'''Arya Stark''' is a fictional character in American author "
               "George R. R. Martin's ''A Song of Ice and Fire'' epic fantasy "
               'novel series.  She is a prominent point of view character in '
               'the novels with ...',
    'name': '43_Arya_Stark.txt'}

{   'content': '\n'
               '=== Arya Stark ===\n'
               'Arya Stark is the third child and younger daughter of Eddard '
               'and Catelyn Stark. She serves as a POV character for 33 '
               "chapters throughout ''A Game of Thrones'', ''A Clash of "
               "Kings''...",
    'name': '30_List_of_A_Song_of_Ice_and_Fire_characters.txt'}

{   'content': '\n'
               '== Character description ==\n'
               "Robb is 14 years old at the beginning of ''A Game of Thrones'' "
               '(1996). He is the oldest legitimate son of Eddard "Ned" Stark '
               'and his wife Catelyn, and has five siblings: S...',
    'name': '208_Robb_Stark.txt'}

Apparently the built-in pipelines also include the following.

  • Document search (DocumentSearchPipeline)
  • Document search with summarization (SearchSummarizationPipeline)
  • FAQ-style QA (FAQPipeline)
  • Translated search (TranslationWrapperPipeline)

Visualizing pipelines

Haystack supports GraphViz, so pipelines can be visualized.

Install the GraphViz library and the Python package.

!apt install libgraphviz-dev
!pip install pygraphviz

Calling a pipeline object's draw method with an image path writes the diagram to that file. In Colaboratory, IPython.display can show it inline.

For ExtractiveQAPipeline:

from IPython.display import Image,display

pipeline_extractive_premade_image = "pipeline_extractive_premade.png"

p_extractive_premade.draw(pipeline_extractive_premade_image)
display(Image(pipeline_extractive_premade_image))

For DocumentSearchPipeline:

from IPython.display import Image,display

pipeline_retrieval_image = "pipeline_retrieval.png"
p_retrieval.draw(pipeline_retrieval_image)
display(Image(pipeline_retrieval_image))


Custom pipelines

Instead of using a built-in pipeline, you can build your own. Here I rebuild the extractive QA pipeline by hand: initialize a Pipeline and add each node to it with the add_node method.

from haystack.pipelines import Pipeline
from IPython.display import Image,display

p_extractive = Pipeline()
p_extractive.add_node(component=bm25_retriever, name="Retriever", inputs=["Query"])
p_extractive.add_node(component=reader, name="Reader", inputs=["Retriever"])

res = p_extractive.run(query="Who is the father of Arya Stark?", params={"Retriever": {"top_k": 10}})
print_answers(res, details="minimum")

pipeline_extractive_image = "pipeline_extractive.png"
p_extractive.draw(pipeline_extractive_image)
display(Image(pipeline_extractive_image))

Result

'Answers:'
[   {   'answer': 'Eddard',
        'context': 's Nymeria after a legendary warrior queen. She travels '
                   "with her father, Eddard, to King's Landing when he is made "
                   'Hand of the King. Before she leaves,'},
    {   'answer': 'Ned',
        'context': '\n'
                   '====Season 1====\n'
                   'Arya accompanies her father Ned and her sister Sansa to '
                   "King's Landing. Before their departure, Arya's "
                   'half-brother Jon Snow gifts A'},
    {   'answer': 'Lord Eddard Stark',
        'context': 'ark daughters.\n'
                   'During the Tourney of the Hand to honour her father Lord '
                   'Eddard Stark, Sansa Stark is enchanted by the knights '
                   'performing in the event.'},
    {   'answer': 'Eddard Stark',
        'context': "arrested by Catelyn Stark. He then races to King's Landing "
                   "to inform Eddard Stark. During Lord Eddard's execution, he "
                   'finds Arya Stark and shields her'},
    {   'answer': 'Joffrey',
        'context': 'laying with one of his wooden toys.\n'
                   "After Eddard discovers the truth of Joffrey's paternity, "
                   'he tells Sansa that they will be heading back to Winterfe'},
    {   'answer': 'Eddard "Ned" Stark',
        'context': '=\n'
                   "After Varys tells him that Sansa Stark's life is also at "
                   'stake, Eddard "Ned" Stark agrees to make a false '
                   'confession and swear loyalty to King Joffr'},
    {   'answer': 'Eddard Stark',
        'context': 'e from House Tully in the Riverlands region prior to her '
                   'marriage to Eddard Stark. She has her hair dyed dark brown '
                   'later on while in the Vale, disgui'},
    {   'answer': 'Eddard Stark and Catelyn Stark',
        'context': 'ces==\n'
                   'Sansa Stark is the second child and elder daughter of '
                   'Eddard Stark and Catelyn Stark. She was born and raised in '
                   'Winterfell, until leaving with '},
    {   'answer': 'Robb',
        'context': 'ost, Walder was able to negotiate marriage contracts for '
                   'his children to Robb and Arya Stark. But during Season 2 '
                   'Robb broke his word and married Lady'},
    {   'answer': 'Ned Stark',
        'context': ' to reveal her true identity, and is surprised to learn '
                   "she is in fact Ned Stark's daughter. After the Goldcloaks "
                   'get help from Ser Amory Lorch and hi'}]

The visualized pipeline (image output of the draw call above).
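To make the add_node and params mechanics concrete, here is a minimal pure-Python sketch that does not use Haystack at all. MiniPipeline, toy_retriever, toy_reader and the three toy documents are hypothetical names invented for illustration, and the real Pipeline builds a graph and does far more than this. The sketch only shows the idea that nodes are chained by name and that params is keyed by node name.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical toy corpus, not the tutorial's dataset.
DOCS = [
    "Arya is the daughter of Eddard Stark",
    "Robb Stark is the eldest son",
    "Theon is the son of Balon Greyjoy",
]


def tokens(text: str) -> set:
    return set(text.lower().replace("?", "").split())


def toy_retriever(query: str, top_k: int = 10) -> Dict[str, Any]:
    # Rank documents by word overlap with the query (a crude stand-in for BM25).
    ranked = sorted(DOCS, key=lambda d: len(tokens(d) & tokens(query)), reverse=True)
    return {"documents": ranked[:top_k]}


def toy_reader(query: str, documents: List[str], top_k: int = 5) -> Dict[str, Any]:
    # "Extract" whatever follows " of " in each document.
    answers = [d.split(" of ", 1)[1] for d in documents if " of " in d]
    return {"answers": answers[:top_k]}


class MiniPipeline:
    """Toy linear pipeline imitating add_node() and per-node params (an assumption-laden sketch)."""

    def __init__(self) -> None:
        self.nodes: List[Tuple[str, Callable[..., Dict[str, Any]]]] = []

    def add_node(self, component: Callable[..., Dict[str, Any]], name: str, inputs: List[str]) -> None:
        # Inputs must refer to the root "Query" or to a node added earlier.
        known = {"Query"} | {n for n, _ in self.nodes}
        missing = [i for i in inputs if i not in known]
        if missing:
            raise ValueError(f"unknown inputs for {name}: {missing}")
        self.nodes.append((name, component))

    def run(self, query: str, params: Optional[Dict[str, Dict[str, Any]]] = None) -> Dict[str, Any]:
        params = params or {}
        state: Dict[str, Any] = {"query": query}
        for name, component in self.nodes:
            # Each node receives the accumulated state plus only its own params.
            state.update(component(**state, **params.get(name, {})))
        return state


p = MiniPipeline()
p.add_node(component=toy_retriever, name="Retriever", inputs=["Query"])
p.add_node(component=toy_reader, name="Reader", inputs=["Retriever"])
result = p.run(
    query="Who is the father of Arya Stark?",
    params={"Retriever": {"top_k": 2}, "Reader": {"top_k": 1}},
)
print(result["answers"])
```

Because params is looked up by node name, renaming a node in add_node means the matching key in params has to change too, which is the same thing to watch for in the real pipelines above.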


Combining multiple Retrievers in a custom pipeline

You can also combine EmbeddingRetriever and BM25Retriever to build a pipeline that covers both sparse and dense vector retrieval. This uses a node called JoinDocuments.

from haystack.nodes import JoinDocuments

p_ensemble = Pipeline()
p_ensemble.add_node(component=bm25_retriever, name="BM25Retriever", inputs=["Query"])
p_ensemble.add_node(component=embedding_retriever, name="EmbeddingRetriever", inputs=["Query"])
p_ensemble.add_node(
    component=JoinDocuments(join_mode="concatenate"), name="JoinResults", inputs=["BM25Retriever", "EmbeddingRetriever"]
)
p_ensemble.add_node(component=reader, name="Reader", inputs=["JoinResults"])

res = p_ensemble.run(
    query="Who is the father of Arya Stark?", params={"EmbeddingRetriever": {"top_k": 5}, "BM25Retriever": {"top_k": 5}}
)
print_answers(res, details="minimum")

pipeline_ensemble_image = "pipeline_ensemble.png"
p_ensemble.draw(pipeline_ensemble_image)
display(Image(pipeline_ensemble_image))

Result

'Answers:'
[   {   'answer': 'Eddard',
        'context': 's Nymeria after a legendary warrior queen. She travels '
                   "with her father, Eddard, to King's Landing when he is made "
                   'Hand of the King. Before she leaves,'},
    {   'answer': 'Eddard and Catelyn Stark',
        'context': 'tark ===\n'
                   'Arya Stark is the third child and younger daughter of '
                   'Eddard and Catelyn Stark. She serves as a POV character '
                   "for 33 chapters throughout ''A "},
    {   'answer': 'Lord Eddard Stark',
        'context': 'ark daughters.\n'
                   'During the Tourney of the Hand to honour her father Lord '
                   'Eddard Stark, Sansa Stark is enchanted by the knights '
                   'performing in the event.'},
    {   'answer': 'Lord Eddard Stark',
        'context': "Game of Thrones'', Arya is the third child and younger "
                   'daughter of Lord Eddard Stark and his wife Lady Catelyn '
                   'Stark.  She is tomboyish, headstrong, f'},
    {   'answer': 'Eddard and Catelyn Stark',
        'context': 'Background ===\n'
                   'Arya is the third child and younger daughter of Eddard and '
                   'Catelyn Stark and is nine years old at the beginning of '
                   'the book series.  Sh'},
    {   'answer': 'Lord Eddard and Catelyn Stark',
        'context': 'rk of House Stark is the younger daughter and third child '
                   'of Lord Eddard and Catelyn Stark of Winterfell. Ever the '
                   'tomboy, Arya would rather be traini'},
    {   'answer': 'Eddard Stark',
        'context': "arrested by Catelyn Stark. He then races to King's Landing "
                   "to inform Eddard Stark. During Lord Eddard's execution, he "
                   'finds Arya Stark and shields her'},
    {   'answer': 'Joffrey',
        'context': 'laying with one of his wooden toys.\n'
                   "After Eddard discovers the truth of Joffrey's paternity, "
                   'he tells Sansa that they will be heading back to Winterfe'},
    {   'answer': 'Jon Snow',
        'context': '.  She wields a smallsword named Needle, a gift from her '
                   'half-brother, Jon Snow, and is trained in the Braavosi '
                   'style of sword fighting by Syrio Forel'},
    {   'answer': 'Ned Stark',
        'context': ' to reveal her true identity, and is surprised to learn '
                   "she is in fact Ned Stark's daughter. After the Goldcloaks "
                   'get help from Ser Amory Lorch and hi'}]

The visualized pipeline (image output of the draw call above).
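As a rough picture of what a "concatenate" style join does, here is a small pure-Python sketch. It is not Haystack's implementation: ScoredDoc and concatenate_join are hypothetical names, and the rule used here (collapse duplicates by document ID and keep the higher score) is an assumption made for illustration.

```python
from typing import Dict, List, NamedTuple


class ScoredDoc(NamedTuple):
    doc_id: str
    score: float


def concatenate_join(*result_lists: List[ScoredDoc]) -> List[ScoredDoc]:
    """Merge several ranked lists, keeping one entry per document ID."""
    best: Dict[str, ScoredDoc] = {}
    for results in result_lists:
        for doc in results:
            current = best.get(doc.doc_id)
            # Assumption: when a document appears twice, keep the higher score.
            if current is None or doc.score > current.score:
                best[doc.doc_id] = doc
    return sorted(best.values(), key=lambda d: d.score, reverse=True)


# Made-up scores: document "b" is found by both retrievers.
bm25_hits = [ScoredDoc("a", 0.9), ScoredDoc("b", 0.4)]
embedding_hits = [ScoredDoc("b", 0.7), ScoredDoc("c", 0.5)]
merged = concatenate_join(bm25_hits, embedding_hits)
print([d.doc_id for d in merged])
```

One caveat with any such merge is that scores from a sparse and a dense retriever are not necessarily on the same scale, so the merged order should be read with some care.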


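Custom nodes

Once you want custom pipelines, you naturally want custom nodes as well. Haystack supports this too.

The requirements for a custom node are as follows.

  • Create a class that inherits from BaseComponent.
  • Add a run() method to the class, with whatever required and optional arguments the processing needs. These arguments must be supplied as input to the pipeline, inside params, or as the output of a preceding node.
  • Add the processing logic inside run() (for example, reformatting the query).
  • Return a tuple containing the node's output data and the name of the outgoing edge (by default "output_1" for a node with a single output).
  • Add the class attribute outgoing_edges = 1, which defines the number of output options from the node. A larger number is needed only for decision nodes (described later).

The template for a custom node is below.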

from haystack import BaseComponent
from typing import Optional, List


class CustomNode(BaseComponent):
    outgoing_edges = 1

    def run(self, query: str, my_optional_param: Optional[int]):
        # process the inputs
        output = {"my_output": ...}
        return output, "output_1"

    def run_batch(self, queries: List[str], my_optional_param: Optional[int]):
        # process the inputs
        output = {"my_output": ...}
        return output, "output_1"
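As a filled-in version of the template, here is a sketch of a node that reformats the query. To keep it runnable without Haystack installed, it deliberately does not import BaseComponent; in a real pipeline the class would inherit from BaseComponent as in the template above. QueryNormalizer and its max_length parameter are hypothetical names made up for this example.

```python
from typing import Dict, List, Optional, Tuple


class QueryNormalizer:
    """Stand-in for a custom node; with Haystack it would subclass BaseComponent."""

    outgoing_edges = 1  # a single output edge, so always "output_1"

    def run(self, query: str, max_length: Optional[int] = None) -> Tuple[Dict[str, str], str]:
        # Collapse runs of whitespace and trim both ends.
        cleaned = " ".join(query.split())
        if max_length is not None:
            cleaned = cleaned[:max_length]
        # Return (output data, name of the outgoing edge).
        return {"query": cleaned}, "output_1"

    def run_batch(
        self, queries: List[str], max_length: Optional[int] = None
    ) -> Tuple[Dict[str, List[str]], str]:
        cleaned = [self.run(q, max_length)[0]["query"] for q in queries]
        return {"queries": cleaned}, "output_1"


output, edge = QueryNormalizer().run("  Who is   the father of Arya Stark?  ")
print(output, edge)
```

The output dict reuses the key "query" so that a following node taking a query argument would receive the cleaned text.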

Decision nodes

A decision node is a node that routes data. It's quicker to look at a real example.

from haystack import BaseComponent
from typing import Optional, List


class CustomQueryClassifier(BaseComponent):
    outgoing_edges = 2

    def run(self, query: str):
        if "?" in query:
            return {}, "output_2"
        else:
            return {}, "output_1"

    def run_batch(self, queries: List[str]):
        split = {"output_1": {"queries": []}, "output_2": {"queries": []}}
        for query in queries:
            if "?" in query:
                split["output_2"]["queries"].append(query)
            else:
                split["output_1"]["queries"].append(query)

        return split, "split"


p_classifier = Pipeline()
p_classifier.add_node(component=CustomQueryClassifier(), name="QueryClassifier", inputs=["Query"])
p_classifier.add_node(component=bm25_retriever, name="BM25Retriever", inputs=["QueryClassifier.output_1"])
p_classifier.add_node(component=embedding_retriever, name="EmbeddingRetriever", inputs=["QueryClassifier.output_2"])
p_classifier.add_node(component=reader, name="QAReader", inputs=["BM25Retriever", "EmbeddingRetriever"])

res_1 = p_classifier.run(query="Who is the father of Arya Stark?")
print("\nEmbedding Retriever Results" + "\n" + "=" * 15)
print_answers(res_1)

print()

res_2 = p_classifier.run(query="Arya Stark father")
print("\nBM25Retriever Results" + "\n" + "=" * 15)
print_answers(res_2)

pipeline_classifier_image = "pipeline_classifier.png"
p_classifier.draw(pipeline_classifier_image)
display(Image(pipeline_classifier_image))

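This defines a custom node called CustomQueryClassifier. Depending on whether the query contains "?", it returns only one of its two output edges. Built into the pipeline, it acts as a branch: queries containing "?" go to EmbeddingRetriever, and queries without it go to BM25Retriever.

The result looks like this.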

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.92 Batches/s]
Embedding Retriever Results
===============
'Query: Who is the father of Arya Stark?'
'Answers:'
[   <Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.9058835506439209, 'context': "tark ===\nArya Stark is the third child and younger daughter of Eddard and Catelyn Stark. She serves as a POV character for 33 chapters throughout ''A ", 'offsets_in_document': [{'start': 74, 'end': 98}], 'offsets_in_context': [{'start': 63, 'end': 87}], 'document_ids': ['965789c741c68963042e85b6e7b89757'], 'meta': {'name': '30_List_of_A_Song_of_Ice_and_Fire_characters.txt'}}>,
    <Answer {'answer': 'Lord Eddard Stark', 'type': 'extractive', 'score': 0.8793494701385498, 'context': "Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark.  She is tomboyish, headstrong, f", 'offsets_in_document': [{'start': 419, 'end': 436}], 'offsets_in_context': [{'start': 67, 'end': 84}], 'document_ids': ['2ee56bdd46dfd30b23f91bcc046456a4'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
    <Answer {'answer': 'Robb', 'type': 'extractive', 'score': 0.8409849405288696, 'context': "f the Vale, restoring Stark rule in the North in the process. Meanwhile, Robb's youngest sister Arya Stark returns to Westeros, murders Walder Frey, a", 'offsets_in_document': [{'start': 3268, 'end': 3272}], 'offsets_in_context': [{'start': 73, 'end': 77}], 'document_ids': ['d7b414a49fd49fac14f535bef7169987'], 'meta': {'name': '208_Robb_Stark.txt'}}>,
    <Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.8221831321716309, 'context': 'Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series.  Sh', 'offsets_in_document': [{'start': 68, 'end': 92}], 'offsets_in_context': [{'start': 63, 'end': 87}], 'document_ids': ['d7a98cb66f592540fa7de20bf46a5e64'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
    <Answer {'answer': 'Jon Snow', 'type': 'extractive', 'score': 0.8216632604598999, 'context': ' of Winterfell.  She is particularly close to her bastard half-brother Jon Snow, who encourages her to learn how to fight and gives her the smallsword', 'offsets_in_document': [{'start': 656, 'end': 664}], 'offsets_in_context': [{'start': 71, 'end': 79}], 'document_ids': ['a64bb94eab347d5cc10686c16b52a4dd'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
    <Answer {'answer': 'Balon Greyjoy', 'type': 'extractive', 'score': 0.7154540419578552, 'context': "He sends Theon to the Iron Islands hoping to broker an alliance with Balon Greyjoy, Theon's father. In exchange for Greyjoy support, Robb as the King ", 'offsets_in_document': [{'start': 983, 'end': 996}], 'offsets_in_context': [{'start': 69, 'end': 82}], 'document_ids': ['d7b414a49fd49fac14f535bef7169987'], 'meta': {'name': '208_Robb_Stark.txt'}}>,
    <Answer {'answer': 'Lord Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.6865442991256714, 'context': 'rk of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be traini', 'offsets_in_document': [{'start': 134, 'end': 163}], 'offsets_in_context': [{'start': 61, 'end': 90}], 'document_ids': ['623e446a1b048d81130a29e262152cd5'], 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}}>,
    <Answer {'answer': 'Jon Snow', 'type': 'extractive', 'score': 0.5814977884292603, 'context': '.  She wields a smallsword named Needle, a gift from her half-brother, Jon Snow, and is trained in the Braavosi style of sword fighting by Syrio Forel', 'offsets_in_document': [{'start': 662, 'end': 670}], 'offsets_in_context': [{'start': 71, 'end': 79}], 'document_ids': ['2ee56bdd46dfd30b23f91bcc046456a4'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
    <Answer {'answer': 'Walder Frey', 'type': 'extractive', 'score': 0.5672119855880737, 'context': "while, Robb's youngest sister Arya Stark returns to Westeros, murders Walder Frey, and later uses his face to disguise herself as Frey, to poison all ", 'offsets_in_document': [{'start': 3331, 'end': 3342}], 'offsets_in_context': [{'start': 70, 'end': 81}], 'document_ids': ['d7b414a49fd49fac14f535bef7169987'], 'meta': {'name': '208_Robb_Stark.txt'}}>,
    <Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.5108734369277954, 'context': '\n=== Robb Stark ===\nRobb Stark is the oldest child of Eddard and Catelyn Stark, and the heir to Winterfell. He is not a POV character, but features in', 'offsets_in_document': [{'start': 54, 'end': 78}], 'offsets_in_context': [{'start': 54, 'end': 78}], 'document_ids': ['a9aef0a50f4ae283230cd2de274ffaf2'], 'meta': {'name': '30_List_of_A_Song_of_Ice_and_Fire_characters.txt'}}>]

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  4.52 Batches/s]
BM25Retriever Results
===============
'Query: Arya Stark father'
'Answers:'
[   <Answer {'answer': 'Eddard', 'type': 'extractive', 'score': 0.9085890054702759, 'context': "s Nymeria after a legendary warrior queen. She travels with her father, Eddard, to King's Landing when he is made Hand of the King. Before she leaves,", 'offsets_in_document': [{'start': 147, 'end': 153}], 'offsets_in_context': [{'start': 72, 'end': 78}], 'document_ids': ['ba2a8e87ddd95e380bec55983ee7d55f'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
    <Answer {'answer': 'Ned', 'type': 'extractive', 'score': 0.7877868413925171, 'context': "\n====Season 1====\nArya accompanies her father Ned and her sister Sansa to King's Landing. Before their departure, Arya's half-brother Jon Snow gifts A", 'offsets_in_document': [{'start': 46, 'end': 49}], 'offsets_in_context': [{'start': 46, 'end': 49}], 'document_ids': ['180c2a6b36369712b361a80842e79356'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
    <Answer {'answer': 'Lord Eddard Stark', 'type': 'extractive', 'score': 0.7179794311523438, 'context': 'ark daughters.\nDuring the Tourney of the Hand to honour her father Lord Eddard Stark, Sansa Stark is enchanted by the knights performing in the event.', 'offsets_in_document': [{'start': 659, 'end': 676}], 'offsets_in_context': [{'start': 67, 'end': 84}], 'document_ids': ['d1f36ec7170e4c46cde65787fe125dfe'], 'meta': {'name': '332_Sansa_Stark.txt'}}>,
    <Answer {'answer': 'Balon', 'type': 'extractive', 'score': 0.40754902362823486, 'context': "sgusted, Robb acquiesces to Theon's further captivity, as Theon's father Balon has recently died and Theon's absence presents a succession crisis for ", 'offsets_in_document': [{'start': 274, 'end': 279}], 'offsets_in_context': [{'start': 73, 'end': 78}], 'document_ids': ['fc56eb160221cbdc74d223383680dbeb'], 'meta': {'name': '487_Ramsay_Bolton.txt'}}>,
    <Answer {'answer': 'Robert', 'type': 'extractive', 'score': 0.2134847790002823, 'context': "d Gendry over to them - King Joffrey has ordered that all of his father Robert's bastards be killed, but Yoren turns the Goldcloaks away. Later, Gendr", 'offsets_in_document': [{'start': 362, 'end': 368}], 'offsets_in_context': [{'start': 72, 'end': 78}], 'document_ids': ['dd4e070a22896afa81748d6510006d2'], 'meta': {'name': '191_Gendry.txt'}}>,
    <Answer {'answer': 'Prince Joffrey', 'type': 'extractive', 'score': 0.14156031608581543, 'context': "\n==== ''A Game of Thrones'' ====\nPrince Joffrey is taken by his parents to Winterfell and is betrothed to Sansa Stark in order to create an alliance b", 'offsets_in_document': [{'start': 33, 'end': 47}], 'offsets_in_context': [{'start': 33, 'end': 47}], 'document_ids': ['58e878049c1864c83325dd2a13ab37ee'], 'meta': {'name': '37_Joffrey_Baratheon.txt'}}>,
    <Answer {'answer': 'Joffrey', 'type': 'extractive', 'score': 0.07759759575128555, 'context': "laying with one of his wooden toys.\nAfter Eddard discovers the truth of Joffrey's paternity, he tells Sansa that they will be heading back to Winterfe", 'offsets_in_document': [{'start': 1161, 'end': 1168}], 'offsets_in_context': [{'start': 72, 'end': 79}], 'document_ids': ['d1f36ec7170e4c46cde65787fe125dfe'], 'meta': {'name': '332_Sansa_Stark.txt'}}>,
    <Answer {'answer': 'Eddard', 'type': 'extractive', 'score': 0.06299781054258347, 'context': 's bodyguard Sandor "The Hound" Clegane hunt down and kill Mycah.\nLater, Eddard Stark discovers that Joffrey is not King Robert\'s biological son and re', 'offsets_in_document': [{'start': 981, 'end': 987}], 'offsets_in_context': [{'start': 72, 'end': 78}], 'document_ids': ['58e878049c1864c83325dd2a13ab37ee'], 'meta': {'name': '37_Joffrey_Baratheon.txt'}}>,
    <Answer {'answer': 'Eddard "Ned" Stark agrees to make a false confession and swear loyalty to King Joffrey Baratheon.\nArya Stark finds a crowd gathering to watch her father be judged', 'type': 'extractive', 'score': 0.06236845254898071, 'context': 'Eddard "Ned" Stark agrees to make a false confession and swear loyalty to King Joffrey Baratheon.\nArya Stark finds a crowd gathering to watch her father be judged', 'offsets_in_document': [{'start': 89, 'end': 251}], 'offsets_in_context': [{'start': 0, 'end': 162}], 'document_ids': ['956aa2b653c6debcb6cb217531a6be58'], 'meta': {'name': '450_Baelor.txt'}}>,
    <Answer {'answer': 'Robb', 'type': 'extractive', 'score': 0.058649975806474686, 'context': 'allow the army to cross the river and to commit his troops in return for Robb and Arya Stark marrying two of his children.\nTyrion Lannister suspects h', 'offsets_in_document': [{'start': 193, 'end': 197}], 'offsets_in_context': [{'start': 73, 'end': 77}], 'document_ids': ['6b181174d1237878b706e6a12d69e92'], 'meta': {'name': '450_Baelor.txt'}}>]

Looking at the visualization makes this easy to understand.
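Stripped of Haystack, the routing idea is just "pick an edge name, then follow only that edge". The sketch below is a pure-Python illustration under that reading. classify, run_with_routing and the branches table are hypothetical names, and the branch functions only return a label instead of retrieving anything.

```python
from typing import Callable, Dict, Tuple


def classify(query: str) -> Tuple[dict, str]:
    """Same rule as the custom classifier: "?" means a natural-language question."""
    if "?" in query:
        return {}, "output_2"
    return {}, "output_1"


def run_with_routing(query: str, branches: Dict[str, Callable[[str], str]]) -> str:
    # Only the branch attached to the chosen edge is executed.
    _, edge = classify(query)
    return branches[edge](query)


# Edge name -> downstream branch, mirroring the inputs given to add_node.
branches = {
    "output_1": lambda q: f"BM25Retriever <- {q}",
    "output_2": lambda q: f"EmbeddingRetriever <- {q}",
}

print(run_with_routing("Who is the father of Arya Stark?", branches))
print(run_with_routing("Arya Stark father", branches))
```

The point of the design is that the classifier itself returns no data, only an edge name, so adding a third route would mean one more edge and one more entry in the table.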


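Debugging pipelines

There are several ways to get debug output.

  1. Enable the debug attribute on a node
bm25_retriever.debug = True
  2. Pass a debug parameter when running the pipeline

Debugging can be enabled either for the whole pipeline or for specific nodes only.

To enable it for the whole pipeline, pass it directly in params.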

result = p_classifier.run(query="Who is the father of Arya Stark?", params={"debug": True})

To enable it only for a specific node, put it under that node's name in params.

result = p_classifier.run(query="Who is the father of Arya Stark?", params={"BM25Retriever": {"debug": True}})

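The debug information is written to a key named _debug in the pipeline's result.

The data you actually get back looks like this. Since there is a lot of it, I also added options to limit each node to the top 3 results, and enabled debug for the whole pipeline.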

from pprint import pprint

res_1 = p_classifier.run(query="Who is the father of Arya Stark?", params={"debug": True, "BM25Retriever": {"top_k": 3}, "EmbeddingRetriever": {"top_k": 3}, "QAReader": {"top_k": 3}})
print("\nEmbedding Retriever Results" + "\n" + "=" * 15)
pprint(res_1)
{'_debug': {'EmbeddingRetriever': {'exec_time_ms': 199.99,
                                   'input': {'debug': True,
                                             'query': 'Who is the father of '
                                                      'Arya Stark?',
                                             'root_node': 'Query',
                                             'top_k': 3},
                                   'output': {'documents': [<Document: {'content': '\n=== Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series.  She has five siblings: an older brother Robb, an older sister Sansa, two younger brothers Bran and Rickon, and an older illegitimate half-brother, Jon Snow.', 'content_type': 'text', 'score': 0.560986304295376, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'd7a98cb66f592540fa7de20bf46a5e64'}>,
                                                            <Document: {'content': "\n===Arya Stark===\n'''Arya Stark''' portrayed by Maisie Williams. Arya Stark of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be training to use weapons than sewing with a needle. She names her direwolf Nymeria, after a legendary warrior queen.", 'content_type': 'text', 'score': 0.5607921031690192, 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '623e446a1b048d81130a29e262152cd5'}>,
                                                            <Document: {'content': "'''Arya Stark''' is a fictional character in American author George R. R. Martin's ''A Song of Ice and Fire'' epic fantasy novel series.  She is a prominent point of view character in the novels with the third most viewpoint chapters, and is the only viewpoint character to have appeared in every published book of the series.\nIntroduced in 1996's ''A Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark.  She is tomboyish, headstrong, feisty, independent, disdains traditional female pursuits, and is often mistaken for a boy.  She wields a smallsword named Needle, a gift from her half-brother, Jon Snow, and is trained in the Braavosi style of sword fighting by Syrio Forel.\nArya is portrayed by English actress Maisie Williams in HBO's Emmy-winning television adaptation of the novel series, ''Game of Thrones''.  Her performance has garnered critical acclaim, particularly in the second season for her work opposite veteran actor Charles Dance (Tywin Lannister) when she served as his cupbearer. She is among the most popular characters in either version of the story.  Williams was nominated for a Primetime Emmy Award for Outstanding Supporting Actress in a Drama Series for the role in 2016. She and the rest of the cast were nominated for Screen Actors Guild Awards for Outstanding Performance by an Ensemble in a Drama Series in 2011, 2013, 2014, 2015, 2016 and 2017.", 'content_type': 'text', 'score': 0.5599752218462483, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '2ee56bdd46dfd30b23f91bcc046456a4'}>]}},
            'QAReader': {'exec_time_ms': 100.68,
                         'input': {'debug': True,
                                   'documents': [<Document: {'content': '\n=== Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series.  She has five siblings: an older brother Robb, an older sister Sansa, two younger brothers Bran and Rickon, and an older illegitimate half-brother, Jon Snow.', 'content_type': 'text', 'score': 0.560986304295376, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'd7a98cb66f592540fa7de20bf46a5e64'}>,
                                                 <Document: {'content': "\n===Arya Stark===\n'''Arya Stark''' portrayed by Maisie Williams. Arya Stark of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be training to use weapons than sewing with a needle. She names her direwolf Nymeria, after a legendary warrior queen.", 'content_type': 'text', 'score': 0.5607921031690192, 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '623e446a1b048d81130a29e262152cd5'}>,
                                                 <Document: {'content': "'''Arya Stark''' is a fictional character in American author George R. R. Martin's ''A Song of Ice and Fire'' epic fantasy novel series.  She is a prominent point of view character in the novels with the third most viewpoint chapters, and is the only viewpoint character to have appeared in every published book of the series.\nIntroduced in 1996's ''A Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark.  She is tomboyish, headstrong, feisty, independent, disdains traditional female pursuits, and is often mistaken for a boy.  She wields a smallsword named Needle, a gift from her half-brother, Jon Snow, and is trained in the Braavosi style of sword fighting by Syrio Forel.\nArya is portrayed by English actress Maisie Williams in HBO's Emmy-winning television adaptation of the novel series, ''Game of Thrones''.  Her performance has garnered critical acclaim, particularly in the second season for her work opposite veteran actor Charles Dance (Tywin Lannister) when she served as his cupbearer. She is among the most popular characters in either version of the story.  Williams was nominated for a Primetime Emmy Award for Outstanding Supporting Actress in a Drama Series for the role in 2016. She and the rest of the cast were nominated for Screen Actors Guild Awards for Outstanding Performance by an Ensemble in a Drama Series in 2011, 2013, 2014, 2015, 2016 and 2017.", 'content_type': 'text', 'score': 0.5599752218462483, 'meta': {'name': '43_Arya_Stark.txt'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '2ee56bdd46dfd30b23f91bcc046456a4'}>],
                                   'query': 'Who is the father of Arya Stark?',
                                   'top_k': 3},
                         'output': {'answers': [<Answer {'answer': 'Lord Eddard Stark', 'type': 'extractive', 'score': 0.8793496489524841, 'context': "Game of Thrones'', Arya is the third child and younger daughter of Lord Eddard Stark and his wife Lady Catelyn Stark.  She is tomboyish, headstrong, f", 'offsets_in_document': [{'start': 419, 'end': 436}], 'offsets_in_context': [{'start': 67, 'end': 84}], 'document_ids': ['2ee56bdd46dfd30b23f91bcc046456a4'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
                                                <Answer {'answer': 'Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.8221828937530518, 'context': 'Background ===\nArya is the third child and younger daughter of Eddard and Catelyn Stark and is nine years old at the beginning of the book series.  Sh', 'offsets_in_document': [{'start': 68, 'end': 92}], 'offsets_in_context': [{'start': 63, 'end': 87}], 'document_ids': ['d7a98cb66f592540fa7de20bf46a5e64'], 'meta': {'name': '43_Arya_Stark.txt'}}>,
                                                <Answer {'answer': 'Lord Eddard and Catelyn Stark', 'type': 'extractive', 'score': 0.6865440607070923, 'context': 'rk of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be traini', 'offsets_in_document': [{'start': 134, 'end': 163}], 'offsets_in_context': [{'start': 61, 'end': 90}], 'document_ids': ['623e446a1b048d81130a29e262152cd5'], 'meta': {'name': '349_List_of_Game_of_Thrones_characters.txt'}}>],
                                    'no_ans_gap': 8.355133056640625,
                                    'query': 'Who is the father of Arya '
                                             'Stark?'}},
            'Query': {'exec_time_ms': 0.07,
                      'input': {'debug': True},
                      'output': {}},
            'QueryClassifier': {'exec_time_ms': 0.08,
                                'input': {'debug': True,
                                          'query': 'Who is the father of Arya '
                                                   'Stark?'},
                                'output': {}}},
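The full _debug dump is hard to scan, so a small helper that pulls out just the per-node timings can help. The sketch below is plain Python over a dict shaped like the output above. slowest_nodes is a hypothetical helper name, and the sample dict only copies the exec_time_ms values from that output rather than calling a pipeline.

```python
from typing import Any, Dict, List, Tuple


def slowest_nodes(result: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (node name, exec_time_ms) pairs from result["_debug"], slowest first."""
    debug = result.get("_debug", {})
    timings = [
        (name, info["exec_time_ms"])
        for name, info in debug.items()
        if "exec_time_ms" in info
    ]
    return sorted(timings, key=lambda item: item[1], reverse=True)


# Values copied from the debug output shown above.
sample = {
    "_debug": {
        "EmbeddingRetriever": {"exec_time_ms": 199.99},
        "QAReader": {"exec_time_ms": 100.68},
        "Query": {"exec_time_ms": 0.07},
        "QueryClassifier": {"exec_time_ms": 0.08},
    }
}

for name, ms in slowest_nodes(sample):
    print(f"{name}: {ms} ms")
```

With a real result, the same function could be called as slowest_nodes(res_1), assuming the run had debug enabled so that the _debug key is present.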
This scrap was closed on 2023/10/04.