The API changed in LlamaIndex v0.6.0 and later, so I have rewritten these notes as follows.

Everything from here on is the old version.

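Overview

I did LangChain first, so I had never touched LlamaIndex on the assumption that it was "pretty much the same thing". Time to try it.

What made me want to try it is that it has several Loaders that LangChain does not provide by default, so it looked like I could easily build a vectorstore from my own content.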

https://llamahub.ai/

Twitter, WordPress, and web/rss look promising.

That is not to say LangChain's Loaders are bad; it has plenty of its own. LangChain and LlamaIndex can also be used together, so combining them as needed seems like a good approach.

kun432

Running this in Colaboratory.

Install the libraries.

!pip install openai
!pip install langchain
!pip install llama-index

!apt install -y jq

Enter your OpenAI API key.

OPENAI_API_KEY=""#@param{type: "string"}

import os
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

Clone the repository and move into the example directory.

!git clone https://github.com/jerryjliu/gpt_index.git
%cd gpt_index/examples/paul_graham_essay

Enable log output. In Colaboratory, nothing was printed unless I passed force=True. Set the log level as needed; I used INFO here.

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO, force=True)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
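
Why force=True matters can be reproduced without Colaboratory. The following is a minimal sketch that assumes the notebook runtime has already attached a handler to the root logger (simulated here with a `NullHandler`; the function name `demo_force` is made up for illustration). In that situation `logging.basicConfig` does nothing unless `force=True` is passed.

```python
import io
import logging


def demo_force():
    root = logging.getLogger()
    # Simulate a runtime that has already attached a handler to the root logger.
    root.addHandler(logging.NullHandler())

    ignored = io.StringIO()
    # Without force=True this call is a no-op because the root logger already has a handler.
    logging.basicConfig(stream=ignored, level=logging.INFO)
    logging.getLogger("demo").info("first message")

    applied = io.StringIO()
    # force=True removes the existing handlers and applies the new configuration.
    logging.basicConfig(stream=applied, level=logging.INFO, force=True)
    logging.getLogger("demo").info("second message")

    return ignored.getvalue(), applied.getvalue()


without_force, with_force = demo_force()
print(repr(without_force))
print(repr(with_force))
```

Only the second stream should receive a log line. Note that `force` is available in Python 3.8 and later.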

Now let's run the sample. It generates an answer from the text files under the data directory.

from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = GPTSimpleVectorIndex.from_documents(documents)

response = index.query("What did the author do growing up?")
print(response)

Output

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 17617 tokens
> [build_index_from_nodes] Total embedding token usage: 17617 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 4075 tokens
> [query] Total LLM token usage: 4075 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 8 tokens
> [query] Total embedding token usage: 8 tokens


The author grew up writing short stories, programming on an IBM 1401, and building a computer kit from Heathkit. They also wrote simple games, a program to predict how high model rockets would fly, and a word processor. They studied philosophy in college, but switched to AI and taught themselves Lisp. They reverse-engineered SHRDLU for their undergraduate thesis and wrote a book about Lisp hacking. They also took art classes at Harvard and applied to art schools, and while a student at the Accademia, they started painting still lives in their bedroom at night. These paintings were tiny, because the room was, and because they painted them on leftover scraps of canvas, which was all they could afford at the time. They also had an arrangement with the faculty of the Accademia whereby the students wouldn't require the faculty to teach anything, and in return the faculty wouldn't require the students to learn anything. They also had a model who lived just down the street from them, who made a living from a combination of modelling and making fakes for a local antique dealer.

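Setting the level to DEBUG gives a rough picture of what happens behind the scenes, so it is worth a look. It is probably something like this:

1. Vectorize the text data with the OpenAI embeddings API.
2. Vectorize the question the same way.
3. Run a similarity search between the question and the data.
4. Pass the search results as context to the OpenAI completion API to generate an answer to the question.
5. (I do not fully understand this part) To refine the generated answer, send it to the completion API again (with additional context).
6. Get the final answer.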
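
The embed, search, and prompt-building steps above can be sketched in plain Python. This is not how LlamaIndex is implemented: the bag-of-words `embed` function is a toy stand-in for the OpenAI embeddings API, the prompt template is invented, and the refine step is left out. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embeddings API)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=1):
    """Rank chunks by similarity to the question and keep the best top_k."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to a completion API."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


chunks = [
    "The author wrote short stories and programmed on an IBM 1401.",
    "The author later studied painting at an art school in Florence.",
]
question = "What did the author do on the IBM 1401?"
top = retrieve(question, chunks)
prompt = build_prompt(question, top)
print(prompt)
```

In the real flow the chunk embeddings are computed once at index-build time, which is why the log shows embedding tokens under build_index_from_nodes and only a few under query.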

Save the built index to disk so it persists.

index.save_to_disk('index.json')

Looking inside, you can see it is vector data.

!jq -r '.' index.json | head -100

{
  "index_struct": {
    "__type__": "simple_dict",
    "__data__": {
      "index_id": "929b373a-474e-4c01-aeb4-6cbb21dd06b1",
      "summary": null,
      "nodes_dict": {
        "fe352ef2-0c37-443b-987e-ff9bd61c2c6b": "6c1acefe-c699-4135-bb0f-3f18347a471c",
        "fd63597b-7f56-4793-b15d-6c520f7ceb34": "62752dce-b5ba-4608-81f1-f662d9789f1e",
        "d371426d-8332-4e79-8a39-a7e637579b86": "053e9ffd-c07e-463f-aca6-85bb03360c71",
        "347cdb60-2e01-4715-a6ea-e981b4f316df": "4472b209-eb99-4852-8a9f-4af222692b21",
        "6360e83a-fcec-475c-9f72-cbe92828e1b8": "79f2a391-d671-4748-9f0d-7ff5f4acc696",
        "16c1a1d0-39e5-4324-aa47-50251f610784": "b88660de-d3f6-49d0-85b4-43e8051e0fff"
      },
      "doc_id_dict": {
        "6168407e-317b-4df9-b8a9-f5a2d84a7914": [
          "fe352ef2-0c37-443b-987e-ff9bd61c2c6b",
          "fd63597b-7f56-4793-b15d-6c520f7ceb34",
          "d371426d-8332-4e79-8a39-a7e637579b86",
          "347cdb60-2e01-4715-a6ea-e981b4f316df",
          "6360e83a-fcec-475c-9f72-cbe92828e1b8",
          "16c1a1d0-39e5-4324-aa47-50251f610784"
        ]
      },
      "embeddings_dict": {
        "fe352ef2-0c37-443b-987e-ff9bd61c2c6b": [
          0.0026574512012302876,
          -0.0063492292538285255,
          -0.0017209660727530718,
          -0.029771840199828148,
          -0.007575745228677988,
          0.025676464661955833,
          -0.02901706099510193,
          -0.003076772904023528,
          -0.009169167838990688,
          -0.0368443988263607,
          0.022671325132250786,
          0.0330425500869751,
          0.0020372045692056417,
          -0.007939157076179981,
          0.0006316032959148288,
          0.01650729775428772,
          0.02960411086678505,
(snip)
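
Instead of scrolling through the raw JSON, the same file can be summarized from Python. The sketch below assumes the layout matches the excerpt above (`index_struct` → `__data__` → `nodes_dict` / `doc_id_dict` / `embeddings_dict`); the helper name `summarize_index` and the tiny sample data are made up for illustration.

```python
import json


def summarize_index(index_json):
    """Count nodes, documents, and embedding sizes in a parsed index file."""
    data = index_json["index_struct"]["__data__"]
    embeddings = data["embeddings_dict"]
    dimensions = sorted({len(vector) for vector in embeddings.values()})
    return {
        "nodes": len(data["nodes_dict"]),
        "documents": len(data["doc_id_dict"]),
        "embeddings": len(embeddings),
        "dimensions": dimensions,
    }


# Tiny hand-written sample that mirrors the structure shown above.
sample_text = """
{
  "index_struct": {
    "__type__": "simple_dict",
    "__data__": {
      "index_id": "example",
      "summary": null,
      "nodes_dict": {"n1": "t1", "n2": "t2"},
      "doc_id_dict": {"d1": ["n1", "n2"]},
      "embeddings_dict": {"n1": [0.1, 0.2, 0.3], "n2": [0.4, 0.5, 0.6]}
    }
  }
}
"""
summary = summarize_index(json.loads(sample_text))
print(summary)
```

For the real file, pass the result of `json.load` on `index.json` to the same function.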

To load it back, do the following. The DEBUG log shows that the data is not embedded again; only the question is embedded and searched against the index.

from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

index = GPTSimpleVectorIndex.load_from_disk('index.json')

response = index.query("What did the author do growing up?")
print(response)

kun432

Now let's try it with the data from my own Hatena Blog. web/rss looks like the easy way.

from llama_index import download_loader

RssReader = download_loader("RssReader")

loader = RssReader()
documents = loader.load_data([
    "https://kun432.hatenablog.com/rss"
])

index = GPTSimpleVectorIndex.from_documents(documents)

response = index.query("Voiceflowについて教えて下さい。")
print(response)

Result

Voiceflowは、Webサイトにチャットボットをかんたんに設置できるサービスです。Voiceflowは、Dialog Management APIを使用して、ユーザーとの会話を設計し、実行することができます。Voiceflowは、ユーザーとの会話を直感的かつ柔軟に行えるように設計されており、Transcribeを使用して、会話をテキストに変換することも可能です。また、Webチャットボットで音声入力を使えるようにすることも可能です。

My blog has no basic explanation of Voiceflow and mostly covers new features, so the explanation is slightly off (though not wrong).

Asking ChatGPT gives something like this.

Voiceflowは、会話型アプリケーションや音声アシスタント、スキルなどを簡単に作成できる、クラウドベースのプラットフォームです。Voiceflowを使用することで、コーディングの知識がなくても、直感的なインターフェースを使用して、音声インタフェースを持つアプリケーションを構築することができます。

Voiceflowは、デザインツールとして使われ、会話の流れを作成して、必要な応答を設定することができます。Voiceflowは、さまざまな音声アシスタントやプラットフォームに対応しており、Amazon Alexa、Google Assistant、Microsoft Cortana、Samsung Bixby、Web Chatなどに出力することができます。

Voiceflowは、無料で始めることができ、初心者向けのテンプレートも用意されています。また、高度な機能を使いたい場合は、有料プランを選択することができます。

It also takes a fair amount of time, so it may be better to refresh the index periodically and query the locally saved data the rest of the time.

CPU times: user 1.58 s, sys: 14.6 ms, total: 1.59 s
Wall time: 17 s
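
One way to do that is to rebuild the index only when the saved file is missing or older than some threshold. The following is a rough sketch with hypothetical helper names (`is_fresh`, `get_index`); the `build`, `load`, and `save` callables are placeholders that, in this scrap's setup, would wrap the RssReader plus `from_documents` step, `load_from_disk`, and `save_to_disk` respectively. Here they just read and write a text file so the example runs on its own.

```python
import os
import tempfile
import time


def is_fresh(path, max_age_seconds):
    """True if the file exists and was modified within max_age_seconds."""
    if not os.path.exists(path):
        return False
    return (time.time() - os.path.getmtime(path)) <= max_age_seconds


def get_index(path, max_age_seconds, build, load, save):
    """Load the saved index if it is recent enough, otherwise rebuild and save it."""
    if is_fresh(path, max_age_seconds):
        return load(path)
    index = build()
    save(index, path)
    return index


calls = []


def build():
    # Placeholder for the slow step (fetching the feed and embedding it).
    calls.append("build")
    return "index-data"


def save(index, path):
    with open(path, "w", encoding="utf-8") as f:
        f.write(index)


def load(path):
    with open(path, encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.json")
    first = get_index(path, 3600, build, load, save)   # builds and saves
    second = get_index(path, 3600, build, load, save)  # served from disk

print(calls)
```

The one-hour threshold is arbitrary; a blog that updates rarely could use a much longer one.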

kun432

Use LlamaIndex together with LangChain. Specifically, use LlamaIndex as a tool for a LangChain agent.

from langchain.agents import initialize_agent, Tool
from langchain.tools import BaseTool
from langchain.llms import OpenAI

from llama_index import GPTSimpleVectorIndex, download_loader

# Index the blog using LlamaIndex's web/rss loader
RssReader = download_loader("RssReader")
loader = RssReader()
documents = loader.load_data([
    "https://kun432.hatenablog.com/rss"
])
index = GPTSimpleVectorIndex.from_documents(documents)

# Expose the LlamaIndex index as a LangChain Tool
tools = [
    Tool(
        name = "MyBlogSearch",
        func=lambda q: str(index.query(q)),
        description="kun432's blogから検索します。"
    ),    
]

llm = OpenAI(temperature=0)

# Define the agent
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

agent.run("Voiceflowについて教えて")

Result

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 196303 tokens
> [build_index_from_nodes] Total embedding token usage: 196303 tokens


> Entering new AgentExecutor chain...
 Voiceflowに関する情報を探す
Action: MyBlogSearch
Action Input: VoiceflowINFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1438 tokens
> [query] Total LLM token usage: 1438 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 2 tokens
> [query] Total embedding token usage: 2 tokens

Observation: allows users to do what?

Voiceflow allows users to easily set up a chatbot on their website. It also allows users to design conversations intuitively and flexibly, and quickly make improvements to their conversation models if needed.
Thought: Voiceflowの詳細を知る
Action: MyBlogSearch
Action Input: Voiceflow featuresINFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1591 tokens
> [query] Total LLM token usage: 1591 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 3 tokens
> [query] Total embedding token usage: 3 tokens

Observation: 
Voiceflowは、Webサイトにチャットボットをかんたんに設置できるようになるサービスです。Voiceflowの特徴として、Dialog Management APIを使用して、直感的かつ柔軟な対話デザインを行えること、1 Click Integration的な機能を提供していること、MS LUISをベースとしたWeb ChatチャネルのNLUを提供していることが挙げられます。また、音声入力を使えるようにする機能も要望されています。
Thought: Voiceflowの最終的な答えを知る
Action: MyBlogSearch
Action Input: Voiceflow reviewINFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1624 tokens
> [query] Total LLM token usage: 1624 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 3 tokens
> [query] Total embedding token usage: 3 tokens

Observation: 
Voiceflowは、Webサイトにチャットボットをかんたんに設置できるようになったサービスです。Voiceflowは、Dialog Management APIを使用して、ユーザーとの会話を設計し、実行できるようにします。Voiceflowは、ユーザーとの会話を直感的かつ柔軟に設計できるため、企業の顧客向け問い合わせ用途などに最適です。Voiceflowは、非常にインパクトあるリリースであり、今後も1 Click Integration的な機能のリリースが予定されています。
Thought: I now know the final answer
Final Answer: Voiceflowは、Webサイトにチャットボットをかんたんに設置できるようになったサービスです。Voiceflowは、Dialog Management APIを使用して、ユーザーとの会話を設計し、実行できるようにします。Voiceflowは、ユーザーとの会話を直感的かつ柔軟に設計できるため、企業の顧客向け問い合わせ用途などに最適です。また、1 Click Integration的な機能も提供しており、今後もさらなる機能をリリース予定です。

> Finished chain.
'Voiceflowは、Webサイトにチャットボットをかんたんに設置できるようになったサービスです。Voiceflowは、Dialog Management APIを使用して、ユーザーとの会話を設計し、実行できるようにします。Voiceflowは、ユーザーとの会話を直感的かつ柔軟に設計できるため、企業の顧客向け問い合わせ用途などに最適です。また、1 Click Integration的な機能も提供しており、今後もさらなる機能をリリース予定です。'

Because "return_direct=True" is not set in the Tool definition, you can see ReAct examining the answer over several steps.
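
The difference in control flow can be illustrated with a toy loop. This is not LangChain code: `run_agent` and `fake_blog_search` are hypothetical names, and the list of planned inputs stands in for the decisions the LLM makes at each step. It only shows that returning the tool output directly ends the run after the first call, whereas otherwise the loop continues.

```python
def run_agent(planned_inputs, tool, return_direct=False):
    """Toy agent loop (not LangChain code): one tool call per planned input."""
    observations = []
    for action_input in planned_inputs:
        observation = tool(action_input)
        observations.append(observation)
        if return_direct:
            # The first tool result becomes the final answer; no further steps run.
            return observation, len(observations)
    # Simplification: the final answer here is just the last observation.
    return observations[-1], len(observations)


def fake_blog_search(query):
    # Hypothetical stand-in for `str(index.query(q))`.
    return f"search result for {query!r}"


# The three action inputs seen in the log above.
plan = ["Voiceflow", "Voiceflow features", "Voiceflow review"]

answer_loop, calls_loop = run_agent(plan, fake_blog_search)
answer_direct, calls_direct = run_agent(plan, fake_blog_search, return_direct=True)
print(calls_loop, calls_direct)
```

With the actual Tool definition, the equivalent change would be passing `return_direct=True` to `Tool(...)`, which trades the extra refinement steps for fewer LLM calls.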

kun432

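I looked through the docs, but there doesn't seem to be anything equivalent to LangChain's chains/agents? So combining it with LangChain seems like the way to go. If a single data source is all you need, it is nice and simple to write, but I expect you will soon want to do more than that.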

student-ops

This is extremely helpful!!

I feel there isn't much information about llamaindex on zenn. Are there other similar tools that are more popular?

kun432

With the caveat that I haven't used LlamaIndex much, don't understand it well, and may be wrong:

My personal impression is that LlamaIndex and LangChain started out with somewhat different goals.

LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLM’s with external data.

https://python.langchain.com/en/latest/

LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an API, but will also:

Be data-aware: connect a language model to other sources of data

Be agentic: allow a language model to interact with its environment

See this as well.

So in terms of application development LangChain seems more general-purpose, which I think is why there is more information about it.

kun432

That said, I personally expect these differences to fade gradually as more use cases emerge. For example, something like this has apparently appeared on the LlamaIndex side.

https://github.com/run-llama/llama-lab

By the way, there also seem to be similar tools like this one. I haven't used it, though.

https://haystack.deepset.ai/

An NLP Framework To Use Transformers In Your Applications

kun432

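As I wrote above, being able to combine LangChain and LlamaIndex and take the best of both is an advantage, but using multiple tools can also be a drawback when something goes wrong, so I think it comes down to judging case by case based on your environment and team.

This scrap was closed on 2023/06/07.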