
LangChain - Investigating How AgentType.ZERO_SHOT_REACT_DESCRIPTION Works

Published on 2023/05/05

ChatGPT

tech

These notes collect the information needed for developing applications with large language models (LLMs).
Note that the accuracy of the information is not guaranteed.

LangChain

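A large language model on its own can only output a result for the text it is prompted with.
Recording past exchanges or integrating external data requires building a separate mechanism.
LangChain makes it easier to build mechanisms such as the following:
- Managing large language models
- Integrating external data
- Identifying the type of an exchange
- Recording past exchanges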

Basic usage

https://python.langchain.com/en/latest/getting_started/getting_started.html

Action inference by the Agent (ZERO_SHOT_REACT_DESCRIPTION)

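The code passes Tools to the chain, but does not specify when each Tool should be used.
The Agent uses the LLM to infer when to use a Tool.

Running the following code results in the Search and Calculator tools being executed.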

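from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
# Assumes OPENAI_API_KEY and SERPAPI_API_KEY are set in the environment.
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
# The question means: "What is yesterday's highest temperature in Tokyo, doubled?"
print(agent.run("昨日の東京の最高気温を2倍した数値は?"))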

> Entering new AgentExecutor chain...
 I need to find out what the highest temperature in Tokyo was yesterday
Action: Search
Action Input: "Tokyo highest temperature yesterday"
Observation: Tokyo Temperature Yesterday. Maximum temperature yesterday: 70 °F (at 12:00 pm) Minimum temperature yesterday: 64 °F (at 10:00 am) Average temperature ...
Thought: I need to double the maximum temperature
Action: Calculator
Action Input: 70 * 2
Observation: Answer: 140
Thought: I now know the final answer
Final Answer: 140 °F

> Finished chain.
140 °F

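This output alone does not make clear what prompts are actually used for the LLM inference.
So let's check what prompt is sent in each request.

First, a prompt containing the text given to the Agent is generated.
Rather than passing the input text to the LLM as-is, the prompt is built so that the LLM can decide whether a Tool should be used.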
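
As a rough illustration of how a prompt like this could be assembled from the tool names and descriptions, here is a minimal sketch. The function `build_prompt` and the constant names are hypothetical and are not LangChain's API; only the template wording is taken from the prompt captured in this article, and the Search description is shortened.

```python
# Hypothetical sketch: not LangChain's actual implementation.
PREFIX = "Answer the following questions as best you can. You have access to the following tools:"
FORMAT_INSTRUCTIONS = """Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""
SUFFIX = """Begin!

Question: {question}
Thought:{scratchpad}"""


def build_prompt(tools: dict[str, str], question: str, scratchpad: str = "") -> str:
    """Assemble the agent prompt from tool descriptions and the user's question."""
    tool_lines = "\n".join(f"{name}: {desc}" for name, desc in tools.items())
    tool_names = ", ".join(tools)
    return "\n\n".join([
        PREFIX,
        tool_lines,
        FORMAT_INSTRUCTIONS.format(tool_names=tool_names),
        SUFFIX.format(question=question, scratchpad=scratchpad),
    ])


tools = {
    "Search": "A search engine. Input should be a search query.",  # shortened description
    "Calculator": "Useful for when you need to answer questions about math.",
}
prompt = build_prompt(tools, "昨日の東京の最高気温を2倍した数値は?")
print(prompt)
```

The point of the design is that the list of allowed actions is derived from whatever tools are registered, so the LLM is only ever offered tool names that the program can actually execute.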

---------- Prompt ----------
Answer the following questions as best you can. You have access to the following tools:

Search: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 昨日の東京の最高気温を2倍した数値は?
Thought:

---------- Response ----------
 I need to find out what the highest temperature in Tokyo was yesterday
Action: Search
Action Input: "Tokyo highest temperature yesterday"

As a result of the LLM inference, "Search" is executed, and weather information is retrieved via SerpAPI.
The text passed to SerpAPI is itself inferred by the LLM.

---------- Search query ----------
Query: Tokyo highest temperature yesterday

---------- Response ----------
Tokyo Temperature Yesterday. Maximum temperature yesterday: 70 °F (at 12:00 pm) Minimum temperature yesterday: 64 °F (at 10:00 am) Average temperature ...
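
The step from the LLM response to the search query can be pictured as a small parsing function that pulls the tool name and tool input out of the response text. The sketch below is an assumption about how such parsing could be done; `parse_action` and `ACTION_PATTERN` are hypothetical names, and LangChain's real parser may differ.

```python
import re

# Hypothetical pattern: tool name after "Action:", tool input after "Action Input:".
ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)


def parse_action(llm_output: str) -> tuple[str, str]:
    """Return (tool name, tool input) extracted from an agent-style LLM response."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {llm_output!r}")
    tool = match.group(1).strip()
    # Drop surrounding whitespace and double quotes so the bare query remains.
    tool_input = match.group(2).strip().strip('"')
    return tool, tool_input


response = ''' I need to find out what the highest temperature in Tokyo was yesterday
Action: Search
Action Input: "Tokyo highest temperature yesterday"'''
print(parse_action(response))
```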

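One round of "LLM inference → processing based on the inference result" is now complete, so control returns to LLM inference.
The difference from the first prompt is that Action, Action Input, and Observation have been added.
This expresses the state in which the weather information has already been retrieved.

---------- Prompt ----------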
Answer the following questions as best you can. You have access to the following tools:

Search: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 昨日の東京の最高気温を2倍した数値は?
Thought: I need to find out what the highest temperature in Tokyo was yesterday
Action: Search
Action Input: "Tokyo highest temperature yesterday"
Observation: Tokyo Temperature Yesterday. Maximum temperature yesterday: 70 °F (at 12:00 pm) Minimum temperature yesterday: 64 °F (at 10:00 am) Average temperature ...
Thought:

---------- Response ----------
 I need to double the maximum temperature
Action: Calculator
Action Input: 70 * 2
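
The way a finished step is carried over into the next prompt can be sketched as follows. This is a simplified illustration under the assumption that each step is stored as a pair of (LLM response, observation); `build_scratchpad` is a hypothetical helper, not a LangChain function.

```python
# Hypothetical sketch: turn finished steps into the text that follows "Thought:".
def build_scratchpad(steps: list[tuple[str, str]]) -> str:
    """Join (LLM response, observation) pairs into text appended after 'Thought:'."""
    scratchpad = ""
    for llm_response, observation in steps:
        scratchpad += llm_response
        # The tool result is recorded, then a new "Thought:" invites the next inference.
        scratchpad += f"\nObservation: {observation}\nThought:"
    return scratchpad


steps = [(
    " I need to find out what the highest temperature in Tokyo was yesterday\n"
    "Action: Search\n"
    'Action Input: "Tokyo highest temperature yesterday"',
    "Tokyo Temperature Yesterday. Maximum temperature yesterday: 70 °F (at 12:00 pm) "
    "Minimum temperature yesterday: 64 °F (at 10:00 am) Average temperature ...",
)]
prompt_tail = "Question: 昨日の東京の最高気温を2倍した数値は?\nThought:" + build_scratchpad(steps)
print(prompt_tail)
```

Because the LLM itself is stateless, everything it has "done" so far has to be replayed as text in this way on every request.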

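As a result of the LLM inference, "Calculator" is executed, and the calculation is delegated to the LLM.
In this case the LLM did not output the calculation result (Answer: 140) itself; looking at the llm-math implementation, it either evaluates the text value in code or, if the LLM's output already contains an answer, uses that.

---------- Prompt ----------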
Translate a math problem into a expression that can be executed using Python's numexpr library. Use the output of running this code to answer the question.

Question: ${Question with math problem.}
```text
${single line mathematical expression that solves the problem}
```
...numexpr.evaluate(text)...
```output
${Output of running the code}
```
Answer: ${Answer}

Begin.

Question: What is 37593 * 67?

```text
37593 * 67
```
...numexpr.evaluate("37593 * 67")...
```output
2518731
```
Answer: 2518731

Question: 70 * 2

---------- Response ----------
```text
70 * 2
```
...numexpr.evaluate("70 * 2")...
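
The post-processing described above (evaluate the expression in the text block in code, or fall back to an answer written by the LLM) can be sketched roughly like this. Note the assumptions: a small standard-library arithmetic evaluator stands in for `numexpr.evaluate` so that the example runs without extra packages, and `process_llm_math_output` and `evaluate_expression` are hypothetical names rather than the actual llm-math code.

```python
import ast
import operator
import re

# Assumption: this tiny evaluator replaces numexpr.evaluate for illustration only.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def evaluate_expression(expression: str) -> float:
    """Evaluate a single-line arithmetic expression without using eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {expression!r}")
    return _eval(ast.parse(expression.strip(), mode="eval").body)


def process_llm_math_output(llm_output: str) -> str:
    """Turn the LLM's response into the 'Answer: ...' string used as the Observation."""
    text_block = re.search(r"```text(.*?)```", llm_output, re.DOTALL)
    if text_block:
        # The LLM produced an expression: compute it in code.
        return f"Answer: {evaluate_expression(text_block.group(1))}"
    if "Answer:" in llm_output:
        # The LLM already wrote an answer: use it as-is.
        return "Answer: " + llm_output.split("Answer:")[-1].strip()
    raise ValueError(f"Unknown format from LLM: {llm_output!r}")


response = '```text\n70 * 2\n```\n...numexpr.evaluate("70 * 2")...'
print(process_llm_math_output(response))  # Answer: 140
```

Computing the value in code rather than trusting the LLM's arithmetic is what makes the Calculator tool reliable for larger numbers.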

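A second round of "LLM inference → processing based on the inference result" is now complete, so control returns to LLM inference again.
This time there is no Action to execute, and the final answer is output.
In other words, this is the answer to the text given to the Agent.

---------- Prompt ----------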
Answer the following questions as best you can. You have access to the following tools:

Search: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 昨日の東京の最高気温を2倍した数値は?
Thought: I need to find out what the highest temperature in Tokyo was yesterday
Action: Search
Action Input: "Tokyo highest temperature yesterday"
Observation: Tokyo Temperature Yesterday. Maximum temperature yesterday: 70 °F (at 12:00 pm) Minimum temperature yesterday: 64 °F (at 10:00 am) Average temperature ...
Thought: I need to double the maximum temperature
Action: Calculator
Action Input: 70 * 2
Observation: Answer: 140
Thought:

---------- Response ----------
 I now know the final answer
Final Answer: 140 °F

ReAct

According to the LangChain documentation, the zero-shot-react-description Agent uses the ReAct framework:
- https://python.langchain.com/en/latest/modules/agents/agents/agent_types.html
- https://arxiv.org/abs/2210.03629

Chain-of-Thought

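Instead of showing only the final result as an example, show the process of deriving the result as part of the example
- This improves the accuracy of answers to questions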
- https://www.promptingguide.ai/jp/techniques/cot
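
To make the difference concrete, the sketch below builds the same one-example prompt twice, once with only the final answer and once with the derivation included. The example questions and the helper `few_shot_prompt` are made up for this illustration and do not come from the linked guide.

```python
# Hypothetical few-shot example written for this sketch.
EXAMPLE_QUESTION = "I had 5 apples, ate 2, then bought 4 more. How many apples do I have?"
EXAMPLE_REASONING = (
    "I started with 5 apples. After eating 2 I had 5 - 2 = 3. "
    "After buying 4 more I have 3 + 4 = 7."
)
EXAMPLE_ANSWER = "7"


def few_shot_prompt(question: str, with_reasoning: bool) -> str:
    """Build a one-example prompt, optionally showing the reasoning before the answer."""
    if with_reasoning:
        example = f"Q: {EXAMPLE_QUESTION}\nA: {EXAMPLE_REASONING} The answer is {EXAMPLE_ANSWER}."
    else:
        example = f"Q: {EXAMPLE_QUESTION}\nA: The answer is {EXAMPLE_ANSWER}."
    return f"{example}\n\nQ: {question}\nA:"


question = "I had 10 pens, gave away 3, then bought 5 more. How many pens do I have?"
print(few_shot_prompt(question, with_reasoning=False))
print("-" * 20)
print(few_shot_prompt(question, with_reasoning=True))
```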

ReAct

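An approach in which the LLM outputs the derivation process itself
- If an "action" needed to derive the result is output, the result of that action is obtained from external data or similar
- The action's result is added to the "derivation process", which is then fed to the LLM again
- The same steps repeat while further "actions" are needed; otherwise the final result is output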
- https://www.promptingguide.ai/jp/techniques/react
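
The loop described in the list above can be sketched end to end with a fake LLM and stubbed tools. Everything here is an assumption made for illustration: `run_agent` and `make_fake_llm` are hypothetical names, the fake LLM simply replays the three responses captured in this article, and the tools return fixed strings instead of calling SerpAPI or a calculator.

```python
from typing import Callable

# The fake LLM replays the three responses captured in this article.
CANNED_RESPONSES = [
    " I need to find out what the highest temperature in Tokyo was yesterday\n"
    'Action: Search\nAction Input: "Tokyo highest temperature yesterday"',
    " I need to double the maximum temperature\nAction: Calculator\nAction Input: 70 * 2",
    " I now know the final answer\nFinal Answer: 140 °F",
]


def make_fake_llm(responses: list[str]) -> Callable[[str], str]:
    """Return a stand-in for an LLM call that replays canned responses in order."""
    remaining = iter(responses)
    return lambda prompt: next(remaining)


def run_agent(question: str, llm: Callable[[str], str],
              tools: dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Repeat LLM call -> tool call until the LLM emits a final answer."""
    prompt = f"Question: {question}\nThought:"
    for _ in range(max_steps):
        response = llm(prompt)
        if "Final Answer:" in response:
            return response.split("Final Answer:")[-1].strip()
        # Extract the requested tool and its input from the response.
        tool_name = response.split("Action:")[1].split("\n")[0].strip()
        tool_input = response.split("Action Input:")[1].strip().strip('"')
        observation = tools[tool_name](tool_input)
        # Add the step and its result to the prompt, then ask the LLM again.
        prompt += f"{response}\nObservation: {observation}\nThought:"
    raise RuntimeError("No final answer within max_steps")


tools = {
    "Search": lambda query: "Maximum temperature yesterday: 70 °F",  # stubbed search result
    "Calculator": lambda expr: "Answer: 140",  # stubbed calculation result
}
print(run_agent("昨日の東京の最高気温を2倍した数値は?", make_fake_llm(CANNED_RESPONSES), tools))
```

The `max_steps` limit is a design choice of this sketch: since the LLM decides when to stop, some upper bound is needed to keep a confused model from looping forever.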

Summary

LangChain itself does not seem difficult.
The difficult part seems to be applying Prompt Engineering techniques.
