🦔

llama-agents code reading

Published 2024/08/11

LLM

LlamaIndex

llamaagent

tech

Overview

I read through the source code of llama-agents, an LLM agent framework.
This article walks through what happens in the sample code below.

from llama_agents import (
    AgentService,
    AgentOrchestrator,
    ControlPlaneServer,
    SimpleMessageQueue,
)

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI


# create an agent
def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "The secret fact is: A baby llama is called a 'Cria'."


tool = FunctionTool.from_defaults(fn=get_the_secret_fact)

agent1 = ReActAgent.from_tools([tool], llm=OpenAI())
agent2 = ReActAgent.from_tools([], llm=OpenAI())

# create our multi-agent framework components
message_queue = SimpleMessageQueue(port=8000)
control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=OpenAI(model="gpt-4-turbo")),
    port=8001,
)
agent_server_1 = AgentService(
    agent=agent1,
    message_queue=message_queue,
    description="Useful for getting the secret fact.",
    service_name="secret_fact_agent",
    port=8002,
)
agent_server_2 = AgentService(
    agent=agent2,
    message_queue=message_queue,
    description="Useful for getting random dumb facts.",
    service_name="dumb_fact_agent",
    port=8003,
)

from llama_agents import LocalLauncher
import nest_asyncio

# needed for running in a notebook
nest_asyncio.apply()

# launch it
launcher = LocalLauncher(
    [agent_server_1, agent_server_2],
    control_plane,
    message_queue,
)
result = launcher.launch_single("What is the secret fact?")

print(f"Result: {result}")

Processing details

We will follow the processing step by step, using this diagram as a guide.

Creating the FunctionTool

# create an agent
def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "The secret fact is: A baby llama is called a 'Cria'."


tool = FunctionTool.from_defaults(fn=get_the_secret_fact)

The function is wrapped in the FunctionTool class.

By declaring arguments as pydantic Field objects, the tool's interface can be defined more explicitly.
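To make the idea of "wrapping a function" concrete, here is a minimal sketch of how a tool description can be derived from a function's name, docstring, and annotated arguments. It uses only the standard library; `describe_tool` and `add` are hypothetical names for illustration, and this is not how FunctionTool is actually implemented.

```python
import inspect
from typing import Any, Callable, Dict


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a simple tool description from a function's signature and docstring.

    Hypothetical helper for illustration only, not part of llama-index.
    """
    signature = inspect.signature(fn)
    # Map each parameter name to the name of its annotated type.
    parameters = {
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": parameters,
    }


def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "The secret fact is: A baby llama is called a 'Cria'."


def add(a: int, b: int) -> int:
    """Adds two integers."""
    return a + b


secret_tool = describe_tool(get_the_secret_fact)
add_tool = describe_tool(add)
print(secret_tool)
print(add_tool)
```

The more precisely the arguments are typed and described, the more information the LLM has when deciding whether and how to call the tool.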

Creating two agents

agent1 = ReActAgent.from_tools([tool], llm=OpenAI())
agent2 = ReActAgent.from_tools([], llm=OpenAI())

This creates the ReActAgents.
Passing a FunctionTool in tools lets the agent run the function whenever it needs to.
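As a rough picture of what "the agent runs the function when it needs to" means, the following toy loop alternates between a reasoning step and a tool call until an answer is produced. `scripted_llm` is a hard-coded stand-in for a real LLM and `run_react_loop` is a hypothetical name; the real ReActAgent is driven by a prompt and output parsing, which this sketch does not attempt to reproduce.

```python
from typing import Callable, Dict, List, Tuple


def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "The secret fact is: A baby llama is called a 'Cria'."


def scripted_llm(history: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM: returns ('action', tool_name) or ('answer', text)."""
    observations = [line for line in history if line.startswith("Observation: ")]
    if not observations:
        # Nothing observed yet, so request a tool call.
        return ("action", "get_the_secret_fact")
    # An observation exists, so answer with the most recent one.
    return ("answer", observations[-1][len("Observation: "):])


def run_react_loop(
    question: str,
    tools: Dict[str, Callable[[], str]],
    llm: Callable[[List[str]], Tuple[str, str]],
    max_steps: int = 5,
) -> str:
    """Alternate between LLM decisions and tool calls until an answer appears."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        kind, value = llm(history)
        if kind == "answer":
            return value
        # Record the chosen action and the tool's result as an observation.
        history.append(f"Action: {value}")
        history.append(f"Observation: {tools[value]()}")
    raise RuntimeError("no answer within max_steps")


answer = run_react_loop(
    "What is the secret fact?",
    {"get_the_secret_fact": get_the_secret_fact},
    scripted_llm,
)
print(answer)
```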

ReActAgent reference.

The prompt template used as the system message.

Note that everything up to this point is llama-index functionality, not llama-agents.

Creating the message queue server

message_queue = SimpleMessageQueue(port=8000)

It is implemented with FastAPI.
It exposes the following endpoints.

    When launched as a server, exposes the following endpoints:
    - GET `/`: Home endpoint
    - POST `/register_consumer`: Register a consumer
    - POST `/deregister_consumer`: Deregister a consumer
    - GET `/get_consumers/{message_type}`: Get consumers for a message type
    - POST `/publish`: Publish a message
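The endpoints above map naturally onto a small publish/subscribe structure. Below is a minimal in-memory sketch of that shape. `ToyMessageQueue` and its method signatures are assumptions made for illustration, and this version delivers a message to every consumer registered for its type, which may differ from how SimpleMessageQueue actually picks a consumer.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class ToyMessageQueue:
    """In-memory stand-in mirroring the register/deregister/publish endpoints."""

    def __init__(self) -> None:
        # message_type -> {consumer_id: handler}
        self._consumers: Dict[str, Dict[str, Handler]] = defaultdict(dict)

    def register_consumer(self, message_type: str, consumer_id: str, handler: Handler) -> None:
        self._consumers[message_type][consumer_id] = handler

    def deregister_consumer(self, message_type: str, consumer_id: str) -> None:
        self._consumers[message_type].pop(consumer_id, None)

    def get_consumers(self, message_type: str) -> List[str]:
        return list(self._consumers[message_type])

    def publish(self, message: dict) -> int:
        """Deliver the message to all consumers of its type; return how many got it."""
        handlers = list(self._consumers[message["type"]].values())
        for handler in handlers:
            handler(message)
        return len(handlers)


received: List[dict] = []
mq = ToyMessageQueue()
mq.register_consumer("control_plane", "cp-1", received.append)
delivered = mq.publish({"type": "control_plane", "action": "NEW_TASK", "data": "What is the secret fact?"})
consumers_before = mq.get_consumers("control_plane")
mq.deregister_consumer("control_plane", "cp-1")
delivered_after = mq.publish({"type": "control_plane", "action": "NEW_TASK", "data": "ignored"})
```

Routing by message type is what allows a task addressed to "control_plane" to reach the control plane without the publisher knowing where it runs.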

Creating the ControlPlaneServer

control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=OpenAI(model="gpt-4-turbo")),
    port=8001,
)

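As the diagram above shows, the ControlPlane acts as the command center that dispatches tasks.
It is told which message queue server to use.
gpt-4-turbo is specified as the LLM for the orchestrator.
The default is gpt-3.5-turbo, so it seems the orchestrator needs a more capable LLM.

Creating the AgentServices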

agent_server_1 = AgentService(
    agent=agent1,
    message_queue=message_queue,
    description="Useful for getting the secret fact.",
    service_name="secret_fact_agent",
    port=8002,
)
agent_server_2 = AgentService(
    agent=agent2,
    message_queue=message_queue,
    description="Useful for getting random dumb facts.",
    service_name="dumb_fact_agent",
    port=8003,
)

The two agents are defined as services.
The description sets the role of each agent.

Internally, each service starts a FastAPI app with the following endpoints.

    Exposes the following endpoints:
    - GET `/`: Home endpoint.
    - POST `/process_message`: Process a message.
    - POST `/task`: Create a task.
    - GET `/messages`: Get messages.
    - POST `/toggle_agent_running`: Toggle the agent running state.
    - GET `/is_worker_running`: Check if the agent is running.
    - POST `/reset_agent`: Reset the agent.

Creating the LocalLauncher

# launch it
launcher = LocalLauncher(
    [agent_server_1, agent_server_2],
    control_plane,
    message_queue,
)

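This is the launcher for running llama-agents locally.

Processing flow when running the agent

Next, we read through what happens when a question is actually submitted.

Running the agent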

result = launcher.launch_single("What is the secret fact?")

print(f"Result: {result}")

This sends a question to the agents.

INFO:llama_agents.message_queues.simple - Consumer AgentService-853de8df-09d6-4da5-8fdd-5313f0f82724: secret_fact_agent has been registered.
INFO:llama_agents.message_queues.simple - Consumer AgentService-00ae3e0e-3ca1-4b0b-823d-fd3c8ba502b1: dumb_fact_agent has been registered.
INFO:llama_agents.message_queues.simple - Consumer 5f825545-f54b-493b-a7da-9a186c851bdb: human has been registered.
INFO:llama_agents.message_queues.simple - Consumer ControlPlaneServer-d56d3621-3422-4152-b6c1-81237263d590: control_plane has been registered.
INFO:llama_agents.services.agent - secret_fact_agent launch_local
INFO:llama_agents.services.agent - dumb_fact_agent launch_local
INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'ActionTypes.NEW_TASK'
INFO:llama_agents.message_queues.simple - Launching message queue locally
INFO:llama_agents.services.agent - Processing initiated.
INFO:llama_agents.services.agent - Processing initiated.
INFO:llama_agents.message_queues.base - Publishing message to 'secret_fact_agent' with action 'ActionTypes.NEW_TASK'
INFO:llama_agents.message_queues.simple - Successfully published message 'control_plane' to consumer.
INFO:llama_agents.services.agent - Created new task: 71bdf2a5-0181-4782-afae-442e85058478
INFO:llama_agents.message_queues.simple - Successfully published message 'secret_fact_agent' to consumer.
INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'ActionTypes.COMPLETED_TASK'
INFO:llama_agents.message_queues.base - Publishing message to 'human' with action 'ActionTypes.COMPLETED_TASK'
INFO:llama_agents.message_queues.simple - Successfully published message 'control_plane' to consumer.
INFO:llama_agents.message_queues.simple - Successfully published message 'human' to consumer.

Result: A baby llama is called a 'Cria'.

The output looks like the log above.

The entry point of the execution.

register_consumers

The following are registered as consumers on message_queue:

service1
service2
human_consumer
control_plane

register_service

The following are registered as services on control_plane:

service1
service2

The idea is presumably "there are two agents here, so assign tasks to them when needed."

start services

The following services are started:

service1
service2

consumers start consuming in their own threads

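Threads that process events are started for the following consumers:

service1
service2
human_consumer
control_plane

Up to this point they had apparently been set up as API servers, but the loops that actually process tasks were not yet running.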
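The distinction between "the server exists" and "the processing loop is running" can be sketched with plain threads and queues. The names below (`consume`, `inboxes`, `STOP`) are hypothetical and the real launcher may well use asyncio tasks rather than OS threads; the point is only that nothing is processed until each consumer's loop has been started.

```python
import queue
import threading
from typing import Dict, List, Tuple

STOP = object()  # sentinel telling a consumer loop to exit


def consume(name: str, inbox: "queue.Queue", processed: List[Tuple[str, str]]) -> None:
    """Block on the inbox and handle messages until the STOP sentinel arrives."""
    while True:
        message = inbox.get()
        if message is STOP:
            break
        processed.append((name, message))


names = ("service1", "service2", "human_consumer", "control_plane")
inboxes: Dict[str, "queue.Queue"] = {name: queue.Queue() for name in names}
processed: List[Tuple[str, str]] = []

# One processing loop per consumer, each in its own thread.
threads = [
    threading.Thread(target=consume, args=(name, inbox, processed))
    for name, inbox in inboxes.items()
]
for thread in threads:
    thread.start()

# Only now does a published message actually get handled.
inboxes["control_plane"].put("What is the secret fact?")

# Ask every loop to stop, then wait for all of them.
for inbox in inboxes.values():
    inbox.put(STOP)
for thread in threads:
    thread.join()
```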

publish initial task

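This is where the question is finally published.

The message is placed on the message queue with type="control_plane", so control_plane is the first to receive it.

Task dispatch from control_plane

service1 and service2 are passed in as tools.
This apparently lets control_plane decide, as a function call, which service to send the request to.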
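To illustrate the role the service descriptions play in this choice, here is a deliberately crude stand-in that picks a service by word overlap between the task and each description. The real orchestrator delegates this decision to the LLM via function calling; `select_service` and its scoring rule are assumptions made purely for illustration.

```python
import re
from typing import Dict


def select_service(task: str, services: Dict[str, str]) -> str:
    """Pick the service whose description shares the most words with the task.

    Toy replacement for the LLM's function-calling decision.
    """
    task_words = set(re.findall(r"[a-z]+", task.lower()))

    def score(description: str) -> int:
        return len(task_words & set(re.findall(r"[a-z]+", description.lower())))

    return max(services, key=lambda name: score(services[name]))


services = {
    "secret_fact_agent": "Useful for getting the secret fact.",
    "dumb_fact_agent": "Useful for getting random dumb facts.",
}
chosen = select_service("What is the secret fact?", services)
print(chosen)
```

Either way, the description is the only signal the selector has about what a service can do, so it is worth writing carefully.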

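Starting the message queue

A thread that processes message queue events is started.
Here too, it had apparently been set up as an API server, but the loop that processes tasks was not yet running.

agent_server_1 notifies task completion

When agent_server finishes a task, it appears to notify the message queue of the completion around here.

control_plane confirms task completion

It receives the completed task from agent_server_1 and handles the result.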
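The sequence of "Publishing message to ..." lines in the log can be replayed with a small simulation. The message shapes and the `run_flow` function are hypothetical, and the agent's answer is hard-coded; the sketch only reproduces the order in which messages move between control_plane, the agent, and human.

```python
from collections import deque
from typing import List, Optional, Tuple


def run_flow(question: str) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Replay the NEW_TASK / COMPLETED_TASK hand-offs seen in the log."""
    pending = deque([{"type": "control_plane", "action": "NEW_TASK", "data": question}])
    log: List[Tuple[str, str]] = []
    result: Optional[str] = None
    while pending:
        message = pending.popleft()
        log.append((message["type"], message["action"]))
        if message["type"] == "control_plane" and message["action"] == "NEW_TASK":
            # The control plane forwards the task to the chosen agent.
            pending.append({"type": "secret_fact_agent", "action": "NEW_TASK", "data": message["data"]})
        elif message["type"] == "secret_fact_agent":
            # The agent finishes and reports back (answer hard-coded here).
            pending.append({"type": "control_plane", "action": "COMPLETED_TASK", "data": "A baby llama is called a 'Cria'."})
        elif message["type"] == "control_plane" and message["action"] == "COMPLETED_TASK":
            # The control plane passes the final result on to the human consumer.
            pending.append({"type": "human", "action": "COMPLETED_TASK", "data": message["data"]})
        elif message["type"] == "human":
            result = message["data"]
    return result, log


result, log = run_flow("What is the secret fact?")
print(f"Result: {result}")
```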

Shutting down each task

The launcher waits for each task to finish.

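Impressions

llama-agents is still new as an agent framework, but I have high hopes for it.

I found the source code of llama-agents and llama-index very easy to read. That said, using frameworks like these in Japanese requires customization such as translating the prompts into Japanese, so I suspect they will be hard to use well without reading the source.

As a more general point that is not specific to any framework, I think mechanisms like RAG have the problem that if something behaves unexpectedly partway through, there is no way to recover.
With an LLM agent, if the LLM is smart enough, it should be able to correct course to some extent and still reach the right answer, which is why I am optimistic.

References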

ReAct: Synergizing Reasoning and Acting in Language Models
https://arxiv.org/abs/2210.03629