LLM Agent 標準化/MCP/ライブラリ欲張り幕内弁当
Agent 元年の慌ただしい始まり
2025 年は Agent 元年と呼ばれ、その直前の年の瀬から各メジャープロバイダーから Agent 開発に関する標準化やライブラリの発表が相次いだ。その中でも以下の 3 つは注目を集めた。
- 2024.11.25: MCP by Anthropic
- 2025.03.11: OpenAI Agent SDK by OpenAI
- 2025.04.09: A2A Protocol by Google
まだギリ間に合うタイミングかと思うので、欲張り幕の内弁当としてこららを食してみる 🍱
上記の文章をイメージして、幕の内弁当の画像を作成して
それぞれの関係性
各々を見る前に全体におけるそれぞれの役割や立ち位置を整理する。
レイヤ | 役割 | 例 |
---|---|---|
ツール連携プロトコル | LLM が外部 API や DB、クラウドサービスを「安全に」呼び出すための統一フォーマット | MCP |
エージェント実行基盤 | “どの LLM を使い、どのツールを呼び、結果をどうパースして次のステップへ渡すか” を管理 | OpenAI Agents SDK |
エージェント間通信プロトコル | ベンダー/実装を超えて複数エージェントが協調するための「共通語」 | A2A Protocol |
ライブラリ/FW | 上記プロトコルを実装してワークフローを組みやすくする OSS 群 | LangChain / Autogen / LangGraph / など |
MCP
概要
MCP (Model Component Protocol) とは LLM と外部への接続を標準化したプロトコルである。標準化されたプロトコルを利用することで、従来の tool-calling と比較して以下のようなメリットがある。
- 互換性のある pre-built のプラグインのエコシステムの成熟が期待できる。
- LLM プロバイダーやツール間が疎結合となり、柔軟性が増す。
- プロトコルに認証が含まれているため、セキュリティを担保しやすい。
アーキテクチャとしては以下のような構成となっている。
MCP アーキテクチャ
- MCP Hosts: プログラム本体
- MCP Clients: サーバーを 1:1 で接続を行うプロトコルクライアント
- MCP Servers: 特定の能力領域を MCP に則って提供する軽量なプログラム
- Local Data Source: MCP Server がアクセスできるホストマシンのファイルシステム
- Remote Services: MCP Server がアクセスできる外部システム(外部 API など)
Server を作ってみる
上記の tutorial を少し内容を変えながらなぞっていく。最終コードは以下に詳細を参照。
処理内容の詳細説明は公式ドキュメントに譲るが、書いている中でいくつか疑問に思って調べたことを記述する。
Q. 一つの MCP Server に複数のツールを紐付けできる?
@mcp.tool()
async def get_alerts(state: str) -> str:
"""Get weather alerts for a US state.
Args:
state: Two-letter US state code (e.g. CA, NY)
"""
...
@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
"""Get weather forecast for a location.
Args:
latitude: Latitude of the location
longitude: Longitude of the location
"""
...
A. できる。MCP Server の定義としては “specific capabilities” とあり、特定の領域範囲を実現するために利用するツールの個数は制限していない。(個別具体のツール実装と特定領域というインターフェイスの分離を行なっている。)
Q. transport="stdio"
とは何か?
if __name__ == "__main__":
# Initialize and run the server
mcp.run(transport="stdio")
A. MCP Server を構成する主要コンポーネントである transport layer の種別を表している。
MCP Server を構成する主要コンポーネント
Protocol Layer
- Protocol
- Client
- Server
Transport Layer: クライアント-サーバー間の実際のやり取りを行う。JSON-RPC 2.0 に従う。 - Stdio transport: 標準入出力を利用する。ローカル処理に適する。
- HTTP with SSE transport: サーバーが HTTP POST で送るイベントを利用する。
Message types - Requests: Response を期待する。
- Results: Requests に対する成功の応答。
- Errors: エラー。
- Notifications: 一方通行のメッセージで Response を期待しない。
Client を作ってみる
上記の tutorial で Anthropic を OpenAI (4o-mini) で動かしてみる。実際のコードは以下参照。
Client から MCP Server を利用する手順の概要は以下の通り。
-
MCP Server への接続
-
MCP Server から利用できるツールをリストアップ: MCP でツール定義フォーマットが決まっているため、直接 json を記述したり、
@mcp.tool
デコレーターから自動生成したりする。 -
2.で取得したツールを利用した tool-calling 問い合わせ: 実行に関しても MCP のインターフェイスを利用するため、ツールによって呼び出し方が変わらない。
-
3.での結果を利用してツールを実行する
-
4.の結果をコンテキストとして LLM 問い合わせ
実際のレスポンス
Query: weather in ca
[05/14/25 02:07:54] INFO Processing request of type ListToolsRequest server.py:545
[05/14/25 02:07:59] INFO Processing request of type CallToolRequest server.py:545
[05/14/25 02:08:02] INFO HTTP Request: GET https://api.weather.gov/alerts/active/area/CA "HTTP/1.1 200 _client.py:1740
OK"
Here are the current weather alerts for California:
1. **Wind Advisory**
- **Area**: Indian Wells Valley, Mojave Desert, Mojave Desert Slopes
- **Details**: West winds 25 to 35 mph with gusts up to 50 mph. Impacts include blowing unsecured objects and possibly causing power outages. Highways impacted include SR 14, 58, and 178, as well as U.S. 395. Valid until 2 AM PDT Wednesday.
2. **Air Quality Alert**
- **Area**: Imperial County Southwest, Imperial County West, Imperial Valley
- **Details**: Issued due to harmful levels of particle pollution from windblown dust. Health risks include serious respiratory issues, especially for sensitive groups. Valid until 11 AM PDT Wednesday.
3. **Wind Advisory**
- **Area**: San Luis Obispo County Beaches, Santa Barbara County Central Coast Beaches, Ventura County Beaches
- **Details**: West to northwest winds 15 to 30 mph with gusts up to 45 mph. Impacts include blowing unsecured objects. Valid from noon to 9 PM PDT today.
4. **Wind Advisory**
- **Area**: Western Antelope Valley Foothills, Antelope Valley
- **Details**: West winds 20 to 30 mph with gusts up to 50 mph. Valid until 9 PM PDT today.
5. **Wind Advisory**
- **Area**: Santa Barbara County Southwestern Coast, Santa Ynez Mountains
- **Details**: Northwest winds 25 to 35 mph with gusts up to 50 mph. Valid until 3 AM PDT Wednesday.
6. **Wind Advisory**
- **Area**: San Bernardino County Mountains, Riverside County Mountains, San Diego County Mountains
- **Details**: Southwest to west winds 25 to 35 mph with gusts to 55 mph, isolated gusts to 65 mph. Valid until 5 AM PDT Wednesday.
7. **Wind Advisory**
- **Area**: Salton Sea, Imperial County West
- **Details**: West winds 20 to 30 mph with gusts up to 45 mph. Valid until 2 AM PDT Wednesday.
8. **Air Quality Alert**
- **Area**: Coachella Valley, San Gorgonio Pass Near Banning
- **Details**: Issued due to harmful levels of particle pollution from windblown dust. Started Sunday and valid until 5 AM Wednesday.
Please take necessary precautions if you are in or near the affected areas. For more details, check local weather services.
OpenAI Agents SDK
概要
軽量な LLM Agent 作成フレームワークで以下の 3 コンポーネント(抽象化)から成る。
- Agents: ツール、instruction を与えられた LLM
- Handoffs: 特定のタスクに関して、他のエージェントへの delegate を許可すること
- Guardrails: エージェントへの入力が validation されること
これらに加え、tracing 機能も標準で付いている。tracing 機能の重要性に関しては以下で確認済み。
MCP を使ってみる
先ほど作成した weather.py
MCP Server を用いて、天気予報を行う Agent を作成する。
実際のレスポンス
❯ uv run agent.py 五大湖周辺の天気は?
[05/15/25 00:09:41] INFO Processing request of type ListToolsRequest server.py:545
[05/15/25 00:09:49] INFO Processing request of type CallToolRequest server.py:545
[05/15/25 00:09:50] INFO HTTP Request: GET https://api.weather.gov/points/45.0,-85.0 "HTTP/1.1 301 Moved Permanently" _client.py:1740
[05/15/25 00:09:55] INFO Processing request of type CallToolRequest server.py:545
INFO HTTP Request: GET https://api.weather.gov/points/46.79,-92.1 "HTTP/1.1 200 OK" _client.py:1740
INFO HTTP Request: GET https://api.weather.gov/gridpoints/DLH/91,69/forecast "HTTP/1.1 200 OK" _client.py:1740
[05/15/25 00:09:58] INFO Processing request of type CallToolRequest server.py:545
INFO HTTP Request: GET https://api.weather.gov/points/41.8781,-87.6298 "HTTP/1.1 200 OK" _client.py:1740
INFO HTTP Request: GET https://api.weather.gov/gridpoints/LOT/76,73/forecast "HTTP/1.1 200 OK" _client.py:1740
[05/15/25 00:10:00] INFO Processing request of type CallToolRequest server.py:545
INFO HTTP Request: GET https://api.weather.gov/points/42.8864,-78.8784 "HTTP/1.1 200 OK" _client.py:1740
[05/15/25 00:10:01] INFO HTTP Request: GET https://api.weather.gov/gridpoints/BUF/36,47/forecast "HTTP/1.1 200 OK" _client.py:1740
以下は五大湖周辺の代表的な3地点(北端のスペリオル湖域、ミシガン湖域、エリー/オンタリオ湖域)の今日以降の天気予報です。
1)ダルース(ミネソタ州/スペリオル湖近郊)
- 今日(昼間):最高気温66°F(約19℃)、東風5~10mph、11~16時は雨シャワーまたは雷雨の可能性30%、概ね晴れのち曇り
- 今夜:最低49°F(約9℃)、北東風10mph、雷雨の可能性30%、曇り
- 木曜:最高62°F(約17℃)、東風10~20mph、雨シャワーや雷雨80%、曇り
- 木曜夜:最低48°F(約9℃)、南東風10~15mph、雷雨80%、曇り
- 金曜:最高56°F(約13℃)、南風15mph、雨シャワー80%、曇り
2)シカゴ(イリノイ州/ミシガン湖南西岸)
- 今日:最高63°F(約17℃)、東北東風5~10mph、部分的に晴れ、霧の可能性
- 今夜:最低54°F(約12℃)、東風5~10mph、ほぼ曇り、霧
- 木曜:最高87°F(約31℃)、南東風5~20mph、午前は霧、のち40%でシャワーや雷雨、部分的に晴れ
- 木曜夜:最低63°F(約17℃)、南西風10~20mph、雷雨40%、部分的に晴れ
- 金曜:最高86°F(約30℃)、南西風15~25mph、午後から雨の可能性20%、晴れ
3)バッファロー(ニューヨーク州/エリー湖東岸・オンタリオ湖域)
- 今日:最高71°F(約22℃)、南東風10mph、午前に雨シャワー20%、曇りのち概ね曇り
- 今夜:最低58°F(約14℃)、南東風5~9mph、曇り
- 木曜:最高74°F(約24℃)、南風6mph、11~14時に雨シャワー40%、部分的に晴れ
- 木曜夜:最低62°F(約17℃)、南東風2~6mph、雨シャワー50%、部分的に晴れ
- 金曜:最高70°F(約21℃)、南西風6~10mph、午前に雷雨50%、部分的に晴れ
※上記は五大湖周辺の一部代表地点での予報です。ほかの湖岸や特定の都市・州別の詳細が必要であれば、お知らせください。
tracing 機能を ON にしておけば https://platform.openai.com/traces で途中結果や結果詳細を確認することができる。
Trace 結果
処理全体
tool calling の結果
tool の実行結果
最終結果生成
A2A Protocol
概要
Agent2Agent (A2A) は HTTP, SSE, JSON-RPC といった標準技術を利用した、異なるベンダー・システム間でも行える通信プロトコル。単一 Agent がツールとの接続を行う MCP に対して、A2A は動的・マルチモーダルなコミュニケーションを Agent 同士で実現することで、MCP を補完する。
A2A と MCP は補完関係にある
コンセプト
主な登場人物
名称 | 説明 |
---|---|
User | リクエストを開始またはゴールを設定するエンドユーザー。 |
A2A Client (Client Agnet) | ユーザーの代わりに remote agent に A2A プロトコルに則ってリクエストを送るアプリケーション、サービス、エージェントなど。 |
A2A Server (Remote Agent) | A2A プロトコルに則った HTTP エンドポイントを提供する AI エージェントやシステム。 |
コンポーネント
コンポーネント名 | 説明 |
---|---|
Agent Card | A2A Server の機能や説明を表す JSON 形式のメタデータ。普通は well-known URL (/.well-known/agent.json ) で取得できる。 |
Task | Agent が達成すべきゴール (例: 「レポートを作成して」「電車を予約して」)。各タスクにはユニークな ID が割り振られ、ステータス管理 (submitted , working , input-required , completed , failed ) される。 |
Message | Task 内の単一のコミュニケーションの単位で、role は user または agent 。 |
Part | Message 内のコンテンツを表す単位で、 TextPart , FilePart , DataPart がある。 |
Artifact | 実際に確認することができる A2A Server のアウトプット。 |
インタラクションの種類
インタラクション | 説明 |
---|---|
Request/Response (Polling) | クライアントがリクエスト (RPC method tasks/send ) を投げ、レスポンスをサーバーから受け取る。時間がかかる Task に対しては、サーバーは working ステータスを最初に返却後、クライアントは処理が終わるまで定期的に polling する。 |
Streaming (SSE) | Task の進捗状況を取得するのに SSE を用いてサーバーから都度レスポンスを得る。リクエストは RPC method tasks/sendSubscribe 。 |
Push Notifications | リクエストに webhook URL を含め、サーバーが処理完了後にそれを叩く。処理時間が長く、コネクションを最後まで保持できないような Task で有効。リクエストは tasks/pushNotification/set 。 |
複数 Agent をオーケストレーションしてみる
A2A プロトコルを利用したオーケストレーションにフォーカスするため、 LangChain などのフレームワークを一切使わずに触ってみる。Server Agent として A2A プロトコルの公式ドキュメントを MCP ツールとして利用できるものとそれとコミュニケーションを行う Client Agent を作成してみる。
Client Agent
Agent 内部処理フローは以下の通り。
-
プロンプトを受け取る。
-
リクエスト可能な Server Agent の Agent Card 一覧を取得する。合わせて後続のコンテキストとなる以下のようなプロンプトを準備する。
# Agent contexts These are some consults from agent(s) that may be useful to answer the question {% for agent in agent_cards %} Agent name: {{agent.name}} Agent description: {{agent.description}} Agent skills: {% for skill in agent.skills%} - name: {{skill.name}} - description: {{skill.description}} - example: {{skill.examples}}{% endfor %}{% endfor %} -------------------
-
どの Server Agent に対してリクエストを投げるかを LLM などを用いて決定する。必ずしも 1 つではない。プロンプトは以下の通り。
You duty is to decide which agent to consult or ask for help to answer the question. The question is: {{ question }} {{ call_agent_prompt }} {{ agent_prompt }} You must answer in the following format: <Thoughts> Thoughts: - ... - ... - ... </Thoughts> <Selected agents> ```json [ { "name": "agent_name", "prompt": "prompt_to_agent" }, ... ] ``` </Selected agents> Note: - You can leave the selected agents empty if you think none of the agents are relevant to the question or given contexts are enough to answer the question. - You can select multiple agents if you think multiple agents are relevant to the question. - You `agent_name` must be one of the agent names in the agents list, and you must spell it correctly. - If there is no need to call any agent, Please give your answer by continuing the following format: <Answer> <Your answer here> </Answer>
-
question
: 元のプロンプト -
call_agent_prompt
: Agent とのやりとり履歴Previous agents have been called. {% for agent in called_agents %} - Agent: {{ agent.name }} - Prompt: {{ agent.prompt }} - Answer: {{ agent.answer }} -------------------------------- {% endfor %}
-
agent_prompt
: 2 で作成したプロンプト
-
-
3.の結果から、どの Server Agents にどのようなプロンプトを投げるのか (
<Selected agents/>
) をパースする。 -
選ばれた各 Server Agent に対してプロンプトを TextPart としてメッセージを送る。
-
5.のレスポンスのうち回答 (
<Answer/>
) をパースし、Agent とのやりとり履歴に追加する。
以下実装。
Server Agent
Agent 内部の処理フローは以下の通り。
-
質問に回答するのにどのツールを利用するのかを LLM を使って決定する。
-
MCP に則ったツールがなければユーザー質問を直接 LLM に投げる。
-
MCP に則ったツールがあれば、以下ツール選定に関するコンテキストを準備する。
You duty is to decide which tool to use to answer the question. The question is: {{ question }} {{ called_tools }} {{ tool_prompt }} You must answer in the following format: <Thoughts> Thoughts: - ... - ... - ... </Thoughts> <Selected tools> ```json [ { "name": "tool_name", "arguments": { "argument_name": "argument_value" } }, ... ] ``` </Selected tools> Note: - You can leave the selected tools empty if you think none of the tools are relevant to the question or given contexts are enough to answer the question. - You can select multiple tools if you think multiple tools are relevant to the question. - You `tool_name` must be one of the tool names in the tools list, and you must spell it correctly. - If there is no need to call any tool, Please give your answer by continuing the following format: <Answer> <Your answer here> </Answer>
-
question
: 元のプロンプト -
called_tools
: ツールの利用履歴
Previous tools have been called. {% for tool in called_tools %} - Tool: {{ tool.name }} - Arguments: {{ tool.arguments }} - Result: {{ tool.result }} {% endfor %}
-
tool_prompt
: ツールの選択肢
Tools{% for tool in tools %} - {{loop.index}}: {{ tool.name }} - Description: {{ tool.description }} - Input Schema: {{tool.inputSchema.properties}}{% endfor %}
-
-
-
1.の回答から利用するツール (
<Selected tools/>
箇所) をパースする。 -
ツールを並列実行する。
-
- の結果をツール利用履歴に追加する。
以下実装。
上記フローを実現する Agent を以下の A2A で定義されているインターフェイスを実装することで A2A プロトコルに則った形で提供する。
-
AgentExecutor
Agent の実行や Task の管理などコアロジックを実行する。
以下実装例。
-
RequestHandler
A2A のコンポーネントAgentExecutor
,TaskStore
,QueueManager
,PushNotifier
(任意) をコーディネートし、A2A JSON-RPC メソッドの実装を行う。
以下実装例。
-
JSONRPCApplication
AgentSkill
,AgentCard
,RequestHandler
をコーディネートし、A2A プロトコルに則った形で JSON-RPC アプリケーション提供する。
以下実装例。
実行
-
Server Agent を起動する
uv run --env-file .env python -m src.no_llm_framework.server.__main__
-
Client Agent にユーザープロンプトを投げる
uv run --env-file .env python -m src.no_llm_framework.client --question "What is A2A protocol?"
実行結果
```
❯ uv run --env-file .env python -m src.no_llm_framework.client --question "A2Aプロトコルの メリットを教えて"
<Thoughts>
Thoughts:
- The question asks for the merits (メリット) of the A2A protocol.
- The A2A Protocol Agent is specifically designed to answer questions about the A2A protocol.
- Therefore, this agent is the most appropriate choice to answer the question.
</Thoughts>
<Selected agents>
```json
[
{
"name": "A2A Protocol Agent",
"prompt": "A2Aプロトコルのメリットを教えて"
}
]
```
</Selected agents>
<Agent name="A2A Protocol Agent">
Step 0<Thoughts>
Thoughts:
- The question asks for the merits (メリット) of the A2A protocol. This is a general question about the advantages of using the protocol.
- The `fetch_A2A_documentation` tool is ideal for getting a broad overview of the A2A protocol and its features, which will likely include information on its benefits.
- After fetching the documentation, `search_A2A_documentation` could be used to refine the search for specific mentions of advantages or benefits, if needed.
- The code (`search_A2A_code`) is unlikely to directly explain the protocol's merits; the documentation is a better source for this.
- `fetch_generic_url_content` is a secondary tool to be used only if the documentation references external URLs containing further details on A2A advantages.
</Thoughts>
<Selected tools>
```json
[
{
"name": "fetch_A2A_documentation",
"arguments": {}
},
{
"name": "search_A2A_documentation",
"arguments": {
"query": "メリット, advantages, benefits, strengths, plus points"
}
}
]
```
</Selected tools>
Previous tools have been called.
- Tool:
- Arguments: {}
- Result: # A2A (Agent2Agent) Protocol High-Level Summary
This project defines and demonstrates the **Agent2Agent (A2A) protocol**, an open standard initiated by Google designed to enable communication and interoperability between disparate AI agent systems. The core goal is to allow agents built on different frameworks (e.g., LangGraph, CrewAI, Google ADK, Genkit) or by different vendors to discover each other's capabilities, negotiate interaction modes (text, forms, files, potentially audio/video later), and collaborate on tasks.
The repository provides:
1. **Formal Specification:** A detailed JSON Schema (`specification/json/a2a.json`) defining the structure of A2A messages, including requests, responses, task states, artifacts, and agent metadata (Agent Cards).
2. **Core Concepts Documentation (Implied):** Links in the main README suggest documentation covering agent discovery, task lifecycle, artifact exchange, streaming updates, push notifications, and enterprise readiness.
3. **Sample Implementations:**
- **Common Libraries:** Reusable Python (`samples/python/common`) and JavaScript/TypeScript (`samples/js/src`) code for building A2A clients and servers, handling JSON-RPC communication, task management, and potentially authentication.
- **Example Agents:** Demonstrations of integrating A2A into various agent frameworks:
- **Python:** LangGraph (currency conversion, streaming), CrewAI (image generation, file artifacts), Google ADK (expense reports, form handling).
- **JavaScript/TypeScript:** Genkit (movie info via API, code generation with file artifacts).
- **Example Hosts:** Applications that *consume* A2A services:
- CLIs in both Python and JS for direct interaction.
- A Python-based multi-agent orchestrator (using Google ADK) that delegates tasks to other A2A agents.
4. **Demo Web Application:** A web UI (`demo/ui`, likely using Mesop) demonstrating multi-agent interactions facilitated by the A2A protocol, including visualization of conversations, tasks, events, and agent discovery. It features a backend service coordinating with the host agent.
Key features of the A2A protocol highlighted by the specification and samples include: agent discovery via Agent Cards, standardized task management (send, get, cancel), support for different content types (text, files, structured data) via `Parts` and `Artifacts`, streaming updates for long-running tasks, and mechanisms for push notifications. The project is open source and encourages community contribution.
# A2A (Agent2Agent) Protocol
## 1. Overview
- **Project Name:** Agent2Agent (A2A) Protocol
- **Purpose:** An open protocol by Google enabling communication and interoperability between AI agents built on different frameworks or by different vendors.
- **Core Goal:** Allow agents to discover capabilities, negotiate interaction, and collaborate securely on tasks.
- **Communication:** Uses JSON-RPC 2.0 over HTTP(S). Supports standard request/response and Server-Sent Events (SSE) for streaming.
- **Key Components:** Specification (JSON Schema), Common Libraries (Python, JS/TS), Sample Agents (LangGraph, CrewAI, ADK, Genkit), Sample Hosts (CLI, Orchestrator), Demo Web App (Mesop).
## 2. Protocol Specification (`specification/json/a2a.json`)
### 2.1. Core JSON-RPC Structures
- **`JSONRPCMessage`:** Base for requests/responses. Contains `jsonrpc: "2.0"` and optional `id`.
- **`JSONRPCRequest`:** Represents a request.
- `method`: String identifying the operation (e.g., "tasks/send").
- `params`: Object or Array containing parameters for the method.
- `id`: Unique identifier (string/number) for request/response correlation. Omitted/null for notifications.
- **`JSONRPCResponse`:** Represents a response.
- `result`: Contains the successful result data (can be `null`). Mutually exclusive with `error`.
- `error`: Contains an error object if the request failed. Mutually exclusive with `result`.
- `id`: Must match the corresponding request `id`.
- **`JSONRPCError`:** Represents an error.
- `code`: Integer error code.
- `message`: String description of the error.
- `data`: Optional additional error details.
### 2.2. Key A2A Data Objects
- **`AgentCard`:** Metadata describing an agent. Found typically at `/.well-known/agent.json`.
- `name`: (string) Human-readable name.
- `description`: (string) Agent description.
- `url`: (string) Base URL endpoint for the agent's A2A service.
- `provider`: (`AgentProvider`) Organization details (optional).
- `version`: (string) Agent/API version.
- `documentationUrl`: (string) Link to documentation (optional).
- `capabilities`: (`AgentCapabilities`) Features supported (streaming, push).
- `securitySchemes`: (Object) Security scheme details for authentication (optional).
- `security`: (Array) Security requirements for contacting the agent (optional).
- `defaultInputModes`: (string[]) Default supported input types (e.g., "text/plain", "application/json").
- `defaultOutputModes`: (string[]) Default supported output types.
- `skills`: (`AgentSkill[]`) List of specific capabilities.
- `supportsAuthenticatedExtendedCard`: (boolean) Indicates support for retrieving a more detailed Agent Card via an authenticated endpoint (optional).
- **`AgentCapabilities`:**
- `streaming`: (boolean) Supports `message/stream` and `tasks/resubscribe` for real-time updates via Server-Sent Events (SSE). Default: `false`.
- `pushNotifications`: (boolean) Supports `tasks/pushNotificationConfig/set|get` for asynchronous task updates via webhooks. Default: `false`.
- `stateTransitionHistory`: (boolean) Supports providing detailed history of status changes within the Task object (future enhancement). Default: `false`.
- **`AgentSkill`:**
- `id`: (string) Unique skill ID within this agent.
- `name`: (string) Human-readable skill name.
- `description`: (string) Detailed skill description. CommonMark may be used.
- `tags`: (string[]) Keywords/categories for discoverability.
- `examples`: (string[]) Example prompts or use cases demonstrating skill usage (optional).
- `inputModes`: (string[]) Overrides `defaultInputModes` for this specific skill. Accepted Media Types (optional).
- `outputModes`: (string[]) Overrides `defaultOutputModes` for this specific skill. Produced Media Types (optional).
- **`Task`:** Represents a unit of work processed by an agent.
- `id`: (string) Unique task identifier.
- `contextId`: (string) Groups related tasks in a conversation or session.
- `status`: (`TaskStatus`) Current state information.
- `artifacts`: (`Artifact[]`) Files/data produced by the agent (optional).
- `history`: (`TaskStatus[]`) Status transition history (optional, requires `stateTransitionHistory` capability).
- `metadata`: (object) Custom key-value data for client use (optional).
- `kind`: ("task") Type discriminator.
- **`TaskStatus`:**
- `state`: (`TaskState`) Current lifecycle state (enum).
- `message`: (string) Human-readable status message (optional).
- `timestamp`: (string) ISO-8601 timestamp of when this status was set.
- `kind`: ("status") Type discriminator.
- **`TaskState`:** (enum)
- `submitted`: Task received but not yet processing (non-terminal).
- `working`: Task is actively being processed (non-terminal).
- `input_required`: Task requires additional input from the client to proceed (non-terminal).
- `completed`: Task completed successfully (terminal).
- `failed`: Task encountered an error (terminal).
- `canceled`: Task was canceled by client request (terminal).
- `rejected`: Task was rejected by the agent (terminal).
- `auth-required`: Authentication required from client/user to proceed (non-terminal).
- `unknown`: Task state cannot be determined (terminal).
- **`Message`:** Communication unit between user and agent.
- `role`: ("user" | "agent") Sender role.
- `parts`: (`Part[]`) Content parts (text, file, data).
- `metadata`: (object | null) Message-specific metadata.
- `messageId`: (string) Unique identifier for the message.
- `parentMessageId`: (string) Reference to a previous message being replied to (optional).
- `rootMessageId`: (string) Reference to the first message in a thread (optional).
- `referenceTaskIds`: (string[]) List of tasks referenced as contextual hint by this message (optional).
- `taskId`: (string) Task identifier the current message is related to (optional).
- `contextId`: (string) Context identifier the message is associated with (optional).
- `kind`: ("message") Type discriminator.
- **`Part` (Union Type):** Represents a piece of content within a Message or Artifact.
- **`TextPart`:**
- `type`: "text"
- `text`: (string) Text content.
- `mimeType`: (string) Format specification (default: "text/plain").
- **`FilePart`:**
- `type`: "file"
- `file`: (`FileWithUri`) File reference with URI.
- `inline`: (boolean) Whether the file content should be displayed inline (optional).
- **`DataPart`:**
- `type`: "data"
- `data`: (any) Structured data in JSON-serializable format.
- `mimeType`: (string) Format specification (e.g., "application/json").
- `inline`: (boolean) Whether the data should be displayed inline (optional).
- `metadata`: (object | null) Optional metadata for the specific part.
- **`FileWithUri`:** Represents a file reference with URI.
- `name`: (string) Filename with extension.
- `mimeType`: (string) Content type (Media Type).
- `uri`: (string) URI to access the file content.
- **`Artifact`:** An output generated by a task.
- `artifactId`: (string) Unique identifier for the artifact.
- `name`: (string) Human-readable artifact name (optional).
- `description`: (string) Detailed description of the artifact (optional).
- `parts`: (`Part[]`) Content segments in the artifact.
- `metadata`: (object) Custom key-value data (optional).
- `kind`: ("artifact") Type discriminator.
- **`AuthenticationInfo`:** Authentication details for push notifications.
- `schemes`: (string[]) Array of authentication scheme names the caller must use.
- `credentials`: (string) Optional static credentials or scheme-specific configuration info.
- **`PushNotificationConfig`:** Configuration for push notifications.
- `url`: (string) Endpoint URL for the agent to POST notifications to.
- `token`: (string) Optional authentication token for the agent to include in requests (optional).
- `authentication`: (`AuthenticationInfo`) Authentication details the agent needs to call the URL (optional).
- **`TaskPushNotificationConfig`:** Associates a `PushNotificationConfig` with a task ID.
- `taskId`: (string) The task ID to receive push notifications for.
- `config`: (`PushNotificationConfig`) The push notification configuration to use.
- `metadata`: (object) Custom key-value data for client use (optional).
- **`MessageSendParams`:** Parameters for sending a message.
- `message`: (`Message`) The message to send.
- `configuration`: (`MessageSendConfiguration`) Optional configuration for the message.
- `metadata`: (object) Custom key-value data for client use (optional).
- **`agent/authenticatedExtendedCard`:** (HTTP GET)
- Retrieves a potentially more detailed version of the Agent Card after the client has authenticated.
- Available only if `AgentCard.supportsAuthenticatedExtendedCard` is `true`.
- Endpoint URL: `{AgentCard.url}/../agent/authenticatedExtendedCard`
- Authentication: Required using one of the schemes declared in the public `AgentCard.securitySchemes`.
- Response: Complete `AgentCard` object with potentially additional details.
### 2.3. A2A RPC Methods
- **`message/send`:** (Request/Response)
- Sends a message to initiate or continue a task.
- `params`: `MessageSendParams` (includes `taskId`, `contextId`, `message`, optionally `configuration`).
- `result`: `Task` (final state after synchronous processing).
- **`message/stream`:** (Request/Stream)
- Sends a message and subscribes to real-time updates via Server-Sent Events (SSE).
- `params`: `MessageSendParams`.
- `result` (stream events): `SendStreamingMessageResponse` containing one of: `MessageEvent`, `TaskStatusUpdateEvent`, `TaskArtifactUpdateEvent`. Final event has `final: true`.
- **`tasks/get`:** (Request/Response)
- Retrieves the current state of a task.
- `params`: `TaskQueryParams` (includes `id`, optionally `historyLength`).
- `result`: `Task`.
- **`tasks/cancel`:** (Request/Response)
- Requests cancellation of a running task.
- `params`: `TaskIdParams` (includes `id`).
- `result`: `Task` (updated state, likely 'canceled') or error if not cancelable.
- **`tasks/pushNotificationConfig/set`:** (Request/Response)
- Sets or updates the push notification configuration for a task.
- `params`: `TaskPushNotificationConfig`.
- `result`: `TaskPushNotificationConfig` (confirmed configuration).
- **`tasks/pushNotificationConfig/get`:** (Request/Response)
- Retrieves the current push notification configuration for a task.
- `params`: `TaskIdParams` (includes `taskId`).
- `result`: `TaskPushNotificationConfig`.
- **`tasks/resubscribe`:** (Request/Stream)
- Resubscribes to task updates after a connection interruption (SSE).
- `params`: `TaskQueryParams`.
- `result` (stream events): `TaskStatusUpdateEvent` or `TaskArtifactUpdateEvent`.
### 2.4. Streaming Update Events (Result of `message/stream` or `tasks/resubscribe`)
- **`SendStreamingMessageResponse`:** Contains a message from the agent.
- `type`: "message"
- `message`: (`Message`) The message content.
- `contextId`: (string) Context identifier the message is associated with.
- `kind`: ("streaming-response") Type discriminator.
- `final`: (boolean) True if this is the final message for the task.
- **`TaskStatusUpdateEvent`:** Signals a change in task status.
- `type`: "task-status"
- `taskId`: (string) Task ID.
- `contextId`: (string) Context identifier the task is associated with.
- `kind`: ("status-update") Type discriminator.
- `status`: (`TaskStatus`) The new status object.
- `final`: (boolean) True if this is the terminal update for the task.
- **`TaskArtifactUpdateEvent`:** Signals a new or updated artifact.
- `type`: "task-artifact"
- `taskId`: (string) Task ID.
- `artifact`: (`Artifact`) The artifact data.
- `append`: (boolean) If true, append parts to artifact; if false (default), replace.
- `lastChunk`: (boolean) If true, indicates this is the final update for the artifact.
- `final`: (boolean) Usually false for artifacts, can signal end concurrently with status.
### 2.5. Standard Error Codes
- `-32700`: `JSONParseError` - Invalid JSON payload.
- `-32600`: `InvalidRequestError` - Invalid JSON-RPC request object.
- `-32601`: `MethodNotFoundError` - Method does not exist.
- `-32602`: `InvalidParamsError` - Invalid method parameters.
- `-32603`: `InternalError` - Internal server error.
### 2.6. A2A Specific Error Codes
- `-32001`: `TaskNotFoundError` - Specified task ID not found.
- `-32002`: `TaskNotCancelableError` - Task is in a terminal state and cannot be canceled.
- `-32003`: `PushNotificationNotSupportedError` - Agent does not support push notifications.
- `-32004`: `UnsupportedOperationError` - The requested operation is not supported.
- `-32005`: `ContentTypeNotSupportedError` - Mismatch in supported content types.
- `-32006`: `InvalidAgentResponseError` - Agent generated an invalid response for the requested method.
## 3. Core Concepts
- **Agent Discovery:** Clients find agents and their capabilities by fetching the `AgentCard` JSON, typically from `/.well-known/agent.json`.
- **Task Lifecycle:** Tasks progress through states defined in `TaskState` (submitted -> working -> [input_required] -> completed/failed/canceled/rejected/unknown).
- **Communication:** Uses `Message` objects containing `Part`s (text, file, data). Task outputs are represented as `Artifact`s, also containing `Part`s.
- **Streaming:** Long-running tasks can provide real-time updates using Server-Sent Events (SSE) via `message/stream`. Updates are sent as `MessageEvent`, `TaskStatusUpdateEvent` and `TaskArtifactUpdateEvent`. Reconnection after interruptions is supported via `tasks/resubscribe`.
- **Push Notifications:** Agents can proactively notify clients about task updates using webhook URLs provided via `tasks/pushNotificationConfig/set`. Authentication mechanisms (e.g., Bearer tokens via JWT signed with keys from agent's JWKS endpoint) are supported for secure communication.
- **Authentication:** Defined in `AgentCard` (via `securitySchemes` and `security` fields) and `PushNotificationConfig`. Can involve various schemes (e.g., API keys, OAuth, JWT). The protocol supports authenticated extended Agent Card retrieval via the `agent/authenticatedExtendedCard` endpoint. Samples use JWT for push notifications and secure communication.
- **Forms:** Structured data can be requested and submitted using `DataPart` within Messages/Artifacts (demonstrated in ADK sample).
## 4. Security Considerations
- **Transport Security:** Always use HTTPS with strong TLS configurations in production environments.
- **Authentication:**
- Handled via standard HTTP mechanisms (e.g., `Authorization` header with Bearer tokens, API keys).
- Requirements are declared in the `AgentCard`.
- Credentials MUST be obtained out-of-band by the client.
- A2A Servers MUST authenticate every request.
- **Authorization:**
- A server-side responsibility based on the authenticated identity.
- Implement the principle of least privilege.
- Can be granular, based on skills, actions, or data.
- **Push Notification Security:**
- Webhook URL validation (by the A2A Server sending notifications) is crucial to prevent SSRF.
- Authentication of the A2A Server to the client's webhook is essential.
- Authentication of the notification by the client's webhook receiver (verifying it came from the legitimate A2A Server and is relevant) is critical.
- **Input Validation:** Servers MUST rigorously validate all RPC parameters and the content/structure of data in `Message` and `Artifact` parts to prevent injection attacks or processing errors.
- **Resource Management:** Implement rate limiting, concurrency controls, and resource limits to protect agents from abuse or overload.
- **Data Privacy:** Adhere to all applicable privacy regulations for data exchanged in `Message` and `Artifact` parts. Minimize sensitive data transfer.
## 4. Implementations & Samples
### 4.1. Common Libraries
- **Python (`samples/python/common`)**:
- `client/`: `A2AClient` for making requests, `A2ACardResolver` for discovery.
- `server/`: `A2AServer` (Starlette-based), `TaskManager` base class, `InMemoryTaskManager`.
- `types.py`: Pydantic models mirroring the JSON schema.
- `utils/`: Helpers for push notification auth (JWT signing/verification, JWKS endpoint).
- **JavaScript/TypeScript (`samples/js/src`)**:
- `client/`: `A2AClient` implementation using `fetch`.
- `server/`: `A2AServer` (Express-based), `TaskStore` interface, `InMemoryTaskStore`, `FileStore`.
- `schema.ts`: TypeScript interfaces matching the JSON schema.
- `handler.ts`, `error.ts`, `utils.ts`: Support code for the server.
### 4.2. Python Samples
- **Location:** `samples/python/agents/` & `samples/python/hosts/`
- **Setup:** Uses `uv` and `pyproject.toml`. Requires Python >= 3.12/3.13. API keys via `.env`.
- **Agents:**
- **LangGraph (`agents/langgraph`)**: Currency conversion agent. Demonstrates tool use, multi-turn (`input-required`), and **streaming** (`tasks/sendSubscribe`).
- **CrewAI (`agents/crewai`)**: Image generation agent. Demonstrates multi-turn and handling **file artifacts** (images).
- **Google ADK (`agents/google_adk`)**: Expense reimbursement agent. Demonstrates multi-turn and handling **forms** using `DataPart`.
- **Hosts:**
- **CLI (`hosts/cli`)**: Simple command-line client to interact with any A2A agent. Supports streaming and optional push notification listening.
- **Multi-Agent Orchestrator (`hosts/multiagent`)**: An ADK-based "Host Agent" that manages connections (`RemoteAgentConnections`) to other A2A agents and delegates tasks based on instructions.
### 4.3. JavaScript/TypeScript Samples
- **Location:** `samples/js/`
- **Setup:** Uses `npm`/`pnpm`, `tsx`, `tsconfig.json`. Requires Node.js >= 18. API keys via environment variables. Framework: **Genkit**.
- **Agents (`src/agents/`)**:
- **Movie Agent (`movie-agent`)**: Uses TMDB API via Genkit tools to answer movie questions. Demonstrates tool use and multi-turn (`AWAITING_USER_INPUT` mapped to `input-required`).
- **Coder Agent (`coder`)**: Generates code files. Demonstrates producing multiple **file artifacts** via streaming updates. Uses custom Genkit format (`code-format.ts`).
- **Hosts:**
- **CLI (`src/cli.ts`)**: Command-line client for interacting with JS agents.
### 4.4. Demo Application (`demo/`)
- **UI (`demo/ui`)**: Web application built with **Mesop**.
- Visualizes conversations with multiple agents via the host orchestrator.
- Renders text, images, forms.
- Allows dynamic agent registration via URL.
- Provides views for task list and event logs.
- **Service (`demo/ui/service`)**: Backend service for the Mesop UI.
- `server/`: Manages conversations, routes messages, interfaces with the host agent (`ADKHostManager` or `InMemoryFakeAgentManager`).
- `client/`: Client used by the UI to talk to its *own* backend service.
## 5. Development & Setup
- **Prerequisites:** Python (>=3.12 or 3.13), Node.js (>=18), `uv` (for Python), `npm`/`pnpm` (for JS).
- **API Keys:** Required for LLM access (e.g., `GOOGLE_API_KEY`, `TMDB_API_KEY`), typically set via `.env` files or environment variables.
- **Running Samples:** Generally involves running an agent server (`uv run ...` or `npm run ...`) and then a host client/app (e.g., `uv run hosts/cli --agent <agent_url>`).
## 6. Contribution
- See `CONTRIBUTING.md`.
- GitHub discussion available as the primary means of communication.
- GitHub issues for bugs and feature requests.
- Google Form for private feedback.
- Tool:
- Arguments: {'query': 'メリット, advantages, benefits, strengths, plus points'}
- Result: ### Search Results for: "メリット, advantages, benefits, strengths, plus points"
No relevant documentation found for your query. It's either being indexed or the search query did not match any documentation.
As a fallback, this is the documentation for google/A2A:
# A2A (Agent2Agent) Protocol High-Level Summary
This project defines and demonstrates the **Agent2Agent (A2A) protocol**, an open standard initiated by Google designed to enable communication and interoperability between disparate AI agent systems. The core goal is to allow agents built on different frameworks (e.g., LangGraph, CrewAI, Google ADK, Genkit) or by different vendors to discover each other's capabilities, negotiate interaction modes (text, forms, files, potentially audio/video later), and collaborate on tasks.
The repository provides:
1. **Formal Specification:** A detailed JSON Schema (`specification/json/a2a.json`) defining the structure of A2A messages, including requests, responses, task states, artifacts, and agent metadata (Agent Cards).
2. **Core Concepts Documentation (Implied):** Links in the main README suggest documentation covering agent discovery, task lifecycle, artifact exchange, streaming updates, push notifications, and enterprise readiness.
3. **Sample Implementations:**
- **Common Libraries:** Reusable Python (`samples/python/common`) and JavaScript/TypeScript (`samples/js/src`) code for building A2A clients and servers, handling JSON-RPC communication, task management, and potentially authentication.
- **Example Agents:** Demonstrations of integrating A2A into various agent frameworks:
- **Python:** LangGraph (currency conversion, streaming), CrewAI (image generation, file artifacts), Google ADK (expense reports, form handling).
- **JavaScript/TypeScript:** Genkit (movie info via API, code generation with file artifacts).
- **Example Hosts:** Applications that *consume* A2A services:
- CLIs in both Python and JS for direct interaction.
- A Python-based multi-agent orchestrator (using Google ADK) that delegates tasks to other A2A agents.
4. **Demo Web Application:** A web UI (`demo/ui`, likely using Mesop) demonstrating multi-agent interactions facilitated by the A2A protocol, including visualization of conversations, tasks, events, and agent discovery. It features a backend service coordinating with the host agent.
Key features of the A2A protocol highlighted by the specification and samples include: agent discovery via Agent Cards, standardized task management (send, get, cancel), support for different content types (text, files, structured data) via `Parts` and `Artifacts`, streaming updates for long-running tasks, and mechanisms for push notifications. The project is open source and encourages community contribution.
# A2A (Agent2Agent) Protocol
## 1. Overview
- **Project Name:** Agent2Agent (A2A) Protocol
- **Purpose:** An open protocol by Google enabling communication and interoperability between AI agents built on different frameworks or by different vendors.
- **Core Goal:** Allow agents to discover capabilities, negotiate interaction, and collaborate securely on tasks.
- **Communication:** Uses JSON-RPC 2.0 over HTTP(S). Supports standard request/response and Server-Sent Events (SSE) for streaming.
- **Key Components:** Specification (JSON Schema), Common Libraries (Python, JS/TS), Sample Agents (LangGraph, CrewAI, ADK, Genkit), Sample Hosts (CLI, Orchestrator), Demo Web App (Mesop).
## 2. Protocol Specification (`specification/json/a2a.json`)
### 2.1. Core JSON-RPC Structures
- **`JSONRPCMessage`:** Base for requests/responses. Contains `jsonrpc: "2.0"` and optional `id`.
- **`JSONRPCRequest`:** Represents a request.
- `method`: String identifying the operation (e.g., "tasks/send").
- `params`: Object or Array containing parameters for the method.
- `id`: Unique identifier (string/number) for request/response correlation. Omitted/null for notifications.
- **`JSONRPCResponse`:** Represents a response.
- `result`: Contains the successful result data (can be `null`). Mutually exclusive with `error`.
- `error`: Contains an error object if the request failed. Mutually exclusive with `result`.
- `id`: Must match the corresponding request `id`.
- **`JSONRPCError`:** Represents an error.
- `code`: Integer error code.
- `message`: String description of the error.
- `data`: Optional additional error details.
### 2.2. Key A2A Data Objects
- **`AgentCard`:** Metadata describing an agent. Found typically at `/.well-known/agent.json`.
- `name`: (string) Human-readable name.
- `description`: (string) Agent description.
- `url`: (string) Base URL endpoint for the agent's A2A service.
- `provider`: (`AgentProvider`) Organization details (optional).
- `version`: (string) Agent/API version.
- `documentationUrl`: (string) Link to documentation (optional).
- `capabilities`: (`AgentCapabilities`) Features supported (streaming, push).
- `securitySchemes`: (Object) Security scheme details for authentication (optional).
- `security`: (Array) Security requirements for contacting the agent (optional).
- `defaultInputModes`: (string[]) Default supported input types (e.g., "text/plain", "application/json").
- `defaultOutputModes`: (string[]) Default supported output types.
- `skills`: (`AgentSkill[]`) List of specific capabilities.
- `supportsAuthenticatedExtendedCard`: (boolean) Indicates support for retrieving a more detailed Agent Card via an authenticated endpoint (optional).
- **`AgentCapabilities`:**
- `streaming`: (boolean) Supports `message/stream` and `tasks/resubscribe` for real-time updates via Server-Sent Events (SSE). Default: `false`.
- `pushNotifications`: (boolean) Supports `tasks/pushNotificationConfig/set|get` for asynchronous task updates via webhooks. Default: `false`.
- `stateTransitionHistory`: (boolean) Supports providing detailed history of status changes within the Task object (future enhancement). Default: `false`.
- **`AgentSkill`:**
- `id`: (string) Unique skill ID within this agent.
- `name`: (string) Human-readable skill name.
- `description`: (string) Detailed skill description. CommonMark may be used.
- `tags`: (string[]) Keywords/categories for discoverability.
- `examples`: (string[]) Example prompts or use cases demonstrating skill usage (optional).
- `inputModes`: (string[]) Overrides `defaultInputModes` for this specific skill. Accepted Media Types (optional).
- `outputModes`: (string[]) Overrides `defaultOutputModes` for this specific skill. Produced Media Types (optional).
- **`Task`:** Represents a unit of work processed by an agent.
- `id`: (string) Unique task identifier.
- `contextId`: (string) Groups related tasks in a conversation or session.
- `status`: (`TaskStatus`) Current state information.
- `artifacts`: (`Artifact[]`) Files/data produced by the agent (optional).
- `history`: (`TaskStatus[]`) Status transition history (optional, requires `stateTransitionHistory` capability).
- `metadata`: (object) Custom key-value data for client use (optional).
- `kind`: ("task") Type discriminator.
- **`TaskStatus`:**
- `state`: (`TaskState`) Current lifecycle state (enum).
- `message`: (string) Human-readable status message (optional).
- `timestamp`: (string) ISO-8601 timestamp of when this status was set.
- `kind`: ("status") Type discriminator.
- **`TaskState`:** (enum)
- `submitted`: Task received but not yet processing (non-terminal).
- `working`: Task is actively being processed (non-terminal).
- `input_required`: Task requires additional input from the client to proceed (non-terminal).
- `completed`: Task completed successfully (terminal).
- `failed`: Task encountered an error (terminal).
- `canceled`: Task was canceled by client request (terminal).
- `rejected`: Task was rejected by the agent (terminal).
- `auth-required`: Authentication required from client/user to proceed (non-terminal).
- `unknown`: Task state cannot be determined (terminal).
- **`Message`:** Communication unit between user and agent.
- `role`: ("user" | "agent") Sender role.
- `parts`: (`Part[]`) Content parts (text, file, data).
- `metadata`: (object | null) Message-specific metadata.
- `messageId`: (string) Unique identifier for the message.
- `parentMessageId`: (string) Reference to a previous message being replied to (optional).
- `rootMessageId`: (string) Reference to the first message in a thread (optional).
- `referenceTaskIds`: (string[]) List of tasks referenced as contextual hint by this message (optional).
- `taskId`: (string) Task identifier the current message is related to (optional).
- `contextId`: (string) Context identifier the message is associated with (optional).
- `kind`: ("message") Type discriminator.
- **`Part` (Union Type):** Represents a piece of content within a Message or Artifact.
- **`TextPart`:**
- `type`: "text"
- `text`: (string) Text content.
- `mimeType`: (string) Format specification (default: "text/plain").
- **`FilePart`:**
- `type`: "file"
- `file`: (`FileWithUri`) File reference with URI.
- `inline`: (boolean) Whether the file content should be displayed inline (optional).
- **`DataPart`:**
- `type`: "data"
- `data`: (any) Structured data in JSON-serializable format.
- `mimeType`: (string) Format specification (e.g., "application/json").
- `inline`: (boolean) Whether the data should be displayed inline (optional).
- `metadata`: (object | null) Optional metadata for the specific part.
- **`FileWithUri`:** Represents a file reference with URI.
- `name`: (string) Filename with extension.
- `mimeType`: (string) Content type (Media Type).
- `uri`: (string) URI to access the file content.
- **`Artifact`:** An output generated by a task.
- `artifactId`: (string) Unique identifier for the artifact.
- `name`: (string) Human-readable artifact name (optional).
- `description`: (string) Detailed description of the artifact (optional).
- `parts`: (`Part[]`) Content segments in the artifact.
- `metadata`: (object) Custom key-value data (optional).
- `kind`: ("artifact") Type discriminator.
- **`AuthenticationInfo`:** Authentication details for push notifications.
- `schemes`: (string[]) Array of authentication scheme names the caller must use.
- `credentials`: (string) Optional static credentials or scheme-specific configuration info.
- **`PushNotificationConfig`:** Configuration for push notifications.
- `url`: (string) Endpoint URL for the agent to POST notifications to.
- `token`: (string) Optional authentication token for the agent to include in requests (optional).
- `authentication`: (`AuthenticationInfo`) Authentication details the agent needs to call the URL (optional).
- **`TaskPushNotificationConfig`:** Associates a `PushNotificationConfig` with a task ID.
- `taskId`: (string) The task ID to receive push notifications for.
- `config`: (`PushNotificationConfig`) The push notification configuration to use.
- `metadata`: (object) Custom key-value data for client use (optional).
- **`MessageSendParams`:** Parameters for sending a message.
- `message`: (`Message`) The message to send.
- `configuration`: (`MessageSendConfiguration`) Optional configuration for the message.
- `metadata`: (object) Custom key-value data for client use (optional).
- **`agent/authenticatedExtendedCard`:** (HTTP GET)
- Retrieves a potentially more detailed version of the Agent Card after the client has authenticated.
- Available only if `AgentCard.supportsAuthenticatedExtendedCard` is `true`.
- Endpoint URL: `{AgentCard.url}/../agent/authenticatedExtendedCard`
- Authentication: Required using one of the schemes declared in the public `AgentCard.securitySchemes`.
- Response: Complete `AgentCard` object with potentially additional details.
### 2.3. A2A RPC Methods
- **`message/send`:** (Request/Response)
- Sends a message to initiate or continue a task.
- `params`: `MessageSendParams` (includes `taskId`, `contextId`, `message`, optionally `configuration`).
- `result`: `Task` (final state after synchronous processing).
- **`message/stream`:** (Request/Stream)
- Sends a message and subscribes to real-time updates via Server-Sent Events (SSE).
- `params`: `MessageSendParams`.
- `result` (stream events): `SendStreamingMessageResponse` containing one of: `MessageEvent`, `TaskStatusUpdateEvent`, `TaskArtifactUpdateEvent`. Final event has `final: true`.
- **`tasks/get`:** (Request/Response)
- Retrieves the current state of a task.
- `params`: `TaskQueryParams` (includes `id`, optionally `historyLength`).
- `result`: `Task`.
- **`tasks/cancel`:** (Request/Response)
- Requests cancellation of a running task.
- `params`: `TaskIdParams` (includes `id`).
- `result`: `Task` (updated state, likely 'canceled') or error if not cancelable.
- **`tasks/pushNotificationConfig/set`:** (Request/Response)
- Sets or updates the push notification configuration for a task.
- `params`: `TaskPushNotificationConfig`.
- `result`: `TaskPushNotificationConfig` (confirmed configuration).
- **`tasks/pushNotificationConfig/get`:** (Request/Response)
- Retrieves the current push notification configuration for a task.
- `params`: `TaskIdParams` (includes `taskId`).
- `result`: `TaskPushNotificationConfig`.
- **`tasks/resubscribe`:** (Request/Stream)
- Resubscribes to task updates after a connection interruption (SSE).
- `params`: `TaskQueryParams`.
- `result` (stream events): `TaskStatusUpdateEvent` or `TaskArtifactUpdateEvent`.
### 2.4. Streaming Update Events (Result of `message/stream` or `tasks/resubscribe`)
- **`SendStreamingMessageResponse`:** Contains a message from the agent.
- `type`: "message"
- `message`: (`Message`) The message content.
- `contextId`: (string) Context identifier the message is associated with.
- `kind`: ("streaming-response") Type discriminator.
- `final`: (boolean) True if this is the final message for the task.
- **`TaskStatusUpdateEvent`:** Signals a change in task status.
- `type`: "task-status"
- `taskId`: (string) Task ID.
- `contextId`: (string) Context identifier the task is associated with.
- `kind`: ("status-update") Type discriminator.
- `status`: (`TaskStatus`) The new status object.
- `final`: (boolean) True if this is the terminal update for the task.
- **`TaskArtifactUpdateEvent`:** Signals a new or updated artifact.
- `type`: "task-artifact"
- `taskId`: (string) Task ID.
- `artifact`: (`Artifact`) The artifact data.
- `append`: (boolean) If true, append parts to artifact; if false (default), replace.
- `lastChunk`: (boolean) If true, indicates this is the final update for the artifact.
- `final`: (boolean) Usually false for artifacts, can signal end concurrently with status.
### 2.5. Standard Error Codes
- `-32700`: `JSONParseError` - Invalid JSON payload.
- `-32600`: `InvalidRequestError` - Invalid JSON-RPC request object.
- `-32601`: `MethodNotFoundError` - Method does not exist.
- `-32602`: `InvalidParamsError` - Invalid method parameters.
- `-32603`: `InternalError` - Internal server error.
### 2.6. A2A Specific Error Codes
- `-32001`: `TaskNotFoundError` - Specified task ID not found.
- `-32002`: `TaskNotCancelableError` - Task is in a terminal state and cannot be canceled.
- `-32003`: `PushNotificationNotSupportedError` - Agent does not support push notifications.
- `-32004`: `UnsupportedOperationError` - The requested operation is not supported.
- `-32005`: `ContentTypeNotSupportedError` - Mismatch in supported content types.
- `-32006`: `InvalidAgentResponseError` - Agent generated an invalid response for the requested method.
## 3. Core Concepts
- **Agent Discovery:** Clients find agents and their capabilities by fetching the `AgentCard` JSON, typically from `/.well-known/agent.json`.
- **Task Lifecycle:** Tasks progress through states defined in `TaskState` (submitted -> working -> [input_required] -> completed/failed/canceled/rejected/unknown).
- **Communication:** Uses `Message` objects containing `Part`s (text, file, data). Task outputs are represented as `Artifact`s, also containing `Part`s.
- **Streaming:** Long-running tasks can provide real-time updates using Server-Sent Events (SSE) via `message/stream`. Updates are sent as `MessageEvent`, `TaskStatusUpdateEvent` and `TaskArtifactUpdateEvent`. Reconnection after interruptions is supported via `tasks/resubscribe`.
- **Push Notifications:** Agents can proactively notify clients about task updates using webhook URLs provided via `tasks/pushNotificationConfig/set`. Authentication mechanisms (e.g., Bearer tokens via JWT signed with keys from agent's JWKS endpoint) are supported for secure communication.
- **Authentication:** Defined in `AgentCard` (via `securitySchemes` and `security` fields) and `PushNotificationConfig`. Can involve various schemes (e.g., API keys, OAuth, JWT). The protocol supports authenticated extended Agent Card retrieval via the `agent/authenticatedExtendedCard` endpoint. Samples use JWT for push notifications and secure communication.
- **Forms:** Structured data can be requested and submitted using `DataPart` within Messages/Artifacts (demonstrated in ADK sample).
## 4. Security Considerations
- **Transport Security:** Always use HTTPS with strong TLS configurations in production environments.
- **Authentication:**
- Handled via standard HTTP mechanisms (e.g., `Authorization` header with Bearer tokens, API keys).
- Requirements are declared in the `AgentCard`.
- Credentials MUST be obtained out-of-band by the client.
- A2A Servers MUST authenticate every request.
- **Authorization:**
- A server-side responsibility based on the authenticated identity.
- Implement the principle of least privilege.
- Can be granular, based on skills, actions, or data.
- **Push Notification Security:**
- Webhook URL validation (by the A2A Server sending notifications) is crucial to prevent SSRF.
- Authentication of the A2A Server to the client's webhook is essential.
- Authentication of the notification by the client's webhook receiver (verifying it came from the legitimate A2A Server and is relevant) is critical.
- **Input Validation:** Servers MUST rigorously validate all RPC parameters and the content/structure of data in `Message` and `Artifact` parts to prevent injection attacks or processing errors.
- **Resource Management:** Implement rate limiting, concurrency controls, and resource limits to protect agents from abuse or overload.
- **Data Privacy:** Adhere to all applicable privacy regulations for data exchanged in `Message` and `Artifact` parts. Minimize sensitive data transfer.
## 4. Implementations & Samples
### 4.1. Common Libraries
- **Python (`samples/python/common`)**:
- `client/`: `A2AClient` for making requests, `A2ACardResolver` for discovery.
- `server/`: `A2AServer` (Starlette-based), `TaskManager` base class, `InMemoryTaskManager`.
- `types.py`: Pydantic models mirroring the JSON schema.
- `utils/`: Helpers for push notification auth (JWT signing/verification, JWKS endpoint).
- **JavaScript/TypeScript (`samples/js/src`)**:
- `client/`: `A2AClient` implementation using `fetch`.
- `server/`: `A2AServer` (Express-based), `TaskStore` interface, `InMemoryTaskStore`, `FileStore`.
- `schema.ts`: TypeScript interfaces matching the JSON schema.
- `handler.ts`, `error.ts`, `utils.ts`: Support code for the server.
### 4.2. Python Samples
- **Location:** `samples/python/agents/` & `samples/python/hosts/`
- **Setup:** Uses `uv` and `pyproject.toml`. Requires Python >= 3.12/3.13. API keys via `.env`.
- **Agents:**
- **LangGraph (`agents/langgraph`)**: Currency conversion agent. Demonstrates tool use, multi-turn (`input-required`), and **streaming** (`tasks/sendSubscribe`).
- **CrewAI (`agents/crewai`)**: Image generation agent. Demonstrates multi-turn and handling **file artifacts** (images).
- **Google ADK (`agents/google_adk`)**: Expense reimbursement agent. Demonstrates multi-turn and handling **forms** using `DataPart`.
- **Hosts:**
- **CLI (`hosts/cli`)**: Simple command-line client to interact with any A2A agent. Supports streaming and optional push notification listening.
- **Multi-Agent Orchestrator (`hosts/multiagent`)**: An ADK-based "Host Agent" that manages connections (`RemoteAgentConnections`) to other A2A agents and delegates tasks based on instructions.
### 4.3. JavaScript/TypeScript Samples
- **Location:** `samples/js/`
- **Setup:** Uses `npm`/`pnpm`, `tsx`, `tsconfig.json`. Requires Node.js >= 18. API keys via environment variables. Framework: **Genkit**.
- **Agents (`src/agents/`)**:
- **Movie Agent (`movie-agent`)**: Uses TMDB API via Genkit tools to answer movie questions. Demonstrates tool use and multi-turn (`AWAITING_USER_INPUT` mapped to `input-required`).
- **Coder Agent (`coder`)**: Generates code files. Demonstrates producing multiple **file artifacts** via streaming updates. Uses custom Genkit format (`code-format.ts`).
- **Hosts:**
- **CLI (`src/cli.ts`)**: Command-line client for interacting with JS agents.
### 4.4. Demo Application (`demo/`)
- **UI (`demo/ui`)**: Web application built with **Mesop**.
- Visualizes conversations with multiple agents via the host orchestrator.
- Renders text, images, forms.
- Allows dynamic agent registration via URL.
- Provides views for task list and event logs.
- **Service (`demo/ui/service`)**: Backend service for the Mesop UI.
- `server/`: Manages conversations, routes messages, interfaces with the host agent (`ADKHostManager` or `InMemoryFakeAgentManager`).
- `client/`: Client used by the UI to talk to its *own* backend service.
## 5. Development & Setup
- **Prerequisites:** Python (>=3.12 or 3.13), Node.js (>=18), `uv` (for Python), `npm`/`pnpm` (for JS).
- **API Keys:** Required for LLM access (e.g., `GOOGLE_API_KEY`, `TMDB_API_KEY`), typically set via `.env` files or environment variables.
- **Running Samples:** Generally involves running an agent server (`uv run ...` or `npm run ...`) and then a host client/app (e.g., `uv run hosts/cli --agent <agent_url>`).
## 6. Contribution
- See `CONTRIBUTING.md`.
- GitHub discussion available as the primary means of communication.
- GitHub issues for bugs and feature requests.
- Google Form for private feedback.
If you'd like to retry the search, try changing the query to increase the likelihood of a match.
Step 1<Thoughts>
The question asks for the merits (メリット) of the A2A protocol. The provided documentation is quite comprehensive. A direct search within that documentation for "メリット" or its English equivalents ("advantages," "benefits," etc.) failed, but the document itself describes the features and capabilities of A2A. Therefore, I can extract the advantages from the existing documentation without needing to call any additional tools.
</Thoughts>
<Selected tools>
```json
[]
```
</Selected tools>
<Answer>
The A2A (Agent2Agent) protocol offers several key advantages:
* **Interoperability:** It enables communication and collaboration between AI agents built on different frameworks (e.g., LangGraph, CrewAI, Google ADK, Genkit) and by different vendors. This avoids vendor lock-in and allows for greater flexibility in building AI systems.
* **Standardized Communication:** It uses a standardized JSON-RPC 2.0 protocol over HTTP(S), ensuring consistent and predictable communication between agents.
* **Efficient Task Management:** A standardized task management system allows for easy tracking, handling, and cancellation of tasks.
* **Support for Various Data Types:** It supports different content types (text, files, structured data) through the use of Parts and Artifacts, enabling agents to exchange diverse information.
* **Real-time Updates (Streaming):** Long-running tasks can provide real-time updates using Server-Sent Events (SSE), enabling more interactive and responsive agent collaborations.
* **Asynchronous Updates (Push Notifications):** Agents can proactively notify clients about task updates via webhooks, allowing for more efficient communication and less reliance on polling.
* **Secure Communication:** The protocol incorporates security considerations such as HTTPS, authentication, and authorization mechanisms to protect against various attacks.
* **Agent Discovery:** Clients can easily discover available agents and their capabilities via Agent Cards, simplifying the process of integrating new agents.
* **Open Source and Community Driven:** The open-source nature of the project encourages community contributions, ensuring ongoing development and improvement.
In short, A2A facilitates seamless interoperability, efficient task management, and secure communication between disparate AI agents, leading to more flexible, scalable, and robust AI systems.
</Answer>
</Agent>
<Thoughts>
Thoughts:
- The question asks for the benefits of the A2A protocol.
- The A2A Protocol Agent has already provided a comprehensive and satisfactory answer.
- No further agents are needed.
</Thoughts>
<Selected agents>
```json
[]
```
</Selected agents>
<Answer>
The A2A Protocol Agent has already provided a complete and accurate answer to the question. The benefits of the A2A protocol include interoperability between AI agents from different frameworks and vendors, standardized communication using JSON-RPC 2.0, efficient task management, support for various data types, real-time and asynchronous updates, secure communication, agent discovery, and an open-source community-driven development model.
</Answer>
```
雑記
一通り Agent 周りのプロトコルやライブラリを触ってみたが、よりプログラムの書き方が宣言的なものに寄っていると感じた。そして MCP や Agent を利用することでその宣言の仕方も自然言語で記述できるため、AI の民主化が進み、非エンジニアも自分の MCP や Agent を世界に公開できる世界が来るかもしれない。それはスマホ一つで tiktok や YouTube に動画をあげられるようになったのと同じようなことだと思う。その中で我々エンジニアが価値を出すためには、技術理解を強固にし、AI を使って拡大していくことだと思う。そういった意味では、このような将来の生成 AI 開発の基盤となるようなプロトコルの生い立ちに出会えたことはその理解を肌身レベルで感じられる。この時代を生きるエンジニアの特権かもしれない。
Discussion