Open · Comment added 2025/06/03 · 8 comments

Trying out Google's "Agent Development Kit"

Agent Development Kit

Official article
https://developers.googleblog.com/en/agent-development-kit-easy-to-build-multi-agent-applications/
Summary by npaka
https://note.com/npaka/n/n3d7b72d8d74f
Personally, this is the part I'm most curious about:


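Built-in streaming

ADK's own bidirectional audio and video streaming lets you interact with agents in human-like conversation. With just a few lines of code you can build natural interactions with an agent, moving beyond text exchanges into rich, multimodal dialogue.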

Documentation
https://google.github.io/adk-docs/

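Agent Development Kit

An open-source AI agent framework integrated with Gemini and Google.

What is the Agent Development Kit?

The Agent Development Kit (ADK) is a flexible, modular framework for developing and deploying AI agents. ADK can be used with popular LLMs and open-source generative AI tools, and is designed with a focus on tight integration with the Google ecosystem and Gemini models. ADK makes it easy to get started with simple agents powered by Gemini models and Google AI tools, while also providing the control and structure needed for more complex agent architectures and orchestration.

Get Started

I'll work through Get Started.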
https://google.github.io/adk-docs/get-started/

Installation

This time I'm working on a local Mac.

Create a project with uv

uv init -p 3.12.9 adk-work
cd adk-work

Install the ADK package

uv add google-adk

Output

 + google-adk==0.1.0

Delete main.py for now

rm main.py

Quickstart

We'll build an agent with multiple tools using ADK. First, create the directory and file structure exactly as in the Quickstart.

mkdir multi_tool_agent
touch multi_tool_agent/{__init__.py,agent.py,.env}

tree -a multi_tool_agent/

Output

multi_tool_agent/
├── .env
├── __init__.py
└── agent.py

__init__.py

from . import agent

agent.py

import datetime
from zoneinfo import ZoneInfo
from google.adk.agents import Agent

def get_weather(city: str) -> dict:
    """Get the current weather report for a given city.

    Args:
        city (str): Name of the city to get the weather for, in English.

    Returns:
        dict: Status and result, or an error message.
    """
    if city.lower() == "new york":
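        return {
            "status": "success",
            # Note: 25°C is about 77°F, not 41°F. The text is left as-is
            # because the outputs shown later were produced with it.
            "report": (
                "ニューヨークの天気は晴れです。"
                "気温は25度（41°F）です。"
            ),
        }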
    else:
        return {
            "status": "error",
            "error_message": f"{city}の天気情報は利用できません。",
        }


def get_current_time(city: str) -> dict:
    """Get the current time in a given city.

    Args:
        city (str): Name of the city to get the current time for, in English.

    Returns:
        dict: Status and result, or an error message.
    """

    if city.lower() == "new york":
        tz_identifier = "America/New_York"
    else:
        return {
            "status": "error",
            "error_message": (
                f"{city}のタイムゾーン情報は利用できません。"
            ),
        }

    tz = ZoneInfo(tz_identifier)
    now = datetime.datetime.now(tz)
    report = (
        f'{city}の現在時刻は {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}です。'
    )
    return {"status": "success", "report": report}


root_agent = Agent(
    name="weather_time_agent",
    model="gemini-2.0-flash-exp",
    description=(
        "任意の都市の時刻と天気に関する質問に答えるエージェントです。"
    ),
    instruction=(
        "私は指定された都市の時刻や天気に関する質問に答えることができます。"
    ),
    tools=[get_weather, get_current_time],
)

The contents of .env differ depending on whether you use Google AI Studio or Vertex AI.

For Google AI Studio:

.env

GOOGLE_GENAI_USE_VERTEXAI="False"
GOOGLE_API_KEY="XXXXXXXXXX"

For Vertex AI. I chose us-central1 as the region; as far as I know, gemini-2.0-flash-exp is not available in asia-northeast1.

.env

GOOGLE_CLOUD_PROJECT="<プロジェクトID>"
GOOGLE_CLOUD_LOCATION="<region>"  # e.g. `us-central1`, `asia-northeast1`
GOOGLE_GENAI_USE_VERTEXAI="True"
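
Before launching anything, it can help to check which of the two .env variants above is actually in effect. The following is a minimal sketch and not part of ADK: `parse_env` and `describe_backend` are hypothetical helpers, the parser only handles simple KEY="value" lines like the ones shown, and treating an unset GOOGLE_GENAI_USE_VERTEXAI as "False" is an assumption.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY="value" lines (hypothetical helper, not an ADK API)."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment, then the surrounding quotes.
        value = value.split(" #", 1)[0].strip().strip('"')
        env[key.strip()] = value
    return env


def describe_backend(env: dict) -> str:
    """Report which backend the settings point at, or which keys are missing."""
    # Assumption: an unset flag means Google AI Studio.
    use_vertex = env.get("GOOGLE_GENAI_USE_VERTEXAI", "False").lower() == "true"
    if use_vertex:
        required = ["GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION"]
    else:
        required = ["GOOGLE_API_KEY"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        return "missing: " + ", ".join(missing)
    return "Vertex AI" if use_vertex else "Google AI Studio"
```

For example, passing the contents of multi_tool_agent/.env through `describe_backend(parse_env(...))` gives a quick confirmation of the backend before running adk.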

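I'll try it with Vertex AI. Beforehand, enable the Vertex AI API on the Google Cloud side and authenticate with gcloud auth login.

Now let's run the agent. Three ways of running it are listed, and all of them go through the adk command.

Dev UI: adk web
Terminal: adk run
API server: adk api_server

I'll try them in order, starting with the Dev UI.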

uv run adk web

It appears to start on port 8000.

Output

INFO:     Started server process [13449]
INFO:     Waiting for application startup.
+-----------------------------------------------------------------------------+
| ADK Web Server started                                                      |
|                                                                             |
| For local testing, access at http://localhost:8000.                         |
+-----------------------------------------------------------------------------+

INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

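Opening it in a browser shows a screen like this. Select the agent from the dropdown menu on the left; in this case, multi_tool_agent.

A chat screen appears, so I'll say something.

You can see that it answers by using the tools.

The left pane shows details such as the requests and responses for each tool call and for each exchange with the model.

You can also use the microphone and camera. With the microphone it looked like this; does that mean it is talking to gemini-2.0-flash-exp directly in multimodal mode? Responses are quite fast.

Stop the Dev UI; next, the terminal.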

uv run adk run multi_tool_agent

Output

Log setup complete: /var/folders/5z/mnlc5_7x5dv8r528s4sg1h2r0000gn/T/agents_log/agent.20250411_111641.log
To access latest log: tail -F /var/folders/5z/mnlc5_7x5dv8r528s4sg1h2r0000gn/T/agents_log/agent.latest.log
Running agent weather_time_agent, type exit to exit.
user: おはよう！
[weather_time_agent]: おはようございます！ 今日は何をお手伝いできますか？
user: 東京の天気を教えて。
[weather_time_agent]: 申し訳ありませんが、東京の天気情報は現在利用できません。
user: じゃあニューヨークは？
[weather_time_agent]: ニューヨークの天気は晴れで、気温は25度（41°F）です。
user: ニューヨークは今何時？
[weather_time_agent]: ニューヨークの現在時刻は 2025-04-10 22:17:10 EDT-0400 です。

Finally, the API server.

uv run adk api_server

This one also starts on port 8000.

Output

INFO:     Started server process [14236]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

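Local testing with the API server is covered in the ADK documentation.

First, create a new session with the agent. You specify the agent, the user ID, and the session ID. State can also be passed in, but I'll keep it simple for this trial.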

curl -X POST http://0.0.0.0:8000/apps/multi_tool_agent/users/u_123/sessions/s_123 \
    -H "Content-Type: application/json" | jq -r

Output

{
  "id": "s_123",
  "app_name": "multi_tool_agent",
  "user_id": "u_123",
  "state": {},
  "events": [],
  "last_update_time": 1744338124.291699
}

Once the session exists, send a query. There are two endpoints, /run and /run_sse; the latter is for streaming.

First, plain /run.

curl -X POST http://0.0.0.0:8000/run \
    -H "Content-Type: application/json" \
    -d '{
        "app_name": "multi_tool_agent",
        "user_id": "u_123",
        "session_id": "s_123",
        "new_message": {
            "role": "user",
            "parts": [{
                "text": "おはよう。ニューヨークの天気を教えて。"
            }]
        }
    }' | jq -r .

Output

[
  {
    "content": {
      "parts": [
        {
          "functionCall": {
            "id": "adk-9df2a453-949c-4538-8498-aa7da951bb5a",
            "args": {
              "city": "New York"
            },
            "name": "get_weather"
          }
        }
      ],
      "role": "model"
    },
    "invocation_id": "e-e61ed9f4-7a1e-4bb1-93d4-bf0e2c09a0db",
    "author": "weather_time_agent",
    "actions": {
      "state_delta": {},
      "artifact_delta": {},
      "requested_auth_configs": {}
    },
    "long_running_tool_ids": [],
    "id": "Zxk73L2x",
    "timestamp": 1744338358.38818
  },
  {
    "content": {
      "parts": [
        {
          "functionResponse": {
            "id": "adk-9df2a453-949c-4538-8498-aa7da951bb5a",
            "name": "get_weather",
            "response": {
              "status": "success",
              "report": "ニューヨークの天気は晴れです。気温は25度（41°F）です。"
            }
          }
        }
      ],
      "role": "user"
    },
    "invocation_id": "e-e61ed9f4-7a1e-4bb1-93d4-bf0e2c09a0db",
    "author": "weather_time_agent",
    "actions": {
      "state_delta": {},
      "artifact_delta": {},
      "requested_auth_configs": {}
    },
    "id": "U8qfhyl1",
    "timestamp": 1744338360.444321
  },
  {
    "content": {
      "parts": [
        {
          "text": "ニューヨークの天気は晴れです。気温は25度（41°F）です。"
        }
      ],
      "role": "model"
    },
    "invocation_id": "e-e61ed9f4-7a1e-4bb1-93d4-bf0e2c09a0db",
    "author": "weather_time_agent",
    "actions": {
      "state_delta": {},
      "artifact_delta": {},
      "requested_auth_configs": {}
    },
    "id": "AfTFMUUg",
    "timestamp": 1744338360.447432
  }
]
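
The same two calls can be made from Python instead of curl. This is a minimal sketch using only the standard library; `build_run_payload`, `post_json`, `final_texts`, and `demo` are hypothetical helpers, the base URL assumes `adk api_server` on its default port, and sending an empty JSON object when creating the session is an assumption (the curl call above sends no body).

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: `adk api_server` on port 8000


def build_run_payload(app_name: str, user_id: str, session_id: str, text: str) -> dict:
    """Build the same JSON body as the /run curl request above."""
    return {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": text}]},
    }


def post_json(path: str, payload: Optional[dict] = None):
    """POST a JSON body to the API server and decode the JSON response."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload or {}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def final_texts(events: list) -> list:
    """Collect text parts, skipping functionCall / functionResponse parts."""
    texts = []
    for event in events:
        for part in (event.get("content") or {}).get("parts", []):
            if "text" in part:
                texts.append(part["text"])
    return texts


def demo() -> None:
    """Mirror the two curl calls; needs a running `adk api_server`."""
    post_json("/apps/multi_tool_agent/users/u_123/sessions/s_123")
    events = post_json("/run", build_run_payload(
        "multi_tool_agent", "u_123", "s_123", "おはよう。ニューヨークの天気を教えて。"))
    print(final_texts(events))
```

`final_texts` relies only on the event shape shown in the output above: the first two events carry functionCall and functionResponse parts, and only the last one carries a text part.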

Streaming with /run_sse

curl -X POST http://0.0.0.0:8000/run_sse \
    -H "Content-Type: application/json" \
    -d '{
        "app_name": "multi_tool_agent",
        "user_id": "u_123",
        "session_id": "s_123",
        "new_message": {
            "role": "user",
            "parts": [{
                "text": "おはよう。ニューヨークの天気を教えて。"
            }]
        },
        "streaming": true
    }'

You can see that the response arrives split into chunks.

Output

data: {"content":{"parts":[{"functionCall":{"id":"adk-7f1f9135-5adf-4060-b9ab-640a944d1656","args":{"city":"New York"},"name":"get_weather"}}],"role":"model"},"invocation_id":"e-cf9b4c0e-10e9-4422-a2f7-9df6238d69ad","author":"weather_time_agent","actions":{"state_delta":{},"artifact_delta":{},"requested_auth_configs":{}},"long_running_tool_ids":[],"id":"MCaMWTLL","timestamp":1744338531.517398}

data: {"content":{"parts":[{"functionResponse":{"id":"adk-7f1f9135-5adf-4060-b9ab-640a944d1656","name":"get_weather","response":{"status":"success","report":"ニューヨークの天気は晴れです。気温は25度（41°F）です。"}}}],"role":"user"},"invocation_id":"e-cf9b4c0e-10e9-4422-a2f7-9df6238d69ad","author":"weather_time_agent","actions":{"state_delta":{},"artifact_delta":{},"requested_auth_configs":{}},"id":"W7eS6W3T","timestamp":1744338533.378343}

data: {"content":{"parts":[{"text":"ニューヨーク"}],"role":"model"},"partial":true,"invocation_id":"e-cf9b4c0e-10e9-4422-a2f7-9df6238d69ad","author":"weather_time_agent","actions":{"state_delta":{},"artifact_delta":{},"requested_auth_configs":{}},"id":"8BrDOiDh","timestamp":1744338533.382575}

data: {"content":{"parts":[{"text":"の天気は"}],"role":"model"},"partial":true,"invocation_id":"e-cf9b4c0e-10e9-4422-a2f7-9df6238d69ad","author":"weather_time_agent","actions":{"state_delta":{},"artifact_delta":{},"requested_auth_configs":{}},"id":"8BrDOiDh","timestamp":1744338533.382575}

data: {"content":{"parts":[{"text":"晴れです。気温は25度です。"}],"role":"model"},"partial":true,"invocation_id":"e-cf9b4c0e-10e9-4422-a2f7-9df6238d69ad","author":"weather_time_agent","actions":{"state_delta":{},"artifact_delta":{},"requested_auth_configs":{}},"id":"8BrDOiDh","timestamp":1744338533.382575}

data: {"content":{"parts":[{"text":"ニューヨークの天気は晴れです。気温は25度です。"}],"role":"model"},"invocation_id":"e-cf9b4c0e-10e9-4422-a2f7-9df6238d69ad","author":"weather_time_agent","actions":{"state_delta":{},"artifact_delta":{},"requested_auth_configs":{}},"id":"8BrDOiDh","timestamp":1744338533.382575}
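
To make the structure of this stream concrete: each `data:` line is one JSON event, text chunks are marked with "partial": true, and the last event repeats the full text without that flag. The following is a minimal sketch of reading such a stream, based only on the output above; `parse_sse_events` and `split_partial_and_final` are hypothetical helper names.

```python
import json


def parse_sse_events(lines):
    """Yield the JSON payload of each `data: ...` line in an SSE stream."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):])


def split_partial_and_final(events):
    """Return (text joined from the partial chunks, text of the final event)."""
    streamed, final = [], None
    for event in events:
        parts = (event.get("content") or {}).get("parts", [])
        text = "".join(part.get("text", "") for part in parts)
        if not text:
            continue  # functionCall / functionResponse events carry no text
        if event.get("partial"):
            streamed.append(text)
        else:
            final = text
    return "".join(streamed), final
```

A client that renders text incrementally would display the partial chunks as they arrive and treat the final, non-partial event as the complete message rather than appending it again.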

Out of curiosity, I switched the model to gemini-1.5-flash and tried voice...

Output

websockets.exceptions.ConnectionClosedError: received 1007 (invalid frame payload data) Request trace id: 7855fdbaae399fad, [ORIGINAL ERROR] generic::invalid_argument: gemini-1.5-flash is not supported in the li; then sent 1007 (invalid frame payload data) Request trace id: 7855fdbaae399fad, [ORIGINAL ERROR] generic::invalid_argument: gemini-1.5-flash is not supported in the li

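Text chat works fine, so voice chat apparently only works with certain models. (My guess is that it is implemented with a multimodal model plus WebSocket.)

Since LiteLLM is available, I also tried models from other vendors (such as OpenAI's Realtime models), but these do not seem to be supported either.

So far, gemini-2.0-flash-exp is the only model I have confirmed working with voice.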

Quickstart (streaming)

First, create the directory and file structure exactly as in the Quickstart. The Python virtual environment and .env are reused from the previous section.

mkdir -p adk-streaming/app/google_search_agent
touch adk-streaming/app/google_search_agent/{__init__,agent}.py
cp multi_tool_agent/.env adk-streaming/app/.

tree -a adk-streaming

Output

adk-streaming/
└── app
    ├── .env
    └── google_search_agent
        ├── __init__.py
        └── agent.py

3 directories, 3 files

Create agent.py. It must define root_agent.

adk-streaming/app/google_search_agent/agent.py

from google.adk.agents import Agent
from google.adk.tools import google_search  # Import the tool

root_agent = Agent(
   # A unique name for the agent.
   name="basic_search_agent",
   # The large language model (LLM) the agent uses.
   model="gemini-2.0-flash-exp",
   # A short description of the agent's purpose.
   description="Google検索を使って質問に答えるエージェント",
   # Instructions that set the agent's behavior.
   instruction="あなたは優秀な研究者であり、常に事実を重視しています。",
   # Add the google_search tool, which grounds answers with Google Search.
   tools=[google_search]
)

Create __init__.py

adk-streaming/app/google_search_agent/__init__.py

from . import agent

Run it with adk web

cd adk-streaming/app

uv run adk web

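Open it in a browser and select google_search_agent.

You can see that it pulls information from weather sites and the like via Google Search.

That said, nothing so far feels much like streaming. Ticking this checkbox does turn streaming on, though.

The next step seems to be optional, but it builds a custom app with streaming support. Let's continue.

As instructed, create main.py and the static directory.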

mkdir static
touch main.py
touch static/index.html

Put the following in main.py.

main.py

import os
import json
import asyncio

from pathlib import Path
from dotenv import load_dotenv

from google.genai.types import (
    Part,
    Content,
)

from google.adk.runners import Runner
from google.adk.agents import LiveRequestQueue
from google.adk.agents.run_config import RunConfig
from google.adk.sessions.in_memory_session_service import InMemorySessionService

from fastapi import FastAPI, WebSocket
from fastapi.staticfiles import StaticFiles
from fastapi.responses import FileResponse

from google_search_agent.agent import root_agent

#
# ADK streaming
#

# Load the Gemini API key
load_dotenv()

APP_NAME = "ADK Streaming example"
session_service = InMemorySessionService()


def start_agent_session(session_id: str):
    """Start an agent session."""

    # Create the session
    session = session_service.create_session(
        app_name=APP_NAME,
        user_id=session_id,
        session_id=session_id,
    )

    # Create the Runner
    runner = Runner(
        app_name=APP_NAME,
        agent=root_agent,
        session_service=session_service,
    )

    # Set the response modality to TEXT
    run_config = RunConfig(response_modalities=["TEXT"])

    # Create the LiveRequestQueue
    live_request_queue = LiveRequestQueue()

    # Start the agent session
    live_events = runner.run_live(
        session=session,
        live_request_queue=live_request_queue,
        run_config=run_config,
    )
    return live_events, live_request_queue


async def agent_to_client_messaging(websocket, live_events):
    """Agent-to-client communication."""
    while True:
        async for event in live_events:
            # Turn complete
            if event.turn_complete:
                await websocket.send_text(json.dumps({"turn_complete": True}))
                print("[TURN COMPLETE]")

            # Interrupted
            if event.interrupted:
                await websocket.send_text(json.dumps({"interrupted": True}))
                print("[INTERRUPTED]")

            # Read the content and its first part
            part: Part = (
                event.content and event.content.parts and event.content.parts[0]
            )
            if not part or not event.partial:
                continue

            # Get the text
            text = event.content and event.content.parts and event.content.parts[0].text
            if not text:
                continue

            # Send the text to the client
            await websocket.send_text(json.dumps({"message": text}))
            print(f"[AGENT TO CLIENT]: {text}")
            await asyncio.sleep(0)


async def client_to_agent_messaging(websocket, live_request_queue):
    """Client-to-agent communication."""
    while True:
        text = await websocket.receive_text()
        content = Content(role="user", parts=[Part.from_text(text=text)])
        live_request_queue.send_content(content=content)
        print(f"[CLIENT TO AGENT]: {text}")
        await asyncio.sleep(0)


#
# FastAPI web app
#

app = FastAPI()

STATIC_DIR = Path("static")
app.mount("/static", StaticFiles(directory=STATIC_DIR), name="static")


@app.get("/")
async def root():
    """Serve index.html."""
    return FileResponse(os.path.join(STATIC_DIR, "index.html"))


@app.websocket("/ws/{session_id}")
async def websocket_endpoint(websocket: WebSocket, session_id: int):
    """WebSocket endpoint for the client."""

    # Wait for the client to connect
    await websocket.accept()
    print(f"Client #{session_id} connected")

    # Start the agent session
    session_id = str(session_id)
    live_events, live_request_queue = start_agent_session(session_id)

    # Start the tasks
    agent_to_client_task = asyncio.create_task(
        agent_to_client_messaging(websocket, live_events)
    )
    client_to_agent_task = asyncio.create_task(
        client_to_agent_messaging(websocket, live_request_queue)
    )
    await asyncio.gather(agent_to_client_task, client_to_agent_task)

    # Disconnected
    print(f"Client #{session_id} disconnected")

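This is a web app that exposes a WebSocket endpoint with FastAPI. Roughly, it works as follows.

On startup
Load the API key
Initialize the InMemorySessionService that holds the sessions (it is later passed to the runner)
Expose the following endpoints with FastAPI
- @app.get("/"): returns index.html, which contains the client-side script, as static content
- @app.websocket("/ws/{session_id}"): the WebSocket endpoint that carries the chat
When a browser accesses the app,
- The WebSocket connection is established.
- start_agent_session initializes the agent
  - The InMemorySessionService instance is used to create a session instance
  - A runner instance is created from the Runner class
    - The agent is attached to the runner
  - A queue for the session is created with LiveRequestQueue()
  - The runner is started
- Two async tasks are started and loop
  - client_to_agent_messaging: handles messages from the client to the agent
    - Receives text from the client, builds a message, and puts it on the queue
  - agent_to_client_messaging: handles messages from the agent to the client
    - Watches the events
      - Detects turn completion and interruption
      - Reads the message from each event and returns it to the client
After that, each side simply handles its half of the exchange with the browser.

I've glossed over some details, but the gist is that the agent, the queue, and the events all hang off the runner.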
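
The two-task structure can be reproduced with the standard library alone, which makes the data flow easier to see. The following is a toy sketch, not ADK code: `fake_agent` is a hypothetical stand-in for the runner that just echoes input word by word, two `asyncio.Queue` objects stand in for LiveRequestQueue and the live event stream, and a list stands in for the WebSocket.

```python
import asyncio


async def fake_agent(requests: asyncio.Queue, events: asyncio.Queue) -> None:
    """Stand-in for the runner: read requests, emit partial events per word."""
    while True:
        text = await requests.get()
        if text is None:  # sentinel: the client has gone away
            await events.put(None)
            return
        for word in f"echo {text}".split():
            await events.put({"message": word, "partial": True})
        await events.put({"turn_complete": True})


async def client_to_agent(messages, requests: asyncio.Queue) -> None:
    """Push each client message onto the request queue, then signal the end."""
    for text in messages:
        await requests.put(text)
    await requests.put(None)


async def agent_to_client(events: asyncio.Queue, sent: list) -> None:
    """Forward every event to the client (here: append it to a list)."""
    while (event := await events.get()) is not None:
        sent.append(event)


async def run_session(messages):
    """Run the agent and both messaging tasks concurrently, as main.py does."""
    requests, events, sent = asyncio.Queue(), asyncio.Queue(), []
    await asyncio.gather(
        fake_agent(requests, events),
        client_to_agent(messages, requests),
        agent_to_client(events, sent),
    )
    return sent
```

Unlike main.py, this sketch uses a None sentinel so that all three coroutines finish; in the real app the loops run until the WebSocket closes.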

Next, create static/index.html with the following content.

static/index.html

<!doctype html>
<html>
  <head>
    <title>ADK Streaming example</title>
  </head>

  <body>
    <h1>ADK Streaming example</h1>
    <div
      id="messages"
      style="height: 300px; overflow-y: auto; border: 1px solid black"></div>
    <br />

    <form id="messageForm">
      <label for="message">Message:</label>
      <input type="text" id="message" name="message" />
      <button type="submit" id="sendButton" disabled>Send</button>
    </form>
  </body>

  <script>
    // Connect to the server over WebSocket
    const sessionId = Math.random().toString().substring(10);
    const ws_url = "ws://" + window.location.host + "/ws/" + sessionId;
    let ws = new WebSocket(ws_url);

    // Get the DOM elements
    const messageForm = document.getElementById("messageForm");
    const messageInput = document.getElementById("message");
    const messagesDiv = document.getElementById("messages");
    let currentMessageId = null;

    // WebSocket handlers
    function addWebSocketHandlers(ws) {
      ws.onopen = function () {
        console.log("WebSocket connection opened.");
        document.getElementById("sendButton").disabled = false;
        document.getElementById("messages").textContent = "Connection opened";
        addSubmitHandler(this);
      };

      ws.onmessage = function (event) {
        // Parse the incoming message
        const packet = JSON.parse(event.data);
        console.log(packet);

        // Check for turn completion
        // Once the turn is complete, the next packet starts a new message
        if (packet.turn_complete && packet.turn_complete == true) {
          currentMessageId = null;
          return;
        }

        // For a new turn, add a new message element
        if (currentMessageId == null) {
          currentMessageId = Math.random().toString(36).substring(7);
          const message = document.createElement("p");
          message.id = currentMessageId;
          // Append the message element to messagesDiv
          messagesDiv.appendChild(message);
        }

        // Append the text to the current message element
        const message = document.getElementById(currentMessageId);
        message.textContent += packet.message;

        // Scroll messagesDiv to the bottom
        messagesDiv.scrollTop = messagesDiv.scrollHeight;
      };

      // If the connection is closed, try to reconnect
      ws.onclose = function () {
        console.log("WebSocket connection closed.");
        document.getElementById("sendButton").disabled = true;
        document.getElementById("messages").textContent = "Connection closed";
        setTimeout(function () {
          console.log("Reconnecting...");
          ws = new WebSocket(ws_url);
          addWebSocketHandlers(ws);
        }, 5000);
      };

      ws.onerror = function (e) {
        console.log("WebSocket error: ", e);
      };
    }
    addWebSocketHandlers(ws);

    // Add the submit handler
    function addSubmitHandler(ws) {
      messageForm.onsubmit = function (e) {
        e.preventDefault();
        const message = messageInput.value;
        if (message) {
          const p = document.createElement("p");
          p.textContent = "> " + message;
          messagesDiv.appendChild(p);
          ws.send(message);
          messageInput.value = "";
        }
        return false;
      };
    }
  </script>
</html>

Now start it.

uv run uvicorn main:app --reload

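The exchange looks like this, and you can see the responses being streamed. Also, because this runs over WebSocket, sending input while the LLM is still responding should presumably interrupt it (I haven't tried this).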
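
The wire format between main.py and the page is small: {"message": ...} packets carry text chunks, {"turn_complete": true} ends a turn, and {"interrupted": true} signals an interruption. The following is a minimal sketch of how a client could assemble those packets into one string per turn, mirroring the onmessage logic above; `assemble_turns` is a hypothetical helper, and skipping packets without a "message" key is a choice made for this sketch rather than what the page's script does.

```python
import json


def assemble_turns(raw_packets):
    """Group streamed {"message": ...} packets into one string per turn."""
    turns, current = [], None
    for raw in raw_packets:
        packet = json.loads(raw)
        if packet.get("turn_complete"):
            if current is not None:
                turns.append(current)
            current = None
            continue
        if "message" not in packet:
            continue  # e.g. {"interrupted": true} carries no text
        current = (current or "") + packet["message"]
    if current is not None:
        turns.append(current)  # turn still in progress when the stream ended
    return turns
```

Feeding it the packets logged in the browser console should reproduce the paragraphs shown on the page.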

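So this was not just a matter of switching on a streaming parameter and reading the response in chunks; it turned out to be something much broader. The bar went up quite a bit compared with the previous Quickstart.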