[ADK] Trying the Gemini Deep Research Agent from ADK

Published on 2025/12/18
Agent Development Kit
tech
Hello, this is Ohashi, also known as Suntory.
This article is the 2025/12/18 entry in Series 2 of the ADK Advent Calendar 2025.

Series 1 of the ADK Advent Calendar 2025 is full, but Series 2 still has plenty of open slots, so please join in.
https://qiita.com/advent-calendar/2025/adk
In this article, I try out the Gemini Deep Research Agent, released on 2025/12/11, from ADK.

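Audience
Python users who have worked with ADK
Anyone who wants to learn about the Gemini Deep Research Agent

In this article, "Agent" (capitalized) refers to an ADK Agent, and "agent" (lowercase) refers to any other AI agent.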

About the Gemini Deep Research Agent
The Gemini Deep Research Agent, announced on 2025/12/11, is a predefined AI agent available through the Interactions API of the Gemini API.
https://blog.google/technology/developers/deep-research-agent-gemini-api/

https://ai.google.dev/gemini-api/docs/deep-research?hl=ja

https://ai.google.dev/gemini-api/docs/interactions?hl=ja&ua=chat
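This agent is powered by Gemini 3 Pro and is designed to carry out complex, long-running context gathering and synthesis tasks.

Specifically, it has the following characteristics.

Autonomous research: It autonomously plans, executes, and synthesizes multi-step research tasks.

Iterative process: It reads search results, identifies knowledge gaps, and searches again, repeating this cycle to produce a detailed report.

Background execution: Research tasks can take several minutes to finish, so asynchronous background execution is recommended.

Where LLMs so far have been good at generating one-shot answers, the Deep Research Agent is a powerful tool that takes over the time-consuming work of investigation and analysis.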
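The background-execution pattern described above comes down to starting a task and polling it until it reaches a terminal state. The following is a minimal sketch of that loop using only the standard library. `FakeInteraction`, `FakeInteractionsClient`, and `wait_for_report` are hypothetical names invented for illustration, and the status strings are assumptions modeled on the ones used later in this article. It is not the google-genai client itself.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeInteraction:
    """Hypothetical stand-in for an Interactions API resource."""
    id: str
    status: str
    text: str = ""


class FakeInteractionsClient:
    """Hypothetical in-memory client: 'in_progress' twice, then 'completed'."""

    def __init__(self) -> None:
        self._statuses = iter(["in_progress", "in_progress", "completed"])

    def get(self, interaction_id: str) -> FakeInteraction:
        status = next(self._statuses, "completed")
        text = "final report" if status == "completed" else ""
        return FakeInteraction(id=interaction_id, status=status, text=text)


def wait_for_report(client, interaction_id, interval=0.01, timeout=5.0):
    """Poll until the interaction reaches a terminal state or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        interaction = client.get(interaction_id)
        if interaction.status == "completed":
            return interaction.text
        if interaction.status in ("failed", "cancelled"):
            raise RuntimeError(f"research ended with status: {interaction.status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result within {timeout} seconds")
        time.sleep(interval)


report = wait_for_report(FakeInteractionsClient(), "demo-id")
print(report)
```

Adding an explicit deadline keeps a stuck research task from blocking the caller forever, which matters when a single run can take minutes.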

ADK's Interactions API Support
ADK 1.21.0, released recently, added support for the Interactions API.

I covered this in a previous article.
https://zenn.dev/soundtricker/articles/f04e5c20e84c74
See that article for details. In short, ADK currently supports the Interactions API itself, but the standard Gemini model class does not support the agent property needed to use the Gemini Deep Research Agent.
As a result, you cannot use the Gemini Deep Research Agent through ADK's Interactions API support as it stands.
In this article, we create a custom model class so that the Gemini Deep Research Agent can be used.

(Note: I suspect ADK will add official support before long...)

Environment Setup
First, create the project and set up the Python environment.
mkdir deep-research-agent
cd deep-research-agent

# Set up the Python environment
mise use python@3.12
mise use uv@latest
uv python pin 3.12
uv init --app .
uv add google-adk

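Implementing the Core Features
Creating a Custom Model Class
The current problem is that google.adk.models.google_llm.Gemini accepts a model property but not an agent property, so there is no way to pass an agent to the Interactions API.

To work around this, we create a custom model class extended to accept an agent property.
Create models/gemini.py with the following code.
models/gemini.py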
from typing import AsyncGenerator

from google.adk.models.google_llm import Gemini
from google.adk.models.llm_request import LlmRequest
from google.adk.models.llm_response import LlmResponse


class GeminiAgent(Gemini):
    """Gemini model wrapper that can target a predefined Interactions API agent."""

    # Name of the Interactions API agent to call.
    agent: str | None = None

    def __init__(self, agent: str | None = None, **kwargs):
        # Accept `agent` and pass it through to the base model's initializer.
        super().__init__(agent=agent, **kwargs)

    async def _generate_content_via_interactions(
        self,
        llm_request: LlmRequest,
        stream: bool,
    ) -> AsyncGenerator[LlmResponse, None]:
        # Interactions API helper from models/interactions_utils.py (created in the next step).
        from .interactions_utils import generate_content_via_interactions_with_agent

        async for llm_response in generate_content_via_interactions_with_agent(
            api_client=self.api_client,
            llm_request=llm_request,
            stream=stream,
            agent=self.agent,
        ):
            yield llm_response

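Creating a Utility Function for the Interactions API
Next, we create the utility function that actually calls the Interactions API.

ADK already ships google.adk.models.interactions_utils, but it does not support the agent property or Deep Research-specific parameters either, so we extend it.
Create models/interactions_utils.py with the following code. (You may place it elsewhere, such as utils/interactions_utils.py, as long as you adjust the import in gemini.py accordingly.)
models/interactions_utils.py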
from __future__ import annotations

import asyncio
import logging
from typing import TYPE_CHECKING, AsyncGenerator, Optional

from google.adk.models.interactions_utils import *
# LlmResponse is instantiated at runtime below, so import it unconditionally.
from google.adk.models.llm_response import LlmResponse
from google.genai import types

if TYPE_CHECKING:
    from google.adk.models.llm_request import LlmRequest
    from google.genai import Client

logger = logging.getLogger("google_adk." + __name__)

_NEW_LINE = "\n"


async def generate_content_via_interactions_with_agent(
    api_client: Client,
    llm_request: LlmRequest,
    stream: bool,
    agent: str | None = None,
) -> AsyncGenerator[LlmResponse, None]:
    """Generate content using the interactions API.

    The interactions API provides stateful conversation capabilities. When
    previous_interaction_id is set in the request, the API chains interactions
    instead of requiring full conversation history.

    Note: Context caching is not used with the Interactions API since it
    maintains conversation state via previous_interaction_id.

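    Args:
        api_client: The Google GenAI client.
        llm_request: The LLM request to send.
        stream: Whether to stream the response.
        agent: Name of the Interactions API agent to call. When None, the
            request falls back to ADK's standard interactions path.

    Yields:
        LlmResponse objects converted from interaction responses.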
    """

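    if agent is None:
        # No agent given: fall back to ADK's standard interactions path.
        async for llm_response in generate_content_via_interactions(
            api_client, llm_request, stream
        ):
            yield llm_response
        # Return only after the fallback stream has been fully consumed.
        return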

    # When previous_interaction_id is set, only send the latest continuous
    # user messages (the current turn) instead of full conversation history
    contents = llm_request.contents
    if llm_request.previous_interaction_id and contents:
        contents = _get_latest_user_contents(contents)

    # Convert contents to interactions API format
    input_turns = convert_contents_to_turns(contents)
    interaction_tools = convert_tools_config_to_interactions_format(llm_request.config)

    # Get previous interaction ID for stateful conversations
    previous_interaction_id = llm_request.previous_interaction_id

    # Log the request
    logger.info(
        "Sending request via interactions API, agent: %s, stream: %s, "
        "previous_interaction_id: %s",
        agent,
        stream,
        previous_interaction_id,
    )
    # Track the current interaction ID from responses
    current_interaction_id: Optional[str] = None

    if stream:
        # Streaming mode
        responses = await api_client.aio.interactions.create(
            agent=agent,
            input=input_turns,
            stream=True,
            tools=interaction_tools if interaction_tools else None,
            previous_interaction_id=previous_interaction_id,
            background=True,
            agent_config={"type": "deep-research", "thinking_summaries": "auto"},
        )

        aggregated_parts: list[types.Part] = []
        async for event in responses:
            # Log the streaming event
            logger.debug(build_interactions_event_log(event))

            # Extract interaction ID from event if available
            if hasattr(event, "id") and event.id:
                current_interaction_id = event.id
            llm_response = convert_interaction_event_to_llm_response(
                event, aggregated_parts, current_interaction_id
            )
            if llm_response:
                yield llm_response

        # Final aggregated response
        if aggregated_parts:
            yield LlmResponse(
                content=types.Content(role="model", parts=aggregated_parts),
                partial=False,
                turn_complete=True,
                finish_reason=types.FinishReason.STOP,
                interaction_id=current_interaction_id,
            )
            return

    else:
        # Non-streaming mode
        interaction = await api_client.aio.interactions.create(
            agent=agent,
            input=input_turns,
            stream=False,
            tools=interaction_tools if interaction_tools else None,
            previous_interaction_id=previous_interaction_id,
            background=True,
        )

        # Log the response
        logger.info("Interaction response received from the model.")
        logger.debug(build_interactions_response_log(interaction))

        while True:
            current_interaction_id = interaction.id
            interaction = await api_client.aio.interactions.get(interaction.id)

            llm_response = convert_interaction_to_llm_response(interaction)
            if llm_response:
                yield llm_response

            if interaction.status == "completed":
                # The final report has already been yielded above.
                logger.info("Deep Research interaction completed.")
                break
            elif interaction.status in ["failed", "cancelled"]:
                logger.error("Interaction ended with status: %s", interaction.status)
                break
            # Wait before polling again.
            await asyncio.sleep(10)


def _get_latest_user_contents(
    contents: list[types.Content],
) -> list[types.Content]:
    """Extract the latest turn contents for interactions API.

    For interactions API with previous_interaction_id, we only need to send
    the current turn's messages since prior history is maintained by
    the interaction chain.

    Special handling for function_result: When the user content contains a
    function_result (response to a model's function_call), we must also include
    the preceding model content with the function_call. The Interactions API
    needs both the function_call and function_result to properly match call_ids.

    Args:
    contents: The full list of content messages.

    Returns:
    A list containing the contents needed for the current turn.
    """
    if not contents:
        return []

    # Find the latest continuous user messages from the end
    latest_user_contents = []
    for content in reversed(contents):
        if content.role == "user":
            latest_user_contents.insert(0, content)
        else:
            # Stop when we hit a non-user message
            break

    # Check if the user contents contain a function_result
    has_function_result = False
    for content in latest_user_contents:
        if content.parts:
            for part in content.parts:
                if part.function_response is not None:
                    has_function_result = True
                    break
        if has_function_result:
            break

    # If we have a function_result, we also need the preceding model content
    # with the function_call so the API can match the call_id
    if has_function_result and len(contents) > len(latest_user_contents):
        # Get the index where user contents start
        user_start_idx = len(contents) - len(latest_user_contents)
        if user_start_idx > 0:
            # Check if the content before user contents is a model turn with
            # function_call
            preceding_content = contents[user_start_idx - 1]
            if preceding_content.role == "model" and preceding_content.parts:
                for part in preceding_content.parts:
                    if part.function_call is not None:
                        # Include the model's function_call turn before user's
                        # function_result
                        return [preceding_content] + latest_user_contents

    return latest_user_contents
This utility calls the Deep Research Agent with background=True and, in non-streaming mode, polls until the interaction completes.
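The history-trimming step in `_get_latest_user_contents` above can be easier to follow with a small worked example. The sketch below mirrors its logic with plain dataclasses. `Part`, `Content`, and `latest_turn` here are simplified, hypothetical stand-ins, not the `google.genai.types` classes, and the string fields only mark whether a part carries a function call or result.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Part:
    """Simplified stand-in for a content part."""
    text: str | None = None
    function_call: str | None = None
    function_response: str | None = None


@dataclass
class Content:
    """Simplified stand-in for one message in the conversation."""
    role: str
    parts: list[Part] = field(default_factory=list)


def latest_turn(contents: list[Content]) -> list[Content]:
    """Return the trailing user messages, plus the model's function_call turn if needed."""
    trailing_user: list[Content] = []
    for content in reversed(contents):
        if content.role != "user":
            break
        trailing_user.insert(0, content)

    has_result = any(
        part.function_response is not None
        for content in trailing_user
        for part in content.parts
    )
    start = len(contents) - len(trailing_user)
    if has_result and start > 0:
        preceding = contents[start - 1]
        if preceding.role == "model" and any(
            part.function_call is not None for part in preceding.parts
        ):
            # Keep the call next to its result so the two can be matched.
            return [preceding] + trailing_user
    return trailing_user


chat = [
    Content("user", [Part(text="Hi")]),
    Content("model", [Part(text="Hello")]),
    Content("user", [Part(text="Research topic X")]),
]
tool_chat = [
    Content("user", [Part(text="What is the weather?")]),
    Content("model", [Part(function_call="get_weather")]),
    Content("user", [Part(function_response="sunny")]),
]

print([c.role for c in latest_turn(chat)])
print([c.role for c in latest_turn(tool_chat)])
```

In the plain chat only the last user message is sent, because the earlier turns are already held by the interaction chain. In the tool case the preceding model turn is included as well.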

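Implementing the Agents
Next, we implement the Agents using the functionality built above.

We keep it simple and create a root agent with a sub-agent that just calls the Deep Research Agent.

Creating the Root Agent
Generate a template with the CLI.

The Deep Research Agent is not currently available on Vertex AI, so use an API key.
uv run adk create deep_research_agent
Edit the generated deep_research_agent/agent.py as follows.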
deep_research_agent/agent.py
from google.adk.agents.llm_agent import Agent
from .sub_agents.deep_research.agent import root_agent as deep_research_root_agent

root_agent = Agent(
    model="gemini-2.5-flash",
    name="root_agent",
    description="A helpful assistant for user questions.",
    instruction="""
    You are an AI assistant that answers user questions.
    When the user asks for in-depth research, use deep_research_agent and show intermediate progress while answering.
    """,
    sub_agents=[deep_research_root_agent],
)

Creating the Deep Research Agent as a Sub-Agent
Next, create the sub-agent that actually handles Deep Research.
uv run adk create deep_research_agent/sub_agents/deep_research
Edit deep_research_agent/sub_agents/deep_research/agent.py as follows so that it uses the GeminiAgent class created earlier.
deep_research_agent/sub_agents/deep_research/agent.py
from models.gemini import GeminiAgent
from google.adk.agents.llm_agent import Agent

root_agent = Agent(
    model=GeminiAgent(agent='deep-research-pro-preview-12-2025', use_interactions_api=True),
    name='deep_research_agent',
    description='A Deep Research Agent.',
    instruction='Answer user questions with deep research.',
)

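Testing
Let's test it through the Dev UI.
uv run adk web .
In the Dev UI, asking something like "Research ... in detail" makes the root agent call the Deep Research Agent, and the research begins.
Current issues I noticed while using it:

Slow: Research can take several minutes, and the UI appears idle during that time, so the user experience (UX) has room for improvement.

Timeouts: When the Deep Research Agent is called in streaming mode, the API request sometimes never returns and eventually times out.

Async implementation: This implementation uses aio, but the same behavior appeared with the synchronous implementation from the official documentation, so more investigation may be needed.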
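One possible way to ease the "UI appears idle" issue is to emit periodic progress messages while the long task is still running. The following is a minimal sketch of that idea using only asyncio. `with_heartbeat` and `fake_research` are hypothetical names for illustration, and how such heartbeats would be surfaced in ADK or the Dev UI is not covered here.

```python
import asyncio
from typing import AsyncGenerator, Awaitable


async def with_heartbeat(
    work: Awaitable[str],
    interval: float,
    message: str = "Still researching...",
) -> AsyncGenerator[str, None]:
    """Yield `message` every `interval` seconds until `work` finishes, then its result."""
    task = asyncio.ensure_future(work)
    try:
        while True:
            # asyncio.wait returns on timeout without cancelling the task.
            done, _pending = await asyncio.wait({task}, timeout=interval)
            if done:
                yield task.result()
                return
            yield message
    finally:
        # If the consumer stops early, do not leave the task running.
        if not task.done():
            task.cancel()


async def fake_research() -> str:
    """Hypothetical stand-in for a long-running research call."""
    await asyncio.sleep(0.05)
    return "final report"


async def main() -> list[str]:
    return [item async for item in with_heartbeat(fake_research(), interval=0.02)]


events = asyncio.run(main())
print(events)
```

The number of heartbeat messages depends on timing, but the last item is always the task's result.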

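Summary
In this article, I tried the Gemini Deep Research Agent through ADK.

Being able to plug a powerful agent like the Deep Research Agent into the ADK framework with relatively little effort is a big step forward.
That said, handling long-running tasks and the UX around them still seem to need work.

I hope functionality like the custom model class implemented here gets built into ADK, and I will keep watching how it develops.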

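Announcements
There is a Japanese-language Discord community for ADK developers. If you are interested in exchanging information and discussing ADK, please join!
https://discord.gg/BKpGRzjtqZ
I also publish a podcast on Mondays, Wednesdays, and Fridays that explains the latest ADK commit logs and release notes in an accessible way. If you want to keep up with ADK, give it a listen. During the Christmas season, episodes go out every day.
https://www.youtube.com/playlist?list=PL0Zc2RFDZsM_MkHOzWNJpaT4EH5fQxA8n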