
LiveKit Agent notes

https://docs.livekit.io/agents/build/

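agent session
- Description
  - Main orchestrator of a voice agent
- Responsibilities
  - Collects user input
  - Manages the voice pipeline (the chain of processing that calls the LLM on user input and returns the output to the user)
  - Each session needs at least one agent
agent
- Responsibilities
  - Core AI logic (instructions, tool use, and so on)
track
- Media stream established between the user and the agent. Audio and images can be sent and received.
RoomIO
- Description
  - Utility class that connects an agent session to a LiveKit room
- Responsibilities
  - When the agent session is initialized, lets all room participants subscribe to the available audio tracks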

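from livekit.agents import AgentSession, Agent, JobContext, RoomInputOptions
from livekit.plugins import openai, cartesia, deepgram, noise_cancellation, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel


# `await` and `ctx` only make sense inside the job entrypoint, so the snippet is wrapped in one.
async def entrypoint(ctx: JobContext):
    # Wire up the voice pipeline: STT -> LLM -> TTS, plus VAD and turn detection.
    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(),
        tts=cartesia.TTS(),
        vad=silero.VAD.load(),
        # MultilingualModel is imported by name above, so no module prefix is used.
        turn_detection=MultilingualModel(),
    )

    # Start the session in the job's room with a single agent.
    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful voice AI assistant."),
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )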

nukopy

What an agent session can do
Workflows
Manage (orchestrate) complex tasks with multiple agents

https://docs.livekit.io/agents/build/workflows/
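
As a rough illustration of the multi-agent idea, here is a minimal plain-Python sketch of one agent handing a conversation over to another. It does not use the LiveKit API: `ToyAgent`, `ToySession`, and the "invoice" routing rule are all hypothetical names invented for this example, and a real workflow would follow the linked docs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent: a name plus a reply function."""
    name: str
    # Returns (reply, next_agent); next_agent is None when no handoff happens.
    handle: Callable[[str], Tuple[str, Optional["ToyAgent"]]]


@dataclass
class ToySession:
    """Hypothetical stand-in for a session that tracks the active agent."""
    agent: ToyAgent
    history: List[str] = field(default_factory=list)

    def say(self, user_text: str) -> str:
        reply, next_agent = self.agent.handle(user_text)
        self.history.append(f"{self.agent.name}: {reply}")
        if next_agent is not None:
            self.agent = next_agent  # hand control to the next agent
        return reply


billing = ToyAgent("billing", lambda text: (f"Billing here. You said: {text}", None))


def greeter_handle(text: str) -> Tuple[str, Optional[ToyAgent]]:
    # Assumed routing rule: anything mentioning an invoice goes to billing.
    if "invoice" in text.lower():
        return "Transferring you to billing.", billing
    return "Hello! How can I help?", None


greeter = ToyAgent("greeter", greeter_handle)

session = ToySession(agent=greeter)
first = session.say("Hi")
second = session.say("I have a question about my invoice")
third = session.say("It is too high")
print(first, second, third, sep="\n")
print("active agent:", session.agent.name)
```

The session owns the "who is active" state, so each agent only has to decide whether to answer or to hand off.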
Tool definition & use
Use tools to call external services and inject custom logic
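
To make the tool idea concrete, here is a small plain-Python sketch of registering a function as a tool and dispatching a model-issued call to it. It is not the LiveKit API: the `tool` decorator, the `TOOLS` registry, `dispatch`, and the JSON shape of the call are assumptions made for this example only.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry: tool name -> Python callable.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so a tool call can find it by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def lookup_weather(location: str) -> dict:
    """Stand-in for an external service call; returns canned data."""
    return {"location": location, "forecast": "sunny"}


def dispatch(tool_call_json: str) -> Any:
    """Run a tool call of the assumed form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise KeyError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])


result = dispatch('{"name": "lookup_weather", "arguments": {"location": "Tokyo"}}')
print(result)
```

The model only ever sees tool names and arguments; the actual side effects stay inside ordinary functions.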
Pipeline nodes
Customize the behavior of each component in the voice pipeline
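
One way to picture per-component customization is a pipeline whose stages are overridable methods. The sketch below is plain Python with stubbed stages, not the LiveKit API: `ToyPipeline`, `ShoutingPipeline`, and the simplified method signatures are assumptions for illustration (real nodes work on streams of audio and text rather than whole values).

```python
class ToyPipeline:
    """Hypothetical STT -> LLM -> TTS pipeline; each stage is an overridable method."""

    def stt_node(self, audio: bytes) -> str:
        # Stub: pretend the audio bytes are UTF-8 text.
        return audio.decode("utf-8")

    def llm_node(self, text: str) -> str:
        # Stub: echo the input instead of calling a real model.
        return f"You said: {text}"

    def tts_node(self, text: str) -> bytes:
        # Stub: pretend UTF-8 bytes are synthesized audio.
        return text.encode("utf-8")

    def run_turn(self, audio: bytes) -> bytes:
        # One user turn: speech in, speech out.
        return self.tts_node(self.llm_node(self.stt_node(audio)))


class ShoutingPipeline(ToyPipeline):
    """Customizes a single stage: post-process the LLM output before TTS."""

    def llm_node(self, text: str) -> str:
        return super().llm_node(text).upper()


default_out = ToyPipeline().run_turn(b"hello")
custom_out = ShoutingPipeline().run_turn(b"hello")
print(default_out)
print(custom_out)
```

Overriding one method leaves the other stages untouched, which is the point of exposing the pipeline as separate nodes.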