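Analyzing Dify with Cline
Dify is an open-source platform for building and operating AI applications with no-code/low-code tooling.
The API side is written in Python (Flask) and the frontend in TypeScript (Next.js), and the codebase exceeds 700,000 lines (measured with cloc on the main branch as of 2025-03-31).
I was curious how far a repository of a reasonably large size could be analyzed into class diagrams, ER diagrams, state diagrams, and so on. I also wanted to learn how an AI application is put together by analyzing Dify's system, so I gave it a try.
For the model I used the recently released Gemini 2.5 Pro. It currently accepts up to 1 million input tokens, and I wanted to see how far it could go in analyzing a large codebase.
The initial prompt
First, I instructed it to output system diagrams in mermaid format with a prompt like the following.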
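Check the directories and files under the api directory, and create the directory structure and the following files in mermaid format in the api/doc directory.
・Class diagram
・Component diagram (C4)
・Data flow diagram (flowchart)
・State diagram
・ER diagram
Also, scan the files in app/controllers and output the API definition in Swagger format.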
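This produced the following files in api/doc.
Note that from here on I basically treat the generated files as trustworthy, but they are only the AI's interpretation, and their accuracy needs to be verified separately.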
- class_diagram_extended.md
- component_diagram_c4_context.md
- directory_structure.md
- er_diagram.md
- state_diagram_workflow.md
- main_api_spec.yaml
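Since these files are AI-generated, a quick structural check before reading them can be useful. The following is a minimal sketch, assuming each generated .md file wraps its diagram in a mermaid fenced block (an assumption about the output format, not something Cline guarantees). The helper names `extract_mermaid_blocks` and `count_blocks_per_file` are hypothetical.

```python
import re
from pathlib import Path

# Built programmatically so this example can itself live inside a fenced block.
FENCE = "`" * 3


def extract_mermaid_blocks(markdown_text: str) -> list[str]:
    """Return the body of every mermaid fenced block found in the text."""
    pattern = re.compile(
        re.escape(FENCE) + r"mermaid[ \t]*\n(.*?)" + re.escape(FENCE),
        re.DOTALL,
    )
    return [body.strip() for body in pattern.findall(markdown_text)]


def count_blocks_per_file(doc_dir: str) -> dict[str, int]:
    """Map each .md file name in doc_dir to its number of mermaid blocks."""
    counts = {}
    for path in sorted(Path(doc_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        counts[path.name] = len(extract_mermaid_blocks(text))
    return counts
```

Calling `count_blocks_per_file("api/doc")` would then show at a glance whether any file came back without a diagram. It says nothing about whether the diagram is correct.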
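To get an overall picture of the system, I started with er_diagram.md.
It is easier to follow along with the app running, so I moved to the docker directory and started the app with docker compose up.
As shipped, the DB port is not exposed to the local machine, so the tables and data cannot be inspected. I added the following port mapping to the db service in docker-compose.yml before starting it.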
ports:
- "5432:5432"
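To confirm that the port mapping took effect before reaching for a database client, a plain TCP check is enough. This is a minimal sketch using only the standard library. The function name `is_port_open` is hypothetical, and it only verifies that something is listening, not that it is PostgreSQL.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For the mapping above, `is_port_open("localhost", 5432)` should return True once the db container is up.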
ER diagram
With tenant at the top, app, account, dataset, and others appear to hang off it.
The main things you can create from the Dify UI are workflows, chatbots, agents, and knowledge.
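- Workflow:
  - A feature for freely designing and building complex AI processing flows by visually connecting various "nodes" such as multiple AI models, logic (conditional branching, etc.), external tools (API calls, etc.), and knowledge retrieval. It is used to build more advanced and flexible AI applications that go beyond simple prompt-and-response, for example automating a series of tasks or producing a final output through multiple steps.
- Chatbot:
  - A feature for creating AI applications that interact with users conversationally. You can build various conversational AIs with relatively little effort, such as basic Q&A, conversation on a specific topic, or answers grounded in a specific information source via the knowledge feature described below, and use them by embedding them in a website, for example.
- Agent:
  - A feature that aims to give an AI (an LLM in particular) a specific goal and the ability to plan autonomously toward it, selecting and running the tools it needs (API integrations, web search, calculation, etc.). It is used to build more proactive, autonomous AI applications that take an instruction from the user and carry out the task across multiple steps and tool calls.
- Knowledge:
  - A feature for uploading and registering your own documents (PDFs, text files, websites, etc.) and structured data (CSV, etc.) to create a knowledge base the AI can consult. Because the AI searches and references this knowledge when generating an answer, it can respond based on domain-specific knowledge or recent information that is not in its general training data, a technique known as Retrieval-Augmented Generation (RAG).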
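The retrieve-then-generate idea behind RAG can be sketched in a few lines. This is a toy illustration, not Dify's implementation: the bag-of-words `embed` function is a stand-in for a real embedding model, and all function names and sample texts are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (real systems use a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Prepend the retrieved context to the question before calling an LLM."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt returned by `build_prompt` is what would be sent to the LLM. The model itself is unchanged, and only its input is augmented with retrieved text.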
Workflows and chatbots have their data stored in the workflows table.
Workflow example:
The execution flow is laid out as nodes, and it appears to be stored as node data in workflows.graph.
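To get a feel for what a graph column like this encodes, the sketch below derives an execution order from a node/edge structure. The JSON shape (`nodes` with `id` and `data.type`, `edges` with `source` and `target`) is an assumption for illustration, and the sample graph is made up rather than copied from a real workflows.graph value.

```python
import json
from collections import defaultdict, deque


def execution_order(graph_json: str) -> list[str]:
    """Topologically sort node ids so each node follows its upstream nodes."""
    graph = json.loads(graph_json)
    indegree = {node["id"]: 0 for node in graph["nodes"]}
    downstream = defaultdict(list)
    for edge in graph["edges"]:
        downstream[edge["source"]].append(edge["target"])
        indegree[edge["target"]] += 1

    # Start from nodes with no incoming edges (Kahn's algorithm).
    queue = deque(nid for nid, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        nid = queue.popleft()
        order.append(nid)
        for nxt in downstream[nid]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return order


# Hypothetical three-node workflow: start -> llm -> end.
SAMPLE_GRAPH = json.dumps({
    "nodes": [
        {"id": "start", "data": {"type": "start"}},
        {"id": "llm", "data": {"type": "llm"}},
        {"id": "end", "data": {"type": "end"}},
    ],
    "edges": [
        {"source": "start", "target": "llm"},
        {"source": "llm", "target": "end"},
    ],
})
```

A real engine also has to handle branching, loops, and per-node configuration, which this sketch ignores.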
Agents were saved in app_model_configs, and knowledge in datasets.
Once you have an overview of the ER diagram, you can take in the system from a reasonably high level.
Component diagram (C4)
The diagram shows the User operating the Web UI, which connects through the Dify API to the Database, Worker, and various Providers.
A few of these items deserve additional notes.
- Vector Database:
  - A database that stores data such as text as "vectors" so that it can be searched by meaning. In Dify it is mainly used by the knowledge base feature to make document search fast and accurate (e.g., Weaviate, Qdrant).
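- Celery Worker (distributed task queue):
  - A component that runs time-consuming work (such as building the knowledge index) asynchronously in the background.
- Providers:
  - The external services Dify integrates with. Dify calls these external service APIs to deliver advanced functionality.
  - LLM Provider: text generation (e.g., OpenAI, Anthropic)
  - Embedding Provider: text vectorization
  - Moderation Provider: content filtering
  - TTS Provider: text-to-speech
  - ASR Provider: speech recognition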
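The role of the background worker can be illustrated without Celery itself. The sketch below uses the standard library's thread pool as a stand-in: the caller enqueues a job and gets control back immediately, while the slow work runs elsewhere. `BackgroundQueue` and `index_document` are hypothetical names. A real Celery setup additionally involves a message broker and separate worker processes.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def index_document(doc_id: str) -> str:
    """Placeholder for a slow job such as chunking and embedding a document."""
    return f"indexed:{doc_id}"


class BackgroundQueue:
    """Tiny in-process stand-in for a task queue (not Celery)."""

    def __init__(self, workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def enqueue(self, doc_id: str) -> Future:
        """Return immediately; the job runs on a worker thread."""
        return self._pool.submit(index_document, doc_id)

    def shutdown(self) -> None:
        """Wait for queued jobs to finish and release the threads."""
        self._pool.shutdown(wait=True)
```

The API handler would call `enqueue` and respond to the user right away, checking the returned future (or, with Celery, the task state) later.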
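State diagram
This one diagrams how a workflow is created and executed.
First a draft is created and edited, then published. When a published workflow is executed it moves to running, and from there transitions to a state such as succeeded, failed, stopped, or partially succeeded.
There are surely other state transitions as well, but this was the only one output this time.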
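The transitions described in this section can be written down as a small table-driven state machine. This sketch follows the generated diagram as summarized here, not Dify's source code. The state names and the `advance` helper are assumptions for illustration.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    STOPPED = "stopped"
    PARTIAL_SUCCEEDED = "partial-succeeded"


# Allowed transitions; states missing from the table are terminal.
TRANSITIONS = {
    State.DRAFT: {State.DRAFT, State.PUBLISHED},  # editing keeps it a draft
    State.PUBLISHED: {State.RUNNING},
    State.RUNNING: {
        State.SUCCEEDED,
        State.FAILED,
        State.STOPPED,
        State.PARTIAL_SUCCEEDED,
    },
}


def advance(current: State, target: State) -> State:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the transitions in one table makes it easy to compare against the diagram line by line when verifying the AI's output.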
API definition
Swagger definition
Copy the following into Swagger Editor or a similar tool to view it.
openapi: 3.0.2
info:
title: Dify API (Main Endpoints)
version: '1.0'
description: A selection of main API endpoints for Dify. This is not exhaustive.
servers:
- url: /v1 # Assuming API base path is /v1, adjust if necessary
tags:
- name: Authentication (Console)
description: Console user authentication and related operations.
- name: Apps (Console)
description: Application management operations in the console.
- name: App Execution (Web)
description: Running applications via the web API.
- name: Workflow (Console)
description: Workflow creation, publishing, and debugging operations.
- name: Datasets (Console)
description: Dataset (Knowledge Base) management operations.
paths:
/login:
post:
summary: Login
description: Authenticate user with email and password.
tags:
- Authentication (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
email:
type: string
format: email
description: User's email address.
password:
type: string
format: password
description: User's password.
remember_me:
type: boolean
default: false
invite_token:
type: string
nullable: true
language:
type: string
default: "en-US"
required:
- email
- password
responses:
'200':
description: Login successful, returns access and refresh tokens.
content:
application/json:
schema:
type: object
properties:
result:
type: string
example: success
data:
type: object
properties:
access_token:
type: string
refresh_token:
type: string
'400':
description: Bad Request (e.g., invalid email format, missing fields).
'401':
description: Unauthorized (e.g., invalid password, account banned, account not found and registration disabled).
'429':
description: Too Many Requests (Login rate limit exceeded).
/logout:
get:
summary: Logout
description: Logout the current user.
tags:
- Authentication (Console)
responses:
'200':
description: Logout successful.
/email-code-login:
post:
summary: Send Email Login Code
description: Sends a verification code to the user's email for login.
tags:
- Authentication (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
email:
type: string
format: email
language:
type: string
default: "en-US"
required:
- email
responses:
'200':
description: Email sent successfully, returns a temporary token.
'400':
description: Bad Request.
'404':
description: Account not found (if registration is disabled).
'429':
description: Too Many Requests (Email sending rate limit exceeded).
/email-code-login/validity:
post:
summary: Login with Email Code
description: Logs in or registers a user using the email and verification code.
tags:
- Authentication (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
email:
type: string
format: email
code:
type: string
description: Verification code sent via email.
token:
type: string
description: Temporary token received from /email-code-login.
required:
- email
- code
- token
responses:
'200':
description: Login/Registration successful, returns access and refresh tokens.
'400':
description: Bad Request (e.g., invalid code, invalid token, invalid email).
/reset-password:
post:
summary: Send Reset Password Email
description: Sends an email with instructions to reset the password.
tags:
- Authentication (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
email:
type: string
format: email
language:
type: string
default: "en-US"
required:
- email
responses:
'200':
description: Email sent successfully, returns a temporary token.
'400':
description: Bad Request.
'404':
description: Account not found (if registration is disabled).
'429':
description: Too Many Requests (Password reset rate limit exceeded).
/refresh-token:
post:
summary: Refresh Access Token
description: Obtain a new access token using a refresh token.
tags:
- Authentication (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
refresh_token:
type: string
required:
- refresh_token
responses:
'200':
description: Token refreshed successfully.
'401':
description: Unauthorized (Invalid refresh token).
/apps:
get:
summary: Get App List
description: Retrieves a paginated list of applications for the current tenant.
tags:
- Apps (Console)
parameters:
- name: page
in: query
schema:
type: integer
default: 1
- name: limit
in: query
schema:
type: integer
default: 20
- name: mode
in: query
schema:
type: string
enum: [chat, workflow, agent-chat, channel, all]
default: all
- name: name
in: query
schema:
type: string
- name: tag_ids
in: query
schema:
type: string
description: Comma-separated list of UUIDs.
- name: is_created_by_me
in: query
schema:
type: boolean
responses:
'200':
description: A list of applications.
'401':
description: Unauthorized.
post:
summary: Create App
description: Creates a new application.
tags:
- Apps (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
name:
type: string
description:
type: string
nullable: true
mode:
type: string
enum: [chat, agent-chat, advanced-chat, workflow, completion]
icon:
type: string
nullable: true
icon_background:
type: string
nullable: true
required:
- name
- mode
responses:
'201':
description: Application created successfully.
'400':
description: Bad Request (e.g., missing mode).
'403':
description: Forbidden (User does not have permission).
/apps/{app_id}:
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
get:
summary: Get App Detail
description: Retrieves details for a specific application.
tags:
- Apps (Console)
responses:
'200':
description: Application details.
'401':
description: Unauthorized.
'404':
description: App not found.
put:
summary: Update App
description: Updates details for a specific application.
tags:
- Apps (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
name:
type: string
description:
type: string
nullable: true
icon:
type: string
nullable: true
icon_background:
type: string
nullable: true
required:
- name
responses:
'200':
description: Application updated successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
delete:
summary: Delete App
description: Deletes a specific application.
tags:
- Apps (Console)
responses:
'204':
description: Application deleted successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
/apps/{app_id}/copy:
post:
summary: Copy App
description: Copies an existing application to create a new one.
tags:
- Apps (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
requestBody:
content:
application/json:
schema:
type: object
properties:
name:
type: string
nullable: true
description:
type: string
nullable: true
icon:
type: string
nullable: true
icon_background:
type: string
nullable: true
responses:
'201':
description: Application copied successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: Source app not found.
/apps/{app_id}/export:
get:
summary: Export App DSL
description: Exports the application configuration in DSL (YAML) format.
tags:
- Apps (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
- name: include_secret
in: query
schema:
type: boolean
default: false
responses:
'200':
description: Application DSL.
content:
application/json: # Or potentially application/yaml
schema:
type: object
properties:
data:
type: string # YAML content as a string
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
/apps/{app_id}/name:
post:
summary: Update App Name
description: Updates the name of a specific application.
tags:
- Apps (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
name:
type: string
required:
- name
responses:
'200':
description: App name updated successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
/apps/{app_id}/icon:
post:
summary: Update App Icon
description: Updates the icon of a specific application.
tags:
- Apps (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
requestBody:
content:
application/json:
schema:
type: object
properties:
icon:
type: string
nullable: true
icon_background:
type: string
nullable: true
responses:
'200':
description: App icon updated successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
/apps/{app_id}/site-enable:
post:
summary: Update App Site Status
description: Enables or disables the public web app site for an application.
tags:
- Apps (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
enable_site:
type: boolean
required:
- enable_site
responses:
'200':
description: App site status updated successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
/apps/{app_id}/api-enable:
post:
summary: Update App API Status
description: Enables or disables API access for an application.
tags:
- Apps (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
enable_api:
type: boolean
required:
- enable_api
responses:
'200':
description: App API status updated successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
/completion-messages:
post:
summary: Run Completion App
description: Executes a completion mode application.
tags:
- App Execution (Web)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
inputs:
type: object
description: Input variables for the app.
query:
type: string
description: User query (optional, depends on app config).
files:
type: array
items:
type: object # Define file object structure if known
nullable: true
response_mode:
type: string
enum: [blocking, streaming]
description: |
'blocking' waits for the full response.
'streaming' returns chunks.
user:
type: string
description: End user identifier.
required:
- inputs
- user # Assuming user identifier is always required for web APIs
responses:
'200':
description: Execution result (blocking or streaming).
'400':
description: Bad Request.
'401':
description: Unauthorized (Invalid API Key or user).
'404':
description: App not found or not a completion app.
'500':
description: Internal Server Error during execution.
/completion-messages/{task_id}/stop:
post:
summary: Stop Completion Task
description: Stops a running completion task.
tags:
- App Execution (Web)
parameters:
- name: task_id
in: path
required: true
schema:
type: string
responses:
'200':
description: Stop signal sent successfully.
'401':
description: Unauthorized.
'404':
description: App not found or not a completion app.
/chat-messages:
post:
summary: Run Chat/Agent/Workflow App
description: Sends a message to a chat, agent, or advanced-chat mode application.
tags:
- App Execution (Web)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
inputs:
type: object
description: Input variables for the app.
query:
type: string
description: User's message/query.
files:
type: array
items:
type: object # Define file object structure if known
nullable: true
response_mode:
type: string
enum: [blocking, streaming]
conversation_id:
type: string
format: uuid
nullable: true
description: Existing conversation ID to continue, or null/omit to start new.
user:
type: string
description: End user identifier.
required:
- inputs
- query
- user # Assuming user identifier is always required for web APIs
responses:
'200':
description: Execution result (blocking or streaming).
'400':
description: Bad Request.
'401':
description: Unauthorized (Invalid API Key or user).
'404':
description: App/Conversation not found or not a chat-based app.
'500':
description: Internal Server Error during execution.
/chat-messages/{task_id}/stop:
post:
summary: Stop Chat Task
description: Stops a running chat/agent/workflow task.
tags:
- App Execution (Web)
parameters:
- name: task_id
in: path
required: true
schema:
type: string
responses:
'200':
description: Stop signal sent successfully.
'401':
description: Unauthorized.
'404':
description: App not found or not a chat-based app.
/apps/{app_id}/workflows/draft:
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
get:
summary: Get Draft Workflow
description: Retrieves the draft version of the workflow for an app.
tags:
- Workflow (Console)
responses:
'200':
description: Draft workflow details.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App or draft workflow not found.
post:
summary: Sync Draft Workflow
description: Saves or updates the draft workflow definition.
tags:
- Workflow (Console)
requestBody:
required: true
content:
application/json: # or text/plain containing JSON
schema:
type: object
properties:
graph:
type: object
description: Workflow graph structure (nodes, edges).
features:
type: object
description: Workflow features configuration.
environment_variables:
type: array
items:
type: object # Define variable structure
nullable: true
conversation_variables:
type: array
items:
type: object # Define variable structure
nullable: true
hash:
type: string
nullable: true
description: Optional hash for optimistic locking.
required:
- graph
- features
responses:
'200':
description: Draft workflow synced successfully.
'400':
description: Bad Request or Draft workflow conflict (hash mismatch).
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
'415':
description: Unsupported Media Type (if not JSON or text/plain with JSON).
/apps/{app_id}/workflows/publish:
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
get:
summary: Get Published Workflow
description: Retrieves the currently published workflow for an app.
tags:
- Workflow (Console)
responses:
'200':
description: Published workflow details.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App or published workflow not found.
post:
summary: Publish Workflow
description: Publishes the current draft workflow.
tags:
- Workflow (Console)
requestBody:
content:
application/json:
schema:
type: object
properties:
marked_name:
type: string
maxLength: 20
nullable: true
marked_comment:
type: string
maxLength: 100
nullable: true
responses:
'200':
description: Workflow published successfully.
'400':
description: Bad Request (e.g., validation error).
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App or draft workflow not found.
/apps/{app_id}/workflows:
get:
summary: Get Published Workflow History
description: Retrieves a paginated list of published workflow versions.
tags:
- Workflow (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
- name: page
in: query
schema:
type: integer
default: 1
- name: limit
in: query
schema:
type: integer
default: 20
- name: user_id
in: query
schema:
type: string
format: uuid
nullable: true
- name: named_only
in: query
schema:
type: boolean
default: false
responses:
'200':
description: List of published workflows.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found.
/apps/{app_id}/workflows/{workflow_id}:
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
- name: workflow_id
in: path
required: true
schema:
type: string
format: uuid # Or potentially a version string
patch:
summary: Update Workflow Attributes
description: Updates attributes (like marked name/comment) of a specific workflow version.
tags:
- Workflow (Console)
requestBody:
content:
application/json:
schema:
type: object
properties:
marked_name:
type: string
maxLength: 20
nullable: true
marked_comment:
type: string
maxLength: 100
nullable: true
responses:
'200':
description: Workflow updated successfully.
'400':
description: Bad Request (e.g., validation error).
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: Workflow not found.
delete:
summary: Delete Workflow Version
description: Deletes a specific published workflow version (cannot delete draft this way).
tags:
- Workflow (Console)
responses:
'204':
description: Workflow deleted successfully.
'400':
description: Bad Request (e.g., trying to delete draft or workflow in use).
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: Workflow not found.
/apps/{app_id}/convert-to-workflow:
post:
summary: Convert App to Workflow Mode
description: Converts a basic chat or completion app to advanced-chat or workflow mode.
tags:
- Workflow (Console)
parameters:
- name: app_id
in: path
required: true
schema:
type: string
format: uuid
requestBody:
content:
application/json:
schema:
type: object
properties:
name:
type: string
nullable: true
icon:
type: string
nullable: true
icon_background:
type: string
nullable: true
responses:
'200':
description: Conversion successful, returns the new app ID.
content:
application/json:
schema:
type: object
properties:
new_app_id:
type: string
format: uuid
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: App not found or already in workflow mode.
/datasets:
get:
summary: Get Dataset List
description: Retrieves a paginated list of datasets (knowledge bases).
tags:
- Datasets (Console)
parameters:
- name: page
in: query
schema:
type: integer
default: 1
- name: limit
in: query
schema:
type: integer
default: 20
- name: ids
in: query
schema:
type: array
items:
type: string
format: uuid
style: form
explode: false
- name: keyword
in: query
schema:
type: string
- name: tag_ids
in: query
schema:
type: array
items:
type: string
format: uuid
style: form
explode: false
- name: include_all
in: query
schema:
type: boolean
default: false
responses:
'200':
description: A list of datasets.
'401':
description: Unauthorized.
post:
summary: Create Dataset
description: Creates a new empty dataset (knowledge base).
tags:
- Datasets (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
name:
type: string
minLength: 1
maxLength: 40
description:
type: string
maxLength: 400
nullable: true
indexing_technique:
type: string
enum: [high_quality, economy, null]
nullable: true
# Add fields for external knowledge if needed
required:
- name
responses:
'201':
description: Dataset created successfully.
'400':
description: Bad Request (e.g., name validation failed, duplicate name).
'401':
description: Unauthorized.
'403':
description: Forbidden.
/datasets/{dataset_id}:
parameters:
- name: dataset_id
in: path
required: true
schema:
type: string
format: uuid
get:
summary: Get Dataset Detail
description: Retrieves details for a specific dataset.
tags:
- Datasets (Console)
responses:
'200':
description: Dataset details.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: Dataset not found.
patch:
summary: Update Dataset
description: Updates details for a specific dataset.
tags:
- Datasets (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
name:
type: string
minLength: 1
maxLength: 40
nullable: true
description:
type: string
maxLength: 400
nullable: true
indexing_technique:
type: string
enum: [high_quality, economy, null]
nullable: true
permission:
type: string
enum: [only_me, all_team_members, partial_members]
nullable: true
embedding_model:
type: string
nullable: true
embedding_model_provider:
type: string
nullable: true
retrieval_model:
type: object # Define retrieval model structure
nullable: true
partial_member_list:
type: array
items:
type: string # User IDs?
nullable: true
responses:
'200':
description: Dataset updated successfully.
'400':
description: Bad Request (e.g., validation failed).
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: Dataset not found.
delete:
summary: Delete Dataset
description: Deletes a specific dataset.
tags:
- Datasets (Console)
responses:
'204':
description: Dataset deleted successfully.
'400':
description: Bad Request (Dataset is in use).
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: Dataset not found.
/datasets/{dataset_id}/use-check:
get:
summary: Check Dataset Usage
description: Checks if a dataset is currently being used by any applications.
tags:
- Datasets (Console)
parameters:
- name: dataset_id
in: path
required: true
schema:
type: string
format: uuid
responses:
'200':
description: Usage status.
content:
application/json:
schema:
type: object
properties:
is_using:
type: boolean
'401':
description: Unauthorized.
'404':
description: Dataset not found.
/datasets/{dataset_id}/related-apps:
get:
summary: Get Related Apps
description: Retrieves a list of applications that use a specific dataset.
tags:
- Datasets (Console)
parameters:
- name: dataset_id
in: path
required: true
schema:
type: string
format: uuid
responses:
'200':
description: List of related applications.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: Dataset not found.
/datasets/indexing-estimate:
post:
summary: Estimate Indexing Cost
description: Estimates the cost (e.g., tokens) for indexing provided documents.
tags:
- Datasets (Console)
requestBody:
required: true
content:
application/json:
schema:
type: object
# Define the complex request body structure here based on DocumentService.estimate_args_validate
properties:
info_list:
type: object
process_rule:
type: object
indexing_technique:
type: string
enum: [high_quality, economy, null]
required:
- info_list
- process_rule
- indexing_technique
responses:
'200':
description: Indexing cost estimation.
'400':
description: Bad Request (Invalid arguments or estimation error).
'401':
description: Unauthorized.
/datasets/api-keys:
get:
summary: List Dataset API Keys
description: Retrieves all API keys specifically for dataset access.
tags:
- Datasets (Console)
responses:
'200':
description: List of API keys.
'401':
description: Unauthorized.
post:
summary: Create Dataset API Key
description: Creates a new API key for dataset access.
tags:
- Datasets (Console)
responses:
'200':
description: Newly created API key.
'400':
description: Bad Request (e.g., maximum key limit reached).
'401':
description: Unauthorized.
'403':
description: Forbidden.
/datasets/api-keys/{api_key_id}:
delete:
summary: Delete Dataset API Key
description: Deletes a specific dataset API key.
tags:
- Datasets (Console)
parameters:
- name: api_key_id
in: path
required: true
schema:
type: string
format: uuid
responses:
'204':
description: API key deleted successfully.
'401':
description: Unauthorized.
'403':
description: Forbidden.
'404':
description: API key not found.
components: {} # Schemas, SecuritySchemes etc. would go here in a full spec
Broadly classified, there appear to be authentication APIs, app APIs, workflow APIs, and dataset APIs.
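As a way of spot-checking the generated spec against a running instance, a request can be assembled from the schema and inspected before it is sent. This is a minimal sketch using only the standard library. The base URL is a placeholder (the spec itself marks `/v1` as an assumption), the credentials are dummy values, and `build_json_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_json_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given path."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Body fields follow the /login schema in the generated spec above.
login_request = build_json_request(
    "http://localhost/v1",  # placeholder; adjust to the real API base path
    "/login",
    {"email": "user@example.com", "password": "secret", "remember_me": False},
)
```

Passing `login_request` to `urllib.request.urlopen` would actually send it. Comparing the real response with the documented `200`/`400`/`401` cases is one inexpensive way to validate parts of the AI-generated definition.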
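Class diagram
Comparing it with the API definition above should make it possible to grasp relationships such as which service handles which API.
Summary
That was a quick look over the outline of the system. What did you think?
Personally, I found it a fairly good onboarding experience for an OSS project, and I feel it has become easier to get my bearings when contributing.
One way to improve an AI agent's understanding is to have it read the system's specification documents, and it might be interesting to feed it visualized specifications like these.
From the standpoint of understanding AI applications, the concept of a vector database is not something I deal with day to day, and my impression is that understanding it is key to Dify as a system. I'd like to dig into that when I have more time.
Also, this analysis covered only the server-side api directory; I did not analyze the frontend in the web directory at all. I'd like to look at that too if I find the time, for example how the node rendering is implemented.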