TiDB Starter Data Service V3 (2) Chat2Query
今日は前回の記事に続いて新しくなったData Serviceを触っていきます。Standard Data App
を使ってみました。
今日はChat2Query Data App
を使っていきます。
Standard Data App と Chat2Query Data App の違い
Data Appのモードには2種類、Standard
とChat2Query
があります。Standard
は前回の記事で触ったようにあらかじめ実行を行いたいSQLを保存しておいて、HTTPエンドポイントからリクエストを行い実行します。エンドポイントは保存するSQLごとに行われます。動的な外部からのSQLを流し込んだ実行は行えません。
一方Chat2Query
モードは自然言語のリクエストによってSQLが自動生成され実行されます。例えばshow top10 data in the items table
のようにリクエストを出すことができます。
さっそくやってみる
1. Data App の作成
まずはChat2Query
モードでData Appを作成します。
自動で大量のエンドポイントが作成されます。
2. Data Summary の実行
Chat2Queryが正しく動作するためには、まずData Summary
を実行してChat2Query用統計を取得しておく必要があります。
Show Code Example
をクリックして出力されたcurlコマンドを実行します。
curl --digest --user ${PUBLIC_KEY}:${PRIVATE_KEY} --request POST 'https://us-west-2.data.tidbcloud.com/api/v1beta/app/chat2query-UfpVDXbp/endpoint/v3/dataSummaries'\
--header 'content-type: application/json'\
--data-raw '{
"cluster_id": "10080985057875672215",
"database": "test",
"description": "<Your data summary description>",
"reuse": false,
"default": false
}'
{"code":200,"msg":"","result":{"data_summary_id":350233,"job_id":"4c673622fdc74186bcba4de41d736967"}}
ジョブが生成され実行されます。小さい環境であれば一瞬で完了しますが、データが多い環境では少し時間がかかります。出力されたjob_id
でステータスを確認可能です。
最新APIはv3
なのですが後方互換があるためジョブのステータスはv2用APIで確認します。
curl --digest --user ${PUBLIC_KEY}:${PRIVATE_KEY} --request GET 'https://us-west-2.data.tidbcloud.com/api/v1beta/app/chat2query-UfpVDXbp/endpoint/v2/jobs/{job_id}'\
--header 'content-type: application/json'
{job_id}
は先ほど出力された値に置き換えて実行します。レスポンスでものすごい長いjsonが戻ります。
response
{
"code": 200,
"msg": "",
"result": {
"ended_at": 1759818715,
"job_id": "4c673622fdc74186bcba4de41d736967",
"reason": "",
"result": {
"cluster_id": "10080985057875672215",
"data_summary_id": 350233,
"database": "test",
"default": true,
"description": {
"system": "This data source is a comprehensive system designed for managing user accounts, content, application metadata, products, and user profiles within a web application or content management system. It encompasses secure user authentication, authorization, content organization, and metadata management, enabling efficient and controlled access to data and resources.",
"user": "<Your data summary description>"
},
"keywords": [
"User Management",
"Content Management",
"Application Metadata",
"Product Catalog",
"User Profiles"
],
"relationships": {
"auth_group": [
{
"referenced_table": "auth_group",
"referenced_table_column": "id",
"referencing_table": "auth_user_groups",
"referencing_table_column": "group_id"
},
{
"referenced_table": "auth_group",
"referenced_table_column": "id",
"referencing_table": "auth_group_permissions",
"referencing_table_column": "group_id"
}
],
"auth_group_permissions": [
{
"referenced_table": "auth_group",
"referenced_table_column": "id",
"referencing_table": "auth_group_permissions",
"referencing_table_column": "group_id"
},
{
"referenced_table": "auth_permission",
"referenced_table_column": "id",
"referencing_table": "auth_group_permissions",
"referencing_table_column": "permission_id"
}
],
"auth_permission": [
{
"referenced_table": "django_content_type",
"referenced_table_column": "id",
"referencing_table": "auth_permission",
"referencing_table_column": "content_type_id"
},
{
"referenced_table": "auth_permission",
"referenced_table_column": "id",
"referencing_table": "auth_user_user_permissions",
"referencing_table_column": "permission_id"
}
],
"auth_user": [
{
"referenced_table": "auth_user",
"referenced_table_column": "id",
"referencing_table": "auth_user_groups",
"referencing_table_column": "user_id"
},
{
"referenced_table": "auth_user",
"referenced_table_column": "id",
"referencing_table": "auth_user_user_permissions",
"referencing_table_column": "user_id"
},
{
"referenced_table": "auth_user",
"referenced_table_column": "id",
"referencing_table": "django_admin_log",
"referencing_table_column": "user_id"
}
],
"auth_user_groups": [
{
"referenced_table": "auth_user",
"referenced_table_column": "id",
"referencing_table": "auth_user_groups",
"referencing_table_column": "user_id"
},
{
"referenced_table": "auth_group",
"referenced_table_column": "id",
"referencing_table": "auth_user_groups",
"referencing_table_column": "group_id"
}
],
"auth_user_user_permissions": [
{
"referenced_table": "auth_user",
"referenced_table_column": "id",
"referencing_table": "auth_user_user_permissions",
"referencing_table_column": "user_id"
},
{
"referenced_table": "auth_permission",
"referenced_table_column": "id",
"referencing_table": "auth_user_user_permissions",
"referencing_table_column": "permission_id"
}
],
"django_admin_log": [
{
"referenced_table": "django_content_type",
"referenced_table_column": "id",
"referencing_table": "django_admin_log",
"referencing_table_column": "content_type_id"
},
{
"referenced_table": "auth_user",
"referenced_table_column": "id",
"referencing_table": "django_admin_log",
"referencing_table_column": "user_id"
}
],
"django_content_type": [
{
"referenced_table": "django_content_type",
"referenced_table_column": "id",
"referencing_table": "auth_permission",
"referencing_table_column": "content_type_id"
},
{
"referenced_table": "django_content_type",
"referenced_table_column": "id",
"referencing_table": "django_admin_log",
"referencing_table_column": "content_type_id"
}
]
},
"status": "done",
"summary": "Comprehensive user, content, and metadata management system.",
"tables": {
"auth_group": {
"columns": {
"id": {
"description": "A unique identifier for each group record in the table.",
"name": "id"
},
"name": {
"description": "The name or label associated with a particular group.",
"name": "name"
}
},
"description": "The 'auth_group' table is used to store and manage different groups or roles that users can be assigned to within an application or system. These groups can be used to control access permissions and privileges for various features or resources.",
"name": "auth_group"
},
"auth_group_permissions": {
"columns": {
"group_id": {
"description": "A reference to a specific group, likely representing a user group or role.",
"name": "group_id"
},
"id": {
"description": "A unique identifier for each record in the table.",
"name": "id"
},
"permission_id": {
"description": "A reference to a specific permission, likely representing an action or access level granted to the associated group.",
"name": "permission_id"
}
},
"description": "This table represents the permissions assigned to different user groups or roles within a system. It allows for the management and assignment of access levels and actions that specific groups are authorized to perform.",
"name": "auth_group_permissions"
},
"auth_permission": {
"columns": {
"codename": {
"description": "A codified representation of the permission, typically following a convention like 'action_model'.",
"name": "codename"
},
"content_type_id": {
"description": "A foreign key referencing the content type (model) to which the permission applies.",
"name": "content_type_id"
},
"id": {
"description": "A unique identifier for each permission record.",
"name": "id"
},
"name": {
"description": "A descriptive name for the permission, indicating the action and object it applies to.",
"name": "name"
}
},
"description": "The auth_permission table is used to manage and store permissions related to various actions that can be performed on different models or objects within an application or system. It serves as a central repository for defining and controlling access rights.",
"name": "auth_permission"
},
"auth_user": {
"columns": {
"date_joined": {
"description": "The date and time when the user account was created.",
"name": "date_joined"
},
"email": {
"description": "The user's email address.",
"name": "email"
},
"first_name": {
"description": "The user's first name.",
"name": "first_name"
},
"id": {
"description": "A unique identifier for each user record.",
"name": "id"
},
"is_active": {
"description": "A flag indicating whether the user's account is active.",
"name": "is_active"
},
"is_staff": {
"description": "A flag indicating whether the user is a staff member.",
"name": "is_staff"
},
"is_superuser": {
"description": "A flag indicating whether the user has superuser privileges.",
"name": "is_superuser"
},
"last_login": {
"description": "The date and time of the user's last login.",
"name": "last_login"
},
"last_name": {
"description": "The user's last name.",
"name": "last_name"
},
"password": {
"description": "The encrypted password for the user's account.",
"name": "password"
},
"username": {
"description": "The unique username for the user's account.",
"name": "username"
}
},
"description": "This table represents user accounts in an authentication system, storing essential information such as usernames, passwords, personal details, account status, and timestamps.",
"name": "auth_user"
}
// 以降のテーブル (auth_user_groups, auth_user_user_permissions, django_admin_log, django_content_type, django_migrations, django_session, documents, items, sample_project_document, users) も同様に整形済み
}
},
"status": "done"
}
}
ステータスがdone
になっていれば完了です。
3. Chat2Query の実行
以下のコマンドを実行します。
curl --digest --user ${PUBLIC_KEY}:${PRIVATE_KEY} --request POST 'https://us-west-2.data.tidbcloud.com/api/v1beta/app/chat2query-UfpVDXbp/endpoint/v3/chat2data'\
--header 'content-type: application/json'\
--data-raw '{
"cluster_id": "10080985057875672215",
"database": "test",
"question": "show all users in the users table"
}'
{"code":200,"msg":"","result":{"cluster_id":"10080985057875672215","database":"test","job_id":"edac2b034f7040cb985f129d918f4773","session_id":350234}}
これでSQLが自動生成され実行中です。
4. 実行結果の確認
さきほど Data Summaryのjob_idでステータスを確認したのと同様に出力されたjob_idでステータスを確認します。
curl --digest --user ${PUBLIC_KEY}:${PRIVATE_KEY} --request GET 'https://us-west-2.data.tidbcloud.com/api/v1beta/app/chat2query-UfpVDXbp/endpoint/v2/jobs/edac2b034f7040cb985f129d918f4773'\
--header 'content-type: application/json'
{
"code": 200,
"msg": "",
"result": {
"ended_at": 1759819216,
"job_id": "edac2b034f7040cb985f129d918f4773",
"reason": "",
"result": {
"assumptions": [],
"chart_options": {},
"clarified_task": "The task is to retrieve all user records from the 'users' table.",
"data": {
"columns": [
{ "col": "id" },
{ "col": "username" },
{ "col": "email" },
{ "col": "created_at" }
],
"rows": [
["1", "alice", "alice@example.com", "2025-10-05 03:01:04"],
["2", "bob", "bob@example.com", "2025-10-05 03:01:04"]
]
},
"description": "",
"sql": "SELECT * FROM `users`;",
"sql_error": null,
"status": "done",
"task_id": "0",
"type": "data_retrieval"
},
"status": "done"
}
}
"sql": "SELECT * FROM
users;",
としてSQLが自動生成されていることがわかります。
例えば質問文を"question": "count item number"
に変更すると"sql": "SELECT COUNT(*) FROM
items;",
として異なるSQLが実行されます。
ちなみに"レコードの総数は?"など日本語にも対応しています。
Discussion