How to create a Vertex AI Feature Store feature group with Terraform

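Published on 2025/09/18
In this post, I looked into how to create a Vertex AI Feature Store feature group from Terraform. A feature group needs a BigQuery dataset and table as its backend, so we will create the BigQuery resources and the feature group together.
I collect my "trying Terraform on Google Cloud" experiments in the scrap below, so please take a look as well!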
https://zenn.dev/akasan/scraps/c6182a7d763bc8

Let's build it!
The resource definition for the feature group created here is based on the following documentation.
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/vertex_ai_feature_group

Defining variables
First, I created variables.tf as follows to set values such as the Google Cloud project ID and region.
variables.tf
variable "project_id" {
  description = "The Google Cloud project ID"
  type        = string
}

variable "region" {
  description = "The Google Cloud region"
  type        = string
  default     = "us-central1"
}

variable "feature_group_name" {
  description = "Feature Group name"
  type        = string
  default     = "example_feature_store"
}

variable "bigquery_table_id" {
  description = "BigQuery table id"
  type        = string
  default     = "feature_table"
}

variable "bigquery_dataset_id" {
  description = "BigQuery dataset id"
  type        = string
  default     = "feature_dataset"
}
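Each variable is defined for the following purpose.

project_id: the Google Cloud project ID

region: the region in which the resources are created

feature_group_name: the name of the feature group

bigquery_table_id: the ID of the BigQuery table that stores the features

bigquery_dataset_id: the ID of the BigQuery dataset that contains the table named by bigquery_table_id
Only project_id is supplied at run time; everything else uses its default value.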
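Since project_id has no default, Terraform prompts for it on every plan and apply unless it is supplied some other way. One option, shown here as a minimal sketch, is a terraform.tfvars file, which Terraform loads automatically from the working directory. The project ID below is a placeholder I made up for illustration, not a value from this article.

```hcl
# terraform.tfvars
# Loaded automatically by terraform plan / terraform apply.
# "my-sample-project" is a hypothetical placeholder; replace it with your own project ID.
project_id = "my-sample-project"
```

The same value can also be passed with the -var command-line option or the TF_VAR_project_id environment variable if you would rather not keep it in a file.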

Defining resources
Next, define the feature group and the BigQuery resources in main.tf.
main.tf
provider "google" {
  project = var.project_id
  region  = var.region
}

resource "google_vertex_ai_feature_group" "feature_group" {
  name        = var.feature_group_name
  description = "A sample feature group"
  region      = var.region
  labels = {
    label-one = "value-one"
  }
  big_query {
    big_query_source {
      # The source table must have a column named 'feature_timestamp' of type TIMESTAMP.
      input_uri = "bq://${google_bigquery_table.sample_table.project}.${google_bigquery_table.sample_table.dataset_id}.${google_bigquery_table.sample_table.table_id}"
    }
    entity_id_columns = ["user_id"]
  }
}

resource "google_bigquery_dataset" "sample_dataset" {
  dataset_id    = var.bigquery_dataset_id
  friendly_name = "test"
  description   = "This is a test description"
  location      = "US"
}

resource "google_bigquery_table" "sample_table" {
  deletion_protection = false
  dataset_id          = google_bigquery_dataset.sample_dataset.dataset_id
  table_id            = var.bigquery_table_id

  schema = <<EOF
  [
      {
          "name": "user_id",
          "type": "STRING",
          "mode": "NULLABLE"
      },
      {
          "name": "feature_timestamp",
          "type": "TIMESTAMP",
          "mode": "NULLABLE"
      }
  ]
  EOF
}
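First, the google_vertex_ai_feature_group resource defines the feature group. The key part is the big_query block, which specifies the BigQuery source where the features are stored. big_query_source.input_uri points to the table that holds the features, and entity_id_columns names the column that uniquely identifies an entity (a role similar to a primary key in an RDB). Here that column is named user_id.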
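Because input_uri is assembled from attributes of google_bigquery_table.sample_table, Terraform picks up an implicit dependency and creates the table before the feature group. If the long interpolation is hard to read, it could be factored into a local value, as in the sketch below. The local name feature_source_uri is my own and does not appear in the original main.tf.

```hcl
# Hypothetical local value; the name feature_source_uri is an assumption for illustration.
locals {
  # Referencing the table's attributes keeps the implicit dependency
  # on google_bigquery_table.sample_table.
  feature_source_uri = "bq://${google_bigquery_table.sample_table.project}.${google_bigquery_table.sample_table.dataset_id}.${google_bigquery_table.sample_table.table_id}"
}

# Inside big_query_source, the argument would then read:
#   input_uri = local.feature_source_uri
```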
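Next come the BigQuery resources. The google_bigquery_dataset resource creates the dataset, and the google_bigquery_table resource creates the table that stores the features. The important part is the table definition in schema. A feature group needs two elements: an entity ID column and feature_timestamp. The former is the uniquely identifying key configured on the feature group above (user_id in this example), and the latter is a timestamp indicating when the data was recorded.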
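The schema in main.tf is written as a JSON heredoc. As an alternative sketch, the same two columns could be expressed with Terraform's built-in jsonencode function, which lets Terraform check the structure syntactically and avoids hand-written JSON. The local name feature_table_schema is an assumption of mine, not part of the original configuration.

```hcl
# Hypothetical local value; feature_table_schema is a name chosen for illustration.
locals {
  feature_table_schema = jsonencode([
    {
      # Entity ID column, referenced by entity_id_columns in the feature group.
      name = "user_id"
      type = "STRING"
      mode = "NULLABLE"
    },
    {
      # Timestamp column expected by the feature group source table.
      name = "feature_timestamp"
      type = "TIMESTAMP"
      mode = "NULLABLE"
    },
  ])
}

# In google_bigquery_table.sample_table, the heredoc would then be replaced with:
#   schema = local.feature_table_schema
```

Any feature columns you add later would go into the same list as additional objects.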
Now let's run terraform plan and look at the result.
terraform plan
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_bigquery_dataset.sample_dataset will be created
  + resource "google_bigquery_dataset" "sample_dataset" {
      + creation_time              = (known after apply)
      + dataset_id                 = "feature_dataset"
      + default_collation          = (known after apply)
      + delete_contents_on_destroy = false
      + description                = "This is a test description"
      + effective_labels           = {
          + "goog-terraform-provisioned" = "true"
        }
      + etag                       = (known after apply)
      + friendly_name              = "test"
      + id                         = (known after apply)
      + is_case_insensitive        = (known after apply)
      + last_modified_time         = (known after apply)
      + location                   = "US"
      + max_time_travel_hours      = (known after apply)
      + project                    = "project_id"
      + self_link                  = (known after apply)
      + storage_billing_model      = (known after apply)
      + terraform_labels           = {
          + "goog-terraform-provisioned" = "true"
        }

      + access (known after apply)
    }

  # google_bigquery_table.sample_table will be created
  + resource "google_bigquery_table" "sample_table" {
      + creation_time                = (known after apply)
      + dataset_id                   = "feature_dataset"
      + deletion_protection          = false
      + effective_labels             = {
          + "goog-terraform-provisioned" = "true"
        }
      + etag                         = (known after apply)
      + expiration_time              = (known after apply)
      + generated_schema_columns     = (known after apply)
      + id                           = (known after apply)
      + ignore_auto_generated_schema = false
      + last_modified_time           = (known after apply)
      + location                     = (known after apply)
      + max_staleness                = (known after apply)
      + num_bytes                    = (known after apply)
      + num_long_term_bytes          = (known after apply)
      + num_rows                     = (known after apply)
      + project                      = "project_id"
      + schema                       = jsonencode(
            [
              + {
                  + mode = "NULLABLE"
                  + name = "user_id"
                  + type = "STRING"
                },
              + {
                  + mode = "NULLABLE"
                  + name = "feature_timestamp"
                  + type = "TIMESTAMP"
                },
            ]
        )
      + self_link                    = (known after apply)
      + table_id                     = "feature_table"
      + terraform_labels             = {
          + "goog-terraform-provisioned" = "true"
        }
      + type                         = (known after apply)
    }

  # google_vertex_ai_feature_group.feature_group will be created
  + resource "google_vertex_ai_feature_group" "feature_group" {
      + create_time      = (known after apply)
      + description      = "A sample feature group"
      + effective_labels = {
          + "goog-terraform-provisioned" = "true"
          + "label-one"                  = "value-one"
        }
      + etag             = (known after apply)
      + id               = (known after apply)
      + labels           = {
          + "label-one" = "value-one"
        }
      + name             = "example_feature_store"
      + project          = "project_id"
      + region           = "us-central1"
      + terraform_labels = {
          + "goog-terraform-provisioned" = "true"
          + "label-one"                  = "value-one"
        }
      + update_time      = (known after apply)

      + big_query {
          + entity_id_columns = [
              + "user_id",
            ]

          + big_query_source {
              + input_uri = "bq://project_id.feature_dataset.feature_table"
            }
        }
    }

Plan: 3 to add, 0 to change, 0 to destroy.
─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't
guarantee to take exactly these actions if you run "terraform apply" now.

The settings have been picked up as intended. Let's move on to applying them.

Creating the resources
First, run the following to create the resources.
terraform apply
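Let's first confirm that the BigQuery resources were created. Opening BigQuery in the console, I confirmed that the table was created as specified, with both the user_id and feature_timestamp columns.
Next, let's check the feature group. On the feature groups page of Vertex AI Feature Store, I confirmed that the feature group had been created.
Selecting the feature group shows its details, where I confirmed that user_id is set as the entity ID column.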
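As a complement to checking the console, output blocks can surface the identifiers of the created resources right after apply. This is a minimal sketch; the output names and the outputs.tf file name are my own choices and are not part of the original configuration.

```hcl
# outputs.tf (hypothetical file name)

output "feature_group_id" {
  description = "Identifier of the created feature group"
  value       = google_vertex_ai_feature_group.feature_group.id
}

output "feature_table_id" {
  description = "Identifier of the BigQuery table backing the feature group"
  value       = google_bigquery_table.sample_table.id
}
```

After terraform apply, the values are printed at the end of the run and can be shown again with terraform output.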
Finally, once you are done, delete the resources with terraform destroy.
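One caveat for cleanup: the plan output shows delete_contents_on_destroy = false on the dataset, so destroying the dataset fails if it still contains tables that Terraform does not manage (for example, ones created by hand while experimenting). For a throwaway test dataset, the dataset resource could be adjusted as in the sketch below. Whether this is acceptable depends on your data, so treat it as an assumption rather than a recommendation.

```hcl
resource "google_bigquery_dataset" "sample_dataset" {
  dataset_id    = var.bigquery_dataset_id
  friendly_name = "test"
  description   = "This is a test description"
  location      = "US"

  # Assumption: this dataset holds only disposable test data.
  # When true, terraform destroy deletes all tables in the dataset along with it.
  delete_contents_on_destroy = true
}
```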

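Summary
In this post, I created a Vertex AI Feature Store feature group with Terraform. Doing it by hand, from creating the BigQuery table to configuring Feature Store, is fairly tedious, so creating everything with a single terraform apply feels much easier. The approach is also reproducible, so I highly recommend it.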
Discussion

Akasan
The Feature Store resource implemented here is a feature group (offline store). A feature group can use BigQuery as its backend.
An online store, on the other hand, uses Bigtable as its backend.