Closed7

Snowflake と S3 連携を試す with Terraform

9sako69sako6

前準備

トライアルアカウントを作った。

S3 バケットと Policy, Role の作成

https://docs.snowflake.com/ja/user-guide/data-load-s3-config-storage-integration

ドキュメントのステップ2までをやったつもり。

main.tf
terraform {
  required_version = "~> 1.8.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "ap-northeast-1"
}

resource "aws_iam_policy" "snowflake_s3_access" {
  name        = "snowflake-s3-access"
  description = "Policy for Snowflake to access S3 data"

  policy = templatefile(
    "snowflake-s3-access-policy.json",
    {
      bucket_name = aws_s3_bucket.default.bucket,
    }
  )
}

resource "aws_iam_role" "snowflake_s3_access" {
  name = "snowflake-s3-access"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
        Action = "sts:AssumeRole"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "s3_access_attachment" {
  role       = aws_iam_role.snowflake_s3_access.name
  policy_arn = aws_iam_policy.snowflake_s3_access.arn
}

resource "aws_s3_bucket" "default" {
  bucket = "9sako6-demo-snowflake-data-lake"

  lifecycle {
    prevent_destroy = true
  }
}

snowflake-s3-access-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::${bucket_name}/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::${bucket_name}"
    }
  ]
}

9sako69sako6

ストレージ統合を作る。

ストレージ統合は、S3クラウドストレージ用に生成されたIDおよびアクセス管理(IAM)ユーザーと、許可またはブロックされたストレージロケーション(バケットなど)のオプションセットを保存するSnowflakeオブジェクトです。

ここで snowsql で接続できない現象。ブラウザの UI からコピーしてきた XXXXXXX.XXXXXXX というアカウント表記だとダメで、XXXXXXX-XXXXXXX にするといけた。これは何。

snowsql -u <user> -a XXXXXXX-XXXXXXX
9sako69sako6

IAM ロールの ARN が必要になったので tf に追記して取得。

output "iam_role_arn" {
  value = aws_iam_role.snowflake_s3_access.arn
  description = "The ARN of the IAM role"
}

9sako69sako6

ドキュメントに習って CREATE STORAGE INTEGRATION を実行した。

+--------------------------------------------------+                            
| status                                           |
|--------------------------------------------------|
| Integration S3_INTEGRATION successfully created. |
+--------------------------------------------------+
1 Row(s) produced. Time Elapsed: 0.386s

次の手順にある DESC INTEGRATION も問題なく実行できた。
表示された情報のうち、STORAGE_AWS_IAM_USER_ARN, STORAGE_AWS_EXTERNAL_ID を使って次の手順を進める。

9sako69sako6

先ほど IAM ロールの assume_role_policy を下記に変更して apply した。

iam_snowflake.tf
resource "aws_iam_role" "snowflake_s3_access" {
  name = "snowflake-s3-access"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        "Sid": "",
        "Effect": "Allow",
        "Principal": {
          "AWS": "<STORAGE_AWS_IAM_USER_ARN>"
        },
        "Action": "sts:AssumeRole",
        "Condition": {
          "StringEquals": {
            "sts:ExternalId": "<STORAGE_AWS_EXTERNAL_ID>"
          }
        }
      }
    ]
  })
}


次は外部ステージを作る。DATABASE, SCHEMA が必要だったので作った。

CREATE DATABASE IF NOT EXISTS demo_db;

CREATE SCHEMA IF NOT EXISTS demo_schema;

ステージは、 my_csv_format という名前のファイル形式オブジェクトも参照します。

急にわからなくなった。ファイル形式オブジェクトとは。
これか。
https://docs.snowflake.com/ja/user-guide/data-load-internal-tutorial-create-file-format

作りました。

CREATE OR REPLACE FILE FORMAT mycsvformat
  TYPE = 'CSV'
  SKIP_HEADER = 1;

外部ステージを作成する。

USE SCHEMA demo_db.demo_schema;

CREATE STAGE my_s3_stage
  STORAGE_INTEGRATION = s3_integration
  URL = 's3://9sako6-demo-snowflake-data-lake/'
  FILE_FORMAT = mycsvformat;
+----------------------------------+                                            
| status                           |
|----------------------------------|
| Statement executed successfully. |
+----------------------------------+
1 Row(s) produced. Time Elapsed: 0.133s
+----------------------------------------------+                                
| status                                       |
|----------------------------------------------|
| Stage area MY_S3_STAGE successfully created. |
+----------------------------------------------+
1 Row(s) produced. Time Elapsed: 1.033s
9sako69sako6

下記のサンプル CSV を S3 にアップロードしてみる。

https://docs.snowflake.com/ja/user-guide/tutorials/snowflake-in-20minutes#introduction

Snowflake 上に反映されている!

LIST コマンドでも確認できた。

9sako6demo#COMPUTE_WH@DEMO_DB.DEMO_SCHEMA>LIST @my_s3_stage;
+----------------------------------------------------------+------+----------------------------------+-------------------------------+
| name                                                     | size | md5                              | last_modified                 |
|----------------------------------------------------------+------+----------------------------------+-------------------------------|
| s3://9sako6-demo-snowflake-data-lake/raw/employees01.csv |  370 | ee3c86ae89eabb22667d6fd7684fec17 | Sun, 26 May 2024 08:17:55 GMT |
| s3://9sako6-demo-snowflake-data-lake/raw/employees02.csv |  364 | 20447871857babe7b4a12363f5f754da | Sun, 26 May 2024 08:17:55 GMT |
| s3://9sako6-demo-snowflake-data-lake/raw/employees03.csv |  407 | 65d89581bbd5b07cf78bf847205d7cbd | Sun, 26 May 2024 08:17:55 GMT |
| s3://9sako6-demo-snowflake-data-lake/raw/employees04.csv |  375 | 53fe4c128aa9567e7deb28abce5f8d97 | Sun, 26 May 2024 08:17:56 GMT |
| s3://9sako6-demo-snowflake-data-lake/raw/employees05.csv |  404 | 87d47add6452f8ab2949f6b29d6b285a | Sun, 26 May 2024 08:17:56 GMT |
+----------------------------------------------------------+------+----------------------------------+-------------------------------+
5 Row(s) produced. Time Elapsed: 0.170s

この CSV をテーブルに取り込んでみる。まずはテーブルの作成。

CREATE OR REPLACE TABLE emp_basic (
   first_name STRING ,
   last_name STRING ,
   email STRING ,
   streetaddress STRING ,
   city STRING ,
   start_date DATE
   );

ウェアハウスも必要そうなので作成。

CREATE OR REPLACE WAREHOUSE wh_demo_xs WITH
WAREHOUSE_SIZE='X-SMALL'
AUTO_SUSPEND = 180
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED=TRUE;
9sako6demo#WH_DEMO_XS@DEMO_DB.DEMO_SCHEMA>SELECT CURRENT_WAREHOUSE();
+---------------------+                                                         
| CURRENT_WAREHOUSE() |
|---------------------|
| WH_DEMO_XS          |
+---------------------+
1 Row(s) produced. Time Elapsed: 0.337s

ステージングされたデータを取り込むには COPY INTO を使う模様。

9sako6demo#WH_DEMO_XS@DEMO_DB.DEMO_SCHEMA>COPY INTO emp_basic
                                            FROM @my_s3_stage
                                            FILE_FORMAT = (type = csv field_optionally_enclosed_by='"')
                                            PATTERN = '.*employees0[1-5].csv'
                                            ON_ERROR = 'skip_file';
+----------------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
| file                                                     | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
|----------------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
| s3://9sako6-demo-snowflake-data-lake/raw/employees05.csv | LOADED |           5 |           5 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
| s3://9sako6-demo-snowflake-data-lake/raw/employees01.csv | LOADED |           5 |           5 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
| s3://9sako6-demo-snowflake-data-lake/raw/employees04.csv | LOADED |           5 |           5 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
| s3://9sako6-demo-snowflake-data-lake/raw/employees02.csv | LOADED |           5 |           5 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
| s3://9sako6-demo-snowflake-data-lake/raw/employees03.csv | LOADED |           5 |           5 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
+----------------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
5 Row(s) produced. Time Elapsed: 1.377s

取り込めた!

9sako6demo#WH_DEMO_XS@DEMO_DB.DEMO_SCHEMA>SELECT * FROM emp_basic;
+------------+--------------+---------------------------+-----------------------------+--------------------+------------+
| FIRST_NAME | LAST_NAME    | EMAIL                     | STREETADDRESS               | CITY               | START_DATE |
|------------+--------------+---------------------------+-----------------------------+--------------------+------------|
| Arlene     | Davidovits   | adavidovitsk@sf_tuts.com  | 7571 New Castle Circle      | Meniko             | 2017-05-03 |
| Violette   | Shermore     | vshermorel@sf_tuts.com    | 899 Merchant Center         | Troitsk            | 2017-01-19 |
| Ron        | Mattys       | rmattysm@sf_tuts.com      | 423 Lien Pass               | Bayaguana          | 2017-11-15 |
| Shurlocke  | Oluwatoyin   | soluwatoyinn@sf_tuts.com  | 40637 Portage Avenue        | Semënovskoye       | 2017-09-12 |
| Granger    | Bassford     | gbassfordo@sf_tuts.co.uk  | 6 American Ash Circle       | Kardítsa           | 2016-12-30 |
| Lem        | Boissier     | lboissier@sf_tuts.com     | 3002 Ruskin Trail           | Shikārpur          | 2017-08-25 |
| Iain       | Hanks        | ihanks1@sf_tuts.com       | 2 Pankratz Hill             | Monte-Carlo        | 2017-12-10 |
| Avo        | Laudham      | alaudham2@sf_tuts.com     | 6948 Debs Park              | Prażmów            | 2017-10-18 |
| Emili      | Cornner      | ecornner3@sf_tuts.com     | 177 Magdeline Avenue        | Norrköping         | 2017-08-13 |
| Harrietta  | Goolding     | hgoolding4@sf_tuts.com    | 450 Heath Trail             | Osielsko           | 2017-11-27 |
| Wallis     | Sizey        | wsizeyf@sf_tuts.com       | 36761 American Lane         | Taibao             | 2016-12-30 |
| Di         | McGowran     | dmcgowrang@sf_tuts.com    | 1856 Maple Lane             | Banjar Bengkelgede | 2017-04-22 |
| Carson     | Bedder       | cbedderh@sf_tuts.co.au    | 71 Clyde Gallagher Place    | Leninskoye         | 2017-03-29 |
| Dana       | Avory        | davoryi@sf_tuts.com       | 2 Holy Cross Pass           | Wenlin             | 2017-05-11 |
| Ronny      | Talmadge     | rtalmadgej@sf_tuts.co.uk  | 588 Chinook Street          | Yawata             | 2017-06-02 |
| Nyssa      | Dorgan       | ndorgan5@sf_tuts.com      | 7 Tomscot Way               | Pampas Chico       | 2017-04-13 |
| Catherin   | Devereu      | cdevereu6@sf_tuts.co.au   | 535 Basil Terrace           | Magapit            | 2016-12-17 |
| Grazia     | Glaserman    | gglaserman7@sf_tuts.com   | 162 Debra Lane              | Shiquanhe          | 2017-06-06 |
| Ivett      | Casemore     | icasemore8@sf_tuts.com    | 84 Holmberg Pass            | Campina Grande     | 2017-03-29 |
| Cesar      | Hovie        | chovie9@sf_tuts.com       | 5 7th Pass                  | Miami              | 2016-12-21 |
| Althea     | Featherstone | afeatherstona@sf_tuts.com | 8172 Browning Street, Apt B | Calatrava          | 2017-07-12 |
| Hollis     | Anneslie     | hanneslieb@sf_tuts.com    | 3248 Roth Park              | Aleysk             | 2017-11-16 |
| Betti      | Cicco        | bciccoc@sf_tuts.com       | 121 Victoria Junction       | Sinegor'ye         | 2017-06-22 |
| Brendon    | Durnall      | bdurnalld@sf_tuts.com     | 26814 Weeping Birch Place   | Sabadell           | 2017-11-14 |
| Kylila     | MacConnal    | kmacconnale@sf_tuts.com   | 04 Valley Edge Court        | Qingshu            | 2017-06-22 |
+------------+--------------+---------------------------+-----------------------------+--------------------+------------+
25 Row(s) produced. Time Elapsed: 0.157s
このスクラップは6ヶ月前にクローズされました