Closed1

AWS Athena で構造体の配列を含むjsonファイルをクエリする時のDDL

PolonityPolonity
  • S3 の中のデータ
{
"ID":"IDentifier",
"Data":[
    {"Time": 1234567890,"AD":12},
    {"Time": 1234567891,"AD":13},
     ・・・
    {"Time": 1234567899,"AD":11},
]
}
  • 上記データに基づいて、Athena でテーブルを作成するためのDDL
CREATE EXTERNAL TABLE IF NOT EXISTS `data_source`.`database` (
	ID string,
	Data array<struct<
    	Time:bigint,
    	AD:int
    >>
) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' WITH SERDEPROPERTIES ('serialization.format' = '1') LOCATION 's3://your_bucket_pass' TBLPROPERTIES ('has_encrypted_data' = 'false')
  • 上記テーブルの配列を解きほぐすクエリ
SELECT 
    ID as ID,
    from_unixtime(data.Time) as Time,
    data.AD as AD
FROM "data_source"."database" CROSS JOIN UNNEST(Data) AS t(data)
  • 特定のデバイスのみクエリ

SELECT 
    ID as ID,
    from_unixtime(data.Time) as Time,
    data.AD as AD
FROM "data_source"."database" CROSS JOIN UNNEST(Data) AS t(data)
WHERE (ID = "DEVICE009")
このスクラップは2021/12/17にクローズされました