Closed6

VPC Flow LogsをAthenaで分析

Takashi YamaharaTakashi Yamahara

VPC Flow LogsをTerraformで設定する際、log_formatでは$${version}のように$でエスケープする。

 resource "aws_flow_log" "example" {
  log_destination      = aws_s3_bucket.example.arn
  log_destination_type = "s3"
  log_format           = "$${version} $${account-id} $${interface-id} $${srcaddr} $${dstaddr} $${srcport} $${dstport} $${protocol} $${packets} $${bytes} $${start} $${end} $${action} $${log-status} $${pkt-srcaddr} $${pkt-dstaddr} $${pkt-src-aws-service} $${pkt-dst-aws-service} $${flow-direction} $${traffic-path}"
  # 他にも以下のフィールドを指定可能
  # $${vpc-id} $${subnet-id} $${instance-id} $${tcp-flags} $${type} $${region} $${az-id} $${interface-id} $${sublocation-type} $${sublocation-id}

  traffic_type         = "ALL"
  vpc_id               = aws_vpc.example.id
}

resource "aws_s3_bucket" "example" {
  bucket = "example"
}
Takashi YamaharaTakashi Yamahara
  • AthenaのQuery result locationとなるS3バケットを作成しておく。
  • Athenaのsettingsで、上記S3バケットを指定する。
Takashi YamaharaTakashi Yamahara

テーブルを作成する。CREATE EXTERNAL TABLEと指定するとエラーになるのでCREATE TABLEとしている。

CREATE TABLE IF NOT EXISTS vpc_flow_logs (
	version int,
	account string,
	interfaceid string,
	sourceaddress string,
	destinationaddress string,
	sourceport int,
	destinationport int,
	protocol int,
	numpackets int,
	numbytes bigint,
	starttime int,
	endtime int,
	action string,
	logstatus string,
	pktsrcaddr string,
	pktdstaddr string,
	pktsrcawsservice string,
	pktdstawsservice string,
	flowdirection string,
	trafficpath string
) PARTITIONED BY (`date` string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' LOCATION 's3://example/AWSLogs/{AWS_ACCOUNT_ID/vpcflowlogs/{region}/' TBLPROPERTIES (
	"skip.header.line.count" = "1",
	"projection.enabled" = "true",
	"projection.date.type" = "date",
	"projection.date.range" = "2021/01/01,NOW",
	"projection.date.format" = "yyyy/MM/dd",
	"projection.date.interval.unit" = "DAYS",
	"storage.location.template" = "s3://example/AWSLogs/{AWS_ACCOUNT_ID/vpcflowlogs/{region}/${date}"
);
Takashi YamaharaTakashi Yamahara

NAT Gatewayを介するトラフィック上位

SELECT
  sourceaddress,
  pktdstaddr,
  pktsrcawsservice,
  pktdstawsservice,
  SUM(numbytes) AS bytes
 FROM "default"."vpc_flow_logs"
 WHERE
   regexp_like(sourceaddress, '10.0') AND
   destinationaddress = '10.0.3.228' -- NAT GatewayのプライベートIPアドレス
 GROUP BY sourceaddress, pktdstaddr, pktsrcawsservice, pktdstawsservice
 ORDER BY bytes DESC
 LIMIT 5
Takashi YamaharaTakashi Yamahara

NAT Gatewayを介したダウンロードのトラフィック量の多い送信元上位

SELECT
  sourceaddress,
  pktsrcawsservice,
  pktdstawsservice,
  SUM(numbytes) AS bytes
 FROM "default"."vpc_flow_logs"
 WHERE
   NOT regexp_like(sourceaddress, '10.0')
   AND destinationaddress = '10.0.3.228'
 GROUP BY sourceaddress, pktsrcawsservice, pktdstawsservice
 ORDER BY bytes DESC
 LIMIT 5
このスクラップは2022/01/04にクローズされました