
Hands-On: Building RAG with Amazon S3 Vectors Using Terraform

xthixsl_ml

Published on 2025/12/04

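Introduction

I'm Leona from Fusic. Amazon S3 Vectors reached general availability (GA) at AWS re:Invent 2025. Following that, the Terraform AWS Provider added support for S3 Vectors in version 6.24.0, so in this post I build a RAG setup with Terraform.
Note that this post does not evaluate or discuss RAG accuracy.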

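What is Amazon S3 Vectors?

Amazon S3 Vectors is a new service that adds native vector search to Amazon S3. Beyond using S3 as plain object storage, you can now store and query vector data directly.

For details, see also the blog post I wrote while the service was in preview.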

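What we'll do

This post reimplements in Terraform the console click-through steps from the blog post I wrote during the preview.
Creation of the S3 Vectors vector bucket and index is packaged as a Terraform module. Markdown is then embedded and inserted with the PutVectors API via boto3, documents close to the query are retrieved with the QueryVectors API, and Claude Sonnet 4 generates an answer using the search results as context.

The text to vectorize is taken from one of the pages in our company's news section. [here]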

Implementation

Everything is deployed to the US East (N. Virginia) region.

Directory structure

Create the following directory layout.

root/
├── .gitignore
├── .env.example
├── README.md
├── pyproject.toml              # Python project managed with uv
│
├── modules/
│   ├── s3-vectors/
│   │   ├── main.tf             # Vector bucket, vector index
│   │   ├── variables.tf        # Module variables
│   │   └── outputs.tf          # Module outputs
│   │
│   └── iam/
│       ├── main.tf             # IAM role, Bedrock policy, S3 policy
│       ├── variables.tf        # Module variables
│       └── outputs.tf          # Module outputs
│
├── envs/
│   └── dev/
│       ├── main.tf             # Module calls
│       ├── variables.tf        # Environment variable definitions
│       ├── outputs.tf          # Environment output definitions
│       ├── providers.tf        # Provider configuration
│       └── terraform.tfvars    # Environment variable values
│
├── scripts/
│   ├── put_vectors.py          # PutVectors API: insert the embedded vectors
│   └── query_vectors.py        # Run RAG
│
└── sample-docs/                # Sample documents to vectorize
    └── fusic-brand-slogan.md

1. Terraform code

1.1 S3 Vectors module

S3 Vectors consists of two resources: a vector bucket and an index.

Main file that creates the vector bucket and the index

modules/s3-vectors/main.tf

# S3 Vectorsベクトルバケット
resource "aws_s3vectors_vector_bucket" "this" {
  vector_bucket_name = var.vector_bucket_name

  encryption_configuration {
    sse_type = var.sse_type
  }

  force_destroy = var.force_destroy

  tags = merge(
    var.tags,
    {
      Name        = var.vector_bucket_name
      Environment = var.environment
    }
  )
}

# S3 Vectorsインデックス
resource "aws_s3vectors_index" "this" {
  index_name         = var.index_name
  vector_bucket_name = aws_s3vectors_vector_bucket.this.vector_bucket_name

  data_type       = var.data_type
  dimension       = var.dimension
  distance_metric = var.distance_metric

  tags = merge(
    var.tags,
    {
      Name        = var.index_name
      Environment = var.environment
    }
  )
}

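aws_s3vectors_vector_bucket: a container that stores vector data. Unlike a regular S3 bucket, it serves as storage dedicated to vector search.
aws_s3vectors_index: the index created inside the vector bucket. It fixes the data type, dimension, and distance metric used for similarity search.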
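To make the role of `dimension` and `distance_metric` concrete, here is a minimal pure-Python sketch of a brute-force top-k search using cosine distance (defined here as 1 minus cosine similarity). It is only an illustration: the toy 3-dimensional vectors and the helper names `cosine_distance` and `top_k` are assumptions for this example, and the service's internal search algorithm is not reproduced.

```python
import math
from typing import Dict, List, Tuple


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Return 1 - cosine similarity for two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: List[float], vectors: Dict[str, List[float]], k: int = 3) -> List[Tuple[str, float]]:
    """Rank stored vectors (key -> vector) by ascending distance to the query."""
    scored = [(key, cosine_distance(query, vec)) for key, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1])[:k]


# Toy index with 3 dimensions (the real index in this post uses 1024).
toy_index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
ranked = top_k([1.0, 0.1, 0.0], toy_index, k=2)
print(ranked)  # closest keys first
```

Every vector written to the index must have the same length as the configured dimension, which is why the sketch rejects mismatched lengths.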

Define the module's input variables

modules/s3-vectors/variables.tf

variable "vector_bucket_name" {
  description = "S3 Vectorsバケット名"
  type        = string
}

variable "index_name" {
  description = "ベクトルインデックス名"
  type        = string
}

variable "environment" {
  description = "環境名"
  type        = string
}

variable "force_destroy" {
  description = "バケット削除時にインデックスとベクトルも強制削除するか"
  type        = bool
  default     = false
}

variable "sse_type" {
  description = "サーバーサイド暗号化タイプ (AES256 or aws:kms)"
  type        = string
  default     = "AES256"
}

variable "data_type" {
  description = "ベクトルのデータタイプ"
  type        = string
  default     = "float32"
}

variable "dimension" {
  description = "ベクトル次元数"
  type        = number
  default     = 1024
}

variable "distance_metric" {
  description = "類似度検索の距離メトリック (cosine or euclidean)"
  type        = string
  default     = "cosine"
}

variable "tags" {
  description = "リソースに付与するタグ"
  type        = map(string)
  default     = {}
}

Define the module's output values

modules/s3-vectors/outputs.tf

# S3 Vectorsモジュール - 出力定義

output "vector_bucket_name" {
  description = "S3 Vectorsバケット名"
  value       = aws_s3vectors_vector_bucket.this.vector_bucket_name
}

output "vector_bucket_arn" {
  description = "S3 VectorsバケットARN"
  value       = aws_s3vectors_vector_bucket.this.vector_bucket_arn
}

output "index_name" {
  description = "ベクトルインデックス名"
  value       = aws_s3vectors_index.this.index_name
}

output "index_arn" {
  description = "ベクトルインデックスARN"
  value       = aws_s3vectors_index.this.index_arn
}

output "dimension" {
  description = "ベクトル次元数"
  value       = aws_s3vectors_index.this.dimension
}

output "distance_metric" {
  description = "距離メトリック"
  value       = aws_s3vectors_index.this.distance_metric
}

1.2 IAM module

Create the IAM role and policies used to invoke Bedrock models

modules/iam/main.tf

# Bedrockアクセス用IAMモジュール

resource "aws_iam_role" "bedrock_access" {
  name = var.role_name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = concat(
      [
        {
          Effect = "Allow"
          Principal = {
            Service = "bedrock.amazonaws.com"
          }
          Action = "sts:AssumeRole"
        }
      ],
      var.allow_account_assume ? [
        {
          Effect = "Allow"
          Principal = {
            AWS = "arn:aws:iam::${var.account_id}:root"
          }
          Action = "sts:AssumeRole"
        }
      ] : []
    )
  })

  tags = var.tags
}

# Bedrockモデル呼び出しポリシー
resource "aws_iam_role_policy" "bedrock_invoke" {
  name = "bedrock-invoke-policy"
  role = aws_iam_role.bedrock_access.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream"
        ]
        Resource = [
          for model_id in var.bedrock_model_ids :
          "arn:aws:bedrock:${var.region}::foundation-model/${model_id}"
        ]
      }
    ]
  })
}

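# S3 Vectors access policy
# Vector buckets are authorized through the s3vectors: action namespace,
# so s3:PutObject and similar object actions do not apply to them.
resource "aws_iam_role_policy" "s3_access" {
  count = var.enable_s3_access ? 1 : 0

  name = "s3-bucket-access-policy"
  role = aws_iam_role.bedrock_access.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3vectors:PutVectors",
          "s3vectors:GetVectors",
          "s3vectors:QueryVectors",
          "s3vectors:DeleteVectors",
          "s3vectors:ListVectors",
          "s3vectors:GetIndex"
        ]
        # The bucket ARN plus everything under it (index ARNs live below the bucket)
        Resource = [
          var.s3_bucket_arn,
          "${var.s3_bucket_arn}/*"
        ]
      }
    ]
  })
}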

Key points:

bedrock:InvokeModel: the permission required to invoke Bedrock models
allow_account_assume: whether to allow AssumeRole from the account root
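The `Resource` list in the Bedrock policy is built from the model IDs with a Terraform `for` expression. As a rough Python equivalent (the function name `foundation_model_arns` is made up for this sketch), the ARNs it produces look like this:

```python
from typing import List


def foundation_model_arns(region: str, model_ids: List[str]) -> List[str]:
    """Mirror the Terraform `for` expression: one foundation-model ARN per model ID."""
    # Foundation-model ARNs have an empty account field, hence the "::".
    return [f"arn:aws:bedrock:{region}::foundation-model/{model_id}" for model_id in model_ids]


arns = foundation_model_arns(
    "us-east-1",
    ["amazon.titan-embed-text-v2:0", "anthropic.claude-sonnet-4-20250514-v1:0"],
)
for arn in arns:
    print(arn)
```

Printing the list this way can help when comparing the policy against the model IDs actually passed to the scripts.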

modules/iam/variables.tf

variable "role_name" {
  description = "IAMロール名"
  type        = string
}

variable "account_id" {
  description = "AWSアカウントID"
  type        = string
}

variable "region" {
  description = "AWSリージョン"
  type        = string
}

variable "allow_account_assume" {
  description = "アカウントルートからのAssumeRoleを許可するか"
  type        = bool
  default     = true
}

variable "bedrock_model_ids" {
  description = "Bedrockモデルのリスト"
  type        = list(string)
}

variable "s3_bucket_arn" {
  description = "アクセスを許可するS3バケットのARN"
  type        = string
}

variable "enable_s3_access" {
  description = "S3アクセスポリシーを有効にするか"
  type        = bool
  default     = true
}

variable "tags" {
  description = "リソースに付与するタグ"
  type        = map(string)
  default     = {}
}

modules/iam/outputs.tf

output "role_arn" {
  description = "IAMロールARN"
  value       = aws_iam_role.bedrock_access.arn
}

output "role_name" {
  description = "IAMロール名"
  value       = aws_iam_role.bedrock_access.name
}

output "role_id" {
  description = "IAMロールID"
  value       = aws_iam_role.bedrock_access.id
}

1.3 Development environment configuration

Environment configuration that calls the modules and actually creates the resources

envs/dev/main.tf

data "aws_caller_identity" "current" {}

locals {
  vector_bucket_name = "${var.project_name}-${var.environment}-vectors"
  vector_index_name  = "markdown-embeddings"
  iam_role_name      = "${var.project_name}-${var.environment}-bedrock-access"
}

# S3 Vectorsバケットとインデックス
module "s3_vectors" {
  source = "../../modules/s3-vectors"

  vector_bucket_name = local.vector_bucket_name
  index_name         = local.vector_index_name
  environment        = var.environment
  force_destroy      = true # dev環境では強制削除を許可
  sse_type           = "AES256"
  dimension          = var.vector_dimension
  distance_metric    = "cosine"
  tags               = var.tags
}

# Bedrockアクセス用IAMロール
module "iam" {
  source = "../../modules/iam"

  role_name            = local.iam_role_name
  account_id           = data.aws_caller_identity.current.account_id
  region               = var.region
  allow_account_assume = true
  s3_bucket_arn        = module.s3_vectors.vector_bucket_arn
  enable_s3_access     = true

  bedrock_model_ids = [
    var.bedrock_embedding_model_id,
    var.bedrock_llm_model_id
  ]

  tags = var.tags
}

force_destroy = true: in the dev environment, this allows terraform destroy to delete the stored data along with the bucket.

envs/dev/variables.tf

variable "project_name" {
  description = "プロジェクト名"
  type        = string
  default     = "s3vectors-rag"
}

variable "environment" {
  description = "環境名"
  type        = string
  default     = "dev"
}

variable "region" {
  description = "AWSリージョン"
  type        = string
  default     = "us-east-1"
}

variable "vector_dimension" {
  description = "ベクトル次元数"
  type        = number
  default     = 1024
}

variable "bedrock_embedding_model_id" {
  description = "Bedrock埋め込みモデルID"
  type        = string
  default     = "amazon.titan-embed-text-v2:0"
}

variable "bedrock_llm_model_id" {
  description = "Bedrock LLMモデルID"
  type        = string
  default     = "anthropic.claude-sonnet-4-20250514-v1:0"
}

variable "tags" {
  description = "共通タグ"
  type        = map(string)
  default = {
    ManagedBy   = "Terraform"
    Project     = "S3VectorsRAG"
    Environment = "dev"
  }
}

envs/dev/outputs.tf

output "vector_bucket_name" {
  description = "S3 Vectorsバケット名"
  value       = module.s3_vectors.vector_bucket_name
}

output "vector_bucket_arn" {
  description = "S3 VectorsバケットARN"
  value       = module.s3_vectors.vector_bucket_arn
}

output "vector_index_name" {
  description = "ベクトルインデックス名"
  value       = module.s3_vectors.index_name
}

output "vector_index_arn" {
  description = "ベクトルインデックスARN"
  value       = module.s3_vectors.index_arn
}

output "vector_dimension" {
  description = "ベクトル次元数"
  value       = module.s3_vectors.dimension
}

output "distance_metric" {
  description = "距離メトリック"
  value       = module.s3_vectors.distance_metric
}

output "bedrock_role_arn" {
  description = "Bedrockアクセス用IAMロールARN"
  value       = module.iam.role_arn
}

output "bedrock_role_name" {
  description = "Bedrockアクセス用IAMロール名"
  value       = module.iam.role_name
}

output "bedrock_embedding_model_id" {
  description = "埋め込みモデルID"
  value       = var.bedrock_embedding_model_id
}

output "bedrock_llm_model_id" {
  description = "LLMモデルID"
  value       = var.bedrock_llm_model_id
}

output "aws_account_id" {
  description = "AWSアカウントID"
  value       = data.aws_caller_identity.current.account_id
}

output "aws_region" {
  description = "AWSリージョン"
  value       = var.region
}

envs/dev/providers.tf

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.24"
    }
  }
}

provider "aws" {
  region = var.region

  default_tags {
    tags = var.tags
  }
}

Key points:

aws provider version ~> 6.24: the S3 Vectors resources are available from provider version 6.24 onward
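The `~>` operator is Terraform's pessimistic constraint: only the rightmost component given may increase, so `~> 6.24` accepts 6.24.x, 6.25.x and so on, but not 7.0. The following is a small illustrative check in Python, assuming purely numeric versions (pre-release suffixes are not handled); `satisfies_pessimistic` is a hypothetical helper, not part of Terraform.

```python
def satisfies_pessimistic(version: str, constraint: str) -> bool:
    """Check a numeric version against a Terraform-style `~> X.Y[.Z]` constraint."""
    base = [int(part) for part in constraint.removeprefix("~>").strip().split(".")]
    ver = [int(part) for part in version.split(".")]
    if len(base) < 2:
        raise ValueError("this sketch expects at least two components, e.g. '~> 6.24'")
    ver += [0] * (len(base) - len(ver))  # treat "6.24" like "6.24.0"
    prefix = base[:-1]  # every component except the last must match exactly
    return ver[: len(prefix)] == prefix and ver >= base


for candidate in ["6.23.0", "6.24.0", "6.30.1", "7.0.0"]:
    print(candidate, satisfies_pessimistic(candidate, "~> 6.24"))
```

If you want to stay on 6.24 patch releases only, the constraint would instead be written with three components, such as `~> 6.24.0`.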

envs/dev/terraform.tfvars

# Dev Environment - Terraform Variables

project_name               = "s3vectors-rag"
environment                = "dev"
region                     = "us-east-1"
vector_dimension           = 1024
bedrock_embedding_model_id = "amazon.titan-embed-text-v2:0"
bedrock_llm_model_id       = "anthropic.claude-sonnet-4-20250514-v1:0"

tags = {
  ManagedBy   = "Terraform"
  Project     = "S3VectorsRAG"
  Environment = "dev"
}

1.4 Running Terraform

cd envs/dev
terraform init
terraform plan
terraform apply

Resources created:

S3 Vectors vector bucket: s3vectors-rag-dev-vectors
Vector index: markdown-embeddings
IAM role for Bedrock access
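The Python scripts later in this post read `VECTOR_BUCKET_NAME`, `VECTOR_INDEX_NAME`, `AWS_REGION`, and `VECTOR_DIMENSION` from the environment or a `.env` file. One possible way to fill them in is to convert the JSON printed by `terraform output -json`. The sketch below assumes that approach; the mapping table and the function name `outputs_to_dotenv` are my own choices, not something the original setup prescribes.

```python
import json

# Assumed mapping from the output names in envs/dev/outputs.tf
# to the environment variables the scripts look up.
OUTPUT_TO_ENV = {
    "vector_bucket_name": "VECTOR_BUCKET_NAME",
    "vector_index_name": "VECTOR_INDEX_NAME",
    "aws_region": "AWS_REGION",
    "vector_dimension": "VECTOR_DIMENSION",
}


def outputs_to_dotenv(terraform_output_json: str) -> str:
    """Turn `terraform output -json` text into KEY=value lines for a .env file."""
    outputs = json.loads(terraform_output_json)
    lines = []
    for output_name, env_name in OUTPUT_TO_ENV.items():
        if output_name in outputs:
            # Each output is an object with "sensitive", "type", and "value" fields.
            lines.append(f"{env_name}={outputs[output_name]['value']}")
    return "\n".join(lines) + "\n"


# Sample input shaped like the JSON Terraform prints for this configuration.
sample = json.dumps({
    "vector_bucket_name": {"sensitive": False, "type": "string", "value": "s3vectors-rag-dev-vectors"},
    "vector_index_name": {"sensitive": False, "type": "string", "value": "markdown-embeddings"},
    "aws_region": {"sensitive": False, "type": "string", "value": "us-east-1"},
    "vector_dimension": {"sensitive": False, "type": "number", "value": 1024},
})
dotenv_text = outputs_to_dotenv(sample)
print(dotenv_text)
```

In practice you would feed the function the real output captured in `envs/dev` and write the result to `.env` at the repository root.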

2. Python setup

2.1 pyproject.toml

Define the dependencies

pyproject.toml

[project]
name = "s3-vectors-rag-automation"
version = "1.0.0"
description = "Terraform-based S3 Vectors RAG system with automated markdown ingestion and Claude Sonnet 4 integration"
requires-python = ">=3.11"
dependencies = [
    "boto3>=1.35.0",
    "python-dotenv>=1.0.0",
    "pydantic>=2.0.0",
    "tenacity>=8.0.0",
]

[tool.setuptools]
packages = ["scripts"]
py-modules = []

[project.optional-dependencies]
dev = [
    "ruff>=0.1.0",
    "mypy>=1.0.0",
]

[tool.uv]
dev-dependencies = []

[tool.ruff]
line-length = 100
target-version = "py311"

[tool.mypy]
python_version = "3.11"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = false

Activate the virtual environment

# From the root directory
uv venv && source .venv/bin/activate
uv sync

2.2 put_vectors.py: vector insertion

A script that scans Markdown files, embeds them, and inserts the vectors into S3 Vectors

scripts/put_vectors.py

"""
S3 Vectors - PutVectorsスクリプト

マークダウンファイルをスキャンし、Bedrock Titan Text Embeddings V2で
埋め込みベクトルを生成し、PutVectors APIでS3 Vectorsに挿入します。

参考: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-PutVectors.html
"""

import sys
import json
import argparse
import logging
import os
from pathlib import Path
from typing import List, Dict, Any
from datetime import datetime

import boto3
from dotenv import load_dotenv
from pydantic import BaseModel, Field, field_validator
from tenacity import retry, stop_after_attempt, wait_exponential

# .envファイルを読み込み
load_dotenv()

# ログ設定
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s"
)
logger = logging.getLogger(__name__)


class PutVectorsConfig(BaseModel):
    """PutVectors処理用の型安全な設定クラス"""
    source_directory: Path = Field(description="マークダウンファイルのソースディレクトリ")
    vector_bucket_name: str = Field(min_length=3, max_length=63)
    index_name: str = Field(min_length=1)
    bedrock_region: str = Field(default="us-east-1")
    embedding_model_id: str = Field(default="amazon.titan-embed-text-v2:0")
    vector_dimension: int = Field(default=1024)
    batch_size: int = Field(default=10, ge=1, le=100)

    @field_validator('vector_dimension')
    @classmethod
    def validate_dimension(cls, v: int) -> int:
        if v not in [256, 512, 1024]:
            raise ValueError("ベクトル次元数は256、512、または1024である必要があります")
        return v


class MarkdownDocument(BaseModel):
    """マークダウンドキュメントの型定義"""
    file_path: Path
    content: str
    metadata: Dict[str, Any] = Field(default_factory=dict)
    file_size: int = Field(ge=0)


class EmbeddingResponse(BaseModel):
    """Bedrock埋め込みレスポンスの型定義"""
    embedding: List[float]
    input_token_count: int


class MarkdownScanner:
    """ディレクトリからマークダウンファイルをスキャンするクラス"""

    def __init__(self, config: PutVectorsConfig):
        self.config = config

    def scan_directory(self) -> List[MarkdownDocument]:
        """ディレクトリを再帰的にスキャンしてマークダウンファイルを取得"""
        markdown_files: List[MarkdownDocument] = []

        print(f"ディレクトリをスキャン中: {self.config.source_directory}")

        # Path.rglob works on Python 3.11, which pyproject.toml allows
        # (Path.walk only exists on Python 3.12 and later).
        for file_path in sorted(Path(self.config.source_directory).rglob('*.md')):
            if not file_path.is_file():
                continue
            try:
                doc = self._load_document(file_path)
                markdown_files.append(doc)
                print(f"✓ 検出: {file_path}")
            except Exception as e:
                print(f"✗ 読み込みエラー {file_path}: {e}")
                logger.error(f"{file_path}の読み込みに失敗: {e}")

        print(f"検出ファイル数: {len(markdown_files)}")
        return markdown_files

    def _load_document(self, file_path: Path) -> MarkdownDocument:
        """メタデータ付きでマークダウンドキュメントを読み込む"""
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()

        return MarkdownDocument(
            file_path=file_path,
            content=content,
            file_size=file_path.stat().st_size,
            metadata={
                "file_name": file_path.name,
                "file_path": str(file_path),
            }
        )


class BedrockEmbedder:
    """Bedrock Titan Text Embeddings V2を使用して埋め込みベクトルを生成するクラス"""

    def __init__(self, config: PutVectorsConfig):
        self.config = config
        self.bedrock_runtime = boto3.client(
            'bedrock-runtime',
            region_name=config.bedrock_region
        )

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    def embed_text(self, text: str) -> EmbeddingResponse:
        """テキストの埋め込みベクトルを生成（リトライロジック付き）"""
        body = json.dumps({
            "inputText": text[:8192],  # Truncate to 8,192 characters (a character count, not a token count)
            "dimensions": self.config.vector_dimension,
            "normalize": True
        })

        response = self.bedrock_runtime.invoke_model(
            modelId=self.config.embedding_model_id,
            body=body
        )

        response_body = json.loads(response['body'].read())

        return EmbeddingResponse(
            embedding=response_body['embedding'],
            input_token_count=response_body.get('inputTextTokenCount', 0)
        )


class S3VectorsPutter:
    """PutVectors APIを使用してS3 Vectorsにベクトルを挿入するクラス"""

    def __init__(self, config: PutVectorsConfig):
        self.config = config
        self.s3vectors_client = boto3.client('s3vectors', region_name=config.bedrock_region)

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    def put_vectors(self, vectors_to_put: List[Dict[str, Any]]) -> Dict[str, Any]:
        """S3 Vectors PutVectors APIを使用してベクトルを挿入"""
        print(f"{len(vectors_to_put)}件のベクトルを挿入中...")

        result = self.s3vectors_client.put_vectors(
            vectorBucketName=self.config.vector_bucket_name,
            indexName=self.config.index_name,
            vectors=vectors_to_put
        )

        print("✓ ベクトルの挿入が完了しました")
        return result


def parse_args() -> PutVectorsConfig:
    """コマンドライン引数を解析（環境変数をデフォルト値として使用）"""
    parser = argparse.ArgumentParser(
        description="S3 Vectors PutVectors - マークダウンファイルをベクトル化してS3 Vectorsに挿入"
    )

    parser.add_argument(
        '--source-dir',
        type=Path,
        required=True,
        help='マークダウンファイルを含むソースディレクトリ'
    )
    parser.add_argument(
        '--bucket',
        type=str,
        default=os.getenv('VECTOR_BUCKET_NAME'),
        help='S3 Vectorsバケット名（環境変数: VECTOR_BUCKET_NAME）'
    )
    parser.add_argument(
        '--index',
        type=str,
        default=os.getenv('VECTOR_INDEX_NAME', 'markdown-embeddings'),
        help='ベクトルインデックス名（環境変数: VECTOR_INDEX_NAME）'
    )
    parser.add_argument(
        '--region',
        type=str,
        default=os.getenv('AWS_REGION', 'us-east-1'),
        help='AWSリージョン（環境変数: AWS_REGION）'
    )
    parser.add_argument(
        '--dimension',
        type=int,
        default=int(os.getenv('VECTOR_DIMENSION', '1024')),
        choices=[256, 512, 1024],
        help='ベクトル次元数（環境変数: VECTOR_DIMENSION）'
    )

    args = parser.parse_args()

    # バケット名の検証
    if not args.bucket:
        parser.error("--bucket引数または環境変数VECTOR_BUCKET_NAMEが必要です")

    return PutVectorsConfig(
        source_directory=args.source_dir,
        vector_bucket_name=args.bucket,
        index_name=args.index,
        bedrock_region=args.region,
        vector_dimension=args.dimension
    )


def main():
    """メインPutVectorsパイプライン"""
    try:
        # 設定を解析
        config = parse_args()

        print("\n=== S3 Vectors - PutVectors ===")
        print(f"ソース: {config.source_directory}")
        print(f"バケット: {config.vector_bucket_name}")
        print(f"インデックス: {config.index_name}")
        print(f"次元数: {config.vector_dimension}\n")

        start_time = datetime.now()

        # ステップ1: マークダウンファイルをスキャン
        scanner = MarkdownScanner(config)
        documents = scanner.scan_directory()

        if not documents:
            print("マークダウンファイルが見つかりません。終了します。")
            return

        print(f"\nステップ1完了: {len(documents)}ファイルをスキャン\n")

        # ステップ2: 埋め込みベクトルを生成してベクトルを準備
        embedder = BedrockEmbedder(config)
        vectors_to_put: List[Dict[str, Any]] = []

        for idx, doc in enumerate(documents, 1):
            try:
                print(f"処理中 {idx}/{len(documents)}: {doc.file_path.name}")

                embedding_response = embedder.embed_text(doc.content)

                # S3 Vectors PutVectors API用のフォーマット
                # 注意: フィルタリング可能なメタデータは2048バイト以下である必要があります
                # 日本語文字は約3バイトなので、テキストを約400文字に制限
                text_preview = doc.content[:400]

                vector_entry = {
                    "key": f"doc_{idx}_{doc.file_path.stem}",
                    "data": {"float32": embedding_response.embedding},
                    "metadata": {
                        "text": text_preview,
                        "file_path": str(doc.file_path),
                        "file_name": doc.file_path.name,
                    }
                }
                vectors_to_put.append(vector_entry)

                print(
                    f"✓ 埋め込み完了: {doc.file_path.name} "
                    f"({embedding_response.input_token_count}トークン)"
                )

            except Exception as e:
                print(f"✗ 埋め込みエラー {doc.file_path}: {e}")
                logger.error(f"{doc.file_path}の埋め込みに失敗: {e}")

        print(f"\nステップ2完了: {len(vectors_to_put)}件の埋め込みを生成\n")

        # ステップ3: S3 Vectorsにベクトルを挿入
        putter = S3VectorsPutter(config)
        result = putter.put_vectors(vectors_to_put)

        end_time = datetime.now()
        duration = (end_time - start_time).total_seconds()

        print(f"\n✓ PutVectors完了!")
        print(f"処理ファイル数: {len(documents)}")
        print(f"挿入ベクトル数: {len(vectors_to_put)}")
        print(f"処理時間: {duration:.2f}秒\n")

    except Exception as e:
        print(f"\n✗ PutVectors失敗: {e}\n")
        logger.exception("PutVectorsパイプラインが失敗しました")
        sys.exit(1)


if __name__ == "__main__":
    main()
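Two details in the script above may be worth tightening. `batch_size` is defined in `PutVectorsConfig` but every vector is sent in a single PutVectors call, and the metadata preview is cut by character count (`[:400]`) even though the limit noted in the comment is expressed in bytes. The sketch below shows one way to handle both; the helper names `truncate_utf8` and `chunked` are hypothetical and not part of the original script.

```python
from typing import Any, Dict, Iterator, List


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text so its UTF-8 encoding fits in max_bytes without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a multi-byte character that was cut in half at the boundary.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def chunked(items: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Japanese characters take 3 bytes each in UTF-8, so 7 bytes hold two of them.
preview = truncate_utf8("あいうえお", 7)
print(preview)

# 25 placeholder vectors split into batches of 10.
vectors = [{"key": f"doc_{i}"} for i in range(25)]
batch_sizes = [len(batch) for batch in chunked(vectors, 10)]
print(batch_sizes)
```

Inside `main()`, the single call could then become a loop such as `for batch in chunked(vectors_to_put, config.batch_size): putter.put_vectors(batch)`. PutVectors caps how many vectors one request may carry, so check the API reference for the current figure before raising the batch size.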

2.3 query_vectors.py: RAG

A script that runs a vector search against S3 Vectors and generates a RAG answer with Claude Sonnet 4

scripts/query_vectors.py

"""
S3 Vectors - QueryVectorsスクリプト

S3 Vectors QueryVectors APIを使用してベクトル類似度検索を実行し、
オプションでClaude Sonnet 4を使用してRAGレスポンスを生成します。

参考: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-QueryVectors.html
"""

import sys
import json
import argparse
import logging
import os
from typing import List, Dict, Any
from pathlib import Path

import boto3
from dotenv import load_dotenv
from pydantic import BaseModel, Field
from tenacity import retry, stop_after_attempt, wait_exponential

# .envファイルを読み込み
load_dotenv()

# ログ設定
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)


class QueryConfig(BaseModel):
    """クエリ用の型安全な設定クラス"""
    vector_bucket_name: str = Field(min_length=3, max_length=63)
    index_name: str = Field(min_length=1)
    bedrock_region: str = Field(default="us-east-1")
    embedding_model_id: str = Field(default="amazon.titan-embed-text-v2:0")
    llm_model_id: str = Field(default="us.anthropic.claude-sonnet-4-20250514-v1:0")
    vector_dimension: int = Field(default=1024)
    top_k: int = Field(default=3, ge=1, le=100)
    max_tokens: int = Field(default=2048, ge=1, le=4096)
    temperature: float = Field(default=0.7, ge=0.0, le=1.0)


class QueryResult(BaseModel):
    """クエリ結果の型定義"""
    key: str
    distance: float
    metadata: Dict[str, Any]


class RAGResponse(BaseModel):
    """RAGレスポンスの型定義"""
    answer: str
    sources: List[str]
    model_id: str
    input_tokens: int
    output_tokens: int


class S3VectorsQuery:
    """S3 Vectors QueryVectors APIを使用してベクトル類似度検索を実行するクラス"""

    def __init__(self, config: QueryConfig):
        self.config = config
        self.bedrock_runtime = boto3.client(
            'bedrock-runtime',
            region_name=config.bedrock_region
        )
        self.s3vectors_client = boto3.client(
            's3vectors',
            region_name=config.bedrock_region
        )

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    def vectorize_query(self, query_text: str) -> List[float]:
        """Bedrockを使用してクエリテキストをベクトル化"""
        body = json.dumps({
            "inputText": query_text[:8192],
            "dimensions": self.config.vector_dimension,
            "normalize": True
        })

        response = self.bedrock_runtime.invoke_model(
            modelId=self.config.embedding_model_id,
            body=body
        )

        response_body = json.loads(response['body'].read())
        return response_body['embedding']

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    def query_vectors(self, query_embedding: List[float]) -> List[QueryResult]:
        """S3 Vectors QueryVectors APIを使用してベクトルを検索"""
        print("S3 Vectorsを検索中...")

        response = self.s3vectors_client.query_vectors(
            vectorBucketName=self.config.vector_bucket_name,
            indexName=self.config.index_name,
            queryVector={"float32": query_embedding},
            topK=self.config.top_k,
            returnMetadata=True,
            returnDistance=True,
        )

        results = []
        for vector in response.get('vectors', []):
            results.append(QueryResult(
                key=vector.get('key', 'unknown'),
                distance=vector.get('distance', 0),
                metadata=vector.get('metadata', {})
            ))

        return results

    def search(self, query_text: str) -> List[QueryResult]:
        """完全な検索パイプラインを実行"""
        print(f"クエリ: {query_text}")

        # ステップ1: クエリをベクトル化
        query_embedding = self.vectorize_query(query_text)
        print(f"✓ クエリをベクトル化: {len(query_embedding)}次元")

        # ステップ2: ベクトルを検索
        results = self.query_vectors(query_embedding)
        print(f"✓ 検索完了: {len(results)}件の結果\n")

        return results


class RAGGenerator:
    """Claude Sonnet 4を使用してRAGレスポンスを生成するクラス"""

    def __init__(self, config: QueryConfig):
        self.config = config
        self.bedrock_runtime = boto3.client(
            'bedrock-runtime',
            region_name=config.bedrock_region
        )

    def _read_file_content(self, file_path: str) -> str:
        """指定されたパスからファイルコンテンツを読み込む"""
        try:
            path = Path(file_path)
            if path.exists():
                content = path.read_text(encoding='utf-8')
                # 長すぎる場合は切り詰め（ドキュメントあたり最大約4000文字）
                if len(content) > 4000:
                    content = content[:4000] + "\n...(切り詰め)"
                return content
            else:
                return f"[ファイルが見つかりません: {file_path}]"
        except Exception as e:
            return f"[ファイル読み込みエラー: {e}]"

    def build_context(self, query: str, results: List[QueryResult]) -> tuple[str, str]:
        """検索結果からRAGコンテキストを構築"""
        system_prompt = """あなたは技術文書を基に質問に答えるAIアシスタントです。
提供されたコンテキストに基づいて、正確で簡潔な回答を生成してください。
コンテキストに情報がない場合は、その旨を明確に伝えてください。"""

        # 検索結果からコンテキストを構築
        context_parts = []
        for idx, result in enumerate(results, 1):
            file_path = result.metadata.get('file_path', '')

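            # Prefer the actual file content; fall back to the metadata text
            # when the path is missing or the file does not exist locally.
            if file_path and Path(file_path).is_file():
                file_content = self._read_file_content(file_path)
            else:
                file_content = result.metadata.get('text', '[コンテンツなし]')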

            context_parts.append(
                f"[ドキュメント {idx}] (ソース: {file_path}, 距離: {result.distance:.4f})\n"
                f"{file_content}"
            )

        context_text = "\n\n".join(context_parts)

        user_prompt = f"""以下のコンテキストを参考に、質問に答えてください。

【コンテキスト】
{context_text}

【質問】
{query}

【回答】"""

        return system_prompt, user_prompt

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    def generate_response(self, query: str, results: List[QueryResult]) -> RAGResponse:
        """Claude Sonnet 4を使用してレスポンスを生成"""
        print("Claude Sonnet 4でRAGレスポンスを生成中...")

        system_prompt, user_prompt = self.build_context(query, results)

        messages = [
            {
                "role": "user",
                "content": [{"text": user_prompt}]
            }
        ]

        response = self.bedrock_runtime.converse(
            modelId=self.config.llm_model_id,
            messages=messages,
            system=[{"text": system_prompt}],
            inferenceConfig={
                "maxTokens": self.config.max_tokens,
                "temperature": self.config.temperature
            }
        )

        answer = response['output']['message']['content'][0]['text']
        usage = response['usage']

        sources = [
            result.metadata.get('file_path', 'Unknown')
            for result in results
        ]

        return RAGResponse(
            answer=answer,
            sources=sources,
            model_id=self.config.llm_model_id,
            input_tokens=usage['inputTokens'],
            output_tokens=usage['outputTokens']
        )


def parse_args() -> tuple[QueryConfig, str, bool]:
    """コマンドライン引数を解析（環境変数をデフォルト値として使用）"""
    parser = argparse.ArgumentParser(
        description="S3 Vectors QueryVectors - ベクトル検索とRAG生成"
    )

    parser.add_argument(
        '--query',
        type=str,
        required=True,
        help='検索クエリテキスト'
    )
    parser.add_argument(
        '--bucket',
        type=str,
        default=os.getenv('VECTOR_BUCKET_NAME'),
        help='S3 Vectorsバケット名（環境変数: VECTOR_BUCKET_NAME）'
    )
    parser.add_argument(
        '--index',
        type=str,
        default=os.getenv('VECTOR_INDEX_NAME', 'markdown-embeddings'),
        help='ベクトルインデックス名（環境変数: VECTOR_INDEX_NAME）'
    )
    parser.add_argument(
        '--region',
        type=str,
        default=os.getenv('AWS_REGION', 'us-east-1'),
        help='AWSリージョン（環境変数: AWS_REGION）'
    )
    parser.add_argument(
        '--top-k',
        type=int,
        default=3,
        help='返す検索結果の数'
    )
    parser.add_argument(
        '--enable-rag',
        action='store_true',
        help='Claude Sonnet 4でRAG生成を有効化'
    )

    args = parser.parse_args()

    # バケット名の検証
    if not args.bucket:
        parser.error("--bucket引数または環境変数VECTOR_BUCKET_NAMEが必要です")

    config = QueryConfig(
        vector_bucket_name=args.bucket,
        index_name=args.index,
        bedrock_region=args.region,
        top_k=args.top_k
    )

    return config, args.query, args.enable_rag


def display_results(results: List[QueryResult]):
    """検索結果を整形して表示"""
    if not results:
        print("検索結果が見つかりませんでした。")
        return

    print("\n=== 検索結果 ===\n")

    for idx, result in enumerate(results, 1):
        file_name = result.metadata.get('file_name', 'Unknown')
        file_path = result.metadata.get('file_path', 'Unknown')

        print(f"{idx}. {file_name}")
        print(f"   キー: {result.key}")
        print(f"   距離: {result.distance:.4f}")
        print(f"   パス: {file_path}")

        # テキストプレビューを表示（利用可能な場合）
        text_preview = result.metadata.get('text', '')
        if text_preview:
            preview = text_preview[:100] + "..." if len(text_preview) > 100 else text_preview
            print(f"   プレビュー: {preview}")
        print()


def display_rag_response(response: RAGResponse):
    """RAGレスポンスを整形して表示"""
    print("\n=== RAGレスポンス ===\n")
    print("--- 生成された回答 ---")
    print(response.answer)
    print("----------------------\n")

    print("ソース:")
    for idx, source in enumerate(response.sources, 1):
        print(f"  {idx}. {source}")

    print(f"\nモデル: {response.model_id}")
    print(f"トークン: {response.input_tokens} 入力, {response.output_tokens} 出力\n")


def main():
    """Main query pipeline."""
    try:
        config, query_text, enable_rag = parse_args()

        print("\n=== S3 Vectors - QueryVectors ===")
        print(f"Bucket: {config.vector_bucket_name}")
        print(f"Index: {config.index_name}")
        print(f"RAG: {'enabled' if enable_rag else 'disabled'}\n")

        # Step 1: vector search
        searcher = S3VectorsQuery(config)
        results = searcher.search(query_text)

        # Print the search results
        display_results(results)

        # Step 2: RAG generation (only when enabled and there are results)
        if enable_rag and results:
            try:
                generator = RAGGenerator(config)
                rag_response = generator.generate_response(query_text, results)
                display_rag_response(rag_response)
            except Exception as e:
                # A RAG failure is not fatal: the search results were already printed
                print(f"\nRAG generation failed: {e}")
                print("Showing search results only\n")
                logger.exception("RAG generation failed")

    except Exception as e:
        print(f"\n✗ Query failed: {e}\n")
        logger.exception("Query pipeline failed")
        sys.exit(1)


if __name__ == "__main__":
    main()
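
The bucket-name check in `parse_args` implies that `--bucket` can fall back to the `VECTOR_BUCKET_NAME` environment variable. The following is a minimal sketch of that pattern, not the script's actual code: the helper names `build_parser` and `resolve_bucket`, the injectable `env` mapping, and the `--top-k` default of 5 are assumptions made for illustration.

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose --bucket default comes from VECTOR_BUCKET_NAME."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Query S3 Vectors (sketch)")
    parser.add_argument("--query", required=True, help="Query text")
    # Fall back to the environment variable when --bucket is omitted
    parser.add_argument("--bucket", default=env.get("VECTOR_BUCKET_NAME"))
    # The default of 5 is an assumption for this sketch
    parser.add_argument("--top-k", type=int, default=5)
    parser.add_argument("--enable-rag", action="store_true")
    return parser


def resolve_bucket(argv, env=None):
    """Return the bucket name, or exit with a usage error if none is set."""
    parser = build_parser(env)
    args = parser.parse_args(argv)
    if not args.bucket:
        # parser.error prints the usage message and exits with status 2
        parser.error("either --bucket or VECTOR_BUCKET_NAME is required")
    return args.bucket
```

Passing `env` explicitly keeps the fallback easy to test without touching the real process environment. An explicit `--bucket` always takes precedence over the environment variable.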

3. Sample document

This is the sample Markdown file to vectorize. Its text is in Japanese and comes from a Fusic news page.

sample-docs/fusic-brand-slogan.md

# Fusicブランドスローガン

## ブランドスローガンを定めた背景と目的

ブランドスローガンを定めた背景には、当社が掲げるミッション・ビジョンに立ち返り、「私たちは世の中に対して何を成すのか」という意思や約束を、あらためて定義しようという想いがありました。社内インタビューやワークショップを重ね、丁寧に向き合い、こだわり抜いて、少しずつ言葉を紡ぎだしていった結果、生まれたのがこのブランドスローガンです。

### ミッション

「Why we do」自分たちの在り方

**"人に多様な道を 世の中に爪跡を"**

### ビジョン

「What we do」日々の心得

**"個性をかき集めて、驚きの角度から世の中をアップデートしつづける。"**

ブランドスローガンを定める目的は、スローガンという共通言語を持つことによって、メンバー全員が同じ目線で会話できるようにし、Fusicブランドを世の中に適切な形で届けることです。

## ブランドスローガンに込めた想いと意味

### 「OSEKKAI × TECHNOLOGY」

このブランドスローガンは、Fusic の本質的な価値を表したものです。自分たちの在り方である「Why we do」を実現するために、日々の心得である「What we do」を実行し、その結果として社会に貢献している価値を表現しています。

### 「ココロと技術で、ぴったりも、びっくりも。」

このサブコピーは、ブランドスローガンを補強する言葉として、お客さま一人ひとりと丁寧に向き合い、伴走しながら、求められているもの以上のプラスαを提供していくという私たちの姿勢を表しています。

ブランドスローガンの価値観や目指す方向性をまとめた「ブランドステートメント」も定めました。

4. Running the scripts

PutVectors (vector insertion)

uv run python scripts/put_vectors.py --source-dir ./sample-docs

Output


=== S3 Vectors - PutVectors ===
Source: sample-docs
Bucket: s3vectors-rag-dev-vectors
Index: markdown-embeddings
Dimensions: 1024

Scanning directory: sample-docs
✓ Found: sample-docs/fusic-brand-slogan.md
Files found: 1

Step 1 complete: scanned 1 file

2025-12-04 18:41:15,784 - INFO - Found credentials in shared credentials file: ~/.aws/credentials
Processing 1/1: fusic-brand-slogan.md
✓ Embedding complete: fusic-brand-slogan.md (689 tokens)

Step 2 complete: generated 1 embedding

Inserting 1 vector...
✓ Vector insertion complete

✓ PutVectors complete!
Files processed: 1
Vectors inserted: 1
Elapsed time: 2.88 s
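
To make the inserted data more concrete, here is a rough sketch of how one entry in a PutVectors request could be assembled. The helper `build_vector_record` is hypothetical. The key format is inferred from the key shown in the query output (`doc_1_fusic-brand-slogan`), and the metadata fields mirror the ones that `query_vectors.py` reads (`file_name`, `file_path`, `text`). The actual `put_vectors.py` may build its records differently.

```python
from pathlib import Path


def build_vector_record(index, file_path, text, embedding):
    """Build one entry for the `vectors` list of a PutVectors request."""
    path = Path(file_path)
    return {
        # Key format inferred from the query output; an assumption of this sketch
        "key": f"doc_{index}_{path.stem}",
        # S3 Vectors expects the embedding as float32 values
        "data": {"float32": [float(x) for x in embedding]},
        # Metadata fields that the query script reads back when printing results
        "metadata": {
            "file_name": path.name,
            "file_path": path.as_posix(),
            "text": text,
        },
    }
```

A record built this way would then be passed as `vectors=[record]` to `boto3.client("s3vectors").put_vectors(...)`, together with `vectorBucketName` and `indexName`.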

QueryVectors (search only)

uv run python scripts/query_vectors.py --query "Fusicのブランドスローガンについて教えて"

Output


=== S3 Vectors - QueryVectors ===
Bucket: s3vectors-rag-dev-vectors
Index: markdown-embeddings
RAG: disabled

2025-12-04 18:45:23,823 - INFO - Found credentials in shared credentials file: ~/.aws/credentials
Query: Fusicのブランドスローガンについて教えて
✓ Query vectorized: 1024 dimensions
Searching S3 Vectors...
✓ Search complete: 1 result


=== Search Results ===

1. fusic-brand-slogan.md
   Key: doc_1_fusic-brand-slogan
   Distance: 0.3428
   Path: sample-docs/fusic-brand-slogan.md
   Preview: # Fusicブランドスローガン

## ブランドスローガンを定めた背景と目的

ブランドスローガンを定めた背景には、当社が掲げるミッション・ビジョンに立ち返り、「私たちは世の中に対して何を成すのか」...
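
The value printed in the result is a distance, so smaller means closer. As a rough illustration, and assuming the index was created with the cosine metric and that the service reports distance as 1 minus cosine similarity (neither is confirmed by the output above), the value could be reproduced locally like this. The function `cosine_distance` is a hypothetical helper, not part of the scripts.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Under that assumption, identical directions give a distance near 0, orthogonal vectors give 1, and opposite vectors give 2.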

QueryVectors + RAG (answer generation)

uv run python scripts/query_vectors.py \
  --query "Fusicのブランドスローガンについて教えて" \
  --enable-rag

Output


=== S3 Vectors - QueryVectors ===
Bucket: s3vectors-rag-dev-vectors
Index: markdown-embeddings
RAG: enabled

2025-12-04 18:46:17,163 - INFO - Found credentials in shared credentials file: ~/.aws/credentials
Query: Fusicのブランドスローガンについて教えて
✓ Query vectorized: 1024 dimensions
Searching S3 Vectors...
✓ Search complete: 1 result


=== Search Results ===

1. fusic-brand-slogan.md
   Key: doc_1_fusic-brand-slogan
   Distance: 0.3428
   Path: sample-docs/fusic-brand-slogan.md
   Preview: # Fusicブランドスローガン

## ブランドスローガンを定めた背景と目的

ブランドスローガンを定めた背景には、当社が掲げるミッション・ビジョンに立ち返り、「私たちは世の中に対して何を成すのか」...

Generating RAG response with Claude Sonnet 4...

=== RAG Response ===

--- Generated Answer ---
Fusicのブランドスローガンについてご説明します。

## ブランドスローガン
**「OSEKKAI × TECHNOLOGY」**

## サブコピー
**「ココロと技術で、ぴったりも、びっくりも。」**

## 策定の背景と目的
ブランドスローガンは、Fusicのミッション・ビジョンに立ち返り、「私たちは世の中に対して何を成すのか」という意思や約束を明確に定義するために策定されました。社内インタビューやワークショップを重ねて丁寧に検討した結果、生まれたものです。

目的は、スローガンという共通言語を持つことで、メンバー全員が同じ目線で会話できるようにし、Fusicブランドを世の中に適切な形で届けることです。

## 込められた想い
- **「OSEKKAI × TECHNOLOGY」**：Fusicの本質的な価値を表現したもので、ミッション（Why we do）を実現するためにビジョン（What we do）を実行し、その結果として社会に貢献している価値を表現しています。

- **「ココロと技術で、ぴったりも、びっくりも。」**：お客さま一人ひとりと丁寧に向き合い、伴走しながら、求められているもの以上のプラスαを提供していくという姿勢を表現しています。

このブランドスローガンは、Fusicの価値観や目指す方向性をまとめた「ブランドステートメント」とともに定められています。
----------------------

Sources:
  1. sample-docs/fusic-brand-slogan.md

Model: us.anthropic.claude-sonnet-4-20250514-v1:0
Tokens: 808 input, 469 output
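
The token counts at the end of the output come from the model response. As a hedged sketch, if the generation step used the Bedrock Converse API (an assumption, since the script may call `invoke_model` instead), the answer text and usage could be pulled out of the response dict roughly as follows. `summarize_converse_response` is a hypothetical helper name.

```python
def summarize_converse_response(response):
    """Extract answer text and token usage from a Converse-style response dict."""
    # Converse responses nest the generated text under output.message.content
    content = response.get("output", {}).get("message", {}).get("content", [])
    answer = "".join(block.get("text", "") for block in content)
    # Token counts are reported under the top-level "usage" key
    usage = response.get("usage", {})
    return {
        "answer": answer,
        "input_tokens": usage.get("inputTokens", 0),
        "output_tokens": usage.get("outputTokens", 0),
    }
```

Reading with `.get` keeps the helper tolerant of a partial response, at the cost of silently reporting 0 tokens when the `usage` block is missing.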

5. Cleaning up resources

cd envs/dev
terraform destroy

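Conclusion

This post walked through the code for building RAG with Terraform and Amazon S3 Vectors. Now that S3 Vectors is generally available, it can be managed from Terraform with version 6.24.0 or later of the AWS provider.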
