📻
Building a Podcast at Lightning Speed with Quick Suite and AgentCore

Yuki Sekiya
Published on 2025/11/14
👉 This post is day 15 of the AWS AI Agent Blog Festival (Zenn: #awsaiagentblogfes, X: #AWS_AI_AGENT_ブログ祭り).
Note: The day 14 article is here:
"Data analysis by chat!? Working with Quick Sight data using Amazon Quick Suite Chat agents"

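Introduction
In my earlier post, "Researching Hidden-Gem Tourist Spots with Quick Suite", I used the Research feature to carry out an investigation.
The summary feature was enough to get the gist. But there are times when your eyes are tired, or when you cannot look at a screen but your ears are free.
This time, I use Quick Suite and AgentCore to build a setup that meets that need as well.
Note: This blog uses the us-east-1 Region.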

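1. Create a Lambda Layer for ffmpeg
Go to the page for the ffmpeg Lambda Layer and press the Deploy button.
Next, press Deploy at the bottom right to start the deployment.
After a short wait, the ARN appears under Resources at the bottom of the screen.
Note: This ARN is used in a later step, so make a note of it.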
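Before pasting the ARN into the stack parameter, it can help to check that it really is a layer version ARN and that it lives in the Region used in this blog, because a Lambda function can only use layers from its own Region. The following is a minimal sketch; the helper names `parse_layer_arn` and `check_layer_region` are hypothetical and not part of the original setup.

```python
import re

# Layer version ARN shape: arn:aws:lambda:<region>:<account-id>:layer:<name>:<version>
LAYER_ARN_PATTERN = re.compile(
    r"^arn:aws:lambda:(?P<region>[a-z0-9-]+):(?P<account>\d{12})"
    r":layer:(?P<name>[A-Za-z0-9_-]+):(?P<version>\d+)$"
)


def parse_layer_arn(arn: str) -> dict:
    """Split a Lambda layer version ARN into its named parts."""
    match = LAYER_ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"Not a Lambda layer version ARN: {arn!r}")
    return match.groupdict()


def check_layer_region(arn: str, expected_region: str = "us-east-1") -> dict:
    """Raise ValueError when the layer is not in the Region the stack will use."""
    parts = parse_layer_arn(arn)
    if parts["region"] != expected_region:
        raise ValueError(
            f"Layer is in {parts['region']}, but the stack is deployed to {expected_region}"
        )
    return parts
```

Calling `check_layer_region` with the ARN copied from the Resources section either returns its parts or explains why the ARN would not work with a stack in us-east-1.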

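2. Create the AWS CloudFormation stack
Next, go to the AWS CloudFormation console and, from Create stack at the top right, choose "With new resources (standard)".
Then upload the CloudFormation template.

For the template, save the code at the bottom of this blog locally under a name such as main.yaml and upload that file.
Next, set the stack name and the parameter. For the parameter, enter the ARN you noted earlier.
Acknowledge that IAM resources will be created, then press Next.
Review what will be created and press the "Submit" button.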
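The same stack can also be created from a script instead of the console. The sketch below is one way to do it with boto3, assuming the template is saved as main.yaml and the stack is named podcast-maker (the name that appears in the Cleanup section). The function names are hypothetical. Because the template sets explicit `RoleName` values, the call acknowledges `CAPABILITY_NAMED_IAM`.

```python
from pathlib import Path


def build_create_stack_args(stack_name: str, template_body: str, layer_arn: str) -> dict:
    """Build the keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "FFmpegLayerArn", "ParameterValue": layer_arn},
        ],
        # The template creates IAM roles with custom names.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


def create_stack(stack_name: str, template_path: str, layer_arn: str,
                 region: str = "us-east-1") -> str:
    """Create the stack and wait until creation completes. Requires AWS credentials."""
    import boto3  # imported here so the helper above works without boto3 installed

    client = boto3.client("cloudformation", region_name=region)
    template_body = Path(template_path).read_text(encoding="utf-8")
    response = client.create_stack(
        **build_create_stack_args(stack_name, template_body, layer_arn)
    )
    client.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return response["StackId"]
```

For example, `create_stack("podcast-maker", "main.yaml", layer_arn)` would return the stack ID once the resources exist, with `layer_arn` being the value noted in step 1.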

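3. Connect Quick Suite to the Podcast MCP
Go to the Integration page of Amazon Quick Suite and press the plus button for Model Context Protocol.
Note: This walkthrough uses the environment from my earlier post, "Researching Hidden-Gem Tourist Spots with Quick Suite".

If you have not yet registered an account or run the research, follow the steps in that earlier post first.
Enter a name, a description, and the MCP server endpoint.
You can find the MCP server endpoint on the Outputs tab of the AWS CloudFormation stack you just created.
The input fields on the next page also use values from the Outputs tab; the template outputs the Cognito client ID, client secret, token URL, and authorize URL.
When you move to the next page, an Amazon Cognito pop-up window appears. Press Sign up.
Create a new account and continue.
Note: You will be asked for a one-time verification code, so complete that step as well.
A loading screen appears, but press "Next" at the bottom right.
Press "Next" at the bottom right once more.
Podcast Maker then appears on the Actions tab, so click it.
Press the "Reconnect" button at the top right.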
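Quick Suite drives the OAuth Authorization Code flow itself, so none of this needs to be written by hand. To illustrate what the Cognito pop-up is doing, the sketch below builds the kind of authorize URL that such a flow opens, using the domain pattern, scopes, and callback URL defined in the template. The function name and the client ID in the usage note are hypothetical, and the exact parameters Quick Suite sends are an assumption.

```python
from urllib.parse import urlencode


def build_authorize_url(domain_prefix: str, region: str, client_id: str,
                        redirect_uri: str,
                        scopes=("openid", "email", "profile"),
                        state=None) -> str:
    """Build a Cognito hosted-UI authorize URL for the Authorization Code grant."""
    base = f"https://{domain_prefix}.auth.{region}.amazoncognito.com/oauth2/authorize"
    params = {
        "response_type": "code",      # Authorization Code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must match one of the client's CallbackURLs
        "scope": " ".join(scopes),
    }
    if state is not None:
        params["state"] = state        # opaque value echoed back to the caller
    return f"{base}?{urlencode(params)}"
```

With a stack named podcast-maker, the domain prefix from the template would be `podcast-maker-podcast-oauth`, and the redirect URI would be `https://us-east-1.quicksight.aws.amazon.com/sn/oauthcallback`.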

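4. Download the research files
In a new tab, go to the Research page of Amazon Quick Suite and download each research file.
Note: This walkthrough uses the environment from my earlier post, "Researching Hidden-Gem Tourist Spots with Quick Suite".

If you have not yet registered an account or run the research, follow the steps in that earlier post first.
From the "Share" button at the top right, select Word to download the file.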

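5. Prepare the chat
Open the Chat agents page and press the expand button on the right.
Press the paperclip icon in the center of the screen to open the file upload modal.
Attach the Word files you downloaded earlier.
Then move on to the modal for adding actions.
From the Actions tab, add Podcast Maker MCP.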

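6. Create the podcast
Now for the main topic: creating the podcast. Enter the following prompt.
You are a popular podcast host. You can create engaging shows.

The theme this time is to analyze the three uploaded Word files and convey their content in an easy-to-understand way.

Now, come up with a scenario, create the show with Podcast Maker MCP, and tell me the URL of the audio file.
Partway through, you are asked to confirm the script; press Submit.
After a while, an Amazon S3 presigned URL is displayed, so open it in a new tab or similar.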
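The script that the chat agent submits has to match the tool schema in the Appendix: a `dialogues` array whose entries carry a `speaker` (Alice or Bob) and a `text` of at most 3000 characters. The sketch below builds such a payload and applies the same checks that the Lambda function performs. The function name and the sample lines are hypothetical.

```python
import json

ALLOWED_SPEAKERS = {"Alice", "Bob"}  # the speakers mapped to Polly voices in the template
MAX_TEXT_LENGTH = 3000               # per-dialogue limit enforced by the Lambda function


def build_dialogues_payload(lines) -> dict:
    """Turn (speaker, text) pairs into the arguments of the generate_podcast tool."""
    dialogues = []
    for idx, (speaker, text) in enumerate(lines):
        if speaker not in ALLOWED_SPEAKERS:
            raise ValueError(f"Invalid speaker in dialogue {idx}: {speaker}")
        if len(text) > MAX_TEXT_LENGTH:
            raise ValueError(f"Text too long in dialogue {idx}: max {MAX_TEXT_LENGTH} characters")
        dialogues.append({"speaker": speaker, "text": text})
    if not dialogues:
        raise ValueError("dialogues array cannot be empty")
    return {"dialogues": dialogues}


payload = build_dialogues_payload([
    ("Alice", "こんにちは。今日は穴場の観光地を紹介します。"),
    ("Bob", "楽しみですね。早速はじめましょう。"),
])
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The sample lines are in Japanese because the template maps both speakers to Japanese Polly voices.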

Cleanup
To remove the resources used here, delete the podcast-maker and serverlessrepo-ffmpeg-lambda-layer stacks from the CloudFormation console.
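CloudFormation cannot delete an S3 bucket that still contains objects, and the audio bucket in the template has versioning enabled, so the stack deletion may fail once a podcast has been generated. A minimal sketch for emptying the bucket beforehand follows. The helper names are hypothetical, and the bucket name follows the `BucketName` pattern in the template.

```python
def audio_bucket_name(stack_name: str, account_id: str) -> str:
    """Reproduce the BucketName pattern used by the template."""
    return f"{stack_name}-podcast-audio-{account_id}"


def empty_versioned_bucket(bucket) -> None:
    """Delete every object version and delete marker in a boto3 s3.Bucket resource."""
    bucket.object_versions.delete()


def empty_audio_bucket(stack_name: str, account_id: str) -> None:
    """Empty the podcast audio bucket. Requires AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3 installed

    bucket = boto3.resource("s3").Bucket(audio_bucket_name(stack_name, account_id))
    empty_versioned_bucket(bucket)
```

Running `empty_audio_bucket("podcast-maker", "<your account ID>")` before deleting the stack removes the stored audio, including noncurrent versions that the one-day lifecycle rule does not expire.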

Summary
This time, I created a podcast from Amazon Quick Suite research content.
To do so, I used Amazon Bedrock AgentCore Gateway as the MCP server.
I feel I got a real sense of the strengths of the AWS ecosystem, where MCP servers are easy to build, and of Amazon Quick Suite, which can connect to all kinds of things.
I encourage you to put AI to work with Amazon Bedrock AgentCore and Amazon Quick Suite too.
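For readers curious about what travels between an MCP client and the gateway, the sketch below builds a JSON-RPC `tools/call` request as defined by the Model Context Protocol and posts it with a bearer token. Several details are assumptions rather than facts from this walkthrough: the function names are hypothetical, the gateway may expose the tool under a name prefixed with the target name (a `tools/list` call returns the actual name), and the access token has to come from the Cognito OAuth flow.

```python
import json
import urllib.request


def build_tools_call_request(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build an MCP JSON-RPC 2.0 request that invokes one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def post_to_gateway(gateway_url: str, access_token: str, payload: dict) -> str:
    """POST a JSON-RPC payload to the gateway and return the raw response text."""
    request = urllib.request.Request(
        gateway_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Assumed tool name; check the name reported by tools/list before relying on it.
example_request = build_tools_call_request(
    "generate_podcast",
    {"dialogues": [{"speaker": "Alice", "text": "こんにちは。"}]},
)
```

Here `gateway_url` would be the GatewayURL value from the stack's Outputs tab.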

Appendix
AWSTemplateFormatVersion: '2010-09-09'
Description: MCP Podcast Server with Bedrock AgentCore Gateway, Lambda, and Cognito (OAuth Only - Secure Design)

Parameters:
  FFmpegLayerArn:
    Type: String
    Description: ARN of the FFmpeg Lambda Layer (specify the actual ARN at deploy time)
    Default: arn:aws:lambda:us-west-2:123456789012:layer:ffmpeg:1

Resources:
  # Cognito User Pool dedicated to OAuth authentication
  PodcastUserPool:
    Type: AWS::Cognito::UserPool
    Properties:
      UserPoolName: MCPPodcastServerUserPool
      UsernameAttributes:
        - email
      AutoVerifiedAttributes:
        - email
      Policies:
        PasswordPolicy:
          MinimumLength: 8
          RequireUppercase: true
          RequireLowercase: true
          RequireNumbers: true
          RequireSymbols: true
          TemporaryPasswordValidityDays: 7
      AccountRecoverySetting:
        RecoveryMechanisms:
          - Name: verified_email
            Priority: 1

  # User Pool Domain for the OAuth flow
  PodcastUserPoolDomain:
    Type: AWS::Cognito::UserPoolDomain
    Properties:
      Domain: !Sub '${AWS::StackName}-podcast-oauth'
      UserPoolId: !Ref PodcastUserPool

  # User Pool Client for the Authorization Code Grant
  PodcastUserPoolClient:
    Type: AWS::Cognito::UserPoolClient
    Properties:
      ClientName: MCPPodcastAuthClient
      UserPoolId: !Ref PodcastUserPool
      GenerateSecret: true
      # OAuth Authorization Code Grant
      AllowedOAuthFlowsUserPoolClient: true
      AllowedOAuthFlows:
        - code
      AllowedOAuthScopes:
        - email
        - openid
        - profile
      CallbackURLs:
        - "https://us-east-1.quicksight.aws.amazon.com/sn/oauthcallback"
        - "https://us-west-2.quicksight.aws.amazon.com/sn/oauthcallback"
      LogoutURLs:
        - "https://us-east-1.quicksight.aws.amazon.com/sn/logout"
        - "https://us-west-2.quicksight.aws.amazon.com/sn/logout"
      SupportedIdentityProviders:
        - COGNITO
      PreventUserExistenceErrors: ENABLED
      AccessTokenValidity: 1  # 1 hour
      IdTokenValidity: 1      # 1 hour
      RefreshTokenValidity: 30  # 30 days

  # S3 bucket for storing the generated podcast audio files
  PodcastAudioBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub '${AWS::StackName}-podcast-audio-${AWS::AccountId}'
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      VersioningConfiguration:
        Status: Enabled
      LifecycleConfiguration:
        Rules:
          - Id: DeleteOldAudioFiles
            Status: Enabled
            ExpirationInDays: 1

  # IAM role for Lambda execution
  PodcastLambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub '${AWS::StackName}-lambda-execution-role'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: PodcastGenerationPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - polly:SynthesizeSpeech
                Resource: '*'
              - Effect: Allow
                Action:
                  - s3:PutObject
                  - s3:GetObject
                Resource: !Sub '${PodcastAudioBucket.Arn}/*'

  # Lambda function with inline code (ZipFile)
  PodcastGeneratorFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: !Sub '${AWS::StackName}-podcast-generator'
      Runtime: python3.12
      Handler: index.lambda_handler
      Role: !GetAtt PodcastLambdaExecutionRole.Arn
      Timeout: 900
      MemorySize: 10240
      EphemeralStorage:
        Size: 10240
      Environment:
        Variables:
          S3_BUCKET_NAME: !Ref PodcastAudioBucket
      Layers:
        - !Ref FFmpegLayerArn
      Code:
        ZipFile: |
          import json
          import os
          import boto3
          import logging
          from datetime import datetime
          import subprocess
          import tempfile
          import shutil

          # Logging setup
          logger = logging.getLogger()
          logger.setLevel(logging.INFO)

          # Initialize AWS clients
          polly_client = boto3.client('polly')
          s3_client = boto3.client('s3')

          # Environment variables
          S3_BUCKET_NAME = os.environ.get('S3_BUCKET_NAME')

          # VoiceId mapping (speaker name -> Polly VoiceId)
          VOICE_MAPPING = {
              'Alice': 'Mizuki',  # Japanese female voice
              'Bob': 'Takumi'     # Japanese male voice
          }

          def validate_input(event):
              """Validate the input JSON."""
              try:
                  body = json.loads(event.get('body', '{}'))
                  
                  if 'dialogues' not in body:
                      raise ValueError("Missing required field: dialogues")
                  
                  if not isinstance(body['dialogues'], list):
                      raise ValueError("dialogues must be an array")
                  
                  if len(body['dialogues']) == 0:
                      raise ValueError("dialogues array cannot be empty")
                  
                  for idx, dialogue in enumerate(body['dialogues']):
                      if 'speaker' not in dialogue:
                          raise ValueError(f"Missing speaker in dialogue {idx}")
                      if 'text' not in dialogue:
                          raise ValueError(f"Missing text in dialogue {idx}")
                      
                      if len(dialogue['text']) > 3000:
                          raise ValueError(f"Text too long in dialogue {idx}: max 3000 characters")
                      
                      if dialogue['speaker'] not in VOICE_MAPPING:
                          raise ValueError(f"Invalid speaker in dialogue {idx}: {dialogue['speaker']}")
                  
                  return body
              
              except json.JSONDecodeError:
                  raise ValueError("Invalid JSON format")

          def synthesize_speech(text, voice_id, output_path):
              """Synthesize speech with Amazon Polly."""
              try:
                  response = polly_client.synthesize_speech(
                      Text=text,
                      OutputFormat='mp3',
                      VoiceId=voice_id,
                      Engine='standard'
                  )
                  
                  with open(output_path, 'wb') as f:
                      f.write(response['AudioStream'].read())
                  
                  return True
              
              except Exception as e:
                  logger.error(f"Speech synthesis failed: {str(e)}")
                  raise

          def concat_audio_files(input_files, output_path):
              """Concatenate audio files with FFmpeg."""
              try:
                  # Create filelist.txt
                  filelist_path = '/tmp/filelist.txt'
                  with open(filelist_path, 'w') as f:
                      for file in input_files:
                          f.write(f"file '{file}'\n")
                  
                  # Run FFmpeg
                  cmd = [
                      '/opt/bin/ffmpeg',
                      '-f', 'concat',
                      '-safe', '0',
                      '-i', filelist_path,
                      '-c', 'copy',
                      output_path
                  ]
                  
                  result = subprocess.run(cmd, capture_output=True, text=True, check=True)
                  
                  return True
              
              except subprocess.CalledProcessError as e:
                  logger.error(f"FFmpeg failed: {e.stderr}")
                  raise Exception(f"Audio concatenation failed: {e.stderr}")
              except Exception as e:
                  logger.error(f"Concatenation error: {str(e)}")
                  raise

          def upload_to_s3(file_path, bucket_name):
              """Upload a file to S3."""
              try:
                  timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
                  object_key = f'podcasts/{timestamp}.mp3'
                  
                  s3_client.upload_file(file_path, bucket_name, object_key)
                  
                  # Get the file size
                  file_size = os.path.getsize(file_path)
                  
                  # Generate a presigned URL (valid for 1 hour)
                  presigned_url = s3_client.generate_presigned_url(
                      'get_object',
                      Params={'Bucket': bucket_name, 'Key': object_key},
                      ExpiresIn=3600
                  )
                  
                  return {
                      'presignedUrl': presigned_url,
                      'fileSize': file_size,
                      'objectKey': object_key
                  }
              
              except Exception as e:
                  logger.error(f"S3 upload failed: {str(e)}")
                  raise

          def cleanup_temp_files(file_paths):
              """Clean up temporary files."""
              for file_path in file_paths:
                  try:
                      if os.path.exists(file_path):
                          os.remove(file_path)
                  except Exception as e:
                      logger.warning(f"Failed to cleanup {file_path}: {str(e)}")

          def lambda_handler(event, context):
              """Lambda handler function."""
              temp_files = []
              
              try:
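                  # When called via the MCP Gateway, the tool arguments may be passed directly as the event
                  if 'dialogues' in event:
                      # Passed directly: wrap the arguments so the same validation is applied
                      body = validate_input({'body': json.dumps(event)})
                  else:
                      # HTTP gateway-style event (JSON string in 'body')
                      body = validate_input(event)
                  dialogues = body['dialogues']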
                  
                  logger.info(f"Processing {len(dialogues)} dialogues")
                  
                  # Synthesize speech for each dialogue
                  audio_files = []
                  for idx, dialogue in enumerate(dialogues):
                      speaker = dialogue['speaker']
                      text = dialogue['text']
                      voice_id = VOICE_MAPPING[speaker]
                      
                      output_path = f'/tmp/dialogue_{idx}.mp3'
                      temp_files.append(output_path)
                      
                      synthesize_speech(text, voice_id, output_path)
                      audio_files.append(output_path)
                  
                  # Concatenate the audio files
                  output_path = '/tmp/output.mp3'
                  temp_files.append(output_path)
                  
                  if len(audio_files) == 1:
                      # A single file is used as-is
                      shutil.copy(audio_files[0], output_path)
                  else:
                      # Multiple files are concatenated
                      concat_audio_files(audio_files, output_path)
                      temp_files.append('/tmp/filelist.txt')
                  
                  # Check the file size (100 MB limit)
                  file_size = os.path.getsize(output_path)
                  if file_size > 100 * 1024 * 1024:
                      raise Exception(f"Generated file too large: {file_size} bytes")
                  
                  # Upload to S3
                  s3_result = upload_to_s3(output_path, S3_BUCKET_NAME)
                  
                  # Estimate the audio duration in seconds (rough calculation from the file size)
                  # Assumes an average MP3 bitrate of 128 kbps (128,000 bits per second)
                  duration = file_size / (128 * 1000 / 8)
                  
                  # Success response
                  response_body = {
                      'status': 'success',
                      'presignedUrl': s3_result['presignedUrl'],
                      'fileSize': s3_result['fileSize'],
                      'duration': round(duration, 2)
                  }
                  
                  logger.info(f"Processing completed successfully")
                  
                  return {
                      'statusCode': 200,
                      'headers': {
                          'Content-Type': 'application/json'
                      },
                      'body': json.dumps(response_body)
                  }
              
              except ValueError as e:
                  logger.warning(f"Validation error: {str(e)}")
                  return {
                      'statusCode': 400,
                      'headers': {
                          'Content-Type': 'application/json'
                      },
                      'body': json.dumps({'error': str(e)})
                  }
              
              except Exception as e:
                  logger.error(f"Processing failed: {str(e)}", exc_info=True)
                  return {
                      'statusCode': 500,
                      'headers': {
                          'Content-Type': 'application/json'
                      },
                      'body': json.dumps({'error': 'Internal server error'})
                  }
              
              finally:
                  # Clean up temporary files
                  cleanup_temp_files(temp_files)

  # IAM role for the Bedrock AgentCore Gateway
  GatewayExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub '${AWS::StackName}-gateway-execution-role'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: bedrock-agentcore.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: GatewayInvokeLambdaPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                Resource: !GetAtt PodcastGeneratorFunction.Arn

  # Bedrock AgentCore Gateway
  PodcastMCPGateway:
    Type: AWS::BedrockAgentCore::Gateway
    Properties:
      Name: !Sub '${AWS::StackName}-mcp-gateway'
      ProtocolType: MCP
      AuthorizerType: CUSTOM_JWT
      RoleArn: !GetAtt GatewayExecutionRole.Arn
      AuthorizerConfiguration:
        CustomJWTAuthorizer:
          DiscoveryUrl: !Sub 'https://cognito-idp.${AWS::Region}.amazonaws.com/${PodcastUserPool}/.well-known/openid-configuration'
          AllowedClients:
            - !Ref PodcastUserPoolClient

  # Bedrock AgentCore GatewayTarget (registers the Lambda tool)
  PodcastMCPGatewayTarget:
    Type: AWS::BedrockAgentCore::GatewayTarget
    Properties:
      GatewayIdentifier: !Ref PodcastMCPGateway
      Name: podcast-generator-tool
      TargetConfiguration:
        Mcp:
          Lambda:
            LambdaArn: !GetAtt PodcastGeneratorFunction.Arn
            ToolSchema:
              InlinePayload:
                - Name: generate_podcast
                  Description: Generate podcast audio from dialogue script
                  InputSchema:
                    Type: object
                    Properties:
                      dialogues:
                        Type: array
                        Description: Array of dialogue entries
                        Items:
                          Type: object
                          Properties:
                            speaker:
                              Type: string
                              Description: Speaker name (Alice or Bob)
                            text:
                              Type: string
                              Description: Text to speak
                          Required:
                            - speaker
                            - text
                    Required:
                      - dialogues
      CredentialProviderConfigurations:
        - CredentialProviderType: GATEWAY_IAM_ROLE

  # Lambda permission for the AgentCore Gateway
  GatewayLambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt PodcastGeneratorFunction.Arn
      Action: lambda:InvokeFunction
      Principal: bedrock-agentcore.amazonaws.com
      SourceArn: !Sub 'arn:aws:bedrock-agentcore:${AWS::Region}:${AWS::AccountId}:gateway/${PodcastMCPGateway}'

Outputs:
  GatewayURL:
    Description: AgentCore Gateway endpoint URL
    Value: !GetAtt PodcastMCPGateway.GatewayUrl
    Export:
      Name: !Sub '${AWS::StackName}-GatewayURL'

  UserPoolClientId:
    Description: Cognito User Pool Client ID
    Value: !Ref PodcastUserPoolClient
    Export:
      Name: !Sub '${AWS::StackName}-UserPoolClientId'

  UserPoolClientSecret:
    Description: Cognito User Pool Client Secret (for the OAuth flow)
    Value: !GetAtt PodcastUserPoolClient.ClientSecret
    Export:
      Name: !Sub '${AWS::StackName}-UserPoolClientSecret'

  # OAuth Token Endpoint (exchanges an authorization code for tokens)
  OAuthTokenURL:
    Description: OAuth Token URL (exchanges an authorization code for tokens)
    Value: !Sub 'https://${PodcastUserPoolDomain}.auth.${AWS::Region}.amazoncognito.com/oauth2/token'
    Export:
      Name: !Sub '${AWS::StackName}-OAuthTokenURL'

  # OAuth Authorize Endpoint (starts user authentication)
  OAuthAuthorizeURL:
    Description: OAuth Authorize URL (starts user authentication)
    Value: !Sub 'https://${PodcastUserPoolDomain}.auth.${AWS::Region}.amazoncognito.com/oauth2/authorize'
    Export:
      Name: !Sub '${AWS::StackName}-OAuthAuthorizeURL'
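
The Lambda function in the template rejects any dialogue whose text exceeds 3000 characters. If a script contains a longer passage, one option is to split it into several consecutive dialogues for the same speaker before calling the tool. The sketch below is a hypothetical helper, not part of the template. It breaks the text at sentence-ending punctuation where possible and falls back to a hard cut for a single overlong sentence.

```python
def split_sentences(text: str, delimiters: str = "。！？!?\n") -> list:
    """Split text after each sentence-ending character, keeping the character."""
    sentences, current = [], ""
    for ch in text:
        current += ch
        if ch in delimiters:
            sentences.append(current)
            current = ""
    if current:
        sentences.append(current)
    return sentences


def chunk_text(text: str, limit: int = 3000) -> list:
    """Group sentences into chunks of at most `limit` characters each."""
    chunks, current = [], ""
    for sentence in split_sentences(text):
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        # Start a new chunk when the next sentence would not fit.
        if len(current) + len(sentence) > limit:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then become its own `{"speaker": ..., "text": ...}` entry, which keeps the speech order intact while staying within the limit.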