[AWS Study Log #4] Building a speech transcription & sentiment analysis pipeline with AWS Lambda and AWS AI Services

Published 2024/03/10

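I built a speech transcription and sentiment analysis pipeline with AWS Lambda and AWS AI Services (Amazon Transcribe and Amazon Comprehend), following the hands-on seminar linked here: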

https://pages.awscloud.com/JAPAN-event-OE-Hands-on-for-Beginners-Serverless-3-2022-reg-event.html?trk=aws_introduction_page

Note: the implementation here does not exactly match the seminar content.

CloudFormation

AWSTemplateFormatVersion: 2010-09-09
Resources:
  #======================
  # Bucket
  #======================
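  TranscribingS3:
    Type: AWS::S3::Bucket
    # Create the Lambda permission first so S3 can validate the notification target.
    DependsOn: TranscribingS3InvokeFunction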
    Properties:
      BucketName: transcribingbucketpractice
      NotificationConfiguration: 
        LambdaConfigurations: 
        - Event: s3:ObjectCreated:*
          Function: !GetAtt TranscribeFunction.Arn
  
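  TranscribeS3:
    Type: AWS::S3::Bucket
    # Create the Lambda permission first so S3 can validate the notification target.
    DependsOn: TranscribeS3InvokeFunction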
    Properties:
      BucketName: transcribes3buckepractice
      NotificationConfiguration: 
        LambdaConfigurations:
        - Event: s3:ObjectCreated:*
          Function: !GetAtt ComprehendFunction.Arn
  
  #======================
  # Function
  #======================
  TranscribeFunction:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.lambda_handler
      Role: !GetAtt LambdaCanGetTranscribingS3Object.Arn
      Runtime: python3.9
      Code: 
        ZipFile: !Sub |
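          import boto3
          import urllib.parse
          import datetime

          def lambda_handler(event, context):
            # Bucket and key of the uploaded audio file; S3 URL-encodes keys in events.
            bucket = event['Records'][0]['s3']['bucket']['name']
            key    = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

            transcribe = boto3.client('transcribe')
            # Start an asynchronous job; Transcribe writes the result JSON to the output bucket.
            transcribe.start_transcription_job(
              # Timestamp-based name (job names must be unique within the account and region).
              TranscriptionJobName = datetime.datetime.now().strftime('%Y%m%d%H%M%S') + '_Transcription',
              LanguageCode = 'ja-JP',
              Media = {
                  # An s3:// URI avoids hard-coding the region and re-encoding the decoded key.
                  'MediaFileUri': 's3://' + bucket + '/' + key
              },
              OutputBucketName = 'transcribes3buckepractice'
            )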

  ComprehendFunction:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.lambda_handler
      Role: !GetAtt LambdaCanGetTranscribeS3Object.Arn
      Runtime: python3.9
      Code: 
        ZipFile: !Sub |
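          import boto3
          import urllib.parse
          import json

          def lambda_handler(event, context):
            bucket = event['Records'][0]['s3']['bucket']['name']
            # S3 URL-encodes object keys in event notifications, so decode before use.
            key    = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

            # Only the transcript JSON is of interest; skip any other object written to the bucket.
            if not key.endswith('.json'):
              return

            s3     = boto3.client('s3')
            res    = json.loads(s3.get_object(Bucket=bucket, Key=key)['Body'].read().decode('utf-8'))

            if res:
              input_text = res["results"]["transcripts"][0]["transcript"]
              comprehend = boto3.client('comprehend')

              # Note: detect_sentiment rejects text over its size limit, so long transcripts need splitting.
              response = comprehend.detect_sentiment(
                Text         = input_text,
                LanguageCode = 'ja'
              )

              print('Input text:', input_text)
              print('Result:', response)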

  #======================
  # Permission
  #======================
  TranscribingS3InvokeFunction:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt TranscribeFunction.Arn
      Action: lambda:InvokeFunction
      Principal: s3.amazonaws.com
      SourceArn: !Sub arn:aws:s3:::transcribingbucketpractice
  
  TranscribeS3InvokeFunction:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt ComprehendFunction.Arn
      Action: lambda:InvokeFunction
      Principal: s3.amazonaws.com
      SourceArn: !Sub arn:aws:s3:::transcribes3buckepractice
  
  #======================
  # Role
  #======================
  LambdaCanGetTranscribingS3Object:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument: 
        Version: "2012-10-17"
        Statement:
        - Effect: Allow
          Principal:
            Service: lambda.amazonaws.com
          Action: sts:AssumeRole
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
      - arn:aws:iam::aws:policy/AmazonTranscribeFullAccess
      - arn:aws:iam::aws:policy/AmazonS3FullAccess

  LambdaCanGetTranscribeS3Object:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument: 
        Version: "2012-10-17"
        Statement:
        - Effect: Allow
          Principal:
            Service: lambda.amazonaws.com
          Action: sts:AssumeRole
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
      - arn:aws:iam::aws:policy/ComprehendFullAccess
      Policies:
      - PolicyName: CanGetS3Object
        PolicyDocument: 
          Version: "2012-10-17"
          Statement:
          - Effect: "Allow"
            Action: 
            - "s3:GetObject"
            Resource: "arn:aws:s3:::transcribes3buckepractice/*"
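
The first function in the template reads the bucket and key from the S3 event and hands them to Transcribe. As a rough sketch of that step, the logic can be pulled into plain functions that run locally without AWS credentials. The helper names `parse_s3_event` and `build_job_request` and the sample event are hypothetical and not part of the template, and the sketch assumes the `s3://` URI form for `MediaFileUri`.

```python
import urllib.parse
from datetime import datetime, timezone


def parse_s3_event(event):
    """Return (bucket, key) for the first record of an S3 event notification."""
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    # Object keys arrive URL-encoded (spaces become '+'), so decode them.
    key = urllib.parse.unquote_plus(s3_info["object"]["key"], encoding="utf-8")
    return bucket, key


def build_job_request(bucket, key, output_bucket, now=None):
    """Build the keyword arguments passed to start_transcription_job."""
    now = now or datetime.now(timezone.utc)
    return {
        "TranscriptionJobName": now.strftime("%Y%m%d%H%M%S") + "_Transcription",
        "LanguageCode": "ja-JP",
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "OutputBucketName": output_bucket,
    }


# Hypothetical event, trimmed to the fields the handler actually reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "transcribingbucketpractice"},
                "object": {"key": "my+voice+memo.mp3"},
            }
        }
    ]
}

bucket, key = parse_s3_event(sample_event)
request = build_job_request(
    bucket, key, "transcribes3buckepractice", now=datetime(2024, 3, 10, 12, 0, 0)
)
print(request["TranscriptionJobName"])
print(request["Media"]["MediaFileUri"])
```

In a Lambda handler the resulting dictionary would be passed as `transcribe.start_transcription_job(**request)`. Keeping the request construction separate from the API call makes the key decoding easy to check before deploying.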

Result

Sentiment analysis ran successfully on the transcribed input text.
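
To make the printed result easier to read, here is a minimal sketch of how the transcript text and the sentiment label could be pulled out of the two payloads. The helper names and the sample values are made up for illustration and are not actual output from this pipeline. The sketch assumes the `detect_sentiment` response carries a `Sentiment` label and a `SentimentScore` map keyed by `Positive`, `Negative`, `Neutral`, and `Mixed`.

```python
import json


def extract_transcript(transcribe_output_json):
    """Pull the full transcript text out of a Transcribe result document."""
    doc = json.loads(transcribe_output_json)
    return doc["results"]["transcripts"][0]["transcript"]


def summarize_sentiment(response):
    """Return (label, score of that label) from a detect_sentiment response."""
    label = response["Sentiment"]        # POSITIVE, NEGATIVE, NEUTRAL or MIXED
    scores = response["SentimentScore"]  # keys: Positive, Negative, Neutral, Mixed
    return label, scores[label.capitalize()]


# Made-up payloads shaped like the two service outputs.
sample_output = json.dumps(
    {"results": {"transcripts": [{"transcript": "今日はとても楽しかったです"}]}},
    ensure_ascii=False,
)
sample_response = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {
        "Positive": 0.98,
        "Negative": 0.005,
        "Neutral": 0.01,
        "Mixed": 0.005,
    },
}

text = extract_transcript(sample_output)
label, confidence = summarize_sentiment(sample_response)
print(f"{label} ({confidence:.2f}): {text}")
```

Logging only the label and its score, rather than the whole response dictionary, keeps the CloudWatch Logs entries short while preserving the part of the result that matters most.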

References

https://docs.aws.amazon.com/ja_jp/AWSCloudFormation/latest/UserGuide/aws-resource-lambda-function.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend/client/detect_sentiment.html
https://docs.aws.amazon.com/ja_jp/lambda/latest/dg/with-s3-example.html
https://dev.classmethod.jp/articles/lambda-invoke-streamingbody-error/
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#client