
Running CogStudio on AWS

mobmob

It starts up without problems, but generation fails with an error

The log looks like this:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/gradio/queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/gradio/blocks.py", line 2015, in process_api
    result = await self.call_function(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/gradio/blocks.py", line 1562, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/home/ubuntu/.local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/home/ubuntu/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/gradio/utils.py", line 865, in wrapper
    response = f(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/gradio/utils.py", line 865, in wrapper
    response = f(*args, **kwargs)
  File "/home/ubuntu/CogVideo/inference/gradio_composite_demo/cogstudio.py", line 727, in generate
    latents, seed = infer(
  File "/home/ubuntu/CogVideo/inference/gradio_composite_demo/cogstudio.py", line 245, in infer
    video_pt = pipe_image(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/diffusers/pipelines/cogvideo/pipeline_cogvideox_image2video.py", line 777, in __call__
    noise_pred = self.transformer(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/diffusers/models/transformers/cogvideox_transformer_3d.py", line 473, in forward
    hidden_states, encoder_hidden_states = block(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/diffusers/models/transformers/cogvideox_transformer_3d.py", line 132, in forward
    attn_hidden_states, attn_encoder_hidden_states = self.attn1(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 495, in forward
    return self.processor(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1950, in __call__
    hidden_states = F.scaled_dot_product_attention(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 56.50 GiB. GPU

Looks like the GPU does not have enough memory: the attention step tried to allocate 56.50 GiB in one go.
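
As a rough sanity check on the 56.50 GiB figure, here is a back-of-the-envelope sketch. Every model number in it is an assumption, not something stated in this thread: a 49-frame 720x480 clip, 8x spatial and 4x temporal VAE compression, 2x2 patches, 226 text tokens, 48 attention heads, 16-bit scores, and a batch of 2 from classifier-free guidance. If `F.scaled_dot_product_attention` ends up on a code path that materializes the full attention score matrix, its size under these assumptions would be:

```python
def attention_scores_gib(batch: int, heads: int, seq_len: int, bytes_per_elem: int) -> float:
    """Size in GiB of a full (batch, heads, seq_len, seq_len) attention score tensor."""
    return batch * heads * seq_len * seq_len * bytes_per_elem / 2**30


# All values below are assumptions about the model and the request, not taken from the log.
latent_frames = (49 - 1) // 4 + 1                      # 49 frames, 4x temporal compression
tokens_per_frame = (480 // 8 // 2) * (720 // 8 // 2)   # 8x VAE downscale, then 2x2 patches
text_tokens = 226
seq_len = latent_frames * tokens_per_frame + text_tokens

size = attention_scores_gib(batch=2, heads=48, seq_len=seq_len, bytes_per_elem=2)
print(f"sequence length: {seq_len}")
print(f"attention scores: {size:.2f} GiB")
```

Under these assumptions the result lines up with the number in the traceback, which suggests the failure came from a non-fused attention path rather than from the model weights. This is only one reading that happens to reproduce the logged value.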

mobmob

Changing the instance type to deal with the memory issue above

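I was originally on g4dn.xlarge, so I tried switching to g5.2xlarge.
The cost looks roughly like this (the comparison itself is not reproduced in this text).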
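
To confirm how much GPU memory an instance actually exposes before and after such a switch, a small sketch like the following can help. It shells out to `nvidia-smi` with its `--query-gpu=memory.total` CSV query; the helper names are hypothetical, and the parsing is split out so it can be checked without a GPU. As far as I know, g4dn instances carry an NVIDIA T4 (16 GB) and g5 instances an A10G (24 GB), but the command output on the instance is the thing to trust.

```python
import subprocess


def parse_memory_total_mib(csv_output: str) -> list[int]:
    """Parse nvidia-smi CSV output (no header, no units): one MiB value per GPU line."""
    return [int(line.strip()) for line in csv_output.splitlines() if line.strip()]


def query_memory_total_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of every visible GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi exits with an error
    )
    return parse_memory_total_mib(result.stdout)
```

On the instance, calling `query_memory_total_mib()` returns one integer per GPU; dividing by 1024 gives GiB.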

mobmob

It worked...

I fed it an arbitrary image generated with stable-diffusion, and it ran without problems.

However, generating a 6-second video took about 20 minutes.
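
Since the instance is billed by the hour, the 20-minute generation time translates directly into a per-video cost. A minimal sketch of that arithmetic follows; the hourly rate in it is a made-up placeholder, not the actual g5.2xlarge price, so substitute the current on-demand rate for your region.

```python
def cost_per_video_usd(hourly_rate_usd: float, minutes_per_video: float) -> float:
    """Instance cost attributable to one video, ignoring startup and idle time."""
    return hourly_rate_usd * minutes_per_video / 60.0


# Placeholder rate for illustration only; look up the real on-demand price.
PLACEHOLDER_HOURLY_RATE_USD = 1.50

cost = cost_per_video_usd(PLACEHOLDER_HOURLY_RATE_USD, minutes_per_video=20)
print(f"about {cost:.2f} USD per 6-second video at the placeholder rate")
```

The same function can be reused to compare instance types once real prices and measured generation times are plugged in.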

mobmob

Full CloudFormation template

AWSTemplateFormatVersion: "2010-09-09"
Description: A CloudFormation template to deploy the Cog Studio Web UI
Parameters:
  SubnetId:
    Description: The ID of the subnet where the EC2 instance will be launched.
    Type: AWS::EC2::Subnet::Id
  Ec2ImageId:
    Type: String
    Default: ami-09e6f55b0eefaa8ef
    Description: Enter the appropriate AMI ID for your region. Tested with "Deep Learning AMI GPU PyTorch 1.13.1 (Ubuntu 20.04) 20230510" in ap-northeast-1.
  Ec2InstanceType:
    Type: String
    Default: g5.2xlarge
  EC2InstanceProfileName:
    Type: String
    Description: Name of the existing IAM instance profile which has access to S3 (arn:aws:iam::<account>:instance-profile/<EC2InstanceProfileName>)
Resources:
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref Ec2InstanceType
      ImageId: !Ref Ec2ImageId
      IamInstanceProfile: !Ref EC2InstanceProfileName
      SubnetId: !Ref SubnetId
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 300
            VolumeType: gp2
      "Tags": [{ "Key": "Name", "Value": "cog-studio-cf" }]
      UserData:
        "Fn::Base64": !Sub |
          Content-Type: multipart/mixed; boundary="//"
          MIME-Version: 1.0

          --//
          Content-Type: text/cloud-config; charset="us-ascii"
          MIME-Version: 1.0
          Content-Transfer-Encoding: 7bit
          Content-Disposition: attachment; filename="cloud-config.txt"

          #cloud-config
          cloud_final_modules:
          - [scripts-user, always]

          --//
          Content-Type: text/x-shellscript; charset="us-ascii"
          MIME-Version: 1.0
          Content-Transfer-Encoding: 7bit
          Content-Disposition: attachment; filename="userdata.txt"

          #!/bin/bash

          # Install packages
          sudo apt update
          sudo add-apt-repository ppa:deadsnakes/ppa -y
          sudo apt -y install wget git s3fs
          sudo apt -y install python3 python-is-python3 python3-pip python3-venv
          sudo apt -y install python3.10 python3.10-distutils python3.10-venv python3.10-tk
          curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
          python3.10 -m pip install --upgrade pip
          sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1

          # Launch Cog Video Web UI
          cd /home/ubuntu
          # setup script uses existing folder if it exists. Feel free to change version here.
          sudo -u ubuntu git clone https://github.com/THUDM/CogVideo
          cd /home/ubuntu/CogVideo/inference/gradio_composite_demo
          sudo -u ubuntu wget https://raw.githubusercontent.com/pinokiofactory/cogstudio/refs/heads/main/cogstudio.py --no-check-certificate
          sudo -u ubuntu chmod 644 cogstudio.py
          sudo -u ubuntu chown ubuntu:ubuntu cogstudio.py
          sudo -u ubuntu python -m venv env
          # "source env/bin/activate" inside a separate "bash -c" does not persist to later commands,
          # so call the venv's own pip and python directly.
          sudo -u ubuntu env/bin/pip install -r requirements.txt
          sudo -u ubuntu env/bin/pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
          sudo -u ubuntu env/bin/pip install 'markupsafe==2.0.1'
          sudo -u ubuntu env/bin/pip install moviepy==2.0.0.dev2
          sudo -u ubuntu env/bin/python cogstudio.py

          --//--
Outputs:
  InstanceID:
    Description: EC2Instance ID
    Value: !Ref EC2Instance
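
For reference, the template above could be deployed from Python roughly as follows. This is a sketch under assumptions: it uses boto3's CloudFormation `create_stack` call, the function names, file path, and stack name are hypothetical, and `Ec2ImageId` is left at the template default. The template creates no IAM resources, so no extra capabilities are passed.

```python
def build_parameters(subnet_id: str, instance_profile_name: str,
                     instance_type: str = "g5.2xlarge") -> list[dict]:
    """Build the Parameters list in the shape CloudFormation's create_stack expects."""
    values = {
        "SubnetId": subnet_id,
        "EC2InstanceProfileName": instance_profile_name,
        "Ec2InstanceType": instance_type,
    }
    return [{"ParameterKey": key, "ParameterValue": value} for key, value in values.items()]


def deploy(template_path: str, stack_name: str, parameters: list[dict],
           region: str = "ap-northeast-1") -> str:
    """Create the stack and return its StackId. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_parameters works without boto3 installed

    with open(template_path, encoding="utf-8") as f:
        template_body = f.read()
    client = boto3.client("cloudformation", region_name=region)
    response = client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
    )
    return response["StackId"]
```

A call might look like `deploy("cogstudio.yaml", "cog-studio", build_parameters("subnet-...", "my-instance-profile"))`, where the file name, stack name, subnet ID, and profile name are placeholders for your own values.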
This scrap was closed 10 days ago.