Reading the CDK source code to figure out how to use it

hassaku63

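I'll summarize my findings as I go, using ChatGPT along the way.

GPT-4 is slow, so I'll use GPT-3.5 and compare its output against the actual source as I investigate.

The topic is how Lambda Functions are deployed: how zip-based and image-based packaging are configured, and so on.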

hassaku63

lambda.Code.fromBucket(bucket, key[, objectVersion]) - specify an S3 object that contains the archive of your runtime code.
lambda.Code.fromInline(code) - inline the handle code as a string. This is limited to supported runtimes.
lambda.Code.fromAsset(path) - specify a directory or a .zip file in the local filesystem which will be zipped and uploaded to S3 before deployment. See also bundling asset code.
lambda.Code.fromDockerBuild(path, options) - use the result of a Docker build as code. The runtime code is expected to be located at /asset in the image and will be zipped and uploaded to S3 as an asset.

There are four.

Method	Description
`lambda.Code.fromBucket(bucket, key[, objectVersion])`	Specifies an S3 object that contains the archive of the runtime code.
`lambda.Code.fromInline(code)`	Inlines the handler code as a string. Limited to supported runtimes.
`lambda.Code.fromAsset(path)`	Specifies a directory or a .zip file on the local filesystem, which is zipped and uploaded to S3 before deployment. See also the documentation on bundling asset code.
`lambda.Code.fromDockerBuild(path, options)`	Uses the result of a Docker build as code. The runtime code is expected to be at /asset in the image; it is zipped and uploaded to S3 as an asset.

hassaku63

[ChatGPT]

Among the methods on the Code class, is there one that supports deploying a Lambda Function as a Docker image?

Yes, the Code class in AWS CDK has a method that supports deploying Lambda functions in Docker image format. That method is lambda.Code.fromDockerImage(image[, options]). With it you can use a Docker image directly as the runtime code of a Lambda function. The image parameter takes the URI of the Docker image. You can also use the options parameter to specify settings for building the Docker image.

hassaku63

Let's look at the actual source.

fromDockerBuild looks like the method that corresponds to deploying from a container image.

// packages/aws-cdk-lib/aws-lambda/lib/code.ts

// export abstract class Code
  /**
   * Loads the function code from an asset created by a Docker build.
   *
   * By default, the asset is expected to be located at `/asset` in the
   * image.
   *
   * @param path The path to the directory containing the Docker file
   * @param options Docker build options
   */
  public static fromDockerBuild(path: string, options: DockerBuildAssetOptions = {}): AssetCode {
    let imagePath = options.imagePath ?? '/asset/.';

    // ensure imagePath ends with /. to copy the **content** at this path
    if (imagePath.endsWith('/')) {
      imagePath = `${imagePath}.`;
    } else if (!imagePath.endsWith('/.')) {
      imagePath = `${imagePath}/.`;
    }

    const assetPath = cdk.DockerImage
      .fromBuild(path, options)
      .cp(imagePath, options.outputPath);

    return new AssetCode(assetPath);
  }

DockerBuildAssetOptions has the properties imagePath (string) and outputPath (string), but both are optional and things should work without them, so I'll skip them for now.
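To check my reading of the imagePath handling, here is a minimal standalone sketch of the same normalization. The helper name normalizeImagePath is my own and does not exist in CDK; it only mirrors the branch logic quoted above.

```typescript
// Standalone re-implementation of the imagePath handling in
// Code.fromDockerBuild, for illustration only.
// `normalizeImagePath` is a hypothetical helper name, not a CDK API.
function normalizeImagePath(imagePath: string = "/asset/."): string {
  // Make sure the path ends with "/." so that docker cp copies the
  // *contents* of the directory rather than the directory itself.
  if (imagePath.endsWith("/")) {
    return `${imagePath}.`;
  }
  if (!imagePath.endsWith("/.")) {
    return `${imagePath}/.`;
  }
  return imagePath;
}

console.log(normalizeImagePath());             // "/asset/."
console.log(normalizeImagePath("/asset"));     // "/asset/."
console.log(normalizeImagePath("/var/task/")); // "/var/task/."
```

So whatever form is passed in, the source side of the copy always ends in "/.".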

It calls cdk.DockerImage.fromBuild(path, options) on the directory given by path. DockerImage.fromBuild() acts as a factory method that returns an instance of its own class.

In the end it returns an AssetCode. For now I'll just think of AssetCode as some kind of "asset" and move on.

Next, DockerImage.fromBuild().

hassaku63

DockerImage.fromBuild()

export class DockerImage extends BundlingDockerImage {
  /**
   * Builds a Docker image
   *
   * @param path The path to the directory containing the Docker file
   * @param options Docker build options
   */
  public static fromBuild(path: string, options: DockerBuildOptions = {}) {
    const buildArgs = options.buildArgs || {};

    if (options.file && isAbsolute(options.file)) {
      throw new Error(`"file" must be relative to the docker build directory. Got ${options.file}`);
    }

    // Image tag derived from path and build options
    const input = JSON.stringify({ path, ...options });
    const tagHash = crypto.createHash('sha256').update(input).digest('hex');
    const tag = `cdk-${tagHash}`;

    const dockerArgs: string[] = [
      'build', '-t', tag,
      ...(options.file ? ['-f', join(path, options.file)] : []),
      ...(options.platform ? ['--platform', options.platform] : []),
      ...(options.targetStage ? ['--target', options.targetStage] : []),
      ...flatten(Object.entries(buildArgs).map(([k, v]) => ['--build-arg', `${k}=${v}`])),
      path,
    ];

    dockerExec(dockerArgs);

    // Fingerprints the directory containing the Dockerfile we're building and
    // differentiates the fingerprint based on build arguments. We do this so
    // we can provide a stable image hash. Otherwise, the image ID will be
    // different every time the Docker layer cache is cleared, due primarily to
    // timestamps.
    const hash = FileSystem.fingerprint(path, { extraHash: JSON.stringify(options) });
    return new DockerImage(tag, hash);
  }
  // ...

Let me get a rough idea of what is going on.

The image tag is generated automatically around here:

    const tagHash = crypto.createHash('sha256').update(input).digest('hex');
    const tag = `cdk-${tagHash}`;
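The tag derivation can be tried in isolation with Node's crypto module. This is a sketch under my own assumptions: BuildOptionsSketch and deriveImageTag are hypothetical names that only cover the options visible in the excerpt above.

```typescript
import { createHash } from "crypto";

// Hypothetical stand-in for the subset of build options seen in the excerpt.
interface BuildOptionsSketch {
  file?: string;
  platform?: string;
  targetStage?: string;
  buildArgs?: Record<string, string>;
}

// Mirrors the quoted logic: hash the JSON of { path, ...options } with SHA-256
// and prefix the hex digest with "cdk-".
function deriveImageTag(path: string, options: BuildOptionsSketch = {}): string {
  const input = JSON.stringify({ path, ...options });
  const tagHash = createHash("sha256").update(input).digest("hex");
  return `cdk-${tagHash}`;
}

const tagDefault = deriveImageTag("python-handlers");
const tagAmd64 = deriveImageTag("python-handlers", { platform: "linux/amd64" });

// The same inputs give the same tag; different options give a different tag.
console.log(tagDefault === deriveImageTag("python-handlers"));
console.log(tagDefault !== tagAmd64);
```

The tag therefore depends only on the path string and the options, not on the files in the directory.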

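The option properties used explicitly are file, platform, and targetStage (buildArgs is also passed through as --build-arg). The one most relevant to what goes into the container seems to be file, which appears to point at the Dockerfile.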
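To see how those options end up on the docker command line, here is a small sketch that rebuilds the argument list without running Docker. DockerBuildOptionsSketch and dockerBuildArgs are names I made up for illustration; the flag mapping follows the excerpt above.

```typescript
import { join } from "path";

// Hypothetical stand-in for the options used in the excerpt.
interface DockerBuildOptionsSketch {
  file?: string;
  platform?: string;
  targetStage?: string;
  buildArgs?: Record<string, string>;
}

// Builds the argument list for `docker build` without executing anything.
function dockerBuildArgs(
  tag: string,
  path: string,
  options: DockerBuildOptionsSketch = {},
): string[] {
  const args: string[] = ["build", "-t", tag];
  if (options.file) {
    // The Dockerfile path is resolved relative to the build directory.
    args.push("-f", join(path, options.file));
  }
  if (options.platform) {
    args.push("--platform", options.platform);
  }
  if (options.targetStage) {
    args.push("--target", options.targetStage);
  }
  const buildArgs = options.buildArgs ?? {};
  for (const key of Object.keys(buildArgs)) {
    args.push("--build-arg", `${key}=${buildArgs[key]}`);
  }
  args.push(path); // the build context comes last
  return args;
}

const exampleArgs = dockerBuildArgs("cdk-example", "python-handlers", {
  file: "Dockerfile",
  platform: "linux/amd64",
});
console.log(["docker", ...exampleArgs].join(" "));
```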

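The dockerExec() call in the middle appears to be where docker build is actually run, so I'll skip over it.

For now, it seems fine to think of the DockerImage class as representing a Docker image that has been produced one way or another, whether built locally or pulled from an existing image in ECR.

In the end, DockerImage is instantiated with the image tag and the hash as arguments.

As described above, the image tag is the string cdk- with a hash of the path and build options appended.
The hash looks like it reflects the contents of the image itself. Is it computing a hash that follows the Docker image specification on its own???

The comment just before the hash calculation is as follows: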

Fingerprints the directory containing the Dockerfile we're building and differentiates the fingerprint based on build arguments. We do this so we can provide a stable image hash. Otherwise, the image ID will be different every time the Docker layer cache is cleared, due primarily to timestamps.

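In short:

The directory containing the Dockerfile is fingerprinted, and the fingerprint is varied by the build options, which gives a stable hash. Otherwise the image ID would change every time the Docker layer cache is cleared, mainly because of timestamps.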

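Reading this, I wondered whether you can specify the hash of a Docker image yourself, but that surely can't be how Docker works. Judging from the code, the value comes from FileSystem.fingerprint(path, ...), so it looks like a fingerprint that CDK computes over the build directory for its own use rather than a Docker image ID. This is a side issue, so I'll leave the details for now.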
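To convince myself that a directory hash can be independent of timestamps, here is a rough sketch. It is not CDK's actual FileSystem.fingerprint algorithm; fingerprintDirectory is a hypothetical helper that hashes relative file names and file contents plus an extra string, which is enough to show the idea in the comment.

```typescript
import { createHash } from "crypto";
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical helper: hashes relative names and contents only, so file
// timestamps never influence the result. Not CDK's real implementation.
function fingerprintDirectory(dir: string, extraHash: string = ""): string {
  const hash = createHash("sha256");
  hash.update(extraHash);
  const walk = (current: string): void => {
    // Sort entries so the traversal order is deterministic.
    for (const name of fs.readdirSync(current).sort()) {
      const full = path.join(current, name);
      const rel = path.relative(dir, full);
      if (fs.statSync(full).isDirectory()) {
        hash.update(`dir:${rel}\n`);
        walk(full);
      } else {
        hash.update(`file:${rel}\n`);
        hash.update(fs.readFileSync(full));
      }
    }
  };
  walk(dir);
  return hash.digest("hex");
}

// Demo: changing a timestamp keeps the fingerprint, changing options does not.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "fingerprint-demo-"));
const demoFile = path.join(demoDir, "Dockerfile");
fs.writeFileSync(demoFile, "FROM scratch\n");

const before = fingerprintDirectory(demoDir, JSON.stringify({}));
fs.utimesSync(demoFile, new Date(0), new Date(0)); // rewrite atime/mtime
const afterTouch = fingerprintDirectory(demoDir, JSON.stringify({}));
const withOptions = fingerprintDirectory(
  demoDir,
  JSON.stringify({ platform: "linux/amd64" }),
);

console.log(before === afterTouch);
console.log(before !== withOptions);
```

Passing the serialized options as the extra input plays the same role as extraHash in the quoted call.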

After DockerImage is built with fromBuild, cp is called on it, so let's look at that.

// export class DockerImage extends BundlingDockerImage
  /**
   * Copies a file or directory out of the Docker image to the local filesystem.
   *
   * If `outputPath` is omitted the destination path is a temporary directory.
   *
   * @param imagePath the path in the Docker image
   * @param outputPath the destination path for the copy operation
   * @returns the destination path
   */
  public cp(imagePath: string, outputPath?: string): string {
    // It is preferable for the deprecated class to inherit a non-deprecated class.
    // However, in this case, the opposite has occurred which is incompatible with
    // a deprecation feature. See https://github.com/aws/jsii/issues/3102.
    const deprecated = quiet();

    const result = super.cp(imagePath, outputPath);

    reset(deprecated);
    return result;
  }

So imagePath is a path inside the container, and outputPath is a path on the host.

The doc string on the parent class's cp says:

Copies a file or directory out of the Docker image to the local filesystem.

If outputPath is omitted the destination path is a temporary directory.

That is what I expected, so the guess seems right.

hassaku63

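Reading this far, it looks like cp copies the build output to a path on the host.

But if you were really deploying a Docker image, wouldn't that mean pushing to ECR??? So this is a bit counterintuitive.

The implementation of BundlingDockerImage.cp() appears to correspond to docker cp.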
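As far as I can tell, the copy amounts to creating a temporary container from the image, running docker cp against it, and then removing the container. The sketch below only plans those commands and executes nothing; planDockerCp, the container ID, and the destination path are all placeholders of mine, and the exact sequence is my assumption rather than something quoted from the source.

```typescript
// Hypothetical helper that only *plans* the docker commands; nothing is run.
function planDockerCp(
  image: string,
  containerId: string, // in reality this would come from `docker create` output
  imagePath: string,
  destPath: string,
): string[][] {
  return [
    ["create", image],                               // create a stopped container
    ["cp", `${containerId}:${imagePath}`, destPath], // copy files out to the host
    ["rm", "-v", containerId],                       // remove the temporary container
  ];
}

const plan = planDockerCp(
  "cdk-example",            // placeholder image tag
  "0123456789abcdef",       // placeholder container ID
  "/asset/.",
  "/tmp/cdk-docker-cp-demo" // placeholder destination directory
);
for (const args of plan) {
  console.log(["docker", ...args].join(" "));
}
```

Nothing in this sequence pushes an image anywhere, which fits the impression that the image is only a build environment.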

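Having seen all this, I'm starting to think this is really "produce the contents for a zip deployment inside a Docker environment and copy them to the host." Does lambda.Code.fromDockerBuild actually support container deployment??

Actually building a Lambda and running synth/deploy should settle it, so let's try.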

hassaku63

The Dockerfile is reused almost as-is from the one in the following documentation:

FROM public.ecr.aws/lambda/python:3.8

# Install the function's dependencies using file requirements.txt
# from your project folder.

COPY requirements.txt  .
RUN  pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"

# Copy function code
COPY app.py ${LAMBDA_TASK_ROOT}

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.handler" ]

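I'm currently using 3.9, so I changed the FROM line accordingly.
I also replaced the module name used for the entry point as appropriate.

I had code I'd been testing with the zip deployment, placed under python-handlers/ in the CDK project, so I reused it.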

$ tree python-handlers
python-handlers
├── Dockerfile
├── handler.py
├── requirements.txt
└── template.txt

The CDK construct looks like this:

    const pythonFunction = new lambda.Function(this, "PythonFunction", {
      code: lambda.Code.fromDockerBuild("python-handlers"),
      handler: "handler.handler",
      runtime: lambda.Runtime.PYTHON_3_9,
    });

With the implementation above, synth failed with an error.

Error: Failed to copy files from xxxxxxxxxxxxxx:/asset/. to /private/var/folders/k5/xxxxxxxxx/T/cdk-docker-cp-XXXXXXX: Error: docker exited with status 1

There was no problem with permissions on the host side. Looking at the fromDockerBuild options again:

imagePath, an option of DockerBuildAssetOptions, defaults to /asset inside the container and is used as the src argument of docker cp. So it really isn't building the container image to deploy; the container in this context is a build environment for the asset.

If that is right, the Dockerfile used with lambda.Code.fromDockerBuild() is expected to output the asset to /asset.

I added COPY . /asset to the Dockerfile so that the directory after pip install is placed at /asset in the container, and cdk synth passed.

Looking at the corresponding AWS::Lambda::Function resource in the synth output:

PythonFunctionXXXXXXX:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket:
          Fn::Sub: cdk-xxxxxxxxx-assets-${AWS::AccountId}-${AWS::Region}
        S3Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.zip

As expected, lambda.Code.fromDockerBuild() turns out to be a method that supports zip-based deployment.
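Instead of editing the Dockerfile, pointing imagePath at the directory where the build already puts its output should also work. The sketch below is untested and assumes aws-cdk-lib v2, a stack class name I made up (PythonHandlersStack), and that LAMBDA_TASK_ROOT is /var/task in the AWS-provided base image.

```typescript
import * as cdk from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { Construct } from "constructs";

// Hypothetical stack; assumes a Dockerfile exists under python-handlers/.
export class PythonHandlersStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new lambda.Function(this, "PythonFunction", {
      // Copy the build result out of /var/task (assumed to be
      // LAMBDA_TASK_ROOT in the base image) instead of the default /asset.
      code: lambda.Code.fromDockerBuild("python-handlers", {
        imagePath: "/var/task",
      }),
      handler: "handler.handler",
      runtime: lambda.Runtime.PYTHON_3_9,
    });
  }
}
```

Either way the result is still a zip asset uploaded to S3, not a container image.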

hassaku63

Now, back to finding out how to deploy a Python Lambda in container form.

I'll just go and dig through the CDK documentation for it.

hassaku63

It was right there in the docs...

new lambda.DockerImageFunction(this, 'AssetFunction', {
  code: lambda.DockerImageCode.fromImageAsset(path.join(__dirname, 'docker-handler')),
});

The declaration of DockerImageCode looks like this:

export declare abstract class DockerImageCode {
    /**
     * Use an existing ECR image as the Lambda code.
     * @param repository the ECR repository that the image is in
     * @param props properties to further configure the selected image
     */
    static fromEcr(repository: ecr.IRepository, props?: EcrImageCodeProps): DockerImageCode;
    /**
     * Create an ECR image from the specified asset and bind it as the Lambda code.
     * @param directory the directory from which the asset must be created
     * @param props properties to further configure the selected image
     */
    static fromImageAsset(directory: string, props?: AssetImageCodeProps): DockerImageCode;
    /**
     * Produce a `Code` instance from this `DockerImageCode`.
     * @internal
     */
    abstract _bind(architecture?: Architecture): Code;
}

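AssetImageCodeProps defines cmd, entrypoint, and workingDirectory as optional properties. If I'm using the Dockerfile sample from the official documentation almost as-is, I can skip these for now. I'll implement it as the CDK documentation shows.

Running synth again, the output now looks like a container image deployment, as intended.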

  PythonFunctionXXXXX:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ImageUri:
          Fn::Sub: ${AWS::AccountId}.dkr.ecr.${AWS::Region}.${AWS::URLSuffix}/cdk-XXXXXXX-container-assets-${AWS::AccountId}-${AWS::Region}:xxxxxxxxxxxxxxxxxxxxxxxxxxx
      PackageType: Image
      # ...
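The difference between the two synth outputs comes down to which keys appear under Code. A tiny helper makes that check explicit; SynthesizedCode and detectPackageKind are names of my own, and the helper only covers the two shapes seen in this investigation.

```typescript
// Minimal shape of the `Code` property of a synthesized AWS::Lambda::Function,
// limited to the keys that appeared in the two synth outputs.
interface SynthesizedCode {
  S3Bucket?: unknown;
  S3Key?: unknown;
  ImageUri?: unknown;
}

type PackageKind = "Zip" | "Image" | "Unknown";

// Hypothetical helper: classifies a function by the keys present under Code.
function detectPackageKind(code: SynthesizedCode): PackageKind {
  if (code.ImageUri !== undefined) {
    return "Image";
  }
  if (code.S3Bucket !== undefined && code.S3Key !== undefined) {
    return "Zip";
  }
  return "Unknown";
}

// Shape produced with lambda.Code.fromDockerBuild()
console.log(detectPackageKind({ S3Bucket: { "Fn::Sub": "..." }, S3Key: "xxxx.zip" }));
// Shape produced with lambda.DockerImageCode.fromImageAsset()
console.log(detectPackageKind({ ImageUri: { "Fn::Sub": "..." } }));
```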

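It looks like the ECR repository already exists at the point of synth??
(The creation date doesn't match, so it seems to be sharing an ECR repository created during some earlier experiment in another project that I've since forgotten. This feels close to how bootstrap creates the S3 bucket. I don't know yet exactly when it was created, but cdk synth only generates templates and does not create AWS resources, and the repository name follows the same pattern as the asset bucket in the earlier synth output, so it was most likely created by cdk bootstrap.)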