
Speeding Up Local CodeBuild Builds (Mimicking the Build Cache)


TL;DR

  • Build a custom cache mechanism that parses buildspec.yml and so on
  • Keep the cached directories on the host side and restore them when the build container starts

Background

CodeBuild has a feature that caches libraries downloaded during a build (for example by Gradle at runtime) and restores them from the cache on subsequent builds, skipping the download of identical libraries and thereby speeding up builds.

However, in local_builds, the local CodeBuild execution environment, this feature does not work: even if a cache key is specified in buildspec.yml, the log below is emitted and the cache is not used.

agent-1  | [Container] ****/**/** **:**:**.****** Found possible syntax errors in buildspec: 
agent-1  | In the section cache
agent-1  |      The following keys cannot be identified:
agent-1  |              key
agent-1  | [Container] ****/**/** **:**:**.****** Processing environment variables

Even if the project is configured to enable the build cache, there is currently no way to reproduce or mimic its behavior in the local environment.

Naturally, the internal workings and source code of the build cache are not published anywhere, so exactly the same behavior cannot be reproduced; moreover, reproducing even the S3 part locally would eliminate most of the benefit of caching.
So rather than aiming for a perfect reproduction, this article builds in a custom cache mechanism that mimics the behavior to a reasonable degree while keeping the impact on the build environment as small as possible, speeding up local CodeBuild without using S3.
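The overall flow can be sketched as follows. The function names here are hypothetical stand-ins; the actual implementation, shown later, wires restore_caches.sh and store_caches.sh around the CodeBuild executor in the build container's entrypoint:

```shell
#!/bin/sh
# Conceptual flow of the custom cache (stand-in functions, not the real scripts)
restore_caches() { echo "restore: copy host-side cache into the build container paths"; }
run_build()      { echo "build: run the normal CodeBuild phases"; }
store_caches()   { echo "store: copy cache.paths targets back to the host-side cache"; }

restore_caches && run_build && store_caches
```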

Preparation

Creating a custom agent image

Same approach as the article below (including applying the changes described in it).

  • Create a Dockerfile
    Dockerfile
    FROM public.ecr.aws/codebuild/local-builds:latest
    
    COPY "./docker-compose.yml" \
        "/LocalBuild/agent-resources/docker-compose.yml"
    COPY "./docker-compose-mount-src-dir.yml" \
        "/LocalBuild/agent-resources/docker-compose-mount-src-dir.yml"
    COPY "./local_build.sh" \
        "/usr/local/bin/local_build.sh"
    
    RUN curl \
        -L \
        -o /usr/bin/yq \
        https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 2>/dev/null && \
        chmod +x /usr/bin/yq && \
        chmod +x /usr/local/bin/local_build.sh
    
  • Patch file for local_build.sh
    local_build.patch
    diff --git a/local_build.sh b/local_build.sh
    --- a/local_build.sh
    +++ b/local_build.sh
    @@ -17,10 +17,14 @@ then
         touch /codebuild/output/log
         tail -F /codebuild/output/log &
         sh ./start > /dev/null
    +    if ${CODEBUILD_ENABLE_CUSTOM_CACHE:-false};then
    +        echo "Wait infinity for the custom cache process in the build container"
    +        sleep infinity
    +    fi
     else
         if [ -z "${LOCAL_AGENT_IMAGE_NAME}" ]
         then
    -        LOCAL_AGENT_IMAGE_NAME="amazon/aws-codebuild-local:latest"
    +        LOCAL_AGENT_IMAGE_NAME="my-local-builds:latest"
         fi
     
         if [ -z "${DOCKER_PRIVILEGED_MODE}" ]
    @@ -100,6 +104,148 @@ else
             /LocalBuild/agent-resources/bin/edit-docker-compose /LocalBuild/envFile/$ENV_VAR_FILE /LocalBuild/agent-resources/customer-specific.yml "EnvironmentVariables"
         fi
     
    +    if [ -n "${CACHE_HOST_ROOT_DIR}" ]; then
    +        TMP_WORK_DIR="/tmp/work"
    +
    +        buildspec_mount_args=
    +        if [ -z "${BUILDSPEC}" ]; then
    +            buildspec_mount_args="${CODEBUILD_LOCAL_SOURCE_DIRECTORY}:${TMP_WORK_DIR}"
    +        else
    +            buildspec_mount_args="${BUILDSPEC}:${TMP_WORK_DIR}/buildspec.yml"
    +        fi
    +
    +        BUILDSPEC_CONTENT="$(docker run --rm -it \
    +            --entrypoint= \
    +            -v "${buildspec_mount_args}:ro" \
    +            -w "${TMP_WORK_DIR}" "${LOCAL_AGENT_IMAGE}" cat buildspec.yml)"
    +
    +        CACHE_PATHS="$(echo "${BUILDSPEC_CONTENT}" |
    +            yq "(.cache.paths // [])[]")"
    +
    +        if [ -n "${CACHE_PATHS}" ]; then
    +            tmp_container_env_args=""
    +
    +            AGENT_ENVS="$(yq '.services.agent.environment[] | select(test("^CUSTOMER_") | not)' \
    +                customer-specific.yml)"
    +            for line in ${AGENT_ENVS}; do
    +                tmp_container_env_args="${tmp_container_env_args} -e ${line}"
    +            done
    +
    +            CUSTOMER_ENVS="$(yq '.services.agent.environment[] | select(test("^CUSTOMER_"))' \
    +                customer-specific.yml)"
    +            for line in ${CUSTOMER_ENVS}; do
    +                tmp_container_env_args="${tmp_container_env_args} -e ${line#CUSTOMER_}"
    +            done
    +
    +            BUILDSPEC_ENVS="$(echo "${BUILDSPEC_CONTENT}" |
    +                yq '.env.variables | to_entries | .[] | "\(.key)=\(.value)" | select(. != "=")')"
    +            for line in ${BUILDSPEC_ENVS}; do
    +                tmp_container_env_args="${tmp_container_env_args} -e ${line}"
    +            done
    +            PATH_SET="$(docker run --rm -it \
    +                            --entrypoint= \
    +                            ${tmp_container_env_args} \
    +                            "${IMAGE_FOR_CODEBUILD_LOCAL_BUILD}" \
    +                            sh -c "echo \"${CACHE_PATHS}\" | \
    +                            while read -r line; do \
    +                                RESTORE_TARGET_ROOT=\"\$(eval printf \"%s\" \"\${line%%\**}\" | sed \"s/\/*$//\")\"; \
    +                                FILE_PATTERN=\"\${line#\"\${RESTORE_TARGET_ROOT}\"}\"; \
    +                                printf \"%s\n\" \"\${RESTORE_TARGET_ROOT}\${FILE_PATTERN}>\"\$(
    +                                     printf \"%s\" \"\${RESTORE_TARGET_ROOT}\${FILE_PATTERN}\" | \
    +                                        sha1sum | \
    +                                        awk '{print \$1}' | \
    +                                        cut -c1-8
    +                                )\"\"; \
    +                            done")"
    +
    +            CB_TMP_ROOT_DIR="/tmp/codebuild"
    +            cache_dirname=
    +            CACHE_KEY_FORMULA="$(echo "${BUILDSPEC_CONTENT}" | yq '.cache.key // ""')"
    +            if [ -n "${CACHE_KEY_FORMULA}" ]; then
    +
    +                CB_TMP_SRC_DIR="${CB_TMP_ROOT_DIR}/src"
    +                CB_TMP_EVAL_DIR="${CB_TMP_ROOT_DIR}/eval"
    +                CACHE_KEY="$(docker run --rm -it \
    +                    --entrypoint= \
    +                    -v "${CODEBUILD_LOCAL_SOURCE_DIRECTORY}:${CB_TMP_SRC_DIR}:ro" \
    +                    ${tmp_container_env_args} \
    +                    "${IMAGE_FOR_CODEBUILD_LOCAL_BUILD}" \
    +                    sh -c "set -e && cp -a ${CB_TMP_SRC_DIR} ${CB_TMP_EVAL_DIR} && \
    +                cd ${CB_TMP_EVAL_DIR} || exit 1 && \
    +                CACHE_KEY=${CACHE_KEY_FORMULA} && \
    +                echo \"\${CACHE_KEY}\"")"
    +                EVAL_CACHE_KEY_RETURN_CODE=$?
    +                if [ ${EVAL_CACHE_KEY_RETURN_CODE} -eq 0 ]; then
    +                    echo "Cache key: ${CACHE_KEY}"
    +                    cache_dirname="$(echo "${CACHE_KEY}${PATH_SET}" |
    +                        sha1sum |
    +                        awk '{print $1}' |
    +                        cut -c1-8)"
    +                else
    +                    echo "Cache key cannot be evaluated: ${EVAL_CACHE_KEY_RETURN_CODE}. Consider using cache without a key."
    +                fi
    +            fi
    +            if [ -z "${cache_dirname}" ]; then
    +                cache_dirname="$(echo "${PATH_SET}" |
    +                    sha1sum |
    +                    awk '{print $1}' |
    +                    cut -c1-8)"
    +            fi
    +            CACHE_HOST_DIR="${CACHE_HOST_ROOT_DIR}/${cache_dirname}"
    +            echo "Cache host directory: ${CACHE_HOST_DIR}"
    +
    +            CACHE_PATHS_FILENAME="cache_paths.txt"
    +
    +            CACHE_PATHS_FILE_EXISTS="$(
    +                docker run --rm -it \
    +                    --entrypoint= \
    +                    -v "${CACHE_HOST_DIR}:${TMP_WORK_DIR}" \
    +                    -w ${TMP_WORK_DIR} "${LOCAL_AGENT_IMAGE}" \
    +                    sh -c "\
    +                if [ -f \"${CACHE_PATHS_FILENAME}\" ]; then \
    +                    printf 'true'; \
    +                else printf 'false'; \
    +                fi"
    +            )"
    +            if ! ${CACHE_PATHS_FILE_EXISTS}; then
    +                docker run --rm -it \
    +                    --entrypoint= \
    +                    -v "${CACHE_HOST_DIR}:${TMP_WORK_DIR}" \
    +                    -w "${TMP_WORK_DIR}" "${LOCAL_AGENT_IMAGE}" \
    +                    sh -c "echo \"${PATH_SET}\" >\"${CACHE_PATHS_FILENAME}\""
    +            fi
    +
    +            CB_CACHE_ROOT_DIR="${CB_TMP_ROOT_DIR}/cache"
    +            CB_CACHE_FILES_DIR="${CB_CACHE_ROOT_DIR}/files"
    +            CB_CACHE_PATHS_FILE="${CB_CACHE_ROOT_DIR}/${CACHE_PATHS_FILENAME}"
    +            CB_RESTORE_CACHE_PATH="${CB_CACHE_ROOT_DIR}/$(basename "${RESTORE_CACHE_HOST_PATH}")"
    +            CB_STORE_CACHE_PATH="${CB_CACHE_ROOT_DIR}/$(basename "${STORE_CACHE_HOST_PATH}")"
    +            yq '.services.agent.environment += ["CODEBUILD_ENABLE_CUSTOM_CACHE=true"]' \
    +                -i customer-specific.yml
    +            yq '.services.build.volumes += ["'"${CACHE_HOST_DIR}/files"':'"${CB_CACHE_FILES_DIR}"'"]' \
    +                -i customer-specific.yml
    +            yq '.services.build.volumes += ["'"${CACHE_HOST_DIR}/${CACHE_PATHS_FILENAME}"':'"${CB_CACHE_PATHS_FILE}:ro"'"]' \
    +                -i customer-specific.yml
    +            yq '.services.build.volumes += ["'"${RESTORE_CACHE_HOST_PATH}"':'"${CB_RESTORE_CACHE_PATH}:ro"'"]' \
    +                -i customer-specific.yml
    +            yq '.services.build.volumes += ["'"${STORE_CACHE_HOST_PATH}"':'"${CB_STORE_CACHE_PATH}:ro"'"]' \
    +                -i customer-specific.yml
    +            yq '.services.build.environment += ["CODEBUILD_CACHE_FILES_ROOT_DIR='"${CB_CACHE_FILES_DIR}"'"]' \
    +                -i customer-specific.yml
    +            yq '.services.build.environment += ["CODEBUILD_CACHE_PATHS_FILE='"${CB_CACHE_PATHS_FILE}"'"]' \
    +                -i customer-specific.yml
    +
    +            export ENTRYPOINT_ARGS="sh -c \"${CB_RESTORE_CACHE_PATH} && \
    +                while [ ! -f /codebuild/readonly/bin/executor.done ]; do \
    +                sleep 1; \
    +                done && \
    +                /codebuild/readonly/bin/executor > /dev/null && \
    +                ${CB_STORE_CACHE_PATH}\""
    +            yq '.services.build.entrypoint = env(ENTRYPOINT_ARGS)' \
    +                -i customer-specific.yml
    +        fi
    +    fi
    +
         # Validate docker-compose config
         docker-compose -f customer-specific.yml config --quiet || exit 1
     
    
  • Patch file for codebuild_build.sh
    codebuild_build.patch
    @@ -52,6 +52,7 @@ function usage {
         echo "               * Blank lines are ignored"
         echo "               * File can be of type .env or .txt"
         echo "               * There is no special handling of quotation marks, meaning they will be part of the VAL"
    +    echo "  -u        Used to specify the build cache."
         exit 1
     }
     
    @@ -60,8 +61,9 @@ artifact_flag=false
     awsconfig_flag=false
     mount_src_dir_flag=false
     docker_privileged_mode_flag=false
    +build_cache=false
     
    -while getopts "cmdi:a:r:s:b:e:l:p:h" opt; do
    +while getopts "cmdi:a:r:s:b:e:l:p:uh" opt; do
         case $opt in
             i  ) image_flag=true; image_name=$OPTARG;;
             a  ) artifact_flag=true; artifact_dir=$OPTARG;;
    @@ -74,6 +76,7 @@ while getopts "cmdi:a:r:s:b:e:l:p:h" opt; do
             e  ) environment_variable_file=$OPTARG;;
             l  ) local_agent_image=$OPTARG;;
             p  ) aws_profile=$OPTARG;;
    +        u  ) build_cache=true;;
             h  ) usage; exit;;
             \? ) echo "Unknown option: -$OPTARG" >&2; exit 1;;
             :  ) echo "Missing option argument for -$OPTARG" >&2; exit 1;;
    @@ -182,11 +185,19 @@ else
         docker_command+=" -e \"INITIATOR=$USER\""
     fi
     
    +CURRENT_DIR="$(allOSRealPath "$(dirname "${BASH_SOURCE[0]}")")"
    +if $build_cache
    +then
    +    docker_command+=" -e \"CACHE_HOST_ROOT_DIR=${CURRENT_DIR}/cached\""
    +    docker_command+=" -e \"RESTORE_CACHE_HOST_PATH=${CURRENT_DIR}/restore_caches.sh\""
    +    docker_command+=" -e \"STORE_CACHE_HOST_PATH=${CURRENT_DIR}/store_caches.sh\""
    +fi
    +
     if [ -n "$local_agent_image" ]
     then
         docker_command+=" $local_agent_image"
     else
    -    docker_command+=" public.ecr.aws/codebuild/local-builds:latest"
    +    docker_command+=" my-local-builds:latest"
     fi
     
     # Note we do not expose the AWS_SECRET_ACCESS_KEY or the AWS_SESSION_TOKEN
    
    
  • Run the following commands
    command
    export LOCALBUILD_CUSTOM_IMAGE_TAG="my-local-builds:latest" && \
        export LOCALBUILD_WORK_DIR="/workdir" && \
        docker run \
            -it \
            -v ./:"${LOCALBUILD_WORK_DIR}" \
            --rm \
            --entrypoint=/bin/bash \
            public.ecr.aws/codebuild/local-builds:latest \
                -c "curl \
                    -L \
                    -o /usr/bin/yq \
                    https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 2>/dev/null&& \
                    chmod +x /usr/bin/yq && \
                    cp /LocalBuild/agent-resources/{docker-compose.yml,docker-compose-mount-src-dir.yml} ${LOCALBUILD_WORK_DIR} && \
                    cp /usr/local/bin/local_build.sh ${LOCALBUILD_WORK_DIR} && \
                    for file in \"docker-compose.yml\" \"docker-compose-mount-src-dir.yml\"; do \
                        yq -i \
                            '.version=\"3\" | \
                            .services.build.environment[0] = \"NO_PROXY=agent:3000\" |
                            .services.build.environment[2] = \"CODEBUILD_AGENT_PORT=http://agent:3000\" |
                            del(.services.build.links)' \
                            ${LOCALBUILD_WORK_DIR}/\${file}; \
                    done" && \
        patch -p1 < local_build.patch && \
        docker build -t "${LOCALBUILD_CUSTOM_IMAGE_TAG}" . && \
        wget -O codebuild_build.sh \
        https://raw.githubusercontent.com/aws/aws-codebuild-docker-images/refs/heads/master/local_builds/codebuild_build.sh &&
        patch -p1 < codebuild_build.patch
    
    This creates a custom image named my-local-builds:latest.

Creating the cache store/restore scripts

Place the following scripts in the execution directory and make them executable.
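For example (assuming the two scripts shown below have already been saved in the current directory):

```shell
#!/bin/sh
# Make the cache scripts executable; the filenames are the ones used throughout
# this article. Run from the directory where codebuild_build.sh is executed.
chmod +x ./store_caches.sh ./restore_caches.sh
```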

store_caches.sh
#!/bin/sh
if [ ! -f "${CODEBUILD_CACHE_PATHS_FILE}" ]; then
    echo "Cache paths file does not exist: ${CODEBUILD_CACHE_PATHS_FILE}"
    return
fi

if [ ! -d "${CODEBUILD_CACHE_FILES_ROOT_DIR}" ]; then
    mkdir -p "${CODEBUILD_CACHE_FILES_ROOT_DIR}"
fi

if [ -n "$(find "${CODEBUILD_CACHE_FILES_ROOT_DIR}" -type f)" ]; then
    echo "Cached files exist. Skip store cache: ${CODEBUILD_CACHE_FILES_ROOT_DIR}"
    return
fi

while read -r path_set; do
    NORMALIZED_PATH="$(echo "${path_set%%>*}" | sed "s/\/*$//")"
    PATH_HASH="${path_set#*>}"

    STORE_TARGET_ROOT="$(echo "${NORMALIZED_PATH%%\**}" | sed "s/\/*$//")"
    if [ ! -d "${STORE_TARGET_ROOT}" ]; then
        echo "Store target root does not exist: ${STORE_TARGET_ROOT}"
        continue
    fi

    store_targets=
    FILE_PATTERN="${NORMALIZED_PATH#"${STORE_TARGET_ROOT}"}"
    if [ -n "${FILE_PATTERN}" ]; then
        store_targets="$(find "${STORE_TARGET_ROOT}" -type f -path "${FILE_PATTERN}" -printf "%P\n")"
    else
        store_targets="$(find "${STORE_TARGET_ROOT}" -type f -printf "%P\n")"
    fi
    if [ -z "${store_targets}" ]; then
        echo "No store target: ${STORE_TARGET_ROOT}"
        continue
    fi

    CACHE_DIR="${CODEBUILD_CACHE_FILES_ROOT_DIR}/${PATH_HASH}"

    echo "Store cache start: ${NORMALIZED_PATH} -> ${CACHE_DIR}"

    echo "${store_targets}" |
        while read -r file; do
            TARGET_PATH="${STORE_TARGET_ROOT}/${file}"
            STORE_TO="${CACHE_DIR}/${file}"
            STORE_TO_DIR="$(dirname "${STORE_TO}")"
            if [ ! -d "${STORE_TO_DIR}" ]; then
                mkdir -p "${STORE_TO_DIR}"
            fi
            cp -a "${TARGET_PATH}" "${STORE_TO}"
        done

    echo "Store cache end: ${NORMALIZED_PATH} -> ${CACHE_DIR}"

done <"${CODEBUILD_CACHE_PATHS_FILE}"
restore_caches.sh
#!/bin/sh
if [ ! -f "${CODEBUILD_CACHE_PATHS_FILE}" ]; then
    echo "Cache paths file does not exist: ${CODEBUILD_CACHE_PATHS_FILE}"
    return
fi

if [ ! -d "${CODEBUILD_CACHE_FILES_ROOT_DIR}" ]; then
    echo "Cache files root directory does not exist: ${CODEBUILD_CACHE_FILES_ROOT_DIR}"
    return
fi

if [ -z "$(find "${CODEBUILD_CACHE_FILES_ROOT_DIR}" -type f)" ]; then
    echo "No cached file: ${CODEBUILD_CACHE_FILES_ROOT_DIR}"
    return
fi

while read -r path_set; do
    NORMALIZED_PATH="$(echo "${path_set%%>*}" | sed "s/\/*$//")"
    PATH_HASH="${path_set#*>}"
    CACHE_DIR="${CODEBUILD_CACHE_FILES_ROOT_DIR}/${PATH_HASH}"
    if [ ! -d "${CACHE_DIR}" ]; then
        echo "Cached directory does not exist: ${NORMALIZED_PATH} -> ${CACHE_DIR}"
        continue
    fi
    RESTORE_TARGET_ROOT="$(echo "${NORMALIZED_PATH%%\**}" | sed "s/\/*$//")"

    restore_targets=
    FILE_PATTERN="${NORMALIZED_PATH#"${RESTORE_TARGET_ROOT}"}"
    if [ -n "${FILE_PATTERN}" ]; then
        restore_targets="$(find "${CACHE_DIR}" -type f -path "${FILE_PATTERN}" -printf "%P\n")"
    else
        restore_targets="$(find "${CACHE_DIR}" -type f -printf "%P\n")"
    fi

    if [ -z "${restore_targets}" ]; then
        echo "No restore cache target: ${NORMALIZED_PATH} -> ${CACHE_DIR}"
        continue
    fi

    echo "Restore cache start: ${CACHE_DIR} -> ${NORMALIZED_PATH}"

    echo "${restore_targets}" |
        while read -r file; do
            CACHE_PATH="${CACHE_DIR}/${file}"
            RESTORE_TO="${RESTORE_TARGET_ROOT}/${file}"
            RESTORE_TO_DIR="$(dirname "${RESTORE_TO}")"
            if [ ! -d "${RESTORE_TO_DIR}" ]; then
                mkdir -p "${RESTORE_TO_DIR}"
            fi
            cp -a "${CACHE_PATH}" "${RESTORE_TO}"
        done

    echo "Restore cache end: ${CACHE_DIR} -> ${NORMALIZED_PATH}"
done <"${CODEBUILD_CACHE_PATHS_FILE}"

Creating a sample build project

For verification. Create a src directory in the execution directory and place the following files directly under it.

src/buildspec.yml
version: 0.2
env:
  variables:
    FOR_CACHE_KEY: "./for_cache_key.txt"
phases:
  build:
    commands:
      - start_time="$(date +%s)"
      - "cd gradle-simple && gradle build --info"
      - echo "Elapsed time $(($(date +%s) - start_time))s"
artifacts:
  files:
    - "**/*"
  base-directory: "./gradle-simple/build/libs/"
cache:
  key: "cache-key-$(cat ${FOR_CACHE_KEY})"
  paths:
    - "/home/gradle/.gradle/**/*"

Regarding paths: variable expansion (such as ${VAR}) is handled by the pre-evaluation, but the only wildcard pattern that has been verified is the .../**/* form.
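As a hypothetical illustration of that pre-evaluation (mirroring the RESTORE_TARGET_ROOT logic in the patched local_build.sh), a path entry containing a variable is cut at the first wildcard and expanded with eval:

```shell
#!/bin/sh
# Hypothetical example of the pre-evaluation: the path is cut at the first
# wildcard, the variable is expanded with eval, and trailing slashes are removed.
GRADLE_USER_HOME="/home/gradle/.gradle"   # assumed to be set in the build image
line='${GRADLE_USER_HOME}/**/*'

ROOT="$(eval printf "%s" "${line%%\**}" | sed "s/\/*$//")"
echo "${ROOT}"   # -> /home/gradle/.gradle
```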

src/for_cache_key.txt
test

Clone the repository below under src.

To make the speed difference visible, add some reasonably large dependencies (not actually used by the build).

src/gradle-simple/build.gradle
・・・
dependencies {
    implementation 'com.google.guava:guava:29.0-jre'
    implementation 'org.tensorflow:tensorflow:1.15.0'
    implementation 'ai.djl:api:0.23.0'
    implementation 'org.deeplearning4j:deeplearning4j-core:1.0.0-beta7'
    implementation 'org.apache.commons:commons-math3:3.6.1'
}
・・・

Verification

Command to run

command
bash codebuild_build.sh \
  -i gradle:8.13.0-jdk21-noble \
  -s ./src \
  -a ./ \
  -u

Adding the -u option enables the build cache implemented in this article.

First run

Log
Cache key: cache-key-test
Cache host directory: /***/./cached/b81c79e6
・・・
Attaching to agent-1, build-1
build-1  | No cached file: /tmp/codebuild/cache/files
agent-1  | [Container] ****/**/** **:**:**.****** Running on CodeBuild On-demand
agent-1  | [Container] ****/**/** **:**:**.****** Waiting for agent ping
agent-1  | [Container] ****/**/** **:**:**.****** Waiting for DOWNLOAD_SOURCE
agent-1  | [Container] ****/**/** **:**:**.****** Phase is DOWNLOAD_SOURCE
agent-1  | [Container] ****/**/** **:**:**.****** CODEBUILD_SRC_DIR=/codebuild/output/src3349486736/src
agent-1  | [Container] ****/**/** **:**:**.****** YAML location is /codebuild/output/srcDownload/src/buildspec.yml
agent-1  | [Container] ****/**/** **:**:**.****** Found possible syntax errors in buildspec: 
agent-1  | In the section cache
agent-1  |      The following keys cannot be identified:
agent-1  |              key
agent-1  | [Container] ****/**/** **:**:**.****** Processing environment variables
・・・
agent-1  | [Container] ****/**/** **:**:**.****** Running command cd gradle-simple && gradle build --info
agent-1  | Initialized native services in: /home/gradle/.gradle/native
agent-1  | Initialized jansi services in: /home/gradle/.gradle/native
agent-1  | Removing 0 daemon stop events from registry
agent-1  | Starting a Gradle Daemon (subsequent builds will be faster)
・・・
agent-1  | > Task :compileJava
agent-1  | Downloading https://repo.maven.apache.org/maven2/com/google/guava/guava/29.0-jre/guava-29.0-jre.pom to /home/gradle/.gradle/.tmp/gradle_download2186584014182616499bin
agent-1  | Downloading https://repo.maven.apache.org/maven2/org/deeplearning4j/deeplearning4j-core/1.0.0-beta7/deeplearning4j-core-1.0.0-beta7.pom to /home/gradle/.gradle/.tmp/gradle_download11010501578351348172bin
agent-1  | Downloading https://repo.maven.apache.org/maven2/org/apache/commons/commons-math3/3.6.1/commons-math3-3.6.1.pom to /home/gradle/.gradle/.tmp/gradle_download9787345126855781459bin
agent-1  | Downloading https://repo.maven.apache.org/maven2/org/tensorflow/tensorflow/1.15.0/tensorflow-1.15.0.pom to /home/gradle/.gradle/.tmp/gradle_download14123938126877793223bin
agent-1  | Downloading https://repo.maven.apache.org/maven2/ai/djl/api/0.23.0/api-0.23.0.pom to /home/gradle/.gradle/.tmp/gradle_download10191227888946264688bin
・・・
agent-1  | Caching disabled for task ':compileJava' because:
agent-1  |   Build cache is disabled
agent-1  | Task ':compileJava' is not up-to-date because:
agent-1  |   No history is available.
・・・
agent-1  | BUILD SUCCESSFUL in 1m 35s
agent-1  | 5 actionable tasks: 5 executed
agent-1  | Some of the file system contents retained in the virtual file system are on file systems that Gradle doesn't support watching. The relevant state was discarded to ensure changes to these locations are properly detected. You can override this by explicitly enabling file system watching.
agent-1  | 
agent-1  | [Container] ****/**/** **:**:**.****** Running command echo "Elapsed time $(($(date +%s) - start_time))s"
agent-1  | Elapsed time 96s
agent-1  | 
agent-1  | [Container] ****/**/** **:**:**.****** Phase complete: BUILD State: SUCCEEDED
agent-1  | [Container] ****/**/** **:**:**.****** Phase context status code:  Message: 
agent-1  | [Container] ****/**/** **:**:**.****** Entering phase POST_BUILD
agent-1  | [Container] ****/**/** **:**:**.****** Phase complete: POST_BUILD State: SUCCEEDED
agent-1  | [Container] ****/**/** **:**:**.****** Phase context status code:  Message: 
agent-1  | [Container] ****/**/** **:**:**.****** Expanding base directory path: ./gradle-simple/build/libs/
agent-1  | [Container] ****/**/** **:**:**.****** Assembling file list
agent-1  | [Container] ****/**/** **:**:**.****** Expanding ./gradle-simple/build/libs/
agent-1  | [Container] ****/**/** **:**:**.****** Expanding file paths for base directory ./gradle-simple/build/libs/
agent-1  | [Container] ****/**/** **:**:**.****** Assembling file list
agent-1  | [Container] ****/**/** **:**:**.****** Expanding **/*
agent-1  | [Container] ****/**/** **:**:**.****** Found 3 file(s)
agent-1  | [Container] ****/**/** **:**:**.****** Preparing to copy secondary artifacts
agent-1  | [Container] ****/**/** **:**:**.****** No secondary artifacts defined in buildspec
agent-1  | [Container] ****/**/** **:**:**.****** Phase complete: UPLOAD_ARTIFACTS State: SUCCEEDED
agent-1  | [Container] ****/**/** **:**:**.****** Phase context status code:  Message: 
agent-1  | Wait infinity for the custom cache process in the build container
build-1  | Store cache start: /home/gradle/.gradle/**/* -> /tmp/codebuild/cache/files/2ffe2421
build-1  | Store cache end: /home/gradle/.gradle/**/* -> /tmp/codebuild/cache/files/2ffe2421
Aborting on container exit...
build-1 exited with code 0

Elapsed time: 1 minute 36 seconds.
On the first run no cache exists yet, so the libraries are downloaded and that download time is added to the overall build time.
The "The following keys cannot be identified" error still appears (it is unrelated to this change), but after the build a directory such as cached/b81c79e6/files/2ffe2421 is created on the host and the cache is stored.

Second run

Log
Cache key: cache-key-test
Cache host directory: /***/./cached/b81c79e6
・・・
Attaching to agent-1, build-1
build-1  | Restore cache start: /tmp/codebuild/cache/files/2ffe2421 -> /home/gradle/.gradle/**/*
build-1  | Restore cache end: /tmp/codebuild/cache/files/2ffe2421 -> /home/gradle/.gradle/**/*
agent-1  | [Container] ****/**/** **:**:**.****** Running on CodeBuild On-demand
agent-1  | [Container] ****/**/** **:**:**.****** Waiting for agent ping
agent-1  | [Container] ****/**/** **:**:**.****** Waiting for DOWNLOAD_SOURCE
agent-1  | [Container] ****/**/** **:**:**.****** Phase is DOWNLOAD_SOURCE
agent-1  | [Container] ****/**/** **:**:**.****** CODEBUILD_SRC_DIR=/codebuild/output/src4198552667/src
agent-1  | [Container] ****/**/** **:**:**.****** YAML location is /codebuild/output/srcDownload/src/buildspec.yml
agent-1  | [Container] ****/**/** **:**:**.****** Found possible syntax errors in buildspec: 
agent-1  | In the section cache
agent-1  |      The following keys cannot be identified:
agent-1  |              key
agent-1  | [Container] ****/**/** **:**:**.****** Processing environment variables
・・・
agent-1  | [Container] ****/**/** **:**:**.****** Running command cd gradle-simple && gradle build --info
agent-1  | Initialized native services in: /home/gradle/.gradle/native
agent-1  | Initialized jansi services in: /home/gradle/.gradle/native
agent-1  | Removing daemon from the registry due to communication failure. Daemon information: DaemonInfo{pid=74, address=[7e6a689e-6a5c-4690-a983-e77a441db7ab port:36729, addresses:[/127.0.0.1]], state=Idle, lastBusy=*************, context=DefaultDaemonContext[uid=9dae61b3-9bc6-4da0-84de-e9cff9616194,javaHome=/opt/java/openjdk,javaVersion=21,javaVendor=Eclipse Adoptium,daemonRegistryDir=/home/gradle/.gradle/daemon,pid=74,idleTimeout=10800000,priority=NORMAL,applyInstrumentationAgent=true,nativeServicesMode=ENABLED,daemonOpts=-XX:MaxMetaspaceSize=384m,-XX:+HeapDumpOnOutOfMemoryError,-Xms256m,-Xmx512m,-Dfile.encoding=UTF-8,-Duser.country=US,-Duser.language=en,-Duser.variant]}
agent-1  | Removing 0 daemon stop events from registry
agent-1  | Previous Daemon (74) stopped at *** *** ** **:**:** *** **** by user or operating system
agent-1  | Starting a Gradle Daemon, 1 incompatible and 1 stopped Daemons could not be reused, use --status for details
・・・
agent-1  | > Task :compileJava
agent-1  | Caching disabled for task ':compileJava' because:
agent-1  |   Build cache is disabled
agent-1  | Task ':compileJava' is not up-to-date because:
agent-1  |   No history is available.
・・・
agent-1  | BUILD SUCCESSFUL in 8s
agent-1  | 5 actionable tasks: 5 executed
agent-1  | Some of the file system contents retained in the virtual file system are on file systems that Gradle doesn't support watching. The relevant state was discarded to ensure changes to these locations are properly detected. You can override this by explicitly enabling file system watching.
agent-1  | 
agent-1  | [Container] ****/**/** **:**:**.****** Running command echo "Elapsed time $(($(date +%s) - start_time))s"
agent-1  | Elapsed time 9s
agent-1  | 
agent-1  | [Container] ****/**/** **:**:**.****** Phase complete: BUILD State: SUCCEEDED
agent-1  | [Container] ****/**/** **:**:**.****** Phase context status code:  Message: 
agent-1  | [Container] ****/**/** **:**:**.****** Entering phase POST_BUILD
agent-1  | [Container] ****/**/** **:**:**.****** Phase complete: POST_BUILD State: SUCCEEDED
agent-1  | [Container] ****/**/** **:**:**.****** Phase context status code:  Message: 
agent-1  | [Container] ****/**/** **:**:**.****** Expanding base directory path: ./gradle-simple/build/libs/
agent-1  | [Container] ****/**/** **:**:**.****** Assembling file list
agent-1  | [Container] ****/**/** **:**:**.****** Expanding ./gradle-simple/build/libs/
agent-1  | [Container] ****/**/** **:**:**.****** Expanding file paths for base directory ./gradle-simple/build/libs/
agent-1  | [Container] ****/**/** **:**:**.****** Assembling file list
agent-1  | [Container] ****/**/** **:**:**.****** Expanding **/*
agent-1  | [Container] ****/**/** **:**:**.****** Found 3 file(s)
agent-1  | [Container] ****/**/** **:**:**.****** Preparing to copy secondary artifacts
agent-1  | [Container] ****/**/** **:**:**.****** No secondary artifacts defined in buildspec
agent-1  | [Container] ****/**/** **:**:**.****** Phase complete: UPLOAD_ARTIFACTS State: SUCCEEDED
agent-1  | [Container] ****/**/** **:**:**.****** Phase context status code:  Message: 
agent-1  | Wait infinity for the custom cache process in the build container
build-1  | Cached files exist. Skip store cache: /tmp/codebuild/cache/files
Aborting on container exit...
build-1 exited with code 0

Elapsed time: 9 seconds.
The library downloads seen on the first run are skipped: /home/gradle/.gradle/**/* is restored from the cache, just as with the real build cache feature, so the build completes in a few seconds.
Also, the cache files are not updated after the build; this deliberately matches the behavior of the real build cache feature.
If you want to update the cache, delete all the cached files first (in the example above, everything under cached/b81c79e6/).
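A sketch of that refresh step on the host (the hash directory name b81c79e6 is the one from this example and will differ per project):

```shell
#!/bin/sh
# Remove the stored cache for one key; the next build with -u will then
# re-download dependencies and store a fresh cache.
CACHE_KEY_DIR="./cached/b81c79e6"   # example-specific hash directory
rm -rf "${CACHE_KEY_DIR}"
```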

In the sample project the contents of for_cache_key.txt form the cache key, so changing those contents, or changing the path strings in buildspec.yml, stores a similar cache under a different path.
This mimics the specification below.

Note, however, that variable expansion in path strings is actually performed here via expression evaluation, so if the real environment cannot handle expressions, this is not an exact reproduction (unverified).

Explanation

Details

In local_build.sh, buildspec.yml is parsed and the build environment's environment variables are collected (to support the processing needed when a cache key is specified).
Because the agent itself runs in a container, temporary containers are spun up several times to read files on the host side.

local_build.sh
        BUILDSPEC_CONTENT="$(docker run --rm -it \
            --entrypoint= \
            -v "${buildspec_mount_args}:ro" \
            -w "${TMP_WORK_DIR}" "${LOCAL_AGENT_IMAGE}" cat buildspec.yml)"

The cache key expression is also evaluated inside a temporary container.
Since the host-side source is mounted for this, any file access during expression evaluation could affect the build environment.
To minimize that impact, the source is copied to a temporary directory before the expression is evaluated.

                CACHE_KEY="$(docker run --rm -it \
                    --entrypoint= \
                    -v "${CODEBUILD_LOCAL_SOURCE_DIRECTORY}:${CB_TMP_SRC_DIR}:ro" \
                    ${tmp_container_env_args} \
                    "${IMAGE_FOR_CODEBUILD_LOCAL_BUILD}" \
                    sh -c "set -e && cp -a ${CB_TMP_SRC_DIR} ${CB_TMP_EVAL_DIR} && \
                cd ${CB_TMP_EVAL_DIR} || exit 1 && \
                CACHE_KEY=${CACHE_KEY_FORMULA} && \
                echo \"\${CACHE_KEY}\"")"

Each cache target directory is stored under a directory whose name is a hash of the string specified in paths. That mapping is saved on the host side in a file named cache_paths.txt.

            PATH_SET="$(docker run --rm -it \
                            --entrypoint= \
                            ${tmp_container_env_args} \
                            "${IMAGE_FOR_CODEBUILD_LOCAL_BUILD}" \
                            sh -c "echo \"${CACHE_PATHS}\" | \
                            while read -r line; do \
                                RESTORE_TARGET_ROOT=\"\$(eval printf \"%s\" \"\${line%%\**}\" | sed \"s/\/*$//\")\"; \
                                FILE_PATTERN=\"\${line#\"\${RESTORE_TARGET_ROOT}\"}\"; \
                                printf \"%s\" \"\${RESTORE_TARGET_ROOT}\${FILE_PATTERN}>\"\$(
                                     printf \"%s\" \"\${RESTORE_TARGET_ROOT}\${FILE_PATTERN}\" | \
                                        sha1sum | \
                                        awk '{print \$1}' | \
                                        cut -c1-8
                                )\"\"; \
                            done")"
                docker run --rm -it \
                    --entrypoint= \
                    -v "${CACHE_HOST_DIR}:${TMP_WORK_DIR}" \
                    -w "${TMP_WORK_DIR}" "${LOCAL_AGENT_IMAGE}" \
                    sh -c "echo \"${PATH_SET}\" >\"${CACHE_PATHS_FILENAME}\""
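The 8-character directory names seen in the logs (e.g. 2ffe2421) come from this hashing. The derivation can be sketched on any host with sha1sum (the real script runs it inside a temporary container):

```shell
#!/bin/sh
# Sketch: derive the per-path cache directory name the same way PATH_SET does --
# the first 8 hex characters of the sha1 of the normalized path string.
path="/home/gradle/.gradle/**/*"
printf "%s" "${path}" | sha1sum | awk '{print $1}' | cut -c1-8
```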

Once the build container starts, it first attempts to restore the cache, and after the build it attempts to store the cache.

            export ENTRYPOINT_ARGS="sh -c \"${CB_RESTORE_CACHE_PATH} && \
                while [ ! -f /codebuild/readonly/bin/executor.done ]; do \
                sleep 1; \
                done && \
                /codebuild/readonly/bin/executor > /dev/null && \
                ${CB_STORE_CACHE_PATH}\""

Note that the cache restore is performed inside the build container.
In the real environment, the agent and the build container presumably transfer cache files through some message-passing mechanism, but reproducing that behavior was judged too difficult, so the quicker approach of mounting and copying within a temporary directory of the same container was adopted.

store_caches.sh
    echo "${store_targets}" |
        while read -r file; do
            TARGET_PATH="${STORE_TARGET_ROOT}/${file}"
            STORE_TO="${CACHE_DIR}/${file}"
            STORE_TO_DIR="$(dirname "${STORE_TO}")"
            if [ ! -d "${STORE_TO_DIR}" ]; then
                mkdir -p "${STORE_TO_DIR}"
            fi
            cp -a "${TARGET_PATH}" "${STORE_TO}"
        done

This means part of the build environment now contains an area dedicated to this mechanism, so it differs from the real environment and could affect build behavior.
Since it is only a temporary directory, the design keeps that impact as small as possible.

Also, with this design, if storing the cache takes a while, the agent side would finish first, Compose would shut down, and the container would be terminated mid-store; hence an infinite sleep is inserted on the agent side.

local_build.sh
    sh ./start > /dev/null
    if ${CODEBUILD_ENABLE_CUSTOM_CACHE:-false};then
        echo "Wait infinity for the custom cache process in the build container"
        sleep infinity
    fi
