Running Tests Only for Changed Projects in a Monorepo with GitHub Actions


Important Addendum

The main point of this article was to use sparse checkout and detect changes with git diff, but when I wrote it I was not aware of on.push.paths. With that filter the workflow only runs when files under the given path change, so a plain sparse checkout as shown below is sufficient.

# .github/workflows/foo-test.yaml
name: foo-test
on:
  push:
    paths: systems/foo/**
env:
  SPARSE_CHECKOUT_DIR: systems/foo
jobs:
  test:
    runs-on: Ubuntu-20.04
    steps:
      - name: sparse checkout
        run: |
          git clone --filter=blob:none --no-checkout --depth 1 --sparse https://${GITHUB_ACTOR}:${{secrets.GITHUB_TOKEN}}@github.com/${GITHUB_REPOSITORY}.git .
          git sparse-checkout init --cone
          git sparse-checkout add ${SPARSE_CHECKOUT_DIR}
          git checkout ${GITHUB_SHA}
      - run: echo write your tests
        working-directory: ./systems/foo

As a result, most of this article is no longer necessary, but I'm leaving it up because I put a lot of work into writing it.


Suppose you have the following directory layout in a monorepo.

systems/foo
systems/bar

Assume that foo and bar belong to one large project but are known to have no dependencies on each other.

In this case, running the tests for both on every push is wasteful. A typical example is managing Android and Web apps in the same repository.

Also, in my case, repeatedly running actions/checkout with default settings (even with a depth of 1) pushed the data transfer beyond the GitHub Actions free tier. I will address that at the same time.

git sparse-checkout

Since Git 2.25, there is a git sparse-checkout command that lets you check out only part of a repository. It can be cumbersome for everyday use, but it is a good fit for CI in this situation.

Git - git-sparse-checkout Documentation

However, at the time of writing, actions/checkout did not have a sparse-checkout feature.

Options like sparse mode · Issue #172 · actions/checkout

Therefore, following a comment on that issue, I clone only the specified directories.

REPO="https://${GITHUB_ACTOR}:${{ secrets.GITHUB_TOKEN }}@github.com/${GITHUB_REPOSITORY}.git"
git clone --filter=blob:none --no-checkout --depth 1 --sparse "$REPO" .
git sparse-checkout init --cone
git sparse-checkout add "folder1" "folder2/folder3"
git checkout ${GITHUB_SHA}

GITHUB_SHA is a GitHub Actions environment variable representing the current commit hash.

git diff --exit-code origin/main...${GITHUB_SHA} --relative ...

The following command diffs the current commit against origin/main (or master) and exits with status 1 if there are differences. Passing --relative=<dir> restricts the diff to that directory.

git diff --exit-code origin/main...${GITHUB_SHA} --relative=systems/foo

This detects differences from main, so you can use the result to decide whether to run the tests.
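
To see how the exit code behaves without touching a real project, here is a minimal local sketch. It assumes git is installed; the temporary repository, the feature branch, the app.txt files, and the has_changes helper are all made up for illustration and are not part of the workflow in this article.

```shell
#!/bin/sh
set -eu

# Build a throwaway repository with the layout used in this article.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main" on any Git version
git config user.name "example"
git config user.email "example@example.invalid"

mkdir -p systems/foo systems/bar
echo "foo v1" > systems/foo/app.txt
echo "bar v1" > systems/bar/app.txt
git add .
git commit -q -m "initial"

# A feature branch that touches only systems/foo.
git checkout -q -b feature
echo "foo v2" > systems/foo/app.txt
git commit -q -am "change foo"

# has_changes DIR (hypothetical helper): print 1 if DIR differs
# between main and HEAD, otherwise print 0.
has_changes() {
  if git diff --quiet --relative="$1" main...HEAD; then
    echo 0
  else
    echo 1
  fi
}

foo_changed=$(has_changes systems/foo)
bar_changed=$(has_changes systems/bar)
echo "foo=$foo_changed bar=$bar_changed"   # prints "foo=1 bar=0"
```

Here --quiet is used in place of redirecting the output; it implies --exit-code, so the exit status still reports whether anything changed.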

Job Dependencies and Return Values

GitHub Actions lets a step publish an output value with ::set-output name=[key]::[value].

    steps:
      - id: foo
        run: echo "::set-output name=x::1"
    # This can be referenced as ${{ steps.foo.outputs.x }}
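
Note that GitHub has since deprecated the ::set-output command in favor of appending key=value lines to the file named by the GITHUB_OUTPUT environment variable. The following is a minimal sketch of that file-based form; the set_output helper is a hypothetical name, and the temporary-file fallback exists only so the snippet also runs outside a runner.

```shell
#!/bin/sh
set -eu

# On a real runner GITHUB_OUTPUT is already set; fall back to a
# temporary file so this sketch can be tried locally.
: "${GITHUB_OUTPUT:=$(mktemp)}"

# set_output KEY VALUE (hypothetical helper): append one key=value line.
set_output() {
  printf '%s=%s\n' "$1" "$2" >> "$GITHUB_OUTPUT"
}

set_output x 1
cat "$GITHUB_OUTPUT"
```

Inside a workflow step, the same thing is usually written directly as echo "x=1" >> "$GITHUB_OUTPUT", and the value is still read as ${{ steps.foo.outputs.x }}.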

Furthermore, GitHub Actions allows you to declare dependencies between jobs and reference the outputs of the dependency from subsequent jobs.

Referencing Output Values of Jobs in Subsequent Jobs in GitHub Actions - notebook

By combining this with the Git commands mentioned earlier, you can write a job that passes whether there are changes in a specified directory to subsequent jobs.

env:
  SPARSE_CHECKOUT_DIR: systems/foo
jobs:
  check:
    runs-on: Ubuntu-20.04
    outputs:
      changed: ${{ steps.checkout.outputs.changed }}
    steps:
      - id: checkout
        name: sparse checkout
        run: |
          git clone --filter=blob:none --no-checkout --depth 1 --sparse https://${GITHUB_ACTOR}:${{secrets.GITHUB_TOKEN}}@github.com/${GITHUB_REPOSITORY}.git .
          git sparse-checkout init --cone
          git sparse-checkout add ${SPARSE_CHECKOUT_DIR}
          git checkout ${GITHUB_SHA}
          echo "::set-output name=changed::$(git diff --exit-code origin/main --relative=${SPARSE_CHECKOUT_DIR} > /dev/null || echo $?)"
  test:
    runs-on: Ubuntu-20.04
    needs: [check]
    if: needs.check.outputs.changed == '1'
    steps:
      # your tests

The benefit is that with plain step references you would have to write if: steps.checkout.outputs.changed == '1' on every single step, whereas with this approach you only evaluate if: needs.check.outputs.changed == '1' once, on the test job.
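
The value of changed in the check job comes from the shell idiom cmd > /dev/null || echo $?. The sketch below isolates that idiom so its three possible outcomes are visible. The changed_flag helper is a hypothetical name, and sh -c 'exit N' stands in for git diff; exit status 128 is an assumption about how git typically reports a fatal error such as an unknown revision.

```shell
#!/bin/sh
set -eu

# changed_flag CMD... (hypothetical helper) mirrors `cmd > /dev/null || echo $?`:
# prints nothing when CMD exits 0, otherwise prints CMD's exit status.
changed_flag() {
  "$@" > /dev/null || echo $?
}

no_diff=$(changed_flag sh -c 'exit 0')      # stands in for: no differences
has_diff=$(changed_flag sh -c 'exit 1')     # stands in for: differences found
git_error=$(changed_flag sh -c 'exit 128')  # stands in for: git itself failed

echo "no_diff='$no_diff' has_diff='$has_diff' git_error='$git_error'"
```

Only the value 1 satisfies needs.check.outputs.changed == '1'. Under these assumptions, a git failure would therefore skip the test job rather than fail the workflow, which is worth keeping in mind when the tests unexpectedly do not run.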

Final Version

# .github/workflows/foo-test.yaml
name: foo-test
on: [push]

env:
  SPARSE_CHECKOUT_DIR: systems/foo
jobs:
  check:
    runs-on: Ubuntu-20.04
    outputs:
      changed: ${{ steps.checkout.outputs.changed }}
    steps:
      - id: checkout
        name: sparse checkout
        run: |
          git clone --filter=blob:none --no-checkout --depth 1 --sparse https://${GITHUB_ACTOR}:${{secrets.GITHUB_TOKEN}}@github.com/${GITHUB_REPOSITORY}.git .
          git sparse-checkout init --cone
          git sparse-checkout add ${SPARSE_CHECKOUT_DIR}
          git checkout ${GITHUB_SHA}
          echo "::set-output name=changed::$(git diff --exit-code origin/main --relative=${SPARSE_CHECKOUT_DIR} > /dev/null || echo $?)"
  test:
    runs-on: Ubuntu-20.04
    needs: [check]
    if: needs.check.outputs.changed == '1'
    steps:
      - name: sparse checkout
        run: |
          git clone --filter=blob:none --no-checkout --depth 1 --sparse https://${GITHUB_ACTOR}:${{secrets.GITHUB_TOKEN}}@github.com/${GITHUB_REPOSITORY}.git .
          git sparse-checkout init --cone
          git sparse-checkout add ${SPARSE_CHECKOUT_DIR}
          git checkout ${GITHUB_SHA}
      - run: echo write your tests
        working-directory: ./systems/foo

To be honest, I'm still not fully satisfied with this, because the sparse checkout is performed twice. I judged that the risk of mistakes from having to attach if to every step outweighs the redundancy of running the checkout twice. It is also easier to copy and paste.

I considered making it a script within the repository, but since the code itself doesn't exist before the checkout, I ended up pasting this code inline twice.

The "correct" solution would probably be for actions/checkout to support sparse checkout, or for someone to package this as an action. I'll leave the rest as homework for all of you.

I want something equivalent to circleci-agent step halt.
