Trying out CircleCI's Container Agent
This is the announcement:
A more scalable, container-friendly self-hosted runner: Container Agent - now in Open Preview - Build Environment - CircleCI Discuss
The pull request for this write-up:
try: circleci container agent by korosuke613 · Pull Request #39 · korosuke613/playground
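Summary
- Setup is easy
- However, the documentation is unhelpful
  - It is not written so that you can follow it with no prior knowledge
  - If you don't know helm, you also have to look up the helm commands
  - The step for enabling self-hosted runners is omitted
- A pod is created whenever a job that uses a container-agent resource class is created
- The runner has plenty of configuration options
- It did not work on kind or Docker Desktop on an Apple Silicon Mac
  - I'm not sure why
  - It is supposed to work on arm as well...?
- It worked on GKE Autopilot
  - I'm not sure why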
Links
- Announcement: A more scalable, container-friendly self-hosted runner: Container Agent - now in Open Preview - Build Environment - CircleCI Discuss
- Documentation: Container runner open preview - CircleCI
- FAQ: CircleCI’s self-hosted runner FAQs - CircleCI
Steps to get it running
1. Enable self-hosted runners under Organization Settings -> Self-Hosted Runners
   - https://app.circleci.com/settings/organization/github/<project name>/runners
2. Go back to the top page and open Self-Hosted Runners
3. Create a resource class from Create Resource Class and save the generated token
4. Install container-agent into the cluster
   helm repo add container-agent https://packagecloud.io/circleci/container-agent/helm
   helm repo update
   kubectl create namespace circleci
   helm install container-agent container-agent/container-agent -n circleci
5. Generate values.yaml
   helm show values container-agent/container-agent > values.yaml
6. Edit values.yaml
   - Add the resource class and token created in step 3 to .agent.resourceClasses
   - Excerpt from values.yaml
     resourceClasses:
       korosuke613/gke-autopilot:
         token: <token saved in step 3>
7. Apply values.yaml
   helm upgrade container-agent container-agent/container-agent -n circleci -f ./values.yaml
8. Use the resource class in .circleci/config.yml
   - Excerpt from config.yml
     docker-image:
       docker:
         - image: quay.io/buildah/stable
       resource_class: <namespace>/<resourceClass>
Setting it up with helm
❯ helm repo add container-agent https://packagecloud.io/circleci/container-agent/helm
"container-agent" has been added to your repositories
❯ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "container-agent" chart repository
Update Complete. ⎈Happy Helming!⎈
❯ helm install container-agent container-agent/container-agent -n circleci
Error: INSTALLATION FAILED: create: failed to create: namespaces "circleci" not found
It looks like the circleci namespace has to be created first.
❯ kubectl create namespace circleci
namespace/circleci created
❯ helm install container-agent container-agent/container-agent -n circleci
NAME: container-agent
LAST DEPLOYED: Tue Oct 11 15:52:46 2022
NAMESPACE: circleci
STATUS: deployed
REVISION: 1
TEST SUITE: None
Looks like it worked.
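As an alternative to creating the namespace by hand, Helm can create it as part of the install. The following is a minimal sketch, assuming Helm 3.2 or later (where the --create-namespace flag is available). The install_agent and run functions and the DRY_RUN variable are names invented here for illustration; they do not come from the CircleCI documentation.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Install (or upgrade) the chart and let Helm create the namespace if it is missing.
# Release name, chart name and namespace are the ones used in this post.
# Any extra arguments (for example -f ./values.yaml) are passed through to helm.
install_agent() {
  run helm upgrade --install container-agent container-agent/container-agent \
    --namespace circleci --create-namespace "$@"
}
```

Calling install_agent with no arguments would install with the chart defaults, and install_agent -f ./values.yaml would apply a values file in the same step. Setting DRY_RUN=1 first only prints the helm command, which is handy for checking it before touching the cluster.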
According to the documentation, the following command retrieves the values.
❯ helm show values circleci/container-agent
Error: failed to download "circleci/container-agent"
That doesn't work.
The chart I installed is container-agent/container-agent, so helm show values container-agent/container-agent should do it.
That worked.
❯ helm show values container-agent/container-agent
# Default values for container-agent.
## Overrides for generated resource names
# See templates/_helpers.tpl
# nameOverride:
# fullnameOverride:
agent:
  replicaCount: 1
  image:
    registry: ""
    repository: "circleci/container-agent"
    pullPolicy: Always
    tag: "1.0.17278-5c2bd95"
  pullSecrets: []
  matchLabels:
    app: container-agent
  # Annotations to be added to agent pods
  podAnnotations: {}
  # Security Context policies for agent pods
  podSecurityContext: {}
  # Security Context policies for agent containers
  containerSecurityContext: {}
  # Liveness and readiness probe values
  # Ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes
  livenessProbe:
    httpGet:
      # should match container.healthCheckPath
      path: "/live"
      port: 7623
      scheme: HTTP
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 1
    successThreshold: 1
    failureThreshold: 5
  readinessProbe:
    httpGet:
      # should match container.healthCheckPath
      path: "/ready"
      port: 7623
      scheme: HTTP
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 1
    successThreshold: 1
    failureThreshold: 3
  # Agent pod resource configuration
  # Ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
  resources: {}
  #   limits:
  #     cpu: 100m
  #     memory: 128Mi
  #   requests:
  #     cpu: 100m
  #     memory: 128Mi
  # Node labels for agent pod assignment
  # Ref: https://kubernetes.io/docs/user-guide/node-selection/
  nodeSelector:
    kubernetes.io/arch: amd64
  # Node tolerations for agent scheduling to nodes with taints
  # Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  tolerations: []
  # Agent affinity and anti-affinity
  # Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  affinity: {}
  # # An example of preferred pod anti-affinity, weight is in the range 1-100
  # podAntiAffinity:
  #   preferredDuringSchedulingIgnoredDuringExecution:
  #   - weight: 100
  #     podAffinityTerm:
  #       labelSelector:
  #         matchExpressions:
  #         - key: app.kubernetes.io/name
  #           operator: In
  #           values:
  #           - ingress-nginx
  #         - key: app.kubernetes.io/instance
  #           operator: In
  #           values:
  #           - ingress-nginx
  #         - key: app.kubernetes.io/component
  #           operator: In
  #           values:
  #           - controller
  #       topologyKey: kubernetes.io/hostname
  # # An example of required pod anti-affinity
  # podAntiAffinity:
  #   requiredDuringSchedulingIgnoredDuringExecution:
  #   - labelSelector:
  #       matchExpressions:
  #       - key: app.kubernetes.io/name
  #         operator: In
  #         values:
  #         - ingress-nginx
  #       - key: app.kubernetes.io/instance
  #         operator: In
  #         values:
  #         - ingress-nginx
  #       - key: app.kubernetes.io/component
  #         operator: In
  #         values:
  #         - controller
  #     topologyKey: "kubernetes.io/hostname"
  # Pod disruption budget settings
  pdb:
    create: false
    minAvailable: 1
    maxUnavailable: ""
  # CircleCI Runner API URL
  runnerAPI: "https://runner.circleci.com"
  # A (preferably) unique name assigned to this particular container-agent instance.
  # This name will appear in your runners inventory page in the CircleCI UI.
  # If left unspecified, the name will default to the name of the deployment.
  name: ""
  # Tasks are drained during the termination grace period,
  # so this should be sufficiently long relative to the maximum run time to ensure graceful shutdown
  terminationGracePeriodSeconds: 18300 # 5 hours and 5 minutes
  maxRunTime: "5h"
  # Maximum number of tasks that can be run concurrently.
  # IMPORTANT: This concurrency is independent of, and may be limited by, the Runner concurrency of your plan.
  # Configure this value at your own risk based on the resources allocated to your cluster.
  maxConcurrentTasks: 20
  # Enable garbage collection of dangling Kubernetes objects managed by container agent
  kubeGCEnabled: true
  # The age of a Kubernetes object managed by container agent before the garbage collection deletes it
  kubeGCThreshold: "5h5m"
  # Name of the user provided secret containing resource class tokens. You can mix tokens from this secret
  # and in the secret created from tokens specified in the resourceClasses section below
  #
  # The tokens should be specified as secret key-value pairs of the form
  # ResourceClass: Token
  # The resource class name needs to match the names configured below exactly to match tokens to the correct configuration
  # As Kubernetes does not allow / in secret keys, a period (.) should be substituted instead
  customSecret: ""
  # Resource class settings. The tokens specified here will be used to claim tasks & the tasks
  # will be launched with the configured configs
  resourceClasses: {}
  #   circleci-runner/resourceClass:
  #     token: XXXX
  #     metadata:
  #       annotations:
  #         custom.io: my-annotation
  #     spec:
  #       containers:
  #         - resources:
  #             limits:
  #               cpu: 500m
  #           volumeMounts:
  #             - name: xyz
  #               mountPath: /path/to/mount
  #       securityContext:
  #         runAsNonRoot: true
  #       imagePullSecrets:
  #         - name: my_cred
  #   circleci-runner/resourceClass2:
  #     token: XXXX
  #     spec:
  #       imagePullSecrets:
  #         - name: "other"
  ## Resource class constraint validation checker settings. The checker will periodically validate the
  ## node constraints in the resource class spec to ensure task pods can be scheduled before claiming tasks
  constraintChecker:
    # Enable constraint checking (This requires at least List Node permissions)
    enable: false
    # Number of failed checks before disabling task claim
    threshold: 3
    # Check interval
    interval: 15m
# Kubernetes service account settings
serviceAccount:
  create: true
  name: ""
  automountServiceAccountToken: true
  annotations: {}
# Kubernetes Roles Based Access Control settings
rbac:
  create: true
  role:
    name: ""
    rules: []
  roleBinding:
    name: ""
  clusterRole:
    name: ""
    rules: []
  clusterRoleBinding:
    name: ""
Write the settings out to a file.
❯ helm show values container-agent/container-agent > values.yaml
Apparently you are supposed to set a PAT at .agent.resourceClasses.<namespace>/<project>.token.
# Resource class settings. The tokens specified here will be used to claim tasks & the tasks
# will be launched with the configured configs
resourceClasses: {}
#   circleci-runner/resourceClass:
#     token: XXXX
#     metadata:
#       annotations:
#         custom.io: my-annotation
#     spec:
#       containers:
#         - resources:
#             limits:
#               cpu: 500m
#           volumeMounts:
#             - name: xyz
#               mountPath: /path/to/mount
#       securityContext:
#         runAsNonRoot: true
#       imagePullSecrets:
#         - name: my_cred
#   circleci-runner/resourceClass2:
#     token: XXXX
#     spec:
#       imagePullSecrets:
#         - name: "other"
Generated an API token.
Settings:
resourceClasses:
  korosuke613/playground:
    token: <can't show you this!>
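The default values above also describe customSecret as a way to keep tokens out of values.yaml, with the rule that the / in a resource class name becomes a period in the secret key. Here is a rough sketch of that naming rule only. The function name secret_key_for and the secret name container-agent-tokens are made up for illustration, and I have not tried the customSecret route myself.

```shell
#!/bin/sh
# Hypothetical helper: turn a resource class name into a valid Secret key
# by replacing every "/" with ".", as the comment in the chart values describes.
secret_key_for() {
  printf '%s\n' "$1" | tr '/' '.'
}

# Example of use (not executed here; the secret name is an assumption):
#   kubectl create secret generic container-agent-tokens -n circleci \
#     --from-literal="$(secret_key_for korosuke613/gke-autopilot)=<token>"
# The secret name would then go into the customSecret value of the chart.
```

For korosuke613/gke-autopilot this yields the key korosuke613.gke-autopilot, while the resourceClasses entry in values.yaml keeps the original name with the slash.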
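Come to think of it, how do I actually apply the contents of values.yaml? I'm not familiar with helm, so I have no idea.
The CircleCI documentation doesn't cover this part. I wish it did.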
❯ helm upgrade container-agent/container-agent -n circleci -f ./values.yaml
Error: "helm upgrade" requires 2 arguments
Usage: helm upgrade [RELEASE] [CHART] [flags]
Got that wrong: the release name was missing.
❯ helm upgrade container-agent container-agent/container-agent -n circleci -f ./values.yaml
Release "container-agent" has been upgraded. Happy Helming!
NAME: container-agent
LAST DEPLOYED: Tue Oct 11 16:08:02 2022
NAMESPACE: circleci
STATUS: deployed
REVISION: 2
TEST SUITE: None
The upgrade went through.
Hmm, it still isn't working.
Is the agent even running?
❯ kubectl get -n circleci pods
NAME READY STATUS RESTARTS AGE
container-agent-55f749d6c6-b6v24 0/1 Pending 0 16m
container-agent-5df4f9f7d7-4zt4n 0/1 Pending 0 32m
Hmm, why are they Pending?
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m31s (x6 over 27m) default-scheduler 0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
Maybe it has something to do with running on kind.
Let's try the Kubernetes that ships with Docker Desktop.
The Docker Desktop cluster gave the same error.
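One thing worth checking at this point, as a guess and not a confirmed cause: the default values above set nodeSelector to kubernetes.io/arch: amd64, and nodes running on an Apple Silicon Mac would normally be labeled arm64, which would fit the "didn't match Pod's node affinity/selector" event. Below is a small sketch for listing which nodes carry a given architecture label. The function name nodes_with_arch is invented here.

```shell
#!/bin/sh
# Hypothetical helper: read "<node-name> <arch>" lines on stdin and print
# the names of the nodes whose architecture equals the first argument.
nodes_with_arch() {
  awk -v want="$1" '$2 == want { print $1 }'
}

# Example of use (not executed here):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.metadata.labels.kubernetes\.io/arch}{"\n"}{end}' \
#     | nodes_with_arch amd64
# For a quick look by eye, `kubectl get nodes -L kubernetes.io/arch` shows the label as a column.
```

An empty result for amd64 would be consistent with the scheduling failure above. In that case, changing nodeSelector in values.yaml might be enough, although I have not verified that the agent image runs on arm64.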
Let's try it on GCP.
Ah, maybe pod scheduling is restricted on GKE Autopilot.
Warning FailedScheduling 28s gke.io/optimize-utilization-scheduler no nodes available to schedule pods
After waiting a while it got scheduled. But then it went into CrashLoopBackOff with a mysterious error...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal TriggeredScaleUp 3m22s cluster-autoscaler pod triggered scale-up: [{https://www.googleapis.com/compute/v1/projects/korosuke613-playground/zones/asia-east1-b/instanceGroups/gk3-autopilot-cluster-1-nap-1l62u8rn-14fe2a13-grp 0->1 (max: 1000)}]
Warning FailedScheduling 2m59s (x2 over 4m8s) gke.io/optimize-utilization-scheduler no nodes available to schedule pods
Warning FailedScheduling 2m33s gke.io/optimize-utilization-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
Normal Scheduled 104s gke.io/optimize-utilization-scheduler Successfully assigned circleci/container-agent-78d9f8f6b6-6wtb2 to gk3-autopilot-cluster-1-nap-1l62u8rn-14fe2a13-jzfv
Normal Pulled 44s kubelet Successfully pulled image "circleci/container-agent:1.0.17278-5c2bd95" in 23.650168695s
Normal Pulled 25s kubelet Successfully pulled image "circleci/container-agent:1.0.17278-5c2bd95" in 15.549995251s
Normal Pulling 8s (x3 over 68s) kubelet Pulling image "circleci/container-agent:1.0.17278-5c2bd95"
Normal Created 6s (x3 over 44s) kubelet Created container container-agent
Normal Pulled 6s kubelet Successfully pulled image "circleci/container-agent:1.0.17278-5c2bd95" in 2.06933746s
Normal Started 5s (x3 over 43s) kubelet Started container container-agent
Warning BackOff 1s (x7 over 24s) kubelet Back-off restarting failed container
Running helm upgrade again fixed it.
❯ helm upgrade container-agent container-agent/container-agent -n circleci -f ./values.yaml
Release "container-agent" has been upgraded. Happy Helming!
NAME: container-agent
LAST DEPLOYED: Tue Oct 11 16:51:32 2022
NAMESPACE: circleci
STATUS: deployed
REVISION: 2
TEST SUITE: None
❯ kubectl get pods -n circleci
NAME READY STATUS RESTARTS AGE
container-agent-79874f697f-cgw4w 1/1 Running 0 25s
But nothing has changed.
Jobs still can't run.
Let's take a look at the logs.
❯ kubectl logs -n circleci container-agent-79874f697f-cgw4w
07:54:16 0e2dd 231.379ms claim app.loop_name=claim: korosuke613.playground error=Error calling CircleCI: the response from POST /api/v2/runner/claim was 401 (Unauthorized) (1 attempts) mode=agent result=error service.name=container-agent service_name=container-agent
Authentication seems to be failing. Maybe the API token is wrong.
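Since the token gets swapped a few times from here on, it helps to reduce the agent log to just the status codes of failed claim requests. This is a small sketch that assumes the log format stays the same as in the line above. The function name claim_status_codes is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read agent log lines on stdin and print the HTTP status
# code of every failed claim request, one per line.
claim_status_codes() {
  sed -n 's/.*POST \/api\/v2\/runner\/claim was \([0-9][0-9]*\).*/\1/p'
}

# Example of use (not executed here):
#   kubectl logs -n circleci deployment/container-agent | claim_status_codes | sort | uniq -c
```

As long as 401 keeps showing up after a helm upgrade, the token in values.yaml is still not one that the runner API accepts.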
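Switched to the PAT instead.
Hmm, still the same error.
It seems you first have to agree to the terms for self-hosted runners...
Organization Settings
Ah, now the page shows up.
Creating a resource class also gave me a token.
The installation instructions page showed up as well.
After updating the token, I ran helm upgrade container-agent container-agent/container-agent -n circleci -f ./values.yaml again.
The runner registered!!
Pushed the following config.yml.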
version: 2.1
workflows:
  try-container-agent:
    jobs:
      - try-container-agent
jobs:
  try-container-agent:
    docker:
      - image: cimg/base:current
    resource_class: korosuke613/gke-autopilot
    steps:
      - checkout
      - run: echo "hello world"
❯ kubectl get pods -n circleci
NAME READY STATUS RESTARTS AGE
ccita-6345247bb9955825bf6de2ba-0-fozo6ngc 0/1 ContainerCreating 0 4s
container-agent-58fdf77455-bnvgk 1/1 Running 0 108s
A pod was born!
The job ran!
I re-ran it a bunch of times, and each time it spun up a pod right away and ran the job.
❯ kubectl get pods -n circleci -w
NAME READY STATUS RESTARTS AGE
ccita-6345258c50d0733667eaa9ed-0-knzwxxdk 1/1 Running 0 14s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 0/1 Pending 0 14s
ccita-63452599a9dccd19f905b57f-0-va3l0led 0/1 Pending 0 4s
container-agent-58fdf77455-bnvgk 1/1 Running 0 6m30s
ccita-6345258c50d0733667eaa9ed-0-knzwxxdk 1/1 Terminating 0 20s
ccita-6345258c50d0733667eaa9ed-0-knzwxxdk 1/1 Terminating 0 20s
ccita-6345258c50d0733667eaa9ed-0-knzwxxdk 0/1 Terminating 0 50s
ccita-6345258c50d0733667eaa9ed-0-knzwxxdk 0/1 Terminating 0 50s
ccita-6345258c50d0733667eaa9ed-0-knzwxxdk 0/1 Terminating 0 50s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 0/1 Pending 0 50s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 0/1 ContainerCreating 0 50s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 1/1 Running 0 56s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 1/1 Terminating 0 66s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 0/1 Terminating 0 97s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 0/1 Terminating 0 97s
ccita-63452593b3ebc43b41eb6774-0-lv4eflj7 0/1 Terminating 0 97s
ccita-63452599a9dccd19f905b57f-0-va3l0led 0/1 Pending 0 87s
ccita-63452599a9dccd19f905b57f-0-va3l0led 0/1 Pending 0 96s
ccita-63452599a9dccd19f905b57f-0-va3l0led 0/1 ContainerCreating 0 96s
ccita-63452599a9dccd19f905b57f-0-va3l0led 1/1 Running 0 2m26s
ccita-63452599a9dccd19f905b57f-0-va3l0led 1/1 Terminating 0 2m40s
ccita-63452599a9dccd19f905b57f-0-va3l0led 1/1 Terminating 0 2m40s
ccita-63452599a9dccd19f905b57f-0-va3l0led 0/1 Terminating 0 3m11s
ccita-63452599a9dccd19f905b57f-0-va3l0led 0/1 Terminating 0 3m11s
ccita-63452599a9dccd19f905b57f-0-va3l0led 0/1 Terminating 0 3m11s
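The pod status can also be seen from the CircleCI UI.
Possibly because this is GKE Autopilot, it took a while when many pods were launched at once.
Cleanup
Leaving pods running on GKE Autopilot costs money, after all.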
❯ helm uninstall container-agent -n circleci
release "container-agent" uninstalled
Uninstall complete.
I don't know how to remove the agent registration on the CircleCI side.
Looks like the circleci CLI can do it.
❯ circleci runner resource-class list korosuke613
+---------------------------+-------------+
| RESOURCE CLASS | DESCRIPTION |
+---------------------------+-------------+
| korosuke613/gke-autopilot | |
+---------------------------+-------------+
❯ circleci runner resource-class delete korosuke613/gke-autopilot
Error: resource class korosuke613/gke-autopilot still has tokens in use
Do I have to delete the token first?
❯ circleci runner token list korosuke613/gke-autopilot
+--------------------------------------+----------+----------------------+
| ID | NICKNAME | CREATED AT |
+--------------------------------------+----------+----------------------+
| 73c05bc1-9288-4ee5-b084-8b0cb0c950f0 | default | 2022-10-11T08:04:58Z |
+--------------------------------------+----------+----------------------+
❯ circleci runner token delete korosuke613/gke-autopilot
Error: id is not a valid uuid
What you pass to token delete is the id, not the resource class name.
❯ circleci runner token delete 73c05bc1-9288-4ee5-b084-8b0cb0c950f0
❯ circleci runner resource-class delete korosuke613/gke-autopilot
Deleted.
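The order that worked above (tokens first, then the resource class) can be wrapped up roughly as follows. This is a sketch that uses only the circleci CLI subcommands shown in this post and assumes token list keeps printing the table format seen above. The function names extract_token_ids and delete_resource_class are made up here.

```shell
#!/bin/sh
# Hypothetical helper: pull every UUID out of the table printed by
# `circleci runner token list` (read from stdin), one per line.
extract_token_ids() {
  grep -Eo '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
}

# Delete all tokens of a resource class, then the resource class itself.
# The resource class cannot be deleted while it still has tokens.
delete_resource_class() {
  rc="$1"
  circleci runner token list "$rc" | extract_token_ids | while read -r id; do
    circleci runner token delete "$id"
  done
  circleci runner resource-class delete "$rc"
}
```

With this loaded, delete_resource_class korosuke613/gke-autopilot would replace the manual steps above. Parsing the table with a UUID pattern is a shortcut; a machine-readable output option, if the CLI has one, would be more robust.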
Wrote up the summary.
I found an introductory blog post on self-hosted runners.
Apparently my feedback was passed along. Thank you!!