
kubectl wait in cloud-init → error: no matching resources found

Published on 2024/01/23

Issue

When kubectl apply and kubectl wait are run back to back, for example from a script, kubectl wait fails with

cloud-init[1040]: error: no matching resources found

and the steps that follow fail as well.

Example sequence

  • Apply the MetalLB manifests
  • kubectl wait
    • kubectl wait runs before Pod initialization has even started, so no Pod matches the selector yet and it fails with error: no matching resources found.
  • Apply the IPAddressPool manifest
    • Fails because MetalLB is not Ready yet
cloud-init[1040]: error: no matching resources found
[  OK  ] Created slice libcontainer…_43fc_afed_ebc3e495eb05.slice.
[  OK  ] Created slice libcontainer…_4e83_ad39_aeba2ba9d69f.slice.
[  OK  ] Started libcontainer conta…81676d269b2c6f22ec4485464c52b.
[  OK  ] Started libcontainer conta…38ff429b69abe1b1e2debd5de212a.
[  OK  ] Started libcontainer conta…f0571226cae324376f8f6b89b5479.
cloud-init[1040]: Error from server (InternalError): error when creating "http://192.168.3.2:8080/metallb-l2-advertisement.yaml": Internal error occurred: failed calling webhook "ipaddresspoolvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-ipaddresspool?timeout=10s": dial tcp 10.98.3.73:443: i/o timeout
cloud-init[1040]: Error from server (InternalError): error when creating "http://192.168.3.2:8080/metallb-l2-advertisement.yaml": Internal error occurred: failed calling webhook "l2advertisementvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-l2advertisement?timeout=10s": dial tcp 10.98.3.73:443: connect: connection refused

Fix

Work around the problem by looping until kubectl wait stops returning no matching resources found and exits successfully.

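Note: with the settings below, the loop also keeps going when kubectl wait times out. If you want to leave the loop when no further progress is possible, for example when a Pod is stuck in CrashLoopBackOff, add an OR condition that also ends the loop on CrashLoopBackOff.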

     runcmd:
         # (earlier steps omitted)
         echo "Installing calico cni."
         kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
         echo "Waiting on calico to start up..."
-        kubectl wait --namespace kube-system \
-            --for=condition=ready pod \
-            --selector=k8s-app=calico-kube-controllers \
-            --timeout=120s
+        while ! ( \
+          kubectl wait --namespace kube-system \
+            --for=condition=ready pod \
+            --selector=k8s-app=calico-kube-controllers \
+            --timeout=120s \
+        ); do sleep 10; done
         kubectl taint nodes --all node-role.kubernetes.io/control-plane-
         echo "Installing metallb."
         kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml
         echo "Waiting on metallb to start up..."
-        kubectl wait --namespace metallb-system \
-            --for=condition=ready pod \
-            --selector=component=speaker \
-            --timeout=120s
+        while ! ( \
+          kubectl wait --namespace metallb-system \
+            --for=condition=ready pod \
+            --selector=component=speaker \
+            --timeout=120s \
+        ); do sleep 10; done
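
The loops above never give up. As one possible variation, here is a minimal sketch of a retry helper that stops after a deadline, so a Pod stuck in CrashLoopBackOff (or anything else that never becomes Ready) cannot block cloud-init forever. The function name retry_until and its parameters are hypothetical and are not part of the original setup. It is a plain time limit, not the CrashLoopBackOff check mentioned in the note.

```shell
#!/bin/sh
# retry_until: run a command repeatedly until it succeeds or a deadline passes.
# Hypothetical helper, not part of the original article.
# Usage: retry_until <max_seconds> <interval_seconds> <command> [args...]
# Note: POSIX sh has no "local", so these variables are global to the script.
retry_until() {
  max_seconds=$1
  interval=$2
  shift 2
  elapsed=0
  # Re-run the command until it exits with status 0.
  until "$@"; do
    # Only the sleep intervals are counted, not the time the command itself takes.
    elapsed=$((elapsed + interval))
    if [ "$elapsed" -ge "$max_seconds" ]; then
      echo "retry_until: gave up after ${elapsed}s of waiting: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
  return 0
}

# Example call (commented out so the block has no side effects):
# retry_until 600 10 kubectl wait --namespace metallb-system \
#     --for=condition=ready pod \
#     --selector=component=speaker \
#     --timeout=120s
```

Because the time spent inside kubectl wait is not counted, the real upper bound is longer than max_seconds whenever kubectl wait runs into its own --timeout. When the helper returns non-zero, the script can stop there instead of going on to apply the IPAddressPool manifest.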
