
kubeadm upgrade apply v1.26.1 succeeded after updating containerd

Published 2023/01/29

This is a follow-up to "Running kubeadm upgrade from Kubernetes v1.25.3 to v1.26.1 failed because containerd was too old".

Replaced containerd with containerd.io and started over

I replaced containerd by following the steps in my article "Replacing the OS-default containerd on Ubuntu 22.04 LTS with containerd.io".

Upgrading the control plane

Running kubeadm upgrade apply v1.26.1 again succeeded

imksoo@k8smaster:~$ sudo kubeadm upgrade apply v1.26.1
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0129 22:59:22.179736  745994 configset.go:177] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: strict decoding error: unknown field "udpIdleTimeout"
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.26.1"
[upgrade/versions] Cluster version: v1.25.3
[upgrade/versions] kubeadm version: v1.26.1
[upgrade] Are you sure you want to proceed? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.26.1" (timeout: 5m0s)...
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/staticpods] Preparing for "etcd" upgrade
[upgrade/staticpods] Renewing etcd-server certificate
[upgrade/staticpods] Renewing etcd-peer certificate
[upgrade/staticpods] Renewing etcd-healthcheck-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2023-01-29-23-00-02/etcd.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=etcd
[upgrade/staticpods] Component "etcd" upgraded successfully!
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests3850380804"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2023-01-29-23-00-02/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2023-01-29-23-00-02/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2023-01-29-23-00-02/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.26.1". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
imksoo@k8smaster:~$ 
imksoo@k8smaster:~$ kubeadm version -o json
{
  "clientVersion": {
    "major": "1",
    "minor": "26",
    "gitVersion": "v1.26.1",
    "gitCommit": "8f94681cd294aa8cfd3407b8191f6c70214973a4",
    "gitTreeState": "clean",
    "buildDate": "2023-01-18T15:56:50Z",
    "goVersion": "go1.19.5",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}
imksoo@k8smaster:~$ kubectl version -o json 
{
  "clientVersion": {
    "major": "1",
    "minor": "25",
    "gitVersion": "v1.25.3",
    "gitCommit": "434bfd82814af038ad94d62ebe59b133fcb50506",
    "gitTreeState": "clean",
    "buildDate": "2022-10-12T10:57:26Z",
    "goVersion": "go1.19.2",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "kustomizeVersion": "v4.5.7",
  "serverVersion": {
    "major": "1",
    "minor": "26",
    "gitVersion": "v1.26.1",
    "gitCommit": "8f94681cd294aa8cfd3407b8191f6c70214973a4",
    "gitTreeState": "clean",
    "buildDate": "2023-01-18T15:51:25Z",
    "goVersion": "go1.19.5",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}
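
In the kubectl version output above, the client is still v1.25.3 while the server is already v1.26.1. That is expected at this point, since kubectl has not been upgraded yet, and kubectl is supported within one minor version of kube-apiserver. As a rough sketch, the gap can be computed from the two gitVersion strings. The helper names minor_of and minor_skew are my own and not part of kubectl.

```shell
#!/bin/sh
# Print the minor-version gap between two Kubernetes gitVersion strings,
# e.g. "v1.25.3" (client) and "v1.26.1" (server).
# minor_of and minor_skew are hypothetical helper names, not kubectl features.
minor_of() {
  # Strip the leading "v", then take the second dot-separated field.
  echo "${1#v}" | cut -d. -f2
}

minor_skew() {
  client_minor=$(minor_of "$1")
  server_minor=$(minor_of "$2")
  # Positive result: the server is ahead of the client.
  echo $(( server_minor - client_minor ))
}
```

Calling minor_skew v1.25.3 v1.26.1 prints 1, which matches the state right after the control plane upgrade.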

Upgrading kubelet and kubectl

imksoo@k8smaster:~$ sudo apt-mark unhold kubelet kubectl
[sudo] password for imksoo: 
Canceled hold on kubelet.
Canceled hold on kubectl.
imksoo@k8smaster:~$ 
imksoo@k8smaster:~$ sudo apt install kubelet kubectl 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  bridge-utils dns-root-data dnsmasq-base pigz ubuntu-fan
Use 'sudo apt autoremove' to remove them.
The following packages will be upgraded:
  kubectl kubelet
2 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
Need to get 30.5 MB of archives.
After this operation, 10.0 MB of additional disk space will be used.
Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubectl amd64 1.26.1-00 [10.1 MB]
Get:2 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubelet amd64 1.26.1-00 [20.5 MB]
Fetched 30.5 MB in 2s (16.7 MB/s)  
(Reading database ... 109630 files and directories currently installed.)
Preparing to unpack .../kubectl_1.26.1-00_amd64.deb ...
Unpacking kubectl (1.26.1-00) over (1.25.3-00) ...
Preparing to unpack .../kubelet_1.26.1-00_amd64.deb ...
Unpacking kubelet (1.26.1-00) over (1.25.3-00) ...
Setting up kubectl (1.26.1-00) ...
Setting up kubelet (1.26.1-00) ...
Scanning processes...                                                                                                                                                     
Scanning candidates...                                                                                                                                                    
Scanning linux images...                                                                                                                                                  

Restarting services...
 systemctl restart containerd.service
Service restarts being deferred:
 systemctl restart networkd-dispatcher.service
 systemctl restart systemd-logind.service
 systemctl restart unattended-upgrades.service
 systemctl restart user@1000.service

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.
imksoo@k8smaster:~$ 
imksoo@k8smaster:~$ sudo apt-mark hold kubelet kubectl
kubelet set on hold.
kubectl set on hold.
imksoo@k8smaster:~$ 
imksoo@k8smaster:~$ sudo systemctl daemon-reload
imksoo@k8smaster:~$ sudo systemctl restart kubelet
imksoo@k8smaster:~$ 
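
The unhold, install, hold, restart sequence above can be wrapped in a small function. This is only a sketch: upgrade_held and the RUNNER variable are names I made up, and unlike the log above it pins an explicit package version, which is an assumption on my part. By default it just prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the unhold -> install -> hold -> restart sequence for held packages.
# upgrade_held is a hypothetical helper. RUNNER defaults to "echo", so the
# function only prints the commands; set RUNNER="" to actually execute them.
upgrade_held() {
  version="$1"; shift
  runner="${RUNNER-echo}"
  pinned=""
  for pkg in "$@"; do
    pinned="$pinned $pkg=$version"
  done
  $runner sudo apt-mark unhold "$@"
  # $pinned is left unquoted on purpose so each pkg=version is its own argument.
  $runner sudo apt install $pinned
  $runner sudo apt-mark hold "$@"
  $runner sudo systemctl daemon-reload
  $runner sudo systemctl restart kubelet
}
```

For example, upgrade_held 1.26.1-00 kubelet kubectl prints the five commands in order so they can be reviewed before running them for real.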

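Upgrading the CNI (Calico)

I downloaded the new manifest, skimmed the diff against the old one, saw no major differences, and ran kubectl apply -f calico.yaml. Note that the diff below has the new manifest (v3.25.0) on the `-` side and the old one (v3.24.5, saved as calico.yaml.v1.25) on the `+` side.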

--- calico.yaml 2023-01-29 23:23:14.394652470 +0900
+++ calico.yaml.v1.25   2022-11-10 00:45:41.881045708 +0900
@@ -149,12 +149,6 @@
                       type: string
                   type: object
                 type: array
-              ignoredInterfaces:
-                description: IgnoredInterfaces indicates the network interfaces that
-                  needs to be excluded when reading device routes.
-                items:
-                  type: string
-                type: array
               listenPort:
                 description: ListenPort is the port where BGP protocol should listen.
                   Defaults to 179
@@ -373,23 +367,12 @@
                   remote AS number comes from the remote node's NodeBGPSpec.ASNumber,
                   or the global default if that is not set.
                 type: string
-              reachableBy:
-                description: Add an exact, i.e. /32, static route toward peer IP in
-                  order to prevent route flapping. ReachableBy contains the address
-                  of the gateway which peer can be reached by.
-                type: string
               sourceAddress:
                 description: Specifies whether and how to configure a source address
                   for the peerings generated by this BGPPeer resource.  Default value
                   "UseNodeIP" means to configure the node IP as the source address.  "None"
                   means not to configure a source address.
                 type: string
-              ttlSecurity:
-                description: TTLSecurity enables the generalized TTL security mechanism
-                  (GTSM) which protects against spoofed packets by ignoring received
-                  packets with a smaller than expected TTL value. The provided value
-                  is the number of hops (edges) between the peers.
-                type: integer
             type: object
         type: object
     served: true
@@ -874,10 +857,9 @@
                   [Default: false]'
                 type: boolean
               bpfEnforceRPF:
-                description: 'BPFEnforceRPF enforce strict RPF on all host interfaces
-                  with BPF programs regardless of what is the per-interfaces or global
-                  setting. Possible values are Disabled, Strict or Loose. [Default:
-                  Strict]'
+                description: 'BPFEnforceRPF enforce strict RPF on all interfaces with
+                  BPF programs regardless of what is the per-interfaces or global
+                  setting. Possible values are Disabled or Strict. [Default: Strict]'
                 type: string
               bpfExtToServiceConnmark:
                 description: 'BPFExtToServiceConnmark in BPF mode, control a 32bit
@@ -917,14 +899,6 @@
                   kube-proxy.  Lower values give reduced set-up latency.  Higher values
                   reduce Felix CPU usage by batching up more work.  [Default: 1s]'
                 type: string
-              bpfL3IfacePattern:
-                description: BPFL3IfacePattern is a regular expression that allows
-                  to list tunnel devices like wireguard or vxlan (i.e., L3 devices)
-                  in addition to BPFDataIfacePattern. That is, tunnel interfaces not
-                  created by Calico, that Calico workload traffic flows over as well
-                  as any interfaces that handle incoming traffic to nodeports and
-                  services from outside the cluster.
-                type: string
               bpfLogLevel:
                 description: 'BPFLogLevel controls the log level of the BPF programs
                   when in BPF dataplane mode.  One of "Off", "Info", or "Debug".  The
@@ -1000,12 +974,11 @@
                   to use.  Only used if UseInternalDataplaneDriver is set to false.
                 type: string
               dataplaneWatchdogTimeout:
-                description: "DataplaneWatchdogTimeout is the readiness/liveness timeout
-                  used for Felix's (internal) dataplane driver. Increase this value
+                description: 'DataplaneWatchdogTimeout is the readiness/liveness timeout
+                  used for Felix''s (internal) dataplane driver. Increase this value
                   if you experience spurious non-ready or non-live events when Felix
                   is under heavy load. Decrease the value to get felix to report non-live
-                  or non-ready more quickly. [Default: 90s] \n Deprecated: replaced
-                  by the generic HealthTimeoutOverrides."
+                  or non-ready more quickly. [Default: 90s]'
                 type: string
               debugDisableLogDropping:
                 type: boolean
@@ -1109,21 +1082,15 @@
                   type: object
                 type: array
               featureDetectOverride:
-                description: FeatureDetectOverride is used to override feature detection
-                  based on auto-detected platform capabilities.  Values are specified
-                  in a comma separated list with no spaces, example; "SNATFullyRandom=true,MASQFullyRandom=false,RestoreSupportsLock=".  "true"
-                  or "false" will force the feature, empty or omitted values are auto-detected.
-                type: string
-              featureGates:
-                description: FeatureGates is used to enable or disable tech-preview
-                  Calico features. Values are specified in a comma separated list
-                  with no spaces, example; "BPFConnectTimeLoadBalancingWorkaround=enabled,XyZ=false".
-                  This is used to enable features that are not fully production ready.
+                description: FeatureDetectOverride is used to override the feature
+                  detection. Values are specified in a comma separated list with no
+                  spaces, example; "SNATFullyRandom=true,MASQFullyRandom=false,RestoreSupportsLock=".
+                  "true" or "false" will force the feature, empty or omitted values
+                  are auto-detected.
                 type: string
               floatingIPs:
                 description: FloatingIPs configures whether or not Felix will program
-                  non-OpenStack floating IP addresses.  (OpenStack-derived floating
-                  IPs are always programmed, regardless of this setting.)
+                  floating IP addresses.
                 enum:
                 - Enabled
                 - Disabled
@@ -1140,23 +1107,6 @@
                 type: string
               healthPort:
                 type: integer
-              healthTimeoutOverrides:
-                description: HealthTimeoutOverrides allows the internal watchdog timeouts
-                  of individual subcomponents to be overriden.  This is useful for
-                  working around "false positive" liveness timeouts that can occur
-                  in particularly stressful workloads or if CPU is constrained.  For
-                  a list of active subcomponents, see Felix's logs.
-                items:
-                  properties:
-                    name:
-                      type: string
-                    timeout:
-                      type: string
-                  required:
-                  - name
-                  - timeout
-                  type: object
-                type: array
               interfaceExclude:
                 description: 'InterfaceExclude is a comma-separated list of interfaces
                   that Felix should exclude when monitoring for host endpoints. The
@@ -1198,7 +1148,7 @@
                 type: string
               iptablesBackend:
                 description: IptablesBackend specifies which backend of iptables will
-                  be used. The default is Auto.
+                  be used. The default is legacy.
                 type: string
               iptablesFilterAllowAction:
                 type: string
@@ -4440,7 +4390,7 @@
         # It can be deleted if this is a fresh installation, or if you have already
         # upgraded to use calico-ipam.
         - name: upgrade-ipam
-          image: docker.io/calico/cni:v3.25.0
+          image: docker.io/calico/cni:v3.24.5
           imagePullPolicy: IfNotPresent
           command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
           envFrom:
@@ -4468,7 +4418,7 @@
         # This container installs the CNI binaries
         # and CNI network config file on each node.
         - name: install-cni
-          image: docker.io/calico/cni:v3.25.0
+          image: docker.io/calico/cni:v3.24.5
           imagePullPolicy: IfNotPresent
           command: ["/opt/cni/bin/install"]
           envFrom:
@@ -4511,7 +4461,7 @@
         # i.e. bpf at /sys/fs/bpf and cgroup2 at /run/calico/cgroup. Calico-node initialisation is executed
         # in best effort fashion, i.e. no failure for errors, to not disrupt pod creation in iptable mode.
         - name: "mount-bpffs"
-          image: docker.io/calico/node:v3.25.0
+          image: docker.io/calico/node:v3.24.5
           imagePullPolicy: IfNotPresent
           command: ["calico-node", "-init", "-best-effort"]
           volumeMounts:
@@ -4537,7 +4487,7 @@
         # container programs network policy and routes on each
         # host.
         - name: calico-node
-          image: docker.io/calico/node:v3.25.0
+          image: docker.io/calico/node:v3.24.5
           imagePullPolicy: IfNotPresent
           envFrom:
           - configMapRef:
@@ -4754,7 +4704,7 @@
       priorityClassName: system-cluster-critical
       containers:
         - name: calico-kube-controllers
-          image: docker.io/calico/kube-controllers:v3.25.0
+          image: docker.io/calico/kube-controllers:v3.24.5
           imagePullPolicy: IfNotPresent
           env:
             # Choose which controllers to run.
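
Most of the diff above is CRD description text, so the part that matters for the upgrade is easy to miss: the image tags move from v3.24.5 to v3.25.0. A minimal sketch for pulling only the image references out of a manifest follows. manifest_images is a hypothetical helper name, and it relies on plain text matching rather than real YAML parsing.

```shell
#!/bin/sh
# List the distinct container images referenced by a Kubernetes manifest.
# manifest_images is a hypothetical helper name. It matches lines textually
# (optionally starting with a "-" list marker), so it is not a YAML parser.
manifest_images() {
  # Turn "          image: docker.io/calico/cni:v3.25.0" into
  # "docker.io/calico/cni:v3.25.0", then de-duplicate.
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*image:[[:space:]]*//p' "$1" | sort -u
}
```

Running it on calico.yaml and on calico.yaml.v1.25 and comparing the two lists gives a quick view of the version bump before applying the manifest.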

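Upgrading the worker nodes

The overall flow is roughly the same as for the control plane.
I skipped the apt-mark unhold / apt-mark hold steps and simply ran apt install directly. I also skipped kubectl drain and kubectl uncordon for the worker nodes. (This is just a hobby "bonsai" k8s cluster at home, after all...)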

imksoo@k8sworker02:~$ sudo apt update
imksoo@k8sworker02:~$ sudo apt upgrade -y 
imksoo@k8sworker02:~$ sudo apt install kubeadm=1.26.1-00
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  bridge-utils dns-root-data dnsmasq-base pigz ubuntu-fan
Use 'sudo apt autoremove' to remove them.
The following held packages will be changed:
  kubeadm
The following packages will be upgraded:
  kubeadm
1 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.
Need to get 9,732 kB of archives.
After this operation, 2,961 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubeadm amd64 1.26.1-00 [9,732 kB]
Fetched 9,732 kB in 1s (9,658 kB/s)   
(Reading database ... 109556 files and directories currently installed.)
Preparing to unpack .../kubeadm_1.26.1-00_amd64.deb ...
Unpacking kubeadm (1.26.1-00) over (1.25.4-00) ...
Setting up kubeadm (1.26.1-00) ...
Scanning processes...                                                                                                                                                     
Scanning candidates...                                                                                                                                                    
Scanning linux images...                                                                                                                                                  

Restarting services...
 systemctl restart containerd.service
Service restarts being deferred:
 systemctl restart networkd-dispatcher.service
 systemctl restart systemd-logind.service
 systemctl restart unattended-upgrades.service
 systemctl restart user@1000.service

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.
imksoo@k8sworker02:~$ 
imksoo@k8sworker02:~$ sudo kubeadm upgrade node 
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks
[preflight] Skipping prepull. Not a control plane node.
[upgrade] Skipping phase. Not a control plane node.
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.
imksoo@k8sworker02:~$ 
imksoo@k8sworker02:~$ sudo apt install kubelet=1.26.1-00 kubectl=1.26.1-00
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  bridge-utils dns-root-data dnsmasq-base pigz ubuntu-fan
Use 'sudo apt autoremove' to remove them.
The following held packages will be changed:
  kubectl kubelet
The following packages will be upgraded:
  kubectl kubelet
2 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
Need to get 30.5 MB of archives.
After this operation, 10.0 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubectl amd64 1.26.1-00 [10.1 MB]
Get:2 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubelet amd64 1.26.1-00 [20.5 MB]
Fetched 30.5 MB in 2s (14.9 MB/s)  
(Reading database ... 109556 files and directories currently installed.)
Preparing to unpack .../kubectl_1.26.1-00_amd64.deb ...
Unpacking kubectl (1.26.1-00) over (1.25.4-00) ...
Preparing to unpack .../kubelet_1.26.1-00_amd64.deb ...
Unpacking kubelet (1.26.1-00) over (1.25.4-00) ...
Setting up kubectl (1.26.1-00) ...
Setting up kubelet (1.26.1-00) ...
Scanning processes...                                                                                                                                                     
Scanning candidates...                                                                                                                                                    
Scanning linux images...                                                                                                                                                  

Restarting services...
 systemctl restart containerd.service
Service restarts being deferred:
 systemctl restart networkd-dispatcher.service
 systemctl restart systemd-logind.service
 systemctl restart unattended-upgrades.service
 systemctl restart user@1000.service

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.
imksoo@k8sworker02:~$ 
imksoo@k8sworker02:~$ sudo systemctl daemon-reload
imksoo@k8sworker02:~$ sudo systemctl restart kubelet
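
For a cluster that carries real workloads, the drain and uncordon steps I skipped would wrap the commands above. Here is a sketch that only prints the sequence for one worker. worker_upgrade_plan is a hypothetical helper name, and the --ignore-daemonsets flag on kubectl drain is my assumption about what a typical cluster needs (DaemonSet pods such as calico-node would otherwise block the drain).

```shell
#!/bin/sh
# Print the command sequence for upgrading one worker node, including the
# drain/uncordon steps. worker_upgrade_plan is a hypothetical helper; it only
# prints the commands so they can be reviewed and run by hand on the right host.
worker_upgrade_plan() {
  node="$1"
  version="$2"
  cat <<EOF
# on a machine with kubectl access to the cluster
kubectl drain $node --ignore-daemonsets
# on $node
sudo apt install kubeadm=$version
sudo kubeadm upgrade node
sudo apt install kubelet=$version kubectl=$version
sudo systemctl daemon-reload
sudo systemctl restart kubelet
# back on the machine with kubectl access
kubectl uncordon $node
EOF
}
```

For example, worker_upgrade_plan k8sworker02 1.26.1-00 prints the plan for the node upgraded above.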

All nodes have been upgraded

imksoo@k8smaster:~$ kubectl get node -o wide
NAME          STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k8smaster     Ready    control-plane   80d   v1.26.1   192.168.1.111   <none>        Ubuntu 22.04.1 LTS   5.15.0-56-generic   containerd://1.6.15
k8sworker01   Ready    <none>          80d   v1.26.1   192.168.1.112   <none>        Ubuntu 22.04.1 LTS   5.15.0-53-generic   containerd://1.6.15
k8sworker02   Ready    <none>          74d   v1.26.1   192.168.1.113   <none>        Ubuntu 22.04.1 LTS   5.15.0-53-generic   containerd://1.6.15
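
Eyeballing three nodes is easy, but the same check can be scripted. A minimal sketch, assuming the default kubectl get node column order shown above (NAME, STATUS, ROLES, AGE, VERSION). all_nodes_at is a hypothetical helper name.

```shell
#!/bin/sh
# Read `kubectl get node` output on stdin and succeed only if every node is
# exactly "Ready" and reports the expected kubelet version.
# all_nodes_at is a hypothetical helper name. A cordoned node shows up as
# "Ready,SchedulingDisabled" and is therefore reported as well.
all_nodes_at() {
  awk -v want="$1" '
    BEGIN { bad = 0 }
    NR == 1 { next }    # skip the header row
    $2 != "Ready" || $5 != want {
      bad = 1
      print "not ready or not upgraded: " $1
    }
    END { exit bad }
  '
}
```

Usage would look like kubectl get node | all_nodes_at v1.26.1, which exits non-zero and names the offending nodes if any of them lag behind.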
