Building a Kubernetes cluster at home!
Plan: build a home Kubernetes cluster and have fun with it
Reference: https://developers.cyberagent.co.jp/blog/archives/27443/
Hardware: Raspberry Pi 4 (8GB)
OS: Ubuntu 20.04
Kubernetes: 1.20
(1.21 is out too, but I haven't caught up with its features yet)
Container runtime: containerd
Things I want to install (to be extended as I go):
- Prometheus
- node-exporter
- kube-state-metrics
- Falco
- speedtest cli
First up, the cloud-init configuration to embed in the OS image.
The goal for user-data is some basic OS setup, enough that I can SSH in and work.
#cloud-config

# hostname
hostname: MY_HOSTNAME

# Japan
timezone: "Asia/Tokyo"
locale: "ja_JP.UTF-8"

# Never allow SSH with a password
ssh_pwauth: false

users:
  - name: USER
    gecos: I am USER
    primary_group: USER
    groups: [adm, audio, cdrom, dialout, dip, floppy, lxd, netdev, plugdev, sudo, video]
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    lock_passwd: true
    ssh_import_id:
      - gh:RyuSA

# Update packages
package_update: true
package_upgrade: true

mounts:
  - [ tmpfs, /tmp, tmpfs, "defaults,size=256m", "0", "0" ]
  - [ tmpfs, /var/tmp, tmpfs, "defaults,size=256m", "0", "0" ]

packages:
  - iptables
  - arptables
  - ebtables
  - apt-transport-https
  - ca-certificates
  - curl
  - software-properties-common

write_files:
  # install /etc/hosts
  - content: |
      127.0.0.1 localhost
      192.168.1.101 seagull01
      192.168.1.102 seagull02
      192.168.1.103 seagull03
      # The following lines are desirable for IPv6 capable hosts
      ::1 ip6-localhost ip6-loopback
      fe00::0 ip6-localnet
      ff00::0 ip6-mcastprefix
      ff02::1 ip6-allnodes
      ff02::2 ip6-allrouters
      ff02::3 ip6-allhosts
    owner: root:root
    path: /etc/hosts
    permissions: '0644'
  # Let iptables see traffic crossing the bridge
  - content: |
      net.bridge.bridge-nf-call-ip6tables = 1
      net.bridge.bridge-nf-call-iptables = 1
    owner: root:root
    path: /etc/sysctl.d/k8s.conf
    permissions: '0644'
  # Kernel modules containerd needs
  - content: |
      overlay
      br_netfilter
    owner: root:root
    path: /etc/modules-load.d/containerd.conf
    permissions: '0644'
  # Required kernel parameters
  - content: |
      net.bridge.bridge-nf-call-iptables = 1
      net.ipv4.ip_forward = 1
      net.bridge.bridge-nf-call-ip6tables = 1
    owner: root:root
    path: /etc/sysctl.d/99-kubernetes-cri.conf
    permissions: '0644'

runcmd:
  - sudo swapoff -a
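To get this user-data into the image, one option (a sketch, not the only way; the mount path here is an assumption) is to overwrite the `user-data` file on the flashed card's boot partition, which the Ubuntu Raspberry Pi images label `system-boot`:

```shell
# Assumptions: the SD card has already been flashed with the Ubuntu 20.04
# arm64 image, and /mnt/system-boot is an arbitrary mount point.
sudo mkdir -p /mnt/system-boot
sudo mount /dev/disk/by-label/system-boot /mnt/system-boot

# Replace the stock user-data with the cloud-config above
sudo cp user-data /mnt/system-boot/user-data

sudo umount /mnt/system-boot
```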
Pin down the IP address while I'm at it:
version: 2
ethernets:
  eth0:
    dhcp4: false
    dhcp6: false
    addresses:
      - 192.168.1.XXX/24
    gateway4: 192.168.1.1
    nameservers:
      addresses:
        - 192.168.1.1
    optional: true
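This snippet can ride along as cloud-init's network-config, but if you want to tweak it on a node that's already running, netplan can apply it directly. A sketch, assuming the file name:

```shell
# Assumption: the snippet above is saved as /etc/netplan/99-static.yaml
sudo netplan try    # applies the config and rolls back unless you confirm
sudo netplan apply  # make it stick
```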
Append cgroup_memory=1 and cgroup_enable=memory to cmdline.txt:
net.ifnames=0 dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc cgroup_memory=1 cgroup_enable=memory
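After rebooting, it's worth a quick sanity check that the memory cgroup actually came up, since kubelet refuses to run without it:

```shell
# The "memory" row's "enabled" column should read 1 once the
# cmdline.txt change has taken effect.
grep memory /proc/cgroups
```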
Next, install containerd. Watch out: the official Kubernetes docs have a trap here (the repository arch).
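Concretely, the docs' apt repository line hardcodes amd64, which silently fails on a Pi. A sketch of the install, assuming Docker's apt repository for Ubuntu (as the Kubernetes docs of this era used):

```shell
# The trap: the docs write [arch=amd64] in the repo line;
# on a Raspberry Pi it has to be arm64.
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
  "deb [arch=arm64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y containerd.io
```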
Kubernetes has started running stably. The kubeadm configuration:
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
kubernetesVersion: v1.21.0
networking:
  podSubnet: "10.0.0.0/16"
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
authentication:
  anonymous:
    enabled: false
rotateCertificates: true
featureGates:
  ServiceTopology: true
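kubeadm consumes this two-document config via a flag; a sketch, with the file name as an assumption:

```shell
# Assumption: the ClusterConfiguration + KubeletConfiguration above
# are saved together as kubeadm-config.yaml
sudo kubeadm init --config kubeadm-config.yaml
```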
After installing kubeadm and kubelet, reboot once to start from a clean state, and only then run kubeadm init; that makes it work. Skip the reboot and the containers take forever to come up.
It took about an hour to stabilize. kube-apiserver kept dying on me, so I watched the Pod states with crictl while waiting:
root@seagull01:~# crictl --runtime-endpoint /run/containerd/containerd.sock ps
CONTAINER ID    IMAGE            CREATED              STATE     NAME                      ATTEMPT   POD ID
9a61341163d5b   ed0f9e1a5baac    49 seconds ago       Running   kube-apiserver            8         04e7ea11f5112
021184558aa66   7356c5b8dc57f    About a minute ago   Running   kube-controller-manager   12        f65661760e63f
e0084228e9cb5   e7b7605bdcc69    About a minute ago   Running   kube-scheduler            10        aa32552c29d1b
e7f0d385db938   05b738aa1bc63    7 minutes ago        Running   etcd                      3         ccb20a1d51b71
7aae901c35411   277e064e431e5    41 minutes ago       Running   weave                     5         c1da1a6428703
3274acb604924   c88719b2e8ccc    About an hour ago    Running   weave-npc                 0         c1da1a6428703
c85be02b11513   d37325a6ff9ba    About an hour ago    Running   kube-proxy                0         0caab09079265
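Typing the runtime endpoint every time gets old fast; crictl reads /etc/crictl.yaml, so the endpoint can be persisted once:

```shell
# Persist the endpoint so --runtime-endpoint isn't needed on every call
cat <<'EOF' | sudo tee /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
EOF
crictl ps
```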
I want to run Prometheus to check on the cluster's health
...so I need a PV
...so I need an NFS server
...so I'll deploy an NFS server
I'll deploy it using this repository, which can handle both deploying and provisioning NFS.
First, the container image: it looks like arm64 images have to be built by hand.
Following this, make a small tweak to release-tools/build.make:
BUILD_PLATFORMS=linux arm64
Then build in that state:
~/develop/github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner @c5a1f60c* 2m 7s 11:42:36
❯ make container
./release-tools/verify-go-version.sh "go"
======================================================
WARNING
This projects is tested with Go v1.15.
Your current Go version is v1.16.
This may or may not be close enough.
In particular test-gofmt and test-vendor
are known to be sensitive to the version of
Go.
======================================================
mkdir -p bin
echo 'linux arm64' | tr ';' '\n' | while read -r os arch suffix; do \
if ! (set -x; CGO_ENABLED=0 GOOS="$os" GOARCH="$arch" go build -a -ldflags ' -X main.version=v3.0.0-0-gc5a1f60-dirty -extldflags "-static"' -o "./bin/nfs-provisioner$suffix" ./cmd/nfs-provisioner); then \
echo "Building nfs-provisioner for GOOS=$os GOARCH=$arch failed, see error(s) above."; \
exit 1; \
fi; \
done
+ CGO_ENABLED=0
+ GOOS=linux
+ GOARCH=arm64
+ go build -a -ldflags ' -X main.version=v3.0.0-0-gc5a1f60-dirty -extldflags "-static"' -o ./bin/nfs-provisioner ./cmd/nfs-provisioner
docker build -t nfs-provisioner:latest -f Dockerfile --label revision=v3.0.0-0-gc5a1f60-dirty .
[+] Building 1.7s (21/21) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 37B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for registry.fedoraproject.org/fedora-minimal:30 0.6s
=> [internal] load metadata for docker.io/library/fedora:30 1.0s
=> [internal] load build context 0.5s
=> => transferring context: 36.45MB 0.5s
=> [build 1/6] FROM docker.io/library/fedora:30@sha256:3a0c8c86d8ac2d1bbcfd08d40d3b757337f7916fb14f40efcb1d1137a4edef45 0.0s
=> [run 1/9] FROM registry.fedoraproject.org/fedora-minimal:30@sha256:5093ce3d2a3b37888f85579a4f89ad5c17bdfd1eab9489203bb7af2ac1455394 0.0s
=> CACHED [run 2/9] RUN microdnf install -y libblkid userspace-rcu dbus-x11 rpcbind hostname nfs-utils xfsprogs jemalloc libnfsidmap && microdnf clean all 0.0s
=> CACHED [run 3/9] RUN mkdir -p /var/run/dbus && mkdir -p /export 0.0s
=> CACHED [run 4/9] RUN echo /usr/local/lib64 > /etc/ld.so.conf.d/local_libs.conf 0.0s
=> CACHED [run 5/9] RUN sed -i s/systemd// /etc/nsswitch.conf 0.0s
=> CACHED [build 2/6] RUN dnf install -y tar gcc cmake-3.14.2-1.fc30 autoconf libtool bison flex make gcc-c++ krb5-devel dbus-devel jemalloc-devel libnfsidmap-devel libnsl2-devel 0.0s
=> CACHED [build 3/6] RUN curl -L https://github.com/nfs-ganesha/nfs-ganesha/archive/V2.8.2.tar.gz | tar zx && curl -L https://github.com/nfs-ganesha/ntirpc/archive/v1.8.0.tar 0.0s
=> CACHED [build 4/6] WORKDIR /nfs-ganesha-2.8.2 0.0s
=> CACHED [build 5/6] RUN mkdir -p /usr/local && cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_CONFIG=vfs_only -DCMAKE_INSTALL_PREFIX=/usr/local src/ && sed -i 's|@SYSSTATEDIR 0.0s
=> CACHED [build 6/6] RUN mkdir -p /ganesha-extra && mkdir -p /ganesha-extra/etc/dbus-1/system.d && cp src/scripts/ganeshactl/org.ganesha.nfsd.conf /ganesha-extra/etc/dbu 0.0s
=> CACHED [run 6/9] COPY --from=build /usr/local /usr/local/ 0.0s
=> CACHED [run 7/9] COPY --from=build /ganesha-extra / 0.0s
=> CACHED [run 8/9] COPY bin/nfs-provisioner /nfs-provisioner 0.0s
=> CACHED [run 9/9] RUN ldconfig 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:18027ca9946b82319dd9c7c83e5c8e8e5bcf3f413151236d606023f30e11f15f 0.0s
=> => naming to docker.io/library/nfs-provisioner:latest 0.0s
After that, just cut a tag and push.
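The tag referenced later by the StatefulSet is v3.0.0-0.1, so the push looks roughly like this (the registry account is mine; substitute your own):

```shell
# Tag the locally built image and push it to Docker Hub; the repository
# name matches the image the StatefulSet references.
docker tag nfs-provisioner:latest docker.io/ryusa/nfs-provisioner:v3.0.0-0.1
docker push docker.io/ryusa/nfs-provisioner:v3.0.0-0.1
```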
Now start up everything needed to run NFS:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-provisioner
  namespace: nfs
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    namespace: nfs
roleRef:
  kind: ClusterRole
  name: nfs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    namespace: nfs
roleRef:
  kind: Role
  name: leader-locking-nfs-provisioner
  apiGroup: rbac.authorization.k8s.io
---
kind: Service
apiVersion: v1
metadata:
  name: nfs-provisioner
  namespace: nfs
  labels:
    app: nfs-provisioner
spec:
  ports:
    - name: nfs
      port: 2049
    - name: nfs-udp
      port: 2049
      protocol: UDP
    - name: nlockmgr
      port: 32803
    - name: nlockmgr-udp
      port: 32803
      protocol: UDP
    - name: mountd
      port: 20048
    - name: mountd-udp
      port: 20048
      protocol: UDP
    - name: rquotad
      port: 875
    - name: rquotad-udp
      port: 875
      protocol: UDP
    - name: rpcbind
      port: 111
    - name: rpcbind-udp
      port: 111
      protocol: UDP
    - name: statd
      port: 662
    - name: statd-udp
      port: 662
      protocol: UDP
  selector:
    app: nfs-provisioner
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: nfs-provisioner
  namespace: nfs
  labels:
    app: nfs-provisioner
spec:
  selector:
    matchLabels:
      app: nfs-provisioner
  serviceName: "nfs-provisioner"
  replicas: 1
  template:
    metadata:
      labels:
        app: nfs-provisioner
    spec:
      serviceAccount: nfs-provisioner
      terminationGracePeriodSeconds: 10
      containers:
        - name: nfs-provisioner
          image: docker.io/ryusa/nfs-provisioner:v3.0.0-0.1
          ports:
            - name: nfs
              containerPort: 2049
            - name: nfs-udp
              containerPort: 2049
              protocol: UDP
            - name: nlockmgr
              containerPort: 32803
            - name: nlockmgr-udp
              containerPort: 32803
              protocol: UDP
            - name: mountd
              containerPort: 20048
            - name: mountd-udp
              containerPort: 20048
              protocol: UDP
            - name: rquotad
              containerPort: 875
            - name: rquotad-udp
              containerPort: 875
              protocol: UDP
            - name: rpcbind
              containerPort: 111
            - name: rpcbind-udp
              containerPort: 111
              protocol: UDP
            - name: statd
              containerPort: 662
            - name: statd-udp
              containerPort: 662
              protocol: UDP
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
            requests:
              cpu: 100m
              memory: 128Mi
          securityContext:
            capabilities:
              add:
                - DAC_READ_SEARCH
                - SYS_RESOURCE
          args:
            - "-provisioner=example.com/nfs"
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: SERVICE_NAME
              value: nfs-provisioner
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          imagePullPolicy: "IfNotPresent"
          volumeMounts:
            - name: export-volume
              mountPath: /export
      volumes:
        - name: export-volume
          hostPath:
            path: /data/etc/export
      tolerations:
        - key: "node-role.kubernetes.io/master"
          operator: "Exists"
          effect: "NoSchedule"
      nodeSelector:
        external-disk: data
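Rolling it all out: the nfs namespace has to exist first, and the nodeSelector needs a node carrying the external-disk=data label. A sketch; the manifest file names and NODE placeholder are assumptions:

```shell
# Assumptions: the manifests above are saved as nfs-rbac.yaml and
# nfs-provisioner.yaml; NODE is the node with the external disk attached.
kubectl create namespace nfs
kubectl label node NODE external-disk=data
kubectl apply -n nfs -f nfs-rbac.yaml -f nfs-provisioner.yaml
kubectl -n nfs get pods -w
```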
To cut straight to the conclusion: it didn't work out.

- The control plane is unstable
  Maybe it's the Raspberry Pi hitting its limits, but the control-plane components periodically fall over.
  Having to SSH into a node and run crictl every time kube-apiserver stops responding is a huge pain.
- NFS doesn't run
  With the control plane this unstable, running an NFS server on top of the Pis is just not a good idea...
- Everything is far too slow
  The control plane's load average is constantly above 7.0, and even when the components are healthy, kube-apiserver's responses are painfully slow.

So I'm going to rebuild: install Ubuntu on a Dell laptop that was lying around the house and run that as the control plane instead.