Setting up a Ceph external cluster on a kubeadm-built k8s cluster and connecting it to a Proxmox Ceph cluster
Overview
Make volumes from the Ceph cluster built on Proxmox usable from k8s, via an external cluster set up with Rook.
Environment
Proxmox: 8.3.0
k8s: 1.33
k8s install method: kubeadm
rook: 1.17.4
ceph-cluster (chart): 1.17.4
1. Install rook-ceph
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: rook-ceph
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://charts.rook.io/release
    chart: rook-ceph
    targetRevision: v1.17.4
  destination:
    server: "https://kubernetes.default.svc"
    namespace: rook-ceph
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
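Once the Application syncs, the operator should come up in the rook-ceph namespace. A quick sanity check (a sketch; the pod hash suffix will differ per cluster):

# The rook-ceph-operator pod should be Running
kubectl -n rook-ceph get pods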
2. Create a Ceph pool for k8s (on Proxmox)
pveceph pool create k8s-pv-pool --pg_autoscale_mode on
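To confirm the pool was created and the autoscaler picked it up, the plain Ceph CLI on the Proxmox node works:

# k8s-pv-pool should appear in the pool list
ceph osd pool ls detail
# pg_autoscale_mode should show "on" for the new pool
ceph osd pool autoscale-status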
3. Obtain the environment variables for connecting to Ceph
Run on Proxmox:
wget https://raw.githubusercontent.com/rook/rook/release-1.17/deploy/examples/create-external-cluster-resources.py
python3 create-external-cluster-resources.py --namespace rook-ceph-external --rbd-data-pool-name k8s-pv-pool --format bash --skip-monitoring-endpoint
Output:
root@pve1:~# python3 create-external-cluster-resources.py --namespace rook-ceph-external --rbd-data-pool-name k8s-pv-pool --format bash --skip-monitoring-endpoint
export ARGS="[Configurations]
namespace = rook-ceph-external
rgw-pool-prefix = default
format = bash
cephfs-filesystem-name = cephfs-1
cephfs-metadata-pool-name = cephfs-1_metadata
cephfs-data-pool-name = cephfs-1_data
rbd-data-pool-name = k8s-pv-pool
skip-monitoring-endpoint = True
"
export NAMESPACE=rook-ceph-external
export ROOK_EXTERNAL_FSID=c6fd9cdb-6e0b-4237-af84-01216dfad644
export ROOK_EXTERNAL_USERNAME=client.healthchecker
export ROOK_EXTERNAL_CEPH_MON_DATA=pve1=10.1.0.111:6789
export ROOK_EXTERNAL_USER_SECRET=AQBDiSloJQujIRAA0mSitw1px481+CXD+pB0vQ==
export CSI_RBD_NODE_SECRET=AQBDiSlolSkFIhAAk4ChKmYu3fGHWkiFz2kfsg==
export CSI_RBD_NODE_SECRET_NAME=csi-rbd-node
export CSI_RBD_PROVISIONER_SECRET=AQBDiSlo9j1jIhAAzuhzaoTc8NYghEJlWhYQNw==
export CSI_RBD_PROVISIONER_SECRET_NAME=csi-rbd-provisioner
export CEPHFS_POOL_NAME=cephfs-1_data
export CEPHFS_METADATA_POOL_NAME=cephfs-1_metadata
export CEPHFS_FS_NAME=cephfs-1
export CSI_CEPHFS_NODE_SECRET=AQBDiSlop/i7IhAAuq4ncBbunQ2sWpgQRZY84Q==
export CSI_CEPHFS_PROVISIONER_SECRET=AQBDiSloLoYKIxAAuSqfAFVH3PW9rcs+Q+OTPw==
export CSI_CEPHFS_NODE_SECRET_NAME=csi-cephfs-node
export CSI_CEPHFS_PROVISIONER_SECRET_NAME=csi-cephfs-provisioner
export RBD_POOL_NAME=k8s-pv-pool
export RGW_POOL_PREFIX=default
Run on the machine where you use kubectl:
# Set the environment variables from the output above
wget https://raw.githubusercontent.com/rook/rook/release-1.17/deploy/examples/import-external-cluster.sh
bash import-external-cluster.sh
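One way to set the variables is to paste the export block from the Proxmox output into a file and source it before running the script (env.sh is just a hypothetical file name):

# env.sh holds the export lines printed by create-external-cluster-resources.py
source env.sh
bash import-external-cluster.sh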
Output:
❯ bash import-external-cluster.sh
cluster namespace rook-ceph-external already exists
secret rook-ceph-mon already exists
configmap rook-ceph-mon-endpoints already exists
configmap external-cluster-user-command already exists, updating it
configmap/external-cluster-user-command patched
secret rook-csi-rbd-node already exists
secret csi-rbd-provisioner already exists
secret csi-cephfs-node already exists
secret csi-cephfs-provisioner already exists
storageclass ceph-rbd already exists
storageclass cephfs already exists
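The script creates the secrets, configmaps, and StorageClasses listed above. To double-check what landed in the cluster:

# Imported credentials and mon endpoints
kubectl -n rook-ceph-external get secret,configmap
# StorageClasses are cluster-scoped
kubectl get storageclass ceph-rbd cephfs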
4. Create the ceph-cluster
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: rook-ceph-external-cluster
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://charts.rook.io/release
    chart: rook-ceph-cluster
    targetRevision: v1.17.4
    helm:
      valuesObject:
        cephClusterSpec:
          external:
            enable: true
          crashCollector:
            disable: true
          healthCheck:
            daemonHealth:
              mon:
                disabled: false
                interval: 45s
        cephBlockPools: {}
        cephFileSystems: {}
        cephObjectStores: {}
  destination:
    server: "https://kubernetes.default.svc"
    namespace: rook-ceph-external
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
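If you are not using Argo CD, the same chart can be installed with Helm directly. A rough equivalent (values.yaml here is a hypothetical file holding the valuesObject contents above):

helm repo add rook-release https://charts.rook.io/release
helm install rook-ceph-external-cluster rook-release/rook-ceph-cluster \
  --namespace rook-ceph-external --create-namespace \
  --version v1.17.4 -f values.yaml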
The comment in values.yaml says that for an external cluster you should replace the entire cephClusterSpec section with the spec from cluster-external.yaml:
# This cluster spec example is for a converged cluster where all the Ceph daemons are running locally,
# as in the host-based example (cluster.yaml). For a different configuration such as a
# PVC-based cluster (cluster-on-pvc.yaml), external cluster (cluster-external.yaml),
# or stretch cluster (cluster-stretched.yaml), replace this entire `cephClusterSpec`
# with the specs from those examples.
If the cephcluster resource has been created and reports HEALTH_OK, you're done.
$ kubectl -n rook-ceph-external get cephclusters.ceph.rook.io
NAME                 DATADIRHOSTPATH   MONCOUNT   AGE   PHASE       MESSAGE                          HEALTH      EXTERNAL   FSID
rook-ceph-external   /var/lib/rook     3          39m   Connected   Cluster connected successfully   HEALTH_OK   true       c6fd9cdb-6e0b-4237-af84-01216dfad644
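If provisioning fails later, keep in mind that the CSI driver pods themselves run in the operator's namespace (rook-ceph here), not in rook-ceph-external:

# CSI provisioner/plugin pods live alongside the operator
kubectl -n rook-ceph get pods | grep csi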
5. Verify operation
Checking RBD
If the Pod reaches Running, it works.
---
# https://github.com/rook/rook/blob/release-1.17/deploy/examples/csi/rbd/pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-rbd
---
# https://github.com/rook/rook/blob/release-1.17/deploy/examples/csi/rbd/pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: csirbd-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: rbd-pvc
        readOnly: false
kubectl apply -f pod.yaml
$ kubectl get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
rbd-pvc   Bound    pvc-3e9500b5-d3cb-43ee-b6ae-3ad4b4fc15dd   1Gi        RWO            ceph-rbd       <unset>                 23s
$ kubectl get pod csirbd-demo-pod
NAME              READY   STATUS    RESTARTS   AGE
csirbd-demo-pod   1/1     Running   0          30s
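As an extra check, writing through the mount from inside the Pod should succeed:

# Write and read back a file on the RBD-backed volume
kubectl exec csirbd-demo-pod -- sh -c 'echo hello > /var/lib/www/html/test.txt && cat /var/lib/www/html/test.txt'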
Checking CephFS
If the Pod reaches Running, it works.
---
# https://github.com/rook/rook/blob/release-1.17/deploy/examples/csi/cephfs/pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
  # labels:
  #   group: snapshot-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: cephfs
---
# https://github.com/rook/rook/blob/release-1.17/deploy/examples/csi/cephfs/pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: csicephfs-demo-pod
spec:
  containers:
    - name: web-server
      image: nginx
      volumeMounts:
        - name: mypvc
          mountPath: /var/lib/www/html
  volumes:
    - name: mypvc
      persistentVolumeClaim:
        claimName: cephfs-pvc
        readOnly: false
kubectl apply -f pod_cephfs.yaml
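The same checks as for RBD apply:

# The PVC should be Bound and the Pod Running
kubectl get pvc cephfs-pvc
kubectl get pod csicephfs-demo-pod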
Troubleshooting
Mounting CephFS fails with "The cluster might be laggy, or you may not be authorized"
It turns out that running create-external-cluster-resources.py with --v2-port-enable causes this error. RBD works fine, but a Pod mounting a CephFS PV gets stuck in ContainerCreating. Rerunning the script without --v2-port-enable fixed it. Both rook and ceph-cluster were installed at 1.17.4.
I was stuck on this for about 10 hours.
Events:
  Type     Reason       Age                  From     Message
  ----     ------       ----                 ----     -------
  Warning  FailedMount  117s (x189 over 9h)  kubelet  (combined from similar events): MountVolume.MountDevice failed for volume "pvc-eebfac1c-6779-4fef-818f-94f7d71cd9b8" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph csi-cephfs-node@c6fd9cdb-6e0b-4237-af84-01216dfad644.cephfs-1=/volumes/csi/csi-vol-f08cf514-eb47-45a1-a163-51345b40c074/1371ce68-6ed8-4eac-9c8a-aef67b646ff2 /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.cephfs.csi.ceph.com/6baba07dbc4a445d497f1e21d685d895f954d1b5eebb84eeeb806aef871e348a/globalmount -o mon_addr=10.1.0.111:3300,secretfile=/tmp/csi/keys/keyfile-2667384440,_netdev] stderr: mount error: no mds (Metadata Server) is up. The cluster might be laggy, or you may not be authorized
The invocation that triggered the error:
wget https://raw.githubusercontent.com/rook/rook/release-1.17/deploy/examples/create-external-cluster-resources.py
python3 create-external-cluster-resources.py --namespace rook-ceph-external --rbd-data-pool-name k8s-pv-pool --format bash --skip-monitoring-endpoint --v2-port-enable
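Note the mon_addr=10.1.0.111:3300 in the mount error: that is the msgr2 port. To see which messenger endpoints the mons actually advertise, this can be run on the Proxmox side (each mon should list both a v2 (3300) and a v1 (6789) address):

# Dump the monmap, including per-mon v1/v2 addresses
ceph mon dump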