
Official Ceph Management Tool: cephadm


Introduction

This article is a late entry for the 10th day of the Rook and Friends, Cloud Native Storage Advent Calendar 2020.

In this article, I will introduce cephadm, the official management tool for Ceph. For the background on how cephadm came to be, please refer to another article. My main goal here is to demonstrate how incredibly simple it is to deploy a Ceph cluster using cephadm.

To be honest, anyone can do what is written here just by following the official documentation. Still, please bear with me: what I present is a way to create a minimal Ceph cluster with even fewer steps than the official docs require.

Environment

  • Software
    • OS: Ubuntu 18.04/amd64
    • ceph-common package
  • Hardware
    • Empty disk for OSD: /dev/sdb (5GB)

Installing cephadm

For simplicity, all subsequent commands are executed with root privileges.

Just run the following commands to complete the installation.

# curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
# chmod +x cephadm

After this, create the directory where the Ceph cluster configuration files will be stored.

# mkdir -p /etc/ceph

Creating a Ceph Cluster with Only One MON and MGR

Run the following command to complete the setup. Please replace <ipaddr> with the IP address of your local host.

# ./cephadm bootstrap --mon-ip <ipaddr>
...
Bootstrap complete.
# 

Installation finishes in just 2 or 3 minutes, which is amazing.

You can run ceph commands via the shell subcommand of cephadm. Let's check the cluster status here.
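Typing `./cephadm shell -- ceph` in front of every command gets tedious. A small wrapper function (my own convenience, not part of cephadm itself) lets the rest of the examples read like plain `ceph` commands:

```shell
# Convenience wrapper (not part of cephadm): run "ceph" inside the
# container started by "cephadm shell". Assumes ./cephadm sits in the
# current directory, as in this article.
ceph() {
    ./cephadm shell -- ceph "$@"
}

# With this defined, "ceph -s" is equivalent to "./cephadm shell -- ceph -s".
```

I will keep writing the full form below so each command works on its own, but the wrapper is handy for interactive use.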

# ./cephadm shell -- ceph -s
Inferring fsid 49d0c54a-4b8e-11eb-959d-00155d0a6b00
Inferring config /var/lib/ceph/49d0c54a-4b8e-11eb-959d-00155d0a6b00/mon.ubuntu1804/config
Using recent ceph image ceph/ceph-amd64:v15.2.8-20201217
  cluster:
    id:     49d0c54a-4b8e-11eb-959d-00155d0a6b00
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum ubuntu1804 (age 61s)
    mgr: ubuntu1804.tihjed(active, since 23s)
    osd: 0 osds: 0 up, 0 in
...

It seems to be working correctly.

You can check the daemons created by cephadm using the cephadm ls command.

# ./cephadm ls
[
    {
        "style": "cephadm:v1",
        "name": "mgr.ubuntu1804.tihjed",
        "fsid": "49d0c54a-4b8e-11eb-959d-00155d0a6b00",
        "systemd_unit": "ceph-49d0c54a-4b8e-11eb-959d-00155d0a6b00@mgr.ubuntu1804.tihjed",
        "enabled": true,
        "state": "running",
        "container_id": "601ff097ade13c7a3cc1814ac2506ddbdd1fd05d59f3e821e2c206a749ed156f",
        "container_image_name": "docker.io/ceph/ceph:v15",
        "container_image_id": "5553b0cb212ca2aa220d33ba39d9c602c8412ce6c5febc57ef9cdc9c5844b185",
        "version": "15.2.8",
...
    },
    {
        "style": "cephadm:v1",
        "name": "prometheus.ubuntu1804",
 ...
    },
    {
        "style": "cephadm:v1",
        "name": "mon.ubuntu1804",
...
]

You can see that various daemons such as mon, mgr, and prometheus are created by default. The container_image_name represents the name of the container image for each daemon. These use the official Ceph container images.

Creating OSDs

To create OSDs, first list the devices available for use as OSDs from the local host.

# ./cephadm shell -- ceph orch device ls
Inferring fsid 49d0c54a-4b8e-11eb-959d-00155d0a6b00
Inferring config /var/lib/ceph/49d0c54a-4b8e-11eb-959d-00155d0a6b00/mon.ubuntu1804/config
Using recent ceph image ceph/ceph-amd64:v15.2.8-20201217
Hostname    Path      Type  Serial                            Size   Health   Ident  Fault  Available
ubuntu1804  /dev/sdb  hdd   600224803b3ed0b6e86337e91de598f0  10.7G  Unknown  N/A    N/A    Yes
ubuntu1804  /dev/sdc  hdd   600224800f81ac0c3a79fde2b585fb72  64.4G  Unknown  N/A    N/A    Yes

You can create OSDs on devices where the Available field is Yes.
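As an aside, this listing is easy to script against. Here is a sketch that filters for usable devices with awk; the sample output above is pasted into a heredoc so the pipeline is self-contained, but normally you would pipe the live `ceph orch device ls` output instead:

```shell
# Print "host:device" for every device whose Available column is Yes.
# The heredoc below is the sample listing from this article, standing in
# for live "ceph orch device ls" output.
awk 'NR > 1 && $NF == "Yes" { print $1 ":" $2 }' <<'EOF'
Hostname    Path      Type  Serial                            Size   Health   Ident  Fault  Available
ubuntu1804  /dev/sdb  hdd   600224803b3ed0b6e86337e91de598f0  10.7G  Unknown  N/A    N/A    Yes
ubuntu1804  /dev/sdc  hdd   600224800f81ac0c3a79fde2b585fb72  64.4G  Unknown  N/A    N/A    Yes
EOF
```

Each printed `host:device` pair is in exactly the form that the OSD-creation command below expects.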

You can then add an OSD by running ceph orch daemon add osd <hostname>:<device-name>. Below is an example of adding /dev/sdb on the local host (ubuntu1804).

# ./cephadm shell -- ceph orch daemon add osd ubuntu1804:/dev/sdb
Inferring fsid 49d0c54a-4b8e-11eb-959d-00155d0a6b00
Inferring config /var/lib/ceph/49d0c54a-4b8e-11eb-959d-00155d0a6b00/mon.ubuntu1804/config
Using recent ceph image ceph/ceph:v15
Created osd(s) 0 on host 'ubuntu1804'

Verify that the OSD has been created.

# ./cephadm shell -- ceph osd tree
Inferring fsid 49d0c54a-4b8e-11eb-959d-00155d0a6b00
Inferring config /var/lib/ceph/49d0c54a-4b8e-11eb-959d-00155d0a6b00/mon.ubuntu1804/config
Using recent ceph image ceph/ceph-amd64:v15.2.8-20201217
ID  CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
-1         0.00980  root default
-3         0.00980      host ubuntu1804
 0    hdd  0.00980          osd.0            up   1.00000  1.00000

Success! From here, you can manage and use the Ceph cluster as usual using the ceph command.

Also, confirm that cephadm recognizes the creation of the OSD daemon.

# ./cephadm ls
[
...
    {
        "style": "cephadm:v1",
        "name": "osd.0",
...
    },
...

There it is.

Removing OSDs

Removing an OSD requires three stages:

  1. Deleting the OSD daemon
  2. Removing the OSD from the CRUSH map
  3. Deleting the OSD data

You can delete the OSD daemon using the cephadm rm-daemon command.

# ./cephadm rm-daemon --force --name osd.0 --fsid 49d0c54a-4b8e-11eb-959d-00155d0a6b00

To remove it from the CRUSH map, use the ceph osd down, ceph osd out, and ceph osd purge commands.

# ./cephadm shell -- ceph osd down osd.0
...
# ./cephadm shell -- ceph osd out osd.0
...
# ./cephadm shell -- ceph osd purge --force osd.0
...

This removes the OSD from the CRUSH map.

# ./cephadm shell -- ceph osd tree
Inferring fsid 49d0c54a-4b8e-11eb-959d-00155d0a6b00
Inferring config /var/lib/ceph/49d0c54a-4b8e-11eb-959d-00155d0a6b00/mon.ubuntu1804/config
Using recent ceph image ceph/ceph:v15
ID  CLASS  WEIGHT  TYPE NAME            STATUS  REWEIGHT  PRI-AFF
-1              0  root default
-3              0      host ubuntu1804

OSD data can be deleted using the ceph orch device zap command.

# ./cephadm shell -- ceph orch device zap --force ubuntu1804 /dev/sdb
...
/usr/bin/docker:stderr --> Zapping successful for: <Raw Device: /dev/sdb>
...

After this, /dev/sdb becomes reusable.
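Putting it together, the three removal stages can be collected into one script. The FSID, OSD id, host name, and device below are the ones from this walkthrough and will differ on your cluster; the `[ -x ./cephadm ]` guard simply skips everything where cephadm is not present:

```shell
# Removal of osd.0, sketched as one script. All identifiers are from this
# article's walkthrough; substitute your own FSID, OSD id, host, and device.
FSID=49d0c54a-4b8e-11eb-959d-00155d0a6b00
if [ -x ./cephadm ]; then
    # 1. delete the OSD daemon managed by cephadm
    ./cephadm rm-daemon --force --name osd.0 --fsid "$FSID"
    # 2. remove the OSD from the CRUSH map
    ./cephadm shell -- ceph osd down osd.0
    ./cephadm shell -- ceph osd out osd.0
    ./cephadm shell -- ceph osd purge --force osd.0
    # 3. wipe the data so /dev/sdb can be reused
    ./cephadm shell -- ceph orch device zap --force ubuntu1804 /dev/sdb
fi
```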

Removing the Cluster

To remove the cluster, use the cephadm rm-cluster command.

# ./cephadm rm-cluster --force --fsid 49d0c54a-4b8e-11eb-959d-00155d0a6b00

Important Notes

Since cephadm is a very young tool, there is a significant possibility that the above commands may stop working in future versions. The official documentation states the following:

Cephadm is a new feature in the Octopus release and has seen limited use in production and at scale. We would like users to try cephadm, especially for new clusters, but please be aware that some functionality is still rough around the edges. We expect fairly frequent updates and improvements over the first several bug fix releases of Octopus.

For more details on the stability of cephadm, please see here.

Conclusion

I usually use Rook for managing Ceph clusters, but when a bug I discover in Rook seems to be rooted in Ceph itself, I often reproduce it on cephadm before reporting the issue. It has been incredibly useful and wonderful to work with.
