Translated by AI
Rook/Ceph Object Storage: Present and Future
Introduction
This article is the day 5 entry for the Rook and Friends, Cloud Native Storage Advent Calendar 2020. I will discuss the present and future of Rook/Ceph object storage, focusing on its interfaces. I'll start with a brief look at Ceph's object storage, then cover how object storage is used in Rook today and what the future holds.
Ceph Object Storage
Ceph supports object storage through a feature called RadosGW (RGW). It provides two types of interfaces: S3-compatible and Swift-compatible. At the core of Ceph lies its own object storage system, called RADOS. Because RGW acts as a gateway between RADOS and these two widely used interfaces, it is named the "RADOS" "Gateway."
Current Object Storage in Rook
Rook supports the S3-compatible interface of RGW. While Kubernetes does not provide native resources for accessing object storage as of v1.20, Rook allows for the creation and deletion of object storage and buckets without directly calling the S3 API by using the following Custom Resources (CRs):
- CephObjectStore CR: Corresponds to a single object storage instance.
- ObjectBucket CR (hereafter referred to as OB): Corresponds to a single bucket.
- ObjectBucketClaim CR (hereafter referred to as OBC): A user's request to use a bucket. An OB is allocated based on this request.
- CephObjectStoreUser CR: A user of the object store.
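To make the first of these resources concrete, here is a minimal sketch in Python that builds a CephObjectStore manifest, plus the StorageClass that an OBC later refers to, as plain dictionaries and prints them as JSON (which `kubectl apply -f -` accepts). The names `my-store` and `rook-ceph-bucket`, the pool sizes, and the gateway settings are illustrative assumptions, not values from this article; check the official documentation for the fields your Rook version supports.

```python
import json


def object_store_manifest(name="my-store", namespace="rook-ceph"):
    """Build a CephObjectStore manifest (all values are illustrative)."""
    return {
        "apiVersion": "ceph.rook.io/v1",
        "kind": "CephObjectStore",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumed pool layout: 3-way replication for metadata and data.
            "metadataPool": {"replicated": {"size": 3}},
            "dataPool": {"replicated": {"size": 3}},
            # One RGW instance listening on plain HTTP port 80.
            "gateway": {"port": 80, "instances": 1},
        },
    }


def bucket_storage_class(store, namespace="rook-ceph", name="rook-ceph-bucket"):
    """Build the StorageClass that ObjectBucketClaims will reference."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # The provisioner name is prefixed with the operator's namespace;
        # this sketch assumes the operator runs in `namespace`.
        "provisioner": f"{namespace}.ceph.rook.io/bucket",
        "reclaimPolicy": "Delete",
        "parameters": {
            "objectStoreName": store,
            "objectStoreNamespace": namespace,
        },
    }


if __name__ == "__main__":
    store = object_store_manifest()
    sc = bucket_storage_class(store["metadata"]["name"])
    for manifest in (store, sc):
        print(json.dumps(manifest, indent=2))
```

Saving the output of each `print` to a file and applying it with `kubectl apply -f` would be one way to use this sketch.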
A brief explanation of how to use them is as follows:
- Create a CephObjectStore CR. This creates the object store.
- Create an OBC. This automatically creates a bucket, its corresponding OB, a user, and a ConfigMap and Secret containing information to access the bucket.
- Import the aforementioned ConfigMap and Secret into a Pod using envFrom.
- Access the object storage from the Pod using the specified environment variables.
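The OBC and Pod steps above can be sketched as follows. This is a minimal illustration, not the official procedure: the resource names (`my-bucket-claim`, `s3-client`), the container image, and the StorageClass name `rook-ceph-bucket` are assumptions. The sketch relies on the ConfigMap and Secret being created with the same name as the OBC, and on the environment variable names `BUCKET_HOST`, `BUCKET_PORT`, `BUCKET_NAME`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY`; please confirm both against the official documentation for your version.

```python
import json
import os


def obc_manifest(name="my-bucket-claim", storage_class="rook-ceph-bucket"):
    """Build an ObjectBucketClaim manifest (names are illustrative)."""
    return {
        "apiVersion": "objectbucket.io/v1alpha1",
        "kind": "ObjectBucketClaim",
        "metadata": {"name": name},
        "spec": {
            # Prefix for the generated bucket name.
            "generateBucketName": "my-bucket",
            "storageClassName": storage_class,
        },
    }


def pod_manifest(obc_name, image="python:3.9"):
    """Build a Pod that imports the OBC's ConfigMap and Secret via envFrom."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "s3-client"},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                "command": ["sleep", "3600"],
                # Assumption: ConfigMap and Secret share the OBC's name.
                "envFrom": [
                    {"configMapRef": {"name": obc_name}},
                    {"secretRef": {"name": obc_name}},
                ],
            }],
        },
    }


def bucket_settings(env=None):
    """Collect, inside the Pod, the settings an S3 client needs."""
    env = os.environ if env is None else env
    host = env["BUCKET_HOST"]
    port = env.get("BUCKET_PORT", "80")
    return {
        "endpoint": f"http://{host}:{port}",
        "bucket": env["BUCKET_NAME"],
        "access_key": env["AWS_ACCESS_KEY_ID"],
        "secret_key": env["AWS_SECRET_ACCESS_KEY"],
    }


if __name__ == "__main__":
    obc = obc_manifest()
    print(json.dumps(obc, indent=2))
    print(json.dumps(pod_manifest(obc["metadata"]["name"]), indent=2))
```

Inside the Pod, the dictionary returned by `bucket_settings()` can be handed to whichever S3 client library you prefer, using the endpoint as a custom endpoint URL.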
If you need a user that isn't tied to a specific bucket, you can create a CephObjectStoreUser CR and Rook will create the user for you. From that point on, however, you must perform bucket operations yourself by calling the S3 API.
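As a rough sketch of what such a user resource looks like, the following Python function builds the manifest as a dictionary. In the Rook CRDs the kind is spelled `CephObjectStoreUser`; the user name, store name, and namespace used here are illustrative assumptions. Rook stores the generated S3 credentials in a Secret, whose exact name and keys you should look up in the official documentation for your version.

```python
import json


def object_store_user_manifest(user, store, namespace="rook-ceph",
                               display_name=None):
    """Build a CephObjectStoreUser manifest (values are illustrative)."""
    return {
        "apiVersion": "ceph.rook.io/v1",
        "kind": "CephObjectStoreUser",
        "metadata": {"name": user, "namespace": namespace},
        "spec": {
            # Name of the CephObjectStore this user belongs to.
            "store": store,
            # Human-readable name; falls back to the user name here.
            "displayName": display_name or user,
        },
    }


if __name__ == "__main__":
    print(json.dumps(object_store_user_manifest("my-user", "my-store"), indent=2))
```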
For those who want to actually try using object storage in Rook, Mr. Utsunomiya's article (Japanese) might be helpful. Please check the official documentation for detailed specifications.
lib-bucket-provisioner providing OB and OBC
Having read this far, those familiar with Kubernetes storage may have noticed that the roles of OB and OBC are very similar to PersistentVolume (PV) and PersistentVolumeClaim (PVC) used for accessing block devices or file systems from Pods. In fact, OB and OBC were born from the desire to access object storage in a Kubernetes-like manner, similar to PVs and PVCs.
OB and OBC are provided by a library called lib-bucket-provisioner, and Rook uses this library internally. The interface that lib-bucket-provisioner provides to users of Rook/Ceph and others is also very similar to CSI[1].
While lib-bucket-provisioner is very convenient, it faced many issues, such as the difficulty of keeping up with Kubernetes version updates because it is not a Kubernetes standard. These issues were also a headache for its users, such as Rook/Ceph. For details on specifically what was difficult and why, please refer to this article.
Standardization of Object Storage Access in Kubernetes
As mentioned earlier, there is no standard way to access object storage in Kubernetes. This was a disadvantage for various stakeholders, including users and storage vendors, so last year a KEP[2] was created to solve the problem. Initially, its interface was based on lib-bucket-provisioner.
After the KEP was created, many engineers debated it heatedly over a long period. The KEP's original pull request, PR 1383, accumulated so many comments that it frequently triggered the dreaded "unicorn" error page, which led to the creation of a new PR.
Finally, PR 2100 was merged in October this year, with the goal of an alpha release in v1.21. However, the resulting interface, called the Container Object Storage Interface (COSI), is completely different from the original proposal. As the name makes clear, it is strongly inspired by CSI[3].
Now that COSI is set to become the standard, you might wonder what will happen to lib-bucket-provisioner. In short, the library has already been marked as deprecated. Here is a quote from the README in its official repository:
DEPRECATION NOTICE:
There is a Kubernetes Enhancement Proposal under review which will significantly
change this design and interfaces. This repo is not longer active and should not
be used.
It is best to avoid lib-bucket-provisioner in new projects from now on.
Future of Object Storage in Rook
What will happen to object storage in Rook, which uses lib-bucket-provisioner internally? I don't believe Rook users need to panic or do anything right away, for the following reasons.
First, there is currently not a single line of COSI code, so neither Rook users nor Rook developers can do anything yet. Even if the alpha release lands in v1.21 as planned, that is next year, and writing and stabilizing a COSI driver for Rook/Ceph will take even longer. Second, Rook supports very old versions of Kubernetes (Rook v1.5 supports Kubernetes v1.11 and later), so it would be practically impossible to support only a COSI driver, which can be used in v1.21 and later at the earliest.
As a rough estimate, I think Rook users should keep the following in mind:
- For about the next year, there will be no COSI code in Kubernetes, so no one can do anything.
- In about 1 to 2 years, an experimental COSI driver for Rook will be released.
- In 2 to 3 years, those who are able to migrate should do so.
- In about 5 years, everyone should migrate unless there is a very good reason not to.
For reference, I based this estimate on the adoption rate of CSI and CSI drivers in file/block storage, and the corresponding support status for Ceph CSI drivers in Rook.
As a side note, although the README currently states that lib-bucket-provisioner "should not be used," the repository still accepts PRs, so it likely won't disappear immediately. Eventually it may stop accepting fixes, and large users like Rook might fork it and maintain it themselves, but nothing can be said for sure at this stage.
Conclusion
In this article, I have written about the current state and future of Rook's object storage based on the knowledge I gained from Rook development and information gathering around Kubernetes storage. Since the world of Kubernetes moves fast, the predictions written here might turn out to be completely wrong, but I would be happy if you could use them as one point of reference.
[1] Short for Container Storage Interface. A CNCF standard that defines an interface for accessing block/file storage in containerized environments.
[2] Short for Kubernetes Enhancement Proposal. A proposal for adding features to Kubernetes.
[3] For now, COSI is independent of CNCF and is unique to Kubernetes.