
How Investigating Small Curiosities Led to Future Solutions

I'd like to share a story about how investigating a small question, instead of ignoring it, proved useful in an unexpected way. This article is mainly for people who want to grow as IT engineers but are unsure where to start, or who are looking for ways to gain knowledge other than sitting down and working through textbooks systematically.

This article is essentially a text version of the video below. If you've already seen the video, you can skip this as there isn't much new content.

https://youtu.be/g0OOIDM23oI?feature=shared

Background

As software developers, we run into questions large and small every day. Do you brush them all aside, thinking "that's just how it is"? This article shares my experience of investigating one such question, feeling satisfied once it was resolved, and then finding that the knowledge came in handy later in an unexpected way, which was even more satisfying. If this approach sounds like it would suit you, please give it a try. If you already investigate every question you encounter, you probably don't need to read this.

The structure of this article is as follows:

  1. A question arises: I became curious about how lsblk, a command that lists block devices, works.
  2. Resolving the question: I felt satisfied after uncovering how lsblk works.
  3. Helping in another matter: Months later, I ran into an issue with the behavior of another command that uses lsblk output, but I was able to handle it using the knowledge gained in step 2.

From here, I will explain what happened in steps 1 to 3 in order.

A Question Arises

As part of my job, I maintain a piece of software called Rook. Rook is an orchestrator for Ceph, a distributed storage system. Ceph creates data structures called OSDs on block devices on various nodes and bundles these OSDs into one massive storage pool. Rook automates this process, which consists of the following two steps:

  1. Execute the lsblk command on each node to list block devices existing on the node.
  2. Create OSDs on devices of a specific type supported by Ceph (described later) from the results of the lsblk command.

The output of the lsblk command looks like this:

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda      8:0    0   128G  0 disk
├─sda1   8:1    0     1G  0 part /boot/efi
└─sda2   8:2    0 126.9G  0 part /
sdb      8:16   0     6G  0 disk
sdc      8:32   0     6G  0 disk

The TYPE field indicates the device type. For example, sda, which represents the entire disk, is of type disk, and the partitions created within it, sda1 and sda2, are of type part. There are various other types, such as crypt for encrypted devices.
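
The two steps above can be sketched in a few lines of Python. This is only an illustration, not Rook's actual implementation: it assumes the JSON form of the listing (`lsblk --json -o NAME,TYPE`), and the `SUPPORTED_TYPES` set and the function names are hypothetical, since the types that are really accepted depend on Ceph and Rook.

```python
import json

# Hypothetical set of accepted types; the real list depends on Ceph and Rook.
SUPPORTED_TYPES = {"disk", "part"}


def flatten(devices):
    """Yield every device in lsblk's JSON tree, parents before children."""
    for dev in devices:
        yield dev
        yield from flatten(dev.get("children", []))


def candidate_devices(lsblk_json, supported_types=SUPPORTED_TYPES):
    """Return the names of devices whose TYPE is in supported_types."""
    tree = json.loads(lsblk_json)
    return [
        dev["name"]
        for dev in flatten(tree["blockdevices"])
        if dev["type"] in supported_types
    ]


# Sample shaped like `lsblk --json -o NAME,TYPE` for the listing above.
SAMPLE = """
{"blockdevices": [
  {"name": "sda", "type": "disk", "children": [
    {"name": "sda1", "type": "part"},
    {"name": "sda2", "type": "part"}]},
  {"name": "sdb", "type": "disk"},
  {"name": "sdc", "type": "disk"}
]}
"""

if __name__ == "__main__":
    print(candidate_devices(SAMPLE))
```

A real tool would also need to check things this sketch ignores, such as whether a device is already mounted or in use.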

After looking at lsblk output a number of times, I started wondering, "What information is this type based on?" I could have left the question alone, but I happened to have some spare time, and my intuition told me that lsblk probably wasn't doing anything too complicated, so I decided to investigate.

Resolving the Question

I knew the lsblk command was part of util-linux, so I decided to read the util-linux source code. At times like this, I'm grateful for open source, where the code is right there to read.

The investigation showed that the TYPE field is derived from information in sysfs and from device-mapper information such as a device's UUID. For more details on how I investigated, please see the following video or the materials referenced in it.

https://youtu.be/x_QSV1tM3qY?feature=shared

https://speakerdeck.com/sat/lsblknotypehuirudonosikumi

As you can see from the video and materials, it wasn't a major undertaking. I simply made an educated guess about where to look and read a few small functions. It was great to be able to satisfy my intellectual curiosity with so little effort.
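
To give a feel for how small the logic is, here is a rough Python sketch of the device-mapper part as I understand it from reading util-linux. It is a simplification under assumptions, not a port: the function name `dm_type_from_uuid` is hypothetical, the real code is C and reads the UUID from sysfs, and details may differ between util-linux versions.

```python
def dm_type_from_uuid(dm_uuid):
    """Guess the lsblk TYPE of a device-mapper device from its dm UUID.

    Assumption: the text before the first '-' names the owning subsystem
    (for example LVM or CRYPT), and lsblk reports it in lowercase.
    """
    if not dm_uuid:
        # No UUID at all: treat it as a generic device-mapper device.
        return "dm"
    prefix = dm_uuid.split("-", 1)[0]
    if prefix[:4].lower() == "part":
        # Partition mappings carry a number ("part1-..."); drop it.
        prefix = prefix[:4]
    return prefix.lower()


if __name__ == "__main__":
    for uuid in ("CRYPT-LUKS2-0123-name", "LVM-abcdef", "part1-mpath-xyz", ""):
        print(repr(uuid), "->", dm_type_from_uuid(uuid))
```

The point is that the TYPE of a device-mapper device is read from a string that whoever creates the device gets to choose.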

Helping in Another Matter

A few months later, at an event called Kernel/VM Hokuriku part 6, I gave a presentation introducing a feature to emulate I/O errors on arbitrary block devices. Below are the recordings of that presentation, split into two parts and uploaded to YouTube.

https://youtu.be/8Mbd31KHDR4

https://speakerdeck.com/sat/ozhang-hai-noemiyuresiyon-ji-cun-tagetutobian

https://youtu.be/nbwF9uaw-sQ

https://speakerdeck.com/sat/ozhang-hai-noemiyuresiyon-kanerumoziyuruzi-zuo-bian

Fortunately, the presentation itself was well received. However, I regretted that time constraints during preparation had kept me from including a practical example. So I declared, "If I can get it ready in time, I'll demo a practical example in the lightning talk (LT) session." My goal was to have a demonstration ready before the LT session started, showing that when an I/O error occurs on a Ceph OSD (disk), Ceph recovers the corrupted data and returns the correct value.

Then I hit a problem. To emulate I/O errors, I needed to create an OSD on a virtual block device created with the device mapper, but Ceph did not support creating OSDs on devices created by the device mapper.

At that moment, a memory from a few months earlier suddenly came back to me. Because I had read the code that determines the device type in lsblk, I knew that setting the device mapper's UUID to a specific value would make the TYPE field show whatever value I wanted.

https://youtu.be/D_pecRQXn0k

https://speakerdeck.com/sat/lsblkkomandonotypehuirudonozhi-wozi-you-nibian-geng
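
As a minimal sketch of the idea, the snippet below only builds a `dmsetup create` command line with a chosen UUID; it does not run anything. The device name, the table, the UUID suffix, and the `lvm` prefix are all hypothetical values picked for illustration, not the ones used in the demo, and the helper `dmsetup_create_argv` is my own name. It assumes that lsblk reports the lowercased UUID prefix as the TYPE.

```python
def dmsetup_create_argv(name, table, desired_type, suffix):
    """Build (but do not run) a dmsetup command that sets a custom UUID.

    Assumption: lsblk shows the part of the UUID before the first '-'
    as the TYPE of a device-mapper device.
    """
    uuid = f"{desired_type}-{suffix}"
    return ["dmsetup", "create", name, "--uuid", uuid, "--table", table]


if __name__ == "__main__":
    argv = dmsetup_create_argv(
        name="demo-osd",                       # hypothetical device name
        # Hypothetical table: map all of a 6 GiB /dev/sdb
        # (12582912 sectors of 512 bytes) with the linear target.
        table="0 12582912 linear /dev/sdb 0",
        desired_type="lvm",                    # illustration only
        suffix="demo0001",                     # hypothetical unique suffix
    )
    print(" ".join(argv))
```

Actually creating the device requires root privileges, and the table would name whichever target is used for error emulation rather than `linear`.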

Using this hack, I managed to create an OSD on a virtual block device, and the demo was successful.

https://youtu.be/uN_Gn-bfiSI

https://speakerdeck.com/sat/fen-san-sutorezicephnodetapo-huai-jian-zhi-xiu-fu-ji-neng-haben-dang-nidong-zuo-surunoka

This demo would have been impossible if I hadn't looked at the lsblk source code a few months earlier. I vividly remember how exhilarating it felt, thinking, "I never imagined that something I once investigated on a whim would turn out to be useful at a moment like this!"

Conclusion

As I've described, investigating a small question a little instead of dismissing it gave me satisfaction once, and then a second time when the knowledge helped with something else months later. I've had countless experiences like this. Since I'm not good at reading reference books to understand things systematically, I have mostly grown through this kind of practice-based approach. I hope the method introduced in this article will be especially helpful for people like me.

Lastly, two cautions about investigating questions in this way. First, do not try to resolve every single unknown. Time is finite, so if you try to investigate everything, you'll never finish what you're actually supposed to be doing. At work especially, it's best to finish what needs to be finished and investigate only when you have spare time. Second, don't become obsessed with growth. Constantly thinking only about growth can be overwhelming and is not good for your mental and physical health. If you get bored, or if the investigation turns out to be harder than expected, give yourself an escape route, such as quitting or putting it off. The questions aren't going anywhere.
