Reflections on Reading the Linux Kernel for 100 Hours

I've Spent About 100 Hours Reading the Linux Kernel

Over the past two weeks, I've been writing articles about visualizing the Linux kernel.

https://zenn.dev/coffeecupjp/articles/07f01040a4c7a4

https://linux.tokyo

I left my job in mid-October, primarily so I could focus entirely on reading the Linux kernel. Judging by my OpenAI usage statistics, I have finally reached about 100 hours of reading.

On days when I spent $3.79, I was reading for over 10 hours. Since I have spent $37.12 in total, I estimate that I have been reading for roughly 100 hours.

You might wonder why OpenAI usage is a measure of code reading. I read with my self-made VSCode extension, "Linux-Reader," which analyzes each function in the code through OpenAI, so I estimated my reading time from that usage.
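
The estimate above is a simple proportion. Here is a minimal sketch of the arithmetic, assuming the cost scales linearly with reading time (an assumption, since only one reference day is given) and treating "over 10 hours" as exactly 10. The variable names are only for illustration.

```python
# Figures taken from the text above.
cost_on_reference_day = 3.79   # USD spent on a day with 10+ hours of reading
hours_on_reference_day = 10    # "over 10 hours", so this is a lower bound
total_cost = 37.12             # USD spent in total

# Assumption: API cost grows linearly with reading time.
hours_per_dollar = hours_on_reference_day / cost_on_reference_day
estimated_hours = total_cost * hours_per_dollar

print(f"Estimated reading time: about {estimated_hours:.0f} hours")
```

This comes out just under 100 hours. Because the reference day was more than 10 hours, the real figure is probably a little higher.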

https://marketplace.visualstudio.com/items?itemName=coffeecupjapan.linux-reader&ssr=false

As I might have mentioned before, this VSCode extension allows me to save traces of my reading history, which is how linux.tokyo was built.


Anyway, enough with the introduction. In this article, I want to talk about how I've changed after spending 100 hours reading the Linux kernel.

What Changed After 100 Hours of Reading

I say "100 hours of reading the Linux kernel," but at the 0-hour mark I had only read a little of sched and kernel_clone with my extension, so I started with almost zero knowledge.

In other words, I was a total Linux kernel wannabe (lol).

I felt I lacked the knowledge to dive in immediately (I had only read the "Unix V6" book twice), so before reading the source code I solidified my foundation with a book called "[Try and Understand] How Linux Works: Basics of OS, Virtual Machines, and Containers Learned through Experiments and Diagrams."

https://www.amazon.co.jp/[試して理解]Linuxのしくみ-―実験と図解で学ぶOS、仮想マシン、コンテナの基礎知識【増補改訂版】-武内-覚-ebook/dp/B0BG8J5QJ1?ref_=ast_author_mpb

Looking back now that I'm actually reading the Linux kernel, there are parts where I think "Ah, I see!" and other parts where the explanation doesn't feel deep enough, but it was sufficient for getting the general concepts into my head. So, I can recommend this book.

Looking back, even with the support of the VSCode extension, at first I doubted whether I really understood what I was reading (especially for write).
But as I worked through write, fork, exec, exit, open, and read in turn, I gradually learned how to read the kernel.
I decided to write this article to document that transformation.


Now, let's take a look at what has changed!

1. I now ask ChatGPT for an explanation first to get my bearings, instead of diving straight in

In the beginning, despite having help from my VSCode extension, I often dove straight into reading. And if I encountered an unfamiliar word, there was a 50/50 chance I wouldn't even Google it (lol). Since I wasn't familiar with the concepts, my main goal was just to get through the code for the time being.

At some point, without really noticing, I started asking ChatGPT first where each function fits, so that I could grasp the big picture before looking at the details. This is probably because I now have a bit more mental breathing room.

If you are reading a large codebase, it may be fine to read blindly at first, but once you understand it to some extent, it is worth trying the power of AI.

2. I can now read while picturing how concepts fit together

The next change is that I can now read the code with the internal implementation in mind. For example, in the read implementation related to the Block layer, I can think about how optimization techniques like "Red-Black Trees" or "folio's xarray" are being used.

At first... I didn't understand anything and read without considering the relationships between components at all. So, I feel like there were parts in the initial sections I wrote for linux.tokyo, such as write or fork, where even I didn't fully understand what was happening.

However, from the time I started working on open, I became able to read while being aware of how concepts combine, and for read, I managed to organize my understanding of the overall structure (though I'm still considering how to visualize this organization...).

3. Fixed bugs in the VSCode extension and improved the reading experience

The final change concerns my self-made VSCode extension.

https://marketplace.visualstudio.com/items?itemName=coffeecupjapan.linux-reader&ssr=false

Over the last two or three weeks, I have updated this VSCode extension from version 1.0.7 to 1.0.18, because the quality was terrible. Anyone who has used linux.tokyo even a little will know what I mean; for example, there were places where it could not correctly capture the body of a #define. That issue still remains in some parts of the write system call.

Because of that, over these past few weeks, I introduced unit tests and started fixing it in earnest. While it's still not 100% bug-free software, I believe it has become usable to some extent.
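
The post does not say why #define bodies were hard to capture. One common difficulty, assumed here purely for illustration, is that a C macro can continue across several lines with trailing backslashes. The following is a minimal Python sketch of capturing a whole macro, with the kind of check a unit test might make. It is not the extension's actual code, and `extract_define` is a hypothetical helper.

```python
def extract_define(lines, start):
    """Return the full text of the #define that begins at lines[start].

    Assumption: a macro continues onto the next line whenever the
    current line ends with a backslash (ignoring trailing whitespace).
    """
    if not lines[start].lstrip().startswith("#define"):
        raise ValueError("line does not start a #define")
    collected = []
    i = start
    while i < len(lines):
        line = lines[i].rstrip()
        collected.append(line)
        if not line.endswith("\\"):
            break  # no continuation marker, so the macro ends here
        i += 1
    return "\n".join(collected)


# Hypothetical sample input: a three-line macro followed by other code.
source = [
    "#define max_of(a, b) \\",
    "    ((a) > (b) ? \\",
    "     (a) : (b))",
    "int x;",
]
macro = extract_define(source, 0)

# Unit-test style checks: all three macro lines, nothing after them.
assert macro.count("\n") == 2
assert "int x;" not in macro
```

A naive capture that reads only the first line would return just `#define max_of(a, b) \`, which is the kind of truncated result a test like this would catch.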

Reflecting on the first 100 hours of reading the Linux kernel

At first, I thought the Linux kernel was an extremely difficult OSS project that was "off-limits to beginners." However, once I actually read it with effective help from generative AI, I found I could understand it reasonably well, and I think I have reached the point where I can read it and organize the concepts.

Of course, I realize it's hard to find this much time unless you're unemployed like me. However, on linux.tokyo, I've managed to read and explain up to 50 different paths per system call, which would normally require reading a massive amount of code. So, if you have the time, I hope you can also use generative AI effectively to conquer the Linux kernel or other difficult OSS.
