Translated by AI
GitHub Utilization Roadmap for Research Lab Management
Based on my experience introducing GitHub to my research laboratory, I have developed two types of utilization roadmaps: a bottom-up approach and a top-down approach.
Let's Use GitHub in the Laboratory!
GitHub officially launched in 2008, was acquired by Microsoft in 2018, and started the GitHub Actions service in 2019. Today, it is an indispensable tool for IT engineers and a crucial tool for students and researchers who want their collaborative research to run smoothly. Beyond collaboration, it is also regarded as an effective platform for conducting reproducible research.
Furthermore, GitHub is effective not only for research programs but also for managing lab websites and other assets. Below, I propose two types of roadmaps for promoting the introduction and utilization of GitHub in laboratory management.
Bottom-up Approach
First, I developed a bottom-up roadmap where students or young faculty members take leadership to expand its utilization. My laboratory is currently proceeding with the introduction based on this scenario. Here, "young leader" refers to the person at the center of the introduction, such as a student or postdoc, and "PI" refers to the professor or associate professor.
| Phase | Milestone |
|---|---|
| 01 | The young leader creates a personal GitHub account. |
| 02 | The young leader obtains permission from the PI and creates a laboratory organization. |
| 03 | The young leader creates and shares repositories for research code or the lab website. |
| 04 | Hold a study session for other students and young faculty members, where everyone creates a GitHub account on the spot. The young leader adds other members to the organization, enabling them to at least view files. Ideally, the study session covers everything up to making a pull request. |
| 05 | Other students and young faculty members create pull requests according to the young leader's Issues and receive code reviews. |
| 06 | Other students and young faculty members become able to isolate problems properly and create Issues under the young leader's guidance. |
| 07 | Other students and young faculty members develop a habit of resolving Issues, making pull requests, and conducting code reviews without the young leader. |
| 08 | The young leader demonstrates milestone management, and other students/young faculty members implement it as needed. |
| 09 | The PI creates their own GitHub account, and the young leader adds them to the organization. |
| 10 | The PI becomes able to create Issues themselves, and students/young faculty members resolve them. |
| 11 | The PI sets milestones themselves and manages the progress of students and young faculty members. |
The goal of this roadmap is for the PI to play a project-manager-like role on GitHub. We therefore prioritize mastering upstream processes such as Issue and milestone management. Having the PI write code and submit pull requests (PRs), or give specific guidance through code reviews, is left as something to do if capacity remains after the main goal is achieved.
Top-down Approach
Next, I have developed a roadmap for proceeding with the introduction in a top-down manner from the laboratory's principal investigator (PI), such as a professor or associate professor. I would like to hear the opinions of professors who have actual experience with this.
| Phase | Milestone |
|---|---|
| 01 | The PI creates a personal GitHub account. |
| 02 | The PI creates a laboratory organization. |
| 03 | If there are young members who already have a GitHub account, add them to the organization and grant management permissions as needed. |
| 04 | The PI or young leader creates and shares repositories for research code or the lab website. |
| 05 | The PI or young leader holds a study session for other students and young faculty members, where everyone creates a GitHub account on the spot. The PI or young leader adds other members to the organization, enabling them to view files. Ideally, the study session covers everything up to making a pull request. |
| 06 | Students and young faculty members create pull requests according to the PI's Issues and receive code reviews. |
| 07 | The PI sets milestones and manages the progress of students and young faculty members. |
| 08 | Students and young faculty members become able to create Issues. |
| 09 | The PI submits pull requests and assists students and young faculty members. |
| 10 | Students and young faculty members develop a habit of resolving Issues, making pull requests, and conducting code reviews. |
| 11 | Students and young faculty members manage milestones and carry out research. |
Other Roadmaps
Here are some other possible patterns.
- The professor wants to release code as open source but simply cannot use Git/GitHub.
→ A young member creates an account on their behalf, notifies the professor of the ID and password, makes only the initial commit using the professor's account, and thereafter adds their own account as a collaborator to manage it by proxy.
- etc.
Points to Note
- According to GitHub's terms, you can only hold one free account, so be sure to create a personal one (Reference: Case of Abolishing Company GitHub Accounts Company-wide). Therefore, I recommend a handle name that you wouldn't be embarrassed to have colleagues or real-life friends see. I feel a slight mismatch between the anonymous culture of engineers and the real-name culture of researchers, but since I am committed to a career as a researcher, I created mine using my real name.
- Repositories don't necessarily have to be public; in fact, files like LaTeX documents for papers should probably be kept private. Explain to lab members that even private repositories are valuable for backup and sharing. Since the point of version control itself is often not understood, I recommend communicating these other benefits as well. (A commenter shared examples of damage caused by not using version control.)
- People often say, "I don't need it if I'm just doing research alone," but anyone who has fully mastered Git/GitHub uses it even when working solo. I recommend getting into the habit before you end up regretting it and thinking, "I should have used GitHub from the start."
- As a rule, manage computation programs in the repository, and add heavy data such as computation results to the .gitignore file so that they are not committed.
→ Git Large File Storage (Git LFS) also makes it possible to manage heavy data alongside the code.
- Include not only source code but also a Makefile and a README.md that summarizes the build procedure.
- Even if it's not open source, write documentation. You might hand the code over to someone else, or by the time you look at it again you might have forgotten the details, and code you wrote yourself might look like someone else's.
- Write tests before you get bad-mouthed on X.
- Let's also introduce code reviews.
- Add a license as a "last will and testament." The MIT license is recommended.
- Regarding websites, if they are set up to be uploaded via an FTP connection, you can use GitHub Actions for automatic deployment. However, the workflow cannot get through a VPN, so you may need to consider other methods depending on the circumstances.
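The advice above about keeping heavy computation results out of the repository can be sketched as a small helper. This is only a minimal illustration, not an established tool: the function name `find_large_files` and the 50 MiB default threshold are assumptions chosen for this example. It walks a working directory, skips Git's own `.git` folder, and prints candidate lines for `.gitignore`.

```python
from pathlib import Path


def find_large_files(root, max_bytes=50 * 1024 * 1024):
    """Return files under root that are larger than max_bytes.

    Paths are returned relative to root, in POSIX style, sorted.
    The default threshold is an arbitrary assumption for illustration.
    """
    root = Path(root)
    large = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Skip Git's own metadata directory.
        if ".git" in rel.parts:
            continue
        if path.is_file() and path.stat().st_size > max_bytes:
            large.append(rel.as_posix())
    return sorted(large)


if __name__ == "__main__":
    for name in find_large_files("."):
        # A leading slash anchors the pattern to the repository root.
        print("/" + name)
```

Run it from the repository root and review the output before pasting it into `.gitignore`. File names containing characters that are special in `.gitignore` patterns (such as `#`, `*`, or `[`) would need escaping by hand, and in practice a directory-level pattern for the results folder is often simpler than listing individual files.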
Summary
In introducing GitHub to my lab, I proposed two types of utilization roadmaps: bottom-up and top-down. If you have any know-how regarding its utilization, please share it.
Related Articles
While I didn't refer to these during the writing stage, they describe procedures for holding training sessions to introduce Git or persuading supervisors. The generation that learned programming in compulsory education will undoubtedly have to persuade seniors or supervisors with lower IT literacy than themselves to promote the introduction of new technologies, so I think this will be useful in a few years.
Update: May 14
Discussion
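It seems that accounts are not actually limited to one per person (as of May 2025). Official documentation ↓
Let me share the damage I have actually seen in an organization that did not use version control (I hope it helps anyone who wants to introduce Git and GitHub to their lab!)
The problems above can also be solved by remembering to share files on OneDrive, writing detailed documentation, and so on. However, that takes considerable effort and depends on people, so sooner or later someone will forget or make a mistake. To begin with, that effort itself becomes a major obstacle to research.
As one user, I believe version control (and its hosting) is what solves this easily and at the root. I hope members can be made to feel, by whatever means it takes, that the benefits far outweigh the adoption and learning costs...
(Thank you for linking to my article!)
Thank you! I have revised the part about holding accounts.
So it seems. Thank you also for sharing the benefits of version control. I will introduce them in the article!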