Translated by AI
Goodbye, SHA-1
Git v2.29 was released recently.
Notably, this release includes experimental support for SHA-2 (SHA-256) commit hashes. It appears to be used like this:
$ git init --object-format=sha256 sample-repo
Initialized empty Git repository in /home/username/sample-repo/.git/
$ cd sample-repo
$ echo 'Hello, SHA-256!' >README.md
$ git add README.md
$ git commit -m "README.md: initial commit"
[main (root-commit) 6d45449] README.md: initial commit
1 file changed, 1 insertion(+)
create mode 100644 README.md
$ git rev-parse HEAD
6d45449028a8e76500adbfe7330e779d5dc4a3a14fca58ff08ec354c58727b2c
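As a rough sketch of where a 64-hex-digit ID like this comes from: Git names an object by hashing a short header (`<type> <size>` plus a NUL byte) followed by the content, and the object format selects which hash function is used. The helper name `git_blob_id` is hypothetical, and the sketch covers only blobs (file contents), not commit objects.

```python
import hashlib

def git_blob_id(content: bytes, algo: str = "sha1") -> str:
    """Hash a blob the way Git does: b'blob <size>\\0' followed by the content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.new(algo, header + content).hexdigest()

# Same content as the README.md created in the transcript above.
readme = b"Hello, SHA-256!\n"
sha1_id = git_blob_id(readme, "sha1")
sha256_id = git_blob_id(readme, "sha256")

print(len(sha1_id), sha1_id)      # 40 hex digits (160 bits)
print(len(sha256_id), sha256_id)  # 64 hex digits (256 bits)
```

If Git is available, the results can be compared with `git hash-object README.md` run inside a repository of the matching object format.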
In this article, I will briefly review how SHA-1 came to be considered compromised, which is likely the background to this update.
The Hash Collision Problem
A cryptographic hash function is an algorithm with the following properties:
- It "summarizes" a data string of arbitrary length into a fixed-length data string (the hash value).
- The original data string cannot be inferred from the hash value.
- Multiple data strings that produce the same hash value cannot be found in a realistic amount of time.
Hash functions are a core technology for Message Authentication Codes (MAC) and digital signatures, and are essential for ensuring the integrity of data.
In particular, if the third property ("multiple data strings that produce the same hash value cannot be found in a realistic amount of time") is broken, that hash function can no longer guarantee integrity. This is known as the "hash collision problem."
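To make the first and third properties concrete, here is a small sketch using Python's `hashlib`. It first shows the fixed-length output, then deliberately weakens SHA-1 by keeping only the first 3 bytes (24 bits) of the digest, so that a collision turns up after a few thousand attempts. The truncation and the helper name `find_truncated_collision` are assumptions made purely for illustration; this is not an attack on full SHA-1.

```python
import hashlib
from itertools import count

# Property 1: the digest length is fixed regardless of the input length.
short = hashlib.sha1(b"a").hexdigest()
long_ = hashlib.sha1(b"a" * 1_000_000).hexdigest()
print(len(short), len(long_))  # both are 40 hex digits

def find_truncated_collision(nbytes: int = 3):
    """Find two distinct inputs whose SHA-1 digests share the first nbytes bytes."""
    seen = {}  # truncated digest -> first message that produced it
    for i in count():
        msg = str(i).encode()
        prefix = hashlib.sha1(msg).digest()[:nbytes]
        if prefix in seen:
            return seen[prefix], msg
        seen[prefix] = msg

m1, m2 = find_truncated_collision()
print(m1, hashlib.sha1(m1).hexdigest())
print(m2, hashlib.sha1(m2).hexdigest())
```

For an n-bit hash value, a brute-force collision is expected after roughly 2^(n/2) attempts, which is why 24 bits falls almost instantly while the full 160 bits of SHA-1 would not fall to this naive search.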
Compromise of the SHA-1 Algorithm?
It all started in 2004 when an attack method was published that could cause hash collisions with a high probability in multiple hash functions.
- Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD: The paper that started it all. It also demonstrated that SHA-0 is vulnerable.
Subsequent research revealed that SHA-1 was also vulnerable, causing a major stir in the cryptographic community. There had already been skepticism because the NSA was involved in the development of SHA-1 and SHA-2, and with the appearance of the paper above, calls grew for a hash algorithm to replace them. This eventually led to NIST's recommendation of SHA-3.
On the other hand, the debate over the compromise of SHA-1 eventually died down, because for some time subsequent research produced no concrete example of a collision for SHA-1 or SHA-2. SHA-3, which had been developed with great effort, ended up being positioned merely as a backup for SHA-2.
- "Recent Topics in Cryptography" — Old Main Blog | Baldanders.info
- SHA-3 Officially Released: 10 Years Since Then... — Old Main Blog | Baldanders.info
The 2010 Problem
NIST announced a phase-out schedule for SHA-1 and recommended migrating hash algorithms used for digital signatures to SHA-2 (SHA-224/256/384/512) by 2010.
However, in reality, the transition progressed slowly, and NIST's migration schedule was postponed to 2013. But even then, it was delayed by more than two additional years (laughs).
Performance Limits of SHA-1
In the 2010s, the performance limits of SHA-1 began to be debated, and papers on brute-forcing SHA-1 started to appear. Behind this was the falling cost of procuring massive computing power, through dedicated hardware packed with GPUs and the rise of cloud services.
- The SHA-1 Collision Problem: Moving Up the Phase-out | text.Baldanders.info
- The First SHA-1 Collision Example | text.Baldanders.info
The following paper became the decisive factor:
- SHA-1 is a Shambles
There are two notable points in this paper:
- The attack is a "chosen-prefix collision" on SHA-1, which gives the attacker a high degree of freedom in preparing the colliding data.
- The cost of procuring the computing power needed to compromise the hash has dropped to a relatively practical level.
The second point is particularly important: the collision reportedly took about two months on a configuration of 900 Nvidia GTX 1060 GPUs, at a cost said to be 45k USD[1].
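As a back-of-the-envelope check on these figures, the sketch below simply multiplies out the numbers quoted above. The 110 JPY/USD rate is this article's own rough assumption, and the per-GPU-month figure is only a derived average, not a price taken from the paper.

```python
# Figures quoted in the article (not independently verified here).
gpus = 900
months = 2
cost_usd = 45_000
jpy_per_usd = 110  # the article's rough exchange rate

cost_jpy = cost_usd * jpy_per_usd
usd_per_gpu_month = cost_usd / (gpus * months)

print(f"{cost_jpy:,} JPY")                           # 4,950,000 JPY
print(f"{usd_per_gpu_month:.0f} USD per GPU-month")  # 25 USD per GPU-month
```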
Following the publication of "SHA-1 is a Shambles", GnuPG decided that, starting with version 2.2.18, it would no longer accept SHA-1 based digital signatures added to keys after 2019-01-19.
- [Announce] GnuPG 2.2.18 released
- [openpgp] Deprecating SHA1
- GnuPG 2.2.18 Released: Goodbye SHA-1 | text.Baldanders.info
Additionally, OpenSSH has stated that it will disable the "ssh-rsa" public key signature algorithm, which uses SHA-1, in the near future.
In the future, SHA-1 will likely be kept only for referencing legacy assets, much like MD5 was in the past.
Are Git Commit Hashes Safe?
The issues regarding the compromise of SHA-1 discussed so far primarily affect digital signatures and are said not to affect MAC algorithms or pseudorandom number generation algorithms. Since Git commit hashes serve simply as commit identifiers, the integrity requirement is likely less strict than it is for digital signatures.
However, it cannot be denied that new compromise issues may surface in the future, and it would be precarious not to have an alternative when that happens. In that sense, it seems meaningful to secure SHA-2 as an algorithm for commit hashes. If I were to ask for more, I would hope they also leave room to support SHA-3 in the future.
References
- [ANNOUNCE] Git v2.29.1
- Release Git for Windows 2.29.0 · git-for-windows/git · GitHub
- Release Git for Windows 2.29.1 · git-for-windows/git · GitHub
- "Git for Windows 2.29.0" Released – Default branch name can be set during setup - Mado no Mori
- Git v2.29 Released | text.Baldanders.info
[1] Simply taking 1 USD = 110 JPY, 45k USD is approximately 4.95M JPY. In other words, it can be compromised for under five million yen.