iTranslated by AI
SipHash and Its Implementation in the Rust Standard Library
This article is for Day 14 of the Rust Advent Calendar 2023.
Introduction
SipHash is the hash algorithm used by default in places like std::collections::HashMap in the Rust standard library.
It is resistant to HashDoS attacks and suitable as a default algorithm, but since I personally tend to casually switch to other (non-resistant) algorithms when I want to speed up a HashMap, I decided to investigate it properly this time.
What is SipHash
Overview
SipHash is a type of PRF that works as follows:
It generates a 64-bit hash value from a 128-bit secret key and an input of arbitrary length.
As long as the contents of the secret key remain unknown, it is impossible to predict what hash value an input will produce, which is why it is resistant to HashDoS attacks.
Secret Key Generation in Rust
In the Rust standard library, the secret key is generated by obtaining a random number from the OS only at the beginning, and thereafter, that value is maintained in thread_local and incremented by one each time.
The comments mention that obtaining a random number from the OS every time was slow.
Algorithm
Let's examine the SipHash algorithm along with its implementation in the Rust standard library.
- Initialize the internal state with two u64 values, k0 and k1 (a total of 128 bits for the secret key), and fixed constants.
The constants being XORed come from the ASCII codes of b'somepseudorandomgeneratedbytes'.
It seems the goal is simply to ensure each state v0-3 starts with a different value.
# python
>>> ''.join(['%x' % c for c in b'somepseudorandomgeneratedbytes'])
'736f6d6570736575646f72616e646f6d67656e6572617465646279746573'
- Read the input 8 bytes at a time (until less than 8 bytes of input remain) and update the internal state.
mi receives the input in 64-bit segments.
What it does is simply XOR the input into the internal state before and after calling c_rounds.
As for the crucial c_rounds, the Rust standard library uses a variant called SipHash-1-3, so it calls compress! only once.
The "1" in SipHash-1-3 indicates that compress! is called exactly once for each 8-byte chunk of input.
compress! is defined as a macro (perhaps they didn't trust the compiler's optimization?).
In any case, it updates the internal state by repeating three types of operations: wrapping_add (Add), rotate_left, and XOR.
Functions that use only these three types of operations are called ARX, named after their initials.
They have the advantage of running quickly even on low-end CPUs because they don't require special instructions (like AES instructions).

Image cited from SipHash: a fast short-input PRF Jean-Philippe Aumasson and Daniel J. Bernstein
The number of rotations for each step was apparently determined through mechanical testing.
- Final Output
When less than 8 bytes of input remain, it updates the internal state by incorporating input length % 255, then XORs 0xff into v2, calls d_rounds to update the state, and finally XORs v0-3 together for the output.
Since it's SipHash-1-3, d_rounds calls compress! three times.
SipHash Variants
The notation SipHash-c-d represents the number of rounds in SipHash. It calls compress! c times for each 8-byte chunk and d times at the end.
While the paper recommends SipHash-2-4, the Rust standard library uses SipHash-1-3.
According to SipHash designer Jean-Philippe Aumasson[1], SipHash-1-3 appears to be sufficient (https://github.com/rust-lang/rust/issues/29754).
Comparison with Other Algorithms
SipHash is faster than HMAC-SHA1, which was its obvious competitor at the time, and it is particularly optimized for short inputs, making it suitable for use in data structures like HashMap.
While HMAC-SHA1 is secure, the reasoning seems to be that since a secret key is used anyway, security can be maintained even with less computational effort than SHA1 (broad paraphrase).
History
- 2015-05 Rust 1.0.0 released. SipHash-2-4 was used at this time. https://github.com/rust-lang/rust/blob/1.0.0/src/libcore/hash/sip.rs
- 2015-11 Changed from SipHash-2-4 to SipHash-1-3, as SipHash-1-3 was deemed sufficient.
- https://github.com/rust-lang/rust/issues/29754
- Ruby also changed from SipHash-2-4 to SipHash-1-3. https://bugs.ruby-lang.org/issues/13017
- 2016-08 A proposal was made to speed up the SipHash implementation using assembly or other methods, but it was abandoned as there was no significant difference.
-
[author of Serious Cryptography] ↩︎
Discussion