Reading Safetensors Headers
What is Safetensors
Safetensors is a library and file format developed by Hugging Face, primarily designed for safely and quickly reading and writing tensors.
The official Python library is compatible with PyTorch, TensorFlow, and other frameworks. Furthermore, unlike the pickle format, it cannot execute arbitrary code, which makes it relatively safe, and recent deep learning models are increasingly distributed in this format.
Structure

Explanation of the Safetensors file structure [1]
Safetensors has a simple structure. It is broadly divided into the header size area (8 bytes), the header area (N bytes), and the buffer area (the remaining part). (Since the official names for these areas are unknown, I have given them these names in this article for convenience.) Because the header and buffer areas are separate, it is possible to use the header information to load only specific parts without reading the entire file.
In this article, I will explain how to read the header area of Safetensors.
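Before diving into each area, here is a minimal sketch in Python that assembles the three areas by hand for a single tensor. The tensor name and values are hypothetical, and a real writer may pad the header with whitespace, but the layout follows the structure described above:

```python
import json
import struct

# Buffer area: the raw tensor data. Here, one F32 tensor of shape [2],
# stored as two little-endian 32-bit floats (8 bytes total).
buffer_area = struct.pack("<2f", 1.0, 2.0)

# Header area: UTF-8 JSON describing each tensor.
header = {
    "example.weight": {
        "dtype": "F32",
        "shape": [2],
        "data_offsets": [0, len(buffer_area)],
    }
}
header_area = json.dumps(header).encode("utf-8")

# Header size area: the header length as a little-endian uint64.
header_size_area = len(header_area).to_bytes(8, byteorder="little")

# A complete (minimal) Safetensors byte layout.
file_bytes = header_size_area + header_area + buffer_area
print(len(file_bytes))  # 8 bytes + header length + buffer length
```

Because the header fully describes where each tensor lives, a reader can later parse just the first two areas and seek directly to any tensor's bytes.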
Header Size Area
The first 8 bytes are a little-endian unsigned 64-bit integer (uint64) representing the size of the header area in bytes.
Header Area
The header area is UTF-8 JSON, so it can be easily read in many programming languages.
{
  "__metadata__": {
    "format": "pt"
  },
  "model.embed_tokens.weight": {
    "dtype": "F32",
    "shape": [49152, 576],
    "data_offsets": [0, 113246208]
  },
  "model.layers.0.input_layernorm.weight": {
    "dtype": "F32",
    "shape": [576],
    "data_offsets": [113246208, 113248512]
  },
  "model.layers.0.mlp.down_proj.weight": {
    "dtype": "F32",
    "shape": [576, 1536],
    "data_offsets": [113248512, 116787456]
  },
  ...
}
As shown in the explanatory image, basically:
"layer_name": {
  "dtype": "data_type",
  "shape": [dim1, dim2, ...],
  "data_offsets": [data_start, data_end]
}
is repeated.
Additionally, there is a special optional key, __metadata__, where you can store metadata. There are no strict rules about its contents, but it is constrained to string-to-string key-value pairs. Although the header itself is JSON, only string values can be used within __metadata__, so a bit of caution is needed.
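To illustrate the string-only constraint, here is a small hypothetical validator (validate_metadata is my own helper, not part of the safetensors library). Note that numeric values such as a step count must be stored as strings:

```python
def validate_metadata(metadata: dict) -> None:
    # Hypothetical helper: __metadata__ must be string-to-string pairs.
    for key, value in metadata.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise ValueError(
                f"__metadata__ entries must be str: str, got {key!r}: {value!r}"
            )

validate_metadata({"format": "pt", "step": "1000"})  # OK: number stored as a string
try:
    validate_metadata({"step": 1000})  # int value violates the constraint
except ValueError as e:
    print(e)
```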
To prevent DoS attacks, the header size is capped at 100MB [2]. A file whose header exceeds this limit fails to load with a HeaderTooLarge error.
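A reader implementing this format by hand would apply the same guard before allocating a buffer for the header. A minimal sketch, assuming the cap is 100,000,000 bytes (the exact constant is defined in the safetensors source):

```python
MAX_HEADER_SIZE = 100_000_000  # assumed value of the 100MB cap described above

def check_header_size(header_size: int) -> None:
    # Reject absurd header sizes before allocating memory for the header.
    if header_size > MAX_HEADER_SIZE:
        raise ValueError("HeaderTooLarge")

check_header_size(30368)  # a realistic header size passes silently
```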
dtype: Data Type
A string representing the data type. As of October 2, 2024, the following are available: [3]
- BOOL: Boolean type
- U8: Unsigned 8-bit integer
- I8: Signed 8-bit integer
- F8_E5M2: 8-bit floating point (5-bit exponent, 2-bit mantissa)
- F8_E4M3: 8-bit floating point (4-bit exponent, 3-bit mantissa)
- I16: Signed 16-bit integer
- U16: Unsigned 16-bit integer
- F16: 16-bit floating point
- BF16: 16-bit floating point (Brain floating point)
- I32: Signed 32-bit integer
- U32: Unsigned 32-bit integer
- F32: 32-bit floating point
- F64: 64-bit floating point
- I64: Signed 64-bit integer
- U64: Unsigned 64-bit integer
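The byte span of a tensor in the buffer area follows directly from its dtype and shape. As a quick sketch (the size table below is my own, derived from the bit widths listed above), we can verify the embedding tensor from the earlier header example:

```python
import math

# Per-element byte sizes derived from the bit widths listed above.
DTYPE_SIZES = {
    "BOOL": 1, "U8": 1, "I8": 1, "F8_E5M2": 1, "F8_E4M3": 1,
    "I16": 2, "U16": 2, "F16": 2, "BF16": 2,
    "I32": 4, "U32": 4, "F32": 4,
    "I64": 8, "U64": 8, "F64": 8,
}

# Sanity check against the header shown earlier: the F32 tensor of
# shape [49152, 576] should span exactly 113246208 bytes.
n_elements = math.prod([49152, 576])
print(n_elements * DTYPE_SIZES["F32"])  # 113246208
```

The result matches the tensor's data_offsets of [0, 113246208] exactly.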
However, depending on the library used to handle tensors, some data types may not be supported. [4]
shape: Tensor Shape
An array of integers representing the shape of the tensor.
For scalars (0-dimensional), specify it with an empty array [].
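A scalar still occupies one element in the buffer, because the product over an empty shape is 1. A quick check:

```python
import math

def num_elements(shape: list[int]) -> int:
    # math.prod of an empty list is 1, so a scalar (shape []) has one element
    return math.prod(shape)

print(num_elements([49152, 576]))  # 28311552
print(num_elements([]))            # 1 (scalar)
```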
data_offsets: Data Start and End Positions
An array of integers [start, end] representing the start and end positions of the tensor data.
These are specified as relative positions from the beginning of the buffer area, not absolute positions. Therefore, in many cases, the start position of the first layer's data will be 0.
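A small helper makes the relationship explicit: the absolute file position is the relative offset plus the 8-byte header size area plus the header itself. The numbers below reuse values from the example header in this article:

```python
def absolute_range(header_size: int, data_offsets: list[int]) -> tuple[int, int]:
    # Offsets are relative to the buffer area, which begins after the
    # 8-byte header size area and the header area itself.
    base = 8 + header_size
    start, end = data_offsets
    return base + start, base + end

# With a 30368-byte header, the first tensor's data starts at byte 30376.
print(absolute_range(30368, [0, 113246208]))  # (30376, 113276584)
```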
Specifications Regarding Metadata
While the content written in the __metadata__ field is flexible, a convention for using it to record model information has been proposed by Stability AI.
It allows for specifying model architecture, model names, and Base64-encoded thumbnail images. While there are items for text generation models, it is a standard primarily targeted at image generation models. I won't go into much depth here.
Reading Local Headers with Python
Let's try to retrieve the header of a Safetensors file using Python. As an example Safetensors file, I'm using the model file from HuggingFaceTB/SmolLM-135M.
import json

path = "./model.safetensors"  # Path where the safetensors file is located

with open(path, "rb") as f:
    # Read the first 8 bytes
    buffer = f.read(8)
    # Convert the byte sequence to an integer in little-endian
    header_size = int.from_bytes(buffer, byteorder="little")

print(f"header_size: {header_size}")

with open(path, "rb") as f:
    # Skip the header size area, then read the header portion
    f.seek(8)
    buffer = f.read(header_size)
    # Decode the header portion as JSON
    header = json.loads(buffer.decode("utf-8"))

print(header)
❯ python ./main.py
header_size: 30368
{'__metadata__': {'format': 'pt'}, 'model.embed_tokens.weight': {'dtype': 'F32', 'shape': [49152, 576], 'data_offsets': [0, 113246208]}, ...
The header size was 30368. Although the header output is partially omitted, you can see that it contains metadata and information for each layer of the model.
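Once the header is parsed, data_offsets lets you slice out a single tensor without touching the rest of the file. The sketch below uses a tiny in-memory file with hypothetical tensor names so it runs without downloading anything, but the seek-and-read logic is the same for a real file on disk:

```python
import io
import json
import struct

# Build a tiny in-memory Safetensors file with two F32 tensors.
a = struct.pack("<2f", 1.0, 2.0)            # 8 bytes
b = struct.pack("<3f", 3.0, 4.0, 5.0)       # 12 bytes
header = {
    "a": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]},
    "b": {"dtype": "F32", "shape": [3], "data_offsets": [8, 20]},
}
header_bytes = json.dumps(header).encode("utf-8")
f = io.BytesIO(len(header_bytes).to_bytes(8, "little") + header_bytes + a + b)

# Read only tensor "b" using its offsets, skipping tensor "a" entirely.
header_size = int.from_bytes(f.read(8), "little")
info = json.loads(f.read(header_size))["b"]
start, end = info["data_offsets"]
f.seek(8 + header_size + start)
values = struct.unpack("<3f", f.read(end - start))
print(values)  # (3.0, 4.0, 5.0)
```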
Reading Local Headers with Rust
The process is the same as in Python. Since parsing JSON in Rust is a bit of a hassle, I've simply read the header as a string here.
use std::{fs::File, io::Read};

fn main() {
    let path = "./model.safetensors"; // Path where the safetensors file is located
    let mut file = File::open(path).unwrap();

    let mut buffer = vec![0u8; 8]; // Prepare an 8-byte buffer
    file.read_exact(&mut buffer).unwrap(); // Read 8 bytes from the file
    // Convert the buffer to u64 in little-endian
    let header_size = u64::from_le_bytes(buffer.try_into().unwrap());
    println!("header_size: {}", header_size);

    // Read header_size bytes for the header
    let mut header_buffer = vec![0u8; header_size as usize];
    file.read_exact(&mut header_buffer).unwrap();
    let header = String::from_utf8(header_buffer).unwrap(); // Convert to text
    println!("{}", header);
}
❯ cargo run -q
header_size: 30368
{"__metadata__":{"format":"pt"},"model.embed_tokens.weight":{"dtype":"F32","shape":[49152,576],"data_offsets":[0,113246208]}, ...
Reading Remote Headers with TypeScript
Thanks to the very simple structure of Safetensors, it is possible to retrieve layer information by fetching only the header portion without reading the entire file. By leveraging this characteristic and combining it with the HTTP Range request header, you can obtain information from a Safetensors file on the internet without downloading the complete file. Please refer to the MDN documentation for the Range header.
Below is an example of using TypeScript to retrieve the header without downloading the entire file.
// Hugging Face model download URL
const fileUrl = "https://huggingface.co/HuggingFaceTB/SmolLM-135M/resolve/main/model.safetensors"

const headerSizeRes = await fetch(fileUrl, {
  method: "GET",
  headers: {
    // https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range
    "Range": "bytes=0-7" // Fetch the first 8 bytes
  }
})
const headerSize = await headerSizeRes.arrayBuffer().then((buffer) => {
  const view = new DataView(buffer)
  // Read 8 bytes from the beginning (offset 0) in little-endian and convert to bigint
  // https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/DataView/getBigUint64
  return view.getBigUint64(0, true)
})
console.log(`headerSize: ${headerSize}`)

const headerRes = await fetch(fileUrl, {
  method: "GET",
  headers: {
    // Bytes 8 through 7 + headerSize (inclusive) cover the header area
    "Range": `bytes=8-${7n + headerSize}`
  }
})
const json = await headerRes.json()
console.log(json)
❯ bun run ./main.ts | head -n 10
headerSize: 30368
{
  __metadata__: {
    format: "pt",
  },
  "model.embed_tokens.weight": {
    dtype: "F32",
    shape: [ 49152, 576 ],
    data_offsets: [ 0, 113246208 ],
  },
...
I used Bun this time, but since it only uses standard features, it should work on other runtimes as well.
The method using the Range header is also introduced in the official documentation and is actually used by Hugging Face's model pages to display the total number of parameters and layer information. (You can see requests being made with the Range header if you monitor them from the Network tab.)
Bonus
It's mostly for my own use, but I've created a CLI tool that can read or delete Safetensors metadata, so please give it a try if you're interested.
- Modified and translated from the official explanatory image (CC-BY-NC-SA-4.0) ↩︎
- From https://github.com/huggingface/safetensors/blob/5db3b92c76ba293a0715b916c16b113c0b3551e9/safetensors/src/tensor.rs#L654-L689 ↩︎
- For example, in the integration with PyTorch, U64 and U16 are not supported. ↩︎