Development Record of C++ Library libsparseir (Multi-language Support Edition)

Introduction

This article shares what I learned while developing libsparseir, a C++ port of sparse-ir, a library originally written in Python and Julia. It is a sequel to the C++ Porting Edition. By around March 2025, most of the features of SparseIR.jl had been ported to C++. Here I cover the progress made from then until September 2025.

Multi-language Support Using C-API

"Multi-language support" here means making the functions provided by the C++ library libsparseir callable from Julia, Python, and Fortran.

The specific source code for the C-API is as follows:

https://github.com/SpM-lab/libsparseir/blob/v0.6.0/src/cinterface.cpp

It looks roughly like this:

extern "C" {
    // Constructor
    // Return value is a pointer to a struct wrapping the C++ object
    spir_kernel* spir_logistic_kernel_new(double lambda, int* status){
        // Omitting details
    }
    // Calculations regarding spir_kernel return a status code 
    // indicating whether the calculation succeeded or failed.
    // Calling the API updates values like xmin, xmax, ymin, ymax, etc.
    int spir_kernel_domain(const spir_kernel *k, double *xmin, double *xmax,
                           double *ymin, double *ymax){
        // Omitting details
    }
    // Various other functions
}

spir_kernel is an opaque type defined with the IMPLEMENT_OPAQUE_TYPE macro (see the link below). For spir_kernel, the definition is:

IMPLEMENT_OPAQUE_TYPE(kernel, sparseir::AbstractKernel)

This automatically defines functions like spir_kernel_release to free resources.

https://github.com/SpM-lab/libsparseir/blob/v0.6.0/src/cinterface_impl/opaque_types.hpp

We ask the developers of the Julia, Python, and Fortran bindings to call libsparseir through this C-API. Each binding wraps these functions in language-specific structs or classes, so that library users never have to deal directly with the objects created via the C-API.

In the Case of Python

We use ctypes to talk to the C library. The basic scaffolding was generated with Cursor, and details such as argument and return types were adjusted by hand while debugging. In hindsight, I could have invested in generating the bindings automatically from the headers with Clang-based tooling, but the manual approach has been sufficient so far.

Here is part of the implementation in https://github.com/SpM-lab/libsparseir/blob/v0.6.0/python/pylibsparseir/core.py. It shows how the C-API function spir_logistic_kernel_new is used.

# pylibsparseir package
# A thin wrapper package that manages the C-API

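import ctypes
from ctypes import POINTER, byref, c_double, c_int

_lib = ctypes.CDLL("path/to/sharedlib")

# Opaque handle type, shown here for illustration;
# the exact definition in pylibsparseir may differ
class _spir_kernel(ctypes.Structure):
    pass

spir_kernel = POINTER(_spir_kernel)

# Status code for success, assumed to be 0
# (matching the status check in the Julia wrapper)
COMPUTATION_SUCCESS = 0

# Kernel functions
_lib.spir_logistic_kernel_new.argtypes = [c_double, POINTER(c_int)]
_lib.spir_logistic_kernel_new.restype = spir_kernel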

def logistic_kernel_new(lambda_val):
    """Create a new logistic kernel."""
    status = c_int()
    kernel = _lib.spir_logistic_kernel_new(lambda_val, byref(status))
    if status.value != COMPUTATION_SUCCESS:
        raise RuntimeError(f"Failed to create logistic kernel: {status.value}")
    return kernel
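
The same ctypes pattern carries over to functions that report results through several out-parameters, such as spir_kernel_domain. The following is only a sketch: because it does not load the real shared library, a Python callback (`_fake_domain`, a hypothetical stand-in) plays the role of the C function, the kernel handle is simplified to `c_void_p`, and the domain values and the error code `-1` are illustrative assumptions rather than what libsparseir actually returns.

```python
from ctypes import CFUNCTYPE, POINTER, byref, c_double, c_int, c_void_p

COMPUTATION_SUCCESS = 0  # assumption: 0 means success

# Prototype mirroring
#   int spir_kernel_domain(const spir_kernel *k, double *xmin, double *xmax,
#                          double *ymin, double *ymax);
# The opaque kernel handle is simplified to c_void_p in this sketch.
DomainFn = CFUNCTYPE(
    c_int, c_void_p,
    POINTER(c_double), POINTER(c_double),
    POINTER(c_double), POINTER(c_double),
)


def _fake_domain(k, xmin, xmax, ymin, ymax):
    """Hypothetical stand-in for the C function; the values are illustrative."""
    if not k:
        return -1  # hypothetical error code for a null handle
    xmin[0], xmax[0] = -1.0, 1.0
    ymin[0], ymax[0] = -1.0, 1.0
    return COMPUTATION_SUCCESS


fake_spir_kernel_domain = DomainFn(_fake_domain)


def kernel_domain(kernel_ptr, domain_fn=fake_spir_kernel_domain):
    """Call a spir_kernel_domain-like function and return the four bounds."""
    xmin, xmax, ymin, ymax = c_double(), c_double(), c_double(), c_double()
    status = domain_fn(kernel_ptr, byref(xmin), byref(xmax),
                       byref(ymin), byref(ymax))
    if status != COMPUTATION_SUCCESS:
        raise RuntimeError(f"Failed to get kernel domain: {status}")
    return xmin.value, xmax.value, ymin.value, ymax.value
```

With the real library, `domain_fn` would be `_lib.spir_kernel_domain` with matching `argtypes` and `restype` set, and the caller would still only see a plain tuple of floats or an exception.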

Python developers implement the LogisticKernel class and call the above logistic_kernel_new function in the constructor.

# A thin package wrapping the C-API managed by the libsparseir repository
from pylibsparseir.core import _lib
from pylibsparseir.core import logistic_kernel_new

class LogisticKernel(AbstractKernel):
    def __init__(self, lambda_):
        """Initialize logistic kernel with cutoff lambda."""
        self._lambda = float(lambda_)
        self._ptr = logistic_kernel_new(self._lambda)

    def __del__(self):
        """Clean up kernel resources."""
        if hasattr(self, '_ptr') and self._ptr:
            _lib.spir_kernel_release(self._ptr)

For the specific implementation, please refer to the following:

https://github.com/SpM-lab/sparse-ir/blob/aa095eddf9d2b836eee964d100d442591aa7fa29/src/sparse_ir/kernel.py#L14-L74

We use uv as the package manager and leave the building of the C/C++ library to scikit-build.

https://github.com/SpM-lab/libsparseir/blob/v0.6.0/python/pyproject.toml

Publishing to PyPI is automated as far as possible with a workflow file that is triggered by events such as pushing a version tag (e.g., v0.6.0). To support multiple operating systems, the wheels are built with cibuildwheel.

Reference: https://github.com/SpM-lab/libsparseir/blob/main/.github/workflows/PublishPyPI.yml

Additionally, a workflow has been created to support people using Anaconda.

https://github.com/SpM-lab/libsparseir/blob/main/.github/workflows/conda.yml

In the Case of Julia

Julia calls C functions with ccall. With Clang.jl, the code that calls ccall directly can be generated automatically.

https://github.com/SpM-lab/SparseIR.jl/blob/main/src/C_API.jl

The script for automatic generation can be found at the link below.

https://github.com/SpM-lab/SparseIR.jl/tree/main/utils

The script uses Clang.jl to read sparseir.h and generate a thin wrapper module.

The Julia counterpart of the thin wrapper package pylibsparseir is libsparseir_jll. Creating it requires submitting a pull request to JuliaPackaging/Yggdrasil that adds a build script called build_tarballs.jl.

New PR: https://github.com/JuliaPackaging/Yggdrasil/pull/12027
Version update PR: https://github.com/JuliaPackaging/Yggdrasil/pull/12161

Updating and pushing build_tarballs.jl is handled by a custom script: https://github.com/SpM-lab/libsparseir/tree/main/julia/YggdrasilCommitHelper.jl.

PRs for updating versions are created manually. It would be great if the official Julia organization provided a mechanism to automate this part.

Julia developers wrap and manage C resources using structs as shown below.

mutable struct LogisticKernel <: AbstractKernel
    ptr::Ptr{spir_kernel}
    Λ::Float64

    function LogisticKernel(Λ::Real)
        Λ >= 0 || throw(DomainError(Λ, "Kernel cutoff Λ must be non-negative"))
        status = Ref{Cint}(-100)
        ptr = spir_logistic_kernel_new(Float64(Λ), status)
        status[] == 0 || error("Failed to create logistic kernel")
        kernel = new(ptr, Float64(Λ))
        finalizer(k -> spir_kernel_release(k.ptr), kernel)
        return kernel
    end
end

The pattern of using finalizer is based on https://docs.julialang.org/en/v1/base/base/#Base.finalizer.

It's Actually Very Tough

I have described all this matter-of-factly, but multi-language support took about as much effort as building the C++ library itself. Some functions in the C-API take and return multi-dimensional arrays, and passing these arrays correctly required care. For example, the following items were involved:

  • Passing information about the length, shape, and dimensions of the array.
  • Ensuring the calling side has a contiguous memory layout.
  • Setting flags to perform calculations in specific modes since memory layouts (row/column-major) differ between languages.
  • Converting arrays received via the C-API into a format that the internal C++ API can understand (specifically, a format Eigen understands).
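
The first three items can be illustrated with a small standard-library sketch. It flattens a 2-D array into a contiguous `ctypes` buffer in either row-major or column-major order and bundles the shape information that the receiving side needs to interpret it. The flag names and values (`ORDER_ROW_MAJOR`, `ORDER_COLUMN_MAJOR`) and the helpers `pack_matrix` and `element` are hypothetical and chosen for this example; they are not the actual libsparseir constants or functions.

```python
from ctypes import c_double, c_int

# Hypothetical order flags for this sketch (not the real C-API constants)
ORDER_ROW_MAJOR = 0
ORDER_COLUMN_MAJOR = 1


def pack_matrix(rows, order):
    """Flatten a nested list into a contiguous C buffer plus shape metadata."""
    nrow, ncol = len(rows), len(rows[0])
    if any(len(r) != ncol for r in rows):
        raise ValueError("ragged input cannot be passed as a dense array")
    if order == ORDER_ROW_MAJOR:
        flat = [rows[i][j] for i in range(nrow) for j in range(ncol)]
    elif order == ORDER_COLUMN_MAJOR:
        flat = [rows[i][j] for j in range(ncol) for i in range(nrow)]
    else:
        raise ValueError(f"unknown order flag: {order}")
    data = (c_double * len(flat))(*flat)  # contiguous block of doubles
    dims = (c_int * 2)(nrow, ncol)        # shape passed alongside the data
    ndim = c_int(2)                       # number of dimensions
    return data, dims, ndim


def element(data, dims, order, i, j):
    """Read element (i, j) the way the receiving side has to, given the flag."""
    nrow, ncol = dims[0], dims[1]
    offset = i * ncol + j if order == ORDER_ROW_MAJOR else j * nrow + i
    return data[offset]
```

The same logical matrix produces two different buffers depending on the flag, which is why the order has to travel with the pointer and the shape: without it, the receiver cannot compute the correct offset for element (i, j).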

To package for Python, I had to set up various files like pyproject.toml, setup.py, CMakeLists.txt, and Publish.yml. I managed to get it done with the help of LLMs, but there is just too much preparation work involved.

While Julia is somewhat easier in this regard, the time and effort required for tasks like local testing for registration in Yggdrasil cannot be ignored.

With the help of LLMs, I handled the bridges between Julia/Python and C/C++ myself. Doing this while maintaining the C++ library was a good experience for me as a software developer. However, libsparseir is a package for computational physics. I have mixed feelings about whether students and researchers, whose daily work is doing physics on computers, should have to go through the same experience for future maintenance and feature development.

Incidentally, there has been a recent development: "to be continued in Rust!". I'm curious to see how much harder or easier things will become when the backend of the sparse-ir project switches from C++ to Rust 👀.
