
Fedora 41 Install Memo

nb.onb.o

Post-Installation Configuration

Change the folder names in the home directory from Japanese to English

$ LANG=C xdg-user-dirs-gtk-update 
Moving DESKTOP directory from デスクトップ to Desktop
Moving DOWNLOAD directory from ダウンロード to Downloads
Moving TEMPLATES directory from テンプレート to Templates
Moving PUBLICSHARE directory from 公開 to Public
Moving DOCUMENTS directory from ドキュメント to Documents
Moving MUSIC directory from 音楽 to Music
Moving PICTURES directory from 画像 to Pictures
Moving VIDEOS directory from ビデオ to Videos

Adding the RPM Fusion Repository

$ sudo dnf install https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
$ sudo dnf upgrade --refresh
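The `$(rpm -E %fedora)` substitution expands the `%fedora` RPM macro to the running release number, so the same command works on any Fedora version. A minimal sketch of the URL it produces, with the release number hard-coded to 41 for illustration:

```shell
# Stand-in for $(rpm -E %fedora), which prints the release number (41 here).
rel=41
url="https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-${rel}.noarch.rpm"
echo "$url"
```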

Install NVIDIA Driver

Secure Boot

Generate a signing key and queue it for enrollment (on the next boot, choose "Enroll MOK" in the MOK manager and enter the same password):

$ sudo dnf install kmodtool akmods mokutil openssl
$ sudo kmodgenca -a
$ sudo mokutil --import /etc/pki/akmods/certs/public_key.der
input password:
input password again:
$ systemctl reboot
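kmodgenca generates a self-signed certificate that akmods uses to sign the NVIDIA kernel module, and mokutil queues it for enrollment at the next boot. A rough, hypothetical openssl equivalent of the key-pair generation step (file names here are illustrative, not the paths akmods actually uses):

```shell
# Create a self-signed RSA key pair for module signing; the certificate is
# DER-encoded, the format MOK Manager expects. Illustrative only.
openssl req -x509 -new -nodes -newkey rsa:2048 \
  -keyout MOK.key -out MOK.der -outform DER \
  -days 3650 -subj "/CN=Local module signing key/"
ls -l MOK.der
```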

Installing the drivers

$ sudo dnf update
$ sudo dnf install akmod-nvidia

Wait about five minutes for akmods to build the kernel module, then reboot.

After rebooting, add CUDA/NVDEC/NVENC support:

$ sudo dnf install xorg-x11-drv-nvidia-cuda
$ nvidia-smi 
Fri Nov  1 08:03:25 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070        Off |   00000000:0A:00.0  On |                  N/A |
|  0%   38C    P8             10W /  151W |     412MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2375      G   /usr/bin/gnome-shell                          209MiB |
|    0   N/A  N/A      3166    C+G   /usr/bin/ptyxis                                29MiB |
|    0   N/A  N/A      3225      G   /usr/bin/Xwayland                              13MiB |
|    0   N/A  N/A      3394      G   ...seed-version=20241031-050108.844000        139MiB |
+-----------------------------------------------------------------------------------------+

Install Podman and NVIDIA Container Toolkit

Install NVIDIA Container Toolkit

$ curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
# Optional: use experimental packages
$ sudo dnf config-manager --enable nvidia-container-toolkit-experimental
$ sudo dnf install nvidia-container-toolkit

Configuration

$ sudo nvidia-ctk runtime configure --runtime=docker
INFO[0000] Config file does not exist; using empty config 
INFO[0000] Wrote updated config to /etc/docker/daemon.json 
INFO[0000] It is recommended that docker daemon be restarted. 
$ sudo systemctl restart podman

# Rootless
$ nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
INFO[0000] Config file does not exist; using empty config 
INFO[0000] Wrote updated config to /home/nobuo/.config/docker/daemon.json 
INFO[0000] It is recommended that docker daemon be restarted.
$ systemctl --user restart podman
$ sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place

Generating a CDI specification

$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
INFO[0000] Using /usr/lib64/libnvidia-ml.so.560.35.03   
WARN[0000] Ignoring error in locating libnvidia-sandboxutils.so.1: pattern libnvidia-sandboxutils.so.1 not found 
INFO[0000] Auto-detected mode as 'nvml'                 
WARN[0000] Failed to init nvsandboxutils: ERROR_LIBRARY_LOAD; ignoring 
INFO[0000] Selecting /dev/nvidia0 as /dev/nvidia0       
INFO[0000] Selecting /dev/dri/card1 as /dev/dri/card1   
WARN[0000] Could not locate /dev/dri/controlD65: pattern /dev/dri/controlD65 not found 
INFO[0000] Selecting /dev/dri/renderD128 as /dev/dri/renderD128 
INFO[0000] Using driver version 560.35.03               
INFO[0000] Selecting /dev/nvidia-modeset as /dev/nvidia-modeset 
INFO[0000] Selecting /dev/nvidia-uvm-tools as /dev/nvidia-uvm-tools 
INFO[0000] Selecting /dev/nvidia-uvm as /dev/nvidia-uvm 
INFO[0000] Selecting /dev/nvidiactl as /dev/nvidiactl   
INFO[0000] Selecting /usr/lib64/libnvidia-egl-gbm.so.1.1.2 as /usr/lib64/libnvidia-egl-gbm.so.1.1.2 
INFO[0000] Selecting /usr/lib64/libnvidia-egl-wayland.so.1.1.17 as /usr/lib64/libnvidia-egl-wayland.so.1.1.17 
INFO[0000] Selecting /usr/lib64/libnvidia-allocator.so.560.35.03 as /usr/lib64/libnvidia-allocator.so.560.35.03 
WARN[0000] Could not locate libnvidia-vulkan-producer.so.560.35.03: pattern libnvidia-vulkan-producer.so.560.35.03 not found 
WARN[0000] Could not locate nvidia_drv.so: pattern nvidia_drv.so not found 
WARN[0000] Could not locate libglxserver_nvidia.so.560.35.03: pattern libglxserver_nvidia.so.560.35.03 not found 
INFO[0000] Selecting /usr/share/glvnd/egl_vendor.d/10_nvidia.json as /usr/share/glvnd/egl_vendor.d/10_nvidia.json 
INFO[0000] Selecting /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json as /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json 
INFO[0000] Selecting /usr/share/egl/egl_external_platform.d/10_nvidia_wayland.json as /usr/share/egl/egl_external_platform.d/10_nvidia_wayland.json 
INFO[0000] Selecting /usr/share/nvidia/nvoptix.bin as /usr/share/nvidia/nvoptix.bin 
WARN[0000] Could not locate X11/xorg.conf.d/10-nvidia.conf: pattern X11/xorg.conf.d/10-nvidia.conf not found 
WARN[0000] Could not locate X11/xorg.conf.d/nvidia-drm-outputclass.conf: pattern X11/xorg.conf.d/nvidia-drm-outputclass.conf not found 
WARN[0000] Could not locate vulkan/icd.d/nvidia_icd.json: pattern vulkan/icd.d/nvidia_icd.json not found 
WARN[0000] Could not locate vulkan/icd.d/nvidia_layers.json: pattern vulkan/icd.d/nvidia_layers.json not found 
INFO[0000] Selecting /usr/share/vulkan/implicit_layer.d/nvidia_layers.json as /etc/vulkan/implicit_layer.d/nvidia_layers.json 
INFO[0000] Selecting /usr/lib64/libEGL_nvidia.so.560.35.03 as /usr/lib64/libEGL_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libGLESv1_CM_nvidia.so.560.35.03 as /usr/lib64/libGLESv1_CM_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libGLESv2_nvidia.so.560.35.03 as /usr/lib64/libGLESv2_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libGLX_nvidia.so.560.35.03 as /usr/lib64/libGLX_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libcuda.so.560.35.03 as /usr/lib64/libcuda.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libcudadebugger.so.560.35.03 as /usr/lib64/libcudadebugger.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvcuvid.so.560.35.03 as /usr/lib64/libnvcuvid.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-allocator.so.560.35.03 as /usr/lib64/libnvidia-allocator.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-cfg.so.560.35.03 as /usr/lib64/libnvidia-cfg.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-eglcore.so.560.35.03 as /usr/lib64/libnvidia-eglcore.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-encode.so.560.35.03 as /usr/lib64/libnvidia-encode.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-fbc.so.560.35.03 as /usr/lib64/libnvidia-fbc.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-glcore.so.560.35.03 as /usr/lib64/libnvidia-glcore.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-glsi.so.560.35.03 as /usr/lib64/libnvidia-glsi.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-glvkspirv.so.560.35.03 as /usr/lib64/libnvidia-glvkspirv.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-gpucomp.so.560.35.03 as /usr/lib64/libnvidia-gpucomp.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-gtk3.so.560.35.03 as /usr/lib64/libnvidia-gtk3.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-ml.so.560.35.03 as /usr/lib64/libnvidia-ml.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-ngx.so.560.35.03 as /usr/lib64/libnvidia-ngx.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-nvvm.so.560.35.03 as /usr/lib64/libnvidia-nvvm.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-opencl.so.560.35.03 as /usr/lib64/libnvidia-opencl.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-opticalflow.so.560.35.03 as /usr/lib64/libnvidia-opticalflow.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-pkcs11-openssl3.so.560.35.03 as /usr/lib64/libnvidia-pkcs11-openssl3.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-ptxjitcompiler.so.560.35.03 as /usr/lib64/libnvidia-ptxjitcompiler.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-rtcore.so.560.35.03 as /usr/lib64/libnvidia-rtcore.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-tls.so.560.35.03 as /usr/lib64/libnvidia-tls.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-vksc-core.so.560.35.03 as /usr/lib64/libnvidia-vksc-core.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-wayland-client.so.560.35.03 as /usr/lib64/libnvidia-wayland-client.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvoptix.so.560.35.03 as /usr/lib64/libnvoptix.so.560.35.03 
INFO[0000] Selecting /usr/lib64/vdpau/libvdpau_nvidia.so.560.35.03 as /usr/lib64/vdpau/libvdpau_nvidia.so.560.35.03 
WARN[0000] Could not locate /nvidia-persistenced/socket: pattern /nvidia-persistenced/socket not found 
WARN[0000] Could not locate /nvidia-fabricmanager/socket: pattern /nvidia-fabricmanager/socket not found 
WARN[0000] Could not locate /tmp/nvidia-mps: pattern /tmp/nvidia-mps not found 
INFO[0000] Selecting /lib/firmware/nvidia/560.35.03/gsp_ga10x.bin as /lib/firmware/nvidia/560.35.03/gsp_ga10x.bin 
INFO[0000] Selecting /lib/firmware/nvidia/560.35.03/gsp_tu10x.bin as /lib/firmware/nvidia/560.35.03/gsp_tu10x.bin 
INFO[0000] Selecting /usr/bin/nvidia-smi as /usr/bin/nvidia-smi 
INFO[0000] Selecting /usr/bin/nvidia-debugdump as /usr/bin/nvidia-debugdump 
INFO[0000] Selecting /usr/bin/nvidia-persistenced as /usr/bin/nvidia-persistenced 
INFO[0000] Selecting /usr/bin/nvidia-cuda-mps-control as /usr/bin/nvidia-cuda-mps-control 
INFO[0000] Selecting /usr/bin/nvidia-cuda-mps-server as /usr/bin/nvidia-cuda-mps-server 
WARN[0000] Could not locate nvidia_drv.so: pattern nvidia_drv.so not found 
WARN[0000] Could not locate libglxserver_nvidia.so.560.35.03: pattern libglxserver_nvidia.so.560.35.03 not found 
INFO[0000] Generated CDI spec with version 0.8.0        
$ nvidia-ctk cdi list
INFO[0000] Found 3 CDI devices                          
nvidia.com/gpu=0
nvidia.com/gpu=GPU-acc29702-e8eb-2c6b-2e6c-6e2aafabb704
nvidia.com/gpu=all
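The spec written to /etc/cdi/nvidia.yaml is ordinary YAML, and the names printed by `nvidia-ctk cdi list` combine its `kind` with each entry under `devices` as `<kind>=<name>`. A toy stand-in file showing just that shape (contents are illustrative, not the full generated spec):

```shell
# Write a minimal CDI-spec-shaped file mirroring the generated one.
cat > /tmp/cdi-demo.yaml <<'EOF'
cdiVersion: "0.8.0"
kind: nvidia.com/gpu
devices:
  - name: "0"
  - name: all
EOF
# These two entries would list as nvidia.com/gpu=0 and nvidia.com/gpu=all.
grep 'name:' /tmp/cdi-demo.yaml
```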

Running a Workload with CDI

$ podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi
Thu Oct 31 23:18:49 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070        Off |   00000000:0A:00.0  On |                  N/A |
|  0%   38C    P8              9W /  151W |     482MiB /   8192MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+


Install Virtualenvwrapper and create jax virtualenv

$ sudo dnf install python3-virtualenvwrapper

Add the following to $HOME/.bashrc:

export WORKON_HOME=$HOME/.virtualenvs
source /usr/bin/virtualenvwrapper.sh

Then reload it:

$ source ~/.bashrc

Create virtualenv (Python 3.11, install JAX and TensorFlow)

The default Python on Fedora 41 is 3.13, but the dependency chain (tf-models-official -> PyYAML 5.4.1) requires Python 3.11, so create a 3.11 virtual environment.

$ mkvirtualenv --python 3.11 jax
(jax)$ pip install --upgrade "jax[cuda12]"
(jax)$ pip3 install tf-models-official
(jax)$ pip3 install clu
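mkvirtualenv wraps the virtualenv tool; the same idea can be sketched with the standard-library venv module alone (no virtualenvwrapper; the directory name is illustrative, and this uses whatever python3 is on PATH rather than pinning 3.11):

```shell
# Create a throwaway virtual environment and confirm its interpreter
# reports the environment directory as its prefix.
python3 -m venv /tmp/jax-demo
/tmp/jax-demo/bin/python -c "import sys; print(sys.prefix)"
```

Once JAX is installed in the real environment, `python -c "import jax; print(jax.default_backend())"` should report gpu when the CUDA plugin is active.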