Zenn

Fedora 41 Install Memo

nb.onb.o

Post-Installation Configuration

Rename the folders in the home directory from Japanese to English

$ LANG=C xdg-user-dirs-gtk-update 
Moving DESKTOP directory from デスクトップ to Desktop
Moving DOWNLOAD directory from ダウンロード to Downloads
Moving TEMPLATES directory from テンプレート to Templates
Moving PUBLICSHARE directory from 公開 to Public
Moving DOCUMENTS directory from ドキュメント to Documents
Moving MUSIC directory from 音楽 to Music
Moving PICTURES directory from 画像 to Pictures
Moving VIDEOS directory from ビデオ to Videos

Adding the RPM Fusion Repository

$ sudo dnf install https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
$ sudo dnf upgrade --refresh
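
The `$(rpm -E %fedora)` substitution in the install command expands to the release number (41 on this system), so each URL resolves to a fixed file name. A minimal sketch of that expansion follows. The helper name `rpmfusion_url` is hypothetical, and it takes the release number as an argument so it can be tried on a machine without `rpm`.

```shell
# Hypothetical helper: build the RPM Fusion release-package URL for one
# repository ("free" or "nonfree") and one Fedora release number.
rpmfusion_url() {
    repo=$1
    release=$2
    printf 'https://mirrors.rpmfusion.org/%s/fedora/rpmfusion-%s-release-%s.noarch.rpm\n' \
        "$repo" "$repo" "$release"
}

# On a real system the release number would come from: rpm -E %fedora
rpmfusion_url free 41
rpmfusion_url nonfree 41
```

Printing the URLs first is a cheap way to check them in a browser before handing them to `dnf install`.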

Install NVIDIA Driver

Secure Boot

$ sudo dnf install kmodtool akmods mokutil openssl
$ sudo kmodgenca -a
$ sudo mokutil --import /etc/pki/akmods/certs/public_key.der
input password: 
input password again:
$ systemctl reboot
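
The key enrollment above only matters when Secure Boot is actually enabled. The sketch below shows one way to check, assuming `mokutil --sb-state` prints a line starting with `SecureBoot enabled` on such systems. The helper name `secure_boot_enabled` is hypothetical, and the example feeds it captured text rather than calling `mokutil`.

```shell
# Hypothetical helper: succeed when the text on stdin reports Secure Boot as
# enabled. It expects the "SecureBoot enabled" line that mokutil --sb-state
# prints on such systems.
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}

# Example on captured text; on a real machine:
#   mokutil --sb-state | secure_boot_enabled
if printf 'SecureBoot enabled\n' | secure_boot_enabled; then
    echo "Secure Boot is on: the akmods key must be enrolled"
else
    echo "Secure Boot is off: key enrollment can be skipped"
fi
```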

Installing the drivers

$ sudo dnf update
$ sudo dnf install akmod-nvidia

Wait about 5 minutes for the kernel module to finish building, then reboot.
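
Instead of waiting a fixed time, the build can be polled. The sketch below is a generic retry loop with the hypothetical name `wait_until`. The assumed probe is `modinfo -F version nvidia`, which should print the driver version once the module has been built.

```shell
# Hypothetical helper: run a probe command repeatedly until it succeeds or
# the number of attempts runs out.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = probe command
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}

# Assumed usage (not run here): poll for up to 10 minutes, then reboot.
#   wait_until 60 10 modinfo -F version nvidia && systemctl reboot
```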
After rebooting

For CUDA/NVDEC/NVENC support

$ sudo dnf install xorg-x11-drv-nvidia-cuda
$ nvidia-smi 
Fri Nov  1 08:03:25 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070        Off |   00000000:0A:00.0  On |                  N/A |
|  0%   38C    P8             10W /  151W |     412MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2375      G   /usr/bin/gnome-shell                          209MiB |
|    0   N/A  N/A      3166    C+G   /usr/bin/ptyxis                                29MiB |
|    0   N/A  N/A      3225      G   /usr/bin/Xwayland                              13MiB |
|    0   N/A  N/A      3394      G   ...seed-version=20241031-050108.844000        139MiB |
+-----------------------------------------------------------------------------------------+
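
For scripting, the table is awkward to parse. `nvidia-smi --query-gpu=name,driver_version --format=csv,noheader` prints one comma-separated line per GPU instead. The sketch below extracts the driver version from such a line. The helper name `driver_version_from_csv` is hypothetical, and the sample line is assembled from the values shown in the table.

```shell
# Hypothetical helper: print the driver version from one line of
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
# which separates fields with ", ".
driver_version_from_csv() {
    awk -F', ' '{ print $2 }'
}

# Sample line built from the values in the table above.
printf 'NVIDIA GeForce GTX 1070, 560.35.03\n' | driver_version_from_csv
```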

Install Podman and NVIDIA Container Toolkit

Install NVIDIA Container Toolkit

$ curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
$ sudo dnf install nvidia-container-toolkit

Configuration

$ sudo nvidia-ctk runtime configure --runtime=docker
INFO[0000] Config file does not exist; using empty config 
INFO[0000] Wrote updated config to /etc/docker/daemon.json 
INFO[0000] It is recommended that docker daemon be restarted. 
$ sudo systemctl restart podman

# Rootless
$ nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
INFO[0000] Config file does not exist; using empty config 
INFO[0000] Wrote updated config to /home/nobuo/.config/docker/daemon.json 
INFO[0000] It is recommended that docker daemon be restarted.
$ sudo systemctl restart podman
$ sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place

Generating a CDI specification

$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
INFO[0000] Using /usr/lib64/libnvidia-ml.so.560.35.03   
WARN[0000] Ignoring error in locating libnvidia-sandboxutils.so.1: pattern libnvidia-sandboxutils.so.1 not found
libnvidia-sandboxutils.so.1: not found 
INFO[0000] Auto-detected mode as 'nvml'                 
WARN[0000] Failed to init nvsandboxutils: ERROR_LIBRARY_LOAD; ignoring 
INFO[0000] Selecting /dev/nvidia0 as /dev/nvidia0       
INFO[0000] Selecting /dev/dri/card1 as /dev/dri/card1   
WARN[0000] Could not locate /dev/dri/controlD65: pattern /dev/dri/controlD65 not found 
INFO[0000] Selecting /dev/dri/renderD128 as /dev/dri/renderD128 
INFO[0000] Using driver version 560.35.03               
INFO[0000] Selecting /dev/nvidia-modeset as /dev/nvidia-modeset 
INFO[0000] Selecting /dev/nvidia-uvm-tools as /dev/nvidia-uvm-tools 
INFO[0000] Selecting /dev/nvidia-uvm as /dev/nvidia-uvm 
INFO[0000] Selecting /dev/nvidiactl as /dev/nvidiactl   
INFO[0000] Selecting /usr/lib64/libnvidia-egl-gbm.so.1.1.2 as /usr/lib64/libnvidia-egl-gbm.so.1.1.2 
INFO[0000] Selecting /usr/lib64/libnvidia-egl-wayland.so.1.1.17 as /usr/lib64/libnvidia-egl-wayland.so.1.1.17 
INFO[0000] Selecting /usr/lib64/libnvidia-allocator.so.560.35.03 as /usr/lib64/libnvidia-allocator.so.560.35.03 
WARN[0000] Could not locate libnvidia-vulkan-producer.so.560.35.03: pattern libnvidia-vulkan-producer.so.560.35.03 not found
libnvidia-vulkan-producer.so.560.35.03: not found 
WARN[0000] Could not locate nvidia_drv.so: pattern nvidia_drv.so not found 
WARN[0000] Could not locate libglxserver_nvidia.so.560.35.03: pattern libglxserver_nvidia.so.560.35.03 not found 
INFO[0000] Selecting /usr/share/glvnd/egl_vendor.d/10_nvidia.json as /usr/share/glvnd/egl_vendor.d/10_nvidia.json 
INFO[0000] Selecting /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json as /usr/share/egl/egl_external_platform.d/15_nvidia_gbm.json 
INFO[0000] Selecting /usr/share/egl/egl_external_platform.d/10_nvidia_wayland.json as /usr/share/egl/egl_external_platform.d/10_nvidia_wayland.json 
INFO[0000] Selecting /usr/share/nvidia/nvoptix.bin as /usr/share/nvidia/nvoptix.bin 
WARN[0000] Could not locate X11/xorg.conf.d/10-nvidia.conf: pattern X11/xorg.conf.d/10-nvidia.conf not found 
WARN[0000] Could not locate X11/xorg.conf.d/nvidia-drm-outputclass.conf: pattern X11/xorg.conf.d/nvidia-drm-outputclass.conf not found 
WARN[0000] Could not locate vulkan/icd.d/nvidia_icd.json: pattern vulkan/icd.d/nvidia_icd.json not found
pattern vulkan/icd.d/nvidia_icd.json not found 
WARN[0000] Could not locate vulkan/icd.d/nvidia_layers.json: pattern vulkan/icd.d/nvidia_layers.json not found
pattern vulkan/icd.d/nvidia_layers.json not found 
INFO[0000] Selecting /usr/share/vulkan/implicit_layer.d/nvidia_layers.json as /etc/vulkan/implicit_layer.d/nvidia_layers.json 
INFO[0000] Selecting /usr/lib64/libEGL_nvidia.so.560.35.03 as /usr/lib64/libEGL_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libGLESv1_CM_nvidia.so.560.35.03 as /usr/lib64/libGLESv1_CM_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libGLESv2_nvidia.so.560.35.03 as /usr/lib64/libGLESv2_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libGLX_nvidia.so.560.35.03 as /usr/lib64/libGLX_nvidia.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libcuda.so.560.35.03 as /usr/lib64/libcuda.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libcudadebugger.so.560.35.03 as /usr/lib64/libcudadebugger.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvcuvid.so.560.35.03 as /usr/lib64/libnvcuvid.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-allocator.so.560.35.03 as /usr/lib64/libnvidia-allocator.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-cfg.so.560.35.03 as /usr/lib64/libnvidia-cfg.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-eglcore.so.560.35.03 as /usr/lib64/libnvidia-eglcore.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-encode.so.560.35.03 as /usr/lib64/libnvidia-encode.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-fbc.so.560.35.03 as /usr/lib64/libnvidia-fbc.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-glcore.so.560.35.03 as /usr/lib64/libnvidia-glcore.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-glsi.so.560.35.03 as /usr/lib64/libnvidia-glsi.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-glvkspirv.so.560.35.03 as /usr/lib64/libnvidia-glvkspirv.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-gpucomp.so.560.35.03 as /usr/lib64/libnvidia-gpucomp.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-gtk3.so.560.35.03 as /usr/lib64/libnvidia-gtk3.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-ml.so.560.35.03 as /usr/lib64/libnvidia-ml.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-ngx.so.560.35.03 as /usr/lib64/libnvidia-ngx.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-nvvm.so.560.35.03 as /usr/lib64/libnvidia-nvvm.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-opencl.so.560.35.03 as /usr/lib64/libnvidia-opencl.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-opticalflow.so.560.35.03 as /usr/lib64/libnvidia-opticalflow.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-pkcs11-openssl3.so.560.35.03 as /usr/lib64/libnvidia-pkcs11-openssl3.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-ptxjitcompiler.so.560.35.03 as /usr/lib64/libnvidia-ptxjitcompiler.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-rtcore.so.560.35.03 as /usr/lib64/libnvidia-rtcore.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-tls.so.560.35.03 as /usr/lib64/libnvidia-tls.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-vksc-core.so.560.35.03 as /usr/lib64/libnvidia-vksc-core.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvidia-wayland-client.so.560.35.03 as /usr/lib64/libnvidia-wayland-client.so.560.35.03 
INFO[0000] Selecting /usr/lib64/libnvoptix.so.560.35.03 as /usr/lib64/libnvoptix.so.560.35.03 
INFO[0000] Selecting /usr/lib64/vdpau/libvdpau_nvidia.so.560.35.03 as /usr/lib64/vdpau/libvdpau_nvidia.so.560.35.03 
WARN[0000] Could not locate /nvidia-persistenced/socket: pattern /nvidia-persistenced/socket not found 
WARN[0000] Could not locate /nvidia-fabricmanager/socket: pattern /nvidia-fabricmanager/socket not found 
WARN[0000] Could not locate /tmp/nvidia-mps: pattern /tmp/nvidia-mps not found 
INFO[0000] Selecting /lib/firmware/nvidia/560.35.03/gsp_ga10x.bin as /lib/firmware/nvidia/560.35.03/gsp_ga10x.bin 
INFO[0000] Selecting /lib/firmware/nvidia/560.35.03/gsp_tu10x.bin as /lib/firmware/nvidia/560.35.03/gsp_tu10x.bin 
INFO[0000] Selecting /usr/bin/nvidia-smi as /usr/bin/nvidia-smi 
INFO[0000] Selecting /usr/bin/nvidia-debugdump as /usr/bin/nvidia-debugdump 
INFO[0000] Selecting /usr/bin/nvidia-persistenced as /usr/bin/nvidia-persistenced 
INFO[0000] Selecting /usr/bin/nvidia-cuda-mps-control as /usr/bin/nvidia-cuda-mps-control 
INFO[0000] Selecting /usr/bin/nvidia-cuda-mps-server as /usr/bin/nvidia-cuda-mps-server 
WARN[0000] Could not locate nvidia_drv.so: pattern nvidia_drv.so not found 
WARN[0000] Could not locate libglxserver_nvidia.so.560.35.03: pattern libglxserver_nvidia.so.560.35.03 not found 
INFO[0000] Generated CDI spec with version 0.8.0        
$ nvidia-ctk cdi list
INFO[0000] Found 3 CDI devices                          
nvidia.com/gpu=0
nvidia.com/gpu=GPU-acc29702-e8eb-2c6b-2e6c-6e2aafabb704
nvidia.com/gpu=all
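
Before starting a container, a script can confirm that the expected device name is in the list. This is a sketch with a hypothetical helper `cdi_has_device`, shown on the listing above as sample text rather than a live `nvidia-ctk cdi list` call.

```shell
# Hypothetical helper: succeed when the device name in $1 appears as a whole
# line in the `nvidia-ctk cdi list` output read from stdin.
cdi_has_device() {
    grep -qxF -- "$1"
}

# Sample text copied from the listing above.
cdi_list='nvidia.com/gpu=0
nvidia.com/gpu=GPU-acc29702-e8eb-2c6b-2e6c-6e2aafabb704
nvidia.com/gpu=all'

if printf '%s\n' "$cdi_list" | cdi_has_device 'nvidia.com/gpu=all'; then
    echo "CDI device nvidia.com/gpu=all is available"
fi
```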

Running a Workload with CDI

$ podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi
Thu Oct 31 23:18:49 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070        Off |   00000000:0A:00.0  On |                  N/A |
|  0%   38C    P8              9W /  151W |     482MiB /   8192MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
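
The `--device` value can be any name from the CDI list, for example `nvidia.com/gpu=0` to attach a single GPU by index. The sketch below assembles such a command line with a hypothetical helper `gpu_run_cmd`. It prints the command instead of running it, so it can be tried on a machine without a GPU.

```shell
# Hypothetical helper: print the podman command line for running a command
# in a container with one CDI device attached. Printing instead of executing
# keeps the sketch safe to try on a machine without a GPU.
gpu_run_cmd() {
    device=$1
    image=$2
    shift 2
    echo "podman run --rm --device $device --security-opt=label=disable $image $*"
}

gpu_run_cmd nvidia.com/gpu=0 ubuntu nvidia-smi -L
```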


Install Virtualenvwrapper and create jax virtualenv

$ sudo dnf install python3-virtualenvwrapper

Add the following to $HOME/.bashrc

export WORKON_HOME=$HOME/.virtualenvs
source /usr/bin/virtualenvwrapper.sh
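
If this setup is scripted, the two lines should be appended only once. The sketch below is one way to do that. The helper name `append_once` is hypothetical, and the example writes to a scratch file rather than the real `.bashrc`.

```shell
# Hypothetical helper: append a line to a file only if that exact line is
# not already present, so repeated runs do not duplicate it.
append_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; point rc at "$HOME/.bashrc" for real use.
rc=$(mktemp)
append_once "$rc" 'export WORKON_HOME=$HOME/.virtualenvs'
append_once "$rc" 'source /usr/bin/virtualenvwrapper.sh'
append_once "$rc" 'source /usr/bin/virtualenvwrapper.sh'   # no effect the second time
cat "$rc"
```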

Reload .bashrc

$ source ~/.bashrc

Create virtualenv (Python 3.11, install JAX and TensorFlow)

Fedora 41 ships Python 3.13 by default, but a dependency (tf-models-official -> PyYAML-5.4.1) works with Python 3.11, so create a Python 3.11 virtual environment.
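
`mkvirtualenv --python 3.11` needs a Python 3.11 interpreter to be installed first. The sketch below checks a `--version` string against the wanted release. The helper name `python_is_version` is hypothetical, and the commented usage assumes the Fedora package is named `python3.11`.

```shell
# Hypothetical helper: succeed when a `python --version` string ($1) reports
# the major.minor release given in $2.
python_is_version() {
    case "$1" in
        "Python $2."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed usage (not run here):
#   python_is_version "$(python3.11 --version)" 3.11 || sudo dnf install python3.11
python_is_version 'Python 3.11.10' 3.11 && echo "3.11 interpreter found"
```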

$ mkvirtualenv --python 3.11 jax
(jax)$ pip install --upgrade "jax[cuda12]"
(jax)$ pip3 install tf-models-official
(jax)$ pip3 install clu

Install CUDA 12.8 and cuDNN 9.7.0

Reference

Install CUDA 12.8

$ sudo dnf config-manager addrepo --from-repofile https://developer.download.nvidia.com/compute/cuda/repos/fedora41/x86_64/cuda-fedora41.repo
$ sudo dnf clean all
$ sudo dnf module disable nvidia-driver
$ sudo dnf config-manager setopt cuda-fedora41-$(uname -m).exclude=nvidia-driver,nvidia-modprobe,nvidia-persistenced,nvidia-settings,nvidia-libXNVCtrl,nvidia-xconfig
$ sudo dnf clean all
$ sudo dnf -y install cuda-toolkit-12-8
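
After the toolkit is installed, `nvcc` is usually not on `PATH` until its bin directory is added. The sketch below does that without duplicating an existing entry. It assumes the toolkit lives under /usr/local/cuda-12.8, and the helper name `path_prepend` is hypothetical.

```shell
# Hypothetical helper: print a PATH-style list ($1) with a directory ($2)
# placed first, unless the list already contains that directory.
path_prepend() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "$2${1:+:$1}" ;;
    esac
}

# Assumption: the toolkit is installed under /usr/local/cuda-12.8.
cuda_bin=/usr/local/cuda-12.8/bin
path_prepend '/usr/bin:/bin' "$cuda_bin"
# For real use: PATH=$(path_prepend "$PATH" "$cuda_bin"); export PATH
```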

Install cuDNN 9.7.0

$ sudo dnf config-manager addrepo --from-repofile  https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
$ sudo dnf clean all
$ sudo dnf -y install cudnn

Verify

$ sudo dnf -y install freeimage-devel
$ sudo dnf -y install libcudnn9-samples
$ cd ~
$ cp -R /usr/src/cudnn_samples_v9/ ./
$ cd cudnn_samples_v9/mnistCUDNN/

# Modify lines 83 and 238 of the Makefile

Line 83:
NVCCFLAGS   := -ccbin $(HOST_COMPILER) -m${TARGET_SIZE} -std=c++11
↓
NVCCFLAGS   := -ccbin $(HOST_COMPILER) -m${TARGET_SIZE} -std=c++17

Line 238:
$(EXEC) $(HOST_COMPILER) $(INCLUDES) $(CCFLAGS) $(EXTRA_CCFLAGS) -std=c++11 -o $@ -c $<
↓
$(EXEC) $(HOST_COMPILER) $(INCLUDES) $(CCFLAGS) $(EXTRA_CCFLAGS) -std=c++17 -o $@ -c $<
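
The same change can be made with `sed` instead of editing by hand. This is a sketch, not the author's method: the helper name `bump_cxx_standard` is hypothetical, and it rewrites every `-std=c++11` in the file, which may cover more than lines 83 and 238. It is demonstrated on a scratch file.

```shell
# Replace every -std=c++11 with -std=c++17 in the file named by $1, keeping a
# .bak copy of the original. In sed's basic regular expressions "+" is a
# literal character, so the pattern needs no escaping.
bump_cxx_standard() {
    sed -i.bak 's/-std=c++11/-std=c++17/g' "$1"
}

# Demonstrated on a scratch file holding a shortened stand-in for line 83.
mk=$(mktemp)
printf 'NVCCFLAGS := -ccbin $(HOST_COMPILER) -std=c++11\n' > "$mk"
bump_cxx_standard "$mk"
cat "$mk"
```

For the real file, `bump_cxx_standard Makefile` would be run inside the mnistCUDNN directory. The `.bak` copy allows the edit to be undone.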
$ make clean && make
$ ./mnistCUDNN 
Executing: mnistCUDNN
cudnnGetVersion() : 90700 , CUDNN_VERSION from cudnn.h : 90700 (9.7.0)
Host compiler version : GCC 14.2.1

There are 1 CUDA capable devices on your machine :
device 0 : sms 15  Capabilities 6.1, SmClock 1708.5 Mhz, MemSize (Mb) 8103, MemClock 4004.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.034816 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.038816 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.079680 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.120736 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.130048 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.171776 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.063488 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.067456 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.143072 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.147456 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.164576 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.194560 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.032768 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.036864 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.040960 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.051200 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.075968 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.104320 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.063488 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.067264 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.084992 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.120832 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.135168 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.142144 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 4608 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.036800 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.074496 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.088000 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.099328 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.112640 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.191488 time requiring 4608 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.073600 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.084992 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.118784 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.134144 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.147456 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.272384 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 4608 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.018432 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.058304 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.089088 time requiring 4608 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.090304 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.093184 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.118784 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.077824 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.091136 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.093184 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.114688 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.148320 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.148480 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!
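
The sample prints `Test passed!` once for single precision and once for half precision, so a script can check for both instead of reading the whole log. This is a sketch with a hypothetical helper `mnist_passed`, shown on stand-in text.

```shell
# Hypothetical helper: succeed when the mnistCUDNN output on stdin contains
# exactly two "Test passed!" lines (single precision and half precision).
mnist_passed() {
    [ "$(grep -c '^Test passed!$')" -eq 2 ]
}

# Assumed usage (not run here): ./mnistCUDNN | mnist_passed && echo OK
printf 'Test passed!\nother output\nTest passed!\n' | mnist_passed && echo "both precision runs passed"
```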