
Setting Up a Generative AI PC for the First Time: Enabling GPU Support for PyTorch and TensorFlow


Introduction

I bought my first desktop Windows PC to study generative AI.
In this post, I will walk through the initial setup so that my brand-new PC's GPU can run generative AI workloads, and configure PyTorch and TensorFlow to recognize the GPU!

PC Specs

I wrote in detail about how I came to own this PC in the article below.
https://zenn.dev/yasuna/articles/e574c9154d678d

Initial Setup for Machine Learning Software

First, let's prepare the environment for running generative AI.
This time, we will set up an environment where the machine learning libraries "PyTorch" and "TensorFlow" can be used.

PyTorch Is a Machine Learning Library Developed by Facebook's AI Research Lab

PyTorch is an open-source machine learning library for Python, based on Torch, used in applications such as computer vision and natural language processing. It was initially developed by Facebook's artificial intelligence research group, the AI Research lab (FAIR) (quoted from Wikipedia).

https://pytorch.org/

TensorFlow is a Machine Learning Library Developed by Google

TensorFlow is a software library for machine learning developed by Google and released as open source (quoted from Wikipedia).

https://www.tensorflow.org

I tried setting things up based on the following official documentation for "Installing TensorFlow using pip," but I ran into some difficulties during the installation.
https://www.tensorflow.org/install/pip?hl=ja#windows-wsl2

At first only the CPU was recognized, which had me stuck, but this article by Professor Kaneko of Fukuyama University was extremely helpful, so I proceeded with the setup based on it. I am truly grateful!!!
https://www.kkaneko.jp/tools/wsl/wsl_tensorflow2.html

Configuring Windows Subsystem for Linux 2 (WSL2)

First, let's prepare a shell for executing commands.
TensorFlow supports Windows Subsystem for Linux 2 (WSL2), so we will use it.
I installed Linux (Ubuntu) on Windows by referring to Professor Kaneko's article.
https://www.kkaneko.jp/tools/wsl/wsl2.html

# Successfully introduced Ubuntu to WSL2!
PS C:\Users\USER> wsl -l
Windows Subsystem for Linux Distributions:
Ubuntu (Default)
PS C:\Users\USER> wsl -d Ubuntu uname -r
5.15.90.1-microsoft-standard-WSL2
PS C:\Users\USER>

Checking Hardware Requirements

PyTorch and TensorFlow support NVIDIA graphics cards.
You can achieve good performance by preparing hardware that meets the following GPU requirements.

The following GPU-enabled devices are supported:
NVIDIA® GPU cards with CUDA® architectures 3.5, 5.0, 6.0, 7.0, 7.5, 8.0, or higher.

First, install the NVIDIA drivers.
These are installed on the Windows side.
https://www.nvidia.co.jp/Download/index.aspx?lang=jp

Next, let's check the CUDA version.

My PC has:
GPU: ZOTAC NVIDIA GeForce RTX 4080 (GDDR6X, 16GB)
So it meets the hardware requirements.

You can check GPU information by running the following command in the shell:

nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.120                Driver Version: 537.58       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4080        On  | 00000000:2B:00.0  On |                  N/A |
| 31%   32C    P8              14W / 320W |   1178MiB / 16376MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        23      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+

CUDA Version was 12.2!

Regarding the details of this command, the following article was helpful!
https://qiita.com/miyamotok0105/items/1b34e548f96ef7d40370
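Note that the "CUDA Version" shown by nvidia-smi is the highest CUDA version the installed driver supports, not necessarily an installed CUDA toolkit. As a small illustration (not part of the setup), the driver and CUDA versions can be pulled out of the nvidia-smi banner line with a short Python sketch; the regular expression below is my own assumption about the banner format:

```python
import re

def parse_nvidia_smi_header(line: str):
    """Extract driver and CUDA versions from the first banner line of nvidia-smi."""
    m = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if not m:
        return None
    return {"driver": m.group(1), "cuda": m.group(2)}

# The banner line from the nvidia-smi output above
header = "| NVIDIA-SMI 535.120                Driver Version: 537.58       CUDA Version: 12.2     |"
print(parse_nvidia_smi_header(header))
# → {'driver': '537.58', 'cuda': '12.2'}
```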

Installing PyTorch (When Using GPU)

First, let's install PyTorch.

Since we will be installing the libraries via the pip command, let's make sure Python and pip are available on WSL2.

sudo apt -y update
sudo apt -y install python3-dev python3-pip python3-setuptools

Select the PyTorch installation command appropriate for your environment from the official website below.
https://pytorch.org/

Let's check if it was installed successfully.

# Check the PyTorch version in Python
$ python3 -c "import torch; print( torch.__version__ )"
2.1.0+cu121

Check if the GPU can be used with PyTorch. This is very important.

# Check if GPU is available in PyTorch
$ python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
2.1.0+cu121 True

Let's verify the operation of PyTorch.

$ python3
Python 3.11.6 (main, Oct  2 2023, 13:45:54) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.4083, 0.9996, 0.6138],
        [0.6909, 0.0726, 0.1601],
        [0.4745, 0.9035, 0.0309],
        [0.0565, 0.3769, 0.4328],
        [0.5298, 0.0900, 0.0638]])

PyTorch has been installed!
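Beyond `torch.cuda.is_available()`, it is worth confirming that a computation actually runs on the GPU. The following is a minimal sketch of my own (not from the guides above) that picks the best available device and runs a tiny matrix multiplication on it; it assumes PyTorch is installed and falls back without error if it is not:

```python
def matmul_device_demo():
    """Run a small matmul on the GPU if PyTorch and CUDA are available."""
    try:
        import torch  # assumes PyTorch is installed; handled gracefully if not
    except ImportError:
        return ("torch-not-installed", None)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.rand(4, 4, device=device)
    b = torch.rand(4, 4, device=device)
    c = a @ b  # the result tensor lives on the same device as the inputs
    return (device, tuple(c.shape))

print(matmul_device_demo())
```

If the first element printed is "cuda", the multiplication ran on the GPU.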

Installing TensorFlow (When Using GPU)

Next, I installed TensorFlow.
I continued to follow Professor Kaneko's article for the installation steps.
https://www.kkaneko.jp/tools/wsl/wsl_tensorflow2.html

Let's check if TensorFlow was installed correctly.
Below is the command and the result to confirm if TensorFlow can recognize the GPU.

~$ python3 -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
2023-10-21 22:47:33.485793: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-21 22:47:33.485844: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-21 22:47:33.485871: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-21 22:47:33.489461: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-21 22:47:34.072622: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.1
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
2023-10-21 22:47:34.715020: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:34.835683: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:34.835755: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334747: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334824: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334846: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2023-10-21 22:47:35.334899: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334932: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /device:GPU:0 with 13340 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4080, pci bus id: 0000:2b:00.0, compute capability: 8.9
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 5767509284482495646
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 13988003840
locality {
  bus_id: 1
  links {
  }
}
incarnation: 581585132380973155
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 4080, pci bus id: 0000:2b:00.0, compute capability: 8.9"
xla_global_id: 416903419
]

According to Professor Kaneko's article,

If "device_type: "GPU"" is present, the GPU is being recognized.

So I felt relieved.
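Incidentally, the same check can be done more compactly with `tf.config.list_physical_devices("GPU")`, which returns the list of GPUs TensorFlow can see. A minimal sketch, assuming TensorFlow is installed (and degrading gracefully if it is not):

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if TF is missing."""
    try:
        import tensorflow as tf  # assumes TensorFlow is installed
    except ImportError:
        return None
    return [d.name for d in tf.config.list_physical_devices("GPU")]

gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed")
elif gpus:
    print("GPU recognized:", gpus)  # e.g. ['/physical_device:GPU:0']
else:
    print("No GPU visible to TensorFlow")
```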

Verifying TensorFlow Operation by Running a Python Program

Finally, I will verify the operation of TensorFlow.

$ python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-10-21 22:50:09.166516: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-21 22:50:09.166567: E tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.1
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
>>> hello = tf.constant('Hello, TensorFlow!')
2023-10-21 22:50:10.387277: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.398297: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.398370: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.399660: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.399710: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.399749: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898379: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898451: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2023-10-21 22:50:10.898534: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13340 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4080, pci bus id: 0000:2b:00.0, compute capability: 8.9
>>> print(hello)
tf.Tensor(b'Hello, TensorFlow!', shape=(), dtype=string)

With this, TensorFlow has been successfully installed.
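To go one step beyond the "Hello, TensorFlow!" check, you can run a small computation and ask which device the result was placed on. This sketch is my own addition and assumes TensorFlow is installed; with a visible GPU, `c.device` should name a GPU device:

```python
def tf_matmul_demo():
    """Run a tiny matmul with TensorFlow and report the device it ran on."""
    try:
        import tensorflow as tf  # assumes TensorFlow is installed
    except ImportError:
        return None
    a = tf.random.uniform((4, 4))
    b = tf.random.uniform((4, 4))
    c = tf.matmul(a, b)
    # e.g. '/job:localhost/replica:0/task:0/device:GPU:0' when the GPU is used
    return c.device

result = tf_matmul_demo()
print(result if result else "TensorFlow is not installed")
```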

Summary

Since installing per the official documentation didn't get the GPU working, I searched around and found a highly reliable installation guide.
Professor Kaneko of Fukuyama University, you were a huge help!!!
https://www.fukuyama-u.ac.jp/eng/information-engineering/kaneko-kunihiko/
