Setting Up a Generative AI PC for the First Time: Enabling GPU Support for PyTorch and TensorFlow
Introduction
I bought my first desktop Windows PC to study generative AI.
In this post, I will perform the initial setup so that the GPU of my brand-new PC can run generative AI, and configure PyTorch and TensorFlow to recognize the GPU!
PC Specs
I wrote in detail about the story of how I came across this PC.
Initial Setup for Machine Learning Software
First, let's prepare the environment for running generative AI.
This time, we will set up an environment in which the machine learning libraries "PyTorch" and "TensorFlow" can use the GPU.
PyTorch Is a Machine Learning Library Developed by Meta (Formerly Facebook)
PyTorch is an open-source machine learning library for Python, based on Torch, that is used in computer vision and natural language processing. It was initially developed by Facebook's AI Research lab (FAIR) (quoted from Wikipedia).
TensorFlow is a Machine Learning Library Developed by Google
TensorFlow is a software library for machine learning developed by Google and released as open source (quoted from Wikipedia).
I tried setting things up based on the following official documentation for "Installing TensorFlow using pip," but I ran into some difficulties during the installation.
At first the software only recognized the CPU and I was stuck, but this article by Professor Kaneko of Fukuyama University was extremely helpful, so I will proceed with the setup based on it. I am truly grateful!!!
Configuring Windows Subsystem for Linux 2 (WSL2)
First, let's set up the shell for running commands.
TensorFlow supports Windows Subsystem for Linux 2, so we will use it.
I installed Linux (Ubuntu) on Windows by referring to Professor Kaneko's article.
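If you are setting up WSL2 from scratch, recent versions of Windows can do it with a single command (this is the standard command from Microsoft's WSL documentation; run it in an administrator PowerShell and reboot when prompted):

```shell
# Run in an administrator PowerShell on Windows:
# installs WSL2 and the Ubuntu distribution in one step
wsl --install -d Ubuntu
```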
# Successfully introduced Ubuntu to WSL2!
PS C:\Users\USER> wsl -l
Windows Subsystem for Linux Distributions:
Ubuntu (Default)
PS C:\Users\USER> wsl -d Ubuntu uname -r
5.15.90.1-microsoft-standard-WSL2
PS C:\Users\USER>
Checking Hardware Requirements
PyTorch and TensorFlow support NVIDIA graphics cards.
You can achieve good performance by preparing hardware that meets the following GPU requirements.
The following GPU-enabled devices are supported:
NVIDIA® GPU cards with CUDA® architectures 3.5, 5.0, 6.0, 7.0, 7.5, 8.0, or higher.
First, install the NVIDIA drivers.
These are installed on the Windows side; with WSL2, the Windows driver is shared with Linux, so no separate Linux driver is needed.
Next, let's check the CUDA version.
My PC has:
GPU: ZOTAC NVIDIA GeForce RTX 4080 (GDDR6X-16GB)
So, it meets the hardware requirements.
You can check GPU information by running nvidia-smi in the shell.
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.120 Driver Version: 537.58 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4080 On | 00000000:2B:00.0 On | N/A |
| 31% 32C P8 14W / 320W | 1178MiB / 16376MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 23 G /Xwayland N/A |
+---------------------------------------------------------------------------------------+
The CUDA Version was 12.2! (Note that the version nvidia-smi reports is the newest CUDA version the driver supports, not an installed CUDA toolkit.)
Regarding the details of this command, the following article was helpful!
Installing PyTorch (When Using GPU)
First, let's install PyTorch.
Since we will be installing the libraries via the pip command, let's make sure Python and pip are available on WSL2.
sudo apt -y update
sudo apt -y install python3-dev python3-pip python3-setuptools
Select the PyTorch installation command from the official website below.
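For reference, at the time of writing, the selector on pytorch.org produced a command along these lines for Linux + pip + CUDA 12.1 (treat this as a sketch and copy the exact command for your CUDA version from the site):

```shell
# Install PyTorch wheels built against CUDA 12.1
# (copy the exact command for your environment from pytorch.org)
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```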
Let's check if it was installed successfully.
# Check the PyTorch version in Python
$ python3 -c "import torch; print( torch.__version__ )"
2.1.0+cu121
Check if the GPU can be used with PyTorch. This is very important.
# Check if GPU is available in PyTorch
$ python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
2.1.0+cu121 True
Let's verify the operation of PyTorch.
# Verify PyTorch operation with python3
$ python3
Python 3.11.6 (main, Oct 2 2023, 13:45:54) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.4083, 0.9996, 0.6138],
[0.6909, 0.0726, 0.1601],
[0.4745, 0.9035, 0.0309],
[0.0565, 0.3769, 0.4328],
[0.5298, 0.0900, 0.0638]])
PyTorch has been installed!
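As a minimal sketch of actually using the GPU from PyTorch (assuming the install above; it falls back to the CPU when no GPU is visible, so the same code runs anywhere):

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor on the chosen device and do a small computation there
x = torch.rand(5, 3, device=device)
y = x @ x.T  # (5, 3) @ (3, 5) -> (5, 5)

print(device, y.shape)
```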
Installing TensorFlow (When Using GPU)
Next, I installed TensorFlow.
I continued to follow Professor Kaneko's article for the installation steps.
Let's check if TensorFlow was installed correctly.
Below is the command and the result to confirm if TensorFlow can recognize the GPU.
~$ python3 -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
2023-10-21 22:47:33.485793: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-21 22:47:33.485844: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-21 22:47:33.485871: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-21 22:47:33.489461: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-21 22:47:34.072622: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.1
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
2023-10-21 22:47:34.715020: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:34.835683: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:34.835755: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334747: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334824: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334846: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-10-21 22:47:35.334899: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:47:35.334932: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /device:GPU:0 with 13340 MB memory: -> device: 0, name: NVIDIA GeForce RTX 4080, pci bus id: 0000:2b:00.0, compute capability: 8.9
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 5767509284482495646
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 13988003840
locality {
bus_id: 1
links {
}
}
incarnation: 581585132380973155
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 4080, pci bus id: 0000:2b:00.0, compute capability: 8.9"
xla_global_id: 416903419
]
According to Professor Kaneko's article, if "device_type: "GPU"" appears in the output, the GPU is being recognized.
So I felt relieved.
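As a side note, TensorFlow 2 also provides a shorter GPU check than device_lib; a minimal sketch:

```python
import tensorflow as tf

# Lists the GPUs TensorFlow can see; an empty list means CPU only
gpus = tf.config.list_physical_devices("GPU")
print(gpus)
```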
Verifying TensorFlow Operation by Running a Python Program
Finally, I will verify the operation of TensorFlow.
$ python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-10-21 22:50:09.166516: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-21 22:50:09.166567: E tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.1
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
>>> hello = tf.constant('Hello, TensorFlow!')
2023-10-21 22:50:10.387277: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.398297: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.398370: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.399660: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.399710: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.399749: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898379: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898451: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1977] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-10-21 22:50:10.898534: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:2b:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-21 22:50:10.898569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13340 MB memory: -> device: 0, name: NVIDIA GeForce RTX 4080, pci bus id: 0000:2b:00.0, compute capability: 8.9
>>> print(hello)
tf.Tensor(b'Hello, TensorFlow!', shape=(), dtype=string)
With this, TensorFlow has been successfully installed.
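To confirm that computation works end to end, a tiny calculation is enough; a minimal sketch (TensorFlow places the op on GPU:0 automatically when one is visible):

```python
import tensorflow as tf

# A small matrix multiplication; runs on the GPU automatically if one is available
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.matmul(a, a)

print(b.numpy())  # [[ 7. 10.] [15. 22.]]
```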
Summary
Since the install steps in the official manual didn't get the GPU working, I searched around and found a highly reliable installation guide.
Professor Kaneko of Fukuyama University, you were a huge help!!!