
【GPU】CUDA "sm_" Error handling

Published on 2024/08/01

1. Error message

NVIDIA XXXXXX with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_60 sm_70 sm_75 compute_70 compute_75.

or

AttributeError: module 'torch.amp' has no attribute 'GradScaler'

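This problem sometimes occurs in the Kaggle Docker environment. Both messages indicate that the installed PyTorch build does not match the environment: the first means the build contains no kernels for the GPU's architecture, and the second means the installed PyTorch version does not provide torch.amp.GradScaler.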
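To make the first message easier to read, here is a minimal sketch of the compatibility rule it reflects. The function name `supported_by_build` is hypothetical, and the rule is a simplification (same major architecture, built minor version not newer than the device); it ignores the `compute_XX` entries and is not PyTorch's own implementation.

```python
import re


def supported_by_build(capability, arch_list):
    """Return True if arch_list contains an sm_XY binary the device can run.

    capability: (major, minor) tuple, e.g. (8, 6) for sm_86.
    arch_list:  strings such as "sm_75" or "compute_75".
    Simplified assumption: a binary runs on a device with the same major
    version and an equal or newer minor version.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)[a-z]?", arch)
        if match is None:
            continue  # skip compute_XX entries in this sketch
        built_major, built_minor = divmod(int(match.group(1)), 10)
        if built_major == major and built_minor <= minor:
            return True
    return False


# Values copied from the error message above
arch_list = ["sm_60", "sm_70", "sm_75", "compute_70", "compute_75"]
print(supported_by_build((8, 6), arch_list))  # False
```

On a machine with a GPU, the same function can be fed real values from `torch.cuda.get_device_capability(0)` and `torch.cuda.get_arch_list()`.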

2. How to fix

2.1 Uninstall torch

pip uninstall torch

2.2 Reinstall torch

Install a PyTorch build that matches your GPU:
Visit the PyTorch website and use the install selector to choose the appropriate options for your environment.
For example, the NVIDIA RTX A6000 has compute capability 8.6 (sm_86), so it needs a PyTorch build compiled against CUDA 11.1 or later.
Alternatively, you can install a CUDA-enabled PyTorch build directly by running one of the following commands in your terminal. Each command installs the newest build published for that CUDA version, which for older CUDA versions may be an older PyTorch release:

・For CUDA 11.8:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

・For CUDA 11.7:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

・For CUDA 11.6:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu116
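In a Docker image with several Python interpreters, it can help to run pip through the interpreter that will actually import torch. The sketch below builds the same commands as above for that purpose; `CUDA_TAGS` and `build_install_command` are hypothetical names, and only the three CUDA versions listed in this article are covered.

```python
import sys

# CUDA versions listed above, mapped to their wheel index tags
CUDA_TAGS = {"11.8": "cu118", "11.7": "cu117", "11.6": "cu116"}


def build_install_command(cuda_version):
    """Return the pip install command (as an argument list) for a CUDA version."""
    tag = CUDA_TAGS[cuda_version]  # raises KeyError for versions not listed
    return [
        sys.executable, "-m", "pip", "install",
        "torch", "torchvision", "torchaudio",
        "--index-url", f"https://download.pytorch.org/whl/{tag}",
    ]


# Show the command without running it
print(" ".join(build_install_command("11.8")[1:]))
```

To actually run the installation, pass the returned list to `subprocess.run(command, check=True)`.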

3. Verify the installation

import torch
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))
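The three lines above confirm that a GPU is visible, but they do not show whether the two errors from section 1 are gone. A slightly longer check, sketched below, also prints the architectures the build supports and whether torch.amp.GradScaler exists. The function name `collect_report` and the report keys are hypothetical; the torch calls used are `torch.version.cuda`, `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()`.

```python
def collect_report():
    """Collect facts relevant to the two errors; safe without torch or a GPU."""
    report = {
        "torch_version": None,
        "built_for_cuda": None,
        "cuda_available": False,
        "arch_list": [],
        "devices": [],
        "has_grad_scaler": False,
    }
    try:
        import torch
    except ImportError:
        return report  # torch is not installed in this interpreter

    report["torch_version"] = torch.__version__
    report["built_for_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    # getattr guards against old versions that have no torch.amp module
    report["has_grad_scaler"] = hasattr(getattr(torch, "amp", None), "GradScaler")

    if report["cuda_available"]:
        report["arch_list"] = torch.cuda.get_arch_list()
        for index in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(index)
            name = torch.cuda.get_device_name(index)
            report["devices"].append((name, f"sm_{major}{minor}"))
    return report


for key, value in collect_report().items():
    print(f"{key}: {value}")
```

If the installation is correct, each device's sm_ value should be covered by an entry in arch_list, and has_grad_scaler should be True when your code uses torch.amp.GradScaler.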
