NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver

Note: The demonstration output below comes from a machine running Ubuntu 22.04.2 LTS with an NVIDIA GeForce GT 730 graphics card.

Introduction

nvidia-smi may be unable to communicate with the NVIDIA driver after a restart when no NVIDIA driver is installed, when the driver version does not match the graphics card, or when newly installed system software or a kernel update has broken the existing driver installation. The error message is as follows:

$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
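
Before trying any fix, it can help to confirm that the NVIDIA kernel module really is not loaded. The following is a minimal sketch, not part of the original procedure: nvidia_module_loaded is a hypothetical helper name, and the check assumes the driver's main module is named "nvidia" and appears in /proc/modules when loaded.

```shell
# Sketch: report whether an NVIDIA kernel module is currently loaded.
# nvidia_module_loaded is a hypothetical helper name. It reads a module
# list in /proc/modules format, where the first field is the module name.
nvidia_module_loaded() {
    # $1: file to inspect; defaults to the running kernel's module list
    awk '$1 == "nvidia" { found = 1 } END { exit !found }' "${1:-/proc/modules}"
}

if nvidia_module_loaded; then
    echo "nvidia module is loaded"
else
    echo "nvidia module is NOT loaded for kernel $(uname -r)"
fi
```

If the module is reported as not loaded, the methods below are the ones to try.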

How to fix the “NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver” error

Install the necessary kernel headers

If nvidia-smi worked before and the error appeared after a restart, the likely cause is that Ubuntu upgraded the kernel. Reinstall the kernel headers for the running kernel:

$ sudo apt install linux-headers-`uname -r`

# or, equivalently
$ sudo apt install linux-headers-$(uname -r)

Then run nvidia-smi again; the output may be back to normal. If the error persists, continue with the methods below.
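
To see whether the headers for the running kernel are actually present, a check along these lines may be useful. This is a sketch under stated assumptions: headers_pkg and headers_present are hypothetical helper names, and the check assumes that an installed header package leaves a usable /lib/modules/<release>/build directory, as is usual on Ubuntu.

```shell
# Sketch: check for kernel headers matching a kernel release.
# headers_pkg and headers_present are hypothetical helper names.

# Print the header package name for a release ($1, default: running kernel).
headers_pkg() {
    echo "linux-headers-${1:-$(uname -r)}"
}

# Assumption: installed headers are reachable at /lib/modules/<release>/build.
headers_present() {
    [ -d "/lib/modules/${1:-$(uname -r)}/build" ]
}

if headers_present; then
    echo "headers found for $(uname -r)"
else
    echo "headers missing; try: sudo apt install $(headers_pkg)"
fi
```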

Use DKMS to reinstall NVIDIA driver for the kernel

Sometimes nvidia-smi reports that the driver is missing after a restart because the Linux kernel was upgraded and the previously built NVIDIA module no longer matches the running kernel.

DKMS (Dynamic Kernel Module Support) automatically rebuilds kernel modules after a kernel change so that they fit the new kernel. It allows individual kernel modules to be updated without modifying the entire kernel. Use dkms to reinstall the appropriate driver for the kernel:

$ sudo apt install dkms
$ sudo dkms install -m nvidia -v 470.182.03

$ dkms status nvidia
nvidia/470.182.03, 5.15.0-88-generic, x86_64: installed
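
The status line can also be checked against the running kernel rather than read by eye. The sketch below is only an illustration: dkms_installed_for is a hypothetical helper name, and it assumes a status line contains the kernel release and the word "installed", as in the output shown above.

```shell
# Sketch: check that `dkms status` reports a module as installed for a kernel.
# dkms_installed_for is a hypothetical helper; it reads status lines on stdin.
# $1: kernel release to look for
dkms_installed_for() {
    grep -F -- "$1" | grep -q 'installed'
}

# Demonstration using the status line shown in this article.
sample='nvidia/470.182.03, 5.15.0-88-generic, x86_64: installed'
if echo "$sample" | dkms_installed_for 5.15.0-88-generic; then
    echo "module installed for 5.15.0-88-generic"
fi

# On a real system (not run here):
#   dkms status nvidia | dkms_installed_for "$(uname -r)"
```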

Note: 470.182.03 in the dkms install command is the NVIDIA driver version. If you do not know yours, look in the /usr/src directory for a folder named nvidia-<version>, or query it with the following command:

$ ls /usr/src | grep nvidia
nvidia-470.182.03
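
The version lookup and the dkms command can be combined so the number does not have to be typed by hand. This is a hedged sketch: nvidia_src_version is a hypothetical helper name, it assumes the source folder follows the nvidia-<version> naming shown above, and it picks the highest version if several folders exist. It only prints the command it would run.

```shell
# Sketch: derive the driver version from the nvidia-<version> folder name.
# nvidia_src_version is a hypothetical helper name.
# $1: directory holding the module sources; defaults to /usr/src
nvidia_src_version() (
    for src_dir in "${1:-/usr/src}"/nvidia-*; do
        if [ -d "$src_dir" ]; then
            basename "$src_dir"
        fi
    done |
        sed -n 's/^nvidia-\([0-9][0-9.]*\)$/\1/p' |
        sort -V | tail -n 1
)

ver=$(nvidia_src_version)
if [ -n "$ver" ]; then
    echo "would run: sudo dkms install -m nvidia -v $ver"
else
    echo "no nvidia-<version> directory found in /usr/src"
fi
```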

When you run nvidia-smi again, the familiar output should return.

If it still does not work, the next step is to reinstall the NVIDIA driver. Below are three widely used installation methods for Linux.

3 Ways to Install NVIDIA Driver

1. Install NVIDIA Driver via Command Line on Linux

Step 1: Before installing the driver, make sure to update the package repository. Run the following commands:

$ sudo apt update
$ sudo apt upgrade

Step 2: Search for NVIDIA drivers with the following command. The output lists the available driver packages.

$ apt search nvidia-driver

Step 3: Choose a driver to install from the list of available drivers. The best fit is usually the latest tested proprietary version that supports your GPU.

$ sudo apt install nvidia-driver-470

For this tutorial, we installed nvidia-driver-470, the latest tested proprietary driver for this GPU.
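
The search output can be long, so it may help to reduce it to the driver branch numbers on offer. The following is a sketch only: driver_branches is a hypothetical helper name, and it assumes apt search prints each package as "name/suite version arch" at the start of a line. Note that the highest branch is not necessarily one that supports an older GPU.

```shell
# Sketch: list the nvidia-driver-NNN branch numbers found in `apt search` output.
# driver_branches is a hypothetical helper; it reads the search output on stdin.
# Assumed line shape: nvidia-driver-470/<suite> <version> <arch>
driver_branches() {
    sed -n 's|^nvidia-driver-\([0-9][0-9]*\)/.*|\1|p' | sort -n -u
}

# Only query the package index when apt is available on this machine.
if command -v apt >/dev/null 2>&1; then
    apt search nvidia-driver 2>/dev/null | driver_branches
fi
```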

Step 4: Reboot your machine with the following command:

$ sudo reboot

2. Install NVIDIA Beta Drivers via PPA Repository on Ubuntu

A PPA (Personal Package Archive) lets developers distribute software that is not available in the official Ubuntu repositories. This means you can install the latest beta drivers, at the risk of an unstable system.

To install the latest NVIDIA drivers via the PPA repository, follow these steps:

Step 1: Add the graphics drivers repository to the system with the following command:

$ sudo add-apt-repository ppa:graphics-drivers/ppa

Step 2: To verify which GPU model you are using and to see a list of available drivers, run the following command:

$ ubuntu-drivers devices

Step 3: The output shows your GPU model as well as any available drivers for that specific GPU. To install a specific driver, use the following syntax:

$ sudo apt install nvidia-driver-470

Alternatively, install the recommended driver automatically by running:

$ sudo ubuntu-drivers autoinstall
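
To see which package would be chosen before installing anything, the recommended entry can be pulled out of the ubuntu-drivers devices output. This is a minimal sketch: recommended_driver is a hypothetical helper name, and it assumes the tool marks one line in the form "driver : <package> - ... recommended".

```shell
# Sketch: print the package that `ubuntu-drivers devices` marks as recommended.
# recommended_driver is a hypothetical helper; it reads the tool's output on stdin.
# Assumed line shape: driver   : nvidia-driver-470 - distro non-free recommended
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3 }'
}

# Only query the tool when it is available on this machine.
if command -v ubuntu-drivers >/dev/null 2>&1; then
    ubuntu-drivers devices 2>/dev/null | recommended_driver
fi
```

The printed name can then be passed to sudo apt install, as in the specific-driver command above.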

Step 4: Reboot the machine for the changes to take effect.

3. Install NVIDIA Drivers via Runfile Installers on Linux

Step 1. NVIDIA provides drivers as .run installer packages for Linux distributions on its driver downloads site. Select the .run package for your GPU product.


Step 2. The .run file can be downloaded using wget, as shown in the example below:

$ wget https://us.download.nvidia.com/XFree86/Linux-x86_64/470.223.02/NVIDIA-Linux-x86_64-470.223.02.run

Step 3. Once the .run installer has been downloaded, the NVIDIA driver can be installed:

$ sudo sh NVIDIA-Linux-x86_64-470.223.02.run

Follow the prompts on the screen during the installation. For more advanced options, run the .run installer with the --help option.
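
Keeping the version in a single variable avoids a mismatch between the file that was downloaded and the file that gets installed. The sketch below is an illustration only: runfile_name and runfile_url are hypothetical helper names, and it assumes other driver versions follow the same URL layout as the wget example in Step 2. It prints the commands instead of running them.

```shell
# Sketch: build the runfile name and download URL from one version variable.
# runfile_name and runfile_url are hypothetical helper names.
# Assumption: other versions use the same URL layout as the wget example.
DRIVER_VERSION="470.223.02"   # the version used in this article's example

runfile_name() {
    echo "NVIDIA-Linux-x86_64-$1.run"
}

runfile_url() {
    echo "https://us.download.nvidia.com/XFree86/Linux-x86_64/$1/$(runfile_name "$1")"
}

echo "download: wget $(runfile_url "$DRIVER_VERSION")"
echo "install:  sudo sh $(runfile_name "$DRIVER_VERSION")"
```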

Step 4. Reboot the machine for the changes to take effect.