I’m getting the following error message when I run nvidia-smi: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
I have NVIDIA’s drivers from NVIDIA’s own yum repository (the repository name starts with “cuda-”).
Just today I updated the kernel, and ‘dnf history’ shows the messages from that update.
It looks like ‘dkms’ did compile the (nvidia) kernel modules for the new kernel, and
there was mention of “signed” and a path to a *.mok file. I presume that mokutil
could enroll that certificate in UEFI, and that loading those kernel modules would
then also succeed when Secure Boot is enforcing.
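A minimal sketch of that idea, assuming the certificate DKMS reported is in DER format (which is what mokutil expects) and that the commands are run as root. The function names `run` and `enroll_mok` and the `DRY_RUN` switch are made up for illustration; the certificate path is deliberately left as an argument, so use the path from your own DKMS output.

```shell
#!/usr/bin/env bash
# Sketch: check Secure Boot and queue a signing certificate for MOK enrollment.
# Nothing here is NVIDIA-specific; the helper names are hypothetical.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# enroll_mok CERT: CERT is the certificate path printed by DKMS on your system.
enroll_mok() {
    local cert="$1"
    if [ -z "$cert" ]; then
        echo "usage: enroll_mok /path/to/certificate" >&2
        return 2
    fi
    # Report whether Secure Boot is currently enabled.
    run mokutil --sb-state
    # Queue the certificate. mokutil prompts for a one-time password, and the
    # MOK manager screen asks for it again during the next boot.
    run mokutil --import "$cert"
}
```

Calling `DRY_RUN=1 enroll_mok /path/to/certificate` only prints the two commands. After the real import and a reboot through the MOK manager, `mokutil --test-key` with the same path should report whether the key is enrolled.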
Do you have a link to that repo and instructions on how to use it? Would I have to make sure mokutil is installed, or is it already there? I’ve only installed the minimal version of Rocky.
Thanks
Note: One should not use rpm to install from local RPM files; dnf can install those just fine.
A lot of “guides” haven’t caught up with that fact.
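As a small illustration of that note, here is a sketch that installs a local RPM through dnf, so dependencies are resolved from the enabled repositories. The function name `install_local_rpm` and the `DNF_CMD` override are hypothetical, and the file name in the usage line is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: install a local RPM file with dnf instead of rpm.

# install_local_rpm FILE: FILE is a path to an .rpm file on disk.
install_local_rpm() {
    local rpm_file="$1"
    if [ ! -f "$rpm_file" ]; then
        echo "no such file: $rpm_file" >&2
        return 1
    fi
    # DNF_CMD defaults to dnf; set DNF_CMD="echo dnf" to preview the command.
    ${DNF_CMD:-dnf} install "$rpm_file"
}
```

Typical use, as root, would be `install_local_rpm ./some-package.rpm`, which comes down to running `dnf install ./some-package.rpm`.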
I presume that the installer includes mokutil (and other EFI-related tools) when you do an EFI installation, even with the Minimal Install. Besides, a GUI desktop is not in that either …
Sorry, I’ve never used mokutil, but it is installed on my system. I tried following the instructions given in the NVIDIA link, and they completed successfully, but after a reboot I still get the same problem.
I have been using the NVIDIA drivers successfully with the standard RHEL package. Today, when I upgraded the kernel as part of a system update, the nvidia modules would not build at boot and I had to go back to the old kernel.
Where are the commands to rebuild the module for a newly installed kernel? I do not know whether the new kernel came with headers, etc., but something is wrong.
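One possible way to do that, assuming the driver modules are managed by DKMS (as in the earlier post) and that the kernel-devel tree lives under /usr/src/kernels, as it does on RHEL-family systems. The names `run`, `rebuild_for_kernel`, `DRY_RUN` and `KERNEL_SRC_ROOT` are made up for this sketch; `dkms status` and `dkms autoinstall -k` are the actual DKMS commands.

```shell
#!/usr/bin/env bash
# Sketch: rebuild DKMS-managed modules for one kernel version.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rebuild_for_kernel [KVER]: KVER defaults to the running kernel.
rebuild_for_kernel() {
    local kver="${1:-$(uname -r)}"
    local src_root="${KERNEL_SRC_ROOT:-/usr/src/kernels}"
    # DKMS compiles against the matching kernel-devel tree; stop if it is absent.
    if [ ! -d "$src_root/$kver" ]; then
        echo "kernel-devel for $kver appears to be missing;" \
             "try: dnf install kernel-devel-$kver" >&2
        return 1
    fi
    # Show which modules DKMS knows about and their state per kernel.
    run dkms status
    # Build and install all registered modules for the requested kernel.
    run dkms autoinstall -k "$kver"
}
```

Run as root with the version of the new kernel as the argument, for example the version string that `rpm -q kernel` lists for it. If the build succeeds but the module still does not load with Secure Boot enabled, the signing key is the next thing to check.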
Sorry, I’m not an expert. This is my nvidia-smi output (after going back to the old kernel):
Sun Apr 2 14:32:43 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro K2200        Off  | 00000000:02:00.0  On |                  N/A |
| 51%   65C    P0     3W /  39W |    842MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     11541      G   /usr/libexec/Xorg                 315MiB |
|    0   N/A  N/A     11708      G   /usr/bin/gnome-shell              311MiB |
|    0   N/A  N/A     12484      G   nvidia-settings                     0MiB |
|    0   N/A  N/A     12513      G   /usr/lib64/firefox/firefox        207MiB |
+-----------------------------------------------------------------------------+
uname -a
Linux XXXXX 5.14.0-162.18.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Mar 1 22:02:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux