Rocky 9.3 Nvidia 550 driver fails to install

I have not been able to install version 550 of the nvidia drivers using this procedure:

dnf update
reboot
dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
dnf install nvidia-driver # package contains nvidia-settings and cuda-driver
# add nvidia_drm.modeset=1 to grub permanently in /etc/default/grub
grub2-mkconfig --update-bls-cmdline -o /boot/grub2/grub.cfg # only needed for Rocky 9.3
reboot

After the reboot, nvidia-smi gives this message: NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

The above procedure worked fine a couple of months ago on a different system when the latest nvidia release was 545.

But I am able to get version 545 running:

dnf remove nvidia-driver nvidia-settings cuda-driver # removes the latest driver just installed
dnf module reset nvidia-driver
dnf module install nvidia-driver:545-dkms/default
reboot

nvidia-smi now works and the driver is functional on release 545.

Installing the 550 dkms version with the following did not work either and gave the same result:

dnf module install nvidia-driver:latest-dkms/default

Is there something else that needs to be done to get the latest version running or is this a known issue?

What card do you have?

I guess you have the official nvidia repo:

dnf -y config-manager --add-repo=https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo

This works for me:

dnf -y remove nvidia-driver
dnf -y module reset nvidia-driver
reboot # might not be necessary
dnf -y module install nvidia-driver:latest-dkms
reboot

I usually install using kickstart, but I have to reinstall the nvidia driver after the first boot. ;-(.

Or try appending the following to the kernel command line in grub:

nofb quiet splash=quiet rd.driver.blacklist=nouveau nouveau.modeset=0

followed by

grub2-mkconfig
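
A minimal sketch for checking, after the reboot, whether the nouveau-related arguments actually reached the running kernel. The helper name missing_kernel_args and the variable NOUVEAU_ARGS are made up for illustration, and only the two nouveau arguments from the line above are checked. Note that grub2-mkconfig on its own only prints the generated config; it needs -o /boot/grub2/grub.cfg (plus --update-bls-cmdline on Rocky 9.3, as in the first post) to write it.

```shell
# Arguments to look for; taken from the suggestion above, adjust as needed.
NOUVEAU_ARGS="rd.driver.blacklist=nouveau nouveau.modeset=0"

# Print every argument from NOUVEAU_ARGS that is missing from a kernel
# command line. $1: file holding the command line (default: /proc/cmdline).
missing_kernel_args() {
    # Pad with spaces so that only whole-word matches count.
    cmdline=" $(cat "${1:-/proc/cmdline}") "
    for arg in $NOUVEAU_ARGS; do
        case "$cmdline" in
            *" $arg "*) ;;          # present, nothing to report
            *) echo "$arg" ;;       # absent from the command line
        esac
    done
}

# One way to add the arguments to all installed kernels (run as root):
#   grubby --update-kernel=ALL --args="$NOUVEAU_ARGS"
```

Running missing_kernel_args with no argument after the reboot should print nothing if both arguments are active; anything it prints did not make it onto the kernel command line.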

Well, I thought I had the nvidia repo. This is what I used (I edited the original post to include it for future readers):

dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo

I’ve tested it with a K2200 and P4000. Both adapters do the same thing. Both are listed as supported by nvidia. And both worked fine about two months ago when the 545 release of nvidia was the latest. I also installed both in Win10 to make sure the driver loaded and the adapters worked fully.

The repo definition seems ok.

Note that

dnf list kmod-nvidia-5[45]*

lists many versions, not just the active “stream”, and it does not list a package for every kernel version.
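
A rough sketch of how that list could be narrowed down to one kernel. The function name kmods_for_kernel is hypothetical, and it assumes the kernel version is embedded in the package name the way it is in kmod-nvidia-525.147.05-5.14.0-362.24.1.x86_64; it is a plain substring match, so treat the result as a hint only.

```shell
# Print the lines from stdin that mention the given kernel's version.
# $1: kernel version, e.g. the output of `uname -r`.
# Exit status is grep's: 0 if something matched, 1 if nothing did.
kmods_for_kernel() {
    # Strip the distro/arch suffix:
    #   5.14.0-362.24.1.el9_3.x86_64 -> 5.14.0-362.24.1
    core="${1%%.el[0-9]*}"
    # Fixed-string (not regex) substring match on "-<version>".
    grep -F -- "-$core"
}

# Possible use against the repo or the installed packages:
#   dnf -q list 'kmod-nvidia-*' | kmods_for_kernel "$(uname -r)"
#   rpm -qa 'kmod-nvidia*'      | kmods_for_kernel "$(uname -r)"
```

The streams themselves can be listed with dnf module list nvidia-driver.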


It is a bit hazy to me how dkms “works” with the closed driver. As an example, AlmaLinux released kernel 5.14.0-362.24.2.el9_3.x86_64, which RHEL does not have, and thus NVIDIA has no kmod-nvidia package for it.

The kmod-nvidia-525.147.05-5.14.0-362.24.1.x86_64 puts precompiled modules into

/lib/modules/5.14.0-362.24.1.el9_3.x86_64/extra/drivers/video/

for 5.14.0-362.24.1.el9_3.x86_64
and installation of 5.14.0-362.24.2.el9_3.x86_64 creates symlinks from

/lib/modules/5.14.0-362.24.2.el9_3.x86_64/weak-updates/drivers/video/nvidia/

and the 5.14.0-362.24.2.el9_3.x86_64 kernel is happy with those “24.1” modules.


Overall, installing NVIDIA drivers does not always succeed on the first attempt, and it is not always trivial.

I would check what rpm -qa kmod-nvidia\* shows and whether there are files/valid links in
extra/drivers/video or weak-updates/drivers/video/nvidia.
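
The file/link part of that check could be scripted roughly like this. It is only a sketch: the function name check_nvidia_modules is made up, the two directories are the ones named above, and the nvidia*.ko* file name pattern is an assumption about how the modules are named.

```shell
# Report NVIDIA kernel module files for one kernel version.
# $1: kernel version (default: running kernel)
# $2: modules root   (default: /lib/modules)
# Returns 0 if at least one usable module file or link was found, else 1.
check_nvidia_modules() {
    kver="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    rc=1
    for dir in "$root/$kver/extra/drivers/video" \
               "$root/$kver/weak-updates/drivers/video/nvidia"; do
        [ -d "$dir" ] || continue
        # Relies on word splitting; module paths normally contain no spaces.
        for f in $(find "$dir" \( -type f -o -type l \) -name 'nvidia*.ko*'); do
            if [ -e "$f" ]; then    # -e follows symlinks, so dangling links fail
                echo "ok: $f"
                rc=0
            else
                echo "broken link: $f"
            fi
        done
    done
    return "$rc"
}
```

Calling check_nvidia_modules with no arguments checks the running kernel; rpm -qa kmod-nvidia\* still has to be run separately to see which packages are installed.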


I do have the legacy 470-series driver from RPM Fusion on my Kepler systems, since I got the impression that the current series no longer supports Kepler. (Might not be true for all Keplers though.)


ELRepo now builds the NVIDIA module for el9. They were solid with el7, so perhaps it is less hassle?

Unfortunately I cannot really help you, but I can at least confirm that I have the same issue, even on a fresh install of 9.3. Installing 545 also works for me and is my current workaround. I would be happy if someone here has a solution.

@jlehtone thanks for the info. While I’ve been administering headless Linux servers since Red Hat 3, I can count on one hand the number of workstations I’ve set up that required nvidia drivers, so it’s somewhat of a black art to me. Every bit of info helps me see the big picture. Since I have a working setup with 545 I can’t spend any more time on this one, but hopefully I can find a resolution before the next build. Thanks for the input!

@smith good to know! Maybe an nvidia developer will see this and take a look.

What I did not mention is that we still use “3D hardware quad-buffered stereo” a bit. It includes an (IR or radio) emitter that sends a sync signal to active shutter glasses. On version 530 the emitter did not activate, so we reverted to the 525-series drivers, i.e. switched to the nvidia-driver:525-dkms stream. If/when NVIDIA stops building 525 for new kernels, … will be fascinating.