Problems installing rpmfusion Nvidia drivers

Hi all,

I’ve been trying to install the Nvidia drivers from rpmfusion and just can’t get them to work. I’m running Rocky 9.3 and have followed this post:

dnf update -y
reboot
dnf install akmod-nvidia

I waited until the installation process seemed to have completed (judging by “top”). After a reboot, though, the nvidia driver didn’t work; there were messages about falling back to nouveau.

# rpm -q akmod-nvidia kernel
akmod-nvidia-545.29.06-1.el9.x86_64
kernel-5.14.0-362.18.1.el9_3.x86_64
# uname -r
5.14.0-362.18.1.el9_3.x86_64

I read in another post that one could try running “akmods --force”, which I did, but that didn’t solve the problem.

I then tried to load the module manually, and this is where it gets interesting:

# modprobe nvidia
modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.14.0-362.18.1.el9_3.x86_64

The module isn’t in that directory but it can be found in another location: /lib/modules/5.14.0-362.8.1.el9_3.x86_64/extra/nvidia/nvidia.ko.xz

This is from a slightly older kernel version. The directory for the kernel I’m running doesn’t have the extra/nvidia subdirectories at all. I moved that directory somewhere else, uninstalled the older kernel package and ran “akmods --force” again in case the module had just ended up under the wrong version, but that didn’t result in any changes I could see. Is there more logging output available than just the result of akmods, which is:

Checking kmods exist for 5.14.0-362.18.1.el9_3.x86_64 [  OK  ]
Building and installing nvidia-kmod [  OK  ]

Where did the module go? Should I try renaming the modules folder from the old kernel version to the new one in case these are compatible? Could it be a problem with the nouveau driver being around? It’s blacklisted on the boot command line (rd.driver.blacklist=nouveau modprobe.blacklist=nouveau) as far as I can see, but it seems to be loaded according to lsmod.
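In case it helps anyone checking the same thing, here is a minimal sketch of how the blacklist part could be verified. The function name cmdline_blacklists_nouveau is made up for illustration, and it only matches the two parameters exactly as quoted above (no comma-separated lists).

```shell
# Hypothetical helper: succeeds only if the given kernel command line
# contains both blacklist parameters exactly as quoted in the post.
# It does not handle comma-separated values such as blacklist=nouveau,foo.
cmdline_blacklists_nouveau() {
    case " $1 " in
        *" rd.driver.blacklist=nouveau "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" modprobe.blacklist=nouveau "*) ;;
        *) return 1 ;;
    esac
}
```

It could be called as cmdline_blacklists_nouveau "$(cat /proc/cmdline)" && echo "blacklisted on the command line". Whether the module is loaded anyway can be seen with lsmod | grep -w nouveau.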

Check the logs in /var/log/akmods; they might contain a clue as to what’s happening.
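A quick way to pull the relevant lines out of those logs could look like the sketch below. The function name is hypothetical, and no particular file names inside /var/log/akmods are assumed; it simply searches everything under that directory for a kernel version string.

```shell
# Hypothetical helper: print every log line that mentions the given
# kernel version. $1 is the kernel version, $2 an optional log directory
# (defaults to /var/log/akmods, the location mentioned above).
akmods_log_lines_for_kernel() {
    grep -rh -F -- "$1" "${2:-/var/log/akmods}"
}
```

For the running kernel that would be akmods_log_lines_for_kernel "$(uname -r)", most likely run as root so the logs are readable.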

Thank you! Your hint eventually led me to the solution. Long story short: dnf thought kmod-nvidia was already installed so akmods --force didn’t finish (but also didn’t display an error).

Step-by-step solution in case somebody else stumbles upon this problem:

The akmods log file just contained:

2024/02/09 10:20:37 akmods: Building RPM using the command '/sbin/akmodsbuild --kernels 5.14.0-362.18.1.el9_3.x86_64 /usr/src/akmods/nvidia-kmod.latest'
2024/02/09 10:21:26 akmods: Installing newly built rpms
2024/02/09 10:21:26 akmods: DNF detected
2024/02/09 10:22:46 akmods: Successful.

But by looking at the akmods script itself, I found out that the generated rpm package ended up in /var/cache/akmods/nvidia/ alongside a log file, and that log contained:

Package kmod-nvidia-5.14.0-362.el9_3-3:545.29.06-1.el9.x86_64 is already installed.
Dependencies resolved.
Nothing to do.
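For anyone hunting for the same message, a small sketch of how that directory could be searched is below. The function name is made up, and the names of the log files in the cache directory are not assumed; it greps every file under the directory for dnf's "is already installed" text.

```shell
# Hypothetical helper: show "file:line" for every line under the akmods
# cache directory that contains dnf's "is already installed" message.
# $1 is an optional directory (defaults to /var/cache/akmods/nvidia).
find_already_installed() {
    grep -rH -F -- 'is already installed' "${1:-/var/cache/akmods/nvidia}"
}
```

If it prints anything, the package named in the matching line is the one dnf considers installed, which would explain a "Nothing to do." result like the one above.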

RPM also showed me that kmod-nvidia was installed, but DNF wasn’t able to remove it using “remove kmod-nvidia”. Only when I told it to remove kmod-nvidia-5.14.0… (the full version string) did it remove the package, which it claimed had been installed from rpmfusion.
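To find the full package name to hand to dnf, something along these lines might work. It is only a sketch: list_kmod_nvidia is a hypothetical helper that filters a package list read from stdin, so it assumes the list comes from something like rpm -qa.

```shell
# Hypothetical helper: reads package names on stdin (e.g. from "rpm -qa")
# and keeps only kmod-nvidia packages, including kernel-specific ones
# whose name continues after "kmod-nvidia-". akmod-nvidia does not match.
list_kmod_nvidia() {
    grep -E '^kmod-nvidia(-|$)'
}
```

Running rpm -qa | list_kmod_nvidia prints the full names, and one of those can then be passed to dnf remove in place of the short name.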

I know I had installed both kmod-nvidia and akmod-nvidia, removed them and reinstalled them during my efforts to get the Nvidia driver going (and I didn’t know the difference between kmod and akmod).

Eventually, once the kmod-nvidia package was gone, akmods --force rebuilt and installed the right package. Now I have the nvidia kernel module in /lib/modules/$(uname -r), and the driver loaded after a reboot.
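A rough way to confirm that the module really exists for a given kernel, rather than only for an older one, is sketched below. The function name is hypothetical; it just looks for a file named nvidia.ko (with any compression suffix) under that kernel's modules directory.

```shell
# Hypothetical helper: print the path of the nvidia module built for a
# given kernel and fail if there is none. $1 is the kernel version,
# $2 an optional modules root (defaults to /lib/modules).
nvidia_module_for_kernel() {
    find "${2:-/lib/modules}/$1" -type f -name 'nvidia.ko*' 2>/dev/null | grep .
}
```

For the running kernel: nvidia_module_for_kernel "$(uname -r)" || echo "no nvidia module for this kernel". After a reboot, lsmod | grep nvidia shows whether it was actually loaded.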