Which repo to use for nvidia and cuda drivers

I would like to know the correct method or repos to use to install GPU drivers with CUDA support. Until now I have been installing GPU drivers from the ELRepo repository, using three packages: kmod-nvidia, nvidia-x11-drv, and nvidia-x11-drv-libs.

There is also a CUDA repository and an NVIDIA repository.
As a test, I installed drivers from an NVIDIA repository, using the packages nvidia-drivers and nvidia-driver-libs.

But I'm a bit confused about the difference between the CUDA and NVIDIA repositories. Which one should I really be using on Rocky 9.3?

I know nvidia-smi shows me a CUDA version, but that is only the highest CUDA version the installed driver supports, not the version actually installed.

ls -l /usr/local | grep cuda shows me the version actually installed. Does this CUDA toolkit get installed separately, or is it bundled with the ELRepo NVIDIA drivers?
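The two version numbers in the question can be put side by side. This is a minimal sketch, assuming the toolkit is installed under /usr/local/cuda-* as the ls command above suggests; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: compare the CUDA version reported by the driver with the toolkit(s) on disk.
# Function names are hypothetical.

# Highest CUDA version the loaded driver supports, parsed from the nvidia-smi banner.
driver_cuda_version() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found"
        return 0
    fi
    nvidia-smi 2>/dev/null | sed -n 's/.*CUDA Version: *\([0-9.]*\).*/\1/p' | head -n 1
}

# Toolkit directories actually present under a prefix (default: /usr/local).
installed_cuda_toolkits() {
    local prefix="${1:-/usr/local}"
    local d
    for d in "$prefix"/cuda-*; do
        if [ -d "$d" ]; then
            basename "$d"
        fi
    done
    return 0
}

echo "driver supports CUDA: $(driver_cuda_version)"
echo "toolkits installed:   $(installed_cuda_toolkits | tr '\n' ' ')"
```

If the second line comes back empty, no toolkit is installed under /usr/local, whatever nvidia-smi prints.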

First, ELRepo is awesome, but they chose not to build proprietary NVIDIA packages for EL9. That is understandable, since there are other repos that do. One of those is RPM Fusion.

The only repos that provide CUDA (that I know of) are NVIDIA's own. They hold both the GPU driver and CUDA, so the two should definitely be a “matching pair”.

For that reason I have this repo defined (the EPEL repo is required too):

$ cat /etc/yum.repos.d/cuda-rhel9.repo 
[cuda-rhel9-x86_64]
baseurl = https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64
enabled = 1
gpgcheck = 1
gpgkey = https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/D42D0685.pub
name = cuda-rhel9-x86_64

I seem to install these packages:

for_driver:
- nvidia-driver
- nvidia-kmod-common
- nvidia-modprobe
- nvidia-settings
- nvidia-xconfig
- dnf-plugin-nvidia
- kernel-devel-matched
- dkms
for_cuda:
- cuda-runtime-11-8
- cuda-toolkit-11-8

I actually use a non-latest CUDA and module stream (the stream was set before installing the packages):

dnf module reset nvidia-driver
dnf module enable nvidia-driver:525-dkms

One could instead select the stream nvidia-driver:latest-dkms and see which versions the repository offers.
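The order of operations described above (stream first, then driver, then CUDA) can be sketched as a small script. It assumes the cuda-rhel9 repo file and EPEL are already in place. The DRY_RUN switch, the run helper, and the function name are my own additions for illustration, as is the -y flag; the package names and the 525-dkms stream are taken from the post.

```shell
#!/usr/bin/env bash
# Sketch only. DRY_RUN=1 (the default here) prints the commands instead of running them;
# set DRY_RUN=0 and run as root to execute them for real.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_nvidia_cuda() {
    local stream="${1:-525-dkms}"   # module stream, as in the post above
    local cuda="${2:-11-8}"         # CUDA package suffix, as in the post above

    # 1. Pick the driver branch before installing anything.
    run dnf -y module reset nvidia-driver
    run dnf -y module enable "nvidia-driver:${stream}"

    # 2. Driver packages from the list above.
    run dnf -y install nvidia-driver nvidia-kmod-common nvidia-modprobe \
        nvidia-settings nvidia-xconfig dnf-plugin-nvidia kernel-devel-matched dkms

    # 3. CUDA runtime and toolkit.
    run dnf -y install "cuda-runtime-${cuda}" "cuda-toolkit-${cuda}"
}

setup_nvidia_cuda "$@"
```

Run without arguments, it prints the four dnf commands for the 525-dkms stream and CUDA 11.8; passing latest-dkms as the first argument would select the other stream mentioned above.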

Thank you for this. I have a question about downgrading drivers for older cards. The 470 driver is available from ELRepo for Rocky 8, but since ELRepo has no drivers for Rocky 9, I will have to use the NVIDIA repo. Where do I get older drivers for Rocky 9? We have some systems that need older GPU drivers. Certain applications don’t like newer drivers, which cause some tools in the app to slow down. This is especially true for the 3DEqualizer app.

I would take a look at the guide here on the forums, which uses RPM Fusion. RPM Fusion is recommended over the NVIDIA repos if you need an older driver.

See the rpmfusion wiki for more information on the drivers that can be installed. You can also view this page for CUDA information.


So I got my Rocky 9.3 working with the NVIDIA repository; the relevant NVIDIA and CUDA drivers get installed. With the RPM Fusion nonfree-updates repo, however, it does not work. I have the packages below installed, but /proc/driver/nvidia does not exist and nvidia-smi does not see the driver either.

I installed:
kmod-nvidia
nvidia-driver
nvidia-kmod-common
nvidia-settings
nvidia-xconfig

You may have the kernel module(s), but are they loaded? If a module exists but is not loaded, then why?

lsmod | grep nvidia
modinfo nvidia
dmesg -T | grep -i nvidia

lsmod | grep nvidia gives no output.

modinfo nvidia
modinfo: ERROR: Module nvidia not found.

dmesg -T | grep -i nvidia
nouveau 0000:8b:00.0: NVIDIA TU104 (164000a1)
input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:8a/0000:8a:00.0/0000:8b:00.1/sound/card1/input12
input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:8a/0000:8a:00.0/0000:8b:00.1/sound/card1/input13
input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:8a/0000:8a:00.0/0000:8b:00.1/sound/card1/input14
input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:8a/0000:8a:00.0/0000:8b:00.1/sound/card1/input15

modprobe nvidia
modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.14.0-362.18.1.el9_3.x86_64

Am I missing some RPMs that did not get installed?
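The dmesg output above shows nouveau claiming the card at 0000:8b:00.0. One way to confirm which kernel driver currently owns a PCI device is to read its driver symlink in sysfs. A small sketch, with a hypothetical function name and an optional second argument that exists only so the function can be tried against a fake tree:

```shell
# Print the kernel driver bound to a PCI device, e.g. "nouveau" or "nvidia".
# Prints "none" if no driver is bound or the device does not exist.
pci_bound_driver() {
    local bdf="$1"                  # PCI address, e.g. 0000:8b:00.0
    local sysroot="${2:-/sys}"      # overridable for testing
    local link="$sysroot/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Address taken from the dmesg output above.
pci_bound_driver 0000:8b:00.0
```

If this prints nouveau, the proprietary module is either not built or not being picked up, which matches the modprobe error.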

# lsmod | grep nvidia
nvidia_drm             86016  3
nvidia_modeset       1212416  5 nvidia_drm
nvidia              35627008  189 nvidia_modeset
video                  73728  1 nvidia_modeset
drm_kms_helper        245760  1 nvidia_drm
drm                   704512  7 drm_kms_helper,nvidia,nvidia_drm
# modinfo nvidia | head -2
filename:       /lib/modules/5.14.0-362.8.1.el9_3.x86_64/extra/nvidia-470xx/nvidia.ko.xz
firmware:       nvidia/470.223.02/gsp.bin
# rpm -qf /lib/modules/5.14.0-362.8.1.el9_3.x86_64/extra/nvidia-470xx/nvidia.ko.xz
kmod-nvidia-470xx-5.14.0-362.el9_3-470.223.02-1.el9.x86_64
# dnf --enablerepo=rpmfus*updates list kmod-nvidia-470xx\*
Installed Packages
kmod-nvidia-470xx-5.14.0-284.el9_2.x86_64         3:470.182.03-1.el9         @@commandline            
kmod-nvidia-470xx-5.14.0-284.el9_2.x86_64         3:470.223.02-1.el9         @@commandline            
kmod-nvidia-470xx-5.14.0-362.el9_3.x86_64         3:470.223.02-1.el9         @@commandline            
Available Packages
kmod-nvidia-470xx.x86_64                          3:470.223.02-1.el9         rpmfusion-nonfree-updates

The kernel module is from package kmod-nvidia-470xx-5.14.0-362.el9_3, but RPM Fusion only has the package kmod-nvidia-470xx. The kmod-nvidia-470xx-5.14.0-362.el9_3 package must have been generated locally, by dkms, akmods, or whatever system RPM Fusion uses.

Your installation has failed to do that.


In fact, for installing the RPM Fusion 470xx driver, I had installed only two packages explicitly:

xorg-x11-drv-nvidia-470xx
akmod-nvidia-470xx
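With the akmod variant the kernel module is built locally, so it is worth confirming that a build exists for the running kernel before rebooting. A rough sketch, assuming RPM Fusion nonfree is already enabled; the helper name is hypothetical, and the second argument exists only so the check can be pointed at a test directory.

```shell
# Report whether an nvidia kernel module exists for a given kernel version.
nvidia_module_for_kernel() {
    local kver="${1:-$(uname -r)}"
    local root="${2:-/lib/modules}"   # overridable for testing
    local found
    found=$(find "$root/$kver" -name 'nvidia.ko*' 2>/dev/null | head -n 1)
    if [ -n "$found" ]; then
        echo "found: $found"
    else
        echo "missing for $kver"
        return 1
    fi
}

# Typical sequence (as root), left as comments because it changes the system:
#   dnf install xorg-x11-drv-nvidia-470xx akmod-nvidia-470xx
#   akmods --force               # rebuild the kmod if it was not built automatically
#   nvidia_module_for_kernel     # should print a path ending in nvidia.ko.xz
```

A "missing" result for the running kernel corresponds to the "Module nvidia not found" error seen earlier in the thread.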

Is there a way to install gdm without xorg-x11-server-Xwayland? I don't want Wayland to start when I install the NVIDIA GPU drivers.

On the login dialog (gdm), after you have selected a user and can type the password, there is a cog wheel in the bottom right corner. There you can choose the session type. GNOME (Wayland) is the default, but there should be GNOME (X11) too, and the selection should be stored (in the home dir) for future sessions.

Or edit the config, along the lines of “Configuring Xorg as the default GNOME session” in the Fedora Docs.
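For the config route, the usual change is to set WaylandEnable=false in the [daemon] section of /etc/gdm/custom.conf. The sketch below assumes the stock file still ships that line commented out as #WaylandEnable=false; the function name is hypothetical, and it returns non-zero if the line is not there to uncomment.

```shell
# Uncomment "WaylandEnable=false" in a GDM config file (default: /etc/gdm/custom.conf).
# Returns non-zero if the setting is not active afterwards.
gdm_disable_wayland() {
    local conf="${1:-/etc/gdm/custom.conf}"
    # GNU sed in-place edit: strip the leading "#" from the commented-out setting.
    sed -i 's/^#[[:space:]]*WaylandEnable=false/WaylandEnable=false/' "$conf"
    grep -q '^WaylandEnable=false' "$conf"
}

# Usage (as root), then restart gdm or reboot:
#   gdm_disable_wayland
```

This changes the login screen for everyone on the machine, whereas the cog wheel choice is per user.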
