Another noob another Nvidia topic

Hi, am n00b.

Spent a total of 36 hours over the last three days trying to install nvidia drivers on RockyLinux 8.5 and 8.6.

Followed the guides listed below, which all ended the same way: black screen at boot. Managed to uninstall the nvidia drivers with alt+f2 f3 f4 and return to nouveau, but never got the nvidia drivers to actually work.

Purged the os and reinstalled 8.5 multiple times. Upgraded from 8.5 → 8.6 multiple times. Tried to install drivers with and without running “sudo dnf update”.
Tested with and without Secure Boot enabled in BIOS.
Tried to install drivers on other distros like Ubuntu and Pop!_OS, yet everything resulted in a black screen at boot.

At the point of throwing in the towel and crawling back to windows… :frowning:

Send help.

gpu: gtx 980 ti
drivers tested:

  • 470.129.06
  • 510.73.05
  • 515.43.04

I’m only allowed to post two links sadly, but I’ve filtered through about every video on youtube and first 5 pages on google related to the issue.

Guide 01
Guide 02

More sources that yielded the same result and issue.

Guide 03

Guide 04 (rocky related rpmfusion)

You did fail with ELRepo’s packages too?

Yeah, I failed and got a black screen at boot with ELRepo too. I followed your steps from this post, which was very helpful (thank you), but in the end it did not work out.

So then the question is: What goes wrong?

The first that I would look at is X11’s log-file: /var/log/Xorg.0.log

Good question! I’ll read through it and try to figure it out, thanks for the suggestion. Dumping the contents of the Xorg.0.log file below if anyone’s interested.

[   100.523] (--) Log file renamed from "/var/log/Xorg.pid-2418.log" to "/var/log/Xorg.0.log"
[   100.524] 
X.Org X Server 1.20.11
X Protocol Version 11, Revision 0
[   100.524] Build Operating System:  4.18.0-348.20.1.el8_5.x86_64 
[   100.524] Current Operating System: Linux n00dleBox 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 x86_64
[   100.524] Kernel command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-372.9.1.el8.x86_64 root=/dev/mapper/rl-root ro crashkernel=auto resume=/dev/mapper/rl-swap rd.lvm.lv=rl/root rd.lvm.lv=rl/swap rhgb quiet nouveau.modeset=0 rd.driver.blacklist=nouveau plymouth.ignore-udev
[   100.524] Build Date: 13 April 2022  04:39:44AM
[   100.524] Build ID: xorg-x11-server 1.20.11-5.el8 
[   100.524] Current version of pixman: 0.38.4
[   100.524]     Before reporting problems, check http://wiki.x.org
    to make sure that you have the latest version.
[   100.524] Markers: (--) probed, (**) from config file, (==) default setting,
    (++) from command line, (!!) notice, (II) informational,
    (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   100.524] (==) Log file: "/var/log/Xorg.0.log", Time: Thu Jun  2 12:57:36 2022
[   100.524] (==) Using config file: "/etc/X11/xorg.conf"
[   100.524] (==) Using config directory: "/etc/X11/xorg.conf.d"
[   100.524] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   100.524] (==) No Layout section.  Using the first Screen section.
[   100.524] (==) No screen section available. Using defaults.
[   100.524] (**) |-->Screen "Default Screen Section" (0)
[   100.524] (**) |   |-->Monitor "<default monitor>"
[   100.524] (==) No device specified for screen "Default Screen Section".
    Using the first device section listed.
[   100.524] (**) |   |-->Device "Videocard0"
[   100.524] (==) No monitor specified for screen "Default Screen Section".
    Using a default monitor configuration.
[   100.525] (==) Automatically adding devices
[   100.525] (==) Automatically enabling devices
[   100.525] (==) Automatically adding GPU devices
[   100.525] (==) Automatically binding GPU devices
[   100.525] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   100.525] (==) FontPath set to:
    catalogue:/etc/X11/fontpath.d,
    built-ins
[   100.525] (==) ModulePath set to "/usr/lib64/xorg/modules"
[   100.525] (II) The server relies on udev to provide the list of input devices.
    If no devices become available, reconfigure udev or disable AutoAddDevices.
[   100.525] (II) Loader magic: 0x564b6a40b020
[   100.525] (II) Module ABI versions:
[   100.525]     X.Org ANSI C Emulation: 0.4
[   100.525]     X.Org Video Driver: 24.1
[   100.525]     X.Org XInput driver : 24.1
[   100.525]     X.Org Server Extension : 10.0
[   100.525] (++) using VT number 1

[   100.527] (II) systemd-logind: took control of session /org/freedesktop/login1/session/c8
[   100.528] (II) xfree86: Adding drm device (/dev/dri/card0)
[   100.528] (II) Platform probe for /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:08.0/0000:06:00.0/drm/card0
[   100.528] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 12 paused 0
[   100.536] (--) PCI:*(6@0:0:0) 10de:17c8:3842:4992 rev 161, Mem @ 0xfa000000/16777216, 0xc0000000/268435456, 0xd0000000/33554432, I/O @ 0x0000e000/128, BIOS @ 0x????????/524288
[   100.536] (II) LoadModule: "glx"
[   100.536] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
[   100.538] (II) Module glx: vendor="X.Org Foundation"
[   100.538]     compiled for 1.20.11, module version = 1.0.0
[   100.538]     ABI class: X.Org Server Extension, version 10.0
[   100.538] (II) LoadModule: "nvidia"
[   100.538] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[   100.538] (II) Module nvidia: vendor="NVIDIA Corporation"
[   100.538]     compiled for 1.6.99.901, module version = 1.0.0
[   100.538]     Module class: X.Org Video Driver
[   100.538] (II) NVIDIA dlloader X Driver  470.129.06  Thu May 12 22:49:34 UTC 2022
[   100.538] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[   100.538] (II) systemd-logind: releasing fd for 226:0
[   100.539] (II) Loading sub module "fb"
[   100.539] (II) LoadModule: "fb"
[   100.539] (II) Loading /usr/lib64/xorg/modules/libfb.so
[   100.539] (II) Module fb: vendor="X.Org Foundation"
[   100.539]     compiled for 1.20.11, module version = 1.0.0
[   100.539]     ABI class: X.Org ANSI C Emulation, version 0.4
[   100.539] (II) Loading sub module "wfb"
[   100.539] (II) LoadModule: "wfb"
[   100.539] (II) Loading /usr/lib64/xorg/modules/libwfb.so
[   100.540] (II) Module wfb: vendor="X.Org Foundation"
[   100.540]     compiled for 1.20.11, module version = 1.0.0
[   100.540]     ABI class: X.Org ANSI C Emulation, version 0.4
[   100.540] (II) Loading sub module "ramdac"
[   100.540] (II) LoadModule: "ramdac"
[   100.540] (II) Module "ramdac" already built-in
[   100.540] (II) NVIDIA(0): Creating default Display subsection in Screen section
    "Default Screen Section" for depth/fbbpp 24/32
[   100.540] (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
[   100.540] (==) NVIDIA(0): RGB weight 888
[   100.540] (==) NVIDIA(0): Default visual is TrueColor
[   100.540] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[   100.540] (**) NVIDIA(0): Enabling 2D acceleration
[   100.540] (II) Loading sub module "glxserver_nvidia"
[   100.540] (II) LoadModule: "glxserver_nvidia"
[   100.541] (II) Loading /usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so
[   100.548] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[   100.548]     compiled for 1.6.99.901, module version = 1.0.0
[   100.548]     Module class: X.Org Server Extension
[   100.548] (II) NVIDIA GLX Module  470.129.06  Thu May 12 22:46:43 UTC 2022
[   100.548] (II) NVIDIA: The X server supports PRIME Render Offload.
[   101.276] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:6:0:0.  Please
[   101.276] (EE) NVIDIA(GPU-0):     check your system's kernel log for additional error
[   101.276] (EE) NVIDIA(GPU-0):     messages and refer to Chapter 8: Common Problems in the
[   101.276] (EE) NVIDIA(GPU-0):     README for additional information.
[   101.276] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA graphics device!
[   101.276] (EE) NVIDIA(0): Failing initialization of X screen
[   101.276] (II) UnloadModule: "nvidia"
[   101.276] (II) UnloadSubModule: "glxserver_nvidia"
[   101.276] (II) Unloading glxserver_nvidia
[   101.276] (II) UnloadSubModule: "wfb"
[   101.276] (II) UnloadSubModule: "fb"
[   101.276] (EE) Screen(s) found, but none have a usable configuration.
[   101.276] (EE) 
Fatal server error:
[   101.276] (EE) no screens found(EE) 
[   101.276] (EE) 
Please consult the The X.Org Foundation support 
     at http://wiki.x.org
 for help. 
[   101.276] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   101.276] (EE) 
[   101.277] (EE) Server terminated with error (1). Closing log file.

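Note how X11 kindly marks messages: (II) informational, (WW) warning, (EE) error.

Therefore, the first (EE) lines draw attention: “Failed to initialize the NVIDIA GPU at PCI:6:0:0. Please check your system’s kernel log for additional error messages and refer to Chapter 8: Common Problems in the README for additional information.”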
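
Since the markers are fixed strings, the lines worth reading can be pulled out with grep. A minimal sketch: the function name `xorg_problems` is made up for this post, and the default path is simply the log file mentioned earlier.

```shell
# xorg_problems is a hypothetical helper name, not part of X.Org.
# Print only warning (WW) and error (EE) lines from an X server log.
# $1: log file to scan; defaults to /var/log/Xorg.0.log
xorg_problems() {
    grep -E '\((WW|EE)\)' "${1:-/var/log/Xorg.0.log}"
}
```

Run as `xorg_problems` (or `xorg_problems /path/to/other.log`). On a log like the one above it should leave little more than the marker legend line and the (EE) block around the failed GPU initialization.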

The relevant “system’s kernel log” is probably in /var/log/messages and /var/log/boot.log

The “README” is from NVidia, both ascii and HTML versions. Perhaps in /usr/share/doc/nvidia-x11-drv/ (at least in ELRepo’s version.)

One can also run dmesg -T or journalctl -xe – assuming you look at a live system (logged in via text console or ssh), rather than doing post-mortem analysis. systemd-journald can be made to write persistent files too – the /var/log/ files are mostly from ‘rsyslog.service’.
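
To narrow the kernel log down to the GPU drivers, the output of either command can be filtered. A sketch only: `gpu_kernel_lines` is a made-up name, and the pattern assumes the drivers tag their kernel messages with `NVRM`, `nvidia` or `nouveau`.

```shell
# gpu_kernel_lines is a hypothetical helper name.
# Read kernel messages on stdin and keep lines mentioning the GPU drivers.
# The pattern (NVRM, nvidia, nouveau) is an assumption about how the
# drivers tag their messages; widen it if nothing shows up.
gpu_kernel_lines() {
    grep -iE 'nvrm|nvidia|nouveau'
}
```

Typical use would be `dmesg -T | gpu_kernel_lines` or `journalctl -k -b | gpu_kernel_lines`. If the journal is persistent (on EL8, creating /var/log/journal and restarting systemd-journald should be enough with the default settings), `journalctl -k -b -1 | gpu_kernel_lines` looks at the previous boot instead.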

Is this saying ‘ALT+’, or does it mean ‘CTRL+ALT+’?

If it can’t activate the GPU and there’s a black screen at boot, does that mean you can’t even use text based emergency mode when it’s in that state, e.g. does it need the GPU to be able to show plain text on the screen?

Awesome, thanks for pointing that out! Really appreciate it. :slight_smile:
I’ll dig through those logs and readmes now!

My apologies, that was badly written on my part. It means that when Rocky boots with installed nvidia drivers, it boots into a blank screen with a blinking hyphen. Text inputs are disabled. By using the keyboard combination “ALT + F2/F3/F4” on my system, I am able to access the text based console mode of Rocky linux. From this mode I run the commands stated below to remove the nvidia drivers, and after a reboot I am able to access Rocky in a GUI mode with nouveau drivers.

`sudo dnf remove nvidia-driver nvidia-settings cuda-driver kernel-devel-$(uname -r) kernel-headers-$(uname -r)`

The key combo is CTRL+ALT+Fn on most systems, but yes, you successfully switch to “virtual console” and yes, drawing text seems ok in your system. (I have one system where Nouveau can draw nothing, not even the text … if I have more than one monitor.)

Note that on bootloader (GRUB) menu it is possible to edit an entry to change the kernel command-line options.
If one adds systemd.unit=multi-user.target, then system boots without starting the GUI, to text-mode.

If running in text mode, one can start the GUI with systemctl isolate graphical.target
Similarly, one can stop GUI-related services and “go to text mode” with systemctl isolate multi-user.target
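
The systemctl calls above can be wrapped up for convenience. A small sketch; the function names are invented here, and `sudo` is assumed to be set up for your user.

```shell
# Hypothetical helper names wrapping the systemctl calls described above.

# Report which target the system boots into by default.
boot_target() {
    systemctl get-default
}

# From a text console: start the graphical session.
gui_start() {
    sudo systemctl isolate graphical.target
}

# From the GUI (or over SSH): stop it and fall back to text mode.
gui_stop() {
    sudo systemctl isolate multi-user.target
}
```

None of these change the default target, so a reboot without the extra kernel option goes back to whatever `boot_target` reports.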

It can draw text on the screen using the same hardware, so I wonder if the text based console even uses the Nvidia driver, or does it use some generic low res driver, and if so, maybe it goes black at the point it tries to switch to high res graphics?

A blank or black screen, even without text, usually implies, for a specialized video driver, that the nouveau driver is blacklisted.

You have to blacklist the nouveau driver in order for other drivers to load, especially the vendor driver, regardless of how it is being installed (RPM, .bin, tarball, whatever).

Ensure “nouveau” is blacklisted with a file in the directory /etc/modprobe.d.
Then install the drivers at the runlevel 2 equivalent, which in systemd is multi-user.target (set with systemctl set-default multi-user.target), if even only temporarily.
Then reboot, and if you made multi-user.target the default then you will need to manually start “graphical.target”.
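
A sketch of the blacklist file from the first step. The helper name `nouveau_blacklist_conf` and the file name used below are my own choices; modprobe reads any `.conf` file in that directory.

```shell
# nouveau_blacklist_conf is a hypothetical helper name.
# Print a modprobe.d fragment that stops nouveau from loading.
nouveau_blacklist_conf() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}
```

One way to use it: `nouveau_blacklist_conf | sudo tee /etc/modprobe.d/blacklist-nouveau.conf`, then `sudo dracut --force` so the initramfs picks up the blacklist, then `sudo systemctl set-default multi-user.target` and reboot. After the driver install, `sudo systemctl isolate graphical.target` starts the GUI, and `sudo systemctl set-default graphical.target` restores the usual default.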

This process can be very frustrating.

Personally, I have never installed the Legacy nVidia drivers for nVidia GPUs. The open-source Nouveau driver (the OS should normally default to this) works perfectly well, & to add 3D support you need to add Mesa. I have though only used all that on Debian-based OSes (Rocky I only used as VMs). So to get Mesa installed, you may need to add some repos etc.

But the performance very often is better than with the legacy nVidia drivers, & system updates are less likely to break your OS.

The first tutorial example (I hate video tutorials, especially ones that just show typing on the screen) in the original post shows the proprietary drivers installing for Rocky 8.5. Note the question ‘do you want to use DKMS’. DKMS (Dynamic Kernel Module Support) automatically rebuilds and reinstalls the driver modules for the new kernel upon kernel update.

My experience is similar to this, though I’ve only used the drivers in RHEL 6 and RHEL 7. A key to those was blacklisting nouveau in modprobe and grub. And while kernel updates required a driver reinstall, it hardly ‘broke the OS’. Boot to multi-user mode and reinstall the driver. Reboot and done.

Did you ever blacklist nouveau in your attempts to use the 3rd party driver?
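
A quick way to answer that on a running system is to look at the kernel command line. A minimal sketch; `cmdline_has` is a made-up helper name.

```shell
# cmdline_has is a hypothetical helper name.
# Succeed if the exact parameter $1 appears on a kernel command line.
# $2: file holding the command line; defaults to /proc/cmdline
cmdline_has() {
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, `cmdline_has rd.driver.blacklist=nouveau && echo "blacklisted at boot"`, and `lsmod | grep nouveau` should print nothing if the module really stayed out. For what it is worth, the kernel command line in the Xorg log posted earlier already contains both `nouveau.modeset=0` and `rd.driver.blacklist=nouveau`.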

Also, according to nVidia (Official Drivers | NVIDIA), the current driver for 64-bit x64 systems and the 980 Ti is:

LINUX X64 (AMD64/EM64T) DISPLAY DRIVER

Version: 515.48.07
Release Date: 2022.5.31
Operating System: Linux 64-bit
Language: English (US)
File Size: 343.76 MB

Just a FYI on this. I’m looking at moving to Rocky Linux and was checking on nVidia drivers. Don’t use it to game, but prefer them working.

Fedora did the same thing. The fix there was a couple things.

But first… be sure SSH connections can be made as root or as a user who can sudo. This is how I installed the drivers with the ‘black screen’.

Blacklist Nouveau in GRUB.
With Fedora, you had to disable Wayland also. (vi /etc/gdm/custom.conf – WaylandEnable=false)

Then just SSH in and install the driver. It worked with Fedora every time, including after kernel updates that broke the driver. SSH was much easier than trying to reboot in rescue mode or change the run level, etc.
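
The Wayland step can be done without opening an editor. A sketch, assuming /etc/gdm/custom.conf already carries a WaylandEnable line (the stock file usually ships it commented out); `disable_wayland` is a made-up name.

```shell
# disable_wayland is a hypothetical helper name.
# Print the GDM config in $1 with WaylandEnable forced to false.
# Assumes the file already has a (possibly commented-out) WaylandEnable
# line; if it does not, the output is identical to the input.
disable_wayland() {
    sed -E 's/^#?[[:space:]]*WaylandEnable=.*/WaylandEnable=false/' "$1"
}
```

For example: `disable_wayland /etc/gdm/custom.conf > /tmp/custom.conf && sudo cp /tmp/custom.conf /etc/gdm/custom.conf`. For the GRUB part, on systems that ship grubby, something like `sudo grubby --update-kernel=ALL --args="rd.driver.blacklist=nouveau nouveau.modeset=0"` should add the blacklist to every boot entry.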

With this - I can see they will work, so I’ll be migrating to Rocky in the very near future myself! :slight_smile: