Nvidia Driver Problems in Rocky Linux 8.5

Well, after finally getting Rocky Linux on the NVMe drive figured out (FINALLY!), but still running nouveau on my GeForce GT 730, I bit the bullet and decided to try to install the Nvidia drivers. (Don’t go there!! I also have Nvidia GeForce GTX 1050 Ti cards, VGA and non-VGA editions, but I’m saving that for another post; I made an interesting discovery there. At the current time, the 730 is “Good Enough”.) Here are the instructions I followed:

I first tried this on jaguar, my “Kitty Litter Box” that is currently running Rocky Linux 8.5, so I would know what types of problems I might encounter on ocelot. The first thing I did was back up the Rocky Linux drive in case everything went south.

I followed the destructions to the letter, then rebooted. Nvidia popped up without incident in both cases, with one minor fly in the ointment: the resolution topped out at 1024x768 (or something like that) and it did not see my BenQ monitor, which has 1920x1080 resolution. I found no solution to the problem. Having done this a number of times on jaguar, I found I could safely run the command:

dnf remove nvidia-driver

to be returned to the nouveau driver and to 1920x1080 resolution. Another small fly in the ointment: when I rebooted ocelot I could no longer get back into Rocky Linux!! (No problem on jaguar, however.) I then tried about 3-4 times to “backdoor” my way into Rocky from the openSUSE menu. Once I got in, the Rocky 8.5 menu actually began to work. (Glad I had made the backup of the RL NVMe drive nonetheless: I actually thought I might need it for a moment.)

Here is the list of the HW and Software:

Monitor: BenQ 24"
Graphics Card: Nvidia GeForce GT 730
Video Input: VGA
Current Driver: Nouveau
Storage Device: Corsair Force MP 600 NVMe (4th Gen.) – 1TB
CPU: AMD Ryzen 9 5900X
RAM Memory: 64 GB, GSkill 3600 MHz
OS: Rocky Linux 8.5
DE: KDE 5.23.3
Kernel Ver.: 4.18.0-348.23.1.el8_5.x86_64 (64bit)
Graphics Platform: KDE X11

The answer MIGHT be found somewhere in /etc/X11 or in /etc/modprobe.d, but it might be in any of a number of config files located in ??? If it IS located in either X11 or modprobe.d, does that mean I FIRST have to create some config files by hand BEFORE I run the Nvidia driver installer (as above), so that when I do run the destruction program it will find them all set up?

Using leopard (running CentOS 7.9) as a model, under /etc/X11/nvidia-xorg.conf I find the following:

Section "Device"
    Identifier "Videocard0"
    Driver "nvidia"
EndSection

… or do I need to include something like this (cribbed from panther – a very, very old gateway):

Section "Monitor"
    Identifier "Monitor0"
    ModelName "BenQ GW2450H (autoconfigured)"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "Videocard0"
    Monitor "Monitor0"
    DefaultDepth 24
    SubSection "Display"
        Viewport 0 0
        Depth 24
    EndSubSection
EndSection

And finally there is “Blacklisting Nouveau” under modprobe.d. Do I have to create by hand such a file? Do I need such a file?
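On the blacklist question, here is a minimal sketch of what such a file usually contains. The file name blacklist-nouveau.conf and the helper write_nouveau_blacklist are illustrative choices, not something the Nvidia installer is known to require, and a packaged driver may already disable nouveau on its own, so check /etc/modprobe.d before adding anything. The helper writes into whatever directory it is given, so it can be tried in a scratch directory first.

```shell
#!/bin/bash
# Sketch only: write a modprobe.d fragment that stops nouveau from loading.
# The helper name and the file name are illustrative assumptions.
write_nouveau_blacklist() {
    local dir="$1"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' \
        > "$dir/blacklist-nouveau.conf"
}

# Try it against a scratch directory first; point it at /etc/modprobe.d
# (as root) only once the contents look right.
scratch="$(mktemp -d)"
write_nouveau_blacklist "$scratch"
cat "$scratch/blacklist-nouveau.conf"
```

If the file does end up in /etc/modprobe.d, the initramfs probably has to be rebuilt as well (on RHEL-like systems that is normally `dracut --force`) before the change takes effect at boot.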

I am really lost and am hoping someone has run into this problem and figured out how to solve it. Mine are, at best, pure guesses cribbed from 3 other computers.

This has me stumped. I can mess around in the “Kitty Litter Box”, and if I blow it up… no big loss; the same can’t be said if I blow up ocelot, and I came r-e-a-l close this afternoon. Nouveau may suck, but at least it works, and the default resolution is the correct 1920x1080. Any help someone could provide would be greatly appreciated.

D’ Cat

Well, I’ve tried at least 3-4 different “procedures”, but in the end I can’t break the 1024x768 resolution; regardless of what I do, it still does not recognize the monitor as being a BenQ. There just has to be a way to get the correct resolution, but I’m stumped. In the meantime I am back to using nouveau.

Help!!

D’ Cat

Dunno if it fits your environment, but I had (and still have) a monitor res issue when I added a VGA switch: the system no longer recognises the monitor type, native res etc., so the fix for the moment in my case is (as an example):

#! /bin/bash

# xrandr - to get monitor identifiers etc
# cvt 1440 900 - to calc line for the res I wanted
# and apply ...
xrandr --newmode "1440x900_60.00"  106.50  1440 1528 1672 1904  900 903 909 934 -hsync +vsync
xrandr --addmode VGA-0 "1440x900_60.00"
xrandr --output VGA-0 --mode "1440x900_60.00"

as part of startup or run manually though I do need to auto adjust (on the monitor) and reposition them in the gui - not ideal but gets me the res for now…

…from answer here.
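For a different resolution, the same three commands can be generated from whatever cvt prints instead of being copied by hand. This is only a sketch: the function name modeline_to_xrandr is made up for illustration, and it assumes cvt's output contains one line starting with "Modeline", as in the example above. It only prints the commands, so they can be checked before being run.

```shell
#!/bin/bash
# Sketch only: turn the "Modeline ..." line that cvt prints into the three
# xrandr commands used above. The function name is hypothetical.
modeline_to_xrandr() {
    local output="$1" line name timings
    # Read cvt's output on stdin and keep the first line starting with "Modeline".
    line="$(grep -m 1 '^Modeline')"
    [ -n "$line" ] || return 1
    # The second field is the quoted mode name, e.g. "1440x900_60.00".
    name="$(printf '%s\n' "$line" | awk '{print $2}')"
    # Everything after the keyword is what --newmode expects.
    timings="${line#Modeline }"
    echo "xrandr --newmode $timings"
    echo "xrandr --addmode $output $name"
    echo "xrandr --output $output --mode $name"
}

# Demonstration with the mode line from the script above:
printf '%s\n' 'Modeline "1440x900_60.00"  106.50  1440 1528 1672 1904  900 903 909 934 -hsync +vsync' |
    modeline_to_xrandr VGA-0
```

On a real machine the input would come from cvt itself, for example `cvt 1920 1080 | modeline_to_xrandr VGA-1`, with the output name taken from what xrandr reports.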

bobar

You don’t quite fit the description of genius, but you gave me a thread to follow with xrandr. I ran it on jaguar (aka the “Kitty Litter Box”). Under nouveau it tells me there is VGA-1 (and a whole lot of mode lines), DVI-1 (disconnected) and HDMI-1 (disconnected). It also correctly identifies the resolution as being 1920x1080 @ 60Hz.

Then I installed the Nvidia drivers, rebooted, and tried both GNOME Classic X11 and Plasma X11. I ran xrandr again, but this time I got something vastly different:

“Failed to get size of gamma for output default”.
“Screen0 Minimum 640x480; Maximum 1024x768; Current 1024x768”

Nothing about VGA-1, DVI-1, or HDMI-1. This means it can’t tell anything about the monitor, the graphics card, or the resolution. Nor do I suspect that it has anything to do with the particular graphics card, because when I installed the Nvidia drivers on ocelot, it too was limited to the same 1024x768.
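A single output called "default" with a 1024x768 ceiling usually suggests that X is running on a generic fallback driver rather than on nvidia, though that is an inference and not something confirmed in this thread. The sketch below pulls the output names and their state out of xrandr's text so the two cases are easy to compare. The function names are made up, and the sample text is an abbreviated illustration of the kind of output described above, not a captured log.

```shell
#!/bin/bash
# Sketch only: summarise which outputs xrandr reports. Function names are
# hypothetical; feed the functions real `xrandr` output on stdin.
list_outputs() {
    # Output lines look like "<name> connected ..." or "<name> disconnected ...".
    awk '$2 == "connected" || $2 == "disconnected" { print $1, $2 }'
}

uses_fallback_output() {
    # Succeeds when the only output reported is the generic "default" one.
    [ "$(list_outputs | awk '{ print $1 }' | sort -u)" = "default" ]
}

# Illustrative sample, not a real log:
sample='Screen 0: minimum 640 x 480, current 1024 x 768, maximum 1024 x 768
default connected 1024x768+0+0 0mm x 0mm'

printf '%s\n' "$sample" | list_outputs
if printf '%s\n' "$sample" | uses_fallback_output; then
    echo "only the generic output is visible; the nvidia X driver is probably not in use"
fi
```

On the machine itself this would be `xrandr | list_outputs`. If only "default" shows up, `lsmod | grep nvidia` and `lspci -k` are reasonable next checks to see whether the kernel module loaded at all.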

As an aside, I too have a 4-position switch that connects 4 computers (the 5th is jaguar, which is temporarily the 4th computer on the network as I am still in temporary housing… at least for another 2-4 weeks I am told, and then…? Hopefully a move back into my old unit). Might this be a problem with the switch?? Maybe, but testing that requires that I run cables and power cords just to isolate that one computer. Also, leopard is on the network and on the switch, and it is running Nvidia just fine, with no hiccups. Indeed leopard has two monitors and xrandr shows them both, one as VGA-0 and the other as DVI-D-0, and both are fine with 1920x1080.

Mode lines – I never in my wildest imaginings ever thought I may have to play with those again. I think the last time was about 18-19 years ago.

At any rate, THANK YOU for the CLUE.

D’ Cat

You’re welcome.

Note xrandr does not show the native resolution for my monitor (when using the switch) until after I’ve added it with the code above although it does at least list VGA-0 as connected.

So, you have
Jaguar, nouveau, RL 8.5 = OK
Jaguar, Nvidia, RL 8.5 = NOT OK
Ocelot, Nvidia, RL ? = NOT OK
Leopard, switch, CentOS 7.9, Nvidia = OK

Nvidia seems to work on your CentOS but not your RL. Given different OSes I’m wondering if the Nvidia drivers/code used is the same version / is OS compatible / matches the cards etc.

Is the switch a problem? Looks like maybe not in your case, or maybe it is in combination with other variables (which’d be interesting to know if it was the case). I came across some old posts online but no one ever resolved their issues, and the world just seems to have mostly moved on from VGA switches where res is an issue worth fixing… I suspect a fully mechanical switch (as opposed to one with electronics) might not have issues, but then there’s all the cables/connections involved too…

I’ll leave you to it !

A quick scan of that installation instruction link and I only see CUDA drivers… not sure that includes the video stuff you want…someone else here’ll know… my last attempt I was downloading video driver code not cuda code.

NVidia’s own yum repository contains both the NVidia drivers and the CUDA toolkit. CUDA requires the driver’s kernel module.

If the kernel module packages in that repo (this I haven’t checked) are built for specific kernel versions, then they are useless with other kernel versions and you have to wait until NVidia builds a package for the new kernel.
If the kernel module package can automatically recompile the module for the new kernel during boot, then kernel updates are easier.
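Whichever way the module is packaged, the practical question after a kernel update is whether a module file exists for the kernel that is actually running. A rough sketch of that check follows; the function name is made up, and it assumes modules live under /lib/modules/<kernel version>/, possibly compressed.

```shell
#!/bin/bash
# Sketch only: report whether a kernel module file exists for a given kernel.
# The function name is hypothetical.
has_module_for_kernel() {
    local root="$1" kver="$2" name="$3"
    [ -d "$root/$kver" ] || return 1
    # Modules may be compressed, hence the trailing wildcard (.ko, .ko.xz, ...).
    [ -n "$(find "$root/$kver" -name "$name.ko*" | head -n 1)" ]
}

if has_module_for_kernel /lib/modules "$(uname -r)" nvidia; then
    echo "an nvidia module is present for kernel $(uname -r)"
else
    echo "no nvidia module found for kernel $(uname -r)"
fi
```

If the module is missing right after a kernel update, that would point to the "built for a specific kernel" case described above rather than to an X configuration problem.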

I have installed CUDA from NVidia’s repo, but used kernel module from ELRepo. CUDA seems to work, even though it could crash due to “not exactly built together” version issues.


…but does the cuda code include video driver code ? I was under the impression it was just drivers to let you use the cards’ cuda capabilities.

The CUDA toolkit and the NVidia drivers are separate packages. The NVidia drivers are further split (if RPM-packaged) into the kernel module and the X11 drivers. Both CUDA and X11 drivers talk to the same kernel module. AFAIK, NVidia’s repo has the X11 drivers too.

PS. NVidia has released an open source version of the kernel module.

yes, just seen the “optionally install cuda drivers” line so that makes sense now…