Flatpak NVIDIA driver issues

Hi all.

When I was using the official NVIDIA drivers, version 515, on RHEL 9, Flatpak also installed 515 and things were running well.

After a fresh install:

sudo dnf install epel-release
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
sudo dnf install nvidia-driver nvidia-settings cuda-driver
reboot

NVIDIA-SMI (as of today it shows 520, not 515 like 2 weeks ago):

bash-5.1$ nvidia-smi
Thu Oct 13 20:31:09 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:0B:00.0  On |                  N/A |
| 20%   54C    P0    28W / 120W |    771MiB /  6144MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3632      G   /usr/libexec/Xorg                 246MiB |
|    0   N/A  N/A      3775      G   /usr/bin/gnome-shell               80MiB |
|    0   N/A  N/A      6518      G   /usr/lib64/firefox/firefox        421MiB |
+-----------------------------------------------------------------------------+

After an update to 520 through

sudo dnf update

(a bad habit of mine, using the terminal to update the system)

Things just broke: Flatpak stopped working properly with anything OpenGL-related.

BLENDER

bash-5.1$ flatpak run org.blender.Blender

intern/ghost/intern/GHOST_WindowX11.cpp:191: X11 glXChooseVisual() failed, verify working openGL system!
initial window could not find the GLX extension
Writing: /tmp/blender.crash.txt

GODOT (3D GAME ENGINE)

sh-5.1$ flatpak run org.godotengine.Godot
Godot Engine v3.5.1.stable.flathub.6fed1ffa3 - https://godotengine.org
ERROR: Condition "!fbc" is true. Returned: ERR_UNCONFIGURED
   at: initialize (platform/x11/context_gl_x11.cpp:167)
ERROR: Condition "!fbc" is true. Returned: ERR_UNCONFIGURED
   at: initialize (platform/x11/context_gl_x11.cpp:167)
ERROR: Error initializing GLAD
   at: is_viable (drivers/gles2/rasterizer_gles2.cpp:166)
Gtk-Message: 20:27:35.633: Failed to load module "canberra-gtk-module"
Gtk-Message: 20:27:35.633: Failed to load module "pk-gtk-module"
Gtk-Message: 20:27:35.633: Failed to load module "canberra-gtk-module"
Gtk-Message: 20:27:35.633: Failed to load module "pk-gtk-module"

TOTAL CHAOS (FREE GAME FROM FLATHUB)

bash-5.1$ flatpak run com.moddb.TotalChaos
GZDoom <unknown version> -  - SDL version
Compiled on Feb 16 2022

M_LoadDefaults: Load system defaults.
W_Init: Init WADfiles.
 adding /app/share/games/doom/gzdoom.pk3, 627 lumps
 adding /app/share/games/doom/zd_extra.pk3, 132 lumps
 adding /app/share/games/doom/freedoom2.wad, 3595 lumps
 adding /app/share/games/doom/totalchaos.pk3, 5659 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource.wad, 214 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_blood.wad, 10 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_bloodfx.wad, 30 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_bubbles.wad, 3 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_combat.wad, 3 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_dust.wad, 13 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_explosion.wad, 43 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_fires.wad, 93 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_industrial.wad, 6 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_inventory.wad, 3 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_leaves.wad, 52 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_smog.wad, 82 lumps
 adding /app/share/games/doom/totalchaos.pk3:resource_spark.wad, 42 lumps
I_Init: Setting up machine state.
CPU Vendor ID: AuthenticAMD
  Name: AMD Ryzen 9 3900X 12-Core Processor 
  Family 23 (23), Model 113, Stepping 0
  Features: SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 HyperThreading
I_InitSound: Initializing OpenAL
  Opened device HyperX Cloud Orbit 3D 8Ch Surround analógico 7.1
  EFX enabled
V_Init: allocate screen.
S_Init: Setting up sound.
ST_Init: Init startup screen.
Checking cmd-line parameters...
S_InitData: Load sound definitions.
G_ParseMapInfo: Load map definitions.
Texman.Init: Init texture manager.
ParseTeamInfo: Load team definitions.
LoadActors: Load actor definitions.
script parsing took 194.43 ms
R_Init: Init Doom refresh subsystem.
DecalLibrary: Load decals.
Adding dehacked patch freedoom2.wad:DEHACKED
Patch installed
M_Init: Init menus.
P_Init: Init Playloop state.
ParseSBarInfo: Loading custom status bar definition.
===========================================================================
This is Freedoom, the free content first person shooter.

Freedoom is freely redistributable under the terms of the modified BSD
license. Check out the Freedoom website for more information:

    https://freedoom.github.io/
============================================================================
D_CheckNetGame: Checking network game status.
player 1 of 1 (1 nodes)
Using video driver x11
Gtk-Message: 20:29:06.862: Failed to load module "canberra-gtk-module"
Gtk-Message: 20:29:06.862: Failed to load module "pk-gtk-module"
Gtk-Message: 20:29:06.862: Failed to load module "canberra-gtk-module"
Gtk-Message: 20:29:06.862: Failed to load module "pk-gtk-module"
Failed to load OpenGL functions.

I’ve tried to manually install drivers 515 and 520 from the list:

bash-5.1$ flatpak remote-ls flathub | grep nvidia

And then:

flatpak --system install org.freedesktop.Platform.GL32.nvidia-520-56-06

And later on:

flatpak --system install org.freedesktop.Platform.GL32.nvidia-515-76

Nothing works.

So, is there a way to ‘activate’ the drivers, or something like that, to make things run properly?

After a lot of research, a lot of reading about various problems, and some reverse engineering of my driver installation, I’ve found out a couple of things.

First of all, Flatpak runs applications inside a ‘sandbox environment’, but this doesn’t mean literally everything runs inside the sandbox. The NVIDIA libraries that Flatpak installs (I don’t know about AMD / Intel) talk directly to the NVIDIA kernel module on the host system, so NVIDIA inside Flatpak requires exactly the same driver version as the host. The reason is that the kernel module is installed on the host together with the driver, and it only works with libraries of its own version.

In other words, if Rocky (or any distro in general) is using driver version 515.0.0.1 on the host, the Flatpak version also needs to be 515.0.0.1. One problem with this approach is matching the versions on distros that are not considered ‘friendly’, unlike openSUSE Leap, Manjaro or Ubuntu.

Usually these ‘friendly’ distros have custom or community repositories that keep an ‘LTS’ version of the driver for a long time. RHEL-based distros are not among them.
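
To check whether the host and Flatpak versions actually match, something like the sketch below could be used. It rests on two assumptions: that the loaded kernel module reports its version in /sys/module/nvidia/version, and that Flathub names the extension after the driver version with the dots replaced by dashes (as in the nvidia-515-76 name seen in the remote-ls list). The function names are made up for this example. Note that the 64-bit extension is org.freedesktop.Platform.GL.nvidia-<version>; the GL32 one is its 32-bit counterpart.

```shell
#!/bin/sh
# Sketch: compare the host NVIDIA driver version with the installed
# Flatpak GL extension. Function names are hypothetical.

# Turn a driver version such as "520.61.05" into the 64-bit extension ID.
# Assumption: dots in the version become dashes in the extension name.
nvidia_ext_id() {
    printf 'org.freedesktop.Platform.GL.nvidia-%s\n' \
        "$(printf '%s' "$1" | tr '.' '-')"
}

# Version of the loaded NVIDIA kernel module (empty if it is not loaded).
host_driver_version() {
    cat /sys/module/nvidia/version 2>/dev/null
}

# Report whether the extension matching the host driver is installed.
check_flatpak_nvidia() {
    version="$(host_driver_version)"
    if [ -z "$version" ]; then
        echo "NVIDIA kernel module is not loaded" >&2
        return 2
    fi
    ext="$(nvidia_ext_id "$version")"
    if flatpak list --runtime --columns=application | grep -Fxq "$ext"; then
        echo "OK: $ext is installed"
    else
        echo "MISMATCH: host driver is $version but $ext is not installed"
        return 1
    fi
}
```

After sourcing the file, running check_flatpak_nvidia on the affected machine should print either the OK line or the MISMATCH line.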

In my case, after doing some ‘reverse engineering’ of my driver install, I found that the tutorial uses NVIDIA’s DEVELOPER repo, which means more frequent releases, which in turn caused the mismatch with Flatpak and a ‘broken’ system.

The original installation process (this problem also happens with an RPM Fusion install):

sudo dnf install epel-release

sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo

sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)

sudo dnf install nvidia-driver nvidia-settings cuda-driver

reboot

This method installs the latest version by default. At the time of this post that is 520.something, which is not yet available on Flatpak, so I ended up with a newer driver on the system and an older one in the Flatpak environment, and a broken system due to the mismatch.
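
One way to avoid walking into this again might be to check, before updating, whether Flathub already lists an extension for the driver version the update would bring. The sketch below is only that, a sketch: flathub_has_driver is a made-up helper, it relies on the same dots-to-dashes naming assumption, and the dnf repoquery line in the comments is my guess at how to find the candidate version.

```shell
#!/bin/sh
# Sketch: succeed only if the extension for driver version $1 appears in
# the list of application IDs read from stdin (one ID per line).
# Assumption: the extension name is the version with dots turned into dashes.
flathub_has_driver() {
    grep -Fxq "org.freedesktop.Platform.GL.nvidia-$(printf '%s' "$1" | tr '.' '-')"
}

# Possible workflow (not run here; the package name comes from the install
# commands above, the repoquery options are an assumption):
#   candidate="$(dnf -q repoquery --upgrades --queryformat '%{version}' nvidia-driver | tail -n 1)"
#   if flatpak remote-ls flathub --columns=application | flathub_has_driver "$candidate"; then
#       sudo dnf update
#   else
#       echo "Flathub has no extension for $candidate yet, holding the driver back"
#   fi
```

With the versions from this post, a list containing nvidia-520-56-06 and nvidia-515-76 would not match a host driver of 520.61.05, which is exactly the situation I ran into.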

The NVIDIA website (at the time of this post) offers version 515.76 as the ‘stable’ release, while the developer branch has 520.something.

My solution was to install the .run file from the website manually, which means a longer, more manual job: blocking nouveau, altering GRUB, and so on.
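
For reference, the preparation for the .run installer on a RHEL-like system usually looks roughly like the sketch below. This is not the tutorial's exact text, just the typical sequence as I understand it, and it is written as a dry run: by default it only prints what it would do. The run and prepare_for_run_installer helpers are made up for this example.

```shell
#!/bin/sh
# Dry-run sketch of the usual preparation before the NVIDIA .run installer.
# Set DRY_RUN=0 (as root) to really execute the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Contents of the modprobe file that keeps nouveau from loading.
nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n'
}

prepare_for_run_installer() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ write /etc/modprobe.d/blacklist-nouveau.conf"
    else
        nouveau_blacklist > /etc/modprobe.d/blacklist-nouveau.conf
    fi
    # Rebuild the initramfs so the blacklist applies at early boot.
    run dracut --force
    # The 'altering GRUB' part: add the blacklist to the kernel command line.
    run grubby --update-kernel=ALL --args=rd.driver.blacklist=nouveau
    # Boot to a text console, because the installer refuses to run under X.
    run systemctl set-default multi-user.target
}

# After a reboot, as root on the text console:
#   sh ./NVIDIA-Linux-x86_64-515.76.run
#   systemctl set-default graphical.target
#   reboot
```

Calling prepare_for_run_installer with the default DRY_RUN=1 only lists the steps, so it is safe to look at before deciding to run anything for real.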

After that annoying process, my system has 515.76 on both the host and Flatpak, and literally everything is working flawlessly.

nvidia-smi
NVIDIA-SMI 515.76       Driver Version: 515.76       CUDA Version: 11.7

flatpak list
org.freedesktop.Platform.GL32.nvidia-515-76

Here is the tutorial that I used to fix my problem. Just remember to take a look at the date: 9-year-old tutorials usually don’t work well.

NVIDIA drivers from website

And of course, if someone knows a stable-release repo for the NVIDIA drivers, feel free to share. It would be less annoying than the entire process involving GRUB and so on.
