I’m running into a weird display driver lockup problem on one of my Rocky 9 boxes. When it happens, the cursor still moves with the mouse and the keyboard still works, so I can switch to an alternate console, kill the Xorg process, and run startx again. It then runs fine for anywhere from a few hours to a few days before the next lockup.
Searching here and on Google suggests that the radeon and/or amdgpu driver may be the problem. Oddly (to me), both seem to be getting loaded:
[dave@bend ~]$ lsmod | egrep 'amdgpu|radeon'
amdgpu               11087872  0
iommu_v2                24576  1 amdgpu
drm_buddy               20480  1 amdgpu
gpu_sched               57344  1 amdgpu
radeon                2068480  16
drm_ttm_helper          16384  2 amdgpu,radeon
ttm                     98304  3 amdgpu,radeon,drm_ttm_helper
video                   73728  2 amdgpu,radeon
drm_display_helper     200704  2 amdgpu,radeon
drm_kms_helper         245760  3 drm_display_helper,amdgpu,radeon
drm                    704512  20 gpu_sched,drm_kms_helper,drm_display_helper,drm_buddy,amdgpu,radeon,drm_ttm_helper,ttm
i2c_algo_bit            16384  3 igb,amdgpu,radeon
Video card is:
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Bonaire XTX [Radeon R7 260X/360] (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. AMD Radeon R7 260X
        Physical Slot: 2
        Flags: bus master, fast devsel, latency 0, IRQ 64
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Memory at f0000000 (64-bit, prefetchable) [size=8M]
        I/O ports at 2000 [size=256]
        Memory at f0900000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: radeon
        Kernel modules: radeon, amdgpu
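For anyone who wants to check the same thing on their own machine, here is a minimal sketch that pulls the "Kernel driver in use" value out of lspci -k style output. The function name driver_in_use is my own placeholder, and the usage line assumes the card sits at PCI address 04:00.0 as in the output above.

```shell
# Hypothetical helper: print the "Kernel driver in use" value found in
# lspci -k style text read from standard input.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Usage on a live system (assumes the card is at PCI address 04:00.0):
#   lspci -k -s 04:00.0 | driver_in_use
```

With the output shown above, this should report radeon, which matches lsmod showing amdgpu loaded but with a use count of 0.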
I can blacklist either driver, but I'm wondering 1) whether this is the right solution, and 2) if so, which one to blacklist?
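To be concrete about what I mean by blacklisting, here is a rough sketch and not a tested fix. The helper name write_blacklist and the file name blacklist-<module>.conf are placeholders of mine; /etc/modprobe.d is the usual modprobe configuration directory. It may also be necessary to rebuild the initramfs afterwards if the module is included there, which is an assumption I have not checked on this box.

```shell
# Hypothetical helper: write a one-line modprobe blacklist file for a module.
# $1 = module name (radeon or amdgpu)
# $2 = target directory (defaults to /etc/modprobe.d, which needs root)
write_blacklist() {
    module="$1"
    dir="${2:-/etc/modprobe.d}"
    printf 'blacklist %s\n' "$module" > "$dir/blacklist-$module.conf"
}

# Example (as root), for whichever driver turns out to be the one to disable:
#   write_blacklist amdgpu
```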
My other Rocky 9 box, which has an NVIDIA graphics card, does not have this problem.
Thanks!