I feel there is something wrong with that kernel. On my HP Proliant DL360e Gen8 that kernel doesn’t work either, or at least it runs for a few hours and then crashes on me. I’m currently using the 9.5 kernel (5.14.0-503.40.1.el9_5.x86_64), as this one works fine. I didn’t try the previous 570.17 kernel like you have, but I guess that might also work on mine.
I don’t know if I have the same errors as you, but since the behaviour is similar, I don’t think the problem is your hardware. More likely something in this kernel is borked.
The 570.19 kernel is available in Rocky 9 now, so perhaps try that one. I’m going to do the same now by running dnf update.
I’m also contemplating enabling elrepo and installing kernel-lt or kernel-ml, since they are 6.x kernels, which I know work on this server: I was using either Debian 12 or Ubuntu with a 6.x kernel before switching to Rocky 9.
Actually, the server died on the 570.19 kernel as well while performing a restic backup to one of my VMs. So it looks like it will be either an elrepo kernel or the 9.5 one.
EDIT: kernel-lt from elrepo is working fine so far, with no crashes. Worst case I’ll end up returning to 5.14.0-503.40.1.
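For anyone following along, here is a rough sketch of the two options mentioned above: installing kernel-lt from elrepo, and making the 9.5 kernel the default boot entry as a fallback. It assumes that elrepo-release can be installed directly with dnf on your system (if not, follow the instructions on the elrepo site), that grubby manages your boot entries, and that you run it as root. The function names are my own invention, and everything is wrapped in functions so nothing runs until you call one.

```shell
# Sketch only: defining these functions changes nothing on the system.

# Hypothetical helper: build the /boot path grubby expects from a kernel release string.
vmlinuz_path() {
    printf '/boot/vmlinuz-%s\n' "$1"
}

# Install the long-term kernel from elrepo.
# Assumption: the elrepo-release package is available from your enabled repos.
install_kernel_lt() {
    dnf install -y elrepo-release
    dnf --enablerepo=elrepo-kernel install -y kernel-lt
}

# Make a known-good, already installed kernel the default boot entry.
set_default_kernel() {
    grubby --set-default "$(vmlinuz_path "$1")"
    grubby --default-kernel   # print the default entry to confirm the change
}
```

To fall back to the 9.5 kernel you would call `set_default_kernel 5.14.0-503.40.1.el9_5.x86_64` and reboot. To try the elrepo route, call `install_kernel_lt` instead.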
Thanks for looking into it. I have installed kernel-lt and will test to see how things go. I must say, I have spent 2 days debugging this issue. I hope all will be fine for now. Thx!
@iwalker one question, out of curiosity: is it standard procedure to install different kernels for Rocky based on the hardware configuration to see what works and what doesn’t?
Since I am using the latest 9950X3D CPU, I ended up with the elrepo 6.14 kernel, because I kept having crashes related to the 3D cache on kernel-lt.
Also, I could not use the newest elrepo 6.15 kernel because it was too new for the Nvidia drivers. So 6.14 seems to be a good middle ground for me.
I am just surprised that I have to be so selective about kernels and Nvidia drivers to avoid breaking things and to get the hardware working.
You have to remember, EL is Enterprise Linux, and there is a certified hardware list that Red Hat makes the distro for. Since Rocky is based on RHEL, the hardware support for the stock kernels is the same. If you have new hardware, especially something exotic with the absolute latest CPU or whatever, then sure, there can be issues, since it’s most likely not on the hardware compatibility list that Red Hat built the distro for. In that case you need to use kernel-lt or kernel-ml from elrepo. For example, AMD Ryzen CPUs would have the CPU fan going at full whack on older kernels, whereas a newer kernel that added support for the CPU would not.
EL, or Enterprise Linux, is built for stability, and that doesn’t necessarily mean that the latest and greatest hardware is supported. In this instance, you are probably better off with Fedora or, as already mentioned, staying on EL with a newer kernel.