I am documenting this here to help anyone else who runs into it and searches for it.
Dell PowerEdge servers having this thing called a Lifecycle Controller that they will first boot into when you turn them on for the first time. I have found it mostly useless, actually just a waste of time and annoying mostly … and, as important for this post, dangerous.
It will require you to go through a wizard to even start the install of an OS or else it will keep booting the Lifecycle Controller on future boots till you do. Part of the Wizard asks what OS you are going to install. I was installing Rocky 9.4 and the closest thing in its list with RHEL 9.3 so I choose that. Big mistake.
Once I got it booting via PXE off the Rocky 9.4 install for a kickstart, it would fail with
modprobe: ERROR could not insert 'bnxt_en': Exec format error
with similar modprobe failures on ‘bnxt_re’ and ‘tg3’ and ‘mpi3mr’. After these failures it would endless loop trying to download the kickstart file over http and failing.
I know that error usually means that the modules being loaded don’t match the kernel. So I went to my pxeboot setup and checked the vmlinuz and initrd.img files to verify that they really matched the ones downloaded under rocky/9/BaseOS/os. They did. I tried PXE booting the Rocky 8 vmlinuz/initrd.img I still had on the server and it worked fine. But trying to boot Rocky 9 installer failed again with the same above errors.
I then looked above the errors and saw these lines:
Examining /dev/sdb1
mount: /media/DD-1: WARNING: source write-protected, mounted read-only
I booted a rescue disk I had and sure enough lsscsi shows a virtual USB drive named OEMDRV the Dell server had created.
It appears my choice in the LifeCycle Controller OS install wizard of RHEL 9.3 had setup a supplement DD drive of drivers overridding the ones on the Rocky 9 install but that were for the 9.3 kernel, not 9.4. I am not sure why it did not interfer with my install of Rocky 8.10 though.
Anyway, after going back into the LifeCycle Controller wizard and making an OS choose of “Other” instead of RHEL 9.3 and doing the install again the problem went away and I was able to install Rocky 9 normally.
I also learned one can disable the booting into the LifeCycle Controller if you have ssh access to the DRAC by sshing to it and running
racadm set LifecycleController.LCAttributes.LifecycleControllerState 0