Hi guys ^^,
I’m glad to post here, because I’ve read this forum many times and it has helped me a lot (I am fairly new to Rocky 9.3, but have spent years on CentOS 8.4).

Here is the error I want to discuss; I’ll try to explain it as best I can.
I have a PCIe DAQ board that only powers on after the OS has been rebooted, so I use a “.service” unit to trigger that reboot.
The error never appears after a reboot. It only appears after a cold power-on, and even then not every time.
I have read about it in two posts:
https://forums.rockylinux.org/t/crash-recovery-kernel-arming/11021
https://forums.rockylinux.org/t/how-do-i-remove-crashkernel-from-cmdline/13346
However, I still don’t fully understand the problem and have not been able to solve it. It seems to be related to kdump on Rocky 9. In the first post somebody mentions “Docs”; where are these docs?
I have disabled kdump.service, but I saw the error again. Can I solve it by leaving kdump out when installing the OS?
Any advice or help is welcome.
Regards
I’m running into the same issue using the RockyLinux 9 AMI on AWS.
[rocky@i-0978b586c4c18d4f2 ~]$ sudo systemctl list-units --failed
  UNIT          LOAD   ACTIVE SUB    DESCRIPTION
● kdump.service loaded failed failed Crash recovery kernel arming

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
1 loaded units listed.
[rocky@i-0978b586c4c18d4f2 ~]$ sudo systemctl status kdump.service
× kdump.service - Crash recovery kernel arming
     Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2024-06-18 15:47:44 UTC; 2min 59s ago
   Main PID: 1001 (code=exited, status=1/FAILURE)
        CPU: 40ms

Jun 18 15:47:44 i-0978b586c4c18d4f2.eu-west-1.compute.internal systemd[1]: Starting Crash recovery kernel arming...
Jun 18 15:47:44 i-0978b586c4c18d4f2.eu-west-1.compute.internal kdumpctl[1007]: kdump: No memory reserved for crash kernel
Jun 18 15:47:44 i-0978b586c4c18d4f2.eu-west-1.compute.internal kdumpctl[1007]: kdump: Starting kdump: [FAILED]
Jun 18 15:47:44 i-0978b586c4c18d4f2.eu-west-1.compute.internal systemd[1]: kdump.service: Main process exited, code=exited, status=1/FAILURE
Jun 18 15:47:44 i-0978b586c4c18d4f2.eu-west-1.compute.internal systemd[1]: kdump.service: Failed with result 'exit-code'.
Jun 18 15:47:44 i-0978b586c4c18d4f2.eu-west-1.compute.internal systemd[1]: Failed to start Crash recovery kernel arming.
[rocky@i-0978b586c4c18d4f2 ~]$
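The “No memory reserved for crash kernel” line in the log above suggests the kernel booted without a crash memory reservation. Here is a minimal sketch for checking that on the instance. The function names `crashkernel_arg` and `report_crash_reservation` are made up for illustration; `/proc/cmdline` and `/sys/kernel/kexec_crash_size` are the standard Linux locations for the boot arguments and the reserved size.

```shell
#!/bin/sh
# Print the value of the crashkernel= parameter found in a kernel
# command line string, or print nothing if the parameter is absent.
crashkernel_arg() {
    case " $1 " in
        *" crashkernel="*)
            rest=${1##*crashkernel=}      # drop everything up to the last crashkernel=
            printf '%s\n' "${rest%% *}"   # keep only that one word
            ;;
    esac
}

# Show what the running kernel was booted with and how much memory
# it actually reserved (0 means nothing was reserved).
report_crash_reservation() {
    cmdline=$(cat /proc/cmdline 2>/dev/null)
    size=$(cat /sys/kernel/kexec_crash_size 2>/dev/null)
    arg=$(crashkernel_arg "$cmdline")
    echo "crashkernel argument: ${arg:-<none>}"
    echo "reserved bytes: ${size:-unknown}"
}

report_crash_reservation
```

If the argument is missing, or the reserved size is 0, kdump.service has nothing to arm, which would be consistent with the failure shown here.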
This is how I look up the AMI:
data "aws_ami" "rocky_linux_9" {
  most_recent = true

  filter {
    name   = "name"
    values = ["Rocky-9-*x86_64-*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  owners = ["679593333241"]
}
The actual AMI I get with this is:
- AMI ID: ami-0cb9745e56da171c2
- AMI Name: Rocky-9-EC2-LVM-9.4-20240523.0.x86_64-prod-hyj6jp3bki4bm
I’m not sure how to proceed. Appreciate your help!
FWIW downgrading to Rocky 9.3 seems to fix it.
 data "aws_ami" "rocky_linux_9" {
   most_recent = true
   filter {
     name   = "name"
-    values = ["Rocky-9-*x86_64-*"]
+    values = ["Rocky-9-*9.3-*x86_64-*"]
   }
   filter {
     name   = "architecture"
     values = ["x86_64"]
   }
   owners = ["679593333241"]
 }
I now also realize why I had to adjust my cloud-init: the block device layout changed between Rocky 9.3 and 9.4 as well. Is this intentional? (In the diff, the old lines are 9.3 and the new lines are 9.4.)
 runcmd:
-  - [ growpart, /dev/nvme0n1, 5 ]
-  - [ pvresize, /dev/nvme0n1p5 ]
-  - [ lvresize, -l, +100%FREE, /dev/mapper/rocky-root ]
+  - [ growpart, /dev/nvme0n1, 4 ]
+  - [ pvresize, /dev/nvme0n1p4 ]
+  - [ lvresize, -l, +100%FREE, /dev/mapper/rocky-lvroot ]
   - [ xfs_growfs, / ]
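Since the partition number and the LV name both differ between the two images, one option is to look them up at boot instead of hard-coding them. This is only a sketch: `split_partition` and `grow_root` are hypothetical helper names, and it assumes a single LVM physical volume and a root filesystem that sits on the logical volume to be grown.

```shell
#!/bin/sh
# Split a partition device path into "<disk> <partition number>".
# Handles NVMe-style names (nvme0n1p4) and SCSI-style names (sda4).
# The argument is assumed to be a partition, not a whole disk.
split_partition() {
    dev=$1
    num=${dev##*[!0-9]}              # trailing digits = partition number
    [ -n "$num" ] || return 1        # no trailing digits: not a partition
    disk=${dev%"$num"}
    case "$disk" in
        *[0-9]p) disk=${disk%p} ;;   # nvme0n1p -> nvme0n1
    esac
    printf '%s %s\n' "$disk" "$num"
}

# Grow the partition, PV, LV and filesystem behind "/".
# Defined but not called here; it needs root plus growpart and the LVM tools.
grow_root() {
    pv=$(pvs --noheadings -o pv_name | tr -d ' ')   # assumes exactly one PV
    set -- $(split_partition "$pv")
    growpart "$1" "$2"
    pvresize "$pv"
    lvresize -l +100%FREE "$(findmnt -no SOURCE /)"
    xfs_growfs /
}
```

With this, `split_partition /dev/nvme0n1p4` prints `/dev/nvme0n1 4`, so the same script would cover both the 9.3 and the 9.4 layout.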
I had already tested disabling kdump; however, I hadn’t tested “erasing” it from the kernel command line:
grubby --update-kernel=ALL --args="crashkernel=no"
grub2-mkconfig -o /boot/grub2/grub.cfg
The error changed, and I saw that it was related to my “reboot”.
I have a .service file that performs a reboot to work around a power problem on a DAQ PCI board. It seems that the reboot issued right after checking the board comes at a bad moment.
I’m now testing a delay of 5-10 s before the “reboot” command in the .service, and the failure seems to have disappeared.
Still testing.
It is not failing anymore.
My solution was to add a “sleep 5;” to the X.service I was using to reboot the PC. Something in the timing of the reboot command inside a .service was failing.
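For anyone who wants to try the same workaround, here is a rough sketch of the delay as a small helper script that the unit’s ExecStart= could call. The name `delayed_reboot` and the `REBOOT_DELAY` and `REBOOT_CMD` variables are my own illustration, not the contents of the original X.service.

```shell
#!/bin/sh
# Wait a few seconds, then reboot.
# REBOOT_DELAY (seconds) and REBOOT_CMD are hypothetical knobs added so the
# delay can be tuned and the function can be dry-run without rebooting.
delayed_reboot() {
    sleep "${REBOOT_DELAY:-5}"
    # Left unquoted on purpose so a multi-word command splits into arguments.
    ${REBOOT_CMD:-reboot}
}

# Dry run:  REBOOT_DELAY=0 REBOOT_CMD="echo would-reboot" delayed_reboot
# Real use: call delayed_reboot after the board check in the script that
#           ExecStart= points to.
```

The default of 5 seconds mirrors the “sleep 5;” above; whether a shorter delay is enough probably depends on the machine.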