Ionos VPS with Rocky Linux 9 hangs during CLI reboot

I’ve run CentOS VPSs with Ionos for many years, and reboots have always been blissfully quick and painless. But in the last year or so, and notably since moving to Rocky Linux 9, after a week or more of uptime a CLI reboot appears to shut down fine but never comes back up. I have to go into the Ionos web interface and restart the VPS.

But, after doing this once, if I reboot from the cli again immediately, the reboot is quick and fine, as I would expect.

I’m totally confused about what’s going on with it.

I see something about ACPI vs PCI reboots for certain hypervisors but I’m hesitant to touch reboot params on servers.

With a VPS you might have luck installing the upower package:

dnf install -y upower
systemctl enable --now upower

but a lot of this can also depend on the hardware or on how the hypervisor underneath is configured. I’ve got a dedicated server with a certain VPS provider which reboots with no problems if it hasn’t been up for long, but after a couple of weeks it hangs at shutdown instead of restarting. So if that doesn’t help, there probably isn’t much you can do.

Incidentally, on the physical servers I have, everything works fine. Only my VPS provider is the issue, so it’s something with their hardware, and not all of it: some works, some doesn’t. So perhaps BIOS, firmware, etc.

Ya, physical servers are fine for me, too. AI says there’s something about the hypervisor (qemu, kvm, etc.) that’s causing problems.

A local VirtualBox host I run doesn’t have this problem with Rocky Linux, either.

To be fair, the long-uptime-specific behaviour points to the reboot method that the kernel uses to actually trigger the hardware restart. On KVM/QEMU hypervisors like the ones Ionos uses, the method the kernel picks sometimes fails to signal the hypervisor correctly after the system has been running for a while, so it shuts down cleanly but the reboot signal never gets through.

Worth trying the ACPI reboot method. Edit /etc/default/grub and add reboot=acpi to the GRUB_CMDLINE_LINUX line:

GRUB_CMDLINE_LINUX="... reboot=acpi"

Then rebuild the GRUB config:

grub2-mkconfig -o /boot/grub2/grub.cfg

Reboot the VPS from the Ionos control panel to apply it (since the normal reboot doesn’t work anyway), then leave it up for a week and see if the problem recurs.
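
Once it’s back up, it’s worth confirming the parameter actually reached the running kernel. As far as I know, Rocky 9 keeps kernel arguments in BLS boot entries, and newer 9.x builds of grub2-mkconfig may leave those untouched unless run with --update-bls-cmdline, so treat that as an assumption to check rather than a given. A minimal sketch (reboot_param is just a helper name I made up):

```shell
# Hypothetical helper: print the value of the reboot= kernel parameter,
# or "unset" if there is none. Checks the running kernel's command line
# unless a command-line string is passed as the first argument.
reboot_param() {
    cmdline="${1:-$(cat /proc/cmdline 2>/dev/null)}"
    value="unset"
    for word in $cmdline; do
        case "$word" in
            reboot=*) value="${word#reboot=}" ;;  # keep the last occurrence
        esac
    done
    echo "$value"
}

reboot_param
```

If it prints unset after the restart, the edit didn’t take effect and the week-long test won’t tell you anything.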

Some KVM hosts also respond better to reboot=kbd or reboot=bios, so if ACPI doesn’t sort it, those are worth trying in sequence.
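
To swap between methods without hand-editing files each time, grubby should be able to do it in one step. A rough sketch, assuming grubby is installed (I believe it is on a stock Rocky 9); switch_reboot_method and DRY_RUN are names I made up for illustration:

```shell
# Hypothetical helper: replace one reboot= value with another on every
# installed kernel's boot entry. With DRY_RUN=1 the grubby command is
# printed instead of executed, so it can be checked first.
switch_reboot_method() {
    old="$1"
    new="$2"
    set -- grubby --update-kernel=ALL --remove-args="reboot=$old" --args="reboot=$new"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"   # needs root
    fi
}

# Show what would be run when moving from reboot=acpi to reboot=kbd.
DRY_RUN=1 switch_reboot_method acpi kbd
```

Drop DRY_RUN=1 and run it as root to actually apply the change. The VPS still needs a restart before the new method is in use.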

A second thing to check: the upower suggestion above is a good one for systems where power management events interfere with shutdown. If upower isn’t installed, the kernel may not properly signal a reboot intent vs a power-off to the hypervisor on some setups.

If neither approach works, it’s worth raising with Ionos support and mentioning the reboot method issue specifically. Some VPS providers have hypervisor-side settings that affect how guest reboot signals are handled.
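
For the ticket itself, it probably helps to attach a few details from the guest. A small sketch of what could be collected right after a forced restart (ticket_info is a made-up helper name, and the previous-boot log is only there if the journal is stored persistently, which I’m not sure is the default):

```shell
# Hypothetical helper: print a few facts worth pasting into a support ticket.
# Commands that may be missing on a minimal system are skipped quietly.
ticket_info() {
    echo "kernel: $(uname -r)"
    echo "cmdline: $(cat /proc/cmdline 2>/dev/null)"
    if command -v systemd-detect-virt >/dev/null 2>&1; then
        echo "virt: $(systemd-detect-virt 2>/dev/null)"
    fi
    if command -v journalctl >/dev/null 2>&1; then
        # Tail of the previous boot's log, i.e. the shutdown that hung
        # if this is run right after the forced restart.
        journalctl -b -1 -n 40 --no-pager 2>/dev/null || true
    fi
    return 0
}

ticket_info
```

The last lines of the previous boot’s log should show how far the shutdown got before the guest stopped responding.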

Mine’s actually different: I have a physical server at a hosting provider that doesn’t reboot, so for this one it’s potentially something with the hardware or firmware. But I have installed upower on it to see if it makes a difference next time I reboot. All my VMs work fine; it’s just this one particular physical server that’s a problem, but it doesn’t belong to me so I can’t do much with it. All the ones I have in my office work fine. I just put it down to crappy hardware from the hosting provider.

I’m hoping this will sort out my physical server. For me, every Rocky VM I have reboots fine.

If I understand correctly, modern kernels default to acpi, and mine wasn’t specified in the grub file, so I added reboot=pci. It rebooted fine (which it always does shortly after a hanging reboot and a forced server restart), so we’ll see in a week, I guess. Thanks for the instructions.

reboot=pci didn’t work. Based on what I’m seeing, it’s a qemu host misconfiguration issue which I won’t be able to fix from my side. I’ll see if submitting a support ticket helps.

Thought there was something of interest here but it also seems to be about the host:

It’s still odd that after a reboot, another reboot will be quick, but after a week or more the reboot will hang…