Rocky 9 KVM host - reboot hard-stops the VMs instead of suspending them

I have a Rocky 9.7 install running on bare metal on an old Lenovo RD430 server (it used to run Server 2019), set up out of the box as a KVM virtualization host.

It hosts 3x VMs:

2x Rocky 9.7 VMs

1x Windows Server 2019 VM

All VMs have the QEMU packages installed and are accessible from the host.

Updates were available for the host OS, so I installed them and rebooted the host. When it came back up, the first clue was the Windows VM telling me it had been shut down unsafely. I then checked the two Rocky VMs and they had been “unplugged” too; no shutdown command was sent to them.

The RHEL documentation says they should be paused and resumed, assuming auto start is enabled, which it is.

There are no obvious steps that I appear to be missing. The machine is operated mostly headless (there is a console available, but it’s not used generally), so most management is being done through SSH or Cockpit. I did find an article that mentioned something about the Watchdog config:

KVM Kernel Virtualization Machine on Rocky Linux 9 - wiki.TerraBase.info

Under the “Power Events” section.

It mentions the Virtual Machine Manager GUI, which I’m of course not using, but that whole section sounds pretty wild.

I think you mean if you reboot a host that is running guest VMs, it does bad things to the guests?

Yeah, it’s not suspending (“pausing”) the guests, which is apparently what it is supposed to do (like Hyper-V does), and it’s not shutting them down either. It’s basically just pulling the plug on them, which of course they are unhappy about.

Is the qemu-guest-agent package installed, and are its services running? That’s the only thing I can think of that would cause such a situation.

Yep, first thing I checked.

The missing piece is libvirt-guests.service. This is the systemd unit that handles graceful VM shutdown or suspend when the host powers off. If it isn’t enabled and running, nothing asks the guests to stop before the host goes down, so the qemu processes are simply killed along with everything else during shutdown, which is why the VMs are seeing a “pulled plug” event.

First check if the service exists and is enabled:

systemctl status libvirt-guests.service

If it’s disabled or not running, enable it:

systemctl enable --now libvirt-guests.service
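
As a quick follow-up check, here is a minimal sketch that confirms the unit is both enabled at boot and currently active. The function name unit_ready is made up for illustration; is-enabled and is-active are standard systemctl subcommands.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the unit is enabled AND active.
# Defaults to libvirt-guests.service when no unit name is given.
unit_ready() {
    unit="${1:-libvirt-guests.service}"
    systemctl is-enabled --quiet "$unit" && systemctl is-active --quiet "$unit"
}
```

You could then run something like: unit_ready && echo "libvirt-guests is ready"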

Then configure its behavior in /etc/sysconfig/libvirt-guests. The key settings:

# What to do to each guest on host shutdown
# Options: suspend / shutdown
ON_SHUTDOWN=shutdown

# How long to wait for each guest to shut down before giving up (seconds)
SHUTDOWN_TIMEOUT=300

# What to do to guests on host boot
ON_BOOT=start
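
To see what the settings above resolve to once the file is applied over the built-in defaults, a rough sketch like this may help. The function name effective_settings is hypothetical, and the defaults written into it (ON_BOOT=start, ON_SHUTDOWN=suspend, SHUTDOWN_TIMEOUT=300) are my assumption of what the stock script uses; please confirm them against man libvirt-guests on your system.

```shell
#!/bin/sh
# Hypothetical helper: print the values libvirt-guests would end up with,
# given a sysconfig-style file. Runs in a subshell so nothing leaks out.
effective_settings() {
    conf="${1:-/etc/sysconfig/libvirt-guests}"
    (
        # Assumed built-in defaults; verify with `man libvirt-guests`.
        ON_BOOT=start
        ON_SHUTDOWN=suspend
        SHUTDOWN_TIMEOUT=300
        # The file is plain shell variable assignments, so sourcing it
        # applies any overrides. Only source files you trust.
        if [ -r "$conf" ]; then
            . "$conf"
        fi
        printf 'ON_BOOT=%s\nON_SHUTDOWN=%s\nSHUTDOWN_TIMEOUT=%s\n' \
            "$ON_BOOT" "$ON_SHUTDOWN" "$SHUTDOWN_TIMEOUT"
    )
}
```

Called with no argument it reads /etc/sysconfig/libvirt-guests; if that file does not exist, it just prints the assumed defaults.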

If you want behavior similar to Hyper-V’s save/resume, use ON_SHUTDOWN=suspend. This saves the VM’s memory state to disk and, with ON_BOOT=start, restores it on the next boot, so the guest picks up where it left off instead of rebooting. The tradeoff is disk space (expect something on the order of the guest’s RAM) and extra time during host shutdown.
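
As far as I know, the suspend option is implemented with virsh managedsave, so after a suspend you can check which guests have a saved image waiting. A small sketch, assuming the system libvirt connection (qemu:///system); the function name saved_guests is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: list the names of guests that currently have a
# managed save image, i.e. guests that will be restored rather than booted.
saved_guests() {
    uri="${1:-qemu:///system}"
    virsh -c "$uri" list --all --with-managed-save --name
}
```

An empty result after a host reboot would suggest the guests were not suspended on the way down.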

ON_SHUTDOWN=shutdown sends a clean ACPI shutdown signal and waits up to SHUTDOWN_TIMEOUT seconds for the guest to power off gracefully. This is what Windows Server and most Linux guests handle well since they receive a proper shutdown command rather than a hard kill.

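After editing /etc/sysconfig/libvirt-guests, restart the service and test with a host reboot. Be aware that restarting the service runs its stop action first, so running guests will be suspended or shut down (per ON_SHUTDOWN) and then started again: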

systemctl restart libvirt-guests.service

Your Windows VM should receive a normal ACPI shutdown signal next time the host reboots.
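
To confirm after the test reboot that the service actually handled the guests, you can look at its log from the previous boot. A minimal sketch; the function name previous_boot_log is hypothetical, and journalctl -b -1 only returns anything if the journal is persistent (i.e. /var/log/journal exists), which I am assuming here and which may not be the case on a default install:

```shell
#!/bin/sh
# Hypothetical helper: show what a unit logged during the previous boot,
# which includes its stop action at host shutdown.
# Assumes persistent journald storage; otherwise -b -1 has nothing to show.
previous_boot_log() {
    unit="${1:-libvirt-guests.service}"
    journalctl -b -1 -u "$unit" --no-pager
}
```

Run it as root after the host comes back up and look for per-guest suspend or shutdown messages.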

Thank you! I saw this in the guide for 8, but it was pulled from 9?

Checking the state:

As expected, it’s disabled. I’ve enabled it.

Now, there was no file under sysconfig called libvirt-guests, so I created it with your contents, adjusted for suspend rather than shutdown.

Interestingly, when I restarted the service as instructed, it suspended and then resumed the VMs. Is this expected behavior? Both steps were successful.

Do we know why this is:

A) not included in the qemu setup guide

B) disabled by default

?

Some of the answers are here:
man libvirt-guests
It explains the defaults and how to override them using /etc/sysconfig/libvirt-guests.
It’s normal for many services to be disabled on day one. You don’t want everything enabled by default, turning your “enterprise linux” into a zoo.

True, but I’d have expected it to be enabled when you choose “VM Server” in the provisioning options, and I’d certainly expect it to be mentioned in the VM host setup documentation, which it isn’t. This seems like a rather important component that’s being omitted here, as clearly, the behavior I’ve observed with it disabled is undesirable, and would be undesirable for anyone.