Rocky Linux 8 can't boot on Google Cloud

Hello,

I have a Rocky Linux 8 instance on Google Compute Engine that, for some reason, is unable to boot because it can't mount the root filesystem, as shown in the error below. Please let me know how to fix the issue. Thank you so much.

Kind regards,
Asfihani

UEFI: Attempting to start image.
Description: Rocky Linux
FilePath: HD(1,GPT,14141A39-396F-47E5-9B5B-29A1B56C4D7D,0x800,0x64000)/\EFI\rocky\shimx64.efi
OptionNumber: 3.

error: ../../grub-core/fs/fshelp.c:258:file
`/boot/initramfs-4.18.0-513.11.1.el8_9.cloud.0.1.0.3.x86_64.img' not found.
error: ../../grub-core/fs/fshelp.c:258:file
`/boot/initramfs-4.18.0-513.11.1.el8_9.cloud.0.1.0.3.x86_64.img' not found.


Press any key to continue...Press any key to continue...

[    0.000000] Linux version 4.18.0-513.11.1.el8_9.cloud.0.1.0.3.x86_64 (mockbuild@iad1-prod-build001.bld.equ.rockylinux.org) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-20) (GCC)) #1 SMP Wed Feb 14 16:07:11 UTC 2024
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt2)/boot/vmlinuz-4.18.0-513.11.1.el8_9.cloud.0.1.0.3.x86_64 root=UUID=9bd2062d-d31f-4ec4-a11d-72ebc298c688 ro net.ifnames=0 biosdevname=0 scsi_mod.use_blk_mq=Y crashkernel=auto console=ttyS0,38400n8 rootflags=uquota
...

[    2.372745] md: Waiting for all devices to be available before autodetect
[    2.374481] md: If you don't use raid, use raid=noautodetect
[    2.376705] md: Autodetecting RAID arrays.
[    2.377913] md: autorun ...
[    2.378737] md: ... autorun DONE.
[    2.379786] List of all partitions:
[    2.380864] No filesystem could mount root, tried: 
[    2.380865] 
[    2.382594] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    2.384857] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.18.0-513.11.1.el8_9.cloud.0.1.0.3.x86_64 #1
[    2.387428] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
[    2.389867] Call Trace:
[    2.390601]  dump_stack+0x41/0x60
[    2.392010]  panic+0xe7/0x2ac
[    2.392607]  mount_block_root+0x2be/0x2e6
[    2.393653]  ? do_early_param+0x95/0x95
[    2.394670]  prepare_namespace+0x135/0x16b
[    2.395798]  kernel_init_freeable+0x208/0x232
[    2.396797]  ? rest_init+0xaa/0xaa
[    2.397703]  kernel_init+0xa/0xff
[    2.398879]  ret_from_fork+0x35/0x40
[    2.400597] Kernel Offset: 0x2c600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    2.403386] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---

Hi -

A few questions:

What instance type is this using?

What image did you select when creating this machine?

Is this a new machine, or has this instance been around for a while?

At a guess, your initramfs was not properly generated. This can happen for a number of reasons, but the usual suspect is a cloud instance that does not have enough RAM to perform critical functions, including package management (dnf) and post-install tasks such as dracut regenerating the initramfs. Information on required/recommended system specifications is available upstream at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/performing_a_standard_rhel_8_installation/system-requirements-reference_installing-rhel
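
One quick way to confirm this diagnosis on a running system (or on a mounted copy of the broken disk) is to check that every kernel in /boot has a matching initramfs image. The following is only a minimal sketch: the function name is made up for illustration, and it assumes the usual vmlinuz-<version> / initramfs-<version>.img naming seen in the GRUB error above.

```python
from pathlib import Path


def kernels_missing_initramfs(boot_dir="/boot"):
    """Return kernel versions that have a vmlinuz file but no usable initramfs."""
    boot = Path(boot_dir)
    missing = []
    for kernel in sorted(boot.glob("vmlinuz-*")):
        version = kernel.name[len("vmlinuz-"):]
        image = boot / f"initramfs-{version}.img"
        # Treat a zero-byte image the same as a missing one.
        if not image.is_file() or image.stat().st_size == 0:
            missing.append(version)
    return missing


if __name__ == "__main__":
    for version in kernels_missing_initramfs():
        print(f"no usable initramfs for kernel {version}")
```

Running this after a kernel update and before a reboot would flag a kernel like the one in the log above while the machine is still up.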

I created an e2-medium using rocky-cloud-8,image=projects/rocky-linux-cloud/global/images/rocky-linux-8-optimized-gcp-v20240213 and updated to 4.18.0-513.11.1.el8_9.cloud.0.1.0.3.x86_64 without incident.

To recover your instance, GCP has some general instructions in "Recover a VM with a corrupted or full disk" (Compute Engine documentation), but the basic steps boil down to:

  1. snapshot the machine's disk
  2. create a new machine
  3. create a disk from the bad machine's snapshot and attach it to the new machine
  4. log in to the new machine, and mount the bad disk's partitions to perform fixes or data recovery
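
The steps above can be sketched as a list of commands. This is a rough outline rather than a tested procedure: the Python helpers only build and print the commands, and every disk, VM, snapshot, and zone name, as well as /dev/sdb2 and /mnt/rescue, is a placeholder assumption. The image name is the one mentioned earlier in this thread, and the final dracut call assumes the problem really is a missing initramfs.

```python
import shlex


def cloud_side_commands(bad_disk, zone, snapshot, rescue_disk, rescue_vm):
    """gcloud commands for steps 1-3; all names passed in are placeholders."""
    return [
        # 1. Snapshot the broken machine's disk.
        ["gcloud", "compute", "disks", "snapshot", bad_disk,
         f"--snapshot-names={snapshot}", f"--zone={zone}"],
        # 2. Create a new (rescue) machine.
        ["gcloud", "compute", "instances", "create", rescue_vm, f"--zone={zone}",
         "--image=rocky-linux-8-optimized-gcp-v20240213",
         "--image-project=rocky-linux-cloud"],
        # 3. Turn the snapshot into a disk, then attach it to the rescue machine.
        ["gcloud", "compute", "disks", "create", rescue_disk,
         f"--source-snapshot={snapshot}", f"--zone={zone}"],
        ["gcloud", "compute", "instances", "attach-disk", rescue_vm,
         f"--disk={rescue_disk}", f"--zone={zone}"],
    ]


def rescue_vm_commands(root_partition, kernel_version, mountpoint="/mnt/rescue"):
    """Step 4, run as root on the rescue machine: mount, chroot, rebuild initramfs."""
    cmds = [["mkdir", "-p", mountpoint], ["mount", root_partition, mountpoint]]
    for fs in ("dev", "proc", "sys"):
        cmds.append(["mount", "--bind", f"/{fs}", f"{mountpoint}/{fs}"])
    # dracut <image> <kernel version>, with --force to overwrite any partial image.
    cmds.append(["chroot", mountpoint, "dracut", "--force",
                 f"/boot/initramfs-{kernel_version}.img", kernel_version])
    return cmds


if __name__ == "__main__":
    plan = cloud_side_commands("bad-disk", "us-central1-a", "bad-disk-snap",
                               "rescue-copy", "rescue-vm")
    plan += rescue_vm_commands("/dev/sdb2",
                               "4.18.0-513.11.1.el8_9.cloud.0.1.0.3.x86_64")
    for argv in plan:
        print(" ".join(shlex.quote(arg) for arg in argv))
```

If the rescue machine was built from the same image, an XFS root filesystem may refuse to mount because of a duplicate UUID; mount -o nouuid is the usual workaround in that case.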

If you can provide more information on the environment in which you encountered this problem, that would be helpful.

Best,
Neil

Hi Neil,

Thank you for the prompt response, and I apologize for the lack of detailed information, as I was in a hurry. This machine has been around for a while; it was created around Mar 14, 2023. It runs cPanel, which hosts a few websites and email. Yesterday my partner rebooted the server and it never came back. After connecting to the serial console, I got the error I mentioned earlier.

Fortunately, I had set up daily snapshot backups and was able to recover from a snapshot, so the instance is now back up and running again. However, I would like to know what could have caused the issue and how to avoid it in the future.

Thanks again for your time.

Kind regards,
Asfihani