Default kernel choice ignored

I’m going crazy trying to figure out why my Rocky 8 system is ignoring my choice of default kernel. I followed some suggestions in Latest kernel not booted - #6 by guzzijason, tried various combinations of grub2-switch-to-blscfg, grub2-mkconfig, grubby --set-default, etc., and no matter what I do, when grub comes up the selected menu item is a 4.18.0-372.16 kernel rather than the latest 4.18.0-425.13. I can manually select the one I want, but the selection doesn’t stick.

The weirdest thing is that grubby --default-kernel reports the correct one (425.13), and grubenv (in both /boot/grub2/ and /boot/efi/EFI/rocky/) reports the saved entry as the correct one. /etc/default/grub has GRUB_DEFAULT=saved and GRUB_SAVEDEFAULT=true.
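For anyone wanting to repeat that comparison, here is a minimal sketch. The helper name saved_entry_of is made up for illustration, and it assumes each grubenv stores the value on a line of the form saved_entry=<id>. The two paths are the ones mentioned above; the checks are skipped if the files or grubby are not available.

```shell
#!/bin/sh
# Hypothetical helper: print the saved_entry value stored in a grubenv file.
# Assumes the file contains a line of the form "saved_entry=<id>".
saved_entry_of() {
    sed -n 's/^saved_entry=//p' "$1"
}

# Compare the two grubenv copies mentioned above (skipped if not readable).
for env in /boot/grub2/grubenv /boot/efi/EFI/rocky/grubenv; do
    if [ -r "$env" ]; then
        printf '%s: %s\n' "$env" "$(saved_entry_of "$env")"
    fi
done

# Ask grubby which kernel it considers the default, if grubby is installed.
if command -v grubby >/dev/null 2>&1; then
    grubby --default-kernel
fi
```

If the two saved_entry values and the grubby answer all name the same kernel, the stored state is consistent and the problem lies in how the menu reads it.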

We need to know where the boot-related files are on your system. Please post the requested output as preformatted text, using the </> button in the editor menu.
In /etc, we need to know where these links point, as shown by this listing:

ls -l /etc | grep grub2

Post your system’s mount points per /etc/fstab.
Post the output of this command:

grep -i blscfg /etc/default/grub

Then the output of this listing:

ls -l /boot/loader/entries

Once you’ve posted the requested information it may be possible to repair the boot loader menu selection.

Thanks for taking a look.

$ ls -l /etc | grep grub2
lrwxrwxrwx.  1 root  root          22 Feb 21 15:58 grub2.cfg -> ../boot/grub2/grub.cfg
lrwxrwxrwx   1 root  root          30 Feb 21 15:58 grub2-efi.cfg -> ../boot/efi/EFI/rocky/grub.cfg

relevant mountpoints

/dev/lvm_vol_group/boot              /boot                   ext4    defaults,nosuid,nodev        1 2
$ grep -i blscfg /etc/default/grub 
GRUB_ENABLE_BLSCFG=true
$ ls -l /boot/loader/entries
total 16
-rw-r--r--. 1 root root 405 Jun 22  2021 264e18c4978f4cc9b7885e5be223b563-0-rescue.conf
-rw-r--r--  1 root root 360 Aug 12  2022 264e18c4978f4cc9b7885e5be223b563-4.18.0-372.16.1.el8_6.0.1.x86_64.conf
-rw-r--r--  1 root root 340 Feb 21 14:39 264e18c4978f4cc9b7885e5be223b563-4.18.0-425.13.1.el8_7.x86_64.conf
-rw-r--r--  1 root root 325 Nov  9 15:27 264e18c4978f4cc9b7885e5be223b563-4.18.0-425.3.1.el8.x86_64.conf

So based on the output you’ve provided, it appears that RH8 uses the old method of implementing the boot loader specification (“BLS”). That means the grub.cfg in …/EFI/rocky/ is not a stub pointing to /boot as it is on RL9. The target of grub2-mkconfig in this case should be /boot/efi/EFI/rocky/grub.cfg.
Since you say you are trying to specify the default kernel which just happens to be the current latest kernel, does this mean you had in the past specified a different kernel? If so, what means did you use to do that?
There is one more thing that would be good to know and that is the output of this:

grep -i grub_default /etc/default/grub

To always boot the latest kernel, GRUB_DEFAULT should be set to 0.

I don’t recall ever specifying a specific kernel. What I’d like is for it to default to the latest kernel unless I specifically pick a different one at boot time, and then stick with that one. I have

$ grep -i grub_default /etc/default/grub
GRUB_DEFAULT=saved

which, as I understand, is consistent with the behavior I want.

I’ve also confirmed that the output of grub2-mkconfig is identical to /boot/efi/EFI/rocky/grub.cfg
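For reference, a rough sketch of how such a comparison can be done. The helper name configs_match is hypothetical; the target path is the one from this thread. The live check is skipped unless grub2-mkconfig is installed and the target is readable, and it normally needs root.

```shell
#!/bin/sh
# Hypothetical helper: report whether two config files are byte-identical.
configs_match() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Generate a fresh config into a temporary file and compare it with the
# one on the EFI system partition. Nothing on the system is overwritten.
target=/boot/efi/EFI/rocky/grub.cfg
if command -v grub2-mkconfig >/dev/null 2>&1 && [ -r "$target" ]; then
    fresh=$(mktemp)
    grub2-mkconfig -o "$fresh" 2>/dev/null
    configs_match "$fresh" "$target"
    rm -f "$fresh"
fi
```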

Is your system booting via EFI? If so, delete /boot/grub2/grub.cfg and /etc/grub2.cfg.
Afterwards, regenerate the configuration just to be sure: grub2-mkconfig > /etc/grub2-efi.cfg

I suspect it is booting by EFI, but why would having both config files be an issue, if they are identical, which they are?

I don’t think the duplicate configs matter either. BLS configures the boot order differently from plain grub2. The grubby tool edits the configs under /boot/loader/entries/, so I would compare the entries there to see how they differ. It may also be helpful to consult the Red Hat documentation for RHEL 8; it may cover the saved-entry behavior that you want. So what happens when the saved entry gets rotated out of existence on a future kernel update?
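To compare the entries under /boot/loader/entries/ at a glance, something along these lines may help. This is only a sketch: bls_field is a made-up helper name, and it assumes each entry file uses the usual "key value" lines (title, version, linux) of the boot loader spec.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key (e.g. title, version, linux)
# from a BLS entry file, where each line has the form "<key> <value>".
bls_field() {
    awk -v key="$2" '$1 == key { sub(/^[^ \t]+[ \t]+/, ""); print; exit }' "$1"
}

# Summarise every entry so that differences stand out.
for entry in /boot/loader/entries/*.conf; do
    [ -r "$entry" ] || continue
    printf '%s\n  title:   %s\n  version: %s\n  linux:   %s\n' \
        "$entry" \
        "$(bls_field "$entry" title)" \
        "$(bls_field "$entry" version)" \
        "$(bls_field "$entry" linux)"
done
```

The part of each file name before .conf is the entry id, which is what saved_entry in grubenv is expected to match.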

Doesn’t that mean that GRUB should update grubenv on every boot to record which entry you chose?

It is not set by default.

Doesn’t that mean that GRUB should update grubenv on every boot to record which entry you chose?

Yes, I think it should, and the saved entry in grubenv is the correct one (i.e. the one I selected last), but it still doesn’t select that one by default next time I boot. And grubby correctly reports the kernel I want to boot with as the default. But the grub menu doesn’t actually have that one highlighted by default.

efibootmgr can show the UEFI boot menu entries. Does the default entry load the .efi binary from EFI/rocky/?

I’m not entirely sure how to parse the output, but I think yes?

BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,000B,000C,0002,0003,0004,0005,0007,0008,0009,000A,000D,0001
Boot0000* Rocky Linux
.
.
.

Since 0000 is first in the boot order and is titled “Rocky Linux”, it probably loads what we expect.

To know for sure, the command needs the verbose option: -v
See https://wiki.gentoo.org/wiki/Efibootmgr
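If the verbose output is long, a small filter can pull out just the entry that comes first in BootOrder. This is a sketch, not a polished tool: first_boot_entry is a hypothetical helper name, and it assumes the usual "BootOrder: a,b,c" and "BootNNNN* label" line layout of efibootmgr output.

```shell
#!/bin/sh
# Hypothetical helper: read efibootmgr output on stdin and print the line
# of the entry that is listed first in BootOrder.
first_boot_entry() {
    awk '
        /^BootOrder:/ { split($2, order, ","); first = order[1] }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            entries[substr($0, 5, 4)] = $0
        }
        END { if (first in entries) print entries[first] }
    '
}

# Run against the live firmware settings only if efibootmgr is installed.
if command -v efibootmgr >/dev/null 2>&1; then
    efibootmgr -v | first_boot_entry
fi
```

With -v, the printed line also includes the device path and the .efi file that the firmware loads for that entry.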

You might look at what is saved in /etc/sysconfig/kernel; maybe there is a conflict between what is there and what is in your grubenv.
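A quick way to pull the relevant settings out of that file, as a sketch only. The helper name conf_value is made up, and the key names UPDATEDEFAULT and DEFAULTKERNEL are what I would expect to find there on an EL8 system, so treat them as an assumption and check your own file.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from a KEY=value style file,
# ignoring commented-out lines and stripping surrounding double quotes.
# If the key appears more than once, the last occurrence wins.
conf_value() {
    sed -n "s/^[[:space:]]*$2=//p" "$1" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Key names below are assumed, not taken from this thread.
if [ -r /etc/sysconfig/kernel ]; then
    for key in UPDATEDEFAULT DEFAULTKERNEL; do
        printf '%s=%s\n' "$key" "$(conf_value /etc/sysconfig/kernel "$key")"
    done
fi
```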

Does the order of these entries matter?

efibootmgr -v | grep Rocky
Boot0000* Rocky Linux	HD(7,GPT,4d4baea3-0308-4ae4-b2c1-7ecc6f262789,0xdd01000,0x200000)/File(\EFI\ROCKY\SHIMX64.EFI)

This isn’t going to be conclusive as it only points to the UEFI boot loader and not to a grub.cfg file. Do note that the first 50 lines of grub.cfg deal with what kernel to boot based on the saved default.

That is true. The bootloader can and does look for its config under many names and paths. I’ve seen that in the logs of a TFTP server after a PXE boot.

So I read the documentation for RH9 regarding setting a kernel other than the latest as the default, and it only covered one-time reboots, which is useless for the OP’s desired setup. The thing is that with BLS, grub2 is not truly controlling the boot. The OP’s desire is an edge case, but a valid one if you have hardware that doesn’t work with a particular kernel. I should note that the OP is using RH8, but the documentation there is probably just as sparse. So the only real help will come from someone who has actually mastered this on their own replying to this thread with their knowledge.