Latest kernel not booted

I just noticed that the most recent kernel is not selected on reboot, whether I reboot through Cockpit or at the console. Unless I intervene by selecting the latest kernel (the first entry in the GRUB menu), the system boots the third entry.

System is MBR

/etc/default/grub

GRUB_TIMEOUT=8
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto resume=UUID=71da8f2b-df89-4d45-b763-9768884fea6f selinux=0"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true

I also see that on every boot the file /boot/grub2/grubenv is touched (its timestamp updates) but its contents remain unchanged, still listing the old kernel:

# GRUB Environment Block
saved_entry=8b5f15c6d76440a3b655ab35c3fe1d9a-4.18.0-348.20.1.el8_5.x86_64
kernelopts=root=UUID=1d2eb7c8-376d-47f1-8ffa-a13a70b7b4a2 ro crashkernel=auto resume=UUID=71da8f2b-df89-4d45-b763-9768884fea6f selinux=0 
boot_success=1
boot_indeterminate=0
#################################################################################################+##############################################################################################################################################################################################################################################################################################################################################################################################################################################################################

I know I can probably fix this with the grubby tool but I am concerned that some script is not running correctly during kernel updates and this issue will recur.

Installed kernels:

vmlinuz-4.18.0-348.20.1.el8_5.x86_64
vmlinuz-4.18.0-348.23.1.el8_5.x86_64
vmlinuz-4.18.0-372.9.1.el8.x86_64
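
For anyone wanting to check for the same mismatch, here is a minimal sketch that compares the kernel recorded in grubenv with the newest vmlinuz file. The function names are hypothetical, and it assumes saved_entry has the form <machine-id>-<kernel-version> (as in the listing above) and that GNU sort with -V is available.

```shell
#!/bin/sh
# Sketch only: function names are made up for illustration.

# Print the kernel version recorded in the saved_entry line of a grubenv file.
# Assumes saved_entry=<machine-id>-<kernel-version>; the machine-id has no '-'.
saved_kernel_version() {
    sed -n 's/^saved_entry=[^-]*-//p' "$1"
}

# Print the highest kernel version among the vmlinuz-* files in a directory.
# Rescue images are skipped; ordering relies on GNU sort -V (version sort).
newest_kernel_version() {
    for f in "$1"/vmlinuz-*; do
        [ -e "$f" ] || continue
        name=${f##*/}
        case "$name" in
            *rescue*) continue ;;
        esac
        printf '%s\n' "${name#vmlinuz-}"
    done | sort -V | tail -n 1
}

# Example (run as root on the affected system):
#   saved_kernel_version /boot/grub2/grubenv
#   newest_kernel_version /boot
```

With the values posted above, the first call would print 4.18.0-348.20.1.el8_5.x86_64 and the second 4.18.0-372.9.1.el8.x86_64, which is exactly the mismatch described.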

Check the value of UPDATEDEFAULT in /etc/sysconfig/kernel. If it is yes, then each newly installed kernel should become the default:

# UPDATEDEFAULT specifies if kernel-install should make
# new kernels the default
UPDATEDEFAULT=yes

# DEFAULTKERNEL specifies the default kernel package type
DEFAULTKERNEL=kernel-core

Set the correct default kernel:

grubby --set-default /boot/vmlinuz-4.18.0-372.9.1.el8.x86_64

olista,
The contents of /etc/sysconfig/kernel are exactly as you posted.

I do believe I have fixed it now. I will have to wait till the next kernel update to know for sure. The steps I took were to change this line in my /etc/default/grub file:
from

GRUB_DEFAULT=saved

to

GRUB_DEFAULT=0

Then I updated grub:
grub2-mkconfig -o /boot/grub2/grub.cfg

When I did so, I got an error saying the environment file was too small, and the update failed. I then removed the original /boot/grub2/grubenv file and ran grub2-mkconfig again. This time it completed successfully and created a new grubenv file. I have since rebooted into the latest kernel without intervention.

So I think the real issue is that the original grubenv file was corrupt in some manner, which was not apparent to me until I manually ran grub2-mkconfig -o /boot/grub2/grub.cfg and saw the error. In conclusion, I don’t think I needed to edit /etc/default/grub at all, as the problem was the corrupt grubenv file.
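
A quick way to spot a damaged grubenv before running grub2-mkconfig might look like the sketch below. It relies on the GRUB manual's description of the environment block as a preallocated 1024-byte file and on the signature line shown in the listing earlier. Treating any other size as a sign of corruption is my assumption, not a documented check, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: grubenv_looks_sane is a made-up helper name.

# Return 0 if the file starts with the GRUB signature line and is exactly
# 1024 bytes long (the preallocated size described in the GRUB manual).
grubenv_looks_sane() {
    file=$1
    [ -f "$file" ] || return 1
    # Arithmetic expansion strips any padding some wc implementations add.
    size=$(( $(wc -c < "$file") ))
    header=$(head -n 1 "$file")
    [ "$size" -eq 1024 ] && [ "$header" = "# GRUB Environment Block" ]
}

# Example (run as root on the affected system):
#   grubenv_looks_sane /boot/grub2/grubenv || echo "grubenv looks damaged"
```

If the check fails, grub2-editenv list should show what GRUB can still read from the file, and regenerating it as described above is one way out.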

Solved, until proven otherwise.

GRUB_DEFAULT=saved is the default in Rocky Linux. It is not recommended to change default parameters, unless there is a good reason (which is not the case here).

saved_entry was wrongly pointing to the old kernel in the grubenv file. Running grubby --set-default … would have fixed it.

I agree there is no reason to change from saved to 0; it was just the wiggle that led me to the corrupt environment file.

I’m not so sure that running grubby --set-default … would have worked, since the environment file was corrupt. I think grubby is called during kernel updates to do just that, and it was failing. I’ll have to go back through the dnf logs to check, as that is the only logging done now on package updates.

I just ran into a similar problem myself, trying to update from (8.6) 4.18.0-372.9.1 to 4.18.0-372.13.1.

# ls /boot/loader/entries/
7d726ff1eec94d04bc45f3f5e772bff9-0-rescue.conf
7d726ff1eec94d04bc45f3f5e772bff9-4.18.0-372.13.1.el8_6.x86_64.conf
7d726ff1eec94d04bc45f3f5e772bff9-4.18.0-372.9.1.el8.x86_64.conf

# grep ^UPDATEDEFAULT /etc/sysconfig/kernel
UPDATEDEFAULT=yes

# grep ^GRUB_DEFAULT /etc/default/grub
GRUB_DEFAULT=saved

But when the servers boot, the latest kernel is not listed in the grub menu at all - all I see there are 4.18.0-372.9.1 and the rescue option.

I’m still trying to get to the root cause.

Problem seems to be related to the switch to BLS: 1859398 – dnf ugrade kernel does not update grub.cfg

Resolution: I was able to fix the problem on my systems by running:

# grub2-switch-to-blscfg

and then reinstalling the latest kernel package. At this point, it’s successfully using BLS, and reboots into the new kernel.

What I don’t understand currently is, if BLS is now supposed to be the default in el8, why did I have to manually run the conversion tool? Strange.

Update: our config mgmt was enforcing a template on /etc/default/grub that was NOT including GRUB_ENABLE_BLSCFG=true, so it seems that’s what was breaking it in our environment.
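
In case it helps anyone whose config management also templates this file, here is a small sketch of a check that could be run after the template is applied. The function name is hypothetical, and it only looks for an uncommented GRUB_ENABLE_BLSCFG=true line (optionally quoted); it does not verify that grub.cfg itself was regenerated.

```shell
#!/bin/sh
# Sketch only: bls_enabled is a made-up helper name.

# Return 0 if the given file sets GRUB_ENABLE_BLSCFG to true.
# Accepts true or "true"; commented-out lines do not match.
bls_enabled() {
    grep -Eq '^[[:space:]]*GRUB_ENABLE_BLSCFG="?true"?[[:space:]]*$' "$1"
}

# Example:
#   bls_enabled /etc/default/grub || echo "BLS is not enabled in /etc/default/grub"
```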