Trouble in River City -- Rocky Linux 8.5 Rollover Blew Up

Well I went and did my updates and was greeted with some 880+ updates and figured that Rocky Linux 8.5 was out. Part of it went in but a whole lot did not. So I did dnf update --allowerasing and also --nobest. TADA… sort of. Everything installed so I rebooted the machine and got NOTHING!!! Either I get a Blinking Cursor followed by a string of OK’s, then the screen goes to BLACK and nothing else. The last “OK” had to do with GNOME and I run KDE.

Alternately I get a Blinking Cursor and nothing else. I seem to have no way of even checking log files.

BTW before someone asks YES I did try using the emergency Rescue kernel. All I got there was alternating blackouts followed by OK’s, so I can’t even Rescue Me.

Me Thinks I am in DEEP TROUBLE. Suggestions?

When you restart and the list of kernels pops up, press “e” on the latest kernel to start an edit session. On the line which starts with linux, add " 3" (without the quotes) to the end of the line, making sure there’s a space before the “3”; the line is usually 2 or 3 lines long. Then press ctrl-x to exit edit mode and boot. You will end up in terminal mode, where you can log in as root to sort stuff out. Hope this helps.

@TonyH The “3” is a SysV runlevel. One should use systemd targets.
Runlevels “are only included for compatibility reasons and should be avoided.”
See Chapter 15, “Working with systemd targets”, in the Red Hat Enterprise Linux 8 documentation on the Red Hat Customer Portal.

@desercat Who said that --allowerasing or --nobest are good to try?

Black/blink is typical for graphics driver issues in graphical.target. Booting to the multi-user or rescue target is a logical option.

When you can’t get into your system at all then using the legacy runlevels still works and is relevant. Chapter 15 describes how to enter the other targets on the next boot. You have to be in a running session to enable those targets.

@jbkt23 you can access the systemd targets from grub. Edit the grub menu entry, and add to the end of the kernel line:

systemd.unit=multi-user.target

will also get you to the equivalent of runlevel 3. You can also get close to single-user mode, or runlevel 1 if you like (strictly, runlevel 1 maps to rescue.target; emergency.target is more minimal still), with:

systemd.unit=emergency.target

This has been the recommended method from RHEL 7 onwards, since systemd was adopted. Whilst the legacy runlevels are still available, they will most likely disappear at some point and no longer be an option. So it is best to use the newer methods and get used to remembering them for the future.

This is what should have been posted as a response instead of citing chapter 15, which only covers implementing those states from a running system. I’m good with that even if I didn’t know the correct answer. But I can’t imagine memorizing that 20-character string just in case you can’t boot your system.

Yep not ideal. Providing we know it always starts with systemd.unit= and always ends in .target then the only part we change is emergency or multi-user. But yeah writing 1 or single was always easier.
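
To make that pattern concrete, here is a minimal sketch of a helper that assembles the kernel argument from the short name. The function name target_arg is made up for illustration; it is not something GRUB or systemd provides, and the list of accepted names is simply the targets mentioned in this thread.

```shell
# Hypothetical helper: build the kernel command-line argument for a systemd target.
# Only the middle part changes; the prefix and suffix are always the same.
target_arg() {
    case "$1" in
        multi-user|emergency|rescue|graphical)
            printf 'systemd.unit=%s.target\n' "$1" ;;
        *)
            # Anything else is rejected rather than guessed at.
            printf 'unknown target: %s\n' "$1" >&2
            return 1 ;;
    esac
}

target_arg multi-user
target_arg emergency
```

The printed string is what gets typed at the end of the linux line in the GRUB editor; the helper only exists as a memory aid on a machine that still boots.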

Well, things are getting more mysterious by the day. This is the case of who (or what) dunit.

I tried the suggested hints. I modified the various entries, first by adding the 3 at the end of that big long ugly string and then using systemd.unit=multi-user.target and systemd.unit=emergency.target. Then logged in as root. First thing I did was to update the system hoping that would solve the problem; it did not. Then for a lark I ran startx. SURPRISE!! I was able to mess around in GNOME, but KDE was out of the question. I rebooted the system but still had no login page. Next I tried to Rescue Me. Another SURPRISE awaited me: I got the following message: “Nvidia Kernel Module MISSING (!!!) Falling back to Nouveau”. Then it went out into the weeds. My next experiment was to run FSCK on all the nvme0n1p1-10 partitions hoping that would solve the problem. It did not.

SUMMATION:

  1. Editing the long ugly string (LUS) allowed me to log in as root, which allowed me to update the system.
  2. Editing the LUS allowed me to log in as root and run startx, which dropped me into GNOME.
  3. Editing the LUS for the Rescue Kernel comes up with the following message: “Nvidia Kernel Module MISSING (!!!) Falling back to Nouveau”.
  4. Once I exit the OS and start up as normal I am still missing a login screen. It either gives me a long string of ERROR messages, or it blinks on and off, shows a string of OK’s, and then goes to a single blinking cursor.
  5. Regardless of what I try I can’t seem to reach a login screen, and wonder if the problem has to do with A) the NVMe 4.0 drive or B) the NVIDIA Module which is (or was) installed.
  6. I also ran /boot/grub2/grub.cfg which did not fix it.
  7. I wonder if while doing the rollover from RL 8.4 to 8.5 something got whacked. Up to that point RL 8.4 ran flawlessly.

Any idea where next to go? (To Hell if I don’t mend my ways). I am starting to fear the next step may be meatball surgery, i.e. a partial reinstall with RL 8.5 and hope that might fix it, or even a FULL reinstall and hope 8.5 will install. But it probably will not, and I may be forced to drop back to CentOS 8.3, roll over from there, update that to CentOS 8.5, and then convert that to Rocky Linux 8.5. That could take me a week or more to do, and I shudder at the thought.

I’m tapped out of ideas. I suspect the problem has to do with either the NVIDIA Kernel Module which is MISSING (though how one fixes that is a mystery to me) and/or replacing the login page (though again how one would do that is also a mystery).

D’Cat

How did you install NVidia in the first place?

Good question! I think I went to the Nvidia website, looked up my graphics card (a GeForce GTX 1050 Ti), downloaded the correct driver for the card for 64 bit Linux, then did some song and dance, muttered magic words and probably installed Development Tools and PRESTO!! I had Nvidia installed. Whatever I did, all those packages etc., plus the Nvidia Driver itself, should still be installed unless the rollover from 8.4 to 8.5 nuked all those configurations and packages. BTW while exploring the ins and outs I did find out that the installed version of Rocky Linux now reads 8.5, not 8.4. But you just did give me an idea of trying to re-install Nvidia from runlevel 3 or systemd.unit, but I am not sure if I should do that from the most recent Kernel, or the “EMERGENCY” “Rescue” kernel? I initially installed the Nvidia Driver in CentOS 8.3 => Rollover to CentOS 8.4 => Converted to Rocky Linux 8.4. That could REALLY nuke what exists of Rocky Linux. Any “Words of Wisdom”?

“Seeking Words of Wisdom”

D’Cat

It would be better to know the spell that you did cast back then. Perhaps: history | grep nvidia
(although bash’s HISTSIZE is short by default).
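
If the live history has already scrolled away, the saved history file can be searched directly. A minimal sketch follows; the function name search_history is made up for illustration, and it assumes the default history file location of ~/.bash_history unless another file is passed.

```shell
# Hypothetical helper: search a bash history file for a term.
# -i ignores case, -n prints the line number of each hit.
# The second argument is optional and defaults to ~/.bash_history.
search_history() {
    grep -in -- "$1" "${2:-$HOME/.bash_history}"
}

# Example call on the affected machine:
#   search_history nvidia
#
# The package manager keeps its own records too, which survive a short
# shell history (run as root on the affected machine):
#   dnf history list kmod-nvidia
#   rpm -qa --last | grep -i nvidia
```

Note that none of these will show anything if the driver came from NVidia’s binary installer run in another user’s shell, since that leaves no RPM record.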

If it was the binary installer from NVidia (NVidia does have RPM’s in yum repo for CUDA too),
then I would rerun the installer with option --uninstall. The problem with NVidia’s binary is
that it is mostly unmanageable. Dark magic.

Then, I would:

sudo dnf install elrepo-release
sudo dnf install nvidia-detect
sudo dnf install $(nvidia-detect)

Every kernel is different. Each version has its own kernel modules. Additional modules have to be recompiled for each kernel. NVidia’s binary installer compiles by default for the currently running kernel. It does accept options, so it can create a module for a non-running kernel too. For each kernel you install, you run the NVidia installer.
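
A quick way to see whether a given kernel actually has an nvidia module is to look in that kernel’s module directory. The sketch below assumes modules live under /lib/modules/<kernel release> and that the module files are named nvidia*.ko (possibly compressed); the function name has_nvidia_module is made up for illustration.

```shell
# Hypothetical helper: succeed if any nvidia kernel module file exists
# under the given modules directory.
# Defaults to the directory of the currently running kernel.
has_nvidia_module() {
    moddir="${1:-/lib/modules/$(uname -r)}"
    # find lists regular files and symlinks alike; one hit is enough.
    [ -n "$(find "$moddir" -name 'nvidia*.ko*' 2>/dev/null | head -n 1)" ]
}

if has_nvidia_module; then
    echo "nvidia module present for $(uname -r)"
else
    echo "no nvidia module for $(uname -r)"
fi
```

Passing another directory, e.g. has_nvidia_module /lib/modules/<older release>, checks a kernel that is installed but not running.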

A “dkms” service can compile (some) missing modules for a new kernel during boot.

However, all kernels released for one RHEL point release are quite similar. So similar that a kernel module compiled for one kernel could be compatible with all kernels of that point release, e.g. all kernels of RHEL 8.4. Point releases do differ though; the same module will not work for 4.18.0-240*.el8_3, 4.18.0-305*.el8_4, and 4.18.0-348*.el8_5.
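
As a rough illustration of that grouping, the number right after “4.18.0-” can be pulled out of a kernel release string and compared. This is only a sketch: the function names are made up, the release strings are examples, and treating that number as the point-release marker is an assumption taken from the patterns above.

```shell
# Hypothetical helper: extract the build series (e.g. 348) from a kernel
# release string such as 4.18.0-348.7.1.el8_5.x86_64.
kernel_series() {
    rest="${1#*-}"               # drop everything up to and including the first "-"
    printf '%s\n' "${rest%%.*}"  # keep only what precedes the first "."
}

# Succeed if two kernel release strings belong to the same series.
same_series() {
    [ "$(kernel_series "$1")" = "$(kernel_series "$2")" ]
}

kernel_series 4.18.0-305.25.1.el8_4.x86_64
kernel_series 4.18.0-348.7.1.el8_5.x86_64
```

With that, same_series "$(uname -r)" <release the module was built for> gives a first hint whether a module built for one kernel has any chance of loading on the running one.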

The compatibility with every EL 8.5 kernel (4.18.0-348*) is not automatic. Such a build requires special magic. ELRepo does that. The only time one needs extra care is when a new point release appears; an update to 8.6’s kernel will require a package built for el8_6 from ELRepo.

Hi jlehtone,

I hope you had a wonderful Thanksgiving.

I have to go back into my “Spell Book” to find out what I did exactly. I have held off of doing something STUPID out of desperation – I have openSUSE currently running though I had hoped to use RL 8.5 as the default OS for ocelot.

Perhaps you might know the answer to this question: Does Rocky Linux use kmod-nvidia like CentOS uses / used?? That was the only thing I used between point releases; for that I would simply use yum update --exclude=kmod-nvidia until everything caught up. Mind you this might well all be academic, as after RL 8.5 it jumps to RL 9.0, which will occur somewhere between the end of the year and end of January 2022 and will require a complete new reinstall from the ground up. While I would prefer to ride my hobby horse out until then without having to do yet another full reinstall, which would be a Giant PITA, it might tell me if the problem with the install of 8.4 (RHEL, CentOS, and RL) has been fixed, or if I have to get RL 8.5 via CentOS 8.3… I HOPE I still can, or I will be TOAST!! My buddy also had problems installing any of the 8.4 distros, but openSUSE went right in.

ocelot is a very FAST machine, but there is still a lot that needs to be done. The top of the list will be to clear off the designated 2TB backup drive (currently being used as a 3rd drive) and configure it to start receiving backups. That would have saved me a LOT of pain had I a backup, but NO, that would have been too easy (actually it would not have been that easy as I am still configuring the machine and working with two vastly different OS’s narrowed down from 4. The hazards of being a Test Guinea Pig).

Thanks for your suggestions. I’ve tried a lot of things and they are a lot smarter than my brute force ideas. Just to be sure I am going to start downloading RL 8.5, and get back my CentOS 8.3 thumb drive.

In the meantime do have a Happy Holiday Season, and stay in touch (I’ll let you know if I have either BASHed in my brains or am prematurely bald from hair pulling).

D’Cat

An aim of Rocky Linux, CentOS Linux, (and AlmaLinux) is to be binary compatible with RHEL.
ELRepo, NVidia’s repo, RPMFusion (and probably some others) build kernel modules for RHEL as additional packages.
Those modules, built for RHEL, work in Rocky Linux, CentOS Linux, etc. due to the compatibility.

The “it” in “after RL 8.5 it jumps to RL 9.0” must be you. RHEL will not “jump” and neither will RL.

Red Hat has planned more point updates for RHEL 8. See the “Red Hat Enterprise Linux Life Cycle” page on the Red Hat Customer Portal.
Since RL 8 rebuilds RHEL 8 public content, it too aims to live to 2029.

It is more likely that Red Hat will release RHEL 9 around next summer (2022). Not “tomorrow”. You can’t expect RL 9 before RHEL 9 is out.

You’re not alone. just ran dnf upgrade on RL 8.4 and it is now fubarred ( Had KDE 5.18 and there were Nvidia things in there … on which I gave up eventually - got an old radeon ). Luckily that was on a backup I had restored to a VM (I’m just not a very trusting person).

jlehtone

First I want to thank you very, VERY much. That nifty command you gave me, history | grep nvidia, worked like a charm. When I ran it, it gave me two items: 1) kmod-nvidia and 2) kmod-nvidia --allowerasing. I then erased kmod-nvidia and did a reinstall. While that did not solve the problem, it got me to the point where it wanted to start the login page before it failed with a whole lot of error messages that ran on and on, and it also gave me the warning that “Nvidia Kernel Module MISSING (!!!) Falling back to Nouveau”.

I am at my wits’ end so I downloaded (I THINK; I still need to check if it did) the Rocky Linux 8.5 DVD1 iso and then went to bed and to sleep.

Now for the GOOD NEWS!! I went on a bike ride this morning and I was chatting with my buddy, and was telling him that I planned to at least try an install of RL on my 2 TB drive to check it out before I do something STUPID like wipe out RL 8.4 on the NVMe 4.0 1 TB drive. He asked if I still had the 1TB HDD my sister sent me and suggested that I copy the RL 8.5 DVD.iso and install from there. This prompted me to think I might have made a backup to there. The GOOD NEWS was I DID; the BAD NEWS was it was CentOS 8.4, and I probably made it just before I made the jump and converted it to RL 8.4.

Now I am about to start practicing the Dark Arts:

  1. I’m going to wipe out everything on the 2TB HDD and reformat it.
  2. I’ll restore CentOS 8.4 to the 2 TB drive.
  3. I’ll roll over CentOS 8.4 to CentOS 8.5.
  4. Convert CentOS 8.5 to RL 8.5.

A Proof of Concept: IF it works then I can safely (fingers crossed) wipe out the NVMe drive and replicate the process there, i.e. I am Back Dooring my way into RL 8.5. If this experiment FAILS, I’ll still have my non-functional RL 8.5 version on my NVMe awaiting me to become “inspired” again. It looks as though my 2TB HDD is going back to being a TEST drive.

SIGH!! I’ll keep you posted.

D’ Cat

“You’re not alone. just ran dnf upgrade on RL 8.4 and it is now fubarred ( Had KDE 5.18 and there were Nvidia things in there … on which I gave up eventually - got an old radeon ). Luckily that was on a backup I had restored to a VM (I’m just not a very trusting person).”

HUUMMMMMMMM. Yep I’m also running KDE 5.18, and NVIDIA drivers as well, and to top that off I’ve got it on an NVMe Gen 4.0 drive. Trivia question: Are you still running Rocky Linux 8.4?? I might have found a way to Back Door into RL 8.5 by going CentOS 8.4 => Rollover to CentOS 8.5 => Rocky Linux 8.5. Later tonight I’ll pull on my navy blue Bathrobe w/ floppy sleeves, pull out my Big Black Book of Linux, and dive into doing vacuous experiments with appropriate incantations as necessary, and probable curses as well.

D’Cat

Still on RL 8.4

Won’t ever be trying to upgrade my bare metal RL 8.4 given posts to date.

I’m in the process of moving permanently and completely (in so far as that is possible) from Windows to Linux - fortunately I’m a very stubborn individual. If Windows 7 had updates for the next 30yrs I would give up on Linux at this point having learnt what I have in the last 8 weeks.

My RL 8.4 is a test bench where I am trying to get things to a point where they are usable eg. Netbeans not making me physically ill when I look at it (Java font issues on Linux… because you know how Java is write once run anywhere…except it’s not - it’s a giant turd ), Excel in a Windows 7 VM with no network access ( because Libre Office etc is a replacement for Excel… except it’s not and likely never will be ) etc.

Anywho! The plan is to devise a strategy whereby 1) I am effortlessly backed up with easy recovery to past time points, including to bare metal and with ad hoc “system restore” functionality, and 2) I have a safe upgrade path based on scripting all application installs and system and application configuration and keeping files in their correct directories, which will hopefully allow me to just install the next version, e.g. RL 8.5, run my config scripts and copy my data over.

I would recommend most users (even the technically gifted) fleeing the tyranny that is Windows to try/settle for macOS before attempting a move to Linux if actually getting some work done (other than learning linux) is on their agenda/wish list.