Hello all, I have Rocky Linux 8.8 on my production server, and something weird has been happening over the past few days: when I log into the server, it goes directly to a bash-4.4 prompt. I have not touched .bashrc or .bash_profile, and both files are still present. The server is fine at first, but after a while something happens and every login lands at the bash-4.4 prompt. Some commands like “ls” and “cat” work, but if I run something like “systemctl status mariadb” or “htop”, there is no response: the command hangs as if in an infinite loop and never returns. htop breaks the whole session, leaving it unresponsive, and other commands like it do not work at all.
Surprisingly, if I restart the server, everything is fixed as if nothing ever happened. It’s my production server and I cannot keep restarting it, so I would like to get to the root cause of this.
Has anyone faced anything similar? I’m sure there’s nothing wrong with bashrc or bash_profile.
I have not seen this issue reported on Rocky, nor have I experienced it on the other RH distro I use (Fedora).
Your system is currently two minor versions of RL8 out of date; the current supported version is 8.10. So, besides needing to update your system: how do you log on to the server, directly via an attached keyboard and monitor, or remotely via ssh?
At the start of a session, bash sources its config files to set up the environment, including PS1.
I’ve seen seriously clogged systems where this initialization takes a very long time. If I interrupt it with Ctrl-C, I do get a prompt, but with an incompletely initialized environment. One visible hint of that is that the prompt (PS1) is not [\u@\h \W]\$ but contains the bash version instead.
What kind of “serious clog”? Not sure. For us it is most likely network issues, as our home/data directories are on NFS servers. Network blackouts therefore leave most of our user processes hanging, waiting.
This is exactly the situation I am facing. The PS1 also contains only the bash version. I have tried replacing PS1 with [\u@\h \W]\$, but user processes still hang or wait. As for our home directory, it is physically attached to the server.
When you reboot, it goes back to normal (for a short while), so maybe do a reboot, and then carefully monitor activity and see if the load average increases over time and then check to see which process is causing it.
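A rough sketch of that kind of monitoring, in case it helps. It reads the load averages from /proc/loadavg and lists processes in uninterruptible sleep (state D), which on Linux count toward the load average and are the usual suspects when commands hang. The function names are made up for illustration; only the /proc file layouts are standard.

```python
import os

def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)

def parse_stat(text):
    """Return (pid, command, state) from /proc/<pid>/stat content."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state

def blocked_processes(proc_root="/proc"):
    """List (pid, command) for processes in uninterruptible sleep (state D)."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except OSError:
            continue  # the process exited while we were looking
        if state == "D":
            blocked.append((pid, comm))
    return blocked

# Only touch /proc when run as a script on a system that has it.
if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as fh:
        print("load averages:", parse_loadavg(fh.read()))
    for pid, comm in blocked_processes():
        print(f"blocked: pid={pid} comm={comm}")
```

Running it every minute or so after a reboot (from cron or a simple loop) and keeping the output would show whether the load climbs and which process names keep turning up in state D.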
Looking into the kernel logs, I found that my mariadb task was blocked for some reason. Killing this task brought the server back to normal. This is resolved now. Thank you, everyone, for your contributions!
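For anyone who lands here with the same symptoms, here is a minimal sketch of how the kernel log could be scanned for blocked tasks. It assumes the log contains the kernel's standard hung-task warning ("INFO: task NAME:PID blocked for more than N seconds."); the thread does not quote the actual message, so the sample lines and the pid below are invented for illustration.

```python
import re

# Matches the kernel's hung-task warning line.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def find_hung_tasks(log_text):
    """Return (command, pid, seconds) for each hung-task warning in the log."""
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits

# Hypothetical sample, not taken from the server in this thread.
SAMPLE = """\
[ 1000.000001] INFO: task mysqld:2345 blocked for more than 120 seconds.
[ 1000.000002] some unrelated kernel message
"""

for comm, pid, secs in find_hung_tasks(SAMPLE):
    print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

To use it on a real system, replace SAMPLE with the saved output of `dmesg` or `journalctl -k`.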