SSSD authentication timeouts due to user-nsswitch.conf configuration

I’ve noticed that Rocky 8 (8.6 Green Obsidian in my case) has what appears to be an odd configuration in the nsswitch.conf managed by authselect. During the realm join and subsequent authentication attempts, my servers were extremely slow. Running the id command took a long time as well. I tried some tweaks to my sssd.conf file, but they didn’t help much. I would get many login timeouts, and the same timeouts when I ran sudo. I found a forum post saying the nsswitch.conf wasn’t configured correctly, so I made the changes it suggested. That instantly fixed my issues. Changes:
root@…#vim /etc/authselect/user-nsswitch.conf
Config causing slowness:
passwd: sss files systemd
shadow: files sss
group: sss files systemd
hosts: files dns myhostname
services: files sss
netgroup: sss
automount: files sss

Config that works normally (fixed a spacing issue and put sss at the end for passwd and group):
passwd: files systemd sss
shadow: files sss
group: files systemd sss
hosts: files dns myhostname
services: files sss
netgroup: sss
automount: files sss

Once I made the changes, I ran “authselect apply-changes” and my login and sudo woes were immediately resolved.
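
The reordering described above can be sketched as a small shell helper. This is only an illustration, not the original poster's exact procedure: the function name `move_sss_last` and the scratch paths in the comments are hypothetical, and the sketch does not handle action items such as `[NOTFOUND=return]`, whose position relative to a source matters. Note also that, depending on the authselect profile in use, maps defined by the profile may take precedence over user-nsswitch.conf, so it is worth checking the generated /etc/nsswitch.conf afterwards.

```shell
#!/bin/sh
# Hypothetical helper: print an nsswitch-style file with "sss" moved to the
# end of the passwd and group lines. All other lines are printed unchanged.
move_sss_last() {
    awk '
        $1 == "passwd:" || $1 == "group:" {
            out = $1
            has_sss = 0
            for (i = 2; i <= NF; i++) {
                if ($i == "sss") { has_sss = 1; continue }
                out = out " " $i
            }
            # Re-append sss only if it was present on the original line
            if (has_sss) out = out " sss"
            print out
            next
        }
        { print }
    ' "$1"
}

# Example usage on a scratch copy (paths are placeholders):
#   cp /etc/authselect/user-nsswitch.conf /tmp/user-nsswitch.conf.orig
#   move_sss_last /tmp/user-nsswitch.conf.orig > /tmp/user-nsswitch.conf.new
#   diff /tmp/user-nsswitch.conf.orig /tmp/user-nsswitch.conf.new
```

Working on a copy and reviewing the diff first keeps the change easy to revert if it turns out not to be the real cause of the slowness.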

I’m not sure if this is normal or something I did during the original install/config. If the second configuration is the correct one, what’s the best way to report this to Rocky so it gets fixed in future packages? Are others seeing the same issue?

Forum site where I found the resolution for my issue:

That nsswitch.conf configuration is actually normal, and it is not a problem specific to Rocky or Enterprise Linux. Part of the reason is that sssd can be used for multiple domains and can also cache local users/groups if the user wishes. (The default configurations shipped with authselect are developed upstream in both Red Hat and Fedora, and we do not adjust them.)

When I look at any of my IPA enrolled machines, sss files is completely sufficient and poses no issues, whether the account is local to the machine or a domain account. However, this is likely due to my smallish domain. In fact, I recently enrolled a new machine for my RelEng folks to log in to, and the login was smooth, including sudo. It was practically instant. Again, this is likely due to my domain size.

The timeouts are likely a symptom of another problem if lookups are not coming back or resolving in time. Unfortunately, in this instance, adjusting nsswitch.conf is a band-aid, or it only appears to have fixed the issue when in fact the right information from AD had been cached properly by then.

One thing to keep in mind is that when a host is enrolled to AD, it has to cache brand new information for any new user or group that it comes across. It doesn’t enumerate, and it isn’t aware of any users or groups until it needs to do a lookup. If your domain has a lot of users/groups and the users also belong to a lot of groups, that could easily slow down the caching or ID lookup process. Nested groups can slow it down as well. ignore_group_members can sometimes help in these instances. If your domain is small, putting sssd into debug mode can help find issues. See this page for troubleshooting tips.
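
As a rough illustration of the two settings mentioned above, a fragment of /etc/sssd/sssd.conf might look like the following. The domain name `example.com` is a placeholder, and the debug level shown is just one reasonable choice, not a recommendation from this thread.

```ini
# /etc/sssd/sssd.conf (fragment); "example.com" is a placeholder domain name
[domain/example.com]
# Do not resolve the member list of each group during group lookups
ignore_group_members = True
# Raise log verbosity for this domain while troubleshooting
debug_level = 6
```

sssd needs to be restarted (for example with `systemctl restart sssd`) for the change to take effect, and the resulting logs are written under /var/log/sssd/.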

I would also recommend looking through the SSSD Users mailing list or posting your question there, especially for more specific sssd troubleshooting.

Thank you for that information. We do have a very large environment. I’ll do the same as you did and test in my home lab. I’ll check out the sssd debugging modes.