SSSD Problem with Active Directory

I've been experiencing an issue every 26 days, +/- 1 day. All of our Linux infrastructure (RHEL7) has problems talking to Active Directory at that interval. When the 26 +/- 1 day mark hits, sssd stops authenticating to the domain and users aren't able to log in. The sssd tools stop working, and restarting sssd doesn't seem to help. Until I reboot the system, sssd seems to stay in a broken state.

Also, NFSv4 mounts in the fstab time out and don't mount during the boot process. From grub, the boot process takes 5 minutes and 15 seconds when joined to the AD domain. Non-cached user lookups take around 24 seconds. Once the system is up, I can manually mount my fstab items (creating a separate mount process that runs after the initial failure also works). Mounting the sss/db directory on a tmpfs partition speeds up a non-cached user lookup to 4 seconds, but the boot process is still stuck at 5 minutes. After leaving the domain/realm, my system boots in 20 seconds and my fstab entry mounts perfectly fine.

Any ideas?

Additional details:

ignore_group_members = True seems to speed up user lookups/login, but the boot process still hangs.
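
For anyone trying the same thing, that option goes in the domain section of /etc/sssd/sssd.conf. A minimal sketch, assuming a domain section named example.com (a placeholder, not the real domain here); the rest of the section is omitted:

```ini
# /etc/sssd/sssd.conf (fragment)
# "example.com" is a placeholder domain name, used for illustration only
[domain/example.com]
# Skip fetching group membership lists when looking up groups
ignore_group_members = True
```

sssd needs a restart to pick up the change.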

This is on an air-gapped network, and unfortunately getting logs off it is a PITA.