polkit keeps failing to start after fresh builds or upgrades to RHEL 7.6 (not consistently)


I'm curious whether anyone else, besides this example with RHEL 7.x, is seeing the polkit service fail to start on RHEL 7.6 servers.

UPDATE: this solution is vital at times: https://access.redhat.com/solutions/3900301

When the polkit service doesn't start, we can still log into the server; however, logins are horribly slow (go figure) and there are a number of other delays.

I've had this issue with no less than eight RHEL 7.6 servers. I found this solution https://access.redhat.com/solutions/1543343 which basically says to check/validate 3 things:

1) Ensure the existence of a polkitd system user & group

getent group polkitd >/dev/null && echo -e "\e[1;32mpolkitd group already exists\e[0m" || { groupadd -r polkitd && echo -e "\e[1;33mAdded missing polkitd group\e[0m" || echo -e "\e[1;31mAdding polkitd group FAILED\e[0m"; }

getent passwd polkitd >/dev/null && echo -e "\e[1;32mpolkitd user already exists\e[0m" || { useradd -r -g polkitd -d / -s /sbin/nologin -c "User for polkitd" polkitd && echo -e "\e[1;33mAdded missing polkitd user\e[0m" || echo -e "\e[1;31mAdding polkitd user FAILED\e[0m"; }

2) Reset the permissions and user/group ownership for all files provided by the polkit and polkit-pkla-compat packages

rpm -Va polkit\* && echo -e "\e[1;32mpolkit* rpm verification passed\e[0m" || { echo -e "\e[1;33mResetting polkit* rpm user/group ownership & perms\e[0m"; rpm --setugids polkit polkit-pkla-compat; rpm --setperms polkit polkit-pkla-compat; }

3) Reboot

(side note) In some cases I have run yum -y reinstall polkit, which, followed by a reboot, fixed the issue on only 2 occasions.

Generally, however, I've had to reset the permissions and user/group ownership (setperms and setugids), using something close to what the solution I cited above describes.

OK, I keep running into this, but it has not been an issue on all my systems, just a small number of them, and on separate networks with different Satellite servers.

  • Yes, I checked the sha256sum of the rpm in question, it is fine on every Satellite server.

I keep running into this on various systems and am curious if anyone else is. I suspect not, but I thought I'd ask.

Regards

RJ

Responses

So in my specific situation this https://bugzilla.redhat.com/show_bug.cgi?id=1531486 is what I've been fighting in my environment. The ugly fix until RHEL 7.8 comes out is to manually "fix" (edit) the service unit files relating to polkit so polkit takes precedence. See the Bugzilla and this solution at https://access.redhat.com/solutions/3900301.

From the Red Hat solution above

https://access.redhat.com/solutions/3900301

RHEL 7.6: Logins take 25 seconds to complete and messages such as "Failed to activate service 'org.freedesktop.systemd1': timed out" are seen in the journal

Environment

Red Hat Enterprise Linux 7.6 systemd rpcbind ypbind

Issue

Since updating to RHEL 7.6, users need 25 seconds to login and messages such as the one shown below are seen in the journal:

  • [...] dbus[xxx]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
  • [...] systemd-logind[xxx]: Failed to enable subscription: Failed to activate service 'org.freedesktop.systemd1': timed out
  • [...] systemd-logind[xxx]: Failed to fully start up daemon: Connection timed out
  • [...] systemd[1]: systemd-logind.service: main process exited, code=exited, status=1/FAILURE
  • [...] systemd[1]: Failed to start Login Service.

RHEL 7.6 periodically doesn't boot properly because polkit.service and tuned.service fail to start. This happens about 1 in 10 to 20 boots.
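
To check whether a host shows the same symptom, the journal can be filtered for the timeout message quoted above. This is only a minimal sketch: the function name find_activation_timeouts is hypothetical, and it assumes the message text matches the solution's wording exactly.

```shell
# Hypothetical helper (not part of the Red Hat solution): reads log text on
# stdin and prints only the lines carrying the D-Bus activation timeout.
find_activation_timeouts() {
    # -F treats the pattern as a fixed string, so the dots and quotes are literal
    grep -F "Failed to activate service 'org.freedesktop.systemd1': timed out"
}

# Typical use on a suspect host, current boot only:
#   journalctl -b --no-pager | find_activation_timeouts
```

No output means the current boot did not log this particular message; it does not rule out other polkit problems.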

Resolution

This issue has been reported to Engineering and is being tracked in Red Hat Bug 1531486. For more information or to also report this issue, please open a case with Red Hat Support. At the time of this writing, there is no fix available and Engineering is still working on this issue. There are two workarounds available.

  • Workaround 1: disabling socket activation using the systemctl edit command
  • Create a drop-in to disable socket activation for the rpcbind service
# systemctl edit --full rpcbind.socket
>> opens an editor <<

Disable the socket activation by commenting out the lines, as shown below

See this link for the remainder of the solution: https://access.redhat.com/solutions/3900301

NOTE: If needed, submit a case https://access.redhat.com/support/cases/#/case/new with Red Hat Support. Some of the services related to polkit may differ from one customer to another.

Regards

RJ

Hi RJ ! :)

Thanks for the information. One thing : "... until RHEL 7.8 comes out ..." Really 7.8 ? Or do you mean 7.7 ? - or 8.0 ?

Regards,
Christian

Chris, the Bugzilla (and my Red Hat Technical Account Manager (TAM)) say that the fix for this will not be available in RHEL 7.6 or 7.7, but high-confidence for 7.8. I'd hope RHEL 8 won't have this issue at all. That being said, I have a general rabid distrust of a version X.0 for anything.

Regards/thanks

RJ

Thanks for the explanation, RJ ! I too hope that RHEL 8 won't be affected by this nasty bug ... :)

Regards,
Christian

Unfortunately, this didn't solve it on our system, nor did the solution you linked to, R. Hinton. Our permissions were already fine, all groups were there. We consistently get responses like:

systemctl daemon-reload

Authorization not available. Check if polkit service is running or see debug message for more information.

Any other suggestions to check?

Hi NCNR,

In our case, we faced the issues mentioned in the bugzilla. I spent some time determining which service was fighting to start before polkit could start. The server's behavior changed after upgrading it to RHEL 7.6 with the latest patches. We had to manually edit the service units as described in one of the links above, https://access.redhat.com/solutions/3900301. Sadly, your scenario might be different from mine.
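
For illustration only, here is roughly what ordering a competing unit after polkit can look like with a systemd drop-in. This is a sketch under assumptions, not the content of the Red Hat solution: the unit name ypbind.service, the function name, and the drop-in file name 10-after-polkit.conf are all placeholders, and whether ordering alone is enough will depend on the environment.

```shell
# Sketch: write a drop-in that orders a competing unit after polkit.service.
# The function name and the drop-in file name are assumptions for illustration.
write_polkit_ordering_dropin() {
    unit="$1"                           # competing unit, e.g. ypbind.service (assumed)
    root="${2:-/etc/systemd/system}"    # override the root to try this in a scratch dir
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    # After= only affects ordering when both units are started in the same
    # transaction; it does not pull polkit.service in by itself.
    printf '[Unit]\nAfter=polkit.service\n' > "$dir/10-after-polkit.conf" || return 1
    echo "$dir/10-after-polkit.conf"
}

# On a real host, as root:
#   write_polkit_ordering_dropin ypbind.service
#   systemctl daemon-reload
```

A drop-in keeps the vendor unit file untouched, so a package update does not silently overwrite the change. Treat the linked solution as authoritative for the actual workaround.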

I heard from one person I know that they are using IDM with an autofs instance, with ypbind pointed at IDM (not Solaris/Oracle NIS, but the kind of NIS you can have within an IDM server).

Initial recommendation...

I'd recommend running systemctl | grep -i fail on your system to see if anything other than polkit failed as well. If so, what we did is described in that other article (the solution), in addition to the bugzilla. Sadly, your situation may differ slightly from ours. In short, because of the bugzilla I cited, we had to manually edit our service unit files to make sure polkit starts before ypbind. I can't recall right now if we did anything to dbus.
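
A slightly more targeted version of that check might look like the following. It is a rough sketch (the function name report_polkit_state is made up) that lists failed units and then says whether polkit.service itself is active.

```shell
# Sketch only: list failed units, then report whether polkit is active.
# report_polkit_state is a hypothetical name, not an existing tool.
report_polkit_state() {
    echo "Failed units:"
    systemctl list-units --state=failed --no-legend
    # is-active exits 0 only when the unit is active; --quiet suppresses its output
    if systemctl is-active --quiet polkit.service; then
        echo "polkit.service is active"
    else
        echo "polkit.service is NOT active"
    fi
}

# Usage on the affected host:
#   report_polkit_state
```

If other units show up as failed next to polkit, those are the first candidates for the service that is competing with it at boot.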

RJ

NCNR,

If what I have posted doesn't help (bugzilla, and the solution that related to the bugzilla) - then please submit a case with Red Hat Support. From what I have read, I suspect some of the effects of this may differ from one customer to another with affected services.

Update: This solution id https://access.redhat.com/solutions/3900301 has been helping us.