systemd stops reading and processing D-Bus events from runc cgroup invocations

Solution Verified

Issue

  • Pods started on a node remain stuck in the ContainerCreating state until they are manually deleted or the host is rebooted.
  • Pods remain stuck in the Terminating state until the OpenShift node is manually rebooted.
  • Deployments fail because most deploy pods have been in the ContainerCreating state for more than a day.
  • Starting a container with docker fails with the error: /usr/bin/docker-current: Error response from daemon: containerd: container did not start before the specified timeout
  • The journal contains a large number of messages like the following, and no new container can be started with docker on the affected node:
Jun 24 10:10:26 node123 crond[111309]: pam_systemd(crond:session): Failed to create session: Connection timed out
Jun 24 10:10:26 node123 systemd-logind[10714]: Failed to start user slice user-0.slice, ignoring: Connection timed out ((null))
Jun 24 10:10:51 node123 systemd-logind[10714]: Failed to start session scope session-13692.scope: Connection timed out
  • systemd reports the following error message, after which various D-Bus related operations fail:
Jun 04 14:03:35 node123 systemd[1]: Failed to propagate agent release message: Operation not supported
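
To check quickly whether a node shows the symptoms listed above, the journal can be scanned for the same messages. The following is a minimal sketch, not part of the original solution: the patterns are copied from the log excerpts in this article, while the labels, the function name count_symptoms and the script name are hypothetical.

```python
import re
import sys
from collections import Counter

# Patterns built from the messages quoted in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
SYMPTOMS = {
    "pam_systemd_session_timeout": re.compile(
        r"pam_systemd\(.*\): Failed to create session: Connection timed out"
    ),
    "logind_unit_timeout": re.compile(
        r"systemd-logind\[\d+\]: Failed to start (user slice|session scope) "
        r".*Connection timed out"
    ),
    "agent_release_failure": re.compile(
        r"systemd\[1\]: Failed to propagate agent release message: "
        r"Operation not supported"
    ),
    "docker_start_timeout": re.compile(
        r"container did not start before the specified timeout"
    ),
}


def count_symptoms(lines):
    """Count how many lines match each known symptom pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in SYMPTOMS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


if __name__ == "__main__":
    # Reads journal text from standard input and prints one count per symptom.
    for label, n in sorted(count_symptoms(sys.stdin).items()):
        print(f"{label}: {n}")
```

Assuming the script is saved as check_symptoms.py, it could be run as `journalctl --no-pager | python3 check_symptoms.py`. Matches only indicate that the node logs the same messages; they do not confirm the root cause.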

Environment

  • Red Hat Enterprise Linux 7
  • systemd
