Pods get stuck in Pending state on some nodes with a high number of iptables rules

Solution In Progress

Issue

Pods on some nodes are stuck in the Pending state for tens of minutes. The logs on these nodes show errors similar to the following:

host.example.com atomic-openshift-node: I0328 14:11:36.763258   27983 prober.go:111] Readiness probe for "pod-deployment-0000ab0c0d-aaa4a_a1-1111-aaaaaaa-aaa(52fbee52-514c-11e9-8949-0050569a4a9c):pod-deployment" failed (failure): Get https://10.0.0.1:1111/check/health: dial tcp 10.0.0.1:1111: connect: connection refused

atomic-openshift-node  36363 cni.go:275] Error deleting network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?)'

atomic-openshift-node  36363 remote_runtime.go:109] StopPodSandbox "0000ab0c0d-aaa4a_a1-1111-aaaaaaa" from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "<pod_name>_<project_name>" network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?)'
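
These messages point to contention on the xtables lock: when a node carries a very large number of iptables rules, each iptables-restore invocation holds the lock long enough that concurrent CNI setup and teardown requests fail, leaving pods stuck in Pending. As a rough first check (a sketch only; the exact options available depend on the iptables version shipped on the node), the rule count and lock contention can be inspected directly on an affected node:

    # Count the iptables rules currently loaded; counts in the tens of
    # thousands make every iptables-restore run noticeably slower.
    iptables-save | wc -l

    # List the rules while waiting for the xtables lock instead of failing
    # immediately; -w is the option the error message suggests (newer
    # iptables versions also accept a timeout in seconds, e.g. -w 5).
    iptables -w -L -n > /dev/null

The affected pods can be listed with, for example, oc get pods --all-namespaces -o wide | grep Pending, to confirm that they are scheduled to the nodes showing these errors.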

Environment

Red Hat OpenShift Container Platform (OCP) 3.6 and later.
