Pods stuck in Pending state on nodes with a high number of iptables rules
Issue
Pods on some nodes are stuck in the Pending state for tens of minutes. The node logs for these pods can show errors similar to the following:
host.example.com atomic-openshift-node: I0328 14:11:36.763258 27983 prober.go:111] Readiness probe for "pod-deployment-0000ab0c0d-aaa4a_a1-1111-aaaaaaa-aaa(52fbee52-514c-11e9-8949-0050569a4a9c):pod-deployment" failed (failure): Get https://10.0.0.1:1111/check/health: dial tcp 10.0.0.1:1111: connect: connection refused
atomic-openshift-node: Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?)
atomic-openshift-node 36363 cni.go:275] Error deleting network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?)'
atomic-openshift-node 36363 remote_runtime.go:109] StopPodSandbox "0000ab0c0d-aaa4a_a1-1111-aaaaaaa" from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "<pod_name>_<project_name>" network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?)'
Environment
Red Hat OpenShift Container Platform (OCP) 3.6 and later.
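To confirm that contention for the xtables lock from a large rule set is in play on an affected node, the rule count and lock behavior can be checked directly. The following is a diagnostic sketch, not an official procedure; the 5-second wait is illustrative, and the commands are intended to run with root privileges on the node itself:

```shell
# Count the iptables rules currently loaded; very large rule sets make
# each iptables-restore hold the xtables lock longer, increasing the
# chance that other callers fail with exit status 4.
if command -v iptables-save >/dev/null 2>&1; then
    rule_count=$(iptables-save 2>/dev/null | grep -c '^-A')
else
    rule_count="unknown (iptables not installed)"
fi
echo "iptables rules loaded: ${rule_count}"

# "-w 5" asks iptables to wait up to 5 seconds for the xtables lock
# instead of failing immediately, as the error message suggests.
if command -v iptables >/dev/null 2>&1; then
    if iptables -w 5 -L -n >/dev/null 2>&1; then
        echo "xtables lock acquired"
    else
        echo "xtables lock busy (or insufficient privileges)"
    fi
fi
```

A rule count in the tens of thousands, combined with the lock check reporting busy, is consistent with the errors shown above.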