Failing to create pod sandbox on OpenShift 3 and 4
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 3.x
- 4.x
Issue
- The following error appears when restarting a pod, or while installing elasticsearchoperator in RHOCP 4.x:
Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container.
- The following error message may also appear:
NetworkPlugin cni failed to set up pod
Resolution
- Delete the OpenShift SDN pod in an error state, as identified in the Diagnostic Steps section:
$ oc delete pod ${podname} -n openshift-sdn
- In some cases, fixing the upstream DNS server resolves the issue.
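When more than one SDN pod needs to be recreated, the delete step can be previewed before it is run. The following is a minimal sketch, not part of the original solution: print_delete_commands is a hypothetical helper name, and it assumes input lines that start with the namespace followed by the pod name. It only prints commands; nothing is deleted.

```shell
# Hypothetical helper: turn "NAMESPACE NAME ..." lines read from stdin
# into `oc delete pod` commands. It only prints them, so the list can be
# reviewed before anything is actually deleted.
print_delete_commands() {
    # Field 1 is the namespace, field 2 is the pod name; skip short lines.
    awk 'NF >= 2 { printf "oc delete pod %s -n %s\n", $2, $1 }'
}
```

For example, echo 'openshift-sdn ovs-xtrsr' | print_delete_commands prints the command for the pod identified in Diagnostic Steps. After review, the printed commands can be run by hand.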
Root Cause
- One of the OpenShift SDN pods in that namespace was corrupted, so no network was available to run pods.
- The operator pod could not resolve the quay.io domain. Checking the upstream DNS server revealed a problem there.
Diagnostic Steps
- Run the following command to inspect pod state, and check the output for OpenShift SDN pods in an error state:
$ oc get pods --all-namespaces
NAMESPACE       NAME        READY   STATUS    RESTARTS   AGE
openshift-sdn   ovs-wrzr9   1/1     Running   4          94d
openshift-sdn   ovs-xg2wd   1/1     Running   7          94d
openshift-sdn   ovs-xtrsr   1/1     Running   11644      94d
openshift-sdn   ovs-z6jps   1/1     Running   3          94d
openshift-sdn   ovs-zphdl   1/1     Running   8          94d
openshift-sdn   ovs-zqtfg   1/1     Running   6          94d
NOTE: The list above shows that pod ovs-xtrsr has restarted 11644 times since creation. That is the pod to recreate.
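On a large cluster, scanning the full pod list by eye is error-prone. The following is a minimal sketch of one way to filter it, not part of the original solution: high_restart_pods is a hypothetical helper name, the default threshold of 100 is an arbitrary assumption, and the column layout (NAMESPACE NAME READY STATUS RESTARTS AGE) is assumed to match the output shown above.

```shell
# Hypothetical helper: print pods whose RESTARTS value meets or exceeds
# a threshold. Reads `oc get pods --all-namespaces` style output on stdin
# and prints "NAMESPACE NAME RESTARTS" for each matching pod.
high_restart_pods() {
    threshold="${1:-100}"   # assumed default; pick a value that suits the cluster
    awk -v min="$threshold" '
        $1 == "NAMESPACE" { next }   # skip the header line
        # Field 5 is RESTARTS; only compare when it is a plain number.
        $5 ~ /^[0-9]+$/ && ($5 + 0) >= (min + 0) { print $1, $2, $5 }
    '
}
```

A possible invocation is oc get pods --all-namespaces | high_restart_pods 1000, which for the listing above would report only ovs-xtrsr.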
- For the OCP 4.3 elasticsearchoperator issue, check DNS resolution from the operator pod:
# oc rsh certified-operators-866f85886d-5b6h9
sh-4.2$ nslookup quay.io
Server: 192.168.100.1
Address: 192.168.100.1#53
Non-authoritative answer:
Name: quay.io.abcd.example.com
Address: 192.168.100.200 <---upstream dns server
** server can't find quay.io.abcd.example.com: SERVFAIL
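The same check can be scripted so that a failed lookup is detected without reading the output manually. The following is a minimal sketch, not part of the original solution: lookup_failed is a hypothetical helper name, and it assumes nslookup reports failures with a "server can't find" line as in the output above.

```shell
# Hypothetical helper: exit 0 when nslookup output read from stdin
# reports a failed lookup (for example SERVFAIL), non-zero otherwise.
lookup_failed() {
    grep -Eq "server can't find .*: (SERVFAIL|NXDOMAIN|REFUSED)"
}
```

A possible invocation, reusing the pod name from the example above, is oc rsh certified-operators-866f85886d-5b6h9 nslookup quay.io | lookup_failed && echo "DNS lookup failed". If the lookup fails, the upstream DNS server is the next thing to check.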