Unable to generate sosreport on Red Hat OpenShift 4.x node

Solution Unverified - Updated -

Environment

  • Red Hat OpenShift 4.x

Issue

  • When attempting to run toolbox on an OpenShift node in order to generate an sosreport, a custom .toolboxrc file is detected and the image pull fails.

Resolution

Once it's determined that a custom .toolboxrc file is in place, test whether it is needed at all:

  1. Log into the cluster, then start a debug pod on the desired node:
$ oc debug node/<node_name>
  2. Change to the node's root directory:
sh-4.2# chroot /host bash
  3. Rename the .toolboxrc file for testing purposes:
[root@<node_name> /]# mv ~/.toolboxrc ~/.toolboxrc-backup

Try generating an sosreport by following the normal (unmodified) steps in the KCS. If the sosreport is created successfully, the custom .toolboxrc file is not needed and the backup copy can be removed:

[root@<node_name> /]# rm ~/.toolboxrc-backup
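The rename step above can be made slightly safer with a small helper. What follows is a minimal sketch, not part of the original procedure: the function names backup_toolboxrc and restore_toolboxrc are hypothetical, and the restore helper assumes you may want to put the custom file back if it turns out to be needed.

```shell
# backup_toolboxrc HOME_DIR
# Moves HOME_DIR/.toolboxrc to HOME_DIR/.toolboxrc-backup.
# Refuses to overwrite an existing backup so an earlier copy is not lost.
backup_toolboxrc() {
    home_dir=$1
    rc_file="$home_dir/.toolboxrc"
    backup_file="$home_dir/.toolboxrc-backup"

    if [ ! -e "$rc_file" ]; then
        echo "no .toolboxrc in $home_dir; nothing to do"
        return 0
    fi
    if [ -e "$backup_file" ]; then
        echo "$backup_file already exists; refusing to overwrite" >&2
        return 1
    fi
    mv -- "$rc_file" "$backup_file"
    echo "moved $rc_file to $backup_file"
}

# restore_toolboxrc HOME_DIR
# Hypothetical counterpart: puts the backup back in place if the custom
# configuration turns out to be required after all.
restore_toolboxrc() {
    home_dir=$1
    rc_file="$home_dir/.toolboxrc"
    backup_file="$home_dir/.toolboxrc-backup"

    if [ ! -e "$backup_file" ]; then
        echo "no backup found in $home_dir" >&2
        return 1
    fi
    mv -- "$backup_file" "$rc_file"
    echo "restored $rc_file"
}
```

On the node, after chroot /host, this would be called as backup_toolboxrc "$HOME" before retrying toolbox.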

Root Cause

  • A custom .toolboxrc file was created at some point. This file contains manually defined images and registries, which may or may not be correct, and its values override the toolbox defaults. It is unlikely to be needed in a connected environment.
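To see which image a given .toolboxrc would make toolbox pull, the file can be evaluated in a subshell. This is a rough sketch under stated assumptions: it assumes the file sets the shell variables REGISTRY and IMAGE, and it assumes defaults of registry.redhat.io and rhel8/support-tools, taken from the default image named in this article. The helper name effective_toolbox_image is hypothetical.

```shell
# effective_toolbox_image RC_FILE
# Prints the registry/image pair that would result from RC_FILE.
# ASSUMPTION: the file uses the variable names REGISTRY and IMAGE.
# ASSUMPTION: the defaults below match the image used elsewhere in this article.
# The file is sourced, so only use this on a file you would let toolbox read.
effective_toolbox_image() {
    rc_file=$1
    (
        # Run in a subshell so the caller's environment is left untouched.
        REGISTRY=registry.redhat.io
        IMAGE=rhel8/support-tools
        if [ -f "$rc_file" ]; then
            . "$rc_file"
        fi
        printf '%s/%s\n' "$REGISTRY" "$IMAGE"
    )
}
```

For example, effective_toolbox_image "$HOME/.toolboxrc" prints the overridden image reference, which can then be compared against a registry that is known to be reachable from the node.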

Diagnostic Steps

  1. Log into the cluster, then start a debug pod on the desired node:
$ oc debug node/<node_name>
  2. Change to the node's root directory:
sh-4.2# chroot /host bash
  3. Run the toolbox command:
[root@<node_name> /]# toolbox

Check for the message:

".toolboxrc file detected, overriding defaults..."

This message may be expected if the custom .toolboxrc file was put in place deliberately.
  4. If toolbox is able to run correctly, the sosreport can be generated with the following command:
[root@<node_name> /]# sosreport -k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on

If the toolbox cannot run correctly, output similar to this will be seen:

[root@<node_name> ~]# toolbox
.toolboxrc file detected, overriding defaults...
Trying to pull default-route-openshift-image-registry.apps.<domain>.com:5000/rhel8/support-tools:latest...
WARN[0010] failed, retrying in 1s ... (1/3). Error: Error initializing source docker://default-route-openshift-image-registry.apps.<domain>.com:5000/rhel8/support-tools:latest: error pinging docker registry default-route-openshift-image-registry.apps.<domain>.com:5000: Get "https://default-route-openshift-image-registry.apps.<domain>.com:5000/v2/": dial tcp <ip-address>:5000: connect: no route to host
WARN[0022] failed, retrying in 1s ... (2/3). Error: Error initializing source docker://default-route-openshift-image-registry.apps.<domain>.com:5000/rhel8/support-tools:latest: error pinging docker registry default-route-openshift-image-registry.apps.<domain>.com:5000: Get "https://default-route-openshift-image-registry.apps.<domain>.com:5000/v2/": dial tcp <ip-address>:5000: connect: no route to host
WARN[0035] failed, retrying in 1s ... (3/3). Error: Error initializing source docker://default-route-openshift-image-registry.apps.<domain>.com:5000/rhel8/support-tools:latest: error pinging docker registry default-route-openshift-image-registry.apps.<domain>.com:5000: Get "https://default-route-openshift-image-registry.apps.<domain>.com:5000/v2/": dial tcp <ip-address>:5000: connect: no route to host
Error: Error initializing source docker://default-route-openshift-image-registry.apps.<domain>.com:5000/rhel8/support-tools:latest: error pinging docker registry default-route-openshift-image-registry.apps.<domain>.com:5000: Get "https://default-route-openshift-image-registry.apps.<domain>.com:5000/v2/": dial tcp <ip-address>:5000: connect: no route to host
Would you like to manually authenticate to registry: 'default-route-openshift-image-registry.apps.<domain>.com:5000' and try again? [y/N] y
Username: xxxxxxxxxx
Password: xxxxxxxxxx
Error: authenticating creds for "default-route-openshift-image-registry.apps.<domain>.com:5000": error pinging docker registry default-route-openshift-image-registry.apps.<domain>.com:5000: Get "https://default-route-openshift-image-registry.apps.<domain>.com:5000/v2/": dial tcp <ip-address>:5000: connect: no route to host
  5. Try starting a toolbox container directly with the default parameters, which bypasses the .toolboxrc overrides:
[root@<node_name> /]# podman run -it --name toolbox-test --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=toolbox-test -e IMAGE=registry.redhat.io/rhel8/support-tools:latest -v /run:/run -v /var/log:/var/log -v /etc/machine-id:/etc/machine-id -v /etc/localtime:/etc/localtime -v /:/host registry.redhat.io/rhel8/support-tools:latest
  6. If the toolbox container runs normally, the sosreport can be generated:
[root@<node_name> /]# sosreport -k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on
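The checks in the steps above can be applied to captured toolbox output with a couple of small filters. This is only a sketch: the function names and the three result labels are hypothetical, and the patterns are based solely on the messages quoted in this article, so other failure modes may not be recognised.

```shell
# classify_toolbox_output
# Reads captured toolbox output on stdin and prints one of:
#   default-config        the ".toolboxrc file detected" message is absent
#   custom-config-ok      the message is present and no "Error:" line follows
#   custom-config-failed  the message is present and an "Error:" line was seen
classify_toolbox_output() {
    output=$(cat)
    if ! printf '%s\n' "$output" | grep -q '\.toolboxrc file detected'; then
        echo default-config
    elif printf '%s\n' "$output" | grep -q '^Error: '; then
        echo custom-config-failed
    else
        echo custom-config-ok
    fi
}

# failed_registries
# Reads the same output on stdin and prints each registry host:port that
# appears in an "error pinging docker registry" message, once per registry.
failed_registries() {
    sed -n 's/.*error pinging docker registry \([^ ]*\): .*/\1/p' | sort -u
}
```

If the output of a failed run has been saved to a file (for example toolbox.log, a name chosen here for illustration), classify_toolbox_output < toolbox.log and failed_registries < toolbox.log show whether the custom configuration is involved and which registry could not be reached.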

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
