How to provide a sos report from a RHEL CoreOS OpenShift 4 node

Solution Verified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Red Hat Enterprise Linux CoreOS (RHCOS)
  • sos

Issue

  • In some situations, Red Hat Support may ask you to provide a sos report from one or more OpenShift nodes running Red Hat Enterprise Linux CoreOS (RHCOS)
  • RHCOS does not provide the sos report tool natively

Resolution

Although RHCOS is based on RHEL components, several traditional RHEL tools, including sos, are not included in the system by default.

Collecting sos report from OpenShift nodes

Note: regardless of the method used, the sos report command should be run with (at least) the following parameters: "-e openshift -e openshift_ovn -e openvswitch -e podman -e crio -k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on --all-logs --plugin-timeout=600".
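
To avoid retyping the option list on every node, it can be kept in a single shell variable. The following is a minimal sketch: the options are copied verbatim from the note above, while the variable name SOS_OPTS is only illustrative and is not part of sos. The command is printed for review rather than executed, and it is assumed to be run later inside the toolbox container on the node.

```shell
# Options copied verbatim from the note above; SOS_OPTS is an illustrative name.
SOS_OPTS="-e openshift -e openshift_ovn -e openvswitch -e podman -e crio \
-k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on \
--all-logs --plugin-timeout=600"

# Print the complete command line so it can be reviewed and pasted on the node.
echo "sos report $SOS_OPTS"
```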

For collecting the sos report, please refer to:

IMPORTANT: For all methods, if the cluster is disconnected or the image used to generate the sos report cannot be pulled, mirror the registry.redhat.io/rhel9/support-tools image to the mirror registry used by the cluster so that the sos report can still be collected.
If mirroring is not an option, the image can be added to the node in other ways so that the toolbox can use it, such as the procedure described in how to import an image for toolbox manually to a node on OpenShift 4. Further options are described in how to use podman save to share container images and in how Podman can transfer container images without a registry.
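
As a rough illustration of the two approaches above, the image can be mirrored with oc image mirror, or moved as an archive with podman save and podman load. Everything except the source image name is an assumption made for this sketch: the mirror registry mirror.example.com:5000, the latest tag, the archive path, the helper function names, and loading the image with sudo on the node. The commands are only printed so that they can be adapted before being run.

```shell
# Source image named in this article; the tag "latest" is an assumption.
SRC_IMAGE="registry.redhat.io/rhel9/support-tools"
# Placeholders (assumptions): replace with your mirror registry and a writable path.
MIRROR_REGISTRY="mirror.example.com:5000"
IMAGE_ARCHIVE="/tmp/support-tools.tar"

# Option 1: mirror the image into the registry the cluster already uses.
print_mirror_command() {
  echo "oc image mirror ${SRC_IMAGE}:latest ${MIRROR_REGISTRY}/rhel9/support-tools:latest"
}

# Option 2: no registry. Save on a connected host, copy to the node, load there.
print_offline_commands() {
  echo "podman save -o ${IMAGE_ARCHIVE} ${SRC_IMAGE}:latest"
  echo "scp ${IMAGE_ARCHIVE} core@<nodename>:/var/tmp/"
  echo "sudo podman load -i /var/tmp/${IMAGE_ARCHIVE##*/}"
}

print_mirror_command
print_offline_commands
```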

Upload the sos report to the related support case

  • If the node is part of a connected cluster, the sos report can be uploaded directly from the node via Red Hat Secure FTP.

    IMPORTANT: This requires access to access.redhat.com via HTTPS. If the OCP node must use a proxy to reach external hosts, the proxy needs to be configured for curl.

  • If the node is part of a disconnected cluster, or has no direct access to access.redhat.com via HTTPS, the sos report must be copied to a local machine for upload with either of the following commands (oc debug node is the preferred method):

    $ oc debug node/<nodename> -- cat /host/var/tmp/sosreport-XXXXX.tar.xz > /tmp/sosreport-XXXXX.tar.xz
    

    OR (if oc debug node does not work):

    $ scp core@<nodename>:/var/tmp/sosreport-XXXXX.tar.xz /tmp/sosreport-XXXXX.tar.xz
    

    The sos report archive can now be uploaded from the local machine to the case in the Red Hat Customer Portal or to Red Hat Secure FTP.

  • Once the sos report is uploaded, the tar.xz archive can be removed from the node:

    $ oc debug -t node/<nodename>
    $ rm /host/var/tmp/sosreport-XXXXX.tar.xz
    

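The copy and cleanup steps above can be wrapped in small shell functions so that the archive is removed from the node only after the local copy has been checked. This is a sketch under stated assumptions: NODE and ARCHIVE are placeholders, the function names are hypothetical, and the non-interactive "oc debug ... -- rm" form is assumed to behave like the interactive session shown above. The block only defines the functions; nothing contacts the cluster until they are called.

```shell
#!/bin/sh
# Placeholders: set these to the real node name and the archive name printed by sos.
NODE="${NODE:-<nodename>}"
ARCHIVE="${ARCHIVE:-sosreport-XXXXX.tar.xz}"
LOCAL="/tmp/${ARCHIVE}"

# Copy the archive from the node to the local machine (same command as above).
fetch_report() {
  oc debug "node/${NODE}" -- cat "/host/var/tmp/${ARCHIVE}" > "${LOCAL}"
}

# Succeed only if the given file exists and is not empty.
local_copy_ok() {
  [ -s "$1" ]
}

# Remove the archive from the node; call this only after the upload has finished.
remove_remote_report() {
  oc debug "node/${NODE}" -- rm "/host/var/tmp/${ARCHIVE}"
}

# Typical order (not executed here):
#   fetch_report && local_copy_ok "${LOCAL}"   # then upload ${LOCAL} to the case
#   remove_remote_report
```

An empty local file usually means that the debug pod could not start or that the archive name was wrong, so the check is worth running before anything is deleted on the node.
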
Root Cause

By design, OpenShift 4 nodes are immutable and rely on ClusterOperators to apply changes.

For that reason, the preferred method to collect a sos report from an OpenShift node is the oc debug node command. Use the SSH method only if that command does not work for the specific node.
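
The choice between the two methods can be sketched as a small probe: try to start a debug pod, and fall back to SSH only if that fails. The function name choose_method and the probe command are hypothetical, and the follow-up steps in the printed hints (chroot /host, toolbox, sos report) reflect the usual RHCOS workflow as an assumption, since this article does not list them.

```shell
#!/bin/sh
NODE="${NODE:-<nodename>}"

# Run the probe command given as arguments and report which method to use.
# Prints "oc-debug" if the probe succeeds and "ssh" otherwise.
choose_method() {
  if "$@" >/dev/null 2>&1; then
    echo "oc-debug"
  else
    echo "ssh"
  fi
}

# Print a hint for the chosen method; the steps after login are assumptions.
print_hint() {
  case "$1" in
    oc-debug) echo "oc debug node/${NODE}  (then: chroot /host; toolbox; sos report ...)" ;;
    ssh)      echo "ssh core@${NODE}  (then: sudo -i; toolbox; sos report ...)" ;;
  esac
}

# Usage (requires cluster access, so it is not executed here):
#   print_hint "$(choose_method oc debug "node/${NODE}" -- true)"
```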

