How to generate a sos report in Red Hat Enterprise Linux CoreOS in OpenShift 4 with SSH access to nodes?
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 4
- Red Hat Enterprise Linux CoreOS (RHCOS)
  - 4
Issue
- How to generate sos report in Red Hat Enterprise Linux CoreOS in OCP 4?
- How to generate sos report for Red Hat OpenShift 4 nodes?
- Generating sos report using rhel7/support-tools image fails with a traceback.
[root@ip-1-1-1-1 ~]# podman run -it registry.access.redhat.com/rhel7/support-tools /usr/bin/bash
bash-4.2# sosreport
Traceback (most recent call last):
  File "/usr/sbin/sosreport", line 19, in <module>
    main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1498, in main
    sos = SoSReport(args)
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 360, in __init__
    self.policy = sos.policies.load(sysroot=self.opts.sysroot)
  File "/usr/lib/python2.7/site-packages/sos/policies/__init__.py", line 44, in load
    cache['policy'] = policy(sysroot=sysroot)
  File "/usr/lib/python2.7/site-packages/sos/policies/redhat.py", line 258, in __init__
    super(RHELPolicy, self).__init__(sysroot=sysroot)
  File "/usr/lib/python2.7/site-packages/sos/policies/redhat.py", line 58, in __init__
    sysroot = self._container_init()
  File "/usr/lib/python2.7/site-packages/sos/policies/redhat.py", line 153, in _container_init
    host_tmp_dir = os.path.abspath(self._host_sysroot + self._tmp_dir)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
Resolution
Important note: By design, OpenShift 4 clusters are immutable and rely on Operators to apply cluster changes. As a result, accessing the underlying nodes directly over SSH is not the recommended procedure. Additionally, nodes accessed over SSH are tainted as accessed.
Therefore, whenever possible, generate a sos report without using SSH by spawning a debug pod directly from the oc command line. See How to generate SOSREPORT within OpenShift4 nodes without SSH for further information.
Only if it is not possible to generate a sos report without SSH, connect via SSH to the OpenShift 4 node where the sos report is to be generated and become root:
$ ssh core@[NODE] # ssh with core user to the NODE using ssh key specified in install-config.yaml
[core@node ~]$ sudo -i
Note: in disconnected environments, the registry.redhat.io/rhel8/support-tools image must be mirrored. If the image is already available to the nodes, create a /root/.toolboxrc file on the node before running toolbox, containing the URL of the registry and the name of the image in the custom registry:
[root@node ~]# vi /root/.toolboxrc
REGISTRY=private-registry.example.com:5000
IMAGE=rhel8/support-tools
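As an alternative to editing the file interactively with vi, the same two settings could be written non-interactively. The following is a minimal sketch, assuming a POSIX shell; write_toolboxrc is a hypothetical helper name introduced here for illustration and is not part of toolbox, and the registry and image values are the example values from the note above.

```shell
#!/bin/sh
# Sketch only: write a .toolboxrc that points toolbox at a mirrored image.
# write_toolboxrc is a hypothetical helper, not a toolbox command.
write_toolboxrc() {
    target=$1     # file to write, for example /root/.toolboxrc
    registry=$2   # mirror registry, for example private-registry.example.com:5000
    image=$3      # image name in the mirror, for example rhel8/support-tools
    # Overwrites the target file with exactly the two settings.
    printf 'REGISTRY=%s\nIMAGE=%s\n' "$registry" "$image" > "$target"
}

# Example invocation, to be run as root on the node:
# write_toolboxrc /root/.toolboxrc private-registry.example.com:5000 rhel8/support-tools
```

The helper overwrites any existing file at the target path, so review an existing /root/.toolboxrc before replacing it.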
Run the toolbox command:
[root@node ~]# toolbox
Spawning a container 'toolbox-root' with image 'registry.redhat.io/rhel8/support-tools'
Detected RUN label in the container image. Using that as the default...
Command: /proc/self/exe run -it --name toolbox-root --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=toolbox-root -e IMAGE=registry.redhat.io/rhel8/support-tools:latest -v /run:/run -v /var/log:/var/log -v /etc/machine-id:/etc/machine-id -v /etc/localtime:/etc/localtime -v /:/host registry.redhat.io/rhel8/support-tools:latest
Execute the sos report command:
[root@node ~]# sos report -k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on
Note: if any of the plugins times out, or not all the information is collected, it may be necessary to add the --plugin-timeout=600 parameter to increase the plugin timeout.
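To show where the timeout parameter goes relative to the plugin options, the sketch below assembles the full command line. It assumes a POSIX shell; sos_report_cmd is a hypothetical helper that only prints the command and does not run sos. The plugin options and the 600 second value are the ones given in this article.

```shell
#!/bin/sh
# Sketch only: print the sos report command line, optionally with a
# plugin timeout. sos_report_cmd is a hypothetical helper name.
sos_report_cmd() {
    timeout=${1:-}   # plugin timeout in seconds; empty means "not set"
    cmd="sos report -k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on"
    if [ -n "$timeout" ]; then
        cmd="$cmd --plugin-timeout=$timeout"
    fi
    # Print only; the command is not executed here.
    printf '%s\n' "$cmd"
}

# Print the command with the timeout suggested in the note.
sos_report_cmd 600
```

The printed line can then be run at the prompt inside the toolbox container.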
The sos report command writes the archive to the /host/var/tmp directory in the container (which maps to /var/tmp/ on the host). Refer to What options are available to copy/share the generated sosreport? for the different ways to attach the generated sosreport to a Support Case.
Once the sos report has been created, run exit to leave the container's bash session and return to the node:
[root@node ~]# exit
[root@node ~]#
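Back on the node, the generated archive should be under /var/tmp. The following is a minimal sketch for locating the most recent one, assuming a POSIX shell and assuming the archive follows the sosreport-*.tar.xz naming pattern (an assumption about the default sos file name, not something stated in this article); latest_sosreport is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the path of the newest sos report archive in a directory.
# latest_sosreport is a hypothetical helper; the sosreport-*.tar.xz pattern
# is an assumption about the default archive name.
latest_sosreport() {
    dir=${1:-/var/tmp}   # directory to search; defaults to /var/tmp on the host
    # ls -t lists newest first; errors are hidden when nothing matches,
    # in which case nothing is printed.
    ls -t "$dir"/sosreport-*.tar.xz 2>/dev/null | head -n 1
}

# Example invocation on the node:
# latest_sosreport /var/tmp
```

If the function prints nothing, no archive matching the assumed pattern exists in that directory.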
Root Cause
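The toolbox command runs podman container runlabel run registry.redhat.io/rhel8/support-tools, which is the replacement for atomic run registry.redhat.io/rhel7/support-tools from RHEL Atomic Host. As the toolbox output above shows, the run label sets HOST=/host and mounts the node's / at /host, which a plain podman run of the image does not do. The traceback in the Issue section shows sos failing because its host sysroot is unset (None).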
Diagnostic Steps
If toolbox does not start the debug container as expected, check for a user-created $HOME/.toolboxrc file that could be overriding the default values of the REGISTRY, IMAGE, or TOOLBOX_NAME options. In disconnected environments, that file must be created to refer to the mirrored image.
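The check described above can be sketched as a small shell function. This assumes a POSIX shell and assumes the file uses plain NAME=value lines, as in the example earlier in this article; show_toolboxrc_overrides is a hypothetical helper name, not a toolbox command.

```shell
#!/bin/sh
# Sketch only: list the REGISTRY, IMAGE and TOOLBOX_NAME overrides found in
# a .toolboxrc file. show_toolboxrc_overrides is a hypothetical helper.
show_toolboxrc_overrides() {
    rc=${1:-$HOME/.toolboxrc}   # file to inspect; defaults to the current user's
    if [ ! -f "$rc" ]; then
        echo "no $rc found"
        return 0
    fi
    # Print only the lines that assign one of the three options;
    # comments and unrelated settings are skipped.
    grep -E '^[[:space:]]*(REGISTRY|IMAGE|TOOLBOX_NAME)=' "$rc" \
        || echo "no REGISTRY, IMAGE or TOOLBOX_NAME overrides in $rc"
}

# Example invocation as root on the node:
# show_toolboxrc_overrides /root/.toolboxrc
```

Any line printed is a value that replaces the toolbox default, so it is worth comparing against the registry and image that the node can actually reach.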