Chapter 3. Commonly required logs for troubleshooting

Some of the commonly used logs for troubleshooting OpenShift Data Foundation are listed, along with the commands to generate them.

  • Generating logs for a specific pod:

     $ oc logs <pod-name> -n <namespace>
  • Generating logs for Ceph or OpenShift Data Foundation cluster:

    $ oc logs rook-ceph-operator-<ID> -n openshift-storage
    Important

    Currently, the rook-ceph-operator logs do not provide any information about the failure and this acts as a limitation in troubleshooting issues, see Enabling and disabling debug logs for rook-ceph-operator.

  • Generating logs for plugin pods like cephfs or rbd to detect any problem in the PVC mount of the app-pod:

    $ oc logs csi-cephfsplugin-<ID> -n openshift-storage -c csi-cephfsplugin
    $ oc logs csi-rbdplugin-<ID> -n openshift-storage -c csi-rbdplugin
    • To generate logs for all the containers in the CSI pod:

      $ oc logs csi-cephfsplugin-<ID> -n openshift-storage --all-containers
      $ oc logs csi-rbdplugin-<ID> -n openshift-storage --all-containers
  • Generating logs for cephfs or rbd provisioner pods to detect problems if PVC is not in BOUND state:

    $ oc logs csi-cephfsplugin-provisioner-<ID> -n openshift-storage -c csi-cephfsplugin
    $ oc logs csi-rbdplugin-provisioner-<ID> -n openshift-storage -c csi-rbdplugin
    • To generate logs for all the containers in the CSI pod:

      $ oc logs csi-cephfsplugin-provisioner-<ID> -n openshift-storage --all-containers
      $ oc logs csi-rbdplugin-provisioner-<ID> -n openshift-storage --all-containers
  • Generating OpenShift Data Foundation logs using cluster-info command:

    $ oc cluster-info dump -n openshift-storage --output-directory=<directory-name>
  • When using Local Storage Operator, generating logs can be done using cluster-info command:

    $ oc cluster-info dump -n openshift-local-storage --output-directory=<directory-name>
  • Check the OpenShift Data Foundation operator logs and events.

    • To check the operator logs :

      # oc logs <ocs-operator> -n openshift-storage
      <ocs-operator>
      # oc get pods -n openshift-storage | grep -i "ocs-operator" | awk '{print $1}'
    • To check the operator events :

      # oc get events --sort-by=metadata.creationTimestamp -n openshift-storage
  • Get the OpenShift Data Foundation operator version and channel.

    # oc get csv -n openshift-storage

    Example output :

    NAME                 DISPLAY                     VERSION
    REPLACES             PHASE
    mcg-operator.v4.9.2  NooBaa Operator             4.9.2
    mcg-operator.v4.9.1  Succeeded
    ocs-operator.v4.9.2  OpenShift Container Storage 4.9.2
    ocs-operator.v4.9.1  Succeeded
    odf-operator.v4.9.2  OpenShift Data Foundation   4.9.2
    odf-operator.v4.9.1  Succeeded
    # oc get subs -n openshift-storage

    Example output :

    NAME
    PACKAGE           SOURCE                  CHANNEL
    mcg-operator-stable-4.9-redhat-operators-openshift-marketplace
    mcg-operator      redhat-operators        stable-4.9
    ocs-operator-stable-4.9-redhat-operators-openshift-marketplace
    ocs-operator      redhat-operators        stable-4.9
    odf-operator
    odf-operator      redhat-operators        stable-4.9
  • Confirm that the installplan is created.

    # oc get installplan -n openshift-storage
  • Verify the image of the components post updating OpenShift Data Foundation.

    • Check the node on which the pod of the component you want to verify the image is running.

      # oc get pods -o wide | grep <component-name>

      For Example :

      # oc get pods -o wide | grep rook-ceph-operator

      Example output:

      rook-ceph-operator-566cc677fd-bjqnb 1/1 Running 20 4h6m 10.128.2.5 rook-ceph-operator-566cc677fd-bjqnb 1/1 Running 20 4h6m 10.128.2.5 dell-r440-12.gsslab.pnq2.redhat.com <none> <none>
      
      <none> <none>

      dell-r440-12.gsslab.pnq2.redhat.com is the node-name.

    • Check the image ID.

      # oc debug node/<node name>

      <node-name>

      Is the name of the node on which the pod of the component you want to verify the image is running.

      # chroot /host
      # crictl images | grep <component>

      For Example :

      # crictl images | grep rook-ceph

      Take a note of the IMAGEID and map it to the Digest ID on the Rook Ceph Operator page.

Additional resources