How to collect MDS debug logs in OpenShift Data Foundation

When troubleshooting MDS issues in ODF, it is often necessary to collect MDS and MGR debug logs. This article describes how to enable debug logging, collect the resulting logs, and disable debug logging again.

Before starting an MDS debug session, have RH Support confirm that the MDSs are healthy enough for it. In most instances the MDSs are in their normal state, up:active, and a debug session is safe. In some instances, however, one of the MDSs may be stuck in replay. If the debug session then causes the active MDS to fail over, both MDSs could be lost, which increases the chance of corruption.
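
As a quick pre-check before contacting support, the MDS states can be inspected from the rook-ceph-tools pod with `ceph mds stat`. The sketch below is only an illustration and does not replace confirmation from RH Support. The helper name `mds_safe_for_debug` is hypothetical, and it assumes the state strings `up:active` and `up:replay` appear literally in the `ceph mds stat` output.

```shell
#!/bin/sh
# Hypothetical helper: reads `ceph mds stat` output on stdin and succeeds
# only if some MDS is up:active and none is in up:replay.
# up:standby-replay is a different state and is not matched by the
# up:replay pattern, because "up:" is followed by "standby" there.
mds_safe_for_debug() {
    status=$(cat)
    if printf '%s\n' "$status" | grep -q 'up:replay'; then
        echo "An MDS is in up:replay; do not start a debug session." >&2
        return 1
    fi
    if ! printf '%s\n' "$status" | grep -q 'up:active'; then
        echo "No MDS reports up:active; do not start a debug session." >&2
        return 1
    fi
    echo "MDS state looks normal: up:active present, no up:replay."
}

# Usage inside the rook-ceph-tools pod:
#   ceph mds stat | mds_safe_for_debug
```

A non-zero exit status means the check found a reason to stop. A zero exit status only means the two conditions named in this article were met.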

The steps below are for debugging purposes only. DO NOT keep debug logging enabled for an extended period of time, or you risk filling up the filesystem.

  • Switch to the openshift-storage namespace:
$ oc project openshift-storage
  • Rsh into the rook-ceph-tools pod, note the current time, and enable debug logging:
$ date  (Note the timestamp when debug logging was enabled)
$ ceph config set mds debug_mds 20
$ ceph config set mgr debug_mgr 20
$ ceph config set mds debug_ms 1
$ ceph config set mgr debug_ms 1
  • Wait 10 minutes

  • Rsh back into the rook-ceph-tools pod and disable debug logging by running the following commands (do not skip this step; the underlying storage devices will fill up very quickly if it is missed):

$ ceph config rm mds debug_mds
$ ceph config rm mgr debug_mgr
$ ceph config rm mds debug_ms
$ ceph config rm mgr debug_ms
  • Upload a fresh ODF must-gather to the case (where 'x' is the minor version number):
$ oc adm must-gather
  • Rsh into the pod hosting the active MDS and the pod hosting the MGR, and list the log files in each:
$ find /var/log/ceph -type f
  • Copy all the MDS logs from the MDS pod and all the MGR logs from the MGR pod to your local system, then attach them to the support case.
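
The enable, wait, disable, and copy steps above can be sketched as one script run from a workstation, so that the debug settings are removed even if the session is interrupted. This is a minimal sketch under stated assumptions, not a supported tool. It uses `oc exec` against the toolbox instead of an interactive rsh session, and it assumes the toolbox is reachable as `deploy/rook-ceph-tools` and that `oc cp` works against the MDS and MGR pods (it needs `tar` in the container). The names `run`, `tools_ceph`, `enable_debug`, `disable_debug`, `debug_window`, `copy_logs`, `DRY_RUN`, and `WAIT_SECONDS` are hypothetical. With the default `DRY_RUN=1` the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: drives the steps above from a workstation with `oc`.
# Assumptions (not from the article): the toolbox is reachable as
# deploy/rook-ceph-tools; DRY_RUN, WAIT_SECONDS and all function
# names are illustrative.
NS="${NS:-openshift-storage}"
TOOLS="${TOOLS:-deploy/rook-ceph-tools}"
DRY_RUN="${DRY_RUN:-1}"              # 1 = print commands, 0 = execute them
WAIT_SECONDS="${WAIT_SECONDS:-600}"  # the 10-minute wait

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Run a ceph subcommand inside the toolbox.
tools_ceph() {
    run oc exec -n "$NS" "$TOOLS" -- ceph "$@"
}

enable_debug() {
    tools_ceph config set mds debug_mds 20
    tools_ceph config set mgr debug_mgr 20
    tools_ceph config set mds debug_ms 1
    tools_ceph config set mgr debug_ms 1
}

disable_debug() {
    tools_ceph config rm mds debug_mds
    tools_ceph config rm mgr debug_mgr
    tools_ceph config rm mds debug_ms
    tools_ceph config rm mgr debug_ms
}

# Enable debug logging, wait, then remove the settings again.
# The traps remove the settings if the script is interrupted early.
debug_window() {
    trap disable_debug EXIT
    trap 'exit 1' HUP INT TERM
    date                             # timestamp when debugging was enabled
    enable_debug
    run sleep "$WAIT_SECONDS"
    disable_debug
    trap - EXIT HUP INT TERM         # settings removed; clear the traps
}

# Copy /var/log/ceph from one pod into a local directory.
copy_logs() {
    pod="$1"
    dest="$2"
    run mkdir -p "$dest"
    run oc cp -n "$NS" "${pod}:/var/log/ceph" "$dest"
}

# Example: append calls such as these to the script, review the dry-run
# output, then run the script with DRY_RUN=0 set in its environment.
# Pod names are placeholders; list the real ones with `oc get pods`.
#   debug_window
#   copy_logs <active-mds-pod> ./mds-logs
#   copy_logs <mgr-pod> ./mgr-logs
```

Set `DRY_RUN=0` for the whole script run rather than for a single function call, so that the cleanup trap also executes its commands. If the run is cut short in a way the traps cannot catch, rerun the four `ceph config rm` commands by hand.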