Ceph - Device class is displaying wrong hardware class for an OSD

Solution Verified

Environment

  • Red Hat Ceph Storage 3.x (Luminous)

Issue

  • After running ceph-ansible and adding OSDs, Ceph displays the wrong hardware class for the new OSDs.

Resolution

  • Run ceph osd tree to see the device class listed for each OSD:
[root@monnode1 ~]# ceph osd tree
    ID    CLASS     WEIGHT      TYPE    NAME            STATUS REWEIGHT  PRI-AFF 
    -1              0.07910     root    default                               
    -3              0.02637          host osdnode1                         
     0      ssd     0.00879             osd.0              up   1.00000      1.00000 
     2      ssd     0.00879             osd.2              up   1.00000      1.00000 
     4      ssd     0.00879             osd.4              up   1.00000      1.00000 
    -7              0.02637          host osdnode2                         
     6      hdd     0.00879             osd.6              up   1.00000      1.00000 
     7      hdd     0.00879             osd.7              up   1.00000      1.00000 
     8      hdd     0.00879             osd.8              up   1.00000      1.00000 
    -5              0.02637          host osdnode3                         
     1      nvme    0.00879             osd.1              up   1.00000      1.00000 
     3      nvme    0.00879             osd.3              up   1.00000      1.00000 
     5      nvme    0.00879             osd.5              up   1.00000      1.00000 
  • In the output above, the hardware class for osd.6, osd.7, and osd.8 is incorrect and needs to be changed from hdd to ssd. First remove the existing class from the OSD, then set the correct one, using the following syntax:
# ceph osd crush rm-device-class osd.<$id>
# ceph osd crush set-device-class <$class> osd.<$id>
  • For example:
# ceph osd crush rm-device-class osd.6
# ceph osd crush set-device-class ssd osd.6
  • To change several OSDs at once, append the additional OSD IDs to the end of each command. For example:
# ceph osd crush rm-device-class osd.6 osd.7 osd.8
# ceph osd crush set-device-class ssd osd.6 osd.7 osd.8
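  • The remove-then-set sequence can be wrapped in a small shell function. The following is a minimal sketch, not part of the original solution: the names run, reclass_osds, and DRY_RUN are hypothetical and introduced here for illustration only. It assumes the ceph CLI is on the PATH and that it is run from a node with admin access to the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Ceph tool): move a list of OSDs to a new device class.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reclass_osds() {
    local class="${1:-}"
    if [ -z "$class" ] || [ "$#" -lt 2 ]; then
        echo "usage: reclass_osds <class> osd.<id> [osd.<id> ...]" >&2
        return 1
    fi
    shift
    # The existing class is removed first, then the new class is set,
    # mirroring the two manual commands in this article.
    run ceph osd crush rm-device-class "$@" || return 1
    run ceph osd crush set-device-class "$class" "$@"
}
```

    With DRY_RUN=1, calling reclass_osds ssd osd.6 osd.7 osd.8 only prints the two ceph commands, which allows reviewing them before making changes. Afterward, ceph osd tree can be run again to confirm the CLASS column.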

Root Cause

  • Red Hat Ceph Storage 3.0 introduced a new "device class" feature that automatically detects the device class based on the hardware properties exposed by the Linux kernel.
  • In some situations, the automatic detection reports an incorrect class because the device driver does not properly expose information about the device via /sys/block.
  • Some environments use custom device class names. In that case, newly deployed or re-deployed OSDs take on one of the default class names ('ssd', 'nvme', or 'hdd') and must be changed to the proper custom class name afterward. If this is not done, the affected PGs can in some situations see communication issues among OSDs, resulting in inactive PGs or blocked/slow requests until the map is updated properly.
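  • To see what the kernel exposes for a given disk, the sysfs entries under /sys/block can be inspected on the OSD node. The sketch below reads the standard Linux attribute queue/rotational. Treating this attribute as relevant to the misdetection is an assumption, since this article only names /sys/block in general. The function name kernel_media_type and the SYS_BLOCK override are hypothetical and exist only for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report what the kernel says about a block device.
# SYS_BLOCK can be overridden (for example in tests); it defaults to /sys/block.

kernel_media_type() {
    local dev="${1:-}"
    local flag_file="${SYS_BLOCK:-/sys/block}/${dev}/queue/rotational"
    if [ -z "$dev" ] || [ ! -r "$flag_file" ]; then
        echo "unknown"
        return 1
    fi
    # 1 means the kernel treats the device as rotational (spinning disk),
    # 0 means non-rotational (flash).
    case "$(cat "$flag_file")" in
        1) echo "rotational" ;;
        0) echo "non-rotational" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

    For example, kernel_media_type sdb on the OSD node shows how the kernel classifies /dev/sdb. A flash device reported as rotational would be consistent with the driver not exposing its properties correctly.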

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
