How to list the local disk IDs for every OCS worker in OpenShift 4.x

Solution Verified

Environment

  • OpenShift Container Storage 4.3 on OpenShift 4.x

Issue

  • When installing OpenShift Container Storage (OCS) on local disks of the nodes, you need to collect all of the disk IDs so the Local Storage Operator can reference them in the LocalVolume Custom Resource definition(s).
  • Administrators would otherwise have to log into every single OCS worker node to collect these IDs, which quickly gets out of hand with more than three OCS worker nodes or many local disks per worker node.

Resolution

Set up a DaemonSet that collects local disk information

  • Make sure you are logged into your OpenShift cluster in your terminal - check with oc get nodes
  • Make sure your OCS workers have the OCS label applied - you can check which nodes have the label with oc get nodes -l cluster.ocs.openshift.io/openshift-storage. If a node is missing the label, see the example after this list.
  • Apply the DaemonSet with oc apply -f https://raw.githubusercontent.com/dmoessne/ocs-disk-gather/master/ocs-disk-gatherer.yaml
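If one of your intended OCS worker nodes is missing the label, you can apply it yourself. A minimal example, assuming a node named compute-0 (replace it with your node name):

oc label node compute-0 cluster.ocs.openshift.io/openshift-storage=''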

Now wait until the new Pods are Running. You can check this with oc get po -l name=ocs-disk-gatherer -n default. The output should be similar to this:

NAME                      READY   STATUS    RESTARTS   AGE
ocs-disk-gatherer-8dsnp   1/1     Running   0          64s
ocs-disk-gatherer-klstj   1/1     Running   0          63s
ocs-disk-gatherer-xnzqb   1/1     Running   0          58s
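Alternatively, you can wait for the DaemonSet rollout to finish. One possible check, assuming the DaemonSet was applied to the default namespace and uses the default RollingUpdate update strategy:

oc rollout status daemonset/ocs-disk-gatherer --namespace default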

If the default namespace is not available

If the default namespace is not available or is restricted for security reasons, you can use the extended version of the manifest instead. To do so, apply the DaemonSet like this:
oc apply -f https://raw.githubusercontent.com/dmoessne/ocs-disk-gather/master/ocs-disk-gatherer-own-project.yaml

This will set up a new namespace and configure all the needed permissions for the DaemonSet. In the following commands, you will need to replace --namespace default with --namespace ocs-disk-gatherer.
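For example, the log collection command from the next section would then become:

kubectl logs --selector name=ocs-disk-gatherer --tail=-1 --since=10m --namespace ocs-disk-gatherer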

To clean up, use oc delete -f https://raw.githubusercontent.com/dmoessne/ocs-disk-gather/master/ocs-disk-gatherer-own-project.yaml. This will remove the DaemonSet, namespace and all permissions that were created for this procedure.

Collecting disk information

To collect the disk information, execute the following command:

kubectl logs --selector name=ocs-disk-gatherer --tail=-1 --since=10m --namespace default

An example output is:

          # NODE:compute-0
          # nvme0n1 : 1.5T
        - /dev/disk/by-id/lvm-pv-uuid-c3lSad-cmJh-1Oc9-afGi-dorm-Sz8P-ReUc9K
          # sda : 60G
        - /dev/disk/by-id/scsi-36000c29477c361e09aa3593630a138f5
          # sdb : 10G
        - /dev/disk/by-id/scsi-36000c2944997a920188c40e9072283e0
 -------------------------------------------
          # NODE:compute-1
          # nvme0n1 : 1.5T
        - /dev/disk/by-id/nvme-MO001600KWJSN_PHLE821600NR1P6CGN
          # sda : 60G
        - /dev/disk/by-id/scsi-36000c29b89117dab754e240d65e14797
          # sdb : 10G
        - /dev/disk/by-id/scsi-36000c29fff4553f181a8baea95416e49
 -------------------------------------------
          # NODE:compute-2
          # nvme0n1 : 1.5T
        - /dev/disk/by-id/nvme-MO001600KWJSN_PHLE821600571P6CGN
          # sda : 60G
        - /dev/disk/by-id/scsi-36000c29bc604bb3f298a3322fb2907a8
          # sdb : 10G
        - /dev/disk/by-id/scsi-36000c29f7b560b3b6a50b9a4f260d118
 -------------------------------------------

As you can see, there are three nodes, each with three disks, one of which is an NVMe disk.
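If you want to edit the output into a LocalVolume definition, you can also redirect it to a file; the filename here is just an example:

kubectl logs --selector name=ocs-disk-gatherer --tail=-1 --since=10m --namespace default > ocs-disk-ids.txt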

If you wanted to use the local NVMe disks in the above example, you would create a LocalVolume Custom Resource like this:

apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-block
  namespace: local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
        - key: cluster.ocs.openshift.io/openshift-storage
          operator: In
          values:
          - ""
  storageClassDevices:
    - storageClassName: localblock
      volumeMode: Block
      devicePaths:
          # NODE:compute-0
          # nvme0n1 : 1.5T
        - /dev/disk/by-id/lvm-pv-uuid-c3lSad-cmJh-1Oc9-afGi-dorm-Sz8P-ReUc9K
          # NODE:compute-1
          # nvme0n1 : 1.5T
        - /dev/disk/by-id/nvme-MO001600KWJSN_PHLE821600NR1P6CGN
          # NODE:compute-2
          # nvme0n1 : 1.5T
        - /dev/disk/by-id/nvme-MO001600KWJSN_PHLE821600571P6CGN

This will then be picked up by the Local Storage Operator, which will create PVs for these three disks.
Make sure to indent the /dev/disk/by-id lines correctly! They need to be nested below devicePaths.
The comment lines with the node and device names are copied over to highlight which devices are being used; they are ignored by the operator.
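Once the Local Storage Operator has processed the LocalVolume, you can verify the result. A quick check, assuming the operator runs in the local-storage namespace used in the example above:

oc get localvolume local-block --namespace local-storage
oc get pv | grep localblock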

Refreshing disk information

If you want to refresh the disk information, you can safely delete the Pods; they will be recreated and report the current disk information:
oc delete po --namespace default --selector name=ocs-disk-gatherer

Be sure to wait until all the Pods are running again before you collect the new disk information.
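One way to wait for the recreated Pods, assuming the DaemonSet still runs in the default namespace:

oc wait pod --selector name=ocs-disk-gatherer --for=condition=Ready --namespace default --timeout=120s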

By default, disk information is refreshed every 10 minutes.

Cleaning up

Once you have collected the disk information you need, you can remove everything like this:

oc delete daemonsets ocs-disk-gatherer --namespace default

This will remove the DaemonSet and its Pods.
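You can confirm that nothing is left behind; the following command should then report that no resources were found:

oc get po -l name=ocs-disk-gatherer -n default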

