Chapter 6. Management of OSDs using the Ceph Orchestrator

As a storage administrator, you can use the Ceph Orchestrator to manage the OSDs of a Red Hat Ceph Storage cluster.

6.1. Ceph OSDs

When a Red Hat Ceph Storage cluster is up and running, you can add OSDs to the storage cluster at runtime.

A Ceph OSD generally consists of one ceph-osd daemon for one storage drive and its associated journal within a node. If a node has multiple storage drives, then map one ceph-osd daemon for each drive.

Red Hat recommends checking the capacity of a cluster regularly to see if it is reaching the upper end of its storage capacity. As a storage cluster reaches its near full ratio, add one or more OSDs to expand the storage cluster’s capacity.

When you want to reduce the size of a Red Hat Ceph Storage cluster or replace the hardware, you can also remove an OSD at runtime. If the node has multiple storage drives, you might also need to remove the ceph-osd daemon for each affected drive. Before you remove an OSD, check the capacity of the storage cluster and ensure that it is not at its near full ratio.

Important

Do not let a storage cluster reach the full ratio before adding an OSD. OSD failures that occur after the storage cluster reaches the near full ratio can cause the storage cluster to exceed the full ratio. Ceph blocks write access to protect the data until you resolve the storage capacity issues. Do not remove OSDs without considering the impact on the full ratio first.
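The capacity check described above can be sketched as a small calculation. This is an illustrative sketch only, not a Ceph tool: the function names are hypothetical, the default near full ratio of 0.85 is an assumption that should be replaced by the value configured on your cluster, and it assumes that data from the removed OSD is rebalanced onto the remaining OSDs.

```python
TIB = 2**40


def utilization_after_removal(used_bytes, total_bytes, removed_osd_bytes):
    """Projected raw utilization once one OSD's capacity is gone.

    Assumes used space stays constant while raw capacity shrinks.
    """
    remaining = total_bytes - removed_osd_bytes
    if remaining <= 0:
        raise ValueError("no capacity would remain after removal")
    return used_bytes / remaining


def safe_to_remove(used_bytes, total_bytes, removed_osd_bytes, nearfull_ratio=0.85):
    """True if the projected utilization stays below the near full ratio.

    nearfull_ratio=0.85 is an assumed default, not taken from this guide.
    """
    projected = utilization_after_removal(used_bytes, total_bytes, removed_osd_bytes)
    return projected < nearfull_ratio


# Ten 4 TiB OSDs with 28 TiB used: 70% now, about 77.8% after removing one OSD.
print(safe_to_remove(28 * TIB, 40 * TIB, 4 * TIB))  # True
# With 32 TiB used, removal would push utilization to about 88.9%.
print(safe_to_remove(32 * TIB, 40 * TIB, 4 * TIB))  # False
```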

6.2. Ceph OSD node configuration

Configure Ceph OSDs and their supporting hardware according to the storage strategy for the pools that will use the OSDs. Ceph prefers uniform hardware across pools for a consistent performance profile. For best performance, consider a CRUSH hierarchy with drives of the same type or size.

If you add drives of dissimilar size, adjust their weights accordingly. When you add the OSD to the CRUSH map, consider the weight for the new OSD. Hard drive capacity grows approximately 40% per year, so newer OSD nodes might have larger hard drives than older nodes in the storage cluster, that is, they might have a greater weight.
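As a rough illustration of weighting by capacity, the sketch below derives a weight that is proportional to drive size. The convention of a weight of 1.0 per TiB is an assumption made for this example, and the helper name is hypothetical; check the weights that your cluster actually assigns with ceph osd tree.

```python
TIB = 2**40


def crush_weight(size_bytes):
    """Weight proportional to capacity, assuming 1.0 per TiB (an assumption)."""
    return round(size_bytes / TIB, 5)


old_drive = crush_weight(4 * TIB)
new_drive = crush_weight(8 * TIB)
print(old_drive, new_drive)    # 4.0 8.0
print(new_drive / old_drive)   # 2.0, the larger drive receives twice the share of data
```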

Before doing a new installation, review the Requirements for Installing Red Hat Ceph Storage chapter in the Installation Guide.

6.3. Automatically tuning OSD memory

The OSD daemons adjust the memory consumption based on the osd_memory_target configuration option. The option osd_memory_target sets OSD memory based upon the available RAM in the system.

If Red Hat Ceph Storage is deployed on dedicated nodes that do not share memory with other services, cephadm automatically adjusts the per-OSD consumption based on the total amount of RAM and the number of deployed OSDs.

Important

By default, the osd_memory_target_autotune parameter is set to true in Red Hat Ceph Storage 6.0.

Syntax

ceph config set osd osd_memory_target_autotune true

Once the storage cluster is upgraded to Red Hat Ceph Storage 6.0, Red Hat recommends setting the osd_memory_target_autotune parameter to true for cluster maintenance, such as adding or replacing OSDs, so that OSD memory is autotuned according to system memory.

Cephadm starts with the fraction mgr/cephadm/autotune_memory_target_ratio of the total RAM in the system, which defaults to 0.7, subtracts any memory consumed by non-autotuned daemons, such as non-OSD daemons and OSDs for which osd_memory_target_autotune is false, and then divides the remainder by the number of remaining OSDs.

The osd_memory_target parameter is calculated as follows:

Syntax

osd_memory_target = TOTAL_RAM_OF_THE_OSD * (1048576) * (autotune_memory_target_ratio) / NUMBER_OF_OSDS_IN_THE_OSD_NODE - (SPACE_ALLOCATED_FOR_OTHER_DAEMONS)

SPACE_ALLOCATED_FOR_OTHER_DAEMONS may optionally include the following daemon space allocations:

Alertmanager: 1 GB
Grafana: 1 GB
Ceph Manager: 4 GB
Ceph Monitor: 2 GB
Node-exporter: 1 GB
Prometheus: 1 GB

For example, if a node has 24 OSDs and 251 GB of RAM, then osd_memory_target is 7860684936.
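The figure in this example can be reproduced with a short calculation. The sketch below is illustrative and is not the cephadm implementation: the function and parameter names are hypothetical, it follows the prose description (take a fraction of total RAM, subtract memory for other daemons, then divide by the number of OSDs), and it reads 251 GB as 251 × 1024³ bytes, which is the interpretation that yields the value shown above.

```python
GIB = 1024**3


def osd_memory_target(total_ram_bytes, num_osds, ratio=0.7, other_daemons_bytes=0):
    """Per-OSD memory target in bytes (illustrative sketch).

    ratio corresponds to autotune_memory_target_ratio; other_daemons_bytes
    is the memory set aside for non-autotuned daemons on the node.
    """
    if num_osds <= 0:
        raise ValueError("num_osds must be positive")
    budget = total_ram_bytes * ratio - other_daemons_bytes
    return int(budget / num_osds)


# 24 OSDs on a node with 251 GB of RAM and the default ratio of 0.7.
print(osd_memory_target(251 * GIB, 24))  # 7860684936

# The same node with a ratio of 0.2 gives each OSD a much smaller target.
print(osd_memory_target(251 * GIB, 24, ratio=0.2))
```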

The final targets are reflected in the configuration database as osd_memory_target options. You can view the limits and the current memory consumed by each daemon in the ceph orch ps output, under the MEM LIMIT column.

Note

In Red Hat Ceph Storage 6.0, the default setting of osd_memory_target_autotune true is unsuitable for hyperconverged infrastructures where compute and Ceph storage services are colocated. In a hyperconverged infrastructure, the autotune_memory_target_ratio can be set to 0.2 to reduce the memory consumption of Ceph.

Example

[ceph: root@host01 /]# ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.2

You can manually set a specific memory target for an OSD in the storage cluster.

Example

[ceph: root@host01 /]# ceph config set osd.123 osd_memory_target 7860684936

You can manually set a specific memory target for an OSD host in the storage cluster.

Syntax

ceph config set osd/host:HOSTNAME osd_memory_target TARGET_BYTES

Example

[ceph: root@host01 /]# ceph config set osd/host:host01 osd_memory_target 1000000000

Note

Enabling osd_memory_target_autotune overwrites existing manual OSD memory target settings. To prevent daemon memory from being tuned even when the osd_memory_target_autotune option or other similar options are enabled, set the _no_autotune_memory label on the host.

Syntax

ceph orch host label add HOSTNAME _no_autotune_memory

You can exclude an OSD from memory autotuning by disabling the autotune option and setting a specific memory target.

Example

[ceph: root@host01 /]# ceph config set osd.123 osd_memory_target_autotune false
[ceph: root@host01 /]# ceph config set osd.123 osd_memory_target 16G

6.4. Listing devices for Ceph OSD deployment

You can check the list of available devices before deploying OSDs using the Ceph Orchestrator. The ceph orch device ls command prints a list of the devices that Cephadm can discover. A storage device is considered available if all of the following conditions are met:

The device must have no partitions.
The device must not have any LVM state.
The device must not be mounted.
The device must not contain a file system.
The device must not contain a Ceph BlueStore OSD.
The device must be larger than 5 GB.

Note

Ceph will not provision an OSD on a device that is not available.
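The availability conditions listed above can be expressed as a simple checklist. The sketch below is illustrative only: the dictionary keys are hypothetical and are not the field names that Cephadm reports, and treating 5 GB as 5 × 1024³ bytes is an assumption.

```python
MIN_SIZE_BYTES = 5 * 1024**3  # "larger than 5 GB"; binary units are an assumption


def rejection_reasons(device):
    """Return the reasons a device is unavailable; an empty list means available.

    `device` is a plain dict with hypothetical keys used only for this sketch.
    """
    reasons = []
    if device.get("partitions"):
        reasons.append("has partitions")
    if device.get("lvm_state"):
        reasons.append("has LVM state")
    if device.get("mounted"):
        reasons.append("is mounted")
    if device.get("filesystem"):
        reasons.append("contains a file system")
    if device.get("bluestore_osd"):
        reasons.append("contains a BlueStore OSD")
    if device.get("size_bytes", 0) <= MIN_SIZE_BYTES:
        reasons.append("is not larger than 5 GB")
    return reasons


blank = {"size_bytes": 100 * 1024**3}
in_use = {"size_bytes": 100 * 1024**3, "partitions": ["sdb1"], "mounted": True}
print(rejection_reasons(blank))   # []
print(rejection_reasons(in_use))  # ['has partitions', 'is mounted']
```

On a real cluster, the --wide option of ceph orch device ls reports the reasons a device is not eligible.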

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
All manager and monitor daemons are deployed.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
List the available devices to deploy OSDs:
Syntax
```
ceph orch device ls [--hostname=HOSTNAME_1 HOSTNAME_2] [--wide] [--refresh]
```
Example
```
[ceph: root@host01 /]# ceph orch device ls --wide --refresh
```
Using the --wide option provides all details relating to the device, including any reasons that the device might not be eligible for use as an OSD. This option does not support NVMe devices.
Optional: To enable Health, Ident, and Fault fields in the output of ceph orch device ls, run the following commands:
Note
These fields are provided by the libstoragemgmt library, which currently supports SCSI, SAS, and SATA devices.
1. As root user outside the Cephadm shell, check your hardware’s compatibility with libstoragemgmt library to avoid unplanned interruption to services:
  Example
```
[root@host01 ~]# cephadm shell lsmcli ldl
```
  In the output, you see the Health Status as Good with the respective SCSI VPD 0x83 ID.
  Note
  If you do not get this information, then enabling the fields might cause erratic behavior of devices.
2. Log back into the Cephadm shell and enable libstoragemgmt support:
  Example
```
[root@host01 ~]# cephadm shell
[ceph: root@host01 /]# ceph config set mgr mgr/cephadm/device_enhanced_scan true
```
  Once this is enabled, ceph orch device ls gives the output of Health field as Good.

Verification

List the devices:

Example

[ceph: root@host01 /]# ceph orch device ls

6.5. Zapping devices for Ceph OSD deployment

You need to check the list of available devices before deploying OSDs. If there is no space available on the devices, you can clear the data on the devices by zapping them.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
All manager and monitor daemons are deployed.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

List the available devices to deploy OSDs:

Syntax

ceph orch device ls [--hostname=HOSTNAME_1 HOSTNAME_2] [--wide] [--refresh]

Example

[ceph: root@host01 /]# ceph orch device ls --wide --refresh

Clear the data of a device:

Syntax

ceph orch device zap HOSTNAME FILE_PATH --force

Example

[ceph: root@host01 /]# ceph orch device zap host02 /dev/sdb --force

Verification

Verify the space is available on the device:
Example
```
[ceph: root@host01 /]# ceph orch device ls
```
You will see that the field under Available is Yes.

Additional Resources

See the Listing devices for Ceph OSD deployment section in the Red Hat Ceph Storage Operations Guide for more information.

6.6. Deploying Ceph OSDs on all available devices

You can deploy OSDs on all the available devices. Cephadm allows the Ceph Orchestrator to discover and deploy the OSDs on any available and unused storage device.

To deploy OSDs on all available devices, run the command without the unmanaged parameter, and then re-run the command with the parameter to prevent the creation of future OSDs.

Note

The deployment of OSDs with --all-available-devices is generally used for smaller clusters. For larger clusters, use the OSD specification file.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
All manager and monitor daemons are deployed.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

List the available devices to deploy OSDs:

Syntax

ceph orch device ls [--hostname=HOSTNAME_1 HOSTNAME_2] [--wide] [--refresh]

Example

[ceph: root@host01 /]# ceph orch device ls --wide --refresh

Deploy OSDs on all available devices:
Example
```
[ceph: root@host01 /]# ceph orch apply osd --all-available-devices
```
The effect of ceph orch apply is persistent which means that the Orchestrator automatically finds the device, adds it to the cluster, and creates new OSDs. This occurs under the following conditions:
- New disks or drives are added to the system.
- Existing disks or drives are zapped.
- An OSD is removed and the devices are zapped.
  You can disable automatic creation of OSDs on all the available devices by using the --unmanaged parameter.
  Example
```
[ceph: root@host01 /]# ceph orch apply osd --all-available-devices --unmanaged=true
```
  Setting the --unmanaged parameter to true disables the automatic creation of OSDs. Applying a new OSD service also causes no change while the parameter is set.
  Note
  The command ceph orch daemon add creates new OSDs, but does not add an OSD service.

Verification

List the service:
Example
```
[ceph: root@host01 /]# ceph orch ls
```
View the details of the node and devices:
Example
```
[ceph: root@host01 /]# ceph osd tree
```

Additional Resources

See the Listing devices for Ceph OSD deployment section in the Red Hat Ceph Storage Operations Guide.

6.7. Deploying Ceph OSDs on specific devices and hosts

You can deploy all the Ceph OSDs on specific devices and hosts using the Ceph Orchestrator.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
All manager and monitor daemons are deployed.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

List the available devices to deploy OSDs:

Syntax

ceph orch device ls [--hostname=HOSTNAME_1 HOSTNAME_2] [--wide] [--refresh]

Example

[ceph: root@host01 /]# ceph orch device ls --wide --refresh

Deploy OSDs on specific devices and hosts:
Syntax
```
ceph orch daemon add osd HOSTNAME:DEVICE_PATH
```
Example
```
[ceph: root@host01 /]# ceph orch daemon add osd host02:/dev/sdb
```
To deploy OSDs on a raw physical device, without an LVM layer, use the --method raw option.
Syntax
```
ceph orch daemon add osd --method raw HOSTNAME:DEVICE_PATH
```
Example
```
[ceph: root@host01 /]# ceph orch daemon add osd --method raw host02:/dev/sdb
```
Note
If you have separate DB or WAL devices, the ratio of block to DB or WAL devices MUST be 1:1.

Verification

List the service:
Example
```
[ceph: root@host01 /]# ceph orch ls osd
```
View the details of the node and devices:
Example
```
[ceph: root@host01 /]# ceph osd tree
```

List the hosts, daemons, and processes:

Syntax

ceph orch ps --service_name=SERVICE_NAME

Example

[ceph: root@host01 /]# ceph orch ps --service_name=osd

Additional Resources

See the Listing devices for Ceph OSD deployment section in the Red Hat Ceph Storage Operations Guide.

6.8. Advanced service specifications and filters for deploying OSDs

A service specification of type OSD is a way to describe a cluster layout using the properties of disks. It gives the user an abstract way to tell Ceph which disks should turn into an OSD with the required configuration, without knowing the specifics of device names and paths. For each device and each host, define a YAML file or a JSON file.

General settings for OSD specifications

service_type: 'osd': This is mandatory to create OSDs.
service_id: Use the service name or identification you prefer. A set of OSDs is created using the specification file. This name is used to manage all the OSDs together and represent an Orchestrator service.
placement: This is used to define the hosts on which the OSDs need to be deployed.
You can use one of the following options:
- host_pattern: '*' - A host name pattern used to select hosts.
- label: 'osd_host' - A label used on the hosts where OSDs need to be deployed.
- hosts: 'host01', 'host02' - An explicit list of host names where OSDs need to be deployed.
selection of devices: The devices where OSDs are created. This allows the components of an OSD to be placed on different devices. You can create only BlueStore OSDs, which have three components:
- OSD data: contains all the OSD data
- WAL: BlueStore internal journal or write-ahead Log
- DB: BlueStore internal metadata
data_devices: Define the devices to deploy OSD. In this case, OSDs are created in a collocated schema. You can use filters to select devices and folders.
wal_devices: Define the devices used for WAL OSDs. You can use filters to select devices and folders.
db_devices: Define the devices for DB OSDs. You can use the filters to select devices and folders.
encrypted: An optional parameter to encrypt information on the OSD, which can be set to either True or False.
unmanaged: An optional parameter, set to False by default. You can set it to True if you do not want the Orchestrator to manage the OSD service.

block_wal_size: User-defined value, in bytes.
block_db_size: User-defined value, in bytes.
osds_per_device: User-defined value for deploying more than one OSD per device.
method: An optional parameter to specify if an OSD is created with an LVM layer or not. Set to raw if you want to create OSDs on raw physical devices that do not include an LVM layer. If you have separate DB or WAL devices, the ratio of block to DB or WAL devices MUST be 1:1.

Filters for specifying devices

Filters are used in conjunction with the data_devices, wal_devices and db_devices parameters.

Name of the filter	Description	Syntax	Example
Model	Target specific disks. You can get details of the model by running the `lsblk -o NAME,FSTYPE,LABEL,MOUNTPOINT,SIZE,MODEL` command or the `smartctl -i /DEVICE_PATH` command.	Model: DISK_MODEL_NAME	Model: MC-55-44-XZ
Vendor	Target specific disks	Vendor: DISK_VENDOR_NAME	Vendor: Vendor Cs
Size Specification	Includes disks of an exact size	size: EXACT	size: '10G'
Size Specification	Includes disks size of which is within the range	size: LOW:HIGH	size: '10G:40G'
Size Specification	Includes disks less than or equal to in size	size: :HIGH	size: ':10G'
Size Specification	Includes disks equal to or greater than in size	size: LOW:	size: '40G:'
Rotational	Rotational attribute of the disk. 1 matches all disks that are rotational and 0 matches all the disks that are non-rotational. If rotational is 0, then the OSD is configured with SSD or NVMe. If rotational is 1, then the OSD is configured with HDD.	rotational: 0 or 1	rotational: 0
All	Considers all the available disks	all: true	all: true
Limiter	When you have specified valid filters but want to limit the amount of matching disks you can use the ‘limit’ directive. It should be used only as a last resort.	limit: NUMBER	limit: 2
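The four forms of the size filter in the table can be read as an exact value, a closed range, an upper bound, and a lower bound. The sketch below illustrates that reading; it is not the matching code used by Ceph. The helper names are hypothetical, the decimal interpretation of the M, G, and T suffixes is an assumption, and the exact form uses strict equality, which real drives rarely satisfy.

```python
UNITS = {"M": 10**6, "G": 10**9, "T": 10**12}  # decimal units are an assumption


def to_bytes(text):
    """Convert a value such as '10G' to bytes."""
    text = text.strip().upper()
    return int(float(text[:-1]) * UNITS[text[-1]])


def size_matches(spec, size_bytes):
    """Check a disk size against 'EXACT', 'LOW:HIGH', ':HIGH', or 'LOW:'."""
    if ":" not in spec:
        return size_bytes == to_bytes(spec)
    low, high = spec.split(":", 1)
    if low and size_bytes < to_bytes(low):
        return False
    if high and size_bytes > to_bytes(high):
        return False
    return True


G = 10**9
print(size_matches("10G", 10 * G))      # True, exact size
print(size_matches("10G:40G", 20 * G))  # True, within the range
print(size_matches(":10G", 12 * G))     # False, larger than the upper bound
print(size_matches("40G:", 80 * G))     # True, at least the lower bound
```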

Note

To create an OSD with non-collocated components in the same host, you have to specify the different types of devices used and the devices should be on the same host.

Note

The devices used for deploying OSDs must be supported by libstoragemgmt.

Additional Resources

See the Deploying Ceph OSDs using the advanced specifications section in the Red Hat Ceph Storage Operations Guide.
For more information on libstoragemgmt, see the Listing devices for Ceph OSD deployment section in the Red Hat Ceph Storage Operations Guide.

6.9. Deploying Ceph OSDs using advanced service specifications

The service specification of type OSD is a way to describe a cluster layout using the properties of disks. It gives the user an abstract way to tell Ceph which disks should turn into an OSD with the required configuration without knowing the specifics of device names and paths.

You can deploy the OSD for each device and each host by defining a yaml file or a json file.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
All manager and monitor daemons are deployed.

Procedure

On the monitor node, create the osd_spec.yaml file:
Example
```
[root@host01 ~]# touch osd_spec.yaml
```

Edit the osd_spec.yaml file to include the following details:

Syntax

service_type: osd
service_id: SERVICE_ID
placement:
  host_pattern: '*' # optional
data_devices: # optional
  model: DISK_MODEL_NAME # optional
  paths:
  - /DEVICE_PATH
osds_per_device: NUMBER_OF_DEVICES # optional
db_devices: # optional
  size: # optional
  all: true # optional
  paths:
   - /DEVICE_PATH
encrypted: true

Simple scenarios: In these cases, all the nodes have the same set-up.

Example

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  all: true
  paths:
  - /dev/sdb
encrypted: true

Example

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  size: '80G'
db_devices:
  size: '40G:'
  paths:
   - /dev/sdc

Simple scenario: In this case, all the nodes have the same setup with OSD devices created in raw mode, without an LVM layer.

Example

service_type: osd
service_id: all-available-devices
encrypted: "true"
method: raw
placement:
  host_pattern: "*"
data_devices:
  all: "true"

Advanced scenario: This creates the desired layout by using all HDDs as data_devices with two SSDs assigned as dedicated DB or WAL devices. The remaining SSDs are data_devices that have the NVMe devices of a given vendor assigned as dedicated DB or WAL devices.

Example

service_type: osd
service_id: osd_spec_hdd
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  model: Model-name
  limit: 2
---
service_type: osd
service_id: osd_spec_ssd
placement:
  host_pattern: '*'
data_devices:
  model: Model-name
db_devices:
  vendor: Vendor-name

Advanced scenario with non-uniform nodes: This applies different OSD specs to different hosts depending on the host_pattern key.

Example

service_type: osd
service_id: osd_spec_node_one_to_five
placement:
  host_pattern: 'node[1-5]'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
---
service_type: osd
service_id: osd_spec_six_to_ten
placement:
  host_pattern: 'node[6-10]'
data_devices:
  model: Model-name
db_devices:
  model: Model-name
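To see how a host_pattern value might select hosts, the following sketch assumes glob-style matching, as performed by Python's fnmatch module. Whether this mirrors the Orchestrator's behavior exactly is an assumption, and the helper name and host names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_hosts(host_pattern, hosts):
    """Return the hosts whose names match a glob-style pattern (assumed semantics)."""
    return [host for host in hosts if fnmatchcase(host, host_pattern)]


hosts = ["node1", "node3", "node7", "host01"]
print(select_hosts("*", hosts))          # ['node1', 'node3', 'node7', 'host01']
print(select_hosts("node[1-5]", hosts))  # ['node1', 'node3']
```

Under glob semantics, a bracket expression such as [1-5] matches exactly one character, so a pattern meant to cover two-digit host numbers might not behave as a numeric range. Previewing the deployment with a dry run is a reasonable way to confirm which hosts a specification selects.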

Advanced scenario with dedicated WAL and DB devices:

Example

service_type: osd
service_id: osd_using_paths
placement:
  hosts:
    - host01
    - host02
data_devices:
  paths:
    - /dev/sdb
db_devices:
  paths:
    - /dev/sdc
wal_devices:
  paths:
    - /dev/sdd

Advanced scenario with multiple OSDs per device:

Example

service_type: osd
service_id: multiple_osds
placement:
  hosts:
    - host01
    - host02
osds_per_device: 4
data_devices:
  paths:
    - /dev/sdb

For pre-created volumes, edit the osd_spec.yaml file to include the following details:

Syntax

service_type: osd
service_id: SERVICE_ID
placement:
  hosts:
    - HOSTNAME
data_devices: # optional
  model: DISK_MODEL_NAME # optional
  paths:
  - /DEVICE_PATH
db_devices: # optional
  size: # optional
  all: true # optional
  paths:
   - /DEVICE_PATH

Example

service_type: osd
service_id: osd_spec
placement:
  hosts:
    - machine1
data_devices:
  paths:
    - /dev/vg_hdd/lv_hdd
db_devices:
  paths:
    - /dev/vg_nvme/lv_nvme

For OSDs by ID, edit the osd_spec.yaml file to include the following details:

Note

This configuration is applicable for Red Hat Ceph Storage 5.3z1 and later releases. For earlier releases, use pre-created LVM.

Syntax

service_type: osd
service_id: OSD_BY_ID_HOSTNAME
placement:
  hosts:
    - HOSTNAME
data_devices: # optional
  model: DISK_MODEL_NAME # optional
  paths:
  - /DEVICE_PATH
db_devices: # optional
  size: # optional
  all: true # optional
  paths:
   - /DEVICE_PATH

Example

service_type: osd
service_id: osd_by_id_host01
placement:
  hosts:
    - host01
data_devices:
  paths:
    - /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-5
db_devices:
  paths:
    - /dev/disk/by-id/nvme-nvme.1b36-31323334-51454d55204e564d65204374726c-00000001

For OSDs by path, edit the osd_spec.yaml file to include the following details:

Note

This configuration is applicable for Red Hat Ceph Storage 5.3z1 and later releases. For earlier releases, use pre-created LVM.

Syntax

service_type: osd
service_id: OSD_BY_PATH_HOSTNAME
placement:
  hosts:
    - HOSTNAME
data_devices: # optional
  model: DISK_MODEL_NAME # optional
  paths:
  - /DEVICE_PATH
db_devices: # optional
  size: # optional
  all: true # optional
  paths:
   - /DEVICE_PATH

Example

service_type: osd
service_id: osd_by_path_host01
placement:
  hosts:
    - host01
data_devices:
  paths:
    - /dev/disk/by-path/pci-0000:0d:00.0-scsi-0:0:0:4
db_devices:
  paths:
    - /dev/disk/by-path/pci-0000:00:02.0-nvme-1

Mount the YAML file under a directory in the container:

Example

[root@host01 ~]# cephadm shell --mount osd_spec.yaml:/var/lib/ceph/osd/osd_spec.yaml

Navigate to the directory:

Example

[ceph: root@host01 /]# cd /var/lib/ceph/osd/

Before deploying OSDs, do a dry run:
Note
This step gives a preview of the deployment, without deploying the daemons.
Example
```
[ceph: root@host01 osd]# ceph orch apply -i osd_spec.yaml --dry-run
```

Deploy OSDs using service specification:

Syntax

ceph orch apply -i FILE_NAME.yml

Example

[ceph: root@host01 osd]# ceph orch apply -i osd_spec.yaml

Verification

List the service:
Example
```
[ceph: root@host01 /]# ceph orch ls osd
```
View the details of the node and devices:
Example
```
[ceph: root@host01 /]# ceph osd tree
```

Additional Resources

See the Advanced service specifications and filters for deploying OSDs section in the Red Hat Ceph Storage Operations Guide.

6.10. Removing the OSD daemons using the Ceph Orchestrator

You can remove the OSD from a cluster by using Cephadm.

Removing an OSD from a cluster involves two steps:

Evacuating all placement groups (PGs) from the OSD.
Removing the PG-free OSD from the cluster.

The --zap option removes the volume groups, logical volumes, and the LVM metadata.

Note

After removing OSDs, if the drives the OSDs were deployed on once again become available, cephadm might automatically try to deploy more OSDs on these drives if they match an existing drivegroup specification. If you deployed the OSDs you are removing with a spec and do not want any new OSDs deployed on the drives after removal, modify the drivegroup specification before removal. If you used the --all-available-devices option while deploying OSDs, set unmanaged: true to stop it from picking up new drives at all. For other deployments, modify the specification. See the Deploying Ceph OSDs using advanced service specifications section for more details.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
Ceph Monitor, Ceph Manager and Ceph OSD daemons are deployed on the storage cluster.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
Check the device and the node from which the OSD has to be removed:
Example
```
[ceph: root@host01 /]# ceph osd tree
```
Remove the OSD:
Syntax
```
ceph orch osd rm OSD_ID [--replace] [--force] --zap
```
Example
```
[ceph: root@host01 /]# ceph orch osd rm 0 --zap
```
Note
If you remove the OSD from the storage cluster without an option, such as --replace, the device is removed from the storage cluster completely. If you want to use the same device for deploying OSDs, you have to first zap the device before adding it to the storage cluster.
Optional: To remove multiple OSDs from a specific node, run the following command:
Syntax
```
ceph orch osd rm OSD_ID OSD_ID --zap
```
Example
```
[ceph: root@host01 /]# ceph orch osd rm 2 5 --zap
```

Check the status of the OSD removal:

Example

[ceph: root@host01 /]# ceph orch osd rm status
OSD  HOST   STATE                    PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT
9    host01 done, waiting for purge    0  False    False  True  2023-06-06 17:50:50.525690
10   host03 done, waiting for purge    0  False    False  True  2023-06-06 17:49:38.731533
11   host02 done, waiting for purge    0  False    False  True  2023-06-06 17:48:36.641105

When no PGs are left on the OSD, it is decommissioned and removed from the cluster.
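If you want to check the removal status from a script, the table shown above can be parsed as text. The sketch below is illustrative: the function name is hypothetical, and it assumes that every row carries the same columns as the sample output, including a date and a time under DRAIN STARTED AT. Because the STATE column can contain spaces, the fixed columns are read from both ends of each row.

```python
SAMPLE = """\
OSD  HOST   STATE                    PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT
9    host01 done, waiting for purge    0  False    False  True  2023-06-06 17:50:50.525690
10   host03 done, waiting for purge    0  False    False  True  2023-06-06 17:49:38.731533
11   host02 done, waiting for purge    0  False    False  True  2023-06-06 17:48:36.641105
"""


def parse_rm_status(text):
    """Parse the removal status table into a list of dictionaries."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # The last two fields are the date and time; the four before them
        # are PGS, REPLACE, FORCE, and ZAP. STATE is everything in between.
        pgs, replace, force, zap = fields[-6:-2]
        rows.append({
            "osd": int(fields[0]),
            "host": fields[1],
            "state": " ".join(fields[2:-6]),
            "pgs": int(pgs),
            "replace": replace == "True",
            "force": force == "True",
            "zap": zap == "True",
        })
    return rows


rows = parse_rm_status(SAMPLE)
drained = all(row["pgs"] == 0 for row in rows)
print([row["osd"] for row in rows])  # [9, 10, 11]
print(drained)                       # True
```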

Verification

Verify the details of the devices and the nodes from which the Ceph OSDs are removed:
Example
```
[ceph: root@host01 /]# ceph osd tree
```

Additional Resources

See the Deploying Ceph OSDs on all available devices section in the Red Hat Ceph Storage Operations Guide for more information.
See the Deploying Ceph OSDs on specific devices and hosts section in the Red Hat Ceph Storage Operations Guide for more information.
See the Zapping devices for Ceph OSD deployment section in the Red Hat Ceph Storage Operations Guide for more information on clearing space on devices.

6.11. Replacing the OSDs using the Ceph Orchestrator

When disks fail, you can replace the physical storage device and reuse the same OSD ID to avoid having to reconfigure the CRUSH map.

You can replace OSDs in the cluster by using the --replace option.

Note

If you want to replace a single OSD, see Deploying Ceph OSDs on specific devices and hosts. If you want to deploy OSDs on all available devices, see Deploying Ceph OSDs on all available devices.

The --replace option preserves the OSD ID when you use the ceph orch osd rm command. The OSD is not permanently removed from the CRUSH hierarchy, but is assigned the destroyed flag. This flag is used to determine which OSD IDs can be reused in the next OSD deployment.

Similar to removing an OSD, replacing an OSD in a cluster involves two steps:

Evacuating all placement groups (PGs) from the OSD.
Removing the PG-free OSD from the cluster.

If you use OSD specification for deployment, the OSD ID of the disk being replaced is automatically assigned to the newly added disk as soon as it is inserted.

Note

After removing OSDs, if the drives the OSDs were deployed on once again become available, cephadm might automatically try to deploy more OSDs on these drives if they match an existing drivegroup specification. If you deployed the OSDs you are removing with a spec and do not want any new OSDs deployed on the drives after removal, modify the drivegroup specification before removal. While deploying OSDs, if you have used --all-available-devices option, set unmanaged: true to stop it from picking up new drives at all. For other deployments, modify the specification. See the Deploying Ceph OSDs using advanced service specifications for more details.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
Monitor, Manager, and OSD daemons are deployed on the storage cluster.
A new OSD that replaces the removed OSD must be created on the same host from which the OSD was removed.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

Dump and save a mapping of your OSD configurations for future reference:

Example

[ceph: root@node /]# ceph osd metadata -f plain | grep device_paths
"device_paths": "sde=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:1,sdi=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:1",
"device_paths": "sde=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:1,sdf=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:1",
"device_paths": "sdd=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:2,sdg=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:2",
"device_paths": "sdd=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:2,sdh=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:2",
"device_paths": "sdd=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:2,sdk=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:2",
"device_paths": "sdc=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:3,sdl=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:3",
"device_paths": "sdc=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:3,sdj=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:3",
"device_paths": "sdc=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:3,sdm=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:3",
[.. output omitted ..]
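To keep the saved mapping in a form that is easy to look up later, each device_paths line can be split into device names and paths. The sketch below is illustrative, with a hypothetical function name, and it assumes that each line has the same shape as the sample output above.

```python
def parse_device_paths(line):
    """Split one '"device_paths": "name=path,name=path",' line into a dict."""
    # Everything after the first colon is the quoted, comma-separated value.
    _, _, value = line.partition(":")
    value = value.strip().rstrip(",").strip('"')
    mapping = {}
    for pair in value.split(","):
        name, _, path = pair.partition("=")
        mapping[name] = path
    return mapping


line = ('"device_paths": "sde=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:1,'
        'sdi=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:1",')
mapping = parse_device_paths(line)
print(mapping["sde"])  # /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:1
print(mapping["sdi"])  # /dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:1
```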

Check the device and the node from which the OSD has to be replaced:
Example
```
[ceph: root@host01 /]# ceph osd tree
```
Remove the OSD from the cephadm managed cluster:
Important
If the storage cluster has health_warn or other errors associated with it, check and try to fix any errors before replacing the OSD to avoid data loss.
Syntax
```
ceph orch osd rm OSD_ID --replace [--force]
```
The --force option can be used when there are ongoing operations on the storage cluster.
Example
```
[ceph: root@host01 /]# ceph orch osd rm 0 --replace
```

Recreate the new OSD by applying the following OSD specification:

Example

service_type: osd
service_id: osd
placement:
  hosts:
  - myhost
data_devices:
  paths:
  - /path/to/the/device

Check the status of the OSD replacement:
Example
```
[ceph: root@host01 /]# ceph orch osd rm status
```

Pause the Orchestrator so that it does not apply any existing OSD specification:

Example

[ceph: root@node /]# ceph orch pause
[ceph: root@node /]# ceph orch status
Backend: cephadm
Available: Yes
Paused: Yes

Zap the OSD devices that have been removed:

Example

[ceph: root@node /]# ceph orch device zap node.example.com /dev/sdi --force
zap successful for /dev/sdi on node.example.com

[ceph: root@node /]# ceph orch device zap node.example.com /dev/sdf --force
zap successful for /dev/sdf on node.example.com

Resume the Orchestrator from pause mode:
Example
```
[ceph: root@node /]# ceph orch resume
```

Check the status of the OSD replacement:

Example

[ceph: root@node /]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME      STATUS  REWEIGHT  PRI-AFF
-1         0.77112  root default
-3         0.77112      host node
 0    hdd  0.09639          osd.0      up   1.00000  1.00000
 1    hdd  0.09639          osd.1      up   1.00000  1.00000
 2    hdd  0.09639          osd.2      up   1.00000  1.00000
 3    hdd  0.09639          osd.3      up   1.00000  1.00000
 4    hdd  0.09639          osd.4      up   1.00000  1.00000
 5    hdd  0.09639          osd.5      up   1.00000  1.00000
 6    hdd  0.09639          osd.6      up   1.00000  1.00000
 7    hdd  0.09639          osd.7      up   1.00000  1.00000
 [.. output omitted ..]

Verification

Verify the details of the devices and the nodes from which the Ceph OSDs are replaced:
Example
```
[ceph: root@host01 /]# ceph osd tree
```
You can see an OSD with the same id as the one you replaced running on the same host.

Verify that the db_device for the newly deployed OSDs is the replaced db_device:

Example

[ceph: root@host01 /]# ceph osd metadata 0 | grep bluefs_db_devices
"bluefs_db_devices": "nvme0n1",

[ceph: root@host01 /]# ceph osd metadata 1 | grep bluefs_db_devices
"bluefs_db_devices": "nvme0n1",

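The same check can be scripted instead of using grep. The following is a minimal Python sketch, assuming that `ceph osd metadata OSD_ID` prints a single JSON object containing the `bluefs_db_devices` key shown in the example above. The function name `db_devices` and the trimmed sample string are hypothetical and only illustrate the idea.

```python
import json


def db_devices(metadata_json: str) -> str:
    """Return the bluefs_db_devices value from `ceph osd metadata` output.

    Returns an empty string when the key is absent, for example when the
    OSD has no separate DB device.
    """
    metadata = json.loads(metadata_json)
    return metadata.get("bluefs_db_devices", "")


# Trimmed, hypothetical sample; real output contains many more keys.
sample = '{"bluefs_db_devices": "nvme0n1"}'
print(db_devices(sample))
```

In practice, the JSON string would come from capturing the command output for each replaced OSD ID and comparing the result with the expected device name.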
Additional Resources

See the Deploying Ceph OSDs on all available devices section in the Red Hat Ceph Storage Operations Guide for more information.
See the Deploying Ceph OSDs on specific devices and hosts section in the Red Hat Ceph Storage Operations Guide for more information.

6.12. Replacing the OSDs with pre-created LVM

After you purge an OSD with the ceph-volume lvm zap command, if the directory is not present, you can replace the OSD by using an OSD service specification file that references the pre-created LVM.

Prerequisites

A running Red Hat Ceph Storage cluster.
Failed OSD

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

Remove the OSD:

Syntax

ceph orch osd rm OSD_ID [--replace]

Example

[ceph: root@host01 /]# ceph orch osd rm 8 --replace
Scheduled OSD(s) for removal

Verify the OSD is destroyed:

Example

[ceph: root@host01 /]# ceph osd tree

ID   CLASS  WEIGHT   TYPE NAME        STATUS     REWEIGHT  PRI-AFF
 -1         0.32297  root default
 -9         0.05177      host host10
  3    hdd  0.01520          osd.3           up   1.00000  1.00000
 13    hdd  0.02489          osd.13          up   1.00000  1.00000
 17    hdd  0.01169          osd.17          up   1.00000  1.00000
-13         0.05177      host host11
  2    hdd  0.01520          osd.2           up   1.00000  1.00000
 15    hdd  0.02489          osd.15          up   1.00000  1.00000
 19    hdd  0.01169          osd.19          up   1.00000  1.00000
 -7         0.05835      host host12
 20    hdd  0.01459          osd.20          up   1.00000  1.00000
 21    hdd  0.01459          osd.21          up   1.00000  1.00000
 22    hdd  0.01459          osd.22          up   1.00000  1.00000
 23    hdd  0.01459          osd.23          up   1.00000  1.00000
 -5         0.03827      host host04
  1    hdd  0.01169          osd.1           up   1.00000  1.00000
  6    hdd  0.01129          osd.6           up   1.00000  1.00000
  7    hdd  0.00749          osd.7           up   1.00000  1.00000
  9    hdd  0.00780          osd.9           up   1.00000  1.00000
 -3         0.03816      host host05
  0    hdd  0.01169          osd.0           up   1.00000  1.00000
  8    hdd  0.01129          osd.8    destroyed         0  1.00000
 12    hdd  0.00749          osd.12          up   1.00000  1.00000
 16    hdd  0.00769          osd.16          up   1.00000  1.00000
-15         0.04237      host host06
  5    hdd  0.01239          osd.5           up   1.00000  1.00000
 10    hdd  0.01540          osd.10          up   1.00000  1.00000
 11    hdd  0.01459          osd.11          up   1.00000  1.00000
-11         0.04227      host host07
  4    hdd  0.01239          osd.4           up   1.00000  1.00000
 14    hdd  0.01529          osd.14          up   1.00000  1.00000
 18    hdd  0.01459          osd.18          up   1.00000  1.00000
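
On larger clusters, scanning the tree by eye for a destroyed OSD is error-prone. The following is a minimal Python sketch, assuming the plain-text `ceph osd tree` output has been captured as a string and follows the layout shown above, where the STATUS value directly follows the OSD name. The function name `osd_statuses` is hypothetical and not part of Ceph.

```python
def osd_statuses(tree_output: str) -> dict[str, str]:
    """Map each OSD name (for example "osd.8") to its STATUS column."""
    statuses = {}
    for line in tree_output.splitlines():
        tokens = line.split()
        # On an OSD row, the token after "osd.N" is the status.
        for i, token in enumerate(tokens[:-1]):
            if token.startswith("osd."):
                statuses[token] = tokens[i + 1]
                break
    return statuses


# Three rows copied from the example output above.
sample = """\
 -3         0.03816      host host05
  0    hdd  0.01169          osd.0           up   1.00000  1.00000
  8    hdd  0.01129          osd.8    destroyed         0  1.00000
"""
destroyed = [name for name, status in osd_statuses(sample).items()
             if status == "destroyed"]
print(destroyed)
```

For the three sample rows, only osd.8 is reported as destroyed, which matches the OSD that was removed with the --replace option.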

Zap and remove the OSD using the ceph-volume command:

Syntax

ceph-volume lvm zap --osd-id OSD_ID

Example

[ceph: root@host01 /]# ceph-volume lvm zap --osd-id 8

Zapping: /dev/vg1/data-lv2
Closing encrypted path /dev/mapper/l4D6ql-Prji-IzH4-dfhF-xzuf-5ETl-jNRcXC
Running command: /usr/sbin/cryptsetup remove /dev/mapper/l4D6ql-Prji-IzH4-dfhF-xzuf-5ETl-jNRcXC
Running command: /usr/bin/dd if=/dev/zero of=/dev/vg1/data-lv2 bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
 stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.034742 s, 302 MB/s
Zapping successful for OSD: 8

Check the OSD topology:

Example

[ceph: root@host01 /]# ceph-volume lvm list

Recreate the OSD with a specification file corresponding to that specific OSD topology:

Example

[ceph: root@host01 /]# cat osd.yml
service_type: osd
service_id: osd_service
placement:
  hosts:
  - host03
data_devices:
  paths:
  - /dev/vg1/data-lv2
db_devices:
  paths:
   - /dev/vg1/db-lv1

Apply the updated specification file:

Example

[ceph: root@host01 /]# ceph orch apply -i osd.yml
Scheduled osd.osd_service update...

Verify the OSD is back:

Example

[ceph: root@host01 /]# ceph -s
[ceph: root@host01 /]# ceph osd tree

6.13. Replacing the OSDs in a non-colocated scenario

When an OSD fails in a non-colocated scenario, you can replace the WAL/DB devices. The procedure is the same for DB and WAL devices. Edit the paths under db_devices for DB devices and the paths under wal_devices for WAL devices.

Prerequisites

A running Red Hat Ceph Storage cluster.
Daemons are non-colocated.
Failed OSD

Procedure

Identify the devices in the cluster:

Example

[root@host01 ~]# lsblk

NAME                                                                                                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                                                                                     8:0    0   20G  0 disk
├─sda1                                                                                                  8:1    0    1G  0 part /boot
└─sda2                                                                                                  8:2    0   19G  0 part
  ├─rhel-root                                                                                         253:0    0   17G  0 lvm  /
  └─rhel-swap                                                                                         253:1    0    2G  0 lvm  [SWAP]
sdb                                                                                                     8:16   0   10G  0 disk
└─ceph--5726d3e9--4fdb--4eda--b56a--3e0df88d663f-osd--block--3ceb89ec--87ef--46b4--99c6--2a56bac09ff0 253:2    0   10G  0 lvm
sdc                                                                                                     8:32   0   10G  0 disk
└─ceph--d7c9ab50--f5c0--4be0--a8fd--e0313115f65c-osd--block--37c370df--1263--487f--a476--08e28bdbcd3c 253:4    0   10G  0 lvm
sdd                                                                                                     8:48   0   10G  0 disk
├─ceph--1774f992--44f9--4e78--be7b--b403057cf5c3-osd--db--31b20150--4cbc--4c2c--9c8f--6f624f3bfd89    253:7    0  2.5G  0 lvm
└─ceph--1774f992--44f9--4e78--be7b--b403057cf5c3-osd--db--1bee5101--dbab--4155--a02c--e5a747d38a56    253:9    0  2.5G  0 lvm
sde                                                                                                     8:64   0   10G  0 disk
sdf                                                                                                     8:80   0   10G  0 disk
└─ceph--412ee99b--4303--4199--930a--0d976e1599a2-osd--block--3a99af02--7c73--4236--9879--1fad1fe6203d 253:6    0   10G  0 lvm
sdg                                                                                                     8:96   0   10G  0 disk
└─ceph--316ca066--aeb6--46e1--8c57--f12f279467b4-osd--block--58475365--51e7--42f2--9681--e0c921947ae6 253:8    0   10G  0 lvm
sdh                                                                                                     8:112  0   10G  0 disk
├─ceph--d7064874--66cb--4a77--a7c2--8aa0b0125c3c-osd--db--0dfe6eca--ba58--438a--9510--d96e6814d853    253:3    0    5G  0 lvm
└─ceph--d7064874--66cb--4a77--a7c2--8aa0b0125c3c-osd--db--26b70c30--8817--45de--8843--4c0932ad2429    253:5    0    5G  0 lvm
sr0
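
To see at a glance which disks carry DB volumes, the lsblk listing can be summarized. The following is a minimal Python sketch, assuming the tree-style text layout shown above and that DB logical volumes contain `osd--db` in their device-mapper name, as they do in this example. The function name `db_volumes_per_disk` and the shortened volume names in the sample are hypothetical.

```python
def db_volumes_per_disk(lsblk_output: str) -> dict[str, int]:
    """Count volumes whose name contains "osd--db" under each top-level disk."""
    counts: dict[str, int] = {}
    disk = None
    for line in lsblk_output.splitlines():
        if not line.strip() or line.startswith("NAME"):
            continue
        name = line.split()[0]
        if line[0] in " ├└│|`":
            # Indented or tree-prefixed row: a child of the current disk.
            if disk is not None and "osd--db" in name:
                counts[disk] = counts.get(disk, 0) + 1
        else:
            # Unprefixed row: a new top-level device.
            disk = name
    return counts


# Shortened, hypothetical volume names in the same layout as above.
sample = """\
NAME                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdd                     8:48   0   10G  0 disk
├─vg--a-osd--db--1    253:7    0  2.5G  0 lvm
└─vg--a-osd--db--2    253:9    0  2.5G  0 lvm
sde                     8:64   0   10G  0 disk
sdf                     8:80   0   10G  0 disk
└─vg--b-osd--block--1 253:6    0   10G  0 lvm
"""
print(db_volumes_per_disk(sample))
```

Disks that do not appear in the result hold no DB volumes. In the listing above, that would include the unused /dev/sde.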

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

Identify the OSDs and their DB device:

Example

[ceph: root@host01 /]# ceph-volume lvm list /dev/sdh


====== osd.2 =======

  [db]          /dev/ceph-d7064874-66cb-4a77-a7c2-8aa0b0125c3c/osd-db-0dfe6eca-ba58-438a-9510-d96e6814d853

      block device              /dev/ceph-5726d3e9-4fdb-4eda-b56a-3e0df88d663f/osd-block-3ceb89ec-87ef-46b4-99c6-2a56bac09ff0
      block uuid                GkWLoo-f0jd-Apj2-Zmwj-ce0h-OY6J-UuW8aD
      cephx lockbox secret
      cluster fsid              fa0bd9dc-e4c4-11ed-8db4-001a4a00046e
      cluster name              ceph
      crush device class
      db device                 /dev/ceph-d7064874-66cb-4a77-a7c2-8aa0b0125c3c/osd-db-0dfe6eca-ba58-438a-9510-d96e6814d853
      db uuid                   6gSPoc-L39h-afN3-rDl6-kozT-AX9S-XR20xM
      encrypted                 0
      osd fsid                  3ceb89ec-87ef-46b4-99c6-2a56bac09ff0
      osd id                    2
      osdspec affinity          non-colocated
      type                      db
      vdo                       0
      devices                   /dev/sdh

====== osd.5 =======

  [db]          /dev/ceph-d7064874-66cb-4a77-a7c2-8aa0b0125c3c/osd-db-26b70c30-8817-45de-8843-4c0932ad2429

      block device              /dev/ceph-d7c9ab50-f5c0-4be0-a8fd-e0313115f65c/osd-block-37c370df-1263-487f-a476-08e28bdbcd3c
      block uuid                Eay3I7-fcz5-AWvp-kRcI-mJaH-n03V-Zr0wmJ
      cephx lockbox secret
      cluster fsid              fa0bd9dc-e4c4-11ed-8db4-001a4a00046e
      cluster name              ceph
      crush device class
      db device                 /dev/ceph-d7064874-66cb-4a77-a7c2-8aa0b0125c3c/osd-db-26b70c30-8817-45de-8843-4c0932ad2429
      db uuid                   mwSohP-u72r-DHcT-BPka-piwA-lSwx-w24N0M
      encrypted                 0
      osd fsid                  37c370df-1263-487f-a476-08e28bdbcd3c
      osd id                    5
      osdspec affinity          non-colocated
      type                      db
      vdo                       0
      devices                   /dev/sdh

In the osds.yml file, set the unmanaged parameter to true. Otherwise, cephadm redeploys the OSDs:

Example

[ceph: root@host01 /]# cat osds.yml
service_type: osd
service_id: non-colocated
unmanaged: true
placement:
  host_pattern: 'ceph*'
data_devices:
  paths:
   - /dev/sdb
   - /dev/sdc
   - /dev/sdf
   - /dev/sdg
db_devices:
  paths:
   - /dev/sdd
   - /dev/sdh

Apply the updated specification file:

Example

[ceph: root@host01 /]# ceph orch apply -i osds.yml

Scheduled osd.non-colocated update...

Check the status:

Example

[ceph: root@host01 /]# ceph orch ls

NAME           PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager   ?:9093,9094      1/1  9m ago     4d   count:1
crash                           3/4  4d ago     4d   *
grafana        ?:3000           1/1  9m ago     4d   count:1
mgr                             1/2  4d ago     4d   count:2
mon                             3/5  4d ago     4d   count:5
node-exporter  ?:9100           3/4  4d ago     4d   *
osd.non-colocated                 8  4d ago     5s   <unmanaged>
prometheus     ?:9095           1/1  9m ago     4d   count:1

Remove the OSDs. Use the --zap option to remove the backend services and the --replace option to retain the OSD IDs:
Example
```
[ceph: root@host01 /]# ceph orch osd rm 2 5 --zap --replace
Scheduled OSD(s) for removal
```

Check the status:

Example

[ceph: root@host01 /]# ceph osd df tree | egrep -i "ID|host02|osd.2|osd.5"

ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP  META     AVAIL   %USE   VAR   PGS  STATUS     TYPE NAME
-5         0.04877         -   55 GiB   15 GiB  4.1 MiB   0 B   60 MiB  40 GiB  27.27  1.17    -                 host02
 2    hdd  0.01219   1.00000   15 GiB  5.0 GiB  996 KiB   0 B   15 MiB  10 GiB  33.33  1.43    0  destroyed          osd.2
 5    hdd  0.01219   1.00000   15 GiB  5.0 GiB  1.0 MiB   0 B   15 MiB  10 GiB  33.33  1.43    0  destroyed          osd.5

Edit the osds.yml specification file to change the unmanaged parameter to false, and update the path to the DB device if it changed after the device was physically replaced:
Example
```
[ceph: root@host01 /]# cat osds.yml
service_type: osd
service_id: non-colocated
unmanaged: false
placement:
  host_pattern: 'ceph01*'
data_devices:
  paths:
   - /dev/sdb
   - /dev/sdc
   - /dev/sdf
   - /dev/sdg
db_devices:
  paths:
   - /dev/sdd
   - /dev/sde
```
In the above example, /dev/sdh is replaced with /dev/sde.
Important
If you use the same host specification file to replace the faulty DB device on a single OSD node, modify the host_pattern option to match only that OSD node. Otherwise, the deployment fails because the new DB device cannot be found on the other hosts.
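
Because the same specification is applied twice in this procedure, first with unmanaged set to true and later with unmanaged set to false, it can help to generate both variants from one set of values. The following is a minimal Python sketch that only formats text in the layout used in this section. The function name `render_osd_spec` and its parameters are hypothetical, and the device paths are the ones from the example above.

```python
def render_osd_spec(service_id, host_pattern, data_paths, db_paths, unmanaged):
    """Return an OSD service specification as YAML text."""
    lines = [
        "service_type: osd",
        f"service_id: {service_id}",
        f"unmanaged: {'true' if unmanaged else 'false'}",
        "placement:",
        f"  host_pattern: '{host_pattern}'",
        "data_devices:",
        "  paths:",
        *[f"  - {path}" for path in data_paths],
        "db_devices:",
        "  paths:",
        *[f"  - {path}" for path in db_paths],
    ]
    return "\n".join(lines) + "\n"


data = ["/dev/sdb", "/dev/sdc", "/dev/sdf", "/dev/sdg"]

# Before removal: keep cephadm from redeploying the OSDs.
before = render_osd_spec("non-colocated", "ceph*", data,
                         ["/dev/sdd", "/dev/sdh"], unmanaged=True)

# After the physical replacement: new DB device, single host only.
after = render_osd_spec("non-colocated", "ceph01*", data,
                        ["/dev/sdd", "/dev/sde"], unmanaged=False)
print(after)
```

Writing each variant to osds.yml before the corresponding ceph orch apply step keeps the two files identical except for the unmanaged flag, the host pattern, and the replaced DB path.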

Reapply the specification file with the --dry-run option to confirm that the OSDs will be deployed with the new DB device:

Example

[ceph: root@host01 /]# ceph orch apply -i osds.yml --dry-run
WARNING! Dry-Runs are snapshots of a certain point in time and are bound
to the current inventory setup. If any of these conditions change, the
preview will be invalid. Please make sure to have a minimal
timeframe between planning and applying the specs.
####################
SERVICESPEC PREVIEWS
####################
+---------+------+--------+-------------+
|SERVICE  |NAME  |ADD_TO  |REMOVE_FROM  |
+---------+------+--------+-------------+
+---------+------+--------+-------------+
################
OSDSPEC PREVIEWS
################
+---------+---------------+--------+----------+----------+-----+
|SERVICE  |NAME           |HOST    |DATA      |DB        |WAL  |
+---------+---------------+--------+----------+----------+-----+
|osd      |non-colocated  |host02  |/dev/sdb  |/dev/sde  |-    |
|osd      |non-colocated  |host02  |/dev/sdc  |/dev/sde  |-    |
+---------+---------------+--------+----------+----------+-----+
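
Before applying the specification, the preview can be checked programmatically for the expected DB device. The following is a minimal Python sketch, assuming the dry-run output has been captured as text and that the OSDSPEC PREVIEWS table has the six columns shown above. The function name `preview_rows` is hypothetical.

```python
def preview_rows(dry_run_output: str) -> list[dict[str, str]]:
    """Extract the data rows of the six-column OSDSPEC PREVIEWS table."""
    columns = ["service", "name", "host", "data", "db", "wal"]
    rows = []
    for line in dry_run_output.splitlines():
        if not line.startswith("|"):
            continue  # skip borders, banners, and the warning text
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the four-column SERVICESPEC table and the header row.
        if len(cells) != len(columns) or cells[0] == "SERVICE":
            continue
        rows.append(dict(zip(columns, cells)))
    return rows


# Rows copied from the example output above.
sample = """\
|SERVICE  |NAME  |ADD_TO  |REMOVE_FROM  |
|SERVICE  |NAME   |HOST   |DATA      |DB        |WAL  |
|osd      |non-colocated  |host02  |/dev/sdb  |/dev/sde  |-    |
|osd      |non-colocated  |host02  |/dev/sdc  |/dev/sde  |-    |
"""
rows = preview_rows(sample)
print(all(row["db"] == "/dev/sde" for row in rows))
```

If any row reports a different DB device, or if no rows are returned at all, review the host_pattern and db_devices paths before running the apply step.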

Apply the specification file:

Example

[ceph: root@host01 /]# ceph orch apply -i osds.yml
Scheduled osd.non-colocated update...

Check the OSDs are redeployed:

Example

[ceph: root@host01 /]# ceph osd df tree | egrep -i "ID|host02|osd.2|osd.5"

ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP  META     AVAIL   %USE   VAR   PGS  STATUS  TYPE NAME
-5         0.04877         -   55 GiB   15 GiB  4.5 MiB   0 B   60 MiB  40 GiB  27.27  1.17    -              host host02
 2    hdd  0.01219   1.00000   15 GiB  5.0 GiB  1.1 MiB   0 B   15 MiB  10 GiB  33.33  1.43    0      up          osd.2
 5    hdd  0.01219   1.00000   15 GiB  5.0 GiB  1.1 MiB   0 B   15 MiB  10 GiB  33.33  1.43    0      up          osd.5

Verification

From the OSD host where the OSDs are redeployed, verify that they are on the new DB device:

Example

[ceph: root@host01 /]# ceph-volume lvm list /dev/sde

====== osd.2 =======

  [db]          /dev/ceph-15ce813a-8a4c-46d9-ad99-7e0845baf15e/osd-db-1998a02e-5e67-42a9-b057-e02c22bbf461

      block device              /dev/ceph-a4afcb78-c804-4daf-b78f-3c7ad1ed0379/osd-block-564b3d2f-0f85-4289-899a-9f98a2641979
      block uuid                ITPVPa-CCQ5-BbFa-FZCn-FeYt-c5N4-ssdU41
      cephx lockbox secret
      cluster fsid              fa0bd9dc-e4c4-11ed-8db4-001a4a00046e
      cluster name              ceph
      crush device class
      db device                 /dev/ceph-15ce813a-8a4c-46d9-ad99-7e0845baf15e/osd-db-1998a02e-5e67-42a9-b057-e02c22bbf461
      db uuid                   HF1bYb-fTK7-0dcB-CHzW-xvNn-dCym-KKdU5e
      encrypted                 0
      osd fsid                  564b3d2f-0f85-4289-899a-9f98a2641979
      osd id                    2
      osdspec affinity          non-colocated
      type                      db
      vdo                       0
      devices                   /dev/sde

====== osd.5 =======

  [db]          /dev/ceph-15ce813a-8a4c-46d9-ad99-7e0845baf15e/osd-db-6c154191-846d-4e63-8c57-fc4b99e182bd

      block device              /dev/ceph-b37c8310-77f9-4163-964b-f17b4c29c537/osd-block-b42a4f1f-8e19-4416-a874-6ff5d305d97f
      block uuid                0LuPoz-ao7S-UL2t-BDIs-C9pl-ct8J-xh5ep4
      cephx lockbox secret
      cluster fsid              fa0bd9dc-e4c4-11ed-8db4-001a4a00046e
      cluster name              ceph
      crush device class
      db device                 /dev/ceph-15ce813a-8a4c-46d9-ad99-7e0845baf15e/osd-db-6c154191-846d-4e63-8c57-fc4b99e182bd
      db uuid                   SvmXms-iWkj-MTG7-VnJj-r5Mo-Moiw-MsbqVD
      encrypted                 0
      osd fsid                  b42a4f1f-8e19-4416-a874-6ff5d305d97f
      osd id                    5
      osdspec affinity          non-colocated
      type                      db
      vdo                       0
      devices                   /dev/sde

6.14. Stopping the removal of the OSDs using the Ceph Orchestrator

You can stop the removal of only those OSDs that are queued for removal. Stopping the removal resets the OSD to its initial state and takes it off the removal queue.

If the OSD is in the process of removal, then you cannot stop the process.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
Monitor, Manager and OSD daemons are deployed on the cluster.
The OSD removal process is initiated.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
Check the device and the node of the OSD that is scheduled for removal:
Example
```
[ceph: root@host01 /]# ceph osd tree
```

Stop the removal of the queued OSD:

Syntax

ceph orch osd rm stop OSD_ID

Example

[ceph: root@host01 /]# ceph orch osd rm stop 0

Check the status of the OSD removal:

Example

[ceph: root@host01 /]# ceph orch osd rm status

Verification

Verify the details of the devices and the nodes from which the Ceph OSDs were queued for removal:
Example
```
[ceph: root@host01 /]# ceph osd tree
```

Additional Resources

See the Removing the OSD daemons using the Ceph Orchestrator section in the Red Hat Ceph Storage Operations Guide for more information.

6.15. Activating the OSDs using the Ceph Orchestrator

You can activate the OSDs in the cluster in cases where the operating system of the host was reinstalled.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
Monitor, Manager and OSD daemons are deployed on the storage cluster.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

After the operating system of the host is reinstalled, activate the OSDs:

Syntax

ceph cephadm osd activate HOSTNAME

Example

[ceph: root@host01 /]# ceph cephadm osd activate host03

Verification

List the service:
Example
```
[ceph: root@host01 /]# ceph orch ls
```

List the hosts, daemons, and processes:

Syntax

ceph orch ps --service_name=SERVICE_NAME

Example

[ceph: root@host01 /]# ceph orch ps --service_name=osd

6.16. Observing the data migration

When you add an OSD to or remove an OSD from the CRUSH map, Ceph begins rebalancing the data by migrating placement groups to the new or existing OSD(s). You can observe the data migration using the ceph -w command.

Prerequisites

A running Red Hat Ceph Storage cluster.
Recently added or removed an OSD.

Procedure

To observe the data migration:
Example
```
[ceph: root@host01 /]# ceph -w
```
Watch as the placement group states change from active+clean to active with some degraded objects, and finally back to active+clean when the migration completes.
To exit the utility, press Ctrl + C.

6.17. Recalculating the placement groups

Placement groups (PGs) define how the data of a pool is spread across the available OSDs. A placement group is built on the redundancy algorithm in use. With 3-way replication, each placement group uses three different OSDs. For erasure-coded pools, the number of OSDs to use is defined by the number of chunks.

When you define a pool, the number of placement groups determines the granularity with which the data is spread across all available OSDs. The higher the number, the more evenly the capacity load can be balanced. However, because placement groups must also be handled when data is reconstructed, choose the number carefully up front. A tool is available to support the calculation.

During the lifetime of a storage cluster, a pool might grow beyond the initially anticipated limits. As the number of drives grows, a recalculation is recommended. The number of placement groups per OSD should be around 100. When you add more OSDs to the storage cluster, the number of PGs per OSD decreases over time. For example, starting with 120 drives in the storage cluster and setting the pg_num of the pool to 4000 results in 100 PGs per OSD with a replication factor of three (4000 × 3 / 120 = 100). Over time, when the cluster grows to ten times the number of OSDs, the number of PGs per OSD drops to only ten. Because a small number of PGs per OSD tends to distribute capacity unevenly, consider adjusting the PGs per pool.
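
The arithmetic in this example can be sketched in a few lines of Python. This is an illustrative calculation for a single pool, not the PG calculator. The function names are hypothetical, and rounding to a power of two reflects the common Ceph recommendation for pg_num rather than anything stated in this section.

```python
def pgs_per_osd(pg_num: int, replicas: int, osd_count: int) -> float:
    """Average number of PG replicas that each OSD holds for one pool."""
    return pg_num * replicas / osd_count


def pg_num_for_target(osd_count: int, replicas: int,
                      target_per_osd: int = 100) -> int:
    """Suggest a pg_num near the target, rounded to the nearest power of two."""
    raw = osd_count * target_per_osd / replicas
    if raw < 1:
        raise ValueError("cluster too small for the requested target")
    lower = 1 << (int(raw).bit_length() - 1)  # largest power of two <= raw
    upper = lower * 2
    return lower if raw - lower <= upper - raw else upper


print(pgs_per_osd(4000, 3, 120))    # initial cluster from the example
print(pgs_per_osd(4000, 3, 1200))   # after growing to ten times the OSDs
print(pg_num_for_target(1200, 3))   # candidate pg_num for the larger cluster
```

The first two calls reproduce the values in the text, 100 and 10 PGs per OSD. The third shows the kind of value a recalculation might propose for the grown cluster. Treat it as a starting point and confirm it with the PG calculator.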

You can adjust the number of placement groups online. Recalculating changes more than the PG numbers: it also relocates data, which is a lengthy process. However, data availability is maintained throughout.

Avoid very high numbers of PGs per OSD, because reconstruction of all PGs on a failed OSD starts at once. Timely reconstruction requires a high number of IOPS, which might not be available. The result is deep I/O queues and high latency, which can render the storage cluster unusable or lead to long healing times.

Additional Resources

See the PG calculator for calculating the values for a given use case.
See the Erasure Code Pools chapter in the Red Hat Ceph Storage Strategies Guide for more information.
