[Ceph][Rados] How to modify the failure domains for existing pools?

Environment

  • Red Hat Ceph Storage
    • 3.x
    • 4.x
    • 5.x

Issue

  • Change the failure domain of existing RADOS pools (replicated and erasure-coded)

Resolution

For Replicated Pools

  1. To change the failure domain for multiple replicated pools that share a common crush rule, edit the crush map and modify the failure domain of the existing replicated rule.
    a. Collect and decompile the crush map from the cluster.


    $ ceph osd getcrushmap -o /tmp/cm.bin
    $ crushtool -d /tmp/cm.bin -o /tmp/cm.bin.ascii

    b. Edit the rules section as required. For example, change the failure domain from host to rack.

    # rules
    rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
    }
    

    to


    # rules
    rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type rack
        step emit
    }

    c. Compile the crush map back and inject the updated map into the cluster. Optionally, verify the recompiled map offline first (see the crushtool sketch after the notes below).


    $ crushtool -c /tmp/cm.bin.ascii -o /tmp/cm_updated.bin
    $ ceph osd setcrushmap -i /tmp/cm_updated.bin
  2. To change the failure domain for a specific replicated pool, create a new replicated rule specifying the new failure domain
    a. Create a new replicated rule


    $ ceph osd crush rule create-replicated <name> <root> <failure domain> <class>

    b. Check the new crush rule name, then set the new crush rule on the pool.


    $ ceph osd crush dump                                            # find the new rule name in the "rules" section
    $ ceph osd pool set <pool-name> crush_rule <crush-rule-name>
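
    For illustration, a minimal sketch of Method 2 with hypothetical names: a rule called replicated_rack spanning the default root with a rack failure domain and hdd device class, applied to a pool named rbd_pool. Substitute the names, root, and device class for your cluster.

    $ ceph osd crush rule create-replicated replicated_rack default rack hdd
    $ ceph osd pool set rbd_pool crush_rule replicated_rack
    $ ceph osd pool get rbd_pool crush_rule    # confirm the pool now uses the new rule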

NOTE: As the crush map gets updated, the cluster will start re-balancing
NOTE: It is preferred to use Method 2 above
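
When editing the crush map directly (Method 1, for either replicated or erasure-coded rules), the recompiled map can be sanity-checked offline with crushtool before it is injected. A minimal sketch, assuming the edited rule has id 0 and the pools using it have size 3:

    $ crushtool -i /tmp/cm_updated.bin --test --show-statistics --rule 0 --num-rep 3
    $ crushtool -i /tmp/cm_updated.bin --test --show-mappings --rule 0 --num-rep 3 | head

The simulated mappings should place each replica in a different bucket of the new failure domain (for example, different racks) before the map is injected with ceph osd setcrushmap.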

For Erasure-coded Pools

NOTE: CRUSH-related settings in an EC profile, such as the failure domain and device class, are only read when the crush rule is created from the profile; changing the profile later does not modify existing rules or pools.
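
To check which failure domain an existing EC rule actually enforces (as opposed to what its profile currently says), inspect the rule and the profile directly. A minimal sketch, assuming a rule named erasure-code; the failure domain is the type in the rule's chooseleaf step:

    $ ceph osd crush rule dump erasure-code
    $ ceph osd erasure-code-profile get <profile-name>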

  1. To change the failure domain for multiple erasure-coded pools that share a common crush rule, edit the crush map and modify the failure domain of the existing EC rule.
    a. Collect and decompile the crush map from the cluster.


    $ ceph osd getcrushmap -o /tmp/cm.bin
    $ crushtool -d /tmp/cm.bin -o /tmp/cm.bin.ascii

    b. Edit the rules section as required. For example, change the failure domain from host to rack.

    # rules
    rule erasure-code {
        id 3
        type erasure
        min_size 3
        max_size 4
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step chooseleaf indep 0 type host
        step emit
    }
    

    to


    # rules
    rule erasure-code {
        id 3
        type erasure
        min_size 3
        max_size 4
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step chooseleaf indep 0 type rack
        step emit
    }

    c. Compile the crush map back and inject the updated map into the cluster.

    $ crushtool -c /tmp/cm.bin.ascii -o /tmp/cm_updated.bin
    $ ceph osd setcrushmap -i /tmp/cm_updated.bin
    
  2. To change the failure domain for a specific EC pool, create a new EC profile that specifies the new failure domain, then create a new EC rule from that profile.
    a. Create a new ec profile

    $ ceph osd erasure-code-profile set prof2 k=<int> m=<int> crush-failure-domain=<failure-domain> crush-device-class=<class>
    

    b. Create a new crush rule using the new EC profile, then set the new rule on the pool.


    $ ceph osd crush rule create-erasure <new-crush-rule-name> <profile-name>
    $ ceph osd pool set <pool-name> crush_rule <new-crush-rule-name>
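
    For illustration, a minimal sketch of Method 2 with hypothetical values: a k=4, m=2 profile named ec42_rack with a rack failure domain and hdd device class, applied to a pool named ec_pool. The cluster needs at least k+m (here 6) racks for such a rule to place data; substitute your own names and values.

    $ ceph osd erasure-code-profile set ec42_rack k=4 m=2 crush-failure-domain=rack crush-device-class=hdd
    $ ceph osd crush rule create-erasure ec42_rack_rule ec42_rack
    $ ceph osd pool set ec_pool crush_rule ec42_rack_rule
    $ ceph osd pool get ec_pool crush_rule    # confirm the pool now uses the new rule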

NOTE: As the crush map gets updated, the cluster will start re-balancing
NOTE: It is preferred to use Method 2 above
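
The data movement triggered by either method can be followed with the usual status commands, for example:

    $ ceph -s                 # overall health plus recovery/backfill progress
    $ ceph osd pool stats     # per-pool recovery and client I/O statistics
    $ ceph pg stat            # PG states while objects are remapped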
