Chapter 10. Scaling Compute nodes with director Operator

If you require more or fewer compute resources for your overcloud, you can scale the number of Compute nodes according to your requirements.

10.1. Adding Compute nodes to your overcloud with director Operator

To add more Compute nodes to your overcloud, you must increase the node count for the compute OpenStackBaremetalSet resource. After the new node is provisioned, create a new OpenStackConfigGenerator resource to generate a new set of Ansible playbooks, then use the OpenStackConfigVersion to create or update the OpenStackDeploy object, which reapplies the Ansible configuration to your overcloud.

Procedure

  1. Check that you have enough hosts in a ready state in the openshift-machine-api namespace:

    $ oc get baremetalhosts -n openshift-machine-api

    For more information on managing your bare-metal hosts, see Managing bare metal hosts.

  2. Increase the count parameter for the compute OpenStackBaremetalSet resource:

    $ oc patch openstackbaremetalset compute --type=merge --patch '{"spec":{"count":3}}' -n openstack

    The OpenStackBaremetalSet resource automatically provisions the new nodes with the Red Hat Enterprise Linux base operating system.

  3. Wait until the provisioning process completes. Check the nodes periodically to determine the readiness of the nodes:

    $ oc get baremetalhosts -n openshift-machine-api
    $ oc get openstackbaremetalset -n openstack
  4. Optional: Reserve static IP addresses for networks on the new Compute nodes. For more information, see Reserving static IP addresses for added Compute nodes with the OpenStackNetConfig CRD.
  5. Generate the Ansible playbooks by using OpenStackConfigGenerator and apply the overcloud configuration. For more information, see Configuring and deploying the overcloud with director Operator.
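
Steps 2 and 3 of this procedure can be wrapped in a small helper script. The following is a minimal sketch, not part of director Operator. The function names build_count_patch, scale_compute, and wait_for_provisioned and the POLL_INTERVAL variable are hypothetical. The sketch assumes the resource is named compute in the openstack namespace, and that the status.provisioningStatus.state field reports "provisioned" when all requested hosts are ready, as in the status output shown later in this chapter.

```shell
# Hypothetical helpers; only the oc patch invocation is taken from the procedure.

# Print the merge patch for a node count, rejecting anything that is not
# a non-negative integer without leading zeros.
build_count_patch() (
  count="$1"
  case "$count" in
    ''|*[!0-9]*|0[0-9]*)
      echo "count must be a non-negative integer without leading zeros" >&2
      exit 1 ;;
  esac
  printf '{"spec":{"count":%s}}\n' "$count"
)

# Patch the compute OpenStackBaremetalSet resource to the requested count.
scale_compute() (
  patch="$(build_count_patch "$1")" || exit 1
  oc patch openstackbaremetalset compute --type=merge --patch "$patch" -n openstack
)

# Poll the resource until it reports the "provisioned" state (assumption),
# giving up after the given number of attempts (default 60).
wait_for_provisioned() (
  tries="${1:-60}"
  while [ "$tries" -gt 0 ]; do
    state="$(oc get openstackbaremetalset compute -n openstack \
      -o jsonpath='{.status.provisioningStatus.state}')"
    [ "$state" = provisioned ] && exit 0
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-30}"
  done
  exit 1
)
```

For example, running scale_compute 3 followed by wait_for_provisioned issues the same patch as step 2 and then checks the resource every 30 seconds. Continue to inspect the BareMetalHost resources as described in step 3 if the wait times out.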

10.2. Reserving static IP addresses for added Compute nodes with the OpenStackNetConfig CRD

Use the OpenStackNetConfig CRD to define IP addresses that you want to reserve for the Compute node you added to your overcloud.

Tip

Use the following commands to view the OpenStackNetConfig CRD definition and specification schema:

$ oc describe crd openstacknetconfig

$ oc explain openstacknetconfig.spec

Procedure

  1. Open the openstacknetconfig.yaml file for the overcloud on your workstation.
  2. Add the following configuration to openstacknetconfig.yaml to create the OpenStackNetConfig custom resource (CR):

    apiVersion: osp-director.openstack.org/v1beta1
    kind: OpenStackNetConfig
    metadata:
      name: openstacknetconfig
  3. Reserve static IP addresses for networks on specific nodes:

    spec:
      ...
      reservations:
        controller-0:
          ipReservations:
            ctlplane: 172.22.0.120
        compute-0:
          ipReservations:
            ctlplane: 172.22.0.140
            internal_api: 172.17.0.40
            storage: 172.18.0.40
            tenant: 172.20.0.40
        ...
        # The key for the ctlplane VIPs
        controlplane:
          ipReservations:
            ctlplane: 172.22.0.110
            external: 10.0.0.10
            internal_api: 172.17.0.10
            storage: 172.18.0.10
            storage_mgmt: 172.19.0.10
          macReservations: {}

    Note

    Reservations have precedence over any autogenerated IP addresses.

  4. Save the openstacknetconfig.yaml definition file.
  5. Create the overcloud network configuration:

    $ oc create -f openstacknetconfig.yaml -n openstack
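
A typo that reserves the same address for two nodes is easy to miss in a long reservations block. As an optional sanity check before you create the resource, the following sketch lists IPv4 addresses that occur more than once in the definition file. The function name find_duplicate_ips is hypothetical, and the check is purely textual: it does not parse YAML, it also counts addresses in comments, and it does not know which network an address belongs to.

```shell
# Hypothetical helper: print every IPv4-looking string that appears
# more than once in the given file (one address per line).
find_duplicate_ips() (
  file="$1"
  grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' "$file" | sort | uniq -d
)
```

For example, find_duplicate_ips openstacknetconfig.yaml prints nothing when every address in the file is unique.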

Verification

  1. To verify that the overcloud network configuration is created, view the resources for the overcloud network configuration:

    $ oc get openstacknetconfig/openstacknetconfig -n openstack
  2. View the OpenStackNetConfig API and child resources:

    $ oc get openstacknetconfig/openstacknetconfig -n openstack
    $ oc get openstacknetattachment -n openstack
    $ oc get openstacknet -n openstack

    If you see errors, check the underlying network-attach-definition and node network configuration policies:

    $ oc get network-attachment-definitions -n openstack
    $ oc get nncp

10.3. Removing Compute nodes from your overcloud with director Operator

To remove a Compute node from your overcloud, you must disable the Compute node, mark it for deletion, and decrease the node count for the compute OpenStackBaremetalSet resource.

Note

If you later scale up the overcloud with a new node in the same role, the new node reuses the freed host names, starting with the lowest ID suffix, and the corresponding IP reservations.
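
To illustrate the host name reuse described in this note, the following sketch picks the host name with the lowest unused ID suffix for a role. It is one reading of the note, not the director Operator implementation, and the function name next_hostname is hypothetical.

```shell
# Illustration only: given a role name followed by the host names that are
# still in use, print the host name with the lowest unused numeric suffix.
next_hostname() (
  role="$1"; shift
  i=0
  while :; do
    candidate="$role-$i"
    in_use=no
    for name in "$@"; do
      [ "$name" = "$candidate" ] && in_use=yes
    done
    if [ "$in_use" = no ]; then
      echo "$candidate"
      exit 0
    fi
    i=$((i + 1))
  done
)
```

For example, after compute-0 is removed from a role that still contains compute-1 and compute-2, next_hostname compute compute-1 compute-2 prints compute-0, which is the name the next added node would take under this reading.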

Procedure

  1. Access the remote shell for openstackclient:

    $ oc rsh -n openstack openstackclient
  2. Identify the Compute node that you want to remove:

    $ openstack compute service list
  3. Disable the Compute service on the node to prevent the node from scheduling new instances:

    $ openstack compute service set <hostname> nova-compute --disable
  4. Annotate the bare-metal node to prevent Metal3 from starting the node:

    $ oc annotate baremetalhost <node> baremetalhost.metal3.io/detached=true
    $ oc logs --since=1h <metal3-pod> metal3-baremetal-operator | grep -i detach
    $ oc get baremetalhost <node> -o json | jq .status.operationalStatus
    "detached"
    • Replace <node> with the name of the BareMetalHost resource.
    • Replace <metal3-pod> with the name of your metal3 pod.
  5. Log in to the Compute node as the root user and shut down the bare-metal node:

    [root@compute-0 ~]# shutdown -h now

    If the Compute node is not accessible, complete the following steps:

    1. Log in to a Controller node as the root user.
    2. If Instance HA is enabled, disable the STONITH device for the Compute node:

      [root@controller-0 ~]# pcs stonith disable <stonith_resource_name>
      • Replace <stonith_resource_name> with the name of the STONITH resource that corresponds to the node. The resource name uses the format <resource_agent>-<host_mac>. You can find the resource agent and the host MAC address in the FencingConfig section of the fencing.yaml file.
    3. Use IPMI to power off the bare-metal node. For more information, see your hardware vendor documentation.
  6. Retrieve the BareMetalHost resource that corresponds to the node that you want to remove:

    $ oc get openstackbaremetalset compute -o json -n openstack | jq '.status.baremetalHosts | to_entries[] | "\(.key) => \(.value | .hostRef)"'
    "compute-0 => openshift-worker-3"
    "compute-1 => openshift-worker-4"
  7. To change the status of the annotatedForDeletion parameter to true in the OpenStackBaremetalSet resource, annotate the BareMetalHost resource with osp-director.openstack.org/delete-host=true:

    $ oc annotate -n openshift-machine-api bmh/openshift-worker-3 osp-director.openstack.org/delete-host=true --overwrite
  8. Optional: Confirm that the annotatedForDeletion status has changed to true in the OpenStackBaremetalSet resource:

    $ oc get openstackbaremetalset compute -o json -n openstack | jq .status
    {
      "baremetalHosts": {
        "compute-0": {
          "annotatedForDeletion": true,
          "ctlplaneIP": "192.168.25.105/24",
          "hostRef": "openshift-worker-3",
          "hostname": "compute-0",
          "networkDataSecretName": "compute-cloudinit-networkdata-openshift-worker-3",
          "provisioningState": "provisioned",
          "userDataSecretName": "compute-cloudinit-userdata-openshift-worker-3"
        },
        "compute-1": {
          "annotatedForDeletion": false,
          "ctlplaneIP": "192.168.25.106/24",
          "hostRef": "openshift-worker-4",
          "hostname": "compute-1",
          "networkDataSecretName": "compute-cloudinit-networkdata-openshift-worker-4",
          "provisioningState": "provisioned",
          "userDataSecretName": "compute-cloudinit-userdata-openshift-worker-4"
        }
      },
      "provisioningStatus": {
        "readyCount": 2,
        "reason": "All requested BaremetalHosts have been provisioned",
        "state": "provisioned"
      }
    }
  9. Decrease the count parameter for the compute OpenStackBaremetalSet resource:

    $ oc patch openstackbaremetalset compute --type=merge --patch '{"spec":{"count":1}}' -n openstack

    When you reduce the resource count of the OpenStackBaremetalSet resource, you trigger the corresponding controller to handle the resource deletion, which causes the following actions:

    • Director Operator deletes the corresponding IP reservations from OpenStackIPSet and OpenStackNetConfig for the deleted node.
    • Director Operator flags the IP reservation entry in the OpenStackNet resource as deleted.

      $ oc get osnet ctlplane -o json -n openstack | jq .reservations
      {
        "compute-0": {
          "deleted": true,
          "ip": "172.22.0.140"
        },
        "compute-1": {
          "deleted": false,
          "ip": "172.22.0.100"
        },
        "controller-0": {
          "deleted": false,
          "ip": "172.22.0.120"
        },
        "controlplane": {
          "deleted": false,
          "ip": "172.22.0.110"
        },
        "openstackclient-0": {
          "deleted": false,
          "ip": "172.22.0.251"
        }
      }
  10. Optional: To make the IP reservations of the deleted OpenStackBaremetalSet resource available for other roles to use, set the value of the spec.preserveReservations parameter to false in the OpenStackNetConfig object.
  11. Access the remote shell for openstackclient:

    $ oc rsh openstackclient -n openstack
  12. Remove the Compute service entries from the overcloud:

    $ openstack compute service list
    $ openstack compute service delete <service-id>
  13. Check the Compute network agents entries in the overcloud and remove them if they exist:

    $ openstack network agent list
    $ for AGENT in $(openstack network agent list --host <scaled-down-node> -c ID -f value) ; do openstack network agent delete "$AGENT" ; done
  14. Exit from openstackclient:

    $ exit
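
The agent cleanup in step 13 can be wrapped in a reusable function for the openstackclient shell. The following is a minimal sketch under the same assumptions as that step. The function name delete_agents_for_host is hypothetical, and the only openstack subcommands it calls are the ones already shown in the procedure.

```shell
# Hypothetical helper: delete every network agent registered for one host.
delete_agents_for_host() (
  host="$1"
  [ -n "$host" ] || { echo "usage: delete_agents_for_host <hostname>" >&2; exit 1; }
  # List the agent IDs for the host, one per line, and delete each in turn.
  openstack network agent list --host "$host" -c ID -f value |
  while read -r agent_id; do
    [ -n "$agent_id" ] || continue
    # Redirect stdin so the delete call cannot consume the remaining IDs.
    openstack network agent delete "$agent_id" </dev/null
  done
)
```

Call the function with the host name of the node that you removed, for example delete_agents_for_host <scaled-down-node>. If the host has no agents, the function deletes nothing.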