Chapter 8. Provisioning bare metal nodes before deploying the overcloud

Important

This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.

The overcloud deployment process contains two primary operations:

  • Provisioning nodes
  • Deploying the overcloud

You can mitigate some of the risk involved with this process and identify points of failure more efficiently if you separate these operations into distinct processes:

  1. Provision your bare metal nodes.

    1. Create a node definition file in YAML format.
    2. Run the provisioning command, including the node definition file.
  2. Deploy your overcloud.

    1. Run the deployment command, including the heat environment file that the provisioning command generates.

The provisioning process provisions your nodes and generates a heat environment file that contains various node specifications, including node count, predictive node placement, custom images, and custom NICs. When you deploy your overcloud, include this file in the deployment command.

Important

You cannot combine pre-provisioned nodes with director-provisioned nodes.

8.1. Registering nodes for the overcloud

Director requires a node definition template, which you create manually. This template uses a JSON or YAML format, and contains the hardware and power management details for your nodes.

Procedure

  1. Create a template that lists your nodes. Use the following JSON and YAML template examples to understand how to structure your node definition template:

    Example JSON template

    {
        "nodes":[
            {
                "mac":[
                    "bb:bb:bb:bb:bb:bb"
                ],
                "name":"node01",
                "cpu":"4",
                "memory":"6144",
                "disk":"40",
                "arch":"x86_64",
                "pm_type":"ipmi",
                "pm_user":"admin",
                "pm_password":"p@55w0rd!",
                "pm_addr":"192.168.24.205"
            },
            {
                "mac":[
                    "cc:cc:cc:cc:cc:cc"
                ],
                "name":"node02",
                "cpu":"4",
                "memory":"6144",
                "disk":"40",
                "arch":"x86_64",
                "pm_type":"ipmi",
                "pm_user":"admin",
                "pm_password":"p@55w0rd!",
                "pm_addr":"192.168.24.206"
            }
        ]
    }

    Example YAML template

    nodes:
      - mac:
          - "bb:bb:bb:bb:bb:bb"
        name: "node01"
        cpu: 4
        memory: 6144
        disk: 40
        arch: "x86_64"
        pm_type: "ipmi"
        pm_user: "admin"
        pm_password: "p@55w0rd!"
        pm_addr: "192.168.24.205"
      - mac:
          - "cc:cc:cc:cc:cc:cc"
        name: "node02"
        cpu: 4
        memory: 6144
        disk: 40
        arch: "x86_64"
        pm_type: "ipmi"
        pm_user: "admin"
        pm_password: "p@55w0rd!"
        pm_addr: "192.168.24.206"

    This template contains the following attributes:

    name
    The logical name for the node.
    pm_type

    The power management driver that you want to use. This example uses the IPMI driver (ipmi).

    Note

    IPMI is the preferred supported power management driver. For more information about supported power management types and their options, see Appendix A, Power management drivers. If these power management drivers do not work as expected, use IPMI for your power management.

    pm_user; pm_password
    The IPMI username and password.
    pm_addr
    The IP address of the IPMI device.
    pm_port
    (Optional) The port to access the specific IPMI device.
    mac
    (Optional) A list of MAC addresses for the network interfaces on the node. Use only the MAC address for the Provisioning NIC of each system.
    cpu
    (Optional) The number of CPUs on the node.
    memory
    (Optional) The amount of memory in MB.
    disk
    (Optional) The size of the hard disk in GB.
    arch

    (Optional) The system architecture.

    Important

    When building a multi-architecture cloud, the arch key is mandatory to distinguish nodes using x86_64 and ppc64le architectures.

  2. Save the template to the home directory of the stack user (/home/stack/nodes.json), then run the following commands to verify the formatting and syntax:

    $ source ~/stackrc
    (undercloud) $ openstack overcloud node import --validate-only ~/nodes.json
  3. Run the following command to import the template to director:

    (undercloud) $ openstack overcloud node import ~/nodes.json

    This command registers each node from the template into director.

  4. Wait for the node registration and configuration to complete. When complete, confirm that director has successfully registered the nodes:

    (undercloud) $ openstack baremetal node list
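
Before you run the import command, it can help to check the template locally. The following Python sketch is not part of director and does not replace the --validate-only option. It assumes that the attributes that the list above does not mark as (Optional) are required, and it checks only the JSON form of the template. The function name check_node_template and the sample data are hypothetical.

```python
import json
import re

# Attributes that the attribute list does not mark as (Optional).
# Treating them as required is an assumption of this sketch.
REQUIRED_KEYS = ("name", "pm_type", "pm_user", "pm_password", "pm_addr")
MAC_PATTERN = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def check_node_template(text):
    """Return a list of human-readable problems found in a JSON node template."""
    problems = []
    nodes = json.loads(text).get("nodes", [])
    if not nodes:
        problems.append("template contains no nodes")
    for index, node in enumerate(nodes):
        label = node.get("name", f"node #{index}")
        for key in REQUIRED_KEYS:
            if key not in node:
                problems.append(f"{label}: missing {key}")
        for mac in node.get("mac", []):
            if not MAC_PATTERN.match(mac):
                problems.append(f"{label}: malformed MAC address {mac!r}")
    return problems


# Hypothetical template: node02 lacks pm_password and has a short MAC address.
example = """
{"nodes": [
  {"name": "node01", "mac": ["bb:bb:bb:bb:bb:bb"], "pm_type": "ipmi",
   "pm_user": "admin", "pm_password": "p@55w0rd!", "pm_addr": "192.168.24.205"},
  {"name": "node02", "mac": ["cc:cc:cc:cc:cc"], "pm_type": "ipmi",
   "pm_user": "admin", "pm_addr": "192.168.24.206"}
]}
"""
for problem in check_node_template(example):
    print(problem)
```

An empty result means only that the template passed these local checks. Director can still reject the template for other reasons.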

8.2. Inspecting the hardware of nodes

Director can run an introspection process on each node. This process boots an introspection agent over PXE on each node. The introspection agent collects hardware data from the node and sends the data back to director. Director then stores this introspection data in the OpenStack Object Storage (swift) service running on director. Director uses hardware information for various purposes such as profile tagging, benchmarking, and manual root disk assignment.

Procedure

  1. Run the following command to inspect the hardware attributes of each node:

    (undercloud) $ openstack overcloud node introspect --all-manageable --provide
    • Use the --all-manageable option to introspect only the nodes that are in a manageable state. In this example, all nodes are in a manageable state.
    • Use the --provide option to reset all nodes to an available state after introspection.
  2. Monitor the introspection progress logs in a separate terminal window:

    (undercloud) $ sudo tail -f /var/log/containers/ironic-inspector/ironic-inspector.log
    Important

    Ensure that this process runs to completion. This process usually takes 15 minutes for bare metal nodes.

After the introspection completes, all nodes change to an available state.
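
To confirm the resulting state without reading the table by eye, you can filter the node list in a script. The following Python sketch assumes that you captured the node list in JSON form, for example with the -f json output formatter, and that the records use the Name and Provisioning State keys. The sample data and the function name nodes_not_in_state are hypothetical.

```python
import json


def nodes_not_in_state(node_list_json, expected_state):
    """Return the names of nodes whose provisioning state differs from expected_state."""
    return [
        node["Name"]
        for node in json.loads(node_list_json)
        if node["Provisioning State"] != expected_state
    ]


# Hypothetical sample shaped like the JSON output of the node list command.
sample = """
[
  {"Name": "node01", "Power State": "power off", "Provisioning State": "available"},
  {"Name": "node02", "Power State": "power off", "Provisioning State": "manageable"}
]
"""
print(nodes_not_in_state(sample, "available"))
```

An empty list indicates that every node in the sample reached the expected state. The same helper can check for the active state after provisioning.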

8.3. Provisioning bare metal nodes

Create a new YAML file ~/overcloud-baremetal-deploy.yaml, define the quantity and attributes of the bare metal nodes that you want to deploy, and assign overcloud roles to these nodes. The provisioning process creates a heat environment file that you can include in your openstack overcloud deploy command.

Procedure

  1. Source the stackrc undercloud credential file:

    $ source ~/stackrc
  2. Create a new ~/overcloud-baremetal-deploy.yaml file and define the node count for each role that you want to provision. For example, to provision three Controller nodes and three Compute nodes, use the following syntax:

    - name: Controller
      count: 3
    - name: Compute
      count: 3
  3. In the ~/overcloud-baremetal-deploy.yaml file, define any predictive node placements, custom images, custom NICs, or other attributes that you want to assign to your nodes. For example, use the following syntax to provision three Controller nodes on node00, node01, and node02, and three Compute nodes on node04, node05, and node06:

    - name: Controller
      count: 3
      instances:
      - hostname: overcloud-controller-0
        name: node00
      - hostname: overcloud-controller-1
        name: node01
      - hostname: overcloud-controller-2
        name: node02
    - name: Compute
      count: 3
      instances:
      - hostname: overcloud-novacompute-0
        name: node04
      - hostname: overcloud-novacompute-1
        name: node05
      - hostname: overcloud-novacompute-2
        name: node06

    By default, the provisioning process uses the overcloud-full image. You can use the image attribute in the instances parameter to define a custom image:

    - name: Controller
      count: 3
      instances:
      - hostname: overcloud-controller-0
        name: node00
        image:
          href: overcloud-custom

    You can also override the default parameter values with the defaults parameter to avoid manual node definitions for each node entry:

    - name: Controller
      count: 3
      defaults:
        image:
          href: overcloud-custom
      instances:
      - hostname: overcloud-controller-0
        name: node00
      - hostname: overcloud-controller-1
        name: node01
      - hostname: overcloud-controller-2
        name: node02

    For more information about the parameters, attributes, and values that you can use in your node definition file, see Section 8.6, “Bare metal node provisioning attributes”.

  4. Run the provisioning command, specifying the ~/overcloud-baremetal-deploy.yaml file and defining an output file with the --output option:

    (undercloud) $ sudo openstack overcloud node provision \
    --stack stack \
    --output ~/overcloud-baremetal-deployed.yaml \
    ~/overcloud-baremetal-deploy.yaml

    The provisioning process generates a heat environment file with the name that you specify in the --output option. This file contains your node definitions. When you deploy the overcloud, include this file in the deployment command.

  5. In a separate terminal, monitor your nodes to verify that they provision successfully. The provisioning process changes the node state from available to active:

    (undercloud) $ watch openstack baremetal node list

    Use the metalsmith tool to obtain a unified view of your nodes, including allocations and neutron ports:

    (undercloud) $ metalsmith list

    You can also use the openstack baremetal allocation command to verify association of nodes to hostnames, and to obtain IP addresses for the provisioned nodes:

    (undercloud) $ openstack baremetal allocation list

When your nodes are provisioned successfully, you can deploy the overcloud. For more information, see Chapter 9, Configuring a basic overcloud with pre-provisioned nodes.

8.4. Scaling up bare metal nodes

To increase the count of bare metal nodes in an existing overcloud, increment the node count in the ~/overcloud-baremetal-deploy.yaml file and redeploy the overcloud.

Procedure

  1. Source the stackrc undercloud credential file:

    $ source ~/stackrc
  2. Edit the ~/overcloud-baremetal-deploy.yaml file that you used to provision your bare metal nodes, and increment the count parameter for the roles that you want to scale up. For example, if your overcloud contains three Compute nodes, use the following snippet to increase the Compute node count to 10:

    - name: Controller
      count: 3
    - name: Compute
      count: 10

    You can also add predictive node placement with the instances parameter. For more information about the parameters and attributes that are available, see Section 8.6, “Bare metal node provisioning attributes”.

  3. Run the provisioning command, specifying the ~/overcloud-baremetal-deploy.yaml file and defining an output file with the --output option:

    (undercloud) $ sudo openstack overcloud node provision \
    --stack stack \
    --output ~/overcloud-baremetal-deployed.yaml \
    ~/overcloud-baremetal-deploy.yaml
  4. Monitor the provisioning progress with the openstack baremetal node list command.
  5. Deploy the overcloud, including the ~/overcloud-baremetal-deployed.yaml file that the provisioning command generates, along with any other environment files relevant to your deployment:

    (undercloud) $ openstack overcloud deploy \
      ...
      -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
      -e ~/overcloud-baremetal-deployed.yaml \
      --deployed-server \
      --disable-validations \
      ...

8.5. Scaling down bare metal nodes

Tag the nodes that you want to delete from the stack in the ~/overcloud-baremetal-deploy.yaml file, redeploy the overcloud, and then include this file in the openstack overcloud node delete command with the --baremetal-deployment option.

Procedure

  1. Source the stackrc undercloud credential file:

    $ source ~/stackrc
  2. Edit the ~/overcloud-baremetal-deploy.yaml file that you used to provision your bare metal nodes, and decrement the count parameter for the roles that you want to scale down. You must also define the following attributes for each node that you want to remove from the stack:

    • The name of the node.
    • The hostname that is associated with the node.
    • The attribute provisioned: false.

      For example, to remove the node overcloud-controller-1 from the stack, include the following snippet in your ~/overcloud-baremetal-deploy.yaml file:

      - name: Controller
        count: 2
        instances:
        - hostname: overcloud-controller-0
          name: node00
        - hostname: overcloud-controller-1
          name: node01
          # Removed from cluster due to disk failure
          provisioned: false
        - hostname: overcloud-controller-2
          name: node02
  3. Run the provisioning command, specifying the ~/overcloud-baremetal-deploy.yaml file and defining an output file with the --output option:

    (undercloud) $ sudo openstack overcloud node provision \
    --stack stack \
    --output ~/overcloud-baremetal-deployed.yaml \
    ~/overcloud-baremetal-deploy.yaml
  4. Redeploy the overcloud and include the ~/overcloud-baremetal-deployed.yaml file that the provisioning command generates, along with any other environment files relevant to your deployment:

    (undercloud) $ openstack overcloud deploy \
      ...
      -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
      -e ~/overcloud-baremetal-deployed.yaml \
      --deployed-server \
      --disable-validations \
      ...

    After you redeploy the overcloud, the nodes that you define with the provisioned: false attribute are no longer present in the stack. However, these nodes are still running in a provisioned state.

    Note

    If you want to remove a node from the stack temporarily, you can deploy the overcloud with the attribute provisioned: false and then redeploy the overcloud with the attribute provisioned: true to return the node to the stack.

  5. Run the openstack overcloud node delete command, including the ~/overcloud-baremetal-deploy.yaml file with the --baremetal-deployment option:

    (undercloud) $ sudo openstack overcloud node delete \
    --stack stack \
    --baremetal-deployment ~/overcloud-baremetal-deploy.yaml
    Note

    Do not include the nodes that you want to remove from the stack as command arguments in the openstack overcloud node delete command.
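
The edit that you make to the role definition in this procedure can be sketched in Python, with the role represented as a dictionary in place of the YAML file. This sketch is illustrative only. The function name mark_unprovisioned is hypothetical, and the sketch does not write the file or call any director command.

```python
import copy


def mark_unprovisioned(role, hostname):
    """Return a copy of role with hostname set to provisioned: false and count reduced by one."""
    updated = copy.deepcopy(role)
    for instance in updated.get("instances", []):
        if instance["hostname"] == hostname:
            # Only decrement the count the first time the node is flagged.
            if instance.get("provisioned", True):
                instance["provisioned"] = False
                updated["count"] = updated.get("count", 1) - 1
            return updated
    raise ValueError(f"{hostname} is not listed in the instances of role {role['name']}")


controller = {
    "name": "Controller",
    "count": 3,
    "instances": [
        {"hostname": "overcloud-controller-0", "name": "node00"},
        {"hostname": "overcloud-controller-1", "name": "node01"},
        {"hostname": "overcloud-controller-2", "name": "node02"},
    ],
}
scaled_down = mark_unprovisioned(controller, "overcloud-controller-1")
print(scaled_down["count"])
print(scaled_down["instances"][1])
```

The result matches the example in step 2: the count drops from 3 to 2, and the entry for overcloud-controller-1 keeps its name and hostname and gains provisioned: false.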

8.6. Bare metal node provisioning attributes

Use the following tables to understand the parameters, attributes, and values that are available for you to use when you provision bare metal nodes with the openstack overcloud node provision command.

Table 8.1. Role parameters

Parameter
Value

name
Mandatory role name.

count
The number of nodes that you want to provision for this role. The default value is 1.

defaults
A dictionary of default values for instances entry properties. An instances entry property overrides any defaults that you specify in the defaults parameter.

instances
A list of dictionaries that you can use to specify attributes for specific nodes. For more information about supported properties in the instances parameter, see Table 8.2, “instances and defaults parameters”. The length of this list must not be greater than the value of the count parameter.

hostname_format
Overrides the default hostname format for this role. The default format uses the lowercase role name. For example, the default format for the Controller role is %stackname%-controller-%index%. Only the Compute role does not follow the role name rule. The Compute default format is %stackname%-novacompute-%index%.

Example syntax

In the following example, the name refers to the logical name of the node, and the hostname refers to the generated hostname, which is derived from the overcloud stack name, the role, and an incrementing index. All Controller servers use a default custom image overcloud-full-custom and are placed predictively on node00, node01, and node02. One of the Compute servers is placed predictively on node04 with the custom hostname overcloud-compute-special, and the other 99 Compute servers are on nodes allocated automatically from the pool of available nodes:

- name: Controller
  count: 3
  defaults:
    image:
      href: file:///var/lib/ironic/images/overcloud-full-custom.qcow2
  instances:
  - hostname: overcloud-controller-0
    name: node00
  - hostname: overcloud-controller-1
    name: node01
  - hostname: overcloud-controller-2
    name: node02
- name: Compute
  count: 100
  instances:
  - hostname: overcloud-compute-special
    name: node04
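
The default hostname formats in Table 8.1 can be illustrated with a short Python sketch that represents a role as a dictionary. The sketch assumes that the %stackname% and %index% placeholders expand by plain string substitution and that the stack name is overcloud. It also excludes instances entries with provisioned: false when it compares the list length with count, which is an assumption drawn from the scale-down example in Section 8.5. The function names are hypothetical, and the sketch ignores custom hostnames such as overcloud-compute-special.

```python
def default_hostname_format(role_name):
    """Return the default format from Table 8.1; Compute is the documented exception."""
    short_name = "novacompute" if role_name == "Compute" else role_name.lower()
    return f"%stackname%-{short_name}-%index%"


def generated_hostnames(role, stack_name="overcloud"):
    """Expand the hostname format of a role for every index from 0 to count - 1."""
    count = role.get("count", 1)
    # Assumption: entries flagged provisioned: false do not count toward the limit.
    active = [i for i in role.get("instances", []) if i.get("provisioned", True)]
    if len(active) > count:
        raise ValueError(f"role {role['name']} lists more instances than its count")
    fmt = role.get("hostname_format", default_hostname_format(role["name"]))
    return [
        fmt.replace("%stackname%", stack_name).replace("%index%", str(index))
        for index in range(count)
    ]


print(generated_hostnames({"name": "Controller", "count": 3}))
print(generated_hostnames({"name": "Compute", "count": 2}))
```

The first call produces the three overcloud-controller hostnames that the example uses, and the second call shows the novacompute exception for the Compute role.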

Table 8.2. instances and defaults parameters

Parameter
Value

hostname
If the hostname complies with the hostname_format pattern then other properties apply to the node allocated to this hostname. Otherwise, you can use a custom hostname for this node.

name
The name of the node that you want to provision.

image
Details of the image that you want to provision onto the node. For more information about supported properties in the image parameter, see Table 8.3, “image parameters”.

capabilities
Selection criteria to match the node capabilities.

nics
List of dictionaries that represent requested NICs. For more information about supported properties in the nics parameter, see Table 8.4, “nic parameters”.

profile
Selection criteria to use Advanced Profile Matching.

provisioned
Boolean to determine whether this node is provisioned or unprovisioned. The default value is true. Use false to unprovision a node. For more information, see Section 8.5, “Scaling down bare metal nodes”.

resource_class
Selection criteria to match the resource class of the node. The default value is baremetal.

root_size_gb
Size of the root partition in GiB. The default value is 49.

swap_size_mb
Size of the swap partition in MiB.

traits
A list of traits as selection criteria to match the node traits.

Example syntax

In the following example, all Controller servers use a custom default overcloud image overcloud-full-custom. The Controller server overcloud-controller-0 is placed predictively on node00 and has custom root and swap sizes. The other two Controller servers are on nodes allocated automatically from the pool of available nodes, and have default root and swap sizes:

- name: Controller
  count: 3
  defaults:
    image:
      href: file:///var/lib/ironic/images/overcloud-full-custom.qcow2
  instances:
  - hostname: overcloud-controller-0
    name: node00
    root_size_gb: 140
    swap_size_mb: 600
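
The override rule from Table 8.1, where an instances entry property takes precedence over the defaults parameter, can be sketched in Python with the role represented as a dictionary. The sketch assumes a shallow merge, so an instance that sets image replaces the whole default image dictionary. How director merges nested values is not described here, and the function name effective_instances is hypothetical. The root_size_gb default of 49 is added to the sample only to make the override visible.

```python
def effective_instances(role):
    """Return each instances entry with the role defaults applied underneath it."""
    defaults = role.get("defaults", {})
    # Keys from the instance win over keys from defaults (shallow merge).
    return [{**defaults, **instance} for instance in role.get("instances", [])]


controller = {
    "name": "Controller",
    "count": 3,
    "defaults": {
        "image": {"href": "file:///var/lib/ironic/images/overcloud-full-custom.qcow2"},
        "root_size_gb": 49,
    },
    "instances": [
        {
            "hostname": "overcloud-controller-0",
            "name": "node00",
            "root_size_gb": 140,
            "swap_size_mb": 600,
        },
    ],
}
for instance in effective_instances(controller):
    print(instance["hostname"], instance["root_size_gb"], instance["image"]["href"])
```

In this sample, overcloud-controller-0 keeps its own root_size_gb value of 140 and inherits the image from the defaults parameter.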

Table 8.3. image parameters

Parameter
Value

href
Glance image reference or URL of the root partition or whole disk image. URL schemes supported are file://, http://, and https://. If the value is not a valid URL, this value must be a valid glance image reference.

checksum
When the href is a URL, this value must be the SHA512 checksum of the root partition or whole disk image.

kernel
Glance image reference or URL of the kernel image. Use this property only for partition images.

ramdisk
Glance image reference or URL of the ramdisk image. Use this property only for partition images.

Example syntax

In the following example, all three Controller servers are on nodes allocated automatically from the pool of available nodes. All Controller servers in this environment use a default custom image overcloud-full-custom:

- name: Controller
  count: 3
  defaults:
    image:
      href: file:///var/lib/ironic/images/overcloud-full-custom.qcow2
      checksum: 1582054665
      kernel: file:///var/lib/ironic/images/overcloud-full-custom.vmlinuz
      ramdisk: file:///var/lib/ironic/images/overcloud-full-custom.initrd
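
Table 8.3 states that the checksum value must be the SHA512 checksum of the image when the href is a URL. The following Python sketch shows one way to compute that digest with the standard hashlib module. The function name sha512_of_file is hypothetical, and the temporary file only stands in for a real image file.

```python
import hashlib
import os
import tempfile


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal SHA-512 digest of the file at path, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a small temporary file that stands in for an image.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"placeholder image contents")
    demo_path = handle.name

demo_checksum = sha512_of_file(demo_path)
os.remove(demo_path)
print(len(demo_checksum))  # a SHA-512 hex digest has 128 characters
```

Reading the file in chunks keeps memory use constant, which matters for multi-gigabyte disk images.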

Table 8.4. nic parameters

Parameter
Value

fixed_ip
The specific IP address that you want to use for this NIC.

network
The neutron network where you want to create the port for this NIC.

subnet
The neutron subnet where you want to create the port for this NIC.

port
Existing Neutron port to use instead of creating a new port.

Example syntax

In the following example, all three Controller servers are on nodes allocated automatically from the pool of available nodes. All Controller servers in this environment use a default custom image overcloud-full-custom and have specific networking requirements:

- name: Controller
  count: 3
  defaults:
    image:
      href: file:///var/lib/ironic/images/overcloud-full-custom.qcow2
    nics:
    - network: custom-network
      subnet: custom-subnet