Chapter 2. Planning your OVN deployment

Deploy OVN only in high-availability RHOSP high availability (HA) environments with at least three controller nodes.

DVR is enabled by default in new ML2/OVN deployments and disabled by default in new ML2/OVS deployments. The neutron-ovn-dvr-ha.yaml environment file configures the required DVR-specific parameters for deployments using OVN in an HA environment.

Note

To use OVN, your director deployment must use Generic Network Virtualization Encapsulation (Geneve), and not VXLAN. Geneve allows OVN to identify the network using the 24-bit Virtual Network Identifier (VNI) field and an additional 32-bit Type Length Value (TLV) to specify both the source and destination logical ports. You should account for this larger protocol header when you determine your MTU setting.

2.1. The ovn-controller service on Compute nodes

The ovn-controller service runs on each Compute node and connects to the OVN southbound (SB) database server to retrieve the logical flows. The ovn-controller translates these logical flows into physical OpenFlow flows and adds the flows to the OVS bridge (br-int). To communicate with ovs-vswitchd and install the OpenFlow flows, the ovn-controller connects to the local ovsdb-server (which hosts conf.db) using the UNIX socket path that was passed when ovn-controller was started (for example unix:/var/run/openvswitch/db.sock).

The ovn-controller service expects certain key-value pairs in the external_ids column of the Open_vSwitch table; puppet-ovn uses puppet-vswitch to populate these fields. The following example shows the key-value pairs that puppet-vswitch configures in the external_ids column:

hostname=<HOST NAME>
ovn-encap-ip=<IP OF THE NODE>
ovn-encap-type=geneve
ovn-remote=tcp:OVN_DBS_VIP:6642

2.2. The OVN composable service

Red Hat OpenStack Platform usually consists of nodes in pre-defined roles, such as nodes in Controller roles, Compute roles, and different storage role types. Each of these default roles contains a set of services that are defined in the core heat template collection.

In a default Red Hat OpenStack (RHOSP) deployment, the ML2/OVN composable service runs on Controller nodes. You can optionally create a custom Networker role and run the OVN composable service on dedicated Networker nodes.

The OVN composable service ovn-dbs is deployed in a container called ovn-dbs-bundle. In a default installation ovn-dbs is included in the Controller role and runs on Controller nodes. Because the service is composable, you can assign it to another role, such as a Networker role.

If you assign the OVN composable service to another role, ensure that the service is co-located on the same node as the pacemaker service, which controls the OVN database containers.

2.3. Deploying a custom role with ML2/OVN

In a default Red Hat OpenStack (RHOSP) deployment, the ML2/OVN composable service runs on Controller nodes. You can optionally use supported custom roles like those described in the following examples.

Networker
Run the OVN composable services on dedicated networker nodes.
Networker with SR-IOV
Run the OVN composable services on dedicated networker nodes with SR-IOV.
Controller with SR-IOV
Run the OVN composable services on SR-IOV capable controller nodes.

You can also generate your own custom roles.

Limitations

The following limitations apply to the use of SR-IOV with ML2/OVN and native OVN DHCP in this release.

  • All external ports are scheduled on a single gateway node because there is only one HA Chassis Group for all of the ports.
  • North/south routing on VF(direct) ports on VLAN tenant networks does not work with SR-IOV because the external ports are not colocated with the logical router’s gateway ports. See https://bugs.launchpad.net/neutron/+bug/1875852.

Prerequisites

Procedure

  1. Log in to the undercloud host as the stack user and source the stackrc file.

    $ source stackrc
  2. Choose the custom roles file that is appropriate for your deployment. Use it directly in the deploy command if it suits your needs as-is. Or you can generate your own custom roles file that combines other custom roles files.

    DeploymentRoleRole File

    Networker role

    Networker

    Networker.yaml

    Networker role with SR-IOV

    NetworkerSriov

    NetworkerSriov.yaml

    Co-located control and networker with SR-IOV

    ControllerSriov

    ControllerSriov.yaml

  3. [Optional] Generate a new custom roles data file that combines one of these custom roles files with other custom roles files. Follow the instructions in Creating a roles_data file. Include the appropriate source role files depending on your deployment.
  4. [Optional] To identify specific nodes for the role, you can create a specific hardware flavor and assign the flavor to specific nodes. Then use an environment file define the flavor for the role, and to specify a node count. For more information, see the example in Creating a new role.
  5. Create an environment file as appropriate for your deployment.

    DeploymentSample Environment File

    Networker role

    neutron-ovn-dvr-ha.yaml

    Networker role with SR-IOV

    ovn-sriov.yaml

  6. Include the following settings as appropriate for your deployment.

    DeploymentSettings

    Networker role

    ControllerParameters:
        OVNCMSOptions: ""
    ControllerSriovParameters:
            OVNCMSOptions: ""
    NetworkerParameters:
        OVNCMSOptions: "enable-chassis-as-gw"
    NetworkerSriovParameters:
        OVNCMSOptions: ""

    Networker role with SR-IOV

    OS::TripleO::Services::NeutronDhcpAgent: OS::Heat::None
    
    ControllerParameters:
        OVNCMSOptions: ""
    ControllerSriovParameters:
            OVNCMSOptions: ""
    NetworkerParameters:
        OVNCMSOptions: ""
    NetworkerSriYou can uovParameters:
        OVNCMSOptions: "enable-chassis-as-gw"

    Co-located control and networker with SR-IOV

    OS::TripleO::Services::NeutronDhcpAgent: OS::Heat::None
    
    ControllerParameters:
        OVNCMSOptions: ""
    ControllerSriovParameters:
            OVNCMSOptions: "enable-chassis-as-gw"
    NetworkerParameters:
        OVNCMSOptions: ""
    NetworkerSriovParameters:
        OVNCMSOptions: ""
  7. Deploy the overcloud. Include the environment file in your deployment command with the -e option. Include the custom roles data file in your deployment command with the -r option. For example: ``-r Networker.yaml` or '-r mycustomrolesfile.yaml`.

Verification steps

  1. Ensure that ovn_metadata_agent is running on Controller and Networker nodes.

    [heat-admin@controller-0 ~]$ sudo podman ps | grep ovn_metadata

    Expect output similar to the following example.

    a65125d9588d  undercloud-0.ctlplane.localdomain:8787/rh-osbs/rhosp16-openstack-neutron-metadata-agent-ovn:16.2_20200813.1  kolla_start           23 hours ago  Up 21 hours ago         ovn_metadata_agent
  2. Ensure that Controller nodes with OVN services or dedicated Networker nodes have been configured as gateways for OVS.

    [heat-admin@controller-0 ~]$ sudo ovs-vsctl get Open_Vswitch .
    ...OS::TripleO::Services::NeutronDhcpAgent: OS::Heat::None

    Expect output similar to the following example.

    external_ids:ovn-cms-options
        enable-chassis-as-gw

Additional verification steps for SR-IOV deployments

  1. Ensure that neutron_sriov_agent is running on compute nodes.

    [heat-admin@controller-0 ~]sudo podman ps | grep neutron_sriov_agent

    Expect output similar to the following example.

    f54cbbf4523a  undercloud-0.ctlplane.localdomain:8787/rh-osbs/rhosp16-openstack-neutron-sriov-agent:16.2_20200813.1
    kolla_start  23 hours ago  Up 21 hours ago         neutron_sriov_agent
  2. Ensure that network-available SR-IOV NICs have been successfully detected.

    [heat-admin@controller-0 ~]$ sudo podman exec -uroot galera-bundle-podman-0 mysql nova -e 'select hypervisor_hostname,pci_stats from compute_nodes;'

    Expect output similar to the following example.

    computesriov-1.localdomain	{... {"dev_type": "type-PF", "physical_network": "datacentre", "trusted": "true"}, "count": 1}, ... {"dev_type": "type-VF", "physical_network": "datacentre", "trusted": "true", "parent_ifname": "enp7s0f3"}, "count": 5}, ...}
    computesriov-0.localdomain	{... {"dev_type": "type-PF", "physical_network": "datacentre", "trusted": "true"}, "count": 1}, ... {"dev_type": "type-VF", "physical_network": "datacentre", "trusted": "true", "parent_ifname": "enp7s0f3"}, "count": 5}, ...}

2.4. SR-IOV with ML2/OVN and native OVN DHCP

You can deploy a custom role to use SR-IOV in an ML2/OVN deployment with native OVN DHCP. See Deploying a custom role with ML2/OVN

Limitations

The following limitations apply to the use of SR-IOV with ML2/OVN and native OVN DHCP in this release.

  • All external ports are scheduled on a single gateway node because there is only one HA Chassis Group for all of the ports.
  • North/south routing on VF(direct) ports on VLAN tenant networks does not work with SR-IOV because the external ports are not colocated with the logical router’s gateway ports. See https://bugs.launchpad.net/neutron/+bug/1875852.

2.5. High Availability with pacemaker and DVR

You can choose one of two ovn-dbs profiles: the base profile, ovn-dbs-container, and the pacemaker high availability (HA) profile, ovn-dbs-container-puppet.

With the pacemaker HA profile enabled, ovsdb-server runs in master-slave mode, managed by pacemaker and the resource agent Open Cluster Framework (OCF) script. The OVN database servers start on all the Controllers, and pacemaker then selects one controller to serve in the master role. The instance of ovsdb-server that runs in master mode can write to the database, while all the other slave ovsdb-server services replicate the database locally from the master, and can not write to the database.

The YAML file for this profile is the tripleo-heat-templates/environments/services/neutron-ovn-dvr-ha.yaml file.

When enabled, the OVN database servers are managed by Pacemaker, and `puppet-tripleo` creates a pacemaker OCF resource named `ovn:ovndb-servers`.

The OVN database servers are started on each Controller node, and the controller owning the virtual IP address (OVN_DBS_VIP) runs the OVN DB servers in master mode. The OVN ML2 mechanism driver and ovn-controller then connect to the database servers using the OVN_DBS_VIP value. In the event of a failover, Pacemaker moves the virtual IP address (OVN_DBS_VIP) to another controller, and also promotes the OVN database server running on that node to master.

2.6. Layer 3 high availability with OVN

OVN supports Layer 3 high availability (L3 HA) without any special configuration. OVN automatically schedules the router port to all available gateway nodes that can act as an L3 gateway on the specified external network. OVN L3 HA uses the gateway_chassis column in the OVN Logical_Router_Port table. Most functionality is managed by OpenFlow rules with bundled active_passive outputs. The ovn-controller handles the Address Resolution Protocol (ARP) responder and router enablement and disablement. Gratuitous ARPs for FIPs and router external addresses are also periodically sent by the ovn-controller.

Note

L3HA uses OVN to balance the routers back to the original gateway nodes to avoid any nodes becoming a bottleneck.

BFD monitoring

OVN uses the Bidirectional Forwarding Detection (BFD) protocol to monitor the availability of the gateway nodes. This protocol is encapsulated on top of the Geneve tunnels established from node to node.

Each gateway node monitors all the other gateway nodes in a star topology in the deployment. Gateway nodes also monitor the compute nodes to let the gateways enable and disable routing of packets and ARP responses and announcements.

Each compute node uses BFD to monitor each gateway node and automatically steers external traffic, such as source and destination Network Address Translation (SNAT and DNAT), through the active gateway node for a given router. Compute nodes do not need to monitor other compute nodes.

Note

External network failures are not detected as would happen with an ML2-OVS configuration.

L3 HA for OVN supports the following failure modes:

  • The gateway node becomes disconnected from the network (tunneling interface).
  • ovs-vswitchd stops (ovs-switchd is responsible for BFD signaling)
  • ovn-controller stops (ovn-controller removes itself as a registered node).
Note

This BFD monitoring mechanism only works for link failures, not for routing failures.