Chapter 2. Migrating the ML2 mechanism driver from OVS to OVN

2.1. Preparing the environment for migration of the ML2 mechanism driver from OVS to OVN

Environment assessment and preparation is critical to a successful migration. Your Red Hat Technical Account Manager or Global Professional Services will guide you through these steps.

Prerequisites

  • Your deployment is updated to the latest RHOSP 16.2 version. In other words, if you need to upgrade or update your OpenStack version, perform the upgrade or update first, and then perform the ML2/OVS to ML2/OVN migration.
  • At least one IP address is available for each subnet pool.

    The OVN mechanism driver creates a metadata port for each subnet. Each metadata port claims an IP address from the IP address pool. You can check subnet IP availability as shown in the example after this list.

  • You have worked with your Red Hat Technical Account Manager or Global Professional Services to plan the migration and have filed a proactive support case. See How to submit a Proactive Case.
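
To confirm that each subnet has at least one free IP address for the metadata port, you can query IP availability on the overcloud. This is a minimal sketch; provider-net is an example network name, so substitute your own networks:

  $ source ~/overcloudrc
  $ openstack ip availability show provider-net

Compare the used_ips and total_ips values reported for each subnet.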

Procedure

  1. Create an ML2/OVN stage deployment to obtain the baseline configuration of your target ML2/OVN deployment and test the feasibility of the target deployment.

    Design the stage deployment with the same basic roles, routing, and topology as the planned post-migration production deployment. Save the overcloud-deploy.sh file and any files referenced by the deployment, such as environment files. You need these files later in this procedure to configure the migration target environment.

    Note

    Use these files only for creation of the stage deployment and in the migration. Do not re-use them after the migration.

  2. If your ML2/OVS deployment uses VXLAN or GRE project networks, schedule a waiting period of up to 24 hours after the setup-mtu-t1 step.

    • This waiting period allows the VM instances to renew their DHCP leases and receive the new MTU value. During this time you might need to manually set MTUs on some instances and reboot some instances.
    • The 24-hour estimate is based on the default dhcp_lease_duration value of 86400 seconds. The actual time depends on the dhcp_renewal_time value in /var/lib/config-data/puppet-generated/neutron/etc/neutron/dhcp_agent.ini and the dhcp_lease_duration value in /var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf.
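
    You can check the current values of these parameters on a node where the DHCP agent runs. This is a hedged example; if a command returns no output, the parameter is not explicitly set and the default value applies:

    sudo grep -E '^dhcp_renewal_time' /var/lib/config-data/puppet-generated/neutron/etc/neutron/dhcp_agent.ini
    sudo grep -E '^dhcp_lease_duration' /var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf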
  3. Install python3-networking-ovn-migration-tool.

    sudo dnf install python3-networking-ovn-migration-tool @container-tools

    The @container-tools argument also installs the container tools if they are not already present.
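
    To confirm that the migration tool and the container tools are installed, you can query them. For example:

    rpm -q python3-networking-ovn-migration-tool
    podman --version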

  4. Create a directory on the undercloud, and copy the Ansible playbooks:

    mkdir ~/ovn_migration
    cd ~/ovn_migration
    cp -rfp /usr/share/ansible/networking-ovn-migration/playbooks .
  5. Copy your ML2/OVN stage deployment files to the migration home directory, such as ~/ovn_migration.

    The stage migration deployment files include overcloud-deploy.sh and any files referenced by the deployment, such as environment files. Rename the copy of overcloud-deploy.sh to overcloud-deploy-ovn.sh. Use this script for migration only. Do not use it for other purposes.
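
    For example, assuming the stage deployment files are in a directory named ~/stage-deployment (an example path; substitute your own location), you can copy and rename them as follows:

    cp ~/stage-deployment/overcloud-deploy.sh ~/ovn_migration/overcloud-deploy-ovn.sh
    cp ~/stage-deployment/*.yaml ~/ovn_migration/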

  6. Find your migration scenario in the following list and perform the appropriate steps to customize the openstack deploy command in overcloud-deploy-ovn.sh.

    Scenario 1: DVR to DVR, compute nodes have connectivity to the external network
    • Add the following environment files to the openstack deploy command in overcloud-deploy-ovn.sh. Add them in the order shown. This command example uses the default neutron-ovn-dvr-ha.yaml file. If you use a different file, replace the file name in the command.

      -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-dvr-ha.yaml \
      -e $HOME/ovn-extras.yaml
    Scenario 2: Centralized routing to centralized routing (no DVR)
    • If your deployment uses SR-IOV, add the service definition OS::TripleO::Services::OVNMetadataAgent to the Controller role in the file roles_data.yaml.
    • Preserve the pre-migration custom bridge mappings.

      • Run this command on a networker or combined networker/controller node to get the current bridge mappings:

        sudo podman exec -it neutron_ovs_agent crudini --get /etc/neutron/plugins/ml2/openvswitch_agent.ini ovs bridge_mappings

        Example output

        datacentre:br-ex,tenant:br-isolated
      • On the undercloud, create an environment file for the bridge mappings: /home/stack/neutron_bridge_mappings.yaml.
      • Set the defaults in the environment file. For example:

        parameter_defaults:
          ComputeParameters:
            NeutronBridgeMappings: "datacentre:br-ex,tenant:br-isolated"
    • Add the following environment files to the openstack deploy command in overcloud-deploy-ovn.sh. Add them in the order shown. If your environment does not use SR-IOV, omit the neutron-ovn-sriov.yaml file. The file ovn-extras.yaml does not exist yet but it is created by the script ovn_migration.sh before the openstack deploy command is run.

      -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-sriov.yaml \
      -e /home/stack/ovn-extras.yaml  \
      -e /home/stack/neutron_bridge_mappings.yaml
    • Leave any custom network modifications the same as they were before migration.
    Scenario 3: Centralized routing to DVR, with Geneve type driver, and compute nodes connected to external networks through br-ex
    Warning

    If your ML2/OVS deployment uses centralized routing and VLAN project (tenant) networks, do not migrate to ML2/OVN with DVR. You can migrate to ML2/OVN with centralized routing. To track progress on this limitation, see https://bugzilla.redhat.com/show_bug.cgi?id=1766930.

    • Ensure that compute nodes are connected to the external network through the br-ex bridge. For example, in an environment file such as compute-dvr.yaml, set the following:

      type: ovs_bridge
      # Defaults to br-ex; anything else requires specific
      # bridge mapping entries for it to be used.
      name: bridge_name
      use_dhcp: false
      members:
      - type: interface
        name: nic3
        # force the MAC address of the bridge to this interface
        primary: true
  7. Ensure that all users have execution privileges on the file overcloud-deploy-ovn.sh. The script requires execution privileges during the migration process.

    $ chmod a+x ~/overcloud-deploy-ovn.sh
  8. Use export commands to set the following migration-related environment variables. For example:

    $ export PUBLIC_NETWORK_NAME=my-public-network
    • STACKRC_FILE - the stackrc file in your undercloud.

      Default: ~/stackrc

    • OVERCLOUDRC_FILE - the overcloudrc file in your undercloud.

      Default: ~/overcloudrc

    • OVERCLOUD_OVN_DEPLOY_SCRIPT - the deployment script.

      Default: ~/overcloud-deploy-ovn.sh

    • PUBLIC_NETWORK_NAME - the name of your public network.

      Default: public.

    • IMAGE_NAME - the name or ID of the glance image to use to boot a test server.

      Default: cirros.

      The image is automatically downloaded during the pre-validation / post-validation process.

    • VALIDATE_MIGRATION - Create migration resources to validate the migration. The migration script boots a test server before the migration and validates that the server is reachable after the migration.

      Default: True.

      Warning

      Migration validation requires at least two available floating IP addresses, two networks, two subnets, two instances, and two routers as admin.

      Also, the network specified by PUBLIC_NETWORK_NAME must have available floating IP addresses, and you must be able to ping them from the undercloud.

      If your environment does not meet these requirements, set VALIDATE_MIGRATION to False.

    • SERVER_USER_NAME - User name to use for logging in to the migration instances.

      Default: cirros.

    • DHCP_RENEWAL_TIME - DHCP renewal time, in seconds, to configure in the DHCP agent configuration file.

      Default: 30
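
    For example, to set all of the variables explicitly to their default values, adjusting them as needed for your environment:

    $ export STACKRC_FILE=~/stackrc
    $ export OVERCLOUDRC_FILE=~/overcloudrc
    $ export OVERCLOUD_OVN_DEPLOY_SCRIPT=~/overcloud-deploy-ovn.sh
    $ export PUBLIC_NETWORK_NAME=public
    $ export IMAGE_NAME=cirros
    $ export VALIDATE_MIGRATION=True
    $ export SERVER_USER_NAME=cirros
    $ export DHCP_RENEWAL_TIME=30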

  9. Ensure that you are in the ovn_migration directory, for example ~/ovn_migration, and run the command ovn_migration.sh generate-inventory to generate the inventory file hosts_for_migration and the ansible.cfg file.

    $ ovn_migration.sh generate-inventory   | sudo tee -a /var/log/ovn_migration_output.txt
  10. Review the hosts_for_migration file for accuracy.

    1. Ensure the lists match your environment.
    2. Ensure there are ovn controllers on each node.
    3. Ensure there are no list headings (such as [ovn-controllers]) that do not have list items under them.
    4. From the ovn_migration directory, run the command ansible -i hosts_for_migration -m ping all to verify Ansible connectivity to all hosts.
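
      If the inventory is correct, the Ansible ping module reports SUCCESS for each host. The host names in this example output are illustrative:

      $ ansible -i hosts_for_migration -m ping all
      overcloud-controller-0 | SUCCESS => {
          "changed": false,
          "ping": "pong"
      }
      overcloud-novacompute-0 | SUCCESS => {
          "changed": false,
          "ping": "pong"
      }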
  11. If your original deployment uses VXLAN or GRE, you need to adjust maximum transmission unit (MTU) values. Proceed to Adjusting MTU for migration from the OVS mechanism driver to the OVN mechanism driver.

    If your original deployment uses VLAN networks, you can skip the MTU adjustments and proceed to Preparing container images for migration from the OVS mechanism driver to the OVN mechanism driver.

2.2. Adjusting MTU for migration of the ML2 mechanism driver from OVS to OVN

If you are migrating from RHOSP 16.2 with the OVS mechanism driver with VXLAN or GRE project networks to the OVN mechanism driver with Geneve, you must ensure that the maximum transmission unit (MTU) settings are smaller than or equal to the smallest MTU in the network.

If your current deployment uses VLAN instead of VXLAN or GRE, skip this procedure and proceed to Preparing container images for migration from the OVS mechanism driver to the OVN mechanism driver.

Prerequisites

Procedure

  1. Run ovn_migration.sh setup-mtu-t1. This command lowers the T1 parameter of the internal neutron DHCP servers by setting dhcp_renewal_time in /var/lib/config-data/puppet-generated/neutron/etc/neutron/dhcp_agent.ini on all nodes where the DHCP agent is running.

    $ ovn_migration.sh setup-mtu-t1   | sudo tee -a /var/log/ovn_migration_output.txt
  2. If your original OVS deployment uses VXLAN or GRE project networking, wait until the DHCP leases have been renewed on all VM instances. This can take up to 24 hours depending on lease renewal settings and the number of instances.
  3. Verify that the T1 parameter has propagated to existing VMs.

    • Connect to one of the compute nodes.
    • Run tcpdump on one of the VM tap interfaces attached to a project network.

      If T1 propagation is successful, expect to see requests occur approximately every 30 seconds:

      [heat-admin@overcloud-novacompute-0 ~]$ sudo tcpdump -i tap52e872c2-e6 port 67 or port 68 -n
      tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
      listening on tap52e872c2-e6, link-type EN10MB (Ethernet), capture size 262144 bytes
      13:17:28.954675 IP 192.168.99.5.bootpc > 192.168.99.3.bootps: BOOTP/DHCP, Request from fa:16:3e:6b:41:3d, length 300
      13:17:28.961321 IP 192.168.99.3.bootps > 192.168.99.5.bootpc: BOOTP/DHCP, Reply, length 355
      13:17:56.241156 IP 192.168.99.5.bootpc > 192.168.99.3.bootps: BOOTP/DHCP, Request from fa:16:3e:6b:41:3d, length 300
      13:17:56.249899 IP 192.168.99.3.bootps > 192.168.99.5.bootpc: BOOTP/DHCP, Reply, length 355
      Note

      This verification is not possible with cirros VMs. The cirros udhcpc implementation does not respond to DHCP option 58 (T1). Try this verification on a port that belongs to a full Linux VM. Red Hat recommends that you check all the different operating systems represented in your workloads, such as variants of Windows and Linux distributions.

  4. If any VM instances were not updated to reflect the change to the T1 parameter of DHCP, reboot them.
  5. Lower the MTU of the pre-migration VXLAN and GRE networks:

    $ ovn_migration.sh reduce-mtu   | sudo tee -a /var/log/ovn_migration_output.txt

     This step reduces the MTU network by network and tags each completed network with adapted_mtu. The tool acts only on VXLAN and GRE networks. It does not change any values if your deployment has only VLAN project networks.
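
     To confirm the result for a particular project network, you can check its MTU value and tags. This is a sketch; my-project-network is an example name:

     $ source ~/overcloudrc
     $ openstack network show my-project-network -c mtu -c tags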

  6. If you have any instances with static IP assignment on VXLAN or GRE project networks, manually modify the configuration of those instances to configure the new Geneve MTU, which is the current VXLAN MTU minus 8 bytes. For example, if the VXLAN-based MTU was 1450, change it to 1442. See the example after the following note.

    Note

    Perform this step only if you have manually provided static IP assignments and MTU settings on VXLAN or GRE project networks. By default, DHCP provides the IP assignment and MTU settings.
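
     If you do need to change the MTU manually inside an instance, the following is a minimal sketch that assumes a Linux guest whose project network interface is eth0 (an example name). Adjust the interface name and make the change persistent in the guest's own network configuration:

     $ sudo ip link set dev eth0 mtu 1442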

  7. Proceed to Preparing container images for migration from the OVS mechanism driver to the OVN mechanism driver.

2.3. Preparing container images for migration of the ML2 mechanism driver from OVS to OVN

Environment assessment and preparation is critical to a successful migration. Your Red Hat Technical Account Manager or Global Professional Services will guide you through these steps.

Prerequisites

Procedure

  1. Prepare the new container images for use after the migration to ML2/OVN.

    1. Create the containers-prepare-parameter.yaml file in the home directory if it is not already present:

      $ test -f $HOME/containers-prepare-parameter.yaml || sudo openstack tripleo container image prepare default \
      --output-env-file $HOME/containers-prepare-parameter.yaml
    2. Verify that containers-prepare-parameter.yaml is present at the end of your $HOME/overcloud-deploy-ovn.sh and $HOME/overcloud-deploy.sh files.
    3. Change the neutron_driver in the containers-prepare-parameter.yaml file to ovn:

      $ sed -i -E 's/neutron_driver:([ ]\w+)/neutron_driver: ovn/' $HOME/containers-prepare-parameter.yaml
    4. Verify the changes to the neutron_driver:

      $ grep neutron_driver $HOME/containers-prepare-parameter.yaml
      neutron_driver: ovn
    5. Update the images:

      $ sudo openstack tripleo container image prepare \
      --environment-file /home/stack/containers-prepare-parameter.yaml
      Note

      Provide the full path to your containers-prepare-parameter.yaml file. Otherwise, the command completes very quickly without updating the image list or providing an error message.

  2. On the undercloud, validate the updated images.

    • Log in to the undercloud as the user stack, source the stackrc file, and list the OVN container images:

      $ source ~/stackrc
      $ openstack tripleo container image list | grep '\-ovn'

    Your list should resemble the following example. It includes containers for the OVN databases, OVN controller, the metadata agent, and the neutron server agent.

    docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-northd:16.2_20211110.2
    docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-sb-db-server:16.2_20211110.2
    docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-controller:16.2_20211110.2
    docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-server-ovn:16.2_20211110.2
    docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-nb-db-server:16.2_20211110.2
    docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-metadata-agent-ovn:16.2_20211110.2
  3. Proceed to Migrating from ML2/OVS to ML2/OVN.

2.4. Migrating the ML2 mechanism driver from OVS to OVN

The ovn_migration.sh script performs environment setup, migration, and cleanup tasks related to the in-place migration of the ML2 mechanism driver from OVS to OVN.

Prerequisites

Procedure

  1. Stop all operations that interact with the Networking Service (neutron) API, such as creating new networks, subnets, or routers, or migrating virtual machine instances between compute nodes.

    Interaction with Networking API during migration can cause undefined behavior. You can restart the API operations after completing the migration.

  2. Run ovn_migration.sh start-migration to begin the migration process. The tee command creates a copy of the script output for troubleshooting purposes.

    $ ovn_migration.sh start-migration  | sudo tee -a /var/log/ovn_migration_output.txt

Result

The script performs the following actions.

  • Creates pre-migration resources (network and VM) to validate the existing deployment and the final migration.
  • Updates the overcloud stack to deploy OVN alongside reference implementation services using the temporary bridge br-migration instead of br-int. The temporary bridge helps to limit downtime during migration.
  • Generates the OVN northbound database by running neutron-ovn-db-sync-util. The utility examines the Neutron database to create equivalent resources in the OVN northbound database.
  • Clones the existing resources from br-int to br-migration, to allow OVN to find the same resource UUIDs over br-migration.
  • Re-assigns ovn-controller to br-int instead of br-migration.
  • Removes node resources that are not used by ML2/OVN, including the following.

    • Cleans up network namespaces (fip, snat, qrouter, qdhcp).
    • Removes any unnecessary patch ports on br-int.
    • Removes br-tun and br-migration ovs bridges.
    • Deletes ports from br-int that begin with qr-, ha-, and qg- (using neutron-netns-cleanup).
  • Deletes Networking Service (neutron) agents and Networking Service HA internal networks from the database through the Networking Service API.
  • Validates connectivity on pre-migration resources.
  • Deletes pre-migration resources.
  • Creates post-migration resources.
  • Validates connectivity on post-migration resources.
  • Cleans up post-migration resources.
  • Re-runs the deployment tool to update OVN on br-int.