Chapter 13. Configuring distributed virtual routing (DVR)
13.1. Understanding distributed virtual routing (DVR)
When you deploy Red Hat OpenStack Platform, you can choose between a centralized routing model and DVR.
Each model has advantages and disadvantages. Use this document to carefully plan whether centralized routing or DVR better suits your needs.
New default RHOSP deployments use DVR and the Modular Layer 2 plug-in with the Open Virtual Network mechanism driver (ML2/OVN).
DVR is disabled by default in ML2/OVS deployments.
13.1.1. Overview of Layer 3 routing
The Red Hat OpenStack Platform Networking service (neutron) provides routing services for project networks. Without a router, VM instances in a project network can communicate with other instances over a shared L2 broadcast domain. Creating a router and assigning it to a project network allows the instances in that network to communicate with other project networks or upstream (if an external gateway is defined for the router).
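As a rough sketch of that workflow, the following shell function creates a router, attaches a project subnet to it, and optionally sets an external gateway. The router, subnet, and network names are hypothetical placeholders, and the sketch assumes that the openstack client is installed and that project credentials are already sourced.

```shell
# Sketch only: router1, private-subnet, and public are hypothetical names.
# Creates a router, attaches a project subnet, and (optionally) sets an
# external gateway so that instances can reach upstream networks.
create_project_router() (
    router="$1"      # name of the router to create
    subnet="$2"      # existing project subnet to attach
    ext_net="$3"     # existing external network; leave empty for none

    openstack router create "$router" || exit 1
    openstack router add subnet "$router" "$subnet" || exit 1
    if [ -n "$ext_net" ]; then
        openstack router set --external-gateway "$ext_net" "$router" || exit 1
    fi
)

# Example call (not run here):
#   create_project_router router1 private-subnet public
```

Omitting the third argument leaves the router without an external gateway, so it routes only between the project networks attached to it.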
13.1.2. Routing flows
Routing services in Red Hat OpenStack Platform (RHOSP) can be categorized into three main flows:
- East-West routing - routing of traffic between different networks in the same project. This traffic does not leave the RHOSP deployment. This definition applies to both IPv4 and IPv6 subnets.
- North-South routing with floating IPs - Floating IP addressing is a one-to-one network address translation (NAT) that can be modified and that floats between VM instances. While floating IPs are modeled as a one-to-one association between the floating IP and a Networking service (neutron) port, they are implemented by association with a Networking service router that performs the NAT. The floating IPs themselves are taken from the uplink network that provides the router with external connectivity. As a result, instances can communicate with external resources (such as endpoints on the internet) or the other way around. Floating IPs are an IPv4 concept and do not apply to IPv6. It is assumed that the IPv6 addressing used by projects uses Global Unicast Addresses (GUAs) with no overlap across the projects, and therefore can be routed without NAT.
- North-South routing without floating IPs (also known as SNAT) - The Networking service offers a default port address translation (PAT) service for instances that do not have allocated floating IPs. With this service, instances can communicate with external endpoints through the router, but not the other way around. For example, an instance can browse a website on the internet, but a web browser outside cannot browse a website hosted within the instance. SNAT is applied for IPv4 traffic only. In addition, Networking service networks that are assigned GUA prefixes do not require NAT on the Networking service router external gateway port to access the outside world.
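To make the floating IP flow more concrete, the following sketch allocates a floating IP from an external network and associates it with an instance. The network and instance names are hypothetical, and the sketch assumes that the instance already has a port on a project network whose router has a gateway on that external network.

```shell
# Sketch only: the network and server names passed in are hypothetical.
# Allocates a floating IP from the external network and associates it
# with an instance, then prints the allocated address.
assign_floating_ip() (
    ext_net="$1"     # external (uplink) network that provides floating IPs
    server="$2"      # instance to associate the floating IP with

    # -f value -c floating_ip_address prints only the allocated address.
    fip=$(openstack floating ip create -f value -c floating_ip_address "$ext_net") || exit 1
    [ -n "$fip" ] || exit 1
    openstack server add floating ip "$server" "$fip" || exit 1
    echo "$fip"
)

# Example call (not run here):
#   assign_floating_ip public instance1
```

An instance without a floating IP still reaches external endpoints through SNAT, but external hosts cannot initiate connections to it.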
13.1.3. Centralized routing
Originally, the Networking service (neutron) was designed with a centralized routing model where a project’s virtual routers, managed by the neutron L3 agent, are all deployed in a dedicated node or cluster of nodes (referred to as the Network node, or Controller node). This means that each time a routing function is required (east/west, floating IPs or SNAT), traffic would traverse through a dedicated node in the topology. This introduced multiple challenges and resulted in sub-optimal traffic flows. For example:
- Traffic between instances flows through a Controller node - when two instances need to communicate with each other using L3, traffic has to hit the Controller node. Even if the instances are scheduled on the same Compute node, traffic still has to leave the Compute node, flow through the Controller, and route back to the Compute node. This negatively impacts performance.
- Instances with floating IPs receive and send packets through the Controller node - the external network gateway interface is available only at the Controller node, so whether the traffic is originating from an instance, or destined to an instance from the external network, it has to flow through the Controller node. Consequently, in large environments the Controller node is subject to heavy traffic load. This affects performance and scalability, and also requires careful planning to accommodate enough bandwidth in the external network gateway interface. The same requirement applies for SNAT traffic.
To better scale the L3 agent, the Networking service can use the L3 HA feature, which distributes the virtual routers across multiple nodes. If a Controller node is lost, the HA router fails over to a standby on another node, and packets are lost until the failover completes.
13.2. DVR overview
Distributed Virtual Routing (DVR) offers an alternative routing design. DVR isolates the failure domain of the Controller node and optimizes network traffic by deploying the L3 agent and scheduling routers on every Compute node. DVR has these characteristics:
- East-West traffic is routed directly on the Compute nodes in a distributed fashion.
- North-South traffic with floating IP is distributed and routed on the Compute nodes. This requires the external network to be connected to every Compute node.
- North-South traffic without floating IP is not distributed and still requires a dedicated Controller node.
The L3 agent on the Controller node uses the dvr_snat mode so that the node serves only SNAT traffic.
- The neutron metadata agent is distributed and deployed on all Compute nodes. The metadata proxy service is hosted on all the distributed routers.
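In ML2/OVS deployments, the L3 agent mode is set by the agent_mode option in the L3 agent configuration file. As a minimal sketch, the following function prints that value from a file that you pass in. The location of the file depends on how the deployment is packaged and is not specified in this chapter, so the path in the example call is only a placeholder.

```shell
# Sketch only: prints the agent_mode value from an L3 agent configuration
# file. The file path is supplied by the caller; its location varies by
# deployment and is an assumption, not something this chapter defines.
l3_agent_mode() (
    conf="$1"
    # Match "agent_mode = value", skipping commented-out lines, and
    # print the value from the last matching line.
    sed -n 's/^[[:space:]]*agent_mode[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" | tail -n 1
)

# Example call (not run here; the path is a placeholder):
#   l3_agent_mode /path/to/l3_agent.ini
```

On a Controller node in a DVR deployment, the expected value is dvr_snat. In upstream neutron, the corresponding value on Compute nodes is dvr.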
13.3. DVR known issues and caveats
- Support for DVR is limited to the ML2 core plug-in and the Open vSwitch (OVS) mechanism driver or ML2/OVN mechanism driver. Other back ends are not supported.
- On ML2/OVS DVR deployments, network traffic for the Red Hat OpenStack Platform Load-balancing service (octavia) goes through the Controller and Network nodes, instead of the Compute nodes.
With an ML2/OVS mechanism driver network back end and DVR, it is possible to create VIPs. However, the IP address assigned to a bound port using allowed_address_pairs should match the virtual port IP address (/32).
If you use a CIDR format IP address for the bound port allowed_address_pairs instead, port forwarding is not configured in the back end, and traffic fails for any IP in the CIDR expecting to reach the bound IP port.
- SNAT (source network address translation) traffic is not distributed, even when DVR is enabled. SNAT does work, but all ingress/egress traffic must traverse through the centralized Controller node.
- In ML2/OVS deployments, IPv6 traffic is not distributed, even when DVR is enabled. All ingress/egress traffic goes through the centralized Controller node. If you use IPv6 routing extensively with ML2/OVS, do not use DVR.
Note that in ML2/OVN deployments, all east/west traffic is always distributed, and north/south traffic is distributed when DVR is configured.
- In ML2/OVS deployments, DVR is not supported in conjunction with L3 HA. If you use DVR with Red Hat OpenStack Platform 17.1 director, L3 HA is disabled. This means that routers are still scheduled on the Network nodes (and load-shared between the L3 agents), but if one agent fails, all routers hosted by this agent fail as well. This affects only SNAT traffic. The allow_automatic_l3agent_failover feature is recommended in such cases, so that if one Network node fails, the routers are rescheduled to a different node.
- For ML2/OVS environments, the DHCP server is not distributed and is deployed on a Controller node. The ML2/OVS neutron DHCP agent, which manages the DHCP server, is deployed in a highly available configuration on the Controller nodes, regardless of the routing design (centralized or DVR).
- Compute nodes require an interface on the external network attached to an external bridge. They use this interface to attach to a VLAN or flat network for an external router gateway, to host floating IPs, and to perform SNAT for VMs that use floating IPs.
- In ML2/OVS deployments, each Compute node requires one additional IP address. This is due to the implementation of the external gateway port and the floating IP network namespace.
- VLAN, GRE, and VXLAN are all supported for project data separation. When you use GRE or VXLAN, you must enable the L2 Population feature. The Red Hat OpenStack Platform director enforces L2 Population during installation.
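To illustrate the allowed_address_pairs caveat in this list, the following sketch adds a VIP address to a bound port only when the address is a single host address. It rejects wider CIDR ranges before calling the client. The port name and addresses in the example call are hypothetical, and the guard itself is an illustration rather than a documented requirement of the client.

```shell
# Sketch only: accepts a bare IPv4 address or an explicit /32, and rejects
# wider CIDR ranges before calling the client.
add_vip_address_pair() (
    port="$1"    # bound port that should also answer for the VIP
    vip="$2"     # VIP address, for example 192.0.2.10 or 192.0.2.10/32

    case "$vip" in
        */32) ;;                      # explicit host prefix is accepted
        */*)  echo "refusing CIDR range: $vip" >&2; exit 1 ;;
        *)    ;;                      # bare address is treated as a host address
    esac
    openstack port set --allowed-address "ip-address=$vip" "$port"
)

# Example call (not run here):
#   add_vip_address_pair port1 192.0.2.10/32
```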
13.4. Supported routing architectures
Red Hat OpenStack Platform (RHOSP) supports both centralized, high-availability (HA) routing and distributed virtual routing (DVR) in the RHOSP versions listed:
- RHOSP centralized HA routing support began in RHOSP 8.
- RHOSP distributed routing support began in RHOSP 12.
13.5. Migrating centralized routers to distributed routing
This section contains information about upgrading to distributed routing for Red Hat OpenStack Platform deployments that use L3 HA centralized routing.
- Upgrade your deployment and validate that it is working correctly.
- Run the director stack update to configure DVR.
- Confirm that routing functions correctly through the existing routers.
- You cannot transition an L3 HA router to distributed directly. Instead, for each router, disable the L3 HA option, and then enable the distributed option:
Disable the router:
$ openstack router set --disable router1
Clear high availability:
$ openstack router set --no-ha router1
Configure the router to use DVR:
$ openstack router set --distributed router1
Enable the router:
$ openstack router set --enable router1
- Confirm that distributed routing functions correctly.
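When several routers need the same treatment, the per-router commands above can be wrapped in a small helper. The following is a sketch under the assumption that you pass the router names or IDs yourself. The function names are hypothetical and are not part of the openstack client.

```shell
# Sketch only: applies the disable, --no-ha, --distributed, and enable
# steps to one router, stopping at the first command that fails.
convert_router_to_dvr() (
    router="$1"
    openstack router set --disable "$router" || exit 1
    openstack router set --no-ha "$router" || exit 1
    openstack router set --distributed "$router" || exit 1
    openstack router set --enable "$router" || exit 1
)

# Converts every router name or ID given as an argument, one at a time.
convert_routers_to_dvr() {
    for router in "$@"; do
        convert_router_to_dvr "$router" || return 1
    done
}

# Example call (not run here):
#   convert_routers_to_dvr router1 router2
```

If a step fails partway through, the affected router can remain disabled, so check its state before you retry.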
13.6. Deploying ML2/OVN OpenStack with distributed virtual routing (DVR) disabled
New Red Hat OpenStack Platform (RHOSP) deployments default to the neutron Modular Layer 2 plug-in with the Open Virtual Network mechanism driver (ML2/OVN) and DVR.
In a DVR topology, compute nodes with floating IP addresses route traffic between virtual machine instances and the network that provides the router with external connectivity (north-south traffic). Traffic between instances (east-west traffic) is also distributed.
You can optionally deploy with DVR disabled. This disables north-south DVR, requiring north-south traffic to traverse a controller or networker node. East-west routing is always distributed in an ML2/OVN deployment, even when DVR is disabled.
- Prerequisite: a RHOSP 17.1 distribution that is ready for customization and deployment.
Create a custom environment file, and add the following configuration:
parameter_defaults:
  NeutronEnableDVR: false
To apply this configuration, deploy the overcloud, adding your custom environment file to the stack along with your other environment files. For example:
(undercloud) $ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<custom-environment-file>.yaml
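If you prefer to generate the custom environment file from a script, a minimal sketch might look like the following. The function name and the file name in the example call are hypothetical. The two-space indentation under parameter_defaults matters because the file is YAML.

```shell
# Sketch only: writes the DVR-disabled environment file to a path chosen
# by the caller. The function name is hypothetical.
write_dvr_disabled_env() {
    cat > "$1" <<'EOF'
parameter_defaults:
  NeutronEnableDVR: false
EOF
}

# Example call (not run here; the file name is a placeholder):
#   write_dvr_disabled_env /home/stack/templates/disable-dvr.yaml
```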
13.6.1. Additional resources
- Understanding distributed virtual routing (DVR) in the Configuring Red Hat OpenStack Platform networking guide.