Chapter 19. Configuring Layer 3 high availability (HA)

19.1. OpenStack Networking without high availability (HA)

OpenStack Networking deployments without any high availability (HA) features are vulnerable to physical node failures.

In a typical deployment, projects create virtual routers, which are scheduled to run on physical L3 agent nodes. If an L3 agent node fails, the virtual machines that depend on it lose connectivity to external networks, any floating IP addresses become unavailable, and connectivity is lost between the networks that the router hosts.

19.2. Overview of Layer 3 high availability (HA)

This active/passive high availability (HA) configuration uses the industry-standard Virtual Router Redundancy Protocol (VRRP), as defined in RFC 3768, to protect project routers and floating IP addresses. A virtual router is randomly scheduled across multiple OpenStack Networking nodes, with one instance designated as the active router and the remainder serving in a standby role.

Note

To deploy Layer 3 HA, you must maintain a similar configuration on the redundant OpenStack Networking nodes, including floating IP ranges and access to external networks.

In the following diagram, the active Router1 and Router2 routers are running on separate physical L3 agent nodes. Layer 3 HA has scheduled backup virtual routers on the corresponding nodes, ready to resume service if a physical node fails. When an L3 agent node fails, Layer 3 HA reschedules the affected virtual router and floating IP addresses to a working node:

[Figure: VRRP scheduling]

During a failover event, instance TCP sessions through floating IPs remain unaffected, and migrate to the new L3 node without disruption. Only SNAT traffic is affected by failover events.

The L3 agent is further protected when in an active/active HA mode.

19.3. Layer 3 high availability (HA) failover conditions

Layer 3 high availability (HA) automatically reschedules protected resources in the following events:

  • The L3 agent node shuts down or otherwise loses power because of a hardware failure.
  • The L3 agent node becomes isolated from the physical network and loses connectivity.

Note

Manually stopping the L3 agent service does not induce a failover event.

19.4. Project considerations for Layer 3 high availability (HA)

Layer 3 high availability (HA) configuration occurs in the back end and is invisible to the project. Projects can continue to create and manage their virtual routers as usual. However, there are some limitations to be aware of when you design your Layer 3 HA implementation:

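  • Layer 3 HA supports up to 255 virtual routers per project, because each HA router requires a unique VRRP virtual router ID (VRID), and VRIDs range from 1 to 255.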
  • Internal VRRP messages are transported within a separate internal network, created automatically for each project. This process occurs transparently to the user.

19.5. High availability (HA) changes to OpenStack Networking

The Neutron API has been updated to allow administrators to set the --ha or --no-ha flag when creating a router. This flag overrides the default l3_ha setting in the /var/lib/config-data/neutron/etc/neutron/neutron.conf file.
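
As a minimal sketch of the per-router override, the following shell functions create a router that is explicitly HA and then check the result. The function names and the router1 name are hypothetical. The check assumes administrator credentials and assumes that the openstack router show command reports the ha field as True or False.

```shell
# Create a router that is explicitly HA, regardless of the l3_ha default.
# The router name is passed as the first argument.
create_ha_router() {
    openstack router create --ha "$1"
}

# Hypothetical helper: succeed only if the router reports ha=True.
# Assumes administrator credentials, because the ha attribute is
# typically visible only to administrators.
router_is_ha() {
    [ "$(openstack router show "$1" -f value -c ha)" = "True" ]
}

# Example (router1 is a placeholder name):
# create_ha_router router1 && router_is_ha router1 && echo "router1 is HA"
```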

  • HA changes to neutron-server:

    • Layer 3 HA assigns the active role randomly, regardless of the scheduler used by OpenStack Networking (whether random or leastrouter).
    • The database schema has been modified to handle allocation of virtual IP addresses (VIPs) to virtual routers.
    • A transport network is created to direct Layer 3 HA traffic.
  • HA changes to the L3 agent:

    • A new keepalived manager has been added, providing load-balancing and HA capabilities.
    • IP addresses are converted to VIPs.

19.6. Enabling Layer 3 high availability (HA) on OpenStack Networking nodes

Complete the following steps to enable Layer 3 high availability (HA) on OpenStack Networking and L3 agent nodes.

  1. Configure Layer 3 HA in the /var/lib/config-data/neutron/etc/neutron/neutron.conf file by enabling L3 HA and defining the number of L3 agent nodes that you want to protect each virtual router:

    l3_ha = True
    max_l3_agents_per_router = 2
    min_l3_agents_per_router = 2

    L3 HA parameters:

    • l3_ha - When set to True, all virtual routers created from this point onwards default to HA (and not legacy) routers. Administrators can override the value for each router using the following option in the openstack router create command:

      # openstack router create --ha <router_name>

      or

      # openstack router create --no-ha <router_name>
    • max_l3_agents_per_router - Set this to a value between min_l3_agents_per_router and the total number of network nodes in your deployment.

      For example, if you deploy four OpenStack Networking nodes but set this parameter to 2, only two L3 agents protect each HA virtual router: one active, and one standby. In addition, each time a new L3 agent node is deployed, additional standby versions of the virtual routers are scheduled until the max_l3_agents_per_router limit is reached. As a result, you can scale out the number of standby routers by adding new L3 agents.

    • min_l3_agents_per_router - The minimum setting ensures that the HA rules remain enforced. This setting is validated when a virtual router is created, to ensure that enough L3 agent nodes are available to provide HA.

      For example, if you have two network nodes and min_l3_agents_per_router is set to 2, no new HA routers can be created while one node is unavailable, because at least min_l3_agents_per_router active L3 agents are required to create an HA router.

  2. Restart the neutron-server service to apply the changes:

    # systemctl restart neutron-server.service
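
To confirm which L3 HA values are set in the file, a small helper like the following can print them. This is a sketch only: the show_l3_ha_options name is hypothetical, and it assumes that each option appears as an uncommented "key = value" line.

```shell
# Hypothetical helper: print the L3 HA options set in a neutron.conf-style file.
# Commented-out lines (starting with #) are ignored.
show_l3_ha_options() {
    conf_file="$1"
    grep -E '^[[:space:]]*(l3_ha|max_l3_agents_per_router|min_l3_agents_per_router)[[:space:]]*=' "$conf_file"
}

# Example call against the path used in this chapter:
# show_l3_ha_options /var/lib/config-data/neutron/etc/neutron/neutron.conf
```

If the helper prints nothing, the options are not set explicitly in that file.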

19.7. Reviewing high availability (HA) node configurations

  • Run the ip address command within the virtual router namespace. The result includes an HA device, prefixed with ha-.

    # ip netns exec qrouter-b30064f9-414e-4c98-ab42-646197c74020 ip address
    <snip>
    2794: ha-45249562-ec: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 12:34:56:78:2b:5d brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.2/24 brd 169.254.0.255 scope global ha-45249562-ec
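
To pick out only the HA device names instead of reading the full listing, the output can be filtered. The following is a rough sketch: the list_ha_devices name is hypothetical, and it assumes the interface line format shown in the example above.

```shell
# Hypothetical helper: read "ip address" output on stdin and print the
# names of interfaces that start with ha-. Any "@ifN" suffix is removed.
list_ha_devices() {
    awk -F': ' '/^[ \t]*[0-9]+: ha-/ { sub(/@.*/, "", $2); print $2 }'
}

# Example, using a placeholder router ID:
# ip netns exec qrouter-<router_id> ip address | list_ha_devices
```

An empty result suggests that the namespace does not contain an HA device, which would be expected for a legacy (non-HA) router.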

With Layer 3 HA enabled, virtual routers and floating IP addresses are protected against individual node failure.