Chapter 20. Configuring Layer 3 high availability (HA)

20.1. RHOSP Networking service without high availability (HA)

Red Hat OpenStack Platform (RHOSP) Networking service deployments without any high availability (HA) features are vulnerable to physical node failures.

In a typical deployment, projects create virtual routers, which are scheduled to run on physical Networking service Layer 3 (L3) agent nodes. This becomes an issue when you lose an L3 agent node and the dependent virtual machines subsequently lose connectivity to external networks. Any floating IP addresses are also unavailable. In addition, connectivity is lost between any networks that the router hosts.

20.2. Overview of Layer 3 high availability (HA)

This active/passive high availability (HA) configuration uses the industry-standard Virtual Router Redundancy Protocol (VRRP), as defined in RFC 3768, to protect project routers and floating IP addresses. A virtual router is randomly scheduled across multiple Red Hat OpenStack Platform (RHOSP) Networking service nodes, with one designated as the active router, and the remainder serving in a standby role.

Note

To deploy Layer 3 (L3) HA, you must maintain a similar configuration on the redundant Networking service nodes, including floating IP ranges and access to external networks.

In the following diagram, the active Router1 and Router2 routers are running on separate physical L3 Networking service agent nodes. L3 HA has scheduled backup virtual routers on the corresponding nodes, ready to resume service in the case of a physical node failure. When the L3 agent node fails, L3 HA reschedules the affected virtual router and floating IP addresses to a working node:

[Figure: VRRP scheduling]

During a failover event, instance TCP sessions through floating IPs remain unaffected, and migrate to the new L3 node without disruption. Only SNAT traffic is affected by failover events.
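
The active/standby behavior described above can be illustrated with a small model. This is a minimal sketch, not the Networking service implementation: the HARouter class and the node names are hypothetical, the initial active node is simply the first in the list (the real selection is random), and the promotion rule stands in for the VRRP election.

```python
from dataclasses import dataclass, field


@dataclass
class HARouter:
    """Toy model of one HA virtual router hosted on several L3 agent nodes."""
    name: str
    nodes: list            # names of the nodes hosting an instance of this router
    active: str = field(init=False)

    def __post_init__(self) -> None:
        if not self.nodes:
            raise ValueError("an HA router needs at least one hosting node")
        # Assumption: pick the first node as active; the real choice is random.
        self.active = self.nodes[0]

    def standbys(self) -> list:
        """Return the nodes that currently hold a standby instance."""
        return [n for n in self.nodes if n != self.active]

    def fail_node(self, node: str) -> None:
        """Drop a failed node and promote a standby if the active one was lost."""
        if node not in self.nodes:
            return
        self.nodes.remove(node)
        if node == self.active:
            if not self.nodes:
                raise RuntimeError(f"{self.name}: no standby left to take over")
            # Assumption: promote the first remaining standby.
            self.active = self.nodes[0]


router1 = HARouter("Router1", ["node-a", "node-b", "node-c"])
router1.fail_node("node-a")
print(router1.active)  # node-b
```

Losing a standby node leaves the active instance untouched in this model. Only the loss of the active node changes which node serves the router.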

The L3 agent is further protected when in an active/active HA mode.

20.3. Layer 3 high availability (HA) failover conditions

Layer 3 (L3) high availability (HA) for the Red Hat OpenStack Platform (RHOSP) Networking service automatically reschedules protected resources in the following events:

  • The Networking service L3 agent node shuts down or otherwise loses power because of a hardware failure.
  • The L3 agent node becomes isolated from the physical network and loses connectivity.

Note

Manually stopping the L3 agent service does not induce a failover event.
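
The conditions in this section can be summarized as a simple lookup. The following sketch only restates the rules above. The NodeEvent names are hypothetical labels and do not correspond to any Networking service API.

```python
from enum import Enum


class NodeEvent(Enum):
    """Hypothetical labels for the events discussed in this section."""
    POWER_LOSS = "node shut down or lost power"
    NETWORK_ISOLATION = "node isolated from the physical network"
    AGENT_STOPPED_MANUALLY = "L3 agent service stopped by an operator"


# Events that cause L3 HA to reschedule protected resources.
FAILOVER_EVENTS = {NodeEvent.POWER_LOSS, NodeEvent.NETWORK_ISOLATION}


def triggers_failover(event: NodeEvent) -> bool:
    """Return True if the event is one of the documented failover conditions."""
    return event in FAILOVER_EVENTS


for event in NodeEvent:
    print(f"{event.value}: failover={triggers_failover(event)}")
```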

20.4. Project considerations for Layer 3 high availability (HA)

Red Hat OpenStack Platform (RHOSP) Networking service Layer 3 (L3) high availability (HA) configuration occurs in the back end and is invisible to the project. Projects can continue to create and manage their virtual routers as usual. However, there are some limitations to be aware of when designing your L3 HA implementation:

  • L3 HA supports up to 255 virtual routers per project.
  • Internal VRRP messages are transported within a separate internal network, created automatically for each project. This process occurs transparently to the user.
  • When implementing high availability (HA) routers on ML2/OVS, each L3 agent spawns haproxy and neutron-keepalived-state-change-monitor processes for each router. Each process consumes approximately 20 MB of memory. By default, each HA router resides on three L3 agents and consumes resources on each of the nodes. Therefore, when sizing your RHOSP networks, ensure that you have allocated enough memory to support the number of HA routers that you plan to implement.
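
The memory guidance above lends itself to a quick estimate. The following sketch multiplies out the figures quoted in the list (two processes per router per agent, approximately 20 MB each, three agents per router by default). The function names are hypothetical, and the result is a rough planning number, not a measured value.

```python
PROCESSES_PER_ROUTER = 2          # haproxy + neutron-keepalived-state-change-monitor
MB_PER_PROCESS = 20               # approximate figure quoted above
DEFAULT_AGENTS_PER_ROUTER = 3     # default number of L3 agents hosting each HA router
MAX_HA_ROUTERS_PER_PROJECT = 255  # limit stated above


def ha_router_memory_mb(routers: int,
                        agents_per_router: int = DEFAULT_AGENTS_PER_ROUTER) -> int:
    """Approximate memory (MB) used by HA router processes across all L3 agents."""
    if routers < 0 or agents_per_router < 1:
        raise ValueError("routers must be >= 0 and agents_per_router >= 1")
    return routers * agents_per_router * PROCESSES_PER_ROUTER * MB_PER_PROCESS


def within_project_limit(routers_in_project: int) -> bool:
    """Check a project's HA router count against the documented limit."""
    return 0 <= routers_in_project <= MAX_HA_ROUTERS_PER_PROJECT


print(ha_router_memory_mb(100))   # 12000
print(within_project_limit(256))  # False
```

With the defaults, 100 HA routers account for roughly 12000 MB in total. If every router is scheduled on the same three agents, that is about 4000 MB on each of those nodes.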

20.5. High availability (HA) changes to the RHOSP Networking service

The Red Hat OpenStack Platform (RHOSP) Networking service (neutron) API has been updated to allow administrators to set the --ha=True/False flag when creating a router, which overrides the default configuration of l3_ha in /var/lib/config-data/neutron/etc/neutron/neutron.conf.
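
The precedence between the per-router flag and the l3_ha default can be sketched as follows. The effective_ha function is a hypothetical illustration of the override rule described above, not code from the Networking service.

```python
from typing import Optional


def effective_ha(l3_ha_default: bool, requested: Optional[bool] = None) -> bool:
    """Return whether a new router is created as an HA router.

    requested is None when no HA flag is passed at router creation time,
    in which case the configured l3_ha default applies.
    """
    return l3_ha_default if requested is None else requested


print(effective_ha(l3_ha_default=True))                   # True
print(effective_ha(l3_ha_default=True, requested=False))  # False
```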

  • High availability (HA) changes to neutron-server:

    • Layer 3 (L3) HA assigns the active role randomly, regardless of the scheduler used by the Networking service (whether random or leastrouter).
    • The database schema has been modified to handle allocation of virtual IP addresses (VIPs) to virtual routers.
    • A transport network is created to direct L3 HA traffic.
  • HA changes to the Networking service L3 agent:

    • A new keepalived manager has been added, providing load-balancing and HA capabilities.
    • IP addresses are converted to VIPs.

20.6. Enabling Layer 3 high availability (HA) on RHOSP Networking service nodes

During installation, Red Hat OpenStack Platform (RHOSP) director enables high availability (HA) for virtual routers by default when you have at least two RHOSP Controllers and are not using distributed virtual routing (DVR). Using an RHOSP Orchestration service (heat) parameter, max_l3_agents_per_router, you can set the maximum number of RHOSP Networking service Layer 3 (L3) agents on which an HA router is scheduled.

Prerequisites

  • Your RHOSP deployment does not use DVR.
  • You have at least two RHOSP Controllers deployed.

Procedure

  1. Log in to the undercloud as the stack user, and source the stackrc file to enable the director command line tools.

    Example

    $ source ~/stackrc

  2. Create a custom YAML environment file.

    Example

    $ vi /home/stack/templates/my-neutron-environment.yaml

    Tip

    The Orchestration service (heat) uses a set of plans called templates to install and configure your environment. You can customize aspects of the overcloud with a custom environment file, which is a special type of template that provides customization for your heat templates.

  3. Set the NeutronL3HA parameter to true in the YAML environment file. This ensures HA is enabled even if director did not set it by default.

    parameter_defaults:
      NeutronL3HA: 'true'

  4. Set the maximum number of L3 agents on which an HA router is scheduled.

    Set the max_l3_agents_per_router parameter to a value between the minimum and total number of network nodes in your deployment. (A zero value indicates that the router is scheduled on every agent.)

    Example

    parameter_defaults:
      NeutronL3HA: 'true'
      ControllerExtraConfig:
        neutron::server::max_l3_agents_per_router: 2

    In this example, if you deploy four Networking service nodes, only two L3 agents protect each HA virtual router: one active, and one standby.

    If you set the value of max_l3_agents_per_router to be greater than the number of available network nodes, you can scale out the number of standby routers by adding new L3 agents. For every new L3 agent node that you deploy, the Networking service schedules additional standby versions of the virtual routers until the max_l3_agents_per_router limit is reached.

  5. Run the openstack overcloud deploy command and include the core heat templates, environment files, and this new custom environment file.

    Important

    The order of the environment files is important because the parameters and resources defined in subsequent environment files take precedence.

    Example

    $ openstack overcloud deploy --templates \
    -e [your-environment-files] \
    -e /home/stack/templates/my-neutron-environment.yaml

    Note

    When NeutronL3HA is set to true, all virtual routers that are created default to HA routers. When you create a router, you can override the HA option by including the --no-ha option in the openstack router create command:

    # openstack router create --no-ha <router-name>
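
The effect of max_l3_agents_per_router described in step 4 can be sketched as a small calculation. The function name is hypothetical, and the sketch only models the count of agents that host a router, not which agents the scheduler selects.

```python
def agents_hosting_router(max_l3_agents_per_router: int, available_agents: int) -> int:
    """Return how many L3 agents host one HA router (active plus standbys)."""
    if max_l3_agents_per_router < 0 or available_agents < 0:
        raise ValueError("both values must be non-negative")
    if max_l3_agents_per_router == 0:
        # A zero value means the router is scheduled on every agent.
        return available_agents
    # Otherwise the limit applies, capped by the agents that actually exist.
    return min(max_l3_agents_per_router, available_agents)


print(agents_hosting_router(2, 4))  # 2: one active, one standby
print(agents_hosting_router(5, 3))  # 3: grows as agents are added, up to 5
print(agents_hosting_router(0, 4))  # 4: scheduled on every agent
```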

20.7. Reviewing high availability (HA) RHOSP Networking service node configurations

Procedure

  • Run the ip address command within the virtual router namespace. The output includes a high availability (HA) device whose name is prefixed with ha-.

    # ip netns exec qrouter-b30064f9-414e-4c98-ab42-646197c74020 ip address
    <snip>
    2794: ha-45249562-ec: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state DOWN group default
        link/ether 12:34:56:78:2b:5d brd ff:ff:ff:ff:ff:ff
        inet 169.254.0.2/24 brd 169.254.0.255 scope global ha-45249562-ec
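
If you review many routers, you can pick the HA devices out of saved command output with a short script. The following is a minimal sketch that assumes the output format shown above. The find_ha_devices function is hypothetical, and the sample text is abbreviated from the example in this section.

```python
import re

# Match interface header lines such as "2794: ha-45249562-ec: <BROADCAST,...>".
# The optional "@" covers names that ip prints as "name@peer".
HA_DEVICE = re.compile(r"^[ \t]*\d+:[ \t]+(ha-[^:@\s]+)[:@]", re.MULTILINE)


def find_ha_devices(ip_address_output: str) -> list:
    """Return the names of all ha- devices listed in `ip address` output."""
    return HA_DEVICE.findall(ip_address_output)


sample = """\
2794: ha-45249562-ec: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 12:34:56:78:2b:5d brd ff:ff:ff:ff:ff:ff
"""

print(find_ha_devices(sample))  # ['ha-45249562-ec']
```

An empty list for a router namespace suggests that the router was not created as an HA router.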

With Layer 3 HA enabled, virtual routers and floating IP addresses are protected against individual node failure.