Chapter 7. Monitoring the Load-balancing service

To keep load balancing operational, you can use the load-balancer management network and create, modify, and delete load-balancing health monitors.

7.1. Load-balancing management network

The Red Hat OpenStack Platform (RHOSP) Load-balancing service (octavia) monitors load balancers through a project network referred to as the load-balancing management network. Hosts that run the Load-balancing service must have interfaces to connect to the load-balancing management network. The supported interface configuration works with the neutron Modular Layer 2 plug-in with the Open Virtual Network mechanism driver (ML2/OVN) or the Open vSwitch mechanism driver (ML2/OVS). Use of the interfaces with other mechanism drivers has not been tested.

The default interfaces created at deployment are internal Open vSwitch (OVS) ports on the default integration bridge br-int. These interfaces are associated with actual Networking service (neutron) ports that are allocated on the load-balancer management network.

By default, the interfaces are named o-hm0 and are defined through standard interface configuration files on the Load-balancing service hosts. RHOSP director automatically configures a Networking service port and an interface for each Load-balancing service host during deployment. Director uses the port information and a template to create the interface configuration file, which includes:

  • The IP network address information, including the IP address and netmask
  • The MTU configuration
  • The MAC address
  • The Networking service port ID

In the default OVS case, the Networking service port ID is used to register extra data with the OVS port. The Networking service recognizes this interface as belonging to the port and configures OVS so it can communicate on the load-balancer management network.
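For example, you can inspect the extra data that is registered with the OVS port. The following command is a sketch that assumes the default interface name, o-hm0; the output includes the Networking service port ID:

$ sudo ovs-vsctl get Interface o-hm0 external_ids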

By default, RHOSP configures security groups and firewall rules that allow the Load-balancing service controllers to communicate with their VM instances (amphorae) on TCP port 9443, and that allow heartbeat messages from the amphorae to arrive on the controllers on UDP port 5555. Other mechanism drivers might have additional or alternative requirements for allowing communication between the load-balancing services and the load balancers.
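You can list the rules to verify them. The following command is a sketch that assumes the default health manager security group name, lb-health-mgr-sec-grp, which might differ in your deployment:

$ openstack security group rule list lb-health-mgr-sec-grp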

7.2. Load-balancing service instance monitoring

The Load-balancing service (octavia) monitors the load balancing instances (amphorae) and initiates failovers and replacements if the amphorae malfunction. Any time a failover occurs, the Load-balancing service logs the failover in the corresponding health manager log on the controller in /var/log/containers/octavia.

Use log analytics to monitor failover trends to address problems early. Problems such as Networking service (neutron) connectivity issues, Denial of Service attacks, and Compute service (nova) malfunctions often lead to higher failover rates for load balancers.
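For example, you can search the Load-balancing service logs on a controller for failover events. The following command is a sketch; the exact log file names can vary by release:

$ sudo grep -i failover /var/log/containers/octavia/*.log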

7.3. Load-balancing service pool member monitoring

The Load-balancing service (octavia) uses the health information from the underlying load-balancing subsystems to determine the health of members of the load-balancing pool. Health information is streamed to the Load-balancing service database, and made available through the status tree and other API methods. For critical applications, you must poll for health information at regular intervals.
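For example, you can retrieve the full status tree for a load balancer, or list the operating status of each member in a pool. The following commands are sketches that assume a load balancer named lb1 and a pool named pool1:

$ openstack loadbalancer status show lb1
$ openstack loadbalancer member list pool1 -c name -c operating_status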

7.4. Load balancer provisioning status monitoring

You can monitor the provisioning status of a load balancer and send alerts if the provisioning status is ERROR. Do not configure an alert to trigger on the PENDING states, because a load balancer routinely passes through several PENDING stages when an application makes regular changes to the pool.

The provisioning status of a load balancer object reflects the ability of the control plane to contact and successfully provision create, update, and delete requests for that object. The operating status of a load balancer object reports on the current functionality of the load balancer.

For example, a load balancer might have a provisioning status of ERROR but an operating status of ONLINE. This can be caused by a Networking service (neutron) failure that blocked the last requested update to the load balancer configuration from completing successfully. In this case, the load balancer continues to process traffic, but might not have applied the latest configuration updates yet.
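For example, you can poll the provisioning and operating status of a load balancer and feed the output to your alerting system. The following command is a sketch that assumes a load balancer named lb1:

$ openstack loadbalancer show lb1 -c provisioning_status -c operating_status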

7.5. Load balancer functionality monitoring

You can monitor the operational status of your load balancer and its child objects.

You can also use an external monitoring service that connects to your load balancer listeners and monitors them from outside of the cloud. An external monitoring service indicates if there is a failure outside of the Load-balancing service (octavia) that might impact the functionality of your load balancer, such as router failures, network connectivity issues, and so on.
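For example, a minimal external check probes a listener from outside of the cloud. The following command is a sketch that assumes an HTTP listener on a load balancer VIP of 203.0.113.10:

$ curl --head --connect-timeout 5 http://203.0.113.10/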

7.6. About Load-balancing service health monitors

A Load-balancing service (octavia) health monitor is a process that performs periodic health checks on each back end member server to pre-emptively detect failed servers and temporarily pull them out of the pool.

If the health monitor detects a failed server, it removes the server from the pool and marks the member as ERROR. After you have corrected the server and it is functional again, the health monitor automatically changes the status of the member from ERROR to ONLINE, and resumes passing traffic to it.

Always use health monitors in production load balancers. If you do not have a health monitor, failed servers are not removed from the pool. This can lead to service disruption for web clients.

There are several types of health monitors, as briefly described here:

HTTP

By default, probes the / path on the application server.

HTTPS

Operates exactly like an HTTP health monitor, but with TLS back end servers.

If the back end servers perform client certificate validation, HAProxy does not have a valid certificate to present. In these cases, TLS-HELLO health monitoring is an alternative.

TLS-HELLO

Ensures that the back end server responds to SSLv3 client hello messages.

A TLS-HELLO health monitor does not check any other health metrics, such as status code or body contents.

PING

Sends periodic ICMP ping requests to the back end servers.

You must configure back end servers to allow PINGs so that these health checks pass. See the security group sketch after this list.

Important

A PING health monitor checks only if the member is reachable and responds to ICMP echo requests. PING health monitors do not detect if the application that runs on an instance is healthy. Use PING health monitors only in cases where an ICMP echo request is a valid health check.

TCP

Opens a TCP connection to the back end server protocol port.

The health monitor opens a TCP connection and, after the TCP handshake completes, closes the connection without sending any data.

UDP-CONNECT

Performs a basic UDP port connect.

A UDP-CONNECT health monitor might not work correctly if Destination Unreachable (ICMP type 3) is not enabled on the member server, or if it is blocked by a security rule. In these cases, a member server might be marked as having an operating status of ONLINE when it is actually down.
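For PING and UDP-CONNECT health monitors, back end servers that are RHOSP instances typically need a security group rule that permits ICMP. The following command is a sketch that assumes the members use a security group named my-member-sec-group:

$ openstack security group rule create --protocol icmp --ingress my-member-sec-group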

7.7. Creating Load-balancing service health monitors

Use Load-balancing service (octavia) health monitors to avoid service disruptions for your users. The health monitors run periodic health checks on each back end server to pre-emptively detect failed servers and temporarily pull the servers out of the pool.

Procedure

  1. Source your credentials file.

    Example

    $ source ~/overcloudrc

  2. Run the openstack loadbalancer healthmonitor create command, using argument values that are appropriate for your site.

    • All health monitor types require the following configurable arguments:

      <pool>
      Name or ID of the pool of back-end member servers to be monitored.
      --type
      The type of health monitor. One of HTTP, HTTPS, PING, TCP, TLS-HELLO, or UDP-CONNECT.
      --delay
      Number of seconds to wait between health checks.
      --timeout
      Number of seconds to wait for any given health check to complete. The timeout value must always be smaller than the delay value.
      --max-retries
      Number of health checks a back-end server must fail before it is considered down. Also, the number of health checks that a failed back-end server must pass to be considered up again.
    • In addition, HTTP health monitor types accept the following arguments, which have default values:

      --url-path
      Path part of the URL that should be retrieved from the back-end server. By default this is /.
      --http-method
      HTTP method that is used to retrieve the url_path. By default this is GET.
      --expected-codes
      List of HTTP status codes that indicate an OK health check. By default this is 200.

      Example

      $ openstack loadbalancer healthmonitor create --name my-health-monitor --delay 10 --max-retries 4 --timeout 5 --type TCP lb-pool-1
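
      An equivalent HTTP health monitor might look like the following sketch, which assumes that your back end servers serve a health check page at /healthcheck:

      $ openstack loadbalancer healthmonitor create --name my-http-monitor --delay 10 --max-retries 4 --timeout 5 --type HTTP --url-path /healthcheck --expected-codes 200 lb-pool-1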

Verification

  • Run the openstack loadbalancer healthmonitor list command and verify that your health monitor is running.

7.8. Modifying Load-balancing service health monitors

You can modify the configuration for Load-balancing service (octavia) health monitors when you want to change the interval for sending probes to members, the connection timeout interval, the HTTP method for requests, and so on.

Procedure

  1. Source your credentials file.

    Example

    $ source ~/overcloudrc

  2. Modify your health monitor (my_health_monitor).

    In this example, a user is changing the time in seconds that the health monitor waits between sending probes to members.

    Example

    $ openstack loadbalancer healthmonitor set my_health_monitor --delay 600

Verification

  • Run the openstack loadbalancer healthmonitor show command to confirm your configuration changes.

    $ openstack loadbalancer healthmonitor show my_health_monitor

7.9. Deleting Load-balancing service health monitors

You can remove a Load-balancing service (octavia) health monitor.

Tip

An alternative to deleting a health monitor is to disable it by using the openstack loadbalancer healthmonitor set --disable command.
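For example, to disable the health monitor that is used in this procedure instead of deleting it:

$ openstack loadbalancer healthmonitor set my-health-monitor --disable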

Procedure

  1. Source your credentials file.

    Example

    $ source ~/overcloudrc

  2. Delete the health monitor (my-health-monitor).

    Example

    $ openstack loadbalancer healthmonitor delete my-health-monitor

Verification

  • Run the openstack loadbalancer healthmonitor list command to verify that the health monitor you deleted no longer exists.

7.10. Best practices for Load-balancing service HTTP health monitors

When you write the code that generates the health check in your web application, use the following best practices:

  • The health monitor url-path does not require authentication to load.
  • By default, the page at the health monitor url-path returns an HTTP 200 OK status code to indicate a healthy server, unless you specify alternate expected-codes.
  • The health check does enough internal checks to ensure that the application is healthy and no more. Ensure that the following conditions are met for the application:

    • Any required database or other external storage connections are up and running.
    • The load is acceptable for the server on which the application runs.
    • Your site is not in maintenance mode.
    • Tests specific to your application are operational.
  • The page generated by the health check should be small in size:

    • It returns in a sub-second interval.
    • It does not induce significant load on the application server.
  • The page generated by the health check is never cached, although the code that runs the health check might reference cached data.

    For example, you might find it useful to run a more extensive health check using cron and store the results to disk. The code that generates the page at the health monitor url-path incorporates the results of this cron job in the tests it performs.

  • Because the Load-balancing service only processes the HTTP status code returned, and because health checks are run so frequently, you can use the HEAD or OPTIONS HTTP methods to skip processing the entire page.
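
For example, you can change an existing HTTP health monitor to use the HEAD method. The following command is a sketch that assumes a health monitor named my-health-monitor:

$ openstack loadbalancer healthmonitor set my-health-monitor --http-method HEAD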