Support Policies for RHEL High Availability Clusters - Cluster Interconnect Network Latency

Overview

Applicable Environments

  • Red Hat Enterprise Linux (RHEL) with the High Availability Add-On

Introduction

This policy guide explains Red Hat's requirements and support limitations regarding latency of the network interconnect over which cluster nodes communicate with each other as members. Users of RHEL High Availability clusters should adhere to these policies in order to be eligible for support from Red Hat with the appropriate product support subscriptions.

Policies

Latency of cluster interconnect network(s): Red Hat recommends that RHEL High Availability clusters be designed with the cluster interconnect network(s) between nodes having a reasonably low latency for ideal performance and functionality. Networks providing inter-node round-trip latency of 2ms or less will typically produce optimal results.

Networks exhibiting higher than 300ms round-trip latency between any nodes may produce instability in the membership or resource-management of a cluster. The RHEL High Availability Add-On's components may not be able to accommodate such extremely-high-latency environments in all cases. Clusters involving large numbers of nodes, managing large quantities of resources, experiencing a high volume of cluster resource activity, or having other complex conditions may be particularly susceptible to instability with high-latency networks.

In some cases, advanced tuning of membership and communication settings may improve stability and performance of cluster deployments with high latency. However, if such tuning is unable to produce the desired results, it may be necessary to consider alternative cluster architecture designs that can achieve a more reasonable latency between nodes. Red Hat Support and Red Hat's development of the High Availability product may be unable to accommodate deployments operating with high inter-node latency where configuration-tuning does not produce the desired results.
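As a rough illustration of how the two figures above relate, the following Python sketch sorts a measured round-trip time into bands. The 2 ms and 300 ms values come from this policy; the band names ("optimal", "elevated", "high-risk") and the function names are hypothetical labels chosen for this example, not terms defined by Red Hat.

```python
# Thresholds quoted in the policy text; the band names are illustrative only.
OPTIMAL_RTT_MS = 2.0     # at or below this, results are typically optimal
UNSTABLE_RTT_MS = 300.0  # above this, membership or resource management may be unstable


def classify_rtt(rtt_ms: float) -> str:
    """Map a round-trip latency in milliseconds to a hypothetical band name."""
    if rtt_ms < 0:
        raise ValueError("round-trip latency cannot be negative")
    if rtt_ms <= OPTIMAL_RTT_MS:
        return "optimal"
    if rtt_ms <= UNSTABLE_RTT_MS:
        return "elevated"
    return "high-risk"


def classify_cluster(pair_rtts_ms: dict) -> str:
    """Classify a cluster by its slowest node pair.

    The policy speaks of latency between any nodes, so the worst
    pair is the one that governs.
    """
    return classify_rtt(max(pair_rtts_ms.values()))
```

For example, a three-node cluster in which one pair measures 350 ms would fall into the "high-risk" band even if the other pairs measure well under 2 ms.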


Latency Determination: The latency limits defined within this guide are in reference to round-trip communications between any two nodes in the cluster over the node-interconnect.

  • Guidance on determining latency in communications over the node-interconnect can be found here.
  • These limits apply to all corosync and pacemaker traffic between nodes, which includes both tokens and messages. Messages may be delivered using multicast or broadcast in certain configurations, which makes those transmissions subject to different influences and conditions on the network. With such configurations, attention should be given to the latency of both direct node-to-node unicast traffic and the applicable multicast or broadcast traffic.
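One possible way to approximate unicast round-trip latency between two nodes is an ICMP ping across the interconnect. The sketch below is not Red Hat's prescribed procedure; it assumes the summary line format printed by the iputils `ping` shipped with RHEL (`rtt min/avg/max/mdev = ... ms`), and the function names are hypothetical. ICMP results only approximate what corosync traffic experiences, and they say nothing about multicast or broadcast delivery.

```python
import re
import subprocess

# Summary line as printed by iputils ping (assumed format), e.g.:
#   rtt min/avg/max/mdev = 0.321/0.402/0.512/0.071 ms
_SUMMARY = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(output: str) -> dict:
    """Extract min/avg/max/mdev round-trip times (in ms) from ping output."""
    match = _SUMMARY.search(output)
    if match is None:
        raise ValueError("no rtt summary line found in ping output")
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, (float(value) for value in match.groups())))


def measure_rtt(address: str, count: int = 10) -> dict:
    """Ping a peer node's interconnect address and return the parsed summary.

    Raises subprocess.CalledProcessError if ping exits non-zero,
    for example when no replies are received.
    """
    result = subprocess.run(
        ["ping", "-c", str(count), address],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ping_summary(result.stdout)
```

Calling `measure_rtt("192.168.1.6")` from the peer node would return a dictionary whose `max` value is the most conservative figure to compare against the limits in this guide.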

Applicable Networks: In a RHEL High Availability cluster, Red Hat's support policies regarding network latency apply to any node-interconnect over which membership-related communications are transmitted.


Identifying Cluster Interconnect Network: The interconnects that nodes use to communicate in the membership protocol are selected based on each node's name or address as it is defined in the membership list, and the path these communications take follows any network routing policies defined on the hosts or by the network.

  • RHEL 7 and 8: Node names or addresses are defined in /etc/corosync/corosync.conf. If a name is given, it is resolved to an address by the cluster nodes.
  • RHEL 6: Node names or addresses are defined in /etc/cluster/cluster.conf. If a name is given, it is resolved to an address by the cluster nodes.
  • Example: In a RHEL 7 cluster, the membership list includes node1.example.com and node2.example.com, which resolve to 192.168.1.5 and 192.168.1.6, addresses on a private VLAN dedicated to this cluster. Policies in this guide would apply to packets transmitted between those addresses across that VLAN.
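For RHEL 7 and 8, the lookup described above can be sketched in Python by collecting the `ringN_addr` entries from the node list in corosync.conf and resolving each one. This is a minimal sketch under stated assumptions: the function name is hypothetical, it matches lines with a regular expression rather than fully parsing the file, and the default resolver (`socket.gethostbyname`) handles IPv4 only. It does not cover the RHEL 6 cluster.conf format.

```python
import re
import socket

# Matches nodelist entries such as "ring0_addr: node1.example.com".
_RING_ADDR = re.compile(r"^\s*ring(\d+)_addr:\s*(\S+)\s*$", re.MULTILINE)


def interconnect_addresses(conf_text, resolve=socket.gethostbyname):
    """Return (ring number, configured name, resolved address) tuples.

    `resolve` is injectable so the lookup can be replaced in tests or
    pointed at a different resolver; an IPv4 address passed to the
    default resolver is returned unchanged.
    """
    found = []
    for ring, name in _RING_ADDR.findall(conf_text):
        found.append((int(ring), name, resolve(name)))
    return found
```

On a cluster node, the text of /etc/corosync/corosync.conf can be passed in directly, for example `interconnect_addresses(open("/etc/corosync/corosync.conf").read())`. The resolved addresses identify the network to which the latency policies in this guide apply.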

Excluded Networks: Any other network, link, or interconnect that a cluster node may utilize for tasks unrelated to core membership is not covered by the latency policies in this guide.

  • A network that carries traffic for an application managed as a highly available resource within the cluster is not subject to these policies.
  • A network over which connections are made from cluster nodes to configured fence/STONITH devices is not subject to these policies.
  • A network that clients use to connect to cluster administration utilities such as pcsd or Conga is not subject to these policies.
