How to increase ring buffer size in Red Hat OpenShift Container Platform 4?

Solution Verified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4

Issue

  • How can the ring buffer size be increased in RHCOS on OpenShift 4 when ethtool -S output shows ring buffer errors?
  • The ethtool -S output shows nonzero ring full and pkts rx OOB counters.

Resolution

To set the TX and RX ring buffer sizes on network interfaces, the recommended approach is to use the Node Tuning Operator (also referred to as the Tuned Operator).

How to use the Node Tuning Operator to update the TX and RX ring buffers:

The Node Tuning Operator changes the ring buffers in a managed, automated way, instead of relying on custom services or files created on each node to achieve the same configuration.

To do so, apply a Tuned CR like the following example:

apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: increase-ring-buffer
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Increase the RX and TX ring buffer sizes of a NIC
      include=openshift-node
      [net]
      type=net
      devices_udev_regex=^INTERFACE=ens192      # Replace ens192 with the actual interface name
      ring=rx 4096 tx 4096                      # Set the desired RX and TX ring buffer sizes
    name: increase-ring-buffer
  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: "worker"  # This example targets the worker MCP; a different MCP can be selected.
    priority: 20
    profile: increase-ring-buffer
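
Before applying the CR, it can be worth confirming that the requested sizes do not exceed what the NIC supports, and afterwards that the new values took effect. The helper below is a minimal sketch and not part of the original solution: the name ring_summary is hypothetical, and it assumes the usual ethtool -g layout, where a "Pre-set maximums:" section is followed by "Current hardware settings:".

```shell
# ring_summary (hypothetical helper): read `ethtool -g <iface>` output on
# stdin and print the maximum and current RX/TX ring sizes on one line.
ring_summary() {
  awk '
    /^Pre-set maximums:/          { section = "max"; next }
    /^Current hardware settings:/ { section = "cur"; next }
    # Match only the plain "RX:" and "TX:" rows, not "RX Mini:" or "RX Jumbo:".
    $1 == "RX:" { rx[section] = $2 }
    $1 == "TX:" { tx[section] = $2 }
    END {
      printf "max rx=%s tx=%s current rx=%s tx=%s\n",
             rx["max"], tx["max"], rx["cur"], tx["cur"]
    }'
}
```

One possible way to use it, assuming the CR above was saved as increase-ring-buffer.yaml (a hypothetical file name) and <node-name> is one of the targeted nodes:

$ oc debug node/<node-name> -- chroot /host ethtool -g ens192 | ring_summary
$ oc apply -f increase-ring-buffer.yaml
$ oc debug node/<node-name> -- chroot /host ethtool -g ens192 | ring_summary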

Root Cause

There are multiple supported methods to increase the ring buffer size, each with its own pros and cons, but the recommended and cleanest way is to use the Node Tuning Operator.

Note that this approach has one disadvantage: the tuned daemon set must be running before the buffers are updated. This means that during boot, before the kubelet starts, the buffers remain at their default sizes. However, this is rarely an issue, because it is uncommon for a node to need larger buffers at boot time.
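
To check whether the profile has reached a node after boot, the Profile objects maintained by the Node Tuning Operator can be inspected. The following is a sketch under stated assumptions: it expects the output of oc get profiles.tuned.openshift.io to start with the columns NAME, TUNED and APPLIED (the layout may differ between versions), and the helper name profile_status is hypothetical.

```shell
# profile_status (hypothetical helper): read the table printed by
# `oc get profiles.tuned.openshift.io` on stdin and, for every node whose
# TUNED column equals the profile name given as $1, report whether the
# APPLIED column is "True".
profile_status() {
  awk -v want="$1" '
    NR > 1 && $2 == want {
      print $1, ($3 == "True" ? "applied" : "pending")
    }'
}
```

For example:

$ oc get profiles.tuned.openshift.io -n openshift-cluster-node-tuning-operator | profile_status increase-ring-buffer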

Diagnostic Steps

  • Check for ring buffer issues on a Red Hat Enterprise Linux CoreOS (RHCOS) node by running the following command:
$ sudo ethtool -S ens192
[...]
     Rx Queue#: 2
       LRO pkts rx: 0
       LRO byte rx: 0
       ucast pkts rx: 7903511
       ucast bytes rx: 9121802867
       mcast pkts rx: 5
       mcast bytes rx: 342
       bcast pkts rx: 2870
       bcast bytes rx: 192682
       pkts rx OOB: 632
       pkts rx err: 0
       drv dropped rx total: 0
          err: 0
          fcs: 0
[...]
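
Because ethtool -S prints many counters per queue, it can help to show only the ring-related counters that are nonzero. The filter below is a minimal sketch: the name ring_errors is hypothetical, and it assumes the per-queue layout shown above, with "Rx Queue#:" or "Tx Queue#:" header lines followed by indented counters.

```shell
# ring_errors (hypothetical helper): read `ethtool -S <iface>` output on
# stdin and print each nonzero "ring full" or "pkts rx OOB" counter together
# with the queue it belongs to.
ring_errors() {
  awk '
    # Remember the most recent queue header, e.g. "Rx Queue#: 2".
    /(Rx|Tx) Queue#:/ { sub(/^[ \t]+/, ""); queue = $0; next }
    # Print the counter only when its value (the last field) is above zero.
    /ring full:|pkts rx OOB:/ && $NF + 0 > 0 {
      sub(/^[ \t]+/, "")
      print queue " | " $0
    }'
}
```

For example, on the node:

$ sudo ethtool -S ens192 | ring_errors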

In a sosreport, ring full errors can be seen with either of the following commands:

$ xsos -yomep [sosreport_dir]
[...]
  Interface Errors:
    ens192  Tx Queue#: 0
              ring full: 88209
            Tx Queue#: 1
              ring full: 87586
            Tx Queue#: 2
              ring full: 72320
[...]
$ grep "Queue\|ring full\|OOB" [sosreport_dir]/sos_commands/networking/ethtool_-S*
[...]
sos_commands/networking/ethtool_-S_ens192:     Tx Queue#: 0
sos_commands/networking/ethtool_-S_ens192:       ring full: 88209
sos_commands/networking/ethtool_-S_ens192:     Tx Queue#: 1
sos_commands/networking/ethtool_-S_ens192:       ring full: 87586
sos_commands/networking/ethtool_-S_ens192:     Tx Queue#: 2
sos_commands/networking/ethtool_-S_ens192:       ring full: 72320
[...]
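
To compare interfaces at a glance, the per-queue ring full counters can be summed for each interface. This is a sketch only: the name ring_full_totals is hypothetical, and it assumes the sosreport file naming shown above (ethtool_-S_<interface>).

```shell
# ring_full_totals (hypothetical helper): for each ethtool_-S_<iface> file
# given as an argument, print the interface name and the sum of all
# "ring full" counters found in that file.
ring_full_totals() {
  for f in "$@"; do
    # Strip everything up to and including "ethtool_-S_" to get the name.
    awk -v name="${f##*/ethtool_-S_}" '
      /ring full:/ { total += $NF }
      END          { print name, total + 0 }' "$f"
  done
}
```

For example:

$ ring_full_totals [sosreport_dir]/sos_commands/networking/ethtool_-S_*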
