RHEL8: NFS streaming read performance is slower after kernel update

Environment

  • Red Hat Enterprise Linux 8.3 - 8.6
  • kernel-4.18.0-240.el8 and later
  • NFS

Issue

  • After updating to kernel-4.18.0-240.el8, NFS read performance is far lower than in previous RHEL 8 releases.
  • Downgrading to a kernel earlier than kernel-4.18.0-240.el8 restores performance to previous levels.

Resolution

Red Hat Enterprise Linux 8

Update the nfs-utils package to 2.3.3-57.el8 (released with RHBA-2022:7768) or later. This version introduces the nfsrahead tool, which you can use to modify the readahead value for NFS mounts and thereby improve NFS read performance. RHEL 8.7 GA and later already contain the updated package. The fixes were researched and implemented in bz1946283. After upgrading the package, check the nfsrahead(5) man page for configuration details.
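As a minimal sketch of such a configuration: nfsrahead(5) describes an [nfsrahead] section in /etc/nfs.conf with nfs= (NFSv3) and nfs4= (NFSv4) keys. The helper name nfsrahead_section below is hypothetical, and the value 15360 is simply the historical default seen in this article, not a recommendation. Confirm the section and key names against the man page shipped with your nfs-utils version before using it.

```shell
# Hypothetical helper: print an [nfsrahead] section for /etc/nfs.conf.
# Section and key names are assumed from nfsrahead(5); verify them locally.
nfsrahead_section() {
    kb="$1"                      # desired read_ahead_kb value, in KiB
    printf '[nfsrahead]\n'
    printf 'nfs=%s\n' "$kb"      # readahead applied to NFSv3 mounts
    printf 'nfs4=%s\n' "$kb"     # readahead applied to NFSv4 mounts
}

# Print the fragment for review; nothing is written to /etc/nfs.conf here.
nfsrahead_section 15360
```

Review the printed fragment and merge it into /etc/nfs.conf by hand. The new value is expected to take effect the next time the share is mounted, so remount and re-check read_ahead_kb afterwards.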

Workarounds

  • Increase read_ahead_kb for the NFS share under /sys/class/bdi/<major>:<minor>/
  • The <major>:<minor> number for the mount can be determined using # mountpoint -d /path/to/mount
  • The <major>:<minor> number is also listed in /proc/self/mountinfo
  • The read_ahead_kb value can also be set persistently via udev rules or scripting
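The lookup described in the list above can be sketched as a small shell function. The name bdi_for_mount and its optional second argument are hypothetical, introduced only for illustration. It relies on the /proc/self/mountinfo layout, where field 3 is <major>:<minor> and field 5 is the mount point.

```shell
# Hypothetical helper: print the <major>:<minor> id for a mount point.
# $1 = mount point, $2 = mountinfo file (defaults to /proc/self/mountinfo).
bdi_for_mount() {
    mnt="$1"
    info="${2:-/proc/self/mountinfo}"
    # Field 3 is major:minor, field 5 is the mount point.
    # The last matching entry wins, which covers over-mounted paths.
    awk -v m="$mnt" '
        $5 == m { id = $3 }
        END { if (id != "") print id; else exit 1 }
    ' "$info"
}

# Example use as root, assuming /mnt is the NFS mount:
#   echo 15360 > "/sys/class/bdi/$(bdi_for_mount /mnt)/read_ahead_kb"
```

The function exits non-zero when the path is not a mount point, so a script can stop before writing to a wrong sysfs path. A value set this way does not persist across a remount.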

Root Cause

  • A commit included in Red Hat Enterprise Linux 8.3 lowered the default read_ahead_kb value for NFS mounts to 128 KiB. This reduces the amount of prefetching done during reads, which can slow down large streaming reads.
  • Historically, read_ahead_kb was equal to the rsize negotiated for the share at mount time, multiplied by fifteen.
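The historical default can be worked out from the mount's rsize. The sketch below uses a hypothetical helper name, historical_read_ahead_kb, and assumes rsize is given in bytes as shown in the mount options.

```shell
# Hypothetical helper: historical read_ahead_kb = 15 * rsize,
# with rsize converted from bytes to KiB.
historical_read_ahead_kb() {
    rsize_bytes="$1"
    echo $(( rsize_bytes / 1024 * 15 ))
}

# rsize=1048576 (1 MiB), as negotiated in the diagnostic mounts below
historical_read_ahead_kb 1048576   # prints 15360
```

With rsize=1048576 this gives 15360 KiB, which matches the read_ahead_kb value reported on RHEL 8.2 in the diagnostic steps.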

Diagnostic Steps

RHEL 8.2

  • RHEL 8.2 throughput for a large streaming read is about 450 MB/s.
# uname -r
4.18.0-193.6.3.el8_2.x86_64

# mount nfs-server-8.example.net:/mnt /mnt -o vers=4.2,sec=krb5p

# findmnt /mnt
TARGET SOURCE                        FSTYPE OPTIONS
/mnt   nfs-server-8.example.net:/mnt nfs4   rw,relatime,seclabel,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=192.168.124.156,local_lock=none,addr=192.168.124.138

# cat /sys/class/bdi/$(mountpoint -d /mnt)/read_ahead_kb
15360

# ls -1sh /mnt/test_file.bin
4.9G /mnt/test_file.bin

# dd if=/mnt/test_file.bin of=/dev/null bs=1M
5000+0 records in
5000+0 records out 
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 11.5409 s, 454 MB/s

RHEL 8.3

  • The default read_ahead_kb is only 128 KiB.
# uname -r
4.18.0-240.10.1.el8_3.x86_64

# mount nfs-server-8.example.net:/mnt /mnt -o vers=4.2,sec=krb5p

# findmnt /mnt
TARGET SOURCE                        FSTYPE OPTIONS
/mnt   nfs-server-8.example.net:/mnt nfs4   rw,relatime,seclabel,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=192.168.124.161,local_lock=none,addr=192.168.124.138

# cat /sys/class/bdi/$(mountpoint -d /mnt)/read_ahead_kb
128

# ls -1sh /mnt/test_file.bin
4.9G /mnt/test_file.bin
  • The initial streaming read throughput is far lower than on RHEL 8.2.
# dd if=/mnt/test_file.bin of=/dev/null bs=1M
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 38.8771 s, 135 MB/s
  • Increasing the read_ahead_kb value to the historical value causes performance to align with previous versions of RHEL8.
# echo 15360 > /sys/class/bdi/$(mountpoint -d /mnt)/read_ahead_kb

# cat /sys/class/bdi/$(mountpoint -d /mnt)/read_ahead_kb
15360

# echo 3 > /proc/sys/vm/drop_caches

# dd if=/mnt/test_file.bin of=/dev/null bs=1M
5000+0 records in
5000+0 records out 
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 11.9788 s, 438 MB/s
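To compare the three runs on equal terms, the throughput can be recomputed from the byte count and elapsed time that dd prints. This is only a sketch, and the helper name throughput is hypothetical. Note that dd's "MB/s" figure is decimal (10^6 bytes), which is lower in number than the same rate expressed in MiB/s.

```shell
# Hypothetical helper: throughput from dd's byte count and elapsed seconds.
# $1 = bytes copied, $2 = elapsed seconds
throughput() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        printf "%.0f MB/s, %.0f MiB/s\n", b / s / 1000000, b / s / 1048576
    }'
}

throughput 5242880000 11.5409   # RHEL 8.2:              454 MB/s, 433 MiB/s
throughput 5242880000 38.8771   # RHEL 8.3, default 128: 135 MB/s, 129 MiB/s
throughput 5242880000 11.9788   # RHEL 8.3, 15360 set:   438 MB/s, 417 MiB/s
```

The 8.3 run with the default readahead takes roughly 3.4 times as long as the 8.2 run, and restoring the historical value brings it back to within a few percent.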
