Performance impact on hugepage workloads in RHEL 8 and RHEL 9 systems with high CPU counts such as 150 or more
Issue
- Workloads that use hugepages, such as Oracle Database, run slowly on systems with high CPU counts, for example 150 or more CPUs.
- After updating to kernel-4.18.0-553.69.1.el8_10 or kernel-5.14.0-570.35.1.el9_6 or later, the database became slow.
- Recurring Oracle DB latency leading to application access delays every 30 minutes (2–3 minute timeouts).
- Active database connection counts climb and impact the application; the issue lasts around 15 to 20 minutes and then resolves on its own.
- A spike in latch waits in the Oracle workload with no corresponding spike in CPU activity.
Environment
- Red Hat Enterprise Linux 8.10
  - kernel-4.18.0-553.69.1.el8_10 and later
- Red Hat Enterprise Linux 9.6
  - kernel-5.14.0-570.35.1.el9_6 and later
- Red Hat Enterprise Linux 10.0
  - kernel-6.12.0-55.27.1.el10_0 and later (theoretical, not reported)
- High CPU count, such as 150 or more CPU cores
- Software using HugeTLB HugePages, such as Oracle Database
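As a rough aid, the environment criteria above can be checked with a short script. The following is a minimal sketch, not a Red Hat tool: the function names and the `CPU_THRESHOLD` constant are hypothetical, the 150 figure is only the example value from this article rather than a hard limit, and the kernel comparison assumes that a plain numeric comparison of the version and build numbers within the same major release is sufficient. It also counts logical CPUs, which may differ from physical cores.

```python
import os
import platform
import re

# Minimum affected kernel builds per RHEL major release, taken from the
# Environment list in this article.
AFFECTED_FROM = {
    8: ((4, 18, 0), (553, 69, 1)),
    9: ((5, 14, 0), (570, 35, 1)),
    10: ((6, 12, 0), (55, 27, 1)),
}
CPU_THRESHOLD = 150  # example figure from the article, not a hard limit


def parse_kernel(release):
    """Split e.g. '4.18.0-553.69.1.el8_10.x86_64' into (el_major, version, build)."""
    m = re.match(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el(\d+)", release)
    if not m:
        return None
    version = tuple(int(p) for p in m.group(1).split("."))
    build = tuple(int(p) for p in m.group(2).split("."))
    return int(m.group(3)), version, build


def kernel_in_range(release):
    """True if the release string is at or above the listed build for its major release."""
    parsed = parse_kernel(release)
    if parsed is None or parsed[0] not in AFFECTED_FROM:
        return False
    el_major, version, build = parsed
    return (version, build) >= AFFECTED_FROM[el_major]


def hugepages_total(meminfo_text):
    """Return HugePages_Total from /proc/meminfo-style text, or 0 if absent."""
    m = re.search(r"^HugePages_Total:\s+(\d+)", meminfo_text, re.MULTILINE)
    return int(m.group(1)) if m else 0


def matches_environment(release, cpu_count, meminfo_text):
    """True only if kernel, CPU count, and HugeTLB usage all match the list above."""
    return (
        kernel_in_range(release)
        and cpu_count >= CPU_THRESHOLD
        and hugepages_total(meminfo_text) > 0
    )


def check_local_host():
    """Apply the check to the running system (Linux only; reads /proc/meminfo)."""
    with open("/proc/meminfo") as f:
        meminfo = f.read()
    return matches_environment(platform.release(), os.cpu_count() or 0, meminfo)
```

Calling `check_local_host()` on the system in question returns `True` only when all three criteria match. A match indicates that the system fits the environment described here; it does not by itself confirm that the slowdown is caused by this issue.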