Java application periodic high latency / processing times due to NUMA page reclaim on RHEL

Solution Verified

Issue

  • JBoss server periodically consumes high CPU and experiences pauses.
    • Periodically (about 1 in 100), a garbage collection takes an excessive amount of system time.
  • Java-based web application periodically (approximately 5 times out of 100) shows slow response times.
    • Application response is < 100 ms 95% of the time; the other 5% of the time, a response may take up to 100 seconds.
    • Unresponsiveness is seen across several processes (JBoss, Oracle, etc.), and the slowness appears to be system-wide.
  • Periodically, processes such as 'uname', 'grep', and 'perl' take an exceptionally long time to execute, and all appear to use an exceptional amount of system time.
  • Oracle responds to JBoss calls in less than 1 s 90% of the time, but occasionally takes 30-40 s and may exceed the 60 s query timeout, resulting in Oracle error ORA-01013.
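
The symptom of trivial commands spending an unusual amount of system time can be checked with a small timing loop. The following is a minimal sketch, not a Red Hat tool: it assumes a Python 3 interpreter is available on the host, and the function names, the 100-run sample size, and the 1-second threshold are illustrative choices rather than values from this article.

```python
import os
import subprocess
import time


def time_command(argv):
    """Run argv once; return (wall, user, system) seconds for the child."""
    before = os.times()
    start = time.monotonic()
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    wall = time.monotonic() - start
    after = os.times()
    # children_* counters accumulate CPU time of child processes that
    # have been waited for, so the difference isolates this one run.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return wall, user, system


def sample(argv, runs=100, slow_wall=1.0):
    """Run argv repeatedly; return the runs slower than slow_wall seconds.

    runs and slow_wall are hypothetical defaults chosen for illustration.
    """
    slow = []
    for i in range(runs):
        wall, user, system = time_command(argv)
        if wall > slow_wall:
            slow.append((i, wall, user, system))
    return slow


if __name__ == "__main__":
    for i, wall, user, system in sample(["uname", "-a"]):
        print(f"run {i}: wall={wall:.3f}s user={user:.3f}s sys={system:.3f}s")
```

A slow run in which the system time accounts for most of the wall time would match the behavior described above; a slow run with little CPU time at all would instead suggest the process was waiting rather than executing in the kernel.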

Environment

  • Red Hat Enterprise Linux 5.4
    • kernel 2.6.18-164.11.1.el5.x86_64
  • CPU / memory
    • 24 CPUs total, 6 cores
    • 16 GB RAM, 8 GB swap
    • 2-node NUMA system, with 8 GB RAM on each NUMA node
  • JBoss (running in its own JVM), jbossas, jboss-messaging
    • JBoss interfaces with Oracle via local TCP (port 1521)
  • Web application (running in its own JVM)
    • JSF-based web application (TCP / HTTP 1.1) using RichFaces and a4j components.
  • Oracle Version: 11gR1 11.1.0.7
    • Running with AMM, which forbids the use of HugePages
  • Veritas VCS, VxVM, VxDMP
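
On a two-node NUMA system such as the one listed above, per-node memory can be inspected through sysfs. The sketch below is an illustration under stated assumptions: it assumes Python 3, and it reads the kernel's `vm.zone_reclaim_mode` setting only because the title attributes the latency to NUMA page reclaim. Whether that setting is the cause or the fix in this case is not stated in the visible part of this article. The function names are hypothetical.

```python
import glob
import re

# Lines in a node meminfo file look like: "Node 0 MemTotal:  8388608 kB"
NODE_LINE = re.compile(r"^Node\s+(\d+)\s+(\w+):\s+(\d+)\s*kB", re.MULTILINE)


def parse_node_meminfo(text):
    """Return {field: value_in_kB} parsed from one node's meminfo text."""
    return {field: int(kb) for _node, field, kb in NODE_LINE.findall(text)}


def node_summary(pattern="/sys/devices/system/node/node*/meminfo"):
    """Return {node_name: (MemTotal_kB, MemFree_kB)} for each NUMA node."""
    summary = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            info = parse_node_meminfo(f.read())
        node = path.split("/")[-2]  # e.g. "node0"
        summary[node] = (info.get("MemTotal"), info.get("MemFree"))
    return summary


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Return the integer sysctl value, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    for node, (total, free) in node_summary().items():
        print(f"{node}: MemTotal={total} kB MemFree={free} kB")
    print("vm.zone_reclaim_mode =", read_zone_reclaim_mode())
```

If one node shows very little free memory while the other still has plenty, allocations bound to the full node may be reclaiming pages locally instead of using the other node's memory, which would be consistent with the situation named in the title.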
