Failure to wake up the RCU grace-period thread causes system-wide unresponsiveness, tasks hung in synchronize_rcu(), and high slab cache memory usage

Solution Verified

Issue

  • An OpenShift node goes into the NotReady state
  • Pods on the OpenShift node are stuck in the Terminating state
  • The server is unresponsive
  • The kernel logs "INFO: task ... blocked for more than 120 seconds" messages for tasks hung in the synchronize_rcu() function
  • The kernel panics with "hung_task: blocked tasks"
  • Slab cache memory usage is high
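
To check saved kernel log output for the hung-task symptom listed above, a small script along the following lines may help. This is a minimal sketch and not part of the original solution: the helper names (find_hung_tasks, mentions_synchronize_rcu) and the sample log lines are hypothetical, and the pattern assumes the usual "INFO: task <name>:<pid> blocked for more than <N> seconds" wording of the kernel message.

```python
import re

# Assumed shape of the kernel hung-task warning:
#   INFO: task <comm>:<pid> blocked for more than <N> seconds.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a (comm, pid, seconds) tuple for each hung-task warning found."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(log_text)
    ]


def mentions_synchronize_rcu(log_text):
    """Return True if synchronize_rcu appears anywhere in the log text."""
    return "synchronize_rcu" in log_text


if __name__ == "__main__":
    # Hypothetical sample for illustration only; real logs will differ.
    sample = (
        "INFO: task kworker/0:1:1234 blocked for more than 120 seconds.\n"
        "Call Trace:\n"
        " synchronize_rcu\n"
    )
    print(find_hung_tasks(sample))
    print(mentions_synchronize_rcu(sample))
```

The script only flags log text that matches the symptom; it does not confirm the root cause. Text saved from dmesg or journalctl -k can be passed to the two functions as a single string.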

Environment

  • Red Hat Enterprise Linux 8.0 and 8.1
  • OpenShift Container Platform 4.2, 4.3 and 4.4
