RHEL 6/7: system panic occurs when the mcelog daemon offlines a hugepage

Solution Verified

Issue

  • A system panic occurs when the mcelog daemon (or other software) offlines a hugepage. When the frequency of corrected memory errors exceeds the threshold, the mcelog daemon offlines the affected memory. The panic happens because page_check_address(), which is called by the offline handler, does not check the pte returned by huge_pte_offset() before using it.

Environment

  • Red Hat Enterprise Linux (RHEL) 6
    • All kernel versions prior to 6.2 kernel-2.6.32-220.72.2.el6
    • All 6.2 kernels prior to kernel-2.6.32-220.72.2.el6
    • All 6.4 kernels prior to kernel-2.6.32-358.79.1.el6
    • All 6.5 kernels prior to kernel-2.6.32-431.81.2.el6
    • All 6.6 kernels prior to kernel-2.6.32-504.60.2.el6
    • All 6.7 kernels prior to kernel-2.6.32-573.43.2.el6
    • All 6.8 kernels
    • All 6.9 kernels prior to kernel-2.6.32-696.6.3.el6
  • Red Hat Enterprise Linux 7.0, 7.1 and 7.2
  • mcelog daemon running, or other code which is offlining hugepages
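
The RHEL 6 version list above can be checked mechanically. The following is a minimal sketch, not a supported tool: it assumes the kernel release string has the usual `2.6.32-<build>.el6` shape and that the first number of the build field (220, 358, 431, ...) identifies the minor-release stream. The names `FIXED_BUILDS`, `parse_release` and `is_affected` are hypothetical. It returns `None` for anything it cannot classify from the list, which includes RHEL 7 kernels, the 6.8 stream (for which no fixed build is listed), and any stream not named above.

```python
import platform
import re

# Fixed builds copied from the Environment list, keyed by the first number
# of the build field (assumption: that number identifies the stream).
FIXED_BUILDS = {
    220: (220, 72, 2),   # 6.2
    358: (358, 79, 1),   # 6.4
    431: (431, 81, 2),   # 6.5
    504: (504, 60, 2),   # 6.6
    573: (573, 43, 2),   # 6.7
    696: (696, 6, 3),    # 6.9
}

def parse_release(release):
    """Return the numeric build fields of a 2.6.32 el6 kernel string, or None."""
    m = re.match(r"(?:kernel-)?2\.6\.32-(\d+(?:\.\d+)*)\.el6", release)
    if not m:
        return None
    return tuple(int(part) for part in m.group(1).split("."))

def is_affected(release):
    """True if older than the listed fixed build, False if not, None if unknown."""
    fields = parse_release(release)
    if fields is None:
        return None          # not a RHEL 6 kernel string this sketch understands
    fixed = FIXED_BUILDS.get(fields[0])
    if fixed is None:
        return None          # stream has no fixed build in the list above
    return fields < fixed    # tuple comparison, field by field

if __name__ == "__main__":
    running = platform.release()
    print(running, "->", is_affected(running))
```

A `None` result means the list gives no fixed build to compare against, not that the kernel is safe. The other condition in the list, whether mcelog or other hugepage-offlining code is running, still has to be checked separately.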
