rhel6/7: system panic occurs when mcelog daemon offlines a hugepage

Solution Verified - Updated -

Issue

  • A system panic occurs when the mcelog daemon (or other software) offlines a hugepage. When the frequency of corrected memory errors exceeds the threshold, the mcelog daemon offlines the affected memory. The offline handler calls page_check_address(), which does not check whether pte, the return value of huge_pte_offset(), is NULL before using it.
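
The missing check described above can be illustrated with a small user-space sketch. This is not the kernel source or the actual patch: the toy_ names, the struct toy_table type, and the TOY_PTE_PRESENT flag are hypothetical stand-ins invented for illustration. The sketch assumes only what the issue states, namely that the lookup can return NULL and that the caller must test the result before dereferencing it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
typedef struct { unsigned long val; } pte_t;

#define TOY_TABLE_SLOTS 4
#define TOY_PTE_PRESENT 0x1UL

struct toy_table {
    pte_t *slots[TOY_TABLE_SLOTS];   /* NULL means "no entry here" */
};

/* Models a lookup like huge_pte_offset(): may return NULL. */
static pte_t *toy_huge_pte_offset(struct toy_table *t, unsigned long idx)
{
    if (idx >= TOY_TABLE_SLOTS)
        return NULL;
    return t->slots[idx];
}

/*
 * Models a caller like page_check_address(), with the NULL check in place.
 * Returns the entry if it exists and is marked present, otherwise NULL.
 */
static pte_t *toy_page_check_address(struct toy_table *t, unsigned long idx)
{
    pte_t *pte = toy_huge_pte_offset(t, idx);

    if (pte == NULL)                 /* without this test, the next line */
        return NULL;                 /* would dereference a NULL pointer */

    if (!(pte->val & TOY_PTE_PRESENT))
        return NULL;

    return pte;
}
```

In this sketch, a caller that receives NULL from toy_page_check_address() treats the page as not mapped and moves on, rather than crashing on the missing entry.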

Environment

  • Red Hat Enterprise Linux (RHEL) 6
    • All kernels from releases prior to 6.2
    • All 6.2 kernels prior to kernel-2.6.32-220.72.2.el6
    • All 6.4 kernels prior to kernel-2.6.32-358.79.1.el6
    • All 6.5 kernels prior to kernel-2.6.32-431.81.2.el6
    • All 6.6 kernels prior to kernel-2.6.32-504.60.2.el6
    • All 6.7 kernels prior to kernel-2.6.32-573.43.2.el6
    • All 6.8 kernels
    • All 6.9 kernels prior to kernel-2.6.32-696.6.3.el6
  • Red Hat Enterprise Linux 7.0, 7.1 and 7.2
  • mcelog daemon running, or other code which is offlining hugepages
