How to check if system RAM is faulty in Red Hat Enterprise Linux?

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux
  • x86 (32 or 64 bit)

Issue

  • How to check if system memory (RAM) is faulty in Red Hat Enterprise Linux?

Resolution

  • Red Hat Enterprise Linux ships a memory test tool called memtest86+. It is a bootable utility that tests physical memory by writing various patterns to it and reading them back. Because memtest86+ runs directly on the hardware, it does not require any operating system support.

  • This tool is available as an RPM package from Red Hat Network (RHN) as well as a boot option from the Red Hat Enterprise Linux rescue disk.

  • To boot memtest86+ from the rescue disk, you will need to boot your system from CD 1 of the Red Hat Enterprise Linux installation media, and type the following at the boot prompt (before the Linux kernel is started):

boot: memtest86
  • If you would rather install memtest86+ on the system, here is how to do it on Red Hat Enterprise Linux 5 and later, on a system registered to RHN:
# yum install memtest86+
  • For Red Hat Enterprise Linux 4, run the following command to install memtest86+. Make sure the system is registered to RHN:
# up2date -i memtest86+
  • Then you will have to configure it to run on next reboot:
# memtest-setup 
  • After reboot, the GRUB menu will list memtest. Select this item and it will start testing the memory.

    • Please note that once memtest86+ is running, it will not stop until you interrupt it by pressing the Esc key. It is usually a good idea to let it run for a few passes so that it has time to test each block of memory several times. On systems with a large amount of memory, it may take more than 10 hours to complete a single pass.
  • memtest86+ may not always find all memory problems. It is possible that the system memory can have a fault that memtest86+ does not detect.
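The installation steps above differ only in the package manager used. As a minimal sketch, assuming the RHEL major version is already known, the choice can be written as a small shell function. The name memtest_install_cmd is hypothetical and is not a Red Hat tool; it only prints the commands given in this article and does not run them.

```shell
#!/bin/sh
# Print the install command this article gives for a RHEL major version.
# memtest_install_cmd is a hypothetical helper name, not a Red Hat tool.
memtest_install_cmd() {
    major="$1"
    case "$major" in
        ''|*[!0-9]*)
            echo "usage: memtest_install_cmd <major-version>" >&2
            return 2
            ;;
    esac
    if [ "$major" -ge 5 ]; then
        echo "yum install memtest86+"
    elif [ "$major" -eq 4 ]; then
        echo "up2date -i memtest86+"
    else
        echo "no install command documented here for RHEL $major" >&2
        return 1
    fi
}

# Example: show the commands for RHEL 4 and RHEL 6,
# then the follow-up step that is the same on both.
memtest_install_cmd 4
memtest_install_cmd 6
echo "memtest-setup"
```

The commands still have to be run as root, as the # prompt in the steps above indicates.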

Root Cause

  • A common reason to run memtest86+ is that the system becomes sluggish or unresponsive, or panics repeatedly, and the logs show "Machine Check Exception".
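One way to check for the log message mentioned above is to search the system log for it. The following is a minimal sketch: count_mce_lines is a hypothetical helper name, /var/log/messages is assumed to be where this host writes kernel messages, and the exact wording of machine check messages varies between kernel versions, so the search string is only an approximation.

```shell
#!/bin/sh
# Count log lines that mention a machine check.
# count_mce_lines is a hypothetical helper; the caller supplies the log path.
count_mce_lines() {
    logfile="$1"
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    # -i: case-insensitive match, -c: print the number of matching lines.
    # grep exits non-zero when the count is 0; "|| true" keeps that
    # from being reported as a failure of this function.
    grep -ic "machine check" "$logfile" || true
}

# Example: /var/log/messages is an assumed log location for this host.
if [ -r /var/log/messages ]; then
    count_mce_lines /var/log/messages
fi
```

A non-zero count is only a hint that a memory test is worth running; machine check events can also come from other hardware such as the CPU or buses.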

Diagnostic Steps

  • How long does memtest take to run?
    This depends on the amount of memory installed and on the processor speed of the server, which determines how quickly it can scan through the memory.

You can make a rough estimate from the progress shown after the first hour or two.
For example, if the test has reached 15% of a pass after 2 hours, a linear projection gives:

2 hours       15%
4 hours       30%
6 hours       45%
8 hours       60%
10 hours      75%
12 hours      90%
~13.3 hours   100%

At this rate the pass should finish roughly 11 hours after the 2-hour mark; however, in some cases it has taken double the estimate.
On servers with little memory, it may take only a few hours to complete several passes.
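The estimate can also be computed rather than read off a table. The sketch below assumes progress through a pass is linear, which the text above already notes is not always true. estimate_pass_minutes is a hypothetical helper name, and it works in whole minutes because shell arithmetic is integer-only.

```shell
#!/bin/sh
# Project the total and remaining time for one pass from a progress reading.
# estimate_pass_minutes is a hypothetical helper; it assumes linear progress.
estimate_pass_minutes() {
    elapsed_min="$1"   # minutes the test has been running
    percent="$2"       # progress through the current pass, 1-100
    if [ "$percent" -lt 1 ] || [ "$percent" -gt 100 ]; then
        echo "percent must be between 1 and 100" >&2
        return 1
    fi
    # Integer division, so the result is rounded down to a whole minute.
    total_min=$(( elapsed_min * 100 / percent ))
    remaining_min=$(( total_min - elapsed_min ))
    echo "total=${total_min} remaining=${remaining_min}"
}

# Example reading: 15% of the pass after 2 hours (120 minutes).
estimate_pass_minutes 120 15
```

Run as written, the example prints total=800 remaining=680, which is about 13 hours 20 minutes for the whole pass with about 11 hours 20 minutes still to go. Treat the figure as a lower bound, since real runs have taken up to twice the projected time.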

  • Will memtest provide a log file of the results?
    No, however it will provide a summary on conclusion of the test.


6 Comments

Is there any other way to test system memory (RAM) for faults, one that lets us be specific about the error?

Nice article!

Note this will not work on UEFI systems.

Will an updated memory testing tool be released for EFI systems?

I hope the below helps. From https://www.memtest86.com/download.htm

"IMPORTANT: MemTest86 V9 images support only UEFI boot. On machines that don't support UEFI, MemTest86 will not boot. Please download the older V4 BIOS release of MemTest86 instead.

Installation and usage instructions are available on the Technical Information page

MemTest86 is a stand-alone program that does not require or use any operating system for execution. The version of Windows, Linux, or Mac being used is irrelevant for execution. However, you must use either Windows, Linux or Mac to create a bootable USB drive."

FYI It ships with RHEL.

Thanks Mark! It seems like memtest86 v9 is strictly better than memtest86+. Is there any reason why this article mentions the ‘+’ version only, or is it just outdated?

If so, since I am new here, how does the process of updating knowledge articles work? Do I submit a request, or will that be taken care of by someone from the RHEL team?