Why do the error messages "FWNMI: corrupt r3" appear while installing Red Hat Enterprise Linux 5.1 on an IBM Power 720 Express Server (8202-E4B)?

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux 5.1, 5.5

Issue

  • The following error messages appear while installing RHEL 5.1 on an IBM Power 720 Express Server (8202-E4B):

FWNMI: corrupt r3
Machine check in kernel mode
Caused by (from SRR1=80000******): Transfer error ack signal
cpu 0x0 : Vector : 200(Machine Check) at [***********]
pc:************: .hrtimer_run_queues+***/***
lr:************: .hrtimer_run_queues+***/***
sp:************
msr:************
current = ****************
paca = *******************
pid =0, comm = swapper

Resolution

  • According to the hardware compatibility records, the *Power 720 Express Server (8202-E4B)* is supported with Red Hat Enterprise Linux 5.5, not with 5.1.

  • Refer to the following links for more details:

https://hardware.redhat.com/show.cgi?id=654882
https://hardware.redhat.com/show.cgi?id=632428

Root Cause

  • The error appears to be caused by buffer corruption. The FWNMI code reads the RTAS (Run-Time Abstraction Services) error information into a global buffer without taking any lock. If two CPUs take a machine check at the same time, they both write to this buffer and corrupt it. Because most FWNMI RTAS messages are not of the extended type, the fix adds a 64-bit per-CPU buffer and uses it where possible. When an extended RTAS log is received, the code falls back to the old behaviour of using the global buffer.
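
The buffer selection described above can be sketched in user-space C. This is only an illustrative model under stated assumptions, not the kernel's actual code: `NR_CPUS`, `GLOBAL_BUF_SIZE`, `struct fake_rtas_log`, and `save_error_log` are hypothetical names invented for this example, and the `extended` flag stands in for however the real code detects an extended RTAS log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS 4             /* hypothetical CPU count for this sketch */
#define GLOBAL_BUF_SIZE 2048  /* hypothetical size of the shared buffer */

/* Shared buffer: with no lock, two CPUs writing at once corrupt it. */
static unsigned char global_buf[GLOBAL_BUF_SIZE];

/* One 64-bit slot per CPU: each CPU only ever writes its own slot. */
static uint64_t percpu_buf[NR_CPUS];

/* Hypothetical stand-in for an RTAS error log. */
struct fake_rtas_log {
    int extended;               /* non-zero for an extended log */
    size_t len;                 /* number of bytes in data */
    const unsigned char *data;
};

/*
 * Copy the log into the per-CPU slot when it is not extended and fits
 * in 64 bits; otherwise fall back to the shared global buffer.
 * Returns a pointer to the buffer that was used.
 */
const void *save_error_log(unsigned cpu, const struct fake_rtas_log *log)
{
    assert(cpu < NR_CPUS);

    if (!log->extended && log->len <= sizeof(percpu_buf[cpu])) {
        percpu_buf[cpu] = 0;
        memcpy(&percpu_buf[cpu], log->data, log->len);
        return &percpu_buf[cpu];
    }

    /* Old behaviour: still unsafe if two CPUs reach this path together. */
    size_t n = log->len < GLOBAL_BUF_SIZE ? log->len : GLOBAL_BUF_SIZE;
    memcpy(global_buf, log->data, n);
    return global_buf;
}
```

In this model, two CPUs that each save a short, non-extended log get distinct storage and cannot overwrite each other. The race remains only on the fallback path, which matches the article's description of the change as reducing the chance of corruption rather than eliminating it.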

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
