Fatal MCEs occur on multiple compute nodes during execution of ("REP; MOVS*") instructions in copy_page()
Issue
- Fatal MCEs (Machine Check Exceptions) occur on multiple compute nodes during execution of ("REP; MOVS*") instructions in copy_page(), and the kernel panics:
core: [Hardware Error]: CPU 85: Machine Check Exception: f Bank 1: bd80000000100134
core: [Hardware Error]: RIP 10:<ffffffff92939aa7> {copy_page+0x7/0x10}
core: [Hardware Error]: TSC 60e4bb1a677e90 ADDR 5fd9886480 MISC 86
core: [Hardware Error]: PROCESSOR 0:50657 TIME 1744199090 SOCKET 1 APIC 63 microcode 5003102
core: [Hardware Error]: Run the above through 'mcelog --ascii'
core: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel
Kernel panic - not syncing: Fatal local machine check
......
#0 [fffffe00010fcc60] machine_kexec at ffffffff9206156e
#1 [fffffe00010fccb8] __crash_kexec at ffffffff9218f9ed
#2 [fffffe00010fcd80] panic at ffffffff920e0df7
#3 [fffffe00010fce10] mce_rdmsrl at ffffffff9203b6d3
#4 [fffffe00010fce48] do_machine_check at ffffffff9203c95a
#5 [fffffe00010fcf50] machine_check at ffffffff92a0112b
[exception RIP: copy_page+7]
RIP: ffffffff92939aa7 RSP: ffffa4481eb5b7b0 RFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff949dfe743d80 RCX: 0000000000000170
RDX: 0000000000000086 RSI: ffff948b59886480 RDI: ffff9561ac486480
RBP: fffff4503f660000 R8: 0000000000030688 R9: 0000000000030680
R10: 0000000000000007 R11: 0000000000000000 R12: fffff45398b12180
R13: fffff4503f662180 R14: fffff45398b10000 R15: 0000000000000200
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <MCE exception stack> ---
#6 [ffffa4481eb5b7b0] copy_page at ffffffff92939aa7
......
crash> dis -lr ffffffff92939aa7
/usr/src/debug/kernel-4.18.0-305.12.1.el8_4/linux-4.18.0-305.12.1.el8_4.x86_64/arch/x86/lib/copy_page_64.S: 17
0xffffffff92939aa0 <copy_page>: xchg %ax,%ax
/usr/src/debug/kernel-4.18.0-305.12.1.el8_4/linux-4.18.0-305.12.1.el8_4.x86_64/arch/x86/lib/copy_page_64.S: 18
0xffffffff92939aa2 <copy_page+2>: mov $0x200,%ecx
/usr/src/debug/kernel-4.18.0-305.12.1.el8_4/linux-4.18.0-305.12.1.el8_4.x86_64/arch/x86/lib/copy_page_64.S: 19
0xffffffff92939aa7 <copy_page+7>: rep movsq %ds:(%rsi),%es:(%rdi) <<----- The trapping instruction
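The values in the log above can be cross-checked by hand. The following is a minimal sketch, not part of the original analysis, that decodes the Bank 1 status word using the IA32_MCi_STATUS bit layout from the Intel SDM and compares the register dump with the reported ADDR. The helper names `decode_mci_status` and `page_offset_line` are hypothetical and exist only for this illustration.

```python
# IA32_MCi_STATUS flag bit positions (Intel SDM, Machine-Check Architecture).
STATUS_FLAGS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
    "MISCV": 59, "ADDRV": 58, "PCC": 57, "S": 56, "AR": 55,
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error codes."""
    flags = {name: (status >> bit) & 1 for name, bit in STATUS_FLAGS.items()}
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


def page_offset_line(addr):
    """Return the 64-byte cache-line index of addr within its 4 KiB page."""
    return (addr & 0xFFF) >> 6


# "Bank 1: bd80000000100134" from the log.
status = decode_mci_status(0xBD80000000100134)
flags = status["flags"]

# UC=1, S=1, AR=1, PCC=0 is the "software recoverable action required"
# combination; it is fatal here because the load happened in kernel code.
is_srar = bool(flags["UC"] and flags["S"] and flags["AR"] and not flags["PCC"])

# copy_page() loads 0x200 quadwords into %ecx; RCX in the dump (0x170)
# is the number of quadwords still to be copied when the MCE hit.
bytes_copied = (0x200 - 0x170) * 8

# RSI (source virtual address) and ADDR (reported physical address).
rsi = 0xFFFF948B59886480
mce_addr = 0x5FD9886480
same_line = page_offset_line(rsi) == page_offset_line(mce_addr)

print(f"flags={flags}")
print(f"mca_error_code={status['mca_error_code']:#06x}")
print(f"is_srar={is_srar} bytes_copied={bytes_copied:#x} same_line={same_line}")
```

Under these assumptions, the status word describes an uncorrected error with a valid address, and the MCA error code 0x0134 falls in the cache hierarchy error format (a data read). The source pointer in RSI and the reported ADDR share the same offset within the page, which suggests the fault was raised on the read side of the copy rather than on the write to RDI. This is a consistency check on the dump, not proof of the root cause.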
Environment
- Red Hat Enterprise Linux 8
- Red Hat Enterprise Linux 9 with a kernel older than kernel-5.14.0-503.11.1.el9_5.x86_64 (9.5 GA)
- Intel CPUs (Skylake / Cascade Lake / Cooper Lake)
- ("REP; MOVS*") instructions executed in copy_page()
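To check whether a given node matches the CPU families listed above, the `PROCESSOR 0:50657` field of the MCE record can be decoded. The sketch below assumes the second number is the CPUID signature (leaf 1 EAX) printed in hexadecimal and splits it using the standard family/model/stepping layout; `decode_cpuid_signature` is a hypothetical helper name.

```python
def decode_cpuid_signature(sig):
    """Split a CPUID leaf 1 EAX signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only used for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# "PROCESSOR 0:50657" from the MCE record above.
family, model, stepping = decode_cpuid_signature(0x50657)
print(f"family={family:#x} model={model:#x} stepping={stepping}")
# prints: family=0x6 model=0x55 stepping=7
```

Family 6, model 0x55 is the model number the Linux kernel names `INTEL_FAM6_SKYLAKE_X`, which is shared by the server generations listed above; the stepping is what distinguishes them. On a running system, the same fields appear as `cpu family`, `model`, and `stepping` in /proc/cpuinfo.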