System crashed at 'kernel BUG at mm/hugetlb.c:1257!'

Solution Verified

Issue

  • The system crashed after logging the following kernel messages:
[10848907.314843] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 65534
[10848907.314846] {3}[Hardware Error]: It has been corrected by h/w and requires no further action
[10848907.314848] {3}[Hardware Error]: event severity: corrected
[10848907.314850] {3}[Hardware Error]:  Error 0, type: corrected
[10848907.314852] {3}[Hardware Error]:   section type: unknown, 330f1140-72a5-11df-9690-0002a5d5c51b
[10848907.314853] {3}[Hardware Error]:  Error 1, type: corrected
[10848907.314855] {3}[Hardware Error]:   section type: unknown, 330f1140-72a5-11df-9690-0002a5d5c51b
[10848907.314861] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[10848907.314862] {4}[Hardware Error]: It has been corrected by h/w and requires no further action
[10848907.314863] {4}[Hardware Error]: event severity: corrected
[10848907.314865] {4}[Hardware Error]:  Error 0, type: corrected
[10848907.314866] {4}[Hardware Error]:  fru_text: B1
[10848907.314867] {4}[Hardware Error]:   section_type: memory error
[10848907.314869] {4}[Hardware Error]:   error_status: 0x0000000000000400
[10848907.314870] {4}[Hardware Error]:   physical_address: 0x00000055d91c3000
[10848907.314873] {4}[Hardware Error]:   node: 2 card: 0 module: 0 rank: 0 bank: 2 row: 39711 column: 128 
[10848907.314875] {4}[Hardware Error]:   error_type: 13, scrub corrected error
[10848907.314877] {4}[Hardware Error]:   DIMM location: not present. DMI handle: 0x0000 
[10848907.314884] mce: [Hardware Error]: Machine check events logged
[10848907.314904] EDAC skx MC2: HANDLING MCE MEMORY ERROR
[10848907.314906] EDAC skx MC2: CPU 0: Machine Check Event: 0 Bank 1: 940000000000009f
[10848907.314908] EDAC skx MC2: TSC 4d171185969754 
[10848907.314910] EDAC skx MC2: ADDR 55d91c3000 
[10848907.314912] EDAC skx MC2: MISC 0 
[10848907.314913] EDAC skx MC2: PROCESSOR 0:50654 TIME 1577728518 SOCKET 0 APIC 0
[10848907.314923] EDAC MC2: 0 CE memory read error on CPU_SrcID#1_MC#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x55d91c3 offset:0x0 grain:32 syndrome:0x0 -  err_code:0000:009f socket:1 imc:0 rank:0 bg:2 ba:1 row:1920e col:3d0)
[10848907.315360] soft offline: 0x55d91c3: migration failed 1, type 6fffff00008000
[10848907.315388] ------------[ cut here ]------------
[10848907.320279] kernel BUG at mm/hugetlb.c:1257!
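
The messages above refer to the same memory location in three different forms: the APEI record reports a physical address, while the EDAC and soft offline lines report a page frame number. The following Python snippet is a minimal sketch of how these values can be cross-checked and how the flag bits of the logged machine check status can be decoded. It assumes 4 KiB base pages (PAGE_SHIFT = 12) and the IA32_MCi_STATUS flag positions from the Intel SDM. The helper names (find_hex, decode_status, summarize) are hypothetical and are not part of any Red Hat tool.

```python
import re

# Lines copied verbatim from the log above.
LOG = """\
[10848907.314870] {4}[Hardware Error]:   physical_address: 0x00000055d91c3000
[10848907.314906] EDAC skx MC2: CPU 0: Machine Check Event: 0 Bank 1: 940000000000009f
[10848907.314923] EDAC MC2: 0 CE memory read error on CPU_SrcID#1_MC#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x55d91c3 offset:0x0 grain:32 syndrome:0x0 -  err_code:0000:009f socket:1 imc:0 rank:0 bg:2 ba:1 row:1920e col:3d0)
[10848907.315360] soft offline: 0x55d91c3: migration failed 1, type 6fffff00008000
"""

PAGE_SHIFT = 12  # assumption: 4 KiB base pages

# Flag bit positions in IA32_MCi_STATUS (assumed from the Intel SDM).
STATUS_FLAGS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
    "MISCV": 59, "ADDRV": 58, "PCC": 57,
}


def find_hex(pattern, text):
    """Return the first capture group of `pattern` parsed as a hex integer."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return int(match.group(1), 16)


def decode_status(status):
    """Map each status flag name to True/False for the given 64-bit value."""
    return {name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()}


def summarize(log):
    """Extract the address, PFNs, and status flags from the log excerpt."""
    phys = find_hex(r"physical_address: (0x[0-9a-f]+)", log)
    status = find_hex(r"Machine Check Event: \d+ Bank \d+: ([0-9a-f]+)", log)
    return {
        "apei_pfn": phys >> PAGE_SHIFT,  # physical address -> page frame number
        "edac_pfn": find_hex(r"page:(0x[0-9a-f]+)", log),
        "soft_offline_pfn": find_hex(r"soft offline: (0x[0-9a-f]+):", log),
        "flags": decode_status(status),
    }


if __name__ == "__main__":
    info = summarize(LOG)
    for key in ("apei_pfn", "edac_pfn", "soft_offline_pfn"):
        print(f"{key}: {info[key]:#x}")
    print("flags:", info["flags"])
```

With the lines above, all three page frame numbers resolve to 0x55d91c3 and the UC flag is clear. This agrees with the log reporting a corrected error on the page that the kernel then tried to soft offline immediately before the BUG.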

Environment

  • Red Hat Enterprise Linux 7
    • kernel-3.10.0-957.21.3.el7.x86_64
    • kernel-3.10.0-1062.9.1.el7.x86_64
