Unexplained behaviour of RHEL systems with Broadwell/Haswell CPUs

Solution Verified - Updated -

Issue

  • System becomes sluggish for a while even though it is under no load.
  • System crashes with a General Protection Fault.
  • Kernel crashes with logs such as the following:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffc027a5ef>] xlog_cil_push+0x18f/0x430 [xfs]
PGD 0
Oops: 0000 [#1] SMP
CPU: 5 PID: 32756 Comm: kworker/5:2 Kdump: loaded Tainted: G        W      ------------ T 3.10.0-1062.1.1.el7.x86_64 #1
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 01/22/2018
Workqueue: xfs-cil/dm-19 xlog_cil_push_work [xfs]
task: ffff9baa63f8a0e0 ti: ffff9bab33b4c000 task.ti: ffff9bab33b4c000
RIP: 0010:[<ffffffffc027a5ef>]  [<ffffffffc027a5ef>] xlog_cil_push+0x18f/0x430 [xfs]
Call Trace:
 [<ffffffff918ae168>] ? add_timer+0x18/0x20
 [<ffffffff918bb3db>] ? __queue_delayed_work+0x8b/0x1a0
 [<ffffffffc027a8a5>] xlog_cil_push_work+0x15/0x20 [xfs]
 [<ffffffff918bd0ff>] process_one_work+0x17f/0x440
 [<ffffffff918be368>] worker_thread+0x278/0x3c0
 [<ffffffff918be0f0>] ? manage_workers.isra.26+0x2a0/0x2a0
 [<ffffffff918c50d1>] kthread+0xd1/0xe0
 [<ffffffff918c5000>] ? insert_kthread_work+0x40/0x40
 [<ffffffff91f8cd37>] ret_from_fork_nospec_begin+0x21/0x21
 [<ffffffff918c5000>] ? insert_kthread_work+0x40/0x40
Code: 46 08 48 39 85 58 ff ff ff 74 61 45 31 ff eb 28 0f 1f 40 00 48 8b 70 10 49 89 37 4c 8b 78 10 48 c7 40 10 00 00 00 00 49 8b 46 08 <45> 03 6f 08 48 39 85 58 ff ff ff 74 34 48 89 c7 48 89 85 68 ff
RIP  [<ffffffffc027a5ef>] xlog_cil_push+0x18f/0x430 [xfs]
 RSP <ffff9bab33b4fd48>
CR2: 0000000000000008

[32386.065177] ------------[ cut here ]------------
[32386.070257] kernel BUG at fs/jbd2/journal.c:2482!
[32386.075426] invalid opcode: 0000 [#1] SMP 
[32386.226624] CPU: 31 PID: 7287 Comm: jbd2/dm-96-8 Not tainted 3.10.0-514.el7.x86_64 #1
[32386.235232] Hardware name: HP Superdome2 16s x86, BIOS Bundle: 008.004.084 SFW: 043.025.000 08/16/2016
[32386.245464] task: ffff887f7381edd0 ti: ffff887f7f284000 task.ti: ffff887f7f284000
[32386.253691] RIP: 0010:[<ffffffffa071ace2>]  [<ffffffffa071ace2>] jbd2_journal_put_journal_head+0x142/0x146 [jbd2]
[32386.348816] Stack:
[32386.351025]  ffff881911ea9400 ffff885f7ba34000 ffff887f7f287ca0 ffffffffa0713dbb
[32386.359192]  ffffffff811899e0 ffff885f7ba343a0 ffff881ff6b5ca92 ffff88de9820ce38
[32386.367365]  ffff88dea9015000 ffff881fe8d67600 ffff887f7f287e40 ffffffffa07120f6
[32386.375535] Call Trace:
[32386.378257]  [<ffffffffa0713dbb>] __jbd2_journal_remove_checkpoint+0x5b/0x160 [jbd2]
[32386.386782]  [<ffffffff811899e0>] ? free_pages.part.80+0x40/0x50
[32386.393389]  [<ffffffffa07120f6>] jbd2_journal_commit_transaction+0x1106/0x19a0 [jbd2]
[32386.402100]  [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
[32386.408131]  [<ffffffffa0716e99>] kjournald2+0xc9/0x260 [jbd2]
[32386.414551]  [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[32386.420967]  [<ffffffffa0716dd0>] ? commit_timeout+0x10/0x10 [jbd2]
[32386.427858]  [<ffffffff810b052f>] kthread+0xcf/0xe0
[32386.433218]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[32386.440402]  [<ffffffff81696418>] ret_from_fork+0x58/0x90
[32386.446335]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[32386.453509] Code: c7 c6 80 b5 71 a0 48 c7 c7 e8 d8 71 a0 31 c0 e8 2b 47 f6 e0 48 8b 73 20 49 8b 7c 24 20 e8 f7 f8 ff ff e9 64 ff ff ff 0f 0b 0f 0b <0f> 0b 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 0f 1f 44 00 00 
[32386.474936] RIP  [<ffffffffa071ace2>] jbd2_journal_put_journal_head+0x142/0x146 [jbd2]
[32386.483653]  RSP <ffff887f7f287c50>

Environment

  • Red Hat Enterprise Linux
  • Seen on the following systems:

    • HP Superdome2 16s x86, BIOS Bundle: 008.008.034 SFW: 045.018.000 10/01/2019
    • HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 10/21/2019
    • Huawei RH2288H V3/BC11HGSA0 BIOS 3.79 11/07/2017 Insyde Corp. RH2288H V3
    • Huawei XH628 V3/BC21HGSA0, BIOS 3.50 11/23/2016
    • Cloudian HSA-1512/S2PH-MB, BIOS S2P_3B13.01 07/12/2019
    • Radisys DCE-CSLED-V2-2-001/S2600TPR, BIOS SE5C610.86B.01.01.0027.071020182329 07/10/2018
  • Seen on the following Intel(R) Xeon(R) v4 (Broadwell) CPUs:

    • Intel(R) Xeon(R) CPU E7-8891 v4 @ 2.80GHz ff-mm-ss: 06-4f-01 microcode: 0xb000038
    • Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz ff-mm-ss: 06-4f-01 microcode: 0xb000036
    • Intel(R) Xeon(R) CPU E7-8855 v4 microcode: sig=0x406f1, pf=0x80, revision=0xb000033
    • Intel(R) Xeon(R) CPU E7-8893 v4 @ 3.20GHz microcode: sig=0x406f1, pf=0x80, revision=0xb00002e
    • Intel(R) Xeon(R) CPU E5-2630L v4 microcode: sig=0x406f1, pf=0x1, revision=0xb00002e
    • Intel(R) Xeon(R) CPU E7-8894 v4 @ 2.40GHz microcode: sig=0x406f1, pf=0x80, revision=0xb00002a
    • Intel(R) Xeon(R) CPU E7-8880 v4 @ 2.20GHz microcode: sig=0x406f1, pf=0x80, revision=0xb000021
    • Intel(R) Xeon(R) CPU E7-8891 v4 microcode: sig=0x406f1, pf=0x80, revision=0xb000020
    • Intel(R) Xeon(R) CPU E5-2620 v4 microcode: sig=0x406f1, pf=0x1, revision=0xb000036
    • Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz ff-mm-ss: 06-4f-01 microcode: 0xb00002a
    • Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz microcode: sig=0x406f1, pf=0x1, revision=0xb000021
    • Intel(R) Xeon(R) CPU E5-2618L v4 microcode: sig=0x406f1, pf=0x8, revision=0xb000021
  • Seen on the following Intel(R) Xeon(R) v3 (Haswell) CPUs:

    • Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz ff-mm-ss: 06-3f-04 microcode: 0x12
    • Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz microcode: sig=0x306f4, pf=0x80, revision=0xd
    • Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz microcode: sig=0x306f4, pf=0x80, revision=0x16
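The CPU lists above use two notations for the same identifier: the raw CPUID signature (for example sig=0x406f1) and the family-model-stepping form (for example ff-mm-ss: 06-4f-01). The sketch below shows how one maps to the other, assuming the standard CPUID leaf 1 EAX bit layout. The function name decode_signature is illustrative and does not come from the original article.

```python
def decode_signature(sig: int) -> str:
    """Convert a CPUID signature (leaf 1, EAX) to 'ff-mm-ss' form."""
    stepping = sig & 0xF             # bits 3:0
    base_model = (sig >> 4) & 0xF    # bits 7:4
    base_family = (sig >> 8) & 0xF   # bits 11:8
    ext_model = (sig >> 16) & 0xF    # bits 19:16
    ext_family = (sig >> 20) & 0xFF  # bits 27:20

    # The extended model is only meaningful for base family 6 or 15.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    # The extended family is only added when the base family is 15.
    family = base_family + ext_family if base_family == 0xF else base_family

    return f"{family:02x}-{model:02x}-{stepping:02x}"


if __name__ == "__main__":
    # Signatures taken from the v4 and v3 lists above.
    for sig in (0x406F1, 0x306F4):
        print(hex(sig), "->", decode_signature(sig))
```

With this mapping, every v4 entry above corresponds to 06-4f-01 and every v3 entry to 06-3f-04, so the two notations in the lists describe the same two CPU models.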
