Kernel panic at __page_frag_cache_drain with memory corruption

Solution Verified

Issue

  • The kernel panics with the following log:
[262416.022179] BUG: unable to handle kernel paging request at 0000159780003437
[262416.023090] PGD 0 
[262416.024174] Oops: 0002 [#1] SMP NOPTI
[262416.025248] CPU: 89 PID: 1705188 Comm: kworker/u224:2 Kdump: loaded Tainted: P        W  OE    --------- -  - 4.18.0-372.9.1.el8.x86_64 #1
[262416.027438] Hardware name: HPE ProLiant DL380 Gen10 Plus/ProLiant DL380 Gen10 Plus, BIOS U46 07/14/2022
[262416.028559] Workqueue: iavf iavf_reset_task [iavf]
[262416.029680] RIP: 0010:__page_frag_cache_drain+0x5/0x30
[262416.030797] Code: 0f 0f b6 77 51 85 f6 74 07 31 d2 e9 a5 e0 ff ff e9 30 ff ff ff 48 8b 05 e9 7d 53 01 eb b4 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <f0> 29 77 34 74 01 c3 48 8b 07 f6 c4 80 74 0f 0f b6 77 51 85 f6 74
[262416.033079] RSP: 0018:ff592021cf267df0 EFLAGS: 00010292
[262416.034213] RAX: ffffffffba5a36e0 RBX: ff20120e72230000 RCX: 0000000000000002
[262416.035343] RDX: 000000004c26d000 RSI: 0000000000000000 RDI: 0000159780003403
[262416.036467] RBP: ff201210094b7d00 R08: 0000000000000022 R09: 0000000000000009
[262416.037579] R10: 03408f6e00000000 R11: 0000000000000020 R12: 0000000000000000
[262416.038679] R13: 0000000000001000 R14: 0000000000001760 R15: 0000000000001760
[262416.039764] FS:  0000000000000000(0000) GS:ff20126caf640000(0000) knlGS:0000000000000000
[262416.040848] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[262416.041919] CR2: 0000159780003437 CR3: 0000000f1e610004 CR4: 0000000000771ee0
[262416.042992] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[262416.044052] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[262416.045096] PKRU: 55555554
[262416.046122] Call Trace:
[262416.047133]  iavf_clean_rx_ring+0xad/0x110 [iavf]
[262416.048152]  iavf_free_rx_resources+0xe/0x50 [iavf]
[262416.049158]  iavf_free_all_rx_resources.part.52+0x30/0x40 [iavf]
[262416.050159]  iavf_reset_task+0x1a1/0x7a0 [iavf]
[262416.051142]  process_one_work+0x1a7/0x360
[262416.052099]  ? create_worker+0x1a0/0x1a0
[262416.053049]  worker_thread+0x30/0x390
[262416.053978]  ? create_worker+0x1a0/0x1a0
[262416.054886]  kthread+0x10a/0x120
[262416.055788]  ? set_kthread_struct+0x40/0x40
[262416.056675]  ret_from_fork+0x1f/0x40
  • Other crash patterns have also been observed:
[90357.528069] general protection fault: 0000 [#1] SMP NOPTI
[90357.528901] CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-372.9.1.el8.x86_64 #1
[90357.531964] Hardware name: HPE ProLiant DL380 Gen10 Plus/ProLiant DL380 Gen10 Plus, BIOS U46 07/14/2022
[90357.535094] RIP: 0010:__build_skb_around+0x6d/0xb0
[90357.536179] Code: 9d e8 00 00 00 66 89 85 c6 00 00 00 48 89 e8 c7 85 f4 00 00 00 01 00 00 00 c7 85 d8 00 00 00 00 00 00 00 66 89 95 c2 00 00 00 <48> c7 06 00 00 00 00 48 c7 46 08 00 00 00 00 48 c7 46 10 00 00 00
[90357.538305] RSP: 0018:ff5fca048c908dc8 EFLAGS: 00010206
[90357.539360] RAX: ff4b3a5113b54800 RBX: 1a79a667400cff40 RCX: 0000000000000000
[90357.540413] RDX: 00000000ffffffff RSI: 1a79a667400d0600 RDI: ff4b3a5113b54800
[90357.541457] RBP: ff4b3a5113b54800 R08: ff4b3a5113b54800 R09: 000000000002a680
[90357.542506] R10: ff4b3a4ffe798000 R11: 0000000000000000 R12: ff4b3a4865578000
[90357.543545] R13: 000000000000003c R14: 0000000000000800 R15: 0000000000000000
[90357.544565] FS:  0000000000000000(0000) GS:ff4b3a77af040000(0000) knlGS:0000000000000000
[90357.545580] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[90357.546585] CR2: 000000c006a8b000 CR3: 000000304ba10003 CR4: 0000000000771ee0
[90357.547592] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[90357.548584] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[90357.549563] PKRU: 55555554
[90357.550520] Call Trace:
[90357.551465]  <IRQ>
[90357.552392]  build_skb+0x11/0x80
[90357.553312]  iavf_clean_rx_irq+0x145/0x860 [iavf]
[90357.554226]  ? handle_irq_event_percpu+0x6a/0x80
[90357.555126]  iavf_napi_poll+0x2cb/0x770 [iavf]
[90357.556016]  __napi_poll+0x2d/0x130
[90357.556886]  net_rx_action+0x253/0x320
[90357.557742]  __do_softirq+0xd7/0x2c4
[90357.558587]  irq_exit_rcu+0xcb/0xd0
[90357.559416]  irq_exit+0xa/0x10
[90357.560227]  do_IRQ+0x7f/0xd0
[90357.561021]  common_interrupt+0xf/0xf


[184497.173686] BUG: unable to handle kernel paging request at ffffffffb50fa940
[184497.175127] PGD 1eb3012067 P4D 1eb3013067 PUD 1eb3014063 PMD 1eb1c001e1 
[184497.176634] Oops: 0003 [#1] SMP NOPTI
[184497.178159] CPU: 57 PID: 5760 Comm: blk_mgr Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-372.9.1.el8.x86_64 #1
[184497.181153] Hardware name: HPE ProLiant DL380 Gen10 Plus/ProLiant DL380 Gen10 Plus, BIOS U46 07/14/2022
[184497.182698] RIP: 0010:native_queued_spin_lock_slowpath+0x130/0x1b0
[184497.184293] Code: 10 66 87 47 02 89 c1 c1 e1 10 74 57 c1 e9 12 83 e0 03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 80 bb 02 00 48 03 04 cd 00 28 da b5 <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 02
[184497.187468] RSP: 0018:ff724d87cfc87b98 EFLAGS: 00010082
[184497.189030] RAX: ffffffffb50fa940 RBX: 0000000000000283 RCX: 0000000000003ffe
[184497.190068] RDX: ff2b082e2ff6bb80 RSI: 0000000000e80000 RDI: ff2b07ff800000c0
[184497.191041] RBP: ff724d87cfc87c60 R08: ff2b0835f643bef0 R09: 0000000000000000
[184497.192010] R10: ff2b082ef00000c0 R11: ffb11d44deabe708 R12: 0000000000000000
[184497.192972] R13: ff2b07ff80004540 R14: 0000000000000002 R15: ff2b082e2ff70060
[184497.193929] FS:  0000000000000000(0000) GS:ff2b082e2ff40000(0000) knlGS:0000000000000000
[184497.194899] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[184497.195858] CR2: ffffffffb50fa940 CR3: 0000001eb3010001 CR4: 0000000000771ee0
[184497.196823] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[184497.197782] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[184497.198731] PKRU: 55555554
[184497.199667] Call Trace:
[184497.200606]  _raw_spin_lock_irqsave+0x30/0x40
[184497.201537]  get_partial_node.part.87+0x38/0x280
[184497.202482]  ___slab_alloc.part.89+0x2b7/0x740
[184497.203393]  ? pcpu_block_refresh_hint+0x7f/0xa0
[184497.204294]  ? alloc_cpumask_var_node+0x1b/0x30
[184497.205184]  __kmalloc_node+0xcc/0x2a0
[184497.206057]  ? alloc_cpumask_var_node+0x1b/0x30
[184497.206918]  alloc_cpumask_var_node+0x1b/0x30
[184497.207773]  blk_mq_realloc_hw_ctxs+0x37e/0x5e0
[184497.208618]  blk_mq_init_allocated_queue+0x105/0x420
[184497.209452]  blk_mq_init_queue_data+0x3e/0x70
[184497.210274]  blkDev_Create+0x316/0x730 [scini]
[184497.211127]  mapVolBlkMgr_Thrd+0x405/0xa90 [scini]
[184497.211965]  ? mosTicks_DestroyEnvSpecific+0x10/0x10 [scini]
[184497.212796]  mosOsThrd_Entry+0x1b/0x40 [scini]
[184497.213608]  kthread+0x10a/0x120
[184497.214376]  ? set_kthread_struct+0x40/0x40
[184497.215131]  ret_from_fork+0x1f/0x40

[ 1899.145551] BUG: unable to handle kernel NULL pointer dereference at 0000000000000378
[ 1899.146354] PGD 0 
[ 1899.147294] Oops: 0000 [#1] SMP NOPTI
[ 1899.148228] CPU: 28 PID: 276708 Comm: kworker/28:0 Kdump: loaded Tainted: P        W  OE    --------- -  - 4.18.0-372.9.1.el8.x86_64 #1
[ 1899.150130] Hardware name: HPE ProLiant DL380 Gen10 Plus/ProLiant DL380 Gen10 Plus, BIOS U46 07/14/2022
[ 1899.151106] Workqueue: cgroup_destroy css_release_work_fn
[ 1899.152082] RIP: 0010:cgroup_rstat_flush_locked+0x7d/0x280
[ 1899.153053] Code: 28 7a 99 4c 89 f7 4c 89 74 24 08 e8 6d 39 7f 00 48 8b 04 24 48 89 c1 48 85 c0 0f 84 91 01 00 00 4b 8b 74 e5 00 eb 03 4c 89 f1 <48> 8b 81 78 03 00 00 48 01 f0 4c 8b 70 30 4c 39 f1 75 ea 4c 8b 48
[ 1899.155089] RSP: 0018:ff42eef9c0d67e10 EFLAGS: 00010086
[ 1899.156091] RAX: ff74eec978d80920 RBX: ff29de7290b51000 RCX: 0000000000000000
[ 1899.157097] RDX: 0000000000000001 RSI: ff29de6b2fb80000 RDI: ff29de6b2fb9d6e4
[ 1899.158098] RBP: 000000000000000e R08: 0000000000000008 R09: 0000000000000000
[ 1899.159096] R10: 8080808080808080 R11: 0000000000000010 R12: 000000000000000e
[ 1899.160078] R13: ffffffff997a2800 R14: 0000000000000000 R15: ff29de7290b51098
[ 1899.161066] FS:  0000000000000000(0000) GS:ff29de9b2ee00000(0000) knlGS:0000000000000000
[ 1899.162075] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1899.163071] CR2: 0000000000000378 CR3: 0000005ed3e10001 CR4: 0000000000771ee0
[ 1899.164068] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1899.165049] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1899.166015] PKRU: 55555554
[ 1899.166963] Call Trace:
[ 1899.167896]  cgroup_rstat_flush+0x27/0x40
[ 1899.168823]  css_release_work_fn+0x139/0x220
[ 1899.169740]  process_one_work+0x1a7/0x360
[ 1899.170649]  ? create_worker+0x1a0/0x1a0
[ 1899.171543]  worker_thread+0x30/0x390
[ 1899.172424]  ? create_worker+0x1a0/0x1a0
[ 1899.173290]  kthread+0x10a/0x120
[ 1899.174144]  ? set_kthread_struct+0x40/0x40
[ 1899.174991]  ret_from_fork+0x1f/0x40
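
The logs above fault in different functions, and only some of them show iavf frames in the call trace. When comparing a captured oops against these patterns, it can help to pull out the faulting function from the `RIP:` line and the modules named in the call trace. The following is a minimal sketch of that, assuming the log is plain `dmesg`-style text containing a single oops; `summarize_oops` is a hypothetical helper written for illustration, not a Red Hat tool.

```python
import re

# Matches e.g. "RIP: 0010:__page_frag_cache_drain+0x5/0x30"
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Matches call-trace frames such as
# "[262416.047133]  iavf_clean_rx_ring+0xad/0x110 [iavf]".
# A leading "? " marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"\]\s+(?P<uncertain>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?\s*$"
)


def summarize_oops(log_text):
    """Return the faulting function, reliable frames, and modules seen."""
    rip = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if rip is None:
            m = RIP_RE.search(line)
            if m:
                rip = m.group("func")
                continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.search(line)
            # Skip frames prefixed with "? " (unreliable entries).
            if m and not m.group("uncertain"):
                frames.append((m.group("func"), m.group("module")))
    modules = sorted({mod for _, mod in frames if mod})
    return {"rip": rip, "frames": frames, "modules": modules}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_oops.py < vmcore-dmesg.txt
    summary = summarize_oops(sys.stdin.read())
    print("Faulting function:", summary["rip"])
    print("Modules in call trace:", ", ".join(summary["modules"]) or "(none)")
```

This only summarizes the text of the log. A call trace without iavf frames, as in the `native_queued_spin_lock_slowpath` and `cgroup_rstat_flush_locked` logs above, does not by itself rule out the same underlying memory corruption.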

Environment

  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 9
  • OCP 4.10
  • ice/iavf driver
