RHEL8: Large number of refcount_t overflow messages that are sometimes followed by a crash

Solution Verified

Issue

  • A large number of refcount_t overflow messages are logged, sometimes followed by a crash:
[971170.499671] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in su[3087833], uid/euid: 0/0
[975665.799431] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in IndexerTPoolWor[3781778], uid/euid: 30872/30872
[985670.343013] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[988278.646874] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[991565.632746] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[993986.287413] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in StreamSearch[1253844], uid/euid: 30872/30872
[1007486.127435] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1016494.384002] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in systemd[1], uid/euid: 0/0
[1041958.213026] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1061178.942438] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1078277.567973] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1097773.613079] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1098081.078401] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in systemd[1], uid/euid: 0/0
[1104678.755488] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1121170.122385] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1128375.667499] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1140071.991238] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in polkitd[1617], uid/euid: 999/999
[1142511.714927] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1142511.714921] ------------[ cut here ]------------
[1142511.714927] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in kswapd0[239], uid/euid: 0/0
[1142511.714935] WARNING: CPU: 4 PID: 239 at kernel/panic.c:703 refcount_error_report+0x98/0x9d
[1142511.714940] Modules linked in: [...]
[1142511.714977] Red Hat flags: eBPF/rawtrace
[1142511.714979] CPU: 4 PID: 239 Comm: kswapd0 Kdump: loaded Tainted: G        W    L   --------- -  - 4.18.0-425.10.1.el8_7.x86_64 #1
[1142511.714982] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[1142511.714983] RIP: 0010:refcount_error_report+0x98/0x9d
[1142511.714986] Code: 8b 84 24 00 09 00 00 48 8b 95 80 00 00 00 49 8d 8c 24 e0 0a 00 00 41 55 41 89 c1 48 89 de 48 c7 c7 78 e9 ad 8b e8 05 00 00 00 <0f> 0b 58 eb 8b 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 c7 c7 d8
[1142511.714988] RSP: 0018:ffffb0a5c6f538f0 EFLAGS: 00010086
[1142511.714990] RAX: 0000000000000000 RBX: ffffffff8baef958 RCX: 0000000000000027
[1142511.714992] RDX: 0000000000000027 RSI: 00000000ffff7fff RDI: ffff9cf67df16690
[1142511.714993] RBP: ffffb0a5c6f539a8 R08: 0000000000000000 R09: c0000000ffff7fff
[1142511.714995] R10: 0000000000000001 R11: ffffb0a5c6f53708 R12: ffff9ce7b9be8000
[1142511.714996] R13: 0000000000000000 R14: 0000000000000000 R15: ffffb0a5c6f539a8
[1142511.714997] FS:  0000000000000000(0000) GS:ffff9cf67df00000(0000) knlGS:0000000000000000
[1142511.714999] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1142511.715000] CR2: 00005564d5cce400 CR3: 0000000919e10004 CR4: 00000000007706e0
[1142511.715042] PKRU: 55555554
[1142511.715044] Call Trace:
[1142511.715048]  ex_handler_refcount+0x4e/0x80
[1142511.715053]  fixup_exception+0x33/0x46
[1142511.715055]  do_trap+0x4c/0x110
[1142511.715060]  ? __noinstr_text_end+0x8c3/0x2bd9
[1142511.715065]  do_invalid_op+0x36/0x40
[1142511.715066]  ? __noinstr_text_end+0x8c3/0x2bd9
[1142511.715068]  invalid_op+0x14/0x20
[1142511.715071] RIP: 0010:mem_cgroup_id_get_online+0x7a/0xa0
[1142511.715073] Code: 48 0f 44 f8 eb af 85 c0 74 d7 89 c2 8d 48 01 c1 e8 1f 81 fa ff ff ff 7f 41 0f 94 c0 41 08 c0 75 04 39 d1 7d b0 e9 db 7e 6a 00 <48> 89 f8 e9 2e 95 8c 00 0f 0b 48 89 f8 e9 24 95 8c 00 48 89 c7 e9
[1142511.715075] RSP: 0018:ffffb0a5c6f53a58 EFLAGS: 00010812
[1142511.715077] RAX: ffff9ce6d5846000 RBX: ffffee3ff7883840 RCX: ffff9ce6d5846134
[1142511.715078] RDX: 00000000c0000000 RSI: ffff9ce6d5846134 RDI: ffff9ce6d5846000
[1142511.715079] RBP: 00000000003895cc R08: 0000000000000246 R09: 0000000000000011
[1142511.715080] R10: ffff9ce78a64b7a0 R11: 0000000000000000 R12: ffff9ce6d5846000
[1142511.715081] R13: 0000000000000000 R14: ffff9ce78a64b7a8 R15: 00000000003895cc
[1142511.715084]  mem_cgroup_swapout+0x4f/0x170
[1142511.715088]  __remove_mapping+0x158/0x220
[1142511.715092]  shrink_page_list+0x91c/0xca0
[1142511.715095]  shrink_inactive_list+0x19e/0x3e0
[1142511.715097]  shrink_lruvec+0x474/0x6c0
[1142511.715100]  shrink_node+0x22e/0x700
[1142511.715102]  balance_pgdat+0x2d7/0x550
[1142511.715104]  kswapd+0x201/0x3c0
[1142511.715106]  ? finish_wait+0x80/0x80
[1142511.715109]  ? balance_pgdat+0x550/0x550
[1142511.715111]  kthread+0x10b/0x130
[1142511.715115]  ? set_kthread_struct+0x50/0x50
[1142511.715117]  ret_from_fork+0x1f/0x40
[1142511.715123] ---[ end trace deb8deeac20add33 ]---
  • A general protection fault (GPF) in memcg_flush_lruvec_page_state() sometimes follows the overflow messages:
[1161523.637331] general protection fault: 0000 [#1] SMP NOPTI
[1161523.637339] CPU: 11 PID: 1830388 Comm: kworker/11:0 Kdump: loaded Tainted: G        W    L   --------- -  - 4.18.0-425.10.1.el8_7.x86_64 #1
[1161523.637342] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[1161523.637344] Workqueue: events percpu_stats_free_rwork_fn
[1161523.637352] RIP: 0010:memcg_flush_lruvec_page_state.part.55+0x7d/0x140
[1161523.637356] Code: 00 48 c7 c5 40 c8 ba 8b 4c 8d ac 24 38 01 00 00 4c 63 c7 48 89 e2 4b 8b b4 c4 98 10 00 00 48 8b 86 90 00 00 00 48 03 44 dd 00 <48> 63 08 48 83 c2 08 c7 00 00 00 00 00 48 83 c0 04 48 89 4a f8 4c
[1161523.637358] RSP: 0018:ffffb0a5e82bbd08 EFLAGS: 00010087
[1161523.637360] RAX: ffff39e992ee6488 RBX: 0000000000000000 RCX: 0000000000000000
[1161523.637362] RDX: ffffb0a5e82bbd08 RSI: ffff9cf3150e6400 RDI: 0000000000000000
[1161523.637363] RBP: ffffffff8bbac840 R08: 0000000000000000 R09: 0000000000000000
[1161523.637364] R10: 8080808080808080 R11: 0000000000000001 R12: ffff9ce688e4c000
[1161523.637366] R13: ffffb0a5e82bbe40 R14: ffff9ce812616000 R15: ffff9ce688e4c588
[1161523.637367] FS:  0000000000000000(0000) GS:ffff9cf67e0c0000(0000) knlGS:0000000000000000
[1161523.637369] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1161523.637370] CR2: 000000c0001e1000 CR3: 0000000919e10003 CR4: 00000000007706e0
[1161523.637399] PKRU: 55555554
[1161523.637400] Call Trace:
[1161523.637408]  ? update_load_avg+0x7e/0x710
[1161523.637414]  ? update_load_avg+0x7e/0x710
[1161523.637416]  ? set_next_entity+0xb5/0x1e0
[1161523.637418]  ? cpumask_next+0x17/0x20
[1161523.637423]  ? cgroup_rstat_flush_locked+0x2f/0x280
[1161523.637427]  percpu_stats_free_rwork_fn+0x6b/0x130
[1161523.637429]  process_one_work+0x1a7/0x360
[1161523.637435]  ? create_worker+0x1a0/0x1a0
[1161523.637436]  worker_thread+0x30/0x390
[1161523.637438]  ? create_worker+0x1a0/0x1a0
[1161523.637440]  kthread+0x10b/0x130
[1161523.637444]  ? set_kthread_struct+0x50/0x50
[1161523.637447]  ret_from_fork+0x1f/0x40
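
To gauge how often the overflow is reported and which tasks hit it, the messages above can be tallied from a saved kernel log. The following is a minimal sketch, not a Red Hat tool: the function name summarize_overflows and the regular expression are illustrative assumptions based only on the message format shown in this article.

```python
import re
from collections import Counter

# Matches lines such as:
# [971170.499671] refcount_t overflow at mem_cgroup_id_get_online+0x7a/0xa0 in su[3087833], uid/euid: 0/0
# The pattern is an assumption derived from the log excerpts in this article.
OVERFLOW_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+refcount_t overflow at "
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ in "
    r"(?P<comm>.+?)\[(?P<pid>\d+)\], uid/euid: (?P<uid>\d+)/(?P<euid>\d+)"
)


def summarize_overflows(log_text):
    """Count refcount_t overflow reports per kernel symbol and per task name.

    Lines with an identical timestamp, task name, and PID are counted once,
    because the same report can appear both on its own and inside the
    WARNING block that follows "cut here".
    """
    by_symbol = Counter()
    by_comm = Counter()
    seen = set()
    for line in log_text.splitlines():
        match = OVERFLOW_RE.search(line)
        if match is None:
            continue
        key = (match["ts"], match["comm"], match["pid"])
        if key in seen:
            continue
        seen.add(key)
        by_symbol[match["symbol"]] += 1
        by_comm[match["comm"]] += 1
    return by_symbol, by_comm
```

Called on the text of a saved dmesg or /var/log/messages capture, the function returns two Counter objects. A by_symbol result dominated by mem_cgroup_id_get_online matches the pattern described in this article.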

Environment

  • Red Hat Enterprise Linux 8.7 GA kernel-4.18.0-425.3.1.el8 and onwards
  • Red Hat Enterprise Linux 8.6.z kernel-4.18.0-372.26.1.el8_6 and onwards
  • Red Hat Enterprise Linux 8.4.z kernel-4.18.0-305.62.1.el8_4 and onwards
  • cgroups
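
A running kernel can be compared against the builds listed above with a short script. This is a hedged sketch only: the helper name in_listed_range and the parsing of the release string are assumptions, and the table covers just the three streams named in this article. It makes no claim about other RHEL 8 streams or about which build contains a fix.

```python
import re

# First affected build per listed stream, copied from the Environment list.
# Keys are the base build number that follows "4.18.0-" in the kernel release.
FIRST_AFFECTED = {
    425: (3, 1),   # RHEL 8.7 GA:  kernel-4.18.0-425.3.1.el8
    372: (26, 1),  # RHEL 8.6.z:   kernel-4.18.0-372.26.1.el8_6
    305: (62, 1),  # RHEL 8.4.z:   kernel-4.18.0-305.62.1.el8_4
}

# Example input: "4.18.0-425.10.1.el8_7.x86_64"
RELEASE_RE = re.compile(r"^4\.18\.0-(\d+)((?:\.\d+)*)\.el8")


def in_listed_range(release):
    """Return True or False for a listed stream, or None if the stream is not listed."""
    match = RELEASE_RE.match(release)
    if match is None:
        return None
    base = int(match.group(1))
    if base not in FIRST_AFFECTED:
        return None
    # ".10.1" becomes (10, 1); a build with no z-stream suffix becomes ().
    zstream = tuple(int(part) for part in match.group(2).split(".") if part)
    return zstream >= FIRST_AFFECTED[base]
```

On a live system the release string can be obtained with platform.release() from the Python standard library, or with uname -r. A result of None means the kernel belongs to a stream this article does not list, so the sketch cannot classify it.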
