list_add corruption in a VxFS-specific dentry list leads to soft lockups and, ultimately, a gab_halt() panic for VCS cluster fencing

Solution Verified

Issue

  • A list_add corruption warning is reported for a VxFS-specific list used to manage dentries. Soft lockups follow, and GAB ultimately calls gab_halt() to panic the node for VCS cluster fencing.
[48140638.816770] ------------[ cut here ]------------
[48140638.817155] WARNING: CPU: 0 PID: 29143 at lib/list_debug.c:33 __list_add+0xac/0xc0
[48140638.817531] list_add corruption. prev->next should be next (ffff8800bbba77d8), but was 80000001108fb163. (prev=ffff8803b410c7d8).
[48140638.818338] Modules linked in: [...]
[48140638.823343] CPU: 0 PID: 29143 Comm: vx_worklist_thr Tainted: P    B   W  OE  ------------   3.10.0-693.11.1.el7.x86_64 #1
[48140638.824055] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1704120155 04/12/2017
[48140638.825524]  ffff8801ccb43c80 00000000798a993d ffff8801ccb43c30 ffffffff816a3e61
[48140638.826319]  ffff8801ccb43c70 ffffffff810879d8 00000021c067aa83 ffff880258c23fd0
[48140638.827122]  ffff8800bbba77d8 ffff8803b410c7d8 ffff880258c23fd0 ffff8800bbba7558
[48140638.827935] Call Trace:
[48140638.828739]  [<ffffffff816a3e61>] dump_stack+0x19/0x1b
[48140638.829560]  [<ffffffff810879d8>] __warn+0xd8/0x100
[48140638.830361]  [<ffffffff81087a5f>] warn_slowpath_fmt+0x5f/0x80
[48140638.831159]  [<ffffffff8133d83c>] __list_add+0xac/0xc0
[48140638.831998]  [<ffffffffc073d2b2>] vx_diput+0x1b2/0x280 [vxfs]
[48140638.832804]  [<ffffffff8121863c>] dentry_kill+0x14c/0x1b0
[48140638.833630]  [<ffffffff812186fe>] dput+0x5e/0xd0
[48140638.834477]  [<ffffffffc074068b>] vx_prune_dcache_v2+0x26b/0x3b0 [vxfs]
[48140638.835336]  [<ffffffffc0740920>] vx_dcache_pruner+0x150/0x210 [vxfs]
[48140638.836199]  [<ffffffffc07407d0>] ? vx_prune_dcache_v2+0x3b0/0x3b0 [vxfs]
[48140638.837080]  [<ffffffffc06310ca>] vx_workitem_process+0x1a/0x40 [vxfs]
[48140638.837975]  [<ffffffffc0639e55>] vx_worklist_process+0x215/0x230 [vxfs]
[48140638.838869]  [<ffffffffc06c10c0>] ? vx_osdep_deinit+0x1d0/0x1d0 [vxfs]
[48140638.839772]  [<ffffffffc0639f08>] vx_worklist_thread+0x98/0x100 [vxfs]
[48140638.840681]  [<ffffffffc0639e70>] ? vx_worklist_process+0x230/0x230 [vxfs]
[48140638.841594]  [<ffffffffc06c1104>] vx_kthread_init+0x44/0x50 [vxfs]
[48140638.842481]  [<ffffffff810b099f>] kthread+0xcf/0xe0
[48140638.843368]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
[48140638.844252]  [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[48140638.845141]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
[48140638.846041] ---[ end trace a1acd63903ddcfac ]---
[48140848.845136] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [gzip:8812]
[48140848.846420] Modules linked in: [...]
    ...
[48140848.855131] CPU: 2 PID: 8812 Comm: gzip Tainted: P    B   W  OE  ------------   3.10.0-693.11.1.el7.x86_64 #1
[48140848.856231] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1704120155 04/12/2017
[48140848.858388] task: ffff880350822f70 ti: ffff88004b91c000 task.ti: ffff88004b91c000
[48140848.859471] RIP: 0010:[<ffffffff81330122>]  [<ffffffff81330122>] copy_user_generic_unrolled+0x32/0xc0
[48140848.860558] RSP: 0018:ffff88004b91fc10  EFLAGS: 00010202
[48140848.861617] RAX: ffff8801108fb000 RBX: ffff88004b91fe48 RCX: 0000000000000040
[48140848.862657] RDX: 0000000000000000 RSI: ffff8801108fb000 RDI: 000000000065f020
[48140848.863676] RBP: ffff88004b91fce0 R08: 88a73898d17382d6 R09: d324cf9aaf1dbac6
[48140848.864674] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
[48140848.865655] R13: 0000000000001000 R14: 0000000000001000 R15: ffff88004b91fe58
[48140848.866611] FS:  00007f926a5bd740(0000) GS:ffff88043fc80000(0000) knlGS:0000000000000000
[48140848.867566] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[48140848.868492] CR2: ffff8801108fb000 CR3: 000000017527d000 CR4: 00000000001407e0
[48140848.869443] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[48140848.870395] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[48140848.871324] Stack:
[48140848.872222]  ffffffffc06c56e3 ffff8803b5bfa000 ffff88004b91fc90 ffff88004b91ffd8
[48140848.873158]  0000000000000000 ffff88004b91ffd8 0000000000000000 0000000000001000
[48140848.874102]  000000000065f020 ffffea0004423ec0 0000000000000001 0000000000007000
[48140848.875048] Call Trace:
[48140848.876018]  [<ffffffffc06c56e3>] ? vx_uiomove+0x473/0x6b0 [vxfs]
[48140848.876989]  [<ffffffffc06f89a9>] vx_read1+0x589/0xb30 [vxfs]
[48140848.877958]  [<ffffffffc06d726f>] vx_vop_read+0x8f/0x110 [vxfs]
[48140848.878935]  [<ffffffffc06d7531>] vx_read+0x241/0x720 [vxfs]
[48140848.879873]  [<ffffffff8120099c>] vfs_read+0x9c/0x170
[48140848.880812]  [<ffffffff8120185f>] SyS_read+0x7f/0xe0
[48140848.881735]  [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b
[48140848.882657] Code: 82 8c 00 00 00 89 f9 83 e1 07 74 15 83 e9 08 f7 d9 29 ca 8a 06 88 07 48 ff c6 48 ff c7 ff c9 75 f2 89 d1 83 e2 3f c1 e9 06 74 4a <4c> 8b 06 4c 8b 4e 08 4c 8b 56 10 4c 8b 5e 18 4c 89 07 4c 89 4f 

            ...

[48144551.510506] Kernel panic - not syncing: GAB: Port h halting system due to client process failure at [14:1120]

[48144551.512539] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P    B   W  OEL ------------   3.10.0-693.11.1.el7.x86_64 #1
[48144551.513584] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1704120155 04/12/2017
[48144551.515699]  0000000046070500 ae7e18824342d343 ffff88043fd03c80 ffffffff816a3e61
[48144551.516825]  ffff88043fd03d00 ffffffff8169dd24 0000000000000008 ffff88043fd03d10
[48144551.517938]  ffff88043fd03cb0 ae7e18824342d343 0000000000000030 ffffffffc1000897
[48144551.519038] Call Trace:
[48144551.520083]  <IRQ>  [<ffffffff816a3e61>] dump_stack+0x19/0x1b
[48144551.521146]  [<ffffffff8169dd24>] panic+0xe8/0x20d
[48144551.522176]  [<ffffffffc0feb5e4>] gab_halt+0x74/0x80 [gab]
[48144551.523175]  [<ffffffffc0feb893>] gab_kill_process+0x2a3/0x360 [gab]
[48144551.524152]  [<ffffffffc0feb5f0>] ? gab_halt+0x80/0x80 [gab]
[48144551.525106]  [<ffffffffc0fd27d3>] gab_timerscan+0x163/0x530 [gab]
[48144551.526034]  [<ffffffff810d2c51>] ? trigger_load_balance+0x61/0x1e0
[48144551.526946]  [<ffffffffc0fd2670>] ? gab_mlist_empty+0x20/0x20 [gab]
[48144551.527833]  [<ffffffff81097326>] call_timer_fn+0x36/0x110
[48144551.528700]  [<ffffffffc0fd2670>] ? gab_mlist_empty+0x20/0x20 [gab]
[48144551.529550]  [<ffffffff8109983d>] run_timer_softirq+0x22d/0x310
[48144551.530379]  [<ffffffff81090b4f>] __do_softirq+0xef/0x280
[48144551.531200]  [<ffffffff816b6b1c>] call_softirq+0x1c/0x30
[48144551.531993]  [<ffffffff8102d3c5>] do_softirq+0x65/0xa0
[48144551.532770]  [<ffffffff81090ed5>] irq_exit+0x105/0x110
[48144551.533537]  [<ffffffff816b7782>] smp_apic_timer_interrupt+0x42/0x50
[48144551.534310]  [<ffffffff816b5cdd>] apic_timer_interrupt+0x6d/0x80
[48144551.535076]  <EOI>  [<ffffffff816ab576>] ? native_safe_halt+0x6/0x10
[48144551.535854]  [<ffffffff816ab40e>] default_idle+0x1e/0xc0
[48144551.536626]  [<ffffffff81035006>] arch_cpu_idle+0x26/0x30
[48144551.537390]  [<ffffffff810e7bda>] cpu_startup_entry+0x14a/0x1c0
[48144551.538148]  [<ffffffff81051b56>] start_secondary+0x1b6/0x230
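
The first warning above comes from the list debugging check in lib/list_debug.c, which verifies that the two neighbours of the insertion point still reference each other before the new entry is linked in. The following is a minimal user-space sketch of that kind of check, written in Python for illustration only. The names ListHead and checked_list_add are hypothetical, node names stand in for the pointer values printed by the kernel, and the sketch assumes the insertion still proceeds after a warning is raised.

```python
class ListHead:
    """Minimal stand-in for a kernel-style circular list node (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.next = self  # an empty list head points at itself
        self.prev = self


def checked_list_add(new, prev, nxt):
    """Insert `new` between `prev` and `nxt` and report broken neighbours.

    Returns the list of warning strings that were raised. The insertion
    is performed even when a warning fires (an assumption of this sketch).
    """
    warnings = []
    # The entry after the insertion point must point back at `prev`.
    if nxt.prev is not prev:
        warnings.append(
            f"list_add corruption. next->prev should be prev ({prev.name}), "
            f"but was {nxt.prev.name}. (next={nxt.name})."
        )
    # The entry before the insertion point must point forward at `nxt`.
    # This is the condition reported in the log above.
    if prev.next is not nxt:
        warnings.append(
            f"list_add corruption. prev->next should be next ({nxt.name}), "
            f"but was {prev.next.name}. (prev={prev.name})."
        )
    # Link the new entry in between the two neighbours.
    nxt.prev = new
    new.next = nxt
    new.prev = prev
    prev.next = new
    return warnings


def demo():
    """Build a two-entry list, overwrite one link, then insert again."""
    head = ListHead("head")
    a = ListHead("a")
    checked_list_add(a, head, head)   # healthy insert, no warnings
    head.next = ListHead("stray")     # simulate an overwritten next pointer
    return checked_list_add(ListHead("b"), head, a)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In the log above, the value found in prev->next (80000001108fb163) is not the expected neighbour (ffff8800bbba77d8), which is the same situation the demo simulates by overwriting head.next before the second insert.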

Environment

  • Red Hat Enterprise Linux 7
  • Veritas VCS and VxFS
