A server frequently crashes in vol_ru_verification_data_unpack() due to severe stack overflows/corruption and slab corruption caused by a Veritas VxVM (vxio) issue
Issue
- A server frequently crashes in vol_ru_verification_data_unpack() due to a Veritas VxVM (vxio) issue
[93744.626170] VxVM vxio V-5-0-1717 Tunable vvr_smartmove is set to 0
[94458.255043] SGI XFS with ACLs, security attributes, quota, no debug enabled
[94493.840531] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[94493.860304] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[94493.860353] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[94494.284379] VxVM VVR vxio V-5-0-1919 Disconnecting Rlink xxx to turn on secondary logging. :
[94495.440625] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[94498.960472] VxVM VVR vxio V-5-0-1406 Node xx.xx.xxx.xxx disconnected from node xx.xx.xxx.xxx
[94502.032445] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[94502.052910] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[94502.052954] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[94516.482184] BUG: unable to handle kernel paging request at ffff9bc1f5ebb000
[94516.482221] PGD 2f8802067 P4D 2f8802067 PUD 2f8803067 PMD 3ded063 PTE 8000000035ebb061
[94516.482248] Oops: 0003 [#1] SMP PTI
[94516.482263] CPU: 2 PID: 6120 Comm: vxiod Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.50.1.el8_10.x86_64 #1
[94516.482300] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[94516.482330] RIP: 0010:vol_ru_verification_data_unpack+0x8e/0xe0 [vxio]
[94516.482481] Code: 0c 24 48 83 e0 f8 85 f6 7e 43 83 ee 01 48 8d 9a 90 00 00 00 48 8d ac b2 94 00 00 00 eb 14 8b 10 48 83 c3 04 48 83 c0 04 0f ca <89> 53 fc 48 39 dd 74 1b 48 39 c8 72 e7 48 89 e7 e8 0d dc ea ff 48
[94516.482532] RSP: 0018:ffffb8e804107d68 EFLAGS: 00010282
[94516.482549] RAX: ffff9bc3391e0fdc RBX: ffff9bc1f5ebb004 RCX: ffff9c0379554000
[94516.482570] RDX: 0000000030303030 RSI: ffff9bc339168000 RDI: ffffb8e804107d68
[94516.482591] RBP: ffff9bc2b6a2a110 R08: ffffffffffff8000 R09: 0000000000000000
[94516.482612] R10: ffff9bc33b34c600 R11: 000000000000b82d R12: ffff9bc33b34c600
[94516.482633] R13: 0000000020202000 R14: ffff9bc3217b0000 R15: ffff9bc1f5e22000
[94516.482655] FS: 0000000000000000(0000) GS:ffff9bc5efd00000(0000) knlGS:0000000000000000
[94516.482678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[94516.482696] CR2: ffff9bc1f5ebb000 CR3: 00000002f6e10004 CR4: 00000000001706e0
[94516.482742] Call Trace:
[94516.482757] ? __die_body+0x1a/0x60
[94516.482774] ? no_context+0x1ba/0x3f0
[94516.482791] ? __bad_area_nosemaphore+0x157/0x180
[94516.483140] ? spurious_kernel_fault+0x1ed/0x250
[94516.483371] ? do_page_fault+0x37/0x12d
[94516.483593] ? page_fault+0x1e/0x30
[94516.483816] ? vol_ru_verification_data_unpack+0x8e/0xe0 [vxio]
[94516.484150] ? vol_ru_verification_data_unpack+0xa3/0xe0 [vxio]
[94516.484479] vol_ru_verify+0x9d/0x110 [vxio]
[94516.484807] volrv_seclog_bulk_cleanup_verification+0x107/0x1a0 [vxio]
[94516.485141] volrv_seclog_write1_done+0xbf/0xd0 [vxio]
[94516.485469] voliod_iohandle+0x21f/0x390 [vxio]
[94516.485811] voliod_loop+0xc2/0x340 [vxio]
[94516.486147] ? voliod_iohandle+0x390/0x390 [vxio]
[94516.486478] kthread+0x134/0x150
[94516.486691] ? set_kthread_struct+0x50/0x50
[94516.486903] ret_from_fork+0x35/0x40
[94516.487115] Modules linked in: ...
...
[94516.489125] CR2: ffff9bc1f5ebb000
[ 34.213607] VxVM vxio V-5-0-1717 Tunable vvr_smartmove is set to 0
[ 36.016373] AMF Driver configured
[ 37.607022] VxVM VVR vxio V-5-0-0 waiting to start udp heartbeat server,volnm_sync_netiod: 1, volnm_udp_srv_running: 0
[ 37.607036] VxVM vxio V-5-3-0 binding on IP 0.0.0.0
[ 38.612013] VxVM vxio V-5-3-0 binding on IP 0.0.0.0
[ 40.083965] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[ 40.125043] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[ 40.125071] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[ 40.758161] VxVM VVR vxio V-5-0-1919 Disconnecting Rlink xxx to turn on secondary logging. :
[ 41.940120] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[ 42.207351] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[ 42.207411] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[ 45.203903] VxVM VVR vxio V-5-0-1406 Node xx.xx.xxx.xxx disconnected from node xx.xx.xxx.xxx
[ 50.665159] WARNING: CPU: 1 PID: 11595 at kernel/auditsc.c:1831 __audit_syscall_entry+0x135/0x140
[ 50.665168] Modules linked in: ...
[ 50.665218] CPU: 1 PID: 11595 Comm: sh Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.50.1.el8_10.x86_64 #1
[ 50.665221] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 50.665223] RIP: 0010:__audit_syscall_entry+0x135/0x140
[ 50.665226] Code: ff ff 48 83 c4 20 5b 5d 41 5c c3 cc cc cc cc 0f 0b 45 85 c9 75 14 48 83 c4 20 48 c7 c7 48 55 f0 8f 5b 5d 41 5c e9 6b 72 ff ff <0f> 0b eb e8 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 53 65 48 8b 04
[ 50.665228] RSP: 0018:ffffbba403e53e88 EFLAGS: 00010202
[ 50.665230] RAX: ffff9ac121b58000 RBX: ffff9ac153b19800 RCX: 00007fffb6d47960
[ 50.665232] RDX: 00007fffb6d478c0 RSI: 0000000000000002 RDI: 000000000000000d
[ 50.665233] RBP: 00000000c000003e R08: 0000000000000008 R09: 0000000020202020
[ 50.665234] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 50.665235] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
[ 50.665236] FS: 00007f1dc7d1d780(0000) GS:ffff9ac3efc80000(0000) knlGS:0000000000000000
[ 50.665238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 50.665239] CR2: 00007ff25802b5d8 CR3: 0000000155c34001 CR4: 00000000001706e0
[ 50.665265] Call Trace:
[ 50.665269] ? __warn+0x94/0xe0
[ 50.665274] ? __audit_syscall_entry+0x135/0x140
[ 50.665277] ? __audit_syscall_entry+0x135/0x140
[ 50.665279] ? report_bug+0xb1/0xe0
[ 50.665283] ? do_error_trap+0x9e/0xd0
[ 50.665287] ? do_invalid_op+0x36/0x40
[ 50.665288] ? __audit_syscall_entry+0x135/0x140
[ 50.665290] ? invalid_op+0x14/0x20
[ 50.665296] ? __audit_syscall_entry+0x135/0x140
[ 50.665298] ? __audit_syscall_entry+0xf2/0x140
[ 50.665301] syscall_trace_enter+0x1ff/0x2d0
[ 50.665306] ? audit_reset_context.part.16+0x26a/0x2d0
[ 50.665309] do_syscall_64+0x146/0x1a0
[ 50.665312] entry_SYSCALL_64_after_hwframe+0x66/0xcb
[ 50.665315] RIP: 0033:0x7f1dc7121684
[ 50.665322] Code: 00 00 00 00 48 0f 44 d0 0f 11 61 58 0f 11 69 68 0f 11 71 78 0f 11 b9 88 00 00 00 b9 0d 00 00 00 41 ba 08 00 00 00 89 c8 0f 05 <48> 89 c2 48 3d 00 f0 ff ff 0f 87 ed 00 00 00 4d 85 c0 0f 84 a1 00
[ 50.665324] RSP: 002b:00007fffb6d478c0 EFLAGS: 00000202 ORIG_RAX: 000000000000000d
[ 50.665326] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1dc7121684
[ 50.665327] RDX: 00007fffb6d47960 RSI: 00007fffb6d478c0 RDI: 0000000000000002
[ 50.665329] RBP: 00007fffb6d47a10 R08: 00007fffb6d47ab0 R09: 00005581e5f75a80
[ 50.665330] R10: 0000000000000008 R11: 0000000000000202 R12: 00007fffb6d47ab0
[ 50.665331] R13: 00007fffb6d47bf0 R14: 0000000000000000 R15: 0000000000000000
[ 50.665333] ---[ end trace 0564aa08f3feeea0 ]---
[ 50.665335] audit_panic: 10 callbacks suppressed
[ 50.665336] audit: unrecoverable error in audit_syscall_entry()
[202965.073111] VxVM vxio V-5-0-1717 Tunable vvr_smartmove is set to 0
[203012.947137] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[203012.971260] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[203012.971330] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[203014.453152] VxVM VVR vxio V-5-0-855 Disconnecting rlink xxx due to stream error.
[203014.504155] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[203014.995265] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[203014.995311] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[203015.421265] VxVM VVR vxio V-5-0-1919 Disconnecting Rlink xxx to turn on secondary logging. :
[203016.596165] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[203018.067043] VxVM VVR vxio V-5-0-1406 Node xx.xx.xxx.xxx disconnected from node xx.xx.xxx.xxx
[203021.138991] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[203022.135218] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[203022.135270] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[203030.126790] ------------[ cut here ]------------
[203030.126794] kernel BUG at net/core/skbuff.c:2094!
[203030.126834] invalid opcode: 0000 [#1] SMP PTI
[203030.126852] CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.50.1.el8_10.x86_64 #1
[203030.126887] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[203030.126919] RIP: 0010:skb_put+0x3e/0x40
[203030.126938] Code: 48 03 87 e0 00 00 00 85 c9 75 1f 01 f2 01 b7 80 00 00 00 89 97 d8 00 00 00 3b 97 dc 00 00 00 0f 87 1a 92 00 00 c3 cc cc cc cc <0f> 0b 0f 1f 44 00 00 48 89 f0 48 39 fe 74 0c 01 97 84 00 00 00 01
[203030.126990] RSP: 0018:ffffaceb4001edf0 EFLAGS: 00010206
[203030.127009] RAX: d8a1a531bdba96fd RBX: ffff9bea4927d690 RCX: 0000000043abf960
[203030.127031] RDX: 0000000087eaab77 RSI: 00000000000005f2 RDI: ffff9bea6bb9a900
[203030.127052] RBP: ffff9bea2023aac0 R08: 0000000000000000 R09: 0000000000000000
[203030.127073] R10: ffff9bea492e1668 R11: ffff9bea143beb00 R12: ffff9bea49224ef0
[203030.127095] R13: 0000000000000000 R14: ffff9bea20238ac0 R15: ffff9bea2023aac0
[203030.127116] FS: 0000000000000000(0000) GS:ffff9bed2fc80000(0000) knlGS:0000000000000000
[203030.127140] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[203030.127158] CR2: 00007f76c70cd080 CR3: 00000002afc10005 CR4: 00000000001706e0
[203030.127206] Call Trace:
[203030.127220] <IRQ>
[203030.127231] ? __die_body+0x1a/0x60
[203030.127249] ? die+0x2a/0x50
[203030.127262] ? do_trap+0xe7/0x110
[203030.127595] ? skb_put+0x3e/0x40
[203030.127817] ? do_invalid_op+0x36/0x40
[203030.128034] ? skb_put+0x3e/0x40
[203030.128248] ? invalid_op+0x14/0x20
[203030.128463] ? skb_put+0x3e/0x40
[203030.128673] vmxnet3_rq_rx_complete+0x798/0x1000 [vmxnet3]
[203030.128892] vmxnet3_poll_rx_only+0x31/0xa0 [vmxnet3]
[203030.129105] __napi_poll+0x2d/0x130
[203030.129315] net_rx_action+0x252/0x320
[203030.129543] __do_softirq+0xdc/0x2cf
[203030.129751] irq_exit_rcu+0xc6/0xd0
[203030.129958] irq_exit+0xa/0x10
[203030.130164] do_IRQ+0x7f/0xd0
[203030.130371] common_interrupt+0xf/0xf
[203030.130588] </IRQ>
[203030.130782] RIP: 0010:native_safe_halt+0xe/0x20
[203030.130980] Code: 00 a8 08 75 be e9 23 ff ff ff 31 ff e9 6a ff ff ff 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 0f 00 2d 36 41 5e 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00
[203030.131427] RSP: 0018:ffffaceb4193be28 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd2
[203030.131650] RAX: 0000000080004080 RBX: ffff9bea022fdc64 RCX: 0000000000000020
[203030.131869] RDX: ffffffffb57c5ef0 RSI: ffffffffb70d1ce0 RDI: 0000000000000001
[203030.132086] RBP: ffff9bea022fdc64 R08: 0000000000000001 R09: ffff9bea022fdc00
[203030.132305] R10: 0000000000000daf R11: ffff9bed2fcb2484 R12: 0000000000000001
[203030.132337] show_signal_msg: 73 callbacks suppressed
[203030.132339] splunkd[9658]: segfault at d0cd3203 ip 00007fa01df91b38 sp 00007fa01da0a820 error 6
[203030.132533] R13: ffffffffb70d1ce0 R14: 0000000000000001 R15: 0000000000000001
[203030.132791] in libc-2.28.so[7fa01de98000+1cd000]
[203030.133034] ? acpi_processor_thermal_init.cold.6+0x66/0x66
[203030.133040] ? acpi_processor_thermal_init.cold.6+0x66/0x66
[203030.133042] acpi_idle_do_entry+0x93/0xa0
[203030.133300]
[203030.133542] acpi_idle_enter+0x5f/0xd0
[203030.133791] Code: 89 44 24 10 64 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 85 f6 0f 84 e9 01 00 00 4c 8b 2d 0f 63 2d 00 4c 8b 05 b8 98 2d 00 <49> c7 45 00 00 00 00 00 49 c7 45 08 00 00 00 00 4d 85 c0 0f 84 af
[203030.134024] cpuidle_enter_state+0x86/0x470
[203030.134030] cpuidle_enter+0x2c/0x40
[203030.134033] do_idle+0x26f/0x2d0
[203030.134038] cpu_startup_entry+0x6f/0x80
[203030.135939] start_secondary+0x187/0x1d0
[203030.136134] secondary_startup_64_no_verify+0xd1/0xdb
[203030.136329] Modules linked in: ...
...
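The logs above can be triaged with a short script. The following is a minimal sketch, not a Red Hat or Veritas diagnostic procedure: the helper names `crash_sites` and `ascii_like_registers` are hypothetical, and the sample lines are copied from the first log above. It lists the kernel-mode RIP of each trace and flags registers whose low four bytes are all printable ASCII. Such values can hint that text data was written over kernel memory, but they are not proof of corruption on their own.

```python
import re

# Matches kernel-mode RIP lines such as
# "RIP: 0010:vol_ru_verification_data_unpack+0x8e/0xe0 [vxio]".
# The trailing "[module]" part is optional (absent for built-in code).
RIP_RE = re.compile(r"RIP: 0010:(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")

# Matches general-purpose register dumps such as "RDX: 0000000030303030".
# RSP/RIP lines carry a segment prefix ("0018:") and are not matched.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def crash_sites(log):
    """Return (function, module) for every kernel-mode RIP line.

    module is None when the line has no "[module]" suffix.
    """
    return [(m.group(1), m.group(2)) for m in RIP_RE.finditer(log)]


def ascii_like_registers(log):
    """Return (register, hex value, decoded text) for suspicious registers.

    A register is flagged when its upper four bytes are zero and its
    lower four bytes are all printable ASCII (0x20..0x7e). This is only
    a heuristic (an assumption of this sketch), not a definitive test.
    """
    hits = []
    for name, value in REG_RE.findall(log):
        raw = bytes.fromhex(value)
        if raw[:4] == b"\x00" * 4 and all(0x20 <= b <= 0x7E for b in raw[4:]):
            hits.append((name, value, raw[4:].decode("ascii")))
    return hits


# Sample lines copied from the first log in this article.
SAMPLE = """\
[94516.482330] RIP: 0010:vol_ru_verification_data_unpack+0x8e/0xe0 [vxio]
[94516.482591] RDX: 0000000030303030 RSI: ffff9bc339168000 RDI: ffffb8e804107d68
[94516.482633] R13: 0000000020202000 R14: ffff9bc3217b0000 R15: ffff9bc1f5e22000
"""

if __name__ == "__main__":
    for func, module in crash_sites(SAMPLE):
        print("crash site:", func, "module:", module)
    for name, value, text in ascii_like_registers(SAMPLE):
        print("ASCII-like register:", name, value, repr(text))
```

On the sample, the script reports the crash site in the vxio module and flags RDX, whose value 0x30303030 is the ASCII string "0000". To check a whole log, pass the full `dmesg` or vmcore-dmesg text to the same two functions.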
Environment
- Red Hat Enterprise Linux 8.10.z
- Veritas Volume Manager (VxVM)