A server frequently crashes in vol_ru_verification_data_unpack() due to severe stack overflows/corruption and slab corruption caused by a Veritas VxVM (vxio) issue

Solution Verified

Issue

  • A server frequently crashes in vol_ru_verification_data_unpack() due to a Veritas VxVM (vxio) issue. Kernel log excerpts from three occurrences follow:
[93744.626170] VxVM vxio V-5-0-1717 Tunable vvr_smartmove is set to 0
[94458.255043] SGI XFS with ACLs, security attributes, quota, no debug enabled
[94493.840531] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[94493.860304] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[94493.860353] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[94494.284379] VxVM VVR vxio V-5-0-1919 Disconnecting Rlink xxx to turn on secondary logging. :
[94495.440625] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[94498.960472] VxVM VVR vxio V-5-0-1406 Node xx.xx.xxx.xxx disconnected from node xx.xx.xxx.xxx
[94502.032445] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[94502.052910] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[94502.052954] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[94516.482184] BUG: unable to handle kernel paging request at ffff9bc1f5ebb000
[94516.482221] PGD 2f8802067 P4D 2f8802067 PUD 2f8803067 PMD 3ded063 PTE 8000000035ebb061
[94516.482248] Oops: 0003 [#1] SMP PTI
[94516.482263] CPU: 2 PID: 6120 Comm: vxiod Kdump: loaded Tainted: P           OE     -------- -  - 4.18.0-553.50.1.el8_10.x86_64 #1
[94516.482300] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[94516.482330] RIP: 0010:vol_ru_verification_data_unpack+0x8e/0xe0 [vxio]
[94516.482481] Code: 0c 24 48 83 e0 f8 85 f6 7e 43 83 ee 01 48 8d 9a 90 00 00 00 48 8d ac b2 94 00 00 00 eb 14 8b 10 48 83 c3 04 48 83 c0 04 0f ca <89> 53 fc 48 39 dd 74 1b 48 39 c8 72 e7 48 89 e7 e8 0d dc ea ff 48
[94516.482532] RSP: 0018:ffffb8e804107d68 EFLAGS: 00010282
[94516.482549] RAX: ffff9bc3391e0fdc RBX: ffff9bc1f5ebb004 RCX: ffff9c0379554000
[94516.482570] RDX: 0000000030303030 RSI: ffff9bc339168000 RDI: ffffb8e804107d68
[94516.482591] RBP: ffff9bc2b6a2a110 R08: ffffffffffff8000 R09: 0000000000000000
[94516.482612] R10: ffff9bc33b34c600 R11: 000000000000b82d R12: ffff9bc33b34c600
[94516.482633] R13: 0000000020202000 R14: ffff9bc3217b0000 R15: ffff9bc1f5e22000
[94516.482655] FS:  0000000000000000(0000) GS:ffff9bc5efd00000(0000) knlGS:0000000000000000
[94516.482678] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[94516.482696] CR2: ffff9bc1f5ebb000 CR3: 00000002f6e10004 CR4: 00000000001706e0
[94516.482742] Call Trace:
[94516.482757]  ? __die_body+0x1a/0x60
[94516.482774]  ? no_context+0x1ba/0x3f0
[94516.482791]  ? __bad_area_nosemaphore+0x157/0x180
[94516.483140]  ? spurious_kernel_fault+0x1ed/0x250
[94516.483371]  ? do_page_fault+0x37/0x12d
[94516.483593]  ? page_fault+0x1e/0x30
[94516.483816]  ? vol_ru_verification_data_unpack+0x8e/0xe0 [vxio]
[94516.484150]  ? vol_ru_verification_data_unpack+0xa3/0xe0 [vxio]
[94516.484479]  vol_ru_verify+0x9d/0x110 [vxio]
[94516.484807]  volrv_seclog_bulk_cleanup_verification+0x107/0x1a0 [vxio]
[94516.485141]  volrv_seclog_write1_done+0xbf/0xd0 [vxio]
[94516.485469]  voliod_iohandle+0x21f/0x390 [vxio]
[94516.485811]  voliod_loop+0xc2/0x340 [vxio]
[94516.486147]  ? voliod_iohandle+0x390/0x390 [vxio]
[94516.486478]  kthread+0x134/0x150
[94516.486691]  ? set_kthread_struct+0x50/0x50
[94516.486903]  ret_from_fork+0x35/0x40
[94516.487115] Modules linked in: ...
        ...
[94516.489125] CR2: ffff9bc1f5ebb000
[   34.213607] VxVM vxio V-5-0-1717 Tunable vvr_smartmove is set to 0
[   36.016373] AMF Driver configured
[   37.607022] VxVM VVR vxio V-5-0-0 waiting to start udp heartbeat server,volnm_sync_netiod: 1, volnm_udp_srv_running: 0
[   37.607036] VxVM vxio V-5-3-0 binding on IP 0.0.0.0
[   38.612013] VxVM vxio V-5-3-0 binding on IP 0.0.0.0
[   40.083965] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[   40.125043] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[   40.125071] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[   40.758161] VxVM VVR vxio V-5-0-1919 Disconnecting Rlink xxx to turn on secondary logging. :
[   41.940120] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[   42.207351] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[   42.207411] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[   45.203903] VxVM VVR vxio V-5-0-1406 Node xx.xx.xxx.xxx disconnected from node xx.xx.xxx.xxx
[   50.665159] WARNING: CPU: 1 PID: 11595 at kernel/auditsc.c:1831 __audit_syscall_entry+0x135/0x140
[   50.665168] Modules linked in: ...
[   50.665218] CPU: 1 PID: 11595 Comm: sh Kdump: loaded Tainted: P           OE     -------- -  - 4.18.0-553.50.1.el8_10.x86_64 #1
[   50.665221] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[   50.665223] RIP: 0010:__audit_syscall_entry+0x135/0x140
[   50.665226] Code: ff ff 48 83 c4 20 5b 5d 41 5c c3 cc cc cc cc 0f 0b 45 85 c9 75 14 48 83 c4 20 48 c7 c7 48 55 f0 8f 5b 5d 41 5c e9 6b 72 ff ff <0f> 0b eb e8 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 53 65 48 8b 04
[   50.665228] RSP: 0018:ffffbba403e53e88 EFLAGS: 00010202
[   50.665230] RAX: ffff9ac121b58000 RBX: ffff9ac153b19800 RCX: 00007fffb6d47960
[   50.665232] RDX: 00007fffb6d478c0 RSI: 0000000000000002 RDI: 000000000000000d
[   50.665233] RBP: 00000000c000003e R08: 0000000000000008 R09: 0000000020202020
[   50.665234] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[   50.665235] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
[   50.665236] FS:  00007f1dc7d1d780(0000) GS:ffff9ac3efc80000(0000) knlGS:0000000000000000
[   50.665238] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   50.665239] CR2: 00007ff25802b5d8 CR3: 0000000155c34001 CR4: 00000000001706e0
[   50.665265] Call Trace:
[   50.665269]  ? __warn+0x94/0xe0
[   50.665274]  ? __audit_syscall_entry+0x135/0x140
[   50.665277]  ? __audit_syscall_entry+0x135/0x140
[   50.665279]  ? report_bug+0xb1/0xe0
[   50.665283]  ? do_error_trap+0x9e/0xd0
[   50.665287]  ? do_invalid_op+0x36/0x40
[   50.665288]  ? __audit_syscall_entry+0x135/0x140
[   50.665290]  ? invalid_op+0x14/0x20
[   50.665296]  ? __audit_syscall_entry+0x135/0x140
[   50.665298]  ? __audit_syscall_entry+0xf2/0x140
[   50.665301]  syscall_trace_enter+0x1ff/0x2d0
[   50.665306]  ? audit_reset_context.part.16+0x26a/0x2d0
[   50.665309]  do_syscall_64+0x146/0x1a0
[   50.665312]  entry_SYSCALL_64_after_hwframe+0x66/0xcb
[   50.665315] RIP: 0033:0x7f1dc7121684
[   50.665322] Code: 00 00 00 00 48 0f 44 d0 0f 11 61 58 0f 11 69 68 0f 11 71 78 0f 11 b9 88 00 00 00 b9 0d 00 00 00 41 ba 08 00 00 00 89 c8 0f 05 <48> 89 c2 48 3d 00 f0 ff ff 0f 87 ed 00 00 00 4d 85 c0 0f 84 a1 00
[   50.665324] RSP: 002b:00007fffb6d478c0 EFLAGS: 00000202 ORIG_RAX: 000000000000000d
[   50.665326] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1dc7121684
[   50.665327] RDX: 00007fffb6d47960 RSI: 00007fffb6d478c0 RDI: 0000000000000002
[   50.665329] RBP: 00007fffb6d47a10 R08: 00007fffb6d47ab0 R09: 00005581e5f75a80
[   50.665330] R10: 0000000000000008 R11: 0000000000000202 R12: 00007fffb6d47ab0
[   50.665331] R13: 00007fffb6d47bf0 R14: 0000000000000000 R15: 0000000000000000
[   50.665333] ---[ end trace 0564aa08f3feeea0 ]---
[   50.665335] audit_panic: 10 callbacks suppressed
[   50.665336] audit: unrecoverable error in audit_syscall_entry()
[202965.073111] VxVM vxio V-5-0-1717 Tunable vvr_smartmove is set to 0
[203012.947137] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[203012.971260] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[203012.971330] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[203014.453152] VxVM VVR vxio V-5-0-855 Disconnecting rlink xxx due to stream error.

[203014.504155] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[203014.995265] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[203014.995311] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[203015.421265] VxVM VVR vxio V-5-0-1919 Disconnecting Rlink xxx to turn on secondary logging. :
[203016.596165] VxVM VVR vxio V-5-0-266 Rlink xxx disconnected from remote :
[203018.067043] VxVM VVR vxio V-5-0-1406 Node xx.xx.xxx.xxx disconnected from node xx.xx.xxx.xxx
[203021.138991] VxVM VVR vxio V-5-0-1402 Connected from node xx.xx.xxx.xxx to node xx.xx.xxx.xxx
[203022.135218] VxVM VVR vxio V-5-0-265 Rlink xxx connected to remote :
[203022.135270] VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink xxx
[203030.126790] ------------[ cut here ]------------
[203030.126794] kernel BUG at net/core/skbuff.c:2094!
[203030.126834] invalid opcode: 0000 [#1] SMP PTI
[203030.126852] CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: P           OE     -------- -  - 4.18.0-553.50.1.el8_10.x86_64 #1
[203030.126887] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[203030.126919] RIP: 0010:skb_put+0x3e/0x40
[203030.126938] Code: 48 03 87 e0 00 00 00 85 c9 75 1f 01 f2 01 b7 80 00 00 00 89 97 d8 00 00 00 3b 97 dc 00 00 00 0f 87 1a 92 00 00 c3 cc cc cc cc <0f> 0b 0f 1f 44 00 00 48 89 f0 48 39 fe 74 0c 01 97 84 00 00 00 01
[203030.126990] RSP: 0018:ffffaceb4001edf0 EFLAGS: 00010206
[203030.127009] RAX: d8a1a531bdba96fd RBX: ffff9bea4927d690 RCX: 0000000043abf960
[203030.127031] RDX: 0000000087eaab77 RSI: 00000000000005f2 RDI: ffff9bea6bb9a900
[203030.127052] RBP: ffff9bea2023aac0 R08: 0000000000000000 R09: 0000000000000000
[203030.127073] R10: ffff9bea492e1668 R11: ffff9bea143beb00 R12: ffff9bea49224ef0
[203030.127095] R13: 0000000000000000 R14: ffff9bea20238ac0 R15: ffff9bea2023aac0
[203030.127116] FS:  0000000000000000(0000) GS:ffff9bed2fc80000(0000) knlGS:0000000000000000
[203030.127140] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[203030.127158] CR2: 00007f76c70cd080 CR3: 00000002afc10005 CR4: 00000000001706e0
[203030.127206] Call Trace:
[203030.127220]  <IRQ>
[203030.127231]  ? __die_body+0x1a/0x60
[203030.127249]  ? die+0x2a/0x50
[203030.127262]  ? do_trap+0xe7/0x110
[203030.127595]  ? skb_put+0x3e/0x40
[203030.127817]  ? do_invalid_op+0x36/0x40
[203030.128034]  ? skb_put+0x3e/0x40
[203030.128248]  ? invalid_op+0x14/0x20
[203030.128463]  ? skb_put+0x3e/0x40
[203030.128673]  vmxnet3_rq_rx_complete+0x798/0x1000 [vmxnet3]
[203030.128892]  vmxnet3_poll_rx_only+0x31/0xa0 [vmxnet3]
[203030.129105]  __napi_poll+0x2d/0x130
[203030.129315]  net_rx_action+0x252/0x320
[203030.129543]  __do_softirq+0xdc/0x2cf
[203030.129751]  irq_exit_rcu+0xc6/0xd0
[203030.129958]  irq_exit+0xa/0x10
[203030.130164]  do_IRQ+0x7f/0xd0
[203030.130371]  common_interrupt+0xf/0xf
[203030.130588]  </IRQ>
[203030.130782] RIP: 0010:native_safe_halt+0xe/0x20
[203030.130980] Code: 00 a8 08 75 be e9 23 ff ff ff 31 ff e9 6a ff ff ff 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 0f 00 2d 36 41 5e 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00
[203030.131427] RSP: 0018:ffffaceb4193be28 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd2
[203030.131650] RAX: 0000000080004080 RBX: ffff9bea022fdc64 RCX: 0000000000000020
[203030.131869] RDX: ffffffffb57c5ef0 RSI: ffffffffb70d1ce0 RDI: 0000000000000001
[203030.132086] RBP: ffff9bea022fdc64 R08: 0000000000000001 R09: ffff9bea022fdc00
[203030.132305] R10: 0000000000000daf R11: ffff9bed2fcb2484 R12: 0000000000000001
[203030.132337] show_signal_msg: 73 callbacks suppressed
[203030.132339] splunkd[9658]: segfault at d0cd3203 ip 00007fa01df91b38 sp 00007fa01da0a820 error 6
[203030.132533] R13: ffffffffb70d1ce0 R14: 0000000000000001 R15: 0000000000000001
[203030.132791]  in libc-2.28.so[7fa01de98000+1cd000]
[203030.133034]  ? acpi_processor_thermal_init.cold.6+0x66/0x66
[203030.133040]  ? acpi_processor_thermal_init.cold.6+0x66/0x66
[203030.133042]  acpi_idle_do_entry+0x93/0xa0
[203030.133300] 
[203030.133542]  acpi_idle_enter+0x5f/0xd0
[203030.133791] Code: 89 44 24 10 64 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 85 f6 0f 84 e9 01 00 00 4c 8b 2d 0f 63 2d 00 4c 8b 05 b8 98 2d 00 <49> c7 45 00 00 00 00 00 49 c7 45 08 00 00 00 00 4d 85 c0 0f 84 af
[203030.134024]  cpuidle_enter_state+0x86/0x470
[203030.134030]  cpuidle_enter+0x2c/0x40
[203030.134033]  do_idle+0x26f/0x2d0
[203030.134038]  cpu_startup_entry+0x6f/0x80
[203030.135939]  start_secondary+0x187/0x1d0
[203030.136134]  secondary_startup_64_no_verify+0xd1/0xdb
[203030.136329] Modules linked in: ...
        ...
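
The first trace above reports "Oops: 0003" together with a PTE value of 8000000035ebb061. As a reading aid, the following minimal sketch decodes both values using the standard x86 page-fault error code and page-table entry bit layout. The function names (decode_oops_code, pte_flags) are hypothetical helpers written for this illustration; they are not part of any Red Hat or Veritas tool.

```python
# Sketch: decode the x86 page-fault error code from an "Oops:" line and the
# low flag bits of a PTE value, as printed in the first trace of this article.
# Helper names are illustrative assumptions, not an existing tool's API.

# (mask, meaning when set, meaning when clear or None to stay silent)
PF_BITS = [
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code_hex):
    """Return a list of human-readable meanings for a page-fault error code."""
    code = int(code_hex, 16)
    meanings = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            meanings.append(if_set)
        elif if_clear is not None:
            meanings.append(if_clear)
    return meanings


def pte_flags(pte_hex):
    """Return selected x86-64 PTE flag bits as a dict of booleans."""
    pte = int(pte_hex, 16)
    return {
        "present": bool(pte & 0x01),
        "writable": bool(pte & 0x02),
        "user": bool(pte & 0x04),
        "accessed": bool(pte & 0x20),
        "dirty": bool(pte & 0x40),
        "no_execute": bool(pte & (1 << 63)),
    }


if __name__ == "__main__":
    # Values copied from the first trace in this article.
    print(decode_oops_code("0003"))
    print(pte_flags("8000000035ebb061"))
```

Read this way, the first oops is a kernel-mode write to a page that is present but not writable, which is consistent with the corruption described in the title. The decode alone does not identify which buffer was overrun.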

Environment

  • Red Hat Enterprise Linux 8.10.z
  • Veritas Volume Manager (VxVM)
