Server panicking in Veritas vxio module

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux, version 5
  • Veritas vxio module

Issue

  • A Red Hat Enterprise Linux server running Veritas cluster filesystem software reboots repeatedly.
  • The kernel panics with the following signature:
vxfs: msgcnt 2 mesg 125: V-2-125: GLM restart callback, protocol flag 1
vxglm INFO V-42-106 GLM recovery complete, gen 120280d, mbr 2/0/0/0
vxglm INFO V-42-107 times: skew 0 ms, remaster 18 ms, completion 30 ms
VxVM VVR vxio V-5-0-1402 Connected from node 162.102.154.226 to node 10.91.146.40
VxVM VVR vxio V-5-0-265 Rlink rlk_prod-sv-vvr_sv_rvg connected to remote
VxVM VVR vxio V-5-0-1449 Disabling checksum for rlink rlk_prod-sv-vvr_sv_rvg
VxVM VVR vxio V-5-0-330 Unable to connect to rlink rlk_prod-sv-vvr_ox_rvg on rvg ox_rvg: RVG is primary on remote
VxVM VVR vxio V-5-0-284 Rvg ox_rvg state transition from primary to acting secondary
VxVM vxio V-5-3-1533 vol_kmsg_cluster_request: unknown cmd 124
VxVM VVR vxio V-5-0-284 Rvg ox_rvg state transition from primary to acting secondary
Unable to handle kernel NULL pointer dereference at 0000000000000270 RIP: 
 [<ffffffff88804173>] :vxio:vol_multistepsio_read_source+0x94/0x124
PGD 92df26067 PUD 925291067 PMD 925365067 PTE 0
Oops: 0000 [1] SMP 
last sysfs file: /class/net/bond0/broadcast
CPU 2 
Modules linked in: ipmi_si mptctl mptbase ipmi_devintf ipmi_msghandler dell_rbu vxodm(PFU) vxgms(PU) vxglm(PU) vxfen(PU) gab(PU) llt(PU) autofs4 nfs lockd nfs_acl sunrpc dmpjbod(PU) dmpap(PU) dmpaa(PU) vxspec(PFU) vxio(PFU) vxdmp(PU) bonding vxportal(PFU) fdd(PFU) vxfs(PU) exportfs dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sr_mod cdrom i7core_edac tpm_tis edac_mc pcspkr tpm serio_raw tpm_bios sg bnx2 dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage qla2xxx scsi_transport_fc ata_piix libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 6721, comm: vxiod Tainted: PF    ---- 2.6.18-348.1.1.el5 #1
RIP: 0010:[<ffffffff88804173>]  [<ffffffff88804173>] :vxio:vol_multistepsio_read_source+0x94/0x124
RSP: 0018:ffff811215807d30  EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff8112076e8800 RCX: 0000000000089b00
RDX: ffff8111dae39840 RSI: 0000000000000000 RDI: ffff811215807d80
RBP: ffff810926372000 R08: 0000000000000000 R09: 0000000000000000
R10: ffff811215807d30 R11: 0000000000000050 R12: ffff811215807ec0
R13: 0000000000000001 R14: 0000000000000000 R15: ffff810926372000
FS:  00002b4c62e906e0(0000) GS:ffff8109301146c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000270 CR3: 0000000924375000 CR4: 00000000000006a0
Process vxiod (pid: 6721, threadinfo ffff811215806000, task ffff811216f7c7e0)
Stack:  0000000000000000 ffff8112076e8800 ffff8111dae39840 0000000000000001
 0000000000000000 0000000000089b00 0000000000000080 0000000000000000
 0000000000000000 0000000000000000 ffff8112076e8800 ffff811215807ec0
Call Trace:
 [<ffffffff88805069>] :vxio:vol_multistepsio_start+0x2c0/0x5cd
 [<ffffffff887f8fff>] :vxio:voliod_iohandle+0x37/0x88
 [<ffffffff887f923b>] :vxio:voliod_loop+0x1eb/0x58f
 [<ffffffff8005dfc1>] child_rip+0xa/0x11
 [<ffffffff800e9f0f>] bdev_destroy_inode+0x0/0x3e
 [<ffffffff887f9050>] :vxio:voliod_loop+0x0/0x58f
 [<ffffffff8005dfb7>] child_rip+0x0/0x11

Code: 48 8b 90 70 02 00 00 48 89 c8 48 03 44 24 30 48 39 d0 76 08 
RIP  [<ffffffff88804173>] :vxio:vol_multistepsio_read_source+0x94/0x124
 RSP <ffff811215807d30>
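
The signature can be cross-checked against itself. The following is a minimal Python sketch, not a general disassembler: it decodes only the first instruction in the Code: line and adds its displacement to the base register value from the dump. It assumes the Code: bytes begin at the faulting instruction (this dump carries no marker for the faulting byte); the helper name decode_mov_load and the REGS table are illustrative, with values copied from the output above.

```python
# Values copied from the oops above.
CODE = "48 8b 90 70 02 00 00 48 89 c8 48 03 44 24 30 48 39 d0 76 08"
CR2 = 0x0000000000000270
REGS = {
    "rax": 0x0000000000000000,
    "rbx": 0xFFFF8112076E8800,
    "rcx": 0x0000000000089B00,
    "rdx": 0xFFFF8111DAE39840,
}

# Register numbering used by the ModRM byte when no REX.R/REX.B bit is set.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load(code_hex):
    """Decode only 'REX.W 8B /r' with mod=10 (base register + disp32).

    Returns (dest_reg, base_reg, displacement). Any other instruction
    form raises ValueError.
    """
    b = bytes.fromhex(code_hex.replace(" ", ""))
    if b[0] != 0x48 or b[1] != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64")
    modrm = b[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:  # rm=100 would need a SIB byte
        raise ValueError("unsupported addressing form")
    disp = int.from_bytes(b[3:7], "little", signed=True)
    return REG64[reg], REG64[rm], disp


dest, base, disp = decode_mov_load(CODE)
fault_addr = REGS[base] + disp

print(f"mov {disp:#x}(%{base}),%{dest}")
print(f"computed fault address {fault_addr:#x}, CR2 {CR2:#x}")
```

Under that assumption the first instruction decodes as mov 0x270(%rax),%rdx, and with RAX at 0 the computed address equals CR2. That is consistent with a read at offset 0x270 from a NULL base pointer inside vol_multistepsio_read_source; which structure the pointer should have referenced cannot be determined from the dump alone.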

Resolution

  • Symantec has acknowledged what appears to be a bug in the vxio module. Contact Symantec for further support and updates.

Diagnostic Steps

  • Nearly the entire code path leading to the crash runs through vxio code. In the panic output shown under Issue, the faulting instruction is in vxio:vol_multistepsio_read_source, and the call trace passes through vol_multistepsio_start, voliod_iohandle and voliod_loop, all of which belong to the vxio module.
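
The module attribution can be checked mechanically. The sketch below is illustrative only: the regular expression and the helper name frames_by_module are assumptions written for the RHEL 5 oops format seen in this article, where module symbols appear as :module:symbol and core kernel symbols have no prefix. The frames are the RIP line followed by the Call Trace lines from the panic output.

```python
import re
from collections import Counter

# RIP frame followed by the Call Trace frames from the panic output.
TRACE = """\
 [<ffffffff88804173>] :vxio:vol_multistepsio_read_source+0x94/0x124
 [<ffffffff88805069>] :vxio:vol_multistepsio_start+0x2c0/0x5cd
 [<ffffffff887f8fff>] :vxio:voliod_iohandle+0x37/0x88
 [<ffffffff887f923b>] :vxio:voliod_loop+0x1eb/0x58f
 [<ffffffff8005dfc1>] child_rip+0xa/0x11
 [<ffffffff800e9f0f>] bdev_destroy_inode+0x0/0x3e
 [<ffffffff887f9050>] :vxio:voliod_loop+0x0/0x58f
 [<ffffffff8005dfb7>] child_rip+0x0/0x11
"""

# Matches "[<addr>] :module:symbol+off" or "[<addr>] symbol+off".
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(?P<module>\w+):)?(?P<symbol>\w+)\+"
)


def frames_by_module(trace):
    """Count stack frames per module; unprefixed symbols count as 'kernel'."""
    counts = Counter()
    for line in trace.splitlines():
        match = FRAME.search(line)
        if match:
            counts[match.group("module") or "kernel"] += 1
    return counts


counts = frames_by_module(TRACE)
for module, n in counts.most_common():
    print(f"{module}: {n}")
```

For this trace the tally is five vxio frames, including the faulting one, against three core kernel frames (child_rip twice and bdev_destroy_inode).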

