On RHEL 7.2, the NFSv4 server kernel crashes in nfs4_put_stid while dereferencing an nfs4_stid pointer
Issue
- After kernel list_del corruption warnings, an nfsd4 kworker task crashes the kernel in nfs4_put_stid, called from laundromat_main.
[88415.153187] ------------[ cut here ]------------
[88415.153194] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
[88415.153195] list_del corruption. prev->next should be ffff88005d315ef0, but was ffff88005d3146e0
[88415.153196] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge stp llc appassure_vss(POE) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel ipmi_devintf aesni_intel lrw gf128mul glue_helper iTCO_wdt ablk_helper cryptd iTCO_vendor_support mxm_wmi dcdbas pcspkr sb_edac sg edac_core mei_me lpc_ich mei mfd_core shpchp ipmi_si acpi_power_meter ipmi_msghandler wmi dm_multipath nfsd nfs_acl lockd binfmt_misc auth_rpcgss grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm crct10dif_pclmul crct10dif_common
[88415.153227] crc32c_intel drm ahci tg3 libahci i2c_core ptp libata megaraid_sas pps_core dm_mirror dm_region_hash dm_log dm_mod
[88415.153234] CPU: 8 PID: 9631 Comm: nfsd Tainted: P OE ------------ 3.10.0-327.36.1.el7.x86_64 #1
[88415.153235] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.0.1 04/11/2016
[88415.153236] ffff88041fc8fc10 00000000dd3855fa ffff88041fc8fbc8 ffffffff81636301
[88415.153239] ffff88041fc8fc00 ffffffff8107b260 ffff88005d315ef0 ffff88005d316070
[88415.153240] ffff88034c867000 ffff880412048200 ffff88034c867340 ffff88041fc8fc68
[88415.153242] Call Trace:
[88415.153248] [<ffffffff81636301>] dump_stack+0x19/0x1b
[88415.153251] [<ffffffff8107b260>] warn_slowpath_common+0x70/0xb0
[88415.153253] [<ffffffff8107b2fc>] warn_slowpath_fmt+0x5c/0x80
[88415.153263] [<ffffffffa038ff8b>] ? find_stateid_by_type+0x6b/0xa0 [nfsd]
[88415.153265] [<ffffffff8130c671>] __list_del_entry+0xa1/0xd0
[88415.153271] [<ffffffffa038f9ab>] nfs4_unhash_openowner+0x1b/0x40 [nfsd]
[88415.153275] [<ffffffffa038ed53>] nfs4_put_stateowner+0x33/0x60 [nfsd]
[88415.153279] [<ffffffffa038ff00>] nfs4_free_ol_stateid+0x20/0x40 [nfsd]
[88415.153284] [<ffffffffa0391568>] nfs4_put_stid+0x48/0x80 [nfsd]
[88415.153289] [<ffffffffa0396869>] nfsd4_close+0x139/0x310 [nfsd]
[88415.153293] [<ffffffffa0384914>] nfsd4_proc_compound+0x4d4/0x7f0 [nfsd]
[88415.153298] [<ffffffffa037012b>] nfsd_dispatch+0xbb/0x200 [nfsd]
[88415.153310] [<ffffffffa0156283>] svc_process_common+0x453/0x6f0 [sunrpc]
[88415.153316] [<ffffffffa0156623>] svc_process+0x103/0x170 [sunrpc]
[88415.153320] [<ffffffffa036faaf>] nfsd+0xdf/0x150 [nfsd]
[88415.153324] [<ffffffffa036f9d0>] ? nfsd_destroy+0x80/0x80 [nfsd]
[88415.153327] [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[88415.153329] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88415.153331] [<ffffffff81646958>] ret_from_fork+0x58/0x90
[88415.153333] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88415.153334] ---[ end trace bbc5392fbd36ab2c ]---
[88665.388447] ------------[ cut here ]------------
[88665.388455] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
[88665.388457] list_del corruption. prev->next should be ffff88005d315ef0, but was (null)
[88665.388458] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge stp llc appassure_vss(POE) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel ipmi_devintf aesni_intel lrw gf128mul glue_helper iTCO_wdt ablk_helper cryptd iTCO_vendor_support mxm_wmi dcdbas pcspkr sb_edac sg edac_core mei_me lpc_ich mei mfd_core shpchp ipmi_si acpi_power_meter ipmi_msghandler wmi dm_multipath nfsd nfs_acl lockd binfmt_misc auth_rpcgss grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm crct10dif_pclmul crct10dif_common
[88665.388491] crc32c_intel drm ahci tg3 libahci i2c_core ptp libata megaraid_sas pps_core dm_mirror dm_region_hash dm_log dm_mod
[88665.388499] CPU: 5 PID: 9632 Comm: nfsd Tainted: P W OE ------------ 3.10.0-327.36.1.el7.x86_64 #1
[88665.388500] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.0.1 04/11/2016
[88665.388501] ffff88046132bc20 00000000183422ce ffff88046132bbd8 ffffffff81636301
[88665.388504] ffff88046132bc10 ffffffff8107b260 ffff88005d315ef0 ffff88046132bca0
[88665.388506] ffff88005d316070 ffff8802adb48878 0000000000000000 ffff88046132bc78
[88665.388508] Call Trace:
[88665.388514] [<ffffffff81636301>] dump_stack+0x19/0x1b
[88665.388530] [<ffffffff8107b260>] warn_slowpath_common+0x70/0xb0
[88665.388532] [<ffffffff8107b2fc>] warn_slowpath_fmt+0x5c/0x80
[88665.388534] [<ffffffff8130c671>] __list_del_entry+0xa1/0xd0
[88665.388545] [<ffffffffa03916c9>] release_openowner+0x59/0x110 [nfsd]
[88665.388551] [<ffffffffa03918bb>] __destroy_client+0x11b/0x180 [nfsd]
[88665.388569] [<ffffffffa0391942>] expire_client+0x22/0x30 [nfsd]
[88665.388575] [<ffffffffa0393bad>] nfsd4_setclientid_confirm+0x1bd/0x290 [nfsd]
[88665.388580] [<ffffffffa0384914>] nfsd4_proc_compound+0x4d4/0x7f0 [nfsd]
[88665.388585] [<ffffffffa037012b>] nfsd_dispatch+0xbb/0x200 [nfsd]
[88665.388597] [<ffffffffa0156283>] svc_process_common+0x453/0x6f0 [sunrpc]
[88665.388605] [<ffffffffa0156623>] svc_process+0x103/0x170 [sunrpc]
[88665.388609] [<ffffffffa036faaf>] nfsd+0xdf/0x150 [nfsd]
[88665.388613] [<ffffffffa036f9d0>] ? nfsd_destroy+0x80/0x80 [nfsd]
[88665.388616] [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[88665.388618] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88665.388621] [<ffffffff81646958>] ret_from_fork+0x58/0x90
[88665.388623] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88665.388624] ---[ end trace bbc5392fbd36ab2d ]---
[88694.124277] ------------[ cut here ]------------
[88694.124285] WARNING: at lib/list_debug.c:62 __list_del_entry+0x82/0xd0()
[88694.124287] list_del corruption. next->prev should be ffff88004313e5a8, but was ffff88005d316080
[88694.124288] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge stp llc appassure_vss(POE) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel ipmi_devintf aesni_intel lrw gf128mul glue_helper iTCO_wdt ablk_helper cryptd iTCO_vendor_support mxm_wmi dcdbas pcspkr sb_edac sg edac_core mei_me lpc_ich mei mfd_core shpchp ipmi_si acpi_power_meter ipmi_msghandler wmi dm_multipath nfsd nfs_acl lockd binfmt_misc auth_rpcgss grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm crct10dif_pclmul crct10dif_common
[88694.124322] crc32c_intel drm ahci tg3 libahci i2c_core ptp libata megaraid_sas pps_core dm_mirror dm_region_hash dm_log dm_mod
[88694.124330] CPU: 3 PID: 9225 Comm: kworker/u384:1 Tainted: P W OE ------------ 3.10.0-327.36.1.el7.x86_64 #1
[88694.124331] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.0.1 04/11/2016
[88694.124342] Workqueue: nfsd4 laundromat_main [nfsd]
[88694.124343] ffff8800442c7d30 000000003b0e3606 ffff8800442c7ce8 ffffffff81636301
[88694.124345] ffff8800442c7d20 ffffffff8107b260 ffff880354e8b1d0 00000000580028ad
[88694.124347] ffff880461751d30 ffff88004313e5a8 ffff880461751c90 ffff8800442c7d88
[88694.124350] Call Trace:
[88694.124355] [<ffffffff81636301>] dump_stack+0x19/0x1b
[88694.124358] [<ffffffff8107b260>] warn_slowpath_common+0x70/0xb0
[88694.124360] [<ffffffff8107b2fc>] warn_slowpath_fmt+0x5c/0x80
[88694.124367] [<ffffffffa038ff0f>] ? nfs4_free_ol_stateid+0x2f/0x40 [nfsd]
[88694.124370] [<ffffffff8130c652>] __list_del_entry+0x82/0xd0
[88694.124376] [<ffffffffa03956e8>] laundromat_main+0x348/0x4e0 [nfsd]
[88694.124379] [<ffffffff8109d69b>] process_one_work+0x17b/0x470
[88694.124381] [<ffffffff8109e46b>] worker_thread+0x11b/0x400
[88694.124383] [<ffffffff8109e350>] ? rescuer_thread+0x400/0x400
[88694.124397] [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[88694.124400] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88694.124403] [<ffffffff81646958>] ret_from_fork+0x58/0x90
[88694.124405] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88694.124407] ---[ end trace bbc5392fbd36ab2e ]---
[88694.124412] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[88694.124454] IP: [<ffffffffa0391530>] nfs4_put_stid+0x10/0x80 [nfsd]
[88694.124479] PGD 0
[88694.124487] Oops: 0000 [#1] SMP
[88694.124500] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge stp llc appassure_vss(POE) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel ipmi_devintf aesni_intel lrw gf128mul glue_helper iTCO_wdt ablk_helper cryptd iTCO_vendor_support mxm_wmi dcdbas pcspkr sb_edac sg edac_core mei_me lpc_ich mei mfd_core shpchp ipmi_si acpi_power_meter ipmi_msghandler wmi dm_multipath nfsd nfs_acl lockd binfmt_misc auth_rpcgss grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm crct10dif_pclmul crct10dif_common
[88694.124792] crc32c_intel drm ahci tg3 libahci i2c_core ptp libata megaraid_sas pps_core dm_mirror dm_region_hash dm_log dm_mod
[88694.124835] CPU: 3 PID: 9225 Comm: kworker/u384:1 Tainted: P W OE ------------ 3.10.0-327.36.1.el7.x86_64 #1
[88694.124866] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.0.1 04/11/2016
[88694.124893] Workqueue: nfsd4 laundromat_main [nfsd]
[88694.124909] task: ffff880445c1f300 ti: ffff8800442c4000 task.ti: ffff8800442c4000
[88694.124941] RIP: 0010:[<ffffffffa0391530>] [<ffffffffa0391530>] nfs4_put_stid+0x10/0x80 [nfsd]
[88694.124971] RSP: 0018:ffff8800442c7d78 EFLAGS: 00010296
[88694.124986] RAX: ffff88004313e5a8 RBX: 0000000000000000 RCX: dead000000200200
[88694.125006] RDX: ffff88004313e5a8 RSI: ffffea000d53a280 RDI: 0000000000000000
[88694.125026] RBP: ffff8800442c7d98 R08: ffff88004313e5a8 R09: 0000000180440038
[88694.125046] R10: ffffffffa038ff0f R11: ffffea000d53a280 R12: 00000000580028ad
[88694.125065] R13: ffff880461751d30 R14: ffff88004313e5a8 R15: ffff880461751c90
[88694.125086] FS: 0000000000000000(0000) GS:ffff88046c6c0000(0000) knlGS:0000000000000000
[88694.125108] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[88694.125124] CR2: 0000000000000018 CR3: 000000000194a000 CR4: 00000000001407e0
[88694.125144] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[88694.125164] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[88694.125183] Stack:
[88694.125189] 0000000000000000 00000000580028ad ffff880461751d30 ffff88004313e5a8
[88694.125214] ffff8800442c7e18 ffffffffa039570b 000000000000000c 0000000000000000
[88694.125238] 0000000000000000 ffff880461751cb0 ffff8800741d1000 ffff880461751c00
[88694.125262] Call Trace:
[88694.125275] [<ffffffffa039570b>] laundromat_main+0x36b/0x4e0 [nfsd]
[88694.125295] [<ffffffff8109d69b>] process_one_work+0x17b/0x470
[88694.125312] [<ffffffff8109e46b>] worker_thread+0x11b/0x400
[88694.125329] [<ffffffff8109e350>] ? rescuer_thread+0x400/0x400
[88694.125346] [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[88694.125362] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88694.125381] [<ffffffff81646958>] ret_from_fork+0x58/0x90
[88694.125398] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[88694.125416] Code: 44 00 00 31 c0 eb ec 0f 1f 40 00 48 89 de 4c 89 ff e8 15 f7 e2 e0 31 c0 eb d9 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 <4c> 8b 6f 18 48 89 fb 4c 8b 77 20 4d 8d a5 40 03 00 00 4c 89 e6
[88694.125517] RIP [<ffffffffa0391530>] nfs4_put_stid+0x10/0x80 [nfsd]
[88694.126304] RSP <ffff8800442c7d78>
[88694.127111] CR2: 0000000000000018
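The "list_del corruption" warnings in the log above come from a sanity check made before a list entry is unlinked: both neighbours must still point back at the entry. The following is a minimal userspace sketch of that kind of check, written in Python for illustration only. The names `Node`, `list_add` and `checked_del` are hypothetical, the sketch does not model the kernel's pointer poisoning, and it says nothing about the root cause of this particular crash.

```python
class Node:
    """Hypothetical stand-in for a list entry; a fresh node points at itself."""

    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self


def list_add(new, head):
    """Insert `new` immediately after `head`."""
    nxt = head.next
    new.prev = head
    new.next = nxt
    head.next = new
    nxt.prev = new


def checked_del(entry):
    """Unlink `entry` only if both neighbours still point back at it.

    Returns None when the entry was unlinked, or a warning string (shaped
    like the messages in the log) when a check failed and nothing was done.
    """
    if entry.prev.next is not entry:
        return ("list_del corruption. prev->next should be %s, but was %s"
                % (entry.name, entry.prev.next.name))
    if entry.next.prev is not entry:
        return ("list_del corruption. next->prev should be %s, but was %s"
                % (entry.name, entry.next.prev.name))
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    return None


# Build head <-> b <-> a, then unlink `a` twice through a stale reference.
head, a, b = Node("head"), Node("a"), Node("b")
list_add(a, head)
list_add(b, head)
first = checked_del(a)   # neighbours agree, so the entry is unlinked
second = checked_del(a)  # a.prev is still b, but b.next no longer points at a
print(first)
print(second)
```

In this sketch the second call fails the prev->next check because the list has already moved on while the stale entry still holds its old neighbour pointers, which is the general situation the warning reports.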
Environment
- Red Hat Enterprise Linux 7 (NFSv4 server)
- Observed with kernel-3.10.0-327.36.1.el7
- NFSv4
- A third-party kernel module is loaded
crash> mod -t
NAME TAINTS
appassure_vss POE
crash>
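The taint letters shown by `mod -t` (POE) and in the `Tainted:` field of the log (P, W, O, E) can be decoded with a small lookup. The sketch below is illustrative: `TAINT_FLAGS` and `decode_taint` are hypothetical names, and the table covers only the four letters that appear in this article, not the kernel's full set of taint flags.

```python
# Meanings of the taint letters that appear in this article only.
TAINT_FLAGS = {
    "P": "a proprietary module was loaded",
    "W": "the kernel previously issued a warning",
    "O": "an out-of-tree module was loaded",
    "E": "an unsigned module was loaded",
}


def decode_taint(flags):
    """Return (letter, meaning) pairs for a taint string such as 'POE'.

    Whitespace is skipped; letters outside the table above are reported
    as not covered rather than guessed.
    """
    decoded = []
    for letter in flags:
        if letter.isspace():
            continue
        meaning = TAINT_FLAGS.get(letter, "not covered by this sketch")
        decoded.append((letter, meaning))
    return decoded


# Taint string reported for appassure_vss by `mod -t`.
for letter, meaning in decode_taint("POE"):
    print(letter, meaning)
```

Read this way, the appassure_vss module is flagged as proprietary, out-of-tree and unsigned, and the W in the later log entries records that a warning had already been issued before the crash.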