Kernel Panic: "BUG: unable to handle kernel NULL pointer dereference at (null)" with RIP in afs_linux_write_begin+0x36/0x160
Environment
- Red Hat Enterprise Linux 7
Issue
- The server reboots with the kernel panic message "BUG: unable to handle kernel NULL pointer dereference at (null)", with RIP in the function afs_linux_write_begin+0x36/0x160.
Resolution
The exception occurred in the unsigned openafs kernel module. Contact the vendor of the openafs module for further investigation.
Root Cause
The panic is caused by a NULL pointer dereference at afs_linux_write_begin+0x36/0x160 (the RIP at the time of the fault), inside the unsigned module openafs.
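The root cause statement rests on two pieces of the oops text: the RIP line, which names the symbol, offset, and module, and the module list, which carries the per-module flags. The following is a minimal sketch of how those fields could be pulled out of the text. The helper names (parse_rip, module_flags) are hypothetical, the sample strings are copied from the vmcore-dmesg.txt excerpt in this article, and the flag table is an illustrative subset rather than a complete list.

```python
import re

# Sample text copied from the vmcore-dmesg.txt excerpt in this article.
RIP_LINE = "RIP [<ffffffffa06b54f6>] afs_linux_write_begin+0x36/0x160 [openafs]"
MODULES_FRAGMENT = "acpi_power_meter openafs(POE) binfmt_misc nfsd"

# Per-module flag letters; an illustrative subset, not a complete list.
FLAG_MEANINGS = {
    "P": "proprietary",
    "O": "out-of-tree",
    "E": "unsigned",
}

# Matches: [<address>] symbol+0xOFFSET/0xSIZE [module]
RIP_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<symbol>\w+)\+(?P<offset>0x[0-9a-f]+)"
    r"/(?P<size>0x[0-9a-f]+)(?:\s+\[(?P<module>\w+)\])?"
)


def parse_rip(line):
    """Split an oops RIP line into address, symbol, offset, size, module."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    return {
        "address": int(m.group("addr"), 16),
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),  # None when no [module] suffix is present
    }


def module_flags(modules_text, module):
    """Return the flag letters attached to a module in 'Modules linked in:'."""
    m = re.search(r"(?<!\S)" + re.escape(module) + r"\((\w+)\)", modules_text)
    return m.group(1) if m else ""


rip = parse_rip(RIP_LINE)
flags = module_flags(MODULES_FRAGMENT, rip["module"])
print(rip["symbol"], hex(rip["offset"]), rip["module"], flags)
print([FLAG_MEANINGS.get(f, "unknown") for f in flags])
```

A [module] suffix on the RIP line together with an E flag on that module is the combination this article relies on to attribute the fault to an unsigned third-party module.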
Diagnostic Steps
- Kernel ring buffer from vmcore-dmesg.txt:
  $ tail -n 40 vmcore-dmesg.txt
  [1476087.659656] BUG: unable to handle kernel NULL pointer dereference at (null)
  [1476087.659678] IP: [<ffffffffa06b54f6>] afs_linux_write_begin+0x36/0x160 [openafs]
  [1476087.659680] PGD 7c2bf2067 PUD 7c2bf3067 PMD 0
  [1476087.659681] Oops: 0000 [#1] SMP
  [1476087.659702] Modules linked in: ext4 mbcache jbd2 overlay(T) squashfs loop rpcsec_gss_krb5 nfsv4 dns_resolver vfat fat uas usb_storage mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase dell_rbu osc(OE) mgc(OE) lustre(OE) lmv(OE) fld(OE) mdc(OE) fid(OE) lov(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_ssse3 sha512_generic nfsv3 nfs fscache crypto_null libcfs(OE) bonding ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ipmi_ssif sb_edac edac_core sg i2c_i801 lpc_ich mei_me mei ipmi_si ipmi_msghandler shpchp wmi acpi_power_meter openafs(POE) binfmt_misc nfsd nfs_acl auth_rpcgss lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
  [1476087.659712] crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200 drm_kms_helper mlx5_core syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci igb libahci drm dca libata i2c_algo_bit i40e i2c_core ptp megaraid_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
  [1476087.659715] CPU: 4 PID: 610331 Comm: R Tainted: POE ------------ T 3.10.0-514.16.1.el7.x86_64 #1
  [1476087.659716] Hardware name: Dell Inc. PowerEdge R930/0Y0V4F, BIOS 2.5.2 005/24/2018
  [1476087.659716] task: ffff8819507b2f10 ti: ffff8808941d8000 task.ti: ffff8808941d8000
  [1476087.659726] RIP: 0010:[<ffffffffa06b54f6>] [<ffffffffa06b54f6>] afs_linux_write_begin+0x36/0x160 [openafs]
  [1476087.659727] RSP: 0018:ffff8808941dbc00 EFLAGS: 00010282
  [1476087.659728] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000030
  [1476087.659728] RDX: ffffea007c8339a0 RSI: ffff881fff25a0c8 RDI: 0000000000000297
  [1476087.659729] RBP: ffff8808941dbc28 R08: ffffea007c8339a0 R09: ffff88207ffcfa80
  [1476087.659730] R10: 0000000000000018 R11: 0000000000000003 R12: 00000000000003cd
  [1476087.659730] R13: ffff88233eebba00 R14: ffff8808941dbc90 R15: 0000000000000003
  [1476087.659732] FS: 00002aed21060340(0000) GS:ffff881fff240000(0000) knlGS:0000000000000000
  [1476087.659732] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [1476087.659733] CR2: 0000000000000000 CR3: 00000007c2bf1000 CR4: 00000000003407e0
  [1476087.659733] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [1476087.659734] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [1476087.659734] Stack:
  [1476087.659736] 00000000000003cd 0000000000000003 0000000000000c33 ffff882dcc320950
  [1476087.659737] ffff8808941dbe28 ffff8808941dbcf0 ffffffff81181b2e ffff8808941dbc98
  [1476087.659738] ffff8808941dbe70 00000000000003cd 0000000000000c33 ffff8819507b2f10
  [1476087.659738] Call Trace:
  [1476087.659746] [<ffffffff81181b2e>] generic_file_buffered_write+0x11e/0x2a0
  [1476087.659749] [<ffffffff81183112>] __generic_file_aio_write+0x1e2/0x400
  [1476087.659750] [<ffffffff811806ee>] ? __find_get_page+0x1e/0xa0
  [1476087.659752] [<ffffffff81183389>] generic_file_aio_write+0x59/0xa0
  [1476087.659762] [<ffffffffa06b3979>] afs_linux_aio_write+0x249/0x490 [openafs]
  [1476087.659763] [<ffffffff81180b9b>] ? unlock_page+0x2b/0x30
  [1476087.659765] [<ffffffff811fdf3d>] do_sync_write+0x8d/0xd0
  [1476087.659767] [<ffffffff811fe7ad>] vfs_write+0xbd/0x1e0
  [1476087.659767] [<ffffffff811ff2cf>] SyS_write+0x7f/0xe0
  [1476087.659770] [<ffffffff81697089>] system_call_fastpath+0x16/0x1b
  [1476087.659781] Code: 41 89 cf 41 56 4d 89 ce 41 55 49 89 fd 48 89 f7 48 89 d6 41 54 48 c1 fe 0c 49 89 d4 44 89 c2 53 e8 e0 c3 ac e0 48 89 c3 49 89 06 <48> 8b 00 a8 08 74 13 5b 41 5c 41 5d 41 5e 41 5f 31 c0 5d c3 66
  [1476087.659790] RIP [<ffffffffa06b54f6>] afs_linux_write_begin+0x36/0x160 [openafs]
  [1476087.659791] RSP <ffff8808941dbc00>
  [1476087.659792] CR2: 0000000000000000
- The panicking task is 'R' with PID 610331:
  CPU: 4 PID: 610331 Comm: R Tainted: POE ------------ T 3.10.0-514.16.1.el7.x86_64 #1
- The exception occurred at afs_linux_write_begin+0x36:
  RIP [<ffffffffa06b54f6>] afs_linux_write_begin+0x36/0x160 [openafs]
- All code (disassembly of the Code: bytes, with the trapping instruction marked):
     0:   41 89 cf          mov    %ecx,%r15d
     3:   41 56             push   %r14
     5:   4d 89 ce          mov    %r9,%r14
     8:   41 55             push   %r13
     a:   49 89 fd          mov    %rdi,%r13
     d:   48 89 f7          mov    %rsi,%rdi
    10:   48 89 d6          mov    %rdx,%rsi
    13:   41 54             push   %r12
    15:   48 c1 fe 0c       sar    $0xc,%rsi
    19:   49 89 d4          mov    %rdx,%r12
    1c:   44 89 c2          mov    %r8d,%edx
    1f:   53                push   %rbx
    20:   e8 e0 c3 ac e0    callq  0xffffffffe0acc405
    25:   48 89 c3          mov    %rax,%rbx
    28:   49 89 06          mov    %rax,(%r14)
    2b:*  48 8b 00          mov    (%rax),%rax     <<<<< trapping instruction
    2e:   a8 08             test   $0x8,%al
    30:   74 13             je     0x45
    32:   5b                pop    %rbx
    33:   41 5c             pop    %r12
    35:   41 5d             pop    %r13
    37:   41 5e             pop    %r14
    39:   41 5f             pop    %r15
    3b:   31 c0             xor    %eax,%eax
    3d:   5d                pop    %rbp
    3e:   c3                retq
    3f:   66                data16
- Code starting with the faulting instruction:
     0:   48 8b 00          mov    (%rax),%rax
     3:   a8 08             test   $0x8,%al
     5:   74 13             je     0x1a
     7:   5b                pop    %rbx
     8:   41 5c             pop    %r12
     a:   41 5d             pop    %r13
     c:   41 5e             pop    %r14
     e:   41 5f             pop    %r15
    10:   31 c0             xor    %eax,%eax
    12:   5d                pop    %rbp
    13:   c3                retq
    14:   66                data16
- The address stored in the register %rax at afs_linux_write_begin+0x36 is NULL:
  RAX: 0000000000000000
- The function afs_linux_write_begin() is part of the unsigned (E) module openafs:
  RIP [<ffffffffa06b54f6>] afs_linux_write_begin+0x36/0x160 [openafs]
                                     ^                          ^
                                     |                          |
                             [ Function Name ]           [ Module Name ]
- The above evidence suggests that the issue lies within the unsigned (E) module openafs.
- Since Red Hat does not have the source code of this module for a detailed investigation, contact the provider of the openafs module for further investigation.
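The offsets used in the steps above can be cross-checked with a little address arithmetic. The sketch below takes the RIP, the +0x36 offset, and the positions 0x2b and 0x20 from the disassembly listing, then derives where the function starts, where the Code: dump begins inside it, and the real target of the callq. The constant names are hypothetical and every input value is copied from the oops text in this article. The disassembler decodes the dump as if it started at address 0, which is why it prints 0xffffffffe0acc405 as the call target; the actual target depends on the load address.

```python
# Values copied from the oops output and disassembly in this article.
RIP = 0xFFFFFFFFA06B54F6   # address of the trapping instruction
RIP_OFFSET = 0x36          # from afs_linux_write_begin+0x36/0x160
TRAP_INDEX = 0x2B          # position of the trapping instruction in the Code: dump
CALL_INDEX = 0x20          # position of the callq opcode (e8) in the Code: dump
CALL_BYTES = bytes.fromhex("e8e0c3ace0")  # callq with a 32-bit relative operand

MASK64 = (1 << 64) - 1

# Start of afs_linux_write_begin in memory.
func_start = RIP - RIP_OFFSET

# Address of byte 0 of the Code: dump, and its offset inside the function.
dump_start = RIP - TRAP_INDEX
dump_offset = dump_start - func_start

# A near call targets: address of the next instruction + signed rel32.
call_addr = dump_start + CALL_INDEX
rel32 = int.from_bytes(CALL_BYTES[1:], "little", signed=True)
call_target = (call_addr + len(CALL_BYTES) + rel32) & MASK64

print("function start :", hex(func_start))
print("dump starts at : +" + hex(dump_offset))
print("call site      : +" + hex(call_addr - func_start))
print("call target    :", hex(call_target))
```

The computed target falls outside the openafs module's address range and inside core kernel text. Mapping it to a symbol name requires the symbol information for the running kernel (for example from the vmcore), which is not reproduced here. What the listing does show is that the value returned by that call in %rax is stored and then dereferenced by the very next load without a NULL check, and that load is the trapping instruction.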