Kernel panic: NULL pointer dereference at tcp_ioctl.
Environment
- Red Hat Enterprise Linux 6.4
- kernel-2.6.32-358.23.2.el6.x86_64
Issue
- A customer system panicked with a NULL pointer dereference in tcp_ioctl().
BUG: unable to handle kernel NULL pointer dereference at 000000000000000d
IP: [<ffffffff8148c479>] tcp_ioctl+0x189/0x1c0
PGD 866e02067 PUD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond0.2773/broadcast
CPU 30
Pid: 305160, comm: java Not tainted 2.6.32-358.23.2.el6.x86_64 #1 NEC Express5800/B120e-h [N8400-216Y]/G7LYM
RIP: 0010:[<ffffffff8148c479>] [<ffffffff8148c479>] tcp_ioctl+0x189/0x1c0
RSP: 0018:ffff8805cfd9bdf8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88105c2c70c0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88105c2c70c0
RBP: ffff8805cfd9be18 R08: ffffffff81668f40 R09: 00000000ef816710
R10: 000000000001504e R11: 0000000000000246 R12: 00007f42e03cf504
R13: 00000000000202d0 R14: ffff8800775dba40 R15: 00007f453575f800
FS: 00007f42e03d0700(0000) GS:ffff88089c680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000d CR3: 00000003c2ea3000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process java (pid: 305160, threadinfo ffff8805cfd9a000, task ffff8806c0722ae0)
Stack:
0000000000000000 000000000000541b 00007f42e03cf504 ffffffff81b165c0
<d> ffff8805cfd9be28 ffffffffa028da30 ffff8805cfd9be58 ffffffff81434d4a
<d> ffff880570cf2bc0 ffff8800775dba88 00007f42e03cf504 0000000000000628
Call Trace:
[<ffffffffa028da30>] inet6_ioctl+0x30/0xb0 [ipv6]
[<ffffffff81434d4a>] sock_ioctl+0x7a/0x280
[<ffffffff81195322>] vfs_ioctl+0x22/0xa0
[<ffffffff811954c4>] do_vfs_ioctl+0x84/0x580
[<ffffffff810ace7b>] ? sys_futex+0x7b/0x170
[<ffffffff81195a41>] sys_ioctl+0x81/0xa0
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Code: 8b 42 08 48 39 c2 0f 84 66 ff ff ff 45 85 ed 0f 84 5d ff ff ff 48 85 c0 0f 84 54 ff ff ff 8b 90 bc 00 00 00 48 8b 80 d0 00 00 00 <0f> b6 44 10 0d 83 e0 01 41 29 c5 e9 37 ff ff ff 0f 1f 80 00 00
RIP [<ffffffff8148c479>] tcp_ioctl+0x189/0x1c0
RSP <ffff8805cfd9bdf8>
CR2: 000000000000000d
Resolution
- In the RHEL 6.5 code base, tcp_ioctl() was changed so that it detects a received FIN by testing SOCK_DONE rather than by reading the fin bit of the last queued skb (commit a3374c42aa5f7237e87ff3b0622018636b0c847e).
- That change was backported to RHEL 6.5 under (private) BZ#1001479, "tcp_ioctl FIONREAD/SIOCINQ sets output variable to 1 when it should be 0".
- This panic is probably caused by the same code. Try the latest kernel that includes the fix for BZ#1001479 (RHN errata RHSA-2013:1645-2) and let us know whether the issue still reproduces.
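The behavioural difference between the two approaches can be sketched as follows. This is a simplified Python model, not the kernel source: the function names, the tail_skb_fin argument and the sock_done flag are illustrative stand-ins for the tail skb's fin bit and the SOCK_DONE socket flag. It assumes that a FIN occupies one TCP sequence number, so rcv_nxt - copied_seq is one larger than the byte count once a FIN has been received.

```python
from typing import Optional

SEQ_MASK = 0xFFFFFFFF  # TCP sequence numbers are 32-bit and wrap around


def unread_old(rcv_nxt: int, copied_seq: int, tail_skb_fin: Optional[int]) -> int:
    """Model of the older check: look at the FIN bit of the last queued skb.

    tail_skb_fin is None when the receive queue is empty.
    """
    answ = (rcv_nxt - copied_seq) & SEQ_MASK
    if answ and tail_skb_fin is not None:
        answ -= tail_skb_fin
    return answ


def unread_new(rcv_nxt: int, copied_seq: int, sock_done: bool) -> int:
    """Model of the newer check: rely on a per-socket 'FIN received' flag."""
    answ = (rcv_nxt - copied_seq) & SEQ_MASK
    if answ and sock_done:
        answ -= 1
    return answ


# FIN received and every data byte already read: only the FIN's sequence
# number separates rcv_nxt from copied_seq, and the queue is empty.
print(unread_old(1001, 1000, tail_skb_fin=None))  # old model reports 1
print(unread_new(1001, 1000, sock_done=True))     # new model reports 0
```

In this model the newer form never inspects an skb at all, so it has no pointer to dereference at the point where the code at tcp.c line 489 faulted.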
Diagnostic Steps
The panic occurred in process context; the running process was java.
crash> bt
PID: 305160 TASK: ffff8806c0722ae0 CPU: 30 COMMAND: "java"
#0 [ffff8805cfd9b9c0] machine_kexec at ffffffff81035d6b
#1 [ffff8805cfd9ba20] crash_kexec at ffffffff810c0e22
#2 [ffff8805cfd9baf0] oops_end at ffffffff81511cb0
#3 [ffff8805cfd9bb20] no_context at ffffffff81046c1b
#4 [ffff8805cfd9bb70] __bad_area_nosemaphore at ffffffff81046ea5
#5 [ffff8805cfd9bbc0] bad_area at ffffffff81046fce
#6 [ffff8805cfd9bbf0] __do_page_fault at ffffffff81047780
#7 [ffff8805cfd9bd10] do_page_fault at ffffffff81513bfe
#8 [ffff8805cfd9bd40] page_fault at ffffffff81510fb5
[exception RIP: tcp_ioctl+0x189]
RIP: ffffffff8148c479 RSP: ffff8805cfd9bdf8 RFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88105c2c70c0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88105c2c70c0
RBP: ffff8805cfd9be18 R8: ffffffff81668f40 R9: 00000000ef816710
R10: 000000000001504e R11: 0000000000000246 R12: 00007f42e03cf504
R13: 00000000000202d0 R14: ffff8800775dba40 R15: 00007f453575f800
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff8805cfd9be20] inet6_ioctl at ffffffffa028da30 [ipv6]
#10 [ffff8805cfd9be30] sock_ioctl at ffffffff81434d4a
#11 [ffff8805cfd9be60] vfs_ioctl at ffffffff81195322
#12 [ffff8805cfd9bea0] do_vfs_ioctl at ffffffff811954c4
#13 [ffff8805cfd9bf30] sys_ioctl at ffffffff81195a41
#14 [ffff8805cfd9bf80] system_call_fastpath at ffffffff8100b072
RIP: 00007f4815d7fa47 RSP: 00007f42e03cf600 RFLAGS: 00000206
RAX: 0000000000000010 RBX: ffffffff8100b072 RCX: 00000000c016f358
RDX: 00007f42e03cf504 RSI: 000000000000541b RDI: 0000000000000628
RBP: 00007f42e03cf4d0 R8: 00007f4815b19d10 R9: 00000000ef816710
R10: 000000000001504e R11: 0000000000000246 R12: 0000000000000000
R13: 00000000000008b5 R14: 0000000000000628 R15: 00007f42e03cf504
ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b
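The user-mode register frame at the bottom of the backtrace identifies the system call that was in progress. As a rough sketch, assuming the usual x86_64 Linux convention (syscall number in ORIG_RAX, first three arguments in RDI, RSI and RDX) and the Linux values 16 for ioctl and 0x541b for FIONREAD (also known as SIOCINQ), the request can be decoded as below. The helper name and the lookup tables are illustrative only.

```python
# Values copied from the user-mode register frame in the backtrace above.
regs = {
    "ORIG_RAX": 0x10,
    "RDI": 0x628,
    "RSI": 0x541B,
    "RDX": 0x00007F42E03CF504,
}

# Small lookup tables; only the entries needed here are listed.
SYSCALLS = {16: "ioctl"}                    # x86_64 syscall numbers
IOCTL_CMDS = {0x541B: "FIONREAD/SIOCINQ"}   # Linux ioctl request codes


def decode_ioctl(regs: dict) -> str:
    """Render the syscall as a readable call, using the tables above."""
    name = SYSCALLS.get(regs["ORIG_RAX"], "syscall_%d" % regs["ORIG_RAX"])
    cmd = IOCTL_CMDS.get(regs["RSI"], hex(regs["RSI"]))
    return "%s(fd=%d, %s, arg=%#x)" % (name, regs["RDI"], cmd, regs["RDX"])


print(decode_ioctl(regs))
# prints: ioctl(fd=1576, FIONREAD/SIOCINQ, arg=0x7f42e03cf504)
```

The same request code, 000000000000541b, is also visible in the first line of the kernel stack dump in the oops message.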
Disassembly around the faulting instruction suggests the crash happened while reading the FIN flag from a socket buffer (skb).
crash> dis -lr tcp_ioctl+0x189 | tail -n 9
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/net/ipv4/tcp.c: 488
0xffffffff8148c45a <tcp_ioctl+0x16a>: test %r13d,%r13d
0xffffffff8148c45d <tcp_ioctl+0x16d>: je 0xffffffff8148c3c0 <tcp_ioctl+0xd0>
0xffffffff8148c463 <tcp_ioctl+0x173>: test %rax,%rax
0xffffffff8148c466 <tcp_ioctl+0x176>: je 0xffffffff8148c3c0 <tcp_ioctl+0xd0>
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/net/ipv4/tcp.c: 489
0xffffffff8148c46c <tcp_ioctl+0x17c>: mov 0xbc(%rax),%edx
0xffffffff8148c472 <tcp_ioctl+0x182>: mov 0xd0(%rax),%rax
0xffffffff8148c479 <tcp_ioctl+0x189>: movzbl 0xd(%rax,%rdx,1),%eax <-- Crashed here
0xffffffff8148c47e <tcp_ioctl+0x18e>: and $0x1,%eax
0xffffffff8148c481 <tcp_ioctl+0x191>: sub %eax,%r13d
0xffffffff8148c484 <tcp_ioctl+0x194>: jmpq 0xffffffff8148c3c0 <tcp_ioctl+0xd0>
0xffffffff8148c489 <tcp_ioctl+0x199>: nopl 0x0(%rax)
Corresponding C code, from net/ipv4/tcp.c:
465 int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
[...]
482 struct sk_buff *skb;
483
484 answ = tp->rcv_nxt - tp->copied_seq;
485
486 /* Subtract 1, if FIN is in queue. */
487 skb = skb_peek_tail(&sk->sk_receive_queue);
488 if (answ && skb)
489 answ -= tcp_hdr(skb)->fin; <--
To find the skb, first locate the struct sock. tcp_ioctl() pushes only %rbp, so sk cannot be recovered from registers saved on its stack:
crash> dis -lr tcp_ioctl+0x189 | grep push
0xffffffff8148c2f0 <tcp_ioctl>: push %rbp
Check the caller, inet6_ioctl(), which obtains sk from sock->sk (offset 0x38 in struct socket):
int inet6_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
{
struct sock *sk = sock->sk;
struct net *net = sock_net(sk);
crash> dis -lr ffffffffa028da30 | grep rdi
0xffffffffa028da09 <inet6_ioctl+0x9>: mov 0x38(%rdi),%rdi <--- sk = sock->sk
0xffffffffa028da16 <inet6_ioctl+0x16>: mov 0x38(%rdi),%rax
0xffffffffa028da1c <inet6_ioctl+0x1c>: mov 0x30(%rdi),%rax
Nothing is pushed in inet6_ioctl() either, so check its caller, sock_ioctl():
crash> dis -lr ffffffff81434d4a
[...]
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/net/socket.c: 1014
0xffffffff81434d3b <sock_ioctl+0x6b>: mov 0x40(%r14),%rax
0xffffffff81434d3f <sock_ioctl+0x6f>: mov %r12,%rdx
0xffffffff81434d42 <sock_ioctl+0x72>: mov %ebx,%esi
0xffffffff81434d44 <sock_ioctl+0x74>: mov %r14,%rdi <---- r14 has sock
0xffffffff81434d47 <sock_ioctl+0x77>: callq *0x48(%rax)
0xffffffff81434d4a <sock_ioctl+0x7a>: mov %eax,%edx
crash> inet6_ioctl
inet6_ioctl = $4 =
{int (struct socket *, unsigned int, unsigned long)} 0xffffffffa028da00 <inet6_ioctl>
1014 err = sock->ops->ioctl(sock, cmd, arg);
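crash> dis -lr ffffffffa028da30 | grep r14
crash> dis -lr tcp_ioctl+0x189 | grep r14
Neither inet6_ioctl() nor tcp_ioctl() touches r14 before the fault, so R14 (ffff8800775dba40) still holds the struct socket pointer: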
crash> struct socket ffff8800775dba40
struct socket {
state = SS_CONNECTED,
type_begin = 0xffff8800775dba44,
type = 0x1,
type_end = 0xffff8800775dba48,
flags = 0x0,
fasync_list = 0x0,
wait = {
lock = {
raw_lock = {
slock = 0x2c702c7
}
},
task_list = {
next = 0xffff8800775dba60,
prev = 0xffff8800775dba60
}
},
file = 0xffff880570cf2bc0,
sk = 0xffff88105c2c70c0,
ops = 0xffffffffa02c3920 <inet6_stream_ops>
}
crash> struct socket -o
struct socket {
[0x0] socket_state state;
[0x4] int type_begin[];
[0x4] short type;
[0x8] int type_end[];
[0x8] unsigned long flags;
[0x10] struct fasync_struct *fasync_list;
[0x18] wait_queue_head_t wait;
[0x30] struct file *file;
[0x38] struct sock *sk; <--
[0x40] const struct proto_ops *ops;
}
SIZE: 0x48
crash> struct tcp_sock 0xffff88105c2c70c0| grep -e rcv_nxt -e copied_seq
rcv_nxt = 0xd1813a7,
copied_seq = 0xd1610d7,
crash> eval 0xd1813a7 - 0xd1610d7 <---- answ = tp->rcv_nxt - tp->copied_seq;
hexadecimal: 202d0
decimal: 131792
octal: 401320
binary: 0000000000000000000000000000000000000000000000100000001011010000
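crash's eval performs plain integer arithmetic, which is sufficient here because rcv_nxt is larger than copied_seq. The kernel fields are 32-bit, so in general the subtraction wraps modulo 2^32. A small sketch (the function name is illustrative) that reproduces the value above and also handles a wrapped sequence space:

```python
def seq_diff(rcv_nxt: int, copied_seq: int) -> int:
    """Unsigned 32-bit difference, as u32 arithmetic in C would give."""
    return (rcv_nxt - copied_seq) & 0xFFFFFFFF


answ = seq_diff(0x0D1813A7, 0x0D1610D7)
print(hex(answ), answ)  # 0x202d0 131792, the same value held in R13

# Hypothetical values showing wraparound: rcv_nxt has wrapped past zero.
print(seq_diff(0x00000010, 0xFFFFFFF0))  # 32
```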
crash> struct sock 0xffff88105c2c70c0 | grep state
skc_state = 0x1, <--- TCP_ESTABLISHED = 1,
crash> struct sock.sk_receive_queue 0xffff88105c2c70c0
sk_receive_queue = {
next = 0xffff880052d47568,
prev = 0xffff88096dc6b970,
qlen = 0xf,
lock = {
raw_lock = {
slock = 0x0
}
}
}
crash> list -H 0xffff880052d47568
ffff881053d4a5a8
ffff880c840f2868
ffff880c2c3469e8
ffff88106ac8e3e8
ffff8805967c1228
ffff880c2605f168
ffff880f849901e8
ffff880c29d992a8
ffff88105adbe7e8
ffff8808619ccc68
ffff88106658d828
ffff880e177553a8
ffff880575c631a8
ffff880c820fabe8
ffff88105c2c7170
crash> bt
PID: 305160 TASK: ffff8806c0722ae0 CPU: 30 COMMAND: "java"
#0 [ffff8805cfd9b9c0] machine_kexec at ffffffff81035d6b
#1 [ffff8805cfd9ba20] crash_kexec at ffffffff810c0e22
#2 [ffff8805cfd9baf0] oops_end at ffffffff81511cb0
#3 [ffff8805cfd9bb20] no_context at ffffffff81046c1b
#4 [ffff8805cfd9bb70] __bad_area_nosemaphore at ffffffff81046ea5
#5 [ffff8805cfd9bbc0] bad_area at ffffffff81046fce
#6 [ffff8805cfd9bbf0] __do_page_fault at ffffffff81047780
#7 [ffff8805cfd9bd10] do_page_fault at ffffffff81513bfe
#8 [ffff8805cfd9bd40] page_fault at ffffffff81510fb5
[exception RIP: tcp_ioctl+0x189]
RIP: ffffffff8148c479 RSP: ffff8805cfd9bdf8 RFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88105c2c70c0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88105c2c70c0 <--- sock
RBP: ffff8805cfd9be18 R8: ffffffff81668f40 R9: 00000000ef816710
R10: 000000000001504e R11: 0000000000000246 R12: 00007f42e03cf504
R13: 00000000000202d0 R14: ffff8800775dba40 R15: 00007f453575f800
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff8805cfd9be20] inet6_ioctl at ffffffffa028da30 [ipv6]
#10 [ffff8805cfd9be30] sock_ioctl at ffffffff81434d4a
#11 [ffff8805cfd9be60] vfs_ioctl at ffffffff81195322
#12 [ffff8805cfd9bea0] do_vfs_ioctl at ffffffff811954c4
#13 [ffff8805cfd9bf30] sys_ioctl at ffffffff81195a41
#14 [ffff8805cfd9bf80] system_call_fastpath at ffffffff8100b072
RIP: 00007f4815d7fa47 RSP: 00007f42e03cf600 RFLAGS: 00000206
RAX: 0000000000000010 RBX: ffffffff8100b072 RCX: 00000000c016f358
RDX: 00007f42e03cf504 RSI: 000000000000541b RDI: 0000000000000628
RBP: 00007f42e03cf4d0 R8: 00007f4815b19d10 R9: 00000000ef816710
R10: 000000000001504e R11: 0000000000000246 R12: 0000000000000000
R13: 00000000000008b5 R14: 0000000000000628 R15: 00007f42e03cf504
ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b
Disassembly of tcp_ioctl() leading up to the faulting instruction:
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/include/linux/skbuff.h: 829
0xffffffff8148c443 <tcp_ioctl+0x153>: lea 0xb0(%rbx),%rdx <--- A
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/net/ipv4/tcp.c: 484
0xffffffff8148c44a <tcp_ioctl+0x15a>: sub %eax,%r13d
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/include/linux/skbuff.h: 829
0xffffffff8148c44d <tcp_ioctl+0x15d>: mov 0x8(%rdx),%rax <---- B
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/include/linux/skbuff.h: 830
0xffffffff8148c451 <tcp_ioctl+0x161>: cmp %rax,%rdx
0xffffffff8148c454 <tcp_ioctl+0x164>: je 0xffffffff8148c3c0 <tcp_ioctl+0xd0>
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/net/ipv4/tcp.c: 488 <--- if (answ && skb) r13 had answ and rax had skb
0xffffffff8148c45a <tcp_ioctl+0x16a>: test %r13d,%r13d
0xffffffff8148c45d <tcp_ioctl+0x16d>: je 0xffffffff8148c3c0 <tcp_ioctl+0xd0>
0xffffffff8148c463 <tcp_ioctl+0x173>: test %rax,%rax
0xffffffff8148c466 <tcp_ioctl+0x176>: je 0xffffffff8148c3c0 <tcp_ioctl+0xd0>
/usr/src/debug/kernel-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.x86_64/net/ipv4/tcp.c: 489
0xffffffff8148c46c <tcp_ioctl+0x17c>: mov 0xbc(%rax),%edx
0xffffffff8148c472 <tcp_ioctl+0x182>: mov 0xd0(%rax),%rax
0xffffffff8148c479 <tcp_ioctl+0x189>: movzbl 0xd(%rax,%rdx,1),%eax
crash> sk_buff -o | grep -e bc -e d0
[0xbc] sk_buff_data_t transport_header;
[0xd0] unsigned char *head;
====
crash> struct sock -o | grep b0 <--- A
[0xb0] struct sk_buff_head sk_receive_queue;
crash> struct sock.sk_receive_queue ffff88105c2c70c0 <--- A
sk_receive_queue = {
next = 0xffff880052d47568,
prev = 0xffff88096dc6b970, <--- B
qlen = 0xf,
lock = {
raw_lock = {
slock = 0x0
}
}
}
crash> struct sk_buff_head -o
struct sk_buff_head {
[0x0] struct sk_buff *next;
[0x8] struct sk_buff *prev; <----- B
[0x10] __u32 qlen;
[0x14] spinlock_t lock;
}
SIZE: 0x18
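The addresses involved follow directly from these offsets. A short sketch of the arithmetic, using only values shown in the crash output above (the variable names are illustrative):

```python
SK = 0xFFFF88105C2C70C0          # struct sock pointer (RBX/RDI at the fault)
SK_RECEIVE_QUEUE_OFF = 0xB0      # offset of sk_receive_queue in struct sock
PREV_OFF = 0x8                   # offset of prev in struct sk_buff_head

queue_head = SK + SK_RECEIVE_QUEUE_OFF   # step A: lea 0xb0(%rbx),%rdx
prev_slot = queue_head + PREV_OFF        # step B: mov 0x8(%rdx),%rax reads here

print(hex(queue_head))  # 0xffff88105c2c7170
print(hex(prev_slot))   # 0xffff88105c2c7178
```

0xffff88105c2c7170 is also the address at which the next-pointer walk of the receive queue ends, which confirms that it is the queue head embedded in this struct sock.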
crash> sk_buff 0xffff88096dc6b970 | grep -e head -e transport
transport_header = 0x0,
network_header = 0x0,
mac_header = 0x0,
head = 0x0,
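With both fields zero, the faulting address can be reproduced by hand. The sketch below models the x86 addressing form disp(base,index,scale) used by the faulting movzbl instruction; the function name is illustrative.

```python
def effective_address(disp: int, base: int, index: int, scale: int) -> int:
    """Address computed by an x86 operand of the form disp(base,index,scale)."""
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF


head = 0x0              # skb->head of the skb that prev points to
transport_header = 0x0  # skb->transport_header of the same skb

# movzbl 0xd(%rax,%rdx,1),%eax with rax = head, rdx = transport_header
fault_addr = effective_address(0xD, head, transport_header, 1)
print(hex(fault_addr))  # 0xd, matching CR2 in the oops
```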
crash> rd 0xffff88096dc6b970
ffff88096dc6b970: ffff88105c2c7170 pq,\....
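The list is intact when walked through the next pointers, but the skb that prev points to appears to have been removed (its head and header offsets are all zero):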
crash> list -H 0xffff880052d47568
ffff881053d4a5a8
ffff880c840f2868
ffff880c2c3469e8
ffff88106ac8e3e8
ffff8805967c1228
ffff880c2605f168
ffff880f849901e8
ffff880c29d992a8
ffff88105adbe7e8
ffff8808619ccc68
ffff88106658d828
ffff880e177553a8
ffff880575c631a8
ffff880c820fabe8
ffff88105c2c7170 <-
crash> rd ffff880c820fabe8
ffff880c820fabe8: ffff88105c2c7170
crash> rd ffff88105c2c7170
ffff88105c2c7170: ffff880052d47568 hu.R....
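The inconsistency can be stated as a simple check on a doubly linked queue: the last node reached through next pointers should be the node that the head's prev names. The sketch below rebuilds the next chain from the addresses in the first list -H output above and compares the result with the recorded prev and qlen. It is an offline model of the dump, with illustrative names, and does not read a vmcore.

```python
QUEUE_HEAD = 0xFFFF88105C2C7170   # &sk->sk_receive_queue
HEAD_NEXT = 0xFFFF880052D47568    # sk_receive_queue.next
HEAD_PREV = 0xFFFF88096DC6B970    # sk_receive_queue.prev
QLEN = 0xF                        # sk_receive_queue.qlen

# skbs in the order reached by following next pointers, from the dump above.
chain = [
    HEAD_NEXT,
    0xFFFF881053D4A5A8, 0xFFFF880C840F2868, 0xFFFF880C2C3469E8,
    0xFFFF88106AC8E3E8, 0xFFFF8805967C1228, 0xFFFF880C2605F168,
    0xFFFF880F849901E8, 0xFFFF880C29D992A8, 0xFFFF88105ADBE7E8,
    0xFFFF8808619CCC68, 0xFFFF88106658D828, 0xFFFF880E177553A8,
    0xFFFF880575C631A8, 0xFFFF880C820FABE8,
]
# next pointer of every node, including the queue head itself
next_of = dict(zip([QUEUE_HEAD] + chain, chain + [QUEUE_HEAD]))


def walk(head: int) -> list:
    """Follow next pointers from the queue head until it is reached again."""
    nodes, cur = [], next_of[head]
    while cur != head:
        nodes.append(cur)
        cur = next_of[cur]
    return nodes


skbs = walk(QUEUE_HEAD)
print(len(skbs) == QLEN)        # True: 15 skbs, agreeing with qlen
print(hex(skbs[-1]))            # 0xffff880c820fabe8, the tail by next pointers
print(skbs[-1] == HEAD_PREV)    # False: prev names a different skb
```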
For comparison, examine the sk_buff that the queue's next pointer refers to:
crash> sk_buff 0xffff880052d47568 | grep -e head -e transport
transport_header = 0x260,
network_header = 0x24c,
mac_header = 0x23e,
head = 0xffff880589726000 "\a\350I\a\217g\021\202...",
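The faulting instruction, movzbl 0xd(%rax,%rdx,1),%eax, loads the byte at %rax + %rdx + 0xd and zero-extends it into %eax. %rax holds skb->head and %rdx holds skb->transport_header, so the byte read is at offset 0xd of the TCP header. For this skb that address is 0xffff880589726000 + 0x260 + 0xd: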
crash> eval 0xffff880589726000 + 0x260
hexadecimal: ffff880589726260
decimal: 18446612156095029856 (-131917614521760)
octal: 1777774200261134461140
binary: 1111111111111111100010000000010110001001011100100110001001100000
crash> struct tcphdr -o | grep 0xd
[0xd] __u16 fin : 1;
[0xd] __u16 syn : 1;
[0xd] __u16 rst : 1;
[0xd] __u16 psh : 1;
[0xd] __u16 ack : 1;
[0xd] __u16 urg : 1;
[0xd] __u16 ece : 1;
[0xd] __u16 cwr : 1;
crash> struct tcphdr 0xffff880589726260
struct tcphdr {
source = 15601,
dest = 29993,
seq = 1071453453,
ack_seq = 1864368086,
res1 = 0,
doff = 8,
fin = 0,
syn = 0,
rst = 0,
psh = 0,
ack = 1,
urg = 0,
ece = 0,
cwr = 0,
window = 80,
check = 10558,
urg_ptr = 0
}
crash> struct tcphdr.fin 0xffff880589726260
fin = 0x0
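As a cross-check of how the fin value is obtained from byte 0xd, the sketch below packs a 20-byte TCP header and extracts the flag the same way the faulting code path does (load byte 13, then mask with 0x1). The port, sequence and window values are placeholders, not the values from this dump; FIN being bit 0 of byte 13 follows the standard TCP header layout.

```python
import struct

FIN, SYN, RST, PSH, ACK, URG, ECE, CWR = (1 << i for i in range(8))


def build_tcp_header(flags: int, doff: int = 5) -> bytes:
    """Pack a minimal TCP header; all fields except flags are placeholders."""
    return struct.pack(
        "!HHIIBBHHH",
        12345,        # source port (placeholder)
        80,           # destination port (placeholder)
        0,            # sequence number
        0,            # acknowledgment number
        doff << 4,    # data offset in the high nibble of byte 12
        flags,        # byte 13: CWR ECE URG ACK PSH RST SYN FIN
        0, 0, 0,      # window, checksum, urgent pointer
    )


def fin_bit(header: bytes) -> int:
    """Equivalent of: movzbl 0xd(...),%eax ; and $0x1,%eax"""
    return header[0xD] & 0x1


print(fin_bit(build_tcp_header(ACK)))        # 0, as for the skb examined above
print(fin_bit(build_tcp_header(ACK | FIN)))  # 1
```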