kernel:BUG: soft lockup - CPU stuck for 68s! in sctp_assoc_update_retran_path
Environment
- Red Hat Enterprise Linux (RHEL) 6.6 or earlier
- Stream Control Transmission Protocol (SCTP) association with multi-homed endpoints
- SCTP transport failover between end points
Issue
- kernel:BUG: soft lockup - CPU stuck for 68s! in sctp_assoc_update_retran_path
- Soft lockup, hang, or panic with a backtrace similar to:
RIP: 0010:[<ffffffffa031c64d>] [<ffffffffa031c64d>] sctp_assoc_update_retran_path+0x6d/0xa0 [sctp]
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880fe5781000
Call Trace:
<IRQ>
[<ffffffffa03266ac>] ? sctp_retransmit+0x1dc/0x1f0 [sctp]
[<ffffffffa031a423>] ? sctp_do_sm+0xab3/0x1210 [sctp]
[<ffffffff81496d1d>] ? ip_rcv_finish+0x12d/0x440
[<ffffffff814972a5>] ? ip_rcv+0x275/0x350
[<ffffffffa031aeb0>] ? sctp_generate_t3_rtx_event+0x0/0xd0 [sctp]
[<ffffffffa031af31>] ? sctp_generate_t3_rtx_event+0x81/0xd0 [sctp]
Resolution
- Update to the RHEL 6.7 kernel package (kernel-2.6.32-573.el6) or later.
- If it is necessary to remain on the RHEL 6.6 kernel, update the kernel package to kernel-2.6.32-504.46.1.el6 or later. Please note that RHEL 6.6 is no longer receiving updates, as per the Extended Update Support section of the Red Hat Enterprise Linux Life Cycle page.
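As a rough aid for checking whether a host already runs one of the fixed kernels named above, the following C sketch parses a kernel release string (for example the output of `uname -r`) and compares it against those two builds. The function name `sctp_fix_present` and the parsing rules are assumptions made for illustration only; the sketch understands RHEL 6 release strings of the form 2.6.32-<build>[.<z1>.<z2>].el6 and is not a substitute for checking the errata.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of any Red Hat tool.
 * Returns  1 if the release is at or beyond a fixed build named in the
 *            Resolution section,
 *          0 if it is an older RHEL 6 build,
 *         -1 if the string is not a RHEL 6 release this sketch understands.
 */
int sctp_fix_present(const char *release)
{
    int build = 0, z1 = 0, z2 = 0;
    int n;

    assert(release != NULL);

    /* Only el6 kernels are in scope for this article. */
    if (strstr(release, ".el6") == NULL)
        return -1;

    /* GA kernels have only <build>; z-stream kernels add <z1>.<z2>. */
    n = sscanf(release, "2.6.32-%d.%d.%d", &build, &z1, &z2);
    if (n < 1)
        return -1;

    /* RHEL 6.7 kernel (573) or any later RHEL 6 build. */
    if (build >= 573)
        return 1;

    /* RHEL 6.6 z-stream: fixed from 504.46.1 onwards. */
    if (build == 504 && (z1 > 46 || (z1 == 46 && z2 >= 1)))
        return 1;

    return 0;
}
```

The release string could be taken from the `release` field filled in by uname(2). The kernel in the vmcore below, 2.6.32-504.30.3.el6.x86_64, is reported as not fixed by this check.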
Root Cause
SCTP was retransmitting a packet on the RTO timer.
All transports were either in "unconfirmed" or "inactive" state.
The SCTP implementation in RHEL 6.6 could get into a loop here while trying to pick the better of the "inactive" transports: neither scored higher than the other, so the election ended in a tie and the CPU hung.
This is resolved with upstream commit a7288c4 which applies relevant lines from the RFC to break the tie.
This was backported to RHEL 6 on Private Bug 1090561 and released in RHEL 6.7 kernel package kernel-2.6.32-573.el6 on Errata RHSA-2015:1272. This was also backported to RHEL 6.6 on Private Bug 1306565 and released in kernel package kernel-2.6.32-504.46.1.el6 on Errata RHSA-2016:0617.
Diagnostic Steps
BUG: soft lockup - CPU#0 stuck for 67s! [swapper:0]
Pid: 0, comm: swapper Tainted: P --------------- 2.6.32-504.30.3.el6.x86_64 #1 HP ProLiant BL460c Gen9
RIP: 0010:[<ffffffffa031c64d>] [<ffffffffa031c64d>] sctp_assoc_update_retran_path+0x6d/0xa0 [sctp]
RSP: 0018:ffff880063203c00 EFLAGS: 00000286
RAX: ffff88106511b800 RBX: ffff880063203c00 RCX: ffff88106511b800
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880fe5781000
RBP: ffffffff8100bc13 R08: ffff880fe5781148 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88103feec800 R12: ffff880063203b80
R13: ffff880fe57816e0 R14: ffff880063203b70 R15: ffffffff81533d55
FS: 0000000000000000(0000) GS:ffff880063200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007fc4f143b000 CR3: 0000002065720000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
Stack:
ffff880063203c20 ffffffffa03266ac ffff880fe5781000 ffff880063203c80
<d> ffff880063203e10 ffffffffa031a423 ffff880063203c70 ffffffff81496d1d
<d> ffff88107fcd0340 ffff88102996d8c0 ffff88102996d8c0 ffff881062400020
Call Trace:
<IRQ>
[<ffffffffa03266ac>] ? sctp_retransmit+0x1dc/0x1f0 [sctp]
[<ffffffffa031a423>] ? sctp_do_sm+0xab3/0x1210 [sctp]
[<ffffffff81496d1d>] ? ip_rcv_finish+0x12d/0x440
[<ffffffff814972a5>] ? ip_rcv+0x275/0x350
[<ffffffffa031aeb0>] ? sctp_generate_t3_rtx_event+0x0/0xd0 [sctp]
[<ffffffffa031af31>] ? sctp_generate_t3_rtx_event+0x81/0xd0 [sctp]
[<ffffffff81087e07>] ? run_timer_softirq+0x197/0x340
[<ffffffff810b03c5>] ? tick_dev_program_event+0x65/0xc0
[<ffffffff8107d901>] ? __do_softirq+0xc1/0x1e0
[<ffffffff810b049a>] ? tick_program_event+0x2a/0x30
[<ffffffff8100c38c>] ? call_softirq+0x1c/0x30
[<ffffffff8100fbd5>] ? do_softirq+0x65/0xa0
[<ffffffff8107d7b5>] ? irq_exit+0x85/0x90
[<ffffffff81533d5a>] ? smp_apic_timer_interrupt+0x4a/0x60
[<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
<EOI>
[<ffffffff810166d7>] ? mwait_idle+0x77/0xd0
[<ffffffff8153022a>] ? atomic_notifier_call_chain+0x1a/0x20
[<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
[<ffffffff8151075a>] ? rest_init+0x7a/0x80
[<ffffffff81c29f8f>] ? start_kernel+0x424/0x430
[<ffffffff81c2933a>] ? x86_64_start_reservations+0x125/0x129
[<ffffffff81c29453>] ? x86_64_start_kernel+0x115/0x124
Code: 74 3d 48 8b 00 4c 39 c0 74 f8 8b 90 d4 00 00 00 83 fa 03 74 ed 48 85 c9 74 1d 8b b1 d4 00 00 00 4c 63 d2 45 0f b6 92 00 7f 33 a0 <4c> 63 ce 45 3a 91 00 7f 33 a0 76 bf 83 fa 02 48 89 c1 75 be 48
crash> dis -l sctp_assoc_update_retran_path+0x6d
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1344
0xffffffffa031c64d <sctp_assoc_update_retran_path+109>: movslq %esi,%r9
net/sctp/associola.c
1335 static const u8 sctp_trans_state_to_prio_map[] = { <---- step 2: array
1336 [SCTP_ACTIVE] = 3, /* best case */
1337 [SCTP_UNKNOWN] = 2,
1338 [SCTP_PF] = 1,
1339 [SCTP_INACTIVE] = 0, /* worst case */
1340 };
1341
1342 static u8 sctp_trans_score(const struct sctp_transport *trans)
1343 {
1344 return sctp_trans_state_to_prio_map[trans->state]; <---- step 1: RIP is here, indexing this array. An array lookup cannot hang by itself, so we are probably looping in a caller
1345 }
1346
1347 static struct sctp_transport *sctp_trans_elect_best(struct sctp_transport *curr,
1348 struct sctp_transport *best)
1349 {
1350 if (best == NULL)
1351 return curr;
1352
1353 return sctp_trans_score(curr) > sctp_trans_score(best) ? curr : best; <---- step 3: caller
1354 }
1355
1356 void sctp_assoc_update_retran_path(struct sctp_association *asoc)
1357 {
1358 struct sctp_transport *trans = asoc->peer.retran_path;
1359 struct sctp_transport *trans_next = NULL;
1360
1361 /* We're done as we only have the one and only path. */
1362 if (asoc->peer.transport_count == 1)
1363 return;
1364 /* If active_path and retran_path are the same and active,
1365 * then this is the only active path. Use it.
1366 */
1367 if (asoc->peer.active_path == asoc->peer.retran_path &&
1368 asoc->peer.active_path->state == SCTP_ACTIVE)
1369 return;
1370
1371 /* Iterate from retran_path's successor back to retran_path. */
1372 for (trans = list_next_entry(trans, transports); 1;
1373 trans = list_next_entry(trans, transports)) {
1374 /* Manually skip the head element. */
1375 if (&trans->transports == &asoc->peer.transport_addr_list)
1376 continue;
1377 if (trans->state == SCTP_UNCONFIRMED)
1378 continue;
1379 trans_next = sctp_trans_elect_best(trans, trans_next); <--- step 4: caller
net/sctp/outqueue.c
479 /* Mark all the eligible packets on a transport for retransmission and force
480 * one packet out.
481 */
482 void sctp_retransmit(struct sctp_outq *q, struct sctp_transport *transport,
483 sctp_retransmit_reason_t reason)
484 {
485 int error = 0;
486
487 switch(reason) {
488 case SCTP_RTXR_T3_RTX:
489 SCTP_INC_STATS(SCTP_MIB_T3_RETRANSMITS);
490 sctp_transport_lower_cwnd(transport, SCTP_LOWER_CWND_T3_RTX);
491 /* Update the retran path if the T3-rtx timer has expired for
492 * the current retran path.
493 */
494 if (transport == transport->asoc->peer.retran_path)
495 sctp_assoc_update_retran_path(transport->asoc); <---- step 5: caller
496 transport->asoc->rtx_data_chunks +=
497 transport->asoc->unack_data;
498 break;
SCTP_RTXR_T3_RTX means the packet is being retransmitted because the T3-rtx (RTO) timer expired. Disassembling the hung function up to the RIP:
crash> dis -lr sctp_assoc_update_retran_path+0x6d
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1357
0xffffffffa031c5e0 <sctp_assoc_update_retran_path>: push %rbp
0xffffffffa031c5e1 <sctp_assoc_update_retran_path+1>: mov %rsp,%rbp
0xffffffffa031c5e4 <sctp_assoc_update_retran_path+4>: nopl 0x0(%rax,%rax,1)
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1362
0xffffffffa031c5e9 <sctp_assoc_update_retran_path+9>: cmpw $0x1,0x158(%rdi)
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1358
0xffffffffa031c5f1 <sctp_assoc_update_retran_path+17>: mov 0x190(%rdi),%r11
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1362
0xffffffffa031c5f8 <sctp_assoc_update_retran_path+24>: je 0xffffffffa031c668 <sctp_assoc_update_retran_path+136>
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1367
0xffffffffa031c5fa <sctp_assoc_update_retran_path+26>: cmp 0x188(%rdi),%r11
0xffffffffa031c601 <sctp_assoc_update_retran_path+33>: je 0xffffffffa031c66a <sctp_assoc_update_retran_path+138>
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1372
0xffffffffa031c603 <sctp_assoc_update_retran_path+35>: mov (%r11),%rax
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1375
0xffffffffa031c606 <sctp_assoc_update_retran_path+38>: lea 0x148(%rdi),%r8
0xffffffffa031c60d <sctp_assoc_update_retran_path+45>: xor %ecx,%ecx
0xffffffffa031c60f <sctp_assoc_update_retran_path+47>: jmp 0xffffffffa031c627 <sctp_assoc_update_retran_path+71>
0xffffffffa031c611 <sctp_assoc_update_retran_path+49>: nopl 0x0(%rax)
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1353
0xffffffffa031c618 <sctp_assoc_update_retran_path+56>: mov %esi,%edx
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1381
0xffffffffa031c61a <sctp_assoc_update_retran_path+58>: cmp $0x2,%edx
0xffffffffa031c61d <sctp_assoc_update_retran_path+61>: je 0xffffffffa031c661 <sctp_assoc_update_retran_path+129>
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1384
0xffffffffa031c61f <sctp_assoc_update_retran_path+63>: cmp %r11,%rax
0xffffffffa031c622 <sctp_assoc_update_retran_path+66>: je 0xffffffffa031c661 <sctp_assoc_update_retran_path+129>
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1373
0xffffffffa031c624 <sctp_assoc_update_retran_path+68>: mov (%rax),%rax
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1375
0xffffffffa031c627 <sctp_assoc_update_retran_path+71>: cmp %r8,%rax
0xffffffffa031c62a <sctp_assoc_update_retran_path+74>: je 0xffffffffa031c624 <sctp_assoc_update_retran_path+68>
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1377
0xffffffffa031c62c <sctp_assoc_update_retran_path+76>: mov 0xd4(%rax),%edx
0xffffffffa031c632 <sctp_assoc_update_retran_path+82>: cmp $0x3,%edx
0xffffffffa031c635 <sctp_assoc_update_retran_path+85>: je 0xffffffffa031c624 <sctp_assoc_update_retran_path+68>
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1350
0xffffffffa031c637 <sctp_assoc_update_retran_path+87>: test %rcx,%rcx
0xffffffffa031c63a <sctp_assoc_update_retran_path+90>: je 0xffffffffa031c659 <sctp_assoc_update_retran_path+121>
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1344
0xffffffffa031c63c <sctp_assoc_update_retran_path+92>: mov 0xd4(%rcx),%esi
0xffffffffa031c642 <sctp_assoc_update_retran_path+98>: movslq %edx,%r10
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1353
0xffffffffa031c645 <sctp_assoc_update_retran_path+101>: movzbl -0x5fcc8100(%r10),%r10d
/usr/src/debug/kernel-2.6.32-504.30.3.el6/linux-2.6.32-504.30.3.el6.x86_64/net/sctp/associola.c: 1344
0xffffffffa031c64d <sctp_assoc_update_retran_path+109>: movslq %esi,%r9
crash> dis -lr sctp_assoc_update_retran_path+0x6d | egrep "di$"
crash>
%rdi is not modified over the runtime of the hung function, so the asoc argument is still in %rdi:
RDI: ffff880fe5781000
crash> struct sctp_association ffff880fe5781000
crash> struct sctp_association.peer.transport_count,peer.active_path,peer.retran_path ffff880fe5781000
peer.transport_count = 2,
peer.active_path = 0xffff88103feec800,
peer.retran_path = 0xffff88103feec800,
crash> struct sctp_transport 0xffff88103feec800
struct sctp_transport {
transports = {
next = 0xffff880fe5781148,
prev = 0xffff88106511b800
},
crash> struct sctp_transport.state 0xffff88103feec800
state = 3
enum sctp_spinfo_state {
SCTP_INACTIVE, // 0 these C99 comments added for analysis
SCTP_PF, // 1
SCTP_ACTIVE, // 2
SCTP_UNCONFIRMED, // 3
SCTP_UNKNOWN = 0xffff /* Value used for transport state unknown */
};
The retran path is in SCTP_UNCONFIRMED, so we try to find a successor:
crash> struct sctp_transport.transports,state 0xffff880fe5781148
transports = {
next = 0xffff88106511b800,
prev = 0xffff88103feec800
}
state = 0
crash> struct sctp_transport.transports,state 0xffff88106511b800
transports = {
next = 0xffff88103feec800,
prev = 0xffff880fe5781148
}
state = 0
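There are no suitable successors, so we spin forever. Note that 0xffff880fe5781148 is the value in R08, which the disassembly for line 1375 loads with the address of the transport_addr_list head (0x148(%rdi)), so that entry is skipped and its "state" is not a real transport state. The only other transport, 0xffff88106511b800, is in state 0 (SCTP_INACTIVE).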
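To make the spin concrete, here is a small user-space model of the loop, offered as a sketch rather than as the kernel code. The types and the names `walk_retran_path` and `simulate` are hypothetical. The two exit checks at the bottom of the loop are not shown in the source excerpt above; they are inferred from the disassembly for lines 1381 (`cmp $0x2,%edx`) and 1384 (`cmp %r11,%rax`) and should be treated as an assumption. The walk is capped at `max_iter` steps so that the model terminates where the real loop would not.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric values as enum sctp_spinfo_state shown above. */
enum state { INACTIVE = 0, PF = 1, ACTIVE = 2, UNCONFIRMED = 3 };

struct node {
    struct node *next;
    int is_head;        /* stands in for &asoc->peer.transport_addr_list */
    enum state state;   /* meaningless for the head entry */
};

/*
 * Walk the ring starting at retran_path's successor.
 * Returns the number of steps taken to leave the loop, or -1 if the
 * loop is still running after max_iter steps.
 */
int walk_retran_path(struct node *retran_path, int max_iter)
{
    struct node *trans = retran_path;
    int i;

    assert(retran_path != NULL && max_iter > 0);

    for (i = 1; i <= max_iter; i++) {
        trans = trans->next;
        if (trans->is_head)
            continue;               /* manually skip the head element */
        if (trans->state == UNCONFIRMED)
            continue;               /* also skips the exit checks below */
        /* The election of the best transport is omitted in this model. */
        if (trans->state == ACTIVE)
            return i;               /* inferred from line 1381 */
        if (trans == retran_path)
            return i;               /* inferred from line 1384 */
    }
    return -1;
}

/* Build the ring seen in the vmcore: retran_path -> head -> other -> retran_path. */
int simulate(enum state retran_state, enum state other_state, int max_iter)
{
    struct node head, retran, other;

    head.is_head = 1;   head.state = INACTIVE;        head.next = &other;
    retran.is_head = 0; retran.state = retran_state;  retran.next = &head;
    other.is_head = 0;  other.state = other_state;    other.next = &retran;

    return walk_retran_path(&retran, max_iter);
}
```

In this model, `simulate(UNCONFIRMED, INACTIVE, 1000)` mirrors the states read from the vmcore and never leaves the loop, whereas a confirmed retran path or an active peer transport lets the walk finish within one lap.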
Upstream has a tie breaker here. This is from Linux v4.3:
static struct sctp_transport *sctp_trans_elect_tie(struct sctp_transport *trans1,
						   struct sctp_transport *trans2)
{
	if (trans1->error_count > trans2->error_count) {
		return trans2;
	} else if (trans1->error_count == trans2->error_count &&
		   ktime_after(trans2->last_time_heard,
			       trans1->last_time_heard)) {
		return trans2;
	} else {
		return trans1;
	}
}

static struct sctp_transport *sctp_trans_elect_best(struct sctp_transport *curr,
						    struct sctp_transport *best)
{
	u8 score_curr, score_best;

	if (best == NULL || curr == best)
		return curr;

	score_curr = sctp_trans_score(curr);
	score_best = sctp_trans_score(best);

	/* First, try a score-based selection if both transport states
	 * differ. If we're in a tie, lets try to make a more clever
	 * decision here based on error counts and last time heard.
	 */
	if (score_curr > score_best)
		return curr;
	else if (score_curr == score_best)
		return sctp_trans_elect_tie(curr, best);
	else
		return best;
}
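The effect of the tie breaker can be tried outside the kernel with a simplified model. This is only a sketch: `struct transport`, `elect_tie` and `elect_best` are hypothetical stand-ins, the `score` field replaces the call to sctp_trans_score(), and a plain integer timestamp (larger meaning more recent) replaces the ktime_t comparison done by ktime_after().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct sctp_transport. */
struct transport {
    int score;                  /* stands in for sctp_trans_score() */
    int error_count;
    long long last_time_heard;  /* larger value means heard more recently */
};

/* Equal scores: prefer fewer errors, then the more recently heard peer. */
const struct transport *elect_tie(const struct transport *t1,
                                  const struct transport *t2)
{
    if (t1->error_count > t2->error_count)
        return t2;
    if (t1->error_count == t2->error_count &&
        t2->last_time_heard > t1->last_time_heard)
        return t2;
    return t1;
}

/* Score-based election that falls back to elect_tie() on equal scores. */
const struct transport *elect_best(const struct transport *curr,
                                   const struct transport *best)
{
    assert(curr != NULL);

    if (best == NULL || curr == best)
        return curr;
    if (curr->score > best->score)
        return curr;
    if (curr->score == best->score)
        return elect_tie(curr, best);
    return best;
}
```

With two transports of equal score, the one with the lower error count is chosen; if the error counts also match, the one heard from most recently is chosen, and the current candidate is kept when nothing distinguishes them.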
This was added with:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=a7288c4
This is now in RHEL6.7 and newer:
* Thu Jan 29 2015 Rafael Aquini <aquini@redhat.com> [2.6.32-527.el6]
- [net] sctp: improve sctp_select_active_and_retran_path selection (Daniel Borkmann) [1090561]
This is also now in RHEL6.6 as of kernel-2.6.32-504.46.1.el6:
* Thu Feb 18 2016 Radomir Vrbovsky <rvrbovsk@redhat.com> [2.6.32-504.44.1.el6]
...
- [net] sctp: improve sctp_select_active_and_retran_path selection (Daniel Borkmann) [1306565 1090561]