Why RHEL-5 system hangs on snd_pcm_attach_substream+0x80/0x2d9 ?
Environment
- Red Hat Enterprise Linux 5 Workstation
- Sound card.
Issue
- System hangs with the following call trace:
Unable to handle kernel paging request at 0000000000100100 RIP:
[<ffffffff882ebf3b>] :snd_pcm:snd_pcm_attach_substream+0x80/0x2d9
PGD 1f2624067 PUD 1f7c32067 PMD 0
Oops: 0000 [1] SMP
<1>last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
<4>CPU 3
<4>Modules linked in: af_key iptable_filter ip_tables x_tables coretemp hidp rfcomm l2cap bluetooth lockd
sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_w
mi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport snd_hda_intel i2c_i801 nvidia(PU)
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd
_timer snd_page_alloc snd_hwdep tpm_tis snd tpm sr_mod tpm_bios cdrom e1000e i2c_core shpchp sg soundcore
pcspkr serio_raw dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_
mod isci libsas scsi_transport_sas ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
<4>Pid: 21326, comm: sox Tainted: P ---- 2.6.18-274.el5 #1
<4>RIP: 0010:[<ffffffff882ebf3b>] [<ffffffff882ebf3b>] :snd_pcm:snd_pcm_attach_substream+0x80/0x2d9
<4>RSP: 0018:ffff8101f4707d38 EFLAGS: 00010293
<4>RAX: ffff8101f3d6f7e0 RBX: ffff81022a2aa400 RCX: 0000000000100100
<4>RDX: 0000000000005340 RSI: 0000000000000000 RDI: ffff81022a2aa56c
<4>RBP: ffff8101f4707dd0 R08: ffff8101f4706000 R09: 0000000000000038
<4>R10: ffff81000901ea90 R11: ffffffff8002c6bf R12: ffff81022dbf84b8
<4>R13: 0000000000000000 R14: ffff81022dbf8400 R15: ffff8101f4707d88
<4>FS: 00002b2054802e60(0000) GS:ffff81022fc38640(0000) knlGS:0000000000000000
<4>CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>CR2: 0000000000100100 CR3: 00000001d653b000 CR4: 00000000000006a0
<4>Process sox (pid: 21326, threadinfo ffff8101f4706000, task ffff8101f3d6f7e0)
<4>Stack: ffff8101f70680c0 00000000fffffff2 ffff8101f4707dd0 ffff8101f70680c0
<4> 0000000000000000 ffff8101f70680c0 ffff81022f6c09c0 ffffffff882eea4e
<4> ffff8101f70680c0 ffffffff80063c9d 0000000000000000 00000000fffffff2
<4>Call Trace:
<4> [<ffffffff882eea4e>] :snd_pcm:snd_pcm_open_substream+0x13/0x88
<4> [<ffffffff80063c9d>] .text.lock.mutex+0xf/0x14
<4> [<ffffffff882eeb97>] :snd_pcm:snd_pcm_open+0xd4/0x21c
<4> [<ffffffff8008e857>] default_wake_function+0x0/0xe
<4> [<ffffffff8829b047>] :snd:snd_lookup_minor_data+0x47/0x4f
<4> [<ffffffff8829b441>] :snd:snd_open+0xf2/0x1af
<4> [<ffffffff8004921a>] chrdev_open+0x14d/0x183
<4> [<ffffffff8001b427>] open_namei+0x2c4/0x718
<4> [<ffffffff800490cd>] chrdev_open+0x0/0x183
<4> [<ffffffff8001ec04>] __dentry_open+0xd9/0x1dc
<4> [<ffffffff80027729>] do_filp_open+0x2a/0x38
<4> [<ffffffff8001a089>] do_sys_open+0x44/0xbe
<4> [<ffffffff8005d28d>] tracesys+0xd5/0xe0
<4>
<4>
<4>Code: 48 8b 01 0f 18 08 48 8d 83 a0 01 00 00 48 39 c1 75 cf 83 cd
<1>RIP [<ffffffff882ebf3b>] :snd_pcm:snd_pcm_attach_substream+0x80/0x2d9
<4> RSP <ffff8101f4707d38>
<0>CR2: 0000000000100100
- Some processes that require the sound device gets blocked with following call traces:
<4> <3>INFO: task sox:21312 blocked for more than 120 seconds.
<3>"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<4>sox D ffffffff801546d1 0 21312 21300 (L-TLB)
<4> ffff8101f2dbdc48 0000000000000046 ffffffff8008e202 0000000000000000
<4> ffff81000901d7a0 0000000000000004 ffff8102164fd0c0 ffff81022fd6e080
<4> 00075a065bc961f8 000000000001ae59 ffff8102164fd2a8 0000000600000096
<4>Call Trace:
<4> [<ffffffff8008e202>] enqueue_task+0x41/0x56
<4> [<ffffffff80046c44>] try_to_wake_up+0x472/0x484
<4> [<ffffffff800645e3>] __down_write_nested+0x7a/0x92
<4> [<ffffffff8829f048>] :snd:snd_ctl_release+0x55/0x102
<4> [<ffffffff80012c22>] __fput+0xd3/0x1bd
<4> [<ffffffff80023daf>] filp_close+0x5c/0x64
<4> [<ffffffff80038dae>] put_files_struct+0x63/0xae
<4> [<ffffffff80015988>] do_exit+0x376/0x955
<4> [<ffffffff80048ea8>] cpuset_exit+0x0/0x88
<4> [<ffffffff8002b4c0>] get_signal_to_deliver+0x465/0x494
<4> [<ffffffff8005a951>] do_notify_resume+0x9c/0x7af
<4> [<ffffffff882f0f35>] :snd_pcm:snd_pcm_playback_ioctl1+0x1e0/0x1fd
<4> [<ffffffff80067225>] do_page_fault+0x4cc/0x842
<4> [<ffffffff882de832>] :snd_timer:snd_timer_user_read+0x184/0x1ce
<4> [<ffffffff80041ef3>] do_ioctl+0x21/0x6b
<4> [<ffffffff800ba0cb>] audit_syscall_exit+0x329/0x344
<4> [<ffffffff8005d32e>] int_signal+0x12/0x17
<4>
int_signal+0x12/0x17
<4>
Resolution
- Please note that this is a known issue in RHEL 5 system. Hence, upgrade the system to RHEL 6 which is not affected by it. For more information about the issue, please contact Red Hat Technical Support team.
Diagnostic Steps
-
Collect vmcore of the system.
-
RCX 0000000000100100 is list poison value, thus we either have some weird use after free or unlocked list manipulation/traversal (or wrong lock is taken). If we take a look at panicking function:
down_read(&card->controls_rwsem);
list_for_each_entry(kctl, &card->ctl_files, list) {
if (kctl->pid == current->pid) {
prefer_subdevice = kctl->prefer_pcm_subdevice;
if (prefer_subdevice != -1)
break;
}
}
up_read(&card->controls_rwsem);
- It would seem ctl_files is supposed to be protected by controls_rwsem, however turns out all routines manipulating it take ctl_files_rwlock spinlock instead.
- Commit which fixed things:
commit 399ccdc1cd4e92e541d4dacbbf18c52bd693418b
Author: Takashi Iwai <tiwai@suse.de>
Date: Thu Sep 25 14:51:03 2008 +0200
ALSA: fix locking in snd_pcm_open*() and snd_rawmidi_open*()
The PCM and rawmidi open callbacks have a lock against card->controls_list
but it takes a wrong one, card->controls_rwsem, instead of a right one
card->ctl_files_rwlock. This patch fixes them.
This change also fixes automatically the potential deadlocks due to
mm->mmap_sem in munmap and copy_from/to_user, reported by Sitsofe
Wheeler:
A: snd_ctl_elem_user_tlv(): card->controls_rwsem => mm->mmap_sem
B: snd_pcm_open(): card->open_mutex => card->controls_rwsem
C: munmap: mm->mmap_sem => snd_pcm_release(): card->open_mutex
The patch breaks the chain.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
diff --git a/sound/core/pcm.c b/sound/core/pcm.c
index 9dd9bc7..ece25c7 100644
--- a/sound/core/pcm.c
+++ b/sound/core/pcm.c
@@ -781,7 +781,7 @@ int snd_pcm_attach_substream(struct snd_pcm *pcm, int stream,
return -ENODEV;
card = pcm->card;
- down_read(&card->controls_rwsem);
+ read_lock(&card->ctl_files_rwlock);
list_for_each_entry(kctl, &card->ctl_files, list) {
if (kctl->pid == current->pid) {
prefer_subdevice = kctl->prefer_pcm_subdevice;
@@ -789,7 +789,7 @@ int snd_pcm_attach_substream(struct snd_pcm *pcm, int stream,
break;
}
}
- up_read(&card->controls_rwsem);
+ read_unlock(&card->ctl_files_rwlock);
switch (stream) {
case SNDRV_PCM_STREAM_PLAYBACK:
diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c
index f7ea728..b917a9f 100644
--- a/sound/core/rawmidi.c
+++ b/sound/core/rawmidi.c
@@ -418,7 +418,7 @@ static int snd_rawmidi_open(struct inode *inode, struct file *file)
mutex_lock(&rmidi->open_mutex);
while (1) {
subdevice = -1;
- down_read(&card->controls_rwsem);
+ read_lock(&card->ctl_files_rwlock);
list_for_each_entry(kctl, &card->ctl_files, list) {
if (kctl->pid == current->pid) {
subdevice = kctl->prefer_rawmidi_subdevice;
@@ -426,7 +426,7 @@ static int snd_rawmidi_open(struct inode *inode, struct file *file)
break;
}
}
- up_read(&card->controls_rwsem);
+ read_unlock(&card->ctl_files_rwlock);
err = snd_rawmidi_kernel_open(rmidi->card, rmidi->device,
subdevice, fflags, rawmidi_file);
if (err >= 0)
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
Welcome! Check out the Getting Started with Red Hat page for quick tours and guides for common tasks.
