System is not responding and the vmcore shows the 'xagt' process
Issue
- The system hangs and stops responding.
- The kernel log shows repeated 'rcu_sched detected stalls on CPUs/tasks' messages while the 'xagt' process was in do_exit().
[40977820.601691] ------------[ cut here ]------------
[40977820.624154] WARNING: CPU: 0 PID: 20889 at kernel/trace/trace_kprobe.c:1350 kprobe_dispatcher+0x2b2/0x2c0
[40977820.669349] profile buffer not large enough
[40977820.688770] Modules linked in:
[40977820.704669] udp_diag tcp_diag inet_diag iptable_filter raid1 dm_raid raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache mptctl mptbase seos_1410_0_1119(POE) bonding sunrpc ext4 mbcache jbd2 dm_queue_length sb_edac intel_powerclamp coretemp intel_rapl iTCO_wdt iosf_mbi iTCO_vendor_support kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper ipmi_ssif cryptd lpfc pcspkr hpilo hpwdt lpc_ich nvmet_fc nvmet nvme_fc nvme_fabrics nvme_core scsi_transport_fc ioatdma scsi_tgt sg dca wmi video ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter shpchp dm_multipath binfmt_misc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common
[40977821.031146] mgag200 i2c_algo_bit crc32c_intel drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw drm i2c_core hpsa(OE) scsi_transport_sas be2net(OE) dm_mirror dm_region_hash dm_log dm_mod
[40977821.115854] CPU: 0 PID: 20889 Comm: kworker/u128:3 Kdump: loaded Tainted: P OE ------------ 3.10.0-862.20.2.el7.x86_64 #1
[40977821.174309] Hardware name: HP ProLiant BL460c Gen8, BIOS I31 05/21/2018
[40977821.206715] Call Trace:
[40977821.220179] [<ffffffff893138b4>] dump_stack+0x19/0x1b
[40977821.245871] [<ffffffff88c94618>] __warn+0xd8/0x100
[40977821.270432] [<ffffffff88c9469f>] warn_slowpath_fmt+0x5f/0x80
[40977821.298864] [<ffffffff893243e2>] kprobe_dispatcher+0x2b2/0x2c0
[40977821.328045] [<ffffffff88cc4d25>] ? commit_creds+0x5/0x260
[40977821.355229] [<ffffffff88e27991>] ? do_execve_common.isra.24+0x1/0x6e0
[40977821.387156] [<ffffffff8931fe6a>] kprobe_ftrace_handler+0xba/0x120
[40977821.417577] [<ffffffff88e27995>] ? do_execve_common.isra.24+0x5/0x6e0
[40977821.449505] [<ffffffff88e27990>] ? prepare_bprm_creds+0x80/0x80
[40977821.479175] [<ffffffff88e28088>] ? do_execve+0x18/0x20
[40977821.505255] [<ffffffff88d55674>] ftrace_ops_list_func+0xf4/0x120
[40977821.535112] [<ffffffff89329264>] ftrace_regs_call+0x5/0x81
[40977821.562571] [<ffffffff88e30b4b>] ? getname_kernel+0x2b/0x120
[40977821.590772] [<ffffffff88e27991>] ? do_execve_common.isra.24+0x1/0x6e0
[40977821.622594] [<ffffffff88e27995>] ? do_execve_common.isra.24+0x5/0x6e0
[40977821.654715] [<ffffffff88e28088>] ? do_execve+0x18/0x20
[40977821.680563] [<ffffffff88cb2c2f>] ____call_usermodehelper+0xff/0x140
[40977821.711521] [<ffffffff88cb2c70>] ? ____call_usermodehelper+0x140/0x140
[40977821.743663] [<ffffffff88cb2c8e>] call_helper+0x1e/0x20
[40977821.769753] [<ffffffff893255f7>] ret_from_fork_nospec_begin+0x21/0x21
[40977821.801580] [<ffffffff88cb2c70>] ? ____call_usermodehelper+0x140/0x140
[40977821.833861] ---[ end trace d6646489c5240dd5 ]---
[41000167.514425] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=60002 jiffies, g=5105213368, c=5105213367, q=322)
[41000167.570998] All QSes seen, last rcu_sched kthread activity 60002 (45294504176-45294444174), jiffies_till_next_fqs=3
[41000167.620672] xagt R running task 0 49898 1555 0x1000000a
[41000167.654856] Call Trace:
[41000167.668161] <IRQ> [<ffffffff88cd12a8>] sched_show_task+0xa8/0x110
[41000167.699352] [<ffffffff88d4f4ce>] rcu_check_callbacks+0x72e/0x730
[41000167.731020] [<ffffffff88d04700>] ? tick_sched_do_timer+0x50/0x50
[41000167.761474] [<ffffffff88ca7a56>] update_process_times+0x46/0x80
[41000167.790941] [<ffffffff88d04500>] tick_sched_handle+0x30/0x70
[41000167.819315] [<ffffffff88d04739>] tick_sched_timer+0x39/0x80
[41000167.847314] [<ffffffff88cc21a3>] __hrtimer_run_queues+0xf3/0x270
[41000167.877124] [<ffffffff88cc272f>] hrtimer_interrupt+0xaf/0x1d0
[41000167.905700] [<ffffffff88d86610>] ? event_sched_out.isra.87+0x2e0/0x2e0
[41000167.937732] [<ffffffff88c596cb>] local_apic_timer_interrupt+0x3b/0x60
[41000167.969483] [<ffffffff8932a083>] smp_apic_timer_interrupt+0x43/0x60
[41000168.000556] [<ffffffff893267b2>] apic_timer_interrupt+0x162/0x170
[41000168.030833] <EOI> [<ffffffff88d0aa6e>] ? generic_exec_single+0xfe/0x1b0
[41000168.063755] [<ffffffff88d81910>] ? perf_cgroup_attach+0x60/0x60
[41000168.092927] [<ffffffff88d81910>] ? perf_cgroup_attach+0x60/0x60
[41000168.122070] [<ffffffff88d0ab7f>] smp_call_function_single+0x5f/0xa0
[41000168.153000] [<ffffffff88d80dd3>] cpu_function_call+0x43/0x60
[41000168.180977] [<ffffffff88d80270>] ? perf_event_idx_default+0x10/0x10
[41000168.211602] [<ffffffff88d855b1>] event_function_call+0x101/0x110
[41000168.241100] [<ffffffff88d86610>] ? event_sched_out.isra.87+0x2e0/0x2e0
[41000168.272901] [<ffffffff88d857c5>] perf_remove_from_context+0x25/0x90
[41000168.303536] [<ffffffff88d890cc>] perf_event_release_kernel+0xcc/0x260
[41000168.335042] [<ffffffff88d89270>] perf_release+0x10/0x20
[41000168.360977] [<ffffffff88e216fc>] __fput+0xec/0x260
[41000168.384911] [<ffffffff88e2195e>] ____fput+0xe/0x10
[41000168.408835] [<ffffffff88cbabcb>] task_work_run+0xbb/0xe0
[41000168.434810] [<ffffffff88c9aa81>] do_exit+0x2d1/0xa40
[41000168.459541] [<ffffffff88c9b26f>] do_group_exit+0x3f/0xa0
[41000168.485824] [<ffffffff88cabaee>] get_signal_to_deliver+0x1ce/0x5e0
[41000168.516118] [<ffffffff88c2b527>] do_signal+0x57/0x6e0
[41000168.541373] [<ffffffff89317f27>] ? do_nanosleep+0xa7/0xf0
[41000168.568316] [<ffffffff89323968>] ? perf_trace_buf_prepare+0x88/0xb0
[41000168.598968] [<ffffffff88c38906>] ? perf_trace_sys_exit+0xb6/0xe0
[41000168.628577] [<ffffffff88c2bc22>] do_notify_resume+0x72/0xc0
[41000168.656106] [<ffffffff89325ae4>] int_signal+0x12/0x17
[...]
[41044449.098773] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=44341232 jiffies, g=5105213368, c=5105213367, q=524)
[41044449.158137] All QSes seen, last rcu_sched kthread activity 44341232 (45338785406-45294444174), jiffies_till_next_fqs=3
[41044449.210513] xagt R running task 0 49898 1555 0x1000000a
[41044449.246072] Call Trace:
[41044449.259759] <IRQ> [<ffffffff88cd12a8>] sched_show_task+0xa8/0x110
[41044449.292037] [<ffffffff88d4f4ce>] rcu_check_callbacks+0x72e/0x730
[41044449.322753] [<ffffffff88d04700>] ? tick_sched_do_timer+0x50/0x50
[41044449.353502] [<ffffffff88ca7a56>] update_process_times+0x46/0x80
[41044449.383927] [<ffffffff88d04500>] tick_sched_handle+0x30/0x70
[41044449.413173] [<ffffffff88d04739>] tick_sched_timer+0x39/0x80
[41044449.441352] [<ffffffff88cc21a3>] __hrtimer_run_queues+0xf3/0x270
[41044449.472028] [<ffffffff88cc272f>] hrtimer_interrupt+0xaf/0x1d0
[41044449.501463] [<ffffffff88d86610>] ? event_sched_out.isra.87+0x2e0/0x2e0
[41044449.534824] [<ffffffff88c596cb>] local_apic_timer_interrupt+0x3b/0x60
[41044449.567625] [<ffffffff8932a083>] smp_apic_timer_interrupt+0x43/0x60
[41044449.599606] [<ffffffff893267b2>] apic_timer_interrupt+0x162/0x170
[41044449.630718] <EOI> [<ffffffff88d0aa6e>] ? generic_exec_single+0xfe/0x1b0
[41044449.664645] [<ffffffff88d81910>] ? perf_cgroup_attach+0x60/0x60
[41044449.695054] [<ffffffff88d81910>] ? perf_cgroup_attach+0x60/0x60
[41044449.725623] [<ffffffff88d0ab7f>] smp_call_function_single+0x5f/0xa0
[41044449.757526] [<ffffffff88d80dd3>] cpu_function_call+0x43/0x60
[41044449.787742] [<ffffffff88d80270>] ? perf_event_idx_default+0x10/0x10
[41044449.819746] [<ffffffff88d855b1>] event_function_call+0x101/0x110
[41044449.850189] [<ffffffff88d86610>] ? event_sched_out.isra.87+0x2e0/0x2e0
[41044449.883562] [<ffffffff88d857c5>] perf_remove_from_context+0x25/0x90
[41044449.915555] [<ffffffff88d890cc>] perf_event_release_kernel+0xcc/0x260
[41044449.948368] [<ffffffff88d89270>] perf_release+0x10/0x20
[41044449.975155] [<ffffffff88e216fc>] __fput+0xec/0x260
[41044450.000394] [<ffffffff88e2195e>] ____fput+0xe/0x10
[41044450.025769] [<ffffffff88cbabcb>] task_work_run+0xbb/0xe0
[41044450.053066] [<ffffffff88c9aa81>] do_exit+0x2d1/0xa40
[41044450.078974] [<ffffffff88c9b26f>] do_group_exit+0x3f/0xa0
[41044450.106600] [<ffffffff88cabaee>] get_signal_to_deliver+0x1ce/0x5e0
[41044450.138177] [<ffffffff88c2b527>] do_signal+0x57/0x6e0
[41044450.164183] [<ffffffff89317f27>] ? do_nanosleep+0xa7/0xf0
[41044450.192243] [<ffffffff89323968>] ? perf_trace_buf_prepare+0x88/0xb0
[41044450.224287] [<ffffffff88c38906>] ? perf_trace_sys_exit+0xb6/0xe0
[41044450.254960] [<ffffffff88c2bc22>] do_notify_resume+0x72/0xc0
[41044450.284048] [<ffffffff89325ae4>] int_signal+0x12/0x17
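To gauge how long the stall lasted, the `t=... jiffies` field of each stall message can be converted to seconds. The following is a minimal sketch, not part of the original analysis: the helper name `parse_stalls` is hypothetical, and the tick rate `hz=1000` is an assumption (the two messages above are about 44281.6 seconds apart by timestamp and 44281230 jiffies apart, which is consistent with 1000 jiffies per second). Both messages report the same `g=` value, which suggests a single continuous stall rather than two separate ones.

```python
import re

# Matches the rcu_sched stall header lines as they appear in the log above.
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: rcu_sched detected stalls on CPUs/tasks:"
    r".*?\(detected by (?P<cpu>\d+), t=(?P<jiffies>\d+) jiffies"
)


def parse_stalls(log_text, hz=1000):
    """Return a list of (timestamp_s, detecting_cpu, stall_seconds) tuples.

    hz is the assumed kernel tick rate (jiffies per second); it is not
    stated in the log itself and should be adjusted if the kernel differs.
    """
    stalls = []
    for line in log_text.splitlines():
        m = STALL_RE.search(line)
        if m:
            stalls.append(
                (float(m["ts"]), int(m["cpu"]), int(m["jiffies"]) / hz)
            )
    return stalls


# The two stall header lines copied from the log above.
sample = """\
[41000167.514425] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=60002 jiffies, g=5105213368, c=5105213367, q=322)
[41044449.098773] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=44341232 jiffies, g=5105213368, c=5105213367, q=524)
"""

for ts, cpu, secs in parse_stalls(sample):
    print(f"[{ts:.0f}] detected by CPU {cpu}: "
          f"stalled {secs:.1f} s ({secs / 3600:.2f} h)")
```

Under the HZ=1000 assumption, the first message corresponds to a stall of about one minute and the last to roughly twelve hours. The same function can be run over a full `dmesg` or vmcore-dmesg text file by passing its contents as `log_text`.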
Environment
- Red Hat Enterprise Linux 7
- The 'xagt' process is running as a daemon