Many processes blocked for more than 120 seconds, even the ps command


Hi!
Since upgrading to RHEL 6.3 (Linux version 2.6.32-279.el6.x86_64) I regularly get "blocked for more than 120 seconds" messages, with some processes locked and unresponsive.
It happens randomly on processes that are writing to NFS devices. I thought it might come from an NFS device problem, but recently I saw that even the "ps" command can be blocked (and indeed never returns)!

Is this a known issue? Is there a fix?

Many thanks,
Xavier.

Here is an example:
Feb 19 21:28:33 svr428 kernel: ps D 0000000000000000 0 8914 8896 0x00000080
Feb 19 21:28:33 svr428 kernel: ffff880197c85c50 0000000000000086 ffff8801861480c0 ffffffff8161f700
Feb 19 21:28:33 svr428 kernel: ffff88017f7c0ab8 ffff880197c85bd8 ffffffff811e3a83 ffff8801861480c0
Feb 19 21:28:33 svr428 kernel: ffff8801982c65f8 ffff880197c85fd8 000000000000fb88 ffff8801982c65f8
Feb 19 21:28:33 svr428 kernel: Call Trace:
Feb 19 21:28:33 svr428 kernel: [] ? proc_pident_instantiate+0x93/0xc0
Feb 19 21:28:33 svr428 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Feb 19 21:28:33 svr428 kernel: [] rwsem_down_read_failed+0x26/0x30
Feb 19 21:28:33 svr428 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Feb 19 21:28:33 svr428 kernel: [] ? down_read+0x24/0x30
Feb 19 21:28:33 svr428 kernel: [] __access_remote_vm+0x41/0x1f0
Feb 19 21:28:33 svr428 kernel: [] ? security_inode_permission+0x1f/0x30
Feb 19 21:28:33 svr428 kernel: [] ? nameidata_to_filp+0x54/0x70
Feb 19 21:28:33 svr428 kernel: [] access_process_vm+0x5b/0x80
Feb 19 21:28:33 svr428 kernel: [] proc_pid_cmdline+0x6d/0x120
Feb 19 21:28:33 svr428 kernel: [] ? alloc_pages_current+0xaa/0x110
Feb 19 21:28:33 svr428 kernel: [] proc_info_read+0xad/0xf0
Feb 19 21:28:33 svr428 kernel: [] vfs_read+0xb5/0x1a0
Feb 19 21:28:33 svr428 kernel: [] ? audit_syscall_entry+0x272/0x2a0
Feb 19 21:28:33 svr428 kernel: [] sys_read+0x51/0x90
Feb 19 21:28:33 svr428 kernel: [] system_call_fastpath+0x16/0x1b

Responses

I can only advise you to contact Red Hat support, for these are kernel trace messages.
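When opening the case, it usually helps to attach a dump of the blocked tasks. The commands below are only a rough sketch, assuming a stock RHEL 6 kernel with sysrq enabled (they are generic, not something support has specifically requested here):

# show the current hung-task reporting settings
sysctl kernel.hung_task_timeout_secs kernel.hung_task_warnings

# dump the stack of every uninterruptible (D-state) task to the kernel log,
# then save it for the support case
echo w > /proc/sysrq-trigger
dmesg > /tmp/blocked-tasks.txt

# list D-state processes straight from /proc; this keeps working even when
# "ps -ef" hangs, because it only reads each task's status file
for d in /proc/[0-9]*; do
    awk '$1 == "State:" && $2 == "D" {print FILENAME}' "$d"/status 2>/dev/null
done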

As Ir. Jan Gerrit said, you should probably call Red Hat support.

That being said, the following might help you determine which specific processes are writing to the disk; it may or may not help in your case...

If you'd like to find the specific processes that are writing to disk for a given user, you might be able to try the following.
(You can only do this at a moment when, as you mention, the ps command is not being blocked.)

First - find the name of the user with the process

ps -ef >output.txt

Sort the results - and hopefully the user is not root

sort output.txt > sorted_output.txt

Examine the file and hopefully you can find the name of the user for the process involved;
if so, let's call that user "jack" for the purpose of the script below.
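A quicker way to get an overview of which users own the most processes (a rough shortcut; not strictly needed for the steps above):

# count processes per owning user; the busiest non-root user is often the one to look at
ps -eo user= | sort | uniq -c | sort -rn | head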

[root@yoursystem ~]# echo "this will only work if the process is writing to a disk"
[root@yoursystem ~]# lsof -l -n -P | grep jack | awk '{print $2}' | sort | uniq | perl -ne 'print;chomp;system("ps -ef | grep $_")' | grep jack | awk '{print $8}'| sort | uniq

Good luck.

The use of lsof -l means that the grep for the username later in the chain of commands won't work as intended, because -l causes lsof not to convert UIDs to names. Also, to shorten the pipeline, sort -u provides unique output without having to pass it through uniq separately.

This command should do the same as the last command you posted:

lsof -nPFp -u jack | cut -c2- | xargs ps -o cmd= -p1 | grep -v ^.*/init$
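A rough sketch of how that pipeline fits together (the PIDs below are made up): -Fp makes lsof print one p-prefixed PID field line per process, cut -c2- strips the leading "p", and the literal -p1 keeps PID 1 in the ps pid list so there is always a valid argument; the final grep -v then drops the /sbin/init line again.

# hypothetical output of: lsof -nPFp -u jack
p4321
p4502
# after cut -c2- and xargs, ps ends up being invoked roughly as:
#   ps -o cmd= -p1 4321 4502
# which prints one command line per PID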

Thanks - oddly it seemed to have worked yesterday -- thanks again PixelDrift

I'm on the side of what works. What is odd is that this has worked both in the past and when I tested it yet again yesterday. Appreciate the command you posted.

I'll check and compare both later. Thanks

Hello PixelDrift.NET,
I examined both commands against a qemu process running on one of my systems and discovered they both produced nearly identical results.

While your comment that "-l" inhibits the conversion of user ID numbers to login names is correct, that point is irrelevant in this case and for this use of the command, because it still produced the final process of "/usr/libexec/qemu-kvm".

The difference between the two is that mine also shows "qemu 4179 1 3 20:25 ? 00:02:02" along with all the arguments of the resulting process, coming from column 8 onward of the ps output.

  • Your command is shorter; I like its brevity.

The lsof command you posted with xargs is nice; I did not know xargs could be used with lsof.

  • And this produced the nearly identical report.
  • Here are the two commands (in this case, mine is missing the final "| awk '{print $8}' | sort | uniq").

[root@test1 ~]# lsof -nPFp -u qemu | cut -c2- | xargs ps -o cmd= -p1 | grep -v ^.*/init$
/usr/libexec/qemu-kvm -name virt1 -S -M rhel7.0.0 -enable-kvm -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 03552cab-a4b3-4d9e-8f76-2cf7e662e45c -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/virt1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/vm/virt1.img,if=none,id=drive-virtio-lv_test,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-lv_test,id=virtio-lv_test,bootindex=1 -drive file=/coffee/Downloads/some.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:00:00:00:00,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

lsof -l -n -P | grep qemu | awk '{print $2}' | sort | uniq | perl -ne 'print;chomp;system("ps -ef | grep $_")' | grep qemu
qemu 4179 1 3 20:25 ? 00:02:02 /usr/libexec/qemu-kvm -name virt1 -S -M rhel7.0.0 -enable-kvm -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 03552cab-a4b3-4d9e-8f76-2cf7e662e45c -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/virt1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/vm/virt1.img,if=none,id=drive-virtio-lv_test,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-lv_test,id=virtio-lv_test,bootindex=1 -drive file=/coffee/Downloads/some.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:00:00:00:00:00,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

Now if I run these with a final bit of awk to show only the process name from the output:

lsof -l -n -P | grep qemu | awk '{print $2}' | sort | uniq | perl -ne 'print;chomp;system("ps -ef | grep $_")' | grep qemu | awk '{print $8}'| sort | uniq
/usr/libexec/qemu-kvm

lsof -nPFp -u qemu | cut -c2- | xargs ps -o cmd= -p1 | grep -v ^.*/init$ | grep qemu | awk '{print $1}'
/usr/libexec/qemu-kvm

So the "-l" inclusion is not an issue and the two scripts do produce the same output when taylored a bit.

I appreciate you posting the lsof command using cut and xargs. It is nice to see a different way to produce the same output. I like to compare methods and see how others achieve results, since there are so many ways to do things within Linux.

I'll examine and compare the commands you post in the discussions here in the future; it is good to learn from others and to pick up different ways of building commands.

Thanks
Remmele

In these latest examples you are passing 'qemu' which is also in the process name.

In your original example you passed 'jack', which was just the user's name. lsof with -l doesn't output a username because UIDs aren't converted to usernames (so my guess is it is catching output where 'jack' turns up elsewhere in the lsof output, e.g. in the path).

Here is an example of -l on my system:
[james@build7 ~]$ lsof -l | grep james
bash 21678 502 cwd DIR 253,0 4096 138784 /home/james
lsof 21723 502 cwd DIR 253,0 4096 138784 /home/james
grep 21724 502 cwd DIR 253,0 4096 138784 /home/james
lsof 21725 502 cwd DIR 253,0 4096 138784 /home/james

These processes are only matching because 'james' is elsewhere in the line, not because he is listed as the owner. (502 in this case because the UID hasn't been converted)
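For comparison, selecting by user with -u avoids that class of false match entirely; something along these lines:

# only files opened by processes actually owned by james, no accidental path matches
lsof -n -P -u james

# whereas the grep approach also matches any line where "james" appears in a path
lsof -l -n -P | grep james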

I see your point; I also checked an article that recommends using "lsof -u username", so I could use "lsof -u username" instead.

I recommended to the original poster, Xavier BRUN, that he first find the actual user with the commands I gave for that purpose. In that context, I recommended that Xavier always run the script I mentioned against that user.

I had to use 'qemu' in this case because I wanted to actually produce results.
Xavier hadn't posted the user, so "jack" was intended to be replaced with a known process user, as I had mentioned to Xavier. So the bit about '-l' in this case is irrelevant, because I use it against a known user and recommended as much to Xavier.

I was glad to see that both scripts do indeed produce the same results. I like the brevity of yours.

In any case, the one I used has qemu as the user. Even though my example of "jack" didn't match and was a poor example, the script I originally posted is always meant to be run against a known user, as I mentioned to the OP.

I had mentioned to the OP "First - find the name of the user with the process" and recommended running 'ps -ef', sending the output to a file, and sorting it to find the user.

So that fits with the example I gave, because I said in that context to run 'ps -ef', pipe the output to a sorted file, and examine it for the specific user.

So in my case, with the instructions and use I provided, it was proper (even though 'jack' was not relevant; in retrospect I should have used "qemu").

And when used in the proper context of an actual user (what I had recommended in my first reply to the OP), both methods do work.

I like the brevity of yours and will remember to examine your commands, because you have some useful methods.

Thanks

Back to the OP: what version of NFS are you using?

Has this problem always existed on this server, or is it a new problem that has turned up after an upgrade/change?

What is exporting the NFS to the server? Another Red Hat server (which version)? A commercial NAS? Does it show anything in its error log?

Is there another server connected to the same export? Does it show similar issues?

What is the OS installed on, local disk or SAN storage?
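To gather the client-side details for the questions above, generic commands like these may help (a sketch; nfsserver below is a placeholder for the real hostname):

# NFS version, protocol and mount options actually negotiated for each mount
nfsstat -m

# the same information as the kernel sees it
grep nfs /proc/mounts

# what the server exports and which NFS versions/ports it advertises
showmount -e nfsserver
rpcinfo -p nfsserver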

For a quick response I'd probably follow Jan's suggestion and contact support.

Thanks for your answers,

We have always experienced this kind of unexpected freeze problem (we previously used RHEL 4.7 on our systems), but I am not sure it was the same issue, since those "120 seconds" traces were not logged in RHEL 4.7. And I have the feeling that these problems are more frequent since we moved to RHEL 6.3.
We have this problem on several servers.

We are exporting over NFS v3 (TCP). I don't have the NFS server logs, but I'll try to get them. I'll also get the details of the OS of the NFS server.

To answer the following question from Remmele Hinton: we are not running on a virtual system.

I'll contact support today.

Hello Xavier BRUN

Ir. Jan Gerrit, PixelDrift and I have recommended contacting Red Hat Support. Based on the output you provided, I am curious: are you running this on a virtual system?

If so, how is that system provisioned for memory and disk write throughput? If it deals with quite a lot of data movement, NFS and autofs, and is heavily audited, the system can become bogged down (see the next paragraph)...

If it is a KVM virtual server, here is an example we had: we inherited some KVM servers a while back that dealt with a lot of data, provided from another contract (we later turned them into physical servers), and they had a seriously bad time dealing with NFS and seemingly with autofs lag, and the systems would hang (since resolved). (We typically do not use KVM, and especially not for production.)

As I and others recommended previously, you should certainly call Red Hat Support.
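If it is ever unclear whether a box is virtual, a couple of generic checks (assuming the virt-what and dmidecode packages are installed) usually settle it:

virt-what                               # prints the hypervisor type, or nothing on bare metal
dmidecode -s system-product-name        # often reports "KVM", "VMware Virtual Platform", etc.
grep -c hypervisor /proc/cpuinfo        # non-zero when the CPU hypervisor flag is set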
