RHEL6: NFS mounts become unresponsive when NFS server experiences repeated hardware.py python backtrace / abrt
Issue
- RPC data reaches nfsd but appears to stall in svc_authenticate
- NFS server TCP ACKs NFS client requests, then silently drops them, never responding at the NFS layer.
- NFS mounts on NFS clients periodically hang, with the following message visible in /var/log/messages:
kernel: nfs: server hostname not responding, still trying
- On the NFS server, the messages below appear just before each occurrence of the client-side message above, indicating that the NFS server is experiencing repeated instances of https://access.redhat.com/site/solutions/506353
Oct 20 08:25:18 hostname abrt: detected unhandled Python exception in '/usr/share/rhn/up2date_client/hardware.py'
Oct 20 08:25:18 hostname abrtd: New client connected
Oct 20 08:25:18 hostname abrtd: Directory 'pyhook-2013-10-20-08:25:18-10226' creation detected
Oct 20 08:25:18 hostname abrt-server[10333]: Saved Python crash dump of pid 10226 to /var/spool/abrt/pyhook-2013-10-20-08:25:18-10226
Oct 20 08:25:25 hostname abrtd: Sending an email...
Oct 20 08:25:26 hostname abrtd: Email was sent to: root@localhost
Oct 20 08:25:26 hostname abrtd: Duplicate: UUID
Oct 20 08:25:26 hostname abrtd: DUP_OF_DIR: /var/spool/abrt/pyhook-2013-10-20-07:28:52-15636
Oct 20 08:25:26 hostname abrtd: Corrupted or bad directory '/var/spool/abrt/pyhook-2013-10-20-08:22:37-1222', deleting
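The correlation described above (an abrt hardware.py exception on the server shortly before each client-side "not responding" message) can be checked with a small script. The following is only a sketch under stated assumptions: the function names, the 120-second window, and the default year are hypothetical and not part of this solution, and both log files are assumed to use the standard syslog timestamp prefix shown in the excerpt.

```python
from datetime import datetime, timedelta

# Marker strings taken from the log excerpts in this article.
ABRT_MARK = ("detected unhandled Python exception in "
             "'/usr/share/rhn/up2date_client/hardware.py'")
HANG_MARK = "not responding, still trying"


def stamp(line, year):
    """Parse the 15-character syslog prefix ('Oct 20 08:25:18').

    Syslog lines carry no year, so the caller supplies one (assumption).
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def hangs_after_abrt(server_lines, client_lines, year=2013, window_s=120):
    """Return client hang lines logged within window_s seconds after a
    server-side abrt hardware.py exception. window_s is a hypothetical
    default, not a value from this article."""
    crashes = [stamp(l, year) for l in server_lines if ABRT_MARK in l]
    window = timedelta(seconds=window_s)
    hits = []
    for line in client_lines:
        if HANG_MARK not in line:
            continue
        t = stamp(line, year)
        if any(timedelta(0) <= t - c <= window for c in crashes):
            hits.append(line)
    return hits
```

Feeding it the lines of /var/log/messages from the server and from a client returns the client hang messages that closely follow a server-side exception. An empty result suggests the hangs have a different cause or that the chosen window is too narrow.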
Environment
- Red Hat Enterprise Linux 6 (NFS Server)
- rhn-client-tools
  - between rhn-client-tools-1.0.0.1-4.el6 and rhn-client-tools-1.0.0.1-16.el6
- sos (sosreport)
  - from sos-2.2-29.el6 up to, but not including, sos-2.2-47.el6
    - sos-2.2-29.el6 includes the fix for 'Bug 730641 - sosreport does not collect /proc/net details', and introduces the regression
    - sos-2.2-47.el6 fixes the regression in 'Bug 913201 - sosreport can potentially cause an RPC failure when copying RPC channel files'
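To decide whether a given host falls in the affected sos range, the installed package string can be compared against the two boundary versions listed above. The sketch below is a simplification: it is not the real RPM version-comparison algorithm, it only compares the numeric runs in the version and release fields, and the function names are hypothetical. It expects a name-version-release string without an architecture suffix.

```python
import re

# Boundaries taken from the Environment list: the regression is
# introduced in sos-2.2-29.el6 and fixed in sos-2.2-47.el6.
FIRST_AFFECTED = "sos-2.2-29.el6"
FIRST_FIXED = "sos-2.2-47.el6"


def _numbers(text):
    """Extract the numeric runs of a version or release field, in order."""
    return [int(n) for n in re.findall(r"\d+", text)]


def nvr_key(nvr):
    """Build a sortable key from a 'name-version-release' string.

    Simplified ordering (assumption): only digits are compared, which is
    enough for the plain el6 builds named in this article.
    """
    _name, version, release = nvr.rsplit("-", 2)
    return (_numbers(version), _numbers(release))


def sos_is_affected(installed):
    """True if installed is at or after the first affected build and
    before the first fixed build."""
    return nvr_key(FIRST_AFFECTED) <= nvr_key(installed) < nvr_key(FIRST_FIXED)
```

For example, `sos_is_affected("sos-2.2-38.el6")` evaluates to True, while the fixed build `sos-2.2-47.el6` evaluates to False. For an authoritative comparison, the system's own RPM tooling should be preferred over this approximation.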