RHEL6 NFS4 IPV4 mount request to localhost fails

Solution Verified - Updated -

Issue

I need to report a problem we’re having on RHEL6 that is rooted in a historically misspecified automount map entry. Some of our map entries mount a local filesystem at an automount-controlled mountpoint like this:

blah localhost:/local/0/blah

NB, I’m aware this is the wrong way of specifying a bind mount of a local filesystem, i.e. we shouldn’t specify localhost:/path, just :/path. I was concerned that localhost:/path was a Sun automounter convention that needed to keep working under RHEL, as we serve the same maps to both Solaris and RHEL hosts, but it turns out Solaris does support the :/path notation. More on this later.

The RHEL automounter does try to do what’s implied by the above notation, i.e. make a bind mount. What happens when the local filesystem doesn’t exist is what’s causing us a problem: automount next spawns a mount request to an NFS service on localhost, via either IPV6 or IPV4. End-user hosts don’t serve NFS, so the problem lies in how this failed request is handled.

On RHEL5, the mount request contacts the portmapper, which reports there is no program registered for NFS service, and the mount fails straight away.

On RHEL6, if /etc/hosts has an entry for the IPV6 localhost address, an NFS mount from IPV6 localhost is tried first. This fails immediately, as expected, because IPV6 is not enabled and no NFS service is registered on the host. Automount reports the mount failure straight away.
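Whether a given host has such an entry can be checked with a short shell sketch. The function name has_ipv6_localhost_entry is made up for illustration, and the check assumes the IPV6 localhost address is written as ::1 in the first column of an uncommented line:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given hosts file contains an
# uncommented entry whose address field is exactly "::1".
# A commented-out line starts with "#", so its first field never equals "::1".
has_ipv6_localhost_entry() {
    awk '$1 == "::1" { found = 1 } END { exit !found }' "$1"
}

if has_ipv6_localhost_entry /etc/hosts; then
    echo "IPV6 localhost entry present"
else
    echo "IPV6 localhost entry absent or commented out"
fi
```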

On RHEL6, however, if /etc/hosts does not have an entry for the IPV6 localhost address, automount tries the IPV4 localhost address. When the mount is requested using NFSv4, the RPC request seems to get localhost port 0 returned for NFS service, instead of being told there is no program registered for NFS service (if I’m reading this strace correctly):

1657  1416223283.572722 socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3
1657  1416223283.572764 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
1657  1416223283.572811 connect(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
1657  1416223283.572849 getsockname(3, {sa_family=AF_INET, sin_port=htons(40477), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
1657  1416223283.572896 mount("localhost:/local/0/blah", "/mnt/test", "nfs", 0, "vers=4,addr=127.0.0.1,clientaddr"...) = -1 ECONNREFUSED (Connection refused)

The mount continues to pause and retry until it reaches the retry limit, and then fails. Here’s what the bare mount command shows:

[root@acme123]# mount -vvv -t nfs -o nfsvers=4 localhost:/local/0/blah /mnt/test
mount: fstab path: "/etc/fstab"
mount: mtab path:  "/etc/mtab"
mount: lock path:  "/etc/mtab~"
mount: temp path:  "/etc/mtab.tmp"
mount: UID:        0
mount: eUID:       0
mount: spec:  "localhost:/local/0/blah"
mount: node:  "/mnt/test"
mount: types: "nfs"
mount: opts:  "nfsvers=4"
final mount options: 'nfsvers=4'
mount: external mount: argv[0] = "/sbin/mount.nfs"
mount: external mount: argv[1] = "localhost:/local/0/blah"
mount: external mount: argv[2] = "/mnt/test"
mount: external mount: argv[3] = "-v"
mount: external mount: argv[4] = "-o"
mount: external mount: argv[5] = "rw,nfsvers=4"
mount.nfs: timeout set for Mon Nov 17 14:42:17 2014
mount.nfs: trying text-based options 'nfsvers=4,addr=127.0.0.1,clientaddr=127.0.0.1'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'nfsvers=4,addr=127.0.0.1,clientaddr=127.0.0.1'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'nfsvers=4,addr=127.0.0.1,clientaddr=127.0.0.1'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'nfsvers=4,addr=127.0.0.1,clientaddr=127.0.0.1'
mount.nfs: mount(2): Connection refused
mount.nfs: trying text-based options 'nfsvers=4,addr=127.0.0.1,clientaddr=127.0.0.1'
mount.nfs: mount(2): Connection refused

For contrast, when the RHEL6 client asks for an NFSv3 mount from IPV4 localhost, the right thing happens – it’s told there is no program registered for NFS service at localhost.
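The “no program registered” answer can be cross-checked against the portmapper’s own listing. The sketch below filters the output of rpcinfo -p for RPC program number 100003, which is the number assigned to NFS. The function name nfs_registered is hypothetical, and the sketch assumes the program number is the first column of that output:

```shell
#!/bin/sh
# Hypothetical helper: reads `rpcinfo -p` output on stdin and succeeds
# if any row lists RPC program 100003 (NFS) in the first column.
# The header row does not match, because its first field is the word "program".
nfs_registered() {
    awk '$1 == 100003 { found = 1 } END { exit !found }'
}

# Intended use on a client (not run here, as it needs rpcinfo and a portmapper):
#   rpcinfo -p localhost | nfs_registered && echo "NFS registered" || echo "NFS not registered"
```

On an end-user host that serves no NFS, the expectation is that the check reports NFS as not registered, matching what the NFSv3 mount attempt is told.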

So in our scenario, a program references an automount for a location that doesn’t exist on the local filesystem, and hangs for a number of minutes before timing out and failing. The hang is the problem for us: these automounts might be referenced in the normal course of events, and on RHEL6 they no longer fail straight away but instead block the requesting process until the mount times out several minutes later.

We currently have the IPV6 localhost entry commented out in /etc/hosts in our standard RHEL6 build, because it causes the Legato backup client to fail, so we’re not able to reinstate it until we start running a version of Legato that doesn’t choke on it. This part is out of our control.

On automount map syntax, we’ll have difficulty getting our operations group to agree to a wholesale change from localhost:/path to :/path, as there are about 8,000 such entries, and it’s difficult to estimate the risk and impact of doing this, given how long this syntax has worked (albeit by accident) in our environment.
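To help size that change, the rewrite from localhost:/path to :/path could be previewed as a filter over a copy of a map, without touching the live maps. This is a minimal sketch only: the function name rewrite_localhost_map is made up, and it assumes each location in a map entry is preceded by whitespace, as in the entry shown earlier:

```shell
#!/bin/sh
# Hypothetical filter: rewrites "localhost:/path" locations to ":/path".
# Only a "localhost:/" preceded by whitespace is touched, so a hostname
# that merely ends in "localhost" (e.g. "notlocalhost:/x") is left alone.
rewrite_localhost_map() {
    sed 's|\([[:space:]]\)localhost:/|\1:/|g'
}

# Demonstration on the sample entry from this report:
printf 'blah localhost:/local/0/blah\n' | rewrite_localhost_map
# prints: blah :/local/0/blah
```

Comparing the filtered copy against the original with diff would show exactly which of the entries change before anything is committed.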

However, it does look like we’ll be able to make a usable workaround by confining localhost mount requests to NFSv3 using /etc/nfsmount.conf on RHEL6 clients.
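A sketch of what that stanza might look like is below. It follows the per-server section syntax and the Nfsvers key as I understand them from nfsmount.conf(5). Whether the section is matched for the server name exactly as written in our maps (localhost) is an assumption that would need confirming on a test client. The function name print_localhost_v3_stanza is hypothetical; it only prints the stanza and does not modify any file:

```shell
#!/bin/sh
# Hypothetical helper: prints a candidate /etc/nfsmount.conf stanza that
# pins mounts from the server named "localhost" to NFS version 3.
# Assumption: the Server section is matched against the name used in the map.
print_localhost_v3_stanza() {
    cat <<'EOF'
[ Server "localhost" ]
Nfsvers=3
EOF
}

print_localhost_v3_stanza
```

After review, the printed stanza could be appended to /etc/nfsmount.conf on a test client, and the mount to localhost retried to confirm that it now fails straight away.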

So essentially I’m asking for a couple of things here. First, I’d like your help reviewing my analysis: please let me know if I’ve got any of this wrong, or if I’ve missed something crucial. Second, I’d be grateful for your analysis of what’s happening (inside rpcbind?) with the IPV4 NFS4 mount request to localhost, where no NFS service is running, and either some pointers on making rpcbind return something sensible, or potentially opening a Bugzilla, if there is a bug here, as it appears to me. Looking forward to your thoughts.

Environment

Red Hat Enterprise Linux 6.5
