RHEL7.3: Newly created symlinks on NFS share generate IO errors and resolve after a few seconds
Issue
Our systems are freshly upgraded to the kernel 3.10.0-514.2.2.el7.x86_64. on the previous running kernel 3.10.0-327.22.2.el7.x86_64 this problem is not present.
First we create symlinks on a NFS share pointing to another NFS share. After the creation an "ls" in the folder is run and Input/output errors are seen
$ ls -la
ls: cannot read symbolic link file090: Input/output error
ls: cannot read symbolic link file098: Input/output error
ls: cannot read symbolic link file095: Input/output error
ls: cannot read symbolic link file093: Input/output error
ls: cannot read symbolic link file097: Input/output error
ls: cannot read symbolic link file094: Input/output error
...
When doing an "ls" for the second time, these "broken" links are repaired.
We're sporadically seeing readlink() calls on NFS-hosted symlinks fail with EIO. There's nothing wrong with the symlinks as far as we can tell, and they are readable from other NFS clients when this happens. A packet trace taken on the client (attached) reveals that the client is using a stale filehandle and receiving NFS3ERR_STALE from the server. We know the handle is stale because the earlier READDIRPLUS reply supplies a different handle.
We expect the readlink call to succeed, as it does on other clients, and as it does a few seconds later on the affected client after it issues a READLINK using the correct handle.
Environment
- Red Hat Enterprise Linux 7.3 (NFS client)
- kernel after 3.10.0-327*.el7 and prior to 3.10.0-514.16.1.el7
- NFS with symlinks
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.