Problems cloning SVN repositories on an NFS share hosted on a beegfs file system


Environment

  • RHEL NFS server whose NFS export resides on a beegfs file system
  • beegfs-client 6.18-patch5
  • Issue reported on RHEL 7.6, but issue can occur on RHEL 6, 7 and 8 as well

Issue

  • svn operations fail on an NFS share that is layered on top of a beegfs file system.
  • Several users are experiencing problems cloning SVN repositories on an NFS share. The error is:
    "svn: E000005: Can't move '/mnt/temp/[...]/svn/spamassassin/.svn/tmp/svn-sCkDA6' to '/mnt/temp/[...]/svn/spamassassin/lm/am.utf-8.lm': Input/output error"

Resolution

Upgrade to beegfs-client 6.19.patch1. Contact the vendor of the beegfs file system for further details.

If your NFS export resides on a beegfs file system, the following patch is required. With it, beegfs returns EXDEV instead of EBUSY to nfsd in this scenario, so that nfsd can handle the failure appropriately instead of returning EIO to the application. In the customer-reported case, the calling application was svn, which was working on files over NFS and encountered EIO errors.

The beegfs patch identified for this issue is https://git.beegfs.io/pub/v6-nightly/commit/cf76b5258e00f4cc2b76c6a873766889e9f002d4

Root Cause

For an NFS export that resides on a beegfs file system, this error occurs when rename() is called immediately after close() on a file accessed over NFS. Under some circumstances nfsd does not close files while it processes a client close() request, and instead closes the file asynchronously (sometimes after the close() request has been answered). Applications that close a file and rename it immediately afterwards may therefore hit an EBUSY error from beegfs, which reaches the application as EIO, even though the file is not actually open on any client. Returning EXDEV from beegfs instead of EBUSY allows client applications to fall back from rename() to copy+unlink.
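
The rename() to copy+unlink fallback described above can be sketched in userspace as follows. This is a minimal illustration in Python, not code taken from svn, nfsd or beegfs. The function name move_with_fallback and its rename parameter are hypothetical and exist only for this example.

```python
import errno
import os
import shutil


def move_with_fallback(src, dst, rename=os.rename):
    """Rename src to dst; on EXDEV fall back to copy followed by unlink.

    The `rename` parameter is an illustrative hook (an assumption of this
    sketch) so that the EXDEV path can be exercised without two file systems.
    """
    try:
        rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            # EBUSY, EIO and any other error are passed to the caller unchanged.
            raise
    # EXDEV: copy data and metadata to the destination, then remove the source.
    shutil.copy2(src, dst)
    os.unlink(src)
    return "copied"
```

A caller that only handles EXDEV this way cannot recover from EBUSY or EIO, which is why the error code returned for the rename matters in this scenario.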

Snippet of the condition added by the beegfs patch (commit title: "client: add option to return EXDEV from rename() instead of EBUSY"). The condition is handled in FhgfsOpsInode.c:

....... 
   if (unlikely(retVal == -EBUSY && app->cfg->sysRenameEbusyAsXdev))
   {
      const EntryInfo* fromEntryInfo = FhgfsInode_getEntryInfo(fhgfsFromEntryInode);

      Logger_logFormatted(log, Log_NOTICE, logContext, "Rewriting EBUSY to EXDEV: "
            "%s fromDirID: %s oldName: %s toDirID: %s newName: %s EntryID: %s",
            FhgfsOpsErr_toErrString(renameRes),
            fromDirInfo->entryID, oldName, toDirInfo->entryID, newName, fromEntryInfo->entryID);
      retVal = -EXDEV; 
   }

.....

After applying the beegfs patch, the following configuration setting must be enabled:

# [sysRenameEbusyAsXdev]
# Changes the semantics of rename() to return an EXDEV error if a file could not
# be moved because it is in use (instead of the default EBUSY). Applications and
# tools like mv can handle EXDEV and fall back to copy/unlink for the files.
# This is mostly useful for NFS exports, where files may not be closed by the                 << note
# server until after the last open file handle has been closed by clients. This
# can cause spurious EBUSY errors in clients that close a file and rename it
# immediately afterwards.
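
To confirm that the option is actually switched on, a small check along the following lines may help. This is a sketch under assumptions: it supposes the client configuration uses "key = value" lines with "#" comments, that the option is enabled with a value of true (for example a line reading sysRenameEbusyAsXdev = true), and that the file lives at a path such as /etc/beegfs/beegfs-client.conf. Verify all three against the vendor documentation for your beegfs version. The function name option_enabled is hypothetical.

```python
def option_enabled(conf_text, key="sysRenameEbusyAsXdev"):
    """Return True if `key` is set to true in "key = value" style config text.

    Assumes '#' starts a comment and that the first assignment of `key`
    is the one that counts; both are assumptions of this sketch.
    """
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if name == key:
            return value.lower() == "true"
    return False  # option not present at all
```

Usage would be along the lines of option_enabled(open(path).read()), with path pointing at the client configuration file on the NFS server.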

Diagnostic Steps

Collect an strace of the application. In the example below, svn checkout attempted a rename() call soon after close() on an NFS file residing on a beegfs file system.

The error appears on stdout, stderr or in /var/log/messages:
 "svn: E000005: Can't move '/mnt/temp/alpha/svn/spamassassin/.svn/tmp/svn-sCkDA6' to '/mnt/temp/alpha/svn/spamassassin/lm/am.utlip': Input/output error".

Check the strace output for a close() followed shortly by a failed rename() of the same file:

350690 23118 14:23:08.583932 close(8</scratch/hfgk/svn/spamassassin/.svn/tmp/svn-TfVaGu>) = 0 <0.001014>
350691 23118 14:23:08.584961 close(7</scratch/hfgk/svn/spamassassin/.svn/pristine/74/7407eb26cdbe7afe33e974689af80c252e5c96e2.svn-base>) = 0 <0.000182>
350692 23118 14:23:08.585158 rename("/scratch/hfgk/svn/spamassassin/.svn/tmp/svn-TfVaGu", "/scratch/hfgk/svn/spamassassin/t/data/spam/badmime.txt") = -1 EIO (Input/output error) <0.001059>    << Failed
350693 23118 14:23:08.586280 close(5<socket:[4018340276]>) = 0 <0.000127>
350694 23118 14:23:08.586460 close(6</tmp/svn-6msak3>) = 0 <0.000009>
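
For long traces, the search for this pattern can be automated. The sketch below scans strace output for a rename() that failed after its source path had been closed successfully earlier in the trace. It assumes the trace decodes file descriptors to paths in the fd<path> form seen in the sample above (strace's -y option produces this form) and that lines appear in chronological order. The function name failed_renames_after_close is hypothetical.

```python
import re

# syscall(args) = ret [ERRNO]; leading counters, PIDs and timestamps are ignored.
CALL = re.compile(
    r"(?P<name>[a-z_0-9]+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)"
    r"(?:\s+(?P<err>E[A-Z0-9]+))?"
)
CLOSED_PATH = re.compile(r"^\d+<(?P<path>.+)>$")  # e.g. 8</some/path>
RENAME_SRC = re.compile(r'^"(?P<src>[^"]*)"')     # first quoted argument


def failed_renames_after_close(lines):
    """Return (source_path, errno_name) for failed renames of closed files."""
    closed = set()
    hits = []
    for line in lines:
        m = CALL.search(line)
        if not m:
            continue
        if m["name"] == "close" and m["ret"] == "0":
            p = CLOSED_PATH.match(m["args"])
            if p:
                closed.add(p["path"])  # remember every successfully closed path
        elif m["name"] == "rename" and m["ret"] == "-1":
            s = RENAME_SRC.match(m["args"])
            if s and s["src"] in closed:
                hits.append((s["src"], m["err"]))
    return hits
```

Run against the five sample lines above, this reports the svn-TfVaGu temporary file together with EIO. A hit does not prove this specific defect, but it narrows the trace down to the close()/rename() sequences worth inspecting.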
