XFS corruption on RHEL 6.3 XFS volume
Hello,
I'm using a RHEL 6 based storage solution from Dell called NSS. It's a RHEL high-availability setup with a ~260 TB logical volume formatted as XFS.
The filesystem has become corrupted twice this week and is repairable only with xfs_repair -L. The crash seems to be triggered when one user deletes a large number of files (80 TB): the deletion was in progress during both crashes.
If anyone can point to a configuration issue or a known XFS bug in this release of RHEL, or has any other suggestions, I would greatly appreciate it!
William
Responses
Interesting. The largest XFS volume we supported on RHEL 6.3 was 100 TiB, which increased to 300 TiB on RHEL 6.8 (see note 18 on the limits page). If you're operating within Dell's limits, it seems they have taken it upon themselves to support a larger limit. If not, then you're well into untested and unsupported territory.
There are 222 XFS patches between RHEL 6.3 and RHEL 6.8, and many more which apply to generic VFS. Updating may be helpful if that's possible within the NSS product.
You'll need the Scalable File System paid add-on, which you don't currently have, to access an updated xfsprogs and related packages in RHEL 6. Alternatively, RHEL 7 has a more recent implementation of XFS, supports a larger filesystem size limit (500 TiB), and doesn't require a paid add-on.
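To make the size comparison concrete, here is a minimal Python sketch that checks a volume size against the limits quoted in this thread (100 TiB, 300 TiB, 500 TiB). The names `XFS_LIMIT_TIB`, `within_limit`, and `fs_size_bytes` are made up for illustration, and the 260 TB figure is assumed to be decimal terabytes; the limits page remains the authoritative source.

```python
import os

TIB = 2**40  # one tebibyte in bytes

# Limits as quoted in this thread (assumption: still current; verify on the limits page).
XFS_LIMIT_TIB = {"RHEL 6.3": 100, "RHEL 6.8": 300, "RHEL 7": 500}


def within_limit(size_bytes, release):
    """Return True if size_bytes does not exceed the quoted limit for release."""
    return size_bytes <= XFS_LIMIT_TIB[release] * TIB


def fs_size_bytes(mount_point):
    """Total size of a mounted filesystem, from statvfs (fragment size * block count)."""
    st = os.statvfs(mount_point)
    return st.f_frsize * st.f_blocks


if __name__ == "__main__":
    size = 260 * 10**12  # assumption: "260 TB" means decimal terabytes
    print(f"{size / TIB:.1f} TiB")
    for release in XFS_LIMIT_TIB:
        print(release, "within limit:", within_limit(size, release))
```

On a live system, `fs_size_bytes` could be pointed at the real mount point instead of the hard-coded size. Whether 260 TB is read as decimal or binary, the volume lands above the 6.3 limit and below the 6.8 and RHEL 7 limits.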
Your account has all OEM support entitlements, so Dell provide your support, not Red Hat. Troubleshooting repeat filesystem corruption is a bit beyond the scope of a Discussion. If there's a message which xfs_repair constantly repeats, search the customer portal for that message (scroll up to the very top for the search bar); we may have an existing knowledgebase solution for it.
Failing that, I would strongly recommend contacting Dell to troubleshoot this properly. They should have support engineers trained and experienced in troubleshooting this sort of thing.
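If the xfs_repair output is long, the message it repeats most often can be picked out with a few lines of Python. This is only a rough sketch: the function name `most_repeated` is invented here, and replacing digit runs with `N` is an assumption about how to group messages that differ only in inode or block numbers.

```python
import re
from collections import Counter


def most_repeated(lines, top=3):
    """Return the `top` most frequent non-empty lines, with digit runs collapsed to N."""
    normalized = (re.sub(r"\d+", "N", line.strip()) for line in lines)
    counts = Counter(line for line in normalized if line)
    return counts.most_common(top)
```

For example, with the output saved to a file (the name `xfs_repair.log` is hypothetical), `most_repeated(open("xfs_repair.log"))` returns the candidate messages to paste into the portal search.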
