What are some examples of GFS & GFS2 workloads that should be avoided?

Environment

  • Red Hat Enterprise Linux 6 or 7 with the Resilient Storage Add On
  • One or more gfs2 filesystems

Issue

  • Need to understand what sorts of GFS and GFS2 deployments may lead to problems
  • Have found examples of "whitelisted" good configs in Red Hat documentation but need to understand what kinds of workloads should be avoided
  • When running ls or other commands that crawl the filesystem and access many files (tar, rsync, etc.), both that command and other commands running at the same time suffer severe performance degradation

Resolution

GFS2 is designed to provide concurrent access to shared storage within a Red Hat High Availability cluster. As a general guideline, any deployment that attempts to use GFS2 for anything other than sharing data between cluster nodes may lead to problems. A non-exhaustive list of poor, unsupported, or dangerous GFS2 workloads includes:

  • Active/Active NFS over GFS2
    • This is an unsupported and dangerous configuration that will result in some combination of outages and file corruption.
    • This would entail exporting one or more GFS2 file systems from multiple nodes at the same time via NFS.
    • This also includes the "floating IP" form of NFS over GFS2, where GFS2 is exported by all nodes at the same time and only an IP address is failed over.
  • Active/Active Samba over GFS2
    • Active/active Samba over GFS2 can be configured with CTDB subject to Red Hat's CTDB support policies.
    • CTDB is supported on RHEL 6.2+ and on RHEL 7.4+.
    • Active/active Samba over GFS2 without CTDB is an unsupported and dangerous configuration that will result in some combination of outages and file corruption.
      • This would entail sharing one or more GFS2 file systems from multiple nodes at the same time via Samba.
      • This also includes the "floating IP" form of Samba over GFS2 where GFS2 is shared by all nodes at the same time and only an IP is failed over.
  • Active/Active database instances such as MySQL, Postgres, or Oracle without RAC
    • This would entail running instances of a database application on multiple nodes and pointing each instance to a shared database data store on GFS2.
    • Unless the database application is explicitly coded to be cluster-aware, it will most likely be unaware that its data store can be modified by another instance.
    • Performance issues, outages, or data corruption will occur.
  • Active/Active applications where application binaries or content are written to or modified on disk
    • This would entail running application binaries from GFS2 and having all nodes change, write to, or modify data stored on GFS2.
    • Workloads like this need to be carefully vetted and tested, and should be designed for use in a clustered environment.
    • Application binaries should be kept locally on the cluster nodes, with GFS2 used only for shared storage.
    • Liberal use of file I/O commands and methods without understanding their locking implications may result in lock contention when run under load.
    • Read-only access shared between nodes, for example in an image farm, may work without issue but should be tested under realistic load.
    • Issues may arise, particularly with regard to GFS2 scalability, if multiple nodes are writing to and creating files in a single shared directory.
  • Any Active/Active workload where the application or process was not designed with clustered filesystem access in mind, that is, with awareness that its files may be locked, read, and modified concurrently from multiple nodes. Examples include:
    • Custom developed applications
    • Third party applications
    • Red Hat supported and distributed applications that are not designed for active/active clustered workloads
  • Workloads in which processes constantly crawl the contents of the GFS2 filesystem, such as recursive ls, find, tar, or rsync runs across large directory trees.
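One commonly documented way to reduce the shared-directory write contention described above is to give each node its own directory under the mount point, so that file creation on one node does not fight the other nodes for the same directory lock. The sketch below illustrates the layout only; the /tmp/gfs2-demo path is a stand-in for a real GFS2 mount point, and the file names are hypothetical:

```shell
#!/bin/sh
# Stand-in for a real GFS2 mount point (hypothetical path for this demo).
GFS2_MNT=/tmp/gfs2-demo

# Each node writes into a directory named after itself, so concurrent
# file creation does not contend for a single shared directory lock.
NODE_DIR="$GFS2_MNT/$(uname -n)"
mkdir -p "$NODE_DIR"

# Application writes land in the node-private directory.
echo "report data" > "$NODE_DIR/report.dat"
```

With this layout, readers on other nodes can still see every node's output under the shared mount, but writers never create files in the same directory at the same time.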

For more information, see Red Hat's documentation on GFS2 performance troubleshooting.
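For crawl-heavy workloads in particular, one mitigation documented for GFS2 is mounting with noatime (and nodiratime) so that reads do not generate inode timestamp writes and the locking traffic that goes with them. A hypothetical /etc/fstab entry, where the device and mount point are placeholders for your own:

```
/dev/clustervg/gfs2lv  /mnt/gfs2  gfs2  noatime,nodiratime  0 0
```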

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

1 Comment

This article is too broad in its scope. We do support active/active applications provided they are designed in such a way as to maximise their performance over GFS/GFS2. Please add a reference to the GFS/GFS2 performance kbase here since that is the info that people need to know in order to understand which applications will work well and which ones will not.

Phrases such as "Any Active/Active work load where the application or process is not built with clustered filesystem access in mind" are not helpful since they don't convey any information about what this actually means.