7.5. Clustering

Clustered storage provides a consistent file system image across all servers in a cluster, allowing servers to read and write to a single, shared file system. This simplifies storage administration by limiting tasks like installing and patching applications to one file system. A cluster-wide file system also eliminates the need for redundant copies of application data, simplifying backup and disaster recovery.
Red Hat's High Availability Add-On provides clustered storage in conjunction with Red Hat Global File System 2 (part of the Resilient Storage Add-On).

7.5.1. Global File System 2

Global File System 2 (GFS2) is a native file system that interfaces directly with the Linux kernel file system interface. It allows multiple computers (nodes) to simultaneously share the same storage device in a cluster. The GFS2 file system is largely self-tuning, but manual tuning is possible. This section outlines the considerations to keep in mind when tuning GFS2 performance manually.
As of Red Hat Enterprise Linux 6.5, GFS2 includes the Orlov block allocator. This allows administrators to spread out block allocations on disk, so that the contents of directories can be placed in proximity to the directories on disk. This generally increases write speed within those directories.
All directories created in the top-level directory of the GFS2 mount point are spaced automatically. To treat another directory as a top-level directory, mark that directory with the T attribute:
chattr +T directory
This ensures that all subdirectories created in the marked directory are spaced on disk.
Red Hat Enterprise Linux 6.4 introduced improvements to file fragmentation management in GFS2. Files created by Red Hat Enterprise Linux 6.3 or earlier were prone to fragmentation if multiple files were written at the same time by more than one process. This fragmentation degraded performance, especially in workloads involving large files. With Red Hat Enterprise Linux 6.4, simultaneous writes result in less file fragmentation and therefore better performance for these workloads.
While there is no defragmentation tool for GFS2 on Red Hat Enterprise Linux, you can defragment individual files by identifying them with the filefrag tool, copying them to temporary files, and renaming the temporary files to replace the originals. (This procedure can also be done in versions prior to 6.4 as long as the writing is done sequentially.)
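The copy-and-rename procedure described above can be sketched as a small shell function. This is a minimal sketch, not a Red Hat tool: the name defrag_by_copy is hypothetical, and the function assumes that no other process or node has the file open while it runs. It calls filefrag only if that tool is installed.

```shell
#!/bin/sh
# defrag_by_copy FILE  (hypothetical helper name, not part of GFS2)
# Rewrite FILE sequentially into a temporary file, then rename the
# temporary file over the original.
defrag_by_copy() {
    src=$1
    [ -f "$src" ] || { echo "not a regular file: $src" >&2; return 1; }

    # Report the current extent layout if filefrag is available.
    # A failure here (for example, an unsupported file system) is ignored.
    if command -v filefrag >/dev/null 2>&1; then
        filefrag "$src" || true
    fi

    # Create the temporary file in the same directory so that the final
    # rename stays within one file system.
    tmp=$(mktemp "$(dirname "$src")/.defrag.XXXXXX") || return 1

    # cp -p preserves mode, ownership, and timestamps; the data is
    # written in a single sequential pass.
    if cp -p "$src" "$tmp" && mv -f "$tmp" "$src"; then
        return 0
    fi

    # Clean up the temporary file if the copy or rename failed.
    rm -f "$tmp"
    return 1
}
```

Running filefrag on the file before and after the call shows whether the extent count actually dropped. Because the rename replaces the original inode, hard links to the old file and any open file handles will continue to refer to the old data.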
Since GFS2 uses a global locking mechanism that potentially requires communication between nodes of a cluster, the best performance will be achieved when your system is designed to avoid file and directory contention between these nodes. Some methods of avoiding contention are to:
  • Pre-allocate files and directories with fallocate where possible, to optimize the allocation process and avoid the need to lock source pages.
  • Minimize the areas of the file system that are shared between multiple nodes to minimize cross-node cache invalidation and improve performance. For example, if multiple nodes mount the same file system, but access different sub-directories, you will likely achieve better performance by moving one subdirectory to a separate file system.
  • Select an optimal resource group size and number. The best choice depends on typical file sizes and available free space on the system, and affects the likelihood that multiple nodes will attempt to use a resource group simultaneously. Too many resource groups can slow block allocation while the allocator searches for free space, while too few resource groups can cause lock contention during deallocation. It is generally best to test multiple configurations to determine which is best for your workload.
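When comparing candidate resource group sizes, it can help to estimate how many resource groups each size would produce. The following is a rough sketch only: rg_count is a hypothetical helper, the 32 MB to 2048 MB range is assumed from the mkfs.gfs2 manual page for the -r option and should be checked against your installed version, and the arithmetic ignores space taken by journals and other metadata.

```shell
#!/bin/sh
# rg_count FS_SIZE_MB RG_SIZE_MB  (hypothetical helper)
# Print an approximate resource group count for a file system of
# FS_SIZE_MB megabytes divided into groups of RG_SIZE_MB megabytes.
rg_count() {
    fs_mb=$1
    rg_mb=$2
    # Assumed limits for the resource group size; verify with mkfs.gfs2(8).
    if [ "$rg_mb" -lt 32 ] || [ "$rg_mb" -gt 2048 ]; then
        echo "resource group size must be between 32 and 2048 MB" >&2
        return 1
    fi
    # Round up: a partial trailing group still counts as a group.
    echo $(( (fs_mb + rg_mb - 1) / rg_mb ))
}

# Compare three candidate sizes for a 1 TiB (1048576 MB) file system.
for rg in 32 256 2048; do
    printf '%5s MB groups -> about %s resource groups\n' \
        "$rg" "$(rg_count 1048576 "$rg")"
done
```

For the 1 TiB example, the estimate ranges from 32768 groups at 32 MB down to 512 groups at 2048 MB, which illustrates how widely the count can vary. The estimate is only a starting point for the testing recommended above.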
However, contention is not the only issue that can affect GFS2 file system performance. Other best practices to improve overall performance are to:
  • Select your storage hardware according to the expected I/O patterns from cluster nodes and the performance requirements of the file system.
  • Use solid-state storage where possible to lower seek time.
  • Create an appropriately sized file system for your workload, and ensure that the file system is never at more than 80% capacity. Smaller file systems have proportionally shorter backup times and require less time and memory for file system checks, but are subject to high fragmentation if they are too small for their workload.
  • Set larger journal sizes for metadata-intensive workloads, or when journaled data is in use. Although this uses more memory, it improves performance because more journaling space is available to store data before a write is necessary.
  • Ensure that clocks on GFS2 nodes are synchronized to avoid issues with networked applications. Red Hat recommends using NTP (Network Time Protocol).
  • Unless file or directory access times are critical to the operation of your application, mount the file system with the noatime and nodiratime mount options.

    Note

    Red Hat strongly recommends the use of the noatime option with GFS2.
  • If you need to use quotas, try to reduce the frequency of quota synchronization transactions or use fuzzy quota synchronization to prevent performance issues arising from constant quota file updates.

    Note

    Fuzzy quota accounting can allow users and groups to slightly exceed their quota limit. To minimize this issue, GFS2 dynamically reduces the synchronization period as a user or group approaches its quota limit.
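The 80% capacity guideline in the list above is straightforward to monitor from a script. The following sketch uses only df and awk; the function names usage_pct, over_limit, and check_capacity are hypothetical, and the mount point passed to them is whatever path your GFS2 file system is mounted on.

```shell
#!/bin/sh
# usage_pct MOUNTPOINT  (hypothetical helper)
# Print the used-capacity percentage reported by df, without the % sign.
usage_pct() {
    # df -P prints one line per file system; column 5 is the capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_limit PCT LIMIT: succeed only when PCT is greater than LIMIT.
over_limit() {
    [ "$1" -gt "$2" ]
}

# check_capacity MOUNTPOINT [LIMIT]
# Warn and return 1 when usage exceeds LIMIT (default 80).
check_capacity() {
    mnt=$1
    limit=${2:-80}
    pct=$(usage_pct "$mnt")
    # Return 2 if df produced no usable value for this path.
    [ -n "$pct" ] || return 2
    if over_limit "$pct" "$limit"; then
        echo "WARNING: $mnt is ${pct}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $mnt is ${pct}% full"
}
```

A call such as check_capacity /mnt/gfs2 (the path here is only an example) could be run periodically on one node, since all nodes see the same file system usage.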
For more detailed information about each aspect of GFS2 performance tuning, refer to the Global File System 2 guide, available from http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/.