How to Choose Your Red Hat Enterprise Linux File System

Choosing the Red Hat Enterprise Linux file system that is appropriate for your application is an important decision due to the large number of options available and the trade-offs involved. This paper describes some of the file systems that ship with Red Hat Enterprise Linux and provides historical background and recommendations on the right file system to suit your application.

Types of File Systems

Red Hat Enterprise Linux supports a variety of file systems. Different types of file systems solve different kinds of problems, and their usage is application specific. At the most general level, file systems available in Red Hat Enterprise Linux can be grouped into the following major categories:

  • Disk or local file system
  • Network or client/server file system
  • Shared storage or shared disk file system
  • Special file systems

Local File Systems Overview

Local file systems are file systems that run on a single, local server and are directly attached to storage. For example, a local file system is the only choice for internal SATA or SAS disks and is used when your server has internal hardware RAID controllers with local drives. Local file systems are also the most common file systems used on SAN-attached storage when the SAN’s exported device is not shared.

The Red Hat Enterprise Linux 4 and Red Hat Enterprise Linux 5 platforms have traditionally provided two main local file systems, Ext2 and Ext3. The newest member of the Ext file system family, Ext4, is fully supported in Red Hat Enterprise Linux 5.6 and all later releases.

All of these file systems are POSIX compliant and are fully compatible with all later Red Hat Enterprise Linux releases. POSIX-compliant file systems provide support for a well-defined set of system calls, such as read(), write(), and lseek(). From the application programmer’s point of view, there are relatively few differences. The most notable differences from a user’s perspective are related to scalability and performance. When making a file system choice, you should consider how large the file system needs to be, what unique features it should have, and how it performs under your workload.
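Because these calls behave the same way on every POSIX-compliant file system, application code rarely needs to know which one is underneath. The following is a minimal sketch, assuming a Python environment on Linux; the function name and the file name demo.dat are hypothetical. It uses the os module's thin wrappers around the underlying write, lseek, and read system calls.

```python
import os
import tempfile


def posix_roundtrip(directory: str) -> bytes:
    """Write a short record, seek back to the start, and read part of it."""
    path = os.path.join(directory, "demo.dat")  # hypothetical file name
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"hello, file system")  # write(2)
        os.lseek(fd, 0, os.SEEK_SET)         # lseek(2): rewind to offset 0
        return os.read(fd, 5)                # read(2): first five bytes
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(posix_roundtrip(tmp))
```

The same code should run unchanged whether the directory lives on Ext3, Ext4, or XFS; only scalability and performance would differ.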

The XFS File System

XFS is a robust and mature 64-bit journaling file system that supports very large files and file systems on a single host. It is the default file system in Red Hat Enterprise Linux 7. Journaling ensures file system integrity after system crashes (for example, due to power outages) by keeping a record of file system operations that can be replayed when the system is restarted and the file system remounted. XFS was originally developed in the early 1990s by SGI and has a long history of running on extremely large servers and storage arrays. XFS supports a wealth of features including the following:

  • Delayed allocation
  • Dynamically allocated inodes
  • B-tree indexing for scalability of free space management
  • Ability to support a large number of concurrent operations
  • Extensive run-time metadata consistency checking
  • Sophisticated metadata read-ahead algorithms
  • Tightly integrated backup and restore utilities
  • Online defragmentation
  • Online file system growing
  • Comprehensive diagnostics capabilities
  • Scalable and fast repair utilities
  • Optimizations for streaming video workloads

While XFS scales to exabytes, Red Hat’s maximum supported XFS file system size is 100TB for Red Hat Enterprise Linux 5, 300TB for Red Hat Enterprise Linux 6, and 500TB for Red Hat Enterprise Linux 7.

Given its long history in environments that require high performance and scalability, it is not surprising that XFS is routinely measured as one of the highest performing file systems on large systems with enterprise workloads. A large system, in this context, is one with a relatively high number of CPUs, multiple HBAs, and connections to external disk arrays. XFS also performs well on smaller systems that have a multi-threaded, parallel I/O workload. It performs relatively poorly for single-threaded, metadata-intensive workloads, such as a workload that creates or deletes large numbers of small files in a single thread.

Finally, you cannot shrink (reduce) XFS file systems in size, so take extra care not to over-allocate storage to an existing file system.
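To make the journaling idea from this section concrete, here is a toy sketch in Python. It is an in-memory illustration only and is not how XFS implements its journal; the class and method names are hypothetical. Each operation is recorded in a journal before it is applied, so an operation that was logged but not applied before a crash can be replayed at remount.

```python
import json


class ToyJournaledStore:
    """In-memory stand-in for a file system with a metadata journal."""

    def __init__(self):
        self.journal = []  # records assumed to survive a crash
        self.state = {}    # main structures, possibly stale after a crash

    def log(self, name, size):
        # Step 1: record the intended operation in the journal first.
        self.journal.append(json.dumps({"name": name, "size": size}))

    def apply(self, record):
        # Step 2: apply the journaled operation to the main structures.
        op = json.loads(record)
        self.state[op["name"]] = op["size"]

    def replay(self):
        # At remount, re-apply every journaled record. Applying a record
        # a second time is harmless because each one just sets a value.
        for record in self.journal:
            self.apply(record)


if __name__ == "__main__":
    store = ToyJournaledStore()
    store.log("a", 1)
    store.apply(store.journal[-1])
    store.log("b", 2)   # simulated crash: logged but never applied
    store.replay()      # recovery brings the state up to date
    print(store.state)
```

A real journal also truncates records once they are safely on disk and protects them against partial writes; the sketch omits both.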

The Ext File System Family

Ext4 File System

Ext4 is the fourth generation of the Ext file system family and is the default file system in Red Hat Enterprise Linux 6. The Ext4 driver can read and write Ext2 and Ext3 file systems, but the Ext4 file system format is not compatible with the Ext2 and Ext3 drivers. In exchange, Ext4 adds several new and improved features that are common to most modern file systems, such as the following:

  • Extent-based metadata
  • Delayed allocation
  • Journal checksumming
  • Large storage support

Extent-based metadata is a more compact and efficient way to track the space used in a file system: instead of recording every block individually, the file system records contiguous ranges of blocks. Together with delayed allocation, this improves file system performance and reduces the space consumed by metadata. Delayed allocation allows the file system to postpone selecting the permanent location for newly written user data until the data is flushed to disk. Because the file system then has much better information about what is being written, it can make larger, more contiguous allocations, which improves performance.
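The space saving from extents can be illustrated with a short Python sketch. It is a conceptual model, not the Ext4 on-disk format, and the function name is hypothetical. It collapses a sorted list of block numbers into (start, length) pairs; the more contiguous the allocation, which is what delayed allocation aims for, the fewer pairs are needed.

```python
def blocks_to_extents(blocks):
    """Collapse a sorted list of block numbers into (start, length) extents."""
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the last extent: extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # A gap in the numbering: start a new extent.
            extents.append((block, 1))
    return extents


if __name__ == "__main__":
    # Six block numbers shrink to two extents.
    print(blocks_to_extents([100, 101, 102, 103, 200, 201]))
    # A fully contiguous 1,000-block file needs a single extent.
    print(len(blocks_to_extents(list(range(1000, 2000)))))
```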

File system repair time (fsck) in Ext4 is much faster than in Ext2 and Ext3. Some file system repairs have demonstrated up to a six-fold increase in performance. Currently, Red Hat’s maximum supported size for Ext4 is 16TB in both Red Hat Enterprise Linux 5 and Red Hat Enterprise Linux 6, and 50TB in Red Hat Enterprise Linux 7.

You can shrink (reduce) Ext4 and Ext3 file systems, so they provide a bit more flexibility in storage allocation.

For detailed information about the size limits of file systems, files, and directories, see the File systems and storage section of the Red Hat Enterprise Linux technology capabilities and limits article.
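The supported maximums quoted in this article can be collected into a small lookup for a quick sanity check during planning. This is a sketch only: the table and function names are hypothetical, the values are the ones stated in this article, and the capabilities and limits article remains the authoritative source.

```python
# Maximum supported file system size in TB, keyed by
# (file system, RHEL major release), as quoted in this article.
SUPPORTED_MAX_TB = {
    ("xfs", 5): 100,
    ("xfs", 6): 300,
    ("xfs", 7): 500,
    ("ext4", 5): 16,
    ("ext4", 6): 16,
    ("ext4", 7): 50,
}


def fits_supported_limit(fs, rhel_major, planned_tb):
    """Return True if planned_tb is within the supported maximum."""
    limit = SUPPORTED_MAX_TB.get((fs.lower(), rhel_major))
    if limit is None:
        raise ValueError(f"no limit recorded for {fs} on RHEL {rhel_major}")
    return planned_tb <= limit


if __name__ == "__main__":
    print(fits_supported_limit("ext4", 6, 20))  # 20TB exceeds the 16TB limit
    print(fits_supported_limit("xfs", 6, 20))   # 20TB is within 300TB
```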

Choosing a Local File System

How should you go about choosing a file system that meets your application requirements? The first step is to understand the target system on which you are going to deploy the file system. The following questions can be used to inform your decision:

  • Do you have a large server?
  • Do you have large storage requirements or have a local, slow SATA drive?
  • What kind of I/O workload do you expect your application to present?
  • What are your throughput and latency requirements?
  • How stable is your server and storage hardware?
  • What is the typical size of your files and data set?
  • If the system fails, how much downtime can you suffer?
  • Do you foresee the need to shrink (reduce) the file system size?

Depending on the answers to the above questions, your choice might be obvious. If both your server and your storage device are large, and there is no need to shrink (reduce) the file system size, XFS is likely to be the best choice. Even with smaller storage arrays, XFS performs very well when the average file sizes are large (for example, hundreds of megabytes in size).

If your existing workload has performed well with Ext3, staying with Ext3 on Red Hat Enterprise Linux 5 or migrating to Ext4 on Red Hat Enterprise Linux 6 or Red Hat Enterprise Linux 7 should provide you and your applications with a very familiar environment. Two key advantages of Ext4 over Ext3 on the same storage include faster file system check and repair times and higher streaming read and write performance on high-speed devices.

Another way to characterize this is that Ext3 and Ext4 tend to perform better on systems that have limited I/O capability, meaning limited bandwidth (< 200MB/s) and up to ~1,000 IOPS. For anything with higher capability, XFS tends to be faster. XFS also consumes about twice the CPU per metadata operation compared to Ext3 and Ext4, so if you have a CPU-bound workload with little concurrency, Ext3 or Ext4 will be faster. In general, Ext3 or Ext4 is better if an application uses a single read/write thread and small files, while XFS shines when an application uses multiple read/write threads and bigger files.
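These rules of thumb can be written down as a first-pass heuristic. The sketch below is an illustration, not an official sizing tool: the function and parameter names are hypothetical, the 200MB/s and 1,000 IOPS thresholds come from the paragraph above, and the 1MB cutoff for a "small" file is an assumption made here for the example. It returns "ext4" wherever the text recommends Ext3 or Ext4.

```python
def suggest_local_fs(bandwidth_mb_s, iops, io_threads, typical_file_mb,
                     may_need_shrink=False):
    """Return a first-pass file system suggestion: 'ext4' or 'xfs'."""
    if may_need_shrink:
        # XFS file systems cannot be shrunk; Ext3 and Ext4 can.
        return "ext4"
    # Thresholds taken from the article's guidance.
    limited_io = bandwidth_mb_s < 200 and iops <= 1000
    # The 1MB "small file" cutoff is an assumption for this sketch.
    single_thread_small = io_threads == 1 and typical_file_mb < 1
    if limited_io or single_thread_small:
        return "ext4"
    return "xfs"


if __name__ == "__main__":
    # A single local disk with one writer and small files.
    print(suggest_local_fs(100, 500, 1, 0.1))
    # A large array with many writers and large files.
    print(suggest_local_fs(2000, 50000, 16, 500))
```

A heuristic like this only narrows the candidates; it does not replace measuring the real workload.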

We recommend that you measure the performance of your specific application on your target server and storage system to make sure you choose the appropriate type of file system.
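As a starting point for such a measurement, the following sketch times a single-threaded create-and-delete pass over many small files, which is the metadata-intensive pattern discussed earlier. The function name and default values are hypothetical, and the result reflects whatever file system backs the directory you pass in, so point it at a mount of the file system under test. It is a rough probe, not a substitute for running your actual application.

```python
import os
import tempfile
import time


def time_small_file_churn(directory, count=200, size=4096):
    """Create then delete `count` files of `size` bytes; return seconds."""
    payload = b"\0" * size
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"f{i:06d}"), "wb") as fh:
            fh.write(payload)
    for i in range(count):
        os.unlink(os.path.join(directory, f"f{i:06d}"))
    return time.perf_counter() - start


if __name__ == "__main__":
    # The temporary directory here is only a placeholder target.
    with tempfile.TemporaryDirectory() as tmp:
        elapsed = time_small_file_churn(tmp)
        print(f"{elapsed:.4f} seconds")
```

Because page caching and background writeback affect the numbers, repeat the run several times and compare like with like across file systems.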

Red Hat Enterprise Linux 6 has new file system capabilities and performance characteristics. Key features that have been introduced in Red Hat Enterprise Linux 6 include support for the SSD “trim” command, support for thinly provisioned storage, and automated detection and alignment of new file systems on many types of storage devices.

Network File Systems

Network file systems, also referred to as client/server file systems, allow client machines to access files that are stored on a shared server. This makes it possible for multiple users on multiple machines to share files and storage resources. Such file systems are built from one or more servers that export a set of file systems to one or more clients. The client nodes do not have access to the underlying block storage, but rather interact with the storage using a protocol that allows for better access control. Historically, these systems have used L2 networking technologies like Gigabit Ethernet to provide reasonably good performance for a set of clients.

The most common client/server file system for Red Hat Enterprise Linux customers is the NFS file system. Red Hat Enterprise Linux provides both an NFS server component that is used to export a local file system over the network and an NFS client that can be used to import these file systems.

Red Hat Enterprise Linux also includes a CIFS client that supports the popular Microsoft SMB file servers for Windows interoperability. To provide Windows clients with a Microsoft SMB service from a Red Hat Enterprise Linux server, Red Hat Enterprise Linux provides the userspace Samba server.

Shared Storage File Systems

Shared storage file systems, sometimes referred to as cluster file systems, give each server in the cluster direct access to a shared block device over a local storage area network (SAN). Like the client/server file systems mentioned above, shared storage file systems work on a set of servers that are all members of a cluster. Unlike NFS, however, no single server provides access to data or metadata to other members: each member of the cluster has direct access to the same storage device (the “shared storage”), and all cluster member nodes access the same set of files.

Cache coherency is paramount in a clustered file system to ensure data consistency and integrity. There must be a single version of all files in a cluster visible to all nodes within a cluster. To prevent members of the cluster from updating the same storage block at the same time and causing data corruption, shared storage file systems use a cluster-wide locking mechanism to arbitrate access to the storage. For example, before creating a new file or writing to a file that is opened on multiple servers, the file system component on the server must obtain the correct lock.
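The arbitration described above can be sketched as a toy lock table. This is a single-process illustration of shared versus exclusive locking and is not the lock manager used by any Red Hat cluster file system; the class name, node names, and resource names are hypothetical. Many nodes may hold a shared (read) lock on a resource at once, while an exclusive (write) lock requires that no other node holds any lock on it.

```python
class ToyLockManager:
    """Toy arbiter for shared and exclusive locks on named resources."""

    def __init__(self):
        self._holders = {}  # resource -> (mode, set of node names)

    def acquire(self, node, resource, mode):
        """Try to take a lock; return True if granted, False if refused."""
        if mode not in ("shared", "exclusive"):
            raise ValueError(f"unknown lock mode: {mode}")
        held = self._holders.get(resource)
        if held is None:
            self._holders[resource] = (mode, {node})
            return True
        held_mode, nodes = held
        if mode == "shared" and held_mode == "shared":
            nodes.add(node)  # readers can share the resource
            return True
        # Any combination involving an exclusive lock conflicts.
        # Lock upgrades and re-entrant requests are not modeled.
        return False

    def release(self, node, resource):
        """Drop a node's lock; free the resource when no holders remain."""
        held = self._holders.get(resource)
        if held is None:
            return
        _, nodes = held
        nodes.discard(node)
        if not nodes:
            del self._holders[resource]


if __name__ == "__main__":
    locks = ToyLockManager()
    print(locks.acquire("node1", "inode:42", "shared"))
    print(locks.acquire("node2", "inode:42", "shared"))
    print(locks.acquire("node3", "inode:42", "exclusive"))
```

In a real cluster each refused request would wait or trigger cache invalidation on the current holders, which is where the locking overhead of shared storage file systems comes from.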

A common requirement for cluster file systems is to support a highly available service, such as an Apache web server. Any member of the cluster will see a fully coherent view of the data stored in its shared disk file system, and all updates will be arbitrated correctly by the locking mechanisms. Performance of a shared disk file system is normally lower than that of a local file system running on the same system because of the cost of the locking overhead. Shared disk file systems perform well with workloads where each node writes almost exclusively to a particular set of files that are not shared with other nodes or where a set of files is to be shared in an almost exclusively read-only manner across a set of nodes. This results in a minimum of cross-node cache invalidation and can maximize performance. Setting up a shared disk file system is complex, and tuning an application to perform well on a shared disk file system can be challenging.

For Red Hat Enterprise Linux customers, Red Hat provides the GFS1 file system in Red Hat Enterprise Linux 4 and Red Hat Enterprise Linux 5, and the GFS2 file system is available in Red Hat Enterprise Linux 5.4 and later major releases including Red Hat Enterprise Linux 6 and 7. GFS1 is not supported in Red Hat Enterprise Linux 6 and later, so customers will have to migrate their data to GFS2 before upgrading. GFS2 comes tightly integrated with the Red Hat Enterprise Linux High Availability Add-On and the Resilient Storage Add-On. Note that Red Hat Enterprise Linux supports both GFS1 and GFS2 on clusters that range in size from 2–16 nodes.

Choosing Between Network and Shared Storage File Systems

NFS-based network file systems are an extremely common and popular choice for environments that provide NFS servers. Note that network file systems can be deployed using very high-performance networking technologies like InfiniBand or 10 Gigabit Ethernet. This means that users should not turn to shared storage file systems just to get raw bandwidth to their storage. If the speed of access is of prime importance, then use NFS to export a local file system like Ext4.

Shared storage file systems are not easy to set up or to maintain, so users should deploy them only when they cannot provide their required availability with either local or network file systems. Additionally, a shared storage file system in a clustered environment helps reduce downtime by eliminating the steps needed for un-mounting and mounting that need to be done during a typical fail-over scenario involving the relocation of a high-availability service. We recommend the use of shared storage file systems primarily for deployments that need to provide high-availability services with minimum downtime and have stringent service-level requirements.

Conclusion

Choosing the Red Hat Enterprise Linux file system that satisfies your specific application needs requires weighing several factors. This document outlines the benefits of various file system options to help users decide on the right file system for their application environments. For additional information about file systems for your IT environment, please contact Red Hat Support.
