Chapter 7. File Systems
7.1. Tuning Considerations for File Systems
7.1.1. Formatting Options
Block size can be selected at mkfs time. The range of valid sizes depends on the system: the upper limit is the maximum page size of the host system, while the lower limit depends on the file system used. The default block size is appropriate for most use cases.
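The limits described above can be checked before formatting. The following is a minimal sketch, not a definitive tool: the function name block_size_ok is hypothetical, the minimum size is passed in by the caller because it depends on the file system (1024 bytes is used for ext4 in the usage note), and the power-of-two check is an added assumption that matches common file systems.

```shell
# block_size_ok SIZE MIN [PAGE]
# Succeeds when SIZE is a power of two between MIN and PAGE (inclusive).
# MIN is the smallest block size of the target file system (supplied by
# the caller); PAGE defaults to the page size of the running host.
block_size_ok() (
    size=$1
    min=$2
    page=${3:-$(getconf PAGESIZE)}
    [ "$size" -ge "$min" ] || exit 1
    [ "$size" -le "$page" ] || exit 1
    # A power of two has exactly one bit set.
    [ $(( size & (size - 1) )) -eq 0 ]
)
```

For example, `block_size_ok 4096 1024 && echo "4096 is usable"` tests a 4 KiB block size against a 1 KiB minimum and the page size of the current host.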
If your system uses striped storage such as RAID 5, you can improve performance by aligning data and metadata with the underlying storage geometry at mkfs time. For software RAID (LVM or MD) and some enterprise hardware storage, this information is queried and set automatically, but in many cases the administrator must specify this geometry manually with mkfs at the command line.
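When the geometry must be given by hand, the values follow from the RAID chunk size and the number of data-bearing disks. The sketch below is illustrative only: the function name raid_geometry and the DEVICE placeholder are hypothetical, and it prints candidate command lines instead of running them. The stride and stripe-width extended options are described in mke2fs(8) and the su and sw options in mkfs.xfs(8); check the man pages on your system for the exact spelling they accept.

```shell
# raid_geometry CHUNK_KIB DATA_DISKS [BLOCK_BYTES]
# CHUNK_KIB   RAID chunk (stripe unit) size in KiB
# DATA_DISKS  number of data-bearing disks (a 5-disk RAID 5 has 4)
# BLOCK_BYTES file system block size, 4096 if omitted
raid_geometry() (
    chunk_kib=$1
    data_disks=$2
    block_bytes=${3:-4096}
    # ext4 expresses the geometry in file system blocks.
    stride=$(( chunk_kib * 1024 / block_bytes ))
    stripe_width=$(( stride * data_disks ))
    echo "mkfs.ext4 -b ${block_bytes} -E stride=${stride},stripe-width=${stripe_width} DEVICE"
    # XFS takes the stripe unit in bytes and the width as a disk count.
    echo "mkfs.xfs -d su=${chunk_kib}k,sw=${data_disks} DEVICE"
)
```

Running `raid_geometry 64 4` for a 64 KiB chunk on four data disks prints an ext4 command with stride=16 and stripe-width=64, and an XFS command with su=64k and sw=4.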
In metadata-intensive workloads, the log section of a journaling file system (such as ext4 or XFS) is updated extremely frequently. To minimize seek time from file system to journal, you can place the journal on dedicated storage. Note, however, that placing the journal on external storage that is slower than the primary file system can nullify any potential advantage associated with using external storage.
External journals are created at mkfs time, with journal devices being specified at mount time. Refer to the mount(8) man page for further information.
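As a rough illustration of that two-step pattern, the sketch below prints the commands typically involved for ext4 and XFS without running them. The function name external_journal_plan, the device arguments, and the MOUNTPOINT placeholder are hypothetical; the options shown come from mke2fs(8), mkfs.xfs(8), and mount(8), and should be verified against the versions installed on your system.

```shell
# external_journal_plan FSTYPE DATA_DEV JOURNAL_DEV
# Prints, but does not run, the commands for an external journal.
external_journal_plan() (
    fstype=$1
    data=$2
    journal=$3
    case $fstype in
        ext4)
            # The journal device must use the same block size as the
            # file system that will use it.
            echo "mke2fs -O journal_dev $journal"
            echo "mkfs.ext4 -J device=$journal $data"
            ;;
        xfs)
            echo "mkfs.xfs -l logdev=$journal $data"
            # XFS needs the log device named again at mount time.
            echo "mount -o logdev=$journal $data MOUNTPOINT"
            ;;
        *)
            echo "unsupported file system: $fstype" >&2
            exit 1
            ;;
    esac
)
```

For example, `external_journal_plan xfs /dev/sdX1 /dev/sdY1` prints the mkfs.xfs and mount commands for a data device and a separate log device.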
7.1.2. Mount Options
A write barrier is a kernel mechanism used to ensure that file system metadata is correctly written and ordered on persistent storage, even when storage devices with volatile write caches lose power. File systems with write barriers enabled also ensure that any data transmitted via fsync() persists across a power outage. Red Hat Enterprise Linux enables barriers by default on all hardware that supports them.
However, write barriers can significantly slow applications that use fsync() heavily, or that create and delete many small files. For storage with no volatile write cache, or in the rare case where file system inconsistencies and data loss after a power loss are acceptable, barriers can be disabled by using the nobarrier mount option. For further information, refer to the Storage Administration Guide.
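Because disabling barriers trades safety for speed, it can be useful to audit which file systems are currently mounted without them. The sketch below is one possible approach under stated assumptions: the function name list_nobarrier is hypothetical, it reads a mount table in /proc/mounts format, and it looks only for the nobarrier and barrier=0 option spellings on ext4 and XFS file systems.

```shell
# list_nobarrier [MOUNT_TABLE]
# Prints the mount point of every ext4 or XFS file system whose mount
# options disable write barriers. MOUNT_TABLE defaults to /proc/mounts.
list_nobarrier() {
    awk '$3 == "ext4" || $3 == "xfs" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "nobarrier" || opts[i] == "barrier=0") {
                print $2
                break
            }
        }
    }' "${1:-/proc/mounts}"
}
```

Running `list_nobarrier` with no argument inspects the live mount table; an empty result means no ext4 or XFS file system was found with barriers disabled by these options.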
Historically, when a file is read, the access time (atime) for that file must be updated in the inode metadata, which involves additional write I/O. If accurate atime metadata is not required, mount the file system with the noatime option to eliminate these metadata updates. In most cases, however, atime updates add little overhead because of the default relative atime (or relatime) behavior in the Red Hat Enterprise Linux 6 kernel. The relatime behavior only updates atime if the previous atime is older than the modification time (mtime) or status change time (ctime).
The noatime option also enables nodiratime behavior; there is no need to set both noatime and nodiratime.
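The relatime rule can be expressed as a simple comparison of timestamps. The sketch below models only the rule as stated in this section; the function name relatime_updates is hypothetical, timestamps are whole seconds since the epoch, and treating equal timestamps as requiring an update is an assumption. The kernel may apply further conditions that are not modeled here.

```shell
# relatime_updates ATIME MTIME CTIME
# Prints "yes" when a read would refresh atime under the relatime rule
# described above: the stored atime is not newer than mtime or ctime.
relatime_updates() {
    if [ "$1" -le "$2" ] || [ "$1" -le "$3" ]; then
        echo yes
    else
        echo no
    fi
}
```

With GNU stat, `relatime_updates $(stat -c '%X %Y %Z' FILE)` feeds the access, modification, and status change times of FILE to the function.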
Read-ahead speeds up file access by pre-fetching data and loading it into the page cache, so that the data is already in memory when it is requested instead of being read from disk. Some workloads, such as those involving heavy streaming of sequential I/O, benefit from high read-ahead values.
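Use the blockdev command to view and edit the read-ahead value. To view the current read-ahead value for a particular block device, run:
# blockdev --getra device
To modify the read-ahead value for that block device, run the following command, where N is the number of 512-byte sectors:
# blockdev --setra N device
Note that a value set with the blockdev command does not persist between boots. We recommend creating a run level init.d script to set this value during boot.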
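Because blockdev counts read-ahead in 512-byte sectors, a boot-time script is easier to read if it works in KiB and converts. The helpers below are a minimal sketch: the function names are hypothetical, and readahead_cmd only prints the command it would run so that the value can be reviewed before it is placed in a boot script.

```shell
# Convert between KiB and the 512-byte sectors that blockdev uses.
ra_kib_to_sectors() { echo $(( $1 * 1024 / 512 )); }
ra_sectors_to_kib() { echo $(( $1 * 512 / 1024 )); }

# readahead_cmd DEVICE KIB
# Prints the blockdev command that sets DEVICE's read-ahead to KIB.
readahead_cmd() {
    echo "blockdev --setra $(ra_kib_to_sectors "$2") $1"
}
```

For example, `readahead_cmd /dev/sdX 4096` prints the command for a 4 MiB read-ahead on the placeholder device /dev/sdX, and `ra_sectors_to_kib "$(blockdev --getra /dev/sdX)"` reports the current setting in KiB.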
7.1.3. File System Maintenance
Batch discard and online discard operations are features of mounted file systems that discard blocks which are not in use by the file system. These operations are useful for both solid-state drives and thinly-provisioned storage.
Batch discard operations are run explicitly by the user with the fstrim command. This command discards all unused blocks in a file system that match the user's criteria. Both operation types are supported for use with the XFS and ext4 file systems in Red Hat Enterprise Linux 6.2 and later as long as the block device underlying the file system supports physical discard operations. Physical discard operations are supported if the value of /sys/block/device/queue/discard_max_bytes is not zero.
Online discard operations are specified at mount time with the -o discard option (either in /etc/fstab or as part of the mount command), and run in real time without user intervention. Online discard operations only discard blocks that are transitioning from used to free. Online discard operations are supported on ext4 file systems in Red Hat Enterprise Linux 6.2 and later, and on XFS file systems in Red Hat Enterprise Linux 6.4 and later.
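Before relying on either discard method, you can test whether the underlying block device reports physical discard support. The following sketch reads the sysfs value mentioned above; the function name discard_supported is hypothetical, and the optional second argument exists only so that the check can be pointed at a different sysfs root.

```shell
# discard_supported DEVICE [SYSFS_ROOT]
# Succeeds when DEVICE (for example "sda") reports a non-zero
# discard_max_bytes value. SYSFS_ROOT defaults to /sys.
discard_supported() (
    dev=$1
    root=${2:-/sys}
    file=$root/block/$dev/queue/discard_max_bytes
    [ -r "$file" ] || exit 1
    [ "$(cat "$file")" -gt 0 ]
)
```

A batch discard could then be guarded with `discard_supported sda && fstrim -v /mountpoint`, where sda and /mountpoint are placeholders for your device and mount point.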
7.1.4. Application Considerations
The ext4, XFS, and GFS2 file systems support efficient space pre-allocation via the fallocate(2) glibc call. In cases where files may otherwise become badly fragmented due to write patterns, leading to poor read performance, space pre-allocation can be a useful technique. Pre-allocation marks disk space as if it has been allocated to a file, without writing any data into that space. Until real data is written to a pre-allocated block, read operations will return zeroes.
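From the shell, the same behavior can be observed with the fallocate(1) utility, which wraps the fallocate(2) call. The sketch below is illustrative: the function names preallocate and is_all_zero are hypothetical, the -n option of cmp assumes GNU diffutils, and fallocate may fail on file systems that do not support pre-allocation.

```shell
# preallocate FILE BYTES
# Reserves BYTES of disk space for FILE without writing any data.
preallocate() {
    fallocate -l "$2" "$1"
}

# is_all_zero FILE
# Succeeds when every byte currently readable from FILE is zero.
is_all_zero() {
    cmp -s -n "$(stat -c %s "$1")" "$1" /dev/zero
}
```

For example, `preallocate /path/to/file 1073741824 && is_all_zero /path/to/file` reserves 1 GiB for a placeholder path and confirms that the reserved space reads back as zeroes.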