Managing file systems
Creating, modifying, and administering file systems in Red Hat Enterprise Linux 8
Abstract
Making open source more inclusive
Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright’s message.
Providing feedback on Red Hat documentation
We appreciate your input on our documentation. Please let us know how we could make it better. To do so:
For simple comments on specific passages:
- Make sure you are viewing the documentation in the Multi-page HTML format. In addition, ensure you see the Feedback button in the upper right corner of the document.
- Use your mouse cursor to highlight the part of text that you want to comment on.
- Click the Add Feedback pop-up that appears below the highlighted text.
- Follow the displayed instructions.
For submitting more complex feedback, create a Bugzilla ticket:
- Go to the Bugzilla website.
- As the Component, use Documentation.
- Fill in the Description field with your suggestion for improvement. Include a link to the relevant part(s) of documentation.
- Click Submit Bug.
Chapter 1. Overview of available file systems
Choosing the file system that is appropriate for your application is an important decision due to the large number of options available and the trade-offs involved. This chapter describes some of the file systems that ship with Red Hat Enterprise Linux 8 and provides historical background and recommendations on the right file system to suit your application.
1.1. Types of file systems
Red Hat Enterprise Linux 8 supports a variety of file systems (FS). Different types of file systems solve different kinds of problems, and their usage is application specific. At the most general level, available file systems can be grouped into the following major types:
Table 1.1. Types of file systems and their use cases
Type | File system | Attributes and use cases |
---|---|---|
Disk or local FS | XFS | XFS is the default file system in RHEL. Because it lays out files as extents, it is less vulnerable to fragmentation than ext4. Red Hat recommends deploying XFS as your local file system unless there are specific reasons to do otherwise: for example, compatibility or corner cases around performance. |
ext4 | ext4 has the benefit of longevity in Linux. Therefore, it is supported by almost all Linux applications. In most cases, it rivals XFS on performance. ext4 is commonly used for home directories. | |
Network or client-and-server FS | NFS | Use NFS to share files between multiple systems on the same network. |
SMB | Use SMB for file sharing with Microsoft Windows systems. | |
Shared storage or shared disk FS | GFS2 | GFS2 provides shared write access to members of a compute cluster. The emphasis is on stability and reliability, with a functional experience as close to a local file system as possible. SAS Grid, Tibco MQ, IBM Websphere MQ, and Red Hat Active MQ have been deployed successfully on GFS2. |
Volume-managing FS | Stratis (Technology Preview) | Stratis is a volume manager built on a combination of XFS and LVM. The purpose of Stratis is to emulate capabilities offered by volume-managing file systems like Btrfs and ZFS. It is possible to build this stack manually, but Stratis reduces configuration complexity, implements best practices, and consolidates error information. |
1.2. Local file systems
Local file systems are file systems that run on a single, local server and are directly attached to storage.
For example, a local file system is the only choice for internal SATA or SAS disks, and is used when your server has internal hardware RAID controllers with local drives. Local file systems are also the most common file systems used on SAN attached storage when the device exported on the SAN is not shared.
All local file systems are POSIX-compliant and are fully compatible with all supported Red Hat Enterprise Linux releases. POSIX-compliant file systems provide support for a well-defined set of system calls, such as read(), write(), and seek().
From the application programmer’s point of view, there are relatively few differences between local file systems. The most notable differences from a user’s perspective are related to scalability and performance. When considering a file system choice, consider how large the file system needs to be, what unique features it should have, and how it performs under your workload.
Available local file systems
- XFS
- ext4
1.3. The XFS file system
XFS is a highly scalable, high-performance, robust, and mature 64-bit journaling file system that supports very large files and file systems on a single host. It is the default file system in Red Hat Enterprise Linux 8. XFS was originally developed in the early 1990s by SGI and has a long history of running on extremely large servers and storage arrays.
The features of XFS include:
- Reliability
- Metadata journaling, which ensures file system integrity after a system crash by keeping a record of file system operations that can be replayed when the system is restarted and the file system remounted
- Extensive run-time metadata consistency checking
- Scalable and fast repair utilities
- Quota journaling. This avoids the need for lengthy quota consistency checks after a crash.
- Scalability and performance
- Supported file system size up to 1024 TiB
- Ability to support a large number of concurrent operations
- B-tree indexing for scalability of free space management
- Sophisticated metadata read-ahead algorithms
- Optimizations for streaming video workloads
- Allocation schemes
- Extent-based allocation
- Stripe-aware allocation policies
- Delayed allocation
- Space pre-allocation
- Dynamically allocated inodes
- Other features
- Reflink-based file copies (new in Red Hat Enterprise Linux 8)
- Tightly integrated backup and restore utilities
- Online defragmentation
- Online file system growing
- Comprehensive diagnostics capabilities
- Extended attributes (xattr). This allows the system to associate several additional name/value pairs per file.
- Project or directory quotas. This allows quota restrictions over a directory tree.
- Subsecond timestamps
Performance characteristics
XFS performs well on large systems with enterprise workloads. A large system is one with a relatively high number of CPUs, multiple HBAs, and connections to external disk arrays. XFS also performs well on smaller systems that have a multi-threaded, parallel I/O workload.
XFS performs relatively poorly on single-threaded, metadata-intensive workloads: for example, a workload that creates or deletes large numbers of small files in a single thread.
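As a quick orientation, the following is a minimal sketch of creating an XFS file system and then growing it while mounted; the device and mount point names are placeholders for your own setup. By default, xfs_growfs grows the file system to the size of the underlying device:
# mkfs.xfs /dev/sdb1
# mount /dev/sdb1 /mnt/data
# xfs_growfs /mnt/data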
1.4. The ext4 file system
The ext4 file system is the fourth generation of the ext file system family. It was the default file system in Red Hat Enterprise Linux 6.
The ext4 driver can read and write to ext2 and ext3 file systems, but the ext4 file system format is not compatible with ext2 and ext3 drivers.
ext4 adds several new and improved features, such as:
- Supported file system size up to 50 TiB
- Extent-based metadata
- Delayed allocation
- Journal checksumming
- Large storage support
The extent-based metadata and the delayed allocation features provide a more compact and efficient way to track utilized space in a file system. These features improve file system performance and reduce the space consumed by metadata. Delayed allocation allows the file system to postpone selection of the permanent location for newly written user data until the data is flushed to disk. This enables higher performance since it can allow for larger, more contiguous allocations, allowing the file system to make decisions with much better information.
File system repair time using the fsck utility in ext4 is much faster than in ext2 and ext3. Some file system repairs have demonstrated up to a six-fold increase in performance.
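For comparison, a minimal sketch of creating and then checking an ext4 file system; the device name is a placeholder, and the -f option forces a check even on a clean file system:
# mkfs.ext4 /dev/sdb1
# e2fsck -f /dev/sdb1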
1.5. Comparison of XFS and ext4
XFS is the default file system in RHEL. This section compares the usage and features of XFS and ext4.
- Metadata error behavior
  In ext4, you can configure the behavior when the file system encounters metadata errors. The default behavior is to simply continue the operation. When XFS encounters an unrecoverable metadata error, it shuts down the file system and returns the EFSCORRUPTED error.
- Quotas
  In ext4, you can enable quotas when creating the file system or later on an existing file system. You can then configure the quota enforcement using a mount option.
  XFS quotas are not a remountable option. You must activate quotas on the initial mount.
  Running the quotacheck command on an XFS file system has no effect. The first time you turn on quota accounting, XFS checks quotas automatically.
- File system resize
  XFS has no utility to reduce the size of a file system. You can only increase the size of an XFS file system. In comparison, ext4 supports both extending and reducing the size of a file system. See the example after this list.
- Inode numbers
  The ext4 file system does not support more than 2^32 inodes.
  XFS dynamically allocates inodes. An XFS file system cannot run out of inodes as long as there is free space on the file system.
  Certain applications cannot properly handle inode numbers larger than 2^32 on an XFS file system. These applications might cause the failure of 32-bit stat calls with the EOVERFLOW return value. Inode numbers exceed 2^32 under the following conditions:
  - The file system is larger than 1 TiB with 256-byte inodes.
  - The file system is larger than 2 TiB with 512-byte inodes.
  If your application fails with large inode numbers, mount the XFS file system with the -o inode32 option to enforce inode numbers below 2^32. Note that using inode32 does not affect inodes that are already allocated with 64-bit numbers.
  Important: Do not use the inode32 option unless a specific environment requires it. The inode32 option changes allocation behavior. As a consequence, the ENOSPC error might occur if no space is available to allocate inodes in the lower disk blocks.
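To make the resize difference concrete, the following is a minimal sketch; the mount point and device names are placeholders. xfs_growfs can only grow a mounted XFS file system, while resize2fs can also shrink an unmounted ext4 file system after a forced check:
# xfs_growfs /mnt/xfs-data
# umount /mnt/ext4-data
# e2fsck -f /dev/sdb2
# resize2fs /dev/sdb2 20G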
1.6. Choosing a local file system
To choose a file system that meets your application requirements, you need to understand the target system on which you are going to deploy the file system. You can use the following questions to inform your decision:
- Do you have a large server?
- Do you have large storage requirements or have a local, slow SATA drive?
- What kind of I/O workload do you expect your application to present?
- What are your throughput and latency requirements?
- How stable is your server and storage hardware?
- What is the typical size of your files and data set?
- If the system fails, how much downtime can you suffer?
If both your server and your storage device are large, XFS is the best choice. Even with smaller storage arrays, XFS performs very well when the average file sizes are large (for example, hundreds of megabytes in size).
If your existing workload has performed well with ext4, staying with ext4 should provide you and your applications with a very familiar environment.
The ext4 file system tends to perform better on systems that have limited I/O capability: roughly, at bandwidths below 200 MB/s and up to approximately 1000 IOPS. For anything with higher capability, XFS tends to be faster.
XFS consumes about twice the CPU-per-metadata operation compared to ext4, so if you have a CPU-bound workload with little concurrency, then ext4 will be faster. In general, ext4 is better if an application uses a single read/write thread and small files, while XFS shines when an application uses multiple read/write threads and bigger files.
You cannot shrink an XFS file system. If you need to be able to shrink the file system, consider using ext4, which supports offline shrinking.
In general, Red Hat recommends that you use XFS unless you have a specific use case for ext4. You should also measure the performance of your specific application on your target server and storage system to make sure that you choose the appropriate type of file system.
Table 1.2. Summary of local file system recommendations
Scenario | Recommended file system |
---|---|
No special use case | XFS |
Large server | XFS |
Large storage devices | XFS |
Large files | XFS |
Multi-threaded I/O | XFS |
Single-threaded I/O | ext4 |
Limited I/O capability (under 1000 IOPS) | ext4 |
Limited bandwidth (under 200MB/s) | ext4 |
CPU-bound workload | ext4 |
Support for offline shrinking | ext4 |
1.7. Network file systems
Network file systems, also referred to as client/server file systems, enable client systems to access files that are stored on a shared server. This makes it possible for multiple users on multiple systems to share files and storage resources.
Such file systems are built from one or more servers that export a set of file systems to one or more clients. The client nodes do not have access to the underlying block storage, but rather interact with the storage using a protocol that allows for better access control.
Available network file systems
- The most common client/server file system for RHEL customers is the NFS file system. RHEL provides both an NFS server component to export a local file system over the network and an NFS client to import these file systems.
- RHEL also includes a CIFS client that supports the popular Microsoft SMB file servers for Windows interoperability. The userspace Samba server provides Windows clients with a Microsoft SMB service from a RHEL server.
1.10. Volume-managing file systems
Volume-managing file systems integrate the entire storage stack for the purposes of simplicity and in-stack optimization.
Available volume-managing file systems
Red Hat Enterprise Linux 8 provides the Stratis volume manager as a Technology Preview. Stratis uses XFS for the file system layer and integrates it with LVM, Device Mapper, and other components.
Stratis was first released in Red Hat Enterprise Linux 8.0. It is conceived to fill the gap created when Red Hat deprecated Btrfs. Stratis 1.0 is an intuitive, command line-based volume manager that can perform significant storage management operations while hiding the complexity from the user:
- Volume management
- Pool creation
- Thin storage pools
- Snapshots
- Automated read cache
Stratis offers powerful features, but currently lacks certain capabilities of other offerings that it might be compared to, such as Btrfs or ZFS. Most notably, it does not support CRCs with self-healing.
Chapter 2. Managing local storage using RHEL System Roles
To manage LVM and local file systems (FS) using Ansible, you can use the storage role, which is one of the RHEL System Roles available in RHEL 8.
Using the storage role enables you to automate administration of file systems on disks and logical volumes on multiple machines and across all versions of RHEL starting with RHEL 7.7.
For more information on RHEL System Roles and how to apply them, see Introduction to RHEL System Roles.
2.1. Introduction to the storage role
The storage role can manage:
- File systems on disks which have not been partitioned
- Complete LVM volume groups including their logical volumes and file systems
With the storage role you can perform the following tasks:
- Create a file system
- Remove a file system
- Mount a file system
- Unmount a file system
- Create LVM volume groups
- Remove LVM volume groups
- Create logical volumes
- Remove logical volumes
- Create RAID volumes
- Remove RAID volumes
- Create LVM pools with RAID
- Remove LVM pools with RAID
2.2. Parameters that identify a storage device in the storage system role
Your storage role configuration affects only the file systems, volumes, and pools that you list in the following variables.
storage_volumes
List of file systems on all unpartitioned disks to be managed.
Partitions are currently unsupported.
storage_pools
List of pools to be managed.
Currently the only supported pool type is LVM. With LVM, pools represent volume groups (VGs). Under each pool there is a list of volumes to be managed by the role. With LVM, each volume corresponds to a logical volume (LV) with a file system.
2.3. Example Ansible playbook to create an XFS file system on a block device
This section provides an example Ansible playbook. This playbook applies the storage role to create an XFS file system on a block device using the default parameters.
The storage role can create a file system only on an unpartitioned, whole disk or a logical volume (LV). It cannot create the file system on a partition.
Example 2.1. A playbook that creates XFS on /dev/sdb
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: xfs
  roles:
    - rhel-system-roles.storage
- The volume name (barefs in the example) is currently arbitrary. The storage role identifies the volume by the disk device listed under the disks: attribute.
- You can omit the fs_type: xfs line because XFS is the default file system in RHEL 8.
- To create the file system on an LV, provide the LVM setup under the disks: attribute, including the enclosing volume group. For details, see Example Ansible playbook to manage logical volumes. Do not provide the path to the LV device.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.4. Example Ansible playbook to persistently mount a file system
This section provides an example Ansible playbook. This playbook applies the storage role to immediately and persistently mount an XFS file system.
Example 2.2. A playbook that mounts a file system on /dev/sdb to /mnt/data
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: xfs
        mount_point: /mnt/data
  roles:
    - rhel-system-roles.storage
- This playbook adds the file system to the /etc/fstab file, and mounts the file system immediately.
- If the file system on the /dev/sdb device or the mount point directory do not exist, the playbook creates them.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.5. Example Ansible playbook to manage logical volumes
This section provides an example Ansible playbook. This playbook applies the storage role to create an LVM logical volume in a volume group.
Example 2.3. A playbook that creates a mylv logical volume in the myvg volume group
- hosts: all
  vars:
    storage_pools:
      - name: myvg
        disks:
          - sda
          - sdb
          - sdc
        volumes:
          - name: mylv
            size: 2G
            fs_type: ext4
            mount_point: /mnt
  roles:
    - rhel-system-roles.storage
The myvg volume group consists of the following disks:
- /dev/sda
- /dev/sdb
- /dev/sdc
- If the myvg volume group already exists, the playbook adds the logical volume to the volume group.
- If the myvg volume group does not exist, the playbook creates it.
- The playbook creates an Ext4 file system on the mylv logical volume, and persistently mounts the file system at /mnt.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.6. Example Ansible playbook to enable online block discard
This section provides an example Ansible playbook. This playbook applies the storage role to mount an XFS file system with online block discard enabled.
Example 2.4. A playbook that enables online block discard on /mnt/data/
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: xfs
        mount_point: /mnt/data
        mount_options: discard
  roles:
    - rhel-system-roles.storage
Additional resources
- This playbook also performs all the operations of the persistent mount example described in Section 2.4, “Example Ansible playbook to persistently mount a file system”.
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.7. Example Ansible playbook to create and mount an Ext4 file system
This section provides an example Ansible playbook. This playbook applies the storage role to create and mount an Ext4 file system.
Example 2.5. A playbook that creates Ext4 on /dev/sdb and mounts it at /mnt/data
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: ext4
        fs_label: label-name
        mount_point: /mnt/data
  roles:
    - rhel-system-roles.storage
- The playbook creates the file system on the /dev/sdb disk.
- The playbook persistently mounts the file system at the /mnt/data directory.
- The label of the file system is label-name.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.8. Example Ansible playbook to create and mount an ext3 file system
This section provides an example Ansible playbook. This playbook applies the storage role to create and mount an Ext3 file system.
Example 2.6. A playbook that creates Ext3 on /dev/sdb and mounts it at /mnt/data
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: ext3
        fs_label: label-name
        mount_point: /mnt/data
  roles:
    - rhel-system-roles.storage
- The playbook creates the file system on the /dev/sdb disk.
- The playbook persistently mounts the file system at the /mnt/data directory.
- The label of the file system is label-name.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.9. Configuring a RAID volume using the storage system role
With the storage System Role, you can configure a RAID volume on RHEL using Red Hat Ansible Automation Platform. In this section you will learn how to set up an Ansible playbook with the available parameters to configure a RAID volume to suit your requirements.
Prerequisites
- You have Red Hat Ansible Engine installed on the system from which you want to run the playbook.
  Note: You do not have to have Red Hat Ansible Automation Platform installed on the systems on which you want to deploy the storage solution.
- You have the rhel-system-roles package installed on the system from which you want to run the playbook.
- You have an inventory file detailing the systems on which you want to deploy a RAID volume using the storage System Role.
Procedure
Create a new playbook.yml file with the following content:
- hosts: all
  vars:
    storage_safe_mode: false
    storage_volumes:
      - name: data
        type: raid
        disks: [sdd, sde, sdf, sdg]
        raid_level: raid0
        raid_chunk_size: 32 KiB
        mount_point: /mnt/data
        state: present
  roles:
    - name: rhel-system-roles.storage
Warning: Device names can change in certain circumstances; for example, when you add a new disk to a system. Therefore, to prevent data loss, we do not recommend using specific disk names in the playbook.
Optional. Verify playbook syntax:
# ansible-playbook --syntax-check playbook.yml
Run the playbook on your inventory file:
# ansible-playbook -i inventory.file /path/to/file/playbook.yml
Additional resources
- For more information about RAID, see Managing RAID.
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.10. Configuring an LVM pool with RAID using the storage system role
With the storage System Role, you can configure an LVM pool with RAID on RHEL using Red Hat Ansible Automation Platform. In this section you will learn how to set up an Ansible playbook with the available parameters to configure an LVM pool with RAID.
Prerequisites
- You have Red Hat Ansible Engine installed on the system from which you want to run the playbook.
  Note: You do not have to have Red Hat Ansible Automation Platform installed on the systems on which you want to deploy the storage solution.
- You have the rhel-system-roles package installed on the system from which you want to run the playbook.
- You have an inventory file detailing the systems on which you want to configure an LVM pool with RAID using the storage System Role.
Procedure
Create a new playbook.yml file with the following content:
- hosts: all
  vars:
    storage_safe_mode: false
    storage_pools:
      - name: my_pool
        type: lvm
        disks: [sdh, sdi]
        raid_level: raid1
        volumes:
          - name: my_pool
            size: "1 GiB"
            mount_point: "/mnt/app/shared"
            fs_type: xfs
            state: present
  roles:
    - name: rhel-system-roles.storage
Note: To create an LVM pool with RAID, you must specify the RAID type using the raid_level parameter.
Optional. Verify playbook syntax:
# ansible-playbook --syntax-check playbook.yml
Run the playbook on your inventory file:
# ansible-playbook -i inventory.file /path/to/file/playbook.yml
Additional resources
- For more information about RAID, see Managing RAID.
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
2.11. Creating a LUKS encrypted volume using the storage role
You can use the storage role to create and configure a volume encrypted with LUKS by running an Ansible playbook.
Prerequisites
- You have Red Hat Ansible Engine installed on the system from which you want to run the playbook.
  Note: You do not have to have Red Hat Ansible Automation Platform installed on the systems on which you want to create the volume.
- You have the rhel-system-roles package installed on the Ansible controller.
- You have an inventory file detailing the systems on which you want to deploy a LUKS encrypted volume using the storage System Role.
Procedure
Create a new playbook.yml file with the following content:
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: xfs
        fs_label: label-name
        mount_point: /mnt/data
        encryption: true
        encryption_password: your-password
  roles:
    - rhel-system-roles.storage
Optional. Verify playbook syntax:
# ansible-playbook --syntax-check playbook.yml
Run the playbook on your inventory file:
# ansible-playbook -i inventory.file /path/to/file/playbook.yml
Additional resources
- For more information about LUKS, see Encrypting block devices using LUKS.
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
Additional resources
For more information, install the rhel-system-roles package and see the following directories:
- /usr/share/doc/rhel-system-roles/storage/
- /usr/share/ansible/roles/rhel-system-roles.storage/
Chapter 5. Securing NFS
To minimize NFS security risks and protect data on the server, consider the following sections when exporting NFS file systems on a server or mounting them on a client.
5.1. NFS security with AUTH_SYS and export controls
NFS provides the following traditional options in order to control access to exported files:
- The server restricts which hosts are allowed to mount which file systems either by IP address or by host name.
- The server enforces file system permissions for users on NFS clients in the same way it does for local users. Traditionally, NFS does this using the AUTH_SYS call message (also called AUTH_UNIX), which relies on the client to state the UID and GIDs of the user. Be aware that this means that a malicious or misconfigured client might easily get this wrong and allow a user access to files that it should not.
To limit the potential risks, administrators often limit the access to read-only or squash user permissions to a common user and group ID. Unfortunately, these solutions prevent the NFS share from being used in the way it was originally intended.
Additionally, if an attacker gains control of the DNS server used by the system exporting the NFS file system, they can point the system associated with a particular hostname or fully qualified domain name to an unauthorized machine. At this point, the unauthorized machine is the system permitted to mount the NFS share, because no username or password information is exchanged to provide additional security for the NFS mount.
Wildcards should be used sparingly when exporting directories through NFS, as it is possible for the scope of the wildcard to encompass more systems than intended.
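As an illustration of these controls, a hypothetical /etc/exports entry that restricts an export to a single named host, makes it read-only, and keeps root squashing enabled might look like the following; the path and host name are placeholders:
/srv/share client.example.com(ro,root_squash,sync)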
Additional resources
- To secure NFS and rpcbind, use, for example, nftables and firewalld. For details about configuring these frameworks, see the nft(8) and firewall-cmd(1) man pages.
5.2. NFS security with AUTH_GSS
All versions of NFS support RPCSEC_GSS and the Kerberos mechanism.
Unlike AUTH_SYS, with the RPCSEC_GSS Kerberos mechanism, the server does not depend on the client to correctly represent which user is accessing the file. Instead, cryptography is used to authenticate users to the server, which prevents a malicious client from impersonating a user without having that user’s Kerberos credentials. Using the RPCSEC_GSS Kerberos mechanism is the most straightforward way to secure mounts because after configuring Kerberos, no additional setup is needed.
5.3. Configuring an NFS server and client to use Kerberos
Kerberos is a network authentication system that allows clients and servers to authenticate to each other by using symmetric encryption and a trusted third party, the KDC. Red Hat recommends using Identity Management (IdM) for setting up Kerberos.
Prerequisites
- The Kerberos Key Distribution Centre (KDC) is installed and configured.
Procedure
- Create the nfs/hostname.domain@REALM principal on the NFS server side.
- Create the host/hostname.domain@REALM principal on both the server and the client side.
- Add the corresponding keys to keytabs for the client and server. For an illustration, see the sketch after this procedure.
- On the server side, use the sec= option to enable the wanted security flavors. To enable all security flavors as well as non-cryptographic mounts:
  /export *(sec=sys:krb5:krb5i:krb5p)
  Valid security flavors to use with the sec= option are:
  - sys: no cryptographic protection, the default
  - krb5: authentication only
  - krb5i: integrity protection
  - krb5p: privacy protection
- On the client side, add sec=krb5 (or sec=krb5i, or sec=krb5p, depending on the setup) to the mount options:
  # mount -o sec=krb5 server:/export /mnt
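A minimal sketch of creating a principal and retrieving its key in an Identity Management (IdM) environment follows; the host names and keytab path are placeholders, and deployments that do not use IdM typically perform the equivalent steps with kadmin:
# ipa service-add nfs/nfs-server.example.com
# ipa-getkeytab -s idm-server.example.com -p nfs/nfs-server.example.com -k /etc/krb5.keytab
Run the ipa-getkeytab command on the NFS server so that the key lands in that machine's /etc/krb5.keytab, and repeat the pattern for the host/ principals on both the server and the client.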
Additional resources
- If you need to write files as root on the Kerberos-secured NFS share and keep root ownership on these files, see https://access.redhat.com/articles/4040141. Note that this configuration is not recommended.
- For more information on NFS configuration, see the exports(5) and nfs(5) man pages.
5.4. NFSv4 security options
NFSv4 includes ACL support based on the Microsoft Windows NT model, not the POSIX model, because of the Microsoft Windows NT model’s features and wide deployment.
Another important security feature of NFSv4 is the removal of the use of the MOUNT
protocol for mounting file systems. The MOUNT
protocol presented a security risk because of the way the protocol processed file handles.
5.5. File permissions on mounted NFS exports
Once the NFS file system is mounted as either read or read and write by a remote host, the only protection each shared file has is its permissions. If two users that share the same user ID value mount the same NFS file system on different client systems, they can modify each others' files. Additionally, anyone logged in as root on the client system can use the su - command to access any files within the NFS share.
By default, access control lists (ACLs) are supported by NFS under Red Hat Enterprise Linux. Red Hat recommends keeping this feature enabled.
By default, NFS uses root squashing when exporting a file system. This sets the user ID of anyone accessing the NFS share as the root user on their local machine to nobody. Root squashing is controlled by the default option root_squash; for more information about this option, see Section 4.6, “NFS server configuration”.
When exporting an NFS share as read-only, consider using the all_squash option. This option makes every user accessing the exported file system take the user ID of the nobody user.
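For example, a hypothetical read-only export that maps every accessing user to the nobody account could be written in /etc/exports as follows; the path and client specification are placeholders:
/srv/public *(ro,all_squash,sync)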
Chapter 6. Enabling pNFS SCSI layouts in NFS
You can configure the NFS server and client to use the pNFS SCSI layout for accessing data. pNFS SCSI is beneficial in use cases that involve longer-duration single-client access to a file.
Prerequisites
- Both the client and the server must be able to send SCSI commands to the same block device. That is, the block device must be on a shared SCSI bus.
- The block device must contain an XFS file system.
- The SCSI device must support SCSI Persistent Reservations as described in the SCSI-3 Primary Commands specification.
6.1. The pNFS technology
The pNFS architecture improves the scalability of NFS. When a server implements pNFS, the client is able to access data through multiple servers concurrently. This can lead to performance improvements.
pNFS supports the following storage protocols or layouts on RHEL:
- Files
- Flexfiles
- SCSI
6.2. pNFS SCSI layouts
The SCSI layout builds on the work of pNFS block layouts. The layout is defined across SCSI devices. It contains a sequential series of fixed-size blocks as logical units (LUs) that must be capable of supporting SCSI persistent reservations. The LU devices are identified by their SCSI device identification.
pNFS SCSI performs well in use cases that involve longer-duration single-client access to a file. An example might be a mail server or a virtual machine housing a cluster.
Operations between the client and the server
When an NFS client reads from a file or writes to it, the client performs a LAYOUTGET operation. The server responds with the location of the file on the SCSI device. The client might need to perform an additional GETDEVICEINFO operation to determine which SCSI device to use. If these operations work correctly, the client can issue I/O requests directly to the SCSI device instead of sending READ and WRITE operations to the server.
Errors or contention between clients might cause the server to recall layouts or not issue them to the clients. In those cases, the clients fall back to issuing READ and WRITE operations to the server instead of sending I/O requests directly to the SCSI device.
To monitor the operations, see Section 6.7, “Monitoring pNFS SCSI layouts functionality”.
Device reservations
pNFS SCSI handles fencing through the assignment of reservations. Before the server issues layouts to clients, it reserves the SCSI device to ensure that only registered clients may access the device. If a client can issue commands to that SCSI device but is not registered with the device, many operations from the client on that device fail. For example, the blkid command on the client fails to show the UUID of the XFS file system if the server has not given a layout for that device to the client.
The server does not remove its own persistent reservation. This protects the data within the file system on the device across restarts of clients and servers. In order to repurpose the SCSI device, you might need to manually remove the persistent reservation on the NFS server.
6.3. Checking for a SCSI device compatible with pNFS
This procedure checks if a SCSI device supports the pNFS SCSI layout.
Prerequisites
Install the sg3_utils package:
# yum install sg3_utils
Procedure
On both the server and client, check for the proper SCSI device support:
# sg_persist --in --report-capabilities --verbose path-to-scsi-device
Ensure that the Persist Through Power Loss Active (PTPL_A) bit is set.
Example 6.1. A SCSI device that supports pNFS SCSI
The following is an example of sg_persist output for a SCSI device that supports pNFS SCSI. The PTPL_A bit reports 1.
inquiry cdb: 12 00 00 00 24 00
Persistent Reservation In cmd: 5e 02 00 00 00 00 00 20 00 00
  LIO-ORG   block11           4.0
  Peripheral device type: disk
Report capabilities response:
  Compatible Reservation Handling(CRH): 1
  Specify Initiator Ports Capable(SIP_C): 1
  All Target Ports Capable(ATP_C): 1
  Persist Through Power Loss Capable(PTPL_C): 1
  Type Mask Valid(TMV): 1
  Allow Commands: 1
  Persist Through Power Loss Active(PTPL_A): 1
    Support indicated in Type mask:
      Write Exclusive, all registrants: 1
      Exclusive Access, registrants only: 1
      Write Exclusive, registrants only: 1
      Exclusive Access: 1
      Write Exclusive: 1
      Exclusive Access, all registrants: 1
Additional resources
- The sg_persist(8) man page
6.4. Setting up pNFS SCSI on the server
This procedure configures an NFS server to export a pNFS SCSI layout.
Procedure
- On the server, mount the XFS file system created on the SCSI device.
Configure the NFS server to export NFS version 4.1 or higher. Set the following option in the [nfsd] section of the /etc/nfs.conf file:
[nfsd]
vers4.1=y
Configure the NFS server to export the XFS file system over NFS with the pnfs option:
Example 6.2. An entry in /etc/exports to export pNFS SCSI
The following entry in the /etc/exports configuration file exports the file system mounted at /exported/directory/ to the allowed.example.com client as a pNFS SCSI layout:
/exported/directory allowed.example.com(pnfs)
Additional resources
- For more information on configuring an NFS server, see Chapter 4, Exporting NFS shares.
6.5. Setting up pNFS SCSI on the client
This procedure configures an NFS client to mount a pNFS SCSI layout.
Prerequisites
- The NFS server is configured to export an XFS file system over pNFS SCSI. See Section 6.4, “Setting up pNFS SCSI on the server”.
Procedure
On the client, mount the exported XFS file system using NFS version 4.1 or higher:
# mount -t nfs -o nfsvers=4.1 host:/remote/export /local/directory
Do not mount the XFS file system directly without NFS.
Additional resources
- For more information on mounting NFS shares, see Chapter 3, Mounting NFS shares.
6.6. Releasing the pNFS SCSI reservation on the server
This procedure releases the persistent reservation that an NFS server holds on a SCSI device. This enables you to repurpose the SCSI device when you no longer need to export pNFS SCSI.
You must remove the reservation from the server. It cannot be removed from a different IT Nexus.
Prerequisites
Install the sg3_utils package:
# yum install sg3_utils
Procedure
Query an existing reservation on the server:
# sg_persist --read-reservation path-to-scsi-device
Example 6.3. Querying a reservation on /dev/sda
# sg_persist --read-reservation /dev/sda

  LIO-ORG   block_1           4.0
  Peripheral device type: disk
  PR generation=0x8, Reservation follows:
    Key=0x100000000000000
    scope: LU_SCOPE,  type: Exclusive Access, registrants only
Remove the existing registration on the server:
# sg_persist --out \
             --release \
             --param-rk=reservation-key \
             --prout-type=6 \
             path-to-scsi-device
Example 6.4. Removing a reservation on /dev/sda
# sg_persist --out \
             --release \
             --param-rk=0x100000000000000 \
             --prout-type=6 \
             /dev/sda

  LIO-ORG   block_1           4.0
  Peripheral device type: disk
Additional resources
- The sg_persist(8) man page
6.7. Monitoring pNFS SCSI layouts functionality
You can monitor if the pNFS client and server exchange proper pNFS SCSI operations or if they fall back on regular NFS operations.
Prerequisites
- A pNFS SCSI client and server are configured.
6.7.1. Checking pNFS SCSI operations from the server using nfsstat
This procedure uses the nfsstat utility to monitor pNFS SCSI operations from the server.
Procedure
Monitor the operations serviced from the server:
# watch --differences \
    "nfsstat --server | egrep --after-context=1 read\|write\|layout"

Every 2.0s: nfsstat --server | egrep --after-context=1 read\|write\|layout

putrootfh    read         readdir      readlink     remove       rename
2         0% 0         0% 1         0% 0         0% 0         0% 0         0%
--
setcltidconf verify       write        rellockowner bc_ctl       bind_conn
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
--
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence
0         0% 29        1% 49        1% 5         0% 0         0% 2435     86%
The client and server use pNFS SCSI operations when:
- The layoutget, layoutreturn, and layoutcommit counters increment. This means that the server is serving layouts.
- The server read and write counters do not increment. This means that the clients are performing I/O requests directly to the SCSI devices.
6.7.2. Checking pNFS SCSI operations from the client using mountstats
This procedure uses the /proc/self/mountstats file to monitor pNFS SCSI operations from the client.
Procedure
List the per-mount operation counters:
# cat /proc/self/mountstats \
      | awk /scsi_lun_0/,/^$/ \
      | egrep device\|READ\|WRITE\|LAYOUT

device 192.168.122.73:/exports/scsi_lun_0 mounted on /mnt/rhel7/scsi_lun_0 with fstype nfs4 statvers=1.1
    nfsv4:  bm0=0xfdffbfff,bm1=0x40f9be3e,bm2=0x803,acl=0x3,sessions,pnfs=LAYOUT_SCSI
                READ: 0 0 0 0 0 0 0 0
               WRITE: 0 0 0 0 0 0 0 0
            READLINK: 0 0 0 0 0 0 0 0
             READDIR: 0 0 0 0 0 0 0 0
           LAYOUTGET: 49 49 0 11172 9604 2 19448 19454
        LAYOUTCOMMIT: 28 28 0 7776 4808 0 24719 24722
        LAYOUTRETURN: 0 0 0 0 0 0 0 0
         LAYOUTSTATS: 0 0 0 0 0 0 0 0
In the results:
- The LAYOUT statistics indicate requests where the client and server use pNFS SCSI operations.
- The READ and WRITE statistics indicate requests where the client and server fall back to NFS operations.
Chapter 7. Getting started with FS-Cache
FS-Cache is a persistent local cache that file systems can use to take data retrieved from over the network and cache it on local disk. This helps minimize network traffic for users accessing data from a file system mounted over the network (for example, NFS).
7.1. Overview of the FS-Cache
The following diagram is a high-level illustration of how FS-Cache works:
Figure 7.1. FS-Cache Overview
FS-Cache is designed to be as transparent as possible to the users and administrators of a system. Unlike cachefs on Solaris, FS-Cache allows a file system on a server to interact directly with a client’s local cache without creating an overmounted file system. With NFS, a mount option instructs the client to mount the NFS share with FS-Cache enabled. Mounting the share causes automatic loading of two kernel modules: fscache and cachefiles. The cachefilesd daemon communicates with the kernel modules to implement the cache.
FS-Cache does not alter the basic operation of a file system that works over the network - it merely provides that file system with a persistent place in which it can cache data. For instance, a client can still mount an NFS share whether or not FS-Cache is enabled. In addition, cached NFS can handle files that will not fit into the cache (whether individually or collectively) as files can be partially cached and do not have to be read completely up front. FS-Cache also hides all I/O errors that occur in the cache from the client file system driver.
To provide caching services, FS-Cache needs a cache back end. A cache back end is a storage driver configured to provide caching services, which is cachefiles. In this case, FS-Cache requires a mounted block-based file system that supports bmap and extended attributes (e.g. ext3) as its cache back end.
File systems that support functionalities required by FS-Cache cache back end include the Red Hat Enterprise Linux 8 implementations of the following file systems:
- ext3 (with extended attributes enabled)
- ext4
- XFS
FS-Cache cannot arbitrarily cache any file system, whether through the network or otherwise: the shared file system’s driver must be altered to allow interaction with FS-Cache, data storage/retrieval, and metadata setup and validation. FS-Cache needs indexing keys and coherency data from the cached file system to support persistence: indexing keys to match file system objects to cache objects, and coherency data to determine whether the cache objects are still valid.
In Red Hat Enterprise Linux 8, the cachefilesd package is not installed by default and needs to be installed manually.
7.2. Performance guarantee
FS-Cache does not guarantee increased performance. Using a cache incurs a performance penalty: for example, cached NFS shares add disk accesses to cross-network lookups. While FS-Cache tries to be as asynchronous as possible, there are synchronous paths (e.g. reads) where this isn’t possible.
For example, using FS-Cache to cache an NFS share between two computers over an otherwise unladen GigE network likely will not demonstrate any performance improvements on file access. In this case, NFS requests are satisfied faster from server memory than from the local disk cache.
The use of FS-Cache, therefore, is a compromise between various factors. If FS-Cache is being used to cache NFS traffic, for instance, it may slow the client down a little, but massively reduce the network and server loading by satisfying read requests locally without consuming network bandwidth.
7.3. Setting up a cache
Currently, Red Hat Enterprise Linux 8 only provides the cachefiles caching back end. The cachefilesd daemon initiates and manages cachefiles. The /etc/cachefilesd.conf file controls how cachefiles provides caching services.
The cache back end works by maintaining a certain amount of free space on the partition hosting the cache. It grows and shrinks the cache in response to other elements of the system using up free space, making it safe to use on the root file system (for example, on a laptop). FS-Cache sets defaults on this behavior, which can be configured via cache cull limits. For more information about configuring cache cull limits, see Section 7.5, “Cache cull limits configuration”.
This procedure shows how to set up a cache.
Prerequisites
The cachefilesd package is installed and the service has started successfully. To be sure the service is running, use the following commands:
# systemctl start cachefilesd
# systemctl status cachefilesd
The status must be active (running).
Procedure
To configure which directory the cache back end uses as a cache, use the following parameter:
$ dir /path/to/cache
Typically, the cache back end directory is set in /etc/cachefilesd.conf as /var/cache/fscache, as in:
$ dir /var/cache/fscache
If you want to change the cache back end directory, the SELinux context must be the same as /var/cache/fscache:
# semanage fcontext -a -e /var/cache/fscache /path/to/cache
# restorecon -Rv /path/to/cache
Replace /path/to/cache with the directory name while setting up the cache.
If the given commands for setting the SELinux context did not work, use the following commands:
# semanage permissive -a cachefilesd_t
# semanage permissive -a cachefiles_kernel_t
FS-Cache will store the cache in the file system that hosts /path/to/cache. On a laptop, it is advisable to use the root file system (/) as the host file system, but for a desktop machine it would be more prudent to mount a disk partition specifically for the cache.
The host file system must support user-defined extended attributes; FS-Cache uses these attributes to store coherency maintenance information. To enable user-defined extended attributes for ext3 file systems (i.e. device), use:
# tune2fs -o user_xattr /dev/device
As an alternative, to enable extended attributes for a file system at mount time, use the following command:
# mount /dev/device /path/to/cache -o user_xattr
Once the configuration file is in place, start up the cachefilesd service:
# systemctl start cachefilesd
To configure cachefilesd to start at boot time, execute the following command as root:
# systemctl enable cachefilesd
7.4. Using the cache with NFS
NFS will not use the cache unless explicitly instructed. This section shows how to configure an NFS mount by using FS-Cache.
Prerequisites
The cachefilesd package is installed and running. To ensure it is running, use the following command:
# systemctl start cachefilesd
# systemctl status cachefilesd
The status must be active (running).
Mount NFS shares with the following option:
# mount nfs-share:/ /mount/point -o fsc
All access to files under /mount/point will go through the cache, unless the file is opened for direct I/O or writing. For more information, see Section 7.4.2, “Cache limitations with NFS”. NFS indexes cache contents using the NFS file handle, not the file name, which means hard-linked files share the cache correctly.
NFS versions 3, 4.0, 4.1 and 4.2 support caching. However, each version uses different branches for caching.
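To make a cached NFS mount persistent, the same fsc option can be added to the mount options in /etc/fstab; the server, export path, and mount point below are placeholders:
nfs-share:/export  /mount/point  nfs  defaults,fsc  0 0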
7.4.1. Configuring NFS cache sharing
There are several potential issues to do with NFS cache sharing. Because the cache is persistent, blocks of data in the cache are indexed on a sequence of four keys:
- Level 1: Server details
- Level 2: Some mount options; security type; FSID; uniquifier
- Level 3: File Handle
- Level 4: Page number in file
To avoid coherency management problems between superblocks, all NFS superblocks that cache data have unique Level 2 keys. Normally, two NFS mounts with the same source volume and options share a superblock, and thus share the caching, even if they mount different directories within that volume.
The following is an example of how to configure cache sharing with different options.
Procedure
Mount NFS shares with the following commands:
mount home0:/disk0/fred /home/fred -o fsc
mount home0:/disk0/jim /home/jim -o fsc
Here, /home/fred and /home/jim likely share the superblock as they have the same options, especially if they come from the same volume/partition on the NFS server (home0).
To not share the superblock, use the mount command with the following options:
mount home0:/disk0/fred /home/fred -o fsc,rsize=8192
mount home0:/disk0/jim /home/jim -o fsc,rsize=65536
In this case, /home/fred and /home/jim will not share the superblock as they have different network access parameters, which are part of the Level 2 key.
To cache the contents of the two subtrees (/home/fred1 and /home/fred2) twice without sharing the superblock, use the following commands:
mount home0:/disk0/fred /home/fred1 -o fsc,rsize=8192
mount home0:/disk0/fred /home/fred2 -o fsc,rsize=65536
Another way to avoid superblock sharing is to suppress it explicitly with the nosharecache parameter. Using the same example:
mount home0:/disk0/fred /home/fred -o nosharecache,fsc
mount home0:/disk0/jim /home/jim -o nosharecache,fsc
However, in this case only one of the superblocks is permitted to use the cache since there is nothing to distinguish the Level 2 keys of home0:/disk0/fred and home0:/disk0/jim.
To address this, add a unique identifier to at least one of the mounts, that is, fsc=unique-identifier:
mount home0:/disk0/fred /home/fred -o nosharecache,fsc
mount home0:/disk0/jim /home/jim -o nosharecache,fsc=jim
Here, the unique identifier jim is added to the Level 2 key used in the cache for /home/jim.
You cannot share caches between superblocks that have different communications or protocol parameters. For example, it is not possible to share between NFSv4.0 and NFSv3 or between NFSv4.1 and NFSv4.2 because they force different superblocks. Also, setting parameters such as the read size (rsize) prevents cache sharing because, again, it forces a different superblock.
7.4.2. Cache limitations with NFS
There are some cache limitations with NFS:
- Opening a file from a shared file system for direct I/O automatically bypasses the cache. This is because this type of access must be direct to the server.
- Opening a file from a shared file system for either direct I/O or writing flushes the cached copy of the file. FS-Cache will not cache the file again until it is no longer opened for direct I/O or writing.
- Furthermore, this release of FS-Cache only caches regular NFS files. FS-Cache will not cache directories, symlinks, device files, FIFOs and sockets.
7.5. Cache cull limits configuration
The cachefilesd daemon works by caching remote data from shared file systems to free space on the disk. This could potentially consume all available free space, which could be bad if the disk also housed the root partition. To control this, cachefilesd tries to maintain a certain amount of free space by discarding old objects (i.e. accessed less recently) from the cache. This behavior is known as cache culling.
Cache culling is done on the basis of the percentage of blocks and the percentage of files available in the underlying file system. There are settings in /etc/cachefilesd.conf which control six limits:
- brun N% (percentage of blocks), frun N% (percentage of files)
- If the amount of free space and the number of available files in the cache rises above both these limits, then culling is turned off.
- bcull N% (percentage of blocks), fcull N% (percentage of files)
- If the amount of available space or the number of files in the cache falls below either of these limits, then culling is started.
- bstop N% (percentage of blocks), fstop N% (percentage of files)
- If the amount of available space or the number of available files in the cache falls below either of these limits, then no further allocation of disk space or files is permitted until culling has raised things above these limits again.
The default value of N for each setting is as follows:
- brun/frun - 10%
- bcull/fcull - 7%
- bstop/fstop - 3%
When configuring these settings, the following must hold true:
- 0 ≤ bstop < bcull < brun < 100
- 0 ≤ fstop < fcull < frun < 100
These are the percentages of available space and available files and do not appear as 100 minus the percentage displayed by the df program.
Culling depends on both bxxx and fxxx pairs simultaneously; you cannot treat them separately. A sample configuration follows.
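Putting the directory and cull settings together, an /etc/cachefilesd.conf along the following lines simply restates the defaults explicitly; the dir and tag values are placeholders, and this is a sketch rather than a tuning recommendation:
dir /var/cache/fscache
tag mycache
brun 10%
frun 10%
bcull 7%
fcull 7%
bstop 3%
fstop 3%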
7.6. Retrieving statistical information from the fscache kernel module
FS-Cache also keeps track of general statistical information. This procedure shows how to get this information.
Procedure
To view the statistical information on FS-Cache, use the following command:
# cat /proc/fs/fscache/stats
FS-Cache statistics include information on decision points and object counters. For more information, see the following kernel document:
/usr/share/doc/kernel-doc-4.18.0/Documentation/filesystems/caching/fscache.txt
7.7. FS-Cache references
This section provides reference information for FS-Cache.
For more information on cachefilesd and how to configure it, see man cachefilesd and man cachefilesd.conf. The following documents also provide additional information:
- /usr/share/doc/cachefilesd/README
- /usr/share/man/man5/cachefilesd.conf.5.gz
- /usr/share/man/man8/cachefilesd.8.gz
For general information about FS-Cache, including details on its design constraints, available statistics, and capabilities, see the following kernel document:
/usr/share/doc/kernel-doc-4.18.0/Documentation/filesystems/caching/fscache.txt
Chapter 9. Overview of persistent naming attributes
As a system administrator, you need to refer to storage volumes using persistent naming attributes to build storage setups that are reliable over multiple system boots.
9.1. Disadvantages of non-persistent naming attributes
Red Hat Enterprise Linux provides a number of ways to identify storage devices. It is important to use the correct option to identify each device in order to avoid inadvertently accessing the wrong device, particularly when installing to or reformatting drives.
Traditionally, non-persistent names in the form of /dev/sd(major number)(minor number) are used on Linux to refer to storage devices. The major and minor number range and associated sd names are allocated for each device when it is detected. This means that the association between the major and minor number range and associated sd names can change if the order of device detection changes.
Such a change in the ordering might occur in the following situations:
- The parallelization of the system boot process detects storage devices in a different order with each system boot.
- A disk fails to power up or respond to the SCSI controller. This results in it not being detected by the normal device probe. The disk is not accessible to the system and subsequent devices will have their major and minor number range, including the associated sd names, shifted down. For example, if a disk normally referred to as sdb is not detected, a disk that is normally referred to as sdc would instead appear as sdb.
- A SCSI controller (host bus adapter, or HBA) fails to initialize, causing all disks connected to that HBA to not be detected. Any disks connected to subsequently probed HBAs are assigned different major and minor number ranges, and different associated sd names.
- The order of driver initialization changes if different types of HBAs are present in the system. This causes the disks connected to those HBAs to be detected in a different order. This might also occur if HBAs are moved to different PCI slots on the system.
- Disks connected to the system with Fibre Channel, iSCSI, or FCoE adapters might be inaccessible at the time the storage devices are probed, due to a storage array or intervening switch being powered off, for example. This might occur when a system reboots after a power failure, if the storage array takes longer to come online than the system takes to boot. Although some Fibre Channel drivers support a mechanism to specify a persistent SCSI target ID to WWPN mapping, this does not cause the major and minor number ranges, and the associated sd names, to be reserved; it only provides consistent SCSI target ID numbers.
These reasons make it undesirable to use the major and minor number range or the associated sd
names when referring to devices, such as in the /etc/fstab
file. There is the possibility that the wrong device will be mounted and data corruption might result.
Occasionally, however, it is still necessary to refer to the sd
names even when another mechanism is used, such as when errors are reported by a device. This is because the Linux kernel uses sd
names (and also SCSI host/channel/target/LUN tuples) in kernel messages regarding the device.
9.2. File system and device identifiers
This section explains the difference between persistent attributes identifying file systems and block devices.
File system identifiers
File system identifiers are tied to a particular file system created on a block device. The identifier is also stored as part of the file system. If you copy the file system to a different device, it still carries the same file system identifier. On the other hand, if you rewrite the device, such as by formatting it with the mkfs
utility, the device loses the attribute.
File system identifiers include:
- Unique identifier (UUID)
- Label
Device identifiers
Device identifiers are tied to a block device: for example, a disk or a partition. If you rewrite the device, such as by formatting it with the mkfs
utility, the device keeps the attribute, because it is not stored in the file system.
Device identifiers include:
- World Wide Identifier (WWID)
- Partition UUID
- Serial number
Recommendations
- Some file systems, such as logical volumes, span multiple devices. Red Hat recommends accessing these file systems using file system identifiers rather than device identifiers.
9.3. Device names managed by the udev mechanism in /dev/disk/
This section lists different kinds of persistent naming attributes that the udev
service provides in the /dev/disk/
directory.
The udev
mechanism is used for all types of devices in Linux, not just for storage devices. In the case of storage devices, Red Hat Enterprise Linux contains udev
rules that create symbolic links in the /dev/disk/
directory. This enables you to refer to storage devices by:
- Their content
- A unique identifier
- Their serial number.
Although udev
naming attributes are persistent, in that they do not change on their own across system reboots, some are also configurable.
9.3.1. File system identifiers
The UUID attribute in /dev/disk/by-uuid/
Entries in this directory provide a symbolic name that refers to the storage device by a unique identifier (UUID) in the content (that is, the data) stored on the device. For example:
/dev/disk/by-uuid/3e6be9de-8139-11d1-9106-a43f08d823a6
You can use the UUID to refer to the device in the /etc/fstab
file using the following syntax:
UUID=3e6be9de-8139-11d1-9106-a43f08d823a6
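For example, a complete /etc/fstab entry using this UUID might look like the following; the mount point, file system type, and mount options in this sketch are illustrative only:
UUID=3e6be9de-8139-11d1-9106-a43f08d823a6 /data xfs defaults 0 0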
You can configure the UUID attribute when creating a file system, and you can also change it later on.
The Label attribute in /dev/disk/by-label/
Entries in this directory provide a symbolic name that refers to the storage device by a label in the content (that is, the data) stored on the device.
For example:
/dev/disk/by-label/Boot
You can use the label to refer to the device in the /etc/fstab
file using the following syntax:
LABEL=Boot
You can configure the Label attribute when creating a file system, and you can also change it later on.
9.3.2. Device identifiers
The WWID attribute in /dev/disk/by-id/
The World Wide Identifier (WWID) is a persistent, system-independent identifier that the SCSI Standard requires from all SCSI devices. The WWID identifier is guaranteed to be unique for every storage device, and independent of the path that is used to access the device. The identifier is a property of the device but is not stored in the content (that is, the data) on the devices.
This identifier can be obtained by issuing a SCSI Inquiry to retrieve the Device Identification Vital Product Data (page 0x83
) or Unit Serial Number (page 0x80
).
Red Hat Enterprise Linux automatically maintains the proper mapping from the WWID-based device name to a current /dev/sd
name on that system. Applications can use the /dev/disk/by-id/
name to reference the data on the disk, even if the path to the device changes, and even when accessing the device from different systems.
Example 9.1. WWID mappings
WWID symlink | Non-persistent device | Note |
---|---|---|
 | | A device with a page 0x83 identifier |
 | | A device with a page 0x80 identifier |
 | | A disk partition |
In addition to these persistent names provided by the system, you can also use udev
rules to implement persistent names of your own, mapped to the WWID of the storage.
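For example, the following is a minimal sketch of such a rule, placed in a file under the /etc/udev/rules.d/ directory; the file name, the WWID value, and the symlink name are hypothetical, and the sketch assumes that the ID_WWN property is populated by the standard persistent-storage udev rules:
# /etc/udev/rules.d/99-wwid-alias.rules (hypothetical file name)
# Create /dev/disk/my-app-disk for the disk whose WWID matches the given value.
KERNEL=="sd?", ENV{ID_WWN}=="0x600508b400105df7", SYMLINK+="disk/my-app-disk"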
The Partition UUID attribute in /dev/disk/by-partuuid
The Partition UUID (PARTUUID) attribute identifies partitions as defined by the GPT partition table.
Example 9.2. Partition UUID mappings
PARTUUID symlink | Non-persistent device |
---|---|
The Path attribute in /dev/disk/by-path/
This attribute provides a symbolic name that refers to the storage device by the hardware path used to access the device.
The Path attribute fails if any part of the hardware path (for example, the PCI ID, target port, or LUN number) changes. The Path attribute is therefore unreliable. However, the Path attribute may be useful in one of the following scenarios:
- You need to identify a disk that you are planning to replace later.
- You plan to install a storage service on a disk in a specific location.
9.4. The World Wide Identifier with DM Multipath
This section describes the mapping between the World Wide Identifier (WWID) and non-persistent device names in a Device Mapper Multipath configuration.
If there are multiple paths from a system to a device, DM Multipath uses the WWID to detect this. DM Multipath then presents a single "pseudo-device" in the /dev/mapper/wwid
directory, such as /dev/mapper/3600508b400105df70000e00000ac0000
.
The command multipath -l shows the mapping to the non-persistent identifiers:
- Host:Channel:Target:LUN
- /dev/sd name
- major:minor number
Example 9.3. WWID mappings in a multipath configuration
An example output of the multipath -l
command:
3600508b400105df70000e00000ac0000 dm-2 vendor,product
[size=20G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 5:0:1:1 sdc 8:32  [active][undef]
 \_ 6:0:1:1 sdg 8:96  [active][undef]
\_ round-robin 0 [prio=0][enabled]
 \_ 5:0:0:1 sdb 8:16  [active][undef]
 \_ 6:0:0:1 sdf 8:80  [active][undef]
DM Multipath automatically maintains the proper mapping of each WWID-based device name to its corresponding /dev/sd
name on the system. These names are persistent across path changes, and they are consistent when accessing the device from different systems.
When the user_friendly_names
feature of DM Multipath is used, the WWID is mapped to a name of the form /dev/mapper/mpathN
. By default, this mapping is maintained in the file /etc/multipath/bindings
. These mpathN
names are persistent as long as that file is maintained.
If you use user_friendly_names
, then additional steps are required to obtain consistent names in a cluster.
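The feature is enabled in the defaults section of the /etc/multipath.conf file. The following is a minimal sketch:
defaults {
    user_friendly_names yes
}
After changing the file, reload or restart the multipathd service so that the setting takes effect.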
9.5. Limitations of the udev device naming convention
The following are some limitations of the udev
naming convention:
- It is possible that the device might not be accessible at the time the query is performed because the udev mechanism might rely on the ability to query the storage device when the udev rules are processed for a udev event. This is more likely to occur with Fibre Channel, iSCSI, or FCoE storage devices when the device is not located in the server chassis.
- The kernel might send udev events at any time, causing the rules to be processed and possibly causing the /dev/disk/by-*/ links to be removed if the device is not accessible.
- There might be a delay between when the udev event is generated and when it is processed, such as when a large number of devices are detected and the user-space udevd service takes some amount of time to process the rules for each one. This might cause a delay between when the kernel detects the device and when the /dev/disk/by-*/ names are available.
- External programs such as blkid invoked by the rules might open the device for a brief period of time, making the device inaccessible for other uses.
- The device names managed by the udev mechanism in /dev/disk/ may change between major releases, requiring you to update the links.
9.6. Listing persistent naming attributes
This procedure describes how to find out the persistent naming attributes of non-persistent storage devices.
Procedure
To list the UUID and Label attributes, use the lsblk utility:
$ lsblk --fs storage-device
For example:
Example 9.4. Viewing the UUID and Label of a file system
$ lsblk --fs /dev/sda1
NAME FSTYPE LABEL UUID                                 MOUNTPOINT
sda1 xfs    Boot  afa5d5e3-9050-48c3-acc1-bb30095f3dc4 /boot
To list the PARTUUID attribute, use the lsblk utility with the --output +PARTUUID option:
$ lsblk --output +PARTUUID
For example:
Example 9.5. Viewing the PARTUUID attribute of a partition
$ lsblk --output +PARTUUID /dev/sda1
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT PARTUUID
sda1   8:1    0  512M  0 part /boot      4cd1448a-01
To list the WWID attribute, examine the targets of symbolic links in the /dev/disk/by-id/ directory. For example:
Example 9.6. Viewing the WWID of all storage devices on the system
$ file /dev/disk/by-id/*
/dev/disk/by-id/ata-QEMU_HARDDISK_QM00001 symbolic link to ../../sda
/dev/disk/by-id/ata-QEMU_HARDDISK_QM00001-part1 symbolic link to ../../sda1
/dev/disk/by-id/ata-QEMU_HARDDISK_QM00001-part2 symbolic link to ../../sda2
/dev/disk/by-id/dm-name-rhel_rhel8-root symbolic link to ../../dm-0
/dev/disk/by-id/dm-name-rhel_rhel8-swap symbolic link to ../../dm-1
/dev/disk/by-id/dm-uuid-LVM-QIWtEHtXGobe5bewlIUDivKOz5ofkgFhP0RMFsNyySVihqEl2cWWbR7MjXJolD6g symbolic link to ../../dm-1
/dev/disk/by-id/dm-uuid-LVM-QIWtEHtXGobe5bewlIUDivKOz5ofkgFhXqH2M45hD2H9nAf2qfWSrlRLhzfMyOKd symbolic link to ../../dm-0
/dev/disk/by-id/lvm-pv-uuid-atlr2Y-vuMo-ueoH-CpMG-4JuH-AhEF-wu4QQm symbolic link to ../../sda2
9.7. Modifying persistent naming attributes
This procedure describes how to change the UUID or Label persistent naming attribute of a file system.
Changing udev
attributes happens in the background and might take a long time. The udevadm settle
command waits until the change is fully registered, which ensures that your next command will be able to utilize the new attribute correctly.
In the following commands:
- Replace new-uuid with the UUID you want to set; for example, 1cdfbc07-1c90-4984-b5ec-f61943f5ea50. You can generate a UUID using the uuidgen command.
- Replace new-label with a label; for example, backup_data.
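For example, to generate a new random UUID, run the uuidgen utility; the value printed below is only illustrative:
$ uuidgen
1cdfbc07-1c90-4984-b5ec-f61943f5ea50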
Prerequisites
- If you are modifying the attributes of an XFS file system, unmount it first.
Procedure
To change the UUID or Label attributes of an XFS file system, use the xfs_admin utility:
# xfs_admin -U new-uuid -L new-label storage-device
# udevadm settle
To change the UUID or Label attributes of an ext4, ext3, or ext2 file system, use the tune2fs utility:
# tune2fs -U new-uuid -L new-label storage-device
# udevadm settle
To change the UUID or Label attributes of a swap volume, use the swaplabel utility:
# swaplabel --uuid new-uuid --label new-label swap-device
# udevadm settle
Chapter 10. Getting started with partitions
As a system administrator, you can use the following procedures to create, delete, and modify various types of disk partitions.
For an overview of the advantages and disadvantages to using partitions on block devices, see the following KBase article: https://access.redhat.com/solutions/163853.
10.1. Viewing the partition table
As a system administrator, you can display the partition table of a block device to see the partition layout and details about individual partitions.
10.1.1. Viewing the partition table with parted
This procedure describes how to view the partition table on a block device using the parted
utility.
Procedure
Start the interactive parted shell:
# parted block-device
- Replace block-device with the path to the device you want to examine: for example, /dev/sda.
View the partition table:
(parted) print
Optionally, use the following command to switch to another device you want to examine next:
(parted) select block-device
Additional resources
-
The
parted(8)
man page.
10.1.2. Example output of parted print
This section provides an example output of the print
command in the parted
shell and describes fields in the output.
Example 10.1. Output of the print
command
Model: ATA SAMSUNG MZNLN256 (scsi)
Disk /dev/sda: 256GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End     Size    Type      File system  Flags
 1      1049kB  269MB   268MB   primary   xfs          boot
 2      269MB   34.6GB  34.4GB  primary
 3      34.6GB  45.4GB  10.7GB  primary
 4      45.4GB  256GB   211GB   extended
 5      45.4GB  256GB   211GB   logical
Following is a description of the fields:
Model: ATA SAMSUNG MZNLN256 (scsi)
- The disk type, manufacturer, model number, and interface.
Disk /dev/sda: 256GB
- The file path to the block device and the storage capacity.
Partition Table: msdos
- The disk label type.
Number
- The partition number. For example, the partition with minor number 1 corresponds to /dev/sda1.
Start and End
- The location on the device where the partition starts and ends.
Type
- Valid types are metadata, free, primary, extended, or logical.
File system
- The file system type. If the File system field of a device shows no value, this means that its file system type is unknown. The parted utility cannot recognize the file system on encrypted devices.
Flags
- Lists the flags set for the partition. Available flags are boot, root, swap, hidden, raid, lvm, or lba.
10.2. Creating a partition table on a disk
As a system administrator, you can format a block device with different types of partition tables to enable using partitions on the device.
Formatting a block device with a partition table deletes all data stored on the device.
10.2.1. Considerations before modifying partitions on a disk
This section lists key points to consider before creating, removing, or resizing partitions.
This section does not cover the DASD partition table, which is specific to the IBM Z architecture. For information on DASD, see:
- Configuring a Linux instance on IBM Z
- The What you should know about DASD article at the IBM Knowledge Center
The maximum number of partitions
The number of partitions on a device is limited by the type of the partition table:
On a device formatted with the Master Boot Record (MBR) partition table, you can have either:
- Up to four primary partitions, or
- Up to three primary partitions, one extended partition, and multiple logical partitions within the extended.
-
On a device formatted with the GUID Partition Table (GPT), the maximum number of partitions is 128. While the GPT specification allows for more partitions by growing the area reserved for the partition table, common practice used by the
parted
utility is to limit it to enough area for 128 partitions.
Red Hat recommends that, unless you have a reason for doing otherwise, you should at least create the following partitions: swap
, /boot/
, and /
(root).
The maximum size of a partition
The size of a partition on a device is limited by the type of the partition table:
- On a device formatted with the Master Boot Record (MBR) partition table, the maximum size is 2TiB.
- On a device formatted with the GUID Partition Table (GPT), the maximum size is 8ZiB.
If you want to create a partition larger than 2TiB, the disk must be formatted with GPT.
Size alignment
The parted
utility enables you to specify partition size using multiple different suffixes:
- MiB, GiB, or TiB
Size expressed in powers of 2.
- The starting point of the partition is aligned to the exact sector specified by size.
- The ending point is aligned to the specified size minus 1 sector.
- MB, GB, or TB
Size expressed in powers of 10.
The starting and ending point is aligned within one half of the specified unit: for example, ±500KB when using the MB suffix.
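For example, the following two parted commands are a minimal sketch of the difference; the partition boundaries are hypothetical:
(parted) mkpart primary 1MiB 1025MiB
(parted) mkpart primary 1MB 1025MB
The first command aligns the start and end of the partition to the exact sectors implied by the binary suffixes, while the second command allows parted to place each boundary anywhere within ±500KB of the requested value.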
10.2.2. Comparison of partition table types
This section compares the properties of different types of partition tables that you can create on a block device.
Table 10.1. Partition table types
Partition table | Maximum number of partitions | Maximum partition size |
---|---|---|
Master Boot Record (MBR) | 4 primary, or 3 primary and 12 logical inside an extended partition | 2TiB |
GUID Partition Table (GPT) | 128 | 8ZiB |
10.2.3. MBR disk partitions
The diagrams in this chapter show the partition table as being separate from the actual disk. However, this is not entirely accurate. In reality, the partition table is stored at the very start of the disk, before any file system or user data, but for clarity, they are separate in the following diagrams.
Figure 10.1. Disk with MBR partition table
As the previous diagram shows, the partition table is divided into four sections, or four primary partitions. A primary partition is a partition on a hard drive that can contain only one logical drive (or section). Each section can hold the information necessary to define a single partition, meaning that the partition table can define no more than four partitions.
Each partition table entry contains several important characteristics of the partition:
- The points on the disk where the partition starts and ends.
- Whether the partition is active. Only one partition can be flagged as active.
- The partition’s type.
The starting and ending points define the partition’s size and location on the disk. The "active" flag is used by the boot loaders of some operating systems. In other words, the operating system in the partition that is marked "active" is booted.
The type is a number that identifies the partition’s anticipated usage. Some operating systems use the partition type to denote a specific file system type, to flag the partition as being associated with a particular operating system, to indicate that the partition contains a bootable operating system, or some combination of the three.
The following diagram shows an example of a drive with a single partition:
Figure 10.2. Disk with a single partition
The single partition in this example is labeled as DOS
. This label shows the partition type, with DOS
being one of the most common ones.
10.2.4. Extended MBR partitions
In case four partitions are insufficient for your needs, you can use extended partitions to create additional partitions. You can do this by setting the type of partition to "Extended".
An extended partition is like a disk drive in its own right - it has its own partition table, which points to one or more partitions (now called logical partitions, as opposed to the four primary partitions), contained entirely within the extended partition itself. The following diagram shows a disk drive with one primary partition and one extended partition containing two logical partitions (along with some unpartitioned free space):
Figure 10.3. Disk with both a primary and an extended MBR partition
As this figure implies, there is a difference between primary and logical partitions - there can be only up to four primary and extended partitions, but there is no fixed limit to the number of logical partitions that can exist. However, due to the way in which partitions are accessed in Linux, no more than 15 logical partitions can be defined on a single disk drive.
10.2.5. MBR partition types
The table below shows a list of some of the commonly used MBR partition types and hexadecimal numbers used to represent them.
Table 10.2. MBR partition types
MBR partition type | Value | MBR partition type | Value |
Empty | 00 | Novell Netware 386 | 65 |
DOS 12-bit FAT | 01 | PC/IX | 75 |
XENIX root | 02 | Old MINIX | 80 |
XENIX usr | 03 | Linux/MINIX | 81 |
DOS 16-bit <=32M | 04 | Linux swap | 82 |
Extended | 05 | Linux native | 83 |
DOS 16-bit >=32M | 06 | Linux extended | 85 |
OS/2 HPFS | 07 | Amoeba | 93 |
AIX | 08 | Amoeba BBT | 94 |
AIX bootable | 09 | BSD/386 | a5 |
OS/2 Boot Manager | 0a | OpenBSD | a6 |
Win95 FAT32 | 0b | NEXTSTEP | a7 |
Win95 FAT32 (LBA) | 0c | BSDI fs | b7 |
Win95 FAT16 (LBA) | 0e | BSDI swap | b8 |
Win95 Extended (LBA) | 0f | Syrinx | c7 |
Venix 80286 | 40 | CP/M | db |
Novell | 51 | DOS access | e1 |
PReP Boot | 41 | DOS R/O | e3 |
GNU HURD | 63 | DOS secondary | f2 |
Novell Netware 286 | 64 | BBT | ff |
10.2.6. GUID Partition Table
The GUID Partition Table (GPT) is a partitioning scheme based on using Globally Unique Identifiers (GUIDs). GPT was developed to cope with limitations of the MBR partition table, especially with the limited maximum addressable storage space of a disk. Unlike MBR, which is unable to address storage larger than 2 TiB (equivalent to approximately 2.2 TB), GPT is used with hard disks larger than this; the maximum addressable disk size is 8 ZiB. In addition, GPT, by default, supports creating up to 128 primary partitions. This number can be extended by allocating more space to the partition table.
A GPT has partition types based on GUIDs. Note that certain partitions require a specific GUID. For example, the system partition for EFI boot loaders requires the GUID C12A7328-F81F-11D2-BA4B-00A0C93EC93B
.
GPT disks use logical block addressing (LBA) and the partition layout is as follows:
- To preserve backward compatibility with MBR disks, the first sector (LBA 0) of GPT is reserved for MBR data and it is called "protective MBR".
- The primary GPT header begins on the second logical block (LBA 1) of the device. The header contains the disk GUID, the location of the primary partition table, the location of the secondary GPT header, and CRC32 checksums of itself and of the primary partition table. It also specifies the number of partition entries in the table.
- The primary GPT table includes, by default, 128 partition entries, each with an entry size of 128 bytes, a partition type GUID, and a unique partition GUID.
- The secondary GPT is identical to the primary GPT. It is used mainly as a backup table for recovery in case the primary partition table is corrupted.
- The secondary GPT header is located on the last logical sector of the disk and it can be used to recover GPT information in case the primary header is corrupted. It contains the disk GUID, the location of the secondary partition table and the primary GPT header, CRC32 checksums of itself and the secondary partition table, and the number of possible partition entries.
Figure 10.4. Disk with a GUID Partition Table
There must be a BIOS boot partition for the boot loader to be installed successfully onto a disk that contains a GPT (GUID Partition table). This includes disks initialized by Anaconda. If the disk already contains a BIOS boot partition, it can be reused.
10.2.7. Creating a partition table on a disk with parted
This procedure describes how to format a block device with a partition table using the parted
utility.
Procedure
Start the interactive parted shell:
# parted block-device
- Replace block-device with the path to the device where you want to create a partition table: for example, /dev/sda.
Determine if there already is a partition table on the device:
(parted) print
If the device already contains partitions, they will be deleted in the next steps.
Create the new partition table:
(parted) mklabel table-type
Replace table-type with the intended partition table type:
- msdos for MBR
- gpt for GPT
Example 10.2. Creating a GPT table
For example, to create a GPT table on the disk, use:
(parted) mklabel gpt
The changes start taking place as soon as you enter this command, so review it before executing it.
View the partition table to confirm that the partition table exists:
(parted) print
Exit the
parted
shell:(parted) quit
Additional resources
-
The
parted(8)
man page.
Next steps
- Create partitions on the device. See Section 10.3, “Creating a partition” for details.
10.3. Creating a partition
As a system administrator, you can create new partitions on a disk.
10.3.1. Considerations before modifying partitions on a disk
This section lists key points to consider before creating, removing, or resizing partitions.
This section does not cover the DASD partition table, which is specific to the IBM Z architecture. For information on DASD, see:
- Configuring a Linux instance on IBM Z
- The What you should know about DASD article at the IBM Knowledge Center
The maximum number of partitions
The number of partitions on a device is limited by the type of the partition table:
On a device formatted with the Master Boot Record (MBR) partition table, you can have either:
- Up to four primary partitions, or
- Up to three primary partitions, one extended partition, and multiple logical partitions within the extended.
-
On a device formatted with the GUID Partition Table (GPT), the maximum number of partitions is 128. While the GPT specification allows for more partitions by growing the area reserved for the partition table, common practice used by the
parted
utility is to limit it to enough area for 128 partitions.
Red Hat recommends that, unless you have a reason for doing otherwise, you should at least create the following partitions: swap
, /boot/
, and /
(root).
The maximum size of a partition
The size of a partition on a device is limited by the type of the partition table:
- On a device formatted with the Master Boot Record (MBR) partition table, the maximum size is 2TiB.
- On a device formatted with the GUID Partition Table (GPT), the maximum size is 8ZiB.
If you want to create a partition larger than 2TiB, the disk must be formatted with GPT.
Size alignment
The parted
utility enables you to specify partition size using multiple different suffixes:
- MiB, GiB, or TiB
Size expressed in powers of 2.
- The starting point of the partition is aligned to the exact sector specified by size.
- The ending point is aligned to the specified size minus 1 sector.
- MB, GB, or TB
Size expressed in powers of 10.
The starting and ending point is aligned within one half of the specified unit: for example, ±500KB when using the MB suffix.
10.3.2. Partition types
This section describes different attributes that specify the type of a partition.
Partition types or flags
The partition type, or flag, is used by a running system only rarely. However, the partition type matters to on-the-fly generators, such as systemd-gpt-auto-generator
, which use the partition type to, for example, automatically identify and mount devices.
- The parted utility provides some control of partition types by mapping the partition type to flags. The parted utility can handle only certain partition types: for example, LVM, swap, or RAID.
- The fdisk utility supports the full range of partition types by specifying hexadecimal codes.
Partition file system type
The parted
utility optionally accepts a file system type argument when creating a partition. The value is used to:
- Set the partition flags on MBR, or
- Set the partition UUID type on GPT. For example, the swap, fat, or hfs file system types set different GUIDs. The default value is the Linux Data GUID.
The argument does not modify the file system on the partition in any way. It only differentiates between the supported flags or GUIDs.
The following file system types are supported:
-
xfs
-
ext2
-
ext3
-
ext4
-
fat16
-
fat32
-
hfs
-
hfs+
-
linux-swap
-
ntfs
-
reiserfs
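For example, on a GPT-labeled disk, where the part-type argument is omitted and a partition name is given instead, the following command creates a partition whose type GUID corresponds to Linux swap; the partition name and boundaries are hypothetical, and the command does not create the swap space itself:
(parted) mkpart swap1 linux-swap 2GiB 4GiB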
10.3.3. Partition naming scheme
Red Hat Enterprise Linux uses a file-based naming scheme, with file names in the form of /dev/xxyN
.
Device and partition names consist of the following structure:
/dev/
-
This is the name of the directory in which all device files are located. Because partitions are placed on hard disks, and hard disks are devices, the files representing all possible partitions are located in
/dev
. xx
-
The first two letters of the partition’s name indicate the type of device on which the partition is located, usually
sd
. y
-
This letter indicates which device the partition is on. For example,
/dev/sda
for the first hard disk,/dev/sdb
for the second, and so on. In systems with more than 26 drives, you can use more letters. For example,/dev/sdaa1
. N
-
The final letter indicates the number that represents the partition. The first four (primary or extended) partitions are numbered
1
through4
. Logical partitions start at5
. For example,/dev/sda3
is the third primary or extended partition on the first hard disk, and/dev/sdb6
is the second logical partition on the second hard disk. Drive partition numbering applies only to MBR partition tables. Note that N does not always mean partition.
Even if Red Hat Enterprise Linux can identify and refer to all types of disk partitions, it might not be able to read the file system and therefore access stored data on every partition type. However, in many cases, it is possible to successfully access data on a partition dedicated to another operating system.
10.3.4. Mount points and disk partitions
In Red Hat Enterprise Linux, each partition is used to form part of the storage necessary to support a single set of files and directories. This is done using the process known as mounting, which associates a partition with a directory. Mounting a partition makes its storage available starting at the specified directory, known as a mount point.
For example, if partition /dev/sda5
is mounted on /usr/
, that would mean that all files and directories under /usr/
physically reside on /dev/sda5
. So the file /usr/share/doc/FAQ/txt/Linux-FAQ
would be stored on /dev/sda5
, while the file /etc/gdm/custom.conf
would not.
Continuing the example, it is also possible that one or more directories below /usr/
would be mount points for other partitions. For instance, a partition /dev/sda7
could be mounted on /usr/local
, meaning that /usr/local/man/whatis
would then reside on /dev/sda7
rather than /dev/sda5
.
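A minimal sketch of this relationship, reusing the hypothetical /dev/sda7 partition and /usr/local mount point from the example above:
# mount /dev/sda7 /usr/local
After this command, anything written under /usr/local is stored on /dev/sda7. To make the association persistent across reboots, you would also add a matching line to the /etc/fstab file; the file system type and mount options shown here are illustrative:
/dev/sda7 /usr/local xfs defaults 0 0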
10.3.5. Creating a partition with parted
This procedure describes how to create a new partition on a block device using the parted
utility.
Prerequisites
- There is a partition table on the disk. For details on how to format the disk, see Section 10.2, “Creating a partition table on a disk”.
- If the partition you want to create is larger than 2TiB, the disk must be formatted with the GUID Partition Table (GPT).
Procedure
Start the interactive parted shell:
# parted block-device
- Replace block-device with the path to the device where you want to create a partition: for example, /dev/sda.
View the current partition table to determine if there is enough free space:
(parted) print
- If there is not enough free space, you can resize an existing partition. For more information, see Section 10.5, “Resizing a partition”.
From the partition table, determine:
- The start and end points of the new partition
- On MBR, what partition type it should be.
Create the new partition:
(parted) mkpart part-type name fs-type start end
- Replace part-type with primary, logical, or extended based on what you decided from the partition table. This applies only to the MBR partition table.
- Replace name with an arbitrary partition name. This is required for GPT partition tables.
- Replace fs-type with any one of xfs, ext2, ext3, ext4, fat16, fat32, hfs, hfs+, linux-swap, ntfs, or reiserfs. The fs-type parameter is optional. Note that parted does not create the file system on the partition.
- Replace start and end with the sizes that determine the starting and ending points of the partition, counting from the beginning of the disk. You can use size suffixes, such as 512MiB, 20GiB, or 1.5TiB. The default unit is megabytes.
Example 10.3. Creating a small primary partition
For example, to create a primary partition from 1024MiB until 2048MiB on an MBR table, use:
(parted) mkpart primary 1024MiB 2048MiB
The changes start taking place as soon as you enter this command, so review it before executing it.
View the partition table to confirm that the created partition is in the partition table with the correct partition type, file system type, and size:
(parted) print
Exit the
parted
shell:(parted) quit
Use the following command to wait for the system to register the new device node:
# udevadm settle
Verify that the kernel recognizes the new partition:
# cat /proc/partitions
Additional resources
-
The
parted(8)
man page.
10.3.6. Setting a partition type with fdisk
This procedure describes how to set a partition type, or flag, using the fdisk
utility.
Prerequisites
- There is a partition on the disk.
Procedure
Start the interactive fdisk shell:
# fdisk block-device
- Replace block-device with the path to the device where you want to set a partition type: for example, /dev/sda.
View the current partition table to determine the minor partition number:
Command (m for help): print
You can see the current partition type in the Type column and its corresponding type ID in the Id column.
Enter the partition type command and select a partition using its minor number:
Command (m for help): type
Partition number (1,2,3 default 3): 2
Optionally, list the available hexadecimal codes:
Hex code (type L to list all codes): L
Set the partition type:
Hex code (type L to list all codes): 8e
Write your changes and exit the
fdisk
shell:Command (m for help): write The partition table has been altered. Syncing disks.
Verify your changes:
# fdisk --list block-device
10.4. Removing a partition
As a system administrator, you can remove a disk partition that is no longer used to free up disk space.
Removing a partition deletes all data stored on the partition.
10.4.1. Considerations before modifying partitions on a disk
This section lists key points to consider before creating, removing, or resizing partitions.
This section does not cover the DASD partition table, which is specific to the IBM Z architecture. For information on DASD, see:
- Configuring a Linux instance on IBM Z
- The What you should know about DASD article at the IBM Knowledge Center
The maximum number of partitions
The number of partitions on a device is limited by the type of the partition table:
On a device formatted with the Master Boot Record (MBR) partition table, you can have either:
- Up to four primary partitions, or
- Up to three primary partitions, one extended partition, and multiple logical partitions within the extended.
-
On a device formatted with the GUID Partition Table (GPT), the maximum number of partitions is 128. While the GPT specification allows for more partitions by growing the area reserved for the partition table, common practice used by the
parted
utility is to limit it to enough area for 128 partitions.
Red Hat recommends that, unless you have a reason for doing otherwise, you should at least create the following partitions: swap
, /boot/
, and /
(root).
The maximum size of a partition
The size of a partition on a device is limited by the type of the partition table:
- On a device formatted with the Master Boot Record (MBR) partition table, the maximum size is 2TiB.
- On a device formatted with the GUID Partition Table (GPT), the maximum size is 8ZiB.
If you want to create a partition larger than 2TiB, the disk must be formatted with GPT.
Size alignment
The parted
utility enables you to specify partition size using multiple different suffixes:
- MiB, GiB, or TiB
Size expressed in powers of 2.
- The starting point of the partition is aligned to the exact sector specified by size.
- The ending point is aligned to the specified size minus 1 sector.
- MB, GB, or TB
Size expressed in powers of 10.
The starting and ending point is aligned within one half of the specified unit: for example, ±500KB when using the MB suffix.
10.4.2. Removing a partition with parted
This procedure describes how to remove a disk partition using the parted
utility.
Procedure
Start the interactive parted shell:
# parted block-device
- Replace block-device with the path to the device where you want to remove a partition: for example, /dev/sda.
View the current partition table to determine the minor number of the partition to remove:
(parted) print
Remove the partition:
(parted) rm minor-number
- Replace minor-number with the minor number of the partition you want to remove: for example, 3.
The changes start taking place as soon as you enter this command, so review it before executing it.
Confirm that the partition is removed from the partition table:
(parted) print
Exit the
parted
shell:(parted) quit
Verify that the kernel knows the partition is removed:
# cat /proc/partitions
- Remove the partition from the /etc/fstab file if it is present. Find the line that declares the removed partition, and remove it from the file.
Regenerate mount units so that your system registers the new /etc/fstab configuration:
# systemctl daemon-reload
If you have deleted a swap partition or removed pieces of LVM, remove all references to the partition from the kernel command line in the /etc/default/grub file and regenerate the GRUB configuration:
On a BIOS-based system:
# grub2-mkconfig --output=/etc/grub2.cfg
On a UEFI-based system:
# grub2-mkconfig --output=/etc/grub2-efi.cfg
To register the changes in the early boot system, rebuild the initramfs file system:
# dracut --force --verbose
Additional resources
-
The
parted(8)
man page
10.5. Resizing a partition
As a system administrator, you can extend a partition to utilize unused disk space, or shrink a partition to use its capacity for different purposes.
10.5.1. Considerations before modifying partitions on a disk
This section lists key points to consider before creating, removing, or resizing partitions.
This section does not cover the DASD partition table, which is specific to the IBM Z architecture. For information on DASD, see:
- Configuring a Linux instance on IBM Z
- The What you should know about DASD article at the IBM Knowledge Center
The maximum number of partitions
The number of partitions on a device is limited by the type of the partition table:
On a device formatted with the Master Boot Record (MBR) partition table, you can have either:
- Up to four primary partitions, or
- Up to three primary partitions, one extended partition, and multiple logical partitions within the extended.
-
On a device formatted with the GUID Partition Table (GPT), the maximum number of partitions is 128. While the GPT specification allows for more partitions by growing the area reserved for the partition table, common practice used by the
parted
utility is to limit it to enough area for 128 partitions.
Red Hat recommends that, unless you have a reason for doing otherwise, you should at least create the following partitions: swap
, /boot/
, and /
(root).
The maximum size of a partition
The size of a partition on a device is limited by the type of the partition table:
- On a device formatted with the Master Boot Record (MBR) partition table, the maximum size is 2TiB.
- On a device formatted with the GUID Partition Table (GPT), the maximum size is 8ZiB.
If you want to create a partition larger than 2TiB, the disk must be formatted with GPT.
Size alignment
The parted
utility enables you to specify partition size using multiple different suffixes:
- MiB, GiB, or TiB
Size expressed in powers of 2.
- The starting point of the partition is aligned to the exact sector specified by size.
- The ending point is aligned to the specified size minus 1 sector.
- MB, GB, or TB
Size expressed in powers of 10.
The starting and ending point is aligned within one half of the specified unit: for example, ±500KB when using the MB suffix.
10.5.2. Resizing a partition with parted
This procedure resizes a disk partition using the parted
utility.
Prerequisites
If you want to shrink a partition, back up the data stored on it.
Warning: Shrinking a partition might result in data loss on the partition.
- If you want to resize a partition to be larger than 2TiB, the disk must be formatted with the GUID Partition Table (GPT). For details on how to format the disk, see Section 10.2, “Creating a partition table on a disk”.
Procedure
- If you want to shrink the partition, shrink the file system on it first so that it is not larger than the resized partition. Note that XFS does not support shrinking.
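For example, a minimal sketch of shrinking an ext4 file system before shrinking its partition; the device name and target size are hypothetical:
# umount /dev/sda3
# e2fsck -f /dev/sda3
# resize2fs /dev/sda3 10G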
Start the interactive parted shell:
# parted block-device
- Replace block-device with the path to the device where you want to resize a partition: for example, /dev/sda.
View the current partition table:
(parted) print
From the partition table, determine:
- The minor number of the partition
- The location of the existing partition and its new ending point after resizing
Resize the partition:
(parted) resizepart minor-number new-end
- Replace minor-number with the minor number of the partition that you are resizing: for example, 3.
- Replace new-end with the size that determines the new ending point of the resized partition, counting from the beginning of the disk. You can use size suffixes, such as 512MiB, 20GiB, or 1.5TiB. The default unit is megabytes.
Example 10.4. Extending a partition
For example, to extend a partition located at the beginning of the disk to be 2GiB in size, use:
(parted) resizepart 1 2GiB
The changes start taking place as soon as you enter this command, so review it before executing it.
View the partition table to confirm that the resized partition is in the partition table with the correct size:
(parted) print
Exit the
parted
shell:(parted) quit
Verify that the kernel recognizes the new partition:
# cat /proc/partitions
- If you extended the partition, extend the file system on it as well. See (reference) for details.
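For example, a minimal sketch, assuming the partition holds an XFS file system mounted at the hypothetical /mnt/data mount point:
# xfs_growfs -d /mnt/data
For an ext4 file system, you would instead run resize2fs on the partition device (hypothetical name shown); with no size argument, it grows the file system to fill the partition:
# resize2fs /dev/sda3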
Additional resources
-
The
parted(8)
man page.
10.6. Strategies for repartitioning a disk
There are several different ways to repartition a disk. This section discusses the following possible approaches:
- Unpartitioned free space is available
- An unused partition is available
- Free space in an actively used partition is available
Note that this section discusses the previously mentioned concepts only theoretically; it does not include procedural steps on how to perform disk repartitioning step by step.
The following illustrations are simplified in the interest of clarity and do not reflect the exact partition layout that you encounter when actually installing Red Hat Enterprise Linux.
10.6.1. Using unpartitioned free space
In this situation, the partitions that are already defined do not span the entire hard disk, leaving unallocated space that is not part of any defined partition. The following diagram shows what this might look like:
Figure 10.5. Disk with unpartitioned free space
In the previous example, the first diagram represents a disk with one primary partition and an undefined partition with unallocated space, and the second diagram represents a disk with two defined partitions with allocated space.
An unused hard disk also falls into this category. The only difference is that none of the space is part of any defined partition.
In any case, you can create the necessary partitions from the unused space. This scenario is most likely with a new disk. Most preinstalled operating systems are configured to take up all available space on a disk drive.
10.6.2. Using space from an unused partition
In this case, you can have one or more partitions that you no longer use. The following diagram illustrates such a situation.
Figure 10.6. Disk with an unused partition
In the previous example, the first diagram represents a disk with an unused partition, and the second diagram represents reallocating an unused partition for Linux.
In this situation, you can use the space allocated to the unused partition. You must delete the partition and then create the appropriate Linux partition(s) in its place. You can delete the unused partition and manually create new partitions during the installation process.
10.6.3. Using free space from an active partition
This is the most common situation. It is also the hardest to handle, because even if you have enough free space, it is presently allocated to a partition that is already in use. If you purchased a computer with preinstalled software, the hard disk most likely has one massive partition holding the operating system and data.
Aside from adding a new hard drive to your system, you can choose from destructive and non-destructive repartitioning.
10.6.3.1. Destructive repartitioning
This deletes the partition and creates several smaller ones instead. You must make a complete backup because any data in the original partition is destroyed. Create two backups, use verification (if available in your backup software), and try to read data from the backup before deleting the partition.
If an operating system was installed on that partition, it must be reinstalled if you want to use that system as well. Be aware that some computers sold with pre-installed operating systems might not include the installation media to reinstall the original operating system. You should check whether this applies to your system before you destroy your original partition and its operating system installation.
After creating a smaller partition for your existing operating system, you can reinstall software, restore your data, and start your Red Hat Enterprise Linux installation.
Figure 10.7. Destructive repartitioning action on disk
Any data previously present in the original partition is lost.
10.6.3.2. Non-destructive repartitioning
With non-destructive repartitioning you execute a program that makes a big partition smaller without losing any of the files stored in that partition. This method is usually reliable, but can be very time-consuming on large drives.
The non-destructive repartitioning process is straightforward and consists of three steps:
- Compress and backup existing data
- Resize the existing partition
- Create new partition(s)
Each of these steps is described in more detail below.
10.6.3.2.1. Compressing existing data
The first step is to compress the data in your existing partition. The reason for doing this is to rearrange the data to maximize the available free space at the "end" of the partition.
Figure 10.8. Compression on disk
In the previous example, the first diagram represents disk before compression, and the second diagram after compression.
This step is crucial. Without it, the location of the data could prevent the partition from being resized to the desired extent. Note that some data cannot be moved. In this case, it severely restricts the size of your new partitions, and you might be forced to destructively repartition your disk.
10.6.3.2.2. Resizing the existing partition
The following figure shows the actual resizing process. While the actual result of the resizing operation varies, depending on the software used, in most cases the newly freed space is used to create an unformatted partition of the same type as the original partition.
Figure 10.9. Partition resizing on disk
In the previous example, the first diagram represents partition before resizing, and the second diagram after resizing.
It is important to understand what the resizing software you use does with the newly freed space, so that you can take the appropriate steps. In the case illustrated here, it would be best to delete the new DOS partition and create the appropriate Linux partition or partitions.
10.6.3.2.3. Creating new partitions
As mentioned in the example Resizing the existing partition, it might or might not be necessary to create new partitions. However, unless your resizing software supports systems with Linux installed, it is likely that you must delete the partition that was created during the resizing process.
Figure 10.10. Disk with final partition configuration
In the previous example, the first diagram represents disk before configuration, and the second diagram after configuration.
Chapter 11. Getting started with XFS
This is an overview of how to create and maintain XFS file systems.
11.1. The XFS file system
XFS is a highly scalable, high-performance, robust, and mature 64-bit journaling file system that supports very large files and file systems on a single host. It is the default file system in Red Hat Enterprise Linux 8. XFS was originally developed in the early 1990s by SGI and has a long history of running on extremely large servers and storage arrays.
The features of XFS include:
- Reliability
- Metadata journaling, which ensures file system integrity after a system crash by keeping a record of file system operations that can be replayed when the system is restarted and the file system remounted
- Extensive run-time metadata consistency checking
- Scalable and fast repair utilities
- Quota journaling. This avoids the need for lengthy quota consistency checks after a crash.
- Scalability and performance
- Supported file system size up to 1024 TiB
- Ability to support a large number of concurrent operations
- B-tree indexing for scalability of free space management
- Sophisticated metadata read-ahead algorithms
- Optimizations for streaming video workloads
- Allocation schemes
- Extent-based allocation
- Stripe-aware allocation policies
- Delayed allocation
- Space pre-allocation
- Dynamically allocated inodes
- Other features
- Reflink-based file copies (new in Red Hat Enterprise Linux 8)
- Tightly integrated backup and restore utilities
- Online defragmentation
- Online file system growing
- Comprehensive diagnostics capabilities
- Extended attributes (xattr). This allows the system to associate several additional name/value pairs per file.
- Project or directory quotas. This allows quota restrictions over a directory tree.
- Subsecond timestamps
Performance characteristics
XFS has a high performance on large systems with enterprise workloads. A large system is one with a relatively high number of CPUs, multiple HBAs, and connections to external disk arrays. XFS also performs well on smaller systems that have a multi-threaded, parallel I/O workload.
XFS has a relatively low performance for single threaded, metadata-intensive workloads: for example, a workload that creates or deletes large numbers of small files in a single thread.
11.2. Creating an XFS file system
As a system administrator, you can create an XFS file system on a block device to enable it to store files and directories.
11.2.1. Creating an XFS file system with mkfs.xfs
This procedure describes how to create an XFS file system on a block device.
Procedure
To create the file system:
If the device is a regular partition, an LVM volume, an MD volume, a disk, or a similar device, use the following command:
# mkfs.xfs block-device
- Replace block-device with the path to the block device. For example, /dev/sdb1, /dev/disk/by-uuid/05e99ec8-def1-4a5e-8a9d-5945339ceb2a, or /dev/my-volgroup/my-lv.
- In general, the default options are optimal for common use.
- When using mkfs.xfs on a block device containing an existing file system, add the -f option to overwrite that file system.
To create the file system on a hardware RAID device, check if the system correctly detects the stripe geometry of the device:
If the stripe geometry information is correct, no additional options are needed. Create the file system:
# mkfs.xfs block-device
If the information is incorrect, specify stripe geometry manually with the su and sw parameters of the -d option. The su parameter specifies the RAID chunk size, and the sw parameter specifies the number of data disks in the RAID device. For example:
# mkfs.xfs -d su=64k,sw=4 /dev/sda3
Use the following command to wait for the system to register the new device node:
# udevadm settle
Additional resources
-
The
mkfs.xfs(8)
man page.
11.2.2. Creating an XFS file system on a block device using RHEL System Roles
This section describes how to create an XFS file system on a block device on multiple target machines using the storage
role.
Prerequisites
An Ansible playbook that uses the storage role exists.
For information on how to apply such a playbook, see Applying a role.
11.2.2.1. Example Ansible playbook to create an XFS file system on a block device
This section provides an example Ansible playbook. This playbook applies the storage
role to create an XFS file system on a block device using the default parameters.
The storage
role can create a file system only on an unpartitioned, whole disk or a logical volume (LV). It cannot create the file system on a partition.
Example 11.1. A playbook that creates XFS on /dev/sdb
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: xfs
  roles:
    - rhel-system-roles.storage
- The volume name (barefs in the example) is currently arbitrary. The storage role identifies the volume by the disk device listed under the disks: attribute.
- You can omit the fs_type: xfs line because XFS is the default file system in RHEL 8.
- To create the file system on an LV, provide the LVM setup under the disks: attribute, including the enclosing volume group. For details, see Example Ansible playbook to manage logical volumes. Do not provide the path to the LV device.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
11.2.2.2. Additional resources
-
For more information about the
storage
role, see Section 2.1, “Introduction to the storage role”.
11.3. Backing up an XFS file system
As a system administrator, you can use the xfsdump utility to back up an XFS file system into a file or on a tape. This provides a simple backup mechanism.
11.3.1. Features of XFS backup
This section describes key concepts and features of backing up an XFS file system with the xfsdump
utility.
You can use the xfsdump
utility to:
Perform backups to regular file images.
Only one backup can be written to a regular file.
Perform backups to tape drives.
The xfsdump utility also enables you to write multiple backups to the same tape. A backup can span multiple tapes.
To back up multiple file systems to a single tape device, simply write the backup to a tape that already contains an XFS backup. This appends the new backup to the previous one. By default, xfsdump never overwrites existing backups.
Create incremental backups.
The xfsdump utility uses dump levels to determine a base backup to which other backups are relative. Numbers from 0 to 9 refer to increasing dump levels. An incremental backup only backs up files that have changed since the last dump of a lower level, as shown in the example after this list:
- To perform a full backup, perform a level 0 dump on the file system.
- A level 1 dump is the first incremental backup after a full backup. The next incremental backup would be level 2, which only backs up files that have changed since the last level 1 dump; and so on, to a maximum of level 9.
- Exclude files from a backup using size, subtree, or inode flags to filter them.
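For example, a minimal sketch of a full backup followed by an incremental backup of the same file system; the backup file names and the /data mount point are hypothetical:
# xfsdump -l 0 -f /backup-files/data-level0.xfsdump /data
# xfsdump -l 1 -f /backup-files/data-level1.xfsdump /data
The second command only dumps the files that changed after the level 0 backup was taken.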
Additional resources
-
The
xfsdump(8)
man page.
11.3.2. Backing up an XFS file system with xfsdump
This procedure describes how to back up the content of an XFS file system into a file or a tape.
Prerequisites
- An XFS file system that you can back up.
- Another file system or a tape drive where you can store the backup.
Procedure
Use the following command to back up an XFS file system:
# xfsdump -l level [-L label] \
    -f backup-destination path-to-xfs-filesystem
- Replace level with the dump level of your backup. Use 0 to perform a full backup or 1 to 9 to perform subsequent incremental backups.
- Replace backup-destination with the path where you want to store your backup. The destination can be a regular file, a tape drive, or a remote tape device. For example, /backup-files/Data.xfsdump for a file or /dev/st0 for a tape drive.
- Replace path-to-xfs-filesystem with the mount point of the XFS file system you want to back up. For example, /mnt/data/. The file system must be mounted.
- When backing up multiple file systems and saving them on a single tape device, add a session label to each backup using the -L label option so that it is easier to identify them when restoring. Replace label with any name for your backup: for example, backup_data.
Example 11.2. Backing up multiple XFS file systems
To back up the content of XFS file systems mounted on the /boot/ and /data/ directories and save them as files in the /backup-files/ directory:
# xfsdump -l 0 -f /backup-files/boot.xfsdump /boot
# xfsdump -l 0 -f /backup-files/data.xfsdump /data
To back up multiple file systems on a single tape device, add a session label to each backup by using the -L label option:
# xfsdump -l 0 -L "backup_boot" -f /dev/st0 /boot
# xfsdump -l 0 -L "backup_data" -f /dev/st0 /data
Additional resources
-
The
xfsdump(8)
man page.
11.3.3. Additional resources
-
The
xfsdump(8)
man page.
11.4. Restoring an XFS file system from backup
As a system administrator, you can use the xfsrestore utility to restore an XFS backup created with the xfsdump utility and stored in a file or on a tape.
11.4.1. Features of restoring XFS from backup
This section describes key concepts and features of restoring an XFS file system from backup with the xfsrestore
utility.
The xfsrestore
utility restores file systems from backups produced by xfsdump
. The xfsrestore
utility has two modes:
- The simple mode enables users to restore an entire file system from a level 0 dump. This is the default mode.
- The cumulative mode enables file system restoration from an incremental backup: that is, level 1 to level 9.
A unique session ID or session label identifies each backup. Restoring a backup from a tape containing multiple backups requires its corresponding session ID or label.
To extract, add, or delete specific files from a backup, enter the xfsrestore
interactive mode. The interactive mode provides a set of commands to manipulate the backup files.
Additional resources
-
The
xfsrestore(8)
man page.
11.4.2. Restoring an XFS file system from backup with xfsrestore
This procedure describes how to restore the content of an XFS file system from a file or tape backup.
Prerequisites
- A file or tape backup of XFS file systems, as described in Section 11.3, “Backing up an XFS file system”.
- A storage device where you can restore the backup.
Procedure
The command to restore the backup varies depending on whether you are restoring from a full backup or an incremental one, or are restoring multiple backups from a single tape device:
# xfsrestore [-r] [-S session-id] [-L session-label] [-i] -f backup-location restoration-path
- Replace backup-location with the location of the backup. This can be a regular file, a tape drive, or a remote tape device. For example, /backup-files/Data.xfsdump for a file or /dev/st0 for a tape drive.
- Replace restoration-path with the path to the directory where you want to restore the file system: for example, /mnt/data/.
- To restore a file system from an incremental (level 1 to level 9) backup, add the -r option.
- To restore a backup from a tape device that contains multiple backups, specify the backup by using the -S or -L option.
  The -S option lets you choose a backup by its session ID, while the -L option lets you choose it by its session label. To obtain the session IDs and session labels, use the xfsrestore -I command.
  Replace session-id with the session ID of the backup: for example, b74a3586-e52e-4a4a-8775-c3334fa8ea2c. Replace session-label with the session label of the backup: for example, my_backup_session_label.
- To use xfsrestore interactively, use the -i option.
  The interactive dialog begins after xfsrestore finishes reading the specified device. Available commands in the interactive xfsrestore shell include cd, ls, add, delete, and extract; for a complete list of commands, use the help command.
Example 11.3. Restoring Multiple XFS File Systems
To restore the XFS backup files and save their content into directories under /mnt/:
# xfsrestore -f /backup-files/boot.xfsdump /mnt/boot/
# xfsrestore -f /backup-files/data.xfsdump /mnt/data/
To restore from a tape device containing multiple backups, specify each backup by its session label or session ID:
# xfsrestore -L "backup_boot" -f /dev/st0 /mnt/boot/
# xfsrestore -S "45e9af35-efd2-4244-87bc-4762e476cbab" \
  -f /dev/st0 /mnt/data/
Additional resources
-
The
xfsrestore(8)
man page.
11.4.3. Informational messages when restoring an XFS backup from a tape
When restoring a backup from a tape with backups from multiple file systems, the xfsrestore
utility might issue messages. The messages inform you whether a match of the requested backup has been found when xfsrestore
examines each backup on the tape in sequential order. For example:
xfsrestore: preparing drive
xfsrestore: examining media file 0
xfsrestore: inventory session uuid (8590224e-3c93-469c-a311-fc8f23029b2a) does not match the media header's session uuid (7eda9f86-f1e9-4dfd-b1d4-c50467912408)
xfsrestore: examining media file 1
xfsrestore: inventory session uuid (8590224e-3c93-469c-a311-fc8f23029b2a) does not match the media header's session uuid (7eda9f86-f1e9-4dfd-b1d4-c50467912408)
[...]
The informational messages keep appearing until the matching backup is found.
11.4.4. Additional resources
-
The
xfsrestore(8)
man page.
11.5. Increasing the size of an XFS file system
As a system administrator, you can increase the size of an XFS file system to utilize larger storage capacity.
It is not currently possible to decrease the size of XFS file systems.
11.5.1. Increasing the size of an XFS file system with xfs_growfs
This procedure describes how to grow an XFS file system using the xfs_growfs
utility.
Prerequisites
- Ensure that the underlying block device is large enough to hold the resized file system. Use the appropriate resizing methods for the affected block device.
- Mount the XFS file system.
Procedure
While the XFS file system is mounted, use the xfs_growfs utility to increase its size:
# xfs_growfs file-system -D new-size
- Replace file-system with the mount point of the XFS file system.
- With the -D option, replace new-size with the desired new size of the file system, specified in the number of file system blocks.
  To find out the block size, in bytes, of a given XFS file system, use the xfs_info utility:
  # xfs_info block-device
  ...
  data  =  bsize=4096
  ...
- Without the -D option, xfs_growfs grows the file system to the maximum size supported by the underlying device. See the example after this procedure.
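For example, a brief sketch, assuming a hypothetical XFS file system mounted on /mnt/data with a 4096-byte block size that you want to grow to 20 GiB (5242880 blocks of 4096 bytes):
# xfs_info /mnt/data | grep bsize
# xfs_growfs /mnt/data -D 5242880
To grow the file system to fill the whole underlying device instead, omit the -D option:
# xfs_growfs /mnt/data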
Additional resources
-
The
xfs_growfs(8)
man page.
11.6. Comparison of tools used with ext4 and XFS
This section compares which tools to use to accomplish common tasks on the ext4 and XFS file systems.
Task | ext4 | XFS |
---|---|---|
Create a file system | mkfs.ext4 | mkfs.xfs |
File system check | e2fsck | xfs_repair |
Resize a file system | resize2fs | xfs_growfs |
Save an image of a file system | e2image | xfs_metadump and xfs_mdrestore |
Label or tune a file system | tune2fs | xfs_admin |
Back up a file system | dump and restore | xfsdump and xfsrestore |
Quota management | quota | xfs_quota |
File mapping | filefrag | xfs_bmap |
Chapter 12. Configuring XFS error behavior
You can configure how an XFS file system behaves when it encounters different I/O errors.
12.1. Configurable error handling in XFS
The XFS file system responds in one of the following ways when an error occurs during an I/O operation:
- XFS repeatedly retries the I/O operation until the operation succeeds or XFS reaches a set limit.
  The limit is based either on a maximum number of retries or a maximum time for retries.
- XFS considers the error permanent and stops the operation on the file system.
You can configure how XFS reacts to the following error conditions:
EIO
- Error when reading or writing
ENOSPC
- No space left on the device
ENODEV
- Device cannot be found
You can set the maximum number of retries and the maximum time in seconds until XFS considers an error permanent. XFS stops retrying the operation when it reaches either of the limits.
You can also configure XFS so that when unmounting a file system, XFS immediately cancels the retries regardless of any other configuration. This configuration enables the unmount operation to succeed despite persistent errors.
Default behavior
The default behavior for each XFS error condition depends on the error context. Some XFS errors such as ENODEV
are considered to be fatal and unrecoverable, regardless of the retry count. Their default retry limit is 0.
12.2. Configuration files for specific and undefined XFS error conditions
The following directories store configuration files that control XFS error behavior for different error conditions:
/sys/fs/xfs/device/error/metadata/EIO/
-
For the
EIO
error condition /sys/fs/xfs/device/error/metadata/ENODEV/
-
For the
ENODEV
error condition /sys/fs/xfs/device/error/metadata/ENOSPC/
-
For the
ENOSPC
error condition /sys/fs/xfs/device/error/default/
- Common configuration for all other, undefined error conditions
Each directory contains the following configuration files for configuring retry limits:
max_retries
- Controls the maximum number of times that XFS retries the operation.
retry_timeout_seconds
- Specifies the time limit in seconds after which XFS stops retrying the operation.
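A quick way to see this layout on a running system is to list one of the condition directories; a brief sketch, assuming the XFS file system sits on the hypothetical device /dev/sda:
# ls /sys/fs/xfs/sda/error/metadata/EIO/
max_retries  retry_timeout_seconds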
12.3. Setting XFS behavior for specific conditions
This procedure configures how XFS reacts to specific error conditions.
Procedure
Set the maximum number of retries, the retry time limit, or both:
To set the maximum number of retries, write the desired number to the
max_retries
file:# echo value > /sys/fs/xfs/device/error/metadata/condition/max_retries
To set the time limit, write the desired number of seconds to the retry_timeout_seconds file:
# echo value > /sys/fs/xfs/device/error/metadata/condition/retry_timeout_seconds
value is a number between -1 and the maximum possible value of the C signed integer type. This is 2147483647 on 64-bit Linux.
In both limits, the value -1 is used for continuous retries and 0 to stop immediately.
device is the name of the device, as found in the /dev/ directory; for example, sda.
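For example, a minimal sketch that limits EIO metadata retries for a file system on the hypothetical device /dev/sda to at most 5 attempts and 60 seconds:
# echo 5 > /sys/fs/xfs/sda/error/metadata/EIO/max_retries
# echo 60 > /sys/fs/xfs/sda/error/metadata/EIO/retry_timeout_seconds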
12.4. Setting XFS behavior for undefined conditions
This procedure configures how XFS reacts to all undefined error conditions, which share a common configuration.
Procedure
Set the maximum number of retries, the retry time limit, or both:
To set the maximum number of retries, write the desired number to the
max_retries
file:# echo value > /sys/fs/xfs/device/error/metadata/default/max_retries
To set the time limit, write the desired number of seconds to the
retry_timeout_seconds
file:# echo value > /sys/fs/xfs/device/error/metadata/default/retry_timeout_seconds
value is a number between -1 and the maximum possible value of the C signed integer type. This is 2147483647 on 64-bit Linux.
In both limits, the value -1 is used for continuous retries and 0 to stop immediately.
device is the name of the device, as found in the /dev/ directory; for example, sda.
12.5. Setting the XFS unmount behavior
This procedure configures how XFS reacts to error conditions when unmounting the file system.
If you set the fail_at_unmount
option in the file system, it overrides all other error configurations during unmount, and immediately unmounts the file system without retrying the I/O operation. This allows the unmount operation to succeed even in case of persistent errors.
You cannot change the fail_at_unmount
value after the unmount process starts, because the unmount process removes the configuration files from the sysfs
interface for the respective file system. You must configure the unmount behavior before the file system starts unmounting.
Procedure
Enable or disable the
fail_at_unmount
option:To cancel retrying all operations when the file system unmounts, enable the option:
# echo 1 > /sys/fs/xfs/device/error/fail_at_unmount
To respect the
max_retries
andretry_timeout_seconds
retry limits when the file system unmounts, disable the option:# echo 0 > /sys/fs/xfs/device/error/fail_at_unmount
device is the name of the device, as found in the /dev/ directory; for example, sda.
Chapter 13. Checking and repairing a file system
RHEL provides file system administration utilities which are capable of checking and repairing file systems. These tools are often referred to as fsck
tools, where fsck
is a shortened version of file system check. In most cases, these utilities are run automatically during system boot, if needed, but can also be manually invoked if required.
File system checkers guarantee only metadata consistency across the file system. They have no awareness of the actual data contained within the file system and are not data recovery tools.
13.1. Scenarios that require a file system check
The relevant fsck
tools can be used to check your system if any of the following occurs:
- System fails to boot
- Files on a specific disk become corrupt
- The file system shuts down or changes to read-only due to inconsistencies
- A file on the file system is inaccessible
File system inconsistencies can occur for various reasons, including but not limited to hardware errors, storage administration errors, and software bugs.
File system check tools cannot repair hardware problems. A file system must be fully readable and writable if repair is to operate successfully. If a file system was corrupted due to a hardware error, the file system must first be moved to a good disk, for example with the dd(8)
utility.
For journaling file systems, all that is normally required at boot time is to replay the journal, if necessary, which is usually a very short operation.
However, if a file system inconsistency or corruption occurs, even for journaling file systems, then the file system checker must be used to repair the file system.
It is possible to disable file system check at boot by setting the sixth field in /etc/fstab
to 0
. However, Red Hat does not recommend doing so unless you are having issues with fsck
at boot time, for example with extremely large or remote file systems.
Additional resources
-
The
fstab(5)
man page. -
The
fsck(8)
man page. -
The
dd(8)
man page.
13.2. Potential side effects of running fsck
Generally, running the file system check and repair tool can be expected to automatically repair at least some of the inconsistencies it finds. In some cases, the following issues can arise:
- Severely damaged inodes or directories may be discarded if they cannot be repaired.
- Significant changes to the file system may occur.
To ensure that unexpected or undesirable changes are not permanently made, ensure you follow any precautionary steps outlined in the procedure.
13.3. Error-handling mechanisms in XFS
This section describes how XFS handles various kinds of errors in the file system.
Unclean unmounts
Journalling maintains a transactional record of metadata changes that happen on the file system.
In the event of a system crash, power failure, or other unclean unmount, XFS uses the journal (also called log) to recover the file system. The kernel performs journal recovery when mounting the XFS file system.
Corruption
In this context, corruption means errors on the file system caused by, for example:
- Hardware faults
- Bugs in storage firmware, device drivers, the software stack, or the file system itself
- Problems that cause parts of the file system to be overwritten by something outside of the file system
When XFS detects corruption in the file system or the file-system metadata, it may shut down the file system and report the incident in the system log. Note that if the corruption occurred on the file system hosting the /var
directory, these logs will not be available after a reboot.
Example 13.1. System log entry reporting an XFS corruption
# dmesg --notime | tail -15
XFS (loop0): Mounting V5 Filesystem
XFS (loop0): Metadata CRC error detected at xfs_agi_read_verify+0xcb/0xf0 [xfs], xfs_agi block 0x2
XFS (loop0): Unmount and run xfs_repair
XFS (loop0): First 128 bytes of corrupted metadata buffer:
00000000027b3b56: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000005f9abc7a: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000005b0aef35: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000da9d2ded: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000001e265b07: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000006a40df69: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000000b272907: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000e484aac5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
XFS (loop0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x2 len 1 error 74
XFS (loop0): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -117, agno 0
XFS (loop0): Failed to read root inode 0x80, error 11
User-space utilities usually report the Input/output error message when trying to access a corrupted XFS file system. Mounting an XFS file system with a corrupted log results in a failed mount and the following error message:
mount: /mount-point: mount(2) system call failed: Structure needs cleaning.
You must manually use the xfs_repair
utility to repair the corruption.
Additional resources
-
The
xfs_repair(8)
man page provides a detailed list of XFS corruption checks.
13.4. Checking an XFS file system with xfs_repair
This procedure performs a read-only check of an XFS file system using the xfs_repair
utility. You must manually use the xfs_repair
utility to repair any corruption. Unlike other file system repair utilities, xfs_repair
does not run at boot time, even when an XFS file system was not cleanly unmounted. In the event of an unclean unmount, XFS simply replays the log at mount time, ensuring a consistent file system; xfs_repair
cannot repair an XFS file system with a dirty log without remounting it first.
Although an fsck.xfs
binary is present in the xfsprogs
package, this is present only to satisfy initscripts
that look for an fsck.file
system binary at boot time. fsck.xfs
immediately exits with an exit code of 0.
Procedure
Replay the log by mounting and unmounting the file system:
# mount file-system
# umount file-system
Note: If the mount fails with a structure needs cleaning error, the log is corrupted and cannot be replayed. The dry run should discover and report more on-disk corruption as a result.
Use the xfs_repair utility to perform a dry run to check the file system. Any errors found and the actions that would be taken are printed, without modifying the file system.
# xfs_repair -n block-device
Mount the file system:
# mount file-system
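For example, a minimal sketch of the whole check, assuming the XFS file system is on the hypothetical device /dev/sdb1 and is normally mounted on /mnt/data:
# mount /dev/sdb1 /mnt/data
# umount /mnt/data
# xfs_repair -n /dev/sdb1
# mount /dev/sdb1 /mnt/data
The initial mount and unmount replay the log; the xfs_repair -n step only reports problems and does not modify the file system.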
Additional resources
-
The
xfs_repair(8)
man page. -
The
xfs_metadump(8)
man page.
13.5. Repairing an XFS file system with xfs_repair
This procedure repairs a corrupted XFS file system using the xfs_repair
utility.
Procedure
Create a metadata image prior to repair for diagnostic or testing purposes using the
xfs_metadump
utility. A pre-repair file system metadata image can be useful for support investigations if the corruption is due to a software bug. Patterns of corruption present in the pre-repair image can aid in root-cause analysis.Use the
xfs_metadump
debugging tool to copy the metadata from an XFS file system to a file. The resultingmetadump
file can be compressed using standard compression utilities to reduce the file size if largemetadump
files need to be sent to support.# xfs_metadump block-device metadump-file
Replay the log by remounting the file system:
# mount file-system
# umount file-system
Use the
xfs_repair
utility to repair the unmounted file system:If the mount succeeded, no additional options are required:
# xfs_repair block-device
If the mount failed with the Structure needs cleaning error, the log is corrupted and cannot be replayed. Use the -L option (force log zeroing) to clear the log:
Warning: This command causes all metadata updates in progress at the time of the crash to be lost, which might cause significant file system damage and data loss. This should be used only as a last resort if the log cannot be replayed.
# xfs_repair -L block-device
Mount the file system:
# mount file-system
Additional resources
-
The
xfs_repair(8)
man page.
13.6. Error handling mechanisms in ext2, ext3, and ext4
The ext2, ext3, and ext4 file systems use the e2fsck
utility to perform file system checks and repairs. The file names fsck.ext2
, fsck.ext3
, and fsck.ext4
are hardlinks to the e2fsck
utility. These binaries are run automatically at boot time and their behavior differs based on the file system being checked and the state of the file system.
A full file system check and repair is invoked for ext2, which is not a metadata journaling file system, and for ext4 file systems without a journal.
For ext3 and ext4 file systems with metadata journaling, the journal is replayed in userspace and the utility exits. This is the default action because journal replay ensures a consistent file system after a crash.
If these file systems encounter metadata inconsistencies while mounted, they record this fact in the file system superblock. If e2fsck
finds that a file system is marked with such an error, e2fsck
performs a full check after replaying the journal (if present).
Additional resources
-
The
fsck(8)
man page. -
The
e2fsck(8)
man page.
13.7. Checking an ext2, ext3, or ext4 file system with e2fsck
This procedure checks an ext2, ext3, or ext4 file system using the e2fsck
utility.
Procedure
Replay the log by remounting the file system:
# mount file-system
# umount file-system
Perform a dry run to check the file system.
# e2fsck -n block-device
Note: Any errors found and the actions that would be taken are printed, without modifying the file system. Later phases of consistency checking might print extra errors as the utility discovers inconsistencies that would have been fixed in earlier phases if it were running in repair mode.
Additional resources
-
The
e2image(8)
man page. -
The
e2fsck(8)
man page.
13.8. Repairing an ext2, ext3, or ext4 file system with e2fsck
This procedure repairs a corrupted ext2, ext3, or ext4 file system using the e2fsck
utility.
Procedure
Save a file system image for support investigations. A pre-repair file system metadata image can be useful for support investigations if the corruption is due to a software bug. Patterns of corruption present in the pre-repair image can aid in root-cause analysis.
Note: Severely damaged file systems may cause problems with metadata image creation.
If you are creating the image for testing purposes, use the
-r
option to create a sparse file of the same size as the file system itself.e2fsck
can then operate directly on the resulting file.# e2image -r block-device image-file
If you are creating the image to be archived or provided for diagnostic, use the
-Q
option, which creates a more compact file format suitable for transfer.# e2image -Q block-device image-file
Replay the log by remounting the file system:
# mount file-system
# umount file-system
Automatically repair the file system. If user intervention is required, e2fsck indicates the unfixed problem in its output and reflects this status in the exit code.
# e2fsck -p block-device
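For example, a brief end-to-end sketch, assuming the ext4 file system is on the hypothetical device /dev/sdc1 and is normally mounted on /mnt:
# e2image -r /dev/sdc1 /var/tmp/sdc1.e2i
# mount /dev/sdc1 /mnt
# umount /mnt
# e2fsck -p /dev/sdc1
The e2image step preserves a pre-repair metadata image for later analysis; the mount and unmount replay the journal before the automatic repair.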
Additional resources
-
The
e2image(8)
man page. -
The
e2fsck(8)
man page.
Chapter 14. Mounting file systems
As a system administrator, you can mount file systems on your system to access data on them.
14.1. The Linux mount mechanism
This section explains basic concepts of mounting file systems on Linux.
On Linux, UNIX, and similar operating systems, file systems on different partitions and removable devices (CDs, DVDs, or USB flash drives for example) can be attached to a certain point (the mount point) in the directory tree, and then detached again. While a file system is mounted on a directory, the original content of the directory is not accessible.
Note that Linux does not prevent you from mounting a file system to a directory with a file system already attached to it.
When mounting, you can identify the device by:
-
a universally unique identifier (UUID): for example,
UUID=34795a28-ca6d-4fd8-a347-73671d0c19cb
-
a volume label: for example,
LABEL=home
-
a full path to a non-persistent block device: for example,
/dev/sda3
When you mount a file system using the mount
command without all required information, that is without the device name, the target directory, or the file system type, the mount
utility reads the content of the /etc/fstab
file to check if the given file system is listed there. The /etc/fstab
file contains a list of device names and the directories in which the selected file systems are set to be mounted as well as the file system type and mount options. Therefore, when mounting a file system that is specified in /etc/fstab
, the following command syntax is sufficient:
Mounting by the mount point:
# mount directory
Mounting by the block device:
# mount device
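For example, assuming that /etc/fstab already contains an entry mapping /dev/sda1 to the /boot directory, either of the following hypothetical commands mounts it:
# mount /boot
# mount /dev/sda1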
Additional resources
-
The
mount(8)
man page. - For information on how to list persistent naming attributes such as the UUID, see Section 9.6, “Listing persistent naming attributes”.
14.2. Listing currently mounted file systems
This procedure describes how to list all currently mounted file systems on the command line.
Procedure
To list all mounted file systems, use the
findmnt
utility:$ findmnt
To limit the listed file systems only to a certain file system type, add the
--types
option:$ findmnt --types fs-type
For example:
Example 14.1. Listing only XFS file systems
$ findmnt --types xfs
TARGET  SOURCE                                                FSTYPE OPTIONS
/       /dev/mapper/luks-5564ed00-6aac-4406-bfb4-c59bf5de48b5 xfs    rw,relatime
├─/boot /dev/sda1                                             xfs    rw,relatime
└─/home /dev/mapper/luks-9d185660-7537-414d-b727-d92ea036051e xfs    rw,relatime
Additional resources
-
The
findmnt(8)
man page.
14.3. Mounting a file system with mount
This procedure describes how to mount a file system using the mount
utility.
Prerequisites
Make sure that no file system is already mounted on your chosen mount point:
$ findmnt mount-point
Procedure
To attach a certain file system, use the
mount
utility:# mount device mount-point
Example 14.2. Mounting an XFS file system
For example, to mount a local XFS file system identified by UUID:
# mount UUID=ea74bbec-536d-490c-b8d9-5b40bbd7545b /mnt/data
If
mount
cannot recognize the file system type automatically, specify it using the--types
option:# mount --types type device mount-point
Example 14.3. Mounting an NFS file system
For example, to mount a remote NFS file system:
# mount --types nfs4 host:/remote-export /mnt/nfs
Additional resources
-
The
mount(8)
man page.
14.4. Moving a mount point
This procedure describes how to change the mount point of a mounted file system to a different directory.
Procedure
To change the directory in which a file system is mounted:
# mount --move old-directory new-directory
Example 14.4. Moving a home file system
For example, to move the file system mounted in the
/mnt/userdirs/
directory to the/home/
mount point:# mount --move /mnt/userdirs /home
Verify that the file system has been moved as expected:
$ findmnt
$ ls old-directory
$ ls new-directory
Additional resources
-
The
mount(8)
man page.
14.5. Unmounting a file system with umount
This procedure describes how to unmount a file system using the umount
utility.
Procedure
Try unmounting the file system using either of the following commands:
By mount point:
# umount mount-point
By device:
# umount device
If the command fails with an error similar to the following, it means that the file system is in use because a process is using resources on it:
umount: /run/media/user/FlashDrive: target is busy.
If the file system is in use, use the fuser utility to determine which processes are accessing it. For example:
$ fuser --mount /run/media/user/FlashDrive
/run/media/user/FlashDrive: 18351
Afterwards, terminate the processes using the file system and try unmounting it again.
14.6. Common mount options
This section lists some commonly used options of the mount
utility.
You can use these options in the following syntax:
# mount --options option1,option2,option3 device mount-point
Table 14.1. Common mount options
Option | Description |
---|---|
async | Enables asynchronous input and output operations on the file system. |
auto | Enables the file system to be mounted automatically using the mount -a command. |
defaults | Provides an alias for the async,auto,dev,exec,nouser,rw,suid options. |
exec | Allows the execution of binary files on the particular file system. |
loop | Mounts an image as a loop device. |
noauto | Default behavior disables the automatic mount of the file system using the mount -a command. |
noexec | Disallows the execution of binary files on the particular file system. |
nouser | Disallows an ordinary user (that is, other than root) to mount and unmount the file system. |
remount | Remounts the file system in case it is already mounted. |
ro | Mounts the file system for reading only. |
rw | Mounts the file system for both reading and writing. |
user | Allows an ordinary user (that is, other than root) to mount and unmount the file system. |
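For example, to remount an already mounted file system read-only, combine the remount and ro options; /mnt/data here is a hypothetical mount point:
# mount --options remount,ro /mnt/data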
14.7. Sharing a mount on multiple mount points
As a system administrator, you can duplicate mount points to make the file systems accessible from multiple directories.
14.7.2. Creating a private mount point duplicate
This procedure duplicates a mount point as a private mount. File systems that you later mount under the duplicate or the original mount point are not reflected in the other.
Procedure
Create a virtual file system (VFS) node from the original mount point:
# mount --bind original-dir original-dir
Mark the original mount point as private:
# mount --make-private original-dir
Alternatively, to change the mount type for the selected mount point and all mount points under it, use the
--make-rprivate
option instead of--make-private
.Create the duplicate:
# mount --bind original-dir duplicate-dir
Example 14.5. Duplicating /media into /mnt as a private mount point
Create a VFS node from the
/media
directory:# mount --bind /media /media
Mark the
/media
directory as private:# mount --make-private /media
Create its duplicate in
/mnt
:# mount --bind /media /mnt
It is now possible to verify that /media and /mnt share content but none of the mounts within /media appear in /mnt. For example, if the CD-ROM drive contains non-empty media and the /media/cdrom/ directory exists, use:
# mount /dev/cdrom /media/cdrom
# ls /media/cdrom
EFI  GPL  isolinux  LiveOS
# ls /mnt/cdrom
#
It is also possible to verify that file systems mounted in the /mnt directory are not reflected in /media. For instance, if a non-empty USB flash drive that uses the /dev/sdc1 device is plugged in and the /mnt/flashdisk/ directory is present, use:
# mount /dev/sdc1 /mnt/flashdisk
# ls /media/flashdisk
# ls /mnt/flashdisk
en-US  publican.cfg
Additional resources
-
The
mount(8)
man page.
14.7.4. Creating a slave mount point duplicate
This procedure duplicates a mount point as a slave mount. File systems that you later mount under the original mount point are reflected in the duplicate but not the other way around.
Procedure
Create a virtual file system (VFS) node from the original mount point:
# mount --bind original-dir original-dir
Mark the original mount point as shared:
# mount --make-shared original-dir
Alternatively, to change the mount type for the selected mount point and all mount points under it, use the
--make-rshared
option instead of--make-shared
.Create the duplicate and mark it as slave:
# mount --bind original-dir duplicate-dir
# mount --make-slave duplicate-dir
Example 14.7. Duplicating /media into /mnt as a slave mount point
This example shows how to get the content of the /media
directory to appear in /mnt
as well, but without any mounts in the /mnt
directory to be reflected in /media
.
Create a VFS node from the
/media
directory:# mount --bind /media /media
Mark the
/media
directory as shared:# mount --make-shared /media
Create its duplicate in /mnt and mark it as slave:
# mount --bind /media /mnt
# mount --make-slave /mnt
Verify that a mount within /media also appears in /mnt. For example, if the CD-ROM drive contains non-empty media and the /media/cdrom/ directory exists, use:
# mount /dev/cdrom /media/cdrom
# ls /media/cdrom
EFI  GPL  isolinux  LiveOS
# ls /mnt/cdrom
EFI  GPL  isolinux  LiveOS
Also verify that file systems mounted in the /mnt directory are not reflected in /media. For instance, if a non-empty USB flash drive that uses the /dev/sdc1 device is plugged in and the /mnt/flashdisk/ directory is present, use:
# mount /dev/sdc1 /mnt/flashdisk
# ls /media/flashdisk
# ls /mnt/flashdisk
en-US  publican.cfg
Additional resources
-
The
mount(8)
man page.
14.7.5. Preventing a mount point from being duplicated
This procedure marks a mount point as unbindable so that it is not possible to duplicate it in another mount point.
Procedure
To change the type of a mount point to an unbindable mount, use:
# mount --bind mount-point mount-point
# mount --make-unbindable mount-point
Alternatively, to change the mount type for the selected mount point and all mount points under it, use the
--make-runbindable
option instead of--make-unbindable
.Any subsequent attempt to make a duplicate of this mount fails with the following error:
# mount --bind mount-point duplicate-dir
mount: wrong fs type, bad option, bad superblock on mount-point,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try dmesg | tail or so
Example 14.8. Preventing /media from being duplicated
To prevent the /media directory from being shared, use:
# mount --bind /media /media
# mount --make-unbindable /media
Additional resources
-
The
mount(8)
man page.
14.8. Persistently mounting file systems
As a system administrator, you can persistently mount file systems to configure non-removable storage.
14.8.1. The /etc/fstab file
This section describes the /etc/fstab
configuration file, which controls persistent mount points of file systems. Using /etc/fstab
is the recommended way to persistently mount file systems.
Each line in the /etc/fstab
file defines a mount point of a file system. It includes six fields separated by white space:
-
The block device identified by a persistent attribute or a path in the /dev directory.
- The file system on the device.
-
Mount options for the file system. The option
defaults
means that the partition is mounted at boot time with default options. This section also recognizessystemd
mount unit options in thex-systemd.option
format. -
Backup option for the
dump
utility. -
Check order for the
fsck
utility.
Example 14.9. The /boot
file system in /etc/fstab
Block device | Mount point | File system | Options | Backup | Check |
---|---|---|---|---|---|
UUID=ea74bbec-536d-490c-b8d9-5b40bbd7545b | /boot | xfs | defaults | 0 | 0 |
The systemd
service automatically generates mount units from entries in /etc/fstab
.
Additional resources
-
The
fstab(5)
man page. -
The fstab section of the
systemd.mount(5)
man page.
14.8.2. Adding a file system to /etc/fstab
This procedure describes how to configure a persistent mount point for a file system in the
configuration file.
Procedure
Find out the UUID attribute of the file system:
$ lsblk --fs storage-device
For example:
Example 14.10. Viewing the UUID of a partition
$ lsblk --fs /dev/sda1
NAME FSTYPE LABEL UUID                                 MOUNTPOINT
sda1 xfs    Boot  ea74bbec-536d-490c-b8d9-5b40bbd7545b /boot
If the mount point directory does not exist, create it:
# mkdir --parents mount-point
As root, edit the
/etc/fstab
file and add a line for the file system, identified by the UUID.For example:
Example 14.11. The /boot mount point in /etc/fstab
UUID=ea74bbec-536d-490c-b8d9-5b40bbd7545b /boot xfs defaults 0 0
Regenerate mount units so that your system registers the new configuration:
# systemctl daemon-reload
Try mounting the file system to verify that the configuration works:
# mount mount-point
Additional resources
- Other persistent attributes that you can use to identify the file system: Section 9.3, “Device names managed by the udev mechanism in /dev/disk/”
14.8.3. Persistently mounting a file system using RHEL System Roles
This section describes how to persistently mount a file system using the storage
role.
Prerequisites
An Ansible playbook that uses the
storage
role exists.For information on how to apply such a playbook, see Applying a role.
14.8.3.1. Example Ansible playbook to persistently mount a file system
This section provides an example Ansible playbook. This playbook applies the storage
role to immediately and persistently mount an XFS file system.
Example 14.12. A playbook that mounts a file system on /dev/sdb to /mnt/data
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: xfs
        mount_point: /mnt/data
  roles:
    - rhel-system-roles.storage
-
This playbook adds the file system to the
/etc/fstab
file, and mounts the file system immediately. -
If the file system on the
/dev/sdb
device or the mount point directory do not exist, the playbook creates them.
Additional resources
-
For details about the parameters used in the
storage
system role, see the/usr/share/ansible/roles/rhel-system-roles.storage/README.md
file.
14.8.3.2. Additional resources
-
For more information about the
storage
role, see Section 2.1, “Introduction to the storage role”.
14.9. Mounting file systems on demand
As a system administrator, you can configure file systems, such as NFS, to mount automatically on demand.
14.9.1. The autofs service
This section explains the benefits and basic concepts of the autofs
service, used to mount file systems on demand.
One drawback of permanent mounting using the /etc/fstab
configuration is that, regardless of how infrequently a user accesses the mounted file system, the system must dedicate resources to keep the mounted file system in place. This might affect system performance when, for example, the system is maintaining NFS mounts to many systems at one time.
An alternative to /etc/fstab
is to use the kernel-based autofs
service. It consists of the following components:
- A kernel module that implements a file system, and
- A user-space service that performs all of the other functions.
The autofs
service can mount and unmount file systems automatically (on-demand), therefore saving system resources. It can be used to mount file systems such as NFS, AFS, SMBFS, CIFS, and local file systems.
Additional resources
-
The
autofs(8)
man page.
14.9.2. The autofs configuration files
This section describes the usage and syntax of configuration files used by the autofs
service.
The master map file
The autofs
service uses /etc/auto.master
(master map) as its default primary configuration file. This can be changed to use another supported network source and name using the autofs
configuration in the /etc/autofs.conf
configuration file in conjunction with the Name Service Switch (NSS) mechanism.
All on-demand mount points must be configured in the master map. Mount point, host name, exported directory, and options can all be specified in a set of files (or other supported network sources) rather than configuring them manually for each host.
The master map file lists mount points controlled by autofs
, and their corresponding configuration files or network sources known as automount maps. The format of the master map is as follows:
mount-point map-name options
The variables used in this format are:
- mount-point
-
The
autofs
mount point; for example,/mnt/data/
- map-name
- The name of the map source file, which contains a list of mount points and the file system location from which those mount points should be mounted.
- options
- If supplied, these apply to all entries in the given map, if they do not themselves have options specified.
Example 14.13. The /etc/auto.master file
The following is a sample line from /etc/auto.master
file:
/mnt/data /etc/auto.data
Map files
Map files configure the properties of individual on-demand mount points.
The automounter creates the directories if they do not exist. If the directories existed before the automounter was started, the automounter does not remove them when it exits. If a timeout is specified, the directory is automatically unmounted if it is not accessed for the timeout period.
The general format of maps is similar to the master map. However, the options field appears between the mount point and the location instead of at the end of the entry as in the master map:
mount-point options location
The variables used in this format are:
- mount-point
-
This refers to the
autofs
mount point. This can be a single directory name for an indirect mount or the full path of the mount point for direct mounts. Each direct and indirect map entry key (mount-point) can be followed by a space separated list of offset directories (subdirectory names each beginning with/
) making them what is known as a multi-mount entry. - options
- When supplied, these are the mount options for the map entries that do not specify their own options. This field is optional.
- location
-
This refers to the file system location such as a local file system path (preceded with the Sun map format escape character
:
for map names beginning with/
), an NFS file system or other valid file system location.
Example 14.14. A map file
The following is a sample from a map file; for example, /etc/auto.misc
:
payroll -fstype=nfs4 personnel:/dev/disk/by-uuid/52b94495-e106-4f29-b868-fe6f6c2789b1
sales -fstype=xfs :/dev/disk/by-uuid/5564ed00-6aac-4406-bfb4-c59bf5de48b5
The first column in the map file indicates the autofs
mount point: sales
and payroll
from the server called personnel
. The second column indicates the options for the autofs
mount. The third column indicates the source of the mount.
Following the given configuration, the autofs
mount points will be /home/payroll
and /home/sales
. The -fstype=
option is often omitted and is generally not needed for correct operation.
Using the given configuration, if a process requires access to an autofs
unmounted directory such as /home/payroll/2006/July.sxc
, the autofs
service automatically mounts the directory.
The amd map format
The autofs
service recognizes map configuration in the amd
format as well. This is useful if you want to reuse existing automounter configuration written for the am-utils
service, which has been removed from Red Hat Enterprise Linux.
However, Red Hat recommends using the simpler autofs
format described in the previous sections.
Additional resources
-
The
autofs(5)
,autofs.conf(5)
, andauto.master(5)
man pages. -
For details on the
amd
map format, see the/usr/share/doc/autofs/README.amd-maps
file, which is provided by theautofs
package.
14.9.3. Configuring autofs mount points
This procedure describes how to configure on-demand mount points using the autofs
service.
Prerequisites
Install the
autofs
package:# yum install autofs
Start and enable the
autofs
service:# systemctl enable --now autofs
Procedure
-
Create a map file for the on-demand mount point, located at
/etc/auto.identifier
. Replace identifier with a name that identifies the mount point. - In the map file, fill in the mount point, options, and location fields as described in Section 14.9.2, “The autofs configuration files”.
- Register the map file in the master map file, as described in Section 14.9.2, “The autofs configuration files”.
Try accessing content in the on-demand directory:
$ ls automounted-directory
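For example, a brief sketch of an on-demand NFS mount, assuming a hypothetical export nfs.example.com:/exports/projects that should appear as /misc/projects:
# cat /etc/auto.projects
projects -fstype=nfs4 nfs.example.com:/exports/projects
# grep /misc /etc/auto.master
/misc /etc/auto.projects
# systemctl reload autofs
$ ls /misc/projects
Accessing /misc/projects triggers the automatic mount of the NFS export.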
14.9.4. Automounting NFS server user home directories with autofs service
This procedure describes how to configure the autofs service to mount user home directories automatically.
Prerequisites
- The autofs package is installed.
- The autofs service is enabled and running.
Procedure
Specify the mount point and location of the map file by editing the
/etc/auto.master
file on a server on which you need to mount user home directories. To do so, add the following line into the/etc/auto.master
file:/home /etc/auto.home
Create a map file with the name of
/etc/auto.home
on a server on which you need to mount user home directories, and edit the file with the following parameters:* -fstype=nfs,rw,sync host.example.com:/home/&i
You can skip
fstype
parameter, as it isnfs
by default. For more information, seeautofs(5)
man page.Reload the
autofs
service:# systemctl reload autofs
14.9.5. Overriding or augmenting autofs site configuration files
It is sometimes useful to override site defaults for a specific mount point on a client system.
Example 14.15. Initial conditions
For example, consider the following conditions:
Automounter maps are stored in NIS and the
/etc/nsswitch.conf
file has the following directive:automount: files nis
The
auto.master
file contains:+auto.master
The NIS
auto.master
map file contains:/home auto.home
The NIS
auto.home
map contains:
beth fileserver.example.com:/export/home/beth
joe fileserver.example.com:/export/home/joe
* fileserver.example.com:/export/home/&
-
The file map
/etc/auto.home
does not exist.
Example 14.16. Mounting home directories from a different server
Given the preceding conditions, let’s assume that the client system needs to override the NIS map auto.home
and mount home directories from a different server.
In this case, the client needs to use the following
/etc/auto.master
map:
/home /etc/auto.home
+auto.master
The
/etc/auto.home
map contains the entry:* host.example.com:/export/home/&
Because the automounter only processes the first occurrence of a mount point, the /home
directory contains the content of /etc/auto.home
instead of the NIS auto.home
map.
Example 14.17. Augmenting auto.home with only selected entries
Alternatively, to augment the site-wide auto.home
map with just a few entries:
Create an
/etc/auto.home
file map, and in it put the new entries. At the end, include the NISauto.home
map. Then the/etc/auto.home
file map looks similar to:
mydir someserver:/export/mydir
+auto.home
With these NIS
auto.home
map conditions, listing the content of the/home
directory outputs:
$ ls /home
beth joe mydir
This last example works as expected because autofs
does not include the contents of a file map of the same name as the one it is reading. As such, autofs
moves on to the next map source in the nsswitch
configuration.
14.9.6. Using LDAP to store automounter maps
This procedure configures autofs
to store automounter maps in LDAP configuration rather than in autofs
map files.
Prerequisites
-
LDAP client libraries must be installed on all systems configured to retrieve automounter maps from LDAP. On Red Hat Enterprise Linux, the
openldap
package should be installed automatically as a dependency of theautofs
package.
Procedure
-
To configure LDAP access, modify the
/etc/openldap/ldap.conf
file. Ensure that theBASE
,URI
, andschema
options are set appropriately for your site. The most recently established schema for storing automount maps in LDAP is described by the
rfc2307bis
draft. To use this schema, set it in the/etc/autofs.conf
configuration file by removing the comment characters from the schema definition. For example:Example 14.18. Setting autofs configuration
DEFAULT_MAP_OBJECT_CLASS="automountMap"
DEFAULT_ENTRY_OBJECT_CLASS="automount"
DEFAULT_MAP_ATTRIBUTE="automountMapName"
DEFAULT_ENTRY_ATTRIBUTE="automountKey"
DEFAULT_VALUE_ATTRIBUTE="automountInformation"
Ensure that all other schema entries are commented in the configuration. The
automountKey
attribute replaces thecn
attribute in therfc2307bis
schema. Following is an example of an LDAP Data Interchange Format (LDIF) configuration:
Example 14.19. LDIF Configuration
# extended LDIF
#
# LDAPv3
# base <> with scope subtree
# filter: (&(objectclass=automountMap)(automountMapName=auto.master))
# requesting: ALL
#

# auto.master, example.com
dn: automountMapName=auto.master,dc=example,dc=com
objectClass: top
objectClass: automountMap
automountMapName: auto.master

# extended LDIF
#
# LDAPv3
# base <automountMapName=auto.master,dc=example,dc=com> with scope subtree
# filter: (objectclass=automount)
# requesting: ALL
#

# /home, auto.master, example.com
dn: automountMapName=auto.master,dc=example,dc=com
objectClass: automount
cn: /home
automountKey: /home
automountInformation: auto.home

# extended LDIF
#
# LDAPv3
# base <> with scope subtree
# filter: (&(objectclass=automountMap)(automountMapName=auto.home))
# requesting: ALL
#

# auto.home, example.com
dn: automountMapName=auto.home,dc=example,dc=com
objectClass: automountMap
automountMapName: auto.home

# extended LDIF
#
# LDAPv3
# base <automountMapName=auto.home,dc=example,dc=com> with scope subtree
# filter: (objectclass=automount)
# requesting: ALL
#

# foo, auto.home, example.com
dn: automountKey=foo,automountMapName=auto.home,dc=example,dc=com
objectClass: automount
automountKey: foo
automountInformation: filer.example.com:/export/foo

# /, auto.home, example.com
dn: automountKey=/,automountMapName=auto.home,dc=example,dc=com
objectClass: automount
automountKey: /
automountInformation: filer.example.com:/export/&
Additional resources
-
The
rfc2307bis
draft: https://tools.ietf.org/html/draft-howard-rfc2307bis.
14.10. Setting read-only permissions for the root file system
Sometimes, you need to mount the root file system (/
) with read-only permissions. Example use cases include enhancing security or ensuring data integrity after an unexpected system power-off.
14.10.1. Files and directories that always retain write permissions
For the system to function properly, some files and directories need to retain write permissions. When the root file system is mounted in read-only mode, these files are mounted in RAM using the tmpfs
temporary file system.
The default set of such files and directories is read from the /etc/rwtab
file, which contains:
dirs /var/cache/man
dirs /var/gdm
<content truncated>
empty /tmp
empty /var/cache/foomatic
<content truncated>
files /etc/adjtime
files /etc/ntp.conf
<content truncated>
Entries in the /etc/rwtab
file follow this format:
copy-method path
In this syntax:
- Replace copy-method with one of the keywords specifying how the file or directory is copied to tmpfs.
- Replace path with the path to the file or directory.
The /etc/rwtab
file recognizes the following ways in which a file or directory can be copied to tmpfs
:
empty
An empty path is copied to
tmpfs
. For example:empty /tmp
dirs
A directory tree is copied to
tmpfs
, empty. For example:dirs /var/run
files
A file or a directory tree is copied to
tmpfs
intact. For example:files /etc/resolv.conf
The same format applies when adding custom paths to /etc/rwtab.d/
.
14.10.2. Configuring the root file system to mount with read-only permissions on boot
With this procedure, the root file system is mounted read-only on all following boots.
Procedure
In the
/etc/sysconfig/readonly-root
file, set theREADONLY
option toyes
:# Set to 'yes' to mount the file systems as read-only. READONLY=yes
Add the
ro
option in the root entry (/
) in the/etc/fstab
file:/dev/mapper/luks-c376919e... / xfs x-systemd.device-timeout=0,ro 1 1
Add the
ro
option to theGRUB_CMDLINE_LINUX
directive in the/etc/default/grub
file and ensure that the directive does not containrw
:GRUB_CMDLINE_LINUX="rhgb quiet... ro"
Recreate the GRUB2 configuration file:
# grub2-mkconfig -o /boot/grub2/grub.cfg
If you need to add files and directories to be mounted with write permissions in the
tmpfs
file system, create a text file in the/etc/rwtab.d/
directory and put the configuration there.For example, to mount the
/etc/example/file
file with write permissions, add this line to the/etc/rwtab.d/example
file:files /etc/example/file
Important: Changes made to files and directories in
tmpfs
do not persist across boots.- Reboot the system to apply the changes.
Troubleshooting
If you mount the root file system with read-only permissions by mistake, you can remount it with read-and-write permissions again using the following command:
# mount -o remount,rw /
Chapter 15. Limiting storage space usage with quotas
You can restrict the amount of disk space available to users or groups by implementing disk quotas. You can also define a warning level at which system administrators are informed before a user consumes too much disk space or a partition becomes full.
15.1. Disk quotas
In most computing environments, disk space is not infinite. The quota subsystem provides a mechanism to control usage of disk space.
You can configure disk quotas for individual users as well as user groups on the local file systems. This makes it possible to manage the space allocated for user-specific files (such as email) separately from the space allocated to the projects that a user works on. The quota subsystem warns users when they exceed their allotted limit, but allows some extra space for current work (hard limit/soft limit).
If quotas are implemented, you need to check if the quotas are exceeded and make sure the quotas are accurate. If users repeatedly exceed their quotas or consistently reach their soft limits, a system administrator can either help the user determine how to use less disk space or increase the user’s disk quota.
You can set quotas to control:
- The number of consumed disk blocks.
- The number of inodes, which are data structures that contain information about files in UNIX file systems. Because inodes store file-related information, this allows control over the number of files that can be created.
15.1.1. The xfs_quota
tool
You can use the xfs_quota
tool to manage quotas on XFS file systems. In addition, you can use XFS file systems with limit enforcement turned off as an effective disk usage accounting system.
The XFS quota system differs from other file systems in a number of ways. Most importantly, XFS considers quota information as file system metadata and uses journaling to provide a higher level guarantee of consistency.
Additional resources
-
The
xfs_quota(8)
man page.
15.2. Managing XFS disk quotas
You can use the xfs_quota
tool to manage quotas in XFS and to configure limits for project-controlled directories.
Generic quota configuration tools (quota
, repquota
, and edquota
for example) may also be used to manipulate XFS quotas. However, these tools cannot be used with XFS project quotas.
Red Hat recommends the use of xfs_quota
over all other available tools.
15.2.1. File system quota management in XFS
The XFS quota subsystem manages limits on disk space (blocks) and file (inode) usage. XFS quotas control or report on usage of these items on a user, group, or directory or project level. Group and project quotas are only mutually exclusive on older non-default XFS disk formats.
When managing on a per-directory or per-project basis, XFS manages the disk usage of directory hierarchies associated with a specific project.
15.2.2. Enabling disk quotas for XFS
This procedure enables disk quotas for users, groups, and projects on an XFS file system. Once quotas are enabled, the xfs_quota
tool can be used to set limits and report on disk usage.
Procedure
Enable quotas for users:
# mount -o uquota /dev/xvdb1 /xfs
Replace
uquota
withuqnoenforce
to allow usage reporting without enforcing any limits.Enable quotas for groups:
# mount -o gquota /dev/xvdb1 /xfs
Replace
gquota
withgqnoenforce
to allow usage reporting without enforcing any limits.Enable quotas for projects:
# mount -o pquota /dev/xvdb1 /xfs
Replace
pquota
withpqnoenforce
to allow usage reporting without enforcing any limits.Alternatively, include the quota mount options in the
/etc/fstab
file. The following example shows entries in the/etc/fstab
file to enable quotas for users, groups, and projects, respectively, on an XFS file system. These examples also mount the file system with read/write permissions:# vim /etc/fstab /dev/xvdb1 /xfs xfs rw,quota 0 0 /dev/xvdb1 /xfs xfs rw,gquota 0 0 /dev/xvdb1 /xfs xfs rw,prjquota 0 0
Additional resources
-
The
mount(8)
man page. -
The
xfs_quota(8)
man page.
15.2.3. Reporting XFS usage
You can use the xfs_quota
tool to set limits and report on disk usage. By default, xfs_quota
is run interactively, and in basic mode. Basic mode subcommands simply report usage, and are available to all users.
Prerequisites
- Quotas have been enabled for the XFS file system. See Enabling disk quotas for XFS.
Procedure
Start the
xfs_quota
shell:# xfs_quota
Show usage and limits for the given user:
# xfs_quota> quota username
Show free and used counts for blocks and inodes:
# xfs_quota> df
Run the help command to display the basic commands available with
xfs_quota
.# xfs_quota> help
Specify
q
to exitxfs_quota
.# xfs_quota> q
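The same basic subcommands can also be run non-interactively with the -c option; a brief sketch, assuming the quota-enabled XFS file system is mounted on /xfs and the hypothetical user is john:
# xfs_quota -c 'df' /xfs
# xfs_quota -c 'quota -u john' /xfs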
Additional resources
-
The
xfs_quota(8)
man page.
15.2.4. Modifying XFS quota limits
Start the xfs_quota
tool with the -x
option to enable expert mode and run the administrator commands, which allow modifications to the quota system. The subcommands of this mode allow actual configuration of limits, and are available only to users with elevated privileges.
Prerequisites
- Quotas have been enabled for the XFS file system. See Enabling disk quotas for XFS.
Procedure
Start the
xfs_quota
shell with the-x
option to enable expert mode:# xfs_quota -x
Report quota information for a specific file system:
# xfs_quota> report /path
For example, to display a sample quota report for /home (on /dev/blockdevice), use the command report -h /home. This displays output similar to the following:
User quota on /home (/dev/blockdevice)
                        Blocks
User ID      Used   Soft   Hard Warn/Grace
---------- ---------------------------------
root            0      0      0  00 [------]
testuser   103.4G      0      0  00 [------]
Modify quota limits:
# xfs_quota> limit isoft=500m ihard=700m user /path
For example, to set a soft and hard inode count limit of 500 and 700 respectively for user john, whose home directory is /home/john, use the following command:
# xfs_quota -x -c 'limit isoft=500 ihard=700 john' /home/
In this case, pass the mount point of the mounted XFS file system (/home/ in this example) as the last argument.
Run the help command to display the expert commands available with xfs_quota -x:
# xfs_quota> help
Additional resources
- The xfs_quota(8) man page.
15.2.5. Setting project limits for XFS
This procedure configures limits for project-controlled directories.
Procedure
Add the project-controlled directories to /etc/projects. For example, the following adds the /var/log path with a unique ID of 11 to /etc/projects. Your project ID can be any numerical value mapped to your project.
# echo 11:/var/log >> /etc/projects
Add project names to /etc/projid to map project IDs to project names. For example, the following associates a project called Logs with the project ID of 11 as defined in the previous step.
# echo Logs:11 >> /etc/projid
Initialize the project directory. For example, the following initializes the project directory /var:
# xfs_quota -x -c 'project -s logfiles' /var
Configure quotas for projects with initialized directories:
# xfs_quota -x -c 'limit -p bhard=1g logfiles' /var
Additional resources
- The xfs_quota(8) man page.
- The projid(5) man page.
- The projects(5) man page.
15.3. Managing ext3 and ext4 disk quotas
You have to enable disk quotas on your system before you can assign them. You can assign disk quotas per user, per group or per project. However, if there is a soft limit set, you can exceed these quotas for a configurable period of time, known as the grace period.
15.3.1. Installing the quota tool
You must install the quota
RPM package to implement disk quotas.
Procedure
- Install the quota package:
# yum install quota
15.3.2. Enabling quota feature on file system creation
This procedure describes how to enable quotas on file system creation.
Procedure
Enable quotas on file system creation:
# mkfs.ext4 -O quota /dev/sda
Note: Only user and group quotas are enabled and initialized by default.
Change the defaults on file system creation:
# mkfs.ext4 -O quota -E quotatype=usrquota:grpquota:prjquota /dev/sda
Mount the file system:
# mount /dev/sda
Additional resources
See the ext4(5) man page for additional information.
15.3.3. Enabling quota feature on existing file systems
This procedure describes how to enable the quota feature on an existing file system by using the tune2fs command.
Procedure
Unmount the file system:
# umount /dev/sda
Enable quotas on existing file system:
# tune2fs -O quota /dev/sda
Note: Only user and group quotas are initialized by default.
Change the defaults:
# tune2fs -Q usrquota,grpquota,prjquota /dev/sda
Mount the file system:
# mount /dev/sda
Additional resources
See the ext4(5) man page for additional information.
15.3.4. Enabling quota enforcement
Quota accounting is enabled by default after mounting the file system, even without any additional mount options, but quota enforcement is not.
Prerequisites
- Quota feature is enabled and the default quotas are initialized.
Procedure
Enable quota enforcement by using quotaon for the user quota:
# mount /dev/sda /mnt
# quotaon /mnt
Note: Quota enforcement can also be enabled at mount time by using the usrquota, grpquota, or prjquota mount options.
# mount -o usrquota,grpquota,prjquota /dev/sda /mnt
Enable user, group, and project quotas for all file systems:
# quotaon -vaugP
- If none of the -u, -g, or -P options are specified, only the user quotas are enabled.
- If only the -g option is specified, only group quotas are enabled.
- If only the -P option is specified, only project quotas are enabled.
Enable quotas for a specific file system, such as /home:
# quotaon -vugP /home
Additional resources
- See the quotaon(8) man page.
15.3.5. Assigning quotas per user
The disk quotas are assigned to users with the edquota
command.
The text editor defined by the EDITOR
environment variable is used by edquota
. To change the editor, set the EDITOR
environment variable in your ~/.bash_profile
file to the full path of the editor of your choice.
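For example, a minimal way to make edquota open vim (assuming vim is installed at /usr/bin/vim) is to append the variable to your ~/.bash_profile and reload the file:
$ echo 'export EDITOR=/usr/bin/vim' >> ~/.bash_profile
$ source ~/.bash_profile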
Prerequisites
- User must exist prior to setting the user quota.
Procedure
Assign the quota for a user:
# edquota username
Replace username with the user to which you want to assign the quotas.
For example, if you enable a quota for the /dev/sda partition and execute the command edquota testuser, the following is displayed in the default editor configured on the system:
Disk quotas for user testuser (uid 501):
Filesystem     blocks    soft    hard   inodes   soft   hard
/dev/sda        44043       0       0    37418      0      0
Change the desired limits.
If any of the values is set to 0, that limit is not set. Change the limits in the text editor.
For example, the following shows the soft and hard block limits for the testuser have been set to 50000 and 55000 respectively.
Disk quotas for user testuser (uid 501):
Filesystem     blocks    soft    hard   inodes   soft   hard
/dev/sda        44043   50000   55000    37418      0      0
- The first column is the name of the file system that has a quota enabled for it.
- The second column shows how many blocks the user is currently using.
- The next two columns are used to set soft and hard block limits for the user on the file system.
- The inodes column shows how many inodes the user is currently using. The last two columns are used to set the soft and hard inode limits for the user on the file system.
- The hard block limit is the absolute maximum amount of disk space that a user or group can use. Once this limit is reached, no further disk space can be used.
- The soft block limit defines the maximum amount of disk space that can be used. However, unlike the hard limit, the soft limit can be exceeded for a certain amount of time. That time is known as the grace period. The grace period can be expressed in seconds, minutes, hours, days, weeks, or months.
Verification steps
Verify that the quota for the user has been set:
# quota -v testuser
Disk quotas for user testuser:
Filesystem   blocks   quota   limit   grace   files   quota   limit   grace
/dev/sda      1000*    1000    1000               0       0       0
15.3.6. Assigning quotas per group
You can assign quotas on a per-group basis.
Prerequisites
- Group must exist prior to setting the group quota.
Procedure
Set a group quota:
# edquota -g groupname
For example, to set a group quota for the devel group:
# edquota -g devel
This command displays the existing quota for the group in the text editor:
Disk quotas for group devel (gid 505):
Filesystem     blocks    soft    hard   inodes   soft   hard
/dev/sda       440400       0       0    37418      0      0
- Modify the limits and save the file.
Verification steps
Verify that the group quota is set:
# quota -vg groupname
15.3.7. Assigning quotas per project
This procedure assigns quotas per project.
Prerequisites
- Project quota is enabled on your file system.
Procedure
Add the project-controlled directories to /etc/projects. For example, the following adds the /var/log path with a unique ID of 11 to /etc/projects. Your project ID can be any numerical value mapped to your project.
# echo 11:/var/log >> /etc/projects
Add project names to /etc/projid to map project IDs to project names. For example, the following associates a project called Logs with the project ID of 11 as defined in the previous step.
# echo Logs:11 >> /etc/projid
Set the desired limits:
# edquota -P 11
Note: You can choose the project either by its project ID (11 in this case), or by its name (Logs in this case).
Using quotaon, enable quota enforcement:
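For example, assuming the file system that hosts the project-controlled directory is mounted at /var with the prjquota option, the following illustrative invocation turns on enforcement for project quotas on it:
# quotaon -vP /var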
Verification steps
Verify that the project quota is set:
# quota -vP 11
Note: You can verify either by the project ID or by the project name.
Additional resources
- The edquota(8) man page.
- The projid(5) man page.
- The projects(5) man page.
15.3.8. Setting the grace period for soft limits
If a given quota has soft limits, you can edit the grace period, which is the amount of time for which a soft limit can be exceeded. You can set the grace period for users, groups, or projects.
Procedure
Edit the grace period:
# edquota -t
While other edquota commands operate on quotas for a particular user, group, or project, the -t option operates on every file system with quotas enabled.
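Running edquota -t opens the grace periods in the configured text editor. The layout is roughly as follows; the /dev/sda device and the 7-day values are illustrative, not captured from a real system:
Grace period before enforcing soft limits for users:
Time units may be: days, hours, minutes, or seconds
  Filesystem             Block grace period     Inode grace period
  /dev/sda                     7days                  7days
Edit the values in either column and save the file to apply the new grace periods.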
Additional resources
- The edquota(8) man page.
15.3.9. Turning file system quotas off
Use quotaoff
to turn disk quota enforcement off on the specified file systems. Quota accounting stays enabled after executing this command.
Procedure
To turn all user and group quotas off:
# quotaoff -vaugP
- If none of the -u, -g, or -P options are specified, only the user quotas are disabled.
- If only the -g option is specified, only group quotas are disabled.
- If only the -P option is specified, only project quotas are disabled.
- The -v switch causes verbose status information to display as the command executes.
Additional resources
- See the quotaoff(8) man page.
15.3.10. Reporting on disk quotas
You can create a disk quota report using the repquota
utility.
Procedure
Run the repquota command:
# repquota
For example, the command repquota /dev/sda produces this output:
*** Report for user quotas on device /dev/sda
Block grace time: 7days; Inode grace time: 7days
                        Block limits                File limits
User            used    soft    hard  grace    used  soft  hard  grace
----------------------------------------------------------------------
root      --      36       0       0              4     0     0
kristin   --     540       0       0            125     0     0
testuser  --  440400  500000  550000          37418     0     0
View the disk usage report for all quota-enabled file systems:
# repquota -augP
The -- symbol displayed after each user indicates whether the block or inode limits have been exceeded. If either soft limit is exceeded, a + character appears in place of the corresponding - character. The first - character represents the block limit, and the second represents the inode limit.
The grace
columns are normally blank. If a soft limit has been exceeded, the column contains a time specification equal to the amount of time remaining on the grace period. If the grace period has expired, none
appears in its place.
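As an illustration of these markers, a user who has exceeded the block soft limit but not the inode limits might appear in the report with hypothetical values like these:
testuser  +-  600000  500000  550000  6days    37418      0      0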
Additional resources
The repquota(8) man page for more information.
Chapter 16. Discarding unused blocks
You can perform or schedule discard operations on block devices that support them.
16.1. Block discard operations
Block discard operations discard blocks that are no longer in use by a mounted file system. They are useful on:
- Solid-state drives (SSDs)
- Thinly-provisioned storage
Requirements
The block device underlying the file system must support physical discard operations.
Physical discard operations are supported if the value in the /sys/block/device/queue/discard_max_bytes
file is not zero.
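For example, to check whether a disk named sdb (an assumed device name) supports physical discard, read the file and confirm that the value is not zero:
# cat /sys/block/sdb/queue/discard_max_bytes
A value of 0 means that the device does not support discard operations.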
16.2. Types of block discard operations
You can run discard operations using different methods:
- Batch discard
- Are run explicitly by the user. They discard all unused blocks in the selected file systems.
- Online discard
- Are specified at mount time. They run in real time without user intervention. Online discard operations discard only the blocks that are transitioning from used to free.
- Periodic discard
- Are batch operations that are run regularly by a systemd service.
All types are supported by the XFS and ext4 file systems and by VDO.
Recommendations
Red Hat recommends that you use batch or periodic discard.
Use online discard only if:
- the system’s workload is such that batch discard is not feasible, or
- online discard operations are necessary to maintain performance.
16.3. Performing batch block discard
This procedure performs a batch block discard operation to discard unused blocks on a mounted file system.
Prerequisites
- The file system is mounted.
- The block device underlying the file system supports physical discard operations.
Procedure
Use the fstrim utility:
To perform discard only on a selected file system, use:
# fstrim mount-point
To perform discard on all mounted file systems, use:
# fstrim --all
If you execute the fstrim
command on:
- a device that does not support discard operations, or
- a logical device (LVM or MD) composed of multiple devices, where any one of the devices does not support discard operations,
the following message displays:
# fstrim /mnt/non_discard
fstrim: /mnt/non_discard: the discard operation is not supported
Additional resources
- The fstrim(8) man page
16.4. Enabling online block discard
This procedure enables online block discard operations that automatically discard unused blocks on all supported file systems.
Procedure
Enable online discard at mount time:
When mounting a file system manually, add the -o discard mount option:
# mount -o discard device mount-point
- When mounting a file system persistently, add the discard option to the mount entry in the /etc/fstab file, as shown in the example entry after this list.
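The following /etc/fstab line is a minimal sketch of such a persistent mount; the device (/dev/sdb1), mount point (/mnt/data), and file system type (xfs) are assumptions that you must adapt to your system:
/dev/sdb1   /mnt/data   xfs   defaults,discard   0 0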
Additional resources
- The mount(8) man page
- The fstab(5) man page
16.5. Enabling online block discard using RHEL System Roles
This section describes how to enable online block discard using the storage
role.
Prerequisites
- An Ansible playbook including the storage role exists.
For information on how to apply such a playbook, see Applying a role.
16.5.1. Example Ansible playbook to enable online block discard
This section provides an example Ansible playbook. This playbook applies the storage
role to mount an XFS file system with online block discard enabled.
Example 16.1. A playbook that enables online block discard on /mnt/data/
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: xfs
        mount_point: /mnt/data
        mount_options: discard
  roles:
    - rhel-system-roles.storage
Additional resources
- This playbook also performs all the operations of the persistent mount example described in Section 2.4, “Example Ansible playbook to persistently mount a file system”.
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
16.5.2. Additional resources
- For more information about the storage role, see Section 2.1, “Introduction to the storage role”.
16.6. Enabling periodic block discard
This procedure enables a systemd
timer that regularly discards unused blocks on all supported file systems.
Procedure
Enable and start the systemd timer:
# systemctl enable --now fstrim.timer
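Optionally, you can verify that the timer is active and see when it is scheduled to run next; both commands are standard systemctl invocations:
# systemctl status fstrim.timer
# systemctl list-timers fstrim.timer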
Chapter 17. Managing layered local storage with Stratis
You can easily set up and manage complex storage configurations integrated by the Stratis high-level system.
Stratis is available as a Technology Preview. For information on Red Hat scope of support for Technology Preview features, see the Technology Preview Features Support Scope document.
Customers deploying Stratis are encouraged to provide feedback to Red Hat.
17.1. Setting up Stratis file systems
As a system administrator, you can enable and set up the Stratis volume-managing file system on your system to easily manage layered storage.
17.1.1. The purpose and features of Stratis
Stratis is a local storage-management solution for Linux. It is focused on simplicity and ease of use, and gives you access to advanced storage features.
Stratis makes the following activities easier:
- Initial configuration of storage
- Making changes later
- Using advanced storage features
Stratis is a hybrid user-and-kernel local storage management system that supports advanced storage features. The central concept of Stratis is a storage pool. This pool is created from one or more local disks or partitions, and volumes are created from the pool.
The pool enables many useful features, such as:
- File system snapshots
- Thin provisioning
- Tiering
17.1.2. Components of a Stratis volume
Externally, Stratis presents the following volume components in the command-line interface and the API:
blockdev
- Block devices, such as a disk or a disk partition.
pool
Composed of one or more block devices.
A pool has a fixed total size, equal to the size of the block devices.
The pool contains most Stratis layers, such as the non-volatile data cache using the dm-cache target.
Stratis creates a /stratis/my-pool/ directory for each pool. This directory contains links to devices that represent Stratis file systems in the pool.
filesystem
Each pool can contain one or more file systems, which store files.
File systems are thinly provisioned and do not have a fixed total size. The actual size of a file system grows with the data stored on it. If the size of the data approaches the virtual size of the file system, Stratis grows the thin volume and the file system automatically.
The file systems are formatted with XFS.
Important: Stratis tracks information about file systems created using Stratis that XFS is not aware of, and changes made using XFS do not automatically create updates in Stratis. Users must not reformat or reconfigure XFS file systems that are managed by Stratis.
Stratis creates links to file systems at the
/stratis/my-pool/my-fs
path.
Stratis uses many Device Mapper devices, which show up in dmsetup
listings and the /proc/partitions
file. Similarly, the lsblk
command output reflects the internal workings and layers of Stratis.
17.1.3. Block devices usable with Stratis
This section lists storage devices that you can use for Stratis.
Supported devices
Stratis pools have been tested to work on these types of block devices:
- LUKS
- LVM logical volumes
- MD RAID
- DM Multipath
- iSCSI
- HDDs and SSDs
- NVMe devices
In the current version, Stratis does not handle failures in hard drives or other hardware. If you create a Stratis pool over multiple hardware devices, you increase the risk of data loss because multiple devices must be operational to access the data.
Unsupported devices
Because Stratis contains a thin-provisioning layer, Red Hat does not recommend placing a Stratis pool on block devices that are already thinly-provisioned.
Additional resources
- For iSCSI and other block devices requiring network, see the systemd.mount(5) man page for information on the _netdev mount option.
17.1.4. Installing Stratis
This procedure installs all packages necessary to use Stratis.
Procedure
Install packages that provide the Stratis service and command-line utilities:
# yum install stratisd stratis-cli
Make sure that the stratisd service is enabled:
# systemctl enable --now stratisd
17.1.5. Creating a Stratis pool
This procedure describes how to create an encrypted or an unencrypted Stratis pool from one or more block devices.
The following notes apply to encrypted Stratis pools:
- Each block device is encrypted using the cryptsetup library and implements the LUKS2 format.
- Each Stratis pool can have a unique key or it can share the same key with other pools. These keys are stored in the kernel keyring.
- All block devices that comprise a Stratis pool are either encrypted or unencrypted. It is not possible to have both encrypted and unencrypted block devices in the same Stratis pool.
- Block devices added to the data tier of an encrypted Stratis pool are automatically encrypted.
Prerequisites
- Stratis v2.2.1 is installed on your system. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - The block devices on which you are creating a Stratis pool are not in use and are not mounted.
- The block devices on which you are creating a Stratis pool are at least 1 GiB in size each.
On the IBM Z architecture, the /dev/dasd* block devices must be partitioned. Use the partition in the Stratis pool.
For information on partitioning DASD devices, see Configuring a Linux instance on IBM Z.
Procedure
If the selected block device contains a file system, partition table, or RAID signatures, erase them using the following command:
# wipefs --all block-device
where block-device is the path to the block device; for example, /dev/sdb.
Create the new Stratis pool on the selected block device(s):
Note: Specify multiple block devices on a single line, separated by a space:
# stratis pool create my-pool block-device-1 block-device-2
To create an unencrypted Stratis pool, use the following command and go to step 3:
# stratis pool create my-pool block-device
where block-device is the path to an empty or wiped block device.
Note: You cannot encrypt an unencrypted Stratis pool after you create it.
To create an encrypted Stratis pool, complete the following steps:
If you have not created a key set already, run the following command and follow the prompts to create a key set to use for the encryption:
# stratis key set --capture-key key-description
where key-description is the description or name of the key set.
Create the encrypted Stratis pool and specify the key description to use for the encryption. You can also specify the key path using the --keyfile-path parameter instead.
# stratis pool create --key-desc key-description my-pool block-device
where
key-description
- Specifies the description or name of the key file to be used for the encryption.
my-pool
- Specifies the name of the new Stratis pool.
block-device
- Specifies the path to an empty or wiped block device.
Verify that the new Stratis pool was created:
# stratis pool list
Troubleshooting
After a system reboot, sometimes you might not see your encrypted Stratis pool or the block devices that comprise it. If you encounter this issue, you must unlock the Stratis pool to make it visible.
To unlock the Stratis pool, complete the following steps:
Recreate the key set using the same key description that was used previously:
# stratis key set --capture-key key-description
Unlock the Stratis pool and the block device(s):
# stratis pool unlock
Verify that the Stratis pool is visible:
# stratis pool list
Additional resources
-
The
stratis(8)
man page.
Next steps
- Create a Stratis file system on the pool. For more information, see Section 17.1.6, “Creating a Stratis file system”.
17.1.6. Creating a Stratis file system
This procedure creates a Stratis file system on an existing Stratis pool.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis pool. See Section 17.1.5, “Creating a Stratis pool”.
Procedure
To create a Stratis file system on a pool, use:
# stratis fs create my-pool my-fs
- Replace my-pool with the name of your existing Stratis pool.
- Replace my-fs with an arbitrary name for the file system.
To verify, list file systems within the pool:
# stratis fs list my-pool
Additional resources
-
The
stratis(8)
man page
Next steps
- Mount the Stratis file system. See Section 17.1.7, “Mounting a Stratis file system”.
17.1.7. Mounting a Stratis file system
This procedure mounts an existing Stratis file system to access the content.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis file system. See Section 17.1.6, “Creating a Stratis file system”.
Procedure
To mount the file system, use the entries that Stratis maintains in the /stratis/ directory:
# mount /stratis/my-pool/my-fs mount-point
The file system is now mounted on the mount-point directory and ready to use.
Additional resources
-
The
mount(8)
man page
17.1.8. Persistently mounting a Stratis file system
This procedure persistently mounts a Stratis file system so that it is available automatically after booting the system.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis file system. See Section 17.1.6, “Creating a Stratis file system”.
Procedure
Determine the UUID attribute of the file system:
$ lsblk --output=UUID /stratis/my-pool/my-fs
For example:
Example 17.1. Viewing the UUID of Stratis file system
$ lsblk --output=UUID /stratis/my-pool/fs1

UUID
a1f0b64a-4ebb-4d4e-9543-b1d79f600283
If the mount point directory does not exist, create it:
# mkdir --parents mount-point
As root, edit the /etc/fstab file and add a line for the file system, identified by the UUID. Use xfs as the file system type and add the x-systemd.requires=stratisd.service option.
For example:
Example 17.2. The /fs1 mount point in /etc/fstab
UUID=a1f0b64a-4ebb-4d4e-9543-b1d79f600283 /fs1 xfs defaults,x-systemd.requires=stratisd.service 0 0
Regenerate mount units so that your system registers the new configuration:
# systemctl daemon-reload
Try mounting the file system to verify that the configuration works:
# mount mount-point
17.2. Extending a Stratis volume with additional block devices
You can attach additional block devices to a Stratis pool to provide more storage capacity for Stratis file systems.
17.2.1. Components of a Stratis volume
Externally, Stratis presents the following volume components in the command-line interface and the API:
blockdev
- Block devices, such as a disk or a disk partition.
pool
Composed of one or more block devices.
A pool has a fixed total size, equal to the size of the block devices.
The pool contains most Stratis layers, such as the non-volatile data cache using the dm-cache target.
Stratis creates a /stratis/my-pool/ directory for each pool. This directory contains links to devices that represent Stratis file systems in the pool.
filesystem
Each pool can contain one or more file systems, which store files.
File systems are thinly provisioned and do not have a fixed total size. The actual size of a file system grows with the data stored on it. If the size of the data approaches the virtual size of the file system, Stratis grows the thin volume and the file system automatically.
The file systems are formatted with XFS.
Important: Stratis tracks information about file systems created using Stratis that XFS is not aware of, and changes made using XFS do not automatically create updates in Stratis. Users must not reformat or reconfigure XFS file systems that are managed by Stratis.
Stratis creates links to file systems at the
/stratis/my-pool/my-fs
path.
Stratis uses many Device Mapper devices, which show up in dmsetup
listings and the /proc/partitions
file. Similarly, the lsblk
command output reflects the internal workings and layers of Stratis.
17.2.2. Adding block devices to a Stratis pool
This procedure adds one or more block devices to a Stratis pool to be usable by Stratis file systems.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - The block devices that you are adding to the Stratis pool are not in use and not mounted.
- The block devices that you are adding to the Stratis pool are at least 1 GiB in size each.
Procedure
To add one or more block devices to the pool, use:
# stratis pool add-data my-pool device-1 device-2 device-n
Additional resources
-
The
stratis(8)
man page
17.3. Monitoring Stratis file systems
As a Stratis user, you can view information about Stratis volumes on your system to monitor their state and free space.
17.3.1. Stratis sizes reported by different utilities
This section explains the difference between Stratis sizes reported by standard utilities such as df
and the stratis
utility.
Standard Linux utilities such as df
report the size of the XFS file system layer on Stratis, which is 1 TiB. This is not useful information, because the actual storage usage of Stratis is less due to thin provisioning, and also because Stratis automatically grows the file system when the XFS layer is close to full.
Regularly monitor the amount of data written to your Stratis file systems, which is reported as the Total Physical Used value. Make sure it does not exceed the Total Physical Size value.
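For example, you can compare the two views of a mounted Stratis file system. The commands are standard, but the sizes they print differ: df reports the virtual 1 TiB XFS size, while stratis pool list reports the actual physical usage. Replace mount-point with the directory where the Stratis file system is mounted:
$ df -h mount-point
# stratis pool list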
Additional resources
-
The
stratis(8)
man page
17.3.2. Displaying information about Stratis volumes
This procedure lists statistics about your Stratis volumes, such as the total, used, and free size of the file systems and block devices belonging to a pool.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running.
Procedure
To display information about all block devices used for Stratis on your system:
# stratis blockdev

Pool Name  Device Node  Physical Size  State   Tier
my-pool    /dev/sdb          9.10 TiB  In-use  Data
To display information about all Stratis pools on your system:
# stratis pool

Name      Total Physical Size  Total Physical Used
my-pool              9.10 TiB              598 MiB
To display information about all Stratis file systems on your system:
# stratis filesystem

Pool Name  Name   Used     Created            Device
my-pool    my-fs  546 MiB  Nov 08 2018 08:03  /stratis/my-pool/my-fs
Additional resources
-
The
stratis(8)
man page
17.4. Using snapshots on Stratis file systems
You can use snapshots on Stratis file systems to capture file system state at arbitrary times and restore it in the future.
17.4.1. Characteristics of Stratis snapshots
This section describes the properties and limitations of file system snapshots on Stratis.
In Stratis, a snapshot is a regular Stratis file system created as a copy of another Stratis file system. The snapshot initially contains the same file content as the original file system, but can change as the snapshot is modified. Whatever changes you make to the snapshot will not be reflected in the original file system.
The current snapshot implementation in Stratis is characterized by the following:
- A snapshot of a file system is another file system.
- A snapshot and its origin are not linked in lifetime. A snapshotted file system can live longer than the file system it was created from.
- A file system does not have to be mounted to create a snapshot from it.
- Each snapshot uses around half a gigabyte of actual backing storage, which is needed for the XFS log.
17.4.2. Creating a Stratis snapshot
This procedure creates a Stratis file system as a snapshot of an existing Stratis file system.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis file system. See Section 17.1.6, “Creating a Stratis file system”.
Procedure
To create a Stratis snapshot, use:
# stratis fs snapshot my-pool my-fs my-fs-snapshot
Additional resources
-
The
stratis(8)
man page
17.4.3. Accessing the content of a Stratis snapshot
This procedure mounts a snapshot of a Stratis file system to make it accessible for read and write operations.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis snapshot. See Section 17.4.2, “Creating a Stratis snapshot”.
Procedure
To access the snapshot, mount it as a regular file system from the /stratis/my-pool/ directory:
# mount /stratis/my-pool/my-fs-snapshot mount-point
Additional resources
- Section 17.1.7, “Mounting a Stratis file system”
-
The
mount(8)
man page
17.4.4. Reverting a Stratis file system to a previous snapshot
This procedure reverts the content of a Stratis file system to the state captured in a Stratis snapshot.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis snapshot. See Section 17.4.2, “Creating a Stratis snapshot”.
Procedure
Optionally, back up the current state of the file system to be able to access it later:
# stratis filesystem snapshot my-pool my-fs my-fs-backup
Unmount and remove the original file system:
# umount /stratis/my-pool/my-fs
# stratis filesystem destroy my-pool my-fs
Create a copy of the snapshot under the name of the original file system:
# stratis filesystem snapshot my-pool my-fs-snapshot my-fs
Mount the snapshot, which is now accessible with the same name as the original file system:
# mount /stratis/my-pool/my-fs mount-point
The content of the file system named my-fs is now identical to the snapshot my-fs-snapshot.
Additional resources
-
The
stratis(8)
man page
17.4.5. Removing a Stratis snapshot
This procedure removes a Stratis snapshot from a pool. Data on the snapshot are lost.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis snapshot. See Section 17.4.2, “Creating a Stratis snapshot”.
Procedure
Unmount the snapshot:
# umount /stratis/my-pool/my-fs-snapshot
Destroy the snapshot:
# stratis filesystem destroy my-pool my-fs-snapshot
Additional resources
-
The
stratis(8)
man page
17.5. Removing Stratis file systems
You can remove an existing Stratis file system or a Stratis pool, destroying data on them.
17.5.1. Components of a Stratis volume
Externally, Stratis presents the following volume components in the command-line interface and the API:
blockdev
- Block devices, such as a disk or a disk partition.
pool
Composed of one or more block devices.
A pool has a fixed total size, equal to the size of the block devices.
The pool contains most Stratis layers, such as the non-volatile data cache using the dm-cache target.
Stratis creates a /stratis/my-pool/ directory for each pool. This directory contains links to devices that represent Stratis file systems in the pool.
filesystem
Each pool can contain one or more file systems, which store files.
File systems are thinly provisioned and do not have a fixed total size. The actual size of a file system grows with the data stored on it. If the size of the data approaches the virtual size of the file system, Stratis grows the thin volume and the file system automatically.
The file systems are formatted with XFS.
Important: Stratis tracks information about file systems created using Stratis that XFS is not aware of, and changes made using XFS do not automatically create updates in Stratis. Users must not reformat or reconfigure XFS file systems that are managed by Stratis.
Stratis creates links to file systems at the
/stratis/my-pool/my-fs
path.
Stratis uses many Device Mapper devices, which show up in dmsetup
listings and the /proc/partitions
file. Similarly, the lsblk
command output reflects the internal workings and layers of Stratis.
17.5.2. Removing a Stratis file system
This procedure removes an existing Stratis file system. Data stored on it are lost.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis file system. See Section 17.1.6, “Creating a Stratis file system”.
Procedure
Unmount the file system:
# umount /stratis/my-pool/my-fs
Destroy the file system:
# stratis filesystem destroy my-pool my-fs
Verify that the file system no longer exists:
# stratis filesystem list my-pool
Additional resources
-
The
stratis(8)
man page
17.5.3. Removing a Stratis pool
This procedure removes an existing Stratis pool. Data stored on it are lost.
Prerequisites
- Stratis is installed. See Section 17.1.4, “Installing Stratis”.
-
The
stratisd
service is running. - You have created a Stratis pool. See Section 17.1.5, “Creating a Stratis pool”.
Procedure
List file systems on the pool:
# stratis filesystem list my-pool
Unmount all file systems on the pool:
# umount /stratis/my-pool/my-fs-1 \
         /stratis/my-pool/my-fs-2 \
         /stratis/my-pool/my-fs-n
Destroy the file systems:
# stratis filesystem destroy my-pool my-fs-1 my-fs-2
Destroy the pool:
# stratis pool destroy my-pool
Verify that the pool no longer exists:
# stratis pool list
Additional resources
-
The
stratis(8)
man page
Chapter 18. Getting started with an ext3 file system
As a system administrator, you can create, mount, resize, backup, and restore an ext3 file system. The ext3 file system is essentially an enhanced version of the ext2 file system.
18.1. Features of an ext3 file system
Following are the features of an ext3 file system:
- Availability: After an unexpected power failure or system crash, a file system check is not required because of the journaling provided. Recovering the default journal size takes about a second, depending on the speed of the hardware.
Note: The only supported journaling mode in ext3 is data=ordered (default). For more information, see the Is the EXT journaling option "data=writeback" supported in RHEL? Knowledgebase article.
- Data Integrity: The ext3 file system prevents loss of data integrity during an unexpected power failure or system crash.
- Speed: Despite writing some data more than once, ext3 has a higher throughput in most cases than ext2 because ext3’s journaling optimizes hard drive head motion.
- Easy Transition: It is easy to migrate from ext2 to ext3 and gain the benefits of a robust journaling file system without reformatting.
Additional resources
- The ext3 man page.
18.2. Creating an ext3 file system
As a system administrator, you can create an ext3 file system on a block device using mkfs.ext3
command.
Prerequisites
A partition on your disk. For information on creating MBR or GPT partitions, see Section 10.2, “Creating a partition table on a disk”.
Alternatively, use an LVM or MD volume.
Procedure
To create an ext3 file system:
For a regular-partition device, an LVM volume, an MD volume, or a similar device, use the following command:
# mkfs.ext3 /dev/block_device
Replace /dev/block_device with the path to a block device.
For example, /dev/sdb1, /dev/disk/by-uuid/05e99ec8-def1-4a5e-8a9d-5945339ceb2a, or /dev/my-volgroup/my-lv. In general, the default options are optimal for most usage scenarios.
Note: To specify a UUID when creating a file system:
# mkfs.ext3 -U UUID /dev/block_device
Replace UUID with the UUID you want to set: for example, 7cd65de3-e0be-41d9-b66d-96d749c02da7.
Replace /dev/block_device with the path to an ext3 file system to have the UUID added to it: for example, /dev/sda8.
To specify a label when creating a file system:
# mkfs.ext3 -L label-name /dev/block_device
To view the created ext3 file system:
# blkid
Additional resources
- The ext3 man page.
- The mkfs.ext3 man page.
18.3. Mounting an ext3 file system
As a system administrator, you can mount an ext3 file system using the mount
utility.
Prerequisites
- An ext3 file system. For information on creating an ext3 file system, see Section 18.2, “Creating an ext3 file system”.
Procedure
To create a mount point to mount the file system:
# mkdir /mount/point
Replace /mount/point with the directory in which you want to create the mount point for the partition.
To mount an ext3 file system:
To mount an ext3 file system with no extra options:
# mount /dev/block_device /mount/point
- To mount the file system persistently, see Section 14.8, “Persistently mounting file systems”.
To view the mounted file system:
# df -h
Additional resources
- The mount man page.
- The ext3 man page.
- The fstab man page.
- Chapter 14, Mounting file systems
18.4. Resizing an ext3 file system
As a system administrator, you can resize an ext3 file system using the resize2fs
utility. The resize2fs
utility reads the size in units of file system block size, unless a suffix indicating a specific unit is used. The following suffixes indicate specific units:
- s (sectors) - 512 byte sectors
- K (kilobytes) - 1,024 bytes
- M (megabytes) - 1,048,576 bytes
- G (gigabytes) - 1,073,741,824 bytes
- T (terabytes) - 1,099,511,627,776 bytes
Prerequisites
- An ext3 file system. For information on creating an ext3 file system, see Section 18.2, “Creating an ext3 file system”.
- An underlying block device of an appropriate size to hold the file system after resizing.
Procedure
To resize an ext3 file system, take the following steps:
To shrink or grow the size of an unmounted ext3 file system:
# umount /dev/block_device
# e2fsck -f /dev/block_device
# resize2fs /dev/block_device size
Replace /dev/block_device with the path to the block device, for example /dev/sdb1.
Replace size with the required resize value using the s, K, M, G, and T suffixes.
An ext3 file system may be grown while mounted using the resize2fs command:
# resize2fs /mount/device size
Note: The size parameter is optional (and often redundant) when expanding. The resize2fs utility automatically expands the file system to fill the available space of the container, usually a logical volume or partition.
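For example, assuming the file system lives on a hypothetical logical volume /dev/my-volgroup/my-lv that has already been extended, you can grow the mounted file system to fill it without specifying a size:
# resize2fs /dev/my-volgroup/my-lv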
To view the resized file system:
# df -h
Additional resources
- The resize2fs man page.
- The e2fsck man page.
- The ext3 man page.
18.5. Creating and mounting ext3 file systems using RHEL System Roles
This section describes how to create an ext3 file system with a given label on a disk, and persistently mount the file system using the storage
role.
Prerequisites
- An Ansible playbook including the storage role exists.
For information on how to apply such a playbook, see Applying a role.
18.5.1. Example Ansible playbook to create and mount an ext3 file system
This section provides an example Ansible playbook. This playbook applies the storage
role to create and mount an Ext3 file system.
Example 18.1. A playbook that creates Ext3 on /dev/sdb and mounts it at /mnt/data
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: ext3
        fs_label: label-name
        mount_point: /mnt/data
  roles:
    - rhel-system-roles.storage
- The playbook creates the file system on the /dev/sdb disk.
- The playbook persistently mounts the file system at the /mnt/data directory.
- The label of the file system is label-name.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
18.5.2. Additional resources
- For more information about the storage role, see Section 2.1, “Introduction to the storage role”.
Chapter 19. Getting started with an ext4 file system
As a system administrator, you can create, mount, resize, back up, and restore an ext4 file system. The ext4 file system is a scalable extension of the ext3 file system. With Red Hat Enterprise Linux 8, it can support a maximum individual file size of 16 terabytes and a maximum file system size of 50 terabytes.
19.1. Features of an ext4 file system
Following are the features of an ext4 file system:
- Using extents: The ext4 file system uses extents, which improves performance when using large files and reduces metadata overhead for large files.
- Ext4 labels unallocated block groups and inode table sections accordingly, which allows the block groups and table sections to be skipped during a file system check. It leads to a quick file system check, which becomes more beneficial as the file system grows in size.
- Metadata checksum: By default, this feature is enabled in Red Hat Enterprise Linux 8. You can confirm it on an existing file system as shown in the example after this list.
Allocation features of an ext4 file system:
- Persistent pre-allocation
- Delayed allocation
- Multi-block allocation
- Stripe-aware allocation
- Extended attributes (xattr): This allows the system to associate several additional name and value pairs per file.
- Quota journaling: This avoids the need for lengthy quota consistency checks after a crash.
Note: The only supported journaling mode in ext4 is data=ordered (default). For more information, see the Is the EXT journaling option "data=writeback" supported in RHEL? Knowledgebase article.
- Subsecond timestamps: This gives timestamps to the subsecond.
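For example, to confirm that metadata checksums are enabled on an existing ext4 file system, you can inspect the superblock with dumpe2fs and look for the metadata_csum flag in the feature list; /dev/sdb1 is an assumed device name:
# dumpe2fs -h /dev/sdb1 | grep 'Filesystem features'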
Additional resources
- The ext4 man page.
19.2. Creating an ext4 file system
As a system administrator, you can create an ext4 file system on a block device using mkfs.ext4
command.
Prerequisites
A partition on your disk. For information on creating MBR or GPT partitions, see Section 10.2, “Creating a partition table on a disk”.
Alternatively, use an LVM or MD volume.
Procedure
To create an ext4 file system:
For a regular-partition device, an LVM volume, an MD volume, or a similar device, use the following command:
# mkfs.ext4 /dev/block_device
Replace /dev/block_device with the path to a block device.
For example, /dev/sdb1, /dev/disk/by-uuid/05e99ec8-def1-4a5e-8a9d-5945339ceb2a, or /dev/my-volgroup/my-lv. In general, the default options are optimal for most usage scenarios.
For striped block devices (for example, RAID5 arrays), the stripe geometry can be specified at the time of file system creation. Using proper stripe geometry enhances the performance of an ext4 file system. For example, to create a file system with a 64k stride (that is, 16 x 4096) on a 4k-block file system, use the following command:
# mkfs.ext4 -E stride=16,stripe-width=64 /dev/block_device
In the given example:
- stride=value: Specifies the RAID chunk size
- stripe-width=value: Specifies the number of data disks in a RAID device, or the number of stripe units in the stripe.
Note: To specify a UUID when creating a file system:
# mkfs.ext4 -U UUID /dev/block_device
Replace UUID with the UUID you want to set: for example, 7cd65de3-e0be-41d9-b66d-96d749c02da7.
Replace /dev/block_device with the path to an ext4 file system to have the UUID added to it: for example, /dev/sda8.
To specify a label when creating a file system:
# mkfs.ext4 -L label-name /dev/block_device
To view the created ext4 file system:
# blkid
Additional resources
- The ext4 man page.
- The mkfs.ext4 man page.
19.3. Mounting an ext4 file system
As a system administrator, you can mount an ext4 file system using the mount
utility.
Prerequisites
- An ext4 file system. For information on creating an ext4 file system, see Section 19.2, “Creating an ext4 file system”.
Procedure
To create a mount point to mount the file system:
# mkdir /mount/point
Replace /mount/point with the directory in which you want to create the mount point for the partition.
To mount an ext4 file system:
To mount an ext4 file system with no extra options:
# mount /dev/block_device /mount/point
- To mount the file system persistently, see Section 14.8, “Persistently mounting file systems”.
To view the mounted file system:
# df -h
Additional resources
- The mount man page.
- The ext4 man page.
- The fstab man page.
- Chapter 14, Mounting file systems
19.4. Resizing an ext4 file system
As a system administrator, you can resize an ext4 file system using the resize2fs
utility. The resize2fs
utility reads the size in units of file system block size, unless a suffix indicating a specific unit is used. The following suffixes indicate specific units:
- s (sectors) - 512 byte sectors
- K (kilobytes) - 1,024 bytes
- M (megabytes) - 1,048,576 bytes
- G (gigabytes) - 1,073,741,824 bytes
- T (terabytes) - 1,099,511,627,776 bytes
Prerequisites
- An ext4 file system. For information on creating an ext4 file system, see Section 19.2, “Creating an ext4 file system”.
- An underlying block device of an appropriate size to hold the file system after resizing.
Procedure
To resize an ext4 file system, take the following steps:
To shrink or grow the size of an unmounted ext4 file system:
# umount /dev/block_device
# e2fsck -f /dev/block_device
# resize2fs /dev/block_device size
Replace /dev/block_device with the path to the block device, for example /dev/sdb1.
Replace size with the required resize value using the s, K, M, G, and T suffixes.
An ext4 file system may be grown while mounted using the resize2fs command:
# resize2fs /mount/device size
Note: The size parameter is optional (and often redundant) when expanding. The resize2fs utility automatically expands the file system to fill the available space of the container, usually a logical volume or partition.
To view the resized file system:
# df -h
Additional resources
- The resize2fs man page.
- The e2fsck man page.
- The ext4 man page.
19.5. Creating and mounting ext4 file systems using RHEL System Roles
This section describes how to create an ext4 file system with a given label on a disk, and persistently mount the file system using the storage
role.
Prerequisites
- An Ansible playbook including the storage role exists.
For information on how to apply such a playbook, see Applying a role.
19.5.1. Example Ansible playbook to create and mount an Ext4 file system
This section provides an example Ansible playbook. This playbook applies the storage
role to create and mount an Ext4 file system.
Example 19.1. A playbook that creates Ext4 on /dev/sdb and mounts it at /mnt/data
---
- hosts: all
  vars:
    storage_volumes:
      - name: barefs
        type: disk
        disks:
          - sdb
        fs_type: ext4
        fs_label: label-name
        mount_point: /mnt/data
  roles:
    - rhel-system-roles.storage
- The playbook creates the file system on the /dev/sdb disk.
- The playbook persistently mounts the file system at the /mnt/data directory.
- The label of the file system is label-name.
Additional resources
- For details about the parameters used in the storage system role, see the /usr/share/ansible/roles/rhel-system-roles.storage/README.md file.
Additional resources
- For more information about the storage role, see Section 2.1, “Introduction to the storage role”.
19.6. Comparison of tools used with ext4 and XFS
This section compares which tools to use to accomplish common tasks on the ext4 and XFS file systems.
Task | ext4 | XFS
---|---|---
Create a file system | mkfs.ext4 | mkfs.xfs
File system check | e2fsck | xfs_repair
Resize a file system | resize2fs | xfs_growfs
Save an image of a file system | e2image | xfs_metadump and xfs_mdrestore
Label or tune a file system | tune2fs | xfs_admin
Back up a file system | dump and restore | xfsdump and xfsrestore
Quota management | quota | xfs_quota
File mapping | filefrag | xfs_bmap