Chapter 10. Managing Red Hat Gluster Storage Volumes

This chapter describes how to perform common volume management operations on Red Hat Gluster Storage volumes.

10.1. Configuring Volume Options

Note

Volume options can be configured while the trusted storage pool is online.
The current settings for a volume can be viewed using the following command:
# gluster volume info VOLNAME
Volume options can be configured using the following command:
# gluster volume set VOLNAME OPTION PARAMETER
For example, to specify the performance cache size for test-volume:
# gluster volume set test-volume performance.cache-size 256MB
Set volume successful
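When several options have to be changed, it can help to review the exact commands before running them. The following is a minimal sketch of a dry-run wrapper around the gluster volume set command shown above. The function name set_volume_option and the DRY_RUN variable are hypothetical names introduced for this illustration; they are not part of the gluster CLI.

```shell
# Hypothetical helper, not part of the gluster CLI.
# By default (DRY_RUN unset or 1) it only prints the command it would run.
# Set DRY_RUN=0 on a storage node to execute the command for real.
set_volume_option() {
    volname=$1
    option=$2
    value=$3
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "gluster volume set $volname $option $value"
    else
        gluster volume set "$volname" "$option" "$value"
    fi
}

# Dry run: prints the command from the example above without executing it.
set_volume_option test-volume performance.cache-size 256MB
```

After running the command for real, the current settings can be checked with gluster volume info VOLNAME, as described above.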
The following table lists the available volume options along with their descriptions, allowed values, and default values.

Note

The default values are subject to change, and may not be the same for all versions of Red Hat Gluster Storage.
Option Value Description Allowed Values Default Value
auth.allow IP addresses or hostnames of the clients which are allowed to access the volume. Valid hostnames or IP addresses, including wildcard patterns such as *. For example, 192.168.1.*. A comma-separated list of addresses is acceptable, but a single hostname must not exceed 256 characters. * (allow all)
auth.reject IP addresses or hostnames of the clients which are denied access to the volume. Valid hostnames or IP addresses, including wildcard patterns such as *. For example, 192.168.1.*. A comma-separated list of addresses is acceptable, but a single hostname must not exceed 256 characters. none (reject none)

Note

The auth.allow and auth.reject options control access only for glusterFS FUSE-based clients. Use nfs.rpc-auth-* options for NFS access control.
changelog Enables the changelog translator to record all the file operations. on | off off
client.event-threads Specifies the number of network connections to be handled simultaneously by the client processes accessing a Red Hat Gluster Storage node. 1 - 32 2
server.event-threads Specifies the number of network connections to be handled simultaneously by the server processes hosting a Red Hat Gluster Storage node. 1 - 32 2
cluster.consistent-metadata If set to on, the readdirp function in the Automatic File Replication feature always fetches metadata from the respective read children, as long as they hold the good copy (the copy that does not need healing) of the file or directory. However, this can reduce performance where readdirp calls are involved. on | off off

Note

After the cluster.consistent-metadata option is set to on, you must unmount and remount the volume on the clients for this option to take effect.
cluster.min-free-disk Specifies the percentage of disk space that must be kept free. This may be useful for non-uniform bricks. Percentage of required minimum free disk space. 10%
cluster.op-version Allows you to set the operating version of the cluster. The op-version number cannot be downgraded and is set for all the volumes. Also, the op-version does not appear when you execute the gluster volume info command. 3000z | 30703 | 30706 Default value is 3000z after an upgrade from Red Hat Gluster Storage 3.0, or 30703 after an upgrade from Red Hat Gluster Storage 3.1.1. The value is set to 30706 for a new cluster deployment.
cluster.self-heal-daemon Specifies whether proactive self-healing on replicated volumes is activated. on | off on
cluster.server-quorum-type If set to server, this option enables the specified volume to participate in the server-side quorum. For more information on configuring the server-side quorum, see Section 10.11.1.1, “Configuring Server-Side Quorum”. none | server none
cluster.server-quorum-ratio Sets the quorum percentage for the trusted storage pool. 0 - 100 >50%
cluster.quorum-type If set to fixed, this option allows writes to a file only if the number of active bricks in that replica set (to which the file belongs) is greater than or equal to the count specified in the cluster.quorum-count option. If set to auto, this option allows writes to the file only if the percentage of active replicate bricks is more than 50% of the total number of bricks that constitute that replica. If there are only two bricks in the replica group, the first brick must be up and running to allow modifications. fixed | auto none
cluster.quorum-count The minimum number of bricks that must be active in a replica set to allow writes. This option is used in conjunction with the cluster.quorum-type = fixed option to specify the number of bricks that must be active to participate in quorum. The cluster.quorum-type = auto option overrides this value. 1 - replica-count 0
cluster.lookup-optimize If this option is set to on, it enables the optimization of negative lookups by not doing a lookup on non-hashed sub-volumes for files when the hashed sub-volume does not return any result. When enabled, this option disregards the lookup-unhashed setting. off
cluster.read-freq-threshold Specifies the number of reads, in a promotion/demotion cycle, that would mark a file HOT for promotion. Any file that has fewer read hits than this value is considered COLD and is demoted. 0 - 20 0
cluster.write-freq-threshold Specifies the number of writes, in a promotion/demotion cycle, that would mark a file HOT for promotion. Any file that has fewer write hits than this value is considered COLD and is demoted. 0 - 20 0
cluster.tier-promote-frequency Specifies how frequently the tier daemon must check for files to promote. 1 - 172800 seconds 120 seconds
cluster.tier-demote-frequency Specifies how frequently the tier daemon must check for files to demote. 1 - 172800 seconds 3600 seconds
cluster.tier-mode If set to cache mode, promotes or demotes files based on whether the cache is full or not, as specified with watermarks. If set to test mode, periodically demotes or promotes files automatically based on access. test | cache cache
cluster.tier-max-mb Specifies the maximum number of MB that may be migrated in any direction from each node in a given cycle. 1 - 100000 (100 GB) 4000 MB
cluster.tier-max-files Specifies the maximum number of files that may be migrated in any direction from each node in a given cycle. 1 - 100000 files 10000
cluster.watermark-hi Upper percentage watermark for promotion. If hot tier fills above this percentage, no promotion will happen and demotion will happen with high probability. 1 - 99% 90%
cluster.watermark-low Lower percentage watermark. If hot tier is less full than this, promotion will happen and demotion will not happen. If greater than this, promotion/demotion will happen at a probability relative to how full the hot tier is. 1 - 99% 75%
config.transport Specifies the transport type or types that the volume supports for communication. tcp OR rdma OR tcp,rdma tcp
diagnostics.brick-log-level Changes the log-level of the bricks. INFO | DEBUG | WARNING | ERROR | CRITICAL | NONE | TRACE info
diagnostics.client-log-level Changes the log-level of the clients. INFO | DEBUG | WARNING | ERROR | CRITICAL | NONE | TRACE info
diagnostics.brick-sys-log-level Depending on the value defined for this option, log messages at and above the defined level are generated in the syslog and the brick log files. INFO | WARNING | ERROR | CRITICAL CRITICAL
diagnostics.client-sys-log-level Depending on the value defined for this option, log messages at and above the defined level are generated in the syslog and the client log files. INFO | WARNING | ERROR | CRITICAL CRITICAL
diagnostics.client-log-format Allows you to configure the log format to log either with a message id or without one on the client. no-msg-id | with-msg-id with-msg-id
diagnostics.brick-log-format Allows you to configure the log format to log either with a message id or without one on the brick. no-msg-id | with-msg-id with-msg-id
diagnostics.brick-log-flush-timeout The length of time for which the log messages are buffered, before being flushed to the logging infrastructure (gluster or syslog files) on the bricks. 30 - 300 seconds (30 and 300 included) 120 seconds
diagnostics.brick-log-buf-size The maximum number of unique log messages that can be suppressed until the timeout or buffer overflow, whichever occurs first on the bricks. 0 - 20 (0 and 20 included) 5
diagnostics.client-log-flush-timeout The length of time for which the log messages are buffered, before being flushed to the logging infrastructure (gluster or syslog files) on the clients. 30 - 300 seconds (30 and 300 included) 120 seconds
diagnostics.client-log-buf-size The maximum number of unique log messages that can be suppressed until the timeout or buffer overflow, whichever occurs first on the clients. 0 - 20 (0 and 20 included) 5
features.ctr-enabled Enables Change Time Recorder (CTR) translator for a tiered volume. This option is used in conjunction with features.record-counters option to enable recording write and read heat counters. on | off on
features.ctr_link_consistency Enables a crash consistent way of recording hardlink updates by Change Time Recorder translator. When recording in a crash consistent way the data operations will experience more latency. on | off off
features.quota-deem-statfs When this option is set to on, it takes the quota limits into consideration while estimating the filesystem size. The limit will be treated as the total size instead of the actual size of filesystem. on | off on
features.record-counters If set to on, the cluster.write-freq-threshold and cluster.read-freq-threshold options define the number of writes and reads to a given file that are needed before triggering migration. on | off on
features.read-only Specifies whether to mount the entire volume as read-only for all the clients accessing it. on | off off
geo-replication.indexing Enables the marker translator to track the changes in the volume. on | off off
performance.quick-read Enables or disables the quick-read translator in the volume. on | off on
network.ping-timeout The time the client waits for a response from the server. If a timeout occurs, all resources held by the server on behalf of the client are cleaned up. When the connection is reestablished, all resources need to be reacquired before the client can resume operations on the server. Additionally, locks are acquired and the lock tables are updated. A reconnect is a very expensive operation and must be avoided. 42 seconds 42 seconds
nfs.acl Disabling nfs.acl will remove support for the NFSACL sideband protocol. This is enabled by default. enable | disable enable
nfs.enable-ino32 For NFS clients or applications that do not support 64-bit inode numbers, use this option to make NFS return 32-bit inode numbers instead. Disabled by default, so NFS returns 64-bit inode numbers. enable | disable disable
nfs.export-dir By default, all NFS volumes are exported as individual exports. This option allows you to export specified subdirectories on the volume. The path must be an absolute path. A list of allowed IP addresses or hostnames can be associated with each subdirectory path. None
nfs.export-dirs By default, all NFS sub-volumes are exported as individual exports. This option allows any directory on a volume to be exported separately. on | off on

Note

The values set for the nfs.export-dirs and nfs.export-volumes options are global and apply to all the volumes in the Red Hat Gluster Storage trusted storage pool.
nfs.export-volumes Enables or disables exporting entire volumes. If disabled and used in conjunction with nfs.export-dir, you can set subdirectories as the only exports. on | off on
nfs.mount-rmtab Path to the cache file that contains a list of NFS-clients and the volumes they have mounted. Change the location of this file to a mounted (with glusterfs-fuse, on all storage servers) volume to gain a trusted pool wide view of all NFS-clients that use the volumes. The contents of this file provide the information that can be obtained with the showmount command. Path to a directory /var/lib/glusterd/nfs/rmtab
nfs.mount-udp Enable UDP transport for the MOUNT sideband protocol. By default, UDP is not enabled, and MOUNT can only be used over TCP. Some NFS-clients (certain Solaris, HP-UX and others) do not support MOUNT over TCP and enabling nfs.mount-udp makes it possible to use NFS exports provided by Red Hat Gluster Storage. disable | enable disable
nfs.nlm By default, the Network Lock Manager (NLMv4) is enabled. Use this option to disable NLM. Red Hat does not recommend disabling this option. on | off on
nfs.rpc-auth-allow IP_ADDRESSES A comma-separated list of IP addresses allowed to connect to the server. By default, all clients are allowed. Comma-separated list of IP addresses accept all
nfs.rpc-auth-reject IP_ADDRESSES A comma-separated list of addresses not allowed to connect to the server. By default, all connections are allowed. Comma-separated list of IP addresses reject none
nfs.ports-insecure Allows client connections from unprivileged ports. By default only privileged ports are allowed. This is a global setting for allowing insecure ports for all exports using a single option. on | off off
nfs.addr-namelookup Specifies whether to look up names for incoming client connections. In some configurations, the name server can take too long to reply to DNS queries, resulting in timeouts of mount requests. This option can be used to disable name lookups during address authentication. Note that disabling name lookups will prevent you from using hostnames in nfs.rpc-auth-* options. on | off on
nfs.port Associates glusterFS NFS with a non-default port. 1025 - 65535 38465 - 38467
nfs.disable Specifies whether to disable NFS exports of individual volumes. on | off off
nfs.server-aux-gids When enabled, the NFS-server will resolve the groups of the user accessing the volume. NFSv3 is restricted by the RPC protocol (AUTH_UNIX/AUTH_SYS header) to 16 groups. By resolving the groups on the NFS-server, this limit can be bypassed. on | off off
nfs.transport-type Specifies the transport used by GlusterFS NFS server to communicate with bricks. tcp OR rdma tcp
open-behind Improves the application's ability to read data from a file by sending success notifications to the application whenever it receives an open call. on | off on
performance.io-thread-count The number of threads in the IO threads translator. 0 - 65 16
performance.cache-max-file-size Sets the maximum file size cached by the io-cache translator. Can be specified using the normal size descriptors of KB, MB, GB, TB, or PB (for example, 6GB). Size in bytes, or specified using size descriptors. 2 ^ 64-1 bytes
performance.cache-min-file-size Sets the minimum file size cached by the io-cache translator. Can be specified using the normal size descriptors of KB, MB, GB, TB, or PB (for example, 6GB). Size in bytes, or specified using size descriptors. 0
performance.cache-refresh-timeout The number of seconds cached data for a file will be retained. After this timeout, data re-validation will be performed. 0 - 61 seconds 1 second
performance.cache-size Size of the read cache. Size in bytes, or specified using size descriptors. 32 MB
performance.md-cache-timeout The time period in seconds which controls when metadata cache has to be refreshed. If the age of cache is greater than this time-period, it is refreshed. Every time cache is refreshed, its age is reset to 0. 0-60 seconds 1 second
performance.use-anonymous-fd This option requires open-behind to be on. For read operations, use anonymous FD when the original FD is open-behind and not yet opened in the backend. Yes | No Yes
performance.lazy-open This option requires open-behind to be on. Perform an open in the backend only when a necessary FOP arrives (for example, write on the FD, unlink of the file). When this option is disabled, perform the backend open immediately after unwinding the open call. Yes | No Yes
rebal-throttle The rebalance process is multithreaded so that it can migrate multiple files in parallel for better performance. Migrating multiple files at once can severely affect storage system performance, so this throttling mechanism is provided to manage it. lazy, normal, aggressive normal
server.allow-insecure Allows client connections from unprivileged ports. By default, only privileged ports are allowed. This is a global setting for allowing insecure ports to be enabled for all exports using a single option. on | off off

Important

Setting server.allow-insecure to on allows ports to accept or reject messages from insecure ports. Enable this option only if your deployment requires it, for example if there are too many bricks in each volume, or if too many services have already used all the privileged ports in the system. This option controls access only for glusterFS FUSE-based clients. Use nfs.rpc-auth-* options for NFS access control.
server.root-squash Prevents root users from having root privileges, and instead assigns them the privileges of nfsnobody. This squashes the power of the root users, preventing unauthorized modification of files on the Red Hat Gluster Storage Servers. on | off off
server.anonuid Value of the UID used for the anonymous user when root-squash is enabled. When root-squash is enabled, all the requests received from the root UID (that is 0) are changed to have the UID of the anonymous user. 0 - 4294967295 65534 (this UID is also known as nfsnobody)
server.anongid Value of the GID used for the anonymous user when root-squash is enabled. When root-squash is enabled, all the requests received from the root GID (that is 0) are changed to have the GID of the anonymous user. 0 - 4294967295 65534 (this GID is also known as nfsnobody)
server.gid-timeout The time period in seconds which controls when cached groups have to expire. This is the cache that contains the groups (GIDs) to which a specified user (UID) belongs. This option is used only when server.manage-gids is enabled. 0 - 4294967295 seconds 2 seconds
server.manage-gids Resolve groups on the server-side. By enabling this option, the groups (GIDs) a user (UID) belongs to are resolved on the server, instead of using the groups that were sent in the RPC Call by the client. This option makes it possible to apply permission checks for users that belong to bigger group lists than the protocol supports (approximately 93). on | off off
server.statedump-path Specifies the directory in which the statedump files must be stored. Path to a directory /var/run/gluster (for a default installation)
storage.health-check-interval Sets the time interval in seconds for a filesystem health check. You can set it to 0 to disable. The POSIX translator on the bricks performs a periodic health check. If this check fails, the filesystem exported by the brick is not usable anymore and the brick process (glusterfsd) logs a warning and exits. 0-4294967295 seconds 30 seconds
storage.owner-uid Sets the UID for the bricks of the volume. This option may be required when some of the applications need the brick to have a specific UID to function correctly. Example: For QEMU integration the UID/GID must be qemu:qemu, that is, 107:107 (107 is the UID and GID of qemu). Any integer greater than or equal to -1. The UID of the bricks is not changed. This is denoted by -1.
storage.owner-gid Sets the GID for the bricks of the volume. This option may be required when some of the applications need the brick to have a specific GID to function correctly. Example: For QEMU integration the UID/GID must be qemu:qemu, that is, 107:107 (107 is the UID and GID of qemu). Any integer greater than or equal to -1. The GID of the bricks is not changed. This is denoted by -1.
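
The write rules described for cluster.quorum-type and cluster.quorum-count in the table above can be easier to follow as a small decision function. The following sketch only models the wording of the table; it does not query or reproduce the actual glusterFS implementation. The function name writes_allowed and its arguments are hypothetical, and treating the default value none as "no quorum check, writes allowed" is an assumption made for this illustration.

```shell
# Hypothetical model of the cluster.quorum-type rules from the table.
# Usage: writes_allowed TYPE ACTIVE TOTAL [COUNT] [FIRST_UP]
#   TYPE      fixed | auto | none
#   ACTIVE    number of active bricks in the replica set
#   TOTAL     total number of bricks in the replica set
#   COUNT     cluster.quorum-count (used only when TYPE is fixed)
#   FIRST_UP  yes | no, whether the first brick of the replica set is up
# Prints "yes" if writes would be allowed, otherwise "no".
writes_allowed() {
    qtype=$1
    active=$2
    total=$3
    count=${4:-0}
    first_up=${5:-no}
    case "$qtype" in
        fixed)
            # fixed: active bricks must be >= cluster.quorum-count
            if [ "$active" -ge "$count" ]; then echo yes; else echo no; fi
            ;;
        auto)
            if [ $((active * 2)) -gt "$total" ]; then
                # auto: more than 50% of the replica bricks are active
                echo yes
            elif [ "$total" -eq 2 ] && [ "$active" -ge 1 ] && [ "$first_up" = "yes" ]; then
                # two-brick replica: the first brick must be up
                echo yes
            else
                echo no
            fi
            ;;
        *)
            # Assumption: with the default (none), no quorum check applies.
            echo yes
            ;;
    esac
}

writes_allowed fixed 1 3 2      # prints "no": 1 active brick is below quorum-count 2
writes_allowed auto 2 3         # prints "yes": 2 of 3 bricks is more than 50%
writes_allowed auto 1 2 0 yes   # prints "yes": two-brick replica with the first brick up
```

The two-brick branch reflects the statement in the table that the first brick must be up and running to allow modifications when the replica group has only two bricks.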