Chapter 3. Known Issues
3.1. Red Hat Gluster Storage
Issues related to Containers
- BZ#1294776
- When a container with one or more logical volumes bind-mounted as bricks is started in Atomic Host, the logical volumes are sometimes unmounted from Atomic Host during container start. This causes problems when the container is re-spawned.
Workaround: After starting the Red Hat Gluster Storage container, verify that the mount point is still mounted in the Atomic Host by checking the output of df -h. If the mount point is not mounted, ensure that it is configured in /etc/fstab and remount it on Atomic Host by running mount -a.
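The check in this workaround can be scripted. The following is a minimal sketch and not part of the product: the function names is_mounted and remount_if_missing and the /bricks/brick1 path are illustrative assumptions. It reads the kernel mount table (/proc/mounts) instead of parsing df -h output.

```shell
# Return 0 if MOUNT_POINT is listed in a mount table (default: /proc/mounts).
# Usage: is_mounted MOUNT_POINT [MOUNT_TABLE]
is_mounted() {
    mount_point=$1
    table=${2:-/proc/mounts}
    # Field 2 of each mount table line is the mount point.
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$table"
}

# Run "mount -a" (which mounts the entries in /etc/fstab) only when the
# brick mount point is missing on the host.
remount_if_missing() {
    if ! is_mounted "$1"; then
        echo "$1 is not mounted; running mount -a" >&2
        mount -a
    fi
}

# Example (hypothetical brick path), run as root on the Atomic Host:
#   remount_if_missing /bricks/brick1
```

mount -a only helps if the brick is configured in /etc/fstab, as the workaround requires.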
Issues related to Tiering
- BZ#1294790, BZ#1294808
- Currently, the tier process performs a fix-layout operation on the entire volume every time it starts. Tier migration operations begin only after the fix-layout operation is complete. In some circumstances, such as when extremely large amounts of data are present on the volume immediately before tiering is enabled, the fix-layout operation can take a long time, and no files are promoted to the hot tier until it has completed.
- BZ#1303298
- When a readdirp call is performed on a USS (User Serviceable Snapshot) as part of a request to list the entries on a snapshot of a tiered volume, the USS provides the wrong stat for files in the cold tier. This results in incorrect permissions being applied to the mount point, and files appear to have -----T permissions.
Workaround: FUSE clients can work around this issue by applying any of the following options:
- use-readdirp=no (recommended)
- attribute-timeout=0
- entry-timeout=0
NFS clients can work around the issue by applying the noac option.
- BZ#1300679
- If the hot and cold tiers in a tiered volume have the same number of sub-volumes, the first group of files migrated in a single cycle is likely to be migrated to the same sub-volume on the hot tier rather than being distributed across multiple sub-volumes. This is particularly noticeable when the files that are candidates for migration exceed the number defined by tier-max-files or the size defined by tier-max-mb.
- BZ#1302968
- The defrag variable is not being reinitialized during glusterd restart. This means that if glusterd fails while the following processes are running, it does not reconnect to these processes after restarting:
- rebalance
- tier
- remove-brick
This results in these processes continuing to run without communicating with glusterd. Therefore, any operation that requires communication between these processes and glusterd fails.
Workaround: Stop or kill the rebalance, tier, or remove-brick process before restarting glusterd. This ensures that a new process is spawned when glusterd restarts.
- BZ#1303045
- When a tier is attached while I/O is occurring on an NFS mount, I/O pauses temporarily, usually for 3 to 5 minutes. If I/O does not resume within 5 minutes, use the gluster volume start volname force command to resume I/O without interruption.
- BZ#1273741
- Files with hard links are not promoted or demoted on tiered volumes.
- BZ#1305490
- A race condition between tier migration and hard link creation results in the hard link operation failing with a 'File exists' error, and logging 'Stale file handle' messages on the client. This does not impact functionality, and file access works as expected.
This race occurs when a file is migrated to the cold tier after a hard link has been created on the cold tier, but before a hard link is created to the data on the hot tier. In this situation, the attempt to create a hard link on the hot tier fails. However, because the migration converts the hard link on the cold tier to a data file, and a linkto already exists on the cold tier, the links exist and work as expected.
- BZ#1277112
- When hot tier storage is full, write operations such as file creation or new writes to existing files fail with a 'No space left on device' error, instead of redirecting writes or flushing data to cold tier storage.
Workaround: If the hot tier is not completely full, it is possible to work around this issue by waiting for the next CTR promote/demote cycle before continuing with write operations.
If the hot tier does fill completely, administrators can copy a file from the hot tier to a safe location, delete the original file from the hot tier, and wait for demotion to free more space on the hot tier before copying the file back.
- BZ#1278391
- Migration from the hot tier fails when the hot tier is completely full because there is no space left to set the extended attribute that triggers migration.
- BZ#1283507
- Corrupted files can be identified for promotion and promoted to hot tier storage.
In rare circumstances, corruption can be missed by the BitRot scrubber. This can happen in two ways:
- A file is corrupted before its checksum is created, so that the checksum matches the corrupted file, and the BitRot scrubber does not mark the file as corrupted.
- A checksum is created for a healthy file, the file becomes corrupted, and the corrupted file is not compared to its checksum before being identified for promotion and promoted to the hot tier, where a new (corrupted) checksum is created.
When tiering is in use, these unidentified corrupted files can be 'heated' and selected for promotion to the hot tier. If a corrupted file is migrated to the hot tier, and the hot tier is not replicated, the corrupted file cannot be accessed or migrated back to the cold tier.
- BZ#1283957
- When volume status or volume tier status is requested for a tiered volume, the status of all nodes in the storage pool is listed as in progress, even when a node is not part of the tiered volume. This occurs because the tier daemon runs on all nodes of the trusted storage pool, and reports status for every volume in the trusted storage pool.
Issues related to Snapshot
- BZ#1306917
- When a User Serviceable Snapshot is enabled, attaching a tier succeeds, but any I/O operations in progress during the attach tier operation may fail with stale file handle errors.
Workaround: Disable User Serviceable Snapshots before performing attach tier. Once attach tier has succeeded, User Serviceable Snapshots can be enabled.
- BZ#1309209
- When a cloned volume is deleted, its brick paths (stored under /run/gluster/snaps) are not cleaned up correctly. This means that attempting to create a clone that has the same name as a previously cloned and deleted volume fails with a Commit failed message.
Workaround: After deleting a cloned volume, ensure that brick entries in /run/gluster/snaps are unmounted and deleted, and that their logical volumes are removed.
- BZ#1201820
- When a snapshot is deleted, the corresponding file system object in the User Serviceable Snapshot is also deleted. Any subsequent file system access results in the snapshot daemon becoming unresponsive. To avoid this issue, ensure that you do not perform any file system operations on the snapshot that is about to be deleted.
- BZ#1308837
- When a tiered volume with quota enabled is snapshotted, and that snapshot is cloned, rebooting the node or restarting glusterd on the node can result in that node entering a peer rejected state. This occurs because the quota checksum is not being copied as part of the snapshot or clone operations.
Workaround:
- Check glusterd logs for a quota checksum mismatch error, which looks similar to the following:
E [MSGID: 106012] [glusterd-utils.c:2845:glusterd_compare_friend_volume] 0-management: Cksums of quota configuration of volume volname differ. local cksum = 1405646976, remote cksum = 0 on peer peername
- If the volume with this error is cloned, edit the /var/lib/glusterd/vols/volname/info file for that volume and increase the value in the version field by one.
- Restart glusterd on the node.
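The log check and the version increment in this workaround can be sketched as two small shell functions. This is an illustration under stated assumptions, not a supported tool: the function names are hypothetical, the info file is assumed to contain a single line of the form version=N, and the log file path is passed in by the caller rather than assumed.

```shell
# Return 0 if the given glusterd log file contains the quota checksum
# mismatch error shown above.
has_quota_cksum_mismatch() {
    grep -q 'Cksums of quota configuration of volume .* differ' "$1"
}

# Increase the "version=" value in a volume info file by one.
# Assumption: the file holds key=value lines, one of which is version=N.
bump_info_version() {
    info_file=$1
    tmp_file=$(mktemp) || return 1
    # Only the version line is rewritten; all other lines pass through as-is.
    if awk -F= 'BEGIN { OFS = "=" } $1 == "version" { $2 = $2 + 1 } { print }' \
        "$info_file" > "$tmp_file"; then
        # Write back in place so ownership and permissions are preserved.
        cat "$tmp_file" > "$info_file"
    fi
    rm -f "$tmp_file"
}

# Example (volname is a placeholder), followed by a glusterd restart:
#   bump_info_version /var/lib/glusterd/vols/volname/info
```

Taking a copy of the info file before editing it is a sensible precaution, because glusterd reads this file at startup.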
- BZ#1160621
- If the current directory is not a part of the snapshot, for example, snap1, then the user cannot enter the .snaps/snap1 directory.
- BZ#1169790
- When a volume is down and there is an attempt to access the .snaps directory, a negative cache entry is created in the kernel Virtual File System (VFS) cache for the .snaps directory. After the volume is brought back online, accessing the .snaps directory fails with an ENOENT error because of the negative cache entry.
Workaround: Clear the kernel VFS cache by executing the following command:
# echo 3 > /proc/sys/vm/drop_caches
Note that this can cause temporary performance degradation.
- BZ#1170145
- If you restore a volume while you are in the .snaps directory, the following error message is displayed from the mount point after the restore operation is complete: "No such file or directory".
Workaround: Navigate to the parent directory of the .snaps directory and use the following command to drop the VFS cache:
# echo 3 > /proc/sys/vm/drop_caches
Then move back into the .snaps folder. Note that this command can cause temporary performance degradation.
- BZ#1170365
- Virtual inode numbers are generated for all the files in the .snaps directory. Any hard links are assigned different inode numbers instead of the same inode number.
- BZ#1170502
- When the User Serviceable Snapshot feature is enabled, if a directory or a file named .snaps exists on a volume, it appears in the output of the ls -a command.
- BZ#1174618
- If the User Serviceable Snapshot feature is enabled, and a directory has a pre-existing .snaps folder, then accessing that folder can lead to unexpected behavior.
Workaround: Rename the pre-existing .snaps folder.
- BZ#1167648
- Performing operations which involve client graph changes such as volume set operations, restoring snapshot, etc. eventually leads to out of memory scenarios for the client processes that mount the volume.
- BZ#1133861
- New snapshot bricks fail to start if the total snapshot brick count in a node goes beyond 1K. Until this bug is corrected, Red Hat recommends deactivating unused snapshots to avoid hitting the 1K limit.
- BZ#1126789
- If any node or glusterd service is down when a snapshot is restored, any subsequent snapshot creation fails. Red Hat recommends not restoring a snapshot while the node or the glusterd service is unavailable.
- BZ#1139624
- While taking a snapshot of a gluster volume, Red Hat Gluster Storage creates another volume which is similar to the original volume. All volumes, including snapshot volumes, consume some memory when started. This can create an out of memory state when creating a snapshot on a system with low memory. Red Hat recommends deactivating unused snapshots to reduce the memory footprint of the system and avoid this issue.
- BZ#1129675
- Performing a snapshot restore while glusterd is not available in a cluster node or a node is unavailable results in the following errors:
- Executing the gluster volume heal vol-name info command displays the error message Transport endpoint not connected.
- Error occurs when clients try to connect to glusterd service.
Workaround: Perform snapshot restore only if all the nodes and their corresponding glusterd services are running. Start glusterd by running the following command:
# service glusterd start
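Before a restore, the precondition in this workaround can be checked from the output of gluster peer status. The sketch below is an assumption-laden illustration: the function name disconnected_peers is hypothetical, and it assumes the usual peer status layout in which each peer has a "Hostname:" line followed by a "State:" line ending in "(Connected)" when the peer is reachable.

```shell
# Read "gluster peer status" output on stdin and print the hostname of every
# peer whose State line does not report "(Connected)".
disconnected_peers() {
    awk '/^Hostname:/ { host = $2 }
         /^State:/ && $0 !~ /\(Connected\)/ { print host }'
}

# Example: refuse to continue when any peer is unreachable.
#   down=$(gluster peer status | disconnected_peers)
#   [ -z "$down" ] || { echo "Peers not connected: $down" >&2; exit 1; }
```

gluster peer status does not list the local node, so glusterd on the node where the command is run must be checked separately.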
- BZ#1105543
- When a node with old snap entry is attached to the cluster, the old entries are propagated throughout the cluster and old snapshots which are not present are displayed.
Workaround: Do not attach a peer with old snap entries.
- BZ#1104191
- The snapshot command fails if it is run simultaneously from multiple nodes when a large number of write or read operations are happening on the origin or parent volume.
Workaround: Avoid running multiple snapshot commands simultaneously from different nodes.
- BZ#1059158
- The NFS mount option is not supported for snapshot volumes.
- BZ#1113510
- Executing the gluster volume info command displays system limits (snap-max-hard-limit and snap-max-soft-limit) instead of volume limits, and also displays snapshot auto-delete values.
- BZ#1111479
- Attaching a new node to the cluster while snapshot delete is in progress appears to successfully delete snapshots, but the gluster snapshot list command shows that some of the snapshots are still present.
Workaround: Do not attach a node to, or detach a node from, the trusted storage pool while a snapshot operation is in progress.
- BZ#1092510
- If you create a snapshot while the rename of a directory is in progress (that is, renaming is complete on the hashed sub-volume but not on all of the sub-volumes), then on snapshot restore, the directory that was undergoing the rename operation has the same GFID for both source and destination. Having the same GFID is an inconsistency in DHT and can lead to undefined behavior.
In DHT, a rename (source, destination) of directories is done first on the hashed sub-volume and, if successful, then on the rest of the sub-volumes. At this point in time, both source and destination directories are present in the cluster with the same GFID: destination on the hashed sub-volume and source on the rest of the sub-volumes. A parallel lookup (on either source or destination) at this time can result in the creation of directories on the missing sub-volumes: a source directory entry on the hashed sub-volume and a destination directory entry on the rest of the sub-volumes. Hence, there would be two directory entries, source and destination, having the same GFID.
- BZ#1112250
- Probing/detaching a new peer during any snapshot operation is not supported.
- BZ#1236149
- If a node/brick is down, the snapshot create command fails even with the force option.
- BZ#1240227
- LUKS encryption over LVM is currently not supported.
- BZ#1236025
- The time stamp of files and directories changes on snapshot restore, resulting in a failure to read the appropriate change logs. glusterfind pre fails with the following error: historical changelogs not available. Existing glusterfind sessions fail to work after a snapshot restore.
Workaround: Gather the necessary information from existing glusterfind sessions, remove the sessions, perform a snapshot restore, and then create new glusterfind sessions.
- BZ#1160412
- During the update of the glusterfs-server package, librdmacm displays warnings and fatal errors on-screen if the machine does not have an RDMA device. If you do not require Gluster to work with RDMA transport, these errors can be safely ignored.
- BZ#1246183
- User Serviceable Snapshots is not supported on Erasure Coded (EC) volumes.
Issues related to Nagios
- BZ#1136207
- The Volume status service shows the All bricks are Up message even when some of the bricks are in an unknown state due to unavailability of the glusterd service.
- BZ#1109683
- When a volume has a large number of files to heal, the volume self heal info command takes time to return results and the nrpe plug-in times out as the default timeout is 10 seconds.
Workaround: In /etc/nagios/gluster/gluster-commands.cfg, increase the timeout of the nrpe plug-in to 10 minutes by using the -t option in the command. For example:
$USER1$/gluster/check_vol_server.py $ARG1$ $ARG2$ -o self-heal -t 600
- BZ#1094765
- When certain commands invoked by Nagios plug-ins fail, irrelevant outputs are displayed as part of performance data.
- BZ#1107605
- Executing the sadf command used by the Nagios plug-ins returns invalid output.
Workaround: Delete the datafile located at /var/log/sa/saDD, where DD is the current date. This deletes the datafile for the current day; a new datafile that is usable by the Nagios plug-in is created automatically.
- BZ#1107577
- The Volume self heal service returns a WARNING when unsynchronized entries are present in the volume, even though these files may be synchronized during the next run of the self-heal process if self-heal is turned on in the volume.
- BZ#1121009
- In Nagios, CTDB service is created by default for all the gluster nodes regardless of whether CTDB is enabled on the Red Hat Gluster Storage node or not.
- BZ#1089636
- In the Nagios GUI, incorrect status information is displayed as Cluster Status OK : None of the Volumes are in Critical State, when volumes are utilized beyond critical level.
- BZ#1111828
- In Nagios GUI, Volume Utilization graph displays an error when volume is restored using its snapshot.
- BZ#1236997
- Bricks with an UNKNOWN status are not considered as DOWN when volume status is calculated. When the glusterd service is down on one node, brick status changes to UNKNOWN while the volume status remains OK. The volume can therefore appear to be up and running when its bricks may not be running, so the correct status cannot be detected from the volume status alone.
Workaround: You are notified when gluster is down and when bricks are in an UNKNOWN state.
- BZ#1240385
- When the configure-gluster-nagios command tries to get the IP address and flags for all network interfaces in the system, the following error is displayed when there is an issue retrieving the IP address or flags for a NIC:
ERROR:root:unable to get ipaddr/flags for nic-name: [Errno 99] Cannot assign requested address
However, the command actually succeeds and configures Nagios correctly.
Issues related to Rebalancing Volumes
- BZ#1266874
- The rebalance operation tries to start the gluster volume before doing the actual rebalance. In most cases, the volume is already in the Started state. If the volume is already started and the volume start command fails, gdeploy assumes that volume has started and does not start the rebalance process.
Workaround: Rebalance in gdeploy is possible only for stopped volumes.
- BZ#1110282
- Executing the rebalance status command after stopping the rebalance process fails and displays a message that the rebalance process is not started.
- BZ#1140531
- Extended attributes set on a file while it is being migrated during a rebalance operation are lost.
Workaround: Reset the extended attributes on the file once the migration is complete.
- BZ#960910
- After executing rebalance on a volume, running the rm -rf command on the mount point to remove all of the content from the current working directory recursively without being prompted may return a Directory not Empty error message.
- BZ#862618
- After completion of the rebalance operation, there may be a mismatch in the failure counts reported by the gluster volume rebalance status output and the rebalance log files.
- BZ#1039533
- While rebalance is in progress, adding a brick to the cluster displays an error message, failed to get index, in the gluster log file. This message can be safely ignored.
- BZ#1064321
- When a node is brought online after rebalance, the status displays that the operation is completed, but the data is not rebalanced. The data on the node is not rebalanced in a remove-brick rebalance operation, and running the commit command can cause data loss.
Workaround: Run the rebalance command again if any node is brought down while rebalance is in progress, and also when the rebalance operation is performed after a remove-brick operation.
- BZ#1237059
- The rebalance process on a distributed-replicated volume may stop if a brick from a replica pair goes down as some operations cannot be redirected to the other available brick. This causes the rebalance process to fail.
- BZ#1245202
- When rebalance is run as a part of the remove-brick command, some files may be reported as split-brain and, therefore, not migrated, even if the files are not split-brain.
Workaround: Manually copy the files that did not migrate from the bricks into the Gluster volume via the mount.
Issues related to Geo-replication
- BZ#1293634
- Sync performance for geo-replicated storage is reduced when the master volume is tiered, resulting in slower geo-rep performance on tiered volumes.
- BZ#1302320
- During file promotion, the rebalance operation sets the sticky bit and suid/sgid bit. Normally, it removes these bits when the migration is complete. If readdirp is called on a file before migration completes, these bits are not removed, and remain applied on the client.
This means that, if rsync happens while the bits are applied, the bits remain applied to the file as it is synced to the destination, impairing accessibility on the destination. This can happen in any geo-replicated configuration, but the likelihood increases with tiering because the rebalance process is continuous.
- BZ#1286587
- When geo-replication is in use alongside tiering, bricks attached as part of a tier are incorrectly set to passive. If geo-replication is subsequently restarted, these bricks can become faulty.
Workaround: Stop the geo-replication session prior to attaching or detaching bricks that are part of a tier.
To attach a tier:
- Stop geo-replication:
# gluster volume geo-replication master_vol slave_host::slave_vol stop
- Attach the tier:
# gluster volume attach-tier master_vol replica 2 server1:/path/to/brick1 server2:/path/to/brick2
- Restart geo-replication:
# gluster volume geo-replication master_vol slave_host::slave_vol start
- Verify that bricks in tier are available in geo-replication session:
# gluster volume geo-replication master_vol slave_host::slave_vol status
To detach a tier:
- Begin the tier detachment process:
# gluster volume detach-tier master_vol start
- Ensure all data in that tier is synced to the slave:
# gluster volume geo-replication master_vol slave_host::slave_vol config checkpoint now
- Monitor checkpoint until displayed status is checkpoint as of time of checkpoint creation is completed at time.
# gluster volume geo-replication master_vol slave_host::slave_vol status detail
- Verify that detachment is complete:
# gluster volume detach-tier master_vol status
- Stop geo-replication:
# gluster volume geo-replication master_vol slave_host::slave_vol stop
- Commit tier detachment:
# gluster volume detach-tier master_vol commit
- Verify tier is detached:
# gluster volume info master_vol
- Restart geo-replication:
# gluster volume geo-replication master_vol slave_host::slave_vol start
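The attach sequence above can be wrapped in a single function so that the steps always run in the documented order and stop at the first failure. This is a minimal sketch: the function name attach_tier_with_georep and the GLUSTER variable (which allows a dry run with GLUSTER=echo) are conventions of this example, not gluster features. The detach sequence is left manual here because it involves waiting on a checkpoint.

```shell
# Command used to invoke the gluster CLI. Set GLUSTER=echo to print the
# commands instead of running them (a convention of this sketch only).
GLUSTER=${GLUSTER:-gluster}

# Usage: attach_tier_with_georep MASTER_VOL SLAVE_HOST SLAVE_VOL ATTACH_ARGS...
# ATTACH_ARGS are passed unchanged to "gluster volume attach-tier MASTER_VOL".
attach_tier_with_georep() {
    master_vol=$1
    slave="$2::$3"
    shift 3
    # Each step runs only if the previous one succeeded.
    $GLUSTER volume geo-replication "$master_vol" "$slave" stop &&
    $GLUSTER volume attach-tier "$master_vol" "$@" &&
    $GLUSTER volume geo-replication "$master_vol" "$slave" start &&
    $GLUSTER volume geo-replication "$master_vol" "$slave" status
}

# Dry-run example using the placeholder names from the steps above:
#   GLUSTER=echo attach_tier_with_georep master_vol slave_host slave_vol \
#       replica 2 server1:/path/to/brick1 server2:/path/to/brick2
```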
- BZ#1102524
- The Geo-replication worker goes to faulty state and restarts when resumed. It works as expected when it is restarted, but takes more time to synchronize compared to resume.
- BZ#987929
- While the rebalance process is in progress, starting or stopping a Geo-replication session results in some files not getting synced to the slave volumes. When a Geo-replication sync process is in progress, running the rebalance command causes the Geo-replication sync process to stop. As a result, some files do not get synced to the slave volumes.
- BZ#1029799
- When a Geo-replication session is started with tens of millions of files on the master volume, it takes a long time before the updates are visible on the slave mount point.
- BZ#1027727
- When there are hundreds of thousands of hard links on the master volume prior to starting the Geo-replication session, some hard links are not synchronized to the slave volume.
- BZ#984591
- After stopping a Geo-replication session, if the files synced to the slave volume are renamed, then when Geo-replication starts again, the renamed files are treated as new files (without considering the renaming) and synced to the slave volumes again. For example, if 100 files were renamed, you would find 200 files on the slave side.
- BZ#1235633
- Concurrent rmdir and lookup operations on a directory during a recursive remove may prevent the directory from being deleted on some bricks. The recursive remove operation fails with Directory not empty errors even though the directory listing from the mount point shows no entries.
Workaround: Unmount the volume and delete the contents of the directory on each brick. If the affected volume is a geo-replication slave volume, stop the geo-replication session before deleting the contents of the directory on the bricks.
- BZ#1238699
- The Changelog History API expects the brick path to remain the same for a session. However, on snapshot restore, the brick path is changed. This causes the History API to fail and geo-rep to change to Faulty.
Workaround:
- After the snapshot restore, ensure the master and slave volumes are stopped.
- Back up the htime directory (of the master volume).
cp -a <brick_htime_path> <backup_path>
Note
Using the -a option is important to preserve extended attributes.
For example:
cp -a /var/run/gluster/snaps/a4e2c4647cf642f68d0f8259b43494c0/brick0/b0/.glusterfs/changelogs/htime /opt/backup_htime/brick0_b0
- Run the following command to replace the OLD path in the htime file(s) with the new brick path, where OLD_BRICK_PATH is the brick path of the current volume, and NEW_BRICK_PATH is the brick path after snapshot restore.
find <new_brick_htime_path> -name 'HTIME.*' -print0 | \
xargs -0 sed -ci 's|<OLD_BRICK_PATH>|<NEW_BRICK_PATH>|g'
For example:
find /var/run/gluster/snaps/a4e2c4647cf642f68d0f8259b43494c0/brick0/b0/.glusterfs/changelogs/htime/ -name 'HTIME.*' -print0 | \
xargs -0 sed -ci 's|/bricks/brick0/b0/|/var/run/gluster/snaps/a4e2c4647cf642f68d0f8259b43494c0/brick0/b0/|g'
- Start the master and slave volumes and the Geo-replication session on the restored volume. The status should update to Active.
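The path rewrite in this workaround can be expressed as a reusable function. The following is a sketch under stated assumptions: the function name rewrite_htime_paths is hypothetical, it uses plain GNU sed -i rather than the sed -ci form shown above, and it assumes that neither path contains the "|" delimiter or characters that need escaping in a sed pattern.

```shell
# Usage: rewrite_htime_paths HTIME_DIR OLD_BRICK_PATH NEW_BRICK_PATH
# Replace OLD_BRICK_PATH with NEW_BRICK_PATH in every HTIME.* file found
# under HTIME_DIR. Back up the htime directory (cp -a) before running this.
rewrite_htime_paths() {
    # -print0 / -0 keep file names with unusual characters intact;
    # -r (GNU xargs) skips sed entirely when no HTIME.* file is found.
    find "$1" -name 'HTIME.*' -print0 |
        xargs -0 -r sed -i "s|$2|$3|g"
}

# Example with the paths used above:
#   rewrite_htime_paths \
#       /var/run/gluster/snaps/a4e2c4647cf642f68d0f8259b43494c0/brick0/b0/.glusterfs/changelogs/htime/ \
#       /bricks/brick0/b0/ \
#       /var/run/gluster/snaps/a4e2c4647cf642f68d0f8259b43494c0/brick0/b0/
```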
- BZ#1240333
- Concurrent rename and lookup operations on a directory can cause both old and new directories to be "healed." Both directories will exist at the end of the operation and will have the same GFID. Clients might be unable to access some of the contents of the directory. Contact Red Hat Support for assistance with this issue.
Issues related to Self-heal
- BZ#1063830
- Performing add-brick or remove-brick operations on a volume having replica pairs when there are pending self-heals can cause potential data loss.
Workaround: Ensure that all bricks of the volume are online and there are no pending self-heals. You can view the pending heal info using the command gluster volume heal volname info.
- BZ#1230092
- When you create a replica 3 volume, client quorum is enabled and set to auto by default. However, it does not get displayed in gluster volume info.
- BZ#1233608
- When cluster.data-self-heal, cluster.metadata-self-heal and cluster.entry-self-heal are set to off (through volume set commands), the Gluster CLI to resolve split-brain fails with File not in split brain message (even though the file is in split brain).
- BZ#1240658
- When files are accidentally deleted from a brick in a replica pair in the back-end, and gluster volume heal VOLNAME full is run, then there is a chance that the files may not get healed.
Workaround: Perform a lookup on the files from the client (mount). This triggers the heal.
- BZ#1173519
- If you write to an existing file and go over the _AVAILABLE_BRICK_SPACE_, the write fails with an I/O error.
Workaround: Use the cluster.min-free-disk option. If you routinely write files up to n GB in size, then you can set min-free-disk to an m GB value greater than n.
For example, if your file size is 5 GB, which is at the high end of the file size you will be writing, you might consider setting min-free-disk to 8 GB. This ensures that the file will be written to a brick with enough available space (assuming one exists).
# gluster v set _VOL_NAME_ min-free-disk 8GB
Issues related to replace-brick operation
- After the gluster volume replace-brick VOLNAME Brick New-Brick commit force command is executed, the file system operations on that particular volume, which are in transit, fail.
- After a replace-brick operation, the stat information is different on the NFS mount and the FUSE mount. This happens due to internal time stamp changes when the replace-brick operation is performed.
- BZ#1021466
- After setting Quota limit on a directory, creating sub directories and populating them with files and renaming the files subsequently while the I/O operation is in progress causes a quota limit violation.
- BZ#1020713
- In a distribute or distribute replicate volume, while setting quota limit on a directory, if one or more bricks or one or more replica sets respectively, experience downtime, quota is not enforced on those bricks or replica sets, when they are back online. As a result, the disk usage exceeds the quota limit.
Workaround: Set quota limit again after the brick is back online.
Issues related to NFS
- After you restart the NFS server, the unlock within the grace-period feature may fail and the locks held previously may not be reclaimed.
- fcntl locking (NFS Lock Manager) does not work over IPv6.
- You cannot perform NFS mount on a machine on which glusterfs-NFS process is already running unless you use the NFS mount -o nolock option. This is because glusterfs-nfs has already registered NLM port with portmapper.
- If the NFS client is behind a NAT (Network Address Translation) router or a firewall, the locking behavior is unpredictable. The current implementation of NLM assumes that Network Address Translation of the client's IP does not happen.
- The nfs.mount-udp option is disabled by default. You must enable it to use posix-locks on Solaris when using NFS to mount on a Red Hat Gluster Storage volume.
- If you enable the nfs.mount-udp option, while mounting a subdirectory (exported using the nfs.export-dir option) on Linux, you must mount using the -o proto=tcp option. UDP is not supported for subdirectory mounts on the GlusterFS-NFS server.
- For NFS Lock Manager to function properly, you must ensure that all of the servers and clients have resolvable hostnames. That is, servers must be able to resolve client names and clients must be able to resolve server hostnames.
Issues related to NFS-Ganesha
- BZ#1259402
- When vdsmd and abrt are installed alongside each other, vdsmd overwrites the abrt core dump configuration in /proc/sys/kernel/core_pattern. This prevents NFS-Ganesha from generating core dumps.
Workaround: Disable core dumps in /etc/vdsm/vdsm.conf by setting core_dump_enable to false, and then restart the abrt-ccpp service:
# systemctl restart abrt-ccpp
- BZ#1257548
- The nfs-ganesha service monitor script, which triggers IP failover, runs every 10 seconds. The ping-timeout of the glusterFS server (after which the locks of the unreachable client are flushed) is 42 seconds by default. After an IP failover, some locks may not be cleaned up by the glusterFS server process, so reclaiming the lock state by NFS clients may fail.
Workaround: It is recommended to set the nfs-ganesha service monitor interval (default 10 seconds) to at least twice the Gluster server ping-timeout (default 42 seconds). To do so, either decrease the network ping-timeout using the following command:
# gluster volume set <volname> network.ping-timeout <ping_timeout_value>
or increase the nfs-ganesha service monitor interval using the following commands:
# pcs resource op remove nfs-mon monitor
# pcs resource op add nfs-mon monitor interval=<interval_period_value> timeout=<timeout_value>
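The relationship between the two settings is simple arithmetic, sketched below. The function names are hypothetical and the sketch only encodes the "at least twice the ping-timeout" recommendation stated above; it does not query a running cluster.

```shell
# Print the smallest monitor interval (in seconds) that is at least twice
# the given network.ping-timeout (in seconds).
min_monitor_interval() {
    echo $(( $1 * 2 ))
}

# Usage: interval_is_safe MONITOR_INTERVAL PING_TIMEOUT
# Return 0 if MONITOR_INTERVAL is at least twice PING_TIMEOUT.
interval_is_safe() {
    [ "$1" -ge "$(min_monitor_interval "$2")" ]
}

# With the defaults (interval 10, ping-timeout 42) the check fails:
#   interval_is_safe 10 42 || echo "raise the interval or lower the ping-timeout"
```

With the default ping-timeout of 42 seconds the interval would need to be at least 84 seconds; alternatively, keeping the default 10-second interval requires a ping-timeout of 5 seconds or less.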
- BZ#1224250
- Same epoch values on all the NFS-Ganesha heads result in the NFS server sending an NFS4ERR_FHEXPIRED error instead of NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID after failover. This results in NFSv4 clients not being able to recover locks after failover.
Workaround: To use NFSv4 locks, specify different epoch values for each NFS-Ganesha head before setting up the NFS-Ganesha cluster.
- BZ#1226874
- If NFS-Ganesha is started before you set up an HA cluster, there is no way to validate the cluster state and stop NFS-Ganesha if the set up fails. Even if the HA cluster set up fails, the NFS-Ganesha service continues running.
Workaround: If HA set up fails, run service nfs-ganesha stop on all nodes in the HA cluster.
- BZ#1228196
- If you have less than three nodes, pacemaker shuts down HA.
Workaround: To restore HA, add a third node with ganesha-ha.sh --add $path-to-config $node $virt-ip.
- BZ#1233533
- When the nfs-ganesha option is turned off, gluster NFS may not restart automatically. The volume may no longer be exported from the storage nodes via an NFS server.
Workaround:
- Turn off the nfs.disable option for the volume:
gluster volume set volume name nfs.disable off
- Restart the volume:
gluster volume start volume name force
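The two steps above can be combined so that the volume is force-started only if the option change succeeds. As before, this is a sketch: the function name reexport_gluster_nfs and the GLUSTER dry-run variable are assumptions of this example.

```shell
# Command used to invoke the gluster CLI. Set GLUSTER=echo to print the
# commands instead of running them (a convention of this sketch only).
GLUSTER=${GLUSTER:-gluster}

# Usage: reexport_gluster_nfs VOLNAME
# Re-enable gluster NFS for the volume, then force-start the volume.
reexport_gluster_nfs() {
    $GLUSTER volume set "$1" nfs.disable off &&
    $GLUSTER volume start "$1" force
}

# Dry-run example (myvol is a placeholder volume name):
#   GLUSTER=echo reexport_gluster_nfs myvol
```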
- BZ#1235597
- On the nfs-ganesha server IP,
showmountdoes not display a list of the clients mounting from that host. - BZ#1236017
- When a server is rebooted, services such as
pcsdandnfs-ganeshado not start by default.nfs-ganeshawon't be running on the rebooted node, so it won't be part of the HA-cluster.Workaround: Manually restart the services after a server reboot. - BZ#1240258
- When files and directories are created on the mount point with root squash enabled for nfs-ganesha, executing the ls command displays user:group as 4294967294:4294967294 instead of nfsnobody:nfsnobody. This is because the client maps only the 16-bit unsigned representation of -2 to nfsnobody, whereas 4294967294 is the 32-bit equivalent of -2. This is currently a limitation in upstream nfs-ganesha.
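The arithmetic behind these two values can be checked with a short shell sketch. It computes the unsigned representation of -2 for a given bit width. The function name is hypothetical, and the sketch assumes a shell whose arithmetic is at least 64 bits wide.

```shell
#!/bin/sh
# Hypothetical helper: unsigned N-bit representation of -2, that is 2^N - 2.
unsigned_minus_two() {
    bits=$1
    echo $(( (1 << $bits) - 2 ))
}

unsigned_minus_two 16   # prints 65534, the 16-bit value that clients map to nfsnobody
unsigned_minus_two 32   # prints 4294967294, the value displayed by ls
```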
Issues related to Object Store
- The GET and PUT commands fail on large files while using Unified File and Object Storage.
Workaround: You must set the node_timeout=60 variable in the proxy, container, and object server configuration files.
Issues related to Red Hat Gluster Storage Volumes
- BZ#1306656
- When management encryption is enabled, and a volume is started before glusterd has been started on all nodes in the cluster, the bricks on late-starting nodes are assigned different ports. This results in the bricks being inaccessible, as the new ports are blocked by the firewall.
Workaround: When management encryption is enabled, ensure glusterd is started on all nodes before starting volumes.
- BZ#1311362
- Red Hat Gluster Storage 3.1 Update 2 adds a new directory (brick_path/.glusterfs/indices/dirty) to assist with internal maintenance. Version 3.1 Update 2 incorrectly expects this directory to be present when running commands, even on nodes with older Red Hat Gluster Storage versions, resulting in misleading output.
When a node with Red Hat Gluster Storage 3.1 Update 2 is used to run the gluster volume heal volname command on older nodes, the output of a gluster volume heal info command run from the new node contains the following message, even though all entries were processed:
Failed to process entries completely
Workaround: If not all nodes in your cluster have been updated to Red Hat Gluster Storage 3.1 Update 2, you can perform either of the following actions to work around the issue.
- Use older nodes to review heal info output.
- For each brick, check that the only index entry listed under the brick_path/.glusterfs/indices/xattrop directory is xattrop-*.
- BZ#1304585
- When quota is disabled on a volume, a cleanup process is initiated to clean up the extended attributes used by quota. If this cleanup process is still in progress when quota is re-enabled, extended attributes for the newly enabled quota can be removed by the cleanup process. This has negative effects on quota accounting.
- BZ#1306907
- During an inode forget operation, files under the quarantine directory are removed. The inode forget operation is called during the unlinking of a file, and when the inode table's LRU (Least Recently Used) cache size exceeds 16 KB. This means that, when a corrupted file is not accessed for a long time, and the LRU cache exceeds 16 KB, the corrupted file will be removed from the quarantine directory. This results in the corrupted file not being shown in BitRot status output, even though the corrupted file has not been deleted from the volume itself.
- BZ#986090
- Currently, the Red Hat Gluster Storage server has issues with mixed usage of hostnames, IPs and FQDNs to refer to a peer. If a peer has been probed using its hostname but IPs are used during add-brick, the operation may fail. It is recommended to use the same address for all the operations, that is, during peer probe, volume creation, and adding/removing bricks. It is preferable if the address is correctly resolvable to a FQDN.
- BZ#1260779
- In a distribute-replicate volume, the getfattr -n replica.split-brain-status <path-to-dir> command on the mount point might report that the directory is not in split-brain even though it is.
Workaround: To determine the split-brain status of a directory, run the following command:
# gluster v heal <volname> info split-brain
- BZ#852293
- The management daemon does not have a rollback mechanism to revert any action that may have succeeded on some nodes and failed on those that do not have the brick's parent directory. For example, setting the volume-id extended attribute may fail on some nodes and succeed on others. Because of this, subsequent attempts to recreate the volume using the same bricks may fail with the error brickname or a prefix of it is already part of a volume.
Workaround:
- Either remove the brick directories or remove the glusterfs-related extended attributes.
- Try creating the volume again.
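A minimal sketch of the second option in the workaround above (removing the glusterfs-related extended attributes) follows. The function only prints the commands so they can be reviewed before being run on each affected node. The attribute names trusted.glusterfs.volume-id and trusted.gfid and the removal of the .glusterfs directory are assumptions, since the text above does not name them; the function name and brick path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) cleanup commands for one brick.
# Assumption: trusted.glusterfs.volume-id and trusted.gfid are the attributes
# left on the brick root, and .glusterfs holds leftover internal metadata.
print_brick_cleanup() {
    brick=$1
    echo "setfattr -x trusted.glusterfs.volume-id $brick"
    echo "setfattr -x trusted.gfid $brick"
    echo "rm -rf $brick/.glusterfs"
}
```

For example, print_brick_cleanup /rhgs/brick1 lists the three commands for that brick. Inspecting the brick with getfattr -d -m. -e hex first shows which attributes are actually present.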
- BZ#913364
- An NFS server reboot does not reclaim the file LOCK held by a Red Hat Enterprise Linux 5.9 client.
- BZ#1030438
- When a rebalance operation followed by a remove-brick operation is performed on a volume while read and write operations are in progress on that volume, the rm -rf command fails on a few files.
- BZ#1224064
- Glusterfind is an independent tool and is not integrated with glusterd. When a Gluster volume is deleted, the glusterfind session directories and files for that volume persist.
Workaround: Manually delete the Glusterfind session directory for the Gluster volume in the /var/lib/glusterd/glusterfind directory on each node.
- BZ#1224153
- When a brick process dies, BitD tries to read from the socket used to communicate with the corresponding brick. If it fails, BitD logs the failure to the log file. The repeated failures to read from the socket result in many messages in the log files and an increase in the size of the log file.
- BZ#1224162
- Due to an unhandled race in the RPC interaction layer, brick down notifications may result in corrupted data structures being accessed. This can lead to NULL pointer access and a segfault.
Workaround: When the Bitrot daemon (bitd) crashes (segfault), you can use volume start VOLNAME force to restart bitd on the node(s) where it crashed.
- BZ#1224880
- If you delete a gluster volume before deleting the Glusterfind session, the Glusterfind session cannot be deleted, and a new session cannot be created with the same name.
Workaround: On all the nodes that were part of the volume before you deleted it, manually clean up the session directory, for example, /var/lib/glusterd/glusterfind/SESSION/VOLNAME.
- BZ#1227672
- A successful scrub of the filesystem (objects) is required to see if a given object is clean or corrupted. When a file gets corrupted and a scrub has not been run on the filesystem, there is a good chance of replicating corrupted objects in cases when the brick holding the good copy was offline when I/O was performed.
Workaround: Objects need to be checked on demand for corruption during healing.
- BZ#1231150
- When you set diagnostics.client-log-level to DEBUG, and then reset the diagnostics.client-log-level option, DEBUG logs continue to appear in log files, even though the INFO log level is enabled by default.
Workaround: Restart the volume using gluster volume start VOLNAME force to reset log level defaults.
- BZ#1233213
- If you run a gluster volume info --xml command on a newly probed peer without running any other gluster volume command in between, brick UUIDs will appear as null ('00000000-0000-0000-0000-000000000000').
Workaround: Run any volume command (excluding gluster volume list and gluster volume get) before you run the info command. Brick UUIDs will then correctly populate.
- BZ#1241314
- The volume get VOLNAME enable-shared-storage command always shows the option as disabled, even when it is enabled.
Workaround: The gluster volume info VOLNAME command shows the correct status of the enable-shared-storage option.
- BZ#1297442
- Currently, attempting to run the gluster volume get volname user.option command fails because the volume get command does not display user option values in its output.
Workaround: Run the gluster volume info volname command on the same volume to see the value of any user options.
- BZ#1241336
- When a Red Hat Gluster Storage node is shut down due to power failure or hardware failure, or when the network interface on a node goes down abruptly, subsequent gluster commands may time out. This happens because the corresponding TCP connection remains in the ESTABLISHED state. You can confirm this by executing the following command:
ss -tap state established '( dport = :24007 )' dst IP-addr-of-powered-off-RHGS-node
Workaround: Restart the glusterd service on all other nodes.
- BZ#1223306
- gluster volume heal VOLNAME info shows stale entries, even after the file is deleted. This happens in rare cases when the gfid-handle of the file is not deleted.
Workaround: On the bricks where the stale entries are present, for example, <gfid:5848899c-b6da-41d0-95f4-64ac85c87d3f>, check whether the file's gfid handle was left behind by running the following command and checking whether the file appears in the output, for example, <brick-path>/.glusterfs/58/48/5848899c-b6da-41d0-95f4-64ac85c87d3f.
# find <brick-path>/.glusterfs -type f -links 1
If the file appears in the output of this command, delete the file using the following command.
# rm <brick-path>/.glusterfs/58/48/5848899c-b6da-41d0-95f4-64ac85c87d3f
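The handle path in the example above follows a regular pattern: the first two and the next two characters of the gfid become directory names under .glusterfs. The sketch below builds that path from a brick path and a gfid, assuming this pattern holds for other gfids as well; the function name and the brick path used with it are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the gfid handle path for a gfid on a brick,
# following the layout of the example path in the workaround:
# <brick-path>/.glusterfs/<chars 1-2>/<chars 3-4>/<gfid>
gfid_handle_path() {
    brick=$1
    gfid=$2
    first=$(printf '%s' "$gfid" | cut -c1-2)
    second=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$first" "$second" "$gfid"
}
```

The printed path can then be compared against the output of the find command before anything is removed.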
- BZ#1224180
- In some cases, operations on the mount display the error Input/Output error instead of the Disk quota exceeded message after the quota limit is exceeded.
- BZ#1244759
- Sometimes gluster volume heal VOLNAME info shows some symlinks as needing to be healed for hours. To confirm this issue, the files must have the following extended attributes:
# getfattr -d -m. -e hex -h /path/to/file/on/brick | grep trusted.ec
Example output:
trusted.ec.dirty=0x3000
trusted.ec.size=0x3000
trusted.ec.version=0x30000000000000000000000000000001
The first four digits must be 3000 and the file must be a symlink/softlink.
Workaround: Stop all operations on the affected files, then execute the following commands on the files in each brick.
- Delete trusted.ec.size.
# setfattr -x trusted.ec.size /path/to/file/on/brick
- Set the first 16 digits of both the trusted.ec.dirty and trusted.ec.version attributes to '0' and leave the remaining 16 digits as they are. If the number of digits is less than 32, use '0's as padding.
# setfattr -n trusted.ec.dirty -v 0x00000000000000000000000000000000 /path/to/file/on/brick
# setfattr -n trusted.ec.version -v 0x00000000000000000000000000000001 /path/to/file/on/brick
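The digit rule in the last step can be sketched as a small shell function that takes an attribute value and prints the value to set. This is only an illustration: the function name is hypothetical, and it assumes that values shorter than 32 digits are padded with '0's on the right, which is the reading that reproduces the example values above.

```shell
#!/bin/sh
# Hypothetical helper: normalize a trusted.ec.dirty or trusted.ec.version value.
# Assumption: short values are padded with '0's on the right up to 32 digits.
ec_normalize() {
    hex=${1#0x}
    zeros=$(printf '%032d' 0)                             # 32 '0' characters
    padded=$(printf '%s%s' "$hex" "$zeros" | cut -c1-32)  # right-pad to 32 digits
    tail16=$(printf '%s' "$padded" | cut -c17-32)         # last 16 digits, kept as is
    printf '0x%s%s\n' "$(printf '%016d' 0)" "$tail16"     # first 16 digits forced to '0'
}
```

Applied to the example output, ec_normalize 0x3000 gives the all-zero value used for trusted.ec.dirty, and the example trusted.ec.version value keeps only its trailing 1.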
Issues related to Red Hat Gluster Storage Server
- BZ#1306667
- If server-side quorum is enabled, and the quorum conditions are not met, starting a volume should fail. Currently, executing gluster volume start incorrectly succeeds even when quorum conditions are not met. However, because stopping the volume is correctly dependent on quorum conditions being met, attempts to stop the volume fail while quorum is enabled.
Workaround:
- Disable server-side quorum:
# gluster volume reset volname cluster.server-quorum-type
- Stop the volume.
- Re-enable server-side quorum:
# gluster volume set volname cluster.server-quorum-type server
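The three steps above can be laid out as one sequence. The sketch below only prints the commands for a given volume so they can be reviewed and run one at a time. The function name is hypothetical, and gluster volume stop is assumed as the command for the middle step, which the text does not spell out.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the workaround sequence.
# Assumption: "Stop the volume" is done with gluster volume stop.
print_quorum_stop_plan() {
    vol=$1
    echo "gluster volume reset $vol cluster.server-quorum-type"
    echo "gluster volume stop $vol"
    echo "gluster volume set $vol cluster.server-quorum-type server"
}
```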
- BZ#1298955
- When Red Hat Gluster Storage is set up with a server version of 3.1 Update 1 and a client version of 3.0 Update 4, attempting to set any option with the volume set command fails with the following error:
volume set: failed: One or more connected clients cannot support the feature being set. These clients need to be upgraded or disconnected before running this command again
When operating correctly, this restriction is in place to prevent newer features from being enabled on a volume when the clients in use cannot support the feature. Currently, the restriction check is incorrect, and will prevent even valid, supported options from being set.
Workaround: Upgrade all clients to the same version of Red Hat Gluster Storage as the server.
- BZ#1266824
- After an ISO installation, the ntpd service does not start by default on Red Hat Enterprise Linux 7. As a result, the server is out of sync with the rest of the cluster. This is visible if there is a huge difference between the current date and the system time.
Workaround: You must configure the ntpd service manually after installation. Execute the following commands to enable and start the ntpd service:
# systemctl enable ntpd
# systemctl start ntpd
Issues related to POSIX ACLs:
- Mounting a volume with -o acl can negatively impact the directory read performance. Commands like recursive directory listing can be slower than normal.
- When POSIX ACLs are set and multiple NFS clients are used, there could be inconsistency in the way ACLs are applied due to attribute caching in NFS. For a consistent view of POSIX ACLs in a multiple client setup, use the -o noac option on the NFS mount to disable attribute caching. Note that disabling the attribute caching option could lead to a performance impact on the operations involving the attributes.
Issues related to Samba
- BZ#1300572
- Due to a bug in the Linux CIFS client, SMB2.0+ connections from Linux to Red Hat Gluster Storage currently will not work properly. SMB1 connections from Linux to Red Hat Gluster Storage, and all connections with supported protocols from Windows continue to work.
Workaround: If practical, restrict Linux CIFS mounts to SMB version 1. The simplest way to do this is to not specify the vers mount option, since the default setting is to use only SMB version 1. If restricting Linux CIFS mounts to SMB1 is not practical, disable asynchronous I/O in Samba by setting aio read size to 0 in the smb.conf file. Disabling asynchronous I/O is not generally recommended and may negatively impact performance on other clients.
- BZ#1282452
- Attempting to upgrade to ctdb version 4 fails when ctdb2.5-debuginfo is installed, because the ctdb2.5-debuginfo package currently conflicts with the samba-debuginfo package.
Workaround: Manually remove the ctdb2.5-debuginfo package before upgrading to ctdb version 4. If necessary, install samba-debuginfo after the upgrade.
- BZ#1164778
- Any changes performed by an administrator in a Gluster volume's share section of smb.conf are replaced with the default Gluster hook scripts settings when the volume is restarted.
Workaround: The administrator must perform the changes again on all nodes after the volume restarts.
Issues related to SELinux
- BZ#1294762
- When the Red Hat Gluster Storage Container is deployed on Red Hat Enterprise Atomic Host, SELinux policy labels the /var/log/glusterfs directory as svirt_sandbox_file_t. Logrotate cannot run on files with this label, and logs AVC denials when log rotation is attempted on files in /var/log/glusterfs. This means that Red Hat Gluster Storage logs cannot currently be rotated, and could potentially fill up and consume a large amount of storage as a result. Correcting this requires updates to the selinux-policy package. In the meantime, you can work around this issue by resetting the label of /var/log/glusterfs after the host volume is bind mounted inside the container.
Workaround:
- Start the container and bind mount the host volume:
# docker run ... -v /var/log/glusterfs:/var/log/glusterfs:z ... image_name
# docker exec -it container_id /bin/bash
- In the container, run the following command to manually apply the appropriate SELinux label.
# chcon -Rt glusterd_log_t /var/log/glusterfs
Note that this workaround does not persist across subsequent docker runs, and must be performed for each docker run.
- BZ#1256635
- Red Hat Gluster Storage does not currently support SELinux Labeled mounts.
On a FUSE mount, SELinux cannot currently distinguish file systems by subtype, and therefore cannot distinguish between different FUSE file systems (BZ#1291606). This means that a client-specific policy for Red Hat Gluster Storage cannot be defined, and SELinux cannot safely translate client-side extended attributes for files tracked by Red Hat Gluster Storage.
A workaround is in progress for NFS-Ganesha mounts as part of BZ#1269584. When complete, BZ#1269584 will enable Red Hat Gluster Storage support for NFS version 4.2, including SELinux Labeled support.
- BZ#1290514 , BZ#1292781
- Current SELinux policy prevents the use of the ctdb enablescript and ctdb disablescript commands.
Workaround: Instead of running ctdb disablescript script, run chmod -x /etc/ctdb/events.d/script as the root user. Instead of running ctdb enablescript script, run chmod +x /etc/ctdb/events.d/script as the root user.
- BZ#1291194 , BZ#1292783
- Current SELinux policy prevents ctdb's 49.winbind event script from executing smbcontrol. This can create inconsistent state in winbind, because when a public IP address is moved away from a node, winbind fails to drop connections made through that IP address.
General issues
- BZ#1303125
- The defrag variable is not being reinitialized during glusterd restart. This means that if glusterd fails while the following processes are running, it does not reconnect to these processes after restarting:
- rebalance
- tier
- remove-brick
This results in these processes continuing to run without communicating with glusterd. Additionally, glusterd does not retain the decommission_is_in_progress flag that is set to indicate that the rebalance process is running.
If glusterd fails and restarts on a node where remove-brick was triggered and the rebalance process is not yet complete, but the rebalance process on other nodes has already completed, then the remove-brick commit operation succeeds because glusterd cannot identify that there is an ongoing rebalance operation on the node. This can result in data loss.
Workaround:
- Stop or kill the rebalance process before restarting glusterd. This ensures that a new rebalance process is spawned when glusterd restarts.
- On the node on which glusterd restarted, check the status of the remove-brick process. Only execute the remove-brick commit command when remove-brick status shows that data migration is complete.
- BZ#1290653
- When the gluster volume status all tasks command is executed, messages like the following are recorded in the glusterd log.
Failed to aggregate response from node/brick
This error is logged erroneously, and can be safely ignored.
- GFID mismatches cause errors
- If files and directories have different GFIDs on different back-ends, the glusterFS client may hang or display errors. Contact Red Hat Support for more information on this issue.
- BZ#1260119
- The glusterfind command must be executed from one node of the cluster. If not all the nodes of the cluster are in the known_hosts list of the node that initiates the command, the glusterfind create command hangs.
Workaround: Add all the hosts in the peer list, including the local node, to known_hosts.
- BZ#1030962
- When the Red Hat Gluster Storage Server is installed from an ISO or PXE, the kexec-tools package for the kdump service is installed by default. However, the crashkernel=auto kernel parameter, which is required for reserving memory for the kdump kernel, is not set for the current kernel entry in the bootloader configuration file, /boot/grub/grub.conf. Therefore, the kdump service fails to start, with the following message in the logs.
kdump: No crashkernel parameter specified for running kernel
When a new kernel is installed after the Red Hat Gluster Storage Server installation, the crashkernel=auto kernel parameter is successfully set in the bootloader configuration file for the newly added kernel.
Workaround: After installing the Red Hat Gluster Storage Server, set the crashkernel=auto parameter, or an appropriate crashkernel=sizeM kernel parameter, manually for the current kernel in the bootloader configuration file. Then reboot the Red Hat Gluster Storage Server system, after which the memory for the kdump kernel is reserved and the kdump service starts successfully. For more information, refer to Configuring kdump on the Command Line.
- BZ#1058032
- While migrating VMs, libvirt changes the ownership of the guest image unless it detects that the image is on a shared filesystem. As a result, the VMs cannot access the disk images because the required ownership is not available.
Workaround: Before migration, power off the VMs. When migration is complete, restore the ownership of the VM disk image (107:107) and start the VMs.
- Concurrent volume and peer management
- The glusterd service crashes when volume management commands are executed concurrently with peer commands.
- BZ#1130270
- If a 32-bit Samba package is installed before installing the Red Hat Gluster Storage Samba package, the installation fails, as Samba packages built for Red Hat Gluster Storage do not have 32-bit variants.
Workaround: Uninstall 32-bit variants of Samba packages.
- BZ#1139183
- Red Hat Gluster Storage 3.0 does not prevent clients with older versions from mounting a volume on which rebalance is performed. Clients with versions older than Red Hat Gluster Storage 3.0 mounting a volume on which rebalance is performed can lead to data loss.
Workaround: You must install the latest client version to avoid this issue.
- BZ#1127178
- If a replica brick goes down and comes up while the rm -rf command is executed, the operation may fail with the message Directory not empty.
Workaround: Retry the operation when there are no pending self-heals.
- BZ#969020
- Renaming a file during a remove-brick operation may cause the file not to be migrated from the removed brick.
Workaround: Check the removed brick for any files that might not have been migrated and copy those to the gluster volume before decommissioning the brick.
- BZ#1007773
- When the remove-brick start command is executed, even though the graph change is propagated to the NFS server, the directory inodes in memory are not refreshed to exclude the removed brick. Hence, new files that are created may end up on the removed brick.
Workaround: If files are found on the removed brick path after remove-brick commit, copy them via a gluster mount point before re-purposing the removed brick.
- BZ#1120437
- Executing the peer status command on a probed host displays the IP address of the node from which the peer probe was performed. For example, when probing node B with a hostname from node A, executing the peer status command on node B displays the IP address of node A instead of its hostname.
Workaround: Probe node A from node B with the hostname of node A. For example, execute the following command from node B:
# gluster peer probe HostnameA
- BZ#1122371
- The NFS server process and the gluster self-heal daemon process restart when the gluster daemon process is restarted.
- BZ#1110692
- Executing the remove-brick status command after stopping the remove-brick process fails and displays a message that the remove-brick process is not started.
- BZ#1123733
- Executing a command that involves glusterd-glusterd communication, such as gluster volume status, immediately after one of the nodes goes down hangs and fails after 2 minutes with a cli-timeout message. The subsequent command fails with the error message Another transaction in progress for 10 minutes (frame timeout).
Workaround: Set a non-zero value for ping-timeout in the /etc/glusterfs/glusterd.vol file.
- BZ#1136718
- The AFR self-heal can leave behind a partially healed file if the brick containing AFR self-heal source file goes down in the middle of heal operation. If this partially healed file is migrated before the brick that was down comes online again, the migrated file would have incorrect data and the original file would be deleted.
- BZ#1139193
- After an add-brick operation, any application (like git) that attempts opendir on a previously present directory fails with ESTALE/ENOENT errors.
- BZ#1141172
- If you rename a file from multiple mount points, the file may be lost. This happens because the mv command sends unlinks instead of renames when the source and destination happen to be hard links to each other. Hence, the issue is in mv, distributed as part of coreutils in various Linux distributions.
For example, if there are parallel renames of the form (mv a b) and (mv b a) where a and b are hard links to the same file, then because of the behavior of mv described above, unlink (a) and unlink (b) would be issued from the two instances of mv. This results in losing both the links a and b and hence the file.
- BZ#979926
- When any process establishes a TCP connection with glusterfs servers of a volume using a port > 1023, the server rejects the requests and the corresponding file or management operations fail. By default, glusterfs servers treat ports > 1023 as unprivileged.
Workaround: To disable this behavior, enable the rpc-auth-allow-insecure option on the volume using the steps given below:
- To allow insecure connections to a volume, run the following command:
# gluster volume set VOLNAME rpc-auth-allow-insecure on
- To allow insecure connections to the glusterd process, add the following line to the /etc/glusterfs/glusterd.vol file:
option rpc-auth-allow-insecure on
- Restart the glusterd process using the following command:
# service glusterd restart
- Restrict connections to trusted clients using the following command:
# gluster volume set VOLNAME auth.allow IP address
- BZ#1139676
- Renaming a directory may cause both source and target directories to exist on the volume with the same GFID and make some files in these directories not visible from the mount point. The files will still be present on the bricks.
Workaround: The steps to fix this issue are documented in: https://access.redhat.com/solutions/1211133
- BZ#1030309
- During directory creations attempted by geo-replication, though an mkdir fails with EEXIST, the directory might not have a complete layout for some time, and the directory creation fails with a Directory exists message. This can happen if there is a parallel mkdir attempt on the same name. Until the other mkdir completes, the layout is not set on the directory. Without a layout, entry creations within that directory fail.
Workaround: Set the layout on those sub-volumes where the directory is already created by the parallel mkdir before failing the current mkdir with EEXIST.
Note
This is not a complete fix, as the other mkdir might not have created directories on all sub-volumes. The layout is set on the sub-volumes where the directory is already created. Any file or directory names that hash to the sub-volumes on which the layout is set can be created successfully.
- BZ#1238067
- In rare instances, glusterd may crash when it is stopped. The crash is due to a race between the clean up thread and the running thread and doesn't impact functionality. The clean up thread releases URCU resources while a running thread continues to try to access it, which results in a crash.
Issues related to Red Hat Gluster Storage AMI
- BZ#1267209
- The redhat-storage-server package is not installed by default in a Red Hat Gluster Storage Server 3 on Red Hat Enterprise Linux 7 AMI image.
Workaround: It is highly recommended to manually install this package using yum.
# yum install redhat-storage-server
The redhat-storage-server package primarily provides the /etc/redhat-storage-release file, and sets the environment for the storage node.
Issues related to Upgrade
- BZ#1247515
- As part of the tiering feature, a new dictionary key-value pair was introduced to send the number of bricks in the hot tier. glusterd therefore expects this key in the dictionary that is sent to other peers during the data exchange. When one of the nodes runs Red Hat Gluster Storage 2.1, this key-value pair is not sent, which causes glusterd running on Red Hat Gluster Storage 3.1 to complain about the missing key-value pair in the peer data.
Workaround: There are no functionality issues. An error is displayed in the glusterd logs.
