Chapter 3. Notable Bug Fixes

This chapter describes bugs fixed in this release of Red Hat Gluster Storage that have significant impact on users.

Note

Bugzilla IDs that are not hyperlinked are private bugs that have potentially sensitive data attached.

arbiter

BZ#1401969
Previously, when bricks went down in a particular order while parallel I/O was in progress, the arbiter brick became the source for data heal. This led to data being unavailable, since arbiter bricks store only metadata. With this fix, the arbiter brick is not marked as a source.
BZ#1446125
Red Hat Gluster Storage volumes exported using SMB can now be mounted on macOS clients using Finder.

bitrot

BZ#1519740
Previously, the BitRot version internal extended attribute was passed up to AFR, causing it to launch spurious metadata self-heals on the files. With this fix, the attribute is filtered out by bit-rot, preventing these heals and the corresponding messages in the log files.
BZ#1517463
When scrub-throttle or scrub-frequency was set, the output for every successful BitRot operation (enable, disable, or scrub options) appeared as: volume bitrot: scrub-frequency is set to FREQUENCY successfully for volume VOLUME_NAME.

common-ha

BZ#1226874
Previously, the NFS-Ganesha startup script did not clean up the ganesha processes if the setup failed. Hence, ganesha processes continued to run on the nodes even if the ‘gluster nfs-ganesha enable’ command failed. With this fix, the ganesha processes are stopped on the nodes when the ‘gluster nfs-ganesha enable’ command fails, and the processes are cleaned up.

core

BZ#1324531
Previously, users were able to create a trash directory on volumes even when the trash directory feature was disabled, and they did not have permissions to remove the trash directory from the mount point. With this fix, a trash directory can only be created if the trash directory feature is enabled. This also allows users to remove the directory even when the feature is disabled.
BZ#1446046
Previously, the ssl-cert-depth option had to be set in /etc/glusterfs/glusterd.vol. With this fix, the transport.socket.ssl-cert-depth parameter must instead be set in /var/lib/glusterd/secure-access. This parameter takes effect only when management Secure Sockets Layer (SSL) is enabled.
BZ#1484446
Some gluster daemons (like glustershd) consume more CPU and memory while healing a large amount of data or entries. This can be resolved by running the control-cpu-load.sh script. This script uses control groups to regulate CPU and memory consumption by any gluster daemon.
BZ#1559886
Earlier, while dumping clients, inode dictionaries were redundantly accessed under client_table lock. As a result of this behavior, gluster volume status inode suspended the brick. With this fix, the inode dictionaries under client_table lock are not redundantly populated.

disperse

BZ#1561733
Previously, when lookup-optimize was enabled, some files were not migrated on a disperse volume due to incomplete data returned by a special rebalance request. This fix ensures that all required data is transferred and all files are migrated.
BZ#1499865
This feature implements support for discard operation on Erasure Coded volumes. This operation can be used to deallocate blocks inside a file. Within the given range, partial fragments are zeroed, and whole fragments are deallocated.
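As a rough illustration of the zero-versus-deallocate split described above, the following Python sketch divides a discard range at alignment boundaries. The function name and the unit parameter are hypothetical, and the sketch does not reflect how the disperse translator derives its fragment boundaries internally.

```python
def split_discard(offset, length, unit):
    """Split the byte range [offset, offset + length) at multiples of `unit`.

    Returns (zero_ranges, dealloc_range): the partial pieces at the edges
    that would be zeroed, and the run of whole units that could be
    deallocated (None if the range covers no whole unit).
    `unit` is a hypothetical alignment size, not a gluster setting.
    """
    if length <= 0:
        return [], None
    end = offset + length
    first_whole = -(-offset // unit) * unit  # round offset up to a boundary
    last_whole = (end // unit) * unit        # round end down to a boundary
    if first_whole >= last_whole:
        # No whole unit inside the range: everything is a partial piece.
        return [(offset, end)], None
    zero_ranges = []
    if offset < first_whole:
        zero_ranges.append((offset, first_whole))
    if last_whole < end:
        zero_ranges.append((last_whole, end))
    return zero_ranges, (first_whole, last_whole)
```

For example, with a unit of 512 bytes, split_discard(100, 1000, 512) yields two edge pieces to zero, (100, 512) and (1024, 1100), and one whole unit to deallocate, (512, 1024).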
BZ#1459101
Previously, write FOPs that modified non-overlapping offset ranges of the same file were not executed in parallel. This prevented optimum performance, especially when a slow brick caused each FOP to take more time to return. With the implementation of the parallel-writes feature, the performance of these FOPs is significantly improved.
BZ#1509830
The disperse translator is optimized to perform xattr update operations in parallel on the bricks during self-heal to improve performance.
BZ#1530519
Red Hat Gluster Storage 3.4 introduces a new option 'other-eager-lock' to keep eager-lock enabled for regular files but disabled for directory access. Previously, the eager-lock option was used to accelerate performance for file access; however, directory access suffered when this option was enabled.
BZ#1555261
An issue prevented self-heal from progressing, causing delays in the heal process after replacing a brick or bringing a failed brick online. This fix keeps self-heal active until all pending files are completely healed.
BZ#1611151
A delay in lock contention caused entry FOPs such as ls and rename to be slow when two clients accessed the same directory on an EC volume. As a consequence, listing of directories and other entry operations were slow. This issue is fixed in this version of RHGS.

distribute

BZ#1550315
Previously, on a distributed volume, the custom extended attribute value for a directory was displayed incorrectly after a volume stop and start, or after adding a new brick, because DHT failed to sync the extended attribute value. This release introduces a new volume that is referred to when updating custom xattrs for the directory.
BZ#1463114
Skipped files are now logged in the rebalance log with MSGID: 109126. The users can search for the list of skipped files using the message id.
BZ#1550771
Renaming a directory on a gluster volume involves renaming the directory on every brick of the volume. Previously, if the rename failed on one or more bricks, directories with both the old and new names could exist on the bricks. This caused some of the contents of those directories to become inaccessible from the mount point. With this fix, the directory rename operation is rolled back if it fails on any brick.
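The rollback idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: bricks are modeled as local directories, the function name is hypothetical, and the sketch is not the distribute translator's implementation.

```python
import os


def rename_on_all_bricks(bricks, old_name, new_name):
    """Rename a directory on every brick, undoing completed renames on failure.

    `bricks` is a list of local directory paths standing in for bricks.
    Returns True if every rename succeeded, False if the operation was
    rolled back.
    """
    renamed = []
    try:
        for brick in bricks:
            os.rename(os.path.join(brick, old_name),
                      os.path.join(brick, new_name))
            renamed.append(brick)
    except OSError:
        # Undo the renames that already succeeded, newest first, so that
        # no brick is left with the new name while others keep the old one.
        for brick in reversed(renamed):
            os.rename(os.path.join(brick, new_name),
                      os.path.join(brick, old_name))
        return False
    return True
```

The point of the design is that the volume ends up with one consistent name on all bricks, whether the operation succeeds or fails.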
BZ#1118770
Previously, a lookup on a directory that was being renamed could lead to directories with both old and new names existing on the volume. This could cause some of the contents of these directories to be inaccessible from the mount point. Now, additional synchronization has been introduced to prevent this from occurring.
BZ#1392905
During a rebalance process, hardlinks were reported as failures instead of being marked as skipped. With this fix, hardlink migration is no longer reported as a failure; hardlinks are instead added to the skipped list during the rebalance operation.
BZ#1557365
With this update, the lookup-optimize option is enabled by default. This option enhances lookup and create performance.
BZ#1577051
Previously, the remove-brick process did not show any failure when a lookup failed. It is recommended to check the decommissioned brick for any leftover files before doing a remove-brick commit. With this fix, the remove-brick status shows the failure count.

eventsapi

BZ#1466122
Previously, the gluster-events daemon failed to send events to a registered webhook if the webhook was HTTPS enabled. With this fix, the gluster-events daemon registers the webhook if it is HTTPS enabled.
BZ#1466129
Previously, Gluster did not add an HMAC (hash-based message authentication code) signature to the events pushed to the webhook. With this fix, the gluster-events daemon generates an HMAC token and adds it to the authorization header when sending an event to the webhook.
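To show what an HMAC check buys the webhook receiver, here is a minimal Python sketch of signing and verifying a message with a shared secret. It assumes HMAC-SHA256 over the raw payload and a hex-encoded digest; the exact token format and header layout used by the gluster-events daemon are not described in this chapter, and the function names and sample payload are hypothetical.

```python
import hashlib
import hmac


def sign_event(secret: bytes, payload: bytes) -> str:
    """Return a hex HMAC-SHA256 digest of the payload (assumed scheme)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_event(secret: bytes, payload: bytes, received: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    expected = sign_event(secret, payload)
    return hmac.compare_digest(expected, received)


if __name__ == "__main__":
    secret = b"shared-secret"            # hypothetical shared secret
    body = b'{"event": "EXAMPLE"}'       # hypothetical event payload
    token = sign_event(secret, body)
    print(verify_event(secret, body, token))         # genuine message
    print(verify_event(secret, body + b"x", token))  # tampered message
```

A receiver that knows the shared secret can reject any event whose payload was altered in transit or that was sent by a party without the secret.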

fuse

BZ#1508999
Clients that mount a subdirectory cannot heal the directory structure when an add-brick operation is performed, because distribute does not know the parent directories of the subdirectories while performing directory self-heal. To work around this, mount the volume (without the subdirectory) on one of the servers after add-brick, and run self-heal operations on the directories. You can perform these tasks by using 'hook' scripts, so that no user intervention is required.
BZ#1501013
Previously, Gluster did not add an HMAC (hash-based message authentication code) signature to the events pushed to the webhook. With this fix, the gluster-events daemon generates an HMAC token and adds it to the authorization header when sending an event to the webhook.

geo-replication

BZ#1568655
Previously, if symbolic links (symlinks) were created by a non-privileged user to the current directory on the master volume, geo-replication failed to synchronize them to the slave volume. Instead of setting the permissions on the symlink, it used to dereference the symlink to the virtual directory .gfid and the geo-replication session failed with Operation not supported error. With this fix, geo-replication does not dereference the symlink while setting the permissions on the file and synchronizes the symlinks created by a non-privileged user to the current directory.
BZ#1288115
Red Hat Gluster Storage 3.4 provides an option to grant read-only access on a gluster volume using the following command:
# gluster volume set VOLNAME features.read-only on
This option is useful in geo-replication, when data should not be written to the slave volume by any client other than geo-replication.
BZ#1468972
Red Hat Gluster Storage 3.4 introduces structured logging in geo-replication to improve the log messages. With structured logging, you can easily parse the log messages to get the required information.
BZ#1599037
Previously, when geo-replication was started or recovered from a Faulty state, it regenerated the xsync changelogs that needed to be processed, but the unsynced xsync changelogs generated before the stop or restart operation were not processed. This resulted in consumption of inodes and space. With this release, these leftover unsynced changelogs no longer consume space and inodes each time geo-replication is restarted or recovers from a Faulty state.
BZ#1342785
Previously, metadata changes such as ownership change on a symbolic link (symlink) file resulted in a crash with Permission Denied error. With this release, geo-replication is fixed to synchronize metadata of the symlink files, and the ownership change of symlink files is replicated properly without crashing.
BZ#1470967
Geo-replication expects the GFID to be the same on master volume and slave volume. However, geo-replication failed to synchronize the entry to slave volume when a file already existed with a different GFID. With this release, the GFID mismatch failures are handled by verifying it on the master volume.
BZ#1498391
Geo-replication uses changelog feature and enables it as part of geo-replication session creation. But the command to disable changelog did not warn the user about any existing geo-replication session. With this release, a geo-replication session check is added while disabling the changelog. Hence, while disabling the changelog, a warning is generated for the user about any existing geo-replication sessions.
BZ#1503173
Geo-replication mounts the master and slave volumes internally to synchronize data, so the mount points were not available for debugging. As a result, users could not retrieve the client volume profile information. This fix adds two new options that give users access to the geo-replication mount points: slave_access_mount for slave mount points and master_access_mount for master mount points.
BZ#1557297
Previously, a geo-replication session could be paused or resumed by an unauthorized user. As a result, the snapshot creation operation failed. With this release, only the session creator can pause or resume a geo-replication session. An attempt by any other user fails with the following error message: Geo-replication session between USERNAME and SLAVE_HOSTNAME does not exist
BZ#1565399
Previously, if symbolic links (symlinks) pointing to the current directory were created by a non-privileged user on the master volume, geo-replication failed to synchronize them to the slave. Instead of setting the permissions on the symlink, it dereferenced the symlink, which pointed to the virtual directory .gfid, and failed with an Operation not supported error. With this fix, geo-replication does not dereference the symlink while setting permissions on the file, and it synchronizes symlinks that are created by a non-privileged user and point to the current directory.
BZ#1601314
Previously, when a symbolic link (symlink) was created and renamed, and a directory was then created with the same name as the original symlink, the operations could be synchronized out of order: the directory was synchronized first, before the symlink rename. As a consequence, geo-replication sometimes failed to synchronize the renamed symlink.
With this release, while processing an out-of-order file rename operation, if the file name already exists, the GFID is used to check the identity of the file and synchronize accordingly.

samba

BZ#1379444
Previously, sharing a subdirectory of a gluster volume failed with an I/O error when the shadow_copy2 vfs object was specified in the smb.conf file. This occurred because gluster volumes are remote file systems, and shadow_copy2 only detected shared paths in the local file system. This update forces the value of shadow:mountpath to /, skipping the code related to mount point detection and preventing this problem. This fix requires that the glusterfs vfs object is listed after the shadow_copy2 vfs object in the smb.conf file.

NFS-ganesha

BZ#1480138
NFS-Ganesha internally uses a different file descriptor for every lock request. When a client removes a file with a specific file descriptor, other clients who are trying to access the same file with the same file descriptor receive a No such file or directory error. With this release, lock request uses the same file descriptor obtained from an open call and no additional open call in the lock handling path is required.
BZ#1514615
Previously, to limit the NFS versions to 3 and 4.0, the NFSv4 blocks had to be added to the ganesha.conf file manually. With this release, these options are added to the ganesha.conf file by default.
BZ#1481040
Previously, the default interval between upcall polls was 10 microseconds. For large numbers of threads, this resulted in heavy CPU utilization. With this release, the default polling interval has been increased to 100 milliseconds (100000 microseconds), which reduces CPU utilization.
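The scale of this change is easy to work out. The short Python calculation below estimates an upper bound on polls per second for a single polling thread, under the simplifying assumption that each poll itself takes no time; the function name is illustrative only.

```python
def polls_per_second(interval_us):
    """Upper bound on polls per second for one thread at a given interval."""
    return 1_000_000 / interval_us


OLD_INTERVAL_US = 10        # previous default: 10 microseconds
NEW_INTERVAL_US = 100_000   # new default: 100 milliseconds

old_rate = polls_per_second(OLD_INTERVAL_US)  # 100000.0 polls per second
new_rate = polls_per_second(NEW_INTERVAL_US)  # 10.0 polls per second
reduction = old_rate / new_rate               # 10000.0 times fewer polls

if __name__ == "__main__":
    print(old_rate, new_rate, reduction)
```

Under that assumption, each thread wakes up at most 10 times per second instead of up to 100000 times, which is where the CPU saving comes from when many threads are polling.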
BZ#1545523
Previously, if a client requested SET_ATTR_MODE in a create call, the NFS server needed to perform a setattr operation after creation. In the case of the gluster NFS server, the GFID used for the setattr operation was NULL, which resulted in an EINVAL error. With this release, the GFID from the linked node is used, and the create call with SET_ATTR_MODE is executed successfully.
BZ#1489378
With this release, the USE_GLUSTER_XREADDIRPLUS option is introduced that enables enhanced readdirp instead of standard readdir. This option is turned on by default. When this option is turned off, NFS falls back to standard readdir. Turning off this option would result in more lookup and stat requests being sent from the client, which may impact performance.
BZ#1516699
With Red Hat Gluster Storage 3.4, NFS-Ganesha log files have been moved to the /var/log/ganesha subdirectory.

glusterd

BZ#1449867
Previously, in the case of a node reboot, if the network interface was not completely functional before the glusterd service was started, glusterd failed to resolve the brick addresses for different peers, which caused the glusterd service to fail to start. With this release, glusterd initialization does not fail even when the network interface is not completely functional.
BZ#1599823
Earlier, during a volume create operation, glusterd failed to handle blank real paths while checking if the brick path is already a part of another volume. Hence, volume create requests would fail. This release fixes the path comparison logic. Now, glusterd can handle blank paths and prevents failure of volume create requests.
BZ#1575539
A memory leak in glusterd that occurred during a gluster volume geo-replication status operation has been addressed in Red Hat Gluster Storage 3.4.
BZ#1369420
Previously, when glusterd service was restarted, the AVC denial message was displayed for port 61000. With this release, if you configure the max-port in glusterd.vol below 61000, then the AVC denial message is no longer seen.
BZ#1529451
Previously, executing gluster volume status command multiple times caused excessive memory consumption by the glusterd process. With this release, this issue has been fixed.
BZ#1474745
Red Hat Gluster Storage 3.4 allows you to define the max-port value in glusterd.vol to control the range of ports that gluster bricks can consume.

locks

BZ#1507361
Previously, the gluster volume clear-locks command failed to release the acquired memory completely. This caused increasingly high memory utilization on the brick processes over time. With this fix, the associated memory is released when the clear-locks command is executed.
BZ#1510725
Previously, when a stale lock on the volume was cleared using the gluster volume clear-locks volname command, one of the references to the lock persisted in memory even after the lock was destroyed. Upon disconnection from the client, this invalid memory was accessed, which led to a crash. With this release, the final reference to the lock is cleared as part of the clear-locks command, so the operation does not lead to invalid memory access.
BZ#1495161
Previously, processes that used multiple POSIX locks, possibly in combination with the gluster clear-locks command, caused memory leaks that led to high memory consumption on brick processes. In some cases, this triggered the OOM killer. This release fixes the leaks present in the translators.

VDSM

BZ#1503070
The VDSM package has been upgraded to upstream version 4.19, which provides a number of bug fixes and enhancements over the previous version. This update allows Red Hat Gluster Storage 3.4 nodes with a cluster compatibility version of 4.1 to be managed by Red Hat Virtualization.
BZ#1542859
When a response from VDSM did not include the 'device' field, ovirt-engine did not update the database, which resulted in incorrect status information being displayed in the Administration Portal. This fix ensures that the device field is always part of VDSM responses so that brick status is accurately reflected in the Administration Portal.

posix

BZ#1464350
The POSIX translator is now enhanced with an option that allows users to reserve disk space on the bricks. Some administrative operations, like expanding storage or rebalancing data across nodes, require spare working space on the disk. The storage.reserve option lets users expand the disk or cluster when backend bricks are full, preventing ENOSPC errors on mount points.
BZ#1620765
Previously, file rename operation failed to rename the linkto file because of an incorrect xattr value set on the linkto file. This was seen on volumes upon which glusterfind was used. As a result, files were inaccessible if the lookup-optimize option was enabled on the volume. With this fix, the value of the xattr set on the linkto file allows the rename to proceed.

protocol

BZ#1319271
Previously, auth.allow or auth.reject options did not accept hostnames as values when provided with FQDN. With this fix, these options now accept hostnames when provided with FQDN.

quota

BZ#1414456
Previously, the path ancestry was not accurately populated when a symbolic link file had multiple hard links to it. This resulted in pending entry heal. With this fix, the ancestry is populated by handling the scenario of the symbolic link file with multiple hard links.
BZ#1475779
Previously, a directory failed to get healed post an add-brick operation if the directory's quota had already exceeded hard limit prior to add-brick. As a result, the directory structure remained incomplete on the newly added brick. With this fix, the directory heal happens irrespective of the resulting quota limit.
BZ#1557551, BZ#1581231
Previously, while quota was enabled on a volume, the quota used values were not updated to the list command until a lookup was done from the client mount point. Due to this, there was inaccuracy while reporting the file size even after performing the crawl operation. With this fix, it is ensured that the crawl operation looks up all files and reports the accurate quota used.
BZ#1511766
Previously, it was not possible to have more than 7712 limits configured for a volume due to limitations on how the quota.conf file was read and written. With this fix, you can now configure more than 65000 limits on a single volume.

readdir-ahead

BZ#1463592
The parallel-readdir volume options were not a part of any of the translators. Because of this, the following warning message was shown in the client log when parallel-readdir was enabled: "option 'parallel-readdir' is not recognized". With this fix, parallel-readdir is now added as an option of readdir-ahead translator and the warning message is not seen in client logs.
BZ#1559884
When the gluster volume had the following combination of volume options, some of the files appeared twice in the mount point after performing a readdir operation:
  • performance.stat-prefetch off
  • performance.readdir-ahead on
  • performance.parallel-readdir on
With this fix, when readdir is issued on the mount point, no double entries are seen for a single file.

replicate

BZ#1286820
Red Hat Gluster Storage 3.4 introduces the summary command. This command displays statistics for entries pending heal, entries in split-brain, and entries undergoing healing.
BZ#1413959
With this release, GFID split-brains can be resolved from the CLI using any of the policies: choice of brick, mtime or size. You need to provide the absolute path of the file which needs GFID heal.
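The three policies map onto three forms of the split-brain heal command. The Python helper below only assembles the argument list for each form; it does not run anything. The helper name and its policy keywords are hypothetical, and the command forms shown are the ones I understand the gluster CLI to accept, so verify them against the Administration Guide for your release before use.

```python
def split_brain_heal_cmd(volname, path, policy, source_brick=None):
    """Build the argument list for a CLI-based split-brain resolution.

    policy is one of "size", "mtime" or "brick" (hypothetical keywords
    chosen for this sketch). `path` must be absolute, as the text above
    requires.
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    base = ["gluster", "volume", "heal", volname, "split-brain"]
    if policy == "size":
        return base + ["bigger-file", path]
    if policy == "mtime":
        return base + ["latest-mtime", path]
    if policy == "brick":
        if not source_brick:
            raise ValueError("source_brick (HOSTNAME:BRICKPATH) is required")
        return base + ["source-brick", source_brick, path]
    raise ValueError("unknown policy: %r" % (policy,))


if __name__ == "__main__":
    # Hypothetical volume and file names.
    print(" ".join(split_brain_heal_cmd("testvol", "/dir/file1", "mtime")))
```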
BZ#1452915
Previously, when the heal daemon was disabled by using the heal disable command, you had to manually trigger a heal by using gluster volume heal volname command. The heal command displayed an incorrect and misleading error message. With this fix, when you try to trigger a manual heal on a disabled daemon, a useful and correct error message is displayed directing the user to enable the daemon in order to trigger a heal.
BZ#1489876
Since replica 2 volumes are prone to split-brain, they will be deprecated in future releases of Red Hat Gluster Storage 3.4. Therefore, while creating a replica 2 volume, a warning message is displayed that recommends using the Arbiter or replica 3 configurations.
BZ#1501023
Previously, volume-set command used to re-configure the choose-local option was not working as expected due to AFR not handling reconfiguration of the choose-local option. With this fix, appropriate changes are made to make AFR handle reconfiguring choose-local option.
BZ#1593865
glusterd can send heal related requests to self-heal daemon before the latter's graph is fully initialized. In this case, the self-heal daemon used to crash when trying to access certain data structures. With the fix, if the self-heal daemon receives a request before its graph is initialized, it will ignore the request.
BZ#1361209
Previously, when the heal daemon was disabled by using the heal disable command, you had to manually trigger a heal by using gluster volume heal volname command. The heal command displayed an incorrect and misleading error message. With this fix, when you try to trigger a manual heal on a disabled daemon, a useful and correct error message is displayed directing the user to enable the daemon in order to trigger a heal.
BZ#1470566
Users can now convert a plain distributed volume to a distributed-replicate volume by executing the 'gluster volume add-brick' command, provided there is no I/O happening on the client.
BZ#1552414
In replica 3 volumes, there was a possibility of ending up in split-brain when multiple clients simultaneously wrote data to the same file at non-overlapping regions. With the new cluster.full-lock option, you can take a full file lock, which helps maintain data consistency and avoid split-brain. By default, the cluster.full-lock option is set to take a full file lock and can be reconfigured to take range locks, if needed.
BZ#1566336
Previously, a successful file creation on a brick in a replica 3 volume would set the pending changelog as part of a new entry marking for that file. As the entry transaction failed on quorum number of bricks, the parent file will not have any entry pending changelog set. As a consequence, the entry transaction would be listed in the heal info output, but would never get healed by the SHD crawl or index heal. With this fix, if an entry transaction fails on quorum number of bricks, a dirty marking is set on the parent file of the brick where the transaction was successful. This action allows the entry to be healed as part of the next SHD crawl or index heal.
BZ#1397798, BZ#1479335
Some gluster daemons like glustershd have a higher CPU or memory consumption when there is a large amount of data or entries to be healed. This resulted in slow consumption of resources. With this fix, the users can resolve the slow consumption of resources by running the control-cpu-load.sh script. This script uses the control groups for regulating CPU and memory consumption of any gluster daemon.

rpc

BZ#1408354
This feature controls the TCP keep-alive timings per socket. Applications deployed by users have prerequisites concerning system call latencies, and using gluster as the file system adds network latency to the existing system call latency. The gluster options exposed to users help them tune the acceptable network latency beyond which a TCP socket is assumed to be disconnected. When these options are not explicitly set, every TCP socket uses the system default values. The gluster volume options exposed to users are:
  • server.tcp-user-timeout
  • client.tcp-user-timeout
  • transport.socket.keepalive-interval
  • transport.socket.keepalive-count
These options are 1:1 compatible with similar options described in the TCP(7) manual page. This feature provides a mechanism to fine tune the application behaviour for better user experience.
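For readers unfamiliar with the underlying socket options from TCP(7), the following Python sketch sets them directly on a socket. The mapping suggested in the comments (keepalive-interval to TCP_KEEPINTVL, keepalive-count to TCP_KEEPCNT, tcp-user-timeout to TCP_USER_TIMEOUT) is an assumption based on the option names, the helper name is hypothetical, and the constants are only available on platforms that support them, such as Linux.

```python
import socket


def apply_keepalive(sock, idle_s=None, interval_s=None, count=None,
                    user_timeout_ms=None):
    """Enable keep-alive and set the TCP(7) timing options that were given.

    Returns a dict of the values read back from the socket. Options that
    the platform does not expose, or that were left as None, are skipped.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    settings = [
        ("TCP_KEEPIDLE", idle_s),               # idle seconds before probing
        ("TCP_KEEPINTVL", interval_s),          # seconds between probes
        ("TCP_KEEPCNT", count),                 # failed probes before drop
        ("TCP_USER_TIMEOUT", user_timeout_ms),  # ms unacked data may linger
    ]
    applied = {}
    for name, value in settings:
        option = getattr(socket, name, None)
        if option is None or value is None:
            continue
        sock.setsockopt(socket.IPPROTO_TCP, option, value)
        applied[name] = sock.getsockopt(socket.IPPROTO_TCP, option)
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print(apply_keepalive(s, interval_s=2, count=9))
    finally:
        s.close()
```

Lower values make a dead peer visible sooner at the cost of more probe traffic and a higher chance of dropping a connection during a short network stall.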

snapshot

BZ#1464150
GlusterFS used to mount deactivated snapshots under /run/gluster/snaps by default. Furthermore, the snapshot status command should show relevant information for deactivated snapshots. Because the snapshot was mounted, a process could access the mount and cause issues while unmounting the volume during snapshot deletion. This feature ensures that GlusterFS does not mount deactivated snapshots and displays the text N/A (Deactivated Snapshot) in the Volume Group field of the snapshot status command output.

vulnerability

BZ#1601298
It was found that glusterfs server does not properly sanitize file paths in the trusted.io-stats-dump extended attribute, which is used by the debug/io-stats translator. An attacker can use this flaw to create files and execute arbitrary code. To exploit this, the attacker would require sufficient access to modify the extended attributes of files on a gluster volume.
BZ#1601642
It was found that glusterfs server is vulnerable to multiple stack-based buffer overflows due to functions in server-rpc-fopc.c allocating fixed-size buffers using alloca(3). An authenticated attacker could exploit this by mounting a gluster volume and sending a string longer than the fixed buffer size to cause a crash or potential code execution.
BZ#1610659
It was found that the mknod call derived from mknod(2) can create files pointing to devices on a glusterfs server node. An authenticated attacker could use this to create an arbitrary device and read data from any device attached to the glusterfs server node.
BZ#1612658
A flaw was found in RPC request using gfs3_lookup_req in glusterfs server. An authenticated attacker could use this flaw to leak information and execute remote denial of service by crashing gluster brick process.
BZ#1612659
A flaw was found in RPC request using gfs3_symlink_req in glusterfs server which allows symbolic link (symlink) destinations to point to file paths outside of the gluster volume. An authenticated attacker could use this flaw to create arbitrary symlinks pointing anywhere on the server and execute arbitrary code on glusterfs server nodes.
BZ#1612660
A flaw was found in RPC request using gfs2_create_req in glusterfs server. An authenticated attacker could use this flaw to create arbitrary files and execute arbitrary code on glusterfs server nodes.
BZ#1612664
A flaw was found in RPC request using gfs3_rename_req in glusterfs server. An authenticated attacker could use this flaw to write to a destination outside the gluster volume.
BZ#1613143
A flaw was found in RPC request using gfs3_mknod_req supported by glusterfs server. An authenticated attacker could use this flaw to write files to an arbitrary location via path traversal and execute arbitrary code on a glusterfs server node.
BZ#1601657
A flaw was found in the dict.c:dict_unserialize function of glusterfs, which does not handle negative key length values. An attacker could use this flaw to read memory from other locations into the stored dict value.
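The general class of bug is a parser that trusts a signed length field read from untrusted input. The Python sketch below shows the kind of validation that closes it, using a made-up format (two signed 32-bit big-endian lengths followed by the key and value bytes). This format and the function name are assumptions for illustration only and are not the glusterfs dict wire format.

```python
import struct


def parse_pairs(buf: bytes) -> dict:
    """Parse length-prefixed key/value pairs, rejecting bad lengths.

    Hypothetical record layout: int32 key_len, int32 val_len, key, value.
    """
    pairs = {}
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 8:
            raise ValueError("truncated record header")
        key_len, val_len = struct.unpack_from(">ii", buf, pos)
        pos += 8
        # A signed field can decode to a negative number; reject it before
        # it is used in any offset arithmetic.
        if key_len < 0 or val_len < 0:
            raise ValueError("negative length field")
        if key_len + val_len > len(buf) - pos:
            raise ValueError("length exceeds remaining input")
        key = buf[pos:pos + key_len]
        pos += key_len
        pairs[key] = buf[pos:pos + val_len]
        pos += val_len
    return pairs
```

In a memory-unsafe language, skipping the negative-length check lets the length move the read position outside the input buffer, which matches the out-of-bounds read described above.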
BZ#1607617
It was found that an attacker could issue an xattr request via glusterfs FUSE to cause a gluster brick process to crash, which results in a remote denial of service. If gluster multiplexing is enabled, this results in a crash of multiple bricks and gluster volumes.
BZ#1607618
An information disclosure vulnerability was discovered in glusterfs server. An attacker could issue a xattr request via glusterfs FUSE to determine the existence of any file.

write-behind

BZ#1426042
The performance.write-behind-trickling-writes and performance.nfs.write-behind-trickling-writes options enable the trickling-write strategy for the write-behind translator for FUSE and NFS clients respectively.

Web Administration

BZ#1590405
tendrl-ansible runs yum update by default that causes all involved systems to update with latest packages. As a consequence, unintentional patching of OS and services occurs which is not desirable for production systems.
To avoid unintentional package updates by tendrl-ansible, drop the yum update variable from the ansible playbooks.
BZ#1518525
Previously, during Red Hat Gluster Storage Web Administration installation, if the server had multiple active IP addresses, tendrl-ansible failed to automatically choose the correct IP address, causing installation failure. In this version, the user has to set all the required variables for tendrl-ansible as per the installation instructions.
BZ#1563648
Previously, when persisting object details in the central store (etcd), each field was written separately. This triggered multiple REST API calls to etcd per object, resulting in a high volume of network calls. It also led to race conditions when other threads tried to update an object while the saving thread was still writing the data to etcd.
To improve performance and avoid the race condition, the whole object is now serialized into a single JSON document and written to etcd. When the object state is read from etcd, this single JSON document is read and the object is reconstructed from it.
As a result, reading or writing an object now takes a single REST API call to etcd. This also avoids the critical race condition that caused issues in various flows of the system.
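The difference between the two persistence styles can be sketched with a small in-memory stand-in for the key-value store. Everything here is hypothetical: the CountingStore class replaces a real etcd client, and the key layout and sample object are invented for the example.

```python
import json


class CountingStore:
    """In-memory stand-in for a key-value store that counts write calls."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def put(self, key, value):
        self.writes += 1
        self.data[key] = value

    def get(self, key):
        return self.data[key]


def save_per_field(store, key, obj):
    """Old style: one write per field, so readers can see a partial object."""
    for field, value in obj.items():
        store.put(key + "/" + field, json.dumps(value))


def save_as_json(store, key, obj):
    """New style: serialize the whole object and write it in one call."""
    store.put(key, json.dumps(obj, sort_keys=True))


def load_from_json(store, key):
    """Read the single JSON document and rebuild the object from it."""
    return json.loads(store.get(key))


if __name__ == "__main__":
    node = {"fqdn": "node1.example.com", "status": "UP", "bricks": 4}
    old, new = CountingStore(), CountingStore()
    save_per_field(old, "/nodes/1", node)
    save_as_json(new, "/nodes/1", node)
    print(old.writes, new.writes)
```

Because the single write replaces the whole document at once, another thread reading the key sees either the old object or the new one, never a mix of fields from both.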
BZ#1512937
The host-level dashboard in Grafana listed duplicate hostnames with FQDN and IP addresses for a given storage node irrespective of how the peer probe was done. This caused duplicate data for the same node being displayed in the time series data and the Grafana dashboard.
With this fix, the host-level dashboard and other dashboards display each host only once, under the name that was used for the peer probe during Gluster cluster creation.
BZ#1514442
Earlier, if import cluster failed in Web Administration interface, there was no way to initiate import again from UI. The user needed to clean the etcd details and initiate import again. As a consequence, successive attempts to import the same cluster failed.
With the latest changes around import cluster and a new feature to un-manage a cluster and import, if import cluster fails for a cluster due to invalid repositories configured in the storage nodes for installation of components, the user can now correct the issues in the underlying node and then reinitiate import of the cluster. The unmanage cluster operation also helps in this workflow, as the cluster can be un-managed and then re-imported. The import job is successful only if all the nodes report that all required components are installed and running on them with first round of synchronization complete. If any node fails to report the same, import cluster job fails.
With this fix, if import fails, the user can correct the issues on underlying nodes and execute reimport for the cluster.
BZ#1516417
Previously, Web Administration was unable to detect or expand new storage nodes added to the gluster trusted storage pool. As a consequence, Web Administration could not manage or provide metrics for nodes added to a cluster after the initial Import Cluster.
With this fix, Web Administration can now detect and expand new nodes to an already managed cluster once the new nodes are added to the gluster trusted storage pool.
BZ#1549146
Previously, the Grafana dashboard reported unusual and unrealistic values for different performance panels, along with missing performance units. A few of the affected panels that displayed unrealistic numbers were Weeks Remaining, IOPS, and Throughput.
With this fix, the Grafana panels including the IOPS and Weeks Remaining panels display realistic values understandable by the user along with the appropriate performance units. Additionally, the Inode panels were removed from the brick-level and volume-level dashboards of Grafana.
BZ#1516135
Previously, there was no way to unmanage a cluster that was partially imported due to a tendrl-gluster-integration install failure on a few nodes of the cluster. Despite the unsuccessful import, the cluster displayed its status as successfully imported and managed by Web Administration.
As a consequence, the depiction of the cluster in the Web Administration interface was not correct as not all the peers of the cluster were successfully imported and reported in the interface.
With this fix, now an import job would be marked as finished only if all the peers report tendrl-gluster-integration running on them and their first round of synchronization of data is done in the Web Administration environment. With the unmanage cluster functionality in place, any affected cluster can be unmanaged and the underlying issues can be fixed before reimporting the cluster in the Web Administration environment.
If import fails, the issues can be corrected in the underlying cluster and re-imported in the Web Administration environment.
BZ#1519158
Previously, in the Clusters view of the Web Administration interface, the filter button to filter the cluster attributes was unresponsive on Mozilla Firefox web browser. Due to this issue, users were not able to filter the clusters view based on a particular cluster attribute.
This browser issue is now fixed with this release and the filter button is responsive on Firefox.