IMPORTANT update for upgrading Red Hat Gluster Storage from 3.0.x to 3.1.x
Environment
- Red Hat Gluster Storage 3.1.x
- Red Hat Enterprise Linux 6.x
Issue
- How to upgrade Red Hat Gluster Storage from 3.0.x to 3.1.x?
- Why are geo-replication and self-heal not working properly after upgrading Red Hat Gluster Storage from 3.0.x to 3.1.x?
- Why is the brick volfile missing all the options for the latest Red Hat Gluster Storage version after the upgrade?
- Why are the logs flooded with 0-dict: dict|match|action is NULL [Invalid argument] after a Red Hat Gluster Storage upgrade?
- Why are the below errors seen during an upgrade of Red Hat Gluster Storage?
0-glusterd: geo-replication module not working as desired gsyncd version checking is failed
Resolution
- An issue has been identified with the direct upgrade from Red Hat Gluster Storage 3.0.x to Red Hat Gluster Storage 3.1.x. This issue is being tracked in Bugzilla #1353470.
Updating RHGS 3.0.x to 3.1.x using In-service method
Warning: SMB and CTDB in-service upgrade is not supported. Follow the offline upgrade method if SMB and CTDB are in use.
- Stop any geo-replication sessions before you begin. Note that slave nodes must be updated before master nodes when geo-replication is in use.
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL stop
- Stop the gluster services on the storage server using the following commands:
# service glusterd stop    (RHEL 6)
# systemctl stop glusterd  (RHEL 7)
# pkill glusterfs
# pkill glusterfsd
- Update the server using the following command:
# yum update
Wait for the update to complete.
a. Run the following command to check your log file for the relevant messages, ensuring that you wait until the command-line prompt reappears:
# grep -irns "geo-replication module not working as desired" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | wc -l
b. If the output is 0, continue with the rest of the procedure from Step 4.
If the output is greater than zero, follow these steps.
i. Check whether glusterd is running.
# ps aux | grep glusterd
If glusterd is running, stop glusterd.
ii. Run the following command.
# glusterd --xlator-option *.upgrade=on -N
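The check-and-rerun logic in steps a and b above can be sketched as a small shell helper. This is a minimal sketch; the function name is illustrative, while the message string and the upgrade command are the ones quoted in this article.

```shell
# Return success (0) if the glusterd log contains the upgrade-failure
# message described in this article, non-zero otherwise.
upgrade_rerun_needed() {
    grep -q "geo-replication module not working as desired" "$1"
}

# Typical use on the storage node (commands per the steps above):
#   if upgrade_rerun_needed /var/log/glusterfs/etc-glusterfs-glusterd.vol.log; then
#       service glusterd stop            # only if glusterd is running
#       glusterd --xlator-option '*.upgrade=on' -N
#   fi
```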
- Reboot the server if a kernel update was included as part of the update process in the previous step.
- If no kernel update was included, start glusterd:
# service glusterd start    (RHEL 6)
# systemctl start glusterd  (RHEL 7)
- To verify that you have upgraded to the desired version, run the following command:
# gluster --version
- Ensure that all the bricks are online:
# gluster volume status
- Start self-heal on the volumes:
# gluster volume heal volname
- Verify that self-heal has completed on the replica:
# gluster volume heal volname info
- Repeat Step 2 through Step 9 on the other node of the replica pair.
- When all nodes have been upgraded, run the following command to update the op-version of the cluster. This helps to prevent any compatibility issues within the cluster.
# gluster volume set all cluster.op-version 30712
The op-version 30712 is for RHGS 3.1.3. If the cluster was updated to a different version, refer to this knowledge base article and set the appropriate op-version for the cluster.
- If geo-replication was in use, start the geo-replication session:
# gluster volume geo-replication MASTER_VOL SLAVE_HOST::SLAVE_VOL start [force]
Updating RHGS 3.0.x to 3.1.x using Offline method
- Make a complete backup using a reliable backup solution.
- When it is certain that the backup is working, stop the volumes:
# gluster volume stop volname
- Stop the gluster services on the storage server using the following commands:
# service glusterd stop    (RHEL 6)
# systemctl stop glusterd  (RHEL 7)
# pkill glusterfs
# pkill glusterfsd
- Update the server using the following command:
# yum update
Wait for the update to complete.
a. Run the following command to check your log file for the relevant messages, ensuring that you wait until the command-line prompt reappears:
# grep -irns "geo-replication module not working as desired" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | wc -l
b. If the output is 0, continue with the rest of the procedure from Step 4.
If the output is greater than zero, follow these steps.
i. Check whether glusterd is running.
# ps aux | grep glusterd
If glusterd is running, stop glusterd.
ii. Run the following command.
# glusterd --xlator-option *.upgrade=on -N
- Start the glusterd service:
# service glusterd start    (RHEL 6)
# systemctl start glusterd  (RHEL 7)
- When all nodes have been upgraded, run the following command to update the op-version of the cluster. This helps to prevent any compatibility issues within the cluster.
# gluster volume set all cluster.op-version 30712
The op-version 30712 is for RHGS 3.1.3. If the cluster was updated to a different version, refer to this knowledge base article and set the appropriate op-version for the cluster.
- Start the volumes with the following command:
# gluster volume start volname
Already upgraded to RHGS 3.1.x and vol files do not contain the options mentioned in the diagnostic steps
Note: If the above-mentioned in-service or offline method is not followed for the upgrade, it might cause issues with some basic functionality. Follow the procedure below to resolve this.
- Start and stop a volume profile to regenerate the volfile:
# gluster volume profile <volname> start
# gluster volume profile <volname> stop
Verify whether the volfile has the below options:
volume volname-index
    type features/index
    option xattrop-pending-watchlist trusted.afr.volname-
    option xattrop-dirty-watchlist trusted.afr.dirty
    option index-base /<brickpath>/.glusterfs/indices
    subvolumes volname-barrier
end-volume
Note: Vol files are present at /var/lib/glusterd/vols/<VOLNAME>
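As a quick check, the expected options can be grepped for in the volfile. This is a minimal sketch; the function name is illustrative, and the option names are taken from the listing above.

```shell
# Return success (0) if the given brick volfile contains the index-xlator
# options that a correctly regenerated volfile is expected to carry,
# non-zero otherwise.
volfile_has_index_options() {
    grep -q "xattrop-pending-watchlist" "$1" &&
    grep -q "xattrop-dirty-watchlist"   "$1" &&
    grep -q "index-base"                "$1"
}

# Typical use:
#   volfile_has_index_options /var/lib/glusterd/vols/<VOLNAME>/<brick-volfile> \
#       && echo "options present" || echo "options missing"
```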
- Stop the gluster volumes:
# gluster volume stop <volname>
- Navigate to the brick directories and execute the attached generate-index-files.sh script. This can be run in parallel on all the brick directories and on all the nodes. Once the script is initiated on all the brick directories, proceed to Step 4; there is no need to wait for the script to complete, let it fix the indices in the background. This will resolve any existing issues with self-heal.
# ./generate-index-files.sh <path-to-brick> <volname> replicate
- Start the volumes:
# gluster volume start <volname>
- Make sure the generate-index-files.sh script has completed and no files are pending to heal:
# gluster v heal <volname> info
Verify this output against the content of the directory $brick/.glusterfs/indices/xattrop/
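One way to cross-check the index directory is to count its pending entries. This is a minimal sketch under the assumption that, as in the layout referenced above, the xattrop directory holds one xattrop-* base file that does not itself represent a pending heal; the function name is illustrative.

```shell
# Count entries in a brick's .glusterfs/indices/xattrop directory,
# excluding the xattrop-* base file. A result of 0 suggests no
# entries are pending heal for that brick.
pending_index_count() {
    find "$1" -mindepth 1 ! -name 'xattrop*' | wc -l
}

# Typical use:
#   pending_index_count /<brickpath>/.glusterfs/indices/xattrop
```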
- If a split-brain exists, refer to this knowledge base article and resolve it manually.
Root Cause
- During the upgrade process, yum update is expected to run glusterd --xlator-option *.upgrade=on -N, but this fails while executing gsyncd --version via the runner interface. Because glusterd is not started with the mentioned option, the new volfile with the options of the upgraded version is not generated. This eventually causes issues with the functionality of the upgraded version of Red Hat Gluster Storage.
Diagnostic Steps
- After the upgrade, check whether the brick volfile has the below options:
volume volname-index
    type features/index
    option xattrop-pending-watchlist trusted.afr.volname-
    option xattrop-dirty-watchlist trusted.afr.dirty
    option index-base /<brickpath>/.glusterfs/indices
    subvolumes volname-barrier
end-volume
- Check /var/log/glusterfs/etc-glusterfs-glusterd.vol.log for any of the following messages:
  - geo-replication module not working as desired
  - gsyncd version checking is failed
- Check the brick logs /var/log/glusterfs/bricks/<brick-path>.log, while any read/write is happening, for the below message:
0-dict: dict|match|action is NULL [Invalid argument]
Attachments
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.