Error Message Guide

Red Hat Gluster Storage 3

Error descriptions and recommended actions for errors that may occur in the Red Hat Storage Server environment.

Pavithra Srinivasan

Red Hat Engineering Content Services

Abstract

The Red Hat Storage Error Messages Guide provides information to help you resolve errors that occur in the Red Hat Storage environment, or at least narrow down the parameters of the error.

Chapter 1. Introduction

The guide provides a list of error messages that may appear while running the Red Hat Storage Server. Each message includes the following information:
  • Message ID
  • Description
  • Recommended Action

Message ID

The message ID is an internally generated ID that uniquely identifies the message. The guide is divided into ranges of Message IDs.

Within each section, the messages are ordered by Message ID. The best way to find a particular message is to search for its Message ID.

Description

The description explains the error that was encountered, including any background information that may help you determine the reason for the error.

Recommended Action

The recommended action lists the steps to take to recover from any problems caused by the error.

Important

The message IDs appear in the log files only if the with-msg-id format is configured using the gluster volume set <VOLNAME> OPTION PARAMETER command. Administrators can search for a Message ID in the log files. This guide contains the description and recommended action for each documented Message ID.
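Searching the logs for a Message ID can be scripted. The following is a minimal Python sketch, not a supported tool. It assumes that the with-msg-id format writes the ID into each log line as a bracketed tag of the form [MSGID: 106005]; verify this against your own log files before relying on it. The function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Iterator

# Assumption: the with-msg-id format adds a tag such as "[MSGID: 106005]" to a log line.
MSGID_TAG = re.compile(r"\[MSGID:\s*(\d+)\]")


def lines_with_msgid(lines: Iterable[str], msg_id: int) -> Iterator[str]:
    """Yield the log lines that carry the given Message ID."""
    for line in lines:
        match = MSGID_TAG.search(line)
        if match and int(match.group(1)) == msg_id:
            yield line.rstrip("\n")


def count_msgids(lines: Iterable[str]) -> Counter:
    """Count how often each Message ID occurs in the given lines."""
    counts = Counter()
    for line in lines:
        match = MSGID_TAG.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Both functions accept any iterable of lines, so an open log file can be passed directly, for example lines_with_msgid(open(log_path), 106005), where log_path is the log file you want to inspect.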
This guide uses several storage terms that are specific to the Red Hat Storage environment. For definitions, see the section Storage Concepts in the Red Hat Storage Administration Guide.
This is the first release of the Red Hat Storage Error Messages Guide. This release documents the commonly occurring errors in the following components:
  • Automatic File Replication Translator
  • glusterFS Management Daemon
  • Quota Daemon
  • Distributed Hash Table Translator

Chapter 2. glusterFS Management Daemon

2.1. glusterFS Management Daemon

The glusterFS Management Daemon is used for elastic volume management. The service runs on all Red Hat Storage servers. This chapter provides a listing of the possible error scenarios in the glusterFS Management Daemon.

glusterFS Management Daemon Error Scenarios

106001
Description: The operation could not be performed because the server quorum was not met.
Recommended Action: Ensure that the other peer Red Hat Storage nodes are online and reachable from the local peer Red Hat Storage node.
106002
Description: The local bricks belonging to the volume were killed because server-quorum was not met.
Recommended Action: Ensure that the other Red Hat Storage nodes are online and reachable from the local peer node.
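For messages 106001 and 106002, the peers that are not connected can be listed from the output of the gluster peer status command. The sketch below is illustrative only. It assumes that the output contains a "Hostname:" line followed by a "State:" line for each peer, and that a healthy peer reports "(Connected)" in its state; check this against the output on your own nodes. The function name is hypothetical.

```python
from typing import List


def disconnected_peers(peer_status_output: str) -> List[str]:
    """Return the hostnames whose State line does not report "(Connected)"."""
    offline = []
    hostname = None
    for raw in peer_status_output.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            hostname = line.split(":", 1)[1].strip()
        elif line.startswith("State:") and hostname is not None:
            if "(Connected)" not in line:
                offline.append(hostname)
            hostname = None  # wait for the next Hostname line
    return offline
```

On a storage node, the input text could be captured with subprocess.run(["gluster", "peer", "status"], capture_output=True, text=True).stdout.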
106003
Description: Informational message: The local bricks belonging to the volume were restarted because server-quorum was met.
Recommended Action: None.
106004
Description: The glusterFS Management Daemon is either offline or the peer Red Hat Storage node is unreachable.
Recommended Action: Ensure that the glusterFS Management Daemon is running on the peer node and that firewall rules are not blocking port 24007.
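Whether port 24007 on a peer is reachable from the local node can be checked with a plain TCP connection attempt. The following Python sketch uses only the standard library. The function name and the default timeout are illustrative choices, not part of Red Hat Storage.

```python
import socket

GLUSTERD_PORT = 24007  # port named in the recommended action for message 106004


def port_reachable(host: str, port: int = GLUSTERD_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A True result shows only that something accepts connections on the port. It does not confirm that the glusterFS Management Daemon itself is healthy, so a False result is the more informative outcome: either the daemon is not running on the peer or the traffic is blocked on the way.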
106005
Description: The brick process is offline.
Recommended Action:
  1. Check the brick log files to identify why the brick went offline.
  2. Run the gluster volume start <VOLNAME> force command to bring the bricks back online.
106006
Description: Either the glusterNFS Server or Self-heal Daemon is offline.
Recommended Action:
  1. Check the Self-heal Daemon and glusterNFS Server log files to identify why the service went offline.
  2. Run the gluster volume start <VOLNAME> force command to bring the glusterNFS Server and the Self-heal Daemon back online.
106007
Description: The rebalance process is offline.
Recommended Action: Run the gluster volume rebalance <VOLNAME> status command to check whether the rebalance operation is complete. If it is not complete, run the gluster volume rebalance <VOLNAME> start command.
106008
Description: The volume cleanup operation failed.
Recommended Action: None.
106009
Description: A volume version mismatch occurred while adding a peer Red Hat Storage node.
Recommended Action: None.
106010
Description: A volume checksum version mismatch occurred while adding a peer Red Hat Storage node.
The log includes the peer name which caused the mismatch.
Recommended Action:
  1. Identify the node that is causing the checksum error.
  2. Contact Red Hat Global Support Services.
106011
Description: A volume quota configuration version mismatch occurred while adding a peer Red Hat Storage node.
Recommended Action: None.
106012
Description: A quota configuration version checksum mismatch occurred while adding a Red Hat Storage node.
Recommended Action:
  1. Identify the node that is causing the checksum error. The log includes the peer name which caused the mismatch.
  2. Contact Red Hat Global Support Services.
106013
Description: The brick process could not be brought offline.
Recommended Action:
  1. Identify the PID of the brick process from the log file.
  2. Kill the brick process manually.
106014
Description: One of the following services could not be brought offline: glusterNFS Server, Quota Daemon, Self-heal Daemon, or brick process.
Recommended Action:
  1. Identify the PID of the process from the log file.
  2. Kill the process manually.
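For messages 106013 and 106014, once the PID has been read from the log file, the process can be stopped manually. The Python sketch below is one possible approach and assumes it is run with sufficient privileges on the node that owns the process. It sends SIGTERM by default and SIGKILL only on request; that ordering is a choice made for this example, not a requirement stated in this guide. The function names are hypothetical.

```python
import os
import signal


def pid_exists(pid: int) -> bool:
    """Check for a process without disturbing it (signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate_pid(pid: int, force: bool = False) -> bool:
    """Send SIGTERM, or SIGKILL when force is True, to the process.

    Returns False if no process with that PID exists.
    """
    sig = signal.SIGKILL if force else signal.SIGTERM
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        return False
    return True
```

A typical sequence is terminate_pid(pid), a short wait, and then terminate_pid(pid, force=True) only if pid_exists(pid) still returns True.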
106015
Description: The process with the specified PID could not be killed.
Recommended Action: None.
106016
Description: The rebalance socket file is missing.
Recommended Action: Execute the command gluster volume rebalance <VOLNAME> start.
106017
Description: The UNIX options could not be set as the Red Hat Storage server is out of memory.
Recommended Action: Restart the Red Hat Storage server.
106018
Description: The rebalance process failed as the glusterFS Management Daemon could not establish an RPC connection.
Recommended Action:
  1. Identify the reason for failure in the log file to resolve the issue.
  2. Execute gluster volume rebalance <VOLNAME> start.
106019
Description: The default volume options could not be set during the gluster volume create or gluster volume reset command.
Recommended Action: None.

For more information on where the log files are located, see the section Red Hat Storage Component Logs and Location in the Red Hat Storage 3.0 Administration Guide.

Chapter 3. Automatic File Replication Translator

3.1. Automatic File Replication Translator

The Automatic File Replication (AFR) translator in glusterFS uses extended attributes to keep track of file operations.

Automatic File Replication Translator Error Scenarios

108001
Description: The file modification operations are not allowed because the client quorum is not met. A few brick processes are either offline or not visible from the client.
Recommended Action: Ensure that the bricks are online and the packet traffic in the network is not blocked.
108002
Description: The bricks that were earlier offline are now online and the client quorum is restored.
Recommended Action: Check the brick and mount log files to identify why the bricks went offline.
108003
Description: Informational message: The quorum-count option is no longer valid because the client quorum-type was set to auto.
Recommended Action: None.
108004
Description: Replication subvolume received a connection notification from a brick that does not belong to the replica set.
Recommended Action: None.
108005
Description: A replica set that was earlier inaccessible because all its bricks were offline is now accessible because at least one of the bricks came back online.
Recommended Action: Check the brick and mount log files to identify why the bricks went offline.
108006
Description: All the bricks of a replica set are down. The data residing in that replica cannot be accessed until one of the bricks in the replica set is online.
Recommended Action: Ensure that the bricks are brought back online.
108007
Description: Entry unlocks failed on a brick.
Recommended Action:
  1. Identify the reason for the failure in the client log files. The error number indicates the reason for failure.
  2. Examine the brick log files for more information.
108008
Description: The data, metadata, or GFID of the file is inconsistent among the bricks of a replica set.
Recommended Action:
  1. Clear the AFR changelog attributes from the appropriate brick to resolve the split-brain issue.
  2. Execute the command gluster volume heal <VOLNAME>
    To resolve split-brain issues, see the section Managing Split-brain in the Red Hat Storage Administration Guide.
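Before clearing any AFR changelog attributes for message 108008, it helps to read what they record. The Python sketch below decodes one changelog value as printed in hexadecimal, for example by getfattr -d -m . -e hex run against the file on a brick. It assumes the commonly documented layout of three 32-bit big-endian counters for pending data, metadata, and entry operations; confirm this against the Managing Split-brain section before acting on the result. The names are hypothetical, and the split-brain check is a simplification for a two-brick replica set.

```python
import struct
from typing import NamedTuple


class PendingCounts(NamedTuple):
    data: int
    metadata: int
    entry: int


def decode_afr_changelog(hex_value: str) -> PendingCounts:
    """Split a 12-byte changelog value into its three big-endian 32-bit counters."""
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    return PendingCounts(*struct.unpack(">III", raw))


def blame_each_other(a_about_b: PendingCounts, b_about_a: PendingCounts) -> bool:
    """True when both bricks record pending operations of the same kind against each other."""
    return any(x > 0 and y > 0 for x, y in zip(a_about_b, b_about_a))
```

For example, decode_afr_changelog("0x000003d70000000100000000") yields 983 pending data operations, 1 pending metadata operation, and 0 pending entry operations.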
108009
Description: Either the open() or opendir() call failed on the brick.
Recommended Action:
  1. Identify the reason for the failure in the client log files. The error number indicates the reason for failure.
  2. Examine the brick log files for more information.

Chapter 4. Distributed Hash Table Translator

4.1. Distributed Hash Table Translator

The Distributed Hash Table (DHT) translator in glusterFS distributes data across the bricks depending on the filenames.
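The idea of placing files by name can be illustrated with a toy model. The Python sketch below is not the glusterFS implementation: it uses zlib.crc32 as a stand-in hash and divides a 32-bit hash space evenly, whereas glusterFS uses its own hash function and stores a layout per directory. All names are hypothetical. It is meant only to give terms used in this chapter, such as layout and hashed subvolume, a concrete shape.

```python
import zlib
from typing import List, Tuple

HASH_SPACE = 2 ** 32  # a 32-bit hash space, split among the subvolumes

# Each layout entry: (range start, range end inclusive, subvolume name)
Layout = List[Tuple[int, int, str]]


def build_layout(subvolumes: List[str]) -> Layout:
    """Give each subvolume an equal, contiguous slice of the hash space."""
    step = HASH_SPACE // len(subvolumes)
    layout = []
    for i, name in enumerate(subvolumes):
        start = i * step
        # The last range absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == len(subvolumes) - 1 else start + step - 1
        layout.append((start, end, name))
    return layout


def hashed_subvolume(filename: str, layout: Layout) -> str:
    """Pick the subvolume whose range contains the hash of the file name."""
    h = zlib.crc32(filename.encode("utf-8"))  # stand-in hash, not the one glusterFS uses
    for start, end, name in layout:
        if start <= h <= end:
            return name
    raise LookupError("no layout range covers this hash")
```

In this model, a layout with a gap or an overlap would leave some names without a single owner, which is the kind of condition that the layout-related messages in this chapter report.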

Distributed Hash Table Translator Error Scenarios

109001
Description: A cached subvolume could not be found for the specified path.
Recommended Action: None.
109002
Description: A linkfile creation failed.
Recommended Action: None.
109003
Description: The value could not be set for the specified key in the dictionary.
Recommended Action: None.
109004
Description: Directory attributes could not be healed.
Recommended Action: None.
109005
Description: Self-heal failed for the specified directory.
Recommended Action:
  1. Ensure that all the subvolumes are online and reachable.
  2. Perform a lookup operation on the directory again.
109006
Description: The extended attributes could not be healed for the specified directory on the specified subvolume.
Recommended Action: None.
109007
Description: A lookup operation found a file with the same path on multiple subvolumes.

Note

If a rebalance process is in progress, this message can be ignored.
Recommended Action:
  1. Create backups of the file on other subvolumes.
  2. Inspect the content of all the files to identify and retain the most appropriate one.
109008
Description: A path resolves to a file on one subvolume and a directory on another.
Recommended Action:
  1. Create a backup of the file with a different name and delete the original file.
  2. In the newly created backup file, remove the trusted.gfid extended attribute using the command:
    setfattr -x trusted.gfid <path to the newly created backup file>
  3. Perform a new lookup operation on both the new and old paths.
  4. From the mount point, inspect both the paths and retain the relevant file or directory.
109009
Description: The GFID of the file or directory is different on different subvolumes.
Recommended Action: None.
109010
Description: The GFID of the specified file or directory is NULL.
Recommended Action: None.
109011
Description: The hashed subvolume could not be found for the specified file or directory.
Recommended Action: None.
109012
Description: The Distributed Hash Table Translator could not be initiated as the system is out of memory.
Recommended Action: None.
109013
Description: Invalid Distributed Hash Table configuration in the volume configuration file.
Recommended Action: None.
109014
Description: Invalid disk layout.
Recommended Action: None.
109015
Description: Invalid Distributed Hash Table configuration option.
Recommended Action:
  1. Reset the option with a valid value using the gluster volume set <VOLNAME> command.
  2. Restart the process that logged the message in the log file.
109016
Description: The fix layout operation failed.
Recommended Action: None.
109017
Description: Layout merge failed.
Recommended Action: None.
109018
Description: The layout for the specified directory does not match that on the disk.
Recommended Action: None.
109019
Description: No layout is present for the specified file or directory.
Recommended Action: None.
109020
Description: Informational message: Migration of data from the cached subvolume to the hashed subvolume is complete.
Recommended Action: None.
109021
Description: Migration of data failed during the rebalance operation.
Cause: Directories could not be read to identify the files for the migration process.
Recommended Action: The log message indicates the reason for the failure. The corrective action depends on the specific error that is encountered, which is one of the standard UNIX errors.
109022
Description: Informational message: The file was migrated successfully during the rebalance operation.
Recommended Action: None.
109023
Description: File migration failed during the rebalance operation.
Cause: Rebalance moves data from the cached subvolume to the hashed subvolume. Migrating a single file is a multi-step operation that involves opening, reading, and writing the data and metadata. A failure in any of these steps can result in a file migration failure.
Recommended Action: The log message indicates the reason for the failure. The corrective action depends on the specific error that is encountered, which is one of the standard UNIX errors.
109024
Description: The system is out of memory.
Recommended Action: None.
109025
Description: The opendir() call failed on the specified directory.
Cause: When a directory is renamed, the DHT Translator checks whether the destination directory is empty. This message indicates that the opendir() call on the destination directory has failed.
Recommended Action: The log message indicates the reason for the failure. The corrective action depends on the specific error that is encountered, which is one of the standard UNIX errors.
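Messages 109021, 109023, and 109025 refer to standard UNIX errors. When a log line reports such an error as a number or a symbolic name, the Python standard library can translate it into a readable description. The sketch below is a small convenience helper with a hypothetical name; how the error appears in a particular log line is not specified in this guide.

```python
import errno
import os


def describe_errno(code) -> str:
    """Accept an error number (2) or a symbolic name ("ENOENT") and describe it."""
    if isinstance(code, str):
        number = getattr(errno, code.upper(), None)
        if not isinstance(number, int):
            raise ValueError(f"unknown errno name: {code}")
    else:
        number = int(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    # os.strerror returns the platform's message text for the error number.
    return f"{name} ({number}): {os.strerror(number)}"
```

For example, describe_errno("ENOSPC") and describe_errno("ENOTCONN") return the platform's descriptions of a full file system and of a disconnected endpoint, which point to different corrective actions.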
109026
Description: The rebalance operation failed.
Possible Cause: A subvolume is down.
Recommended Action: Restart the rebalance operation after all the subvolumes are online.
109027
Description: Failed to start the rebalance process.
Recommended Action: Identify the reason for failure in the log files.
109028
Description: Informational message: Indicates the status of the rebalance operation and details such as the number of files migrated, skipped, or failed.
Recommended Action: None.
109029
Description: The rebalance operation was aborted by the user.
Recommended Action: None.
109030
Description: The file or directory could not be renamed.
Recommended Action: Ensure that all the subvolumes are reachable and try renaming the file or directory again.
109031
Description: Attributes could not be set for the specified file or directory.
Recommended Action: None.
109032
Description: The specified subvolume is running out of file system inodes. If all subvolumes run out of inodes, then new files cannot be created.
Recommended Action: Add more nodes to the cluster if all subvolumes run out of inodes.
109033
Description: The specified subvolume is running out of disk space. If all subvolumes run out of space, new files cannot be created.
Recommended Action: Add more nodes to the cluster if all subvolumes run out of disk space.
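For messages 109032 and 109033, inode and disk space consumption can be checked on the file system that backs each brick. The following Python sketch uses os.statvfs from the standard library. The function and type names are hypothetical, and the brick directory must be supplied by the caller.

```python
import os
from typing import NamedTuple


class BrickUsage(NamedTuple):
    space_used_pct: float
    inodes_used_pct: float


def brick_usage(path: str) -> BrickUsage:
    """Report the percentage of blocks and inodes in use on the file system holding path."""
    st = os.statvfs(path)

    def used_pct(total: int, free: int) -> float:
        # Some file systems report a total of zero, for example when inodes
        # are allocated dynamically; treat that as 0% rather than dividing by zero.
        return 0.0 if total == 0 else 100.0 * (total - free) / total

    return BrickUsage(
        space_used_pct=used_pct(st.f_blocks, st.f_bfree),
        inodes_used_pct=used_pct(st.f_files, st.f_ffree),
    )
```

Running brick_usage on each brick directory of a volume shows whether only one subvolume is close to its limit or whether all of them are, which is the case in which the recommended action is to add more nodes.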
109034
Description: Failed to unlink the specified file or directory.
Recommended Action: The log message indicates the reason for the failure. The corrective action depends on the specific error that is encountered.
109035
Description: The layout information could not be set in the inode.
Recommended Action: None.

Appendix A. Revision History

Revision History
Revision 3-18, Thu Oct 09 2014, Pavithra Srinivasan
Version for 3.0 GA release.

Legal Notice

Copyright © 2013-2014 Red Hat, Inc.
This document is licensed by Red Hat under the Creative Commons Attribution-ShareAlike 3.0 Unported License. If you distribute this document, or a modified version of it, you must provide attribution to Red Hat, Inc. and provide a link to the original. If the document is modified, all Red Hat trademarks must be removed.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.