15.3. Restoring a bad file

When the scrubber reports bad files, use the following procedure to heal each file by recovering a good copy from a replicate volume.

Important

The following procedure is easier if GFID-to-path translation is enabled.
Mount all volumes using the -o aux-gfid-mount mount option, and enable GFID-to-path translation on each volume by running the following command:
# gluster volume set VOLNAME build-pgfid on
Files created before this option was enabled must be looked up with the find command.

Procedure 15.1. Restoring a bad file from a replicate volume

  1. Note the identifiers of bad files

    Check the output of the scrub status command to determine the identifiers of corrupted files.
    # gluster volume bitrot VOLNAME scrub status
    Volume name: VOLNAME
    ...
    Node name: NODENAME
    ...
    Error count: 3
    Corrupted objects:
    5f61ade8-49fb-4c37-af84-c95041ff4bf5
    e8561c6b-f881-499b-808b-7fa2bce190f7
    eff2433f-eae9-48ba-bdef-839603c9434c
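When many objects are reported, copying identifiers by hand is error-prone. The following is a minimal sketch, not part of the product, that filters GFID-shaped tokens out of the scrub status output. The function name extract_gfids is hypothetical, and the pattern matches any UUID-shaped string, so it assumes that no other UUIDs appear in the output.

```shell
# Hypothetical helper: read scrub status output on stdin and print every
# GFID-shaped token once, one per line (sorted, duplicates removed).
# Assumption: no other UUID-shaped strings appear in the output.
extract_gfids() {
    grep -Eo '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' | sort -u
}

# Usage (as root on a storage node; VOLNAME is a placeholder):
#   gluster volume bitrot VOLNAME scrub status | extract_gfids
```

The output can then be saved to a file and used as input for the remaining steps.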
  2. Determine the path of each corrupted object

    For files created after GFID-to-path translation was enabled, use the getfattr command to determine the path of the corrupted files.
    # getfattr -n glusterfs.ancestry.path -e text /mnt/VOLNAME/.gfid/GFID
    ...
    glusterfs.ancestry.path="/path/to/corrupted_file"
    For files created before GFID-to-path translation was enabled, use the find command to locate the index file that matches the GFID, and then run find again with -samefile to locate the corrupted file that shares its inode.
    # find /rhgs/brick*/.glusterfs -name GFID
    /rhgs/brick1/.glusterfs/path/to/GFID
    # find /rhgs -samefile /rhgs/brick1/.glusterfs/path/to/GFID
    /rhgs/brick1/.glusterfs/path/to/GFID
    /rhgs/brick1/path/to/corrupted_file
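On large bricks, the first find can be slow. As a hedged shortcut, the sketch below computes where the index file is expected to be, assuming the common brick layout in which the entry is stored under two subdirectories named after the first two and the next two characters of the GFID. That layout and the function name gfid_backend_path are assumptions of this example; confirm that the printed path exists before relying on it, and fall back to find if it does not.

```shell
# Hypothetical helper: print the expected index-file path for a GFID on a brick.
# Assumption: entries are laid out as
#   BRICK/.glusterfs/<GFID chars 1-2>/<GFID chars 3-4>/<GFID>
gfid_backend_path() {
    brick=$1
    gfid=$2
    d1=$(printf '%s' "$gfid" | cut -c1-2)   # first two characters
    d2=$(printf '%s' "$gfid" | cut -c3-4)   # next two characters
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$gfid"
}

# Usage (brick path and GFID are placeholders):
#   p=$(gfid_backend_path /rhgs/brick1 GFID)
#   ls -l "$p" && find /rhgs/brick1 -samefile "$p"
```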
  3. Delete the corrupted files

    Delete the corrupted files from the path output by the getfattr or find command.
  4. Delete the GFID file

    Delete the GFID file from the /rhgs/brickN/.glusterfs directory.
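Because the corrupted file and its GFID file are hard links to the same inode, it can help to review every name that will be removed before deleting anything. The sketch below only prints candidate rm commands and deletes nothing. The function name print_removals is hypothetical, and the single-quoting it applies does not handle paths that themselves contain a single quote.

```shell
# Hypothetical helper: print (do not run) an rm command for every name on the
# brick that shares an inode with the given GFID index file.
# Paths are wrapped in single quotes; paths containing a single quote are not
# handled and must be reviewed by hand.
print_removals() {
    brick=$1
    gfid_entry=$2
    find "$brick" -samefile "$gfid_entry" -print | sed "s/.*/rm -- '&'/"
}

# Usage (paths are placeholders):
#   print_removals /rhgs/brick1 /rhgs/brick1/.glusterfs/path/to/GFID
```

Review the printed commands, then run them on the node that holds the corrupted copy.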
  5. Restore the file

    Follow these steps to safely restore corrupt files.
    1. Disable metadata caching

      If the metadata cache is enabled, disable it by running the following command:
      # gluster volume set VOLNAME stat-prefetch off
    2. Create a recovery mount point

      Create a mount point to use for the recovery process, for example, /mnt/recovery.
      # mkdir /mnt/recovery
    3. Mount the volume with timeouts disabled

      # mount -t glusterfs -o attribute-timeout=0,entry-timeout=0 hostname:volume-path /mnt/recovery
    4. Heal files and hard links

      Access files and hard links to heal them. For example, run the stat command on the files and hard links you need to heal.
      $ stat /mnt/recovery/corrupt-file
      If you do not have client self-heal enabled, you must manually heal the volume with the following command.
      # gluster volume heal VOLNAME
    5. Unmount and optionally remove the recovery mount point

      # umount /mnt/recovery
      # rmdir /mnt/recovery
    6. Optional: Re-enable metadata caching

      If the metadata cache was enabled previously, re-enable it by running the following command:
      # gluster volume set VOLNAME stat-prefetch on
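To decide whether the metadata cache needs to be disabled, and later re-enabled, you can record its current setting first. The sketch below parses the two-column table printed by gluster volume get. The function name option_value is hypothetical, and the assumed output format (option name in the first column, value in the second) should be checked against your version.

```shell
# Hypothetical helper: read "gluster volume get" output on stdin and print the
# value of the named option.
# Assumption: each option is printed as "<option name>   <value>" on one line.
option_value() {
    awk -v opt="$1" '$1 == opt { print $2 }'
}

# Usage (VOLNAME is a placeholder):
#   prev=$(gluster volume get VOLNAME performance.stat-prefetch \
#            | option_value performance.stat-prefetch)
#   ...perform the restore steps...
#   [ "$prev" = "on" ] && gluster volume set VOLNAME stat-prefetch on
```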
The next time that the bitrot scrubber runs, this GFID is no longer listed (unless it has become corrupted again).
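After the scrubber has run again, this can be confirmed with a simple check. The following sketch tests whether a given GFID still appears in the scrub status output; the function name gfid_listed is hypothetical.

```shell
# Hypothetical helper: exit 0 if the GFID given as $1 appears in the scrub
# status output read from stdin, and non-zero otherwise.
gfid_listed() {
    grep -qF -- "$1"
}

# Usage (VOLNAME and GFID are placeholders):
#   if gluster volume bitrot VOLNAME scrub status | gfid_listed GFID; then
#       echo "GFID is still reported as corrupted"
#   else
#       echo "GFID is no longer reported"
#   fi
```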