RHEL6: Possible data loss on ext4 filesystem after system loses power

Solution Verified

Environment

  • Red Hat Enterprise Linux 6
    • reported on kernel 2.6.32-279.el6
    • kernels prior to kernel-2.6.32-358.18.1.el6 likely affected
  • Any ext4 filesystem on which applications store data or metadata in files opened with synchronous I/O flags such as O_SYNC.
  • This issue has been observed on bare metal as well as virtual guests under KVM.

Issue

  • When the system loses power, data can be lost for any application that stores information or metadata in files on an ext4 filesystem.

Resolution

  • Fixed in the 6.4.z kernel kernel-2.6.32-358.18.1.el6, released with RHSA-2013:1173.
    • Originally tracked by private Red Hat Bug 955807 - Data loss in ext4 file system after power loss.
  • For Red Hat Enterprise Linux Server 6.3 EUS, fixed kernel packages are available in RHBA-2014:0291.
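
To check whether a host still runs a kernel older than the 6.4.z fix, the release string can be compared against 2.6.32-358.18.1.el6. The following is a minimal sketch, not a Red Hat tool: the function names are hypothetical, and the comparison is a simplified numeric one rather than full RPM version ordering. It only applies to the 6.4.z stream and later. A fixed 6.3 EUS kernel from RHBA-2014:0291 would be reported as older, so check EUS systems against that advisory instead.

```python
import platform
import re

# kernel-2.6.32-358.18.1.el6, the fixed 6.4.z kernel named in the Resolution.
FIXED_6_4Z = (2, 6, 32, 358, 18, 1)


def release_tuple(release):
    """Return the numeric fields that precede the '.elN' dist tag.

    Example: '2.6.32-279.el6.x86_64' gives (2, 6, 32, 279).
    """
    head = re.split(r"\.el\d", release, maxsplit=1)[0]
    return tuple(int(n) for n in re.findall(r"\d+", head))


def predates_fix(release, fixed=FIXED_6_4Z):
    """True if the release sorts before the fixed 6.4.z kernel.

    Simplified: plain tuple comparison, not RPM version ordering.
    """
    return release_tuple(release) < fixed


if __name__ == "__main__":
    running = platform.release()
    state = "predates" if predates_fix(running) else "does not predate"
    print(f"{running} {state} kernel-2.6.32-358.18.1.el6")
```

A result of "predates" on a RHEL 6.4 or later system suggests that the kernel update from RHSA-2013:1173 has not been applied or that the system has not been rebooted into it.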

Root Cause

  • Due to several bugs in the ext4 code, data integrity system calls did not always properly persist data on the disk. As a result, data in the ext4 file system that had not actually been synchronized to disk could be lost after an unexpected system termination, such as a power failure. A series of patches has been applied to the ext4 code to address this problem, including a fix that ensures proper usage of data barriers in the code responsible for file synchronization. With these patches applied, data loss no longer occurs in the described situation.

Diagnostic Steps

  1. Files are open for writing (with O_SYNC) and fsync() is committing data to disk.
  2. Power is lost during this operation.
  3. After reboot, data loss is observed.
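
The write pattern in step 1 can be sketched as below. This is an illustrative workload, not the reproducer from the original report: the record format and the function names are assumptions. The writer appends numbered records to a file opened with O_SYNC, calls fsync() after each one, and prints the sequence number only once the record has been acknowledged. The checker, run after reboot, counts how many leading records are intact.

```python
import os

PAYLOAD = b"x" * 64  # arbitrary filler; the record format is an assumption


def make_record(seq, payload=PAYLOAD):
    """Build one fixed-format record: 8-digit sequence number, payload, newline."""
    return b"%08d " % seq + payload + b"\n"


def write_all(fd, data):
    """Write every byte of data, retrying after short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_synced_records(path, count, payload=PAYLOAD):
    """Append `count` records using O_SYNC plus an explicit fsync() per record."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        for seq in range(count):
            write_all(fd, make_record(seq, payload))
            os.fsync(fd)
            # Printed only after fsync() returned, so the record should be on disk.
            print(seq, flush=True)
    finally:
        os.close(fd)


def count_intact_records(path, payload=PAYLOAD):
    """Count the leading records that are complete and in sequence."""
    intact = 0
    with open(path, "rb") as f:
        for line in f:
            if line != make_record(intact, payload):
                break
            intact += 1
    return intact
```

For a test, the printed sequence numbers need to be captured somewhere that survives the power cut, for example a serial console or a remote session. If the last acknowledged number is N and count_intact_records() returns fewer than N + 1 after reboot, acknowledged data was lost.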

