Bad ext4 sync performance on 16 TB GPT partition


Environment

  • Red Hat Enterprise Linux 5
  • Red Hat Enterprise Linux 6

Issue

  • Writing a single large file with dd and syncing afterwards takes a long time.

# /usr/bin/time bash -c "dd if=/dev/zero of=/mnt/large/10GB bs=1M count=10000 && sync"
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 15.9423 seconds, 658 MB/s
0.01user 441.40system 7:26.10elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+794minor)pagefaults 0swaps

dd: ~16 seconds
sync: ~7 minutes
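
The two phases can also be timed separately to confirm where the time is spent. This is only a rough sketch, assuming the same ext4 mount point /mnt/large used above:

# /usr/bin/time dd if=/dev/zero of=/mnt/large/10GB bs=1M count=10000
# /usr/bin/time sync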

Sync performance is fine in the following cases (example commands for the last two variations are sketched after the list):

  • using xfs
  • using ext3
  • disabling the ext4 journal
  • disabling ext4 extents (with enabled journal)
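
As an illustration, the last two variations can be reproduced roughly as follows. The device /dev/sdb1 is a placeholder, tune2fs requires the filesystem to be unmounted, and mkfs.ext4 destroys any existing data.

Disable the ext4 journal:

# umount /mnt/large
# tune2fs -O ^has_journal /dev/sdb1

Recreate the filesystem without extents, leaving the journal enabled:

# mkfs.ext4 -O ^extent /dev/sdb1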

Resolution

Root Cause

When a sync is done, write_cache_pages is essentially told to sync the whole range of the file. It starts from the first dirty page and loops; nothing stops it until it reaches the end of the file. IO is then submitted, but only a small amount is actually written out. It then comes back, scans the entire file again, and again writes out a small portion. Rinse and repeat, with the number of loop iterations in write_cache_pages decreasing only slightly each time. This is what causes the slowness, and it is why the slowness only becomes apparent on large files.
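
While the sync is running, this behaviour can be observed from a second terminal by watching the dirty page counters drain far more slowly than the disk itself allows, for example:

# watch -n1 'grep -e Dirty -e Writeback /proc/meminfo'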

  • In RHEL 6 the bug is addressed in BZ682831
  • In RHEL 5 the bug is addressed in BZ572930

