Large file copies require manual drop_caches to maintain throughput
Hello
I have four identical systems, all now on RHEL 6.8/6.9. The difference is that two were originally installed with RHEL 6.2 and the other two around RHEL 6.5.
The systems have a 12-core CPU with 2 threads per core, 128GB RAM, and dual 8Gbps fibre cards.
For the longest time I have had this issue: when I copy a large file, say 100GB, the first two servers maintain high throughput for about 50% of the file, then trail off to about 10MB/sec for the rest of the copy. The second two maintain high throughput for the entire copy.
Over time I have figured out, by watching my SAN performance monitor, iostat, and memory usage, that once the first two fill up the page cache, I can issue a manual
sync
echo 1 > /proc/sys/vm/drop_caches
and see memory usage drop to minimal and the throughput pick back up to several hundred MB/sec. The servers loaded with 6.5 do not exhibit this behavior; they maintain a pretty steady copy the entire time, as if they are flushing the cache as needed. I have been unable to pinpoint how to fix the older servers.
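For reference, while the copy runs I keep an eye on the dirty page buildup with something like this (just a rough sketch using the standard /proc/meminfo fields):
# watch dirty/writeback totals grow while the copy runs
watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'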
By watching my stats, the typical order of events is:
- start copy
- as copy progresses mem usage increases
- SAN write IOPS run at about 30k
- SAN read IOPS drop to minimal at about the 40GB mark
- SAN write IOPS continue for another couple of minutes, then drop
- throughput drops to about 10MB/sec
- I issue the sync and drop_caches commands above
- mem usage drops
- IOPS increase back to 30k
- throughput returns
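The pattern above looks like dirty-page writeback throttling to me, so one thing I plan to do is compare the writeback tunables between the two generations of installs, roughly like this (these are the standard vm sysctls; I have not yet confirmed the values actually differ):
# run on an old-install and a new-install server and diff the output
sysctl vm.dirty_ratio vm.dirty_background_ratio \
       vm.dirty_expire_centisecs vm.dirty_writeback_centisecs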
Just curious if there is a config change I can make on the first two servers to make them copy more like the second two.
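For example, I have wondered whether capping the amount of dirty data would help, along these lines (the byte values are just guesses for a 128GB box, not something I have tested):
# tentative: start background writeback earlier and throttle writers sooner
sysctl -w vm.dirty_background_bytes=268435456   # begin background flush at ~256MB dirty
sysctl -w vm.dirty_bytes=1073741824             # throttle writers beyond ~1GB dirty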
Thanks!
Jim
