RHEL 7.2 system, df and du /var/log inconsistent results. 84MB actual usage, df shows 80% full

I have a RHEL 7.2 system with a (miserably small) 4 GB xfs file system mounted at /var/log (we're increasing it tomorrow). However, I'm noticing a very strange (to me) inconsistency: df -PhT /var/log shows 80% consumption, while du -sk /var/log/* and du -skh /var/log show only 84 MB (megabytes, not GB). As I typed this, I found that sssd had to be restarted to release some unneeded log files.

In short, df -PhT /var/log shows the file system 80% full, whereas du -skh /var/log shows a realistic 84M of actual usage.

Here's some relevant info...

[root@rhel72server] # du -sk /var/log/* | sort -nr 
21632 /var/log/messages
14920 /var/log/mesos
11004 /var/log/aide
2912 /var/log/httmpd
2648 /var/log/exhibitor.log
*truncated*

[root@rhel72server] # du -skh /var/log
84M   /var/log

[root@rhel72server] # df -PhT /var/log
/dev/mapper/disk0-varlog xfs 4G  4.2G   825M  80%   /var/log
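
The size of the discrepancy can be reduced to a single number. What follows is a minimal sketch, assuming a Linux system with the usual df, du and awk; usage_gap is a hypothetical helper name, not a standard tool, and the comparison is only meaningful when the directory is the mount point of its own file system (as /var/log is here).

```shell
# usage_gap DIR: print "df_used_kb du_used_kb gap_kb" for the file system at DIR.
# Hypothetical helper for illustration only.
usage_gap() {
    local dir="$1" df_kb du_kb
    # -P forces one-line POSIX output; field 3 of line 2 is "Used" in 1K blocks.
    df_kb=$(df -Pk "$dir" | awk 'NR==2 {print $3}')
    # -x keeps du on this one file system; error messages are suppressed.
    du_kb=$(du -skx "$dir" 2>/dev/null | awk '{print $1}')
    echo "$df_kb $du_kb $((df_kb - du_kb))"
}
```

Calling usage_gap /var/log prints the two figures and their difference in KB. A gap far larger than ordinary file system overhead suggests blocks held by something du cannot see, such as files that were deleted while a process still had them open.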

I've zeroed out the sparse lastlog file by redirecting /dev/null into it, and that makes no difference to the above.

My initial search led me to run lsof | grep deleted, which showed files that had been deleted but were apparently still being counted against the file system by df -PhT /var/log. When I restarted the sssd service, the deleted-but-not-released files were finally released.

[root@rhel72server] # cd /var/log
[root@rhel72server] # lsof | grep deleted
sssd_be   1828 root 10w REG 253,7 3258809724 4194460 /var/log/sssd/sssd_redacted_fqdn.log-20160501 (deleted)
sssd      1823 root  3w REG 253,7   62771719 4194459 /var/log/sssd/sssd.log-20160501 (deleted)
sssd_be   1828 root 17w REG 253,7    1482662 4194453 /var/log/sssd/ldap_child.log-20160502 (deleted)
wpa_suppl 1837 root  3w REG 253,7        120     145 /var/log/wpa_supplicant.log-20160504 (deleted)
*truncated for only /var/log*
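
Where lsof is not installed, the same information can be read from /proc. This is a rough sketch rather than a replacement for lsof (whose +L1 option selects open files with a link count below one); list_deleted_open is a hypothetical helper name, it is Linux-only, and it needs root to see other users' processes.

```shell
# list_deleted_open DIR: print "size<TAB>/proc/PID/fd/N<TAB>target" for every
# open descriptor whose file under DIR has been deleted. Hypothetical helper.
list_deleted_open() {
    local dir="${1%/}" fd target size
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors we may not inspect or that vanished.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$dir"/*" (deleted)")
                # stat -L follows the link to the still-open inode.
                size=$(stat -L -c %s "$fd" 2>/dev/null) || continue
                printf '%s\t%s\t%s\n' "$size" "$fd" "$target"
                ;;
        esac
    done
}
```

Running list_deleted_open /var/log | sort -nr puts the largest space holders first; the PID in the second column identifies the process that still has the file open.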

So, seeing that sssd is the life of the party, I ran:

[root@rhel72server] # systemctl restart sssd
[root@rhel72server] # df -PhT /var/log
/dev/mapper/disk0-varlog xfs 4G  4.2G   4.0G 3%   /var/log
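
Restarting the service closes its descriptors, which is what allows the kernel to free the blocks. Where a restart is not immediately possible, a commonly used Linux workaround is to truncate the deleted file through its /proc entry. The sketch below assumes the PID and descriptor number are taken from the lsof output (for example 1828 and 10); truncate_open_fd is a hypothetical helper, and the log contents are discarded for good.

```shell
# truncate_open_fd PID FD: empty the file behind an open descriptor (Linux only).
# Hypothetical helper; the data in the file is lost.
truncate_open_fd() {
    local link="/proc/$1/fd/$2"
    # Refuse anything that is not a regular file (sockets, pipes, terminals).
    [ -f "$link" ] || { echo "not a regular file: $link" >&2; return 1; }
    # Opening the /proc link with truncation reaches the inode even though
    # its directory entry is gone.
    : > "$link"
}
```

If the process did not open the file in append mode, its next write lands at the old offset and the file becomes sparse: the apparent size jumps back, but only the newly written blocks are allocated.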

This released the files in question, so df and du against that directory are now more consistent. I saw this once on a RHEL 5 system a long while ago, but have not (personally) seen it on RHEL 6 (it probably exists there too; I just have not experienced it in my environment).

Does anyone have any ideas on how to deal with sssd logs that are rotated off? (I have already added "compress" to the logrotate directives, and I plan on examining the sssd logs for errant issues.)

I changed debug_level = 5 to debug_level = 1 in /etc/sssd/sssd.conf under the [sssd] and [domain/redacted.fqdn.something] sections. Is there a recommended debug_level for /etc/sssd/sssd.conf?
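
One way to keep rotated sssd logs from being held open is to have logrotate signal the daemon after rotating: the sssd(8) man page documents that SIGHUP makes SSSD close and reopen its debug files. The stanza below is an illustrative sketch, not the packaged file; the weekly/rotate values and the /var/run/sssd.pid path are assumptions to check against the local system.

```
# Illustrative logrotate stanza for sssd; schedule, retention and pid file
# path are assumptions, not values taken from this system.
/var/log/sssd/*.log {
    weekly
    missingok
    notifempty
    rotate 2
    compress
    delaycompress
    sharedscripts
    postrotate
        /bin/kill -HUP "$(cat /var/run/sssd.pid 2>/dev/null)" 2>/dev/null || true
    endscript
}
```

Keeping delaycompress alongside compress leaves the most recently rotated file uncompressed for one cycle, which is the safer choice if the daemon writes a few more lines to it before reopening its logs.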

I'd appreciate any assistance/recommendations.

Thanks
RJ

Responses

Interestingly, I lack permission to edit the original post I made in this discussion thread I started.

I hit "Edit Post" and an edit page appears, yet the "Content" block shows the following error:

This field has been disabled because you do not have sufficient permissions to edit it

A minor note on this matter: I commented out "debug_level = [x]" on the offending server because I don't have it set anywhere else. I suspected someone had been debugging sssd and turned it on with verbosity 5. It is now commented out.

EDITED: A freshly loaded system shows /etc/sssd/sssd.conf with debug_level = 5 as the default after joining it to the domain.

Hi,

I have run into similar inconsistencies between df and du output, and they appeared when logrotate was doing its work (on log files other than sssd.log). I suggest you look into logrotate and the interaction between logrotate and sssd.

Regards, Siem.

Thanks Siem, I had already established proper logrotate directives. However, there's a chance those files were deleted just before those directives were put in place (and were not released by the sssd service until it was restarted).

We now have compress (only delaycompress is set by default) for the logs. Oddly (as mentioned before), debug_level = 5 is apparently the default logging level right after a fresh load (which I now change in my post-kickstart configuration script).

Based on this Fedora sssd diagnosis link, I'm commenting out the default-after-load value of debug_level = 5 after the system is built, as part of my baseline configuration.

Perhaps these files were deleted without the service being restarted.
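
A post-build step of that kind might look like the sketch below. It is only an illustration: comment_out_debug_level is a hypothetical helper name, and it assumes GNU sed (as shipped with RHEL) for the -i and -E options.

```shell
# comment_out_debug_level FILE: prefix every active debug_level line with '#'.
# Hypothetical helper; lines that are already commented out are left alone
# because the pattern allows only whitespace before the key.
comment_out_debug_level() {
    local conf="$1"
    sed -i -E 's/^([[:space:]]*debug_level[[:space:]]*=.*)$/#\1/' "$conf"
}
```

For example, comment_out_debug_level /etc/sssd/sssd.conf followed by systemctl restart sssd, since the daemon has to be restarted to pick up the change.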

Thanks/Regards, RJ
